Running Jobs on Bem


Revision as of 10:04, 5 April 2016


The Bem Cluster uses PBS Pro to schedule jobs. Writing a submission script is typically the most convenient way to submit your job to the batch system. Example submission scripts (with explanations) are provided below for a few job types.

Qsub

To run a simple interactive job with the default resources (1 CPU, 2000 MB RAM), use the -I option together with -l walltime=6:00:00, which sets the walltime limit:

 qsub -I -l walltime=6:00:00

Resources are requested with the -l option. Resources are assigned to chunks, and each chunk may request different resources. To request an interactive job with 2 chunks of 2 CPUs and 300 MB of memory each, plus 3 chunks of 1 CPU and 100 MB of memory each, run:

 qsub -I -l walltime=6:00:00 -l select=2:ncpus=2:mem=300mb+3:ncpus=1:mem=100mb
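The select string above is a list of chunk groups joined with "+". As a minimal sketch of how such a string is assembled, the helper below builds it from "count:ncpus:mem" triples. The name build_select and the triple format are hypothetical and are not part of PBS; the final line only prints the resulting qsub command rather than running it.

```shell
# Hypothetical helper: build a PBS select string from "count:ncpus:mem" triples.
# The body runs in a subshell "( ... )" so its variables stay local.
build_select() (
  spec=""
  for chunk in "$@"; do
    count=${chunk%%:*}     # text before the first colon
    rest=${chunk#*:}       # text after the first colon
    ncpus=${rest%%:*}
    mem=${rest#*:}
    # Join chunk groups with "+", skipping the separator for the first group
    spec="${spec:+$spec+}${count}:ncpus=${ncpus}:mem=${mem}"
  done
  printf '%s\n' "$spec"
)

# Print (not run) the command for 2 chunks of 2 CPUs/300mb plus 3 chunks of 1 CPU/100mb
echo qsub -I -l walltime=6:00:00 -l "select=$(build_select 2:2:300mb 3:1:100mb)"
```

Printing the command first makes it easy to check the request before submitting it. This request amounts to 2×2 + 3×1 = 7 CPUs and 2×300 + 3×100 = 900 MB of memory in total.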

The qsub manual page describes many other useful options (such as job dependencies and mail events). To read it, type:

 man qsub

Job submission scripts

If you run the same software repeatedly, a script saves retyping. qsub can run your application from a script that defines all options inside it. The example script "test.sh" sets PBS options, loads the prace module, and executes a few commands:

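 #!/bin/bash
 #PBS -l walltime=24:00:00
 #PBS -l select=1:ncpus=1:mem=100MB
 #PBS -m be
 #PBS -N test-job
 
 # Load the prace environment module
 module load prace
 # $dir is not defined in this script: set it to your working directory first
 cd $dir
 pwd
 date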

To submit a job you just type:

 qsub test.sh

This submits test.sh as a 24-hour job (-l walltime=24:00:00) named test-job (-N parameter), requesting 1 CPU and 100 MB of RAM in one chunk (-l select=1:ncpus=1:mem=100MB). The PBS server will notify you by e-mail when the job begins (-m b) and when it ends (-m e).
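The #PBS lines in a script carry the same options that qsub accepts on the command line. As a rough sketch, the hypothetical helper pbs_equivalent below reads the directives from a script and prints the matching one-line qsub command. It only prints the command and does not submit anything.

```shell
# Hypothetical helper: print the qsub command line that corresponds to
# the #PBS directives found in the script given as $1.
pbs_equivalent() (
  # Strip the "#PBS" prefix from each directive line, then join the lines with spaces
  opts=$(sed -n 's/^[[:space:]]*#PBS[[:space:]]\{1,\}//p' "$1" | tr '\n' ' ')
  printf 'qsub %s%s\n' "$opts" "$1"
)
```

Calling `pbs_equivalent test.sh` on the example script should print `qsub -l walltime=24:00:00 -l select=1:ncpus=1:mem=100MB -m be -N test-job test.sh`. Keeping the options in the script is usually more convenient, because the request is then recorded alongside the commands it applies to.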

When the job ends, its standard output is saved in the file "test-job.o$JOBID", where $JOBID is the job number printed by qsub after a successful submission. The file "test-job.e$JOBID" contains STDERR. If your application's STDOUT exceeds 30 MB, please redirect it to a file in your home directory using standard bash redirection:

 app_exec_command > output_filename
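The same redirection syntax can send STDERR to its own file. The sketch below uses a stand-in function in place of a real application and writes to a temporary directory so that it can be tried anywhere. In a real job you would point the files at a directory under your home directory. The file names app.out and app.err are placeholders.

```shell
# Stand-in for a real application: writes one line to STDOUT and one to STDERR
app_exec_command() {
  echo "result line"
  echo "warning line" >&2
}

# Temporary directory for the demonstration only; in a job, use a path under $HOME
out_dir=$(mktemp -d)

# ">" redirects STDOUT, "2>" redirects STDERR
app_exec_command > "$out_dir/app.out" 2> "$out_dir/app.err"

echo "STDOUT saved in $out_dir/app.out, STDERR in $out_dir/app.err"
```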

We usually write wrapper scripts that prepare the environment and then call qsub. The simple sub-test script below changes the job's execution directory to the one the job was submitted from:

 #!/bin/bash
 
 C=$PWD

 cat << EOF | qsub
 #!/bin/bash
 #PBS -l walltime=24:00:00
 #PBS -l select=1:ncpus=1:mem=100MB
 #PBS -m be
 #PBS -N test-job
 
 module load prace
 cd $C
 pwd
 EOF
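The wrapper relies on the here-document delimiter being unquoted: the submitting shell replaces $C with the current directory before qsub receives the text. The sketch below contrasts this with a quoted delimiter, which passes the text through unchanged so that the variable is resolved later, inside the job. It uses cat in place of qsub so that the generated text can be inspected. PBS_O_WORKDIR is the variable PBS normally sets to the submission directory; treat its availability on this cluster as an assumption.

```shell
C=$PWD

# Unquoted delimiter: $C is expanded now, by the shell running the wrapper
expanded=$(cat << EOF
cd $C
EOF
)

# Quoted delimiter: the text is kept literally and expanded only when the job runs
literal=$(cat << 'EOF'
cd $PBS_O_WORKDIR
EOF
)

printf '%s\n%s\n' "$expanded" "$literal"
```

The first line printed contains the actual directory path, and the second still contains the variable name. Replacing qsub with cat in this way is a convenient check of what a wrapper will submit.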

Checking status of jobs

To see all your waiting and running jobs type:

 qstat -u $USER

Scratch directory

If your application is I/O intensive, use your scratch directory to store and access data during the computation:

 /lustre/scratch/$USER

This directory is available only on worker nodes, so if you want to access its contents, please do it in an interactive job.
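A common pattern, sketched here under assumptions, is to copy input to scratch at the start of a job, run there, and copy the results back at the end. The function name run_in_scratch, the file names, and the SCRATCH_ROOT override are hypothetical. The sort command stands in for a real application.

```shell
# Scratch location from this guide; override SCRATCH_ROOT to try the sketch off-cluster
SCRATCH_ROOT=${SCRATCH_ROOT:-/lustre/scratch/$USER}

# Usage: run_in_scratch JOBDIR INPUT RESULT
# INPUT and RESULT must be absolute paths, because the function changes directory.
run_in_scratch() (
  workdir="$SCRATCH_ROOT/$1"
  mkdir -p "$workdir" || exit 1
  cp "$2" "$workdir/input.txt" || exit 1     # stage the input into scratch
  cd "$workdir" || exit 1
  # Stand-in for the real I/O-intensive application
  sort input.txt > output.txt || exit 1
  cp output.txt "$3"                         # copy the result back out of scratch
)
```

Inside a job script this could be called as `run_in_scratch "$PBS_JOBID" "$HOME/data/input.txt" "$HOME/data/result.txt"`, assuming PBS sets PBS_JOBID in the job environment and that those paths exist.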
