Running Jobs on Bem

The Bem Cluster uses PBS Pro to schedule jobs. Writing a submission script is typically the most convenient way to submit your job to the batch system. Example submission scripts (with explanations) are provided below for a few job types.

Qsub

To run a simple interactive job with the default resources (1 CPU, 2000 MB RAM), use the -I option together with -l walltime=6:00:00, which sets the walltime limit:

 qsub -I -l walltime=6:00:00

Resources are requested with the -l option. Resources are assigned to chunks, and each chunk may request different resources. To request an interactive job with 2 chunks of 2 CPUs and 300 MB of memory each, plus 3 chunks of 1 CPU and 100 MB of memory each, run:

 qsub -I -l walltime=6:00:00 -l select=2:ncpus=2:mem=300mb+3:ncpus=1:mem=100mb
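When the chunk list is assembled in a script, it can help to build the select string from its parts. The sketch below only shows how chunk specifications are joined with "+"; build_select is a hypothetical helper name, not part of PBS, and it only prints the string rather than submitting anything.

```shell
#!/bin/bash
# Join chunk specifications with "+" to form the argument of -l select=...
# build_select is a hypothetical helper, not a PBS command.
build_select() {
    local IFS='+'                 # "$*" joins arguments with the first character of IFS
    printf 'select=%s\n' "$*"
}

# Two chunks of 2 CPUs / 300 MB plus three chunks of 1 CPU / 100 MB
build_select "2:ncpus=2:mem=300mb" "3:ncpus=1:mem=100mb"
```

The printed string can then be passed to qsub, for example as -l "$(build_select ...)".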

The qsub manual page describes many other useful options (such as job dependencies and mail events). To read it, type:

 man qsub

Job submission scripts

If you run the same software repeatedly, it is worth scripting the submission. qsub lets you run your application from a script with all options defined inside it. The example script "test.sh" sets PBS options, loads the prace module, and runs a few commands:

 #!/bin/bash
 #PBS -l walltime=24:00:00
 #PBS -l select=1:ncpus=1:mem=100MB
 #PBS -m be
 #PBS -N test-job
 
 module load prace
 cd "$PBS_O_WORKDIR"   # directory the job was submitted from (set by PBS)
 pwd
 date

To submit a job you just type:

 qsub test.sh

This submits test.sh as a 24-hour job (-l walltime=24:00:00) named test-job (-N), requesting 1 CPU and 100 MB of RAM in a single chunk (-l select=...). The PBS server will notify you by e-mail when the job begins (-m b) and when it ends (-m e).

After the job ends, its standard output is saved in the file "test-job.o$JOBID", where $JOBID is the job number printed by qsub after a successful submission. The file "test-job.e$JOBID" contains the standard error (STDERR). If your application's STDOUT will exceed 30 MB, redirect it to a file in your home directory using standard bash redirection:

 app_exec_command > output_filename
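The file naming described above can be sketched as a small helper. It assumes that qsub prints an identifier of the form number.server and that only the numeric part appears in the file names. output_files is a hypothetical helper and "12345.bem" is a made-up identifier used only for illustration.

```shell
#!/bin/bash
# Derive the default stdout/stderr file names from a job name and the
# identifier printed by qsub. output_files is a hypothetical helper.
output_files() {
    local job_name=$1 job_id=$2
    local seq=${job_id%%.*}       # keep only the part before the first dot
    printf '%s.o%s\n%s.e%s\n' "$job_name" "$seq" "$job_name" "$seq"
}

# "12345.bem" is an invented identifier, not real qsub output
output_files test-job 12345.bem
```

In a wrapper script the identifier would come from the submission itself, for example by capturing the output of qsub in a variable and passing it as the second argument.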

We usually write wrapper scripts that prepare the environment and then call qsub. The following simple script, sub-test, changes the job's execution directory to the directory the job was submitted from:

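 #!/bin/bash
 
 # Remember the directory this script is run from
 C=$PWD
 
 # Pipe the job script to qsub; $C is expanded here, at submission time
 cat << EOF | qsub
 #!/bin/bash
 #PBS -l walltime=24:00:00
 #PBS -l select=1:ncpus=1:mem=100MB
 #PBS -m be
 #PBS -N test-job
 
 module load prace
 cd "$C"
 pwd
 EOF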
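The wrapper relies on the here-document being expanded by the submitting shell, so the job script that reaches qsub already contains the literal path. A minimal sketch of that behaviour follows, with cat standing in for qsub so it runs without a batch system. render_job, the example path, and the hostname line are illustrative assumptions, not part of the original script.

```shell
#!/bin/bash
# Print the job script as it looks after the submitting shell has expanded
# the here-document. "cat" stands in for qsub; render_job is hypothetical.
render_job() {
    local C=$1
    cat << EOF
#!/bin/bash
#PBS -N test-job
cd "$C"
echo "running on \$(hostname)"
EOF
}

# The path is an invented example
render_job /home/user/project
```

In the printed script the cd line carries the literal path, while the escaped \$(hostname) is left as $(hostname) and is evaluated only when the job runs. Anything that should be resolved on the compute node needs this backslash.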

Checking status of jobs

To see all your waiting and running jobs type:

 qstat -u $USER

Scratch directory

Directories for temporary files generated by computational programs are located on the /lustre/scratch file system on Bem.
The Lustre file system is accessible from all Bem compute nodes and from the archive node archiwum.wcss.pl.
Directories in the temporary space are not backed up.
Users have access to the directories of their jobs, /lustre/scratch/tmp/pbs.JobID, and to their personal directory, /lustre/scratch/$USER:

  • /lustre/scratch/tmp/pbs.JobID
    This directory is created when a job starts and deleted when the job ends.
    Environment variable: $TMPDIR
    In the temporary space, users have write access only to their own directories.
  • /lustre/scratch/$USER
    A personal directory on /lustre/scratch is created after the user submits a justified request, which is reviewed by an administrator. Requests can be sent to kdm @ wcss.pl.
    The contents of this directory are not deleted when a job ends.
    Monitor how much space this directory uses so that unneeded files do not take up space.


These directories are available only on worker nodes, so to access their contents, use an interactive job.
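Because the per-job directory behind $TMPDIR is removed when the job ends, results must be copied out before the job script finishes. The following is a minimal sketch of that staging pattern. run_in_scratch and the file names are hypothetical, and sort is only a placeholder for the real computation.

```shell
#!/bin/bash
# Stage input into the per-job scratch directory, work there, and copy the
# result back before the directory is removed at the end of the job.
# run_in_scratch and the file names are hypothetical.
run_in_scratch() {
    local input=$1 result_dir=$2

    if [ -z "$TMPDIR" ]; then
        echo "TMPDIR is not set; run this inside a job" >&2
        return 1
    fi

    cp "$input" "$TMPDIR/" || return 1
    (
        cd "$TMPDIR" || exit 1
        # Placeholder for the real computation: sort the input file
        sort "$(basename "$input")" > result.txt
    ) || return 1

    # Copy the result out of scratch while the directory still exists
    cp "$TMPDIR/result.txt" "$result_dir/"
}
```

Inside a job script the function could be called as run_in_scratch "$PBS_O_WORKDIR/input.txt" "$PBS_O_WORKDIR", where PBS_O_WORKDIR is the submission directory set by PBS and input.txt is an invented file name.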
