Running Jobs on Bem


Current revision as of 15:05, 15 May 2017


The Bem Cluster uses PBS Pro to schedule jobs. Writing a submission script is typically the most convenient way to submit your job to the batch system. Example submission scripts (with explanations) are provided below for a few job types.

Queues

You do not need to provide a queue name when submitting a job. The system routes your job to the proper queue based on the walltime parameter:

 qsub -l walltime=HH:MM:SS

To list all queues, run:

 qstat -q

To check the parameters of a queue, run:

 qstat -Qf queue_name

You may submit your job to the dedicated prace queue, but in the current configuration jobs in this queue still compete with other jobs for resources.

If you need dedicated resources because the cluster is heavily loaded, please contact us. We will configure the prace queue with different parameters or a different priority for your project.

Qsub

To run a simple interactive job with the default resources (1 CPU, 2000 MB RAM), use the -I option together with -l walltime=6:00:00 to set a walltime limit:

 qsub -I -l walltime=6:00:00

Resources are requested with the -l option. Resources are assigned to chunks, and each chunk may request different resources. To request an interactive job with 2 chunks of 2 CPUs and 300 MB of memory each, plus 3 chunks of 1 CPU and 100 MB of memory each, run:

 qsub -I -l walltime=6:00:00 -l select=2:ncpus=2:mem=300mb+3:ncpus=1:mem=100mb
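To see what a select statement adds up to, the chunks can be totalled by hand: 2 × 2 + 3 × 1 = 7 CPUs and 2 × 300 + 3 × 100 = 900 MB. The following is a minimal sketch of that arithmetic in bash. The function name select_totals is hypothetical, and the sketch assumes every chunk has exactly the form N:ncpus=C:mem=Mmb used in the example above.

```shell
#!/bin/bash
# select_totals SPEC: total the CPUs and memory (in MB) requested by a select statement.
# Assumption: every chunk looks like N:ncpus=C:mem=Mmb (hypothetical helper, not part of PBS).
select_totals() {
    local spec=$1 chunk count ncpus mem total_cpus=0 total_mem=0
    local IFS='+'                      # chunks are separated by '+'
    for chunk in $spec; do
        count=${chunk%%:*}             # number of identical chunks
        ncpus=${chunk#*ncpus=}; ncpus=${ncpus%%:*}
        mem=${chunk#*mem=};     mem=${mem%%mb*}
        total_cpus=$((total_cpus + count * ncpus))
        total_mem=$((total_mem + count * mem))
    done
    echo "${total_cpus} CPUs, ${total_mem} MB"
}

select_totals "2:ncpus=2:mem=300mb+3:ncpus=1:mem=100mb"   # prints: 7 CPUs, 900 MB
```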

The qsub manual page describes many other useful options (such as job dependencies and mail events). To read the manual, type:

 man qsub

Job submission scripts

If you run the same software repeatedly, it is convenient to put the job in a script. qsub can run your application from a script that defines all options inside it. The example script "test.sh" sets PBS options, loads the prace module and executes a few commands:

 #!/bin/bash
 #PBS -l walltime=24:00:00
 #PBS -l select=1:ncpus=1:mem=100MB
 #PBS -m be
 #PBS -N test-job
 
 module load prace
 cd "$PBS_O_WORKDIR"  # directory the job was submitted from (set by PBS)
 pwd
 date

To submit the job, type:

 qsub test.sh

This submits test.sh as a 24-hour job (-l walltime=24:00:00) named test-job (-N option), requesting 1 CPU and 100 MB of RAM in one chunk (-l select option). The PBS server will notify you by e-mail when the job begins (-m b) and when it ends (-m e).

After the job ends, its output (STDOUT) is saved in the file "test-job.o$JOBID", where $JOBID is the job number printed by qsub after a successful submission. The file "test-job.e$JOBID" contains STDERR. If your application's STDOUT is expected to exceed 30 MB, please redirect it to a file in your home directory using standard bash redirection:

 app_exec_command > output_filename
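The default output file names can be worked out from the identifier that qsub prints on submission. The sketch below assumes the identifier has the form number.server. The function name job_output_files, the job number 12345 and the server suffix bem are made up for illustration.

```shell
#!/bin/bash
# job_output_files JOB_NAME JOB_ID: print the default STDOUT and STDERR file names.
# Assumption: JOB_ID looks like "12345.server", as printed by qsub (hypothetical helper).
job_output_files() {
    local job_name=$1 job_id=$2
    local seq=${job_id%%.*}            # keep the job number, drop the server suffix
    echo "${job_name}.o${seq} ${job_name}.e${seq}"
}

job_output_files test-job "12345.bem"   # prints: test-job.o12345 test-job.e12345
```

In a real session the identifier would come from the submission itself, for example job_id=$(qsub test.sh).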

It is often convenient to write a script that prepares the environment and then calls qsub. The simple sub-test script below changes the job's execution directory to the one the job was submitted from:

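 #!/bin/bash
 
 # Remember the directory this script is run from.
 C=$PWD
 
 # The here-document is unquoted, so $C is expanded now, at submission time.
 cat << EOF | qsub
 #!/bin/bash
 #PBS -l walltime=24:00:00
 #PBS -l select=1:ncpus=1:mem=100MB
 #PBS -m be
 #PBS -N test-job
 
 module load prace
 cd "$C"
 pwd
 EOF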
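The same here-document technique can be turned into a small generator that takes the resources as arguments. This is only a sketch: the function name make_job_script and its argument order are hypothetical. It prints the job script to standard output so that it can be inspected before being piped to qsub.

```shell
#!/bin/bash
# make_job_script WALLTIME NCPUS MEM NAME: print a PBS job script on stdout.
# Hypothetical helper; the generated directives mirror the example script above.
make_job_script() {
    local walltime=$1 ncpus=$2 mem=$3 name=$4
    # Unquoted here-document: the variables, including $PWD, are expanded now.
    cat << EOF
#!/bin/bash
#PBS -l walltime=${walltime}
#PBS -l select=1:ncpus=${ncpus}:mem=${mem}
#PBS -N ${name}

module load prace
cd "${PWD}"
pwd
EOF
}

# Print the script for inspection; once it looks right, submit it with:
#   make_job_script 24:00:00 1 100MB test-job | qsub
make_job_script 24:00:00 1 100MB test-job
```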

Checking status of jobs

To see all your waiting and running jobs, type:

 qstat -u $USER

Scratch directory

Directories for temporary files generated by computational programs are located in the /lustre/scratch file system on Bem.
The Lustre file system is available from all Bem compute nodes and from the archive node archiwum.wcss.pl.
Directories in the temporary space are not backed up.
Users have access to their job directories /lustre/scratch/tmp/pbs.JobID and to their personal directory /lustre/scratch/$USER:

  • /lustre/scratch/tmp/pbs.JobID
    These directories are created at the start of a job and deleted when the job ends.
    Environment variable: $TMPDIR
    Within the temporary space, users may write only to their own directories.
  • /lustre/scratch/$USER
    Personal directories on /lustre/scratch are created after the user submits a justified request, which must be reviewed by an administrator.
    Requests for such a directory can be sent to: kdm @ wcss.pl
    The contents of these directories are not deleted when the job ends.
    Users should monitor the usage of these directories so that redundant files do not take up space unnecessarily.


These directories are available only on worker nodes, so to access their contents, please use an interactive job.
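Because the per-job directory in $TMPDIR is deleted when the job ends, results written there have to be copied out before the job finishes. The following is a minimal sketch of that staging pattern. The function name stage_and_run and the file name output.txt are hypothetical, and the fallback to /tmp is only an assumption so that the function can be tried outside a PBS job.

```shell
#!/bin/bash
# stage_and_run INPUT RESULT_DIR COMMAND...: copy INPUT into $TMPDIR, run COMMAND
# on the copy there, and copy the command's STDOUT back to RESULT_DIR.
# Hypothetical helper; inside a job PBS sets TMPDIR to /lustre/scratch/tmp/pbs.JobID.
stage_and_run() {
    local input=$1 result_dir=$2
    shift 2
    local work=${TMPDIR:-/tmp}         # assumption: fall back to /tmp outside a job
    cp "$input" "$work/" || return 1
    # Run in the scratch directory; output.txt is created there.
    ( cd "$work" && "$@" "$(basename "$input")" > output.txt ) || return 1
    # Save the result before the job ends and the directory is removed.
    cp "$work/output.txt" "$result_dir/"
}
```

Inside a job script this could be called as stage_and_run "$HOME/data.txt" "$HOME/results" sort, where both paths are placeholders.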
