SLURM example job scripts
Introductory remarks
The job scripts for SLURM partitions are provided as templates which you can adapt for your own settings. In particular, you should account for the following points:
- Some entries are placeholders, which you must replace with correct, user-specific settings. In particular, path specifications and e-mail addresses must be adapted.
- For recommendations on how to do large-scale I/O, please refer to the description of the file systems available on the cluster. It is recommended to keep executables within your HOME file system, in particular for parallel jobs. The example jobs reflect this, assuming that files are opened with relative path names from within the executed program.
- If you need the environment modules package in your batch script, you must also source the file /etc/profile.d/modules.sh.
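The last point can be sketched as follows. This is a minimal illustration, not an official template: the variable names `MODULES_INIT` and `modules_ready` are made up for this example, and `mpi.intel` is used only because it is one of the module names mentioned on this page.

```shell
#!/bin/bash
# Sketch: initialise the environment modules package inside a batch script.
MODULES_INIT=/etc/profile.d/modules.sh   # init file named in the text above

if [ -r "$MODULES_INIT" ]; then
    source "$MODULES_INIT"
    modules_ready=yes
else
    echo "modules init file not found: $MODULES_INIT" >&2
    modules_ready=no
fi

# Load a module only if the 'module' command is actually defined now.
if [ "$modules_ready" = yes ] && type module >/dev/null 2>&1; then
    module load mpi.intel   # example module name; replace with what your program needs
fi
```

Checking that the init file is readable before sourcing it makes a misconfigured job fail with a clear message instead of a later "module: command not found".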
Serial and archiving jobs
Serial and archiving jobs are not supported in the SLURM partitions. Please submit an SGE script from the login node lx64ia3 to perform serial processing or to run the TSM client in batch mode.
Shared Memory jobs
This job type uses a single shared memory node of the designated SLURM partition. Parallelization can be achieved either via (POSIX) thread programming or directive-based OpenMP programming.
Here are example scripts for starting an OpenMP program:
|
MPP cluster, ICE |
UV, Myrinet cluster |
|---|---|
|
#!/bin/bash
#SBATCH -o /home/hpc/<group>/<user>/myjob.%j.%N.out
#SBATCH -D /home/hpc/<group>/<user>/mydir
#SBATCH -J <job_name>
#SBATCH --get-user-env
#SBATCH --clusters=mpp1
# in the line above:
# replace mpp1 by ice1 to use the ICE
#SBATCH --nodes=1-1
#SBATCH --cpus-per-task=8
#SBATCH --mail-type=end
#SBATCH --mail-user=<email_address>@<domain>
#SBATCH --export=NONE
#SBATCH --time=08:00:00
|
#!/bin/bash
#SBATCH -o /home/hpc/<group>/<user>/myjob.%j.%N.out
#SBATCH -D /home/hpc/<group>/<user>/mydir
#SBATCH -J <job_name>
#SBATCH --get-user-env
#SBATCH --clusters=uv2
# replace uv2 by uv3 to use the other UV partition
# replace uv2 by myri to use a Myrinet node
#SBATCH --nodes=1-1
#SBATCH --cpus-per-task=64
#SBATCH --mail-type=end
#SBATCH --mail-user=<email_address>@<domain>
#SBATCH --export=NONE
#SBATCH --time=08:00:00
# up to 960 threads can be configured on uv2
# up to 1120 threads can be configured on uv3
# up to 8 or 32 threads can be configured on the Myrinet nodes (see below)
# The value used should be consistent
# with --cpus-per-task above
|
For each job, the maximum reasonable number of threads is set inside the script. On the UV or the Myrinet segment, please also specify this value via --cpus-per-task. Furthermore, to select between the 8-way and 32-way Myrinet nodes, it may be necessary to specify the partition to be used:
| #SBATCH --partition=myri_std | start job on an 8-way node of the Myrinet cluster |
| #SBATCH --partition=myri_large | start job on a 32-way node of the Myrinet cluster |
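The OpenMP templates above show only the #SBATCH header. A possible job body is sketched below, under the assumption that the thread count should simply follow the --cpus-per-task request. `SLURM_CPUS_PER_TASK` is the variable SLURM sets when --cpus-per-task is given; the fallback value 8 and the program path `./myprog.exe` are placeholders chosen for this illustration.

```shell
#!/bin/bash
# Sketch of a job body for the OpenMP templates (assumed, not the official script).

# Take the thread count from the --cpus-per-task request so both values stay consistent.
# Outside of SLURM the variable is unset, so fall back to 8 (placeholder value).
export OMP_NUM_THREADS="${SLURM_CPUS_PER_TASK:-8}"
echo "running with $OMP_NUM_THREADS OpenMP threads"

PROG=./myprog.exe   # hypothetical executable; replace with your own program
if [ -x "$PROG" ]; then
    "$PROG"
else
    echo "skipping run: $PROG not found or not executable" >&2
fi
```

Deriving `OMP_NUM_THREADS` from the SLURM variable avoids the two settings drifting apart when the header is edited later.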
MPI jobs
For MPI documentation, please consult the MPI page on the LRZ web server. The following examples configure a 64-core job, except for the Myrinet examples, which use fewer tasks.
On the MPP cluster
|
MPP Infiniband Cluster |
MPP Infiniband Cluster: large memory job (Note: this leaves compute cores unused!) |
|---|---|
#!/bin/bash
#SBATCH -o /home/hpc/<group>/<user>/myjob.%j.%N.out
#SBATCH -D /home/hpc/<group>/<user>/mydir
#SBATCH -J <job_name>
#SBATCH --get-user-env
#SBATCH --clusters=mpp1
#SBATCH --ntasks=64
#SBATCH --mail-type=end
#SBATCH --mail-user=<email_address>@<domain>
#SBATCH --export=NONE
#SBATCH --time=08:00:00
|
#!/bin/bash
#SBATCH -o /home/hpc/<group>/<user>/myjob.%j.%N.out
#SBATCH -D /home/hpc/<group>/<user>/mydir
#SBATCH -J <job_name>
#SBATCH --get-user-env
#SBATCH --clusters=mpp1
#SBATCH --ntasks=64
#SBATCH --cpus-per-task=2
# only half the cores on each node are used,
# but 1.8 GB per MPI task available
#SBATCH --mail-type=end
#SBATCH --mail-user=<email_address>@<domain>
#SBATCH --export=NONE
#SBATCH --time=08:00:00
source /etc/profile.d/modules.sh
cd $OPT_TMP/mydata
srun_ps $HOME/exedir/myprog.exe
|
On the Myrinet cluster
|
Myrinet 10 GE 8-way systems |
Myrinet 10 GE 32-way systems |
|---|---|
#!/bin/bash
#SBATCH -o /home/hpc/<group>/<user>/myjob.%j.%N.out
#SBATCH -D /home/hpc/<group>/<user>/mydir
#SBATCH -J <job_name>
#SBATCH --get-user-env
#SBATCH --clusters=myri
#SBATCH --partition=myri_std
#SBATCH --ntasks=16
#SBATCH --mail-type=end
#SBATCH --mail-user=<email_address>@<domain>
#SBATCH --export=NONE
#SBATCH --time=72:00:00
# at most 32 cores can be used
|
#!/bin/bash
#SBATCH -o /home/hpc/<group>/<user>/myjob.%j.%N.out
#SBATCH -D /home/hpc/<group>/<user>/mydir
#SBATCH -J <job_name>
#SBATCH --get-user-env
#SBATCH --clusters=myri
#SBATCH --partition=myri_large
#SBATCH --ntasks=32
#SBATCH --mail-type=end
#SBATCH --mail-user=<email_address>@<domain>
#SBATCH --export=NONE
#SBATCH --time=36:00:00
source /etc/profile.d/modules.sh
cd $OPT_TMP/mydata
srun_ps $HOME/exedir/myprog.exe
# at most 32 cores can be used
|
On SGI systems
On the Ultraviolet, SLURM provides a cpuset of the required size, to which your parallel program is confined; you need to manually select which of the two systems (uv2 or uv3) your job will run on. On the ICE, a suitable number of 8-way nodes is assigned exclusively to your job.
|
SGI ICE |
SGI Ultraviolet |
|---|---|
#!/bin/bash
#SBATCH -o /home/hpc/<group>/<user>/myjob.%j.%N.out
#SBATCH -D /home/hpc/<group>/<user>/mydir
#SBATCH -J <job_name>
#SBATCH --get-user-env
#SBATCH --clusters=ice1
#SBATCH --ntasks=64
#SBATCH --mail-type=end
#SBATCH --mail-user=<email_address>@<domain>
#SBATCH --export=NONE
#SBATCH --time=08:00:00
|
#!/bin/bash
#SBATCH -o /home/hpc/<group>/<user>/myjob.%j.%N.out
#SBATCH -D /home/hpc/<group>/<user>/mydir
#SBATCH -J <job_name>
#SBATCH --get-user-env
#SBATCH --clusters=uv2
# or uv3
#SBATCH --ntasks=64
#SBATCH --mail-type=end
#SBATCH --mail-user=<email_address>@<domain>
#SBATCH --export=NONE
#SBATCH --time=08:00:00
source /etc/profile.d/modules.sh
cd $OPT_TMP/mydata
srun_ps $HOME/exedir/myprog.exe
|
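To see the confinement described above from inside a running job, a few lines like the following could be added to the script. This is only a sketch and assumes a Linux node with GNU coreutils: `nproc` reports the processing units available to the current process, and `/proc/self/status` lists the allowed CPUs. The fallback to 1 is an arbitrary choice for systems without `nproc`.

```shell
#!/bin/bash
# Sketch: report how many CPUs this process may actually use.
# nproc honours the CPU affinity of the calling process, so inside a
# confined job it should report the size of the assigned CPU set.
allowed=$(nproc 2>/dev/null || echo 1)   # fallback to 1 if nproc is unavailable
echo "CPUs usable by this process: $allowed"

# On Linux, the kernel also exposes the exact list of allowed CPUs.
if [ -r /proc/self/status ]; then
    grep Cpus_allowed_list /proc/self/status
fi
```

Comparing the reported count with the requested --ntasks value is a quick sanity check before starting a long run.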
Please note:
Do not use mpirun or mpiexec. Use the LRZ-provided srun_ps command instead, which can start
- programs compiled with Parastation MPI (mpi.parastation module) on the MPP and Myrinet clusters
- programs compiled with Intel MPI (mpi.intel module) on any of the clusters
- programs compiled with SGI MPT (mpi.mpt module) on the SGI systems
For some software packages, it is also possible to use SLURM's own srun command; however, this does not work for programs compiled against Parastation MPI.
It is also possible to use the --nodes keyword in combination with --ntasks-per-node (instead of --ntasks) to configure parallel jobs.
If use of hyperthreaded cores is desired on ICE or UV, the --ntasks-per-core=2 setting can be added.
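As an illustration of the --nodes alternative, the following sketch converts the 64-task ICE example into an equivalent node-based request. It assumes one task per core and 8 cores per node (the ICE nodes are described as 8-way above); the variable names are made up for this example.

```shell
#!/bin/bash
# Sketch: express an --ntasks request as --nodes plus --ntasks-per-node.
ntasks=64          # total number of MPI tasks, as in the ICE example
cores_per_node=8   # ICE nodes are 8-way; adjust for other clusters

# Round up so that a partially filled last node is still counted.
nodes=$(( (ntasks + cores_per_node - 1) / cores_per_node ))

# Print the two directives that would replace "#SBATCH --ntasks=64".
printf '#SBATCH --nodes=%d\n#SBATCH --ntasks-per-node=%d\n' "$nodes" "$cores_per_node"
```

With these values the script prints a request for 8 nodes with 8 tasks each, which again gives 64 tasks in total.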
Special job configurations
Hybrid jobs
Programs that use MPI and OpenMP together fall into this category. Not all combinations are shown here; for other parts of the cluster, the scripts may need some modification.
|
MPP Infiniband Cluster |
SGI ICE |
|---|---|
|
#!/bin/bash
#SBATCH -o /home/hpc/<group>/<user>/myjob.%j.%N.out
#SBATCH -D /home/hpc/<group>/<user>/mydir
#SBATCH -J <job_name>
#SBATCH --get-user-env
#SBATCH --clusters=mpp1
#SBATCH --ntasks=64
#SBATCH --cpus-per-task=8
#SBATCH --mail-type=end
#SBATCH --mail-user=<email_address>@<domain>
#SBATCH --export=NONE
#SBATCH --time=08:00:00
source /etc/profile.d/modules.sh
cd $OPT_TMP/mydata
srun_ps -t 8 $HOME/exedir/myprog.exe
# the above command runs with OMP_NUM_THREADS=8
# and 64 MPI tasks, using 32 nodes
|
#!/bin/bash
#SBATCH -o /home/hpc/<group>/<user>/myjob.%j.%N.out
#SBATCH -D /home/hpc/<group>/<user>/mydir
#SBATCH -J <job_name>
#SBATCH --get-user-env
#SBATCH --clusters=ice1
#SBATCH --ntasks=32
#SBATCH --cpus-per-task=4
#SBATCH --mail-type=end
#SBATCH --mail-user=<email_address>@<domain>
#SBATCH --export=NONE
#SBATCH --time=08:00:00
source /etc/profile.d/modules.sh
cd $OPT_TMP/mydata
srun_ps -t 4 $HOME/exedir/myprog.exe
# the above command runs with OMP_NUM_THREADS=4
# and 32 MPI tasks, using 16 nodes
|
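The node counts quoted in the script comments follow from tasks times threads divided by cores per node. The helper below is a small sketch of that arithmetic; the function name `nodes_needed` is hypothetical, and the value of 16 cores per MPP node is inferred from the numbers in the example (64 tasks with 8 threads on 32 nodes), not stated explicitly on this page.

```shell
#!/bin/bash
# nodes_needed TASKS THREADS CORES_PER_NODE
# Prints the number of nodes needed when every thread gets its own core.
nodes_needed() {
    local tasks=$1 threads=$2 cores_per_node=$3
    local cores=$(( tasks * threads ))
    # Round up to whole nodes.
    echo $(( (cores + cores_per_node - 1) / cores_per_node ))
}

mpp_nodes=$(nodes_needed 64 8 16)   # 16 cores per node is an inferred assumption
ice_nodes=$(nodes_needed 32 4 8)    # ICE nodes are 8-way
echo "mpp1: $mpp_nodes nodes, ice1: $ice_nodes nodes"
# prints "mpp1: 32 nodes, ice1: 16 nodes"
```

The same calculation can be used to check any other combination of --ntasks and --cpus-per-task before submitting.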
Job Farming (starting multiple serial jobs on a shared memory system)
Please use this with care! If the serial jobs are imbalanced with respect to run time, this usage pattern can waste CPU resources. At LRZ's discretion, imbalanced jobs may be removed forcibly. The example job script illustrates how to start up multiple serial MATLAB jobs within a shared memory parallel SLURM script. Note that the various subdirectories subdir_1, ..., subdir_8 must exist and contain the needed input data.
|
Multi-Serial Example |
|---|
|
#!/bin/bash
#SBATCH -o /home/hpc/<group>/<user>/myjob.%j.%N.out
#SBATCH -D /home/hpc/<group>/<user>/mydir
#SBATCH -J <job_name>
#SBATCH --get-user-env
#SBATCH --clusters=myri
#SBATCH --partition=myri_std
#SBATCH --nodes=1-1
#SBATCH --mail-type=end
#SBATCH --mail-user=<email_address>@<domain>
#SBATCH --export=NONE
#SBATCH --time=08:00:00
|
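The template above contains only the #SBATCH header. The farming pattern itself (one background process per subdirectory, then a wait for all of them) could look like the sketch below. It is an illustration under assumptions, not the original script: `run_serial_task` is a hypothetical stand-in for the real serial program such as a MATLAB batch run, and the temporary directory exists only so that the example runs on its own.

```shell
#!/bin/bash
# Sketch of the job farming pattern: 8 serial tasks on one 8-way node.

# Hypothetical stand-in for the real serial program (e.g. a MATLAB batch run).
run_serial_task() {
    echo "task in $(pwd) finished" > result.txt
}

WORKDIR=$(mktemp -d)   # demo only; a real job would use its own data directory
for i in 1 2 3 4 5 6 7 8; do
    # In a real job these directories must already exist and hold the input data.
    mkdir -p "$WORKDIR/subdir_$i"
done

for i in 1 2 3 4 5 6 7 8; do
    # Start each task in its own subdirectory as a background process.
    ( cd "$WORKDIR/subdir_$i" && run_serial_task ) &
done

wait   # block until all background tasks have finished

results=("$WORKDIR"/subdir_*/result.txt)
finished=${#results[@]}
echo "$finished tasks finished"
```

The final `wait` is what keeps the batch job alive until the slowest task completes, which is also why imbalanced run times leave cores idle.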