Interactive Jobs and Batch Queuing with SGE

An introduction to how programs are executed on the Linux Cluster systems. It covers batch queuing, the handling of resources, and limits such as maximum run times.


Introduction

For executing programs, the following options are available:

  1. Run the program interactively on a login node.
  2. Run the program as an interactive SGE job.
  3. Run the program as a script-driven SGE job.

This document describes how to configure, submit and execute Sun Grid Engine (SGE) jobs, and states the policies that apply to interactive and SGE jobs. Resources for the first two options are normally shared between all users and are intended for editing and compiling as well as for short test runs. All production work should be performed using a script-driven SGE job. Misuse of interactive resources can result in the offending account being blocked.

Important note: SGE scheduling is being phased out for parallel jobs

Batch queuing on the next generations of HPC systems is performed via SLURM. Please consult the new documentation on how to use batch queuing on these systems. All parallel job classes will be migrated to SLURM step by step during the second half of 2011.

Prerequisite: A login shell

Before starting your work, you need to log in to one of the interactive nodes lx64ia2 or lx64ia3 from the outside world. A single SGE instance serves all cluster segments.

For the visualization systems, separate SGE instances have been set up, so jobs running on these systems are not visible on the Linux Cluster and vice versa.

Interactive shell

If you have a script or an executable file, e.g. myprog.exe, in your current working directory, it can be started from the command line via

nice -n 10 ./myprog.exe > myprog.out 2> myprog.err

In this example, standard output is redirected to myprog.out and standard error to myprog.err. In practice, it may be advisable to

  • send the job into the background, so that the shell can be used for other work in the meantime, and to
  • prevent the job from being killed when the shell terminates, for example on logout.

This would be achieved by starting the job via

nohup nice -n 10 ./myprog.exe > myprog.out 2> myprog.err &

As mentioned above, only short-running jobs with lowered priority should be run on the login nodes. Further information on policies for this type of job is available.
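
The redirection and background pattern described above can be tried out without a real application. The following is a minimal sketch: the file myprog.exe created here is only a hypothetical stand-in that writes one line to each output stream, and everything happens in a temporary scratch directory. Note that standard error is redirected with 2> in this sketch.

```shell
# Work in a scratch directory so that nothing in the real working directory is touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical stand-in for myprog.exe: one line to stdout, one line to stderr.
cat > myprog.exe <<'EOF'
#!/bin/sh
echo "result: 42"
echo "warning: demo" >&2
EOF
chmod +x myprog.exe

# Lowered priority, immune to hangups, running in the background.
nohup nice -n 10 ./myprog.exe > myprog.out 2> myprog.err &

# Only for this demonstration: wait for the background job so the files can be inspected.
wait $!
cat myprog.out
cat myprog.err
```

In real use the final wait would be omitted, since the point of the background job is to keep the shell free for other work.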

Interactive SGE shell

For program testing, an interactive SGE shell can be started on various cluster segments:

  • qrsh -q uv1_interactive xterm will start an SGE shell on a shared CPU set on the SGI Ultraviolet.

Workaround for a problem with qrsh: Sometimes the module environment is set up incorrectly, so commands are not found even though the modules appear to be loaded. In this case, please type the following commands:

module purge

module load lrz

For qrsh to work properly, password-free cluster-internal ssh must be configured, and X11 must be enabled so that the terminal can be started. Once the SGE shell has been started, short MPI test programs can be executed, for example on 4 cores via the command

mpiexec -n 4 ./myprog.exe > myprog.out 2> myprog.err

As with the interactive shells described above, these resources are meant for short or medium-length test runs. qrsh will terminate after some time. If resources are overloaded or unavailable, the qrsh command will return an error message. Further information on policies and run time limits for this type of job is available.

 

Script-driven SGE jobs

This type of execution method should be used for all production runs. A step-by-step recipe for the simplest type of job is given, illustrating the use of the SGE commands.

Step 1: Editing an SGE script

Assume you intend to run a serial program myprog.exe from within an SGE job under the account po45aod. This program resides in the subdirectory mydir of your home directory. Use your favourite editor (on the cluster, vi, emacs and kate are available) to create the following script:

#!/bin/bash
# The line above is ignored by SGE, but is used if the script is executed normally.
# (Placeholder) Standard error and standard output go to this file.
# The directory in which the output file is placed must exist before the job starts.
#$-o $HOME/mydir/myjob.$JOB_ID.out -j y
# (Placeholder) Name of the job.
#$-N myjob
# Shell to use (should be consistent with the shebang above).
#$-S /bin/bash
# (Placeholder) E-mail address (don't forget!).
#$-M my_email_address@my_domain
# Maximum run time; this may be increased up to the queue limit.
#$-l h_rt=08:00:00
# Architecture to run on (alternative: ia64 for SGI Altix).
#$-l march=x86_64
# Load the standard environment (see below).
. /etc/profile
# The script part starts with a change to the working directory.
cd mydir
# Start the executable.
./myprog.exe

This script essentially looks like a bash script. However, it contains specially marked comment lines starting with #$ ("control sequences"), which have a special meaning in the SGE context; each of them is explained in the comment line preceding it. The entries marked "Placeholder" must be modified to contain valid user-specific values.
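
To check which control sequences SGE will pick up from a script, the lines beginning with #$ can be listed before submission. The helper below is a minimal sketch: the function name list_sge_directives is made up for this illustration and is not part of SGE, and the shortened sample script (with the same placeholder values as above) is written to a temporary file.

```shell
# Hypothetical helper: print the SGE control lines of a script, without the "#$" prefix.
list_sge_directives() {
    sed -n 's/^#\$[[:space:]]*//p' "$1"
}

# Shortened sample job script; the values are placeholders.
job_script=$(mktemp)
cat > "$job_script" <<'EOF'
#!/bin/bash
#$-N myjob
#$-S /bin/bash
#$-l h_rt=08:00:00
. /etc/profile
cd mydir
./myprog.exe
EOF

# Lists only the three control lines; the shebang and ordinary commands are skipped.
list_sge_directives "$job_script"
```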

The following three documents provide additional important information on the LRZ SGE configuration and how to construct more specialized scripts.

 

  • SGE specifications: this subdocument lists the specification options (i.e., #$ controls), including recommendations for or against certain settings.
  • Policies and Resource Limits: this subdocument describes the limitations on resources (run times, memory requirements, core/cpu number) used by all job types.
  • Example job scripts: this subdocument provides further example job scripts. In particular, various ways of configuring parallel runs are provided.

 

In the following it is assumed that the script has been saved under the file name job_sge.cmd.

Step 2: Submitting the script

Once the script has been completed, it can be submitted to the queue from a shell on a login node by using the command

qsub job_sge.cmd

At submission time the control sequences are evaluated and stored in the queuing database, and the script is copied into an SGE internal directory for later execution. If the command was executed successfully, the Job ID will be returned as follows:

Your job 3532616 ("myjob") has been submitted
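
When jobs are submitted from a script, the Job ID can be extracted from this confirmation message and kept for later use. The following is a minimal sketch that assumes the message format shown above; the function name extract_job_id is hypothetical, and the sample message is hard-coded so that the example runs without SGE.

```shell
# Hypothetical helper: read qsub's confirmation message on stdin, print the numeric job ID.
extract_job_id() {
    sed -n 's/^Your job \([0-9][0-9]*\) .*/\1/p'
}

# Sample message copied from the text above.
message='Your job 3532616 ("myjob") has been submitted'
job_id=$(printf '%s\n' "$message" | extract_job_id)
echo "Submitted job ID: $job_id"
```

On the cluster, the function would be fed directly from the submission command, for example job_id=$(qsub job_sge.cmd | extract_job_id).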

It is a good idea to note down your Job IDs, for example to pass them on to LRZ HPC support if anything goes wrong. The submission command can also contain control sequences. For example,

qsub -l march=x86_64 job_sge.cmd

would take precedence over the march setting inside the script and force the job to run on an x86_64-based system. qalter allows you to change the resource requirements of a previously submitted job. For example,

qalter -l h_rt=04:00:00 3532616

would change the maximum run time of the job submitted above from 8 to 4 hours. The qsub (1) and qalter (1) man pages provide further information on how to submit and modify jobs.

Step 3: Checking the status of a job

Once submitted, the job will be queued for some time, depending on how many jobs are presently submitted. Once enough previously submitted jobs have completed, the job will be started on one or more of the systems determined by its resource requirements. The status of the job can be queried with the qstat command, which produces output like

job-ID  prior   name       user         state submit/start at     queue      slots ja-task-ID
---------------------------------------------------------------------------------------------
3532616 0.00000 myjob      po45aod      qw    06/08/2009 17:48:06                1

indicating that the job is queued. Once the job is running, the output would indicate the state to be "r" (=running), and would also provide the host it was running on. While the job is executing, it is possible to log in to this host via ssh and check whether processes are executing properly etc.
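
For scripted monitoring, the state column can be picked out of the qstat output. The sketch below assumes the column layout shown above (state in the fifth column); the function name job_state is hypothetical, and the sample output is hard-coded so that the example runs without SGE.

```shell
# Hypothetical helper: read qstat output on stdin and print the state of one job.
job_state() {
    awk -v id="$1" '$1 == id { print $5 }'
}

# Sample qstat output copied from the text above.
sample_qstat='job-ID  prior   name       user         state submit/start at     queue      slots ja-task-ID
---------------------------------------------------------------------------------------------
3532616 0.00000 myjob      po45aod      qw    06/08/2009 17:48:06                1'

state=$(printf '%s\n' "$sample_qstat" | job_state 3532616)
case "$state" in
    qw) echo "job is queued" ;;
    r)  echo "job is running" ;;
    "") echo "job not found in qstat output" ;;
    *)  echo "job is in state $state" ;;
esac
```

On the cluster, the input would come from qstat itself, for example qstat | job_state 3532616.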

A more complete list including jobs of other users can be obtained by issuing

qstat -f -u "*"

If a job specifies resources which cannot be fulfilled (for example, too long a run time), it will be queued but may never be started. To check for this, you can issue

qalter -w v <job_ID> 

or, for an overview of all your jobs, you can issue the command

sge-jobcheck

If all is well, you will receive a message like

verification: found suitable queue(s)

or else you will see a number of lines and the final printout

verification: no suitable queues

Such a job should be removed from the queue since it will never be started.
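
The two verification messages above can also be evaluated in a script, for example to flag jobs that will never start. This is a minimal sketch that relies only on the message texts quoted above; the function name job_can_run is hypothetical, and the sample messages are hard-coded so that the example runs without SGE.

```shell
# Hypothetical helper: read the verification output on stdin.
# Returns 0 if a suitable queue was reported, non-zero otherwise.
job_can_run() {
    grep -q 'verification: found suitable queue'
}

for message in 'verification: found suitable queue(s)' 'verification: no suitable queues'; do
    if printf '%s\n' "$message" | job_can_run; then
        echo "job can be scheduled"
    else
        echo "job will never start and should be removed"
    fi
done
```

On the cluster, the input would be the output of the qalter -w v command for the job in question.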

Deleting jobs from the queue

To remove a job from any SGE queue the command

qdel [-f] <job ID>

should be used. The optional switch -f forces deletion of running jobs and should thus be used with care.

The qstat (1) and qdel (1) man pages provide further information on the use of these commands.

Documentation and Support