Globus user guide

What is Globus?

Globus is a royalty-free, open source toolkit for building grid applications. It also provides command-line tools, for example to log in to a system, transfer files and submit jobs.

Globus services at LRZ

At LRZ, Globus services are deployed on the following machines:

  • Linux cluster

    • login nodes: lxlogin1.lrz.de, lxlogin2.lrz.de, lxlogin4.lrz.de, lxlogin5.lrz.de, lxlogin6.lrz.de

    • visualization servers (accessible only from the login nodes listed above): gvs1.cos.lrz.de, gvs3.cos.lrz.de and gvs4.cos.lrz.de

  • SuperMUC

    • Thin Node Islands login node: sb.supermuc.lrz.de

    • Fat Node Island login node: wm.supermuc.lrz.de

    • SuperMUC phase 2 login node: hw.supermuc.lrz.de

  • Grid gateway to SuperMUC: use gridmuc.lrz.de to access some grid services on SuperMUC from any IP (i.e., no registration of the source address required)

    • Interactive shell on the Thin Node Islands (sb.supermuc.lrz.de): gsissh -p 2222 gridmuc.lrz.de. Alternatively, use port 80 if port 2222 is blocked by your connection provider.

    • Interactive shell on SuperMUC phase 2 (hw.supermuc.lrz.de): gsissh -p 22222 gridmuc.lrz.de. Alternatively, use port 443 if port 22222 is blocked by your connection provider.

    • Interactive shell on the Fat Node Island (wm.supermuc.lrz.de): gsissh -p 22223 gridmuc.lrz.de.

    • GridFTP access to the $HOME of SuperMUC: globus-url-copy -vb gsiftp://gridmuc.lrz.de:2811/~/...

The visualization servers provide only interactive login and file transfer services; the login nodes additionally offer the job submission facility. In the remainder of this document, each service is described in detail in a dedicated section.

Using Globus commands at LRZ

In order to use Globus commands on LRZ resources, it is necessary to load the globus module so that the needed environment variables are set.

  • on Linux cluster type module load globus
  • on SuperMUC
    • Normal users should type: module load globus
    • PRACE users should enter: module load prace globus

Getting access to the Globus resources. Grid user support

Certificate and UNIX account

In the Grid realm, users are identified and authorized by means of X.509 digital certificates. Instructions on how to apply for and install one are available here. The document focuses on long-lived credentials (usually with a lifetime of one year); however, for some German research network sites, it is possible to use the Short Lived Credential Service (SLCS). A set of slides explains (in German) how to use GSI-SSHTerm with SLCS. Please note that on slide 7 you have to select your home institute, which may be different from LRZ.

Your certificate's unique distinguished name (DN) has to be registered on the target machine to enable the usage of Globus services. This is done automatically for accounts belonging to PRACE projects. All other users can register or modify the DN associated with the account by means of the LRZ ID portal. Only a few steps are necessary:

  • log into the LRZ ID portal

  • in the menu on the left, under Account, select view

  • in the main window, a list of accounts is shown; click the Select button next to the one you want to associate the DN with

  • enter (or modify) the DN in the first and only editable row. Please note that the DN should be in RFC 2253 format, that is, the Common Name (CN) appears first and the different fields are comma-separated. For example, a valid RFC 2253 DN is CN=Jon Doe,OU=Leibniz-Rechenzentrum,O=GridGermany,C=DE.

Detailed instructions on this can be found here: Registering your DN in LRZ-SIM.

Registering IP address for the Internet firewall (SuperMUC only)

SuperMUC services can be reached from the Internet only if the user's IP address is known to LRZ. For this or other grid-related issues please use the following contacts:

  • Normal LRZ users: please contact grid-support @ lrz.de to register your IP address and for grid-related questions.

  • PRACE users are kindly asked to use the PRACE trouble ticket system.

Short Globus user guide

Every Globus command provides a -help option and a man page. At the end of this page there are also some links for further reading.

Downloading Globus

The Globus Toolkit (GT) download pages provide precompiled packages for many Linux distributions, both RPMs (Red Hat Enterprise Linux, CentOS, Scientific Linux) and DEBs (Debian, Ubuntu). To compile/install the whole Globus Toolkit from source, please follow these instructions.

Proxy certificate

All Globus commands need a proxy certificate: a short-lived certificate (e.g. valid for 12 hours) derived from the personal one. The section above explained how to deal with certificates, so this is a prerequisite for the next steps.

The reasons to work with time-limited credentials are:

  • single sign-on: the proxy certificate can be used with all Globus services installed on different machines at LRZ or at other sites, without the need to generate it again. In other words, the user carries the proxy certificate along for the whole working session;
  • delegation: by means of proxy certificates, some Globus services (e.g. job submission and Globus Online) can act on behalf of the user, simplifying the entire application workflow and task management.

On a command prompt, the tool employed to create a proxy is grid-proxy-init. It expects to find the personal certificate in PEM format, stored in the .globus subfolder of the user's HOME directory. Please refer to the section "Extracting your certificate" of this page.
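
A typical command-line session could look like the following sketch (the 24-hour lifetime shown in the second command is just an example):

grid-proxy-init                 # create a proxy with the default lifetime (12 hours)
grid-proxy-init -valid 24:00    # alternatively, request a proxy valid for 24 hours
grid-proxy-info                 # display subject, issuer and remaining lifetime of the proxy
grid-proxy-destroy              # remove the proxy at the end of the working session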

GSI-SSHTerm also offers an interface to create and manage proxy certificates. More details are available here.

A proxy certificate can be saved to a MyProxy server so that it is available for online retrieval. The service is particularly useful in case the login node does not offer GSI-SSH and the user does not want to (or cannot) move the personal certificate to that machine. A specific page is dedicated to the MyProxy service.

Interactive login

PRACE users should also read the PRACE User Documentation pages for interactive access. They also explain how to use the handy prace_service script, which avoids the need to know the addresses and ports of the Globus services offered by PRACE partners. Users of the visualization servers should read the remote visualization guide (here the version for GSI-SSHTerm).

Port

The GSI-SSH service port is 2222.

Login from a workstation

The easiest way to log in is to use GSI-SSHTerm, a Java-based client with a graphical user interface. The user guide contains a step-by-step procedure on how to proceed.

It is also possible to install Globus on your machine and use its command line client. Please refer to the Downloading Globus section for more details on how to get the entire toolkit or part of it. The connection procedure consists of two steps:

  • the creation of a proxy certificate, issuing grid-proxy-init
  • the actual connection, specifying the port number, to SuperMUC, for example: gsissh -p 2222 sb.supermuc.lrz.de

Using graphical applications

To enable X11 forwarding with the command-line GSI-SSH client, Linux users should use the -X switch, while on macOS the correct switch is -Y.
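
For example, to open an interactive session on SuperMUC with X11 forwarding enabled:

on Linux: gsissh -X -p 2222 sb.supermuc.lrz.de
on macOS: gsissh -Y -p 2222 sb.supermuc.lrz.de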

GridFTP file transfer service

Port

The GridFTP port on all LRZ systems is 2811.

Usage of globus-url-copy

globus-url-copy is the default command line tool provided by the Globus Toolkit. It allows copying single files or whole directories.

The syntax file:///path/file refers to the file system of the client. A location on a GridFTP server, on the other hand, is indicated as gsiftp://<server>/path/file.

For example, to copy a file from the current directory (environment variable $PWD) to the home folder (~) on SuperMUC:

globus-url-copy file:///$PWD/my_sourcefile.txt gsiftp://sb.supermuc.lrz.de/~/

It is also possible to copy from one GridFTP server to another GridFTP server: both addresses must be given with the gsiftp syntax. If a server uses a port other than the default 2811, it can be appended to the server address. For example, to copy a file from the home folder on CINECA PLX to SuperMUC:

globus-url-copy gsiftp://gftp-plx.cineca.it:2812/~/my_sourcefile.txt gsiftp://sb.supermuc.lrz.de/~/

To recursively copy a folder (-r) and to create the target directory (-cd):

globus-url-copy -cd -r file:///$PWD/mydirectory/ gsiftp://sb.supermuc.lrz.de/~/mydirectory/
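
For large files, throughput can often be improved by opening several parallel data streams with the -p option. A sketch with four streams (the file name is illustrative):

globus-url-copy -vb -p 4 file:///$PWD/large_file.dat gsiftp://sb.supermuc.lrz.de/~/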

The tool is covered in more detail here. PRACE users should read the GridFTP chapter in the PRACE User documentation.

Graphical client

If you prefer a graphical client, or do not want to spend time installing and configuring the Globus Toolkit just to move files, you can use Globus Online. This multi-platform, cloud-based tool is explained in the section Using Globus Online with LRZ resources.

gtransfer

An intermediate solution between the basic globus-url-copy command and Globus Online is gtransfer. The tool is developed within PRACE and is documented in the gtransfer section of the LRZ Grid wiki.

Job submission

GRAM5, the Grid Resource Allocation and Management component of the Globus Toolkit version 5.x, has been deployed on all login nodes of SuperMUC and the Linux Cluster. GRAM5 is not backward compatible with the web services based GRAM, so job submission via web services is no longer supported.

Port

All instances use port 2119.

Batch scheduling system

GRAM5 relies on the batch scheduling system to execute the jobs, offering a uniform access interface to different local resource managers. On the Linux Cluster the batch scheduling system is SLURM, while on SuperMUC it is IBM LoadLeveler. Both machines also have fork available to run single-process jobs, which is the default choice in case nothing else is specified. Fork is not appropriate for real computation, since the process runs on the login node, which should not be used for computation. A user can run a job on the compute nodes by specifying a switch in the contact string (or target address), a parameter passed to the command line tools interacting with GRAM5. The generic form of a contact string is <address>:<port>/<optional-jobmanager-specification>. The switch for SLURM is /jobmanager-slurm, while /jobmanager-loadleveler is used for LoadLeveler. These switches are optional, but if not specified the job will be executed on the login node.
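
For example, assuming the default GRAM5 port 2119 (which can be omitted), valid contact strings are:

lxlogin1.lrz.de/jobmanager-slurm
sb.supermuc.lrz.de:2119/jobmanager-loadleveler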

Job submission command

Globus Toolkit provides three command line tools to run jobs. We will classify them on the basis of how the user can specify the job parameters:

  • in the command line: globus-job-run (for interactive jobs, that is, the output is redirected to the client) and globus-job-submit (for batch jobs, running in the background);
  • in a script: globusrun (both interactive and batch jobs).

In the following sections, a quick overview of the main features will be given.

Simple job

The simplest submission consists of an interactive job: the executable (with its absolute path) is specified as a parameter on the command line. The Globus tool to do that is globus-job-run, for example: globus-job-run lxlogin1.lrz.de /bin/date. Some interesting options are:

  • -dryrun: the job is not submitted; GRAM5 only diagnoses possible problems
  • -verify: as before, but if all checks are successful, the job is then submitted
  • -s: the executable is meant to be on the file system of the machine where globus-job-run is executed, so it is staged in (i. e. transferred)
  • -np <NUM>: NUM instances of the executable are run
  • -m <MAXTIME>: the maximum time in minutes for job scheduling
  • -q <QUEUE>: the queue of the local resource manager where the job is sent

The options are specified between the contact string and the executable name, for example globus-job-run lxlogin1.lrz.de -verify /bin/date. For a comprehensive list of options type globus-job-run -help. The batch submission of a single executable is obtained by means of globus-job-submit, for example: globus-job-submit lxlogin1.lrz.de /bin/date. All the options listed above can still be used; moreover, the following can be specified:

  • -stdin <STDIN>: STDIN is the file used as standard input
  • -stdout <STDOUT>: STDOUT is the file where the standard output is redirected
  • -stderr <STDERR>: STDERR is the file where the standard error is redirected

Remember that you can always specify the -s or -l flags so that globus-job-submit expects STDIN, STDOUT and STDERR respectively on the server (where GRAM5 is running) or on the client (where globus-job-submit is entered, so the files are staged in or out). By default, -s is assumed. You can even specify the flag for each file, i. e. -stdin -s STDIN -stdout -l STDOUT. For an extensive overview, please refer to the man pages by typing globus-job-submit -help or to the documentation on the Globus Toolkit web pages. If the submission is successful, globus-job-submit returns a contact string that can be used to:

  • query the status of the job: globus-job-status <CONTACT_STRING>
  • cancel the job: globus-job-cancel <CONTACT_STRING>
  • get the results (stage out files): globus-job-get-output <CONTACT_STRING>
  • clean the files: globus-job-clean <CONTACT_STRING>
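
As an illustration, a complete batch workflow could look like the following sketch; the contact string printed by globus-job-submit is made up here and will differ in practice:

globus-job-submit lxlogin1.lrz.de/jobmanager-slurm /bin/hostname
globus-job-status https://lxlogin1.lrz.de:20000/12345/1234567890/
globus-job-get-output https://lxlogin1.lrz.de:20000/12345/1234567890/
globus-job-clean https://lxlogin1.lrz.de:20000/12345/1234567890/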

Please refer to the online documentation for further details.

Job script

For non-trivial jobs, a script in the Globus specific RSL (Resource Specification Language) is needed. globusrun can be used to submit such a file to GRAM5, for example: globusrun -s -r lxlogin1.lrz.de -f myscript.rsl. The -s switch initiates interactive mode, -r specifies the contact string, while -f points to the RSL file. Batch mode is triggered by replacing -s with the -b switch; remember to specify a value at least for the stdout and stderr keys in the RSL file. The status can be monitored by entering globusrun -status <CONTACT_STRING>, where <CONTACT_STRING> is the contact string returned after successful submission. More details are available on the man pages (globusrun -help) or directly online. Regarding the RSL language, the script file is a set of statements of the form (key=value). The most commonly used keys are:

  • executable: command with absolute path to be run
  • arguments: command line arguments for the executable
  • directory: where the job is run; the default is the home directory ($GLOBUS_USER_HOME)
  • count: number of processes
  • max_time: length of the job in minutes
  • job_type: use mpi for MPI jobs
  • queue: the queue of the local resource manager where the job should be directed (platform specific)
  • stdin: the file to be used as standard input
  • stdout: the file on the GRAM5 server where to store the standard output of the job
  • stderr: the file on the GRAM5 server where to store the standard error of the job
  • file_stage_in: it is a set of ("remote URL" "local file") values specifying the URL from where the file should be staged in and the destination file name; the usual value for "remote URL" is gsiftp://<GridFTPserver>:<port>/<full_path_to_your_file>
  • file_stage_out: it is a set of ("local file" "remote URL") values specifying the local file to be staged out and the destination URL; the usual value for "remote URL" is gsiftp://<GridFTPserver>:<port>/<full_path_to_your_file>
  • file_clean_up: a list of files to be removed after job completion

A detailed description of the RSL language is part of the Globus Toolkit documentation.

Examples

Example of an MPI job using 3 processor cores and a time limit of 130 minutes:

&   (executable=/home/path/helloHost)
    (arguments=2)
    (directory=$(HOME))
    (stdout=$(HOME)/test.stdout)
    (stderr=$(HOME)/test.stderr)
    (count=3)
    (max_time=130)
    (job_type=mpi)
    (file_stage_in=(gsiftp://my.gridftp.server/path/file $(HOME)/path/))
    (file_stage_out=($(HOME)/output.txt gsiftp://my.gridftp.server/path/))
    (file_clean_up=$(HOME)/test.stdout $(HOME)/test.stderr)
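
Assuming the script above is saved as myjob.rsl (the file name is chosen here only for illustration), it could be submitted in batch mode with, for example:

globusrun -b -r lxlogin1.lrz.de/jobmanager-slurm -f myjob.rsl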

LoadLeveler on SuperMUC specific

LoadLeveler accepts jobs described according to the Job Command File (JCF) format. The LoadLeveler adapter of GRAM5 translates the RSL script into a JCF file according to this table:

RSL keyword     Job Command File keyword
directory       initialdir
stdin           input
stdout          output
stderr          error
count           total_tasks
environment     environment
maxTime         wall_clock_limit
maxWallTime     wall_clock_limit
maxCpuTime      cpu_limit
jobType         job_type
queue           class
project         account_no
hostCount       node
minMemory       Memory variable of requirements

Since RSL is not rich enough to describe all the features of a Job Command File, the following environment variables provide the remaining mappings:

Environment variable      Job Command File keyword
GBLL_COMMENT              comment
GBLL_BLOCKING             blocking
GBLL_REQUIREMENTS         requirements
GBLL_PREFERENCES          preferences
GBLL_TASKS_PER_NODE       tasks_per_node
GBLL_NETWORK_MPI          network.mpi
GBLL_NETWORK_LAPI         network.lapi
GBLL_NETWORK_MPI_LAPI     network.mpi_lapi
GBLL_NETWORK_PVM          network.pvm
GBLL_RESTART              restart
GBLL_POOL                 Pool variable of requirements
GBLL_EMAIL                notify_user

These environment variables and their respective values should be specified in the RSL file using the environment keyword or passed to the globus-job-run command by means of the -env option.
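
For example, to pass one of these variables, something like the following could be used in the RSL file (the values are purely illustrative):

(environment=(GBLL_TASKS_PER_NODE 16)(GBLL_COMMENT "my test job"))

or on the command line:

globus-job-run sb.supermuc.lrz.de/jobmanager-loadleveler -env GBLL_TASKS_PER_NODE=16 /bin/hostname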

According to the resources needed, you can choose the proper queue for your job. Please refer to the tables below for details.

The minimum assignment unit is a node (corresponding to 16 cores on SuperMUC and 40 cores on SuperMIG), so the user will always be charged for a multiple of 16 or 40 cores, as explained in the Node allocation policy.

Depending on the job to be submitted to LoadLeveler, there are different job manager tags to be specified in the contact string. The Parallelization models supported on SuperMUC chapter in the SuperMUC user guide is a good starting point for understanding the supported jobs and compilers. First of all, remember that all job managers:

  • add /etc/profile and /etc/profile.d/modules.sh to the job description submitted to LoadLeveler;
  • if the keyword queue is not specified or is not valid, then the job manager tries to match the correct LoadLeveler class according to the number of cores requested (without considering the special, tmp1 and tmp2 queues). A note on the matching process for the thin node islands: the test queue is taken into account only if the wallclock time is less than two hours, otherwise the job is assigned to the micro queue;
  • set the wallclock time to the maximum allowed by the queue, if not specified.

Moreover, the job manager needs to know how many tasks to run, so define at least count, or host_count together with the environment variable GBLL_TASKS_PER_NODE.

Below you can find some details about the queues available on SuperMUC's thin and fat node islands:

Thin nodes

Queue        Maximum number of cores   Maximum time (hours)
test         512                       2
micro        512                       48
general      8192                      48
large        32768                     48
special      147456                    48
tmp1, tmp2   147456                    48

Fat nodes

Queue        Maximum number of cores   Maximum time (hours)
fattest      160                       2
fat          2080                      48
special      8000                      48
tmp1, tmp2   8000                      48

Please be aware that special, tmp1 and tmp2 are available only to a restricted set of users.

Finally, the following list summarizes the available job managers. For each job manager tag, the job flavour, the way the job is run by LoadLeveler and the expected RSL parameters are given:

  • jobmanager-loadleveler: parallel job run by means of the POE utility as poe <executable> <arguments>; no specific RSL parameters expected.
  • jobmanager-loadleveler_mpi_ibm: parallel MPI job, run as mpiexec -n <total number of cores> <executable> <arguments>; expects count and/or host_count > 1 and an executable built with the IBM compiler.
  • jobmanager-loadleveler_mpi_intel: parallel MPI job, run as mpiexec -n <total number of cores> <executable> <arguments>; expects count and/or host_count > 1 and an executable built with the Intel compiler.
  • jobmanager-loadleveler_hybrid_ibm: parallel hybrid job, run as mpiexec <mpi options> <executable> <arguments>; expects the OMP_NUM_THREADS [1], GBLL_TASKS_PER_NODE and MP_SINGLE_THREAD (optional) environment variables, count or host_count, and an executable built with the IBM compiler.
  • jobmanager-loadleveler_hybrid_intel: parallel hybrid job, run as mpiexec <mpi options> <executable> <arguments>; expects the OMP_NUM_THREADS [1], GBLL_TASKS_PER_NODE and MP_SINGLE_THREAD (optional) environment variables, count or host_count, and an executable built with the Intel compiler.
  • jobmanager-loadleveler_openmp: OpenMP job, run as <executable> <arguments>; expects the OMP_NUM_THREADS [2] and KMP_AFFINITY (optional) environment variables, count and/or host_count > 1, and an executable built with the Intel compiler.
  • jobmanager-loadleveler_parallel: parallel job without POE support, run as <executable> <arguments>. If the executable is a script, its content is appended to the file submitted by GRAM5 to the local resource manager. This allows the user to configure the environment, for example loading the needed modules; if needed, the mpiexec command has to be specified there. Every occurrence of $0 is replaced by the executable's name, while $1, $2, ... are substituted, respectively, by the first, second, ... argument specified in the RSL arguments parameter. The arguments are separated by one or more spaces, so if an entry contains spaces, please enclose it in double quotes.

[1] If OMP_NUM_THREADS is not specified, then it is set by default to <number of requested nodes> * <number of cores per node> / <total number of requested tasks>.

[2] If OMP_NUM_THREADS is not specified, then it is set by default to <number of cores per node>.

SLURM on the Linux Cluster specific

In order to fully understand all the options, it is beneficial to first review how the Linux Cluster is organized.

There are 6 login nodes available; please refer to the following table to check the combination of hostname and job manager needed to reach the desired partition.

Linux Cluster partition   Hostname                                                             Job manager
serial, mpp1, myri        lxlogin1.lrz.de, lxlogin2.lrz.de, lxlogin3.lrz.de, lxlogin4.lrz.de   jobmanager-slurm
mpp2                      lxlogin5.lrz.de, lxlogin6.lrz.de                                     jobmanager-slurm

The contact string (that is, the string to be passed to the GRAM5 tools to identify the resource) would be <Hostname>/<Job manager name>, for instance lxlogin1.lrz.de/jobmanager-slurm to reach the serial partition through the login node lxlogin1.lrz.de.

The login nodes lxlogin1.lrz.de, lxlogin2.lrz.de, lxlogin3.lrz.de and lxlogin4.lrz.de serve three partitions (serial, myri and mpp1) through the same job manager (jobmanager-slurm). The job manager uses the value of the RSL parameter job_type to direct the job to the most appropriate partition. Please refer to the following table:

job_type    Default cluster (partition)
single      serial (serial_std or serial_long)
multiple    myri (myri_large) if embarrassingly parallel; myri (myri_large or mpp1_batch) if OpenMP
mpi         mpp1 (mpp1_batch) if pure MPI; mpp1 (mpp1_batch) if hybrid

Important notes:

    • The environment modules system is set up, so the job manager adds source /etc/profile.d/modules.sh to the SLURM job description to be submitted;
    • if the RSL only defines count, then this is the number of cores requested (-n or --ntasks in SLURM);
    • if the RSL defines host_count, then this is the number of nodes (-N or --nodes in SLURM) and count identifies the number of tasks per node (--ntasks-per-node in SLURM);
    • for debugging purposes, if you want to save the job description file submitted to SLURM, define the DUMP_CMD environment variable in the RSL file (i.e., (environment=(DUMP_CMD yes))). The file will be saved in your home folder as slurm_job-<job id>.cmd.

How to define a serial job

In the RSL file:

      • (job_type=single)
        

Regardless of the other settings, the number of nodes and the number of tasks are set to one. If the time limit (max_time in the RSL) is:

      • lower than 10 minutes, then the job goes to the testserial cluster. It is meant only for testing, since there is only one node;
      • greater than 14400 minutes (10 days), then the job goes to the serial_long partition, where the maximum walltime is 20 days (28800 minutes);
      • otherwise, the default partition is serial_std.

The executable is run as:

cd <job dir>
./<executable> <arguments>
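
A minimal RSL sketch for such a serial job could be (the executable path and the time limit are illustrative):

&   (executable=$(HOME)/bin/my_serial_prog)
    (job_type=single)
    (max_time=60)
    (stdout=$(HOME)/serial.out)
    (stderr=$(HOME)/serial.err)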

How to define an MPI job

In the RSL file:

      • (job_type=mpi)
        
      • (count=n)
        

or

      • (host_count=N)
        
      • (count=n)
        

keeping in mind the note at the beginning regarding the different meaning of count when host_count is defined. All MPI jobs go to the mpp1 cluster; the only partition is mpp1_batch.

The executable is run as:

cd <job dir>
mpiexec ./<executable> <arguments>

If the executable is a script, then its content is appended to the file submitted by GRAM5 to the local resource manager. This allows the user to configure the environment, for example loading the needed modules, but the mpiexec command has to be specified inside the script itself. Moreover, every occurrence of mpirun or srun_ps is replaced by mpiexec. Each $0 appearing in the executable script is replaced by the executable's name, while $1, $2, ... are substituted, respectively, by the first, second, ... argument specified in the RSL arguments parameter. The arguments are separated by one or more spaces, so if an entry contains spaces, please enclose it in double quotes.
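
A corresponding RSL sketch for a pure MPI job could be (node and task counts as well as the executable path are illustrative):

&   (executable=$(HOME)/bin/my_mpi_prog)
    (job_type=mpi)
    (host_count=2)
    (count=16)
    (max_time=120)
    (stdout=$(HOME)/mpi.out)
    (stderr=$(HOME)/mpi.err)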

How to define an embarrassingly parallel job

In the RSL file:

      • (job_type=multiple)
        
      • (count=n)
        

or

      • (host_count=N)
        
      • (count=n)
        

keeping in mind the note at the beginning regarding the different meaning of count when host_count is defined. All embarrassingly parallel jobs go to the myri cluster; the only and default partition is myri_large.

The executable is run as:

cd <job dir>
./<executable> <arguments>

How to define a Hybrid job

In the RSL file:

      • (job_type=mpi)
      • (environment=(CPUS_PER_TASK c))
      • (count=n)

or

      • (host_count=N)
        
      • (count=n)
        

keeping in mind the note at the beginning regarding the different meaning of count when host_count is defined. The number of CPUs per task is given through an environment variable defined in the RSL file. All hybrid jobs go to the mpp1 cluster; the only partition is mpp1_batch.

The executable is run as:

#SBATCH --cpus-per-task=c
cd <job dir>
mpiexec -t c ./<executable> <arguments>

If the executable is a script, then its content is appended to the file submitted by GRAM5 to the local resource manager. This allows the user to configure the environment, for example loading the needed modules, but the srun_ps command has to be specified inside the script itself. Moreover, every occurrence of mpirun or srun_ps is replaced by mpiexec. Each $0 appearing in the executable script is replaced by the executable's name, while $1, $2, ... are substituted, respectively, by the first, second, ... argument specified in the RSL arguments parameter. The arguments are separated by one or more spaces, so if an entry contains spaces, please enclose it in double quotes.
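
An RSL sketch for a hybrid job could be (all numbers and the executable path are illustrative):

&   (executable=$(HOME)/bin/my_hybrid_prog)
    (job_type=mpi)
    (host_count=2)
    (count=4)
    (environment=(CPUS_PER_TASK 4))
    (max_time=120)
    (stdout=$(HOME)/hybrid.out)
    (stderr=$(HOME)/hybrid.err)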

How to define an OpenMP (or shared memory) job

In the RSL file:

      • (job_type=multiple)
      • (count=n)
      • (environment=(CPUS_PER_TASK c)(OMP_NUM_THREADS o))

The number of nodes is always set to one. The number of CPUs per task and the number of threads are given through environment variables defined in the RSL file. All OpenMP jobs go to the myri cluster; the only and default partition is myri_large. If the number of CPUs per task or the number of threads exceeds the threshold, the job is switched to the mpp1_batch partition.

The executable is run as:

#SBATCH --cpus-per-task=c
...
export OMP_NUM_THREADS=o
...
cd <job dir>
./<executable> <arguments>
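
An RSL sketch for an OpenMP job could be (the numbers and the executable path are illustrative; count is set to one task here):

&   (executable=$(HOME)/bin/my_openmp_prog)
    (job_type=multiple)
    (count=1)
    (environment=(CPUS_PER_TASK 8)(OMP_NUM_THREADS 8))
    (max_time=60)
    (stdout=$(HOME)/openmp.out)
    (stderr=$(HOME)/openmp.err)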

Log files

In case something goes wrong, you can check the content of the following files in the user's home folder:

      • gram_YYYYMMDD.log: created by globus-job-manager, valid for all schedulers (batch or fork)
      • gram_loadleveler_log.<unique id of the job>: created by the LoadLeveler adapter in case something goes wrong in the conversion between RSL and JCF or during the submission (i. e. missing mandatory parameters in the JCF)
      • Loadleveler_job-<unique id of the job>.jcf: the JCF file submitted to LoadLeveler, saved if the DUMP_JCF environment variable is set to "yes"

Further reading

The Globus Toolkit official User Guide contains specific user guides for each service: