GROMACS - A Molecular Dynamics Package

Description of the LRZ-specific usage of GROMACS on the Linux Cluster HPC systems.

Introductory Remarks

What is GROMACS?

GROMACS is a versatile package to perform molecular dynamics, i.e. simulate the Newtonian equations of motion for systems with hundreds to millions of particles.

It is primarily designed for biochemical molecules like proteins and lipids that have a lot of complicated bonded interactions, but since GROMACS is extremely fast at calculating the nonbonded interactions (that usually dominate simulations) many groups are also using it for research on non-biological systems, e.g. polymers.

GROMACS is licensed and redistributed under the GPL.

Please consult the GROMACS web site for further information on this package.

The xdrfile library facility for I/O to xtc, edr and trr files is also available.

Authors

GROMACS was first developed in Herman Berendsen's group in the Department of Biophysical Chemistry at Groningen University. It is a team effort, with contributions from several current and former developers all over the world.

GROMACS installations at LRZ

The following table gives an overview of GROMACS installations at LRZ. Default versions are indicated by a bold red font. Note that for newer GROMACS releases, normally only the major and minor versions are specified in the module name; the most recently installed bug fix release is referenced from there.

 

Platform                         Releases            Precision  Remarks
x86_64 Linux Cluster (SGE cell)  3.3.3, 4.0, 4.5     single     MPI parallel program mdrun_mpi available
x86_64 Linux Cluster (SGE cell)  3.3.3d, 4.0d, 4.5d  double     MPI parallel program mdrun_mpi available
Linux Cluster (SLURM cell)       4.5                 single     MPI parallel program mdrun_mpi available
Linux Cluster (SLURM cell)       4.5d                double     MPI parallel program mdrun_mpi available
Altix systems                    3.3.3, 4.0, 4.5     single     MPI parallel commands available for SGI Altix (mdrun_altix)
Altix systems                    3.3.3d, 4.0d, 4.5d  double     MPI parallel commands available for Parastation MPI (mdrun_mpi) and SGI Altix (mdrun_altix)
SuperMUC                         4.5                 single     MPI parallel program mdrun_mpi available
SuperMUC                         4.5d                double     MPI parallel program mdrun_mpi available

Please consult the example batch scripts below for how to use the MPI parallel versions. The single precision builds typically show larger numerical instabilities than the double precision builds. Furthermore, the GROMACS executables have the same name in both precisions (no additional _d suffix for the double precision version); the MPI parallel mdrun_xxx binaries are named after the MPI variant they are built for.

Usage

Before using GROMACS, you need to load an appropriate environment module:

module load gromacs
  • On all systems, the single precision version will be loaded by default. To use the double precision builds, please issue e.g.,

    module load gromacs/4.5d
    

    (the version number with the attached "d" indicates double precision).

  • On Itanium systems, the 3.3.x releases use assembler loops, which for many cases give considerable performance improvements. You can deactivate the assembler loops by setting the environment variable NOASSEMBLYLOOPS, but this will give much lower performance in most cases. The 4.0 and higher versions are built without assembler loops for Itanium, using the Fortran kernels instead, since the assembler loops do not work any more on recent IA64 processors.

  • Note that in the GROMACS path there are automatic shell completion files available (completion.$SHELL) which add all GROMACS file extensions if you source them into your shell.
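The naming convention described in the notes above can be sketched as a small shell helper. This is an illustration only: the function gromacs_module_name is hypothetical and not part of the LRZ module system; it merely appends the "d" suffix for double precision to a release number.

```shell
# Hypothetical helper: builds a module name from a release number and a
# precision, following the convention that an attached "d" marks the
# double precision build. Single precision is the default.
gromacs_module_name() {
  gmx_version="$1"
  gmx_precision="${2:-single}"
  case "$gmx_precision" in
    single) echo "gromacs/${gmx_version}" ;;
    double) echo "gromacs/${gmx_version}d" ;;
    *) echo "unknown precision: $gmx_precision" >&2; return 1 ;;
  esac
}

# Print the command that would load the double precision 4.5 build.
echo "module load $(gromacs_module_name 4.5 double)"
```

On the cluster, the printed command can be run directly, or the helper can be used inline as module load "$(gromacs_module_name 4.5 double)".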

Using the xdrfile library

To use this library, please load the module

module load xdrfile

and compile/link your application using the environment variables $XDRFILE_INC and $XDRFILE_LIB, respectively.
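A compile and link step with these variables might look like the following dry-run sketch. It assumes that $XDRFILE_INC expands to compiler include flags and $XDRFILE_LIB to linker flags; the source file name myprog.c, the compiler name cc and the helper xdrfile_build_commands are placeholders that do not come from the LRZ installation. The helper only prints the commands so that they can be inspected before being run.

```shell
# Dry-run helper (hypothetical): prints the compile and link commands for a
# C source file instead of executing them.
# Assumption: XDRFILE_INC holds include flags, XDRFILE_LIB holds linker flags.
xdrfile_build_commands() {
  xdr_src="$1"
  xdr_exe="${xdr_src%.c}"
  echo "cc -c ${XDRFILE_INC:-} $xdr_src"
  echo "cc -o $xdr_exe $xdr_exe.o ${XDRFILE_LIB:-}"
}

# myprog.c is a placeholder for your own application source.
xdrfile_build_commands myprog.c
```

After module load xdrfile, the two printed lines can be pasted into the shell, with cc replaced by the compiler you normally use.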

Setting up batch jobs

GROMACS on the Linux Cluster systems running SGE:

For long production runs, an SGE batch job should be used to run the program. The example batch scripts provided in this section require the input files speptide.top, after_pr.gro and full.mdp, all contained in the example archive, to be placed in ~/mydir before the run.

Further notes:

  • to run in batch mode, submit the script using the qsub command. To run small test cases interactively, start the script directly.

  • the parallel batch script uses the variable $NSLOTS, which is either set by SGE or, if you run the script interactively, set to 4 CPUs by the script itself. $NSLOTS is referred to repeatedly in the subsequent lines of the SGE script.

  • for parallel runs, some MPI implementations do not find the executable, hence its full pathname is specified via "which".

  • for the 4.0.x versions, the grompp command does not accept the -np argument any more. Please consult the documentation for how to set up parallel runs in 4.0.x

  • for batch jobs, the nice switch is set to 0 for mdrun. Please omit this switch when running interactively, otherwise your job will be forcibly removed from the system after some time.

  • please do not forget to replace the dummy e-mail address in the example scripts with your own.
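Two of the notes above, the $NSLOTS fallback and the nice switch, can be sketched in shell as follows. The helper names slots_or_default and nice_switch are hypothetical, and using the JOB_ID variable to detect an SGE batch job is an assumption of this sketch rather than something the example scripts do.

```shell
# Slot count: use the value set by SGE, or fall back to 4 when the script
# is started interactively and NSLOTS is unset or empty.
slots_or_default() {
  echo "${NSLOTS:-4}"
}

# Nice switch: emit "-nice 0" only inside a batch job.
# Assumption: SGE sets JOB_ID in the job environment.
nice_switch() {
  if [ -n "${JOB_ID:-}" ]; then
    echo "-nice 0"
  fi
}

# Show the values that would be used in the current environment.
echo "slots: $(slots_or_default)"
echo "mdrun -v $(nice_switch) -s full"
```

The same pattern can be dropped into a job script, for example NSLOTS=$(slots_or_default) before the mpiexec line.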

GROMACS on the Cluster systems running the SLURM batch scheduler:

A SLURM script must be generated, which is submitted via the sbatch command.

GROMACS on the HLRB-II:

A PBS job script must be generated and submitted. An example is given at the end of the table below.

GROMACS on SuperMUC:

A LoadLeveler job script must be generated and submitted. An example is given at the end of the table below.

 

Serial processing

x86-64 Cluster

#!/bin/bash
#$-o $HOME/mydir/gromacs.out -j y
#$-N gromacs
#$-S /bin/bash
#$-l march=x86_64 
#$-M  wrzlprmft@mydomain
. /etc/profile
cd mydir
module load gromacs
grompp -v -f full -o full -c after_pr -p speptide
mdrun -v -nice 0 -s full -e full -o full -c after_full -g flog

Parallel processing
A run time limit of 24 hours is specified in the examples. You can increase this up to the queue limit, but longer run times incur a greater risk of job crashes.

x86_64 based systems in the cluster (SGE batch processing)

#!/bin/bash
#$-o $HOME/mydir/gromacs.out -j y
#$-N gromacs
#$-S /bin/bash
#$-l h_rt=24:00:00
#  Please choose "mpi_8" for 8-way opterons,
#  "mpi_ice" for sgi ICE, or "uv" for sgi Ultraviolet
#$-pe [mpi_8|mpi_ice|uv] 32
#$-M  wrzlprmft@mydomain
. /etc/profile
cd mydir
module load gromacs

if [ -z "$NSLOTS" ] ; then
  export NSLOTS=4
fi
grompp -v -f full -o full -c after_pr -p speptide
mpiexec -n $NSLOTS $(which mdrun_mpi) -v -s full -e full \
       -o full -c after_full -g flog -N $NSLOTS

x86_64 based Cluster systems (SLURM batch processing)

#!/bin/bash
#SBATCH -o /home/cluster/<group>/<user>/myjob.%j.out
#SBATCH -D /home/cluster/<group>/<user>/mydir
#SBATCH -J <job_name>
#SBATCH --partition <partition_name>
#SBATCH --ntasks=64
#SBATCH --mail-type=end
#SBATCH --mail-user=<email_address>@<domain>
#SBATCH --export=NONE
#SBATCH --time=24:00:00
 
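. /etc/profile
# the working directory is already set by the -D directive above
module load gromacs
grompp -v -f full -o full -c after_pr -p speptide
# SLURM sets SLURM_NTASKS from --ntasks; NSLOTS is an SGE variable
srun_ps5 -n $SLURM_NTASKS $(which mdrun_mpi) -v -s full -e full \
         -o full -c after_full -g flog -N $SLURM_NTASKS -nt 1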

SGI Altix (Linux-Cluster, SGE batch processing)

#!/bin/bash
#$-o $HOME/mydir/gromacs.out -j y
#$-N gromacs
#$-S /bin/bash
#$-l h_rt=24:00:00
#$-pe numa* 16
#$-M  wrzlprmft@mydomain
. /etc/profile
cd mydir
module load gromacs

if [ -z "$NSLOTS" ] ; then
  export NSLOTS=4
fi
grompp -v -f full -o full -c after_pr -p speptide
mpirun -np $NSLOTS $(which mdrun_altix) -v -s full -e full \
       -o full -c after_full -g flog -N $NSLOTS

Parallel processing on HLRB-II
The PBS batch queuing software is used on this system.

#!/bin/ksh
#PBS -o $HOME/mydir/gromacs.out
#PBS -N gromacs
#PBS -l select=16
#PBS -l walltime=10:00:00
#PBS -M <your email address>
. /etc/profile.d/modules.sh
cd mydir
module load gromacs

grompp -v -f full -o full -c after_pr -p speptide
mpiexec  $(which mdrun_altix) -v -s full -e full \
       -o full -c after_full -g flog -N 16

Parallel processing on SuperMUC with LoadLeveler

TBD

Expected performance

According to the internal performance evaluation of mdrun, the small test system referred to in the example job scripts achieves around 1 GFlop/s on both the IA32 (2.8 GHz) and Itanium (1.6 GHz) systems if dedicated CPUs are available and the double precision version is used. For parallel runs, scaling depends on the system size; the test system achieves a speedup of 2.1-2.2 with 4 CPUs. Please run tests of your own to optimize the number of processors for your input data.
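To put the quoted speedup into perspective, parallel efficiency (speedup divided by the number of CPUs) can be computed with a small helper. The function parallel_efficiency is a hypothetical illustration; the only inputs taken from this page are the speedup range of 2.1-2.2 and the 4 CPUs.

```shell
# Parallel efficiency in percent: 100 * speedup / number of CPUs.
# Arguments: $1 = measured speedup, $2 = number of CPUs.
parallel_efficiency() {
  awk -v s="$1" -v n="$2" 'BEGIN { printf "%.1f\n", 100 * s / n }'
}

# Efficiency for the lower and upper end of the quoted speedup range.
parallel_efficiency 2.1 4
parallel_efficiency 2.2 4
```

For the small test system this gives an efficiency of a little over 50 percent on 4 CPUs, which illustrates why the number of processors should be tuned to the size of your own input.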

The Gromacs benchmark results for the LRZ HPC systems are also available.

Documentation

After loading the environment module, the $GROMACS_DOC variable points to a directory containing documentation and tutorials.

For GROMACS 4.0 and higher, parallel scalability has been much improved. The invocation of grompp and of the parallel mdrun_mpi and mdrun_altix binaries has changed slightly.

For further information (including the man pages for all GROMACS subcommands), please refer to the GROMACS web site.