Marmot - an MPI checker

This tool helps parallel programmers isolate MPI-related programming errors in large-scale parallel codes. Marmot monitors the MPI calls a program makes and automatically checks at runtime that these calls and their arguments are used correctly. It does not replace classical debuggers, but can be used alongside them.

Availability

Marmot is presently available on the IA64-based SGI Altix systems at LRZ; programs built with MPT can be analyzed.

Basic Usage

To enable the tool, please load the environment module

module load marmot

Then rebuild your MPI code completely, replacing

  • mpif90 with marmotf90
  • mpicc with marmotcc
  • mpiCC with marmotcxx
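
When the build system takes its compiler names from variables, the substitution listed above can be scripted. The following is only a minimal sketch: the helper name marmot_wrapper and the Makefile variables FC, CC and CXX are assumptions made for illustration and are not part of Marmot. The rebuild command is printed rather than executed.

```shell
# Map a standard MPI compiler wrapper to its Marmot counterpart.
# marmot_wrapper is a hypothetical helper, not a Marmot command.
marmot_wrapper() {
    case "$1" in
        mpif90) echo marmotf90 ;;
        mpicc)  echo marmotcc ;;
        mpiCC)  echo marmotcxx ;;
        *)      echo "no Marmot wrapper known for '$1'" >&2; return 1 ;;
    esac
}

# Print (do not run) a full rebuild command.
# Assumes a Makefile that honours FC, CC and CXX.
echo "make clean && make FC=$(marmot_wrapper mpif90) CC=$(marmot_wrapper mpicc) CXX=$(marmot_wrapper mpiCC)"
```

The "make clean" step reflects the requirement that the code is rebuilt completely, so that no object files compiled with the plain MPI wrappers remain.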

Finally, run the resulting executable with one more MPI task than you would normally use:

mpirun -np <n+1>  ./<executable_name>
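
As a small sketch of the task arithmetic, assuming an application that normally runs on 8 tasks and a placeholder executable name ./my_app (both are example values, not taken from the Marmot documentation), the launch line can be assembled like this. The command is printed rather than executed.

```shell
# Number of MPI tasks the application itself needs (example value).
APP_TASKS=8

# Marmot needs one additional task for its debug server.
TOTAL_TASKS=$((APP_TASKS + 1))

# ./my_app is a placeholder executable name; the command is only printed.
LAUNCH_CMD="mpirun -np ${TOTAL_TASKS} ./my_app"
echo "$LAUNCH_CMD"
```

If the run goes through a batch system, the number of tasks requested there has to cover the extra task as well.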

The additional MPI task is used by Marmot as its debug server. During the run, the program will

  • signal to standard output if a deadlock is encountered
  • write a file Marmot_<executable_name>_<date>_<time>.txt containing the information about any MPI errors diagnosed.
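
Because every run writes a new log file, it can be handy to pick out the most recent one. The sketch below assumes only the file name pattern given above; the helper name latest_marmot_log is hypothetical, and the exact format of the date and time fields is not relied upon.

```shell
# Print the most recently modified Marmot log in a directory
# (default: the current directory). Hypothetical helper, not part of Marmot.
latest_marmot_log() {
    dir="${1:-.}"
    newest=""
    # The glob follows the Marmot_<executable_name>_<date>_<time>.txt pattern.
    for f in "$dir"/Marmot_*_*_*.txt; do
        [ -e "$f" ] || continue          # glob matched nothing
        if [ -z "$newest" ] || [ "$f" -nt "$newest" ]; then
            newest="$f"
        fi
    done
    if [ -n "$newest" ]; then
        echo "$newest"
    fi
}

log="$(latest_marmot_log)"
if [ -n "$log" ]; then
    echo "Latest Marmot log: $log"
else
    echo "No Marmot log found in the current directory"
fi
```

The comparison uses the file modification time (the -nt test, which is supported by bash, ksh and dash) rather than parsing the time stamp in the file name.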

If you run into trouble using the Marmot wrappers, please consult the user's guide linked in the documentation section below.

Besides the default plain-text (ASCII) output, other log formats are supported. Please set the environment variable

export MARMOT_LOGFILE_TYPE=1

to generate HTML files instead.

Support

Please contact the LRZ HPC support team if you run into trouble using this tool.

Documentation

Once the environment module is loaded, the variable MARMOT_DOC points to the location of the documentation. The user's guide is also available from the LRZ web server; it contains much more information on further configuration possibilities for Marmot.
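
To see what is stored at that location, the variable can simply be listed. This is a minimal sketch; the helper name show_marmot_doc is hypothetical, and the contents of the documentation location are not specified here.

```shell
# List the Marmot documentation location, or explain why it cannot be listed.
# show_marmot_doc is a hypothetical helper, not part of Marmot.
show_marmot_doc() {
    if [ -n "${MARMOT_DOC:-}" ]; then
        ls "$MARMOT_DOC"
    else
        echo "MARMOT_DOC is not set; load the marmot module first"
    fi
}

show_marmot_doc
```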