
Intel Tracing Tools: Profiling of MPI programs

Short introduction to the Tracing Tools, used for performance analysis, tuning and debugging of parallel programs.



Introduction

The Intel Tracing Tools, comprising Trace Collector (ITC), Trace Analyzer (ITA) and Message Checker, support the development and tuning of programs parallelized with the MPI message passing interface. With these tools you can investigate the communication structure of your parallel program and thereby isolate incorrect or inefficient MPI usage.

  • Trace Collector provides an MPI tracing library which produces tracing data collected during a typical program run; these tracing data are written to disk in an efficient storage format for subsequent analysis.
  • Trace Analyzer provides a GUI for analysis of the tracing data.
  • Message Checker allows you to identify certain classes of bugs in your MPI-parallel algorithm.

Installation on LRZ HPC platforms

Installed versions

Platform | Trace Collector | Trace Analyzer | Remarks
IA64 Linux cluster and Altix superclusters | 7.1, 7.2 | 7.1, 7.2 | available for SGI's MPT, MPICH (e.g., Parastation MPI), and Intel MPI
x86_64 Intel-based systems | 7.1 | 7.1 | available for Intel MPI

Remarks on the various MPI flavors supported

  • The SGI MPT version also supports tracing of SHMEM calls. These are grouped into a separate class.
  • There are linkage problems when using Tracing with MPICH installations. This appears to be due to changes in the linker functionality introduced in SLES10. As a workaround, please use the additional link switch --allow-multiple-definition. The bug has been reported and will hopefully be fixed in a future update.
  • ITC will not trace applications on non-Intel processors. However, ITA can also be used on non-Intel systems to analyze existing trace files.

Usage

Initialization

Before using either Trace Collector or Trace Analyzer it is necessary to load the appropriate environment module:
module load mpi_tracing
Note that this module loads the tracing environment that matches the currently loaded MPI module; not all available MPI environments are supported (see the table above for details). In particular, if you switch to an MPI environment other than the default, you must unload mpi_tracing and then reload it after the new MPI environment has been configured.

Tracefile Generation

If the program source is unchanged (that is, no explicit calls to the tracing API have been added), it is sufficient to relink the executable. In all other cases the sources must also be recompiled. In either case, use the MPI wrapper scripts to perform compilation; the following are supported on all LRZ HPC platforms:
  • mpif77 -g -vtrace -c <further options> myprog.f
    for compilation of Fortran 77 programs
  • mpif90 -g -vtrace -c <further options> myprog.f90
    for compilation of Fortran 90/95 programs
  • mpicc -g -vtrace -c <further options> myprog.c
    for compilation of C programs
  • mpiCC -g -vtrace -c <further options> myprog.C
    for compilation of C++ programs
Linking is performed in the same way; do not forget to add the -vtrace option there too. Also note that -g usually implies that optimization is turned off unless you add the optimization flags back in the <further options> section.
Note that if you use Intel MPI (which is the case if the environment module mpi.intel is loaded), you need to replace the -vtrace option by -t=log.
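The choice between -vtrace and -t=log can be scripted. The following is a minimal sketch, not an LRZ-provided tool: it assumes that the modules system exports the LOADEDMODULES variable as a colon-separated list (as Environment Modules normally does) and that the Intel MPI module appears there as mpi.intel, possibly with a /version suffix. The helper name trace_flag and the -O2 flag are illustrative choices. The commands are only printed (dry run), so the sketch can be tried without an MPI installation.

```shell
#!/bin/sh
# Pick the tracing flag that matches the loaded MPI module.
# Assumption: LOADEDMODULES is a colon-separated list of loaded modules
# and Intel MPI shows up as "mpi.intel" or "mpi.intel/<version>".
trace_flag() {
    case ":${LOADEDMODULES:-}:" in
        *:mpi.intel:*|*:mpi.intel/*) echo "-t=log" ;;   # Intel MPI
        *)                           echo "-vtrace" ;;  # all other MPI flavors
    esac
}

TRACE=$(trace_flag)

# Dry run: print the compile and link commands instead of executing them.
# -O2 is added back explicitly because -g usually turns optimization off.
echo "mpicc -g -O2 $TRACE -c myprog.c"
echo "mpicc -g -O2 $TRACE -o myprog.exe myprog.o"
```

To actually build, remove the echo wrappers or paste the printed lines into your build script.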

Automatic subroutine tracing

By default, only the MPI part of the program can be resolved ("ungrouped") into the various API calls. If you also wish to resolve subroutine calls, you either need to make use of explicit API calls, or perform automatic subroutine instrumentation. Please note that the latter method may involve a much larger overhead compared to explicit API calls.

By compiler switch

Simply specify the -tcollect switch in addition to -vtrace and recompile as well as relink your application.

By binary instrumentation

This is available for EM64T and Itanium based applications, but is not supported for the SGI MPT used on Altix systems. It is recommended to use this functionality with Intel MPI.
  • Perform

        module unload mpi.altix mpi.parastation
        module load mpi.intel
        module load mpi_tracing

    and build your MPI application as usual (i.e., without extra switches for tracing)

  • Run your application with the command line

        mpiexec -n <No. of MPI tasks> itcpin --profile \
         --run -- ./myprog.exe <application-specific switches>

    Note that the --profile switch will perform instrumentation not only on MPI, but also your own subroutine calls, LIBC calls etc. and may considerably increase the size of trace files unless you take steps to filter excess information.

See section 3.5 of the ITC User's Reference (linked in the documentation subsection below) for further switches usable with itcpin.

Configuration File

A configuration file, which may be given any name, can contain a large number of entries that control how tracing is performed. Set the environment variable VT_CONFIG to the full path name of this file. Here is an example of the kinds of entries it may contain:

# Log file
LOGFILE-NAME myprog.stf
LOGFILE-FORMAT STF
# disable all MPI activity
ACTIVITY MPI OFF
# re-enable selected MPI calls
SYMBOL MPI_WAITALL ON
SYMBOL MPI_IRECV ON
SYMBOL MPI_ISEND ON
SYMBOL MPI_BARRIER ON
SYMBOL MPI_ALLREDUCE ON
# enable all activities in the Application class
ACTIVITY Application ON
Please consult the user's guide for further settings, some of which can help limit the amount of generated trace data.
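As a minimal sketch of how such a file might be put in place, the following script writes a configuration file and exports VT_CONFIG with its full path. The file name vt_myprog.conf and its location in the current directory are hypothetical; the entries are a subset of the example above.

```shell
#!/bin/sh
# Write a tracing configuration file and point VT_CONFIG at it.
# Assumption: the file name and location are arbitrary examples;
# $PWD is used so that VT_CONFIG holds a full (absolute) path.
conf="$PWD/vt_myprog.conf"

cat > "$conf" <<'EOF'
# Log file
LOGFILE-NAME myprog.stf
LOGFILE-FORMAT STF
# disable all MPI activity, then re-enable selected calls
ACTIVITY MPI OFF
SYMBOL MPI_ISEND ON
SYMBOL MPI_IRECV ON
SYMBOL MPI_WAITALL ON
EOF

VT_CONFIG=$conf
export VT_CONFIG
echo "VT_CONFIG=$VT_CONFIG"
```

The variable must be set in the environment from which the traced program is started, for example in the batch script before the mpiexec line.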

LRZ specific configurations

The VT_FLUSH_PREFIX environment variable, which denotes the path for the intermediate traces, is set by the mpi_tracing environment module to point at the high-bandwidth scratch file system. This prevents the /tmp file system from overflowing when large traces are generated.

Using Control Calls

In order to obtain more fine-grained control over the tracing procedure, it is possible to insert suitable subroutine calls into the program source code. For example, a call of
     VT_traceoff()

will switch off tracing for the subsequent program execution flow, and
     VT_traceon()

will switch tracing back on again. With

     VT_begin(mark), VT_end(mark)

you can mark certain program regions. Since tracing usually involves a performance overhead, it is recommended to use preprocessor macros so that tracing calls are only compiled in during the optimization phase. For a C/C++ program this looks as follows:

     #ifdef USE_VT
     #include "VT.h"
     #endif
        .....
     #ifdef USE_VT
     VT_traceoff();
     #endif

Note that an additional include file VT.h is required for C programs. For the above example, you would compile with the command

     mpicc -c -o myfoo.o myfoo.c -vtrace -DUSE_VT

(where the macro name USE_VT is arbitrary). This method is also applicable to Fortran programs if the file name extension .F is used, so that the C preprocessor is applied automatically before the actual compilation.

For details of the ITC API, please consult the documentation.

After your tracing run has finished, and assuming your program's name is myprog, you will find a number of files named myprog.stf*. To analyze them, issue the command

     traceanalyzer myprog.stf

The previous command name, vampir, will also remain available for some time.

Message Checking

Error detection for MPI code is only supported via Intel MPI. Hence, please perform the following steps:

  1. Load the module stack supporting MPI checking:
         module unload mpi_tracing
         module unload mpi.parastation mpi.altix
         # may need to unload further modules
         module load mpi.intel
         module load mpi_tracing
    
  2. Completely recompile your application with debug symbols switched on:
         mpif90 -g -O2 -c foo.f90
         ...
         mpif90 -g -O2 -o myprog.exe myprog.f90 foo.o ...
    
    Dynamic linkage must be performed. This is necessary to allow the LD_PRELOAD mechanism described below to work. 
  3. Run the program as follows:
         mpiexec -genv LD_PRELOAD libVTmc.so -n <No. of MPI tasks> ./myprog.exe
    
  4. The Message Checker report is written to standard error. Please check all lines marked ERROR or WARNING. Because the program was compiled with debug symbols, line information is also displayed, which helps pinpoint the location of the bug.

Further environment variables can be specified with additional -genv clauses on the mpiexec line:

Variable | Default Value | Meaning
VT_DEADLOCK_TIMEOUT | 60 | maximum interval to wait (in seconds) for deadlock detection
VT_DEADLOCK_WARNING | 300 | maximum interval to wait (in seconds) for deadlock warning
VT_CHECK_MAX_ERRORS | 1 | maximum number of errors before aborting
VT_CHECK_MAX_REPORTS | 0 (unlimited) | maximum number of reports before aborting

This list is not complete. At run time, further settings are indicated in the output lines marked INFO, as well as in the file <program name>.prot. The latter also indicates which variables have been changed from their defaults.
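To illustrate how several such settings combine on one mpiexec line, here is a small sketch that assembles the command from VAR=VALUE pairs and prints it without running it. The helper name mc_command and the values 120 and 10 are illustrative assumptions only; the -genv NAME VALUE form is the one used in step 3 above.

```shell
#!/bin/sh
# Assemble (and print) a Message Checker command line.
# Usage: mc_command NTASKS PROGRAM [VAR=VALUE ...]
# Assumption: mc_command is a hypothetical helper, not part of the tools.
mc_command() {
    ntasks=$1
    prog=$2
    shift 2
    cmd="mpiexec -genv LD_PRELOAD libVTmc.so"
    for setting in "$@"; do
        # split VAR=VALUE into the two words expected by -genv
        cmd="$cmd -genv ${setting%%=*} ${setting#*=}"
    done
    echo "$cmd -n $ntasks $prog"
}

# Example values are arbitrary; adjust them to your application.
mc_command 4 ./myprog.exe VT_DEADLOCK_TIMEOUT=120 VT_CHECK_MAX_ERRORS=10
```

When running the printed command, it can be convenient to redirect standard error to a file and then search that file for the lines marked ERROR or WARNING.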

Documentation

Manuals and Weblinks

Course Material

LRZ's HPC training courses usually include an ITA/ITC tutorial. It covers

  • setting up tracing runs

  • programming the API

  • hints on tracing configuration

  • usage of the GUI

The lecture notes of the most recent training course are available; note, however, that they may not always be entirely up to date with the latest software release.

Troubleshooting

Please consult the appropriate sections in the Troubleshooting Document.