Vampir NG: Analyzing Parallel Efficiency
Vampir, comprising the tracing libraries and the Vampir GUI for analyzing run-time traces, supports the development and tuning of programs parallelized with, e.g., the MPI message-passing interface. Using these tools, you can investigate the communication structure of your parallel program and thereby isolate inefficient MPI programming. In particular, the parallel capabilities of Vampir Server allow you to analyze very large trace files efficiently.
Installation on LRZ HPC platforms
The tracing software as well as the GUI for visualizing the trace files are available on all HPC platforms at LRZ. However, the Vampir Server software, which additionally allows parallel analysis of (large) trace files, is only available on SuperMUC.
Before using Vampir or Vampirtrace, it is necessary to load the appropriate environment module:
module load vampir
Note that this module loads a tracing environment that depends on the currently loaded MPI module; not all available MPI environments are supported (see the table above for details).
The standard method of invoking the MPI compilers (via the mpif90, mpicc, ... wrapper scripts) must be replaced by special build commands that automatically perform instrumentation. Use the following commands for compiling:
| Language | Compilation line template |
|---|---|
| Fortran 77 | vtf77 [-g] -c <further options> myprog.f |
| Fortran 90 and higher | vtf90 [-g] -vt:f90 mpif90 -c <further options> myprog.f90 |
| C | vtcc [-g] -vt:cc mpicc -c <further options> myprog.c |
| C++ | vtcxx [-g] -c -vt:cxx mpiCC <further options> myprog.cpp |
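As an illustration, a Fortran 90 MPI source file could be compiled for tracing as follows (the file name and the -O2 level are only examples):

```shell
# Compile an MPI Fortran 90 source with VampirTrace instrumentation;
# -g adds debug symbols so trace events can be mapped to source lines,
# -O2 re-enables optimization, which -g alone would otherwise disable.
vtf90 -g -vt:f90 mpif90 -c -O2 myprog.f90
```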
Linkage is similarly performed, for example:
vtf90 <further linkage options> -o myprog.exe myprog.o foo1.o foo2.o ...
Also note that -g usually implies that optimization is turned off, unless you explicitly re-enable it in the <further options> section. OpenMP tracing is also supported; for this, additionally specify the -openmp switch at both compilation and linkage.
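For example, a hybrid MPI/OpenMP code (file names are illustrative) would be built with the -openmp switch at both stages, with an optimization level added back alongside -g:

```shell
# Compile: -openmp enables OpenMP event tracing, -O2 restores optimization
vtf90 -g -openmp -vt:f90 mpif90 -c -O2 hybrid.f90
# Link: the -openmp switch is required here as well
vtf90 -openmp -o hybrid.exe hybrid.o
```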
Subroutine tracing is automatically performed by the compiler wrappers.
Binary instrumentation (dyninst) is presently not supported in the LRZ-installed version.
Controlling Trace Generation
Please check the documentation on VampirTrace referenced below for the
- environment variables
- configuration file entries
which are available for controlling the tracing process (name of the output file, process and file filtering, etc.). Suitably chosen settings can in particular help limit the amount of generated trace data.
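As a sketch, a job script might set a few of the documented VampirTrace environment variables before the run; the values here are only examples, so consult the VampirTrace documentation for the full list and defaults:

```shell
# Name prefix for the generated trace files (default: executable name)
export VT_FILE_PREFIX=myprog
# Per-process internal trace buffer size; larger buffers mean fewer
# intermediate flushes to disk
export VT_BUFFER_SIZE=32M
# Stop tracing after this many buffer flushes to bound the trace volume
export VT_MAX_FLUSHES=3
echo "$VT_FILE_PREFIX $VT_BUFFER_SIZE $VT_MAX_FLUSHES"
```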
LRZ specific configurations
The VT_PFORM_LDIR environment variable, which denotes the path for the intermediate traces, is set by the vampir environment module to point at the high-bandwidth scratch file system.
Viewing the results (serial Vampir)
After execution of your tracing run, assuming your program's name is myprog, you will find a file <filename>.otf as well as a number of files *.events.z. These are analyzed together by issuing the command

vampir <filename>.otf
It is also possible to use the "File --> Open ..." menu item to open a trace file.
Viewing the results (Vampir Server)
For very large trace files, it is recommended to use parallel processing for viewing them. However, this functionality is only available on SuperMUC. Starting up the server-based Vampir is a two-step process:
- Start up an instance of the server by executing the command vampirserver start -n <tasks>. This starts the specified number of tasks in an interactive LoadLeveler job, whose job ID is written to standard output. As the final line of the output, a host and port number combination is printed, which the client (see below) must use to connect to the server.
- Start up the GUI by executing vampir. To connect to the running server, select "File --> Open Remote ...", specify the host and port number, and select "Socket connection" in the window that appears. A file selection window will then appear, in which you enter the path of your OTF-format trace file, which is then opened for analysis.
- Once you have completed your analysis, use the llcancel command to remove the job; this prevents wasting your core resource budget.
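Putting these steps together, a server-based session might look as follows; the task count, host/port line, and job ID are illustrative placeholders:

```shell
# Start the parallel analysis server with 16 tasks; this submits an
# interactive LoadLeveler job and reports its job ID, then the host
# and port for the client connection as the final output line.
vampirserver start -n 16

# Start the GUI and connect via "File --> Open Remote ..." using the
# host and port printed above.
vampir

# When the analysis is finished, cancel the LoadLeveler job,
# substituting the job ID reported at server startup.
llcancel <job_id>
```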