The Allinea DDT debugger

DDT, the Distributed Debugging Tool, is a comprehensive graphical debugger for scalar, multi-threaded and large-scale parallel applications written in C, C++ and Fortran.

Allinea DDT

With Allinea DDT it is possible to debug massively parallel programs running on several hundred cores with very few clicks.

Using DDT

First, it is strongly recommended that you clean and rebuild your code with debugging symbols. For most compilers (C/C++ and Fortran) this is achieved with the -g option.

Additionally, you might specify other options, described briefly in the table below:

Compiler | Options | Remarks
icc/icpc/mpicc/mpiCC | -g -O2 -traceback | -traceback is for C code mixed with Fortran.
ifort/mpif90 | -g -O2 -check all -traceback | -check all activates runtime error checking and may affect performance. Refer to the compiler documentation for other check options, e.g., -check bounds.
gcc/g++ | -g -O2 |
gfortran | -g -O2 -fbacktrace -fcheck=<all|bounds> | The use of -fcheck may severely affect performance. Consult the man page for detailed information.
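The flag combinations in the table can be collected in a small shell helper. This is a minimal sketch, not part of DDT or the module system: debug_flags is a hypothetical function name, myprog.c is a placeholder source file, and the Intel Fortran row is matched as ifort, which is an assumption about the compiler name.

```shell
#!/bin/sh
# debug_flags is a hypothetical helper: it prints the debug flags listed in
# the table for a given compiler name and fails for unknown compilers.
debug_flags() {
    case "$1" in
        icc|icpc|mpicc|mpiCC) echo "-g -O2 -traceback" ;;
        ifort|mpif90)         echo "-g -O2 -check all -traceback" ;;
        gcc|g++)              echo "-g -O2" ;;
        gfortran)             echo "-g -O2 -fbacktrace -fcheck=all" ;;
        *) echo "unknown compiler: $1" >&2; return 1 ;;
    esac
}

# Print the compile line instead of running it, so the sketch works
# even where the compiler is not installed. myprog.c is a placeholder.
echo "mpicc $(debug_flags mpicc) -o myprog myprog.c"
```

The script only prints the compile line. To actually build, run the printed command after a clean of the previous build so that every object file carries debugging symbols.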

Now load the appropriate environment module and call the ddt command:

  module load ddt
  ddt

As shown in Figure 1, DDT gives you the option to debug a program, manually launch a program, attach to a running program, or open core files.

Figure 1. DDT main menu.

If you choose to debug a program, DDT is automatically integrated with IBM LoadLeveler. DDT generates a corresponding batch job and connects to the compute nodes when the program starts execution.

As shown in Figure 2, the executable and its parameters can be specified. Memory debugging and other features can be activated by selecting the corresponding tick box, and configuration options can be accessed by clicking on the adjacent tabs.

Figure 2. Run window.

If you have not used ddt from your account before, or the ~/.allinea directory is missing, you need to perform the initial configuration setup. To do this, click on the Options button as illustrated in Figure 2; the window shown in Figure 3 then comes up.

Figure 3. Configuration and initial setup.

Next, select the Job Submission icon. The Job Submission Settings window should be unpopulated, as illustrated in Figure 3. Now choose a submission template file by clicking on the icon; the window in Figure 4 appears.

Figure 4. Select template file.

Select the loadlever_supermuc.qtf file; the Job Submission Settings window should then look like Figure 6. Then click on the System icon, which should be populated as shown in Figure 5. The initial configuration is now complete and only has to be redone if the ~/.allinea directory has been removed.
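Whether the initial setup is needed can be checked from the shell before starting DDT. The following is a minimal sketch; needs_ddt_setup is a hypothetical helper name and not a command provided by DDT.

```shell
#!/bin/sh
# needs_ddt_setup is a hypothetical helper: it succeeds (exit status 0) when
# the given configuration directory is missing, i.e. when setup is needed.
needs_ddt_setup() {
    [ ! -d "$1" ]
}

if needs_ddt_setup "$HOME/.allinea"; then
    echo "No ~/.allinea directory found: perform the initial configuration in DDT."
else
    echo "~/.allinea exists: DDT has already been configured for this account."
fi
```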

Execution Setup:

You are also advised to specify the number of nodes along with the number of cores per node (through PROCS_PER_NODE_TAG). Programs can be scheduled to the test or general queues. Refer to the LoadLeveler site for information about the limits and resources available on each queue. Keep in mind that the waiting time on each queue might increase with the amount of resources (compute nodes or wall clock limit) you request.

To modify parameters for the LoadLeveler scheduler, click Options (Figure 2).

Figure 5. Parallel environment.

As shown in Figure 5, the IBM MPI parallel environment is chosen by default. You could specify, for example, intel-mpi if you want to debug programs with Intel MPI.

Figure 6. Batch job templates.

In Figure 6 you can select a template for the job script that DDT will use to launch your program. The default is IBM MPI. For Intel MPI, choose loadleveler_intel_mpi.qtf, located in the $HOME/.ddt subdirectory. This directory is generated with your own preferences when you load the ddt module for the first time. Any modifications or fine adjustments you make are preserved in the $HOME/.ddt_templates directory.

Figure 7. Queue submission parameters.

In the job submission window, it is possible to specify the number of processors per node with the value of PROCS_PER_NODE_TAG. A maximum of 40 can be allocated.

In the Queue Submission Parameters window (Figure 7), it is possible to change the queue, the job class (parallel or MPICH), the number of MPI tasks and OpenMP threads, and the walltime. The meaning of each value and the possible combinations are described extensively on the LoadLeveler page.

Finally, LoadLeveler decides when your batch job starts. Once your job is running, you can use the control buttons (play, pause, step, etc.) or add breakpoints and watchpoints. Refer to the documentation for a complete description of the options.

Figure 8. Control buttons.

Documentation

We recommend reading the documentation in PDF format, accessible through the path in the $DDT_DOC environment variable, which is set by

  module add ddt

The documentation location can also be listed directly via

  module show ddt
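Once the module is loaded, the documentation can be located from the shell. The sketch below is a hedged example: show_ddt_docs is a hypothetical helper, and since the text does not say whether $DDT_DOC points to a directory or to a single file, it handles both cases.

```shell
#!/bin/sh
# show_ddt_docs is a hypothetical helper. It takes the value of $DDT_DOC and
# prints the PDF file(s) it refers to, whether it is a directory or a file.
show_ddt_docs() {
    if [ -d "$1" ]; then
        find "$1" -name '*.pdf' | sort
    elif [ -f "$1" ]; then
        echo "$1"
    else
        echo "documentation path not found: $1" >&2
        return 1
    fi
}

# DDT_DOC is only set after 'module add ddt'; print a hint otherwise.
if [ -n "${DDT_DOC:-}" ]; then
    show_ddt_docs "$DDT_DOC"
else
    echo "DDT_DOC is not set; run 'module add ddt' first."
fi
```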

Debugging Options

Setting BackTrace

Workshop "Debugging at Scale"

The course materials of the LRZ/Allinea workshop "Debugging at Scale" can be downloaded below:

Note that the materials for this course are password protected. To access them, log in to one of the LRZ HPC systems, type the command module load lrztools, then type the command get_manuals_passwd, or contact the Service Desk.