Debugging and Performance Analysis of Parallel MPI Applications

Edgar Gabriel



Debugging sequential applications

• Several ways to debug a sequential application:
  – printf() statements in the source code
    • Works, and works reliably, but is painful to remove afterwards
  – assert() statements
    • Check for a certain value of a variable. If the expression is false, the application aborts.
    • Only active if the macro NDEBUG is not defined. To disable the checks, define it:
      – Setting in the source, before including assert.h: #define NDEBUG 1
      – Compiling with the flag -DNDEBUG=1

#include <assert.h>

void open_record(char *record_name)
{
    assert(record_name != NULL);
}
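The effect of NDEBUG can be tried out with a small sketch like the following. The helper record_name_length is hypothetical and only serves as an illustration; it is not part of the original slides.

```c
/* Uncomment the next line (or compile with -DNDEBUG) to turn every
 * assert() in this file into a no-op. It must come before <assert.h>. */
/* #define NDEBUG 1 */
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns the length of a record name.
 * The assert documents and enforces the precondition. */
size_t record_name_length(const char *record_name)
{
    assert(record_name != NULL);  /* aborts with file and line if false */
    return strlen(record_name);
}
```

With the checks active, calling record_name_length(NULL) aborts the program and reports the failed expression. With NDEBUG defined, the check disappears and the call most likely crashes inside strlen instead.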


Using a debugger

• For a source file to be visible in the debugger, you have to compile the source code with the -g option, e.g.
    gabriel@salmon> mpicc -g -o test test.c
  – Avoid using optimization flags, such as -O3, when you would like to debug the code
• Two types of debuggers
  – Command-line debuggers, such as gdb
  – Graphical debuggers, such as ddd (which is a GUI to gdb)


[Figure: screenshot of a debugger session with the callouts: start app.; load application into the debugger; debugger points to the problem; show the value of a variable when the problem occurred; show source code of app.]


gdb commands

• Setting breakpoints: the debugger stops execution at the specified line or function. Example:
    (gdb) break errexample.c:10
    (gdb) break myfunct
• Stepping through the source code
    (gdb) next   (skips subroutines/functions)
    (gdb) step   (enters subroutines/functions)
• Continue execution (not step by step anymore)
    (gdb) cont
• Quit the debugger
    (gdb) quit


Debugging a parallel application

• Some debuggers for parallel applications are available (e.g. totalview, ddt)
  – Unfortunately, these are expensive products
• You can still use printf and assert
  – Output from several processes will be mixed
  – You should put the rank of the process in front of each printf statement
• gdb or ddd are still usable
  – You have to choose which process you would like to debug
  – Please be aware that ddd or gdb can only see processes on the local machine
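One way to put the rank in front of each line is a small formatting helper. This is a minimal sketch: the name format_rank_line and the "[rank N]" prefix are assumptions made for illustration, and in an MPI program the rank would come from MPI_Comm_rank.

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical helper: writes "[rank N] " followed by the formatted
 * message into buf. Returns the number of characters the full line
 * needs (like snprintf), or a negative value on a formatting error. */
int format_rank_line(char *buf, size_t size, int rank, const char *fmt, ...)
{
    int prefix = snprintf(buf, size, "[rank %d] ", rank);
    if (prefix < 0) {
        return prefix;
    }

    /* Continue right after the prefix, or at the end of buf if the
     * prefix alone already filled it. */
    size_t used = (size_t)prefix < size ? (size_t)prefix : size;

    va_list args;
    va_start(args, fmt);
    int body = vsnprintf(buf + used, size - used, fmt, args);
    va_end(args);

    return body < 0 ? body : prefix + body;
}
```

A process would then fill a local buffer with format_rank_line(line, sizeof line, rank, "x = %d\n", x) and write it with a single fputs(line, stdout). This keeps each line identifiable even when the output of several processes is mixed.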


Debugging a parallel application (II)

• Hints for parallel debugging
  – Try to find the lowest number of processes for which the problem still occurs
  – Try to execute the application on a single node
    • If the problem does not show up on a single node, you will have to run the application on multiple nodes and log in to the node where the problem occurs
  – Introduce a sleep() statement in your application to have time to attach with a debugger
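The sleep() hint could look like the sketch below, assuming a POSIX system. The helper wait_for_debugger is hypothetical. It prints the PID and host name so that you know which process to attach to and on which node.

```c
#define _POSIX_C_SOURCE 200112L  /* for gethostname() under strict -std modes */
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: reports where this process runs, then sleeps so
 * that a debugger can be attached. Returns the PID that was printed.
 * In an MPI program, rank would come from MPI_Comm_rank(). */
long wait_for_debugger(int rank, unsigned int seconds)
{
    char host[256];
    if (gethostname(host, sizeof host) != 0) {
        snprintf(host, sizeof host, "unknown-host");
    }
    host[sizeof host - 1] = '\0';  /* may be unterminated if truncated */

    long pid = (long)getpid();
    printf("[rank %d] PID %ld on %s, sleeping %u s; attach with: gdb -p %ld\n",
           rank, pid, host, seconds, pid);
    fflush(stdout);                /* show the line before going to sleep */

    sleep(seconds);
    return pid;
}
```

Calling wait_for_debugger(rank, 30) right after MPI_Init gives roughly 30 seconds to log in to the reported host and attach to the reported PID.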


Attaching to a process

• Menu: File
• Menu item: attach to processes
• Choose the PID which you would like to debug


Debugging parallel applications (III)

• Some MPI libraries support the startup of a debugger in the mpirun command
  – e.g. Open MPI
      mpirun -np 2 ddd ./colltest
  – Starts one ddd session per process
  – Not useful for large numbers of processes


Performance Analysis of parallel applications

• Sometimes, it is useful to see
  – Who is communicating with whom, and how much
  – How the messages and the processes are synchronized
• General approach: link a (tracing) library to your application which records
  – per process and per MPI function
  – when the function has been started and how long it took (two time stamps)
  – some additional information, e.g. for Send operations the source and destination process, message length, communicator etc.
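To make the recorded information concrete, here is a minimal sketch of what one trace record and a simple evaluation might look like. The struct layout and function names are assumptions for illustration; real tracing libraries use their own formats.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical record for one traced MPI call. */
struct trace_event {
    const char *function;  /* e.g. "MPI_Send" */
    int rank;              /* process that made the call */
    int peer;              /* destination (or source) rank, -1 if none */
    size_t bytes;          /* message length in bytes, 0 if none */
    double t_start;        /* time stamp when the call was entered (s) */
    double t_end;          /* time stamp when the call returned (s) */
};

/* Time spent inside the traced call. */
double event_duration(const struct trace_event *ev)
{
    return ev->t_end - ev->t_start;
}

/* "Who is communicating with whom and how much": total bytes that
 * process src sent to process dst with MPI_Send. */
size_t bytes_sent(const struct trace_event *events, size_t count,
                  int src, int dst)
{
    size_t total = 0;
    for (size_t i = 0; i < count; i++) {
        if (events[i].rank == src && events[i].peer == dst &&
            strcmp(events[i].function, "MPI_Send") == 0) {
            total += events[i].bytes;
        }
    }
    return total;
}
```

A viewer can derive a timeline from t_start and t_end, and message statistics from sums such as bytes_sent.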


Performance Analysis of parallel applications

• The mechanism used to do that is called the MPI Profiling Interface
  – MPI defines that each function has to be available not only with the prefix MPI but also with the prefix PMPI
    • e.g. instead of MPI_Send you could also call PMPI_Send
    • same functionality, just a different name
• Tracing libraries implement the MPI function and internally call the PMPI function for the ‘real’ operation


Profiling interface

• Example: an application calls MPI_Send
  – Usual scenario: the function MPI_Send is implemented in the MPI library
  – Profiling scenario:
    • the function MPI_Send is implemented in the tracing library
    • the MPI_Send function of the tracing library calls PMPI_Send
    • PMPI_Send is implemented in the MPI library


Profiling interface (II)

• Question: How does the application know which MPI_Send function to use (e.g. from the MPI library or from the tracing library)?
• Answer: You tell the application at link time
• Example:
    mpicc -o test test.c
  is in reality (for Open MPI)
    cc -o test test.c -I/home/gabriel/OpenMPI/include -L/home/gabriel/OpenMPI/lib -lmpi -lorte -lopal


Profiling interface (III)

• The -I option of cc tells the compiler where to look for header files
• The -L option tells the linker where to look for libraries
• The linker searches the directories in the order specified by the user
• If a function has been found in a library, the linker does not continue searching in the following libraries (usually!)
• Thus
    cc -o test test.c -I/home/gabriel/OpenMPI/include -L/home/gabriel/OpenMPI/lib -lmpi -lorte -lopal
  will use MPI_Send from -lmpi, because that’s where the linker finds the function first


Profiling interface (IV)

• However,
    cc -o test test.c -I/home/gabriel/OpenMPI/include -L/home/gabriel/MPE/lib -llmpe -lmpe -L/home/gabriel/OpenMPI/lib -lmpi -lorte -lopal
  – will use MPI_Send from -llmpe, because that’s where the linker finds the function first
  – but PMPI_Send from -lmpi, because that’s the first library in which the linker finds that function


Available performance analysis tools

• Commercial (among others)
  – Vampir and vampirtrace
  – Paraver and mpilog
• Public domain (among others)
  – MPE and jumpshot
  – CUBE and EPILOG


Hints when using MPE and jumpshot

• When using MPE, you can use a wrapper script called mpecc/mpef77 instead of the long command line, e.g.
    gabriel@linux12> mpecc -mpilog -o test.mpe test.c
• Run the application as usual
    gabriel@linux12> mpirun -np 2 ./test.mpe
• After the run, there is a file called test.mpe.clog2 in the same directory
• The viewer (jumpshot) unfortunately needs a different file format, so you have to convert it with
    gabriel@linux12> clog2TOslog2 test.mpe.clog2
• This generates a file called test.mpe.slog2
• You can load an slog2 file into jumpshot


Example of an analysis with VAMPIR (I)

Timeline view – see which process is at which time in which MPI function


Example of an analysis with VAMPIR (II)

Message statistics view – how much data is being sent between processes