Zellescher Weg 12 Willers-Bau A114 Tel. +49 351 - 463 - 38323 Andreas Knüpfer ([email protected]) Event Tracing with VampirTrace and Vampir

1 Vampir Overview


Page 1: Event Tracing with VampirTrace and Vampir

Andreas Knüpfer ([email protected])

Zellescher Weg 12
Willers-Bau A114
Tel. +49 351 - 463 - 38323

Page 2: Overview

  • Introduction
  • Event Tracing Overview
  • Instrumentation
  • Run-Time Measurement
  • Conclusions

Page 3: Introduction

Page 4: Why bother with performance analysis?

Moore's Law is still in charge, so what?

  • increasingly difficult to get close to peak performance
    – for sequential computation
      • memory wall
      • optimum pipelining, ...
    – for parallel interaction
      • Amdahl's law
      • synchronization with single late-comer, ...
  • efficiency is important because of limited resources
  • scalability is important to cope with the next bigger simulation

Page 5: Profiling and Tracing

Profile Recording

  • of aggregated information (time, counts, …)
  • about program and system entities
    – functions, loops, basic blocks
    – application, processes, threads, …

Methods of Profile Creation

  • sampling (statistical approach)
  • direct measurement (deterministic approach)

Page 6: Profiling and Tracing

Trace Recording

  • run-time events (points of interest)
  • during program execution
  • saved as event records
    – time stamp, process, thread, event type
    – event-specific information
  • via instrumentation & trace library

Event Trace

  • collection of all events of a process / program
  • sorted by time stamp
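
The event record described above could be laid out as follows. This is only a sketch: the type names, field widths, and the set of event types are assumptions made for illustration and do not correspond to any particular trace format.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical event types; real trace formats define many more. */
typedef enum { EV_ENTER, EV_LEAVE, EV_SEND, EV_RECV } event_type;

/* Hypothetical event record: common fields plus one event-specific value. */
typedef struct {
    uint64_t   timestamp;  /* ticks of the trace timer */
    uint32_t   process;
    uint32_t   thread;
    event_type type;
    uint64_t   data;       /* event-specific information, e.g. a function ID */
} event_record;

/* qsort comparator: ascending time stamp. */
static int by_timestamp(const void *a, const void *b)
{
    const event_record *x = a, *y = b;
    return (x->timestamp > y->timestamp) - (x->timestamp < y->timestamp);
}

/* Turn an unordered collection of events into a time-sorted event trace. */
static void sort_trace(event_record *events, size_t n)
{
    assert(events != NULL || n == 0);
    if (n > 1)
        qsort(events, n, sizeof *events, by_timestamp);
}
```

Calling sort_trace on an array of records yields the "sorted by time stamp" property of an event trace; the comparator avoids subtracting unsigned time stamps, which could wrap around.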

Page 7: Profiling and Tracing

Tracing Advantages

  • preserve temporal and spatial relationships (context)
  • allow reconstruction of dynamic behavior
  • profiles can be calculated from traces

Tracing Disadvantages

  • traces can become very large
  • may cause perturbation
  • instrumentation and tracing are complicated
    – event buffering, clock synchronization, …
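
To illustrate how a profile can be calculated from a trace, the sketch below accumulates inclusive time per function from the time-ordered ENTER/LEAVE events of one process. The type call_event, the bounds MAX_FUNCS and MAX_DEPTH, and the function name are assumptions for this example; recursive calls would be counted once per nesting level.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_FUNCS 64  /* assumption: function IDs are below this bound */
#define MAX_DEPTH 64  /* assumption: call depth stays below this bound */

typedef struct {
    uint64_t timestamp;
    int      is_enter;  /* 1 = ENTER, 0 = LEAVE */
    int      function;  /* function ID */
} call_event;

/* Accumulate inclusive time per function ID. Returns 0 on success and
 * -1 if events are not properly nested or exceed the assumed bounds. */
static int profile_from_trace(const call_event *ev, size_t n,
                              uint64_t inclusive[MAX_FUNCS])
{
    uint64_t enter_time[MAX_DEPTH];
    int      enter_func[MAX_DEPTH];
    int      depth = 0;

    assert(ev != NULL && inclusive != NULL);
    for (int f = 0; f < MAX_FUNCS; f++)
        inclusive[f] = 0;

    for (size_t i = 0; i < n; i++) {
        if (ev[i].function < 0 || ev[i].function >= MAX_FUNCS)
            return -1;
        if (ev[i].is_enter) {
            if (depth == MAX_DEPTH)
                return -1;
            enter_time[depth] = ev[i].timestamp;  /* push onto call stack */
            enter_func[depth] = ev[i].function;
            depth++;
        } else {
            if (depth == 0 || enter_func[depth - 1] != ev[i].function)
                return -1;                        /* unmatched LEAVE */
            depth--;                              /* pop matching ENTER */
            inclusive[ev[i].function] += ev[i].timestamp - enter_time[depth];
        }
    }
    return depth == 0 ? 0 : -1;
}
```

The reverse direction is not possible: once times are aggregated into a profile, the order and context of the individual events are gone.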

Page 8: Event Tracing Overview

Page 9: Event Tracing from A to Z

[Diagram with three stages: Instrumentation, Run-Time Measurement, Visualization / Analysis. Labels: src, instrument, exec., trace file(s).]

  • Instrumentation and Run-Time Measurement: see more below
  • Visualization / Analysis: see following presentation

Page 10: Most common event types

Which events to monitor?

  • enter/leave of function/routine/region
    – time stamp, process/thread, function ID
  • send/receive of P2P message (MPI)
    – time stamp, sender, receiver, length, tag, communicator
  • collective communication (MPI)
    – time stamp, process, root, communicator, # bytes
  • hardware performance counter value
    – time stamp, process, counter ID, value

corresponding “record types” in trace file format

Page 11: Trace Format Schematics

Parallel Trace Files

Definitions:

    DEF TIMERRES 1000000000
    DEF PROCESS 1 `Master`
    DEF PROCESS 2 `Slave`
    DEF FUNCTION 5 `main`
    DEF FUNCTION 6 `foo`
    DEF FUNCTION 9 `bar`
    DEF FUNCTION 12 `MPI_Send`
    DEF FUNCTION 13 `MPI_Recv`

Process 1:

    10010 P 1 ENTER 5
    10090 P 1 ENTER 6
    10110 P 1 ENTER 12
    10110 P 1 SEND TO 3 LEN 1024 ...
    10330 P 1 LEAVE 12
    10400 P 1 LEAVE 6
    10520 P 1 ENTER 9
    10550 P 1 LEAVE 9
    ...

Process 2:

    10020 P 2 ENTER 5
    10095 P 2 ENTER 6
    10120 P 2 ENTER 13
    10300 P 2 RECV FROM 3 LEN 1024 ...
    10350 P 2 LEAVE 13
    10450 P 2 LEAVE 6
    10620 P 2 ENTER 9
    10650 P 2 LEAVE 9
    ...
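
As a rough illustration of how such schematic records could be read, the sketch below parses an ENTER/LEAVE line with sscanf and converts timer ticks to seconds. The record layout is the schematic one from this slide, not a real trace file format; the names region_record, parse_region_record, and ticks_to_seconds are hypothetical, and the conversion assumes that TIMERRES states ticks per second.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

typedef struct {
    unsigned long long timestamp;
    int  process;
    char kind[8];   /* "ENTER" or "LEAVE" */
    int  function;  /* function ID from a DEF FUNCTION record */
} region_record;

/* Parse one schematic record such as "10010 P 1 ENTER 5".
 * Returns 1 on success, 0 if the line is not an ENTER/LEAVE record. */
static int parse_region_record(const char *line, region_record *out)
{
    assert(line != NULL && out != NULL);
    if (sscanf(line, "%llu P %d %7s %d", &out->timestamp, &out->process,
               out->kind, &out->function) != 4)
        return 0;
    return strcmp(out->kind, "ENTER") == 0 || strcmp(out->kind, "LEAVE") == 0;
}

/* Convert timer ticks to seconds, assuming the resolution is given
 * in ticks per second. */
static double ticks_to_seconds(unsigned long long ticks,
                               unsigned long long timer_resolution)
{
    return (double)ticks / (double)timer_resolution;
}
```

With the resolution of 1000000000 shown above, the 220 ticks between ENTER 12 and LEAVE 12 on process 1 would correspond to 220 ns under that assumption.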

Page 12: Trace Visualization: Timeline Display

Page 13: Trace Visualization: Process Timeline Display

Page 14: Trace Visualization: Statistic Summary Display

Page 15: Trace Visualization: Message Statistics Display

Page 16: The Vampir Tool Family

VampirTrace

  • convenient instrumentation and measurement
  • hides away complicated details
  • provides many options and switches for experts
  • VampirTrace is part of Open MPI 1.3

Vampir/VampirServer

  • interactive trace visualization and analysis
  • intuitive browsing and zooming
  • scalable to large trace data sizes (100 GB)
  • scalable to high parallelism (2000 processes)
  • Vampir for Windows in progress, beta version available

Page 17: Trace File Formats

Open Trace Format (OTF)

  • open source trace file format
  • includes powerful libotf for use in custom applications
  • high-level interface for tools + low-level interface for trace libraries

Other Formats

  • TAU Trace Format (Univ. of Oregon)
  • Epilog (ZAM, FZ Jülich)
  • STF (Pallas, now Intel)

Page 18: Other Tools

Other Event Tracing Tools

  • TAU profiling (University of Oregon, USA)
    – profiling and tracing for parallel applications
    – http://www.cs.uoregon.edu/research/tau/
  • Paraver (CEPBA, Barcelona, Spain)
    – trace based parallel performance analysis and visualization
    – http://www.cepba.upc.edu/paraver/
  • Scalasca (FZ Jülich)
    – tracing and automatic detection of performance problems
    – http://www.scalasca.org/
  • Intel Trace Collector & Analyzer
    – very similar to Vampir

Page 19: Instrumentation

Page 20: Instrumentation

Instrumentation: Process of modifying programs to detect and report events by calling instrumentation functions.

  • instrumentation functions provided by trace library
  • notification about run-time event
  • there are various ways of instrumentation

Page 21: Instrumentation

Edit – Compile – Run Cycle

    Source Code → Compiler → Binary → Run → Results

Edit – Compile – Run Cycle with VampirTrace

    Source Code → VT Wrapper (Compiler) → Binary → Run → Results + Traces

Page 22: Instrumentation Types

VampirTrace supports different methods of instrumentation

  • Source code instrumentation
    – manually
    – automatically
  • Instrumentation with wrapper functions
  • Library pre-load instrumentation
  • Compiler instrumentation
  • Binary instrumentation

Hidden in compiler wrappers

Page 23: Source Code Instrumentation

Original function:

    int foo(void* arg) {
        if (cond) {
            return 1;
        }
        return 0;
    }

Instrumented function (manually or automatically):

    int foo(void* arg) {
        enter(7);
        if (cond) {
            leave(7);
            return 1;
        }
        leave(7);
        return 0;
    }
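
A compilable variant of the instrumented function might look like the sketch below. The functions enter and leave are hypothetical stand-ins for what a trace library would provide; here they only count and print events. The parameter cond replaces the undeclared condition of the slide, and the region ID 7 is taken from the slide.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-ins for trace library calls: count and print. */
static int enter_count, leave_count;

static void enter(int region_id)
{
    enter_count++;
    printf("ENTER %d\n", region_id);
}

static void leave(int region_id)
{
    leave_count++;
    printf("LEAVE %d\n", region_id);
}

/* Instrumented function: every return path is preceded by leave(7),
 * so each ENTER event has a matching LEAVE event. */
static int foo(int cond)
{
    enter(7);
    assert(enter_count > leave_count);  /* we are inside region 7 now */
    if (cond) {
        leave(7);
        return 1;
    }
    leave(7);
    return 0;
}
```

Forgetting a single leave call on one return path would leave the region open in the trace, which is one reason manual instrumentation is error prone.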

Page 24: Source Code Instrumentation

manually

  • large effort
  • error prone
  • difficult to manage

automatically

  • via source to source translation
  • Program Database Toolkit (PDT): http://www.cs.uoregon.edu/research/pdt/
  • OpenMP Pragma And Region Instrumentor (Opari): http://www.fz-juelich.de/zam/kojak/opari/

Page 25: Instrumentation with Wrapper Functions

  • provide wrapper functions
    – call instrumentation function for notification
    – call original target for actual functionality
  • implement via library pre-load
  • or via preprocessor directives

    #define fread WRAPPER_glibc_fread
    #define fwrite WRAPPER_glibc_fwrite

  • suitable for standard libraries (e.g. MPI, glibc)
  • can evaluate function call semantics (function signature, arguments)
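
The preprocessor variant could be sketched as follows. Only the macro name WRAPPER_glibc_fread comes from the slide; the body of the wrapper, the counter wrapped_bytes_read, and the helper read_header are assumptions for illustration and not VampirTrace's implementation.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical statistic kept by the wrapper. */
static size_t wrapped_bytes_read;

/* Wrapper: call the original target, then record what happened.
 * It is defined before the #define below, so fread here still
 * refers to the C library function. */
static size_t WRAPPER_glibc_fread(void *ptr, size_t size, size_t nmemb,
                                  FILE *stream)
{
    size_t items = fread(ptr, size, nmemb, stream);
    wrapped_bytes_read += items * size;  /* uses the call's arguments */
    return items;
}

/* From here on, user code that calls fread is redirected.
 * The #undef covers C libraries that provide fread as a macro. */
#undef fread
#define fread WRAPPER_glibc_fread

/* Unmodified user code: the call below now goes through the wrapper. */
static size_t read_header(FILE *f, char *buf, size_t len)
{
    assert(f != NULL && buf != NULL);
    return fread(buf, 1, len, f);
}
```

Because the wrapper sees the full argument list, it can record semantic information such as the number of bytes transferred, which plain enter/leave events cannot provide.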

Page 26: The MPI Profiling Interface

  • Instrumentation via library pre-load, e.g. for MPI
  • Each MPI function has two names:
    – MPI_xxx and PMPI_xxx
  • Selective replacement of MPI routines at link time

[Diagram: the user program calls MPI_Send; the wrapper library provides its own MPI_Send, which calls PMPI_Send in the MPI library.]

Page 27: Compiler Instrumentation

    gcc -finstrument-functions -c foo.c

  • many compilers support instrumentation (GCC, Intel, IBM, PGI, NEC, Hitachi, Sun Fortran, …)
  • no common API, different command line switches, different behavior
  • no source modification necessary
  • managed by VampirTrace

    void __cyg_profile_func_enter( <args> );
    void __cyg_profile_func_exit( <args> );
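
For GCC, the two hooks take the address of the instrumented function and the address of its call site. The sketch below shows one minimal way to implement them; the counters and the helper open_regions are assumptions for illustration, and a real trace library would record time-stamped events instead of counting.

```c
#include <assert.h>

static unsigned long enter_events, exit_events;

/* GCC calls these hooks on entry to and exit from every function that
 * was compiled with -finstrument-functions. The attribute keeps the
 * hooks themselves uninstrumented, which would otherwise recurse. */
void __cyg_profile_func_enter(void *this_fn, void *call_site)
    __attribute__((no_instrument_function));
void __cyg_profile_func_exit(void *this_fn, void *call_site)
    __attribute__((no_instrument_function));
unsigned long open_regions(void)
    __attribute__((no_instrument_function));

void __cyg_profile_func_enter(void *this_fn, void *call_site)
{
    (void)this_fn;    /* address of the function being entered */
    (void)call_site;  /* address it was called from */
    enter_events++;
}

void __cyg_profile_func_exit(void *this_fn, void *call_site)
{
    (void)this_fn;
    (void)call_site;
    exit_events++;
}

/* Number of functions currently entered but not yet left. */
unsigned long open_regions(void)
{
    assert(enter_events >= exit_events);
    return enter_events - exit_events;
}
```

The hooks receive only addresses, so mapping them back to function names needs symbol information from the executable; that is one of the details a measurement system has to handle behind the scenes.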

Page 28: Dynamic Instrumentation

  • modify binary executable in main memory (or in a file)
  • insert instrumentation calls
  • very platform/machine dependent
  • expensive

Using the DynInst project

  • provides common interface to binary instrumentation
  • available for Alpha/Tru64, MIPS/IRIX, PowerPC/AIX, Sparc/Solaris, x86/Linux+Windows, ia64/Linux
  • see http://www.dyninst.org

Page 29: Practical Instrumentation

  • Use VampirTrace compiler wrappers
  • Internals and platform specifics hidden
  • Select appropriate way(s) of instrumentation
  • Substitute calls to the regular compiler with calls to compiler wrappers, e.g. replace

    CC=mpicc

with

    CC=vtcc

Page 30: Run-Time Measurement

Page 31: Trace Library

What does the trace library do?

  • provide instrumentation functions
  • receive events of various types
  • collect event properties
    – time stamp
    – location (thread, process, cluster node, MPI rank)
    – event specific properties
    – perhaps hardware performance counter values
  • record to memory buffer, flush eventually
  • try to be fast, minimize overhead
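
The "record to memory buffer, flush eventually" step could be sketched like this. The buffer size, the record layout, and all names (event, trace_buffer, trace_record, trace_flush) are assumptions for illustration, not VampirTrace internals.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

#define BUFFER_EVENTS 4  /* tiny on purpose; real buffers are far larger */

typedef struct {
    uint64_t timestamp;
    uint32_t location;  /* e.g. process or thread ID */
    uint32_t type;      /* event type code */
} event;

typedef struct {
    event  slots[BUFFER_EVENTS];
    size_t used;     /* number of events currently buffered */
    size_t flushes;  /* how often the buffer was written out */
    FILE  *out;      /* destination for flushed events */
} trace_buffer;

/* Write all buffered events to the stream and empty the buffer.
 * Returns 0 on success, -1 on a short write. */
static int trace_flush(trace_buffer *tb)
{
    size_t n;

    assert(tb != NULL && tb->out != NULL);
    n = tb->used;
    tb->used = 0;
    if (n == 0)
        return 0;
    tb->flushes++;
    return fwrite(tb->slots, sizeof(event), n, tb->out) == n ? 0 : -1;
}

/* Record one event in memory; only a full buffer triggers file I/O. */
static int trace_record(trace_buffer *tb, uint64_t ts, uint32_t loc,
                        uint32_t type)
{
    assert(tb != NULL);
    if (tb->used == BUFFER_EVENTS && trace_flush(tb) != 0)
        return -1;
    tb->slots[tb->used++] = (event){ ts, loc, type };
    return 0;
}
```

Recording is a cheap memory write in the common case, which keeps overhead low; the expensive flush happens rarely, but it perturbs the timing of the application whenever it does.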

Page 32: Run-Time Options

There are a number of run-time options, controlled by environment variables:

  • PAPI hardware performance counters
  • Memory allocation counters
  • Application I/O calls
  • Filtering
  • Grouping
  • more ...

see more in the following presentations and hands-on parts

Page 33: Performance Counters

  • Include hardware performance counters in traces
    – via PAPI library
    – or Sun Solaris CPC counters
    – or NEC SX counters
  • VT_METRICS can be used to specify a colon-separated list of counters
  • see papi_avail and papi_command_line tools etc.
  • see VampirTrace Documentation for CPC and NEC counters
  • set VT_METRICS environment variable

    export VT_METRICS=PAPI_FP_OPS:PAPI_L2_TCM
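
How a measurement library might split such a colon-separated list is sketched below. The function names and the bound MAX_METRICS are hypothetical; this is not taken from the VampirTrace sources.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define MAX_METRICS 16  /* assumption: upper bound on requested counters */

/* Split a colon-separated counter list such as "PAPI_FP_OPS:PAPI_L2_TCM".
 * The string is modified in place and names[] points into it.
 * Returns the number of counter names found. */
static size_t split_metrics(char *spec, char *names[MAX_METRICS])
{
    size_t count = 0;

    assert(spec != NULL && names != NULL);
    for (char *tok = strtok(spec, ":");
         tok != NULL && count < MAX_METRICS;
         tok = strtok(NULL, ":"))
        names[count++] = tok;
    return count;
}

/* Copy VT_METRICS from the environment into buf and split it.
 * Returns 0 if the variable is unset or does not fit into buf. */
static size_t metrics_from_env(char *buf, size_t buflen,
                               char *names[MAX_METRICS])
{
    const char *value = getenv("VT_METRICS");

    if (value == NULL || strlen(value) >= buflen)
        return 0;
    strcpy(buf, value);
    return split_metrics(buf, names);
}
```

The environment value is copied before splitting because strtok modifies its input, and the string returned by getenv should not be changed.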

Page 34: Memory Allocation Tracing

  • monitor memory allocation behavior
  • record memory volume as counter
  • record glibc calls like “malloc” and “free” as function calls
  • via environment variable VT_MEMTRACE

    export VT_MEMTRACE=yes

Page 35: I/O Tracing

  • monitor POSIX I/O behavior
  • record read/write rates as counters
  • record standard I/O calls like “open” and “read”
  • via environment variable VT_IOTRACE
  • mmap I/O not supported

    export VT_IOTRACE=yes

Page 36: Function Filtering

  • selective tracing of certain functions/subroutines
  • one way to reduce trace file size!
  • via environment variable VT_FILTER_SPEC
  • run-time filtering, no re-compilation or re-linking
  • see also the vtfilter tool
    – can create a filter file with rough target size estimate
    – can apply a filter to an existing trace file as post processing

    export VT_FILTER_SPEC=/home/user/filter.spec

Contents of the filter file:

    my*;test -- 1000
    calculate -- -1
    * -- 1000000
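
The sketch below shows one way such filter rules could be evaluated at run time. It assumes that the number after "--" is a per-function call limit, that -1 means unlimited, and that the first matching rule wins; those semantics and all names in the code are assumptions for illustration. Wildcards are matched with the POSIX function fnmatch.

```c
#include <assert.h>
#include <fnmatch.h>
#include <stddef.h>

typedef struct {
    const char *pattern;  /* shell-style wildcard pattern */
    long        limit;    /* max recorded calls; -1 = unlimited (assumed) */
} filter_rule;

/* Return the call limit of the first rule whose pattern matches the
 * function name, or -1 (unlimited) if no rule matches. */
static long filter_limit(const filter_rule *rules, size_t n, const char *func)
{
    assert(rules != NULL && func != NULL);
    for (size_t i = 0; i < n; i++)
        if (fnmatch(rules[i].pattern, func, 0) == 0)
            return rules[i].limit;
    return -1;
}

/* Decide whether the next call should still be recorded, given how
 * many calls of this function were recorded so far. */
static int should_record(long limit, long calls_so_far)
{
    return limit < 0 || calls_so_far < limit;
}
```

With the rules of the example file, a function named my_init would stop being recorded after 1000 calls under these assumptions, while calculate would always be recorded.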

Page 37: Function Grouping

  • define user-specified groups
  • highlighting application behavior, different activities, program phases
    – communication, computation, initialization, different libraries, ...
  • groups are assigned to colors in Vampir displays
  • run-time grouping, no re-compilation or re-linking
  • via environment variable VT_GROUPS_SPEC
  • contains a list of groups of associated functions, wildcards allowed

    export VT_GROUPS_SPEC=/home/<user>/groups.spec

Contents of the groups file:

    CALC=calculate
    MISC=my*;test
    UNKNOWN=*
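
Matching a function name against one line of such a groups file could look like the sketch below. The line syntax follows the example above; the function name group_of, the buffer size, and the use of POSIX fnmatch for wildcards are assumptions for illustration.

```c
#include <assert.h>
#include <fnmatch.h>
#include <string.h>

/* Check whether func belongs to the group defined by one spec line of
 * the form "GROUP=pattern;pattern;...". On a match, the group name is
 * copied to group (capacity cap) and 1 is returned; otherwise 0. */
static int group_of(const char *line, const char *func,
                    char *group, size_t cap)
{
    char  copy[256];  /* assumption: spec lines are shorter than this */
    char *eq, *pat;

    assert(line != NULL && func != NULL && group != NULL && cap > 0);
    if (strlen(line) >= sizeof copy)
        return 0;
    strcpy(copy, line);

    eq = strchr(copy, '=');
    if (eq == NULL)
        return 0;
    *eq = '\0';       /* copy now holds only the group name */

    for (pat = strtok(eq + 1, ";"); pat != NULL; pat = strtok(NULL, ";")) {
        if (fnmatch(pat, func, 0) == 0) {
            if (strlen(copy) >= cap)
                return 0;
            strcpy(group, copy);
            return 1;
        }
    }
    return 0;
}
```

Trying the lines in file order and stopping at the first match would make a catch-all entry such as UNKNOWN=* act as the default group, provided it comes last.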

Page 38: Behind the Scenes

Further activities of the trace library:

  • Data management
    – Trace data is written to a buffer in memory first
    – When this buffer is full, data is flushed to files
    – Data compression, etc.
  • Timer selection and time synchronization between local clocks
    – use highly accurate clocks
  • Unification of local process/thread traces (post processing)
    – trace processes/threads separately
    – collect all traces of all parallel processes/threads at the end
    – add global information about all participants
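
One small piece of such post processing can be sketched as a merge of two time-sorted local traces into a single global order. This is a simplified illustration and not a description of how VampirTrace unifies traces: it assumes that the local clocks are already synchronized, and the names event and merge_traces are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint64_t timestamp;  /* assumed to be on a common, synchronized clock */
    uint32_t process;
} event;

/* Merge two time-sorted local traces into one time-sorted global trace.
 * out must have room for na + nb events. On equal time stamps, events
 * of trace a are placed before events of trace b. */
static void merge_traces(const event *a, size_t na,
                         const event *b, size_t nb, event *out)
{
    size_t i = 0, j = 0, k = 0;

    assert(out != NULL);
    while (i < na && j < nb)
        out[k++] = (b[j].timestamp < a[i].timestamp) ? b[j++] : a[i++];
    while (i < na)
        out[k++] = a[i++];
    while (j < nb)
        out[k++] = b[j++];
}
```

Without synchronized clocks, a merged trace could show a message being received before it was sent, which is why time synchronization between local clocks matters.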

Page 39: Conclusions

Page 40: Conclusion

  • performance analysis is very important in HPC
  • use performance analysis tools for profiling and tracing
  • do not spend effort on DIY solutions such as printf-debugging
  • use tracing tools with some precautions
    – overhead
    – data volume
  • let us know about problems and feature wishes: [email protected]

Page 41: Thank you!

available via http://www.vampir.eu/ and http://www.tu-dresden.de/zih/vampirtrace/