4 HPA Examples Of Vampir Usage

What Is It Good For?!

Real World Examples of Vampir Usage at IU

Robert Henschel, rhensche@indiana.edu

Thanks to Scott Teige, Don Berry, Scott Michael, Huian Li, Judy Qiu for providing traces.

May 2009

Robert Henschel

Contents
• IPP – Repeated Execution of an Application
• Runtime Anomaly on the iDataPlex (Tachyon)
• Cell B.E. Tracing
• Particle Swarm Optimizer – Memory Tracing
• Finding an I/O Bottleneck (EMAN)
• Cluster Challenge
  – WPP, GAMESS
• Tracing on Windows
• Swarp – Pthread Tracing
• MD Simulation

What is this all about
• Provide a feeling for what Vampir can be used for
• Show a few of Vampir's features
• Raise awareness that tracing software is a complex undertaking

IPP - Repeated Execution of an Application
• Image analysis pipeline
• Every module/binary is called a few hundred times during a single run
• Setting VT_FILE_UNIQUE=yes allows for normal execution of the pipeline while traces are produced
• We ended up with 640 traces that we did not want to look at individually
• otfprofile was used in a bash script to retrieve the profile of every trace; the profiles were then combined into a profile of the entire run
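The combination step can be sketched as follows. The original run used a bash script around otfprofile; this Python version is only a stand-in for illustration, not the script that was used. It assumes that every per-trace function profile is a CSV file ending in .otf_func.csv (the suffix is taken from the otfprofile output of this run) and that each file has a header row with the columns function, invocations and exclusive_time. Those column names and the helper combine_function_profiles are hypothetical and would need to be adapted to what otfprofile actually writes.

```python
import csv
import glob
import os
from collections import defaultdict


def combine_function_profiles(trace_dir, pattern="*.otf_func.csv"):
    """Sum per-function statistics over all per-trace CSV files in trace_dir.

    ASSUMPTION: each CSV has a header row with the columns 'function',
    'invocations' and 'exclusive_time'. These names are hypothetical and
    must be adjusted to the real otfprofile CSV layout.
    """
    totals = defaultdict(lambda: {"invocations": 0, "exclusive_time": 0.0})
    # Sorted only to make the processing order reproducible.
    for path in sorted(glob.glob(os.path.join(trace_dir, pattern))):
        with open(path, newline="") as handle:
            for row in csv.DictReader(handle):
                entry = totals[row["function"]]
                entry["invocations"] += int(row["invocations"])
                entry["exclusive_time"] += float(row["exclusive_time"])
    return dict(totals)
```

Calling combine_function_profiles on the directory that holds the 640 profiles would return one dictionary per function name, which can then be sorted by exclusive_time to see where the whole pipeline run spends its time.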

IPP - Repeated Execution of an Application
ppImage_103.0.def.z
ppImage_103.0.marker.z
ppImage_103.1.events.z
ppImage_103.otf
ppImage_103.otf_collop.csv
ppImage_103.otf_data.csv
ppImage_103.otf_func.csv
ppImage_103.otf_p2p.csv
ppImage_103.otf_result.aux
ppImage_103.otf_result.dvi
ppImage_103.otf_result.log
ppImage_103.otf_result.ps
ppImage_103.otf_result.tex

IBM iDataPlex Testing
• We are testing an IBM iDataPlex system
  – Cluster of 84 dual quad-core nodes
  – InfiniBand interconnect
• Tachyon, part of the SPEC MPI2007 benchmark suite, showed strange runtime behavior
  – 100 vs. 200 seconds
• Tracing the application made the problem go away: stable runtime at 110 seconds
• There will be deeper interaction between Open MPI and VampirTrace in the future; Open MPI 1.3 now contains VampirTrace

Cell B.E. Tracing
• Code exists as a serial version, an MPI version and a Cell B.E. version
• Great to be able to use the same tracing tool on both platforms and for different parallelization strategies
• We used a beta version of VampirTrace for Cell B.E.

PSO – Memory Footprint
• Determining the memory footprint of a particle swarm optimizer

EMAN Tracing
• Application ran very slowly on regular hardware with an NFS-mounted file system
• Application was faster on Lustre
• Application was really fast on a shared-memory file system
• Suspected I/O problem
• However, the actual I/O problem was not visible in Vampir, as it had to do with file locking

Cluster Challenge
• WPP (Wave Propagation Program)
• Vampir helped to understand the code (few comments, not sure what the code does or how it does it)
• PAPI counters helped to show that on x86, peak floating-point performance was already achieved; thus there was not much room for tuning besides rewriting the algorithm
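The comparison behind the PAPI bullet is simple arithmetic: divide the counted floating-point operations by the runtime and compare the result with the theoretical peak of the hardware. A minimal sketch, assuming the operation count comes from a PAPI counter such as PAPI_FP_OPS and that the number of floating-point operations per cycle per core is known for the processor. The function name is hypothetical and the numbers in the usage lines are purely illustrative; they are not measurements from WPP.

```python
def fraction_of_peak(fp_ops, seconds, cores, clock_hz, flops_per_cycle):
    """Return achieved floating-point rate as a fraction of theoretical peak.

    fp_ops          -- floating-point operations counted during the run
    seconds         -- wall-clock time of the counted region
    cores           -- number of cores the region ran on
    clock_hz        -- core clock frequency in Hz
    flops_per_cycle -- ASSUMPTION: peak floating-point operations per cycle
                       per core; this value is hardware specific
    """
    achieved = fp_ops / seconds                   # operations per second
    peak = cores * clock_hz * flops_per_cycle     # theoretical maximum
    return achieved / peak


# Illustrative values only: one 2.5 GHz core, 4 operations per cycle.
ratio = fraction_of_peak(fp_ops=9.0e10, seconds=10.0,
                         cores=1, clock_hz=2.5e9, flops_per_cycle=4)
print(f"{ratio:.0%} of peak")
```

A ratio close to 1 supports the conclusion on the slide: the code already saturates the floating-point units, so further gains would have to come from a different algorithm.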

Cluster Challenge
• GAMESS
• Problem with MPI communication, while socket communication worked just fine

Tracing on Windows
• On Windows, the trace creation is handled by MS MPI
• Unfortunately, not a lot of detail is provided, so the traces look a bit different from traces created on Linux
  – MPI collective operations are not marked as such!
• However, on Windows you are able to trace C# applications in addition to C/C++/Fortran applications
  – And possibly other applications that are built on top of MS MPI

Swarp - Pthread Tracing
• SWARP can be run as a Pthread parallel application
• Tracing of Pthread parallel applications is a new feature of the latest VampirTrace version
• Enabled by default if a Pthread flag is detected on the compile and link commands, but can be forced as well
• Compiling with -DVTRACE_PTHREAD and including vt_user.h in the appropriate file
  – Will trace the overhead of Pthread functions

MD Application, Serial, OpenMP, MPI, Hybrid
• MD application, part of the research of physics professor Chuck Horowitz
• Used for studying dense nuclear matter in supernovae, white dwarfs and neutron stars
• In production use on a daily basis
• Exists as serial, MPI, OpenMP, hybrid and MDGRAPE-2 application
• MDGRAPE-2 tracing is not supported out of the box, but could be implemented using manual instrumentation
