Pin2 Tutorial 1 Pin Tutorial Pin Tutorial Kim Hazelwood Robert Muth VSSAD Group, Intel


Pin Tutorial

Kim Hazelwood, Robert Muth

VSSAD Group, Intel


Pin People

Robert Cohn

Kim Hazelwood

Artur Klauser

Geoff Lowney

CK Luk

Robert Muth

Harish Patil

Ramesh Peri

Vijay Janapa Reddi

Steven Wallace


Outline

Pin Overview

Instrumentation Basics

Advanced Topics


What is Pin?

• Pin Is Not a TLA

• Pin is a dynamic binary rewriting engine

• Derived from Spike: a static rewriter

• Two versions available:
  – Pin2 is the current version
  – Pin0 (IPF only) is not covered in this talk


Pin Features

• Rewritten program exists only in memory
• No tool chain dependence
  – No issues with code/data mixing, missing relocs, etc.
• Rewrites all user level code including shared libs
• Multi-ISA: Itanium, IA32, EM64T, XScale
• Attach/detach to/from running process (like gdb)
• Transparent: unchanged program behavior
• Efficient: very good performance


Pin Applications

• Optimization

• Security (program shepherding)

• Debugging

• Instrumentation

Instrumentation is our current focus


Uses for Instrumentation

• Profiling for optimization
  – Basic block counts, edge counts
  – Value profiles, stride profiling, load latencies
• Micro-architectural studies
  – Branch predictor simulation
  – Cache simulation
  – Trace generation
• Bug checking
  – Find uninitialized or unallocated data references


Pin Instrumentation Features

• User programmable via plug-ins
  – many examples provided
  – plug-ins are typically ISA agnostic
• Can take advantage of symtab info
• Automatic register saving/restoring
• Various instrumentation granularities
  – Instruction, “Trace”, Routine
• ATOM compatibility mode (AOTI)


Other Dynamic Rewriting Engines (and what they focus on)

• Dynamo (PA-RISC HPUX)
  – Dynamic optimization
• DynamoRIO (IA32 Linux + Win32)
  – Originally: Dynamic optimization
  – Now: Sandboxing, some instrumentation
• Valgrind (IA32 Linux)
  – Originally: Special-purpose instrumentation
  – Now: General-purpose instrumentation


Static Instrumentation (“Atom Style”)

• (Way) Ahead-of-time
• Persistent
• Good but not perfect transparency
• Shared libraries can be a problem

[Figure: Program → ATOM → Instrumented Program]


Dynamic Instrumentation (“Pin Style”)

• Execution driven
  – Occurs when code is executed
• Original program is NOT modified
  – Code is “copied” into code cache
  – Only code in code cache is executed
• Instrumentation is not persistent
• Can also instrument libraries


Dynamic Instrumentation

[Figure: Pin, the original code (basic blocks 1 to 7), and an empty code cache]

Pin has grabbed control before execution of block 1


Dynamic Instrumentation

[Figure: Pin, the original code (basic blocks 1 to 7), and the code cache holding copies 1’, 2’, and 7’]

Pin fetches trace and allows for instrumentation


Dynamic Instrumentation

[Figure: Pin, the original code (basic blocks 1 to 7), and the code cache holding copies 1’, 2’, and 7’]

Pin transfers control into code cache (block 1)


Dynamic Instrumentation

[Figure: Pin, the original code (basic blocks 1 to 7), and the code cache holding copies 1’, 2’, 7’ plus the new copies 3’, 5’, and 6’]

Pin fetches new trace and ‘links’ it


Dynamic Instrumentation

[Figure: Pin, the original code (basic blocks 1 to 7), and the code cache holding copies 1’, 2’, 7’, 3’, 5’, and 6’]

Pin transfers control into code cache (block 3)
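The walkthrough above (grab control, fetch and copy code into the cache, then execute only the cached copy) can be sketched as a toy dispatch loop. This is a minimal model in plain C++, not Pin's implementation: `Block`, `CodeCache`, and `run` are hypothetical names, "translation" is just a copy, and trace linking is left out.

```cpp
#include <cstddef>
#include <map>
#include <vector>

// Hypothetical toy model of an original basic block: an id plus the id of
// the block that executes next (-1 ends the program).
struct Block {
    int id;
    int next;
};

// Toy code cache: maps an original block id to its "translated" copy.
class CodeCache {
public:
    // Return the cached copy, translating (here: copying) on a miss.
    const Block &fetch(const std::map<int, Block> &original, int id) {
        auto it = cache_.find(id);
        if (it == cache_.end()) {
            ++translations_;  // happens once per block, at "JIT time"
            it = cache_.emplace(id, original.at(id)).first;
        }
        return it->second;
    }

    std::size_t translations() const { return translations_; }

private:
    std::map<int, Block> cache_;
    std::size_t translations_ = 0;
};

// Execute starting at `entry`. Only copies held in the code cache are
// "executed"; the original blocks are never run directly.
// Returns the ids of the executed blocks in order.
std::vector<int> run(const std::map<int, Block> &original, int entry,
                     CodeCache &cache, std::size_t maxSteps) {
    std::vector<int> executed;
    int pc = entry;
    while (pc != -1 && executed.size() < maxSteps) {
        const Block &b = cache.fetch(original, pc);
        executed.push_back(b.id);
        pc = b.next;
    }
    return executed;
}
```

Running a two-block loop for several steps translates each block only once, which is the point of caching the copies.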


Running Pin

• Three program images are involved:
  1. pin
  2. pintool/plug-in
  3. Application
• “Shell mode”
    $ pin -t inscount -- xclock
• “Gdb mode”: attaching to an existing process
    $ pin -pid 1067 -t inscount
  (can detach and re-attach with different plug-in)


Transparency

Program execution under Pin is transparent:

• Program state is unchanged
  – Code/data addresses, memory content
• Will not expose latent bugs
• Instrumentation sees the original program
  – Code/data address, memory content

• (But: intentional program state changes possible, e.g. fault injection)


Transparency (Example)

Original Code:
    0x1000 call 0x4000      (push 0x1006 on stack, then jump to 0x4000)

Code cache address mapping:
    0x1000 -> 0x7000 “caller”
    0x4000 -> 0x8000 “callee”

Translated Code:
    0x7000 Push 0x1006
    0x7006 Jmp 0x8000

Stack content remains unchanged


Transparency has a Price

Original Code:
    0x4400 ret              (pop 0x1006 from stack, then jump to 0x1006)

Translated Code:
    0x8400 Pop rx
    0x84…  ry = Translate(rx)
    0x84…  Jmp ry

• Pin needs to translate the program address to a code cache address.
• Main reason for slowdowns in dynamic instrumentation systems!
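The Translate() step can be pictured as a table lookup performed at run time on every indirect transfer. The sketch below is an assumption for illustration only: `TranslationTable` and its methods are hypothetical names, and a hash map stands in for whatever structure Pin actually uses.

```cpp
#include <cstdint>
#include <unordered_map>

using Addr = std::uint64_t;

// Hypothetical translation table: original program address -> code cache address.
class TranslationTable {
public:
    // Record that the code at `program` has a copy at `cache`.
    void add(Addr program, Addr cache) { map_[program] = cache; }

    // Look up a program address. Returns true and sets `cache` on a hit;
    // returns false when the target has no copy in the code cache yet.
    bool translate(Addr program, Addr &cache) const {
        auto it = map_.find(program);
        if (it == map_.end()) return false;
        cache = it->second;
        return true;
    }

private:
    std::unordered_map<Addr, Addr> map_;
};
```

With the mapping from the transparency example (0x1000 to 0x7000, 0x4000 to 0x8000), a lookup of 0x4000 yields 0x8000, while an address that has not been copied yet reports a miss. The stack keeps holding original addresses, so this lookup cannot be skipped for returns.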


Portability Challenges

                    ARM                        IA-32/EM64T                 IPF
Type                RISC                       CISC                        VLIW
Instruction         Fixed length               Variable length, prefixes   Bundled
Memory instruction  LD/ST                      Any, implicit               LD/ST
Memory op size      Fixed                      Variable length             Fixed
Addressing modes    Pre/post/iprel increment   Index/offset/scale/iprel    post
Predication         Cond. codes                None                        Predicate regs
Parameters          Registers                  Stack/registers             Stacked registers


Pin Instrumentation Query API

• ISA independent part (usually sufficient)
  – INS_Address(), INS_Size(), INS_IsRet(), INS_IsCall(), INS_MemoryReadSize(), INS_Mnemonic(), etc.
• ISA dependent part (optional)
  – INS_GetPredicate(), INS_RegR(), INS_RegW(), etc.


Performance Comparison: No Instrumentation

[Figure: bar chart of normalized execution time (%), y-axis from 0 to 1200, comparing Valgrind, DynamoRIO, and Pin/IA32]

latest numbers are even better


Performance Comparison: Basic-Block Counting

[Figure: bar chart of normalized execution time (%), y-axis from 0 to 1600, comparing Valgrind, DynamoRIO, and Pin/IA32]

latest numbers are even better


Pin2 Status

• ISAs: IA32, IA32E, XScale, (IPF soon)

• Distros: Debian, Suse, Mandrake, Red Hat 7.2, 8.0, 9.0, EL3, FC3

• >2500 downloads

• Multithreading support in beta

• Windows support in preparation


Project Engineering

• Automatic nightly testing
  – >4 platforms
  – >7 Linux distributions
  – >8 compilers
  – >9000 binaries

• Automatically generated user manual, internal documentation using Doxygen


Outline

Pin Overview

Instrumentation Basics

Advanced Topics


Instrumentation vs. Analysis

Concepts borrowed from ATOM

• Instrumentation routines define where instrumentation is inserted
  – e.g. before instruction
  – Occurs at compile time (JIT time)
• Analysis routines define what to do when instrumentation is activated
  – e.g. increment counter
  – Occurs at runtime


Instrumentation vs. Analysis (2)

In ATOM:
• Instrumentation and analysis occurred in separate phases
• Code was in separate files

In Pin:
• Difference is somewhat blurred
• Instrumentation and analysis are interleaved
• User plug-in provides code for both

These are difficult terms to remember! Mental bridge:
    Instrumentation → Insertion
    Analysis → Action
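The split between "insertion" and "action" can be illustrated with a toy model in plain C++. It does not use the Pin API; `ToyInstruction` and `ToyJit` are hypothetical names. The instrumentation routine runs once per instruction when the code is first prepared, while the analysis routine it inserted runs every time that instruction executes.

```cpp
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical toy instruction: carries the analysis calls attached to it.
struct ToyInstruction {
    std::vector<std::function<void()>> before;  // calls to run before it executes
};

// Hypothetical toy JIT. It must outlive the instructions it instruments,
// because the inserted calls refer back to it.
struct ToyJit {
    std::size_t instrumentationCalls = 0;  // times the instrumentation routine ran
    std::size_t analysisCalls = 0;         // times the analysis routine ran

    // Instrumentation routine: decides WHERE to insert a call.
    void instrument(ToyInstruction &ins) {
        ++instrumentationCalls;
        // Analysis routine: WHAT happens when the inserted call fires.
        ins.before.push_back([this] { ++analysisCalls; });
    }

    // Prepare the code once, then execute it `iterations` times.
    void run(std::vector<ToyInstruction> &code, std::size_t iterations) {
        for (ToyInstruction &ins : code) instrument(ins);  // JIT time
        for (std::size_t i = 0; i < iterations; ++i)       // run time
            for (ToyInstruction &ins : code)
                for (auto &call : ins.before) call();
    }
};
```

For three instructions executed ten times, the instrumentation routine runs three times and the analysis routine thirty times, which is why work done at instrumentation time is comparatively cheap.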


Instrumentation Routine

• Written in C++
• Invoked by Pin via Callback mechanism
• Invoked when Pin places new code in code cache (different granularities: instruction, trace, …)
• Instruments using the Pin API for
  – inserting calls to analysis routines
  – picking arguments for analysis routines


Analysis Routines

• Written in any language: C, C++, Asm, etc.
• Invoked when surrounding code executes
• Isolated from application by
  – separate memory areas
  – separate register state
• Automatically optimized by Pin (inlining, register allocation, etc.)


Example: Instruction Count

    IncCounter();
    mov r2 = 2
    IncCounter();
    add r3 = 4, r3
    IncCounter();
    beq L1
    IncCounter();
    add r4 = 8, r4
    IncCounter();
    beq L2

Instrumentation: Insert call to IncCounter() before every instruction

Analysis:

    VOID IncCounter() { icount++; }


Example: Instruction Count

Output of inscount plug-in

    $ /bin/ls
    Makefile atrace.o imageload.out

    $ pin -t inscount -- /bin/ls
    Makefile atrace.o imageload.out
    Count 422838
    $


inscount.C

    #include <iostream>
    #include "pin.H"

    UINT64 icount = 0;

    // analysis
    VOID IncCounter() { icount++; }

    // instrumentation
    VOID Instruction(INS ins, VOID *v) {
        INS_InsertCall(ins, IPOINT_BEFORE, (AFUNPTR)IncCounter, IARG_END);  // (2)
    }

    VOID Fini(INT32 code, VOID *v) {
        std::cerr << "Count " << icount << std::endl;
    }

    // driver
    int main(int argc, char * argv[]) {
        PIN_Init(argc, argv);
        INS_AddInstrumentFunction(Instruction, 0);  // (1)
        PIN_AddFiniFunction(Fini, 0);               // (3)
        PIN_StartProgram();
        return 0;
    }

ISA independent!


Explanations

1. Register Instruction() to be called back for every instruction placed into the code cache

2. Insert call to IncCounter() before code cache instruction

3. Register Fini() to be called back at the end


Instrumentation Points

Relative to an instruction (“beq L2”):
1. Before (IPOINT_BEFORE)
2. After (IPOINT_AFTER)
3. On taken branch (IPOINT_BRANCH_TAKEN)

[Figure: code fragment containing mov r4 = 2, add r3=8,r9, beq L2, ret, and L2: mov r9 = 4, with points 1, 2, and 3 marked around “beq L2”]


Example: Instruction Trace

    traceInst(ip);
    mov r2 = 2
    traceInst(ip);
    add r3 = 4, r3
    traceInst(ip);
    beq L1
    traceInst(ip);
    add r4 = 8, r4
    traceInst(ip);
    beq L2


Example: Instruction Trace

    $ pin -t itrace -- /bin/ls
    Makefile atrace.o imageload.out

    $ head itrace.out
    0x40001e90
    0x40001e91
    0x40001ee4
    0x40001ee5
    0x40001ee7
    0x40001ee8
    …
    $


itrace.C

    #include <stdio.h>
    #include "pin.H"

    FILE * trace;

    VOID traceInst(VOID *ip) { fprintf(trace, "%p\n", ip); }

    VOID Instruction(INS ins, VOID *v) {
        INS_InsertCall(ins, IPOINT_BEFORE, (AFUNPTR)traceInst,
                       IARG_INST_PTR, IARG_END);  // (1)
    }

    int main(int argc, char * argv[]) {
        trace = fopen("itrace.out", "w");
        PIN_Init(argc, argv);
        INS_AddInstrumentFunction(Instruction, 0);
        PIN_StartProgram();
        return 0;
    }


Explanations

1. Insert traceInst() before code cache instruction; traceInst() takes an extra argument!

(Bad coding practice: we should have closed the file using a Fini function)


Analysis Routine Parameters

• IARG_UINT32 <number>
• IARG_REG_VALUE <register name> [*]
• IARG_INST_PTR
• IARG_BRANCH_TAKEN
• IARG_BRANCH_TARGET_ADDR
• IARG_G_ARG0_CALLER
• IARG_MEMORY_READ_EA
• IARG_SYSCALL_NUMBER
• …

[*] Will result in ISA dependent tool


Example: Fast Instruction Count

    BBL1:
        IncCounter(3);
        mov r2 = 2
        add r3 = 4, r3
        beq L1
    BBL2:
        IncCounter(2);
        add r4 = 8, r4
        beq L2

One IncCounter(n) call per basic block replaces the five per-instruction IncCounter(1) calls.


inscount.C

    #include <stdio.h>
    #include "pin.H"

    UINT64 icount = 0;

    VOID IncCounter(INT32 c) { icount += c; }

    VOID Trace(TRACE trace, VOID *v) {
        for (BBL b = TRACE_BblHead(trace); BBL_Valid(b); b = BBL_Next(b)) {
            BBL_InsertCall(b, IPOINT_BEFORE, (AFUNPTR)IncCounter,
                           IARG_UINT32, BBL_NumIns(b), IARG_END);  // (2)
        }
    }

    VOID Fini(INT32 code, VOID *v) {
        // icount is unsigned, so print it with an unsigned conversion
        fprintf(stderr, "Count %llu\n", (unsigned long long)icount);
    }

    int main(int argc, char * argv[]) {
        PIN_Init(argc, argv);
        TRACE_AddInstrumentFunction(Trace, 0);  // (1)
        PIN_AddFiniFunction(Fini, 0);
        PIN_StartProgram();
        return 0;
    }


Explanations

1. Register Trace() to be called back for every trace placed in the code cache
   As a first approximation, a “trace” is a sequence of basic blocks (BBLs)

2. For each trace, walk the BBLs and insert IncCounter() with the appropriate integer parameter at the beginning of each BBL
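Why the per-BBL version is faster can be checked with a small model in plain C++. This is a sketch, not Pin code: `CountResult`, `countPerInstruction`, and `countPerBbl` are hypothetical names, a basic block is reduced to its instruction count, and an execution is a sequence of block indices. Both strategies arrive at the same instruction count, but the per-BBL strategy makes far fewer analysis calls.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

struct CountResult {
    std::uint64_t icount;  // instructions counted
    std::uint64_t calls;   // analysis calls made to obtain that count
};

// Strategy 1: one IncCounter() call per executed instruction.
CountResult countPerInstruction(const std::vector<std::uint32_t> &bblSizes,
                                const std::vector<std::size_t> &executed) {
    CountResult r{0, 0};
    for (std::size_t b : executed) {
        for (std::uint32_t i = 0; i < bblSizes[b]; ++i) {
            r.icount += 1;
            ++r.calls;
        }
    }
    return r;
}

// Strategy 2: one IncCounter(n) call per executed basic block,
// where n is the number of instructions in that block.
CountResult countPerBbl(const std::vector<std::uint32_t> &bblSizes,
                        const std::vector<std::size_t> &executed) {
    CountResult r{0, 0};
    for (std::size_t b : executed) {
        r.icount += bblSizes[b];
        ++r.calls;
    }
    return r;
}
```

With block sizes 3 and 2, as in the example, and each block executed twice, both strategies count 10 instructions; the first needs 10 calls and the second only 4.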


Further Reading

The following material is also covered in the Pin user manual

Go to

http://rogue.colorado.edu/Pin/

Then follow the “manuals” link


Summary

• Pin instrumentation is:
  – Robust
  – Transparent
  – Easy-to-use
  – Efficient
  – Portable

• Try it: http://rogue.colorado.edu/Pin


Outline

Pin Overview

Instrumentation Basics

Advanced Topics


Trace vs. Instruction Instrumentation

    VOID Instruction(INS ins, VOID *v) {
        INS_InsertCall(ins, IPOINT_BEFORE, (AFUNPTR)Cnt, IARG_END);
    }

Can be emulated by:

    VOID Trace(TRACE trace, VOID *v) {
        for (BBL bbl = TRACE_BblHead(trace); BBL_Valid(bbl); bbl = BBL_Next(bbl)) {
            for (INS ins = BBL_InsHead(bbl); INS_Valid(ins); ins = INS_Next(ins)) {
                INS_InsertCall(ins, IPOINT_BEFORE, (AFUNPTR)Cnt, IARG_END);
            }
        }
    }


Definition: Pin Trace (JITI)

• List of instructions that is only entered from top, but may have multiple exits

• No side entries (Pin duplicates code to ensure this!)

• Multiple copies of instruction in code cache

Program:
        mov r2 = 2
    L2: add r3 = 4, r3
        add r4 = 8, r4
        beq L2

Trace 1:
    mov r2 = 2
    add r3 = 4, r3
    add r4 = 8, r4
    beq L2

Trace 2:
    add r3 = 4, r3
    add r4 = 8, r4
    beq L2
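The duplication in this example can be reproduced with a deliberately simplified model in plain C++. `buildTrace` is a hypothetical helper, not a Pin function, and it assumes a trace is simply a copy of the listing from its entry point to the end; real trace boundaries are decided by Pin.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: model a trace as a private copy of the instructions
// from `entry` to the end of the listing. Because every entry point gets its
// own copy, a trace is only ever entered at the top (no side entries).
std::vector<std::string> buildTrace(const std::vector<std::string> &program,
                                    std::size_t entry) {
    if (entry >= program.size()) return {};
    return std::vector<std::string>(
        program.begin() + static_cast<std::ptrdiff_t>(entry), program.end());
}
```

Building one trace from the top of the program and another from label L2 gives a four-instruction and a three-instruction trace, so `add r3 = 4, r3` exists twice in the code cache.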


Instrumentation Modes

• Just-In-Time Instrumentation (JITI)
  – Per instruction, per trace
  – “basic block” notion
• Ahead-Of-Time Instrumentation (AOTI)
  – Per instruction, per function, per section/image
  – Emulated using JITI
  – Functionality similar to ATOM
  – Extra startup overhead
  – No “basic blocks” notion


Per Image Instrumentation (AOTI)
Hooking Image (Un)Loading

    $ pin -t imageload -- /bin/ls
    Makefile imageload.o inscount0.o

    $ cat imageload.out
    Loading /bin/ls
    Loading /lib/ld-linux.so.2
    …
    Unloading /bin/ls
    Unloading /lib/ld-linux.so.2
    …


    …
    FILE * T;

    VOID ImageLoad(IMG img, VOID *v) {
        fprintf(T, "Loading %s\n", IMG_Name(img).c_str());
    }

    VOID ImageUnload(IMG img, VOID *v) {
        fprintf(T, "Unloading %s\n", IMG_Name(img).c_str());
    }

    VOID Fini(INT32 code, VOID *v) { fclose(T); }

    int main(int argc, char * argv[]) {
        T = fopen("imageload.out", "w");  // the file handle is T, as declared above
        PIN_Init(argc, argv);
        IMG_AddInstrumentFunction(ImageLoad, 0);
        IMG_AddUnloadFunction(ImageUnload, 0);
        PIN_AddFiniFunction(Fini, 0);
        PIN_StartProgram();
        return 0;
    }


“Walking” Images

    VOID ImageLoad(IMG img, VOID *v) {
        for (SEC sec = IMG_SecHead(img); SEC_Valid(sec); sec = SEC_Next(sec)) {
            for (RTN rtn = SEC_RtnHead(sec); RTN_Valid(rtn); rtn = RTN_Next(rtn)) {
                RTN_Open(rtn);
                for (INS ins = RTN_InsHead(rtn); INS_Valid(ins); ins = INS_Next(ins))
                    static_count++;
                RTN_Close(rtn);
            }
        }
    }


Explanations

• Image->Section->Routine->Instruction

• We are essentially walking the symtab

• For each function symbol:
  – Disassemble function (RTN_Open)
  – Then walk instructions
  – NB: no basic blocks available!


“Walking” And Instrumenting

    VOID ImageLoad(IMG img, VOID *v) {
        for (SEC sec = IMG_SecHead(img); SEC_Valid(sec); sec = SEC_Next(sec)) {
            for (RTN rtn = SEC_RtnHead(sec); RTN_Valid(rtn); rtn = RTN_Next(rtn)) {
                RTN_Open(rtn);
                for (INS ins = RTN_InsHead(rtn); INS_Valid(ins); ins = INS_Next(ins)) {
                    INS_InsertCall(ins, IPOINT_BEFORE, (AFUNPTR)Cnt, IARG_END);
                }
                RTN_Close(rtn);
            }
        }
    }


Explanations

• With AOTI, instrumentation requests are cached until the code is executed
• Effect like 1st instruction count example
• But:
  – worse (startup) performance
  – higher memory consumption
• Requires symbol table

→ Bad use of AOTI!


“Searching” And Instrumenting

SimpleExamples/malloctrace.C

    VOID ImageLoad(IMG img, VOID *v) {
        RTN mallocRtn = RTN_FindByName(img, "malloc");
        if (RTN_Valid(mallocRtn)) {
            RTN_Open(mallocRtn);
            RTN_InsertCall(mallocRtn, IPOINT_BEFORE, (AFUNPTR)MBefore,
                           IARG_G_ARG0_CALLEE, IARG_END);
            RTN_InsertCall(mallocRtn, IPOINT_AFTER, (AFUNPTR)MAfter,
                           IARG_G_RESULT0, IARG_END);
            RTN_Close(mallocRtn);
        }
    }


Explanations

• Instrument the prolog and epilogs of malloc() using RTN_InsertCall

• Instrumentation really happens at the instruction level, hence we must call RTN_Open

• Requires symbol table

• Good use of AOTI!


Performance Considerations

    VOID count(ADDRINT s, ADDRINT d) {
        COUNTER *pedg = Lookup(s, d);  // expensive!
        pedg->_count++;
    }

    VOID Instruction(INS ins, void *v) {
        ...
        if ( [ins is a branch or a call instruction] )
            INS_InsertCall(ins, IPOINT_BEFORE, (AFUNPTR)count,
                           IARG_INST_PTR,
                           IARG_BRANCH_TARGET_ADDR,
                           IARG_END);
        ...
    }


Improved Version

    VOID count_fast(COUNTER *pedg) { pedg->_count++; }

    VOID InstructionFast(INS ins, void *v) {
        …
        if (INS_IsDirectBranchOrCall(ins)) {
            // The lookup now happens once, at instrumentation time
            COUNTER *pedg = Lookup(INS_Address(ins),
                                   INS_DirectBranchOrCallTargetAddress(ins));
            INS_InsertCall(ins, IPOINT_BEFORE, (AFUNPTR)count_fast,
                           IARG_ADDRINT, pedg,
                           IARG_END);
        } else {
            ...
        }
    }


Remarks

• If possible move work from analysis to instrumentation!

• Keep analysis routines small so that they get inlined!
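The effect of moving the lookup from the analysis routine to the instrumentation routine can be demonstrated without Pin. In this sketch `EdgeTable`, `Counter`, `countSlow`, and `countFast` are hypothetical stand-ins for the `Lookup`, `COUNTER`, `count`, and `count_fast` of the two preceding examples, and a `std::map` plays the role of the expensive edge table.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <utility>

using Addr = std::uint64_t;

struct Counter {
    std::uint64_t count = 0;
};

// Hypothetical edge table; every lookup is tallied so the cost is visible.
class EdgeTable {
public:
    Counter *lookup(Addr src, Addr dst) {
        ++lookups_;
        // std::map never relocates its elements, so the pointer stays valid.
        return &table_[std::make_pair(src, dst)];
    }
    std::size_t lookups() const { return lookups_; }

private:
    std::map<std::pair<Addr, Addr>, Counter> table_;
    std::size_t lookups_ = 0;
};

// Slow analysis routine: looks the edge up on every execution.
void countSlow(EdgeTable &table, Addr src, Addr dst) {
    table.lookup(src, dst)->count++;
}

// Fast analysis routine: the counter was resolved once, at instrumentation
// time, and is passed in as an argument.
void countFast(Counter *edge) { edge->count++; }
```

Executing the same direct branch 100 times costs 100 lookups with `countSlow` but a single lookup with `countFast`, and the fast routine is a one-line increment, small enough to be a plausible inlining candidate.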


Plug-ins Shipped with Pin2

• Data cache simulation

• Malloc/Free tracer

• Syscall tracer

• Opcode mix profiler

• Register usage profiler

• …


Debugging Pin Plug-ins

Pause Pin for 7 sec to attach with gdb

    $ pin -pause_tool 7 -t inscount -- /bin/ls
    Pausing to attach to pid 28769

    $ gdb
    (gdb) attach 28769
    …
    (gdb) break main
    ...
    (gdb) cont


Summary

• Pin instrumentation is:
  – Robust
  – Transparent
  – Easy-to-use
  – Efficient
  – Portable

• Try it: http://rogue.colorado.edu/Pin