28
CLRProfiler A Tool to Analyze the Heap Usage of Managed Code Peter Sollich Microsoft Corporation

CLRProfiler A Tool to Analyze the Heap Usage of Managed Code Peter Sollich Microsoft Corporation

Embed Size (px)

Citation preview

Page 1: CLRProfiler A Tool to Analyze the Heap Usage of Managed Code Peter Sollich Microsoft Corporation

CLRProfiler

A Tool to Analyze the Heap Usage of Managed Code

Peter Sollich

Microsoft Corporation

Page 2: CLRProfiler A Tool to Analyze the Heap Usage of Managed Code Peter Sollich Microsoft Corporation

Agenda• Overview• Important Views

– Allocation Graph– Objects by Address– Time Line– Heap Graph

• How they are implemented- Instrumentation- Allocation notifications, Shadow Stack for Allocation Graph- Heap simulation, LiveObjectTable for Objects by Address- SampleObjectTable for Time Line- Building the Heap Graph

- The CLRProfiler API

Page 3: CLRProfiler A Tool to Analyze the Heap Usage of Managed Code Peter Sollich Microsoft Corporation

Overview

• Focused on managed heap– Who allocates what– What objects survive– What is on the heap– Who is holding on to objects

• Instrumented application writes log• Separate tool to analyze log offline• Intrusive tool

– By default, every allocation, every call is logged– Expect 10 – 100 x slowdown– Logging can be turned off selectively for speedup– Not a tool to measure where time is spent

Page 4: CLRProfiler A Tool to Analyze the Heap Usage of Managed Code Peter Sollich Microsoft Corporation

Agenda• Overview• Important Views

– Allocation Graph– Objects by Address– Time Line– Heap Graph

• How they are implemented- Instrumentation- Allocation notifications, Shadow Stack for Allocation Graph- Heap simulation in „LiveObjectTable“ for Objects by Address- „SampleObjectTable“ for Time Line- Building the Heap Graph

- The CLRProfiler API

Page 5: CLRProfiler A Tool to Analyze the Heap Usage of Managed Code Peter Sollich Microsoft Corporation

Allocation Graph

What is it?• Every type, every method becomes a vertex• Edges from method to type if method allocates type• Edges from method A to method B if A calls B and B (directly or indirectly)

allocates• Sizes of vertices, width of edges proportional to cumulative total of

allocation• Detail is suppressed below an adjustable threshold• Recursion is eliminated to some extent

What is it useful for?• Which call stacks allocate which types?

Page 6: CLRProfiler A Tool to Analyze the Heap Usage of Managed Code Peter Sollich Microsoft Corporation

Allocation Graph (continued)

• Show demo of simple parser example– What would we do in this case? Avoid string.Split, use lower level

calls.– We could also change string.Split to allocate a smaller int[] array and

grow if necessary.– What does CLRProfiler itself do for parsing? Show Source code for

ReadChar().

• Another example: drawing gc tick marks– Original version – new SolidBrush(), new Pen() inside loop– Improved version preallocates a table of these

• Real life example – a GUI library was allocating lots of objects while just hovering the mouse over form – we were able to pinpoint allocating methods

Page 7: CLRProfiler A Tool to Analyze the Heap Usage of Managed Code Peter Sollich Microsoft Corporation

Objects by Address

What is it?• Picture of gc heap at particular point in time• Addresses are mapped to pixels – selectable resolution• Pixel color determined by object‘s type

What is it useful for?• Determine what gets promoted to higher generations• Get intuition for what is taking space on the heap• Get intuition for application phases• Can drill down by selecting address range and asking for

allocation graph, histograms

Page 8: CLRProfiler A Tool to Analyze the Heap Usage of Managed Code Peter Sollich Microsoft Corporation

Objects by Address (continued)

• Demo application with finalizers– Fix the problem – using statement– Not so contrived as it seems – this happened in CLRProfiler with pens

and brushes!

• Real life example– Database application had bad performance, stressing gc– Tracked this down to useless finalizer– Still happens to a surprising extent in Longhorn – unfortunately not so

easy to fix there...

Page 9: CLRProfiler A Tool to Analyze the Heap Usage of Managed Code Peter Sollich Microsoft Corporation

Time Line

What is it?• Picture of evolution of gc heap over time• Horizontal axis is time axis• Vertical axis is addresses• Pixel color determined by type of object at address at point in time

What is it useful for?• See at a glance whether heap is growing, how effective garbage collection

is• See how often gen 0, gen 1, gen 2 collections occur• Can drill down by selecting time interval or point in time and requesting

Allocation Graph, Histograms etc.

Page 10: CLRProfiler A Tool to Analyze the Heap Usage of Managed Code Peter Sollich Microsoft Corporation

Time Line (continued)

• Demo view using previous example• Real life example

– Application using asynchronous I/O (network) had high working set– Tracked down to garbage collector unable to recycle memory due to

fragmentation caused by pinning– In CLRProfiler, noticed discrepancy between what the time line

showed (a large heap), and what Heap Graph said (not many objects actually live)

– Changed garbage collector to deal better with fragmentation caused by pinning, also need to change application or library to use less pinning.

Page 11: CLRProfiler A Tool to Analyze the Heap Usage of Managed Code Peter Sollich Microsoft Corporation

Heap Graph

What is it?• Shows objects on the heap and their connections• For each object, only shows (a) shortest path from a garbage collector

root• Objects are grouped by signatures (type of parent, type of object, type of

objects referenced).• History feature – you can see what is growing• You can see new objects added since the last heap dump

What is it useful for?• Used mostly to investigate leaks – what is holding all these objects alive?

Page 12: CLRProfiler A Tool to Analyze the Heap Usage of Managed Code Peter Sollich Microsoft Corporation

Real life example: Longhorn Explorer

• Opening and closing the clock on the side bar– Time Line shows the heap is growing – a leak?– Objects by Address shows phases of the application– Heap Graph shows growth – we are indeed leaking

• Explorer idle– Time Line shows garbage collector tuning problem – need to do

generation 1 garbage collection earlier– Objects by Address shows various ...Transform objects piling up in

generation 1– Can drill down and find out call stacks responsible for these

allocations. Need to investigate whether callers can call Dispose when done using the ...Transform objects.

Page 13: CLRProfiler A Tool to Analyze the Heap Usage of Managed Code Peter Sollich Microsoft Corporation

Agenda• Overview• Important Views

– Allocation Graph– Objects by Address– Time Line– Heap Graph

• How they are implemented- Instrumentation- Allocation notifications, Shadow Stack for Allocation Graph- Heap simulation, LiveObjectTable for Objects by Address- SampleObjectTable for Time Line- Building the Heap Graph

- The CLRProfiler API

Page 14: CLRProfiler A Tool to Analyze the Heap Usage of Managed Code Peter Sollich Microsoft Corporation

Instrumentation

• CLR provides profiling callbacks – no need to change CLR or application to profile

• Set environment variables (or registry keys) to indicate profiling and which profiler dll to loadset COR_ENABLE_PROFILING=0x1

set COR_PROFILER={8C29BC4E-1F57-461a-9B51-1200C32E6F1F}

• Profiler dll implements interface which CLR calls to notify profiler of important events

• Profiler dll associated with CLRProfiler responds to notifications by logging or updating its datastructures

Page 15: CLRProfiler A Tool to Analyze the Heap Usage of Managed Code Peter Sollich Microsoft Corporation

Logging Allocations

• CLR calls ObjectAllocated notification, passing address and type of object allocated

• CLR calls FunctionEnter/FunctionExit notifications – profiler dll uses these (and others for tail calls, exceptions etc.) to maintain shadow stack

• Profiler dll responds to ObjectAllocated notification by logging allocation and current shadow stack

• Current format encodes object size, object type and call stack in a single integer.

• But ID of allocating thread and object address are also logged – still room for improvement!

Page 16: CLRProfiler A Tool to Analyze the Heap Usage of Managed Code Peter Sollich Microsoft Corporation

Logging Allocations – Evolution of format

• Initial format – log all the names explicitly, e.g.:System.String 24

System.String.Concat System.String(System.String System.String)

MyNameSpace.MyClass.MyFunction void ()

...

• Next step: list type, method names just once, they occur many times (but added object address):t 33 System.String

f 163 System.String.Concat System.String(System.String System.String)

f 623 MyNameSpace.MyClass.MyFunction()

...

a 0x01234568 33 24 163 623

Page 17: CLRProfiler A Tool to Analyze the Heap Usage of Managed Code Peter Sollich Microsoft Corporation

Evolution of log file format (continued)

• Now we compress object type, size and stacktrace into a single integer – they are often repeated as well:t 33 System.Stringf 163 System.String.Concat System.String(System.String System.String)f 623 MyNameSpace.MyClass.MyFunction()s 3453 33 24 163 623a 0x01234568 3453

• On the other hand, we needed to add the id of the allocating thread:! 968 0x1234568 3453

• Still room for improvement – thread ids can be smaller, addresses, stacktrace ids are often consecutive or close

Page 18: CLRProfiler A Tool to Analyze the Heap Usage of Managed Code Peter Sollich Microsoft Corporation

Building the Allocation Graph

• For Allocation Graph, parser just increments a counter in table indexed by the id that encodes object size, type and call stack

• When done parsing, examine each non-zero entry• Retrieve call stack, make each entry into a vertex• Retrieve object type, make into a vertex• Generate appropriate edges• Weight of vertices and edges determined from count and

object size• Lay out and display resulting graph – hard to do well, still in

flux

Page 19: CLRProfiler A Tool to Analyze the Heap Usage of Managed Code Peter Sollich Microsoft Corporation

Building Object by Address

• Uses allocation log entries• Also uses relocation log entries generated when garbage collector moves

entries – log file format is simple:u <oldObjectId> <newObjectId> <length>u 0x01cc4a58 0x01cc469c 296

• Object death notification would be handy, but we don‘t have it (yet) – using heuristics to infer object death – not completely accurate!

• How to build the simulation – „LiveObjectTable“– Needs to be fast and relatively compact– Started with sorted list, was too slow and bulky– Currently use (segmented) array indexed by object‘s address, a ½ size map of the real

garbage collector heap– Stored information: 1 bit for start of object, 22 bits to encode type, size, stacktrace, 23

bits for allocation time – minimum object size is 12 bytes in the real heap, 6 bytes in the map. Stored as 3 16-bit words, high bit marks obj start

Page 20: CLRProfiler A Tool to Analyze the Heap Usage of Managed Code Peter Sollich Microsoft Corporation

LiveObjectTable

Real object layout: Simulated layout (1/2 Size)

Object Header (4 Bytes)

Method Table (4 Bytes)

Contents (>= 4 Bytes)...

1TSS Lo (15 bits)

0TSS Hi (7 bits)

AT Lo (8 bits)

AT Hi (15 bits)

0

0 ... 00

TSS = type, size, stacktraceAT = allocation timeHigh bit marks start of object

Next Object Header (4 Bytes) ... 1Next TSS Lo ...

(15 bits)

Page 21: CLRProfiler A Tool to Analyze the Heap Usage of Managed Code Peter Sollich Microsoft Corporation

Building Time Line

• Cannot show or track history of all objects• Hence, track history of changes at „sample addresses“ (1024

byte granularity today)• When an object gets allocated at or moved from or to a

sample address, make a note of the change• Data structure used is an array of lists of changes:

„SampleObjectTable“• Additional lists for garbage collection markers, user entries in

the log file• Final graph drawn by walking lists of changes, drawing

colored lines from change to change

Page 22: CLRProfiler A Tool to Analyze the Heap Usage of Managed Code Peter Sollich Microsoft Corporation

SampleObjectTable

0x123400001.524 Secs

String1.623 Secs

Int321.892 SecsFree Space

0x12340400

0x12340800

0x12340C00

1.589 SecsObject []

1.931 SecsString

2.054 SecsDouble

GarbageCollections

1.985 SecsGen 1 Coll.

1.765 SecsGen 0 Coll.

1.210 SecsGen 0 Coll.

Page 23: CLRProfiler A Tool to Analyze the Heap Usage of Managed Code Peter Sollich Microsoft Corporation

Building Heap Graph

• Profiler dll uses ObjectReferences and RootReferences notifications to log appropriate information - log file format is very simple:– o <objectID> <typeID> <Size> {<referencedObjectId>}– o 0x01c91298 13 24 0x01c912b0 0x01c912fc 0x01c91320– r {<rootObjectId>}– r 0x020455d0 0x02044d98 0x02044984 0x02044570

• Parser of analysis tool builds full graph of objects• Analysis tool determines (a) shortest path from the root for each object in

a breadth-first traversal• For each object, signature is computed from type of parent, type of object,

and type of objects referenced• For each object on the path from the root, we create a vertex based on the

signature, and appropriate edges between the vertices• Finally, lay out and display resulting graph

Page 24: CLRProfiler A Tool to Analyze the Heap Usage of Managed Code Peter Sollich Microsoft Corporation

Building Heap Graph (continued)

• Objects in Heap Graph contain allocation time (determined by looking them up in LiveObjectTable)– We can restrict the graph to a range of allocation times– Useful to determine recent heap growth– Useful to determine old objects surviving for too long

• Objects in Heap Graph also have allocation stack (also determined by looking them up in LiveObjectTable)– So we can tell you who allocated them

Page 25: CLRProfiler A Tool to Analyze the Heap Usage of Managed Code Peter Sollich Microsoft Corporation

Agenda• Overview• Important Views

– Allocation Graph– Objects by Address– Time Line– Heap Graph

• How they are implemented- Instrumentation- Allocation notifications, Shadow Stack for Allocation Graph- Heap simulation, LiveObjectTable for Objects by Address- SampleObjectTable for Time Line- Building the Heap Graph

- The CLRProfiler API

Page 26: CLRProfiler A Tool to Analyze the Heap Usage of Managed Code Peter Sollich Microsoft Corporation

CLRProfiler API

• Allows application to control profiling:– Turn profiling on or off– Log comments (can also serve as time markers)– Trigger heap dumps– Enables more fancy investigations

• Compare heap dumps at specific points in application

• Determine which objects allocated between entering and exiting a method still survive later

• Etc.

• Implemented by a thin layer of managed code calling into profiling dll via P/Invoke.

Page 27: CLRProfiler A Tool to Analyze the Heap Usage of Managed Code Peter Sollich Microsoft Corporation

CLRProfiler API (continued)

• public class CLRProfilerControl• {• // log comments• public static void LogWriteLine(string comment) { ... }• public static void LogWriteLine(string format, params object[] args) { ... }

• // enable/disable profiling• public static bool AllocationLoggingActive { get { ... } set { ... } }• public static bool CallLoggingActive { get { ... } set { ... } }

• // dump heap• public static void DumpHeap() { ... }

• // is process being profiled?• public static bool ProcessIsUnderProfiler { get { ... } }• }

Page 28: CLRProfiler A Tool to Analyze the Heap Usage of Managed Code Peter Sollich Microsoft Corporation

Finally...

• Source code, binaries, and documentation for CLRProfiler are freely available:– http://www.microsoft.com/downloads/details.aspx?FamilyId=86CE605

2-D7F4-4AEB-9B7A-94635BEEBDDA&displaylang=en– http://www.gotdotnet.com/team/clr/about_clr_performance.aspx also

has a link that will probably be more permanent...

• If you get the impression that you could write a profiler, too, you are probably right – it‘s not that hard, and there are plenty of interesting notifications from the CLR to play with...

• More examples and Documentation provided with VS• Questions?