380C Where are we & where we are going Managed languages • Dynamic compilation • Inlining • Garbage collection • What else can you do when you examine the heap a lot? – Why you need to care about workloads Alias analysis Dependence analysis Loop transformations EDGE architectures 1

380C

• Where are we & where we are going
  – Managed languages
    • Dynamic compilation
    • Inlining
    • Garbage collection
    • What else can you do when you examine the heap a lot?
  – Why you need to care about workloads
  – Alias analysis
  – Dependence analysis
  – Loop transformations
  – EDGE architectures

380C lecture 18

• Garbage Collection
  – Why use garbage collection?
  – What is garbage?
    • Reachable vs live, stack maps, etc.
  – Allocators and their collection mechanisms
    • Semispace
    • Marksweep
    • Performance comparisons
    • Mark Region
  – Incremental age based collection
    • Write barriers: Friend or foe?
    • Generational
    • Beltway

Mark Region and Other Advances in Garbage Collection

Kathryn S. McKinley, University of Texas at Austin
Stephen M. Blackburn, Australian National University

PLDI’08: Immix: A Mark-Region Collector With Space Efficiency, Fast Collection, and Mutator Performance

Isn’t GC a bit retro?

“Languages without automated garbage collection are getting out of fashion. The chance of running into all kinds of memory problems is gradually outweighing the performance penalty you have to pay for garbage collection.”

Paul Jansen, managing director of TIOBE Software, in Dr Dobbs, April 2008

Mark-Sweep: McCarthy, 1960
Mark-Compact: Styger, 1967
Semi-Space: Cheney, 1970

GC Fundamentals: The Time–Space Tradeoff

[Figure: time–space tradeoff, annotated “Our Goal”]

GC Fundamentals: Algorithmic Components

• Allocation
  – Bump Allocation
  – Free List
• Identification
  – Tracing (implicit)
  – Reference Counting (explicit)
• Reclamation
  – Sweep-to-Free
  – Compact
  – Evacuate
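The two allocation components named above can be contrasted with a minimal sketch. The class names, the first-fit policy, and the `(address, size)` representation are assumptions made for illustration; production free-list allocators are typically organized differently (for example by size class).

```python
class BumpAllocator:
    """Contiguous allocation: advance a cursor through one region."""

    def __init__(self, start, size):
        self.cursor = start
        self.limit = start + size

    def alloc(self, nbytes):
        if self.cursor + nbytes > self.limit:
            return None  # region exhausted; a collector would run here
        addr = self.cursor
        self.cursor += nbytes
        return addr


class FreeListAllocator:
    """First-fit allocation from a list of (address, size) free cells."""

    def __init__(self, start, size):
        self.free = [(start, size)]

    def alloc(self, nbytes):
        for i, (addr, size) in enumerate(self.free):
            if size >= nbytes:
                if size == nbytes:
                    del self.free[i]
                else:
                    # Split the cell and keep the remainder on the list.
                    self.free[i] = (addr + nbytes, size - nbytes)
                return addr
        return None  # no cell is large enough

    def release(self, addr, nbytes):
        # Sweep-to-free: hand a dead object's cell back to the free list.
        self.free.append((addr, nbytes))
```

Bump allocation places consecutive allocations next to each other, while the free list can reuse individual cells without moving anything.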

GC Fundamentals: Canonical Garbage Collectors

• Mark-Sweep [McCarthy 1960]: free-list + trace + sweep-to-free
• Mark-Compact [Styger 1967]: bump allocation + trace + compact
• Semi-Space [Cheney 1970]: bump allocation + trace + evacuate
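As an illustration of the "trace + evacuate" combination, here is a minimal sketch of a Cheney-style semi-space collection over a toy object graph. The `Obj` class and the use of a Python list as to-space are assumptions for illustration, not how a real collector lays out memory.

```python
class Obj:
    def __init__(self, name, fields=()):
        self.name = name
        self.fields = list(fields)  # references to other Obj instances
        self.forward = None         # forwarding pointer, set once evacuated


def cheney_collect(roots):
    """Evacuate everything reachable from roots; return (new_roots, to_space)."""
    to_space = []  # appended to in order, standing in for a bump-allocated space

    def evacuate(obj):
        if obj.forward is None:
            copy = Obj(obj.name, obj.fields)  # fields still point into from-space
            obj.forward = copy
            to_space.append(copy)
        return obj.forward

    new_roots = [evacuate(r) for r in roots]
    scan = 0
    while scan < len(to_space):  # objects at or after `scan` are not yet scanned
        copy = to_space[scan]
        copy.fields = [evacuate(f) for f in copy.fields]
        scan += 1
    return new_roots, to_space
```

Unreachable objects are never visited, so the work is proportional to the live data, at the price of reserving a second space to copy into.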

Mark-Sweep: Free List Allocation + Trace + Sweep-to-Free

✓ Space efficient
✓ Simple, very fast collection
✗ Poor locality

Actual data, taken from geomean of DaCapo, jvm98, and jbb2000 on 2.4GHz Core 2 Duo

Mark-Compact: Bump Allocation + Trace + Compact

✓ Space efficient
✓ Good locality
✗ Expensive multi-pass collection

Actual data, taken from geomean of DaCapo, jvm98, and jbb2000 on 2.4GHz Core 2 Duo

Semi-Space: Bump Allocation + Trace + Evacuation

✓ Good locality
✗ Space inefficient

Actual data, taken from geomean of DaCapo, jvm98, and jbb2000 on 2.4GHz Core 2 Duo

Mark-Region with Sweep-To-Region

Reclamation: Sweep-to-Free, Compact, Evacuate, Sweep-to-Region

• Mark-Sweep: free-list + trace + sweep-to-free
• Mark-Compact: bump allocation + trace + compact
• Semi-Space: bump allocation + trace + evacuate
• Mark-Region: bump allocation + trace + sweep-to-region

Mark-Region: Bump Allocation + Trace + Sweep-to-Region

✓ Simple, very fast collection
✓ Space efficient
✓ Good locality
✓ Excellent performance

Actual data, taken from geomean of DaCapo, jvm98, and jbb2000 on 2.4GHz Core 2 Duo

Naïve Mark-Region

• Contiguous allocation into regions
  – Excellent locality
  – For simplicity, objects cannot span regions
• Simple mark phase (like mark-sweep)
  – Mark objects and their containing region
• Unmarked regions can be freed
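The naïve scheme above can be sketched in a few lines. This is an illustrative model only: `REGION_SIZE`, the `Heap` and `Obj` classes, and the use of a Python set in place of mark bits are all assumptions, not details from the slides.

```python
REGION_SIZE = 64  # bytes per region; illustrative value


class Obj:
    def __init__(self, addr, size, refs=()):
        self.addr, self.size, self.refs = addr, size, list(refs)


class Heap:
    def __init__(self, num_regions):
        self.num_regions = num_regions
        self.free_regions = list(range(num_regions))
        self.current = None  # region currently being bump-allocated
        self.cursor = 0      # offset within the current region

    def alloc(self, size, refs=()):
        assert size <= REGION_SIZE, "objects cannot span regions"
        if self.current is None or self.cursor + size > REGION_SIZE:
            if not self.free_regions:
                return None  # out of regions; caller should collect
            self.current = self.free_regions.pop(0)
            self.cursor = 0
        obj = Obj(self.current * REGION_SIZE + self.cursor, size, refs)
        self.cursor += size
        return obj

    def collect(self, roots):
        """Mark reachable objects and their regions; free unmarked regions."""
        region_marked = [False] * self.num_regions
        seen, stack = set(), list(roots)
        while stack:
            obj = stack.pop()
            if obj in seen:
                continue
            seen.add(obj)
            region_marked[obj.addr // REGION_SIZE] = True
            stack.extend(obj.refs)
        # Sweep-to-region: only wholly unmarked regions are reclaimed.
        self.free_regions = [r for r in range(self.num_regions)
                             if not region_marked[r]]
        self.current = None
        return self.free_regions
```

A single live object keeps its whole region unavailable in this model, which is the fragmentation problem that finer-grained marking is meant to address.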

Immix: Efficient Mark-Region Garbage Collection

Lines and Blocks

Large Regions
✓ More contiguous allocation
✗ Fragmentation (false marking)

Small Regions
✗ Increased metadata o/h
✗ Constrained object sizes
✗ Fragmentation (can’t fill blocks)

Lines & Blocks
• Blocks: N pages
  – TLB locality, cache locality
  – Block > 4 × max object size
• Lines: approx. 1 cache line
  – Objects span lines
  – Lines marked with objects
✓ Less fragmentation
✓ Fast common case

[Figure labels: free blocks; recyclable lines]

Allocation Policy (Recycling)

• Recycle partially marked blocks first
  – Minimizes fragmentation
  – Maximizes sharing of freed blocks
• Recycle in address order
  – We explored other options
• Allocate into free blocks last
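The ordering described by this policy can be sketched as a simple sort. The `Block` record and its fields are hypothetical names for illustration, and ordering the free blocks by address is an assumption; the slide only specifies address order for recycled blocks.

```python
from dataclasses import dataclass


@dataclass
class Block:
    address: int
    marked_lines: int  # lines found live at the last collection
    total_lines: int

    @property
    def is_free(self):
        return self.marked_lines == 0

    @property
    def is_recyclable(self):
        return 0 < self.marked_lines < self.total_lines


def allocation_order(blocks):
    """Partially marked blocks first (in address order), free blocks last."""
    recyclable = sorted((b for b in blocks if b.is_recyclable),
                        key=lambda b: b.address)
    free = sorted((b for b in blocks if b.is_free),
                  key=lambda b: b.address)
    return recyclable + free  # fully marked blocks are never allocated into
```

Filling the holes in partially used blocks before touching free blocks keeps whole blocks available for as long as possible.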

Opportunistic Defragmentation

• Opportunistically evacuate fragmented blocks
  – Lightweight, uses same allocation mechanism
  – No cost in common case (specialized GC)
• Identify source and target blocks
  – (see paper for heuristics)
• Evacuate objects in source blocks
  – Allocate into target blocks
• Opportunistic
  – Leave in place if no space, or object pinned
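The "opportunistic" part can be sketched as a per-object decision made during collection. Everything here is a simplified assumption for illustration: blocks are identified by name, target space is tracked as a byte count, and the heuristics for choosing source and target blocks (left to the paper) are taken as given inputs.

```python
from dataclasses import dataclass


@dataclass
class Obj:
    size: int
    block: str
    pinned: bool = False


def defragment(live_objects, source_blocks, target_space):
    """Try to move live objects out of source blocks.

    target_space maps a target block name to its free bytes.
    Returns the list of objects that were actually moved.
    """
    moved = []
    for obj in live_objects:
        if obj.block not in source_blocks or obj.pinned:
            continue  # not a candidate, or pinned: leave in place
        for name, free in target_space.items():
            if free >= obj.size:
                target_space[name] = free - obj.size
                obj.block = name  # evacuate: allocate into the target block
                moved.append(obj)
                break
        # If no target had room, the object simply stays where it is.
    return moved
```

Because failure to move an object is always acceptable, the collector never needs to reserve copy space up front.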

Other Optimizations

Implicit Marking
• Small objects implicitly mark next line
• Large objects mark lines exactly
✓ Most objects small
✓ Very fast common case

[Figure legend: line mark; implicit line mark]

Overflow Allocation
• Multi-line objects may skip many small holes
• Overflow allocation (used on failure)
✓ Large objects uncommon
✓ Very effective solution
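Implicit marking can be sketched as follows, under the assumption that "small" means no larger than one line and that the allocator treats the line after any marked line as unavailable. `LINE_SIZE` and both function names are illustrative choices, not taken from the slides.

```python
LINE_SIZE = 128  # bytes per line; illustrative value


def mark_lines(line_marks, addr, size):
    """Record the lines occupied by a live object at byte offset addr."""
    first = addr // LINE_SIZE
    if size <= LINE_SIZE:
        # Small object: mark only the line holding its start. Any spill
        # into the next line is covered implicitly by find_holes.
        line_marks[first] = True
    else:
        # Large object: mark every line it touches, exactly.
        last = (addr + size - 1) // LINE_SIZE
        for line in range(first, last + 1):
            line_marks[line] = True


def find_holes(line_marks):
    """Return (first_line, num_lines) runs that are safe to allocate into."""
    holes, start = [], None
    for i, marked in enumerate(line_marks):
        implicitly_marked = i > 0 and line_marks[i - 1]
        if marked or implicitly_marked:
            if start is not None:
                holes.append((start, i - start))
                start = None
        elif start is None:
            start = i
    if start is not None:
        holes.append((start, len(line_marks) - start))
    return holes
```

The common case costs one mark per small object, in exchange for conservatively giving up one line after each marked run.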

Results

Complete data available at: http://cs.anu.edu.au/~Steve.Blackburn/pubs

Evaluation

20 Benchmarks
• DaCapo
• SPECjvm98
• SPEC jbb2000

Methodology
• MMTk, Jikes RVM 2.9.3 (Perf ≈ HotSpot 1.5)
• Replay compiler
• Discard outliers
• Report 95th %ile

Collectors
• Full Heap: Immix, MarkSweep, MarkCompact, SemiSpace
• Generational: GenIX, GenMS, GenCopy
• Sticky: StickyIX, StickyMS

Hardware
• Core 2 Duo: 2.4GHz, 32KB L1, 4MB L2, 2GB RAM
• AMD Athlon 3500+: 2.2GHz, 64KB L1, 512KB L2, 2GB RAM
• PowerPC 970: 1.6GHz, 32KB L1, 512KB L2, 2GB RAM

Please see the paper for details.

[Figure: Mutator Time. Geomean of DaCapo, jvm98 and jbb2000 on 2.4GHz Core 2 Duo]

[Figure: Minimum Heap]

[Figure: GC Time. Geomean of DaCapo, jvm98 and jbb2000 on 2.4GHz Core 2 Duo]

[Figure: Total Performance. Geomean of DaCapo, jvm98 and jbb2000 on 2.4GHz Core 2 Duo]

[Figure: Generational Performance. Geomean of DaCapo, jvm98 and jbb2000 on 2.4GHz Core 2 Duo]

[Figure: Sticky Performance. Geomean of DaCapo, jvm98 and jbb2000 on 2.4GHz Core 2 Duo]

[Figure: PseudoJBB 2000. On 2.4GHz Core 2 Duo]

Prior Work

• IBM product collector
  – http://www.ibm.com/developerworks/ibm/library/i-garbage1/
  – Mark-Region not characterized
  – Collector not evaluated
  – Product and basis for other research
    • [Domani et al 2000] [Kermany & Petrank 2006]

Mark-Region Collection

Reclamation: Sweep-to-Free, Compact, Evacuate, Sweep-to-Region

• Mark-Sweep: free-list + trace + sweep-to-free
• Mark-Compact: bump allocation + trace + compact
• Semi-Space: bump allocation + trace + evacuate
• Mark-Region: bump allocation + trace + sweep-to-region

Immix: Efficient Mark-Region Collection

✓ Simple, very fast collection
✓ Space efficient
✓ Good locality
✓ Excellent performance

Actual data, taken from geomean of DaCapo, jvm98, and jbb2000 on 2.4GHz Core 2 Duo

Open Source

Code available in JikesRVM 2.9.3 onward: http://www.jikesrvm.org

Complete data available at: http://cs.anu.edu.au/~Steve.Blackburn/pubs

Research History

• PLDI 1998
  – Clinger & Hansen postulated the radioactive decay model for object lifetimes
• Genesis of Older-First
  – [Stefanovic, McKinley, Moss OOPSLA’99]

Garbage Collection Hypotheses

• Generational hypothesis: younger objects die quickly, so collect them first
• Older-first hypothesis: the collector can collect less the longer it waits

[Figure: survival function s(v) for the object lifetime distribution, plotted over an age-ordered heap; axis labels: younger, older; ticks: 0, ½V, V]
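To make the survival function concrete, the sketch below evaluates s(v) under the radioactive decay model mentioned in the research history. This is an illustrative assumption: the curve on the slide need not be exponential, and treating v as age measured in bytes allocated, with a `half_life` parameter, is a modelling choice made here.

```python
import math


def survival(v, half_life):
    """Fraction of objects still live after v bytes of further allocation."""
    return 0.5 ** (v / half_life)


def conditional_survival(age, v, half_life):
    """P(survive v more bytes | already survived `age` bytes)."""
    return survival(age + v, half_life) / survival(age, half_life)


if __name__ == "__main__":
    h = 1024.0
    for age in (0.0, 512.0, 4096.0):
        # Under pure decay the result does not depend on age.
        print(age, conditional_survival(age, 256.0, h))
```

Under this model an object's age says nothing about its remaining lifetime, so any benefit from collecting a particular age range has to come from how real lifetime distributions depart from pure decay.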

Older-first Algorithm

Next Steps

• Beltway
  – [BJMM PLDI’02]
  – Increments
  – Belts
  – Combines generational and older-first
• Ulterior Reference Counting
  – [BM OOPSLA’03]
  – Reference count on a per-object basis
  – Responsiveness and throughput
• MMTk: [BCM SIGMETRICS’04 ICSE’04]
  – Toolkit for building & understanding GC
  – Motivated today’s work

Garbage Collection is the Answer to All Your Problems

• Improves data and code locality
  – [Huang et al. OOPSLA’02 ISMM’04, VEE’04]
• Cooperative GC optimizations
  – Colocation [Guyer OOPSLA’05]
  – Free-me [Guyer et al. PLDI’06]
• Finds leaks
  – [Bond ASPLOS’06, Jump POPL’07]
• Tolerates leaks
  – [Bond OOPSLA’08]
• Helps with dynamic software updating!
  – [Subramaniam, Hicks ??’08]
• DaCapo Benchmarks
  – [Blackburn et al. OOPSLA’06 CACM’08]

380C

• Where are we & where we are going
  – Why you need to care about workloads
  – Managed languages
    • Dynamic compilation
    • Inlining
    • Garbage collection
      – Opportunity to improve data locality on-the-fly
      – Read: X. Huang, S. M. Blackburn, K. S. McKinley, J. E. B. Moss, Z. Wang, and P. Cheng, The Garbage Collection Advantage: Improving Program Locality, ACM Conference on Object Oriented Programming, Systems, Languages, and Applications (OOPSLA), pp. 69-80, Vancouver, Canada, October 2004.
  – Alias analysis
  – Dependence analysis
  – Loop transformations
  – EDGE architectures