An On-the-Fly Reference Counting Garbage Collector for Java Erez Petrank Technion – Israel Institute of Technology Joint work with Yossi Levanoni – Microsoft Corporation ACM Conference on Object Oriented Programming Systems Languages & Applications Tampa, Florida


An On-the-Fly Reference Counting Garbage Collector for Java

Erez Petrank
Technion – Israel Institute of Technology

Joint work with Yossi Levanoni – Microsoft Corporation

ACM Conference on Object Oriented Programming Systems Languages & Applications
Tampa, Florida
October 18, 2001


Garbage Collection Today

• Two classic approaches:
  – Tracing [McCarthy 1960]: trace reachable objects, reclaim objects not traced.
  – Reference counting [Collins 1960]: keep a reference count for each object, reclaim objects with count 0.
• Today’s advanced environments:
  – multiprocessors
  – huge memories


Motivation for RC

• Reference counting work is proportional to the work spent on object creations and pointer modifications.
  – Can tracing deal with tomorrow’s huge heaps?
• Reference counting has good locality.
• Tracing dominates today’s JVMs; is that justified?
• The challenge:
  – RC write barriers seem too expensive.
  – RC seems impossible to “parallelize”.


This work

• An improved RC (suitable for Java)
  – Reduced overhead on the write barrier.
  – Concurrent with low overhead: on-the-fly, no synchronization operation in the write barrier, runs on a multiprocessor.
  – Thus: low latency, high performance.
• Implementation:
  – JVM: Sun’s Java Virtual Machine 1.2.2
  – Platform: 4-way IBM Netfinity 8500R server with 550MHz Intel Pentium III Xeon processors and 2GB memory.


Agenda

• Introduction
• Motivation
• The Algorithm
• Related issues
• Implementation and Measurements
• Conclusions


Terminology

• Stop-the-World
• Parallel
• Concurrent
• On-the-Fly

[Figure: thread timelines for each collector type; legend: program, GC]


Basic Reference Counting

• Each object has an RC field; new objects get o.RC := 1.
• When a pointer p that points to o1 is modified to point to o2, we do: o1.RC--, o2.RC++.
• If then o1.RC == 0:
  – Delete o1.
  – Decrement o.RC for every son o of o1.
  – Recursively delete objects whose RC is decremented to 0.
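The update and recursive-delete rules above can be sketched as follows. This is a minimal single-threaded illustration, not the collector's code; the `Obj` class, the `freed` list, and the helper names `decrement` and `update` are assumptions made for the example.

```python
class Obj:
    def __init__(self, name):
        self.name = name
        self.rc = 1          # new objects get o.RC := 1 (the creator's reference)
        self.fields = {}     # field name -> referenced Obj (or None)

freed = []                   # names of reclaimed objects, in reclamation order

def decrement(o):
    """Drop one reference to o; delete it (recursively) when the count hits 0."""
    o.rc -= 1
    if o.rc == 0:
        freed.append(o.name)             # "Delete o1"
        for child in o.fields.values():  # decrement RC for all sons of o1
            if child is not None:
                decrement(child)

def update(holder, field, new):
    """holder.field, which points to o1, is modified to point to o2."""
    old = holder.fields.get(field)
    if new is not None:
        new.rc += 1          # o2.RC++ first, so re-storing the same object cannot free it
    holder.fields[field] = new
    if old is not None:
        decrement(old)       # o1.RC--

# Build a -> b -> c, then drop the creators' references to b and c.
a, b, c = Obj("a"), Obj("b"), Obj("c")
update(a, "next", b)
update(b, "next", c)
decrement(b)                 # creator's reference to b goes away: b.rc == 1
decrement(c)                 # likewise for c: c.rc == 1
update(a, "next", None)      # cutting a.next frees b, and then c recursively
```

Cutting the single pointer `a.next` reclaims both `b` and `c`, which is the recursive deletion described in the last bullet.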



Deferred Reference Counting

• Problem: overhead on updating program variables (locals) costs too much.

• Solution [Deutsch & Bobrow]:
  – Don’t update RC for locals.
  – “Once in a while”: collect all objects with o.RC = 0 that are not referenced from local roots.
• Deferred RC reduces overhead by 80%. Used in most modern RC systems.
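A minimal sketch of the deferred scheme, under stated assumptions: counts track heap references only, objects whose count is 0 sit in a "zero count table" (the name `zct`, like `heap_write` and `collect`, is illustrative and not taken from the slides), and the occasional collection is given the set of objects referenced from locals.

```python
class Obj:
    def __init__(self, name):
        self.name = name
        self.rc = 0              # counts heap references only; locals are ignored
        self.fields = {}

zct = set()                      # "zero count table": objects whose rc is 0

def new(name):
    o = Obj(name)
    zct.add(o)                   # so far referenced only from a local
    return o

def heap_write(holder, field, new_ref):
    """Only stores into heap objects pay for count updates."""
    old = holder.fields.get(field)
    if new_ref is not None:
        new_ref.rc += 1
        zct.discard(new_ref)
    holder.fields[field] = new_ref
    if old is not None:
        old.rc -= 1
        if old.rc == 0:
            zct.add(old)

def collect(local_roots):
    """'Once in a while': reclaim rc-0 objects not referenced from locals."""
    freed = []
    pending = [o for o in zct if o not in local_roots]
    while pending:
        o = pending.pop()
        zct.discard(o)
        freed.append(o.name)
        for child in o.fields.values():
            if child is not None:
                child.rc -= 1
                if child.rc == 0:
                    zct.add(child)
                    if child not in local_roots:
                        pending.append(child)
    return freed

x, y, tmp = new("x"), new("y"), new("tmp")   # local assignments cost nothing
heap_write(x, "f", y)                        # the only counted store
first = collect(local_roots=[x])             # tmp is dead; x is still a local
second = collect(local_roots=[])             # x is dropped, which also frees y
```

Assignments to locals never touch a count here; only `heap_write` does, which is where the saving comes from.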


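Multithreaded RC?

• Problem:
  – Parallel updates confuse counts:
      Thread 1: Read A.next; A.next ← C; B.RC--; C.RC++
      Thread 2: Read A.next; A.next ← D; B.RC--; D.RC++
    Both threads read B as the old value, so B.RC is decremented twice although only one reference to B was removed, and both C.RC and D.RC are incremented although A.next ends up pointing to only one of them.
  – (And more: the reference count updates themselves race when done in parallel.)

[Figure: object A with objects B, C and D]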
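The interleaving above can be replayed deterministically. The sketch below is not real multithreading; it executes the two threads' steps by hand in one problematic order and compares the resulting counts with the references that actually exist. The dictionaries `rc`, `heap`, and `actual` are illustrative, and the counts cover only references through `A.next`.

```python
rc = {"B": 1, "C": 0, "D": 0}    # A.next initially points to B
heap = {"A.next": "B"}

# Both threads read A.next before either one writes it.
old1 = heap["A.next"]            # Thread 1: Read A.next (sees B)
old2 = heap["A.next"]            # Thread 2: Read A.next (sees B)

heap["A.next"] = "C"             # Thread 1: A.next <- C
heap["A.next"] = "D"             # Thread 2: A.next <- D (overwrites C)

rc[old1] -= 1                    # Thread 1: B.RC--
rc["C"] += 1                     # Thread 1: C.RC++
rc[old2] -= 1                    # Thread 2: B.RC--
rc["D"] += 1                     # Thread 2: D.RC++

# Ground truth: count the references that actually exist in the heap.
actual = {name: 0 for name in rc}
actual[heap["A.next"]] += 1
```

After the replay, B's count is negative and C holds a count for a reference that no longer exists, so C would never be reclaimed.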


Multithreaded RC

• Problem:
  – Parallel updates confuse counts.
  – Updating ref counts in parallel races.
• [DeTreville]:
  – Lock the heap for each pointer modification.
  – Each thread records its updates in a buffer.
  – Once in a while (snapshot-like):
    • The GC thread reads all buffers to update ref counts.
    • It reclaims all objects with rc 0 that are not referenced from locals.


To Summarize…

• Overhead on the write barrier is considered high.
  – Even with the deferred RC of Deutsch & Bobrow.
• Using reference counting concurrently with program threads seems to bear a high synchronization cost.
  – Lock or “compare & swap” for each pointer update.


Improving RC

• Consider a pointer p that takes the following values between GCs: O0, O1, O2, …, On.
• All RC algorithms perform 2n operations: O0.RC--; O1.RC++; O1.RC--; O2.RC++; O2.RC--; …; On.RC++;
• But only two operations are needed: O0.RC-- and On.RC++.

[Figure: pointer p referencing O0, O1, O2, O3, O4, …, On in turn]
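A small worked check of the cancellation argument: the intermediate increments and decrements telescope, so the net change equals that of the two remaining operations. The function names below are illustrative only.

```python
def naive_updates(values):
    """Count updates a classic RC barrier performs as p steps through values."""
    rc_delta = {v: 0 for v in values}
    ops = 0
    for old, new in zip(values, values[1:]):
        rc_delta[old] -= 1
        rc_delta[new] += 1
        ops += 2
    return rc_delta, ops

def coalesced_updates(values):
    """Only the first and the last value contribute a net change."""
    rc_delta = {v: 0 for v in values}
    rc_delta[values[0]] -= 1
    rc_delta[values[-1]] += 1
    return rc_delta, 2

values = [f"O{i}" for i in range(6)]      # O0 .. O5, so n = 5
naive, naive_ops = naive_updates(values)
short, short_ops = coalesced_updates(values)
```

Both functions produce the same net deltas; the naive one needs 2n = 10 operations for n = 5, the coalesced one needs 2.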


Improving RC cont’d

• Don’t record all pointer modifications. Record only the first modification of each pointer between GCs (the one that overwrites O0).
• During the collection, for each recorded pointer p:
  – find O0 by checking the record,
  – find On by reading the heap during the collection.
• Apply only two operations for each such pointer: O0.RC-- and On.RC++.

[Figure: pointer p referencing O0, O1, O2, O3, O4, …, On in turn]

This reduces the number of logging and counter updates by a factor of 100-1000 for normal benchmarks!
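To get a feel for why recording only first modifications shrinks the log, the sketch below counts log entries for a made-up workload. The workload (10,000 stores spread over 50 slots, fixed random seed) is an assumption chosen for illustration and says nothing about the measured benchmarks.

```python
import random

def count_logged(stores):
    """stores: sequence of slot ids written between two collections."""
    dirty = set()
    logged = 0
    for slot in stores:
        if slot not in dirty:     # first modification of this slot
            dirty.add(slot)
            logged += 1           # a real barrier would record (slot, old value)
    return logged

rng = random.Random(0)            # fixed seed; the workload is made up
stores = [rng.randrange(50) for _ in range(10_000)]
logged = count_logged(stores)
```

The number of log entries equals the number of distinct slots written, at most 50 here, no matter how many stores hit each slot.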


Improving Synch. Overhead

• Simple solutions bear unacceptable overhead:
  – DeTreville uses a lock for all pointer modifications.
  – Simple alternatives require 3 compare-and-swaps.
• Our second contribution:
  – A carefully designed write barrier (and an observation) allows elimination of all synchronization operations from the write barrier.


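The write barrier

    Update(Object **slot, Object *new) {
        Object *old = *slot;      // read the value about to be overwritten
        if (!IsDirty(slot)) {     // first update of this slot since the last collection?
            log(slot, old);       // record the slot and its old value
            SetDirty(slot);
        }
        *slot = new;              // perform the actual store
    }

Observation: If two threads
1. invoke the write barrier in parallel, and
2. both log an old value,
then both record the same old value.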
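The observation can be checked exhaustively on a small model. The sketch below assumes that each line of the barrier is one atomic step and that memory is sequentially consistent; both are simplifying assumptions of this example, and the step numbering, `run`, and `all_schedules` are illustrative names. It enumerates every interleaving of two barriers racing on the same slot and collects the values they log.

```python
from itertools import combinations

def run(schedule, initial="O0"):
    """Run two write barriers on one slot, interleaved as `schedule` dictates.

    schedule is a tuple of thread ids (0 or 1), one per atomic step.
    Returns the list of old values that were logged.
    """
    heap = {"slot": initial, "dirty": False}
    new_values = ["C", "D"]
    old = [None, None]          # each thread's local copy of the old value
    will_log = [False, False]   # outcome of each thread's dirty check
    pc = [0, 0]                 # per-thread program counter
    logged = []
    for t in schedule:
        step = pc[t]
        if step == 0:
            old[t] = heap["slot"]            # Object *old = *slot
        elif step == 1:
            will_log[t] = not heap["dirty"]  # if (!IsDirty(slot))
        elif step == 2 and will_log[t]:
            logged.append(old[t])            # log(slot, old)
        elif step == 3 and will_log[t]:
            heap["dirty"] = True             # SetDirty(slot)
        elif step == 4:
            heap["slot"] = new_values[t]     # *slot = new
        pc[t] += 1
    return logged

def all_schedules(steps_per_thread=5):
    """Yield every way to interleave two threads of 5 steps each."""
    total = 2 * steps_per_thread
    for positions in combinations(range(total), steps_per_thread):
        chosen = set(positions)
        yield tuple(0 if i in chosen else 1 for i in range(total))

results = [run(s) for s in all_schedules()]
```

In this model a thread logs only if it saw the dirty flag clear, which means no store to the slot had happened yet when it read `old`; hence every logged value is the original `O0`, whether one or both threads log.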


Intermediate Algorithm: Snapshot Oriented, Concurrent

• Use the write barrier with program threads.
• To collect:
  – Stop all threads.
  – Scan roots (locals).
  – Get the buffers with modified slots.
  – Clear all dirty bits.
  – Resume threads.
  – For each modified slot:
    • decrease rc for the old value (written in the buffer),
    • increase rc for the current value (“read heap”).
  – Reclaim non-local objects with rc 0.
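The collection steps above can be sketched as a single-threaded simulation. Stopping and resuming threads appear only as comments, and the sketch assumes no program thread writes to the heap while counts are being updated, a case the real collector must handle and this example does not. The `Heap` class, its `zero_count` list, and the convention that counts cover heap references only are assumptions of the example.

```python
class Obj:
    def __init__(self, name, n_slots=1):
        self.name = name
        self.rc = 0                       # heap references only (locals not counted)
        self.slots = [None] * n_slots
        self.dirty = [False] * n_slots    # one dirty bit per slot

class Heap:
    def __init__(self):
        self.objects = set()
        self.buffer = []                  # log entries: (object, slot index, old value)
        self.zero_count = []              # rc-0 candidates for the next collection

    def new(self, name, n_slots=1):
        obj = Obj(name, n_slots)
        self.objects.add(obj)
        self.zero_count.append(obj)       # starts with no heap references
        return obj

    def update(self, obj, i, new):
        """The write barrier: log the old value on the first change only."""
        old = obj.slots[i]
        if not obj.dirty[i]:
            self.buffer.append((obj, i, old))
            obj.dirty[i] = True
        obj.slots[i] = new

    def collect(self, roots):
        # While threads are "stopped": take roots and buffers, clear dirty bits.
        local = set(roots)
        log, self.buffer = self.buffer, []
        for obj, i, _ in log:
            obj.dirty[i] = False
        candidates, self.zero_count = self.zero_count, []
        # After "resuming": two count updates per modified slot.
        for obj, i, old in log:
            if old is not None:
                old.rc -= 1               # old value, taken from the buffer
                candidates.append(old)
            cur = obj.slots[i]            # current value, read from the heap
            if cur is not None:
                cur.rc += 1
        # Reclaim non-local objects with rc 0, recursively.
        freed = []
        while candidates:
            o = candidates.pop()
            if o.rc != 0 or o not in self.objects:
                continue
            if o in local:
                self.zero_count.append(o) # still held by a local; retry later
                continue
            self.objects.remove(o)
            freed.append(o.name)
            for child in o.slots:
                if child is not None:
                    child.rc -= 1
                    candidates.append(child)
        return freed

heap = Heap()
a, b, c, d = (heap.new(n) for n in "ABCD")
heap.update(a, 0, b)
heap.update(a, 0, c)
heap.update(a, 0, d)                      # three stores, one log entry
entries = len(heap.buffer)
first = sorted(heap.collect(roots=[a]))   # B and C are no longer referenced
heap.update(a, 0, None)
second = heap.collect(roots=[a])          # D lost its only reference
third = heap.collect(roots=[])            # A is no longer a local root
```

Three stores to the same slot cost one log entry and, at collection time, one increment (for D); B and C are reclaimed without their counts ever having been touched.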


The Sliding View Algorithm: On-the-Fly

• Do all collection as threads run:
  – Read threads’ buffers (one thread at a time),
  – Clear all dirty bits,
  – Update reference counts,
  – Read roots of each thread, one at a time,
  – Reclaim (recursively) objects with rc 0.
• Note: rc’s are not correct for any specific point in time, yet, with care, most dead objects may be reclaimed!
• Borrow ideas from [Lamport et al.]

[Figure: Sliding View]


Cycles Collection

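• Problem: reference counting alone cannot reclaim cyclic garbage, because objects on a cycle keep each other’s counts above 0.
• Our solution: use a tracing algorithm infrequently.
• Currently this is the most efficient solution. Cycle collectors have high cost.
• We propose a new on-the-fly mark & sweep algorithm that works best with the same sliding view. It can also be used “on its own”.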
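A tiny illustration of why a backup tracing pass is needed. The sketch uses a plain stop-the-world mark and sweep, not the on-the-fly sliding-view tracer mentioned in the slide; `Obj`, `link`, and `mark` are illustrative names.

```python
class Obj:
    def __init__(self, name):
        self.name = name
        self.rc = 0
        self.next = None

def link(src, dst):
    src.next = dst
    dst.rc += 1

def mark(roots):
    """Trace from the roots and return the set of reachable objects."""
    reachable, stack = set(), list(roots)
    while stack:
        o = stack.pop()
        if o is not None and o not in reachable:
            reachable.add(o)
            stack.append(o.next)
    return reachable

x, y, live = Obj("x"), Obj("y"), Obj("live")
link(x, y)
link(y, x)                       # x <-> y: a cycle with no outside references
heap = [x, y, live]
roots = [live]                   # only `live` is referenced from a local

# What reference counting would reclaim: rc 0 and not a local root.
rc_garbage = [o.name for o in heap if o.rc == 0 and o not in roots]
# What tracing reclaims: everything not reachable from the roots.
reachable = mark(roots)
swept = [o.name for o in heap if o not in reachable]
```

Reference counting finds nothing to free because `x` and `y` each have a count of 1, while the trace identifies both as unreachable.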


Implementation for Java

• Based on Sun’s JDK 1.2.2 for Windows NT
• Main features:
  – 2-bit RC field per object (à la [Wise et al.])
  – A supplemental sliding view tracing algorithm
  – A custom allocator for on-the-fly RC:
    • Multi-level fine-grained locking
    • Supports sporadic reclamation of objects
    • Supports sweeping the heap
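The slides do not say how the 2-bit field behaves on overflow. One common treatment of limited-width counts, assumed here and not confirmed by the slides, is a saturating ("sticky") count: once the field reaches its maximum it stays there, and only a tracing pass can later decide whether the object is dead. The names `STUCK`, `inc`, and `dec` are illustrative.

```python
STUCK = 3                        # largest value a 2-bit field can hold

def inc(rc):
    return min(rc + 1, STUCK)    # saturate instead of overflowing

def dec(rc):
    # A stuck count is no longer exact, so it is never decremented.
    return rc if rc == STUCK else rc - 1

rc = 0
for _ in range(5):               # five references are created...
    rc = inc(rc)
for _ in range(5):               # ...and all five are removed again
    rc = dec(rc)
```

Under this assumption the object above ends with a stuck count even though no references remain, which is one reason to pair a small count field with the supplemental tracing algorithm listed in the slide.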


Performance Measurements

• First multiprocessor measurements in a “normal” environment!
  – (Previously reported measurements assumed one CPU is free for GC all the time.)
• Benchmarks:
  – Server benchmarks
    • SPECjbb2000: simulates business-like transactions in a large firm
    • MTRT: a multi-threaded ray tracer
  – Client benchmarks
    • SPECjvm98: a suite of mostly single-threaded client benchmarks


Improved RC

• How many RC updates are eliminated?

Benchmark    No. of stores   No. of “first” stores   Ratio of “first” stores
jbb             71,011,357                 264,115   1/269
Compress            64,905                      51   1/1273
Db              33,124,780                  30,696   1/1079
Jack           135,174,775                   1,546   1/87435
Javac           22,042,028                 535,296   1/41
Jess            26,258,107                  27,333   1/961
mpegaudio        5,517,795                      51   1/108192
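The ratio column follows directly from the two count columns. As a quick arithmetic check, the snippet below recomputes it from the figures in the table, dividing total stores by “first” stores and rounding to the nearest integer (the rounding rule is an assumption; it reproduces every entry).

```python
# (total stores, "first" stores) per benchmark, copied from the table.
stores = {
    "jbb":       (71_011_357, 264_115),
    "Compress":  (64_905, 51),
    "Db":        (33_124_780, 30_696),
    "Jack":      (135_174_775, 1_546),
    "Javac":     (22_042_028, 535_296),
    "Jess":      (26_258_107, 27_333),
    "mpegaudio": (5_517_795, 51),
}

# One "first" store per this many stores.
ratios = {name: round(total / first) for name, (total, first) in stores.items()}
```

The results range from one logged store in 41 (Javac) to one in 108,192 (mpegaudio).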


SPECjbb Latency (Max Transaction Time)

[Chart: SPECjbb max. response time in milliseconds vs. number of threads (600MB)]

# Threads    1      2      4      6      8      10     15     20
RC           16     16     47     78     110    146    245    329
Original     7433   8037   8463   6923   7857   7536   6593   5997


SPECjbb Throughput

[Chart: SPECjbb performance vs. number of threads (600MB); change in throughput (%)]

# Threads    1      2      4       6       8       10      15      20
RC           0.4%   4.0%   -5.4%   -2.0%   -1.0%   -2.2%   -0.3%   2.4%


MTRT Throughput

[Chart: MTRT improvement in execution time (%) vs. number of threads]

# Threads    1      2      3      4      8       12     16
RC           4.9%   5.0%   7.2%   5.6%   11.4%   0.2%   -0.1%


SPECjbb Heap Utilization

[Chart: SPECjbb heap usage, MB allocated (not free) vs. number of threads]

# Threads    1     2     4     6     8     10    15    20
RC           27    44    77    108   170   171   251   329
Original     26    42    74    104   135   166   243   320


Client Performance

[Chart: SPECjvm98 total execution time, % slower by GC version]

RC: 3.6% slower execution time


Related Work

• On-the-fly tracing:
  – Dijkstra et al. (1976), Steele (1976), Lamport (1976),
  – Kung & Song (1977), Gries (1977), Ben-Ari (1982, 1984), Huelsbergen et al. (1993, 1998),
  – Doligez-Gonthier-Leroy (1993-4), Domani-Kolodner-Petrank (2000)
• Concurrent reference counting:
  – DeTreville (1990),
  – Martinez et al. (1990), Lins (1992),
  – Plakal & Fischer (2001),
  – Bacon et al. (2001)


Conclusions

• A new algorithm for reference counting.
  – Low overhead on pointer modification
  – On-the-fly
• Implementation for Java
• Measurements show high throughput and low latency.
• To be out soon: a matching paper on the sliding view tracing collector.