An Efficient On-the-Fly Cycle Collection

Harel Paz, Erez Petrank - Technion, Israel

David F. Bacon, V. T. Rajan - IBM T.J. Watson Research Center

Elliot K. Kolodner - IBM Haifa Research Lab

Garbage Collection

Manual de-allocation may cause notorious bugs (memory leaks, dangling pointers).

Garbage collection (GC): automatic recycling of dynamically allocated memory.

Garbage: objects that are not live, but are not free either.

Reference Counting

Each object has an rc field. New objects get o.rc := 1.

When a pointer p that points to o1 is modified to point to o2, we do: o1.rc--, o2.rc++.

If o1.rc == 0: decrement rc for all children of o1, recursively delete objects whose rc is decremented to 0, and delete o1.

[Figure: pointer p redirected from o1 to o2.]
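The rules on this slide can be sketched as follows. This is a minimal, single-threaded illustration, not the authors' implementation: the names `Obj`, `write` and `release` are hypothetical, and the initial count of 1 is assumed to stand for the local reference held by the allocating code.

```python
class Obj:
    """A heap object with a reference count and named pointer slots."""

    def __init__(self, name):
        self.name = name
        self.rc = 1        # new objects get rc = 1 (assumed: the allocating local)
        self.slots = {}    # field name -> Obj or None
        self.freed = False


def release(o):
    """Decrement o.rc; at zero, release o's children and then delete o."""
    o.rc -= 1
    if o.rc == 0:
        for child in o.slots.values():
            if child is not None:
                release(child)      # may recursively delete the child
        o.slots.clear()
        o.freed = True              # stands in for returning memory to the allocator


def write(holder, field, new):
    """Pointer update holder.field := new with eager rc maintenance."""
    old = holder.slots.get(field)
    if new is not None:
        new.rc += 1                 # o2.rc++ (done first, so old is new stays safe)
    holder.slots[field] = new
    if old is not None:
        release(old)                # o1.rc--, deleting o1 if it drops to 0


root, o1, o2 = Obj("root"), Obj("o1"), Obj("o2")
write(root, "p", o1)    # p points to o1
release(o1)             # drop the allocating local reference to o1
write(root, "p", o2)    # p now points to o2; o1 is no longer referenced
```

After the last call, `o1.freed` is true because its only remaining reference was p.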

Three Main Drawbacks of RC

High overhead and costly parallelism: drastic improvement by Levanoni-Petrank (2001).

Inability to reclaim cycles: this work.

Cyclic Structures Reclamation Problem

A garbage cycle is a strongly connected component of the object graph that is unreachable from the program roots.

[Figure: a garbage cycle of objects a, b and c, with their reference counts, and a pointer p.]
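To see why plain reference counting cannot reclaim such a cycle, consider the small sketch below. The graph, the counts and the function `drop_root_edge` are hypothetical and are not taken from the slide's figure.

```python
def drop_root_edge(rc, edges, target):
    """Delete one root pointer to `target` using plain RC; return the freed set."""
    freed = set()
    stack = [target]
    while stack:
        n = stack.pop()
        rc[n] -= 1
        if rc[n] == 0:              # only objects that reach zero are deleted
            freed.add(n)
            stack.extend(edges[n])  # and their children are decremented in turn
    return freed


edges = {"a": ["b"], "b": ["c"], "c": ["a"]}   # cycle a -> b -> c -> a
rc = {"a": 2, "b": 1, "c": 1}                  # a is also referenced by a root pointer
freed = drop_root_edge(rc, edges, "a")
# freed is empty: a's count only drops to 1, so nothing is reclaimed,
# although a, b and c are now unreachable from the roots.
```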

Collecting Garbage Cycles in Reference Counting Systems

Reference counting collectors employ one of 2 avenues to collect garbage cycles:

A backup tracing collector.

A cycle collector.

This work proposes a new concurrent cycle collector.

Contributions:

More efficient than the previous concurrent cycle collector.

Solves the termination problem.

First throughput comparison between cycle collection and a backup tracing collector.

Cycle Collection Basic Idea - 1

Observation 1: Garbage cycles can only be created when an rc is decremented to a non-zero value.

Objects whose rc is decremented to a non-zero value become candidates.

Cycle Collection Basic Idea - 2

Observation 2: In a garbage cycle, all the reference counts are due to internal pointers of the cycle.

For each candidate’s sub-graph, check whether external pointers point to this sub-graph.

Terms:

Sub-graph of O: the graph of objects reachable from O.

External pointer (to a sub-graph): a pointer from a non-sub-graph object to a sub-graph object.

Internal pointer (of a sub-graph): a pointer between 2 sub-graph objects.

[Figure: example sub-graph with objects o, o1, o2, o3, o4, o5 and a.]

Goal: Compute Counts for External Pointers Only

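[Figure: a graph of five objects (r, a, b, c, d) shown in three states, with the labels "Not a garbage cycle" and "a garbage cycle".
Initial rc's: r=1, a=2, b=1, c=2, d=2.
After edge r->a is deleted: r=1, a=1, b=1, c=2, d=2.
rc's when ignoring internal edges: r=1, a=0, b=0, c=0, d=1.]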
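The goal stated in this slide's title can be sketched directly: for the sub-graph of a candidate, subtract one from an object's rc for every pointer that originates inside the sub-graph. The graph below is a hypothetical example, not a reconstruction of the figure, and `external_counts` is an illustrative name.

```python
def external_counts(rc, edges, start):
    """Return {node: rc minus the pointers coming from inside start's sub-graph}."""
    # Collect the sub-graph of `start`: every object reachable from it.
    sub, stack = set(), [start]
    while stack:
        n = stack.pop()
        if n not in sub:
            sub.add(n)
            stack.extend(edges.get(n, []))
    # Start from the full counts and cancel every internal pointer.
    ext = {n: rc[n] for n in sub}
    for n in sub:
        for m in edges.get(n, []):
            ext[m] -= 1             # m is in the sub-graph by construction
    return ext


# Hypothetical heap: cycle a -> b -> c -> a, c -> d, and one root pointer to d.
edges = {"a": ["b"], "b": ["c"], "c": ["a", "d"], "d": []}
rc = {"a": 1, "b": 1, "c": 1, "d": 2}
ext = external_counts(rc, edges, "a")
```

Here only d keeps a positive count. Since d does not point back to a, b or c, those three form a garbage cycle, while d stays alive.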

Implementing the Cycle Collector Idea

[Figure: the example graph (r, a, b, c, d) with its rc's updated during the traversals.]

Each object is colored black, gray or white. Whenever the rc of an object ‘a’ is decremented to a non-zero value, perform 3 local traversals over the graph of objects reachable from ‘a’:

Mark: updates rc’s to reflect only pointers that are external to the graph of ‘a’, marking nodes gray.

Scan: restores the rc’s of the externally reachable objects, coloring them black. The rest of the nodes are marked as garbage (white).

Collect: collects the garbage objects (the white ones).
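The three traversals can be sketched as below. This is a simplified, non-concurrent version in the style of the synchronous mark-scan algorithms listed under Related Work. The class `Node` and all function names are assumptions made for illustration, and recursion is used for brevity.

```python
BLACK, GRAY, WHITE = "black", "gray", "white"


class Node:
    def __init__(self, name, rc=0):
        self.name, self.rc, self.color = name, rc, BLACK
        self.children = []


def mark_gray(n):
    """Mark: subtract the effect of internal pointers, coloring nodes gray."""
    if n.color != GRAY:
        n.color = GRAY
        for c in n.children:
            c.rc -= 1
            mark_gray(c)


def scan_black(n):
    """Restore the counts of everything reachable from a live node."""
    n.color = BLACK
    for c in n.children:
        c.rc += 1
        if c.color != BLACK:
            scan_black(c)


def scan(n):
    """Scan: a gray node with rc > 0 is externally referenced, otherwise white."""
    if n.color == GRAY:
        if n.rc > 0:
            scan_black(n)
        else:
            n.color = WHITE
            for c in n.children:
                scan(c)


def collect_white(n, freed):
    """Collect: gather the white (garbage) nodes."""
    if n.color == WHITE:
        n.color = BLACK             # avoid visiting the node twice
        for c in n.children:
            collect_white(c, freed)
        freed.append(n.name)


def collect_cycles(candidate):
    freed = []
    mark_gray(candidate)
    scan(candidate)
    collect_white(candidate, freed)
    return freed


def build(rc_a):
    """Hypothetical heap: cycle a -> b -> c -> a, c -> d, one root pointer to d."""
    a, b, c, d = Node("a", rc_a), Node("b", 1), Node("c", 1), Node("d", 2)
    a.children, b.children, c.children = [b], [c], [a, d]
    return a, b, c, d
```

With `build(1)` (the root pointer to a was deleted), `collect_cycles(a)` frees a, b and c and leaves d with rc 1. With `build(2)` (a root still points to a), nothing is freed and all counts are restored.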

Concurrent Cycle Collection

A concurrent collector: a collector that runs concurrently with the program threads.

Concurrent cycle collection is more complex: the object graph may be modified while the collector scans it.

The collector cannot rely on repeated traversals of a graph to read the same set of nodes and edges.

Using the algorithm above may produce incorrect results.

[Figure: timeline of the program running concurrently with the GC.]

Safety Problem - Example

A mutator deletes the edge c->d, between the MarkGray and Scan procedures.

[Figure: objects a, b, c, d and e with their rc's as the traversals proceed and the edge c->d is deleted.]

The Scan phase incorrectly infers live objects (a & b) to be garbage.

Confronting Drawbacks in Previous Work

The previous concurrent cycle collector, by Bacon & Rajan, added overhead to achieve safety in light of an inconsistent view of the heap.

The overhead reduces efficiency.

Completeness could not be achieved.

Our solution: use a fixed view of the heap! Multiple heap traces consider the same graph each time.

Getting a Snapshot of the Heap

A snapshot of the heap can be taken by a concurrent collector.

Levanoni-Petrank’s snapshot uses a copy-on-first-write mechanism. For each pointer modified for the first time after a collection:

Save its snapshot value in a buffer.

Mark the pointer “dirty” (it does not need to be logged again).

The cycle collector traverses each object according to the pointers’ values as they existed at snapshot time.
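A copy-on-first-write barrier of this kind can be sketched as follows. The sketch follows the slide's per-pointer description, is single-threaded, and ignores the races a real on-the-fly collector must handle. The class `Heap` and its methods are hypothetical names, and membership in the log plays the role of the dirty mark.

```python
class Heap:
    def __init__(self):
        self.slots = {}   # (object, field) -> current target, or None
        self.log = {}     # (object, field) -> value the slot had at snapshot time

    def begin_snapshot(self):
        """Start a new collection cycle: every slot becomes clean again."""
        self.log.clear()

    def write(self, obj, field, new):
        """Write barrier: log the old value on the first write only."""
        key = (obj, field)
        if key not in self.log:                  # slot is clean: first write
            self.log[key] = self.slots.get(key)  # save its snapshot value
        self.slots[key] = new                    # later writes are not logged

    def snapshot_read(self, obj, field):
        """What the collector sees: the value as it existed at snapshot time."""
        key = (obj, field)
        return self.log[key] if key in self.log else self.slots.get(key)


heap = Heap()
heap.slots[("a", "next")] = "b"
heap.begin_snapshot()
heap.write("a", "next", "c")   # first write: "b" is logged
heap.write("a", "next", "d")   # second write: nothing is logged
```

The program now sees `d` in the slot, while `heap.snapshot_read("a", "next")` still returns `b`, so repeated traversals by the collector observe the same graph.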

Cycle Collection on Heap’s Snapshot

The standard (non-concurrent) cycle collection correctly identifies garbage cycles on a snapshot. It is not disturbed by mutator activity.

All garbage cycles are collected: once a garbage cycle is created, it must exist in the next snapshot.

Only garbage cycles are collected: a cycle that is unreachable in the snapshot is indeed a garbage cycle.

Levanoni-Petrank’s RC

Consider a pointer p that takes the following values between GC’s: O0, O1, O2, …, On.

Traditional RC algorithms perform 2n operations: O0.rc--; O1.rc++; O1.rc--; O2.rc++; O2.rc--; …; On.rc++.

[Figure: pointer p pointing in turn to O0, O1, O2, O3, O4, …, On.]

But only 2 operations are needed: O0.rc--, On.rc++.
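The saving can be checked with a small sketch that lists both sequences of rc operations and compares their net effect. The function names are illustrative only.

```python
def eager_updates(values):
    """rc operations performed when every assignment to p is counted."""
    ops = []
    for old, new in zip(values, values[1:]):
        ops.append((old, -1))   # old.rc--
        ops.append((new, +1))   # new.rc++
    return ops


def coalesced_updates(values):
    """Only the first and last values of p between two collections matter."""
    return [(values[0], -1), (values[-1], +1)]


def net_effect(ops):
    """Sum the deltas per object and drop objects whose count is unchanged."""
    net = {}
    for obj, delta in ops:
        net[obj] = net.get(obj, 0) + delta
    return {obj: d for obj, d in net.items() if d != 0}


values = ["O0", "O1", "O2", "O3"]   # n = 3 assignments to p
```

For these values the eager sequence has 2n = 6 operations and the coalesced one has 2, yet both leave every rc in the same state, because the increments and decrements of the intermediate objects cancel.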

Fewer Cycle Candidates

Previous algorithms: an object whose rc is decremented to a non-zero value is considered a candidate.

The Levanoni-Petrank write barrier does not log most of the decrements. Does it “miss” cycles?

The new cycle collection algorithm collects all cycles, although it performs less work.

[Figure: pointer p pointing in turn to O0, O1, O2, O3, O4, …, On.]

More in the Paper

Reducing pauses further by stopping each thread separately (instead of all together). This requires care with new races…

More techniques to reduce the number of traced objects.

[Figure: program and GC timelines for a concurrent collector and for an on-the-fly collector.]

Measurements

Implemented in Jikes with two collectors:

The sliding-views reference-counting collector.

The age-oriented collector: uses mark and sweep for the young generation and reference counting for the old generation.

Measurements:

Throughput comparison between cycle collection and a backup tracing collector.

Characteristic comparison to the previous on-the-fly cycle collector.

Work Reduction

[Figure: work ratio compared to Bacon & Rajan. Y-axis: work ratio (0 to 1.4). Series: candidates handled, objects traced.]

Work Reduction with the Age-Oriented Collector

[Figure: work ratio between RC and age-oriented. Y-axis: work ratio (0 to 1.4). Series: candidates handled, objects traced.]

Throughput Comparison of Cycle-Collection with Backup Tracing

[Figure: SPECjbb2000 with 4-8 warehouses, Reference Counting. X-axis: heap size (256 to 704). Y-axis: throughput ratio, cycle collection / backup tracing (0.85 to 1.15). One series per warehouse count: 4, 5, 6, 7 and 8 warehouses.]

Throughput Comparison of Cycle-Collection with Backup Tracing

[Figure: SPECjbb2000 with 4-8 warehouses, Age Oriented. X-axis: heap size (256 to 704). Y-axis: throughput ratio, cycle collection / backup tracing (0.85 to 1.15). One series per warehouse count: 4, 5, 6, 7 and 8 warehouses.]

Related Work

Cycle collection:

Cyclic reference counting with local mark-scan. Martinez, Wachenchauzer and Lins [1990].

Cyclic reference counting with lazy mark-scan. Lins [1992].

Concurrent cycle collection in reference counted systems. Bacon and Rajan [2001].

Other:

An on-the-fly reference counting garbage collector for Java. Levanoni and Petrank [2001].

Age-Oriented Concurrent Garbage Collection. Paz, Petrank, and Blackburn [2005].

Conclusions

Cycle collection may be efficiently executed on-the-fly by using Levanoni-Petrank’s RC with the efficient standard cycle collector.

Today’s benchmarks:

Reference counting for the full heap: prefer backup tracing.

Reference counting for the old generation: slight preference for cycle collection.

Looking ahead: with large heaps, cycle collection may outperform backup tracing.