49
Precise and Efficient Garbage Collection in VMKit with MMTk LLVM Developer’s Meeting Nicolas Geoffray [email protected]

Precise and Efficient Garbage Collection in VMKit with …llvm.org/devmtg/2009-10/Geoffray_GarbageCollectionVMKit.pdfPrecise Garbage Collection • Write code that locates pointers

Embed Size (px)

Citation preview

Precise and Efficient Garbage Collection in VMKit with MMTk

LLVM Developer’s MeetingNicolas Geoffray

[email protected]

Background

• VMKit: Java and .Net on top of LLVM

• Uses LLVM’s JIT for executing code

• Uses Boehm for GC

• Performance bottlenecks

• No dynamic optimization

• Conservative GC

2

CPU-intensive Benchmarks (JGF)

3

0

1.25

2.50

3.75

5.00

series lufact heapsort crypt fft sor matmult

Tim

e no

rmal

ized

to

VM

Kit

VMKit Jikes Sun

MacOSX 10.58 x X86_32

2.66GHz 12Go

VM-intensive Benchmarks (Dacapo)

0

0.38

0.75

1.13

1.50

antlr fop luindex bloat jython pmd

Tim

e no

rmal

ized

to

VM

Kit

VMKit Jikes Sun

MacOSX 10.58 x X86_32

2.66GHz 12Go

4

Execution Overheads

7%7%10%

11% 65%

antlr

5%7%18%

17%

53%

bloat

9%2%8%6%

75%

fop

24%

4%12%

11%

49%

jython

5%2%8%12%

73%

luindex

4%20%

1%

20%17%

38%

pmd

Application Allocations Collections

System.arraycopy Interface calls Others5

Execution Overheads

7%7%10%

11% 65%

antlr

5%7%18%

17%

53%

bloat

9%2%8%6%

75%

fop

24%

4%12%

11%

49%

jython

5%2%8%12%

73%

luindex

4%20%

1%

20%17%

38%

pmd

Application Allocations Collections

System.arraycopy Interface calls Others6

21% 14% 20%

35% 23% 37%

Goal: replace Boehm with MMTk

• MMTk is JikesRVM’s GC

• Framework for writing GCs

• Multiple GC Implementations (Copying, Mark and trace, Immix)

• Copying collectors require precise stack scanning

• Locate pointers on the stack

7

But... it’s in Java?

• Yes, but nothing to be afraid of:

• Use of Magic tricks

• No use of runtime features (exceptions, inheritance)

• No use of standard library

• Use VMKit’s AOT compiler

• Transform MMTk into a .bc file

8

Outline

• Introduction

• Precise garbage collection

• Compiling MMTk with VMJC

• Putting it all together

• What’s left

9

• Introduction

• Precise garbage collection

• Compiling MMTk with VMJC

• Putting it all together

• What’s left

10

Precise Garbage Collection

• Write code that locates pointers in the stack

• llvm.gcroot in JIT-generated code

• llvm.gcroot in VMKit’s runtime written in C++

• Use LLVM’s GC framework to generate stack maps

• Caml stack maps for llvm-g++ generated code

• JIT stack maps for JIT-generated code

11

Precise Garbage Collection

12

VMKit.cppCaml Stackmap

VMKit.exellvm-g++

App.java

JIT Stackmap

JITted App

Stackmapsfor precise stack

scanning

Stack Scanning

• Problem: interweaving of different kinds of functions

• Application’s managed (Java or C#) functions: trusted

• VMKit’s C++ functions: trusted

• Application’s JNI functions: untrusted

• Solution: create a side-stack for frame addresses

• Updated upon entry of a kind of method

• VMKit knows the kind of each frame on the thread stack

13

Type of methods

• Trusted

• Has a stack map, so can manipulate objects (llvm.gcroot)

• Saves frame pointer (llvm::NoFramePointerElim)

• Untrusted

• Has no stack map, so should not manipulate objects

• May not save the frame pointer

14

Stack Scanning Example (1)

15

VMKit.main (C++)

Stack Scanning Example (1)

16

VMKit.main (C++)

App.main (Java)push(Enter Java)

Stack Scanning Example (1)

17

VMKit.main (C++)

App.main (Java)

App.function (Java)

push(Enter Java)

Stack Scanning Example (1)

18

VMKit.main (C++)

App.main (Java)

App.function (Java)

VMKit.runtime (C++)

push(Enter Java)

push(Enter native)

Stack Scanning Example (1)

19

VMKit.main (C++)

App.main (Java)

App.function (Java)

VMKit.runtime (C++)

App.function2 (Java)

push(Enter Java)

push(Enter native)

push(Enter Java)

Stack Scanning Example (1)

20

VMKit.main (C++)

App.main (Java)

App.function (Java)

VMKit.runtime (C++)

App.function2 (Java)

push(Enter Java)

push(Enter native)

push(Enter Java) fp

Stack Scanning Example (1)

21

VMKit.main (C++)

App.main (Java)

App.function (Java)

VMKit.runtime (C++)

App.function2 (Java)

push(Enter Java)

push(Enter native)

push(Enter Java)

fp

fp

Stack Scanning Example (1)

22

VMKit.main (C++)

App.main (Java)

App.function (Java)

VMKit.runtime (C++)

App.function2 (Java)

push(Enter Java)

push(Enter native)

push(Enter Java)

fp

fp

fp

Stack Scanning Example (1)

23

VMKit.main (C++)

App.main (Java)

App.function (Java)

VMKit.runtime (C++)

App.function2 (Java)

push(Enter Java)

push(Enter native)

push(Enter Java)

fp

fp

fp

fp

Stack Scanning Example (2)

24

VMKit.main (C++)

Stack Scanning Example (2)

25

VMKit.main (C++)

App.main (Java)

push(Enter Java)

Stack Scanning Example (2)

26

VMKit.main (C++)

App.main (Java)

App.function (JNI)

push(Enter Java)

push(Enter JNI)

Stack Scanning Example (2)

27

VMKit.main (C++)

App.main (Java)

App.function (JNI)

App.function2 (JNI)

push(Enter Java)

push(Enter JNI)

Stack Scanning Example (2)

28

VMKit.main (C++)

App.main (Java)

App.function (JNI)

App.function2 (JNI)

VMKit.jniRuntime (Java)

push(Enter Java)

push(Enter JNI)

push(Enter native)

Stack Scanning Example (2)

29

VMKit.main (C++)

App.main (Java)

App.function (JNI)

App.function2 (JNI)

VMKit.jniRuntime (Java)

push(Enter Java)

push(Enter JNI)

push(Enter native)

saved fp

Stack Scanning Example (2)

30

VMKit.main (C++)

App.main (Java)

App.function (JNI)

App.function2 (JNI)

VMKit.jniRuntime (Java)

push(Enter Java)

push(Enter JNI)

push(Enter native)

saved fp

fp

Stack Scanning Example (2)

31

VMKit.main (C++)

App.main (Java)

App.function (JNI)

App.function2 (JNI)

VMKit.jniRuntime (Java)

push(Enter Java)

push(Enter JNI)

push(Enter native)

saved fp

fp

fp

Running the GC

A precise GC scans the stacks at safe points: point during execution where the GC can know the type

of each value on the stack

32

Single-threaded Application

• GC always triggered at safe points

• gcmalloc instrunctions

• Collector::collect()

33

Multi-threaded Application

• When entering a GC, must wait for all threads to join

• Don’t use signals! or no safe point

• Use a thread-local variable to poll on method entry and backward branches

• Scan stacks of threads blocked in JNI or system calls

34

Application changes for GC

public static void runLoop(int a) {

while (a--) System.out.println(“Hello World”);

}

35

Application changes for GC

public static void runLoop(int a) {

if (getThreadID().doGC) GC()

while (a--) {

System.out.println(“Hello World”);

if (getThreadID().doGC) GC()

}

}

36

• Introduction

• Precise garbage collection

• Compiling MMTk with VMJC

• Putting it all together

• What’s left

37

What is VMJC?

• An Ahead of Time compiler (AOT)

• Generates .bc files from .class files

• Use of llvm tools to generate platform-dependant files

• shared library: llc -relocation-model=pic + gcc

• executable: llc + ld vmkit + gcc

38

Goal: compile MMTk with VMJC

• Generate a .bc file that can be linked with VMKit

• Interface MMTK → VMKit (e.g. threads synchronization, stack scanning)

• Interface VMKit → MMTk (e.g. gcmalloc)

39

Why MMTk does not need a Java runtime?

• No use of runtime features

• synchronizations, exceptions, inheritance

• No use of standard library

• HashMap, LinkedList, ArrayList

40

How MMTk is manipulating pointers?

• Definition of Magic classes and methods

• Address, Word, Offset

• Word Address.loadWord(Offset)

• Magic classes and methods translated by the compiler [VEE’09]

• Similar mechanism than Inline ASM for C

41

Example (Frampton [VEE’09])

42

@NoBoundsCheck voidprefetchObjects( ObjectReference[]buffer){ for(inti=0;i<buffer.length;i++){ ObjectReferencecurrent =buffer[i]; current.prefetch(); } }

voidprefetchObjects( OOP*buffer, intsize){ for(inti=0;i<size;i++){ OOPo=buffer[i]; asmvolatile( "prefetchnta(%0)":: "r"(o)); } }

Inline ASM in C Magic in Java

• Introduction

• Precise garbage collection

• Compiling MMTk with VMJC

• Putting it all together

• What’s left

43

Option 1: Object File

• Create a .o file of MMTk

• gcc mmtk.o vmkit.o -o vmkit

• But...

• No inlining in application code

44

Option 2: LLVM Bitcode File

• Create a .bc file of MMTk

• vmkit (-load mmtk.bc) -java HelloWorld

• Late binding of allocations in VMKit code

• gcmalloc in C++ are linked at runtime

• Inlining in Java code

• new in applications are inlined with MMTk’s malloc

45

Option 3: Everything is Bitcode

• Create a .bc file of MMTk

• Create a .bc file of VMKit

• Link, optimize and run

46

• Introduction

• Precise garbage collection

• Compiling MMTk with VMJC

• Putting it all together

• What’s left

47

What’s left

• Implementing the MMTK → VMKit interface

• Interactions between the GC and the VM

• Finish implementation with read/write barriers

• In VMKit code, in managed code

• Run benchmarks!

• Benchmark with different GCs from MMTk

48

http://vmkit.llvm.org

49