Design and Implementation of the Joeq Virtual Machine

Sun Microsystems Labs
Mountain View, CA

John Whaley
Stanford University

August 26, 2003

August 26, 2003 Design and Implementation of the Joeq Virtual Machine


About me

• Worked on Java VMs since JDK 1.0
  – 1996: Extended AWT to support pen input
  – 1997: Clean-room Java VM written in C++
  – 1998: Jalapeno: designed opt compiler, …
  – 1999: MIT Flex: dataflow framework, etc.
  – 2000: IBM Tokyo JIT: x86 performance
  – 2001: joeq virtual machine


Key Features

• Implemented in 100% Java
  – Includes native methods to manipulate addresses, memory, registers directly.
• Native vs. hosted execution
  – Native: run directly on hardware
  – Hosted: run on top of another VM
• Bootstrap to native via reflection
• Supports both GC and explicit deallocation


Key Features

• Compiler and program analysis framework
• Multiple languages: Java, C, C++, …
  – Single intermediate representation
• Static, quasi-static, and dynamic compilation
  – Single unified compiler infrastructure
• Online and offline profiling system
• M:N thread scheduler


Motivation/Purpose

• Started Ph.D. studies, needed a research infrastructure
• Purpose:
  – Try out new ideas
  – Do research
  – Publish papers
• Not out to:
  – Compete with other VMs
  – Make a shippable product
  – Change the world


Other Options

• SUIF
  – Written in C++
  – Limited support for Java
  – No dynamic compilation or runtime system
  – EDG frontend: not 100% gcc compatible
• Jalapeno
  – Written in Java
  – Very familiar with the system
  – Supports Java only
  – Not available outside of IBM


Other Options

• MIT Flex compiler
  – Written in Java
  – Familiar with system
  – Open-source GPL
  – Statically-compiled Java only
• Kaffe, etc.
  – Written in C
  – Poor design, poor performance


Why Another VM?

• General problem with established projects:
  – Established users and code base made it difficult to make major changes.
  – Wanted to fix the design "mistakes" of Jalapeno and MIT Flex compiler
  – More productive in Java than in C++


Design Goals

• Ease of trying out new research ideas
  – Implemented in Java
  – Modularity.
  – Lots of reusable code, use of software patterns.
• Support Java and C/C++
  – A single intermediate representation
  – Support GC and explicit deallocation


Design Goals

• Support static, quasi-static, dynamic compilation.
  – Unified compiler framework.
  – Compiler implemented in Java.
  – Allow "maybe" responses due to incomplete information.
  – General code patching mechanism.
  – Profile framework allows online/offline profiling.
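
One way to picture the "maybe" responses is a three-valued answer to a compile-time query. The sketch below is an illustration only, not Joeq's API: the names `TriState` and `TypeOracle` are hypothetical, and it assumes the compiler knows the superclass only of classes that have already been loaded.

```java
import java.util.HashMap;
import java.util.Map;

/** Three-valued answer for queries that may lack information at compile time. */
enum TriState { YES, NO, MAYBE }

/** Hypothetical oracle: knows the superclass only of classes already loaded. */
class TypeOracle {
    // Maps a loaded class name to its direct superclass name (null for the root).
    private final Map<String, String> superOf = new HashMap<>();

    void recordLoaded(String cls, String superCls) {
        superOf.put(cls, superCls);
    }

    /** Is 'sub' equal to or a subclass of 'sup'? MAYBE if the chain reaches an unloaded class. */
    TriState isSubclassOf(String sub, String sup) {
        String cur = sub;
        while (cur != null) {
            if (cur.equals(sup)) return TriState.YES;
            if (!superOf.containsKey(cur)) return TriState.MAYBE; // class not loaded yet
            cur = superOf.get(cur);
        }
        return TriState.NO; // walked to the root without meeting 'sup'
    }
}
```

On a MAYBE answer, a compiler built this way could fall back to a run-time check or emit code that is patched once the missing class is loaded.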


Design Goals

• Get something up and running quickly.
  – Make compiler, runtime easy to debug
  – Hijack class libraries from running VM
  – LGPL: can borrow code from other open-source projects
  – Goal: Self-bootstrapping after one month
• Make it available for others to use.
  – Documentation, etc.


Not Design Goals

• Performance leader
  – An endless pit, takes a lot of effort
  – Performance just needs to be “reasonable”
  – Should be designed for good performance if someone wanted to put in the effort
• 100% conformance to specification
  – If programs work, that’s good enough.
  – No access to good test suites, anyway.


System Overview

[Figure: block diagram of the Joeq system. Sections: FRONT-END, COMPILER, BACK-END, DYNAMIC, INTERPRETER, MEMORY MANAGER, RUN-TIME SUPPORT. Labeled boxes: Java class file, Java class file loader, Bytecode decoder, Bytecode IR, Bytecode to Quad, SUIF file, SUIF file loader, SUIF to Quad, ELF object file, ELF binary loader, Disassemble to Quad, Quad IR, Optimizations and analyses, Bytecode backend, Quad backend, Compiled code plus metadata, Executable code in memory, ELF file code section, COFF file code section, Object file data section, Controller, Profiler, Profile data file, Bytecode/Quad interpreters, Garbage collector, Memory heaps, Thread scheduler, Synchronization, Stack walker, Introspection, Verification, Type checking, System interface, Class/member metadata, External libraries.]


Consequences of 100% Java

• Implementation purity
  – Self-applicable
  – VM code is great for program analysis, makes a great test suite
• Portability
  – >95% of the code is system-independent
  – Hosted execution
• Easier software engineering
  – Exceptions, GC, software patterns, existing tools


Consequences of 100% Java

• Java is not a panacea of portability
  – Hosted execution works OK on most VMs
  – Native bootstrapping is horribly VM-dependent
    • Internal class library changes cause Joeq to break
  – Supporting multiple JDK versions is difficult


Bootstrapping technique

• Use reflection and code analysis to determine root set of methods and objects

• Dump the objects and code into an object file (COFF or ELF format)

• Use a standard linker to generate an executable

• Easy support for static and quasi-static compilation, cross-language calls, dynamic linking, etc.
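
The first step, finding the objects that must go into the image, can be sketched as a reflective walk over the object graph. This is a minimal illustration and not Joeq's bootstrapper: the names `RootSetWalker` and `Node` are hypothetical, it only collects objects (not the methods they need), and it skips any field the host VM refuses to expose through reflection.

```java
import java.lang.reflect.Array;
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.Set;

/** Example data type for the sketch. */
class Node {
    Object payload;
    Node next;
    Node(Object payload, Node next) { this.payload = payload; this.next = next; }
}

class RootSetWalker {
    /** Returns every object reachable from the roots through reference fields and array elements. */
    static Set<Object> reachable(Object... roots) {
        // Identity-based set: two equal but distinct objects are both part of the image.
        Set<Object> seen = Collections.newSetFromMap(new IdentityHashMap<Object, Boolean>());
        Deque<Object> work = new ArrayDeque<>();
        for (Object r : roots) visit(r, seen, work);
        while (!work.isEmpty()) {
            Object o = work.pop();
            Class<?> c = o.getClass();
            if (c.isArray()) {
                if (c.getComponentType().isPrimitive()) continue; // no references inside
                for (int i = 0; i < Array.getLength(o); i++) visit(Array.get(o, i), seen, work);
                continue;
            }
            // Walk the class hierarchy so inherited instance fields are included.
            for (; c != null; c = c.getSuperclass()) {
                for (Field f : c.getDeclaredFields()) {
                    if (Modifier.isStatic(f.getModifiers()) || f.getType().isPrimitive()) continue;
                    try {
                        f.setAccessible(true);
                        visit(f.get(o), seen, work);
                    } catch (RuntimeException | IllegalAccessException e) {
                        // Field hidden from reflection; a real image writer would need special handling.
                    }
                }
            }
        }
        return seen;
    }

    private static void visit(Object v, Set<Object> seen, Deque<Object> work) {
        if (v != null && seen.add(v)) work.push(v);
    }
}
```

Calling `RootSetWalker.reachable(head)` on the head of a linked list returns the list nodes and their payloads; objects not reachable from the roots are left out.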


Bootstrapping trickiness

• Custom class loaders
  – Have to hijack class loader and wrap it
• Files, etc. must be reinitialized
  – Some state stored in native code
• Objects created during image write
  – Finalizer threads, reflection caches, character encodings, …
• Reflection doesn’t work on all objects
  – Throwable backtrace, ThreadLocal, etc.


Consequences of bootstrapping technique

• Standard file formats very useful
  – Use existing tools and debuggers
• Big startup time improvement on applications (30x)
  – Skips all of the initialization code, JIT startup costs
• Large object files, number of relocations cause problems with some tools.


Consequences of bootstrapping technique

• Automatic discovery of necessary code: time-consuming, too conservative.

• Hardwired class list: smaller and faster, but breaks often.

• Problem: Instantiating an object means class is initialized, which brings in class initializer and many more objects
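
The initialization problem follows from standard Java semantics: creating an instance initializes its class, and that initializer can touch other classes in turn. A small sketch with hypothetical classes (`InitLog`, `Config`, `Service`) makes the chain visible:

```java
import java.util.ArrayList;
import java.util.List;

/** Records the order in which class initializers run. */
class InitLog {
    static final List<String> events = new ArrayList<>();
}

class Config {
    static final Object DEFAULTS;
    static {
        InitLog.events.add("Config.<clinit>");
        DEFAULTS = new Object(); // one more object that ends up in the image
    }
}

class Service {
    static { InitLog.events.add("Service.<clinit>"); }
    // Reading Config.DEFAULTS during construction forces Config to be initialized too.
    final Object defaults = Config.DEFAULTS;
}
```

A single `new Service()` runs the initializer of `Service` and then of `Config`, so a conservative discovery pass has to treat both classes, and everything their initializers allocate, as part of the image.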


Consequences of bootstrapping technique

• Bootstrapping process is a major pain
  – Time-consuming: reflection is inefficient
  – Difficult to debug
  – Process breaks with different JDK versions, environment variables, command line options, locales, etc.


Class library implementation

• GNU Classpath: too incompatible, too buggy
• Hijack Sun class library by class merging
  – Make a “mirror” class with the same name.
  – Special class loader merges the classes.
• Easy implementation of native methods.
  – Native code is just normal Java code.
• Perfect compatibility, easy updates


Consequences of mirror classes

• Types don’t match, so javac complains
  – Cast to java.lang.Object, then back down.
• Doesn’t work on different class libraries.
• Many changes between subversions.
  – Use a hierarchy of mirror classes
• Incompatible changes lead to many hacks.


Multiple language support

• Joeq has support for:
  – Java class files
  – SUIF files
    • C, C++, Fortran, …
  – x86 object code
• All are translated into a single intermediate representation, the Quad.


Quad intermediate representation

• Analyses and optimizations are instantly applicable to all languages
• Cross-language inlining and optimization
  – Elimination of JNI overhead
• Support for raw address manipulation in Java falls out naturally
• Type-accurate garbage collection for well-behaved C/C++ programs


Quad intermediate representation

• Generic interfaces for operators
  – Lots of shared code
• Types are optional
  – Type analysis will construct type information
• Doesn’t support all esoteric C/C++ features
  – Computed labels, C++ nastiness, etc.
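
To make "generic interfaces for operators" and "types are optional" concrete, here is a rough sketch of a quadruple-style instruction. It is not Joeq's actual class layout: `Operator`, `Binary`, `Memory`, `Quad`, and `QuadStats` are hypothetical names chosen for the illustration.

```java
import java.util.List;
import java.util.Optional;

/** Hypothetical operator interface: every operator answers the same queries. */
interface Operator {
    String name();
    boolean hasSideEffects();
}

enum Binary implements Operator {
    ADD, MUL;
    public boolean hasSideEffects() { return false; }
}

enum Memory implements Operator {
    LOAD, STORE;
    public boolean hasSideEffects() { return this == STORE; }
}

/** One quadruple: operator, destination, up to two sources, and an optional type. */
final class Quad {
    final Operator op;
    final String dst, src1, src2;
    final Optional<String> type; // empty when the front end supplied no type

    Quad(Operator op, String dst, String src1, String src2, String type) {
        this.op = op;
        this.dst = dst;
        this.src1 = src1;
        this.src2 = src2;
        this.type = Optional.ofNullable(type);
    }
}

class QuadStats {
    /** Language-independent pass: counts quads whose operator has side effects. */
    static long countSideEffects(List<Quad> code) {
        return code.stream().filter(q -> q.op.hasSideEffects()).count();
    }
}
```

Because the pass only talks to the `Operator` interface, it does not care whether a quad came from Java bytecode or from a C front end, and it works whether or not a type was attached.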


Hierarchy of Operators


Memory management

• Memory management is abstracted into different heaps
  – Each heap has its own allocation/deallocation policy
• Interface for querying garbage collection policies
  – Type-accurate, semi-accurate, conservative
  – GC-safe points or at any instruction
  – Thread-local allocation pools
• Working out an interface with JMTk
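
The heap abstraction and policy query interface might look roughly like the sketch below. All names (`Accuracy`, `GcPolicy`, `Heap`, `ArenaHeap`) are hypothetical, offsets stand in for raw addresses, and the policy values returned by the arena are arbitrary placeholders for the example.

```java
/** How precisely a collector can identify references (terms from the slide). */
enum Accuracy { TYPE_ACCURATE, SEMI_ACCURATE, CONSERVATIVE }

/** Hypothetical policy query interface. */
interface GcPolicy {
    Accuracy accuracy();
    boolean collectsOnlyAtSafePoints();
    boolean usesThreadLocalPools();
}

/** Hypothetical heap abstraction: each heap picks its own allocation/deallocation policy. */
interface Heap {
    int allocate(int size);   // returns an offset into the heap, or -1 if full
    void release(int offset); // explicit deallocation; may be a no-op
    GcPolicy policy();
}

/** Bump-pointer arena: allocation is a pointer increment, individual frees are ignored. */
final class ArenaHeap implements Heap {
    private final int capacity;
    private int top = 0;

    ArenaHeap(int capacity) { this.capacity = capacity; }

    public int allocate(int size) {
        if (size < 0 || size > capacity - top) return -1;
        int offset = top;
        top += size;
        return offset;
    }

    public void release(int offset) {
        // No per-object free in an arena; everything is released by reset().
    }

    void reset() { top = 0; }

    public GcPolicy policy() {
        return new GcPolicy() {
            public Accuracy accuracy() { return Accuracy.CONSERVATIVE; }
            public boolean collectsOnlyAtSafePoints() { return true; }
            public boolean usesThreadLocalPools() { return false; }
        };
    }
}
```

A compiler could consult `policy()` to decide, for example, whether it must emit GC maps at every instruction or only at safe points.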


Consequences of memory management framework

• Debugging
  – Run under hosted execution mode
  – Image snapshots
  – 100% type-accurate is hard
• Coordinating threads for GC
  – Making a general interface is tricky


Thread scheduler

• M:N thread scheduler
  – Lightweight Java threads
  – Thread switch at any instruction
  – Uses local thread queues and work-stealing
• Timer ticks by using setitimer interrupts (Linux) or a separate thread (Windows)
• Thread-local information stored off of fs register
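
The "local thread queues and work-stealing" idea can be sketched as one run queue per native thread, where an idle worker takes work from the far end of another worker's queue. This is a simplified illustration with a hypothetical class name (`WorkStealingQueues`); it leaves out preemption, timer ticks, and the actual context switch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentLinkedDeque;

/** Hypothetical scheduler core: one run queue per native thread, stealing when a queue is empty. */
class WorkStealingQueues<T> {
    private final List<ConcurrentLinkedDeque<T>> queues = new ArrayList<>();

    WorkStealingQueues(int workers) {
        for (int i = 0; i < workers; i++) queues.add(new ConcurrentLinkedDeque<T>());
    }

    /** Makes a task runnable on the given worker's local queue. */
    void enqueue(int worker, T task) {
        queues.get(worker).addLast(task);
    }

    /** Next task for 'worker': its own queue first, then steal from another; null if all are empty. */
    T next(int worker) {
        T t = queues.get(worker).pollFirst();
        if (t != null) return t;
        for (int i = 1; i < queues.size(); i++) {
            int victim = (worker + i) % queues.size();
            t = queues.get(victim).pollLast(); // steal from the opposite end to reduce contention
            if (t != null) return t;
        }
        return null;
    }
}
```

With three tasks queued on worker 0 and none on worker 1, a call to `next(1)` steals the most recently queued task, while worker 0 keeps draining its own queue from the front.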


Consequences of Java thread scheduler

• Accessing threads in a machine-independent way is not easy
• Linux pthread implementation is broken
  – Lots of bugs, race conditions, inefficiencies
  – Changing stack pointer is not always supported
  – Use of fs register is not always supported
• Windows support is much nicer (?)


Running an Open-Source Project

• Lots of interest, but very few people actually follow through
• Not many people have the skills
  – Of those, not many have the time
• Of those, even fewer have the perseverance
  – The result is that there have only been minor contributions by others
• Documentation, testing, file releases, updating the web site all take time.


Running an Open-Source Project

• What’s needed:
  – Nightly build scripts and regression testing
  – Implementation hackers
  – People interested in GC


Conclusion: What I’ve learned

• Software patterns are useful
  – Joeq: 100K lines of code
• Modular design is key
  – Trying out new type checker: ~2 hours
• For maximum efficiency, design the system to be easily debuggable.
• Preemptively eliminate obvious problems.
• It’s more fun to write code when you also write the compiler.