Building Efficient and Highly Run-Time Adaptable Virtual Machines

Diego Garbervetsky, LaFHIS, UBA, Argentina - Stefan Marr, JKU, Linz, Austria

Guido Chari, LaFHIS, UBA, CONICET, Argentina

Fully-Reflective Execution Environments (FREE)

Every entity, at both the application and the VM level, must provide reflective capabilities for its observation and modification at run time.

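MOP: Fine-Grained Scoping

Scopes: Global, Method, Object

Read-only Example

  def setX(arg)
    x = arg
    return x

  point1.setX(2)
  point2.setX(2)
  point3.setX(2)

[Figure: Point 1, Point 2 and Point 3, each with fields x and y (shown as 1 and 1) and a metaobject slot mo. One slot holds _NO_METAOBJECT; the others hold the two metaobjects below. Values shown for the calls: 2, 2, Ex, 3.]

  def fieldWrite(field, value)
    throw exception()

  def fieldWrite(field, value)
  def returnValue(val)
    return val + 2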
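
The read-only example can be replayed in a few lines of Python. This is only a sketch of the semantics, not TruffleMATE code: the class names, the `set_x` helper, and the initial field values of 1 are assumptions made for illustration.

```python
class WriteException(Exception):
    """Raised by the read-only metaobject (hypothetical name)."""


class ReadOnlyMO:
    """Metaobject that rejects every field write."""

    def field_write(self, obj, field, value):
        raise WriteException(f"cannot write {field}")


class NopWritePlusTwoMO:
    """Metaobject that ignores writes and adds 2 to returned values."""

    def field_write(self, obj, field, value):
        pass  # the write is silently dropped

    def return_value(self, val):
        return val + 2


NO_METAOBJECT = None


class Point:
    def __init__(self, x, y, mo=NO_METAOBJECT):
        self.x, self.y, self.mo = x, y, mo

    def set_x(self, arg):
        # Field write: delegate to the metaobject if it redefines the operation.
        handler = getattr(self.mo, "field_write", None)
        if handler is not None:
            handler(self, "x", arg)
        else:
            self.x = arg
        # Return: same delegation rule.
        ret = getattr(self.mo, "return_value", None)
        return ret(self.x) if ret is not None else self.x
```

Under these assumptions, a point without a metaobject returns 2, a point with `ReadOnlyMO` raises, and a point with `NopWritePlusTwoMO` keeps x at 1 and returns 3.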

Starting Point

Extremely slow Smalltalk interpreter!

Insights for optimizing reflective systems

What are the fundamental performance overheads of a FREE? How can we optimize them as much as possible?

Difference with Standard VMs

  def setX(arg)
    x = arg
    return x

4 operations: 1 argument read, 1 field write, 1 field read, 1 return.

  def IntercessionHandling(frame, operation) {
    if (getGlobalMetaobject(operation)) return delegateToGlobal(operation);
    if (frame.getMetaobject(operation)) return delegateToFrame(operation);
    if (rcvr().getMetaobject(operation)) return delegateToRcvr(operation);
    executeOperationInVM;
  }

It is hard for the compiler to speculate on the meta-behavior because it cannot guess the current metaobject for each scope at each subsequent intercession handling (IH).
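
A runnable rendering of the unoptimized lookup order (global, then frame, then receiver) may make the cost per operation easier to see. The `Metaobject`, `VM`, `Frame` and `Obj` classes are hypothetical stand-ins, not TruffleMATE's data structures.

```python
NO_METAOBJECT = None


class Metaobject:
    """Maps operation names to handlers (hypothetical representation)."""

    def __init__(self, **handlers):
        self.handlers = handlers

    def handler_for(self, operation):
        return self.handlers.get(operation)


class VM:
    def __init__(self):
        self.global_mo = NO_METAOBJECT


class Obj:
    def __init__(self, mo=NO_METAOBJECT):
        self.mo = mo
        self.fields = {}


class Frame:
    def __init__(self, receiver, mo=NO_METAOBJECT):
        self.receiver, self.mo = receiver, mo


def intercession_handling(vm, frame, operation, standard, *args):
    """Check global, frame, then receiver scope; fall back to the VM."""
    for mo in (vm.global_mo, frame.mo, frame.receiver.mo):
        if mo is not NO_METAOBJECT:
            handler = mo.handler_for(operation)
            if handler is not None:
                return handler(*args)
    return standard(*args)
```

Every one of the four operations in `setX` pays for up to three scope checks before the standard behavior runs, which is the overhead the rest of the talk tries to remove.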

Intercession Handling (IH)

✤ Ubiquitous: every operation must be intercepted.

✤ Complex: every interception must consider scoping conditions.

✤ Tests depend on run-time values, which jeopardizes optimizations.

✤ Delegation to the language level requires lookup and marshaling.

Conjectures for Optimization

✤ Stable semantics: moderate dynamicity at run time.

✤ Low local meta-variability: IH sites behave like call sites; most are monomorphic, some polymorphic, few megamorphic.

Optimization Model

✤ Aggressive and exhaustive speculation on the meta-model: speculate on each observed metaobject at every scoping condition for every IH site.

✤ Mitigate the overhead of the speculation guards as much as possible.
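
The bookkeeping behind such speculation could look like the sketch below: each IH site records the metaobject ids it has seen per scope, much like an inline cache records receiver types. The `IHSite` class and the megamorphic threshold of 4 are assumptions, not values from the talk.

```python
SCOPES = ("global", "frame", "rcvr")
MEGAMORPHIC_LIMIT = 4  # assumed threshold, not taken from the slides


class IHSite:
    """Records which metaobject ids were observed at one IH site, per scope."""

    def __init__(self, operation):
        self.operation = operation
        self.observed = {scope: [] for scope in SCOPES}

    def observe(self, global_mo, frame_mo, rcvr_mo):
        for scope, mo in zip(SCOPES, (global_mo, frame_mo, rcvr_mo)):
            if mo not in self.observed[scope]:
                self.observed[scope].append(mo)

    def variability(self):
        """Classify the site by the largest number of ids seen in any scope."""
        widest = max(len(seen) for seen in self.observed.values())
        if widest <= 1:
            return "monomorphic"
        if widest < MEGAMORPHIC_LIMIT:
            return "polymorphic"
        return "megamorphic"
```

Once the observations at a site stop changing, the recorded ids are the ones a compiler would turn into guards.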

Speculate for Each Metaobject + Scope + IH Site

Observe the run-time behavior until it becomes stable.

Metaobjects (0 = _NO_METAOBJECT):

  1: def fieldWrite(field, value)
       throw writeException()

  2: def fieldWrite(field, value)
     def returnValue(val)
       return val + 2

  3: def fieldWrite(field, value)
       write in DB
     def fieldRead(field)
       read in DB

Metaobjects observed at each IH site, per scope:

  IH site       Global   Frame   Rcvr
  Read arg      0        0       0
  Field write   3, 0     1, 0    1, 2, 0
  Field read    3, 0     0       0
  Return        0        0       2, 0

JIT Compiling the Fast Path

  def IntercessionHandling(frame) {
    // Argument read (observed: Global 0; Frame 0; Rcvr 0)
    executeVMStandardArgRead

    // Field write (observed: Global 3, 0; Frame 1, 0; Rcvr 1, 2, 0)
    if (globalMO(writeField) == 3) write in DB;
    else if (frameMO(writeField) == 1 or rcvr.MO(writeField) == 1) throw writeException()
    else if (rcvr.MO(writeField) == 2) Nop
    else executeVMStandardWrite

    // Field read (observed: Global 3, 0; Frame 0; Rcvr 0)
    if (globalMO(readField) == 3) read in DB
    else executeVMStandardRead

    // Return (observed: Global 0; Frame 0; Rcvr 2, 0)
    if (rcvr.MO(ret) == 2) return val + 2
    else executeVMStandardReturn
  }

Global and Frame scoping: optimizable. Instance scoping: still needs to access memory.
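
As an illustration, the field-write guard chain can be written out by hand in Python. This is not generated code: the function name, the dictionary-based receiver, and the explicit "unseen metaobject" check that stands in for deoptimization are all assumptions. Metaobject ids follow the slides (1 read-only, 2 ignore writes, 3 write to DB, 0 none).

```python
class WriteException(Exception):
    """Raised by the read-only metaobject (id 1)."""


def make_field_write_fast_path(db, observed_ids):
    """Build the speculated field-write path for one IH site.

    `observed_ids` maps scope name -> set of metaobject ids seen there.
    """
    def fast_path(global_id, frame_id, rcvr, field, value):
        # Unseen ids mean the speculation failed; a real JIT would deoptimize.
        if (global_id not in observed_ids["global"]
                or frame_id not in observed_ids["frame"]
                or rcvr["mo"] not in observed_ids["rcvr"]):
            raise LookupError("deoptimize: unseen metaobject")
        if global_id == 3:
            db[field] = value            # write in DB
        elif frame_id == 1 or rcvr["mo"] == 1:
            raise WriteException(field)  # read-only
        elif rcvr["mo"] == 2:
            pass                         # Nop
        else:
            rcvr[field] = value          # standard VM write
    return fast_path
```

The global and frame ids can be treated as constants by a compiler once they are stable, while `rcvr["mo"]` has to be loaded from each object, which is the memory access the next slide addresses.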

Mitigating Speculation Guards: Metaobjects in Layouts

Shapes are usually needed in the context of a method


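[Figure: Return IH site. Observed metaobjects: Global 0; Frame 0; Rcvr 2, 0. Pairs shown in the figure: 0,0 and 1,2.]

Guard on the receiver's metaobject:

  if (rcvr.MO(ret) == 2) return val + 2
  else executeVMStandardReturn

Guard on the receiver's shape instead:

  if (rcvr.shape == 1) return val + 2;
  else executeVMStandardReturn

No extra memory access for instances.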
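
One way to picture metaobjects in layouts is a shape descriptor that carries the metaobject id next to the field layout, so a single shape check covers both. The `Shape` and `Obj` classes and the transition table below are assumptions for illustration, not the Truffle object model API.

```python
class Shape:
    """Layout descriptor shared by objects; also carries the metaobject id."""

    def __init__(self, shape_id, fields, mo_id):
        self.shape_id, self.fields, self.mo_id = shape_id, fields, mo_id


class Obj:
    def __init__(self, shape, values):
        self.shape = shape
        self.values = list(values)

    def set_metaobject(self, shapes, mo_id):
        # Changing the metaobject is modeled as a shape transition (assumed design).
        key = (self.shape.fields, mo_id)
        if key not in shapes:
            shapes[key] = Shape(len(shapes), self.shape.fields, mo_id)
        self.shape = shapes[key]


def make_return_fast_path(speculated_shape):
    """Return-site guard: one shape identity check covers layout and metaobject."""
    def fast_return(rcvr, val):
        if rcvr.shape is speculated_shape:
            return val + 2  # behavior of metaobject 2 in the slides
        return val          # standard VM return
    return fast_return
```

Because field accesses in the same method usually check the shape already, the metaobject guard in this sketch adds no separate load per instance.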

Inherent Peak Performance Overhead

Baseline: TruffleSOM (Truffle + Graal). Relative to Java: ~2.7x (~V8).

TruffleMATE: TruffleSOM + MOP

Mean peak performance overhead (Micro): 0.97x

Mean peak performance overhead (Macro): 1.02x

Overall peak performance overhead: 0.99x

Peak Performance of Using the VM Reflective Capabilities Exhaustively

Baseline: TruffleMATE

Overall mean peak performance overhead: 1-3.4x

Breaking Assumptions

  i = 0
  foreach (point in list)
    i += point.x

Megamorphic: 18.5x. Monomorphic: 1.10x.

Severe performance degradation when assumptions do not hold.

Results

✤ Ran quite efficiently in most cases.

✤ Positive indicator for our optimization model.

✤ Still room for improvement.

✤ High local meta-variability leads to severe performance degradation.

Open Paths

✤ Go deeper into the VM (memory, garbage collection).

✤ Would a reflective compiler enable significant improvements?

✤ Gather statistics such as code bloat and length of dispatch chains.

TruffleMATE: https://github.com/charig/truffleMate
