85
Optimizing Binary Translation of Dynamically Generated Code Byron Hawkins Brian Demsky University of California, Irvine Derek Bruening Qin Zhao Google, Inc.

Optimizing Binary Translation of Dynamically Generated Codeplrg.eecs.uci.edu/~bhawkins/papers/cgo-2015.slides.pdf · Conserving DGC Translations Add annotation framework to DynamoRIO

  • Upload
    others

  • View
    6

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Optimizing Binary Translation of Dynamically Generated Codeplrg.eecs.uci.edu/~bhawkins/papers/cgo-2015.slides.pdf · Conserving DGC Translations Add annotation framework to DynamoRIO

Optimizing Binary Translation of

Dynamically Generated Code

Byron HawkinsBrian Demsky

University of California, Irvine

Derek BrueningQin Zhao

Google, Inc.

Page 2: Optimizing Binary Translation of Dynamically Generated Codeplrg.eecs.uci.edu/~bhawkins/papers/cgo-2015.slides.pdf · Conserving DGC Translations Add annotation framework to DynamoRIO

● Profiling● Bug detection● Program analysis● Security

Page 3: Optimizing Binary Translation of Dynamically Generated Codeplrg.eecs.uci.edu/~bhawkins/papers/cgo-2015.slides.pdf · Conserving DGC Translations Add annotation framework to DynamoRIO

SPEC CPU 2006

● 12% overhead*

● 21% overhead*

*geometric mean

Page 4: Optimizing Binary Translation of Dynamically Generated Codeplrg.eecs.uci.edu/~bhawkins/papers/cgo-2015.slides.pdf · Conserving DGC Translations Add annotation framework to DynamoRIO

SPEC CPU 2006

● 12% overhead*

● 21% overhead*

*geometric mean

Page 5: Optimizing Binary Translation of Dynamically Generated Codeplrg.eecs.uci.edu/~bhawkins/papers/cgo-2015.slides.pdf · Conserving DGC Translations Add annotation framework to DynamoRIO

Octane JavaScript Benchmark

15x overhead on Chrome V8

4.4x overhead on Mozilla Ion

18x overhead on Chrome V8

8x overhead on Mozilla Ion

Page 6: Optimizing Binary Translation of Dynamically Generated Codeplrg.eecs.uci.edu/~bhawkins/papers/cgo-2015.slides.pdf · Conserving DGC Translations Add annotation framework to DynamoRIO

Octane JavaScript Benchmark

15x overhead on Chrome V8

4.4x overhead on Mozilla Ion

18x overhead on Chrome V8

8x overhead on Mozilla Ion

Page 7: Optimizing Binary Translation of Dynamically Generated Codeplrg.eecs.uci.edu/~bhawkins/papers/cgo-2015.slides.pdf · Conserving DGC Translations Add annotation framework to DynamoRIO

New Era of Dynamic Code

● Back in 2003...– Browsers: one single-phase JIT engine– Microsoft Office: negligible dynamic code

● A decade later...– Browsers: at least 2 multi-phase JIT engines– Microsoft Office: one multi-phase JIT

● Active at startup of all applications

Page 8: Optimizing Binary Translation of Dynamically Generated Codeplrg.eecs.uci.edu/~bhawkins/papers/cgo-2015.slides.pdf · Conserving DGC Translations Add annotation framework to DynamoRIO

New Era of Dynamic Code

● Back in 2003...– Browsers: one single-phase JIT engine– Microsoft Office: negligible dynamic code

● A decade later...– Browsers: at least 2 multi-phase JIT engines– Microsoft Office: one multi-phase JIT

● Active at startup of all applications

Page 9: Optimizing Binary Translation of Dynamically Generated Codeplrg.eecs.uci.edu/~bhawkins/papers/cgo-2015.slides.pdf · Conserving DGC Translations Add annotation framework to DynamoRIO

Evaluation Platform

● Optimize binary translation of dynamic code

● Maintain performance for static code

● DynamoRIO on 64-bit Linux for x86

Goals

Page 10: Optimizing Binary Translation of Dynamically Generated Codeplrg.eecs.uci.edu/~bhawkins/papers/cgo-2015.slides.pdf · Conserving DGC Translations Add annotation framework to DynamoRIO

Evaluation Platform

● Optimize binary translation of dynamic code

● Maintain performance for static code

● DynamoRIO on 64-bit Linux for x86

Goals

Page 11: Optimizing Binary Translation of Dynamically Generated Codeplrg.eecs.uci.edu/~bhawkins/papers/cgo-2015.slides.pdf · Conserving DGC Translations Add annotation framework to DynamoRIO

Outline

● Background on binary translation– Current optimizations for statically compiled code

– Dynamic code → wasting translation overhead● Coarse-grained detection of code changes

● New optimizations– Manual annotations

– Automated inference

● Performance results● Related Work

Page 12: Optimizing Binary Translation of Dynamically Generated Codeplrg.eecs.uci.edu/~bhawkins/papers/cgo-2015.slides.pdf · Conserving DGC Translations Add annotation framework to DynamoRIO

Outline

● Background on binary translation– Current optimizations for statically compiled code

– Dynamic code → wasting translation overhead● Coarse-grained detection of code changes

● New optimizations– Manual annotations

– Automated inference

● Performance results● Related Work

Page 13: Optimizing Binary Translation of Dynamically Generated Codeplrg.eecs.uci.edu/~bhawkins/papers/cgo-2015.slides.pdf · Conserving DGC Translations Add annotation framework to DynamoRIO

Outline

● Background on binary translation– Current optimizations for statically compiled code

– Dynamic code → wasting translation overhead● Coarse-grained detection of code changes

● New optimizations– Manual annotations

– Automated inference

● Performance results● Related Work

Page 14: Optimizing Binary Translation of Dynamically Generated Codeplrg.eecs.uci.edu/~bhawkins/papers/cgo-2015.slides.pdf · Conserving DGC Translations Add annotation framework to DynamoRIO

A

B C

D

E

F

SPEC Benchmark App

foo() bar()

A

DynamoRIO Code Cache

BB Cache

Trace Cache

Translate application into code cache as it runs

Page 15: Optimizing Binary Translation of Dynamically Generated Codeplrg.eecs.uci.edu/~bhawkins/papers/cgo-2015.slides.pdf · Conserving DGC Translations Add annotation framework to DynamoRIO

A

B C

D

E

F

SPEC Benchmark App

foo() bar()

A

C

DynamoRIO Code Cache

BB Cache

Trace Cache

Translate application into code cache as it runs

Page 16: Optimizing Binary Translation of Dynamically Generated Codeplrg.eecs.uci.edu/~bhawkins/papers/cgo-2015.slides.pdf · Conserving DGC Translations Add annotation framework to DynamoRIO

A

B C

D

E

F

SPEC Benchmark App

foo() bar()

A

C

D

DynamoRIO Code Cache

BB Cache

Trace Cache

Translate application into code cache as it runs

Page 17: Optimizing Binary Translation of Dynamically Generated Codeplrg.eecs.uci.edu/~bhawkins/papers/cgo-2015.slides.pdf · Conserving DGC Translations Add annotation framework to DynamoRIO

A

B C

D

E

F

SPEC Benchmark App

foo() bar()

A

C

D

E

DynamoRIO Code Cache

BB Cache

Trace Cache

Translate application into code cache as it runs

Page 18: Optimizing Binary Translation of Dynamically Generated Codeplrg.eecs.uci.edu/~bhawkins/papers/cgo-2015.slides.pdf · Conserving DGC Translations Add annotation framework to DynamoRIO

A

B C

D

E

F

SPEC Benchmark App

foo() bar()

A

C

D

EIndirect Branch Lookup

DynamoRIO Code Cache

BB Cache

Trace Cache

Translate application into code cache as it runs

Page 19: Optimizing Binary Translation of Dynamically Generated Codeplrg.eecs.uci.edu/~bhawkins/papers/cgo-2015.slides.pdf · Conserving DGC Translations Add annotation framework to DynamoRIO

A

B C

D

E

F

SPEC Benchmark App

foo() bar()

A

C

D

E

F

Indirect Branch Lookup

DynamoRIO Code Cache

BB Cache

Trace Cache

Correlate indirect branch targets via hashtable

Page 20: Optimizing Binary Translation of Dynamically Generated Codeplrg.eecs.uci.edu/~bhawkins/papers/cgo-2015.slides.pdf · Conserving DGC Translations Add annotation framework to DynamoRIO

A

B C

D

E

F

ACDE

F?

SPEC Benchmark App

foo() bar()

A

C

D

E

F

Indirect Branch Lookup

DynamoRIO Code Cache

BB Cache

Trace Cache

Hot paths are compiled into traces (10% speedup)

Page 21: Optimizing Binary Translation of Dynamically Generated Codeplrg.eecs.uci.edu/~bhawkins/papers/cgo-2015.slides.pdf · Conserving DGC Translations Add annotation framework to DynamoRIO

Cost● Translate code● Build traces

Benefit● Repeated execution of translated code● Optimized traces

– Can beat native performance on SPEC benchmarks

Page 22: Optimizing Binary Translation of Dynamically Generated Codeplrg.eecs.uci.edu/~bhawkins/papers/cgo-2015.slides.pdf · Conserving DGC Translations Add annotation framework to DynamoRIO

Cost● Translate code● Build traces

Benefit● Repeated execution of translated code● Optimized traces

– Can beat native performance on SPEC benchmarks

Page 23: Optimizing Binary Translation of Dynamically Generated Codeplrg.eecs.uci.edu/~bhawkins/papers/cgo-2015.slides.pdf · Conserving DGC Translations Add annotation framework to DynamoRIO

A

B C

D

E

F

ACDE

F?

foo() bar()

A

C

D

E

F

Indirect Branch Lookup

DynamoRIO Code Cache

BB Cache

Trace Cache

JIT Compiled Function

What if the target code is dynamically generated?

Page 24: Optimizing Binary Translation of Dynamically Generated Codeplrg.eecs.uci.edu/~bhawkins/papers/cgo-2015.slides.pdf · Conserving DGC Translations Add annotation framework to DynamoRIO

A

B C'

D'

E

F

ACDE

F?

JIT Compiled Function

foo() bar()

F

A

C

D

EIndirect Branch Lookup

DynamoRIO Code Cache

BB Cache

Trace Cache

The code may be changed frequently at runtime

Page 25: Optimizing Binary Translation of Dynamically Generated Codeplrg.eecs.uci.edu/~bhawkins/papers/cgo-2015.slides.pdf · Conserving DGC Translations Add annotation framework to DynamoRIO

A

B C'

D'

E

F

AC

ED

F?

foo() bar()

F

A

C

D

EIndirect Branch Lookup

DynamoRIO Code Cache

BB Cache

Trace Cache

JIT Compiled Function

Corresponding translations become invalid

Page 26: Optimizing Binary Translation of Dynamically Generated Codeplrg.eecs.uci.edu/~bhawkins/papers/cgo-2015.slides.pdf · Conserving DGC Translations Add annotation framework to DynamoRIO

A

B C'

D'

E

F

AC

ED

F?

JIT Compiled Function

foo() bar()

F

A

C

D

EIndirect Branch Lookup

DynamoRIO Code Cache

BB Cache

Trace Cache

Stale translations must be deleted for retranslation

Page 27: Optimizing Binary Translation of Dynamically Generated Codeplrg.eecs.uci.edu/~bhawkins/papers/cgo-2015.slides.pdf · Conserving DGC Translations Add annotation framework to DynamoRIO

A

B C'

D'

E

F

AC

ED

F?

JIT Compiled Function

foo() bar()

F

A

C

D

EIndirect Branch Lookup

DynamoRIO Code Cache

BB Cache

Trace Cache

Stale translations must be deleted for retranslation→ “cache consistency”

Page 28: Optimizing Binary Translation of Dynamically Generated Codeplrg.eecs.uci.edu/~bhawkins/papers/cgo-2015.slides.pdf · Conserving DGC Translations Add annotation framework to DynamoRIO

A

B C'

D'

E

F

AC

ED

F?

JIT Compiled Function

foo() bar()

F

A

C

D

EIndirect Branch Lookup

DynamoRIO Code Cache

BB Cache

Trace Cache

Stale translations must be deleted for retranslation→ How to detect code changes?

Page 29: Optimizing Binary Translation of Dynamically Generated Codeplrg.eecs.uci.edu/~bhawkins/papers/cgo-2015.slides.pdf · Conserving DGC Translations Add annotation framework to DynamoRIO

Detecting Code Changes on x86

● Monitor all memory writes

Page 30: Optimizing Binary Translation of Dynamically Generated Codeplrg.eecs.uci.edu/~bhawkins/papers/cgo-2015.slides.pdf · Conserving DGC Translations Add annotation framework to DynamoRIO

Detecting Code Changes on x86

● Monitor all memory writes– Too much overhead!

Page 31: Optimizing Binary Translation of Dynamically Generated Codeplrg.eecs.uci.edu/~bhawkins/papers/cgo-2015.slides.pdf · Conserving DGC Translations Add annotation framework to DynamoRIO

Detecting Code Changes on x86

● Monitor all memory writes– Too much overhead!

● Instrument traces to check freshness

Page 32: Optimizing Binary Translation of Dynamically Generated Codeplrg.eecs.uci.edu/~bhawkins/papers/cgo-2015.slides.pdf · Conserving DGC Translations Add annotation framework to DynamoRIO

Detecting Code Changes on x86

● Monitor all memory writes– Too much overhead!

● Instrument traces to check freshness– DynamoRIO supports standalone basic blocks

→ too much overhead!

Page 33: Optimizing Binary Translation of Dynamically Generated Codeplrg.eecs.uci.edu/~bhawkins/papers/cgo-2015.slides.pdf · Conserving DGC Translations Add annotation framework to DynamoRIO

Detecting Code Changes on x86

● Monitor all memory writes– Too much overhead!

● Instrument traces to check freshness– DynamoRIO supports standalone basic blocks

→ too much overhead!

● Leverage page permissions and faults

Page 34: Optimizing Binary Translation of Dynamically Generated Codeplrg.eecs.uci.edu/~bhawkins/papers/cgo-2015.slides.pdf · Conserving DGC Translations Add annotation framework to DynamoRIO

Detecting Code Changes on x86

● Monitor all memory writes– Too much overhead!

● Instrument traces to check freshness– DynamoRIO supports standalone basic blocks

→ too much overhead!

● Leverage page permissions and faults– Make code pages artificially read-only

Page 35: Optimizing Binary Translation of Dynamically Generated Codeplrg.eecs.uci.edu/~bhawkins/papers/cgo-2015.slides.pdf · Conserving DGC Translations Add annotation framework to DynamoRIO

Detecting Code Changes on x86

● Monitor all memory writes– Too much overhead!

● Instrument traces to check freshness– DynamoRIO supports standalone basic blocks

→ too much overhead!

● Leverage page permissions and faults– Make code pages artificially read-only

– Intercept page faults and invalidate translations

Page 36: Optimizing Binary Translation of Dynamically Generated Codeplrg.eecs.uci.edu/~bhawkins/papers/cgo-2015.slides.pdf · Conserving DGC Translations Add annotation framework to DynamoRIO

Detecting Code Changes on x86

● Monitor all memory writes– Too much overhead!

● Instrument traces to check freshness– DynamoRIO supports standalone basic blocks

→ too much overhead!

● Leverage page permissions and faults– Make code pages artificially read-only

– Intercept page faults and invalidate translations

→ Acceptable overhead (for rare occurrence)

Page 37: Optimizing Binary Translation of Dynamically Generated Codeplrg.eecs.uci.edu/~bhawkins/papers/cgo-2015.slides.pdf · Conserving DGC Translations Add annotation framework to DynamoRIO

Detecting Code Changes on x86

● Monitor all memory writes– Too much overhead!

● Instrument traces to check freshness– DynamoRIO supports standalone basic blocks

→ too much overhead!

● Leverage page permissions and faults– Make code pages artificially read-only

– Intercept page faults and invalidate translations

→ How does this work?

Page 38: Optimizing Binary Translation of Dynamically Generated Codeplrg.eecs.uci.edu/~bhawkins/papers/cgo-2015.slides.pdf · Conserving DGC Translations Add annotation framework to DynamoRIO

Chrome V8

foo()

DynamoRIO Code Cache

compile_js()

foo()

compile_js()

r-x

r-x

rwx

bar() bar()

Page 39: Optimizing Binary Translation of Dynamically Generated Codeplrg.eecs.uci.edu/~bhawkins/papers/cgo-2015.slides.pdf · Conserving DGC Translations Add annotation framework to DynamoRIO

Chrome V8

foo()

DynamoRIO Code Cache

compile_js()

foo()

compile_js()

r-x

r-x

rwx

bar() bar()XPage fault

Page 40: Optimizing Binary Translation of Dynamically Generated Codeplrg.eecs.uci.edu/~bhawkins/papers/cgo-2015.slides.pdf · Conserving DGC Translations Add annotation framework to DynamoRIO

Chrome V8

foo()

DynamoRIO Code Cache

compile_js()

foo()

compile_js()

r-x

r-x

rwx

bar() bar()X

Page 41: Optimizing Binary Translation of Dynamically Generated Codeplrg.eecs.uci.edu/~bhawkins/papers/cgo-2015.slides.pdf · Conserving DGC Translations Add annotation framework to DynamoRIO

Chrome V8

foo_2()

DynamoRIO Code Cache

compile_js() compile_js()

rwx

r-x

rwx

bar() bar()

Allow write

Page 42: Optimizing Binary Translation of Dynamically Generated Codeplrg.eecs.uci.edu/~bhawkins/papers/cgo-2015.slides.pdf · Conserving DGC Translations Add annotation framework to DynamoRIO

Chrome V8

foo_2()

DynamoRIO Code Cache

compile_js() compile_js()

rwx

r-x

rwx

bar_2() bar()

compile_more_js()

Thread B

Thread A

Allow write!

Page 43: Optimizing Binary Translation of Dynamically Generated Codeplrg.eecs.uci.edu/~bhawkins/papers/cgo-2015.slides.pdf · Conserving DGC Translations Add annotation framework to DynamoRIO

Chrome V8

foo_2()

DynamoRIO Code Cache

compile_js() compile_js()

rwx

r-x

rwx

bar_2() bar()

compile_more_js()

Thread B

Thread A

Concurrent Writer Problem

All translations from the modified page must be removed

Page 44: Optimizing Binary Translation of Dynamically Generated Codeplrg.eecs.uci.edu/~bhawkins/papers/cgo-2015.slides.pdf · Conserving DGC Translations Add annotation framework to DynamoRIO

Cache Consistency Overhead

● For non-JIT modules:– System call hooks (program startup only)

– Self-modifying code (very rare)

● For JIT engines:– Code generation

– Code optimization

– Code adjustment for reuse

Page 45: Optimizing Binary Translation of Dynamically Generated Codeplrg.eecs.uci.edu/~bhawkins/papers/cgo-2015.slides.pdf · Conserving DGC Translations Add annotation framework to DynamoRIO

Cache Consistency Overhead

● For non-JIT modules:– System call hooks (program startup only)

– Self-modifying code (very rare)

● For JIT engines:– Code generation

– Code optimization

– Code adjustment for reuse

Page 46: Optimizing Binary Translation of Dynamically Generated Codeplrg.eecs.uci.edu/~bhawkins/papers/cgo-2015.slides.pdf · Conserving DGC Translations Add annotation framework to DynamoRIO

Cache Consistency Overhead

Chrome V8

foo()

DynamoRIO Code Cache

compile_js()

foo()

compile_js()

r-x

r-x

rwx

bar()

JIT writes a second function to unused space in the page

Page 47: Optimizing Binary Translation of Dynamically Generated Codeplrg.eecs.uci.edu/~bhawkins/papers/cgo-2015.slides.pdf · Conserving DGC Translations Add annotation framework to DynamoRIO

Cache Consistency Overhead

Chrome V8

foo()

DynamoRIO Code Cache

compile_js()

foo()

compile_js()

r-x

r-x

rwx

bar()

DynamoRIO must invalidate all translations from the page

Page 48: Optimizing Binary Translation of Dynamically Generated Codeplrg.eecs.uci.edu/~bhawkins/papers/cgo-2015.slides.pdf · Conserving DGC Translations Add annotation framework to DynamoRIO

Cache Consistency Overhead

Chrome V8

foo()

DynamoRIO Code Cache

compile_js()

foo()

compile_js()

r-x

r-x

rwx

bar() bar()

Trivial code changes require flushing all translations

Page 49: Optimizing Binary Translation of Dynamically Generated Codeplrg.eecs.uci.edu/~bhawkins/papers/cgo-2015.slides.pdf · Conserving DGC Translations Add annotation framework to DynamoRIO

Cache Consistency Overhead

Chrome V8

foo()

DynamoRIO Code Cache

compile_js()

foo()

compile_js()

r-x

r-x

rwx

bar() bar()

Even a data change requires flushing all translations

Page 50: Optimizing Binary Translation of Dynamically Generated Codeplrg.eecs.uci.edu/~bhawkins/papers/cgo-2015.slides.pdf · Conserving DGC Translations Add annotation framework to DynamoRIO

Cache Consistency Overhead

● For non-JIT modules:– System call hooks (program startup only)

– Self-modifying code (very rare)

● For JIT engines:– Code generation

– Code optimization

– Code adjustment for reuse

● Disable traces to reduce translation overhead?

Page 51: Optimizing Binary Translation of Dynamically Generated Codeplrg.eecs.uci.edu/~bhawkins/papers/cgo-2015.slides.pdf · Conserving DGC Translations Add annotation framework to DynamoRIO

Cache Consistency Overhead

● For non-JIT modules:– System call hooks (program startup only)

– Self-modifying code (very rare)

● For JIT engines:– Code generation

– Code optimization

– Code adjustment for reuse

● Disable traces to reduce translation overhead?

25% slowdown on Octane!

Page 52: Optimizing Binary Translation of Dynamically Generated Codeplrg.eecs.uci.edu/~bhawkins/papers/cgo-2015.slides.pdf · Conserving DGC Translations Add annotation framework to DynamoRIO

Conserving DGC Translations

● Add annotation framework to DynamoRIO– Macros compile into DynamoRIO hooks

– Native execution skips hook (2 direct jumps)

● Annotate the target application– Specify which memory allocations contain code

– Notify DynamoRIO of all JIT code writes

Page 53: Optimizing Binary Translation of Dynamically Generated Codeplrg.eecs.uci.edu/~bhawkins/papers/cgo-2015.slides.pdf · Conserving DGC Translations Add annotation framework to DynamoRIO

Conserving DGC Translations

● Add annotation framework to DynamoRIO– Macros compile into DynamoRIO hooks

– Native execution skips hook (2 direct jumps)

● Annotate the target application– Specify which memory allocations contain code

– Notify DynamoRIO of all JIT code writes

Page 54: Optimizing Binary Translation of Dynamically Generated Codeplrg.eecs.uci.edu/~bhawkins/papers/cgo-2015.slides.pdf · Conserving DGC Translations Add annotation framework to DynamoRIO

Conserving DGC Translations

● Add annotation framework to DynamoRIO– Macros compile to DynamoRIO hooks

– Native execution skips hook (2 direct jumps)

● Annotate the target application– Specify which memory allocations contain code

– Notify DynamoRIO of all JIT code writes

void* OS::Allocate(const size_t size, int prot) {

void* mbase = mmap(0, size, prot,

MAP_PRIVATE | MAP_ANONYMOUS);

if (mbase == MAP_FAILED) return NULL;

if (IS_EXECUTABLE(prot))

DYNAMORIO_MANAGE_CODE_AREA(mbase, size);

return mbase;

}

Page 55: Optimizing Binary Translation of Dynamically Generated Codeplrg.eecs.uci.edu/~bhawkins/papers/cgo-2015.slides.pdf · Conserving DGC Translations Add annotation framework to DynamoRIO

Conserving DGC Translations

● Add annotation framework to DynamoRIO– Macros compile to DynamoRIO hooks

– Native execution skips hook (2 direct jumps)

● Annotate the target application– Specify which memory allocations contain code

– Notify DynamoRIO of all JIT code writes

Page 56: Optimizing Binary Translation of Dynamically Generated Codeplrg.eecs.uci.edu/~bhawkins/papers/cgo-2015.slides.pdf · Conserving DGC Translations Add annotation framework to DynamoRIO

Conserving DGC Translations

● Add annotation framework to DynamoRIO– Macros compile to DynamoRIO hooks

– Native execution skips hook (2 direct jumps)

● Annotate the target application– Specify which memory allocations contain code

– Notify DynamoRIO of all JIT code writes

void CpuFeatures::FlushICache(void* start,

size_t size) {

/* no native action for Intel x86 */

DYNAMORIO_FLUSH_FRAGMENTS(start, size);

}

Page 57: Optimizing Binary Translation of Dynamically Generated Codeplrg.eecs.uci.edu/~bhawkins/papers/cgo-2015.slides.pdf · Conserving DGC Translations Add annotation framework to DynamoRIO

Conserving DGC Translations

● Add annotation framework to DynamoRIO– Macros compile to DynamoRIO hooks

– Native execution skips hook (2 direct jumps)

● Annotate the target application– Specify which memory allocations contain code

– Notify DynamoRIO of all JIT code writes

→ How does this work?

Page 58: Optimizing Binary Translation of Dynamically Generated Codeplrg.eecs.uci.edu/~bhawkins/papers/cgo-2015.slides.pdf · Conserving DGC Translations Add annotation framework to DynamoRIO

Annotated JIT Writes

Chrome V8

foo()

DynamoRIO Code Cache

compile_js()

foo()

compile_js()

rwx

r-x

rwx

bar()

Annotation handler flushes only the written region

Page 59: Optimizing Binary Translation of Dynamically Generated Codeplrg.eecs.uci.edu/~bhawkins/papers/cgo-2015.slides.pdf · Conserving DGC Translations Add annotation framework to DynamoRIO

Annotated JIT Writes

Chrome V8

foo()

DynamoRIO Code Cache

compile_js()

foo()

compile_js()

rwx

r-x

rwx

bar() bar()

Annotation handler flushes only the written basic block

Page 60: Optimizing Binary Translation of Dynamically Generated Codeplrg.eecs.uci.edu/~bhawkins/papers/cgo-2015.slides.pdf · Conserving DGC Translations Add annotation framework to DynamoRIO

Annotated JIT Writes

Chrome V8

foo()

DynamoRIO Code Cache

compile_js()

foo()

compile_js()

rwx

r-x

rwx

bar() bar()

Data writes can be safely ignored

Page 61: Optimizing Binary Translation of Dynamically Generated Codeplrg.eecs.uci.edu/~bhawkins/papers/cgo-2015.slides.pdf · Conserving DGC Translations Add annotation framework to DynamoRIO

Problems with Annotations

● Source code may not be available● COTS binaries may be preferred● Application may be difficult to annotate

– Chrome V8 requires 4 annotations● Trivial to place the annotations correctly

– Mozilla Ion requires 17 annotations● Complex analysis required to correctly place annotations

Page 62: Optimizing Binary Translation of Dynamically Generated Codeplrg.eecs.uci.edu/~bhawkins/papers/cgo-2015.slides.pdf · Conserving DGC Translations Add annotation framework to DynamoRIO

Problems with Annotations

● Source code may not be available● COTS binaries may be preferred● Application may be difficult to annotate

– Chrome V8 requires 4 annotations● Trivial to place the annotations correctly

– Mozilla Ion requires 17 annotations● Complex analysis required to correctly place annotations

Page 63: Optimizing Binary Translation of Dynamically Generated Codeplrg.eecs.uci.edu/~bhawkins/papers/cgo-2015.slides.pdf · Conserving DGC Translations Add annotation framework to DynamoRIO

Inference Approach

● Infer which store instructions are writing code

● Instrument those store instructions

– Fine-grained cache consistency policy

– Avoid page faults

● Other writers → default cache consistency

Page 64: Optimizing Binary Translation of Dynamically Generated Codeplrg.eecs.uci.edu/~bhawkins/papers/cgo-2015.slides.pdf · Conserving DGC Translations Add annotation framework to DynamoRIO

Inference Approach

● Infer which store instructions are writing code

● Instrument those store instructions

– Fine-grained cache consistency policy

– Avoid page faults

● Other writers → default cache consistency

● Incorrect inference → negligible overhead

Page 65: Optimizing Binary Translation of Dynamically Generated Codeplrg.eecs.uci.edu/~bhawkins/papers/cgo-2015.slides.pdf · Conserving DGC Translations Add annotation framework to DynamoRIO

Infer Code-Writing Instructions

Chrome V8 DynamoRIO Code Cache

compile_js() compile_js()

r-x

r-x

A A

Handle up to ~10 page faults (heuristic)

Page 66: Optimizing Binary Translation of Dynamically Generated Codeplrg.eecs.uci.edu/~bhawkins/papers/cgo-2015.slides.pdf · Conserving DGC Translations Add annotation framework to DynamoRIO

Infer Code-Writing Instructions

Chrome V8 DynamoRIO Code Cache

compile_js() compile_js()

r-x

r-xJIT

A A

Flag the faulting page as a JIT code page

Page 67: Optimizing Binary Translation of Dynamically Generated Codeplrg.eecs.uci.edu/~bhawkins/papers/cgo-2015.slides.pdf · Conserving DGC Translations Add annotation framework to DynamoRIO

Parallel Memory Mapping

A

Physical Memory

Chrome V8

r-x

r-x

AJIT

rwx

A'

Map a second writable page to the same location

Page 68: Optimizing Binary Translation of Dynamically Generated Codeplrg.eecs.uci.edu/~bhawkins/papers/cgo-2015.slides.pdf · Conserving DGC Translations Add annotation framework to DynamoRIO

Conserving DGC Translations

A

Chrome V8

r-x

rwx

r-x

DynamoRIO Code Cache

compile_js()JIT

A

compile_js()

A'

A

Redirect the store's target to the writable mapping

Page 69: Optimizing Binary Translation of Dynamically Generated Codeplrg.eecs.uci.edu/~bhawkins/papers/cgo-2015.slides.pdf · Conserving DGC Translations Add annotation framework to DynamoRIO

Conserving DGC Translations

A

Chrome V8

r-x

rwx

r-x

DynamoRIO Code Cache

compile_js()JIT

A

compile_js()

A'

A

Locate and remove stale translations

Page 70: Optimizing Binary Translation of Dynamically Generated Codeplrg.eecs.uci.edu/~bhawkins/papers/cgo-2015.slides.pdf · Conserving DGC Translations Add annotation framework to DynamoRIO

Comparing Approaches

Annotations

+ Lower virtual memory usage (for 32-bit)

+ Simple to implement

Inference

+ Requires no source code changes

Page 71: Optimizing Binary Translation of Dynamically Generated Codeplrg.eecs.uci.edu/~bhawkins/papers/cgo-2015.slides.pdf · Conserving DGC Translations Add annotation framework to DynamoRIO

Speedup – Octane on V8

Score Overhead SpeedupOriginalDynamoRIO 2,271 15.80x -

AnnotationDynamoRIO 14,532 2.47x 6.40x

InferenceDynamoRIO

14,257 2.52x 6.28x

Native 35,889 - -

Page 72: Optimizing Binary Translation of Dynamically Generated Codeplrg.eecs.uci.edu/~bhawkins/papers/cgo-2015.slides.pdf · Conserving DGC Translations Add annotation framework to DynamoRIO

Speedup – Octane on V8

Score Overhead SpeedupOriginalDynamoRIO 2,271 15.80x -

AnnotationDynamoRIO 14,532 2.47x 6.40x

InferenceDynamoRIO

14,257 2.52x 6.28x

Native 35,889 - -

Page 73: Optimizing Binary Translation of Dynamically Generated Codeplrg.eecs.uci.edu/~bhawkins/papers/cgo-2015.slides.pdf · Conserving DGC Translations Add annotation framework to DynamoRIO

Speedup – Octane on V8

Score Overhead SpeedupOriginalDynamoRIO 2,271 15.80x -

AnnotationDynamoRIO 14,532 2.47x 6.40x

InferenceDynamoRIO

14,257 2.52x 6.28x

Native 35,889 - -

Page 74: Optimizing Binary Translation of Dynamically Generated Codeplrg.eecs.uci.edu/~bhawkins/papers/cgo-2015.slides.pdf · Conserving DGC Translations Add annotation framework to DynamoRIO

Speedup – Octane on Ion

Score Overhead SpeedupOriginalDynamoRIO 7,185 4.36x -

AnnotationDynamoRIO 11,914 2.27x 1.92x

InferenceDynamoRIO

13,797 2.15x 2.03x

Native 31,340 - -

Page 75: Optimizing Binary Translation of Dynamically Generated Codeplrg.eecs.uci.edu/~bhawkins/papers/cgo-2015.slides.pdf · Conserving DGC Translations Add annotation framework to DynamoRIO

Speedup – Octane on Ion

Score Overhead SpeedupOriginalDynamoRIO 7,185 4.36x -

AnnotationDynamoRIO 11,914 2.27x 1.92x

InferenceDynamoRIO

13,797 2.15x 2.03x

Native 31,340 - -

Page 76: Optimizing Binary Translation of Dynamically Generated Codeplrg.eecs.uci.edu/~bhawkins/papers/cgo-2015.slides.pdf · Conserving DGC Translations Add annotation framework to DynamoRIO

Speedup – Octane on Ion

Score Overhead SpeedupOriginalDynamoRIO 7,185 4.36x -

AnnotationDynamoRIO 11,914 2.27x 1.92x

InferenceDynamoRIO

13,797 2.15x 2.03x

Native 31,340 - -

Page 77: Optimizing Binary Translation of Dynamically Generated Codeplrg.eecs.uci.edu/~bhawkins/papers/cgo-2015.slides.pdf · Conserving DGC Translations Add annotation framework to DynamoRIO

Speedup – Octane on V8

Page 78: Optimizing Binary Translation of Dynamically Generated Codeplrg.eecs.uci.edu/~bhawkins/papers/cgo-2015.slides.pdf · Conserving DGC Translations Add annotation framework to DynamoRIO

Benchmark Overhead

Page 79: Optimizing Binary Translation of Dynamically Generated Codeplrg.eecs.uci.edu/~bhawkins/papers/cgo-2015.slides.pdf · Conserving DGC Translations Add annotation framework to DynamoRIO

SPEC CPU 2006SPEC CPU 2006

SPEC CPU SPEC Int SPEC fp

OriginalDynamoRIO 12.27% 17.73% 8.60%

InferenceDynamoRIO

12.35% 17.88% 8.60%

Performance is maintained for programsthat do not dynamically generate code

Page 80: Optimizing Binary Translation of Dynamically Generated Codeplrg.eecs.uci.edu/~bhawkins/papers/cgo-2015.slides.pdf · Conserving DGC Translations Add annotation framework to DynamoRIO

Related Work

Platform Cache Consistency Policy

DynamoRIO Page protection → flush all

Pin Instrument trace heads → flush trace

QEMU Software TLB hook → flush basic block

Valgrind Instrument basic block → flush

Transmeta Page protection → flush region (HW support)

Librando Page protection → hash compare and flush

Page 81: Optimizing Binary Translation of Dynamically Generated Codeplrg.eecs.uci.edu/~bhawkins/papers/cgo-2015.slides.pdf · Conserving DGC Translations Add annotation framework to DynamoRIO

Conclusion

● New trend towards dynamic code generation

● We explore two optimization approaches

– Annotation-driven code change notification

– Code change inference

● We improve performance of binary translation on the Octane benchmark by more than 6x

Page 82: Optimizing Binary Translation of Dynamically Generated Codeplrg.eecs.uci.edu/~bhawkins/papers/cgo-2015.slides.pdf · Conserving DGC Translations Add annotation framework to DynamoRIO

DynamoRIO Annotations

Compiled annotation in x64 Linux

Page 83: Optimizing Binary Translation of Dynamically Generated Codeplrg.eecs.uci.edu/~bhawkins/papers/cgo-2015.slides.pdf · Conserving DGC Translations Add annotation framework to DynamoRIO

DynamoRIO Annotations

Native execution jumps over the annotation

Page 84: Optimizing Binary Translation of Dynamically Generated Codeplrg.eecs.uci.edu/~bhawkins/papers/cgo-2015.slides.pdf · Conserving DGC Translations Add annotation framework to DynamoRIO

DynamoRIO Annotations

DynamoRIO transforms the annotation into a handler call

Page 85: Optimizing Binary Translation of Dynamically Generated Codeplrg.eecs.uci.edu/~bhawkins/papers/cgo-2015.slides.pdf · Conserving DGC Translations Add annotation framework to DynamoRIO

DynamoRIO Annotations

DynamoRIO transforms the annotation into a handler call