Upload
others
View
12
Download
0
Embed Size (px)
Citation preview
Berkeley Parlab
1. Motivation
2. Overview
6. Evaluation for JavaScript 4. Evaluation for LLVM IR
Liang Gong & Cuong Nguyen* * Name are in alphabetical order. Each author makes equal contributions to the project.
5. Techniques and Challenges for JavaScript
Original Javascript: var a = b; Simplified Transformed Javascript: var a = J$.W(9, 'a', J$.R(5, 'b', b), a);
J$.W = function( … ) { … } J$.R = function( … ) { … }
Code Transformation:
Jalangi Framework Runtime Code:
Different places that may import JavaScript code • Inline, between <script> and </script> • External file: src a2ribute of a <script> tag • HTML event handler a2ribute, such as onclick or onmouseover • URL: javascript: protocol • Ajax: jQuery.getScript() or importScript() • Generated by Script: src_elem.innerHTML = “function(){}”
More complex code execution model/environment: • JS Code may be triggered at different Kme (page loading, specific event, asynchronous call) • Different JS Context
We add the following hooks: • Binary OperaKon • Unary OperaKon • Variable Read/Write • Field Get/Put • FuncKon/Method Call/Enter/Return • Script Enter/Return • Object/FuncKon/Regexp/Array Literal • CondiKon/Switch
With hooks you can do: • Debugging • Performance analysis • Monitoring dynamic behaviors • Record and replay • Almost any dynamic analysis
Detect a NaN in AWer diagnosing, confirm that is a bug.
Check NaN Bug [ < 100 Loc]
[object Object].d <= undefined c <= undefined [object Object].refcount <= NaN
Found interesting operations in the following websites’ homepages
Simply loading the Facebook homepage, the analysis detects hundreds of this kind of interesKng operaKons.
Operating uninitialized variable
0
10
20
30
40
50
60
70
Vector Chart Bitmap Game Vector Chart Mobile HTML5 CharKng
Mobile HTML5 Gaming
AWer TransformaKon Before TransformaKon
Slow down is not obvious
Demo Available
Shadow Execution Frameworks for LLVM IR and JavaScript
Motivation • Lesson learnt: code instrumentation/transformation and
computation of extra information are foundation to dynamic analysis.
• Goal: minimize the efforts to build dynamic analysis by generalizing this approach with high scalability.
Contributions • Design and implement an LLVM Shadow Execution System.
• Based on instrumentation, concolic execution and shadow execution. • Perform instrumentation at the instruction level. The analysis is
heavyweight but powerful. • Support multiple languages: C/C++, Java, OpenCL, Fortran, etc.
• Design and implement a JavaScript Dynamic Analysis Framework. • JavaScript is the most popular programming language for client-side web programming. • Facilitate analyzing any real-world webpages. • Analysis module plug in and remove on the fly. • Powerful analysis interaction based on web console. • Detect real-world bugs using the framework.
Shadow Execution • Associate a shadow value with each concrete value. • Carry useful information: error propagation, array
bound checking, symbolic value, etc. • Perform side-by-side with the concrete execution.
Concrete execution assists shadow execution.
Challenges • Modeling dynamic allocation and pointer
arithmetic in shadow execution. • Handle all tricky C pointer arithmetic
(castings, pointer arithmetic, function pointers, etc.).
• Array/struct and their combinations. • Handling uninterpreted calls. • With/without side effects + return
arbitrary types (pointer, array, struct). • Model malloc.
• Achieving scalability: fast variable lookup and lazy construction of objects.
Evaluation of the Core Interpreter • Robustness: interpret 50 commonly used Linux core utilities
(rm/ls/mkdir …). • Check for each memory write that the interpretation and
concrete execution writes the same value. • Check at each branch that two executions do not diverge.
• Performance:
Evaluation of the Usability • Implement an array-out-of-bound analysis (< 300 LoC). • Can trace back to the root causes of the problem. • Differentiate harmful and benign bugs. • Demo is available.
• Implement a NaN detection analysis (in progress).
Portable to:
[object Object].now = end – start; NaN “30% 0”
Core Interpreter (Resemble a compiler) • Front end: instrument all 57 LLVM instructions + compute stack frame
size for local/global variables + create fast lookup indices for variables. • Back end: re-interpret all 57 LLVM instructions with shadow execution
+ model heap/stack and runtime environment.
2x in average 4x in the worst case
Input: target application under analysis. Output: instrumented/transformed codes with hooks. Extensibility: override hooks to perform additional analysis.
%a = alloca double %b = alloca double %1 = load %a %2 = load %b %3 = add %1 %2
…
LLVM bitcode
Instrumenter (LLVM Pass)
Callback Interpreter (Dynamically loaded/
Perform analysis)
my_alloca(double) %a = alloca double
… my_load(%my_a) %1 = load %a
… my_add(%my_1, %my_2) %3 = add %1 %2
…
LLVM bitcode
Analysis Result
3. Techniques and Challenges for LLVM IR
LLVM Shadow Execution Framework Interesting Analysis Applications
Check NaN:• Find a bug in jQuery-‐1.8.3• Find meaningless(maybe) operations
Hack Frontend Code:• Modify frontend dynamic effects• For program understanding purposes
• Graphviz online• AttackMap
Generate Runtime Call Graph
Analyze Performance Issue:• Function or operation counterCheck Polymorphic getField Statements:• Function or operation counter