Upload
sandra-judkins
View
221
Download
0
Embed Size (px)
Citation preview
Program Analysis for Security
John Mitchell
CS 155 Spring 2012
Software bugs are serious problems
Thanks: Isil and Thomas Dillig
3
Several different approaches
• Instrument code for testing– Heap memory: Purify– Perl tainting (information flow)– Java race condition checking
• Black-box testing– Fuzzing and penetration testing– Black-box web application security analysis
• Static code analysis– FindBugs, Fortify, Coverity, MS tools, …
Entry
1
2 3
4
Software
Exit
Behaviors
Entry
1
2
4
Exit
1 2 41 2 4
1 3 4
1 2 4 1 3 4
1 2 3 1 2 4 1 3 4
1 2 4 1 2 3 1 3 4
1 2 3 1 2 3 1 3 4
1 2 4 1 2 4 1 3 4
. . .
1 2 4 1 3 4
Manual testingonly examines small subset of behaviors
4
Static analysis goals
• Bug finding– Identify code that the programmer wishes to
modify or improve• Correctness
– Verify the absence of certain classes of errors
Note: some fundamental limitations…
Outline
• General discussion of tools– Fundamental limitations– Basic method based on abstract states
• More details on one specific method– Property checkers from Engler et al., Coverity– Sample security-related results
• Quick tour of another tool– JavaScript confinement verifier– Proves isolation; not “just” bug finding
Slides from: S. Bugrahe, A. Chou, I&T Dillig, D. Engler, …
Program Analyzers
Code Report Type Line
1 mem leak 324
2 buffer oflow 4,353,245
3 sql injection 23,212
4 stack oflow 86,923
5 dang ptr 8,491
… … …
10,502 info leak 10,921
Program Analyzer
Spec
potentially reports manywarnings
may emit false alarms
analyze large code bases
false alarm
false alarm
Soundness, CompletenessProperty Definition
Soundness If the program contains an error, the analysis will report a warning.“Sound for reporting correctness”
Completeness If the analysis reports an error, the program will contain an error.“Complete for reporting correctness”
Complete IncompleteSo
und
Uns
ound
Reports all errorsReports no false alarms
Reports all errorsMay report false alarms
Undecidable Decidable
Decidable
May not report all errorsMay report false alarms
Decidable
May not report all errorsReports no false alarms
Software
. . .
Behaviors
SoundOver-approximation of
Behaviors
False Alarm
ReportedError
approximation is too coarse……yields too many false alarms
Modules
entry
X 0
Is Y = 0 ?
X X + 1 X X - 1
Is Y = 0 ?
Is X < 0 ? exit
crash
yes
noyes
no
yes no
Does this program ever crash?
entry
X 0
Is Y = 0 ?
X X + 1 X X - 1
Is Y = 0 ?
Is X < 0 ? exit
crash
yes
noyes
no
yes no
infeasible path!… program will never crash
Does this program ever crash?
entry
X 0
Is Y = 0 ?
X X + 1 X X - 1
Is Y = 0 ?
Is X < 0 ? exit
crash
yes
noyes
no
yes no
X = 0
X = 0
X = 1
X = 1
X = 1
X = 1
X = 1
X = 2
X = 2
X = 2
X = 2
X = 2
X = 3
X = 3
X = 3
X = 3
non-termination!… therefore, need to approximate
Try analyzing without approximating…
X X + 1 f
din
dout
dout = f(din)
X = 0
X = 1
dataflow elements
transfer functiondataflow equation
X X + 1 f1
din1
dout1 = f1(din1)
Is Y = 0 ? f2
dout2
dout1
din2 dout1 = din2
dout2 = f2(din2)
X = 0
X = 1
X = 1
X = 1
dout1 = f1(din1)
djoin = dout1 ⊔ dout2
dout2 = f2(din2)f1 f2
f3
dout1
din1 din2
dout2
djoin
din3
dout3
djoin = din3
dout3 = f3(din3)
least upper bound operatorExample: union of possible values
What is the space of dataflow elements, ?What is the least upper bound operator, ?⊔
entry
X 0
Is Y = 0 ?
X X + 1 X X - 1
Is Y = 0 ?
Is X < 0 ? exit
crash
yes
noyes
no
yes no
X = 0
X = 0
X = posX = T
X = neg
X = 0
X = T X = T
X = T
Try analyzing with “signs” approximation…
terminates...… but reports false alarm… therefore, need more precision
lost precision
X = T
X = T
X = pos X = 0 X = neg
X =
X neg X postrue
Y = 0 Y 0
false
X = T
X = pos X = 0 X = neg
X =
signs lattice Boolean formula latticerefined signs lattice
entry
X 0
Is Y = 0 ?
X X + 1 X X - 1
Is Y = 0 ?
Is X < 0 ? exit
crash
yes
noyes
no
yes no
X = 0true
X = 0Y=0
X = posY=0 X = neg Y0
X = posY=0X = negY0
X = posY=0
X = pos Y=0
X = neg Y0
X = 0 Y0
Try analyzing with “path-sensitive signs” approximation…
terminates...… no false alarm… soundly proved never crashes
no precision loss
refinement
Outline
• General discussion of tools– Fundamental limitations– Basic method based on abstract states
• More details on one specific method– Property checkers from Engler et al., Coverity– Sample security-related results
• Quick tour of another tool– JavaScript confinement verifier– Proves isolation; not “just” bug finding
Slides from: S. Bugrahe, A. Chou, I&T Dillig, D. Engler, …
21
Bugs to Detect
Some examples• Crash Causing Defects• Null pointer dereference• Use after free• Double free • Array indexing errors• Mismatched array
new/delete• Potential stack overrun• Potential heap overrun• Return pointers to local
variables• Logically inconsistent
code
• Uninitialized variables• Invalid use of negative values• Passing large parameters by
value• Underallocations of dynamic
data• Memory leaks• File handle leaks• Network resource leaks• Unused values• Unhandled return codes• Use of invalid iterators
Slide credit: Andy Chou
22
Example: Check for missing optional args
•Prototype for open() syscall:
•Typical mistake:
•Result: file has random permissions
•Check: Look for oflags == O_CREAT without mode argument
int open(const char *path, int oflag, /* mode_t mode */...);
fd = open(“file”, O_CREAT);
23
Example: Chroot protocol checker
•Goal: confine process to a “jail” on the filesystem• chroot() changes filesystem root for a process
•Problem• chroot() itself does not change current working directory
chroot() chdir(“/”)
open(“../file”,…) Error if open before chdir
25
Tainting checkers
26
Example Code
#include <stdlib.h>#include <stdio.h>
void say_hello(char * name, int size) { printf("Enter your name: "); fgets(name, size, stdin); printf("Hello %s.\n", name);}
int main(int argc, char *argv[]) { if (argc != 2) { printf("Error, must provide an input buffer size.\n"); exit(-1); } int size = atoi(argv[1]); char * name = (char*)malloc(size); if (name) { say_hello(name, size); free(name); } else { printf("Failed to allocate %d bytes.\n", size); }}
27
atoi
main
exit free malloc
printffgets
say_hello
Callgraph
28
atoi
main
exit free malloc
printffgets
say_hello
Reverse Topological Sort
12
3 4 5 6 7
8
Idea: analyze function before you analyze caller
29
atoi
main
exit free malloc
printffgets
say_hello
Apply Library Models
12
3 4 5 6 7
8
Tool has built-in summaries of library function behavior
30
atoi
main
exit free malloc
printffgets
say_hello
Bottom Up Analysis
12
3 4 5 6 7
8
Analyze function using known properties of functions it calls
31
atoi
main
exit free malloc
printffgets
say_hello
Bottom Up Analysis
12
3 4 5 6 7
8
Analyze function using known properties of functions it calls
32
atoi
main
exit free malloc
printffgets
say_hello
Bottom Up Analysis
12
3 4 5 6 7
8
Finish analysis by analyzing all functions in the program
33
Finding Local Bugs
#define SIZE 8void set_a_b(char * a, char * b) {char * buf[SIZE];if (a) {b = new char[5];} else {if (a && b) {
buf[SIZE] = a;return;
} else {delete [] b;
}*b = ‘x’;}*a = *b;}
34
char * buf[8];
if (a)
b = new char [5]; if (a && b)
buf[8] = a; delete [] b;
*b = ‘x’;
END
*a = *b;
a !a
a && b !(a && b)
Control Flow Graph
Represent logical structure of code in graph form
35
char * buf[8];
if (a)
b = new char [5]; if (a && b)
buf[8] = a; delete [] b;
*b = ‘x’;
END
*a = *b;
a !a
a && b !(a && b)
Path Traversal
Conceptually: Analyze each path through control graph separately
Actually Perform some checking computation once per node; combine paths at merge nodes
Conceptually
Actually
36
char * buf[8];
if (a)
if (a && b)
delete [] b;
*b = ‘x’;
END
*a = *b;
!a
!(a && b)
Apply Checking
Null pointersUse after freeArray overrun
See how three checkers are run for this path
• • Defined by a state diagram, with state
transitions and error states
Checker
• • Assign initial state to each program var• State at program point depends on
state at previous point, program actions• Emit error if error state reached
Run Checker
37
char * buf[8];
if (a)
if (a && b)
delete [] b;
*b = ‘x’;
END
*a = *b;
!a
!(a && b)
Apply Checking
Null pointersUse after freeArray overrun
“buf is 8 bytes”
38
char * buf[8];
if (a)
if (a && b)
delete [] b;
*b = ‘x’;
END
*a = *b;
!a
!(a && b)
Apply Checking
Null pointersUse after freeArray overrun
“buf is 8 bytes”
“a is null”
39
char * buf[8];
if (a)
if (a && b)
delete [] b;
*b = ‘x’;
END
*a = *b;
!a
!(a && b)
Apply Checking
Null pointersUse after freeArray overrun
“buf is 8 bytes”
“a is null”
Already knew a was null
40
char * buf[8];
if (a)
if (a && b)
delete [] b;
*b = ‘x’;
END
*a = *b;
!a
!(a && b)
Apply Checking
Null pointersUse after freeArray overrun
“buf is 8 bytes”
“a is null”
“b is deleted”
41
char * buf[8];
if (a)
if (a && b)
delete [] b;
*b = ‘x’;
END
*a = *b;
!a
!(a && b)
Apply Checking
Null pointersUse after freeArray overrun
“buf is 8 bytes”
“a is null”
“b is deleted”
“b dereferenced!”
42
char * buf[8];
if (a)
if (a && b)
delete [] b;
*b = ‘x’;
END
*a = *b;
!a
!(a && b)
Apply Checking
Null pointersUse after freeArray overrun
“buf is 8 bytes”
“a is null”
“b is deleted”
“b dereferenced!”
No more errorsreported for b
43
False Positives
•What is a bug? Something the user will fix.
•Many sources of false positives• False paths• Idioms• Execution environment assumptions• Killpaths• Conditional compilation• “third party code”• Analysis imprecision• …
44
char * buf[8];
if (a)
b = new char [5]; if (a && b)
buf[8] = a; delete [] b;
*b = ‘x’;
END
*a = *b;
a !a
a && b !(a && b)
A False Path
45
char * buf[8];
if (a)
if (a && b)
buf[8] = a;
END
!a
a && b
False Path Pruning
Integer Range Disequality Branch
46
char * buf[8];
if (a)
if (a && b)
buf[8] = a;
END
!a
a && b
False Path Pruning
“a in [0,0]” “a == 0 is true”
Integer Range Disequality Branch
47
char * buf[8];
if (a)
if (a && b)
buf[8] = a;
END
!a
a && b
False Path Pruning
“a in [0,0]” “a == 0 is true”
“a != 0”
Integer Range Disequality Branch
48
char * buf[8];
if (a)
if (a && b)
buf[8] = a;
END
!a
a && b
False Path Pruning
“a in [0,0]” “a == 0 is true”
“a != 0”
Impossible
Integer Range Disequality Branch
49
Environment Assumptions
•Should the return value of malloc() be checked?
int *p = malloc(sizeof(int));*p = 42;
OS Kernel:Crash machine.
File server:Pause filesystem.
Spreadsheet:Lose unsaved changes.
Game:Annoy user.
Library:?
Medical device:malloc?!
Web application:200ms downtime
IP Phone:Annoy user.
50
Statistical Analysis
•Assume the code is usually right
int *p = malloc(sizeof(int));*p = 42;
int *p = malloc(sizeof(int));if(p) *p = 42;
int *p = malloc(sizeof(int));*p = 42;
int *p = malloc(sizeof(int));*p = 42;
int *p = malloc(sizeof(int));if(p) *p = 42;
int *p = malloc(sizeof(int));*p = 42;
int *p = malloc(sizeof(int));if(p) *p = 42;
int *p = malloc(sizeof(int));if(p) *p = 42;
3/4deref
1/4deref
51
Application to Security Bugs
•Stanford research project• Ken Ashcraft and Dawson Engler, Using Programmer-Written Compiler
Extensions to Catch Security Holes, IEEE Security and Privacy 2002• Used modified compiler to find over 100 security holes in Linux and BSD• http://www.stanford.edu/~engler/
•Benefit• Capture recommended practices, known to experts, in tool available to all
Sanitize integers before use
Linux: 125 errors, 24 false; BSD: 12 errors, 4 false
array[v]while(i < v) …
v.clean Use(v)v.tainted
Syscall param
Network packet
copyin(&v, p, len)
any<= v <= any
memcpy(p, q, v)copyin(p,q,v)copyout(p,q,v)
ERROR
Warn when unchecked integers from untrusted sources reach trusting sinks
53
Example security holes
/* 2.4.9/drivers/isdn/act2000/capi.c:actcapi_dispatch */
isdn_ctrl cmd;
...
while ((skb = skb_dequeue(&card->rcvq))) {
msg = skb->data;
...
memcpy(cmd.parm.setup.phone,
msg->msg.connect_ind.addr.num,
msg->msg.connect_ind.addr.len - 1);
•Remote exploit, no checks
54
Example security holes
/* 2.4.5/drivers/char/drm/i810_dma.c */
if(copy_from_user(&d, arg, sizeof(arg)))
return –EFAULT;
if(d.idx > dma->buf_count)
return –EINVAL;
buf = dma->buflist[d.idx];
Copy_from_user(buf_priv->virtual, d.address, d.used);
•Missed lower-bound check:
55
User-pointer inference
•Problem: which are the user pointers?• Hard to determine by dataflow analysis• Easy to tell if kernel believes pointer is from user!
•Belief inference• “*p” implies safe kernel pointer• “copyin(p)/copyout(p)” implies dangerous user ptr• Error: pointer p has both beliefs.
•Implementation: 2 pass checkerinter-procedural: compute all tainted pointers
local pass to check that they are not dereferenced
56
Results for BSD and Linux
•All bugs released to implementers; most serious fixed
Gain control of system 18 15 3 3Corrupt memory 43 17 2 2Read arbitrary memory 19 14 7 7Denial of service 17 5 0 0Minor 28 1 0 0Total 125 52 12 12
Linux BSDViolation Bug Fixed Bug Fixed
Outline
• General discussion of tools– Fundamental limitations– Basic method based on abstract states
• More details on one specific method– Property checkers from Engler et al., Coverity– Sample security-related results
• Quick tour of another tool– JavaScript confinement verifier– Proves isolation; not “just” bug finding
Slides from: S. Bugrahe, A. Chou, I&T Dillig, D. Engler, …
Sandboxing Untrusted JavaScript 58
Two ProblemsSandbox Design: ensure that sandboxed code obtains access to ANY protected resource ONLY via the API
API Confinement: verify that sandboxed code cannot use the API to obtain direct access to a security-critical resource
Evolution of Standardized JavaScript• ECMA-262 3rd edition (ES3) - was around when I started this work in 2008• ECMA-262 5th edition (ES5) - released in Dec 2009
– has a strict mode (ES5-strict)
Sandboxing Untrusted JavaScript 59
Solution to the Sandboxing and API Confinement problems for ES5-strict
Plan:1. Subset SES of ES5-strict2. Sandboxing technique for untrusted SES code3. Confinement analysis technique for SES APIs4. Application: Yahoo! ADSafe
Sandboxing Untrusted JavaScript 60
The language ES5-strictES5-strict = ES3 with the following restrictions
Restriction (relative to ES3) Rationale
No delete on variable names
No prototypes for scope objects
No with
No this coercion
Safe built-in functions
No .caller, .callee on arguments object
No .caller, .arguments on function objects
No arguments and formal parameters aliasing
Lexical scoping
No ambient access to Global object
Closure-based encapsulation
Sandboxing Untrusted JavaScript 61
Scope-bounded eval
Example: eval(“function(){return x}”, x)
Explicitly list free variables of s
• Run-time restriction: Free(Parse(s)) {⊆ x1,…, xn}
• Allows an upper bound on side-effects of executing s
eval(s, x1,…, xn)
Implemented by de-sugaring to(evalnoFree(“function(x1,…, xn){“+ s +“}”))(x1,…, xn)
Sandboxing Untrusted JavaScript 62
Next few slides
API Confinement: Verify that sandboxed code cannot use the API to obtain direct access to a security-critical resource
1. Subset SES of ES5-strict2. Sandboxing technique for untrusted SES code3. Confinement analysis technique for SES APIs4. Application: Yahoo! ADSafe
JavaScript API Confinement 63
API Confinement is Complex
Resources,
DOM
f1r1
r4r3
r2
Untrusted JS
Invoke
r2
Return r2
Access r2
r3 r4
Side-effect r4
u1
Repeat
Precision-Efficiency tradeoff
APICritical, e.g.,
window.location
Sandboxing Untrusted JavaScript 64
Key Properties of API Implementations•Code is part of the trusted computing base•Small in size, relative to the application•Written in a disciplined manner•Developers have an incentive in keeping the code simple
Insights: •Conservative and scalable static analysis techniques can do well•Can soundly establish API Confinement•Can warn developers away from using complex coding patterns
Sandboxing Untrusted JavaScript 65
Challenges & Techniques
Hurdles:• Forall quantification on untrusted code • Analysis of eval(s, x1,…, xn)in general
Techniques:• Flow-Insensitive and Context-Insensitive Points-to analysis• Abstract eval(s, x1,…, xn) by the set of all statements that can be written using free variables {x1,…, xn}
Confine(t, critical): For all untrusted terms s in SESlight,
Sandboxing Untrusted JavaScript 66
Verifying Confine(t, critical)
Trusted code t
eval with free vars ”test”,“api”
Environment(Built-ins)
+
+Datalog Solver(least fixed point)
Inference Rules (SES semantics)
Stack(“test”, l) ∧Critical(l) ?
NOT CONFINED
CONFINED
true
false
Abstraction
Our decision procedure and implementation
Sandboxing Untrusted JavaScript 67
Express Analysis in Datalog (Whaley et. al.)
Program t
l1:var y = {};
l2:var x = y;
l3:x.f = y;
Facts(t)
Stack(y, l1)
Assign(x, y)
Store(x, “f”, y)
abstract
• Abstract programs as Datalog facts
• Abstract the semantics of SES as Datalog inference rules
Stack(x, l) :- Assign(x, y), Stack(y, l)Heap(l, f, m) :- Store(x, f, y), Stack(x, l), Stack(y, m)
• Execution of program t is abstracted by the least-fixed-point of Facts(t) under the inference rules
Sandboxing Untrusted JavaScript 68
Complete set of PredicatesAbstracting terms Abstracting Heaps & Stacks
Assign(x, y) Throw(l, x) Heap(l, x, m) Stack(x, l)
Load(x, y, f) Catch(l, x) Prototype(l, m) FuncType(l)
Store(x, f, y) TP(l, x) ObjType(l) ArrayType(l)
Formal(l, i, x) FormalRet(l, x) NotBuiltin(l) Critical(l)
Actual(x, i, z, y, l) Instance(l, x)
Global(x) Annotation(x, y)
Sufficient to model implicit type conversions, reflection, exceptions
Sandboxing Untrusted JavaScript 69
Next few slides1. Subset SES of ES5-strict2. Sandboxing technique for untrusted SES code3. Confinement analysis technique for SES APIs4. Application: Yahoo! ADSafe
Implementation: We implemented the decision procedure in the form of an automated tool ENCAP• built on top of Datalog engine: bddbddb• available online at: http://code.google.com/p/es-lab/Encap
JavaScript API Confinement 70
Application: Yahoo! ADSafe
Analysis (1st attempt)• annotated the API implementation (2000
LOC)• desugared it to SES and ran ENCAP• obtained NOT CONFINED
- culprits: ADSAFE.go and ADSAFE.lib
Sandbox: JSLint Filter
API: ADSAFE object
ADSafe: Mechanism for safely embedding ads This was an actual exploit
Analysis (2nd attempt)• fixed this bug• ran ENCAP again• obtained CONFINED
Result: ADSafe DOM API safely confines DOM objects under the SES threat model, assuming the annotations hold
Summary
•General discussion of tools− Fundamental limitations− Basic method based on abstract states
•More details on one specific method− Property checkers from Engler et al., Coverity− Sample security-related results
•Quick tour of JavaScript tool− JavaScript confinement verifier− Proves isolation; not “just” bug finding
Slides from: S. Bugrahe, A. Chou, I&T Dillig, D. Engler, A. Taly, …