83
PROGRAM ANALYSIS AND CYBER SECURITY Noam Rinetzky Slides credit: Tom Ball, Dawson Engler, Roman Manevich, Erik Poll, Mooly Sagiv, Jean Souyris, Eran Tromer, Avishai Wool, Eran Yahav

Program Analysis and Cyber Security

  • Upload
    ruby

  • View
    50

  • Download
    1

Embed Size (px)

DESCRIPTION

Noam Rinetzky Slides credit: Tom Ball, Dawson Engler , Roman Manevich , Erik Poll, Mooly Sagiv , Jean Souyris , Eran Tromer , Avishai Wool, Eran Yahav. Program Analysis and Cyber Security. Software is Everywhere. Software is Everywhere. Software is Everywhere. Unreliable. - PowerPoint PPT Presentation

Citation preview

Page 1: Program Analysis  and Cyber Security

PROGRAM ANALYSIS AND CYBER SECURITY

Noam Rinetzky

Slides credit: Tom Ball, Dawson Engler, Roman Manevich, Erik Poll, Mooly Sagiv, Jean Souyris, Eran Tromer, Avishai Wool, Eran Yahav

Page 2: Program Analysis  and Cyber Security

Software is Everywhere

Page 3: Program Analysis  and Cyber Security

Software is Everywhere

Page 4: Program Analysis  and Cyber Security

Software is Everywhere

Unreliable

Page 5: Program Analysis  and Cyber Security
Page 6: Program Analysis  and Cyber Security

December 31, 2008

Page 7: Program Analysis  and Cyber Security

Zune bug

1 while (days > 365) { 2 if (IsLeapYear(year)) { 3 if (days > 366) { 4 days -= 366; 5 year += 1; 6 } 7 } else { 8 days -= 365; 9 year += 1; 10 } 11 }

7December 31, 2008

Page 8: Program Analysis  and Cyber Security

Zune bug

1 while (366 > 365) { 2 if (IsLeapYear(2008)) { 3 if (366 > 366) { 4 days -= 366; 5 year += 1; 6 } 7 } else { 8 days -= 365; 9 year += 1; 10 } 11 }

8

Suggested solution: wait for tomorrow December 31, 2008

Page 9: Program Analysis  and Cyber Security

Therac-25 leads to 3 deaths and 3 injuries

Software error exposes patients to radiation overdose (100X of intended dose)

1985 to 1987

Page 10: Program Analysis  and Cyber Security

10February 25, 1991

Page 11: Program Analysis  and Cyber Security

11

Patriot Bug - Rounding Error Time measured in 1/10 seconds Binary expansion of 1/10:

0.0001100110011001100110011001100.... 24-bit register

0.00011001100110011001100 error of

0.0000000000000000000000011001100... binary, or ~0.000000095 decimal

After 100 hours of operation error is 0.000000095×100×3600×10=0.34

A Scud travels at about 1,676 meters per second, and so travels more than half a kilometer in this time

Suggested solution: reboot every 10 hours

Page 12: Program Analysis  and Cyber Security

Northeast Blackout

14 August, 2003

Page 13: Program Analysis  and Cyber Security

Toyota recalls 160,000 Prius hybrid vehicles

Programming error can activate all warning lights, causing the car to think its engine has failed October 2005

Page 14: Program Analysis  and Cyber Security

Boeing's 787 Vulnerable to Hacker Attack

security vulnerability in onboard computer networks could allow passengers to access the plane's control systems

January 2008

Page 15: Program Analysis  and Cyber Security

Unreliable Software is Exploitable

The Sony PlayStation Network

breach: An identity-theft bonanza

Massive Sony PlayStation data breach

puts about 77 million people at higher

risk of fraud

(April 2011)

RSA hacked, information leaks

RSA's corporate network suffered

what RSA describes as a successful

advanced persistent threat attack,

and "certain information" was

stolen that can somehow affect

the security of SecurID authentication(March 2011)

Stuxnet Worm Still Out of Control at Iran's Nuclear Sites, Experts Say

The Stuxnet worm, named after initials found in its code, is the most sophisticated cyberweapon ever created.(December 2010)Security Advisory for Adobe Flash Player, Adobe

Reader and Acrobat

This vulnerability could cause a crash and potentially allow

an attacker to take control of the affected system.

There are reports that this vulnerability is being exploited in

the wild in targeted attacks via a Flash (.swf) file embedded

in a Microsoft Excel (.xls) file delivered as an email

attachment.

(March 2011)

RSA tokens may be behind major network security problems at Lockheed MartinLockheed Martin remote access network, protected by SecurID tokens, has been shut down(May 2011)

Page 16: Program Analysis  and Cyber Security

Percentage of Remotely Exploitable Vulnerabilities

(source: IBM X-Force)

Page 17: Program Analysis  and Cyber Security

Buffer Overflow Exploits

void foo (char *x) { char buf[2]; strcpy(buf, x); } int main (int argc, char *argv[]) { foo(argv[1]); }

./a.out abracadabraSegmentation fault

Stack grows this way

Memory addresses

Previous frameReturn address

Saved FPchar* xbuf[2]

abracadabr

Page 18: Program Analysis  and Cyber Security

Buffer Overflow Exploits int check_authentication(char *password) { int auth_flag = 0; char password_buffer[16];

strcpy(password_buffer, password); if(strcmp(password_buffer, ”pass1") == 0) auth_flag = 1; if(strcmp(password_buffer, ”pass2") == 0) auth_flag = 1; return auth_flag;}int main(int argc, char *argv[]) { if(check_authentication(argv[1])) { printf("\n-=-=-=-=-=-=-=-=-=-=-=-=-=-\n"); printf(" Access Granted.\n"); printf("-=-=-=-=-=-=-=-=-=-=-=-=-=-\n"); } else printf("\nAccess Denied.\n"); }

(source: “hacking – the art of exploitation, 2nd Ed”)

Page 19: Program Analysis  and Cyber Security

Input Validation

Applicationevil input

1234567890123456 -=-=-=-=-=-=-=-=-=-=-=-=-=- Access Granted. -=-=-=-=-=-=-=-=-=-=-=-=-=-

Page 20: Program Analysis  and Cyber Security

What can we do about it?

Page 21: Program Analysis  and Cyber Security

August 13, 2003

I just want to say LOVE YOU SAN!!soo much

(W32.Blaster.Worm / Lovesan worm)

21

Billy Gates why do you make this possible ? Stop making moneyand fix your software!!

What can we do about it?

Page 22: Program Analysis  and Cyber Security

What can we do about it?

Monitoring Testing Static Analysis Formal Verification

Run time

Design Time

Page 23: Program Analysis  and Cyber Security

Monitoring (runtime defenses)

StackGuard ProPolice PointGuard Security monitors (ptrace)

OS Kernel

monitoredapplication

(Outlook)monitor

user space

open(“/etc/passwd”, “r”)

Page 24: Program Analysis  and Cyber Security

Testing

build it; try it on a some inputs

printf (“x == 0 => should not get that!”)

Page 25: Program Analysis  and Cyber Security

Testing

Valgrind memory errors, race conditions, taint analysis Simulated CPU Shadow memory

Invalid read of size 4 at 0x40F6BBCC: (within /usr/lib/libpng.so.2.1.0.9) by 0x40F6B804: (within /usr/lib/libpng.so.2.1.0.9) by 0x40B07FF4: read_png_image(QImageIO *) (kernel/qpngio.cpp:326) by 0x40AC751B: QImageIO::read() (kernel/qimage.cpp:3621) Address 0xBFFFF0E0 is not stack'd, malloc'd or free'd

Page 26: Program Analysis  and Cyber Security

Testing

Valgrind memory errors, race conditions

Parasoft Jtest/Insure++ memory errors + visualizer, race conditions, exceptions …

IBM Rational Purify memory errors IBM PureCoverage detect untested

paths Daikon dynamic invariant detection

Page 27: Program Analysis  and Cyber Security

Testing

Useful and challenging Random inputs Guided testing (coverage) Bug reproducing

But … Observe some program behaviors What can you say about other

behaviors?

Page 28: Program Analysis  and Cyber Security

Formal verification

Mathematical model of software : Var Z = [x0, y1]

Logical specification { 0 < x } = { ε State | 0 < (x) }

Machine checked formal proofs

{ 0 < x } y:= x ; y:=y+1 { 1 < y }

{ 0 < x } y:= x { 0 < x ∧ y = x } { 0 < y } y:= y+1 { 1< y } { ? }

{ 0 < x ∧ y = x } → { 0 < y }

Page 29: Program Analysis  and Cyber Security

Formal verification

Mathematical model of software State = Var Integer S = [x0, y1]

Logical specification { 0 < x } = { S ε State | 0 < S(x) }

Machine checked formal proofs

{ P } stmt1 { Q’ } { P’ } stmt2 { Q } { Q’ } → { P’ }

{ P } stmt1; stmt2 { Q }

Page 30: Program Analysis  and Cyber Security

L4.verified [Klein+,’09]

Microkernel IPC, Threads, Scheduling, Memory management

Functional correctness (using Isabelle/HOL)+ No null pointer de-references.+ No memory leaks. + No buffer overflows.+ No unchecked user arguments+ …

Kernel/proof co-design Implementation - 2.5 py (8,700 LOC) Proof – 20 py (200,000 LOP)

Page 31: Program Analysis  and Cyber Security

What can we do about it?

Monitoring Testing Static Analysis Formal Verification

Run time

Design Time

Page 32: Program Analysis  and Cyber Security

Static Analysis

Lightweight formal verification

Formalize software behavior in a mathematical model (semantics)

Prove (selected) properties of the mathematical model Automatically, typically with

approximation of the formal semantics

Page 33: Program Analysis  and Cyber Security

Why static analysis?

Some errors are hard to find by testing arise in unusual circumstances/uncommon

execution paths buffer overruns, unvalidated input, exceptions, ...

involve non-determinism race conditions

Full-blown formal verification too expensive

Page 34: Program Analysis  and Cyber Security

What is Static analysis

“The algorithmic discovery of properties of a program by inspection of its source text1” -- Manna, Pnueli

1 Does not have to literally be the source text, just means w/o running it

Develop theory and tools for program correctness and robustness

Reason statically (at compile time) about the possible runtime behaviors of a program

Page 35: Program Analysis  and Cyber Security

Static Analysis

x = ?if (x > 0) { y = 42;} else { y = 73; foo();} assert (y == 42);

Bad news: problem is generally undecidable

Page 36: Program Analysis  and Cyber Security

universe

Static Analysis

Central idea: use approximation

Under Approximation

Exact set of configurations/behaviors

Over Approximation

Page 37: Program Analysis  and Cyber Security

Over Approximation

x = ?if (x > 0) { y = 42;} else { y = 73; foo();} assert (y == 42);

Over approximation: assertion may be violated

Page 38: Program Analysis  and Cyber Security

Lose precision only when required Understand where precision is lost

Precision

main(…) { printf(“assertion may be vioalted\n”);}

Page 39: Program Analysis  and Cyber Security

Examplemain(int i) { int x=3,y=1;

do { y = y + 1; } while(--i > 0) assert 0 < x + y}

Determine what states can arise during any execution

Challenge: set of states is unbounded

Page 40: Program Analysis  and Cyber Security

Abstract Interpretation [Cousot,’79] main(int i) { int x=3,y=1;

do { y = y + 1; } while(--i > 0) assert 0 < x + y}

Recipe1)Abstraction2)Transformers3)Exploration

Challenge: set of states is unbounded Solution: compute a bounded representation of (a superset) of program states

Determine what states can arise during any execution

Page 41: Program Analysis  and Cyber Security

1) Abstractionmain(int i) { int x=3,y=1;

do { y = y + 1; } while(--i > 0) assert 0 < x + y}

concrete state

abstract state

: Var Z

#: Var{+, 0, -, ?}

x y i

3 1 7 x y i

+ + +

3 2 6

x y i

Page 42: Program Analysis  and Cyber Security

2) Transformersmain(int i) { int x=3,y=1;

do { y = y + 1; } while(--i > 0) assert 0 < x + y}

concrete transformer

abstract transformer x y i

+ + 0

x y i

3 1 0y = y + 1

x y i

3 2 0

x y i

+ + 0

y = y + 1

+ - 0 + ? 0

+ 0 0 + + 0

+ ? 0 + ? 0

Page 43: Program Analysis  and Cyber Security

3) Exploration

+ + ? + + ?

x y i

main(int i) { int x=3,y=1;

do { y = y + 1; } while(--i > 0) assert 0 < x + y}

+ + ?

+ + ?

? ? ?

x y i

+ + ?

+ + ?

+ + ?

+ + ?

+ + ?

+ + ?

Page 44: Program Analysis  and Cyber Security

main(int i) { int x=3,y=1;

do { y = y - 2; y = y + 3; } while(--i > 0) assert 0 < x + y;}

+ ? ?

+ ? ?

x y i

+ ? ?

+ + ?

? ? ?

44

3) Exploration’

False alarms (false

positive)

Page 45: Program Analysis  and Cyber Security

Goal: exploring program states

initialstates

badstates

45

reachablestates

Page 46: Program Analysis  and Cyber Security

Technique: explore abstract states

initialstates

badstates

46

reachablestates

Page 47: Program Analysis  and Cyber Security

Technique: explore abstract states

initialstates

badstates

47

reachablestates

Page 48: Program Analysis  and Cyber Security

Technique: explore abstract states

initialstates

badstates

48

reachablestates

Page 49: Program Analysis  and Cyber Security

Technique: explore abstract states

initialstates

badstates

49

reachablestates

Page 50: Program Analysis  and Cyber Security

50

Sound: cover all reachable states

initialstates

badstates

reachablestates

Page 51: Program Analysis  and Cyber Security

51

Imprecise abstraction

initialstates

badstates

51

reachablestates False alarms

(false positive)

Page 52: Program Analysis  and Cyber Security

52

Testing is unsound: miss some reachable states

initialstates

badstates

reachablestates

Page 53: Program Analysis  and Cyber Security

53

Testing is unsound: miss some reachable errors

initialstates

badstates

reachablestates

False negatives(Unsound)

No false positives

(complete)

Page 54: Program Analysis  and Cyber Security

How to find “the right” abstraction?

Pick an abstract domain suited for your property Numerical domains Domains for reasoning about the heap …

Combination of abstract domains

54

Page 55: Program Analysis  and Cyber Security

55

Intervals Abstraction

0 2 312345

4

6

x

y

1

y [3,6]

x [1,4]

Page 56: Program Analysis  and Cyber Security

56

Interval Lattice

[0,0][-1,-1][-2,-2]

[-2,-1]

[-2,0]

[1,1] [2,2]

[-1,0] [0,1] [1,2]…

[-1,1] [0,2]

[-2,1] [-1,2]

[-2,2]

……

[2,]…

[1,]

[0,]

[-1,]

[-2,]

[- ,]

[- ,-2]…

[-,-1]

[- ,0]

[-,1]

[- ,2]

(infinite lattice, infinite height)

Page 57: Program Analysis  and Cyber Security

57

Example

int x = 0;if (?) x++;if (?) x++;

x [0,0]

x

x [0,1]

x [0,2]

x=0

ifx++

ifx++

exit

x [0,0] x [1,1]

x [1,2]

[a1,a2] [b1,b2] = [min(a1,b1), max(a2,b2)]

Page 58: Program Analysis  and Cyber Security

58

Polyhedral Abstraction

abstract state is an intersection of linear inequalities of the form a1x2+a2x2+…anxn c

represent a set of points by their convex hull

(image from http://www.cs.sunysb.edu/~algorith/files/convex-hull.shtml)

Page 59: Program Analysis  and Cyber Security

59

McCarthy 91 functionproc MC (n : int) returns (r : int) var t1 : int, t2 : int;begin if n > 100 then r = n - 10; else t1 = n + 11; t2 = MC(t1); r = MC(t2); endif; end

var a : int, b : int;begin /* top */ b = MC(a); end

if (n>=101) then n-10 else 91

Page 60: Program Analysis  and Cyber Security

60

McCarthy 91 functionproc MC (n : int) returns (r : int) var t1 : int, t2 : int;begin /* top */ if n > 100 then /* [|n-101>=0|] */ r = n - 10; /* [|-n+r+10=0; n-101>=0|] */ else /* [|-n+100>=0|] */ t1 = n + 11; /* [|-n+t1-11=0; -n+100>=0|] */ t2 = MC(t1); /* [|-n+t1-11=0; -n+100>=0; -n+t2-1>=0; t2-91>=0|] */ r = MC(t2); /* [|-n+t1-11=0; -n+100>=0; -n+t2-1>=0; t2-91>=0; r-t2+10>=0; r-91>=0|] */ endif; /* [|-n+r+10>=0; r-91>=0|] */end

var a : int, b : int;begin /* top */ b = MC(a); /* [|-a+b+10>=0; b-91>=0|] */end

if (n>=101) then n-10 else 91

if (n>=101) then n-10 else 91

Page 61: Program Analysis  and Cyber Security

Following the recipe (in a nutshell)

1) Abstraction

Concrete state Abstract statex

t n n n

xt

n

2) Transformers

n

xt

nt n

x n

t->n = x

61

Page 62: Program Analysis  and Cyber Security

Example: shape (heap) analysis

tx

n

xt n

xt n n

xt n n

xtt

x

ntt

ntx

tx

tx

emp

void stack-init(int i) { // test for i = 4 Node* x = null;

do {

Node t = malloc(…)

t->n = x;

x = t;

} while(--i>0)

Top = x;

} assert(acyclic(Top))

tx

n n

xt n n

xt n n n

xt n n n

xt n n n

top62

Page 63: Program Analysis  and Cyber Security

xt n n

tx

n

xt n

xt n n

xtt

xn

tt

ntx

tx

tx

emp

xt n

n

xt n

nn

xt

n

t n

x n

xt n

n

3) Exploration

xt n

Top n

ntx Top

tx Top x

t n

Top n

void stack-init(int i) { Node* x = null;

do {

Node t = malloc(…)

t->n = x;

x = t;

} while(--i>0)

Top = x;

}

assert(acyclic(Top))63

Page 64: Program Analysis  and Cyber Security

Astree [Cousot+,’02-05]

Prove absence of runtime errors in safety critical C code Synchronous, sequential programs

(Avionics) Properties: division by 0, floating point

overflow, …

Verified primary flight control software of the Airbus A340 fly-by-wire system (132,000 LOC)

Page 65: Program Analysis  and Cyber Security

Astree [Cousot+,’02-05]

Airbus code

Analysed program k LOC False alarms Analysis time

Sequential 1 18 3 1 h 14

Sequential 2 37 2 10 min

Synchronous 1 100 3 7 h 15

Synchronous 2 76 0 6 h

Synchronous 3 500 2 30 h

Metrics (2.6 GHz, 16 Gb RAM PC)

Page 66: Program Analysis  and Cyber Security

Driver’s Source Code in C

PreciseAPI Usage Rules(SLIC)

Defects

100% pathcoverage

Rules

Static Driver Verifier

Environment model

Static Driver VerifierSLAM

Page 67: Program Analysis  and Cyber Security

State machine for locking

Unlocked Locked

Error

Rel Acq

Acq

Rel

state { enum {Locked,Unlocked}

s = Unlocked;}

KeAcquireSpinLock.entry { if (s==Locked) abort; else s = Locked;}

KeReleaseSpinLock.entry { if (s==Unlocked) abort; else s = Unlocked;}

SLAMLocking rule in SLIC

Page 68: Program Analysis  and Cyber Security

SLAM (now SDV) [Ball+,’11] 100 drivers and 80 SLIC rules.

The largest driver ~ 30,000 LOC

Total size ~450,000 LOC

The total runtime for the 8,000 runs (driver x rule)

30 hours on an 8-core machine

20 mins. Timeout

Useful results (bug / pass) on over 97% of the runs

Caveats: pointers (imprecise) & concurrency (ignores)

Page 69: Program Analysis  and Cyber Security

69

Scaling

initialstates

badstates

reachablestates

false positives

initialstates

badstates

reachablestates false

negatives

Sound Complete

Page 70: Program Analysis  and Cyber Security

70

initialstates

badstates

reachablestates

false negative

s

false positives

Unsound static analysis

Page 71: Program Analysis  and Cyber Security

Unsound static analysis

Static analysis No code execution

Trade soundness for scalability Do not cover all execution paths But cover “many”

Page 72: Program Analysis  and Cyber Security

FindBugs [Pugh+,’04] Analyze Java programs (bytecode) Looks for “bug patterns”

Bug patterns Method() vs method() Override equal(…) but not hashCode() Unchecked return values Null pointer dereference

App17 KLOC NP bugs Other Bugs Bad Practice Dodgy

Sun JDK 1.7 597 68 180 594 654Eclipse 3.3 1447 146 259 1079 653Netbeans 6 1022 189 305 3010 1112glassfish 2176 146 154 964 1222jboss 178 30 57 263 214

Page 73: Program Analysis  and Cyber Security

PREfix [Pincus+,’00]

Developed by Pinucs, purchased by Microsoft

Automatic analysis of C/C++ code Memory errors, divide by zero Inter-procedural bottom-up analysis Heuristic - choose “100” paths Minimize effect of false positiveProgram KLOC Time Mozilla browser

540 11h

Apache 49 15m

2-5 warnings per KLOC

Page 74: Program Analysis  and Cyber Security

PREfast

Analyze Microsoft kernel code + device drivers Memory errors, races,

Part of Microsoft visual studio

Intra-procedural analysis

User annotationsmemcpy( __out_bcount( length ) dest, __in_bcount( length ) src, length );PREfix + PREfast found 1/6 of bugs fixed in Windows Server’03

Page 75: Program Analysis  and Cyber Security

Coverity [Engler+, ‘04]

Looks for bug patterns Enable/disable interrupts, double

locking, double locking, buffer overflow, …

Learns patterns from common

Robust & scalable 150 open source program -6,000 bugs Unintended acceleration in Toyota

Page 76: Program Analysis  and Cyber Security

Summary

Cyber-security threats are real, and here to stay

Software is the new battlefront

Automatic program analysis techniques are critical both for defense and offense

Page 77: Program Analysis  and Cyber Security

Sound SA vs. Testing

Can find rare errorsCan raise false alarms

Cost ~ program’s complexity

Can handle limited classes of programs and still be useful

Can miss errorsFinds real errors

Cost ~ program’s execution

No need to efficiently handle rare cases

Sound SA Testing

78

Can miss errors Can raise false alarms

Cost ~ program’s complexity

No need to efficiently handle rare cases

Unsound SA

Page 78: Program Analysis  and Cyber Security

Sound SA vs. Formal verification

Fully automatic

Applicable to a programming language

Can be very imprecise May yield false alarms

Requires specification and loop invariants

Program specific

Relatively complete Provides counter

examples Provides useful

documentation Can be mechanized

using theorem provers

Sound Static Analysis Formal verification

79

Page 79: Program Analysis  and Cyber Security

Bill Gates, WinHec’02

Things like even software verification, this has been the Holy Grail of computer science for many decades but now in some very key areas, for example, driver verification we’re building tools that can do actual proof about the software and how it works in order to guarantee the reliability.

Page 80: Program Analysis  and Cyber Security

Conclusions

Tool helps Find significant bugs

improve reliability Improve productivity

Long way to go Limited classes of bugs Precision - false positive render tool

useless Scaling – program analysis is difficult

Part of a larger picture

Page 81: Program Analysis  and Cyber Security

More

Verification condition generators Verifying the verifiers Pointer analysis Concurrent + distributed systems Learning abstractions Symbolic execution Model checking SMT solvers

Page 82: Program Analysis  and Cyber Security

Additional security applications of static analysis and formal verification

Automatic Test/Exploit Generation [Avgerinos Cha Hao Brumley ‘11] Perform static analysis to search for potential vulnerabilities,

generate an initial input that triggers the bug, and generate exploit input

Automatic Patch-based Exploit Generation Automatically generate exploits from the patch binary and the

original vulnerable program binary and sometimes in minutes of time

Automatic Malware Dissection and Trigger-based Behavior Analysis Automatic exploration of program execution paths in malware

to uncover trigger conditions (such as the time used in time bombs and commands in botnet programs) and trigger-based behavior

See BitBlaze project: http://bitblaze.cs.berkeley.edu

Page 83: Program Analysis  and Cyber Security

The End