25
Securing software by enforcing data-flow integrity Manuel Costa Joint work with: Miguel Castro, Tim Harris Microsoft Research Cambridge University of Cambridge

Securing software by enforcing data-flow integrity

  • Upload
    ervin

  • View
    54

  • Download
    2

Embed Size (px)

DESCRIPTION

Securing software by enforcing data-flow integrity. Manuel Costa Joint work with: Miguel Castro, Tim Harris Microsoft Research Cambridge University of Cambridge. Software is vulnerable. use of unsafe languages is prevalent most “packaged” software written in C/C++ many software defects - PowerPoint PPT Presentation

Citation preview

Page 1: Securing software by enforcing data-flow integrity

Securing software by enforcing data-flow integrity

Manuel Costa

Joint work with:Miguel Castro, Tim Harris

Microsoft Research CambridgeUniversity of Cambridge

Page 2: Securing software by enforcing data-flow integrity

Software is vulnerable

• use of unsafe languages is prevalent– most “packaged” software written in C/C++

• many software defects– buffer overflows, format strings, double frees

• many ways to exploit defects– corrupt control-data: stack, function pointers– corrupt non-control-data:

function arguments, security variables

defects are routinely exploited

Page 3: Securing software by enforcing data-flow integrity

Approaches to securing software• remove/avoid all defects is hard• prevent control-data exploits

– protect specific control-data StackGuard, PointGuard– detect control-flow anomalies Program Shepherding, CFI– attacks can succeed without corrupting control-flow

• prevent non-control-data exploits– bounds checking on all pointer dereferences CRED– detect unsafe uses of network data

Vigilante, [Suh04], Minos, TaintCheck, [Chen05], Argos, [Ho06]– expensive in software

no good solutions to prevent non-control-data exploits

Page 4: Securing software by enforcing data-flow integrity

Data-flow integrity enforcement

• compute data-flow in the program statically– for every load, compute the set of stores that

may produce the loaded data

• enforce data-flow at runtime– when loading data, check that it came from an

allowed store

• optimize enforcement with static analysis

Page 5: Securing software by enforcing data-flow integrity

Data-flow integrity: advantages

• broad coverage– detects control-data and non-control-data attacks

• automatic– extracts policy from unmodified programs

• no false positives– only detects real errors (malicious or not)

• good performance– low runtime overhead

Page 6: Securing software by enforcing data-flow integrity

Outline

• data-flow integrity enforcement

• optimizations

• results

Page 7: Securing software by enforcing data-flow integrity

Data-flow integrity• at compile time, compute reaching definitions

– assign an id to every store instruction– assign a set of allowed source ids to every load

• at runtime, check actual definition that reaches a load– runtime definitions table (RDT) records id of last store to each

address – on store(value,address): set RDT[address] to store’s id– on load(address): check if RDT[address] is one of the allowed

source ids

• protect RDT with software-based fault isolation

Page 8: Securing software by enforcing data-flow integrity

Example vulnerable program

• non-control-data attack

• very similar to a real attack on a SSH server

buffer overflow in this function allows the attacker to set authenticated to 1

int authenticated = 0;char packet[1000];

while (!authenticated) { PacketRead(packet);

if (Authenticate(packet)) authenticated = 1;} if (authenticated) ProcessPacket(packet);

Page 9: Securing software by enforcing data-flow integrity

Static analysis

• computes data flows conservatively– flow-sensitive intraprocedural analysis– flow-insensitive interprocedural analysis

• uses Andersen’s points-to algorithm• scales to very large programs

• same assumptions as analysis for optimization– pointer arithmetic cannot navigate between

independent objects– these are the assumptions that attacks violate

Page 10: Securing software by enforcing data-flow integrity

Instrumentation

check that authenticated was written here or here

SETDEF authenticated 1int authenticated = 0;char packet[1000];

while (CHECKDEF authenticated in {1,8} !authenticated) { PacketRead(packet);

if (Authenticate(packet)){ SETDEF authenticated 8 authenticated = 1;

}}CHECKDEF authenticated in {1,8} if (authenticated) ProcessPacket(packet);

Page 11: Securing software by enforcing data-flow integrity

SETDEF authenticated 1int authenticated = 0;char packet[1000];

while (CHECKDEF authenticated in {1,8} !authenticated) { PacketRead(packet);

if (Authenticate(packet)){ SETDEF authenticated 8 authenticated = 1;

}}CHECKDEF authenticated in {1,8} if (authenticated) ProcessPacket(packet);

Runtime: detecting the attack

RDT slot for authenticated

authenticatedstored here

stores disallowed above 0x40000000

10

Memory layout

17

Vulnerable program

Attack detected!definition 7 not in

{1,8}

Page 12: Securing software by enforcing data-flow integrity

Also prevents control-data attacks

• user-visible control-data (function pointers,…)– handled as any other data

• compiler-generated control-data– instrument definitions and uses of this new data– e.g., enforce that the definition reaching a ret is

generated by the corresponding call

Page 13: Securing software by enforcing data-flow integrity

Efficient instrumentation: SETDEF

lea ecx,[_authenticated] test ecx,0C0000000h je L int 3L: shr ecx,2 mov word ptr [ecx*2+40001000h],1

• SETDEF _authenticated 1 is compiled to:

get address of variable

prevent RDT tampering

set RDT[address] to 1

Page 14: Securing software by enforcing data-flow integrity

Efficient instrumentation: CHECKDEF

lea ecx,[_authenticated] shr ecx,2 mov cx, word ptr [ecx*2+40001000h]

cmp cx, 1 je L cmp cx,8 je L int 3 L:

•CHECKDEF _authenticated {1,8} is compiled to:

get address of variable

get definition id from RDT[address]

check definition in {1,8}

Page 15: Securing software by enforcing data-flow integrity

SETDEF authenticated 1int authenticated = 0;char packet[1000];while (CHECKDEF authenticated in {1,8} !authenticated) { PacketRead(packet); if (Authenticate(packet)){ SETDEF authenticated 8 authenticated = 1;

}}CHECKDEF authenticated in {1,8} if (authenticated) ProcessPacket(packet);

SETDEF authenticated 1int authenticated = 0;char packet[1000];while (CHECKDEF authenticated in {1} !authenticated) { PacketRead(packet); if (Authenticate(packet)){ SETDEF authenticated 1 authenticated = 1;

}}CHECKDEF authenticated in {1} if (authenticated) ProcessPacket(packet);

Optimization: renaming definitions• definitions with the same set of uses share one id

Page 16: Securing software by enforcing data-flow integrity

Other optimizations• removing SETDEFs and CHECKDEFs

– eliminate CHECKDEFs that always succeed– eliminate redundant SETDEFs– uses static analysis, but does not rely on any

assumptions that may be violated by attacks

• remove bounds checks on safe writes• optimize set membership checks

– check consecutive ids using a single comparison

Page 17: Securing software by enforcing data-flow integrity

Evaluation

• overhead on SPEC CPU and Web benchmarks• contributions of optimizations• ability to prevent attacks on real programs

Page 18: Securing software by enforcing data-flow integrity

Runtime overhead

0

0.5

1

1.5

2

2.5

3

gzip vpr mcf crafty bzip2 twolf art equake ammp

no

rmal

ized

exe

cuti

on

tim

e

baseintraproc DFIinterproc DFI

Page 19: Securing software by enforcing data-flow integrity

Memory overhead

0.0

0.5

1.0

1.5

2.0

gzip vpr mcf crafty bzip2 twolf art equake ammp

no

rmal

ized

mem

ory

usa

ge

Page 20: Securing software by enforcing data-flow integrity

Contribution of optimizations

1

10

100

1000

gzip vpr mcf crafty bzip2 twolf art equake ammp

no

rmal

ized

exe

cuti

on

tim

e without renaming

with renaming

with all optimizations

Page 21: Securing software by enforcing data-flow integrity

Overhead on SPEC Web

0.0

0.2

0.4

0.6

0.8

1.0

1.2

1.4

10 20 30 40 50 60 70 80

number of clients

no

rmal

ized

th

rou

gh

pu

t

data-flow integrity

maximum overhead of 23%

Page 22: Securing software by enforcing data-flow integrity

Preventing real attacks

Application Vulnerability Exploit Detected?

NullHttpd heap-based buffer overflow

overwrite cgi-bin configuration data

yes

SSH integer overflow and heap-based buffer overflow

overwrite authenticated variable

yes

STunnel format string overwrite return address

yes

Ghttpd stack-based buffer overflow

overwrite return address

yes

Page 23: Securing software by enforcing data-flow integrity

Conclusion

• enforcing data-flow integrity protects software from attacks– handles non-control-data and control-data attacks– works with unmodified C/C++ programs– no false positives– low runtime and memory overhead

Page 24: Securing software by enforcing data-flow integrity

Overhead breakdown

0%

20%

40%

60%

80%

100%

gzip vpr mcf crafty bzip2 twolf art equake ammp

% c

on

trib

uti

on

to

ove

rhea

d

store id bounds check load id compare id

Page 25: Securing software by enforcing data-flow integrity

Contribution of optimizations

0%

20%

40%

60%

80%

100%

gzip vpr mcf crafty bzip2 twolf art equake ammp

% c

on

trib

uti

on

to

op

tim

izat

ion

remove bounds check remove setdef/checkdefoptimize membership test move setdefs