Analysis Of Stripped Binary Code

Laune Harris

University of Wisconsin – Madison

[email protected]

www.paradyn.org


2

856c: 55
856d: 89e5
856f: 83ec08
8572: e8ddffffff
857b: c9
857c: c3
857d: 55
857e: 89e5
8581: 83ec18
858b: e8bfffffff
8591: c9
8592: c3

Binary code

3

856c: 55            push %ebp
856d: 89e5          mov %esp, %ebp
856f: 83ec08        sub 8, %esp
8572: e8ddffffff    call 857d
857b: c9            leave
857c: c3            ret
857d: 55            push %ebp
857e: 89e5          mov %esp, %ebp
8581: 83ec18        sub %eax, %ebp
858b: e8bfffffff    call 866c
8591: c9            leave
8592: c3            ret

Binary code (with assembly)

4

main:
856c: 55            push %ebp
856d: 89e5          mov %esp, %ebp
856f: 83ec08        sub 8, %esp
8572: e8ddffffff    call foo
857b: c9            leave
857c: c3            ret

foo:
857d: 55            push %ebp
857e: 89e5          mov %esp, %ebp
8581: 83ec18        sub %eax, %ebp
858b: e8bfffffff    call printf
8591: c9            leave
8592: c3            ret

Binary code (with symbol info)

5

A lot of code is stripped

•Commercial applications (usually)

•Proprietary libraries (often)

•Viruses

•OS libraries and utilities (depends on OS and OS version)

6

Steps in symbol reconstruction

•Find and name functions

•Find function size

7

Finding functions

•Build a call graph and traverse it to find function start addresses

•Opportunistic parsing: use existing symbol names and addresses where available

•Works on a spectrum of binaries ranging from binaries with all symbols to fully stripped binaries

8

main:
856c: push %ebp

Call Graph creation

9

main:
856c: push %ebp
856d: mov %esp, %ebp
856f: sub 8, %esp
8572: call 857d
857b: leave
857c: ret

Call Graph creation

10

main:
856c: push %ebp
856d: mov %esp, %ebp
856f: sub 8, %esp
8572: call func857d
857b: leave
857c: ret

func857d:
857d: push %ebp

Call Graph creation

11

main:
856c: push %ebp
856d: mov %esp, %ebp
856f: sub 8, %esp
8572: call func857d
857b: leave
857c: ret

func857d:
857d: push %ebp
857e: mov %esp, %ebp
8581: sub %eax, %ebp
858b: call 865e
8591: call 866d
8596: leave
8597: ret

Call Graph creation
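The walk-through above can be sketched as a worklist traversal. This is only an illustrative sketch, not the parser's actual implementation: the `CODE` map is a hypothetical, already-decoded stand-in for the slide's listing, and `discover_functions` is an assumed name. Addresses with a known symbol keep that name (opportunistic parsing); every other call target gets a synthetic `func<address>` name.

```python
from collections import deque

# Hypothetical pre-decoded listing: address -> (mnemonic, call target or None).
# The addresses mirror the slides; a real tool would decode raw bytes instead.
CODE = {
    0x856c: ("push", None), 0x856d: ("mov", None), 0x856f: ("sub", None),
    0x8572: ("call", 0x857d), 0x857b: ("leave", None), 0x857c: ("ret", None),
    0x857d: ("push", None), 0x857e: ("mov", None), 0x8581: ("sub", None),
    0x858b: ("call", 0x865e), 0x8591: ("call", 0x866d),
    0x8596: ("leave", None), 0x8597: ("ret", None),
}

def discover_functions(code, entry, known_symbols=None):
    """Follow direct calls from `entry`; return (names, call_graph)."""
    names = dict(known_symbols or {})   # reuse symbols where they exist
    order = sorted(code)
    call_graph = {}
    worklist = deque([entry])
    while worklist:
        start = worklist.popleft()
        if start in call_graph:
            continue                    # already parsed
        call_graph[start] = []
        names.setdefault(start, f"func{start:x}")
        if start not in code:
            continue                    # target lies outside the decoded region
        # Simplification: scan linearly until the first ret.
        for addr in order[order.index(start):]:
            mnemonic, target = code[addr]
            if mnemonic == "call" and target is not None:
                call_graph[start].append(target)
                worklist.append(target)
            elif mnemonic == "ret":
                break
    return names, call_graph

names, graph = discover_functions(CODE, 0x856c, {0x856c: "main"})
```

Here `names` maps 0x856c to `main` and 0x857d to `func857d`, matching the labels on the slides. The linear scan to the first `ret` is a simplification of following the function's control flow.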

12

Parsing Functions

•Disassemble function’s code by traversing intra-procedural control flow graph

•Highest address reached during the traversal determines function size
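A minimal sketch of this idea, assuming instructions have already been decoded into a hypothetical map of address to `(length, kind, target)`. The kinds `fall`, `call`, `jcc`, `jmp`, and `ret` and the function name `function_extent` are illustrative choices, not part of the original tool.

```python
# Hypothetical decoded instructions: address -> (length, kind, branch target).
INSNS = {
    0x1000: (1, "fall", None),
    0x1001: (2, "fall", None),
    0x1003: (2, "jcc", 0x1008),   # conditional branch: two successors
    0x1005: (1, "fall", None),
    0x1006: (2, "jmp", 0x1009),   # unconditional branch: one successor
    0x1008: (1, "fall", None),
    0x1009: (1, "ret", None),     # exit point
    0x100a: (1, "fall", None),    # belongs to the next function, never reached
}

def function_extent(insns, entry):
    """Traverse the intra-procedural CFG and return (start, size in bytes)."""
    seen = set()
    stack = [entry]
    end = entry
    while stack:
        addr = stack.pop()
        if addr in seen or addr not in insns:
            continue
        seen.add(addr)
        length, kind, target = insns[addr]
        end = max(end, addr + length)      # track the highest byte reached
        if kind == "ret":
            continue                       # no successors
        if kind in ("jmp", "jcc") and target is not None:
            stack.append(target)           # branch target
        if kind != "jmp":
            stack.append(addr + length)    # fall-through successor
    return entry, end - entry

start, size = function_extent(INSNS, 0x1000)
```

In this toy input the traversal stops at the `ret` at 0x1009, so the instruction at 0x100a is not counted toward the function.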

13

Error Detection And Recovery

•CFG exit points are sometimes hard to identify

•Assume branches that are not obvious exits are intra-procedural

•Errors result in overestimation of function size

•Overlapping functions indicate error
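The overlap check could look roughly like the sketch below. The function table, its sizes, and the name `find_overlaps` are hypothetical. Only neighbouring functions (by start address) are compared, which is enough to flag an overestimated size that runs into the next function.

```python
def find_overlaps(functions):
    """functions: name -> (start, size). Return pairs of overlapping neighbours."""
    ordered = sorted(functions.items(), key=lambda item: item[1][0])
    overlaps = []
    for (name_a, (start_a, size_a)), (name_b, (start_b, _)) in zip(ordered, ordered[1:]):
        if start_a + size_a > start_b:     # A extends past the start of B
            overlaps.append((name_a, name_b))
    return overlaps

# Hypothetical parse result: func857d was given an overestimated size.
FUNCTIONS = {
    "main": (0x856c, 0x11),
    "func857d": (0x857d, 0x30),
    "func8598": (0x8598, 0x10),
}
suspect = find_overlaps(FUNCTIONS)
```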

14

Problems and Solutions

•Functions that are only called indirectly
  •Problem: static call graph traversal does not discover these functions
  •Solution: examine gaps in text space and use heuristics to find functions
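One possible form of such a gap scan is sketched below. The slides do not say which heuristics are used; matching the byte pattern `55 89e5` (`push %ebp; mov %esp, %ebp`, as in the earlier listing) is an assumption made for illustration, and `find_gaps` and `scan_gaps` are hypothetical names.

```python
# Assumed heuristic: a standard x86 frame-setup prologue marks a function start.
PROLOGUE = bytes.fromhex("5589e5")   # push %ebp; mov %esp, %ebp

def find_gaps(text_start, text_end, functions):
    """functions: iterable of (start, size). Return uncovered [lo, hi) ranges."""
    gaps, cursor = [], text_start
    for start, size in sorted(functions):
        if start > cursor:
            gaps.append((cursor, start))
        cursor = max(cursor, start + size)
    if cursor < text_end:
        gaps.append((cursor, text_end))
    return gaps

def scan_gaps(text, text_start, functions):
    """Return candidate function addresses found inside the gaps."""
    candidates = []
    for lo, hi in find_gaps(text_start, text_start + len(text), functions):
        offset = text.find(PROLOGUE, lo - text_start, hi - text_start)
        while offset != -1:
            candidates.append(text_start + offset)
            offset = text.find(PROLOGUE, offset + 1, hi - text_start)
    return candidates

# Toy text section: one known function, padding, then an undiscovered function.
TEXT = bytes.fromhex("5589e5c3" "9090" "5589e5c9c3" "90")
found = scan_gaps(TEXT, 0x1000, [(0x1000, 4)])
```

Each candidate would then be parsed like any other function start, so a false match can still be rejected by later checks such as overlap detection.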

15

Problems and Solutions cont’d

•Indirect jumps
  •Problem: need to find targets to complete CFG
  •Solution: parse jump tables to find possible targets
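As a rough sketch of jump table parsing, the code below assumes a table of contiguous 32-bit little-endian absolute addresses (a common x86 layout) and stops at the first entry that does not point into the text section. The table location, the stopping rule, and the name `jump_table_targets` are assumptions, not details taken from the slides.

```python
import struct

def jump_table_targets(data, data_base, table_addr, text_start, text_end,
                       max_entries=256):
    """Read 4-byte entries from `data` until one falls outside the text range."""
    targets = []
    offset = table_addr - data_base
    for _ in range(max_entries):
        if offset < 0 or offset + 4 > len(data):
            break                                  # ran off the data section
        (entry,) = struct.unpack_from("<I", data, offset)
        if not (text_start <= entry < text_end):
            break                                  # not a code address: table ended
        targets.append(entry)
        offset += 4
    return targets

# Hypothetical data section holding a three-entry table followed by a zero word.
DATA = struct.pack("<4I", 0x8590, 0x85a0, 0x85b0, 0)
targets = jump_table_targets(DATA, 0x9000, 0x9000, 0x8000, 0x9000)
```

The returned addresses would be added as successors of the indirect jump when building the CFG.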

16

Problems and Solutions cont’d

•Exception handling code
  •Problem: creates code blocks that appear unreachable
  •Solution: get block addresses from exception table

17

Test Programs

Program          Size unstripped (MB)   Size stripped (MB)   Number of functions
paradyn          5.44                   3.51                 13,676
condor_starter   22.60                  2.50                 8,168
gimp             2.61                   2.20                 4,329
eon              10.44                  0.51                 1,163
om3              0.43                   0.30                 732
alara            3.65                   0.26                 948
bubba            0.09                   0.02                 66

18

Evaluation

•Parse time (includes CFG creation)
  •~1.4x faster than prev. parser (with CFG)
  •~1.7x slower than prev. parser (without CFG)

•Stripped parse time
  •Varies: 1.2x - 1.9x slower than unstripped

•Symbol recreation
  •80% - 98% of original functions

19

Related Work

•Binary rewriters/instrumentation tools
  •eel, emil, etch, goblin, leel, plto

•Disassemblers (lots available)
  •IDA Pro, objdump, dumpbin, etc.

•Symbol table reconstructors
  •dress, objdump-output-beautifier

20

Status

•Implemented on x86

•Ready for measurement and instrumentation

•Good start for security, but needs work

21

Future Work

•Develop more accurate heuristics to identify code in unlit areas of the binary

•Data flow analyses

•Port to other platforms

•Support unconventional function constructs

•Comprehensive comparison with other tools

•Evaluation on obfuscated code