Server-Side Dynamic Code Analysis - Indefinite Studies

Server-Side Dynamic Code Analysis

Wadie Guizani, Jean-Yves Marion, Daniel Reynaud

Nancy University - Loria

{guizaniw|jean-yves.marion|reynaudd}@loria.fr

http://lhs.loria.fr

October 14, 2009

Wadie Guizani, Jean-Yves Marion, Daniel Reynaud Server-Side Dynamic Code Analysis

Outline

1 Problem
    Description
    What everybody does
    What we do

2 TraceSurfer: How It Works
    Getting the trace
    Surfin' the waves
    Packer analysis
    Server-side analysis

3 Conclusion

Description

Problem:

packing effectively prevents static analysis

packing is easy

unpacking is expensive (in terms of human time and computation resources)

heavily protected binaries are obscure

What we want:

a solution to speed up the analysis of self-modifying code

a solution to detect suspicious behaviours in unknown binaries

What everybody does

emulate the program

build the set W of written memory addresses and the set X of executed addresses

if W ∩ X ≠ ∅, dump memory and fix the Import Address Table to rebuild an executable

problems:

    anti-emulation techniques (prefetch tricks, undocumented instructions, MMX/SSE*/3DNow/FPU, obscure system calls...)
    anti-dumping techniques
    multiple code layers
    ...

there are a number of existing automatic unpackers: Pandora's Bochs, Ether, Renovo, Saffron, VxStripper, PolyUnpack, OmniUnpack, AVs...
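
The W ∩ X check can be sketched in a few lines. This is only an illustration: the (kind, address) event format and the function name are assumptions, not the interface of any of the unpackers listed above.

```python
from typing import Iterable, Optional, Tuple

# Hypothetical trace event: (kind, address), where kind is "W" for a
# memory write and "X" for the execution of an instruction.
Event = Tuple[str, int]


def first_dynamic_code_address(trace: Iterable[Event]) -> Optional[int]:
    """Return the first executed address that was written earlier, or None."""
    written = set()  # the set W of written addresses
    for kind, addr in trace:
        if kind == "W":
            written.add(addr)
        elif kind == "X" and addr in written:
            # W and X now intersect: a classic unpacker would dump memory here.
            return addr
    return None


demo = [("X", 0x401000), ("W", 0x402000), ("X", 0x401002), ("X", 0x402000)]
hit = first_dynamic_code_address(demo)
print(hex(hit) if hit is not None else "no dynamic code")
```

The check stops at the first hit, which is why a single pass of this kind cannot follow multiple code layers.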

What we do

We generalize this approach:

we also monitor the set R of read memory addresses

we follow nested self-modifying code (= code waves)

we detect code protection techniques (→ alert)

we plot them as a structure between code waves (→ visualize)

(W, X)  →  (R1, W1, X1), (R2, W2, X2), ..., (Rn, Wn, Xn)  →  alert and visualize

Understanding code waves

Here are the code waves for a program packed with a simple packer:

Figure: aspack

The tracer

this is what a trace looks like:

we obtain it with Pin; the tracer comes in two versions:

    1 slow and stable: 150 lines of C++, 200x slowdown
    2 fast and furious: 1400 lines of C++, 10x slowdown

we could also use an emulator or a debugger

The surfer

we are going to associate each memory address m at step x with a read/write/execution level: Read(m, x), Write(m, x), Exec(m, x)

initially, for all m, we have Exec(m, 0) = Read(m, 0) = Write(m, 0) = 0

we then apply some transition rules...

Transition rules

if we execute an instruction at address m:

Exec(m, x + 1) = Write(m, x) + 1

if the instruction at address m reads memory address m′:

Read(m′, x + 1) = Exec(m, x)

if the instruction at address m writes memory address m′:

Write(m′, x + 1) = Exec(m, x)

note that the r/w/x levels are easy to compute: when you touch something, you give it your level. When you execute, you gain a level.
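
The rules can be replayed over a trace as in the sketch below. It follows the "you give it your level" reading, so a write hands the writer's execution level to its target. The step format (instruction address, addresses read, addresses written) and the name surf are assumptions made for illustration, and each address is treated as a single unit, ignoring instruction length.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

# Hypothetical trace step: (address of the executed instruction,
# addresses it reads, addresses it writes).
Step = Tuple[int, Tuple[int, ...], Tuple[int, ...]]
Levels = Dict[int, int]


def surf(trace: Iterable[Step]) -> Tuple[Levels, Levels, Levels]:
    """Replay a trace and return the final Read, Write and Exec levels."""
    read = defaultdict(int)    # Read(m, 0) = 0 for all m
    write = defaultdict(int)   # Write(m, 0) = 0 for all m
    execl = defaultdict(int)   # Exec(m, 0) = 0 for all m
    for m, reads, writes in trace:
        execl[m] = write[m] + 1           # executing: one level above the last writer
        for target in reads:
            read[target] = execl[m]       # reading: the target receives our level
        for target in writes:
            write[target] = execl[m]      # writing: the target receives our level
    return dict(read), dict(write), dict(execl)


# Three layers: 0x1000 writes 0x2000, which writes 0x3000, which reads 0x1000.
demo = [(0x1000, (), (0x2000,)), (0x2000, (), (0x3000,)), (0x3000, (0x1000,), ())]
r, w, x = surf(demo)
print(max(x.values()))
```

Under this reading, code that is never overwritten stays at execution level 1, and the highest execution level reached would correspond to the number of code waves.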

The patterns

With this information, we can detect the following code protection techniques:

self-modifying code:

    code decryption
    blind self-modification

integrity checking

code scrambling

Visualization

We can now visualize waves and patterns as a graph:

Figure: Yoda Protector

Figure: Allaple

Figure: Pelock

Figure: Telock

Packer Analysis

First experiment

1 take a normal program with a known behaviour (in this case: hostname.exe)

2 protect it with 16 different packers

3 analyse the packed versions with TraceSurfer

We expect to find larger traces with dynamic code in the packed versions.

The analysis is a success if TraceSurfer(Packer(hostname)) ≡ hostname.

Results 1/2

Protection Name        Success  Trace (MB)  Surfing Time  Waves
None (original file)   yes      11.9        0m8s          1
AcProtect              yes      154.0       5m09s         18
Aspack                 yes      17.8        0m12s         2
Expressor              yes      96.6        1m7s          2
FSG                    yes      49.2        0m34s         2
Mew                    yes      49.2        0m35s         2
Molebox                yes      455.0       5m18s         3
Npack                  yes      35.0        0m24s         2
Packman                yes      14.1        0m10s         2

Results 2/2

Protection Name        Success  Trace (MB)  Surfing Time  Waves
None (original file)   yes      11.9        0m8s          1
Pec2                   yes      51.6        0m36s         3
Pelock                 yes      297         6m06s         9
Pespin                 no       35.3        0m48s         3
RLPack                 yes      56.6        0m41s         2
Telock                 no       111         3m39s         14
UPX                    yes      13.8        0m10s         2
Winupack               yes      58.9        0m41s         2
Yoda Protector         yes      97.8        1m41s         4

Server-Side Analysis

Second experiment

1 take malware samples from a honeypot

2 filter out non-executable files

3 analyse the executable files with TraceSurfer in batch mode

The analysis is a success if TraceSurfer(sample) ≠ ⊥.
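
The slides do not say how non-executable files were filtered out. One plausible way, shown here purely as an assumption, is a cheap check of the PE headers: the DOS "MZ" magic, then the "PE\0\0" signature at the offset stored in the e_lfanew field (offset 0x3C). The function names are hypothetical.

```python
import struct
from typing import Dict, List


def looks_like_pe(data: bytes) -> bool:
    """Cheap header check: DOS MZ magic plus the PE signature at e_lfanew."""
    if len(data) < 0x40 or data[:2] != b"MZ":
        return False
    # e_lfanew: little-endian 32-bit file offset of the PE header, stored at 0x3C.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    # Slicing past the end yields a short result, so a bogus offset just fails.
    return data[pe_offset:pe_offset + 4] == b"PE\x00\x00"


def filter_executables(samples: Dict[str, bytes]) -> List[str]:
    """Keep the names of the samples that pass the header check."""
    return [name for name, data in samples.items() if looks_like_pe(data)]


# Smallest header that passes: MZ, padding up to 0x3C, e_lfanew = 0x40, PE signature.
minimal_pe = b"MZ" + b"\x00" * 0x3A + struct.pack("<I", 0x40) + b"PE\x00\x00"
print(filter_executables({"sample.bin": minimal_pe, "picture.gif": b"GIF89a" + b"\x00" * 100}))
```

A check like this only rejects obvious non-PE files; a malformed or truncated executable can still pass it and fail later during tracing.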

The High Security Lab

1 master node with the malware repository to distribute the work

12 slave nodes with 2 virtual Windows XP images each (8 cores, 16 GB of RAM per node)

isolated from the network, physically secured with biometric access control

Input Set

nb of files from the honeypot: 62,498

nb of executable files: 59,554

the analysis by TraceSurfer took 34h10m (we run approximately 1700+ binaries/hour)

success rate of 81.28%

pefile with 2600 PEiD signatures: a packer is detected in 2.92% of the samples
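
The figures above can be cross-checked with a little arithmetic. The per-virtual-machine estimate assumes that all 24 Windows XP images (12 slave nodes with 2 images each) were busy for the whole run, which the slides do not state.

```python
executables = 59_554           # executable files submitted to TraceSurfer
elapsed_hours = 34 + 10 / 60   # 34h10m of wall-clock time
success_rate = 0.8128          # 81.28%
workers = 12 * 2               # assumption: every virtual machine is kept busy

throughput = executables / elapsed_hours          # binaries per hour, whole cluster
analysed = round(executables * success_rate)      # binaries with a usable result
seconds_per_binary = 3600 * workers / throughput  # average time on one virtual machine

print(f"{throughput:.0f} binaries/hour")
print(f"{analysed} successfully analysed binaries")
print(f"{seconds_per_binary:.0f} s per binary per virtual machine")
```

The throughput comes out at about 1,743 binaries per hour, in line with the "1700+" quoted above, and the success rate corresponds to roughly 48,405 analysed binaries.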

Experimental Results 1/3

Nb of Waves   Nb of Binaries   % of Analysed Files
1 wave        318              0.66%
2 waves       4,184            8.64%
3 waves       516              1.07%
4 waves       589              1.22%
5 waves       42,455           87.71%
6 waves       86               0.18%
7 waves       41               0.08%
8 waves       92               0.19%
9 waves       10               0.02%
10 waves      38               0.08%
...           ...              ...
15 waves      1                0.00%

Experimental Results 2/3

Code Protection Used        Nb of Binaries   % of Analysed Files
Code decryption             44,046           91.00%
Blind self-modifying code   43,805           90.50%
Integrity checking          42,665           88.14%
Code scrambling             601              1.24%

Experimental Results 3/3

Anti-Virtualization Technique   Nb of Binaries   % of Analysed Files
At least one                    71               0.15%
SIDT                            65               0.13%
SLDT                            0                0.00%
SGDT                            0                0.00%
STR                             0                0.00%
VMware channel                  14               0.03%
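
How TraceSurfer recognises these techniques is not described in the slides. As an illustration only, the instructions in the table can be identified from the raw bytes of an executed instruction: SLDT and STR share the opcode 0F 00, SGDT and SIDT share 0F 01, and the ModRM "reg" field selects among them. The VMware channel is an `in eax, dx` with EAX = 0x564D5868 ("VMXh") and DX = 0x5658. The function names and the idea of passing register values alongside the bytes are assumptions, and instruction prefixes are ignored.

```python
from typing import Optional

# ModRM "reg" field values for the descriptor-table store instructions.
_GROUP_0F00 = {0: "SLDT", 1: "STR"}
_GROUP_0F01 = {0: "SGDT", 1: "SIDT"}


def classify(insn: bytes) -> Optional[str]:
    """Name the table-store instruction encoded by insn, if any (no prefixes)."""
    if len(insn) < 3 or insn[0] != 0x0F:
        return None
    mod, reg = insn[2] >> 6, (insn[2] >> 3) & 0b111
    if insn[1] == 0x00:
        return _GROUP_0F00.get(reg)
    if insn[1] == 0x01 and mod != 0b11:  # SGDT/SIDT only take a memory operand
        return _GROUP_0F01.get(reg)
    return None


def is_vmware_channel(insn: bytes, eax: int, dx: int) -> bool:
    """True for `in eax, dx` (opcode 0xED) issued with the VMware magic values."""
    return insn[:1] == b"\xED" and eax == 0x564D5868 and (dx & 0xFFFF) == 0x5658


print(classify(bytes([0x0F, 0x01, 0x0D, 0x00, 0x00, 0x00, 0x00])))  # sidt [disp32]
print(is_vmware_channel(b"\xED", 0x564D5868, 0x5658))
```

Working on executed instructions rather than on the file contents matters here: in a packed binary these bytes typically only exist in memory once an earlier wave has unpacked them.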

Limitations

limitations of the implementation:

    the detection of dynamic code is sound and complete modulo kernel-mediated writes
    Pin was not designed for malware analysis: the instrumentation fails in pathological cases

limitations of the type system:

    we do not follow the data-flow (yet!), so the labels "integrity checking" / "decryption" and "blind self-modification" are heuristic
    multithreading is not modelled in the type system
    the analysis is specific to a single execution trace of the program

Conclusion

we have designed a generic technique to analyse traces

it can significantly speed up manual analysis by providing a new visualization

it can automatically detect suspicious behaviours in unknown binaries

it is scalable: thousands of binaries can be analysed daily

Ongoing development:

add detection for more anti-virtualization / anti-debugging / anti-emulation / anti-sandboxing techniques

Bonus Slides

More Results 1/4

nb of files from the honeypot: 25,118

nb of executable files: 23,104

success rate of 80.81%

More Results 2/4

Nb of Waves   Nb of Binaries   % of Analysed Files
1 wave        2,289            12.26%
2 waves       10,117           54.19%
3 waves       2,911            15.59%
4 waves       638              3.42%
5 waves       730              3.91%
6 waves       1,148            6.15%
7 waves       53               0.28%
8 waves       39               0.21%
9 waves       295              1.58%
10 waves      307              1.64%
...           ...              ...
43 waves      1                0.01%

More Results 3/4

Code Protection Used        Nb of Binaries   % of Analysed Files
Code decryption             12,261           65.67%
Blind self-modifying code   8,157            43.69%
Integrity checking          1,747            9.36%
Code scrambling             1,092            5.85%

More Results 4/4

Anti-Virtualization Technique   Nb of Binaries   % of Analysed Files
At least one                    117              0.63%
SIDT                            56               0.30%
SLDT                            2                0.01%
SGDT                            6                0.03%
STR                             0                0.00%
VMware channel                  58               0.31%

IDA Integration

all this information can be integrated in static analysis tools: dynamic CFG reconstruction / disassembly resynchronisation

Disassembly Resynchronisation
