Problem
TraceSurfer: How It Works
Conclusion
Server-Side Dynamic Code Analysis
Wadie Guizani, Jean-Yves Marion, Daniel Reynaud
Nancy University - Loria
{guizaniw|jean-yves.marion|reynaudd}@loria.fr
http://lhs.loria.fr
October 14, 2009
Wadie Guizani, Jean-Yves Marion, Daniel Reynaud Server-Side Dynamic Code Analysis
Outline
1 Problem
  Description
  What everybody does
  What we do
2 TraceSurfer: How It Works
  Getting the trace
  Surfin' the waves
  Packer analysis
  Server-side analysis
3 Conclusion
Description
Problem:
packing effectively prevents static analysis
packing is easy
unpacking is expensive (in terms of human time and computation resources)
heavily protected binaries are obscure
What we want:
a solution to speed-up the analysis of self-modifying code
a solution to detect suspicious behaviours in unknown binaries
What everybody does
emulate the program
build the set W of written memory addresses and the set X of executed addresses
if W ∩ X ≠ ∅, dump memory and fix the Import Address Table to rebuild an executable
problems:
  anti-emulation techniques (prefetch tricks, undocumented instructions, MMX/SSE*/3DNow/FPU, obscure system calls...)
  anti-dumping techniques
  multiple code layers
  ...
there are a number of existing automatic unpackers: Pandora's Bochs, Ether, Renovo, Saffron, VxStripper, PolyUnpack, OmniUnpack, AVs...
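The W/X check described above can be sketched as follows. This is a minimal illustration, not the implementation of any of the unpackers listed: the trace format (a sequence of `(executed_address, written_addresses)` pairs) and the name `first_dynamic_code_step` are assumptions made for the example.

```python
def first_dynamic_code_step(trace):
    """Return the index of the first step at which W ∩ X becomes non-empty.

    `trace` uses a hypothetical format: an iterable of
    (executed_address, written_addresses) pairs, one per instruction.
    Returns None if written and executed addresses never overlap.
    """
    written = set()   # W: every address written so far
    executed = set()  # X: every address executed so far
    for step, (addr, writes) in enumerate(trace):
        executed.add(addr)
        written.update(writes)
        # W ∩ X can only become non-empty through this step's new elements.
        if addr in written or any(w in executed for w in writes):
            return step
    return None


# Toy unpacking stub: 0x10 writes 0x20, and execution later reaches 0x20.
packed = [(0x10, [0x20]), (0x11, []), (0x20, [])]
# Ordinary program: writes only touch data that is never executed.
plain = [(0x10, [0x30]), (0x11, [])]

print(first_dynamic_code_step(packed))  # 2
print(first_dynamic_code_step(plain))   # None
```

A real unpacker would dump memory at the returned step; the sketch only locates that step.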
What we do
We generalize this approach:
we also monitor the set R of read memory addresses
we follow nested self-modifying code (= code waves)
we detect code protection techniques (→ alert)
we plot them as a structure between code waves (→ visualize)
(W, X) → (R1, W1, X1), (R2, W2, X2), ..., (Rn, Wn, Xn) → alert and visualize
Understanding code waves
Here are the code waves for a program packed with a simple packer:
Figure: aspack
The tracer
this is what a trace looks like:
we get that with Pin; the tool comes in two versions:
1 slow and stable: 150 lines of C++, 200x slowdown
2 fast and furious: 1400 lines of C++, 10x slowdown
we could also use an emulator or a debugger
The surfer
we are going to associate each memory address m at step x with a read/write/execution level: Read(m, x), Write(m, x), Exec(m, x)
initially, for all m, we have Exec(m, 0) = Read(m, 0) = Write(m, 0) = 0
we then apply some transition rules...
Transition rules
if we execute an instruction at address m:
Exec(m, x + 1) = Write(m, x) + 1
if the instruction at address m reads memory address m′:
Read(m′, x + 1) = Exec(m, x)
if the instruction at address m writes memory address m′:
Write(m′, x + 1) = Exec(m, x)
note that the r/w/x levels are easy to compute: when you
touch something, you give it your level. When you execute,
you gain a level.
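The rules above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the TraceSurfer implementation: the trace format (`(address, read_addresses, written_addresses)` per instruction) and the names `surf` and `wave_count` are hypothetical. Following the note ("when you touch something, you give it your level"), the sketch assumes that both reads and writes give the target the level of the executing instruction, and that the execute rule is applied before that level is propagated.

```python
def surf(trace):
    """Compute read/write/exec levels for every address touched by a trace.

    `trace` is a hypothetical format: an iterable of
    (address, read_addresses, written_addresses) triples, one per instruction.
    Untouched addresses implicitly keep level 0.
    """
    read, write, execute = {}, {}, {}
    for addr, reads, writes in trace:
        # Executing gains a level over the level the code was written at.
        execute[addr] = write.get(addr, 0) + 1
        level = execute[addr]
        # Touching something gives it the level of the executing instruction.
        for target in reads:
            read[target] = level
        for target in writes:
            write[target] = level
    return read, write, execute


def wave_count(execute):
    """Number of code waves, taken here as the highest execution level."""
    return max(execute.values(), default=0)


# Two nested layers: 0x10 unpacks 0x20, 0x20 unpacks 0x30, 0x30 reads 0x10.
trace = [
    (0x10, [0x100], [0x20]),
    (0x20, [], [0x30]),
    (0x30, [0x10], []),
]
read, write, execute = surf(trace)
print(wave_count(execute))  # 3
```

With this ordering an unpacked program yields a single wave, which matches the "None (original file)" row of the packer results.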
The patterns
With this information, we can detect the following code protection techniques:
self-modifying code:
  code decryption
  blind self-modification
integrity checking
code scrambling
Visualization
We can now visualize waves and patterns as a graph:
Figure: Yoda Protector
Figure: Allaple
Figure: Pelock
Figure: Telock
Packer Analysis
First experiment
1 take a normal program with a known behaviour (in this case: hostname.exe)
2 protect it with 16 different packers
3 analyse the packed versions with TraceSurfer
We expect to find larger traces with dynamic code in the packed versions.
The analysis is a success if TraceSurfer(Packer(hostname)) ≡ hostname.
Results 1/2
Protection Name       Success  Trace (Mb)  Surfing Time  Waves
None (original file)  yes      11.9        0m8s          1
AcProtect             yes      154.0       5m09s         18
Aspack                yes      17.8        0m12s         2
Expressor             yes      96.6        1m7s          2
FSG                   yes      49.2        0m34s         2
Mew                   yes      49.2        0m35s         2
Molebox               yes      455.0       5m18s         3
Npack                 yes      35.0        0m24s         2
Packman               yes      14.1        0m10s         2
Results 2/2
Protection Name       Success  Trace (Mb)  Surfing Time  Waves
None (original file)  yes      11.9        0m8s          1
Pec2                  yes      51.6        0m36s         3
Pelock                yes      297         6m06s         9
Pespin                no       35.3        0m48s         3
RLPack                yes      56.6        0m41s         2
Telock                no       111         3m39s         14
UPX                   yes      13.8        0m10s         2
Winupack              yes      58.9        0m41s         2
Yoda Protector        yes      97.8        1m41s         4
Server-Side Analysis
Second experiment
1 take malware samples from a honeypot
2 filter out non-executable files
3 analyse the executable files with TraceSurfer in batch mode
The analysis is a success if TraceSurfer(sample) ≠ ⊥.
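The slides do not say how non-executable files were filtered out in step 2. One plausible way, sketched here purely as an assumption, is to check for the Windows PE layout: an `MZ` DOS header whose 32-bit field at offset 0x3C points to a `PE\0\0` signature. The function name `looks_like_pe` is hypothetical.

```python
import struct


def looks_like_pe(data: bytes) -> bool:
    """Heuristic check that `data` starts like a Windows PE executable."""
    # A DOS header is at least 0x40 bytes long and starts with "MZ".
    if len(data) < 0x40 or data[:2] != b"MZ":
        return False
    # e_lfanew (little-endian, offset 0x3C) locates the PE signature.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    # An out-of-range offset yields an empty slice, hence False.
    return data[pe_offset:pe_offset + 4] == b"PE\x00\x00"


# Minimal synthetic header: "MZ", padding, e_lfanew = 0x40, then "PE\0\0".
fake_pe = b"MZ" + b"\x00" * 0x3A + struct.pack("<I", 0x40) + b"PE\x00\x00"
print(looks_like_pe(fake_pe))          # True
print(looks_like_pe(b"#!/bin/sh\n"))   # False
```

This only inspects the headers; it does not guarantee that the file actually loads and runs.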
The High Security Lab
1 master node with the malware repository to distribute the work
12 slave nodes with 2 virtual Windows XP images each (8 cores, 16 Gb of RAM per node)
isolated from the network, physically secured with biometric access control
Input Set
nb of files from the honeypot: 62,498
nb of executable files: 59,554
the analysis by TraceSurfer took 34h10m (we process more than 1,700 binaries/hour)
success rate of 81.28%
pefile with 2600 PEiD signatures: a packer is detected in 2.92% of the samples
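As a quick sanity check, the figures above can be cross-checked with a few lines of arithmetic. The variable names are illustrative, and the last step assumes that the percentages in the results tables are relative to the successfully analysed binaries; the 5-wave count (42,455, reported as 87.71%) is taken from the wave-count table of the same experiment.

```python
executables = 59_554          # executable files from the honeypot
hours = 34 + 10 / 60          # 34h10m of analysis
success_rate = 0.8128         # 81.28% of the analyses succeeded

throughput = executables / hours
analysed = round(executables * success_rate)

print(round(throughput))      # 1743 binaries/hour
print(analysed)               # 48405 successfully analysed binaries

# Assumption: table percentages use the analysed binaries as denominator.
five_wave_share = 100 * 42_455 / analysed
print(f"{five_wave_share:.2f}%")
```

Under that assumption the computed share agrees with the reported 87.71% to within rounding.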
Experimental Results 1/3
Nb of Waves  Nb of Binaries  %
1 wave       318             0.66%
2 waves      4,184           8.64%
3 waves      516             1.07%
4 waves      589             1.22%
5 waves      42,455          87.71%
6 waves      86              0.18%
7 waves      41              0.08%
8 waves      92              0.19%
9 waves      10              0.02%
10 waves     38              0.08%
...          ...             ...
15 waves     1               0.00%
Experimental Results 2/3
Code Protection Used       Nb of Binaries  %
Code decryption            44,046          91.00%
Blind self-modifying code  43,805          90.50%
Integrity checking         42,665          88.14%
Code scrambling            601             1.24%
Experimental Results 3/3
Anti-Virtualization Technique  Nb of Binaries  %
At least one                   71              0.15%
SIDT                           65              0.13%
SLDT                           0               0.00%
SGDT                           0               0.00%
STR                            0               0.00%
VMWare channel                 14              0.03%
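The slides do not describe how these instructions are spotted in a trace. One possible approach, given here only as a sketch, is to match the x86 encodings of the four descriptor-table store instructions: SGDT and SIDT are `0F 01 /0` and `0F 01 /1` with a memory operand, SLDT and STR are `0F 00 /0` and `0F 00 /1`. The function name is hypothetical, and the sketch assumes raw instruction bytes without prefixes.

```python
def classify_descriptor_store(insn: bytes):
    """Return 'SGDT', 'SIDT', 'SLDT', 'STR' or None for raw x86 bytes.

    Assumes `insn` starts at the opcode (no legacy or REX prefixes).
    """
    if len(insn) < 3 or insn[0] != 0x0F:
        return None
    modrm = insn[2]
    mod = modrm >> 6          # addressing mode; 3 means register operand
    reg = (modrm >> 3) & 7    # opcode extension (the "/digit")
    if insn[1] == 0x01 and mod != 3:
        # With mod == 3 these encodings denote other instructions.
        return {0: "SGDT", 1: "SIDT"}.get(reg)
    if insn[1] == 0x00:
        # SLDT and STR accept both register and memory operands.
        return {0: "SLDT", 1: "STR"}.get(reg)
    return None


print(classify_descriptor_store(bytes([0x0F, 0x01, 0x08])))  # SIDT (sidt [eax])
print(classify_descriptor_store(bytes([0x0F, 0x00, 0xC8])))  # STR  (str ax)
print(classify_descriptor_store(bytes([0x0F, 0x01, 0xC8])))  # None
```

Detecting the VMware channel would need register values as well as the opcode, so it is left out of this sketch.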
Limitations
limitations of the implementation:
  the detection of dynamic code is sound and complete modulo kernel-mediated writes
  Pin has not been made for malware analysis: the instrumentation fails in pathological cases
limitations of the type system:
  we do not follow the data-flow (yet!), so the labels "integrity checking" / "decryption" and "blind self-modification" are heuristic
  multithreading not modelled in the type system
  the analysis is specific to a single execution trace of the program
Conclusion
we have designed a generic technique to analyse traces
it can significantly speed up manual analysis by providing a new visualization
it can automatically detect suspicious behaviours in unknown binaries
it is scalable: thousands of binaries can be analysed daily
Ongoing development:
add detection for more anti-virtualization / anti-debugging / anti-emulation / anti-sandboxing techniques
Bonus Slides
More Results 1/4
nb of files from the honeypot: 25,118
nb of executable files: 23,104
success rate of 80.81%
More Results 2/4
Nb of Waves  Nb of Binaries  % of Analysed Files
1 wave       2,289           12.26%
2 waves      10,117          54.19%
3 waves      2,911           15.59%
4 waves      638             3.42%
5 waves      730             3.91%
6 waves      1,148           6.15%
7 waves      53              0.28%
8 waves      39              0.21%
9 waves      295             1.58%
10 waves     307             1.64%
...          ...             ...
43 waves     1               0.01%
More Results 3/4
Code Protection Used       Nb of Binaries  %
Code decryption            12,261          65.67%
Blind self-modifying code  8,157           43.69%
Integrity checking         1,747           9.36%
Code scrambling            1,092           5.85%
More Results 4/4
Anti-Virtualization Technique  Nb of Binaries  %
At least one                   117             0.63%
SIDT                           56              0.30%
SLDT                           2               0.01%
SGDT                           6               0.03%
STR                            0               0.00%
VMWare channel                 58              0.31%
IDA Integration
all this information can be integrated in static analysis tools: dynamic CFG reconstruction / disassembly resynchronisation
Disassembly Resynchronisation