Malware Analysis and Instrumentation
Andrew Bernat and Kevin Roundy
Paradyn Project
Paradyn / Dyninst Week
Madison, Wisconsin
May 2-4, 2011
Forensic analysts need help
90% of malware resists analysis[1]
Malware attacks cost billions of dollars annually[2]
65% of users feel the effects of cybercrime[3]
69% of cybercrimes are resolved[3]
A cybercrime takes 28 days on average to resolve[3]
[1] McAfee, 2008.
[2] Computer Economics, 2007.
[3] Norton, 2010.
[Figure: hex dump of a malware binary]
The needed toolbox:
  Binary code identification
  Control- and data-flow analysis
  Instrumentation
  Effectiveness on malware
Dyninst is a toolbox for analysts
[Diagram elements: program binary, Dyninst (control flow analyzer, instrumenter, data flow analyzer), CFG, analysis tool]
Dyninst capabilities:
  loop, block, function, and instruction instrumentation
  function replacement
  call stack walking
  forward and backward slices
  loop analysis
  process control
  library injection
  symbol table reading and writing
  binary rewriting
  machine language parsing
Mutator:
  Specifies instrumentation
  Gets callbacks for runtime events
  Builds high-level analysis
Code snippets:
  printf(…)
  counter++
  if (pred) callback(…)
  getTarget(insn)
High-level analyses:
  Analysis of network communications
  Code visualizations
  Time bomb detection and analysis
  Identification of stolen data
  Reports on anti-analysis techniques
Dyninst on malware
Malware defeats static analysis and is sensitive to instrumentation.
SR-Dyninst:
  Static-dynamic analysis
  Sensitivity-resistant instrumenter
Outline
  Anti-analysis tricks
  Hybrid static-dynamic analysis
  Sensitivity resistance
  Results
Anti-analysis tricks
The tricks fall into two groups, anti-analysis and anti-instrumentation:
  PC-sensitive code: call-pop pairs, return-address manipulation, call-stack tampering and probing
  Obfuscated control flow: indirect control flow, stack tampering, overlapping code, signal-based control flow
  Unpacked code: all-at-once, block-, loop-, or function-at-a-time, to empty or allocated space
  Overwritten code: single operand or opcode, whole instruction, function, code section, buffer
  Anti-patching: checksum whole regions, probe for patches, use code as data, move stack pointer
  Address-space probing: scans and probes of locations that should be unallocated
Obfuscated control flow
Example: Storm worm, bytes starting at address 40d002 (entry point)
  addresses:  40d002 03 04 05 06 07 08 09 0a 0b 0c 0d
  bytes:      e8 03 00 00 00 e9 eb 04 5d 45 55 c3
  Decoded in sequence:  CALL 40d00a, JMP 459dd4f7
  Actually executed:    CALL 40d00a, POP ebp, INC ebp, PUSH ebp, RET, JMP 40d00e
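The two readings of the Storm worm bytes can be checked with a little arithmetic. The sketch below is not part of the talk; it assumes the first byte shown sits at address 40d002 and uses the standard x86 encodings (e8 and e9 take a 32-bit relative displacement, eb takes an 8-bit one).

```python
import struct

BASE = 0x40D002  # assumed address of the first byte shown on the slide
CODE = bytes.fromhex("e8 03 00 00 00 e9 eb 04 5d 45 55 c3")

def rel32_target(addr):
    """Target of a 5-byte CALL (e8) or JMP (e9) rel32 instruction at addr."""
    (disp,) = struct.unpack_from("<i", CODE, addr - BASE + 1)
    return addr + 5 + disp

def rel8_target(addr):
    """Target of a 2-byte short JMP (eb) rel8 instruction at addr."""
    (disp,) = struct.unpack_from("<b", CODE, addr - BASE + 1)
    return addr + 2 + disp

# Decoding instructions back to back gives a CALL and then a bogus JMP.
call_target = rel32_target(0x40D002)
fake_jmp_target = rel32_target(0x40D007)

# At run time the CALL pushes its return address (40d007). The callee does
# POP ebp / INC ebp / PUSH ebp / RET, so execution resumes one byte later,
# in the middle of the bogus JMP.
return_address = 0x40D002 + 5
resume_at = return_address + 1
real_jmp_target = rel8_target(resume_at)

print(hex(call_target), hex(fake_jmp_target), hex(resume_at), hex(real_jmp_target))
```

The bogus JMP target (459dd4f7) never executes; the real control flow continues at 40d00e through a short jump hidden inside the bogus instruction.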
Unpacked code
[Figure: hex dump of a packed binary with its entry point marked]
Overwritten code
Example: Upack packer
[Figure: hex dump with the entry point marked, in which a row of code bytes is overwritten]
PC-sensitive code
Local data access, e.g., ASProtect:
    call                 ; use call to get the current PC
    ...data...
    pop esi              ; pop the PC into a register
    add esi, eax         ; construct a pointer
    mov ebx, ptr[esi]    ; dereference it
Anti-patching
Checksumming detects instrumentation [Aucsmith 96], e.g., PECompact.
Checksum routine:
    xor eax, eax
  .loop:
    add eax, ptr[ebx]    ; calculate checksum of protected region
    add 4, ebx
    cmp ebx, 0x41000
    jne .loop
    cmp eax, .chksum     ; compare to expected value
    jne .fail            ; pass or fail
[Diagram: a jmp patched into the protected code changes the checksum, so the routine fails and instrumentation is detected]
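The effect of a patch on such a checksum is easy to reproduce. The following is a minimal sketch, not code from the talk: the 64-byte region, its NOP contents, and the patch offset are all made up for illustration, and the checksum is the same 32-bit additive sum as the assembly loop.

```python
import struct

def checksum(memory, start, end):
    """Sum the 32-bit little-endian words in [start, end), modulo 2**32."""
    total = 0
    for addr in range(start, end, 4):
        (word,) = struct.unpack_from("<I", memory, addr)
        total = (total + word) & 0xFFFFFFFF
    return total

# Hypothetical 64-byte protected region filled with NOPs (0x90).
memory = bytearray([0x90] * 64)
expected = checksum(memory, 0, 64)

# Instrumentation overwrites 5 bytes of code with a jmp (e9 + rel32).
memory[16:21] = b"\xe9" + struct.pack("<i", 0x1000)
detected = checksum(memory, 0, 64) != expected
print("instrumentation detected:", detected)
```

Any patch that changes the bytes in the protected region changes the sum, which is why the program can notice it.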
Address-space probing
Memory scan (pseudocode): a SIGSEGV handler skips unmapped pages so the scan can walk the whole address space.
    char *ptr = 0;
    segv_handler() { ptr += PAGESIZE; goto RESTART; }   /* unmapped page: skip it and retry */
    sigaction(SIGSEGV, segv_handler);
    while (1) { RESTART: *ptr; ptr += PAGESIZE; }       /* probe one address per page */
[Diagram: address space with data, code, code, and instrumentation regions]
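The scan can be modeled without touching real memory. This sketch is an illustration only: the page map, the addresses, and the `scan` helper are hypothetical, and a dictionary lookup stands in for the probe that either succeeds or raises SIGSEGV.

```python
PAGE = 0x1000

def scan(mapped_pages, limit):
    """Probe one address per page below limit and return the readable pages."""
    found = []
    for addr in range(0, limit, PAGE):
        if addr in mapped_pages:   # the read *ptr succeeds
            found.append(addr)
        # otherwise the handler skips the unmapped page and the scan goes on
    return found

# Hypothetical layout of the program before and after instrumentation.
original = {0x1000: "data", 0x2000: "code", 0x3000: "code"}
instrumented = dict(original)
instrumented[0x5000] = "instrumentation"   # page added by the instrumenter

expected = scan(original, 0x8000)
seen = scan(instrumented, 0x8000)
unexpected = [hex(a) for a in seen if a not in expected]
print(unexpected)
```

A program that knows its own layout can flag any readable page that should have been unallocated.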
Code discovery algorithm
Hybrid algorithm:
1. Parse from known entry points
2. Instrument control flow that may lead to new code
3. Resume execution
New code is found through instrumentation (e.g., CALL ptr[eax]), exceptions (e.g., DIV eax, 0), and overwrites.
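The parse, instrument, resume cycle can be sketched on a toy program. This is an illustrative model, not Dyninst code: `PROGRAM`, `parse`, and `discover` are hypothetical names, each block is either "direct" (targets known statically) or "indirect" (target known only when the block runs), and resuming execution is modeled by looking up the run-time target.

```python
# Toy program: address -> (kind, static_targets, runtime_target)
PROGRAM = {
    0x10: ("direct", [0x20], None),
    0x20: ("indirect", [], 0x40),   # e.g. CALL ptr[eax]
    0x30: ("direct", [], None),     # never reached
    0x40: ("direct", [0x50], None),
    0x50: ("indirect", [], 0x10),   # jumps back into known code
}

def parse(entry, cfg):
    """Static step: follow direct edges from entry, adding blocks to cfg.
    Returns the indirect transfers found, which need instrumentation."""
    to_instrument, work = [], [entry]
    while work:
        addr = work.pop()
        if addr in cfg:
            continue
        kind, targets, _ = PROGRAM[addr]
        cfg.add(addr)
        if kind == "indirect":
            to_instrument.append(addr)
        work.extend(targets)
    return to_instrument

def discover(entry):
    cfg = set()
    pending = parse(entry, cfg)            # 1. parse from known entry points
    rounds = 1
    while pending:                         # 2. instrumented transfers
        site = pending.pop()
        target = PROGRAM[site][2]          # 3. resume; instrumentation reports the target
        if target not in cfg:
            pending += parse(target, cfg)  # new code: parse from the new entry point
            rounds += 1
    return cfg, rounds

cfg, rounds = discover(0x10)
print(sorted(hex(a) for a in cfg), rounds)
```

Block 0x30 stays out of the CFG because nothing that executes ever transfers control to it.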
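Instrumentation-based discovery
  Invalid control transfers, e.g., call 401000, where 401000 lies in an invalid region
  Indirect control transfers, e.g., call ptr[eax], jmp eax, or push eax followed by ret, whose targets are unknown until run time
  Exception-based control transfers, e.g., xor eax, eax followed by mov ebx, ptr[eax], which faults and transfers control to an exception handler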
Overwritten code discovery
Update after overwrite
1. Handle overwrite signal
   a) instrument write loop exits
   b) copy overwritten page
   c) restore write permissions
   d) resume execution
2. Update CFG when writes end
   a) remove overwritten and unreachable blocks
   b) parse at entry points to overwritten regions
   c) remove write permissions
   d) resume execution
[Diagram: code pages switch between R-X and RWX; a write to a code page reaches the Dyninst code write handler, and a callback (cb) at the write loop exit reaches the CFG update routine]
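The two-phase update can be modeled with a single page object. This is a sketch under stated assumptions, not the Dyninst implementation: `CodePage` and its methods are hypothetical, a permission string stands in for real page protection, and comparing the saved copy against the page stands in for removing and re-parsing blocks.

```python
class CodePage:
    """Hypothetical model of one code page tracked by the instrumenter."""

    def __init__(self, data):
        self.data = bytearray(data)
        self.perms = "R-X"       # write permission removed after parsing
        self.snapshot = None
        self.log = []

    def write(self, offset, value):
        if "W" not in self.perms:
            self.on_overwrite_signal()    # the first write faults
        self.data[offset] = value

    def on_overwrite_signal(self):
        # 1a) write loop exits would be instrumented here
        self.snapshot = bytes(self.data)  # 1b) copy overwritten page
        self.perms = "RWX"                # 1c) restore write permissions
        self.log.append("signal")         # 1d) resume execution

    def write_loop_exit(self):
        # 2a, 2b) the changed offsets mark what to remove and re-parse
        changed = [i for i, (old, new) in enumerate(zip(self.snapshot, self.data))
                   if old != new]
        self.perms = "R-X"                # 2c) remove write permissions
        self.log.append("cfg-update")
        return changed                    # 2d) resume execution

page = CodePage(b"\x90" * 8)
for off in (2, 3, 4):
    page.write(off, 0xCC)
changed = page.write_loop_exit()
print(changed, page.perms, page.log)
```

Only the first write faults; later writes in the same loop run at full speed because the page stays writable until the loop exits.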
PC-sensitivity analysis
SR-Dyninst analyzes PC-sensitive code and compensates when relocating it.
Original code in the process (main):
    call
    ...data...
    pop esi
    add esi, eax
    mov ebx, ptr[esi]
    ...
Relocated code (reloc_main):
    push <orig>
    jmp 0
    pop esi
    add esi, eax
    mov ebx, ptr[esi]
    ...
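Why the relocated code pushes the original address can be seen in a small model. The sketch below is illustrative only: the addresses, the offset in eax, and the `local_data_access` helper are hypothetical, and a dictionary stands in for memory.

```python
CALL_LEN = 5  # size of a call rel32 instruction

def local_data_access(pushed_value, eax, memory):
    """Model of: pop esi; add esi, eax; mov ebx, ptr[esi]."""
    esi = pushed_value       # pop esi
    esi += eax               # add esi, eax
    return memory.get(esi)   # mov ebx, ptr[esi]; None models a bad read

ORIG_CALL = 0x401000    # hypothetical address of the original call
RELOC_CALL = 0x863828   # hypothetical address of the relocated copy
memory = {ORIG_CALL + CALL_LEN + 8: "secret"}  # data 8 bytes past the return address

# Original code: the call pushes its own return address.
original = local_data_access(ORIG_CALL + CALL_LEN, 8, memory)
# Naive relocation: the moved call pushes the new return address.
naive = local_data_access(RELOC_CALL + CALL_LEN, 8, memory)
# Compensated relocation: push <orig>; jmp, so the pop sees the original PC.
compensated = local_data_access(ORIG_CALL + CALL_LEN, 8, memory)
print(original, naive, compensated)
```

The naive copy computes a pointer relative to its new location and misses the data, while the compensated copy behaves like the original.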
Anti-anti patching
Checksum routine:
    xor eax, eax
  .loop:
    add eax, ptr[ebx]
    add 4, ebx
    cmp ebx, 0x41000
    jne .loop
    cmp eax, .chksum
    jne .fail            ; pass or fail
Instrumented loop body:
    save state
    jmp 863828
    emulate(add eax, ptr[ebx])    ; the emulated read uses shadow memory
    restore state
    add 4, ebx
    cmp ebx, 0x41000
    jne .loop
[Diagram: address space with data, code, code, and instrumentation regions; the code regions contain patches, and shadow memory holds the unpatched contents]
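The role of shadow memory can be sketched as follows. This is a simplified model, not SR-Dyninst code: `ShadowedMemory` and its methods are hypothetical, the region contents are made up, and reading from a pristine copy stands in for emulating the checksum's memory access.

```python
import struct

class ShadowedMemory:
    def __init__(self, image):
        self.live = bytearray(image)   # what actually executes (may be patched)
        self.shadow = bytes(image)     # pristine copy taken before patching

    def patch(self, addr, code):
        self.live[addr:addr + len(code)] = code

    def raw_read32(self, addr):
        return struct.unpack_from("<I", self.live, addr)[0]

    def emulated_read32(self, addr):
        """Stand-in for emulate(add eax, ptr[ebx]): read the shadow copy."""
        return struct.unpack_from("<I", self.shadow, addr)[0]

def checksum(read32, start, end):
    total = 0
    for addr in range(start, end, 4):
        total = (total + read32(addr)) & 0xFFFFFFFF
    return total

mem = ShadowedMemory(bytes([0x90] * 64))   # hypothetical protected region
expected = checksum(mem.raw_read32, 0, 64)
mem.patch(16, b"\xe9" + struct.pack("<i", 0x1000))

raw_ok = checksum(mem.raw_read32, 0, 64) == expected
emulated_ok = checksum(mem.emulated_read32, 0, 64) == expected
print(raw_ok, emulated_ok)
```

With raw reads the patch changes the checksum; with emulated reads the program sees its original bytes and the check passes.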
Address-space scanning
Scan routine:
    xor eax, eax
  .loop:
    call chk_mem
    mov ptr[eax], ebx
    add 4, eax
    cmp eax, 0x0
    jne .loop            ; pass or fail
Instrumented loop body:
    save state
    jmp 863828
    emulate(mov ptr[eax], ebx)
    restore state
    add 4, eax
    cmp eax, 0x0
    jne .loop
Signal handlers: the program's segv_handler and Dyninst's dyn_segv_handler
[Diagram: address space with data, code, code, and instrumentation regions; the code regions contain patches]
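The slide does not spell out how the emulated access defeats the scan. The sketch below shows one plausible reading, stated as an assumption: the emulation treats instrumentation pages as unallocated, so the program's own fault handling runs for them. All names, addresses, and the page map are hypothetical.

```python
PAGE = 0x1000

class SegFault(Exception):
    """Stands in for SIGSEGV delivered to the program's segv_handler."""

def emulated_write(addr, pages, instrumentation_pages):
    """Hypothetical emulation of mov ptr[eax], ebx for a scanning program."""
    page = addr - addr % PAGE
    if page not in pages or page in instrumentation_pages:
        raise SegFault(hex(addr))   # looks unallocated to the program
    return pages[page]

def scan(pages, instrumentation_pages, limit):
    visible = []
    for addr in range(0, limit, PAGE):
        try:
            emulated_write(addr, pages, instrumentation_pages)
            visible.append(addr)
        except SegFault:
            pass                    # the program's handler moves to the next page
    return visible

pages = {0x1000: "data", 0x2000: "code", 0x3000: "code", 0x5000: "instrumentation"}
visible = scan(pages, {0x5000}, 0x8000)
print([hex(a) for a in visible])
```

Under this assumption the scan reports only the pages the program allocated itself.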
The packers we're studying

Packer              Malware market share [1]
MEW                 0.13%
WinUPack            0.17%
Yoda's Protector    0.33%
Armadillo           0.37%
Asprotect           0.43%
FSG                 1.26%
Aspack              1.29%
nPack               1.74%
Upack               2.08%
PECompact           2.59%
Themida             2.95%
EXECryptor          4.06%
PolyEnE             6.21%
UPX                 9.45%
Nspack              0.89%

[1] Packer (r)evolution. Panda Research, 2008. Two-month average, Feb-March 2008.
[Table: further columns mark each packer as self-modifying, anti-instrumentation, or obfuscated, flag anti-debugging techniques, and give results for Dyninst and SR-Dyninst]
Improved Dyninst overhead
  Reduced relocation overhead despite emulation
  Better handling of program features: exceptions, indirect control flow
Conclusion
SR-Dyninst gives you:
  All the benefits of Dyninst on malware
  Safer instrumentation on normal binaries
Ongoing work:
  Anti-debugger techniques
  More descriptive CFGs
  Automated defensive-mode activation
  SR-Dyninst in next Dyninst release