Automatic Code Features Extraction Using Bio-inspired Algorithms
EICAR 2013
Ciprian Oprișa, George Cabău and Adrian Coleșa
Bitdefender, Technical University of Cluj-Napoca
November 18, 2013
Agenda
1 Introduction
2 Objectives
3 OpCodes Extraction and Normalization
4 Automatic Filters Selection
5 Experimental results
6 Conclusions and future work
C. Oprișa (Bitdefender), Automatic Code Features Extraction Using Bio-inspired Algorithms, November 18, 2013
1. Introduction
Where are we? (1)
We need to detect malware.
[Diagram: sample → hash(es) → lookup in the malware database → two possible outcomes]
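Where are we? (2)
Disassembled OpCodes: push, mov, sub, mov, push, lea, push, call, mov, ...
One symbol per OpCode: pmsmplpcmlpctjczczczmJ
Sliding window of 8 symbols (n-grams): <pmsmplpc>, <msmplpcm>, <smplpcml>, <mplpcmlp>, <plpcmlpc>, ...
[Diagram: n-grams → lookup in the malware database → two possible outcomes]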
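The encoding and windowing steps on this slide can be sketched in Python. The one-letter symbol table is read off the slide's example (p = push, m = mov, s = sub, l = lea, c = call) and is only a fragment of whatever table the authors use; the function names `encode` and `ngrams` are illustrative.

```python
# Symbol table inferred from the slide's example; the full table is not given there.
SYMBOLS = {"push": "p", "mov": "m", "sub": "s", "lea": "l", "call": "c"}


def encode(opcodes):
    """Map each OpCode mnemonic to its one-letter symbol."""
    return "".join(SYMBOLS[op] for op in opcodes)


def ngrams(sequence, n=8):
    """Return every window of n consecutive symbols, in order of appearance."""
    return [sequence[i:i + n] for i in range(len(sequence) - n + 1)]


opcodes = ["push", "mov", "sub", "mov", "push", "lea", "push", "call", "mov"]
encoded = encode(opcodes)
windows = ngrams(encoded, n=8)
```

Applied to the full symbol string from the slide, `ngrams("pmsmplpcmlpctjczczczmJ")` starts with the same five 8-grams listed above.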
2. Objectives
Objectives
Goal: improve detection of .NET malware by filtering the OpCodes to extract more meaningful n-grams.
Extract OpCode sequences from .NET applications.
Eliminate unreachable code.
Design a fitness function to evaluate the quality of an OpCode filter.
Use bio-inspired algorithms to find the best filter.
3. OpCodes Extraction and Normalization
Parsing and disassembling .NET
An extension of the Microsoft Portable Executable format
Many, many tables
=== Method 4: name='mpress._::Main'; RVA=0x0000254C; FA=0x0000074C; size=0x9A ===
= Exception handlers: 000025D6; =
0000254C: [00] nop
0000254D: [28 0E 00 00 0A] call 0x0A00000E
00002552: [12 00] ldloca.s 0x00
00002554: [28 03 00 00 06] call 0x06000003
00002559: [13 06] stloc.s 0x06
0000255B: [11 06] ldloc.s 0x06
0000255D: [2D 10] brtrue.s 0x10
0000255F: [00] nop
00002560: [72 01 00 00 70] ldstr 0x70000001
00002565: [72 23 00 00 70] ldstr 0x70000023
0000256A: [28 0F 00 00 0A] call 0x0A00000F
0000256F: [26] pop
00002570: [15] ldc.i4.m1
00002571: [13 05] stloc.s 0x05
00002573: [2B 02] br.s 0x02
00002575: [26] pop
00002576: [06] ldloc.0
00002577: [28 10 00 00 0A] call 0x0A000010
0000257C: [80 01 00 00 04] stsfld 0x04000001
00002581: [7E 01 00 00 04] ldsfld 0x04000001
00002586: [6F 11 00 00 0A] callvirt 0x0A000011
0000258B: [0B] stloc.1
...
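As a rough illustration of how such a listing is produced, the sketch below decodes raw CIL bytes with a linear sweep. The opcode table covers only the instructions visible in the listing above, with operand sizes taken from the byte groups shown there; `disassemble` and its `base` parameter are illustrative names, not the authors' tool.

```python
# Byte value -> (mnemonic, operand size in bytes), limited to the listing above.
OPCODES = {
    0x00: ("nop", 0),
    0x06: ("ldloc.0", 0),
    0x0B: ("stloc.1", 0),
    0x11: ("ldloc.s", 1),
    0x12: ("ldloca.s", 1),
    0x13: ("stloc.s", 1),
    0x15: ("ldc.i4.m1", 0),
    0x26: ("pop", 0),
    0x28: ("call", 4),
    0x2B: ("br.s", 1),
    0x2D: ("brtrue.s", 1),
    0x6F: ("callvirt", 4),
    0x72: ("ldstr", 4),
    0x7E: ("ldsfld", 4),
    0x80: ("stsfld", 4),
}


def disassemble(code, base=0):
    """Decode code (bytes) into (address, mnemonic, operand) tuples."""
    result = []
    pos = 0
    while pos < len(code):
        opcode = code[pos]
        if opcode not in OPCODES:
            raise ValueError(f"opcode 0x{opcode:02X} is not in this partial table")
        mnemonic, size = OPCODES[opcode]
        raw = code[pos + 1:pos + 1 + size]
        # Operands are stored little-endian, e.g. 0E 00 00 0A -> 0x0A00000E.
        operand = int.from_bytes(raw, "little") if size else None
        result.append((base + pos, mnemonic, operand))
        pos += 1 + size
    return result


# First five instructions of the method shown above.
method_bytes = bytes.fromhex("00 28 0E 00 00 0A 12 00 28 03 00 00 06 13 06")
decoded = disassemble(method_bytes, base=0x254C)
```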
CIL instruction types
instructions that move data around: ldc (load constant), ldarg (load argument), ...
arithmetic and logic instructions: add, div, or, and, xor, ...
object model instructions: newobj, ...
instructions that modify the control flow:
    returning instructions (call, callvirt)
    unconditional branches (br, br.s)
    conditional branches (brtrue, brfalse, beq.s)
    flow disruptive instructions (ret, throw, jmp)
Eliminating unreachable code
Enqueue the entry point and exception handlers.
While the queue is not empty:
    Dequeue the next address.
    Sweep until already reached code or the end of the buffer is encountered:
        Unconditional branch → follow the branch
        Conditional branch → enqueue the branch target, continue sweeping
        Flow disruptive instruction → stop the current sweep
[Figure: the queue next to an instruction stream i1, i2, i3, ..., ik, ik+1 containing br, brtrue and ret, animated to show which instructions the sweep reaches]
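The sweep described on this slide can be sketched as follows. For simplicity the sketch works on instruction indices rather than byte addresses, and represents each instruction as a hypothetical `(mnemonic, target_index)` pair; the three mnemonic sets are small samples based on the instruction types listed earlier, not complete tables.

```python
from collections import deque

UNCONDITIONAL = {"br", "br.s"}
CONDITIONAL = {"brtrue", "brfalse", "brtrue.s", "brfalse.s", "beq.s"}
FLOW_DISRUPTIVE = {"ret", "throw", "jmp"}


def reachable(instructions, entry_points):
    """Return the set of instruction indices reachable from the entry points."""
    reached = set()
    queue = deque(entry_points)
    while queue:
        pos = queue.popleft()
        # Sweep until already reached code or the end of the buffer.
        while 0 <= pos < len(instructions) and pos not in reached:
            reached.add(pos)
            mnemonic, target = instructions[pos]
            if mnemonic in UNCONDITIONAL:
                pos = target                  # follow the branch
            elif mnemonic in CONDITIONAL:
                queue.append(target)          # visit the target later
                pos += 1                      # and continue sweeping
            elif mnemonic in FLOW_DISRUPTIVE:
                break                         # stop the current sweep
            else:
                pos += 1
    return reached


program = [
    ("ldloc.0", None),  # 0
    ("brtrue", 4),      # 1: conditional branch to 4
    ("pop", None),      # 2
    ("ret", None),      # 3
    ("br", 6),          # 4: unconditional branch to 6
    ("nop", None),      # 5: never reached
    ("ret", None),      # 6
    ("pop", None),      # 7: never reached
]
live = sorted(reachable(program, entry_points=[0]))
```

Exception handler addresses would be passed in `entry_points` together with the method's entry point, as the first step of the slide prescribes.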
OpCodes normalization
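Definition: the basic normalization function
$$\mathrm{normal} : O \to \Sigma \cup \{\varepsilon\}$$
$$\mathrm{normal}(\texttt{nop}) = \varepsilon$$
$$\mathrm{normal}(\texttt{brtrue}) = \mathrm{normal}(\texttt{brfalse})$$
Definition: filtering (Λ-normalization), Λ ⊆ Σ
$$\mathrm{normal}_{\Lambda}(o) = \begin{cases} \mathrm{normal}(o), & \text{if } \mathrm{normal}(o) \in \Lambda \\ \varepsilon, & \text{otherwise} \end{cases}$$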
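A minimal sketch of the two definitions, with the empty string standing in for ε. The symbol assignments in `NORMAL` are hypothetical: the slides only state that nop maps to ε and that brtrue and brfalse share a symbol, so every other entry is an assumption made for illustration.

```python
# Hypothetical fragment of the normalization table; only the nop and
# brtrue/brfalse rows are stated on the slide.
NORMAL = {
    "nop": "",        # normal(nop) = epsilon
    "brtrue": "j",    # normal(brtrue) = normal(brfalse)
    "brfalse": "j",
    "call": "c",
    "ldstr": "s",
    "pop": "p",
}


def normal(opcode):
    """Basic normalization: OpCode -> symbol of the alphabet, or '' for epsilon."""
    return NORMAL[opcode]


def normal_filtered(opcode, lam):
    """Lambda-normalization: keep the symbol only if it belongs to the filter."""
    symbol = normal(opcode)
    return symbol if symbol in lam else ""


def normalize_sequence(opcodes, lam):
    """Normalize a whole OpCode sequence under the filter lam."""
    return "".join(normal_filtered(op, lam) for op in opcodes)


sequence = ["nop", "call", "brtrue", "pop", "ldstr", "brfalse"]
kept = normalize_sequence(sequence, lam={"c", "j"})
```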
4. Automatic Filters Selection
Λ-detectability
Sequences of symbols from:
558695 clean methods
272 malware clusters
Different n-grams for different filters Λ.
p1 → ng1, ng3, ng4, ng5, ng7 → [cleanset filtering] → ng1, ng4, ng7
p2 → ng2, ng4, ng5, ng8, ng9 → [cleanset filtering] → ng2, ng4, ng8, ng9
p3 → ng3, ng4, ng5, ng6 → [cleanset filtering] → ng4, ng6
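The cleanset filtering step on this slide can be reproduced with plain set operations. The clean set `{ng3, ng5}` is inferred from what the filtering removes in the example. Treating p1, p2 and p3 as samples of one cluster, and calling the cluster detectable when some n-gram survives in all of them, is an assumption; the slides do not spell out the exact criterion.

```python
# n-grams extracted from three samples, as on the slide.
samples = {
    "p1": {"ng1", "ng3", "ng4", "ng5", "ng7"},
    "p2": {"ng2", "ng4", "ng5", "ng8", "ng9"},
    "p3": {"ng3", "ng4", "ng5", "ng6"},
}

# Inferred: the n-grams that disappear after cleanset filtering.
clean_ngrams = {"ng3", "ng5"}

# Drop every n-gram that also occurs in clean code.
filtered = {name: grams - clean_ngrams for name, grams in samples.items()}

# Assumed criterion: n-grams shared by all samples after filtering.
shared = set.intersection(*filtered.values())
cluster_detectable = bool(shared)
```

Under this reading, ng4 is the only n-gram that is both absent from the clean set and present in all three samples.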
The fitness function
Definition: the fitness function
$$f : \mathcal{P}(\Sigma) \to \mathbb{R}, \qquad f(\Lambda) = \frac{\text{clusters detectability}}{\text{number of clusters}}$$
Search space: $|\mathcal{P}(\Sigma)| = 2^{|\Sigma|}$
Example
ecbeceaaed
bedccecaeed
Λ = Σ = {a, b, c, d, e}:
ec be ce a a ed
be dc ce c a e ed
Λ = {e}:
eeee
eeee
Λ = {a, b, e}:
e beea a e
beea e e
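The example can be replayed in a few lines: apply each filter to the two sequences and look at the longest run of symbols they still share. The helper names are illustrative, and using the longest common substring as the comparison is only a convenient way to see the effect of the filter, not the fitness function itself.

```python
def apply_filter(sequence, lam):
    """Keep only the symbols of sequence that belong to the filter lam."""
    return "".join(symbol for symbol in sequence if symbol in lam)


def longest_common_substring(a, b):
    """Return the first longest substring of a that also occurs in b."""
    best_len, best_end = 0, 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        cur = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                cur[j] = prev[j - 1] + 1
                if cur[j] > best_len:
                    best_len, best_end = cur[j], i
        prev = cur
    return a[best_end - best_len:best_end]


first, second = "ecbeceaaed", "bedccecaeed"

for lam in (set("abcde"), set("e"), set("abe")):
    x, y = apply_filter(first, lam), apply_filter(second, lam)
    print(sorted(lam), x, y, longest_common_substring(x, y))
```

With the full alphabet the two sequences share only 2-symbol runs, with Λ = {a, b, e} they share the 4-symbol run beea, and with Λ = {e} they collapse to the same string, which illustrates why the choice of filter matters.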
Evolutionary algorithms
Start with a population of random solutions.
At each step, the individuals interact and evolve towards better solutions.
Eventually, they should reach an optimum solution (global or local).
Genetic Algorithm
Particle Swarm Optimization
Genetic Algorithm
Binary encoding: 0 1 1 0 1 1 0 . . . 1 0
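Crossover: $\Lambda_1, \Lambda_2 \xrightarrow{\text{crossover}} \Lambda'_1, \Lambda'_2$
Mutation: $\Lambda \xrightarrow{\text{mutation}} \Lambda'$
Roulette Wheel selection: $P_{\text{selection}}(\Lambda_k) = \dfrac{f(\Lambda_k)}{\sum_{\Lambda} f(\Lambda)}$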
Elitism
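A compact sketch of a genetic algorithm with the ingredients listed on this slide. Several details are assumptions, since the slides do not specify them: single-point crossover, independent bit-flip mutation, one elite individual per generation, and all numeric parameters. The fitness function is passed in by the caller and must be non-negative for roulette wheel selection to work.

```python
import random


def roulette_select(population, fitnesses, rng):
    """Pick an individual with probability f(individual) / sum of all f."""
    total = sum(fitnesses)
    if total == 0:
        return rng.choice(population)
    pick = rng.uniform(0, total)
    running = 0.0
    for individual, fit in zip(population, fitnesses):
        running += fit
        if pick <= running:
            return individual
    return population[-1]


def crossover(a, b, rng):
    """Single-point crossover (assumed variant): swap the tails of two parents."""
    point = rng.randint(1, len(a) - 1)
    return a[:point] + b[point:], b[:point] + a[point:]


def mutate(bits, rate, rng):
    """Flip each bit independently with probability rate (assumed variant)."""
    return [1 - bit if rng.random() < rate else bit for bit in bits]


def evolve(fitness, n_bits, pop_size=20, generations=30, rate=0.05, seed=0):
    """Run the GA; return the best individual and the best fitness per generation."""
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        fitnesses = [fitness(ind) for ind in population]
        best_index = max(range(pop_size), key=fitnesses.__getitem__)
        history.append(fitnesses[best_index])
        next_gen = [population[best_index][:]]   # elitism: keep the best unchanged
        while len(next_gen) < pop_size:
            p1 = roulette_select(population, fitnesses, rng)
            p2 = roulette_select(population, fitnesses, rng)
            c1, c2 = crossover(p1, p2, rng)
            next_gen.extend([mutate(c1, rate, rng), mutate(c2, rate, rng)])
        population = next_gen[:pop_size]
    return max(population, key=fitness), history


# Toy fitness for demonstration only: fraction of bits matching a made-up target.
TARGET = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1]


def toy_fitness(bits):
    return sum(b == t for b, t in zip(bits, TARGET)) / len(TARGET)


best, history = evolve(toy_fitness, n_bits=len(TARGET))
```

In the setting of the slides, bit i of an individual would say whether symbol i of Σ belongs to the filter Λ, and `toy_fitness` would be replaced by f(Λ).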
Particle Swarm Optimization
Representation: $p = (X, V, X_{best}, \text{best fitness})$, with $X \in [0, 1]^{|\Sigma|}$ and $V \in [-1, 1]^{|\Sigma|}$
Update:
$$X' = X + V$$
$$V' = \omega V + \varphi_1 r_1 (X_{best} - X) + \varphi_2 r_2 (\text{global best} - X)$$
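One particle update, written out as a sketch. The following are assumptions that the slide leaves open: the values of ω, φ1 and φ2, drawing r1 and r2 once per update, clamping X and V back into their stated ranges, computing V' from the old position X, and decoding a position into a filter by thresholding at 0.5.

```python
import random


def clamp(value, low, high):
    return max(low, min(high, value))


def update_particle(x, v, x_best, global_best,
                    omega=0.7, phi1=1.5, phi2=1.5, rng=random):
    """Return (X', V') for one particle, following the update rule above."""
    r1, r2 = rng.random(), rng.random()
    # X' = X + V, kept inside [0, 1] (clamping is an assumption).
    new_x = [clamp(xi + vi, 0.0, 1.0) for xi, vi in zip(x, v)]
    # V' = omega*V + phi1*r1*(Xbest - X) + phi2*r2*(global best - X),
    # kept inside [-1, 1] (clamping is an assumption).
    new_v = [
        clamp(omega * vi + phi1 * r1 * (bi - xi) + phi2 * r2 * (gi - xi), -1.0, 1.0)
        for xi, vi, bi, gi in zip(x, v, x_best, global_best)
    ]
    return new_x, new_v


def to_filter(x, alphabet, threshold=0.5):
    """Assumed decoding: a symbol is in the filter when its coordinate >= threshold."""
    return {symbol for symbol, xi in zip(alphabet, x) if xi >= threshold}


rng = random.Random(0)
x, v = [0.2, 0.9, 0.5], [0.1, -0.3, 0.0]
new_x, new_v = update_particle(x, v, x_best=[0.3, 0.8, 0.5],
                               global_best=[1.0, 0.0, 0.5], rng=rng)
selected = to_filter(new_x, alphabet="abc")
```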
5. Experimental results
5. Experimental results (1)
Parallel speedup for the fitness function:
Amdahl's law: $S(k) = \dfrac{T(1)}{T(k)} = \dfrac{1}{B + \frac{1 - B}{k}}$
Experimentally, $B = 0.04$, so $S_{max} = \lim_{k \to \infty} S(k) = \dfrac{1}{B} = 25$
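To get a feel for these numbers, the formula can be evaluated for a few processor counts. B = 0.04 is the value reported on the slide; the chosen values of k are arbitrary examples.

```python
def speedup(k, b=0.04):
    """Amdahl's law: speedup on k processors when a fraction b is sequential."""
    return 1.0 / (b + (1.0 - b) / k)


for k in (1, 8, 24, 1000):
    print(k, round(speedup(k), 2))
```

With B = 0.04, 8 processors give a speedup of 6.25 and 24 processors give 12.5, half of the theoretical limit of 25.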
5. Experimental results (2)
Learning evaluation:
Best fitness learnt:
GA: 0.3965
PSO: 0.4029
Cross-validation results:
                          GA best   PSO best
Similar malware samples   0.1819    0.1833
Obfuscated samples        0.8859    0.8859
6. Conclusions and future work
Conclusions
n-grams are a robust way to classify programs.
Existing methods can be improved by filtering the OpCode sequences.
Bio-inspired algorithms can be used for finding good filters.
Thank you!
Questions?