Who’s watching your network
Deep Inside an AntiVirus Engine
Network Associates, Inc.
Jimmy Kuo
Director, AV Research
[email protected]@alumni.caltech.edu
Stanford, 16MAR99
Agenda
Short description of viruses
Environments
Purposes of an antivirus engine
Detection technologies
Virus removal technologies
Wrap-up
Viruses
Replicate!
Virus Types
File Viruses: Com, Exe (DLL, VxD), Bat, Sys, mIRC, Html
Boot Viruses: Boot Sector, Master Boot Records
Macro Viruses: Word, Excel, PowerPoint, Access
Multipartite
40000
Virus growth through the years (Dr Solomon’s count of viruses and trojans)
[Chart: count by year, 1990 to 1998; vertical axis 0 to 40000]
Environment Determination
PC
OLE2 files
Compressed files
Self-extracting files
.BAT files, mIRC script, VB Script
UNIX filesystems
Protocols
FTP
HTTP
SNMP
SMTP
NNTP
TCP/IP
MIME, uuencode, SSL, PGP
Purpose
We deal with users who have problems on their computers, problems they do not know how to handle.
1. Relieve the panic.
2. Understand the problem.
3. Resolve what the user understands to be the problem.
McAfee (NAI) Mantra
1. Detect all viruses.
2. The program is running on a clean machine.
3. Don’t give them a reason not to use your product.
The Technology
Signature Search
Data organization
– In memory
– On disk
Only things essential to detection are stored in memory. Names, repair information, and virus information are all stored on disk.
Signature Organization, Case 0
All strings kept in memory.
All strings of the same type.
Method died out when the number of viruses neared 1000.
Signature Organization, Case 1
Split into virus types: Boot viruses, File viruses (Algorithmic detection, CRC detection), Macro viruses.
Boot virus strings are swapped to disk and pulled in only if the target file looks like a boot image (55AA signature).
CRCs are used for those viruses that don’t change. Verification information is kept on disk.
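The boot-image test and the CRC lookup on this slide might be sketched as follows. This is only an illustration under assumptions: the function names, the `load_boot_signatures` callback, and the choice of CRC-32 from Python's `zlib` are not details given in the talk.

```python
import zlib

SECTOR_SIZE = 512

def looks_like_boot_image(data: bytes) -> bool:
    """A boot sector ends with the bytes 55 AA at offsets 510 and 511."""
    return len(data) >= SECTOR_SIZE and data[510:512] == b"\x55\xaa"

def signature_sets_for(data: bytes, file_signatures, load_boot_signatures):
    """Return the signature sets worth searching for this target.

    load_boot_signatures is a hypothetical callback that reads the boot
    virus strings from disk; it is only called for boot-like targets.
    """
    sets = [file_signatures]
    if looks_like_boot_image(data):
        sets.append(load_boot_signatures())
    return sets

def crc_lookup(data: bytes, length: int, known_crcs: dict):
    """Look up the CRC of the first `length` bytes (CRC-32 is an assumption)."""
    return known_crcs.get(zlib.crc32(data[:length]))
```

Deferring the disk read to the callback mirrors the slide's point that boot strings cost nothing in memory unless the target looks like a boot image.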
Signature Organization, Case 2
All detection strings kept in memory.
• Sorted into separate bins.
• Only the particular bin that could contain the virus string is stored in low memory.
• All else stored in EMS or XMS.
Virus removal information and names stored on disk.
Signature Organization, Case 3
All detection strings stored in memory. Some are classed as “not necessary for the average user” and not used unless specifically requested.
All verification information stored on disk (or EMS or XMS if available).
Strings sorted into groups which have common start characteristics.
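One way to picture strings grouped by common start characteristics is to bin them by their first byte, so that each position in the target is compared only against the strings that could begin there. The sketch below assumes that grouping rule and uses made-up signature names; the talk does not say how the real groups were defined.

```python
from collections import defaultdict

def build_bins(signatures: dict) -> dict:
    """Group detection strings by their first byte (an assumed grouping rule)."""
    bins = defaultdict(list)
    for name, pattern in signatures.items():
        bins[pattern[0]].append((name, pattern))
    return bins

def scan(data: bytes, bins: dict) -> list:
    """Return (name, offset) for every detection string found in data."""
    hits = []
    for offset, byte in enumerate(data):
        # Most bytes have an empty bin, so the common case is one lookup.
        for name, pattern in bins.get(byte, ()):
            if data.startswith(pattern, offset):
                hits.append((name, offset))
    return hits
```

Usage: `scan(target_bytes, build_bins({"Demo.A": b"\xde\xad\xbe\xef"}))` returns the offsets at which the hypothetical `Demo.A` string occurs.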
Signature Search Algorithms
Needs to be “front end fast.” If there’s a virus, the search can take longer. But most things are not viruses, so it should be as quick as possible to determine that the target is not there.
No time allowed for front end setup.
So, quick and simple wins out.
Code Tracers
Simplified emulation, but faster.
Static emulation.
Only have to know instruction length and flow transfer statements.
Code Tracing, Case 1
Given a target COM file,
For specific cases of known flow transfers (jmp, call, push/ret, minor variations of such),
Get to a fixed location, start searching for viruses here.
Problem cases: polymorphic entry code
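A minimal sketch of this kind of tracing follows, assuming only three entry forms are recognized: `jmp near` (E9), `call near` (E8), and `jmp short` (EB). The function name and the hop limit are illustrative, and push/ret and other variations are left out.

```python
import struct

COM_LOAD_OFFSET = 0x100  # DOS loads a COM image at offset 100h in its segment

def follow_entry(data: bytes, max_hops: int = 8) -> int:
    """Follow leading jmp/call transfers; return the file offset reached."""
    pos = 0
    for _ in range(max_hops):
        if pos + 1 >= len(data):
            break
        op = data[pos]
        if op in (0xE9, 0xE8) and pos + 3 <= len(data):
            # jmp near / call near: 16-bit displacement from the next instruction
            rel = struct.unpack_from("<H", data, pos + 1)[0]
            ip = (pos + COM_LOAD_OFFSET + 3 + rel) & 0xFFFF
        elif op == 0xEB:
            # jmp short: signed 8-bit displacement from the next instruction
            rel = struct.unpack_from("<b", data, pos + 1)[0]
            ip = (pos + COM_LOAD_OFFSET + 2 + rel) & 0xFFFF
        else:
            break
        target = ip - COM_LOAD_OFFSET
        if not 0 <= target < len(data):
            break
        pos = target
    return pos
```

The returned offset is the fixed location from which a signature search would start. Only instruction lengths and transfer targets are needed, which is why this is cheaper than emulation.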
Code Tracing, Case 2
Given a target COM file,
Trace code path through all available paths, until out of buffer. Remember opcodes. Use in opcode string matching.
First time out of buffer, trace again. Remember opcodes again.
Problem cases: Appending virus, appended to a small host.
Code Tracing, Case 3
Given a target COM file,
Organize your virus database according to the different types of entry code.
Search against only those viruses that use that type of entry code.
This is the current technique we’re using.
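As a rough illustration of organizing the database by entry code, the sketch below classifies a COM file by its first opcode and returns only the viruses filed under that class. The class names and the database layout are assumptions; the real classification is not described in the talk.

```python
ENTRY_TYPES = {0xE9: "jmp_near", 0xEB: "jmp_short", 0xE8: "call_near"}

def entry_type(data: bytes) -> str:
    """Classify a COM file by its first opcode (hypothetical classes)."""
    if not data:
        return "empty"
    return ENTRY_TYPES.get(data[0], "other")

def candidates(data: bytes, database: dict) -> list:
    """Return only the viruses known to use this file's type of entry code."""
    return database.get(entry_type(data), [])
```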
Emulators
Intel 80x86, primarily 8086.
Now 80386 also needed.
Portable.
Apple emulation.
Emulator Problems
Prefetch queue length.
How much of the environment do you include in the emulation?
The “perfect” emulator takes too much time and memory.
Result: Emulate situations required for known viruses. Needs upgrading to match reality.
Code Matrix
Matrix of opcode digraphs.
Map the set of opcodes gathered from code trace onto the opcode digraphs of known viruses. If it does not match, it cannot be that virus.
Add digraph matrices together to save memory space.
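The digraph test can be sketched with one bit per ordered opcode pair. The sketch assumes that a match means every pair seen in the trace is also present in the virus's matrix, and that adding matrices means a bitwise OR; both are interpretations of the slide, and the function names are illustrative.

```python
def digraph_mask(opcodes) -> int:
    """Set one bit for each ordered pair of consecutive opcodes."""
    mask = 0
    for first, second in zip(opcodes, opcodes[1:]):
        mask |= 1 << (first * 256 + second)
    return mask

def could_match(trace_mask: int, virus_mask: int) -> bool:
    """False if the trace contains a pair the virus never produces."""
    return (trace_mask & ~virus_mask) == 0

def combine(masks) -> int:
    """OR several matrices into one, trading precision for memory."""
    combined = 0
    for mask in masks:
        combined |= mask
    return combined
```

A combined matrix can only rule out the whole group at once, so it rejects fewer targets than the individual matrices would, but it needs the space of a single matrix.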
Special Case Code
• Loop detection.
  – Likely to need decrypting (emulate)
• Probability distribution (a particular virus uses Rotates much too often).
• Polymorphic viruses too difficult to handle otherwise.
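The probability-distribution idea can be illustrated by measuring how often rotate instructions appear in a trace. The sketch assumes the tracer yields `(opcode, modrm)` pairs and uses an arbitrary threshold; neither comes from the talk. On x86, opcodes C0, C1 and D0 to D3 form the shift/rotate group, and the ModRM reg field selects the operation.

```python
ROTATE_SHIFT_OPCODES = {0xC0, 0xC1, 0xD0, 0xD1, 0xD2, 0xD3}

def is_rotate(opcode: int, modrm: int) -> bool:
    """Reg field 0..3 selects ROL, ROR, RCL, RCR; 4..7 are shifts."""
    return opcode in ROTATE_SHIFT_OPCODES and ((modrm >> 3) & 7) <= 3

def rotate_fraction(instructions) -> float:
    """Fraction of traced instructions that are rotates."""
    if not instructions:
        return 0.0
    rotates = sum(1 for opcode, modrm in instructions if is_rotate(opcode, modrm))
    return rotates / len(instructions)

def looks_rotate_heavy(instructions, threshold: float = 0.25) -> bool:
    """Flag a trace whose rotate share exceeds an assumed threshold."""
    return rotate_fraction(instructions) > threshold
```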
OLE2 Files (Macro Viruses)
An OLE2 file is a filesystem in a file.
It’s a proprietary format belonging to Microsoft.
Cracking the OLE2 format was easy. Next comes the Word document stream, the Excel spreadsheet stream, WordBasic, Visual Basic, ...
Other Macro Virus Issues
Word6 macro encryption/protection is a single-byte XOR. The key is available in the document.
Office97 macro protection is GUI only. Actual code is not encrypted at all.
Excel95 password protection is almost trivial. It uses a 16-byte XOR key with minor on-the-fly calculations.
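A single-byte XOR is its own inverse, which is why it offers so little protection once the key is known. The sketch below shows only that operation; where the key sits in a Word6 document is not covered here, and the function name is illustrative.

```python
def xor_single_byte(data: bytes, key: int) -> bytes:
    """XOR every byte with the same one-byte key; applying it twice restores data."""
    return bytes(byte ^ key for byte in data)
```

Usage: `xor_single_byte(xor_single_byte(macro_bytes, key), key)` gives back `macro_bytes`, so the same routine serves for both directions.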
Other Macro Virus Issues...
Office97 password protection against Open uses MD5. [Yuk!]
PowerPoint97 streams are stored as GZIP compressed data streams.
WordBasic is a tokenized language.
VisualBasic is p-code. But there’s a separate compressed code body for “Edit.”
Still More Macro Virus Issues
VisualBasic5 is now supported across other applications! Soon, we’ll have to crack other file formats, not just OLE2.
VisualBasic6 is coming out in the next few months.
Things can up-convert; some can down-convert.
Emulators for all these languages!
Virus Removal
Must have “sufficient” variant determination.
– Bytes to cut.
– Where from.
– Where to retrieve original information.
– Where did the virus replace/take that original info from?
Virus removal database does not need to stay in memory.
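The items listed on this slide can be pictured as one repair record per variant. The sketch below assumes the simplest case, an appending virus that overwrote the first bytes of a COM host and kept a copy of them inside its own body. The record fields and function name are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class RepairRecord:
    virus_start: int    # where to cut: file offset at which the virus body begins
    saved_offset: int   # where, inside the virus body, the host's first bytes are kept
    saved_length: int   # how many host bytes the virus replaced

def repair_appended(infected: bytes, record: RepairRecord) -> bytes:
    """Restore the host's first bytes and cut the appended virus body."""
    start = record.virus_start + record.saved_offset
    original_head = infected[start:start + record.saved_length]
    if len(original_head) != record.saved_length:
        raise ValueError("saved host bytes not found; wrong variant?")
    return original_head + infected[record.saved_length:record.virus_start]
```

If the variant is misidentified, the offsets point at the wrong bytes and the repaired file is corrupt, which is why removal depends on sufficient variant determination.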
Variant Determination
Variants are distinguished by:
– Different sizes.
– Different CRC values over different ranges.
– String found at different position.
– The specific variant does something unique. Need to know this for user information.
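Telling variants apart by size and by CRC over a chosen range might look like the sketch below. The `Variant` fields, the demo names, and the use of CRC-32 are assumptions made for illustration.

```python
import zlib
from typing import NamedTuple, Optional

class Variant(NamedTuple):
    name: str
    size: int        # expected length of the virus body
    crc_start: int   # start of the byte range covered by the CRC
    crc_end: int     # end of that range (exclusive)
    crc: int         # expected CRC-32 over the range

def identify_variant(body: bytes, variants) -> Optional[str]:
    """Return the first variant whose size and range CRC both match."""
    for variant in variants:
        if len(body) != variant.size:
            continue
        if zlib.crc32(body[variant.crc_start:variant.crc_end]) == variant.crc:
            return variant.name
    return None
```

Each variant can cover a different range, so bytes that differ between infections can be excluded from the check.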
Side Effect Removal
Virus Payloads affect:
– Registry
– Added instructions in files. (AUTOEXEC.BAT)
– Additional files dropped.
– Things added to WIN.INI.
– “Bad sector” repair.
– Anything software can do.
Speed Issues
Memory management.
Memory, hard disk, floppy.
640K memory, XMS, EMS, 32-bit memory, memory swapping.
Clean machine.
Final Thought
Compare what was covered in this presentation against an access control package.
Project: The following files are allowed to be executed by this set of people. AND NOTHING ELSE!
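The project statement amounts to an allowlist: a file may run only if it is listed and the user is in its permitted set. The sketch below identifies files by SHA-256 digest, which is an assumption; the talk does not say how files would be identified.

```python
import hashlib

def file_digest(content: bytes) -> str:
    """Identify a file by the SHA-256 of its content (an assumed choice)."""
    return hashlib.sha256(content).hexdigest()

def may_execute(user: str, content: bytes, allowed: dict) -> bool:
    """allowed maps a file digest to the set of users permitted to run it."""
    return user in allowed.get(file_digest(content), set())
```

Anything not in the table is refused, including a listed file that has been modified, since its digest no longer matches.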
Questions & Answers
Your Partner Against the Virus Problem