Bin carver

Bin-Carver: Automatic recovery of binary executable files

• The process of reassembling files from disk fragments in the absence of metadata.

What is file carving?

• Accidental user deletions.

• Intentional user deletions.

• Malware.

When would we need file carving?

Using a .jpeg file as an example:

• Find the header (FF D8).

• Know the paired footer (FF D9).

• Collect all contiguous data between them.

Traditional file carving method
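The traditional header-footer method above can be sketched in a few lines. This is a minimal illustration, not the code of any particular carving tool: the function name `carve_contiguous` and the toy image are hypothetical, and the sketch assumes every file is stored contiguously.

```python
JPEG_HEADER = b"\xff\xd8"  # header bytes named on the slide
JPEG_FOOTER = b"\xff\xd9"  # footer bytes named on the slide


def carve_contiguous(image: bytes, header: bytes = JPEG_HEADER,
                     footer: bytes = JPEG_FOOTER) -> list[bytes]:
    """Return every header..footer span, assuming each file is contiguous."""
    carved = []
    pos = 0
    while True:
        start = image.find(header, pos)
        if start == -1:
            break
        end = image.find(footer, start + len(header))
        if end == -1:
            break
        end += len(footer)  # include the footer bytes in the carved file
        carved.append(image[start:end])
        pos = end
    return carved


# Toy "disk image" (hypothetical): padding, one fake JPEG, more padding.
demo_image = b"\x00" * 8 + b"\xff\xd8" + b"pixel-data" + b"\xff\xd9" + b"\x11" * 8
demo_files = carve_contiguous(demo_image)
```

Because the sketch simply takes everything between the two markers, any unrelated block sitting between header and footer ends up inside the carved file, which is the fragmentation problem discussed next.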

• Fragmentation.

• Doesn’t work without exact header and footer information.

• Doesn’t work with all file types.
  o focuses on documents of forensic interest.
  o binary executables not included.

Problems with traditional method

• Recover an Executable and Linkable Format (ELF) file e from disk image D.

• D strictly consists of file content blocks.

• Assume D is an EXT2 file system with a block size of 4 KB.

Bin-carver overview -1

• File content has not been overwritten.

• File content is stored in increasing order.

• ELF file e has n blocks on the disk. We want to link these n blocks together, treating each block as a graph node and using the file's internal logic.

Bin-carver overview -2

Bin-carver overview -3

•Filename recovery is typically not possible without the file system metadata.

•Fragmentation.

Challenges

System Overview Diagram

• ELF-header scanner.
  o scans for all possible ELF headers h_i using the ELF-file magic value.

• Block-node linker.
  o scans the disk image, identifies nodes, and links them.

• Conflict-node resolver.
  o removes conflict nodes and outputs ELF file e_i.

Components

• Headers hold a “road map” describing ELF file organization.

• Searching for the magic number sequence 7f 45 4c 46 locates the headers, which tell us how to traverse all other sections.

Scanner -1
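The magic-number search above might look like the following sketch. It assumes, as an illustration only, that a file's first block starts on a block boundary, so only the first four bytes of each 4 KB block are checked; `find_elf_header_blocks` and the demo image are hypothetical names, not part of Bin-Carver.

```python
ELF_MAGIC = b"\x7fELF"  # the byte sequence 7f 45 4c 46
BLOCK_SIZE = 4096       # 4 KB blocks, as assumed in the overview


def find_elf_header_blocks(image: bytes, block_size: int = BLOCK_SIZE) -> list[int]:
    """Return the numbers of all blocks that begin with the ELF magic value."""
    hits = []
    for offset in range(0, len(image), block_size):
        if image[offset:offset + len(ELF_MAGIC)] == ELF_MAGIC:
            hits.append(offset // block_size)
    return hits


# Hypothetical three-block image: a header candidate at block 1, plus a
# stray copy of the magic in the middle of block 2 that must be ignored.
_demo = bytearray(3 * BLOCK_SIZE)
_demo[BLOCK_SIZE:BLOCK_SIZE + 4] = ELF_MAGIC
_demo[2 * BLOCK_SIZE + 100:2 * BLOCK_SIZE + 104] = ELF_MAGIC
demo_header_blocks = find_elf_header_blocks(bytes(_demo))
```

Each returned block number is only a candidate header h_i; the later stages decide which candidates belong to recoverable files.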

Each ELF header is 52 bytes (32-bit ELF) and describes:

• Program header table (PHT)
  o array of program headers

• Section header table (SHT)
  o array of section headers

Scanner -2
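To make the header layout concrete, here is a sketch that unpacks the 52-byte header of a 32-bit little-endian ELF file with Python's `struct` module. The field names follow the standard ELF32 header; the class `Elf32Header`, the function `parse_elf32_header`, and the synthetic header values are assumptions made for illustration.

```python
import struct
from typing import NamedTuple

# Standard ELF32 header layout, little-endian, no padding: 52 bytes in total.
ELF32_HEADER = struct.Struct("<16sHHIIIIIHHHHHH")


class Elf32Header(NamedTuple):
    e_ident: bytes
    e_type: int
    e_machine: int
    e_version: int
    e_entry: int
    e_phoff: int      # file offset of the program header table (PHT)
    e_shoff: int      # file offset of the section header table (SHT)
    e_flags: int
    e_ehsize: int
    e_phentsize: int  # size of one program header
    e_phnum: int      # number of program headers
    e_shentsize: int  # size of one section header
    e_shnum: int      # number of section headers
    e_shstrndx: int


def parse_elf32_header(block: bytes) -> Elf32Header:
    """Unpack an ELF32 header from the start of a candidate block."""
    header = Elf32Header(*ELF32_HEADER.unpack_from(block, 0))
    if header.e_ident[:4] != b"\x7fELF":
        raise ValueError("block does not start with the ELF magic value")
    return header


# Synthetic header with made-up but plausible values (hypothetical file).
demo_header_bytes = ELF32_HEADER.pack(
    b"\x7fELF\x01\x01\x01" + b"\x00" * 9,  # 32-bit, little-endian, version 1
    2, 3, 1,                               # e_type, e_machine, e_version
    0x08048100,                            # e_entry
    52,                                    # e_phoff: PHT right after the header
    8000,                                  # e_shoff: SHT near the end of the file
    0, 52, 32, 2, 40, 10, 9,               # flags, sizes, and counts
)
demo_header = parse_elf32_header(demo_header_bytes)
```

The offsets and counts (`e_phoff`, `e_phnum`, `e_shoff`, `e_shnum`) are the entries of the road map that the later stages rely on.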

• Usually located at the end of the ELF file.
  o can serve as a footer because of this.

• Since A(footer) > A(h_i), we can start our search at the 0x14 disk block.

• Gives a multitude of other constraints that allow us to calculate the location of the footer.

Searching SHT
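One constraint of this kind can be worked out directly from the header fields. The sketch below assumes the SHT is the last structure in the file (the usual case noted above) and uses the standard header fields `e_shoff`, `e_shentsize`, and `e_shnum`; the function name and the sample values are hypothetical.

```python
BLOCK_SIZE = 4096  # 4 KB blocks, as assumed in the overview


def sht_footer_extent(e_shoff: int, e_shentsize: int, e_shnum: int,
                      block_size: int = BLOCK_SIZE) -> tuple[int, int, int]:
    """Return (first_sht_block, last_sht_block, total_blocks).

    Block indices are relative to the block holding the ELF header.
    total_blocks is only valid under the assumption that the SHT ends the file.
    """
    sht_end = e_shoff + e_shentsize * e_shnum  # one past the last SHT byte
    first_block = e_shoff // block_size
    last_block = (sht_end - 1) // block_size
    return first_block, last_block, last_block + 1


# Hypothetical header values: SHT at offset 8000 with 10 entries of 40 bytes.
demo_extent = sht_footer_extent(e_shoff=8000, e_shentsize=40, e_shnum=10)
```

With these sample values the SHT spans relative blocks 1 and 2, so the file would occupy three blocks and the footer block is the third one to look for.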

• Locates segments that create the memory image of the program.

• Each program header is 32 bytes.

• Usually starts right after the ELF header.
  o same 4 KB block.

Searching PHT

• From the program headers, infer the base virtual address of the image file.

• Keep iterating and build the road map.

• The goal is to fill every slot in this road map with its content block (b_i).

Searching PHT

• With no fragmentation, the job is done.

• But with any garbage gap, this approach would fail.

• So how do we link each individual b_i if the disk is fragmented?

Finished?

• Have to logically connect b_i and b_j.

• Explore the caller-callee relationship:

• Fill the block places of b_caller and b_callee.
  o find address

• Logically link them together.
  o function prologue signature (local calls)
  o PLT instruction sequence (library calls)

Block-node linker -1

• On a library call
  o Use the PLT block number as an anchor.
  o Use this anchor to identify the absolute block number of the caller block.

• On a local call
  o Only determines distance.
  o Only works with calls that start with e8 (CALL opcode).

• In most cases, library calls are used to resolve block numbers.

Block-node linker -2
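The "only determines distance" point for local calls can be illustrated with the x86 `e8` instruction, whose four-byte operand is a signed displacement relative to the next instruction. The sketch below is deliberately naive: it treats every `e8` byte as a call, whereas a real linker would also verify the target (for example by the function prologue signature). The function name and demo block are hypothetical.

```python
import struct

BLOCK_SIZE = 4096
CALL_REL32 = 0xE8  # x86 near relative CALL opcode


def local_call_block_distances(block: bytes,
                               block_size: int = BLOCK_SIZE) -> set[int]:
    """Return block distances (callee block minus caller block) implied by
    every e8 byte in the block. Calls straddling the block end are skipped."""
    distances = set()
    for off in range(len(block) - 4):
        if block[off] != CALL_REL32:
            continue
        (rel,) = struct.unpack_from("<i", block, off + 1)
        target = off + 5 + rel                # relative to the start of this block
        distances.add(target // block_size)   # floor division handles backward calls
    return distances


# Hypothetical code block: one call into the next block, one into the previous.
_code = bytearray(BLOCK_SIZE)
_code[0x10:0x15] = b"\xe8" + struct.pack("<i", (BLOCK_SIZE + 0x20) - (0x10 + 5))
_code[0x100:0x105] = b"\xe8" + struct.pack("<i", (-BLOCK_SIZE + 0x40) - (0x100 + 5))
demo_distances = local_call_block_distances(bytes(_code))
```

A distance of 1 says the callee lives in the block right after the caller in the file, wherever those two blocks ended up on disk; it gives no absolute block number, which is why library calls through the PLT serve as anchors.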

• A particular placeholder i could have several candidates.

• To eliminate redundant placeholders:
  o Use identified non-conflict nodes
  o Explore logic connections
  o Resolve node
  o Iterate through until a fixed point is reached

Conflict-node resolver -1
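The iterate-until-fixed-point idea can be sketched with a simplified rule: once a placeholder has exactly one candidate block, that block is removed from every other placeholder's candidate set. This uniqueness rule is an assumption standing in for the richer logic connections the resolver explores, and the names and block numbers are hypothetical.

```python
def resolve_conflicts(candidates: dict[int, set[int]]) -> dict[int, set[int]]:
    """Prune candidate blocks per placeholder until nothing changes."""
    resolved = {slot: set(blocks) for slot, blocks in candidates.items()}
    changed = True
    while changed:
        changed = False
        # Blocks already pinned to a placeholder (the non-conflict nodes).
        settled = {next(iter(b)) for b in resolved.values() if len(b) == 1}
        for slot, blocks in resolved.items():
            if len(blocks) > 1:
                pruned = blocks - settled
                if pruned and pruned != blocks:
                    resolved[slot] = pruned
                    changed = True
    return resolved


# Hypothetical input: placeholder -> candidate disk block numbers.
demo_candidates = {0: {17}, 1: {17, 42}, 2: {42, 90}}
demo_resolved = resolve_conflicts(demo_candidates)
```

In the demo, block 17 is unambiguous for placeholder 0, which settles placeholder 1 on block 42, which in turn settles placeholder 2 on block 90. The loop stops when a full pass makes no change.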

• Block-node linker only focuses on linking code blocks. Conflict-node resolver handles other data blocks (.data, .debug).

Conflict-node resolver -2

To retrieve data blocks:

• Treat data sections as a block between the ELF header and the first block of the code section.

• The resolver explores constraints defined in the PHT and SHT.

• Worst-case scenario: the data section does not have identifiable sections and we must use dynamic execution to eliminate bogus permutations.
  o Essentially, if the recovered binary file doesn’t crash, it may have been recovered successfully.

Conflict-node resolver -3

• Comparisons with other similar tools were intended, but neither Foremost nor Scalpel supports carving fragmented ELF binary files.

Evaluation - Comparison

Evaluation -1

• All files are ELF binaries.
  o worst case, high false positive rates.
  o addition of heterogeneous data irrelevant.

• Performance of the algorithm is invariant to the size of the disk.

• Performance relies on the number of files to be recovered.

Evaluation -2

• To evaluate accuracy, we need to prove the recovered files are true ELF files.

• Need to create an MD5 hash of the first block and of every individual block for each true ELF binary to detect true data in the worst-case fragmentation scenario.

Evaluation -3
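The per-block hashing used for evaluation could be set up along these lines with Python's `hashlib`. The helper names and the toy data are hypothetical; the sketch only shows how block-level MD5 digests let a recovered file be checked block by block against the known original.

```python
import hashlib

BLOCK_SIZE = 4096


def block_md5s(data: bytes, block_size: int = BLOCK_SIZE) -> list[str]:
    """Return the MD5 hex digest of every block of a file, in order."""
    return [hashlib.md5(data[off:off + block_size]).hexdigest()
            for off in range(0, len(data), block_size)]


def count_matching_blocks(original: bytes, recovered: bytes) -> int:
    """Count positions where the recovered block hash equals the original's."""
    return sum(a == b for a, b in zip(block_md5s(original), block_md5s(recovered)))


# Hypothetical ground-truth file and a recovery with one wrong middle block.
demo_original = b"A" * BLOCK_SIZE + b"B" * BLOCK_SIZE + b"C" * 100
demo_recovered = b"A" * BLOCK_SIZE + b"X" * BLOCK_SIZE + b"C" * 100
demo_matches = count_matching_blocks(demo_original, demo_recovered)
```

Hashing the first block separately identifies which true file a recovered header belongs to, while the remaining block hashes show which of its blocks were placed correctly.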

Identification rate:

• Shows the portion that can be identified no matter how fragmented the disk is.
  o must be able to match hash values

Recovery rate:

• Valid files in the system that were identified and recovered.

Effectiveness -1

Overall, very effective. On average:

• Identification rate of 96.3%

• Recovery rate of 93.1%

Effectiveness -2

Effectiveness -3

• All performance slowdowns occur during the linker and resolver phases.

• Large gaps hurt performance, and the large number of caller-callee instructions causes performance penalties.

Runtime Analysis -1

Runtime Analysis -2

Conclusion

• Bin-Carver is a tool for dissecting, mapping, and recovering binary executable files from raw binary data.

• Bin-Carver is highly accurate (96.3% identification and 93.1% recovery on average) and outperforms existing file carving techniques when recovering fragmented binary files.

• Bin-Carver also provides a useful complement to the more traditional header-footer pairing approach for file carving, giving more complete disk image recovery.

Thank you!