172
CS266 Software Reverse Engineering (SRE) Reversing and Patching Wintel Machine Code Teodoro (Ted) Cipresso, [email protected] Department of Computer Science San José State University Spring 2015 The information in this presentation is taken from the thesis “Software reverse engineering education” available at http://scholarworks.sjsu.edu/etd_theses/3734/ where all citations can be found.

Reversing and Patching Machine Code

Embed Size (px)

Citation preview

Page 1: Reversing and Patching Machine Code

CS266 Software Reverse Engineering (SRE)

Reversing and Patching Wintel Machine Code

Teodoro (Ted) Cipresso, [email protected]

Department of Computer Science

San José State University

Spring 2015

The information in this presentation is taken from the thesis “Software reverse engineering education” available at http://scholarworks.sjsu.edu/etd_theses/3734/ where all citations can be found.

Page 2: Reversing and Patching Machine Code

Reversing and Patching Wintel Machine Code

Introduction to Compilers and Machine Code

Machine code is the executable representation of software. Typically:

The result of translating high-level language source code (e.g., C/C++) to object code using using a compiler.

Object code contains platform-specific machine instructions. The set of machine instructions available is defined by the CPU (or GPU).

Object code file is made executable using linker:

A linker resolves any external dependencies that object code may have, such as user-written or OS libraries (DLLs).

Machine code is often referred to as “native code” as it executes only on the platform to which it belongs (is native to).

2

Page 3: Reversing and Patching Machine Code

Reversing and Patching Wintel Machine Code

Introduction to Compilers and Machine Code

In contrast to high-level languages, there are low-level languages which are still considered to be high-level by the CPU because the language syntax is still a textual or mnemonic abstraction of the processor's instruction set.

For example, assembly language, a language that uses helpful mnemonics to represent machine instructions, still must be translated to object code and made executable using a linker.

The translation from assembly code to object code is done by an assembler instead of a compiler—reflecting the closeness of assembly language's syntax to actual machine code.

3

Page 4: Reversing and Patching Machine Code

Reversing and Patching Wintel Machine Code

Introduction to Compilers and Machine Code

The reason why compilers translate programs coded in high-level and low-level languages to machine code is three-fold:

(1) CPUs only understand machine instructions (operation codes).

(2) Having a CPU dynamically translate HLL statements to machine instructions would consume significant, additional CPU time.

(3) A CPU that could dynamically translate HLL statements to machine instructions would be complex, expensive, and difficult to maintain.

Imagine having to update the firmware in the CPU (or GPU) every time a bug is fixed or a feature is added to your favorite HLL; this would be much more annoying than patch Tuesday

4

Page 5: Reversing and Patching Machine Code

Reversing and Patching Wintel Machine Code

Introduction to Compilers and Machine Code

To relieve a HLL compiler from the difficult task of generating machine instructions, some compilers do not generate machine code directly, instead they generate code in a LLL such as assembly [8].

This allows for a separation of concerns where the compiler doesn't have to know how to encode and format machine instructions for every target platform or processor.

Instead it just concerns itself with generating valid assembly code for an assembler on the target platform.

Some compilers, such as the C/C++ compilers in the GNU Compiler Collection (GCC), have the option to output assembly code.

This allows a programmer to tweak the code [9, 2004].

5

Page 6: Reversing and Patching Machine Code

Reversing and Patching Wintel Machine Code

Introduction to Compilers and Machine Code

The steps in the process undertaken by the GCC compiler to create an executable are given below [9, 2004]:

Preprocessing: Expand macros in the high-level language source file.

Compilation: Translate the high-level source code to assembly language.

Assembly: Translate assembly language to object code (machine code).

Linking (Create the final executable):

Statically or dynamically link together the object code with the object code of the programs and libraries it depends on.

Establish initial relative addresses for the variables, constants, and entry points in the object code.

6

Page 7: Reversing and Patching Machine Code

Reversing and Patching Wintel Machine Code

Introduction to Compilers and Machine Code

7

Page 8: Reversing and Patching Machine Code

Reversing and Patching Wintel Machine Code

Decompilation and Disassembly of Machine Code

Having an understanding of how high-level language programs become executables can be helpful when attempting to reverse machine code.

Most tools that assist in reversing executables work by translating the machine code back into assembly language.

This is possible because there exists a one-to-one mapping from each assembly language statement to a machine instruction [10].

A tool that translates machine code back into assembly language is called a disassembler.

8

Page 9: Reversing and Patching Machine Code

Reversing and Patching Wintel Machine Code

Decompilation and Disassembly of Machine Code

From a reversers point of view it would be best to translate assembly language [back] to a high-level language, as it would be much less difficult to comprehend and alter the program.

This is a difficult task for any tool because once high-level language source code is compiled down to machine code, a great deal of information is lost.

For example, one cannot tell by looking at the machine code which high-level language (if any) the machine code originated from.

Object-oriented constructs would be especially difficult to recover.

Perhaps knowing the kind of assembly code a particular compiler generates for certain constructs might help in generating HLL code, but this is not a reliable strategy.

9

Page 10: Reversing and Patching Machine Code

Reversing and Patching Wintel Machine Code

Decompilation and Disassembly of Machine Code

The greatest difficulty in reverse engineering machine code comes from the lack of adequate decompilers.

[5] argues that it should be possible to create good decompilers for binaries, but recognizes that other experts disagree—raising the point that some information is “irretrievably lost during the compilation process.”

For those interested in recovering the source code of a binary, decompilation may not offer much hope because as [11] states:

“a general decompiler does not attempt to reverse every action of the compiler, rather it transforms the input repeatedly until the result is high level source code. It therefore won't recreate the original source file; probably nothing like it.”

10

Page 11: Reversing and Patching Machine Code

Reversing and Patching Wintel Machine Code

Decompilation and Disassembly of Machine Code

Boomerang is an open-source decompiler that seeks to one day be able to decompile machine code to high-level language source code [11].

To get a sense of the effectiveness of Boomerang as a reversing tool, a simple program, HelloWorld.c was compiled and linked using the MinGW 32-bit GNU C++ compiler for Windows and then decompiled using Boomerang.

The C code generated by the Boomerang decompiler looked like a hybrid of C and assembly language, had countless syntax errors, and ultimately bore no resemblance to the original program.

Similar results are seen with other binaries; Boomerang seems to need manual guidance when decompiling MSVC-compiled programs.

11

Page 13: Reversing and Patching Machine Code

Reversing and Patching Wintel Machine Code

Decompilation and Disassembly of Machine Code

The Reversing Engineering Compiler or REC is both a compiler and a decompiler that claims to be able to produce a “C-like” representation of machine code [12].

The results of the decompilation using REC were similar to that of Boomerang.

Based on the current state of decompilation technology for machine code, recovering or generating HLL source code from a native binary doesn't appear to be a feasible approach.

However, due to the one-to-one mapping between machine instructions and assembly language statements, we can obtain an assembly language representation using a tool known as a disassembler.

13

Page 14: Reversing and Patching Machine Code

Reversing and Patching Wintel Machine Code

Decompilation and Disassembly of Machine Code

Multiple graphical tools available that not only include a disassembler, a tool which generates assembly language from machine code, but also allow for debugging and altering the machine code during execution.

OllyDbg is a shareware interactive machine code debugger and disassembler tool for Windows [13].

The tool’s emphasis on machine code analysis makes it particularly helpful in cases where the source code for the target program is unavailable.

OllyDbg operates as follows: 1) disassemble the binary executable, 2) generate assembly language from the machine code, and 3) perform heuristic analysis to identify individual functions, loops, and functions calls.

14

Page 15: Reversing and Patching Machine Code
Page 16: Reversing and Patching Machine Code

Reversing and Patching Wintel Machine Code

Machine Code Reversing and Patching using OllyDbg

16

Pane Capabilities

Disassembler • Edit, debug, test, and patch a binary executable using actions available on a popup menu.

• Patch an executable by copying edits to the disassembly back to the binary.

Dump • Display the contents of memory or a file in one of 7 predefined formats: byte, text, integer, float, address, disassembly, or PE Header.

• Set memory breakpoints.

• Locate references to data in the disassembly.

Information • Decode and resolve the arguments of the currently selected assembly instruction in the Disassembler pane.

• Modify the value of register arguments.

• View memory locations referenced by each argument in either the Disassembler of Dump panes.

Page 17: Reversing and Patching Machine Code

Reversing and Patching Wintel Machine Code

Machine Code Reversing and Patching using OllyDbg

17

Pane Capabilities

Registers • Decodes and displays the values of the CPU and FPU registers for the currently executing thread.

• Floating point register decoding can be configured for MMX (Intel) or 3DNow! (AMD) multimedia extensions.

• Modify the value of CPU registers.

Stack • Display the stack of the currently executing thread.

• Trace stack frames. In general, stack frames are used to:

• Restore the state of registers and memory on return from a call statement.

• Allocate storage for the local variables, parameters, and return value of the called subroutine.

• Provide a return address.

Page 19: Reversing and Patching Machine Code
Page 20: Reversing and Patching Machine Code
Page 21: Reversing and Patching Machine Code
Page 22: Reversing and Patching Machine Code
Page 23: Reversing and Patching Machine Code
Page 24: Reversing and Patching Machine Code
Page 25: Reversing and Patching Machine Code
Page 26: Reversing and Patching Machine Code
Page 27: Reversing and Patching Machine Code
Page 28: Reversing and Patching Machine Code
Page 29: Reversing and Patching Machine Code
Page 30: Reversing and Patching Machine Code
Page 31: Reversing and Patching Machine Code
Page 32: Reversing and Patching Machine Code
Page 33: Reversing and Patching Machine Code
Page 34: Reversing and Patching Machine Code
Page 35: Reversing and Patching Machine Code
Page 36: Reversing and Patching Machine Code
Page 37: Reversing and Patching Machine Code
Page 38: Reversing and Patching Machine Code
Page 39: Reversing and Patching Machine Code
Page 40: Reversing and Patching Machine Code
Page 41: Reversing and Patching Machine Code
Page 42: Reversing and Patching Machine Code
Page 43: Reversing and Patching Machine Code
Page 44: Reversing and Patching Machine Code
Page 45: Reversing and Patching Machine Code
Page 46: Reversing and Patching Machine Code
Page 47: Reversing and Patching Machine Code
Page 48: Reversing and Patching Machine Code
Page 49: Reversing and Patching Machine Code
Page 50: Reversing and Patching Machine Code
Page 51: Reversing and Patching Machine Code
Page 52: Reversing and Patching Machine Code
Page 53: Reversing and Patching Machine Code
Page 54: Reversing and Patching Machine Code
Page 55: Reversing and Patching Machine Code
Page 56: Reversing and Patching Machine Code
Page 57: Reversing and Patching Machine Code
Page 58: Reversing and Patching Machine Code
Page 59: Reversing and Patching Machine Code
Page 60: Reversing and Patching Machine Code
Page 61: Reversing and Patching Machine Code
Page 62: Reversing and Patching Machine Code
Page 63: Reversing and Patching Machine Code
Page 64: Reversing and Patching Machine Code
Page 65: Reversing and Patching Machine Code
Page 66: Reversing and Patching Machine Code
Page 67: Reversing and Patching Machine Code
Page 68: Reversing and Patching Machine Code
Page 69: Reversing and Patching Machine Code
Page 70: Reversing and Patching Machine Code
Page 71: Reversing and Patching Machine Code
Page 72: Reversing and Patching Machine Code
Page 73: Reversing and Patching Machine Code
Page 74: Reversing and Patching Machine Code
Page 75: Reversing and Patching Machine Code
Page 76: Reversing and Patching Machine Code
Page 77: Reversing and Patching Machine Code
Page 78: Reversing and Patching Machine Code
Page 79: Reversing and Patching Machine Code
Page 80: Reversing and Patching Machine Code
Page 81: Reversing and Patching Machine Code
Page 82: Reversing and Patching Machine Code
Page 83: Reversing and Patching Machine Code
Page 84: Reversing and Patching Machine Code
Page 85: Reversing and Patching Machine Code
Page 86: Reversing and Patching Machine Code
Page 87: Reversing and Patching Machine Code
Page 88: Reversing and Patching Machine Code
Page 89: Reversing and Patching Machine Code
Page 90: Reversing and Patching Machine Code
Page 91: Reversing and Patching Machine Code
Page 92: Reversing and Patching Machine Code
Page 93: Reversing and Patching Machine Code
Page 94: Reversing and Patching Machine Code
Page 95: Reversing and Patching Machine Code
Page 96: Reversing and Patching Machine Code
Page 97: Reversing and Patching Machine Code
Page 98: Reversing and Patching Machine Code
Page 99: Reversing and Patching Machine Code
Page 100: Reversing and Patching Machine Code
Page 101: Reversing and Patching Machine Code
Page 102: Reversing and Patching Machine Code
Page 103: Reversing and Patching Machine Code
Page 104: Reversing and Patching Machine Code
Page 105: Reversing and Patching Machine Code
Page 106: Reversing and Patching Machine Code
Page 107: Reversing and Patching Machine Code
Page 108: Reversing and Patching Machine Code
Page 109: Reversing and Patching Machine Code
Page 110: Reversing and Patching Machine Code
Page 111: Reversing and Patching Machine Code
Page 112: Reversing and Patching Machine Code
Page 113: Reversing and Patching Machine Code
Page 114: Reversing and Patching Machine Code
Page 115: Reversing and Patching Machine Code
Page 116: Reversing and Patching Machine Code
Page 117: Reversing and Patching Machine Code
Page 118: Reversing and Patching Machine Code
Page 119: Reversing and Patching Machine Code
Page 120: Reversing and Patching Machine Code
Page 121: Reversing and Patching Machine Code
Page 122: Reversing and Patching Machine Code
Page 123: Reversing and Patching Machine Code
Page 124: Reversing and Patching Machine Code
Page 125: Reversing and Patching Machine Code
Page 126: Reversing and Patching Machine Code
Page 127: Reversing and Patching Machine Code
Page 128: Reversing and Patching Machine Code
Page 129: Reversing and Patching Machine Code
Page 130: Reversing and Patching Machine Code
Page 131: Reversing and Patching Machine Code
Page 132: Reversing and Patching Machine Code
Page 133: Reversing and Patching Machine Code
Page 134: Reversing and Patching Machine Code
Page 135: Reversing and Patching Machine Code
Page 136: Reversing and Patching Machine Code
Page 137: Reversing and Patching Machine Code
Page 138: Reversing and Patching Machine Code
Page 139: Reversing and Patching Machine Code
Page 140: Reversing and Patching Machine Code
Page 141: Reversing and Patching Machine Code
Page 142: Reversing and Patching Machine Code
Page 143: Reversing and Patching Machine Code
Page 144: Reversing and Patching Machine Code
Page 145: Reversing and Patching Machine Code
Page 146: Reversing and Patching Machine Code
Page 147: Reversing and Patching Machine Code
Page 148: Reversing and Patching Machine Code
Page 149: Reversing and Patching Machine Code
Page 150: Reversing and Patching Machine Code
Page 151: Reversing and Patching Machine Code
Page 152: Reversing and Patching Machine Code
Page 153: Reversing and Patching Machine Code
Page 154: Reversing and Patching Machine Code
Page 155: Reversing and Patching Machine Code
Page 156: Reversing and Patching Machine Code
Page 157: Reversing and Patching Machine Code
Page 158: Reversing and Patching Machine Code
Page 159: Reversing and Patching Machine Code
Page 160: Reversing and Patching Machine Code
Page 161: Reversing and Patching Machine Code
Page 162: Reversing and Patching Machine Code
Page 163: Reversing and Patching Machine Code
Page 164: Reversing and Patching Machine Code

Reversing and Patching Wintel Machine Code

Overview of Windows Debuggers

The following Windows debuggers can be downloaded for free from here:

KD Kernel Debugger: remote debug OS problems like blue screens, also useful for device driver development.

NTSD NT debugger: user-mode debugger for user-mode applications. Essentially a GUI added to CDB Command-line debugger.

WinDbg: wraps KD and NTSD with a GUI, therefore WinDbg can function both as a kernel-mode and user-mode debugger.

Visual Studio [.NET]: same debugging engine as KD and NTSD but offers richer UI than WinDbg for user-mode application debugging.

164[WinDbg1, slides 164-170]

Page 165: Reversing and Patching Machine Code

Reversing and Patching Wintel Machine Code

Comparison of Windows Debuggers

165

Feature KD NTSD WinDbg Visual Studio .NET

Kernel-mode debugging Y N Y N

User-mode debugging Y Y Y Y

Unmanaged debugging (e.g, C++) Y Y Y Y

Managed debugging (e.g., C#) Y Y Y Y

Remote debugging Y Y Y Y

Attach to process Y Y Y Y

Detach from process in Win2K and XP Y Y Y Y

SQL debugging N N N Y

Page 166: Reversing and Patching Machine Code

Reversing and Patching Wintel Machine Code

WinDbg (NTSD + KD with a better UI)

Provides command-line options: start minimized (-m), attach to a process by pid (-p) and auto-open crash files (-z).

Supports three types of commands:

Regular commands (e.g.: k) for debugging processes.

Dot commands (e.g.: .sympath) for controlling the debugger.

Extension commands (e.g.: !handle) that you may add to WinDbg. These are implemented as exported functions in extension DLLs.

You need symbols in order to be able to do effective debugging. Symbol files could be in the (older) COFF (.DBG) format or the PDB (.PDB) format.

You can set symbol directories through File->Symbol File Path, or using .sympath from the WinDbg command window.

166

Page 167: Reversing and Patching Machine Code

Reversing and Patching Wintel Machine Code

WinDbg Just-in-time Debugging

You can set WinDbg as the default JIT debugger by running Windbg –I.

Sets registry key HKLM\Software\Microsoft\Windows NT\CurrentVersion\AeDebug to WinDbg.

WinDbg will be launched if a process, which is not already being debugged, throws an exception and does not handle/consume the exception.

To set WinDbg as the default managed debugger (C#), you’d need to set these registry keys explicitly:

HKLM\Software\Microsoft\.NETFramework\DbgJITDebugLaunchSetting to 2

HKLM\Software\Microsoft\.NETFramework\DbgManagedDebugger to WinDbg.

167

Page 168: Reversing and Patching Machine Code

Reversing and Patching Wintel Machine Code

Dump Files and Crash Dump Analysis

When Windows crashes, it dumps the physical memory contents and all process information to a dump file, configured through System->Control Panel->Advanced->Startup and Recovery.

168

[WinDbg1]

Page 169: Reversing and Patching Machine Code

Reversing and Patching Wintel Machine Code

Dump Files and Crash Dump Analysis

The .dump command creates a user-mode or kernel-mode crash dump file.

You can take a snapshot of a process by generating a crash dump:

Syntax: .dump [options] file-path

A mini-dump /m is usually small versus a full-memory mini-dump /mf.

It is useful to dump handle information as well /mfh.

A mini-dump contains information about all threads including their stacks and list of loaded modules.

A full dump contains more information such as the process heap.

169

Page 170: Reversing and Patching Machine Code

Reversing and Patching Wintel Machine Code

Dump Files and Crash Dump Analysis (cont’d)

To analyze a dump in WinDbg, browse to File->Open Crash Dump, and select a dump file from the file system.

WinDbg will show you the instruction the dumped process was executing when it crashed.

170

Page 172: Reversing and Patching Machine Code

172