KEYSTONE: Next Generation Assembler Framework
www.keystone-engine.org
NGUYEN Anh Quynh <aquynh -at- gmail.com>
Blackhat USA - August 4th, 2016
1 / 47 NGUYEN Anh Quynh KEYSTONE: Next Generation Assembler Framework
Bio
- Nguyen Anh Quynh (aquynh -at- gmail.com)
  - Nanyang Technological University, Singapore
  - Researcher with a PhD in Computer Science
  - Operating System, Virtual Machine, Binary analysis, etc
  - Capstone disassembler: http://capstone-engine.org
  - Unicorn emulator: http://unicorn-engine.org
  - Keystone assembler: http://keystone-engine.org
Capstone: Next Generation Disassembler Engine
Blackhat USA 2014
Unicorn: Next Generation CPU Emulator
Blackhat USA 2015
Fundamental frameworks for Reverse Engineering
Assembler framework
Definition
- Compile assembly instructions & return the encoding as a sequence of bytes
  - Ex: inc EAX → 40
- May support high-level concepts such as macros, functions, etc
- Framework to build apps on top of it
Applications
- Dynamic machine code generation
  - Binary rewriting
  - Binary searching
Internals of assembler engine
- Given assembly input code
- Parse assembly instructions into separate statements
- Parse each statement into different types
  - Label, macro, directive, etc
  - Instruction: mnemonic + operands
    - Emit machine code accordingly
    - Reference to the Instruction-Set-Architecture manual is needed
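The steps above can be sketched in a few lines of Python. This is a toy illustration only, not how Keystone or LLVM implement it: every function name is hypothetical, and the instruction table covers just three one-byte 32-bit x86 encodings (inc r32 = 0x40 + register number, dec r32 = 0x48 + register number, nop = 0x90).

```python
# Toy sketch of the assembler pipeline: split the input into statements,
# classify each statement, then emit bytes for instructions.
# All names are hypothetical; only three x86-32 encodings are covered.

REG32 = {"eax": 0, "ecx": 1, "edx": 2, "ebx": 3,
         "esp": 4, "ebp": 5, "esi": 6, "edi": 7}


def split_statements(source):
    """Split assembly text on newlines and ';' into non-empty statements."""
    parts = source.replace("\n", ";").split(";")
    return [p.strip() for p in parts if p.strip()]


def classify(statement):
    """Return ('label', name), ('directive', text) or
    ('instruction', mnemonic, operands)."""
    if statement.endswith(":"):
        return ("label", statement[:-1])
    if statement.startswith("."):
        return ("directive", statement)
    mnemonic, _, rest = statement.partition(" ")
    operands = [op.strip().lower() for op in rest.split(",") if op.strip()]
    return ("instruction", mnemonic.lower(), operands)


def encode(mnemonic, operands):
    """Emit machine code for the few instructions this sketch knows."""
    if mnemonic == "nop" and not operands:
        return bytes([0x90])
    if mnemonic in ("inc", "dec") and len(operands) == 1 and operands[0] in REG32:
        base = 0x40 if mnemonic == "inc" else 0x48
        return bytes([base + REG32[operands[0]]])
    raise ValueError("unsupported instruction: %s %s"
                     % (mnemonic, ", ".join(operands)))


def assemble(source):
    """Assemble the source and return (machine code, label offsets)."""
    code = bytearray()
    labels = {}
    for statement in split_statements(source):
        kind = classify(statement)
        if kind[0] == "label":
            labels[kind[1]] = len(code)   # remember where the label points
        elif kind[0] == "instruction":
            code += encode(kind[1], kind[2])
        # directives are recognised but ignored in this sketch
    return bytes(code), labels
```

A real assembler replaces the hand-written `encode` table with per-architecture encoding data, which is where the ISA manual comes in.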
Challenges of building assembler
- Huge amount of work for the core only!
- Good understanding of CPU encoding
- Good understanding of instruction set
- Keep up with frequently updated instruction extensions.
Good assembler framework?
- True framework
  - Embedded into tool without resorting to external process
- Multi-arch
  - X86, Arm, Arm64, Mips, PowerPC, Sparc, etc
- Multi-platform
  - *nix, Windows, Android, iOS, etc
- Updated
  - Keep up with latest CPU extensions
- Bindings
  - Python, Ruby, Go, NodeJS, etc
Existing assembler frameworks
- Nothing is up to our standard, even in 2016!
  - Yasm: X86 only, no longer updated
  - Intel XED: X86 only, misses many instructions & closed-source
  - Other important archs: Arm, Arm64, Mips, PPC, Sparc, etc?
Life without assembler frameworks?
- People have been struggling for years!
  - Use existing assembler tool to compile assembly from file
  - Call linker to link generated object file
  - Use executable parser (ELF) to parse the resulting file for final encoding
- Ugly and inefficient
- Little control over the internal process & output
- Cross-platform support is very poor
Dream a good assembler
- Multi-architecture
  - Arm, Arm64, Mips, PowerPC, Sparc, X86 (+X86_64) + more
- Multi-platform: *nix, Windows, Android, iOS, etc
- Updated: latest extensions of all hardware architectures
- Independent with multiple bindings
  - Low-level framework to support all kinds of OS and tools
  - Core in C++, with API in pure C, and support for multiple binding languages
Problems
- No reasonable assembler framework even in 2016!
- Apparently nobody wants to fix the issues
- No light at the end of the dark tunnel
- Until Keystone was born!
Timeline
- Indiegogo campaign started on March 17th, 2016 (for 3 weeks)
  - 99 contributors, 4 project sponsors
- Beta code released to beta testers on April 30th, 2016
  - Only Python binding available at this time
- Version 0.9 released on May 31st, 2016
  - More bindings by beta testers: NodeJS, Ruby, Go & Rust
- Version 0.9.1 released on July 27th, 2016
  - 2 more bindings: Haskell & OCaml
Keystone == Next Generation Assembler Framework
Challenges to build Keystone
- Huge amount of work!
- Too many hardware architectures
- Too many instructions
- Limited resources
  - Started as a personal project
Keystone design & implementation
Ambitions & ideas
- Have all features in months, not years!
- Stand on the shoulders of giants in the initial phase.
- Open source project to get the community involved & contributing.
- Idea: LLVM!
Introduction on LLVM
LLVM project
- Open source project on compiler: http://llvm.org
- Huge community & highly active
- Backed by many major players: AMD, Apple, Google, Intel, IBM, ARM, Imgtec, Nvidia, Qualcomm, Samsung, etc.
- Multi-arch
  - X86, Arm, Arm64, Mips, PowerPC, Sparc, Hexagon, SystemZ, etc
- Multi-platform
  - Native compile on Windows, Linux, macOS, BSD, Android, iOS, etc
LLVM’s Machine Code (MC) layer
- Core layer of LLVM to integrate compiler with its internal assemblers
- Used by compiler, assembler, disassembler, debugger & JIT compilers
- Centralized around a big table of descriptions (TableGen) of machine instructions
- Auto generate assembler, disassembler, and code emitter from TableGen (*.inc) - with llvm-tblgen tool.
Why LLVM?
- Available assembler internally in Machine Code (MC) module - for inline assembly support.
  - Only usable for LLVM modules, not for external code
  - Closely designed & implemented for LLVM
  - Very actively maintained & updated by a huge community
- Already implemented in C++, so easy to implement Keystone core on top
- Pick up only those archs having assemblers: 8 archs for now.
LLVM advantages
- High quality code with lots of testing done using test cases
- Assembler maintained by top experts of each arch
  - X86: maintained by Intel (arch creator).
  - Arm+Arm64: maintained by Arm & Apple (arch creator & Arm64’s device maker).
  - Hexagon: maintained by Qualcomm (arch creator)
  - Mips: maintained by Imgtec (arch creator)
  - SystemZ: maintained by IBM (arch creator)
  - PPC & Sparc: maintained by highly active community
- New instructions & bugs fixed quite frequently!
- Bugs can be either reported to us, or reported to LLVM upstream, then ported back.
Are we done?
Challenges to build Keystone (1)
LLVM MC is a challenge
- Not just assembler, but also disassembler, Bitcode, InstPrinter, Linker Optimization, etc
- LLVM codebase is huge and mixed like spaghetti :-(
Keystone job
- Keep only assembler code & remove everything else unrelated
- Rewrite some components but keep AsmParser, CodeEmitter & AsmBackend code intact (so easy to sync with LLVM in future)
- Keep all the code in C++ to ease the job (unlike Capstone)
  - No need to rewrite complicated parsers
  - No need to fork llvm-tblgen
Decide where to make the cut
- Where to make the cut?
  - Cutting too little results in keeping lots of redundant code
  - Cutting too much would change the code structure, making it hard to sync with upstream.
- Optimal design for Keystone
  - Take the assembler core & make minimal changes
Challenges to build Keystone (2)
Multiple binaries
- LLVM compiled into multiple libraries
  - Supported libs
  - Parser
  - TableGen
  - etc
- Keystone needs to be a single library
Keystone job
- Modify linking setup to generate a single library
  - libkeystone.[so, dylib] + libkeystone.a
  - keystone.dll + keystone.lib
Challenges to build Keystone (3)
Code generated by MC Assembler is only for linking
- Relocatable object code is generated for linking in the final code generation phase of the compiler
  - Ex on X86: inc [_var1] → 0xff, 0x04, 0x25, A, A, A, A
Keystone job
- Make the fixup phase detect & report missing symbols
- Propagate this error back to the top level API ks_asm()
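The idea of a fixup phase that reports missing symbols, rather than leaving a relocation placeholder for a linker, can be sketched as follows. This is an illustrative sketch and not Keystone's internal code: `apply_fixups` and `MissingSymbolError` are hypothetical names, and the address width is assumed to be 32-bit little-endian to match the X86 example above.

```python
import struct


class MissingSymbolError(Exception):
    """Hypothetical error raised when a referenced symbol has no address."""


def apply_fixups(code, fixups, symbols):
    """Patch 32-bit little-endian symbol addresses into the code.

    fixups is a list of (offset, symbol_name) pairs; symbols maps symbol
    names to addresses. Nothing is patched if any symbol is undefined.
    """
    missing = sorted({name for _, name in fixups if name not in symbols})
    if missing:
        # Report the problem instead of silently keeping a placeholder.
        raise MissingSymbolError("undefined symbol(s): " + ", ".join(missing))
    patched = bytearray(code)
    for offset, name in fixups:
        struct.pack_into("<I", patched, offset, symbols[name])
    return bytes(patched)


# 'inc [_var1]' from the example: three fixed bytes, then a 4-byte placeholder.
code = bytes([0xFF, 0x04, 0x25, 0x00, 0x00, 0x00, 0x00])
fixups = [(3, "_var1")]
```

With `symbols = {"_var1": 0x1000}` the placeholder is filled in; with an empty symbol table the caller receives an error it can pass up to its own top-level API.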
Challenges to build Keystone (4)
Unaware of relative branch targets
- Ex on ARM: blx 0x86535200 → 0x35, 0xf1, 0x00, 0xe1
Keystone job
- ks_asm() allows specifying the address of the first instruction
- Change the core to retain the address of each statement
- Find all relative branch instructions & fix the encoding according to current & target address.
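Why the instruction address matters can be shown with a small worked example. The sketch below swaps the ARM case for the x86 `jmp rel32` instruction (opcode 0xE9), whose displacement is measured from the end of the instruction; the function name is hypothetical and this is not Keystone code.

```python
import struct


def encode_jmp_rel32(insn_addr, target_addr):
    """Encode x86 'jmp rel32' (opcode 0xE9) for an instruction at insn_addr."""
    insn_len = 5  # 1 opcode byte + 4 displacement bytes
    # The displacement is relative to the address of the next instruction.
    displacement = target_addr - (insn_addr + insn_len)
    if not -2**31 <= displacement < 2**31:
        raise ValueError("target out of rel32 range")
    return b"\xE9" + struct.pack("<i", displacement)
```

The same target produces different bytes depending on where the jump itself is placed, which is why an assembler that does not know the current address cannot encode relative branches correctly.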
Challenges to build Keystone (5)
Give up when failing to handle crafted input
- Ex on X86: vaddpd zmm1, zmm1, zmm1, x → "this is not an immediate"
- Returned llvm_unreachable() on input it cannot handle
Keystone job
- Fix all exits & propagate errors back to ks_asm()
  - Parse phase
  - Code emit phase
Challenges to build Keystone (6)
Other issues
- LLVM does not support non-LLVM syntax
  - We want other syntaxes like Nasm, Masm, etc
- Bindings must be built from scratch
- Keep up with upstream code once forking LLVM to maintain ourselves
Keystone job
- Extend X86 parser for new syntaxes: Nasm, Masm, etc
- Built Python binding myself
- Extra bindings came later, by community: NodeJS, Ruby, Go, Rust, Haskell & OCaml
- Keep syncing with LLVM upstream for important changes & bug-fixes
Keystone flow
Keystone vs LLVM
- Forked LLVM, but goes far beyond it
- Independent & truly a framework
  - Does not give up on malformed assembly
- Aware of current code position (for relative branches)
- Much more compact in size, lightweight in memory
- Thread-safe with multiple architectures supported in a single binary
- More flexible: supports X86 Nasm syntax
- Supports undocumented instructions: X86
- Provides bindings (Python, NodeJS, Ruby, Go, Rust, Haskell, OCaml as of August 2016)
Write applications with Keystone
Introduce Keystone API
- Clean/simple/lightweight/intuitive architecture-neutral API.
- Core implemented in C++, but API provided in C
  - open & close Keystone instance
  - customize runtime instance (allow to change assembly syntax, etc)
  - assemble input code
  - memory management: free allocated memory
- Python/NodeJS/Ruby/Go/Rust/Haskell/OCaml bindings built around the core
Sample code in C
Sample code in Python
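A minimal sketch of using the Python binding might look like the following. It assumes the `keystone-engine` Python package is installed and uses Python 3; the helper name `assemble_x86_32` and the import guard are illustrative additions, not part of Keystone.

```python
# Sketch of assembling with the Keystone Python binding (Python 3).
# Assumption: the 'keystone-engine' package is installed. If it is not,
# the helper returns None instead of failing at import time.
try:
    from keystone import Ks, KsError, KS_ARCH_X86, KS_MODE_32
    HAVE_KEYSTONE = True
except ImportError:
    HAVE_KEYSTONE = False


def assemble_x86_32(code, address=0):
    """Assemble 32-bit x86 code given as bytes.

    Returns (machine_code_bytes, statement_count), or None when the
    Keystone binding is not available.
    """
    if not HAVE_KEYSTONE:
        return None
    try:
        ks = Ks(KS_ARCH_X86, KS_MODE_32)         # open a Keystone instance
        encoding, count = ks.asm(code, address)  # address of first instruction
    except KsError as err:
        # Errors from the core surface as KsError in the binding.
        raise ValueError("assembly failed: %s" % err)
    return bytes(encoding or []), count


if __name__ == "__main__":
    result = assemble_x86_32(b"inc eax; dec ebx")
    if result is None:
        print("keystone binding not installed")
    else:
        machine_code, count = result
        print("%d statements: %s" % (count, machine_code.hex()))
```

The second argument to `asm()` is the address of the first instruction, which the core needs in order to encode relative branches.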
Demo
Keypatch plugin for IDA
- Open source IDA plugin: https://keystone-engine.org/keypatch
- Tool for assembling & patching in IDA
- Co-developed with Thanh Nguyen (VNSecurity.net)
Other applications from around internet
- Radare2: Unix-like reverse engineering framework and command-line tools
- Pwnypack: CTF toolkit with Shellcode generator
- Ropper: Rop gadget and binary information tool
- GEF: GDB plugin with enhanced features
- Usercorn: Versatile kernel+system+userspace emulator
- X64dbg: An open-source x64/x32 debugger for Windows
- Liberation: code injection library for iOS
- Demovfuscator: Deobfuscator for movfuscated binaries.
- More from http://keystone-engine.org/showcase
Status & future works
Status
- Version 0.9 went public on May 31st, 2016
- Version 0.9.1 was out on July 27th, 2016
- Based on LLVM 3.9
- Version 1.0 will be released as soon as all important bugs get fixed
Future works
- More refined error codes returned by parser?
- Find & fix all the corner cases where crafted input causes the core to exit
- More bindings promised by community!
- Synchronize with latest LLVM version
  - Future of Keystone is guaranteed by LLVM active development!
Reverse Engineering Trilogy
Conclusions
- Keystone is an innovative next generation assembler
  - Multi-arch + multi-platform
  - Clean/simple/lightweight/intuitive architecture-neutral API
  - Implemented in C++, with API in C language & multiple bindings available
  - Thread-safe by design
  - Open source in dual license
  - Future update guaranteed for all architectures
- We are seriously committed to this project to make it the best assembler engine
References
- Keystone assembler
  - Homepage: http://keystone-engine.org
  - Twitter: @keystone_engine
  - Github: http://github.com/keystone-engine/keystone
  - Mailing list: http://freelists.org/list/keystone-engine
- Keypatch: http://keystone-engine.org/keypatch
- Available apps using Keystone: http://keystone-engine.org/showcase
- Capstone disassembler: http://capstone-engine.org
- Unicorn emulator: http://unicorn-engine.org
Acknowledgement
- FX for the inspiration of the Keystone name!
- Indiegogo contributors for amazing financial support!
- Code contributors!
- Community for great encouragement!
Questions and answers
KEYSTONE: next generation assembler engine
http://keystone-engine.org
NGUYEN Anh Quynh <aquynh -at- gmail.com>