50
Reverse Engineering Dynamic Languages A Focus on Python Aaron Portnoy , Ali Rizvi-Santiago [email protected] [email protected] P

Reverse Engineering Dynamic Languages A Focus on Python Aaron Portnoy, Ali Rizvi-Santiago [email protected] [email protected] P

Embed Size (px)

Citation preview

Page 1: Reverse Engineering Dynamic Languages A Focus on Python Aaron Portnoy, Ali Rizvi-Santiago aportnoy@tippingpoint.com arizvisa@tippingpoint.com P

Reverse Engineering Dynamic Languages

A Focus on Python

Aaron Portnoy , Ali Rizvi-Santiago

[email protected] [email protected] P

Page 2: Reverse Engineering Dynamic Languages A Focus on Python Aaron Portnoy, Ali Rizvi-Santiago aportnoy@tippingpoint.com arizvisa@tippingpoint.com P

About Us

Work in TippingPoint DVLabs (http://dvlabs.tippingpoint.com)

Responsible for bughunting, patch analysis, vuln-dev

Authors and contributors to…Sulley Fuzzing FrameworkPaiMei PyMSRPCOpenRCE.org

P

Page 3: Reverse Engineering Dynamic Languages A Focus on Python Aaron Portnoy, Ali Rizvi-Santiago aportnoy@tippingpoint.com arizvisa@tippingpoint.com P

Talk Outline

We will be focusing on Python in its binary formsDisassembling codeCode object modificationRuntime stuff

An example of reversing PythonCheating at an MMORPG

P

Page 4: Reverse Engineering Dynamic Languages A Focus on Python Aaron Portnoy, Ali Rizvi-Santiago aportnoy@tippingpoint.com arizvisa@tippingpoint.com P

Introduction to Dynamic Languages

What are the characteristics of a dynamic language?Most tasks performed at runtime rather than during compilation

Advantages to dynamic languagesDevelopment speedPortabilityFlexibility

Great for lazy coders (like us)

S

Page 5: Reverse Engineering Dynamic Languages A Focus on Python Aaron Portnoy, Ali Rizvi-Santiago aportnoy@tippingpoint.com arizvisa@tippingpoint.com P

Why Python?

Implements many dynamic features

Rapidly gaining popularity

We were already familiar with its internals

S

Page 6: Reverse Engineering Dynamic Languages A Focus on Python Aaron Portnoy, Ali Rizvi-Santiago aportnoy@tippingpoint.com arizvisa@tippingpoint.com P

a

Multiplayer Online Role Playing Game10,000+ subscribers

Written in PythonDistributed in a binary form

Why this game?Its TV commercial interrupted Robot ChickenPedram wanted to cheat at it

P

Page 7: Reverse Engineering Dynamic Languages A Focus on Python Aaron Portnoy, Ali Rizvi-Santiago aportnoy@tippingpoint.com arizvisa@tippingpoint.com P

P

Page 8: Reverse Engineering Dynamic Languages A Focus on Python Aaron Portnoy, Ali Rizvi-Santiago aportnoy@tippingpoint.com arizvisa@tippingpoint.com P

First Look

python24.dllsafe to assume, written in Python

What is this 130mb PYD file?Google says frozen Python objects

Grepping tells us this is likely the source of interesting stuff

Panda3D LibraryMade by Disney

P

Page 9: Reverse Engineering Dynamic Languages A Focus on Python Aaron Portnoy, Ali Rizvi-Santiago aportnoy@tippingpoint.com arizvisa@tippingpoint.com P

What do we know about Python?

Source code compiled to objectsInterpreted

Python is a dynamic languageType information must be present somewhere

Python implements a virtual machineByte code must also be present somewhere

S

Page 10: Reverse Engineering Dynamic Languages A Focus on Python Aaron Portnoy, Ali Rizvi-Santiago aportnoy@tippingpoint.com arizvisa@tippingpoint.com P

Structure of a PYD

Let’s check it out in IDA

P

Page 11: Reverse Engineering Dynamic Languages A Focus on Python Aaron Portnoy, Ali Rizvi-Santiago aportnoy@tippingpoint.com arizvisa@tippingpoint.com P

P

Page 12: Reverse Engineering Dynamic Languages A Focus on Python Aaron Portnoy, Ali Rizvi-Santiago aportnoy@tippingpoint.com arizvisa@tippingpoint.com P

Python Serialization

Python’s ‘marshal’ moduleKind of like pickle, but handles internal types

What is this currently used for?.pyc – cached code objects (for avoiding having to re-parse).pyz – squeezed code objects.pyd – marshalled code objects stored in a shared object (.dll, .so, etc)

S

Page 13: Reverse Engineering Dynamic Languages A Focus on Python Aaron Portnoy, Ali Rizvi-Santiago aportnoy@tippingpoint.com arizvisa@tippingpoint.com P

Python Code Object

What do we get when we deserialize?An object of type ‘code’

Code object properties:co_argcount, co_nlocals, co_stacksize, co_flags, co_code, co_consts,

co_names, co_varnames, co_filename, co_name, co_firstlineno, co_lnotab, co_freevars, co_cellvars

Which is the most interesting to a reverser?co_code – string representation of object’s byte code

P

Page 14: Reverse Engineering Dynamic Languages A Focus on Python Aaron Portnoy, Ali Rizvi-Santiago aportnoy@tippingpoint.com arizvisa@tippingpoint.com P

Byte Code Primer

Instruction consists of a 1-byte opcode followed by an argument when requiredArguments are 16-bits

Has support for extended argsUsed if your code has more than 64k of defined constants

Ridiculous getopt implementation?Like gcc?

Data is not part of byte codeIndex references into other code object properties

co_constsco_namesco_varnames

P

Page 15: Reverse Engineering Dynamic Languages A Focus on Python Aaron Portnoy, Ali Rizvi-Santiago aportnoy@tippingpoint.com arizvisa@tippingpoint.com P

Byte Code Example

\x64\x02\x00 \x64\x4E\x00 \x64\x17\x00 \x66\x03\x00 \x55

P

Page 16: Reverse Engineering Dynamic Languages A Focus on Python Aaron Portnoy, Ali Rizvi-Santiago aportnoy@tippingpoint.com arizvisa@tippingpoint.com P

Byte Code Example (cont.)

LOAD_CONST 2LOAD_CONST 78LOAD_CONST 23BUILD_TUPLE 3RETURN_VALUE

P

Page 17: Reverse Engineering Dynamic Languages A Focus on Python Aaron Portnoy, Ali Rizvi-Santiago aportnoy@tippingpoint.com arizvisa@tippingpoint.com P

Code Object Modification

Code objects are immutableBUT, you can clone an object, optionally modifying attributesWe call this “sneaking the type”™

>>> code = type(eval('lambda:x').func_code)>>> help(code)Help on class code in module __builtin__:

class code(object)| code(argcount, nlocals, stacksize, flags, codestring, | constants, names, varnames, filename, name, | firstlineno, lnotab[, freevars[, cellvars]]) | | Create a code object. Not for the faint of heart.

S

Page 18: Reverse Engineering Dynamic Languages A Focus on Python Aaron Portnoy, Ali Rizvi-Santiago aportnoy@tippingpoint.com arizvisa@tippingpoint.com P

Introducing AntiFreeze

Tool for statically modifying code objects within a PYDWeb-basedInterface utilizes Ext-js javascript library

ComponentsDisassembly EngineAssemblerFunctionality for extracting code objects from a PYD

PE ParserIntel Disassembler

P

Page 19: Reverse Engineering Dynamic Languages A Focus on Python Aaron Portnoy, Ali Rizvi-Santiago aportnoy@tippingpoint.com arizvisa@tippingpoint.com P

P

Page 20: Reverse Engineering Dynamic Languages A Focus on Python Aaron Portnoy, Ali Rizvi-Santiago aportnoy@tippingpoint.com arizvisa@tippingpoint.com P

P

Page 21: Reverse Engineering Dynamic Languages A Focus on Python Aaron Portnoy, Ali Rizvi-Santiago aportnoy@tippingpoint.com arizvisa@tippingpoint.com P

P

Page 22: Reverse Engineering Dynamic Languages A Focus on Python Aaron Portnoy, Ali Rizvi-Santiago aportnoy@tippingpoint.com arizvisa@tippingpoint.com P

P

Page 23: Reverse Engineering Dynamic Languages A Focus on Python Aaron Portnoy, Ali Rizvi-Santiago aportnoy@tippingpoint.com arizvisa@tippingpoint.com P

Enough About Static Stuff

Time to explore runtime tricks…

S

Page 24: Reverse Engineering Dynamic Languages A Focus on Python Aaron Portnoy, Ali Rizvi-Santiago aportnoy@tippingpoint.com arizvisa@tippingpoint.com P

((((Objects and Types) of Objects)) and Types) of Types)

In Python, there are objects and types

Every object has a type associated with it

Every object also inherits from the ‘object’ typeThis also includes the ‘type’ typeSo, all types inherit from the type type

Which also inherits from the object type

If you try to mentally graph those relationships, you may have an aneurism

P

Page 25: Reverse Engineering Dynamic Languages A Focus on Python Aaron Portnoy, Ali Rizvi-Santiago aportnoy@tippingpoint.com arizvisa@tippingpoint.com P

Python Object Data Structure

0 int ob_refcnt4 struct _typeobject* ob_type8 int ob_size

All Instantiated Objects are prefixed with the following information:

ob_refcnt – is the reference counter for the object which is utilized for garbage collection

ob_type – contains a pointer to the type of the object

ob_size – duh

S

Page 26: Reverse Engineering Dynamic Languages A Focus on Python Aaron Portnoy, Ali Rizvi-Santiago aportnoy@tippingpoint.com arizvisa@tippingpoint.com P

Python Standard Types

All base types are exported by the python dll. Check your local dependency viewer for all types.

0:001> dd 0x1663660 *this is the address of an object01663660 00000002 1e1959d0 0000001c 0000001c01663670 0000007f 01706498 1e051f70 dea555d001663680 0166c660 0166c630 7d8c4178 0166f598

0:001> ln 0x1e1959d0 *your ob_type goes here(1e1959d0) python24!PyDict_Type Exact matches: python24!PyDict_Type (<no parameter info>)

P

Page 27: Reverse Engineering Dynamic Languages A Focus on Python Aaron Portnoy, Ali Rizvi-Santiago aportnoy@tippingpoint.com arizvisa@tippingpoint.com P

Execution of a Code Object

PyFrameObject*

PyEval_EvalCode(PyCodeObject* co, PyObject* globals, PyObject* locals)

Binds Code object to globals()/locals() and returns a PyFrameObject

PyObject*

PyEval_EvalFrame(PyFrameObject* f)

PyEval_EvalFrame takes the new frame object and is responsible for actual execution.

P

Page 28: Reverse Engineering Dynamic Languages A Focus on Python Aaron Portnoy, Ali Rizvi-Santiago aportnoy@tippingpoint.com arizvisa@tippingpoint.com P

Concurrent execution of code objects

Multiple interpreters can exist in a single processEach Interpreter has a list of threads associated with it

Concurrency is handled via a lock known as the GILRemember FreeBSD?

PyEval_EvalFrame is responsible for releasing the lock

S

Page 29: Reverse Engineering Dynamic Languages A Focus on Python Aaron Portnoy, Ali Rizvi-Santiago aportnoy@tippingpoint.com arizvisa@tippingpoint.com P

Diving in With a Debugger

Key things we will need to identifyAll existing interpretersThreads associated with an interpreterWhat's currently being executed?

S

Page 30: Reverse Engineering Dynamic Languages A Focus on Python Aaron Portnoy, Ali Rizvi-Santiago aportnoy@tippingpoint.com arizvisa@tippingpoint.com P

Interpreters

The list of interpreters is a plain old stackJust need to find a reference to the head of the stack.

“interp_head” in python-src/Python/pystate.c

0:001> u PyInterpreterState_Head *mad-friendlypython24!PyInterpreterState_Head:1e08ce90 a1c0871b1e mov eax, [python24!1e1b87c0]1e08ce95 c3 ret

S

Page 31: Reverse Engineering Dynamic Languages A Focus on Python Aaron Portnoy, Ali Rizvi-Santiago aportnoy@tippingpoint.com arizvisa@tippingpoint.com P

Interpreter Data Structure

0 struct _is* next4 struct _ts* tstate_head8 PyObject* modulesc PyObject* sysdict10 PyObject* builtins14 PyObject* codec_search_path18 PyObject* codec_search_cache1c PyObject* codec_error_registry

S

Page 32: Reverse Engineering Dynamic Languages A Focus on Python Aaron Portnoy, Ali Rizvi-Santiago aportnoy@tippingpoint.com arizvisa@tippingpoint.com P

Threads

The list of interpreters is also just a plain old stack

0 struct _ts* next4 PyInterpreterState* interp8 struct _frame* framec int recursion_depth10 int tracing14 int use_tracing…40 PyObject* dict…50 long thread_id ; this is your GetCurrentThreadId()

S

Page 33: Reverse Engineering Dynamic Languages A Focus on Python Aaron Portnoy, Ali Rizvi-Santiago aportnoy@tippingpoint.com arizvisa@tippingpoint.com P

0 int ob_refcnt4 struct _typeobject* ob_type8 int ob_size

c struct _frame *f_back ; calling frame10 PyCodeObject *f_code14 PyObject *f_builtins18 PyDictObject *f_globals1c PyDictObject *f_locals20 PyObject **f_valuestack24 PyObject **f_stacktop28 PyObject *f_trace

Frame Object

S

Page 34: Reverse Engineering Dynamic Languages A Focus on Python Aaron Portnoy, Ali Rizvi-Santiago aportnoy@tippingpoint.com arizvisa@tippingpoint.com P

Hooking?

All code must pass through PyEval_EvalCode or PyEval_EvalFrame

Can also hook PyObject_CallFunction or PyObject_CallMethod

S

Sounds easy enough…

Page 35: Reverse Engineering Dynamic Languages A Focus on Python Aaron Portnoy, Ali Rizvi-Santiago aportnoy@tippingpoint.com arizvisa@tippingpoint.com P

• Breaking on PyEval_EvalFrame

– Display Name of code object• da poi(poi(poi(@esp+4)+0xc+4)+8+0x2c)+8+0xc

– Display Locals• r@$t1=poi(@esp+4);r@$t1=poi(@$t1+0x18);r@$t2=dwo(@$t1+0x10)+1;r@

$t1=poi(@$t1+0x14);r@$t3=@$t1+@$t2*@$ptrsize;.while(@$t1<@$t3){r@$t2=poi(@$t1+4);r@$t1=@$t1+@$ptrsize;j(@$t2>0x14)'da@$t2+0x14';''}

– Display Globals• r@$t1=poi(@esp+4);r@$t1=poi(@$t1+0x1c);r@$t2=dwo(@$t1+0x10)+1;r@

$t1=poi(@$t1+0x14);r@$t3=@$t1+@$t2*@$ptrsize;.while(@$t1<@$t3){r@$t2=poi(@$t1+4);r@$t1=@$t1+@$ptrsize;j(@$t2>0x14)'da@$t2+0x14';''}

• Breaking on a PyObject_Call*– r@$t1=poi(@esp+4);r@$t2=@$t1;r@$t2=poi(@$t2+0x1c)+0x14;.printf

"PyFunction_Type:";da@$t2;r@$t3=@$t1;r@$t3=poi(@$t3+8);r@$t3=poi(@$t3);.printf"PyCFunction_Type";da@$t3;r@$t4=@$t1;r@$t4=poi(@$t4+8);r@$t4=poi(@$t4+0x1c)+0x14;.printf"PyMethod_Type";da@$t4

Thx WinDBG!

Breakpoints

S

Page 36: Reverse Engineering Dynamic Languages A Focus on Python Aaron Portnoy, Ali Rizvi-Santiago aportnoy@tippingpoint.com arizvisa@tippingpoint.com P

Isn’t that a context switch into and out of kernel for execution of EVERY frame?

Wait…

P

Page 37: Reverse Engineering Dynamic Languages A Focus on Python Aaron Portnoy, Ali Rizvi-Santiago aportnoy@tippingpoint.com arizvisa@tippingpoint.com P

Userspace Hooking0:000> .dvalloc 1000Allocated 1000 bytes starting at 00430000

Let's poke around0:000> u PyEval_EvalFramepython24!PyEval_EvalFrame:1e027940 83ec54 sub esp,54h1e027943 53 push ebx1e027944 8b1dc4871b1e mov ebx, [1e1b87c4]1e02794a 56 push esi0:000> a PyEval_EvalFrame1e027940 jmp 0x4300001e027945 0:000> u PyEval_EvalFramepython24!PyEval_EvalFrame:1e027940 e9bb8640e2 jmp 004300001e027945 1dc4871b1e sbb eax, 1e1b87c41e02794a 56 push esi1e02794b 8b742460 mov esi,dword ptr [esp+60h]1e02794f 57 push edi1e027950 33ff xor edi,edi1e027952 83c8ff or eax,0FFFFFFFFh1e027955 3bf7 cmp esi,edi0:000> a 43000000430000 int 300430001 sub esp, 0x5400430004 push ebx00430005 mov ebx, [0x1e1b87c4]0043000b jmp 0x1e02794a

P

Page 38: Reverse Engineering Dynamic Languages A Focus on Python Aaron Portnoy, Ali Rizvi-Santiago aportnoy@tippingpoint.com arizvisa@tippingpoint.com P

PyRun_* makes injection incredibly easy. Let's take a look at PyRun_String:

PyObject*PyRun_String(const char* str, int start, PyObject*

globals, PyObject* locals){

return run_err_node(PyParser_SimpleParseString(str, start),

"<string>", globals, locals, NULL);

}

Dynamic Recompilation

S

Page 39: Reverse Engineering Dynamic Languages A Focus on Python Aaron Portnoy, Ali Rizvi-Santiago aportnoy@tippingpoint.com arizvisa@tippingpoint.com P

Straightforward approachRe-declare the function and then call the original:

def old(blah, heh, ok, im, over, it): print "hello globals()"

original_old = olddef new(*args, **kwds): print repr(args), repr(kwds) res = original_old(*args, **kwds) print "result was: %s"% repr(res) return resold = new

Function Hooking in Python

S

Page 40: Reverse Engineering Dynamic Languages A Focus on Python Aaron Portnoy, Ali Rizvi-Santiago aportnoy@tippingpoint.com arizvisa@tippingpoint.com P

instancemethods are immutable and are bound to an instance

Just need to sneak it’s type and then clone with your new function.

instancemethod = type(Exception.__str__)instancemethod(function, instance, class)

class obj(object): def method(self): print "yay for methods"

def new(self): print "okay...."

x = obj()old = x.method.im_funcx.method = instancemethod(new, x, type(x))

Instance Method Hooking in Python

S

Page 41: Reverse Engineering Dynamic Languages A Focus on Python Aaron Portnoy, Ali Rizvi-Santiago aportnoy@tippingpoint.com arizvisa@tippingpoint.com P

sys.settrace(fn)

http://docs.python.org/lib/debugger-hooks.html

def fn(*args): print repr(args)sys.settrace(fn)

ihooks

http://effbot.org/librarybook/ihooks.htm

Python Supported Debugging Hooks

S

Page 42: Reverse Engineering Dynamic Languages A Focus on Python Aaron Portnoy, Ali Rizvi-Santiago aportnoy@tippingpoint.com arizvisa@tippingpoint.com P

Enough Boring Stuff, Time for Demos

P

Page 43: Reverse Engineering Dynamic Languages A Focus on Python Aaron Portnoy, Ali Rizvi-Santiago aportnoy@tippingpoint.com arizvisa@tippingpoint.com P

Static PYD Modifications for Pirates

Digging through the disassembly using AntiFreeze….We notice *Globals generally contain interesting constants to modify

pirates.reputation.ReputationGlobalsLevel/Experience cheats

pirates.economy.EconomyGlobalsGold cheats

pirates.piratebase.PirateGlobalsSpeed/Accleration/Jump Height/… cheats

pirates.ship.ShipGlobalsSpeed/Acceleration cheats

P

Page 44: Reverse Engineering Dynamic Languages A Focus on Python Aaron Portnoy, Ali Rizvi-Santiago aportnoy@tippingpoint.com arizvisa@tippingpoint.com P

P

Page 45: Reverse Engineering Dynamic Languages A Focus on Python Aaron Portnoy, Ali Rizvi-Santiago aportnoy@tippingpoint.com arizvisa@tippingpoint.com P

P

Page 46: Reverse Engineering Dynamic Languages A Focus on Python Aaron Portnoy, Ali Rizvi-Santiago aportnoy@tippingpoint.com arizvisa@tippingpoint.com P

P

Page 47: Reverse Engineering Dynamic Languages A Focus on Python Aaron Portnoy, Ali Rizvi-Santiago aportnoy@tippingpoint.com arizvisa@tippingpoint.com P

Caught?

P

Page 48: Reverse Engineering Dynamic Languages A Focus on Python Aaron Portnoy, Ali Rizvi-Santiago aportnoy@tippingpoint.com arizvisa@tippingpoint.com P

Screenshot Contest

Disney announced a screenshot contest that coincides with ReconTop 10 get an iPod Touch

We’ll submit our obviously cheating screenshots now…http://apps.pirates.go.com/pirates/v3/#/community/contests.html

P

Page 49: Reverse Engineering Dynamic Languages A Focus on Python Aaron Portnoy, Ali Rizvi-Santiago aportnoy@tippingpoint.com arizvisa@tippingpoint.com P

Questions?

Additionally, contact us via [email protected]@tippingpoint.com

Blog/Updates/etc at http://dvlabs.tippingpoint.com

P/S

Page 50: Reverse Engineering Dynamic Languages A Focus on Python Aaron Portnoy, Ali Rizvi-Santiago aportnoy@tippingpoint.com arizvisa@tippingpoint.com P