Transcript
Page 1: Statistical Deobfuscation for Android Applications · Statistical Deobfuscation for Android Applications Benjamin Bichsel Veselin Raychev Petar Tsankov Martin Vechev Department of

StatisticalDeobfuscation forAndroidApplications

BenjaminBichsel

VeselinRaychev

PetarTsankov

MartinVechev

DepartmentofComputerScience

Page 2: Statistical Deobfuscation for Android Applications · Statistical Deobfuscation for Android Applications Benjamin Bichsel Veselin Raychev Petar Tsankov Martin Vechev Department of

WhyDe-obfuscate?

GooglePlay

Androidbinaries(APKs)(nocodeavailable)

NumberofAPKsonGooglePlay 2.4MAPKs

’10 ’12 ’14 ’16

Page 3: Statistical Deobfuscation for Android Applications · Statistical Deobfuscation for Android Applications Benjamin Bichsel Veselin Raychev Petar Tsankov Martin Vechev Department of

LayoutObfuscationinAndroid

Obfuscate

Non-descriptivenames

Namesprovidekeysemanticinformation

package com.example.dbhelper

class DBHelper extends SQLiteHelper {SQLiteDatabase db;

public DBHelper(Context ctx) {db = getWritableDatabase();

}

Cursor execSQL(String str) {return db.rawQuery(str);

}}

package a.b.c

class a extends SQLiteHelper {SQLiteDatabase b;

public a(Context ctx) {b = getWritableDatabase();

}

Cursor c(String str) {return b.rawQuery(str);

}}

Somenamesremain

Page 4: Statistical Deobfuscation for Android Applications · Statistical Deobfuscation for Android Applications Benjamin Bichsel Veselin Raychev Petar Tsankov Martin Vechev Department of

LayoutObfuscationinAndroid

Obfuscate

Non-descriptivenames

Namesprovidekeysemanticinformation

package com.example.dbhelper

class DBHelper extends SQLiteHelper {SQLiteDatabase db;

public DBHelper(Context ctx) {db = getWritableDatabase();

}

Cursor execSQL(String str) {return db.rawQuery(str);

}}

package a.b.c

class a extends SQLiteHelper {SQLiteDatabase b;

public a(Context ctx) {b = getWritableDatabase();

}

Cursor c(String str) {return b.rawQuery(str);

}}

Somenamesremain

SecurityChallenges

CodeInspection

Third-partyLibraryDetection

… manyothers

Page 5: Statistical Deobfuscation for Android Applications · Statistical Deobfuscation for Android Applications Benjamin Bichsel Veselin Raychev Petar Tsankov Martin Vechev Department of

LayoutObfuscationinAndroidNon-descriptive

names

Namesprovidekeysemanticinformation

package a.b.c

class a extends SQLiteHelper {SQLiteDatabase b;

public a(Context ctx) {b = getWritableDatabase();

}

Cursor c(String str) {return b.rawQuery(str);

}}

package com.example.dbhelper

class DBHelper extends SQLiteHelper {SQLiteDatabase db;

public DBHelper(Context ctx) {db = getWritableDatabase();

}

Cursor execSQL(String str) {return db.rawQuery(str);

}}

Somenamesremain

Canwereverselayoutobfuscation

Page 6: Statistical Deobfuscation for Android Applications · Statistical Deobfuscation for Android Applications Benjamin Bichsel Veselin Raychev Petar Tsankov Martin Vechev Department of

LayoutObfuscationinAndroid

package com.example.dbhelper

class DBHelper extends SQLiteHelper {SQLiteDatabase db;

public DBHelper(Context ctx) {db = getWritableDatabase();

}

Cursor execSQL(String str) {return db.rawQuery(str);

}}

package a.b.c

class a extends SQLiteHelper {SQLiteDatabase b;

public a(Context ctx) {b = getWritableDatabase();

}

Cursor c(String str) {return b.rawQuery(str);

}}

Non-descriptivenames

NamesprovidekeysemanticinformationYes,withroughly80%accuracy!

www.apk-deguard.com

Page 7: Statistical Deobfuscation for Android Applications · Statistical Deobfuscation for Android Applications Benjamin Bichsel Veselin Raychev Petar Tsankov Martin Vechev Department of

www.apk-deguard.com

Releasedlastweek,sofar:>5Kusers >5GBAPKs

Redditposts/comments Tweets

... ...

Page 8: Statistical Deobfuscation for Android Applications · Statistical Deobfuscation for Android Applications Benjamin Bichsel Veselin Raychev Petar Tsankov Martin Vechev Department of

HowDoesDeGuard Work?

Page 9: Statistical Deobfuscation for Android Applications · Statistical Deobfuscation for Android Applications Benjamin Bichsel Veselin Raychev Petar Tsankov Martin Vechev Department of

DeGuard:SystemOverview

Staticanalysis TransformMAP

Inference

class a extends SQLiteHelper {SQLiteDatabase b;public a(Context ctx) {

b = getWritableDB();}

}

class DBHelper extends SQLiteHelper{SQLiteDatabase db;public DBHelper(Context ctx) {

db = getWritableDB();}

}

PredictionPhase

Open-source,unobfuscatedapplications

LearningPhase

Staticanalysis Training

Probabilisticmodel𝑃 )

Semanticrepresentation

ObfuscatedCode De-obfuscatedCode

Page 10: Statistical Deobfuscation for Android Applications · Statistical Deobfuscation for Android Applications Benjamin Bichsel Veselin Raychev Petar Tsankov Martin Vechev Department of

ProbabilisticGraphicalModels

Page 11: Statistical Deobfuscation for Android Applications · Statistical Deobfuscation for Android Applications Benjamin Bichsel Veselin Raychev Petar Tsankov Martin Vechev Department of

SQLiteHelper

getWritableDB

aextends

getsfield-in

b

ProbabilisticGraphicalModelsname1 name2 weight

𝑓% SQLiteHelper DBUtils 0.3𝑓& SQLiteHelper DBHelper 0.2

name1 name2 weight𝑓' getWritableDB db 0.7𝑓( getWritableDB instance 0.4

name1 name2 weight𝑓) DBUtils instance 0.5𝑓* DBHelper db 0.4𝑓+ … … …

Graph+featuresdefineaprobabilisticgraphicalmodel

𝑂 𝐾𝑃 ) == 𝑃 𝑎, 𝑏 𝑆𝑄𝐿𝑖𝑡𝑒𝐻𝑒𝑙𝑝𝑒𝑟, 𝑔𝑒𝑡𝑊𝑟𝑖𝑡𝑎𝑏𝑙𝑒𝐷𝐵)

= 1𝑍 exp(0.3 I 𝑓% 𝑆𝑄𝐿𝑖𝑡𝑒𝐻𝑒𝑙𝑝𝑒𝑟, 𝑎

+0.2 I 𝑓& 𝑆𝑄𝐿𝑖𝑡𝑒𝐻𝑒𝑙𝑝𝑒𝑟, 𝑎 +⋯ )

class a extends SQLiteHelper {SQLiteDatabase b;public a(Context ctx) {b = getWritableDB();

}}

𝑂 Unknownvariables

Knownvariables

𝑓%, 𝑓&, . .

𝐾

Featurefunctions

Page 12: Statistical Deobfuscation for Android Applications · Statistical Deobfuscation for Android Applications Benjamin Bichsel Veselin Raychev Petar Tsankov Martin Vechev Department of

SQLiteHelper

getWritableDB

aextends

getsfield-in

b

ProbabilisticGraphicalModelsname1 name2 weight

𝑓% SQLiteHelper DBUtils 0.3𝑓& SQLiteHelper DBHelper 0.2

name1 name2 weight𝑓' getWritableDB db 0.7𝑓( getWritableDB instance 0.4

name1 name2 weight𝑓) DBUtils instance 0.5𝑓* DBHelper db 0.4𝑓+ … … …

Graph+featuresdefineaprobabilisticgraphicalmodel

𝑂 𝐾𝑃 ) == 𝑃 𝑎, 𝑏 𝑆𝑄𝐿𝑖𝑡𝑒𝐻𝑒𝑙𝑝𝑒𝑟, 𝑔𝑒𝑡𝑊𝑟𝑖𝑡𝑎𝑏𝑙𝑒𝐷𝐵)

= 1𝑍 exp(0.3 I 𝑓% 𝑆𝑄𝐿𝑖𝑡𝑒𝐻𝑒𝑙𝑝𝑒𝑟, 𝑎

+0.2 I 𝑓& 𝑆𝑄𝐿𝑖𝑡𝑒𝐻𝑒𝑙𝑝𝑒𝑟, 𝑎 +⋯ )

class a extends SQLiteHelper {SQLiteDatabase b;public a(Context ctx) {b = getWritableDB();

}}

𝑂 Unknownvariables

Knownvariables

𝑓%, 𝑓&, . .

𝐾

Featurefunctions

NextHowaretheweightsandfeatureslearned?

Page 13: Statistical Deobfuscation for Android Applications · Statistical Deobfuscation for Android Applications Benjamin Bichsel Veselin Raychev Petar Tsankov Martin Vechev Department of

Learning

Page 14: Statistical Deobfuscation for Android Applications · Statistical Deobfuscation for Android Applications Benjamin Bichsel Veselin Raychev Petar Tsankov Martin Vechev Department of

Learning

UnobfuscatedAPKs

name1 name2 weight𝑓% SQLiteHelper DBUtils 0.3𝑓& SQLiteHelper DBHelper 0.2𝑓' getWritableDB db 0.7𝑓( getWritableDB instance 0.4𝑓) DBUtils instance 0.5𝑓* DBHelper db 0.4𝑓+ … … …

name1 name2𝑓% SQLiteHelper DBUtils𝑓& SQLiteHelper DBHelper𝑓' getWritableDB db𝑓( getWritableDB instance𝑓) DBUtils instance𝑓* DBHelper db𝑓+ … …

Computeweightsthatmaximize𝑃 𝑂 = 𝑜N 𝐾 = 𝑘N foralltrainingsamples(𝑜N, 𝑘N)

Staticanalysis

TrainModel

Featuretemplates

Features(withcandidatenames)

Dependencygraphs

28templates

Actualgraphshave>1,000nodes

>2,000

>100,000

Page 15: Statistical Deobfuscation for Android Applications · Statistical Deobfuscation for Android Applications Benjamin Bichsel Veselin Raychev Petar Tsankov Martin Vechev Department of

DeGuard:SystemOverview

Staticanalysis TransformMAP

Inference

class a extends SQLiteHelper {SQLiteDatabase b;public a(Context ctx) {

b = getWritableDB();}

}

class DBHelper extends SQLiteHelper{SQLiteDatabase db;public DBHelper(Context ctx) {

db = getWritableDB();}

}

PredictionPhase

Open-source,unobfuscatedapplications

LearningPhase

Staticanalysis Training

ObfuscatedCode De-obfuscatedCode

Probabilisticmodel𝑃 )

Page 16: Statistical Deobfuscation for Android Applications · Statistical Deobfuscation for Android Applications Benjamin Bichsel Veselin Raychev Petar Tsankov Martin Vechev Department of

DeGuard:SystemOverview

Staticanalysis TransformMAP

Inference

class a extends SQLiteHelper {SQLiteDatabase b;public a(Context ctx) {

b = getWritableDB();}

}

class DBHelper extends SQLiteHelper{SQLiteDatabase db;public DBHelper(Context ctx) {

db = getWritableDB();}

}

PredictionPhaseObfuscatedCode De-obfuscatedCode

Probabilisticmodel𝑃 )

Page 17: Statistical Deobfuscation for Android Applications · Statistical Deobfuscation for Android Applications Benjamin Bichsel Veselin Raychev Petar Tsankov Martin Vechev Department of

ObfuscatedCode

SQLiteHelper

getWritableDB

aextends

getsfield-in

b

PredictionPhasename1 name2 weightSQLiteHelper DBUtils 0.3SQLiteHelper DBHelper 0.2

name1 name2 weightgetWritableDB db 0.7getWritableDB instance 0.4

name1 name2 weightDBUtils instance 0.5DBHelper db 0.4DBUtils db 0.2DBHelper instance 0.2

class a extends SQLiteHelper {SQLiteDatabase b;public a(Context ctx) {b = getWritableDB();

}}

Staticanalysis

Page 18: Statistical Deobfuscation for Android Applications · Statistical Deobfuscation for Android Applications Benjamin Bichsel Veselin Raychev Petar Tsankov Martin Vechev Department of

ObfuscatedCode

SQLiteHelper

getWritableDB

aextends

getsfield-in

b

PredictionPhasename1 name2 weightSQLiteHelper DBUtils 0.3SQLiteHelper DBHelper 0.2

name1 name2 weightgetWritableDB db 0.7getWritableDB instance 0.4

name1 name2 weightDBUtils instance 0.5DBHelper db 0.4DBUtils db 0.2DBHelper instance 0.2

class a extends SQLiteHelper {SQLiteDatabase b;public a(Context ctx) {b = getWritableDB();

}}

Programanalysis

MAPInference

Candidate assignment𝒐 𝑷 𝒐 𝒌)*a =DBUtils b =instance 1.2a =DBHelper b =db 1.3a =DBUtils b =db 0.8a =DBHelper b =instance 1.2

*Non-normalized

�⃗� = 𝑎𝑟𝑔𝑚𝑎𝑥𝑃 𝑂 = �⃗�′ 𝐾 = 𝑘�⃗�′ ∈ Ω

Page 19: Statistical Deobfuscation for Android Applications · Statistical Deobfuscation for Android Applications Benjamin Bichsel Veselin Raychev Petar Tsankov Martin Vechev Department of

ObfuscatedCode

SQLiteHelper

getWritableDB

aextends

getsfield-in

b

PredictionPhasename1 name2 weightSQLiteHelper DBUtils 0.3SQLiteHelper DBHelper 0.2

name1 name2 weightgetWritableDB db 0.7getWritableDB instance 0.4

name1 name2 weightDBUtils instance 0.5DBHelper db 0.4DBUtils db 0.2DBHelper instance 0.2

class a extends SQLiteHelper {SQLiteDatabase b;public a(Context ctx) {b = getWritableDB();

}}

Programanalysis

MAPInference

Candidate assignment𝒐 𝑷 𝒐 𝒌)*a =DBUtils b =instance 1.2a =DBHelper b =db 1.3a =DBUtils b =db 0.8a =DBHelper b =instance 1.2

*Non-normalized

�⃗� = 𝑎𝑟𝑔𝑚𝑎𝑥𝑃 𝑂 = �⃗�′ 𝐾 = 𝑘�⃗�′ ∈ Ω

Page 20: Statistical Deobfuscation for Android Applications · Statistical Deobfuscation for Android Applications Benjamin Bichsel Veselin Raychev Petar Tsankov Martin Vechev Department of

ObfuscatedCode

SQLiteHelper

getWritableDB

DBHelperextends

getsfield-in

db

PredictionPhasename1 name2 weightSQLiteHelper DBUtils 0.3SQLiteHelper DBHelper 0.2

name1 name2 weightgetWritableDB db 0.7getWritableDB instance 0.4

name1 name2 weightDBUtils instance 0.5DBHelper db 0.4DBUtils db 0.2DBHelper instance 0.2

class a extends SQLiteHelper {SQLiteDatabase b;public a(Context ctx) {b = getWritableDB();

}}

Staticanalysis

Deobfuscated Code

class DBHelper extends SQLiteHelper {SQLiteDatabase db; public DBHelper(Context ctx) {db = getWritableDB();

}}

Transform

Page 21: Statistical Deobfuscation for Android Applications · Statistical Deobfuscation for Android Applications Benjamin Bichsel Veselin Raychev Petar Tsankov Martin Vechev Department of

PreservingSemantics

class Aint aObject bvoid a()

class B extends

Avoid b()void c(A a)

Syntacticconstraintse.g.fieldswithinaclassmusthavedistinctnames

Semanticconstraintse.g.methodoverloadsmustbepreserved

Freelyrenamingfields/variables/methodsmaychange theprogramsemantics

musthavedistinctnames

musthavedistinctnames

mustnotoverride

methoda()

Page 22: Statistical Deobfuscation for Android Applications · Statistical Deobfuscation for Android Applications Benjamin Bichsel Veselin Raychev Petar Tsankov Martin Vechev Department of

DeGuard:SystemOverview

Staticanalysis TransformMAP

Inference

class a extends SQLiteHelper {SQLiteDatabase b;public a(Context ctx) {

b = getWritableDB();}

}

class DBHelper extends SQLiteHelper{SQLiteDatabase db;public DBHelper(Context ctx) {

db = getWritableDB();}

}

PredictionPhase

Open-source,unobfuscatedapplications

LearningPhase

Staticanalysis Training

ObfuscatedCode De-obfuscatedCode

Page 23: Statistical Deobfuscation for Android Applications · Statistical Deobfuscation for Android Applications Benjamin Bichsel Veselin Raychev Petar Tsankov Martin Vechev Department of

DeGuard Implementation

Page 24: Statistical Deobfuscation for Android Applications · Statistical Deobfuscation for Android Applications Benjamin Bichsel Veselin Raychev Petar Tsankov Martin Vechev Department of

DeGuard Implementation

www.apk-deguard.com

§ StaticanalysisframeworkforJavaandAndroid

StaticAnalysis

LearningandMAPInference§ Scalableopen-sourceframework

forstructuredprediction§ Open-source:http://nice2predict.org

§ Trainingdata:2Kopen-source,unobfuscated Androidapplications

Page 25: Statistical Deobfuscation for Android Applications · Statistical Deobfuscation for Android Applications Benjamin Bichsel Veselin Raychev Petar Tsankov Martin Vechev Department of

1. CanDeGuard reverseProGuard?2. CanDeGuard detectthird-partylibraries?3. IsDeGuard usefulformalwareinspection?

EvaluationEvaluation

Page 26: Statistical Deobfuscation for Android Applications · Statistical Deobfuscation for Android Applications Benjamin Bichsel Veselin Raychev Petar Tsankov Martin Vechev Department of

ProGuard Experiment

SourceCode

ObfuscatedAPK De-obfuscatedAPK

Non-obfuscatedAPK =?

Page 27: Statistical Deobfuscation for Android Applications · Statistical Deobfuscation for Android Applications Benjamin Bichsel Veselin Raychev Petar Tsankov Martin Vechev Department of

AfterObfuscation

Fields Methods Classes Packages Total

20

40

60

80

100

0

%ofprogramelements

only13%knownnames

Knownnames

Page 28: Statistical Deobfuscation for Android Applications · Statistical Deobfuscation for Android Applications Benjamin Bichsel Veselin Raychev Petar Tsankov Martin Vechev Department of

CanDeGuard reverseProGuard?

Fields Methods Classes Packages Total

20

40

60

80

100

0

%ofprogramelements

Knownnames

Correctlypredictednames

Mis-predictednames

Packagenamesaredirectlyusedto

predictthird-partylibraries

1.6%knownnames

80.6%correctnames

80%ofthenamesareidenticaltotheoriginalones

i.e.,identicaltotheoriginalnames

Page 29: Statistical Deobfuscation for Android Applications · Statistical Deobfuscation for Android Applications Benjamin Bichsel Veselin Raychev Petar Tsankov Martin Vechev Department of

CanDeGuard DetectThird-PartyLibraries?

LibraryCode

SourceCode

ObfuscatedAPK

ProGuardobfuscateslibrarypackagenames

De-obfuscatedAPK

?

Precision:93.1%Recall:91%

ProGuard

Page 30: Statistical Deobfuscation for Android Applications · Statistical Deobfuscation for Android Applications Benjamin Bichsel Veselin Raychev Petar Tsankov Martin Vechev Department of

IsDeGuard UsefulforMalwareInspection?

class d {String a = System.getProperty(..)char[] b;byte [] c;byte[] a(String) {}

}

class Base64 {String NL = System.getProperty(..)char[] ENC;byte [] DEC;byte[] decode(String) {}

}

De-obfuscatingsamplesfromtheAndroidMalwareGenomeProject

MalwareSample De-obfuscatedMalwareSample

Base64Decoder

Revealsstringdecoders

Revealsclassesthathandlesensitivedata(e.g.Location)

Hardtohandleheavily-obfuscatedcode(e.g.reflection)

Page 31: Statistical Deobfuscation for Android Applications · Statistical Deobfuscation for Android Applications Benjamin Bichsel Veselin Raychev Petar Tsankov Martin Vechev Department of

package com.example.dbhelper

class DBHelper extends SQLiteHelper {SQLiteDatabase db;

public DBHelper(Context ctx) {db = getWritableDB();

}

Cursor execSQL(String str) {return db.rawQuery(str);

package a.b.c

class a extends SQLiteHelper {SQLiteDatabase b;

public a(Context ctx) {b = getWritableDB();

}

Cursor c(String str) {return b.rawQuery(str);

Tryonline:www.apk-deguard.com Fields Methods Classes Packages Total

20

40

60

80

100

0

SQLiteHelper

getWritableDB

a

b

name1 name2 weight

SQLiteHelper DBUtils 0.3

SQLiteHelper DBHelper 0.2

name1 name2 weight

getWritableDB db 0.7

getWritableDB instance 0.4

ProbabilisticModels

HighPredictionAccuracy

Moreinfo:http://www.srl.inf.ethz.ch/spas

Summary


Recommended