Attributing Hacks

Yu-Xiang Wang
Joint work with Ziqi Liu, Alex Smola, Kyle Soska, Qinghua Zheng

Attributing Hacks - UCSB yuxiangw/talks/attributing... · 2018-09-11


Amazon AI

•  Making machine learning and AI technologies accessible to all developers.
•  We are hiring!
   – PhD internship positions all year round.
   – Full-time positions also available.
   – Contact me, Anima, Alex or any other folks there.

Background

•  There are 1,000,000,000 websites on the internet as of Sep 2014.
•  About 1% of them are currently hacked or infected (source: securi.net)
•  That's about 10 million malicious websites!

What can we do about it?

•  Typically focus on detection and remediation.
   – Using small iFrames (Mavrommatis & Monrose, 08)
   – Norton Safe Web, McAfee SiteAdvisor.
•  Forensics / Attribution of hacks
   – Much harder problems
   – What? How? When?
   – This paper: use statistics, ML tools!

Outline

1.  Challenges
2.  Put ourselves in the hackers' shoes
3.  Our solution: survival analysis + trend filtering
4.  Results on real data

Challenge 1: hidden hacking procedure

None of the three is known to us!

Websites get hacked…

Challenge 2: Unknown hacking time

No explicit labels for supervised learner.

Challenge 3: time varying risk

•  Security risk is time sensitive.
   – Hackers keep discovering new exploits.
   – Websites keep patching bugs/vulnerabilities.
   – New versions of software are being installed.

Sharp changes triggered by events!

From a hacker's point of view

I found an exploit! What to do?

Money?

Fame?

Hack as many sites as possible as quickly as possible

Share the exploit with peers. Script kiddies will kick in.

What can we learn from this?
-  Searchable string snippets are indicative features (Soska & Christin 2014), e.g., HTML tags <meta>WordPress 2.9.2</meta>
-  Change points in hacking volume reveal hidden events/activities. (This paper!)
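A minimal sketch of the first point, assuming a site is represented by binary indicators for whether fixed text snippets occur in its HTML. The snippet list and the function name `snippet_features` are hypothetical; the slides do not say how the features were actually extracted.

```python
import re

# Hypothetical snippet vocabulary, for illustration only.
SNIPPETS = ["WordPress 2.9.2", "WordPress 3.5", "Joomla! 1.5"]

def snippet_features(html, snippets=SNIPPETS):
    """Return one 0/1 indicator per snippet: 1 if the snippet occurs in the page."""
    # Collapse whitespace so that line breaks inside the page do not hide a match.
    text = re.sub(r"\s+", " ", html)
    return [1 if s in text else 0 for s in snippets]

page = "<html><head><meta>WordPress 2.9.2</meta></head><body>Hello</body></html>"
features = snippet_features(page)
print(features)
```

Each indicator then plays the role of one coordinate of the feature vector x used in the survival model later in the talk.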

Outline

1.  Challenges
2.  Put ourselves in the hackers' shoes
3.  Our solution: survival analysis + trend filtering
4.  Results on real data

Recall the input and output

•  Task: estimate the risk of getting hacked.
•  Input:
   – Censored hack time.
   – Features of websites.
•  This is survival analysis!

Survival analysis

What the heck is that?

It's our bread and butter.
-  Dates back to late 1600s, in studying smallpox and life expectancy.
-  Still an active research area today.

[Portraits: Halley, Bernoulli]

Modern formulation: (Kaplan & Meier, 1958; Cox, 1972)
-  A density estimation problem for r.v. T: time of death.
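For reference, the standard textbook quantities behind this formulation are written out below. They are not taken from the slides: f denotes the density of T, S the survival function, and λ the hazard rate, which is the object the rest of the talk models.

```latex
\begin{aligned}
S(t) &= \Pr(T > t) = 1 - \int_0^t f(s)\,ds, \\
\lambda(t) &= \lim_{h \to 0^+} \frac{\Pr(t \le T < t + h \mid T \ge t)}{h} = \frac{f(t)}{S(t)}, \\
S(t) &= \exp\Big(-\int_0^t \lambda(s)\,ds\Big).
\end{aligned}
```

Estimating any one of f, S or λ determines the other two, so modelling the hazard is equivalent to estimating the density of T.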

[Diagram: Machine Learning and Statistics]

•  Regression
•  Clustering
•  Classification
•  Bayesian inference
•  Graphical models
•  Online learning

Hacking as a survival problem

•  A website got hacked ⇔ A patient had a heart attack.
•  Vulnerable features ⇔ Genes associated with heart disease
•  Relay checkpoint ⇔ A regular physical checkup.
•  Blacklisted ⇔ Diagnosed with heart failure
•  Inferential tasks of interest:
   – Prob(Heart attack before age 40 | DNA sequence x, healthy until 30)
   – Prob(hacked before May 1 | feature vector x, not hacked yet today)
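Both inferential tasks have the form P(T ≤ t_deadline | T > t_now, x), which equals 1 − exp(−(H(t_deadline) − H(t_now))) where H is the cumulative hazard for that x. The sketch below evaluates it for a piecewise-constant hazard. The function names, the knot layout and the numbers are assumptions made for illustration, not values from the talk.

```python
import math

def cumulative_hazard(knots, rates, t):
    """Integrate a piecewise-constant hazard from 0 to t.

    knots: increasing start times of the pieces, with knots[0] == 0.
    rates: hazard value on [knots[k], knots[k+1]); the last piece extends to infinity.
    """
    total = 0.0
    for k, rate in enumerate(rates):
        start = knots[k]
        end = knots[k + 1] if k + 1 < len(knots) else math.inf
        if t <= start:
            break
        total += rate * (min(t, end) - start)
    return total

def prob_event_before(knots, rates, t_now, t_deadline):
    """P(T <= t_deadline | T > t_now) = 1 - exp(-(H(t_deadline) - H(t_now)))."""
    delta = cumulative_hazard(knots, rates, t_deadline) - cumulative_hazard(knots, rates, t_now)
    return 1.0 - math.exp(-delta)

# Hypothetical numbers: time in days, hazard jumps at day 30 (say, an exploit is released).
knots = [0.0, 30.0]
rates = [0.001, 0.01]
p = prob_event_before(knots, rates, t_now=20.0, t_deadline=40.0)
print(p)
```

Here the cumulative hazard between day 20 and day 40 is 0.001 × 10 + 0.01 × 10 = 0.11, so the conditional probability is 1 − exp(−0.11).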

The Cox model

•  A semi-parametric model.
•  The "default" survival analysis model…
•  Cited 44903 times (Google Scholar)!

[Portrait: Sir David Cox]

Cox (1972). "Regression models and Life-tables". Journal of the Royal Statistical Society.

$$\lambda(x, t; w) = \lambda_0(t) \exp\langle w, x \rangle$$
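A small sketch of the Cox hazard, with a hypothetical baseline hazard and weight vector chosen only for illustration. It shows the proportional-hazards property: the ratio of hazards between two sites is exp⟨w, x_a − x_b⟩ and does not depend on t.

```python
import math

def cox_hazard(baseline, w, x, t):
    """Cox model: lambda(x, t; w) = lambda_0(t) * exp(<w, x>)."""
    score = sum(wj * xj for wj, xj in zip(w, x))
    return baseline(t) * math.exp(score)

# Hypothetical baseline hazard and weights.
def baseline(t):
    return 0.01 + 0.001 * t

w = [0.7, -0.2]
site_a = [1, 0]  # has feature 0 only
site_b = [0, 0]  # has neither feature

# The hazard ratio between the two sites is the same at every time point.
ratios = [
    cox_hazard(baseline, w, site_a, t) / cox_hazard(baseline, w, site_b, t)
    for t in (1.0, 50.0, 300.0)
]
print(ratios)
```

Because the effect of a feature is a single number, this form cannot express a feature whose risk changes sharply after an event, which is the limitation the next slide addresses.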

From Cox model to our model

•  Cox model:
   – Low dimensional generalized linear model

$$\lambda(x, t; w) = \lambda_0(t) \exp\langle w, x \rangle$$

•  Our model:
   – Time varying, additive hazard function.
   – High dimensional. w is a vector of functions in t.
   – Fully nonparametric for each feature.
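The formula for the new model is not preserved in this transcript. The sketch below assumes one natural reading of "time varying, additive": hazard(x, t) = w_0(t) + Σ_j x_j w_j(t), where each w_j is a non-negative piecewise-constant function of time. That functional form, the helper names and the numbers are all assumptions, not the authors' stated definition.

```python
from bisect import bisect_right

def step_value(knots, values, t):
    """Evaluate a piecewise-constant function: values[k] on [knots[k], knots[k+1])."""
    k = bisect_right(knots, t) - 1
    return values[max(k, 0)]

def additive_hazard(w_funcs, x, t):
    """Assumed form: hazard(x, t) = w_0(t) + sum_j x_j * w_j(t).

    w_funcs[0] is the baseline w_0; w_funcs[j] pairs with feature x[j - 1].
    Each entry is a (knots, values) tuple describing a step function.
    """
    baseline = step_value(*w_funcs[0], t)
    return baseline + sum(xj * step_value(*wj, t) for xj, wj in zip(x, w_funcs[1:]))

# Hypothetical example: feature 1 only becomes risky from day 100 onwards.
w_funcs = [
    ([0.0], [0.001]),              # w_0: constant baseline
    ([0.0, 100.0], [0.0, 0.02]),   # w_1: jumps at day 100
]
before = additive_hazard(w_funcs, [1], 50.0)
after = additive_hazard(w_funcs, [1], 150.0)
unaffected = additive_hazard(w_funcs, [0], 150.0)
print(before, after, unaffected)
```

Under this reading, a jump in w_j at some date is directly interpretable as "feature j became dangerous at that date", which is what makes attribution possible.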

Comparing to existing time-varying survival models

•  Kernel, smoothing splines (Kooperberg '94; Sauerbrei '07)
   – Curse of dimensionality.
   – Require homogeneous smoothness.
•  How are we doing it differently?
   – Additive in each dimension.
   – Use trend filtering (Kim et al., 2009; Tibshirani, 2013) to handle heterogeneous smoothness / sharp changes.

Locally adaptive nonparametric regression via trend filtering

•  For functions with bounded variation:
   – Trend filtering: n^(-2/3) minimax rate
   – All linear smoothers: n^(-1/2) suboptimal rate

[Figure label: Fused lasso]

(Kim et al. 2009, SIAM Review), (Tibshirani, AoS 2013), (W., Smola and Tibshirani, ICML'14)

Learning by regularized MLE

$$\min_{(w_0, w_1, \dots, w_p) \in \mathcal{F}^p} \; -\log\Big[ \prod_{i \in B} p(t_i \le \tau_i < T_i) \prod_{i \notin B} p(\tau_i > T) \Big] + \gamma \sum_{j=0}^{p} \mathrm{TV}(w_j)$$

•  Technical challenges:
   – This is optimizing over functions!
   – Interval censoring loss is non-convex
   – TV operator is non-smooth.
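The two probabilities in the likelihood can be written in terms of the cumulative hazard using standard survival identities. These are textbook facts rather than formulas from the slides; Λ_i denotes the cumulative hazard of site i, and the knots s_1 < … < s_m in the last line are notation introduced here for a piecewise-constant w_j.

```latex
\begin{aligned}
\Lambda_i(t) &= \int_0^t \lambda(x_i, s; w)\,ds, \\
p(t_i \le \tau_i < T_i) &= e^{-\Lambda_i(t_i)} - e^{-\Lambda_i(T_i)}
  && \text{(hacked somewhere between two checkpoints)}, \\
p(\tau_i > T) &= e^{-\Lambda_i(T)}
  && \text{(still not hacked at the end of observation)}, \\
\mathrm{TV}(w_j) &= \sum_{k=1}^{m-1} \big| w_j(s_{k+1}) - w_j(s_k) \big|
  && \text{(for piecewise-constant } w_j \text{ with knots } s_1 < \dots < s_m).
\end{aligned}
```

The first kind of term is what "interval censoring" refers to: the exact hack time is never observed, only an interval that contains it.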

Our contributions

•  Functions => Vectors in Euclidean space
   – The solution is parameterized by a small number of step functions. (A cute re-parameterization and use of Mammen & Van De Geer, 1997)
•  Handling non-smoothness via proximal SVRG.
   – Combine linear time proximal map using dynamic programming (Johnson, 2013) with results in (Yu, 2014)
   – Convergence rate despite non-convexity (Reddi et al., 2016)
•  Efficient implementation.
   – Represents only active sets.
   – Highly scalable, up to millions of features and data points.

Key step of the prox-SVRG algorithm

-  Doubly robust estimation
-  Control variate.

(Reddi et al., 2016. Allen-Zhu, 2016.) Stationarity convergence rate:
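The update formula and the rate on this slide are not preserved in the transcript. As a rough illustration of the update step only, here is a generic proximal SVRG loop with the control-variate gradient. It is a sketch under stated assumptions, not the authors' implementation: the loss is least squares rather than the censored likelihood, and the proximal map is soft-thresholding (an L1 penalty) standing in for the TV proximal map. All names and parameter values are hypothetical.

```python
import numpy as np

def soft_threshold(v, tau):
    """Proximal map of tau * ||.||_1 (a stand-in for the TV proximal map)."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def objective(A, b, gamma, w):
    """(1/2n) * ||A w - b||^2 + gamma * ||w||_1."""
    return 0.5 * np.mean((A @ w - b) ** 2) + gamma * np.sum(np.abs(w))

def prox_svrg(A, b, gamma, step, epochs, inner_steps, seed=0):
    """Minimise the objective above with proximal SVRG."""
    rng = np.random.default_rng(seed)
    n, d = A.shape
    w_snap = np.zeros(d)
    for _ in range(epochs):
        full_grad = A.T @ (A @ w_snap - b) / n          # full gradient at the snapshot
        w = w_snap.copy()
        for _ in range(inner_steps):
            i = rng.integers(n)
            g_i = A[i] * (A[i] @ w - b[i])              # sample gradient at the current point
            g_i_snap = A[i] * (A[i] @ w_snap - b[i])    # same sample at the snapshot
            v = g_i - g_i_snap + full_grad              # control-variate gradient estimate
            w = soft_threshold(w - step * v, step * gamma)  # proximal step
        w_snap = w
    return w_snap

# Hypothetical synthetic problem with a sparse ground truth.
rng = np.random.default_rng(1)
A = rng.standard_normal((200, 5))
w_true = np.array([2.0, 0.0, -1.0, 0.0, 0.0])
b = A @ w_true
w_hat = prox_svrg(A, b, gamma=0.01, step=0.01, epochs=30, inner_steps=1000)
print(objective(A, b, 0.01, np.zeros(5)), objective(A, b, 0.01, w_hat))
```

The estimate v is unbiased for the full gradient, and its variance shrinks as both the current point and the snapshot approach a solution, which is what allows a constant step size.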

Proximal decomposition

•  Johnson (2013)'s DP algorithm solves:
•  But how to deal with the non-negativity?
   – Using Yaoliang Yu (2015)'s general characterization, we show that it decomposes!
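The formula after "solves:" is missing from the transcript. The block below states the standard one-dimensional total-variation (fused lasso) proximal problem, which is our assumption about what the slide showed, followed by one reading of "it decomposes": the proximal map of TV plus a non-negativity constraint is the TV proximal map followed by clipping at zero. The symbols z, u, γ and ι are notation introduced here.

```latex
\begin{aligned}
\operatorname{prox}_{\gamma \mathrm{TV}}(z)
  &= \arg\min_{u \in \mathbb{R}^m} \; \tfrac{1}{2}\|u - z\|_2^2
     + \gamma \sum_{k=1}^{m-1} |u_{k+1} - u_k|, \\
\operatorname{prox}_{\gamma \mathrm{TV} + \iota_{\ge 0}}(z)
  &= \max\big( \operatorname{prox}_{\gamma \mathrm{TV}}(z),\, 0 \big)
  \quad \text{(componentwise)},
\end{aligned}
```

where ι_{≥0} is the indicator of the non-negative orthant. Clipping is a non-decreasing componentwise map, so it never reverses the order of neighbouring entries; this is the kind of property that decomposition results for proximal maps rely on.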

TV penalty is not sensitive to sparsity.

•  Does not distinguish between:

More sparsity (less bias) with TV-log

For piecewise constant functions, TV_log is strictly smaller! A novel variational definition.

How do we optimize it?

•  Discrete TV_log = Discrete TV + Concave
•  The concave part can be shown to be continuously differentiable.
•  Combine the concave part with the loss functions. The same proximal SVRG!

Outline

1.  Challenges
2.  Put ourselves in the hackers' shoes
3.  Our solution: survival analysis + trend filtering
4.  Results on simulation and real data

Simulated example: recovery against the ground truth

[Figure panels: TV penalty; TV-log penalty; Unregularized; Polsplines in R]

Experiments on millions of sites and millions of features, from 2010-2014.

[Figure panels: Training error; Test error]

Case study: WordPress features

•  Attackers tend to work in batches

[Figure annotations: Start of an attack by an elite hacker; Second campaign of script kiddies]

Interpreting the monotone model

Once a vulnerability is known, always at risk

Other applications?

•  User dropout rate estimation
   – Check responses of groups of people to certain promotions.
•  Alipay.com data from Ant Financial.
   – Active user if login for 7 days in a row.
   – Otherwise considered dropped out.
   – Data of 4 million users (1% of the Alipay users)

Results on the Alipay Data Set

[Figure annotations: March 10: Cash rebate promotion; April 18: Health insurance bonus promo]

Conclusion

•  Using 3x effective parameters, our model significantly outperforms the classic Cox model in prediction accuracy.
•  Interpretability: Allows us to attribute hacks to features, and specific exploits.
•  Scalability: faster and more locally adaptive than existing time-varying models.

Open problems

•  Statistical properties:
   – Consistency and sample complexity of the model.
   – Implicit sparsity regularization? Sublinear dependence in d?
•  Computational properties:
   – Nonconvex, but convergence to near global minima under statistical assumptions?
•  Application:
   – Use higher order trend filtering on other survival analysis problems, e.g., marriage, divorce…

Thank you for your attention!

[Photos: Ziqi, Alex, Kyle, Qinghua]

Code/demo available at: https://github.com/ziqilau/Experimental-HazardRegression