
Intro to Data Science and Periodic Models for Security Data


Page 1: Intro to Data Science and Periodic Models for Security Data

What is Data Science?

“Raw” data

“Actionable” data

“Needle Detector”

“Data Science (DS) is the extraction of knowledge from large volumes of data…”

https://en.wikipedia.org/wiki/Data_science

Frederick D. Pearce

Page 2: Intro to Data Science and Periodic Models for Security Data

What’s a “Good” DS environment?

“Raw” Data

“Actionable” Data

Analytics “Kitchen”

Fresh/Local/Organic, “Right” Tools, Read/Write Recipes

Python? ELK? Splunk? Log*?


Page 3: Intro to Data Science and Periodic Models for Security Data

What’s in a DS “Black Box”?

[Figure: Data Science Process Schematic by Farcaster at English Wikipedia, CC BY-SA 3.0, https://commons.wikimedia.org/w/index.php?curid=40129394]

Page 4: Intro to Data Science and Periodic Models for Security Data

Periodic Model “Recipe”

“Raw” Data

Periodic Model Parameters

Average (μ_Req) ≈ ??? [Requests]
Freq. (f_Req) ≈ ??? [Hertz]
Amp. (A_Req) ≈ ??? [Requests]

OpenDNS Python tool

“Cleaner”, “Inverter”

http://system.opendns.com
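The three parameters listed on this slide suggest a single-sinusoid model. The sketch below is one way to write such a model down, assuming the form x(t) = μ + A cos(2π f t + φ). The slides do not state the exact functional form, and `periodic_model` and its argument names are hypothetical, not part of the OpenDNS tool.

```python
import numpy as np

def periodic_model(t, mean, freq, amp, phase_deg=0.0):
    """Evaluate an ASSUMED single-sinusoid model at times t (in seconds).

    mean  : average level (mu)
    freq  : frequency in Hertz (f)
    amp   : amplitude of the oscillation (A)
    phase_deg : phase in degrees (phi); the cosine convention is an assumption
    """
    t = np.asarray(t, dtype=float)
    return mean + amp * np.cos(2.0 * np.pi * freq * t + np.deg2rad(phase_deg))

# Illustrative values only; they are not taken from the slides.
t = np.arange(48) * 3600.0                      # 48 hourly samples, in seconds
x = periodic_model(t, mean=100.0, freq=1.0 / 86400.0, amp=10.0)
```

With these illustrative values the series oscillates between 90 and 110 once per day around a mean of 100.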


Page 5: Intro to Data Science and Periodic Models for Security Data

PM: Fast Fourier Transform

https://en.wikipedia.org/wiki/Fast_Fourier_transform

[Figure: data x_n versus time (t_n) and amplitude |X_k| versus frequency (f_k), linked by the FFT and the inverse FFT (FFT^-1)]

•  Data (x_n) decomposed into discrete frequency components
•  X_k is complex valued, with amplitude A_k = |X_k| and phase φ_k = tan^-1(Im X_k / Re X_k)
•  It’s fast! DFT is O(N^2) while FFT is O(N log N)

FFT:

$$X_k \equiv \sum_{n=0}^{N-1} x_n \exp\left(-i\,2\pi\,\frac{k}{N}\,n\right), \qquad k = 0, \ldots, N-1$$

[Figure annotation: f_Nyquist (Nyquist frequency)]
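To make the definition above concrete, here is a minimal sketch that evaluates the DFT sum directly and checks it against NumPy's FFT, then extracts the amplitude and phase of each component. The random test signal, the sample spacing `dt`, and the helper name `naive_dft` are assumptions for illustration only.

```python
import numpy as np

def naive_dft(x):
    """Direct O(N^2) evaluation of X_k = sum_n x_n * exp(-i 2 pi k n / N)."""
    x = np.asarray(x, dtype=complex)
    N = x.size
    n = np.arange(N)
    k = n.reshape(-1, 1)                       # one row per output frequency index
    return np.exp(-2j * np.pi * k * n / N) @ x

rng = np.random.default_rng(0)
x = rng.normal(size=64)                        # arbitrary test signal
dt = 1.0                                       # assumed sample spacing in seconds

X_fft = np.fft.fft(x)                          # O(N log N)
X_dft = naive_dft(x)                           # same result, O(N^2)

amplitude = np.abs(X_fft)                      # A_k = |X_k|
phase = np.angle(X_fft)                        # phi_k = atan2(Im X_k, Re X_k)
x_back = np.fft.ifft(X_fft).real               # inverse FFT recovers the data

freqs = np.fft.fftfreq(x.size, d=dt)           # frequency of each bin, in Hz
f_nyquist = 0.5 / dt                           # highest resolvable frequency
```

Using `np.angle` rather than a plain arctangent keeps the phase in the correct quadrant when the real part of X_k is negative.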


Page 6: Intro to Data Science and Periodic Models for Security Data

PM: So, what’s the answer?

Mean (μ_Req) ≈ 75.4 [Requests × 10^9]
Freq. (f_Req) ≈ 1.64 × 10^-6 [Hertz]
Amp. (A_Req) ≈ 8.2 [Requests × 10^9]
Phase (φ_Req) ≈ 175 [degrees]

[Figure annotation: (f_Req, A_Req)]
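A frequency in Hertz is hard to read at this scale, so it can help to convert it to a period. The short calculation below uses only the values reported on the slide; the variable names are hypothetical.

```python
f_req = 1.64e-6          # Hz, as reported on the slide
mean_req = 75.4          # requests * 10^9, as reported
amp_req = 8.2            # requests * 10^9, as reported

period_s = 1.0 / f_req                 # seconds per cycle
period_days = period_s / 86400.0       # days per cycle
relative_swing = amp_req / mean_req    # amplitude as a fraction of the mean

print(f"period: {period_days:.2f} days, swing: {relative_swing:.1%} of the mean")
```

The period works out to roughly seven days, which suggests a weekly cycle, with an amplitude of about 11% of the mean.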


Page 7: Intro to Data Science and Periodic Models for Security Data

Application to “Log” Data

Develop Python-based “security recipes” that combine the search and analytics capabilities of Python and ELK*

*Possible to integrate with other coding tools too (e.g. R)

FFT of “idle” VM log data collected with BRO (https://www.bro.org/)

f = 1/24 hr
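A sketch of how such a recipe might look for log data: bin event timestamps into hourly counts, take the FFT, and read off the dominant period. The timestamps here are synthetic, generated with a built-in daily cycle; they stand in for parsed log times and are not the BRO data from the slide. The bin width, duration, and rates are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

# SYNTHETIC stand-in for log timestamps: two weeks of events whose hourly
# rate follows an assumed 24-hour cycle.
n_hours = 24 * 14
hours = np.arange(n_hours)
rate = 50.0 + 20.0 * np.cos(2.0 * np.pi * hours / 24.0)   # events per hour
events_per_hour = rng.poisson(rate)
timestamps = np.concatenate(
    [h * 3600.0 + rng.uniform(0.0, 3600.0, size=c)
     for h, c in zip(hours, events_per_hour)]
)

# Bin events into hourly counts.
edges = np.arange(n_hours + 1) * 3600.0
counts, _ = np.histogram(timestamps, bins=edges)

# Remove the mean so the zero-frequency bin does not dominate, then FFT.
spectrum = np.abs(np.fft.rfft(counts - counts.mean()))
freqs = np.fft.rfftfreq(n_hours, d=3600.0)                # Hz

peak_freq = freqs[np.argmax(spectrum)]
peak_period_hr = 1.0 / peak_freq / 3600.0
print(f"dominant period: {peak_period_hr:.1f} hours")
```

For real logs, the `timestamps` array would come from the parsed event times, for example epoch seconds returned by a search query, and the rest of the steps would stay the same.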
