Transcript
Page 1: Mining Your Logs - Gaining Insight Through Visualization

Mining Your Logs - Gaining Insight Through Visualization

Google TechTalk March 2011

Raffael Marty - @zrlram

Page 2: Mining Your Logs - Gaining Insight Through Visualization

© by Raffael Marty | Logging as a Service

Raffael Marty

• Founder @
• Chief Security Strategist and Product Manager @ Splunk
• Manager Solutions @ ArcSight
• Intrusion Detection Research @ IBM Research
• IT Security Consultant @ PricewaterhouseCoopers

Applied Security Visualization
Publisher: Addison Wesley (August, 2008)

ISBN: 0321510100

Page 3: Mining Your Logs - Gaining Insight Through Visualization

Agenda

•Log Analysis

•History

•Log Architectures

•What’s Working and What’s Not?

•Future Needs

•Data Visualization

•Visualization Concepts

•Security Visualization Use-Cases

Page 4: Mining Your Logs - Gaining Insight Through Visualization

Log Analysis

10.0.20.9 - - [22/Mar/2011:10:00:52 +0000] "GET /admin/customer/customer/612/ HTTP/1.1" 200 2261 "https://logdog.loggly.org/admin/customer/customer/" "Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_6_6; en-us) AppleWebKit/533.20.25 (KHTML, like Gecko) Version/5.0.4 Safari/533.20.27" TYhzVH8AAAEAAGOkBOQAAADA 655268

2010-12-28T18:12:10.031+00:00 frontend2-raffy syslog-ng[19600]: syslog-ng starting up; version='3.1.1'

2011-01-10T21:27:04.820+00:00 frontend2-raffy kernel: : [ 664.107313] blocked inbound IN=eth0 OUT= MAC=ff:ff:ff:ff:ff:ff:d8:30:62:5f:6a:a3:08:00 SRC=10.0.20.109 DST=10.0.20.255 LEN=180 TOS=0x00 PREC=0x00 TTL=64 ID=126 PROTO=UDP SPT=17500 DPT=17500 LEN=160

Page 5: Mining Your Logs - Gaining Insight Through Visualization

History
•1980 Eric Allman develops syslogd(8)
•1996 Intellitactics
•1997 Tivoli Risk Manager developed by IBM Research in Zurich (later Zurich Correlation Engine, ZCE)
•1999 - 2010 A number of log management / SIEM players enter the market (software, appliances)
•2000 ArcSight - 2010 sold for $1.65bn to HP
•2009 Loggly (logging as a service)

Page 6: Mining Your Logs - Gaining Insight Through Visualization

History - The Other View
•Network management (SNMP)
•IDS false positive reduction
•Security monitoring (multiple data sources)
•Unification of NOC and SOC (failed?)
•Application monitoring (moving up the stack)
  - original tools failed due to architectural constraints
  - new approaches have been presented

Page 7: Mining Your Logs - Gaining Insight Through Visualization

Log Management Today

Where are you?

Page 8: Mining Your Logs - Gaining Insight Through Visualization

Log Management Today

DIY
•grep
•Perl
•SQL

Log Management
•Open source
•Commercial

CEP and SIEM
•Open source
•Commercial

MapReduce
•Open source

Advanced Analytics
•Not log specific!

less tools

Page 9: Mining Your Logs - Gaining Insight Through Visualization

Open Source Tools
•graylog2
•logstash
•swatch
•tenshi
•logwatch
•OSSEC
•snare
•lasso
•lire
•LogSurfer
•SEC
•LogHound
•slct
•log2timeline
•logzilla
•OSSIM
•MS Logparser
•Sguil
•Octopussy
•Sagan

this list is likely incomplete!

Page 10: Mining Your Logs - Gaining Insight Through Visualization

© PixlCloud LLC 2011 | pixlcloud | Visualization in the Cloud

Commercial Tools

this list is likely incomplete!

Page 11: Mining Your Logs - Gaining Insight Through Visualization

Log Architectures

Page 12: Mining Your Logs - Gaining Insight Through Visualization

Log Mgmt Architecture

Collection:
- syslog
- OPSEC
- SDEE
- netflow
- database

Storage:
- on board
- external storage array
- clusters

Processing:
- indexing
- context storage
- clustering

Page 13: Mining Your Logs - Gaining Insight Through Visualization

Log Mgmt Architecture

Collection:
- syslog
- OPSEC
- SDEE
- netflow
- database

Processing:
- indexing
- context storage
- clustering

Data Access:
- free-text search
- field-based search
- tagging schemas

normalized or raw

raw

Page 14: Mining Your Logs - Gaining Insight Through Visualization

Agents and Connectors
• piece of code to transport logs to a central location
• features
  - batch
  - compress
  - encrypt
  - sign
  - fail-over
• often additional features:
  - parse
  - normalize
  - aggregate
  - enrichment (context)
• special protocols:
  - OPSEC, SDEE
  - Windows
• file-based collection
• database collection
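
A few of the agent features listed above (batch, compress, sign) can be sketched in a handful of lines. This is only an illustration using the Python standard library, not how any particular agent is implemented. The shared key and the function names are assumptions made up for the example, and encryption and fail-over are left out.

```python
import gzip
import hashlib
import hmac

SHARED_KEY = b"demo-key"  # hypothetical pre-shared key, for illustration only


def pack_batch(lines, key=SHARED_KEY):
    """Batch log lines, compress them, and sign the compressed payload."""
    payload = gzip.compress("\n".join(lines).encode("utf-8"))
    signature = hmac.new(key, payload, hashlib.sha256).hexdigest()
    return {"count": len(lines), "signature": signature, "payload": payload}


def unpack_batch(batch, key=SHARED_KEY):
    """Verify the signature, then decompress and split back into lines."""
    expected = hmac.new(key, batch["payload"], hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, batch["signature"]):
        raise ValueError("signature mismatch: batch was altered in transit")
    return gzip.decompress(batch["payload"]).decode("utf-8").split("\n")
```

The collector side would call unpack_batch on each received batch and reject anything whose signature does not verify.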

Page 15: Mining Your Logs - Gaining Insight Through Visualization

SIEM Architecture

normalized

raw

asset context

identity context

...

RDBMS

context / tagging

Page 16: Mining Your Logs - Gaining Insight Through Visualization

SIEM Architecture
•RDBMS schema
  - Fixed number and type of fields
  - New data sources with new fields?
    ‣ overloading
•RDBMS clusters are expensive and scale poorly
•Need a parser for every data source
•Slow historical data queries
•Hard to configure database efficiently
  - because of different use-cases

Page 17: Mining Your Logs - Gaining Insight Through Visualization

SIEM Architecture Benefits
•Parsed data enables
  - real-time correlation
  - real-time statistics
  - data augmentation (context) close to source
•Unified data access language
  - over a fixed set of fields
•Real-time dashboards

Page 18: Mining Your Logs - Gaining Insight Through Visualization

Search vs. SIEM
•Full-text indexing
•Parsing at search time

Example search: denied
• use index to find occurrences of ‘denied’

Example search: user=rmarty
• use index to find ALL occurrences of ‘rmarty’
• apply parser to results
• remove results where user is not rmarty
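
The two example searches can be sketched with a tiny inverted index plus a parser that only runs at search time. This is a minimal sketch under stated assumptions: the log records, the tokenizer, and the key=value parser are made up for illustration and do not reflect any specific product.

```python
import re
from collections import defaultdict

LOGS = [  # hypothetical records, for illustration only
    "sshd: login denied user=rmarty src=10.0.20.9",
    "sshd: login accepted user=admin note=created-by-rmarty",
    "httpd: request denied user=guest",
]


def tokenize(text):
    """Split a record into lowercase word tokens."""
    return re.findall(r"[a-z0-9.]+", text.lower())


def build_index(records):
    """Full-text inverted index: token -> set of record numbers."""
    index = defaultdict(set)
    for number, record in enumerate(records):
        for token in tokenize(record):
            index[token].add(number)
    return index


def parse_fields(record):
    """Search-time parser: extract key=value pairs from one record."""
    return dict(re.findall(r"(\w+)=(\S+)", record))


def search_term(index, records, term):
    """Free-text search: the index alone answers the query."""
    return [records[n] for n in sorted(index.get(term.lower(), ()))]


def search_field(index, records, field, value):
    """Field search: index lookup on the value, then parse and filter."""
    candidates = search_term(index, records, value)
    return [r for r in candidates if parse_fields(r).get(field) == value]
```

Searching for user=rmarty first pulls every record that mentions rmarty, including the second one where the name only appears in a note, and the parser then discards it because its user field is admin.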

Page 19: Mining Your Logs - Gaining Insight Through Visualization

New SIEM - Hybrid Models
•Use parsers for known data sources
•Collect everything else
•Index all data and use index for search
•Correlate parsed data

Page 20: Mining Your Logs - Gaining Insight Through Visualization

Categorization and Tagging
•How do you find all failed logins across any data source?

security:538 OR “sshd authentication failure” OR “sshd failed password” OR ...

•Does not scale
  - for new data sources
  - for new events of existing sources
•Define a ‘taxonomy’ for all events
•Map events into taxonomy

id -> object, action, status
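
The mapping idea (id -> object, action, status) can be sketched as a lookup table. This is a hedged illustration only: the first two keys echo the sshd strings from the query on this slide, the third identifier is made up, and the category values are assumptions chosen for the example.

```python
from typing import NamedTuple, Optional


class Category(NamedTuple):
    object: str
    action: str
    status: str


# Hypothetical taxonomy table: source-specific event id -> category.
TAXONOMY = {
    "sshd authentication failure": Category("authentication", "login", "failure"),
    "sshd failed password": Category("authentication", "login", "failure"),
    "app:login_ok": Category("authentication", "login", "success"),  # made-up id
}


def categorize(event_id: str) -> Optional[Category]:
    """Map a source-specific event id into the taxonomy (None if unmapped)."""
    return TAXONOMY.get(event_id)


def failed_logins(event_ids):
    """One taxonomy query instead of an ever-growing OR list."""
    wanted = ("authentication", "login", "failure")
    return [e for e in event_ids if categorize(e) == wanted]
```

A new data source then only needs new rows in the table; the failed-login query itself stays unchanged.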

Page 21: Mining Your Logs - Gaining Insight Through Visualization

Content Creation
•Rules, dashboards, reports, searches can use taxonomy:

object=authentication AND action=login AND status=success

•All failures related to files:

object=file AND status=failure

•Mixing with other fields:

action=login AND user=rmarty

•Approach scales well
•Huge effort to build and maintain mappings

Page 22: Mining Your Logs - Gaining Insight Through Visualization

Logging as a Service (LaaS)
•Economically advantageous - think about TCO
•Pay as you go
•Elastic infrastructure scales with your needs
•No installation needed
•No setup costs / time for logging solution
•Open platform with RESTful APIs

Page 23: Mining Your Logs - Gaining Insight Through Visualization

Loggly

[Architecture diagram. Labels: Data Sources, Consumers, API, Proxies, Data collection, Data access, Distributed data store, Distributed indexing and processing, Indexers and Search Machines, Log Archive, Loggly user interface, UI extensions, mobile-166, My syslog]

Page 24: Mining Your Logs - Gaining Insight Through Visualization

Tool Usage

Criterion | DIY | MR | Log Mgmt | SIEM | LaaS
data sources | known, only a few | known, only a few | unknown, many | known, many | -
analysis use-cases | known, one or a few | exploration, large-scale | unknown, many | unknown, many | extend platform
dynamic use-cases | no | no | yes | yes | yes
real-time correlation | no | no | no | yes | extend platform
cost | engineer, hardware, maintenance | engineers, hardware, maintenance | license, (hardware), maintenance | license, hardware, maintenance | subscription

Should you rather do it yourself (DIY)?

Page 25: Mining Your Logs - Gaining Insight Through Visualization

What is Working and What is not?

Page 26: Mining Your Logs - Gaining Insight Through Visualization

What’s Working
•Log collection
•Log centralization
•Alerting on a priori known patterns
•Solving specific, known use-cases for sets of known data sources, e.g.,
  - monitoring privileged access to financial servers
  - generating compliance reports
  - security forensics

Page 27: Mining Your Logs - Gaining Insight Through Visualization

What’s Not Working
•Log formats are all over the place and not documented

Mar 16 08:09:58 kernel: [ 0.000000] Normal 1048576 -> 1048576

•No logging guidelines / developer education
•Parsing is broken
  - based on regexes
  - numerous mistakes
  - doesn’t scale
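
One way regex-based parsing goes wrong can be shown with the firewall record from the Log Analysis slide (Page 4). The sketch below is an illustration, not a recommended parser. The pattern and variable names are assumptions, and the syslog prefix is left out. The record contains LEN twice and an empty OUT value, two details a naive key=value parser tends to mishandle.

```python
import re

# Firewall record from the Log Analysis slide (syslog prefix omitted).
LINE = (
    "blocked inbound IN=eth0 OUT= "
    "MAC=ff:ff:ff:ff:ff:ff:d8:30:62:5f:6a:a3:08:00 "
    "SRC=10.0.20.109 DST=10.0.20.255 LEN=180 TOS=0x00 PREC=0x00 TTL=64 "
    "ID=126 PROTO=UDP SPT=17500 DPT=17500 LEN=160"
)

# \S* (not \S+) so that an empty value such as "OUT=" is still captured.
PAIR = re.compile(r"(\w+)=(\S*)")


def parse_pairs(line):
    """Return key=value pairs in order, keeping repeated keys."""
    return PAIR.findall(line)


pairs = parse_pairs(LINE)
naive = dict(pairs)  # a dict silently keeps only the last LEN value

print(len(pairs), "pairs,", len(naive), "distinct keys")
print("LEN values:", [v for k, v in pairs if k == "LEN"], "dict keeps:", naive["LEN"])
```

Turning the pairs into a dictionary drops the first LEN without any warning, which is the kind of silent mistake that is easy to miss across many data sources.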

Page 28: Mining Your Logs - Gaining Insight Through Visualization

What’s Not Working
•Normalization is broken:
  - IP to hostnames (when to do DNS lookup)
  - usernames (rmarty vs. ram vs. raffy)
•Categorization / Taxonomy
  - doesn’t scale
  - is buggy
  - is always out of date
  - expensive
•Prioritization has no working formula
•Anomaly detection is voodoo!

Page 29: Mining Your Logs - Gaining Insight Through Visualization

What Does It Mean?
•We don’t understand our data
•Security Operations Center (SOC) monitors all corporate data sources. Analysts
  - don’t know all the applications
  - don’t know all the setups
  - don’t know what log records are ‘normal’ behavior

--> Need tools to enable log owners to work with their data

Page 30: Mining Your Logs - Gaining Insight Through Visualization

Future Needs

Page 31: Mining Your Logs - Gaining Insight Through Visualization

We Need Better Tools
•We will have more and more data to deal with
  - SIEM needs to support new distributed, scalable data management technologies
•More and more application layer data
  - How are we going to deal with all the parsing / entity extraction?
  - We need logging standards and guidelines
•How do we help analysts understand the data?
  - What is important and what is not?
  - Mapping problems to business process, business risk!

Page 32: Mining Your Logs - Gaining Insight Through Visualization

Data Visualization

Page 33: Mining Your Logs - Gaining Insight Through Visualization

Data/Log Visualization
•Exploration and Discovery
•Answer Questions
•Communicate Information
•Support Decisions

Page 34: Mining Your Logs - Gaining Insight Through Visualization

Security Visualization
•We are nowhere!
•Visualization is an afterthought
•Sec Viz dichotomy
•Tools are lacking fundamental capabilities
•Users don’t understand data, how can they understand visuals?

Page 35: Mining Your Logs - Gaining Insight Through Visualization

Visualization Concepts

Page 36: Mining Your Logs - Gaining Insight Through Visualization

The Analysis Approach

Overview first -> Zoom -> Details on demand

Principle by Ben Shneiderman

Page 37: Mining Your Logs - Gaining Insight Through Visualization

Simultaneous Views

Page 38: Mining Your Logs - Gaining Insight Through Visualization

Dynamic Coloring

Page 39: Mining Your Logs - Gaining Insight Through Visualization

Linked Views

Page 40: Mining Your Logs - Gaining Insight Through Visualization

Legible / Usable Graphs

Reducing non-data ink!

Page 41: Mining Your Logs - Gaining Insight Through Visualization

Choosing the Right Chart

Page 42: Mining Your Logs - Gaining Insight Through Visualization

Ode to the Pie

Page 43: Mining Your Logs - Gaining Insight Through Visualization

Careful With Interpretations

Page 44: Mining Your Logs - Gaining Insight Through Visualization

SecViz Examples

Page 45: Mining Your Logs - Gaining Insight Through Visualization

Page 46: Mining Your Logs - Gaining Insight Through Visualization

Page 47: Mining Your Logs - Gaining Insight Through Visualization

Page 48: Mining Your Logs - Gaining Insight Through Visualization

Situational Awareness
• Treemap
• Protovis.JS
• Size: Amount
• Brightness: Variance
• Color: Sensor
• Shows: Scans - bright spots
• Thanks to Chris Horsley

Page 49: Mining Your Logs - Gaining Insight Through Visualization

Page 50: Mining Your Logs - Gaining Insight Through Visualization

Firewall Treemap

Page 51: Mining Your Logs - Gaining Insight Through Visualization

Firewall Log

Port | Source IP | Destination IP

Page 52: Mining Your Logs - Gaining Insight Through Visualization

IDS Sig Tuning - Treemap

Hierarchy: Source > Destination > Signature > Number of Events

Color: Priority
Size: Number of alerts
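
The data behind such a treemap can be prepared by nesting alerts along the hierarchy and counting them per leaf. This is a minimal sketch under stated assumptions: the alert tuples and signature names are made up, and the rule that priority 1 is the most severe is an assumption for the example. Rendering is left to whatever treemap library is in use.

```python
from collections import defaultdict

# Hypothetical IDS alerts: (source, destination, signature, priority)
ALERTS = [
    ("10.0.20.9", "10.0.20.109", "sig-ping", 3),
    ("10.0.20.9", "10.0.20.109", "sig-ping", 3),
    ("10.0.20.9", "10.0.20.255", "sig-udp-scan", 2),
    ("10.0.20.7", "10.0.20.109", "sig-admin", 1),
]


def build_treemap(alerts):
    """Nest alerts as source -> destination -> signature leaf.

    Each leaf carries 'size' (number of alerts) and 'priority' (for color).
    """
    tree = defaultdict(lambda: defaultdict(dict))
    for source, destination, signature, priority in alerts:
        leaf = tree[source][destination].setdefault(
            signature, {"size": 0, "priority": priority}
        )
        leaf["size"] += 1
        # Keep the most severe priority seen (assuming 1 is most severe).
        leaf["priority"] = min(leaf["priority"], priority)
    return tree
```

Large, low-priority leaves are the natural candidates for signature tuning, since they account for many alerts of little value.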

Page 53: Mining Your Logs - Gaining Insight Through Visualization

Vulnerability Data by Host

Page 54: Mining Your Logs - Gaining Insight Through Visualization

Visualization Future
•A solution to entity extraction
•Dynamic and interactive displays
•Computer aided intelligence / visualization
  - Computer supported exploration
  - Highly interactive
•Expert system that captures domain knowledge
  - Collaborative

Page 55: Mining Your Logs - Gaining Insight Through Visualization

Share, discuss, challenge, and learn about security visualization.

http://secviz.org

• List: secviz.org/mailinglist

• Twitter: @secviz

Page 56: Mining Your Logs - Gaining Insight Through Visualization

about.me/raffy

