In this two-part presentation we will explore log analysis and log visualization. We will look at the history of log analysis: where log analysis stands today, what tools are available to process logs, what is working today, and, more importantly, what is not working in log analysis. What will the future bring? Do our current approaches hold up under future requirements? We will discuss a number of issues and try to figure out how we can address them. By looking at various log analysis challenges, we will explore how visualization can help address a number of them, keeping in mind that log visualization is not just a science, but also an art. We will apply a security lens to a number of use-cases in the area of security visualization. From there we will discuss what else is needed in the area of visualization, where the challenges lie, and where we should continue putting our research and development efforts.
Mining Your Logs
Gaining Insight Through Visualization
Google TechTalk March 2011
Raffael Marty - @zrlram
© by Raffael Marty | Logging as a Service
Raffael Marty
• Founder @
• Chief Security Strategist and Product Manager @ Splunk
• Manager Solutions @ ArcSight
• Intrusion Detection Research @ IBM Research
• IT Security Consultant @ PriceWaterhouse Coopers
Applied Security Visualization
Publisher: Addison Wesley (August, 2008)
ISBN: 0321510100
Agenda
• Log Analysis
- History
- Log Architectures
- What’s Working and What’s Not?
- Future Needs
• Data Visualization
- Visualization Concepts
- Security Visualization Use-Cases
Log Analysis
10.0.20.9 - - [22/Mar/2011:10:00:52 +0000] "GET /admin/customer/customer/612/ HTTP/1.1" 200 2261 "https://logdog.loggly.org/admin/customer/customer/" "Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_6_6; en-us) AppleWebKit/533.20.25 (KHTML, like Gecko) Version/5.0.4 Safari/533.20.27" TYhzVH8AAAEAAGOkBOQAAADA 655268
2010-12-28T18:12:10.031+00:00 frontend2-raffy syslog-ng[19600]: syslog-ng starting up; version='3.1.1'
2011-01-10T21:27:04.820+00:00 frontend2-raffy kernel: : [ 664.107313] blocked inbound IN=eth0 OUT= MAC=ff:ff:ff:ff:ff:ff:d8:30:62:5f:6a:a3:08:00 SRC=10.0.20.109 DST=10.0.20.255 LEN=180 TOS=0x00 PREC=0x00 TTL=64 ID=126 PROTO=UDP SPT=17500 DPT=17500 LEN=160
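The three records above each need their own parsing logic. As a minimal sketch, the firewall record can be split into a timestamp, a host, and its KEY=VALUE pairs. The function name `parse_kv` and the choice to keep only the first `LEN` value are assumptions made for illustration, not something the slides prescribe.

```python
import re

# The kernel firewall record shown above, copied from the slide.
LINE = (
    "2011-01-10T21:27:04.820+00:00 frontend2-raffy kernel: : [ 664.107313] "
    "blocked inbound IN=eth0 OUT= MAC=ff:ff:ff:ff:ff:ff:d8:30:62:5f:6a:a3:08:00 "
    "SRC=10.0.20.109 DST=10.0.20.255 LEN=180 TOS=0x00 PREC=0x00 TTL=64 ID=126 "
    "PROTO=UDP SPT=17500 DPT=17500 LEN=160"
)

# Upper-case key, '=', then a possibly empty run of non-space characters.
KV = re.compile(r"\b([A-Z]+)=(\S*)")


def parse_kv(line):
    """Split a syslog-style line into timestamp, host, and KEY=VALUE fields."""
    timestamp, host, rest = line.split(" ", 2)
    fields = {"timestamp": timestamp, "host": host}
    for key, value in KV.findall(rest):
        # LEN appears twice in this record; this sketch keeps the first one.
        fields.setdefault(key, value)
    return fields


fields = parse_kv(LINE)
print(fields["SRC"], fields["DST"], fields["PROTO"], fields["DPT"])
```

The web server and syslog-ng records would each need a different pattern, which is the per-source parsing burden discussed later in the deck.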
History
• 1980 Eric Allman develops syslogd(8)
• 1996 Intellitactics
• 1997 Tivoli Risk Manager developed by IBM Research in Zurich (later Zurich Correlation Engine, ZCE)
• 1999 - 2010 A number of log management / SIEM players enter the market (software, appliances)
• 2000 ArcSight - 2010 sold for $1.65bn to HP
• 2009 Loggly (logging as a service)
History - The Other View
• Network management (SNMP)
• IDS false positive reduction
• Security monitoring (multiple data sources)
• Unification of NOC and SOC (failed?)
• Application monitoring (moving up the stack)
- original tools failed due to architectural constraints
- new approaches have been presented
Log Management Today
Where are you?
• DIY: grep, Perl, SQL
• Log Management: open source, commercial
• CEP and SIEM: open source, commercial
• MapReduce: open source
• Advanced Analytics: not log specific!
fewer tools
Open Source Tools
• graylog2
• logstash
• swatch
• tenshi
• logwatch
• OSSEC
• snare
• lasso
• lire
• LogSurfer
• SEC
• LogHound
• slct
• log2timeline
• logzilla
• OSSIM
• MS Logparser
• Sguil
• Octopussy
• Sagan
this list is likely incomplete!
© PixlCloud LLC 2011 | pixlcloud | Visualization in the Cloud
Commercial Tools
this list is likely incomplete!
Log Architectures
Log Mgmt Architecture
Collection:
- syslog
- OPSEC
- SDEE
- netflow
- database
Storage:
- on board
- external storage array
- clusters
Processing:
- indexing
- context storage
- clustering
Log Mgmt Architecture
Collection:
- syslog
- OPSEC
- SDEE
- netflow
- database
Processing:
- indexing
- context storage
- clustering
Data Access:
- free-text search
- field-based search
- tagging schemas
normalized or raw
raw
Agents and Connectors
• piece of code to transport logs to a central location
• features
- batch
- compress
- encrypt
- sign
- fail-over
• often additional features:
- parse
- normalize
- aggregate
- enrichment (context)
• special protocols:
- OPSEC, SDEE
- Windows
• file-based collection
• database collection
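Three of the agent features listed above (batch, compress, sign) can be sketched with the Python standard library alone. This is an illustration under stated assumptions, not how any particular agent works: the function names, the message layout, and the hard-coded `SHARED_KEY` are hypothetical, and "sign" is approximated with an HMAC over a shared key rather than a public-key signature. Encryption, transport, and fail-over are left out.

```python
import gzip
import hashlib
import hmac

# Hypothetical shared secret; a real agent would load this from secure storage.
SHARED_KEY = b"example-key"


def make_batches(lines, batch_size):
    """Group log lines into lists of at most batch_size lines."""
    return [lines[i:i + batch_size] for i in range(0, len(lines), batch_size)]


def pack(batch, key=SHARED_KEY):
    """Compress one batch and attach an HMAC-SHA256 tag over the payload."""
    payload = gzip.compress("\n".join(batch).encode("utf-8"))
    signature = hmac.new(key, payload, hashlib.sha256).hexdigest()
    return {"payload": payload, "signature": signature}


def unpack(message, key=SHARED_KEY):
    """Verify the tag on the receiving side, then restore the log lines."""
    expected = hmac.new(key, message["payload"], hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, message["signature"]):
        raise ValueError("signature mismatch")
    return gzip.decompress(message["payload"]).decode("utf-8").split("\n")


lines = ["event one", "event two", "event three"]
for batch in make_batches(lines, 2):
    message = pack(batch)
    print(len(message["payload"]), unpack(message))
```

Signing the compressed payload, rather than the raw lines, lets the collector reject a tampered message before spending time on decompression.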
SIEM Architecture
normalized
raw
asset context
identity context
...
RDBMS
context / tagging
SIEM Architecture
• RDBMS schema
- Fixed number and type of fields
- New data sources with new fields?
‣ overloading
• RDBMS clusters are expensive and scale poorly
• Need a parser for every data source
• Slow historical data queries
• Hard to configure database efficiently
- because of different use-cases
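One plausible reading of "overloading" is that fields from a new data source get stored in a generic spare column of the fixed schema, so the same column means different things per source. The sketch below illustrates that reading with an in-memory SQLite table. The table layout and the column name `custom1` are hypothetical and do not come from any specific SIEM product.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# A fixed schema: a handful of typed fields plus one generic spare column.
conn.execute(
    "CREATE TABLE events ("
    " ts TEXT, src_ip TEXT, dst_ip TEXT, username TEXT,"
    " custom1 TEXT)"
)

# Firewall source: custom1 is overloaded to hold the protocol.
conn.execute(
    "INSERT INTO events VALUES (?, ?, ?, ?, ?)",
    ("2011-01-10T21:27:04", "10.0.20.109", "10.0.20.255", None, "UDP"),
)

# Web server source: the same column now holds the request path.
conn.execute(
    "INSERT INTO events VALUES (?, ?, ?, ?, ?)",
    ("2011-03-22T10:00:52", "10.0.20.9", None, None,
     "/admin/customer/customer/612/"),
)

# A query on custom1 returns values with two unrelated meanings.
rows = conn.execute(
    "SELECT src_ip, custom1 FROM events ORDER BY ts"
).fetchall()
for row in rows:
    print(row)
```

Any rule or report that reads `custom1` has to know which source a row came from, which is part of why a fixed schema is hard to extend.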
SIEM Architecture Benefits
• Parsed data enables
- real-time correlation
- real-time statistics
- data augmentation (context) close to source
• Unified data access language
- over a fixed set of fields
• Real-time dashboards
Search vs. SIEM
• Full-text indexing
• Parsing at search time
Example search: denied
• use index to find occurrences of ‘denied’
Example search: user=rmarty
• use index to find ALL occurrences of ‘rmarty’
• apply parser to results
• remove results where user is not rmarty
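The two example searches above can be sketched with a toy inverted index. The log records, function names, and tokenizer here are invented for illustration. The point is only the order of operations: index lookup first, parsing second, filtering last.

```python
import re
from collections import defaultdict

# Hypothetical log records, invented for this sketch.
LOGS = [
    "sshd: login denied for user=rmarty",
    "sshd: login accepted for user=rmarty",
    "mail from rmarty delivered to user=alice",
]


def build_index(logs):
    """Full-text index: map each lower-cased token to the records containing it."""
    index = defaultdict(set)
    for doc_id, line in enumerate(logs):
        for token in set(re.findall(r"\w+", line.lower())):
            index[token].add(doc_id)
    return index


def search_term(index, logs, term):
    """Free-text search: a pure index lookup, no parsing involved."""
    return [logs[i] for i in sorted(index.get(term.lower(), ()))]


def search_field(index, logs, field, value):
    """Field search: index lookup on the value, then parse and filter."""
    candidates = search_term(index, logs, value)
    pattern = re.compile(r"\b" + re.escape(field) + r"=(\S+)")
    kept = []
    for line in candidates:
        match = pattern.search(line)  # parsing happens at search time
        if match and match.group(1) == value:
            kept.append(line)
    return kept


index = build_index(LOGS)
print(search_term(index, LOGS, "denied"))
print(search_field(index, LOGS, "user", "rmarty"))
```

The third record contains the token rmarty but its user field is alice, so the index returns it as a candidate and the search-time parser discards it.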
New SIEM - Hybrid Models
• Use parsers for known data sources
• Collect everything else
• Index all data and use index for search
• Correlate parsed data
Categorization and Tagging
• How do you find all failed logins across any data source?
security:538 OR “sshd authentication failure” OR “sshd failed password” OR ...
• Does not scale
- for new data sources
- for new events of existing sources
• Define a ‘taxonomy’ for all events
• Map events into taxonomy
id -> object, action, status
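The `id -> object, action, status` mapping can be sketched as a lookup table. The three failed-login identifiers are taken from the query on the slide. The category values assigned to them, the extra success entry, and the function names are assumptions made for illustration.

```python
# Maps a source-specific event identifier to its taxonomy categories.
# The category values are assumed; only the first three keys appear on the slide.
TAXONOMY = {
    "security:538": {
        "object": "authentication", "action": "login", "status": "failure"},
    "sshd authentication failure": {
        "object": "authentication", "action": "login", "status": "failure"},
    "sshd failed password": {
        "object": "authentication", "action": "login", "status": "failure"},
    # Hypothetical extra entry, added to show a non-matching event.
    "sshd accepted password": {
        "object": "authentication", "action": "login", "status": "success"},
}


def categorize(event_id):
    """Return the taxonomy entry for an event id, or None if it is unmapped."""
    return TAXONOMY.get(event_id)


def failed_logins(event_ids):
    """Find failed logins by category instead of by a growing OR query."""
    wanted = {"object": "authentication", "action": "login", "status": "failure"}
    return [e for e in event_ids if categorize(e) == wanted]


seen = ["sshd failed password", "sshd accepted password", "security:538", "cron started"]
print(failed_logins(seen))
```

Supporting a new data source then means adding rows to the mapping rather than editing every query, which is also where the maintenance effort moves to.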
Content Creation
• Rules, dashboards, reports, searches can use taxonomy:
object=authentication AND action=login AND status=success
• All failures related to files:
object=file AND status=failure
• Mixing with other fields:
action=login AND user=rmarty
• Approach scales well
• Huge effort to build and maintain mappings
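The taxonomy queries above can be evaluated with a very small matcher. This is a sketch assuming events have already been categorized into dictionaries. The sample events, the function names, and the restriction to `AND`-only queries are assumptions, not a description of any real query language.

```python
def parse_query(query):
    """Turn 'a=b AND c=d' into {'a': 'b', 'c': 'd'} (AND-only, no quoting)."""
    conditions = {}
    for clause in query.split(" AND "):
        field, _, value = clause.partition("=")
        conditions[field.strip()] = value.strip()
    return conditions


def matches(event, conditions):
    """An event matches when every queried field has the requested value."""
    return all(event.get(field) == value for field, value in conditions.items())


def search(events, query):
    conditions = parse_query(query)
    return [event for event in events if matches(event, conditions)]


# Hypothetical, already categorized events.
EVENTS = [
    {"object": "authentication", "action": "login", "status": "success", "user": "rmarty"},
    {"object": "authentication", "action": "login", "status": "failure", "user": "alice"},
    {"object": "file", "action": "open", "status": "failure", "user": "rmarty"},
]

print(search(EVENTS, "object=file AND status=failure"))
print(search(EVENTS, "action=login AND user=rmarty"))
```

Taxonomy fields and ordinary parsed fields such as `user` sit in the same dictionary, which is what makes the third query on the slide possible.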
Logging as a Service (LaaS)
• Economically advantageous - think about TCO
• Pay as you go
• Elastic infrastructure scales with your needs
• No installation needed
• No setup costs / time for logging solution
• Open platform with RESTful APIs
Loggly
[Architecture diagram. Labels: Data Sources; Consumers; API; Proxies; Data collection; Data access; Distributed data store; Distributed indexing and processing; Indexers and Search Machines; Log Archive; Loggly user interface; UI extensions; mobile-166; My syslog]
Tool Usage
                      | DIY | MR | Log Mgmt | SIEM | LaaS
data sources          | known, only a few | known, only a few | unknown, many | known, many | -
analysis use-cases    | known, one or a few | exploration, large-scale | unknown, many | unknown, many | extend platform
dynamic use-cases     | no | no | yes | yes | yes
real-time correlation | no | no | no | yes | extend platform
cost                  | engineer, hardware, maintenance | engineers, hardware, maintenance | license, (hardware), maintenance | license, hardware, maintenance | subscription
Should you rather do it yourself (DIY)?
What is Working and What is not?
What’s Working
• Log collection
• Log centralization
• Alerting on a priori known patterns
• Solving specific, known use-cases for sets of known data sources, e.g.,
- monitoring privileged access to financial servers
- generating compliance reports
- security forensics
What’s Not Working
• Log formats are all over the place and not documented
Mar 16 08:09:58 kernel: [ 0.000000] Normal 1048576 -> 1048576
• No logging guidelines / developer education
• Parsing is broken
- based on regexes
- numerous mistakes
- doesn’t scale
What’s Not Working
• Normalization is broken:
- IP to hostnames (when to do DNS lookup)
- usernames (rmarty vs. ram vs. raffy)
• Categorization / Taxonomy
- doesn’t scale
- is buggy
- is always out of date
- expensive
• Prioritization has no working formula
• Anomaly detection is voodoo!
What Does It Mean?
• We don’t understand our data
• Security Operations Center (SOC) monitors all corporate data sources. Analysts
- don’t know all the applications
- don’t know all the setups
- don’t know what log records are ‘normal’ behavior
--> Need tools to enable log owners to work with their data
Future Needs
We Need Better Tools
• We will have more and more data and need to deal with larger amounts of data
- SIEM needs to support new distributed, scalable data management technologies
• More and more application layer data
- How are we going to deal with all the parsing / entity extraction?
- We need logging standards and guidelines
• How do we help analysts understand the data?
- What is important and what is not?
- Mapping problems to business process, business risk!
Data Visualization
Data/Log Visualization
• Exploration and Discovery
• Answer Questions
• Communicate Information
• Support Decisions
Security Visualization
• We are nowhere!
• Visualization is an afterthought
• Sec Viz dichotomy
• Tools are lacking fundamental capabilities
• Users don’t understand data, how can they understand visuals?
Visualization Concepts
The Analysis Approach
Overview first, Zoom, Details on demand
Principle by Ben Shneiderman
Simultaneous Views
Dynamic Coloring
Linked Views
Legible / Usable Graphs
Reducing non-data ink!
Choosing the Right Chart
Ode to the Pie
Careful With Interpretations
SecViz Examples
Situational Awareness
• Treemap
• Protovis.JS
• Size: Amount
• Brightness: Variance
• Color: Sensor
• Shows: Scans - bright spots
• Thanks to Chris Horsley
Firewall Treemap
Firewall Log
Port, Source IP, Destination IP
IDS Sig Tuning - Treemap
Hierarchy: Source, Destination, Signature, Number of Events
Color: Priority
Size: Number of alerts
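Before a treemap like this can be drawn, the alerts have to be counted along the hierarchy. The sketch below shows one way to compute the box sizes, assuming each alert is a (source, destination, signature) tuple. The alert data and function names are invented for illustration, and the priority-based coloring and the actual drawing are not covered.

```python
from collections import Counter

# Hypothetical IDS alerts as (source, destination, signature) tuples.
ALERTS = [
    ("10.0.0.1", "10.0.0.9", "SCAN nmap"),
    ("10.0.0.1", "10.0.0.9", "SCAN nmap"),
    ("10.0.0.1", "10.0.0.7", "ICMP ping"),
    ("10.0.0.2", "10.0.0.9", "SCAN nmap"),
]


def leaf_sizes(alerts):
    """Size of each leaf box: number of alerts per full hierarchy path."""
    return Counter(alerts)


def rollup(leaf_counts, depth):
    """Size of each box at a given hierarchy depth (1 = per source)."""
    totals = Counter()
    for path, count in leaf_counts.items():
        totals[path[:depth]] += count
    return totals


leaves = leaf_sizes(ALERTS)
per_source = rollup(leaves, 1)
per_pair = rollup(leaves, 2)
for path, count in per_source.most_common():
    print(path, count)
```

For signature tuning, the large leaves are the interesting ones: a single source, destination, and signature combination that dominates the area is a candidate for a filter or a rule change.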
Vulnerability Data by Host
Visualization Future
• A solution to entity extraction
• Dynamic and interactive displays
• Computer aided intelligence / visualization
- Computer supported exploration
- Highly interactive
• Expert system that captures domain knowledge
- Collaborative
Share, discuss, challenge, and learn about security visualization.
http://secviz.org
• List: secviz.org/mailinglist
• Twitter: @secviz