University of Notre Dame
Consolidated Logging: Infrastructure to Information
Bob Winding, Milind Saraph, Robert Riley
University of Notre Dame
Copyright 2008. This work is the intellectual property of the authors. Permission is granted for this material to be shared for non-commercial, educational purposes, provided that this copyright statement appears on the reproduced materials and notice is given that the copying is by permission of the authors. To disseminate otherwise or to republish requires written permission from the authors.
Presentation Agenda
• Project Overview
• Design and Implementation
• Log Analysis
• Lessons Learned
• Project Outcomes and Future Directions
• Questions
• Resources and Links
Background
• 450+ devices in a medium-size data center, each a logging silo with its own log policy.
• As the data center grows, log review, investigation, and troubleshooting become an unscalable effort.
• Multi-tier systems are intertwined, but the logs aren't showing these relationships.
Project Goals
• Reduce time to review logs
  – Consistency
  – Operational efficiency
  – Proactive vs. reactive
• Security investigations and audit information
  – Incident response and forensics
• Regulatory or contractual compliance
• Research ways of obtaining interesting data from logs
Some Questions
• How much data?
• How do we make sense of the data?
• Application vs. system logs
• Log analysis vs. SIM
• What technologies can we apply to the problem?
  – Data mining?
  – Commercial log analysis tools?
The Project Approach
• Multiple phases, from collecting to analysis
• Stand up a logging farm
• Establish a common log policy
• Install/configure clients
• Start sending data
• Review data for quality
• Find useful data
Architecture
[Diagram: Unix, Windows, and firewall (FW) clients send logs over the network to collectors, which feed storage and offline analysis.]
Collectors: Hardware
• Real vs. virtual machines, commercial appliances vs. build-your-own, how many collectors?
• Decision based on administrative boundaries and desired additional (piggybacked) functionality, e.g. reporting for mail and web servers.
• Two systems for mail and web servers: IBM 3650, 10 GB memory, mirrored 146 GB drives.
• Two systems running 3 VMs each: Unix, Windows (2), storage, VPN and internal firewalls, border firewall (work in progress)
Collectors: Software
• VMware Server 1.0.3, RHEL4 minimal install
• syslog (UDP only) vs. syslog-ng (UDP and TCP)
• syslog-ng 1.9: UDP only, minimal configuration, no filtering or processing on collectors, syslog traffic in the clear, no guaranteed delivery
• Swatch (a Perl tool) and logsurfer installed to permit real-time monitoring
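The swatch/logsurfer approach above boils down to matching a set of regex rules against the incoming log stream and firing an alert action on a hit. A minimal sketch in Python, with entirely made-up patterns and sample messages:

```python
import re

# Swatch-style rule table: each rule pairs a compiled regex with an
# alert label. Patterns here are illustrative, not the site's real rules.
RULES = [
    (re.compile(r"Failed password for (\S+) from (\S+)"), "ALERT: failed login"),
    (re.compile(r"segfault"), "ALERT: crash"),
]

def watch(lines):
    """Scan a stream of syslog lines and collect (label, line) alerts."""
    alerts = []
    for line in lines:
        for pattern, label in RULES:
            if pattern.search(line):
                alerts.append((label, line))
    return alerts

sample = [
    "Oct  1 12:00:01 web1 sshd[99]: Failed password for root from 10.1.2.3",
    "Oct  1 12:00:02 web1 kernel: httpd segfault at 0",
    "Oct  1 12:00:03 web1 cron[5]: session opened",
]
print(watch(sample))  # two alerts: the failed login and the segfault
```

A real deployment would tail the live log file and trigger mail or paging actions instead of collecting a list, which is essentially what swatch's `watchfor`/`exec` configuration does.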
Logging Clients
• Unix clients (AIX, Solaris, Linux) define a loghost and use NTP for time synchronization; the same logs are sent to the collectors as are logged locally.
• Windows clients use the free Snare 2.6.5 to convert Windows events to syslog format, and wtime for time synchronization. Other possible converters: Kiwi, ntsyslog
• Firewalls, VPN, and other appliances
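From the client side, "define a loghost" amounts to shipping messages to the collector as BSD-style UDP syslog datagrams. A minimal Python sketch; the local UDP socket here is a stand-in for the collector, and the hostname "loghost" and the sshd message are illustrative:

```python
import logging
import logging.handlers
import socket

# Throwaway local UDP socket standing in for the collector ("loghost");
# a real client would point the handler at the collector's address:514.
recv = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
recv.bind(("127.0.0.1", 0))   # OS picks a free port
recv.settimeout(5)
loghost = recv.getsockname()

logger = logging.getLogger("demo-client")
logger.setLevel(logging.INFO)
# SysLogHandler emits classic BSD syslog over UDP by default,
# matching the UDP-only collector configuration described above.
logger.addHandler(logging.handlers.SysLogHandler(address=loghost))

logger.info("sshd[1234]: Accepted password for alice from 10.0.0.5")

# The datagram arrives with a <priority> prefix (user.info => <14>).
datagram = recv.recv(4096).decode().rstrip("\x00")
print(datagram)
```

The `<14>` prefix encodes facility user (1) and severity info (6) as 1*8+6, which is the same PRI value a stock syslogd client would send.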
Datacenter Zones
[Diagram: campus/Internet traffic enters through a datacenter firewall pair. Zones include VPN (OIT Admin, with vpn-sa remote access for system administrators), No-NAT DMZ, Public Services/Proxies DMZ, Core-SVCS, Systems Monitoring (Mon-TW), Admin-DMZ (Administrative Services DMZ), Core Services, and PCI Backup; subnet addresses are masked in the original.]
Collector Location
• Data center network segmented into zones: core services, DMZ, Admin-DMZ, monitoring, etc.
• Collectors could sit in individual zones or all in the monitoring zone; currently all collectors are in the monitoring zone except the web log collector, which serves reports to the user community.
• Traffic to the monitoring zone is watched closely, normally in the range of 5 to 10 Mb/sec.
Bandwidth Utilization
[Chart omitted: bandwidth utilization of log traffic to the collectors.]
Monitoring and Analysis
• Sysadmins have read-only access to logs.
• Sysadmins can use swatch and logsurfer filters to watch for events; real-time monitoring and troubleshooting, mostly on an as-needed, temporary basis.
• Offline processing on a separate server; daily reports on su and sudo activity for Unix servers.
• Windows sysadmins are relatively behind in monitoring and analysis; they seem to prefer canned, GUI-based solutions.
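The daily su/sudo report described above can be sketched as a pass over the collected syslog, counting invocations per user. The log lines below follow the usual sudo syslog shape but are fabricated samples:

```python
import re
from collections import Counter

# Match the standard sudo syslog line: "sudo:  <user> : ... COMMAND=<cmd>"
SUDO_RE = re.compile(r"sudo:\s+(\S+)\s+:.*COMMAND=(\S+)")

def sudo_report(lines):
    """Count sudo invocations per user across a day's log lines."""
    by_user = Counter()
    for line in lines:
        m = SUDO_RE.search(line)
        if m:
            by_user[m.group(1)] += 1
    return by_user

logs = [
    "Oct  2 09:15:01 db1 sudo:  alice : TTY=pts/0 ; PWD=/home/alice ; USER=root ; COMMAND=/bin/ls",
    "Oct  2 09:16:40 db1 sudo:  alice : TTY=pts/0 ; PWD=/home/alice ; USER=root ; COMMAND=/usr/bin/vi",
    "Oct  2 09:17:02 db1 sudo:  bob : TTY=pts/1 ; PWD=/tmp ; USER=root ; COMMAND=/sbin/reboot",
]
print(sudo_report(logs))  # Counter({'alice': 2, 'bob': 1})
```

A production report would also break out the commands run and flag first-time user/host pairs, but the per-user tally is the core of it.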
Storage and Retention
• Collectors hold logs for a week
• Daily log files are compressed and copied to an NFS-mounted NetApp volume and kept for 6 months
• Logs (proposed to be) retained on tape for one year; 180 GB of compressed logs in about 7 months, currently generating about 50 GB per month
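The compression step in the rotation above is plain gzip over a day's file; the in-memory sample below stands in for a day's log and shows why the gzipped sizes reported later are so much smaller than the raw ones (syslog is highly repetitive):

```python
import gzip
import io

# Stand-in for one day's log file: repetitive syslog-style lines.
raw = b"Oct  3 00:00:01 web1 httpd: GET /index.html 200\n" * 1000

# Compress as the nightly rotation would before copying to the
# NFS-mounted retention volume.
buf = io.BytesIO()
with gzip.GzipFile(fileobj=buf, mode="wb") as gz:
    gz.write(raw)
compressed = buf.getvalue()

print(len(raw), len(compressed))  # repetitive log data compresses heavily
```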
Numbers
• Collector for mail systems (MTAs, mail stores, appliances, etc.): 3 M events, 600 MB gzipped, 5 GB raw per day
• Collector for web servers: 2 M events, 300 MB gzipped, 1 GB raw per day
• Collector for Unix servers: 2 M events, 25 MB gzipped, 350 MB raw per day
• Collector for Windows: 2 M events, 700 MB gzipped, 1.8 GB raw per day
• Total: 10 M events, 10 GB raw, 1.7 GB gzipped per day
Pilot Issues
• Data quality issues
  – Different sources use different formats
  – Logs inconsistent and incomplete
  – Bugs in the source log system
  – Multiple log entries comprise a single event
• Spikes in traffic
  – Snare in a log transmission loop
• Normalization of data required for analysis
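Normalization means mapping the differing source formats onto one record shape before analysis. A sketch: both regexes below are illustrative (the Unix one follows the common sshd line; the Snare shape is a hypothetical stand-in, not the real Snare field layout):

```python
import re

# Illustrative source formats; real normalization would cover many more.
UNIX_RE = re.compile(r"sshd\[\d+\]: (Accepted|Failed) \w+ for (\S+) from (\S+)")
SNARE_RE = re.compile(r"MSWinEventLog.*?(Success|Failure) Audit.*?User:\s*(\S+)")

def normalize(line):
    """Map a raw log line into one unified record shape, or None."""
    m = UNIX_RE.search(line)
    if m:
        return {"source": "unix", "result": m.group(1).lower(),
                "user": m.group(2), "srcip": m.group(3)}
    m = SNARE_RE.search(line)
    if m:
        return {"source": "windows",
                "result": "accepted" if m.group(1) == "Success" else "failed",
                "user": m.group(2), "srcip": None}
    return None

print(normalize("host sshd[42]: Failed password for bob from 10.9.8.7"))
```

Downstream clustering and aggregation then only ever sees the unified `{source, result, user, srcip}` schema.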
Offline Data Analysis
• Define the questions logging should answer
• Data cleansing, feature extraction, unified format
• Clustering, aggregation
• Visualization
• Eventually, correlation
Sample Data Analysis
• Analyzed FW flow-related log data
  – Datacenter firewall over two days
• Analyzed all Unix system logon attempts
  – 30 systems over a month
FW Log Analysis
• This analysis preceded the log farm and helped generate support for the project
• Some anomalies are characteristic of misconfiguration or intrusion
  – Source addresses talking to lots of destinations or ports
  – Lots of failed traffic
  – Trying to connect to "dark net" addresses
Uniform Format and Cleansing
• A uniform format is needed
  – Timestamp, action, srcip, srcport, dstip, dstport, proto, bytes-to-server, bytes-to-client
• Used two days of test data: 1.6 million records
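The uniform flow record above maps naturally onto a fixed-field tuple. A sketch; the CSV sample is fabricated, not an actual firewall export:

```python
import csv
import io
from collections import namedtuple

# The unified flow record from the slide, one field per column.
Flow = namedtuple("Flow", "timestamp action srcip srcport dstip dstport "
                          "proto bytes_to_server bytes_to_client")

# Two made-up cleansed firewall records in the unified format.
sample = io.StringIO(
    "2008-03-01T00:00:01,permit,10.1.2.3,50514,10.4.5.6,80,tcp,512,2048\n"
    "2008-03-01T00:00:02,deny,10.1.2.9,40000,10.4.5.7,22,tcp,0,0\n"
)

flows = [Flow(*row) for row in csv.reader(sample)]
print(flows[0].action, flows[0].dstport)  # permit 80
```

Once every record has this shape, the clustering and aggregation steps below are simple passes over the list.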
Clustering
• Count of Destination IPs per Source IP
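The statistic behind the plot is a per-source fan-out count: how many distinct destination IPs each source IP talked to. A sketch over fabricated (src, dst) pairs:

```python
from collections import defaultdict

# Made-up (srcip, dstip) pairs extracted from unified flow records.
flows = [
    ("10.0.0.1", "10.4.5.6"), ("10.0.0.1", "10.4.5.7"),
    ("10.0.0.1", "10.4.5.8"), ("10.0.0.2", "10.4.5.6"),
]

# Distinct destinations per source: scanners show up as extreme outliers.
dsts = defaultdict(set)
for src, dst in flows:
    dsts[src].add(dst)

counts = {src: len(d) for src, d in dsts.items()}
print(counts)  # {'10.0.0.1': 3, '10.0.0.2': 1}
```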
Clustering
• The IPs associated with the outliers are, from the most extreme:
  – A system performing a web port scan
  – Two systems performing ssh port scans
  – A legitimate monitoring server
  – Three workstations that likely have a worm/virus
  – Load-balancing switch health-check probes
  – A web statistics logging server running a recursive DNS server to avoid impacting the performance of general University DNS servers
Unix Logon Features
• Timestamp
• Destination IP
• Auth success/failure
• User
• Source IP
• Service (ssh, etc.)
• Session duration (future)
Unix Logon Aggregations
• 45K events in the sampled month
• About 1,000 systems talking to 30 servers
• User perspective
  – user, #services, #srcIPs, #success, #failed
• IP perspective
  – srcIP, #users, #success, #failed
Visualizing Data with WEKA
• WEKA: a data-mining tool from the University of Waikato
• Clustering
  – Looking for patterns in the data
• Classification
  – Hopefully able to identify interesting records based on the clustering
• Visualization
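Getting the aggregates into WEKA means writing them out as an ARFF file. A sketch with made-up per-user rows (attribute names are illustrative, matching the user-perspective fields):

```python
# Made-up user-perspective rows: (user, #services, #srcIPs, #success, #failed).
rows = [
    ("alice", 2, 2, 10, 1),
    ("bob",   1, 1,  0, 25),
]

# Minimal ARFF: @relation, @attribute declarations, then @data rows.
lines = ["@relation unix_logons",
         "@attribute user string",
         "@attribute num_services numeric",
         "@attribute num_srcips numeric",
         "@attribute success numeric",
         "@attribute failed numeric",
         "@data"]
for user, services, srcips, ok, fail in rows:
    lines.append(f"{user},{services},{srcips},{ok},{fail}")

arff = "\n".join(lines)
print(arff)
```

The resulting file loads directly into the WEKA Explorer for the clustering and visualization steps above.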
User Scanning
[Chart: scatter plot of user name (y-axis) vs. source IP address (x-axis)]
Data Analysis – Brute Force
[Chart: scatter plot of time stamp (y-axis) vs. authentication accepted/failed (x-axis)]
Operational Process (not implemented yet)
[Flowchart: extract logs from central log storage by time interval and type → preprocess into a feature-based format → run a classifier or clustering, drawing on processed data from previous intervals → flag anomalies for review; repeat by time interval.]
Commercial Log Tools
• Commercial log analysis
  – Used to expedite the PCI implementation
  – Alerts on events
  – Nice interface, but still requires effort to get meaningful information
  – Somewhat analogous to an IDS, but with multiple data sources in multiple formats
  – Doesn't support custom formats
Lessons Learned (logging is hard work)
• Need collaboration among groups
• Need to learn about your data, systems, and network
• Analysis is not automatic
• It's a formidable project to get information
• Identify what questions you want answered
• Forensics is straightforward
Outcomes
• Seven collectors up and running
• Majority of datacenter servers logging
• Collecting 10 GB of data daily
• Learning what is normal
• Beginning the process of analyzing the data
  – sudo and su activity
  – Analyzing Unix auth data and firewall data
Future
• Phase II
  – Enhance data understanding
  – Make analysis production-ready (beginning of a SIM)
  – Report on operational metrics
• Phase III
  – Real-time alerting
  – Operationalize event correlation
Questions?