Sumo Logic Confidential
Data Collection
June 2016
How-To Webinar
Welcome. To give everyone a chance to successfully connect, we'll start at 10:05 AM Pacific.
At the completion of this webinar, you will be able to:
• Design a Sumo Logic deployment that fits your organization
• Install Collectors
• Create your Data Sources
• Understand Local File Configuration Management
High-Level Data Flow
Sumo Logic Data Flow
1. Data Collection: Collectors, Sources
2. Search & Analyze: Operators, Charts
3. Visualize & Monitor: Alerts, Dashboards
Enterprise Logs are Everywhere
• Custom App Code
• Server / OS
• Virtual
• Databases
• Network
• Open Source
• Middleware
• Content Delivery
• IaaS, PaaS
• SaaS
• Security
Designing Your Deployment
• Sumo Logic Data Collection is infinitely flexible.
• Design a Sumo Logic deployment that's right for your organization.
• Installed versus Hosted Collectors.
Collectors and Sources
• Host A, Collector A: Apache Access, Apache Error
• Host B, Collector B: Apache Access, Apache Error
• Host C, Collector C: IIS Logs, IIS W3C Logs
Collectors
Collector and Deployment Options
• Hosted Collectors: Cloud Data Collection
• Installed Collectors: Centralized Data Collection, Local Data Collection
Cloud Data Collection
Most data is generated in the cloud and by cloud services, and is collected via Sumo Logic's cloud integrations.

Source Types
• S3 Bucket: any data written to S3 buckets via AWS, Lambda scripts, custom apps
• HTTPS: Akamai, log appender libraries, etc.
• Google: Google API

Typical Scenarios
AWS-only customers. While it's possible to rely on Cloud Data Collection entirely, this is not typical; these source types are normally just one part of the overall collection strategy.

Benefits/Drawbacks
+ No software installation
- S3 latency issues
- HTTPS POST caching needed
Local Data Collection
The Sumo Logic Collector is installed on all target hosts and, where possible, sends log data produced on those hosts directly to the Sumo Logic backend via an HTTPS connection.

Source Types
• Local Files: operating systems, middleware, custom apps, etc.
• Windows Events: local Windows events
• Docker: logs and stats
• Syslog (dedicated Collector): network devices, Snare, etc.
• Script (dedicated Collector): cloud APIs, database content, binary data

Typical Scenarios
Customers with large numbers of (similar) servers, using orchestration/automation, mostly OS and application logs
- On-premise datacenters
- Cloud instances

Benefits/Drawbacks
+ No hardware requirement
+ Automation (Chef/Puppet/Scripting)
- Outbound internet access required
- Resource usage on target
Collector Deployment – Local Collectors
Centralized Data Collection
The Sumo Logic Collector is installed on a set of dedicated machines. These collect log data from the target hosts via various remote mechanisms and forward it to the Sumo Logic backend. This can be accomplished either by using the Sumo Logic syslog source type, or by running syslog servers (syslog-ng, rsyslog) that write to file and collecting from there.

Source Types
• Syslog: operating systems, middleware, custom applications, etc.
• Windows Events: remote Windows events
• Script: cloud APIs, database content, binary data

Typical Scenarios
Customers with mostly Windows environments or existing logging infrastructure (syslog/logstash)
- On-premise datacenters

Benefits/Drawbacks
+ No outbound internet access
+ Leverage existing logging infrastructure
- Scale
- Dedicated hardware
- Complexity (failover, syslog rules)
Collector Deployment – Centralized Collector
Deployment Options Summary

Local
Benefits:
• Direct access to source logs
• Ease of troubleshooting
• No additional HW requirements
Drawbacks:
• More complex management
• Resource usage on target host
• Need for outbound internet access

Centralized
Benefits:
• Fewer collectors and sources
• Simplified management
• Target hosts don't need outbound internet access
Drawbacks:
• Need for dedicated hardware
• More complex setup (users, permissions)
• Harder to troubleshoot
• Requires careful planning in order to scale

Hosted
Benefits:
• Agentless
• Build it into your infrastructure (S3)
• Direct HTTP POST
Drawbacks:
• Requires local script to POST or curl messages

Resources:
• Design Your Deployment
• Best Practices: Local and Centralized Data Collection
Sources
Collectors and Sources
• Host A, Collector A: Apache Access, Apache Error
• Host B, Collector B: Apache Access, Apache Error
• Host C, Collector C: IIS Logs, IIS W3C Logs
Defining a Source
A single Collector can have multiple Sources.
Key fields to define when configuring any Source type:
• Name
• Description
• Historical Data
• Source Host
• Source Category
• File path
  – Excluding syslog
• Timestamp Parsing
Source Specific: Remote File
Required for remote collection:
• Listening port
• Remote login credentials
  – Username and password
  – Local SSH
• Absolute file path
Source Specific: Syslog
Required for Syslog collection:
• Protocol
• Listening port
Source Specific: Windows Event Collection
Required for Windows Event Collection:
• Remote specific:
  – Remote host name(s)
  – Windows Domain
  – Username / password
• Windows Event Type
Source Specific: Windows Performance Collection
Required for Windows Performance Collection:
• Remote specific:
  – Remote host name(s)
  – Windows Domain
  – Username / password
• Frequency
• Perfmon Queries
Source Specific: Script
Required for script-based collection:
• Execution frequency
• Command type
• Path to script
• Script to execute
• Working directory
Source Specific: HTTP
Required for HTTP Source:
• How to treat incoming POST requests

After Configuration:
• Use URL to send POST messages to the collector
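
As an illustration of sending POST messages to an HTTP Source, here is a minimal Python sketch using only the standard library. The URL is a made-up placeholder, not a real endpoint; substitute the unique URL shown after you configure the source. The helper names `build_log_request` and `send_logs` are hypothetical, chosen for this example, and sending one message per line as plain text is an assumption about a reasonable payload shape.

```python
import urllib.request

# Placeholder only (hypothetical): substitute the unique URL of your HTTP Source.
SOURCE_URL = "https://collector.example.com/receiver/v1/http/UNIQUE_TOKEN"


def build_log_request(url: str, lines: list[str]) -> urllib.request.Request:
    """Build a POST request whose body holds one log message per line."""
    body = "\n".join(lines).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "text/plain; charset=utf-8"},
    )


def send_logs(url: str, lines: list[str], timeout: float = 10.0) -> int:
    """Send the messages and return the HTTP status code."""
    request = build_log_request(url, lines)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status


if __name__ == "__main__":
    # Only builds and inspects the request; call send_logs() with a real URL to post.
    req = build_log_request(SOURCE_URL, ["app started", "user=alice action=login"])
    print(req.get_method(), req.full_url, len(req.data), "bytes")
```

Separating request construction from sending keeps the payload easy to inspect before anything leaves the host.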
Source Specific: Amazon S3 and AWS sources
Required for Amazon S3:
• IAM
  – Key ID
  – Secret Key
• Bucket name
• Path expression
• Scan interval
Configuration: Filtering Source Data
• Regular expressions are used to create rules to filter data sent from a Source.
• The filters affect only data sent to Sumo Logic; logs on your end remain intact.
• Filter Types
  – Exclude Filter (Black List)
  – Include Filter (White List)
  – Hash Filter (e.g. replace a credit card number with a unique, randomly generated code)
  – Mask Filter (e.g. mask each character with #)
• Note
  – Exclude filters override all other filter types for a specific value
  – Mask and hash filters are applied after exclusion and inclusion filters
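
The ordering rules above can be sketched in Python. This is an illustration of the described precedence only, not Sumo Logic's implementation: the patterns are invented for the example, the convention that capture group 1 marks the text to mask or hash is an assumption, and a truncated SHA-256 digest stands in for the "unique randomly generated code".

```python
import hashlib
import re
from typing import Callable, Optional

# Illustrative rules only; these patterns are assumptions, not product defaults.
EXCLUDE = [re.compile(r"DEBUG")]
INCLUDE = [re.compile(r"payment|login")]   # an empty list would include everything
MASK = [re.compile(r"password=(\S+)")]     # group 1 is masked with '#'
HASH = [re.compile(r"card=(\d{16})")]      # group 1 is replaced by a digest


def _replace_group(pattern: re.Pattern, line: str, fn: Callable[[str], str]) -> str:
    """Replace capture group 1 of every match with fn(group_text)."""
    def substitute(m: re.Match) -> str:
        start, end = m.start(1) - m.start(), m.end(1) - m.start()
        whole = m.group(0)
        return whole[:start] + fn(m.group(1)) + whole[end:]
    return pattern.sub(substitute, line)


def apply_filters(line: str) -> Optional[str]:
    """Return the line as it would be sent, or None if it is dropped."""
    if any(p.search(line) for p in EXCLUDE):   # exclude overrides everything else
        return None
    if INCLUDE and not any(p.search(line) for p in INCLUDE):
        return None
    for p in MASK:                             # mask and hash run after include/exclude
        line = _replace_group(p, line, lambda s: "#" * len(s))
    for p in HASH:
        line = _replace_group(
            p, line, lambda s: hashlib.sha256(s.encode("utf-8")).hexdigest()[:16]
        )
    return line
```

For example, a line containing `DEBUG` is dropped even if it also matches an include rule, while a line such as `payment card=... password=...` is kept with both sensitive values rewritten.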
Configuration: Filtering Files (Blacklisting)
• Blacklist a file or set of files that shouldn't be ingested
Metadata
Metadata Fields
Tags added to your messages when data is collected.

Name              Description
_collector        Name of the collector this data came from
_source           Name of the source this data came through
_sourceHost       Hostname of the server this data came from
_sourceName       Name of the log file (including path)
_sourceCategory   Category designation of source data

Example: Host A, Collector A: Apache Access, Apache Error
Metadata Field Usage

Host A, Collector A
• Apache Access: _sourceCategory = WS/Apache/Access
• Apache Error: _sourceCategory = WS/Apache/Error

Host B, Collector B
• Apache Access: _sourceCategory = WS/Apache/Access
• Apache Error: _sourceCategory = WS/Apache/Error

Host C, Collector C
• IIS Logs: _sourceCategory = WS/IIS
• IIS W3C Logs: _sourceCategory = WS/IIS/W3C

Sample searches for _sourceCategory:
• _sourceCategory = WS/Apache/Access
• _sourceCategory = WS/Apache/*
• _sourceCategory = WS/*
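
To see which sources each sample search would select, the wildcard scoping can be approximated in Python with `fnmatch`. This is only a rough model: `fnmatchcase` also gives `?` and `[...]` special meaning and is case-sensitive, which may differ from how Sumo Logic evaluates `_sourceCategory`. The function name `matching_categories` is hypothetical.

```python
from fnmatch import fnmatchcase

# The source categories used on this slide.
CATEGORIES = [
    "WS/Apache/Access",
    "WS/Apache/Error",
    "WS/IIS",
    "WS/IIS/W3C",
]


def matching_categories(pattern: str) -> list[str]:
    """Return the categories selected by a wildcard pattern such as 'WS/Apache/*'."""
    return [c for c in CATEGORIES if fnmatchcase(c, pattern)]


if __name__ == "__main__":
    for pattern in ("WS/Apache/Access", "WS/Apache/*", "WS/*"):
        print(pattern, "->", matching_categories(pattern))
```

Each broader pattern selects a superset of the previous one, which is the benefit of ordering category components from general to specific.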
Source Category Best Practices
• Recommended nomenclature for Source Categories: Component1/Component2/Component3…
• From least descriptive to most descriptive:
  – Networking/Firewall/Cisco/FWSM
  – Networking/Firewall/Cisco/ASA
  – Networking/Firewall/PAN/PA7050
  – Networking/Router/Cisco/2821
• Note: Not all types of logs need to have the same number of levels.
• Benefits
  – Simple search scoping by using wildcards anywhere in the string
  – Simple, intuitive and self-maintaining partitions/index
  – Simple and self-maintaining RBAC rules
• Blog Post: Good SourceCategory, Bad SourceCategory
Automation
Automating Deployments
• Silent installation
  – Use sumo.conf
  – Provide name, credentials and source file parameter for initial setup only
• Local Configuration Collector Management
  – Manage configuration locally using a JSON file with Chef/Puppet
  – Available for both new and existing collectors
• Collector Management API
  – Define an initial Source configuration for your Collectors using a JSON file
  – Retrieve and update Collector Configuration from an HTTP endpoint
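
A JSON source file of the kind mentioned above could be generated from a script, which is convenient when Chef or Puppet templates the same sources onto many hosts. The sketch below is an assumption-laden example: the field names (`api.version`, `sources`, `sourceType`, `pathExpression`, `category`) reflect our understanding of the JSON source format and should be checked against the current Sumo Logic documentation, and the file paths and helper names are hypothetical.

```python
import json


def local_file_source(name: str, path_expression: str, category: str) -> dict:
    """Describe one Local File Source (field names are assumed, see note above)."""
    return {
        "sourceType": "LocalFile",
        "name": name,
        "pathExpression": path_expression,
        "category": category,
    }


def build_sources_config(sources: list[dict]) -> str:
    """Serialize the Source definitions into a single JSON document."""
    return json.dumps({"api.version": "v1", "sources": sources}, indent=2)


if __name__ == "__main__":
    # Hypothetical log paths; adjust to where your web server actually writes.
    config = build_sources_config([
        local_file_source("Apache Access", "/var/log/apache2/access.log", "WS/Apache/Access"),
        local_file_source("Apache Error", "/var/log/apache2/error.log", "WS/Apache/Error"),
    ])
    print(config)
```

Generating the file rather than hand-editing it keeps Source Category naming consistent across hosts.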
Installed Collector Deployment Tips
• Install using Collector Guidelines/Requirements
• Access Keys
  – Used for collector registration and API
  – ID/Key pair instead of user/pass
    • Especially important when storing credentials on disk
• Collector Logs
  – Logs in: $SUMO_HOME/logs
  – Current Log: $SUMO_HOME/logs/collector.log
  – Check for Out of Memory Errors
  – Increase memory if needed as described on Support Site Post
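
Checking the current log for out-of-memory errors can be scripted. The sketch below assumes that such errors appear in `collector.log` with the text `OutOfMemoryError` (the usual Java error name); that search string is an assumption, so adjust it to what your logs actually contain. The function name is hypothetical, and `SUMO_HOME` is read from the environment as on the slide.

```python
import os
from pathlib import Path


def find_out_of_memory_lines(log_path: Path, needle: str = "OutOfMemoryError") -> list[str]:
    """Return the log lines that contain the needle (case-sensitive)."""
    hits = []
    with log_path.open("r", encoding="utf-8", errors="replace") as handle:
        for line in handle:
            if needle in line:
                hits.append(line.rstrip("\n"))
    return hits


if __name__ == "__main__":
    sumo_home = os.environ.get("SUMO_HOME")
    if sumo_home:
        log_file = Path(sumo_home) / "logs" / "collector.log"
        if log_file.exists():
            for hit in find_out_of_memory_lines(log_file):
                print(hit)
```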
Questions?
Additional Resources
• Search Video Library and Documentation
• Search/Post to Community Forums
  – Search, post, respond
  – Submit/vote for feature requests
  – Submit Tips & Tricks
• Open a Support Case
• Sumo Logic Services
  – Customer Success, Professional Services, Training
Thank You!
April 2016