Scaling Splunk Log Analytics without the Storage Headaches
Veda Shankar
Red Hat
2 2015 Data Storage Innovation Conference. © Insert Your Company Name. All Rights Reserved.
splunk> is Analytics for Machine Data
No predefined schema, no custom connectors, no RDBMS, no need to filter/forward.
Data types: Logfiles, Configs, Messages, Traps, Alerts, Metrics, Scripts, Tickets, Changes
Applications: >Web logs >Log4J, JMS, JMX >.NET events >Code and scripts
Networking: >Configurations >syslog >SNMP >netflow
Databases: >Configurations >Audit/query logs >Tables >Schemas
Virtualization & Cloud: >Hypervisor >VMware, EC2 >Guest OS, Apps >Cloud
Linux/Unix: >Configurations >syslog >File system >ps, iostat, top
Windows: >Registry >Event logs >File system >sysinternals
Customer Facing Data: >Click-stream data >Shopping cart data >Online transaction data
Outside the Datacenter: >Manufacturing, logistics… >CDRs & IPDRs >Power consumption >RFID data >GPS data
Recognizing Threat Patterns
More log data × Longer retention period = Easier pattern recognition + Retention policy compliance
vs.
Data Ingested x Retention Rate (Large Enterprise)
SPLUNK DATA AND STORAGE ARCHITECTURES
Splunk Automatic Data Migration
HOT: Newly indexed data goes into a hot bucket, which is a bucket that's both searchable and actively being
written to. After the hot bucket reaches a certain size, it becomes a warm bucket ("rolls to warm"), and a new
hot bucket is created.*
WARM: Warm buckets are searchable, but are not actively written to. There are many warm buckets.*
COLD: Once the indexer has created some maximum number of warm buckets, it begins to roll the warm buckets to cold based on their age.* Cold buckets are also searchable, and can be placed on slower, lower-cost media.
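The hot/warm/cold lifecycle described above can be pictured with a toy model. This is only a sketch, not Splunk's implementation: the class `ToyIndex` and the parameters `max_hot_size_mb` and `max_warm_buckets` are hypothetical names chosen for illustration, and "age" is approximated by the order in which buckets were created.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Bucket:
    bucket_id: int
    size_mb: int = 0


@dataclass
class ToyIndex:
    """Toy model of bucket rolling; all names and thresholds are hypothetical."""
    max_hot_size_mb: int    # size at which the hot bucket rolls to warm
    max_warm_buckets: int   # warm buckets kept before the oldest rolls to cold
    hot: Bucket = field(default_factory=lambda: Bucket(0))
    warm: deque = field(default_factory=deque)  # oldest bucket on the left
    cold: list = field(default_factory=list)
    _next_id: int = 1

    def ingest(self, size_mb: int) -> None:
        """Write new data into the hot bucket, rolling it when it is full."""
        self.hot.size_mb += size_mb
        if self.hot.size_mb >= self.max_hot_size_mb:
            self._roll_hot_to_warm()

    def _roll_hot_to_warm(self) -> None:
        # The full hot bucket becomes warm and a new hot bucket is created.
        self.warm.append(self.hot)
        self.hot = Bucket(self._next_id)
        self._next_id += 1
        # Past the warm limit, the oldest warm buckets roll to cold.
        while len(self.warm) > self.max_warm_buckets:
            self.cold.append(self.warm.popleft())


index = ToyIndex(max_hot_size_mb=100, max_warm_buckets=2)
for _ in range(5):
    index.ingest(100)
print("hot:", index.hot.bucket_id)
print("warm:", [b.bucket_id for b in index.warm])
print("cold:", [b.bucket_id for b in index.cold])
```

In this model only the cold list would live on the lower-cost tier; the hot bucket and the warm queue stay on the fastest storage.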
Splunk Storage Approaches
Splunk Storage Approach Strengths
DAS-only: Hot/Warm and Cold Data
Optimized for: • Performance
Shared-only: Hot/Warm and Cold Data
Optimized for: • Managing all storage in a single pool
Hybrid: Hot/Warm on DAS, Cold on Red Hat Storage
Optimized for: • Cost • Performance • Managing >50TB data sets
Benefits of Splunk Hybrid Storage Approach
Most recent data: Highest-performance ingest and search
Aging data: Low-cost, high-capacity, good-performance search
Cost Scalability: Scale compute costs independently from scale-out storage capacity costs.
Admin Simplification: Reduce admin complexity and Indexer downtime by managing aging data growth in a single, elastic pool of storage.
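A rough way to see why the two tiers scale independently is to split retained data by age. The sketch below is illustrative arithmetic only: the function name `tier_split_tb` and the example inputs (500 GB/day, 30 days hot/warm, 365 days total) are assumptions, and it ignores compression, index overhead, and replication.

```python
def tier_split_tb(daily_ingest_gb: float, hot_warm_days: int,
                  total_retention_days: int) -> tuple[float, float]:
    """Return (DAS TB, shared-storage TB) for a hybrid layout.

    Data younger than hot_warm_days stays on indexer DAS; older data,
    up to total_retention_days, sits in the cold tier. 1 TB = 1000 GB.
    """
    if hot_warm_days > total_retention_days:
        raise ValueError("hot/warm window cannot exceed total retention")
    das_tb = daily_ingest_gb * hot_warm_days / 1000
    cold_tb = daily_ingest_gb * (total_retention_days - hot_warm_days) / 1000
    return das_tb, cold_tb


# Hypothetical inputs, chosen only to show the shape of the result.
das_tb, cold_tb = tier_split_tb(500, hot_warm_days=30, total_retention_days=365)
print(f"DAS (hot/warm): {das_tb} TB, shared (cold): {cold_tb} TB")
```

Extending retention in this model grows only the cold figure, which is the capacity the shared pool is meant to absorb.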
A New Hybrid Storage Solution for Splunk
10s of TB on Splunk server DAS
100s of TB on Red Hat Storage Cluster
Hot/warm data optimized for performance
Cold historical data optimized for cost/capacity/elasticity
software-defined storage on Cisco big data servers
RED HAT GLUSTER STORAGE
Red Hat Gluster Storage
Connect clusters of standard x86 servers into a single pool of storage.
• Single, shared namespace with petabyte scalability
• Multi-protocol NAS interface along with Object access
• Data protection and HA at disk, server, and site levels
• Seamlessly extensible and self-healing
software-defined storage
Gluster Concepts
Gluster Brick
Gluster Volume
Gluster Elastic Hash Algorithm
∙ No central metadata
  ∙ No performance bottleneck
  ∙ Eliminates risk scenarios
∙ Location hashed on filename
  ∙ Unique identifiers, similar to md5sum
∙ The “Elastic” part
  ∙ Files assigned to virtual volumes
  ∙ Virtual volumes assigned to multiple bricks
  ∙ Volumes easily reassigned on the fly
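The idea of locating a file by hashing its name, with no metadata server, can be sketched in a few lines. This is a simplified illustration rather than Gluster's actual algorithm: MD5 stands in for the real hash function, and `NUM_VIRTUAL_VOLUMES`, `virtual_volume_for`, `build_layout`, and `locate` are hypothetical names.

```python
import hashlib

NUM_VIRTUAL_VOLUMES = 8  # hypothetical count, fixed for the life of the volume


def virtual_volume_for(filename: str) -> int:
    """Hash the filename to a virtual volume; no lookup table is consulted."""
    digest = hashlib.md5(filename.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % NUM_VIRTUAL_VOLUMES


def build_layout(bricks: list[str]) -> dict[int, str]:
    """Spread the virtual volumes round-robin across the available bricks."""
    return {vv: bricks[vv % len(bricks)] for vv in range(NUM_VIRTUAL_VOLUMES)}


def locate(filename: str, layout: dict[int, str]) -> str:
    """Any client can compute a file's brick from its name and the layout."""
    return layout[virtual_volume_for(filename)]


layout = build_layout(["server1:/brick1", "server2:/brick1"])
print(locate("rawdata/journal.gz", layout))

# The "elastic" part: reassigning one virtual volume to a new brick
# relocates only the files that hash to that virtual volume.
layout[0] = "server3:/brick1"
print(locate("rawdata/journal.gz", layout))
```

Because the filename-to-virtual-volume mapping never changes, growing the pool is a matter of editing the small layout table, not rehashing every file.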
Gluster Multi-Protocol Access
Primarily accessed as scale-out file storage with optional Swift object APIs
Runs on-premise or in the cloud, across 100s of server nodes
Client access:
• File: native hi-perf client, NFS, CIFS (Samba)
• Hadoop: HDFS*
• API: C lib (gfapi)
• Object: S3/Swift REST API
Fault Tolerant Data Placement (distributed replicated volume)
Creates a fault-tolerant distributed volume by mirroring each file across two bricks.
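As a minimal sketch of this placement, bricks can be grouped into mirrored pairs, with each file hashed to exactly one pair. The helper names `replica_sets` and `place` are hypothetical, and CRC32 is used here only as a stand-in for Gluster's real hash.

```python
import zlib


def replica_sets(bricks: list[str], replica_count: int = 2) -> list[list[str]]:
    """Group consecutive bricks into replica sets (mirrors)."""
    if len(bricks) % replica_count != 0:
        raise ValueError("brick count must be a multiple of the replica count")
    return [bricks[i:i + replica_count]
            for i in range(0, len(bricks), replica_count)]


def place(filename: str, sets: list[list[str]]) -> list[str]:
    """Distribute by hash across replica sets; every brick in the chosen
    set stores a full copy of the file."""
    index = zlib.crc32(filename.encode("utf-8")) % len(sets)
    return sets[index]


sets = replica_sets(["server1:/b1", "server2:/b1", "server3:/b1", "server4:/b1"])
print(sets)
print(place("db_cold_bucket_0001", sets))
```

Losing one brick leaves a complete copy of every file in that set on its mirror, while the other sets are unaffected.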
SPLUNK WITH RED HAT GLUSTER STORAGE
Single Site Deployment with Splunk Index Replication
Rare Term Streaming Search Tests
Red Hat Storage Server used for elastic cold storage uniformly outperformed the reference configuration on rare-term searches under streaming load.
Dense Term Streaming Search Tests
Red Hat Storage Server used for elastic cold storage performed similarly to the reference configuration on dense-term searches under streaming load.
CPU Percentage vs Concurrency
Red Hat Storage Server for elastic cold storage showed similar CPU utilization profiles to servers with only direct-attached storage.
EBF Full Run GlusterFS Client using Two and Four Brick Volumes
Red Hat Storage Server for elastic cold storage provides increased performance as the storage is scaled out across a greater number of nodes and bricks.
THANK YOU