Copyright © 2016 Splunk Inc. SplunkZam! Chris Kammermann Service Engineering Infrastructure (Team Lead)

SplunkLive! London 2016 - Shazam


Slide 1

SplunkZam!
Chris Kammermann
Service Engineering Infrastructure (Team Lead)

Copyright 2016 Splunk Inc.

Warning! This presentation may involve Audience Participation!

#

Shazamability Schedule

During SplunkLive!, you will be able to Shazam the following:

Audio or Bluetooth:
  • 09:45 - 10:04 Shazam to find out more about me
  • 10:05 - 10:30 Fill out a survey on the Shazam presentation
  • 10:30 - 13:00 Splunk Conf
  • 15:30 - 17:00 Splunk Survey Monkey

Images All Day:
  • Splunk .conf in Orlando
  • Shazam For Brands

#

Agenda
  • Me as represented by a Splunk dashboard
  • Shazam, the Audio/Visual/Beacon recognition app
  • How Shazam used Splunk in the beginning
  • The rebirth of Splunk, how Shazam is using it now and what is planned for the future

#

index=personal sourcetype=anecdote novelty>0

#

#

Name that Song Sight, Sound
  • One of the world's most loved and downloaded mobile apps
  • 120 million Monthly Active Users
  • Shazaming leads to 400K music downloads every day!
  • 5% of all music downloads originate via Shazam!
  • Shazam can recognise TV adverts, advertising billboards, Coke cans (USA), KFC buckets (Australia), print magazines (Esquire), even some TfL London buses!

#

Sometime 4 Years Ago
Legacy relational databases and a mass of unstructured data caused the following challenges for Shazam:

  • Inability to gain insight into the app or our customers
  • Lengthy data processing times
  • Difficulty reacting quickly

We began using Splunk to address these problems

#

Beacon Data
  • Every handset, wearable device or PC that has Shazam installed sends Beacon data to servers in the cloud
  • Almost every button click generates a beacon log
  • Hundreds of gigabytes of Beacon data are ingested into Splunk every day
  • Most events are searchable in Splunk < 4 seconds from the button click!
  • At the time, Beacon data was 80+% of the data stored in Splunk

Few OS logs were being sent to Splunk

#

How Shazam used this data
  • Shazam for TV campaign analysis
  • A/B testing
  • Music charts, e.g. the Shazam Top 20 radio show (Australia)
  • Mobile app error analysis
  • Key monthly reports, e.g. MAU
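As an illustration of what a monthly report such as MAU computes, here is a minimal Python sketch that counts distinct devices per calendar month. It is not the search Shazam ran in Splunk; the event layout (an ISO timestamp paired with a device ID) and all sample values are assumptions made for the example.

```python
from collections import defaultdict
from datetime import datetime


def monthly_active_users(events):
    """Count distinct device IDs per calendar month.

    `events` is an iterable of (iso_timestamp, device_id) pairs. Both
    fields are hypothetical stand-ins for whatever the beacon data holds.
    """
    devices_by_month = defaultdict(set)
    for timestamp, device_id in events:
        month = datetime.fromisoformat(timestamp).strftime("%Y-%m")
        devices_by_month[month].add(device_id)
    return {
        month: len(devices)
        for month, devices in sorted(devices_by_month.items())
    }


# Invented sample data: device-a is active in both months.
sample_events = [
    ("2016-03-01T09:45:00", "device-a"),
    ("2016-03-02T10:05:00", "device-a"),
    ("2016-03-15T18:30:00", "device-b"),
    ("2016-04-01T08:00:00", "device-a"),
]

mau = monthly_active_users(sample_events)
print(mau)
```

A device that fires many beacons in one month is counted once, which is why the report is a distinct count and not an event count.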

#

TV Advert Analytics Dashboard

#

A/B(C/D) Testing
New features in the app could grow our MAU. The worst case is that these features annoy users and they uninstall the app, never to return!

Every fraction of a percent of our user base that is happy = potentially tens of thousands of users every month!
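The arithmetic behind that claim can be checked with a short Python sketch. It uses the 120 million MAU figure quoted earlier in the talk; the percentages tried are arbitrary examples, and the function name is made up for illustration.

```python
MONTHLY_ACTIVE_USERS = 120_000_000  # MAU figure quoted earlier in the talk


def users_affected(percent_of_user_base, user_base=MONTHLY_ACTIVE_USERS):
    """Return how many users a given percentage of the user base represents."""
    return round(user_base * percent_of_user_base / 100)


# Arbitrary example percentages, each well under one percent.
for pct in (0.01, 0.05, 0.1):
    print(f"{pct}% of MAU = {users_affected(pct):,} users")
```

Even one hundredth of a percent is 12,000 users, so small differences between test variants are worth measuring.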

#

Ad Hoc or Targeted Queries
  • How many devices in Japan have Bluetooth enabled?
  • How many people in Los Angeles like this band?
  • What songs have an artificially inflated tag count? Is someone or something trying to rig the charts?
  • What song is popular at 8th Avenue and 14th Street, New York?
  • How many people share their song find on Google Plus?

#

Good Enough for a Splunk Case Study!
"Splunk enables us to analyze all of our mobile app data without having to do batch processing or any other cost and time-intensive steps required of traditional business intelligence. Now we can change metrics or add new dashboards quickly and easily, and provide the latest results to our partners and internal stakeholders in real time. That just wasn't possible before."

Charles Henrich, EVP Engineering, Shazam

#

Problems?
  • The very first iteration was architected sub-optimally for performance
  • Not enough storage. The business wanted *ALL* the data for *ALL* time stored and searchable in Splunk!
  • Hundreds of reports and dashboards were created and the system started to become bogged down. Our monthly active user report would take >1 week to run!
  • Splunk admins moved on to other projects and the system was left to run quietly in the corner
  • We often exceeded the license limit

#

We solved this by
  • Assigning the right people to support the platform
  • Scaling out and clustering our indexer nodes
  • Using expired hardware. We had 600 unused (and out-of-warranty) servers to choose from. It shouldn't matter if a node fails, as we can quickly re-provision and use Puppet to reconfigure. Savings of >80% compared with buying new/supported servers.
  • Shrinking the incoming data using the SEDCMD facility
  • Buying a bigger license!
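Splunk's SEDCMD is a props.conf setting that applies a sed-style substitution to events at index time. The Python sketch below only imitates that effect with a regular expression, to show how stripping a bulky field shrinks each event. The sample event, its field names and the pattern are all invented for illustration; the talk does not say which fields Shazam actually trimmed.

```python
import re

# Hypothetical beacon event; the field names are invented for illustration.
raw_event = (
    '2016-05-10T09:45:00Z event=tag device=abc123 '
    'debug_payload="lorem ipsum dolor sit amet" country=GB'
)

# Comparable in spirit to a sed expression such as
#   s/ debug_payload="[^"]*"//g
DROP_DEBUG_PAYLOAD = re.compile(r' debug_payload="[^"]*"')


def shrink(event):
    """Remove the (assumed) bulky debug field before the event is stored."""
    return DROP_DEBUG_PAYLOAD.sub("", event)


trimmed = shrink(raw_event)
print(trimmed)
print(f"{len(raw_event)} -> {len(trimmed)} characters")
```

Because ingestion is metered by volume, removing fields nobody searches on reduces both storage and license usage.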

#

ALL the Data for ALL the time!
But we still couldn't store ALL the data for ALL the time in Splunk.

So we ended up accessing long-term data that we stored in Amazon Redshift using Splunk DB Connect.

#

Rebirth!
  • July 2015
  • Version 6.22!
  • Bigger! Faster! Better!
  • OS and other system logs now being ingested
  • DevOps insight screens being developed
  • Ingesting more and more logs from other systems as users get excited by the possibilities

#

New Uses: Heart Rate Monitor

Sparklines show correlated Nagios alerts!

In this situation we were able to identify a service that flooded the network once an hour, which caused a common switch to drop packets

#

New Uses: What happened overnight?

#

#

#

#

New Uses: MAU and Capacity Predictions

A new release caused our 4-hourly tagging rate in a strategic developing market to jump. The Splunk predict command shows the predicted impact of this change
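To make the idea of projecting a tagging rate forward concrete, here is a minimal Python sketch. It deliberately swaps in a plain least-squares trend line, which is not what the predict command does internally (predict offers its own forecasting algorithms). The tag counts are invented sample values.

```python
def linear_forecast(values, steps_ahead):
    """Extrapolate a least-squares straight line fitted to `values`.

    Observations are assumed to be evenly spaced (here, one per 4 hours).
    Returns the fitted values for the next `steps_ahead` intervals.
    """
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations")
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return [intercept + slope * (n + k) for k in range(steps_ahead)]


# Invented 4-hourly tag counts following the jump after a release.
tags_per_4h = [100, 110, 120, 130, 140, 150]
forecast = linear_forecast(tags_per_4h, 3)
print(forecast)
```

A straight line ignores daily and weekly seasonality, so it is only a rough stand-in for a real forecast of tagging traffic.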

#

Can Splunk predict the next #1 hit?
Fun question to ask!

One of our non-Splunk systems can predict the number one hit on the Billboard charts up to 33 days out. There is a great video which explains this phenomenon: https://www.youtube.com/watch?v=mcTPvxo8SXY

So far we have been unsuccessful in using Splunk Predictive Analytics on the music charts

#

New Use Case: Inventory
(Q) How do we get a single view of the hardware we are charged for? Our data sources are:
  • Our internal inventory database, accessed via REST API
  • A once-a-month Excel spreadsheet from our external data center provider
  • Amazon AWS

(A) Splunk! With the following components:
  • Splunk AWS App
  • REST API Modular Input
  • inputcsv command!
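The "single view" idea can be sketched outside Splunk as a merge of three host lists keyed by hostname. The Python below is only an illustration of that join, not Shazam's implementation: the `hostname` column, the source labels and every sample host are hypothetical.

```python
import csv
import io


def parse_provider_csv(text):
    """Parse the provider's export; the 'hostname' column is an assumed name."""
    return [row["hostname"] for row in csv.DictReader(io.StringIO(text))]


def single_view(internal_hosts, provider_hosts, cloud_hosts):
    """Map each hostname to the sorted list of sources that report it."""
    view = {}
    labelled = (
        ("internal_db", internal_hosts),
        ("datacenter_csv", provider_hosts),
        ("aws", cloud_hosts),
    )
    for source, hosts in labelled:
        for host in hosts:
            view.setdefault(host, set()).add(source)
    return {host: sorted(sources) for host, sources in sorted(view.items())}


# Invented sample data standing in for the three real sources.
provider_csv = "hostname,rack\nidx01,A1\nidx02,A2\nold99,Z9\n"
inventory = single_view(
    internal_hosts=["idx01", "idx02", "web01"],
    provider_hosts=parse_provider_csv(provider_csv),
    cloud_hosts=["web01", "batch01"],
)

# Hosts that someone bills for but the internal database does not know about.
billed_but_untracked = [
    host for host, sources in inventory.items() if "internal_db" not in sources
]
print(billed_but_untracked)
```

Listing which sources report each host makes mismatches visible, such as hardware that is billed but missing from the internal inventory.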

#

New Use: Animated Map???

Analysis of Top 10 Music Tracks over a 24 hour period


#

Future Use: DevOps, Anomaly Detection
Key objective is to release better code, quicker, by integrating Git, JIRA, Jenkins, Puppet, virtualisation and container logs into Splunk.

We have known unknowns and sometimes unknown unknowns. Anomaly detection should help us identify these?!

#

THANK YOU
Use Shazam to fill out a survey on this talk