Chris Kurtz Splunk Conf2014 Presentation


  • 7/26/2019 Chris Kurtz Splunk Conf2014 Presentation

    1/44

    Chris Kurtz

    System Architect, Arizona State University

    Using SplunkProtect Stud

    Faculty and University

    Disclaimer

    During the course of this presentation, we may make forward looking statements regarding fu

    expected performance of the company. We caution you that such statements reflect our curre

    estimates based on factors currently known to us and that actual events or results could differ ma

    factors that may cause actual results to differ from those contained in our forward-looking state

    our filings with the SEC. The forward-looking statements made in this presentation are being

    and date of its live presentation. If reviewed after its live presentation, this presentation may no

    accurate information. We do not assume any obligation to update any forward looking stateme

    addition, any information about our roadmap outlines our general product direction and is subj

    time without notice. It is for informational purposes only and shall not be incorporated into an

    commitment. Splunk undertakes no obligation either to develop the features or functionality de

    any such feature or functionality in a future release.

    Additional Speaker Disclaimer: While I am speaking as an employee of Arizona State University,

    University nor dictate policy, procedures, or purchases. Any and all statements made in this pre

    alone, and do not in any way represent an official statement from ASU. The opinions and comme

    are entirely my own. ASU does not endorse or represent any product mentioned, up to and

    Agenda

    Introduction to me and Arizona State University
      About ASU
      About me

    Our Environment and our challenges

    Use Cases and Examples
      Protecting Direct Deposit, two versions
      Phishing as a teaching tool
      Leveraging your institutional data with lookups and apps

    Conclusion: Where we've been, where we're going!


    Introduction


    Largest single University in the US

    More than 80,000 active students

    and another 20,000 accounts (faculty/staff, alumni, affili

    Located in Tempe, Arizona, suburb of Phoenix, 6th Largest

    Not located on the surface of the sun... but you can see it f

    Obligatory About Me: Professional

    Unix/Linux System Administrator by trade, 23 years exper

    Supported NASA/JPL Mars projects at ASU for more than 1

    TES & THEMIS Instrument onboard Mars Global Surveyor & Mars

    MTES Instrument on the Mars Exploration Rovers Spirit and Opp

    ASU's Splunk Guy (System Architect) since early 2013

    Splunk Video Interview Value of Higher Education and Sp

    Author of the ISO 3166 Splunk App (more on this later!)

    Obligatory about me: Personal

    Self-proclaimed Geek, what's it to ya?

    Steampunk Enthusiast (I made my hat, goggles, and the gun!)

    Beginning Maker (Steampunk and Arduino/Electronics)

    xoff on #splunk on efnet

    Little known fact about me: Clyde Tombaugh, the discoverer of Pluto, was a personal friend growing up

    http://about.me/chk


    First Google Apps for Education customer

    Multiple campuses with a diverse IT infrastructure

    Many organic, home-grown, custom, and proprietary syst

    Large number of governing requirements:

    FERPA, HIPAA, DARPA, DoJ, NASA, JPL, etc.

    Clear separation of responsibilities inside the University Te

    Office: the Information Security Office (ISO) does not have

    the systems (and more importantly the logs) run by Opera

    The Power of Splunk

    Splunk as ASU's universal aggregator of all machine gener

    Without Splunk

    Logs reside in multiple locations, depending on when and where the

    installed: web logs in one location, system logs multiple others (depe

    OS); some are on single log concentrator and some in an old, slow, unsupported proprietary search database. ISO requests logs for inc

    has to use the proprietary tools (or just as often, just grep through m

    logfiles) based on ISO description and email/share logs. ISO likely ha

    request at least once.

    Typical response time to incident: multiple business days

    With Splunk

    ISO directly accesses logs in Splunk, often using pre-built das

    alerts, and saved searches. Ops can concentrate on Operatio

    Typical response time to incident: minutes!

    Splunk and Arizona State University

    Licensing

    750 GB/day

    Started at 50 GB in Novem

    to 150 GB in February 20

    to 500 GB in June 2013

    to 750 GB in July of 2014

    On track to reach 1 TB this

    Infrastructure

    Physical Indexers in Cluster
      ~14 TB in Hardware RAID10
      NFS for Cold (being phased out)
      Architected for 1 TB (10 indexers)

    Search Head Pooling
      3 virtual servers (12 CPUs, 32 GB)
      NFS SSD storage for shared data

    Virtual Support servers:
      Deployment Server
      License Manager
      Cluster Master

    The value of Splunk to the Informatio

    has driven the rapid grow

    but other groups are starting to se

    We didn't know

    To ASU, Splunk was like the invention o

    microscope: we didn't know what we cou

    see.

    Martin Idaszak

    Security Architect, Arizona State University

    Protecting Direct Deposit

    Use Case: Protecting Direct Deposit

    Being able to change your employee

    information online is a great convenience, but a target for hackers

    Because of ASU's international students, faculty, and staff, just blocking other countries isn't acceptable

    Splunk is the solution!

    How we did it... before Splunk

    1. Payroll gets a call that an employee didn't get their direc

    2. investigates, sees a foreign bank deposit and contacts

    Information Security Office.

    3. ISO changes the user's password.

    4. ISO requests webserver single sign-on and HR system lo

    Operations and our HR Vendor (could take days!)

    5. Eventually details are discovered (compromised account

    is informed. Funds are long gone, and ASU has to re-issu

    employee's check, eating the loss.

    How we did it with Splunk, Version 1

    1. Logs from webserver single sign-on and PeopleSoft now go to Splunk. No mo

    Operations to retrieve logs! This makes both ISO and Ops very happy!

    2. Splunk monitors for Direct Deposit changes via a scheduled search, building a

    link the change back to the user's webserver authentication. OK, now we hav

    IP and a username, so we run geolocation on the originating IP so it's easier

    reports based on location of the change.

    [Diagram: Web auth (IP, username) + DB records (user, address) + Geo tag (country) -> IP, username, state/country]

    Version 1 stop here:

    ISO creates a scheduled report of unusual originating IPs (Malaysia, etc

    Payroll before the close of each payroll run.

    Payroll contacts users with unusual changes for verification before pay

    it was a fraudulent change, the change is reverted, so no funds are lost

    Even at this point, Payroll is ecstatic and saves over 30 hours per payro

    direct deposit, and ASU saves tens of thousands of dollars per payroll r
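The Version 1 flow (link each direct deposit change to the user's web authentication, geolocate the originating IP, report the unusual ones) might be sketched roughly as follows. This is a minimal illustration in Python, not ASU's actual Splunk search: the record types, the `ip_to_country` mapping, and the set of "expected" countries are all assumptions made for the example, and the IPs are documentation addresses.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AuthEvent:
    """One web single sign-on record (hypothetical shape)."""
    username: str
    src_ip: str


@dataclass(frozen=True)
class DepositChange:
    """One direct deposit change from the HR system (hypothetical shape)."""
    username: str


def flag_unusual_changes(changes, auth_events, ip_to_country,
                         expected=frozenset({"US"})):
    """Return (username, ip, country) for changes made from unexpected countries."""
    # Keep the most recent authentication IP seen for each user.
    latest_ip = {}
    for event in auth_events:
        latest_ip[event.username] = event.src_ip

    report = []
    for change in changes:
        ip = latest_ip.get(change.username)
        # Unknown IPs are reported too, so a human can look at them.
        country = ip_to_country.get(ip, "unknown")
        if country not in expected:
            report.append((change.username, ip, country))
    return report
```

The output is only a list for Payroll to verify by contacting the user. Nothing is blocked automatically, which matches the point that blocking whole countries is not acceptable.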

    Now... How do we improve this

    We asked the question: Where do you change your direct depo

    1. Home

    2. Work

    So, let's think about it:

    If your direct deposit changes from Malaysia, it's probably

    but what about Ohio, if you live in Arizona?

    That's likely fraud, too!

    So let's leverage Splunk's geolocation features!

    Version 2 (now in progress)

    1. Starting with the originating IP and username from Version 1

    custom lookup tables (more later!) to leverage HR system dat

    look up a username's information: Name, address, etc.

    2. Geolocation information about the user's home zip code (via code) is generated.

    3. Using a free Splunk App called haversine, we calculate the dis

    between the user's home (technically, the lat/lon of the cente

    zipcode) and the lat/lon of the IP the change was made from.

    both of these are a bit vague, but we're really only looking for

    4. If the distance is unusual (~50 miles) the result will be flagged

    review automatically.
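The distance check in steps 3 and 4 rests on the haversine (great-circle) formula. The sketch below shows that calculation in Python. It is not the code of the Splunk haversine app; the function names, the Earth radius constant, and the way the 50-mile threshold is applied are assumptions for illustration.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius, an approximation


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two lat/lon points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def is_unusual(home, change_origin, threshold_miles=50.0):
    """True when the change came from farther away than the threshold.

    `home` and `change_origin` are (lat, lon) tuples, for example the centre
    of the home zip code and the geolocated IP of the change.
    """
    return haversine_miles(*home, *change_origin) > threshold_miles
```

Both inputs are coarse (a zip code centroid and an IP geolocation), so the threshold only needs to separate "same metro area" from "somewhere else entirely".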

    Lessons learned... and you can do th

    1. GET YOUR DATA INTO SPLUNK!

    2. One of the beautiful things about Splunk is that you can modify how the

    (field extractions, etc.) once it's already in Splunk, and that applies to alr

    data. The focus should be getting it into Splunk first, and figure out fields

    it as schema on demand!

    3. When you find people who "get it", use them to evangelize Splunk to oth

    organization.

    4. When you find people who resist, show them how much time and effort

    especially interacting with other departments (if appropriate) by using Sp

    several people over when they discovered that the number of requests f

    like ISO dropped from 3-5 per week (each taking hours to do) to zero on

    was in Splunk.

    5. Don't get caught up on use cases: Once you have the data in Splunk, u

    present themselves again and again! Think of it as use case on demand!

    Flexibility

    It's not only its schema-on-the-fly,

    it's use-case-on-the-fly.

    - Barak Reeves

    Splunk Sales Engineer, Team TK-421

    Phishing as a teaching tool

    Use Case: Phishing as a teaching tool

    As a public University, a large amount of our information is mandated to be publicly available, including a directory of email addresses... and we have over 100,000 users, and each can have as many email addresses as they want

    This means ASU receives a lot of email: In fact, we used Splunk to determine exactly how much. In the last 12 months, ASU received more than ONE BILLION email messages, and more than 750 million of them were spam and phishing!

    As usual... Splunk is the solution!

    Phishing and ASU

    [Diagram: Inbound phishing email -> Mail filter -> Email stored -> User clicks on phishing link -> Firewall (firewall blocks some, some gets through)]

    ASU is hard to protect

    ASU, as an entity, is very hard to protect. We have students f

    across the world, and by their nature, they are very transien

    move apartments, dorms, travel the US and abroad, and acc

    systems from almost everywhere. Unlike most corporations,

    assume that access to ASU from Nigeria, China, or Malaysia

    attempts... in fact, it's probably legitimate!

    One of the very first things we saw with Splunk were logins on cam

    India for the same user on the same day. What was this? Hacking?

    people using the same login? Turns out Indian students often gave t

    (gasp!) to their parents, who insisted on it, so the parents could regu

    grades! This let another project to provide limited access to second

    (just for this purpose) know that their efforts were valid and n

    Use the data you have

    To protect ASU from spam, we use Barracuda Spam & Virus Firewalls, but there is no Splunk app (yet) so we make custom field extractions from the Barracuda logs.

    ...but ASU does not store user emails in Splunk, only the headers of the messages that transit our system.

    Do managers

    if a product is

    We regularly

    show that ot

    are doing t

    Seems legit?

    Phishing and ASU

    Correlate Firewall information w

    to get a list of every user who c

    phishing link.

    [Diagram: Firewall log (IP, Bad URL) + Email log (user, Email with link) -> Table of u; CMDB for contact]

    and let your data combine!

    BUT... ASU also uses Palo Alto firewalls to protect our users. These firewalls very often ca

    that users click on, either via mistake or lack of understanding and we correlate that P

    with our mail logs to get a list of every user who clicked on a phishing link.

    The ISO can then directly contact the users who clicked on a phishing link, explain to the

    change their password (and probably run a virus/malware scan), and use the opportuni

    user why what they did was bad. The users are thankful that the University is watching o

    some of the potential victims have become our best reporting sources for received phis

    phishing emails!

    This too is being automated! We plan to use workflows to allow ISO to easil

    potentially compromised account in Splunk, which (via a REST API call to ou

    system) is automatically disabled and (via another REST API) a ticket is creat

    helpdesk, so they can explain the situation to the user when they call in bec

    password no longer works.
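The firewall-to-mail correlation described above can be pictured as a join between two record sets. The Python sketch below is only an illustration of that idea. It assumes (this is not stated in the slides) that each firewall record has already been attributed to a username and that the mail records carry the URL found in each message; the dictionary keys are hypothetical.

```python
def users_who_clicked(firewall_hits, mail_records):
    """Map each user to the phishing URLs they both received and requested.

    firewall_hits: iterable of {"user": ..., "url": ...} for flagged requests
    mail_records:  iterable of {"user": ..., "url": ...} for delivered emails
    (both shapes are assumptions made for this example)
    """
    # Every (recipient, link) pair that was delivered by the mail system.
    delivered = {(m["user"], m["url"]) for m in mail_records}

    clicked = {}
    for hit in firewall_hits:
        key = (hit["user"], hit["url"])
        if key in delivered:
            clicked.setdefault(hit["user"], set()).add(hit["url"])
    return clicked
```

The resulting table is what the ISO would work from when contacting users, with contact details coming from a separate source such as the CMDB.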

    Version 2 (now in progress)

    1. ISO actively follows phishing links (from a secure and isolated Vir

    Machine) and enters bogus credentials. We are now using Splunk

    attempted logins using those honeypot credentials. These active

    then blocked on the Palo Alto Firewalls in a quick but manual pro

    protects users who might click on the phishing. Eventually, we pla

    automate this using Splunk workflows that let ISO directly block s

    different types of attackers from Splunk, using the Palo Alto's API

    2. ASU is investigating using honeypot full email accounts that will b

    from the public directory and then sent spam/phishing attempts

    users. The plan is to use Splunk to index the entire email, so we w

    full body of phishing and spam emails as well as headers. Phishin

    identified would be blocked using a workflow to the Palo Alto AP

    and the from addresses would be blocked on the Barracudas wit
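The honeypot-credential idea in item 1 comes down to a simple membership test: any login attempt that uses a username ISO planted on a phishing page must come from the attacker. A minimal Python sketch follows, with hypothetical record keys and honeypot usernames. It stops at producing the list of source IPs, leaving the actual block (described above as a manual step today) out of scope.

```python
def attacker_ips(login_attempts, honeypot_usernames):
    """Return the sorted, de-duplicated source IPs that tried honeypot credentials.

    login_attempts: iterable of {"username": ..., "src_ip": ...}
    honeypot_usernames: set of bogus usernames entered on phishing pages
    (both shapes are assumptions made for this example)
    """
    return sorted({
        attempt["src_ip"]
        for attempt in login_attempts
        if attempt["username"] in honeypot_usernames
    })
```

Because no real user ever owns a honeypot username, this check should produce essentially no false positives, which is what makes it a good candidate for later automation.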

    Lessons learned... and you can do th

    1. LEVERAGE YOUR DATA!

    2. Combining data from multiple sources is amazing! We use dat

    Barracuda Spam Firewalls as well as the Palo Alto Firewalls to

    multiple points of visibility into phishing.

    3. Standardize your data! Follow Splunk's Common Information

    that field names are consistent across data types. Once you re

    src_ip, for example, exists in multiple datasets, the possibilitie

    out at you!

    4. Fill in the gaps. When you find gaps in your data models, work

    fill them in. For us, it's the honeypot registrations and full-em

    Once we realized full-email indexing was possible (and easy!)

    new use cases appeared!
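Lesson 3 (consistent field names such as src_ip across data types) is what makes joins between sources possible. In Splunk this is normally done with field aliases and extractions; the Python sketch below only illustrates the principle. The source labels and the vendor-side field names in `FIELD_ALIASES` are invented for the example and are not the real Barracuda or Palo Alto field names.

```python
# Hypothetical per-source aliases: vendor field name -> common field name.
FIELD_ALIASES = {
    "mail_filter": {"client_ip": "src_ip", "rcpt": "user"},
    "firewall": {"source_address": "src_ip", "srcuser": "user"},
}


def normalize(event, source):
    """Return a copy of `event` with vendor-specific field names renamed.

    Fields without an alias are passed through unchanged, and unknown
    sources are returned as-is.
    """
    aliases = FIELD_ALIASES.get(source, {})
    return {aliases.get(name, name): value for name, value in event.items()}
```

Once both sources expose `src_ip` and `user`, the same correlation logic can be reused across them without per-source special cases.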

    Value of Splunk

    This is the best tool we've seen in 10 ye

    - Jay Steed

    AVP for UTO Operations, Arizona State Uni

    Leveraging your custom data

    The Power of Splunk!

    No schemas! This means if you need to alter your data structure (fi

    extractions, calculated fields, etc.) you can easily do it on the fly, an

    retroactive!

    No types! Splunk really doesn't care if 42 is a string or a number, divide 42 by 7 and get 6, or add a string to make it 42 is the answe

    easily to modify a field or make a new one on the fly.

    Eval is your friend!

    Remember... It doesn't matter if data is from a logfile, database, tex

    output, or anything else... combine it in any way you want, on the fl

    Why mention this? Because as a Splunk Admin always remember: the data structu

    If it doesn't work for your needs, change it on the fly!

    To correlate data, you have to have data to correlate

    Having data from machine logs such as mailservers and fir

    great, it's the first (and easiest) data to get into Splunk.

    Without a common key, there is no way to know that two

    data refer to the same individual.

    For ASU, the master datasource is the Data Warehouse. T

    databases contain the records for every student and empl

    Does the email [email protected] belong to John Bun

    Lookups from Databases

    Isolated Splunk server running Database Connect (DBX) runs SQL Q

    several databases, and writes a series of lookup tables (with the a

    every 4 hours

    Linux inotify monitors the lookup tables, and on write-close copi

    production systems (sanity checking applies)
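The "sanity checking" before a freshly written lookup table is copied to production is not detailed in the slides. One plausible shape for it, sketched in Python under stated assumptions: check that the CSV has the expected columns and a minimum number of rows, then copy it into place atomically so production never reads a half-written file. The function names, the required-column list, and the row threshold are all hypothetical.

```python
import csv
import os
import shutil
import tempfile


def lookup_is_sane(path, required_columns, min_rows):
    """True if the CSV has every required column and at least `min_rows` data rows."""
    with open(path, newline="") as fh:
        reader = csv.DictReader(fh)
        if reader.fieldnames is None:
            return False  # empty file, no header at all
        if not set(required_columns) <= set(reader.fieldnames):
            return False
        rows = sum(1 for _ in reader)
    return rows >= min_rows


def publish_lookup(src, dest, required_columns, min_rows=1):
    """Copy `src` over `dest` only if it passes the sanity check.

    The copy goes to a temporary file in the destination directory first and
    is then renamed, so readers see either the old table or the new one.
    """
    if not lookup_is_sane(src, required_columns, min_rows):
        return False
    dest_dir = os.path.dirname(os.path.abspath(dest))
    fd, tmp = tempfile.mkstemp(dir=dest_dir, suffix=".tmp")
    os.close(fd)
    shutil.copyfile(src, tmp)
    os.replace(tmp, dest)
    return True
```

A check like this would guard against the failure mode where a database query returns nothing and an empty lookup silently replaces a good one.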

    [Diagram: Data Warehouse -> Isolated Splunk running DBX -> Production Sp]

    100000001, jbunbury7, John Bunbury, [email protected],

    100000002, jbunbury, Jane Bunbury, [email protected], em

    Problem is

    Splunk (and most other applications) use the ISO 3166 standard alpha-2 coun

    United States, for example). This is standard for geolocation services in Splunk.

    But... our Oracle Databases for Student data get the data from the students, oft

    passports. And machine-readable passports use the ISO 3166 alpha-3 country

    there isn't a simple conversion!

    If the Country Code is not in the standard geolocation format, I can't do any ge

    means the data is far less useful.

    I looked on the Splunk Apps site (http://apps.splunk.com) but didn't find a solu

    Country        alpha-3  alpha-2
    United States  USA      US
    China          CHN      CN
    Nigeria        NGA      NG

    So, I wrote the app myself!

    Very simple structure, but so useful!

    I took the online ISO 3166 country codes (3 kinds: alpha-3, alpha-2, and numeric) and built a lookup table, which I call in the dbquery search before outputting the lookup table

    Loo

    alpha-2,al

    US,USA,84

    CN,CHN,1

    NG,NGA,5

    | dbquery "PS PRD" "SELECT EMPLID,CITY,STATE,POSTAL,COUNTRY_CODE FROM EDS_

    EMPLID CITY STATE POSTAL COUNTRY_CODE | lookup iso3166 iso3166_alpha-3 as CO

    | eval city=upper(substr(CITY,1,1)).lower(substr(CITY,2)) | rename STATE as region_na

    affiliate_id POSTAL as postal_code iso3166_alpha-2 as country_code | eval

    postal_code=if(country_code="US",substr(postal_code,1,5),postal_code) | table affiliate_id,city,region_name,postal_code,country_code | outputlookup affiliate_to_a
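For readers less familiar with SPL, the per-row transformation in the search above can be mirrored in Python. This is an illustrative re-statement, not part of the app: the three-entry mapping stands in for the full ISO 3166 lookup table, and the EMPLID to affiliate_id rename is inferred from the final table command because the search text is cut off at that point.

```python
# Stand-in for the ISO 3166 lookup table (only the three rows shown on the slide).
ALPHA3_TO_ALPHA2 = {"USA": "US", "CHN": "CN", "NGA": "NG"}


def normalize_address(row):
    """Apply the same steps as the search: lookup, eval city, rename, trim postal code."""
    country = ALPHA3_TO_ALPHA2.get(row["COUNTRY_CODE"])  # None if not in the table
    city = row["CITY"]
    postal = row["POSTAL"]
    if country == "US":
        postal = postal[:5]  # same as substr(postal_code,1,5): drop any ZIP+4 suffix
    return {
        "affiliate_id": row["EMPLID"],  # inferred rename, see note above
        "city": city[:1].upper() + city[1:].lower(),  # upper(first char) . lower(rest)
        "region_name": row["STATE"],
        "postal_code": postal,
        "country_code": country,
    }
```

As in the SPL, only the first character of the city is capitalised, so a value like "NEW YORK" becomes "New york".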

    Why bother publishing as an app?

    Because it might be useful to someone else, and at least 2 people have now

    Wow, thanks, that solves my problem!

    Building an App is simple!

    1. In etc/apps, create a directory for your app, with appropriate (default is mandatory)

    2. All config files go in default, nothing in local!

    3. Write an appropriate default/app.conf (look at other apps)

    4. Create a README file and other appropriate documentation.

    5. Package and test on a generic Splunk install for sanity (hint: .sp

    just tgz files!)

    6. Upload to apps.splunk.com. If something isn't right, it'll let y

    7. Make sure to put the docs online!
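Since an app package is just a gzipped tarball, the packaging step can be scripted. The Python sketch below checks the two rules from the list (default/app.conf must exist, nothing in local) and then writes the archive. The function name and the choice to reject a local directory outright are assumptions for illustration, and the upload site may apply further checks not covered here.

```python
import os
import tarfile


def package_app(app_dir, out_path):
    """Write `app_dir` to `out_path` as a gzipped tar with the app folder at the top level."""
    app_dir = os.path.abspath(app_dir)
    if not os.path.isfile(os.path.join(app_dir, "default", "app.conf")):
        raise FileNotFoundError("default/app.conf is missing")
    if os.path.isdir(os.path.join(app_dir, "local")):
        raise ValueError("move settings from local/ to default/ before packaging")
    # Write the archive outside app_dir so it does not end up inside itself.
    with tarfile.open(out_path, "w:gz") as tar:
        tar.add(app_dir, arcname=os.path.basename(app_dir))
    return out_path
```

A call such as `package_app("etc/apps/myapp", "/tmp/myapp.spl")` (paths hypothetical) would produce a file that can be installed on a clean Splunk instance for the sanity test in step 5.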

    http://wiki.splunk.com/Community:Creating_your_first_appli

    My app took me about a day to do, including an obsessive amount of research on

    #splunk

    It is days like today when I am stuck with of crappy software with horrible documen

    and support that I am very thankful that I

    the rest of my time dealing with Splun

    - David Shpritz (automine) Splunk IRC cha


    Conclusion

    The past and the future

    ASU has heavily invested in Splunk because it solves many

    outstanding issues, and a culture of "how can we use Splu

    this?" is developing.

    First round (FY14) of data onboarding concentrated on the

    the Information Security Office. Second round (FY15) is fo

    Operations needs, with some interesting use cases thrown

    appear.

    Splunk is expensive, but the savings in man hours, extrem

    use to validate other systems, and goals to replace antiqu

    systems has very much paid off.

    Get some help!

    Splunk Docs (http://docs.splunk.com): I use Splunk docs so much I have a

    shortcut to just search it. And if you do occasionally find something that is the links at the bottom to provide feedback... the team is great at respondin

    Splunk Answers (http://answers.splunk.com): I always look (and often pos

    here before I contact support. Just looking at what others are posting is oft

    you need to rephrase the question to find the answers you need. The users

    answers are the true heroes of Splunk. In fact there is only one group bette

    The Splunk Wiki, specifically http://wiki.splunk.com/Things_I_wish_I_kne

    The #splunk IRC channel on efnet (http://wiki.splunk.com/Community:IRC

    it, I'm a Splunk IRC junkie. This group is just the best... a great mix of Splunk

    Splunk employees), customers, and professional services and hysterical to the crew: Piebob, cgales, ^Brian^, DaGryph, Coccyx, amrit, Duckfez, Yoroko

    automine, starcher, jtrucks, and even Trex (a fellow ASUer).

    Also check out @splunk, @splunkdev, and @splunkanswers on Twitter!


    I look to the future because that's where

    going to spend the rest of my life.

    - George Burns

    Questions and mentioned link

    My Splunk App to do ISO 3166 translations:

    http://apps.splunk.com/app/1775/

    Free Splunk App to calculate distances on a globe (a Grea

    haversine calculation):

    http://apps.splunk.com/app/936/

    My Splunk Video:

    http://www.splunk.com/view/SP-CAAAJPW


    THANK