
AWS re:Invent 2016: Cyber Resiliency – surviving the breach (SAC321)


© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved.

Misha Govshteyn – Founder and Chief Security Officer | Alert Logic

Sven Skoog – Senior Manager IT Security | Monotype

November 29, 2016

SAC321

Cyber Resiliency: Surviving the Breach

What to expect from the session

1. Before the breach
   1. Macro trends
   2. Theory vs. practice
      1. Sample attacks
      2. “Blast radius” and compartmentalization
      3. Collect investigation data
      4. Example
      5. Collecting ammunition
2. After the breach
   1. Moving too fast
   2. Theory vs. practice
      1. Constructing a (potential) case file
      2. Before you act, ask yourself
      3. Downside of uncertainty
      4. An unflinching game plan

BEFORE THE BREACH

Significant increase in web app attacks

(chart: web application attack volume, 2015 vs. 2016)

What these sample attacks look like

• SQL injections of varying seriousness
  (first crude floor(), rand() functions, then deeper concat)
• Progressive Elasticsearch traversals
  (telltale ‘_search?q=random.’ ‘match_all,’ Java methods)
• Omnipresent SSH brute force (where targets exposed)
  (unlikely to succeed in keyed locales, but see below)
• Occasional WordPress blog attacks; more on this later

Adopt design patterns optimized for resiliency

• Get involved at design stage
• Limit blast radius of the breach to constrain lateral movement
  - Limit use/reuse of credentials
  - Isolate applications
• Be prepared
  - Continuously snapshot configuration state
  - Configure IR-specific accounts or keys
  - Install IR tools in your base images and enable automated event collection
  - Make sure you understand your own logs
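The “continuously snapshot configuration state” idea can be approximated even before reaching for AWS Config or EBS snapshots: hash a canonical serialization of the state you care about and alert on digest changes. A minimal sketch in plain Python — the config dict and its keys are illustrative, not a real AWS schema:

```python
import hashlib
import json

def snapshot_config(config: dict) -> str:
    """Return a stable SHA-256 digest of a configuration state.

    Canonical JSON (sorted keys, no whitespace) makes the digest
    independent of dict ordering, so equal states hash equally.
    """
    canonical = json.dumps(config, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

# Hypothetical state: open ports and MFA posture for one app tier
baseline = snapshot_config({"sg": ["tcp/443"], "mfa": True})
drifted = snapshot_config({"sg": ["tcp/443", "tcp/22"], "mfa": True})

# Any change to the state changes the digest — that's the drift signal
assert baseline != drifted
```

Comparing the stored baseline digest against a fresh snapshot on a schedule gives a cheap drift alarm; the real snapshot content would come from your own `describe-*` API calls.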

Blast radius and compartmentalization

• Distributed accounts, regions, zones
  (Monotype currently has ~22 sub-business-units)
  (AWS role inheritance is an obvious best practice)
  (Keep root accounts truly “sacred,” hardware MFA)
• Augment via Layer 7 (simple Elastic Load Balancing or full WAF/WSM)
• Then the old classics: ACLs, security groups, even .htaccess
• Lastly, protections in the app/endpoint logic itself
  (custom throttling, asymmetric keys, chroot jail(?))
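Credential scoping is where much of the blast radius is actually set. A hypothetical IAM policy of this shape (the bucket name is a placeholder, not from the talk) confines a credential to one application’s storage, so a stolen key cannot roam laterally:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "ScopeToOneAppBucket",
      "Effect": "Allow",
      "Action": ["s3:GetObject", "s3:PutObject"],
      "Resource": "arn:aws:s3:::example-app-bucket/*"
    }
  ]
}
```

Because nothing else is allowed, compromise of this credential exposes one bucket, not the account.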

Collect investigation data before the breach

• Establish normal log patterns prior to breach
• Ensure logs are immutable – stored outside of your environment, unable to be deleted or modified by attackers
• You’re ready when your logs are:
  - Easily accessible
  - Searchable
  - Cloud aware
  - Continuously monitored

Sources: AWS CloudTrail, AWS Config, system and app logs, network telemetry
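“Establish normal log patterns prior to breach” can start as simply as counting event types over a known-good window and flagging anything outside that baseline. A toy sketch — the event shape is simplified from a real CloudTrail record:

```python
from collections import Counter

def baseline(events):
    """Count event types seen during a known-good period."""
    return Counter(e["eventName"] for e in events)

def unusual(events, base, min_seen=1):
    """Flag events whose type was never (or rarely) seen in the baseline."""
    return [e for e in events if base[e["eventName"]] < min_seen]

# Hypothetical known-good window: lots of reads, no IAM changes
normal = baseline([{"eventName": "DescribeInstances"}] * 50 +
                  [{"eventName": "GetObject"}] * 20)

new_activity = [{"eventName": "CreateUser"}, {"eventName": "GetObject"}]
print(unusual(new_activity, normal))  # [{'eventName': 'CreateUser'}]
```

Real baselining adds dimensions (source IP, principal, time of day), but the principle is the same: you cannot spot abnormal until you have measured normal.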

Example: mode of access and user agent

AWS CloudTrail
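CloudTrail records carry `sourceIPAddress` and `userAgent` fields, which is what makes the mode-of-access example possible: console, CLI, and SDK traffic each announce themselves in the user agent. A minimal parse — the record below is a hand-built simplification, not real CloudTrail output:

```python
import json

# Simplified CloudTrail record; field names match the real record format
record = json.loads("""{
  "eventName": "DescribeInstances",
  "sourceIPAddress": "198.51.100.7",
  "userAgent": "aws-cli/1.11.13 Python/2.7.12",
  "userIdentity": {"type": "IAMUser", "userName": "alice"}
}""")

# Mode of access is visible in the userAgent string
mode = "cli" if record["userAgent"].startswith("aws-cli") else "other"

print(record["userIdentity"]["userName"],
      record["sourceIPAddress"], mode)
```

A developer’s laptop suddenly calling APIs via an unfamiliar user agent from a new network is exactly the kind of anomaly this field surfaces.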

Collecting ammunition (1 of 2)

– Log everything (**everything**) in two or three places

(syslog, auth.log, MS Event/Sys/Security are a start)

(Apache/IIS webservers, Nginx, firewall/proxy too)

(Don’t forget meta usage… CloudTrail API/RBAC, etc.)

(Even DHCP, switch, VPN, if your storage can take it)

– Duplicate, triplicate, don’t punch, spindle, or mutilate

(At least two protected log copies, host-local, S3, other)

(If you choose not to encrypt logs, at least do checksum)

(Don’t skimp here, this is chain-of-custody stuff)
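The chain-of-custody point can be made concrete: record a digest for each protected log copy so you can later demonstrate the copies are identical and unmodified. A sketch using Python’s stdlib `hashlib` (the log line is a fabricated sample):

```python
import hashlib

def log_checksum(data: bytes) -> str:
    """SHA-256 digest recorded alongside each protected log copy."""
    return hashlib.sha256(data).hexdigest()

# Two of the "at least two protected log copies" from the slide
log_copy_a = b"Nov 29 10:00:01 web sshd[99]: Failed password for root\n"
log_copy_b = b"Nov 29 10:00:01 web sshd[99]: Failed password for root\n"

# Matching digests across independently stored copies support
# chain-of-custody claims; a mismatch means one copy was altered
assert log_checksum(log_copy_a) == log_checksum(log_copy_b)
```

Store the digests separately from the logs themselves (e.g., the S3 copy’s checksums on another system) so an attacker cannot tamper with both at once.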

Collecting ammunition (2 of 2)

– Event reconstruction: when in doubt, bored, curious

(Sometimes useful… source attribution, be it IP or regex)

(Even just seeing “last 3 came from $PROVIDER” helps)

– Other helpful trivia: timing, frequency, adjacent targets

(Is this simple casual probing, or more directed activity?)

(Don’t be afraid to Google… Have others seen this too?)

– Remember those casual WordPress attacks earlier?

(Turns out they were reused blog/email credentials)
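Source attribution of the kind described (“last 3 came from $PROVIDER”) often starts with nothing more than a regex over auth.log. A sketch — the log lines are fabricated samples in standard sshd format:

```python
import re
from collections import Counter

# Matches sshd failed-password lines, capturing user and source IP
FAILED = re.compile(r"Failed password for (?:invalid user )?(\S+) from (\S+)")

lines = [
    "Nov 29 10:00:01 web sshd[99]: Failed password for root from 203.0.113.5 port 22 ssh2",
    "Nov 29 10:00:03 web sshd[99]: Failed password for invalid user admin from 203.0.113.5 port 22 ssh2",
    "Nov 29 10:00:09 web sshd[99]: Failed password for root from 198.51.100.9 port 22 ssh2",
]

# Count failures per source IP; the top talker is your first lead
sources = Counter(m.group(2) for line in lines if (m := FAILED.search(line)))
print(sources.most_common(1))  # [('203.0.113.5', 2)]
```

Feeding those IPs into whois or reverse DNS answers the timing/frequency question on the slide: casual probing tends to be scattered, directed activity clusters on one provider.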

Going beyond logs into packet level detail

Reconstructing successful table enumeration by inspecting full HTTP sessions and response body*

* Requires IDS or full network packet capture tools

AFTER THE BREACH

Step 1: Cut the cord as soon as possible

well… maybe…

Actually, give it a minute or two

Downside of moving too fast

“There was a mistake made in the 24 hours after the attack,” James B. Comey Jr., the director of the F.B.I., told lawmakers at a hearing on the government’s attempt to force Apple to help “unlock” the iPhone.

F.B.I. personnel apparently believed that by resetting the iCloud password, they could get access to information stored on the iPhone. Instead, the change had the opposite effect – locking them out and eliminating other means of getting in.

Before you act, ask yourself:

• Do you really care about exposing the data on breached systems?
• What is your primary objective?
• Is there a downside to quietly observing the actions of the attacker?

Downside of uncertainty

(graphic: “Jump to Conclusions” mat — Start)

Constructing a (potential) case file

– Odds are you are not seeing the incident at “time zero”

– Go back, try to build a behavioral baseline or chronology

– Non-repudiation is important too (we *didn’t* do XYZ)

– Don’t be afraid to do your own open-source investigation

(Google, Reddit, Stack Overflow, even just userid/email)

– Keep in mind: You are Tom Cruise, this is Minority Report

(Act as if this subject has committed, or will soon commit, a crime)

(Build a proto-case-file, might become future evidence)

(Warning: be mindful of EU Directive 95/46/EC, EU Regulation 2016/679, etc.)
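Building the chronology mostly means normalizing timestamps and sorting; everything else in the case file hangs off that timeline. A minimal sketch with invented events:

```python
from datetime import datetime, timezone

# Hypothetical observations pulled from different log sources
events = [
    {"time": "2016-11-29T10:05:00Z", "what": "new IAM access key created"},
    {"time": "2016-11-28T23:40:00Z", "what": "first SQLi probe in web logs"},
    {"time": "2016-11-29T09:00:00Z", "what": "successful login from unfamiliar IP"},
]

def ts(event):
    """Parse the ISO-8601 UTC timestamp into a comparable datetime."""
    return datetime.strptime(event["time"],
                             "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)

timeline = sorted(events, key=ts)
for e in timeline:
    print(e["time"], "-", e["what"])
```

Sorting immediately shows the incident did not start at “time zero” when you noticed it: here the probing began the night before the key creation.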

What happens when developers are breached

Impact largely depends on environment and data

(diagram: deployment and management paths from a developer’s security group into development and production environments)

Before you act, ask yourself:

• Two-man rule (duty separation) for all but smallest shops
  (You’ll need to do this for a compliance framework later)
• No shared accounts, all individual creds, keep “root” sacred
• Move your enterprise into the era of MFA (non-SMS!)
• Trust but verify: peruse auth.log, syslog, CloudTrail, etc.
  (Particular focus… additions, creations, privilege edits)
• I contend 60% of this work is “learning the new normal”
  (Familiarity with top talkers, typical activity, patterns)
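The “additions, creations, privilege edits” focus can be automated against CloudTrail: the IAM event names below are real CloudTrail `eventName` values, though the filtering logic itself is just a sketch over simplified records:

```python
# Real CloudTrail eventNames for IAM additions/creations/privilege edits
PRIVILEGE_EVENTS = {
    "CreateUser", "CreateAccessKey", "AttachUserPolicy",
    "PutUserPolicy", "AddUserToGroup", "CreateLoginProfile",
}

def privilege_changes(records):
    """Keep only IAM events that grant or extend access."""
    return [r for r in records
            if r.get("eventSource") == "iam.amazonaws.com"
            and r.get("eventName") in PRIVILEGE_EVENTS]

sample = [
    {"eventSource": "iam.amazonaws.com", "eventName": "CreateAccessKey",
     "userIdentity": {"userName": "bob"}},
    {"eventSource": "ec2.amazonaws.com", "eventName": "DescribeInstances"},
]
print([r["eventName"] for r in privilege_changes(sample)])  # ['CreateAccessKey']
```

Running a filter like this daily against the previous day’s trail is a cheap way to operationalize the “trust but verify” bullet.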

Before you act, ask yourself:

• Assume that at some point you will be breached
• Operate from snapshots whenever you can
• If possible, observe your adversary without tipping them off to understand full extent of the breach and attacker intent
• Use cloud networking tools to isolate compromised infrastructure and orchestrate recovery efforts
• Run your incident response team through regular, unannounced drills

An unflinching game plan (1 of 3)

• Misha says “blast zone”; I say “shields around the Enterprise”
• You cannot stop it all; you will eventually be compromised

• Common threats (easily blocked)
• Deeper or directed threats (defensible with more resources)
• Threats infeasible to fully prevent (didn’t foresee, too expensive)

An unflinching game plan (2 of 3)

Must demonstrate, uphold pre/post-incident risk tolerance

(diagram: three defensive layers plus an extra layer of cloud countermeasures)

• Compartmentalization (firewalls (layer-3, layer-7), filtering, separate credentials)
• Sensors and instrumentation (network traffic, user/behavioral… see it, escalate it)
• Evidentiary chain-of-custody (logs, daily review)
• Extra cloud countermeasures

An unflinching game plan (3 of 3)

– Simplest test: If unsure, try your detection/filtration out

– Sven picks a casual SQL crawl or peculiar Jenkins entry

(Then ask “who did this, and when, for what purpose”)

(Do *not* accept “we don’t know” or “no logs available”)

(Incomplete, inconclusive results = clear indicator, dig in)

– Part of our job, as security champion, is stakeholder buy-in

(Fire drills establish operational tempo, raise seriousness)

– If you don’t do it, an auditor/underwriter will do it later

Remember to complete your evaluations!