
Reactive Cloud Security | AWS Public Sector Summit 2016


Page 1: Reactive Cloud Security | AWS Public Sector Summit 2016

© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved.

Ben Hagen, Cloud Security Operations @ Netflix

June 21, 2016

Reactive Cloud Security

Toward Self-Defending Cloud Environments

Page 2: Reactive Cloud Security | AWS Public Sector Summit 2016

Introductions, because they matter.

Page 3: Reactive Cloud Security | AWS Public Sector Summit 2016

Me

● Bachelor’s in Political Science, International Studies, Minor in Mandarin Chinese

● Master’s in Information Assurance

● Security Operations Center at Motorola

● Consulting at Motorola and Neohapsis

● Security at Obama 2012

● Security Operations at Netflix

Page 4: Reactive Cloud Security | AWS Public Sector Summit 2016

Netflix

● 81+ million members

● Supporting 1,000+ device types

● Available in every* country

● Concurrent delivery from 3 global regions

● > 1/3 of all US broadband

● 1,000+ developers / 1,000s of applications

● A very large monthly AWS bill

● High elasticity

Page 5: Reactive Cloud Security | AWS Public Sector Summit 2016

Netflix

● Application owners “own” their own DevOps

● Immutable server pattern

● Everything scales

● The average TTL of an instance is < 3 days

Page 6: Reactive Cloud Security | AWS Public Sector Summit 2016

Security @ Netflix

● A paved road

● Enablers not blockers

● Application owners “own” their security; Security teams help them make the right choices

● ❤️❤️ Self-service, automation, and architecture ❤️❤️

Page 7: Reactive Cloud Security | AWS Public Sector Summit 2016

Let’s talk about reactive cloud security

Page 8: Reactive Cloud Security | AWS Public Sector Summit 2016

The old model

● A network firewall blocks traffic

● An intrusion prevention system blocks traffic

● A web application firewall blocks traffic

● Authentication/authorization blocks access

Page 9: Reactive Cloud Security | AWS Public Sector Summit 2016

Block, block, block, block ...

Page 10: Reactive Cloud Security | AWS Public Sector Summit 2016

We can do better.

Page 11: Reactive Cloud Security | AWS Public Sector Summit 2016

What is Reactive Cloud Security?

● Environments should be architected for change

● Security models should understand and leverage these changes

● Reactive Cloud Security should ...

  • Understand the context of events within your environment

  • Automatically adapt the environment based on security conditions

Page 12: Reactive Cloud Security | AWS Public Sector Summit 2016

That sounds great. What are some examples?

Page 13: Reactive Cloud Security | AWS Public Sector Summit 2016

Environmental changes

● Scale an Auto Scaling group

● Modify security groups

● Adjust AWS Identity and Access Management (IAM) object privileges

● Turn on/off logging

● Isolate a system

● Tag a system

● Redeploy a system

● Shift traffic

● ...
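As a rough sketch of what two of these changes ("tag a system" and "isolate a system") might look like in code: the function below takes an EC2 client shaped like boto3's and swaps an instance onto a quarantine security group. The group ID, the tag key, and the function name are hypothetical and not from the talk; `create_tags` and `modify_instance_attribute` are the boto3 EC2 calls this sketch assumes.

```python
from typing import Any

# Hypothetical: a pre-created security group with no inbound or outbound rules.
QUARANTINE_SG = "sg-00000000"


def isolate_instance(ec2: Any, instance_id: str, reason: str) -> None:
    """Tag the instance for investigators, then replace its security groups."""
    # Tag first, so the context survives even if the next call fails.
    ec2.create_tags(
        Resources=[instance_id],
        Tags=[{"Key": "security-status", "Value": f"isolated: {reason}"}],
    )
    # Replacing the whole group list cuts the instance off from its usual peers.
    ec2.modify_instance_attribute(InstanceId=instance_id, Groups=[QUARANTINE_SG])
```

In a real account this would be called with `boto3.client("ec2")`; passing the client in keeps the function easy to exercise with a stand-in object.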

Page 14: Reactive Cloud Security | AWS Public Sector Summit 2016

OK. I get it. But how does it work?

Page 15: Reactive Cloud Security | AWS Public Sector Summit 2016

The easy stuff: binary conditions

● There are things about your environment which should never change

● AWS CloudTrail should always be on

● Administrators should always have high privileges

● External traffic should only be terminated on Elastic Load Balancing load balancers

● SSL certificates should always be valid

● ...
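A binary condition such as "CloudTrail should always be on" lends itself to a small Lambda function. The sketch below assumes the function is triggered by a CloudWatch Events rule for CloudTrail API calls, so the CloudTrail record arrives under `detail`, and that the trail name sits in `requestParameters.name`. The handler name and the optional `cloudtrail` parameter are illustrative choices, not something the talk prescribes.

```python
from typing import Any, Optional

WATCHED_CALLS = {"StopLogging"}


def trail_to_restart(event: dict) -> Optional[str]:
    """Return the trail name if this event shows logging being stopped."""
    detail = event.get("detail", {})
    if detail.get("eventSource") != "cloudtrail.amazonaws.com":
        return None
    if detail.get("eventName") not in WATCHED_CALLS:
        return None
    # Assumption: the StopLogging record carries the trail under "name".
    return (detail.get("requestParameters") or {}).get("name")


def handler(event: dict, context: Any, cloudtrail: Any = None) -> str:
    trail = trail_to_restart(event)
    if trail is None:
        return "ignored"
    if cloudtrail is None:
        import boto3  # only needed when running inside Lambda

        cloudtrail = boto3.client("cloudtrail")
    # React: put the environment back into its known-good state.
    cloudtrail.start_logging(Name=trail)
    return f"restarted {trail}"
```

A production version would also notify someone and honor an explicit "OK" override, since a blind restart can fight a legitimate change.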

Page 16: Reactive Cloud Security | AWS Public Sector Summit 2016

Less easy stuff: fuzzy conditions

● There are things about your environment that could change

● Web server CPU load should never exceed X%

● Patterns of inter-application traffic

● Engineers/administrators logging into systems

● API access patterns

● Inbound/outbound traffic patterns

● ...
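Fuzzy conditions need a baseline rather than a fixed rule. One simple way to express that, offered here only as an illustration and not as the method used at Netflix, is to compare the current value (logins per hour, API calls per minute, and so on) against the mean and standard deviation of recent history. The function name and the default threshold of 3 are assumptions.

```python
from statistics import mean, pstdev
from typing import Sequence


def is_anomalous(history: Sequence[float], current: float, threshold: float = 3.0) -> bool:
    """Flag `current` if it sits more than `threshold` standard deviations from the baseline."""
    if len(history) < 2:
        return False  # not enough data to call anything unusual
    baseline = mean(history)
    spread = pstdev(history)
    if spread == 0:
        # A perfectly flat history: any change at all is a deviation.
        return current != baseline
    return abs(current - baseline) / spread > threshold
```

Real traffic is seasonal and bursty, so a fixed z-score like this will be noisy; it is meant to show the shape of the check, not a tuned detector.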

Page 17: Reactive Cloud Security | AWS Public Sector Summit 2016

Laying the groundwork: AWS

● AWS CloudTrail

  • Make sure CloudTrail is turned on ... for all the things

  • Stream to CloudWatch Logs (> 10 min latency)

  • Use CloudWatch Events when you can (< 1 min latency)

  • Connect both to AWS Lambda functions monitoring for specific conditions

● AWS Lambda functions identify, log/notify, and react to these conditions

  • Create specific “OK” conditions, break glass buttons, etc.
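For the CloudWatch Logs path, a Lambda function subscribed to the log group receives a base64-encoded, gzip-compressed batch under `event["awslogs"]["data"]`. The sketch below unpacks that batch and assumes each log message is one CloudTrail record in JSON; the helper names and the set of watched event names are made up for illustration.

```python
import base64
import gzip
import json
from typing import Iterator, Set


def cloudtrail_records(event: dict) -> Iterator[dict]:
    """Yield the CloudTrail records carried in a CloudWatch Logs subscription event."""
    payload = gzip.decompress(base64.b64decode(event["awslogs"]["data"]))
    for log_event in json.loads(payload).get("logEvents", []):
        # Assumption: each message is a single CloudTrail record as JSON.
        yield json.loads(log_event["message"])


def matches(record: dict, event_names: Set[str]) -> bool:
    """True if the record is one of the API calls being monitored."""
    return record.get("eventName") in event_names
```

A handler would loop over `cloudtrail_records(event)` and hand each matching record to whatever identifies, logs/notifies, and reacts.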

Page 18: Reactive Cloud Security | AWS Public Sector Summit 2016

Laying the groundwork: Non-AWS events

● Requires a robust, reliable, and (programmatically) accessible logging infrastructure

● Access logs, authentication logs, performance logs, etc.

● A leverageable pipeline ... ELK is a good start, but not appropriate for everything

  • CloudWatch Logs, CloudWatch Metrics, Datadog, StatsD, $plunk, New Relic, etc.

● At Netflix we use Atlas, ES, and other big data pipelines (https://github.com/Netflix/atlas)

Page 19: Reactive Cloud Security | AWS Public Sector Summit 2016

Strategy is important.

Page 20: Reactive Cloud Security | AWS Public Sector Summit 2016

Three categories of events

#1 Fully automatable

#2 Almost automatable

#3 Never automatable
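One way to make the three categories concrete is a lookup that routes each event type to a response path. The slide only names the categories, so the event types, the route names, and the choice to treat unknown events as "never automatable" are all assumptions made for this sketch.

```python
from enum import Enum


class Category(Enum):
    FULLY_AUTOMATABLE = 1   # react with no human in the loop
    ALMOST_AUTOMATABLE = 2  # gather context, then ask a human to approve
    NEVER_AUTOMATABLE = 3   # hand straight to a person


# Hypothetical assignments; every environment will draw these lines differently.
CATEGORY_BY_EVENT = {
    "cloudtrail_stopped": Category.FULLY_AUTOMATABLE,
    "unusual_admin_login": Category.ALMOST_AUTOMATABLE,
}


def route(event_type: str) -> str:
    """Pick a response path; anything unrecognized goes to a human."""
    category = CATEGORY_BY_EVENT.get(event_type, Category.NEVER_AUTOMATABLE)
    if category is Category.FULLY_AUTOMATABLE:
        return "remediate"
    if category is Category.ALMOST_AUTOMATABLE:
        return "ask_in_chat"
    return "page_human"
```

Defaulting to the human path is the conservative choice: an event earns automation only after someone has deliberately placed it in bucket #1 or #2.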

Page 21: Reactive Cloud Security | AWS Public Sector Summit 2016

Please talk about some more relevant buzzwords.

Page 22: Reactive Cloud Security | AWS Public Sector Summit 2016

ChatOps

● Baby steps toward full reactive automation (for managing bucket #2 type events)

● Use a single shared interface to facilitate notifications, log work, provide context, and interact with tools

● Automation gets you the context and notification

● Humans approve and execute commands

● Two-factor is important!
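The ChatOps flow above (automation supplies context, a human approves, a second factor gates execution) can be sketched as a small approval queue. Nothing here is tied to a particular chat tool: `ApprovalQueue`, `PendingAction`, and the `verify_second_factor` callback are hypothetical names, and the callback stands in for whatever two-factor service is actually in use.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class PendingAction:
    description: str            # context shown to humans in the channel
    execute: Callable[[], str]  # the change to make once approved


class ApprovalQueue:
    def __init__(self, verify_second_factor: Callable[[str, str], bool]):
        self._verify = verify_second_factor
        self._pending: Dict[str, PendingAction] = {}

    def propose(self, action_id: str, action: PendingAction) -> str:
        """Automation's half: record the action and return the message to post."""
        self._pending[action_id] = action
        return f"[{action_id}] {action.description} (reply 'approve {action_id} <code>')"

    def approve(self, action_id: str, user: str, code: str) -> str:
        """The human's half: run the action only if the second factor checks out."""
        action = self._pending.get(action_id)
        if action is None:
            return "unknown action"
        if not self._verify(user, code):
            return "second factor rejected"
        del self._pending[action_id]  # an approval can only be used once
        return action.execute()
```

The bot would post the string returned by `propose` and call `approve` when someone replies; logging both calls gives the shared work log the slide mentions.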

Page 23: Reactive Cloud Security | AWS Public Sector Summit 2016
Page 24: Reactive Cloud Security | AWS Public Sector Summit 2016

Right sizing your environment

● Monitor your environment so that security policies match reality

  • IAM roles (look out for RepoMan from Netflix)

  • Security groups (working on something here too)

  • Amazon S3 policies 😣

● Start off with more than you need during development

● Monitor for X days

● Adjust policy based on actual usage; expose this information!

● Enable break-glass and self-service changes to automation
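The monitor-then-adjust loop for IAM roles can be sketched as a set comparison between what a role is granted and what CloudTrail shows it actually calling. This is a simplification under stated assumptions, not a description of the Netflix tooling: it treats the CloudTrail `eventSource` prefix and `eventName` as if they always equal the IAM action name (they do not for every service), it ignores wildcards, and the function names are invented.

```python
from typing import Iterable, Set


def observed_actions(records: Iterable[dict], role_arn: str) -> Set[str]:
    """Collect 'service:Action' strings a role used during the monitoring window."""
    used = set()
    for record in records:
        issuer = (
            record.get("userIdentity", {})
            .get("sessionContext", {})
            .get("sessionIssuer", {})
        )
        if issuer.get("arn") != role_arn:
            continue
        service = record["eventSource"].split(".")[0]  # "ec2.amazonaws.com" -> "ec2"
        used.add(f"{service}:{record['eventName']}")
    return used


def right_size(granted: Set[str], used: Set[str]) -> dict:
    """Split granted actions into those seen in use and candidates for removal."""
    return {"keep": sorted(granted & used), "remove": sorted(granted - used)}
```

The "remove" list is a proposal to expose to the application owner rather than something to apply blindly; rarely used permissions are exactly what a break-glass path is for.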

Page 25: Reactive Cloud Security | AWS Public Sector Summit 2016

In closing ...

● Cloud environments and modern development/deployment technologies can increase security

● Architect for flexibility and varying security conditions

● Seek to remove practices which can’t be automated

● ChatOps and right sizing are your friends

Page 26: Reactive Cloud Security | AWS Public Sector Summit 2016

Thanks!

Feel free to reach out:

[email protected]

[email protected]

... or yell at me publicly:

● @benhagen