Kapil Thangavelu - Cloud Custodian


Cloud Custodian

Fleet Management in AWS

Serverless

BUT

We still have Servers

Lots

And

Lots

:(

A sea of policies

- fleet-wide savings policies
  - off-hours stops for dev environments
  - garbage collect ebs, elb, etc.
  - detect over-provisioned resources

- numerous security policies
  - Encrypt all the Things
  - Access Control
  - ssl ciphers

- numerous compliance policies
  - tag compliance / chargeback
  - current images
  - backups


Fleet Management

Across lots of federated accounts.

Natural tendency
- One-off scripts

But
- How are they implemented
- How are they deployed
- How are they configured
- How are they managed
- Who owns them

Software Engineering
- How are they tested
- Are they reviewed

Who Knows?

Cloud Custodian

• A rules engine for infrastructure management.

• YAML DSL for policies: query resources or subscribe to events, apply filters, take actions.

• Integrated Lambda provisioning and event sources.

• Outputs to Amazon S3, Amazon CloudWatch Logs, Amazon CloudWatch Metrics.

Open source @ https://github.com/capitalone/cloud-custodian

    - name: require-rds-encrypt-and-non-public
      resource: rds
      mode:
        type: cloudtrail
        events:
          - CreateDBInstance
      filters:
        - or:
          - Encrypted: false
          - PubliclyAccessible: true
      actions:
        - type: delete
          skip-snapshot: true

Amazon CloudWatch Events

Features

● Powerful infrastructure observation capabilities

● Enables “realtime” rules enforcement and reaction with wide coverage of AWS product APIs.

Sources

● All CloudTrail events (P99 @ 90s delivery window as of April 2016)

● EC2 instance state changes (600ms)

● ASG instance membership changes (600ms)

● Periodic scheduling (custom)

● Custom events
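To make the event sources above concrete, here is a minimal sketch of how an event pattern selects events. The pattern layout (a list of allowed values per field, nested under "detail") follows the CloudWatch Events convention; the `matches` helper and the sample event are hypothetical and only cover exact-value matching, not how Custodian itself provisions its rules.

```python
def matches(pattern, event):
    """Return True if every field named in the pattern is satisfied by the event."""
    for key, expected in pattern.items():
        if key not in event:
            return False
        actual = event[key]
        if isinstance(expected, dict):
            # Nested pattern: recurse into the matching sub-document.
            if not isinstance(actual, dict) or not matches(expected, actual):
                return False
        elif actual not in expected:
            # Leaf pattern: a list of allowed values.
            return False
    return True


# Pattern for an API call delivered via CloudTrail, the kind of event a
# policy in "cloudtrail" mode listening for CreateDBInstance would need.
RDS_CREATE_PATTERN = {
    "detail-type": ["AWS API Call via CloudTrail"],
    "detail": {
        "eventSource": ["rds.amazonaws.com"],
        "eventName": ["CreateDBInstance"],
    },
}

# Hypothetical, heavily trimmed event payload for illustration only.
sample_event = {
    "source": "aws.rds",
    "detail-type": "AWS API Call via CloudTrail",
    "detail": {
        "eventSource": "rds.amazonaws.com",
        "eventName": "CreateDBInstance",
        "requestParameters": {"dBInstanceIdentifier": "example-db"},
    },
}

is_match = matches(RDS_CREATE_PATTERN, sample_event)
```

Fields that the pattern does not mention (such as `requestParameters` here) are ignored, which is why one pattern can cover every instance of the same API call.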

Cloud Custodian

Resource type policies (ec2 instance, ami, auto scale group, bucket, elb, etc.).

Filter resources

Invoke actions on the filtered set

Output resource json to s3, metrics to cloudwatch

Vocabularies of actions and filters for policy construction.

    - name: ebs-copy-instance-tags
      resource: ebs
      filters:
        - type: value
          key: "Attachments[0].Device"
          value: not-null
      actions:
        - type: copy-instance-tags
          tags:
            - App
            - Env
            - Owner
            - Name
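The filter-then-act flow of the ebs-copy-instance-tags policy can be sketched in a few lines of Python. This is an illustration under assumptions, not Custodian's implementation: `run_policy`, `is_attached`, and `copy_instance_tags` are hypothetical names, resources are plain dicts, and tags are stored as a simple dict rather than the AWS list of Key/Value pairs.

```python
def run_policy(resources, filters, actions):
    """Keep resources that pass every filter, then run each action on that set."""
    matched = [r for r in resources if all(f(r) for f in filters)]
    for action in actions:
        action(matched)
    return matched


def is_attached(volume):
    # Mirrors the value filter "Attachments[0].Device: not-null".
    attachments = volume.get("Attachments") or []
    return bool(attachments) and attachments[0].get("Device") is not None


def copy_instance_tags(tag_keys, instance_tags):
    """Build an action that copies the selected tags from the attached instance."""
    def action(volumes):
        for volume in volumes:
            instance_id = volume["Attachments"][0]["InstanceId"]
            source = instance_tags.get(instance_id, {})
            for key in tag_keys:
                if key in source:
                    volume.setdefault("Tags", {})[key] = source[key]
    return action


# Hypothetical sample data: one attached volume, one unattached volume.
volumes = [
    {"VolumeId": "vol-1",
     "Attachments": [{"Device": "/dev/sda1", "InstanceId": "i-1"}]},
    {"VolumeId": "vol-2", "Attachments": []},
]
instance_tags = {"i-1": {"App": "web", "Env": "dev", "Team": "core"}}

matched = run_policy(
    volumes,
    filters=[is_attached],
    actions=[copy_instance_tags(["App", "Env", "Owner", "Name"], instance_tags)],
)
```

Only the attached volume reaches the action, and only the tags named in the policy are copied; the instance's `Team` tag is left behind.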

Filtering resources

Generic Value filter

- JMESPath expressions on the resource’s json representation

- Lots of operator matching (in, not-in, absent, not-null, gte, regex, etc.)

Arbitrary nesting of filters with ‘or’ and ‘and’ blocks.

Simple key/value pairs are equality matches with value expressions

    - type: value
      # Ignore keys that start with
      # 'aws:' as they don't count towards the limit.
      key: "[length(Tags[?!starts_with(Key,'aws:')])][0]"
      op: less-than
      value: 10

    - or:
      - "tag:App": absent
      - "tag:Env": absent
      - and:
        - Encrypted: false
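As a rough sketch of how nested filters like these could be evaluated, the Python below walks a filter spec recursively. It is an assumption-laden illustration, not Custodian's code: `lookup`, `OPERATORS`, and `evaluate` are hypothetical names, only a few operators are shown, and a plain attribute/tag lookup stands in for real JMESPath evaluation.

```python
import re


def lookup(resource, key):
    """Resolve 'tag:Name' or a top-level attribute; a stand-in for JMESPath."""
    if key.startswith("tag:"):
        tags = {t["Key"]: t["Value"] for t in resource.get("Tags", [])}
        return tags.get(key[len("tag:"):])
    return resource.get(key)


# A small subset of operators, keyed by the names used in the policies.
OPERATORS = {
    "eq": lambda actual, expected: actual == expected,
    "in": lambda actual, expected: actual in expected,
    "not-in": lambda actual, expected: actual not in expected,
    "gte": lambda actual, expected: actual is not None and actual >= expected,
    "less-than": lambda actual, expected: actual is not None and actual < expected,
    "regex": lambda actual, expected: (
        actual is not None and re.match(expected, actual) is not None),
}


def evaluate(spec, resource):
    """Evaluate one filter spec (possibly nested) against a resource dict."""
    if "or" in spec:
        return any(evaluate(child, resource) for child in spec["or"])
    if "and" in spec:
        return all(evaluate(child, resource) for child in spec["and"])
    if spec.get("type") == "value":
        actual = lookup(resource, spec["key"])
        expected = spec["value"]
        if expected == "absent":
            return actual is None
        if expected == "not-null":
            return actual is not None
        return OPERATORS[spec.get("op", "eq")](actual, expected)
    # Simple key/value form: shorthand for an equality value filter.
    (key, expected), = spec.items()
    return evaluate({"type": "value", "key": key, "value": expected}, resource)


spec = {"or": [
    {"tag:App": "absent"},
    {"tag:Env": "absent"},
    {"and": [{"Encrypted": False}]},
]}
untagged = {"Encrypted": True, "Tags": [{"Key": "App", "Value": "web"}]}
compliant = {"Encrypted": True, "Tags": [{"Key": "App", "Value": "web"},
                                         {"Key": "Env", "Value": "dev"}]}
```

With these inputs, `evaluate(spec, untagged)` is true because the `Env` tag is missing, while `evaluate(spec, compliant)` is false because no branch of the `or` block matches.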

Multi Step Workflows

“Poorly tagged instances should be stopped in 1 day, and then terminated in 3”

- mark-for-op (action)
- marked-for-op (filter)

Chain together multiple policies.

    - name: ec2-tag-compliance-mark
      resource: ec2
      description: |
        Find all non-compliant tag instances for stoppage in 1 day.
      mode:
        type: periodic
        schedule: rate(1 day)
      filters:
        - "tag:maid_status": absent
        - or:
          - "tag:App": absent
          - "tag:Env": absent
          - "tag:Owner": absent
      actions:
        - type: mark-for-op
          op: stop
          days: 1

    - name: ec2-tag-compliance-stop
      resource: ec2
      description: |
        Stop poorly tagged and schedule Terminate.
      mode:
        type: periodic
        schedule: rate(1 day)
      filters:
        - type: marked-for-op
          op: stop
        - or:
          - "tag:App": absent
          - "tag:Env": absent
          - "tag:Owner": absent
      actions:
        - stop
        - type: mark-for-op
          op: terminate
          days: 4
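The two policies hand off to each other through a tag: one run marks a resource with an operation and a due date, and a later run filters on marks that have come due. The sketch below shows that idea in Python. The `maid_status` tag name comes from the policies above; the function names and the `op@YYYY-MM-DD` tag encoding are assumptions made for illustration and may not match what Custodian actually writes.

```python
from datetime import date, timedelta

TAG = "maid_status"  # tag name taken from the policies above


def mark_for_op(resource, op, days, today):
    """Record the pending operation and its due date in a tag."""
    due = today + timedelta(days=days)
    # The "op@YYYY-MM-DD" encoding is an assumption made for this sketch.
    resource.setdefault("Tags", {})[TAG] = f"{op}@{due.isoformat()}"


def marked_for_op(resource, op, today):
    """Return True if the resource is marked for `op` and the due date has arrived."""
    value = resource.get("Tags", {}).get(TAG)
    if value is None or "@" not in value:
        return False
    marked_op, _, due = value.partition("@")
    return marked_op == op and date.fromisoformat(due) <= today


# Hypothetical instance with no tags at all, marked on day zero.
instance = {"InstanceId": "i-1"}
day_zero = date(2016, 5, 1)
mark_for_op(instance, "stop", days=1, today=day_zero)

due_today = marked_for_op(instance, "stop", today=day_zero)
due_tomorrow = marked_for_op(instance, "stop", today=day_zero + timedelta(days=1))
```

Keeping the state in a tag means each daily run is stateless: the second policy needs nothing but the resource itself to decide whether the grace period has expired.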

Custodian Vocabularies

    asg:
      actions:
        - delete
        - mark-for-op
        - rename-tag
        - suspend
        - tag
        - remove-tag
        - resume
        - propagate-tags
      filters:
        - vpc-id
        - time
        - marked-for-op
        - not-encrypted
        - image-age
        - onhour
        - tag-count
        - offhour
        - launch-config

    ec2:
      actions:
        - mark-for-op
        - remove-tag
        - snapshot
        - tag
        - start
        - tag-trim
        - stop
        - terminate
      filters:
        - ebs
        - marked-for-op
        - ephemeral
        - image
        - instance-age
        - onhour
        - tag-count
        - offhour
        - image-age

    s3:
      actions:
        - attach-encrypt
        - remove-statements
        - encrypt-keys
        - encryption-policy
        - delete-global-grants
      filters:
        - missing-statement
        - global-grants
        - is-log-target
        - has-statement

Additional resource types

- RDS
- ELB
- Redshift
- CloudFormation
- AMI
- EBS
- EBS Snapshot

Metrics

- Resource Count
- Action Time
- Query/Filter Time
- Custom

Example Policy - Amazon S3 Encryption

Require encryption for objects

    name: s3-require-encryption
    resource: s3
    description: |
      Apply encryption required policy to new buckets
    mode:
      type: cloudtrail
      events:
        - CreateBucket
    actions:
      - encryption-policy
      - encrypt-keys

Find elb/s3 log sinks and switch to lambda encrypt

    name: s3-remediate
    resource: s3
    description: |
      Encryption required policy
    mode:
      type: periodic
      schedule: rate(1 day)
    filters:
      - type: is-log-target
    actions:
      - attach-encrypt
      - type: remove-statements
        statement_ids:
          - RequireEncryptedPutObject

Roadmap

- Elasticsearch indexing of records / outputs (programmatic dashboards / historical trending)
- Flourish ??
- Cross Language support (lambda invoke actions)
- Moar filters/actions/resources

https://github.com/capitalone/cloud-custodian/milestones