© 2015, Amazon Web Services, Inc. or its Affiliates. All rights reserved.

Brandon Chavis, AWS Solutions Architect
Ilan Rabinovitch, Director of Technical Community, Datadog

20th September 2016

Monitoring Containers at Scale

Monitoring Containers at Scale - September Webinar Series

Page 2: Monitoring Containers at Scale - September Webinar Series

Agenda

Introduction to ECS

AWS logging options: CloudWatch Logs, CloudWatch, CloudTrail

Logging containers with Datadog

Page 3: Monitoring Containers at Scale - September Webinar Series

Amazon EC2 Container Service (ECS)

Container Management at Any Scale

Flexible Container Placement

Integration with the AWS Platform

Page 4: Monitoring Containers at Scale - September Webinar Series

Components of Amazon ECS

Task: One or more containers running together on an instance

Task Definition: Definition of containers and environment configuration

Cluster: Fleet of EC2 instances on which tasks run

Cluster Manager: Manages cluster resources and the state of tasks

Scheduler: Places tasks onto the cluster

Agent: Coordinates between the EC2 instances and the Cluster Manager

Page 5: Monitoring Containers at Scale - September Webinar Series

Cluster, Scheduler, Task Scheduler

Diagram labels: Manager, Cluster, Task Definition, Task, Agent

Page 6: Monitoring Containers at Scale - September Webinar Series

Monitoring & Logging

Page 7: Monitoring Containers at Scale - September Webinar Series

AWS logging tools:

CloudWatch
CloudWatch Logs
CloudTrail

Page 8: Monitoring Containers at Scale - September Webinar Series

CloudWatch Logs with awslogs driver

Diagram: Amazon ECS sends container logs to Amazon CloudWatch Logs, which feeds:

Store: Amazon S3
Stream: Amazon Kinesis
Process: AWS Lambda
Search: Amazon Elasticsearch Service

Page 9: Monitoring Containers at Scale - September Webinar Series

CloudWatch Logs driver

Page 10: Monitoring Containers at Scale - September Webinar Series

Configuring Logging in Task Definition

logConfiguration task definition parameter

Requires version 1.18 or greater of the Docker Remote API

Maps to docker run --log-driver option

Log drivers: json-file, syslog, journald, gelf, fluentd, awslogs

Page 11: Monitoring Containers at Scale - September Webinar Series

Configuring Logging in Task Definition

{
  "containerDefinitions": [
    {
      "memory": 300,
      "portMappings": [
        {
          "hostPort": 80,
          "containerPort": 80
        }
      ],
      "entryPoint": [ "sh", "-c" ],
      "logConfiguration": {
        "logDriver": "awslogs",
        "options": {
          "awslogs-group": "awslogs-test",
          "awslogs-region": "us-west-2",
          "awslogs-stream-prefix": "nginx"
        }
      },
      "name": "simple-app",
      "image": "httpd:2.4",
      "command": [
        "/bin/sh -c \"echo 'Congratulations! Your application is now running on a container in Amazon ECS.' > /usr/local/apache2/htdocs/index.html && httpd-foreground\""
      ],
      "cpu": 10
    }
  ],
  "family": "cw-logs-example"
}
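As a rough illustration of the task definition above, the following Python sketch assembles the same logConfiguration block programmatically and checks the driver name against the list from the previous slide. The helper names awslogs_config and container_definition are hypothetical and not part of any AWS SDK; the output is only a JSON fragment that could be pasted into a task definition.

```python
import json

# Log drivers listed on the "Configuring Logging in Task Definition" slide.
SUPPORTED_LOG_DRIVERS = {"json-file", "syslog", "journald", "gelf", "fluentd", "awslogs"}


def awslogs_config(group, region, stream_prefix):
    """Build a logConfiguration block for the awslogs driver (hypothetical helper)."""
    return {
        "logDriver": "awslogs",
        "options": {
            "awslogs-group": group,
            "awslogs-region": region,
            "awslogs-stream-prefix": stream_prefix,
        },
    }


def container_definition(name, image, log_configuration, memory=300, cpu=10):
    """Build one containerDefinitions entry (hypothetical helper)."""
    driver = log_configuration["logDriver"]
    if driver not in SUPPORTED_LOG_DRIVERS:
        raise ValueError(f"unsupported log driver: {driver}")
    return {
        "name": name,
        "image": image,
        "memory": memory,
        "cpu": cpu,
        "logConfiguration": log_configuration,
    }


# Same values as the slide; only the logging-related fields are reproduced.
task_definition = {
    "family": "cw-logs-example",
    "containerDefinitions": [
        container_definition(
            "simple-app",
            "httpd:2.4",
            awslogs_config("awslogs-test", "us-west-2", "nginx"),
        )
    ],
}

print(json.dumps(task_definition, indent=2))
```

Validating the driver name before registering the task definition catches typos early, since an unknown driver would otherwise only surface when the task fails to start.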

Page 12: Monitoring Containers at Scale - September Webinar Series

Monitoring with Amazon CloudWatch

Metric data sent to CloudWatch in 1-minute periods and recorded for a period of two weeks

Available metrics: CPUReservation, MemoryReservation, CPUUtilization, MemoryUtilization

Available dimensions: ClusterName, ServiceName
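To make the metrics and dimensions above concrete, here is a minimal Python sketch that builds the parameters for a CloudWatch GetMetricStatistics request against the AWS/ECS namespace. The function name ecs_metric_query and the cluster and service names are hypothetical. The sketch does not call AWS; it assumes the resulting dictionary would be passed to an SDK call such as boto3's get_metric_statistics.

```python
from datetime import datetime, timedelta, timezone

# Metrics listed on the slide.
ECS_METRICS = {"CPUReservation", "MemoryReservation", "CPUUtilization", "MemoryUtilization"}


def ecs_metric_query(metric, cluster, service=None, minutes=60, now=None):
    """Build GetMetricStatistics parameters for an ECS metric (hypothetical helper)."""
    if metric not in ECS_METRICS:
        raise ValueError(f"unknown ECS metric: {metric}")
    end = now or datetime.now(timezone.utc)
    # ClusterName is always present; ServiceName narrows the query to one service.
    dimensions = [{"Name": "ClusterName", "Value": cluster}]
    if service is not None:
        dimensions.append({"Name": "ServiceName", "Value": service})
    return {
        "Namespace": "AWS/ECS",
        "MetricName": metric,
        "Dimensions": dimensions,
        "StartTime": end - timedelta(minutes=minutes),
        "EndTime": end,
        "Period": 60,  # matches the 1-minute reporting period
        "Statistics": ["Average", "Maximum"],
    }


# "my-cluster" and "my-service" are placeholder names.
params = ecs_metric_query("CPUUtilization", "my-cluster", "my-service")
print(params["MetricName"], [d["Name"] for d in params["Dimensions"]])
```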

Page 13: Monitoring Containers at Scale - September Webinar Series

Monitoring with Amazon CloudWatch

Page 14: Monitoring Containers at Scale - September Webinar Series

Monitoring with Amazon CloudWatch

Page 15: Monitoring Containers at Scale - September Webinar Series

Monitoring with Amazon CloudWatch

Use the Amazon CloudWatch Monitoring Scripts to monitor additional metrics, e.g. disk space:

# Edit crontab
> crontab -e

# Add command to report disk space utilization to CloudWatch every five minutes
*/5 * * * * <path_to>/mon-put-instance-data.pl --disk-space-util --disk-space-used --disk-space-avail --disk-path=/ --from-cron
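For a sense of what the three disk flags report, the sketch below computes comparable values for a path using only the Python standard library. It is not how the monitoring script is implemented, and the script's exact units and rounding may differ; the dictionary keys follow the metric names the script is documented to publish.

```python
import shutil

GIB = 1024 ** 3


def disk_space_metrics(path="/"):
    """Approximate the values behind --disk-space-util/-used/-avail for one path."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    utilization = 100.0 * usage.used / usage.total if usage.total else 0.0
    return {
        "DiskSpaceUtilization": utilization,      # percent
        "DiskSpaceUsed": usage.used / GIB,        # gigabytes (assumed unit)
        "DiskSpaceAvailable": usage.free / GIB,   # gigabytes (assumed unit)
    }


for name, value in disk_space_metrics("/").items():
    print(f"{name}: {value:.2f}")
```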

Page 16: Monitoring Containers at Scale - September Webinar Series

Logging Amazon ECS API with AWS CloudTrail

{
  "eventVersion": "1.03",
  "userIdentity": {…},
  "eventTime": "2015-10-12T13:57:33Z",
  "eventSource": "ecs.amazonaws.com",
  "eventName": "CreateCluster",
  "awsRegion": "eu-west-1",
  "sourceIPAddress": "54.240.197.227",
  "userAgent": "console.amazonaws.com",
  "requestParameters": {
    "clusterName": "ecs-cli"
  },

Page 17: Monitoring Containers at Scale - September Webinar Series

Logging Amazon ECS API with AWS CloudTrail

  "responseElements": {
    "cluster": {
      "clusterArn": "arn:aws:ecs:eu-west-1:560846014933:cluster/ecs-cli",
      "pendingTasksCount": 0,
      "registeredContainerInstancesCount": 0,
      "status": "ACTIVE",
      "runningTasksCount": 0,
      "clusterName": "ecs-cli",
      "activeServicesCount": 0
    }
  },
  […]

Page 18: Monitoring Containers at Scale - September Webinar Series

Monitoring Amazon ECS with Datadog

Page 19: Monitoring Containers at Scale - September Webinar Series

Datadog Overview

• SaaS-based infrastructure and application monitoring
• Focus on modern environments
  • Cloud, Containers, Microservices
• Processing nearly a trillion data points per day
• Intelligent Alerting and Insightful Dashboards

Page 20: Monitoring Containers at Scale - September Webinar Series

Operating Systems, Cloud Providers (AWS), Containers, Web Servers, Datastores, Caches, Queues and more...

Monitor Everything

Page 21: Monitoring Containers at Scale - September Webinar Series
Page 22: Monitoring Containers at Scale - September Webinar Series

CloudWatch and ECS

Resources

CPUReservation
MemoryReservation
CPUUtilization
MemoryUtilization

Page 23: Monitoring Containers at Scale - September Webinar Series
Page 24: Monitoring Containers at Scale - September Webinar Series

How do we get at the upper layers?

Page 25: Monitoring Containers at Scale - September Webinar Series

Pseudo-files

• Provide visibility into container metrics via the file system.
• Generally under:
  /cgroup/<resource>/docker/$CONTAINER_ID/
  or
  /sys/fs/cgroup/<resource>/docker/$CONTAINER_ID/

Page 26: Monitoring Containers at Scale - September Webinar Series

Pseudo-files: CPU Metrics

$ cat /sys/fs/cgroup/cpuacct/docker/$CONTAINER_ID/cpuacct.stat
> user 2451 # time spent running processes since boot
> system 966 # time spent executing system calls since boot

Pseudo-files: CPU Throttling

$ cat /sys/fs/cgroup/cpu/docker/$CONTAINER_ID/cpu.stat
> nr_periods 565 # Number of enforcement intervals that have elapsed
> nr_throttled 559 # Number of times the group has been throttled
> throttled_time 12119585961 # Total time that members of the group were throttled (12.12 seconds)
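The cpu.stat values can be turned into a throttling ratio with a few lines of Python. This is a minimal sketch that parses the "key value" layout of the pseudo-file and uses the sample numbers from the slide; the function names are hypothetical, and the real file contains neither the "> " prefixes nor the trailing comments shown above.

```python
def parse_stat(text):
    """Parse 'key value' lines from a cgroup pseudo-file into a dict of ints."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) >= 2:
            stats[parts[0]] = int(parts[1])
    return stats


def throttling_summary(cpu_stat):
    """Summarise how often and for how long the cgroup was throttled."""
    periods = cpu_stat["nr_periods"]
    throttled = cpu_stat["nr_throttled"]
    return {
        "throttled_ratio": throttled / periods if periods else 0.0,
        "throttled_seconds": cpu_stat["throttled_time"] / 1e9,  # nanoseconds to seconds
    }


# Sample values copied from the slide.
sample = """nr_periods 565
nr_throttled 559
throttled_time 12119585961
"""

summary = throttling_summary(parse_stat(sample))
print(f"{summary['throttled_ratio']:.1%} of periods throttled, "
      f"{summary['throttled_seconds']:.2f} s in total")
```

To read live data, the sample string would be replaced with the contents of the cpu.stat path shown on the slide. A ratio this close to 1 suggests the container is hitting its CPU quota in almost every enforcement period.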

Page 27: Monitoring Containers at Scale - September Webinar Series

Docker API

• Detailed streaming metrics as JSON over an HTTP socket

$ curl -v --unix-socket /var/run/docker.sock http://localhost/containers/28d7a95f468e/stats
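Each JSON sample streamed by that endpoint carries cumulative counters, so a CPU percentage has to be derived from the difference between the current and previous readings. The sketch below shows one common way to do this. The field names (cpu_stats, precpu_stats, system_cpu_usage, online_cpus, percpu_usage) are given as recalled from the Docker Engine API stats response and should be checked against the API version in use; the sample numbers are illustrative only.

```python
def cpu_percent(sample):
    """Derive a CPU percentage from one stats sample (field names are assumptions)."""
    cpu = sample["cpu_stats"]
    pre = sample["precpu_stats"]
    # Counters are cumulative, so work with deltas between the two readings.
    cpu_delta = cpu["cpu_usage"]["total_usage"] - pre["cpu_usage"]["total_usage"]
    system_delta = cpu.get("system_cpu_usage", 0) - pre.get("system_cpu_usage", 0)
    if cpu_delta <= 0 or system_delta <= 0:
        return 0.0
    # Prefer online_cpus; fall back to the per-CPU list length, then to 1.
    ncpus = cpu.get("online_cpus") or len(cpu["cpu_usage"].get("percpu_usage") or []) or 1
    return cpu_delta / system_delta * ncpus * 100.0


# Illustrative sample, not real output.
sample = {
    "cpu_stats": {
        "cpu_usage": {"total_usage": 1500},
        "system_cpu_usage": 12000,
        "online_cpus": 2,
    },
    "precpu_stats": {
        "cpu_usage": {"total_usage": 1000},
        "system_cpu_usage": 10000,
    },
}

print(f"{cpu_percent(sample):.2f}%")
```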

Page 28: Monitoring Containers at Scale - September Webinar Series

STATS Command

# Usage: docker stats CONTAINER [CONTAINER...]
$ docker stats $CONTAINER_ID
CONTAINER      CPU %   MEM USAGE/LIMIT     MEM %    NET I/O             BLOCK I/O
ecb37227ac84   0.12%   71.53 MiB/490 MiB   14.60%   900.2 MB/275.5 MB   266.8 MB/872.7 MB
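The MEM % column is simply usage divided by limit. As a small sketch, the following Python parses the MEM USAGE/LIMIT value from the sample output and recomputes the percentage. It assumes the "value unit/value unit" layout shown on the slide; the function names are hypothetical, and other Docker versions may format this column differently.

```python
import re

UNITS = {"B": 1, "KiB": 1024, "MiB": 1024 ** 2, "GiB": 1024 ** 3}
SIZE = re.compile(r"\s*([\d.]+)\s*([A-Za-z]+)\s*")


def to_bytes(text):
    """Convert a size such as '71.53 MiB' to bytes."""
    match = SIZE.fullmatch(text)
    if match is None or match.group(2) not in UNITS:
        raise ValueError(f"unrecognised size: {text!r}")
    return float(match.group(1)) * UNITS[match.group(2)]


def mem_percent(usage_and_limit):
    """Recompute MEM % from a 'usage/limit' column value."""
    usage, limit = (to_bytes(part) for part in usage_and_limit.split("/"))
    return 100.0 * usage / limit


# Column value copied from the sample output.
print(f"{mem_percent('71.53 MiB/490 MiB'):.2f}%")
```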

Page 29: Monitoring Containers at Scale - September Webinar Series

Side Car Containers

Page 30: Monitoring Containers at Scale - September Webinar Series

Agents and Daemons

• Ideally we’d want to schedule an agent or daemon on each node via ECS Tasks.

• Current Solutions:
  1. Bake it into your image.
  2. Install on each host at provision time.
  3. Automate with User Scripts and Launch Configs.

Page 31: Monitoring Containers at Scale - September Webinar Series

Grant Privileges via IAM

$ aws iam create-role \
    --role-name ecs-monitoring \
    --assume-role-policy-document file://trust.policy

$ aws iam put-role-policy \
    --role-name ecs-monitoring \
    --policy-name ecs-monitoring-policy \
    --policy-document file://ecs.policy

$ aws iam create-instance-profile \
    --instance-profile-name ECSNode

$ aws iam add-role-to-instance-profile \
    --instance-profile-name ECSNode \
    --role-name ecs-monitoring
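The commands above reference trust.policy without showing its contents. For a role that EC2 instances assume, the trust relationship usually looks like the document generated by this Python sketch. It is an assumption about what the file contains, not a copy of the presenters' file. The permissions in ecs.policy are not shown in the deck and are not reconstructed here.

```python
import json


def ec2_trust_policy():
    """Trust policy letting EC2 instances assume the role (assumed contents of trust.policy)."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": "ec2.amazonaws.com"},
                "Action": "sts:AssumeRole",
            }
        ],
    }


# Print the document; redirect the output to trust.policy to use it with the CLI.
print(json.dumps(ec2_trust_policy(), indent=2))
```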

Page 32: Monitoring Containers at Scale - September Webinar Series

Create A User Script

Page 33: Monitoring Containers at Scale - September Webinar Series

Auto-Scale!

$ aws autoscaling create-launch-configuration \
    --launch-configuration-name MyECSCluster \
    --key-name my-key \
    --image-id AMI_ID \
    --instance-type INSTANCE_TYPE \
    --user-data file://launch-script.txt \
    --iam-instance-profile IAM_ROLE

Page 34: Monitoring Containers at Scale - September Webinar Series

Full Stack Monitoring

Diagram labels:

Docker API
ECS & CloudWatch
Monitoring Agent Container
Containers List & Metadata
Additional Metadata (Tags, events, etc.)
Host Level Metrics

Page 35: Monitoring Containers at Scale - September Webinar Series

Monitoring Amazon ECS with Datadog

Page 36: Monitoring Containers at Scale - September Webinar Series

Aren’t we still missing a layer?

Page 37: Monitoring Containers at Scale - September Webinar Series

Operating Systems, Cloud Providers (AWS), Containers, Web Servers, Datastores, Caches, Queues and more...

Monitor Everything

Page 38: Monitoring Containers at Scale - September Webinar Series

Service Discovery

Page 39: Monitoring Containers at Scale - September Webinar Series

Service Discovery

Diagram labels:

Docker API
ECS & CloudWatch
Monitoring Agent Container
Containers List & Metadata
Additional Metadata (Tags, etc.)
Config Backend
Integration Configurations
Host Level Metrics

Page 40: Monitoring Containers at Scale - September Webinar Series

Custom Metrics

• Instrument custom applications

• You know your key transactions best.

• Use async protocols like StatsD
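A minimal sketch of what instrumenting with StatsD can look like, using only the Python standard library: each metric is a small text packet of the form name:value|type sent over UDP, so the application does not wait for a reply. The metric names are hypothetical, and the host and port assume a StatsD-compatible agent listening locally on the conventional port 8125.

```python
import socket

METRIC_TYPES = {"c", "g", "ms"}  # counter, gauge, timer


def statsd_packet(name, value, metric_type="c"):
    """Format one metric in the plain StatsD wire format."""
    if metric_type not in METRIC_TYPES:
        raise ValueError(f"unsupported metric type: {metric_type}")
    return f"{name}:{value}|{metric_type}".encode("ascii")


def send_metric(packet, host="127.0.0.1", port=8125):
    """Send a packet over UDP; there is no acknowledgement, so the caller never blocks on the agent."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(packet, (host, port))


# Hypothetical metric names for a key transaction.
print(statsd_packet("checkout.completed", 1))
print(statsd_packet("checkout.duration", 320, "ms"))
```

Calling send_metric(statsd_packet("checkout.completed", 1)) from the request path would emit the counter. Because UDP is connectionless, a missing or slow agent does not slow the application down, at the cost of silently dropped packets.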

Page 41: Monitoring Containers at Scale - September Webinar Series

Demo

Page 42: Monitoring Containers at Scale - September Webinar Series

Q&A

If you want to learn more, register for our upcoming DevDay Austin:

Monday, October 24, 2016
JW Marriott Austin

https://aws.amazon.com/events/devday-austin

Free, one-day developer event featuring tracks, labs, and workshops around Serverless, Containers, IoT, and Mobile