AWS CloudWatch

AWS CloudWatch – Cloud & Network Monitoring Services

DATA POINT - encapsulates the statistical data that Amazon CloudWatch computes from metric data.

METRICS – time-ordered sets of data points. They show CPU usage, memory status, disk usage, etc., as well as custom data. A metric is defined by its name, namespace and dimensions. Stored for 2 weeks.

NAMESPACES – containers for metrics. A namespace is a string defined when the metric is created. Metrics in different namespaces are isolated from each other. E.g.: AWS/EC2, AWS/AutoScaling, AWS/SQS.

DIMENSIONS – name/value pairs that uniquely identify a metric (like categories or filters). Use dimensions to filter the result sets that CloudWatch queries return. A dimension always relates to an existing metric.
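As an illustration of how name, namespace and dimensions together identify a metric, here is a minimal sketch in plain Python. It is not the CloudWatch API: `MetricId`, `metric_id` and `filter_by_dimension` are hypothetical helpers, and the instance IDs and queue name are made up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MetricId:
    """Hypothetical model of a metric's identity: namespace + name + dimensions."""
    namespace: str
    name: str
    dimensions: frozenset  # set of (dimension name, dimension value) pairs


def metric_id(namespace, name, **dimensions):
    """Build a MetricId from keyword-style dimensions."""
    return MetricId(namespace, name, frozenset(dimensions.items()))


def filter_by_dimension(metrics, dim_name, dim_value):
    """Keep only the metrics that carry the given dimension name/value pair."""
    return [m for m in metrics if (dim_name, dim_value) in m.dimensions]


# Made-up identifiers, for illustration only.
metrics = [
    metric_id("AWS/EC2", "CPUUtilization", InstanceId="i-aaa"),
    metric_id("AWS/EC2", "CPUUtilization", InstanceId="i-bbb"),
    metric_id("AWS/SQS", "NumberOfMessagesSent", QueueName="jobs"),
]

# Same namespace and name, different dimension value: two distinct metrics.
only_first_instance = filter_by_dimension(metrics, "InstanceId", "i-aaa")
```

The first two entries share a namespace and a name, yet they are separate metrics because their dimension values differ.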

TIME RANGE - defined as point-to-point time range

Statistics

Statistics aggregate many data points and present them in a human-readable form.

A statistic is metric data aggregated over a specified period of time, based on data points. The start and end points can be as close together as 60 seconds and as far apart as two weeks.

● Minimum – the lowest value during the specified period.
● Maximum – the highest value during the specified period.
● Sum – all values submitted for the matching metrics added together.
● SampleCount – number of data points for calculation.
● Average – Sum/SampleCount. Helps to increase/decrease resources as needed.
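The five statistics above can be sketched in a few lines of plain Python. This only illustrates the definitions and is not how CloudWatch computes them internally; `summarize` is a hypothetical helper and the sample values are made up.

```python
def summarize(values):
    """Compute the five statistics for the data points of one period."""
    if not values:
        raise ValueError("at least one data point is required")
    total = sum(values)
    count = len(values)
    return {
        "Minimum": min(values),
        "Maximum": max(values),
        "Sum": total,
        "SampleCount": count,
        "Average": total / count,  # Sum / SampleCount
    }


# Three made-up CPU readings (percent) that fall into the same period.
stats = summarize([10.0, 20.0, 60.0])
```

For these readings the Sum is 90.0, the SampleCount is 3, and the Average is therefore 30.0.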

PERIOD – length of time associated with a specific statistic. Basic unit: second. Minimum value: 60. E.g., for statistics aggregated into ten-minute blocks, set Period to 600. Important for alarms. (aka bucket)
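To make the period idea concrete, the sketch below groups timestamped data points into fixed-length periods. It is a simplified model: `bucket_by_period` is a hypothetical helper, timestamps are plain epoch seconds, and the sample points are made up.

```python
from collections import defaultdict


def bucket_by_period(points, period):
    """Group (timestamp, value) pairs by the start of the period they fall into.

    `period` is in seconds; values below 60 are rejected, matching the
    minimum stated above.
    """
    if period < 60:
        raise ValueError("period must be at least 60 seconds")
    buckets = defaultdict(list)
    for timestamp, value in points:
        period_start = timestamp - (timestamp % period)
        buckets[period_start].append(value)
    return dict(buckets)


# Made-up data points as (epoch seconds, value), grouped into ten-minute blocks.
points = [(0, 1.0), (59, 3.0), (600, 5.0), (1199, 7.0), (1200, 9.0)]
buckets = bucket_by_period(points, 600)
```

With Period set to 600, the points at 0 s and 59 s share the first period, those at 600 s and 1199 s share the second, and the point at 1200 s opens a third. A statistic is then computed per period.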

What metrics do we use:

● EC2 (external) instance metrics

Basic Monitoring – 7 pre-selected metrics at 5-minute frequency, free of charge

Detailed Monitoring – all metrics at 1-minute frequency, for a charge

● EC2 (internal) instance metrics

Extra monitoring (e.g. EBS)

Additional metrics:

● ELB

● RDS

CASE STUDY: Internal vs External CPU usage

CPU Steal (Noisy Neighbour)

CPU usage as measured by CloudWatch differs from CPU usage as measured inside the EC2 instance. The difference between these two metrics is what’s known as “CPU Steal”.

Agent-based reporting (internal) shows how much you are using the instance.

Amazon reports the total usage of the instance, including usage by other users (external).
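Following the definition above, CPU Steal can be estimated per sample as the external reading minus the internal one. The sketch below assumes both series are sampled at the same timestamps and expressed in percent. `cpu_steal` is a hypothetical helper and the numbers are made up.

```python
def cpu_steal(external_pct, internal_pct):
    """Estimate CPU Steal per sample as external minus internal usage.

    Negative differences (measurement noise) are floored at zero.
    """
    return [max(0.0, ext - inn) for ext, inn in zip(external_pct, internal_pct)]


# Made-up, time-aligned CPU readings in percent.
external = [80.0, 95.0, 60.0]  # as reported from outside the instance
internal = [70.0, 65.0, 60.0]  # as reported by an agent inside the instance
steal = cpu_steal(external, internal)
```

A large gap, such as the 30 points in the second sample, suggests a noisy neighbour.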

CloudWatch Logs

● System and customized logs measurement
● Pattern searching (pseudo, case-sensitive, PHP-Apache support)
● Log groups
● Graphs based on log filters
● Alarm setting when a metric crosses a specific threshold

LOG STREAM - data exchange channel between the Logs Agent and AWS

LOG GROUP - a group of log streams

METRIC FILTER - a text pattern assigned to a log group; matching log events dynamically produce a single metric
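The relationship between log streams, a log group and a metric filter can be sketched as follows. This is a simplified model, not the real CloudWatch Logs filter syntax: matching here is a plain case-sensitive substring test, `apply_metric_filter` is a hypothetical helper, and the stream names and log lines are made up.

```python
def apply_metric_filter(events, pattern):
    """Count the log events that contain `pattern` (case-sensitive)."""
    return sum(1 for message in events if pattern in message)


# A log group modelled as a mapping of stream name -> log events (made up).
log_group = {
    "web-1": [
        "GET /index.php 200",
        "ERROR db timeout",
        "error in lowercase is not matched",
    ],
    "web-2": [
        "ERROR disk full",
        "GET /health 200",
    ],
}

# The filter is assigned to the group, so it sees events from every stream.
all_events = [message for stream in log_group.values() for message in stream]
errors_count = apply_metric_filter(all_events, "ERROR")
```

The resulting count is the kind of value that ends up as a single metric, which can then be graphed or watched by an alarm.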

CloudWatch Logs Agent (awslogs) – system service monitoring and synchronizing logs with AWS. Config in: /etc/awslogs/awslogs.conf

Alarms

An alarm automatically initiates planned actions when a defined threshold is crossed. One alarm watches a single metric over a specified period. The action an alarm performs is either an SNS notification or an Auto Scaling policy.
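The threshold check can be sketched as a small state function. It is a simplified model of alarm evaluation, not the service's actual logic. `evaluate_alarm` and its `evaluation_periods` parameter are illustrative, and the sample values mirror the notification shown in this section (a datapoint of 2.0 against a threshold of 1.0).

```python
import operator

# Comparison names as used by CloudWatch alarms, mapped to Python operators.
COMPARATORS = {
    "GreaterThanOrEqualToThreshold": operator.ge,
    "GreaterThanThreshold": operator.gt,
    "LessThanThreshold": operator.lt,
    "LessThanOrEqualToThreshold": operator.le,
}


def evaluate_alarm(datapoints, threshold, comparison, evaluation_periods=1):
    """Return the alarm state for the most recent per-period statistics."""
    if len(datapoints) < evaluation_periods:
        return "INSUFFICIENT_DATA"
    compare = COMPARATORS[comparison]
    recent = datapoints[-evaluation_periods:]
    breaching = all(compare(value, threshold) for value in recent)
    return "ALARM" if breaching else "OK"


# One datapoint of 2.0 against a threshold of 1.0 puts the alarm in ALARM.
state = evaluate_alarm([2.0], 1.0, "GreaterThanOrEqualToThreshold")
```

In this sketch an alarm with no datapoints yet reports INSUFFICIENT_DATA.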

You are receiving this email because your Amazon CloudWatch Alarm "Ingestor_ErrorsCount" in the EU - Ireland region has entered the ALARM state, because "Threshold Crossed: 1 datapoint (2.0) was greater than or equal to the threshold (1.0)." at "Wednesday 07 October, 2015 19:29:29 UTC".

View this alarm in the AWS Management Console:
https://console.aws.amazon.com/cloudwatch/home?region=eu-west-1#s=Alarms&alarm=Ingestor_ErrorsCount

Alarm Details:
- Name: Ingestor_ErrorsCount
- Description:
- State Change: INSUFFICIENT_DATA -> ALARM
- Reason for State Change: Threshold Crossed: 1 datapoint (2.0) was greater than or equal to the threshold (1.0).
- Timestamp: Wednesday 07 October, 2015 19:29:29 UTC


Thank You!
