Machine Learning and Logging for Monitoring Microservices


ANALYZE THIS: ML AND LOGGING FOR MONITORING MICROSERVICES

skb rides the rocket

kernel: xen_netfront: xennet: skb rides the rocket: 19 slots

Daniel Berman

• Product Evangelist @logzio

• LAMPer, Docker, ELK

• Speaker/Blogger (SitePoint, DZone)

• Meetup organizer: TLV-PHP, TLV-ELK

• Contact me: @proudboffin | daniel@logz.io

1-min on Logz.io

• Log analysis company

• ELK-as-a-Service

• Enterprise grade: auto-everything, security, multi-tenant

• Additional features: ELK Apps, S3 archiving, AI

Agenda

• Logs + logging background

• The challenges

• Centralized logging with ELK

• Using machine learning

• Demo

• Q & A

WHAT ARE LOGS?

Online user behavior

IoT analytics

Dev, monitoring & system troubleshooting

Security and compliance

LOG ANALYTICS IS FUNDAMENTAL FOR UNDERSTANDING MACHINES

Security devices

App server

Network

LOG ANALYTICS FOR MICROSERVICES

• Service logs

10/01/17 00:53:51 INFO apollo i.l.c.b.c.b.MappedPageFactory: Page file /tmp/logzio-logback-buffer/listener-metrics/logzio-logback-appender/data/page-48.dat was just deleted.

• Service metrics

10/01/17 02:53:51 INFO apollo a.b.c.metrics: Account-Incoming, key: 126, value: 54321
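Both sample lines above share one layout: timestamp, level, service name, logger, message. As a minimal sketch, assuming that layout holds for every line, they could be turned into structured events as follows. The names `parse_line`, `LINE_RE` and `METRIC_RE` are made up for this illustration, and the date is kept as a string because the slide does not say whether it is day-first or month-first.

```python
import re

# Assumed layout: "<date> <time> <LEVEL> <service> <logger>: <message>"
LINE_RE = re.compile(
    r"^(?P<date>\d{2}/\d{2}/\d{2}) (?P<time>\d{2}:\d{2}:\d{2}) "
    r"(?P<level>[A-Z]+) (?P<service>\S+) (?P<logger>[^:\s]+): (?P<message>.*)$"
)

# Assumed metric message layout: "<name>, key: <int>, value: <int>"
METRIC_RE = re.compile(r"^(?P<name>[^,]+), key: (?P<key>\d+), value: (?P<value>\d+)$")


def parse_line(line):
    """Return a dict of fields, or None if the line does not match the layout."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    event = match.groupdict()
    metric = METRIC_RE.match(event["message"])
    if metric:
        # Metric lines get an extra structured field with numeric values.
        event["metric"] = {
            "name": metric["name"],
            "key": int(metric["key"]),
            "value": int(metric["value"]),
        }
    return event


if __name__ == "__main__":
    sample = "10/01/17 02:53:51 INFO apollo a.b.c.metrics: Account-Incoming, key: 126, value: 54321"
    print(parse_line(sample))
```

Once the value is numeric it can be aggregated and graphed, which is what separates a service metric from a plain service log.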

LOG ANALYTICS FOR MICROSERVICES

• Host logs/metrics

• Execution runtime logs

THE CHALLENGES WITH LOGGING MICROSERVICES

• Transient

• Distributed

• Independent

• Multilayered

LOGGING IN A DOCKERIZED WORLD

$ docker logs

2016-06-02T13:05:22.614090Z 0 [Note] InnoDB: 5.7.12 started; log sequence number 2522067

LOGGING IN A DOCKERIZED WORLD

$ docker stats

CONTAINER      CPU %   MEM USAGE / LIMIT   MEM %   NET I/O               BLOCK I/O
3747bd397456   0.01%   3.641 MB / 2.1 GB   0.17%   3.366 kB / 648 B      0 B / 0 B
396e42ba0d15   0.11%   1.638 MB / 2.1 GB   0.08%   9.79 kB / 648 B       348.2 kB / 0 B
468bf755240a   3.19%   45.67 MB / 2.1 GB   2.17%   25.19 MB / 17.95 MB   774.1 kB / 0 B
5f16814a3c0e   0.01%   495.6 kB / 2.1 GB   0.02%   8.564 kB / 648 B      0 B / 0 B
74cdfa7b8a0c   0.04%   3.908 MB / 2.1 GB   0.19%   2.028 kB / 648 B      0 B / 0 B
99bafb7600fc   0.00%   32.95 MB / 2.1 GB   1.57%   0 B / 0 B             2.093 MB / 20.48 kB

LOGGING IN A DOCKERIZED WORLD

$ docker daemon

time="2016-06-05T12:03:49.716900785Z" level=debug msg="received containerd event: &types.Event{Type:\"exit\", Id:\"3747bd397456cd28058bb40799cd0642f431849b5c43ce56536ab7f55a98114f\", Status:0x0, Pid:\"4120a7625a592f7c95eab4b1b442a45370f6dd95b63d284714dbb58f00d0a20d\", Timestamp:0x57541525}"
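The daemon line above is a sequence of `key=value` pairs in which some values are double-quoted and contain escaped quotes. A minimal sketch of splitting such a line into fields is shown here. The helper `parse_logfmt` is a hypothetical name, the sample is shortened from the slide (the `Id` and `Pid` fields are left out), and the regex covers only the quoting seen in that one line, not every form the daemon may emit.

```python
import re

# A key, "=", then either a double-quoted value (with backslash escapes)
# or a bare run of non-space characters.
PAIR_RE = re.compile(r'(\w+)=("(?:[^"\\]|\\.)*"|\S+)')


def parse_logfmt(line):
    """Split a key=value log line into a dict of strings."""
    fields = {}
    for key, raw in PAIR_RE.findall(line):
        if len(raw) >= 2 and raw.startswith('"') and raw.endswith('"'):
            # Drop the surrounding quotes and unescape the inner ones.
            raw = raw[1:-1].replace('\\"', '"')
        fields[key] = raw
    return fields


# Shortened version of the slide's line (Id and Pid fields omitted).
SAMPLE = (
    r'time="2016-06-05T12:03:49.716900785Z" level=debug '
    r'msg="received containerd event: &types.Event{Type:\"exit\", Status:0x0}"'
)

if __name__ == "__main__":
    for name, value in parse_logfmt(SAMPLE).items():
        print(name, "->", value)
```

With `level` and `time` as separate fields, debug noise from the daemon can be filtered out before it reaches the central store.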

OH, AND THERE’S THIS…

Large & complex application & operational logs

Multiple different formats

Multiple log files per component / instance

Slow & labor-intensive

Error-prone processing

Relies on an individual’s skills

Expensive

Hard to find what is relevant and important in log data

Scaling and securing an open-source implementation is expensive and almost impossible

CENTRALIZED LOGGING TO THE RESCUE

• Centralized data collection and management

• Provides inferable context to logs

• Analysis, event correlation and visualization

OLD SCHOOL LOGGING

$ grep ' 30[1234] ' /var/log/apache2/access.log | grep -v baidu | grep -v Googlebot

173.230.156.8 - - [04/Sep/2015:06:10:10 +0000] "GET /morpht HTTP/1.0" 301 26 "-" "Mozilla/5.0 (pc-x86_64-linux-gnu)"

192.3.83.5 - - [04/Sep/2015:06:10:22 +0000] "GET /?q=node/add HTTP/1.0" 301 26 "http://morpht.com/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/600.2.5 (KHTML, like Gecko) Version/8.0.2 Safari/600.2.5"
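For readers less fluent in shell, the grep pipeline above does two things: it keeps lines containing a 301 to 304 status surrounded by spaces, then discards lines that mention baidu or Googlebot. A rough Python equivalent is sketched below. The function name `redirects` is made up for this illustration, and the log path in the usage block is an assumption that depends on the distribution.

```python
import re

# Same pattern as the first grep: a 301-304 status surrounded by spaces.
REDIRECT_RE = re.compile(r" 30[1234] ")

# Same substrings the two `grep -v` stages exclude (case-sensitive, like grep).
EXCLUDED = ("baidu", "Googlebot")


def redirects(lines):
    """Yield access-log lines with a 30x status that are not from the excluded bots."""
    for line in lines:
        if REDIRECT_RE.search(line) and not any(word in line for word in EXCLUDED):
            yield line


if __name__ == "__main__":
    # Assumed default Apache log path on Debian-style systems; adjust as needed.
    with open("/var/log/apache2/access.log", encoding="utf-8", errors="replace") as fh:
        for hit in redirects(fh):
            print(hit, end="")
```

Either version has the same limitation: it runs on one host against one file, which is the problem centralized logging is meant to solve.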

NEW SCHOOL LOGGING

A BIT ABOUT ELK

• World’s most popular open source log analysis platform

• 4.5M downloads a month!

• Centralized logging AND: search, BI, SEO, IoT, and more

THE MARKET IS DOMINATED BY OPEN SOURCE SOLUTIONS

Over the past 3 years, the market shifted attention from proprietary to open source

Simple and beautiful: It’s simple to get started and play with ELK, and the UI is just beautiful

Open Source/Flexible: Fast-growing community, no vendor lock-in and no license cost

Fast. Very fast.: Blazing quick responses even when searching through millions of documents

ELK Stack: 500,000+ companies

15K companies

TYPICAL ELK PIPELINE

• Beats: log shipper

• Logstash: collecting and parsing

• Elasticsearch: full-text search and analysis engine; scalable, fast, highly available; REST API

• Kibana: visualizations and dashboards

STEP 1 – INSTALLING ELK

https://hub.docker.com/r/sebp/elk/

elk:
  image: sebp/elk
  ports:
    - "5601:5601"
    - "9200:9200"
    - "5044:5044"

$ sudo docker-compose up elk

https://github.com/deviantony/docker-elk

STEP 2 – FORWARDING LOGS

• Logging drivers (json-file, syslog, fluentd…)

$ docker run -d --name nginx --log-driver=syslog --log-opt syslog-address=tcp://SYSLOG_IP:PORT -p 80:80 nginx:alpine

webserver:
  image: nginx:alpine
  container_name: nginx
  ports:
    - "80:80"
  logging:
    driver: syslog
    options:
      syslog-address: "tcp://SYSLOG_IP:PORT"
      tag: "nginx"

STEP 2 – FORWARDING LOGS

• Logspout

$ docker run --name="logspout" \
    --volume=/var/run/docker.sock:/var/run/docker.sock \
    gliderlabs/logspout \
    syslog+tls://167.23.145.12:55555

STEP 2 – FORWARDING LOGS

• Filebeat

yourapp:
  image: your/image
  ports:
    - "80:80"
  links:
    - elk

elk:
  image: sebp/elk
  ports:
    - "5601:5601"
    - "9200:9200"
    - "5044:5044"

STEP 3 – PARSING

• Configure Logstash (input, filter, output)

filter {
  if [type] == "dockerlogs" {
    if ([message] =~ "^\tat ") {
      drop {}
    }
    grok {
      break_on_match => false
      match => [ "message", " responded with %{NUMBER:status_code:int}" ]
      tag_on_failure => []
    }
  }
}
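To make the Logstash filter above easier to follow, here is a rough Python rendering of its logic. This is only a sketch of the behaviour, not how Logstash implements it. `apply_filter` is a hypothetical name, events are modelled as plain dicts, and grok's `NUMBER` pattern (which also accepts signs and decimals) is simplified to a run of digits.

```python
import re

# Java stack-trace frames start with a tab followed by "at ".
STACK_FRAME_RE = re.compile(r"^\tat ")

# Simplified stand-in for: " responded with %{NUMBER:status_code:int}"
STATUS_RE = re.compile(r" responded with (?P<status_code>\d+)")


def apply_filter(event):
    """Return the (possibly enriched) event, or None if it should be dropped."""
    if event.get("type") != "dockerlogs":
        return event  # the conditional only covers dockerlogs events
    message = event.get("message", "")
    if STACK_FRAME_RE.search(message):
        return None  # equivalent of drop {}
    match = STATUS_RE.search(message)
    if match:
        # Equivalent of the ":int" suffix: store the status code as an integer.
        event = dict(event, status_code=int(match["status_code"]))
    # No match: event passes through untagged, like tag_on_failure => []
    return event


if __name__ == "__main__":
    print(apply_filter({"type": "dockerlogs", "message": "GET /health responded with 200"}))
    print(apply_filter({"type": "dockerlogs", "message": "\tat com.example.Foo.bar(Foo.java:42)"}))
```

Dropping stack-trace frames trims index volume, and storing `status_code` as an integer makes range queries and aggregations possible later in Kibana.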

STEP 4 – SECURITY

• DO NOT expose Elasticsearch (‘network.host’)

• Use proxies

• Isolate Elasticsearch

• Change default ports

OTHER SOLUTIONS

• Hosted ELK (Logz.io, Elastic Cloud, Sematext)

• Other logging/monitoring SaaS (Datadog, Papertrail, Loggly)

THE BIG ELEPHANT (ELK) IN THE ROOM

• Not knowing what question to ask

• Needle in the haystack syndrome

• Logs cannot be analyzed by a human alone

• Anomaly detection does not work

ANOMALY DETECTION DOESN’T WORK

• Not every anomaly is an error

• Not every error manifests as an anomaly

• Apps run as step functions
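The "step function" point can be illustrated with a small experiment. The sketch below uses a rolling z-score, chosen here only as a simple stand-in for statistical anomaly detection in general; the talk does not name a specific detector. The series is synthetic, and the window and threshold values are arbitrary assumptions.

```python
from statistics import mean, stdev


def zscore_anomalies(values, window=10, threshold=3.0):
    """Return indexes whose value lies more than `threshold` standard
    deviations from the mean of the preceding `window` values."""
    flagged = []
    for i in range(window, len(values)):
        history = values[i - window:i]
        mu, sigma = mean(history), stdev(history)
        if sigma == 0:
            continue  # flat history: z-score undefined, skipped in this sketch
        if abs(values[i] - mu) / sigma > threshold:
            flagged.append(i)
    return flagged


# Synthetic requests-per-minute series (made up for illustration):
# steady around 100, then a planned change doubles traffic to around 200.
noise = [0, 1, -1, 0, 2, -2, 0, 1, -1, 0]
before = [100 + n for n in noise] * 2
after = [200 + n for n in noise]
series = before + after

if __name__ == "__main__":
    print(zscore_anomalies(series))
```

On this series the detector flags only the first point after the step, a legitimate change in load rather than an error. It would stay silent on an error that leaves the metric unchanged, which is the other half of the slide's argument.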

ENTER MACHINE LEARNING?

DEMO TIME!

WHAT IS MACHINE LEARNING?

“Machine learning is a type of artificial intelligence that provides computers with the ability to learn without being explicitly programmed.” (TechTarget)

SUPERVISED MACHINE LEARNING (BY EXAMPLE)

1. Labeling – gathering and labeling logs

• User behavior

• Inter-user similarities

• Public resources

2. Training a classifier – defining which logs are important

3. Integration within the system
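The three steps above can be sketched in a few lines. The example uses a tiny multinomial Naive Bayes classifier written with the standard library only. That choice is for brevity; the talk does not say which model Logz.io uses. The labels and the second "important" line are hypothetical, and the other training lines are abridged from the slides.

```python
import math
import re
from collections import Counter, defaultdict

TOKEN_RE = re.compile(r"[a-z_]+")


def tokenize(line):
    """Lowercase the line and keep alphabetic tokens (numbers and ids are dropped)."""
    return TOKEN_RE.findall(line.lower())


class NaiveBayesLogClassifier:
    def __init__(self):
        self.token_counts = defaultdict(Counter)  # label -> token -> count
        self.label_counts = Counter()             # label -> number of lines
        self.vocabulary = set()

    def train(self, labeled_lines):
        # Step 2: learn token frequencies per label.
        for line, label in labeled_lines:
            self.label_counts[label] += 1
            for token in tokenize(line):
                self.token_counts[label][token] += 1
                self.vocabulary.add(token)

    def predict(self, line):
        # Step 3: score a new line; Laplace smoothing handles unseen tokens.
        total = sum(self.label_counts.values())
        scores = {}
        for label, count in self.label_counts.items():
            score = math.log(count / total)
            denom = sum(self.token_counts[label].values()) + len(self.vocabulary)
            for token in tokenize(line):
                score += math.log((self.token_counts[label][token] + 1) / denom)
            scores[label] = score
        return max(scores, key=scores.get)


# Step 1: labeled examples (labels are hypothetical, for illustration only).
TRAINING = [
    ("kernel: xen_netfront: xennet: skb rides the rocket: 19 slots", "important"),
    ("kernel: xen_netfront: xennet: skb rides the rocket: 21 slots", "important"),
    ("[Note] InnoDB: 5.7.12 started; log sequence number 2522067", "noise"),
    ("INFO apollo i.l.c.b.c.b.MappedPageFactory: Page file was just deleted.", "noise"),
    ("INFO apollo a.b.c.metrics: Account-Incoming, key: 126, value: 54321", "noise"),
]

classifier = NaiveBayesLogClassifier()
classifier.train(TRAINING)

if __name__ == "__main__":
    print(classifier.predict("kernel: xennet: skb rides the rocket: 20 slots"))
```

In a real system the labels would come from the sources listed under step 1 (user behavior, inter-user similarities, public resources) rather than from a hand-written list.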

‘skb rides the rocket’

kernel: xen_netfront: xennet: skb rides the rocket: 19 slots

(http://serverfault.com/questions/647489/what-is-causing-skb-rides-the-rocket-errors)

EXTRAS

• Logz.io blog: http://logz.io/blog

• Elastic docs: http://elastic.co/documentation

• Slack team: https://elk-stack-professionals-pfuiokfxqy.now.sh

• ELK meetup: https://www.meetup.com/Tel-Aviv-Yafo-ELK-ElasticSearch-Meetup/

THANKS!

@proudboffin | daniel@logz.io
