ANALYZE THIS: ML AND LOGGING FOR MONITORING MICROSERVICES

Machine Learning and Logging for Monitoring Microservices


Page 1: Machine Learning and Logging for Monitoring Microservices

ANALYZE THIS: ML AND LOGGING FOR MONITORING MICROSERVICES

Page 2: Machine Learning and Logging for Monitoring Microservices

skb rides the rocket

Page 3: Machine Learning and Logging for Monitoring Microservices

kernel: xen_netfront: xennet: skb rides the rocket: 19 slots

Page 4: Machine Learning and Logging for Monitoring Microservices

Daniel Berman

• Product Evangelist @logzio
• LAMPer, Docker, ELK
• Speaker/Blogger (SitePoint, DZone)
• Meetup organizer: TLV-PHP, TLV-ELK
• Contact me: @proudboffin | [email protected]

Page 5: Machine Learning and Logging for Monitoring Microservices

1-min on Logz.io

• Log analysis company
• ELK-as-a-Service
• Enterprise grade: auto-everything, security, multi-tenant
• Additional features: ELK Apps, S3 archiving, AI

Page 6: Machine Learning and Logging for Monitoring Microservices

Agenda

• Logs + logging background

• The challenges
• Centralized logging with ELK
• Using machine learning
• Demo
• Q & A

Page 7: Machine Learning and Logging for Monitoring Microservices

WHAT ARE LOGS?

Page 8: Machine Learning and Logging for Monitoring Microservices

LOG ANALYTICS IS FUNDAMENTAL FOR UNDERSTANDING MACHINES

• Online user behavior
• IoT analytics
• Dev, monitoring & system troubleshooting
• Security and compliance

• Security devices
• App server
• Network

Page 9: Machine Learning and Logging for Monitoring Microservices

LOG ANALYTICS FOR MICROSERVICES

• Service logs

10/01/17 00:53:51 INFO apollo i.l.c.b.c.b.MappedPageFactory: Page file /tmp/logzio-logback-buffer/listener-metrics/logzio-logback-appender/data/page-48.dat was just deleted.

• Service metrics

10/01/17 02:53:51 INFO apollo a.b.c.metrics: Account-Incoming, key: 126, value: 54321
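A minimal sketch of how the two sample lines above could be split into structured fields before indexing. The regular expressions and the field names (`ts`, `level`, `service`, `logger`, `metric`) are assumptions modeled only on these two samples, not the format any particular shipper uses.

```python
import re

# Layout assumed from the two sample lines: timestamp, level, service, logger, message.
LINE_RE = re.compile(
    r"^(?P<ts>\d{2}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) "
    r"(?P<level>[A-Z]+) "
    r"(?P<service>\S+) "
    r"(?P<logger>[^:]+): "
    r"(?P<message>.*)$"
)

# Metric messages in the sample look like "<name>, key: <int>, value: <int>".
METRIC_RE = re.compile(r"^(?P<name>[^,]+), key: (?P<key>\d+), value: (?P<value>\d+)$")


def parse_line(line):
    """Return a dict of fields, or None when the line does not fit the assumed layout."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    event = match.groupdict()
    metric = METRIC_RE.match(event["message"])
    if metric is not None:
        # Numeric fields are converted so they can be aggregated later.
        event["metric"] = {
            "name": metric.group("name"),
            "key": int(metric.group("key")),
            "value": int(metric.group("value")),
        }
    return event


service_log = parse_line(
    "10/01/17 00:53:51 INFO apollo i.l.c.b.c.b.MappedPageFactory: Page file "
    "/tmp/logzio-logback-buffer/listener-metrics/logzio-logback-appender/data/"
    "page-48.dat was just deleted."
)
service_metric = parse_line(
    "10/01/17 02:53:51 INFO apollo a.b.c.metrics: Account-Incoming, key: 126, value: 54321"
)
```

The timestamp is kept as a string on purpose: the sample does not say whether `10/01/17` is day-first or month-first.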

Page 10: Machine Learning and Logging for Monitoring Microservices

LOG ANALYTICS FOR MICROSERVICES

• Host logs/metrics
• Execution runtime logs

Page 11: Machine Learning and Logging for Monitoring Microservices

THE CHALLENGES WITH LOGGING MICROSERVICES

• Transient

• Distributed

• Independent

• Multilayered

Page 12: Machine Learning and Logging for Monitoring Microservices

LOGGING IN A DOCKERIZED WORLD

$ docker logs

2016-06-02T13:05:22.614090Z 0 [Note] InnoDB: 5.7.12 started; log sequence number 2522067

Page 13: Machine Learning and Logging for Monitoring Microservices

LOGGING IN A DOCKERIZED WORLD

$ docker stats

CONTAINER      CPU %   MEM USAGE / LIMIT   MEM %   NET I/O               BLOCK I/O
3747bd397456   0.01%   3.641 MB / 2.1 GB   0.17%   3.366 kB / 648 B      0 B / 0 B
396e42ba0d15   0.11%   1.638 MB / 2.1 GB   0.08%   9.79 kB / 648 B       348.2 kB / 0 B
468bf755240a   3.19%   45.67 MB / 2.1 GB   2.17%   25.19 MB / 17.95 MB   774.1 kB / 0 B
5f16814a3c0e   0.01%   495.6 kB / 2.1 GB   0.02%   8.564 kB / 648 B      0 B / 0 B
74cdfa7b8a0c   0.04%   3.908 MB / 2.1 GB   0.19%   2.028 kB / 648 B      0 B / 0 B
99bafb7600fc   0.00%   32.95 MB / 2.1 GB   1.57%   0 B / 0 B             2.093 MB / 20.48 kB
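One way to turn this kind of tabular output into records is sketched below. It assumes the columns are separated by two or more spaces, as in a terminal capture; the `SAMPLE` text reuses three rows from the slide, and `parse_stats` and `percent` are hypothetical helper names.

```python
import re

# Three rows copied from the slide; columns are assumed to be separated by 2+ spaces.
SAMPLE = """\
CONTAINER      CPU %   MEM USAGE / LIMIT   MEM %   NET I/O               BLOCK I/O
3747bd397456   0.01%   3.641 MB / 2.1 GB   0.17%   3.366 kB / 648 B      0 B / 0 B
468bf755240a   3.19%   45.67 MB / 2.1 GB   2.17%   25.19 MB / 17.95 MB   774.1 kB / 0 B
99bafb7600fc   0.00%   32.95 MB / 2.1 GB   1.57%   0 B / 0 B             2.093 MB / 20.48 kB
"""

COLUMN_GAP = re.compile(r"\s{2,}")


def parse_stats(text):
    """Map each data row to a dict keyed by the header names."""
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    header = COLUMN_GAP.split(lines[0])
    rows = []
    for line in lines[1:]:
        cells = COLUMN_GAP.split(line)
        if len(cells) != len(header):
            continue  # skip rows that do not line up with the header
        rows.append(dict(zip(header, cells)))
    return rows


def percent(cell):
    """Convert a cell such as '3.19%' to a float."""
    return float(cell.rstrip("%"))


rows = parse_stats(SAMPLE)
busiest = max(rows, key=lambda row: percent(row["CPU %"]))
print(busiest["CONTAINER"], busiest["CPU %"])
```

Splitting on runs of spaces keeps multi-word cells such as `3.641 MB / 2.1 GB` intact, because their internal gaps are single spaces.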

Page 14: Machine Learning and Logging for Monitoring Microservices

LOGGING IN A DOCKERIZED WORLD

$ docker daemon

time="2016-06-05T12:03:49.716900785Z" level=debug msg="received containerd event: &types.Event{Type:\"exit\", Id:\"3747bd397456cd28058bb40799cd0642f431849b5c43ce56536ab7f55a98114f\", Status:0x0, Pid:\"4120a7625a592f7c95eab4b1b442a45370f6dd95b63d284714dbb58f00d0a20d\", Timestamp:0x57541525}"

Page 15: Machine Learning and Logging for Monitoring Microservices

OH, AND THERE’S THIS…

• Large & complex application & operational logs: multiple different formats, multiple log files per component/instance
• Slow & labor-intensive: error-prone processing, relies on an individual’s skills, expensive
• Hard to find what is relevant and important in log data
• An open-source implementation is expensive to secure and almost impossible to scale

Page 16: Machine Learning and Logging for Monitoring Microservices

CENTRALIZED LOGGING TO THE RESCUE

• Centralized data collection and management

• Provides inferable context to logs

• Analysis, event correlation and visualization

Page 17: Machine Learning and Logging for Monitoring Microservices

OLD SCHOOL LOGGING

$ grep ' 30[1234] ' /var/log/apache2/access.log | grep -v baidu | grep -v Googlebot

173.230.156.8 - - [04/Sep/2015:06:10:10 +0000] "GET /morpht HTTP/1.0" 301 26 "-" "Mozilla/5.0 (pc-x86_64-linux-gnu)"

192.3.83.5 - - [04/Sep/2015:06:10:22 +0000] "GET /?q=node/add HTTP/1.0" 301 26 "http://morpht.com/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/600.2.5 (KHTML, like Gecko) Version/8.0.2 Safari/600.2.5"
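For comparison, the same filter can be sketched in Python. This is only an illustration: the regular expression assumes the Apache combined log format seen in the two sample lines, the function name `redirects` is hypothetical, and the third and fourth sample lines are invented to exercise the exclusions.

```python
import re

# Assumed layout: Apache "combined" format, as in the two sample lines from the slide.
ACCESS_RE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"$'
)
REDIRECT_CODES = {"301", "302", "303", "304"}
EXCLUDED = ("baidu", "Googlebot")  # same substrings the grep -v stages drop


def redirects(lines):
    """Yield parsed 301-304 entries, skipping lines that mention an excluded crawler."""
    for line in lines:
        line = line.strip()
        match = ACCESS_RE.match(line)
        if match is None or match.group("status") not in REDIRECT_CODES:
            continue
        if any(word in line for word in EXCLUDED):
            continue  # like grep -v, this looks at the whole line
        yield match.groupdict()


sample = [
    '173.230.156.8 - - [04/Sep/2015:06:10:10 +0000] "GET /morpht HTTP/1.0" 301 26 '
    '"-" "Mozilla/5.0 (pc-x86_64-linux-gnu)"',
    '192.3.83.5 - - [04/Sep/2015:06:10:22 +0000] "GET /?q=node/add HTTP/1.0" 301 26 '
    '"http://morpht.com/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) '
    'AppleWebKit/600.2.5 (KHTML, like Gecko) Version/8.0.2 Safari/600.2.5"',
    # The two lines below are made up for the example.
    '10.0.0.1 - - [04/Sep/2015:06:11:00 +0000] "GET /robots.txt HTTP/1.1" 301 26 '
    '"-" "Googlebot/2.1"',
    '10.0.0.2 - - [04/Sep/2015:06:12:00 +0000] "GET / HTTP/1.1" 200 512 "-" "curl/7.0"',
]
hits = list(redirects(sample))
```

Either version still has to run on each host that holds an access log, which is the limitation centralized logging is meant to remove.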

Page 18: Machine Learning and Logging for Monitoring Microservices

NEW SCHOOL LOGGING

Page 19: Machine Learning and Logging for Monitoring Microservices

A BIT ABOUT ELK

• World’s most popular open source log analysis platform

• 4.5M downloads a month!

• Centralized logging AND: search, BI, SEO, IoT, and more

Page 20: Machine Learning and Logging for Monitoring Microservices

THE MARKET IS DOMINATED BY OPEN SOURCE SOLUTIONS

Over the past 3 years, the market shifted attention from proprietary to open source

• Simple and beautiful: It’s simple to get started and play with ELK, and the UI is just beautiful
• Open source/flexible: Fast-growing community, no vendor lock-in and no license cost
• Fast. Very fast: Blazing quick responses even when searching through millions of documents

ELK Stack: 500,000+ companies

15K companies

Page 21: Machine Learning and Logging for Monitoring Microservices

TYPICAL ELK PIPELINE

• Visualizations and dashboards

• Log shipper

• Collecting and parsing

• Full-text search and analysis engine

• Scalable, fast, highly available

• REST API

Page 22: Machine Learning and Logging for Monitoring Microservices

STEP 1 – INSTALLING ELK

https://hub.docker.com/r/sebp/elk/

elk:
  image: sebp/elk
  ports:
    - "5601:5601"
    - "9200:9200"
    - "5044:5044"

$ sudo docker-compose up elk

https://github.com/deviantony/docker-elk

Page 23: Machine Learning and Logging for Monitoring Microservices

STEP 2 – FORWARDING LOGS

• Logging drivers (json-file, syslog, fluentd…)

$ docker run -d --name nginx --log-driver=syslog --log-opt syslog-address=tcp://SYSLOG_IP:PORT -p 80:80 nginx:alpine

webserver:
  image: nginx:alpine
  container_name: nginx
  ports:
    - "80:80"
  logging:
    driver: syslog
    options:
      syslog-address: "tcp://SYSLOG_IP:PORT"
      syslog-tag: "nginx"

Page 24: Machine Learning and Logging for Monitoring Microservices

STEP 2 – FORWARDING LOGS

• Logspout

$ docker run --name="logspout" \
    --volume=/var/run/docker.sock:/var/run/docker.sock \
    gliderlabs/logspout \
    syslog+tls://167.23.145.12:55555

Page 25: Machine Learning and Logging for Monitoring Microservices

STEP 2 – FORWARDING LOGS

• Filebeat

yourapp:
  image: your/image
  ports:
    - "80:80"
  links:
    - elk

elk:
  image: sebp/elk
  ports:
    - "5601:5601"
    - "9200:9200"
    - "5044:5044"

Page 26: Machine Learning and Logging for Monitoring Microservices

STEP 3 – PARSING

• Configure Logstash (input, filter, output)

filter {
  if [type] == "dockerlogs" {
    # Drop stack-trace continuation lines (a tab followed by "at ")
    if ([message] =~ "^\tat ") {
      drop {}
    }
    # Extract the response status code into an integer field
    grok {
      break_on_match => false
      match => [ "message", " responded with %{NUMBER:status_code:int}" ]
      tag_on_failure => []
    }
  }
}
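To make the filter's behavior concrete, here is a rough Python emulation of the same two rules. It is a sketch, not how Logstash itself works: `apply_filter` is a hypothetical name, the sample events are made up, and `\d+` is a simplification of grok's `NUMBER` pattern.

```python
import re

STACK_FRAME = re.compile(r"^\tat ")  # same pattern the filter uses to drop lines
STATUS = re.compile(r" responded with (?P<status_code>\d+)")  # simplified NUMBER


def apply_filter(event):
    """Return the (possibly enriched) event, or None when it would be dropped."""
    if event.get("type") != "dockerlogs":
        return event  # other types pass through untouched
    message = event.get("message", "")
    if STACK_FRAME.search(message):
        return None  # equivalent of drop {}
    match = STATUS.search(message)
    if match is not None:
        # Mirrors the ":int" conversion in the grok pattern.
        event = dict(event, status_code=int(match.group("status_code")))
    return event  # no failure tag is added when nothing matches


# Hypothetical events for illustration.
dropped = apply_filter({"type": "dockerlogs", "message": "\tat com.example.Foo.bar(Foo.java:42)"})
enriched = apply_filter({"type": "dockerlogs", "message": "GET /health responded with 200"})
plain = apply_filter({"type": "dockerlogs", "message": "starting up"})
```

Storing `status_code` as an integer is what later allows range queries and numeric aggregations on it.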

Page 27: Machine Learning and Logging for Monitoring Microservices

STEP 4 – SECURITY

• DO NOT expose Elasticsearch (‘network.host’)
• Use proxies
• Isolate Elasticsearch
• Change default ports

Page 28: Machine Learning and Logging for Monitoring Microservices
Page 29: Machine Learning and Logging for Monitoring Microservices

OTHER SOLUTIONS

• Hosted ELK (Logz.io, Elastic Cloud, Sematext)

• Other logging/monitoring SaaS (Datadog, Papertrail, Loggly)

Page 30: Machine Learning and Logging for Monitoring Microservices

THE BIG ELEPHANT (ELK) IN THE ROOM

• Not knowing what question to ask

• Needle in the haystack syndrome

• Logs cannot be analyzed by a human alone

• Anomaly detection does not work

Page 31: Machine Learning and Logging for Monitoring Microservices

ANOMALY DETECTION DOESN’T WORK

• Not every anomaly is an error
• Not every error presents itself as an anomaly
• Apps run as step functions
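The "step function" point can be illustrated with a small sketch. The detector below is a plain rolling z-score, chosen only as a stand-in for generic statistical anomaly detection; the metric values are invented, and the step is meant to represent a legitimate change such as a deploy or a traffic shift.

```python
from statistics import mean, pstdev


def zscore_anomalies(values, window=10, threshold=3.0):
    """Return indexes whose value lies more than `threshold` standard
    deviations away from the mean of the preceding `window` values."""
    flagged = []
    for i in range(window, len(values)):
        history = values[i - window:i]
        sigma = pstdev(history)
        if sigma == 0:
            continue  # flat history: the z-score is undefined, so skip
        if abs(values[i] - mean(history)) / sigma > threshold:
            flagged.append(i)
    return flagged


# Invented requests-per-minute series: steady near 100, then a healthy step to 200.
before_step = [100, 102, 98, 101, 99, 100, 103, 97, 100, 101]
after_step = [200, 202, 198, 201, 199, 200, 203, 197, 200, 201]
series = before_step + after_step

flagged = zscore_anomalies(series)
print(flagged)
```

The detector fires where the level changes (index 10), although nothing is broken, and then goes quiet once the new level becomes the baseline. That is the "not every anomaly is an error" case.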

Page 32: Machine Learning and Logging for Monitoring Microservices

ENTER MACHINE LEARNING?

Page 33: Machine Learning and Logging for Monitoring Microservices

DEMO TIME!

Page 34: Machine Learning and Logging for Monitoring Microservices

WHAT IS MACHINE LEARNING?

“Machine learning is a type of artificial intelligence that provides computers with the ability to learn without being explicitly programmed.” (TechTarget)

Page 35: Machine Learning and Logging for Monitoring Microservices

SUPERVISED MACHINE LEARNING (BY EXAMPLE)

1. Labeling – gathering and labeling logs
   • User behavior
   • Inter-user similarities
   • Public resources

2. Training a classifier – defining which logs are important

3. Integration within the system
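The three steps can be sketched with a toy classifier. Everything here is illustrative: the talk does not name an algorithm, so a small naive Bayes model is used as a stand-in, the labels `important` and `routine` are assumptions, and two of the training messages are invented.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(message):
    """Lowercase the message and keep alphabetic tokens; numbers are ignored."""
    return re.findall(r"[a-z_]+", message.lower())


class NaiveBayesLogClassifier:
    """Multinomial naive Bayes with add-one smoothing, as a minimal example."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)
        self.label_counts = Counter()
        self.vocabulary = set()

    def train(self, labeled_logs):
        for message, label in labeled_logs:
            tokens = tokenize(message)
            self.label_counts[label] += 1
            self.word_counts[label].update(tokens)
            self.vocabulary.update(tokens)

    def predict(self, message):
        total = sum(self.label_counts.values())
        best_label, best_score = None, -math.inf
        for label, count in self.label_counts.items():
            denominator = sum(self.word_counts[label].values()) + len(self.vocabulary)
            score = math.log(count / total)
            for token in tokenize(message):
                score += math.log((self.word_counts[label][token] + 1) / denominator)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Step 1: labeled examples. The labels are assumptions made for this sketch.
labeled_logs = [
    ("kernel: xen_netfront: xennet: skb rides the rocket: 19 slots", "important"),
    ("checkout service: connection refused by database, giving up", "important"),  # invented
    ("worker crashed: unhandled exception in job runner", "important"),  # invented
    ("Page file /tmp/logzio-logback-buffer/listener-metrics/"
     "logzio-logback-appender/data/page-48.dat was just deleted.", "routine"),
    ("Account-Incoming, key: 126, value: 54321", "routine"),
    ("InnoDB: 5.7.12 started; log sequence number 2522067", "routine"),
]

# Step 2: train the classifier.
classifier = NaiveBayesLogClassifier()
classifier.train(labeled_logs)

# Step 3: integration, here reduced to scoring each incoming message.
incoming = "kernel: xen_netfront: xennet: skb rides the rocket: 21 slots"
print(classifier.predict(incoming))
```

In a real pipeline the labels would come from the sources listed in step 1 (user behavior, similarities between users, public resources) and not from a hand-written list.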

Page 36: Machine Learning and Logging for Monitoring Microservices

‘skb rides the rocket’

kernel: xen_netfront: xennet: skb rides the rocket: 19 slots

(http://serverfault.com/questions/647489/what-is-causing-skb-rides-the-rocket-errors)

Page 37: Machine Learning and Logging for Monitoring Microservices
Page 38: Machine Learning and Logging for Monitoring Microservices

EXTRAS

• Logz.io blog: http://logz.io/blog

• Elastic docs: http://elastic.co/documentation

• Slack team: https://elk-stack-professionals-pfuiokfxqy.now.sh

• ELK meetup: https://www.meetup.com/Tel-Aviv-Yafo-ELK-ElasticSearch-Meetup/

Page 39: Machine Learning and Logging for Monitoring Microservices

THANKS!

@proudboffin | [email protected]