ANALYZE THIS: ML AND LOGGING FOR MONITORING MICROSERVICES
skb rides the rocket
kernel: xen_netfront: xennet: skb rides the rocket: 19 slots
Daniel Berman
• Product Evangelist @logzio
• LAMPer, Docker, ELK
• Speaker/Blogger (SitePoint, DZone)
• Meetup organizer: TLV-PHP, TLV-ELK
• Contact me: @proudboffin | [email protected]
1-min on
• Log analysis company
• ELK-as-a-Service
• Enterprise grade: auto-everything, security, multi-tenant
• Additional features: ELK Apps, S3 archiving, AI
Agenda
• Logs + logging background
• The challenges
• Centralized logging with ELK
• Using machine learning
• Demo
• Q & A
WHAT ARE LOGS?
Online user behavior
IoT analytics
Dev, monitoring & system troubleshooting
Security and compliance
LOG ANALYTICS IS FUNDAMENTAL FOR UNDERSTANDING MACHINES
Security devices
App server
Network
LOG ANALYTICS FOR MICROSERVICES
• Service logs
10/01/17 00:53:51 INFO apollo i.l.c.b.c.b.MappedPageFactory: Page file /tmp/logzio-logback-buffer/listener-metrics/logzio-logback-appender/data/page-48.dat was just deleted.
• Service metrics
10/01/17 02:53:51 INFO apollo a.b.c.metrics: Account-Incoming, key: 126, value: 54321
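Both line shapes above can be split into fields before they are shipped. The following is a minimal Python sketch, assuming the layout seen in the two samples (date, time, level, a source name, logger, message). The function name `parse_service_line` and the field names are illustrative and not part of any Logz.io tooling.

```python
import re

# Layout assumed from the two sample lines: date time LEVEL source logger: message
LINE_RE = re.compile(
    r"^(?P<date>\d{2}/\d{2}/\d{2}) (?P<time>\d{2}:\d{2}:\d{2}) "
    r"(?P<level>[A-Z]+) (?P<source>\S+) (?P<logger>\S+): (?P<message>.*)$"
)
# Metric payload as it appears in the sample: "<name>, key: <int>, value: <int>"
METRIC_RE = re.compile(r"^(?P<name>[\w-]+), key: (?P<key>\d+), value: (?P<value>\d+)$")


def parse_service_line(line):
    """Return a dict of fields, or None if the line does not match the assumed layout."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    record = match.groupdict()
    metric = METRIC_RE.match(record["message"])
    if metric:
        # Metric lines get an extra structured field with numeric values.
        record["metric"] = {
            "name": metric.group("name"),
            "key": int(metric.group("key")),
            "value": int(metric.group("value")),
        }
    return record


sample = "10/01/17 02:53:51 INFO apollo a.b.c.metrics: Account-Incoming, key: 126, value: 54321"
print(parse_service_line(sample)["metric"])
```

Plain service logs keep only the common fields, while metric lines gain a numeric `metric` entry that a dashboard could aggregate.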
LOG ANALYTICS FOR MICROSERVICES
• Host logs/metrics
• Execution runtime logs
THE CHALLENGES WITH LOGGING MICROSERVICES
• Transient
• Distributed
• Independent
• Multilayered
LOGGING IN A DOCKERIZED WORLD
$ docker logs
2016-06-02T13:05:22.614090Z 0 [Note] InnoDB: 5.7.12 started; log sequence number 2522067
LOGGING IN A DOCKERIZED WORLD
$ docker stats
CONTAINER      CPU %   MEM USAGE / LIMIT    MEM %   NET I/O               BLOCK I/O
3747bd397456   0.01%   3.641 MB / 2.1 GB    0.17%   3.366 kB / 648 B      0 B / 0 B
396e42ba0d15   0.11%   1.638 MB / 2.1 GB    0.08%   9.79 kB / 648 B       348.2 kB / 0 B
468bf755240a   3.19%   45.67 MB / 2.1 GB    2.17%   25.19 MB / 17.95 MB   774.1 kB / 0 B
5f16814a3c0e   0.01%   495.6 kB / 2.1 GB    0.02%   8.564 kB / 648 B      0 B / 0 B
74cdfa7b8a0c   0.04%   3.908 MB / 2.1 GB    0.19%   2.028 kB / 648 B      0 B / 0 B
99bafb7600fc   0.00%   32.95 MB / 2.1 GB    1.57%   0 B / 0 B             2.093 MB / 20.48 kB
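To chart these numbers rather than read them off a terminal, each row has to become numeric fields. Here is a rough Python sketch, assuming the column order shown above and decimal units (kB, MB, GB); newer Docker releases print binary units such as MiB, which this pattern does not cover. The names `parse_stats_row` and `to_bytes` are made up for the example.

```python
import re

# Decimal multipliers are an assumption based on the units printed in this output.
UNITS = {"B": 1, "kB": 10**3, "MB": 10**6, "GB": 10**9}
SIZE = r"([\d.]+) (B|kB|MB|GB)"
ROW_RE = re.compile(
    rf"^(\w+)\s+([\d.]+)%\s+{SIZE} / {SIZE}\s+([\d.]+)%\s+{SIZE} / {SIZE}\s+{SIZE} / {SIZE}$"
)


def to_bytes(number, unit):
    """Convert a (number, unit) pair such as ("3.641", "MB") to bytes."""
    return float(number) * UNITS[unit]


def parse_stats_row(row):
    """Parse one data row of the stats output; return None for headers or unknown layouts."""
    match = ROW_RE.match(row.strip())
    if match is None:
        return None
    g = match.groups()
    return {
        "container": g[0],
        "cpu_pct": float(g[1]),
        "mem_usage": to_bytes(g[2], g[3]),
        "mem_limit": to_bytes(g[4], g[5]),
        "mem_pct": float(g[6]),
        "net_in": to_bytes(g[7], g[8]),
        "net_out": to_bytes(g[9], g[10]),
        "block_in": to_bytes(g[11], g[12]),
        "block_out": to_bytes(g[13], g[14]),
    }


row = "468bf755240a 3.19% 45.67 MB / 2.1 GB 2.17% 25.19 MB / 17.95 MB 774.1 kB / 0 B"
print(parse_stats_row(row))
```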
LOGGING IN A DOCKERIZED WORLD
$ docker daemon
time="2016-06-05T12:03:49.716900785Z" level=debug msg="received containerd event:
&types.Event{Type:\"exit\",
Id:\"3747bd397456cd28058bb40799cd0642f431849b5c43ce56536ab7f55a98114f\",
Status:0x0,
Pid:\"4120a7625a592f7c95eab4b1b442a45370f6dd95b63d284714dbb58f00d0a20d\",
Timestamp:0x57541525}"
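The daemon writes key=value pairs with quoted values, so a line like the one above can be split without a custom grammar. A small Python sketch, assuming every field has the form `key=value` or `key="quoted value"`; `parse_logfmt` is a name chosen for this example.

```python
import shlex


def parse_logfmt(line):
    """Split a key=value log line into a dict; quoted values may contain spaces."""
    fields = {}
    # shlex handles the double quotes and the backslash-escaped quotes inside msg.
    for token in shlex.split(line):
        key, sep, value = token.partition("=")
        if sep:  # ignore tokens that are not key=value pairs
            fields[key] = value
    return fields


# Shortened version of the daemon line above (the Id and Pid values are left out here).
sample = (
    r'time="2016-06-05T12:03:49.716900785Z" level=debug '
    r'msg="received containerd event: &types.Event{Type:\"exit\", Status:0x0}"'
)
event = parse_logfmt(sample)
print(event["level"], "|", event["msg"])
```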
OH, AND THERE’S THIS…
Large & complex application & operational logs
Multiple different formats
Multiple log files per component / instance
Slow & labor-intensive
Error-prone processing
Relies on an individual’s skills
Expensive
Hard to find what is relevant and important in log data
An open-source implementation is expensive to secure and almost impossible to scale
CENTRALIZED LOGGING TO THE RESCUE
• Centralized data collection and management
• Provides inferable context to logs
• Analysis, event correlation and visualization
OLD SCHOOL LOGGING
$ grep ' 30[1234] ' /var/log/apache2/access.log | grep -v baidu | grep -v Googlebot
173.230.156.8 - - [04/Sep/2015:06:10:10 +0000] "GET /morpht HTTP/1.0" 301 26 "-" "Mozilla/5.0 (pc-x86_64-linux-gnu)"
192.3.83.5 - - [04/Sep/2015:06:10:22 +0000] "GET /?q=node/add HTTP/1.0" 301 26 "http://morpht.com/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/600.2.5 (KHTML, like Gecko) Version/8.0.2 Safari/600.2.5"
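The same filter is easy to express in a script, which also shows why this approach does not scale: it still reads one file on one host. A short Python sketch that mirrors the three grep stages; `redirect_hits` is a name chosen for this example, and the second and third sample lines are made up to exercise the exclusions.

```python
import re

REDIRECT_RE = re.compile(r" 30[1234] ")  # same pattern as the first grep
EXCLUDED = ("baidu", "Googlebot")        # same terms as the two grep -v stages


def redirect_hits(lines):
    """Yield lines with a 301-304 status that mention none of the excluded crawlers."""
    for line in lines:
        if REDIRECT_RE.search(line) and not any(term in line for term in EXCLUDED):
            yield line.rstrip("\n")


lines = [
    '173.230.156.8 - - [04/Sep/2015:06:10:10 +0000] "GET /morpht HTTP/1.0" 301 26 "-" "Mozilla/5.0 (pc-x86_64-linux-gnu)"',
    # The next two lines are invented for the example.
    '203.0.113.7 - - [04/Sep/2015:06:11:00 +0000] "GET /old HTTP/1.1" 301 26 "-" "Googlebot/2.1"',
    '203.0.113.8 - - [04/Sep/2015:06:12:00 +0000] "GET / HTTP/1.1" 200 512 "-" "Mozilla/5.0"',
]
for hit in redirect_hits(lines):
    print(hit)
```

To run it against a real file, pass an open file object instead of the list, for example `redirect_hits(open(path))` with a path of your own.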
NEW SCHOOL LOGGING
A BIT ABOUT ELK
• World’s most popular open source log analysis platform
• 4.5M downloads a month!
• Centralized logging AND: search, BI, SEO, IoT, and more
THE MARKET IS DOMINATED BY OPEN SOURCE SOLUTIONS
Over the past 3 years, the market shifted attention from proprietary to open source
Simple and beautiful
It’s simple to get started and play with ELK, and the UI is just beautiful
Open Source/Flexible
Fast-growing community, no vendor lock-in and no license cost
Fast. Very fast.
Blazing quick responses even when searching through millions of documents
ELK Stack: 500,000+ companies
15K companies
TYPICAL ELK PIPELINE
• Visualizations and dashboards
• Log shipper
• Collecting and parsing
• Full-text search and analysis engine
• Scalable, fast, highly available
• REST API
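The REST API bullet means that anything Kibana shows can also be fetched with a plain HTTP request. The sketch below only builds the URL and JSON body for such a search. It assumes Elasticsearch on `localhost:9200`, the default `logstash-*` index pattern, and a numeric `status_code` field; all three are assumptions to adapt, and `build_search_request` is a name invented for this example.

```python
import json


def build_search_request(status_code, minutes=15, index="logstash-*",
                         host="http://localhost:9200"):
    """Return (url, json_body) for a search on recent events with the given status code."""
    url = f"{host}/{index}/_search"
    body = {
        "size": 20,
        "sort": [{"@timestamp": {"order": "desc"}}],
        "query": {
            "bool": {
                # Filters do not affect scoring, which suits exact log lookups.
                "filter": [
                    {"term": {"status_code": status_code}},
                    {"range": {"@timestamp": {"gte": f"now-{minutes}m"}}},
                ]
            }
        },
    }
    return url, json.dumps(body)


url, payload = build_search_request(301)
print(url)
print(payload)
```

The returned pair can be sent with any HTTP client as a POST with a `Content-Type: application/json` header.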
STEP 1 – INSTALLING ELK
https://hub.docker.com/r/sebp/elk/
elk:
  image: sebp/elk
  ports:
    - "5601:5601"
    - "9200:9200"
    - "5044:5044"
$ sudo docker-compose up elk
https://github.com/deviantony/docker-elk
STEP 2 – FORWARDING LOGS
• Logging drivers (json-file, syslog, fluentd…)
$ docker run -d --name nginx --log-driver=syslog --log-opt syslog-address=tcp://SYSLOG_IP:PORT -p 80:80 nginx:alpine
webserver:
  image: nginx:alpine
  container_name: nginx
  ports:
    - "80:80"
  logging:
    driver: syslog
    options:
      syslog-address: "tcp://SYSLOG_IP:PORT"
      tag: "nginx"
STEP 2 – FORWARDING LOGS
• Logspout
$ docker run --name="logspout" \
    --volume=/var/run/docker.sock:/var/run/docker.sock \
    gliderlabs/logspout \
    syslog+tls://167.23.145.12:55555
STEP 2 – FORWARDING LOGS
• Filebeat
yourapp:
  image: your/image
  ports:
    - "80:80"
  links:
    - elk
elk:
  image: sebp/elk
  ports:
    - "5601:5601"
    - "9200:9200"
    - "5044:5044"
STEP 3 – PARSING
• Configure Logstash (input, filter, output)
filter {
  if [type] == "dockerlogs" {
    if ([message] =~ "^\tat ") {
      drop {}
    }
    grok {
      break_on_match => false
      match => [ "message", " responded with %{NUMBER:status_code:int}" ]
      tag_on_failure => []
    }
  }
}
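To see what the filter does to individual events, its logic can be imitated in a few lines of Python: drop stack-trace continuation lines, and pull an integer `status_code` out of messages containing " responded with ". This is only an approximation for illustration. The grok `NUMBER` pattern accepts more than plain digits, and the sample messages in the usage lines are invented.

```python
import re

STACK_FRAME_RE = re.compile(r"^\tat ")  # Java stack frames start with a tab and "at "
STATUS_RE = re.compile(r" responded with (?P<status_code>\d+)")  # simplified NUMBER


def apply_filter(event):
    """Return the processed event, or None if the event is dropped."""
    if event.get("type") != "dockerlogs":
        return event  # other event types pass through untouched
    message = event.get("message", "")
    if STACK_FRAME_RE.search(message):
        return None  # corresponds to drop {}
    match = STATUS_RE.search(message)
    if match:
        # Copy the event and add the extracted field as an integer.
        event = dict(event, status_code=int(match.group("status_code")))
    return event  # no failure tag is added, as with tag_on_failure => []


print(apply_filter({"type": "dockerlogs", "message": "GET /health responded with 200 in 3ms"}))
print(apply_filter({"type": "dockerlogs", "message": "\tat com.example.Foo.bar(Foo.java:10)"}))
```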
STEP 4 – SECURITY
• DO NOT expose Elasticsearch (‘network.host’)
• Use proxies
• Isolate Elasticsearch
• Change default ports
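The first bullet comes down to one question: does the `network.host` value bind Elasticsearch to anything other than loopback? A rough Python sketch of that check follows. The function name is hypothetical, and the handling of hostnames and special values is an assumption that errs on the side of reporting exposure.

```python
import ipaddress


def binds_beyond_loopback(network_host):
    """Return True if a network.host value would listen on a non-loopback address."""
    value = network_host.strip()
    if value in ("_local_", "localhost"):
        return False
    if value.startswith("_"):
        # Other special values such as _site_ or _global_ select non-loopback addresses.
        return True
    try:
        return not ipaddress.ip_address(value).is_loopback
    except ValueError:
        # A hostname we cannot classify here: treat it as exposed to be safe.
        return True


for candidate in ("127.0.0.1", "_local_", "0.0.0.0", "_site_"):
    print(candidate, binds_beyond_loopback(candidate))
```

A check like this fits in a deployment script that refuses to start a node with an open binding unless a proxy or firewall rule is also configured.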
OTHER SOLUTIONS
• Hosted ELK (Logz.io, Elastic Cloud, Sematext)
• Other logging/monitoring SaaS (Datadog, Papertrail, Loggly)
THE BIG ELEPHANT (ELK) IN THE ROOM
• Not knowing what question to ask
• Needle in the haystack syndrome
• Logs cannot be analyzed by a human alone
• Anomaly detection does not work
ANOMALY DETECTION DOESN’T WORK
• Not every anomaly is an error
• Not every error manifests as an anomaly
• Apps run as step functions
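The step-function point can be reproduced with a toy detector. The sketch below uses a rolling z-score, chosen here only as a common baseline and not as the method discussed in the talk, on a made-up request-rate series where a deploy moves the baseline from about 100 to about 200. Nothing is broken, yet the step is flagged.

```python
from statistics import mean, pstdev


def zscore_anomalies(series, window=10, threshold=3.0):
    """Return indexes whose value is more than `threshold` standard deviations
    away from the mean of the preceding `window` points."""
    flagged = []
    for i in range(window, len(series)):
        history = series[i - window:i]
        mu, sigma = mean(history), pstdev(history)
        if sigma == 0:
            # Flat history: any change at all counts as a deviation.
            if series[i] != mu:
                flagged.append(i)
            continue
        if abs(series[i] - mu) / sigma > threshold:
            flagged.append(i)
    return flagged


# Hypothetical requests per minute: steady near 100, then a deploy shifts the level.
before = [100, 102, 98, 101, 99, 100, 103, 97, 100, 101]
after = [200, 202, 198, 201, 199, 200, 203, 197, 200, 201]
print(zscore_anomalies(before + after))
```

The detector fires right at the step and falls silent once the window has absorbed the new level, so a real error that keeps the rate near 200 afterwards would go unnoticed.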
ENTER MACHINE LEARNING?
DEMO TIME!
WHAT IS MACHINE LEARNING?
“Machine learning is a type of artificial intelligence that provides computers with the ability to learn without being explicitly programmed.” (TechTarget)
SUPERVISED MACHINE LEARNING (BY EXAMPLE)
1. Labeling – gathering and labeling logs
• User behavior
• Inter-user similarities
• Public resources
2. Training a classifier – defining which logs are important
3. Integration within the system
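The three steps can be sketched end to end with a tiny text classifier. The example uses naive Bayes over message tokens as a stand-in, since the slides do not say which classifier is used in practice. Apart from the ‘skb rides the rocket’ line, the labeled messages are invented for the illustration.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(message):
    """Lowercase word tokens; numbers and punctuation are dropped."""
    return re.findall(r"[a-z_]+", message.lower())


class NaiveBayesLogClassifier:
    def __init__(self):
        self.token_counts = defaultdict(Counter)  # label -> token -> count
        self.label_counts = Counter()             # label -> number of training logs
        self.vocabulary = set()

    def train(self, labeled_logs):
        """Step 2: learn token frequencies per label from (message, label) pairs."""
        for message, label in labeled_logs:
            self.label_counts[label] += 1
            for token in tokenize(message):
                self.token_counts[label][token] += 1
                self.vocabulary.add(token)

    def predict(self, message):
        """Step 3: return the most likely label for a new log message."""
        total_logs = sum(self.label_counts.values())
        best_label, best_score = None, -math.inf
        for label, count in self.label_counts.items():
            score = math.log(count / total_logs)
            label_total = sum(self.token_counts[label].values())
            for token in tokenize(message):
                # Laplace smoothing keeps unseen tokens from zeroing the score.
                numerator = self.token_counts[label][token] + 1
                denominator = label_total + len(self.vocabulary)
                score += math.log(numerator / denominator)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Step 1: labeled examples (hypothetical, except the skb line from the slides).
training = [
    ("kernel: xen_netfront: xennet: skb rides the rocket: 19 slots", "important"),
    ("java.lang.OutOfMemoryError: Java heap space", "important"),
    ("connection refused while connecting to upstream", "important"),
    ("Page file page-48.dat was just deleted.", "noise"),
    ("health check responded with 200", "noise"),
    ("user logged in successfully", "noise"),
]
classifier = NaiveBayesLogClassifier()
classifier.train(training)
print(classifier.predict("xennet: skb rides the rocket: 21 slots"))
```

In a real system the labels would come from the sources listed under step 1, and the prediction would run inside the ingestion pipeline so that important lines are surfaced as they arrive.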
‘skb rides the rocket’
kernel: xen_netfront: xennet: skb rides the rocket: 19 slots
(http://serverfault.com/questions/647489/what-is-causing-skb-rides-the-rocket-errors)
EXTRAS
• Logz.io blog: http://logz.io/blog
• Elastic docs: http://elastic.co/documentation
• Slack team: https://elk-stack-professionals-pfuiokfxqy.now.sh
• ELK meetup: https://www.meetup.com/Tel-Aviv-Yafo-ELK-ElasticSearch-Meetup/
THANKS!
@proudboffin | [email protected]