
DataEngConf SF16 - Collecting and Moving Data at Scale


Page 1: DataEngConf SF16 - Collecting and Moving Data at Scale

COLLECTING AND MOVING DATA AT SCALE

Sada Furuhashi, Chief Architect

Invented Fluentd, MessagePack

Page 2: DataEngConf SF16 - Collecting and Moving Data at Scale

BACKGROUND

Page 3: DataEngConf SF16 - Collecting and Moving Data at Scale

HIGH LEVEL ANALYTICS ARCHITECTURE

Collect Store Process Visualize

Page 4: DataEngConf SF16 - Collecting and Moving Data at Scale

THE CHALLENGE

Collect Store Process Visualize

How do we shorten the collection process?

Easier & Shorter Time

Excel, Tableau

Page 5: DataEngConf SF16 - Collecting and Moving Data at Scale

THE PROBLEM

Page 6: DataEngConf SF16 - Collecting and Moving Data at Scale

TYPICAL ARCHITECTURE BEFORE FLUENTD

[Diagram: three App Servers, each running an Application that writes Files, and a Log Server]

High latency: must wait for a day

Hard to analyze: complex text parsers

Page 7: DataEngConf SF16 - Collecting and Moving Data at Scale

THE FALSE SOLUTION

Page 8: DataEngConf SF16 - Collecting and Moving Data at Scale

MULTIPLY CONNECTIONS / COMBINATION EXPLOSION

Log file

script to parse data

cron job for loading

filtering script

syslog script

Tweet-fetching script

aggregation script

aggregation script

script to parse data

rsync server

Page 9: DataEngConf SF16 - Collecting and Moving Data at Scale

THE SOLUTION

Page 10: DataEngConf SF16 - Collecting and Moving Data at Scale

CENTRALIZED CONNECTIONS

Log file

Page 11: DataEngConf SF16 - Collecting and Moving Data at Scale

FLUENTD INTERNAL ARCHITECTURE

Page 12: DataEngConf SF16 - Collecting and Moving Data at Scale

INTERNAL ARCHITECTURE (SIMPLIFIED)

Input → Filter → Buffer → Output (each stage is a plugin)

Example event:

Time: 2012-02-04 01:33:51

Tag: myapp.buylog

Record: {"user": "me", "path": "/buyItem", "price": 150, "referer": "/landing"}
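The time, tag, and record that make up the example event on this slide can be modelled as a small data structure. This is only an illustrative sketch: the `Event` class is a hypothetical name, not part of Fluentd's API.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Any, Dict


@dataclass(frozen=True)
class Event:
    """Hypothetical stand-in for one event: time, tag, record."""
    time: datetime          # when the event happened
    tag: str                # dotted label, e.g. "myapp.buylog"
    record: Dict[str, Any]  # structured payload


# The example event from the slide.
event = Event(
    time=datetime(2012, 2, 4, 1, 33, 51),
    tag="myapp.buylog",
    record={"user": "me", "path": "/buyItem", "price": 150, "referer": "/landing"},
)
```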

Page 13: DataEngConf SF16 - Collecting and Moving Data at Scale

ARCHITECTURE: INPUT PLUGINS

Input plugin

Receive logs

Or pull logs from data sources

In a non-blocking manner

Examples: HTTP+JSON (in_http), File tail (in_tail), Syslog (in_syslog), …

Page 14: DataEngConf SF16 - Collecting and Moving Data at Scale

ARCHITECTURE: FILTER PLUGINS

Filter plugin

Transform logs

Filter out unnecessary logs

Enrich logs

Examples: encrypt personal data, convert IP addresses to countries, parse User-Agent, …
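A filter stage can be thought of as a chain of small functions that each transform, drop, or enrich a record. The sketch below is an assumption-laden illustration in Python, not Fluentd's filter plugin API; the function names, the `/health` rule, and the `app-01` host value are all hypothetical.

```python
from typing import Any, Callable, Dict, List, Optional

Record = Dict[str, Any]
Filter = Callable[[Record], Optional[Record]]


def drop_health_checks(record: Record) -> Optional[Record]:
    """Filter out: returning None discards the record (hypothetical rule)."""
    return None if record.get("path") == "/health" else record


def add_hostname(record: Record) -> Optional[Record]:
    """Enrich: return a copy with an extra field (hypothetical value)."""
    return {**record, "host": "app-01"}


def apply_filters(record: Record, filters: List[Filter]) -> Optional[Record]:
    """Run the record through each filter in order; stop if one drops it."""
    current: Optional[Record] = record
    for f in filters:
        current = f(current)
        if current is None:
            return None
    return current


FILTERS = [drop_health_checks, add_hostname]
```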

Page 15: DataEngConf SF16 - Collecting and Moving Data at Scale

ARCHITECTURE: BUFFER PLUGINS

Buffer plugin

Improve performance

Provide reliability

Provide thread-safety

Examples: Memory (buf_memory), File (buf_file)

Page 16: DataEngConf SF16 - Collecting and Moving Data at Scale

ARCHITECTURE: OUTPUT PLUGINS

Output plugin

Write or send event logs

Examples: File (out_file), Amazon S3 (out_s3), MongoDB (out_mongo), …

Page 17: DataEngConf SF16 - Collecting and Moving Data at Scale

ARCHITECTURE: BUFFER PLUGINS

Buffer plugin

Improve performance

Provide reliability

Provide thread-safety

[Diagram: Input, a Buffer holding several Chunks, Output]
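The chunk idea can be sketched as follows: events coming from the input are grouped into chunks, and whole chunks are handed to the output. This is a toy model under stated assumptions, not Fluentd's buffer implementation; `ChunkBuffer`, `emit`, `flush`, and the fixed event-count limit are hypothetical.

```python
from collections import deque
from typing import Callable, Deque, List


class ChunkBuffer:
    """Toy buffer that groups events into fixed-size chunks (hypothetical API)."""

    def __init__(self, chunk_limit: int) -> None:
        self.chunk_limit = chunk_limit
        self.current: List[dict] = []             # chunk still being filled
        self.queue: Deque[List[dict]] = deque()   # full chunks awaiting output

    def emit(self, event: dict) -> None:
        """Append an event; move the chunk to the queue once it is full."""
        self.current.append(event)
        if len(self.current) >= self.chunk_limit:
            self.queue.append(self.current)
            self.current = []

    def flush(self, write: Callable[[List[dict]], None]) -> None:
        """Hand each queued chunk to write(); a chunk is removed only after
        write() returns, so a failed write leaves it queued for a retry."""
        while self.queue:
            write(self.queue[0])
            self.queue.popleft()
```

Because a chunk leaves the queue only after a successful write, a failing output does not lose data; the same chunk is simply offered again on the next flush.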

Page 18: DataEngConf SF16 - Collecting and Moving Data at Scale

DIVIDE & CONQUER & RETRY

[Diagram: a Stream divided into Batches; Error and Retry arrows on individual batches]

Page 19: DataEngConf SF16 - Collecting and Moving Data at Scale

EXAMPLE USE CASES

Page 20: DataEngConf SF16 - Collecting and Moving Data at Scale

STREAMING FROM APACHE TO MONGODB PT I

in_tail: /var/log/access.log

buf_file: /var/log/fluentd/buffer

Page 21: DataEngConf SF16 - Collecting and Moving Data at Scale

ERROR HANDLING

in_tail: /var/log/access.log

buf_file: /var/log/fluentd/buffer

Buffering for any outputs

Retrying automatically

With exponential wait and persistence on a disk
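Retrying with an exponential wait can be sketched as below. This is a minimal illustration, not Fluentd's retry logic: `backoff_schedule`, `flush_with_retry`, the doubling factor, and the default values are assumptions made for the example.

```python
import time
from typing import Callable, List


def backoff_schedule(base_wait: float, max_retries: int) -> List[float]:
    """Waits before each retry: base, 2*base, 4*base, ... (assumed doubling)."""
    return [base_wait * (2 ** i) for i in range(max_retries)]


def flush_with_retry(write: Callable[[], None],
                     base_wait: float = 1.0,
                     max_retries: int = 5,
                     sleep: Callable[[float], None] = time.sleep) -> bool:
    """Call write(); on failure wait and retry with growing waits.

    Returns True once write() succeeds, False if every attempt failed.
    """
    # The first attempt has no wait; each retry waits longer than the last.
    for wait in [0.0] + backoff_schedule(base_wait, max_retries):
        if wait:
            sleep(wait)
        try:
            write()
            return True
        except Exception:
            continue  # the data stays buffered; try again after a longer wait
    return False
```

Passing `sleep` as a parameter keeps the sketch testable without real delays.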

Page 22: DataEngConf SF16 - Collecting and Moving Data at Scale

TAILING FILE INPUT

Read a log file

Custom regexp

Custom parser in Ruby

Supported formats:

• apache • apache_error • apache2 • nginx

• json • csv • tsv • ltsv

• syslog • multiline • none

[Diagram: access.log, pos file]
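The pos file lets a tailing input remember how far into the log it has read, so that it can resume from that point. The sketch below shows the idea only; it does not reproduce Fluentd's pos file format and ignores log rotation. `read_new_lines` and the plain-integer position file are hypothetical.

```python
import os
from typing import List


def read_new_lines(log_path: str, pos_path: str) -> List[str]:
    """Return complete lines appended since the last call.

    The byte offset already consumed is stored in pos_path.
    """
    offset = 0
    if os.path.exists(pos_path):
        with open(pos_path) as f:
            offset = int(f.read().strip() or 0)

    with open(log_path, "rb") as f:
        f.seek(offset)
        data = f.read()

    # Consume only up to the last newline, so a half-written line
    # is left in place and picked up on the next call.
    end = data.rfind(b"\n") + 1
    with open(pos_path, "w") as f:
        f.write(str(offset + end))
    return data[:end].decode("utf-8").splitlines()
```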

Page 23: DataEngConf SF16 - Collecting and Moving Data at Scale

OUT TO MULTIPLE LOCATIONS

Routing based on tags

Copy to multiple storages

[Diagram: access.log, in_tail, buffer]
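Tag-based routing and copying can be sketched as a lookup from tag patterns to output names. The wildcard rules here (`*` for exactly one tag part, `**` for zero or more parts, first matching route wins) are a simplified reading of Fluentd's match patterns. The `ROUTES` table and output names are hypothetical.

```python
from typing import List, Sequence, Tuple


def _match(pattern: Sequence[str], parts: Sequence[str]) -> bool:
    """Match tag parts against pattern parts, both already split on '.'."""
    if not pattern:
        return not parts
    head, rest = pattern[0], pattern[1:]
    if head == "**":
        # "**" may absorb zero or more tag parts.
        return any(_match(rest, parts[i:]) for i in range(len(parts) + 1))
    if not parts:
        return False
    return (head == "*" or head == parts[0]) and _match(rest, parts[1:])


def tag_matches(pattern: str, tag: str) -> bool:
    return _match(pattern.split("."), tag.split("."))


# Hypothetical routing table: the first matching pattern wins.
ROUTES: List[Tuple[str, List[str]]] = [
    ("myapp.buylog", ["mongodb", "s3"]),  # copy: one event, two storages
    ("myapp.*", ["file"]),
]


def route(tag: str, routes: List[Tuple[str, List[str]]]) -> List[str]:
    """Return the outputs for the first route whose pattern matches the tag."""
    for pattern, outputs in routes:
        if tag_matches(pattern, tag):
            return outputs
    return []
```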

Page 24: DataEngConf SF16 - Collecting and Moving Data at Scale

H.A. CONFIGURATION (HIGH AVAILABILITY)

Retry automatically

Exponential retry wait

Persistent on a disk

Automatic fail-over

Load balancing

[Diagram: access.log, in_tail, buffer]
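Load balancing with automatic fail-over can be sketched as round-robin over a list of servers, skipping any that refuse the data. This is an illustration under stated assumptions, not how Fluentd's forwarding is implemented; `Forwarder`, `forward`, and the `send` callback are hypothetical.

```python
from typing import Any, Callable, List


class Forwarder:
    """Round-robin sender that skips servers whose send() fails (hypothetical)."""

    def __init__(self, servers: List[str],
                 send: Callable[[str, Any], None]) -> None:
        self.servers = list(servers)
        self.send = send          # send(server, chunk); raises ConnectionError on failure
        self.next_index = 0       # where the next round-robin attempt starts

    def forward(self, chunk: Any) -> str:
        """Send chunk to the next healthy server and return its name."""
        n = len(self.servers)
        for step in range(n):
            server = self.servers[(self.next_index + step) % n]
            try:
                self.send(server, chunk)
            except ConnectionError:
                continue          # fail over to the next server
            self.next_index = (self.next_index + step + 1) % n
            return server
        raise ConnectionError("all servers unavailable; chunk stays buffered")
```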

Page 25: DataEngConf SF16 - Collecting and Moving Data at Scale

FOR HADOOP USERS

Retry automatically

Exponential retry wait

Persistent on a disk

Custom text formatter

Slice files based on time:

2016-01-01/01/access.log.gz

2016-01-01/02/access.log.gz

2016-01-01/03/access.log.gz

…

[Diagram: access.log, in_tail, buffer]
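Slicing files by time amounts to deriving the output path from each event's timestamp. A minimal sketch, assuming hourly slices laid out like the example paths on this slide; `sliced_path` is a hypothetical helper, not a Fluentd function.

```python
from datetime import datetime


def sliced_path(event_time: datetime, name: str = "access.log.gz") -> str:
    """Build a date/hour path such as 2016-01-01/01/access.log.gz."""
    return event_time.strftime("%Y-%m-%d/%H/") + name
```

Every event from the same hour maps to the same path, which lets a downstream batch job pick up one directory per hour.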

Page 26: DataEngConf SF16 - Collecting and Moving Data at Scale

HADOOP INTEGRATION INTO S3

Retry automatically

Exponential retry wait

Persistent on a disk

Slice files based on time:

2016-01-01/01/access.log.gz

2016-01-01/02/access.log.gz

2016-01-01/03/access.log.gz

…

[Diagram: access.log, in_tail, buffer]

Page 27: DataEngConf SF16 - Collecting and Moving Data at Scale

3RD PARTY INPUT PLUGINS

dstat

df

AMQP

munin

jvmwatcher

SQL

Page 28: DataEngConf SF16 - Collecting and Moving Data at Scale

3RD PARTY OUTPUT PLUGINS

AMQP

Graphite

Page 29: DataEngConf SF16 - Collecting and Moving Data at Scale

REAL WORLD USE CASES

Page 30: DataEngConf SF16 - Collecting and Moving Data at Scale

HIGH-VOLUME FORWARDING

TREASURE DATA

- At-most-once / At-least-once

- HA (failover)

- Load-balancing

Page 31: DataEngConf SF16 - Collecting and Moving Data at Scale

NEAR REALTIME AND BATCH COMBO

Hot data

All data

Page 32: DataEngConf SF16 - Collecting and Moving Data at Scale

EXAMPLE CONFIGURATION FOR REAL TIME BATCH COMBO

Page 33: DataEngConf SF16 - Collecting and Moving Data at Scale

CEP FOR STREAM PROCESSING

Norikra is a SQL-based CEP engine: http://norikra.github.io/

Page 34: DataEngConf SF16 - Collecting and Moving Data at Scale

CONTAINER LOGGING

TREASURE DATA

Page 35: DataEngConf SF16 - Collecting and Moving Data at Scale

FLUENTD IN PRODUCTION

Page 36: DataEngConf SF16 - Collecting and Moving Data at Scale

MICROSOFT

Operations Management Suite uses Fluentd: "The core of the agent uses an existing open source data aggregator called Fluentd. Fluentd has hundreds of existing plugins, which will make it really easy for you to add new data sources."

[Diagram labels: Linux Computer (Syslog, Operating System, Apache, MySQL, Containers); omsconfig (DSC); PS DSC Providers; OMI Server (CIM Server); omsagent; Firewall / proxy; OMS Service; Upload Data (HTTPS); Pull configuration (HTTPS)]

Page 37: DataEngConf SF16 - Collecting and Moving Data at Scale

ATLASSIAN

"At Atlassian, we've been impressed by Fluentd and have chosen to use it in Atlassian Cloud's logging and analytics pipeline."

Kinesis

Elasticsearch cluster

Ingestion service

Page 38: DataEngConf SF16 - Collecting and Moving Data at Scale

AMAZON WEB SERVICES

The architecture of Fluentd (Sponsored by Treasure Data) is very similar to Apache Flume or Facebook’s Scribe. Fluentd is easier to install and maintain and has better documentation and support than Flume and Scribe.

Types of Data / Collect / Store

Transactional: Database reads & writes (OLTP); Cache

Search: Logs; Streams

File: Log files (/var/log); Log collectors & frameworks

Stream: Log records; Sensors & IoT data

[Diagram labels: Web Apps, Mobile Apps, IoT, Applications, Logging; Database, Search, File Storage, Stream Storage]

Page 39: DataEngConf SF16 - Collecting and Moving Data at Scale

THANK YOU!