Apache Flink™ deep-dive: Unified Batch and Stream Processing
Robert Metzger (@rmetzger_)
Hadoop Summit 2015, San Jose, CA
Flink’s Recent History
[Timeline: releases 0.5, 0.6, and 0.7 between April and December 2014; Top Level Project graduation in December 2014; 0.9-m1 and 0.9 by April 2015]
What is Flink
[Stack diagram: libraries — Gelly, Table, ML, SAMOA, Dataflow (WiP), MRQL, Cascading (WiP), Zeppelin — sit on top of the DataSet (Java/Scala, with Hadoop M/R compatibility) and DataStream APIs; everything runs on a common streaming dataflow runtime, deployable locally, remotely, on YARN, on Tez, or embedded]
Program compilation
case class Path(from: Long, to: Long)
val tc = edges.iterate(10) { paths: DataSet[Path] =>
  val next = paths
    .join(edges)
    .where("to")
    .equalTo("from") { (path, edge) => Path(path.from, edge.to) }
    .union(paths)
    .distinct()
  next
}

[Diagram: pre-flight on the client runs type extraction and the optimizer to turn the program into a dataflow graph, independent of batch or streaming; the master schedules tasks and deploys operators to the workers, tracking intermediate results. Example plan: DataSource orders.tbl → Filter → Map and DataSource lineitem.tbl feed a Hybrid Hash Join (build HT / probe, hash-partitioned on [0]), followed by a sorted GroupRed and forward shipping]
The layered architecture allows plugging in components.
Native workload support
[Diagram: Flink at the center of four workloads — streaming topologies (low latency, mutable state), long batch pipelines (resource utilization), machine learning at scale (iterative algorithms), and graph analysis]

How can an engine natively support all these workloads? And what does "native" mean?
E.g.: Non-native iterations
[Diagram: the client drives a sequence of steps, issuing one job per iteration]

for (int i = 0; i < maxIterations; i++) {
  // Execute MapReduce job
}

Teaching an old elephant new tricks: treat the system as a black box.
E.g.: Non-native streaming
[Diagram: a stream discretizer cuts the data stream into a series of small batch jobs]

while (true) {
  // get next few records
  // issue batch job
}

Simulates a stream processor with a batch system.
Ingredients for “native” support
1. Execute everything as streams: pipelined execution, push model
2. Special code paths for batch: automatic job optimization, fault tolerance
3. Allow some iterative (cyclic) dataflows
4. Allow some mutable state
5. Operate on managed memory
Make data processing on the JVM robust
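Ingredient 1, the push model behind pipelined execution, can be sketched in plain Java. This is only an illustration of the idea, not Flink's operator interface: each operator processes a record and immediately pushes the result to the next operator in the chain, so no intermediate result is ever materialized.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

public class PushChain {
    // Downstream consumer of records; each operator pushes into the next one.
    interface Collector { void collect(String record); }

    // Wraps the stages back-to-front so that collecting a record at the head
    // drives it through every stage and into the sink in one pass.
    static Collector chain(List<Function<String, String>> stages, Collector sink) {
        Collector next = sink;
        for (int i = stages.size() - 1; i >= 0; i--) {
            final Function<String, String> f = stages.get(i);
            final Collector downstream = next;
            next = record -> downstream.collect(f.apply(record));
        }
        return next;
    }

    public static void main(String[] args) {
        List<String> out = new ArrayList<>();
        List<Function<String, String>> stages = List.of(String::trim, String::toUpperCase);
        Collector pipeline = chain(stages, out::add);
        pipeline.collect("  hello ");  // flows through the whole chain immediately
        System.out.println(out);       // [HELLO]
    }
}
```

The pull-based alternative (each stage reading the previous stage's fully materialized output) is what the non-native examples on the next slides end up doing.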
Flink by Use Case
Stream data processing: streaming dataflows

Full talk tomorrow: 3:10 PM, Grand Ballroom 220A, "Stream processing with Flink"
Pipelined stream processor
[Diagram: a streaming shuffle between pipelined operators]

Low latency: operators push data forward.
Expressive APIs

case class Word(word: String, frequency: Int)

DataStream API (streaming):

val lines: DataStream[String] = env.fromSocketStream(...)

lines.flatMap { line => line.split(" ").map(word => Word(word, 1)) }
  .window(Time.of(5, SECONDS)).every(Time.of(1, SECONDS))
  .groupBy("word").sum("frequency")
  .print()

DataSet API (batch):

val lines: DataSet[String] = env.readTextFile(...)

lines.flatMap { line => line.split(" ").map(word => Word(word, 1)) }
  .groupBy("word").sum("frequency")
  .print()
Checkpointing / Recovery
Uses the Chandy-Lamport algorithm for consistent asynchronous distributed snapshots.

Pushes checkpoint barriers through the data flow:

[Diagram: a barrier travels along with the data stream; records before the barrier are part of the snapshot, records after it are not (they are backed up until the next snapshot)]

Guarantees exactly-once processing
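The barrier idea can be made concrete with a toy single-operator replay. This is a sketch of the concept only, not Flink's implementation (which snapshots asynchronously and aligns barriers across multiple inputs): a counting operator copies its state whenever a barrier flows past, so every snapshot contains exactly the records that preceded that barrier.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class BarrierDemo {
    // An event is either a data record (word != null) or a barrier (checkpointId >= 0).
    static final class Event {
        final String word; final int checkpointId;
        private Event(String w, int id) { word = w; checkpointId = id; }
        static Event record(String w) { return new Event(w, -1); }
        static Event barrier(int id)  { return new Event(null, id); }
    }

    // Replays the stream through a word-counting operator; every barrier
    // triggers a copy of the state accumulated so far.
    static Map<Integer, Map<String, Integer>> snapshotStream(List<Event> events) {
        Map<String, Integer> counts = new HashMap<>();              // operator state
        Map<Integer, Map<String, Integer>> snapshots = new HashMap<>();
        for (Event e : events) {
            if (e.word != null) counts.merge(e.word, 1, Integer::sum);
            else snapshots.put(e.checkpointId, new HashMap<>(counts)); // pre-barrier records only
        }
        return snapshots;
    }

    public static void main(String[] args) {
        List<Event> stream = List.of(Event.record("a"), Event.record("b"),
                Event.barrier(1), Event.record("c"), Event.barrier(2));
        System.out.println(snapshotStream(stream));
    }
}
```

On recovery, restoring snapshot 1 and replaying everything after barrier 1 reproduces exactly the state of snapshot 2, which is what makes the guarantee exactly-once rather than at-least-once.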
Batch processing: Batch on Streaming

Batch on a streaming engine

[Diagram: a file in HDFS flows through Filter → Map into Result 1, and through another Map into Result 2]

The batch program is completely pipelined; data is never materialized anywhere (in this example).
Batch on a streaming engine

[Diagram: a small data source is streamed in parallel into the build side of a Join operator; once the build side is finished, the large data source is streamed through a Map and probed against it in parallel, with results flowing on to the data sinks]
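The build/probe pattern in the diagram can be sketched as a simple in-memory hash join. This is an assumption-laden toy (no spilling, no partitioning, single-threaded): the small input is streamed into a hash table first; once the build side is finished, the large input streams through and probes against it.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class HashJoinSketch {
    // Joins (key, value) pairs from the small (build) and large (probe) inputs.
    // Each input element is a String[2]: {key, payload}.
    static List<String> join(List<String[]> small, List<String[]> large) {
        Map<String, List<String>> buildHT = new HashMap<>();
        for (String[] kv : small)                         // build phase: consume small side fully
            buildHT.computeIfAbsent(kv[0], k -> new ArrayList<>()).add(kv[1]);
        List<String> result = new ArrayList<>();
        for (String[] kv : large)                         // probe phase: stream large side through
            for (String buildPayload : buildHT.getOrDefault(kv[0], List.of()))
                result.add(kv[0] + ":" + buildPayload + "," + kv[1]);
        return result;
    }

    public static void main(String[] args) {
        List<String[]> orders = List.of(new String[]{"1", "o1"});
        List<String[]> lineitems = List.of(new String[]{"1", "l1"}, new String[]{"2", "l2"});
        System.out.println(join(orders, lineitems));  // [1:o1,l1]
    }
}
```

The key point from the slide is that only the build side must finish before output starts; the probe side stays fully pipelined, record by record.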
Batch processing requirements
Get the data processed as fast as possible
• Automatic job optimizer
• Efficient memory management

Robust processing
• Provide fault tolerance
• Again, memory management
Optimizer
• Cost-based optimizer
• Selects the data shipping strategy (forward, partition, broadcast)
• Selects the local execution strategy (sort-merge join / hash join)
• Caches loop-invariant data (iterations)
Two execution plans

[Plan A: broadcast the filtered/mapped orders.tbl side and forward lineitem.tbl into the Hybrid Hash Join (build HT / probe), with a Combine before the sorted GroupRed. Plan B: hash-partition both join inputs on [0], then hash-partition on [0,1] before the sorted GroupRed]

The best plan depends on the relative sizes of the input files.
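Why the choice depends on input sizes can be shown with a toy cost model (illustrative formulas and numbers only, not Flink's actual optimizer costs): broadcasting ships the small input to every worker, while hash-partitioning ships each input across the network once.

```java
public class ShippingChoice {
    // Picks the cheaper shipping strategy for a join under a naive
    // network-bytes cost model.
    static String choose(long smallBytes, long largeBytes, int parallelism) {
        long broadcastCost = smallBytes * parallelism;  // small side copied to every worker
        long partitionCost = smallBytes + largeBytes;   // both sides shuffled exactly once
        return broadcastCost < partitionCost ? "broadcast" : "hash-partition";
    }

    public static void main(String[] args) {
        // A tiny build side is cheap to copy everywhere...
        System.out.println(choose(1_000, 10_000_000, 8));      // broadcast
        // ...but a sizable one makes repartitioning both inputs cheaper.
        System.out.println(choose(5_000_000, 10_000_000, 8));  // hash-partition
    }
}
```

Because these sizes are only known at run time (after filters, for instance), a cost-based optimizer makes this decision instead of the programmer.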
Memory Management
Operators on managed memory
Smooth out-of-core performance
[Chart: blue bars are in-memory, orange bars (partially) out-of-core]

More at: http://flink.apache.org/news/2015/03/13/peeking-into-Apache-Flinks-Engine-Room.html
Machine Learning Algorithms: Iterative data flows
Iterate in the Dataflow
• API and runtime support
• Automatic caching of loop-invariant data

IterationState state = getInitialState();
while (!terminationCriterion()) {
  state = step(state);
}
setFinalState(state);
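The loop above can be made concrete with a plain-Java sketch of the iteration contract (an illustration, not Flink's API, where the loop runs inside the dataflow rather than on the client): repeat step() on the state until the termination criterion holds, then emit the final state.

```java
public class IterateSketch {
    // Toy state and step function: halve x until it drops below 1
    // (the termination criterion), bounded by maxIterations.
    static double iterate(double initialState, int maxIterations) {
        double state = initialState;
        for (int i = 0; i < maxIterations && state >= 1.0; i++) {
            state = state / 2.0;  // step(state)
        }
        return state;             // final state
    }

    public static void main(String[] args) {
        System.out.println(iterate(10.0, 100));  // 0.625 after four steps
    }
}
```

The difference from the "non-native iterations" slide earlier is where this loop lives: natively it is part of one submitted dataflow, so loop-invariant data stays cached on the workers instead of being re-read every round.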
Example: Matrix Factorization
Factorizing a matrix with 28 billion ratings for recommendations

More at: http://data-artisans.com/computing-recommendations-with-flink.html

Setups:
• 40 medium instances ("n1-highmem-8": 8 cores, 52 GB)
• 40 large instances ("n1-highmem-16": 16 cores, 104 GB)
Flink ML – Machine Learning
• Provides a complete toolchain:
  • scikit-learn style pipelining
  • Data pre-processing
• Various algorithms:
  • Recommendations: ALS
  • Supervised learning: Support Vector Machines
  • …
• ML on streams: SAMOA. We are planning to add streaming support to Flink ML.
Graph Analysis: Stateful Iterations

Graph processing characteristics

[Chart: the number of elements updated shrinks with each iteration]

• Iterate natively with state/deltas
• Keep state in a controlled way in a partitioned hash map
• Relax the immutability assumption of batch processing
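The delta idea can be sketched in plain Java (an illustration of the concept, not Flink's delta iterations): the solution set lives in a mutable hash map, and each round only the elements that changed in the previous round (the workset) produce new updates, which matches the shrinking per-iteration work in the chart above.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class DeltaIterationSketch {
    // Toy connected components: propagate the minimum vertex id along
    // directed edges until no element changes.
    static Map<Integer, Integer> minLabel(Map<Integer, List<Integer>> edges) {
        Map<Integer, Integer> solution = new HashMap<>();  // vertex -> current label
        for (int v : edges.keySet()) solution.put(v, v);   // initial label = own id
        Set<Integer> workset = new HashSet<>(edges.keySet());
        while (!workset.isEmpty()) {
            Set<Integer> changed = new HashSet<>();
            for (int v : workset)                           // only changed vertices send
                for (int n : edges.getOrDefault(v, List.of()))
                    if (solution.get(v) < solution.get(n)) {
                        solution.put(n, solution.get(v));   // in-place state update
                        changed.add(n);
                    }
            workset = changed;                              // the work shrinks each round
        }
        return solution;
    }

    public static void main(String[] args) {
        Map<Integer, List<Integer>> edges = Map.of(
                1, List.of(2), 2, List.of(3), 3, List.of(), 4, List.of());
        System.out.println(minLabel(edges));
    }
}
```

A pure batch engine would rebuild the whole solution set every round; keeping it as partitioned mutable state is exactly the relaxation of immutability the slide describes.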
… fast graph analysis
More at: http://data-artisans.com/data-analysis-with-flink.html
Gelly – Graph Processing API
• Transformations: map, filter, subgraph, union, reverse, undirected
• Mutations: add vertex/edge, remove, …
• Pregel-style vertex-centric iterations
• Library of algorithms
• Utilities: special data types, loading, graph properties

Gelly and Flink ML:
• Available in Flink 0.9 (so far only a beta release)
• Still under heavy development
• Seamlessly integrate with the DataSet abstraction: preprocess data as needed, use the results as needed
• Easy entry point for new contributors
Closing
Flink Meetup Groups
• SF Spark and Friends: June 16, San Francisco
• Bay Area Flink Meetup: June 17, Redwood City
• Chicago Flink Meetup: June 30
• Also in Stockholm, Sweden and Berlin, Germany
Flink Forward registration & call for abstracts are open now
• 12/13 October 2015
• Meet developers and users of Flink!
• With Flink workshops / trainings!

flink.apache.org
@ApacheFlink