136
Flink Pure Streaming Paco Guerrero Big Data & Solutions Architect 9/21/16

Flink. Pure Streaming

Embed Size (px)

Citation preview

Page 1: Flink. Pure Streaming

Flink Pure Streaming

Paco GuerreroBig Data & Solutions Architect 9/21/16

Page 2: Flink. Pure Streaming

Not for Geeks

Page 3: Flink. Pure Streaming
Page 4: Flink. Pure Streaming

Life as Time

4

Page 5: Flink. Pure Streaming

Anything as Time

Page 6: Flink. Pure Streaming

Flink as Time

Page 7: Flink. Pure Streaming

Streaming vs Batch

7

Page 8: Flink. Pure Streaming

“Abstraction of reality used to facilitate information processing”

MicroBatch Batch

Page 9: Flink. Pure Streaming

Batch

Page 10: Flink. Pure Streaming

Batch

Page 11: Flink. Pure Streaming

Batch

All Input

Page 12: Flink. Pure Streaming

Batch

Batch Job

All Input

Page 13: Flink. Pure Streaming

Batch

Batch Job

All Input

All Output

Nothing about timeTimestamps used as trick to keep real time fingerprint

Page 14: Flink. Pure Streaming

Streaming

“Continuous processing of data that is continuously produced”

Page 15: Flink. Pure Streaming

Streaming

“Streaming is the next programming paradigm for data applications, and you need to start thinking in terms of streams”

“Continuous processing of data that is continuously produced”

Page 16: Flink. Pure Streaming

Streaming

“Streaming is the next programming paradigm for data applications, and you need to start thinking in terms of streams”

“Continuous processing of data that is continuously produced”

Data Stream: Infinite sequence of data arriving in a continuous fashion.

Page 17: Flink. Pure Streaming

Streaming

“Streaming is the next programming paradigm for data applications, and you need to start thinking in terms of streams”

“Continuous processing of data that is continuously produced”

Data Stream: Infinite sequence of data arriving in a continuous fashion.

Stream processing is the backbone of the new data infrastructure.

Page 18: Flink. Pure Streaming

Streaming

“Streaming is the next programming paradigm for data applications, and you need to start thinking in terms of streams”

“Continuous processing of data that is continuously produced”

Data Stream: Infinite sequence of data arriving in a continuous fashion.

Stream processing is the backbone of the new data infrastructure.

“The world beyond batch” A high-level tour of modern data-processing concepts. By Tyler Akidau

August 5, 2015 https://www.oreilly.com/ideas/the-world-beyond-batch-streaming-101

Page 19: Flink. Pure Streaming

Streaming

Page 20: Flink. Pure Streaming

Streaming

Streaming Job

Page 21: Flink. Pure Streaming

Streaming

Streaming Job

Page 22: Flink. Pure Streaming

Streaming

Streaming Job

Real Life Time !!

Page 23: Flink. Pure Streaming

Streaming is the biggest change in

data infraestructure since Hadoop

Streaming

Page 24: Flink. Pure Streaming

The biggest change is moving from

batch to streaming is handling time explicitly

Streaming

Page 25: Flink. Pure Streaming

Micro Batch

Page 26: Flink. Pure Streaming

Micro Batch

Page 27: Flink. Pure Streaming

Micro Batch

Batch Job 1

Page 28: Flink. Pure Streaming

Batch JobBatch Job 2

All Output

Batch Job 1

Micro Batch

Page 29: Flink. Pure Streaming

Batch JobBatch Job 2

All Input

All OutputBatch Job 3All Output

All Output

Batch Job 1

Batch Frequency ?Timestamps keeps real time fingerprint

Micro Batch

Page 30: Flink. Pure Streaming

Streaming Technologies

Batch StreamingMicro Batch

StateLess – Record acknowledgementsCPU bounded performanceNot expressive declarative functional API – Low Level APINot auto scalingLow level programmatic topology Poor Streaming Windows funcionalitiesNot compatible with Hadoop APIs

Streams

Page 31: Flink. Pure Streaming

Streaming Technologies

Batch StreamingMicro Batch

StateLess – Record acknowledgementsCPU bounded performanceNot expressive declarative functional API – Low Level APINot auto scalingLow level programmatic topology Poor Streaming Windows funcionalitiesNot compatible with Hadoop APIs

Streams

Page 32: Flink. Pure Streaming

Streaming Technologies

Batch StreamingMicro Batch

StateLess – Record acknowledgementsCPU bounded performanceNot expressive declarative functional API – Low Level APINot auto scalingLow level programmatic topology Poor Streaming Windows funcionalitiesNot compatible with Hadoop APIs

Streams

Page 33: Flink. Pure Streaming

Streaming Technologies

Batch StreamingMicro Batch

StateLess – Record acknowledgementsCPU bounded performanceNot expressive declarative functional API – Low Level APINot auto scalingLow level programmatic topology Poor Streaming Windows funcionalitiesNot compatible with Hadoop APIs

Streams

Page 34: Flink. Pure Streaming

Streaming Technologies

Page 35: Flink. Pure Streaming

Streaming Technologies

Batch StreamingMicro Batch

Page 36: Flink. Pure Streaming

Streaming Technologies

Batch StreamingMicro Batch

Page 37: Flink. Pure Streaming

Streaming Technologies

Batch StreamingMicro Batch

Page 38: Flink. Pure Streaming

Apache Flink

38

Page 39: Flink. Pure Streaming

FlinkOpen Source Stream Processing Framework. Last available Release 1.1.1

Top Level Apache Project since Dec '14

Page 40: Flink. Pure Streaming

FlinkOpen Source Stream Processing Framework. Last available Release 1.1.1

Top Level Apache Project since Dec '14

Main FeaturesNative Stream Low LatencyHigh throughputStatefulExactly-one guaranteesDistributedExpressive APIsAnd more ….

Page 41: Flink. Pure Streaming

FlinkOpen Source Stream Processing Framework. Last available Release 1.1.1

Top Level Apache Project since Dec '14

Main FeaturesNative Stream Low LatencyHigh throughputStatefulExactly-one guaranteesDistributedExpressive APIsAnd more ….

Page 42: Flink. Pure Streaming

Flink Flink

Page 43: Flink. Pure Streaming

Flink

Page 44: Flink. Pure Streaming

Flink Integration

Page 45: Flink. Pure Streaming

YARN upcoming...

Flink Integration

Page 46: Flink. Pure Streaming

Flink Integration

YARN upcoming...

Page 47: Flink. Pure Streaming

Flink Integration

YARN upcoming...

Page 48: Flink. Pure Streaming

upcoming...

Flink Integration

YARN upcoming...

Page 49: Flink. Pure Streaming

Flink Stack

Page 50: Flink. Pure Streaming

Flink Runtime Engine

Page 51: Flink. Pure Streaming

Flink Runtime Engine

Page 52: Flink. Pure Streaming

Distributed pipelined processing

Execute everything as Stream

Iterative ( cyclic ) dataflows

Mutable state in operations

Operate on managed memory (*)

Also works on batch !!

Job Manager

Client

Optimizer

Dataflow Graph

Flink Runtime Engine

Page 53: Flink. Pure Streaming

Distributed pipelined processing

Execute everything as Stream

Iterative ( cyclic ) dataflows

Mutable state in operations

Operate on managed memory (*)

Also works on batch !!Workers ( Task Managers )

Job Manager

Client

Optimizer

Dataflow Graph

Execution Graph

Flink Runtime Engine

Page 54: Flink. Pure Streaming

Stream Job

Batch Job

ML Job

Flink Runtime Engine

Graph Job

optimizer

optimizer

optimizer

optimizer

Page 55: Flink. Pure Streaming

Stream Job

Batch Job

ML Job

Flink Runtime Engine

Graph Job

optimizer

optimizer

optimizer

optimizer

Page 56: Flink. Pure Streaming

Tasks scheduled and executed in workers ( slots )

Tasks as chain of operators

Run operator logic in a pipelined fashion

State is kept in operators

Stream Job

Batch Job

ML Job

Flink Runtime Engine

Graph Job

optimizer

optimizer

optimizer

optimizer

Page 57: Flink. Pure Streaming

Tasks scheduled and executed in workers ( slots )

Tasks as chain of operators

Run operator logic in a pipelined fashion

State is kept in operators

Stream Job

Batch Job

ML Job

Flink Runtime Engine

Graph Job

optimizer

optimizer

optimizer

optimizer

Page 58: Flink. Pure Streaming

Tasks scheduled and executed in workers ( slots )

Tasks as chain of operators

Run operator logic in a pipelined fashion

State is kept in operators

Stream Job

Batch Job

ML Job

Flink Runtime Engine

Graph Job

optimizer

optimizer

optimizer

optimizer

Page 59: Flink. Pure Streaming

If you want to know one thing about Flink is that you don't need to know

the internals of Flink

Page 60: Flink. Pure Streaming

Events Time &

Windows

Fault Tolerance &

CorrectnessState Handling

Low Latency &

High ThroughputAPI Libraries SQL

Building Blocks

Page 61: Flink. Pure Streaming

Events Time &

Windows

Fault Tolerance &

CorrectnessState Handling

Low Latency &

High ThroughputAPI Libraries SQL

Building Blocks

lTime references

lOut of order events

lPowerful Windowing

Page 62: Flink. Pure Streaming

Event Times & Windowing

Page 63: Flink. Pure Streaming

Event Times & Windowing

EventTime

EventTime

Page 64: Flink. Pure Streaming

Event Times & Windowing

Flink Data Source

EventTime

EventTime

Ingestion Time

Page 65: Flink. Pure Streaming

Event Times & Windowing

Flink Data Source

Flink Window Operator

EventTime

EventTime

Ingestion Time

Processing Time

Page 66: Flink. Pure Streaming

Event Time: when data is generated

Ingestion time: when data is loaded from source

Processing time: when data is processed

Event time help to process out- of-order events and replay elements as the ocurred ( deterministic results )

Explicit handling of time. 3 choices:

Event Times & Windowing

Page 67: Flink. Pure Streaming

Event Times & Windowing

Page 68: Flink. Pure Streaming

Event time. Out or Order

Page 69: Flink. Pure Streaming

1 2 3 5 7

4 6 8 9 10

Event time. Out or Order

Page 70: Flink. Pure Streaming

1 2 3 5 7

4 6 8 9 10

Event time. Out or Order

Out or Order

1 2 3 5 74 6 8 9 10

Page 71: Flink. Pure Streaming

1 2 3 5 7

4 6 8 9 10 1 2 3 5 74 6 8 9 104

Event time. Out or Order

Ingestion Time Windows

Out or Order

1 2 3 5 74 6 8 9 10

Page 72: Flink. Pure Streaming

1 2 3 5 7

4 6 8 9 10 1 2 3 5 74 6 8 9 10

1 2 3

4

4 5

Event time. Out or Order

6 7 8 9 10

Event Time Windows

Ingestion Time Windows

Out or Order

1 2 3 5 74 6 8 9 10

Page 73: Flink. Pure Streaming

Event time. Watermarks

Page 74: Flink. Pure Streaming

1 2 3 5 7

4 6 8 9 10

Event time. Watermarks

Page 75: Flink. Pure Streaming

1 2 3 5 7

4 6 8 9 10

1 2 3 54 6 8

1 2 3 54 6 8

1 2 3

4

4 5

Event time. Watermarks

6 8

Event Time Windows

Ingestion Time Windows

Out or Order

Page 76: Flink. Pure Streaming

1 2 3 5 7

4 6 8 9 10

1 2 3 54 6 8 910

1 2 3 54 6 8 910

1 2 3

4

4 5

Event time. Watermarks

6 8 9 10

Event Time Windows

Ingestion Time Windows

Out or Order

Not event time before 5 will come

Late Time of 2

5

Page 77: Flink. Pure Streaming

1 2 3 5 7

4 6 8 9 10

1 2 3 5 74 6 8 910

1 2 3 5 74 6 8 910

1 2 3

4

4 5

Event time. Watermarks

6 7 8 9 10

Event Time Windows

Ingestion Time Windows

Out or Order

Not event time before 10 will come

Late Time of 2

10

Page 78: Flink. Pure Streaming

Windowing

Windows: grouping of events according to time, session*, count

Page 79: Flink. Pure Streaming

Windowing

Windows: grouping of events according to time, session*, count

Powerful built-in windows:

Page 80: Flink. Pure Streaming

Windowing

Windows: grouping of events according to time, session*, count

Powerful built-in windows: Count: number of events to trigger the window. Process X last events each Y events.

Page 81: Flink. Pure Streaming

Windowing

Windows: grouping of events according to time, session*, count

Powerful built-in windows: Count: number of events to trigger the window. Process X last events each Y events.

Time: l Tumbling: trigger every X time with received events

l Sliding: trigger every X time with received events in last Y time

Page 82: Flink. Pure Streaming

Windowing

Windows: grouping of events according to time, session*, count

Powerful built-in windows: Count: number of events to trigger the window. Process X last events each Y events.

Time: l Tumbling: trigger every X time with received events

l Sliding: trigger every X time with received events in last Y time

Session: all events from session/user X until session time expired ( Gap )

Page 83: Flink. Pure Streaming

Windowing

Windows: grouping of events according to time, session*, count

Powerful built-in windows: Count: number of events to trigger the window. Process X last events each Y events.

Time: l Tumbling: trigger every X time with received events

l Sliding: trigger every X time with received events in last Y time

Session: all events from session/user X until session time expired ( Gap )

High level API for user windows: Window Assigner, Trigger, Evictor

Page 84: Flink. Pure Streaming

Events Time &

Windows

Fault Tolerance &

CorrectnessState Handling

Low Latency &

High ThroughputAPI Libraries SQL

Building Blocks

lManaged operator state for backup/recovery

lSavepoints

Page 85: Flink. Pure Streaming

Stateful Streaming

Op

Stateless StreamProcessing

Page 86: Flink. Pure Streaming

Stateful Streaming

Op Op

State

Stateless StreamProcessing

Stateful StreamProcessing

lBuilt-in internal state in each operator for exactly-once semantics

lUser state can be declared in each operator to be saved locally in memory ( API, key/value pars )

lSnapshots: periodically local states in memory are persisted in lightweight distributed snapshots. No global pause !!

lCheckpoint as global consistent point-in-time snapshot build by set of distributed snapshots.

lPluggable state backend for snapshots:JobManager, HDFS, RocksDB

lSavepoints: user-triggered retained checkpoint

Page 87: Flink. Pure Streaming

Events Time &

Windows

Fault Tolerance &

CorrectnessState Handling

Low Latency &

High ThroughputAPI Libraries SQL

Building Blocks

lExactly-once semantics with managed operator state

lDistributed Snapshotting Algorithm

Page 88: Flink. Pure Streaming

Periodically

Chandy-Lamport Snapshots

“The global-state-detection algorithm is to be superimposed on the underlying computation: It must run concurrently with, but no alter, this underlying computation”

. Triggers snapshots asynchronously

. Embedded snapshots algorithm in stream of data ( barriers )

. No global pause, lightweight impact in performance

Handling Checkpoints

Page 89: Flink. Pure Streaming

Periodically

Chandy-Lamport Snapshots

“The global-state-detection algorithm is to be superimposed on the underlying computation: It must run concurrently with, but no alter, this underlying computation”

. Triggers snapshots asynchronously

. Embedded snapshots algorithm in stream of data ( barriers )

. No global pause, lightweight impact in performance

Handling Checkpoints

Page 90: Flink. Pure Streaming

Periodically

Chandy-Lamport Snapshots

“The global-state-detection algorithm is to be superimposed on the underlying computation: It must run concurrently with, but no alter, this underlying computation”

. Triggers snapshots asynchronously

. Embedded snapshots algorithm in stream of data ( barriers )

. No global pause, lightweight impact in performance

Handling Checkpoints

Page 91: Flink. Pure Streaming

snapshot

Job Manager

Periodically pushes barriers for new state

New state X+1

Ack for Snapshot state X from Task N

Handling Checkpoints

Page 92: Flink. Pure Streaming

snapshot

Job Manager

Handling Checkpoints

Page 93: Flink. Pure Streaming

snapshot

Job Manager

Handling Checkpoints

Page 94: Flink. Pure Streaming

snapshot

Job Manager

Handling Checkpoints

All Acks received

Register Checkpoint for restore in case of fail

Page 95: Flink. Pure Streaming

Streaming Fault Tolerance

In case of fail, last global checkpoint is recovered ( recovery from partial checkpoint / individual snapshots is coming )

Need of stateful source like kafka to ensure end-to-end exactly-once semantic in case of fail.

Kafka sink doesn't guarantee end-to-end exactly-once ( multiple writes in topic ) ( at least-once )

Semantics in Flink:

At Least Once: never loses events, events might be reprocessed

Exactly once: neither reprocessed nor lost events.

Exactly once by default, with low impact in performance and development effort, unlike another tools like Storm or Spark.

Page 96: Flink. Pure Streaming

If you want to know one thing about Flink is that you don't need to know

the internals of Flink

Page 97: Flink. Pure Streaming

Events Time &

Windows

Fault Tolerance &

CorrectnessState Handling

Low Latency &

High ThroughputAPI Libraries SQL

Building Blocks

lPipelined runtime

lLatency vs throughput tunning

Page 98: Flink. Pure Streaming

Exactly-once semantic with low impact in performance

Controllable checkpointing overhead

Higher throughput using processing time

Performance improvements thanks to:

. operator chaining during optimization phase

. own optimized serialization stack with code generation

Performance Tunning

Page 99: Flink. Pure Streaming

Benchmark for “Streaming Computation” published by Yahoo. Dec 18, 2015https://yahooeng.tumblr.com/post/135321837876/benchmarking-streaming-computation-engines-at

Production use-case

l counting ad impressions group by campaign

l aggregations over a 10 second window

l save current aggregate value to Redis every second

Streaming Benchmark

Page 100: Flink. Pure Streaming

Throughput vs Latency Graph

Throughput ( 1000 events / sec )

99 PercentileLatency ( ms )

Not Operator combinig in Storm, more complicate topology, more steps for events and more overhead

Page 101: Flink. Pure Streaming

Apache Storm Without Tridentl At least once / Double counting after fail / Lost state after Failuresl CPU bounded

Apache Sparkl Latency increase with throughput

Apache Flinkl Exactly once / No double counting / No state lossl Limited by bandwidth between Kafka and Flink cluster l (1 GigE).

l kafka brokers within Kafka Cluster ( 10 GigE ) l Achieved 15 million messages /sec l ( before 3 million m/sec) with exactly once semantic

10,000,000 20,000,000

1 GigE

10 GigE

Performance Tunning

Page 102: Flink. Pure Streaming

Events Time &

Windows

Fault Tolerance &

CorrectnessState Handling

Low Latency &

High ThroughputAPI Libraries SQL

Building Blocks

lHigh Level API

lWide range of basic and advanced operators

lJava , Scala. Python soon !!

Page 103: Flink. Pure Streaming

API

Page 104: Flink. Pure Streaming

API

Working on data streams ( bounded ? )

Page 105: Flink. Pure Streaming

API

Working on data streams ( bounded ? )

Stream Processing: Explicit Handling of Time

Page 106: Flink. Pure Streaming

API

Working on data streams ( bounded ? )

Stream Processing: Explicit Handling of Time

Java & Scala. Python coming. Java: Bean type classes vs Tuples with position addresses. Scala: case classes.

Page 107: Flink. Pure Streaming

API

Working on data streams ( bounded ? )

Stream Processing: Explicit Handling of Time

Java & Scala. Python coming. Java: Bean type classes vs Tuples with position addresses. Scala: case classes.

Operators:

Sources: kafka, FileSystem, Cassandra …

Sinks: Kafka, HDFS, Cassandra ….

Transformations: Basic: map, flatmap, filter, grouping, iterate, project, join, cross, … Streaming: Windowing + Aggregations, Temporal Binary Iterative Stream operators

Page 108: Flink. Pure Streaming

API

Working on data streams ( bounded ? )

Stream Processing: Explicit Handling of Time

Java & Scala. Python coming. Java: Bean type classes vs Tuples with position addresses. Scala: case classes.

Operators:

Sources: kafka, FileSystem, Cassandra …

Sinks: Kafka, HDFS, Cassandra ….

Transformations: Basic: map, flatmap, filter, grouping, iterate, project, join, cross, … Streaming: Windowing + Aggregations, Temporal Binary Iterative Stream operators

DataStream<?> DataSet<?>

Core API

1 implementation*, 2 interfaces

Page 109: Flink. Pure Streaming

Source Map Reduce

Fliter

Join Sum Sink

Map

Source

Operators

Page 110: Flink. Pure Streaming

Source Map Reduce

Fliter

Join Sum Sink

Map

Source

Operators

Page 111: Flink. Pure Streaming

Source Map Reduce

Fliter

Join Sum Sink

Source

Filter

Operators

Page 112: Flink. Pure Streaming

Source Map Reduce

Fliter

Join Sum Sink

Source

Filter

Operators

Page 113: Flink. Pure Streaming

Source Map Reduce

Fliter

Join Sum Sink

Source

Reduce

Operators

Page 114: Flink. Pure Streaming

Source Map Reduce

Fliter

Join Sum Sink

Source

Reduce

Operators

Page 115: Flink. Pure Streaming

Source Map Reduce

Fliter

Join Sum Sink

Source

Join

Operators

Page 116: Flink. Pure Streaming

Source Map Reduce

Fliter

Join Sum Sink

Source

Join

Operators

Page 117: Flink. Pure Streaming

Source Map Reduce

Fliter

Join Sum Sink

Source

Operators

Page 118: Flink. Pure Streaming

Events Time &

Windows

Fault Tolerance &

CorrectnessState Handling

Low Latency &

High ThroughputAPI Libraries SQL

Building Blocks

lEasy to use. SQL !!

lBased on Apache Calcite

Page 119: Flink. Pure Streaming

API extension for DataSets y DataStreams

Based on relational Table abstraction

Table <=> Source / DataSet / DataStream

Operators like: where, select, as, groupBy, join, union, minus, distinct, orderBy, ...

Table API

Page 120: Flink. Pure Streaming

Execute SQL-Like sentences on DataSets and Datastreams

Resuts returned as Table ( Table API ), convertible to DataStream or DataSets

SQL and Table API can be seamlessly mixed over DataStream/DataSets

Flink’s SQL support is not feature complete, yet.

Queries that include unsupported SQL will fail !!

SQL Support

SQL

Page 121: Flink. Pure Streaming

Parsing and Logical plan for Table operators and SQL are optimized using Apache Calcite

Only supported a Subset of the comprehensive SQL standard

Apache Calcite provides with:

SQL Parsing

API for building expressions in relational algebra

Query planning engine

Provides SQL for Streaming Queries with windows aggregations

SELECT STREAM TUMBLE_END(rowtime, INTERVAL '1' HOUR) AS rowtime, productId, COUNT(*) AS c, SUM(units) AS units FROM Orders GROUP BY TUMBLE(rowtime, INTERVAL '1' HOUR), productId;

Apache Calcite

SQL Sentence

Apache Calcite: SQL to Logical

Plan as Relational Algebra

Flink Optimizer: Logical Plan to Execution Plan

Page 122: Flink. Pure Streaming

If you want to know

one thing about Flink

is that you don't need

to know the internals of Flink

So … Batch

Page 123: Flink. Pure Streaming

Batch on Stream

Page 124: Flink. Pure Streaming

Stream: Unbounded Data Stream

Unbounded Data Stream

Batch on Stream

Page 125: Flink. Pure Streaming

Stream: Unbounded Data Stream

Batch: Bounded stream ( dataset ) on a stream processor

Global window over the entire dataset

Optimization in operators for joins and grouping, with blocking data exchange if needed

Unbounded Data Stream

Bounded Data Set

Batch on Stream

Page 126: Flink. Pure Streaming

Stream: Unbounded Data Stream

Batch: Bounded stream ( dataset ) on a stream processor

Global window over the entire dataset

Optimization in operators for joins and grouping, with blocking data exchange if needed

Unbounded Data Stream

Bounded Data Set

Batch on Stream

Page 127: Flink. Pure Streaming

Stream: Unbounded Data Stream

Batch: Bounded stream ( dataset ) on a stream processor

Global window over the entire dataset

Optimization in operators for joins and grouping, with blocking data exchange if needed

Batch specific optimizations:

Cost-based optimizer: dataset size known before hand

Manage memory on / off-heap for join, sort, …

Optimization serialization stack for user-types

Bounded Data Set

Batch on Stream

Unbounded Data Stream

Page 128: Flink. Pure Streaming

Conclusions

Page 129: Flink. Pure Streaming

Conclusions

Page 130: Flink. Pure Streaming

Conclusions

Flink Pure streaming engine matches real life. No Abstraction

Page 131: Flink. Pure Streaming

Conclusions

Flink Pure streaming engine matches real life. No Abstraction

Batch on streaming

Page 132: Flink. Pure Streaming

Conclusions

Flink Pure streaming engine matches real life. No Abstraction

Batch on streaming

Flexible Windowing Semantics with Explicit Time handling

Page 133: Flink. Pure Streaming

Conclusions

Flink Pure streaming engine matches real life. No Abstraction

Batch on streaming

Flexible Windowing Semantics with Explicit Time handling

Competitive Performance, low latency and hight throughput

Page 134: Flink. Pure Streaming

Conclusions

Flink Pure streaming engine matches real life. No Abstraction

Batch on streaming

Flexible Windowing Semantics with Explicit Time handling

Competitive Performance, low latency and hight throughput

Apache Beam, open sourced by Google, uses Flink as its first order runner forBatch and Streaming processing in partnership with Data Artisans.

100% Compliance of data processing model “what, where, when, how “

Page 135: Flink. Pure Streaming
Page 136: Flink. Pure Streaming

Indizen Technologies, S.L

Paseo de la Castellana, 130 - 4ª planta

28046 Madrid, Spain

Tel. 91 535 85 68

www.indizen.com

@indizen_corp¡gracia

s!

136

Francisco José Guerrero

Big Data & Solutions Architect

Tel. 91 535 85 68

Mov. XXX YYY ZZZ

[email protected]