Flink Forward SF 2017: Ufuk Celebi - The Stream Processor as a Database: Building Online...


Ufuk Celebi (@iamuce)

The Stream Processor as a Database

The (Classic) Use Case: Real-time Counts and Aggregates

(Real-)Time Series Statistics

Stream of Events → Real-time Statistics

The Architecture

collect → message queue → analyze → serve & store

The Flink Job

case class Impressions(id: String, impressions: Long)

val events: DataStream[Event] = env.addSource(new FlinkKafkaConsumer09(…))

val impressions: DataStream[Impressions] = events
  .filter(evt => evt.isImpression)
  .map(evt => Impressions(evt.id, evt.numImpressions))

val counts: DataStream[Impressions] = impressions
  .keyBy("id")
  .timeWindow(Time.hours(1))
  .sum("impressions")


The Flink Job

Dataflow: Kafka Source → filter() → map() → keyBy() → window()/sum() → Sink, with keyed State held by the window operator (two parallel pipelines shown).

Putting it all together

Periodically (every second) flush new aggregates to Redis.
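The slides do not show the sink code, but a minimal sketch of this step could look like the following, assuming a Jedis client on the classpath and the single-argument invoke() of Flink releases from that era; for simplicity it writes each aggregate as it arrives rather than batching once per second. The class name, host, and port are illustrative, not from the talk.

// Minimal sketch: push each aggregated Impressions record to Redis.
import org.apache.flink.configuration.Configuration
import org.apache.flink.streaming.api.functions.sink.RichSinkFunction
import redis.clients.jedis.Jedis

class RedisImpressionsSink(host: String, port: Int)
    extends RichSinkFunction[Impressions] {

  @transient private var jedis: Jedis = _

  override def open(parameters: Configuration): Unit = {
    // One connection per parallel sink instance.
    jedis = new Jedis(host, port)
  }

  override def invoke(value: Impressions): Unit = {
    // Overwrite the latest hourly count for this id.
    jedis.set(value.id, value.impressions.toString)
  }

  override def close(): Unit = {
    if (jedis != null) jedis.close()
  }
}

// Attach it to the windowed counts from the job above:
// counts.addSink(new RedisImpressionsSink("localhost", 6379))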

The Bottleneck

Writes to the key/value store take too long.

Queryable State

Writes to the external store become optional, and happen only at the end of windows.
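A minimal sketch of what serving the counts straight from Flink's state can look like, assuming the asQueryableState call introduced around Flink 1.2; the state name "impression-counts" and the explicit descriptor are choices made here, not taken from the slides.

// Minimal sketch: expose the latest hourly count per id as queryable ValueState.
import org.apache.flink.api.common.state.ValueStateDescriptor
import org.apache.flink.streaming.api.scala._

val countsDescriptor = new ValueStateDescriptor[Impressions](
  "impression-counts", createTypeInformation[Impressions])

// Re-key the windowed results and register them as queryable state;
// external clients can then read the latest value per key directly.
val queryableCounts = counts
  .keyBy(_.id)
  .asQueryableState("impression-counts", countsDescriptor)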

Queryable State: Application View

The Application reads current time windows (real-time results) from the Query Service and past time windows (older results) from the Database.

Queryable State Enablers

§ Flink has state as a first-class citizen
§ State is fault tolerant (exactly-once semantics)
§ State is partitioned (sharded) together with the operators that create/update it
§ State is continuous (not mini-batched)
§ State is scalable

State in Flink

Dataflow: Source/filter()/map() → window()/sum(), with a state index (e.g., RocksDB) attached to the window operator.

Events are persistent and ordered (per partition/key) in the message queue (e.g., Apache Kafka).

Events flow without replication or synchronous writes.

State in Flink

Trigger checkpoint: inject a checkpoint barrier into the dataflow (Source/filter()/map() → window()/sum()).

State in Flink

Take state snapshot: trigger state copy-on-write.

State in Flink

Persist state snapshots: durably persist snapshots asynchronously; the processing pipeline continues.
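As a concrete counterpart to this walkthrough, here is a minimal sketch of how such a job turns on checkpointing with a RocksDB-backed state index. The interval, checkpoint path, and backend choice are placeholders, and the RocksDB backend requires the flink-statebackend-rocksdb dependency.

// Minimal sketch: periodic, exactly-once checkpoints with RocksDB holding
// local working state and a durable filesystem receiving the snapshots.
import org.apache.flink.contrib.streaming.state.RocksDBStateBackend
import org.apache.flink.streaming.api.CheckpointingMode
import org.apache.flink.streaming.api.scala.StreamExecutionEnvironment

val env = StreamExecutionEnvironment.getExecutionEnvironment

// Inject checkpoint barriers every 10 seconds; the resulting snapshots are
// consistent across all operators (exactly-once state semantics).
env.enableCheckpointing(10000L, CheckpointingMode.EXACTLY_ONCE)

// Keep working state in local RocksDB instances; completed snapshots are
// persisted to the given durable path while the pipeline keeps running.
env.setStateBackend(new RocksDBStateBackend("hdfs:///flink/checkpoints"))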

Queryable State: Implementation

Components: the Query Client, the JobManager (Execution Graph, State Location Server), and the TaskManagers. The JobManager deploys tasks and tracks their status; the window()/sum() operator registers its local state with the TaskManager's State Registry.

Query: /job/state-name/key
(1) Get the location of the "key-partition" of "job" from the JobManager.
(2) Look up the location in the State Location Server.
(3) Respond with the location.
(4) Query state-name and key against the local state on the TaskManager.
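A minimal sketch of the client side of this flow, using the standalone QueryableStateClient that later Flink releases (1.4+) ship; the client available at the time of the talk exposed a lower-level API working on serialized keys and values. The host, proxy port, and "impression-counts" name follow the assumptions of the earlier sketch and are not from the slides.

// Minimal sketch: fetch the current count for one key from a running job.
import org.apache.flink.api.common.JobID
import org.apache.flink.api.common.state.ValueStateDescriptor
import org.apache.flink.queryablestate.client.QueryableStateClient
import org.apache.flink.streaming.api.scala._

object ImpressionsQuery {
  def main(args: Array[String]): Unit = {
    val jobId = JobID.fromHexString(args(0)) // id of the running Flink job
    val key   = args(1)                      // e.g. a campaign/ad id

    // Connects to a queryable state proxy (default port 9069).
    val client = new QueryableStateClient("localhost", 9069)

    // Must describe the same state type the job registered.
    val descriptor = new ValueStateDescriptor[Impressions](
      "impression-counts", createTypeInformation[Impressions])

    // The location lookup and the state query from the diagram above
    // happen behind this call.
    val future = client.getKvState(
      jobId, "impression-counts", key,
      createTypeInformation[String], descriptor)

    val state = future.join() // block for the demo; a service would compose the future
    println(s"Current count for $key: ${state.value().impressions}")
  }
}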

Queryable State Performance

Conclusion


Takeaways

§ Streaming applications are often not bound by the stream processor itself; cross-system interaction is frequently the biggest bottleneck
§ Queryable state mitigates a big bottleneck: communication with external key/value stores to publish real-time results
§ Apache Flink's sophisticated support for state makes this possible

Takeaways: Performance of Queryable State

§ Data persistence is fast with logs
  • Append-only, and streaming replication
§ Computed state is fast with local data structures and no synchronous replication
§ Flink's checkpoint method makes computed state persistent with low overhead

Questions?

§ eMail: uce@apache.org
§ Twitter: @iamuce
§ Code/Demo: https://github.com/dataArtisans/flink-queryable_state_demo

Appendix


Flink Runtime + APIs

Building blocks: Streams, Time, State
§ Runtime: Distributed Streaming Data Flow
§ DataStream API
§ ProcessFunction API
§ Table API & Stream SQL

Apache Flink Architecture Review