SFScon17 - Stefano Pampaloni: "Big Data Streaming Analysis without code"

Big Data Streaming Analysis without code

STEFANO PAMPALONI [email protected]

Let’s take a trip back in time. Each application has its own database for storing information. But we want that information elsewhere for analytics and reporting.

We don't want to query the transactional system, so we create a process to extract from the source to a data warehouse or data lake.

We want to unify data from multiple systems, so we create conformed dimensions and batch processes to federate our data. This is all batch-driven, so latency is built in by design.

As well as our data warehouse, we want to use our transactional data to populate search replicas, graph databases, NoSQL stores… all introducing more point-to-point dependencies in our system.

Ultimately we end up with a spaghetti architecture. It can't scale easily, it's tightly coupled, it's generally batch-driven and we can't get data when we want it where we want it.

But…there's hope!

Apache Kafka, a distributed streaming platform, enables us to decouple all our applications creating data from those utilising it. We can create low-latency streams of data, transformed as necessary.
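
The decoupling described here can be pictured with a toy model. The sketch below is not the Kafka API; it is a minimal in-memory stand-in, assuming a single-partition topic, and the names `TopicLog`, `produce`, `poll` and the group names are hypothetical. It only illustrates the idea that a producer appends to a shared log while each consumer group tracks its own read position.

```python
from collections import defaultdict


class TopicLog:
    """Toy in-memory stand-in for a single-partition topic (not the Kafka API)."""

    def __init__(self):
        self._records = []                # append-only list of records
        self._offsets = defaultdict(int)  # next offset to read, per consumer group

    def produce(self, record):
        """Append a record and return its offset."""
        self._records.append(record)
        return len(self._records) - 1

    def poll(self, group, max_records=10):
        """Return unread records for `group` and advance only that group's offset."""
        start = self._offsets[group]
        batch = self._records[start:start + max_records]
        self._offsets[group] = start + len(batch)
        return batch


orders = TopicLog()
for order in ({"id": 1, "total": 20}, {"id": 2, "total": 35}):
    orders.produce(order)  # the producing application knows nothing about readers

# Two independent consumers read the same data, each at its own pace.
warehouse_batch = orders.poll("warehouse")
search_batch = orders.poll("search-index", max_records=1)
print(len(warehouse_batch), len(search_batch))
```

Adding a third consumer in this model needs no change to the producer, which is the property the slide contrasts with point-to-point integrations.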

Kafka concepts

Before

After

But… to use stream processing, we need to be Java coders… don't we?

Happy days! We can actually build streaming data pipelines using just our bare hands, configuration files, and SQL.

A Developer Preview of KSQL: An Open Source Streaming SQL Engine for Apache Kafka

• Enables stream processing with zero coding required
• The simplest way to process streams of data in real-time
• Powered by Kafka: scalable, distributed, battle-tested
• All you need is Kafka: no complex deployments
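
To make concrete the kind of work a streaming SQL engine takes off your hands, here is a minimal sketch of a windowed count, the sort of aggregation one would normally declare in a single query. It is plain Python for illustration only, not KSQL syntax and not how KSQL is implemented; the function name `count_per_window`, the one-minute window and the sample events are all assumptions.

```python
from collections import Counter

WINDOW_MS = 60_000  # hypothetical one-minute tumbling window


def count_per_window(events, window_ms=WINDOW_MS):
    """Count events per (window start, user); events are (timestamp_ms, user) pairs."""
    counts = Counter()
    for timestamp_ms, user in events:
        # Align each event to the start of the fixed-size window it falls in.
        window_start = timestamp_ms - (timestamp_ms % window_ms)
        counts[(window_start, user)] += 1
    return dict(counts)


page_views = [(1_000, "alice"), (30_000, "alice"), (61_000, "alice"), (62_000, "bob")]
result = count_per_window(page_views)
print(result)
```

A hand-written version like this still has to handle late events, state storage and scaling across machines, which is the part the slides suggest leaving to the engine.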
