58
#JCConf Apache Kafka A high-throughput distributed messaging system 陸振恩 (popcorny) popcorny@cacafly.com

Apache Kafka: A high-throughput distributed messaging system @ JCConf 2014

Embed Size (px)

DESCRIPTION

隨著大數據的需求,資料的產生流動儲存處理已經越分越細。早期的 RDBMS 已經無法達到需求,而因此出現了 NoSQL 來解決更大資料量的問題。至於傳統的 Message Queue System 也漸漸無法達到大數據的需求,Apache Kafka 也試著解決 Message 在超大數據的需求。Kafka 可以視為新的 Message Queue Solution,或者更像 Activity Logging System,目標是達到快速、分散式、持續性、延展性等特色。為了這些特性,其中有許多設計上的巧思也值得去關注。所以在這次分享中,除了介紹 Kafka 功能以外,也會針對內部設計做更進一步的介紹。

Citation preview

Page 1: Apache Kafka: A high-throughput distributed messaging system @ JCConf 2014

#JCConf�

Apache KafkaA high-throughput distributed messaging system

陸振恩 (popcorny) [email protected]

Page 2: Apache Kafka: A high-throughput distributed messaging system @ JCConf 2014

#JCConf�

● What is Kafka ● Basic concept ● Why Kafka fast ● Programming Kafka ● Using scenarios ● Recap

Outline

2

Page 3: Apache Kafka: A high-throughput distributed messaging system @ JCConf 2014

#JCConf�

What is Kafka

Page 4: Apache Kafka: A high-throughput distributed messaging system @ JCConf 2014

#JCConf�

● More and more data and metrics need to be collected - Web activity tracking - Operation metrics - Application log aggregation - Commit log - …

● We need a message bus to collect and relay these data - Big Volume - Fast and Scalable

Motivation

4

Page 5: Apache Kafka: A high-throughput distributed messaging system @ JCConf 2014

#JCConf�

● Developed by Linkedin ● Message System

- Queue - Pub / Sub

● Written in Scala ● Features

- Durability - Scalability - High Availability - High Throughput

Kafka

5

Page 6: Apache Kafka: A high-throughput distributed messaging system @ JCConf 2014

#JCConf�

BigData World

6

Traditional BigData

File System NFS HDFS, S3

Database RDBMS Cassandra, HBase

Batch Processing SQL Hadoop MapReduce

Spark

Stream Processing In-App Processing Strom,

Spark Streaming

Message Service AMQP-compliant Kafka

Page 7: Apache Kafka: A high-throughput distributed messaging system @ JCConf 2014

#JCConf�

● Durability - All messages are persisted - Sequential read & write (like log file) - Consumers keep the message offset (like file

descriptor) - The log files are rotated (like logrotate) - Messages are only deleted on expired. (like

logrotate) - Support Batch Load and Real Time Usage!

• cat access.log | grep ‘jcconf’ ● tail -F access.log | grep ‘jcconf’

Features

7

Page 8: Apache Kafka: A high-throughput distributed messaging system @ JCConf 2014

#JCConf�

Design like Message Queue Implementation like Distributed Log File

8

Page 9: Apache Kafka: A high-throughput distributed messaging system @ JCConf 2014

#JCConf�

● Scalability - Horizontal scale out - Topic is partitioned (sharded)

● High Availability - Partition can be replicated

Features

9

Page 10: Apache Kafka: A high-throughput distributed messaging system @ JCConf 2014

#JCConf�

● High Throughput

Features

10

source: http://www.infoq.com/articles/apache-kafka

Page 11: Apache Kafka: A high-throughput distributed messaging system @ JCConf 2014

#JCConf�

Basic Concept

Page 12: Apache Kafka: A high-throughput distributed messaging system @ JCConf 2014

#JCConf�

● Producer - The role to send message to broker ● Consumer -The role to receive message from broker ● Broker - One node of Kafka cluster. ● ZooKeeper - Coordinator of Kafka cluster and costumer

groups.

Kafka Cluster

Physical Components

12

Producer BroakerBroakerBroaker

Zookeeper

Consumer Group

Consumer

Page 13: Apache Kafka: A high-throughput distributed messaging system @ JCConf 2014

#JCConf�

● Topic!- The named destination of

partition. ● Partition

- One Topic can have multiple partition

- Unit of parallelism ● Message!

• Key/value pair • Message offset

Logical ComponentsTopic

B

Partition 2

0 1

E

2

F

3

M

4

N

5

Q

6

R

7

S

8

Y

9

b

C

Partition 3

0 1

D

2

K

3

L

4

O

5

P

6

T

7

U

A

Partition 1

0 1

G

2

H

3

I

4

J

5

V

6

W

7

X

8

c

13

Page 14: Apache Kafka: A high-throughput distributed messaging system @ JCConf 2014

#JCConf�

One Partition One Consumer (Queue)

P CA

Partition 1

0 1

B

2

C

3

D

4

E

5

F

6

G

7

H

8

I

9

J

offset = 8

14

Consumers keep the offset.Broker has no idea about if message is proceeded

Page 15: Apache Kafka: A high-throughput distributed messaging system @ JCConf 2014

#JCConf�

One Partition Multiple Consumer (Pub/Sub)

P A

Partition 1

0 1

B

2

C

3

D

4

E

5

F

6

G

7

H

8

I

9

J

C1

C2

C3

offset = 8

offset = 7

offset = 9

15

Each Consumer keep its own offset.

Page 16: Apache Kafka: A high-throughput distributed messaging system @ JCConf 2014

#JCConf�

broker2

Multiple Partitions

broker1

P

A

Partition 1

0 1

G

2

H

3

I

4

J

5

V

6

W

7

X

8

c

B

Partition 2

0 1

E

2

F

3

M

4

N

5

Q

6

R

7

S

8

Y

9

b

C

Partition 3

0 1

D

2

K

3

L

4

O

5

P

6

T

7

U

16

C1

p1.offset = 7 p2.offset = 9 p3.offset = 7

Dispatched by hashed key

Page 17: Apache Kafka: A high-throughput distributed messaging system @ JCConf 2014

#JCConf�

broker2

Multiple Partitions

broker1

P

A

Partition 1

0 1

G

2

H

3

I

4

J

5

V

6

W

7

X

8

c

B

Partition 2

0 1

E

2

F

3

M

4

N

5

Q

6

R

7

S

8

Y

9

b

C

Partition 3

0 1

D

2

K

3

L

4

O

5

P

6

T

7

U

17

C2

offset = 9

offset = 7

C3

offset = 7

C1

Page 18: Apache Kafka: A high-throughput distributed messaging system @ JCConf 2014

#JCConf�

Can we auto-rebalance the consumers to partitions?

18

Yes, Consumer Group!!

Page 19: Apache Kafka: A high-throughput distributed messaging system @ JCConf 2014

#JCConf�

● A group of workers ● Share the offsets ● Offsets are synced to ZooKeeper ● Auto Rebalancing

Consumer Group

19

Page 20: Apache Kafka: A high-throughput distributed messaging system @ JCConf 2014

#JCConf�

Consumer Group

20

broker2

broker1

P

A

Partition 1

0 1

G

2

H

3

I

4

J

5

V

6

W

7

X

8

c

B

Partition 2

0 1

E

2

F

3

M

4

N

5

Q

6

R

7

S

8

Y

9

b

C

Partition 3

0 1

D

2

K

3

L

4

O

5

P

6

T

7

U

Consumer Group ‘group1’

C2

p1.offset = 7 p2.offset = 9 p3.offset = 7

C1

Page 21: Apache Kafka: A high-throughput distributed messaging system @ JCConf 2014

#JCConf�

Consumer Group

21

broker2

broker1

P

A

Partition 1

0 1

G

2

H

3

I

4

J

5

V

6

W

7

X

8

c

B

Partition 2

0 1

E

2

F

3

M

4

N

5

Q

6

R

7

S

8

Y

9

b

C

Partition 3

0 1

D

2

K

3

L

4

O

5

P

6

T

7

U

’group1’

C2

C1

C1

’group2’

Page 22: Apache Kafka: A high-throughput distributed messaging system @ JCConf 2014

#JCConf�

Consumer Group

P A

Partition 1

0 1

B

2

C

3

D

4

E

5

F

6

G

7

H

8

I

9

J

C1

C2

C3

offset = 9

Consumer Group

22

Partition to Consumer is Many to One relation (In One Consumer Group)

Page 23: Apache Kafka: A high-throughput distributed messaging system @ JCConf 2014

#JCConf�

● Messages from the same partition guarantee FIFO semantic

● Traditional MQ can only guarantee message are delivered in order

● Kafka can guarantee messages are handled in order (for same partition)

Message Ordering

23

P BC1

C2

PP1 C1

C2P2

Traditional MQ Kafka

Page 24: Apache Kafka: A high-throughput distributed messaging system @ JCConf 2014

#JCConf�

● At most once - Messages may be lost but are never redelivered.

● At least once - Messages are never lost but may be redelivered.

● Exactly once - each message is delivered once and only once. (this is what people actually want) - Two-Phase Commit - At least once + Idempotence

Delivery Semantic

24

Apply multiple times without changing the final result

Page 25: Apache Kafka: A high-throughput distributed messaging system @ JCConf 2014

#JCConf�

● Which part do we discuss?

Delivery Semantic

25

Producer Broker Consumer

Producer Broker Consumer

Page 26: Apache Kafka: A high-throughput distributed messaging system @ JCConf 2014

#JCConf�

● At most once - Async send ● At least once - Sync send (with retry count) ● Exactly once!

- Idempotent delivery does not support until next version (0.9)

Producer To Broker

26

Producer Broker Consumer

Page 27: Apache Kafka: A high-throughput distributed messaging system @ JCConf 2014

#JCConf�

● At most once - Store the offset before handling the message

● At least once - Store the offset after handling the message

● Exactly once - At least once + Idempotent operation

Broker to Consumer

27

Producer Broker Consumer

Page 28: Apache Kafka: A high-throughput distributed messaging system @ JCConf 2014

#JCConf�

● The unit of replication is the partition!● Each partition has a single leader and zero or more

followers ● All reads and writes go to the leader of the partition

Replication

28

source: http://www.infoq.com/articles/apache-kafka

Leader FollowerFollower

Producer Consumer

sync sync

write read

Page 29: Apache Kafka: A high-throughput distributed messaging system @ JCConf 2014

#JCConf�29

Page 30: Apache Kafka: A high-throughput distributed messaging system @ JCConf 2014

#JCConf�

● Many data system retain a latest state for data by some key.

● Log compaction adds an alternative retention mechanism, log compaction, to support retaining messages by key instead of purely by time.

● This would describe both many common data systems — a search index, a cache, etc

Log Compaction

30

Page 31: Apache Kafka: A high-throughput distributed messaging system @ JCConf 2014

#JCConf�

Log Compaction

31

Page 32: Apache Kafka: A high-throughput distributed messaging system @ JCConf 2014

#JCConf�

Log Compaction

32

Page 33: Apache Kafka: A high-throughput distributed messaging system @ JCConf 2014

#JCConf�

Why Kafka Fast?

Page 34: Apache Kafka: A high-throughput distributed messaging system @ JCConf 2014

#JCConf�34

Persistence and Fast?

Page 35: Apache Kafka: A high-throughput distributed messaging system @ JCConf 2014

#JCConf�

● Don’t fear file system ● Six 7,200 RPM SATA RAID-5 array

- Sequential write: 600MB/sec - Random write: 100K/sec

● Sequential read in disk faster than random access in memory?

Sequential vs Random

35

source: http://queue.acm.org/detail.cfm?id=1563874

Page 36: Apache Kafka: A high-throughput distributed messaging system @ JCConf 2014

#JCConf�

If we persist data, should we cache the data in memory?

36

Page 37: Apache Kafka: A high-throughput distributed messaging system @ JCConf 2014

#JCConf�

● In-Process Cache - Message as object - Cache in JVM heap.

● Page Cache - Disk cache by OS

In-Process Cache vs Page Cache

37

Page 38: Apache Kafka: A high-throughput distributed messaging system @ JCConf 2014

#JCConf�

In-Process Cache vs Page Cache

38

In Process Cache Disk Page CacheMemory Usage In-heap memory Free Memory

Overhead Object overhead No

Garbage Collection Yes No

Process Restart Lost Still Warm

Controled by App OS

Page 39: Apache Kafka: A high-throughput distributed messaging system @ JCConf 2014

#JCConf�

● Fact - All disk reads and writes will go through page

cache. This feature cannot easily be turned off without using direct I/O, so even if a process maintains an in-process cache of the data, this data will likely be duplicated in OS pagecache, effectively storing everything twice.

● Conclusion - Relying on pagecache is superior to maintaining

an in-process cache or other structure

In-Process Cache vs Page Cache

39

Page 40: Apache Kafka: A high-throughput distributed messaging system @ JCConf 2014

#JCConf�

How to transfer to consumers?

40

Page 41: Apache Kafka: A high-throughput distributed messaging system @ JCConf 2014

#JCConf�

Application Copy vs Zero Copying

41

Page 42: Apache Kafka: A high-throughput distributed messaging system @ JCConf 2014

#JCConf�

● Traditional Queue - Broker keep the message state and metadata - B-Tree O(log n) - Random Access

● Kafka - Consumers keep the offset - Sequential Disk Read/Write O(1)

Constant Time

42

Page 43: Apache Kafka: A high-throughput distributed messaging system @ JCConf 2014

#JCConf�

Programming Kafka

Page 44: Apache Kafka: A high-throughput distributed messaging system @ JCConf 2014

#JCConf�

Producer

44

Sync Send

Page 45: Apache Kafka: A high-throughput distributed messaging system @ JCConf 2014

#JCConf�

Producer

45

Async Send

Page 46: Apache Kafka: A high-throughput distributed messaging system @ JCConf 2014

#JCConf�

High Level Consumer

46

Open The Consumer Connector

Open the stream for topic

Page 47: Apache Kafka: A high-throughput distributed messaging system @ JCConf 2014

#JCConf�

High Level Consumer

47

Receive the message

Page 48: Apache Kafka: A high-throughput distributed messaging system @ JCConf 2014

#JCConf�

Using Scenarios

Page 49: Apache Kafka: A high-throughput distributed messaging system @ JCConf 2014

#JCConf�

● Realtime processing and analyzing ● Stream processing frameworks

- Strom - Spark Streaming - Samza

● Distributed stream source + Distributed stream processing

● All these three frameworks support Kafka as stream source.

Source of Stream Processing

49

Kafka Cluster

Stream Processing

Page 50: Apache Kafka: A high-throughput distributed messaging system @ JCConf 2014

#JCConf�

● The most reliable source for stream processingSource of Stream Processing

50

source: http://www.slideshare.net/ptgoetz/apache-storm-vs-spark-streaming

Page 51: Apache Kafka: A high-throughput distributed messaging system @ JCConf 2014

#JCConf�

● Centralized Log Framework ● Distributed Log Collectors

- Logstash - Fluentd - Flume

Source and/or Sink of Distributed Log Collectors

51

Kafka Cluster

Distributed Log Collector

Other Sink

Kafka Cluster

Distributed Log Collector

Other Source

Page 52: Apache Kafka: A high-throughput distributed messaging system @ JCConf 2014

#JCConf�

● Push vs Pull

● Distributed Log Collector provide Configurable producer and consumer

● Kafka Cluster provide distributed, high availability, reliable message system

Source and/or Sink of Distributed Log Collectors (cont.)

52

Distributed Log Collector

Kafka Cluster

pull

pull

push

push

Page 53: Apache Kafka: A high-throughput distributed messaging system @ JCConf 2014

#JCConf�

● What is lambda architecture? - Stream for realtime data - Batch for historical data - Query by merged view.

Source of Lambda Architecture

53

source: http://lambda-architecture.net/

Page 54: Apache Kafka: A high-throughput distributed messaging system @ JCConf 2014

#JCConf�

Lambda Architecture (cont.)

54

source: https://metamarkets.com/2014/building-a-data-pipeline-that-handles-billions-of-events-in-real-time/

Page 55: Apache Kafka: A high-throughput distributed messaging system @ JCConf 2014

#JCConf�

● Features - Durability - Scalability - High Availability - High Throughput

● Basic Concept - Producer, Broker, Consumer, Consumer Group - Topic, Partition, Message - Message Ordering - Delivery Semantic - Replication

● Why Kafka fast ● Using Scenarios

- Source of stream processing - Source or sink of distributed log framework - Source of lambda architecture

Recap

55

Page 56: Apache Kafka: A high-throughput distributed messaging system @ JCConf 2014

#JCConf�

● Kafka Documentationkafka.apache.org/documentation.html

● Kafka Wikihttps://cwiki.apache.org/confluence/display/KAFKA/Index

● The Log: What every software engineer should know about real-time data's unifying abstractionengineering.linkedin.com/distributed-systems/log-what-every-software-engineer-should-know-about-real-time-datas-unifying

● Benchmarking Apache Kafka: 2 Million Writes Per Second (On Three Cheap Machines)engineering.linkedin.com/kafka/benchmarking-apache-kafka-2-million-writes-second-three-cheap-machines

● Apache Kafka for Beginnersblog.cloudera.com/blog/2014/09/apache-kafka-for-beginners/

Reference

56

Page 57: Apache Kafka: A high-throughput distributed messaging system @ JCConf 2014

#JCConf�57

producer.send(“thanks”);

Page 58: Apache Kafka: A high-throughput distributed messaging system @ JCConf 2014

#JCConf�

// any question? question = consumer.receive();

58