Apache BookKeeper A High Performance and Low Latency Storage Service

Apache BookKeeper

A High Performance and Low Latency Storage Service

Hello!

I am Sijie Guo

- PMC Chair of Apache BookKeeper

- Co-creator of Apache DistributedLog

- Twitter Messaging/Pub-Sub Team

- Yahoo! R&D Beijing

Challenges in Distributed Systems

Expect Failures

up to 10% annual failure rates for disks/servers

Symptoms

Problem 1: Not Available

Problem 2: Inconsistencies

CAP

More Issues

Problem 3: Split Brain

[Figure: Two Writers. Two writers, each labeled Writer A, both issue Write A’.]

Problem 4: Failure Detection

[Figure: three nodes A, B, and C]

Problem 5: Recovery

[Figure: three nodes A, B, and C; labels: Recovery Protocol, Consistency]

Solutions

Overview

Enter Apache BookKeeper

BookKeeper - Durable Storage

A building block for reliable systems

Commodity Hardware

Durability

Replication Consistency Recovery

Client Library

Ledger Abstraction

Ledger

◉ Segment

◉ Block / Object

◉ Append-Only File

◉ ...

Guarantees

If an entry has been acknowledged, it must be readable.

If an entry is read once, it must always be readable.

History

◉ Initial Use Case - Hadoop NameNode HA

◉ 2008: Open Sourced Contrib of ZooKeeper

◉ 2011: Sub-Project of ZooKeeper

◉ 2012: Yahoo! Push Notification

◉ 2012~Now: DistributedLog, Pulsar, Majordodo

◉ 2015~Now: Salesforce Distributed Store

Inside of Apache BookKeeper

Details

Architecture

[Figure: an application (APP) with the BookKeeper client, three bookies, a metadata store, and a ledger]

Reliable Writes

◉ Store digest along with entry

◉ Fsync entries before responding

◉ Ack when

○ All Previous Entries

○ This Entry

[Figure: three bookies; label: Accepted by Quorum]
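The digest and quorum points above can be sketched in a few lines of Python. This is an illustrative in-memory model, not the BookKeeper client API: the `Bookie` class, `quorum_add`, and the use of CRC32 as the digest are all assumptions made for the example, and the ordering condition ("all previous entries") and the fsync step are not modeled.

```python
import zlib


class Bookie:
    """Hypothetical in-memory stand-in for a storage node."""

    def __init__(self, healthy=True):
        self.healthy = healthy
        self.entries = {}  # entry_id -> (digest, payload)

    def add_entry(self, entry_id, payload):
        if not self.healthy:
            return False
        # Keep a digest next to the payload so corruption shows up on read.
        # (A real bookie would also fsync before responding.)
        self.entries[entry_id] = (zlib.crc32(payload), payload)
        return True

    def read_entry(self, entry_id):
        digest, payload = self.entries[entry_id]
        if zlib.crc32(payload) != digest:
            raise ValueError("digest mismatch for entry %d" % entry_id)
        return payload


def quorum_add(bookies, entry_id, payload, ack_quorum):
    """Return True once at least ack_quorum bookies have accepted the entry."""
    accepted = sum(1 for b in bookies if b.add_entry(entry_id, payload))
    return accepted >= ack_quorum


ensemble = [Bookie(), Bookie(), Bookie(healthy=False)]
print(quorum_add(ensemble, 0, b"hello", ack_quorum=2))
```

With one of three bookies down, the add still reaches a quorum of two; with two down it would not.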

Consistency - LastAddPushed

[Figure: a writer adds entries to a ledger holding entries 0 through 12; LastAddPushed marks the last entry the writer has sent]

Consistency - LastAddConfirmed

[Figure: a ledger holding entries 0 through 12; the writer adds entries and receives acks for them; LastAddConfirmed is shown for the writer and for two readers; when ownership changes from one writer to another, fencing applies]
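A minimal sketch of how the two pointers relate, assuming a single writer whose acks may arrive out of order. `WriterState` and its methods are hypothetical names for illustration, not BookKeeper classes: LastAddPushed advances as soon as an entry is sent, while LastAddConfirmed only advances over a gap-free prefix of acknowledged entries, and readers are limited to entries at or below it.

```python
class WriterState:
    """Hypothetical tracker for LastAddPushed (LAP) and LastAddConfirmed (LAC)."""

    def __init__(self):
        self.last_add_pushed = -1     # highest entry id sent by the writer
        self.last_add_confirmed = -1  # highest id with no unacked entry before it
        self._acked = set()           # acks received ahead of the LAC

    def push(self):
        self.last_add_pushed += 1
        return self.last_add_pushed

    def on_ack(self, entry_id):
        self._acked.add(entry_id)
        # Advance the LAC only across a contiguous run of acked entries.
        while self.last_add_confirmed + 1 in self._acked:
            self.last_add_confirmed += 1
            self._acked.remove(self.last_add_confirmed)

    def readable(self, entry_id):
        # Readers may only see entries up to the LAC.
        return 0 <= entry_id <= self.last_add_confirmed


w = WriterState()
for _ in range(4):
    w.push()   # entries 0..3 are in flight
w.on_ack(0)
w.on_ack(2)    # the ack for entry 2 arrives before the ack for entry 1
print(w.last_add_pushed, w.last_add_confirmed, w.readable(2))
w.on_ack(1)    # the gap closes, so the LAC can move past entry 2
print(w.last_add_pushed, w.last_add_confirmed, w.readable(2))
```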

Fencing
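The idea behind fencing can be illustrated with a small simulation. It assumes an ensemble of three bookies and an ack quorum of two; the class and function names are hypothetical. The new owner marks the ledger as fenced on enough bookies that the old writer can no longer collect a quorum of acks, which prevents two writers from appending to the same ledger.

```python
class LedgerFencedError(Exception):
    pass


class Bookie:
    """Hypothetical in-memory bookie that remembers which ledgers are fenced."""

    def __init__(self):
        self.fenced = set()
        self.entries = {}

    def fence(self, ledger_id):
        self.fenced.add(ledger_id)

    def add_entry(self, ledger_id, entry_id, payload):
        if ledger_id in self.fenced:
            raise LedgerFencedError(ledger_id)
        self.entries[(ledger_id, entry_id)] = payload


def try_add(bookies, ledger_id, entry_id, payload, ack_quorum):
    """Return True only if at least ack_quorum bookies accept the add."""
    acks = 0
    for bookie in bookies:
        try:
            bookie.add_entry(ledger_id, entry_id, payload)
            acks += 1
        except LedgerFencedError:
            pass
    return acks >= ack_quorum


b1, b2, b3 = Bookie(), Bookie(), Bookie()
ensemble = [b1, b2, b3]

print(try_add(ensemble, 7, 0, b"from old writer", ack_quorum=2))  # before fencing

# The new owner reaches only two of the three bookies, which is still enough:
# any two bookies the old writer picks now include at least one fenced bookie.
b1.fence(7)
b2.fence(7)

print(try_add(ensemble, 7, 1, b"from old writer", ack_quorum=2))  # after fencing
```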

Read Entry & Read LAC

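[Figure: Read Entry K. A client reads entry K from bookies B1, B2, B3, issuing speculative reads on timeouts.]

[Figure: Read LAC. A client reads the LAC from bookies B1, B2, B3 using a quorum read.]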
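Both read paths can be modeled with a deterministic sketch that uses simulated latencies instead of real network calls. The function names and the timing model are assumptions for illustration: a speculative read sends the request to the next bookie each time the speculative timeout expires without a reply and takes whichever answer arrives first, and a LAC read takes the highest LAC reported by the bookies that responded.

```python
def speculative_read(latencies_ms, speculative_timeout_ms):
    """Simulate a speculative read over bookies with the given reply latencies.

    Request i is sent at i * speculative_timeout_ms unless a reply has already
    arrived. Returns (index of the bookie that answered first, completion time).
    """
    best = None
    for i, latency in enumerate(latencies_ms):
        sent_at = i * speculative_timeout_ms
        if best is not None and best[1] <= sent_at:
            break  # a reply arrived before another speculative request was due
        done_at = sent_at + latency
        if best is None or done_at < best[1]:
            best = (i, done_at)
    return best


def read_lac(reported_lacs, quorum):
    """Take the highest LAC among the responses, once a quorum has answered."""
    if len(reported_lacs) < quorum:
        raise RuntimeError("not enough responses for a quorum read")
    return max(reported_lacs)


# B1 is slow, so the read falls through to B2 after the 10 ms timeout.
print(speculative_read([500, 5, 5], speculative_timeout_ms=10))
# B1 answers quickly, so no speculative request is sent.
print(speculative_read([3, 5, 5], speculative_timeout_ms=10))
print(read_lac([7, 9, 8], quorum=2))
```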

Long Poll Read

[Figure: a client issues a long poll read to bookies B1, B2, B3, with a speculative long poll]

Inside a Bookie

Use Cases

Apache BookKeeper as a Building Block

Projects built on BookKeeper

◉ Twitter: Apache DistributedLog

◉ Yahoo: Pulsar - Cloud Messaging Service

◉ Salesforce Distributed Store.

◉ Huawei - HDFS NameNode HA

◉ HubSpot - WAL

◉ Majordodo - Distributed Resource Manager

Apache DistributedLog (Twitter)

Apache DistributedLog

[Figure: a log of records 1 through 17, ordered from oldest to newest and divided into Log Segment X, Log Segment X+1, and Log Segment X+2, stored in Apache BookKeeper]
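One way to picture a segmented log is as an ordered list of segments, each starting at a known record id. The sketch below locates the segment that holds a given record with a binary search. The segment boundaries (1, 8, 13) and all names are made up for the example; the slide does not state where the segments begin.

```python
from bisect import bisect_right

# (first record id in the segment, segment name), oldest first.
# The boundaries are illustrative assumptions.
SEGMENTS = [
    (1, "Log Segment X"),
    (8, "Log Segment X+1"),
    (13, "Log Segment X+2"),
]


def find_segment(segments, record_id):
    """Return the name of the segment that contains record_id."""
    starts = [start for start, _ in segments]
    idx = bisect_right(starts, record_id) - 1
    if idx < 0:
        raise KeyError("record %d is older than the oldest segment" % record_id)
    return segments[idx][1]


for rid in (1, 7, 8, 17):
    print(rid, find_segment(SEGMENTS, rid))
```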

Apache DistributedLog

[Figure: DistributedLog layered architecture]

◉ Applications - Different Consumer models

○ DBs - e.g., Twitter’s Manhattan

○ Deferred RPC (queuing)

○ Self-serve Pub/Sub

○ Stream Computing

○ Cross DC Replication

◉ Serving - WriteProxy, ReadProxy

○ Ownership Tracking

○ Batching, Compression

○ Record Cache

○ Rate Limiting, Quota

◉ Raw Streams - Log Streams

○ Abstraction & Naming

○ Data Management

○ Efficient Write & Read

○ Intra-cluster & Geo Replication

◉ Segments

○ Log Segment Store (BK)

○ Cold Storage (HDFS)

◉ Metadata Store

DistributedLog at Twitter

◉ Manhattan Key/Value Store - WAL

◉ Durable Deferred RPC - Journal

◉ Real-Time Search Indexing - Change Propagation

◉ Self-serve Pub/Sub - Message Delivery, Ads Pipeline

◉ Stream Computing

○ Source & Sink

○ Stateful Processing in Heron (coming soon)

◉ Reliable Cross Datacenter Replication

Scale DistributedLog at Twitter

◉ 1.5 trillion records/day, 17.5 petabytes/day

◉ O(10) thousand streams, O(1) million live ledgers

◉ O(10^2) bookies, O(10^3) proxies

◉ Record sizes range from 100 bytes to 20 KB and beyond

◉ Data is kept from hours to days, even up to a year

◉ Replication factor is 3 or 5; 9 or 15 for the global use case
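For a sense of scale, the daily totals above can be converted to per-second rates. This is back-of-the-envelope arithmetic that assumes decimal petabytes (10^15 bytes) and an even spread over the day; the slide does not say whether the byte total includes replication.

```python
SECONDS_PER_DAY = 24 * 60 * 60

records_per_day = 1.5e12   # 1.5 trillion records/day
bytes_per_day = 17.5e15    # 17.5 PB/day, decimal petabytes assumed

records_per_second = records_per_day / SECONDS_PER_DAY
bytes_per_second = bytes_per_day / SECONDS_PER_DAY
avg_bytes_per_record = bytes_per_day / records_per_day

print("records/s: %.1f million" % (records_per_second / 1e6))  # about 17.4 million
print("bytes/s:   %.1f GB" % (bytes_per_second / 1e9))         # about 202.5 GB
print("avg record: %.1f KB" % (avg_bytes_per_record / 1e3))    # about 11.7 KB
```

The average of roughly 11.7 KB per record sits inside the 100 bytes to 20 KB range quoted on the same slide.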

DistributedLog Resources

◉ Website - https://distributedlog.io

◉ Mail List - [email protected]

◉ Project Ideas - https://cwiki.apache.org/confluence/display/DL/Project+Ideas

◉ Paper - “DistributedLog: A high performance replicated log service” (ICDE 2017)

Yahoo! Pulsar (Cloud Messaging Service)

Yahoo! Pulsar

◉ Distributed Pub/Sub Messaging Platform

◉ Flexible Messaging Model - Topic and Queue

◉ Durable, Low Latency

◉ Strong Ordering and Consistency Guarantees

◉ Geo Replication

◉ Apache BookKeeper as Durable Message Store

Scale Pulsar at Yahoo!

◉ 100 billion messages per day

◉ More than 1.4 million topics

◉ Avg publish latency across services of less than 5ms

◉ 10+ data centers, cross-region replication

Pulsar Performance

Salesforce Distributed Store

Salesforce Application Storage

◉ Store for Persistent WAL, Data and Objects

◉ Low, Constant Write Latencies

◉ Low, Constant Random Read Latencies

◉ Highly Available, Consistent

◉ Distributed and Linearly Scalable

◉ On Commodity Hardware

Heterogeneous Stores

Roadmap, Releases, Future

Community

Community

◉ 7 PMC Members

◉ 10+ Committers

◉ 20+ Active Contributors

◉ 5+ Companies actively using/contributing

○ Twitter

○ Yahoo!

○ Salesforce

○ Huawei

○ EMC

Release 4.5.0

◉ Netty 4 Upgrade - Performance Improvements

◉ Security (Authentication & Authorization) Support

◉ Explicit LAC

◉ Long Poll Read Support

◉ Auto Re-replication Improvements

◉ ...

Future

◉ Scalable Segment Store

○ Object, Log, File, Stream, …

◉ Long Term Storage

○ Disk Scrubber

○ Better Lifecycle Management

○ …

◉ Beyond the limit

○ 128 bits support

○ Scalable metadata management

Any questions?

You can find me at

◉ @sijieg

◉ [email protected]

Thanks!