Brian Hess & Cliff Gilmore DataStax Advanced Replication

DataStax | DataStax Enterprise Advanced Replication (Brian Hess & Cliff Gilmore) | Cassandra Summit 2016



Why Advanced Replication

• Standard Cassandra replication has its limits
  • Lots of disconnected “edge” nodes/data centers/clusters
  • Replicating to central “mother ship” for aggregating
  • Inconsistent connectivity
  • All data centers are read-write – no read-only DCs

© 2016 DataStax, All Rights Reserved.

What is Advanced Replication

• Advanced Replication supports:
  • Many edge clusters replicating to a central hub
  • Consistent or sporadic connectivity – “store and forward”
  • Prioritized streams for limited bandwidth situations
  • One-way replication
  • Active queries at the edge, as well as replicating to the hub
  • Search/Analytics supported at edge and hub clusters


Example: Retail Sales

Each Store
• Can Query Local Sales: “What did Brian buy here today?”
• Analytics of Local Sales: “What was the hottest product here this week?”

Central Hub
• Can Query Global Sales: “What did Brian buy today across all stores?”
• Analytics Over All Data: “What was Brian’s average purchase per store this week?”
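The split between store-level and hub-level questions in the retail example can be illustrated with a toy in-memory model. This is only a sketch: the sales rows, store names, and the helper functions `purchases` and `average_purchase_per_store` are made up for illustration, and plain Python lists stand in for the edge and hub tables.

```python
from collections import defaultdict

# Hypothetical sales rows: (store, customer, product, amount). Values are made up.
store_1 = [("store-1", "brian", "coffee", 4.0), ("store-1", "ann", "tea", 3.0)]
store_2 = [("store-2", "brian", "bagel", 2.0), ("store-2", "brian", "coffee", 4.0)]

# One-way replication: the hub ends up holding every store's rows.
hub = store_1 + store_2


def purchases(rows, customer):
    """Products a customer bought within the given set of rows."""
    return [product for _, who, product, _ in rows if who == customer]


def average_purchase_per_store(rows, customer):
    """Average amount per purchase for a customer, grouped by store."""
    amounts = defaultdict(list)
    for store, who, _, amount in rows:
        if who == customer:
            amounts[store].append(amount)
    return {store: sum(a) / len(a) for store, a in amounts.items()}


local = purchases(store_1, "brian")                    # asked at one store
global_ = purchases(hub, "brian")                      # asked at the hub
per_store = average_purchase_per_store(hub, "brian")   # analytics over all data
```

The same query function runs against either data set; what changes is which rows are visible. A store sees only its own sales, while the hub sees the union.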


Key Verticals

• Oil and Gas
• Mobile deploy
• Transport
• Telecom
• Retail
• Banking, Finance
• Industrial IoT


Advanced Replication Key Terminology

• Edge – DSE Cluster that is the source of change events
• Hub – DSE Cluster that receives change events
• Replication Log – A table on the edge cluster that stores changes
• Channel – Defined replication configuration between an edge and hub table
• Collection Agent – Captures change events to the replication log table
• Replication Agent – Reads replication log and writes to the hub



Architecture – Edge View

[Diagram: Client → Edge cluster (Table, Collection Agent, Replication Log, Replication Agent) → Hub Cluster (Table)]

• Normal CQL Operation
• CQL Trigger captures mutation
• Maintained in C* table for Fault Tolerance
• Pulls from Replication Log in priority/time order
• Replicates to Hub via normal CQL driver
• High Priority mutations opportunistically sent to Hub asynchronously
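The flow annotated on this slide can be sketched as a toy in-memory model. This is a minimal sketch under stated assumptions, not the DSE implementation: `EdgeSketch`, `LogEntry`, and the rule that a lower `priority` number is sent first are hypothetical choices made for illustration, a heap stands in for the replication log table, and a plain dict stands in for the hub table.

```python
import heapq
import itertools
from dataclasses import dataclass, field


@dataclass(order=True)
class LogEntry:
    # Sort key: lower priority number first, then capture order (time).
    priority: int
    seq: int
    mutation: dict = field(compare=False)


class EdgeSketch:
    """Toy stand-in for an edge cluster (assumption, not a DSE API)."""

    def __init__(self):
        self.table = {}            # the edge table the client writes to
        self.replication_log = []  # heap standing in for the replication log
        self._seq = itertools.count()

    def write(self, key, value, priority=1):
        # Normal write to the edge table.
        self.table[key] = value
        # Collection agent role: capture the mutation into the log.
        entry = LogEntry(priority, next(self._seq), {"key": key, "value": value})
        heapq.heappush(self.replication_log, entry)

    def replicate(self, hub_table, hub_reachable):
        # Replication agent role: drain the log in priority/time order.
        # If the hub is unreachable, entries simply stay in the log.
        sent = []
        while hub_reachable and self.replication_log:
            entry = heapq.heappop(self.replication_log)
            hub_table[entry.mutation["key"]] = entry.mutation["value"]
            sent.append(entry.mutation["key"])
        return sent


edge = EdgeSketch()
hub = {}
edge.write("sale:1", 10, priority=2)
edge.write("sale:2", 20, priority=1)
edge.write("sale:3", 30, priority=2)

offline = edge.replicate(hub, hub_reachable=False)   # [] while disconnected
pending_while_offline = len(edge.replication_log)    # 3 entries still stored
online = edge.replicate(hub, hub_reachable=True)     # ["sale:2", "sale:1", "sale:3"]
```

Writes made while the hub is unreachable are kept in the log and forwarded once connectivity returns, which is the “store and forward” behavior described earlier in the deck. Entries with equal priority leave in the order they were captured.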


Points of Nuance

• Does it handle TTLs?
  • The edge cluster will NOT capture the TTL of the base record
  • The hub table can have a default TTL that is different from the edge table's
• Can I repair from edge to hub?
  • Because these are separate clusters, there is no repair mechanism
  • Replication mechanism ensures writes make it to hub eventually
• This looks like Hints!
  • More robust than Hinted Handoff
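The TTL point can be made concrete with a small simulation. This is a sketch under assumptions, not DSE code: `TableSketch`, `Row`, and `replicate` are hypothetical names, time is passed in explicitly as a number of seconds, and the one-hour and one-day TTL values are chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Row:
    value: str
    expires_at: Optional[float]  # None means the row never expires


class TableSketch:
    """Toy table with an optional table-level default TTL in seconds."""

    def __init__(self, default_ttl=None):
        self.default_ttl = default_ttl
        self.rows = {}

    def write(self, key, value, now, ttl=None):
        # An explicit per-write TTL wins; otherwise use the table default.
        effective = ttl if ttl is not None else self.default_ttl
        expires_at = now + effective if effective is not None else None
        self.rows[key] = Row(value, expires_at)


def replicate(key, edge, hub, now):
    # Only the key and value travel to the hub; the edge TTL is not carried.
    hub.write(key, edge.rows[key].value, now)


edge = TableSketch()                   # no default TTL on the edge table
hub = TableSketch(default_ttl=86_400)  # assumed hub default TTL: one day

edge.write("sale:1", "widget", now=0.0, ttl=3_600)  # edge row lives one hour
replicate("sale:1", edge, hub, now=10.0)

edge_expiry = edge.rows["sale:1"].expires_at  # 3600.0
hub_expiry = hub.rows["sale:1"].expires_at    # 86410.0
```

In this model the same record expires at different times on each side: the edge honors the per-write TTL, while the hub applies only its own table default from the moment the replicated write arrives.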


Topology

[Diagram: West, East, and Central, with Store #1 through Store #11]

Questions?

Brian Hess – [email protected]
Cliff Gilmore – [email protected]