Why Advanced Replication
• Standard Cassandra replication has its limits
  • Lots of disconnected “edge” nodes/data centers/clusters
  • Replicating to central “mother ship” for aggregating
  • Inconsistent connectivity
  • All data centers are read-write – no read-only DCs
© 2016 DataStax, All Rights Reserved.
What is Advanced Replication
• Advanced Replication supports:
  • Many edge clusters replicating to a central hub
  • Consistent or sporadic connectivity – “store and forward”
  • Prioritized streams for limited bandwidth situations
  • One-way replication
  • Active queries at the edge, as well as replicating to the hub
  • Search/Analytics supported at edge and hub clusters
Company Confidential
Example: Retail Sales
• Each Store
  • Can Query Local Sales: “What did Brian buy here today?”
  • Analytics of Local Sales: “What was the hottest product here this week?”
• Central Hub
  • Can Query Global Sales: “What did Brian buy today across all stores?”
  • Analytics Over All Data: “What was Brian’s average purchase per store this week?”
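The split between store-level and hub-level questions can be illustrated with a small in-memory sketch. It is not DSE code: the row layout, the store names, and the helper functions `purchases` and `average_per_store` are hypothetical, and the hub is modeled simply as the union of the rows held by each store.

```python
from collections import defaultdict

# Hypothetical sales rows: (store, customer, product, amount).
# Each store holds only its own rows; the hub holds the rows
# replicated one-way from every store.
store_1 = [("store-1", "brian", "coffee", 4.0), ("store-1", "ann", "tea", 3.0)]
store_2 = [("store-2", "brian", "bagel", 2.0), ("store-2", "brian", "coffee", 4.0)]
hub = store_1 + store_2


def purchases(rows, customer):
    """Products bought by one customer within the given rows."""
    return [product for _, c, product, _ in rows if c == customer]


def average_per_store(rows, customer):
    """Average purchase amount per store for one customer."""
    amounts = defaultdict(list)
    for store, c, _, amount in rows:
        if c == customer:
            amounts[store].append(amount)
    return {store: sum(a) / len(a) for store, a in amounts.items()}


local = purchases(store_1, "brian")         # edge query: this store only
global_ = purchases(hub, "brian")           # hub query: all stores
averages = average_per_store(hub, "brian")  # hub analytics over all data
```

The same query function runs against either data set; only the hub can answer the cross-store questions because only the hub sees every store's rows.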
Key Verticals
• Oil and Gas
• Mobile deploy
• Transport
• Telecom
• Retail
• Banking, Finance
• Industrial IoT
Advanced Replication Key Terminology
• Edge – DSE Cluster that is the source of change events
• Hub – DSE Cluster that receives change events
• Replication Log – A table on the edge cluster that stores changes
• Channel – Defined replication configuration between an edge and hub table
• Collection Agent – Captures change events to the replication log table
• Replication Agent – Reads replication log and writes to the hub
Architecture – Edge View
Diagram: Client → Edge (Table, Collection Agent, Replication Log, Replication Agent) → Hub Cluster (Table)
• Client to edge Table: Normal CQL Operation
• Collection Agent: CQL Trigger captures mutation
• Replication Log: Maintained in C* table for Fault Tolerance
• Replication Agent: Pulls from Replication Log in priority/time order
• Replication Agent to Hub Cluster: Replicates to Hub via normal CQL driver
• High Priority mutations opportunistically sent to Hub asynchronously
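The edge-side flow can be sketched as a toy simulation in plain Python. This is a minimal model under stated assumptions, not the DSE implementation: `ReplicationLog`, `Hub`, `collect`, and `replicate` are hypothetical names, a lower number is assumed to mean higher priority, and the opportunistic asynchronous send of high-priority mutations is not modeled. It shows only the "store and forward" behavior: writes are logged locally and drained in priority/time order once the hub is reachable.

```python
import heapq
import itertools


class ReplicationLog:
    """Toy stand-in for the edge log table, ordered by (priority, arrival)."""

    def __init__(self):
        self._heap = []
        self._seq = itertools.count()  # arrival order breaks priority ties

    def append(self, priority, mutation):
        heapq.heappush(self._heap, (priority, next(self._seq), mutation))

    def peek(self):
        return self._heap[0][2]

    def pop(self):
        return heapq.heappop(self._heap)[2]

    def __len__(self):
        return len(self._heap)


class Hub:
    """Toy hub cluster that can be switched offline."""

    def __init__(self):
        self.online = True
        self.table = []

    def write(self, mutation):
        if not self.online:
            raise ConnectionError("hub unreachable")
        self.table.append(mutation)


def collect(edge_table, log, priority, mutation):
    """Collection step: apply the write locally, then record it in the log."""
    edge_table.append(mutation)
    log.append(priority, mutation)


def replicate(log, hub):
    """Replication step: drain the log in order; stop at the first failure."""
    sent = 0
    while len(log):
        try:
            hub.write(log.peek())
        except ConnectionError:
            break  # entry stays in the log and is retried later
        log.pop()
        sent += 1
    return sent


edge_table, log, hub = [], ReplicationLog(), Hub()
hub.online = False
collect(edge_table, log, 2, "sale-a")
collect(edge_table, log, 1, "sale-b")  # assumed higher priority
collect(edge_table, log, 2, "sale-c")
first_attempt = replicate(log, hub)    # hub offline: nothing is sent
hub.online = True
second_attempt = replicate(log, hub)   # hub back: the log drains
```

An entry is removed from the log only after the hub write succeeds, so a connectivity loss in the middle of a drain leaves the remaining mutations queued for the next attempt.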
Points of Nuance
• Does it handle TTLs?
  • The edge cluster will NOT capture the TTL of the base record
  • The hub table can have a default TTL that is different from the edge table's
• Can I repair from edge to hub?
  • Because these are separate clusters there is no repair mechanism
  • Replication mechanism ensures writes make it to hub eventually
• This looks like Hints!
  • More robust than Hinted Handoff
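The TTL point can be made concrete with a small model. The sketch below is illustrative only: `Row`, `capture`, and `hub_apply` are hypothetical names, and the change event is assumed to carry just the key and value. It shows that a TTL set on the edge write does not travel with the change event, so the replicated row takes whatever default TTL the hub table defines.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Row:
    key: str
    value: str
    ttl_seconds: Optional[int]  # None means the row never expires


def capture(row):
    """Change event as modeled here: data only, the TTL is not captured."""
    return {"key": row.key, "value": row.value}


def hub_apply(event, hub_default_ttl):
    """The hub writes the event using its own table-level default TTL."""
    return Row(event["key"], event["value"], hub_default_ttl)


edge_row = Row("sale-1", "coffee", ttl_seconds=3600)            # expires in 1 hour at the edge
hub_row = hub_apply(capture(edge_row), hub_default_ttl=86400)   # kept for 1 day at the hub
```

In this model the edge and hub copies of the same record expire at different times, which is the behavior the first bullet warns about.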
Topology
Diagram: West, East, and Central, with Store #1 through Store #11