DAIS'13, June 3-5, 2013, Florence, Italy
Strengthening Consistency in the Cassandra Distributed Key-Value Store
P. Garefalakis, P. Papadopoulos, I. Manousakis and K. Magoutis
Institute of Computer Science (ICS)
Foundation for Research and Technology – Hellas (FORTH)
Heraklion, Greece
Motivation
• This is the age of big data
• Platforms for analyzing big data are important
• These platforms rely on distributed key-value stores
• Companies such as Amazon and Google and open-source communities such as Apache have proposed several key-value stores
– Availability and fault tolerance through data replication
• We focused on Apache Cassandra, a technology choice of large data-driven organizations
Cassandra
Cassandra’s Architecture
2/3 Responses: {X,Y}
Need for reconciliation!
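The situation on this slide, two of three replicas answering a read with different values X and Y, can be illustrated with a minimal sketch of timestamp-based (last-write-wins) reconciliation, which is the general approach Cassandra takes to conflicting column values. The `Response` type, the `reconcile` helper, the replica names, and the timestamps are all hypothetical and chosen only for illustration; they are not taken from the slides or from Cassandra's code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Response:
    replica: str    # hypothetical replica name
    value: str      # value returned by that replica
    timestamp: int  # write timestamp attached to the value (assumed)


def reconcile(responses, quorum):
    """Pick the newest value and list the replicas that returned a stale one."""
    if len(responses) < quorum:
        raise RuntimeError("quorum not reached")
    newest = max(responses, key=lambda r: r.timestamp)
    stale = [r.replica for r in responses if r.value != newest.value]
    return newest.value, stale


# Two of three replicas answered, with diverging values X and Y.
responses = [Response("A", "X", 100), Response("B", "Y", 105)]
value, stale = reconcile(responses, quorum=2)
print(value, stale)
```

In this sketch the stale replicas would then need to be updated with the winning value, which is the extra reconciliation work that a strongly consistent replication layer aims to avoid.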
Goal
• Strengthen data consistency by replacing Cassandra’s replication mechanism with strongly consistent alternatives
– Oracle BDB HA key-value store
– Alternative: an in-house solution
– A benefit of our design is simplicity
• Improve availability by rapidly propagating membership changes to clients via a new membership protocol
– Replaces Cassandra’s gossip-based membership
System Architecture
System Components & interactions
Implementation & Preliminary Results
• Cluster of 6 Cassandra nodes (single-replica RGs)
• Flexiant VMs: 2 CPUs, 2 GB of memory, and 20 GB of remotely mounted disk space
• Yahoo! Cloud Serving Benchmark (YCSB)
– 4 threads, reading 1 GB of data
Implementation insight
• We decided to extend Cassandra’s default storage system
• Converting Cassandra’s complex schema into Oracle BDB’s simpler key-value schema is difficult
– Option 1: Cassandra row key maps to BDB key, value is a large BLOB
  • Drawback: large cost of updates
– Option 2: Cassandra (r, cf, cq, v) key maps to BDB key, value is a Cassandra cell
  • Drawback: row explosion
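The trade-off between the two mappings can be sketched with plain Python dictionaries standing in for the BDB store. Everything here is an illustrative assumption rather than the authors' implementation: the helper names `put_blob` and `put_cell`, the JSON serialization of the row BLOB, and the reading of (r, cf, cq, v) as row, column family, column qualifier, and version.

```python
import json


def put_blob(store, row, cf, cq, value):
    """Mapping 1: one record per Cassandra row; the value is the whole row as a BLOB."""
    cells = json.loads(store[row]) if row in store else {}
    cells[f"{cf}:{cq}"] = value
    store[row] = json.dumps(cells).encode()
    return len(store[row])  # bytes rewritten by this single-cell update


def put_cell(store, row, cf, cq, version, value):
    """Mapping 2: one record per cell, keyed by (r, cf, cq, v)."""
    key = (row, cf, cq, version)
    store[key] = value.encode()
    return len(store[key])  # bytes written by this single-cell update


blob_store, cell_store = {}, {}
for i in range(100):
    blob_bytes = put_blob(blob_store, "user1", "profile", f"c{i}", f"value-{i}")
    cell_bytes = put_cell(cell_store, "user1", "profile", f"c{i}", 1, f"value-{i}")

print(len(blob_store), blob_bytes)  # one record; the last update rewrote the whole row
print(len(cell_store), cell_bytes)  # one record per cell; the last update wrote one cell
```

With the BLOB mapping, every single-cell update reads and rewrites the entire row, so the update cost grows with row size. With the per-cell mapping, each update is small, but one logical Cassandra row becomes as many BDB records as it has cells.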
Future Work
• Evaluation of scalability and availability.
• Elasticity: stream a number of key ranges to a newly joining RG.
Conclusions
• We strengthened Cassandra’s data replication semantics.
– Leverages the large existing application base
• Our system is:
– applicable to a wide range of applications
– simpler, easier to reason about
• Currently working on more advanced functionality.