datastax-academy
Who am I and what do I do?
• Ben Bromhead
• Co-founder and CTO of Instaclustr -> www.instaclustr.com
• Instaclustr provides Cassandra-as-a-Service in the cloud
• Currently in AWS, Azure and IBM SoftLayer
• We currently manage 400+ nodes
The driver
• The Cassandra driver contains the logic for connecting to Cassandra and running queries in a fast and efficient manner
• Focus on the DataStax open source drivers:
Cassandra Drivers
• All have a similar architecture that consists of:
• Session & pool management
• Chainable policies for managing failure and performance
• Sync vs Async queries
• Failover & Retry
• Tracing
Cassandra Drivers
A basic example in Java:
Cluster cluster = Cluster.builder().addContactPoints("52.89.183.67").withPort(9042).build();
Session session = cluster.newSession();
session.execute("SELECT * FROM foo…");
Cassandra Drivers
A basic example in Python:
cluster = Cluster(contact_points=["52.89.183.67"], port=9042)
session = cluster.connect()
rows = session.execute("SELECT name, age, email FROM users")
Cassandra Drivers
A basic example in Ruby:
cluster = Cassandra.cluster(:hosts => ["52.89.183.67", "52.89.99.88", "54.69.217.141"], :datacenter => 'AWS_VPC_US_WEST_2')
session = cluster.connect()
rows = session.execute("SELECT name, age, email FROM users")
Cassandra Drivers
• Architecture makes the driver similar across languages
• What happens under the hood?
• Cluster object creates configuration (auth, load balancing, contact points).
• Session object holds the thread pool and manages connections.
• Session object authenticates and maintains connections.
• Session can be shared and is thread-safe!
Different ways of querying
• Synchronous:
session.execute("SELECT * FROM foo..");
• Asynchronous:
ResultSetFuture result = session.executeAsync("SELECT * FROM foo..");
result.get();
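The difference between the two styles can be sketched without a real cluster. The example below is a minimal illustration only: `FakeSession` is a hypothetical stand-in for a driver session, its 20 ms sleep is an assumed round-trip time, and the query strings are placeholders. It shows why issuing several requests before waiting on any of them tends to take less wall-clock time than issuing them one at a time.

```python
import time
from concurrent.futures import ThreadPoolExecutor


class FakeSession:
    """Hypothetical stand-in for a driver session (not a real driver class).

    Each query sleeps for a fixed time to mimic a network round trip.
    """

    def __init__(self, latency_s=0.02, workers=8):
        self._latency_s = latency_s
        self._pool = ThreadPoolExecutor(max_workers=workers)

    def execute(self, query):
        # Synchronous: the caller blocks until the "response" arrives.
        time.sleep(self._latency_s)
        return [query]

    def execute_async(self, query):
        # Asynchronous: return a future at once; the result is collected later.
        return self._pool.submit(self.execute, query)


def run_sync(session, queries):
    # One round trip at a time.
    return [session.execute(q) for q in queries]


def run_async(session, queries):
    futures = [session.execute_async(q) for q in queries]  # put all requests in flight
    return [f.result() for f in futures]                   # then wait for each result


session = FakeSession()
queries = ["placeholder-query-%d" % i for i in range(8)]

start = time.perf_counter()
sync_rows = run_sync(session, queries)
sync_elapsed = time.perf_counter() - start

start = time.perf_counter()
async_rows = run_async(session, queries)
async_elapsed = time.perf_counter() - start

print("sync: %.3fs, async: %.3fs" % (sync_elapsed, async_elapsed))
```

Both runs return the same rows in the same order; only the waiting pattern differs. A real driver multiplexes requests over its connections rather than using one thread per query, so the numbers here say nothing about actual driver throughput.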
How do these perform?
[Chart: operations per second (Op/s, axis 0 to 30,000) for Read Sync, Write Sync, Read Async and Write Async]
Different ways of querying
Prepared Statements:
PreparedStatement statement = getSession().prepare(
    "INSERT INTO simplex.songs " +
    "(id, title, album, artist, tags) " +
    "VALUES (?, ?, ?, ?, ?);");
Different ways of querying
BoundStatement boundStatement = new BoundStatement(statement);
getSession().execute(boundStatement.bind(
    UUID.fromString("2cc9ccb7-6221-4ccb-8387-f22b6a1b354d"),
    UUID.fromString("756716f7-2e54-4715-9f00-91dcbea6cf50"),
    "La Petite Tonkinoise",
    "Bye Bye Blackbird",
    "Joséphine Baker"));
Drivers and consistency
• Within the different ways of querying Cassandra you can also adjust the consistency level per query.
• Let's have a quick consistency refresher
A brief intro to tuneable consistency
• Cassandra is considered to be a db that favours Availability and Partition Tolerance.
• Lets you change those characteristics per query to suit your application requirements
Two consistency levers
• Consistency level - How many acknowledgements/responses from replicas before a query is considered a success.
• Replication Factor (RF) - How many copies of a record do I store.
Two consistency levers
• Consistency level - Chosen by the client at query time
• Replication Factor (RF) - Determined by the client at schema definition
Consistency Levels
• ALL - Every replica
• *QUORUM - (EACH_QUORUM, QUORUM, LOCAL_QUORUM)
• Numbered - (ONE, TWO, THREE, LOCAL_ONE)
• *SERIAL - (SERIAL, LOCAL_SERIAL)
• ANY
What does it all mean
• At the client level (your application) you have total control
• Define implicit and explicit failure handling
• Isolate queries to a single geography
• Trade consistency for latency (a decision is better than no decision)
How does CL impact Op/s?
[Chart: operations per second (axis 0 to 30,000) for Read Sync, Write Sync, Read Async and Write Async at consistency levels ONE, QUORUM and ALL]
How does CL impact latency?
[Chart: latency (axis 0 to 120) for Read Sync, Write Sync, Read Async and Write Async at consistency levels ONE, QUORUM and ALL]
How does it all work?
[Diagram: ring of four nodes (1 to 4); a write for partition_key: b with CL: QUORUM and RF: 3; two replicas acknowledge (✓✓)]
Required responses: floor(3 * 0.5) + 1 = 2
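The arithmetic above generalises to any replication factor. The following is a small sketch, assuming a single datacenter and covering only ONE, TWO, THREE, QUORUM and ALL; the function names are made up for illustration and are not driver APIs.

```python
def quorum(replication_factor):
    """Replicas that must respond for QUORUM: floor(RF / 2) + 1."""
    return replication_factor // 2 + 1


def required_responses(consistency_level, replication_factor):
    """Responses needed for a few single-datacenter consistency levels."""
    fixed = {"ONE": 1, "TWO": 2, "THREE": 3}
    if consistency_level in fixed:
        return fixed[consistency_level]
    if consistency_level == "QUORUM":
        return quorum(replication_factor)
    if consistency_level == "ALL":
        return replication_factor
    raise ValueError("level not covered by this sketch: " + consistency_level)


def can_succeed(consistency_level, replication_factor, live_replicas):
    """True if enough replicas are up to meet the consistency level."""
    return live_replicas >= required_responses(consistency_level, replication_factor)


for rf in (1, 3, 5):
    print("RF=%d needs %d responses for QUORUM" % (rf, quorum(rf)))
```

With RF 3 and one replica down, `can_succeed` reports that ONE and QUORUM can still be met while ALL cannot.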
How does an outage impact Op/s?
[Chart: operations per second (axis 0 to 30,000) for Read Sync, Write Sync, Read Async and Write Async at consistency levels ONE, QUORUM and ALL]
How does an outage impact latency?
[Chart: latency (axis 0 to 100) for Read Sync, Write Sync, Read Async and Write Async at consistency levels ONE, QUORUM and ALL]
We now have a replica that is not consistent
• Anti-entropy repair (only guaranteed way to make things consistent)
• Hinted handoff
• Read repair
We now have a replica that is not consistent
• Anti-entropy repair (only guaranteed way to make things consistent)
• Hinted handoff - let's cover this quickly
• Read repair
How does hinted handoff work?
[Diagram: ring of four nodes (1 to 4), partition_key: b]
• Gossip: 2 is now UP
• Node 1: I have stored hints for 2
Some things to keep in mind
• Cassandra will only store hints for a certain period of time, set by max_hint_window_in_ms (3 hours by default)
• Hints are not a reliable delivery mechanism
• Hint replay will cause counters to overcount
• CL of ANY will cause a hint to be stored even if no replicas are available. Sometimes called extreme availability… also called who knows where and if your data is safe?
Hinted handoff performance
• Causes the same volume of writes to occur in a cluster with reduced capacity (local write amplification on the coordinator node)
• Hints are written to system.hints; the hints for each replica are stored in a single partition.
• Hints use TTLs and tombstones… the hint table is actually a queue!
• When Cassandra starts compacting or throwing tombstone warnings on the system.hints table… things are bad
Hinted handoff performance
• Rewritten in Cassandra 3.0 (in beta now)
• Takes a commitlog approach:
• No compaction
• No TTL
• No tombstones
• No memtables
How does this relate to the driver?
• With a node outage the “latency” on the down node becomes hours/days until it becomes consistent
• Cassandra itself takes over the client portion of ensuring the write makes it to the node that was down.
• You can control whether C* handles this (via repair, HH etc) or whether your application controls this (have your client receive an exception instead).
Driver policies
• Cassandra driver policies allow you to control failure
• Cassandra driver policies allow you to control how the driver routes requests
• The driver is your load balancer
• This can reduce your latency and/or increase op/s (in some cases)
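To see why routing in the driver can save a network hop, here is a toy comparison of round-robin and token-aware routing. Everything in it is an assumption made for illustration: the six node names, the replication factor of 3, and the MD5-based placement, which is far simpler than Cassandra's real partitioner and token ranges. None of the functions are driver APIs.

```python
import hashlib
from itertools import cycle

NODES = ["node1", "node2", "node3", "node4", "node5", "node6"]  # hypothetical ring
RF = 3  # assumed replication factor


def replicas_for(partition_key):
    """Toy placement: hash the key to a start node, then take the next RF nodes."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    start = int.from_bytes(digest[:8], "big") % len(NODES)
    return [NODES[(start + i) % len(NODES)] for i in range(RF)]


def round_robin_router():
    """Pick coordinators in turn, ignoring which nodes own the data."""
    ring = cycle(NODES)
    return lambda partition_key: next(ring)


def token_aware_router():
    """Pick a node that holds a replica of the key as coordinator."""
    return lambda partition_key: replicas_for(partition_key)[0]


def replica_coordinator_count(router, keys):
    """How many requests land on a node that already owns the data."""
    return sum(1 for key in keys if router(key) in replicas_for(key))


keys = ["user-%d" % i for i in range(100)]
print("round robin:", replica_coordinator_count(round_robin_router(), keys))
print("token aware:", replica_coordinator_count(token_aware_router(), keys))
```

In this toy model the token-aware router always lands on a replica, while round robin does so only when the rotation happens to line up with the key's replicas. Requests that miss must be forwarded by the coordinator to a replica, which is the extra hop a token-aware policy is meant to avoid.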
Retry Policy
• Default Retry Policy
• Downgrading Consistency Retry Policy
• Fallthrough Retry Policy
• Logging Retry Policy
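The contrast between falling through and downgrading can be sketched as follows. This is a simplified illustration of the idea, not the driver's implementation: `Unavailable`, `fake_write`, `fallthrough` and `downgrading` are hypothetical names, the cluster is simulated with a live-replica count, and RF 3 is assumed.

```python
class Unavailable(Exception):
    """Hypothetical error: too few replicas are alive for the requested level."""

    def __init__(self, required, alive):
        super().__init__("need %d replicas, %d alive" % (required, alive))
        self.required = required
        self.alive = alive


def fake_write(consistency_level, alive, rf=3):
    """Simulated write: succeeds only if enough replicas are alive."""
    required = {"ONE": 1, "TWO": 2, "THREE": 3,
                "QUORUM": rf // 2 + 1, "ALL": rf}[consistency_level]
    if alive < required:
        raise Unavailable(required, alive)
    return consistency_level  # the level the write was accepted at


def fallthrough(consistency_level, alive):
    # Never retry: the error reaches the application, which decides what to do.
    return fake_write(consistency_level, alive)


def downgrading(consistency_level, alive):
    # Retry once at the highest numbered level the live replicas can satisfy.
    try:
        return fake_write(consistency_level, alive)
    except Unavailable as err:
        for level, needed in (("THREE", 3), ("TWO", 2), ("ONE", 1)):
            if err.alive >= needed:
                return fake_write(level, alive)
        raise  # no replica alive: nothing to downgrade to


print(downgrading("ALL", alive=2))
```

With two of three replicas alive, a write at ALL is accepted at TWO under the downgrading sketch, while the fall-through version raises and leaves the decision to the application. Downgrading trades consistency for availability silently, so it only suits data where that trade is acceptable.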
Driver policies impact latency?
[Chart: latency (axis 0 to 1.2) for Read Sync and Write Sync under the Round Robin, Token Aware and Latency Aware policies]
Last but not least
• Use one Cluster instance per (physical) cluster (per application lifetime)
• Use at most one Session per keyspace, or use a single Session and explicitly specify the keyspace in your queries
• If you execute a statement more than once, consider using a PreparedStatement
• You can reduce the number of network roundtrips and also have atomic operations by using Batches