Client Drivers and Cassandra, the Right Way

Cassandra Drivers

instaclustr.com @Instaclustr


Who am I and what do I do?

• Ben Bromhead

• Co-founder and CTO of Instaclustr -> www.instaclustr.com

• Instaclustr provides Cassandra-as-a-Service in the cloud

• Currently in AWS, Azure and IBM Softlayer

• We currently manage 400+ nodes


What this talk will cover

• Driver basics

• Sync vs Async

• Driver connection policies and tuning

The driver

• The Cassandra driver contains the logic for connecting to Cassandra and running queries in a fast and efficient manner

• Focus on the Datastax Open Source drivers:

• Java

• .NET (C#)

• C/C++

• Python

• Node.js

• Ruby

• PHP

Cassandra Drivers

• All have a similar architecture that consists of:

• Session & pool management

• Chainable policies for managing failure and performance

• Sync vs Async queries

• Failover & Retry

• Tracing

A basic example in Java:

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.Session;

Cluster cluster = Cluster.builder().addContactPoints("52.89.183.67").withPort(9042).build();
Session session = cluster.newSession();
session.execute("SELECT * FROM foo…");

A basic example in Python:

from cassandra.cluster import Cluster

cluster = Cluster(contact_points=["52.89.183.67"], port=9042)
session = cluster.connect()
rows = session.execute("SELECT name, age, email FROM users")

A basic example in Ruby:

require 'cassandra'

cluster = Cassandra.cluster(:hosts => ["52.89.183.67", "52.89.99.88", "54.69.217.141"], :datacenter => 'AWS_VPC_US_WEST_2')
session = cluster.connect()
rows = session.execute("SELECT name, age, email FROM users")

• Architecture makes the driver similar across languages

• What happens under the hood?

• Cluster object creates configuration (auth, load balancing, contact points).

• Session object holds the thread pool and manages connections.

• Session object authenticates and maintains connections.

• Session can be shared and is threadsafe!

Different ways of querying

• Synchronous:

session.execute("SELECT * FROM foo..");

• Asynchronous:

ResultSetFuture result = session.executeAsync("SELECT * FROM foo..");

result.get();
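To get a feel for why the async path usually wins on throughput, here is a minimal Python sketch. It does not use the driver: fake_query is a hypothetical stand-in for one round trip, and a thread pool stands in for the driver's non-blocking I/O, so the names, the 10 ms delay and the in-flight limit are all assumptions.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_query(i, delay=0.01):
    """Hypothetical stand-in for one network round trip to Cassandra."""
    time.sleep(delay)
    return i


def run_sync(n):
    # Each call blocks until its "response" arrives, like session.execute().
    return [fake_query(i) for i in range(n)]


def run_async(n, in_flight=8):
    # Submit every request first, then collect the results, like calling
    # executeAsync() per query and get() on each returned future.
    with ThreadPoolExecutor(max_workers=in_flight) as pool:
        futures = [pool.submit(fake_query, i) for i in range(n)]
        return [f.result() for f in futures]


def timed(fn, n):
    # Returns (result, elapsed seconds) so the two styles can be compared.
    start = time.perf_counter()
    result = fn(n)
    return result, time.perf_counter() - start
```

Comparing timed(run_sync, 16) with timed(run_async, 16) should show the async version finishing in a fraction of the time, because up to 8 simulated requests overlap instead of queueing behind each other.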

How do these perform?

[Chart: Operations (Op/s, axis 0 to 30,000) for Read Sync, Write Sync, Read Async and Write Async]

How do these perform?

[Chart: Latency (ms, axis 0 to 80)]

Different ways of querying

Prepared Statements:

PreparedStatement statement = getSession().prepare(
    "INSERT INTO simplex.songs " +
    "(id, title, album, artist, tags) " +
    "VALUES (?, ?, ?, ?, ?);");

BoundStatement boundStatement = new BoundStatement(statement);
getSession().execute(boundStatement.bind(
    UUID.fromString("2cc9ccb7-6221-4ccb-8387-f22b6a1b354d"),
    UUID.fromString("756716f7-2e54-4715-9f00-91dcbea6cf50"),
    "La Petite Tonkinoise",
    "Bye Bye Blackbird",
    "Joséphine Baker"));

Drivers and consistency

• Within the different ways of querying Cassandra you can also adjust the consistency level per query.

• Let's have a quick consistency refresher

A brief intro to tuneable consistency

• Cassandra is considered to be a database that favours Availability and Partition Tolerance.

• It lets you change those characteristics per query to suit your application's requirements

Two consistency levers

• Consistency level - How many acknowledgements/responses from replicas before a query is considered a success.

• Replication Factor (RF) - How many copies of a record do I store.

Two consistency levers

• Consistency level - Chosen by the client at query time

• Replication Factor (RF) - Determined by the client at schema definition

Consistency Levels

• ALL - Every replica

• *QUORUM - (EACH_QUORUM, QUORUM, LOCAL_QUORUM)

• Numbered - (ONE, TWO, THREE, LOCAL_ONE)

• *SERIAL - (SERIAL, LOCAL_SERIAL)

• ANY
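As a rough guide to what these levels ask for, the sketch below maps a level name to the number of replica responses needed in a single-datacenter cluster. The function names are made up for illustration, and the datacenter-aware and serial levels (EACH_QUORUM, LOCAL_*, SERIAL) are deliberately left out of this simplified model.

```python
def quorum(replicas):
    """Majority of the given replica count: floor(replicas / 2) + 1."""
    return replicas // 2 + 1


def required_acks(level, rf):
    """Replica responses needed for `level` with replication factor `rf`.

    Simplified single-datacenter model, not the driver's or server's code.
    """
    # ANY needs no replica to respond: a hint stored for later is enough.
    fixed = {"ANY": 0, "ONE": 1, "TWO": 2, "THREE": 3}
    if level in fixed:
        return fixed[level]
    if level == "QUORUM":
        return quorum(rf)
    if level == "ALL":
        return rf
    raise ValueError(f"level not modelled in this sketch: {level}")
```

With RF 3, for example, required_acks("QUORUM", 3) asks for two responses and required_acks("ALL", 3) asks for all three.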

What does it all mean

• At the client level (your application) you have total control

• Define implicit and explicit failure handling

• Isolate queries to a single geography

• Trade consistency for latency (a decision is better than no decision)
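One way to reason about the consistency/latency trade is the overlap rule: a read is guaranteed to touch at least one replica that acknowledged the latest write when write acknowledgements plus read responses exceed the replication factor. The helper below is a small sketch of that arithmetic; its name is hypothetical.

```python
def reads_see_latest_write(write_acks, read_acks, rf):
    """True when every set of read replicas must overlap every set of
    replicas that acknowledged the write (write_acks + read_acks > rf)."""
    return write_acks + read_acks > rf


# RF 3 examples: QUORUM is 2 responses, ONE is 1, ALL is 3.
quorum_both = reads_see_latest_write(2, 2, 3)      # QUORUM write, QUORUM read
one_both = reads_see_latest_write(1, 1, 3)         # ONE write, ONE read
one_write_all_read = reads_see_latest_write(1, 3, 3)
```

QUORUM on both sides overlaps, ONE on both sides does not (lower latency, but a read may miss the latest write), and a cheap write at ONE can be paid for with an expensive read at ALL.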

How does it all work?

[Diagram: four-node ring (nodes 1 to 4) handling a write with CL:QUORUM, RF:3 for partition_key: a]

How does CL impact Op/s?

[Chart: Operations (axis 0 to 30,000) at CL ONE, QUORUM and ALL]

How does CL impact latency?

[Chart: Latency (axis 0 to 120) at CL ONE, QUORUM and ALL]

What happens when something goes wrong?

[Diagram: four-node ring (nodes 1 to 4) handling a write with CL:QUORUM, RF:3 for partition_key: b; two replicas acknowledge (✓✓)]

Required responses: floor(3 * 0.5) + 1 = 2


Success!
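The arithmetic on this slide can be sketched as follows. This is a simplified model with made-up function names: it only counts which replicas of the partition are up, and ignores timeouts and everything else a real coordinator does.

```python
def quorum(rf):
    # floor(rf * 0.5) + 1, the same formula as on the slide
    return rf // 2 + 1


def quorum_write(replica_up):
    """replica_up holds one boolean per replica of the partition.

    Returns (succeeded, acks, required) for a write at CL QUORUM.
    """
    required = quorum(len(replica_up))
    acks = sum(1 for up in replica_up if up)
    return acks >= required, acks, required


# RF 3 with one replica down: two acknowledgements still meet the quorum of 2.
one_down = quorum_write([True, False, True])
# RF 3 with two replicas down: a single acknowledgement is not enough.
two_down = quorum_write([True, False, False])
```

The first case is the one shown above: the write succeeds, and the replica that missed it is left to be caught up later.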

How does an outage impact Op/s?

[Chart: Operations (axis 0 to 30,000) at CL ONE, QUORUM and ALL]

How does an outage impact latency?

[Chart: Latency (axis 0 to 100) at CL ONE, QUORUM and ALL]

We now have a replica that is not consistent

• Anti-entropy repair (the only guaranteed way to make things consistent)

• Hinted handoff (let's cover this quickly)

• Read repair

What is hinted handoff ?

• A performance optimisation for “catching up” nodes that missed writes.

What isn’t hinted handoff ?

• A consistent distribution mechanism

[Diagram: four-node ring (nodes 1 to 4) receiving a write with CL:QUORUM, RF:3 for partition_key: b]

How does hinted handoff work?

[Diagram: four-node ring (nodes 1 to 4), partition_key: b, with a table of writes per host and key]

host / key: A, B
1: ✔ ✔
2: ?
3: ✔ ✔
…

Gossip: 2 is now UP

Node 1: I have stored hints for 2

Some things to keep in mind

• Cassandra will only store hints for a certain period of time, set by max_hint_window_in_ms (3 hours by default)

• Hints are not a reliable delivery mechanism

• Hint replay will cause counters to overcount

• A CL of ANY will cause a hint to be stored even if no replicas are available. Sometimes called extreme availability… also called “who knows where, and whether, your data is safe?”
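The hint window can be pictured with a tiny model. The function name and the inclusive boundary are assumptions for illustration; only the setting name and the 3 hour default come from the slide.

```python
# 3 hours in milliseconds, the default for max_hint_window_in_ms
MAX_HINT_WINDOW_IN_MS = 3 * 60 * 60 * 1000


def hint_is_stored(node_down_for_ms, window_ms=MAX_HINT_WINDOW_IN_MS):
    """Simplified model: hints for a down node are only written while the
    node has been down for no longer than the hint window."""
    return node_down_for_ms <= window_ms


short_outage = hint_is_stored(10 * 60 * 1000)      # node down for 10 minutes
long_outage = hint_is_stored(4 * 60 * 60 * 1000)   # node down for 4 hours
```

In this model the 10 minute outage is covered by hints, while writes missed during the 4 hour outage are not, which is why repair remains the only guaranteed way back to consistency.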

Hinted handoff performance

• Causes the same volume of writes to occur in a cluster with reduced capacity (local write amplification on the co-ordinator node)

• Hints are written to system.hints; each replica has its hints stored in a single partition.

• Hints use TTLs and tombstones… the hint table is actually a queue!

• When Cassandra starts compacting or throwing tombstone warnings on the system.hints table… things are bad

Hinted handoff performance

• Rewritten in Cassandra 3.0 (in beta now)

• Takes a commitlog approach:

• No compaction

• No TTL

• No tombstones

• No memtables

How does this relate to the driver?

• With a node outage the “latency” on the down node becomes hours/days until it becomes consistent

• Cassandra itself takes over the client portion of ensuring the write makes it to the node that was down.

• You can control whether C* handles this (via repair, hinted handoff, etc.) or whether your application controls this (have your client receive an exception instead).

Driver policies

• Cassandra driver policies allow you to control failure

• Cassandra driver policies allow you to control how the driver routes requests

• The driver is your load balancer

• This can reduce your latency and/or increase op/s (in some cases)

Retry Policy

• Default Retry Policy

• Downgrading Consistency Retry Policy

• Fall through Retry Policy

• Logging Retry Policy
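To make the difference between these policies concrete, here is a simplified model of the decision taken when not enough replicas are available. It is a sketch, not the driver's implementation: the function names and the ("retry", level) / ("rethrow", None) return shape are invented for illustration.

```python
def fallthrough_policy(alive_replicas):
    # Never retry: hand the error straight back to the application.
    return ("rethrow", None)


def downgrading_policy(alive_replicas):
    # Retry at the highest numbered level the live replicas could satisfy,
    # trading consistency for availability.
    for needed, level in ((3, "THREE"), (2, "TWO"), (1, "ONE")):
        if alive_replicas >= needed:
            return ("retry", level)
    # Nothing is alive, so there is no level left to downgrade to.
    return ("rethrow", None)
```

With two live replicas the downgrading model retries at TWO, while the fall-through model always leaves the decision to your application code.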

Load Balancing Policy

• Round Robin

• DC Aware

• TokenAware

• LatencyAware
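The sketch below contrasts round-robin and token-aware routing in a toy form. It assumes MD5 as a stand-in hash and a naive "next hosts on the list" replica placement, neither of which is what Cassandra actually uses; the class and method names are hypothetical.

```python
import hashlib
from itertools import cycle


class RoundRobin:
    """Hands out hosts in turn, ignoring which host owns the data."""

    def __init__(self, hosts):
        self._hosts = cycle(hosts)

    def next_host(self, partition_key=None):
        return next(self._hosts)


class TokenAware:
    """Routes each request to a host that holds a replica of the key."""

    def __init__(self, hosts, rf):
        self.hosts = list(hosts)
        self.rf = rf

    def replicas(self, partition_key):
        # Hash the key to a position, then take the next rf hosts in order.
        digest = hashlib.md5(partition_key.encode("utf-8")).digest()
        start = int.from_bytes(digest[:8], "big") % len(self.hosts)
        return [self.hosts[(start + i) % len(self.hosts)] for i in range(self.rf)]

    def next_host(self, partition_key):
        # The coordinator is itself a replica, which saves a network hop.
        return self.replicas(partition_key)[0]
```

Round robin spreads load evenly but often picks a coordinator that has to forward the request; the token-aware model always lands on a replica for the key, which is where the latency saving comes from.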

Driver policies impact latency?

[Chart: Latency (axis 0 to 1.2) for Read Sync and Write Sync under Round Robin, Token Aware and Latency Aware]

Last but not least

• Use one Cluster instance per (physical) cluster (per application lifetime)

• Use at most one Session per keyspace, or use a single Session and explicitly specify the keyspace in your queries

• If you execute a statement more than once, consider using a PreparedStatement

• You can reduce the number of network roundtrips and also have atomic operations by using Batches
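One way to follow the "prepare once, execute many" advice is a small cache keyed by the query string. PreparedCache below is a hypothetical helper, not part of any driver; it wraps whatever prepare callable you give it (with the Python driver that would presumably be the shared session's prepare method) and is not thread-safe as written.

```python
class PreparedCache:
    """Memoises prepared statements so each query string is prepared once."""

    def __init__(self, prepare):
        self._prepare = prepare   # callable taking a CQL string
        self._cache = {}

    def get(self, cql):
        # Prepare only on first use; later calls reuse the stored statement.
        if cql not in self._cache:
            self._cache[cql] = self._prepare(cql)
        return self._cache[cql]


# Usage with a fake prepare function that records how often it is called.
calls = []


def fake_prepare(cql):
    calls.append(cql)
    return ("prepared", cql)


cache = PreparedCache(fake_prepare)
first = cache.get("SELECT name, age, email FROM users WHERE name = ?")
second = cache.get("SELECT name, age, email FROM users WHERE name = ?")
```

Both lookups return the same object and fake_prepare runs only once, which is the saving a long-lived Session with prepared statements is meant to give you.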

Thank you! Questions?
