Thoughts on consistency models


Some thoughts on the CAP theorem, tradeoffs, etc., as presented to the Melbourne MongoDB User Group.


CAP

CAP theorem

• Conjectured by Eric Brewer (ex-Inktomi)

• Proved by Gilbert and Lynch

CAP theorem

It is impossible in the asynchronous network model to implement a read/write data object that guarantees the following properties:

- Availability
- Atomic consistency in all fair executions (including those in which messages are lost)

Or: If the network is broken, your database won’t work

AP vs CP

• Real choices are

• Available + Partition-tolerant (AP)

• Consistent + Partition-tolerant (CP)

AP

• Multiple nodes participate in writes

• System will be eventually consistent

• The storage system guarantees that, if there are no new updates, all reads will eventually return the last updated value

Examples:
- DNS
- Async replication
- MongoDB with Slave-OK
- Memcache

eventual consistency

[Diagram: a master replicating to two slaves; two clients read from the slaves]

Assuming updates 1, 2, 3, 4, 5

Client will expect 1, 2, 2, 2, 3, 4, 5, 5, 5

eventual consistency

[Diagram: a master replicating to two slaves; two clients read from the slaves]

However, we could get this: 1, 2, 2, 4, 2, 5

eventual consistency

• Monotonic read consistency

• Pin client to certain slave / app server

• Failover still breaks it: a client moved to another slave can read older data
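The effect of pinning can be sketched with a small simulation. The two replication timelines below are made up for illustration (they are chosen so that alternating between slaves reproduces the 1, 2, 2, 4, 2, 5 sequence from the previous slide), and the names `slave_a`, `slave_b`, and `read_sequence` are hypothetical, not part of any MongoDB API.

```python
def is_monotonic(reads):
    """True if no read returns an older value than an earlier read."""
    return all(earlier <= later for earlier, later in zip(reads, reads[1:]))

# Assumed value visible on each slave at six consecutive read steps.
# slave_a lags behind slave_b for most of the timeline.
slave_a = [1, 2, 2, 2, 2, 5]
slave_b = [1, 2, 4, 4, 5, 5]
timelines = {"a": slave_a, "b": slave_b}

def read_sequence(choose_slave):
    """Values a client sees when choose_slave(step) picks the slave to read from."""
    return [timelines[choose_slave(step)][step] for step in range(len(slave_a))]

# Unpinned client: alternates between the two slaves on every read.
unpinned = read_sequence(lambda step: "ab"[step % 2])

# Pinned client: always reads from the same slave.
pinned = read_sequence(lambda step: "a")

print("unpinned:", unpinned, "monotonic:", is_monotonic(unpinned))
print("pinned:  ", pinned, "monotonic:", is_monotonic(pinned))
```

The pinned client may still read stale data, but it never sees a value go backwards as long as it stays on the same slave.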

multi master

Dynamo model

R - number of servers to read from
W - number of servers that must acknowledge a write
N - replication factor

R + W > N has nice properties

multi master

Example 1

R + W <= N

R = 1, W = 1, N = 5

Possibly stale data
Higher availability

Example 2

R + W > N

R = 2, W = 1, N = 2
R = 1, W = 2, N = 2

‘Consistent’ Data
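One way to see why R + W > N gives 'consistent' reads is that every read set must then share at least one server with every write set. The brute-force check below is only a sketch of that argument; `quorums_always_overlap` is a hypothetical helper name, and it models servers as plain integers.

```python
from itertools import combinations

def quorums_always_overlap(r, w, n):
    """Check by enumeration that every R-sized read set shares
    at least one server with every W-sized write set."""
    servers = range(n)
    return all(
        set(read_set) & set(write_set)
        for read_set in combinations(servers, r)
        for write_set in combinations(servers, w)
    )

# The configurations from Example 1 and Example 2.
for r, w, n in [(1, 1, 5), (2, 1, 2), (1, 2, 2)]:
    overlap = quorums_always_overlap(r, w, n)
    print(f"R={r} W={w} N={n}: R+W>N is {r + w > n}, always overlap is {overlap}")
```

When the sets always overlap, at least one server contacted by a read has seen the latest acknowledged write. When they do not, a read can land entirely on servers that missed it.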

R + W > N

If R + W > N, you can’t have both fast local reads and fast local writes

network partitions

trivial network partition

network write possibilities

• deny all writes
  - read fully consistent data

• allow writes on one side
  - allow reads on other side (stale)

• allow writes on both sides
  - give up consistency

multiple writer strategies

• Last one wins

• vector clocks

• Insert

• insert often means:

• if (!exist(x)) set(x)

• exist is hard to implement in eventually consistent systems
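Of the strategies listed above, vector clocks are the least self-explanatory, so here is a minimal sketch. It assumes a clock is a plain dict from node name to counter; the function names (`increment`, `compare`, `merge`) and the node names "x" and "y" are illustrative and not taken from any particular database.

```python
def increment(clock, node):
    """Return a copy of the clock with this node's counter advanced by one."""
    updated = dict(clock)
    updated[node] = updated.get(node, 0) + 1
    return updated

def compare(a, b):
    """Return 'equal', 'before', 'after', or 'concurrent' for clocks a and b."""
    nodes = set(a) | set(b)
    a_le_b = all(a.get(n, 0) <= b.get(n, 0) for n in nodes)
    b_le_a = all(b.get(n, 0) <= a.get(n, 0) for n in nodes)
    if a_le_b and b_le_a:
        return "equal"
    if a_le_b:
        return "before"
    if b_le_a:
        return "after"
    return "concurrent"  # neither write saw the other: a real conflict

def merge(a, b):
    """Clock for a value that reconciles both versions (element-wise max)."""
    return {n: max(a.get(n, 0), b.get(n, 0)) for n in set(a) | set(b)}

base = increment({}, "x")      # first write, accepted by node x
left = increment(base, "x")    # second write on node x
right = increment(base, "y")   # competing write on node y, also based on `base`

print(compare(base, left))     # base happened before left
print(compare(left, right))    # neither descends from the other
print(merge(left, right))
```

Unlike last one wins, this detects that `left` and `right` conflict instead of silently dropping one of them. Resolving the conflict is still up to the application.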

delete

op1: set joe, age 40
op2: delete joe
op3: set joe, 41

- consider switching 2 and 3
- tombstone: remember the delete and apply last op wins
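The tombstone idea can be sketched as follows. This is an illustration under assumptions: each operation carries a unique, comparable timestamp (here simply 1, 2, 3), and `LastWriteWinsStore` is a hypothetical in-memory class, not a real client library.

```python
from itertools import permutations

TOMBSTONE = object()  # marker stored in place of a deleted value

class LastWriteWinsStore:
    def __init__(self):
        self._entries = {}  # key -> (timestamp, value or TOMBSTONE)

    def apply(self, timestamp, key, value):
        """Keep the operation only if it is newer than what is stored."""
        current = self._entries.get(key)
        if current is None or timestamp > current[0]:
            self._entries[key] = (timestamp, value)

    def get(self, key):
        entry = self._entries.get(key)
        if entry is None or entry[1] is TOMBSTONE:
            return None
        return entry[1]

# The three operations from the slide, tagged with assumed timestamps.
OPS = [
    (1, "joe", 40),         # op1: set joe, age 40
    (2, "joe", TOMBSTONE),  # op2: delete joe
    (3, "joe", 41),         # op3: set joe, 41
]

def replay(ops):
    """Apply ops in the given arrival order and return joe's final value."""
    store = LastWriteWinsStore()
    for timestamp, key, value in ops:
        store.apply(timestamp, key, value)
    return store.get("joe")

# Every arrival order converges on the same final value.
results = {replay(order) for order in permutations(OPS)}
print(results)
```

Because the delete is remembered as a timestamped entry rather than removing the key, a late-arriving op2 cannot erase the newer op3, and a late-arriving op1 cannot resurrect a deleted joe.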

multiple writer strategies

• programmatic merge

• store ops instead of state

• replay operations

• did I get the last one?

• Commutative operations

• conflict free

• anything that’s foldable
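A small sketch of why commutative, foldable operations are conflict free: replaying stored operations with a fold gives the same state in every order when the operation commutes, and different states when it does not. The operation lists and helper names below are made up for illustration.

```python
from functools import reduce
from itertools import permutations

def fold(ops, initial, apply):
    """Replay a sequence of stored operations onto an initial state."""
    return reduce(apply, ops, initial)

def apply_increment(total, delta):
    """Commutative: adding deltas gives the same sum in any order."""
    return total + delta

def apply_assign(_current, value):
    """Not commutative: the last assignment replayed decides the state."""
    return value

increments = [5, -2, 10]
assignments = [40, 41, 42]

# Collect the final state for every possible replay order.
increment_results = {
    fold(order, 0, apply_increment) for order in permutations(increments)
}
assign_results = {
    fold(order, None, apply_assign) for order in permutations(assignments)
}

print("increments converge to:", increment_results)
print("assignments can end as:", assign_results)
```

Two replicas that receive the same increments in different orders end up in the same state without any coordination, which is not true for plain assignments.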

CP

• Sometimes we need global state

• Unique constraints

• User registration

• ACL changes

Finally

uptime(CP + average developer) >= uptime(AP + average developer)

Where uptime means the system is up and non-buggy
