Consistency without consensus: Linearizable Resilient Data Types (LRDT). Kaushik Rajan, Sagar Chordia, Kapil Vaswani, Ganesan Ramalingam, Sriram Rajamani

Page 1

Consistency without consensus
Linearizable Resilient Data Types (LRDT)

Kaushik Rajan, Sagar Chordia, Kapil Vaswani, Ganesan Ramalingam, Sriram Rajamani

Page 2

Consistency & consensus

Add(The Hobbit)

Add(Kindle)

GetCart()

Processes agree on ordering of operations

GetCart()

No deterministic algorithm for consensus in an asynchronous system in the presence of failures [FLP]

Page 3

Commuting updates

• What if all update operations commute?
– Ordering of updates doesn’t matter!
– Eventual consistency reduces to eventual message delivery
– Single round trip latency

• What if we desire linearizability?
– Updates don’t commute with arbitrary reads
– Reads must be consistently ordered with updates
– Semantics of queries like the current top(k) elements well understood

Page 4

Commuting updates

Add(The Hobbit)

Add(Kindle)

GetCart()

GetCart()

{}

{The Hobbit, Kindle}

Reads must observe comparable sets of operations

Page 5

Linearizable resilient data types

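[Figure: spectrum of data types labeled Possible, Impossible and Don’t know, with two state diagrams over states S, S1, S2 and S’ and operations op1, op2 illustrating the properties below]

P1 : commutes(s,op1,op2)

P2 : nullify(s,op1,op2)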

Page 6

Examples

• Read write register: every pair of writes nullify
• Read write memory: writes to the same location nullify, writes to different locations commute

Page 7

Examples

• Set: add, remove and read the whole set
– Add(u), Remove(v) commute (for u ≠ v)
– Add(u), Remove(u) nullify
– Add(*), Add(*) commute
– Remove(*), Remove(*) commute

• Counter: IncrBy(x), DecrBy(x), SetTo(v), Read()
– SetTo(v) nullifies all other operations
– Other pairs of updates commute

• Other examples: heaps, union-find, atomic snapshot objects…
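The set and counter claims above can be checked mechanically. The sketch below is only an illustration: the slides name the predicates commutes(s,op1,op2) and nullify(s,op1,op2) but this transcript does not preserve their definitions, so the reading of nullify used here (one operation, applied after the other, gives the same state as applying it alone) is an assumption, as are all helper names.

```python
def commutes(s, op1, op2):
    """P1: op1 and op2 applied to state s in either order reach the same state."""
    return op2(op1(s)) == op1(op2(s))


def nullify(s, op1, op2):
    """P2 (assumed reading): at state s, one operation overwrites the other,
    i.e. applying it after the other equals applying it alone."""
    return op2(op1(s)) == op2(s) or op1(op2(s)) == op1(s)


# Set updates, modeled as functions on immutable frozensets.
def add(u):
    return lambda s: s | {u}


def remove(u):
    return lambda s: s - {u}


# Counter updates, modeled as functions on integers.
def incr_by(x):
    return lambda c: c + x


def decr_by(x):
    return lambda c: c - x


def set_to(v):
    return lambda c: v


SET_STATES = [frozenset(), frozenset("u"), frozenset("v"), frozenset("uv")]
COUNTER_STATES = range(-3, 4)

checks = {
    "Add(u), Remove(v) commute":
        all(commutes(s, add("u"), remove("v")) for s in SET_STATES),
    "Add(u), Remove(u) nullify":
        all(nullify(s, add("u"), remove("u")) for s in SET_STATES),
    "Add(u), Remove(u) do not commute at {}":
        not commutes(frozenset(), add("u"), remove("u")),
    "Add(u), Add(v) commute":
        all(commutes(s, add("u"), add("v")) for s in SET_STATES),
    "Remove(u), Remove(v) commute":
        all(commutes(s, remove("u"), remove("v")) for s in SET_STATES),
    "SetTo(5) nullifies IncrBy(2)":
        all(nullify(c, incr_by(2), set_to(5)) for c in COUNTER_STATES),
    "IncrBy(2), DecrBy(3) commute":
        all(commutes(c, incr_by(2), decr_by(3)) for c in COUNTER_STATES),
}

for claim, holds in checks.items():
    print(f"{claim}: {holds}")
```

Every entry in `checks` evaluates to True over the small state spaces enumerated here; this is a spot check, not a proof for all states.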

Page 8

Lattice agreement

• Consistency reduces to lattice agreement
– Weaker problem than consensus
– Solvable in an asynchronous distributed system

• Assumptions
– t < n/2 failures
– Eventual message delivery

Page 9

Lattice agreement

• n processes, each process starts with a value belonging to a join semilattice
• Each non-faulty process outputs a value
– (Validity) Each process’ output is a join of one or more input values including its own
– (Consistency) Any two output values are comparable
– (Liveness) Every correct process eventually outputs a value

Page 10

Lattice agreement

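[Figure: Hasse diagram of the power set lattice over {a, b, c}, with {} at the bottom, then {a}, {b}, {c}, then {a, b}, {b, c}, {a, c}, and {a, b, c} at the top, annotated with processes p1, p2 and p3]

a = Add(The Hobbit)
b = Add(Kindle)
c = Add(Lumia)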
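To make the validity and consistency conditions concrete on this power set lattice, here is a minimal sketch that checks a candidate set of outputs. The inputs match the legend above (a, b, c); the two output assignments and all function names are hypothetical, chosen only for illustration.

```python
from itertools import combinations


def comparable(x, y):
    """Two elements of the power set lattice are comparable if one contains the other."""
    return x <= y or y <= x


def is_valid(output, own_input, all_inputs):
    """Validity: the output is a join (union) of input values, including the process's own."""
    contained = [v for v in all_inputs if v <= output]
    return own_input <= output and frozenset().union(*contained) == output


def check(inputs, outputs):
    """Return (validity, consistency) for a mapping process -> output value."""
    validity = all(
        is_valid(outputs[p], inputs[p], list(inputs.values())) for p in outputs
    )
    consistency = all(
        comparable(x, y) for x, y in combinations(outputs.values(), 2)
    )
    return validity, consistency


inputs = {"p1": frozenset("a"), "p2": frozenset("b"), "p3": frozenset("c")}

# Outputs that form a chain {a} <= {a, b} <= {a, b, c}.
chain = {"p1": frozenset("a"), "p2": frozenset("ab"), "p3": frozenset("abc")}

# Outputs {a, b} and {b, c} are incomparable, so consistency fails.
fork = {"p1": frozenset("ab"), "p2": frozenset("ab"), "p3": frozenset("bc")}

print(check(inputs, chain))
print(check(inputs, fork))
```

The first call reports both properties as satisfied; the second is valid but not consistent, which is exactly the situation lattice agreement rules out.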

Page 11

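[Figure: flowchart of the lattice agreement algorithm in two columns, PROPOSERS and ACCEPTORS, each starting from an "Initially" box]

PROPOSERS
• Send v_i to all acceptors
• Wait for majority of acceptors to respond
• All Acks?
– Y: Output v_i
– N: v_i ← ⋁ a_j over all Nack(a_j), then send to all acceptors again

ACCEPTORS
• On receiving v_j, test a_i ≤ v_j
– Y: respond, a_i = a_i ∨ v_j
– N: respond, a_i = a_i ∨ v_j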
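The flowchart can be approximated by a small single-threaded simulation. This is a sketch under several assumptions that the transcript does not confirm: values are frozensets ordered by inclusion, every acceptor starts at the empty set, an acceptor answers Ack when a_i ≤ v_j and otherwise Nack carrying its a_i, and each proposer round against a majority quorum runs atomically. The function names and the random scheduling are hypothetical.

```python
import random
from itertools import combinations


def lattice_agreement(inputs, n_acceptors=3, seed=0):
    """Simulate proposers and acceptors; returns a mapping proposer -> output."""
    rng = random.Random(seed)
    majority = n_acceptors // 2 + 1
    accepted = [frozenset() for _ in range(n_acceptors)]  # assumed initial a_i
    proposal = dict(inputs)                               # current v_i per proposer
    outputs = {}

    while len(outputs) < len(inputs):
        # Pick a pending proposer and the majority quorum that answers it.
        p = rng.choice(sorted(set(inputs) - set(outputs)))
        quorum = rng.sample(range(n_acceptors), majority)

        nacks = []
        for j in quorum:
            if not (accepted[j] <= proposal[p]):
                nacks.append(accepted[j])                 # Nack(a_j)
            accepted[j] = accepted[j] | proposal[p]       # a_j = a_j v v_i

        if nacks:
            # Refine the proposal with everything learned from the Nacks.
            proposal[p] = proposal[p].union(*nacks)
        else:
            outputs[p] = proposal[p]                      # all Acks: output
    return outputs


def is_chain(values):
    """True if every pair of values is comparable under set inclusion."""
    return all(x <= y or y <= x for x, y in combinations(values, 2))


inputs = {"p1": frozenset("a"), "p2": frozenset("b"), "p3": frozenset("c")}
outputs = lattice_agreement(inputs, seed=1)
print(outputs, is_chain(outputs.values()))
```

Each Nack strictly enlarges the proposal, and proposals are bounded by the join of all inputs, so the loop terminates. Two proposals acknowledged by majorities share an acceptor, which forces the later one to contain the earlier one; that is why the outputs form a chain in this sketch.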

Page 12

Safety and liveness

• Safety always guaranteed
• Lattice agreement is t-resilient
– Liveness guaranteed if quorum of processes are non-faulty and communication is reliable
– Processes output value in at most n round trips, where n is the number of processes

Page 13

Generalized lattice agreement

• Generalization of lattice agreement
– Processes receive sequence of values
– Values belong to an infinite lattice

• Processes output a sequence of values
– (Validity) Every output value is a join of some received values
– (Consistency) Any two output values are comparable (i.e. output values form a chain)
– (Liveness) Every value received by a correct process is eventually included in an output value

Page 14

GLA algorithm

• Liveness (t-resilient)
– Every received value is eventually included in some output in n round trips
– Adaptive, complexity depends on contention

• Fast path
– Received values output in one round trip

• Reconfigurable
– Replicas can be added/removed dynamically

Page 15

From GLA to linearizability

• Update commands form power set lattice
• Updates return once majority of processes have learnt a command set that includes the update command
• Read performed by (ABD style algorithm)
1. Reading the learnt command set from a quorum of processes
2. Writing back the largest among these to a quorum
3. Constructing state corresponding to the largest command set by exploiting commutativity and nullification

• Multi-master replication
– Does not require a single primary/leader
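The three read steps can be sketched for the simplest case, a counter whose update commands all commute, so the state can be rebuilt from a command set in any order. Handling nullifying commands is deliberately left out of this sketch. The `Replica` class, the command encoding `(id, delta)` and the explicit quorum arguments are hypothetical; message passing and failures are not modeled.

```python
from functools import reduce


class Replica:
    """Holds the command set this replica has learnt so far."""

    def __init__(self, learnt=()):
        self.learnt = frozenset(learnt)


def read(replicas, read_quorum, write_quorum, apply, initial):
    # 1. Read the learnt command set from a quorum of replicas.
    seen = [replicas[i].learnt for i in read_quorum]

    # Learnt sets are assumed to form a chain, so the largest one
    # contains all the others.
    largest = max(seen, key=len)
    assert all(s <= largest for s in seen)

    # 2. Write the largest set back to a quorum.
    for i in write_quorum:
        replicas[i].learnt = replicas[i].learnt | largest

    # 3. Construct the state. All commands commute here, so any
    #    order works; sorting just makes the run deterministic.
    return reduce(apply, sorted(largest), initial)


def apply_counter(state, command):
    """A command is (id, delta): IncrBy(x) has delta x, DecrBy(x) has delta -x."""
    _, delta = command
    return state + delta


c1, c2 = ("c1", 3), ("c2", -1)
replicas = [Replica([c1]), Replica([c1, c2]), Replica()]

value = read(replicas, read_quorum=[0, 1], write_quorum=[0, 2],
             apply=apply_counter, initial=0)
print(value)
```

After the read, replicas 0 and 2 have also learnt {c1, c2}, so any later read that contacts a quorum observes a command set at least as large.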

Page 16

Impossibility

• Consensus reduction

Consensus(b)
    if(b) then op1 else op2
    s = read()
    if(s = S1,S12) return true
    else return false

Pair of idempotent update operations that neither commute nor nullify at some state s0

[Figure: state diagram over states Si, S0, S1, S2, S12 and S21, with an Op* transition from Si to S0 and op1/op2 transitions among the remaining states]
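The reduction can be exercised with a toy object. In this sketch the transition table is a hypothetical data type in which op1 and op2 are idempotent and neither commute nor nullify at S0; the lock merely stands in for some linearizable implementation and is not part of the argument. The point is that every process can tell from the state it reads which update was applied first, so all of them decide the same value.

```python
import threading

# Hypothetical transitions: whichever update is applied first is remembered.
TRANSITIONS = {
    "op1": {"S0": "S1", "S1": "S1", "S2": "S21", "S12": "S12", "S21": "S21"},
    "op2": {"S0": "S2", "S2": "S2", "S1": "S12", "S12": "S12", "S21": "S21"},
}


class LinearizableObject:
    """Sequential object guarded by a lock, standing in for a linearizable one."""

    def __init__(self):
        self._state = "S0"
        self._lock = threading.Lock()

    def update(self, op):
        with self._lock:
            self._state = TRANSITIONS[op][self._state]

    def read(self):
        with self._lock:
            return self._state


def consensus(obj, b):
    """Consensus(b) from the slide: apply own update, read, decide on who was first."""
    obj.update("op1" if b else "op2")
    s = obj.read()
    return s in ("S1", "S12")


def decide_all(inputs):
    """Run one Consensus(b) call per input on a shared object, one thread each."""
    obj = LinearizableObject()
    decisions = [None] * len(inputs)

    def run(i, b):
        decisions[i] = consensus(obj, b)

    threads = [threading.Thread(target=run, args=(i, b)) for i, b in enumerate(inputs)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return decisions


decisions = decide_all([True, False, True])
print(decisions)
```

All entries of `decisions` are equal, and the common value is the input of whichever process updated first. Which value that is depends on thread scheduling, so no particular output is shown.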

Page 17

Implications for designing ADTs

Most commands commute

Page 18

Implications for designing ADTs

neither commute nor nullify at

;

Page 19

The Gap: Open problems

Doubly saturating counter

[Figure: state diagram with states 0, 1, 2, …, n; Incr() moves one state up and Decr() moves one state down, with Decr() looping at 0 and Incr() looping at n]

• Incr() and Decr() commute at 1 … n-1
• Incr() and Decr() nullify at 0 and n

Don’t know if this is possible or impossible
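The per-state behavior of the doubly saturating counter can be tabulated directly. This is a small sketch in which "nullify" is read as "one operation, applied after the other, gives the same state as applying it alone"; that reading is an assumption, since the transcript does not preserve the slide's definition. The function names are hypothetical.

```python
def classify(c, n):
    """Classify the pair Incr(), Decr() at state c of a counter saturating at 0 and n."""
    incr = lambda x: min(x + 1, n)   # saturates at n
    decr = lambda x: max(x - 1, 0)   # saturates at 0

    if decr(incr(c)) == incr(decr(c)):
        return "commute"
    if decr(incr(c)) == decr(c) or incr(decr(c)) == incr(c):
        return "nullify"
    return "neither"


n = 4
table = {c: classify(c, n) for c in range(n + 1)}
print(table)
# {0: 'nullify', 1: 'commute', 2: 'commute', 3: 'commute', 4: 'nullify'}
```

So every state satisfies one of the two properties, but no single property holds at all states, which is what places this data type in the gap between the known possible and impossible cases.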

Page 20

Summary

[Figure: spectrum of data types]

• Possible: graph, RW mem…
• ??: Saturating counter
• Impossible: queues, sequences