Logic and Lattices for Distributed Programming Neil Conway UC Berkeley Joint work with: Peter Alvaro, Peter Bailis, David Maier, Bill Marczak, Joe Hellerstein, Sriram Srinivasan Basho Chats #004 June 27, 2012

Programming

Distributed Programming

Dealing with Disorder

Introduce order
– Paxos, Zookeeper, Two-Phase Commit, …
– “Strong Consistency”

Tolerate disorder
– Correct behavior in the face of many possible network orders
– Typical goal: replicas converge to same final state
  • “Eventual Consistency”

Eventual Consistency

Popular

Hard to program

Help developers build reliable programs on top of eventual consistency

This Talk

1. Theory
– CRDTs, Lattices, and CALM

2. Practice
– Programming with Lattices
– Case Study: KVS

[Figure: Client0 and Client1 each read Students = {Alice, Bob}. One client then writes {Alice, Bob, Dave} and the other writes {Alice, Bob, Carol}, so the two replicas of Students hold different values.]

How to resolve?

Problem: Replicas perceive different event orders

Goal: Same final state at all replicas

Solution: Commutative operations (“merge functions”)

[Figure: after the writes from Client0 and Client1 are merged, both replicas hold Students = {Alice, Bob, Carol, Dave}.]

Merge = Set Union
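
A minimal sketch of the idea in plain Ruby (not Bloom): two replicas receive the same two writes in opposite orders and merge each one with set union. The `Replica` class and its `merge` method are hypothetical names used only for this illustration.

```ruby
require 'set'

# Hypothetical replica whose state is merged with set union.
class Replica
  attr_reader :students

  def initialize(initial)
    @students = Set.new(initial)
  end

  # Merge an incoming write into the local state.
  # Union is commutative, so the arrival order does not matter.
  def merge(write)
    @students |= Set.new(write)
    self
  end
end

write0 = %w[Alice Bob Dave]
write1 = %w[Alice Bob Carol]

# Each replica sees the two writes in a different order.
r0 = Replica.new(%w[Alice Bob]).merge(write0).merge(write1)
r1 = Replica.new(%w[Alice Bob]).merge(write1).merge(write0)

puts r0.students.sort.join(', ')
puts r0.students == r1.students
```

Both replicas end in the same state because union ignores the order and repetition of its inputs.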

Commutative Operations

• Used by Dynamo, Riak, Bayou, etc.
• Formalized as CRDTs: Convergent and Commutative Replicated Data Types
  – Shapiro et al., INRIA (2009-2012)
  – Based on join semilattices
  – Commutative, associative, idempotent
• Practical libraries: Statebox, Knockbox

[Figure: three lattices growing over time]
Set (Union): “Growth” = larger sets
Integer (Max): “Growth” = larger numbers
Boolean (Or): “Growth” = false → true
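
As a rough illustration, the three merge functions can be written as plain Ruby lambdas and checked against the semilattice laws (commutative, associative, idempotent). The sample values are arbitrary and the check only covers those samples; it is a sanity test, not a proof.

```ruby
require 'set'

# Merge functions for the three lattices in the figure.
merges = {
  set:     ->(a, b) { a | b },        # set union
  integer: ->(a, b) { [a, b].max },   # max
  boolean: ->(a, b) { a || b }        # logical or
}

# Arbitrary sample values per lattice (illustration only).
samples = {
  set:     [Set[], Set['Alice'], Set['Alice', 'Bob'], Set['Carol']],
  integer: [0, 3, 7],
  boolean: [false, true]
}

# Check commutativity, associativity, and idempotence over all sample triples.
laws_hold = lambda do |merge, values|
  values.product(values, values).all? do |a, b, c|
    merge.(a, b) == merge.(b, a) &&
      merge.(merge.(a, b), c) == merge.(a, merge.(b, c)) &&
      merge.(a, a) == a
  end
end

results = merges.to_h { |name, merge| [name, laws_hold.(merge, samples[name])] }
results.each { |name, ok| puts "#{name}: #{ok}" }
```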

[Figure: both replicas start with Students = {Alice, Bob, Carol, Dave} and Teams = {<Alice, Bob>}. One client reads Students = {Alice, Bob, Carol, Dave} and Teams = {<Alice, Bob>}, then writes Teams = {<Alice, Bob>, <Carol, Dave>}. The other client issues Remove: {Dave}, leaving Students = {Alice, Bob, Carol}. After replica synchronization, both replicas hold Students = {Alice, Bob, Carol} and Teams = {<Alice, Bob>, <Carol, Dave>}.]

[Figure: both replicas start with Students = {Alice, Bob, Carol, Dave} and Teams = {<Alice, Bob>}. This time Remove: {Dave} takes effect before the other client reads, so that client reads Students = {Alice, Bob, Carol} and Teams = {<Alice, Bob>}. After replica synchronization, both replicas hold Students = {Alice, Bob, Carol} and Teams = {<Alice, Bob>}.]

Nondeterministic Outcome!
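
The two figures can be replayed as a small plain-Ruby simulation. The pairing rule in `form_teams` is an assumption made for this sketch (the slides do not say how the client chooses teams); the point is only that the removal is non-monotone, so the final Teams value depends on whether the read happens before or after it.

```ruby
require 'set'

# Hypothetical client logic: pair up students who are not yet on a team.
form_teams = lambda do |students, teams|
  assigned = teams.to_a.flatten
  free = students.to_a.sort - assigned
  teams | free.each_slice(2).select { |pair| pair.size == 2 }.to_set
end

students = Set['Alice', 'Bob', 'Carol', 'Dave']
teams    = Set[%w[Alice Bob]]

# Order 1: the client reads Students before the removal of Dave arrives.
teams_read_first = form_teams.(students, teams)

# Order 2: the removal of Dave arrives before the client reads Students.
teams_remove_first = form_teams.(students - Set['Dave'], teams)

puts teams_read_first.to_a.inspect
puts teams_remove_first.to_a.inspect
```

The same program and the same inputs yield two different final values for Teams, which is the nondeterminism the slide points at.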

Possible Solution: Wrap both replicated values in a single complex CRDT

Goal: Compose larger application using “safe” mappings between simple lattices

[Figure: three lattices growing over time, linked by monotone functions]
Set (merge = Union) → size() → Integer (merge = Max) → >= 5 → Boolean (merge = Or)
size(): monotone function from set → max
>= 5: monotone function from max → boolean
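
A small plain-Ruby sketch of this chain, assuming an arbitrary stream of voter IDs: the set only grows, `size` maps it to an increasing integer, and the `>= 5` test maps that to a boolean that can flip from false to true but never back.

```ruby
require 'set'

votes   = Set.new   # set lattice, merge = union
outputs = []        # value of the boolean after each update

# Arbitrary input stream (note the duplicate 'b').
%w[a b c b d e f].each do |voter|
  votes |= Set[voter]         # grow the set
  count = votes.size          # monotone function: set -> max
  outputs << (count >= 5)     # monotone function: max -> boolean
end

puts outputs.inspect
```

Once the output becomes true it stays true, which is what lets a downstream consumer act on it without waiting for more input.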

Monotonicity in Practice

“The more you know, the more you know”

Never retract previous outputs (“mistake-free”)

Typical patterns:
• immutable data
• accumulate knowledge over time
• threshold tests (“if” w/o “else”)

Monotonicity and Determinism

Agents strictly learn more knowledge over time

Monotone: different learning order, same final outcome

Result: Program is deterministic!
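
One way to see this concretely is to run a program under every possible delivery order and count the distinct final outcomes. The two lambdas below are illustrative stand-ins written in plain Ruby: `monotone` accumulates messages and applies a threshold, while `last_writer_wins` overwrites its state and is not monotone.

```ruby
require 'set'

messages = %w[Alice Bob Carol Dave]

# Monotone program: accumulate messages into a set, then apply a threshold test.
monotone = lambda do |order|
  seen = order.reduce(Set.new) { |acc, m| acc | Set[m] }
  [seen.sort, seen.size >= 3]
end

# Non-monotone program: the last message received overwrites the state.
last_writer_wins = ->(order) { order.last }

# Run both programs under all 24 delivery orders and collect distinct outcomes.
monotone_outcomes = messages.permutation.map { |order| monotone.(order) }.uniq
lww_outcomes      = messages.permutation.map { |order| last_writer_wins.(order) }.uniq

puts monotone_outcomes.size
puts lww_outcomes.size
```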

A program is confluent if it produces the same results regardless of network nondeterminism

Consistency As Logical Monotonicity

CALM Analysis

1. All monotone programs are confluent

2. Simple syntactic test for monotonicity

Result: Simple static analysis for eventual consistency

Handling Non-Monotonicity

… is not the focus of this talk

Basic choices:
1. Nodes agree on an event order using a coordination protocol (e.g., Paxos)
2. Allow non-deterministic outcomes
   • If needed, compensate and apologize

Putting It Into Practice

What we’d like:
• Collection of agents
• No shared state (→ message passing)
• Computation over arbitrary lattices

Bloom

Organization: Collection of agents

Communication: Message passing

State: Relations (sets)

Computation: Relational rules over sets (Datalog, SQL)

              | Bloom                                     | BloomL
Organization  | Collection of agents                      | Collection of agents
Communication | Message passing                           | Message passing
State         | Relations (sets)                          | Lattices
Computation   | Relational rules over sets (Datalog, SQL) | Functions over lattices

Quorum Vote in BloomL

QUORUM_SIZE = 5
RESULT_ADDR = "example.org"

# Annotated Ruby class
class QuorumVote
  include Bud

  # Program state
  state do
    # Communication interfaces
    channel :vote_chn, [:@addr, :voter_id]
    channel :result_chn, [:@addr]
    # Lattice state declarations
    lset  :votes
    lmax  :vote_cnt
    lbool :got_quorum
  end

  # Program logic
  bloom do
    # Accumulate votes into set (<= applies the merge function for the set lattice)
    votes      <= vote_chn {|v| v.voter_id}
    # Map set → max
    vote_cnt   <= votes.size
    # Map max → bool
    got_quorum <= vote_cnt.gt_eq(QUORUM_SIZE)
    # Threshold test on bool
    result_chn <~ got_quorum.when_true { [RESULT_ADDR] }
  end
end

Builtin Lattices

Name  | Description                     | ⊥         | a ⊔ b                                | Sample Monotone Functions
lbool | Threshold test                  | false     | a ∨ b                                | when_true() → v
lmax  | Increasing number               | −∞        | max(a,b)                             | gt(n) → lbool; +(n) → lmax; -(n) → lmax
lmin  | Decreasing number               | +∞        | min(a,b)                             | lt(n) → lbool
lset  | Set of values                   | ∅         | a ∪ b                                | intersect(lset) → lset; product(lset) → lset; contains?(v) → lbool; size() → lmax
lpset | Non-negative set                | ∅         | a ∪ b                                | sum() → lmax
lbag  | Multiset of values              | ∅         | a ∪ b                                | mult(v) → lmax; +(lbag) → lbag
lmap  | Map from keys to lattice values | empty map | union of keys, values merged per key | at(v) → any-lat; intersect(lmap) → lmap

Case Study

Goal: Provably eventually consistent key-value store (KVS)

Assumption: Map keys to lattice values (i.e., values do not decrease)

Solution: Use a map lattice
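
A minimal sketch of a map-lattice merge in plain Ruby, assuming the values are increasing integers (a max lattice). `map_merge` and `max_merge` are hypothetical helper names for this example, not part of Bud.

```ruby
# Merge two map-lattice values: take the union of the keys and
# merge the values of keys present on both sides.
map_merge = lambda do |a, b, value_merge|
  a.merge(b) { |_key, x, y| value_merge.(x, y) }
end

# Value lattice assumed for this sketch: increasing integers, merge = max.
max_merge = ->(x, y) { [x, y].max }

replica1 = { 'k1' => 3 }
replica2 = { 'k1' => 5, 'k2' => 1 }   # k1 has grown, k2 is a new pair

# Synchronize in both directions.
sync12 = map_merge.(replica1, replica2, max_merge)
sync21 = map_merge.(replica2, replica1, max_merge)

puts sync12.inspect
puts sync12 == sync21
```

Adding a key and growing an existing value both move the map upward in its lattice order, so synchronization in either direction reaches the same state.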

[Figure sequence: Replica 1 and Replica 2 over time]
– Nested lattice value
– Add new K/V pair
– “Grow” value in extant K/V pair
– Replica Synchronization

Goal: Provably eventually consistent KVS that stores arbitrary values

Solution: Assign a version to each key-value pair

Each replica stores increasing versions, not increasing values

Object Versions in Dynamo/Riak

1. Each KV pair has a vector clock version

2. Given two versions of a KV pair, prefer the one with the strictly greater version

3. If versions are incomparable, invoke user-defined merge function

Vector Clock: Map from node IDs → logical clocks
Solution: Use a map lattice

Logical Clock: Increasing counter
Solution: Use an increasing-int lattice
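
A short plain-Ruby sketch of a vector clock built this way: a hash from node ID to counter, merged entry by entry with max. The helper names are hypothetical, and treating a missing entry as 0 is an assumption of this sketch.

```ruby
# A vector clock as a map lattice: node ID => increasing counter (max lattice).
vc_merge = ->(a, b) { a.merge(b) { |_node, x, y| [x, y].max } }

# a <= b in the lattice order: every entry of a is matched or exceeded in b.
# Missing entries are treated as 0 (an assumption of this sketch).
vc_leq = ->(a, b) { a.all? { |node, n| n <= b.fetch(node, 0) } }

# Concurrent (incomparable) clocks: neither one dominates the other.
vc_concurrent = ->(a, b) { !vc_leq.(a, b) && !vc_leq.(b, a) }

v1 = { 'r1' => 2, 'r2' => 1 }
v2 = { 'r1' => 1, 'r2' => 3 }
merged = vc_merge.(v1, v2)

puts vc_concurrent.(v1, v2)
puts merged.inspect
```

The merged clock is the least upper bound of the two inputs, so it dominates both of them.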

Version-Value Pairs

Pair = <fst, snd>

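# Merge for a <version, value> pair; fst and snd are themselves lattice values.
def merge(o)
  if fst > o.fst        # self has the strictly greater version
    self
  elsif fst < o.fst     # o has the strictly greater version
    o
  else                  # versions are equal or incomparable: merge both parts
    Pair.new(fst.merge(o.fst), snd.merge(o.snd))
  end
end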
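
The merge rule can be exercised with a self-contained plain-Ruby sketch. `Version` (a vector clock where missing entries count as 0) and `SetValue` (a set merged by union) are hypothetical stand-ins chosen for this example; any pair of lattices with `merge`, `>` and `<` would do.

```ruby
# Hypothetical version lattice: vector clock, node ID => counter.
Version = Struct.new(:clock) do
  def leq?(other)
    clock.all? { |node, n| n <= other.clock.fetch(node, 0) }
  end

  # Strictly greater: dominates the other version and differs from it.
  def >(other)
    other.leq?(self) && !leq?(other)
  end

  def <(other)
    other > self
  end

  def merge(other)
    Version.new(clock.merge(other.clock) { |_node, x, y| [x, y].max })
  end
end

# Hypothetical value lattice: a sorted list of items merged by union.
SetValue = Struct.new(:items) do
  def merge(other)
    SetValue.new((items | other.items).sort)
  end
end

Pair = Struct.new(:fst, :snd) do
  def merge(o)
    if fst > o.fst
      self
    elsif fst < o.fst
      o
    else
      Pair.new(fst.merge(o.fst), snd.merge(o.snd))
    end
  end
end

old        = Pair.new(Version.new({ 'r1' => 1 }), SetValue.new(%w[x]))
newer      = Pair.new(Version.new({ 'r1' => 2 }), SetValue.new(%w[y]))
concurrent = Pair.new(Version.new({ 'r1' => 1, 'r2' => 1 }), SetValue.new(%w[z]))

replaced = old.merge(newer)          # newer version wins, values are not merged
resolved = newer.merge(concurrent)   # incomparable versions: merge both parts

puts replaced.snd.items.inspect
puts resolved.fst.clock.inspect
puts resolved.snd.items.inspect
```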

[Figure sequence: Replica 1 and Replica 2 over time]
– Version increase; NOT value increase
– R1’s version replaces R2’s version
– New version @ R2
– Concurrent writes!
– Merge VC (automatically), value merge via user’s lattice (as in Dynamo)

Lattice Composition in KVS

Conclusion

Dealing with EC: Many event orders → order-independent (disorderly) programs

Lattices: Disorderly state

Monotone Functions: Disorderly computation

Monotone Bloom: Lattices + monotone functions for safe distributed programming

Questions Welcome

Please try Bloom!

http://www.bloom-lang.org

Or:

gem install bud

Backup Slides

Lattices

⟨S, ⊔, ⊥⟩ is a bounded join semi-lattice iff:
– S is a partially ordered set
– ⊔ is a binary operator (“least upper bound”)
  • For all x, y ∈ S, x ⊔ y = z where x ≤S z, y ≤S z, and there is no other upper bound z′ ≠ z of x and y in S such that z′ ≤S z.
  • Associative, commutative, and idempotent
– ⊥ is the “least” element in S (∀x ∈ S: ⊥ ⊔ x = x)

Example: increasing integers
– S = ℤ, ⊔ = max, ⊥ = −∞

Monotone Functions

f : S → T is a monotone function iff

∀a, b ∈ S : a ≤S b ⇒ f(a) ≤T f(b)

Example: size(Set) → Increasing-Int

size({A, B}) = 2
size({A, B, C}) = 3
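
The definition can be checked by brute force on a small universe. The sketch below, in plain Ruby with hypothetical helper names, enumerates every pair of subsets of {A, B, C} and tests the implication directly; `empty?` is included as a function that fails the test.

```ruby
require 'set'

universe = %w[A B C]

# All subsets of a small universe; the order on S is set inclusion.
subsets = (0..universe.size).flat_map { |k| universe.combination(k).map(&:to_set) }

# f is monotone iff a <= b implies f(a) <= f(b) for every pair (a, b);
# leq_t is the order on the target lattice T.
monotone = lambda do |f, leq_t|
  subsets.product(subsets).all? do |a, b|
    !a.subset?(b) || leq_t.(f.(a), f.(b))
  end
end

int_leq  = ->(x, y) { x <= y }   # increasing integers
bool_leq = ->(x, y) { !x || y }  # false <= true

size_ok  = monotone.(->(s) { s.size }, int_leq)
empty_ok = monotone.(->(s) { s.empty? }, bool_leq)

puts size_ok
puts empty_ok
```

`size` passes because a larger set never has a smaller size, while `empty?` fails because growing the empty set turns its output from true back to false.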

From Datalog → Lattices

                     | Datalog (Bloom)                          | BloomL
State                | Relations                                | Lattices
Example Values       | [[“red”, 1], [“green”, 2]]               | set: [“red”, “green”]; map: {“red” => 1, “green” => 2}; counter: 5; condition: false
Computation          | Rules over relations                     | Functions over lattices
Monotone Computation | Monotone rules                           | Monotone functions
Program Semantics    | Fixpoint of rules (stratified semantics) | Fixpoint of functions (stratified semantics)

Bloom Operational Model

Quorum Vote in Bloom

QUORUM_SIZE = 5
RESULT_ADDR = "example.org"

class QuorumVote
  include Bud

  state do
    # Communication
    channel :vote_chn, [:@addr, :voter_id]
    channel :result_chn, [:@addr]
    # Persistent storage
    table :votes, [:voter_id]
    # Transient storage
    scratch :cnt, [] => [:cnt]
  end

  bloom do
    # Accumulate votes
    votes <= vote_chn {|v| [v.voter_id]}
    # Not (set) monotonic!
    cnt <= votes.group(nil, count(:voter_id))
    # Send message when quorum reached
    result_chn <~ cnt {|c| [RESULT_ADDR] if c.cnt >= QUORUM_SIZE}
  end
end

Current Status

Writeups
• BloomL: UCB Tech Report
• Bloom/CALM: CIDR’11, website

Lattice Runtime
• Available as a git branch
• To be merged soon-ish

Examples, Case Studies
• KVS
• Shopping carts
• Causal delivery
Under development:
• MDCC, concurrent editing