43
QUORUMS By gil ben-zvi

QUORUMS By gil ben-zvi. definition Assume a universe U of servers, sized n. A quorum system S is a set of subsets of U, every pair of which intersect,

Embed Size (px)

Citation preview

QUORUMS

By gil ben-zvi

definition• Assume a universe U of servers, sized n. A

quorum system S is a set of subsets of U, every pair of which intersect, each Q belongs to S is called a quorum.

EXAMPLES

• Weighted majorities: assume that every server s in the universe U is assigned a number of votes w(s). Then weighted majorities is a quorum set defined by

}2/)()(:{

Qs Us

swswUQS

EXAMPLES

• MAJORITIES : a weighted majorities quorum system when all weights are the same.

• Singleton: a weighted majorities quorum system when for one server s: w(s)=1, and for each v of the other servers w(v)=0. (only quorum is s)

EXAMPLES

• Grid: suppose n is a square of some integer k. arrange the universe in a k x k grid. A quorum is the union of a full row and one element from each row below.

• FPP: suppose a projective plane over a field sized q. each point is an element, and each line is a quorum. By projective plane attributes, each quorum intersect.

More definitions• Coterie: a coterie S is a quorum system

such that for any Q1,Q2 quorums in S: Q1 isn’t included in Q2

• Domination: coterie S1 dominates coterie S2 if for every quorum Q2 belongs to S2, there exist Q1 in S1, such that S1 is contained in S2.

• Strategy: a probability vector representing the probability to access each quorum.

measures

• Load: the load L(S) of a quorum system is the minimal access probability minimized over the strategies.

• Resilience: resilience is k, if k is the largest number such that for every k server crashes, one quorum remains unhit.

measures

• Failure probability: if every server has certain probability to crash (assuming independently here), the probability that each quorum is hit. Usually assuming each server has the same crash probability p.

Measures examples

• Singleton: load=1, resilience=0, failure probability=p

• Majorities: load is about ½. Resilience about (n-1)/2. failure probability (if p < ½) smaller than exp(e,-n).

• Grid: load is O(1/sqrt(n)). Resilience = sqrt(n)-1, failure probability tends to 1 as n grows.

Access protocol

• Implements the semantics of a multi-writer multi-reader atomic variable.

• Assumes all clients and servers are non byzantine, unique timestamp for a client

• Write: a client asks some quorum to obtain a set of value/timestamps pairs, then he writes his value with higher timestamp than each of the timestamps received to each server in the quorum.

Access protocol

• Read: a client asks for each server in some quorum to obtain a set of value/timestamp. The client chooses the pair with the highest timestamp. It writes back the pair to each server in some quorum

• Server S updates a pair of value/timestamp, only if the timestamp is greater than the timestamp currently in S

Byzantine quorum systems

• We will use access protocol to demonstrate the subject

• Assuming communication is reliable, clients are correct, servers can be byzantine, assuming that a non-empty set of subsets of U: BAD, is known, some B in BAD contains all the faulty servers.

Masking quorum systems• A quorum system S is a masking quorum

system for a fail-prone system BAD if the following properties are satisfied:

QBSQBADBtyAvailabiliM

BBQQ

BADBBSQQyConsistencM

:,:

21\)21(

:2,1)2,1(:

Access protocol

• write: remains the same

• Read: for a client to read the variable x, it queries servers for some quorum Q to obtain a set of value/timestamp pairs

]}[1

]1[[1:,{

:

},{

ttvvBu

BBBADBQBtvC

computes

tvA

uu

Quuu

Access protocol

• The client chooses the pair with the highest timestamp in C, or null if C is empty.

Access protocol

• Claim: a read operation that is concurrent with no write operations return the value written by the last preceding write operation in some serialization of all preceding write operations.

• Claim: there exists a masking quorum system for BAD iff is a masking quorum system for BAD

}:\{ BADBBUS

Access protocol• Criterion: there exists a masking quorum

system for BAD iff for all

4321

43)21(\

43\))2\()1\((

:4,3,2,1:

4321,4,3,2,1

BBBBU

BBBBU

BBBUBU

BADBBBBproof

BBBBUBADBBBB

F-masking quorum systems

• F-masking quorum system: A masking quorum system where BAD is the set of all groups of servers sized f.

• By previous claims:

• There exists a masking quorum system for BAD iff n>4f

• Each pair of quorums must intersect by at least 2f+1 elements.

examples

• For f-masking quorums:

}12||,}...1{}{,:{\

)13(

:

}2/)12(|:|{

fInjIRCQ

nf

grid

fnQUQQ

Iiij

Dissemination quorum systems

• Assumes clients can digitally sign the value/timestamp they propagate.

• Therefore weaker demands than masking

• A quorum system S is a dissemination quorum system for a fail-prone system BAD if the following properties are satisfied:

QBSQBADBtyAvailabiliD

BQQBADBSQQyConsistencD

:,:

21:,2,1:

Dissemination quorum systems

• The same way as masking we reach the (different) criterion:

• There exists a dissemination quorum system for BAD iff

• If no more than f servers can fail, but any set of f servers can fail, then must hold: n>3f

321,3,2,1 BBBUBADBBB

Opaque masking quorum systems

• Motivation: We want not to expose the fail-prone system BAD.

• done by majority decision.• properties for quorum system to become opaque

masking system:

QBSQBADBtyAvailabiliO

BQBQQ

BADBSQQyConsistencO

QQBQBQQ

BADBSQQyConsistencO

:,:

|2||\)21(|

:,2,1:2

|)1\2()2(||\)21(|

:,2,1:1

Opaque masking quorum systems

• Read: the modification is that the client choose the pair <v,t> that appears most often, if there are multiple such sets, it chooses the newest one.

• Claim: Suppose maximum f servers can fail, there exists an opaque quorum system for BAD iff n>=5f, sufficient because quorums sized [(2n+2f)/3] is an opaque quorum system for B.

Opaque masking quorum system

• Claim: The load of any opaque system is at least ½.

• Proof: if we sum up the load of a certain quorum, we’ll get it’s bigger than it’s size/2. the claim follows.

• Example: hadamard matrix, world of size exp(2,l)

Faulty clients

• Solves the problem that a client will try to fail the protocol.

• The treatment here provides a single-writer multi-reader semantics.

• The write operation starts when the 1st server receives update request, and ends when the last server sent acknowledgment.

Faulty clients

• Write: for a client c to write the value v, it chooses legal timestamp, larger than any timestamp it has chosen before, chooses a quorum Q, And then it sends <update,Q,v,t> to each server in Q, if after some timeout period it has not received acknowledgment, than it chooses another quorum.

Faulty clients-servers protocol• The servers protocol is as follows:

1. if a server receives <update,Q,v,t> from a client c, with legal timestamp, then it sends <echo,Q,v,t> to each member of Q.

2. If a server receives identical echo messages <echo,Q,v,t> from every server in Q, then it sends <ready,Q,v,t> to each member of Q.

Faulty clients-servers protocol

3. If a server receives identical ready messages <ready,Q,v,t> from a set of servers that certainly doesn’t contain faulty server, it sends <ready,Q,v,t> to Q.

4. If a server receives identical ready messages <ready,Q,v,t> from a set Q1 of servers, such that Q1=Q\B for some B in BAD, it sends acknowledgment for c, and update the pair if t is greater than the timestamp it currently has.

Faulty servers-properties

• Agreement: if a correct server delivers <v,t> and a correct server delivers <r,t> then r=v

• Proof: if a correct server delivers <v,t>, then echo must have been send by all correct servers in Q1. same about Q2, they intersect in a correct server, which doesn’t send different value with the same timestamp

Faulty servers-properties

• Claim: Read received last written value if it’s not concurrent with write operations.

• Proof: same as masking quorum system.

• Propagation: similar ideas to r.b, and byzantine agreement, if server decides to deliver, it is promised that all other decides that too.

• Validity: at the end a correct quorum will be accessed, so the write can end.

Load, capacity, availability • Load: we will mark L(S), definition as

before

• availability: failure probability with the same “p” for all the servers, we will mark it as Fp(S)

• Capacity: we’ll define a(S,k) as the maximum number of quorum accesses that S can handle during a period of k time units. Capp(S) is the limit of a(S,k)/k as k tends to infinity.

Load, capacity, availability

• Example: majorities

• The claim is that cap(S)=1/L(S), and there is a trade off between good availability and good load.

definitions• The cardinality of the smallest quorum is

denoted by c(S)

• The degree of an element i in a quorum system S is the number of quorums that contain i

• Let S be a quorum system. S is a s-uniform if |Q| = s for each Q in S

• S is (s,d) fair if it is s-uniform and deg(i)=d foreach i, it is called s-fair if it is (s,d) fair for some d.

LP

• We can use a linear programming to calculate the load and the strategy achieving the load.

0

,

1

.,min)(:

,0

1

Lw

UiLw

w

tsLSLLP

j

sij

m

jj

j

DUAL LP

• Some time we want to use the dual linear program, in which we give probabilities over the elements of the world. It is a known fact that DLP<=LP

0{

0{

,{

1{

.,max)(:

1

T

y

SQTy

y

tsTStDLP

i

Qii

n

ii

The load with failures

• A configuration is a vector in which it holds 1 in places representing the failing elements in the world

• Dead(x) is the group of elements failed, live(x) is the non failed ones

• S(x) is the sub collection of functioning quorums

nx }1,0{

)}(:{)( xliveQSQxS

Load with failiures

• The load of quorum system S over a configuration x, if S(x) is empty then L(S(x)) = 1, if there are functioning quorums we define it in similar way as before by linear programming problem.

• Let the elements fail with probabilities P=(p1,………,pn). Then the load is a random variable Lp(S) defined by:

LSxL

X xdeadi xlivei

pipiLSLpP

)()( )(

)1())((

Load with fails

• Claim: E(Lp(S))>=Fp(S)

• Claim: If (configurations) x>=z than L(S(x))>=L(S(z))

• Proof: S(z) contains S(x), strategy for S(x) is for S(z) too.

• Claim: E(Lp(S)) is a non decreasing function.

Properties of the load

• Claim: L(S)>=c(S)/n

• Claim: L(S)>=1/c(S)

• Proof: if we choose probability 1/c(S) for every element in c(S) and 0 in the rest, we achieve possible solution for the DLP problem.

• Conclusion: L(S)>1/sqrt(n) (achieved when c(S) is close to sqrt(n)

Load/fail probability trade off

• Claim: Fp(S)>=exp(p,n*L(S))

• Proof: the probability that all the elements in the smallest quorum will fail, (and therefore the quorum system fails) = exp(p,c(S)). Since c(S)<=nL(S) the claim follows.

examples

• Optimal load, optimal load/ failure tradeoff, good failure load – paths system

• B-grid system

• SC-grid system

• AndOr system

Load analyses

• Claim: Non dominated coteries have lower bounds.

• The claim follows if you choose strategy for the dominator by giving the probability only in quorums which contained by a quorum in the dominated quorum system

• Claim: voting systems have high load (more than ½)

Last slide!!!!

• Proof: if we define V=the sum of all votes (Vi), then the vector Yi=Vi/V is a solution for DLP larger than ½.