Distributed Consensus (continued)

Page 1: Distributed Consensus (continued)

Distributed Consensus (continued)

Page 2: Distributed Consensus (continued)

Byzantine Generals Problem: Solution with signed messages

A signed message satisfies all the conditions of oral message, plus two extra conditions

• Signatures cannot be forged. Forged messages are detected and discarded by loyal generals.

• Anyone can verify the authenticity of a signature.

Signed messages improve resilience.
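The two conditions can be illustrated with a minimal Python sketch. HMAC over a shared key is an assumed stand-in for the signature scheme; a real signed-message protocol would use public-key signatures so that anyone can verify without the key.

```python
import hmac, hashlib

KEY = b"shared-secret"  # hypothetical key; stands in for a real signing key

def sign(value: str, signer: int) -> bytes:
    # "Signature" over the value and the signer's id.
    return hmac.new(KEY, f"{value}:{signer}".encode(), hashlib.sha256).digest()

def verify(value: str, signer: int, sig: bytes) -> bool:
    # A forged or altered message fails this check and is discarded.
    return hmac.compare_digest(sign(value, signer), sig)

sig = sign("attack", 0)
assert verify("attack", 0, sig)        # authentic message accepted
assert not verify("retreat", 0, sig)   # tampered value detected and discarded
```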

Page 3: Distributed Consensus (continued)

Example

[Figure: two runs (a) and (b) with commander 0 and lieutenants 1, 2. The commander's signed order 1{0} (or 0{0}) is relayed by the lieutenants with their own signatures appended, e.g. 1{0,1} and 0{0,2}; a forged relay is discarded.]

Using signed messages, Byzantine consensus is feasible with 3 generals and 1 traitor. In (b) the loyal lieutenants compute the consensus value by applying some choice function on the set of values.

Page 4: Distributed Consensus (continued)

Signature list

[Figure: a message relayed along the chain 0 → 1 → 7 → 4, its signature list growing at each hop: v{0}, v{0,1}, v{0,1,7}, v{0,1,7,4}.]

Page 5: Distributed Consensus (continued)

Byzantine consensus: the signed message algorithm SM(m)

Commander i sends a signed message v{i} to each lieutenant j ≠ i.

Lieutenant j, after receiving a message v{S}, appends it to a set V.j only if (i) it is not forged, and (ii) it has not been received before.

If the length of S is less than m+1, then lieutenant j (i) appends its own signature to S, and (ii) sends the signed message to every lieutenant whose signature does not appear in S.

Lieutenant j applies a choice function on V.j to make the final decision.
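The steps above can be sketched as a Python simulation. Signatures are modeled simply as lists of appended process ids (adequate for a simulation in which no one forges), and min is an assumed choice function; both are illustration choices, not prescribed by the slides.

```python
def sm(m, n, commander_value):
    """Simulate SM(m) with commander 0 and lieutenants 1..n-1, all loyal."""
    V = [set() for _ in range(n)]                 # V.j for each process j
    # A message is (value, signature list S, destination); commander signs first.
    queue = [(commander_value, [0], j) for j in range(1, n)]
    while queue:
        value, S, j = queue.pop()
        if value not in V[j]:                     # not received before
            V[j].add(value)
            if len(S) < m + 1:                    # relay with own signature
                for k in range(1, n):
                    if k != j and k not in S:
                        queue.append((value, S + [j], k))
    choice = min                                  # assumed choice function
    return [choice(v) for v in V[1:]]             # each lieutenant's decision

print(sm(1, 3, "attack"))   # ['attack', 'attack']: 3 generals, m = 1 tolerated
```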

Page 6: Distributed Consensus (continued)

Theorem of signed messages

If n ≥ m + 2, where m is the maximum number of traitors,

then SM(m) satisfies both IC1 and IC2.

Proof. Case 1: the commander is loyal. The bag of each process will contain exactly one message: the one sent by the commander.

(Try to visualize this)

Page 7: Distributed Consensus (continued)

Proof of signed message theorem

Case 2. Commander is traitor.

• The signature list has size m+1, and there are m traitors, so at least one lieutenant signing the message must be loyal.

• Every loyal lieutenant i will receive every other loyal lieutenant’s message. So, every message accepted by j is also accepted by i and vice versa. So V.i = V.j.

Page 8: Distributed Consensus (continued)

Example

[Figure: four generals 0, 1, 2, 3 relaying signed values a, b, c, f. Process 3 accepts c, but 2 rejects f, leaving the loyal generals with different sets {a, b, -} and {a, b, c}.]

With m = 2 and a signature list of length 2, the loyal generals may not receive the same order from the commander, who is a traitor. When the length of the signature list grows to 3, the problem is resolved.

Page 9: Distributed Consensus (continued)

Concluding remarks

• The signed message version tolerates a larger number (n-2) of faults.

• Message complexity however is the same in both cases.

Message complexity = (n-1)(n-2) … (n-m+1)
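The product above can be evaluated directly; the instance below (n = 7, m = 3) is a hypothetical example, not from the slides.

```python
from math import prod

def sm_messages(n: int, m: int) -> int:
    # (n-1)(n-2)...(n-m+1): the terms from n-1 down to n-m+1.
    return prod(range(n - m + 1, n))

print(sm_messages(7, 3))   # 6 * 5 = 30
```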

Page 10: Distributed Consensus (continued)

Failure detectors

Page 11: Distributed Consensus (continued)

Failure detector for crash failures

• The design of fault-tolerant algorithms will be simple if processes can detect (crash) failures.

• In synchronous systems with bounded delay channels, crash failures can definitely be detected using timeouts.
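A timeout-based detector can be sketched as below. The TIMEOUT constant and the heartbeat bookkeeping are illustrative assumptions; the slide only asserts that a known delay bound makes such detection reliable.

```python
import time

TIMEOUT = 2.0  # assumed bound: maximum message delay plus processing time

last_heard = {}  # process id -> time the process was last heard from

def heard_from(pid):
    last_heard[pid] = time.monotonic()

def crashed(pid):
    # In a synchronous system with bounded-delay channels this test is
    # definitive: a process silent longer than the bound must have crashed.
    return time.monotonic() - last_heard[pid] > TIMEOUT

heard_from("p")
print(crashed("p"))   # False: p was heard from within the bound
```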

Page 12: Distributed Consensus (continued)

Failure detectors for asynchronous systems

In asynchronous distributed systems, the detection of crash failures is imperfect. There will be false positives and false negatives. Two properties are relevant:

Completeness. Every crashed process is eventually suspected.

Accuracy. No correct process is ever suspected.

Page 13: Distributed Consensus (continued)


Failure Detectors

An FD is a distributed oracle that provides hints about the operational status of processes. However:

• Hints may be incorrect

• FD may give different hints to different processes

• FD may change its mind (over & over) about the operational status of a process

Page 14: Distributed Consensus (continued)


Typical FD Behavior

[Figure: process p alternates between up and down; the FD at q alternates between trust and suspect, possibly many times, before eventually suspecting p permanently once p stays down.]

Page 15: Distributed Consensus (continued)

Revisit the Consensus problem

[Figure: processes 1, 2, 3, 4 each supply an input; the output is a single agreed value.]

Page 16: Distributed Consensus (continued)

Example

[Figure: eight processes numbered 0-7.]

0 suspects {1, 2, 3, 7} to have failed. Does this satisfy completeness? Does this satisfy accuracy?
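The two questions can be answered mechanically once the ground truth is fixed. The crashed set below is a hypothetical assumption; the slide deliberately leaves it open.

```python
def complete(suspects, crashed):
    # Completeness: every crashed process is suspected.
    return crashed <= suspects

def accurate(suspects, crashed):
    # Accuracy: no correct process is suspected.
    return suspects <= crashed

suspects = {1, 2, 3, 7}
crashed = {1, 2}                     # hypothetical ground truth
print(complete(suspects, crashed))   # True: both crashed processes suspected
print(accurate(suspects, crashed))   # False: correct processes 3 and 7 suspected
```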

Page 17: Distributed Consensus (continued)

Classification of completeness

• Strong completeness. Every crashed process is eventually suspected by every correct process, and remains a suspect thereafter.

• Weak completeness. Every crashed process is eventually suspected by at least one correct process, and remains a suspect thereafter.

Note that we don’t care what mechanism is used for suspecting a process.

Page 18: Distributed Consensus (continued)

Classification of accuracy

• Strong accuracy. No correct process is ever suspected.

• Weak accuracy. There is at least one correct process that is never suspected.

Page 19: Distributed Consensus (continued)

Transforming completeness

Weak completeness can be transformed into strong completeness.

{program for process i}
define D: set of process ids (representing the suspects);
initially D is generated by the weakly complete failure detector of i;

do true →
    send D(i) to every process j ≠ i;
    receive D(j) from every process j ≠ i;
    D(i) := D(i) ∪ D(j);
    if j ∈ D(i) → D(i) := D(i) \ {j} fi
od
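One exchange round of this transformation can be sketched in Python; message passing is simulated by direct access to every process's suspect set, which is an assumption of the sketch, not the protocol.

```python
def strengthen(D):
    """D maps process id -> its suspect set (from a weakly complete FD).
    Returns each process's suspect set after one exchange round."""
    alive = set(D)   # every process we received a D(j) from is alive
    new = {}
    for i in D:
        merged = set().union(*(D[j] for j in D))   # D(i) := D(i) ∪ D(j)
        new[i] = merged - alive                    # heard from j => unsuspect j
    return new

# Only process 1 (weakly) suspects the crashed process 9; after one
# round every correct process suspects it.
print(strengthen({1: {9}, 2: set(), 3: set()}))
```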

Page 20: Distributed Consensus (continued)

Eventual accuracy

A failure detector is eventually strongly accurate, if there exists a time T after which no correct process is suspected.

(Before that time, a correct process may be added to and removed from the list of suspects any number of times.)

A failure detector is eventually weakly accurate, if there exists a time T after which at least one correct process is no longer suspected.

Page 21: Distributed Consensus (continued)

Classifying failure detectors

Perfect P. (Strongly) complete and strongly accurate

Strong S. (Strongly) complete and weakly accurate

Eventually perfect ◊P. (Strongly) complete and eventually strongly accurate

Eventually strong ◊S. (Strongly) complete and eventually weakly accurate

Other classes are feasible: W (weak completeness and weak accuracy) and ◊W (weak completeness and eventually weak accuracy)

Page 22: Distributed Consensus (continued)

Motivation

Question 1. Given a failure detector of a certain type, how can we solve the consensus problem?

Question 2. How can we implement these classes of failure detectors in asynchronous distributed systems?

Question 3. What is the weakest class of failure detectors that can solve the consensus problem?

(Weakest class of failure detectors is closest to reality)

Page 23: Distributed Consensus (continued)


Application of Failure Detectors

Applications often need to determine which processes are up (operational) and which are down (crashed). This service is provided by a Failure Detector. FDs are at the core of many fault-tolerant algorithms and applications, like

• Group Membership
• Group Communication
• Atomic Broadcast
• Primary/Backup systems
• Atomic Commitment
• Consensus
• Leader Election
• …..

Page 24: Distributed Consensus (continued)


[Figure: processes p, q, r, s, t exchanging messages; a SLOW process risks being wrongly suspected.]

Page 25: Distributed Consensus (continued)


[Figure: consensus among processes p, q, r, s, t with inputs 5, 7, 8, 2, 8; one process crashes, and every surviving process decides 5.]

Page 26: Distributed Consensus (continued)

26

Solving Consensus

• In synchronous systems: Possible

• In asynchronous systems: Impossible [FLP83]

even if:
• at most one process may crash, and
• all links are reliable

Page 27: Distributed Consensus (continued)

A more complete classification of failure detectors

                      strong accuracy   weak accuracy   ◊ strong accuracy   ◊ weak accuracy
strong completeness   Perfect P         Strong S        ◊P                  ◊S
weak completeness     -                 Weak W          -                   ◊W

Page 28: Distributed Consensus (continued)

Consensus using P

{program for process p, t = max number of faulty processes}

initially Vp := (⊥, ⊥, ⊥, …, ⊥); {array of size n}

Vp[p] := input of p; Dp := Vp; rp := 1

{Vp[q] = ⊥ means process p thinks q is a suspect. Initially everyone is a suspect}

{Phase 1} for round rp = 1 to t+1

send (rp, Dp, p) to all;

wait to receive (rp, Dq, q) from all q, {or else q becomes a suspect};

for k = 1 to n: if Vp[k] = ⊥ ∧ ∃(rp, Dq, q): Dq[k] ≠ ⊥ then Vp[k] := Dq[k] end for

end for

{at the end of Phase 1, Vp for each correct process is identical}

{Phase 2} Final decision value is the input from the first element Vp[j]: Vp[j] ≠ ⊥
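A crash-free Python simulation of this protocol is sketched below. BOT stands in for ⊥, and messaging is modeled by snapshotting every vector each round; in a real run, processes that fail to answer would become suspects.

```python
BOT = None  # stands in for the ⊥ value

def consensus_P(inputs, t):
    """Simulate t+1 gossip rounds among n correct processes, then decide."""
    n = len(inputs)
    V = [[BOT] * n for _ in range(n)]
    for p in range(n):
        V[p][p] = inputs[p]                   # Vp[p] := input of p
    for _ in range(t + 1):                    # Phase 1: t+1 rounds
        snapshot = [list(v) for v in V]       # the vectors sent this round
        for p in range(n):
            for q in range(n):
                for k in range(n):
                    if V[p][k] is BOT and snapshot[q][k] is not BOT:
                        V[p][k] = snapshot[q][k]
    # Phase 2: decide on the first non-⊥ entry
    return [next(x for x in V[p] if x is not BOT) for p in range(n)]

print(consensus_P([5, 7, 2, 8], t=1))   # [5, 5, 5, 5]: all decide process 0's input
```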

Page 29: Distributed Consensus (continued)

Understanding consensus using P

Why continue for (t+1) rounds?

It is possible that a process p sends out the first message to q and then crashes. If there are n processes and t of them crash, then after at most (t+1) asynchronous rounds, Vp for each correct process p becomes identical, and contains all inputs from processes that may have transmitted at least once.

Page 30: Distributed Consensus (continued)

Understanding consensus using P

[Figure: a completely connected topology in which process 1 sends (1, D1) and then crashes, process 2 sends (2, D2) and then crashes, …, process t sends (t, Dt) and then crashes.]

Well, I received D from 1, but did everyone receive it? To ensure this, multiple rounds of broadcasts are necessary …

Page 31: Distributed Consensus (continued)

Consensus using other types of failure detectors

Algorithms exist for reaching consensus with several other forms of failure detectors. In general, the weaker the failure detector, the closer it is to reality (a truly asynchronous system), but the harder the consensus algorithm becomes.

Page 32: Distributed Consensus (continued)

Consensus using S

Vp := (⊥, ⊥, ⊥, …, ⊥); Vp[p] := input of p; Dp := Vp

(Phase 1) Same as phase 1 of consensus with P – it runs for (t+1) asynchronous rounds

(Phase 2) send (Vp, p) to all;

receive (Dq, q) from all q;

for k = 1 to n: if ∃q: Vp[k] ≠ ⊥ ∧ Vq[k] = ⊥ then Vp[k] := Dp[k] := ⊥ end for

(Phase 3) Decide on the first element Vp[j]: Vp[j] ≠ ⊥
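Phases 2 and 3 can be sketched in Python on hypothetical phase-1 vectors (BOT stands in for ⊥): every entry that some process is missing gets erased everywhere, so only universally known inputs survive.

```python
BOT = None  # stands in for the ⊥ value

def phases_2_3(vectors):
    """vectors: one phase-1 vector per correct process. Mutates and decides."""
    n = len(vectors[0])
    # Phase 2: drop entry k everywhere if any process is missing it
    for k in range(n):
        if any(v[k] is BOT for v in vectors):
            for v in vectors:
                v[k] = BOT
    # Phase 3: decide on the first surviving entry
    return [next(x for x in v if x is not BOT) for v in vectors]

V = [[0, BOT, 2, 3, BOT, BOT],
     [BOT, 1, BOT, 3, BOT, BOT],
     [0, 1, 2, 3, BOT, BOT],
     [BOT, 1, BOT, 3, BOT, BOT]]
print(phases_2_3(V))   # [3, 3, 3, 3]: only process 3's input is known to all
```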

Page 33: Distributed Consensus (continued)

Consensus using S: example

Assume that there are six processes: 0, 1, 2, 3, 4, 5. Of these, 4 and 5 crashed, and 3 is the process that will never be suspected. Assuming that k is the input from process k, at the end of phase 1 the following is possible:

V0 = (0, ⊥, 2, 3, ⊥, ⊥)
V1 = (⊥, 1, ⊥, 3, ⊥, ⊥)
V2 = (0, 1, 2, 3, ⊥, ⊥)
V3 = (⊥, 1, ⊥, 3, ⊥, ⊥)

At the end of phase 3, the processes agree upon the input from process 3.

[Figure: processes 0-3 labeled with the vectors above; processes 4 and 5 have crashed.]

Page 34: Distributed Consensus (continued)

Conclusion

[Figure: a nested diagram of the failure detector classes ◊W, W, ◊S, ◊P, S, and P within an asynchronous system, marking which regions can and which cannot solve the consensus problem.]

Page 35: Distributed Consensus (continued)

Paxos

• A solution to the asynchronous consensus problem due to Lamport.

• Runs on a completely connected network of n processes

• Tolerates up to m failures, where n ≥ 2m+1. Processes can crash and messages may be lost, but Byzantine failures are ruled out

• Although the requirements for consensus are agreement, validity, and termination, Paxos primarily guarantees agreement and validity. (If it guaranteed all three properties, then that would violate FLP.)

Page 36: Distributed Consensus (continued)

Properties

Safety Properties
Validity. Only a proposed value can be chosen as the final decision.
Agreement. Two different processes cannot make different decisions.

Liveness Properties
Termination. Some proposed value is eventually chosen. (Is it really satisfied without some form of randomization?)
Notification. If a value has been chosen, a node can eventually learn the value.

Page 37: Distributed Consensus (continued)

Three roles of processes

• Each process may play three different roles: proposer, acceptor and learner

[Figure: multiple proposers send to a single acceptor, which outputs the decision. Too simplistic: what if the acceptor crashes?]

Page 38: Distributed Consensus (continued)

Paxos algorithm

Phase 1 (prepare):

Step 1.1. Each proposer sends a proposal (v, n) to each acceptor.

Step 1.2. If n is the largest sequence number among the proposals received by an acceptor, then it sends an ack (n, -, -) to its proposer, which is a promise that it will ignore all proposals numbered lower than n. In case the acceptor has already accepted a proposal with a sequence number n' < n and a proposed value v, it responds with an ack (n, v, n'). (It implies that the proposer has no point in trying to push its own value with a larger sequence number. It can however send a new request with the value v.)

Page 39: Distributed Consensus (continued)

Paxos algorithm

Phase 2 (accept):

Step 2.1. If a proposer receives an ack (n, -, -) from a majority of acceptors, then it sends accept (n, v) to all acceptors, asking them to accept this value. (Note: if an acceptor instead returned an ack (n, v, n') in phase 1, meaning it had already accepted a proposal with value v, then the proposer must include the value v of the highest-numbered accepted proposal in its request to the acceptors.)

Step 2.2. An acceptor accepts a proposal (n, v) unless it has already promised to consider only proposals with a sequence number greater than n.
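A single acceptor's handlers for the two phases can be sketched as below. The class and field names are assumptions for illustration, not Lamport's notation, and a real deployment would run many acceptors with persistent state.

```python
class Acceptor:
    def __init__(self):
        self.promised = -1     # highest sequence number promised so far
        self.accepted = None   # (n', v) of the last accepted proposal, if any

    def prepare(self, n):
        # Step 1.2: promise to ignore proposals numbered lower than n,
        # reporting any previously accepted (n', v) back to the proposer.
        if n > self.promised:
            self.promised = n
            return ("ack", n, self.accepted)
        return None            # proposal ignored: a higher n was promised

    def accept(self, n, v):
        # Step 2.2: accept unless superseded by a higher promise.
        if n >= self.promised:
            self.promised = n
            self.accepted = (n, v)
            return True
        return False

a = Acceptor()
assert a.prepare(1) == ("ack", 1, None)
assert a.accept(1, "x")
assert a.prepare(2) == ("ack", 2, (1, "x"))   # reports the accepted value
assert not a.accept(1, "y")                   # promised 2, so n=1 is declined
```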

Page 40: Distributed Consensus (continued)

The final decision

When a majority of the acceptors accepts a proposed value, it becomes the final decision value.

The acceptors multicast the accepted value to the learners. This enables the learners to determine whether a proposal has been accepted by a majority of acceptors.

The learners convey it to the client processes.

Page 41: Distributed Consensus (continued)

Observations

Observation 1. An acceptor accepts a proposal with a sequence number n if it has not sent a promise for any proposal with a sequence number n' > n.

Observation 2. If a proposer sends an accept (v, n) message in phase 2, then either no acceptor in a majority has accepted a proposal with a sequence number n' < n, or v is the value in the highest-numbered proposal among all accepted proposals with sequence numbers n' < n accepted by at least one acceptor in that majority.

Page 42: Distributed Consensus (continued)

What about Liveness?

Consider the following scenario:

(Phase 1) Proposer 1 sends out prepare (v, n1);
(Phase 1) Proposer 2 sends out prepare (v, n2), where n2 > n1;
(Phase 2) Proposer 1's accept (n1) is declined, since the acceptor has already promised proposer 2 that it will not accept any proposal numbered lower than n2. So proposer 1 restarts phase 1 with a higher number n3 > n2;
(Phase 2) Proposer 2's accept request is now declined on similar grounds;

The race can go on forever! To avoid this, either elect a single proposer (how?) or use randomization.