INTRODUCTION TO AGREEMENT PROBLEMS

COMPSCI 711 Byzantine radu/2009

1.31

o Agenda:

o Overview of agreement.

o Stopping failures:

With FloodSet.

With EIG (Exponential Information Gathering).

o Classical Byzantine algorithms:

With Authentication.

Vs TMR (Triple Modular Redundancy).

Impossibility example.

With EIG.


Textbook sections 6, 6.1 – 6.4.


2.31

o Here we will mostly consider complete undirected graphs:

o with reliable communication links.

o but process faults are possible (up to f faulty nodes).

o Some processes may at arbitrary times stop and remain stopped forever (turn into “black holes”).

o We use “stop” for such failures, and “halt/terminate” for normal end.

o Some processes may at arbitrary times (but need not) exhibit arbitrary behaviour,

accidentally, or intentionally.

o We use “Byzantine faults” for such failures.

o A stopped fault can be subsumed as a special case of a Byzantine fault.

o A non-faulty process can also be considered as a special case of a Byzantine fault (quite

useful in proofs).

OVERVIEW OF AGREEMENT.

Our terminology:

o non-faulty = elf (plural: elves); at least n-f elves

o faulty = orc (plural: orcs); up to f orcs

o Some data structures use graphs or trees

o Do not confuse a process node with a

graph/tree node

The complete graph K4

with n=4 nodes


3.31

o Each process starts with its own initial value v ∈ V.

o The basic case is binary V = {0, 1}.

o We usually don’t care if this initial value is created by the process itself or received from

an external source.

o We may care about the initial value of a stopping-fault process (e.g., for Commit).

o We usually do not care about the initial value of a Byzantine-fault process

unless it receives the initial value from an external process.

o Ideally each process (or at least each non-faulty process) should eventually (i.e., after a finite number of rounds) terminate after reaching a decision, usually a value v’ ∈ V.

o In some exceptional cases we will have to allow a non-faulty process to remain forever undecided (“blocked”).

o A decision should never be changed.

o But a process that takes a decision may (or may not) continue to exchange useful

messages.


4.31

o There are additional consistency and validity conditions that a decision must meet.

o The decision should be unique; otherwise we don’t have an agreement.

o In exceptional cases it is better to have no decision than an inconsistent one.

o We usually care about the decision taken by a stopping-fault process (that later fails).

o We do not care about the decision value taken by a Byzantine faulty process (why?).

o Extensions are possible though, e.g., to decide on approximately the same value.

o To be valid the decision should be non-trivial:

o should reflect the initial values:

initial values of all nodes:

if there are no failures or,

even if there are (e.g., Commit).

initial values of non-faulty nodes.

o we must exclude trivial algorithms such as decide(0) regardless of initial values.


5.31

o Formal Termination Conditions:

o All non-faulty processes eventually decide

[STRONG].

o If there are no failures then all processes eventually

decide [WEAK].

o Formal Agreement Conditions:

o No two processes (even faulty) ever decide on

different values [STRONG].

o No two non-faulty processes ever decide on different

values [WEAK].

o Formal Validity Conditions:

o If all non-faulty processes start with the same initial value v ∈ V, then v is the only possible decision value [STRONG].

o If all processes start with … (rest as above) …

[WEAK].

o Any decision value is the initial value of some

process [STRONG].

o Other more specific conditions (e.g., Commit)

o Explain why each condition with a

[STRONG] label is effectively

stronger than its associated

condition with a [WEAK] label.
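The conditions above can be checked mechanically on the outcome of one finished run. The following is a minimal sketch, assuming a hypothetical `Outcome` record per process (initial value, decision or `None`, and a `faulty` flag); these names are illustrative and not from the textbook. The validity check looks at every decision taken, which matches the stopping-failure reading.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Outcome:
    """Hypothetical per-process summary of one finished run."""
    initial: int              # initial value of the process
    decision: Optional[int]   # decided value, or None if it never decided
    faulty: bool              # True if the process failed during the run


def strong_termination(run: List[Outcome]) -> bool:
    # All non-faulty processes eventually decide.
    return all(p.decision is not None for p in run if not p.faulty)


def strong_agreement(run: List[Outcome]) -> bool:
    # No two processes (even faulty) decide on different values.
    return len({p.decision for p in run if p.decision is not None}) <= 1


def weak_agreement(run: List[Outcome]) -> bool:
    # No two non-faulty processes decide on different values.
    decided = {p.decision for p in run
               if not p.faulty and p.decision is not None}
    return len(decided) <= 1


def weak_validity(run: List[Outcome]) -> bool:
    # If all processes start with the same v, then v is the only decision.
    initials = {p.initial for p in run}
    if len(initials) != 1:
        return True  # hypothesis not met, nothing to check
    (v,) = initials
    return all(p.decision in (None, v) for p in run)
```

A run that passes `strong_agreement` always passes `weak_agreement`, which is one way to see why the [STRONG] label is deserved.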


6.31

INTERLUDE: WHAT DOES “STRONGER” MEAN?

o It is rather straightforward for simple assertions.

o Example:

WEAK: John has a new car

STRONG: John has a Toyota Prius 2010

A SIMILAR SCENARIO IN PROGRAMMING: TYPE HIERARCHIES

Superclass: Car (WEAK)

Subclass: Toyota Prius (STRONG)


7.31

o Telling which one is stronger could be a bit more difficult for more complex propositions

o Consider two propositions, of the form: hypothesis → conclusion (hypothesis ⊂ conclusion)

T1: H1 → C1 (H1 ⊂ C1)

T2: H2 → C2 (H2 ⊂ C2)

o T1 is stronger than T2 (T1 → T2) if

H2 → H1 (H2 ⊂ H1), and

C1 → C2 (C1 ⊂ C2).

o In words, T1 is stronger than T2 if

o H1 is more general than H2, and

o C1 is more specific than C2
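The rule above can be sanity-checked by brute force over truth values. This is a small sketch (the function names are illustrative): whenever H2 → H1, C1 → C2 and T1 = (H1 → C1) all hold, T2 = (H2 → C2) must hold too.

```python
from itertools import product


def implies(a: bool, b: bool) -> bool:
    """Material implication a -> b."""
    return (not a) or b


def stronger_rule_holds() -> bool:
    """Check all 16 truth assignments of H1, C1, H2, C2."""
    for h1, c1, h2, c2 in product([False, True], repeat=4):
        premises = implies(h2, h1) and implies(c1, c2) and implies(h1, c1)
        if premises and not implies(h2, c2):
            return False  # would be a counterexample to T1 -> T2
    return True
```

No counterexample exists: from H2 we get H1, then C1 by T1, then C2.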

A SIMILAR SCENARIO IN PROGRAMMING

DELEGATE COMPATIBILITY

o In this case, which of the following two methods seems “better”, i.e., can also play the role of the other?

M1: H1 → C1

M2: H2 → C2

o M1 is “better” than M2 if it is:

o contravariant in parameter types: H1 ⊇ H2, and

o covariant in return types: C1 ⊆ C2

[Figure: H2 nested inside H1; C1 nested inside C2.]
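A possible illustration of this compatibility rule, reusing the Car / Toyota Prius hierarchy from the interlude. The method names `m1`, `m2` and `call_like_m2` are hypothetical. Python does not enforce the annotations at run time; the point is that `typing.Callable` is documented as contravariant in its argument types and covariant in its return type, so a static checker should accept `m1` where an `m2`-shaped method is expected.

```python
from typing import Callable


class Car:
    pass


class ToyotaPrius(Car):
    pass


def m1(vehicle: Car) -> ToyotaPrius:
    # Accepts the more general parameter, returns the more specific result.
    return ToyotaPrius()


def m2(vehicle: ToyotaPrius) -> Car:
    # Accepts only the specific parameter, promises only the general result.
    return Car()


def call_like_m2(method: Callable[[ToyotaPrius], Car]) -> Car:
    # Any method usable in the role of m2 can be passed here.
    return method(ToyotaPrius())


result_from_m1 = call_like_m2(m1)  # m1 plays the role of m2
result_from_m2 = call_like_m2(m2)
```

The reverse substitution (passing `m2` where an `m1`-shaped method is expected) would not be safe, since `m2` may be handed a `Car` that is not a `ToyotaPrius`.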


8.31

o Example: which one is stronger?

1) All non-faulty processes eventually decide.

2) If there are no failures then all processes eventually decide.

o Let’s rewrite these conditions, a bit more formally:

1) [H1] With or without failures, [C1] all non-faulty processes eventually decide.

2) [H2] If there are no failures then [C2] all non-faulty processes eventually decide.

o In this case, condition (1), H1 → C1, is stronger than condition (2), H2 → C2.

o Any algorithm that satisfies spec (1) will automatically satisfy spec (2).

H1: With or without failures

H2: If there are no failures

C1 = C2 (here): all non-faulty processes eventually decide


9.31

STOPPING FAILURES : N > F

Termination: All non-faulty

processes eventually decide

[STRONG].

Agreement: No two processes

(even faulty) ever decide on

different values [STRONG].

Validity: If all processes start with the same initial value v ∈ V, then v is the only possible decision value [WEAK].

(Or, any decision value is the initial value of some process [STRONG]. The W and S conditions are equivalent in the binary case.)

Diagrams for the binary case

Initial values           Final: Non-faulty   Final: Decide before fail   Final: Fail w/- deciding
all 1                    1                   1                           *
all 0                    0                   0                           *
mixed (exists 1 and 0)   v                   v                           *

Legend

o v = all v (same v for all)

o v must be one of the initial values for strong validity, which holds automatically in the binary case.


10.31

PROCESS STOPPING FAILURES : TERMINATION, AGREEMENT, VALIDITY – A BIRD’S EYE VIEW

Process   Initial value   Outcome
#1        v1              decision v
#2        v2              decision v
#3        v3              fails after decision v
#4        v4              fails (w/- a decision)
#5        v5              blocks (w/- a decision) (weak termination)

o Validity: initial v → final v

o Termination: deciding on something, here v

o Agreement: same decision v


11.31

BYZANTINE AGREEMENT : N > 3F

Termination: All non-faulty processes eventually decide [STRONG].

Agreement: No two non-faulty processes ever decide on different values [WEAK, but we can’t do better in this case].

Validity: If all non-faulty processes start with the same initial value v ∈ V, then v is the only possible decision value [STRONG].

(There is also a WEAK version but we don’t consider it here.)

Diagrams for the binary case

              Initial          Final
Non-faulty    all 1            1
Biz-Faulty    *                *

Non-faulty    all 0            0
Biz-Faulty    *                *

Non-faulty    mixed (1, 0)     v
Biz-Faulty    *                *

* = orc, don’t care


12.31

BYZANTINE AGREEMENT EXAMPLES: N = 4, F = 1

Initial choices (#1-4)   Final decisions (#1-4)    Notes
0 0 0 0                  0 0 0 0                   REQUIRED
0 0 0 1                  0 0 0 0                   majority rule? no, it is REQUIRED (why?)
0 0 1 1                  v v v v                   depending on an EIG parameter v0
0 1 1 1                  1 1 1 1                   majority rule? no, it is REQUIRED (why?)
1 1 1 1                  1 1 1 1                   REQUIRED
* 0 0 0                  * 0 0 0                   REQUIRED
* 0 0 1                  * 0 0 0, or * 1 1 1       depending on an EIG parameter v0 and the orc
* 0 1 1                  * 0 0 0, or * 1 1 1       depending on an EIG parameter v0 and the orc
* 1 1 1                  * 1 1 1                   REQUIRED

v = unspecified value (but consistent)

* = orc, don’t care

o Some decisions depend on an internal parameter of the specific algorithm, here EIG.

o In EIG, this internal parameter, aka v0, is used to break ties, but tie-breaking rules may differ across algorithms and implementations.

o v0 is also used as a replacement for illegal/missing messages (could also be different)
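One possible reading of the REQUIRED rows, offered as a sketch rather than as the official answer to the “why?” questions: a value v is forced whenever the actual orcs plus the elves that start with a value other than v number at most f, because the elves that agree on v cannot tell the dissenters apart from orcs and strong validity then applies. The function name `forced_decision` and the use of `None` for an orc are assumptions made for this illustration.

```python
from typing import List, Optional


def forced_decision(initial: List[Optional[int]], f: int = 1) -> Optional[int]:
    """Return the binary value that validity forces, or None if none is forced.

    initial[i] is 0 or 1 for an elf and None for an orc (the '*' rows).
    """
    orcs = sum(v is None for v in initial)
    for v in (0, 1):
        # Elves whose initial value differs from the candidate v.
        dissenters = sum(x is not None and x != v for x in initial)
        # The other elves cannot rule out that every dissenter is an orc.
        if orcs + dissenters <= f:
            return v
    return None  # outcome depends on the algorithm (e.g. v0) and on the orc
```

For example, `forced_decision([0, 0, 0, 1])` gives 0 even though nobody is faulty, while `forced_decision([None, 0, 0, 1])` gives `None`, matching the rows of the table.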


13.31

STOPPING FAILURES WITH FLOODSET - example (description in text).

Assume n=3 processes, with initial values (stored as a set):

o #1: {0}

o #2: {0}

o #3: {1}

Process #3 will fail, f=1.

Assume the following communication pattern:

o Process #3 fails in round 1,

o after sending its message 1 to #1,

o before sending the same message 1 to #2,

o before taking any decision.

Sets held at the end of each round:

Process   Round #1   Round #2
#1        {0,1}      {0,1}
#2        {0}        {0,1}
#3        {1}        {1} (failed)

At the end of round #2, the non-failed processes #1 and #2 agree:

o if their set holds a single value v, then on v;

o if it holds mixed values, then either on a predefined value v0,

o or on the min value (here 0), which gives strong validity.

f+1=2 rounds
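The example above can be replayed with a small simulation. This is a minimal sketch of FloodSet under stated assumptions: synchronous rounds, and a hypothetical `crash` map that says in which round a process stops and which receivers its last messages still reach. The function name and parameters are illustrative, not from the textbook. It uses the predefined-value rule (v0) for mixed sets.

```python
from typing import Dict, Set, Tuple


def floodset(initial: Dict[int, int], f: int,
             crash: Dict[int, Tuple[int, Set[int]]], v0: int = 0):
    """Run f+1 rounds; return (decisions of non-failed processes, final sets)."""
    sets = {p: {v} for p, v in initial.items()}
    alive = set(initial)
    for rnd in range(1, f + 2):
        inbox = {p: set() for p in initial}
        for sender in sorted(alive):
            if sender in crash and crash[sender][0] == rnd:
                receivers = crash[sender][1]          # partial broadcast
            else:
                receivers = set(initial) - {sender}   # full broadcast
            for r in receivers:
                inbox[r] |= sets[sender]
        # Processes crashing in this round stop for good.
        alive -= {p for p in alive if p in crash and crash[p][0] == rnd}
        for p in alive:
            sets[p] |= inbox[p]
    decisions = {p: (next(iter(sets[p])) if len(sets[p]) == 1 else v0)
                 for p in sorted(alive)}
    return decisions, sets


# Slide scenario: #3 fails in round 1 after reaching only #1.
decisions, final_sets = floodset({1: 0, 2: 0, 3: 1}, f=1, crash={3: (1, {1})})
```

After round 1 process #2 still holds {0}; it only learns the value 1 in round 2 through #1, which is why f+1 rounds are used.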


14.31

STOPPING FAILURES WITH EIGSTOP (description in textbook).

EIG Tree for Process #1

0

0 0 1

0 - 0 - 1 -

EIG Tree for Process #2

0

0 0 -

0 - 0 - 1 -

f+1=2 rounds

o Tree nodes at level 0: λ (the root)

o Tree nodes at level 1: 1, 2, 3

o Tree nodes at level 2: 1.2, 1.3, 2.1, 2.3, 3.1, 3.2

Study more about this from the text.


15.31

BYZANTINE AGREEMENT VS TMR (more in text).

[Figure: two TMR layouts, one where Modules 1, 2 and 3 feed a single Comparator, and one where they feed three Comparators.]

o In Byz context: Non-faulty modules may

well generate different initial values.

o In TMR: We expect that all non-faulty

modules generate the same initial value.

Only a faulty module will generate a

different initial value.

o Can we trust the comparators?

BYZANTINE AGREEMENT WITH AUTHENTICATION (more in text).

o Uses digital signatures that are practically impossible to forge.

o All initial values are signed by a trusted external source.

o Can we trust the external source? And how secure are the keys?

o All messages are signed by the sender.

o Any forgery is likely to be immediately noticed and forged messages are treated as null.

o Can use a stopping algorithm such as EIGStop.


16.31

INFORMAL IMPOSSIBILITY EXAMPLE n=3, f=1 (description in textbook).

o According to the rules, we should eventually terminate with the following agreements between the

elf processes:

o Left: #2 and #3 should decide 0.

o Right: #1 and #2 should decide 1.

o Middle: #1 and #3 should reach a common decision

(any, but must be consistent, i.e., either both 0, or both 1).

o The orc processes can here use a perfect strategy to disrupt any possible agreement:

o Left: At round 1, #1 pretends a choice of 1, then relays by lying consistently

o Right: At round 1, #3 pretends a choice of 0, then relays by lying consistently

o Middle: At round 1, #2 pretends a choice of 0 to #3 and of 1 to #1, then relays correctly

Scenarios (process:initial choice, x = orc):

o Left: 1:x 2:0 3:0

o Middle: 1:1 2:x 3:0

o Right: 1:1 2:1 3:x

[Figure colour codes: red = elf, blue = orc.]


17.31

o Suppose that in round 1 the processes send each other their initial choice values.

o Process #3 cannot differentiate between the left and middle cases and should therefore

take the same decision in both cases, i.e., 0.

o Process #1 cannot differentiate between the right and middle cases and should therefore

take the same decision in both cases, i.e., 1.

o No common decision is possible for the middle case

o Conclusion: 1 round is not enough…

[Figure: the Left, Middle and Right scenarios, annotated with the round-1 messages.]


18.31

o Consider that on the 2nd round the elves correctly relay to each other the value received from

the other node on the 1st round (what they have witnessed)

(e.g., #2 sends to #3 statements such as “#1 told me that his choice is 1”).

o Process #3 still cannot differentiate between the left and middle cases ….

o Process #1 still cannot differentiate between the right and middle cases ….

o No common decision is possible for the middle case

o Conclusion: 2 rounds are not enough.

o Such arguments can continue for any number of rounds …

[Figure: the Left, Middle and Right scenarios, annotated with the round-1 and round-2 messages.]

o No number of rounds seems enough to solve Byz agreement for n=3, f=1.

o Byz agreement is quite difficult…


19.31

NUMBER OF PROCESSES FOR BYZ – LEMMA 6.26

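[Figure: a hypothetical protocol for n=3, f=1, with processes 1, 2, 3.]

[Figure: thought experiment with two copies of each process: 1, 2, 3 with initial value 0, and 1’, 2’, 3’ with initial value 1.]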

Conclusion: no algorithm for n=3, f=1.


20.31

THEOREM 6.27

No solution for 2 ≤ n ≤ 3f

[Figure: the case n = 2.]

For 3 ≤ n ≤ 3f:

o 3 “subnets” with at most f processes in each

o we assume that there is an algorithm that can solve the

Byz agreement for such an n, and we construct an

algorithm that can solve the problem for 3 processes,

o contradiction (with lemma 6.26)


21.31

AN EIG TREE WITH N=4 (# OF PROCESSES) AND L=2 (LEVELS)

o Level #1 : 1 group with N=4 siblings

o Level #2 : 4 groups with N-1=3 siblings each

o Level #L : each group has N-L+1 siblings

o For Byz agreement, L=F+1, here F=1

o Observe the node labelling scheme

o The nodes will be filled level-by-level

o Top-down, by L messaging rounds

o Bottom-up, at the end

o Consider that process #1 is “faulty”

o but #2, #3, #4 are “non-faulty”

o Observe the distribution of labels ending

in one of 2,3,4

o a majority at leaves, if N-L+1>2F (i.e., N>3F when L=F+1)

o at least 1 along each path, if L>F

therefore at least 1 “cut” across

o These arguments will play a role in the

proof

λ
  1: 1.2, 1.3, 1.4
  2: 2.1, 2.3, 2.4
  3: 3.1, 3.2, 3.4
  4: 4.1, 4.2, 4.3
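The labelling scheme can be generated mechanically: level k holds every sequence of k distinct process ids. The sketch below (function names are illustrative) builds the levels for N=4, L=2 and checks the two label-distribution properties for a given faulty set. Labels are Python tuples, with the empty tuple standing for the root λ.

```python
from itertools import permutations
from typing import Dict, List, Set, Tuple

Label = Tuple[int, ...]


def eig_levels(n: int, depth: int) -> List[List[Label]]:
    """Level k lists all labels made of k distinct ids from 1..n."""
    pids = range(1, n + 1)
    return [list(permutations(pids, k)) for k in range(depth + 1)]


def honest_on_every_path(leaves: List[Label], faulty: Set[int]) -> bool:
    """Each root-to-leaf path contains at least one non-faulty id."""
    return all(any(p not in faulty for p in leaf) for leaf in leaves)


def honest_majority_in_leaf_groups(leaves: List[Label], faulty: Set[int]) -> bool:
    """In each group of sibling leaves, most labels end in a non-faulty id."""
    groups: Dict[Label, List[int]] = {}
    for leaf in leaves:
        groups.setdefault(leaf[:-1], []).append(leaf[-1])
    return all(2 * sum(p not in faulty for p in g) > len(g)
               for g in groups.values())


levels = eig_levels(4, 2)  # λ, then 4 nodes, then 12 nodes
```

With N=3 and one faulty process the sibling groups at the leaves have only two members, so the majority check fails, which is consistent with the n=3, f=1 impossibility example.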


22.31

AN EIG TREE WITH N=7 (# OF PROCESSES) AND L=3 (LEVELS)

o Level #1 : 1 group with N=7 siblings

o Level #2 : 7 groups with N-1=6 siblings each

o Level #3 : 7*6 groups with N-2=5 siblings each

o Level #L : each group has N-L+1 siblings

o For Byz agreement, L=F+1, here F=2

o Observe the node labelling scheme (EIG tree fragment on next slide).

o Consider that processes #1 and #2 are “faulty”

o but #3, #4, #5, #6, #7 are “non-faulty”

o Observe the distribution of labels ending in one of 3,4,5,6,7

o similar to the previous EIG tree with N=4, L=2.


23.31

Siblings per group: 7 at level 1, 6 at level 2, 5 at level 3.

λ
  1: 1.2, 1.3, ...
  2: 2.1, 2.3, ...
    2.3: 2.3.1, 2.3.4, 2.3.5, 2.3.6, 2.3.7
  3: 3.1, 3.2, ...
  4: 4.1, 4.2, ...
  ...


24.31

BYZANTINE AGREEMENT WITH EIG

(complete description in textbook).

λ, -, -

1, -, -   2, -, -   3, -, -   4, -, -

21, -, -   23, -, -   24, -, -

How val() are filled:

o val(2) is what 2 said.

o val(21) is what 1 said that 2 said

o If 1 is lying about 2 in val(21),

then 3 & 4 will mask this in

val(23) & val(24).

o Invalid messages are treated as null and replaced by v0.

This is a fragment of the

EIG tree structure:

o n=4 processes.

o f=1 faults.

o n>3f essential.

o f+1 levels.

For simplicity, in this

example the faulty #1 will

only send 0 on all rounds.

2:1

1:x 3:1

4:1

Each node has two attributes:

o val() = value, top-down, as

received from the messages.

o newval() = computed new

value, bottom-up, in which

the failures are masked by a

local majority voting

procedure ( or, v0 if there is

no majority).

3f+1 = f + 2f + 1

You need two elves for each orc plus one hobbit


25.31

THE EIG PROTOCOL – COMPUTING THE TOP-DOWN val() ATTRIBUTE

Assume that x does not contain i, j:

Process i (x:v) → Process j (xi:v) → Process k (xij:v)


26.31

o Here follows process #2’s own copy of the EIG tree

o after f+1=2 rounds.

o The copies at processes #3 & #4 are similar,

o and the one at the faulty process #1 can be discarded.

Each node is shown as: label, val(), newval()

λ, 1, -

1, 0, -   2, 1, -   3, 1, -   4, 1, -

12, 0, -   13, 0, -   14, 0, -

21, 0, -   23, 1, -   24, 1, -

31, 0, -   32, 1, -   34, 1, -

41, 0, -   42, 1, -   43, 1, -
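The top-down filling of val() can be replayed in a few lines. This is a sketch under simplifying assumptions taken from the example: every faulty process sends the same constant (`lie`) in every message, and all messages arrive. The function name `run_eig_rounds` and its parameters are hypothetical. Labels are tuples, so the slide's node 21 is `(2, 1)` and the root λ is `()`.

```python
from itertools import permutations
from typing import Dict, Set, Tuple

Label = Tuple[int, ...]


def run_eig_rounds(initial: Dict[int, int], faulty: Set[int],
                   lie: int, rounds: int) -> Dict[int, Dict[Label, int]]:
    """Return the val() tree of every non-faulty process after `rounds` rounds."""
    pids = sorted(initial)
    val: Dict[int, Dict[Label, int]] = {p: {(): initial[p]} for p in pids}
    for rnd in range(1, rounds + 1):
        received: Dict[int, Dict[Label, int]] = {p: {} for p in pids}
        # Sender j relays val_j(x) for every label x of the previous level
        # that does not already contain j; receivers store it under x.j.
        for x in permutations(pids, rnd - 1):
            for j in pids:
                if j in x:
                    continue
                sent = lie if j in faulty else val[j][x]
                for p in pids:
                    received[p][x + (j,)] = sent
        for p in pids:
            val[p].update(received[p])
    return {p: val[p] for p in pids if p not in faulty}


# Slide scenario: #2, #3, #4 start with 1; the faulty #1 only sends 0.
trees = run_eig_rounds({1: 0, 2: 1, 3: 1, 4: 1}, faulty={1}, lie=0, rounds=2)
tree_2 = trees[2]
```

Here `tree_2[(2, 1)]` is what #1 said that #2 said, and `tree_2[(2, 3)]` is what #3 said that #2 said.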


27.31

o Next we compute newval() bottom-up using a local majority rule.

o To mask failures.

o Finally, at the top-level node, process #2’s decision is 1.

o The trees at processes #3 & #4 will be similar,

o And we don’t care much about process #1’s decision.

Each node is shown as: label, val(), newval()

λ, 1, 1

1, 0, 0   2, 1, 1   3, 1, 1   4, 1, 1

12, 0, 0   13, 0, 0   14, 0, 0

21, 0, 0   23, 1, 1   24, 1, 1

31, 0, 0   32, 1, 1   34, 1, 1

41, 0, 0   42, 1, 1   43, 1, 1
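The bottom-up pass can be sketched as a short recursion: a leaf keeps its val(), and an inner node takes the strict majority of its children's newval(), falling back to v0 when there is none. The tree below is copied from process #2's example; the function name and the tuple encoding of labels are choices made for this illustration.

```python
from collections import Counter
from typing import Dict, List, Tuple

Label = Tuple[int, ...]


def newval(label: Label, val: Dict[Label, int], pids: List[int],
           depth: int, v0: int = 0) -> int:
    """Compute newval(label) bottom-up with a strict-majority rule."""
    if len(label) == depth:
        return val[label]  # leaves: newval = val
    kids = [newval(label + (j,), val, pids, depth, v0)
            for j in pids if j not in label]
    top, count = Counter(kids).most_common(1)[0]
    return top if 2 * count > len(kids) else v0  # no strict majority -> v0


# Process #2's val() tree after 2 rounds (faulty #1 always sent 0).
val_2: Dict[Label, int] = {
    (): 1,
    (1,): 0, (2,): 1, (3,): 1, (4,): 1,
    (1, 2): 0, (1, 3): 0, (1, 4): 0,
    (2, 1): 0, (2, 3): 1, (2, 4): 1,
    (3, 1): 0, (3, 2): 1, (3, 4): 1,
    (4, 1): 0, (4, 2): 1, (4, 3): 1,
}
decision = newval((), val_2, [1, 2, 3, 4], depth=2)
```

The lie stored at (2, 1) is outvoted by (2, 3) and (2, 4), so newval at node 2 is 1, and the root then sees three 1s against a single 0.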


28.31

BYZANTINE QUIZ

For each elf tree, replace W, X & Y, s.t. the final decision ? at λ becomes

1. 0

2. 1

Assume that this is the EIG tree at a non-faulty elf process #2, #3, #4, v0 = 0, and #1 is a Byz orc.

Val() could be distinct at each process.

Val() can be changed by the orc, but will still be common.

Why shouldn’t we care about the Z values?

λ: -: ?

1: W: Y 2: 0: 0 3: 1: 1 4: 1: 1

12: 0: 0

13: 1: 1

14: X: X

21: Z: Z

23: 0: 0

24: 0: 0

31: Z’:Z’

32: 1: 1

34: 1: 1

41: Z”:Z”

42: 1: 1

43: 1: 1


29.31

final decision = 0

o W = 0, for #2 (inferred from 12)


o W = 0 = X, for #4 (to get Y = 0)

λ: -: 0

1: W: 0 2: 0: 0 3: 1: 1 4: 1: 1

12: 0: 0

13: 1: 1

14: 0: 0

21: Z: Z

23: 0: 0

24: 0: 0

31: Z’:Z’

32: 1: 1

34: 1: 1

41: Z”:Z”

42: 1: 1

43: 1: 1

#1 #2

#3 #4

0

0 1


30.31

final decision = 1



o W = 1 = X, for #4 (to get Y = 1)

λ: -: 1

1: W: 1 2: 0: 0 3: 1: 1 4: 1: 1

12: 0: 0

13: 1: 1

14: 1: 1

21: Z: Z

23: 0: 0

24: 0: 0

31: Z’:Z’

32: 1: 1

34: 1: 1

41: Z”:Z”

42: 1: 1

43: 1: 1

#1 #2

#3 #4

0

1 1


31.31

BYZANTINE QUIZ – 4 U

o Find a scenario in which the Byz orcs could flip the final decision by changing only some of their 2nd round messages.

o The proposal with the smallest number of such 2nd round messages wins!
