An evaluation of ring-based algorithms for the Eventually Perfect failure detector class
Joachim Wieland, Mikel Larrea, Alberto Lafuente
The University of the Basque Country
An evaluation of ring-based algorithms for the Eventually Perfect failure detector class - PDP 2007 2
Contents
Motivation
Unreliable failure detectors
System model
Communication-efficient implementations of ◇P
A non communication-efficient approach
Performance evaluation
Conclusion
Motivation
FLP impossibility result (Fischer, Lynch, and Paterson): Consensus cannot be solved deterministically in an asynchronous system subject to even a single process crash
Possibility result (Chandra and Toueg): Consensus can be solved in an asynchronous system subject to failures if the system is augmented with an unreliable failure detector
Motivation (2)
Evaluate different ring-based algorithms for ◇P
Two kinds of performance parameters:
– Communication efficiency
– Quality of service
Two families of algorithms:
– Communication-efficient ◇P + some optimizations
– Non communication-efficient ◇Q + transformation to ◇P
  modular approach designed with quality of service in mind
Unreliable failure detectors
Distributed oracle that provides (possibly incorrect) hints about the operational status of other processes
Abstractly characterized in terms of two properties: completeness and accuracy
– Completeness characterizes the degree to which crashed processes are suspected by correct processes
– Accuracy characterizes the degree to which correct processes are not suspected, i.e., restricts the false suspicions that a failure detector can make
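To make the two properties concrete, the following minimal Python sketch checks them on a snapshot of suspicion lists taken after the detector has stabilized. The function names and the snapshot representation are assumptions made for illustration; the "strong" and "weak" variants follow the usual Chandra and Toueg definitions rather than anything stated on these slides.

```python
def strong_completeness(suspects, crashed, correct):
    """Every correct process suspects every crashed process."""
    return all(crashed <= suspects[p] for p in correct)


def weak_completeness(suspects, crashed, correct):
    """Every crashed process is suspected by at least one correct process."""
    return all(any(c in suspects[p] for p in correct) for c in crashed)


def strong_accuracy(suspects, correct):
    """No correct process is suspected by any correct process."""
    return all(not (suspects[p] & correct) for p in correct)


# Hypothetical snapshot: p5 has crashed, the other five processes are correct.
correct = {"p1", "p2", "p3", "p4", "p6"}
crashed = {"p5"}

# Only p6 suspects p5; nobody suspects a correct process.
snapshot = {p: set() for p in correct}
snapshot["p6"] = {"p5"}

results = {
    "weak completeness": weak_completeness(snapshot, crashed, correct),
    "strong completeness": strong_completeness(snapshot, crashed, correct),
    "strong accuracy": strong_accuracy(snapshot, correct),
}
```

In this snapshot only one correct process suspects the crashed one, so the weak completeness check holds while the strong one does not.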
System model
Finite set of n processes Π = {p1, p2, ..., pn} that communicate only by message-passing
Every pair of processes is connected by two unidirectional and reliable communication links
Processes can fail by crashing. Once a process crashes, it does not recover
Processes are arranged in a logical ring
Partially synchronous system
Communication-efficient implementations of ◇P
A basic communication-efficient algorithm (LLW0):
– each process p sends heartbeats to the processes in the ring between itself (excluded) and its successor succ_p (included)
– p monitors its predecessor pred_p by hearing heartbeats from it
  upon timeout on pred_p, p suspects it and monitors the predecessor of pred_p
– if p erroneously suspected q, p starts monitoring q again
– processes propagate the list of suspicions around the ring, piggybacked in the heartbeats
  upon reception of the list of suspicions from pred_p, p builds a new list by merging the list received with its local suspicions. Then, p sets succ_p to its nearest non-suspected process
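The monitoring and merging steps above can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the authors' implementation: the class name `RingProcess`, its methods, and the rule that a heartbeat from a locally suspected process clears that suspicion are all hypothetical, and timers and real message passing are left out.

```python
class RingProcess:
    """One process of an LLW0-style ring detector (illustrative sketch only)."""

    def __init__(self, pid, ring):
        self.pid, self.ring = pid, list(ring)
        self.local = set()      # suspicions raised by this process's own timeouts
        self.suspected = set()  # merged suspicion list (the detector's output)
        self.pred = self._nearest(-1, self.local)
        self.succ = self._nearest(+1, self.suspected)

    def _nearest(self, step, excluded):
        """Nearest process in ring direction `step` (+1 or -1) not in `excluded`."""
        i, n = self.ring.index(self.pid), len(self.ring)
        for k in range(1, n):
            q = self.ring[(i + step * k) % n]
            if q not in excluded:
                return q
        return self.pid  # every other process is excluded

    def on_timeout(self):
        """Timeout on pred: suspect it and monitor the process before it."""
        self.local.add(self.pred)
        self.suspected.add(self.pred)
        self.pred = self._nearest(-1, self.local)

    def on_heartbeat(self, sender, received):
        """Heartbeat from `sender`, carrying the suspicion list `received`."""
        if sender in self.local:
            # Assumption: hearing from a suspected process reveals the mistake.
            self.local.discard(sender)
            self.pred = self._nearest(-1, self.local)
        if sender == self.pred:
            # Merge the received list with local suspicions, then fix succ.
            self.suspected = (set(received) | self.local) - {self.pid}
            self.succ = self._nearest(+1, self.suspected)


ring = ["p1", "p2", "p3", "p4", "p5", "p6"]
procs = {p: RingProcess(p, ring) for p in ring}

procs["p6"].on_timeout()  # p5 has crashed, so p6 times out on it
# The list travels with the heartbeats: p6 -> p1 -> p2 -> p3 -> p4.
for sender, receiver in [("p6", "p1"), ("p1", "p2"), ("p2", "p3"), ("p3", "p4")]:
    procs[receiver].on_heartbeat(sender, procs[sender].suspected)
```

With p5 crashed, the run ends with p6 monitoring p4 and p4 using p6 as its successor, which is the situation drawn in the LLW0 figure.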
Communication-efficient implementations of ◇P (2)
[Figure (LLW0): logical ring p1, p2, p3, p4, p5, p6. Initial values: pred1 = p6, succ1 = p2; pred2 = p1, succ2 = p3; pred3 = p2, succ3 = p4; pred4 = p3, succ4 = p5; pred5 = p4, succ5 = p6; pred6 = p5, succ6 = p1. Updated values: pred6 = p4 and succ4 = p6, with the suspicion list { p5 } shown circulating around the ring.]
Communication-efficient implementations of ◇P (3)
Providing a faster stabilization of the ring (LLW1): sending sporadic one-to-one messages
– upon timeout on pred_p, p sends (START, p) to pred(pred_p). Upon reception of this message, pred(pred_p) sets its successor to p
– when p learns that it is erroneously suspecting q, p sends (START, q) to p's current pred_p. Upon reception of this message, p's current pred_p sets its successor to q
Broadcasting suspicions to reduce the detection latency (LLW2): sending sporadic one-to-all messages
– upon timeout on pred_p, p sends (SUSPICION, pred_p) to all processes
– when a process p is being erroneously suspected, it sends (REFUTATION, p) to all processes
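The message rules of LLW1 and LLW2 can be sketched as event handlers that return the messages to send. This is a hedged Python illustration: the function names, the dictionary-based state, and the delivery loop are hypothetical, and it assumes that LLW2 keeps the START messages of LLW1 (the LLW2 figure shows both kinds of message) and that a process discovers it is suspected when it receives a SUSPICION naming itself.

```python
def new_state(ring):
    """Initial pred/succ pointers and empty suspicion lists for every process."""
    n = len(ring)
    return {p: {"pred": ring[i - 1], "succ": ring[(i + 1) % n], "suspected": set()}
            for i, p in enumerate(ring)}


def on_timeout(p, state, ring, broadcast=False):
    """p times out on pred_p. Returns a list of (destination, message) pairs."""
    s = state[p]
    q = s["pred"]
    s["suspected"].add(q)
    s["pred"] = ring[ring.index(q) - 1]          # now monitor pred(pred_p)
    out = [(s["pred"], ("START", p))]            # LLW1: one-to-one message
    if broadcast:                                # LLW2: one-to-all message
        out += [(r, ("SUSPICION", q)) for r in ring if r != p]
    return out


def on_erroneous_suspicion(p, q, state):
    """p learns that q, which it suspected, is alive (LLW1, second rule)."""
    s = state[p]
    s["suspected"].discard(q)
    out = [(s["pred"], ("START", q))]            # current pred_p must point at q
    s["pred"] = q                                # p monitors q again
    return out


def deliver(dest, msg, state, ring):
    """Handle one message at dest. Returns any messages dest sends in reply."""
    kind, q = msg
    s = state[dest]
    if kind == "START":
        s["succ"] = q
    elif kind == "SUSPICION":
        if q == dest:                            # dest itself is being suspected
            return [(r, ("REFUTATION", dest)) for r in ring if r != dest]
        s["suspected"].add(q)
    elif kind == "REFUTATION":
        s["suspected"].discard(q)
    return []


def run(queue, state, ring, crashed):
    """Deliver queued messages in order; crashed processes receive nothing."""
    queue = list(queue)
    while queue:
        dest, msg = queue.pop(0)
        if dest not in crashed:
            queue += deliver(dest, msg, state, ring)


ring = ["p1", "p2", "p3", "p4", "p5", "p6"]
state = new_state(ring)
# LLW2 scenario of the figure: p5 has crashed and p6 times out on it.
run(on_timeout("p6", state, ring, broadcast=True), state, ring, crashed={"p5"})
```

After the run, p4 has received (START, p6) and every correct process has learned the suspicion of p5 from the broadcast, without waiting for the list to travel around the ring.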
Communication-efficient implementations of ◇P (4)
[Figure (LLW1): logical ring p1, p2, p3, p4, p5, p6 with the same initial pred and succ values as the ring neighbors. Updated values: pred6 = p4 and succ4 = p6, with the suspicion list { p5 } shown circulating around the ring and the message (START, p6).]
Communication-efficient implementations of ◇P (5)
[Figure (LLW2): logical ring p1, p2, p3, p4, p5, p6 with the same initial pred and succ values as the ring neighbors. Updated values: pred6 = p4 and succ4 = p6, with the suspicion list { p5 } shown circulating around the ring and the messages (SUSPICION, p5) and (START, p6).]
A non communication-efficient approach
A basic ring-based algorithm implementing ◇Q:
– identical monitoring schema to LLW0 + …
– … the list of suspicions does not circulate around the ring
– … every process p periodically sends (START, p) to pred_p. When a process p receives (START, new_succ), it sets succ_p to new_succ
Providing a faster stabilization of the ring
Transforming ◇Q into ◇P:
– propagating suspicions through the ring (LLWQP1)
– broadcasting suspicions to reduce the detection latency (LLWQP2)
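The modular structure described above can be sketched as two layers: a basic layer that only maintains local suspicions and ring pointers, and a transformation layer that spreads suspicions along the ring in the style of LLWQP1. The Python below is an illustration under stated assumptions: it runs in synchronous rounds, all function and field names are hypothetical, and the merge rule (received list united with local suspicions) is borrowed from the LLW0 description rather than taken from the LLWQP1 specification.

```python
def new_state(ring):
    """Ring pointers, local suspicions, and the transformed output per process."""
    n = len(ring)
    return {p: {"pred": ring[i - 1], "succ": ring[(i + 1) % n],
                "local": set(), "output": set()}
            for i, p in enumerate(ring)}


def q_timeout(p, st, ring):
    """Basic layer: p suspects pred_p locally and monitors the process before it."""
    st[p]["local"].add(st[p]["pred"])
    st[p]["pred"] = ring[ring.index(st[p]["pred"]) - 1]


def q_period(st, crashed):
    """Basic layer: every live p sends (START, p) to pred_p, which sets succ to p."""
    for p in st:
        target = st[p]["pred"]
        if p not in crashed and target not in crashed:
            st[target]["succ"] = p


def propagate_round(st, crashed):
    """Transformation layer: pass the output list to succ and merge it there."""
    live = [p for p in st if p not in crashed]
    for p in live:
        st[p]["output"] |= st[p]["local"]
    sent = {p: set(st[p]["output"]) for p in live}   # lists sent this round
    for p, lst in sent.items():
        q = st[p]["succ"]
        if q not in crashed:
            st[q]["output"] = lst | st[q]["local"]


ring = ["p1", "p2", "p3", "p4", "p5", "p6"]
crashed = {"p5"}
st = new_state(ring)

q_timeout("p6", st, ring)      # p6 times out on the crashed p5
q_period(st, crashed)          # (START, p6) reaches p4, so succ4 becomes p6
local_view = {p: set(st[p]["local"]) for p in ring if p not in crashed}

for _ in range(len(ring)):     # enough rounds for the list to go around the ring
    propagate_round(st, crashed)
```

Before propagation only p6 suspects p5 (`local_view`); after the rounds every correct process does, which is the difference between the basic layer's output and the transformed one in this sketch.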
A non communication-efficient approach (2)
[Figure (LLWQ): logical ring p1, p2, p3, p4, p5, p6 with the same initial pred and succ values as the ring neighbors. Updated values: pred6 = p4 and succ4 = p6; no suspicion list is shown on the ring.]
Performance evaluation
[Figures: six slides of performance evaluation results; only the slide titles survive.]
Conclusion
Evaluation of two families of heartbeat-based, ring-based algorithms implementing ◇P
Communication-efficient family
– n links are eventually used
Non communication-efficient family
– modular approach
– n + C links are eventually used (with 1 ≤ C ≤ n)
– better quality of service