34
Distributed Systems Failure detec2on & Leader Elec2on Rik Sarkar University of Edinburgh Fall 2016

Distributed Systems Failure detec2on & Leader Elec2on · • iIf p gets both message back, it is the leader of its 2 neighborhood, and proceeds to phase i+1 Distributed Systems, Edinburgh,

  • Upload
    others

  • View
    3

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Distributed Systems Failure detec2on & Leader Elec2on · • iIf p gets both message back, it is the leader of its 2 neighborhood, and proceeds to phase i+1 Distributed Systems, Edinburgh,

DistributedSystems

Failuredetec2on&LeaderElec2onRikSarkar

UniversityofEdinburgh

Fall2016

Page 2: Distributed Systems Failure detec2on & Leader Elec2on · • iIf p gets both message back, it is the leader of its 2 neighborhood, and proceeds to phase i+1 Distributed Systems, Edinburgh,

Failures•  Howdoweknowthatsomethinghasfailed?•  Let’sseewhatwemeanbyfailed:

•  Modelsoffailure:1.  Assumenofailures2.  Crashfailures:Processmayfail/crash3.  Messagefailures:Messagesmaygetdropped4.  Linkfailures:acommunica2onlinkstopsworking5.  Somecombina2onsof2,3,46.  Morecomplexmodelscanhaverecoveryfromfailures7.  Arbitraryfailures:computa2on/communica2onmaybe

erroneous

DistributedSystems,Edinburgh,2016 2

Page 3: Distributed Systems Failure detec2on & Leader Elec2on · • iIf p gets both message back, it is the leader of its 2 neighborhood, and proceeds to phase i+1 Distributed Systems, Edinburgh,

Failuredetectors

•  Detec2onofacrashedprocess–  (notoneworkingerroneously)

•  Amajorchallengeindistributedsystems•  Afailuredetectorisaprocessthatrespondstoques2onsaskingwhetheragivenprocesshasfailed– Afailuredetectorisnotnecessarilyaccurate

DistributedSystems,Edinburgh,2016 3

Page 4: Distributed Systems Failure detec2on & Leader Elec2on · • iIf p gets both message back, it is the leader of its 2 neighborhood, and proceeds to phase i+1 Distributed Systems, Edinburgh,

Failuredetectors•  Reliablefailuredetectors

–  Replieswith“working”or“failed”

•  Difficulty:–  Detec2ngsomethingisworkingiseasier:iftheyrespondtoamessage,they

areworking–  Detec2ngfailureisharder:iftheydon’trespondtothemessage,themessage

mayhevbeenlost/delayed,maybetheprocessisbusy,etc..

•  Unreliablefailuredetector–  Replieswith“suspected(failed)”or“unsuspected”–  Thatis,doesnottrytogiveaconfirmedanswer

•  Wewouldideallylikereliabledetectors,butunreliableones(thatsaygive“maybe”answers)couldbemorerealis2c

DistributedSystems,Edinburgh,2016 4

Page 5: Distributed Systems Failure detec2on & Leader Elec2on · • iIf p gets both message back, it is the leader of its 2 neighborhood, and proceeds to phase i+1 Distributed Systems, Edinburgh,

Simpleexample

•  SupposeweknowallmessagesaredeliveredwithinDseconds

•  ThenwecanrequireeachprocesstosendamessageeveryTsecondstothefailuredetectors

•  IfafailuredetectordoesnotgetamessagefromprocesspinT+Dseconds,itmarkspas“suspected”or“failed”

DistributedSystems,Edinburgh,2016 5

Page 6: Distributed Systems Failure detec2on & Leader Elec2on · • iIf p gets both message back, it is the leader of its 2 neighborhood, and proceeds to phase i+1 Distributed Systems, Edinburgh,

Simpleexample

•  SupposeweassumeallmessagesaredeliveredwithinDseconds

•  ThenwecanrequireeachprocesstosendamessageeveryTsecondstothefailuredetectors

•  IfafailuredetectordoesnotgetamessagefromprocesspinT+Dseconds,itmarkspas“suspected”or“failed”(dependingontypeofdetector)

DistributedSystems,Edinburgh,2016 6

Page 7: Distributed Systems Failure detec2on & Leader Elec2on · • iIf p gets both message back, it is the leader of its 2 neighborhood, and proceeds to phase i+1 Distributed Systems, Edinburgh,

Synchronousvsasynchronous•  Inasynchronoussystemthereisaboundonmessagedelivery2me(andclockdria)

•  Sothissimplemethodgivesareliablefailuredetector

•  Infact,itispossibletoimplementthissimplyasafunc2on:–  Sendamessagetoprocessp,waitfor2D+ε2me–  Adedicateddetectorprocessisnotnecessary

•  InAsynchronoussystems,thingsaremuchharder

DistributedSystems,Edinburgh,2016 7

Page 8: Distributed Systems Failure detec2on & Leader Elec2on · • iIf p gets both message back, it is the leader of its 2 neighborhood, and proceeds to phase i+1 Distributed Systems, Edinburgh,

Simplefailuredetector

•  IfwechooseTorDtoolarge,thenitwilltakealong2meforfailuretobedetected

•  IfweselectTtoosmall,itincreasescommunica2oncostsandputstoomuchburdenonprocesses

•  IfweselectDtoosmall,thenworkingprocessesmaygetlabeledasfailed/suspected

DistributedSystems,Edinburgh,2016 8

Page 9: Distributed Systems Failure detec2on & Leader Elec2on · • iIf p gets both message back, it is the leader of its 2 neighborhood, and proceeds to phase i+1 Distributed Systems, Edinburgh,

Assump2onsandrealworld

•  Inreality,bothsynchronousandasynchronousareatoorigid

•  Realsystems,arefast,butsome2mesmessagescantakealongerthanusual– Butnotindefinitelylong

•  Messagesusuallygetdelivered,butsome2mesnot..

DistributedSystems,Edinburgh,2016 9

Page 10: Distributed Systems Failure detec2on & Leader Elec2on · • iIf p gets both message back, it is the leader of its 2 neighborhood, and proceeds to phase i+1 Distributed Systems, Edinburgh,

Somemorerealis2cfailuredetectors

•  Have2valuesofD:D1,D2– Markprocessesasworking,suspected,failed

•  Useprobabili2es–  Insteadofsynchronous/asynchronous,modeldelivery2measprobabilitydistribu2on

– Wecanlearntheprobabilitydistribu2onofmessagedelivery2me,andaccordinglyex2matetheprobabilityoffailure

DistributedSystems,Edinburgh,2016 10

Page 11: Distributed Systems Failure detec2on & Leader Elec2on · • iIf p gets both message back, it is the leader of its 2 neighborhood, and proceeds to phase i+1 Distributed Systems, Edinburgh,

Usingbayesrule•  a=probabilitythataprocessfailswithin2meT•  b=probabilityamessageisnotreceivedinT+D

•  So,whenwedonotreceiveamessagefromaprocesswewanttoes2mateP(a|b)–  Probabilityofa,giventhatbhasoccurred

DistributedSystems,Edinburgh,2016 11

P(a | b) = P(b | a)P(a)P(b)

Ifprocesshasfailed,i.e.aistrue,thenofcoursemessagewillnotbereceived!i.e.P(b|a)=1.Therefore:

P(a | b) = P(a)P(b)

Page 12: Distributed Systems Failure detec2on & Leader Elec2on · • iIf p gets both message back, it is the leader of its 2 neighborhood, and proceeds to phase i+1 Distributed Systems, Edinburgh,

Leaderofacomputa2on

•  Manydistributedcomputa2onsneedacoordina2ngorserverprocess– E.g.Centralserverformutualexclusion–  Ini2a2ngadistributedcomputa2on– Compu2ngthesum/maxusingaggrega2ontree

•  Wemayneedtoelectaleaderatthestartofcomputa2on

•  Wemayneedtoelectanewleaderifthecurrentleaderofthecomputa2onfails

DistributedSystems,Edinburgh,2016 12

Page 13: Distributed Systems Failure detec2on & Leader Elec2on · • iIf p gets both message back, it is the leader of its 2 neighborhood, and proceeds to phase i+1 Distributed Systems, Edinburgh,

TheDis2nguishedleader

•  Theleadermusthaveaspecialpropertythatothernodesdonothave

•  Ifallnodesareexactlyiden2calineverywaythenthereisnoalgorithmtoiden2fyoneasleader

•  Ourpolicy:– Thenodewithhighestiden2fierisleader

DistributedSystems,Edinburgh,2016 13

Ref:NL

Page 14: Distributed Systems Failure detec2on & Leader Elec2on · • iIf p gets both message back, it is the leader of its 2 neighborhood, and proceeds to phase i+1 Distributed Systems, Edinburgh,

Nodewithhighestiden2fier•  Ifallnodesknowthehighestiden2fier(sayn),wedonot

needanelec2on–  Everyoneassumesnisleader–  nstartsopera2ngasleader

•  Butwhatifnfails?Wecannotassumen-1isleader,sincen-1mayhavefailedtoo!Ormaybethereneverwasprocessn-1

•  Ourpolicy:–  Thenodewithhighestiden2fierands2llsurvivingistheleader

•  Weneedanalgorithmthatfindstheworkingnodewithhighestiden2fier

DistributedSystems,Edinburgh,2016 14

Page 15: Distributed Systems Failure detec2on & Leader Elec2on · • iIf p gets both message back, it is the leader of its 2 neighborhood, and proceeds to phase i+1 Distributed Systems, Edinburgh,

Strategy1:Useaggrega2ontree

DistributedSystems,Edinburgh,2016 15

5

2 8

7

3

2

r=4

2

5

7

8 3

8

•  Supposenoderdetectsthatleaderhasfailed,andini2atesleaderelec2on

•  NodercreatesaBFStree

•  Asksformaxnodeidtobecomputedviaaggrega2on–  Eachnodereceivesidvaluesfromchildren–  Eachnodecomputesmaxofownidand

receivedvalues,andforwardstoparent

•  Needsatreeconstruc2on•  Ifnnodesstartelec2on,willneedntrees

–  O(n2)communica2on–  O(n)storagepernode

Page 16: Distributed Systems Failure detec2on & Leader Elec2on · • iIf p gets both message back, it is the leader of its 2 neighborhood, and proceeds to phase i+1 Distributed Systems, Edinburgh,

Strategy1:Useaggrega2ontree•  Supposenoderdetectsthatleaderhas

failed,andini2atesleaderelec2on

•  NodercreatesaBFStree

•  Asksformaxnodeidtobecomputedviaaggrega2on–  Eachnodereceivesidvaluesfromchildren–  Eachnodecomputesmaxofownidand

receivedvalues,andforwardstoparent

•  Needsatreeconstruc2on•  Ifnnodesstartelec2on,willneedntrees

–  O(n2)communica2on–  O(n)storagepernode

DistributedSystems,Edinburgh,2016 16

5

2 8

7

3

2

r=4

2

5

7

8 3

8

Page 17: Distributed Systems Failure detec2on & Leader Elec2on · • iIf p gets both message back, it is the leader of its 2 neighborhood, and proceeds to phase i+1 Distributed Systems, Edinburgh,

Strategy2:Usearing•  Supposethenetworkisaring– Weassumethateachnodehas2pointerstonodesitknowsabout:•  Next•  Previous•  (likeacirculardoublylinkedlist)

–  Theactualnetworkmaynotbearing

–  Thiscanbeanoverlay

DistributedSystems,Edinburgh,2016 17

6

2

45

3

8

Page 18: Distributed Systems Failure detec2on & Leader Elec2on · • iIf p gets both message back, it is the leader of its 2 neighborhood, and proceeds to phase i+1 Distributed Systems, Edinburgh,

Strategy2:Usearing

•  Basicidea:– Suppose6startselec2on– Send“6”to6.next,i.e.2– 2takesmax(2,6),sendto2.next

– 8takesmax(8,6),sendsto8.next

– etc

DistributedSystems,Edinburgh,2016 18

6

2

45

3

8

next

previous

6

6

8

8

8

Page 19: Distributed Systems Failure detec2on & Leader Elec2on · • iIf p gets both message back, it is the leader of its 2 neighborhood, and proceeds to phase i+1 Distributed Systems, Edinburgh,

Strategy2:Usearing

•  Thevalue“8”goesaroundtheringandcomesbackto8

•  Then8knowsthat“8”isthehighestid–  Sinceiftherewasahigherid,thatwouldhavestopped8

•  8declaresitselftheleader:sendsamessagearoundthering

DistributedSystems,Edinburgh,2016 19

6

2

45

3

8

next

previous

6

6

8

8

8

8

Page 20: Distributed Systems Failure detec2on & Leader Elec2on · • iIf p gets both message back, it is the leader of its 2 neighborhood, and proceeds to phase i+1 Distributed Systems, Edinburgh,

Strategy2:Usearing

•  Theproblem:Whatifmul2plenodesstartleaderelec2onatthesame2me?

•  Weneedtoadaptalgorithmslightlysothatitcanworkwheneveraleaderisneeded,andworksformul2pleleader

DistributedSystems,Edinburgh,2016 20

6

2

45

3

8

next

previous

6

6

8

8

8

8

Page 21: Distributed Systems Failure detec2on & Leader Elec2on · • iIf p gets both message back, it is the leader of its 2 neighborhood, and proceeds to phase i+1 Distributed Systems, Edinburgh,

Strategy2:Usearing(Algorithmbychangandroberts)

•  Everynodehasadefaultstate:non-par3cipant

•  Star2ngnodesetsstatetopar3cipantandsendselec3onmessagewithidtonext

DistributedSystems,Edinburgh,2016 21

6

2

45

3

8

next

previous

6

6

8

8

8

8

Page 22: Distributed Systems Failure detec2on & Leader Elec2on · • iIf p gets both message back, it is the leader of its 2 neighborhood, and proceeds to phase i+1 Distributed Systems, Edinburgh,

Strategy2:Usearing(Algorithmbychangandroberts)

•  Ifnodepreceiveselec3onmessagem

•  Ifpisnon-partcipant:–  sendmax(m.id,p.id)top.next–  Setstatetopar2cipant

•  Ifpispar2cipant:–  Ifm.id>p.id:

•  Sendm.idtop.next–  Ifm.id<p.id:

•  donothing

DistributedSystems,Edinburgh,2016 22

6

2

45

3

8

next

previous

6

6

8

8

8

8

Page 23: Distributed Systems Failure detec2on & Leader Elec2on · • iIf p gets both message back, it is the leader of its 2 neighborhood, and proceeds to phase i+1 Distributed Systems, Edinburgh,

Strategy2:Usearing(Algorithmbychangandroberts)

•  Ifnodepreceiveselec3onmessagemwithm.id=p.id

•  Pdeclaresitselfleader– Setsp.leader=p.id– Sendsleadermessagewithp.idtop.next– Anyothernodeqreceivingtheleadermessage•  Setsq.leader=p.id•  Forwardsleadermessagetoq.next

DistributedSystems,Edinburgh,2016 23

Page 24: Distributed Systems Failure detec2on & Leader Elec2on · • iIf p gets both message back, it is the leader of its 2 neighborhood, and proceeds to phase i+1 Distributed Systems, Edinburgh,

Strategy2:Usearing(Algorithmbychangandroberts)

•  Worksinanasynchronoussystem•  Assumingnothingfailswhilethealgorithmisexecu2ng

•  MessagecomplexityO(n^2)– Whendoesthisoccur?–  (hint:allnodesstartelec2on,andmanymessagestraversealongdistance)

•  Whatisthe2mecomplexity?•  Whatisthestoragecomplexity?

DistributedSystems,Edinburgh,2016 24

Page 25: Distributed Systems Failure detec2on & Leader Elec2on · • iIf p gets both message back, it is the leader of its 2 neighborhood, and proceeds to phase i+1 Distributed Systems, Edinburgh,

Strategy3:Usearing–smartly(HirschbergSinclair)

•  Assumeallnodeswanttoknowtheleader•  k-neighborhoodofnodep– Thesetofallnodeswithindistancekofp

•  Howdoespsendamessagetodistancek?– Messagehasa“2metolivevariable”– Eachnodedecrementsm.plonreceiving–  Ifm.pl=0,don’tforwardanymore

DistributedSystems,Edinburgh,2016 25

Page 26: Distributed Systems Failure detec2on & Leader Elec2on · • iIf p gets both message back, it is the leader of its 2 neighborhood, and proceeds to phase i+1 Distributed Systems, Edinburgh,

Strategy3:Usearing–smartly(HirschbergSinclair)

•  Basicidea:– Checkgrowingregionsaroundyourselfforsomeonewithlargerid

DistributedSystems,Edinburgh,2016 26

Page 27: Distributed Systems Failure detec2on & Leader Elec2on · • iIf p gets both message back, it is the leader of its 2 neighborhood, and proceeds to phase i+1 Distributed Systems, Edinburgh,

Strategy3:Usearing–smartly(HirschbergSinclair)

•  Algorithmoperatesinphases•  Inphase0,nodepsendselec2onmessagemtobothp.nextandp.previouswith:– m.id=p.idandpl=1

•  Supposeqreceivesthismessage–  Setsm.pl=0–  Ifq.id>m.id:

•  Donothing–  Ifq.id<m.id:

•  Returnmessagetop

DistributedSystems,Edinburgh,2016 27

Page 28: Distributed Systems Failure detec2on & Leader Elec2on · • iIf p gets both message back, it is the leader of its 2 neighborhood, and proceeds to phase i+1 Distributed Systems, Edinburgh,

Strategy3:Usearing–smartly(HirschbergSinclair)

•  Algorithmoperatesinphases•  Inphase0,nodepsendselec2onmessagemtoboth

p.nextandp.previouswith:–  m.id=p.idandpl=1

•  Supposeqreceivesthismessage–  Setsm.pl=0–  Ifq.id>m.id:

•  Donothing–  Ifq.id<m.id:

•  Returnmessagetop•  Ifpgetsbackbothmessage,itdecidesitselfleaderofits1-

neighborhood,andproceedstonextphase

DistributedSystems,Edinburgh,2016 28

Page 29: Distributed Systems Failure detec2on & Leader Elec2on · • iIf p gets both message back, it is the leader of its 2 neighborhood, and proceeds to phase i+1 Distributed Systems, Edinburgh,

Strategy3:Usearing–smartly(HirschbergSinclair)

•  IfpisInphasei,nodepsendselec2onmessagemtop.nextandp.previouswith:–  m.id=p.id,andm.pl=2i

•  Anodeqonreceivingthemessage(fromnext/previous)–  Ifm.pl=0:forwardsuitablytoprevious/next–  Setsm.pl=m.pl-1–  Ifq.id>m.id:

•  Donothing–  Else:

•  Ifm.pl=0:returntosendingprocess•  Elseforwardtosuitablytoprevious/next

•  Ifpgetsbothmessageback,itistheleaderofits2ineighborhood,andproceedstophasei+1

DistributedSystems,Edinburgh,2016 29

Page 30: Distributed Systems Failure detec2on & Leader Elec2on · • iIf p gets both message back, it is the leader of its 2 neighborhood, and proceeds to phase i+1 Distributed Systems, Edinburgh,

Strategy3:Usearing–smartly(HirschbergSinclair)

•  When2i>=n/2– Only1processsurvives:Leader

•  Numberofphases:O(logn)

•  Whatisthemessagecomplexity?

DistributedSystems,Edinburgh,2016 30

Page 31: Distributed Systems Failure detec2on & Leader Elec2on · • iIf p gets both message back, it is the leader of its 2 neighborhood, and proceeds to phase i+1 Distributed Systems, Edinburgh,

Strategy3:Usearing–smartly(HirschbergSinclair)

Inphasei•  Atmostonenodeini2atesmessageinanysequenceof2i-1nodes

•  So,n/2i-1candidates–  Eachsends2messages,goingatmost2idistance,andreturning:2*2*2imessages

•  O(n)messagesinphaseiThereareO(logn)phases•  TotalofO(nlogn)messages

31DistributedSystems,Edinburgh,2016

Page 32: Distributed Systems Failure detec2on & Leader Elec2on · • iIf p gets both message back, it is the leader of its 2 neighborhood, and proceeds to phase i+1 Distributed Systems, Edinburgh,

Strategy3:Usearing–smartly(HirschbergSinclair)

•  Assumesynchronousopera2on•  Assumenodesdonotfailduringalgorithmrun

•  Whatis2mecomplexity?•  Whatisstoragecomplexity?

32DistributedSystems,Edinburgh,2016

Page 33: Distributed Systems Failure detec2on & Leader Elec2on · • iIf p gets both message back, it is the leader of its 2 neighborhood, and proceeds to phase i+1 Distributed Systems, Edinburgh,

Strategy4:BullyAlgorithm•  Assume:

–  Eachnodeknowstheidofallnodesinthesystem(somemayhavefailed)–  Synchronousopera2on

•  Nodepdecidestoini2ateelec2on•  psendselec2onmessagetoallnodeswith id>p.id•  Ifpdoesnothear“Iamalivemessage”fromanynode,pbroadcastsa

messagedeclaringitselfasleader•  Anyworkingnodeqthatreceiveselec2onmessagefromp,replieswith

ownidand“Iamalive”message–  Andstartsanelec2on(unlessitisalreadyintheprocessofanelec2on)

•  Anynodeqthathearsaloweridnodebeingdeclaredleader,startsanewelec2on

DistributedSystems,Edinburgh,2016 33

Ref:CDK

Page 34: Distributed Systems Failure detec2on & Leader Elec2on · • iIf p gets both message back, it is the leader of its 2 neighborhood, and proceeds to phase i+1 Distributed Systems, Edinburgh,

Strategy4:BullyAlgorithm

•  Assume:–  Eachnodeknowstheidofallnodesinthesystem(somemayhavefailed)

–  Synchronousopera2on

•  Worksevenwhenprocessesfail•  Workswhen(some)messagedeliveriesfail.

•  Whatarethestorageandmessagecomplexi2es?

DistributedSystems,Edinburgh,2016 34