Upload
others
View
8
Download
0
Embed Size (px)
Citation preview
Organisation Topics Basics Echo Further Readings Project
Introduction
Radu NicolescuDepartment of Computer Science
University of Auckland
16 July 2018
1 / 35
Organisation Topics Basics Echo Further Readings Project
1 Organisation
2 Distributed algorithms
3 Basics
4 Echo algorithm
5 Further algorithms
6 Readings
7 Project and practical work
2 / 35
Organisation Topics Basics Echo Further Readings Project
Organisation
• Before the break: weeks 1–4; after the break: weeks 11–12
• Contents: an introduction to a few fundamental distributedalgorithms
• Exam: theory of the distributed algorithms
• Two projects: practical implementations (emulations ofspecific distributed algorithms on a single computer)
4 / 35
Organisation Topics Basics Echo Further Readings Project
Distributed algorithms
• Computing: computer systems, networking, computerarchitectures, computer programming, algorithms
• Parallel computing: multi-processors/cores, parallel systems,threading, concurrency, data sharing and races, parallelprogramming (data and task parallelism), parallel algorithms
• Distributed computing: distributed systems, message passing,communication channels, distributed architectures, distributedprogramming, distributed algorithms
• concurrency of components• paramount messaging time• lack of global clock in the async case• independent failure of components
• read more: http://en.wikipedia.org/w/index.php?
title=Distributed_computing&oldid=616081922
6 / 35
Organisation Topics Basics Echo Further Readings Project
Overlap between parallel and distributed computing
• One litmus test:
• Parallel computing: tight coupling between tasks
• Distributed computing: loose coupling between nodes
• Another difference – specific for algorithms:
• In classical algorithms, the problem is encoded and given tothe (one single) processing element
• In parallel algorithms, the problem is encoded and partitioned,and then its chunks are given to the processing elements
• In distributed algorithms, the problem is given by the networkitself and solved by coordination protocols
• read more: http://en.wikipedia.org/w/index.php?
title=Distributed_algorithm&oldid=594511669
7 / 35
Organisation Topics Basics Echo Further Readings Project
Typical scenario in distributed computing
• computing nodes have local memories and unique IDs
• nodes are arranged in a network: graph, digraph, ring, ...
• neighbouring nodes can communicate by message passing
• independent failures are possible: nodes, communicationchannels
• the network topology (size, diameter, neighbourhoods) andother characteristics (e.g. latencies) are often unknown toindividual nodes
• the network may or may not dynamically change
• nodes solve parts of a bigger problem or have individualproblems but still need some sort of coordination – which isachieved by distributed algorithms
8 / 35
Organisation Topics Basics Echo Further Readings Project
Recall from graph theory
• Graphs, edges, digraphs, arcs, degrees, connected, completegraphs, size, distance, diameter, radius, paths, weights/costs,shortest paths, spanning trees, ...
• distance between a pair of nodes = minimum length of aconnecting path, aka length of a shortest path (hop count inunweighted graphs)
• min
• diameter = maximum distance, between any pair of nodes
• max min
• radius = minimum maximum distance, over all nodes(minimum attained at centers)
• min max min
10 / 35
Organisation Topics Basics Echo Further Readings Project
Recall from graph theory
• Three distinct spanning trees (rooted at 1) on the sameundirected graph
• Distances? Diameter? Radius? Centres?
11 / 35
Organisation Topics Basics Echo Further Readings Project
Rounds and steps
• Nodes work in rounds (macro-steps), which have threesub-steps:
1 Receive sub-step: receive incoming messages
2 Process sub-step: change local node state
3 Send sub-step: send outgoing messages
• Note: some algorithms expect null or empty messages, as anexplicit confirmation that nothing was sent
12 / 35
Organisation Topics Basics Echo Further Readings Project
Timing models
1 Synchronous models
• nodes work totally synchronised – in lock-step
• easiest to develop, but often unrealistic and less efficient
2 Asynchronous models
• messages can take an arbitrary unbounded time to arrive
• often unrealistic and sometimes impossible
3 Partially synchronous models
• some time bounds or guarantees
• more realistic, but most difficult
• Quiz: guess which models have been first studied?Heard of Nasreddin Hodja’s lamp? ,
13 / 35
Organisation Topics Basics Echo Further Readings Project
Synchronous model – equivalent versions
• Synchronous model – version 1
• all nodes: process takes 1 time unit
• all messages: transit time (send −→ receive) takes 0 time units
• Synchronous model – version 2
• all nodes: process takes 0 time units
• all messages: transit time (send −→ receive) takes 1 time units
• The second (and equivalent) version ensures that synchronisedmodels are a particular case of asynchronous models
14 / 35
Organisation Topics Basics Echo Further Readings Project
Asynchronous model
• Asynchronous model
• all nodes: process takes 0 time units
• each message (individually): transit time (send −→ receive)takes any real number of time units
• or, normalised: any real number in the [0, 1] interval
• thus synchronous models (version 2) are a special case
• often, a FIFO guarantee, which however may affect the abovetiming assumption (see next slide)
15 / 35
Organisation Topics Basics Echo Further Readings Project
Asynchronous model
• Asynchronous model with FIFO channels
• the FIFO guarantee: faster messages cannot overtake earlierslower messages sent over the same channel
• congestion (pileup) may occur and this should be accounted for
• the timing assumption of the previous slide only applies to thetop (oldest) message in the FIFO queue
• after the top message is delivered, the next top message isdelivered after an additional arbitrary delay (in R or [0, 1])...[Lynch]
• essentially, a FIFO “channel” may not be a simple channel, butneed some middleware (to manage the message queue)
• suggestion to develop robust models, who do not rely on anyimplicit FIFO assumption [Tel]
16 / 35
Organisation Topics Basics Echo Further Readings Project
Nondeterminism
• In general, both models are non-deterministic
• Sync and async: often a choice needs to be made betweendifferent requests (e.g. to consider an order among neighbours)
• Async: messages delivery times can vary (even very widely)
• However, all executions must arrive to a valid decision (notnecessarily the same, but valid)
17 / 35
Organisation Topics Basics Echo Further Readings Project
Starting and stopping options
• Starting options
• Single-source or centralised or diffusing: starts from adesignated initiator node
• Many-source or decentralised: starts from many nodes, oftenfrom all nodes, at the same time
• Termination options
• Easy to notice from outside, but how do the nodes themselvesknow this?
• Sometimes one node (often the initiator) takes the finaldecision and can notify the rest (usually omitted phase)
• In general, this can be very challenging and requiressophisticated control algorithms for termination detection
18 / 35
Organisation Topics Basics Echo Further Readings Project
Echo algorithm
• Echo is a fundamental diffusing (single source) algorithm,which is the basis for several others
• combines two phases (imagine a stone thrown into a pondcreating waves which echo back)
• a broadcast, ”top-down” phase, which builds child-to-parentlinks in a spanning tree
• a convergecast, ”bottom-up” or echo phase, which confirmsthe termination
• parent-to-child links can be built either at convergecast time orby additional confirmation messages immediately after thebroadcast (not shown in the next slides)
• after receiving echo tokens from all its neighbours, the sourcenode decides the termination (and can optionally start a thirdphase, to inform the rest of this termination)
20 / 35
Organisation Topics Basics Echo Further Readings Project
Echo Algorithm - Sync
1
2
3
4
Time Units = 0
Messages = 0
21 / 35
Organisation Topics Basics Echo Further Readings Project
Echo Algorithm - Sync
1
2
3
4
Time Units = 1
Messages = 3
21 / 35
Organisation Topics Basics Echo Further Readings Project
Echo Algorithm - Sync
1
2
3
4
Time Units = 2
Messages = 7
21 / 35
Organisation Topics Basics Echo Further Readings Project
Echo Algorithm - Sync
1
2
3
4
Time Units = 3
Messages = 10
21 / 35
Organisation Topics Basics Echo Further Readings Project
Echo Algorithm - Sync
1
2
3
4
Time Units = 3 = 2D + 1
Messages = 10 = 2|E|
21 / 35
Organisation Topics Basics Echo Further Readings Project
Echo programs (Tel)
• Each node has a list, Neigh, indicating its neighbours
• There are two different programs: (1) for the initiator, and (2)for the non-initiators
• Here, the programs are high-level pseudocode, not in the basicReceive/Process/Send state-machine format (but can bereduced to this)
• With the exception of the initiator’s first step, nodes becomeactive only after receiving at least one message; i.e. nodes areidle (passive) between sending and receiving new messages
• Exercise: try to translate the following pseudocodes into astate machine format
22 / 35
Organisation Topics Basics Echo Further Readings Project
Echo program for initiator (Tel)
1 l e t p a r e n t = nu l l2 l e t rec = 034 f o r q i n Neigh do5 send tok to q67 whi le rec < | Neigh | do8 r e c e i v e tok // blocking call or not?9 rec += 1
1011 d e c i d e
23 / 35
Organisation Topics Basics Echo Further Readings Project
Echo program for non-initiators (Tel)
1 l e t p a r e n t = nu l l2 l e t rec = 034 r e c e i v e tok from q // non-deterministic choice!5 p a r e n t = q6 rec += 178 f o r q i n Neigh \ p a r e n t do9 send tok to q
1011 whi le rec < | Neigh | do // count all received tokens12 r e c e i v e tok // forward and return13 rec += 11415 send tok to p a r e n t
24 / 35
Organisation Topics Basics Echo Further Readings Project
Sync vs. Async
• Using the informal complexity measures appearing in thepreceding slides (there are others), we conclude
• Sync: time complexity = O(D), message complexity = O(|E |)
• The same Echo programs run also in the async mode andobtain valid results
• However, the runtime complexity measure changes(drastically)
• Async: time complexity=O(|V |), message complexity=O(|E |)
• Why?
• For the time complexity, we take the supremum over allpossible normalised executions (delivery time in [0, 1])
25 / 35
Organisation Topics Basics Echo Further Readings Project
Echo Algorithm - Async
1
2
3
4
Time Units = 0
Messages = 0
26 / 35
Organisation Topics Basics Echo Further Readings Project
Echo Algorithm - Async
1
2
3
4
Time Units = ε
Messages = 3
26 / 35
Organisation Topics Basics Echo Further Readings Project
Echo Algorithm - Async
1
2
3
4
Time Units = 2ε
Messages = 4
26 / 35
Organisation Topics Basics Echo Further Readings Project
Echo Algorithm - Async
1
2
3
4
Time Units = 3ε
Messages = 6
26 / 35
Organisation Topics Basics Echo Further Readings Project
Echo Algorithm - Async
1
2
3
4
Time Units = 4ε
Messages = 7
26 / 35
Organisation Topics Basics Echo Further Readings Project
Echo Algorithm - Async
1
2
3
4
Time Units = 1
Messages = 7
26 / 35
Organisation Topics Basics Echo Further Readings Project
Echo Algorithm - Async
1
2
3
4
Time Units = 2
Messages = 8
26 / 35
Organisation Topics Basics Echo Further Readings Project
Echo Algorithm - Async
1
2
3
4
Time Units = 3
Messages = 9
26 / 35
Organisation Topics Basics Echo Further Readings Project
Echo Algorithm - Async
1
2
3
4
Time Units = 4
Messages = 10
26 / 35
Organisation Topics Basics Echo Further Readings Project
Echo Algorithm - Async
1
2
3
4
Time Units on Broadcast = 1 = D
Messages = 10
Time Units on Convergecast = 3 = |V | − 1
Total Time Units = 4
26 / 35
Organisation Topics Basics Echo Further Readings Project
Further algorithms
• Besides some basic fundamental algorithms, we intend tocover (or have a good look at) some of the most challengingdistributed algorithms
• Distributed MST (Minimal Spanning Tree): Dijkstra prize 2004
• Byzantine agreement: ”the crown jewel of distributedalgorithms” – Dijkstra prize 2005, 2001
• The CAP theorem (Consistency, Availability, Partitiontolerance)
• The relation between Byzantine algorithms and blockchains
• all these have practical significance and applications
28 / 35
Organisation Topics Basics Echo Further Readings Project
Readings
• Textbooks (in library, also limited previews on Google books)
• Lynch, Nancy (1996). Distributed Algorithms. San Francisco,CA: Morgan Kaufmann Publishers. ISBN 978-1-55860-348-6
• Tel, Gerard (2000), Introduction to Distributed Algorithms,Second Edition, Cambridge University Press, ISBN978-0-52179-483-1
• Research and overview articles (generally overlapping thetextbooks) will be indicated for each topic
30 / 35
Organisation Topics Basics Echo Further Readings Project
Readings (optionally)
• Optionally, my articles (in collaboration) on bio-inspiredmodels for distributed algorithms may offer additional views
• network discovery
• firing squad synchronisation (graphs and digraphs)
• DFS, BFS, edge and node disjoint paths
• Byzantine agreement
• dynamic programming, optimisations
• NP complete/hard problems (SAT, TSP)
• image processing (stereo-vision, skeletonisation, regiongrowing)
• asynchronicity
• Several pre-print versions published in the CDMTCS researchreports https://www.cs.auckland.ac.nz/
staff-cgi-bin/mjd/secondcgi.pl?date31 / 35
Organisation Topics Basics Echo Further Readings Project
Project and practical work
• Practical work is necessary to properly understand theconcepts
• Unfortunately, we cannot afford to experiment with realdistributed systems (sorry for this /)
• We need to play the ”Game of Distribution” and simulate or,better, emulate distributed systems on a single lab machine
• For uniformity and fairness in marking, we use .NET,specifically with C# (or F#)
• The final exam is focused on concepts, does not contain”technological” questions (related to .NET)
33 / 35
Organisation Topics Basics Echo Further Readings Project
Prerequisites (cf. 335)
• Familiarity with C#, at least to the level presented in the C#Specifications, Chapter Introduction: http://www.
microsoft.com/en-nz/download/details.aspx?id=7029
• Familiarity with basic .NET API or ability to learn this on thefly, as needed
• Familiarity with the specific API that we will use to emulatedistributed systems
• Familiarity with Visual Studio 2017 (as in the labs)
• Familiarity with searching and reading the MSDN library
• Familiarity with Linqpad http://www.linqpad.net/
34 / 35
Organisation Topics Basics Echo Further Readings Project
How to emulate sync and async distributed systems?
• Over the time, we have considered a variety of technologies
• .NET Remoting (similar to Java RMI)
• WCF = Windows Communication Foundation (high-level layerover TCP and HTTP etc)
• WebAPI
• TPL = Task Parallel Library
• TPL Dataflow Library, which promotes actor/agent-orienteddesigns
• This year we will use... WCF!
• We will have one dedicated lecture on this
35 / 35