Multicast Protocols
Jed Liu
28 February 2002
2
Introduction
Recall Atomic Broadcast:
    All correct processors receive same set of messages.
    All messages delivered in same order to all processors.
    Any message sent by a correct processor is eventually delivered to all processors.
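As a rough illustration, the first two properties can be checked over recorded delivery logs. This is only a sketch: the function names and the log representation (a dict from processor id to its list of delivered message ids) are assumptions made for this example. The third property is a liveness property and cannot be decided from a finite log.

```python
def same_set(logs):
    """True if every processor delivered the same set of messages.

    logs: hypothetical dict mapping processor id -> list of message ids.
    """
    sets = [set(seq) for seq in logs.values()]
    return all(s == sets[0] for s in sets)


def same_order(logs):
    """True if any two processors deliver their common messages in the same order."""
    seqs = list(logs.values())
    for a in seqs:
        pos = {m: i for i, m in enumerate(a)}
        for b in seqs:
            # Positions (in a) of the messages b shares with a, in b's order.
            shared = [pos[m] for m in b if m in pos]
            if shared != sorted(shared):
                return False
    return True
```

For example, logs `{"p": ["a", "b"], "q": ["b", "a"]}` pass `same_set` but fail `same_order`.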
3
Introduction (cont’d)
But what happens if the network partitions?
    Atomic Broadcast becomes unsolvable!
Define Totally Ordered Broadcast
    If a majority of the processes form a connected component, guarantee Atomic Broadcast for this component only.
COReL is an implementation of this.
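The majority condition is what makes this definition safe: two disjoint components cannot both contain more than half of the processes. A minimal sketch of the test, with a hypothetical function name and plain sets standing in for membership information:

```python
def is_majority_component(component, all_processes):
    """True if `component` holds a strict majority of `all_processes`."""
    everyone = set(all_processes)
    members = set(component) & everyone
    return 2 * len(members) > len(everyone)
```

With four processes, a two-process component is not a majority, so after an even split neither side qualifies.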
4
The Model
Network uses datagram message delivery.
Asynchronous fail-stop model.
Stable storage.
Communication links are transient.
Message Integrity: Messages cannot be corrupted or generated by the network spontaneously.
5
The System Architecture
[Figure: layered architecture. Layers, top to bottom: Application; COReL – Totally Ordered Broadcast; Group Communication Service. Arrow labels: Application messages; COReL messages; Messages with TS views; Totally Ordered Broadcast messages. Message states shown: Delivered; Totally Ordered.]
6
Properties of the GCS
No Duplication: Every message delivered at a process p is delivered only once at p.
Total Order: A logical, globally unique timestamp is attached to every message when it is delivered.
    Causal order is preserved.
    GCS delivers messages in TS order.
Virtual Synchrony: Any two processes undergoing the same two consecutive views in a group G deliver the same set of messages in G within the former view.
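The Virtual Synchrony property can be phrased as a check over two processes' view histories. The sketch below assumes a hypothetical representation: each history is a list of (view id, set of message ids delivered in that view), in installation order.

```python
def virtually_synchronous(history_p, history_q):
    """Check Virtual Synchrony between two processes.

    Whenever both processes go through the same two consecutive views,
    they must have delivered the same message set in the former view.
    """
    for i in range(len(history_p) - 1):
        pair = (history_p[i][0], history_p[i + 1][0])
        for j in range(len(history_q) - 1):
            if (history_q[j][0], history_q[j + 1][0]) == pair:
                if history_p[i][1] != history_q[j][1]:
                    return False
    return True
```

Note that the property says nothing when the two processes move on to different views, which is why only matching consecutive pairs are compared.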
7
Properties of the GCS (cont’d)
[Figure: timelines for processes P and Q illustrating Virtual Synchrony. Labels: P and Q in same view; Deliver m; Deliver m’; Deliver m”; Send m”; Q also delivers m.]
8
Guarantees Made by COReL
Safety:
    At each process, messages become totally ordered in an order which is a prefix of some common global total order.
    Total ordering of messages preserves the causal partial order.
Liveness:
    Messages are eventually totally ordered by the members of a view.
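The first safety guarantee can be tested on a snapshot of local orders: a common global order exists exactly when every local order is a prefix of the longest one. A small sketch, assuming a hypothetical dict from process id to its list of totally ordered message ids:

```python
def consistent_prefixes(local_orders):
    """True if all local total orders are prefixes of one common order."""
    longest = max(local_orders.values(), key=len, default=[])
    return all(longest[:len(order)] == order for order in local_orders.values())
```

A process that has ordered nothing yet (an empty list) is trivially consistent with everyone else.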
9
The COReL Algorithm
GCS supplies a unique timestamp for each message that gets delivered to COReL.
On delivery, the message gets written to stable storage, and an acknowledgement is sent.
Within a majority component, messages are ordered in TS order.
Concurrent messages are ordered such that messages from the majority component come first.
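The delivery step described above (log first, then acknowledge) might look roughly like this. It is a sketch only: the class name is hypothetical, an in-memory list stands in for stable storage, and another list stands in for the network.

```python
class LoggingProcess:
    def __init__(self, pid):
        self.pid = pid
        self.stable_log = []  # assumption: in-memory stand-in for stable storage
        self.acks_sent = []   # assumption: stand-in for sending over the network

    def on_deliver(self, ts, payload):
        # Write the message to "stable storage" before acknowledging it,
        # so that an acknowledged message survives a crash.
        self.stable_log.append((ts, payload))
        self.acks_sent.append((self.pid, ts))

    def queue_in_ts_order(self):
        # Messages are kept in timestamp (TS) order.
        return [payload for _, payload in sorted(self.stable_log, key=lambda e: e[0])]
```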
10
The Primary Component
Use the notion of a primary component to allow members of one network component to continue ordering messages when a partition occurs. (Can be a majority, or in general, a quorum.)
Ordering Rule: Members of the current primary component PM are allowed to totally order a message once the message was acknowledged by all members of PM.
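A minimal sketch of the Ordering Rule, with hypothetical names. The queue is a list of timestamps in TS order and `acks` maps each timestamp to the set of processes that acknowledged it. Stopping at the first message that is not fully acknowledged is an assumption added here so that the ordered messages always form a prefix of the queue.

```python
def orderable_prefix(queue, acks, primary_members):
    """Return the leading messages of `queue` that may be totally ordered."""
    members = set(primary_members)
    ordered = []
    for ts in queue:
        if not members <= acks.get(ts, set()):
            break  # assumption: later messages wait behind this one
        ordered.append(ts)
    return ordered
```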
11
The Colours Model
Green: messages that have been totally ordered according to the Ordering Rule.
Yellow: messages received and acknowledged in the context of a primary component.
    May have become green at other members of the primary component.
Red: no knowledge about message’s total order.
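The three colours map naturally onto an enumeration. The sketch below uses hypothetical names and enforces only one rule, taken as an assumption from the requirement that the order of green messages never changes: once a message is green, it stays green.

```python
from enum import Enum


class Colour(Enum):
    RED = "red"        # no knowledge about the message's total order
    YELLOW = "yellow"  # acknowledged in the context of a primary component
    GREEN = "green"    # totally ordered according to the Ordering Rule


def recolour(current, new):
    """Return the new colour, refusing to demote a green message."""
    if current is Colour.GREEN and new is not Colour.GREEN:
        raise ValueError("green messages are final")
    return new
```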
12
Invariants
Order of green messages determines the global total order of those messages.
    Order of such messages cannot change, and processes have to agree on the order.
Causal order of messages is preserved.
13
View Changes
Set the primary component bit to FALSE.
Stop handling regular messages and stop sending regular messages.
If new view v contains new members, run a Recovery Procedure.
If v is a majority, establish a new primary component.
Continue handling regular messages and sending regular messages.
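The five steps above can be sketched as a single handler. The class and method names are hypothetical, and the recovery and establishment sub-protocols are placeholders that only record that they ran.

```python
class ViewChangeMember:
    def __init__(self, pid, all_processes):
        self.pid = pid
        self.all_processes = set(all_processes)
        self.view = {pid}
        self.in_primary = False       # the primary component bit
        self.handling_regular = True
        self.trace = []               # which sub-protocols ran, for illustration

    def run_recovery(self, view):     # placeholder for the Recovery Procedure
        self.trace.append("recovery")

    def establish_primary(self, view):  # placeholder for establishment
        self.trace.append("establish")
        self.in_primary = True

    def on_view_change(self, new_view):
        new_view = set(new_view)
        self.in_primary = False               # step 1
        self.handling_regular = False         # step 2
        if new_view - self.view:              # step 3: new members joined
            self.run_recovery(new_view)
        if 2 * len(new_view) > len(self.all_processes):  # step 4: majority
            self.establish_primary(new_view)
        self.view = new_view
        self.handling_regular = True          # step 5
```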
14
State Variables
Last_Committed_Primary
    Number of last primary component that the process has committed to establish.
Last_Attempted_Primary
    Number of last primary component that the process has attempted to establish.
15
Recovery Procedure
Send state message to members of new group.
Wait for state messages from all other group members.
Find a set of Representatives in the group.
    Set of processes with the largest Last_Committed_Primary in the group.
Get Representatives to agree on the set of green messages and the set of yellow messages.
    Set of green messages determined by the union.
    Set of yellow messages determined by the intersection.
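Selecting the Representatives and combining their message sets can be sketched as follows. The state-message layout (a dict with `last_committed_primary`, `green` and `yellow` keys) is an assumption for this example, and plain sets are used for simplicity even though green messages also carry an order.

```python
def representatives(states):
    """Processes with the largest Last_Committed_Primary in the group.

    states: hypothetical dict mapping process id -> state dict.
    """
    top = max(s["last_committed_primary"] for s in states.values())
    return {pid for pid, s in states.items() if s["last_committed_primary"] == top}


def agreed_colours(states):
    """Green set is the union, yellow set the intersection, over representatives."""
    reps = representatives(states)
    green = set().union(*(states[r]["green"] for r in reps))
    yellow = set.intersection(*(set(states[r]["yellow"]) for r in reps))
    return green, yellow
```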
16
Recovery Procedure (cont’d)
A deterministically chosen representative retransmits green and yellow messages to get all group members to agree on the set of green and yellow messages.
    Non-representatives re-colour yellow messages as red if the message is not yellow at any representative.
Retransmit red messages as necessary to get all group members to agree on the state and colour of their message queues.
17
View Change During Recovery?
If in the middle of recovery and we get a view change, we immediately restart recovery with the new view.
    No need to undo anything.
If view change only removes processes from group, no need to retransmit messages.
18
Establishing a New Primary Component
Attempt: Record attempt on stable storage and send attempt message to all other members.
Wait for attempt messages from all other members.
Commit: Record commit on stable storage. Mark all non-green messages as yellow. Send a commit message.
Establish: When commit messages from all other members arrive, set primary component bit to TRUE and mark all messages as green.
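The three phases can be simulated in memory to see how the colours and the two state variables evolve. Everything here is a simplified sketch: the class, the message tuples, and the primary number `n` are hypothetical, a dict stands in for stable storage, and each member's own message is included in the lists it checks.

```python
class EstablishingMember:
    def __init__(self, pid, colours):
        self.pid = pid
        self.colours = dict(colours)  # message id -> "red" | "yellow" | "green"
        self.in_primary = False
        # assumption: dict standing in for stable storage
        self.stable = {"last_attempted_primary": 0, "last_committed_primary": 0}

    def attempt(self, n):
        self.stable["last_attempted_primary"] = n
        return ("attempt", self.pid, n)

    def commit(self, n, attempts, view):
        if {a[1] for a in attempts if a} != set(view):
            return None  # still waiting for attempt messages
        self.stable["last_committed_primary"] = n
        for msg in list(self.colours):
            if self.colours[msg] != "green":
                self.colours[msg] = "yellow"
        return ("commit", self.pid, n)

    def establish(self, commits, view):
        if {c[1] for c in commits if c} != set(view):
            return False  # still waiting for commit messages
        self.in_primary = True
        for msg in list(self.colours):
            self.colours[msg] = "green"
        return True


def establish_primary(members, n):
    view = {m.pid for m in members}
    attempts = [m.attempt(n) for m in members]
    commits = [m.commit(n, attempts, view) for m in members]
    return all([m.establish(commits, view) for m in members])
```

If a commit message is missing, `establish` leaves the messages yellow, which matches the idea that a message turns green only after every member has marked it yellow.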
19
View Change while Establishing?
A process marks the messages in its message queue as green only when it knows that all other members have marked them as yellow.
    If a failure occurs during the protocol, the invariants are not violated.
20
COReL Summary
An algorithm for totally-ordered multicast in an asynchronous environment.
Resilient to network partitions and communication link failures.
    But only live in the primary component!
Allows members of minority components to initiate messages.
    These messages can become totally ordered even if the originating process is never a member of the primary component.
21
Transis
Another multicast protocol that deals with network partitions.
Regulates network flow to avoid flooding and message loss.
    Uses a sliding-window algorithm similar to that used in TCP.
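For readers unfamiliar with the idea, a generic sliding-window sender is sketched below. It is not Transis's actual flow-control code; the class name, the cumulative-acknowledgement behaviour, and the window size are assumptions chosen to show the basic mechanism.

```python
class SlidingWindowSender:
    def __init__(self, window_size):
        self.window_size = window_size
        self.base = 0      # oldest unacknowledged sequence number
        self.next_seq = 0  # next sequence number to use

    def can_send(self):
        # At most window_size messages may be outstanding at once.
        return self.next_seq - self.base < self.window_size

    def send(self):
        """Return the sequence number sent, or None if the window is full."""
        if not self.can_send():
            return None
        seq = self.next_seq
        self.next_seq += 1
        return seq

    def on_ack(self, seq):
        # Cumulative acknowledgement: everything up to seq is confirmed.
        if self.base <= seq < self.next_seq:
            self.base = seq + 1
```

With a window of 2, a third `send()` returns `None` until an acknowledgement slides the window forward.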
22
The Persistent Replication Services Layer (PRSL)
Built on top of Transis.
Provides applications with long term services such as message logging and replaying, and reconciliation of states among recovered and reconnected endpoints.
With just Transis:
    Message delivery only guaranteed within the current group.
    No end-to-end acknowledgement at application level, so no guarantee that any destination actually acted on the message.
23
Replication Groups
The basis of PRSL operations.
A static set of processes defined at startup time.
Different from multicast groups: can only change through startup and shutdown of members.
24
Replication Group Operations
Uniform multicast.
Totally ordered uniform multicast.
Stable multicast.
Explicit application-level acknowledgement.
Startup/shutdown for adding/removing a member to/from the replication group.