EEC 688/788 Secure and Dependable Computing, Lecture 13. Wenbing Zhao, Department of Electrical and Computer Engineering, Cleveland State University, [email protected]

Page 1: EEC 688/788 Secure and Dependable Computing

EEC 688/788 Secure and Dependable Computing

Lecture 13

Wenbing Zhao
Department of Electrical and Computer Engineering
Cleveland State University
[email protected]

Page 2: EEC 688/788 Secure and Dependable Computing

04/24/23, EEC688/788: Secure & Dependable Computing, Wenbing Zhao

Outline
  • Group communication systems
    • Ordered multicast
    • Techniques to implement ordered multicast
    • Group membership service
    • Agreed and safe delivery
  • Checkpointing and recovery
  • Reference: Reliable Distributed Systems, by K. P. Birman, Springer; Chapters 14-16

Page 3: EEC 688/788 Secure and Dependable Computing

Group Communication System
  • Services provided by the GCS
    • Membership service: who is up and who is down
      • Deals with failure detection and more
    • Reliable, ordered multicast service
      • FIFO, causal, total
    • Virtual synchrony service
      • Virtual synchrony synchronizes membership changes with multicasts
  • GCS is often used to build fault-tolerant systems

Page 4: EEC 688/788 Secure and Dependable Computing

Reliable Multicast
  • Reliable multicast: the message is targeted to multiple receivers, and all receivers receive the message reliably
    • Positive or negative acknowledgement
    • Need to avoid ack/nack implosion
  • Distinguish receiving from delivery!

  [Figure: Application and Middleware layers, with arrows labeled Receiving and Delivering.]

Page 5: EEC 688/788 Secure and Dependable Computing

Ordered Reliable Multicast
  • Ordered reliable multicast: if many messages are multicast by many senders, in what order are the messages delivered at the receivers?
    • First in, first out (FIFO)
    • Causal: the causal relationship among messages is preserved
    • Total: all messages are delivered at all receivers in the same order

Page 6: EEC 688/788 Secure and Dependable Computing

FIFO Ordered Multicast
  • FIFO or sender-ordered multicast:
    • Messages are delivered in the order they were sent (by any single sender)

  [Figure: timelines for processes p, q, r, s with multicast messages a, b, c, d, e. Annotation: delivery of c to p is delayed until after b is delivered.]
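The FIFO rule can be sketched with per-sender sequence numbers and a hold-back queue in the middleware. This is only an illustration under assumptions the slides do not state: the class `FifoReceiver`, its method names, and the convention that each sender numbers its multicasts from 0 are all hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical receiver-side middleware that enforces FIFO (sender) order. */
class FifoReceiver {
    // Next sequence number expected from each sender.
    private final Map<String, Integer> nextSeq = new HashMap<>();
    // Messages received but not yet deliverable, per sender, keyed by sequence number.
    private final Map<String, Map<Integer, String>> holdBack = new HashMap<>();
    // Messages handed to the application, in delivery order.
    private final List<String> delivered = new ArrayList<>();

    /** Called when a message is received from the network. */
    void receive(String sender, int seq, String msg) {
        Map<Integer, String> pending = holdBack.computeIfAbsent(sender, s -> new HashMap<>());
        pending.put(seq, msg);
        int next = nextSeq.getOrDefault(sender, 0);
        // Deliver as long as the next expected message from this sender is present.
        while (pending.containsKey(next)) {
            delivered.add(pending.remove(next));
            next++;
        }
        nextSeq.put(sender, next);
    }

    List<String> delivered() {
        return delivered;
    }
}
```

If c (sequence number 1) reaches p before b (sequence number 0) from the same sender, c stays in the hold-back queue until b arrives, which mirrors the delay shown in the figure. Messages from different senders are not ordered relative to each other.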

Page 7: EEC 688/788 Secure and Dependable Computing

Causally Ordered Multicast
  • Causal or happens-before ordering:
    • If send(a) → send(b), then deliver(a) occurs before deliver(b) at common destinations

  [Figure: timelines for processes p, q, r, s with multicast messages a and b.]

Page 8: EEC 688/788 Secure and Dependable Computing

Causally Ordered Multicast
  • Causal or happens-before ordering:
    • If send(a) → send(b), then deliver(a) occurs before deliver(b) at common destinations

  [Figure: timelines for processes p, q, r, s with multicast messages a, b, c. Annotation: delivery of c to p is delayed until after b is delivered.]

Page 9: EEC 688/788 Secure and Dependable Computing

Causally Ordered Multicast
  • Causal or happens-before ordering:
    • If send(a) → send(b), then deliver(a) occurs before deliver(b) at common destinations

  [Figure: timelines for processes p, q, r, s with multicast messages a, b, c, e. Annotations: delivery of c to p is delayed until after b is delivered; e is sent (causally) after b.]

Page 10: EEC 688/788 Secure and Dependable Computing

Causally Ordered Multicast
  • Causal or happens-before ordering:
    • If send(a) → send(b), then deliver(a) occurs before deliver(b) at common destinations

  [Figure: timelines for processes p, q, r, s with multicast messages a, b, c, d, e. Annotations: delivery of c to p is delayed until after b is delivered; delivery of e to r is delayed until after b and c are delivered.]
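One common way to enforce causal order is to tag each multicast with a vector timestamp and hold it back until everything it causally depends on has been delivered. The slides do not say which mechanism is used, so the sketch below is an assumption: the class `CausalReceiver`, the `Msg` type, and the numbering of processes by index are hypothetical.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

/** Hypothetical receiver that enforces causal order using vector timestamps. */
class CausalReceiver {
    /** A multicast tagged with its sender's index and the sender's vector timestamp. */
    static final class Msg {
        final String name;
        final int sender;
        final int[] stamp;

        Msg(String name, int sender, int[] stamp) {
            this.name = name;
            this.sender = sender;
            this.stamp = stamp;
        }
    }

    private final int[] seen;                       // seen[k] = multicasts from process k delivered here
    private final List<Msg> holdBack = new ArrayList<>();
    private final List<String> delivered = new ArrayList<>();

    CausalReceiver(int groupSize) {
        seen = new int[groupSize];
    }

    /** True if m is the next multicast from its sender and all its causal predecessors were delivered. */
    private boolean deliverable(Msg m) {
        for (int k = 0; k < seen.length; k++) {
            if (k == m.sender) {
                if (m.stamp[k] != seen[k] + 1) return false;
            } else if (m.stamp[k] > seen[k]) {
                return false;
            }
        }
        return true;
    }

    /** Called when a message is received; delivers every held message that has become deliverable. */
    void receive(Msg m) {
        holdBack.add(m);
        boolean progress = true;
        while (progress) {
            progress = false;
            Iterator<Msg> it = holdBack.iterator();
            while (it.hasNext()) {
                Msg held = it.next();
                if (deliverable(held)) {
                    it.remove();
                    seen[held.sender]++;
                    delivered.add(held.name);
                    progress = true;
                }
            }
        }
    }

    List<String> delivered() {
        return delivered;
    }
}
```

With this rule, a message e that was sent after its sender delivered b carries b in its timestamp, so a receiver that gets e first holds it until b has been delivered.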

Page 11: EEC 688/788 Secure and Dependable Computing

Totally Ordered Multicast
  • Total ordering:
    • Messages are delivered in the same order to all recipients (including the sender)

  [Figure: timelines for processes p, q, r, s with multicast messages a, b, c, d, e. Annotation: all deliver a, b, c, d, then e.]

Page 12: EEC 688/788 Secure and Dependable Computing

Implementing Total Ordering
  • Use a token that moves around
    • Token has a sequence number
    • When you hold the token you can send the next burst of multicasts
  • Use a sequencer to order all multicasts
    • Message is first multicast to all, including the sequencer; then the sequencer determines the order for the message and informs all
    • Or send to the sequencer, and the sequencer multicasts the message with total order information
    • Each sender can take a turn serving as the sequencer
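The first sequencer variant (multicast to all, then let the sequencer announce the order) can be sketched as follows. The names `Sequencer`, `TotalOrderReceiver`, and the message identifiers are hypothetical, and the network, failures, and sequencer rotation are left out.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sequencer: hands out consecutive global sequence numbers. */
class Sequencer {
    private long next = 0;

    synchronized long assign() {
        return next++;
    }
}

/** Hypothetical receiver: delivers a message once both its payload and its position are known. */
class TotalOrderReceiver {
    private final Map<String, String> payloads = new HashMap<>(); // msgId -> payload, received but not delivered
    private final Map<Long, String> orderOf = new HashMap<>();    // global sequence number -> msgId
    private final List<String> delivered = new ArrayList<>();
    private long nextToDeliver = 0;

    /** The multicast itself arrives (in any order). */
    void receiveMessage(String msgId, String payload) {
        payloads.put(msgId, payload);
        tryDeliver();
    }

    /** The sequencer's ordering announcement arrives (also in any order). */
    void receiveOrder(long seq, String msgId) {
        orderOf.put(seq, msgId);
        tryDeliver();
    }

    private void tryDeliver() {
        while (true) {
            String id = orderOf.get(nextToDeliver);
            if (id == null || !payloads.containsKey(id)) {
                return; // the next message in total order is not ready yet
            }
            delivered.add(payloads.remove(id));
            orderOf.remove(nextToDeliver);
            nextToDeliver++;
        }
    }

    List<String> delivered() {
        return delivered;
    }
}
```

Two receivers that see the messages and the announcements in different network orders still deliver in the same order, because delivery follows only the sequencer's numbering.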

Page 13: EEC 688/788 Secure and Dependable Computing

Group Membership Service
  • Input:
    • Process “join” events
    • Process “leave” events
    • Apparent failures
  • Output:
    • Membership views for the group(s) to which those processes belong

Page 14: EEC 688/788 Secure and Dependable Computing

Issues?
  • The service itself needs to be fault-tolerant
    • Otherwise our entire system could be crippled by a single failure!
  • Hence the Group Membership Service (GMS) must run some form of protocol (GMP)

Page 15: EEC 688/788 Secure and Dependable Computing

Approach
  • Assume that GMS has members {p,q,r} at time t
  • Designate the “oldest” of these as the protocol “leader”
    • To initiate a change in GMS membership, the leader will run the GMP
    • Others can’t run the GMP; they report events to the leader

Page 16: EEC 688/788 Secure and Dependable Computing

GMP Example
  • Example: Initially, GMS consists of {p,q,r}
  • Then q is believed to have crashed

  [Figure: timelines for processes p, q, r.]

Page 17: EEC 688/788 Secure and Dependable Computing

Unreliable Failure Detection
  • Recall that failures are hard to distinguish from network delay
    • So we accept the risk of a mistake
  • If p is running a protocol to exclude q because “q has failed”, all processes that hear from p will cut channels to q
    • Avoids “messages from the dead”
  • q must rejoin to participate in GMS again

Page 18: EEC 688/788 Secure and Dependable Computing

Basic GMP
  • Someone reports that “q has failed”
  • Leader (process p) runs a 2-phase commit protocol
    • Announces a “proposed new GMS view”
      • Excludes q, or might add some members who are joining, or could do both at once
    • Waits until a majority of members of the current view have voted “ok”
    • Then commits the change

Page 19: EEC 688/788 Secure and Dependable Computing

GMP Example
  • Proposes new view: {p,r} [-q]
  • Needs majority consent: p itself, plus one more (“current” view had 3 members)
  • Can add members at the same time

  [Figure: timelines for processes p, q, r. Starting from V0 = {p,q,r}, the messages shown are “Proposed V1 = {p,r}”, “OK”, and “Commit V1”, after which V1 = {p,r}.]
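The majority rule in this exchange can be sketched as below. It is a simplification under stated assumptions: the class `GmpLeader` and its method are hypothetical, messaging is replaced by a set of collected “ok” votes, and the leader's own vote is expected to be included in that set.

```java
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical leader-side view change: commit only with a majority of the current view. */
class GmpLeader {
    private Set<String> view;

    GmpLeader(Set<String> initialView) {
        view = new TreeSet<>(initialView);
    }

    /**
     * Phase 1 has proposed the new view and collected "ok" votes.
     * Phase 2 commits only if a majority of the current view has voted.
     * Returns false if the leader must keep waiting.
     */
    boolean changeView(Set<String> proposed, Set<String> okVotes) {
        Set<String> counted = new TreeSet<>(okVotes);
        counted.retainAll(view);             // only members of the current view may vote
        int majority = view.size() / 2 + 1;  // e.g. 2 out of 3
        if (counted.size() < majority) {
            return false;
        }
        view = new TreeSet<>(proposed);
        return true;
    }

    Set<String> view() {
        return view;
    }
}
```

For V0 = {p,q,r}, the vote of p alone is not enough; once r also answers “ok”, V1 = {p,r} is committed.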

Page 20: EEC 688/788 Secure and Dependable Computing

Special Concerns?
  • What if someone doesn’t respond?
    • p can tolerate failures of a minority of members of the current view
      • New first-round “overlaps” its commit: “Commit that q has left. Propose add s and drop r”
    • p must wait if it can’t contact a majority
      • Avoids risk of partitioning

Page 21: EEC 688/788 Secure and Dependable Computing

What If Leader Fails?
  • Here we do a 3-phase protocol
    • New leader identifies itself based on age ranking (oldest surviving process)
    • It runs an inquiry phase
      • “The adored leader has died. Did he say anything to you before passing away?”
      • Note that this causes participants to cut connections to the adored previous leader
    • Then run the normal 2-phase protocol, but “terminate” any interrupted view changes the leader had initiated

Page 22: EEC 688/788 Secure and Dependable Computing

GMP Example
  • New leader first sends an inquiry
  • Then proposes new view: {r,s} [-p]
  • Needs majority consent: q itself, plus one more (“current” view had 3 members)
  • Again, can add members at the same time

  [Figure: timelines for processes p, q, r. Starting from V0 = {p,q,r}, the messages shown are “Inquire [-p]”, “OK: nothing was pending”, “Proposed V1 = {r,s}”, “OK”, and “Commit V1”, after which V1 = {r,s}.]
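The two steps that are specific to leader failure, picking the oldest surviving member and collecting answers to the inquiry, can be sketched as below. The class `LeaderSuccession`, the representation of the view as a list ordered oldest first, and the use of an empty string for “nothing was pending” are all assumptions made for illustration.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.Set;

/** Hypothetical helpers for the extra (inquiry) phase that runs when the leader has failed. */
class LeaderSuccession {
    /** The new leader is the oldest member of the current view that is not suspected of having failed. */
    static Optional<String> newLeader(List<String> membersOldestFirst, Set<String> suspected) {
        for (String member : membersOldestFirst) {
            if (!suspected.contains(member)) {
                return Optional.of(member);
            }
        }
        return Optional.empty();
    }

    /**
     * Inquiry phase: each reply maps a member to the proposal it last heard from the old leader,
     * or to the empty string if nothing was pending. The result is the set of interrupted
     * view changes that the new leader has to terminate.
     */
    static Set<String> interruptedChanges(Map<String, String> inquiryReplies) {
        Set<String> pending = new LinkedHashSet<>();
        for (String reply : inquiryReplies.values()) {
            if (!reply.isEmpty()) {
                pending.add(reply);
            }
        }
        return pending;
    }
}
```

With the view ordered p, q, r and p suspected, q becomes the new leader; if r replies that nothing was pending, q can go straight to the normal 2-phase view change.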

Page 23: EEC 688/788 Secure and Dependable Computing

Safe and Agreed Delivery
  • For totally ordered reliable multicast, there are two delivery policies
    • Safe delivery: a message is delivered only when all correct processes have received it
    • Agreed delivery: a message is delivered as long as it is the next message in total order

Page 24: EEC 688/788 Secure and Dependable Computing

Safe and Agreed Delivery
  • Safe delivery guarantees the uniformity of multicast:
    • If a message is delivered to any process, it is delivered by all correct processes
  • Agreed delivery does not:
    • It is possible that a message is delivered in one (or more) processes, but is not delivered by some correct process
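The difference between the two policies comes down to the condition checked before a message is handed to the application. In the sketch below, the class `DeliveryPolicy` is hypothetical, and “all correct processes” is approximated by the current membership view, which is an assumption rather than something the slides state.

```java
import java.util.Set;

/** Hypothetical delivery checks for a totally ordered multicast. */
class DeliveryPolicy {
    /** Agreed delivery: deliver as soon as the message is next in the total order. */
    static boolean agreed(long seq, long nextInOrder) {
        return seq == nextInOrder;
    }

    /**
     * Safe delivery: additionally require that every member of the current view
     * is known (for example through acknowledgements) to have received the message.
     */
    static boolean safe(long seq, long nextInOrder, Set<String> receivedBy, Set<String> view) {
        return seq == nextInOrder && receivedBy.containsAll(view);
    }
}
```

A message that only p and q are known to have received can already be delivered under the agreed policy, but under the safe policy it waits until r's receipt is confirmed too. That extra wait is the price of uniformity.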

Page 25: EEC 688/788 Secure and Dependable Computing

Checkpointing and Recovery
  • Faults occur over time. How do we ensure that a fault-tolerant system remains operational for an extended period of time?
    • Recover failed replicas, or replace failed replicas with new ones => Recovery is needed
  • How to recover a failed replica or install a new replica?
    • Checkpoint a correct replica and transfer the state to the recovering replica

Page 26: EEC 688/788 Secure and Dependable Computing

Checkpointing
  • Checkpointing: the act of taking a snapshot of an entity so that we can restore it later
  • A replica is a process running in an operating system. The state of a process:
    • Process's memory, stack and registers
    • Threads
    • Open or mmap'ed files
    • Current working directory
    • Interprocess communication: semaphores, shared memory, pipes, sockets
    • Dynamically loaded libraries
    • …

Page 27: EEC 688/788 Secure and Dependable Computing

Checkpointing
  • Many tools are available to perform checkpointing transparently or semi-transparently
    • http://www.checkpointing.org/
    • Condor, libckpt, etc.
    • Checkpoints taken in general are not portable
    • Checkpoint size might be big

Page 28: EEC 688/788 Secure and Dependable Computing

Checkpointing of Application State
  • Sometimes it is more efficient to save and store the application state only
    • Checkpoints can be very portable and compact in size

    class Counter {
        int counter;
        Counter(int initVal) { counter = initVal; }
        void increment() { counter++; }
        void decrement() { counter--; }
        // Restore the application state from a checkpoint
        void setState(int c) { counter = c; }
        // Take a checkpoint of the application state
        int getState() { return counter; }
    }
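A minimal sketch of how the getState/setState pair could be used to bring up a replacement replica. `CounterReplica` repeats the slide's Counter under a different name so that the block is self-contained; `CheckpointDemo` and `recoverFrom` are hypothetical.

```java
/** Same shape as the slide's Counter, renamed so this block compiles on its own. */
class CounterReplica {
    private int counter;

    CounterReplica(int initVal) { counter = initVal; }

    void increment() { counter++; }
    void decrement() { counter--; }

    /** Take an application-level checkpoint: a single int. */
    int getState() { return counter; }

    /** Restore from an application-level checkpoint. */
    void setState(int c) { counter = c; }
}

class CheckpointDemo {
    /** Creates a new replica whose state is copied from a correct replica's checkpoint. */
    static CounterReplica recoverFrom(CounterReplica correct) {
        int checkpoint = correct.getState();   // would be sent over the network in practice
        CounterReplica fresh = new CounterReplica(0);
        fresh.setState(checkpoint);
        return fresh;
    }
}
```

The checkpoint here is one integer, which is why application-level checkpoints can be compact and portable compared with a snapshot of the whole process.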

Page 29: EEC 688/788 Secure and Dependable Computing

Logging
  • Logging of messages
    • Checkpointing in general is expensive
    • Logging of messages is cheaper
    • => we can periodically do checkpointing, or do checkpointing on demand, and log all messages in between
  • Logging of other non-deterministic activities
    • Access order to shared data
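Combining an occasional checkpoint with a message log can be sketched as follows. The class `LoggedCounter`, its `Op` messages, and the `crash` method that simulates losing the in-memory state are hypothetical, and the sketch assumes that message handling is deterministic, so replaying the log reproduces the same state.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical replica that checkpoints occasionally and logs every message in between. */
class LoggedCounter {
    enum Op { INC, DEC }

    private int counter;                               // volatile application state
    private int checkpoint;                            // most recent checkpoint (stable storage in practice)
    private final List<Op> log = new ArrayList<>();    // messages handled since that checkpoint

    /** Log the message, then apply it. */
    void handle(Op op) {
        log.add(op);
        apply(op);
    }

    private void apply(Op op) {
        switch (op) {
            case INC: counter++; break;
            case DEC: counter--; break;
        }
    }

    /** Taking a checkpoint lets the log be truncated. */
    void takeCheckpoint() {
        checkpoint = counter;
        log.clear();
    }

    /** Simulates a crash: the in-memory state is lost, the checkpoint and the log survive. */
    void crash() {
        counter = 0;
    }

    /** Recovery: restore the checkpoint, then replay the logged messages in order. */
    void recover() {
        counter = checkpoint;
        for (Op op : log) {
            apply(op);
        }
    }

    int state() {
        return counter;
    }
}
```

Appending a message to the log is cheap compared with capturing the full state, which is the trade-off the slide describes.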

Page 30: EEC 688/788 Secure and Dependable Computing

Roll-Forward Recovery
  • With replication in space, it is possible to recover from a fault while the system is progressing ahead
  • Roll-forward recovery is made possible by
    • Checkpointing of replica state
    • Logging of incoming messages
    • Reliable, totally ordered group communication system

Page 31: EEC 688/788 Secure and Dependable Computing

Roll-Forward Recovery
  • We want to ensure that the newly admitted replica has a state consistent with the others when it starts
  • Steps of adding a new replica into a group (with on-demand checkpointing)
    • A recovered (or a new) replica joins a group
      • A join message is multicast in total order
    • On receiving the join message, an existing replica puts it into its incoming message queue to wait for processing
    • When the join message is at the head of the queue, a checkpoint is taken and transferred to the new replica

Page 32: EEC 688/788 Secure and Dependable Computing

Roll-Forward Recovery
    • The new replica starts queueing messages after it receives the join message (sent by itself)
    • When the checkpoint is received by the new replica, its state is restored using the received checkpoint (the checkpoint is delivered out of order!)
    • The queued messages are delivered in order at the new replica
    • Other replicas do not stop and wait for the new replica
  • The steps of adding a new replica into a group with periodic checkpointing are similar
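The on-demand steps above can be sketched for a replica whose state is a single integer and whose messages are increments. The classes `ExistingReplica` and `JoiningReplica` and their method names are hypothetical, and total-order delivery is simulated by the order in which the methods are called.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical existing replica: its state is an int, and each message adds a delta. */
class ExistingReplica {
    private int state;

    ExistingReplica(int initial) { state = initial; }

    void onMessage(int delta) { state += delta; }

    /** The join message has reached the head of the queue: take a checkpoint for the new replica. */
    int onJoinDelivered() { return state; }

    int state() { return state; }
}

/** Hypothetical new or restarted replica. */
class JoiningReplica {
    private int state;
    private boolean joinSeen = false;
    private boolean restored = false;
    private final Deque<Integer> queued = new ArrayDeque<>();

    /** Our own join message is delivered in total order: start queueing from here on. */
    void onJoinDelivered() { joinSeen = true; }

    void onMessage(int delta) {
        if (!joinSeen) return;                          // ordered before the join: covered by the checkpoint
        if (!restored) { queued.add(delta); return; }   // ordered after the join, checkpoint not here yet
        state += delta;
    }

    /** The checkpoint arrives outside the total order; apply it, then drain the queue in order. */
    void onCheckpoint(int checkpoint) {
        state = checkpoint;
        restored = true;
        while (!queued.isEmpty()) {
            state += queued.poll();
        }
    }

    int state() { return state; }
}
```

Because the join message has a fixed position in the total order, the checkpoint reflects exactly the messages ordered before it, and the queue holds exactly those ordered after it. The existing replica never has to pause while this happens.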

Page 33: EEC 688/788 Secure and Dependable Computing

Steps of Roll-Forward Recovery

  [Figure (i): an existing replica and a new or restarted replica, each receiving Recovery_Start. Labels: Checkpoint, op1, op2. Annotations: “Recovery_Start triggers queueing of messages”; “Recovery_Start is queued, just like a regular message”.]

Page 34: EEC 688/788 Secure and Dependable Computing

Steps of Roll-Forward Recovery

  [Figure (ii): an existing replica and a new or restarted replica, each receiving op3 after Recovery_Start. Labels: Checkpoint, op1, op2. Annotation: “New message op3 is queued while waiting for reply of op2”.]

Page 35: EEC 688/788 Secure and Dependable Computing

Steps of Roll-Forward Recovery

  [Figure (iii): an existing replica and a new or restarted replica, with op3 and Recovery_Start queued. Labels: Checkpoint, op1, op2. Annotations: “op2 returns”; “Logged messages before Recovery_Start are consolidated and multicast”.]

Page 36: EEC 688/788 Secure and Dependable Computing

Steps of Roll-Forward Recovery

  [Figure (iv): an existing replica and a new or restarted replica, both now holding Checkpoint, op1, op2, with op3 queued. Annotations: “Transferred log is expanded and the checkpoint is applied”; “Outgoing messages as a result of operations before Recovery_Start are suppressed”; “Normal operation is resumed and queued messages are delivered”.]

Page 37: EEC 688/788 Secure and Dependable Computing

Roll-backward Recovery
  • Roll-backward recovery is used for systems relying on replication in time for fault tolerance
    • When a failure occurs, roll back using the most recent checkpoint (and retry)

Page 38: EEC 688/788 Secure and Dependable Computing

Roll-backward Recovery in a Distributed System
  • Performing roll-backward recovery in a distributed system is non-trivial
    • Need to solve the distributed snapshot problem
    • It is easy to perform a local checkpoint of a process, but in a distributed system, when one process rolls back, other processes must also roll back to a consistent state
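One way to make “consistent state” concrete is to check a set of local checkpoints for orphan messages, that is, messages recorded as received whose sending would be undone by the rollback. The sketch below assumes that every process stamps its checkpoint with a vector clock, which the slides do not mention; the class `SnapshotCheck` is hypothetical.

```java
/** Hypothetical consistency check for a set of local checkpoints, one per process. */
class SnapshotCheck {
    /**
     * clocks[i] is the vector clock recorded with process i's local checkpoint.
     * The set is consistent if no process j has seen more events of process i
     * than process i's own checkpoint contains.
     */
    static boolean isConsistent(int[][] clocks) {
        for (int i = 0; i < clocks.length; i++) {
            for (int j = 0; j < clocks.length; j++) {
                if (clocks[j][i] > clocks[i][i]) {
                    return false; // j's checkpoint depends on an event that i's checkpoint has not reached
                }
            }
        }
        return true;
    }
}
```

For two processes, the checkpoints [1,0] and [2,1] are inconsistent: the second process has already received a message sent at the first process's event 2, but the first process would roll back to event 1. In that case the second process has to roll back further as well.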
