
4/13/2008


Amitanand S. Aiyer, Lorenzo Alvisi, Allen Clement, Mike Dahlin, Jean-Philippe Martin, Carl Porth

Award Paper in the 20th ACM Symposium on Operating Systems Principles (SOSP 2005).

Presented to: Dr. Ayman Abdel-Hamid

By: Shaimaa Lazem

Outline

• Overview
• Byzantine-Altruistic-Rational (BAR) model
• System Architecture
• Principles of Operations
• Level 1: BART State Machine
• Level 2: Partitioning Work
• Level 3: The Application
• BAR-B

4/14/2008


Overview

• Cooperative services in Multiple Administrative Domains (MAD):
  • Nodes collaborate to provide some service that benefits each node, but no central authority controls the nodes’ actions (e.g., Internet routing, cooperative backup).
• Problem
  • Nodes may depart from protocols because they are broken (failure, security compromise) or selfish.
  • It is not sufficient to verify experimentally that a protocol tolerates a collection of attacks identified by the protocol’s creator.
  • It is necessary to design protocols that provably meet their goals, no matter what strategies nodes may concoct.


Contributions

• A formal model for reasoning about systems in the presence of nodes that deviate from the protocol (the BAR model).
• A general architecture and a set of design principles which, together, make it possible to build and reason about BAR-tolerant systems.
• The implementation of BAR-B, a cooperative backup system within the BAR model.


Byzantine-Altruistic-Rational (BAR) model

• Three classes of nodes:
  • Rational nodes participate in the system to gain some net benefit and can depart from a proposed program in order to increase their net benefit.
  • Byzantine nodes can depart arbitrarily from a proposed program whether it benefits them or not.
  • Altruistic nodes execute a proposed program even if the rational choice is to deviate.


Byzantine-Altruistic-Rational (BAR) model (cont.)

• Two classes of protocols:
  • Incentive-Compatible Byzantine Fault Tolerant (IC-BFT): a protocol is IC-BFT if it guarantees the specified set of safety and liveness properties and if it is in the best interest of all rational nodes to follow the protocol exactly.
  • Byzantine Altruistic Rational Tolerant (BART): a protocol is BART if it guarantees the specified set of safety and liveness properties in the presence of all rational deviations from the protocol.


Replicated State Machine (RSM)

• A technique for supporting service replication.
• The service is written as a deterministic state machine and replicated on several machines.
• An RSM substrate coordinates the behavior of the separate state machines so that their executions proceed consistently, even if some of the computers fail.
• A key task of the RSM substrate is to establish a task ordering.
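The replication idea above can be illustrated with a minimal Python sketch. It is not the substrate described in [2]: the class `KVStateMachine`, the operation tuples, and `run_replicas` are hypothetical names chosen for illustration, and the agreed task ordering is simply given as a list, whereas a real RSM substrate would establish it with an agreement protocol.

```python
class KVStateMachine:
    """A deterministic state machine: the same operations applied
    in the same order always produce the same state."""

    def __init__(self):
        self.state = {}

    def apply(self, op):
        kind, key, value = op
        if kind == "put":
            self.state[key] = value
        elif kind == "delete":
            self.state.pop(key, None)
        else:
            raise ValueError(f"unknown operation: {kind}")


def run_replicas(ordered_ops, num_replicas=3):
    """Feed the same ordered operations to every replica (assumed ordering)."""
    replicas = [KVStateMachine() for _ in range(num_replicas)]
    for op in ordered_ops:
        for replica in replicas:
            replica.apply(op)
    return replicas


ops = [("put", "x", 1), ("put", "y", 2), ("delete", "x", None)]
replicas = run_replicas(ops)
```

Because every replica is deterministic and sees the same order, all replicas end in the same state, which is why establishing the ordering is the substrate's key task.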


Replicated State Machine (RSM) (cont.)

[Figure: A typical RSM-based client-server computer system [2].]


Replicated State Machine (RSM) (cont.)

[Figure: RSM timing diagram [2].]

System Model Assumptions

• The BART protocols do not depend on the existence of altruistic nodes in the system.
• A trusted authority controls which nodes may enter the system.
• Each member has a unique identity corresponding to a cryptographic public key.
• A “penance” mechanism gives nodes an incentive to stay as synchronized as possible.


System Model Assumptions (cont.)

• Rational nodes:
  • Receive a long-term benefit from participating in the protocol.
  • Are conservative when computing the impact of Byzantine nodes on their utility.
  • Colluding nodes are classified as Byzantine.
• Byzantine nodes:
  • Exhibit arbitrary behavior: they may crash, lose data, alter data, and send incorrect protocol messages.
  • At most (n-2)/3 of the n nodes in the system are Byzantine.
  • Every non-Byzantine node is rational.
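As a small worked example of the (n-2)/3 bound, the sketch below computes the largest tolerated number of Byzantine nodes for a given system size, and the smallest system size for a given number of faults. Rounding down to a whole number of nodes is an assumption of this sketch, as are the function names `max_byzantine` and `min_nodes`.

```python
def max_byzantine(n: int) -> int:
    """Largest whole number f with f <= (n - 2) / 3 (rounding down is assumed)."""
    if n < 2:
        raise ValueError("need at least 2 nodes")
    return (n - 2) // 3


def min_nodes(f: int) -> int:
    """Smallest n such that f Byzantine nodes stay within the bound: n = 3f + 2."""
    if f < 0:
        raise ValueError("f must be non-negative")
    return 3 * f + 2


sizes = {n: max_byzantine(n) for n in (5, 7, 8, 11)}
```

For instance, under this reading a system of 5 nodes tolerates 1 Byzantine node, and tolerating 2 requires at least 8 nodes.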


System Architecture

• Level 1 provides key abstractions for reliable distributed services.
  • The RSM gives the abstraction of a correct (reliable and altruistic) node.
• Level 2 builds a system in which work can be assigned to specific nodes instead of being executed by all replicas in the RSM.
• Level 3 implements a desired service using the levels underneath.


Principles of Operations

• Accountability: if nodes are accountable for their behavior, then rational peers have an incentive to behave correctly.
• Strong identities and restricted membership are parts of the solution.
• How should a system detect and react to incorrect behavior?
• Aggressively Byzantine behavior is easy to address:
  • For example, a node signs a promise to store a file with a particular cryptographic hash and then responds to a request to read the file with a signed message that contains the wrong data.
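The aggressive case is easy to address because the promise and the response together form checkable evidence. The following is a minimal sketch under stated assumptions: SHA-256 stands in for the unspecified cryptographic hash, signature checking is omitted (both records are simply assumed to be authentic), and `StorePromise`, `ReadResponse`, and `is_provably_wrong` are hypothetical names.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class StorePromise:
    node_id: str
    file_hash: str  # hex digest the node promised to store


@dataclass(frozen=True)
class ReadResponse:
    node_id: str
    data: bytes


def digest(data: bytes) -> str:
    # SHA-256 is an assumption; the slides do not name a hash function.
    return hashlib.sha256(data).hexdigest()


def is_provably_wrong(promise: StorePromise, response: ReadResponse) -> bool:
    """True when the same node returned data that does not match its promise."""
    return (promise.node_id == response.node_id
            and digest(response.data) != promise.file_hash)


promise = StorePromise("b", digest(b"backup chunk"))
honest = ReadResponse("b", b"backup chunk")
tampered = ReadResponse("b", b"corrupted")
```

Anyone holding both records can repeat the check, so no third party has to trust the accuser's word.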


Principles of Operations (cont.)

• Passive-aggressive node:
  • A node may decline to send a message that it should send. The receiver is in a position to accuse the node of wrongdoing, but it becomes a case of “he said/she said”.
  • A node may exploit non-determinism to provide incomplete information that interferes with the protocol’s operation but is difficult to conclusively prove wrong.
    • For example, a node is expected to transmit a signed copy of the request, but for liveness it is permitted to transmit a signed timeout message instead.
    • Self-interested nodes may choose to send the timeout message rather than transmit the request.


Principles of Operations: Addressing the challenges

• Level 1 (primitives)
  • Nodes unilaterally deny service to nodes that fail to send expected messages. This low-level, local tit-for-tat technique provides incentives for cooperation without requiring a third party to judge which node is to blame.
  • The protocol balances costs so that when nodes have a choice between two messages, there is no incentive to choose the “wrong” one.
  • Nodes can unilaterally impose extra work (called penance) when they judge that another node’s response is not timely.


Principles of Operations: Addressing the challenges (cont.)

• Level 2 (work assignment)
  • If a node fails to reply to a request issued via the underlying state machine, then a quorum of nodes in the state machine generates a proof of misbehavior (POM) against the node.
• Level 3 (application)
  • Applications make use of reliable work assignment: each request is bound to a reply or a timeout.
  • The application protocol must be designed so that requests and responses include sufficient information for any node to judge the validity of a request/response pair.


Level 1: BART State Machine

• Terminating Reliable Broadcast (TRB)
  • Each TRB instance is organized in a series of turns.
  • The sender for instance i is the first leader for instance i.
  • If nodes receive the messages on time, they accept the value; otherwise, nodes send a “set-turn” message.
  • Nodes other than the sender are selected round-robin for the leader role.
  • Each participant thus has a periodic opportunity to propose values to the state machine (ensuring a long-term benefit).
  • To limit non-determinism, an instance can terminate in only two ways: with the sender’s value or with a default value.
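The leader rotation described above can be sketched as follows. The slides only say that the sender leads first and that the other nodes take the leader role round-robin; the concrete order used here (ascending node index starting just after the sender) and the choice of sender as `instance % n` are assumptions of this sketch, and both function names are hypothetical.

```python
def sender_for_instance(instance: int, n: int) -> int:
    """Assumed assignment: instances rotate over the n nodes, so each node
    periodically gets to propose a value."""
    return instance % n


def leader_for_turn(sender: int, turn: int, n: int) -> int:
    """Turn 0 is led by the sender; later turns rotate round-robin
    over the other n - 1 nodes (assumed order)."""
    if n < 2 or not 0 <= sender < n or turn < 0:
        raise ValueError("invalid arguments")
    if turn == 0:
        return sender
    others = [(sender + k) % n for k in range(1, n)]
    return others[(turn - 1) % (n - 1)]


leaders = [leader_for_turn(sender=2, turn=t, n=4) for t in range(5)]
senders = [sender_for_instance(i, 4) for i in range(6)]
```

With 4 nodes and sender 2, a “set-turn” after each timeout would pass leadership to nodes 3, 0, and 1 in turn, and then back to 3.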


Level 1: BART State Machine (cont.)

• Message Queue
  • The message queue used by x contains entries for the messages that x intends to send to y, interleaved with “bubbles”.
  • A bubble must be filled with an appropriate message from y before x can proceed to send the messages in the queue.
  • This gives rational nodes an incentive to send the messages expected by the protocol.
• Balanced Messages
  • Whenever the node has the opportunity to choose the message to send next, the intended message is never more expensive than the alternatives.
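The queue-with-bubbles idea can be sketched as a small data structure. This is an illustrative sketch only: the class `PeerQueue`, its method names, and the representation of an expected message as a plain string are assumptions, not details taken from the protocol.

```python
from collections import deque


class PeerQueue:
    """Queue kept by x for peer y: outgoing messages interleaved with bubbles."""

    def __init__(self):
        self._entries = deque()

    def enqueue_message(self, msg):
        self._entries.append(("send", msg))

    def enqueue_bubble(self, expected):
        # A bubble stands for a message x expects to receive from y.
        self._entries.append(("bubble", expected))

    def fill_bubble(self, received):
        """Fill the bubble at the head of the queue if `received` matches it."""
        if self._entries and self._entries[0] == ("bubble", received):
            self._entries.popleft()
            return True
        return False

    def sendable(self):
        """Pop and return messages up to the first unfilled bubble."""
        out = []
        while self._entries and self._entries[0][0] == "send":
            out.append(self._entries.popleft()[1])
        return out


q = PeerQueue()
q.enqueue_message("m1")
q.enqueue_bubble("ack1")
q.enqueue_message("m2")
first = q.sendable()               # only m1 is ahead of the bubble
blocked = q.sendable()             # nothing until y sends ack1
wrong = q.fill_bubble("other")     # an unexpected message does not fill it
filled = q.fill_bubble("ack1")
rest = q.sendable()
```

Until y supplies the expected message, x withholds everything behind the bubble, which is the local tit-for-tat that makes sending expected messages worthwhile for y.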


Level 1: BART State Machine (cont.)

• Penance
  • Each node maintains an untimely vector that tracks its perception of other nodes’ timeliness.
  • A node is considered untimely if any timeout message electing a new leader arrives significantly earlier or later than expected according to the receiver’s local clock.
  • When a node x becomes the sender, it includes its untimely vector with the value it proposes.
  • After agreeing on the proposal, all nodes except the sender expect a penance message from each node indicted in the untimely vector.
  • Because of the message queues, the untimely nodes must send the penance message to all non-sender nodes.
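A rough sketch of the bookkeeping behind penance is shown below. The `tolerance` parameter (what counts as “significantly” early or late), the dictionary representation of the untimely vector, and the rule that an indicted node owes nothing to itself are assumptions made for illustration; the function names are hypothetical.

```python
def is_untimely(expected_arrival, actual_arrival, tolerance):
    """Untimely if the message arrives significantly earlier or later than
    expected on the receiver's local clock (tolerance is an assumed parameter)."""
    return abs(actual_arrival - expected_arrival) > tolerance


def penance_obligations(sender, untimely_vector, nodes):
    """Map each indicted node to the set of nodes that expect its penance.

    untimely_vector: dict node -> bool, as proposed by the sender.
    """
    receivers = [p for p in nodes if p != sender]
    return {
        indicted: {r for r in receivers if r != indicted}
        for indicted, flagged in untimely_vector.items()
        if flagged
    }


nodes = ["a", "b", "c", "d"]
untimely = {"b": False, "c": True, "d": False}
owed = penance_obligations("a", untimely, nodes)
```

Here sender a indicts c, so b and d each expect a penance message from c before continuing to exchange messages with it.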


Level 1: BART State Machine (cont.)

• Timeouts and garbage collection
  • A set-turn timeout transfers leadership from a slow leader.
  • A max response time timeout garbage collects messages queued for extremely slow nodes.
    • If node a remains silent for an extended period of time, it can force non-Byzantine node b to retain an arbitrarily large set of pending messages to a.
    • The cost of participating in the protocol could then exceed the benefit.
  • If a has been holding pending messages for b for more than max response time, then a:
    • records b as faulty by adding b to its badlist,
    • garbage collects all state associated with b,
    • refuses further communication with b.
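The garbage-collection rule in the last bullet can be sketched as follows. The class `PendingTracker` and its methods are hypothetical, times are plain numbers supplied by the caller, and treating any response from a peer as clearing all of its pending messages is a simplification made for this sketch.

```python
class PendingTracker:
    """Tracks how long this node has held pending messages for each peer."""

    def __init__(self, max_response_time):
        self.max_response_time = max_response_time
        self.oldest_pending = {}  # peer -> time its oldest pending message was queued
        self.badlist = set()

    def queue_message(self, peer, now):
        if peer in self.badlist:
            return False  # refuse further communication
        self.oldest_pending.setdefault(peer, now)
        return True

    def peer_responded(self, peer):
        # Simplification: a response clears everything pending for that peer.
        self.oldest_pending.pop(peer, None)

    def collect_garbage(self, now):
        """Badlist peers whose messages have been pending for too long and
        drop the state kept for them."""
        expired = [p for p, t in self.oldest_pending.items()
                   if now - t > self.max_response_time]
        for p in expired:
            self.badlist.add(p)
            del self.oldest_pending[p]
        return expired


tracker = PendingTracker(max_response_time=10)
tracker.queue_message("b", now=0)
tracker.queue_message("c", now=5)
tracker.peer_responded("c")
at_limit = tracker.collect_garbage(now=10)
after_limit = tracker.collect_garbage(now=11)
refused = tracker.queue_message("b", now=12)
```

Bounding the retained state this way keeps a silent peer from making participation more expensive than it is worth.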


Level 1: BART State Machine (cont.)

• Global Punishment
  • A mechanism to transform local suspicion against other nodes into POMs.
  • The POMs allow nodes to agree that someone misbehaved.
  • When node a is the sender of an instance, it includes its badlist as a bit vector with the value it proposes.
  • Nodes monitor the badlists they receive from others.
  • If over time node b appears on at least f + 1 different senders’ badlists, then the receivers of these badlists also begin to consider b faulty.
  • They add b to their own badlist, discard the state associated with b, and refuse to communicate with b in the future.
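The f + 1 rule above can be sketched as a small monitor. The class `BadlistMonitor` is hypothetical, and badlists are represented as sets of node identifiers rather than the bit vectors the protocol uses, purely to keep the example short.

```python
from collections import defaultdict


class BadlistMonitor:
    """Counts distinct senders that have listed each node on their badlist."""

    def __init__(self, f):
        self.f = f
        self.accusers = defaultdict(set)  # accused node -> distinct accusing senders
        self.badlist = set()

    def observe(self, sender, sender_badlist):
        """Record one sender's badlist; return nodes newly considered faulty."""
        newly_bad = []
        for accused in sender_badlist:
            self.accusers[accused].add(sender)
            enough = len(self.accusers[accused]) >= self.f + 1
            if enough and accused not in self.badlist:
                self.badlist.add(accused)
                newly_bad.append(accused)
        return newly_bad


monitor = BadlistMonitor(f=1)
r1 = monitor.observe("a", {"b"})
r2 = monitor.observe("a", {"b"})   # the same sender again does not count twice
r3 = monitor.observe("c", {"b"})   # second distinct sender reaches f + 1
```

Requiring f + 1 distinct senders means that, with at most f Byzantine nodes, at least one accusation comes from a non-Byzantine node.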


Level 2: Partitioning Work

• Work is partitioned among nodes to reduce the replication overhead required by cooperative applications.
• The Guaranteed Response protocol ensures that every request is answered.


Level 2: Partitioning Work (cont.)

• The Periodic Work protocol ensures that clients periodically answer implicit requests required by an application.
  • Each node provides the witness with an application-specified response type indicating its completion of a periodic task.
  • If a node does not supply the expected ReplySummary, the witness node can either unilaterally deny its services to the offending node or generate a POM to be handled by the application.


Level 2: Partitioning Work (cont.)

• The Message Binding protocol binds messages to an authoritative time.
  • It maintains an authoritative time that is recent, non-decreasing, and identical at all state machine nodes.
  • Each proposal to the state machine is required to contain a local timestamp generated by the proposer.
  • Authoritative time is computed by taking the maximum of the median of the timestamps of the f + 1 most recent decisions and the previous authoritative time.
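The authoritative-time rule stated above can be written as a short function. Two details are assumptions of this sketch rather than part of the slides: `median_low` is used so that an even number of timestamps yields one of the actual timestamps, and the previous authoritative time is kept unchanged until f + 1 decisions exist. The function name is hypothetical.

```python
from statistics import median_low


def authoritative_time(previous, decision_timestamps, f):
    """max(median of the f + 1 most recent decision timestamps, previous).

    decision_timestamps is ordered from oldest to newest.
    """
    recent = decision_timestamps[-(f + 1):]
    if len(recent) < f + 1:
        return previous  # assumed fallback: not enough decisions yet
    return max(median_low(recent), previous)


# f = 2, so the three most recent timestamps [105, 500, 110] are used.
t1 = authoritative_time(previous=100, decision_timestamps=[90, 105, 500, 110], f=2)
# Old timestamps cannot move authoritative time backwards.
t2 = authoritative_time(previous=120, decision_timestamps=[90, 95, 100], f=2)
# Too few decisions: the previous value is kept.
t3 = authoritative_time(previous=50, decision_timestamps=[70], f=1)
```

Taking the median limits the influence of a single extreme timestamp (500 in the first call), and the outer `max` keeps the result non-decreasing.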


Level 3: The Application

• BART applications must discharge four responsibilities in order to take advantage of lower-level abstractions:
  • Provide rational nodes with a long-term benefit for participating in the system.
  • Assign work to nodes in a fault-tolerant manner.
  • Determine if the contents of a request or response constitute a Proof of Misbehavior (POM) under the application semantics.
  • Sanction nodes that have provably misbehaved.
• Messages are structured so that incorrect responses act as proofs of misbehavior.


BAR-B

• BAR-B is a cooperative backup system in which nodes commit to participating in the system’s state machine and contributing an amount of storage to the system in exchange for an equal amount of space on other nodes.
• Guarantees
  • Data can be retrieved within the lease period.
  • No POM can be gathered against a node that does not deviate.
  • No node can store more than its quota without risking being caught.
  • If a node crashes, it is guaranteed a window of time during which it can rejoin the system and recover all data it has stored.


References

1. A. Aiyer, L. Alvisi, A. Clement, M. Dahlin, J.-P. Martin, and C. Porth, “BAR Fault Tolerance for Cooperative Services,” in Proceedings of the 20th ACM Symposium on Operating Systems Principles, pp. 45-58, Brighton, United Kingdom, October 23-26, 2005.
2. J. Howell and J. Douceur, “Replicated Virtual Machines,” Technical Report MSR-TR-2005-119, Microsoft Research, 2005.
3. http://www.cs.utexas.edu/users/lorenzo/bar.html


Thank You

Questions?
