Continuous Consistency and Availability

Haifeng Yu

CPS 212 Fall 2002

2

Consistency in Replication

Reasons for replication: better performance and availability

Replication comes with a consistency cost:

[Figure: three clients and three replicated servers]

Replication transforms client-server communication into server-server communication:

• Decreases performance

• Decreases availability

3

Strong Consistency and Optimistic Consistency

Traditionally, two choices for consistency level:

• Strong consistency: strictly “in sync”

• Optimistic consistency: no guarantee at all

• Associated tradeoffs with each model

[Figure: availability / performance / scalability plotted against consistency, with two marked points: Optimistic Consistency and Strong Consistency]

4

Problems with Binary Choice

Strong consistency incurs prohibitive overheads for many WAN apps

• Replication may even decrease performance, availability and scalability relative to a single server!

Optimistic consistency provides no consistency guarantee at all

• Resulting in upset users: unbounded reservation conflicts

• Potentially rendering the app unusable: if traffic data is more than 1 hour stale, it is probably of little use

Applications cannot tune the consistency level based on their environment

• Need to adapt to client, service and network characteristics

5

Continuous Consistency

Consistency is continuous rather than binary for many WAN apps

• These apps can benefit from exploiting the consistency spectrum between strong and optimistic consistency.

[Figure: two plots of availability / performance / scalability against consistency: one marking Optimistic Consistency and Strong Consistency, one labeled Continuous Consistency]

6

Quantifying Consistency

Many ways:

• Staleness (TTL in web caching): Invalidate

• Limit number of locally buffered writes

[Figure: buffered updates sent to other replicas]
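
The two metrics above can be sketched as simple bound checks. This is a minimal illustration only: the names `ReplicaState`, `staleness`, and `within_bounds`, and all numeric values, are assumptions made for the example and do not come from the slides.

```python
from dataclasses import dataclass


@dataclass
class ReplicaState:
    """Hypothetical view of one replica (names are illustrative)."""
    last_sync_time: float   # time of the last synchronization, in seconds
    buffered_writes: int    # local writes not yet sent to other replicas


def staleness(state: ReplicaState, now: float) -> float:
    """Seconds elapsed since the replica last synchronized."""
    return now - state.last_sync_time


def within_bounds(state: ReplicaState, now: float,
                  max_staleness: float, max_buffered: int) -> bool:
    """True if both the staleness bound and the buffered-write bound hold."""
    return (staleness(state, now) <= max_staleness
            and state.buffered_writes <= max_buffered)


state = ReplicaState(last_sync_time=100.0, buffered_writes=3)
print(within_bounds(state, now=130.0, max_staleness=60.0, max_buffered=5))  # True
print(within_bounds(state, now=130.0, max_staleness=60.0, max_buffered=2))  # False
```

Tightening either bound moves the replica toward strong consistency; loosening both moves it toward optimistic consistency.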

7

Applications?

Applications:

• Web caching

• Airline reservation

• Distributed games

• Shared editor

Non-applications:

• Some scientific computing problems

• Banking system

• Any application that has binary output

An application’s nature determines whether continuous consistency is applicable

8

Trading Consistency for Performance

Airline reservation: running at Berkeley, Utah, Duke

[Figure: throughput (updates/sec, axis from 0 to 50) versus inconsistency (0% to 100%), with Strong Consistency and Optimistic Consistency marked] [Yu’02, TOCS]

9

The Cost of Increased Performance

Increased performance comes with a cost

• Adaptively trade consistency for performance based on client, network, and service conditions

[Figure: reservation conflict rate (0% to 25%) versus inconsistency (0% to 100%)]

10

Model vs. Protocol

The continuous consistency model is a spec.

A protocol is anything that can enforce the spec.

• Corollary: a strong consistency protocol is a protocol for any model

Many protocols exist for a specific model; some are good, others are not

11

Designing a Continuous Consistency Model

The model is a spec, thus quantifying consistency (in a bad way) is trivial

Only the application knows its own definition of consistency

• Airline reservation vs. distributed games

What is a “good” continuous consistency model?

• Can be used by diverse apps

• Practical

12

Distributed Consensus and Leader Election

What does “continuous consistency” mean?

• Allow at most k decision values

• Allow at most k leaders

Helps overcome some impossibilities

• A unique decision value requires a majority (more than ½ of the nodes)

• k decision values allow any partition with more than 1/(k + 1) of the nodes to decide
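
The threshold rule can be checked with a short sketch. It assumes the 1/(k + 1) bound is strict (a partition needs more than that fraction of the nodes), which is what guarantees that at most k disjoint partitions can decide; the function names and the example partition sizes are illustrative, not from the slides.

```python
def can_decide(partition_size: int, total_nodes: int, k: int) -> bool:
    """True if the partition holds strictly more than 1/(k + 1) of all nodes.

    Integer arithmetic avoids floating-point comparison issues.
    """
    return partition_size * (k + 1) > total_nodes


def deciding_partitions(partition_sizes: list[int], k: int) -> int:
    """Number of disjoint partitions that are allowed to decide."""
    total = sum(partition_sizes)
    return sum(1 for size in partition_sizes if can_decide(size, total, k))


sizes = [5, 4, 3]  # hypothetical split of 12 nodes into three partitions
for k in (1, 2, 3):
    print(k, deciding_partitions(sizes, k))
```

With k = 1 (a unique decision value) no partition in this split has a majority, so nothing can decide. Raising k lets more partitions make progress, but never more than k of them, because k + 1 partitions each larger than n/(k + 1) would need more than n nodes in total.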

13

Group Membership Service

Def: Keep track of which nodes belong to which group

Traditionally, group membership services maintain only a single group

• Primary-partition membership services

• Corresponds to strong consistency

Recently, partitionable membership services

• Still an active area of research

• Corresponds to optimistic consistency

Continuous consistency:

• Allow at most k groups

• Again, helps overcome the ½ majority limitation

14

Continuous Consistency Summary

WAN replication needs dynamically tunable consistency

Tradeoff between consistency and performance

How to design a continuous consistency model

Continuous consistency in other contexts

Next: Availability

15

What is Availability?

No well-accepted availability metric for Internet services

“Uptime” metric can be misleading for Internet services

• Server may be inaccessible because of network partition

Available: “present or ready for immediate use”

• From Webster’s Collegiate Dictionary

• What does “immediate” mean?

• Time-out

Availability = (accepted accesses) / (submitted accesses)

• Implicit time-out in the definition
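
The definition above, including its implicit time-out, can be sketched as follows. The function name, the representation of an access as a response time (or `None` when no response arrives), and the sample values are all assumptions made for illustration.

```python
from typing import Optional, Sequence


def availability(response_times: Sequence[Optional[float]],
                 timeout: float) -> float:
    """Fraction of submitted accesses that were accepted within the timeout.

    Each entry is the response time of one submitted access in seconds,
    or None if the access was rejected or never answered.
    """
    if not response_times:
        raise ValueError("no accesses were submitted")
    accepted = sum(1 for t in response_times
                   if t is not None and t <= timeout)
    return accepted / len(response_times)


# Five submitted accesses: one never answered, one answered too late.
samples = [0.2, 0.5, None, 3.0, 0.1]
print(availability(samples, timeout=1.0))  # 0.6
```

Changing `timeout` changes the result for the same trace, which is why the time-out is part of the definition rather than a measurement detail.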

16

Perform-ability

User satisfaction is not binary

• What if a partial result is returned before time-out?

• What if the result is sent back after an hour, or a day?

• Availability is related to performance

Performability = reward function (quality and timeliness of result)

Determining the reward function is hard!

17

Availability of an Internet Service

We use user-observed availability in our study:

Availability = (accepted accesses) / (submitted accesses)

[Figure: a client accessing a server. Failure (×) between client and server: 2% [Chandra et al., USITS’01]. Reject due to server failure (×): 0.1% [MS press release, Jan’01]]

18

Effects of Replication

Consistency may force a replica to reject an otherwise acceptable request

• Network failure rate decreases; replica rejection rate increases

[Figure: a client accessing two replicas. Failure (×) between client and replica: < 2%. A replica rejects (×) a request because communication to maintain consistency failed: > 0.1%]

19

Limitations of Strong Consistency

[Figure: replicas and clients on two sides of a network partition]

Option 1:

• One side: accept reads, reject writes

• Other side: accept reads, reject writes

Option 2:

• One side: accept reads, accept writes

• Other side: reject reads, reject writes

20

Effects of Continuous Consistency

Option 1:

• One side: accept reads, reject writes

• Other side: accept reads, reject writes

New Option 1 (allow a replica to buffer 5 writes):

• One side: accept reads, accept first 10 writes

• Other side: accept reads, accept first 5 writes

21

Effects of Continuous Consistency

Option 2:

• One side: accept reads, accept writes

• Other side: reject reads, reject writes

New Option 2 (allow a replica to buffer 5 writes):

• One side: accept reads, accept writes

• Other side: accept first few reads, accept first 5 writes
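
The effect of a write buffer during a partition can be sketched with a toy replica. This is an illustration under stated assumptions, not the protocol from the talk: the `Replica` class, its methods, and the choice to model strong consistency as a buffer of size 0 are all hypothetical.

```python
class Replica:
    """Toy replica that may buffer a bounded number of unpropagated writes."""

    def __init__(self, max_buffered: int) -> None:
        self.max_buffered = max_buffered  # 0 models "reject writes" when cut off
        self.buffered = 0
        self.partitioned = False

    def write(self) -> bool:
        """Return True if the write is accepted, False if it is rejected."""
        if not self.partitioned:
            return True  # assumed to propagate immediately; nothing is buffered
        if self.buffered < self.max_buffered:
            self.buffered += 1  # accept locally and propagate after healing
            return True
        return False  # accepting would exceed the consistency bound

    def heal(self) -> None:
        """Partition ends: buffered writes are pushed to the other replicas."""
        self.partitioned = False
        self.buffered = 0


strong = Replica(max_buffered=0)
relaxed = Replica(max_buffered=5)
strong.partitioned = relaxed.partitioned = True

strong_accepted = sum(strong.write() for _ in range(8))
relaxed_accepted = sum(relaxed.write() for _ in range(8))
print(strong_accepted, relaxed_accepted)  # 0 5
```

The relaxed replica stays available for its first 5 writes during the partition, matching the "accept first 5 writes" entries above, and becomes unavailable for writes only once its buffer is exhausted.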

22

Consistency Impact is Inherent

[Figure: availability versus inconsistency, showing a hard bound; labeled points: 0% consistency, 100% availability, 100% consistency]

A hard bound always exists

We always know the two end points, but may not know the exact shape of the curve

23

Effects of Consistency Protocol

Achieved availability also depends on the protocol

• Design better protocols

• Job of system designers

[Figure: availability versus inconsistency, showing the upper bound and curves for Protocol A and Protocol B]

24

Availability Optimizations

Techniques should not be tied to a model

Focus on two techniques:

• Retiring replicas

• Aggressive write propagation

25

Limitations of Strong Consistency

[Figure: replicas and clients on two sides of a network partition]

Option 1:

• One side: accept reads, reject writes

• Other side: accept reads, reject writes

Option 2:

• One side: accept reads, accept writes

• Other side: reject reads, reject writes

26

Retiring Replicas

Obviously, such a decision may not be optimal unless we have future knowledge

• Importance of prediction

Even with future knowledge, it is hard

In option 2, all replicas must reach an agreement

• Leader election

• We are experiencing partitions

• One option: Voting

• What if we don’t have a majority?

27

Aggressive Write Propagation

Applicable to continuous consistency

Continuous consistency gives us “buffers” that can be utilized in case of network partition

Keep the buffer empty:

• Cannot predict the occurrence of network partitions

• Propagate writes more aggressively

• Cut down the amount of inconsistency accumulated in times of good connectivity

28

Effects of Aggressive Propagation

Baseline: Propagate writes only when necessary (lazily)

Aggressive: When necessary and every 3 seconds

[Figure: availability (axis from 0.993 to 0.998) versus inconsistency for three curves: Avail Upper Bound, Aggressive, and Baseline. 8 replicas with measured faultload.]

From [Yu’01, SOSP]
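
The difference between the two policies can be illustrated with a toy time-stepped simulation. It is a sketch under simplifying assumptions that are not from the slides: one write arrives per second, the replica is cut off from `partition_start` until the end of the run, and the buffer bound is at least 1. It does not reproduce the measured results in the figure.

```python
from typing import Optional


def accepted_during_partition(duration: int, partition_start: int,
                              max_buffered: int,
                              flush_period: Optional[int] = None) -> int:
    """Count writes accepted after the partition begins.

    flush_period=None models the lazy baseline (propagate only when the
    buffer bound is reached); an integer adds a periodic flush on top.
    """
    buffered = 0
    accepted = 0
    for t in range(duration):
        connected = t < partition_start
        if connected and flush_period is not None and t % flush_period == 0:
            buffered = 0  # aggressive: periodic propagation empties the buffer
        if buffered >= max_buffered:
            if not connected:
                continue  # bound reached and cannot propagate: reject the write
            buffered = 0  # lazy: propagate only because the bound was reached
        buffered += 1
        if not connected:
            accepted += 1
    return accepted


baseline = accepted_during_partition(20, 10, max_buffered=5)
aggressive = accepted_during_partition(20, 10, max_buffered=5, flush_period=3)
print(baseline, aggressive)  # 0 4
```

In this particular run the lazy replica enters the partition with a full buffer and must reject every write, while the aggressive replica enters with a nearly empty buffer and accepts four more. The outcome depends on when the partition starts, which is the point of keeping the buffer empty when partitions cannot be predicted.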

29

More Aggressive Propagation

Aggressive write propagation does not work in all cases

Availability optimizations can incur more communication

• Best availability is achieved when we use a strong consistency protocol

This speaks to availability / performance tradeoffs

30

Availability of Other Systems

Consensus and leader election

• Blocks without majority

Group membership

• Blocks without majority

Relaxing consistency enables them to make progress

• Open question: but will these systems still be useful?

31

Availability Summary

Availability definition

Inherent impact of consistency on availability

Availability also depends on consistency protocols

Availability optimizations:

• Replica retirement

• Aggressive write propagation

32

Why can we easily approach the upper bound?

Simple protocols in our study can approach the upper bound closely

• Remember that reaching the upper bound in general needs future knowledge

Related to the characteristics of the faultloads we measured and simulated

• Most partitions are singleton partitions

• Most transitions are: fully-connected → singleton partition → fully-connected

These characteristics are consistent with the hierarchical architecture of the Internet

33

Dual Effects of Replication Scale on Availability

Consistency may force a replica to reject a request

Adding more replicas:

• Network failure rate decreases; replica rejection rate increases

Availability = (1 - Network Failure Rate) * (1 - Rejection Rate)

• Too large or too small a replication scale can hurt availability
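
A short worked example of the formula above. The single-server rates (2% network failure, 0.1% rejection) are the figures quoted earlier in the deck; the rates for the two replicated configurations are hypothetical values chosen only to show the two opposing effects, and the function name is illustrative.

```python
def user_observed_availability(network_failure_rate: float,
                               rejection_rate: float) -> float:
    """Availability = (1 - network failure rate) * (1 - rejection rate)."""
    return (1 - network_failure_rate) * (1 - rejection_rate)


# Single server, using the rates quoted earlier in the deck.
single = user_observed_availability(0.02, 0.001)

# Hypothetical replicated setups (assumed numbers, not measurements):
# more replicas lower the network failure rate but raise the rejection rate.
moderate = user_observed_availability(0.005, 0.01)
too_many = user_observed_availability(0.001, 0.03)

for name, value in [("single", single), ("moderate", moderate),
                    ("too_many", too_many)]:
    print(name, round(value, 5))
```

With these assumed rates the moderate configuration beats the single server, while the largest one falls below it because the rise in rejection rate outweighs the drop in network failure rate.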

34

Optimal Replication Scale

Optimal replication scale: Adding more replicas can hurt!

• Increase in “replica rejection rate” outweighs decrease in “network failure rate”

Optimal replication scale depends on

• Consistency level

• Network failure rate among replicas

[Figure: availability upper bound (axis from 0.984 to 1) versus number of replicas (1 to 7), for four settings: failure rate 1% or 5%, each with numerical error 250 or 0]