PhD Dissertation Defense Composable Consistency for Wide Area Replication Sai Susarla Advisor: Prof. John Carter


Composable Consistency for Wide Area Replication

Sai Susarla

Advisor: Prof. John Carter


Overview

Goal: middleware support for wide area caching in diverse distributed applications

Key Hurdle: flexible consistency management

Our Solution: novel consistency interface/model - Composable Consistency

Benefit: supports broader set of sharing needs than existing models. Examples:

file systems, databases, directories, collaborative apps – wider variety than any existing consistency model can support

Demo Platform: novel P2P middleware data store - Swarm


Caching: Overview

The Idea: cache frequently used items locally for quick retrieval

Benefits
  Within cluster: load-balancing, scalability
  Across WAN: lower latency, improved throughput & availability

Applications: data stored in one place, accessed from multiple locations. Examples:
  » File system: personal files, calendars, log files, software, …
  » Database: online shopping, inventory, auctions, …
  » Directory: DNS, LDAP, Active Directory, KaZaa, …
  » Collaboration: chat, multi-player games, meetings, …


Centralized Service

[Diagram: user client, Internet, primary server cluster]


Proxy-based Caching

[Diagram: user client, Internet, caching proxy server cluster, consistency protocol, primary server cluster]


Caching: The Challenge

Applications have diverse consistency needs

Application | Sharing characteristics | Consistency needs
Static web content, media, s/w updates | Read-mostly | Stale data, manual reload ok
Chat, whiteboard | Concurrent appends | Real-time sync, causal msg order
Auctions, ticket sales, Financial DB | Write-sharing, conflicts, varying contention | Serializability, latest data, atomicity (ACID)
… | … | …


Caching: The Problem

Consistency requirements are diverse

Caching is difficult over WANs: variable delays, node failures, network partitions, admin domains, …

Thus, most WAN applications either:
  Roll their own caching solution, or
  Do not cache and live with the latency

Can we do better?


Thesis

"A consistency management system that provides a small set of customizable consistency mechanisms can efficiently satisfy the data sharing needs of a wide variety of distributed applications."


Outline

Further Motivation

Application study: new taxonomy to classify application sharing needs

Composable Consistency (CC) model
  Novel interface to express consistency semantics for each access
  Small option set can express more diverse semantics

Evaluation


Existing Models are Inadequate

Provide a few packaged consistency semantics for specific needs, e.g., optimistic/eventual, close-to-open, strong

Or, lack enough flexibility to support diverse needs:
  TACT (cannot express weak consistency or session semantics)
  Bayou (cannot support strong consistency)

Or, leave consistency management burden on applications, e.g., Oceanstore, Globe


Existing Middleware is Inadequate

Existing middleware supports only specific sharing needs:
  Read-only data: PAST, BitTorrent
  Rare write-sharing: file systems (NFS, Coda, Ficus, …)
  Master-slave (read-only) replication: storage vendors, mySQL
  Scheduled (nightly) replication: storage and DB services
  Read-write replication in a cluster: commercial DB vendors, Petal


Application Survey

40+ applications with diverse consistency needs

Application | Sharing characteristics | Consistency needs
Static web content, media, s/w updates | Read-mostly | Stale data, manual reload ok
Stock quotes | Read-only | Limit max. staleness to T secs
Chat, whiteboard | Concurrent appends | Real-time sync, causal msg order
Multiplayer game | Heavy write-sharing | Real-time sync, totally order play moves
Auctions, ticket sales, Financial DB | Write-sharing, conflicts, varying contention | Serializability, latest data, atomicity (ACID)
Personal file access | Rare write-sharing | Eventual consistency
Mobile file access, collaborative sharing | Sequential write-sharing | Latest data, session semantics
Directory, calendars, groupware | Write-sharing, mergeable writes | Tight sync within campus, relaxed sync across campuses


Survey Results

Found common issues, overlapping choices:
  Are parallel reads and writes ok?
  How often should replicas synchronize?
  Does update order matter?
  What if some copies are inaccessible?
  …

Can we exploit this commonality?


Composable Consistency: Novel interface to express consistency semantics

Option | Choices
Access mode | Concurrent, Exclusive
Sync frequency | Manual push, pull; T-seconds stale; N missed writes
Strength | Hard, Soft
Causality | Yes, No
Atomicity | Yes, No
Update ordering | None, Total, Serial
Inaccessible copy | Ignore, Fail access
Accept updates | Session, Immediately
Reveal updates | On close(), Immediately

Option groups: concurrency control, replica synchronization, failure handling, view isolation, update visibility
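As a rough illustration, the option set on this slide could be carried as one small C structure per session. This is only a sketch: every type, field and constant name below is hypothetical, since the slides do not show Swarm's actual declarations.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical names throughout; not Swarm's real API. */
enum cc_access   { CC_CONCURRENT, CC_EXCLUSIVE };
enum cc_strength { CC_HARD, CC_SOFT };
enum cc_order    { CC_ORDER_NONE, CC_ORDER_TOTAL, CC_ORDER_SERIAL };
enum cc_failure  { CC_IGNORE_INACCESSIBLE, CC_FAIL_ACCESS };
enum cc_when     { CC_AT_SESSION_BOUNDARY, CC_IMMEDIATELY };

#define CC_NO_BOUND (-1)  /* assumed encoding for "manual push, pull" */

struct cc_options {
    enum cc_access   access_mode;        /* Concurrent or Exclusive */
    int              max_stale_secs;     /* T-seconds stale, or CC_NO_BOUND */
    int              max_missed_writes;  /* N missed writes, or CC_NO_BOUND */
    enum cc_strength strength;           /* Hard or Soft */
    bool             causality;
    bool             atomicity;
    enum cc_order    update_ordering;    /* None, Total or Serial */
    enum cc_failure  inaccessible_copy;  /* Ignore or Fail access */
    enum cc_when     accept_updates;     /* Session or Immediately */
    enum cc_when     reveal_updates;     /* On close() or Immediately */
};

/* Sanity check: each bound is either CC_NO_BOUND or non-negative. */
bool cc_options_valid(const struct cc_options *o)
{
    assert(o != NULL);
    return o->max_stale_secs >= CC_NO_BOUND &&
           o->max_missed_writes >= CC_NO_BOUND;
}
```

Keeping the options as independent fields mirrors the slide's point that semantics are composed from a small set of orthogonal choices rather than picked from a fixed menu.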


Example: Close-to-open (AFS)

Option | Choices
Access mode | Concurrent, Exclusive
Sync frequency | Manual push, pull; 0 seconds stale
Strength | Hard, Soft
Causality | Yes, No
Atomicity | Yes, No
Update ordering | None, Total, Serial
Inaccessible copy | Ignore, Fail access
Accept updates | Session, Immediately
Reveal updates | On close(), Immediately

Resulting semantics:
  Allow parallel reads and writes
  Latest data guaranteed at open()
  Fail access when partitioned
  Accept remote updates only at open()
  Reveal local updates to others only on close()
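The close-to-open semantics listed on this slide can be written down as one option vector. The sketch below assumes a hypothetical C encoding (none of these names come from Swarm), and it assumes that "latest data guaranteed at open()" corresponds to a hard, 0-second staleness bound.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical encoding of the options this slide annotates. */
enum cc_access  { CC_CONCURRENT, CC_EXCLUSIVE };
enum cc_failure { CC_IGNORE_INACCESSIBLE, CC_FAIL_ACCESS };
enum cc_when    { CC_AT_SESSION_BOUNDARY, CC_IMMEDIATELY };

struct cc_options {
    enum cc_access  access_mode;
    int             max_stale_secs;    /* staleness bound, in seconds */
    bool            hard;              /* assumed: true makes the bound a guarantee */
    enum cc_failure inaccessible_copy;
    enum cc_when    accept_updates;    /* when remote updates are applied */
    enum cc_when    reveal_updates;    /* when local updates are published */
};

/* Close-to-open (AFS-style) semantics as described on the slide. */
struct cc_options cc_close_to_open(void)
{
    struct cc_options o;
    o.access_mode       = CC_CONCURRENT;          /* parallel reads and writes */
    o.max_stale_secs    = 0;                      /* latest data at open() */
    o.hard              = true;                   /* assumed from "guaranteed" */
    o.inaccessible_copy = CC_FAIL_ACCESS;         /* fail when partitioned */
    o.accept_updates    = CC_AT_SESSION_BOUNDARY; /* only at open() */
    o.reveal_updates    = CC_AT_SESSION_BOUNDARY; /* only on close() */
    return o;
}

/* True when open() has to fetch the latest copy before returning. */
bool cc_open_must_sync(const struct cc_options *o)
{
    assert(o != NULL);
    return o->hard && o->max_stale_secs == 0;
}
```

An application would pass such a vector when opening a session; other packaged models differ only in the field values.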


Example: Eventual Consistency (Bayou)

Option | Choices
Access mode | Concurrent, Exclusive
Sync frequency | Manual push, pull; 10 minutes stale
Strength | Hard, Soft
Causality | Yes, No
Atomicity | Yes, No
Update ordering | None, Total, Serial
Inaccessible copy | Ignore, Fail access
Accept updates | Session, Immediately
Reveal updates | On close(), Immediately

Resulting semantics:
  Allow parallel reads and writes
  Sync copies at most once every 10 minutes
  Syncing should not block or fail operations
  Accept remote updates as they arrive
  Reveal local updates to others as they happen
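One way to picture the difference between a hard and a soft staleness bound is as a decision taken on each access. The sketch below is an assumption-laden illustration, not Swarm's algorithm: it assumes a hard bound forces synchronization before the access proceeds, while a soft bound only triggers a best-effort sync that never blocks or fails the access. The function and enum names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

enum sync_action {
    SYNC_NOT_NEEDED,     /* replica is within its staleness bound */
    SYNC_IN_BACKGROUND,  /* soft bound exceeded: sync without blocking */
    SYNC_BEFORE_ACCESS   /* hard bound exceeded: sync first, then access */
};

/* Decide how to treat a replica that last synchronized at last_sync_secs. */
enum sync_action cc_sync_action(long now_secs, long last_sync_secs,
                                long max_stale_secs, bool hard)
{
    assert(now_secs >= last_sync_secs && max_stale_secs >= 0);
    if (now_secs - last_sync_secs <= max_stale_secs)
        return SYNC_NOT_NEEDED;
    return hard ? SYNC_BEFORE_ACCESS : SYNC_IN_BACKGROUND;
}
```

With the slide's settings (a 10-minute bound, 600 seconds, treated as soft), an access 700 seconds after the last sync would proceed immediately and schedule a background sync.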


Handling Conflicting Semantics

What if two sessions have different semantics?
  If conflicting, block a session until conflict goes away (serialize)
  Otherwise, allow them in parallel

Simple rules for checking conflicts (conflict matrix)

Examples:
  Exclusive write vs. exclusive read vs. eventual write: serialize
  Write-immediate vs. session-grain isolation: serialize
  Write-immediate vs. eventual read: no conflict
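The slide does not reproduce the conflict matrix itself, so the following is only a guess at what a pairwise check might look like, built from the three examples above. The two rules, the struct and all names are assumptions; in particular, letting two exclusive readers run in parallel is not stated on the slide.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical summary of the session properties the examples mention. */
struct cc_session {
    bool writes;              /* session may modify the data */
    bool exclusive;           /* exclusive (vs. concurrent) access mode */
    bool reveal_immediately;  /* its updates are visible as they happen */
    bool session_isolation;   /* it accepts remote updates only at open() */
};

/* Does session a, on its own terms, force serialization with session b? */
static bool blocks(const struct cc_session *a, const struct cc_session *b)
{
    /* Assumed rule 1: exclusive access cannot overlap with any writer. */
    if (a->exclusive && (a->writes || b->writes))
        return true;
    /* Assumed rule 2: immediately visible writes would break b's isolation. */
    if (a->writes && a->reveal_immediately && b->session_isolation)
        return true;
    return false;
}

/* Symmetric check: true means one session must wait for the other. */
bool cc_sessions_conflict(const struct cc_session *a,
                          const struct cc_session *b)
{
    assert(a != NULL && b != NULL);
    return blocks(a, b) || blocks(b, a);
}
```

Under these rules an exclusive write serializes against everything, a write-immediate session serializes against a session-isolated reader, and a write-immediate session runs in parallel with an eventual (non-isolated, concurrent) reader, matching the slide's examples.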


Using Composable Consistency

Perform data access within a session, e.g.,
    session_id = open(object, CC_option_vector);
    read(session_id, buf);
    write(session_id, buf);   OR,   update(session_id, incr_counter(value));
    close(session_id);

Specify consistency semantics per-session at open() via the CC option vector

Concurrency control, replica synchronization, failure handling, view isolation and update visibility.

System enforces semantics by mediating each access
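To make the session pattern concrete, here is a toy, single-process sketch of open/update/close in which an update is a function applied to the data, as in the slide's incr_counter example. It is not Swarm code: there is no replication or concurrency, the names are hypothetical, and the only option modeled is whether updates are revealed on close() or immediately.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef long (*cc_update_fn)(long old_value);

struct cc_object { long value; };   /* stands in for a shared object */

struct cc_session {
    struct cc_object *obj;
    long local;            /* this session's private view of the value */
    bool reveal_on_close;  /* true: publish at close(); false: immediately */
};

struct cc_session cc_open(struct cc_object *obj, bool reveal_on_close)
{
    assert(obj != NULL);
    struct cc_session s = { obj, obj->value, reveal_on_close };
    return s;
}

/* Apply an update operation within the session. */
void cc_update(struct cc_session *s, cc_update_fn fn)
{
    assert(s != NULL && fn != NULL);
    if (s->reveal_on_close) {
        s->local = fn(s->local);            /* buffered until close() */
    } else {
        s->obj->value = fn(s->obj->value);  /* visible right away */
        s->local = s->obj->value;
    }
}

/* End the session, publishing buffered changes if there are any. */
void cc_close(struct cc_session *s)
{
    assert(s != NULL);
    if (s->reveal_on_close)
        s->obj->value = s->local;
}

long incr_counter(long v) { return v + 1; }
```

For instance, opening a counter holding 5 with reveal_on_close set and calling cc_update(&s, incr_counter) leaves the shared value at 5 until cc_close(&s) publishes 6.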


Composable Consistency Benefits

Powerful: Small option set can express diverse semantics

Customizable: allows different semantics for each access

Effective: amenable to efficient WAN implementation

Benefit to middleware
  Can provide read-write caching to a broader set of apps

Benefit for an application
  Can customize consistency to diverse and varying sharing needs
  Can simultaneously enforce different semantics on the same data for different users


Evaluation


Swarm: A Middleware Providing CC

Swarm:
  Shared file interface with CC options
  Location-transparent page-grained file access
  Aggressive P2P caching
  Dynamic cycle-free replica hierarchy per file

Prototype implements CC (except causality & atomicity)
  Per-file, per-replica and per-session consistency

Network economy (exploit nearby replicas)

Contention-aware replication (RPC vs caching)

Multi-level leases for failure resilience


Client-server BerkeleyDB Application

[Diagram: app users on LANs, Internet, primary app server (app logic, DB, kernel, FS)]


BerkeleyDB Application using Swarm

[Diagram: app users on LANs, Internet, primary app server (app logic, DB, kernel, FS) with RDB wrapper, RDB plugin, Swarm server and DB]


Caching Proxy App Server using Swarm

[Diagram: app users on LANs, Internet, primary app server and proxy app server, each with app logic, DB, kernel, FS, RDB wrapper, RDB plugin, Swarm server and DB]


Swarm-based Applications

SwarmDB: Transparent BerkeleyDB database replication across WAN

SwarmFS: wide area P2P read-write file system

SwarmProxy: Caching WAN proxies for an auction service with strong consistency

SwarmChat: Efficient message/event dissemination

No single model can support the sharing needs of all these applications


SwarmDB: Replicated BerkeleyDB

Replication support built as wrapper library

Uses unmodified BerkeleyDB binary

Evaluated with five consistency flavors:
  Lock-based updates, eventual reads
  Master-slave writes, eventual reads
  Close-to-open reads, writes
  Staleness-bounded reads, writes
  Eventual reads, writes

Compared against BerkeleyDB-provided RPC version

Order-of-magnitude throughput gains over RPC by relaxing consistency


SwarmDB Evaluation

BerkeleyDB B-tree index replicated across N nodes

Nodes linked via 1Mbps links to common router
  40ms RTT to each other

Full-speed workload
  30% Writes: inserts, deletes, updates
  70% Reads: lookups, cursor scans

Varied # replicas from 1 to 48


SwarmDB Write Throughput/replica

[Chart: write throughput per replica. Series: Locking writes, eventual reads; Close-to-open; Master-slave writes, eventual reads; 10msec stale; 20msec stale; Optimistic; RPC over WAN; Local SwarmDB server]


SwarmDB Query Throughput/replica

[Chart: query throughput per replica. Series: RPC over WAN; Local SwarmDB server; Optimistic; 10msec stale; Close-to-open]


SwarmDB Results

Customizing consistency can improve WAN caching performance dramatically

App can enforce diverse semantics by simply modifying CC options

Updates & queries with different semantics possible


SwarmFS Distributed File System

Sample SwarmFS path: /swarmfs/swid:0x1234.2/home/sai/thesis.pdf

Performance Summary
  Achieves >80% of local FS performance on Andrew Benchmark
  More network-efficient than Coda for wide area access
  Correctly supports fine-grain collaboration across WANs
  Correctly supports file locking for RCS repository sharing


SwarmFS: Distributed Development


Replica Topology


SwarmFS vs. Coda Roaming File Access

[Chart: Compile Latency from Cold Cache, in seconds (axis 0 to 300), Coda-s vs. SwarmFS, at nodes U1; I1, 24ms; C1, 50ms; T1, 160ms; F1, 130ms]

Coda-s always gets files from distant U1.

SwarmFS gets files from nearest copy.

Network Economy


SwarmFS vs. Coda Roaming File Access

[Chart: Compile Latency from Cold Cache, in seconds (axis 0 to 300), Coda-s vs. SwarmFS. X axis: LAN#-node#, RTT to Home (U1), with nodes U1; I1, 24ms; C1, 50ms; T1, 160ms; F1, 130ms]

Coda-s writes files through to U1 for close-to-open semantics.

Swarm’s P2P pull-based protocol avoids this.

Hence, SwarmFS performs better for temporary files.

P2P protocol more efficient


SwarmFS vs. Coda Roaming File Access

[Chart: Compile Latency from Cold Cache, in seconds (axis 0 to 300), Coda-s vs. SwarmFS vs. Coda-w, at nodes U1; I1, 24ms; C1, 50ms; T1, 160ms; F1, 130ms]

Eventual consistency inadequate

Coda-w behaves incorrectly

'make' skipped files

linker found corrupt object files.

Trickle reintegration pushed huge obj files to U1, clogging network link.

Coda-w Compile errors


Evaluation Summary

SwarmDB: gains of customizable consistency

SwarmFS: network economy under write-sharing

SwarmProxy: strong consistency over WANs under varying contention

SwarmChat: update dissemination in real-time

By employing CC, Swarm middleware data store can support diverse app needs effectively


Related Work

Flexible consistency models/interfaces
  Munin, WebFS, Fluid Replication, TACT

Wide area caching solutions/middleware
  File systems and data stores: AFS, Coda, Ficus, Pangaea, Bayou, Thor, …
  Peer-to-peer systems: Napster, PAST, Farsite, Freenet, Oceanstore, BitTorrent, …


Future Work

Security and authentication

Fault-tolerance via first-class replication


Thesis Contributions

Survey of sharing needs of numerous applications

New taxonomy to classify application sharing needs

Composable consistency model based on taxonomy

Demonstrated CC model is practical and supports diverse applications across WANs effectively


Conclusion

Can a storage service provide effective WAN caching support for diverse distributed applications? YES

Key enabler: a novel flexible consistency interface called Composable consistency

Allows an application to customize consistency to diverse and varying sharing needs

Allows middleware to serve a broader set of apps effectively


SwarmDB Control Flow


Composing Master-slave

Master-slave replication
  Serialize updates
    » Concurrent mode writes (WR)
    » Serial update ordering (apply updates at central master)
  Eventual consistency for queries
    » Options mentioned earlier

Use: mySQL DB read-only replication across WANs


Clustered BerkeleyDB


BerkeleyDB Proxy using Swarm


A Swarm-based Chat Room

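Sample chat client code:

    /* Called with newly arrived data; shows it to the user. */
    callback(handle, newdata)
    {
        display(newdata);
    }

    main()
    {
        handle = sw_open(kid, "a+");   /* open the chat transcript */
        sw_snoop(handle, callback);    /* register the callback */
        while (!done) {
            read(&newdata);            /* get the next message */
            display(newdata);
            sw_write(handle, newdata); /* write it to the transcript */
        }
        sw_close(handle);
    }

Chat transcript: WR mode, 0 second soft staleness, immediate visibility, no isolation

[Diagram: update propagation path among nodes 1, 2, 3, 4 and P]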