OceanStore Status and Directions ROC/OceanStore Retreat 6/10/02 John Kubiatowicz University of California at Berkeley

Page 1: OceanStore Status and Directions ROC/OceanStore Retreat 6/10/02

OceanStore Status and Directions

ROC/OceanStore Retreat 6/10/02

John Kubiatowicz
University of California at Berkeley

Page 2: OceanStore Status and Directions ROC/OceanStore Retreat 6/10/02

OceanStore:2 ROC/OceanStore Jan’02

Everyone’s Data, One Utility

• Millions of servers, billions of clients …
• 1000-YEAR durability (excepting fall of society)
• Maintains Privacy, Access Control, Authenticity
• Incrementally Scalable (“Evolvable”)
• Self Maintaining!
• Not quite peer-to-peer:
  – Utilizing servers in infrastructure
  – Some computational nodes more equal than others

Page 3: OceanStore Status and Directions ROC/OceanStore Retreat 6/10/02

The Path of an OceanStore Update

[Figure labels: Clients, Inner-Ring Servers, Multicast trees, Second-Tier Caches]

Page 4: OceanStore Status and Directions ROC/OceanStore Retreat 6/10/02

Big Push: OSDI

• We analyzed and tuned the write path
  – Many different bottlenecks and bugs found
  – Currently committing data and archiving it at about 3-5 Mb/sec

Page 5: OceanStore Status and Directions ROC/OceanStore Retreat 6/10/02

Big Push: OSDI

• Stabilized basic OceanStore code base
• Interesting issues:
  – Cryptography in critical path
    • Fragment generation/SHA-1 limiting archival throughput at the moment
    • Signatures are a problem for inner ring
      – (although Sean will tell you about a cute batching trick)
  – Second-tier can shield inner ring
    • Actually shown this with Flash-crowd-like benchmark
  – Berkeley DB has max limit approx 10mb/sec
    • Buffer cache layer can’t meet that

Page 6: OceanStore Status and Directions ROC/OceanStore Retreat 6/10/02

OceanStore Goes Global!

• OceanStore components running “globally:”
  – Australia, Georgia, Washington, Texas, Boston
  – Able to run the Andrew File-System benchmark with inner ring spread throughout US
  – Interface: NFS on OceanStore
• Word on the street: it was easy to do
  – The components were debugged locally
  – Easily set up remotely
• I am currently talking with people in:
  – England, Maryland, Minnesota, …
  – Intel P2P testbed will give us access to much more

Page 7: OceanStore Status and Directions ROC/OceanStore Retreat 6/10/02

Inner Ring

• Running Byzantine ring from Castro-Liskov
  – Elected “general” serializes requests
• Proactive Threshold signatures
  – Permits the generation of single signature from Byzantine agreement process
• Highly tuned cryptography (in C)
  – Batching of requests yields higher throughput
• Delayed updates to archive
  – Batches archival ops for somewhat quiet periods
• Currently getting approximately 5Mb/sec
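
Why batching helps can be illustrated with a minimal sketch. This is not the OceanStore inner-ring code: `expensive_sign` is a hypothetical stand-in (an HMAC) for the threshold signature, and the batch-digest scheme and all function names are assumptions made for illustration. The only point shown is that one signature over a digest of N requests replaces N separate signing operations.

```python
import hashlib
import hmac

SECRET = b"demo-key"  # hypothetical key; the real inner ring uses threshold signatures


def expensive_sign(digest):
    """Stand-in for a costly signature operation (assumption: HMAC-SHA1)."""
    return hmac.new(SECRET, digest, hashlib.sha1).digest()


def batch_digest(requests):
    """Hash each request, then hash the concatenation of those hashes."""
    outer = hashlib.sha1()
    for req in requests:
        outer.update(hashlib.sha1(req).digest())
    return outer.digest()


def sign_batch(requests):
    """Sign the whole batch once. Returns (signature, signing operations used)."""
    return expensive_sign(batch_digest(requests)), 1


def sign_individually(requests):
    """Sign every request separately. Returns (signatures, signing operations used)."""
    sigs = [expensive_sign(hashlib.sha1(req).digest()) for req in requests]
    return sigs, len(sigs)


if __name__ == "__main__":
    updates = [b"update-1", b"update-2", b"update-3"]
    _, batched_ops = sign_batch(updates)
    _, single_ops = sign_individually(updates)
    print("signing operations:", batched_ops, "batched vs", single_ops, "individual")
```

The trade-off in this sketch is latency: a request waits until its batch is closed before it is covered by a signature.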

Page 8: OceanStore Status and Directions ROC/OceanStore Retreat 6/10/02

We have Throughput Graphs!

(Sean will discuss)

Page 9: OceanStore Status and Directions ROC/OceanStore Retreat 6/10/02

Self-Organizing second-tier

• Have simple algorithms for placing replicas on nodes in the interior
  – Intuition: locality properties of Tapestry help select positions for replicas
  – Tapestry helps associate parents and children to build multicast tree
• Preliminary results show that this is effective
• We have tentative writes!
  – Allows local clients to see data quickly

Page 10: OceanStore Status and Directions ROC/OceanStore Retreat 6/10/02

Effectiveness of second tier

Page 11: OceanStore Status and Directions ROC/OceanStore Retreat 6/10/02

Archival Layer

• Initial implementation needed lots of tuning
  – Was getting 1Mb/sec coding throughput
  – Still lots of room to go:
    • A “C” version of fragmentation could get 26MB/s
    • SHA-1 evaluation expensive
• Beginnings of online analysis of servers
  – Collection facility similar to web crawler
  – Exploring failure correlations for global web sites
  – Eventually used to help distribute fragments
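
A rough sketch of why hashing sits on the archival critical path: every fragment that is generated must also be hashed. The code below only splits a block into fixed-size fragments and names each one by its SHA-1 digest. It performs no erasure coding, and the fragment size, function names, and timing helper are assumptions for illustration, not the OceanStore implementation.

```python
import hashlib
import time


def fragment(block, fragment_size):
    """Split a block into consecutive fragments of at most fragment_size bytes."""
    return [block[i:i + fragment_size] for i in range(0, len(block), fragment_size)]


def name_fragments(fragments):
    """Pair each fragment with the hex SHA-1 digest of its contents."""
    return [(hashlib.sha1(frag).hexdigest(), frag) for frag in fragments]


def throughput_mb_per_s(block, fragment_size):
    """Measure fragment-and-hash throughput in megabytes per second."""
    start = time.perf_counter()
    name_fragments(fragment(block, fragment_size))
    elapsed = time.perf_counter() - start
    return len(block) / (1024 * 1024) / max(elapsed, 1e-9)


if __name__ == "__main__":
    data = bytes(8 * 1024 * 1024)  # 8 MB of zero bytes as a stand-in block
    print("fragment + SHA-1 throughput (MB/s):", throughput_mb_per_s(data, 4096))
```

The measured figure depends entirely on the machine and the Python build, so it is not comparable to the numbers on the slide.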

Page 12: OceanStore Status and Directions ROC/OceanStore Retreat 6/10/02

New Metric: FBLPY (Fraction of Blocks Lost Per Year)

• No more discussion of 10^34 years MTTF
• Easier to understand?
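
A back-of-the-envelope sketch of how a figure like FBLPY (fraction of blocks lost per year) can be estimated. It assumes a block is coded into n fragments of which any m suffice to rebuild it, that each fragment is lost independently with probability p during the year, and that no repair happens within the year. These are simplifying assumptions for illustration, not the model used in the talk.

```python
from math import comb


def block_loss_probability(n, m, p):
    """Probability that fewer than m of n fragments survive the year.

    n: fragments stored per block
    m: fragments needed to reconstruct the block
    p: independent per-fragment loss probability over one year (assumed)
    """
    survive = 1.0 - p
    return sum(comb(n, k) * survive ** k * p ** (n - k) for k in range(m))


if __name__ == "__main__":
    p = 0.1  # hypothetical per-fragment annual loss probability
    # Both configurations use 2x the storage of the original block.
    print("2 replicas (n=2, m=1):   ", block_loss_probability(2, 1, p))
    print("rate-1/2 code (n=32, m=16):", block_loss_probability(32, 16, p))
```

With independent failures, the fraction of blocks lost per year equals this per-block probability.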

Page 13: OceanStore Status and Directions ROC/OceanStore Retreat 6/10/02

Basic Tapestry Mesh: Incremental suffix-based routing

[Figure: mesh of Tapestry nodes joined by links labeled 1 to 4. NodeIDs shown: 0x43FE, 0x13FE, 0xABFE, 0x1290, 0x239E, 0x73FE, 0x423E, 0x79FE, 0x23FE, 0x73FF, 0x555E, 0x035E, 0x44FE, 0x9990, 0xF990, 0x993E, 0x04FE]
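
Incremental suffix-based routing can be sketched as follows: each hop moves to a node whose ID shares a longer suffix with the destination than the current node does. The sketch uses the NodeIDs from the slide, but it assumes every node knows the full node list and breaks ties lexicographically. Real Tapestry nodes keep per-level neighbor tables and prefer nearby neighbors, so the hops chosen here are illustrative only, and the function names are hypothetical.

```python
NODES = [
    "43FE", "13FE", "ABFE", "1290", "239E", "73FE", "423E", "79FE", "23FE",
    "73FF", "555E", "035E", "44FE", "9990", "F990", "993E", "04FE",
]


def shared_suffix_len(a, b):
    """Number of trailing digits that a and b have in common."""
    n = 0
    while n < len(a) and n < len(b) and a[-1 - n] == b[-1 - n]:
        n += 1
    return n


def next_hop(current, dest, nodes):
    """Pick a node matching more suffix digits of dest than current does.

    Prefers the smallest improvement (one more digit when possible) and
    breaks ties by ID order. Returns None if no node makes progress.
    """
    have = shared_suffix_len(current, dest)
    candidates = [x for x in nodes if shared_suffix_len(x, dest) > have]
    if not candidates:
        return None
    return min(candidates, key=lambda x: (shared_suffix_len(x, dest), x))


def route(src, dest, nodes):
    """Return the list of nodes visited from src to dest."""
    path = [src]
    while path[-1] != dest:
        hop = next_hop(path[-1], dest, nodes)
        if hop is None:
            raise LookupError("no node closer to " + dest)
        path.append(hop)
    return path


if __name__ == "__main__":
    print(" -> ".join(route("1290", "43FE", NODES)))
```

Because each hop resolves at least one more digit, a route over 4-digit IDs takes at most four hops in this sketch.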

Page 14: OceanStore Status and Directions ROC/OceanStore Retreat 6/10/02

Dynamic Adaptation in Tapestry

• New algorithms for nearest-neighbor acquisition [SPAA ’02]

• Massive parallel inserts with objects staying continuously available [SPAA ’02]

• Deletes (voluntary and involuntary): [SPAA ’02]

• Hierarchical objects search for mobility [MOBICOM submission]

• Continuous adjustment of neighbor links to adapt to failure [ICNP]

• Hierarchical routing (Brocade): [IPTPS ’02]

Page 15: OceanStore Status and Directions ROC/OceanStore Retreat 6/10/02

Reality: Web Caching through OceanStore

Page 16: OceanStore Status and Directions ROC/OceanStore Retreat 6/10/02

Other Apps

• This summer: Email through OceanStore
  – IMAP and POP proxies
  – Let normal mail clients access mailboxes in OS
• Palm-pilot synchronization
  – Palm data base as an OceanStore DB
• Better file system support
  – Windows IFS (Really!)

Page 17: OceanStore Status and Directions ROC/OceanStore Retreat 6/10/02

Summer Work

• Big push to get privacy aspects of OceanStore up and running
• Big push for more apps
• Big push for Introspective computing aspects
  – Continuous adaptation of network
  – Replica placement
  – Management/Recovery
  – Continuous Archival Repair
• Big push for stability
  – Getting stable OceanStore running continuously
  – Over big distances
  – …

Page 18: OceanStore Status and Directions ROC/OceanStore Retreat 6/10/02

For more info:

• OceanStore vision paper for ASPLOS 2000
  “OceanStore: An Architecture for Global-Scale Persistent Storage”
• OceanStore paper on Maintenance (IEEE IC):
  “Maintenance-Free Global Data Storage”
• SPAA paper on dynamic integration
  “Distributed Object Location in a Dynamic Network”
• Both available on OceanStore web site:
  http://oceanstore.cs.berkeley.edu/