OceanStore: An Architecture for Global-Scale Persistent Storage
• Motivation
• Features
• Applications
• Specific Components
- Secure Naming
- Access Control
- Data Location and Routing
- Update
- Deep Archival Storage
- Introspection
• Conclusion
Motivation
http://oceanstore.cs.berkeley.edu
OceanStore provides persistent storage for ubiquitous computing
• Secure information
• Durable information
• Automatic and reliable archiving of information
• Geographically distributed data and cache
10^10 users * 10,000 files/user = 10^14 files
Features
• Data Utility Model
- user
- service provider / responsible party
• Untrusted Infrastructure
- privacy & integrity & robustness
• Nomadic Data / Promiscuous Caching
- floating replicas
• Deep Archival Storage
- archival form of data object
- self-verifying data
• Introspection
[Figure: introspection cycle of observation, optimization, and computation]
Applications
• Groupware and personal information management tools
- challenge: concurrent updates from many people
- solution: flexible update mechanism
• Digital libraries and repositories for scientific data
- challenge: massive quantities of storage, reliability, complicated management
- solution: deep archival storage + seamless data migration
• New streaming applications
- challenge: data aggregation and dissemination
- solution: uniform infrastructure
Secure Naming
Fundamental unit: persistent object
GUID: secure hash, 160 bits, location independent
Uniqueness + unforgeability + verification
• AGUID (active data): SHA-1(human-readable name + owner's public key)
• VGUID (archival data): SHA-1(data)
• NodeID (server): SHA-1(public key of server)
Directory object: securely maps human-readable names to GUIDs
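The three naming rules above can be sketched with Python's standard hashlib. This is a minimal illustration, not OceanStore's actual encoding: the UTF-8 encoding of the name, the simple byte concatenation, and the placeholder key bytes are all assumptions made for the example.

```python
import hashlib


def aguid(name: str, owner_public_key: bytes) -> bytes:
    """Active GUID: SHA-1 over the human-readable name plus the owner's public key.

    Assumption: the name is UTF-8 encoded and concatenated before the key bytes.
    """
    return hashlib.sha1(name.encode("utf-8") + owner_public_key).digest()


def vguid(data: bytes) -> bytes:
    """Archival (version) GUID: SHA-1 over the data itself, so the name verifies the content."""
    return hashlib.sha1(data).digest()


def node_id(server_public_key: bytes) -> bytes:
    """NodeID: SHA-1 over the server's public key."""
    return hashlib.sha1(server_public_key).digest()


# Hypothetical key material, used only to exercise the functions.
alice_key = b"alice-public-key-placeholder"
bob_key = b"bob-public-key-placeholder"

guid_a = aguid("notes.txt", alice_key)
guid_b = aguid("notes.txt", bob_key)
```

Because the owner's key is part of the hash input, two owners using the same human-readable name obtain different AGUIDs, and every GUID is 20 bytes (160 bits).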
GUIDs as Secure Pointers
[Figure: a name plus key yields the Active GUID. Global object resolution maps the Active GUID to a floating replica (active object) holding active data, commit logs, a checkpoint GUID, archival GUIDs, a signature, RP keys, ACLs, and metadata. Archival GUIDs resolve, again through global object resolution, to erasure-coded archival copies or snapshots. An inactive object holds an archival GUID and a signature.]
Access Control
• Reader restriction
- distribute the key only to readers
• Writer restriction
- ACL
- require all writes to be signed
Data Location and Routing
Two levels:
• Fast probabilistic search for "routing cache": attenuated Bloom filter
- first Bloom filter: record of the objects contained locally on the current node
- i-th Bloom filter: union of the Bloom filters of all nodes at distance i, through any path, from the current node
- fully distributed, constant amount of storage
- locality provided by introspection mechanism
• Slow guaranteed global search: Plaxton mesh
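The attenuated Bloom filter described above can be sketched as a list of ordinary Bloom filters, one per distance level. This is a minimal sketch under stated assumptions: the filter size `M_BITS`, the hash count `K`, the SHA-1-based bit selection, and the use of shortest-hop distance in place of "any path" are all choices made for illustration and are not taken from the slides.

```python
import hashlib
from collections import deque

M_BITS = 256  # assumed filter width in bits
K = 3         # assumed number of hash functions


def bloom_bits(key: bytes) -> int:
    """Return a bitmask with K bit positions set for this key."""
    mask = 0
    for i in range(K):
        digest = hashlib.sha1(bytes([i]) + key).digest()
        mask |= 1 << (int.from_bytes(digest[:4], "big") % M_BITS)
    return mask


def build_levels(graph, local_objects, start, depth):
    """Level 0 covers objects on `start`; level i covers nodes i hops away."""
    levels = [0] * depth
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        d = dist[node]
        if d >= depth:
            continue  # beyond the filter's horizon
        for key in local_objects.get(node, ()):
            levels[d] |= bloom_bits(key)
        for neighbor in graph.get(node, ()):
            if neighbor not in dist:
                dist[neighbor] = d + 1
                queue.append(neighbor)
    return levels


def first_level(levels, key: bytes):
    """Smallest level whose filter may contain key, or None if no level matches."""
    bits = bloom_bits(key)
    for i, level in enumerate(levels):
        if level & bits == bits:
            return i
    return None


# Hypothetical three-node line topology: a - b - c.
graph = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
local_objects = {"a": [b"x"], "b": [b"y"], "c": [b"z"]}
levels = build_levels(graph, local_objects, "a", depth=3)
```

A match at level i suggests the object is probably about i hops away. Bloom filters never produce false negatives but can produce false positives, which is why the search is probabilistic and the slower global lookup remains as the fallback.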
Global Algorithm
• Nodes: NodeID
• Data Object: GUID
– Each object has a root node
f(ObjectID) = RootID, randomly mapped
– Root node is responsible for storing the object's location
– Publish process: deposit a pointer at every hop along the path to the root node
• Plaxton mesh
• Incremental suffix-based routing
Plaxton Mesh: Incremental suffix-based routing
[Figure: mesh of nodes with links labeled by routing level (1 to 4). NodeIDs shown: 0x43FE, 0x13FE, 0xABFE, 0x1290, 0x239E, 0x73FE, 0x423E, 0x79FE, 0x23FE, 0x73FF, 0x555E, 0x035E, 0x44FE, 0x9990, 0xF990, 0x993E, 0x04FE]
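Incremental suffix-based routing can be illustrated with the NodeIDs listed above: each hop moves to a node that shares at least one more trailing digit with the destination. The sketch below is a simplification and not the real protocol. It assumes a global list of nodes instead of per-node neighbor tables, and the tie-breaking rule (smallest improvement, then smallest ID) is arbitrary.

```python
# NodeIDs from the mesh above, written as 4-digit hex strings.
NODES = [
    "43FE", "13FE", "ABFE", "1290", "239E", "73FE", "423E", "79FE", "23FE",
    "73FF", "555E", "035E", "44FE", "9990", "F990", "993E", "04FE",
]


def shared_suffix_len(a: str, b: str) -> int:
    """Number of trailing digits that a and b have in common."""
    n = 0
    while n < min(len(a), len(b)) and a[-1 - n] == b[-1 - n]:
        n += 1
    return n


def route(src: str, dst: str, nodes):
    """Return the hop-by-hop path from src to dst, fixing the suffix digit by digit."""
    path = [src]
    current = src
    while current != dst:
        k = shared_suffix_len(current, dst)
        # Candidates match the destination in strictly more trailing digits.
        candidates = [n for n in nodes if shared_suffix_len(n, dst) > k]
        if not candidates:
            raise LookupError("no node is closer to " + dst)
        # Assumed tie-break: prefer the smallest improvement, then the smallest ID.
        current = min(candidates, key=lambda n: (shared_suffix_len(n, dst), n))
        path.append(current)
    return path


path = route("1290", "43FE", NODES)
```

Since every hop resolves at least one more digit, a route over 4-digit IDs takes at most four hops, which is the logarithmic path length the mesh is designed to provide.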
Object Location: Randomization and Locality
Fault-tolerant Routing
• Multiple roots of each object using salted hash
• Additional neighbor links & neighbor link repair
• Repeat publishing process to repair location pointers
• Detect failures via soft-state probe packets
• Dynamic insertion & deletion
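The first point above, multiple roots per object through a salted hash, can be sketched as follows. The salt encoding (a 4-byte big-endian counter appended to the GUID) and the function name `root_ids` are assumptions made for illustration. The slides state only that a salted hash is used.

```python
import hashlib


def root_ids(guid: bytes, num_roots: int):
    """Derive several independent root identifiers for one object.

    Assumption: each root ID is SHA-1(guid + salt), with salt = 0, 1, 2, ...
    """
    return [
        hashlib.sha1(guid + salt.to_bytes(4, "big")).digest()
        for salt in range(num_roots)
    ]


# Hypothetical object GUID, used only to exercise the function.
object_guid = hashlib.sha1(b"example object").digest()
roots = root_ids(object_guid, 3)
```

Each salted identifier maps to a different root node, so a lookup can still succeed through another root if one root is unreachable.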
Update Model
Update message format:
TimeStamp, Client ID, {Pred1, Update1}, {Pred2, Update2}, {Pred3, Update3}, Client Signature
Conflict resolution
• Predicate-action pairs
• Write restriction
• All updates are submitted to Inner Ring servers, which use a Byzantine agreement protocol to choose the final commit order
• Responsible party decides the inner ring
• Use Plaxton mesh to disseminate commit order to secondary-tier replicas
Flexible update: supports a range of consistency semantics (e.g. ACID)
Untrusted infrastructure: updates are limited to operations that work over ciphertext
Performance:
- network bandwidth requirement
- client-side latency
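The predicate-action pairs in the update message can be read as follows: predicates are evaluated in order, the action paired with the first true predicate is applied, and the update aborts if none holds. This evaluation rule follows the usual description of the OceanStore update model and is not spelled out on the slide. The sketch below also assumes a plaintext in-memory dict as object state, whereas real replicas operate on ciphertext, and it omits signatures and commit ordering.

```python
def apply_update(state: dict, pairs):
    """Apply the first action whose predicate holds.

    Returns (new_state, committed). The input state is never mutated.
    """
    for predicate, action in pairs:
        if predicate(state):
            return action(dict(state)), True  # commit
    return state, False  # abort: no predicate was true


# Hypothetical update: replace the text only if the version is still 3.
update = [
    (
        lambda s: s["version"] == 3,
        lambda s: {**s, "text": "final", "version": s["version"] + 1},
    ),
]

current = {"version": 3, "text": "draft"}
after_first, committed_first = apply_update(current, update)

# Re-submitting the same update against the new state finds no true predicate.
after_second, committed_second = apply_update(after_first, update)
```

A compare-version predicate like this one gives optimistic concurrency control. Other predicates and actions yield other consistency semantics, which is the flexibility the slide refers to.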
OceanStore Update
Deep Archival Storage
• Archival data in erasure-coded fragments
- Erasure codes produce n fragments, of which any m are sufficient to reconstruct the data (m < n). The rate is r = m/n, and the storage overhead is 1/r.
• OceanStore equivalent of stable store
• Archival fragments generated by Inner Ring
• Fragments are self-verifying
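To make the m-of-n idea concrete, here is a toy erasure code with m = 2 and n = 3: the data is split into two halves and a third fragment holds their XOR. The rate is r = 2/3 and the storage overhead is 1/r = 1.5. This single-parity scheme is only an illustration chosen for brevity and is not the code OceanStore uses. The function names are hypothetical.

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def encode(data: bytes):
    """Split data into two halves plus one XOR parity fragment (m=2, n=3)."""
    half = (len(data) + 1) // 2
    a = data[:half]
    b = data[half:].ljust(half, b"\0")  # pad the second half to equal length
    return {0: a, 1: b, 2: xor_bytes(a, b)}, len(data)


def decode(fragments: dict, length: int) -> bytes:
    """Rebuild the data from any two of the three fragments."""
    if len(fragments) < 2:
        raise ValueError("need at least m = 2 fragments")
    if 0 in fragments and 1 in fragments:
        a, b = fragments[0], fragments[1]
    elif 0 in fragments:
        a = fragments[0]
        b = xor_bytes(a, fragments[2])  # b = a XOR parity
    else:
        b = fragments[1]
        a = xor_bytes(b, fragments[2])  # a = b XOR parity
    return (a + b)[:length]  # drop the padding


block = b"OceanStore archival block"
fragments, length = encode(block)
```

Losing any single fragment is survivable here. Codes with larger n and the same rate tolerate more simultaneous losses at the same storage overhead.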
Deep Archival Storage - update
Deep Archival Storage - Self-Verifying Data
Data: H14, data
Fragment 1: H2, H34, Hd, F1 (fragment data)
Fragment 2: H1, H34, Hd, F2 (fragment data)
Fragment 3: H4, H12, Hd, F3 (fragment data)
Fragment 4: H3, H12, Hd, F4 (fragment data)
[Figure: hash tree over the encoded fragments F1 to F4, with leaf hashes H1 to H4, intermediate hashes H12 and H34, and root H14. The B-GUID sits at the top, above H14 and Hd, the hash of the data.]
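The fragment layout above lets a holder check one fragment against the B-GUID using only the hashes stored with it. The sketch below assumes SHA-1, assumes each parent hash covers the concatenation of its left and right children, and assumes the B-GUID is the hash of H14 followed by Hd. The slides show only the tree shape, so these details are illustrative.

```python
import hashlib


def h(*parts: bytes) -> bytes:
    """SHA-1 over the concatenation of the given byte strings."""
    return hashlib.sha1(b"".join(parts)).digest()


def build(fragments, data: bytes):
    """Return (b_guid, hd, proofs) for exactly four fragments."""
    leaves = [h(f) for f in fragments]          # H1..H4
    h12, h34 = h(leaves[0], leaves[1]), h(leaves[2], leaves[3])
    h14 = h(h12, h34)
    hd = h(data)
    b_guid = h(h14, hd)                          # assumed combination rule
    proofs = []
    for i in range(4):
        sibling_leaf = (leaves[i ^ 1], i % 2 == 1)            # (hash, sibling is on the left?)
        sibling_pair = (h34, False) if i < 2 else (h12, True)
        proofs.append([sibling_leaf, sibling_pair])
    return b_guid, hd, proofs


def verify(fragment: bytes, proof, hd: bytes, b_guid: bytes) -> bool:
    """Recompute the path from the fragment up to the B-GUID."""
    node = h(fragment)
    for sibling, sibling_is_left in proof:
        node = h(sibling, node) if sibling_is_left else h(node, sibling)
    return h(node, hd) == b_guid


# Hypothetical fragment contents, used only to exercise the functions.
frags = [b"F1-bytes", b"F2-bytes", b"F3-bytes", b"F4-bytes"]
b_guid, hd, proofs = build(frags, b"original block data")
```

For fragment 1 the proof holds H2 and H34, matching the "Fragment 1: H2, H34, Hd, F1" layout. A fragment that has been altered recomputes to a different top hash and is rejected.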
Introspection
• Monitoring and adaptation of routing substrate
– Optimization of Plaxton mesh
– Adaptation of second-tier multicast tree
• Continuous monitoring of access patterns
– Clustering algorithms to discover object relationships
  • Clustered prefetching: demand-fetching related objects
  • Proactive prefetching: get data there before needed
– Time-series analysis of user and data motion
• Continuous testing and repair of information
– Slow sweep through all information to make sure there are sufficient erasure-coded fragments
– Continuously reevaluate risk and redistribute data
– Diagnosis and repair of routing and location infrastructure
Conclusions
• OceanStore: everyone's data, one big utility
– Global utility model for persistent data storage
• OceanStore properties:
– Provides security, privacy, and integrity
– Provides extreme durability
– Lower maintenance cost through continuous adaptation, self-diagnosis and repair
– Large-scale system has good statistical properties
OceanStore vs OSD
Difference:
- OceanStore: persistent storage infrastructure, untrusted infrastructure, passive data objects (objects cannot be too active over ciphertext)
- OSD: active/dynamic objects, trust model
Common issues:
- data security (privacy, integrity, reliability)
- authentication and authorization
- naming and routing
- data consistency
- caching
- maintenance-free operation
- applications