HAIL (High-Availability and Integrity Layer) for Cloud Storage
Alina Oprea
Joint with Kevin Bowers and Ari Juels
RSA Laboratories
2
Cloud Storage Provider
Client
Mostly static data:
• Back-up
• Archival
Is my data available?
Storage server
Web server
Cloud storage
3
Proofs of Retrievability (PORs)
Cloud Storage Provider
Client
F
Encoding
k
Corrects small corruption
4
Proofs of Retrievability (PORs)
Cloud Storage Provider
Client
F
Challenge
F
k
Response
Requires integrity checks on server or client
Detects large corruption
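The challenge-response idea on this slide can be illustrated with a little arithmetic. The sketch below assumes the client samples blocks uniformly at random with replacement and that a sampled corrupted block is always detected; the function name and the numbers are illustrative and do not come from the talk.

```python
def detection_probability(corrupted_fraction: float, num_challenges: int) -> float:
    """Probability that at least one of num_challenges uniformly random
    block samples (with replacement) lands on a corrupted block."""
    return 1.0 - (1.0 - corrupted_fraction) ** num_challenges


# Compare a large corruption (10% of blocks) with a small one (0.1% of blocks)
# under the same budget of 50 challenged blocks.
for eps in (0.10, 0.001):
    print(f"corrupted fraction {eps}: detection probability "
          f"{detection_probability(eps, 50):.4f}")
```

Under these assumptions a large corruption is caught almost surely, while a small one usually goes unnoticed, which is why the encoding step is needed to correct small corruption.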
5
When PORs fail
Cloud Storage Provider
Client
FF
k
Challenge
Response
decoder
Unrecoverable
6
HAIL Goals
• Resilience against cloud provider failure or temporary unavailability
– Amazon S3 went down several times, once for 8 hours
– Linkup lost 45% of its customer data
• Use multiple cloud providers to construct a reliable cloud storage service out of unreliable components
– RAID (Redundant Array of Inexpensive Disks) for cloud storage
• Provide clients verification capabilities
– Efficient proofs of file availability by interacting with cloud providers
7
Replicate across multiple providers
Amazon S3, Google, EMC Atmos
Client
F
Sample and check consistency across providers
F F F
Naïve approach
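A minimal sketch of the naïve check, assuming each replica is held as a list of equally sized blocks and that the client can fetch any block from any provider. The helper name `inconsistent_positions` is hypothetical.

```python
import random


def inconsistent_positions(replicas, num_samples, rng=random):
    """Sample block positions and return those where the replicas disagree.

    replicas: list of replicas, each a list of blocks (bytes) of equal length.
    """
    num_blocks = len(replicas[0])
    positions = rng.sample(range(num_blocks), min(num_samples, num_blocks))
    # A position is inconsistent when the providers return different blocks.
    return sorted(p for p in positions if len({r[p] for r in replicas}) > 1)


# Three providers hold the same 8-block file; one block is corrupted at one provider.
original = [bytes([i]) * 4 for i in range(8)]
replicas = [list(original) for _ in range(3)]
replicas[1][3] = b"\xff" * 4
print(inconsistent_positions(replicas, num_samples=8))
```

The check only notices a corrupted position if that position happens to be sampled, which is the weakness the next slides exploit.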
8
Roadmap
• Adversarial model for HAIL
• Small-corruption attack on replication scheme
• Encoding layer for each replica individually
• Reduce storage overhead by dispersal
• Increasing file lifetime with secret keys
9
Adversarial model
• Static: corrupts a fixed number b of the n total providers over time
– Create enough redundancy in the file to handle this (b+1 replicas)
– Is this realistic?
• Mobile (proactive): corrupts b out of n providers in each epoch
– Separate each server into code base and storage base
– At the beginning of an epoch code base of all servers is cleaned (through reboot, for instance)
– All servers might have residual data corruption
– Reactive design: check integrity and redistribute
10
Attack on replication scheme
Amazon S3, Google, EMC Atmos
Client
F F F
The probability that client samples the corrupted block is low
File cannot be recovered after [n/b] epochs
F F F
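The epoch count on this slide can be reproduced with a toy simulation. It assumes that the mobile adversary silently corrupts one block on b previously untouched replicas in each epoch, that the residual corruption is never repaired, and that [n/b] denotes n/b rounded up; the function name is illustrative.

```python
def epochs_until_all_corrupted(n: int, b: int) -> int:
    """Count epochs until every one of n replicas carries residual corruption,
    when the adversary reaches b fresh replicas per epoch (b >= 1)."""
    corrupted = set()
    epochs = 0
    while len(corrupted) < n:
        # Pick up to b replicas that have not been corrupted yet.
        fresh = [p for p in range(n) if p not in corrupted][:b]
        corrupted.update(fresh)
        epochs += 1
    return epochs


for n, b in [(3, 1), (5, 2), (4, 2)]:
    print(f"n={n}, b={b}: all replicas corrupted after "
          f"{epochs_until_all_corrupted(n, b)} epochs")
```

Once the same block is damaged on every replica, no copy of it remains, so the file is lost even though each individual corruption was tiny.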
11
Replication with POR
Amazon S3, Google, EMC Atmos
Client
F
F F F
ECC
POR POR POR
Cons: requires integrity checks for each replica
12
Replication with POR
Amazon S3, Google, EMC Atmos
Client
Sample and check consistency across providers
F F FF
13
Replication with POR
Amazon S3, Google, EMC Atmos
Client
F F FF
• Large storage overhead due to replication
• File lifetime still limited by [n/b] · (ε_c / ε_d)
– ε_c: correction threshold of POR encoding
– ε_d: detection threshold of POR
Sample and check consistency across providers
[Figure labels on each replica: ε_d, > ε_c]
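To make the lifetime bound concrete, the sketch below simply evaluates the expression from this slide. It assumes [n/b] means n/b rounded up and reads ε_c and ε_d as fractions of a replica; the sample values are invented for illustration.

```python
import math


def lifetime_bound(n: int, b: int, eps_c: float, eps_d: float) -> float:
    """Evaluate [n/b] * (eps_c / eps_d), taking [n/b] as a ceiling (assumption)."""
    return math.ceil(n / b) * (eps_c / eps_d)


# Hypothetical numbers: 3 providers, 1 corrupted per epoch,
# correction threshold 10%, detection threshold 1%.
print(lifetime_bound(n=3, b=1, eps_c=0.10, eps_d=0.01))
```

One way to read the expression: an adversary that stays just under the detection threshold needs about ε_c / ε_d epochs to push one group of b replicas past the correction threshold, and there are [n/b] such groups.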
14
Reduce storage overhead
Client
F
dispersal
F
(n,m)
decode
n fragments
m fragments
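A toy (n, m) dispersal code can make the picture concrete: m data symbols become n fragments, and any m fragments suffice to decode. The sketch below uses Reed-Solomon-style polynomial interpolation over the small prime field GF(257), which is an assumption made for readability and not the parameters used in HAIL. The function names are illustrative, and `pow(den, -1, P)` needs Python 3.8 or later.

```python
P = 257  # small prime field, assumed for illustration; byte values 0..255 fit


def _interpolate_at(points, x):
    """Evaluate at x the unique polynomial of degree < len(points) that passes
    through the given (xi, yi) pairs (Lagrange interpolation mod P)."""
    total = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = num * (x - xj) % P
                den = den * (xi - xj) % P
        total = (total + yi * num * pow(den, -1, P)) % P
    return total


def disperse(data, n):
    """Encode m data symbols into n fragments (x, y), with n < P.
    The code is systematic: fragments 1..m carry the data itself."""
    points = list(zip(range(1, len(data) + 1), data))
    return [(x, _interpolate_at(points, x)) for x in range(1, n + 1)]


def decode(fragments, m):
    """Recover the m data symbols from any m distinct fragments."""
    points = fragments[:m]
    return [_interpolate_at(points, x) for x in range(1, m + 1)]


data = [72, 65, 73, 76]                 # m = 4 data symbols
fragments = disperse(data, n=6)         # n = 6 fragments, one per provider
survivors = [fragments[5], fragments[1], fragments[4], fragments[2]]
print(decode(survivors, m=4) == data)
```

Storage overhead is n/m times the file size, compared with n times for full replication. Parity values in this toy field can be 256, so a practical code would work over a field matched to the symbol size.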
15
Dispersal code
Client
F
dispersal (n,m)
P1 P2 P3 P4 P5
F Dispersal code parity blocks
16
Dispersal code
Client
P1 P2 P3 P4 P5
Stripe
Check that stripe is a codeword in dispersal code
POR encoding to correct small corruption
Dispersal code parity
POR encoding
F Dispersal code parity blocks
How to increase file lifetime?
17
Increasing file lifetime with MACs
Client
P1 P2 P3 P4 P5
MAC MAC MAC MAC MAC
Can we reduce storage overhead?
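As a rough illustration of the per-block MAC approach, the sketch below tags each block with HMAC-SHA256 from the Python standard library, binding the block index into the tag so that blocks cannot be swapped. The choice of HMAC-SHA256, the index encoding, and the function names are assumptions; the slides do not specify a MAC.

```python
import hashlib
import hmac


def mac_block(key: bytes, index: int, block: bytes) -> bytes:
    """Tag a block under a secret key, including its position in the file."""
    message = index.to_bytes(8, "big") + block
    return hmac.new(key, message, hashlib.sha256).digest()


def verify_block(key: bytes, index: int, block: bytes, tag: bytes) -> bool:
    """Check a block returned by a provider against its stored tag."""
    return hmac.compare_digest(mac_block(key, index, block), tag)


key = b"client-secret-key"          # known only to the client
block = b"block contents"
tag = mac_block(key, 7, block)      # stored alongside the block
print(verify_block(key, 7, block, tag))
print(verify_block(key, 7, b"tampered contents", tag))
```

With a 32-byte tag per block, the client can detect any single corrupted block it samples, at the price of extra storage for every block, which motivates the question on this slide.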
18
Integrity-protected dispersal code
Client
P1 P2 P3 P4 P5
Reed-Solomon dispersal code
m hk1(m) UHF hk2(m)
PRF+
19
Integrity-protected dispersal code
Client
P1 P2 P3 P4 P5
MACs embedded into parity symbols
m PRF+
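The "UHF plus PRF" idea can be sketched in isolation: a keyed polynomial-evaluation hash of the message is masked with a pseudorandom value derived from the symbol's position, and the sum serves as a MAC. The code below is a simplified illustration under assumed parameters (the prime 2^61 - 1, HMAC-SHA256 standing in for the PRF, hypothetical function names); it is not the exact construction from the HAIL paper.

```python
import hashlib
import hmac

P = 2**61 - 1  # a Mersenne prime, chosen here only for convenience


def uhf(k1: int, symbols) -> int:
    """Polynomial-evaluation hash: sum(symbols[i] * k1**(i + 1)) mod P.
    It is linear in the message, like a Reed-Solomon parity symbol."""
    acc = 0
    for s in reversed(symbols):
        acc = (acc + s) * k1 % P
    return acc


def prf(k2: bytes, position: int) -> int:
    """Pseudorandom mask for a given position (HMAC-SHA256 as a stand-in PRF)."""
    digest = hmac.new(k2, position.to_bytes(8, "big"), hashlib.sha256).digest()
    return int.from_bytes(digest, "big") % P


def tag(k1: int, k2: bytes, position: int, symbols) -> int:
    """MAC value: UHF of the message plus the PRF mask, mod P."""
    return (uhf(k1, symbols) + prf(k2, position)) % P


def verify(k1: int, k2: bytes, position: int, symbols, t: int) -> bool:
    return tag(k1, k2, position, symbols) == t


k1, k2 = 123456789, b"prf-key"
row = [10, 20, 30]
t = tag(k1, k2, position=0, symbols=row)
print(verify(k1, k2, 0, row, t), verify(k1, k2, 0, [10, 20, 31], t))
```

Because the hash part is itself a code symbol, the tag can take the place of a parity symbol instead of being stored next to it, which is the sense in which the MACs are embedded into parity symbols.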
20
Current work and open problems
• Proofs of Retrievability
– Lower bounds akin to Naor and Rothblum’s lower bounds for memory checking
– What is the cost of file updates?
• HAIL
– K. Bowers, A. Juels and A. Oprea, “HAIL (High-Availability and Integrity Layer) for Cloud Storage”, CCS 2009
– Different adversarial models
– Investigate alternative constructions
– Supporting file updates