Security approaches in BigTable-like storage systems 22951 Research Seminar: Information Security and Privacy July 2014 Open University of Israel Grisha Weintraub



DESCRIPTION

BigTable is Google's distributed storage system for managing large-scale structured data. It was designed for internal (i.e. trusted) use, so security was not a design consideration. Since 2006, when the paper describing BigTable's architecture was published, several open-source BigTable-like systems have been developed (e.g. HBase, Hypertable). One of the primary uses of such systems is cloud storage: a service that gives users access to data without the need to manage hardware or software. However, users may not trust the cloud provider, so appropriate security techniques should be applied. This seminar reviews three security approaches for BigTable-like systems:

1. iBigTable - an enhancement of BigTable that provides scalable data integrity assurance.
2. BigSecret - a secure data management framework for BigTable-like storage systems.
3. Accumulo - an extension of BigTable that provides cell-level access control.


Page 1: Security approaches in BigTable-like storage systems

Security approaches in BigTable-like storage systems

22951 Research Seminar: Information Security and Privacy July 2014

Open University of Israel

Grisha Weintraub

Page 2: Security approaches in BigTable-like storage systems

Abstract

• BigTable - Google’s scalable storage system.

• Designed for internal (i.e. trusted) use.

• Open-source implementations exist (e.g. HBase).

• Can be deployed in a public cloud (i.e. DBaaS).

• However, one may not trust the public cloud provider.

• Our focus is on approaches that make BigTable-like systems secure.

Page 3: Security approaches in BigTable-like storage systems

Outline

• BigTable

• Security approaches :

Integrity(iBigTable)

Encryption(BigSecret)

Access Control(Accumulo)

Page 4: Security approaches in BigTable-like storage systems

BigTable - Introduction

• Fay Chang et al., Bigtable: A Distributed Storage System for Structured Data, OSDI 2006 (Best Paper)

• Distributed storage system for managing structured data that is designed to scale to a very large size.

Page 5: Security approaches in BigTable-like storage systems

BigTable – Data Model

• BigTable is a sparse, distributed, persistent multidimensional sorted map.

• The map is indexed by a row key, column key, and a timestamp.

• (row_key, column_key, time) → string
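The mapping above can be sketched as a small in-memory structure. This is only an illustration of the data model, not BigTable's API: the `SparseTable` class, its `put`/`get` methods, the integer timestamps, and the sample values are all assumptions of this sketch, and sorting and distribution are ignored.

```python
from collections import defaultdict


class SparseTable:
    """Toy model of the (row_key, column_key, time) -> string map."""

    def __init__(self):
        # (row_key, column_key) -> {timestamp: value}; absent cells take no space.
        self._cells = defaultdict(dict)

    def put(self, row_key, column_key, timestamp, value):
        self._cells[(row_key, column_key)][timestamp] = value

    def get(self, row_key, column_key, timestamp=None):
        """Return the value at `timestamp`, or the newest version if it is omitted."""
        versions = self._cells.get((row_key, column_key))
        if not versions:
            return None
        if timestamp is None:
            timestamp = max(versions)
        return versions.get(timestamp)


# Hypothetical sample data: two versions of the same cell.
table = SparseTable()
table.put("29", "name", 1, "Bob")
table.put("29", "name", 2, "Robert")
```

Reading `table.get("29", "name", 1)` returns the older version, while omitting the timestamp returns the newest one.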

Page 6: Security approaches in BigTable-like storage systems

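BigTable – Data Model

RDBMS approach (one fixed schema, missing values stored as null):

user_id | name | phone | email
15 | John | 178145 | null
29 | Bob | null | [email protected]

BigTable approach (each row stores only the columns it has):

user_id 15: name = John, phone = 178145
user_id 29: name = Bob (t1), Robert (t2); email = [email protected]

Each cell is addressed by row_key, column_key, and timestamp:

(29, name, t2) → “Robert”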

Page 7: Security approaches in BigTable-like storage systems

BigTable – Data Model

• Columns are grouped into Column Families:
– family : optional qualifier

BigTable approach:

user_id | name: | contactInfo : phone | contactInfo : email
15 | John | 17814552 | [email protected]

Here contactInfo is the column family, and phone and email are optional qualifiers.

RDBMS approach (two tables):

user_id | name
15 | John

user_id | type | value
15 | phone | 178145
15 | email | [email protected]

Page 8: Security approaches in BigTable-like storage systems

BigTable – Data Model

• Each entry is a key-value pair:
– Key: Row-Key, Column (Family, Qualifier), Timestamp
– Value

• Sorting order:
– Row-Key → Family → Qualifier → Timestamp
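The sorting order can be illustrated with a composite sort key. This is a sketch under stated assumptions: the tuple layout and the sample cells are made up for illustration, and the newest-first ordering of timestamps follows the Bigtable paper (versions are stored in decreasing timestamp order) rather than anything stated on the slide.

```python
def sort_key(cell):
    """Order by row key, family, qualifier, then newest timestamp first."""
    row, family, qualifier, timestamp, _value = cell
    return (row, family, qualifier, -timestamp)


# Hypothetical cells: (row, family, qualifier, timestamp, value).
cells = [
    ("29", "name", "", 1, "Bob"),
    ("15", "name", "", 1, "John"),
    ("29", "name", "", 2, "Robert"),
    ("15", "contactInfo", "phone", 1, "17814552"),
]
ordered = sorted(cells, key=sort_key)
```

After sorting, all cells of row 15 come before those of row 29, and within the `name` column of row 29 the newer value precedes the older one.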

Page 9: Security approaches in BigTable-like storage systems

BigTable – Data Model

• Tablets:
– Large tables are broken into tablets at row boundaries.
– A tablet holds a contiguous range of rows.
– Approximately 100-200 MB of data per tablet.

Figure: a table sorted by id, where rows 15000 to 20000 form Tablet 1 and rows 20001 to 25000 form Tablet 2.
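Because each tablet holds a contiguous, sorted row range, finding the tablet for a row reduces to a binary search over the range boundaries. The sketch below assumes the two boundaries shown in the figure; `find_tablet` and the list names are hypothetical, and the lower bound of the first tablet is not checked.

```python
import bisect

# Last row id of each tablet, in sorted order (values taken from the figure).
tablet_end_rows = [20000, 25000]
tablet_names = ["Tablet 1", "Tablet 2"]


def find_tablet(row_id):
    """Return the tablet whose row range contains row_id, or None if past the end."""
    index = bisect.bisect_left(tablet_end_rows, row_id)
    if index == len(tablet_end_rows):
        return None
    return tablet_names[index]
```

For example, `find_tablet(20000)` falls in Tablet 1 and `find_tablet(20001)` in Tablet 2.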

Page 10: Security approaches in BigTable-like storage systems

BigTable – API

• Metadata operations:
– Creating and deleting tables and column families, modifying access control rights.

• Client operations:
– Write/delete values
– Read values
– Scan row ranges

// Open the table
Table *T = OpenOrDie("/bigtable/users");

// Update the name and delete a phone
RowMutation r1(T, "29");
r1.Set("name:", "Robert");
r1.Delete("contactInfo:phone");
Operation op;
Apply(&op, &r1);

Page 11: Security approaches in BigTable-like storage systems

BigTable – System Structure

• Three major components:

– Client library

– Master (exactly one):
  • Assigning tablets to tablet servers.
  • Detecting the addition and expiration of tablet servers.
  • Balancing tablet-server load.
  • Garbage collection of files in GFS.
  • Schema changes such as table and column family creations.

– Tablet Servers (multiple, dynamically added):
  • Manages 10-1000 tablets.
  • Handles read and write requests to the tablets.
  • Splits tablets that have grown too large.

Page 12: Security approaches in BigTable-like storage systems

BigTable – System Structure

Page 13: Security approaches in BigTable-like storage systems

BigTable – Tablet Location

• Three-level hierarchy analogous to that of a B+ tree to store tablet location information.

• Client library caches tablet locations.
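The lookup path and the client-side cache can be sketched as follows. This is a simplified model, not BigTable's implementation: `LocationIndex`, `Client`, the server names, and the row boundaries are hypothetical, and the cache is keyed by individual row rather than by row range.

```python
import bisect


class LocationIndex:
    """Sorted (end_row, target) pairs; lookup returns the first range covering a row."""

    def __init__(self, entries):
        self._end_rows = [end for end, _ in entries]
        self._targets = [target for _, target in entries]

    def lookup(self, row):
        i = bisect.bisect_left(self._end_rows, row)
        return self._targets[i]  # raises IndexError if row is past the last range


# Meta tablets map row ranges to the servers holding the user tablets.
meta_a = LocationIndex([(20000, "server-1"), (25000, "server-2")])
meta_b = LocationIndex([(30000, "server-3"), (35000, "server-1")])
# The root tablet maps row ranges to meta tablets.
root = LocationIndex([(25000, meta_a), (35000, meta_b)])


class Client:
    def __init__(self, root_index):
        self._root = root_index
        self._cache = {}
        self.root_lookups = 0  # counts how often the hierarchy was walked

    def locate(self, row):
        """Resolve a row to a tablet server, consulting the cache first."""
        if row not in self._cache:
            self.root_lookups += 1
            meta = self._root.lookup(row)        # root tablet -> meta tablet
            self._cache[row] = meta.lookup(row)  # meta tablet -> user tablet location
        return self._cache[row]


client = Client(root)
```

A repeated `client.locate(row)` for the same row is answered from the cache without walking the hierarchy again.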

Page 14: Security approaches in BigTable-like storage systems

BigTable – Tablet Serving

• Writes:
– Updates are committed to a commit log.
– Recently committed updates are stored in memory, in the memtable.
– Older updates are stored in a sequence of SSTables.

• Reads:
– A read operation is executed on a merged view of the sequence of SSTables and the memtable.
– Since the SSTables and the memtable are sorted, the merged view can be formed efficiently.
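The write and read paths can be sketched with a dictionary as the memtable and sorted lists as SSTables. This is a toy model under stated assumptions: the `Tablet` class and its methods are hypothetical, keys are plain strings without timestamps, and `flush` stands in for whatever moves the memtable to an SSTable.

```python
import heapq


class Tablet:
    def __init__(self):
        self.commit_log = []  # append-only record of every update
        self.memtable = {}    # recent updates, kept in memory
        self.sstables = []    # older updates: immutable sorted lists, oldest first

    def write(self, key, value):
        self.commit_log.append((key, value))  # commit first, then apply
        self.memtable[key] = value

    def flush(self):
        """Freeze the memtable into a new sorted SSTable."""
        self.sstables.append(sorted(self.memtable.items()))
        self.memtable = {}

    def scan(self):
        """Merged, sorted view; for duplicate keys the newest source wins."""
        sources = [sorted(self.memtable.items())] + self.sstables[::-1]
        # Tag entries with source age (0 = newest) so equal keys sort newest-first.
        tagged = [[(k, age, v) for k, v in src] for age, src in enumerate(sources)]
        result, last_key = [], object()
        for key, _age, value in heapq.merge(*tagged):
            if key != last_key:
                result.append((key, value))
                last_key = key
        return result
```

Because every source is already sorted, `heapq.merge` produces the merged view in a single pass without re-sorting the data.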

Page 15: Security approaches in BigTable-like storage systems

BigTable - Compactions

• Minor compaction:
– Converts the memtable into an SSTable.
– Reduces memory usage.
– Reduces log reads during recovery.

• Major compaction:
– Merging compaction that results in a single SSTable.
– No deletion records, only live data.
– Good place to apply the policy “keep only N versions”.
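A major compaction can be sketched as a merge that drops deletion records and trims old versions. The sketch makes several assumptions: `TOMBSTONE`, `major_compaction`, and the `(key, timestamp, value)` layout are hypothetical, and a deletion record is taken to hide every version older than itself.

```python
TOMBSTONE = None  # stand-in for a deletion record


def major_compaction(sstables, max_versions):
    """Merge SSTables into one, dropping deleted data and excess versions.

    Each SSTable is a list of (key, timestamp, value) entries.
    """
    versions = {}
    for sstable in sstables:
        for key, timestamp, value in sstable:
            versions.setdefault(key, []).append((timestamp, value))

    merged = []
    for key in sorted(versions):
        newest_first = sorted(versions[key], key=lambda tv: tv[0], reverse=True)
        live = []
        for timestamp, value in newest_first:
            if value is TOMBSTONE:
                break  # everything older than the deletion is dropped too
            live.append((key, timestamp, value))
        merged.extend(live[:max_versions])  # "keep only N versions"
    return merged


old = [("a", 1, "x1"), ("b", 1, "y1"), ("c", 1, "z1")]
new = [("a", 2, "x2"), ("a", 3, "x3"), ("b", 2, TOMBSTONE)]
compacted = major_compaction([old, new], max_versions=2)
```

In this example the result contains the two newest versions of `a`, nothing for the deleted `b`, and the single version of `c`.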

Page 16: Security approaches in BigTable-like storage systems

Outline

• BigTable √

• Security approaches :

Integrity(iBigTable)

Encryption(BigSecret)

Access Control(Accumulo)

Page 17: Security approaches in BigTable-like storage systems

iBigTable - Introduction

• Wei Wei, Ting Yu, Rui Xue: iBigTable: practical data integrity for bigtable in public cloud. CODASPY 2013

• Enhancement of BigTable that provides scalable data integrity assurance.

Page 18: Security approaches in BigTable-like storage systems

iBigTable – System Model

Figure: the Data Owner writes to BigTable; Clients read from BigTable.

Page 19: Security approaches in BigTable-like storage systems

iBigTable - Goals

• Correctness:
– Returned records have not been modified in any way.

• Completeness:
– No answers have been omitted from the result.

• Freshness:
– Results are based on the most current version of the data.

Page 20: Security approaches in BigTable-like storage systems

iBigTable – System Design

• Basic idea:
– Build a Merkle Hash Tree based Authenticated Data Structure for each tablet.

• Verification Object (VO): data returned along with the result and used to authenticate the result.

• Example: the VO for Data block 1 is {Hash 0-1, Hash 1}.
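The VO in the example can be reproduced with a small binary Merkle hash tree over four data blocks. This is a sketch under stated assumptions: iBigTable uses Merkle B+ trees, whereas the code below uses a plain binary tree with SHA-256, and all function names and block contents are hypothetical.

```python
import hashlib


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def build_tree(blocks):
    """Return the tree levels, leaves first; assumes len(blocks) is a power of two."""
    levels = [[h(b) for b in blocks]]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        levels.append([h(prev[i] + prev[i + 1]) for i in range(0, len(prev), 2)])
    return levels


def make_vo(levels, index):
    """Collect the sibling hash at each level on the path from a leaf to the root."""
    vo = []
    for level in levels[:-1]:
        vo.append(level[index ^ 1])
        index //= 2
    return vo


def verify(block, index, vo, root_hash):
    """Recompute the root from the block and its VO; compare with the trusted root."""
    current = h(block)
    for sibling in vo:
        current = h(sibling + current) if index % 2 else h(current + sibling)
        index //= 2
    return current == root_hash


blocks = [b"block1", b"block2", b"block3", b"block4"]
levels = build_tree(blocks)
root_hash = levels[-1][0]   # the only value the verifier must trust
vo = make_vo(levels, 0)     # VO for the first data block
```

For the first block, `vo` holds the hash of the second block and the hash of the right subtree, which corresponds to {Hash 0-1, Hash 1} in the example. A verifier holding only `root_hash` accepts the genuine block and rejects a modified one.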

Page 21: Security approaches in BigTable-like storage systems

iBigTable – System Design

Merkle B+ Tree

Page 22: Security approaches in BigTable-like storage systems

iBigTable – System Design

Figure: a single authenticated structure spans the Root Tablet, Meta Tablet, and User Tablets; the Data Owner stores its one root hash.

• Pros:
– Only one hash is maintained for all data.

• Cons:
– Requires update propagation.
– Concurrent updates could cause issues.

Page 23: Security approaches in BigTable-like storage systems

iBigTable – System Design

Figure: the Data Owner stores a separate root hash for each tablet (Root Tablet, Meta Tablet, User Tablets, ...).

Page 24: Security approaches in BigTable-like storage systems

iBigTable – Reads

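1. Client and the Tablet Server serving the ROOT tablet:
– 1.1 The client sends getMetaTabletLocation(table name, row key).
– 1.2 The server generates a VO.
– 1.3 The server returns the meta tablet location and the VO.
– 1.4 The client verifies the result.

2. Client and the Tablet Server serving the META tablet:
– 2.1 The client sends getUserTabletLocation(table name, row key).
– 2.2 The server generates a VO.
– 2.3 The server returns the user tablet location and the VO.
– 2.4 The client verifies the result.

3. Client and the Tablet Server serving the USER tablet:
– 3.1 The client sends getRow(row key).
– 3.2 The server generates a VO.
– 3.3 The server returns the row data and the VO.
– 3.4 The client verifies the result.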

Page 25: Security approaches in BigTable-like storage systems

iBigTable – Updates

Data Owner and the Tablet Server serving the USER tablet:
– 3.1 The Data Owner sends the new/updated row.
– 3.2 The server generates a PT-VO.
– 3.3 The server returns the PT-VO.
– 3.4 The Data Owner verifies it and updates the tablet root hash.

Partial Tree Verification Object (PT-VO): the difference between a VO and a PT-VO is that a PT-VO contains keys along with hashes, while a VO does not.

Page 26: Security approaches in BigTable-like storage systems

iBigTable – Updates

Figure: initial MB+ row tree of a tablet in a tablet server, holding row keys 0, 10, ..., 90.

Page 27: Security approaches in BigTable-like storage systems

iBigTable – Updates

Figure: a row with the new key 45 is inserted into the partial tree VO.

Figure: the partial tree VO after 45 is inserted.

Page 28: Security approaches in BigTable-like storage systems

iBigTable – Authenticated Data Structure

• Projected range queries: VOs are expensive to generate and verify.

SL-MBT: A single-level Merkle B+ tree

Page 29: Security approaches in BigTable-like storage systems

iBigTable – Authenticated Data Structure

TL-MBT: A two-level Merkle B+ tree.

Page 30: Security approaches in BigTable-like storage systems

Outline

• BigTable √

• Security approaches :

Integrity(iBigTable) √

Encryption(BigSecret)

Access Control(Accumulo)

Page 31: Security approaches in BigTable-like storage systems

BigSecret - Introduction

• Erman Pattuk et al., BigSecret: A Secure Data Management Framework for Key-Value Stores. IEEE CLOUD 2013

• A secure data management framework for BigTable-like storage systems.

Page 32: Security approaches in BigTable-like storage systems

BigSecret – System Model

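Clients → BigSecret → BigTable

1. Client to BigSecret: get(“Bob”, “email”)
2. BigSecret to BigTable: Get(“A4Vc”, “Zx$23”)
3. BigTable to BigSecret: “DF77Xs9”
4. BigSecret to client: “[email protected]”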

Page 33: Security approaches in BigTable-like storage systems

BigSecret – Goals

• Secure storage of data on untrusted servers.

• Efficient query execution on encrypted data.

• Supported queries:
– Put
– Get
– Delete
– Scan

Page 34: Security approaches in BigTable-like storage systems

BigSecret – Preliminaries

• Key:
– row||fam||qua||ts

• Symmetric encryption:
– E(p) → c // encryption
– D(c) → p // decryption

• Pseudo-Random Functions (PRF):
– H(m) → h // deterministic, pseudo-random output

• Bucketization:
– Partitions p1, p2, … of domain Z.
– Ident: a function that assigns a unique random identifier to each partition.
– Map: a function that takes a partitioned domain and a value v from the domain, and returns Ident(p), where v belongs to p.

Page 35: Security approaches in BigTable-like storage systems

BigSecret – Bucketization

Figure: the domain 0 to 10000 is split at 2000, 4000, 6000, and 8000 into five partitions with identifiers 34, 97, 123, 266, and 771.

Map(100) = 34
Map(6451) = 266

Order-preserving mapping: x < y ⇒ Map(x) ≤ Map(y)
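The Map function from this example can be sketched with a binary search over the partition boundaries. The boundaries and identifiers are read off the figure; the function name `bucket_map` and the choice of half-open partitions (a boundary value belongs to the upper partition) are assumptions of this sketch.

```python
import bisect

# Partitions [0, 2000), [2000, 4000), [4000, 6000), [6000, 8000), [8000, 10000].
BOUNDARIES = [2000, 4000, 6000, 8000]   # upper bounds of all but the last partition
IDENTIFIERS = [34, 97, 123, 266, 771]   # Ident(p) for each partition, in order


def bucket_map(value):
    """Return Ident(p) for the partition p that contains value."""
    if not 0 <= value <= 10000:
        raise ValueError("value outside the domain")
    return IDENTIFIERS[bisect.bisect_right(BOUNDARIES, value)]
```

Since the identifiers increase with the partitions, the mapping never reverses the order of two values, although values in the same partition become indistinguishable.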

Page 36: Security approaches in BigTable-like storage systems

BigSecret – Encryption Models

Naive approach – encrypt values only

Call received by BigSecret → call sent to BigTable:

Put(row, fam, qua, ts, value) → Put(row, fam, qua, ts, E(value))

On reads, BigTable returns E(value), and BigSecret decrypts it: D(E(value)).

– All operations are supported.
– Relatively good performance.
– Only minor changes to the system are required.
– Poor privacy.

Page 37: Security approaches in BigTable-like storage systems

BigSecret – Encryption Models

Model-1 – bucketization for all key parts

Call received by BigSecret → call sent to BigTable:

Put(row, fam, qua, ts, value) → Put(Map(row), Map(fam), Map(qua)||E(key), Map(ts), E(value))

Scan(row_from, row_to, fam) → Scan(Map(row_from), Map(row_to), Map(fam))

Example: Scan(200, 300, contactInfo) → Scan(34, 34, 452)

– All operations are supported.
– Relatively bad performance.
– Privacy-performance trade-off.

Page 38: Security approaches in BigTable-like storage systems

BigSecret – Encryption Models

Model-2 – PRF for all key parts

Call received by BigSecret → call sent to BigTable:

Put(row, fam, qua, ts, value) → Put(H(row), H(fam), H(qua)||E(key), H(ts), E(value))

Get(row, fam, qua) → Get(H(row), H(fam), H(qua))

Example: Get(200, contactInfo, email) → Get(Az54Et, q8dj8, qWd29h)

– Scan is not supported.
– Relatively good performance.
– Frequency-based attacks.
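The Get translation in Model-2 can be sketched with a keyed hash standing in for the PRF. This is an illustration only: using HMAC-SHA256 as H, the `SECRET_KEY` value, and the `translate_get` helper are assumptions of this sketch, not details taken from BigSecret.

```python
import hashlib
import hmac

SECRET_KEY = b"demo-key-not-for-production"  # hypothetical client-side secret


def H(message: str) -> str:
    """Keyed PRF stand-in: the same input always yields the same token."""
    digest = hmac.new(SECRET_KEY, message.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()


def translate_get(row, fam, qua):
    """Rewrite Get(row, fam, qua) into the form sent to the untrusted store."""
    return (H(row), H(fam), H(qua))


request = translate_get("200", "contactInfo", "email")
```

Determinism is what makes Get work, since the same plaintext key always maps to the same token. It is also why Scan is lost, because the tokens of neighbouring row keys are unrelated, and why equal plaintexts remain recognisable as equal.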

Page 39: Security approaches in BigTable-like storage systems

BigSecret – Encryption Models

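Frequency-based attacks (Damiani et al. 2003)

Plaintext table:

id | name | city
19 | Alice | Tel-Aviv
24 | Bob | New York
32 | Carol | Paris
38 | Alice | New York

Encrypted table:

id | name | city
j | 27 | $
a | 14 | &
t | 23 | *
z | 27 | &

An attacker who knows the value frequencies can infer that 27 = “Alice” and & = “New York”, and hence that Alice lives in NY.

Possible solutions:
• Decreasing the range of the PRFs.
• Model-3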
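The inference in this example can be reproduced by counting token frequencies. The sketch assumes the attacker already knows which plaintext name and city are the most common; the variable names are hypothetical and the rows are the encrypted values from the slide.

```python
from collections import Counter

# Encrypted rows from the example: (id, name, city) tokens.
encrypted_rows = [("j", "27", "$"), ("a", "14", "&"), ("t", "23", "*"), ("z", "27", "&")]

# Background knowledge the attacker is assumed to have.
known_most_common_name = "Alice"
known_most_common_city = "New York"

# The most frequent token in each column is matched to the most frequent plaintext.
name_token = Counter(name for _, name, _ in encrypted_rows).most_common(1)[0][0]
city_token = Counter(city for _, _, city in encrypted_rows).most_common(1)[0][0]
guesses = {name_token: known_most_common_name, city_token: known_most_common_city}

# Rows where both guessed tokens co-occur leak a (name, city) association.
leaked = [
    (guesses[name], guesses[city])
    for _, name, city in encrypted_rows
    if name in guesses and city in guesses
]
```

Only the last row contains both guessed tokens, so it is the one that reveals the association between a name and a city.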

Page 40: Security approaches in BigTable-like storage systems

BigSecret – Encryption Models

Model-3 – PRF only for row-key

Call received by BigSecret → call sent to BigTable:

Put(row, fam, qua, ts, value) → Put(H(row), 0, E(key), 1, E(value))

Get(row, fam, qua) → Get(H(row), 0, null)

Example: Get(200, contactInfo, email) → Get(Az54Et, 0, null)

– Scan is not supported.
– Relatively good privacy.
– Performance?

Page 41: Security approaches in BigTable-like storage systems

BigSecret – Encryption Models

Page 42: Security approaches in BigTable-like storage systems

Outline

• BigTable √

• Security approaches :

Integrity(iBigTable) √

Encryption(BigSecret) √

Access Control(Accumulo)

Page 43: Security approaches in BigTable-like storage systems

Accumulo – Introduction

• Adam Fuchs, Apache Accumulo: Extensions to Google's Bigtable Design, 2012, lecture conducted from Morgan State University

• An extension of BigTable that provides cell-level access control.

Page 44: Security approaches in BigTable-like storage systems

Accumulo – System Model

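Figure: a BigTable table holding the cells of row 14:

• name: Bob
• contactInfo : email: [email protected]
• healthData : blood test: sodium : 137 …
• healthData : doctor’s notes: Patient suffers from .…
• …

Different users see different cells of the same row: Bob sees the email and the blood test, while a second user sees the blood test and the notes.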

Page 45: Security approaches in BigTable-like storage systems

Accumulo – System Model

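1. The user sends credentials and a query.
2. The user is looked up to obtain the user authorization set.
3. The authorization set (auth) and the query are sent to BigTable.
4. BigTable returns the data, which is passed back to the user.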

Page 46: Security approaches in BigTable-like storage systems

Accumulo – Data Model

Accumulo key: Row-Key, Column (Family, Qualifier, Visibility), Timestamp → Value

BigTable key: Row-Key, Column (Family, Qualifier), Timestamp → Value

Visibility holds security labels (e.g. A|(B&C)).

Page 47: Security approaches in BigTable-like storage systems

Accumulo – Visibility

• Syntax:
– A&B: both A and B are required
– A|B: must have either A or B
– A|(B & C): must have A or both B and C

• Examples:
– Admin|(Manager & Sales)
– Citizen & Adult
– Secret | Top Secret
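The semantics of these expressions can be sketched with a small recursive-descent evaluator. This is not Accumulo's implementation: the `tokenize` and `is_visible` functions are hypothetical, labels are limited to letters, digits, and underscores, `&` is given precedence over `|` when parentheses are omitted, and an empty expression is treated as visible to everyone.

```python
import re

TOKEN = re.compile(r"\s*([A-Za-z0-9_]+|[&|()])")


def tokenize(expression):
    tokens, pos = [], 0
    expression = expression.rstrip()
    while pos < len(expression):
        match = TOKEN.match(expression, pos)
        if not match:
            raise ValueError(f"bad character at position {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def is_visible(expression, authorizations):
    """Evaluate a visibility expression against a set of authorization tokens."""
    tokens = tokenize(expression)
    if not tokens:
        return True  # sketch assumption: no label means no restriction
    pos = 0

    def parse_or():
        nonlocal pos
        result = parse_and()
        while pos < len(tokens) and tokens[pos] == "|":
            pos += 1
            right = parse_and()
            result = result or right
        return result

    def parse_and():
        nonlocal pos
        result = parse_term()
        while pos < len(tokens) and tokens[pos] == "&":
            pos += 1
            right = parse_term()
            result = result and right
        return result

    def parse_term():
        nonlocal pos
        if pos >= len(tokens):
            raise ValueError("unexpected end of expression")
        token = tokens[pos]
        pos += 1
        if token == "(":
            result = parse_or()
            if pos >= len(tokens) or tokens[pos] != ")":
                raise ValueError("missing closing parenthesis")
            pos += 1
            return result
        if token in ("&", "|", ")"):
            raise ValueError(f"unexpected token {token!r}")
        return token in authorizations  # a label is satisfied if the user holds it

    result = parse_or()
    if pos != len(tokens):
        raise ValueError("trailing tokens")
    return result
```

With this sketch, a user holding `{"Manager", "Sales"}` satisfies `Admin|(Manager & Sales)`, while a user holding only `{"Manager"}` does not.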

Page 48: Security approaches in BigTable-like storage systems

Accumulo – Visibility

Cells of row 14 with their visibility labels:

• name: Bob
• contactInfo : email (visibility bob14): [email protected]
• healthData : blood test (visibility bob14|doctor): sodium : 137 …
• healthData : doctor’s notes (visibility doctor): Patient suffers from .…
• …

Page 49: Security approaches in BigTable-like storage systems

Accumulo – Visibility

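1. Bob sends his credentials (bob, ***) and a query for health data.
2. The user lookup returns the authorization set {bob14}.
3. The query for health data is sent to BigTable together with {bob14}.
4. BigTable returns only the blood test, which is passed back to Bob.

Cells matched against {bob14}:
– HealthData : blood test, visibility bob14|doctor: returned
– HealthData : notes, visibility doctor: not returned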

Page 50: Security approaches in BigTable-like storage systems

Accumulo – Iterators

Iterator

Page 51: Security approaches in BigTable-like storage systems

Accumulo – Iterators

Page 52: Security approaches in BigTable-like storage systems

Outline

• BigTable √

• Security approaches :

Integrity(iBigTable) √

Encryption(BigSecret) √

Access Control(Accumulo) √

Page 53: Security approaches in BigTable-like storage systems

References

• Fay Chang, Jeffrey Dean, Sanjay Ghemawat, Wilson C. Hsieh, Deborah A. Wallach, Michael Burrows, Tushar Chandra, Andrew Fikes, Robert Gruber: Bigtable: A Distributed Storage System for Structured Data (Awarded Best Paper!). OSDI 2006:205-218

• Wei Wei, Ting Yu, Rui Xue: iBigTable: practical data integrity for bigtable in public cloud. CODASPY 2013:341-352

• Erman Pattuk, Murat Kantarcioglu, Vaibhav Khadilkar, Huseyin Ulusoy, Sharad Mehrotra: BigSecret: A Secure Data Management Framework for Key-Value Stores. IEEE CLOUD 2013:147-154

• http://accumulo.apache.org/