Google Megastore

Department of Information System
Course: IS533 - Advanced Topics in Database
Done by: Alanoud Saad Alqoufi
ID: 435920068
Supervised by: Prof. Almetwally Mohamed Mostafa
Date: 5/3/2015


Page 2: Db presentation google_megastore

Outline
• Introduction
• Architecture and data model
• Transactions and concurrency control
• Replication
• Data structures and algorithms
• Failure detection
• Throughput
• Limitations
• Related work
• Experience
• Questions

Page 3: Db presentation google_megastore

What is Google Megastore

A database layered over Bigtable that provides high availability
• Widely deployed within Google
• Used by more than 100 applications
• Handles more than 3 billion write and 20 billion read transactions daily
• Stores nearly a petabyte of primary data
• Available on Google App Engine (GAE) since January 2011

Page 4: Db presentation google_megastore

Motivation
• Scalability
• Availability
• Consistency
• Responsiveness
• Rapid development

Page 5: Db presentation google_megastore

Megastore = RDBMS + NoSQL

RDBMS               NoSQL
----------------    ----------------
Slow performance    High performance
Not scalable        Scalable
Fixed data model    No schema
Easier to code      Complicated
Consistent          Less consistent

Page 6: Db presentation google_megastore

Megastore


Page 7: Db presentation google_megastore

Toward Availability and Scalability

1. Data replication: Paxos algorithm
2. Data partitioning: entity groups

Page 8: Db presentation google_megastore

Entity Group Operations


Page 9: Db presentation google_megastore

Design of Megastore
• De-normalized data

Data Model
• The data model is declared in a schema
• Each schema has a set of tables
• A table is either a root table or a child table
• A root entity together with all of its child entities is called an entity group

Page 10: Db presentation google_megastore

Sample schema for Photo Sharing Service

CREATE SCHEMA PhotoApp;

CREATE TABLE User {
  required int64 user_id;
  required string name;
} PRIMARY KEY(user_id), ENTITY GROUP ROOT;

CREATE TABLE Photo {
  required int64 user_id;
  required int32 photo_id;
  required int64 time;
  required string full_url;
  optional string thumbnail_url;
  repeated string tag;
} PRIMARY KEY(user_id, photo_id),
  IN TABLE User,
  ENTITY GROUP KEY(user_id) REFERENCES User;

CREATE LOCAL INDEX PhotosByTime
  ON Photo(user_id, time);

CREATE GLOBAL INDEX PhotosByTag
  ON Photo(tag) STORING (thumbnail_url);

Page 11: Db presentation google_megastore

Indexes
Secondary indexes are supported:
• Local index
• Global index
• Storing clause
• Repeated index
• Inline index

Page 12: Db presentation google_megastore

Mapping to Bigtable
Bigtable column name = Megastore table name + property name
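The naming rule on this slide can be illustrated with a small Python sketch. The helper names, the "." separator, and the use of a tuple as the row key are assumptions made for illustration; they are not Megastore's actual encoding, and the sample values are invented.

```python
# Hypothetical sketch: flatten Megastore-style entities into Bigtable-style cells.
# The "." separator and tuple row keys are illustrative assumptions.

def column_name(table, prop):
    """Bigtable column name = Megastore table name + property name."""
    return f"{table}.{prop}"


def to_cells(table, row_key, props):
    """Return {(row_key, column_name): value} for a single entity."""
    return {(row_key, column_name(table, p)): v for p, v in props.items()}


cells = {}
# Root entity: User with primary key (user_id).
cells.update(to_cells("User", (101,), {"name": "John"}))
# Child entity: Photo with primary key (user_id, photo_id).
cells.update(to_cells("Photo", (101, 500), {"time": 12345, "tag": ["Dinner", "Paris"]}))

# Both keys start with user_id, so sorting by row key keeps the
# entity group (the user and that user's photos) next to each other.
for row_key, column in sorted(cells):
    print(row_key, column, cells[(row_key, column)])
```

Because the child key extends the root key, a scan over one user_id prefix would touch the whole entity group in this toy layout.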

Page 13: Db presentation google_megastore

Transactions and concurrency control

Concurrency control: MVCC (multiversion concurrency control)
• Read consistency
  • Current
  • Snapshot
  • Inconsistent
• Write consistency
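The three read levels can be contrasted with a toy multiversion store for a single entity group. This is a simplified sketch, not Megastore's implementation: the class and method names are hypothetical, timestamps are assumed to increase with each commit, and everything runs in one process.

```python
class EntityGroup:
    """Toy multiversion store for one entity group (illustrative only)."""

    def __init__(self):
        self.versions = {}   # key -> list of (timestamp, value), oldest first
        self.log = []        # committed transactions that are not yet applied
        self.applied_ts = 0  # timestamp of the last fully applied transaction

    def commit(self, ts, writes):
        """Record a committed transaction in the log without applying it."""
        self.log.append((ts, dict(writes)))

    def apply_all(self):
        """Apply every logged transaction to the versioned data."""
        for ts, writes in self.log:
            for key, value in writes.items():
                self.versions.setdefault(key, []).append((ts, value))
            self.applied_ts = ts
        self.log.clear()

    def _read_at(self, key, ts):
        visible = [v for t, v in self.versions.get(key, []) if t <= ts]
        return visible[-1] if visible else None

    def current_read(self, key):
        """Apply all committed writes first, then read the latest state."""
        self.apply_all()
        return self._read_at(key, self.applied_ts)

    def snapshot_read(self, key):
        """Read at the last fully applied transaction; may miss newer commits."""
        return self._read_at(key, self.applied_ts)

    def inconsistent_read(self, key):
        """Ignore the log state and return whatever value was written last."""
        versions = self.versions.get(key, [])
        return versions[-1][1] if versions else None


group = EntityGroup()
group.commit(1, {"name": "a"})
group.apply_all()
group.commit(2, {"name": "b"})          # committed, but not applied yet

print(group.snapshot_read("name"))      # value as of the last applied transaction
print(group.current_read("name"))       # applies the log first, then reads
```

In this sketch the snapshot read returns the older value while a commit is still unapplied, and the current read pays the cost of catching up before it answers.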

Page 14: Db presentation google_megastore

Transactions and concurrency control

Complete transaction lifecycle in Megastore
1. Read
2. Application logic
3. Commit
4. Apply
5. Clean up
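The five steps can be walked through with a single-process sketch. The Paxos-based commit is replaced here by a plain check that the log position read in step 1 is still free, which is an assumption standing in for consensus; all class and function names are hypothetical.

```python
class ConflictError(Exception):
    """Raised when another writer already claimed the log position."""


class EntityGroupLog:
    def __init__(self):
        self.log = []     # committed log entries, each a dict of writes
        self.data = {}    # applied entity state
        self.applied = 0  # number of log entries already applied to data

    def read_position(self):
        # Step 1 (Read): find the next free log position.
        return len(self.log)

    def commit(self, position, entry):
        # Step 3 (Commit): stand-in for Paxos; one writer wins each position.
        if position != len(self.log):
            raise ConflictError(f"log position {position} is already taken")
        self.log.append(entry)

    def apply(self):
        # Step 4 (Apply): write the logged mutations to the entity data.
        while self.applied < len(self.log):
            self.data.update(self.log[self.applied])
            self.applied += 1


def run_transaction(group, logic):
    position = group.read_position()   # 1. Read
    group.apply()                      #    make sure the state is current
    entry = logic(dict(group.data))    # 2. Application logic builds a log entry
    group.commit(position, entry)      # 3. Commit
    group.apply()                      # 4. Apply
    # 5. Clean up: nothing to delete in this toy version.


def increment(state):
    return {"count": state.get("count", 0) + 1}


group = EntityGroupLog()
run_transaction(group, increment)
stale_position = group.read_position()  # a second writer reads here...
run_transaction(group, increment)       # ...but another transaction commits first
try:
    group.commit(stale_position, {"count": 99})
except ConflictError as err:
    print("conflict, retry from the read step:", err)
print(group.data)
```

The failed commit shows the optimistic style of write consistency: the losing writer has to start again from the read step.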

Page 15: Db presentation google_megastore

Transactions and concurrency control

Queues
• Example: calendar application

Page 16: Db presentation google_megastore

Transactions and concurrency control

Two-phase commit

Page 17: Db presentation google_megastore

Replication

Paxos
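As a reminder of how basic Paxos reaches agreement on one value, here is a minimal single-decree sketch. It is textbook Paxos under strong simplifying assumptions (synchronous calls, no failures, no networking) and does not show Megastore's modified protocol; the class and function names are hypothetical.

```python
class Acceptor:
    def __init__(self):
        self.promised = -1    # highest proposal number promised so far
        self.accepted = None  # (number, value) of the last accepted proposal

    def prepare(self, n):
        """Phase 1: promise to ignore proposals numbered below n."""
        if n > self.promised:
            self.promised = n
            return True, self.accepted
        return False, None

    def accept(self, n, value):
        """Phase 2: accept unless a higher-numbered promise was made."""
        if n >= self.promised:
            self.promised = n
            self.accepted = (n, value)
            return True
        return False


def propose(acceptors, n, value):
    """Run one proposal; return the chosen value, or None on failure."""
    majority = len(acceptors) // 2 + 1
    replies = [a.prepare(n) for a in acceptors]
    promises = [prev for ok, prev in replies if ok]
    if len(promises) < majority:
        return None
    # If any acceptor already accepted a value, the proposer must reuse
    # the one with the highest proposal number.
    prior = [p for p in promises if p is not None]
    if prior:
        value = max(prior, key=lambda p: p[0])[1]
    acks = sum(a.accept(n, value) for a in acceptors)
    return value if acks >= majority else None


acceptors = [Acceptor() for _ in range(3)]
first = propose(acceptors, 1, "x")
second = propose(acceptors, 2, "y")  # must adopt the value already chosen
stale = propose(acceptors, 1, "z")   # proposal number is too old
print(first, second, stale)
```

Once a value is chosen, later proposals can only re-confirm it, which is the property a replicated log position relies on.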

Page 18: Db presentation google_megastore

Replication

Modified Paxos
• Fast reads
• Fast writes

Page 19: Db presentation google_megastore

Replication

New Replica Types
• Full replicas
• Witness replicas
• Read-only replicas
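The roles of the three replica types can be summarized in a short sketch: full replicas vote and store data, witnesses vote but hold no entity data, and read-only replicas hold data but do not vote. The class names are hypothetical, and `has_quorum` uses a plain majority rule that deliberately leaves out the rest of Megastore's write path.

```python
from dataclasses import dataclass
from enum import Enum


class Kind(Enum):
    FULL = "full"
    WITNESS = "witness"
    READ_ONLY = "read-only"


@dataclass(frozen=True)
class Replica:
    name: str
    kind: Kind

    @property
    def votes(self):
        # Full and witness replicas take part in Paxos voting.
        return self.kind in (Kind.FULL, Kind.WITNESS)

    @property
    def stores_data(self):
        # Full and read-only replicas hold entity data; witnesses only log.
        return self.kind in (Kind.FULL, Kind.READ_ONLY)


def has_quorum(replicas, acks):
    """True if a majority of the voting replicas acknowledged (simplified)."""
    voters = [r for r in replicas if r.votes]
    acked = sum(1 for r in voters if r.name in acks)
    return acked > len(voters) // 2


replicas = [
    Replica("A", Kind.FULL),
    Replica("B", Kind.FULL),
    Replica("W", Kind.WITNESS),
    Replica("R", Kind.READ_ONLY),
]

print(has_quorum(replicas, {"A", "W"}))  # two of the three voters
print(has_quorum(replicas, {"A", "R"}))  # the read-only replica does not count
```

A witness is a cheap way to add a tie-breaking vote, while a read-only replica adds read capacity without affecting write latency.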

Page 20: Db presentation google_megastore

Data structures and algorithms

Replicated Logs

Page 21: Db presentation google_megastore

Data structures and algorithms

Reads

Page 22: Db presentation google_megastore

Data structures and algorithms

Writes

Page 23: Db presentation google_megastore

Failure Detection

• Chubby lock service


Page 24: Db presentation google_megastore

Write Throughput
• Sharding entity groups
• Placing replicas in the same region
• Bulk processing

Page 25: Db presentation google_megastore

Limitations
• Latency
• Chain gang throttling
• Does not enforce policies on physical layout

Page 26: Db presentation google_megastore

Experience


Page 27: Db presentation google_megastore

Related Work
• NoSQL: Bigtable, Cassandra, Yahoo PNUTS, Amazon SimpleDB
• Data replication process: HBase, CouchDB, Dynamo
• Paxos algorithm: Scalaris, Keyspace

Page 28: Db presentation google_megastore

Questions
1. Megastore is built upon ______. (Answer: Bigtable)
2. Synchronous replication is based upon ______. (Answer: Paxos)
3. Data is partitioned into a vast space of small databases, each with its own replicated ______. (Answer: log)
4. The data model is declared in a strongly typed ______. (Answer: schema)
5. Megastore tables are either ______ or ______ tables. (Answer: root, child)
6. The three levels of read consistency are ______. (Answer: current, snapshot, inconsistent)
7. Cross-entity-group updates are supported by ______. (Answer: two-phase commit)

Page 29: Db presentation google_megastore

Any Questions?