Department of Information System
Course: IS533 - Advanced Topics in Database
Done by: Alanoud Saad Alqoufi
ID: 435920068
Supervised by: Prof. Almetwally Mohamed Mostafa
Date: 5/3/2015
Google Megastore
Outline
• Introduction
• Architecture and data model
• Transactions and concurrency control
• Replication
• Data structures and algorithms
• Failure detection
• Throughput
• Limitations
• Related work
• Experience
• Questions
What is Google Megastore
A database built over Bigtable with high availability
• Widely deployed within Google
• Used by more than 100 applications
• Handles more than 3 billion write and 20 billion read transactions daily
• Stores nearly a petabyte of primary data
• Available on GAE since Jan 2011
Motivation
• Scalability
• Availability
• Consistency
• Responsiveness
• Rapid development
Megastore = RDBMS + NoSQL

RDBMS               NoSQL
Slow performance    High performance
Not scalable        Scalable
Fixed data model    No schema
Easier to code      Complicated
Consistent          Less consistent
Megastore
Toward Availability and Scalability
1. Data replication: Paxos algorithm
2. Data partitioning: entity groups
Entity Group Operations
Design of Megastore
• De-normalized data
Data Model
• The data model is declared in a schema
• Each schema has a set of tables
• A table is either a root table or a child table
• A root entity together with all of its child entities is called an entity group
Sample schema for Photo Sharing Service

CREATE SCHEMA PhotoApp;

CREATE TABLE User {
  required int64 user_id;
  required string name;
} PRIMARY KEY(user_id), ENTITY GROUP ROOT;

CREATE TABLE Photo {
  required int64 user_id;
  required int32 photo_id;
  required int64 time;
  required string full_url;
  optional string thumbnail_url;
  repeated string tag;
} PRIMARY KEY(user_id, photo_id),
  IN TABLE User,
  ENTITY GROUP KEY(user_id) REFERENCES User;

CREATE LOCAL INDEX PhotosByTime
  ON Photo(user_id, time);

CREATE GLOBAL INDEX PhotosByTag
  ON Photo(tag) STORING (thumbnail_url);
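The root/child relationship in the sample schema can be illustrated with a small Python sketch. It is not Megastore code: the `User` and `Photo` classes mirror only some of the schema fields, and `entity_group_key` and `group_entities` are hypothetical helpers introduced here to show that every Photo lands in the entity group of the User whose `user_id` it carries.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class User:          # root table (ENTITY GROUP ROOT)
    user_id: int
    name: str


@dataclass(frozen=True)
class Photo:         # child table (IN TABLE User)
    user_id: int
    photo_id: int
    time: int
    full_url: str


def entity_group_key(entity):
    """Hypothetical helper: both tables use user_id as the entity group key."""
    return entity.user_id


def group_entities(entities):
    """Partition entities into entity groups keyed by the root's user_id."""
    groups = defaultdict(list)
    for entity in entities:
        groups[entity_group_key(entity)].append(entity)
    return dict(groups)


# Made-up sample rows, for illustration only.
entities = [
    User(101, "John"),
    Photo(101, 500, 12345, "http://example.com/500.jpg"),
    Photo(101, 502, 12346, "http://example.com/502.jpg"),
    User(102, "Mary"),
    Photo(102, 600, 12347, "http://example.com/600.jpg"),
]
groups = group_entities(entities)
print({key: len(members) for key, members in groups.items()})
```

Each value in `groups` is one entity group: a root User plus its child Photos, which is the unit the later slides replicate and run transactions against.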
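Indexes
Secondary indexes are supported:
• Local index: finds data within a single entity group; updated atomically with the group's data
• Global index: spans entity groups; not guaranteed to reflect all recent updates
• STORING clause: copies extra properties into the index entries for faster access
• Repeated index: indexes a repeated property, one index entry per value
• Inline index: stores slices of child-entity data in the parent for fast access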
Mapping to Bigtable
Bigtable column name = Megastore table name + property name
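The naming rule above can be sketched in a few lines of Python. The "." separator between table and property name, the comma-joined row key, and the helper names `bigtable_row_key`, `bigtable_column` and `to_cells` are assumptions made for this illustration; the slide itself only states that the column name combines the table name and the property name.

```python
def bigtable_row_key(*primary_key):
    """Assumed layout: join the primary key parts so child rows sort after their root."""
    return ",".join(str(part) for part in primary_key)


def bigtable_column(table_name, property_name):
    """Column name = Megastore table name + property name ('.' separator assumed)."""
    return f"{table_name}.{property_name}"


def to_cells(table_name, primary_key, properties):
    """Flatten one entity into {(row key, column name): value} cells."""
    row = bigtable_row_key(*primary_key)
    return {
        (row, bigtable_column(table_name, name)): value
        for name, value in properties.items()
    }


# Made-up sample entities from the photo-sharing schema.
cells = {}
cells.update(to_cells("User", (101,), {"name": "John"}))
cells.update(to_cells("Photo", (101, 500), {"time": 12345, "tag": ["Dinner", "Paris"]}))

for (row, column), value in sorted(cells.items()):
    print(row, column, value)
```

Because the table name is part of the column name, entities from different Megastore tables can share one Bigtable without their columns colliding.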
Transactions and concurrency control
Concurrency Control
MVCC (multiversion concurrency control)
• Read consistency
  • Current
  • Snapshot
  • Inconsistent reads
• Write consistency
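The three read levels can be contrasted with a toy multiversion store. This is a minimal sketch under stated assumptions, not Megastore's implementation: the `EntityGroup` class, its in-memory log, and all method names are invented here. It assumes a current read first applies every committed write, a snapshot read uses the last fully applied transaction, and an inconsistent read returns the newest stored value regardless of the log.

```python
class EntityGroup:
    """Toy multiversion store for one entity group (hypothetical)."""

    def __init__(self):
        self.versions = {}   # key -> [(timestamp, value), ...] in ascending order
        self.log = []        # committed transactions: (timestamp, {key: value})
        self.applied = 0     # number of log entries fully applied

    def commit(self, ts, mutations):
        self.log.append((ts, dict(mutations)))

    def _write(self, ts, key, value):
        versions = self.versions.setdefault(key, [])
        if not versions or versions[-1][0] < ts:   # idempotent per timestamp
            versions.append((ts, value))

    def apply_pending(self):
        for ts, mutations in self.log[self.applied:]:
            for key, value in mutations.items():
                self._write(ts, key, value)
            self.applied += 1

    def _applied_ts(self):
        return self.log[self.applied - 1][0] if self.applied else 0

    def _read_at(self, key, ts):
        result = None
        for version_ts, value in self.versions.get(key, []):
            if version_ts <= ts:
                result = value
        return result

    def current_read(self, key):
        self.apply_pending()                      # catch up first
        return self._read_at(key, self._applied_ts())

    def snapshot_read(self, key):
        return self._read_at(key, self._applied_ts())

    def inconsistent_read(self, key):
        versions = self.versions.get(key, [])
        return versions[-1][1] if versions else None


g = EntityGroup()
g.commit(1, {"a": "a1", "b": "b1"})
g.apply_pending()
g.commit(2, {"a": "a2", "b": "b2"})
g._write(2, "a", "a2")   # simulate a half-applied transaction

print(g.snapshot_read("a"), g.inconsistent_read("a"), g.inconsistent_read("b"))
print(g.current_read("b"))
```

In the demo, the snapshot read still sees the older value of `a`, the inconsistent read sees a mix of old and new values, and the current read sees the fully applied second transaction.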
Complete transaction lifecycle in Megastore
1. Read
2. Application logic
3. Commit
4. Apply
5. Clean up
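The five steps above can be walked through with a single-process sketch. Everything here is an assumption for illustration: the `EntityGroup` class, the `ConflictError` exception and `run_transaction` are hypothetical names, and a simple check on the log position stands in for the Paxos consensus round that the real commit step uses.

```python
class ConflictError(Exception):
    """Raised when another writer already claimed the log position."""


class EntityGroup:
    def __init__(self):
        self.log = []    # committed log entries, one dict of mutations each
        self.data = {}   # applied entity state

    def read(self):
        """1. Read: note the next log position and take a snapshot of the data."""
        return len(self.log), dict(self.data)

    def commit(self, position, mutations):
        """3. Commit: append only if no other writer took this position first."""
        if position != len(self.log):
            raise ConflictError(f"log position {position} already taken")
        self.log.append(dict(mutations))

    def apply(self, position):
        """4. Apply: write the committed mutations into the entity state."""
        self.data.update(self.log[position])


def run_transaction(group, logic):
    position, snapshot = group.read()
    mutations = logic(snapshot)          # 2. Application logic
    group.commit(position, mutations)
    group.apply(position)
    # 5. Clean up: nothing to delete in this in-memory sketch
    return position


group = EntityGroup()
run_transaction(group, lambda snap: {"count": snap.get("count", 0) + 1})
run_transaction(group, lambda snap: {"count": snap.get("count", 0) + 1})
print(group.data)
```

A writer holding a stale log position gets a `ConflictError` from `commit` and would normally retry from step 1, which is the optimistic-concurrency behaviour this sketch is meant to show.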
Queues
• Example: Calendar application
Two-phase commit
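The calendar example for queues can be sketched as follows. This is a hedged illustration, not Megastore's API: the `Calendar` class, its `inbox` queue, and the functions `send_invitation` and `process_inbox` are hypothetical. Each calendar stands for one entity group; the sender updates only its own group and enqueues messages, and each recipient applies them later in its own transaction.

```python
from collections import deque


class Calendar:
    """One calendar, standing in for one entity group (hypothetical)."""

    def __init__(self, owner):
        self.owner = owner
        self.events = []
        self.inbox = deque()   # queue of messages from other entity groups


def send_invitation(sender, recipients, event):
    """Sender's transaction: record the event and enqueue one message per recipient."""
    sender.events.append(event)
    for recipient in recipients:
        recipient.inbox.append(event)


def process_inbox(calendar):
    """Recipient's later transaction: drain queued messages into its own events."""
    processed = 0
    while calendar.inbox:
        calendar.events.append(calendar.inbox.popleft())
        processed += 1
    return processed


alice, bob, carol = Calendar("alice"), Calendar("bob"), Calendar("carol")
send_invitation(alice, [bob, carol], "standup")
print(bob.events)        # the invitation is queued, not yet applied
process_inbox(bob)
print(bob.events)
```

The design trade-off is that queues make the cross-group update asynchronous: recipients see the event only after they process their inbox, whereas two-phase commit updates all groups atomically at a higher latency cost.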
Paxos
Replication
Modified Paxos
• Fast reads
• Fast writes
New Replica Types
• Full replicas
• Witness replicas
• Read-only replicas
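The difference between the three replica types can be summarized in a small sketch. The properties encoded here reflect my reading of the Megastore design (full replicas vote and hold data, witnesses vote and keep only the log, read-only replicas hold data but do not vote) and are not stated on the slide; the `ReplicaType` class and `quorum_size` helper are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReplicaType:
    name: str
    votes: bool         # participates in Paxos rounds
    stores_data: bool   # holds entity and index data


# Assumed properties of each replica type.
FULL = ReplicaType("full", votes=True, stores_data=True)
WITNESS = ReplicaType("witness", votes=True, stores_data=False)
READ_ONLY = ReplicaType("read-only", votes=False, stores_data=True)


def quorum_size(replicas):
    """Majority of the voting replicas; non-voting replicas are ignored."""
    voters = sum(1 for replica in replicas if replica.votes)
    return voters // 2 + 1


deployment = [FULL, FULL, WITNESS, READ_ONLY]
print(quorum_size(deployment))
print([r.name for r in deployment if r.votes and r.stores_data])
```

In this example deployment the witness acts as a tie-breaker: it adds a vote without the storage cost of a full copy, while the read-only replica serves reads without affecting the quorum.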
Replicated Logs
Data structures and algorithms
Reads
Writes
Failure Detection
• Chubby lock service
Write Throughput
• Sharding entity groups
• Place replicas in the same region
• Bulk processing
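One way to read "sharding entity groups" is to spread a hot write target over several smaller entity groups so that writes no longer serialize on a single log. The sharded counter below is an illustration chosen here, not an example from the slides; `EntityGroupCounter` and `ShardedCounter` are hypothetical names and each shard merely stands in for a separate entity group.

```python
import random


class EntityGroupCounter:
    """One counter living in its own (hypothetical) entity group."""

    def __init__(self):
        self.value = 0

    def increment(self):
        self.value += 1


class ShardedCounter:
    """Spreads increments over several entity groups; reads sum all shards."""

    def __init__(self, num_shards, rng=None):
        self.shards = [EntityGroupCounter() for _ in range(num_shards)]
        self.rng = rng or random.Random()

    def increment(self):
        # Each write touches only one shard, so writers rarely contend.
        self.rng.choice(self.shards).increment()

    def total(self):
        # Reads pay the cost instead: they must visit every shard.
        return sum(shard.value for shard in self.shards)


counter = ShardedCounter(num_shards=4, rng=random.Random(0))
for _ in range(100):
    counter.increment()
print(counter.total())
```

The trade-off is the usual one for sharding: write throughput grows with the number of shards, while a consistent total requires reading across entity groups.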
Limitations
• Latency
• Chain gang throttling
• Does not enforce policies on physical layout
Experience
Related Work
• NoSQL: Bigtable, Cassandra, Yahoo PNUTS, Amazon SimpleDB
• Data replication process: HBase, CouchDB, Dynamo
• Paxos algorithm: Scalaris, Keyspace
Questions
1. Megastore is built upon ____
2. Synchronous replication is based upon ____
3. Data is partitioned into a vast space of small databases, each with its own replicated ____
4. The data model is declared in a strongly typed ____
5. Megastore tables are either ____ or ____ tables
6. 3 levels of read consistency: ____
7. Cross entity group updates are supported by: ____
Answers
1. Bigtable
2. Paxos
3. Log
4. Schema
5. Root, child
6. Current, snapshot, inconsistent
7. Two-phase commit
Any Questions?