Caching with “Good Enough” Currency, Consistency, and Completeness
Hongfei Guo, University of Wisconsin
Per-Åke Larson, Microsoft Research
Raghu Ramakrishnan, University of Wisconsin
Motivation — Scaling Google
…
Motivation: Scaling a DBMS by Caching
[Diagram: Application Servers (app-specific code), Caching DBMS, Backend DBMS; updates go to the Backend DBMS and are propagated to the Caching DBMS asynchronously]
Problem: How to tell whether the cached data is “good enough” for an application?
  NO data quality requirements from the applications!
  NO data quality guarantees from the caching DBMS!
Big Picture
Fine-grained data quality-aware database caching model
  Apps: Specify data quality requirements in queries [SIGMOD 2004] [SIGMOD 2004 Demo]
  Cache admin: Specifies local data quality; Cache: Keeps track of local data quality [VLDB 2005]
  Query processing: Enforces data quality constraints [SIGMOD 2004] [VLDB 2005]
  System performance evaluation [ongoing work]
[Diagram: Application Servers, Caching DBMS, Backend DBMS]
Contributions
Goal: fine-grained data quality-aware cache management
Problems
  How does the cache track data quality?
  How does the admin specify cache properties?
  How to maintain the cache efficiently?
  How to enforce data quality constraints for queries?
A comprehensive solution
  Cache properties
  Dynamic cache model
  Efficient cache maintenance and “safety”
  Efficient enforcement of data quality checking
Review: Data Quality Metrics (informal)
Currency: The time elapsed since this copy became stale
Consistency: A query result is (snapshot) consistent iff it is the same as if it had been evaluated on a single snapshot of the master database
C&C: Currency & Consistency
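The two informal metrics above can be sketched in a few lines of Python. This is an illustration only, not the system's implementation: the `CachedCopy` class, its field names, and the idea of tagging each copy with a snapshot id are assumptions made for the example. Checking for a single shared snapshot id is a sufficient (not necessary) test for snapshot consistency.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CachedCopy:
    """A cached object plus bookkeeping (all field names are hypothetical)."""
    key: str
    snapshot_id: int    # master-database snapshot the copy was taken from
    stale_point: float  # time in seconds at which the copy became stale


def currency(copy: CachedCopy, now: float) -> float:
    """Time elapsed since the copy became stale; 0 if it is not stale yet."""
    return max(0.0, now - copy.stale_point)


def is_consistent(copies: list[CachedCopy]) -> bool:
    """Simplified snapshot consistency: all copies share one snapshot id."""
    return len({c.snapshot_id for c in copies}) <= 1


a = CachedCopy("a", snapshot_id=7, stale_point=100.0)
b = CachedCopy("b", snapshot_id=7, stale_point=130.0)
c = CachedCopy("c", snapshot_id=8, stale_point=90.0)
```

With these sample values, `a` and `b` are mutually consistent while `a` and `c` are not, even though each copy on its own may still be current enough for a given query.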
Review: Proposed SQL Syntax
SELECT * FROM Books B, Reviews R
WHERE B.bid = R.bid AND B.title = 'Databases'
Example currency clauses:
  CURRENCY BOUND 10 min ON (B, R)
  CURRENCY BOUND 10 min ON (B), 30 min ON (R)
  CURRENCY BOUND 10 min ON (B, R) BY B.bid
Clause parts: currency bound (10 min), consistency class ((B, R)), group by (BY B.bid)

BookCopy:
bid  title      author
1    databases  Raghu
2    databases  Ullman

ReviewCopy:
rid  bid  text
1    1    …
2    1    …
3    2    …

Query result:
bid  title      author  bid  rid  text
1    databases  Raghu   1    1    …
1    databases  Raghu   1    2    …
2    databases  Ullman  2    3    …
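One way to read the three clause variants is as a list of consistency classes, each with its own currency bound and an optional grouping. The Python sketch below models that reading; the `ConsistencyClass` type and the helper functions are hypothetical names introduced for illustration and are not part of the proposed syntax.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConsistencyClass:
    """One 'bound ON (tables) [BY columns]' item of a currency clause."""
    tables: tuple[str, ...]          # table aliases that must be mutually consistent
    bound_minutes: float             # currency bound for those tables
    group_by: tuple[str, ...] = ()   # optional BY columns


# CURRENCY BOUND 10 min ON (B, R)
one_class = [ConsistencyClass(("B", "R"), 10)]
# CURRENCY BOUND 10 min ON (B), 30 min ON (R)
two_classes = [ConsistencyClass(("B",), 10), ConsistencyClass(("R",), 30)]
# CURRENCY BOUND 10 min ON (B, R) BY B.bid
grouped = [ConsistencyClass(("B", "R"), 10, ("B.bid",))]


def bound_for(clause, table):
    """Currency bound that applies to a table alias, or None if unlisted."""
    for cls in clause:
        if table in cls.tables:
            return cls.bound_minutes
    return None


def must_be_mutually_consistent(clause, t1, t2):
    """True if both aliases fall into the same consistency class."""
    return any(t1 in cls.tables and t2 in cls.tables for cls in clause)
```

Under this reading, the second variant lets B and R come from different snapshots as long as each meets its own bound, and the third variant is understood here as requiring consistency only among rows that share the same `B.bid`.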
Roadmap
  Background
  Cache data quality properties
  Cache property specification
  Enforcing data quality constraints
  Future directions and conclusions
Cache Properties
Why Define Cache Properties?
Cache properties = contract between query processing and cache maintenance
[Diagram: queries with relaxed C&C requirements enter query processing, which returns results]
Cache Properties (P+3C)
  Presence: per object
  Consistency: a set of objects
  Completeness: per predicate
  Currency: object staleness
Describe local data status
Presence
Example: SELECT * FROM Authors A WHERE authorId = 1
Question: Is an object present in the cache?
Consistency and Currency
Example: SELECT * FROM Authors A WHERE authorId IN (1, 2, 3) CURRENCY BOUND 10 min ON (A)
Question: Is a set of objects consistent and no more than 10 minutes old?
Completeness
Example: SELECT * FROM Authors A WHERE city = 'Madison'
Question: Are ALL authors from Madison in the cache?
Basic Concepts
[Diagram: the Master Database with its tables, objects, and snapshots H1 and H2; the Cache holding View 1, View 2, and View 3]
Cache Property Examples
[Diagram: the Master Database with snapshots H1 and H2 and the Cache holding View 1, View 2, and View 3, annotated with Present, Complete, Consistent, and Stale point]
Currency = now - stale point
Specifying Cache Properties
Specified as integrity constraints
  Single view
    Presence constraint
    Consistency constraint
    Completeness constraint
  Between two views
    Presence correlation constraint
    Consistency correlation constraint
Presence Constraint

AuthorCopy (in the Caching DBMS, populated from the Backend DBMS):
authorId  name    city
1         Alice   Madison
2         Bob     Madison
3         Cedric  Seattle

AuthorList_PCT (control table; control key: authorId):
authorId
1
2
3

CREATE VIEW AuthorCopy AS SELECT * FROM Authors
CREATE TABLE AuthorList_PCT (authorId int)
ALTER VIEW AuthorCopy ADD PRESENCE ON authorId IN (SELECT authorId FROM AuthorList_PCT)

AuthorCopy is a partially materialized view [Zhou et al 2005].
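The effect of a presence constraint can be sketched as a filter over the backend table driven by the control table. The Python below is an illustration under stated assumptions, not the prototype's mechanism: tables are modeled as lists of dicts, the function names are hypothetical, and the fourth author row is invented purely to show a row that stays out of the cache.

```python
# Backend table Authors. Rows 1-3 come from the slide; row 4 is hypothetical
# and exists only to show a row that is NOT materialized in the cache.
AUTHORS = [
    {"authorId": 1, "name": "Alice", "city": "Madison"},
    {"authorId": 2, "name": "Bob", "city": "Madison"},
    {"authorId": 3, "name": "Cedric", "city": "Seattle"},
    {"authorId": 4, "name": "Dana", "city": "Boston"},
]

# Presence control table: the control-key values that must be cached.
AUTHOR_LIST_PCT = {1, 2, 3}


def materialize_author_copy(authors, author_list_pct):
    """Keep only rows whose control key (authorId) is in the control table."""
    return [row for row in authors if row["authorId"] in author_list_pct]


def is_present(author_id, author_list_pct):
    """Presence check for a point query such as WHERE authorId = 1."""
    return author_id in author_list_pct


author_copy = materialize_author_copy(AUTHORS, AUTHOR_LIST_PCT)
```

Changing the contents of `AUTHOR_LIST_PCT` changes which rows are cached without redefining the view, which is the point of driving materialization from a control table.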
Consistency Constraint

AuthorCopy:
authorId  name    city
1         Alice   Madison
2         Bob     Madison
3         Cedric  Seattle

AuthorList_PCT:
authorId
1
2
3

CityList_CsCT:
city
Madison

CREATE TABLE CityList_CsCT (city string)
ALTER VIEW AuthorCopy ADD CONSISTENCY ON city IN (SELECT city FROM CityList_CsCT)

[Diagram labels: Backend DBMS, Cache Region]
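A consistency constraint can be illustrated as a check over the cached rows selected by the control table. The sketch below makes two assumptions that the slide does not state: each cached row carries the id of the master snapshot it was refreshed from (the `snapshot` field is hypothetical), and the constraint is interpreted per control-key value, so rows sharing a listed city must come from one snapshot.

```python
from collections import defaultdict


def consistent_per_city(author_copy, city_list_csct):
    """Map each listed city to True if its cached rows share one snapshot.

    Assumes a hypothetical 'snapshot' field on every cached row.
    """
    snapshots = defaultdict(set)
    for row in author_copy:
        if row["city"] in city_list_csct:
            snapshots[row["city"]].add(row["snapshot"])
    return {city: len(snapshots[city]) <= 1 for city in city_list_csct}


CITY_LIST_CSCT = ["Madison"]

in_sync = [
    {"authorId": 1, "name": "Alice", "city": "Madison", "snapshot": 5},
    {"authorId": 2, "name": "Bob", "city": "Madison", "snapshot": 5},
    {"authorId": 3, "name": "Cedric", "city": "Seattle", "snapshot": 4},
]
# Same rows, but Bob was refreshed from a later snapshot than Alice.
out_of_sync = [dict(row) for row in in_sync]
out_of_sync[1]["snapshot"] = 6
```

In the first data set the Seattle row may lag behind without violating anything, because Seattle is not listed in the control table.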
Completeness Constraint

AuthorCopy:
authorId  name    city
1         Alice   Madison
2         Bob     Madison
3         Cedric  Seattle

AuthorList_PCT:
authorId
1
3

CityList_CpCT:
city
Madison
New York

CREATE TABLE CityList_CpCT (city string)
ALTER VIEW AuthorCopy ADD COMPLETENESS ON city IN (SELECT city FROM CityList_CpCT)

[Diagram label: Backend DBMS]
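The completeness example can be replayed in Python to see why all three authors end up in AuthorCopy even though AuthorList_PCT lists only authors 1 and 3. The sketch assumes that a row is cached if either control table asks for it; that combination rule and the function names are assumptions made for illustration.

```python
AUTHORS = [
    {"authorId": 1, "name": "Alice", "city": "Madison"},
    {"authorId": 2, "name": "Bob", "city": "Madison"},
    {"authorId": 3, "name": "Cedric", "city": "Seattle"},
]
AUTHOR_LIST_PCT = {1, 3}                  # presence control table
CITY_LIST_CPCT = {"Madison", "New York"}  # completeness control table


def required_rows(authors, author_list_pct, city_list_cpct):
    """Rows the cache must hold: listed authors plus ALL rows of listed cities."""
    return [
        row for row in authors
        if row["authorId"] in author_list_pct or row["city"] in city_list_cpct
    ]


def complete_for_city(city, city_list_cpct):
    """Can 'WHERE city = <city>' be answered from the cache alone?"""
    return city in city_list_cpct


author_copy = required_rows(AUTHORS, AUTHOR_LIST_PCT, CITY_LIST_CPCT)
```

Bob is cached only because Madison is listed for completeness. Cedric is present, yet a query for all Seattle authors cannot rely on the cache, since nothing guarantees that every Seattle author is there.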
Presence Correlation Constraint

AuthorList_PCT:
authorId
1
2
3

AuthorCopy:
authorId  name    city
1         Alice   Madison
2         Bob     Madison
3         Cedric  Seattle

BookCopy:
isbn  authorId  title
111   1         aaa
222   1         bbb
333   2         ccc
444   3         ddd
555   3         eee

ALTER VIEW BookCopy ADD PRESENCE ON authorId IN (SELECT authorId FROM AuthorCopy)

[Diagram labels: Backend DBMS]
Cache schema graph: AuthorList_PCT -(authorId)-> AuthorCopy -(authorId)-> BookCopy
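A presence correlation constraint makes one cached view act as the control table of another, so changes to AuthorList_PCT cascade through AuthorCopy to BookCopy. The Python below sketches that cascade; the list-of-dicts representation and the function names are hypothetical and only meant to illustrate the dependency chain.

```python
AUTHORS = [
    {"authorId": 1, "name": "Alice", "city": "Madison"},
    {"authorId": 2, "name": "Bob", "city": "Madison"},
    {"authorId": 3, "name": "Cedric", "city": "Seattle"},
]
BOOKS = [
    {"isbn": 111, "authorId": 1, "title": "aaa"},
    {"isbn": 222, "authorId": 1, "title": "bbb"},
    {"isbn": 333, "authorId": 2, "title": "ccc"},
    {"isbn": 444, "authorId": 3, "title": "ddd"},
    {"isbn": 555, "authorId": 3, "title": "eee"},
]


def materialize_author_copy(authors, author_list_pct):
    """AuthorCopy: rows whose authorId is in the presence control table."""
    return [a for a in authors if a["authorId"] in author_list_pct]


def materialize_book_copy(books, author_copy):
    """BookCopy: rows whose authorId appears in AuthorCopy (the correlation)."""
    cached_authors = {a["authorId"] for a in author_copy}
    return [b for b in books if b["authorId"] in cached_authors]


def refresh(author_list_pct):
    """Recompute both views in dependency order."""
    author_copy = materialize_author_copy(AUTHORS, author_list_pct)
    return author_copy, materialize_book_copy(BOOKS, author_copy)
```

Calling `refresh({1})` leaves only Alice in AuthorCopy and only her two books in BookCopy, while `refresh({1, 2, 3})` reproduces the tables shown on the slide.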
Consistency Correlation Constraint

AuthorList_PCT:
authorId
1
2
3

AuthorCopy:
authorId  name    city
1         Alice   Madison
2         Bob     Madison
3         Cedric  Seattle

BookCopy:
isbn  authorId  title
111   1         aaa
222   1         bbb
333   2         ccc
444   3         ddd
555   3         eee

ALTER VIEW BookCopy ADD CONSISTENCY ROOT

[Diagram labels: Backend DBMS]
Cache schema graph: AuthorList_PCT -(authorId)-> AuthorCopy -(authorId)-> BookCopy
Cache Schema Example
[Diagram: cache schema graph over AuthorList_PCT, AuthorCopy, BookCopy, ReviewerList_PCT, ReviewerCopy, ReviewCopy, and CityList_CsCT, with edges labeled authorId, authorId, isbn, reviewId, and reviewerId]
Extension to the Optimizer
Compile-time consistency checking
Run-time currency and inexpensive consistency checking
Cost estimation
Run-time C&C Checking
Currency guard: Check if local view V satisfies the currency requirement
Consistency guard: Check if local view V satisfies the consistency requirement
[Plan diagram: a ChoosePlan operator with a C&C guard selects between a local plan using V and a remote plan requesting E]
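The guarded plan can be sketched as a function that evaluates both guards at run time and falls back to the remote plan if either fails. This is a minimal illustration, assuming the cache records a stale point and a consistency flag per view; `LocalView`, its fields, and the function names are hypothetical rather than the optimizer's actual operators.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class LocalView:
    """Run-time state of local view V (field names are hypothetical)."""
    stale_point: float  # seconds; when the cached view became stale
    consistent: bool    # outcome of the inexpensive consistency check


def currency_guard(view: LocalView, bound_seconds: float, now: float) -> bool:
    """True if V's currency (now - stale point) is within the query's bound."""
    return now - view.stale_point <= bound_seconds


def consistency_guard(view: LocalView) -> bool:
    """True if V satisfies the query's consistency requirement."""
    return view.consistent


def choose_plan(view: LocalView, bound_seconds: float, now: float,
                local_plan: Callable[[], str],
                remote_plan: Callable[[], str]) -> str:
    """Run the local plan only if both guards pass, else the remote plan."""
    if currency_guard(view, bound_seconds, now) and consistency_guard(view):
        return local_plan()
    return remote_plan()


def run(view: LocalView, now: float) -> str:
    """Example query with a 10-minute currency bound."""
    return choose_plan(view, 600.0, now,
                       local_plan=lambda: "local plan using V",
                       remote_plan=lambda: "remote plan requesting E")
```

Because the decision is made when the query runs, the same compiled plan can serve queries before and after a refresh without recompilation.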
Future Directions
Improve current prototype
  Read-write transactions?
Adaptive data quality aware caching policies
  Control-table content?
  Refresh intervals?
Automate cache design/tuning
  How to get a good cache schema? (i.e., cache region granularity, assignment)
Comprehensive performance evaluation
  Cache configurations?
  Comparison with other replication solutions?
Summary
Goal: fine-grained data quality-aware cache management
A comprehensive solution
  Four cache properties
  Dynamic cache model
  Efficient cache maintenance and “safety”
  Efficient enforcement of C&C checking
Questions?
So long, and thanks for all the fish!
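Simple Consistency Guards Overhead
[Bar chart: execution time (ms), axis from 0 to 80, for queries Qa, Qb, and Qc under local and remote plans; each bar is split into query time and consistency guard time. Overhead labels on the chart: 16.56%, 14.00%, 1.72%, 1.59%, 1.66%, 1.6%]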
Single Table Consistency Guard Overhead (Qa is used)
[Bar chart: execution time (ms), axis from 0 to 7, for guards A11a, A11b, A12, S11, and S12 under local and remote plans; each bar is split into query time and consistency guard time. Overhead labels on the chart: 62.85%, 16.98%, 71.41%, 6.06%, 8.79%, 7.48%, 2.33%, 4.95%, 58.32%, 23.77%]