8/10/2019 Main Memory Databases
1/35
EECS 584
Fall 2014
Main Memory Databases
Presented by Nate Harada
Main Memory Database Systems:
An Overview
Garcia-Molina and Salem, IEEE, 1992
H-Store: A High-Performance,
Distributed Main Memory
Transaction Processing System
Kallman et al., Brown, 2008
Jones et al., MIT
Hugg, Vertica
Abadi, Yale
Covering this Talk
Overview of Main Memory Databases
Specific Considerations
  Concurrency Control
  Commit Processing
  Access Methods
  Query Processing
  Miscellaneous
Case Study: H-Store
The Landscape
*From Stonebraker's 2007 VLDB Presentation on H-Store
Solutions
*From Stonebraker's 2007 VLDB Presentation on H-Store
DBMS vs MMDB
Why Main Memory DB?
Main memory is faster
  Disk access is traditionally the bottleneck
  Random access is just as fast as sequential access
Main memory is simpler
  Fast access simplifies concurrency
  No caching
Why Not Main Memory DB?
Main memory is volatile
  Have to somehow recover data if the system crashes
Main memory is expensive
  Limited to small databases (for now)
https://www.ic.gc.ca/eic/site/oca-bc.nsf/eng/ca02093.html
Concurrency Control
Faster access means larger lock granules
  Up to the whole database
Can change lock structure
  Store lock info with the data itself (a lock bit next to the file data) instead of in a hash table
  [Figure: a record laid out as Lock Bit | File Data]
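The lock-bit layout can be sketched roughly as follows. This is a minimal illustration, assuming a hypothetical `Record` class; the names are not taken from the paper or from any real system. The point is that the lock state lives in the record, so taking a lock needs no lookup in a separate lock table.

```python
import threading


class Record:
    """A memory-resident record that carries its own lock bit (hypothetical layout)."""

    # One short-lived latch guards updates to every lock bit; a real system
    # would more likely use an atomic test-and-set instruction instead.
    _latch = threading.Lock()

    def __init__(self, data):
        self.locked = False  # the "lock bit", stored next to the data
        self.data = data

    def try_lock(self):
        """Set the lock bit if it is clear; return True on success."""
        with Record._latch:
            if self.locked:
                return False
            self.locked = True
            return True

    def unlock(self):
        """Clear the lock bit."""
        with Record._latch:
            self.locked = False


account = Record({"balance": 100})
first = account.try_lock()   # lock bit was clear, so this succeeds
second = account.try_lock()  # lock bit already set, so this fails
account.unlock()
third = account.try_lock()   # free again after unlock
```

With a coarse granule (up to the whole database), the same bit could sit on one top-level object instead of on every record.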
Commit Processing
Logging becomes a bottleneck if we write to disk
We could use stable main memory
  Non-volatile RAM just coming out
  Hold the log tail there and move it to disk continuously
We could do group commits
  Trade latency for throughput
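A rough sketch of group commit, assuming a hypothetical `GroupCommitLog` class in which a Python list stands in for the on-disk log. Commit records are buffered and written out together, so several transactions share one disk write.

```python
class GroupCommitLog:
    """Buffers commit records and writes them to stable storage in batches."""

    def __init__(self, group_size):
        self.group_size = group_size
        self.pending = []      # commit records waiting for the next flush
        self.stable = []       # stands in for the on-disk log
        self.flush_count = 0   # number of simulated disk writes

    def commit(self, txn_id):
        """Queue a commit record; flush once a full group has accumulated."""
        self.pending.append(txn_id)
        if len(self.pending) >= self.group_size:
            self.flush()

    def flush(self):
        """Write all pending records with a single simulated disk write."""
        if not self.pending:
            return
        self.stable.extend(self.pending)
        self.pending.clear()
        self.flush_count += 1


log = GroupCommitLog(group_size=4)
for txn_id in range(10):
    log.commit(txn_id)
log.flush()  # force out the last partial group
```

Here ten commits cost three simulated writes instead of ten. The price is that a transaction is not durable until its group is flushed, which is the latency side of the trade.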
Recovery
How do we deal with crashes?
Most MMDB systems dump to disk occasionally
  Generally this is the entire database
  Trade-off between frequent dumps (up to date) and infrequent dumps (good performance)
Could also have multiple machines (redundancy)
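One way to picture dump-based recovery, as a minimal sketch: `TinyMMDB` is a hypothetical class, and JSON strings stand in for the dump file and for a log of changes made since the last dump. Pairing the dump with a log is an assumption of this sketch, not something the slide specifies.

```python
import json


class TinyMMDB:
    """In-memory key-value store with an occasional full dump plus a log."""

    def __init__(self):
        self.data = {}
        self.checkpoint = "{}"  # stands in for the dump file on disk
        self.log = []           # stands in for the on-disk log since the dump

    def put(self, key, value):
        """Log the change first, then apply it in memory."""
        self.log.append(json.dumps([key, value]))
        self.data[key] = value

    def dump(self):
        """Write the entire database to 'disk' and truncate the log."""
        self.checkpoint = json.dumps(self.data)
        self.log = []

    def crash_and_recover(self):
        """Lose main memory, then rebuild it from the dump and the log."""
        self.data = json.loads(self.checkpoint)
        for entry in self.log:
            key, value = json.loads(entry)
            self.data[key] = value


db = TinyMMDB()
db.put("a", 1)
db.dump()
db.put("b", 2)
db.put("a", 3)
db.crash_and_recover()
```

Dumping more often keeps the log short and recovery quick, but every dump copies the whole database, which is the performance cost of frequent dumps.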
Access Methods
Data Representation
  Pointers in index structures
  Pointers to communicate with client
Index Structures: T-Trees vs B-Trees
  We can use deeper trees
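A simplified sketch of a T-Tree lookup, to show what a deeper, pointer-based tree looks like in memory. `TNode` and `t_tree_search` are illustrative names; insertion and rebalancing are left out, and every node is assumed to hold at least one key.

```python
from bisect import bisect_left


class TNode:
    """A T-Tree node: a sorted run of keys plus left and right children."""

    def __init__(self, keys, left=None, right=None):
        self.keys = sorted(keys)
        self.left = left
        self.right = right


def t_tree_search(node, key):
    """Return True if key is stored in the tree rooted at node."""
    while node is not None:
        if key < node.keys[0]:
            node = node.left       # key is below this node's range
        elif key > node.keys[-1]:
            node = node.right      # key is above this node's range
        else:
            # The key falls inside this node's range: binary search in place.
            i = bisect_left(node.keys, key)
            return node.keys[i] == key
    return False


root = TNode([40, 45, 50],
             left=TNode([10, 20, 30]),
             right=TNode([60, 70, 80]))
found = t_tree_search(root, 20)
missing = t_tree_search(root, 47)
```

Each step down the tree is a pointer dereference rather than a disk read, which is why a binary, deeper tree is acceptable in main memory while a disk-based B-Tree is kept shallow and wide.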
Query Processing
Sequential access no longer important
  Can create different data structures, e.g. DBGraph
Performance and scheduling of backups matters
Disk access time is no longer important
  Cost estimation is different
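A small illustration of why the cost picture changes. When tuples in memory hold direct references to related tuples, a join can follow pointers instead of sorting or hashing on a key. The classes below are hypothetical and this is not the DBGraph model itself, only a sketch of the pointer-following idea.

```python
class Department:
    def __init__(self, name):
        self.name = name


class Employee:
    def __init__(self, name, dept):
        self.name = name
        self.dept = dept  # direct reference to the Department tuple


def join_by_pointer(employees):
    """Join employees with departments by following stored references."""
    return [(e.name, e.dept.name) for e in employees]


toys = Department("Toys")
books = Department("Books")
staff = [Employee("Ann", toys), Employee("Bob", books), Employee("Cy", toys)]
pairs = join_by_pointer(staff)
```

The work here is one dereference per employee, so a cost estimate would count CPU operations rather than disk page reads.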
Miscellaneous
Applications can be given actual memory positions for reads
  Can even give direct access for writes
  Dangerous! H-Store solves this with precompiled procedures
How do we determine where to store items we've migrated to disk?
  This issue has no traditional counterpart
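The precompiled-procedure idea can be sketched as a registry. This is an illustration under assumptions: `ProcedureStore` and `deposit` are hypothetical names and do not reflect H-Store's actual interface. Clients never receive memory positions; they can only invoke procedures that were installed ahead of time.

```python
class ProcedureStore:
    """Holds data privately and exposes only pre-registered procedures."""

    def __init__(self):
        self._data = {}        # never handed out to clients
        self._procedures = {}

    def register(self, name, func):
        """Install a procedure ahead of time (at deploy time in this sketch)."""
        self._procedures[name] = func

    def call(self, name, *args):
        """Run a registered procedure by name; unknown names are rejected."""
        if name not in self._procedures:
            raise KeyError(f"unknown procedure: {name}")
        return self._procedures[name](self._data, *args)


def deposit(data, account, amount):
    """Example procedure: add amount to an account and return the new balance."""
    data[account] = data.get(account, 0) + amount
    return data[account]


store = ProcedureStore()
store.register("deposit", deposit)
balance = store.call("deposit", "alice", 50)
balance = store.call("deposit", "alice", 25)
```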
H-Store
ONE CPU
Architecture
Deploy Time
Procedures are compiled
Layout determined by administrator
Database optimized at deploy time
Runtime
Transactions
  Initiated at one site; that site fulfills the transaction
Special Cases
  Single-sited: Transaction runs on only one site
  One-shot: Each query in transaction runs on only one site
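The two special cases can be told apart from the sites each query touches. The sketch below follows the definitions on this slide only; `classify` and its input format (one set of site ids per query, at least one query) are assumptions made for illustration.

```python
def classify(query_sites):
    """Classify a transaction from the set of sites each of its queries touches.

    query_sites is a list with one set of site ids per query.
    """
    all_sites = set().union(*query_sites)
    if len(all_sites) == 1:
        return "single-sited"   # the whole transaction stays on one site
    if all(len(sites) == 1 for sites in query_sites):
        return "one-shot"       # each query stays on one site
    return "general"            # some query spans several sites


a = classify([{1}, {1}])        # every query on site 1
b = classify([{1}, {2}])        # each query on one site, but different sites
c = classify([{1, 2}, {3}])     # first query spans two sites
```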
Single-Sited Transaction
[Figure sequence: a Client and a DATA site, shown in three steps labeled Request, Redirect, and Response]
Multi-Sited Transaction
[Figure: a Client issuing a Request, with three DATA sites]
Locking
H-Store has no locks; we just execute one transaction at a time on a site
Concurrency achieved by partitioning data across machines
  We simply hope that a transaction doesn't need data on multiple partitions
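A minimal sketch of lock-free execution through partitioning. `Partition`, `Cluster`, and the modulo placement rule are assumptions for illustration, not H-Store's actual design. Each partition runs one transaction at a time, so a transaction that touches a single partition needs no locks.

```python
class Partition:
    """One site: owns a slice of the data and runs transactions serially."""

    def __init__(self):
        self.data = {}

    def execute(self, txn, key, *args):
        # Only one transaction runs here at a time, so no locks are taken.
        return txn(self.data, key, *args)


class Cluster:
    def __init__(self, num_partitions):
        self.partitions = [Partition() for _ in range(num_partitions)]

    def site_for(self, key):
        """Map an integer key to its partition (simple modulo placement)."""
        return key % len(self.partitions)

    def run(self, txn, key, *args):
        """Route a single-partition transaction to the site that owns its key."""
        return self.partitions[self.site_for(key)].execute(txn, key, *args)


def add_stock(data, item_id, quantity):
    """Example transaction: add quantity to an item and return the new total."""
    data[item_id] = data.get(item_id, 0) + quantity
    return data[item_id]


cluster = Cluster(num_partitions=2)
cluster.run(add_stock, 7, 5)
total = cluster.run(add_stock, 7, 3)
cluster.run(add_stock, 4, 1)
```

Transactions on different partitions could run in parallel on different machines. A transaction that needs keys from several partitions does not fit this sketch, which is the multi-sited case.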