Main Memory Databases

  • 8/10/2019 Main Memory Databases

    1/35

    EECS 584

    Fall 2014

    Main Memory Databases

    Presented by Nate Harada

    Main Memory Database Systems: An Overview

    Garcia-Molina and Salem, IEEE, 1992

    H-Store: A High-Performance, Distributed Main Memory Transaction Processing System

    Kallman et al., Brown, 2008

    Jones et al., MIT

    Hugg, Vertica

    Abadi, Yale

    Covering this Talk

    Overview of Main Memory Databases
    Specific Considerations
        Concurrency Control
        Commit Processing
        Access Methods
        Query Processing
        Miscellaneous
    Case Study: H-Store

    The Landscape

    *From Stonebraker's 2007 VLDB presentation on H-Store

    Solutions

    *From Stonebraker's 2007 VLDB presentation on H-Store

    DBMS vs MMDB

    Why Main Memory DB?

    Main memory is faster
        Disk access is traditionally the bottleneck
        Random access is just as fast as sequential access
    Main memory is simpler
        Fast access simplifies concurrency
        No caching

    Why Not Main Memory DB?

    Main memory is volatile
        Data must somehow be recovered after a crash
    Main memory is expensive
        Limited to small databases (for now)

    https://www.ic.gc.ca/eic/site/oca-bc.nsf/eng/ca02093.html

    Concurrency Control

    Faster access makes larger lock granules practical
        Up to the whole database
    Can change lock structure
        Store lock info with the files instead of in a hash table
        Layout: [ Lock Bit | File Data ]
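
    The lock-bit layout above can be sketched roughly as follows. This is a minimal illustration, not code from either paper: the `Record` class and its `try_lock` and `unlock` methods are hypothetical names, and a Python `threading.Lock` stands in for whatever atomic test-and-set a real system would use on the bit.

```python
import threading


class Record:
    """A hypothetical in-memory record that carries its own lock bit."""

    def __init__(self, data):
        self.data = data
        self.locked = False             # the "lock bit" stored next to the data
        self._guard = threading.Lock()  # makes the test-and-set of the bit atomic

    def try_lock(self):
        """Set the lock bit if it is clear; return True on success."""
        with self._guard:
            if self.locked:
                return False
            self.locked = True
            return True

    def unlock(self):
        """Clear the lock bit."""
        with self._guard:
            self.locked = False


record = Record({"balance": 100})
first = record.try_lock()    # bit was clear, so this acquires the lock
second = record.try_lock()   # bit is already set, so this fails
record.unlock()
third = record.try_lock()    # succeeds again after the unlock
```

    Checking the bit is a single memory access on the record itself, with no separate lock table lookup.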

    Commit Processing

    Logging becomes a bottleneck if we write to disk
    We could use stable main memory
        Non-volatile RAM just coming out
        Hold log tail and move to disk constantly
    We could do group commits
        Trade latency for throughput
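
    A rough sketch of group commit, under stated assumptions: `GroupCommitLog`, `group_size`, and the in-memory `flushed` list (which stands in for the on-disk log) are all hypothetical, and a real system would likely also flush on a timer rather than only when a group fills.

```python
class GroupCommitLog:
    """Hypothetical log that buffers commit records and flushes them in batches."""

    def __init__(self, group_size):
        self.group_size = group_size
        self.pending = []      # commit records waiting for the next flush
        self.flushed = []      # stands in for the on-disk log
        self.flush_count = 0   # number of simulated disk writes

    def commit(self, txn_id):
        """Queue a commit record; flush once the group is full."""
        self.pending.append(txn_id)
        if len(self.pending) >= self.group_size:
            self.flush()

    def flush(self):
        """Write all pending records with one simulated disk write."""
        if not self.pending:
            return
        self.flushed.extend(self.pending)
        self.pending.clear()
        self.flush_count += 1


log = GroupCommitLog(group_size=4)
for txn_id in range(10):
    log.commit(txn_id)
log.flush()  # flush the partial final group
```

    Ten commits cost three simulated disk writes instead of ten. The price is latency: a transaction is not durable until its group is flushed.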

    Recovery

    How do we deal with crashes?

    Recovery

    Most MMDB systems dump to disk occasionally
        Generally this is the entire database
        Trade-off between frequent (up to date) and infrequent (good performance)
    Could also have multiple machines (redundancy)
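
    One way to picture the dump trade-off is a full snapshot plus a log of the updates made since. Pairing the dump with a log is an assumption drawn from the commit-processing slide, not something this slide states, and `CheckpointedStore` with its method names is hypothetical. Plain Python objects stand in for the disk.

```python
import copy


class CheckpointedStore:
    """Hypothetical in-memory store with a full dump and a redo log."""

    def __init__(self):
        self.data = {}         # the live, volatile database
        self.checkpoint = {}   # stands in for the last full dump on disk
        self.log = []          # stands in for log records written since the dump

    def put(self, key, value):
        """Log the update, then apply it in memory."""
        self.log.append((key, value))
        self.data[key] = value

    def take_checkpoint(self):
        """Dump the entire database and truncate the log."""
        self.checkpoint = copy.deepcopy(self.data)
        self.log.clear()

    def crash_and_recover(self):
        """Lose memory, reload the dump, then replay the log in order."""
        self.data = copy.deepcopy(self.checkpoint)
        for key, value in self.log:
            self.data[key] = value


store = CheckpointedStore()
store.put("a", 1)
store.take_checkpoint()
store.put("b", 2)
store.put("a", 3)
replay_length = len(store.log)   # work that recovery must redo
store.crash_and_recover()
```

    Frequent dumps keep the log short and recovery fast; infrequent dumps cost less at runtime but leave more log to replay.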

    Access Methods

    Data Representation
        Pointers in index structures
        Pointers to communicate with client
    Index Structures: T-Trees vs B-Trees
        We can use deeper trees
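
    As a rough illustration of the T-Tree idea, the sketch below searches a binary tree whose nodes each hold a sorted run of keys. It covers lookup only, on a hand-built tree, and leaves out insertion and rebalancing. `TTreeNode` and `ttree_contains` are hypothetical names, and plain integers stand in for the record pointers a real node would typically hold.

```python
from bisect import bisect_left


class TTreeNode:
    """Hypothetical T-Tree node: a sorted run of keys plus two children."""

    def __init__(self, keys, left=None, right=None):
        self.keys = sorted(keys)   # assumed non-empty
        self.left = left
        self.right = right


def ttree_contains(node, key):
    """Walk down to the node whose key range bounds `key`, then search inside it."""
    while node is not None:
        if key < node.keys[0]:
            node = node.left       # key is below this node's range
        elif key > node.keys[-1]:
            node = node.right      # key is above this node's range
        else:
            i = bisect_left(node.keys, key)
            return node.keys[i] == key
    return False


root = TTreeNode(
    [40, 50, 60],
    left=TTreeNode([10, 20, 30]),
    right=TTreeNode([70, 80, 90]),
)
found = ttree_contains(root, 20)
missing_inside = ttree_contains(root, 55)
missing_outside = ttree_contains(root, 95)
```

    Each step down the tree is a pointer dereference rather than a disk read, which is why a deeper, binary-shaped tree is acceptable in main memory.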

    Query Processing

    Sequential access no longer important
        Can create different data structures, e.g. DBGraph
    Performance and scheduling of backups matters
    Disk access time is no longer important
        Cost estimation different

    Miscellaneous

    Applications can be given actual memory positions for reads
        Can even give direct access for writes
        Dangerous! H-Store solves with precompiled procedures
    How do we determine where to store items we've migrated to disk?
        This issue has no traditional counterpart

    H-Store

    H-Store

    ONE CPU

    Architecture

    Deploy Time

    Procedures are compiled
    Layout determined by administrator
    Database optimized at deploy time

    Runtime

    Transactions
        Initiated at one site; that site fulfills the transaction
    Special Cases
        Single-sited: Transaction runs on only one site
        One-shot: Each query in the transaction runs on only one site
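
    The two special cases can be told apart with a small check. This sketch follows the slide's simplified definitions only. It assumes a transaction is modelled as a non-empty list of queries, each given as the set of site ids it touches. The `classify` function and the "general" label for everything else are hypothetical.

```python
def classify(transaction):
    """Classify a transaction given as a non-empty list of per-query site sets."""
    all_sites = set().union(*transaction)
    if len(all_sites) == 1:
        # Every query touches the same single site.
        return "single-sited"
    if all(len(sites) == 1 for sites in transaction):
        # Each query stays on one site, but different queries use different sites.
        return "one-shot"
    # At least one query spans several sites.
    return "general"


same_site = classify([{1}, {1}])
one_site_each = classify([{1}, {2}])
spanning_query = classify([{1, 2}])
```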

    Single-Sited Transaction

    [Diagram labels: Client, Request, DATA]

    Single-Sited Transaction

    [Diagram labels: Client, Redirect, DATA]

    Single-Sited Transaction

    [Diagram labels: Client, Response, DATA]

    Multi-Sited Transaction

    [Diagram labels: Client, Request, three DATA sites]

    Locking

    H-Store has no locks; transactions execute one at a time on each site
    Concurrency achieved by partitioning data across machines
        We simply hope that a transaction doesn't need data on multiple partitions
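
    The lock-free scheme can be sketched as one serial queue per partition. This is an illustration under stated assumptions, not H-Store's implementation: `Site`, `site_for`, and `deposit` are hypothetical, keys are assumed to be integers partitioned by modulo, and every transaction is assumed to be single-sited.

```python
from collections import deque


class Site:
    """Hypothetical site: owns one partition and runs transactions serially."""

    def __init__(self):
        self.data = {}
        self.queue = deque()

    def submit(self, procedure, *args):
        """Queue a stored procedure call for later execution."""
        self.queue.append((procedure, args))

    def run_all(self):
        """Execute queued transactions one at a time, in arrival order."""
        results = []
        while self.queue:
            procedure, args = self.queue.popleft()
            results.append(procedure(self.data, *args))
        return results


def site_for(key, sites):
    """Pick the site that owns an integer key (modulo partitioning)."""
    return sites[key % len(sites)]


def deposit(data, account, amount):
    """A stored procedure that only touches one partition's data."""
    data[account] = data.get(account, 0) + amount
    return data[account]


sites = [Site(), Site()]
for account, amount in [(1, 50), (2, 20), (1, 25)]:
    site_for(account, sites).submit(deposit, account, amount)
results = [site.run_all() for site in sites]
```

    No locks are needed because only one transaction ever runs against a partition at a time. A transaction that needs data on several partitions breaks this assumption and requires coordination between sites.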
