View
12
Download
0
Category
Preview:
Citation preview
Motivations L-Store Evaluation Conclusions
L-Store: A Real-time OLTP and OLAP System
Mohammad Sadoghi†
Souvik Bhattacharjee‡, Bishwaranjan Bhattacharjee#, Mustafa Canim#
†Exploratory Systems Lab† University of California, Davis
‡ University of Maryland, College Park# IBM T.J. Watson
EDBT’18 (March 27, 2018)
Mohammad Sadoghi (UC Davis) EDBT’18 1 / 16
Motivations L-Store Evaluation Conclusions
Data Management at Macroscale: The Four V’s of Big Data
John Doe
Mohammad Sadoghi (UC Davis) EDBT’18 2 / 16
Motivations L-Store Evaluation Conclusions
Data Management at Macroscale: The Four V’s of Big Data
John Doe
Mohammad Sadoghi (UC Davis) EDBT’18 2 / 16
Motivations L-Store Evaluation Conclusions
Data Management at Microscale: Volume & Velocity
OLTP(Write-optimized)
Data Velocity
Sales
Mohammad Sadoghi (UC Davis) EDBT’18 2 / 16
Motivations L-Store Evaluation Conclusions
Data Management at Microscale: Volume & Velocity
OLAP(Read-optimized) OLTP
(Write-optimized)
Extract-Transform-Load (ETL)
Data Velocity
Sales
Reports
Data Volume
Data is Stale
Mohammad Sadoghi (UC Davis) EDBT’18 2 / 16
Motivations L-Store Evaluation Conclusions
Data Management at Microscale: Volume & Velocity
OLAP(Read-optimized) OLTP
(Write-optimized)
Extract-Transform-Load (ETL)
Data Volume
Data Velocity
Sales
Reports
Mohammad Sadoghi (UC Davis) EDBT’18 2 / 16
Motivations L-Store Evaluation Conclusions
One Size Does not Fit All As of 2012
Mohammad Sadoghi (UC Davis) EDBT’18 3 / 16
Motivations L-Store Evaluation Conclusions
One Size Does not Fit All As of 2017
Mohammad Sadoghi (UC Davis) EDBT’18 3 / 16
Motivations L-Store Evaluation Conclusions
Data Management at Microscale: Volume & Velocity
OLAP+OLTP(Read & Write-
optimized)Reports
Sales
Mohammad Sadoghi (UC Davis) EDBT’18 4 / 16
Motivations L-Store Evaluation Conclusions
Storage Layout Conflict
Read Optimized(compressed, read-only pages)
Write Optimized(uncompressed in-place updates)
Columnar StorageRow-based Storage
Write-optimized (i.e., uncompressed & row-based) vs. read-optimized (i.e.,compressed & column-based) layouts
Mohammad Sadoghi (UC Davis) EDBT’18 5 / 16
Motivations L-Store Evaluation Conclusions
Unifying OLTP and OLAP: Velocity & Volume Dimensions
Observed Trends
In operational databases, there is a pressing need to close the gap betweenthe write-optimized layout for OLTP (i.e., row-wise) and theread-optimized layout for OLAP (i.e., column-wise).
Introducing a lineage-based storage architecture, a contention-free updatemechanism over a native columnar storage in order to
lazily and independently stage stable data from a write-optimized layout(i.e., OLTP) into a read-optimized layout (i.e., OLAP)
Mohammad Sadoghi (UC Davis) EDBT’18 6 / 16
Motivations L-Store Evaluation Conclusions
Unifying OLTP and OLAP: Velocity & Volume Dimensions
Observed Trends
In operational databases, there is a pressing need to close the gap betweenthe write-optimized layout for OLTP (i.e., row-wise) and theread-optimized layout for OLAP (i.e., column-wise).
Introducing a lineage-based storage architecture, a contention-free updatemechanism over a native columnar storage in order to
lazily and independently stage stable data from a write-optimized layout(i.e., OLTP) into a read-optimized layout (i.e., OLAP)
Mohammad Sadoghi (UC Davis) EDBT’18 6 / 16
Motivations L-Store Evaluation Conclusions
Unifying OLTP and OLAP: Velocity & Volume Dimensions
Observed Trends
In operational databases, there is a pressing need to close the gap betweenthe write-optimized layout for OLTP (i.e., row-wise) and theread-optimized layout for OLAP (i.e., column-wise).
Introducing a lineage-based storage architecture, a contention-free updatemechanism over a native columnar storage in order to
lazily and independently stage stable data from a write-optimized layout(i.e., OLTP) into a read-optimized layout (i.e., OLAP)
Mohammad Sadoghi (UC Davis) EDBT’18 6 / 16
Motivations L-Store Evaluation Conclusions
Lineage-based Storage Architecture (LSA): Intuition
Base Pages (Read-only)
Tail Pages (Append-only)
Index
Lineage Mapping (indirection layer, stable LID-to-RID mapping)
Base Version
( anchored RIDs)
Latest Version
(monotonically increasing RIDs)
In-page Lineage Tacking
Points to Stable LIDs
(i.e., anchored RID)
RIDi
RIDi
RIDk
RIDj
Physical Update Independence: De-coupling data & its updates(reconstruction via in-page lineage tracking and lineage mapping)
Mohammad Sadoghi (UC Davis) EDBT’18 7 / 16
Motivations L-Store Evaluation Conclusions
Lineage-based Storage Architecture (LSA): Intuition
Base Pages (Read-only)
Tail Pages (Append-only)
Index
Lineage Mapping (indirection layer, stable LID-to-RID mapping)
Base Version
( anchored RIDs)
Latest Version
(monotonically increasing RIDs)
Append-only Updates
(physical update independence)
In-page Lineage Tacking
Points to Stable LIDs
(i.e., anchored RID)
Monotonically Increasing Lineage
(updates are assigned RIDs in an increasing order)
RIDi
RIDi
RIDk
RIDj
Physical Update Independence: De-coupling data & its updates(reconstruction via in-page lineage tracking and lineage mapping)
Mohammad Sadoghi (UC Davis) EDBT’18 7 / 16
Motivations L-Store Evaluation Conclusions
Lineage-based Storage Architecture (LSA): Intuition
Base Pages (Read-only)
Tail Pages (Append-only)
Index
Lineage Mapping (indirection layer, stable LID-to-RID mapping)
Base Version
(stable anchored RIDs)
Latest Version
(monotonically increasing RIDs)
Append-only Updates
(physical update independence)
Lazy Update Consolidation
(snapshot reconstruction via lineage mapping & in-page tracking)
In-page Lineage Tacking
In-page Lineage Tacking
Data Block RIDs Remain Unchanged
(stable reference, anchored RIDs)
Points to Stable LIDs
(i.e., anchored RID)
Monotonically Increasing In-page Lineage
Monotonically Increasing Lineage
(updates are assigned RIDs in an increasing order)
RIDi
Consolidated Data (Read-only)
RIDi
RIDk
RIDj
Physical Update Independence: De-coupling data & its updates(reconstruction via in-page lineage tracking and lineage mapping)
Mohammad Sadoghi (UC Davis) EDBT’18 7 / 16
Motivations L-Store Evaluation Conclusions
Lineage-based Storage Architecture (LSA): Overview
Columnar StorageBase Pages(read-only)
Tail Pages(append-only)
Range Partitioning
Page
Dir
ecto
ry
Record(spanning over a set of aligned columns)
Overview of the lineage-based storage architecture(base pages and tail pages are handled identically at the storage layer)
Mohammad Sadoghi (UC Davis) EDBT’18 8 / 16
Motivations L-Store Evaluation Conclusions
L-Store: Detailed Design
Columnar Storage
Range Partitioning
Base Pages(read-only)
Read Optimized(compressed, read-only pages)
Records are range-partitioned and compressed into a set of ready-only base pages(accelerating analytical queries)
Mohammad Sadoghi (UC Davis) EDBT’18 9 / 16
Motivations L-Store Evaluation Conclusions
L-Store: Detailed Design
Write Optimized(uncompressed, append-only updates)
Updated Columns
Corresponding Columns
Base Pages(read-only)Tail Pages
(append-only)
Read Optimized(compressed, read-only pages)
Recent updates for a range of records are clustered in their tails pages(transforming costly point updates into an amortized analytical-like query)
Mohammad Sadoghi (UC Davis) EDBT’18 9 / 16
Motivations L-Store Evaluation Conclusions
L-Store: Detailed Design
Write Optimized(uncompressed, append-only updates)
Updated Columns
Base Pages(read-only)Tail Pages
(append-only)
Different Versions of the Record
Base Record(older version)
Tail Record(latest version)
Read Optimized(compressed, read-only pages)
Recent updates for a range of records are clustered in their tails pages(transforming costly point updates into an amortized analytical-like query)
Mohammad Sadoghi (UC Davis) EDBT’18 9 / 16
Motivations L-Store Evaluation Conclusions
L-Store: Detailed Design
Write Optimized(uncompressed, append-only updates)
Pre-allocated Space(lazily)
Base Pages(read-only)Tail Pages
(append-only)
Read Optimized(compressed, read-only pages)
Recent updates are strictly appended, uncompressed in the pre-allocated space(eliminating the read/write contention)
Mohammad Sadoghi (UC Davis) EDBT’18 9 / 16
Motivations L-Store Evaluation Conclusions
L-Store: Detailed Design
Write Optimized(uncompressed, append-only updates)
Indirection Column(uncompressed, in-place update)
Forward Pointer to theLatest Version of the Record
Indirection Column(back pointer to the previous version)
Read Optimized(compressed, read-only pages)
Achieving (at most) 2-hop access to the latest version of any record(avoiding read performance deterioration for point queries)
Mohammad Sadoghi (UC Davis) EDBT’18 9 / 16
Motivations L-Store Evaluation Conclusions
L-Store: Detailed Design
Write Optimized(uncompressed, append-only updates)
Indirection Column(uncompressed, in-place update)
Indirection Column(back pointer to the previous version)
New Version
Read Optimized(compressed, read-only pages)
Achieving (at most) 2-hop access to the latest version of any record(avoiding read performance deterioration for point queries)
Mohammad Sadoghi (UC Davis) EDBT’18 9 / 16
Motivations L-Store Evaluation Conclusions
L-Store: Detailed Design
Write Optimized(uncompressed, append-only updates)
Indirection Column(uncompressed, in-place update)
Indirection Column(back pointer to the previous version)
New Version
Read Optimized(compressed, read-only pages)
Backward Pointer
Achieving (at most) 2-hop access to the latest version of any record(avoiding read performance deterioration for point queries)
Mohammad Sadoghi (UC Davis) EDBT’18 9 / 16
Motivations L-Store Evaluation Conclusions
L-Store: Contention-free Merge
Write Optimized(uncompressed, append-only updates)
Merge Queue(tail pages to be merged)
Consecutive Set of Committed Updates
Indirection Column(uncompressed, in-place update)
Read Optimized(compressed, read-only pages)
Contention-free merging of only stable data: read-only and committed data(no need to block on-going and new transactions)
Mohammad Sadoghi (UC Davis) EDBT’18 9 / 16
Motivations L-Store Evaluation Conclusions
L-Store: Contention-free MergeRead Optimized
(compressed, read-only pages)
Write Optimized(uncompressed, append-only updates)
⋈ =
Asynchronous Lazy Merge (committed, consecutives updates)
Indirection Column(uncompressed, in-place update)
Merge Queue(tail pages to be merged)
Lazy independent merging of base pages with their corresponding tail pages(resembling a local left outer-join of the base and tail pages)
Mohammad Sadoghi (UC Davis) EDBT’18 9 / 16
Motivations L-Store Evaluation Conclusions
L-Store: Contention-free Merge
Write Optimized(uncompressed, append-only updates)
In-page, Independent Lineage Tracking
Asynchronous Lazy Merge (committed, consecutives updates)
⋈ =
Indirection Column(uncompressed, in-place update)
Read Optimized(compressed, read-only pages)
Independently tracking the lineage information within every page(no need to coordinate merges among different columns of the same records)
Mohammad Sadoghi (UC Davis) EDBT’18 9 / 16
Motivations L-Store Evaluation Conclusions
L-Store: Epoch-based Contention-free De-allocation
Epoch-based De-allocation(longest running query)
Page Directory
Indirection Column(uncompressed, in-place update)
Write Optimized(uncompressed, append-only updates)
Read Optimized(compressed, read-only pages)
Asynchronous Lazy Merge
⋈ =
Contention-free page de-allocation using an epoch-based approach(no need to drain the ongoing transactions)
Mohammad Sadoghi (UC Davis) EDBT’18 9 / 16
Motivations L-Store Evaluation Conclusions
L-Store: Epoch-based Contention-free De-allocation
In-page, Independent Lineage Tracking
Page Directory
Indirection Column(uncompressed, in-place update)
Write Optimized(uncompressed, append-only updates)
Epoch-based De-allocation(longest running query)
Read Optimized(compressed, read-only pages)
Contention-free page de-allocation using an epoch-based approach(no need to drain the ongoing transactions)
Mohammad Sadoghi (UC Davis) EDBT’18 9 / 16
Motivations L-Store Evaluation Conclusions
L-Store: Epoch-based Contention-free De-allocation
Page Directory
Indirection Column(uncompressed, in-place update)
Write Optimized(uncompressed, append-only updates)
Epoch-based De-allocation(longest running query)
Asynchronous Lazy Merge
⋈ =
Read Optimized(compressed, read-only pages)
Contention-free page de-allocation using an epoch-based approach(no need to drain the ongoing transactions)
Mohammad Sadoghi (UC Davis) EDBT’18 9 / 16
Motivations L-Store Evaluation Conclusions
L-Store: Epoch-based Contention-free De-allocation
Page Directory
Indirection Column(uncompressed, in-place update)
Write Optimized(uncompressed, append-only updates)
Epoch-based De-allocation(longest running query)
Asynchronous Lazy Merge
⋈ =
Read Optimized(compressed, read-only pages)
Contention-free page de-allocation using an epoch-based approach(no need to drain the ongoing transactions)
Mohammad Sadoghi (UC Davis) EDBT’18 9 / 16
Motivations L-Store Evaluation Conclusions
L-Store: Epoch-based Contention-free De-allocation
Epoch-based De-allocation(longest running query)
In-page, Independent Lineage Tracking
Write Optimized(uncompressed, append-only updates)
Page Directory
Indirection Column(uncompressed, in-place update)
Asynchronous Lazy Merge
⋈ =
Read Optimized(compressed, read-only pages)
Contention-free page de-allocation using an epoch-based approach(no need to drain the ongoing transactions)
Mohammad Sadoghi (UC Davis) EDBT’18 9 / 16
Motivations L-Store Evaluation Conclusions
Experimental Analysis
Mohammad Sadoghi (UC Davis) EDBT’18 10 / 16
Motivations L-Store Evaluation Conclusions
Experimental Settings
Hardware:
2 × 6-core Intel(R) Xeon(R) CPU E5-2430 @ 2.20GHz, 64GB, 15 MB L3 cache
Workload: Extended Microsoft Hekaton Benchmark
Comparison with In-place Update + History and Delta + Blocking MergeEffect of varying contention levelsEffect of varying the read/write ratio of short update transactionsEffect of merge frequency on scanEffect of varying the number of short update vs. long read-only transactionsEffect of varying L-Store data layouts (row vs. columnar)Effect of varying the percentage of columns read in point queriesComparison with log-structured storage architecture (LevelDB)
Mohammad Sadoghi (UC Davis) EDBT’18 11 / 16
Motivations L-Store Evaluation Conclusions
Effect of Varying Contention Levels
0
0.2
0.4
0.6
0.8
1
0 5 10 15 20 25
Throughp
ut(M
txns/s)
NumberofParallelShortUpdateTransactions
L-StoreIn-placeUpdate+HistoryDelta+BlockingMerge
0
0.5
1
1.5
2
0 5 10 15 20 25
Throughp
ut(M
txns/s)
NumberofParallelShortUpdateTransactions
L-StoreIn-placeUpdate+HistoryDelta+BlockingMerge
Achieving up to 40× as increasing the update contention
Mohammad Sadoghi (UC Davis) EDBT’18 12 / 16
Motivations L-Store Evaluation Conclusions
Effect of Merge Frequency on Scan Performance
0
0.5
1
1.5
2
2.5
4K 8K 16K 32K 64KScan
Exe
cuti
on
Tim
e (
in s
eco
nd
s)
Number of Tail Records Processed per Merge
Mixed OLTP + OLAP Workload; Low Contention (1 Scan + 1 Merge Threads, Page Size = 32 KB)
Scan Performance(4 Update Threads)
Scan Performance(14 Update Threads)
Merge process is essential in maintaining efficient scan performance
Mohammad Sadoghi (UC Davis) EDBT’18 13 / 16
Motivations L-Store Evaluation Conclusions
Effect of Mixed Workloads: Update Performance
0
0.2
0.4
0.6
0.8
1
1 4 8 12 16Up
dat
e T
hro
ugh
pu
t (m
illio
n o
f tx
n/s
)
Number of Parallel Update Transactions
Mixed OLTP + OLAP Workload; Medium Contention (Total of 17 Threads + 1 Merge Thread, Page Size = 32 KB)
Lineage-based DataStore (L-Store)
In-place Update +History
Delta + BlockingMerge
Eliminating latching & locking results in a substantial performance improvement
Mohammad Sadoghi (UC Davis) EDBT’18 14 / 16
Motivations L-Store Evaluation Conclusions
Effect of Mixed Workloads: Read Performance
0
200
400
600
800
1 5 9 13 16
Re
ad T
hro
ugh
pu
t (t
xn/s
)
Number of Parallel Read-only Transactions
Mixed OLTP + OLAP Workload; Medium Contention (Total of 17 Threads + 1 Merge Thread, Page Size = 32 KB)
Lineage-based DataStore (L-Store)
In-place Update +History
Delta + BlockingMerge
Coping with tens of update threads with a single merge thread
Mohammad Sadoghi (UC Davis) EDBT’18 14 / 16
Motivations L-Store Evaluation Conclusions
L-Store Key Contributions
Unifying OLAP & OLTP by introducing lineage-based storagearchitecture (LSA)
LSA is a native multi-version, columnar storage model that lazily &independently stages data from a write-optimized layout into aread-optimized one
Contention-free merging of only stable data without blocking ongoingor incoming transactions
Contention-free page de-allocation without draining ongoingtransactions
L-Store outperforms in-place update & delta approaches by factor of upto 8× on mixed OLTP/OLAP workloads and up to 40× onupdate-intensive workloads
Mohammad Sadoghi (UC Davis) EDBT’18 15 / 16
Motivations L-Store Evaluation Conclusions
Questions?Thank you!
Exploratory Systems Lab (ExpoLab)Website: https://msadoghi.github.io/
Mohammad Sadoghi (UC Davis) EDBT’18 16 / 16
Recommended