
A Top-Down Approach to Achieving Performance Predictability in Database Systems

Jiamin Huang, Barzan Mozafari, Grant Schoenebeck, and Thomas F. Wenisch

University of Michigan

[Chart: mean latency, standard deviation, and 99th-percentile latency (in nanoseconds) for MySQL running the TPC-C benchmark at a fixed rate; the bars are annotated with 10x and 2x gaps.]

Performance Predictability in Today’s DBMS

By focusing too much on raw performance, we have neglected predictability

Why Does Predictability Matter?

•  Latency-sensitive applications

•  Provisioning

•  SLA guarantees

•  Tuning

•  Interactive applications

•  User-experience

Example: Provisioning & SLAs

[Chart: mean and 99th-percentile latency (ms) vs. number of cores (10–210). Provisioning for the mean: latency 0.016 ms at 62.5k TPS.]

Example: Provisioning & SLAs

[Chart: mean and 99th-percentile latency (ms) vs. number of cores (10–210). Provisioning for the 99th percentile rather than the mean shifts the operating point from 0.016 ms at 62.5k TPS to 0.075 ms at 13k TPS.]

What is Performance Predictability?

•  Performance Variance:

1.  Inherent (External): varying amounts of work, network problems, …

2.  Avoidable (Internal): due to internal artifacts of the DBMS (algorithms, data structures, …)

Two Approaches to Achieve Predictability

•  Bottom-up: build a new DBMS from scratch

•  Once an academic prototype, always an academic prototype

•  Sacrifice performance for predictability

•  Top-down: identify root causes of unpredictability and mitigate them

•  Goal: do not compromise performance

•  Benefit: adoption is a “no-brainer”

•  Challenge: today’s DBMSs are extremely complex

Key Questions

1.  How to identify sources of variance?

2.  What makes today’s DBMSs unpredictable?

3.  How to achieve perf. predictability?

4.  How effective are our techniques?


Identifying Root Causes of Performance Variance

•  Profiling tools: critical for diagnosing perf. problems in modern software

•  Existing profilers focus on average performance

•  DTrace, gprof, perf, etc.

•  Breakdown of avg. performance of DBs done before
•  “OLTP through the looking glass, and what we found there” [SIGMOD’08]

•  Need a new profiler capable of breaking down perf. variance → TProfiler

TProfiler

•  Goal: pinpoint root causes of performance variance in the large and complex codebases of today’s DBMSs (770K, 1.5M, and 1.9M lines of code)

Q: How to find the root causes of performance variance efficiently and accurately?

Our Solution: Variance Trees

[Call graph: process_query calls parse_query, execute_query, and send_result.]

T(f): execution time of function f

Latency breakdown:
T(process_query) = T(parse_query) + T(execute_query) + T(send_result) ≡ overall latency

Our Solution: Variance Trees

•  If T(process_query) = T(parse_query) + T(execute_query) + T(send_result), then:

Variance breakdown (the variance tree):
Var(process_query) = Var(parse_query) + Var(execute_query) + Var(send_result)
                   + 2·Cov(parse_query, execute_query)
                   + 2·Cov(parse_query, send_result)
                   + 2·Cov(execute_query, send_result)
≡ overall latency variance
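The decomposition above is just the standard variance-of-a-sum identity. A minimal sketch (not TProfiler itself, and with made-up synthetic timings) checking that a parent function's latency variance equals the sum of its children's variances plus the pairwise covariance terms:

```python
# Sketch: verify Var(total) = sum of child variances + 2 * pairwise covariances.
# The per-query timings below are synthetic, for illustration only.
import random

random.seed(0)
n = 10_000
parse   = [random.gauss(1.0, 0.1)  for _ in range(n)]  # T(parse_query)
execute = [random.gauss(5.0, 2.0)  for _ in range(n)]  # T(execute_query)
send    = [random.gauss(0.5, 0.05) for _ in range(n)]  # T(send_result)
total   = [p + e + s for p, e, s in zip(parse, execute, send)]

def var(xs):
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

def cov(xs, ys):
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)

decomposed = (var(parse) + var(execute) + var(send)
              + 2 * cov(parse, execute)
              + 2 * cov(parse, send)
              + 2 * cov(execute, send))

assert abs(var(total) - decomposed) < 1e-6  # identity holds exactly
```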

Efficiency

•  Observation: most nodes are actually insignificant

•  Do not build a complete variance tree!

•  Build variance tree iteratively and selectively

1.  Tree expansion: break down variance of selected functions (process_query at the beginning)

2.  Node selection: select significant* nodes from the tree

3.  User inspection: users inspect selected functions, and decide whether to further investigate

* See paper for details
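The expand-select loop above can be sketched roughly as follows. The call graph, contribution numbers, and 10% threshold are illustrative assumptions, not TProfiler's actual data structures or cutoff:

```python
# Sketch of the iterative idea: expand only nodes whose variance
# contribution is significant, instead of building the full tree.
CHILDREN = {  # hypothetical call graph
    "process_query": ["parse_query", "execute_query", "send_result"],
    "execute_query": ["lock_wait", "do_work"],
}
CONTRIB = {   # hypothetical fraction of Var(latency) each node explains
    "process_query": 1.00, "parse_query": 0.05, "execute_query": 0.90,
    "send_result": 0.02, "lock_wait": 0.80, "do_work": 0.07,
}
THRESHOLD = 0.10  # "significant" cutoff (assumption)

def significant_leaves(root):
    """Expand significant nodes; return the deepest significant functions."""
    frontier, selected = [root], []
    while frontier:
        node = frontier.pop()
        kids = [c for c in CHILDREN.get(node, []) if CONTRIB[c] >= THRESHOLD]
        if kids:
            frontier.extend(kids)      # keep expanding this subtree
        elif CONTRIB[node] >= THRESHOLD:
            selected.append(node)      # candidate for user inspection
    return selected

print(significant_leaves("process_query"))  # ['lock_wait']
```

Only two of the six nodes are ever expanded here; insignificant subtrees (parse_query, send_result, do_work) are pruned without further profiling.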

Key Questions

1.  How to identify sources of variance?

2.  What makes today’s DBMSs unpredictable?

3.  How to achieve perf. predictability?

4.  How effective are our techniques?

Case Studies

•  Used TProfiler to analyze 3 popular (both traditional and modern) DBMSs

Setup

•  Application: MySQL 5.6.23

•  Hardware: Intel Xeon E5 2.1GHz

•  Workload: TPC-C

•  128 Warehouses, 30GB Buffer Pool

Root Causes of Performance Unpredictability in MySQL

•  With 37 iterations, 6 mins manual inspection time each, out of 30K functions

Function Name        | Contribution to Overall Latency Variance
os_event_wait[A]     | 37.5%  (transactions waiting for locks on data objects)
os_event_wait[B]     | 21.7%  (same function, different call site)
buf_pool_mutex_enter | 32.92% (waiting for a lock on the buffer pool before updating the list of buffer pages)

Key Questions

1.  How to identify sources of variance?

2.  What makes today’s DBMSs unpredictable?

3.  How to achieve perf. predictability?

4.  How effective are our techniques?

Mitigating Performance Variance

1.  Changing the implementation
•  Parallel Logging

2.  Changing the algorithm
•  VATS, LLU

3.  Changing the tuning parameters
•  Buffer pool size, redo log flush policy, etc.


Latency Variance Caused by Queuing

[Diagram: transactions T1–T5 queued on a lock L, annotated with min, average, and max queueing times.]

Our Insight: Look at the Big Picture

[Diagram: transactions T1, T2, and T4 holding and waiting on multiple locks L1–L3 across the system.]

VATS: Variance-Aware Transaction Scheduling

[Diagram: the same locks L1–L3 with transactions T1, T2, and T4 waiting.]

VATS grants locks according to transactions’ arrival time in the system, not in the queue (earliest first)
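This scheduling rule can be sketched in a few lines. The comparison against plain FIFO below is a simplified illustration; field names and timestamps are assumptions, not MySQL's lock-manager code:

```python
# Sketch: grant a contended lock by system arrival time (VATS idea)
# rather than by queue arrival time (FIFO).
from dataclasses import dataclass

@dataclass
class Txn:
    name: str
    system_arrival: float  # when the transaction started in the DBMS
    queue_arrival: float   # when it joined this lock's wait queue

def grant_fifo(waiters):
    """Classic FIFO: first into this lock's queue wins."""
    return min(waiters, key=lambda t: t.queue_arrival)

def grant_vats(waiters):
    """VATS idea: the oldest transaction in the *system* wins."""
    return min(waiters, key=lambda t: t.system_arrival)

# T1 is the oldest transaction overall, but it joined this queue last.
waiters = [
    Txn("T2", system_arrival=5.0, queue_arrival=1.0),
    Txn("T1", system_arrival=1.0, queue_arrival=2.0),
]
print(grant_fifo(waiters).name)  # T2
print(grant_vats(waiters).name)  # T1
```

Granting to the oldest transaction first shrinks the tail: the longest-waiting transaction stops accumulating latency across every queue it visits.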

LRU Ordering of Buffer Pages

[Animation: list of buffer pages P1 P2 P3 P4 P5.]

•  P4 is accessed
•  The whole list is locked ← place where variance occurs
•  P4 is moved to the head: P4 P1 P2 P3 P5
•  The lock is released

Solution: use a lazy page update algorithm (LLU)
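A rough sketch of the lazy idea: instead of taking the list lock and moving a page to the head on every access, defer the move and reorder only when the page has drifted far from the head. The class, threshold, and counter below are illustrative assumptions, not MySQL's buffer pool code:

```python
# Sketch: lazy LRU reordering. Each move requires the (expensive)
# whole-list lock, so we skip moves that barely change the ordering.
class LazyLRU:
    def __init__(self, pages, threshold=2):
        self.pages = list(pages)   # head of list = most recently used
        self.threshold = threshold # how far a page may drift before moving
        self.moves = 0             # each move implies one list-lock acquisition

    def access(self, page):
        pos = self.pages.index(page)
        if pos > self.threshold:   # only take the lock when it pays off
            self.pages.remove(page)
            self.pages.insert(0, page)
            self.moves += 1

lru = LazyLRU(["P1", "P2", "P3", "P4", "P5"])
lru.access("P2")  # already near the head: no lock, no move
lru.access("P4")  # far from the head: moved to the front
print(lru.pages, lru.moves)  # ['P4', 'P1', 'P2', 'P3', 'P5'] 1
```

A strict LRU would have taken the list lock twice here; the lazy variant takes it once, trading exact recency order for fewer contended critical sections.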

Variance-aware Tuning

•  buf_pool_mutex_enter – buffer pool size settings: 33%, 66%, 100%

Key Questions

1.  How to identify sources of variance?

2.  What makes today’s DBMSs unpredictable?

3.  How to achieve perf. predictability?

4.  How effective are our techniques?

VATS Improvement

[Chart: improvement factor (x) in variance and mean latency, on a 0–7 scale.]

•  189 lines of code changed in MySQL

LLU Improvement

[Chart: improvement factor (x) in variance and mean latency, on a 0–1.8 scale.]

•  46 lines of code changed in MySQL

Buffer Pool Size Tuning

[Chart: improvement (x) in variance and mean latency for buffer pool sizes of 33%, 66%, and 100%.]

Real-world Adoption

•  TProfiler open-sourced
•  VATS has been merged into MySQL distributions (default in MariaDB and staged in Oracle MySQL)
•  2M+ installations in the world
•  Our buffer pool problem independently discovered and fixed in MySQL 5.8.0

Conclusion

•  Predictability is an increasingly critical dimension of modern software, yet overlooked in today’s DBMSs

•  TProfiler identifies root causes of perf. variance in a principled fashion
•  Enables local and surgical changes to complex DBMS codebases

•  Lock waiting is a major source of perf. variance in today’s DBMSs

•  Variance-aware scheduling, lazy optimizations, and tuning strategies dramatically improve predictability w/o sacrificing raw performance

Backup Slides

Definition of Predictability

•  Many ways to capture perf. predictability
•  Minimize latency variance or tail latencies
•  Bound latency variance or tail latencies
•  Minimize the (stdev / mean) ratio

•  Our focus: identifying sources of latency variance
•  Reducing variance without sacrificing mean latency
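The (stdev / mean) ratio above is the coefficient of variation, a scale-free predictability metric. A quick sketch with made-up latency samples:

```python
# Coefficient of variation: stdev / mean. Lower = more predictable.
from statistics import mean, pstdev

def cv(latencies):
    return pstdev(latencies) / mean(latencies)

# Two hypothetical workloads with the same mean latency (10 ms):
steady = [10.0, 10.5, 9.5, 10.0]
spiky  = [2.0, 2.0, 2.0, 34.0]  # one large outlier

assert cv(steady) < 0.05 < cv(spiky)  # same mean, very different predictability
```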

Node Selection Example

[Variance tree: Var(write_logs) = 80%·Var(Latency); children Var(write_to_buffer) = 10%·Var(Latency), Var(lock_buffer) = 2%·Var(Latency), Var(unlock_buffer) = 60%·Var(Latency).]

•  Larger contribution → more informative
•  The lower in the variance tree, the more specific

Manual Efforts

Application | Semantic Interval Annotation | # of TProfiler Runs | Avg. Manual Inspection Time per Run | Modified Lines of Code
MySQL       | 9 lines of code              | 37                  | 6 minutes                           | 235
Postgres    | 7 lines of code              | 16                  | 10 minutes                          | 355
Httpd       | 4 lines of code              | 17                  | 12 minutes                          | 45

Related Work: DARC

•  Uses multiple runs to produce latency histograms
•  Can find main contributors of latency in each execution-time range
•  ≠ main contributors of latency *variance* in a semantic interval

[Diagram: per-bucket (0–1 sec, 1–10 sec, 10–20 sec) call trees of write_logs with children lock, write, and unlock.]

1. Tree Expansion

•  Root creation: set the root to the variance of the top-level function for query processing, Var(process_query)
•  Break down the root and expand the variance tree:
Var(process_query) → Var(parse_query), Var(execute_query), Cov(parse_query, execute_query)

2. Node Selection

•  Select the most “informative” nodes from the tree
•  informative = large-enough variance contribution + deep-enough in the tree (specificity)

3. User Inspection

•  Ask for user inspection when:

1.  Cov terms are large
•  Study how to de-correlate the two functions

2.  Var terms are both large and deep

•  If cause is still unclear, repeat the expand-select-inspect process
