Automatic Database Management System Tuning Through Large-scale Machine Learning

Dana Van Aken, Andrew Pavlo, Geoffrey J. Gordon, Bohan Zhang


OtterTune

DBMS Tuning

• Tuning a DBMS’s configuration knobs for the application’s workload and hardware is critical for performance

• Tuning even one DBMS deployment is hard…

Challenge #1: Dependencies

[Figure: 99th %-tile latency (sec; lower is better) as a function of buffer pool size (GB) and log file size (GB). MySQL v5.6, YCSB update-heavy workload; VM with 2 GB RAM, 2 vCPUs.]

Challenge #2: Continuous Settings

[Figure: 99th %-tile latency (sec; lower is better) versus buffer pool size (GB). MySQL v5.6, YCSB update-heavy workload; VM with 2 GB RAM, 2 vCPUs.]

Challenge #3: Non-reusable Configs

[Figure: 99th %-tile latency (sec; lower is better) of Config #1, Config #2, and Config #3 on Workload #1, Workload #2, and Workload #3. MySQL v5.6, 3 different YCSB workloads. Annotated differences: 8x and 30x.]

Challenge #4: Tuning Complexity

[Figure: Number of configuration knobs in MySQL and Postgres releases over the last 16 years (2000 to 2016). Annotations: 5x, 7x, 539, 291.]

Existing Solutions

• Auto-tuners: heuristic-based; DBMS-specific

• “Smart” auto-tuners: research only; no data reuse

OtterTune

Reuse historical performance data from tuning past DBMS deployments to tune new DBMS deployments

Train ML models to:

1. Identify the most impactful knobs

2. Find past workloads that are similar to the new workload

3. Recommend knob settings that improve a target objective (e.g., latency, throughput)

[Diagram: OtterTune architecture. Components: Tuning Manager, Data Repository, ML Models, Controller, DBMS.]

1. User specifies the target objective

2. Controller starts the first observation period
- Measures external metrics, then collects DBMS-specific internal metrics
- Returns: metrics, current knob configuration

3. Tuning manager saves the data and computes the next configuration
- Returns: next configuration, expected improvement

4. Controller installs next configuration on DBMS
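The four steps above form a loop between the controller and the tuning manager. The following is a minimal sketch of that loop, not OtterTune's code: `ToyDBMS`, `TuningManager`, `Observation`, the `buffer_pool_mb` knob, and the "grow the buffer pool" policy are all hypothetical stand-ins chosen so the example runs on its own.

```python
from dataclasses import dataclass, field


@dataclass
class Observation:
    knobs: dict    # knob configuration in effect during the observation period
    metrics: dict  # metrics collected by the controller


@dataclass
class TuningManager:
    """Hypothetical stand-in: stores observations and proposes the next configuration."""
    objective: str                                   # step 1: target objective
    repository: list = field(default_factory=list)  # plays the role of the data repository

    def next_config(self, obs):
        self.repository.append(obs)                  # step 3: save the data
        # Placeholder policy (an assumption, not an ML model):
        # grow the buffer pool by 256 MB per iteration, capped at 2048 MB.
        new_knobs = dict(obs.knobs)
        new_knobs["buffer_pool_mb"] = min(obs.knobs["buffer_pool_mb"] + 256, 2048)
        return new_knobs


class ToyDBMS:
    """Hypothetical DBMS whose latency falls as the buffer pool grows."""

    def __init__(self):
        self.knobs = {"buffer_pool_mb": 128}

    def run_workload(self):
        return {"p99_latency_ms": 1000.0 / (1 + self.knobs["buffer_pool_mb"] / 128)}


def tuning_session(dbms, manager, iterations):
    for _ in range(iterations):
        metrics = dbms.run_workload()                 # step 2: observation period
        obs = Observation(dict(dbms.knobs), metrics)  # metrics + current configuration
        dbms.knobs = manager.next_config(obs)         # steps 3 and 4: compute, then install
    return manager.repository
```

Calling `tuning_session(ToyDBMS(), TuningManager("p99_latency_ms"), 3)` returns the three saved observations; in a real deployment the placeholder policy would be replaced by the ML pipeline described in the following slides.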

Assumptions & Limitations

• All DBMS instances run on the same hardware

• Blacklist for knobs that should NOT be tuned


Machine Learning Pipeline

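[Diagram: pipeline stages Workload Characterization, Knob Identification, and Automatic Tuner. Inputs are drawn as samples × metrics and samples × knobs matrices; intermediate outputs are labeled Distinct Metrics and Importance (of knobs).]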

Workload Characterization

• Goal: find a set of metrics that best characterize an application’s workload

• Method: Use the metrics stored in OtterTune’s data repository

Internal DBMS Metrics:
- Directly affected by the knobs’ settings
- Example, buffer pool size is too small: (# buffer pool misses) / (total # buffer pool requests)
- Problem: redundancy
  - Same but different units
  - Highly correlated
- Solution: prune them
  - Factor analysis to capture correlation patterns
  - K-means to group correlated metrics
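The pruning idea can be illustrated with a small, dependency-free sketch. It is a simplification under stated assumptions: the factor-analysis step is skipped, and k-means is run directly on each metric's standardized sample values, so metrics that differ only in units or are highly correlated land in the same cluster. The function names and the metric names in the usage note are hypothetical.

```python
import math


def standardize(column):
    """Rescale one metric's samples to zero mean and unit variance."""
    mean = sum(column) / len(column)
    std = math.sqrt(sum((v - mean) ** 2 for v in column) / len(column)) or 1.0
    return [(v - mean) / std for v in column]


def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iterations=20):
    """Plain k-means with deterministic farthest-point initialization."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(sq_dist(p, c) for c in centers)))
    labels = []
    for _ in range(iterations):
        # Assign each metric to its nearest center, then recompute the centers.
        labels = [min(range(k), key=lambda j: sq_dist(p, centers[j])) for p in points]
        for j in range(k):
            members = [p for p, label in zip(points, labels) if label == j]
            if members:
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers


def prune_metrics(samples, k):
    """samples: {metric_name: [value per observation]}.
    Returns one representative metric per cluster (the one nearest its center)."""
    names = list(samples)
    points = [standardize(samples[name]) for name in names]
    labels, centers = kmeans(points, k)
    keep = []
    for j in range(k):
        members = [i for i, label in enumerate(labels) if label == j]
        if members:
            keep.append(names[min(members, key=lambda i: sq_dist(points[i], centers[j]))])
    return keep
```

For example, with synthetic samples `{"reads_pages": [1, 2, 3, 4, 5], "reads_bytes": [16, 32, 48, 64, 80], "lock_waits": [5, 1, 4, 2, 3]}` and `k=2`, the two read counters (the same signal in different units) fall into one cluster and only one of them is kept alongside `lock_waits`.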

Sample Metric Clusters

MySQL (v5.6)
- innodb_data_reads, innodb_buffer_pool_reads, innodb_pages_read, …
- questions, table_locks_immediate, table_open_cache_hits

Postgres (v9.3)
- blks_read, heap_blks_read, idx_blks_read
- buffers_backend, blk_write_time, buffers_clean

Knob Identification

• Goal: identify which knobs affect the DBMS’s performance

• Method: feature selection technique to order the knobs by importance

Configuration Knobs:
- Knobs have varying degrees of impact on the DBMS’s performance
  - Some have high impact
  - Some have no impact
  - For many, it depends on the workload
- Problem: which knobs matter?
- Solution: Lasso to determine the order of importance
- Problem: how many to tune?
- Solution: Incremental knob selection to gradually increase the number of knobs tuned during a session
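One way to read "Lasso to determine the order of importance" is to fit Lasso at decreasing penalties and record the order in which knobs first receive a nonzero coefficient. The sketch below does this with a plain coordinate-descent Lasso; it assumes the knob columns are already standardized and the data is synthetic. `rank_knobs`, `knobs_to_tune`, and the growth schedule (`start`, `grow_every`) are hypothetical and only illustrate the incremental idea.

```python
def soft_threshold(value, penalty):
    if value > penalty:
        return value - penalty
    if value < -penalty:
        return value + penalty
    return 0.0


def lasso(X, y, penalty, sweeps=200):
    """Coordinate descent for (1/2n)*||y - Xw||^2 + penalty*||w||_1."""
    n, d = len(X), len(X[0])
    w = [0.0] * d
    for _ in range(sweeps):
        for j in range(d):
            # Correlation of knob j with the residual that leaves knob j out.
            rho = sum(
                X[i][j] * (y[i] - sum(X[i][k] * w[k] for k in range(d) if k != j))
                for i in range(n)
            ) / n
            norm = sum(X[i][j] ** 2 for i in range(n)) / n
            w[j] = soft_threshold(rho, penalty) / norm if norm else 0.0
    return w


def rank_knobs(names, X, y, penalties):
    """Order knobs by when they enter the model as the penalty decreases."""
    order = []
    for penalty in sorted(penalties, reverse=True):
        w = lasso(X, y, penalty)
        entering = [j for j in range(len(names)) if w[j] != 0.0 and names[j] not in order]
        entering.sort(key=lambda j: -abs(w[j]))
        order.extend(names[j] for j in entering)
    order.extend(name for name in names if name not in order)  # never entered
    return order


def knobs_to_tune(order, iteration, start=1, grow_every=2):
    """Hypothetical schedule: begin with `start` knobs and add one more
    every `grow_every` iterations, following the importance order."""
    return order[: min(len(order), start + iteration // grow_every)]
```

Knobs that enter at a large penalty are the ones the model cannot do without, which is why entry order serves as an importance ranking; the schedule then widens the tuned set as the session progresses.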

Top 5 Configuration Knobs

MySQL (v5.6)

1. innodb_buffer_pool_size

2. innodb_flush_method

3. innodb_log_file_size

4. innodb_thread_concurrency

5. innodb_max_dirty_pages_pct_lwm

Postgres (v9.3)

1. shared_buffers

2. checkpoint_segments

3. effective_cache_size

4. bgwriter_lru_maxpages

5. bgwriter_delay

Automated Tuning

• Step 1: Workload Mapping

• Step 2: Configuration Recommendation


Step 1: Workload Mapping

• Goal: Match the target DBMS’s workload to the most similar known workload in the data repository

• Method: For each known workload, compute a similarity score by comparing the performance measurements


- Problem: deciding which

Similarity score: average distance in performance measurements over all metrics

- Problem: computing the score with raw metrics would be unfair
- Solution: compute deciles, bin values
- Problem: little overlap in the configurations tried between workloads
- Solution: for each metric, train statistical model to predict performance

[Diagram: for Knob Config X, the target workload's data (p99 latency, throughput, innodb_data_writes, handler_read_key, …; raw values 792, 512, 46044, 526784, …) is binned (4, 5, 7, 5, …) and compared against binned values from the per-metric models f(x) of the known workloads, Workload A and Workload B (8, 2, 9, 3, … and 3, 5, 8, 4, …).]
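A small sketch of the mapping step, under simplifying assumptions: every workload has a value for every metric at the same configurations (in the slides this gap is filled by per-metric statistical models, which are not implemented here), decile boundaries come from pooling all observed values of a metric, and the per-metric distance is Euclidean. `decile_bin` and `map_workload` are hypothetical names, and apart from 792 and 512 the numbers in the usage note are synthetic.

```python
import math


def decile_bin(value, reference):
    """Decile (1..10) of `value` within the pooled `reference` values."""
    below = sum(1 for r in reference if r <= value)
    return max(1, min(10, math.ceil(10 * below / len(reference))))


def map_workload(target, known):
    """target: {metric: [value per config]}
    known:  {workload: {metric: [value per config]}}
    Returns (most similar workload, {workload: score}); lower score = more similar."""
    scores = {}
    for name, metrics in known.items():
        distances = []
        for metric, target_values in target.items():
            # Pool every value of this metric so bins are comparable across workloads.
            pool = target_values + [v for w in known.values() for v in w[metric]]
            t_bins = [decile_bin(v, pool) for v in target_values]
            k_bins = [decile_bin(v, pool) for v in metrics[metric]]
            distances.append(math.dist(t_bins, k_bins))
        # Similarity score: average distance over all metrics.
        scores[name] = sum(distances) / len(distances)
    return min(scores, key=scores.get), scores
```

With `target = {"p99_latency": [792, 650], "throughput": [512, 640]}` and two known workloads, one with similar magnitudes and one with far lower latency and far higher throughput, the first is selected. Binning keeps a metric with large raw values (such as a counter in the hundreds of thousands) from dominating the average.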

Step 2: Recommendation

• Goal: recommend configurations in order to optimize the target metric over the entire tuning session

• Method: Reuse data from the similar workload in step #1 to train a statistical model and use it to optimize the next configuration to try


Recommendation Steps

• Use similar historical data & new data to train a Gaussian Process model

• Predict the latency & variance for a sample set of possible configurations

• Optimize the next configuration to try, trading off:
- Exploration: gathering information to improve the model
- Exploitation: greedily trying to do well on the target objective
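The recommendation steps can be sketched with a tiny one-dimensional Gaussian Process written from scratch. Everything specific here is an assumption made for illustration: the RBF kernel and its length scale, the noise term, and the lower-confidence-bound rule (`mean - kappa * std`) used to trade off exploration against exploitation. The slides do not say which kernel or selection rule OtterTune uses.

```python
import math


def rbf(a, b, length=0.5):
    return math.exp(-((a - b) ** 2) / (2 * length ** 2))


def cholesky(A):
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L


def solve_lower(L, b):
    """Forward substitution: solve L z = b."""
    z = []
    for i in range(len(b)):
        z.append((b[i] - sum(L[i][k] * z[k] for k in range(i))) / L[i][i])
    return z


def solve_upper_t(L, z):
    """Back substitution: solve L^T x = z."""
    n = len(z)
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (z[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x


def gp_predict(xs, ys, candidates, noise=1e-4):
    """Posterior (mean, variance) at each candidate; the prior mean is the sample mean."""
    mean_y = sum(ys) / len(ys)
    K = [[rbf(a, b) + (noise if i == j else 0.0) for j, b in enumerate(xs)]
         for i, a in enumerate(xs)]
    L = cholesky(K)
    alpha = solve_upper_t(L, solve_lower(L, [y - mean_y for y in ys]))
    out = []
    for c in candidates:
        k_star = [rbf(c, x) for x in xs]
        v = solve_lower(L, k_star)
        mean = mean_y + sum(k * a for k, a in zip(k_star, alpha))
        var = max(rbf(c, c) - sum(t * t for t in v), 0.0)
        out.append((mean, var))
    return out


def next_config(xs, ys, candidates, kappa=2.0):
    """Pick the candidate with the lowest mean - kappa * std (latency is minimized).
    Larger kappa favors exploration; kappa = 0 is pure exploitation."""
    preds = gp_predict(xs, ys, candidates)
    scores = [m - kappa * math.sqrt(v) for m, v in preds]
    return candidates[scores.index(min(scores))]
```

Here `xs` would be tried settings of one knob (for example buffer pool size in GB) and `ys` the observed latencies. Near observed settings the predicted variance is small, so the rule mostly exploits; far from them the variance term dominates and the rule explores.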

[Figure: 99th %-tile latency (s) as a function of buffer pool size (GB) and log file size (GB).]

Experimental Setup

DBMSs: MySQL (v5.6), Postgres (v9.3)

Training data collection:

• 15 YCSB workload mixtures

• ~30k trials per DBMS

Experiments conducted on Amazon EC2


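Data vs. No Data

[Figure: OtterTune vs. iTuned on TPC-C and Wikipedia for MySQL (v5.6) and Postgres (v9.3); y-axis: 99th %-tile latency (sec), lower is better. Annotation: 2 hours of tuning time. Percentage labels as extracted: TPC-C 36%, 34%; Wikipedia 47%, 42%; Wikipedia 40%, 42%; TPC-C 46%, 30%.]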

MySQL: Efficacy Comparison

[Figure: two bar charts comparing Default, OtterTune, Tuning Script, DBA, and RDS Config. 99th percentile latency (sec; lower is better), bar labels (order not preserved): 0.71, 0.32, 0.74, 0.30, 1.64. Throughput (txn/sec; higher is better), bar labels (order not preserved): 562, 736, 508, 686, 165.]

OtterTune generates a knob configuration comparable to the DBA!

Postgres: Efficacy Comparison

[Figure: two bar charts comparing Default, OtterTune, Tuning Script, DBA, and RDS Config. 99th percentile latency (sec; lower is better), bar labels (order not preserved): 0.30, 0.25, 0.25, 0.24, 0.53. Throughput (txn/sec; higher is better), bar labels (order not preserved): 714, 843, 845, 946, 426.]

OtterTune generates a knob configuration comparable to the DBA!

Future Work


• Automatically detecting the hardware capabilities of the DBMS’s host machine

• Auto-tuning hierarchical components

• Multi-objective optimization

THE END

@danavanaken