Page 1: Self-Tuning Database systems

Self-Tuning Database systems

Wang Haocong

Jan 8, 2009

Page 2: Self-Tuning Database systems

Tuning databases

Logical database design
Physical database design (indexes)
Memory, disks
Materialized Views
Partitioning

Page 3: Self-Tuning Database systems

Self-Tuning Systems

Databases are complicated!
Schema design is hard
Lots of “knobs” to tweak
Need appropriate information

Does the DB approach give us more ability to “self-tune” than some other approach (e.g., Java)?

Page 4: Self-Tuning Database systems

What Would We Like to Auto-Tune?

Query optimization – statistics, bad decisions, …

The schema itself?
  Indices
  Auxiliary materialized views
  Data partitioning
  Perhaps logging?

Page 5: Self-Tuning Database systems

What Are The Challenges in Building Adaptive Systems?

Really, a generalization of those in adaptive query processing

Information gathering – how do we get it?
Extrapolating – how do we do this accurately and efficiently?
  Sampling or piloting
Minimizing the impact of mistakes if they happen
Using app-specific knowledge

Page 6: Self-Tuning Database systems

Who’s Interested in these Problems?

Oracle: Materialized view “wizard”

Microsoft “AutoAdmin”:
  Index selection, materialized view selection
  Stats on materialized views
  Database layout

IBM SMART (Self-Managing And Resource Tuning):
  Histogram tuning (“LEO” learning optimizer)
  Partitioning in clusters
  Index selection
  Adaptive query processing

Page 7: Self-Tuning Database systems

A Particular Instance: Microsoft’s Index Tuning Wizard

Why not let the system choose the best index combination(s) for a workload?

The basic idea:
  Log a whole bunch of queries that are frequently run
  See what set of indices is best

Why is this hard? Why not index everything?

Create these indices with little or no human input
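
To make the query-logging step concrete, here is a minimal sketch, assuming the workload arrives as raw SQL strings; the names WorkloadLog and normalize are illustrative, not part of the Microsoft tool.

```python
from collections import Counter

def normalize(sql):
    """Collapse whitespace and case so repeated queries group together."""
    return " ".join(sql.split()).lower()

class WorkloadLog:
    """Counts how often each (normalized) query is run."""

    def __init__(self):
        self.counts = Counter()

    def record(self, sql):
        self.counts[normalize(sql)] += 1

    def frequent(self, top_n=100):
        """The top_n most frequent query templates with their frequencies."""
        return self.counts.most_common(top_n)

# Example usage
log = WorkloadLog()
log.record("SELECT * FROM orders WHERE customer_id = 42")
log.record("select *   from orders where customer_id = 42")
print(log.frequent(5))  # [('select * from orders where customer_id = 42', 2)]
```

The resulting (query, frequency) pairs become the workload against which candidate index sets are then evaluated.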

Page 8: Self-Tuning Database systems

Possible Approaches

Obviously: only consider indices that would be useful
  The optimizer can “tell” which indices it might use in executing a query

But that continues to be a lot of indices!
  Can exhaustively compare all possible indices
  Note that indices can interact (esp. for updates)

How do we compare costs and benefits of indices?
  Execute for real
  Use optimizer cost model with whatever stats we have
  Gather some stats (e.g., build histograms, sample) and use cost model
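
One way to read the “use the optimizer cost model” option is sketched below: cost every candidate configuration against the whole workload without actually building any index. Here estimated_cost(query, config) is a stand-in for an optimizer “what-if” call; the real interface is product-specific.

```python
from itertools import combinations

def workload_cost(workload, config, estimated_cost):
    """Optimizer-estimated cost of a workload under an index configuration.

    workload:       list of (query, frequency) pairs
    config:         frozenset of candidate index names
    estimated_cost: callable (query, config) -> estimated cost
    """
    return sum(freq * estimated_cost(query, config) for query, freq in workload)

def best_configuration(workload, candidates, estimated_cost, max_indices=2):
    """Exhaustively compare all configurations of up to max_indices indexes."""
    best = frozenset()
    best_cost = workload_cost(workload, best, estimated_cost)
    for size in range(1, max_indices + 1):
        for combo in combinations(candidates, size):
            config = frozenset(combo)
            cost = workload_cost(workload, config, estimated_cost)
            if cost < best_cost:
                best, best_cost = config, cost
    return best, best_cost
```

Exhaustive comparison is affordable only for tiny configurations, which is why the approach on the next slides caps the per-query search and then turns greedy.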

Page 9: Self-Tuning Database systems

Physical design

Page 10: Self-Tuning Database systems

SQL Server Architecture

Page 11: Self-Tuning Database systems

Their Approach in More Detail

For a workload of n queries:
  Generate a separate workload with each query
  Evaluate the candidate indices for this query to find the best “configuration” – limited to 2 indices, 2 tables, single joins
  Candidate index set for the workload is the union of all configurations

Too expensive to enumerate all; use a greedy algorithm (sketched after this slide):
  Exhaustively enumerate (using the optimizer) the best m-index configuration
  Pick a new index I to add, which seems to save cost relative to adding some other I’ or to the current cost
  Repeat until we’ve added “enough” k indices
  “Despite interaction among indices, the largest cost reductions often result from indices that are good candidates by themselves”

They iteratively expand to 2-column indices – the index on the leading column must be desirable for this to be desirable
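
A minimal sketch of that greedy loop, assuming a cost_of(config) function such as workload_cost from the earlier sketch; the parameters m and k and the stopping rule are simplifications, not the tool's exact behavior.

```python
from itertools import combinations

def greedy_index_selection(candidates, cost_of, m=2, k=8):
    """candidates: iterable of index names; cost_of: frozenset -> estimated cost."""
    candidates = list(candidates)

    # Seed: exhaustively find the best configuration of at most m indexes.
    best = frozenset()
    for size in range(1, m + 1):
        for combo in combinations(candidates, size):
            if cost_of(frozenset(combo)) < cost_of(best):
                best = frozenset(combo)

    # Grow: repeatedly add the single index with the largest estimated saving,
    # stopping at k indexes or when no addition reduces the estimated cost.
    while len(best) < k:
        additions = [best | {index} for index in candidates if index not in best]
        if not additions:
            break
        cheapest = min(additions, key=cost_of)
        if cost_of(cheapest) >= cost_of(best):
            break
        best = cheapest
    return best
```

The greedy step leans on the quoted observation: even though indices interact, the biggest savings usually come from indices that look good on their own.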

Page 12: Self-Tuning Database systems

Further Enhancements

Use the tool for “what-if” analysis (a toy sketch follows this slide)
  What if a table grows by a substantial amount?

Supplement with extra info gathered from real query execution
  Maybe we can “tweak” estimates for certain selectivities
  An attempt to compensate for the “exponential error” problem
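
A toy illustration of the what-if idea: scale one table's cardinality in the statistics fed to a cost model and re-cost the workload. The statistics dictionary and the stand-in cost model below are assumptions for illustration only.

```python
def what_if_table_grows(cost_model, stats, table, growth_factor):
    """Return (cost_now, cost_if_grown) for a hypothetical growth of one table."""
    grown = dict(stats)
    grown[table] = int(stats[table] * growth_factor)
    return cost_model(stats), cost_model(grown)

# Example with a made-up cost model: a scan of orders plus lookups on customers.
stats = {"orders": 1_000_000, "customers": 50_000}

def cost_model(s):
    return s["orders"] + 3 * s["customers"]

now, later = what_if_table_grows(cost_model, stats, "orders", 10)
print(now, later)  # 1150000 10150000
```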

Page 13: Self-Tuning Database systems

Physical tuning tool

Decide when to tune
Decide what “representative” workload to use
Run the tool and examine the recommended physical design changes
Implement them if appropriate

Page 14: Self-Tuning Database systems

Alternative Tuning Models

Alerter
  When to tune
  Lightweight tools

Workload as a Sequence
  Read/update queries
  Create/drop physical structures

Page 15: Self-Tuning Database systems

Dynamic tuning

Page 16: Self-Tuning Database systems

Dynamic tuning

Low overhead; must not interfere with the normal functioning of the DBMS

Balance cost of transitioning and potential benefits

Avoid unwanted oscillations
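
A minimal sketch of balancing transition cost against benefit: only switch physical designs when the projected saving over some horizon clearly exceeds the cost of making the change, which also damps oscillation. The horizon and safety margin are assumed knobs, not values from the papers.

```python
def should_transition(current_cost_per_hour, candidate_cost_per_hour,
                      transition_cost, horizon_hours=24.0, safety_margin=1.5):
    """Return True only if the projected saving clearly outweighs the switch cost."""
    projected_saving = (current_cost_per_hour - candidate_cost_per_hour) * horizon_hours
    return projected_saving > safety_margin * transition_cost

# A design saving 10 cost units/hour is adopted only if switching is cheap enough.
print(should_transition(100.0, 90.0, transition_cost=150.0))  # True  (240 > 225)
print(should_transition(100.0, 90.0, transition_cost=200.0))  # False (240 < 300)
```

Requiring the saving to beat the transition cost by a margin means two designs with nearly equal costs never keep displacing each other.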

Page 17: Self-Tuning Database systems

Impact

Tuning Large Workloads (a sketch follows this slide)
  Partition
  Sample

Tuning Production Servers
  Test server
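
One plausible reading of “partition and sample” is sketched below: group queries by template, sample within each group, and redistribute each group's weight over its sample so the compressed workload still costs roughly the same. This is an illustration, not the specific compression scheme of any cited system.

```python
import random
from collections import defaultdict

def compress_workload(workload, per_partition=5, seed=0):
    """workload: list of (template, query, weight); returns a smaller list whose
    total weight per template matches the original."""
    rng = random.Random(seed)
    partitions = defaultdict(list)
    for template, query, weight in workload:
        partitions[template].append((query, weight))

    compressed = []
    for template, items in partitions.items():
        total_weight = sum(weight for _, weight in items)
        sample = rng.sample(items, min(per_partition, len(items)))
        share = total_weight / len(sample)  # spread the weight over the sample
        compressed.extend((template, query, share) for query, _ in sample)
    return compressed
```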

Page 18: Self-Tuning Database systems

Future Directions

Ability to compare the quality of automated physical design solutions

Lightweight approaches

Machine learning techniques, control theory, and online algorithms

Page 19: Self-Tuning Database systems

Memory tuning in DB2

Innovative cost-benefit analysis
  Simulation technique vs. modeling

Tunes memory distribution and total memory usage

Simple greedy memory tuner

Control algorithms to avoid oscillations

Performs very well in experiments
  For both OLTP and DSS
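
A sketch of what a simple greedy memory tuner could look like: repeatedly shift a fixed increment of memory from the consumer with the smallest estimated marginal benefit to the one with the largest, stopping when the gap is negligible (a crude guard against oscillation). benefit_per_page() stands in for DB2's simulation-based cost/benefit estimates; the increment and thresholds are made up.

```python
def rebalance_memory(memory, benefit_per_page, increment=1000,
                     min_gap=1e-6, max_steps=100):
    """memory: dict consumer -> pages currently allocated.
    benefit_per_page(consumer, pages): estimated time saved per extra page."""
    for _ in range(max_steps):
        gains = {c: benefit_per_page(c, pages) for c, pages in memory.items()}
        giver = min(gains, key=gains.get)
        taker = max(gains, key=gains.get)
        if gains[taker] - gains[giver] <= min_gap or memory[giver] <= increment:
            break  # benefits are (nearly) balanced, or nothing left to take
        memory[giver] -= increment
        memory[taker] += increment
    return memory
```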

Page 20: Self-Tuning Database systems

SBPX Operation

[Diagram: page flow between the Buffer Pool, the SBPX, and Disk]

1. Victimize page (move to SBPX)
2. Load new page from disk
3. Page request for …
4. Check buffer pool
5. Check SBPX
6. Start timer
7. Victimize BP page (send to SBPX)
8. Load page from disk
9. Stop timer
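
A sketch of the SBPX idea as read from this slide: keep only the identifiers of recently victimized pages (no data) in a simulated extension, and when a buffer-pool miss hits the SBPX, time the disk read; the accumulated time estimates how much a larger real buffer pool would have saved. The class and method names here are illustrative, not DB2's implementation.

```python
import time
from collections import OrderedDict

class SimulatedBufferPoolExtension:
    """Tracks page ids evicted from the real buffer pool and the time a larger
    pool would plausibly have saved (steps 1-9 above)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.page_ids = OrderedDict()   # page id -> None, kept in LRU order
        self.saved_seconds = 0.0

    def on_victimize(self, page_id):
        """Steps 1 and 7: the real buffer pool evicts a page; remember its id."""
        self.page_ids[page_id] = None
        self.page_ids.move_to_end(page_id)
        if len(self.page_ids) > self.capacity:
            self.page_ids.popitem(last=False)

    def on_buffer_pool_miss(self, page_id, read_from_disk):
        """Steps 3-9: on a miss, check the SBPX and time the disk read if it hits."""
        hit = page_id in self.page_ids
        if hit:
            del self.page_ids[page_id]
        start = time.perf_counter()      # step 6: start timer (meaningful on a hit)
        page = read_from_disk(page_id)   # step 8: load the page from disk
        if hit:
            self.saved_seconds += time.perf_counter() - start  # step 9: stop timer
        return page
```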

Page 21: Self-Tuning Database systems

Experimental results – tuning a static workload

[Chart: transactions per minute and buffer pool (BP) size over time, across Phases 1–3]

Page 22: Self-Tuning Database systems

Experimental results – workload shift

[Chart: time in seconds vs. order of execution (runs 1–34); phase averages of 959, 2285, and 6206 seconds; annotations: “Reduce 63%”, “Some indexes dropped”]

Page 23: Self-Tuning Database systems

Thank you!