Self-Tuning Database systems
Wang Haocong
Jan 8, 2009
2
Tuning databases
- Logical database design
- Physical database design (indexes)
- Memory, disks
- Materialized views
- Partitioning
3
Self-Tuning Systems
- Databases are complicated!
  - Schema design is hard
  - Lots of “knobs” to tweak
  - Need appropriate information
- Does the DB approach give us more ability to “self-tune” than some other approach (e.g., Java)?
4
What Would We Like to Auto-Tune?
- Query optimization – statistics, bad decisions, …
- The schema itself?
  - Indices
  - Auxiliary materialized views
  - Data partitioning
- Perhaps logging?
5
What Are The Challenges in Building Adaptive Systems?
- Really, a generalization of those in adaptive query processing
- Information gathering – how do we get it?
- Extrapolating – how do we do this accurately and efficiently?
  - Sampling or piloting
- Minimizing the impact of mistakes if they happen
- Using app-specific knowledge
6
Who’s Interested in these Problems?
- Oracle: materialized view “wizard”
- Microsoft “AutoAdmin”:
  - Index selection, materialized view selection
  - Stats on materialized views
  - Database layout
- IBM SMART (Self-Managing And Resource Tuning):
  - Histogram tuning (“LEO” learning optimizer)
  - Partitioning in clusters
  - Index selection
  - Adaptive query processing
7
A Particular Instance: Microsoft’s Index Tuning Wizard
- Why not let the system choose the best index combination(s) for a workload?
- The basic idea:
  - Log a whole bunch of queries that are frequently run
  - See what set of indices is best
    - Why is this hard? Why not index everything?
  - Create these indices with little or no human input
8
Possible Approaches
- Obviously: only consider indices that would be useful
  - The optimizer can “tell” which indices it might use in executing a query
- But that is still a lot of indices!
  - Can exhaustively compare all possible indices
  - Note that indices can interact (esp. for updates)
- How do we compare costs and benefits of indices?
  - Execute for real
  - Use the optimizer cost model with whatever stats we have
  - Gather some stats (e.g., build histograms, sample) and use the cost model
9
Physical design
10
SQL Server Architecture
11
Their Approach in More Detail
- For a workload of n queries:
  - Generate a separate workload with each query
  - Evaluate the candidate indices for this query to find the best “configuration” – limited to 2 indices, 2 tables, single joins
  - Candidate index set for workload is the union of all configurations
- Too expensive to enumerate all; use a greedy algorithm:
  - Exhaustively enumerate (using optimizer) best m-index configuration
  - Pick a new index I to add, which seems to save cost relative to adding some other I’ or to the current cost
  - Repeat until we’ve added “enough” (k) indices
  - “Despite interaction among indices, the largest cost reductions often result from indices that are good candidates by themselves”
- They iteratively expand to 2-column indices – index on leading column must be desirable for this to be desirable
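One reading of the greedy search on this slide can be sketched in Python. This is only an illustration under stated assumptions, not the Index Tuning Wizard's implementation: `workload_cost` stands in for the optimizer's cost estimate of a configuration, and the index names, `SAVINGS` table and `toy_cost` function are invented for the example.

```python
from itertools import combinations

def greedy_mk(candidates, workload_cost, m, k):
    """Seed with the best configuration of at most m indexes (exhaustive),
    then add one index at a time until k indexes or no further saving."""
    # Phase 1: exhaustively enumerate every subset of size <= m.
    best = frozenset()
    best_cost = workload_cost(best)
    for size in range(1, m + 1):
        for subset in combinations(sorted(candidates), size):
            cost = workload_cost(frozenset(subset))
            if cost < best_cost:
                best, best_cost = frozenset(subset), cost
    # Phase 2: greedily add the index that lowers the cost the most.
    while len(best) < k:
        choice, choice_cost = None, best_cost
        for idx in sorted(candidates - best):
            cost = workload_cost(best | {idx})
            if cost < choice_cost:
                choice, choice_cost = idx, cost
        if choice is None:  # no remaining index beats the current cost
            break
        best, best_cost = best | {choice}, choice_cost
    return best, best_cost

# Hypothetical cost model: each index saves some query cost but adds a
# fixed maintenance (update) cost of 5, so indexing everything is not free.
SAVINGS = {"orders(cust_id)": 40, "orders(date)": 25,
           "items(order_id)": 30, "items(sku)": 2}

def toy_cost(config):
    return 100 - sum(SAVINGS[i] for i in config) + 5 * len(config)

chosen, cost = greedy_mk(set(SAVINGS), toy_cost, m=1, k=4)
print(sorted(chosen), cost)
```

In this toy model the search stops at three indexes even though k = 4, because the fourth index saves less than it costs to maintain.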
12
Further Enhancements
- Use the tool for “what-if” analysis
  - What if a table grows by a substantial amount?
- Supplement with extra info gathered from real query execution
  - Maybe we can “tweak” estimates for certain selectivities
  - An attempt to compensate for the “exponential error” problem
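A small sketch of why selectivity errors compound, and of one simple way execution feedback could correct an estimate. It assumes independent predicates whose selectivities multiply; the function names and all numbers are made up for illustration and do not come from any particular optimizer.

```python
def cardinality(base_rows, selectivities):
    """Estimated result size when predicate selectivities are multiplied."""
    rows = float(base_rows)
    for s in selectivities:
        rows *= s
    return rows

def corrected_selectivity(estimated_sel, estimated_rows, observed_rows):
    """Scale an estimate by the observed/estimated row ratio from a real run."""
    return estimated_sel * (observed_rows / estimated_rows)

# Each of five predicates is over-estimated by a factor of 2 (0.2 vs. 0.1),
# so the final estimate is off by roughly 2**5.
true_rows = cardinality(1_000_000, [0.1] * 5)
est_rows = cardinality(1_000_000, [0.2] * 5)
error_ratio = est_rows / true_rows

# Feedback on one predicate: 200,000 rows expected, 100,000 observed.
fixed = corrected_selectivity(0.2, 200_000, 100_000)
```

A per-predicate error of 2x grows to about 32x after five multiplications, which is why correcting even a few selectivities from observed row counts can matter.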
13
Physical tuning tool
- Decide when to tune
- Decide on a “representative” workload
- Run the tool and examine the recommended physical design changes
- Implement them if appropriate
14
Alternative Tuning Models
- Alerter
  - When to tune
  - Lightweight tools
- Workload as a sequence
  - Read/update queries
  - Create/drop physical structures
15
Dynamic tuning
16
Dynamic tuning
- Low overhead; must not interfere with the normal functioning of the DBMS
- Balance the cost of transitioning against the potential benefits
- Avoid unwanted oscillations
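The three requirements above can be illustrated with a deliberately simple online policy for a single index. This is an assumed policy written for this example, not the algorithm behind the slide: the class name, the `creation_cost` threshold and the benefit numbers are all hypothetical. The index state flips only after the accumulated evidence for a change exceeds the transition cost, which keeps per-statement work constant and damps oscillation.

```python
class OnlineIndexTuner:
    """Tracks one candidate index; O(1) work per observed statement."""

    def __init__(self, creation_cost):
        self.creation_cost = creation_cost  # assumed cost of a transition
        self.exists = False
        self.credit = 0.0  # evidence accumulated for changing the state

    def observe(self, benefit):
        """benefit > 0: a query the index would speed up;
        benefit < 0: an update the index would slow down."""
        # While the index is absent, benefits argue for creating it;
        # while it is present, penalties argue for dropping it.
        evidence = benefit if not self.exists else -benefit
        self.credit = max(0.0, self.credit + evidence)
        if self.credit >= self.creation_cost:
            self.exists = not self.exists
            self.credit = 0.0
        return self.exists

tuner = OnlineIndexTuner(creation_cost=10)
states = [tuner.observe(b) for b in (4, 4, 4, -3, 5, -6, -6)]
```

Because a single expensive update never outweighs the creation cost on its own, the index is not dropped and recreated on every small shift in the workload.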
17
Impact
- Tuning large workloads
  - Partition
  - Sample
- Tuning production servers
  - Test server
18
Future Directions
- Ability to compare the quality of automated physical design solutions
- Lightweight approaches
- Machine learning techniques, control theory and online algorithms
19
Memory tuning in DB2
- Innovative cost-benefit analysis
  - Simulation technique vs. modeling
- Tunes memory distribution and total memory usage
- Simple greedy memory tuner
- Control algorithms to avoid oscillations
- Performs very well in experiments
  - For both OLTP and DSS
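A minimal sketch of what one step of a greedy memory tuner might look like. It is an assumption-laden illustration rather than DB2's tuner: the consumer names, the `benefit` values (time saved per extra page), the fixed `step` and the `min_size` floor are invented, and it only redistributes memory without changing the total.

```python
def greedy_memory_step(sizes, benefit, step, min_size):
    """Move `step` pages from the consumer that benefits least from extra
    memory to the one that benefits most; return the new allocation."""
    receiver = max(sizes, key=lambda c: benefit[c])
    # A donor must stay at or above the minimum size after giving up pages.
    donors = [c for c in sizes
              if c != receiver and sizes[c] - step >= min_size]
    if not donors:
        return dict(sizes)
    donor = min(donors, key=lambda c: benefit[c])
    if benefit[receiver] <= benefit[donor]:
        return dict(sizes)  # nothing to gain from moving memory
    new_sizes = dict(sizes)
    new_sizes[donor] -= step
    new_sizes[receiver] += step
    return new_sizes

sizes = {"bufferpool": 1000, "sort": 500, "locklist": 200}
benefit = {"bufferpool": 0.8, "sort": 0.1, "locklist": 0.3}
after = greedy_memory_step(sizes, benefit, step=100, min_size=200)
```

Capping each transfer at a fixed `step` is one crude way to damp oscillation; a real controller would likely vary the step size with how stable the measured benefits are.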
20
SBPX Operation
[Diagram: Buffer Pool, SBPX, Disk]
1. Victimize page (move to SBPX)
2. Load new page from disk
3. Page request for
4. Check bufferpool
5. Check SBPX
6. Start timer
7. Victimize BP page (send to SBPX)
8. Load page from disk
9. Stop timer
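The steps above can be sketched as a small simulation. This rests on assumptions not stated on the slide: the SBPX is taken to hold only the IDs of recently victimized pages, eviction is LRU, and instead of a wall-clock timer the hypothetical `read_from_disk` callback returns the read latency directly so the example is deterministic. The accumulated `saved_time` approximates the time a larger buffer pool would have saved.

```python
from collections import OrderedDict

class SimulatedBufferPool:
    def __init__(self, bp_pages, sbpx_pages, read_from_disk):
        self.bp = OrderedDict()    # page_id -> contents, in LRU order
        self.sbpx = OrderedDict()  # IDs of recently victimized pages only
        self.bp_pages = bp_pages
        self.sbpx_pages = sbpx_pages
        self.read_from_disk = read_from_disk  # returns (contents, seconds)
        self.saved_time = 0.0

    def get(self, page_id):
        if page_id in self.bp:            # check the buffer pool
            self.bp.move_to_end(page_id)
            return self.bp[page_id]
        in_sbpx = page_id in self.sbpx    # check the SBPX
        if in_sbpx:
            del self.sbpx[page_id]
        if len(self.bp) >= self.bp_pages:
            # Victimize the LRU page and remember its ID in the SBPX.
            victim, _ = self.bp.popitem(last=False)
            self.sbpx[victim] = None
            if len(self.sbpx) > self.sbpx_pages:
                self.sbpx.popitem(last=False)
        contents, seconds = self.read_from_disk(page_id)
        if in_sbpx:
            # A pool larger by the SBPX size would have avoided this read.
            self.saved_time += seconds
        self.bp[page_id] = contents
        return contents

pool = SimulatedBufferPool(2, 2, lambda pid: (f"page-{pid}", 0.01))
for pid in (1, 2, 3, 1, 4, 1):
    pool.get(pid)
```

Only the second request for page 1 counts toward `saved_time`: it misses the buffer pool but hits the SBPX, so its disk read is the cost that extra buffer pool memory would have removed.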
21
Experimental results – tuning a static workload
[Figure: transactions per minute and buffer pool (BP) size over time, across Phase 1, Phase 2 and Phase 3]
22
Experimental results – workload shift
[Figure: time in seconds (0 to 7000) by order of execution (1 to 34). Annotations: avg = 959; avg = 2285; avg = 6206; “Reduce 63%”; “Some indexes dropped”]
23
Thank you!