KORHAN KOCABIYIK 1
R* Optimizer Validation and Performance Evaluation for Distributed Queries
Lothar F. Mackert, Guy M. Lohman
IBM Almaden Research Center
1986
presented by: KORHAN KOCABIYIK
Paper Layout
Overview of Distributed Compilation and Optimization
Instrumentation
General Measurements
Distributed Join Results
Suggestions for Improving their Performance
Conclusions
Suggested Readings
About Paper
It evaluates the performance of R* on distributed queries (queries whose tables reside at multiple sites)
The message, I/O, and CPU resources actually consumed by R* are compared with the corresponding costs estimated by the optimizer
A large number of tests were run over a wide variety of dynamically created test databases, SQL queries, and system parameters
Networks of different speeds yield different performance for distributed queries
What is System R?
System R is a database system built as a research project at IBM Almaden Research Center in the 1970's.
System R introduced the SQL language and also demonstrated that a relational system could provide good transaction processing performance.
Non-procedural user interface: users specify only what data is desired, leaving the system's optimizer to choose how to access that data
Optimizer plays a central role regarding system performance
R*
R* inherited the optimization algorithms of System R and extended them to a distributed environment
Each site is as autonomous as possible
– No central scheduler, no central deadlock detection, no central catalog, etc.
R* uses snapshot data – a “copy of a relation(s) in which the data is consistent but not necessarily up to date,” used to provide a static copy of the database
The unit of distribution in R* is a table, and each table is stored at one and only one site
Optimization Should Answer
Under what circumstances (regions of parameter space) does the optimizer choose a suboptimal plan, or, worse, a particularly bad plan?
To which parameters is the actual performance most sensitive? Are these parameters being modeled accurately by the optimizer?
What is the impact of variations from the optimizer’s simplifying assumptions?
Is it possible to simplify the optimizer’s model (by using heuristics, for example) to speed up optimization?
What are the best database statistics to support optimization?
Distributed Compilation and Optimization
Distributed Query: Any SQL data manipulation statement that references tables at sites other than the query site
Query Site: The site to which the application program is submitted for compilation
Master Site: The site that coordinates the optimization of all SQL statements embedded in that program
Apprentice Site: For each query, any site other than the master site that stores a table referenced in the query
Join Site: The site at which each join takes place. There are two methods for transferring a copy of the inner table to the join site:
– Ship Whole: A copy of the entire table is transferred
– Fetch Matches: Only matching tuples are transferred
Fetch Matches
1) Project the outer tuple onto its join-column value and ship this value to the site of the inner table
2) Find those tuples in the inner table that match the value sent and project them onto the columns needed
3) Ship a copy of the projected matching inner tuples back to the join site
4) Join the matches to the outer table
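The four steps above can be sketched in Python. This is a minimal in-memory stand-in for illustration (the table/column names are hypothetical); in R* the "ship" steps are actual network transfers between sites:

```python
# Sketch of R*'s fetch-matches transfer strategy for a two-table join.
# Tables are modeled as lists of dicts; the "ship" comments mark where
# R* would send data over the network.

def fetch_matches_join(outer, inner, outer_col, inner_col, inner_proj):
    """Join `outer` (at the join site) with `inner` (at a remote site)."""
    result = []
    for out_tuple in outer:
        # 1) Project the outer tuple's join-column value; "ship" it to
        #    the inner table's site.
        value = out_tuple[outer_col]
        # 2) At the inner site, find matching tuples and project them
        #    onto the columns needed.
        matches = [{c: t[c] for c in inner_proj}
                   for t in inner if t[inner_col] == value]
        # 3) "Ship" the projected matches back to the join site.
        # 4) Join the matches with the outer tuple.
        for m in matches:
            result.append({**out_tuple, **m})
    return result
```

Note that one round trip happens per outer tuple, which is exactly why fetch-matches can congest the network when the outer table is large.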
Optimization in R*
Compilation, and hence optimization, is truly distributed in R*
The master’s optimizer makes all inter-site decisions, such as the sites at which inter-site joins take place and the method and order of transfers between sites
Intra-site decisions are only suggested by the master’s planner
R* Cost Structure
The optimizer tries to minimize a cost function that is a linear combination of four components: CPU, I/O, the number of messages, and the total number of bytes transferred
– Cost = Wcpu(#_instructions) + Wi/o(#_I/Os) + Wmsg(#_msgs) + Wbyt(#_bytes)
Wmsg is a constant per-message penalty for sending a communication over the network
Wbyt is a penalty proportional to message length
CPU cost
Both local and distributed queries found CPU cost to be important
CPU costs are high in sorts:
– 1. Allocating temporary disk space for partially sorted strings
– 2. Encoding and decoding sorted columns
– 3. Quicksorting individual pages in memory
CPU costs are also high in scans
Although “CPU cost is significant . . . [it] is not enough to affect the choice of the optimizer”
Instruments
1. Added three statements to the SQL language for test control and performance monitoring (EXPLAIN, COLLECT COUNTERS, FORCE OPTIMIZER)
2. Developed pre-compiled applications for automatically
– (a) testing queries using the SQL statements of (1)
– (b) analyzing the data collected by step (a)
3. Stored the output of the SQL statements of (1) and the application programs of (2) in database tables to establish an interface between (1), (2a), and (2b)
Distributed EXPLAIN
The EXPLAIN command writes to a user-owned PLAN_TABLE information describing the access plan chosen by the optimizer for a given SQL statement
No single site has the complete access plan for a distributed query
The master site has the inter-site access plan, and each apprentice has its local access plans
The R* EXPLAIN command was augmented to store each apprentice site’s plan in a local PLAN_TABLE, and the test application program was altered to retrieve that information from each apprentice’s PLAN_TABLE
Distributed COLLECT COUNTERS
This new SQL statement collects, and stores in a user-owned table, the values of 40 internal counters of the storage component RSS* (counts of disk reads, writes, and lookups) and some newly implemented counters of the communications component DC*
A process run at the master site and child processes run at the apprentices collect these counters and insert them into the COUNTER_TABLE at the master site
Distributed FORCE OPTIMIZER
To measure the cost of suboptimal plans, they had to force the optimizer to use a specified plan
The FORCE OPTIMIZER statement chooses the plan for the next SQL statement
Apprentice optimization can be forced by simply telling each apprentice to utilize the optimization decisions recommended by the master-site
Experimental Environment
2 identical IBM 4381 machines, each with 40 buffer pages of 4 KB
3,000 random integer values for the join column
4 integer & 5 fixed-size character columns
Each tuple is exactly 66 bytes long
The system stored 50 tuples per page
Join tests produced result sets of 100 to 6,000 tuples
Costs Measured
Distributed Join
Focused on distributed joins rather than single-table queries
In R*, n-table joins are executed as a sequence of n-1 two-table joins
Intermediate results of these joins are called composite tables
For simplicity they used only two-table joins
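The n-1 pairwise join sequence can be sketched in Python. This is a naive in-memory illustration (helper names are hypothetical, and a nested-loop equi-join on a single shared column stands in for R*'s actual join executor); the point is only that the running intermediate result plays the role of the composite table:

```python
from functools import reduce

# Sketch: an n-table join executed as a sequence of n-1 two-table joins,
# with each intermediate result acting as the composite (outer) table.

def two_table_join(composite, inner, col):
    """Naive nested-loop equi-join on the shared column name `col`."""
    return [{**c, **t} for c in composite for t in inner if c[col] == t[col]]

def n_table_join(tables, col):
    # reduce folds left-to-right: n tables -> n-1 pairwise joins.
    return reduce(lambda comp, tbl: two_table_join(comp, tbl, col), tables)
```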
Distributed Joins (Cont’)
There are two different join methods:
– Merge Scan Join (-M-)
– Nested Loop Join (-N-)
There are two different transfer methods:
– Ship whole table (W)
– Fetch matches only (F)
AIW-M-B means: the indexed outer table A is shipped whole to inner table B and joined by merge scan
Inner Table Transfer Strategy
If we ship the inner table B as a whole
– the best plan is AI-M-BIW
If we fetch the matching inner tuples
– the best plan is AI-M-BIF
Transfer Trade-offs
Option W – transfer the whole inner table
– Negatives
• No indexes can be shipped with it
• May ship inner tuples that have no matching outer tuples
– Positives
• Predicates applied before shipping may reduce size
• Only one transfer is needed for the join, which may result in lower overall network costs
Transfer Trade-offs (Cont.)
Option F – fetch matching tuples only
• Idea – send outer-tuple join-column values, match them with inner tuples, then send these inner tuples over the network
– Negatives
• Multiple rounds of sending congest the network
• May have to send the whole table anyway – W better
– Positives
• Tables may have few actual join values in common, which can eliminate the need to send many inner tuples
Use W or F Option?
With W, network costs were only 2.9% of the total
Strategy F is handy when:
– The cardinality of the outer table is < 0.5 × the number of messages required to ship the inner table as a whole
• Idea behind the rule – beat plan W in theory by sending the few join values from the outer table that will weed out most inner tuples
– The join cardinality is < the inner cardinality
• Idea behind the 2nd rule – since most inner tuples will not be needed, we can beat plan W by sending only outer join values and eliminating most inner tuples
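The two conditions above can be expressed as a small decision helper. This is a sketch of the slide's rule of thumb only (the function name and the reading of the two bullets as a conjunction are assumptions, not the optimizer's actual code):

```python
# Sketch of the W-vs-F rule of thumb described above. Assumes both
# slide conditions must hold for fetch-matches (F) to be attractive.

def prefer_fetch_matches(outer_card, inner_card, join_card,
                         msgs_to_ship_inner_whole):
    # Condition 1: the outer table is small relative to the number of
    # messages needed to ship the inner table whole.
    small_outer = outer_card < 0.5 * msgs_to_ship_inner_whole
    # Condition 2: the join is selective, so most inner tuples are
    # never needed at the join site.
    selective_join = join_card < inner_card
    return small_outer and selective_join
```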
Optimizer performance (Local)
The optimizer has trouble modeling unclustered indexes on smaller tables. In such cases the optimizer actually picks the worst plan thinking it is the best, and thinks the best plan is actually the worst
Why?
– Tuple order is unimportant to a nested-loop join, and an index on the outer table may clutter the buffer
– A highly selective predicate may eliminate the need for a scan, in which case the index is important
Adding an index can increase the cost – the optimizer models each table independently, ignoring competition for buffer space between the two tables being joined
Distributed vs. Local Joins
Total resources consumed by distributed joins are higher than by local joins
Response time of distributed joins is less than that of local joins
What does this mean?
– More machines do the work in a distributed join, so they can work in parallel: more work is done, but since more machines are doing it, the result takes less time
Distrib vs. Local Joins Cont.
The response-time improvement for distributed queries has 2 reasons:
– 1. Parallelism
– 2. Reduced contention – accesses through unclustered indexes benefit greatly from larger buffers, and n machines = n × the buffer size
Negative of distributed – slow network speeds make the reverse true; then local joins are faster
Alternative Join methods
R* also introduced some alternative join methods:
– joins using dynamically created indexes
– semijoins
– joins using hashing (Bloom) filters (Bloomjoins)
Setup: S is a table at site 1 and T is a table at site 2, joined on S.a = T.b, where a is a column of S and b is a column of T
Two cases:
– Both S and T have an (unclustered) index on their join column(s)
– Neither has an (unclustered) index on its join column(s)
Dynamically Create Temporary Index on Inner Table
Since we cannot send an index, we can try to make one
Cost may be high:
– Scan the entire inner table and send it to site 1
– Store the table and create a temporary index on it at site 1
– Execute the best local join plan
Semijoin
Sort S and T on the join column, producing S’ and T’
Send S’ ’s join-column values to T’s site
Match them against T’ and send the matching tuples to S’s site
Merge-join T’ ’s matching tuples with S’ ’s tuples to get the answer
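As a sketch, the semijoin steps look like this in Python (in-memory stand-ins for the two sites; the final join uses a nested loop for brevity where R* would merge-join the sorted runs):

```python
# Sketch of the semijoin method: ship only S's join-column values to
# T's site, then ship back only the matching T tuples.

def semijoin(S, T, a, b):
    S_sorted = sorted(S, key=lambda t: t[a])   # S'
    T_sorted = sorted(T, key=lambda t: t[b])   # T'
    # "Ship" S''s join-column values to T's site...
    join_values = {t[a] for t in S_sorted}
    # ...match against T' and "ship" the matching tuples back to S's site.
    T_matches = [t for t in T_sorted if t[b] in join_values]
    # Final join at S's site (nested loop here for brevity;
    # R* merge-joins the two sorted inputs).
    return [{**s, **t} for s in S_sorted for t in T_matches if s[a] == t[b]]
```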
Bloom join
Use a Bloom filter – a bit string, rather like a hash table in which each bit represents a bucket. All bits start at 0; if a value hashes to bit x, set bit x to 1
Generate a Bloom filter for table S and send it to T’s site
Hash T using the same hash function and ship any tuples that hash to a 1 in S’s Bloom filter
At site S, join T’s shipped tuples with table S
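The steps above can be sketched in Python. A single hash function and a fixed filter size are simplifying assumptions for illustration (real Bloom filters typically use several hash functions):

```python
# Sketch of a Bloomjoin: site S builds a bit filter over its join column,
# ships only the (small) filter to T's site, and T ships back only tuples
# whose hash lands on a set bit.

FILTER_BITS = 1024  # illustrative filter size

def bloom_filter(S, a):
    bits = [0] * FILTER_BITS
    for t in S:
        bits[hash(t[a]) % FILTER_BITS] = 1
    return bits

def bloom_join(S, T, a, b):
    bits = bloom_filter(S, a)  # built at S's site, "shipped" to T's site
    # T ships back only tuples that pass the filter (false positives
    # are possible, missed matches are not).
    candidates = [t for t in T if bits[hash(t[b]) % FILTER_BITS]]
    # The final join at S's site removes any false positives.
    return [{**s, **t} for s in S for t in candidates if s[a] == t[b]]
```

Only the bit string and the surviving T tuples cross the network, which is why Bloomjoin's message costs stay comparable to the semijoin's while avoiding its second full join pass.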
Comparison of join methods
Bloomjoins generally outperform the other methods
Semijoins are advantageous when both the data and the (unclustered) index pages of the inner table fit into the buffer, so that efficient reuse of those pages keeps semijoin processing costs low. If not, constant paging of the unclustered index results in poor performance
Why are Bloom Joins better?
The message costs of semijoins and Bloomjoins are comparable
The semijoin incurs higher local processing costs to perform a “second join”: first send S’ ’s join column to T’ and join, then send this result back and join these T’ tuples with S’
Conclusions
Shipping the entire inner table to the join site and storing it there dominates the fetch-matches strategy
The R* optimizer’s modeling of message costs was very accurate (a necessary condition for picking the correct join site)
Even for medium-speed, long-haul networks linking geographically dispersed hosts, local costs for CPU and I/O are significant
Bloomjoin dominates the other distributed join methods in all cases investigated
They did not test joins over very large tables or over more than 2 tables, so those may behave differently
They used an environment in which all sites have the same capabilities and a uniform network link, so adapting the optimizer to real life may be rather difficult
Questions? Thanks!
presented by: KORHAN KOCABIYIK