10 Copyright © 2005, Oracle. All rights reserved. Sorting and Joining

Sorting and Joining


After completing this lesson, you should be able to optimize sort performance by using Top-N SQL, describe the techniques used for executing join statements, explain the optimization that is performed on joins, and find the optimal execution plan for a join. Joins are probably the most frequent cause of performance problems.



Objectives

After completing this lesson, you should be able to do the following:
• Optimize sort performance
• Describe different join techniques
• Explain join optimization
• Find optimal join execution plans


Sort Operations

• SORT UNIQUE
• SORT AGGREGATE
• SORT GROUP BY
• SORT JOIN
• SORT ORDER BY


Tuning Sort Performance

• Sorting large sets can be expensive, so you should tune sort parameters.
• Note that DISTINCT, GROUP BY, and most set operators cause implicit sorts.
• Minimize sorting by one of the following:
  – Try to avoid DISTINCT and GROUP BY.
  – Use UNION ALL instead of UNION.
  – Enable index access to avoid sorting.


Top-N SQL

SQL> SELECT *
     FROM  (SELECT prod_id
            ,      prod_name
            ,      prod_list_price
            ,      prod_min_price
            FROM   products
            ORDER BY prod_list_price DESC)
     WHERE  ROWNUM <= 5;
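
A Top-N query needs only the first N rows of the ordered result. As a rough illustration of why that can be cheaper than sorting everything (this is not a description of Oracle's internals), the Python sketch below keeps only the five most expensive rows while scanning the input. The product rows and the function name are invented for the example.

```python
import heapq

# Hypothetical product rows: (prod_id, prod_name, prod_list_price).
products = [
    (1, "Widget", 19.99),
    (2, "Gadget", 249.00),
    (3, "Gizmo", 99.50),
    (4, "Doohickey", 5.25),
    (5, "Thingamajig", 1299.00),
    (6, "Whatsit", 49.95),
    (7, "Contraption", 599.00),
]


def top_n_by_price(rows, n):
    """Return the n most expensive rows, highest price first.

    heapq.nlargest tracks at most n candidates while it scans,
    so the whole input never has to be fully sorted.
    """
    return heapq.nlargest(n, rows, key=lambda row: row[2])


top_five = top_n_by_price(products, 5)
for prod_id, prod_name, price in top_five:
    print(prod_id, prod_name, price)
```

The result matches what a full descending sort followed by taking the first five rows would return.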


Join Terminology

• Join statement
• Join predicate, nonjoin predicate
• Single-row predicate

SQL> SELECT c.cust_last_name, c.cust_first_name,
            co.country_id, co.country_name
     FROM   customers c JOIN countries co
     ON     (c.country_id = co.country_id)   -- join predicate
     AND    ( co.country_id = 'JP'           -- nonjoin predicate
     OR       c.cust_id = 205);              -- single-row predicate


Join Terminology

• Natural join

SQL> SELECT c.cust_last_name, co.country_name
     FROM   customers c NATURAL JOIN countries co;

• Join with nonequal predicate

SQL> SELECT s.amount_sold, p.promo_name
     FROM   sales s JOIN promotions p
     ON    ( s.time_id
             BETWEEN p.promo_begin_date
             AND     p.promo_end_date );

• Cross join

SQL> SELECT *
     FROM   customers c CROSS JOIN countries co;


Join Operations

• A join operation combines the output from two row sources and returns one resulting row source.

• Join operation types include the following:
  – Nested loops join
  – Sort and merge join
  – Hash join


Nested Loops Joins

• One of the two tables is defined as the outer table (or the driving table).
• The other table is called the inner table.
• For each row in the outer table, all matching rows in the inner table are retrieved.

For each row in the outer table
    For each row in the inner table
        Check for a match
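
The pseudocode above translates almost line for line into Python. The sketch below is only an in-memory illustration of the loop structure: rows are plain dictionaries, the sample data is invented, and no index access on the inner table is modeled.

```python
def nested_loops_join(outer_rows, inner_rows, outer_key, inner_key):
    """Join two lists of dict rows with a plain nested loop."""
    result = []
    for outer in outer_rows:            # for each row in the outer table
        for inner in inner_rows:        # for each row in the inner table
            if outer[outer_key] == inner[inner_key]:   # check for a match
                result.append({**outer, **inner})
    return result


# Hypothetical sample rows.
customers = [
    {"cust_id": 205, "cust_last_name": "Ito", "country_id": "JP"},
    {"cust_id": 310, "cust_last_name": "Rossi", "country_id": "IT"},
    {"cust_id": 412, "cust_last_name": "Sato", "country_id": "JP"},
]
countries = [
    {"country_id": "JP", "country_name": "Japan"},
    {"country_id": "IT", "country_name": "Italy"},
    {"country_id": "FR", "country_name": "France"},
]

# customers drives the join; countries is the inner table.
nl_result = nested_loops_join(customers, countries, "country_id", "country_id")
for row in nl_result:
    print(row["cust_last_name"], row["country_name"])
```

The inner table is scanned once per outer row, which is why this method suits a small driving row set.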


Nested Loops Join Plan

1  Nested loops
   2  Table access (outer/driving table)
   3  Table access (inner table)


When Are Nested Loops Joins Used?

Nested loops joins are used when:
• Joining a few rows that have a good driving condition
• Order of tables is important
• USE_NL(table1 table2) hint is used


Hash Joins

A hash join is executed as follows:
• Both tables are split into as many partitions as required, using a full table scan.
• For each partition pair, a hash table is built in memory on the smaller partition.
• The other partition is used to probe the hash table.
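
The three steps of a hash join can be sketched in Python as follows. This is a simplified in-memory model, not Oracle's implementation: the partition count, function names, and sample rows are assumptions made for the example, and nothing is spilled to disk.

```python
from collections import defaultdict


def partition(rows, key, n_partitions):
    """Split rows into n_partitions buckets by hashing the join key."""
    parts = [[] for _ in range(n_partitions)]
    for row in rows:
        parts[hash(row[key]) % n_partitions].append(row)
    return parts


def hash_join(left_rows, right_rows, key, n_partitions=4):
    """Equijoin two lists of dict rows, partition pair by partition pair."""
    result = []
    left_parts = partition(left_rows, key, n_partitions)
    right_parts = partition(right_rows, key, n_partitions)
    for left_part, right_part in zip(left_parts, right_parts):
        # Build the hash table on the smaller partition of the pair.
        if len(left_part) <= len(right_part):
            build, probe = left_part, right_part
        else:
            build, probe = right_part, left_part
        table = defaultdict(list)
        for row in build:
            table[row[key]].append(row)
        # Probe the hash table with the other partition.
        for row in probe:
            for match in table.get(row[key], []):
                result.append({**match, **row})
    return result


# Hypothetical sample rows.
customers = [
    {"cust_id": 205, "country_id": "JP"},
    {"cust_id": 310, "country_id": "IT"},
    {"cust_id": 412, "country_id": "JP"},
]
countries = [
    {"country_id": "JP", "country_name": "Japan"},
    {"country_id": "IT", "country_name": "Italy"},
    {"country_id": "FR", "country_name": "France"},
]

hj_result = hash_join(customers, countries, "country_id")
print(sorted((row["cust_id"], row["country_name"]) for row in hj_result))
```

Because rows with equal keys always hash to the same partition number, each pair of partitions can be joined independently. The order of the output rows is not defined, so the example sorts before printing.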


Hash Join Plan

1  Hash join
   2  Table access
   3  Table access


When Are Hash Joins Used?

Hash joins are used if either of the following conditions is true:
• A large amount of data needs to be joined.
• A large fraction of the table needs to be joined.

Use the USE_HASH hint, and use correct values for the following initialization parameters:
• HASH_AREA_SIZE
• HASH_JOIN_ENABLED


Sort and Merge Joins

A sort and merge join is executed as follows:
1. The rows from each row source are sorted on the join predicate columns.
2. The two sorted row sources are then merged and returned as the resulting row source.
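
The two steps can be sketched in Python for the equijoin case. This is an in-memory illustration with invented sample rows, assuming a single join column whose values can be compared with `<` and `>`.

```python
def sort_merge_join(left_rows, right_rows, key):
    """Equijoin two lists of dict rows by sorting both, then merging."""
    # Step 1: sort each row source on the join column.
    left = sorted(left_rows, key=lambda row: row[key])
    right = sorted(right_rows, key=lambda row: row[key])

    # Step 2: walk both sorted lists in parallel.
    result = []
    i = j = 0
    while i < len(left) and j < len(right):
        left_key, right_key = left[i][key], right[j][key]
        if left_key < right_key:
            i += 1
        elif left_key > right_key:
            j += 1
        else:
            # Find the run of right rows sharing this key value.
            j_end = j
            while j_end < len(right) and right[j_end][key] == left_key:
                j_end += 1
            # Pair every left row with that key against the whole run.
            while i < len(left) and left[i][key] == left_key:
                for right_row in right[j:j_end]:
                    result.append({**left[i], **right_row})
                i += 1
            j = j_end
    return result


# Hypothetical sample rows.
customers = [
    {"cust_id": 205, "country_id": "JP"},
    {"cust_id": 310, "country_id": "IT"},
    {"cust_id": 412, "country_id": "JP"},
]
countries = [
    {"country_id": "JP", "country_name": "Japan"},
    {"country_id": "IT", "country_name": "Italy"},
    {"country_id": "FR", "country_name": "France"},
]

smj_result = sort_merge_join(customers, countries, "country_id")
for row in smj_result:
    print(row["cust_id"], row["country_name"])
```

The output comes back ordered by the join column, which is one reason this method is attractive when a sort is needed anyway.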


Sort and Merge Join Plan

1  Merge
   2  Sort
      4  Table access
   3  Sort
      5  Table access


When Are Sort and Merge Joins Used?

Sort and merge joins can be used if any of the following conditions are true:
• Join condition is not an equijoin.
• OPTIMIZER_MODE is set to RULE.
• HASH_JOIN_ENABLED is false.
• Sorts are required for other operations.
• Optimizer has considered the costs based on the values of:
  – HASH_AREA_SIZE
  – SORT_AREA_SIZE


Joining Multiple Tables

You can join only two row sources at a time. Joins with more than two tables are executed as follows:
1. Two tables are joined, resulting in a row source.
2. The next table is joined with the row source that results from step 1.
3. Repeat step 2 until all tables are joined.
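
The pairwise procedure is a left fold over the list of row sources. The Python sketch below shows this with a deliberately simple two-way join that matches rows on whatever column names they share. The helper names and sample rows are invented for the illustration.

```python
from functools import reduce


def join_two(left_rows, right_rows):
    """Join two row sources on all column names they have in common."""
    result = []
    for left in left_rows:
        for right in right_rows:
            shared = left.keys() & right.keys()
            if all(left[col] == right[col] for col in shared):
                result.append({**left, **right})
    return result


def join_all(row_sources):
    """Join the first two sources, then join each further source
    with the row source produced so far."""
    return reduce(join_two, row_sources)


# Hypothetical sample rows.
sales = [
    {"prod_id": 1, "cust_id": 205, "amount_sold": 10.0},
    {"prod_id": 2, "cust_id": 310, "amount_sold": 25.0},
]
products = [
    {"prod_id": 1, "prod_name": "Widget"},
    {"prod_id": 2, "prod_name": "Gadget"},
]
customers = [
    {"cust_id": 205, "cust_last_name": "Ito"},
    {"cust_id": 310, "cust_last_name": "Rossi"},
]

three_way = join_all([sales, products, customers])
for row in three_way:
    print(row["prod_name"], row["cust_last_name"], row["amount_sold"])
```

The order of the list passed to `join_all` plays the role of the join order: a different order gives the same rows here but different intermediate row sources.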


SQL:1999 Outer Joins

• The plus (+) sign is not used.
• The keywords OUTER JOIN are used instead.

SQL> SELECT s.time_id, t.time_id
     FROM   sales s
     RIGHT OUTER JOIN times t
     ON     (s.time_id = t.time_id);


Oracle Proprietary Outer Joins

• Join predicates with a plus (+) sign
• Nonjoin predicates with a plus (+) sign
• Predicates without a plus (+) sign disable outer joins.

SQL> SELECT s.time_id, t.time_id
     FROM   sales s, times t
     WHERE  s.time_id (+) = t.time_id;


Full Outer Joins

• A full outer join acts like a combination of the left and right outer joins.

• In addition to the inner join, rows in both tables that have not been returned in the result of the inner join are preserved and extended with nulls.

• This syntax has been introduced in Oracle9i Database.

SQL> SELECT c.cust_id, c.cust_last_name
     ,      co.country_name
     FROM   customers c
     FULL OUTER JOIN countries co
     ON     (c.country_id = co.country_id);
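
To make the "preserved and extended with nulls" behavior concrete, the Python sketch below models a full outer join on in-memory rows, with `None` standing in for NULL. It illustrates the result semantics only, not how the database executes the join. The sample rows are invented.

```python
def full_outer_join(left_rows, right_rows, key):
    """Full outer equijoin of two lists of dict rows; None models NULL."""
    left_cols = {col for row in left_rows for col in row}
    right_cols = {col for row in right_rows for col in row}
    result = []
    matched_right = set()

    for left in left_rows:
        found = False
        for idx, right in enumerate(right_rows):
            # A NULL key never matches anything, as in SQL.
            if left[key] is not None and left[key] == right[key]:
                result.append({**left, **right})
                matched_right.add(idx)
                found = True
        if not found:
            # Preserve the left row; right-only columns become None.
            result.append({**dict.fromkeys(right_cols), **left})

    for idx, right in enumerate(right_rows):
        if idx not in matched_right:
            # Preserve the right row; left-only columns become None.
            result.append({**dict.fromkeys(left_cols), **right})
    return result


# Hypothetical sample rows.
customers = [
    {"cust_id": 205, "country_id": "JP"},
    {"cust_id": 999, "country_id": "ZZ"},   # no matching country
]
countries = [
    {"country_id": "JP", "country_name": "Japan"},
    {"country_id": "FR", "country_name": "France"},   # no customers
]

foj_result = full_outer_join(customers, countries, "country_id")
for row in foj_result:
    print(row["cust_id"], row["country_id"], row["country_name"])
```

The first loop alone would be a left outer join; the final loop adds the rows a right outer join would preserve.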


Execution of Outer Joins

Indexes can be used for outer join predicates.

SQL> SELECT c.cust_id, co.country_name
     FROM   customers c
     LEFT OUTER JOIN countries co
     ON     (c.country_id = co.country_id)
     AND    co.country_id = 'IT';


The Optimizer and Joins

The optimizer determines the following (in sequence):
1. Order in which to join the tables
2. Best join operation to apply for each join
3. Access path for each row source


Join Order Rules

Rule 1

A single-row predicate forces its row source to be placed first in the join order.

Rule 2

For outer joins, the outer-joined table (the one whose missing rows are filled with nulls) must come after the other table in the join order.


Join Optimization

• As a first step, a list of possible join orders is generated.
• This potentially results in the following:

Number of Tables   Join Orders
----------------   -----------
2                  2! = 2
3                  3! = 6
4                  4! = 24

• Parse time grows exponentially when adding tables to a join.
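
The factorial growth is easy to reproduce. The short Python sketch below enumerates every ordering of a list of table names and prints the counts for two to four tables. The table names are placeholders.

```python
from itertools import permutations
from math import factorial


def join_orders(tables):
    """Return every possible ordering of the given table names."""
    return list(permutations(tables))


# Number of candidate join orders for 2, 3, and 4 tables.
for n_tables in range(2, 5):
    print(n_tables, factorial(n_tables))

orders_for_three = join_orders(["customers", "countries", "sales"])
for order in orders_for_three:
    print(" -> ".join(order))
```

With ten tables the same count is already 3,628,800, which is why the number of orders that is actually costed has to be limited.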


Join Optimization

1. A set of possible execution plans is generated, limited by the following initialization parameter: OPTIMIZER_MAX_PERMUTATIONS

2. The optimizer then estimates the cost of each plan and chooses the one with the lowest cost.


Estimating Join Costs

• Nested loops join:
  Cost = read(A) + card(A) * read(B)
  – OPTIMIZER_INDEX_CACHING parameter
• Sort and merge join:
  Cost = read(A) + sort(A) + read(B) + sort(B)
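
The two formulas can be compared with a few lines of arithmetic. In the Python sketch below all input figures are made-up unit costs chosen only to show how the comparison shifts with the cardinality of A. They are not values the optimizer would produce.

```python
def nested_loops_cost(read_a, card_a, read_b):
    """read(A) + card(A) * read(B): B is accessed once per row of A."""
    return read_a + card_a * read_b


def sort_merge_cost(read_a, sort_a, read_b, sort_b):
    """read(A) + sort(A) + read(B) + sort(B): each source is read and sorted once."""
    return read_a + sort_a + read_b + sort_b


# Hypothetical costs: B is reached through an index for nested loops
# (3 units per probe) but read in full (50 units) for sort and merge.
few_rows_nl = nested_loops_cost(read_a=10, card_a=5, read_b=3)
many_rows_nl = nested_loops_cost(read_a=10, card_a=100, read_b=3)
sm = sort_merge_cost(read_a=10, sort_a=20, read_b=50, sort_b=80)

print("nested loops, 5 driving rows:  ", few_rows_nl)
print("nested loops, 100 driving rows:", many_rows_nl)
print("sort and merge:                ", sm)
```

Under these assumed numbers nested loops wins with a handful of driving rows and loses once card(A) grows, because the card(A) * read(B) term dominates.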


Star Joins

[Diagram: a star schema with the SALES table at the center, joined to the PRODUCTS, CHANNELS, PROMOTIONS, TIMES, and CUSTOMERS tables.]


Hints to Facilitate Joins

Join order
• ORDERED
• LEADING
• STAR

Join operations
• USE_HASH, USE_MERGE, USE_NL
• LEADING
• HASH_AJ, MERGE_AJ, NL_AJ
• HASH_SJ, MERGE_SJ, NL_SJ


Other Join Hints

SQL> select /*+ STAR_TRANSFORMATION */ ...
     from   T1, T2, ...
     where  ...

SQL> select /*+ DRIVING_SITE(T1) */ ...
     from   T1, T2, ...
     where  ...


Subqueries and Joins

• Subqueries, like joins, are statements that reference multiple tables.

• Subquery types:
  – Noncorrelated subqueries
  – Correlated subqueries
  – NOT IN subqueries (anti-joins)
  – EXISTS subqueries (semi-joins)
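
The difference between a semi-join and an anti-join is easiest to see on a small data set. The Python sketch below models both on in-memory rows. It assumes the join column contains no NULLs, so it does not reproduce the special behavior of NOT IN when the subquery returns a NULL. The sample rows are invented.

```python
def semi_join(left_rows, right_rows, key):
    """Keep each left row that has at least one match on the right
    (the EXISTS case). A left row is returned at most once."""
    right_keys = {row[key] for row in right_rows}
    return [row for row in left_rows if row[key] in right_keys]


def anti_join(left_rows, right_rows, key):
    """Keep each left row that has no match on the right
    (the NOT IN case, assuming no NULL keys)."""
    right_keys = {row[key] for row in right_rows}
    return [row for row in left_rows if row[key] not in right_keys]


# Hypothetical sample rows.
countries = [
    {"country_id": "JP", "country_name": "Japan"},
    {"country_id": "IT", "country_name": "Italy"},
    {"country_id": "FR", "country_name": "France"},
]
customers = [
    {"cust_id": 205, "country_id": "JP"},
    {"cust_id": 310, "country_id": "IT"},
    {"cust_id": 412, "country_id": "JP"},
]

with_customers = semi_join(countries, customers, "country_id")
without_customers = anti_join(countries, customers, "country_id")
print([row["country_name"] for row in with_customers])
print([row["country_name"] for row in without_customers])
```

Unlike an ordinary join, the semi-join returns Japan once even though two customers match it.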


Initialization Parameters That Influence Joins

• Hash join parameters:
  – HASH_JOIN_ENABLED
  – HASH_AREA_SIZE
• Sort parameters:
  – SORT_AREA_SIZE
  – SORT_AREA_RETAINED_SIZE


Throwaway of Rows

• Rows that are retrieved but not used
• Identify throwaway by comparing the number of rows of the following:
  – Join operation
  – Input row sources
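
One possible way to put numbers on throwaway, offered here only as an illustration, is to count for each input row source how many retrieved rows found no partner in the join. The Python sketch below does that for an equijoin on in-memory rows. The report layout, function name, and sample rows are assumptions made for the example. In practice the row counts would be read from the execution plan statistics.

```python
def throwaway_report(outer_rows, inner_rows, key):
    """Count retrieved rows that contribute nothing to an equijoin."""
    outer_keys = {row[key] for row in outer_rows}
    inner_keys = {row[key] for row in inner_rows}
    outer_unused = sum(1 for row in outer_rows if row[key] not in inner_keys)
    inner_unused = sum(1 for row in inner_rows if row[key] not in outer_keys)
    return {
        "outer_retrieved": len(outer_rows),
        "outer_thrown_away": outer_unused,
        "inner_retrieved": len(inner_rows),
        "inner_thrown_away": inner_unused,
    }


# Hypothetical sample rows.
customers = [
    {"cust_id": 205, "country_id": "JP"},
    {"cust_id": 310, "country_id": "IT"},
    {"cust_id": 412, "country_id": "JP"},
]
countries = [
    {"country_id": "JP", "country_name": "Japan"},
    {"country_id": "IT", "country_name": "Italy"},
    {"country_id": "FR", "country_name": "France"},
]

report = throwaway_report(customers, countries, "country_id")
for name, value in report.items():
    print(name, value)
```

Here one of the three country rows is retrieved and then discarded; a large share of such rows suggests that a predicate is being applied too late.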


Minimize Throwaway of Rows

• Compare available indexes with join and nonjoin predicates.

• Consider hinting for a nested loop instead of sort and merge operations.


Minimize Processing

• Changing nested loops joins into sort and merge joins can do the following:
  – Remove index access
  – Add sort overhead

• Theoretically, hash joins are the most efficient join operation.

• Cluster joins need less I/O than the corresponding nested loops join.


Summary

In this lesson, you should have learned how to:
• Use the Top-N SQL feature
• Describe available join operations
• Optimize join performance against different requirements
• Influence the join order
• Discover that tuning joins is more complicated than tuning single table statements