crystal-walton
1
Presentation Outline
• SQL Writing Process
• SQL Standards
• Using Indexes
• The Optimizer
• FROM, WHERE Clauses
• EXPLAIN
• SQL Trace
• Sub-Selects and Joins
• Tips and Tricks
2
Caveat
Although many of these principles apply
to all databases, Oracle will be used in
the examples.
3
SQL Writing Process
Step 1: What information do I need? Columns
Step 2: Where is it? Tables
Step 3: Write SQL:
SELECT columns
FROM tables
WHERE ... (joins, filters, subqueries)
I'M FINISHED!
4
SQL Writing Process
• YOU'RE NOT FINISHED YET! You've got the results you want, but at what cost?
• There are many, many ways to get the right results, but they are not equally fast: 1000-to-1 improvements are attainable!
• Inefficient SQL can dramatically degrade the performance of the entire system
• Developers and DBAs must work together to tune the database and the application
5
Pre-Tuning Questions
• How long is too long?
• Is the statement running on near-production volumes?
• Is the optimal retrieval path being used?
• How often will it execute?
• When will it execute?
6
SQL Standards
Why are SQL standards important?
• Maintainability, readability
• Performance: If SQL is the same as a (recently) executed statement, it can be re-used instead of needing to be reparsed
7
SQL Standards
Question: which of these statements are the same?
A. SELECT LNAME FROM EMP WHERE EMPNO = 12;
B. SELECT lname FROM emp WHERE empno = 12;
C. SELECT lname FROM emp WHERE empno = :id;
D. SELECT lname  FROM emp  WHERE empno = 12;
8
SQL Standards
• Answer: None
• Whitespace, case, bind variables vs. constants all matter
• Using standards helps to ensure that equivalent SQL can be reused.
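The bind-variable point can be sketched in a few lines. This is a minimal illustration using Python's built-in sqlite3 module as a stand-in for Oracle (an assumption made so the example runs anywhere); the table contents and the helper name `lname_for` are hypothetical. It shows one statement text serving every lookup, next to literal-based variants that are textually different from one another.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for Oracle (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (empno INTEGER PRIMARY KEY, lname TEXT)")
conn.executemany("INSERT INTO emp VALUES (?, ?)", [(12, "smith"), (13, "jones")])

# One statement text, reused for every lookup; only the bound value changes.
LOOKUP_SQL = "SELECT lname FROM emp WHERE empno = :id"


def lname_for(empno):
    """Hypothetical helper: look up a last name using the bind variable :id."""
    row = conn.execute(LOOKUP_SQL, {"id": empno}).fetchone()
    return row[0] if row else None


# Literal-based variants: same meaning, but three different statement texts,
# so a cache keyed on the exact text would treat them as three statements.
literal_variants = {
    "SELECT LNAME FROM EMP WHERE EMPNO = 12;",
    "SELECT lname FROM emp WHERE empno = 12;",
    "SELECT lname  FROM emp  WHERE empno = 12;",
}

print(lname_for(12), lname_for(13), len(literal_variants))
```

With the bind-variable form, looking up a different employee changes the bound value but not the statement text.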
9
Tables Used in the Examples
DEPT: deptno, dname, loc
EMP: empno, mgr, job, deptno, fname, lname, comm, hiredate, grade, sal
SALGRADE: grade, losal, hisal
10
SQL Standards: Example

SELECT E.empno,
       D.dname
FROM   emp E,
       dept D
WHERE  E.deptno = D.deptno
AND    (D.deptno = :vardept
OR     E.empno = :varemp);

• Keywords upper case and left-aligned
• Columns on new lines
• Use std. table aliases
• Separate w/ one space
• Use bind variables
• AND/OR on new lines
• No space before/after parentheses
11
Indexes: What are they?
• An index is a database object used to speed retrieval of rows in a table.
• The index contains only the indexed value--usually the key(s)--and a pointer to the row in the table.
• Multiple indexes may be created for a table
• Not all indexes contain unique values
• Indexes may have multiple columns (e.g., Oracle allows up to 32)
12
Indexes and SQL
• If a column appears in a WHERE clause it is a candidate for being indexed.
• If a column is indexed the database can use the index to find the rows instead of scanning the table.
• If the column is not referenced properly, however, the database may not be able to use the index and will have to scan the table anyway.
• Knowing what columns are and are not indexed can help you write more efficient SQL
13
Example: Query without Index
No index exists for column EMPNO on table EMP, so a table scan must be performed:

SELECT *
FROM emp
WHERE empno = 8

Table: EMP
empno  fname   lname    ...
4      lisa    baker
9      jackie  miller
1      john    larson
3      larry   jones
5      jim     clark
2      mary    smith
7      harold  simmons
8      mark    burns
6      gene    harris
14
Example: Query with Index
Column EMPNO is indexed, so it can be used to find the requested row:

SELECT *
FROM emp
WHERE empno = 8

Index: PK_EMP on EMP (EMPNO)
[Diagram: index tree with root node 5, branch nodes (1, 4) and (5, 9), and leaf nodes (1 2), (3 4), (5 6), (7 8 9) pointing to the rows of EMP]

Table: EMP
empno  fname   lname    ...
4      lisa    baker
9      jackie  miller
1      john    larson
3      larry   jones
5      jim     clark
2      mary    smith
7      harold  simmons
8      mark    burns
6      gene    harris
15
Indexes: Caveats
• Sometimes a table scan cannot be avoided
• Not every column should be indexed--there is performance overhead on Inserts, Updates, Deletes
• Small tables may be faster with a table scan
• Queries returning a large number (> 5-20%) of the rows in the table may be faster with a table scan
16
Indexes: Column Order
Must use the leading column(s) of the index for the index to be used
Example: Index on (EMPNO, DEPTNO)

Will NOT use index:
SELECT *
FROM emp
WHERE deptno = 10;

WILL use index:
SELECT *
FROM emp
WHERE empno > 0
AND deptno = 10;
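The leading-column rule can be checked against a query planner. The sketch below uses Python's sqlite3 module and SQLite's EXPLAIN QUERY PLAN as a stand-in for Oracle (an assumption; Oracle's optimizer and plan output differ). The index name `idx_emp_empno_deptno` and the helper `plan` are hypothetical. With an empty table and no statistics gathered, SQLite should report a plain scan for the first query and an index search for the second.

```python
import sqlite3

# SQLite stand-in for Oracle (assumption); no statistics are gathered.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (empno INTEGER, deptno INTEGER, lname TEXT)")
conn.execute("CREATE INDEX idx_emp_empno_deptno ON emp (empno, deptno)")


def plan(sql):
    """Return the detail text of SQLite's EXPLAIN QUERY PLAN as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return "; ".join(row[3] for row in rows)  # column 3 holds the plan detail


# Only the second column of the index is referenced: leading column missing.
plan_no_leading = plan("SELECT * FROM emp WHERE deptno = 10")

# The leading column (empno) is referenced, so the index is usable.
plan_leading = plan("SELECT * FROM emp WHERE empno > 0 AND deptno = 10")

print(plan_no_leading)
print(plan_leading)
```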
17
Indexes: Functions
Using a function, calculation, or other operation on an indexed column disables the use of the index

Will NOT use index:
SELECT *
FROM emp
WHERE TRUNC(hiredate) = TRUNC(SYSDATE);
...
WHERE fname || lname = 'MARYSMITH';

WILL use index:
SELECT *
FROM emp
WHERE hiredate >= TRUNC(SYSDATE)
AND hiredate < TRUNC(SYSDATE) + 1;
...
WHERE fname = 'MARY'
AND lname = 'SMITH';
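The effect of an expression on an indexed column can also be observed in a planner. As a rough sketch, again assuming SQLite (via Python's sqlite3) as a stand-in for Oracle, the concatenated predicate falls back to a scan while the column-by-column predicate searches the index. The index name `idx_emp_name` and the helper `plan` are hypothetical.

```python
import sqlite3

# SQLite stand-in for Oracle (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (empno INTEGER, fname TEXT, lname TEXT)")
conn.execute("CREATE INDEX idx_emp_name ON emp (fname, lname)")


def plan(sql):
    """Return the detail text of SQLite's EXPLAIN QUERY PLAN as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return "; ".join(row[3] for row in rows)


# The indexed columns are wrapped in an expression: the index cannot be searched.
plan_concat = plan("SELECT * FROM emp WHERE fname || lname = 'MARYSMITH'")

# The indexed columns are compared directly: the index can be searched.
plan_columns = plan("SELECT * FROM emp WHERE fname = 'MARY' AND lname = 'SMITH'")

print(plan_concat)
print(plan_columns)
```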
18
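Indexes: NOT
Using NOT (or !=) on an indexed column prevents the index from being used:

Will NOT use index:
SELECT *
FROM dept
WHERE deptno != 0;
... NOT (deptno = 0);
... deptno IS NOT NULL;

WILL use index (returns the same rows as long as deptno is never negative):
SELECT *
FROM dept
WHERE deptno > 0;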
19
The Optimizer
• The WHERE/FROM rules on the following pages apply to the Rule-based optimizer (Oracle).
• If the Cost-based Optimizer is used, Oracle will attempt to reorder the statements as efficiently as possible (assuming statistics are available).
• DB2 and Sybase use only a Cost-based optimizer
• The Optimizer's access paths can be overridden in Oracle and Sybase (not DB2)
20
The Optimizer: Hints
Return the first rows in the result set as fast as possible:

SELECT /*+ FIRST_ROWS */ empno
FROM emp E, dept D
WHERE E.deptno = D.deptno;

Force Optimizer to use index IDX_HIREDATE:

SELECT /*+ INDEX (E idx_hiredate) */ empno
FROM emp E
WHERE E.hiredate > TO_DATE('01-JAN-2000', 'DD-MON-YYYY');
21
FROM Clause: Driving Table
Specify the driving table last in the FROM Clause:

Driving table is EMP:
SELECT *
FROM dept D, -- 10 rows
     emp E   -- 1,000 rows
WHERE E.deptno = D.deptno;

Driving table is DEPT:
SELECT *
FROM emp E,  -- 1,000 rows
     dept D  -- 10 rows
WHERE E.deptno = D.deptno;
22
FROM Clause: Intersection Table
When joining 3 or more tables, use the Intersection table (with the most shared columns) as the driving table:
SELECT *
FROM dept D,
     salgrade S,
     emp E
WHERE E.deptno = D.deptno
AND E.grade = S.grade;

EMP shares columns with DEPT and SALGRADE, so use it as the driving table
23
WHERE: Discard Early
Place first the WHERE conditions that discard the maximum number of rows:

SELECT *
FROM emp E
WHERE E.empno IN (101, 102, 103) -- 3 rows
AND E.deptno > 10;               -- 90,000 rows
24
WHERE: AND Subquery First
When using an "AND" subquery, place it first:

Subquery last (CPU = 156 sec):
SELECT *
FROM emp E
WHERE E.sal > 50000
AND 25 > (SELECT COUNT(*)
          FROM emp M
          WHERE M.mgr = E.empno)

Subquery first (CPU = 10 sec):
SELECT *
FROM emp E
WHERE 25 > (SELECT COUNT(*)
            FROM emp M
            WHERE M.mgr = E.empno)
AND E.sal > 50000
25
WHERE: OR Subquery Last
When using an "OR" subquery, place it last:

Subquery first (CPU = 100 sec):
SELECT *
FROM emp E
WHERE 25 > (SELECT COUNT(*)
            FROM emp M
            WHERE M.mgr = E.empno)
OR E.sal > 50000

Subquery last (CPU = 30 sec):
SELECT *
FROM emp E
WHERE E.sal > 50000
OR 25 > (SELECT COUNT(*)
         FROM emp M
         WHERE M.mgr = E.empno)
26
WHERE: Filter First, Join Last
When Joining and Filtering, specify the Filter condition first, Joins last.

SELECT *
FROM emp E, dept D
WHERE (E.empno = 123
OR D.deptno > 10)          -- Filter criteria
AND E.deptno = D.deptno;   -- Join criteria
27
Subqueries: IN vs. EXISTS
Use EXISTS instead of IN in subqueries:

IN: Both tables are scanned
SELECT E.*
FROM emp E
WHERE E.deptno IN (
  SELECT D.deptno
  FROM dept D
  WHERE D.dname = 'SALES');

EXISTS: Only outer table is scanned; subquery uses index
SELECT *
FROM emp E
WHERE EXISTS (
  SELECT 'X'
  FROM dept D
  WHERE D.deptno = E.deptno
  AND D.dname = 'SALES');
28
Subquery vs. Join
Use Join instead of Subquery:

IN: Both tables are scanned
SELECT *
FROM emp E
WHERE E.deptno IN (
  SELECT D.deptno
  FROM dept D
  WHERE D.dname = 'SALES');

JOIN: Only one table is scanned, other uses index
SELECT E.*
FROM emp E, dept D
WHERE D.dname = 'SALES'
AND D.deptno = E.deptno;
29
Join vs. EXISTS
Best performance depends on subquery/driving table:

EXISTS: better than Join if the number of matching rows in DEPT is small
SELECT *
FROM emp E
WHERE EXISTS (
  SELECT 'X'
  FROM dept D
  WHERE D.deptno = E.deptno
  AND D.dname = 'SALES');

JOIN: better than Exists if the number of matching rows in DEPT is large
SELECT E.*
FROM emp E, dept D
WHERE D.dname = 'SALES'
AND D.deptno = E.deptno;
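Before swapping one form for another, it helps to confirm that the IN, EXISTS, and join versions return the same rows. The sketch below does that on a tiny data set, using Python's sqlite3 module as a stand-in for Oracle (an assumption); the sample rows are made up. The join form matches the other two here only because deptno is unique in DEPT; with duplicate department rows it would return duplicate employees.

```python
import sqlite3

# SQLite stand-in for Oracle (assumption); sample rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dept (deptno INTEGER PRIMARY KEY, dname TEXT);
CREATE TABLE emp  (empno INTEGER PRIMARY KEY, lname TEXT, deptno INTEGER);
INSERT INTO dept VALUES (10, 'ACCOUNTING'), (30, 'SALES');
INSERT INTO emp  VALUES (1, 'larson', 10), (2, 'smith', 30), (3, 'jones', 30);
""")


def empnos(sql):
    """Run a query and return the employee numbers it selects, as a list."""
    return [row[0] for row in conn.execute(sql)]


in_rows = empnos("""
    SELECT E.empno FROM emp E
    WHERE E.deptno IN (SELECT D.deptno FROM dept D WHERE D.dname = 'SALES')
    ORDER BY E.empno""")

exists_rows = empnos("""
    SELECT E.empno FROM emp E
    WHERE EXISTS (SELECT 'X' FROM dept D
                  WHERE D.deptno = E.deptno AND D.dname = 'SALES')
    ORDER BY E.empno""")

join_rows = empnos("""
    SELECT E.empno FROM emp E, dept D
    WHERE D.dname = 'SALES' AND D.deptno = E.deptno
    ORDER BY E.empno""")

print(in_rows, exists_rows, join_rows)
```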
30
Explain
Display the access path the database will use (e.g., use of indexes, sorts, joins, table scans)
• Oracle: EXPLAIN
• Sybase: SHOWPLAN
• DB2: EXPLAIN

Oracle Syntax:
EXPLAIN PLAN
SET STATEMENT_ID = 'statement id'
INTO PLAN_TABLE
FOR statement
Requires Select/Insert privileges on PLAN_TABLE
31
Explain
Example 1: "IN" subquery
SELECT *
FROM emp E
WHERE E.deptno IN (
  SELECT D.deptno
  FROM dept D
  WHERE D.dname = 'SALES');

Result:
MERGE JOIN
  SORT (JOIN)
    TABLE ACCESS (FULL) OF EMP
  SORT (JOIN)
    VIEW
      SORT (UNIQUE)
        TABLE ACCESS (FULL) OF DEPT

• 3 joins
• 1 dynamic view
• 2 table scans
• 3 sorts
32
Explain
Example 2: "EXISTS" subquery
SELECT *
FROM emp e
WHERE EXISTS (
  SELECT 'x'
  FROM dept d
  WHERE d.deptno = e.deptno
  AND d.dname = 'SALES');

Result:
FILTER
  TABLE ACCESS (FULL) OF EMP
  TABLE ACCESS (BY INDEX ROWID) OF DEPT
    INDEX (UNIQUE SCAN) OF PK_DEPT (UNIQUE)

• 1 table scan
• 1 index scan
• 1 index access
33
Explain
Example 3: Join (no subquery)
SELECT E.*
FROM emp E, dept D
WHERE D.dname = 'SALES'
AND D.deptno = E.deptno;

Result:
NESTED LOOPS
  TABLE ACCESS (FULL) OF EMP
  TABLE ACCESS (BY INDEX ROWID) OF DEPT
    INDEX (UNIQUE SCAN) OF PK_DEPT (UNIQUE)

• 1 table scan
• 1 index scan
• 1 index access
34
SQL Trace
Use SQL Trace to determine the actual time and resource costs for a statement to execute.

Step 1: ALTER SESSION SET SQL_TRACE = TRUE;

Step 2: Execute SQL to be traced:
SELECT E.*
FROM emp E, dept D
WHERE D.dname = 'SALES'
AND D.deptno = E.deptno;

Step 3: ALTER SESSION SET SQL_TRACE = FALSE;
35
SQL Trace
Step 4: Trace file is created in <USER_DUMP_DEST> directory on the server (specified by the DBA).
Step 5: Run TKPROF (UNIX) to create a formatted output file:
tkprof echd_ora_15319.trc $HOME/prof.out table=plan_table explain=dbuser/passwd

• echd_ora_15319.trc: trace file
• $HOME/prof.out: formatted output file
• table=plan_table: destination for Explain
• explain=dbuser/passwd: user/passwd for Explain
36
SQL Trace
Step 6: view the output file:
...
SELECT E.*
FROM emp E, dept D
WHERE D.dname = 'SALES'
AND D.deptno = E.deptno;

call     count   cpu  elapsed  disk  query  current  rows
-------  -----  ----  -------  ----  -----  -------  ----
Parse        1  0.00     0.00     0      0        0     0
Execute      1  0.00     0.00     0      0        0     0
Fetch        2  0.00     0.00     4     19        3     6
-------  -----  ----  -------  ----  -----  -------  ----
total        4  0.00     0.00     4     19        3     6

TIMED_STATISTICS must be turned on to get the cpu and elapsed values

Misses in library cache during parse: 0
Optimizer goal: CHOOSE
Parsing user id: 62 (PMARKS)

EXPLAIN output:
Rows  Row Source Operation
----  ---------------------------------------------------
   6  NESTED LOOPS
  14    TABLE ACCESS FULL EMP
  14    TABLE ACCESS BY INDEX ROWID DEPT
  14      INDEX UNIQUE SCAN (object id 4628)
37
Tips and Tricks: UNION ALL
Use UNION ALL instead of UNION if there are no duplicate rows (or if you don't mind duplicates):

UNION: requires sort
SELECT * FROM emp
UNION
SELECT * FROM emp_arch;

UNION ALL: no sort
SELECT * FROM emp
UNION ALL
SELECT * FROM emp_arch;
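The condition "if there are no duplicate rows" matters because the two operators only agree in that case. A small sketch, assuming SQLite (through Python's sqlite3 module) in place of Oracle and using made-up rows, shows the difference when one row appears in both tables:

```python
import sqlite3

# SQLite stand-in for Oracle (assumption); the row (2, 'smith') is in both tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE emp      (empno INTEGER, lname TEXT);
CREATE TABLE emp_arch (empno INTEGER, lname TEXT);
INSERT INTO emp      VALUES (1, 'larson'), (2, 'smith');
INSERT INTO emp_arch VALUES (2, 'smith'), (3, 'jones');
""")

# UNION removes the duplicate row, which costs extra work.
union_rows = conn.execute(
    "SELECT * FROM emp UNION SELECT * FROM emp_arch").fetchall()

# UNION ALL simply concatenates both result sets, duplicates included.
union_all_rows = conn.execute(
    "SELECT * FROM emp UNION ALL SELECT * FROM emp_arch").fetchall()

print(len(union_rows), len(union_all_rows))
```

If the two tables can never share a row, both forms return the same rows and UNION ALL skips the duplicate elimination.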
38
Tips and Tricks: HAVING vs. WHERE
With GROUP BY, use WHERE instead of HAVING (if the filter criterion does not apply to a group function):

HAVING: rows are filtered after all groups have been built
SELECT deptno, AVG(sal)
FROM emp
GROUP BY deptno
HAVING deptno IN (10, 20);

WHERE: rows are filtered first--possibly far fewer to process
SELECT deptno, AVG(sal)
FROM emp
WHERE deptno IN (10, 20)
GROUP BY deptno;
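Moving the filter from HAVING to WHERE is safe here because it refers only to the grouping column. A quick equivalence check, sketched with Python's sqlite3 module as a stand-in for Oracle (an assumption) and made-up salary rows:

```python
import sqlite3

# SQLite stand-in for Oracle (assumption); sample salaries are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE emp (empno INTEGER, deptno INTEGER, sal INTEGER);
INSERT INTO emp VALUES (1, 10, 1000), (2, 10, 3000), (3, 20, 2000), (4, 30, 5000);
""")

# Filter applied after grouping: department 30 is grouped, then discarded.
having_rows = conn.execute("""
    SELECT deptno, AVG(sal) FROM emp
    GROUP BY deptno
    HAVING deptno IN (10, 20)
    ORDER BY deptno""").fetchall()

# Filter applied before grouping: department 30 is never grouped at all.
where_rows = conn.execute("""
    SELECT deptno, AVG(sal) FROM emp
    WHERE deptno IN (10, 20)
    GROUP BY deptno
    ORDER BY deptno""").fetchall()

print(having_rows == where_rows, where_rows)
```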
39
Tips and Tricks: EXISTS vs DISTINCT
Use EXISTS instead of DISTINCT to avoid implicit sort (if the column is indexed):

DISTINCT: implicit sort is performed to filter duplicate rows
SELECT DISTINCT e.deptno, e.lname
FROM dept d, emp e
WHERE d.deptno = e.deptno;

EXISTS: no sort
SELECT e.deptno, e.lname
FROM emp e
WHERE EXISTS (
  SELECT 'X'
  FROM dept d
  WHERE d.deptno = e.deptno);
40
Tips and Tricks: Consolidate SQL
Reference Sequences and SYSDATE directly in the statement that uses them, instead of selecting them first:

BEFORE: 3 statements are used to perform 1 Insert
SELECT SYSDATE INTO :vardate FROM dual;
SELECT arch_seq.NEXTVAL INTO :varid FROM dual;
INSERT INTO archive
VALUES (:vardate, :varid, ...)

AFTER: only 1 statement is needed
INSERT INTO archive
VALUES (SYSDATE, arch_seq.NEXTVAL, ...)
41
Tips and Tricks: Consolidate SQL
Consolidate unrelated statements using outer-joins to the DUAL (dummy) table:

BEFORE: 2 round-trips
SELECT dname FROM dept WHERE deptno = 10;
SELECT lname FROM emp WHERE empno = 7369;

AFTER: only 1 round-trip
SELECT d.dname, e.lname
FROM dept d, emp e, dual x
WHERE d.deptno (+) = 10
AND e.empno (+) = 7369
AND NVL('X', x.dummy) = NVL('X', e.ROWID (+))
AND NVL('X', x.dummy) = NVL('X', d.ROWID (+));
42
Tips and Tricks: COUNT
Use COUNT(*) instead of COUNT(column):
SELECT COUNT(empno)
FROM emp;

SELECT COUNT(*)
FROM emp;   -- ~ 50% faster
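The speed figure above is Oracle-specific and is not reproduced here. One thing worth checking before replacing COUNT(column) with COUNT(*) is that the two agree only when the column has no NULLs, since COUNT(column) skips NULL values. A minimal sketch, assuming SQLite (via Python's sqlite3 module) as a stand-in and made-up rows:

```python
import sqlite3

# SQLite stand-in for Oracle (assumption); comm is NULL for two of three rows.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE emp (empno INTEGER NOT NULL, comm INTEGER);
INSERT INTO emp VALUES (1, NULL), (2, 300), (3, NULL);
""")


def count(expr):
    """Return COUNT(expr) over the emp table."""
    return conn.execute(f"SELECT COUNT({expr}) FROM emp").fetchone()[0]


count_star = count("*")       # counts every row
count_empno = count("empno")  # NOT NULL column: same as COUNT(*)
count_comm = count("comm")    # nullable column: NULLs are not counted

print(count_star, count_empno, count_comm)
```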
43
Tips and Tricks: Self-Join
Use a self-join (joining a table to itself) instead of two queries on the same table:
BEFORE: 2 round-trips
SELECT mgr INTO :varmgr
FROM emp
WHERE deptno = 10;
LOOP...
  SELECT mgr, lname
  FROM emp
  WHERE mgr = :varmgr;

AFTER: only 1
SELECT E.mgr, E.lname
FROM emp E, emp M
WHERE M.deptno = 10
AND E.empno = M.mgr;
44
Tips and Tricks: ROWNUM
Use the ROWNUM pseudo-column to return only the first N rows of a result set. (For example, if you just want a sampling of data):
SELECT *
FROM emp
WHERE ROWNUM <= 10;
Returns only the first 10 employees in the table, in no particular order
45
Tips and Tricks: ROWID
The ROWID pseudo-column uniquely identifies a row, and is the fastest way to access a row:
CURSOR retired_emp_cur IS
  SELECT ROWID
  FROM emp
  WHERE retired = 'Y';
...
FOR retired_emp_rec IN retired_emp_cur LOOP
  SELECT fname || ' ' || lname
  INTO :printable_name
  FROM emp
  WHERE ROWID = retired_emp_rec.ROWID;
  ...
Instead of selecting the key column(s), ROWID is used to identify the row for later use
46
Tips and Tricks: Sequences
Use a Sequence to generate unique values for a table:
Without a Sequence:
SELECT MAX(empno) INTO :new_empno FROM emp;
...
INSERT INTO emp VALUES (:new_empno + 1, ...);

• MAX(empno) requires a sort and an index scan
• INSERT could fail with a Duplicate error if someone else gets there first

With a Sequence:
INSERT INTO emp VALUES (emp_seq.NEXTVAL, ...);
or
SELECT emp_seq.NEXTVAL INTO :new_empno FROM dual;

• Using a Sequence ensures that you always have a unique number, and does not require any table reads
47
Tips and Tricks: Connect By
Use CONNECT BY to construct hierarchical queries:
SELECT LPAD(' ',4*(LEVEL-1)) || lname Name, Job
FROM emp
WHERE job != 'CLERK'
START WITH job = 'PRESIDENT'
CONNECT BY PRIOR empno = mgr;

Name            Job
King            PRESIDENT
    Jones       MANAGER
        Scott   ANALYST
        Ford    ANALYST
    Blake       MANAGER
        Allen   SALESMAN
        Ward    SALESMAN
        Martin  SALESMAN
        Turner  SALESMAN
    Clark       MANAGER
48
Tips and Tricks: Cartesian Products
Avoid Cartesian products by ensuring that the tables are joined on all shared keys:
Without join conditions: 10 * 1000 * 20 = 200,000 rows
SELECT *
FROM dept,     -- 10 rows
     salgrade, -- 20 rows
     emp;      -- 1,000 rows

Joined on all shared keys: 1,000 rows
SELECT *
FROM dept D,     -- 10 rows
     salgrade S, -- 20 rows
     emp E       -- 1,000 rows
WHERE E.deptno = D.deptno
AND E.grade = S.grade;
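The row-count arithmetic can be reproduced on a scaled-down data set. The sketch below assumes SQLite (via Python's sqlite3 module) in place of Oracle and uses 2 departments, 3 salary grades, and 4 employees (all made up), so the unjoined query returns 2 * 3 * 4 = 24 rows and the joined query returns one row per employee.

```python
import sqlite3

# SQLite stand-in for Oracle (assumption); tiny hypothetical tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dept     (deptno INTEGER PRIMARY KEY, dname TEXT);
CREATE TABLE salgrade (grade INTEGER PRIMARY KEY);
CREATE TABLE emp      (empno INTEGER PRIMARY KEY, deptno INTEGER, grade INTEGER);
INSERT INTO dept     VALUES (10, 'ACCOUNTING'), (20, 'RESEARCH');
INSERT INTO salgrade VALUES (1), (2), (3);
INSERT INTO emp      VALUES (1, 10, 1), (2, 10, 2), (3, 20, 3), (4, 20, 1);
""")

# No join conditions: every combination of rows from the three tables.
cartesian_count = conn.execute(
    "SELECT COUNT(*) FROM dept, salgrade, emp").fetchone()[0]

# Joined on all shared keys: each employee matches one dept and one grade.
joined_count = conn.execute("""
    SELECT COUNT(*)
    FROM dept D, salgrade S, emp E
    WHERE E.deptno = D.deptno
    AND E.grade = S.grade""").fetchone()[0]

print(cartesian_count, joined_count)
```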
49
Tips and Tricks: TOAD
• Tool for Oracle Application Developers
• Oracle only! Requires Oracle SQL*Net client software
• Freeware tool for viewing/updating Oracle objects
• http://www.toadsoft.com or s:\tempfile\toad\toadfree.zip
50
Tips and Tricks: TOAD
CTRL+E displays EXPLAIN PLAN
SQL result set displayed in grid
51
Tips and Tricks: TOAD
Table/view data in an editable grid
Indexes, constraints, grants, etc. for the current table
All tables/views for a selected schema