Chapter 06 How to code Subqueries MIT 22033, Database Management Systems By: S. Sabraz Nawaz

Chapter 06 How to code Subqueries

MIT 22033, Database Management Systems

By: S. Sabraz Nawaz

MIT 22033 By. S. Sabraz Nawaz Slide 2

Objectives

Applied

Code SELECT statements that require subqueries.

Code SELECT statements that use common table expressions (CTEs) to define the subqueries.

Knowledge

Describe the way subqueries can be used in the WHERE, HAVING, FROM and SELECT clauses of a SELECT statement.

Describe the difference between a correlated subquery and a noncorrelated subquery.

Given a SELECT statement that uses a subquery, explain how the result set of the subquery will affect the final result set.

How to use subqueries

• A subquery is a SELECT statement that’s coded within another SQL statement.
• A subquery can return a single value, a result set that contains a single column, or a result set that contains one or more columns.
• A subquery that returns a single value can be coded, or introduced, anywhere an expression is allowed.
• A subquery that returns a single column can be introduced in place of a list of values, such as the values for an IN phrase.
• A subquery that returns one or more columns can be introduced in place of a table in the FROM clause.
• A subquery that’s used in a WHERE or HAVING clause is called a subquery search condition or a subquery predicate. This is the most common use for a subquery.

How to use subqueries (continued)

• The syntax for a subquery is the same as for a standard SELECT statement. However, a subquery doesn’t typically include the GROUP BY or HAVING clause, and it can’t include an ORDER BY clause unless the TOP phrase is used.
• Subqueries can be nested within other subqueries. However, subqueries that are nested more than two or three levels deep can be difficult to read and can result in poor performance.

Four ways to introduce a subquery in a SELECT statement

1. In a WHERE clause as a search condition

2. In a HAVING clause as a search condition

3. In the FROM clause as a table specification

4. In the SELECT clause as a column specification

A SELECT statement that uses a subquery in the WHERE clause

SELECT InvoiceNumber, InvoiceDate, InvoiceTotal
FROM Invoices
WHERE InvoiceTotal >
    (SELECT AVG(InvoiceTotal)
     FROM Invoices)
ORDER BY InvoiceTotal

The value returned by the subquery

1879.7413

The result set

(21 rows)

How subqueries compare to joins

• Like a join, a subquery can be used to code queries that work with two or more tables.
• Most subqueries can be restated as joins and most joins can be restated as subqueries.

Advantages of joins

• The result set of a join can include columns from both tables. The result set of a query with a subquery can only include columns from the table named in the outer query, not in the subquery.
• A join tends to be more intuitive when it uses an existing relationship between the two tables, such as a primary key to foreign key relationship.
• A query with a join typically performs faster than the same query with a subquery, especially if the query uses only inner joins.

Advantages of subqueries

• You can use a subquery to pass an aggregate value to the outer query.
• A subquery tends to be more intuitive when it uses an ad hoc relationship between the two tables.
• Long, complex queries can sometimes be easier to code using subqueries.

A query that uses an inner join

SELECT InvoiceNumber, InvoiceDate, InvoiceTotal
FROM Invoices JOIN Vendors
    ON Invoices.VendorID = Vendors.VendorID
WHERE VendorState = 'CA'
ORDER BY InvoiceDate

The same query restated with a subquery

SELECT InvoiceNumber, InvoiceDate, InvoiceTotal
FROM Invoices
WHERE VendorID IN
    (SELECT VendorID
     FROM Vendors
     WHERE VendorState = 'CA')
ORDER BY InvoiceDate

The result set returned by both queries

(40 rows)

The syntax of a WHERE clause that uses an IN phrase with a subquery

WHERE test_expression [NOT] IN (subquery)

How to use subqueries with the IN operator

• You can introduce a subquery with the IN operator to provide the list of values that are tested against the test expression.
• When you use the IN operator, the subquery must return a single column of values.
• A query that uses the NOT IN operator with a subquery can typically be restated using an outer join.

A query that returns vendors without invoices

SELECT VendorID, VendorName, VendorState
FROM Vendors
WHERE VendorID NOT IN
    (SELECT DISTINCT VendorID
     FROM Invoices)

The result of the subquery

(34 rows)

The result set of vendors without invoices

(88 rows)

The query restated without a subquery

SELECT Vendors.VendorID, VendorName, VendorState
FROM Vendors LEFT JOIN Invoices
    ON Vendors.VendorID = Invoices.VendorID
WHERE Invoices.VendorID IS NULL

The syntax of a WHERE clause that compares an expression with the value returned by a subquery

WHERE expression comparison_operator [SOME|ANY|ALL] (subquery)

How to compare the result of a subquery with an expression

• You can use a comparison operator in a search condition to compare an expression with the results of a subquery.
• If you code a search condition without the SOME, ANY, and ALL keywords, the subquery must return a single value.

A query that returns invoices with a balance due less than the average

SELECT InvoiceNumber, InvoiceDate, InvoiceTotal,
    InvoiceTotal - PaymentTotal - CreditTotal AS BalanceDue
FROM Invoices
WHERE InvoiceTotal - PaymentTotal - CreditTotal > 0
    AND InvoiceTotal - PaymentTotal - CreditTotal <
        (SELECT AVG(InvoiceTotal - PaymentTotal - CreditTotal)
         FROM Invoices
         WHERE InvoiceTotal - PaymentTotal - CreditTotal > 0)
ORDER BY InvoiceTotal DESC

The result of the subquery (the average balance due)

1669.9060

The result set

(33 rows)

How the ALL keyword works

Condition          Equivalent expression    Description
x > ALL (1, 2)     x > 2                    x must be greater than the maximum value returned by the subquery.
x < ALL (1, 2)     x < 1                    x must be less than the minimum value returned by the subquery.
x = ALL (1, 2)     (x = 1) AND (x = 2)      This condition evaluates to True only if the subquery returns a single value or if all the values returned by the subquery are the same.
x <> ALL (1, 2)    (x <> 1) AND (x <> 2)    This condition is equivalent to: x NOT IN (1, 2)

How the ALL keyword works (continued)

• The ALL keyword tests that a comparison condition is true for all of the values returned by a subquery.
• If no rows are returned by the subquery, a comparison that uses ALL is always true.
• If all of the rows returned by the subquery contain a null value, a comparison that uses ALL is never true: it evaluates to unknown, which a WHERE clause treats like false.
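The relationship between > ALL and the maximum value can be checked with a small experiment. The following is a minimal T-SQL sketch that assumes SQL Server. The table variable @Totals and its four rows are hypothetical sample data, not part of the database used in these slides.

```sql
-- Hypothetical sample data for illustration only.
DECLARE @Totals TABLE (InvoiceID int, VendorID int, InvoiceTotal money);
INSERT INTO @Totals VALUES
    (1, 34, 100.00),
    (2, 34, 250.00),
    (3, 50, 200.00),
    (4, 50, 300.00);

-- > ALL: invoices larger than every invoice for vendor 34 (100 and 250).
-- Returns invoice 4 (300.00).
SELECT InvoiceID, InvoiceTotal
FROM @Totals
WHERE InvoiceTotal > ALL
    (SELECT InvoiceTotal FROM @Totals WHERE VendorID = 34);

-- The same condition stated with MAX. Also returns invoice 4.
SELECT InvoiceID, InvoiceTotal
FROM @Totals
WHERE InvoiceTotal >
    (SELECT MAX(InvoiceTotal) FROM @Totals WHERE VendorID = 34);

-- Empty subquery (there is no vendor 99): > ALL is true for every row,
-- so all four invoices are returned.
SELECT InvoiceID, InvoiceTotal
FROM @Totals
WHERE InvoiceTotal > ALL
    (SELECT InvoiceTotal FROM @Totals WHERE VendorID = 99);

-- MAX over no rows is NULL, so this comparison is unknown for every row
-- and no invoices are returned.
SELECT InvoiceID, InvoiceTotal
FROM @Totals
WHERE InvoiceTotal >
    (SELECT MAX(InvoiceTotal) FROM @Totals WHERE VendorID = 99);
```

The last two queries suggest that the ALL form and the MAX form agree only when the subquery returns at least one row.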

A query that returns invoices that are larger than the largest invoice for vendor 34

SELECT VendorName, InvoiceNumber, InvoiceTotal
FROM Invoices JOIN Vendors
    ON Invoices.VendorID = Vendors.VendorID
WHERE InvoiceTotal > ALL
    (SELECT InvoiceTotal
     FROM Invoices
     WHERE VendorID = 34)
ORDER BY VendorName

The result of the subquery (the invoice totals for vendor 34)

The result set

(25 rows)

How the ANY and SOME keywords work

Condition          Equivalent expression    Description
x > ANY (1, 2)     x > 1                    x must be greater than the minimum value returned by the subquery.
x < ANY (1, 2)     x < 2                    x must be less than the maximum value returned by the subquery.
x = ANY (1, 2)     (x = 1) OR (x = 2)       This condition is equivalent to: x IN (1, 2)
x <> ANY (1, 2)    (x <> 1) OR (x <> 2)     This condition will evaluate to True for any non-empty result set containing at least one non-null value that isn’t equal to x.

How the ANY and SOME keywords work (continued)

• The ANY or SOME keyword tests that a condition is true for at least one of the values returned by a subquery. ANY and SOME are equivalent and both are part of the ANSI standard, but ANY is more commonly used.
• If no rows are returned by the subquery, a comparison that uses ANY or SOME is always false. If all of the rows returned by the subquery contain a null value, the comparison is never true: it evaluates to unknown, which a WHERE clause treats like false.
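The behavior of ANY and SOME can be tried out in the same way. This is a minimal T-SQL sketch that assumes SQL Server. The table variable @Totals and its rows are hypothetical sample data chosen for illustration.

```sql
-- Hypothetical sample data for illustration only.
DECLARE @Totals TABLE (InvoiceID int, VendorID int, InvoiceTotal money);
INSERT INTO @Totals VALUES
    (1, 34, 100.00),
    (2, 34, 250.00),
    (3, 50, 200.00),
    (4, 50, 300.00);

-- < ANY: invoices smaller than at least one invoice for vendor 34,
-- which means smaller than the largest one (250).
-- Returns invoices 1 (100.00) and 3 (200.00).
SELECT InvoiceID, InvoiceTotal
FROM @Totals
WHERE InvoiceTotal < ANY
    (SELECT InvoiceTotal FROM @Totals WHERE VendorID = 34);

-- SOME is interchangeable with ANY, so this returns the same two invoices.
SELECT InvoiceID, InvoiceTotal
FROM @Totals
WHERE InvoiceTotal < SOME
    (SELECT InvoiceTotal FROM @Totals WHERE VendorID = 34);

-- Empty subquery (there is no vendor 99): the comparison is false
-- for every row, so no invoices are returned.
SELECT InvoiceID, InvoiceTotal
FROM @Totals
WHERE InvoiceTotal < ANY
    (SELECT InvoiceTotal FROM @Totals WHERE VendorID = 99);
```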

A query that returns invoices smaller than the largest invoice for vendor 115

SELECT VendorName, InvoiceNumber, InvoiceTotal
FROM Vendors JOIN Invoices
    ON Vendors.VendorID = Invoices.VendorID
WHERE InvoiceTotal < ANY
    (SELECT InvoiceTotal
     FROM Invoices
     WHERE VendorID = 115)

The result of the subquery (the invoice totals for vendor 115)

The result set

(17 rows)

How to code correlated subqueries

• A correlated subquery is a subquery that is executed once for each row processed by the outer query. A noncorrelated subquery is executed only once.
• A correlated subquery refers to the value of a column in the outer query. Because that value varies depending on the row that’s being processed, each execution of the subquery returns a different result.
• To refer to a value in the outer query, a correlated subquery uses a qualified column name that includes the table name from the outer query.
• If the subquery uses the same table as the outer query, an alias, or correlation name, must be assigned to one of the tables to remove ambiguity.

A query that uses a correlated subquery to calculate the average invoice total for each vendor

SELECT VendorID, InvoiceNumber, InvoiceTotal
FROM Invoices AS Inv_Main
WHERE InvoiceTotal >
    (SELECT AVG(InvoiceTotal)
     FROM Invoices AS Inv_Sub
     WHERE Inv_Sub.VendorID = Inv_Main.VendorID)
ORDER BY VendorID, InvoiceTotal

The value returned by the subquery for vendor 95

28.5016

The result set for the query with the correlated subquery

(36 rows)

The syntax of a subquery that uses the EXISTS operator

WHERE [NOT] EXISTS (subquery)

How to use the EXISTS operator

• You can use the EXISTS operator to test that one or more rows are returned by the subquery.
• You can use NOT along with the EXISTS operator to test that no rows are returned by the subquery.
• A subquery with the EXISTS operator doesn’t return any rows. It returns an indication of whether any rows meet the condition.
• Because no rows are returned by the subquery, it doesn’t matter what columns you specify in the SELECT clause.
• The EXISTS operator is used most often with correlated subqueries.
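The point that the SELECT list of an EXISTS subquery doesn't matter can be seen by running the same test with two different lists. This is a minimal T-SQL sketch that assumes SQL Server. The table variables @Vendors and @Invoices, their rows, and the aliases v and i are hypothetical and exist only for this illustration.

```sql
-- Hypothetical sample data for illustration only.
DECLARE @Vendors TABLE (VendorID int, VendorName varchar(50));
DECLARE @Invoices TABLE (InvoiceID int, VendorID int, InvoiceTotal money);
INSERT INTO @Vendors VALUES (1, 'Acme'), (2, 'Bolt'), (3, 'Crest');
INSERT INTO @Invoices VALUES (10, 1, 50.00), (11, 1, 75.00), (12, 3, 20.00);

-- Vendors without invoices, with SELECT * in the subquery.
-- Returns vendor 2 (Bolt).
SELECT v.VendorID, v.VendorName
FROM @Vendors AS v
WHERE NOT EXISTS
    (SELECT *
     FROM @Invoices AS i
     WHERE i.VendorID = v.VendorID);

-- The same test with a constant in the subquery's SELECT list.
-- Also returns vendor 2 (Bolt), because EXISTS only checks for rows.
SELECT v.VendorID, v.VendorName
FROM @Vendors AS v
WHERE NOT EXISTS
    (SELECT 1
     FROM @Invoices AS i
     WHERE i.VendorID = v.VendorID);
```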

A query that returns vendors without invoices

SELECT VendorID, VendorName, VendorState
FROM Vendors
WHERE NOT EXISTS
    (SELECT *
     FROM Invoices
     WHERE Invoices.VendorID = Vendors.VendorID)

The result set

(88 rows)

How to code subqueries in the FROM clause

• A subquery that’s coded in the FROM clause returns a result set called a derived table.
• When you create a derived table, you must assign an alias to it. Then, you can use the derived table within the outer query just as you would any other table.
• When you code a subquery in the FROM clause, you must assign names to any calculated values in the result set.
• Derived tables are most useful when you need to further summarize the results of a summary query.
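As an illustration of summarizing a summary query, the sketch below first totals the invoices for each vendor in a derived table and then aggregates those totals in the outer query. It is T-SQL that assumes SQL Server. The table variable @Invoices, its rows, and the alias VendorTotals are hypothetical.

```sql
-- Hypothetical sample data for illustration only.
DECLARE @Invoices TABLE (InvoiceID int, VendorID int, InvoiceTotal money);
INSERT INTO @Invoices VALUES
    (1, 1, 100.00),
    (2, 1, 300.00),
    (3, 2, 50.00),
    (4, 3, 150.00);

-- Inner query: one row per vendor with the sum of its invoices
-- (400.00, 50.00, and 150.00). The calculated column needs a name
-- and the derived table needs an alias.
-- Outer query: summarizes those per-vendor sums.
-- Returns one row: VendorCount = 3, AvgVendorTotal = 200.00,
-- LargestVendorTotal = 400.00.
SELECT COUNT(*) AS VendorCount,
    AVG(VendorTotals.SumOfInvoices) AS AvgVendorTotal,
    MAX(VendorTotals.SumOfInvoices) AS LargestVendorTotal
FROM
    (SELECT VendorID, SUM(InvoiceTotal) AS SumOfInvoices
     FROM @Invoices
     GROUP BY VendorID) AS VendorTotals;
```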

A query that uses a derived table to retrieve the top 5 vendors by average invoice total

SELECT Invoices.VendorID, MAX(InvoiceDate) AS LatestInv,
    AVG(InvoiceTotal) AS AvgInvoice
FROM Invoices JOIN
    (SELECT TOP 5 VendorID, AVG(InvoiceTotal) AS AvgInvoice
     FROM Invoices
     GROUP BY VendorID
     ORDER BY AvgInvoice DESC) AS TopVendor
    ON Invoices.VendorID = TopVendor.VendorID
GROUP BY Invoices.VendorID
ORDER BY LatestInv DESC

The derived table generated by the subquery that retrieves the top 5 vendors

The result set

How to code subqueries in the SELECT clause

• When you code a subquery for a column specification in the SELECT clause, the subquery must return a single value.
• A subquery that’s coded within a SELECT clause is usually a correlated subquery.
• Subqueries are seldom coded in the SELECT clause. Joins are used instead because they’re generally faster and more readable.

A query that uses a correlated subquery in its SELECT to retrieve the most recent invoice for each vendor

SELECT DISTINCT VendorName,
    (SELECT MAX(InvoiceDate)
     FROM Invoices
     WHERE Invoices.VendorID = Vendors.VendorID) AS LatestInv
FROM Vendors
ORDER BY LatestInv DESC

The result set

(122 rows)

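The same query using a join instead of a subquery in the SELECT clause

SELECT VendorName, MAX(InvoiceDate) AS LatestInv
FROM Vendors JOIN Invoices
    ON Vendors.VendorID = Invoices.VendorID
GROUP BY VendorName
ORDER BY LatestInv DESC

Note: because this is an inner join, vendors that have no invoices are omitted, so the result set has fewer rows than the subquery version.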
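The join form above uses an inner join, which leaves out vendors that have no invoices, while the SELECT-clause subquery keeps them with a null date. A left outer join is one way to make the join form return the same vendors. The following is a minimal T-SQL sketch that assumes SQL Server. The table variables, their rows, and the aliases v and i are hypothetical, and the sample vendor names are assumed to be unique.

```sql
-- Hypothetical sample data for illustration only.
DECLARE @Vendors TABLE (VendorID int, VendorName varchar(50));
DECLARE @Invoices TABLE (InvoiceID int, VendorID int, InvoiceDate date);
INSERT INTO @Vendors VALUES (1, 'Acme'), (2, 'Bolt'), (3, 'Crest');
INSERT INTO @Invoices VALUES
    (10, 1, '2023-01-15'),
    (11, 1, '2023-03-02'),
    (12, 3, '2023-02-20');

-- Correlated subquery in the SELECT clause: all three vendors appear,
-- and Bolt has a null LatestInv.
SELECT v.VendorName,
    (SELECT MAX(i.InvoiceDate)
     FROM @Invoices AS i
     WHERE i.VendorID = v.VendorID) AS LatestInv
FROM @Vendors AS v
ORDER BY LatestInv DESC;

-- Inner join: only Acme and Crest appear, because Bolt has no invoices.
SELECT v.VendorName, MAX(i.InvoiceDate) AS LatestInv
FROM @Vendors AS v JOIN @Invoices AS i
    ON v.VendorID = i.VendorID
GROUP BY v.VendorName
ORDER BY LatestInv DESC;

-- Left outer join: all three vendors appear again,
-- and Bolt has a null LatestInv.
SELECT v.VendorName, MAX(i.InvoiceDate) AS LatestInv
FROM @Vendors AS v LEFT JOIN @Invoices AS i
    ON v.VendorID = i.VendorID
GROUP BY v.VendorName
ORDER BY LatestInv DESC;
```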

A procedure for building complex queries

1. State the problem to be solved by the query in English.

2. Use pseudocode to outline the query. The pseudocode should identify the subqueries used by the query and the data they return. It should also include aliases used for any derived tables.

3. If necessary, use pseudocode to outline each subquery.

4. Code the subqueries and test them to be sure that they return the correct data.

5. Code and test the final query.

The problem to be solved

Which vendor in each state has the largest invoice total?

Pseudocode for the query

SELECT Summary1.VendorState, Summary1.VendorName,
    TopInState.SumOfInvoices
FROM (Derived table returning VendorState, VendorName, SumOfInvoices)
        AS Summary1
    JOIN (Derived table returning VendorState, MAX(SumOfInvoices))
        AS TopInState
    ON Summary1.VendorState = TopInState.VendorState AND
       Summary1.SumOfInvoices = TopInState.SumOfInvoices
ORDER BY Summary1.VendorState

Pseudocode for the TopInState subquery

SELECT Summary2.VendorState, MAX(Summary2.SumOfInvoices)
FROM (Derived table returning VendorState, VendorName, SumOfInvoices)
        AS Summary2
GROUP BY Summary2.VendorState

The code for the Summary1 and Summary2 subqueries

SELECT V_Sub.VendorState, V_Sub.VendorName,
    SUM(I_Sub.InvoiceTotal) AS SumOfInvoices
FROM Invoices AS I_Sub JOIN Vendors AS V_Sub
    ON I_Sub.VendorID = V_Sub.VendorID
GROUP BY V_Sub.VendorState, V_Sub.VendorName

The result of the Summary1 and Summary2 subqueries

(34 rows)

The result of the TopInState subquery

(10 rows)

The complete code for the query that uses three subqueries

SELECT Summary1.VendorState, Summary1.VendorName,
    TopInState.SumOfInvoices
FROM
    (SELECT V_Sub.VendorState, V_Sub.VendorName,
        SUM(I_Sub.InvoiceTotal) AS SumOfInvoices
     FROM Invoices AS I_Sub JOIN Vendors AS V_Sub
        ON I_Sub.VendorID = V_Sub.VendorID
     GROUP BY V_Sub.VendorState, V_Sub.VendorName) AS Summary1
    JOIN
    (SELECT Summary2.VendorState,
        MAX(Summary2.SumOfInvoices) AS SumOfInvoices
     FROM
        (SELECT V_Sub.VendorState, V_Sub.VendorName,
            SUM(I_Sub.InvoiceTotal) AS SumOfInvoices
         FROM Invoices AS I_Sub JOIN Vendors AS V_Sub
            ON I_Sub.VendorID = V_Sub.VendorID
         GROUP BY V_Sub.VendorState, V_Sub.VendorName) AS Summary2
     GROUP BY Summary2.VendorState) AS TopInState
    ON Summary1.VendorState = TopInState.VendorState AND
       Summary1.SumOfInvoices = TopInState.SumOfInvoices
ORDER BY Summary1.VendorState

The result set of the query with three subqueries

(10 rows)
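The objectives for this chapter mention common table expressions (CTEs) as a way to define subqueries. One possible way to restate the query with three subqueries is sketched below: the summary subquery, which is coded twice as Summary1 and Summary2 in the version above, is defined once as a CTE and then reused. This is T-SQL that assumes SQL Server. The table variables and their rows are hypothetical sample data, and the CTE names Summary and TopInState are chosen here for illustration.

```sql
-- Hypothetical sample data for illustration only.
DECLARE @Vendors TABLE
    (VendorID int, VendorName varchar(50), VendorState char(2));
DECLARE @Invoices TABLE (InvoiceID int, VendorID int, InvoiceTotal money);
INSERT INTO @Vendors VALUES
    (1, 'Acme', 'CA'),
    (2, 'Bolt', 'CA'),
    (3, 'Crest', 'NV');
INSERT INTO @Invoices VALUES
    (1, 1, 100.00),
    (2, 1, 200.00),
    (3, 2, 500.00),
    (4, 3, 75.00);

-- Summary: the invoice total for each vendor, defined once.
-- TopInState: the largest vendor total in each state, built on Summary.
WITH Summary AS
(
    SELECT v.VendorState, v.VendorName,
        SUM(i.InvoiceTotal) AS SumOfInvoices
    FROM @Invoices AS i JOIN @Vendors AS v
        ON i.VendorID = v.VendorID
    GROUP BY v.VendorState, v.VendorName
),
TopInState AS
(
    SELECT VendorState, MAX(SumOfInvoices) AS SumOfInvoices
    FROM Summary
    GROUP BY VendorState
)
-- Returns two rows: CA, Bolt, 500.00 and NV, Crest, 75.00.
SELECT Summary.VendorState, Summary.VendorName,
    TopInState.SumOfInvoices
FROM Summary JOIN TopInState
    ON Summary.VendorState = TopInState.VendorState AND
       Summary.SumOfInvoices = TopInState.SumOfInvoices
ORDER BY Summary.VendorState;
```

Note that the statement before WITH must end with a semicolon, which is why the INSERT statements above are terminated explicitly.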