34
Lecture 4 Additional Slides CSE 344, Winter 2014 Sudeepa Roy 1 CSE 344 - Winter 2014

Lecture 4 Additional Slides - University of Washington · Lecture 4 Additional Slides CSE 344, Winter 2014 ... in class, and mostly contain the same material as the lecture slides

  • Upload
    others

  • View
    2

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Lecture 4 Additional Slides - University of Washington · Lecture 4 Additional Slides CSE 344, Winter 2014 ... in class, and mostly contain the same material as the lecture slides

Lecture 4 Additional Slides

CSE 344, Winter 2014

Sudeepa Roy

1 CSE 344 - Winter 2014

Page 2: Lecture 4 Additional Slides - University of Washington · Lecture 4 Additional Slides CSE 344, Winter 2014 ... in class, and mostly contain the same material as the lecture slides

• NOTE: These slides were used as a whiteboard in class, and mostly contain the same material as the lecture slides and sqlite notes.

2 CSE 344 - Winter 2014

Page 3: Lecture 4 Additional Slides - University of Washington · Lecture 4 Additional Slides CSE 344, Winter 2014 ... in class, and mostly contain the same material as the lecture slides

Announcements

• Webquiz 1 is due on Tuesday (01/14): 11:59 pm • Webquiz 2 will be posted on Wednesday (01/15):

00:00 am • Homework 1 is due on Thursday (01/16)

• This Wednesday’s (01/15) office hour: 10-10:50 am,

CSE 344 (Sudeepa)

• Lecture 5-8 (SQL) by Daniel Halperin

CSE 344 - Winter 2014 3

Page 4: Lecture 4 Additional Slides - University of Washington · Lecture 4 Additional Slides CSE 344, Winter 2014 ... in class, and mostly contain the same material as the lecture slides

Outline

• Nulls in SQL (6.1.6)

• Outer joins (6.3.8)

• Aggregations (6.4.3 – 6.4.6)

• Examples, examples, examples…

CSE 344 - Winter 2014 4

Page 5: Lecture 4 Additional Slides - University of Washington · Lecture 4 Additional Slides CSE 344, Winter 2014 ... in class, and mostly contain the same material as the lecture slides

NULL In SqLite

pname price category manufacturer

Gizmo 19.99 gadget GizmoWorks

PowerGizmo 29.99 gadget GizmoWorks

SingleTouch 149.99 photography Canon

MultiTouch 199.99 photography Hitachi

SuperGizmo 49.99 gadget Hitachi

iPad 5 NULL gadget Apple

cname country

GizmoWorks USA

Canon Japan

Hitachi Japan

Google USA

Product Company

SELECT gadegts with price < 25 and >=25 iPad5 is nowhere!

• Product(pname, price, category, manufacturer) • Company(cname, country)

5 CSE 344 - Winter 2014

Page 6: Lecture 4 Additional Slides - University of Washington · Lecture 4 Additional Slides CSE 344, Winter 2014 ... in class, and mostly contain the same material as the lecture slides

Conditions involving NULL

• We need to evaluate in SQL conditions like this:

• ((price < 25) and (category = 'gadget')) or (manufacturer = 'Apple')

• Suppose price = 19, category = NULL, and manufacturer = NULL

• Is the predicate true or false?

6 CSE 344 - Winter 2014

Page 7: Lecture 4 Additional Slides - University of Washington · Lecture 4 Additional Slides CSE 344, Winter 2014 ... in class, and mostly contain the same material as the lecture slides

3-valued logic

• FALSE = 0 – E.g. price<25 is FALSE when price=99

• UNKNOWN = 0.5 – E.g. price<25 is UNKNOWN when price=NULL

• TRUE = 1 – E.g. price<25 is TRUE when price=19

• PREDICATES:

– C1 AND C2 means min(C1,C2) – C1 OR C2 means max(C1,C2) – not C means 1-C

• Answer appears only if the truth value is TRUE • Does not appear if the truth value is FALSE or UNKNOWN • Therefore, no iPad5!

7 CSE 344 - Winter 2014

Page 8: Lecture 4 Additional Slides - University of Washington · Lecture 4 Additional Slides CSE 344, Winter 2014 ... in class, and mostly contain the same material as the lecture slides

Compute the truth value

• Your answer in class :

= (TRUE and UNKNOWN) OR UNKNOWN

= UNKNOWN OR UNKNOWN

= UNKNOWN

• ((price < 25) and (category = 'gadget‘)) or (manufacturer = 'Apple')

• Suppose price = 19, category = NULL, and manufacturer = NULL

8 CSE 344 - Winter 2014

Page 9: Lecture 4 Additional Slides - University of Washington · Lecture 4 Additional Slides CSE 344, Winter 2014 ... in class, and mostly contain the same material as the lecture slides

OUTERJOIN

9 CSE 344 - Winter 2014

Page 10: Lecture 4 Additional Slides - University of Washington · Lecture 4 Additional Slides CSE 344, Winter 2014 ... in class, and mostly contain the same material as the lecture slides

JOIN = INNER JOIN

• Select pname, cname from Product, Company where manufacturer = cname;

• Select pname, cname from Product (INNER) JOIN Company ON manufacturer = cname;

• Where is ipad5 or Google? pname price category manufacturer

Gizmo 19.99 gadget GizmoWorks

PowerGizmo 29.99 gadget GizmoWorks

SingleTouch 149.99 photography Canon

MultiTouch 199.99 photography Hitachi

SuperGizmo 49.99 gadget Hitachi

iPad 5 NULL gadget Apple

cname country

GizmoWorks USA

Canon Japan

Hitachi Japan

Google USA

Product Company

optional

10 CSE 344 - Winter 2014

Page 11: Lecture 4 Additional Slides - University of Washington · Lecture 4 Additional Slides CSE 344, Winter 2014 ... in class, and mostly contain the same material as the lecture slides

LEFT OUTER JOIN

• Include the left tuple even if there’s no match • Select pname, cname from Product LEFT OUTER JOIN

Company ON manufacturer = cname; • You should get (iPad5, null) • Reverse the order of relations.

pname price category manufacturer

Gizmo 19.99 gadget GizmoWorks

PowerGizmo 29.99 gadget GizmoWorks

SingleTouch 149.99 photography Canon

MultiTouch 199.99 photography Hitachi

SuperGizmo 49.99 gadget Hitachi

iPad 5 NULL gadget Apple

cname country

GizmoWorks USA

Canon Japan

Hitachi Japan

Google USA

Product Company

11 CSE 344 - Winter 2014

Page 12: Lecture 4 Additional Slides - University of Washington · Lecture 4 Additional Slides CSE 344, Winter 2014 ... in class, and mostly contain the same material as the lecture slides

RIGHT OUTER JOIN

• Include the right tuple even if there’s no match • Select pname, cname from Product RIGHT OUTER

JOIN Company ON manufacturer = cname; • You should get (null, Google)

pname price category manufacturer

Gizmo 19.99 gadget GizmoWorks

PowerGizmo 29.99 gadget GizmoWorks

SingleTouch 149.99 photography Canon

MultiTouch 199.99 photography Hitachi

SuperGizmo 49.99 gadget Hitachi

iPad 5 NULL gadget Apple

cname country

GizmoWorks USA

Canon Japan

Hitachi Japan

Google USA

Product Company

12 CSE 344 - Winter 2014

Page 13: Lecture 4 Additional Slides - University of Washington · Lecture 4 Additional Slides CSE 344, Winter 2014 ... in class, and mostly contain the same material as the lecture slides

FULL OUTER JOIN

• Include both left and right tuples even if there’s no match

• Select pname, cname from Product (FULL) OUTER JOIN Company ON manufacturer = cname;

• You should get both (null, Google) and (iPad5, null)

pname price category manufacturer

Gizmo 19.99 gadget GizmoWorks

PowerGizmo 29.99 gadget GizmoWorks

SingleTouch 149.99 photography Canon

MultiTouch 199.99 photography Hitachi

SuperGizmo 49.99 gadget Hitachi

iPad 5 NULL gadget Apple

cname country

GizmoWorks USA

Canon Japan

Hitachi Japan

Google USA

Product Company

13 CSE 344 - Winter 2014

Page 14: Lecture 4 Additional Slides - University of Washington · Lecture 4 Additional Slides CSE 344, Winter 2014 ... in class, and mostly contain the same material as the lecture slides

AGGREGATION IN SQL

14 CSE 344 - Winter 2014

Page 15: Lecture 4 Additional Slides - University of Washington · Lecture 4 Additional Slides CSE 344, Winter 2014 ... in class, and mostly contain the same material as the lecture slides

Aggregation in SQL

CSE 344 - Winter 2014 15

Other DBMSs have other ways of importing data

Specify a filename where the database will be stored

>sqlite3 lecture04

sqlite> create table Purchase

(pid int primary key,

product text,

price float,

quantity int,

month varchar(15));

sqlite> -- download data.txt

sqlite> .import data.txt Purchase

Page 16: Lecture 4 Additional Slides - University of Washington · Lecture 4 Additional Slides CSE 344, Winter 2014 ... in class, and mostly contain the same material as the lecture slides

Comment about SQLite

• One cannot load NULL values such that they are actually loaded as null values

• So we need to use two steps:

– Load null values using some type of special value

– Update the special values to actual null values

CSE 344 - Winter 2014 16

update Purchase

set price = null

where price = ‘null’

Page 17: Lecture 4 Additional Slides - University of Washington · Lecture 4 Additional Slides CSE 344, Winter 2014 ... in class, and mostly contain the same material as the lecture slides

Simple Aggregations

Five basic aggregate operations in SQL

CSE 344 - Winter 2014 17

select count(*) from Purchase

select sum(quantity) from Purchase

select avg(price) from Purchase

select max(quantity) from Purchase

select min(quantity) from Purchase

Page 18: Lecture 4 Additional Slides - University of Washington · Lecture 4 Additional Slides CSE 344, Winter 2014 ... in class, and mostly contain the same material as the lecture slides

Aggregates and NULL Values

18

insert into Purchase

values(12, 'gadget', NULL, NULL, 'april')

select count(*) from Purchase

select count(quantity) from Purchase

select sum(quantity) from Purchase

select sum(quantity)

from Purchase

where quantity is not null;

Null values are not used in aggregates

Let’s try the following

CSE 344 - Winter 2014

Page 19: Lecture 4 Additional Slides - University of Washington · Lecture 4 Additional Slides CSE 344, Winter 2014 ... in class, and mostly contain the same material as the lecture slides

COUNT applies to duplicates, unless otherwise stated:

SELECT Count(product)

FROM Purchase

WHERE price > 4.99

same as Count(*)

We probably want:

SELECT Count(DISTINCT product)

FROM Purchase

WHERE price> 4.99

Counting Duplicates

CSE 344 - Winter 2014 19

Page 20: Lecture 4 Additional Slides - University of Washington · Lecture 4 Additional Slides CSE 344, Winter 2014 ... in class, and mostly contain the same material as the lecture slides

More Examples

SELECT Sum(price * quantity)

FROM Purchase

SELECT Sum(price * quantity)

FROM Purchase

WHERE product = ‘bagel’

What do

they mean ?

CSE 344 - Winter 2014 20

Page 21: Lecture 4 Additional Slides - University of Washington · Lecture 4 Additional Slides CSE 344, Winter 2014 ... in class, and mostly contain the same material as the lecture slides

Simple Aggregations

Purchase

SELECT Sum(price * quantity)

FROM Purchase

WHERE product = ‘Bagel’ 90 (= 60+30)

Product Price Quantity

Bagel 3 20

Bagel 1.50 20

Banana 0.5 50

Banana 2 10

Banana 4 10

CSE 344 - Winter 2014 21

Page 22: Lecture 4 Additional Slides - University of Washington · Lecture 4 Additional Slides CSE 344, Winter 2014 ... in class, and mostly contain the same material as the lecture slides

GROUP BY: Grouping and Aggregation

Purchase(product, price, quantity)

SELECT product, Sum(quantity) AS TotalSales

FROM Purchase

WHERE price > 1

GROUP BY product

Let’s see what this means…

Find total quantities for all sales over $1, by product.

CSE 344 - Winter 2014 22

Page 23: Lecture 4 Additional Slides - University of Washington · Lecture 4 Additional Slides CSE 344, Winter 2014 ... in class, and mostly contain the same material as the lecture slides

Grouping and Aggregation

1. Compute the FROM and WHERE clauses.

2. Group by the attributes in the GROUPBY

3. Compute the SELECT clause:

grouped attributes and aggregates.

CSE 344 - Winter 2014 23

Page 24: Lecture 4 Additional Slides - University of Washington · Lecture 4 Additional Slides CSE 344, Winter 2014 ... in class, and mostly contain the same material as the lecture slides

1&2. FROM-WHERE-GROUPBY

Product Price Quantity

Bagel 3 20

Bagel 1.50 20

Banana 0.5 50

Banana 2 10

Banana 4 10

CSE 344 - Winter 2014 24

WHERE price > 1

SELECT product,

Sum(quantity) AS TotalSales

FROM Purchase

WHERE price > 1

GROUP BY product

Page 25: Lecture 4 Additional Slides - University of Washington · Lecture 4 Additional Slides CSE 344, Winter 2014 ... in class, and mostly contain the same material as the lecture slides

3. SELECT

SELECT product, Sum(quantity) AS TotalSales

FROM Purchase

WHERE price > 1

GROUP BY product

Product TotalSales

Bagel 40

Banana 20

Product Price Quantity

Bagel 3 20

Bagel 1.50 20

Banana 0.5 50

Banana 2 10

Banana 4 10

25 CSE 344 - Winter 2014

Page 26: Lecture 4 Additional Slides - University of Washington · Lecture 4 Additional Slides CSE 344, Winter 2014 ... in class, and mostly contain the same material as the lecture slides

Other Examples

SELECT product,

sum(quantity) AS SumQuantity,

max(price) AS MaxPrice

FROM Purchase

GROUP BY product

What does

it mean ?

CSE 344 - Winter 2014

SELECT product, count(*)

FROM Purchase

GROUP BY product

SELECT month, count(*)

FROM Purchase

GROUP BY month

Compare these

two queries:

26

Page 27: Lecture 4 Additional Slides - University of Washington · Lecture 4 Additional Slides CSE 344, Winter 2014 ... in class, and mostly contain the same material as the lecture slides

Need to be Careful…

SELECT product, max(quantity)

FROM Purchase

GROUP BY product

SELECT product, quantity

FROM Purchase

GROUP BY product

sqlite is WRONG on this query. Advanced DBMS (e.g. SQL Server) gives an error

(the attribute in SELECT must appear in an aggregate function OR GROUP BY clause)

Product Price Quantity

Bagel 3 20

Bagel 1.50 20

Banana 0.5 50

Banana 2 10

Banana 4 10

27 CSE 344 - Winter 2014

Page 28: Lecture 4 Additional Slides - University of Washington · Lecture 4 Additional Slides CSE 344, Winter 2014 ... in class, and mostly contain the same material as the lecture slides

What does this query do?

• Your answer in class :

Bagel september

Banana september

Gizmo march

Gadget march

Orange may

28 CSE 344 - Winter 2014

• select product, max(month) from purchase group by product;

Page 29: Lecture 4 Additional Slides - University of Washington · Lecture 4 Additional Slides CSE 344, Winter 2014 ... in class, and mostly contain the same material as the lecture slides

Understanding groups -- 1

• select * from purchase;

• How many tuples?

• Answer: 12

29 CSE 344 - Winter 2014

Page 30: Lecture 4 Additional Slides - University of Washington · Lecture 4 Additional Slides CSE 344, Winter 2014 ... in class, and mostly contain the same material as the lecture slides

Understanding groups -- 2

• Your answer in class : • Bagel 2 • Orange 1 • Banana 2 • Gadget 4 • Gizmo 3

30 CSE 344 - Winter 2014

• select product, count(*) from purchase group by product;

Page 31: Lecture 4 Additional Slides - University of Washington · Lecture 4 Additional Slides CSE 344, Winter 2014 ... in class, and mostly contain the same material as the lecture slides

Understanding groups - 3

• Your answer in class:

• Bagel 1

• Banana 0 (xxxxx WON’T APPEAR)

• Orange 0 (xxxxx WON’T APPEAR)

• Gizmo 3

• Gadget 3

31 CSE 344 - Winter 2014

• select product, count(*) from purchase group by product;

Always remember that first WHERE

condition is applied, then GROUP BY is

performed, so these tuples with

0 value won’t appear in the

results

Page 32: Lecture 4 Additional Slides - University of Washington · Lecture 4 Additional Slides CSE 344, Winter 2014 ... in class, and mostly contain the same material as the lecture slides

DISTINCT = GROUP BY

• select month, count(*) from purchase group by month;

• BUT AGGREGATES ARE NOT NECESSARY WITH GROUP BY

• select month from purchase group by month;

• IS EQUIVALENT TO

• select distinct month from purchase;

32 CSE 344 - Winter 2014

Page 33: Lecture 4 Additional Slides - University of Washington · Lecture 4 Additional Slides CSE 344, Winter 2014 ... in class, and mostly contain the same material as the lecture slides

Ordering Results

CSE 344 - Winter 2014

SELECT product, sum(price*quantity) as rev

FROM purchase

GROUP BY product

ORDER BY rev desc;

33

Page 34: Lecture 4 Additional Slides - University of Washington · Lecture 4 Additional Slides CSE 344, Winter 2014 ... in class, and mostly contain the same material as the lecture slides

GROUP BY with Multiple Attributes

• Try the query you wrote in class

• select pid, pname, count(*) from purchase group by pid, pname;

CSE 344 - Winter 2014 34