
CSC 240 (Blum) 1

Joins

Relational algebra

• Recall that relational algebra is the study of actions that are performed on one or more tables and give another table as a result.
  – The action is called an operation.
  – The things acted upon (tables in this case) are known as operands.

Basic Operations

• The basic operations were
  – Selection: picking rows that satisfy some condition (predicate) from the table.
  – Projection: picking columns from the table.
  – Union, intersection and set difference: basic set operations that apply to union-compatible tables.
  – Cartesian product: concatenate two rows, one from each table; make all such combinations.
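The basic operations can be sketched in a few lines of Python. This is only an illustration, not how a database implements them: tables are modeled as lists of dictionaries, and the table contents and the helper names (`select`, `project`, `union`, `cartesian_product`) are made up for this example.

```python
from itertools import product

# Hypothetical sample tables; each row is a dict mapping column name to value.
items = [
    {"ID": 1, "Name": "Lamp", "Price": 75},
    {"ID": 2, "Name": "Sofa", "Price": 400},
]
more_items = [
    {"ID": 2, "Name": "Sofa", "Price": 400},
    {"ID": 3, "Name": "Mug", "Price": 50},
]


def select(table, predicate):
    """Selection: keep the rows that satisfy the predicate."""
    return [row for row in table if predicate(row)]


def project(table, columns):
    """Projection: keep only the named columns, dropping duplicate rows."""
    seen, result = set(), []
    for row in table:
        key = tuple(row[c] for c in columns)
        if key not in seen:
            seen.add(key)
            result.append(dict(zip(columns, key)))
    return result


def union(r, s):
    """Union of two union-compatible tables (same columns), no duplicates."""
    result = []
    for row in r + s:
        if row not in result:
            result.append(row)
    return result


def cartesian_product(r, s, r_name, s_name):
    """Concatenate every row of r with every row of s.

    Column names are prefixed with the table name so that columns
    sharing a name (such as ID) do not collide.
    """
    return [
        {**{f"{r_name}.{k}": v for k, v in a.items()},
         **{f"{s_name}.{k}": v for k, v in b.items()}}
        for a, b in product(r, s)
    ]


cheap = select(items, lambda row: row["Price"] < 100)
names = project(items, ["Name"])
all_items = union(items, more_items)
pairs = cartesian_product(items, more_items, "A", "B")
```

Each helper takes tables as operands and returns a new table, which is what lets the operations be chained.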

The Join Operation

• An inner join of two tables is a Cartesian product operation followed by a selection operation (and possibly followed by a projection operation).
• If one implements a join straightforwardly, the intermediate Cartesian product can be huge.
• On the other hand, introducing the selection condition earlier may require a lot of searching (for matches).
• This is one reason that relational database management systems (RDBMSs) can exhibit performance problems.
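The trade-off above can be sketched in plain Python. The tables and their contents are invented for illustration, and the dictionary-based lookup is just one possible way to do the "searching for matches"; it is not a description of what any particular RDBMS does.

```python
from itertools import product

# Hypothetical sample data.
customers = [
    {"CustomerID": 1, "LastName": "Doe"},
    {"CustomerID": 2, "LastName": "Rubble"},
    {"CustomerID": 3, "LastName": "Flintstone"},
]
orders = [
    {"OrderID": 10, "CustomerID": 1},
    {"OrderID": 11, "CustomerID": 1},
    {"OrderID": 12, "CustomerID": 3},
]

# Straightforward join, step 1: the full Cartesian product (3 x 3 = 9 pairs).
pairs = list(product(customers, orders))

# Step 2: selection keeps only the pairs whose CustomerID values match.
matched = [(c, o) for c, o in pairs if c["CustomerID"] == o["CustomerID"]]

# Step 3: projection keeps only the columns we want to display.
result = [(c["LastName"], o["OrderID"]) for c, o in matched]

# Alternative: apply the condition early by searching for matches.
# Here the search is a dictionary lookup keyed on CustomerID.
orders_by_customer = {}
for o in orders:
    orders_by_customer.setdefault(o["CustomerID"], []).append(o)

early = [
    (c["LastName"], o["OrderID"])
    for c in customers
    for o in orders_by_customer.get(c["CustomerID"], [])
]
```

With these three-row tables the product has 9 rows, of which only 3 survive the selection; with tables of thousands of rows the intermediate product grows to millions.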

Variations of the join operation

• Theta join

• Equijoin (a particular type of Theta join)

• Natural join (a projection of an Equijoin)

• Outer join (handles unmatched records differently)

• Semijoin

Theta join (θ-join)

• The restriction condition selecting from the Cartesian product does not have to be an equality; it could be any comparison operator, such as
  – Greater than (>)
  – Greater than or equal to (>=)
  – Less than (<)
  – Less than or equal to (<=)
  – Not equal to (<>)
• Using a general condition to restrict the Cartesian product is known as a Theta join.
• R ⋈_F S (R and S are tables, F is a condition)

Theta Join Example

• You have a table of customers who have a budget.
• You have a table of items which have a price.
• You want to advertise your items to customers who can afford them.
• The desired relationship is an inequality: a person’s budget should be greater than the price of the item.
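A minimal sketch of this example, using Python's built-in `sqlite3` module rather than Access. The table layouts, column names and sample rows are assumptions made up for illustration; the slides do not give the actual data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical tables; note that both have a column called ID.
    CREATE TABLE Customer (ID INTEGER PRIMARY KEY, Name TEXT, Budget REAL);
    CREATE TABLE Item (ID INTEGER PRIMARY KEY, ItemName TEXT, Price REAL);
    INSERT INTO Customer VALUES (1, 'Fred', 100), (2, 'Wilma', 500), (3, 'Barney', 50);
    INSERT INTO Item VALUES (1, 'Lamp', 75), (2, 'Sofa', 400), (3, 'Mug', 50);
""")

# Theta join: the join condition is an inequality (Budget > Price),
# not an equality between the two ID columns.
affordable = conn.execute("""
    SELECT Item.ItemName, Customer.Name
    FROM Customer JOIN Item ON Customer.Budget > Item.Price
    ORDER BY Item.ItemName, Customer.Name
""").fetchall()
```

Barney's budget equals the price of the mug, so the strict inequality excludes him; using `>=` instead would include that pairing.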

Theta Join Example: Advertising to Customers who can afford an item

The tables

Note that both tables have a field called ID; Access may be fooled into thinking this is the basis for a relationship.

Right click on relationship line to eliminate.

Choose fields to be displayed (projection).

No condition imposed yet, just a Cartesian product with projection.

Cartesian product projected but not restricted.

Condition added. Since it’s an inequality, this is a Theta Join.

Also added Group By so the results would be grouped by Item.

Equijoin

• The Equijoin is a special case of the Theta join in which the restriction condition is equality.

• Example: a list of orders and the people placing them.

Equijoin Example: a list of orders and the people that placed them

Condition is equality, making this an Equijoin.

Order.CustomerID matches Customer.CustomerID even though Access is showing last names instead.
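The equijoin of orders and customers might look like this in SQL, sketched here with Python's built-in `sqlite3` module. The columns other than CustomerID and the sample rows are invented for illustration, and the order table is named `Orders` in this sketch because ORDER is a reserved word in SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical tables and data.
    CREATE TABLE Customer (CustomerID INTEGER PRIMARY KEY, FirstName TEXT, LastName TEXT);
    CREATE TABLE Orders (OrderID INTEGER PRIMARY KEY, CustomerID INTEGER, Product TEXT);
    INSERT INTO Customer VALUES (1, 'Jane', 'Doe'), (2, 'Betty', 'Rubble');
    INSERT INTO Orders VALUES (10, 1, 'Lamp'), (11, 1, 'Mug');
""")

# Equijoin: the join condition is an equality between the matching columns.
# Both CustomerID columns are selected to show that they agree on every row.
rows = conn.execute("""
    SELECT Orders.OrderID, Orders.CustomerID, Customer.CustomerID, Customer.LastName
    FROM Orders INNER JOIN Customer
        ON Orders.CustomerID = Customer.CustomerID
    ORDER BY Orders.OrderID
""").fetchall()
```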

The Natural Join

• Note that the previous join had both of the matching columns (Order.CustomerID and Customer.CustomerID).

• An Equijoin that projects out (drops) one of the two matching columns is known as a Natural Join.
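The difference shows up in the result's column list. The sketch below uses Python's built-in `sqlite3` module with made-up tables (the order table is called `Orders` here because ORDER is a reserved word). SQL's `NATURAL JOIN` matches on every column name the two tables share, so the sketch keeps CustomerID as the only shared name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical tables; CustomerID is the only column name they share.
    CREATE TABLE Customer (CustomerID INTEGER PRIMARY KEY, FirstName TEXT, LastName TEXT);
    CREATE TABLE Orders (OrderID INTEGER PRIMARY KEY, CustomerID INTEGER, Product TEXT);
    INSERT INTO Customer VALUES (1, 'Jane', 'Doe'), (2, 'Betty', 'Rubble');
    INSERT INTO Orders VALUES (10, 1, 'Lamp'), (11, 1, 'Mug');
""")

# Equijoin: SELECT * returns the matching column from both tables.
cur = conn.execute("""
    SELECT * FROM Orders INNER JOIN Customer
        ON Orders.CustomerID = Customer.CustomerID
""")
equi_columns = [d[0] for d in cur.description]
equi_rows = cur.fetchall()

# Natural join: same rows, but the matching column appears only once.
cur = conn.execute("SELECT * FROM Orders NATURAL JOIN Customer")
natural_columns = [d[0] for d in cur.description]
natural_rows = cur.fetchall()
```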

Natural Join Example (using Wizard)

Projecting out matching column is what makes this a Natural join.

Does counts, totals etc. instead of listing individual records.

Where’s Betty Rubble?

Semijoin

• Not all of the Customers have matches in the Order table.
  – A customer without a match is one for whom there is no order with that particular CustomerID.
• If we select out those rows from the Customer table that do have a match in the Order table, we have a Semijoin.
• Semijoins can be useful in distributed systems. You can cut down on the amount of information you send across the network.
  – There may be more processing at the other end.
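The distributed-systems point can be sketched as follows. This is a toy model in plain Python: the two "sites", their data, and the variable names are all hypothetical, and no real networking is involved.

```python
# Hypothetical data held at two different sites.
customers_at_site_a = [
    {"CustomerID": 1, "FirstName": "Jane", "LastName": "Doe"},
    {"CustomerID": 2, "FirstName": "Betty", "LastName": "Rubble"},
    {"CustomerID": 3, "FirstName": "Fred", "LastName": "Flintstone"},
]
orders_at_site_b = [
    {"OrderID": 10, "CustomerID": 1, "Product": "Lamp"},
    {"OrderID": 11, "CustomerID": 1, "Product": "Mug"},
    {"OrderID": 12, "CustomerID": 3, "Product": "Sofa"},
]

# Site B sends only the distinct values of the matching column,
# not the whole Order table.
keys_sent = {order["CustomerID"] for order in orders_at_site_b}

# Site A does the extra processing: it keeps the Customer rows that
# have a match. The result contains Customer columns only.
semijoin = [c for c in customers_at_site_a if c["CustomerID"] in keys_sent]
```

Here two integers cross the "network" instead of three full order rows, and each matching customer appears once no matter how many orders they placed.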

Semijoin: Customer Orders

Two tables joined, but only one displayed in results. A semijoin.

Semijoin: Customers who have placed orders

Jane Doe appears twice.

Semijoin: Customers who have placed orders (SQL View)

Semijoin: DISTINCT customers who have placed orders (SQL View)

Semijoin: DISTINCT customers who have placed orders (DataSheet View)
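The original SQL View screenshots are not reproduced here. The sketch below, using Python's built-in `sqlite3` module, shows queries of the kind described: a join that displays only Customer columns, first without and then with DISTINCT. The table layouts and rows are assumptions (only the names Jane Doe and Betty Rubble come from the slides), and the order table is named `Orders` because ORDER is a reserved word.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical tables and data: Jane has two orders, Betty has none.
    CREATE TABLE Customer (CustomerID INTEGER PRIMARY KEY, FirstName TEXT, LastName TEXT);
    CREATE TABLE Orders (OrderID INTEGER PRIMARY KEY, CustomerID INTEGER, Product TEXT);
    INSERT INTO Customer VALUES
        (1, 'Jane', 'Doe'), (2, 'Betty', 'Rubble'), (3, 'Fred', 'Flintstone');
    INSERT INTO Orders VALUES (10, 1, 'Lamp'), (11, 1, 'Mug'), (12, 3, 'Sofa');
""")

# Two tables joined, only Customer columns displayed:
# one output row per matching order, so Jane Doe appears twice.
with_duplicates = conn.execute("""
    SELECT Customer.FirstName, Customer.LastName
    FROM Customer INNER JOIN Orders
        ON Customer.CustomerID = Orders.CustomerID
    ORDER BY Customer.LastName
""").fetchall()

# DISTINCT collapses identical output rows.
# Caution: two different customers with the same name would also be
# merged unless CustomerID is included in the SELECT list.
distinct_rows = conn.execute("""
    SELECT DISTINCT Customer.FirstName, Customer.LastName
    FROM Customer INNER JOIN Orders
        ON Customer.CustomerID = Orders.CustomerID
    ORDER BY Customer.LastName
""").fetchall()
```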

Outer Join: Bringing Back Betty

• All of the previous Equijoins have been what are called Inner Joins.

• If a record from one table does not have a match in the other table, it is eliminated.

• If this elimination feature is not desired, then you want to use an Outer Join.

• The Outer Join keeps records that do not have matches.

• R ⟕ S (left outer join of tables R and S)
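A left outer join can be sketched with Python's built-in `sqlite3` module. The tables and rows are made up for illustration (the order table is named `Orders` because ORDER is a reserved word); Betty Rubble is given no orders so that she would be eliminated by an inner join.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical tables and data: Betty has no orders.
    CREATE TABLE Customer (CustomerID INTEGER PRIMARY KEY, FirstName TEXT, LastName TEXT);
    CREATE TABLE Orders (OrderID INTEGER PRIMARY KEY, CustomerID INTEGER, Product TEXT);
    INSERT INTO Customer VALUES (1, 'Jane', 'Doe'), (2, 'Betty', 'Rubble');
    INSERT INTO Orders VALUES (10, 1, 'Lamp'), (11, 1, 'Mug');
""")

# Inner join: customers without a matching order are eliminated.
inner_rows = conn.execute("""
    SELECT Customer.FirstName, Customer.LastName, Orders.OrderID
    FROM Customer INNER JOIN Orders
        ON Customer.CustomerID = Orders.CustomerID
    ORDER BY Customer.CustomerID, Orders.OrderID
""").fetchall()

# Left outer join: every Customer row is kept; where there is no match,
# the Orders columns are filled with NULL (None in Python).
outer_rows = conn.execute("""
    SELECT Customer.FirstName, Customer.LastName, Orders.OrderID
    FROM Customer LEFT OUTER JOIN Orders
        ON Customer.CustomerID = Orders.CustomerID
    ORDER BY Customer.CustomerID, Orders.OrderID
""").fetchall()
```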

Access Help: Join Type

Inner Join: Customers and orders

Inner Join

Still Inner

Converting to Outer Join: Right Click on Relationship Line and choose Join Properties

Join Properties dialog box

Outer Join: Customers and orders

The relationship, which was drawn as a line, is now drawn as an arrow.

Outer Join: Customers with or without orders

Customers who have not placed orders.
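In an outer join result, the customers who have not placed orders are the rows whose order fields are empty. One common way to list just those customers is to filter the outer join on a NULL order column. The sketch below uses Python's built-in `sqlite3` module with made-up tables and data (the order table is named `Orders` because ORDER is a reserved word); it is an illustration, not the query used in the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical tables and data: Betty has no orders.
    CREATE TABLE Customer (CustomerID INTEGER PRIMARY KEY, FirstName TEXT, LastName TEXT);
    CREATE TABLE Orders (OrderID INTEGER PRIMARY KEY, CustomerID INTEGER, Product TEXT);
    INSERT INTO Customer VALUES (1, 'Jane', 'Doe'), (2, 'Betty', 'Rubble');
    INSERT INTO Orders VALUES (10, 1, 'Lamp'), (11, 1, 'Mug');
""")

# Keep only the outer-join rows where no order matched.
# OrderID is a primary key, so it is NULL only for unmatched customers.
no_orders = conn.execute("""
    SELECT Customer.FirstName, Customer.LastName
    FROM Customer LEFT OUTER JOIN Orders
        ON Customer.CustomerID = Orders.CustomerID
    WHERE Orders.OrderID IS NULL
""").fetchall()
```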

References

• Database Systems, Rob and Coronel

• Database Systems, Connolly and Begg