34
Databases : Relational Algebra - Complex Expression 2007, Fall Pusan National University Ki-Joune Li These slides are made from the materials that Prof. Jeffrey D. Ullm an distributes via his course web page ( http://infolab.stanford.edu/~ullman/dscb/gslides.html )

Databases : Relational Algebra - Complex Expression 2007, Fall Pusan National University Ki-Joune Li These slides are made from the materials that Prof

Embed Size (px)

Citation preview

Page 1: Databases : Relational Algebra - Complex Expression 2007, Fall Pusan National University Ki-Joune Li These slides are made from the materials that Prof

Databases : Relational Algebra- Complex Expression

2007, Fall

Pusan National University

Ki-Joune Li

These slides are made from the materials that Prof. Jeffrey D. Ullman distributes via his course web page (http://infolab.stanford.edu/~ullman/dscb/gslides.html)

Page 2: Databases : Relational Algebra - Complex Expression 2007, Fall Pusan National University Ki-Joune Li These slides are made from the materials that Prof

2

STEMPNU

Building Complex Expressions

Algebras allow us to express sequences of operations in a natural way.

Example: in arithmetic --- (x + 4)*(y - 3).

Relational algebra allows the same. Three notations, just as in arithmetic:

Sequences of assignment statements. Expressions with several operators. Expression trees.

Page 3: Databases : Relational Algebra - Complex Expression 2007, Fall Pusan National University Ki-Joune Li These slides are made from the materials that Prof

3

STEMPNU

Sequences of Assignments

Create temporary relation names. Renaming can be implied by giving relations a list of

attributes. Example: R3 := R1 JOINC R2 can be written:

R4 := R1 X R2

R3 := SELECTC (R4)

Page 4: Databases : Relational Algebra - Complex Expression 2007, Fall Pusan National University Ki-Joune Li These slides are made from the materials that Prof

4

STEMPNU

Expressions in a Single Assignment

Example: the theta-join R3 := R1 JOINC R2 can be written: R3 := SELECTC (R1 X R2)

Precedence of relational operators:1. Unary operators --- select, project, rename --- have highest

precedence, bind first.2. Then come products and joins.3. Then intersection.4. Finally, union and set difference bind last.

But you can always insert parentheses to force the order you desire.

Page 5: Databases : Relational Algebra - Complex Expression 2007, Fall Pusan National University Ki-Joune Li These slides are made from the materials that Prof

5

STEMPNU

Expression Trees

Leaves are operands --- either variables standing for relations or particular, constant relations.

Interior nodes are operators, applied to their child or children.

Page 6: Databases : Relational Algebra - Complex Expression 2007, Fall Pusan National University Ki-Joune Li These slides are made from the materials that Prof

6

STEMPNU

Example 1

Using the relations Bars(name, addr) and Sells(bar, beer, price), find the names of all the bars that are either on Maple St. or sell

Bud for less than $3.

Page 7: Databases : Relational Algebra - Complex Expression 2007, Fall Pusan National University Ki-Joune Li These slides are made from the materials that Prof

7

STEMPNU

As a Tree:

Bars Sells

SELECTaddr = “Maple St.” SELECT price<3 AND beer=“Bud”

PROJECTname

RENAMER(name)

PROJECTbar

UNION

Page 8: Databases : Relational Algebra - Complex Expression 2007, Fall Pusan National University Ki-Joune Li These slides are made from the materials that Prof

8

STEMPNU

Example 2

Using Sells(bar, beer, price), find the bars that sell two different beers at the same price.

Strategy: by renaming, define a copy of Sells, S(bar, beer1, price). T

he natural join of Sells and S consists of quadruples (bar, beer, beer1, price) such that the bar sells both beers at this price.

Page 9: Databases : Relational Algebra - Complex Expression 2007, Fall Pusan National University Ki-Joune Li These slides are made from the materials that Prof

9

STEMPNU

The Tree

Sells Sells

RENAMES(bar, beer1, price)

JOIN Sells.bar=S.bar and Sells.price=S.price

PROJECTbar

SELECTbeer != beer1

Page 10: Databases : Relational Algebra - Complex Expression 2007, Fall Pusan National University Ki-Joune Li These slides are made from the materials that Prof

10

STEMPNU

Relational Algebra on Bags

A bag is like a set, but an element may appear more than once. Multiset is another name for “bag.” Example:

{1,2,1,3} is a bag. {1,2,3} is also a bag that happens to be a set.

Bags also resemble lists, but order in a bag is unimportant. Example: {1,2,1} = {1,1,2} as bags, but [1,2,1] != [1,1,2] as lists.

Page 11: Databases : Relational Algebra - Complex Expression 2007, Fall Pusan National University Ki-Joune Li These slides are made from the materials that Prof

11

STEMPNU

Why Bags?

SQL, the most important query language for relational databases is actually a bag language. SQL will eliminate duplicates, but usually only if you ask it to

do so explicitly.

Some operations, like projection, are much more efficient on bags than sets.

Page 12: Databases : Relational Algebra - Complex Expression 2007, Fall Pusan National University Ki-Joune Li These slides are made from the materials that Prof

12

STEMPNU

Operations on Bags

Selection: Applies to each tuple, so its effect on bags is like its effect on sets.

Projection As a bag operator, we do not eliminate duplicates.

Products and joins done on each pair of tuples, so duplicates in bags have no effe

ct on how we operate.

Page 13: Databases : Relational Algebra - Complex Expression 2007, Fall Pusan National University Ki-Joune Li These slides are made from the materials that Prof

13

STEMPNU

Example: Bag Selection

R( A, B ) S( B, C )1 2 3 45 6 7 81 2

SELECTA+B<5 (R) = A B1 21 2

Page 14: Databases : Relational Algebra - Complex Expression 2007, Fall Pusan National University Ki-Joune Li These slides are made from the materials that Prof

14

STEMPNU

Example: Bag Projection

R( A, B ) S( B, C )1 2 3 45 6 7 81 2

PROJECTA (R) = A151

Page 15: Databases : Relational Algebra - Complex Expression 2007, Fall Pusan National University Ki-Joune Li These slides are made from the materials that Prof

15

STEMPNU

Example: Bag Product

R( A, B ) S( B, C )1 2 3 45 6 7 81 2

R X S = A R.B S.B C1 2 3 41 2 7 85 6 3 45 6 7 81 2 3 41 2 7 8

Page 16: Databases : Relational Algebra - Complex Expression 2007, Fall Pusan National University Ki-Joune Li These slides are made from the materials that Prof

16

STEMPNU

Example: Bag Theta-Join

R( A, B ) S( B, C )1 2 3 45 6 7 81 2

R JOIN R.B<S.B S = A R.B S.B C1 2 3 41 2 7 85 6 7 81 2 3 41 2 7 8

Page 17: Databases : Relational Algebra - Complex Expression 2007, Fall Pusan National University Ki-Joune Li These slides are made from the materials that Prof

17

STEMPNU

Bag Union

Union, intersection, and difference need new definitions for bags.

An element appears in the union of two bags the sum of the number of times it appears in each bag.

Example: {1,2,1} UNION {1,1,2,3,1} = {1,1,1,1,1,2,2,3}

Page 18: Databases : Relational Algebra - Complex Expression 2007, Fall Pusan National University Ki-Joune Li These slides are made from the materials that Prof

18

STEMPNU

Bag Intersection

An element appears in the intersection of two bags the minimum of the number of times it appears in either.

Example: {1,2,1} INTER {1,2,3} = {1,2}.

Page 19: Databases : Relational Algebra - Complex Expression 2007, Fall Pusan National University Ki-Joune Li These slides are made from the materials that Prof

19

STEMPNU

Bag Difference

An element appears in the difference A – B of bags as many times as it appears in A, minus the number of times it appears in B. But never less than 0 times.

Example: {1,2,1} – {1,2,3} = {1}.

Page 20: Databases : Relational Algebra - Complex Expression 2007, Fall Pusan National University Ki-Joune Li These slides are made from the materials that Prof

20

STEMPNU

Beware: Bag Laws != Set Laws

Not all algebraic laws that hold for sets also hold for bags. Example: Commutative law for union (R UNION S = S UNION R

) does hold for bags. Since addition is commutative, adding the number of times x

appears in R and S doesn’t depend on the order of R and S.

Counter Example: Set union is idempotent, meaning that S UNION S = S.

However, for bags, if x appears n times in S, then it appears 2n times in S UNION S.

Thus S UNION S != S in general

Page 21: Databases : Relational Algebra - Complex Expression 2007, Fall Pusan National University Ki-Joune Li These slides are made from the materials that Prof

21

STEMPNU

The Extended Algebra

1. DELTA = eliminate duplicates from bags.

2. TAU = sort tuples.

3. Extended projection : arithmetic, duplication of columns.

4. GAMMA = grouping and aggregation.

5. OUTERJOIN: avoids “dangling tuples” = tuples that do not join with anything.

Page 22: Databases : Relational Algebra - Complex Expression 2007, Fall Pusan National University Ki-Joune Li These slides are made from the materials that Prof

22

STEMPNU

Duplicate Elimination

R1 := DELTA(R2). R1 consists of one copy of each tuple that appears in R

2 one or more times.

Page 23: Databases : Relational Algebra - Complex Expression 2007, Fall Pusan National University Ki-Joune Li These slides are made from the materials that Prof

23

STEMPNU

Example: Duplicate Elimination

R = A B1 23 41 2

DELTA(R) = A B1 23 4

Page 24: Databases : Relational Algebra - Complex Expression 2007, Fall Pusan National University Ki-Joune Li These slides are made from the materials that Prof

24

STEMPNU

Sorting

R1 := TAUL (R2). L is a list of some of the attributes of R2.

R1 is the list of tuples of R2 sorted first on the value of the first attribute on L, then on the second attribute of L, and so on. Break ties arbitrarily.

TAU is the only operator whose result is neither a set nor a bag.

Page 25: Databases : Relational Algebra - Complex Expression 2007, Fall Pusan National University Ki-Joune Li These slides are made from the materials that Prof

25

STEMPNU

Example: Sorting

R = A B1 23 45 2

TAUB (R) = [(5,2), (1,2), (3,4)]

Page 26: Databases : Relational Algebra - Complex Expression 2007, Fall Pusan National University Ki-Joune Li These slides are made from the materials that Prof

26

STEMPNU

Extended Projection

Using the same PROJL operator, we allow the list L to contain arbitrary expressions involving attributes, for example:

1. Arithmetic on attributes, e.g., A+B.

2. Duplicate occurrences of the same attribute.

Page 27: Databases : Relational Algebra - Complex Expression 2007, Fall Pusan National University Ki-Joune Li These slides are made from the materials that Prof

27

STEMPNU

Example: Extended Projection

R = A B1 23 4

PROJA+B,A,A (R) = A+B A1 A23 1 17 3 3

Page 28: Databases : Relational Algebra - Complex Expression 2007, Fall Pusan National University Ki-Joune Li These slides are made from the materials that Prof

28

STEMPNU

Aggregation Operators

Aggregation operators are not operators of relational algebra.

Rather, they apply to entire columns of a table and produce a single result.

The most important examples: SUM, AVG, COUNT, MIN, and MAX.

Page 29: Databases : Relational Algebra - Complex Expression 2007, Fall Pusan National University Ki-Joune Li These slides are made from the materials that Prof

29

STEMPNU

Example: Aggregation

R = A B1 33 43 2

SUM(A) = 7COUNT(A) = 3MAX(B) = 4AVG(B) = 3

Page 30: Databases : Relational Algebra - Complex Expression 2007, Fall Pusan National University Ki-Joune Li These slides are made from the materials that Prof

30

STEMPNU

Grouping Operator

R1 := GAMMAL (R2). L is a list of elements that are either:

Individual (grouping ) attributes. AGG(A ), where

AGG is one of the aggregation operators and A is an attribute.

Example: StarsIn (title, year, starName) Find the earliest year in which they appeared in movie for eac

h star who has appeared at least three times

Page 31: Databases : Relational Algebra - Complex Expression 2007, Fall Pusan National University Ki-Joune Li These slides are made from the materials that Prof

31

STEMPNU

Example: Grouping/Aggregation

R = A B C1 2 34 5 61 2 5

GAMMAA,B,AVG(C) (R) = ??

First, group R :A B C1 2 31 2 54 5 6

Then, average C withingroups:

A B AVG(C)1 2 44 5 6

Page 32: Databases : Relational Algebra - Complex Expression 2007, Fall Pusan National University Ki-Joune Li These slides are made from the materials that Prof

32

STEMPNU

Example: Grouping/Aggregation

StarsIn (title, year, starName) Find the earliest year in which they appeared in movie for each s

tar who has appeared at least three times

R1:=Gamma starName, MIN(year) minYear, Count(title)ctTitle (StarIN) R2 := SELECT ctTitle >= 3 (R1)

R3 := PROJECT minYear (R2)

Page 33: Databases : Relational Algebra - Complex Expression 2007, Fall Pusan National University Ki-Joune Li These slides are made from the materials that Prof

33

STEMPNU

Outerjoin

Suppose we join R JOINC S. A tuple of R that has no tuple of S with which it join

s is said to be dangling. Similarly for a tuple of S.

Outerjoin preserves dangling tuples by padding them with a special NULL symbol in the result.

Page 34: Databases : Relational Algebra - Complex Expression 2007, Fall Pusan National University Ki-Joune Li These slides are made from the materials that Prof

34

STEMPNU

Example: Outerjoin

R = A B S = B C1 2 2 34 5 6 7

(1,2) joins with (2,3), but the other two tuplesare dangling.

R OUTERJOIN S = A B C1 2 34 5 NULLNULL 6 7