Databases - Relational Algebra

GF Royle, N Spadaccini (2006-2010)


This lecture

This lecture covers relational algebra, which is the formal language underlying the manipulation of relations.

We follow the notation from Chapter 4 of Ramakrishnan & Gehrke.


Projection and Selection

Relational Algebra

Relational algebra is a procedural language that allows us to describe operations on relations in a formal and mathematically precise way.

An expression in relational algebra describes a sequence of operations that can be applied to a relation and which produces a relation as a result.

The primary operations of the relational algebra are projection, selection and joins.


Projection

Suppose we have a relation

R ⊆ C_1 × C_2 × ⋯ × C_n,

where C_1, C_2, ..., C_n are the columns of the relation.

If S is a subset of the columns then π_S(R) is the relation obtained from R by deleting all the columns not in S, and it is called the projection onto S of the relation R.


Example Relation

Suppose R is the following relation

customerId  name            address    accountMgr
1121        Bunnings        Subiaco    137
1122        Bunnings        Claremont  137
1211        Mitre 10        Myaree     186
1244        Mitre 10        Joondalup  186
1345        Joe’s Hardware  Nedlands   204
1399        NailsRUs        Jolimont   361


Example Projections

Then π_{customerId, name}(R) is

customerId  name
1121        Bunnings
1122        Bunnings
1211        Mitre 10
1244        Mitre 10
1345        Joe’s Hardware
1399        NailsRUs

We have projected the relation onto the two named columns, thus obtaining a smaller relation.


Another example projection

If R is the relation given above, then π_name(R) is the relation

name
Bunnings
Mitre 10
Joe’s Hardware
NailsRUs

As the result of a projection is a relation, which is defined to be a set, the output cannot contain any duplicate rows.
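The effect of set semantics on projection can be sketched in a few lines of Python. This is only an illustration: modelling a relation as a set of tuples, and the helper name `project`, are assumptions made here and not part of the lecture.

```python
# The example relation R, one tuple per row:
# (customerId, name, address, accountMgr).
COLUMNS = ("customerId", "name", "address", "accountMgr")
R = {
    (1121, "Bunnings", "Subiaco", 137),
    (1122, "Bunnings", "Claremont", 137),
    (1211, "Mitre 10", "Myaree", 186),
    (1244, "Mitre 10", "Joondalup", 186),
    (1345, "Joe's Hardware", "Nedlands", 204),
    (1399, "NailsRUs", "Jolimont", 361),
}


def project(relation, columns, keep):
    """Project `relation` onto the column names listed in `keep`."""
    positions = [columns.index(c) for c in keep]
    # The result is a set, so duplicate rows collapse automatically.
    return {tuple(row[i] for i in positions) for row in relation}


ids_and_names = project(R, COLUMNS, ["customerId", "name"])
names = project(R, COLUMNS, ["name"])
print(len(ids_and_names))  # 6 rows: customerId keeps every row distinct
print(len(names))          # 4 rows: the repeated names have collapsed
```

Projecting onto name alone leaves four rows because the two Bunnings rows and the two Mitre 10 rows become identical once the other columns are dropped.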


In SQL

It should be clear that there is a direct relationship between

  • the projection operator π, and
  • the SELECT columns FROM statement

In practical terms one difference is that SQL normally does not automatically remove duplicate rows, because this is a relatively expensive operation.
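The difference can be observed directly. The sketch below uses Python's built-in sqlite3 module with an in-memory database; using SQLite instead of MySQL, and the table name customer, are assumptions made so that the example is self-contained. SELECT DISTINCT is the standard SQL way to request the duplicate removal that π performs implicitly.

```python
import sqlite3

# In-memory SQLite database standing in for a real server (an assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer "
    "(customerId INTEGER, name TEXT, address TEXT, accountMgr INTEGER)"
)
conn.executemany(
    "INSERT INTO customer VALUES (?, ?, ?, ?)",
    [
        (1121, "Bunnings", "Subiaco", 137),
        (1122, "Bunnings", "Claremont", 137),
        (1211, "Mitre 10", "Myaree", 186),
        (1244, "Mitre 10", "Joondalup", 186),
        (1345, "Joe's Hardware", "Nedlands", 204),
        (1399, "NailsRUs", "Jolimont", 361),
    ],
)

# A plain SELECT keeps one output row per input row, duplicates included.
plain = conn.execute("SELECT name FROM customer").fetchall()
# SELECT DISTINCT matches the set semantics of the projection operator.
distinct = conn.execute("SELECT DISTINCT name FROM customer").fetchall()

print(len(plain), len(distinct))  # 6 4
conn.close()
```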


Selection

Projection is an operation that extracts columns from a relation, while selection is the operation that extracts rows from a relation.

If R is a relation and B is a boolean function on the columns of R then

σ_B(R)

is the relation with the same columns as R and whose rows are the rows of R for which the boolean function evaluates to true.

Equivalently we say that σ_B selects the rows from R that satisfy the boolean condition B.


Example Selection

If R is the relation defined above, then

σ_{customerId<1300}(R)

is the relation

customerId  name      address    accountMgr
1121        Bunnings  Subiaco    137
1122        Bunnings  Claremont  137
1211        Mitre 10  Myaree     186
1244        Mitre 10  Joondalup  186
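As a rough illustration, selection is a filter over the rows of a relation. In the Python sketch below, the `Customer` named tuple and the helper `select` are names invented for this example.

```python
from typing import NamedTuple


class Customer(NamedTuple):
    customerId: int
    name: str
    address: str
    accountMgr: int


R = {
    Customer(1121, "Bunnings", "Subiaco", 137),
    Customer(1122, "Bunnings", "Claremont", 137),
    Customer(1211, "Mitre 10", "Myaree", 186),
    Customer(1244, "Mitre 10", "Joondalup", 186),
    Customer(1345, "Joe's Hardware", "Nedlands", 204),
    Customer(1399, "NailsRUs", "Jolimont", 361),
}


def select(relation, condition):
    """Keep exactly the rows for which the boolean condition is true."""
    return {row for row in relation if condition(row)}


low_ids = select(R, lambda row: row.customerId < 1300)
for row in sorted(low_ids):
    print(row.customerId, row.name, row.address, row.accountMgr)
```

The result has the same columns as R; only the number of rows changes.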


Combining selection and projection

As the result of a selection or projection is a relation, we can use it as the “input” for another selection or projection and thus build up complex relational algebra expressions.

π_accountMgr(σ_{name='Bunnings'}(R))

This compound operation first selects the rows that correspond to “Bunnings” and then projects onto the single column accountMgr. In other words, this relational algebra expression answers the query “What is the staff id of the account manager for Bunnings?”
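Because each operator returns a relation, the operators compose like ordinary functions. A minimal Python sketch of the expression above follows; the helper names `select` and `project` and the named-tuple representation are assumptions made for the example.

```python
from typing import NamedTuple


class Customer(NamedTuple):
    customerId: int
    name: str
    address: str
    accountMgr: int


R = {
    Customer(1121, "Bunnings", "Subiaco", 137),
    Customer(1122, "Bunnings", "Claremont", 137),
    Customer(1211, "Mitre 10", "Myaree", 186),
    Customer(1244, "Mitre 10", "Joondalup", 186),
    Customer(1345, "Joe's Hardware", "Nedlands", 204),
    Customer(1399, "NailsRUs", "Jolimont", 361),
}


def select(relation, condition):
    """Selection: keep the rows that satisfy the condition."""
    return {row for row in relation if condition(row)}


def project(relation, *fields):
    """Projection: keep only the named fields, as a set of plain tuples."""
    return {tuple(getattr(row, f) for f in fields) for row in relation}


# The inner selection runs first; its output feeds the outer projection.
managers = project(select(R, lambda row: row.name == "Bunnings"), "accountMgr")
print(managers)  # {(137,)}
```

Both Bunnings rows carry the same account manager, so the projected result is a single row.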


In SQL

The selection operator in relational algebra corresponds directly to the WHERE clause of SQL.

Thus we can directly translate the relational algebra expression of the previous slide into an SQL query

    SELECT accountMgr FROM customer
    WHERE name = 'Bunnings';


Boolean functions

The boolean functions that can be used as the selection condition are combinations using ∧ (for AND) and ∨ (for OR) of terms of the form

attribute op constant

or

attribute1 op attribute2

where op is a comparison operator in the set

{<, ≤, =, ≠, ≥, >}.
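One way to picture these conditions is as small functions built from the two kinds of term and combined with AND and OR. The Python sketch below is illustrative only; the helper names `attr_const`, `attr_attr`, `conj` and `disj` are invented here, and rows are modelled as dictionaries.

```python
import operator

# The six comparison operators, keyed by an ASCII spelling.
OPS = {
    "<": operator.lt, "<=": operator.le, "=": operator.eq,
    "!=": operator.ne, ">=": operator.ge, ">": operator.gt,
}


def attr_const(attribute, op, constant):
    """Term of the form: attribute op constant."""
    return lambda row: OPS[op](row[attribute], constant)


def attr_attr(attribute1, op, attribute2):
    """Term of the form: attribute1 op attribute2."""
    return lambda row: OPS[op](row[attribute1], row[attribute2])


def conj(*terms):
    """Logical AND of the given terms."""
    return lambda row: all(t(row) for t in terms)


def disj(*terms):
    """Logical OR of the given terms."""
    return lambda row: any(t(row) for t in terms)


rows = [
    {"customerId": 1121, "name": "Bunnings", "accountMgr": 137},
    {"customerId": 1122, "name": "Bunnings", "accountMgr": 137},
    {"customerId": 1211, "name": "Mitre 10", "accountMgr": 186},
    {"customerId": 1244, "name": "Mitre 10", "accountMgr": 186},
    {"customerId": 1345, "name": "Joe's Hardware", "accountMgr": 204},
    {"customerId": 1399, "name": "NailsRUs", "accountMgr": 361},
]

# customerId < 1300 AND accountMgr = 186
both = conj(attr_const("customerId", "<", 1300), attr_const("accountMgr", "=", 186))
# name = 'NailsRUs' OR accountMgr = 137
either = disj(attr_const("name", "=", "NailsRUs"), attr_const("accountMgr", "=", 137))
# customerId > accountMgr (a comparison between two attributes)
compare = attr_attr("customerId", ">", "accountMgr")

ids_both = [r["customerId"] for r in rows if both(r)]
ids_either = [r["customerId"] for r in rows if either(r)]
ids_compare = [r["customerId"] for r in rows if compare(r)]
print(ids_both, ids_either, len(ids_compare))
```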


Relational Algebra Expressions

A relational algebra expression is recursively defined to be

  • a relation, or
  • a unary operator applied to a single expression, or
  • a binary operator applied to two expressions.

Selection and projection are unary operators because they operate on a single relation.
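The recursive definition maps naturally onto a recursive evaluator. The sketch below is one possible encoding, not a standard one: expressions are nested tuples whose first element names the case, and a relation is a pair of column names and a set of rows. Only one binary operator (union) is included to keep it short.

```python
def evaluate(expr):
    """Evaluate a nested-tuple expression to a (columns, rows) pair."""
    kind = expr[0]
    if kind == "relation":          # base case: a stored relation
        _, columns, rows = expr
        return tuple(columns), set(rows)
    if kind == "select":            # unary operator on one sub-expression
        _, condition, sub = expr
        columns, rows = evaluate(sub)
        return columns, {r for r in rows if condition(dict(zip(columns, r)))}
    if kind == "project":           # unary operator on one sub-expression
        _, keep, sub = expr
        columns, rows = evaluate(sub)
        positions = [columns.index(c) for c in keep]
        return tuple(keep), {tuple(r[i] for i in positions) for r in rows}
    if kind == "union":             # binary operator on two sub-expressions
        _, left, right = expr
        lcols, lrows = evaluate(left)
        rcols, rrows = evaluate(right)
        if len(lcols) != len(rcols):
            raise ValueError("operands must have the same number of columns")
        return lcols, lrows | rrows
    raise ValueError(f"unknown expression kind: {kind}")


customers = (
    "relation",
    ("customerId", "name", "accountMgr"),
    {
        (1121, "Bunnings", 137),
        (1122, "Bunnings", 137),
        (1211, "Mitre 10", 186),
        (1399, "NailsRUs", 361),
    },
)

# A projection applied to a selection applied to a relation.
expr = ("project", ["accountMgr"],
        ("select", lambda row: row["name"] == "Bunnings", customers))
result = evaluate(expr)
print(result)  # (('accountMgr',), {(137,)})
```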


Joins

Binary Operators

Relational algebra permits the use of the standard set operations:

Union (∪)
    If R and S are union-compatible, then R ∪ S is the set of tuples in either R or S.

Intersection (∩)
    If R and S are union-compatible, then R ∩ S is the set of tuples in both R and S.

Set Difference (−)
    If R and S are union-compatible, then R − S is the set of tuples in R that are not in S.

Cartesian Product (×)
    If R has arity r and S has arity s, then R × S has arity r + s and has all tuples whose projection onto the first r columns is in R and whose projection onto the last s columns is in S.
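When relations are modelled as Python sets of tuples, the four operations correspond to built-in set operators plus `itertools.product`. The two small relations below are hypothetical, chosen only to make the results easy to check by eye.

```python
from itertools import product

# Two hypothetical union-compatible relations of arity 2.
R = {(1, "a"), (2, "b")}
S = {(2, "b"), (3, "c")}

union = R | S          # tuples in either R or S
intersection = R & S   # tuples in both R and S
difference = R - S     # tuples in R that are not in S

# Cartesian product: concatenate every tuple of R with every tuple of S,
# giving tuples of arity 2 + 2 = 4.
cartesian = {r + s for r, s in product(R, S)}

print(sorted(union))         # [(1, 'a'), (2, 'b'), (3, 'c')]
print(sorted(intersection))  # [(2, 'b')]
print(sorted(difference))    # [(1, 'a')]
print(len(cartesian))        # 4
```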


Union in MySQL

Of the first three set operations, MySQL 5.0 only supports UNION as a direct command. This can be useful when two subsets of data are being pulled from different sets of tables.

The UNION can simply be placed between two quite independent SELECT statements as long as both return the same number of columns.

    SELECT birth FROM president
    UNION
    SELECT death FROM president;
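The query above can be tried out with Python's sqlite3 module. This is a sketch under stated assumptions: SQLite stands in for MySQL, and the two rows inserted into president are made-up placeholder values, not real data. It also shows that UNION, like the set operation ∪, removes duplicate rows, whereas UNION ALL keeps them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite instead of MySQL (an assumption)
conn.execute("CREATE TABLE president (name TEXT, birth TEXT, death TEXT)")
# Placeholder rows for illustration only; the dates are invented.
conn.executemany(
    "INSERT INTO president VALUES (?, ?, ?)",
    [
        ("President A", "1900-01-01", "1970-01-01"),
        ("President B", "1910-05-05", "1970-01-01"),
    ],
)

# UNION behaves like set union: the repeated death date appears once.
union_rows = conn.execute(
    "SELECT birth FROM president UNION SELECT death FROM president"
).fetchall()

# UNION ALL simply concatenates the two results, duplicates included.
union_all_rows = conn.execute(
    "SELECT birth FROM president UNION ALL SELECT death FROM president"
).fetchall()

print(len(union_rows), len(union_all_rows))  # 3 4
conn.close()
```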


Joins

The join of two relations R and S is one of the most important operations in real databases and is defined as follows.

Suppose that c is a boolean function that may involve the attributes of both R and S.

Then

R ⋈_c S

is defined to be

σ_c(R × S).

In other words a join is the result of selecting certain rows from the Cartesian product.
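The definition can be transcribed almost literally into Python: form the Cartesian product, then filter it. The helper name `theta_join` is invented for this sketch, and the rows are a small subset of the sailor and reserves tables used later in the lecture. Real database systems avoid materialising the full product; the code follows the definition, not an efficient algorithm.

```python
from itertools import product

# (sid, sname, age)
sailor = {(22, "Dustin", 45.0), (31, "Lubber", 55.5), (58, "Rusty", 35.0)}
# (sid, bid, date)
reserves = {
    (22, 101, "10-Aug-06"),
    (31, 102, "02-Aug-06"),
    (74, 103, "05-Aug-06"),
}


def theta_join(r, s, condition):
    """Join as defined above: a selection applied to the Cartesian product."""
    cartesian = {x + y for x, y in product(r, s)}
    return {row for row in cartesian if condition(row)}


# In a combined row, positions 0-2 come from sailor and 3-5 from reserves,
# so the condition sailor.sid = reserves.sid compares positions 0 and 3.
joined = theta_join(sailor, reserves, lambda row: row[0] == row[3])
for row in sorted(joined):
    print(row)
```

Only sailors 22 and 31 have a matching reservation in this subset, so two of the nine product rows survive, and each surviving row still contains sid twice.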


Equijoin

In almost all the examples we have seen so far, the join condition has actually consisted of equalities between attributes of R and S, often an equality between a key and a foreign key.

In such a join, the resulting relation will have a number of duplicated columns; in particular, any attributes used in the join condition will appear twice.

An equijoin of two tables is a join where the join condition is a conjunction of equalities of the form R.attribute1 = S.attribute2, with the columns of S that appear in the join condition projected out.


Natural Join

The natural join of two relations R and S is the equijoin whose join condition involves every column having the same name in R and S.

As the join condition can be determined by context, it can simply be omitted and

R ⋈ S

denotes the natural join of R and S.
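A natural join can be sketched in Python by modelling rows as dictionaries keyed by column name, so that the shared columns can be found automatically. The helper `natural_join` is a name invented here, and the rows are a subset of the reserves and boat tables from the sample data later in the lecture.

```python
def natural_join(r, s):
    """Natural join of two relations given as lists of dict rows."""
    if not r or not s:
        return []
    # The join condition involves every column name the relations share.
    common = set(r[0]) & set(s[0])
    joined = []
    for x in r:
        for y in s:
            if all(x[c] == y[c] for c in common):
                # Merging the dicts keeps each shared column only once.
                joined.append({**x, **y})
    return joined


reserves = [
    {"sid": 22, "bid": 101, "date": "10-Aug-06"},
    {"sid": 31, "bid": 103, "date": "03-Aug-06"},
]
boat = [
    {"bid": 101, "name": "Interlake", "colour": "blue"},
    {"bid": 103, "name": "Clipper", "colour": "green"},
    {"bid": 104, "name": "Marine", "colour": "red"},
]

result = natural_join(reserves, boat)
for row in result:
    print(row)
```

The only shared column is bid, so each reservation is paired with the boat it refers to, and bid appears once in every result row.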


Relational Queries

Examples from the text

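Consider the following conceptual schema taken almost directly from Ramakrishnan & Gehrke that is related to a boat-rental operation.

[Diagram: entity Sailor with attributes sid, name, age; entity Boat with attributes bid, bname, colour; relationship Reserves between Sailor and Boat with attribute date.]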


Sample entity sets

sailor

sid  sname    age
22   Dustin   45.0
29   Brutus   33.0
31   Lubber   55.5
32   Andy     25.5
58   Rusty    35.0
64   Horatio  35.0
71   Zorba    16.0
74   Horatio  35.0
85   Art      25.5
95   Bob      63.5

reserves

sid  bid  date
22   101  10-Aug-06
22   102  10-Aug-06
22   103  11-Aug-06
22   104  12-Aug-06
31   102  02-Aug-06
31   103  03-Aug-06
31   104  17-Aug-06
64   101  18-Aug-06
64   102  05-Aug-06
74   103  05-Aug-06

boat

bid  name       colour
101  Interlake  blue
102  Interlake  red
103  Clipper    green
104  Marine     red


Queries in relational algebra

Suppose we wish to answer the question

Which sailors have reserved boat 103?

One expression that answers this query is

π_sname(σ_{bid=103}(reserves) ⋈ sailor)

An equivalent expression that also answers this query is

π_sname(σ_{bid=103}(reserves ⋈ sailor))


Unpacking the expression

The selection σ_{bid=103}(reserves) extracts from reserves the entries relating to boat 103.

sid  bid  date
22   103  11-Aug-06
31   103  03-Aug-06
74   103  05-Aug-06

We then compute the natural join of this relation with sailor, which matches rows on the shared column sid and keeps that column only once, getting

sid  bid  date       sname    age
22   103  11-Aug-06  Dustin   45.0
31   103  03-Aug-06  Lubber   55.5
74   103  05-Aug-06  Horatio  35.0


Unpacking this expression cont.

The final stage is the projection onto the single field sname, resulting in the final relation

sname
Dustin
Lubber
Horatio

The query π_sname(σ_{bid=103}(reserves ⋈ sailor)) produces the same answer, but generates a larger intermediate relation, since it joins all ten rows of reserves with sailor before discarding the seven that do not involve boat 103.

One task of the query optimizer is to take an SQL query and to determine an evaluation strategy by converting it into an equivalent, but more efficient, relational expression.
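The two evaluation orders can be compared on the sample data with a short Python sketch. Relations are modelled as sets of tuples and the helper `natural_join_on_sid` is a name invented for this example; it hard-codes the shared column sid and makes no claim about how a real optimizer works.

```python
# (sid, sname, age)
sailor = {
    (22, "Dustin", 45.0), (29, "Brutus", 33.0), (31, "Lubber", 55.5),
    (32, "Andy", 25.5), (58, "Rusty", 35.0), (64, "Horatio", 35.0),
    (71, "Zorba", 16.0), (74, "Horatio", 35.0), (85, "Art", 25.5),
    (95, "Bob", 63.5),
}
# (sid, bid, date)
reserves = {
    (22, 101, "10-Aug-06"), (22, 102, "10-Aug-06"), (22, 103, "11-Aug-06"),
    (22, 104, "12-Aug-06"), (31, 102, "02-Aug-06"), (31, 103, "03-Aug-06"),
    (31, 104, "17-Aug-06"), (64, 101, "18-Aug-06"), (64, 102, "05-Aug-06"),
    (74, 103, "05-Aug-06"),
}


def natural_join_on_sid(res, sail):
    """Join on the shared column sid, giving (sid, bid, date, sname, age)."""
    return {r + s[1:] for r in res for s in sail if r[0] == s[0]}


# Plan 1: select first, then join the three surviving rows.
selected = {r for r in reserves if r[1] == 103}
joined_small = natural_join_on_sid(selected, sailor)
names_plan1 = {row[3] for row in joined_small}

# Plan 2: join everything first, then select.
joined_all = natural_join_on_sid(reserves, sailor)
names_plan2 = {row[3] for row in joined_all if row[1] == 103}

print(len(selected), len(joined_small))  # 3 3
print(len(joined_all))                   # 10
print(names_plan1 == names_plan2)        # True
```

Both plans return Dustin, Lubber and Horatio, but the second builds a ten-row join where the first never holds more than three rows.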
