Upload
others
View
6
Download
0
Embed Size (px)
Citation preview
Relational Algebra
Relational Query Languages
} Theoretical QLs give semantics to Practical QLs
CSCI1270, Lecture 2
Recall: Query = “Retrieval Program”
Theoretical:1. Relational Algebra2. Relational Calculus
a. Tuple Relational Calculus (TRC)b. Domain Relational Calculus (DRC)
Practical:1. SQL (originally: SEQUEL from System R)2. Quel (used in Ingres)3. Datalog (Prolog-like – used in research lab systems)
Language Examples:
Relational Algebra• Basic Operators
1. select ( σ )2. project ( p )3. union ( È )4. set difference ( – )5. cartesian product ( ´ )6. rename ( ρ )
• Closure Property
CSCI1270, Lecture 2
Relational Operator
Relation
Relation
Relation
Relational Operator
Select ( σ )
CSCI1270, Lecture 2
Notation: σpredicate (Relation)Relation: Can be name of table or result of another queryPredicate:
Select rows from a relation based on a predicate
2. Complex • predicate AND predicate• predicate OR predicate• NOT predicate
Idea:
1. Simple• attribute1 = attribute2• attribute = constant value (also: ≠, <, >, ≤, ≥)
Bank Database
CSCI1270, Lecture 2
Account
bname acct_no balanceDowntown
MianusPerryR.H.
BrightonRedwoodBrighton
A-101A-215A-102A-305A-201A-222A-217
500700400350900700750
Depositor
cname acct_noJohnsonSmithHayesTurner
JohnsonJones
Lindsay
A-101A-215A-102A-305A-201A-217A-222
Customer
cname cstreet ccityJonesSmithHayesCurry
LindsayTurner
WilliamsAdamsJohnsonGlennBrooksGreen
MainNorthMainNorthPark
PutnamNassauSpringAlma
Sand HillSenatorWalnut
HarrisonRye
HarrisonRye
PittsfieldStanfordPrincetonPittsfieldPalo AltoWoodsideBrooklynStanford
Branch
bname bcity assetsDowntownRedwood
PerryMianus
R.H.Pownel
N. TownBrighton
BrooklynPalo AltoHorseneckHorseneckHorseneckBennington
RyeBrooklyn
9M2.1M1.7M0.4M8M
0.3M3.7M7.1M
Borrower
cname lnoJonesSmithHayes
JacksonCurrySmith
WilliamsAdams
L-17L-23L-15L-14L-93L-11L-17L-16
Loan
bname lno amtDowntownRedwood
PerryDowntown
MianusR.H.Perry
L-17L-23L-15L-14L-93L-11L-16
1000200015001500500900
1300
Select ( σ )
CSCI1270, Lecture 2
bname bcity assetsDowntownBrighton
BrooklynBrooklyn
9M7.1M
bname bcity assetsDowntown Brooklyn 9M
σ bcity = “Brooklyn” (branch) =
σ assets > $8M (σ bcity = “Brooklyn” (branch)) =
Notation: σpredicate (Relation)
Project ( p )
cstreet ccityMainNorthPark
PutnamNassauSpringAlma
Sand HillSenatorWalnut
HarrisonRye
PittsfieldStanfordPrincetonPittsfieldPalo AltoWoodsideBrooklynStanford
CSCI1270, Lecture 2
Notation: pA1, …, An (Relation)• Relation: name of a table or result of another query• Each Ai is an attribute • Idea: p selects columns (vs. σ which selects rows)
p cstreet, ccity (customer) =
Project ( p )
bcityBrooklyn
Horseneck
CSCI1270, Lecture 2
p bcity (σassets > 5M (branch)) =
Question: Does the result of Project always have the same numberof tuples as its input?
Union ( È )
cnameJohnsonSmithHayesTurnerJones
LindsayJacksonCurry
WilliamsAdams
CSCI1270, Lecture 2
Notation: Relation1 È Relation2
Example:
(p cname (depositor)) È (p cname (borrower)) =
R È S valid only if:
1. R, S have same number of columns (arity)2. R, S corresponding columns have same name and domain
(compatibility)
Depositor
cname acct_noBorrower
cname lno
Schema:
Set Difference ( – )
bname lno amountDowntownRedwood
PerryDowntown
Perry
L-17L-23L-15L-14L-16
100020001500500300
Notation: Relation1 - Relation2
R - S valid only if:
1. R, S have same number of columns (arity)2. R, S corresponding columns have same domain (compatibility)
Example:(p bname (σamount ≥ 1000 (loan))) – (p bname (σ balance < 800 (account))) =
bname acct_no balanceMianus
BrightonRedwoodBrighton
A-215A-201A-222A-217
700900700850
CSCI1270, Lecture 2
bnameDowntown
Perry
What About Intersection?
Remember:R ⋂ S = R – (R – S)
SR
R – S
Cartesian Product ( ´ )
depositor.cname
acct_no borrower.cname
lno
JohnsonJohnsonJohnsonJohnsonJohnsonJohnsonJohnsonJohnsonSmith
…
A-101A-101A-101A-101A-101A-101A-101A-101A-215
…
JonesSmithHayes
JacksonCurrySmith
WilliamsAdamsJones
…
L-17L-23L-15L-14L-93L-11L-17L-16L-17…
CSCI1270, Lecture 2
Notation: Relation1 ´ Relation2
Example:
R ´ S like cross product for mathematical relations:• every tuple of R appended to every tuple of S• flattened!!!
depositor ´ borrower =
How many tuples in the result?
A: 56
Rename ( ρ )
CSCI1270, Lecture 2
Notation: r identifier (Relation)renames a relation, or
Notation: r identifier0 (identifier1, …, identifiern) (Relation)renames relation and columns of n-column relation
Use:massage relations to make È, – valid, or ´ more readable
Rename ( ρ )
CSCI1270, Lecture 2
Notation: r identifier0 (identifier1, …, identifiern) (Relation)
Example:
r result (dcname, acctno, bcname, lno) (depositor ´ borrower) =
dccname acctno bcname lno
JohnsonJohnsonJohnsonJohnsonJohnsonJohnsonJohnsonJohnsonSmith
…
A-101A-101A-101A-101A-101A-101A-101A-101A-215
…
JonesSmithHayes
JacksonCurrySmith
WilliamsAdamsJones
…
L-17L-23L-15L-14L-93L-11L-17L-16L-17…
result
Example Query in RA•Determine lno for loans that are for an amount that is larger than the amount of some other loan. (i.e. lno for all non-minimal loans)
CSCI1270, Fall 2008, Lecture 2
Temp1 ß …Temp2 ß … Temp1 ……
Can do in steps:
Example Query in RA
CSCI1270, Lecture 2
lno amtL-17L-23L-15L-14L-93L-11L-16
10002000150015005009001300
1. Find the base data we need
Temp1 ß p lno,amt (loan)
2. Make a copy of (1)
Temp2 ß ρ Temp2 (lno2,amt2) (Temp1) lno2 amt2L-17L-23L-15L-14L-93L-11L-16
10002000150015005009001300
Example Query in RA
CSCI1270, Lecture 2
lno amt lno2 amt2L-17L-17…
L-17L-23L-23…
L-23…
10001000…
100020002000…
2000…
L-17L-23…
L-16L-17L-23…
L-16…
10002000…
130010002000…
1300…
3. Take the cartesian product of 1 and 2
Temp3 ß Temp1 ´ Temp2
Example Query in RA
• p lno (σamt > amt2 (p lno,amt (loan) ´ (ρTemp2 (lno2,amt2) (p lno,amt (loan)))))
CSCI1270, Fall 2008, Lecture 2
4. Select non-minimal loans
Temp4 ß σamt > amt2 (Temp3)
5. Project on lno
Result ß p lno (Temp4)
… or, if you prefer…
ReviewTheoretical Query Languages
CSCI1270, Fall 2008, Lecture 2
1. SELECT ( σ )2. PROJECT ( π )3. UNION ( È )4. SET DIFFERENCE ( – )5. CARTESIAN PRODUCT ( ´ )6. RENAME ( ρ )
Relational Algebra
• Relational algebra gives semantics to practical query languages• Above set: minimal relational algebra
à will now look at some redundant (but useful!) operators
Review
CSCI1270, Lecture 2
Find the names of customers who have both accounts and loans
T1 ß ρT1 (cname2, lno) (borrower)
T2 ß depositor ´ T1
T3 ß σcname = cname2 (T2)Result ß π cname (T3)
Above sequence of operators (ρ, ´, σ) very common.
Express the following query in the RA:
Motivates additional (redundant) RA operators.
Relational AlgebraAdditional Operators
6. Update ( ß ) (we’ve already been using this)
CSCI1270, Lecture 2
÷2. Division ( )
3. Generalized Projection (π)4. Aggregation
!"1. Natural Join ( )
!" !" !"5. Outer Joins ( )
• 1&2 Redundant: Can be expressed in terms of minimal RAe.g. depositor borrower =
π …(σ…(depositor ´ ρ…(borrower)))
• 3 – 6 Added for extra power
!"
Natural Join
CSCI1270, Lecture 2
Idea: combines ρ, ´, σ
A B C D1223
αααβ
+--+
10102010
E B D‘a’‘a’‘b’‘c’
ααββ
10201010
r s
A B C D E12233
αααΒβ
+--++
1010201010
‘a’‘a’‘a’‘b’‘c’
=
Relation1 Relation2Notation:
!"
πcname,acct_no,lno (σcname=cname2 (depositor ´ ρt(cname2,lno) (borrower)))≡
!"depositor borrower
!"
Division
CSCI1270, Lecture 2
A Bαααβγγγγδδ
1231134612
B12
r
s Aαδ
=÷
Query: Find values for A in r which have corresponding Bvalues for all B values in s
Relation1 Relation2Notation:
Idea: expresses “for all” queries
÷
Division
CSCI1270, Lecture 2
A Bαααβγγγγδδ
1231134612
B12
r
s Aαδ
=
17 ÷ 3 = 5
÷
The largest value of i such that: i ´ 3 ≤ 17
t
Relational Division
The largest value of t such that:( t ´ s Í r )
Another way to look at it: and ÷ ´
Division
CSCI1270, Lecture 2
A B C D Eαααββγγγ
aaaaaaaa
αγγγγγγβ
aabababb
11113111
D E
ab
11
r
s A B Cαγ
aa
γγ
=÷ t
r ÷ s = t such that s X t Í r
A More Complex Example(a,1) and (b,1), i.e., relation Sare paired with same prefix
which thus goes in the answer.
Division Adds No Power(power = set of functions that can be expressed)
Definition in terms of the basic algebra operationLet r(R) and s(S) be relations, and let S Í R
r ÷ s = ÕR-S (r) – ÕR-S ( (ÕR-S (r) x s) – ÕR-S,S(r))
To see why– ÕR-S,S(r) simply reorders attributes of r
– ÕR-S(ÕR-S (r) x s) – ÕR-S,S(r)) gives those tuples t in
ÕR-S (r) such that for some tuple u Î s, tu Ï r.
Generalized Projection
CSCI1270, Lecture 2
p e1,…,en(Relation)
e1,…,en can include arithmetic expressions – not just attributes
cname limit balanceJonesTurner
50003000
20002500
credit =
π cname, limit - balance (credit) = cname limit-balanceJonesTurner
3000500
Notation:
Example
Then…
Aggregate Functions and Operations} An aggregate function takes a collection of values and
returns a single value as a result.avg: average valuemin: minimum valuemax: maximum valuesum: sum of valuescount: number of values
} Aggregate operation in relational algebra G1, G2, …, Gn g F1( A1), F2( A2),…, Fn( An) (E )
◦ E is any relational-algebra expression◦ G1, G2 …, Gn is a list of attributes on which to group
(can be empty)◦ Each Fi is an aggregate function◦ Each Ai is an attribute name
CSCI1270, Lecture 2
Aggregate Operation – Example} Relation r:
CSCI1270, Lecture 2
A B
aabb
abbb
C
77310
g sum(c) (r)sum-C
27No grouping
Aggregate Operation – Example
} Relation account grouped by branch-name:
CSCI1270, Lecture 2
branch-name g sum(balance) (account)
branch-name account-number balancePerryridgePerryridgeBrightonBrightonRedwood
A-102A-201A-217A-215A-222
400900750750700
branch-name sum(balance)PerryridgeBrightonRedwood
13001500700
Aggregate Functions (Cont.)
Result of aggregation does not have a name– Can use rename operation to give it a name– For convenience, we permit renaming as part of
aggregate operation
CSCI1270, Lecture 2
branch-name g sum(balance) as sum-balance (account)
Outer Joins
CSCI1270, Lecture 2
bname lno amtDowntownRedwood
Perry
L-170L-230L-260
300040001700
loan =cname lnoJonesSmithHayes
L-170L-230L-155
borrower =
=bname lno amt cname
DowntownRedwood
L-170L-230
30004000
JonesSmith
Join result loses…à any record of Perryà any record of Hayes
Motivation:
!"loan borrower =
Outer Joins
CSCI1270, Lecture 2
bname lno amtDowntownRedwood
Perry
L-170L-230L-260
300040001700
loan =cname lnoJonesSmithHayes
L-170L-230L-155
borrower =
bname lno amt cnameDowntownRedwood
Perry
L-170L-230L-260
300040001700
JonesSmith
┴
• preserves all tuples in left relation1. Left Outer Join ( )!"
┴ = NULL
loan borrower =!"
Outer Joins
CSCI1270, Fall 2008, Lecture 2
bname lno amt cnameDowntownRedwood
┴
L-170L-230L-155
30004000
┴
JonesSmithHayes
!"
bname lno amtDowntownRedwood
Perry
L-170L-230L-260
300040001700
loan =cname lnoJonesSmithHayes
L-170L-230L-155
borrower =
• preserves all tuples in right relation2. Right Outer Join ( )
┴ = NULL
loan borrower =!"
Outer Joins
CSCI1270, Fall 2008, Lecture 2
bname lno amtDowntownRedwood
Perry
L-170L-230L-260
300040001700
loan =cname lnoJonesSmithHayes
L-170L-230L-155
borrower =
• preserves all tuples in both relations3. Full Outer Join ( )
┴ = NULL
!"
bname lno amt cnameDowntownRedwood
Perry┴
L-170L-230L-260L-155
300040001700
┴
JonesSmith
┴Hayes
loan borrower =!"
Update
CSCI1270, Lecture 2
1. Deletion: r ß r – s e.g., account ß account – σbname=Perry (account)(deletes all Perry accounts)
2. Insertion: r ß r È se.g., branch ß branch È {(Waltham, Boston, 7M)}(inserts new branch with bname = Waltham, bcity = Boston, assets = 7M)
3. Update: r ß πe1,…,en (r)
e.g., depositor ß depositor È (ρtemp (cname,acct_no) (borrower))(adds all borrowers to depositors, treating lno’s as acct_no’s)
e.g., account ß πbname,acct_no,bal*1.05 (account)(adds 5% interest to account balances)
Identifier ß QueryNotation: Common Uses:
Views• Limited access to DB.• Tailored schema
• Consider a person who needs to know a customer’s loan number but has no need to see the loan amount. This person should see a relation describedas:
Õcustomer-name, loan-number (borrower loan)
• A relation that is made visible to a user as a “virtual relation” is called a view.
View Definition• A view is defined using the create view statement which has the
formcreate view v as <query expression>
where <query expression> is any legal relational algebra query expression. The view name given as v.
• Once a view is defined, the view name can be used to refer to the virtual relation that the view generates.
• View definition is not the same as creating a new relation by evaluating the query expression Rather, a view definition causes the saving of an expression to be substituted into queries using the view.
View Examples
• Consider the view (named all-customer) consisting of branches and their customers.
create view all-customer asÕbranch-name, customer-name (depositor account)ÈÕbranch-name, customer-name (borrower loan)
• We can find all customers of the Perryridgebranch by writing:Õcustomer-name
(sbranch-name = “Perryridge” (all-customer))
Updates Through View• Database modifications expressed as views must be translated to
modifications of the actual relations in the database.• Consider the person who needs to see all loan data in the loan
relation except amount. The view given to the person, branch-loan, is defined as:
create view branch-loan asÕbranch-name, loan-number (loan)
• Since we allow a view name to appear wherever a relation name is allowed, the person may write:
branch-loan ¬ branch-loan È {(“Perryridge”, L-37)}
Updates Through Views (Cont.)• The previous insertion must be represented
by an insertion into the actual relation loanfrom which the view branch-loan is constructed.
• An insertion into loan requires a value for amount. The insertion can be dealt with by either.– rejecting the insertion and returning an error
message to the user.– inserting a tuple (“L-37”, “Perryridge”, null) into
the loan relation
Updates Through Views (Cont.)
• Some updates through views are impossible to translate.create view v as sbranch-name = “Perryridge” (account))
v ¬ v È (L-99, Downtown, 23)• Others cannot be translated uniquely
all-customer ¬ all-customer È (Perryridge, John)• Have to choose loan or account, and
create a new loan/account number!
Views Defined Using Other Views
• One view may be used in the expression defining another view
• A view relation v1 is said to depend directly on a view relation v2 if v2 is used in the expression defining v1
• A view relation v1 is said to depend on view relation v2 if either v1 depends directly on v2 orthere is a path of dependencies from v1 to v2
View Expansion
• Let view v1 be defined by an expression e1 that may itself contain uses of view relations.
• View expansion of an expression repeats the following replacement step:repeat
Find any view relation vi in e1
Replace the view relation vi by the expression defining vi
until no more view relations are present in e1
• As long as the view definitions are not recursive, this loop will terminate