Rasmus Ejlers Møgelberg
Relational algebra
Introduction to Database Design 2012, Lecture 5
Overview
• Use of logic in constructing queries
• Relational algebra
• Translating queries to relational algebra
• Equations expressed in relational algebra
Use of logic in constructing queries
• Consider the following problem
Find all students who have taken all courses offered by the biology department
• Expressed more formally
Find all students s such that for all courses c, if c is offered by ‘Biology’ then s has taken c
• Translate to SQL:
select * from student where [???]
Use of logic in constructing queries
• The problem cannot be translated directly to SQL, because SQL has no construct for ‘for all’ or ‘if ... then ...’
Find all students s such that for all courses c, if c is offered by ‘Biology’ then s has taken c
• So reformulate
Find all students s such that there is no course c such that c is offered by ‘Biology’ and s has not taken c
• (using classical logic)
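The reformulation can be checked step by step. In the sketch below, B(c) and T(s,c) are shorthand introduced here (not on the slides) for "c is offered by ‘Biology’" and "s has taken c".

```latex
\begin{align*}
\forall c\,\bigl(B(c) \rightarrow T(s,c)\bigr)
  &\equiv \forall c\,\bigl(\neg B(c) \lor T(s,c)\bigr)
     && \text{(definition of implication)} \\
  &\equiv \neg\exists c\,\neg\bigl(\neg B(c) \lor T(s,c)\bigr)
     && \text{(}\forall c\,\varphi \equiv \neg\exists c\,\neg\varphi\text{)} \\
  &\equiv \neg\exists c\,\bigl(B(c) \land \neg T(s,c)\bigr)
     && \text{(De Morgan)}
\end{align*}
```

The last line has the shape that SQL supports: "not exists" around a subquery, with "not in" for the inner negation.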
Use of logic in constructing queries
Find all students s such that there is no course c such that c is offered by ‘Biology’ and s has not taken c
• This can be formulated in SQL:
select * from student
where not exists (
    -- finds all courses offered by Biology not taken by the student
    select * from course
    where dept_name = 'Biology'
      and course_id not in (
          -- finds all courses taken by the student
          select course_id from takes
          where takes.id = student.id));
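To try the query, here is a minimal sketch using Python's built-in sqlite3 module. The three students and their courses are taken from the sample output later in these slides; the cut-down schema and the three-row course table are assumptions made for this example, so the result says nothing about the full university database.

```python
import sqlite3


def make_toy_db():
    """Build a tiny in-memory database (assumed schema, reduced to the columns used)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        create table student (ID text primary key, name text);
        create table course (course_id text primary key, dept_name text);
        create table takes (ID text, course_id text);
        insert into student values ('98988', 'Tanaka'), ('00128', 'Zhang'), ('70557', 'Snow');
        insert into course values ('BIO-101', 'Biology'), ('BIO-301', 'Biology'),
                                  ('CS-101', 'Comp. Sci.');
        insert into takes values ('98988', 'BIO-101'), ('98988', 'BIO-301'),
                                 ('00128', 'CS-101');
    """)
    return conn


def students_with_all_biology_courses(conn):
    """Names of students for whom no untaken Biology course exists."""
    query = """
        select name from student
        where not exists (
            select * from course
            where dept_name = 'Biology'
              and course_id not in (
                  select course_id from takes
                  where takes.ID = student.ID))
        order by name
    """
    return [row[0] for row in conn.execute(query)]


conn = make_toy_db()
result = students_with_all_biology_courses(conn)
```

In this toy data only Tanaka has taken both Biology courses. Snow has taken nothing, so every Biology course counts as untaken and Snow is excluded.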
More logic
• Similar analysis is needed for the challenging exercises this week
• You will see more logic in the course Foundations of Computing
Relational algebra
Relational algebra
• A language for expressing basic operations in the relational model
• Two purposes
- Express meaning of queries
- Express execution plans in DBMSs
• SQL is declarative (what)
• Relational algebra is procedural (how)
Relational algebra in DBMSs
Illustration from book
Projection
• In SQL
select name, salary from instructor;
• In relational algebra
Π_{name, salary}(instructor)
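A minimal sketch of projection, assuming a relation is represented as a Python list of dicts (a representation chosen for this example). It follows set semantics and drops duplicate tuples, which the SQL query above would keep unless "distinct" is added. The rows are a few instructors from the sample data in these slides.

```python
def project(relation, attributes):
    """Π_{attributes}(relation): keep only the named attributes, without duplicates."""
    seen = set()
    result = []
    for row in relation:
        projected = tuple(row[a] for a in attributes)
        if projected not in seen:  # set semantics: each tuple appears once
            seen.add(projected)
            result.append(dict(zip(attributes, projected)))
    return result


instructor = [
    {"ID": "10101", "name": "Srinivasan", "dept_name": "Comp. Sci.", "salary": 65000},
    {"ID": "12121", "name": "Wu", "dept_name": "Finance", "salary": 90000},
    {"ID": "83821", "name": "Brandt", "dept_name": "Comp. Sci.", "salary": 92000},
]

names_and_salaries = project(instructor, ["name", "salary"])
departments = project(instructor, ["dept_name"])  # the two Comp. Sci. rows collapse into one
```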
Selection
select * from instructor where salary > 90000;
σ_{salary > 90000}(instructor)
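Selection can be sketched the same way, again assuming relations are lists of dicts and passing the condition as a Python function. The rows come from the sample data in these slides.

```python
def select(relation, predicate):
    """σ_{predicate}(relation): keep the rows for which predicate(row) is true."""
    return [row for row in relation if predicate(row)]


instructor = [
    {"ID": "10101", "name": "Srinivasan", "dept_name": "Comp. Sci.", "salary": 65000},
    {"ID": "22222", "name": "Einstein", "dept_name": "Physics", "salary": 95000},
    {"ID": "83821", "name": "Brandt", "dept_name": "Comp. Sci.", "salary": 92000},
]

high_paid = select(instructor, lambda row: row["salary"] > 90000)
```

Selection never changes the attributes of a relation, only the number of rows.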
Combining selection and projection
select name, dept_name from instructor where salary > 90000;
Π_{name, dept_name}(σ_{salary > 90000}(instructor))
Translating SQL into relational algebra
• Expression
select name, dept_name from instructor where salary > 90000;
• Is translated to
Π_{name, dept_name}(σ_{salary > 90000}(instructor))
• Relational algebra expression says
- First do selection
- Then do projection
• Relational algebra is procedural
Syntax trees
• The syntax tree
      *
     / \
    +   5
   / \
  3   4
• represents the expression (3+4)*5
• Trees grow downwards in computer science!
• Evaluation from bottom up
• Useful graphical way of representing evaluation order (no need for parentheses)
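Bottom-up evaluation can be sketched with a short recursive function. The nested-tuple encoding of the tree is an assumption made for this example.

```python
import operator

OPERATORS = {"+": operator.add, "*": operator.mul}


def evaluate(node):
    """Evaluate a syntax tree bottom-up: first the children, then the operator."""
    if isinstance(node, (int, float)):
        return node  # a leaf is just a number
    symbol, left, right = node
    return OPERATORS[symbol](evaluate(left), evaluate(right))


# The tree for (3+4)*5: the root is *, and its left child is the subtree for 3+4.
tree = ("*", ("+", 3, 4), 5)
result = evaluate(tree)
```

The nesting of the tuples plays the role of the parentheses: the addition sits below the multiplication, so it is computed first.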
Syntax trees for relational algebra
• The syntax tree
Π_{name, dept_name}
        |
σ_{salary > 90000}
        |
   instructor
• represents
Π_{name, dept_name}(σ_{salary > 90000}(instructor))
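The same bottom-up idea works for relational algebra trees. The sketch below assumes a tuple encoding with the hypothetical node kinds "relation", "select" and "project", and relations stored as lists of dicts. The rows are taken from the sample data in these slides.

```python
def evaluate(node, database):
    """Evaluate a relational algebra syntax tree bottom-up."""
    kind = node[0]
    if kind == "relation":  # leaf: ("relation", name)
        return database[node[1]]
    if kind == "select":  # ("select", predicate, child)
        _, predicate, child = node
        return [row for row in evaluate(child, database) if predicate(row)]
    if kind == "project":  # ("project", attributes, child)
        _, attributes, child = node
        result = []
        for row in evaluate(child, database):
            projected = {a: row[a] for a in attributes}
            if projected not in result:  # set semantics: no duplicates
                result.append(projected)
        return result
    raise ValueError(f"unknown node kind: {kind}")


database = {
    "instructor": [
        {"ID": "10101", "name": "Srinivasan", "dept_name": "Comp. Sci.", "salary": 65000},
        {"ID": "12121", "name": "Wu", "dept_name": "Finance", "salary": 90000},
        {"ID": "22222", "name": "Einstein", "dept_name": "Physics", "salary": 95000},
        {"ID": "83821", "name": "Brandt", "dept_name": "Comp. Sci.", "salary": 92000},
    ]
}

# Π_{name, dept_name}(σ_{salary > 90000}(instructor))
tree = ("project", ["name", "dept_name"],
        ("select", lambda row: row["salary"] > 90000,
         ("relation", "instructor")))
result = evaluate(tree, database)
```

The selection node is evaluated before the projection node because it lies deeper in the tree.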
Cartesian products
mysql> select * from instructor, department;
+-------+------------+------------+----------+------------+----------+-----------+
| ID    | name       | dept_name  | salary   | dept_name  | building | budget    |
+-------+------------+------------+----------+------------+----------+-----------+
| 10101 | Srinivasan | Comp. Sci. | 65000.00 | Biology    | Watson   |  90000.00 |
| 10101 | Srinivasan | Comp. Sci. | 65000.00 | Comp. Sci. | Taylor   | 100000.00 |
| 10101 | Srinivasan | Comp. Sci. | 65000.00 | Elec. Eng. | Taylor   |  85000.00 |
| 10101 | Srinivasan | Comp. Sci. | 65000.00 | Finance    | Painter  | 120000.00 |
| 10101 | Srinivasan | Comp. Sci. | 65000.00 | History    | Painter  |  50000.00 |
| 10101 | Srinivasan | Comp. Sci. | 65000.00 | Music      | Packard  |  80000.00 |
| 10101 | Srinivasan | Comp. Sci. | 65000.00 | Physics    | Watson   |  70000.00 |
| 12121 | Wu         | Finance    | 90000.00 | Biology    | Watson   |  90000.00 |
| 12121 | Wu         | Finance    | 90000.00 | Comp. Sci. | Taylor   | 100000.00 |
| 12121 | Wu         | Finance    | 90000.00 | Elec. Eng. | Taylor   |  85000.00 |
| 12121 | Wu         | Finance    | 90000.00 | Finance    | Painter  | 120000.00 |
| 12121 | Wu         | Finance    | 90000.00 | History    | Painter  |  50000.00 |
| 12121 | Wu         | Finance    | 90000.00 | Music      | Packard  |  80000.00 |
| 12121 | Wu         | Finance    | 90000.00 | Physics    | Watson   |  70000.00 |
| 15151 | Mozart     | Music      | 40000.00 | Biology    | Watson   |  90000.00 |
| 15151 | Mozart     | Music      | 40000.00 | Comp. Sci. | Taylor   | 100000.00 |
| 15151 | Mozart     | Music      | 40000.00 | Elec. Eng. | Taylor   |  85000.00 |
| 15151 | Mozart     | Music      | 40000.00 | Finance    | Painter  | 120000.00 |
| 15151 | Mozart     | Music      | 40000.00 | History    | Painter  |  50000.00 |
...
+-------+------------+------------+----------+------------+----------+-----------+
84 rows in set (0.01 sec)
Products
select * from instructor, department;
• In relational algebra
instructor × department
• Syntax tree
         ×
        / \
instructor department
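A minimal sketch of the Cartesian product, assuming relations are lists of dicts. Both relations have a dept_name attribute, so the sketch prefixes every attribute with its relation name; that naming scheme is a choice made for this example. The rows are from the sample output above.

```python
def product(left, right, left_name, right_name):
    """left × right: pair every row of left with every row of right."""
    return [
        {**{f"{left_name}.{k}": v for k, v in l.items()},
         **{f"{right_name}.{k}": v for k, v in r.items()}}
        for l in left
        for r in right
    ]


instructor = [
    {"ID": "10101", "name": "Srinivasan", "dept_name": "Comp. Sci.", "salary": 65000},
    {"ID": "12121", "name": "Wu", "dept_name": "Finance", "salary": 90000},
]
department = [
    {"dept_name": "Biology", "building": "Watson", "budget": 90000},
    {"dept_name": "Comp. Sci.", "building": "Taylor", "budget": 100000},
    {"dept_name": "Finance", "building": "Painter", "budget": 120000},
]

pairs = product(instructor, department, "instructor", "department")
```

The size of the result is the product of the sizes of the inputs: 2 × 3 = 6 rows here, and 12 × 7 = 84 rows in the MySQL output.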
Relational model: natural join
• First cartesian product, then select, then project
mysql> select * from instructor natural join department;
+------------+-------+------------+----------+----------+-----------+
| dept_name  | ID    | name       | salary   | building | budget    |
+------------+-------+------------+----------+----------+-----------+
| Comp. Sci. | 10101 | Srinivasan | 65000.00 | Taylor   | 100000.00 |
| Finance    | 12121 | Wu         | 90000.00 | Painter  | 120000.00 |
| Music      | 15151 | Mozart     | 40000.00 | Packard  |  80000.00 |
| Physics    | 22222 | Einstein   | 95000.00 | Watson   |  70000.00 |
| History    | 32343 | El Said    | 60000.00 | Painter  |  50000.00 |
| Physics    | 33456 | Gold       | 87000.00 | Watson   |  70000.00 |
| Comp. Sci. | 45565 | Katz       | 75000.00 | Taylor   | 100000.00 |
| History    | 58583 | Califieri  | 62000.00 | Painter  |  50000.00 |
| Finance    | 76543 | Singh      | 80000.00 | Painter  | 120000.00 |
| Biology    | 76766 | Crick      | 72000.00 | Watson   |  90000.00 |
| Comp. Sci. | 83821 | Brandt     | 92000.00 | Taylor   | 100000.00 |
| Elec. Eng. | 98345 | Kim        | 80000.00 | Taylor   |  85000.00 |
+------------+-------+------------+----------+----------+-----------+
12 rows in set (0.01 sec)
Join in relational algebra
• Join can be defined using other constructors
Π_{dept_name, ID, ..., budget}
              |
σ_{instructor.dept_name = department.dept_name}
              |
              ×
             / \
    instructor department
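The three steps of this definition can be written out directly. This is a sketch under the assumption that relations are non-empty lists of dicts; the qualified attribute names such as "instructor.dept_name" are a choice made for this example.

```python
def natural_join(left, right, left_name, right_name):
    """Natural join computed literally as product, then selection, then projection."""
    if not left or not right:
        return []
    common = [a for a in left[0] if a in right[0]]

    # Step 1: Cartesian product, with attributes qualified by relation name.
    pairs = [
        {**{f"{left_name}.{k}": v for k, v in l.items()},
         **{f"{right_name}.{k}": v for k, v in r.items()}}
        for l in left for r in right
    ]

    # Step 2: selection on equality of the common attributes.
    matching = [
        p for p in pairs
        if all(p[f"{left_name}.{a}"] == p[f"{right_name}.{a}"] for a in common)
    ]

    # Step 3: projection, keeping each common attribute only once.
    result = []
    for p in matching:
        row = {a: p[f"{left_name}.{a}"] for a in common}
        row.update({k: p[f"{left_name}.{k}"] for k in left[0] if k not in common})
        row.update({k: p[f"{right_name}.{k}"] for k in right[0] if k not in common})
        result.append(row)
    return result


instructor = [
    {"ID": "10101", "name": "Srinivasan", "dept_name": "Comp. Sci.", "salary": 65000},
    {"ID": "12121", "name": "Wu", "dept_name": "Finance", "salary": 90000},
]
department = [
    {"dept_name": "Biology", "building": "Watson", "budget": 90000},
    {"dept_name": "Comp. Sci.", "building": "Taylor", "budget": 100000},
    {"dept_name": "Finance", "building": "Painter", "budget": 120000},
]

joined = natural_join(instructor, department, "instructor", "department")
```

The attribute order of the result (common attribute first) matches the column order in the MySQL output for the natural join.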
Computation of joins
• In practice joins are not always computed this way
• Consider e.g.
• Can often find relevant entry on right hand side fast without having to construct cartesian product
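One way to find the matching right-hand rows quickly is a hash join. The technique is not named on the slides; it is used here as one possible illustration, and a real DBMS may instead use an index or a sort-based method. The sketch assumes relations are lists of dicts that share exactly one attribute.

```python
from collections import defaultdict


def hash_join(left, right, attribute):
    """Join on one shared attribute without building the Cartesian product."""
    # Build phase: index the right-hand relation by the join attribute.
    index = defaultdict(list)
    for r in right:
        index[r[attribute]].append(r)

    # Probe phase: each left row looks up its partners directly.
    result = []
    for l in left:
        for r in index.get(l[attribute], []):
            result.append({**l, **r})
    return result


instructor = [
    {"ID": "10101", "name": "Srinivasan", "dept_name": "Comp. Sci.", "salary": 65000},
    {"ID": "12121", "name": "Wu", "dept_name": "Finance", "salary": 90000},
]
department = [
    {"dept_name": "Biology", "building": "Watson", "budget": 90000},
    {"dept_name": "Comp. Sci.", "building": "Taylor", "budget": 100000},
    {"dept_name": "Finance", "building": "Painter", "budget": 120000},
]

joined = hash_join(instructor, department, "dept_name")
```

Each relation is scanned once, instead of comparing every instructor row with every department row.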
Expressing execution plans
• DBMSs use a variant of relational algebra for this
• Still, basic relational algebra is a good way of understanding the meaning of queries
General joins
• Define
R ⋈_{Θ} S = σ_{Θ}(R × S)
• For example
select * from student join advisor on s_ID = ID
• Is translated to relational algebra as
student ⋈_{ID = s_ID} advisor
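The definition translates almost literally into code. The sketch assumes relations are lists of dicts with disjoint attribute names (true for student and advisor here). The student rows come from the sample data in these slides; the advisor rows are made up for illustration.

```python
def theta_join(left, right, theta):
    """R ⋈_Θ S = σ_Θ(R × S), assuming R and S have no attribute names in common."""
    product = [{**l, **r} for l in left for r in right]  # R × S
    return [row for row in product if theta(row)]        # σ_Θ


student = [
    {"ID": "00128", "name": "Zhang"},
    {"ID": "12345", "name": "Shankar"},
    {"ID": "70557", "name": "Snow"},
]
# Hypothetical advisor rows: (student ID, instructor ID).
advisor = [
    {"s_ID": "00128", "i_ID": "45565"},
    {"s_ID": "12345", "i_ID": "10101"},
]

# student ⋈_{ID = s_ID} advisor
advised = theta_join(student, advisor, lambda row: row["ID"] == row["s_ID"])
```

Unlike the natural join, the theta join keeps both ID and s_ID in the result, because no projection is applied.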
Set operations
• Usual set operations in relational algebra
R ∪ S
R ∩ S
R \ S
• These are only allowed between relations with the same set of attributes!
• Warning:
- The book treats relational algebra with set semantics (no duplicate tuples)
- Might have been better to use multiset relational algebra, since SQL tables can contain duplicates
Using left outer join
mysql> select * from student natural left outer join takes;
+-------+----------+------------+----------+-----------+--------+----------+------+-------+
| ID    | name     | dept_name  | tot_cred | course_id | sec_id | semester | year | grade |
+-------+----------+------------+----------+-----------+--------+----------+------+-------+
| 00128 | Zhang    | Comp. Sci. |      102 | CS-101    | 1      | Fall     | 2009 | A     |
| 00128 | Zhang    | Comp. Sci. |      102 | CS-347    | 1      | Fall     | 2009 | A-    |
| 12345 | Shankar  | Comp. Sci. |       32 | CS-101    | 1      | Fall     | 2009 | C     |
| 12345 | Shankar  | Comp. Sci. |       32 | CS-190    | 2      | Spring   | 2009 | A     |
| 12345 | Shankar  | Comp. Sci. |       32 | CS-315    | 1      | Spring   | 2010 | A     |
| 12345 | Shankar  | Comp. Sci. |       32 | CS-347    | 1      | Fall     | 2009 | A     |
| 19991 | Brandt   | History    |       80 | HIS-351   | 1      | Spring   | 2010 | B     |
| 23121 | Chavez   | Finance    |      110 | FIN-201   | 1      | Spring   | 2010 | C+    |
| 44553 | Peltier  | Physics    |       56 | PHY-101   | 1      | Fall     | 2009 | B-    |
| 45678 | Levy     | Physics    |       46 | CS-101    | 1      | Fall     | 2009 | F     |
| 45678 | Levy     | Physics    |       46 | CS-101    | 1      | Spring   | 2010 | B+    |
| 45678 | Levy     | Physics    |       46 | CS-319    | 1      | Spring   | 2010 | B     |
| 54321 | Williams | Comp. Sci. |       54 | CS-101    | 1      | Fall     | 2009 | A-    |
| 54321 | Williams | Comp. Sci. |       54 | CS-190    | 2      | Spring   | 2009 | B+    |
| 55739 | Sanchez  | Music      |       38 | MU-199    | 1      | Spring   | 2010 | A-    |
| 70557 | Snow     | Physics    |        0 | NULL      | NULL   | NULL     | NULL | NULL  |
| 76543 | Brown    | Comp. Sci. |       58 | CS-101    | 1      | Fall     | 2009 | A     |
| 76543 | Brown    | Comp. Sci. |       58 | CS-319    | 2      | Spring   | 2010 | A     |
| 76653 | Aoi      | Elec. Eng. |       60 | EE-181    | 1      | Spring   | 2009 | C     |
| 98765 | Bourikas | Elec. Eng. |       98 | CS-101    | 1      | Fall     | 2009 | C-    |
| 98765 | Bourikas | Elec. Eng. |       98 | CS-315    | 1      | Spring   | 2010 | B     |
| 98988 | Tanaka   | Biology    |      120 | BIO-101   | 1      | Summer   | 2009 | A     |
| 98988 | Tanaka   | Biology    |      120 | BIO-301   | 1      | Summer   | 2010 | NULL  |
+-------+----------+------------+----------+-----------+--------+----------+------+-------+
23 rows in set (0.00 sec)
Outer join
• Left outer join defined as
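                    ∪
            /               \
  student ⋈ takes            ×
                          /     \
                         −       {(null, ..., null)}
                       /   \
                 student    Π_{student}
                                |
                          student ⋈ takes
- Left branch of ∪: students who have taken a course
- Right branch of ∪: students who have not taken a course
- Π_{student} projects onto the attributes of student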
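The definition can be followed node by node. The sketch assumes non-empty relations stored as lists of dicts and uses Python's None for null. The rows are a reduced version of the sample data in these slides.

```python
def natural_join(left, right):
    """Natural join on all attributes the two relations share."""
    common = [a for a in left[0] if a in right[0]]
    return [{**l, **r} for l in left for r in right
            if all(l[a] == r[a] for a in common)]


def left_outer_join(left, right):
    """(left ⋈ right) ∪ ((left − Π_left(left ⋈ right)) × {(null, ..., null)})."""
    left_attrs = list(left[0])
    extra_attrs = [a for a in right[0] if a not in left_attrs]

    joined = natural_join(left, right)                              # left ⋈ right
    matched = [{a: row[a] for a in left_attrs} for row in joined]   # Π_left(...)
    unmatched = [row for row in left if row not in matched]         # left − ...
    null_row = {a: None for a in extra_attrs}                       # (null, ..., null)
    padded = [{**row, **null_row} for row in unmatched]             # ... × {null row}
    return joined + padded                                          # ∪


student = [
    {"ID": "00128", "name": "Zhang"},
    {"ID": "70557", "name": "Snow"},
]
takes = [
    {"ID": "00128", "course_id": "CS-101"},
    {"ID": "00128", "course_id": "CS-347"},
]

result = left_outer_join(student, takes)
```

Snow has no row in takes, so Snow appears once with null in the attributes that come from takes, as in the MySQL output.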
Generalised projections
• Projections can be combined with basic operations on numbers, dates, booleans or strings, e.g.
Π_{flight_num, capacity − reservations}(...)
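A sketch of a generalised projection, assuming relations are lists of dicts. The slide leaves the input relation unspecified, so the flights relation, its rows and the output name free_seats are all hypothetical. Duplicates are kept for brevity.

```python
def generalised_project(relation, columns):
    """Compute each output attribute from a row with the given function."""
    return [
        {name: compute(row) for name, compute in columns.items()}
        for row in relation
    ]


# Hypothetical input relation, invented for this example.
flights = [
    {"flight_num": "XY100", "capacity": 180, "reservations": 150},
    {"flight_num": "XY200", "capacity": 120, "reservations": 120},
]

free = generalised_project(flights, {
    "flight_num": lambda row: row["flight_num"],
    "free_seats": lambda row: row["capacity"] - row["reservations"],
})
```

The computed column has no natural name, which is one reason renaming is needed.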
Renaming
• It is often necessary to rename a relation
• The expression
ρ_{R(a, ..., b)}(S)
• renames relation S to R and the attributes of S to a, ..., b
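Renaming of attributes can be sketched as follows, assuming relations are lists of dicts whose keys are in attribute order. In this representation the relation name R is simply the Python variable the result is assigned to.

```python
def rename(relation, new_attributes):
    """ρ_{R(a, ..., b)}(S): give the attributes of S the new names, by position."""
    renamed = []
    for row in relation:
        if len(row) != len(new_attributes):
            raise ValueError("need exactly one new name per attribute")
        renamed.append(dict(zip(new_attributes, row.values())))
    return renamed


S = [
    {"ID": "10101", "name": "Srinivasan"},
    {"ID": "12121", "name": "Wu"},
]

# ρ_{R(instructor_id, instructor_name)}(S); the new names are chosen for illustration.
R = rename(S, ["instructor_id", "instructor_name"])
```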
Aggregation
• Special symbol for aggregation
• SQL
mysql> select avg(salary), dept_name from instructor
    -> group by dept_name;
• Relational algebra
_{dept_name}G_{average(salary)}(instructor)
Aggregation with having
• First group, then select groups
mysql> select avg(salary), dept_name from instructor
    -> group by dept_name
    -> having count(ID)>1;
Having
• Recall that having is just selection on the group level
• Translate
mysql> select avg(salary), dept_name from instructor
    -> group by dept_name
    -> having avg(salary) > 80000;
• as
σ_{avg_salary > 80000}
          |
ρ_{R(avg_salary, dept_name)}
          |
_{dept_name}G_{average(salary)}
          |
     instructor
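The tree can be followed bottom-up in a short sketch: group and average first, then select on the result. It assumes relations are lists of dicts; for brevity the helper group_average (a name chosen here) names the computed column directly, which merges the aggregation and renaming nodes into one step. The rows come from the sample data in these slides.

```python
from collections import defaultdict


def group_average(relation, group_attribute, value_attribute, result_name):
    """Group rows by one attribute and average another within each group."""
    groups = defaultdict(list)
    for row in relation:
        groups[row[group_attribute]].append(row[value_attribute])
    return [
        {group_attribute: key, result_name: sum(values) / len(values)}
        for key, values in groups.items()
    ]


instructor = [
    {"name": "Srinivasan", "dept_name": "Comp. Sci.", "salary": 65000},
    {"name": "Katz", "dept_name": "Comp. Sci.", "salary": 75000},
    {"name": "Brandt", "dept_name": "Comp. Sci.", "salary": 92000},
    {"name": "Einstein", "dept_name": "Physics", "salary": 95000},
    {"name": "Gold", "dept_name": "Physics", "salary": 87000},
    {"name": "Mozart", "dept_name": "Music", "salary": 40000},
]

# Aggregation (with the renaming folded in), one row per department.
averages = group_average(instructor, "dept_name", "salary", "avg_salary")

# The having clause is an ordinary selection on the grouped relation.
well_paid = [row for row in averages if row["avg_salary"] > 80000]
```

With these rows only Physics passes the selection, with an average of 91000.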
Subqueries
• Example
mysql> select name from instructor,
    -> (select max(salary) as max_salary from instructor) as S
    -> where instructor.salary = S.max_salary;
• Insert tree from subquery in tree from outer query
• (details on blackboard)
• Nested queries in where clause are more involved
mysql> select name from instructor
    -> where salary >= all (select salary from instructor);
Equations
Equations
• Many different relational algebra expressions compute the same thing
• When evaluating queries, DBMS will
- generate many different relational algebra expressions computing the query
- choose the one it thinks is most efficient
• Here we see some basic equalities of expressions
Relational algebra in DBMSs
Illustration from book
Equalities (examples)
• Selection is commutative
σ_{Θ1}(σ_{Θ2}(R)) = σ_{Θ2}(σ_{Θ1}(R)) = σ_{Θ1 and Θ2}(R)
• Join is commutative
R1 ⋈ R2 = R2 ⋈ R1
• (only difference is order of attributes)
• Join is associative
R1 ⋈ (R2 ⋈ R3) = (R1 ⋈ R2) ⋈ R3
More equalities
• Suppose Θ1 only talks about attributes of R1 and similarly Θ2 only talks about attributes of R2
• Then
σ_{Θ1 and Θ2}           ⋈
      |               /   \
      ⋈       =   σ_{Θ1}  σ_{Θ2}
     / \            |       |
   R1   R2          R1      R2
• Right hand side is often much less expensive to compute (DBMS makes such optimizations automatically)
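The equality can be checked on a small example. The sketch assumes relations are lists of dicts; the two conditions are chosen for illustration, and the rows come from the sample data in these slides.

```python
def select(relation, predicate):
    """σ_{predicate}(relation)."""
    return [row for row in relation if predicate(row)]


def natural_join(left, right):
    """Natural join on all shared attributes (empty inputs give an empty result)."""
    if not left or not right:
        return []
    common = [a for a in left[0] if a in right[0]]
    return [{**l, **r} for l in left for r in right
            if all(l[a] == r[a] for a in common)]


instructor = [
    {"name": "Srinivasan", "dept_name": "Comp. Sci.", "salary": 65000},
    {"name": "Wu", "dept_name": "Finance", "salary": 90000},
    {"name": "Brandt", "dept_name": "Comp. Sci.", "salary": 92000},
    {"name": "Kim", "dept_name": "Elec. Eng.", "salary": 80000},
]
department = [
    {"dept_name": "Comp. Sci.", "building": "Taylor"},
    {"dept_name": "Finance", "building": "Painter"},
    {"dept_name": "Elec. Eng.", "building": "Taylor"},
]


def theta1(row):
    return row["salary"] > 70000        # only talks about instructor


def theta2(row):
    return row["building"] == "Taylor"  # only talks about department


# Left hand side: join everything, then select.
late = select(natural_join(instructor, department),
              lambda row: theta1(row) and theta2(row))

# Right hand side: select first, then join the smaller relations.
early = natural_join(select(instructor, theta1), select(department, theta2))
```

Both sides give the same rows, but the right hand side joins 3 instructors with 2 departments instead of 4 with 3.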
Summary
• After this lecture you should be able to
- Translate simple queries to relational algebra
- Draw the syntax tree of relational algebra expressions
• Future goal:
- Judge which relational algebra expression represents the most efficient evaluation plan for a query