INFO 340
Lecture 4
Relational Algebra and Calculus
SQL Syntax
Relational Algebra and Calculus
• Relational Algebra is procedural, whereas Relational Calculus is declarative.
• SQL is based on Relational Calculus. You tell the server WHAT you want, not HOW to get it.
Tuple Relational Calculus
• { T | P(T) }
– T is a tuple variable
– P(T) is a formula defining T
– Result is the set of all tuples T for which P(T) is true
• Give example..
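One way to read { T | P(T) } is as a set comprehension. The following is a minimal Python sketch, not part of the original slides: it assumes the Employee rows used in this lecture's join examples, and the predicate `p` ("works in department 1") is a hypothetical choice made only for illustration.

```python
from collections import namedtuple

# Each tuple T has the attributes of the lecture's Employee table.
Employee = namedtuple("Employee", ["last_name", "dept_id"])

employees = {
    Employee("Smith", 1),
    Employee("Johnson", 1),
    Employee("Miller", 2),
    Employee("Lee", 3),
}

def p(t):
    """Hypothetical predicate P(T): the employee works in department 1."""
    return t.dept_id == 1

# { T | P(T) }: the set of all tuples T for which P(T) is true.
result = {t for t in employees if p(t)}

print(sorted(t.last_name for t in result))  # ['Johnson', 'Smith']
```

The comprehension states which tuples are wanted and leaves the iteration strategy to the runtime, which is the declarative flavor the calculus is meant to capture.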
Domain Relational Calculus
• First Order Predicate Logic
– Looks at predicates on one side and individuals on the other.
Oh CRUD
• Create - Using INSERT
• Retrieve - Using SELECT
• Update - Using UPDATE
• Delete - Using DELETE
• How we manipulate the data. Called the Data Manipulation Language (DML).
Remember SELECT from Lab?
select whatever-attributes-you-want
from the-table-that-you-want
where x-attribute = what-you-want;

select last_name
from myTable
where first_name = 'Suzie';
SQL and Relational Calculus
• SQL is grounded in Relational Calculus. You tell the DBMS WHAT you want and it figures out how best to retrieve it.
• Well, that’s not entirely true. SELECT does break Relational Theory, but that’s for another day…
Two tables for today’s examples
DEPARTMENT TABLE
ID  Name
1   HR
2   Sales
3   Engineering
4   Marketing

EMPLOYEE TABLE
LastName  DeptID
Smith     1
Johnson   1
Miller    2
Lee       3
Cross Join
Cross joins are the Cartesian product of two tables. There are two ways to express cross joins.
SELECT * FROM Employee E, Department D
SELECT * FROM Employee E CROSS JOIN Department D
E.LastName  E.DeptID  D.ID  D.Name
Smith       1         1     HR
Smith       1         2     Sales
Smith       1         3     Engineering
Smith       1         4     Marketing
Johnson     1         1     HR
Johnson     1         2     Sales
Johnson     1         3     Engineering
Johnson     1         4     Marketing
Miller      2         1     HR
Miller      2         2     Sales
Miller      2         3     Engineering
Miller      2         4     Marketing
Lee         3         1     HR
Lee         3         2     Sales
Lee         3         3     Engineering
Lee         3         4     Marketing
Inner Join
Inner joins are the most common type of join performed in SQL. There are actually two ways to express an inner join. Conceptually, an inner join takes the Cartesian product of the two tables, then returns only the rows that satisfy the join condition.
SELECT * FROM Employee E, Department D
WHERE D.ID=E.DeptID
SELECT * FROM Employee E
JOIN Department D ON D.ID=E.DeptID
E.LastName  E.DeptID  D.ID  D.Name
Smith       1         1     HR
Johnson     1         1     HR
Miller      2         2     Sales
Lee         3         3     Engineering
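The claim that an inner join is a filtered Cartesian product can be checked directly. The sketch below is an illustration only: it assumes Python's built-in sqlite3 module with an in-memory database (the course itself uses MySQL) and loads the lecture's two example tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Department (ID INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE Employee (LastName TEXT, DeptID INTEGER);
    INSERT INTO Department VALUES
        (1, 'HR'), (2, 'Sales'), (3, 'Engineering'), (4, 'Marketing');
    INSERT INTO Employee VALUES
        ('Smith', 1), ('Johnson', 1), ('Miller', 2), ('Lee', 3);
""")

# Cartesian product: every employee paired with every department (4 x 4 rows).
cross = conn.execute(
    "SELECT * FROM Employee E CROSS JOIN Department D").fetchall()

# The two inner-join spellings from the slide.
implicit = conn.execute(
    "SELECT * FROM Employee E, Department D WHERE D.ID = E.DeptID").fetchall()
explicit = conn.execute(
    "SELECT * FROM Employee E JOIN Department D ON D.ID = E.DeptID").fetchall()

# Filtering the product by hand; columns are (LastName, DeptID, ID, Name).
filtered = [row for row in cross if row[1] == row[2]]

print(len(cross))  # 16
print(sorted(implicit) == sorted(explicit) == sorted(filtered))  # True
```

Row order is not guaranteed without ORDER BY, so the comparison sorts each result first.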
Outer Join
Outer joins are used to return all rows from one table regardless of a match in the other table. If no match is found in the other table, NULLs are returned for its columns. Three types: LEFT, RIGHT, FULL.
Show all the departments and the employees in them, if any:
SELECT * FROM Department D
LEFT JOIN Employee E ON D.ID=E.DeptID
D.ID D.Name E.LastName E.DeptID
1 HR Smith 1
1 HR Johnson 1
2 Sales Miller 2
3 Engineering Lee 3
4 Marketing NULL NULL
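A runnable version of the LEFT JOIN result, sketched with Python's sqlite3 module as an assumed stand-in for the course's MySQL server. SQL NULL comes back as Python `None`. RIGHT and FULL joins are left out because not every SQLite build supports them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Department (ID INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE Employee (LastName TEXT, DeptID INTEGER);
    INSERT INTO Department VALUES
        (1, 'HR'), (2, 'Sales'), (3, 'Engineering'), (4, 'Marketing');
    INSERT INTO Employee VALUES
        ('Smith', 1), ('Johnson', 1), ('Miller', 2), ('Lee', 3);
""")

# Every department is kept; a department with no employees gets NULLs.
rows = conn.execute("""
    SELECT D.ID, D.Name, E.LastName, E.DeptID
    FROM Department D
    LEFT JOIN Employee E ON D.ID = E.DeptID
    ORDER BY D.ID, E.LastName
""").fetchall()

for row in rows:
    print(row)
# The last row printed is (4, 'Marketing', None, None).
```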
Self Join
A self-join is a query in which a table is joined (compared) to itself. Self-joins are used to compare values in a column with other values in the same column of the same table. One practical use for self-joins: obtaining running counts and running totals in an SQL query.
To write the query, select from the same table listed twice with different aliases, set up the comparison, and eliminate cases where a particular value would be equal to itself.
Example: Which customers are located in the same state (column name is Region)?
SELECT DISTINCT c1.ContactName, c1.Address, c1.City, c1.Region
FROM Customers AS c1, Customers AS c2
WHERE c1.Region = c2.Region
AND c1.ContactName <> c2.ContactName
ORDER BY c1.Region, c1.ContactName;
Exercise: Which customers are located in the same city? (32 rows)
http://www.udel.edu/evelyn/SQL-Class3/SQL3_self.html
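To see the self-join query run, here is a small sketch using Python's sqlite3 module. The three customer rows are made up for illustration (the real Customers table from the linked exercise is not reproduced here), so the output will not match the exercise's row counts.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customers (ContactName TEXT, Address TEXT, City TEXT, Region TEXT);
    -- Hypothetical sample rows, not the data set from the exercise.
    INSERT INTO Customers VALUES
        ('Ann', '1 Main St', 'Seattle',  'WA'),
        ('Bob', '2 Oak Ave', 'Spokane',  'WA'),
        ('Cy',  '3 Elm Rd',  'Portland', 'OR');
""")

# c1 and c2 are two aliases for the same table. The <> test removes the
# pairs where a customer would otherwise be matched with itself.
rows = conn.execute("""
    SELECT DISTINCT c1.ContactName, c1.Address, c1.City, c1.Region
    FROM Customers AS c1, Customers AS c2
    WHERE c1.Region = c2.Region
      AND c1.ContactName <> c2.ContactName
    ORDER BY c1.Region, c1.ContactName
""").fetchall()

names = [row[0] for row in rows]
print(names)  # ['Ann', 'Bob']
```

Cy is the only customer in OR, so no pair satisfies both conditions and that row drops out.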
Aggregate Functions
• While returning rows is nice, you often want to return a value computed from a set of rows.
– Count
– Sum
– Min
– Max
– Avg
An example of Aggregates
Name Grade
Steve 2.5
John 3.5
Wendy 3.8
Niki 4.0
Kevin 1.4
SELECT count(*), max(grade), min(grade), avg(grade), sum(grade) FROM student_grades
Count(*) Max(grade) Min(grade) Avg(grade) Sum(grade)
5 4 1.4 3.04 15.2
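The aggregate row above can be reproduced with a short sketch. It assumes Python's sqlite3 module and a `student_grades` table whose column names (`name`, `grade`) are inferred from the slide. Floating-point results are rounded before printing because the sum and average of decimal grades may not be exact in binary.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE student_grades (name TEXT, grade REAL);
    INSERT INTO student_grades VALUES
        ('Steve', 2.5), ('John', 3.5), ('Wendy', 3.8), ('Niki', 4.0), ('Kevin', 1.4);
""")

# All five aggregates are computed over the whole table and yield one row.
count, highest, lowest, average, total = conn.execute("""
    SELECT count(*), max(grade), min(grade), avg(grade), sum(grade)
    FROM student_grades
""").fetchone()

print(count, highest, lowest, round(average, 2), round(total, 1))
```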
SELECT Statement - Grouping
• Now that you have aggregate functions, they become useful in grouping results.
• Back to the example Join tables, maybe you want a count of the number of employees in each department.
• The GROUP BY clause is added to the SELECT statement after the FROM and WHERE clauses (and before any HAVING or ORDER BY).
SELECT Statement - Grouping
• All column names in SELECT list must appear in GROUP BY clause unless name is used only in an aggregate function.
• If WHERE is used with GROUP BY, WHERE is applied first, then groups are formed from remaining rows satisfying predicate.
• ISO considers two nulls to be equal for purposes of GROUP BY.
Group By Example
How many employees are in each department?
SELECT D.Name, COUNT(E.DeptID) FROM Department D
LEFT JOIN Employee E ON D.ID=E.DeptID
GROUP BY D.Name
D.Name COUNT(E.DeptID)
HR 2
Sales 1
Engineering 1
Marketing 0
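The query counts E.DeptID rather than using COUNT(*), and that choice is what makes Marketing show 0. The sketch below, assuming Python's sqlite3 module and the lecture's example tables, puts the two counts side by side: COUNT(column) skips NULLs, while COUNT(*) counts the NULL-padded row the LEFT JOIN produces.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Department (ID INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE Employee (LastName TEXT, DeptID INTEGER);
    INSERT INTO Department VALUES
        (1, 'HR'), (2, 'Sales'), (3, 'Engineering'), (4, 'Marketing');
    INSERT INTO Employee VALUES
        ('Smith', 1), ('Johnson', 1), ('Miller', 2), ('Lee', 3);
""")

rows = conn.execute("""
    SELECT D.Name, COUNT(E.DeptID), COUNT(*)
    FROM Department D
    LEFT JOIN Employee E ON D.ID = E.DeptID
    GROUP BY D.Name
""").fetchall()

# Map each department name to (COUNT(E.DeptID), COUNT(*)).
counts = {name: (by_column, by_star) for name, by_column, by_star in rows}

print(counts["HR"])         # (2, 2)
print(counts["Marketing"])  # (0, 1)
```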
HAVING clause
• But what if we want to filter results based on the groups formed by GROUP BY? Enter the HAVING clause.
• Let’s only see the departments with people in them:
SELECT D.Name, COUNT(E.DeptID)
FROM Department D
LEFT JOIN Employee E ON D.ID=E.DeptID
GROUP BY D.Name
HAVING COUNT(E.DeptID) > 0
D.Name COUNT(E.DeptID)
HR 2
Sales 1
Engineering 1
ORDER BY clause
• Finally, what if we want some order imposed on our results?
• ORDER BY can contain any field or value specified in the selection criteria.
SELECT D.Name, COUNT(E.DeptID)
FROM Department D
LEFT JOIN Employee E ON D.ID=E.DeptID
GROUP BY D.Name
HAVING COUNT(E.DeptID) > 0
ORDER BY D.Name
D.Name COUNT(E.DeptID)
Engineering 1
HR 2
Sales 1
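Putting GROUP BY, HAVING, and ORDER BY together, the following sketch runs the full query against the lecture's example tables. It assumes Python's sqlite3 module rather than the course's MySQL server; the SQL itself is taken from the slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Department (ID INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE Employee (LastName TEXT, DeptID INTEGER);
    INSERT INTO Department VALUES
        (1, 'HR'), (2, 'Sales'), (3, 'Engineering'), (4, 'Marketing');
    INSERT INTO Employee VALUES
        ('Smith', 1), ('Johnson', 1), ('Miller', 2), ('Lee', 3);
""")

# Groups are formed first, HAVING then drops the empty group (Marketing),
# and ORDER BY sorts whatever groups remain.
rows = conn.execute("""
    SELECT D.Name, COUNT(E.DeptID)
    FROM Department D
    LEFT JOIN Employee E ON D.ID = E.DeptID
    GROUP BY D.Name
    HAVING COUNT(E.DeptID) > 0
    ORDER BY D.Name
""").fetchall()

print(rows)  # [('Engineering', 1), ('HR', 2), ('Sales', 1)]
```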
Set Theory Review
• Intersection of 2 Sets
R = {1,2,3,4} S = {4,5,6,7} R ∩ S = { 4 }
R = { Joe, Suzie } S = { Jane, Bob, Sam } R ∩ S = Ø
R = { ‘big stereo’, ‘blue’, ‘safe’ } S = { ‘blue’, ‘fire trap’, ‘AM radio’ }
R ∩ S = { ‘blue’ }
The Intersection
Set Theory Review
• Union of 2 Sets
R = {1,2,3,4} S = {4,5,6,7} R ∪ S = {1,2,3,4,5,6,7}
R = { Joe, Suzie } S = { Jane, Bob, Sam }
R ∪ S = { Joe, Suzie, Jane, Bob, Sam }
R = { ‘big stereo’, ‘blue’, ‘safe’ } S = { ‘AM radio’, ‘blue’, ‘fire trap’ }
R ∪ S = { ‘big stereo’, ‘blue’, ‘safe’, ‘AM radio’, ‘fire trap’ }
Set Theory Review
• Difference of 2 Sets
R = {1,2,3,4} S = {4,5,6,7} R \ S = {1,2,3}
R = { Joe, Suzie } S = { Jane, Bob, Sam } R \ S = { Joe, Suzie }
R = { ‘big stereo’, ‘blue’, ‘safe’ } S = { ‘AM radio’, ‘blue’, ‘fire trap’ }
R \ S = { ‘big stereo’, ‘safe’ }
The Difference
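The three operations from the set theory review map directly onto Python's built-in set type, which makes the slide examples easy to check. This is only an illustration of the math, not SQL.

```python
# The sets from the review slides.
R = {1, 2, 3, 4}
S = {4, 5, 6, 7}

intersection = R & S   # elements in both R and S
union = R | S          # elements in R, in S, or in both
difference = R - S     # elements in R that are not in S

print(intersection)        # {4}
print(sorted(union))       # [1, 2, 3, 4, 5, 6, 7]
print(sorted(difference))  # [1, 2, 3]

# Disjoint sets intersect in the empty set.
names_r = {"Joe", "Suzie"}
names_s = {"Jane", "Bob", "Sam"}
print(names_r & names_s == set())  # True
```

Difference is not symmetric: S - R would give {5, 6, 7} instead.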
Union, Intersect, and Difference (Except)
• Use between select clauses
– Keyword for union is union
– Keyword for intersection is intersect
– Keyword for difference is except
• The queries must return the same number of columns, with compatible data types in corresponding positions.
• Example:
(select Name from Staff) union (select Name from Faculty)
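A sketch of the three set keywords in action, assuming Python's sqlite3 module. The Staff and Faculty rows are invented for illustration. The parentheses around each select are dropped here because SQLite, unlike some other servers, does not accept them in a compound query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Staff (Name TEXT);
    CREATE TABLE Faculty (Name TEXT);
    -- Hypothetical rows; 'Sam' appears in both tables.
    INSERT INTO Staff VALUES ('Joe'), ('Suzie'), ('Sam');
    INSERT INTO Faculty VALUES ('Jane'), ('Bob'), ('Sam');
""")

def names(keyword):
    """Combine the two selects with the given set keyword, sorted by name."""
    sql = f"SELECT Name FROM Staff {keyword} SELECT Name FROM Faculty ORDER BY Name"
    return [row[0] for row in conn.execute(sql)]

union_names = names("UNION")          # duplicates removed, so Sam appears once
intersect_names = names("INTERSECT")  # only names present in both tables
except_names = names("EXCEPT")        # Staff names that are not Faculty names

print(union_names)      # ['Bob', 'Jane', 'Joe', 'Sam', 'Suzie']
print(intersect_names)  # ['Sam']
print(except_names)     # ['Joe', 'Suzie']
```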
INSERT
INSERT INTO TableName [ (columnList) ]
VALUES (dataValueList)
• columnList is optional; if omitted, SQL assumes a list of all columns in their original CREATE TABLE order.
• Any columns omitted must have been declared as NULL when table was created, unless DEFAULT was specified when creating column.
© Pearson Education Limited 1995, 2005
INSERT
• dataValueList must match columnList as follows:
– number of items in each list must be same;
– must be direct correspondence in position of items in two lists;
– data type of each item in dataValueList must be compatible with data type of corresponding column.
INSERT … VALUES
• Insert a new row into Employee table supplying data for all columns.
– Let’s finally put someone in the marketing department!
• Full table, so can omit the column names:
INSERT INTO Employee VALUES ('Brown', 4);
• Or we can explicitly list the column names:
INSERT INTO Employee (LastName, DeptID) VALUES ('Brown', 4);
• Perhaps the DeptID field allows NULLs or has a default:
INSERT INTO Employee (LastName) VALUES ('Brown');
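The three INSERT forms can be tried side by side. This sketch assumes Python's sqlite3 module and an Employee table whose DeptID column allows NULLs with no default. The names 'Green' and 'White' are hypothetical, used only so the three inserted rows can be told apart.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumption: DeptID is nullable and has no DEFAULT.
conn.execute("CREATE TABLE Employee (LastName TEXT NOT NULL, DeptID INTEGER)")

# 1. No column list: values are matched to columns in CREATE TABLE order.
conn.execute("INSERT INTO Employee VALUES ('Brown', 4)")

# 2. Explicit column list: values are matched to the listed columns by position.
conn.execute("INSERT INTO Employee (LastName, DeptID) VALUES ('Green', 4)")

# 3. Omitted column: DeptID falls back to NULL because no default was declared.
conn.execute("INSERT INTO Employee (LastName) VALUES ('White')")

rows = conn.execute(
    "SELECT LastName, DeptID FROM Employee ORDER BY LastName").fetchall()
print(rows)  # [('Brown', 4), ('Green', 4), ('White', None)]
```

Listing the columns explicitly is the more defensive habit, since the statement keeps working if columns are later added to the table.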
UPDATE
UPDATE TableName
SET columnName1 = dataValue1
[, columnName2 = dataValue2...]
[WHERE searchCondition]
• TableName can be name of a base table or an updatable view.
• SET clause specifies names of one or more columns that are to be updated.
• WHERE clause is optional; if omitted, all rows are updated.
UPDATE example
• Ms. Johnson gets married and wants to change her name to Anderson.
UPDATE EMPLOYEE SET LastName='Anderson' WHERE LastName='Johnson'
• Better way to find Ms. Johnson:
UPDATE EMPLOYEE SET LastName='Anderson' WHERE LastName='Johnson' AND DeptID=1
• The Marketing department is being merged with Sales, so all the employees in that department are being moved into Sales.
UPDATE EMPLOYEE SET DeptID=2 WHERE DeptID=4
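Because an UPDATE touches every row that matches its WHERE clause, it helps to check how many rows were affected. The sketch below assumes Python's sqlite3 module, whose cursors report this as `rowcount`. The extra 'Brown' row in Marketing is added here (as in the INSERT slide) so that the merge has something to move.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Employee (LastName TEXT, DeptID INTEGER);
    INSERT INTO Employee VALUES
        ('Smith', 1), ('Johnson', 1), ('Miller', 2), ('Lee', 3), ('Brown', 4);
""")

# Rename Ms. Johnson; the extra DeptID test narrows the match.
renamed = conn.execute(
    "UPDATE Employee SET LastName = 'Anderson' "
    "WHERE LastName = 'Johnson' AND DeptID = 1").rowcount

# Merge Marketing (4) into Sales (2).
moved = conn.execute(
    "UPDATE Employee SET DeptID = 2 WHERE DeptID = 4").rowcount

sales = [row[0] for row in conn.execute(
    "SELECT LastName FROM Employee WHERE DeptID = 2 ORDER BY LastName")]

print(renamed, moved)  # 1 1
print(sales)           # ['Brown', 'Miller']
```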
DELETE
DELETE FROM TableName
[WHERE searchCondition]
• TableName can be name of a base table or an updatable view.
• searchCondition is optional; if omitted, all rows are deleted from the table. This does not delete the table itself. If searchCondition is specified, only those rows that satisfy the condition are deleted.
DELETE example
• Mr. Smith decides to take another job and quits:
DELETE FROM EMPLOYEE WHERE LastName='Smith' AND DeptID=1
• Remember the Marketing department? Well, rather than merge with Sales we are going to eliminate it and all the employees in that department.
DELETE FROM EMPLOYEE WHERE DeptID=4
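The difference between deleting rows and deleting a table can be demonstrated in a few lines. This sketch assumes Python's sqlite3 module; the `sqlite_master` catalog lookup is SQLite-specific and stands in for whatever catalog query the course's MySQL server would use.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Employee (LastName TEXT, DeptID INTEGER);
    INSERT INTO Employee VALUES
        ('Smith', 1), ('Johnson', 1), ('Miller', 2), ('Lee', 3);
""")

# With a WHERE clause, only the matching rows are removed.
deleted_smith = conn.execute(
    "DELETE FROM Employee WHERE LastName = 'Smith' AND DeptID = 1").rowcount
remaining = conn.execute("SELECT COUNT(*) FROM Employee").fetchone()[0]

# Without a WHERE clause, every row goes, but the table itself survives.
conn.execute("DELETE FROM Employee")
rows_left = conn.execute("SELECT COUNT(*) FROM Employee").fetchone()[0]
table_exists = conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table' AND name = 'Employee'"
).fetchone() is not None

print(deleted_smith, remaining)  # 1 3
print(rows_left, table_exists)   # 0 True
```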
Variants & Like
• There is a rich set of functions that can be used in SQL. Of course, most of them depend heavily on the SQL dialect.
• LIKE. Allows searching a text field for a pattern.
SELECT * FROM students WHERE name LIKE 'R%'
– % is a wildcard matching any number of characters (including none), whereas _ matches exactly one character
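The two LIKE wildcards behave differently, which a small experiment makes concrete. The sketch assumes Python's sqlite3 module and a made-up `students` table. Case sensitivity of LIKE varies between database systems, so the sample names all start with an uppercase R to sidestep it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE students (name TEXT);
    -- Hypothetical sample names.
    INSERT INTO students VALUES ('Rob'), ('Rachel'), ('Ron'), ('Steve');
""")

def matching(pattern):
    """Return the names that match a LIKE pattern, sorted alphabetically."""
    cursor = conn.execute(
        "SELECT name FROM students WHERE name LIKE ? ORDER BY name", (pattern,))
    return [row[0] for row in cursor]

starts_with_r = matching("R%")   # R followed by anything
three_letters = matching("R__")  # R followed by exactly two characters

print(starts_with_r)  # ['Rachel', 'Rob', 'Ron']
print(three_letters)  # ['Rob', 'Ron']
```

The pattern is passed as a bound parameter (the `?` placeholder) rather than pasted into the SQL string, which avoids quoting problems.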
CASE statements
• SELECT CASE Sex
WHEN 'M' THEN 'Male'
WHEN 'F' THEN 'Female'
END
FROM Students
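A runnable sketch of the CASE expression, assuming Python's sqlite3 module and a hypothetical Students table. Inside a SELECT the expression is closed with END. When no WHEN branch matches and there is no ELSE, the result is NULL, which the third row demonstrates.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Students (Name TEXT, Sex TEXT);
    -- Hypothetical rows; the last one has no recorded value.
    INSERT INTO Students VALUES ('Ann', 'F'), ('Ben', 'M'), ('Cam', NULL);
""")

rows = conn.execute("""
    SELECT Name,
           CASE Sex
               WHEN 'M' THEN 'Male'
               WHEN 'F' THEN 'Female'
           END AS SexLabel
    FROM Students
    ORDER BY Name
""").fetchall()

print(rows)  # [('Ann', 'Female'), ('Ben', 'Male'), ('Cam', None)]
```

Adding an ELSE branch (for example ELSE 'Unknown') would replace the NULL with a readable label.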
Mini-Project
• Due Feb 4, 2009
– Build on your iSchool MySQL account
• Choose between the following:
– UW OnTech Archive
– UW Privacy Policy Set
Mini-Project
UW OnTech Archive -- http://www.washington.edu/computing/ontech/archive.php
UW Privacy Policy – http://depts.washington.edu/comply/privacy.shtml
http://security.uwmedicine.org/policies/sec_policies.asp
UW OnTech
[Figures: archive contents; contents by issue]
UW OnTech
[Figures: issue contents; article example]
UW OnTech
• Sample questions:
– How many contents by issue pages list topics that are not the same as the topics on the corresponding issue contents pages? Or are missing entirely?
– How many pages list an ‘exposed e-mail address’ on their readable page?
– How many pages have an e-mail address that is visible in the page source?
UW OnTech
• More sample questions:
– What is the average number of clickable links per article in the archive?
– What is the min & max number of clickable links in the archive?
• Which articles are they?
– More to come
UW Privacy & Security Policies
• Sample questions:
– Which policies have the greatest distance between Effective Date & Review Date?
• Which ones are they?
– How many policies have the same Effective Date & Review Date?
• Which ones are they?
– Which policies have more than 5 attachments?
UW Privacy & Security Policies
• More sample questions:
– Which policies have more than 5 references?
• Which policies are they?
– What is the most often cited reference in the reference section?
– Do any of the policies in these two sets reference each other?