Introduction to Structured Query Language (SQL)

SQL is the most popular database language, and in fact, the standard language for all relational DBMS’s. It covers both data definition (creating and changing tables) and data manipulation (DML).

The main operations:

Definition and modification of tables, views: CREATE, DROP, ALTER

Inserting data, modification of data: INSERT, DELETE, UPDATE

Searching for some data: SELECT

CREATE TABLE command is used to specify:

(1) The table name

(2) Description of the attributes, including:

(a) Attribute name

(b) The data type of the attribute

(c) Constraints on the attribute values, including:

(i) Whether some tuple may have NULL value for this attribute;

(ii) The range of values allowed for this attribute

(iii) Referential constraints on the attribute

(3) Primary key attributes

(4) Foreign key attributes

Example 1:

Here is a simplified DB table for records of books in a library:

CREATE TABLE Book (

isbn VARCHAR(15) NOT NULL,

title VARCHAR(200) NOT NULL,

catalog_no VARCHAR(15) NOT NULL,

copy_no INT NOT NULL DEFAULT 1,

keywords CHAR(100) NULL,

purchase_date DATE NULL,

PRIMARY KEY CLUSTERED(catalog_no, copy_no))
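The table of Example 1 can be tried out without a database server. The following is a minimal sketch, assuming Python's built-in sqlite3 module and an in-memory SQLite database (neither is used in these notes). The CLUSTERED keyword and the explicit NULL markers are left out because SQLite does not need them. The sketch shows the DEFAULT value being filled in and a NOT NULL violation being refused.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway database (assumption: SQLite, not MySQL)
conn.execute("""
    CREATE TABLE Book (
        isbn          VARCHAR(15)  NOT NULL,
        title         VARCHAR(200) NOT NULL,
        catalog_no    VARCHAR(15)  NOT NULL,
        copy_no       INT          NOT NULL DEFAULT 1,
        keywords      CHAR(100),
        purchase_date DATE,
        PRIMARY KEY (catalog_no, copy_no)
    )
""")

# copy_no, keywords and purchase_date are omitted on purpose
conn.execute(
    "INSERT INTO Book (isbn, title, catalog_no) VALUES (?, ?, ?)",
    ("0321122267", "Fundamentals of Database Systems", "QA76.9.D3"),
)
row = conn.execute("SELECT copy_no, keywords, purchase_date FROM Book").fetchone()
print(row)  # copy_no takes its DEFAULT; the other two columns are NULL (None in Python)

# A NULL title violates the NOT NULL constraint, so the insert is refused
try:
    conn.execute(
        "INSERT INTO Book (isbn, title, catalog_no) VALUES (?, ?, ?)",
        ("0000000000", None, "QA00.0"),  # hypothetical values
    )
    null_title_rejected = False
except sqlite3.IntegrityError:
    null_title_rejected = True
print(null_title_rejected)
```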


The following table lists some of the commonly used data types. For a complete list, you can check any

online SQL documentation, e.g. at http://dev.mysql.com/doc/mysql/en/index.html

Data type              Description/Notes

BOOLEAN, INT           BOOLEAN is 1 bit; INT is a signed 32-bit integer.

FLOAT, DOUBLE          Single or double precision floating point numbers.

CHAR(N), VARCHAR(N)    Character string of at most N characters.

DATETIME               Date and time stored in format: 'YYYY-MM-DD HH:MM:SS'

DATE                   Date stored in format 'YYYY-MM-DD'

                       You can specify Date and Time in several formats:

                       • As a string in either 'YYYY-MM-DD HH:MM:SS' or 'YY-MM-DD HH:MM:SS' format. Any punctuation character can be used as the delimiter between date parts or time parts, e.g., '98-12-31 11:30:45', '98.12.31 11+30+45', '98/12/31 11*30*45', …

                       • If only Date is specified for a datetime, then the time part of the entries will be 00:00:00.

BLOB                   This type is used to store large binary objects, e.g. files, images, …

Example 2:

The following example shows how some common types of constraints can be defined on tables. CHECK constraints are checked whenever the value of the attribute is modified or a new row is inserted. The requested operation is disallowed if the constraint fails. Foreign key constraints are checked on these two types of actions and, in addition, when a referenced attribute in the parent table is changed or deleted.

CREATE TABLE Borrows (

catalog_num VARCHAR(15) NOT NULL,

copy_num INT NOT NULL,

issue_date DATE NOT NULL,

person_id CHAR(8) NOT NULL,

PRIMARY KEY CLUSTERED(catalog_num, copy_num, person_id, issue_date),

CONSTRAINT fk_borrows_book FOREIGN KEY(catalog_num, copy_num) REFERENCES

Book(catalog_no, copy_no),

CONSTRAINT fk_borrows_person FOREIGN KEY(person_id) REFERENCES Person( id) )


Example 3:

CREATE TABLE Person (

lname VARCHAR(35) NOT NULL,

fnames VARCHAR(50) NOT NULL,

email VARCHAR(60) NOT NULL UNIQUE CHECK ( email LIKE '%@%'),

id CHAR(8) NOT NULL,

phone CHAR(12) NULL,

PRIMARY KEY (id) )

Each time you insert a row of data (or even change the value of any attribute of a tuple in a table), several

types of constraints are checked to ensure against incorrect data entry. These include:

- the data is the correct type (domain integrity)

- disallowing a ‘null’ value if the design requires the attribute to be non-null

- the value entered must match any constraint specified via ‘CHECK’ functions

- if the attribute(s) are key, or UNIQUE, then no two rows in the table will be allowed to have these

values repeated

- If the attribute refers to an attribute of another table, the entered value must be present in at least one

row of the referred table
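The checks listed above can be observed directly. The following is a minimal sketch, assuming Python's built-in sqlite3 module (not the DBMS used in these notes); the sample rows, the example.org addresses and the helper name `rejected` are made up for illustration. SQLite enforces foreign keys only after `PRAGMA foreign_keys = ON`, and it is lenient about data types, so the domain check is not shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when asked
conn.execute("""
    CREATE TABLE Person (
        lname  VARCHAR(35) NOT NULL,
        fnames VARCHAR(50) NOT NULL,
        email  VARCHAR(60) NOT NULL UNIQUE CHECK (email LIKE '%@%'),
        id     CHAR(8)     NOT NULL,
        phone  CHAR(12),
        PRIMARY KEY (id)
    )
""")
conn.execute("""
    CREATE TABLE Borrows (
        catalog_num VARCHAR(15) NOT NULL,
        copy_num    INT         NOT NULL,
        issue_date  DATE        NOT NULL,
        person_id   CHAR(8)     NOT NULL,
        PRIMARY KEY (catalog_num, copy_num, person_id, issue_date),
        CONSTRAINT fk_borrows_person FOREIGN KEY (person_id) REFERENCES Person(id)
    )
""")
conn.execute("INSERT INTO Person VALUES ('Smith', 'John', 'john@example.org', '00000001', NULL)")


def rejected(sql):
    """Return True if the DBMS refuses the statement because a constraint fails."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True


results = {
    "NOT NULL": rejected("INSERT INTO Person VALUES (NULL, 'Ann', 'ann@example.org', '00000002', NULL)"),
    "CHECK": rejected("INSERT INTO Person VALUES ('Lee', 'Ann', 'no-at-sign', '00000003', NULL)"),
    "UNIQUE": rejected("INSERT INTO Person VALUES ('Lee', 'Ann', 'john@example.org', '00000004', NULL)"),
    "PRIMARY KEY": rejected("INSERT INTO Person VALUES ('Lee', 'Ann', 'ann@example.org', '00000001', NULL)"),
    "FOREIGN KEY": rejected("INSERT INTO Borrows VALUES ('QA76.9.D3', 1, '2004-10-01', '99999999')"),
}
for name, was_rejected in results.items():
    print(name, "violation rejected:", was_rejected)
```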

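If the primary key has more than one attribute, list all of them inside a single PRIMARY KEY (...) clause, as in the ‘Book’ and ‘Borrows’ tables. The keyword CLUSTERED used in those examples is not part of standard SQL; it is accepted by some DBMS’s (e.g. Microsoft SQL Server) and must be omitted where the DBMS does not accept it.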

WARNINGS ABOUT IMPLEMENTATIONS:

1. Some DBMS’s, e.g. MySQL, do not have a complete implementation yet -- for example,

CHECK (..) constraints do not work on the version provided by ITSC;

2. Referential constraints specification: Notice that when we define the Table ‘Borrows’, it

references entries from a table ‘Person’, which has not yet been defined. In many DBMS’s, such

references will not be allowed. In such cases, you will first need to create the table ‘Borrows’

without the constraint fk_borrows_person. Next, you create the table ‘Person’; and finally, you

add the fk_borrows_person constraint by the use of an “ALTER TABLE Borrows …” command.

DROP TABLE command is used to:

(1) Delete all the data in a table AND

(2) Delete the definition of the table itself from the DB.


Example 1:

DROP TABLE Person;

However, in this case, we have another table ‘Borrows’, which references an attribute from the ‘Person’ table. The deletion of all data from ‘Person’ will result in violation of referential constraints for each row in ‘Borrows’. Therefore, most DBMS’s will warn/disallow the above DROP TABLE command.

In such cases, you may choose to use the CASCADE option:

Example 1a:

DROP TABLE Person CASCADE;

This command does the following:

- if any attribute of ‘Person’ is referenced from another table, then those referential constraints are deleted from the definition of the referencing table.

- all data in the table is deleted, and

- the definition of the table itself is deleted from the DB.

ALTER TABLE command is used to:

(1) Add a new column in a table

(2) Delete a column from a table

(3) Add/Delete a constraint specified on a table

Suppose we want to add a column, fines, to store the outstanding total fine that a person needs to pay.

Example 1:

ALTER TABLE Person ADD fines FLOAT;

In this case, the attribute ‘fines’ will be defined as ‘NULL’ by default, since there may already be some

rows of data in the table ‘Person’, which have no value set for ‘fines’. If you must add a ‘NOT NULL’

attribute, then you must also provide a default value for it. In our example, suppose that the library will

categorize all books so as to control the allowed period of borrowing. This attribute can be added as

follows:


Example 2:

ALTER TABLE Book ADD category VARCHAR(10) NOT NULL DEFAULT 'normal' CHECK

(category IN ('normal', 'reserve', 'media'));
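The effect of adding columns to a table that already holds data can be checked with a small experiment. This is a minimal sketch, assuming Python's built-in sqlite3 module and cut-down versions of the ‘Person’ and ‘Book’ tables; the column `shelf` is hypothetical and is used only to show the refused case. SQLite spells the command ADD COLUMN.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Person (id CHAR(8) PRIMARY KEY, lname VARCHAR(35) NOT NULL)")
conn.execute("CREATE TABLE Book (catalog_no VARCHAR(15), copy_no INT, PRIMARY KEY (catalog_no, copy_no))")
conn.execute("INSERT INTO Person VALUES ('09112001', 'Bush')")
conn.execute("INSERT INTO Book VALUES ('QA76.9.D3', 1)")

# A nullable column: rows that already exist get NULL
conn.execute("ALTER TABLE Person ADD COLUMN fines FLOAT")
# A NOT NULL column needs a DEFAULT, which is used for rows that already exist
conn.execute("ALTER TABLE Book ADD COLUMN category VARCHAR(10) NOT NULL DEFAULT 'normal'")

fines = conn.execute("SELECT fines FROM Person").fetchone()[0]
category = conn.execute("SELECT category FROM Book").fetchone()[0]
print(fines, category)

# NOT NULL without a DEFAULT cannot be satisfied by the existing row
try:
    conn.execute("ALTER TABLE Book ADD COLUMN shelf VARCHAR(10) NOT NULL")
    refused = False
except sqlite3.Error:
    refused = True
print(refused)
```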

In general, you should try to design a DB such that there is no need for ALTER commands. What

happens to data integrity after an ALTER TABLE command is issued?

Example 3a.

ALTER TABLE Borrows DROP CONSTRAINT fk_borrows_person;

Suppose we now enter a record in ‘Borrows’ with a person-id that is not in the Person table. This will be

allowed, since the corresponding foreign key constraint was dropped. Suppose we now realize our

mistake, and add back the foreign key constraint:

Example 3b:

ALTER TABLE Borrows ADD CONSTRAINT fk_borrows_person FOREIGN KEY(person_id)

REFERENCES Person( id);

What happens to the Person whose record was entered between the two ALTER TABLE commands ?

INSERT INTO command is used to:

(1) Add one or more rows of data into a table

Example 1:

INSERT INTO Person VALUES ( 'Bush', 'George W.', '[email protected]', '09112001', NULL, 0);

Notice that

- Since the ‘fines’ attribute was added after the table was created, it is the last attribute in the table.

- Since NULL values are allowed for the attribute ‘phone’, we can enter NULL. Also, CHAR and

VARCHAR data types must be put in single-quotes. Numeric types (INT, FLOAT) are not quoted. In

MySQL, DATE type is usually input in the format ‘YYYY-MM-DD’ and is also displayed as such.

In many DBMS’s, you will be allowed to set your format for DATE and DATETIME types, after which

you can enter such data in your specified format.


Example 2:

INSERT INTO Book VALUES ( '0321122267', 'Fundamentals of Database Systems', 'QA76.9.D3', 1,

'Databases', '2004-09-25');

Notes:

- The arguments to VALUES are placed in exactly the same sequence as they occur in the CREATE

TABLE command.

- The data type of each argument must match the data type of the attribute

- A null entry can be entered as NULL (no quotes)

- Records are inserted one at a time, so it is quite tedious to enter each row of data into a table by

manually typing the INSERT INTO command -- it is much easier to create a data entry form, and use a

program to make and run the INSERT command (you will learn this in your labs).
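Letting a program build and run the INSERT commands can look like the following. This is only a sketch, assuming Python's built-in sqlite3 module rather than the tools used in the labs; the second book row is made up. The `?` placeholders are filled in by the library, so the program does not have to quote strings or write NULL itself.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Book (
        isbn          VARCHAR(15)  NOT NULL,
        title         VARCHAR(200) NOT NULL,
        catalog_no    VARCHAR(15)  NOT NULL,
        copy_no       INT          NOT NULL DEFAULT 1,
        keywords      CHAR(100),
        purchase_date DATE,
        PRIMARY KEY (catalog_no, copy_no)
    )
""")

# Values in the same order as the columns of CREATE TABLE; None stands for NULL
new_books = [
    ("0321122267", "Fundamentals of Database Systems", "QA76.9.D3", 1, "Databases", "2004-09-25"),
    ("0321122267", "Fundamentals of Database Systems", "QA76.9.D3", 2, None, None),
]
conn.executemany("INSERT INTO Book VALUES (?, ?, ?, ?, ?, ?)", new_books)
conn.commit()

stored = conn.execute("SELECT copy_no, keywords FROM Book ORDER BY copy_no").fetchall()
print(stored)
```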

In most DBMS’s, you will be allowed to directly import data from a text file into a table. However, there are

several restrictions regarding how the data in such input files must be formatted (e.g. each entry must be

separated by exactly one TAB, each row ends with a Newline, NULL entries are entered using special

symbols such as \N, and so on).

Most DBMS’s will also allow you to directly import multiple rows or entire tables and their definitions

from other RDBMS’s.

DELETE FROM command is used to:

(1) Delete one or more rows of data from a table

Example 1:

DELETE FROM Person

WHERE id = '09112001';

The above command will delete the entire row corresponding to the person with id = '09112001', namely

the record for the person called George W. Bush from our DB. DELETE can delete more than one row

from a table:


Example 2:

DELETE FROM Person WHERE lname = 'Bush';

This will delete all records for people with last name “Bush”.

Example 3:

DELETE FROM Borrows WHERE 1;

The above SQL command will delete every row of the table ‘Borrows’.

Example 4a:

DELETE FROM Borrows WHERE person_id IN ('09112001', '55554444', '12345678');

This will delete all records of borrowed books that were taken by persons whose id is in the provided list. The list can be written explicitly (as in example 4a), or it can be generated by a different DB query, as in example 4b below. The “SELECT …” part in the query below is itself an SQL query, which returns a list of id’s from the table ‘Person’, for those persons who have last name “Bush”.

Example 4b:

DELETE FROM Borrows WHERE person_id IN (SELECT id FROM Person WHERE lname = 'Bush');
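The behaviour of Example 4b can be verified on a few rows. This is a minimal sketch, assuming Python's built-in sqlite3 module and cut-down ‘Person’ and ‘Borrows’ tables; the persons and loans are invented for the test.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Person  (id CHAR(8) PRIMARY KEY, lname VARCHAR(35) NOT NULL);
    CREATE TABLE Borrows (catalog_num VARCHAR(15), copy_num INT, person_id CHAR(8));
    INSERT INTO Person VALUES ('09112001', 'Bush'), ('55554444', 'Bush'), ('12345678', 'Smith');
    INSERT INTO Borrows VALUES
        ('QA76.9.D3', 1, '09112001'),
        ('QA76.9.D3', 2, '55554444'),
        ('QA76.9.D3', 3, '12345678');
""")

# The inner SELECT produces the list of ids; the DELETE removes every matching loan
cur = conn.execute(
    "DELETE FROM Borrows WHERE person_id IN (SELECT id FROM Person WHERE lname = 'Bush')"
)
deleted = cur.rowcount  # number of rows removed by the DELETE
remaining = conn.execute("SELECT person_id FROM Borrows").fetchall()
print(deleted, remaining)
```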

UPDATE command is used to:

(1) Modify the value of one or more cells in a table.

UPDATE modifies some data in rows that are already in the table. It cannot create new rows (you must

use the INSERT command to do so).

Example 1:

UPDATE Borrows SET issue_date = CURRENT_DATE( ) WHERE person_id = '09112001';

There are two things to note in the above example:

- The function CURRENT_DATE( ) returns the current date according to the time/date set on the DB

server. This function is specific to MySQL, although all DBMS’s have a similar function.

- The effect of the above operation is to change the issue_date of every book that has been borrowed by

the person with id = 09112001.


In the following example, we shall see that you can put arithmetic expressions inside SQL queries.

Example 2:

UPDATE Person SET fines = fines*2.0 WHERE id = '09112001';

The above query will have the effect of doubling the fines for the person with id = 09112001. In practice,

we can use fairly complex conditions in the ‘WHERE’ clause to isolate particular tuples in which we

want to make changes. We will see many examples of this in our study of the SELECT command.

SELECT command is used to:

(1) Output required information from one or more tables.

The SQL SELECT command is equivalent to the combination of all the relational algebra (RA) operations, with some extra functionality. We shall use a ‘learn-by-example’ approach, using the Employee-Department-Projects database from the Elmasri-Navathe textbook for all examples.

Example 1: Report the birth date and address of employee named "John Smith"

SELECT BDate, Address

FROM EMPLOYEE

WHERE Fname = 'John' AND

Lname = 'Smith';

OUTPUT

BDate       Address
9-Jan-55    731 Fonden

Notes:

- The SELECT command has at least two clauses: SELECT [list of attributes] FROM [tables]. If the FROM clause has more than one table, then all of the named tables will be JOIN-ed. The attributes listed after SELECT must belong to one of the tables named in the FROM clause.

- In the above example, the WHERE [expression] is also used. This expression is evaluated for each row

of the named table (EMPLOYEE in our example). If it is TRUE, then the named attributes of that row are

output.


- The output of a SELECT command may contain repeated identical rows (this is different from RA). If

you want SQL to eliminate any repeated rows, you can use the DISTINCT keyword:

Example 1a: Report the SSN of Employees who spend more than 15 hours on some project.

The difference in the outputs with or without the use of DISTINCT is shown below. Using DISTINCT is

preferred if we only want to know how many people work over 15 Hrs on some project; if we want to

know how many assignments of over 15 Hours per week are there, then the second query is useful.

SELECT DISTINCT ESSN

FROM WORKS_ON

WHERE Hours > 15;

OUTPUT (with DISTINCT)

ESSN
123456789
666884444
453453453
999887777
987987987

SELECT ESSN

FROM WORKS_ON

WHERE Hours > 15;

OUTPUT (without DISTINCT)

ESSN
123456789
666884444
453453453
453453453
999887777
987987987
987987987
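The two queries of Example 1a can be compared side by side. This is a minimal sketch, assuming Python's built-in sqlite3 module; the WORKS_ON rows are a subset chosen to reproduce the outputs above, and the project numbers in the PNo column are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE WORKS_ON (ESSN TEXT, PNo INT, Hours REAL);
    INSERT INTO WORKS_ON VALUES
        ('123456789', 1, 32.5), ('123456789', 2, 7.5),
        ('666884444', 3, 40),
        ('453453453', 1, 20),   ('453453453', 2, 20),
        ('333445555', 2, 10),
        ('999887777', 30, 30),
        ('987987987', 10, 35),  ('987987987', 30, 20);
""")

# One row per assignment of more than 15 hours
assignments = conn.execute("SELECT ESSN FROM WORKS_ON WHERE Hours > 15").fetchall()
# One row per employee who has at least one such assignment
employees = conn.execute("SELECT DISTINCT ESSN FROM WORKS_ON WHERE Hours > 15").fetchall()

print(len(assignments), "assignments,", len(employees), "employees")
```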

Example 2: Report the Name and address of employees working in the “Research” department.

SELECT Fname, Lname, Address

FROM EMPLOYEE, DEPARTMENT

WHERE Dname = 'Research' AND

Dnumber = Dno;

- The information is spread across two tables, EMPLOYEE and DEPARTMENT, so we need a JOIN

operation, which is done by listing both tables in the FROM clause.

- The join-condition is: (Dnumber = Dno);

- The selection condition is (Dname = ‘Research’)


OUTPUT

Fname       Lname      Address
John        Smith      731 Fonden
Franklin    Wong       638 Voss
Ramesh      Narayan    975 Fire Oak
Joyce       English    5631 Rice

Example 3: For each project located in Stafford, list the project number, the controlling department, and

the department manager's last name and address.

SELECT Pnumber, Dnum, Lname, Address

FROM PROJECT, DEPARTMENT, EMPLOYEE

WHERE Dnum = Dnumber AND

MgrSSN = SSN AND

Plocation = 'Stafford';

OUTPUT

Pnumber    Dnum    Lname      Address
10         4       Wallace    291 Berry
30         4       Wallace    291 Berry

- In this case, there are two join operations:

1. The join condition (Dnum = Dnumber) relates a project to its controlling department.

2. The join condition (MgrSSN = SSN) relates the controlling department to the employee who

manages that department.

The Dot-notation:

Sometimes, attributes in different tables can have the same name (e.g., Dnumber in DEPARTMENT and

DEPT_LOCATIONS). If two such tables need to be joined, then we must specify which attribute we are

really referring to. This is done by using the DOT-Notation for naming of attributes:

DEPARTMENT.Dnumber, DEPT_LOCATIONS.Dnumber, EMPLOYEE.Lname, etc.

ALIAS

Some queries need to refer to the same table twice. In this case, we can assign ALIAS names to the tables:


Example 4: For each employee, give the last name, and the last name of his/her supervisor.

SELECT E.Lname, S.Lname

FROM EMPLOYEE AS E, EMPLOYEE AS S

WHERE E.SuperSSN = S.SSN

OUTPUT

E.Lname    S.Lname
Smith      Wong
Wong       Borg
Zeleya     Wallace
Wallace    Borg
Narayan    Wong
English    Wong
Jabbar     Wallace

- An alias is defined using the keyword “AS”. It is a method of assigning an alternate name to an object,

e.g. a table or an attribute. E and S (both identical to the EMPLOYEE table), are joined using the

condition (E.SuperSSN = S.SSN).
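The self-join of Example 4 can be run as follows. This is a minimal sketch, assuming Python's built-in sqlite3 module and an EMPLOYEE table reduced to the three columns the query needs; the pairing of names with SSN values is an assumption chosen to be consistent with the output above. An ORDER BY is added only to make the printed order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE EMPLOYEE (Lname TEXT, SSN TEXT PRIMARY KEY, SuperSSN TEXT);
    INSERT INTO EMPLOYEE VALUES
        ('Smith',   '123456789', '333445555'),
        ('Wong',    '333445555', '888665555'),
        ('Zeleya',  '999887777', '987654321'),
        ('Wallace', '987654321', '888665555'),
        ('Narayan', '666884444', '333445555'),
        ('English', '453453453', '333445555'),
        ('Jabbar',  '987987987', '987654321'),
        ('Borg',    '888665555', NULL);
""")

# E plays the role of "employee", S the role of "supervisor"
pairs = conn.execute("""
    SELECT E.Lname, S.Lname
    FROM EMPLOYEE AS E, EMPLOYEE AS S
    WHERE E.SuperSSN = S.SSN
    ORDER BY E.Lname
""").fetchall()
for employee, supervisor in pairs:
    print(employee, supervisor)
```

Borg has no supervisor (SuperSSN is NULL), so the join condition is never TRUE for that row and Borg does not appear as an employee in the output.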

Example 5: Print SSN of all employees.

SELECT SSN

FROM EMPLOYEE

OUTPUT

SSN
123456789
333445555
999887777
987654321
666884444
453453453
987987987
888665555


- If the WHERE clause is missing, then SQL assumes that WHERE is TRUE for all rows; the above

query is the same as: SELECT SSN FROM EMPLOYEE WHERE 1;

Example 6: A common error in writing queries:

SELECT SSN, Dname

FROM EMPLOYEE, DEPARTMENT

- How many rows will the output contain? If you are JOIN-ing two or more tables, don’t forget to

specify the JOIN-condition!
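The question in Example 6 can be answered by counting rows. This is a minimal sketch, assuming Python's built-in sqlite3 module, eight employees and three departments as in the sample database; the assignment of SSN values to departments is an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE DEPARTMENT (Dname TEXT, Dnumber INT PRIMARY KEY);
    CREATE TABLE EMPLOYEE   (SSN TEXT PRIMARY KEY, Dno INT);
    INSERT INTO DEPARTMENT VALUES ('Headquarters', 1), ('Administration', 4), ('Research', 5);
    INSERT INTO EMPLOYEE VALUES
        ('123456789', 5), ('333445555', 5), ('999887777', 4), ('987654321', 4),
        ('666884444', 5), ('453453453', 5), ('987987987', 4), ('888665555', 1);
""")

# Without a join condition every employee is paired with every department
no_condition = conn.execute("SELECT SSN, Dname FROM EMPLOYEE, DEPARTMENT").fetchall()
# With the join condition each employee is paired only with his/her own department
with_condition = conn.execute(
    "SELECT SSN, Dname FROM EMPLOYEE, DEPARTMENT WHERE Dno = Dnumber"
).fetchall()

print(len(no_condition), len(with_condition))  # 8 * 3 rows versus 8 rows
```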

Example 7: You can use ‘*’ to denote all attributes

SELECT *

FROM EMPLOYEE

- This command will output the entire EMPLOYEE Table.

- The ‘*’ can also be used in the dot-notation:

Example 7a:

SELECT DEPT_LOCATIONS.*, DEPARTMENT.Dname

FROM DEPT_LOCATIONS, DEPARTMENT

WHERE DEPT_LOCATIONS.Dnumber = DEPARTMENT.Dnumber

OUTPUT

Dnumber    Dlocation    Dname
1          Houston      Headquarters
4          Stafford     Administration
5          Bellaire     Research
5          Sugarland    Research
5          Houston      Research


Example 8: List all projects which either use an employee called "Wong", or are controlled by a

department managed by somebody called "Wong".

(SELECT Pname

FROM PROJECT, WORKS_ON, EMPLOYEE

WHERE Pnumber = PNo AND ESSN = SSN AND LName = 'Wong' )

UNION

(SELECT PName

FROM PROJECT, DEPARTMENT, EMPLOYEE

WHERE DNum = Dnumber AND SSN = MgrSSN AND LName = 'Wong');

OUTPUT of the first sub-query

PName
ProductY
ProductZ
Computerization
Reorganisation

OUTPUT of the second sub-query

PName
ProductX
ProductY
ProductZ

OUTPUT of the UNION

PName
ProductX
ProductY
ProductZ
Computerization
Reorganisation

- The output is the set union of the results of two separate sub-queries;

- In some systems, other set theoretic operators are available (set-difference is called EXCEPT, and

intersection is called INTERSECT); however, it is better not to assume that these are available

- For UNION to succeed, the results of the two sub-queries must have the same attributes, defined identically.

Nested Queries

One of the most powerful features of SQL is that it allows arbitrary nesting of the queries within other

queries. This is good because it allows us to logically break down a complex query into simpler ones, and

then combine the queries to produce the final result. There are two types of nested queries: un-correlated

and correlated.

Un-correlated nested queries are those in which the nested query (the inner query) can be solved by

itself.

If the inner query also makes a reference to some attribute(s) of the outer query, then the query is called a

correlated nested query.


Example 9: Report the name and address of all employees working in the 'Research' department.

SELECT Fname, Lname, Address

FROM EMPLOYEE

WHERE Dno IN ( SELECT Dnumber

FROM DEPARTMENT

WHERE Dname = 'Research' )

The inner query, (SELECT Dnumber …) outputs a table with one column:

Result of inner query:

Dnumber
5

The outer query then tests the condition (WHERE Dno IN …): here, the result of the inner query is treated as a set, and the IN-clause tests for set-membership. The type of the attribute compared using ‘IN’ (in our example, Dno) must match the type of the attribute output in the inner query (Dnumber). The result of the query is:

OUTPUT

Fname       Lname      Address
John        Smith      731 Fonden
Franklin    Wong       638 Voss
Ramesh      Narayan    975 Fire Oak
Joyce       English    5631 Rice

(Each of these employees has Dno = 5.)

Example 10: (correlated nested query) Get the names of all employees who have a dependent with the

same first name.

SELECT E.Fname, E.Lname

FROM EMPLOYEE AS E

WHERE E.SSN IN ( SELECT ESSN

FROM DEPENDENT

WHERE ESSN = E.SSN

AND

E.Fname = DependentName )


OUTPUT E.Fname E.Lname

- The output is empty (in this case), but it is useful to understand how nested queries work:

For each tuple of the outer query,

Evaluate the inner query;

Check if the ‘WHERE…’ is TRUE;

If TRUE, output the result for this tuple.

Go to next tuple.

- Some general guidelines when writing nested queries:

1. Do not write nested queries of more than 3-levels.

2. By using aliases, it is often possible to rewrite a multi-level query as an equivalent single-level query (the join may, however, return a row more than once where the nested form returns it once):

Example 10a:

SELECT E.Fname, E.Lname

FROM EMPLOYEE AS E, DEPENDENT AS D

WHERE E.SSN = D.ESSN AND E.Fname = D.DependentName

3. Always use explicit references in nested queries: tableName.attributeName

Why? Because the same table may be referred to in different levels of the nested query. A

reference to an unqualified attribute refers to the relation used in the innermost nested query

which uses that relation.

- Up to now, we have used nested queries in which the WHERE clause links to the inner query using the

set-element check, ‘IN’. This is only useful if the inner query has a single attribute in its “SELECT …”

list. However, often we need to identify tuples of inner queries by multiple attributes. In such cases, we

must use the EXISTS operator.

Example 11: Get names of employees who work for at least one project.

SELECT Fname, Lname

FROM EMPLOYEE

WHERE EXISTS ( SELECT *

FROM WORKS_ON

WHERE SSN = ESSN )


OUTPUT

Fname       Lname
John        Smith
Franklin    Wong
Alicia      Zeleya
Ramesh      Narayan
Joyce       English
Ahmad       Jabbar
James       Borg

Let’s see how the EXISTS operator works.

For each tuple of the outer sub-query {

    Evaluate inner sub-query as OutputI;

    if OutputI has at least one tuple,

        EXISTS is TRUE: report the attributes listed in the outer sub-query from this tuple;

}

You can use logical inverse of the IN and EXISTS operators, by using “NOT IN” or “NOT EXISTS”.

The evaluation of the NOT EXISTS operator is as follows:

For each tuple of the outer sub-query {

    Evaluate the inner query as OutputI;

    If OutputI has no tuples,

        NOT EXISTS is TRUE: report the attributes listed in the outer sub-query from this tuple;

}

NOTE: EXISTS is only useful in correlated nested queries (WHY?)

Example 12: Find names of employees who do not work for even one project.

SELECT Fname, Lname

FROM EMPLOYEE

WHERE NOT EXISTS ( SELECT *

FROM WORKS_ON

WHERE SSN = ESSN )

OUTPUT

Fname       Lname
Jennifer    Wallace
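The evaluation loop described for EXISTS can be written out and compared with what the DBMS returns. This is a minimal sketch, assuming Python's built-in sqlite3 module and only three employees; the SSN values and WORKS_ON rows are assumptions. A real DBMS is free to evaluate the query in a different but equivalent way.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE EMPLOYEE (Fname TEXT, Lname TEXT, SSN TEXT PRIMARY KEY);
    CREATE TABLE WORKS_ON (ESSN TEXT, PNo INT, Hours REAL);
    INSERT INTO EMPLOYEE VALUES
        ('John', 'Smith', '123456789'),
        ('Franklin', 'Wong', '333445555'),
        ('Jennifer', 'Wallace', '987654321');
    INSERT INTO WORKS_ON VALUES
        ('123456789', 1, 32.5), ('123456789', 2, 7.5), ('333445555', 2, 10.0);
""")

working = conn.execute("""
    SELECT Fname, Lname FROM EMPLOYEE
    WHERE EXISTS (SELECT * FROM WORKS_ON WHERE SSN = ESSN)
    ORDER BY SSN
""").fetchall()
not_working = conn.execute("""
    SELECT Fname, Lname FROM EMPLOYEE
    WHERE NOT EXISTS (SELECT * FROM WORKS_ON WHERE SSN = ESSN)
""").fetchall()

# The same logic as an explicit loop: one inner query per tuple of the outer query
by_loop = []
outer = conn.execute("SELECT Fname, Lname, SSN FROM EMPLOYEE ORDER BY SSN").fetchall()
for fname, lname, ssn in outer:
    output_i = conn.execute("SELECT * FROM WORKS_ON WHERE ESSN = ?", (ssn,)).fetchall()
    if len(output_i) > 0:  # EXISTS is TRUE
        by_loop.append((fname, lname))

print(working, not_working, by_loop == working)
```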


String comparison using ‘wildcards’. For all CHAR(n), VARCHAR(n), and even DATE, DATETIME

type of entries, we can perform substring matching. This is very useful when we do not know the exact

string that is input. For substring matches, there are two wildcards: ‘%’ matches zero or more contiguous

characters; ‘_’ matches exactly one character. The only operator that allows the use of wildcards is the

LIKE operator.

Example 13: Find names of all Employees who live on Fonden street.

SELECT Lname

FROM EMPLOYEE

WHERE Address LIKE '%Fonden%';

OUTPUT Lname Smith

- This is useful if you don’t recall whether your data entry was “731 Fonden St” or “731 Fonden”, or if you even forgot the house number.

Example 13a: Find names of all projects whose name is ‘Product’ followed by exactly one more character.

SELECT PName

FROM PROJECT

WHERE PName LIKE 'Product_';

OUTPUT

PName
ProductX
ProductY
ProductZ
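The difference between the two wildcards shows up as soon as a name has more than one character after the prefix. This is a minimal sketch, assuming Python's built-in sqlite3 module; the project ‘ProductXL’ is hypothetical and is added only to separate the two patterns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE PROJECT (PName TEXT);
    INSERT INTO PROJECT VALUES
        ('ProductX'), ('ProductY'), ('ProductZ'), ('ProductXL'), ('Computerization');
""")


def names(pattern):
    """Return the project names matching the LIKE pattern, in alphabetical order."""
    rows = conn.execute(
        "SELECT PName FROM PROJECT WHERE PName LIKE ? ORDER BY PName", (pattern,)
    ).fetchall()
    return [row[0] for row in rows]


one_char = names("Product_")  # 'Product' plus exactly one character
any_tail = names("Product%")  # 'Product' plus zero or more characters
print(one_char)
print(any_tail)
```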

- Depending on which version of SQL your DBMS uses, LIKE may be case-sensitive or case-insensitive.

- Advanced users: A very powerful pattern matching function for strings is called REGEXP or RLIKE. It

acts just like a regular expression in PHP or PERL. For example, if you want to match any LName

starting with ‘J’ or ‘j’, you can use:


SELECT * FROM EMPLOYEE WHERE LName RLIKE '^[Jj]';

To select records of all LNames ending with ‘ja’, you can use:

SELECT * FROM EMPLOYEE WHERE LName RLIKE 'ja$';

Post-processing outputs: Aggregate Functions and Grouping. Often, you would like to generate an

output, and then perform some post-processing operations to get a meaningful result. This is a very useful

mechanism and you will find that you use it quite often. The usual mechanism is as follows:

1. Use the output of a SQL query;

2. Divide the output into groups, with each group having some common (specified) characteristic;

3. Compute some statistical value for each group

4. Report the output group-by-group

Common aggregation functions include:

Sum: find the total value of some numerical valued attribute over several tuples;

Max: find the maximum value of a given attribute over several tuples;

Min: find the minimum value of a given attribute over several tuples;

Avg: find the average value of some numerical valued attribute over several tuples;

Count: find the number of tuples.

Example 14: Get the minimum, maximum, average and total salaries for employees of the Research

department.

SELECT sum(Salary), max( Salary), min( Salary), avg( Salary)

FROM EMPLOYEE, DEPARTMENT

WHERE Dno = Dnumber AND Dname = 'Research'

OUTPUT

13300 4000 2500 3325

By default, SQL will not assign names for aggregated attributes. However, it is good practice to assign an

alias name to each:


Example 14a: Get the minimum, maximum, average and total salaries for employees of the Research

department.

SELECT sum(Salary) AS Tot, max( Salary) AS Max, min( Salary) AS Min, avg( Salary) AS Mean

FROM EMPLOYEE, DEPARTMENT

WHERE Dno = Dnumber AND Dname = 'Research'

OUTPUT

Tot Max Min Mean

13300 4000 2500 3325

- In the above example, we did not explicitly form a ‘group’, so all rows in the output of the query were

put into one group. It is more common to have several groups.

Example 15: For departments other than Headquarters, get the department number, the number of

employees in that department, and their average salary.

SELECT Dno, count(*) AS HeadCount, avg(Salary) AS MeanSalary

FROM EMPLOYEE, DEPARTMENT

WHERE Dno = Dnumber AND Dname <> 'Headquarters'

GROUP BY Dno;

OUTPUT

Dno    HeadCount    MeanSalary
5      4            3325
4      3            3100

1. The SELECT...FROM...WHERE query is first evaluated

2. In the resulting table, every tuple which has the same Dno value is ‘grouped’

3. The count(*) function prints out the number of rows in the group

4. The avg(Salary) function computes the mean of the ‘Salary’ attribute of the rows of each group

It is possible also to conditionally allow/exclude some groups from the results:


Example 16: For ‘Large’ departments other than Headquarters, get the department number, the number of

employees in that department, and their average salary.

SELECT Dno, count(*) AS HeadCount, avg(Salary) AS MeanSalary

FROM EMPLOYEE, DEPARTMENT

WHERE Dno = Dnumber AND Dname <> 'Headquarters'

GROUP BY Dno

HAVING HeadCount > 3;

OUTPUT

Dno    HeadCount    MeanSalary
5      4            3325
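Grouping with and without a HAVING condition can be reproduced as follows. This is a minimal sketch, assuming Python's built-in sqlite3 module and tables reduced to the columns the queries need; the salaries are chosen to agree with the outputs above. The HAVING condition is written with count(*) rather than the alias HeadCount, since not every DBMS accepts an alias there, and ORDER BY is added only to fix the printed order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE DEPARTMENT (Dname TEXT, Dnumber INT PRIMARY KEY);
    CREATE TABLE EMPLOYEE   (Lname TEXT, Salary INT, Dno INT);
    INSERT INTO DEPARTMENT VALUES ('Headquarters', 1), ('Administration', 4), ('Research', 5);
    INSERT INTO EMPLOYEE VALUES
        ('Smith', 3000, 5), ('Wong', 4000, 5), ('Narayan', 3800, 5), ('English', 2500, 5),
        ('Zeleya', 2500, 4), ('Wallace', 4300, 4), ('Jabbar', 2500, 4), ('Borg', 5500, 1);
""")

# One output row per group of employees sharing the same Dno
grouped = conn.execute("""
    SELECT Dno, count(*) AS HeadCount, avg(Salary) AS MeanSalary
    FROM EMPLOYEE, DEPARTMENT
    WHERE Dno = Dnumber AND Dname <> 'Headquarters'
    GROUP BY Dno
    ORDER BY Dno DESC
""").fetchall()

# HAVING filters whole groups after they have been formed
large = conn.execute("""
    SELECT Dno, count(*) AS HeadCount, avg(Salary) AS MeanSalary
    FROM EMPLOYEE, DEPARTMENT
    WHERE Dno = Dnumber AND Dname <> 'Headquarters'
    GROUP BY Dno
    HAVING count(*) > 3
""").fetchall()

print(grouped)
print(large)
```

WHERE removes individual rows before the groups are formed, while HAVING removes whole groups afterwards; this is why the Headquarters test sits in WHERE and the head-count test in HAVING.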

Example 17: Mathematical operators. Display the result of a 10% increase in Salary of employees

whose Last name starts with "B".

SELECT Lname, 1.1 * Salary AS IncreasedSalary

FROM EMPLOYEE

WHERE Lname LIKE 'B%'

OUTPUT

Lname    IncreasedSalary
Borg     6050

Note that this does not change Borg’s salary -- it only displays what the increased value will be!

Sorted display of the output: Output can be sorted using the ORDER BY clause.

The default for the ordering is Ascending order. If you want to order in descending order, just use:

ORDER BY … DESC.

Example 18:

SELECT Lname, Salary

FROM EMPLOYEE

ORDER BY Salary DESC


OUTPUT

Lname      Salary
Borg       5500
Wallace    4300
Wong       4000
Narayan    3800
Smith      3000
Zeleya     2500
English    2500
Jabbar     2500

SELECT is a powerful command. In addition, different DBMS’s will provide many extra functions to

allow the output to be formatted and grouped as you desire. It is important to read the user-guide and

tutorials for the DBMS you will use to learn these additional examples and functions.

VIEWS and Security Control

A view is a single, virtual table derived from a set of existing tables. It may be defined using any

combination of one or more existing tables or views. Views have two important uses:

(a) They can be used to show data that is conceptually related in one table, even though the Normalization

process has required us to store the data physically in separate tables.

(b) They can be used to hide some part of information from (some subset of) users, making it easier to

control data security.

Example 1: Create a view showing the names of employees, which project they work on, and how many

hours they spend on each project.

CREATE VIEW EMP_WORKS_ON
AS SELECT Fname, Lname, Pname, Hours
FROM EMPLOYEE, PROJECT, WORKS_ON
WHERE SSN = ESSN AND Pno = Pnumber;

Example 1a: Show the data in EMP_WORKS_ON.

SELECT * FROM EMP_WORKS_ON

EMP_WORKS_ON
Fname      Lname     Pname             Hours
John       Smith     ProductX          32.5
John       Smith     ProductY          7.5
Ramesh     Narayan   ProductZ          40
Joyce      English   ProductX          20
Joyce      English   ProductY          20
Franklin   Wong      ProductY          10
Franklin   Wong      ProductZ          10
Franklin   Wong      Computerization   10
Franklin   Wong      Reorganization    10
Alicia     Zeleya    Newbenefits       30
Alicia     Zeleya    Computerization   10
Ahmad      Jabbar    Computerization   35
Ahmad      Jabbar    Newbenefits       5
Ahmad      Jabbar    Newbenefits       20
Ahmad      Jabbar    Reorganization    15
James      Borg      Reorganization    null
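Because a view is a virtual table, it behaves like a saved query: its rows are derived from the base tables each time it is queried. The following sketch illustrates this with Python's sqlite3 module (an assumption made for self-containment) and a cut-down set of sample rows; the SSN and Pnumber values are assumed for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Simplified schemas and sample rows, assumed for this sketch.
CREATE TABLE EMPLOYEE (SSN CHAR(9) PRIMARY KEY, Fname VARCHAR(30), Lname VARCHAR(30));
CREATE TABLE PROJECT (Pnumber INT PRIMARY KEY, Pname VARCHAR(30));
CREATE TABLE WORKS_ON (ESSN CHAR(9), Pno INT, Hours REAL, PRIMARY KEY (ESSN, Pno));
INSERT INTO EMPLOYEE VALUES ('123456789', 'John', 'Smith'), ('453453453', 'Joyce', 'English');
INSERT INTO PROJECT VALUES (1, 'ProductX'), (2, 'ProductY');
INSERT INTO WORKS_ON VALUES ('123456789', 1, 32.5), ('123456789', 2, 7.5);

CREATE VIEW EMP_WORKS_ON AS
SELECT Fname, Lname, Pname, Hours
FROM EMPLOYEE, PROJECT, WORKS_ON
WHERE SSN = ESSN AND Pno = Pnumber;
""")

view_query = "SELECT Fname, Pname, Hours FROM EMP_WORKS_ON ORDER BY Fname, Pname"
before = conn.execute(view_query).fetchall()

# Insert into the base table only; the view definition is not touched.
conn.execute("INSERT INTO WORKS_ON VALUES ('453453453', 1, 20)")
after = conn.execute(view_query).fetchall()

print(before)
print(after)
```

The new WORKS_ON row shows up in the view without the view being redefined or refreshed.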

Caution in using views

Views appear similar to any other table in a DB, yet it is important to understand the differences between

tables and views:

1. The data of each table is physically stored (on the disk). A view does not correspond to any stored data; only its definition is kept, and the data for a view is generated when a query on it is processed.

2. Updating an attribute through a view updates the data in the underlying table.

Example 2: What happens to employee hours if they work one-shift overtime?

UPDATE EMP_WORKS_ON
SET Hours = Hours * 1.5;

What is the outcome?

Since ‘Hours’ in EMP_WORKS_ON is actually derived from ‘Hours’ in WORKS_ON, all the data for ‘Hours’ in WORKS_ON will be modified.

This behaviour can cause some unexpected results. Suppose that John Smith, who is currently working on the ‘ProductX’ project, is reassigned to the ‘ProductY’ project. We may be tempted to write:

Example 3 (incorrect):

UPDATE EMP_WORKS_ON
SET Pname = 'ProductY'
WHERE Lname = 'Smith' AND Pname = 'ProductX';

What happens now? Since the base table for ‘Pname’ is PROJECT, the value ‘ProductX’ in that table is changed to ‘ProductY’ -- this is obviously incorrect (why?).

The correct query should be something like the following.

Example 3 (correct):

UPDATE WORKS_ON
SET Pno = (SELECT Pnumber FROM PROJECT WHERE Pname = 'ProductY')
WHERE ESSN = (SELECT SSN FROM EMPLOYEE WHERE Lname = 'Smith')
  AND Pno = (SELECT Pnumber FROM PROJECT WHERE Pname = 'ProductX');
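The effect of updating the base table WORKS_ON directly can be checked with a small sketch. It assumes Python's sqlite3 module and a deliberately simplified data set in which Smith works only on ProductX and English also works on ProductX; the SSN and Pnumber values are assumed for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Simplified schemas and sample rows, assumed for this sketch.
CREATE TABLE EMPLOYEE (SSN CHAR(9) PRIMARY KEY, Lname VARCHAR(30));
CREATE TABLE PROJECT (Pnumber INT PRIMARY KEY, Pname VARCHAR(30));
CREATE TABLE WORKS_ON (ESSN CHAR(9), Pno INT, Hours REAL, PRIMARY KEY (ESSN, Pno));
INSERT INTO EMPLOYEE VALUES ('123456789', 'Smith'), ('453453453', 'English');
INSERT INTO PROJECT VALUES (1, 'ProductX'), (2, 'ProductY');
INSERT INTO WORKS_ON VALUES ('123456789', 1, 32.5), ('453453453', 1, 20);
""")

# Reassign Smith by changing his WORKS_ON row, not the project's name.
conn.execute("""
UPDATE WORKS_ON
SET Pno = (SELECT Pnumber FROM PROJECT WHERE Pname = 'ProductY')
WHERE ESSN = (SELECT SSN FROM EMPLOYEE WHERE Lname = 'Smith')
  AND Pno = (SELECT Pnumber FROM PROJECT WHERE Pname = 'ProductX')
""")

assignments = conn.execute("""
SELECT Lname, Pname
FROM EMPLOYEE, PROJECT, WORKS_ON
WHERE SSN = ESSN AND Pno = Pnumber
ORDER BY Lname
""").fetchall()
project_names = [name for (name,) in conn.execute("SELECT Pname FROM PROJECT ORDER BY Pnumber")]

print(assignments)
print(project_names)
```

Only Smith's assignment changes; the PROJECT table still contains ‘ProductX’, and English remains assigned to it.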

3. A view can contain computed attributes, which are usually not explicitly stored in normal tables.

Example 3:

CREATE VIEW DEPT_INFO
AS SELECT DName, count(*) AS NumEmps, sum(Salary) AS TotalSalary
FROM DEPARTMENT, EMPLOYEE
WHERE DNumber = DNo
GROUP BY DName;

- Note that you cannot UPDATE a computed attribute.
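The DEPT_INFO view can be tried out as follows. This is a sketch that assumes Python's sqlite3 module and sample rows chosen for illustration (the department names other than Headquarters are assumed). One caveat: SQLite rejects every UPDATE on a view, not just those on computed attributes, so the failed UPDATE below shows the error but not the specific reason another DBMS would give.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Simplified schemas and sample rows, assumed for this sketch.
CREATE TABLE DEPARTMENT (DNumber INT PRIMARY KEY, DName VARCHAR(30));
CREATE TABLE EMPLOYEE (Lname VARCHAR(30), Salary INT, DNo INT);
INSERT INTO DEPARTMENT VALUES
  (1, 'Headquarters'), (4, 'Administration'), (5, 'Research');
INSERT INTO EMPLOYEE VALUES
  ('Smith', 3000, 5), ('Wong', 4000, 5), ('Narayan', 3800, 5), ('English', 2500, 5),
  ('Zeleya', 2500, 4), ('Wallace', 4300, 4), ('Jabbar', 2500, 4), ('Borg', 5500, 1);

CREATE VIEW DEPT_INFO AS
SELECT DName, count(*) AS NumEmps, sum(Salary) AS TotalSalary
FROM DEPARTMENT, EMPLOYEE
WHERE DNumber = DNo
GROUP BY DName;
""")

# NumEmps and TotalSalary are computed at query time; they are stored nowhere.
dept_info = conn.execute("SELECT * FROM DEPT_INFO ORDER BY DName").fetchall()
print(dept_info)

# Attempting to UPDATE the computed attribute fails.
# (SQLite refuses all updates through views; this is a property of SQLite.)
try:
    conn.execute("UPDATE DEPT_INFO SET TotalSalary = 0")
    update_rejected = False
except sqlite3.Error as err:
    update_rejected = True
    print("UPDATE rejected:", err)
```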

A big advantage of using a DBMS is that you can exercise an arbitrarily fine level of control over who can perform which operations on which data. This control is specified via the GRANT and REVOKE commands.

The DB Administrator (DBA) creates all user accounts (including user names and passwords), and has the right to control all the privileges. In SQL, these privileges cover the commands SELECT (i.e. authority to see the data), UPDATE, INSERT and DELETE (individual DBMS’s may provide further types of security control, including encryption etc.).

Example 4: Allow user U1 to see/modify all Employee data except Salaries.

CREATE VIEW EMP_PERSONNEL AS
SELECT Fname, Minit, Lname, SSN, BDate, Address, Sex, SuperSSN, Dno
FROM EMPLOYEE;

GRANT SELECT, UPDATE ON EMP_PERSONNEL TO U1;

If we expect that the user U1 will need to assign data-lookup duties to other users, then we could use:

GRANT SELECT, UPDATE ON EMP_PERSONNEL TO U1 WITH GRANT OPTION;

Now, U1 can log onto the DB, and let other users, e.g. U2, see (but not modify) the data:

GRANT SELECT ON EMP_PERSONNEL TO U2;

If later U2 is to be denied access, the privilege can be revoked:

REVOKE SELECT ON EMP_PERSONNEL FROM U2;
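The part of Example 4 that hides data is the view itself: a user who can only query EMP_PERSONNEL never sees the Salary column. The sketch below checks this using Python's sqlite3 module, which is an assumption made for self-containment. SQLite has no user accounts and does not support GRANT or REVOKE, so only the column-hiding effect of the view is shown; the single EMPLOYEE row holds placeholder values.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Assumed schema with placeholder sample data.
CREATE TABLE EMPLOYEE (
  Fname VARCHAR(30), Minit CHAR(1), Lname VARCHAR(30), SSN CHAR(9) PRIMARY KEY,
  BDate DATE, Address VARCHAR(60), Sex CHAR(1), Salary INT,
  SuperSSN CHAR(9), Dno INT);
INSERT INTO EMPLOYEE VALUES
  ('John', 'B', 'Smith', '123456789', '1965-01-09', 'Houston, TX', 'M',
   3000, '333445555', 5);

CREATE VIEW EMP_PERSONNEL AS
SELECT Fname, Minit, Lname, SSN, BDate, Address, Sex, SuperSSN, Dno
FROM EMPLOYEE;
""")

# The columns visible through the view: everything except Salary.
cursor = conn.execute("SELECT * FROM EMP_PERSONNEL")
visible_columns = [col[0] for col in cursor.description]
print(visible_columns)
```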

NOTE: In general, it is good practice to use views to GRANT access for SELECT, but it is better to use actual tables for GRANT on INSERT, DELETE and UPDATE commands. The reason is that modifying data through a view actually changes the data in the underlying table, which can cause unexpected results as we saw in Example 3 above.

- You can GRANT access for individual columns:

GRANT UPDATE ON EMPLOYEE( Salary) TO U3;
