Distributed Database Design
COSC 5040 Week One
Jiangping Wang, Webster University, Distributed Database Design
Outline
Introduction
Course overview
Database systems concepts
Relational database model
Structured query language (SQL)
Database System Concept
Data
  Known facts
Database
  A collection of related data
Database Management System (DBMS)
  A software system to facilitate the defining, constructing, manipulating, and sharing of a computerized database
Database System
  The DBMS software together with the data itself
  Sometimes, the applications are also included
Typical DBMS Functionality
Define a database
Construct and load the database
Manipulate the database
  Querying, generating reports, insertions, deletions, and modifications
Concurrent processing and sharing
Protection or security
Presentation and visualization
Database System Environment
Example of a Database
Figure 1.2 A database that stores student and course information.
Database Manipulation
Database manipulation involves querying and updating (p. 9)
  Examples of queries
  Examples of updates
Database Approach Characteristics
Self-describing nature of a database system
  Meta-data
Insulation between programs and data, data abstraction
  Program-data independence
Support of multiple views of the data
  Virtual data
Sharing of data and multi-user transaction processing
  Concurrency control
Database Users
Actors on the scene
  Database administrators (DBA)
    Authorizing access to the database
    Acquiring software and hardware resources
    Controlling and monitoring efficiency of operations
  Database designers
    Define content, structure, constraints, and functions or transactions
    Communicate with the end-users
  End-users
    Queries, reports
    Update the database content
Actors behind the scene
Advantages of Database Approach
Controlling redundancy
Restricting unauthorized access
Providing persistent storage
Providing storage structures for efficient query processing
Providing backup and recovery
Providing multiple interfaces
Representing complex relationships among data
Enforcing integrity constraints
Drawing inferences and actions
Historical Development
Early database applications
  Hierarchical model
  Network model
Relational model based systems
Object-oriented applications: OODBs and ORDBMSs
Web and e-commerce applications
Databases for new applications
Data Models
Data model
  Data abstraction
  A collection of concepts that can be used to describe the structure of a database
  Entities, attributes, relationships
  Data types, constraints
Categories of data models
  Conceptual (high-level, semantic) data models
  Implementation (representational) data models
  Physical (low-level, internal) data models
Schemas and Instances
Database schema
  Description of a database
Schema diagram
  Diagrammatic display of a database schema
Database state
  Actual data in the database at a particular moment in time
  Current set of occurrences or instances
Schema Diagram
Three-Schema Architecture
Data Independence
Logical data independence
  The capacity to change the conceptual schema without having to change the external schemas and their application programs
Physical data independence
  The capacity to change the internal schema without having to change the conceptual schema
DBMS Languages
Structured query language (SQL)
Data definition language (DDL)
  To specify database conceptual schema
Data manipulation language (DML)
  To specify database retrievals and updates
DBMS Interfaces
  Stand-alone query language interfaces
  Programmer interfaces for embedding DML in programming languages
Database System Utilities
To perform certain functions such as:
  Loading data stored in files into a database
  Data conversion tools
  Backing up the database periodically
  Reorganizing database file structures
  Report generation utilities
  Performance monitoring utilities
  Sorting, user monitoring, data compression
  Data dictionary
Client-Server Architectures
Centralized architecture
Client-server architecture
  Client
    Provides appropriate interfaces and a client version of the system to access and utilize the server resources
  Server
    Provides services to clients
    Database server provides database query and transaction services to clients
Three Tier Client-Server Architecture
Classification of DBMS
Based on data model
  Relational
  Network
  Hierarchical
  Object-oriented
  Object-relational
Other classifications
  Single-user vs. multi-user
  Centralized vs. distributed
Relational Model Concepts
The relational model is based on the concept of a relation
A relation is a mathematical concept based on the ideas of sets
Relation: a table of values
  Contains a set of rows and columns
Example of a Relation
Definitions
The schema, or description of a relation
  R(A1, A2, ..., An)
  CUSTOMER (Cust-id, Cust-name, Address, Phone#)
A tuple is an ordered set of values
  Each value is derived from an appropriate domain
A domain is a set of atomic values
  Data type or format
An attribute designates the role played by the domain
Definitions
A relation is formed as a subset of the Cartesian product of the sets
Each set has values from a domain
That domain is used in a specific role, which is the attribute name
Given R(A1, A2, ..., An)
  r(R) ⊆ dom(A1) × dom(A2) × ... × dom(An)
R: schema of the relation
r of R: a specific "value" or population of R
Example
Let R(A1, A2)
  Let dom(A1) = {0, 1}
  Let dom(A2) = {a, b, c}
Then, for example:
  r(R) = {<0,a>, <0,b>, <1,c>}
  is one possible "state" or "population" or "extension" r of the relation R, defined over the domains dom(A1) and dom(A2)
  It has three tuples
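The subset relationship in this example can be checked mechanically. The following is a minimal sketch in Python (the variable names are illustrative and not part of the slides): it builds the full Cartesian product of the two domains and confirms that the example state is a subset of it.

```python
from itertools import product

# Domains from the example: dom(A1) = {0, 1}, dom(A2) = {a, b, c}
dom_a1 = {0, 1}
dom_a2 = {"a", "b", "c"}

# Every tuple that could appear in any state of R(A1, A2)
cartesian = set(product(dom_a1, dom_a2))

# The example state r(R), written as a set of tuples
r = {(0, "a"), (0, "b"), (1, "c")}

# r is a valid state only if it is a subset of the Cartesian product
is_valid_state = r <= cartesian
print(len(cartesian), len(r), is_valid_state)
```

The product holds 2 × 3 = 6 possible tuples, of which this state uses three. A tuple such as <2,a> could never appear, because 2 is not in dom(A1).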
Definition Comparison
Informal Terms        Formal Terms
Table                 Relation
Column                Attribute
Row                   Tuple
Values in a column    Domain
Table Definition      Schema of a Relation
Populated Table       State of the Relation
Characteristics of Relations
Ordering of tuples in a relation r(R)
  The tuples are not considered to be ordered
Ordering of values within each tuple
  The attributes in R(A1, A2, ..., An) and the values in t = <v1, v2, ..., vn> are ordered
Values in a tuple
  All values are considered atomic (indivisible)
  A special null value is used to represent values that are unknown or inapplicable to certain tuples
Relational Integrity Constraints
Constraints are conditions that must hold on all valid relation instances
Types of constraints
  Domain constraints
  Key constraints
  Entity integrity constraints
  Referential integrity constraints
Key Constraints
Uniqueness
  A key is a set of attributes of R such that no two tuples in any valid relation instance r(R) have the same combination of values for those attributes
Minimal
  Removal of any attribute results in a set of attributes that is not a key
If a relation has several candidate keys, one is chosen to be the primary key
  The primary key value is used to uniquely identify each tuple in a relation
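The two properties, uniqueness and minimality, can be tested against one relation state. The sketch below is in Python; the helper names `is_superkey` and `is_candidate_key` and the sample rows are hypothetical. A real key constraint must hold for every valid state of the schema, so a check against a single state can only rule a key out, never prove it.

```python
from itertools import combinations

def is_superkey(rows, attrs):
    """True if no two rows share the same combination of values for attrs."""
    seen = set()
    for row in rows:
        value = tuple(row[a] for a in attrs)
        if value in seen:
            return False
        seen.add(value)
    return True

def is_candidate_key(rows, attrs):
    """Unique, and no proper non-empty subset of attrs is unique (minimality)."""
    if not is_superkey(rows, attrs):
        return False
    return not any(
        is_superkey(rows, subset)
        for size in range(1, len(attrs))
        for subset in combinations(attrs, size)
    )

# Made-up sample state of a STUDENT relation
students = [
    {"SSN": "111", "Name": "Ann", "Major": "CS"},
    {"SSN": "222", "Name": "Bob", "Major": "CS"},
    {"SSN": "333", "Name": "Ann", "Major": "Math"},
]

print(is_candidate_key(students, ("SSN",)))         # unique and minimal
print(is_superkey(students, ("SSN", "Name")))       # unique, but not minimal
print(is_candidate_key(students, ("SSN", "Name")))  # fails minimality
```

In this particular state (Name, Major) also happens to be unique, which shows why keys are declared from the meaning of the data rather than inferred from the current rows.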
Foreign Key
A set of attributes in one relation that references the primary key in another relation
  Same domain(s)
  Value of foreign key either occurs as a value of primary key or is null
Entity and Referential Integrity
Entity integrity constraint
No primary key value can be null
Referential integrity constraint
Foreign key value can be either an existing primary key value or a null value
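A DBMS can enforce the referential integrity rule itself. The sketch below assumes Python's built-in sqlite3 module as a stand-in DBMS, with two cut-down tables and made-up rows. SQLite only checks foreign keys after `PRAGMA foreign_keys = ON`; other systems enforce them by default.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves foreign-key checks off by default
conn.execute("CREATE TABLE DEPT (DNUMBER INTEGER PRIMARY KEY, DNAME TEXT NOT NULL)")
conn.execute(
    "CREATE TABLE EMP ("
    " SSN TEXT PRIMARY KEY,"
    " DNO INTEGER REFERENCES DEPT(DNUMBER))"
)
conn.execute("INSERT INTO DEPT VALUES (5, 'Research')")
conn.execute("INSERT INTO EMP VALUES ('123456789', 5)")     # matches an existing primary key
conn.execute("INSERT INTO EMP VALUES ('987654321', NULL)")  # a null foreign key is allowed

try:
    conn.execute("INSERT INTO EMP VALUES ('333445555', 9)")  # there is no department 9
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

emp_count = conn.execute("SELECT COUNT(*) FROM EMP").fetchone()[0]
print(rejected, emp_count)
```

The first two inserts succeed because the foreign key value is either an existing DNUMBER or null; the third is refused and leaves the table unchanged.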
Update Operations
Update operations
  Insert a tuple (p. 76)
  Delete a tuple (p. 77)
  Update a tuple (p. 78)
Maintain integrity constraints
  Child insert restrict
  Child update restrict
  Parent update restrict
  Parent delete restrict
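The four restrict rules can be tried out one by one. This is a minimal sketch that assumes Python's built-in sqlite3 module, a parent table DEPT and a child table EMP with made-up rows, and a hypothetical helper `attempt` that reports whether a statement was accepted.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # required for SQLite to check foreign keys
conn.execute("CREATE TABLE DEPT (DNUMBER INTEGER PRIMARY KEY)")
conn.execute(
    "CREATE TABLE EMP ("
    " SSN TEXT PRIMARY KEY,"
    " DNO INTEGER REFERENCES DEPT(DNUMBER)"
    " ON DELETE RESTRICT ON UPDATE RESTRICT)"
)
conn.execute("INSERT INTO DEPT VALUES (4)")
conn.execute("INSERT INTO DEPT VALUES (5)")
conn.execute("INSERT INTO EMP VALUES ('123456789', 5)")

def attempt(sql):
    """Run one statement and report whether the DBMS accepted it."""
    try:
        conn.execute(sql)
        return "ok"
    except sqlite3.IntegrityError:
        return "rejected"

child_insert = attempt("INSERT INTO EMP VALUES ('999887777', 9)")         # no parent 9
child_update = attempt("UPDATE EMP SET DNO = 9 WHERE SSN = '123456789'")  # no parent 9
parent_update = attempt("UPDATE DEPT SET DNUMBER = 6 WHERE DNUMBER = 5")  # 5 is referenced
parent_delete = attempt("DELETE FROM DEPT WHERE DNUMBER = 5")             # 5 is referenced
unreferenced_delete = attempt("DELETE FROM DEPT WHERE DNUMBER = 4")       # nothing refers to 4
print(child_insert, child_update, parent_update, parent_delete, unreferenced_delete)
```

Only the last statement goes through, because no child tuple refers to department 4.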
Relational Database Schema
Exercise 3.16
Consider the following relations for a database that keeps track of student enrollment in courses and the books adopted for each course:
STUDENT(SSN, Name, Major, Bdate)
COURSE(Course#, Cname, Dept)
ENROLL(SSN, Course#, Quarter, Grade)
BOOK_ADOPTION(Course#, Quarter, Book_ISBN)
TEXT(Book_ISBN, Book_Title, Publisher, Author)
Specify the foreign keys for this schema, stating any assumptions you make.
SQL
Structured query language (SQL)
  SQL-86 or SQL1
  SQL-92 or SQL2
  SQL-99 or SQL3
Comprehensive database language
  Data definition (DDL)
  Data manipulation (DML)
    Query
    Update
Data Definition Language (DDL)
Used to CREATE, DROP, and ALTER the descriptions of the tables (relations) of a database
Data types
  Numeric
  Character string
  Boolean
  Date/time
CREATE TABLE
Specifies a new table's name, its attributes, and their data types
A constraint NOT NULL may be specified on an attribute
CREATE TABLE DEPARTMENT (
    DNAME VARCHAR(10) NOT NULL,
    DNUMBER INTEGER NOT NULL,
    MGRSSN CHAR(9),
    MGRSTARTDATE CHAR(9) );
CREATE TABLE
Use the CREATE TABLE command for specifying
  Primary key attributes
  Secondary keys, and
  Referential integrity constraints (foreign keys)
CREATE TABLE DEPT (
    DNAME VARCHAR(10) NOT NULL,
    DNUMBER INTEGER NOT NULL,
    MGRSSN CHAR(9),
    MGRSTARTDATE CHAR(9),
    PRIMARY KEY (DNUMBER),
    UNIQUE (DNAME),
    FOREIGN KEY (MGRSSN) REFERENCES EMP );
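Each clause of the DEPT definition can be seen at work by trying to break it. The sketch below assumes Python's built-in sqlite3 module. The one-column EMP table is a minimal stand-in for the referenced relation, and the helper `violates` and the sample rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # required for SQLite to check foreign keys
conn.execute("CREATE TABLE EMP (SSN CHAR(9) PRIMARY KEY)")  # stand-in for the referenced table
conn.execute("""
    CREATE TABLE DEPT (
        DNAME        VARCHAR(10) NOT NULL,
        DNUMBER      INTEGER     NOT NULL,
        MGRSSN       CHAR(9),
        MGRSTARTDATE CHAR(9),
        PRIMARY KEY (DNUMBER),
        UNIQUE (DNAME),
        FOREIGN KEY (MGRSSN) REFERENCES EMP
    )
""")
conn.execute("INSERT INTO EMP VALUES ('333445555')")
conn.execute("INSERT INTO DEPT VALUES ('Research', 5, '333445555', NULL)")

def violates(sql):
    """True if the DBMS rejects the statement with an integrity error."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True

dup_number = violates("INSERT INTO DEPT VALUES ('Sales', 5, NULL, NULL)")             # PRIMARY KEY
dup_name = violates("INSERT INTO DEPT VALUES ('Research', 6, NULL, NULL)")            # UNIQUE
missing_name = violates("INSERT INTO DEPT VALUES (NULL, 7, NULL, NULL)")              # NOT NULL
unknown_manager = violates("INSERT INTO DEPT VALUES ('Admin', 4, '000000000', NULL)") # FOREIGN KEY
print(dup_number, dup_name, missing_name, unknown_manager)
```

All four inserts are refused, one for each constraint declared in the CREATE TABLE statement.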
DROP TABLE and ALTER TABLE
Remove a relation (base table) and its definition
DROP TABLE DEPENDENT;
Add an attribute to one of the base relations
ALTER TABLE EMPLOYEE ADD JOB VARCHAR(12);
Retrieval Queries in SQL
One basic statement for retrieving information from a database
  SELECT statement
Basic form is a SELECT-FROM-WHERE block
SELECT <attribute list>
FROM <table list>
WHERE <condition>
Simple SQL Queries
Query 0:
Retrieve the birthdate and address of the employee whose name is 'John B. Smith'
SELECT BDATE, ADDRESS
FROM EMPLOYEE
WHERE FNAME='John'
AND MINIT='B'
AND LNAME='Smith';
Simple SQL Queries
Query 1:
Retrieve the name and address of all employees who work for the 'Research' department
SELECT FNAME, LNAME, ADDRESS
FROM EMPLOYEE, DEPARTMENT
WHERE DNAME='Research'
AND DNUMBER=DNO;
DNAME='Research' is a selection condition
DNUMBER=DNO is a join condition
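Query 1 can be run against a small data set to see the selection and join conditions work together. This is a minimal sketch assuming Python's built-in sqlite3 module. The tables are cut down to the columns the query needs, the rows are made up, and the ORDER BY is added only so the output order is predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE DEPARTMENT (DNAME TEXT, DNUMBER INTEGER PRIMARY KEY);
    CREATE TABLE EMPLOYEE (FNAME TEXT, LNAME TEXT, ADDRESS TEXT, DNO INTEGER);
    INSERT INTO DEPARTMENT VALUES ('Research', 5), ('Administration', 4);
    INSERT INTO EMPLOYEE VALUES
        ('John', 'Smith', 'Houston', 5),
        ('Alicia', 'Zelaya', 'Spring', 4),
        ('Ramesh', 'Narayan', 'Humble', 5);
""")

rows = conn.execute("""
    SELECT FNAME, LNAME, ADDRESS
    FROM EMPLOYEE, DEPARTMENT
    WHERE DNAME = 'Research'   -- selection condition
      AND DNUMBER = DNO        -- join condition
    ORDER BY LNAME
""").fetchall()

for row in rows:
    print(row)
```

Only the two employees whose DNO matches the DNUMBER of the 'Research' row are returned.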
Simple SQL Queries
Query 2:
For every project located in 'Stafford', list the project number, the controlling department number, and the department manager's last name, address, and birthdate
SELECT PNUMBER, DNUM, LNAME, ADDRESS, BDATE
FROM PROJECT, DEPARTMENT, EMPLOYEE
WHERE DNUM=DNUMBER AND MGRSSN=SSN
AND PLOCATION='Stafford';
There are two join conditions
  DNUM=DNUMBER relates a project to its controlling department
  MGRSSN=SSN relates the controlling department to the employee who manages that department
Aliases
A query that refers to two or more attributes with the same name must qualify the attribute name with the relation name
Some queries need to refer to the same relation twice
Query 8:
For each employee, retrieve the employee's name and the name of his or her immediate supervisor
SELECT E.FNAME, E.LNAME, S.FNAME, S.LNAME
FROM EMPLOYEE E, EMPLOYEE S
WHERE E.SUPERSSN=S.SSN;
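The aliases E and S act as two separate copies of EMPLOYEE, one read as the employee and one as the supervisor. The sketch below assumes Python's built-in sqlite3 module, a cut-down EMPLOYEE table, and made-up rows; the ORDER BY is added only for a predictable output order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE EMPLOYEE (FNAME TEXT, LNAME TEXT, SSN TEXT, SUPERSSN TEXT);
    INSERT INTO EMPLOYEE VALUES
        ('James', 'Borg', '888665555', NULL),
        ('Franklin', 'Wong', '333445555', '888665555'),
        ('John', 'Smith', '123456789', '333445555');
""")

pairs = conn.execute("""
    SELECT E.FNAME, E.LNAME, S.FNAME, S.LNAME
    FROM EMPLOYEE E, EMPLOYEE S      -- E: the employee, S: the supervisor
    WHERE E.SUPERSSN = S.SSN
    ORDER BY E.LNAME
""").fetchall()

for pair in pairs:
    print(pair)
```

The employee with a null SUPERSSN does not appear on the employee side of any pair, since a null value never satisfies the equality in the WHERE clause.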
Unspecified WHERE Clause
Query 9:
Retrieve the SSN values for all employees
SELECT SSN
FROM EMPLOYEE;
Query 10:
Retrieve the SSN and department name values for all employees
SELECT SSN, DNAME
FROM EMPLOYEE, DEPARTMENT;
With no WHERE clause, the result is the CARTESIAN PRODUCT of the two relations
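The size of the result makes the Cartesian product visible. A minimal sketch, assuming Python's built-in sqlite3 module and made-up rows (three employees, two departments), compares Query 10 with the same query plus a join condition:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE EMPLOYEE (SSN TEXT, DNO INTEGER);
    CREATE TABLE DEPARTMENT (DNAME TEXT, DNUMBER INTEGER);
    INSERT INTO EMPLOYEE VALUES ('123456789', 5), ('333445555', 5), ('999887777', 4);
    INSERT INTO DEPARTMENT VALUES ('Research', 5), ('Administration', 4);
""")

# Query 10 as written: every employee is paired with every department
all_pairs = conn.execute("SELECT SSN, DNAME FROM EMPLOYEE, DEPARTMENT").fetchall()

# Adding the join condition keeps only the matching pairs
joined = conn.execute(
    "SELECT SSN, DNAME FROM EMPLOYEE, DEPARTMENT WHERE DNO = DNUMBER"
).fetchall()

print(len(all_pairs), len(joined))
```

Without the WHERE clause the result has 3 × 2 = 6 rows; with the join condition it has one row per employee.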
Use of Asterisk *
To retrieve all the attribute values of the selected tuples, use *
Q1C:
SELECT *
FROM EMPLOYEE
WHERE DNO=5;
Q1D:
SELECT *
FROM EMPLOYEE, DEPARTMENT
WHERE DNAME='Research' AND
DNO=DNUMBER;
Use of DISTINCT
To eliminate duplicate tuples in a query result, the keyword DISTINCT is used
Q11:
SELECT SALARY
FROM EMPLOYEE;
Q11A:
SELECT DISTINCT SALARY
FROM EMPLOYEE;
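The difference between Q11 and Q11A shows up as soon as two employees earn the same salary. A minimal sketch, assuming Python's built-in sqlite3 module and made-up salary values:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE EMPLOYEE (SSN TEXT, SALARY INTEGER);
    INSERT INTO EMPLOYEE VALUES
        ('1', 30000), ('2', 25000), ('3', 30000), ('4', 25000), ('5', 40000);
""")

# Q11: one value per employee, duplicates kept
all_salaries = [row[0] for row in conn.execute("SELECT SALARY FROM EMPLOYEE")]

# Q11A: each salary value appears once
distinct_salaries = [row[0] for row in conn.execute("SELECT DISTINCT SALARY FROM EMPLOYEE")]

print(sorted(all_salaries))
print(sorted(distinct_salaries))
```

Q11 returns five values, one per tuple, while Q11A returns the three different salary values. The results are sorted here only for display, since neither query guarantees an order.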
Set Operations
UNION, MINUS (EXCEPT in standard SQL), and INTERSECT operations
Query 4:
Make a list of all project numbers for projects that involve an employee whose last name is 'Smith' as a worker or as a manager of the department that controls the project
(SELECT PNUMBER
FROM PROJECT, DEPARTMENT, EMPLOYEE
WHERE DNUM=DNUMBER AND MGRSSN=SSN
AND LNAME='Smith')
UNION
(SELECT PNUMBER
FROM PROJECT, WORKS_ON, EMPLOYEE
WHERE PNUMBER=PNO AND ESSN=SSN
AND LNAME='Smith');
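UNION combines the two branches of Query 4 and removes duplicate rows. The sketch below assumes Python's built-in sqlite3 module, cut-down tables, and made-up rows, and it selects project numbers to match the query description. SQLite does not accept parentheses around the branches of a compound SELECT, so they are omitted here; UNION ALL is run alongside only to show the duplicate that UNION removes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE EMPLOYEE (SSN TEXT, LNAME TEXT);
    CREATE TABLE DEPARTMENT (DNUMBER INTEGER, MGRSSN TEXT);
    CREATE TABLE PROJECT (PNUMBER INTEGER, DNUM INTEGER);
    CREATE TABLE WORKS_ON (ESSN TEXT, PNO INTEGER);
    INSERT INTO EMPLOYEE VALUES ('1', 'Smith'), ('2', 'Wong'), ('3', 'Smith');
    INSERT INTO DEPARTMENT VALUES (5, '3'), (4, '2');
    INSERT INTO PROJECT VALUES (1, 4), (2, 5), (3, 5), (10, 4);
    INSERT INTO WORKS_ON VALUES ('1', 1), ('1', 2), ('2', 10);
""")

# Projects controlled by a department that a Smith manages
manager_branch = """
    SELECT PNUMBER FROM PROJECT, DEPARTMENT, EMPLOYEE
    WHERE DNUM = DNUMBER AND MGRSSN = SSN AND LNAME = 'Smith'
"""
# Projects a Smith works on
worker_branch = """
    SELECT PNUMBER FROM PROJECT, WORKS_ON, EMPLOYEE
    WHERE PNUMBER = PNO AND ESSN = SSN AND LNAME = 'Smith'
"""

union_rows = conn.execute(
    manager_branch + " UNION " + worker_branch + " ORDER BY PNUMBER"
).fetchall()
union_all_rows = conn.execute(manager_branch + " UNION ALL " + worker_branch).fetchall()

print(union_rows, len(union_all_rows))
```

Project 2 satisfies both branches, so UNION ALL returns it twice and UNION returns it once.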
Substring Matching
Query 12:
Retrieve all employees whose address is in Houston, Texas
SELECT FNAME, LNAME
FROM EMPLOYEE
WHERE ADDRESS LIKE '%Houston, TX';
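In a LIKE pattern, % stands for any sequence of zero or more characters, so '%Houston, TX' matches every address that ends with that text. A minimal sketch, assuming Python's built-in sqlite3 module and made-up addresses; the ORDER BY is added only for a predictable output order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE EMPLOYEE (FNAME TEXT, LNAME TEXT, ADDRESS TEXT);
    INSERT INTO EMPLOYEE VALUES
        ('John', 'Smith', '731 Fondren, Houston, TX'),
        ('Joyce', 'English', '5631 Rice, Houston, TX'),
        ('Alicia', 'Zelaya', '3321 Castle, Spring, TX');
""")

houston = conn.execute("""
    SELECT FNAME, LNAME
    FROM EMPLOYEE
    WHERE ADDRESS LIKE '%Houston, TX'
    ORDER BY LNAME
""").fetchall()

print(houston)
```

The pattern is matched character by character, so an address stored as 'Houston,TX' without the space would not be found.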
Arithmetic Operations
Query 13:
Show the resulting salaries if every employee on the 'ProductX' project is given a 10 percent raise
SELECT FNAME, LNAME, 1.1*SALARY AS INCREASED_SAL
FROM EMPLOYEE, WORKS_ON, PROJECT
WHERE SSN=ESSN AND PNO=PNUMBER AND PNAME='ProductX';
Ordering of Query Results
Query 15:
Retrieve a list of employees and the projects they are working on, ordered by department and, within each department, ordered alphabetically by last name, first name
SELECT DNAME, LNAME, FNAME, PNAME
FROM DEPARTMENT, EMPLOYEE, WORKS_ON, PROJECT
WHERE DNUMBER=DNO AND SSN=ESSN AND PNO=PNUMBER
ORDER BY DNAME, LNAME, FNAME;
Specifying Updates in SQL
There are three SQL commands to modify the database
  INSERT
  DELETE, and
  UPDATE
INSERT
U1:
INSERT INTO EMPLOYEE
VALUES ('Richard', 'K', 'Marini', '653298653', '1962-12-30', '98 Oak Forest,Katy,TX', 'M', 37000, '987654321', 4);
U1A:
INSERT INTO EMPLOYEE (FNAME, LNAME, SSN)
VALUES ('Richard', 'Marini', '653298653');
DELETE
U4:
DELETE FROM EMPLOYEE WHERE LNAME='Brown';
DELETE FROM EMPLOYEE WHERE SSN='123456789';
DELETE FROM EMPLOYEE WHERE DNO=5;
DELETE FROM EMPLOYEE;
UPDATE
U5:
Change the location and controlling department number of project number 10 to 'Bellaire' and 5, respectively
UPDATE PROJECT
SET PLOCATION = 'Bellaire', DNUM = 5
WHERE PNUMBER=10;
U6:
Give all employees in the 'Research' department a 10% raise in salary
UPDATE EMPLOYEE
SET SALARY = SALARY * 1.1
WHERE DNO IN (SELECT DNUMBER
FROM DEPARTMENT
WHERE DNAME='Research');
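The effect of U6 can be checked by counting the modified tuples. This sketch assumes Python's built-in sqlite3 module, cut-down tables, and made-up rows. It selects the department by name through a nested query, which is one way to express "the 'Research' department" without hard-coding its number.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE DEPARTMENT (DNAME TEXT, DNUMBER INTEGER);
    CREATE TABLE EMPLOYEE (SSN TEXT, SALARY REAL, DNO INTEGER);
    INSERT INTO DEPARTMENT VALUES ('Research', 5), ('Administration', 4);
    INSERT INTO EMPLOYEE VALUES ('1', 30000, 5), ('2', 25000, 4), ('3', 40000, 5);
""")

cursor = conn.execute("""
    UPDATE EMPLOYEE
    SET SALARY = SALARY * 1.1
    WHERE DNO IN (SELECT DNUMBER FROM DEPARTMENT WHERE DNAME = 'Research')
""")
rows_changed = cursor.rowcount  # number of tuples the UPDATE modified

salaries = dict(conn.execute("SELECT SSN, SALARY FROM EMPLOYEE"))
print(rows_changed, salaries)
```

Two tuples are modified, and the employee in department 4 keeps the original salary. The new salaries are floating-point values, so they are best compared with a small tolerance rather than for exact equality.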
Reading and Homework
Readings
  Chapters 1, 2, 3, and 4
Week one homework