Database Conceptual and Logical Design
Zachary G. Ives/Grigoris Karvounarakis
University of Pennsylvania
CIS 550 – Database & Information Systems
October 3, 2007
Some slide content courtesy of Susan Davidson & Raghu Ramakrishnan
2
Now: How Do We Get the Database in the First Place?
Database design theory!
Neat outcome: we can actually prove that we have an optimal design, in a manner of speaking…
But first we need to understand how to visualize designs in pretty pictures…
3
Databases Anonymous: A 6-Step Program
1. Requirements Analysis: what data, apps, critical operations
2. Conceptual DB Design: high-level description of data and constraints – typically using ER model
3. Logical DB Design: conversion into a schema
4. Schema Refinement: normalization (eliminating redundancy)
5. Physical DB Design: consider workloads, indexes and clustering of data
6. Application/Security Design
4
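Entity-Relationship Diagram (based on our running example)
[Figure: ER diagram. Entity sets: STUDENTS (sid, name), COURSES (serno, subj, cid), PROFESSORS (fid, name). Relationship sets: Takes (exp-grade) between STUDENTS and COURSES; Teaches (semester) between PROFESSORS and COURSES.]
Diagram labels: entity set; relationship set; attributes (recall these have domains)
Underlined attributes are keys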
5
Conceptual Design Process
What are the entities being represented?
What are the relationships?
What info (attributes) do we store about each?
What keys & integrity constraints do we have?
[Figure: fragment of the ER diagram showing STUDENTS (sid, name) connected to Takes (exp-grade)]
6
Translating Entity Sets to Logical Schemas & SQL DDL
CREATE TABLE STUDENTS (
    sid INTEGER,
    name VARCHAR(15),
    PRIMARY KEY (sid) )
CREATE TABLE COURSES (
    serno INTEGER,
    subj VARCHAR(30),
    cid CHAR(15),
    PRIMARY KEY (serno) )
Fairly straightforward to generate a schema…
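As a quick check of the two entity-set tables, here is a minimal sketch that runs the DDL through Python's built-in sqlite3 module. SQLite and the sample rows are assumptions made for illustration; the slides do not name a DBMS.

```python
import sqlite3

# In-memory database; table and column names follow the slide.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE STUDENTS (
    sid  INTEGER,
    name VARCHAR(15),
    PRIMARY KEY (sid)
);
CREATE TABLE COURSES (
    serno INTEGER,
    subj  VARCHAR(30),
    cid   CHAR(15),
    PRIMARY KEY (serno)
);
""")

conn.execute("INSERT INTO STUDENTS VALUES (1, 'Sam')")
try:
    # A second row with the same key value must be rejected.
    conn.execute("INSERT INTO STUDENTS VALUES (1, 'Nitin')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

student_rows = conn.execute("SELECT sid, name FROM STUDENTS").fetchall()
```

The key declared in the ER diagram becomes a PRIMARY KEY constraint, so the DBMS itself refuses two students with the same sid.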
7
Translating Relationship Sets
Generate schema with attributes consisting of:
• Key(s) of each associated entity (foreign keys)
• Descriptive attributes
CREATE TABLE Takes (
    sid INTEGER,
    serno INTEGER,
    exp_grade CHAR(1),
    PRIMARY KEY (?),
    FOREIGN KEY (serno) REFERENCES COURSES,
    FOREIGN KEY (sid) REFERENCES STUDENTS )
8
… OK, But What about Connectivity in the E-R Diagram?
Attributes can only be connected to entities or relationships
Entities can only be connected via relationships
As for the edges, let’s consider kinds of relationships and integrity constraints…
[Figure: PROFESSORS connected to COURSES via Teaches]
(warning: the book has a slightly different notation here!)
9
Logical Schema Design
Roughly speaking, each entity set or relationship set becomes a table (this is not always the case; see Monday)
Attributes associated with each entity set or relationship set become attributes of the relation; the key is also copied (ditto with foreign keys in a relationship set)
10
Binary Relationships & Participation
Binary relationships can be classified as 1:1, 1:Many, or Many:Many, as in:
[Figure: example diagrams of 1:1, 1:n, and m:n relationships]
11
1:Many (1:n) Relationships
Place an arrow in the many-to-one direction, i.e., toward the entity that is referenced via a foreign key
Suppose profs teach multiple courses, but may not have taught yet:
[Figure: PROFESSORS, Teaches, COURSES] Partial participation (0 or more…)
Suppose profs must teach to be on the roster:
[Figure: PROFESSORS, Teaches, COURSES] Total participation (1 or more…)
12
Many-to-Many Relationships
• Many-to-many relationships have no arrows on edges
• The “relationship set” relation has a key that includes the foreign keys, plus any other attributes specified as key
[Figure: STUDENTS connected to COURSES via Takes]
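The key rule above can be tried out with a short sketch, again assuming SQLite through Python's sqlite3 module. The composite key (sid, serno) is one reading of "a key that includes the foreign keys"; the sample rows and course serial numbers are made up, and exp_grade is spelled with an underscore because a hyphen is not valid in an unquoted SQL identifier.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves foreign-key checks off by default
conn.executescript("""
CREATE TABLE STUDENTS (sid INTEGER, name VARCHAR(15), PRIMARY KEY (sid));
CREATE TABLE COURSES (serno INTEGER, subj VARCHAR(30), cid CHAR(15), PRIMARY KEY (serno));
CREATE TABLE Takes (
    sid       INTEGER,
    serno     INTEGER,
    exp_grade CHAR(1),
    PRIMARY KEY (sid, serno),            -- assumed key: both foreign keys together
    FOREIGN KEY (sid)   REFERENCES STUDENTS,
    FOREIGN KEY (serno) REFERENCES COURSES
);
INSERT INTO STUDENTS VALUES (1, 'Sam'), (23, 'Nitin');
INSERT INTO COURSES VALUES (550, 'DB', 'CIS550'), (570, 'AI', 'CIS570');
""")

# Many-to-many: one student in several courses, one course with several students.
conn.execute("INSERT INTO Takes VALUES (1, 550, 'B')")
conn.execute("INSERT INTO Takes VALUES (1, 570, 'A')")
conn.execute("INSERT INTO Takes VALUES (23, 550, 'A')")

def rejected(sql):
    """Return True if the statement violates an integrity constraint."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True

same_pair_twice = rejected("INSERT INTO Takes VALUES (1, 550, 'C')")   # key violation
unknown_student = rejected("INSERT INTO Takes VALUES (99, 550, 'A')")  # foreign-key violation
takes_count = conn.execute("SELECT COUNT(*) FROM Takes").fetchone()[0]
```

Neither sid nor serno alone could be the key here, since both repeat across rows; only the pair is unique.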
13
Examples
Suppose courses must be taught to be on the roster
Suppose students must have enrolled in at least one course
14
Representing 1:n Relationships in Tables
• Key of relationship set:
CREATE TABLE Teaches (
    fid INTEGER,
    serno INTEGER,
    semester CHAR(4),
    PRIMARY KEY (serno),
    FOREIGN KEY (fid) REFERENCES PROFESSORS,
    FOREIGN KEY (serno) REFERENCES COURSES )
• Or embed relationship in “many” entity set:
CREATE TABLE Teaches_Course (
    serno INTEGER,
    subj VARCHAR(30),
    cid CHAR(15),
    fid INTEGER,
    name CHAR(40),
    PRIMARY KEY (serno),
    FOREIGN KEY (fid) REFERENCES PROFESSORS )
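To see why serno alone works as the key of a 1:n Teaches table, here is a small sketch run through Python's sqlite3 module. The PROFESSORS table definition and all sample rows are assumptions, since the slides do not give them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # enable foreign-key enforcement in SQLite
conn.executescript("""
-- PROFESSORS columns are assumed from the ER diagram (fid, name).
CREATE TABLE PROFESSORS (fid INTEGER, name VARCHAR(40), PRIMARY KEY (fid));
CREATE TABLE COURSES (serno INTEGER, subj VARCHAR(30), cid CHAR(15), PRIMARY KEY (serno));
CREATE TABLE Teaches (
    fid      INTEGER,
    serno    INTEGER,
    semester CHAR(4),
    PRIMARY KEY (serno),
    FOREIGN KEY (fid)   REFERENCES PROFESSORS,
    FOREIGN KEY (serno) REFERENCES COURSES
);
INSERT INTO PROFESSORS VALUES (1, 'Prof A'), (2, 'Prof B');
INSERT INTO COURSES VALUES (550, 'DB', 'CIS550'), (570, 'AI', 'CIS570');
""")

# One professor may teach many courses...
conn.execute("INSERT INTO Teaches VALUES (1, 550, 'F07')")
conn.execute("INSERT INTO Teaches VALUES (1, 570, 'F07')")

# ...but a course can appear only once, so it has at most one professor.
try:
    conn.execute("INSERT INTO Teaches VALUES (2, 550, 'F07')")
    second_prof_rejected = False
except sqlite3.IntegrityError:
    second_prof_rejected = True

courses_of_prof_1 = [r[0] for r in conn.execute(
    "SELECT serno FROM Teaches WHERE fid = 1 ORDER BY serno")]
```

The same constraint is what makes the embedded Teaches_Course variant possible: each course row has room for exactly one fid.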
15
1:1 Relationships
If you borrow money or have credit, you might get:
What are the table options?
[Figure: ER diagram. Borrower (ssn, name, debt) and CreditReport (rid, delinquent?) connected 1:1 by Describes.]
16
Roles: Labeled Edges
Sometimes a relationship connects the same entity, and the entity has more than one role:
This often indicates the need for recursive queries
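[Figure: ER diagram. Parts (id, name) connected to the relationship set Includes (qty) by two edges labeled with the roles Assembly and Subpart.]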
17
DDL for Role Example
CREATE TABLE Parts (
    Id INTEGER,
    Name CHAR(15),
    …
    PRIMARY KEY (Id) )
CREATE TABLE Includes (
    Assembly INTEGER,
    Subpart INTEGER,
    Qty INTEGER,
    PRIMARY KEY (Assembly, Subpart),
    FOREIGN KEY (Assembly) REFERENCES Parts,
    FOREIGN KEY (Subpart) REFERENCES Parts )
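The remark that roles often call for recursive queries can be illustrated with a recursive common table expression. This is a sketch under assumptions: it uses SQLite via Python's sqlite3 module, leaves out the columns elided by "…" in Parts, and invents a tiny parts hierarchy; the CTE name Reachable is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Parts (Id INTEGER, Name CHAR(15), PRIMARY KEY (Id));
CREATE TABLE Includes (
    Assembly INTEGER,
    Subpart  INTEGER,
    Qty      INTEGER,
    PRIMARY KEY (Assembly, Subpart),
    FOREIGN KEY (Assembly) REFERENCES Parts,
    FOREIGN KEY (Subpart)  REFERENCES Parts
);
-- Made-up data: a bike includes wheels and a frame; a wheel includes spokes.
INSERT INTO Parts VALUES (1, 'bike'), (2, 'wheel'), (3, 'spoke'), (4, 'frame');
INSERT INTO Includes VALUES (1, 2, 2), (2, 3, 32), (1, 4, 1);
""")

# Follow Assembly -> Subpart edges transitively, starting from one part id.
rows = conn.execute("""
WITH RECURSIVE Reachable(part) AS (
    SELECT Subpart FROM Includes WHERE Assembly = ?
    UNION
    SELECT i.Subpart
    FROM Includes i JOIN Reachable r ON i.Assembly = r.part
)
SELECT p.Name FROM Reachable r JOIN Parts p ON p.Id = r.part
ORDER BY p.Name
""", (1,)).fetchall()

all_subparts_of_bike = [name for (name,) in rows]
```

A single join on Includes would return only the direct subparts; the recursion is what reaches the spokes two levels down.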
18
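Roles vs. Separate Entities
[Figure: two ER diagrams. First: separate entity sets Husband (id, name) and Wife (id, name) connected by Married. Second: a single entity set Person (id, name) connected to Married by two edges labeled with the roles Husband and Wife.]
What is the difference between these two representations?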
19
ISA Relationships: Subclassing (Structurally)
Inheritance states that one entity is a “special kind” of another entity: “subclass” should be member of “base class”
[Figure: ER diagram. Employees (salary) ISA People (id, name).]
20
But How Does this Translate into the Relational Model?
Compare these options:
• Two tables, disjoint tuples
• Two tables, disjoint attributes
• One table with NULLs
• Object-relational databases
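As one concrete point of comparison, the sketch below tries the "two tables, disjoint attributes" option for the People/Employees example, using SQLite through Python's sqlite3 module. The column types, the cascade on delete, and the sample rows are assumptions; the other options would be laid out differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # enable foreign-key enforcement in SQLite
conn.executescript("""
-- Base class: attributes shared by everyone.
CREATE TABLE People (id INTEGER, name VARCHAR(20), PRIMARY KEY (id));
-- Subclass: only the extra attribute, keyed by the same id.
CREATE TABLE Employees (
    id     INTEGER,
    salary REAL,
    PRIMARY KEY (id),
    FOREIGN KEY (id) REFERENCES People ON DELETE CASCADE
);
INSERT INTO People VALUES (1, 'Ann'), (2, 'Bob');
INSERT INTO Employees VALUES (2, 50000.0);
""")

# Reassembling a full employee record takes a join on id.
employees = conn.execute("""
SELECT p.id, p.name, e.salary
FROM People p JOIN Employees e ON e.id = p.id
""").fetchall()

# "Subclass should be member of base class": an employee with no People row is rejected.
try:
    conn.execute("INSERT INTO Employees VALUES (99, 1.0)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True
```

The trade-off visible here is that no attribute is stored twice and no NULLs are needed, at the price of a join whenever an employee's name is wanted.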
21
Weak Entities
A weak entity can only be identified uniquely using the primary key of another (owner) entity.
• Owner and weak entity sets in a one-to-many relationship set, 1 owner : many weak entities
• Weak entity set must have total participation
[Figure: ER diagram. People (ssn, name) connected to Pets (name, species) via Feeds (weeklyCost).]
22
Translating Weak Entity Sets
Weak entity set and identifying relationship set are translated into a single table; when the owner entity is deleted, all owned weak entities must also be deleted
CREATE TABLE Feed_Pets (
    pname VARCHAR(20),
    species INTEGER,
    weeklyCost REAL,
    ssn CHAR(11) NOT NULL,
    PRIMARY KEY (pname, ssn),
    FOREIGN KEY (ssn) REFERENCES People
        ON DELETE CASCADE )
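The deletion rule can be observed directly with a minimal sketch, assuming SQLite via Python's sqlite3 module. The People table definition and the sample rows are made up for illustration, and the pet-name column is called pname so that it matches the primary key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # cascades only fire when enforcement is on
conn.executescript("""
CREATE TABLE People (ssn CHAR(11), name VARCHAR(20), PRIMARY KEY (ssn));
CREATE TABLE Feed_Pets (
    pname      VARCHAR(20),
    species    INTEGER,
    weeklyCost REAL,
    ssn        CHAR(11) NOT NULL,
    PRIMARY KEY (pname, ssn),
    FOREIGN KEY (ssn) REFERENCES People ON DELETE CASCADE
);
INSERT INTO People VALUES ('111-11-1111', 'Ann'), ('222-22-2222', 'Bob');
-- Two owners may each have a pet called Rex: pname is only unique per owner.
INSERT INTO Feed_Pets VALUES
    ('Rex', 1, 12.5, '111-11-1111'),
    ('Tom', 2, 7.0,  '111-11-1111'),
    ('Rex', 1, 9.0,  '222-22-2222');
""")

# Deleting the owner removes every weak entity that depends on it.
conn.execute("DELETE FROM People WHERE ssn = '111-11-1111'")
remaining_pets = conn.execute("SELECT pname, ssn FROM Feed_Pets").fetchall()
```

The NOT NULL on ssn expresses total participation: a pet row cannot exist without naming its owner.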
23
N-ary Relationships
Relationship sets can relate an arbitrary number of entity sets:
[Figure: ternary relationship set IndepStudy connecting the entity sets Student, Project, and Advisor]
24
Summary of ER Diagrams
One of the primary ways of designing logical schemas
CASE tools exist built around ER (e.g. ERWin, PowerBuilder, etc.)
• Translate the design automatically into DDL, XML, UML, etc.
• Use a slightly different notation that is better suited to graphical displays
• Some tools support constraints beyond what ER diagrams can capture
Can you get different ER diagrams from the same data?
25
Schema Refinement & Design Theory
ER Diagrams give us a start in logical schema design
Sometimes we need to refine our designs further
• There’s a system and theory for this
• Focus is on redundancy of data
Let’s briefly touch on one key concept in preparation for Monday’s lecture on normalization…
26
Not All Designs are Equally Good
Why is this a poor schema design?
Stuff(sid, name, cid, subj, grade)
And why is this one better?
Student(sid, name)
Course(cid, subj)
Takes(sid, cid, exp-grade)
27
Focus on the Bad Design
• Certain items (e.g., name) get repeated
• Some information requires that a student be enrolled (e.g., courses) due to the key

sid   name    cid   subj   exp-grade
1     Sam     570   AI     B
23    Nitin   550   DB     A
45    Jill    505   OS     A
1     Sam     505   OS     C
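The redundancy in this instance, and what the three-table design does about it, can be checked with a few lines of plain Python. This is an illustrative sketch only: the rows are the ones in the table above, and the variable names are hypothetical.

```python
# Instance of the single wide table: (sid, name, cid, subj, exp_grade)
stuff = {
    (1, "Sam", 570, "AI", "B"),
    (23, "Nitin", 550, "DB", "A"),
    (45, "Jill", 505, "OS", "A"),
    (1, "Sam", 505, "OS", "C"),
}

# Project onto the three smaller schemas; sets drop the repeated facts.
student = {(sid, name) for sid, name, _, _, _ in stuff}
course = {(cid, subj) for _, _, cid, subj, _ in stuff}
takes = {(sid, cid, grade) for sid, _, cid, _, grade in stuff}

# Join the pieces back together on sid and cid.
rejoined = {
    (sid, name, cid, subj, grade)
    for sid, cid, grade in takes
    for s2, name in student if s2 == sid
    for c2, subj in course if c2 == cid
}

# How many times each fact is stored in the wide table.
times_sam_stored = sum(1 for row in stuff if row[:2] == (1, "Sam"))
times_os_stored = sum(1 for row in stuff if row[2:4] == (505, "OS"))
```

Sam's name and the OS course each appear twice in the wide table but once after the split, and joining the three pieces reproduces the original rows exactly, so nothing was lost.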
28
Functional Dependencies Describe “Key-Like” Relationships
A key is a set of attributes where:
• If keys match, then the tuples match
A functional dependency (FD) is a generalization:
• If an attribute set determines another, written A → B, then if two tuples agree on A, they must agree on B:
sid → Address
What other FDs are there in this data?
FDs are independent of our schema design choice
29
Formal Definition of FD’s
Def. Given a relation scheme R (a set of attributes) and subsets X, Y of R:
An instance r of R satisfies FD X → Y if, for any two tuples t1, t2 ∈ r, t1[X] = t2[X] implies t1[Y] = t2[Y]
For an FD to hold for scheme R, it must hold for every possible instance r of R
(Can a DBMS verify this? Can we determine this by looking at an instance?)
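The definition translates almost word for word into a check over a single instance. The sketch below is plain Python; the function name satisfies_fd and the dictionary representation of tuples are assumptions made for illustration, and the rows are the sample instance from the earlier slide.

```python
from itertools import combinations

def satisfies_fd(rows, lhs, rhs):
    """Return True if every pair of rows that agrees on lhs also agrees on rhs."""
    for t1, t2 in combinations(rows, 2):
        agree_on_lhs = all(t1[a] == t2[a] for a in lhs)
        differ_on_rhs = any(t1[b] != t2[b] for b in rhs)
        if agree_on_lhs and differ_on_rhs:
            return False
    return True

# Sample instance r, one dict per tuple.
r = [
    {"sid": 1,  "name": "Sam",   "cid": 570, "subj": "AI", "exp_grade": "B"},
    {"sid": 23, "name": "Nitin", "cid": 550, "subj": "DB", "exp_grade": "A"},
    {"sid": 45, "name": "Jill",  "cid": 505, "subj": "OS", "exp_grade": "A"},
    {"sid": 1,  "name": "Sam",   "cid": 505, "subj": "OS", "exp_grade": "C"},
]

sid_to_name = satisfies_fd(r, ["sid"], ["name"])            # both sid=1 rows say Sam
cid_to_subj = satisfies_fd(r, ["cid"], ["subj"])            # both cid=505 rows say OS
sid_to_cid = satisfies_fd(r, ["sid"], ["cid"])              # sid=1 has cid 570 and 505
pair_to_grade = satisfies_fd(r, ["sid", "cid"], ["exp_grade"])
```

A check like this can show that an instance violates an FD, but a passing result only says that this particular instance happens to satisfy it; whether the FD holds for the scheme is a statement about every possible instance.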
30
General Thoughts on Good Schemas
We want all attributes in every tuple to be determined by the tuple’s key attributes
• What does this say about redundancy?
But:
• What about tuples that don’t have keys (other than the entire value)?
• What about the fact that every attribute determines itself?
Stay tuned for Monday!