26
Logical Database Design (1 of 3) John Ortiz

Logical Database Design (1 of 3)

  • Upload
    sinjin

  • View
    60

  • Download
    2

Embed Size (px)

DESCRIPTION

Logical Database Design (1 of 3). John Ortiz. Introduction. The logical design is a process of refining DB schema based on functional dependencies. Why is a logical design needed? - PowerPoint PPT Presentation

Citation preview

Page 1: Logical Database Design (1 of 3)

Logical Database Design (1 of 3)

John Ortiz

Page 2: Logical Database Design (1 of 3)

Lecture 6 Logical Database Design (1) 2

Introduction The logical design is a process of refining

DB schema based on functional dependencies.

Why is a logical design needed? Relations translated from ER may have

introduced data redundancy which leads to inconsistencies and update anomalies.

What is involved in logical design? Check for normal forms If necessary, decompose relations

Page 3: Logical Database Design (1 of 3)

Lecture 6 Logical Database Design (1) 3

A Database with RedundancyAssume all students of same major are

assigned to the same academic advisor.SID Name Major GPA Advisor Office 2011 John CS 3.4 Smith 3345 1235 Carl CS 3.2 Smith 3345 1003 Ken Math 3.5 Johnson 1120 1034 Bill Math 2.5 Johnson 1120 2005 Mary CS 2.9 Smith 3345 2078 Frank Math 4.0 Johnson 1120 1922 Scott Chem 3.45 Ford 2525

Students

(FD): major Advisor, Office What problems does it have?

Page 4: Logical Database Design (1 of 3)

Lecture 6 Logical Database Design (1) 4

Update Anomalies Insertion anomaly: How to add a new

major? Modification anomaly: What would happen

if we change office of Smith in the first tuple?

Deletion anomaly: What would happen if Scott is deleted?

SID Name Major GPA Advisor Office 2011 John CS 3.4 Smith 3345 1235 Carl CS 3.2 Smith 3345 1003 Ken Math 3.5 Johnson 1120 1034 Bill Math 2.5 Johnson 1120 2005 Mary CS 2.9 Smith 3345 2078 Frank Math 4.0 Johnson 1120 1922 Scott Chem 3.45 Ford 2525

Students

Page 5: Logical Database Design (1 of 3)

Lecture 6 Logical Database Design (1) 5

A Better Design Decompose Students into two relations

Major Advisor Office CS Smith 3345 Math Johnson 1120 Chem Ford 2525

Major_AdvisorSID Name Major GPA 2011 John CS 3.4 1235 Carl CS 3.2 1003 Ken Math 3.5 1034 Bill Math 2.5 2005 Mary CS 2.9 2078 Frank Math 4.0 1922 Scott Chem 3.45

Students

Decomposition can remove redundancy It may also cause problems if not done

carefully.

Page 6: Logical Database Design (1 of 3)

Lecture 6 Logical Database Design (1) 6

Cardinality Ratio The number of relationship instances that

an entity can participate in. There are three:

One-to-one (1:1) One-to-Many (1:M) Many-to-many (M:N)

Note: “1” may mean either zero or one and “many” may mean zero to the DBMS maximum

Page 7: Logical Database Design (1 of 3)

Lecture 6 Logical Database Design (1) 7

1:1 Relationships A one-to-one relationship simply means

that for any single instance of one entity, it may participate in zero or one instance in the related entity

E.g. Employee manages Department An employee may or may not be a department

manager, however, each department needs one manager

Employee has partial participation, generally a department would have total participation

Page 8: Logical Database Design (1 of 3)

Lecture 6 Logical Database Design (1) 8

1:1 Relationships The foreign key should be placed in the

entity with total participation If both have total participation, the location

of the foreign key is arbitrary Additionally, if both participations are total,

the two entities may be merged Why would you not want to do this?

Access Restrictions Special Attributes Referenced by another entity Performance in Distributed Databases

Page 9: Logical Database Design (1 of 3)

Lecture 6 Logical Database Design (1) 9

1:1 Relationships Access Restrictions Special Attributes Referenced by another entity

Salary Med RecsEmp_ID

AIRLINE EMPLOYEES

1

1

Flying HoursEmp_ID1 Pilots only!

Confidential!

Tool Bag ID Color

Emp_ID Job StartDate Address Tool Bag ID

Storage Bin Tool Bag ID

Maint only!

Page 10: Logical Database Design (1 of 3)

Lecture 6 Logical Database Design (1) 10

1:M Relationships A one-to-many relationship means that for

any single instance of one entity, it may participate in zero or more instances in the related entity, but the related entity may only participate in zero or one instance.

Subordinate to supervisor. Each subordinate generally will have one supervisor, but each supervisor usually may have zero or more employees.

Not absolute! Sometimes a subordinate may have >1 supervisors – consider this in design!

Page 11: Logical Database Design (1 of 3)

Lecture 6 Logical Database Design (1) 11

1:M Relationships The foreign key always goes in the “many”

part of the relationship! This is counter-intuitive, be careful! The FK must always reference a PK in the

related entity or be null. referential integrity

What results from placing the FK in the “one” part of the relationship?

Can part of a PK also be a FK? That coupled with the “existence dependancy”

indicates which type of entity?

Page 12: Logical Database Design (1 of 3)

Lecture 6 Logical Database Design (1) 12

1:M Relationship Design Thought Consider the attribute “class” of a

STUDENT entity. Given that all possible values are “Freshman”, “Sophomore”, “Junior”, “Senior”, or “Grad Student.” Is there any reason to place it in its own separate entity?

Page 13: Logical Database Design (1 of 3)

Lecture 6 Logical Database Design (1) 13

M:N Relationships A many-to-many relationship means that for

any single instance of one entity, it may participate in zero or more instances in the related entity, and the related entity may participate in zero or more instances as well.

Subordinate to supervisor. Consider allowing each subordinate to have zero or more supervisors.

Now where does the FK go?

Page 14: Logical Database Design (1 of 3)

Lecture 6 Logical Database Design (1) 14

M:N Relationships Actually, there are two FKs, one for each

entity. These two keys are placed in their own

entity. Together they make up the PK of that entity. In this case, all FKs must reference a valid

PK in the related entity – why? This entity may be called a link table, join

table, junction table, or simply many-to-many relationship.

Page 15: Logical Database Design (1 of 3)

Lecture 6 Logical Database Design (1) 15

M:N Relationships The link table may contain it’s own

attributes. For example, an evaluation by a supervisor. That evaluation may NOT go in the

subordinate entity, because there may be others from different supervisors.

It can not go in the supervisor entity because there may be more than one subordinate.

Only the subordinate/supervisor entities together can uniquely determine that evaluation.

Consider course grade vs. G.P.A.

Page 16: Logical Database Design (1 of 3)

Lecture 6 Logical Database Design (1) 16

Functional Dependencies (FDs) Let R(A1, ..., An) be a relation schema. Let

X and Y be two subsets of {A1, ..., An}, i.e., X, Y R. R satisfies a functional dependency, denoted by X Y, if in every legal instance r(R), for any pair of tuples t1 and t2 in r(R), if t1[X] = t2[X], then t1[Y] = t2[Y].

If X Y, we say that “X (functionally) determines Y”.

X may also be called a determinant FDs are about ALL states of a relation.

Page 17: Logical Database Design (1 of 3)

Lecture 6 Logical Database Design (1) 17

An Example of FDs Consider a relation Student(SID, Name, Address, Phone,

Major)Some FDs may be: S NM, NA P, … Notations

Use single letter for attributes, i.e., S for SID, N for Name, A for Address, …

Use the relation name for the set of attributes of relation, i.e., Students for {SNAPM}

Page 18: Logical Database Design (1 of 3)

Lecture 6 Logical Database Design (1) 18

Properties of FDs X Y says redundant X-values always

cause the redundancy of Y-values. FDs are given by DBAs or DB designers. FDs are enforced/guaranteed by DBMS. Given an instance r of a relation R, we

can only determine that some FD is not satisfied by R, but can not determine if an FD is satisfied by R.

Page 19: Logical Database Design (1 of 3)

Lecture 6 Logical Database Design (1) 20

Which FD is satisfied? Which FD does R(A, B, C, D) satisfy, if the

following instance is the only instance of R?

A B, A C, C A, A D, B D, AB D

A B C D A1 B1 C1 D1 A1 B2 C1 D2 A2 B2 C2 D2 A2 B3 C2 D3 A3 B3 C2 D4

R

Page 20: Logical Database Design (1 of 3)

Lecture 6 Logical Database Design (1) 21

Closure of FDs Given some FDs, new FDs can often be

inferred. E.g., from S NM and M D, we can infer S D.

Let F be a set of FDs satisfied by R, and X, Y, K be subsets of attributes of R.

F (logically) implies X Y if whenever R satisfies (all FDs in) F, R also satisfies X Y.

The closure of F, denoted by F+, is the set of FDs implied by F, that is,

F+ = {X Y | F implies X Y}.

Page 21: Logical Database Design (1 of 3)

Lecture 6 Logical Database Design (1) 22

Candidate Key & Trivial FDs Let F be a set of FDs satisfied by R, and K

R. K is a candidate key (CK) of R if K R is in F+ (i.e., F implies K R); and there is no X K, such that X R is also

in F+ (minimality). (no part of K R ) An FD X Y is trivial if Y X.

Some trivial FDs are: AB B, A , .

Find the candidate keys and trivial FDs in the next example. What is ABC?

Page 22: Logical Database Design (1 of 3)

Lecture 6 Logical Database Design (1) 23

F+: An Example *Example: Let F = {AB C, C B} be a set of

FDs satisfied by R(A, B, C).F+ = {A , A A, AB , AB A, AB B,

AB C, AB AB, AB AC, AB BC, AB ABC, AC , AC A, AC B, AC C, AC AB, AC AC, AC BC, AC ABC, ABC , ABC A, ABC B, ABC C, ABC AB, ABC AC, ABC BC, ABC ABC, B , B B, BC , BC B, BC C, BC BC, C , C B, C C, C BC, }

Page 23: Logical Database Design (1 of 3)

Lecture 6 Logical Database Design (1) 24

Inference Rules for FDsLet F be a set of FDs satisfied by R, and X, Y,

Z R. Armstrong’s Axioms (1974) for deriving

new FDs(IR1) Reflexivity: If X Y, then X Y is

satisfied by R.(IR2) Augmentation: If X Y is satisfied by

R, then XZ YZ is also satisfied by R. (IR3) Transitivity: If X Y and Y Z are

satisfied by R, then so is X Z.

Page 24: Logical Database Design (1 of 3)

Lecture 6 Logical Database Design (1) 25

Compute FD Closure F+ Let F be a set of FDs. To compute F+, start

with FDs in F, repeatedly apply IR1-IR3, until no new FD can be derived.

Theorem: Armstrong's Axioms are sound and complete. Sound: no incorrect FD will be added to

F+. Complete: all FDs in F+ will be

generated.

Page 25: Logical Database Design (1 of 3)

Lecture 6 Logical Database Design (1) 26

Additional Axioms Additional rules derivable from

Armstrong's Axioms. (IR4) Decomposition: { X YZ } { X Y, X Z } (IR5) Union: { X Y, X Z } X YZ (IR6) Pseudotransitivity: { X Y, WY Z } WX Z

Page 26: Logical Database Design (1 of 3)

Lecture 6 Logical Database Design (1) 27

Closure of Attributes How to determine if F implies X Y? Method 1: Check if X Y is in F+.

Problem: F+ is too expensive to compute!

Method 2: Compute closure of X under F. X+ = { A |= X A F+ }

X+ is the set of attributes that are functionally determined by X under F.

Theorem: X Y F+ if and only if Y X+. Proof: Use IR4 & IR5.