34
Lecture No 14 Functional Dependencies & Normalization ( III ) Mar 04 th 2011 Database Systems Database Systems

Lecture No 14 Functional Dependencies & Normalization ( III ) Mar 04 th 2011 Database Systems

Embed Size (px)

Citation preview

Page 1: Lecture No 14 Functional Dependencies & Normalization ( III ) Mar 04 th 2011 Database Systems

Lecture No 14Functional Dependencies &

Normalization ( III )Mar 04th 2011

Database SystemsDatabase Systems

Page 2: Lecture No 14 Functional Dependencies & Normalization ( III ) Mar 04 th 2011 Database Systems

Database Development Requirements Analysis

Collect and Analyze the requirements of the users. Conceptual Design

Design a conceptual model, e.g., ER model. Logical Design

Translate the ER model into the relational model. Normalization.

Database Building Build the database and write application programs.

Operation, Maintenance, & Tuning Use, maintain, and “tune” the database.

Page 3: Lecture No 14 Functional Dependencies & Normalization ( III ) Mar 04 th 2011 Database Systems

Normalization

Decides which attributes should be grouped together in a relation

Validates and improves the logical design to the point before proceeding to physical design.

Based on the concept of Functional dependency.

Page 4: Lecture No 14 Functional Dependencies & Normalization ( III ) Mar 04 th 2011 Database Systems

Good Database Design

no redundancy of FACT (!)

no inconsistency

no insertion, deletion or update anomalies

no information loss

no dependency loss

Page 5: Lecture No 14 Functional Dependencies & Normalization ( III ) Mar 04 th 2011 Database Systems

Normalization and Normal Forms

Normalization: Decompose a “bad” table into several “good” ones. Move from a lower normal form to a higher (better) one.

Normal Forms: First Normal Form (1NF) Second Normal Form (2NF) Third Normal Form (3NF) *Higher Normal Forms (BCNF, 4NF, 5NF)

In practice, 3NF is often good enough.

Page 6: Lecture No 14 Functional Dependencies & Normalization ( III ) Mar 04 th 2011 Database Systems

First Normal Form (1NF)

A table is in 1NF, if every row contains exactly one value for each attribute.

In other words, a table is in 1NF if it contains no multi-valued attribute.

By definition, any table must be in 1NF.

Page 7: Lecture No 14 Functional Dependencies & Normalization ( III ) Mar 04 th 2011 Database Systems

Second Normal Form (2NF)

A table is in 2NF, if it is in 1NF and every non-key attribute is irreducibly (fully) dependent

on the primary key.

2NF prohibits partial dependencies. If {A, B} is the primary key of a table, A -> C is a

partial dependency (i.e., a non-key attribute is dependent on part of the primary key.)

Page 8: Lecture No 14 Functional Dependencies & Normalization ( III ) Mar 04 th 2011 Database Systems

Example

Insert - cannot insert the fact that a supplier is located in a particular city until supplier supplies at least one part

Delete - delete the value P3, and we also lose the information that S3 is located in London

Update - S1 appears in the table more than once, and we must change all of them -- Possibility of producing an inconsistent result.

Status City P# Qty

20 London P1 1600

20 London P2 2000

S#

S1

S1

S2

S3

10

20

Paris

London

P2

P3

500

3000

Primary Key (Supplier#, Part#)

Page 9: Lecture No 14 Functional Dependencies & Normalization ( III ) Mar 04 th 2011 Database Systems

Example Note: City is a non-key field and only Note: City is a non-key field and only

dependent on part of the primary key(i.e. dependent on part of the primary key(i.e. supplier#). So this relation is not in 2NF.supplier#). So this relation is not in 2NF.

Page 10: Lecture No 14 Functional Dependencies & Normalization ( III ) Mar 04 th 2011 Database Systems

By analyzing a given relation, we can identify some dependencies between its attributes.

In principle, we want to make attributes independent of each other as far as possible, so that updating one attribute will have no impact on the others.

Supplier#

Part# Status

City

Quantity

Example

Page 11: Lecture No 14 Functional Dependencies & Normalization ( III ) Mar 04 th 2011 Database Systems

2NF Example

FIRST {S#, STATUS, CITY, P#, QTY} Primary key: {S#, P#} {S#} -> {CITY}, {CITY} -> {STATUS}

Test of 2NF {S#} -> {STATUS, CITY}: partial dependencies. FIRST is in 1NF, but not in 2NF.

Decomposition: SECOND {S#, STATUS, CITY} SP {S# (FK: references SECOND), P#, QTY}

Page 12: Lecture No 14 Functional Dependencies & Normalization ( III ) Mar 04 th 2011 Database Systems

2NF-Solution to Problems

The solution is to replace the relation by two new relations (called One and Two)

Status City

20 London

10 Paris

Supplier#

S1

S2

S3 20 London

One: Primary Key (Supplier#)

Part# Quantity

P1 1600

P2 2000

Supplier#

S1

S1

S2

S3

P2

P3

500

3000

Two: Primary Key (Supplier#, Part#)

This revised structure overcomes all the problems sketched earlier, but:This revised structure overcomes all the problems sketched earlier, but:• Insert Insert - cannot insert the fact that a particular city has a status until a - cannot insert the fact that a particular city has a status until a

supplier is actually located in that city.supplier is actually located in that city.• DeleteDelete - delete the value S2, and we also lose the information that - delete the value S2, and we also lose the information that

Paris has the status value 10.Paris has the status value 10.• UpdateUpdate - Status value 20 appears in the table more than once. - Status value 20 appears in the table more than once.

Page 13: Lecture No 14 Functional Dependencies & Normalization ( III ) Mar 04 th 2011 Database Systems

Third Normal Form (3NF)

A table is in 3NF if and only if it is in 2NF and every non-key attribute is non-transitively dependent

on the primary key.

3NF prohibits transitive dependencies.

If A -> B and B -> C, C is transitively dependent on A.

Page 14: Lecture No 14 Functional Dependencies & Normalization ( III ) Mar 04 th 2011 Database Systems

3NF Example

SECOND {S#, STATUS, CITY} Primary key: {S#} {S#} -> {CITY}, {CITY} -> {STATUS}

Test of 3NF: {S#} -> {CITY} -> {STATUS}: transitive dependency SECOND is in 2NF, but not 3NF.

Decomposition: CS {CITY, STATUS} SC {S#, CITY (FK: references CS)}

Page 15: Lecture No 14 Functional Dependencies & Normalization ( III ) Mar 04 th 2011 Database Systems

Third Normal Form (3NF)

A relation is in 3NF if and only if it is 2NF and all non-key fields are mutually independent.

Status City

20 London

10 Paris

Supplier#

S1

S2

S3 20 London

One: Primary Key (Supplier#)

Part# Quantity

P1 1600

P2 2000

Supplier#

S1

S1

S2

S3

P2

P3

500

3000

Two: Primary Key (Supplier#, Part#)

Note: Note: Relation Two is in 3NF (with only one non-key field), Relation Two is in 3NF (with only one non-key field), but relation One is not in 3NF because Status is dependent but relation One is not in 3NF because Status is dependent on City (both are non-key fields)on City (both are non-key fields)

Page 16: Lecture No 14 Functional Dependencies & Normalization ( III ) Mar 04 th 2011 Database Systems

3NF - Converting 2NF into 3NF

We can now normalize relation One into two new relations (called three and four), which both are in 3NF.

City

London

Paris

Supplier#

S1

S2

S3 London

Three: Primary Key (Supplier#)

Status

20

10

City

London

Paris

Four: Primary Key (City)

Higher Normal Forms (e.g. 4NF and 5NF) do exist, Higher Normal Forms (e.g. 4NF and 5NF) do exist, but they are mainly of interest in academic societies but they are mainly of interest in academic societies rather than in the practical applications of database rather than in the practical applications of database design.design.

Page 17: Lecture No 14 Functional Dependencies & Normalization ( III ) Mar 04 th 2011 Database Systems

Relationships of Normal Forms

1NF

2NF

3NF/BCNF

4NF

5NF

DKNF

Page 18: Lecture No 14 Functional Dependencies & Normalization ( III ) Mar 04 th 2011 Database Systems

*Higher Normal Forms

Boyce/Codd Normal Form (BCNF) A more restrictive 3NF.

Fourth normal forms Concerned with Multi-valued dependencies

Fifth normal forms Concerned with Join dependencies.

Other normal forms Domain-key normal form “Restriction-union” normal form

In practice, 3NF is often good enough.

Page 19: Lecture No 14 Functional Dependencies & Normalization ( III ) Mar 04 th 2011 Database Systems

How to Decompose a Table

Let R{A, B, C} be a table, where A, B, and C are sets of attributes. If R satisfies the FD A -> B, then We can decompose R on {A, B} and {A, C} Heath’s Theorem

Why do we keep A in both tables?

Strategy: Move a problematic (partial or transitive) FD into a separate table. Keep common attributes so we can join the tables back.

Page 20: Lecture No 14 Functional Dependencies & Normalization ( III ) Mar 04 th 2011 Database Systems

Decomposition Example FIRST {S#, STATUS, CITY, P#, QTY}

Primary key: {S#, P#} {S#} -> {CITY}, {CITY} -> {STATUS}

Decompose FIRST into: SECOND {S#, STATUS, CITY} SP {S# (FK: references SECOND), P#, QTY}

Decompose SECOND into: CS {CITY, STATUS} SC {S#, CITY (FK: references CS)}

C. J. Date 6C. J. Date 6thth Edition Ch.10 page (292 Edition Ch.10 page (292305)305)

Page 21: Lecture No 14 Functional Dependencies & Normalization ( III ) Mar 04 th 2011 Database Systems

Exercise: Rescue a "Bad" design

Original Design: SPDB {S#, Sname, City, Status, P#, Pname, Weight,

QTY} S# -> Sname, City, Status City -> Status P# -> Pname, Weight S#, P# -> QTY

Problems: data redundancies data anomalies

Page 22: Lecture No 14 Functional Dependencies & Normalization ( III ) Mar 04 th 2011 Database Systems

Exercise: Check 2NF

SPDB is in 1NF. There are partial dependencies:

S# -> Sname, City, Status P# -> Pname, Weight

So SPDB is not in 2NF.

Decomposition: S {S#, Sname, City, Status} P {P#, Pname, Weight} SP {S# (references S), P# (references P), QTY}

Page 23: Lecture No 14 Functional Dependencies & Normalization ( III ) Mar 04 th 2011 Database Systems

Exercise: Check 3NF

In S {S#, Sname, City, Status}: S# -> City -> Status: transitive dependency. So S is not in 3NF.

Decomposition: C {City, Status} S {S#, Sname, City (references C)}

Now all the tables, C, S, P, and SP, are in 3NF.

Page 24: Lecture No 14 Functional Dependencies & Normalization ( III ) Mar 04 th 2011 Database Systems

Example 2: University Database Table

StdSSN StdClass StdCity OfferNo OffYear Offterm EnrGrade CourseNo CrsDesc

S1 JUN ISB O1 2003 1 3.5 C1 DB S1 JUN ISB O2 2003 2 3.3 C2 VB S2 JUN ISB O3 2003 1 3.1 C3 OO S2 JUN ISB O2 2003 1 3.4 C2 VB

StdSSN StdCity, StdClass

OfferNo OffTerm, OffYear, CourseNo, CrsDesc

CourseNo CrsDesc

StdSSN, OfferNo EnrGrade

Page 25: Lecture No 14 Functional Dependencies & Normalization ( III ) Mar 04 th 2011 Database Systems

Problems

Anomalies: - PK: combination of StdSSN and OfferNo - Insert: cannot insert a new student without enrolling in an

offering (OfferNo part of PK) - Update: change a course description; change every

enrollment of the course - Delete: remove third row; lose information about course C3

Table has obvious redundancies Easier to query: no joins More difficult to change: can work around problems

(dummy PK) but tedious to do

Page 26: Lecture No 14 Functional Dependencies & Normalization ( III ) Mar 04 th 2011 Database Systems

FD Diagrams and Lists

StdSSN StdCity StdClass OfferNo OffTerm OffYear EnrGradeCourseNo CrsDesc

StdSSN StdCity, StdClass

OfferNo OffTerm, OffYear, CourseNo, CrsDesc

CourseNo CrsDesc

StdSSN, OfferNo EnrGrade

Page 27: Lecture No 14 Functional Dependencies & Normalization ( III ) Mar 04 th 2011 Database Systems

Normalization

Process of removing unwanted redundancies

Apply normal forms

Identify FDs Determine whether FDs meet normal form Split the table to meet the normal form if there is a violation

Page 28: Lecture No 14 Functional Dependencies & Normalization ( III ) Mar 04 th 2011 Database Systems

1NF

No repeating groups: flat rows

StdSSN StdClass OfferNo OffYear EnrGrade CourseNo CrsDesc

S1 JUN O1 2003 3.5 C1 DB O2 2003 3.3 C2 VB S2 JUN O3 2003 3.1 C3 OO O2 2003 3.4 C2 VB

Page 29: Lecture No 14 Functional Dependencies & Normalization ( III ) Mar 04 th 2011 Database Systems

Combined Definition of 2NF/3NF

Key column: candidate key or part of candidate key

Analogy to the traditional justice oath Every non key depends on a key, the whole

key, and nothing but the key Usually taught as separate definitions

Page 30: Lecture No 14 Functional Dependencies & Normalization ( III ) Mar 04 th 2011 Database Systems

2NF

Every nonkey column depends on a whole key, not part of a key

Violations Part of key nonkey Violations only for combined keys

Page 31: Lecture No 14 Functional Dependencies & Normalization ( III ) Mar 04 th 2011 Database Systems

2NF Example

Many violations for the big university database table StdSSN StdCity, StdClass OfferNo OffTerm, OffYear, CourseNo,

CrsDesc Splitting the table

UnivTable1 (StdSSN, StdCity, StdClass) UnivTable2 (OfferNo, OffTerm, OffYear,

CourseNo, CrsDesc)

Page 32: Lecture No 14 Functional Dependencies & Normalization ( III ) Mar 04 th 2011 Database Systems

3NF

Every nonkey column depends only on a key not on non key columns

Violations: Nonkey Nonkey Alterative formulation

No transitive FDs A B, B C then A C OfferNo CourseNo, CourseNo CrsDesc then

OfferNo CrsDesc

Page 33: Lecture No 14 Functional Dependencies & Normalization ( III ) Mar 04 th 2011 Database Systems

3NF Example

One violation in UnivTable2 CourseNo CrsDesc

Splitting the table UnivTable2-1 (OfferNo, OffTerm, OffYear,

CourseNo) UnivTable2-2 (CourseNo, CrsDesc)

Page 34: Lecture No 14 Functional Dependencies & Normalization ( III ) Mar 04 th 2011 Database Systems

Example 2