28
Dr Gordon Russell, Cop Dr Gordon Russell, Cop yright @ Napier Univer yright @ Napier Univer sity sity Normalisation 2 - V2.0 Normalisation 2 - V2.0 1 Normalisation 2 Normalisation 2 Unit 3.2 Unit 3.2

Dr Gordon Russell, Napier University Normalisation 2 - V2.0 1 Normalisation 2 Unit 3.2

Embed Size (px)

DESCRIPTION

Dr Gordon Russell, Napier University Normalisation 2 - V2.03 Boyce-Codd Normal Form (BCNF) When a relation has more than one candidate key, anomalies may result even though the relation is in 3NF. 3NF does not deal satisfactorily with the case of a relation with overlapping candidate keys – –i.e. composite candidate keys with at least one attribute in common. BCNF is based on the concept of a determinant. – –A determinant is any attribute (simple or composite) on which some other attribute is fully functionally dependent. A relation is in BCNF is, and only if, every determinant is a candidate key.

Citation preview

Page 1: Dr Gordon Russell, Napier University Normalisation 2 - V2.0 1 Normalisation 2 Unit 3.2

Dr Gordon Russell, CopyrigDr Gordon Russell, Copyright @ Napier Universityht @ Napier University

Normalisation 2 - V2.0Normalisation 2 - V2.0 11

Normalisation 2Normalisation 2Unit 3.2Unit 3.2

Page 2: Dr Gordon Russell, Napier University Normalisation 2 - V2.0 1 Normalisation 2 Unit 3.2

Dr Gordon Russell, CopyrigDr Gordon Russell, Copyright @ Napier Universityht @ Napier University

Normalisation 2 - V2.0Normalisation 2 - V2.0 22

Normalisation 2 Overview

– normalise a relation to Boyce Codd Normal Form (BCNF)

– normalise a relation to forth normal form (4NF)– normalise a relation to fifth normal form (5NF)

Page 3: Dr Gordon Russell, Napier University Normalisation 2 - V2.0 1 Normalisation 2 Unit 3.2

Dr Gordon Russell, CopyrigDr Gordon Russell, Copyright @ Napier Universityht @ Napier University

Normalisation 2 - V2.0Normalisation 2 - V2.0 33

Boyce-Codd Normal Form (BCNF) When a relation has more than one candidate key,

anomalies may result even though the relation is in 3NF. 3NF does not deal satisfactorily with the case of a

relation with overlapping candidate keys– i.e. composite candidate keys with at least one

attribute in common. BCNF is based on the concept of a determinant.

– A determinant is any attribute (simple or composite) on which some other attribute is fully functionally dependent.

A relation is in BCNF is, and only if, every determinant is a candidate key.

Page 4: Dr Gordon Russell, Napier University Normalisation 2 - V2.0 1 Normalisation 2 Unit 3.2

Dr Gordon Russell, CopyrigDr Gordon Russell, Copyright @ Napier Universityht @ Napier University

Normalisation 2 - V2.0Normalisation 2 - V2.0 44

The theoryThe theory Consider the following relation and determinants.Consider the following relation and determinants.

R(R(a,ba,b,c,d),c,d)a,c -> b,da,c -> b,da,d -> ba,d -> b

To be in BCNF, all valid determinants must be a To be in BCNF, all valid determinants must be a candidate key. In the relation R, a,c->b,d is the candidate key. In the relation R, a,c->b,d is the determinate used, so the first determinate is fine.determinate used, so the first determinate is fine.

a,d->b suggests that a,d can be the primary key, a,d->b suggests that a,d can be the primary key, which would determine b. However this would not which would determine b. However this would not determine c. This is not a candidate key, and thus R is determine c. This is not a candidate key, and thus R is not in BCNF.not in BCNF.

Page 5: Dr Gordon Russell, Napier University Normalisation 2 - V2.0 1 Normalisation 2 Unit 3.2

Dr Gordon Russell, CopyrigDr Gordon Russell, Copyright @ Napier Universityht @ Napier University

Normalisation 2 - V2.0Normalisation 2 - V2.0 55

Example 1Example 1

Patient No Patient Name Appointment Id Time Doctor

1 John 0 09:00 Zorro

2 Kerr 0 09:00 Killer

3 Adam 1 10:00 Zorro

4 Robert 0 13:00 Killer

5 Zane 1 14:00 Zorro

Page 6: Dr Gordon Russell, Napier University Normalisation 2 - V2.0 1 Normalisation 2 Unit 3.2

Dr Gordon Russell, CopyrigDr Gordon Russell, Copyright @ Napier Universityht @ Napier University

Normalisation 2 - V2.0Normalisation 2 - V2.0 66

Two possible keysTwo possible keys DB(Patno,PatName,appNo,time,doctor)DB(Patno,PatName,appNo,time,doctor) Determinants:Determinants:

– Patno -> PatNamePatno -> PatName– Patno,appNo -> Time,doctorPatno,appNo -> Time,doctor– Time -> appNoTime -> appNo

Two options for 1NF primary key selection:Two options for 1NF primary key selection:– DB(DB(Patno,Patno,PatNamePatName,appNo,appNo,time,doctor) (example ,time,doctor) (example

1a)1a)– DB(DB(Patno,Patno,PatNamePatName,,appNo,appNo,timetime,doctor) (example ,doctor) (example

1b)1b)

Page 7: Dr Gordon Russell, Napier University Normalisation 2 - V2.0 1 Normalisation 2 Unit 3.2

Dr Gordon Russell, CopyrigDr Gordon Russell, Copyright @ Napier Universityht @ Napier University

Normalisation 2 - V2.0Normalisation 2 - V2.0 77

Example 1aExample 1a DB(DB(Patno,Patno,PatName,PatName,appNoappNo,time,doctor) ,time,doctor) No repeating groups, so in 1NFNo repeating groups, so in 1NF 2NF – eliminate partial key dependencies:2NF – eliminate partial key dependencies:

– DB(DB(Patno,appNoPatno,appNo,time,doctor),time,doctor)– R1(R1(PatnoPatno,PatName) ,PatName)

3NF – no transient dependences so in 3NF3NF – no transient dependences so in 3NF Now try BCNF.Now try BCNF.

Page 8: Dr Gordon Russell, Napier University Normalisation 2 - V2.0 1 Normalisation 2 Unit 3.2

Dr Gordon Russell, CopyrigDr Gordon Russell, Copyright @ Napier Universityht @ Napier University

Normalisation 2 - V2.0Normalisation 2 - V2.0 88

BCNF Every BCNF Every determinant is a determinant is a candidate keycandidate key

DB(DB(Patno,appNoPatno,appNo,time,doctor),time,doctor)R1(R1(PatnoPatno,PatName) ,PatName)

Is determinant a candidate key?Is determinant a candidate key?– Patno -> PatNamePatno -> PatName

Patno is present in DB, but not PatName, so not Patno is present in DB, but not PatName, so not relevant.relevant.

– Patno,appNo -> Time,doctorPatno,appNo -> Time,doctorAll LHS and RHS present so relevant. Is this a candidate All LHS and RHS present so relevant. Is this a candidate key? Patno,appNo IS the key, so this is a candidate key. key? Patno,appNo IS the key, so this is a candidate key.

– Time -> appNoTime -> appNoTime is present, and so is appNo, so relevant. Is this a Time is present, and so is appNo, so relevant. Is this a candidate key? If it was then we could rewrite DB as:candidate key? If it was then we could rewrite DB as: DB(Patno,appNo, DB(Patno,appNo,timetime,doctor),doctor)This will not work, so not BCNF.This will not work, so not BCNF.

Page 9: Dr Gordon Russell, Napier University Normalisation 2 - V2.0 1 Normalisation 2 Unit 3.2

Dr Gordon Russell, CopyrigDr Gordon Russell, Copyright @ Napier Universityht @ Napier University

Normalisation 2 - V2.0Normalisation 2 - V2.0 99

Rewrite to BCNFRewrite to BCNF DB(DB(Patno,appNoPatno,appNo,time,doctor),time,doctor)

R1(R1(PatnoPatno,PatName) ,PatName) BCNF: rewrite toBCNF: rewrite to

DB( DB(Patno,timePatno,time,doctor),doctor) R1( R1(PatnoPatno,PatName),PatName) R2( R2(timetime,appNo),appNo)

time is enough to work out the appointment number time is enough to work out the appointment number of a patient. Now BCNF is satisfied, and the final of a patient. Now BCNF is satisfied, and the final relations shown are in BCNFrelations shown are in BCNF

Page 10: Dr Gordon Russell, Napier University Normalisation 2 - V2.0 1 Normalisation 2 Unit 3.2

Dr Gordon Russell, CopyrigDr Gordon Russell, Copyright @ Napier Universityht @ Napier University

Normalisation 2 - V2.0Normalisation 2 - V2.0 1010

Example 1bExample 1b DB(DB(Patno,Patno,PatName,appNo,PatName,appNo,timetime,doctor) ,doctor) No repeating groups, so in 1NFNo repeating groups, so in 1NF 2NF – eliminate partial key dependencies:2NF – eliminate partial key dependencies:

– DB(DB(PatnoPatno,,timetime,doctor),doctor)– R1(R1(PatnoPatno,PatName) ,PatName) – R2(R2(timetime,appNo),appNo)

3NF – no transient dependences so in 3NF3NF – no transient dependences so in 3NF Now try BCNF.Now try BCNF.

Page 11: Dr Gordon Russell, Napier University Normalisation 2 - V2.0 1 Normalisation 2 Unit 3.2

Dr Gordon Russell, CopyrigDr Gordon Russell, Copyright @ Napier Universityht @ Napier University

Normalisation 2 - V2.0Normalisation 2 - V2.0 1111

BCNF Every BCNF Every determinant is a determinant is a candidate keycandidate key

DB(DB(PatnoPatno,,timetime,doctor),doctor)R1(R1(PatnoPatno,PatName) ,PatName) R2(R2(timetime,appNo),appNo)

Is determinant a candidate key?Is determinant a candidate key?– Patno -> PatNamePatno -> PatName

Patno is present in DB, but not PatName, so not Patno is present in DB, but not PatName, so not relevant.relevant.

– Patno,appNo -> Time,doctorPatno,appNo -> Time,doctorNot all LHS present so not relevantNot all LHS present so not relevant

– Time -> appNoTime -> appNoTime is present, but not appNo, so not relevant. Time is present, but not appNo, so not relevant.

– Relations are in BCNF.Relations are in BCNF.

Page 12: Dr Gordon Russell, Napier University Normalisation 2 - V2.0 1 Normalisation 2 Unit 3.2

Dr Gordon Russell, CopyrigDr Gordon Russell, Copyright @ Napier Universityht @ Napier University

Normalisation 2 - V2.0Normalisation 2 - V2.0 1212

Summary - Example 1This example has demonstrated three things:

BCNF is stronger than 3NF, relations that are in 3NF are not necessarily inBCNF

BCNF is needed in certain situations to obtain full understanding of the data model

there are several routes to take to arrive at the same set of relations in BCNF.– Unfortunately there are no rules as to which route

will be the easiest one to take.

Page 13: Dr Gordon Russell, Napier University Normalisation 2 - V2.0 1 Normalisation 2 Unit 3.2

Dr Gordon Russell, CopyrigDr Gordon Russell, Copyright @ Napier Universityht @ Napier University

Normalisation 2 - V2.0Normalisation 2 - V2.0 1313

Example 2Grade_report(StudNo,StudName,(Major,Adviser,

(Course,Ctitle,InstrucName,InstructLocn,Grade))) Functional dependencies

– StudNo -> StudName– CourseNo -> Ctitle,InstrucName– InstrucName -> InstrucLocn– StudNo,CourseNo,Major -> Grade– StudNo,Major -> Advisor– Advisor -> Major

Page 14: Dr Gordon Russell, Napier University Normalisation 2 - V2.0 1 Normalisation 2 Unit 3.2

Dr Gordon Russell, CopyrigDr Gordon Russell, Copyright @ Napier Universityht @ Napier University

Normalisation 2 - V2.0Normalisation 2 - V2.0 1414

Example 2 cont... Unnormalised

Grade_report(StudNo,StudName,(Major,Adviser, (Course,Ctitle,InstrucName,InstructLocn,Grade)))

1NF Remove repeating groups– Student(StudNo,StudName)– StudMajor(StudNo,Major,Adviser)– StudCourse(StudNo,Major,Course,

Ctitle,InstrucName,InstructLocn,Grade)

Page 15: Dr Gordon Russell, Napier University Normalisation 2 - V2.0 1 Normalisation 2 Unit 3.2

Dr Gordon Russell, CopyrigDr Gordon Russell, Copyright @ Napier Universityht @ Napier University

Normalisation 2 - V2.0Normalisation 2 - V2.0 1515

Example 2 cont... 1NF

Student(StudNo,StudName)StudMajor(StudNo,Major,Adviser)StudCourse(StudNo,Major,Course, Ctitle,InstrucName,InstructLocn,Grade)

2NF Remove partial key dependenciesStudent(StudNo,StudName)StudMajor(StudNo,Major,Adviser)StudCourse(StudNo,Major,Course,Grade)Course(Course,Ctitle,InstrucName,InstructLocn)

Page 16: Dr Gordon Russell, Napier University Normalisation 2 - V2.0 1 Normalisation 2 Unit 3.2

Dr Gordon Russell, CopyrigDr Gordon Russell, Copyright @ Napier Universityht @ Napier University

Normalisation 2 - V2.0Normalisation 2 - V2.0 1616

Example 2 cont... 2NF

Student(StudNo,StudName)StudMajor(StudNo,Major,Adviser)StudCourse(StudNo,Major,Course,Grade)Course(Course,Ctitle,InstrucName,InstructLocn)

3NF Remove transitive dependenciesStudent(StudNo,StudName)StudMajor(StudNo,Major,Adviser)StudCourse(StudNo,Major,Course,Grade)Course(Course,Ctitle,InstrucName)Instructor(InstructName,InstructLocn)

Page 17: Dr Gordon Russell, Napier University Normalisation 2 - V2.0 1 Normalisation 2 Unit 3.2

Dr Gordon Russell, CopyrigDr Gordon Russell, Copyright @ Napier Universityht @ Napier University

Normalisation 2 - V2.0Normalisation 2 - V2.0 1717

Example 2 cont... BCNF Every determinant is a candidate key

– Student : only determinant is StudNo– StudCourse: only determinant is StudNo,Major– Course: only determinant is Course– Instructor: only determinant is InstrucName– StudMajor: the determinants are

StudNo,Major, or AdviserOnly StudNo,Major is a candidate key.

Page 18: Dr Gordon Russell, Napier University Normalisation 2 - V2.0 1 Normalisation 2 Unit 3.2

Dr Gordon Russell, CopyrigDr Gordon Russell, Copyright @ Napier Universityht @ Napier University

Normalisation 2 - V2.0Normalisation 2 - V2.0 1818

Example 2: BCNFExample 2: BCNF BCNF

Student(StudNo,StudName)StudCourse(StudNo,Major,Course,Grade)Course(Course,Ctitle,InstrucName)Instructor(InstructName,InstructLocn)StudMajor(StudNo,Adviser)Adviser(Adviser,Major)

Page 19: Dr Gordon Russell, Napier University Normalisation 2 - V2.0 1 Normalisation 2 Unit 3.2

Dr Gordon Russell, CopyrigDr Gordon Russell, Copyright @ Napier Universityht @ Napier University

Normalisation 2 - V2.0Normalisation 2 - V2.0 1919

Problems BCNF Problems BCNF overcomesovercomes

If the record for student 456 is deleted we lose not only information on student 456 but also the fact that DARWIN advises in BIOLOGY

we cannot record the fact that WATSON can advise on COMPUTING until we have a student majoring in COMPUTING to whom we can assign WATSON as an advisor.

STUDENT MAJOR ADVISOR

123 PHYSICS EINSTEIN

123 MUSIC MOZART

456 BIOLOGY DARWIN

789 PHYSICS BOHR

999 PHYSICS EINSTEIN

Page 20: Dr Gordon Russell, Napier University Normalisation 2 - V2.0 1 Normalisation 2 Unit 3.2

Dr Gordon Russell, CopyrigDr Gordon Russell, Copyright @ Napier Universityht @ Napier University

Normalisation 2 - V2.0Normalisation 2 - V2.0 2020

Split into two Split into two tablestablesIn BCNF we have two tablesIn BCNF we have two tables

STUDENT ADVISOR

123 EINSTEIN

123 MOZART

456 DARWIN

789 BOHR

999 EINSTEIN

ADVISOR MAJOR

EINSTEIN PHYSICS

MOZART MUSIC

DARWIN BIOLOGY

BOHR PHYSICS

Page 21: Dr Gordon Russell, Napier University Normalisation 2 - V2.0 1 Normalisation 2 Unit 3.2

Dr Gordon Russell, CopyrigDr Gordon Russell, Copyright @ Napier Universityht @ Napier University

Normalisation 2 - V2.0Normalisation 2 - V2.0 2121

Fourth Normal FormUnder 4NF, a record type should not contain two or more independent multi-valued facts about an entity. A relation is in 4NF if it is in BCNF and it contains no

multi-valued dependencies. A multi-valued fact may correspond to a many-many

relationship or to a many-one relationship. A multi-valued dependency exists when there are

three attributes (e.g. A,B, and C) in a relation, and for each value of A there is a well-defined set of valued B and a well defined set of values C. However, the set of values of B is independent of set C and vice-versa.

Page 22: Dr Gordon Russell, Napier University Normalisation 2 - V2.0 1 Normalisation 2 Unit 3.2

Dr Gordon Russell, CopyrigDr Gordon Russell, Copyright @ Napier Universityht @ Napier University

Normalisation 2 - V2.0Normalisation 2 - V2.0 2222

Example Consider information stored about movie stars. It

includes details of their various addresses and the movies they starred in:– Name -> Address– Name -> Movie

Name Street City Title Year

 C. Fisher  123 Maple St. Hollywood Star Wars 1977

 C. Fisher  5 Locust Ln. Malibu Star Wars 1977

 C. Fisher  123 Maple St. Hollywood Empire Strikes Back 1980

 C. Fisher  5 Locust Ln. Malibu Empire Strikes Back 1980

 C. Fisher  123 Maple St. Hollywood Return of the Jedi 1983

 C. Fisher  5 Locust Ln. Malibu Return of the Jedi 1983

Page 23: Dr Gordon Russell, Napier University Normalisation 2 - V2.0 1 Normalisation 2 Unit 3.2

Dr Gordon Russell, CopyrigDr Gordon Russell, Copyright @ Napier Universityht @ Napier University

Normalisation 2 - V2.0Normalisation 2 - V2.0 2323

Example cont... Carrie Fisher has two addresses and has been in three

movies The only way to express the fact that addresses and

movies are independent is to have each address appear with each movie– but this has introduced redundancy

There is no BCNF violation– But the relation is not in 4NF– We need to break it up into 2 tables

Page 24: Dr Gordon Russell, Napier University Normalisation 2 - V2.0 1 Normalisation 2 Unit 3.2

Dr Gordon Russell, CopyrigDr Gordon Russell, Copyright @ Napier Universityht @ Napier University

Normalisation 2 - V2.0Normalisation 2 - V2.0 2424

Example cont…Example cont…

Name Street City

C. Fisher 123 Maple St. Hollywood

C. Fisher 5 Locust Ln. Malibu

Name Title Year

C.Fisher Star Wars 1977

C.Fisher Empire Strikes Back 1980

C.Fisher Return of the Jedi 1983

Page 25: Dr Gordon Russell, Napier University Normalisation 2 - V2.0 1 Normalisation 2 Unit 3.2

Dr Gordon Russell, CopyrigDr Gordon Russell, Copyright @ Napier Universityht @ Napier University

Normalisation 2 - V2.0Normalisation 2 - V2.0 2525

Fifth Normal Form 5NF is designed to cope with a type of dependency

called join dependency– A relation that has a join dependency cannot be

decomposed by a projection into other relations without spurious results

– a relation is in 5NF when its information content cannot be reconstructed from several smaller relations

i.e. from relations having fewer attributes than the original relation

Page 26: Dr Gordon Russell, Napier University Normalisation 2 - V2.0 1 Normalisation 2 Unit 3.2

Dr Gordon Russell, CopyrigDr Gordon Russell, Copyright @ Napier Universityht @ Napier University

Normalisation 2 - V2.0Normalisation 2 - V2.0 2626

Join Dependency Join Dependency DecompositionDecomposition

Name Language Hobby

Name Language Name Hobby

C. Fisher French Cooks C. Fisher French C. Fisher Cooks

C. Fisher Spanish Cooks C. Fisher Spanish C. Fisher Writes

C. Fisher English Writes C. Fisher English M. Brown Read

M. Brown Spanish Read M. Brown Spanish M. Brown Cook

M. Brown Italian Cook M. Brown Italian K. Clark Cook

K. Clark Italian Cook K. Clark Italian K. Clark Decorating

K. Clark Japanese Decorating K. Clark Japanese

Page 27: Dr Gordon Russell, Napier University Normalisation 2 - V2.0 1 Normalisation 2 Unit 3.2

Dr Gordon Russell, CopyrigDr Gordon Russell, Copyright @ Napier Universityht @ Napier University

Normalisation 2 - V2.0Normalisation 2 - V2.0 2727

Spurious resultsSpurious resultsName Language Hobby Name Language Hobby

C. Fisher French Cooks M. Brown Spanish Cook

C. Fisher French Writes M. Brown Italian Read

C. Fisher Spanish Cooks M. Brown Italian Cook

C. Fisher Spanish Writes K. Clark Italian Cook

C. Fisher English Cooks K. Clark Italian Decorating

C. Fisher English Writes K. Clark Japanese Cook

M. Brown Spanish Read K. Clark Japanese Decorating

Page 28: Dr Gordon Russell, Napier University Normalisation 2 - V2.0 1 Normalisation 2 Unit 3.2

Dr Gordon Russell, CopyrigDr Gordon Russell, Copyright @ Napier Universityht @ Napier University

Normalisation 2 - V2.0Normalisation 2 - V2.0 2828

Returning to the ER Returning to the ER Model Model Now that we have reached the end of the Now that we have reached the end of the

normalisation process, you must go back and compare normalisation process, you must go back and compare the resulting relations with the original ER model the resulting relations with the original ER model

– You may need to alter it to take account of the You may need to alter it to take account of the changes that have occurred during the changes that have occurred during the normalisation process Your ER diagram should normalisation process Your ER diagram should always be a prefect reflection of the model you are always be a prefect reflection of the model you are going to implement in the database, so keep it up going to implement in the database, so keep it up to date!to date!

– The changes required depends on how good the ER The changes required depends on how good the ER model was at first! model was at first!