BTM 382 Database Management
Chapter 6: Normalization of Database Tables
Chitu Okoli, Associate Professor in Business Technology Management
John Molson School of Business, Concordia University, Montréal




Problems with unnormalized tables

• Needless redundancy, hence insert, update and delete anomalies (inconsistencies)

• Data updates are less efficient because tables are larger

• Indexing is more cumbersome

• No simple strategies for creating views (virtual tables)


Understand dependencies for normalization: Functional dependency

• Functional dependency: A→B or (A,B)→(C,D)
  – B is functionally dependent on A when knowing the value of A gives you exactly one correct value of B
  – E.g. Project.ID → Project.Name
  – Also called determination: “A determines B”
• Full functional dependency: (A,B)→C where A↛C and B↛C
  – When all the attributes in a key are required for the determination (none is optional)
  – E.g. (Project.ID, Project.Manager) → Project.Name
    Project.Manager is optional because Project.ID alone determines Project.Name, so this is not a full functional dependency
  – E.g. (Project.Manager, Project.StartDate) → Project.Name
    This is a full functional dependency, assuming a manager can launch no more than one project on a given date
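The two definitions above can be tested against sample rows. The following is a minimal sketch, not a definitive implementation: the `projects` rows and the function names `determines` and `is_full_dependency` are made up for illustration. Sample data can only refute a dependency; whether it truly holds is a business rule.

```python
from itertools import combinations

def determines(rows, lhs, rhs):
    """Return True if no two rows agree on lhs but disagree on rhs."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # setdefault stores the first value seen for this key and returns it.
        if seen.setdefault(key, value) != value:
            return False
    return True

def is_full_dependency(rows, lhs, rhs):
    """True if lhs -> rhs holds and no proper subset of lhs determines rhs."""
    if not determines(rows, lhs, rhs):
        return False
    for size in range(1, len(lhs)):
        for subset in combinations(lhs, size):
            if determines(rows, subset, rhs):
                return False  # part of lhs is optional
    return True

# Hypothetical sample data (not from the slides).
projects = [
    {"ID": 1, "Manager": "Ana", "StartDate": "2024-01-10", "Name": "Payroll"},
    {"ID": 2, "Manager": "Ana", "StartDate": "2024-03-01", "Name": "Website"},
    {"ID": 3, "Manager": "Raj", "StartDate": "2024-01-10", "Name": "Audit"},
]

print(determines(projects, ["ID"], ["Name"]))
print(is_full_dependency(projects, ["ID", "Manager"], ["Name"]))
print(is_full_dependency(projects, ["Manager", "StartDate"], ["Name"]))
```

In this sample, (ID, Manager) → Name fails the full-dependency test because ID alone suffices, while (Manager, StartDate) → Name passes because neither attribute determines Name on its own.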


Understand dependencies for normalization: Repeating group = multivalued attribute; multivalued dependency

• Repeating group = multivalued attribute
  – Attribute whose cells contain multiple values (a list or array of values) instead of a single value
  – Illegal in the relational model; troublesome for normalization if you don’t catch it
• Functional dependency: A→B
• Multivalued dependency: A→B1/B2/B3/…/Bn (often written A↠B)
  – Instead of determining just one value of B in a table, A determines multiple values at the same time
  – E.g. Project.ID → Project.EmployeeID
  – Usually indicates a problem with normalization
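As a rough illustration of removing a repeating group, the sketch below expands a list-valued attribute into one row per value. The sample rows and the helper name `to_first_normal_form` are assumptions made for this example only.

```python
def to_first_normal_form(rows, multivalued):
    """Expand a list-valued attribute into one row per value."""
    flat = []
    for row in rows:
        for value in row[multivalued]:
            new_row = dict(row)           # copy the single-valued attributes
            new_row[multivalued] = value  # keep exactly one value per cell
            flat.append(new_row)
    return flat

# Hypothetical unnormalized table: EmployeeID holds a list in each cell.
projects = [
    {"ProjectID": 1, "EmployeeID": [101, 102, 103]},
    {"ProjectID": 2, "EmployeeID": [102]},
]

assignments = to_first_normal_form(projects, "EmployeeID")
for row in assignments:
    print(row)
```

After the expansion, ProjectID alone no longer identifies a row; the composite (ProjectID, EmployeeID) does.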


Understand dependencies for normalization: Partial and transitive dependencies

• Partial dependency: (A,B)→(C,D) and B→C
  – (A,B) is a candidate key (e.g. primary key)
  – C doesn’t need both A and B to determine it; it only needs B
  – E.g. (Project.ID, Project.ManagerID) → Project.Name and Project.ID → Project.Name
• Transitive dependency: A→(B,C) and B→C
  – A is a candidate key
    • Technically speaking, a transitive dependency requires that B and C not be part of any candidate key. However, if you expand the definition to include cases where they are part of a key, then you will satisfy BCNF automatically
  – A determines C, but so does B, even though B is not a candidate key
  – E.g. Project.ID → (Project.Client, Project.Location) and Project.Client → Project.Location
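The distinction above can be expressed as a simple rule on the determinant. This sketch assumes a table with a single candidate key and uses the expanded definition of transitive dependency from the slide; the function name `classify` is hypothetical.

```python
def classify(lhs, key):
    """Classify a dependency by comparing its determinant (lhs) with the key.

    Simplification: assumes one candidate key and treats every dependency
    whose determinant is neither the key nor part of it as transitive.
    """
    lhs, key = set(lhs), set(key)
    if key <= lhs:
        return "key dependency"   # determinant contains the whole key
    if lhs < key:
        return "partial"          # determinant is only part of the key
    return "transitive"           # determinant involves a non-key attribute

# Examples mirroring the slide (attribute names shortened).
print(classify({"ID"}, key={"ID", "ManagerID"}))  # ID -> Name
print(classify({"Client"}, key={"ID"}))           # Client -> Location
print(classify({"ID"}, key={"ID"}))               # ID -> (Client, Location)
```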


Summary of attaining normal forms

• 1NF: Primary key identified and no multivalued attributes
  – Legitimate primary key selected (unique identifying key)
  – Only one value per table cell; no lists/arrays (multivalued attributes) in any table cell
• 2NF: 1NF minus partial dependencies
  – All dependencies are fully functional
  – (A,B)→C where A↛C and B↛C
• 3NF/BCNF: 2NF minus transitive dependencies
  – Only a candidate key determines any attribute
  – If A→(B,C), then B↛C
  – There is a technical distinction between 3NF and BCNF, but if you keep this rule, then you take care of both 3NF and BCNF
• 4NF: BCNF minus multivalued dependencies
  – Each row strictly describes just one entity
  – Only a problem if you missed multivalued attributes in 1NF
• DKNF, 5NF, 6NF
  – Relatively rare and often not worth the trouble of normalizing
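To make the 3NF step concrete, the sketch below removes the transitive dependency Client → Location by splitting one table into two, then checks that joining them gives back the original rows. The sample data and the helpers `project` and `natural_join` are illustrative assumptions, not part of the course material.

```python
def project(rows, attrs):
    """Relational projection: keep only attrs and drop duplicate rows."""
    unique = {tuple(row[a] for a in attrs) for row in rows}
    return [dict(zip(attrs, values)) for values in sorted(unique)]

def natural_join(left, right, on):
    """Join two tables on a single shared attribute."""
    return [{**l, **r} for l in left for r in right if l[on] == r[on]]

# Hypothetical table with ID -> (Client, Location) and Client -> Location.
project_rows = [
    {"ID": 1, "Client": "Acme", "Location": "Montréal"},
    {"ID": 2, "Client": "Acme", "Location": "Montréal"},
    {"ID": 3, "Client": "Globex", "Location": "Toronto"},
]

projects = project(project_rows, ["ID", "Client"])       # key: ID
clients = project(project_rows, ["Client", "Location"])  # key: Client

rejoined = natural_join(projects, clients, "Client")
print(len(clients))              # each client's location is stored once
print(rejoined == project_rows)  # no information lost by the split
```

Storing each client's location once means a change of location is a single-row update, which is the update anomaly that normalization is meant to remove.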


Dependency diagram: Basic tool for normalization

• Depicts all dependencies found in a given table structure
• Gives a bird’s-eye view of all relationships among a table’s attributes
• Makes it less likely that you will overlook an important dependency


3NF vs BCNF

• BCNF is only an issue because of a poor selection of the primary key in the 1NF step

• Removing the dependencies whose determinant is not a candidate key resolves the table into BCNF
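The rule "only a candidate key determines any attribute" can be checked mechanically with attribute closure. The sketch below is one possible way to do it; the Student/Course/Instructor table is a common textbook-style example chosen as an assumption here, and the function names are hypothetical.

```python
def closure(attrs, fds):
    """All attributes determined by attrs under the dependencies fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def bcnf_violations(all_attrs, fds):
    """Non-trivial dependencies whose determinant is not a superkey."""
    return [
        (lhs, rhs)
        for lhs, rhs in fds
        if not set(rhs) <= set(lhs) and closure(lhs, fds) != set(all_attrs)
    ]

# Hypothetical table: each instructor teaches exactly one course.
attrs = {"Student", "Course", "Instructor"}
fds = [
    (("Student", "Course"), ("Instructor",)),
    (("Instructor",), ("Course",)),
]
print(bcnf_violations(attrs, fds))

# After splitting out (Instructor, Course), that table has no violations.
print(bcnf_violations({"Instructor", "Course"}, [(("Instructor",), ("Course",))]))
```

The first call reports Instructor → Course because Instructor alone does not determine Student, so it is not a candidate key of the original table.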


Fixing 4NF problem

• The only reason a table might be in 3NF/BCNF but not in 4NF is that two multivalued attributes originally existed at the 1NF stage

• Two multivalued attributes should always be placed in separate tables, as the solution shows
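A small sketch of that separation follows, assuming a hypothetical table in which an employee's skills and spoken languages are independent of each other. The data and the helper name `split` are invented for illustration.

```python
# Hypothetical table: every skill is paired with every language,
# because the two multivalued attributes are independent.
employee_rows = [
    {"EmployeeID": 101, "Skill": "SQL", "Language": "English"},
    {"EmployeeID": 101, "Skill": "SQL", "Language": "French"},
    {"EmployeeID": 101, "Skill": "Python", "Language": "English"},
    {"EmployeeID": 101, "Skill": "Python", "Language": "French"},
]

def split(rows, key, attr):
    """Project onto (key, attr) and drop the duplicates this creates."""
    return sorted({(row[key], row[attr]) for row in rows})

skills = split(employee_rows, "EmployeeID", "Skill")
languages = split(employee_rows, "EmployeeID", "Language")

print(skills)
print(languages)
```

Four combined rows become two skill rows and two language rows, and adding a third language no longer requires one new row per skill.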


Denormalization

• Although normalization is important, processing speed and efficiency are also important in database design


Sources

• Most of the slides are adapted from Database Systems: Design, Implementation and Management by Carlos Coronel and Steven Morris. 11th edition (2015) published by Cengage Learning. ISBN 13: 978-1-285-19614-5

• Other sources are noted on the slides themselves
