ITN 170 - Table Normalization 1
ITN 170 MySQL Database Programming
Lecture 3: Database Analysis and Design (III) Normalization
Section objectives
Define normalization and explain its benefits.
Place tables in Third Normal Form.
Explain how conceptual data modeling rules ensure normalized tables.
Normalize Tables
Categorize tables according to their degree of normalization.
Normal Form Rule: Description
First Normal Form (1NF): The data must be expressed as a set of unordered, two-dimensional tables. A table cannot contain repeating groups.
Second Normal Form (2NF): The table must be in 1NF. Every non-key column must be dependent on all parts of the primary key.
Third Normal Form (3NF): The table must be in 2NF. No non-key column may be functionally dependent on another non-key column.
Normalize Tables
“Each non-primary key value MUST be dependent on the key, the whole key, and nothing but the key.”
Why normalize tables?
Normalization minimizes data redundancy. Un-normalized data is redundant.
Data redundancy causes integrity problems. Update and delete transactions may not be applied consistently to all copies of the data, which leaves the data inconsistent.
Normalization helps identify missing entities, relationships, and tables.
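The integrity problem described above can be illustrated with a minimal Python sketch. The row layout and the helper name `conflicting_prices` are assumptions made for illustration only; the item numbers and prices are borrowed from the sample order data used later in this lecture.

```python
# Un-normalized rows: the price of item 4011 is stored once per order line.
order_lines = [
    {"order_id": 2301, "item_num": 4011, "descrip": "racket", "price": 65.00},
    {"order_id": 2303, "item_num": 4011, "descrip": "racket", "price": 65.00},
    {"order_id": 2301, "item_num": 3786, "descrip": "net", "price": 35.00},
]


def conflicting_prices(rows):
    """Return {item_num: sorted prices} for items stored with more than one price."""
    seen = {}
    for row in rows:
        seen.setdefault(row["item_num"], set()).add(row["price"])
    return {item: sorted(prices) for item, prices in seen.items() if len(prices) > 1}


# Before the update, every copy of the price agrees.
before = conflicting_prices(order_lines)

# A price change is applied to only one copy of the redundant data.
order_lines[0]["price"] = 70.00
after = conflicting_prices(order_lines)

print(before)  # {}
print(after)   # {4011: [65.0, 70.0]}
```

Once the price lives in a single row of a separate table, this kind of disagreement cannot arise.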
Normalize Tables
In addition to the three normal forms mentioned above, there are higher normal forms such as Boyce-Codd normal form, fourth normal form, and fifth normal form. However, they are not widely used in database design.
In general, third normal form is the accepted goal for a database design that eliminates redundancy.
This is an Excel table. Looks nice, doesn’t it?
Recognize Un-normalized Data
Let’s simplify the Excel table to the following set of data. The Excel table is un-normalized: it does not comply with any of the rules of normalization we just mentioned.
Three variable-length records are shown, one for each ORDER_ID (2301, 2302, and 2303).

ORDER_ID  DATE  CUST_ID  CUST_NAME    STATE  ITEM NUM  ITEM DESCRIP  QUANTITY  PRICE
2301      6/23  101      Volleyrite   IL     3786      net           3         35.00
                                             4011      racket        6         65.00
                                             9132      3-pack        8         4.75
2302      6/25  107      Herman’s     WI     5794      6-pack        4         5.00
2303      6/26  110      We-R-Sports  MI     4011      racket        2         65.00
                                             3141      cover         2         10.00

Why is this data un-normalized?
Recognize Un-normalized Data
Remember, First Normal Form prohibits repeating groups.
[Answer] The table contains a repeating group of ITEM NUM, ITEM DESCRIPTION, QUANTITY, and PRICE.
First Normal Form
Remove any repeating groups:
Fill the identical data into the empty cells of the base table (this temporarily removes the repeating group from the base table).
Remove the repeating group from the base table.
Create a new table with the PK column from the base table and the repeating group.
First Normal Form (continued)
Fill the identical data into the empty cells of the base table to temporarily eliminate the repeating group, because a table with repeating groups violates the definition of a relational table.

ORDER_ID  DATE  CUST_ID  CUST_NAME    STATE  ITEM NUM  ITEM DESCRIP  QUANTITY  PRICE
2301      6/23  101      Volleyrite   IL     3786      net           3         35.00
2301      6/23  101      Volleyrite   IL     4011      racket        6         65.00
2301      6/23  101      Volleyrite   IL     9132      3-pack        8         4.75
2302      6/25  107      Herman’s     WI     5794      6-pack        4         5.00
2303      6/26  110      We-R-Sports  MI     4011      racket        2         65.00
2303      6/26  110      We-R-Sports  MI     3141      cover         2         10.00
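The fill step can be sketched in Python as follows. This is only an illustration: the tuple layout, the use of `None` for an empty cell, and the helper name `fill_down` are assumptions, not part of the lecture.

```python
# Each tuple mirrors one line of the un-normalized sheet; None marks an empty cell.
# Column order: order_id, date, cust_id, cust_name, state,
#               item_num, item_descrip, quantity, price
sheet = [
    (2301, "6/23", 101, "Volleyrite", "IL", 3786, "net", 3, 35.00),
    (None, None, None, None, None, 4011, "racket", 6, 65.00),
    (None, None, None, None, None, 9132, "3-pack", 8, 4.75),
    (2302, "6/25", 107, "Herman's", "WI", 5794, "6-pack", 4, 5.00),
    (2303, "6/26", 110, "We-R-Sports", "MI", 4011, "racket", 2, 65.00),
    (None, None, None, None, None, 3141, "cover", 2, 10.00),
]


def fill_down(rows, n_leading):
    """Copy the first n_leading values from the previous row into rows that lack them."""
    filled = []
    previous = None
    for row in rows:
        if row[0] is None:
            if previous is None:
                raise ValueError("first row has no order data to copy")
            row = previous[:n_leading] + row[n_leading:]
        filled.append(row)
        previous = row
    return filled


# The first five columns describe the order; the rest are the repeating group.
flat = fill_down(sheet, n_leading=5)
for row in flat:
    print(row)
```

After this step every row is complete, so the table is rectangular and the repeating group can be moved out in the next step.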
First Normal Form (continued)
Remove the repeating group of ITEM NUM, ITEM DESCRIPTION, QUANTITY, and PRICE from the base table and place it in a new table. The PK of the remaining table is ORDER ID. Create a new ORDER_ITEM table with ORDER ID and the repeating group.
First Normal Form (continued)
Therefore, we normalize the “Big” table into two “Small” relational tables (ORDER table and ORDER_ITEM table).

ORDER
ORDER_ID  DATE  CUST_ID  CUST_NAME    STATE
2301      6/23  101      Volleyrite   IL
2302      6/25  107      Herman’s     WI
2303      6/26  110      We-R-Sports  MI

ORDER_ITEM
ORDER_ID  ITEM NUM  ITEM DESCRIP  QUANTITY  PRICE
2301      3786      net           3         35.00
2301      4011      racket        6         65.00
2301      9132      3-pack        8         4.75
2302      5794      6-pack        4         5.00
2303      4011      racket        2         65.00
2303      3141      cover         2         10.00
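As a rough sketch of this split, the Python below walks the filled rows once and builds the two tables. The variable names `orders` and `order_items` are assumptions chosen for illustration.

```python
# Filled rows: order columns first, then the repeating group.
flat = [
    (2301, "6/23", 101, "Volleyrite", "IL", 3786, "net", 3, 35.00),
    (2301, "6/23", 101, "Volleyrite", "IL", 4011, "racket", 6, 65.00),
    (2301, "6/23", 101, "Volleyrite", "IL", 9132, "3-pack", 8, 4.75),
    (2302, "6/25", 107, "Herman's", "WI", 5794, "6-pack", 4, 5.00),
    (2303, "6/26", 110, "We-R-Sports", "MI", 4011, "racket", 2, 65.00),
    (2303, "6/26", 110, "We-R-Sports", "MI", 3141, "cover", 2, 10.00),
]

orders = {}       # ORDER table, keyed by ORDER_ID (the PK)
order_items = []  # ORDER_ITEM table: ORDER_ID plus the repeating group

for order_id, date, cust_id, cust_name, state, item_num, descrip, qty, price in flat:
    # The order columns repeat on every line, so storing them by key keeps one copy.
    orders[order_id] = (date, cust_id, cust_name, state)
    order_items.append((order_id, item_num, descrip, qty, price))

print(len(orders))       # 3
print(len(order_items))  # 6
```

Keying `orders` by ORDER_ID is what collapses the copied order data back to one row per order.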
Second Normal Form
Remove any non-key columns that are not dependent upon the table’s entire primary key.
Determine which non-key columns are not dependent upon the table’s entire primary key.
Remove those columns from the base table.
Create a second table with those columns and the column(s) from the PK that they are dependent upon.
Second Normal Form (continued)
Put the ORDER table in 2NF.

ORDER_ID  DATE  CUST_ID  CUST_NAME    STATE
2301      6/23  101      Volleyrite   IL
2302      6/25  107      Herman’s     WI
2303      6/26  110      We-R-Sports  MI

Remember, for a table to be in Second Normal Form, it must first be in 1NF. Then, every non-key column must be dependent on all parts of the primary key.
Is this in 2NF?
Second Normal Form (continued)
[Answer] The ORDER table is already in 2NF. Any value of ORDER_ID uniquely determines a single value of each column. Therefore, all columns are dependent on the PK ORDER_ID. Because the PK is a single column, no column can depend on only part of it.
Second Normal Form (continued)
So, what about this?
Put the ORDER_ITEM table in 2NF.

ORDER_ID  ITEM NUM  ITEM DESCRIP  QUANTITY  PRICE
2301      3786      net           3         35.00
2301      4011      racket        6         65.00
2301      9132      3-pack        8         4.75
2302      5794      6-pack        4         5.00
2303      4011      racket        2         65.00
2303      3141      cover         2         10.00

Remember again: for a table to be in Second Normal Form, it must first be in 1NF. Then, every non-key column must be dependent on all parts of the primary key. Remove any non-key columns that are not dependent upon the table’s entire primary key.
Second Normal Form (continued)
[Answer] No. The ORDER_ITEM table is not in 2NF, since PRICE and ITEM DESCRIPTION are dependent upon ITEM NUM but not upon ORDER ID. Only QUANTITY depends on both ORDER ID and ITEM NUM.
Now, how do we convert the table to 2NF?
Second Normal Form (continued)
To convert the table to 2NF, remove any partially dependent columns. Create an ITEM table with those columns and the part of the PK that they are dependent upon.

ORDER_ITEM
ORDER_ID  ITEM NUM  QUANTITY
2301      3786      3
2301      4011      6
2301      9132      8
2302      5794      4
2303      4011      2
2303      3141      2

ITEM
ITEM NUM  ITEM DESCRIP  PRICE
3786      net           35.00
4011      racket        65.00
9132      3-pack        4.75
5794      6-pack        5.00
3141      cover         10.00
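The partial dependency and the resulting split can be checked with a small Python sketch. The helper `determines` is a hypothetical name, and it only tests whether the sample rows are consistent with a dependency; sample data can refute a functional dependency but cannot prove that the business rule holds.

```python
def determines(rows, lhs, rhs):
    """True if rows that agree on the lhs columns always agree on the rhs columns."""
    seen = {}
    for row in rows:
        key = tuple(row[c] for c in lhs)
        value = tuple(row[c] for c in rhs)
        if seen.setdefault(key, value) != value:
            return False
    return True


# ORDER_ITEM in 1NF; the PK is assumed to be (order_id, item_num).
order_item = [
    {"order_id": 2301, "item_num": 3786, "descrip": "net", "quantity": 3, "price": 35.00},
    {"order_id": 2301, "item_num": 4011, "descrip": "racket", "quantity": 6, "price": 65.00},
    {"order_id": 2301, "item_num": 9132, "descrip": "3-pack", "quantity": 8, "price": 4.75},
    {"order_id": 2302, "item_num": 5794, "descrip": "6-pack", "quantity": 4, "price": 5.00},
    {"order_id": 2303, "item_num": 4011, "descrip": "racket", "quantity": 2, "price": 65.00},
    {"order_id": 2303, "item_num": 3141, "descrip": "cover", "quantity": 2, "price": 10.00},
]

# Description and price follow from item_num alone: a partial dependency.
partial = determines(order_item, ["item_num"], ["descrip", "price"])
# Quantity does not follow from item_num alone (item 4011 appears with 6 and with 2),
# but it does follow from the whole key.
qty_by_item = determines(order_item, ["item_num"], ["quantity"])
qty_by_key = determines(order_item, ["order_id", "item_num"], ["quantity"])

# 2NF split: move the partially dependent columns into ITEM, keyed by item_num.
item = {r["item_num"]: (r["descrip"], r["price"]) for r in order_item}
order_item_2nf = [(r["order_id"], r["item_num"], r["quantity"]) for r in order_item]

print(partial, qty_by_item, qty_by_key)  # True False True
print(len(item), len(order_item_2nf))    # 5 6
```

ITEM ends up with five rows rather than six because item 4011 was ordered twice but needs to be described only once.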
Third Normal Form
Remove any columns that are dependent upon another non-key column.
Determine which columns are dependent upon another non-key column.
Remove those columns from the base table.
Create a second table with those columns and the non-key column that they are dependent upon.
Third Normal Form (continued)
The ORDER table is already in 2NF, as mentioned. Put the ORDER table in 3NF.

ORDER_ID  DATE  CUST_ID  CUST_NAME    STATE
2301      6/23  101      Volleyrite   IL
2302      6/25  107      Herman’s     WI
2303      6/26  110      We-R-Sports  MI

Remember: remove any columns that are dependent upon another non-key column.
Is this in 3NF?
Third Normal Form (continued)
[Answer] CUSTOMER NAME and STATE are dependent upon CUSTOMER ID, and CUSTOMER ID is not the PK. Therefore, the ORDER table is not in 3NF.
Third Normal Form (continued)
Move the dependent non-key columns, together with the non-key column they depend upon, into a new CUSTOMER table.

ORDER
ORDER_ID  DATE  CUST_ID
2301      6/23  101
2302      6/25  107
2303      6/26  110

CUSTOMER
CUST_ID  CUST_NAME    STATE
101      Volleyrite   IL
107      Herman’s     WI
110      We-R-Sports  MI

Note: A table is in Third Normal Form if no non-key column is functionally dependent upon another non-key column.
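The move into a CUSTOMER table can be sketched with SQL run from Python. Several things here are assumptions made to keep the example self-contained: it uses SQLite through Python's standard `sqlite3` module rather than MySQL, the table names `order_2nf` and `orders` are invented (ORDER is a reserved word in SQL), and `CREATE TABLE ... AS SELECT` copies data without declaring primary or foreign keys.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- The ORDER table as it stands in 2NF.
    CREATE TABLE order_2nf (
        order_id   INTEGER PRIMARY KEY,
        order_date TEXT,
        cust_id    INTEGER,
        cust_name  TEXT,
        state      TEXT
    );
    INSERT INTO order_2nf VALUES
        (2301, '6/23', 101, 'Volleyrite',  'IL'),
        (2302, '6/25', 107, 'Herman''s',   'WI'),
        (2303, '6/26', 110, 'We-R-Sports', 'MI');

    -- Columns that depend on cust_id move out, together with cust_id itself.
    CREATE TABLE customer AS
        SELECT DISTINCT cust_id, cust_name, state FROM order_2nf;

    -- The order keeps cust_id as the link to its customer.
    CREATE TABLE orders AS
        SELECT order_id, order_date, cust_id FROM order_2nf;
""")

customers = conn.execute(
    "SELECT cust_id, cust_name, state FROM customer ORDER BY cust_id"
).fetchall()
# PRAGMA table_info returns one row per column; index 1 is the column name.
order_columns = [row[1] for row in conn.execute("PRAGMA table_info(orders)")]

print(customers)
print(order_columns)  # ['order_id', 'order_date', 'cust_id']
```

`SELECT DISTINCT` matters as soon as one customer places several orders: the customer's name and state are then stored once instead of once per order.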
Third Normal Form (continued)
No non-key column can be functionally dependent upon another non-key column.
Example
Consider the ORDER_ITEM table as follows:

ORDER_ID  ITEM NUM  QUANTITY
2301      3786      3
2301      4011      6
2301      9132      8
2302      5794      4
2303      4011      2
2303      3141      2

Is this in 3NF?
Third Normal Form (continued)
[Answer] All non-key attributes of the ORDER_ITEM table are dependent on the key, the whole key, and nothing but the key. Therefore, the ORDER_ITEM table is in 3NF.
Third Normal Form (continued)
What about this?
No non-key column can be functionally dependent upon another non-key column.
Example
Consider the ITEM table as follows:

ITEM NUM  ITEM DESCRIP  PRICE
3786      net           35.00
4011      racket        65.00
9132      3-pack        4.75
5794      6-pack        5.00
3141      cover         10.00
Third Normal Form (continued)
[Answer] All non-key attributes of the ITEM table are dependent on the key, the whole key, and nothing but the key. Therefore, the ITEM table is in 3NF.
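With all four tables in 3NF, a quick sanity check is that joining them reproduces the rows of the original filled table, so no information was lost. The sketch below is an illustration under stated assumptions: it uses SQLite through Python's `sqlite3` module instead of MySQL, lower-case table and column names, `orders` in place of the reserved word ORDER, and a composite key of (order_id, item_num) for ORDER_ITEM.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce FKs by default
conn.executescript("""
    CREATE TABLE customer (
        cust_id   INTEGER PRIMARY KEY,
        cust_name TEXT,
        state     TEXT
    );
    CREATE TABLE orders (
        order_id   INTEGER PRIMARY KEY,
        order_date TEXT,
        cust_id    INTEGER REFERENCES customer (cust_id)
    );
    CREATE TABLE item (
        item_num     INTEGER PRIMARY KEY,
        item_descrip TEXT,
        price        REAL
    );
    CREATE TABLE order_item (
        order_id INTEGER REFERENCES orders (order_id),
        item_num INTEGER REFERENCES item (item_num),
        quantity INTEGER,
        PRIMARY KEY (order_id, item_num)
    );
""")

# Parent tables are loaded before the tables that reference them.
conn.executemany("INSERT INTO customer VALUES (?, ?, ?)", [
    (101, "Volleyrite", "IL"), (107, "Herman's", "WI"), (110, "We-R-Sports", "MI"),
])
conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", [
    (2301, "6/23", 101), (2302, "6/25", 107), (2303, "6/26", 110),
])
conn.executemany("INSERT INTO item VALUES (?, ?, ?)", [
    (3786, "net", 35.00), (4011, "racket", 65.00), (9132, "3-pack", 4.75),
    (5794, "6-pack", 5.00), (3141, "cover", 10.00),
])
conn.executemany("INSERT INTO order_item VALUES (?, ?, ?)", [
    (2301, 3786, 3), (2301, 4011, 6), (2301, 9132, 8),
    (2302, 5794, 4), (2303, 4011, 2), (2303, 3141, 2),
])

# Join the four tables back into the shape of the original sheet.
rows = conn.execute("""
    SELECT o.order_id, o.order_date, c.cust_id, c.cust_name, c.state,
           i.item_num, i.item_descrip, oi.quantity, i.price
    FROM order_item AS oi
    JOIN orders   AS o ON o.order_id = oi.order_id
    JOIN customer AS c ON c.cust_id  = o.cust_id
    JOIN item     AS i ON i.item_num = oi.item_num
    ORDER BY o.order_id, i.item_num
""").fetchall()

for row in rows:
    print(row)
```

The join returns six rows, one per order line, while each customer and each item is stored exactly once.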
Normalization During Data Modeling
Ensure a 3NF table design by following the rules of data modeling.
A table must contain no repeating groups. [First Normal Form Rule]
Example
CLIENT
  #* identifier
  * date contacted
Is this entity CLIENT in 1NF? If not, how could it be converted to 1NF?
Normalization During Data Modeling
[Answer] The attribute date contacted has multiple values; therefore, the entity CLIENT is not in 1NF. Create an additional entity CONTACT with an M:1 relationship to CLIENT. In general, create an additional entity and a 1:M relationship to ensure 1NF.
CLIENT
  #* identifier
CONTACT
  #* date contacted
  o location
  o result
Relationship: CONTACT is for CLIENT; CLIENT is the subject of CONTACT.
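The same idea can be sketched in Python with two record types. Everything concrete here is an assumption for illustration: the class and field names, the `client_id` field standing in for the M:1 relationship, and the sample identifier and dates.

```python
from dataclasses import dataclass
from typing import Optional

# Before: one CLIENT record carrying a multi-valued attribute (not in 1NF).
# The identifier and dates are hypothetical sample values.
client_before = {"identifier": 1, "dates_contacted": ["6/23", "6/25"]}


@dataclass(frozen=True)
class Client:
    identifier: int


@dataclass(frozen=True)
class Contact:
    client_id: int                   # the M:1 relationship back to CLIENT
    date_contacted: str
    location: Optional[str] = None   # optional attribute (o)
    result: Optional[str] = None     # optional attribute (o)


# After: one CLIENT row and one CONTACT row per contact date.
client = Client(client_before["identifier"])
contacts = [
    Contact(client.identifier, date) for date in client_before["dates_contacted"]
]

print(client)
print(len(contacts))  # 2
```

Each contact now has its own row, so a client can be contacted any number of times without changing the shape of CLIENT.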
Normalization During Data Modeling
Validate attribute dependence upon its entity’s entire UID.
Every non-key column must be dependent upon all parts of the primary key. [Second Normal Form Rule]
An attribute must be dependent upon its entity’s entire unique identifier. [Corresponding Data Modeling Rule]
Normalization During Data Modeling
Example
ACCOUNT
  #* number
  o balance
  o date opened
  o bank location
BANK
  #* number
  * name
Relationship: ACCOUNT is managed by BANK; BANK is the manager of ACCOUNT.
Are all of the attributes in the E-R diagram dependent upon their entity’s UID?
Normalization During Data Modeling
[Answer] The attribute bank location is not dependent upon the UID of ACCOUNT. It is dependent upon the UID of BANK. Move the attribute to the entity whose UID it depends upon.
ACCOUNT
  #* number
  o balance
  o date opened
BANK
  #* number
  * name
  o bank location
Relationship: ACCOUNT is managed by BANK; BANK is the manager of ACCOUNT.
Normalization During Data Modeling
Validate attribute placement to ensure a normalized table design.
No non-key column can be functionally dependent upon another non-key column. [Third Normal Form Rule]
No non-UID attribute can be dependent upon another non-UID attribute. [Corresponding Data Modeling Rule]
Normalization During Data Modeling
Example
ORDER
  #* id
  * date of order
  * customer id
  * customer name
  * state
Are any of the non-UID attributes for this entity dependent upon another non-UID attribute?
Normalization During Data Modeling
[Answer] The attributes customer name and state are dependent upon the customer id. Create another entity called CUSTOMER with a UID of customer id, and place the attributes accordingly.
ORDER
  #* id
  * date of order
CUSTOMER
  #* id
  * name
  * state
Relationship: ORDER is for CUSTOMER; CUSTOMER is the submitter of ORDER.