Normalization

Normalization is a formal method consisting of a series of tests that help the database designer identify the optimal grouping of attributes for each relation in a relational schema. It can be applied to individual relations so that the database is normalized to a specific form, preventing the possible occurrence of update anomalies.


Data Redundancy and Update Anomalies

The main purpose of database design is to identify the optimal grouping of attributes in order to minimize data redundancy, which in turn saves storage space.

Data redundancy leads to UPDATE ANOMALIES, which are classified into 3 types:

Insertion Anomalies
Deletion Anomalies
Modification Anomalies


Insertion Anomalies
To insert the details of a new student into the Class_Info relation, we must also include the details of a lecturer and a subject in order to avoid null values.

Deletion Anomalies
If we delete a lecturer from the Class_Info relation, the details of the students and subjects associated with that lecturer are also lost from the database.

Modification Anomalies
If we want to change the value of one of the attributes of a particular student in the Class_Info relation, we must update all rows associated with that student. If this modification is not carried out on all the appropriate rows of the Class_Info relation, the database will become inconsistent.

Insertion Anomaly

Class_Info

LID    Lname    Salary  Dept  Subject         Credit  SID   Sname     GPA
E5001  Dusit    28700   EE    Electronic 1    3       S4    Panita    3.35
E5001  Dusit    28700   EE    Electronic 1    3       S5    Sarun     2.96
E5001  Dusit    28700   EE    Electronic 1    3       S6    Kanok     2.75
E5001  Dusit    28700   EE    Electronic 1    3       S7    Vichu     3.15
E6001  Anan     24900   IE    Optimization    3       S8    Kitti     2.54
E6001  Anan     24900   IE    Optimization    3       S9    Chareon   3.08
E6001  Anan     24900   IE    Prob Stat       4       S8    Kitti     2.54
E6001  Anan     24900   IE    Prob Stat       4       S9    Chareon   3.08
E6002  Saeree   53020   IE    Optimization    3       S10   Sathit    2.67
E6002  Saeree   53020   IE    Optimization    3       S11   Vitthaya  3.25
E9001  Pattara  18500   CPE   Data Structure  3       S1    Preeda    2.85
E9001  Pattara  18500   CPE   Data Structure  3       S2    Panu      2.45
E9001  Pattara  18500   CPE   Data Structure  3       S3    Vallapa   3.02
E9001  Pattara  18500   CPE   Web Service     4       S3    Vallapa   3.02
E9001  Pattara  18500   CPE   Web Service     4       S1    Preeda    2.85
E9001  Pattara  18500   CPE   Web Service     4       S2    Panu      2.45

Rows to be inserted:

NULL   NULL     NULL    NULL  NULL            NULL    S999  Luxana    NULL
E9999  Thana    17500   CPE   NULL            NULL    NULL  NULL      NULL
NULL   NULL     NULL    CPE   GIS             3       NULL  NULL      NULL

Inserting a new student, lecturer, or subject on its own forces NULL values (or redundant data) into the other fields.

Deletion Anomaly

Deleting rows may cause the loss of other necessary data. For example, deleting lecturer E6001 (Anan) from Class_Info removes these four rows:

LID    Lname  Salary  Dept  Subject       Credit  SID  Sname    GPA
E6001  Anan   24900   IE    Optimization  3       S8   Kitti    2.54
E6001  Anan   24900   IE    Optimization  3       S9   Chareon  3.08
E6001  Anan   24900   IE    Prob Stat     4       S8   Kitti    2.54
E6001  Anan   24900   IE    Prob Stat     4       S9   Chareon  3.08

With them go the only record of the Prob Stat subject and of students S8 (Kitti) and S9 (Chareon).

Modification Anomaly

If we want to change the value of one of the attributes of a particular entity in the relation, we must update all rows that relate to this entity. If this modification is not carried out on all the appropriate rows, the database will become inconsistent.

Class_Info examples: changing Dusit's salary to 45000 means updating all four of his rows. If Pattara's salary is changed to 25000 in one row and 21000 in another while the remaining rows still show 18500, or if Panu's GPA becomes 2.67 in one row while another still shows 2.45, the relation is inconsistent.
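The modification anomaly can be reproduced in a few lines. The sketch below is illustrative only: it keeps three of Pattara's Class_Info rows as Python dictionaries, and the helper name `inconsistent_lecturers` is an assumption, not part of the slides.

```python
# Three Class_Info rows for lecturer E9001 (values taken from Class_Info).
rows = [
    {"LID": "E9001", "Lname": "Pattara", "Salary": 18500, "SID": "S1"},
    {"LID": "E9001", "Lname": "Pattara", "Salary": 18500, "SID": "S2"},
    {"LID": "E9001", "Lname": "Pattara", "Salary": 18500, "SID": "S3"},
]


def inconsistent_lecturers(rows):
    """Return the LIDs that appear with more than one salary (hypothetical helper)."""
    salaries = {}
    for row in rows:
        salaries.setdefault(row["LID"], set()).add(row["Salary"])
    return sorted(lid for lid, values in salaries.items() if len(values) > 1)


before = inconsistent_lecturers(rows)
rows[0]["Salary"] = 25000  # update only one of the redundant copies
after = inconsistent_lecturers(rows)
print(before, after)
```

Because the salary is stored once per student row, a partial update is enough to make the relation contradict itself.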

To solve update anomalies, a relation must be normalized: the normalization process removes the existing data redundancy.


Functional Dependency

One of the main concepts associated with normalization is functional dependency, which describes the relationship between attributes in a relation. For example, if A and B are attributes (or sets of attributes) of relation R, B is functionally dependent on A (denoted A → B) if each value of A is associated with exactly one value of B.

The notation A → B can be read in any of the following ways: B is functionally dependent on A; A determines B; B depends on A.

Functional Dependencies

(Definition of Functional Dependency)

Suppose that A and B are attributes. We say that B is functionally dependent on A (denoted A → B) if each value of A is associated with exactly one value of B. (A and B may each consist of one or more attributes.)

The notation A → B means that B is functionally dependent on A, that A functionally determines B, or that B depends on A.

If the functional dependency α → β holds on schema R, then in any legal relation r, for all pairs of tuples t1 and t2 in r such that t1[α] = t2[α], it is also the case that t1[β] = t2[β].

Given a relation r, attribute y of r is functionally dependent on attribute x if and only if, whenever two tuples of r agree on their x-value, they must necessarily agree on their y-value.

For any tuples in the relation r, if the values of attribute α are the same, the DBMS must guarantee that the values of attribute β in those tuples are also the same. That is:

If α → β holds on R and t1[α] = t2[α], the DBMS must guarantee that t1[β] = t2[β].

A → B: B is functionally dependent on A.

When a functional dependency exists, the attribute or group of attributes on the left-hand side of the arrow is called the determinant.

Staff_No → Position: Position is functionally dependent on Staff_No.
SL21 → System Engineer

Position → Staff_No does not hold: Staff_No is not functionally dependent on Position.
System Engineer → SL21, SG5
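The tuple-pair definition translates directly into a check over a relation instance. This is a minimal sketch, assuming rows are held as Python dictionaries; `fd_holds` is a hypothetical name. Note that an instance can only refute a dependency: a dependency that happens to hold in sample data is not thereby proven for the schema.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if lhs -> rhs holds in this relation instance."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # Two tuples that agree on lhs must also agree on rhs.
        if seen.setdefault(key, value) != value:
            return False
    return True


# Staff example from the slides: two staff members share one position.
staff = [
    {"Staff_No": "SL21", "Position": "System Engineer"},
    {"Staff_No": "SG5", "Position": "System Engineer"},
]

staff_determines_position = fd_holds(staff, ["Staff_No"], ["Position"])
position_determines_staff = fd_holds(staff, ["Position"], ["Staff_No"])
print(staff_determines_position, position_determines_staff)
```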

Functional dependencies in the Class_Info relation:

(LID, Subject, SID) → Lname, Salary, Dept, Credit, Sname, GPA
LID → Lname, Salary, Dept
Subject → Credit
SID → Sname, GPA
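One way to confirm that (LID, Subject, SID) determines every attribute of Class_Info is to compute an attribute closure from the three smaller dependencies. The sketch below uses the standard closure loop; the function name `closure` and the list-of-pairs encoding of the dependencies are assumptions made for illustration.

```python
def closure(attrs, fds):
    """Return the set of attributes determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # Fire the dependency when its whole left side is already known.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


fds = [
    (["LID"], ["Lname", "Salary", "Dept"]),
    (["Subject"], ["Credit"]),
    (["SID"], ["Sname", "GPA"]),
]
all_attrs = {"LID", "Lname", "Salary", "Dept", "Subject",
             "Credit", "SID", "Sname", "GPA"}

key_closure = closure(["LID", "Subject", "SID"], fds)
lid_closure = closure(["LID"], fds)
print(key_closure == all_attrs, sorted(lid_closure))
```

The closure of LID alone stops at the lecturer attributes, which is why LID by itself cannot serve as the key.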

Using functional dependencies to decompose a relation

Class_Info

LID    Lname  Salary  Dept  Subject       Credit  SID  Sname    GPA
E5001  Dusit  28700   EE    Electronic 1  3       S4   Panita   3.35
E5001  Dusit  28700   EE    Electronic 1  3       S5   Sarun    2.96
E5001  Dusit  28700   EE    Electronic 1  3       S6   Kanok    2.75
E5001  Dusit  28700   EE    Electronic 1  3       S7   Vichu    3.15
E6001  Anan   24900   IE    Optimization  3       S8   Kitti    2.54
E6001  Anan   24900   IE    Optimization  3       S9   Chareon  3.08
…

Lecturer

LID    Lname    Salary  Dept
E5001  Dusit    28700   EE
E6001  Anan     24900   IE
E6002  Saeree   53020   IE
E9001  Pattara  18500   CPE

Subject

Subject         Credit
Electronic 1    3
Optimization    3
Prob Stat       4
Data Structure  3
Web Service     4

Student

SID  Sname    GPA
S1   Preeda   2.85
S2   Panu     2.45
S3   Vallapa  3.02
S4   Panita   3.35
S5   Sarun    2.96
S6   Kanok    2.75
S7   Vichu    3.15
S8   Kitti    2.54
S9   Chareon  3.08
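The decomposition can be sketched as three projections of Class_Info, each keeping one determinant and the attributes it determines. The helper `project` and the four sample rows are assumptions chosen to keep the example short.

```python
def project(rows, attrs):
    """Project rows onto attrs, dropping duplicate tuples but keeping order."""
    seen = []
    for row in rows:
        item = {a: row[a] for a in attrs}
        if item not in seen:
            seen.append(item)
    return seen


COLUMNS = ["LID", "Lname", "Salary", "Dept", "Subject",
           "Credit", "SID", "Sname", "GPA"]
class_info = [dict(zip(COLUMNS, values)) for values in [
    ("E5001", "Dusit", 28700, "EE", "Electronic 1", 3, "S4", "Panita", 3.35),
    ("E5001", "Dusit", 28700, "EE", "Electronic 1", 3, "S5", "Sarun", 2.96),
    ("E6001", "Anan", 24900, "IE", "Optimization", 3, "S8", "Kitti", 2.54),
    ("E6001", "Anan", 24900, "IE", "Prob Stat", 4, "S8", "Kitti", 2.54),
]]

lecturer = project(class_info, ["LID", "Lname", "Salary", "Dept"])
subject = project(class_info, ["Subject", "Credit"])
student = project(class_info, ["SID", "Sname", "GPA"])
print(len(lecturer), len(subject), len(student))
```

Each fact is now stored once. These three projections alone do not record which student takes which subject with which lecturer; a linking relation on (LID, Subject, SID) would presumably be needed for that, although the slide does not show one.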

Normalization takes a relation through a sequence of forms:

Unnormalized Form
1st Normal Form
2nd Normal Form
3rd Normal Form
Boyce-Codd Normal Form

The process of normalization is a formal method that identifies relations based on their primary key (or candidate keys, in the case of BCNF) and the functional dependencies among their attributes.

Relationships of Normal Forms

[Figure: nested normal forms, from 1NF through 2NF, 3NF/BCNF, 4NF, and 5NF to DKNF (higher normal forms).]

Case Study

The DreamHome company manages property on behalf of the owners, and as part of this service, the company takes care of the property's rental. To simplify this example, we assume that a customer rents a given property only once, and cannot rent more than one property at any one time.

Unnormalized form (UNF): A table that contains one or more repeating groups.

A repeating group is an attribute or group of attributes within a table that occurs with multiple values for a single occurrence of the key attribute(s) for that table. The term key refers to the attribute(s) that uniquely identify each row within the unnormalized table.

Customer_Rental Relation

Cust_No  CName          Property_No  PAddress       Rent  RentStart  RentFinish  Owner_No  OName
CR76     John Kay       PG4          6 Lawrence St  350   1-Jul-94   31-Aug-96   CO40      Tina Murphy
                        PG16         5 Norwar Dr    450   1-Sep-96   1-Sep-98    CO93      Tony Shaw
CR56     Aline Stewart  PG4          6 Lawrence St  350   1-Sep-92   10-Jun-94   CO40      Tina Murphy
                        PG36         2 Manor Rd     375   10-Oct-94  1-Dec-95    CO93      Tony Shaw
                        PG16         5 Norwar Dr    450   1-Jan-96   10-Aug-96   CO93      Tony Shaw

To convert the unnormalized form to 1NF, remove the repeating groups so that the data fit the relational model (data conceptually structured as a table). In the Customer_Rental relation this means repeating the customer details (CR76 John Kay; CR56 Aline Stewart) on every row of the rental group.

First normal form (1NF): A relation in which the intersection of each row and column contains one and only one value.

For the relational data model, it is important to recognize that only first normal form (1NF) is critical in creating appropriate relations. All the subsequent normal forms are optional. However, to avoid update anomalies, it is recommended that we proceed to at least 3NF.

Customer_Rental Relation

Customer_No  Property_No  CName          PAddress       Rent  RentStart  RentFinish  Owner_No  OName
CR76         PG4          John Kay       6 Lawrence St  350   1-Jul-94   31-Aug-96   CO40      Tina Murphy
CR76         PG16         John Kay       5 Norwar Dr    450   1-Sep-96   1-Sep-98    CO93      Tony Shaw
CR56         PG4          Aline Stewart  6 Lawrence St  350   1-Sep-92   10-Jun-94   CO40      Tina Murphy
CR56         PG36         Aline Stewart  2 Manor Rd     375   10-Oct-94  1-Dec-95    CO93      Tony Shaw
CR56         PG16         Aline Stewart  5 Norwar Dr    450   1-Jan-96   10-Aug-96   CO93      Tony Shaw
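Removing a repeating group amounts to copying the parent attributes onto each member of the group. The sketch below assumes the unnormalized data is held as nested Python dictionaries; the key name `rentals` and the function `to_1nf` are hypothetical, and only one customer with a few columns is shown.

```python
unf = [
    {"Customer_No": "CR76", "CName": "John Kay",
     "rentals": [
         {"Property_No": "PG4", "RentStart": "1-Jul-94", "RentFinish": "31-Aug-96"},
         {"Property_No": "PG16", "RentStart": "1-Sep-96", "RentFinish": "1-Sep-98"},
     ]},
]


def to_1nf(records, group_key):
    """Flatten one repeating group so every row holds single values only."""
    flat = []
    for record in records:
        parent = {k: v for k, v in record.items() if k != group_key}
        for item in record[group_key]:
            flat.append({**parent, **item})
    return flat


flat_rows = to_1nf(unf, "rentals")
for row in flat_rows:
    print(row)
```

The customer name now appears once per rental, which is exactly the redundancy that the later normal forms remove.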

Functional dependencies of the Customer_Rental relation

fd1  Customer_No, Property_No → RentStart, RentFinish  (Primary key)
fd2  Customer_No → CName  (Partial dependency)
fd3  Property_No → PAddress, Rent, Owner_No, OName  (Partial dependency)
fd4  Owner_No → OName  (Transitive dependency)
fd5  Customer_No, RentStart → Property_No, PAddress, RentFinish, Rent, Owner_No, OName  (Candidate key)
fd6  Property_No, RentStart → Customer_No, CName, RentFinish  (Candidate key)
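The labels on fd2, fd3, and fd4 can be derived mechanically from the primary key. This is a rough sketch under two simplifying assumptions: a partial dependency is taken to be one whose determinant is a proper subset of the primary key, and a dependency whose determinant contains no key attribute is flagged as a likely transitive one. The function names are illustrative, and fd5 and fd6 are left out because they describe alternative candidate keys.

```python
primary_key = {"Customer_No", "Property_No"}
fds = {
    "fd1": ({"Customer_No", "Property_No"}, {"RentStart", "RentFinish"}),
    "fd2": ({"Customer_No"}, {"CName"}),
    "fd3": ({"Property_No"}, {"PAddress", "Rent", "Owner_No", "OName"}),
    "fd4": ({"Owner_No"}, {"OName"}),
}


def partial_dependencies(fds, key):
    """FDs whose determinant is a proper subset of the key."""
    return sorted(name for name, (lhs, rhs) in fds.items()
                  if lhs < key and rhs - key)


def non_key_determinants(fds, key):
    """FDs whose determinant contains no key attribute at all."""
    return sorted(name for name, (lhs, _) in fds.items() if not lhs & key)


partial = partial_dependencies(fds, primary_key)
transitive = non_key_determinants(fds, primary_key)
print(partial, transitive)
```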

[Figure: dependency diagram of the Customer_Rental relation (Customer_No, Property_No, CName, PAddress, Rent, RentStart, RentFinish, Owner_No, OName), showing fd1 (primary key), fd2 and fd3 (partial dependencies), fd4 (transitive dependency), and fd5 and fd6 (candidate keys).]

Second Normal Form (2NF): A relation that is in first normal form and in which every non-primary-key attribute is fully functionally dependent on the primary key.

Full functional dependency: If A and B are attributes of a relation, B is fully functionally dependent on A if B is functionally dependent on A, but not on any proper subset of A.

If B is a non-key attribute that is functionally dependent on only part of the primary key A, B is said to be partially dependent on A. A partial dependency must be removed by moving the attributes involved into a new table, so that the non-key attribute becomes fully dependent on the primary key of that table.

[Figure: dependency diagram of Customer_Rental highlighting fd1 (primary key) and the partial dependencies fd2 and fd3.]

2NF applies to relations with composite keys, that is, relations whose primary key is composed of two or more attributes. A relation with a single-attribute primary key is automatically in at least 2NF.

Customer (Customer_No, CName)
Rental (Customer_No, Property_No, RentStart, RentFinish)
Property_Owner (Property_No, PAddress, Rent, Owner_No, OName)

Customer Relation

Customer_No  CName
CR76         John Kay
CR56         Aline Stewart

Rental Relation

Customer_No  Property_No  RentStart  RentFinish
CR76         PG4          1-Jul-94   31-Aug-96
CR76         PG16         1-Sep-96   1-Sep-98
CR56         PG4          1-Sep-92   10-Jun-94
CR56         PG36         10-Oct-94  1-Dec-95
CR56         PG16         1-Jan-96   10-Aug-96

Property_Owner Relation

Property_No  PAddress       Rent  Owner_No  OName
PG4          6 Lawrence St  350   CO40      Tina Murphy
PG16         5 Norwar Dr    450   CO93      Tony Shaw
PG36         2 Manor Rd     375   CO93      Tony Shaw

Transitive dependency

Customer (Customer_No, CName)
Rental (Customer_No, Property_No, RentStart, RentFinish)
Property_Owner (Property_No, PAddress, Rent, Owner_No, OName, address)

Property_Owner Relation

Property_No  PAddress       Rent  Owner_No  OName        address
PG4          6 Lawrence St  350   CO40      Tina Murphy  28 North Rye
PG16         5 Norwar Dr    450   CO93      Tony Shaw    550/8 Lake Shore Dr.
PG36         2 Manor Rd     375   CO93      Tony Shaw    550/8 Lake Shore Dr.

The Customer and Rental relations are unchanged. In Property_Owner, OName and address depend on Owner_No, which in turn depends on Property_No: a transitive dependency.

Transitive dependency: A condition where A, B, and C are attributes of a relation such that if A → B and B → C, then C is transitively dependent on A via B (provided that A is not functionally dependent on B or C).

Definition of Third Normal Form:

A relation that is in first and second normal form, and in which no non-primary-key attribute is transitively dependent on the primary key.

Removing the transitive dependency splits Property_Owner into two relations:

Property_for_Rent Relation

Property_No  PAddress       Rent  Owner_No
PG4          6 Lawrence St  350   CO40
PG16         5 Norwar Dr    450   CO93
PG36         2 Manor Rd     375   CO93

Owner Relation

Owner_No  OName        address
CO40      Tina Murphy  28 North Rye
CO93      Tony Shaw    550/8 Lake Shore Dr.

The 1NF Customer_Rental relation is thus decomposed into four 3NF relations:

Customer (Customer_No, CName)
Rental (Customer_No, Property_No, RentStart, RentFinish)
Property_for_Rent (Property_No, PAddress, Rent, Owner_No)
Owner (Owner_No, OName, address)
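As an illustration, the four 3NF relations can be declared in an in-memory SQLite database and joined back together. This is only a sketch: the column types, the foreign-key clauses, and the table name Property_for_Rent are assumptions, and just one row per table is loaded.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Customer (Customer_No TEXT PRIMARY KEY, CName TEXT);
CREATE TABLE Owner (Owner_No TEXT PRIMARY KEY, OName TEXT, address TEXT);
CREATE TABLE Property_for_Rent (
    Property_No TEXT PRIMARY KEY, PAddress TEXT, Rent INTEGER,
    Owner_No TEXT REFERENCES Owner(Owner_No));
CREATE TABLE Rental (
    Customer_No TEXT REFERENCES Customer(Customer_No),
    Property_No TEXT REFERENCES Property_for_Rent(Property_No),
    RentStart TEXT, RentFinish TEXT,
    PRIMARY KEY (Customer_No, Property_No));
INSERT INTO Customer VALUES ('CR76', 'John Kay');
INSERT INTO Owner VALUES ('CO40', 'Tina Murphy', '28 North Rye');
INSERT INTO Property_for_Rent VALUES ('PG4', '6 Lawrence St', 350, 'CO40');
INSERT INTO Rental VALUES ('CR76', 'PG4', '1-Jul-94', '31-Aug-96');
""")

# Joining on the keys rebuilds a row of the original Customer_Rental relation.
row = conn.execute("""
SELECT r.Customer_No, c.CName, p.PAddress, o.OName
FROM Rental r
JOIN Customer c ON c.Customer_No = r.Customer_No
JOIN Property_for_Rent p ON p.Property_No = r.Property_No
JOIN Owner o ON o.Owner_No = p.Owner_No
""").fetchone()
print(row)
```

The composite primary key on Rental reflects the case-study assumption that a customer rents a given property only once.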

Summary of the decomposition

1NF: Customer_Rental
2NF: Customer, Rental, Property_Owner
3NF: Customer, Rental, Property_for_Rent, Owner

From 3NF to Boyce-Codd Normal Form (BCNF)

The difference between 3NF and BCNF is that, for a functional dependency A → B, 3NF allows this dependency in a relation if B is a primary-key attribute and A is not a candidate key, whereas BCNF insists that for this dependency to remain in a relation, A must be a candidate key. Therefore, BCNF is a stronger form of 3NF, such that every relation in BCNF is also in 3NF.

BCNF is based on functional dependencies that take into account all candidate keys in a relation. For a relation with only one candidate key, 3NF and BCNF are equivalent.

Boyce-Codd normal form (BCNF): A relation is in BCNF if and only if every determinant is a candidate key.

Violation of BCNF is quite rare, since it may only happen under specific conditions. The potential to violate BCNF may occur in a relation that

• contains two (or more) composite candidate keys, and

• whose candidate keys overlap, that is, share at least one attribute in common.

Case Study

In this example, the Client_Interview relation is presented. It contains details of the arrangements for interviews of clients by members of staff of the DreamHome company. The members of staff involved in interviewing clients are allocated to a specific room on the day of interview. However, a room may be allocated to several members of staff as required throughout a working day. A client is only interviewed once on a given date, but may be requested to attend further interviews at later dates. This relation has three candidate keys:

(Client_No, Interview_Date), (Staff_No, Interview_Date, Interview_Time), and (Room_No, Interview_Date, Interview_Time).

Therefore the Client_Interview relation has three composite candidate keys, which overlap by sharing the common attribute Interview_Date. We select (Client_No, Interview_Date) to act as the primary key for this relation.

Client_Interview (Client_No, Interview_Date, Interview_Time, Staff_No, Room_No)

The Client_Interview relation has the following functional dependencies:

fd1  Client_No, Interview_Date → Interview_Time, Staff_No, Room_No  (Primary key)
fd2  Staff_No, Interview_Date, Interview_Time → Client_No  (Candidate key)
fd3  Room_No, Interview_Date, Interview_Time → Staff_No, Client_No  (Candidate key)
fd4  Staff_No, Interview_Date → Room_No

Client_Interview Relation

Client_No  Interview_Date  Interview_Time  Staff_No  Room_No
CR76       13-May-98       10:30           SG5       G101
CR56       13-May-98       12:00           SG5       G101
CR74       13-May-98       12:00           SG37      G102
CR56       1-Jul-98        10:30           SG5       G102
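The BCNF test ("every determinant is a candidate key") can be checked with attribute closures. The sketch below flags any dependency whose determinant does not determine all attributes; since the determinants of fd1 to fd3 are the stated candidate keys, a superkey test is enough here. The function names and the dictionary encoding are assumptions for illustration.

```python
def closure(attrs, fds):
    """Return the set of attributes determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


attrs = {"Client_No", "Interview_Date", "Interview_Time", "Staff_No", "Room_No"}
fds = {
    "fd1": (("Client_No", "Interview_Date"),
            ("Interview_Time", "Staff_No", "Room_No")),
    "fd2": (("Staff_No", "Interview_Date", "Interview_Time"), ("Client_No",)),
    "fd3": (("Room_No", "Interview_Date", "Interview_Time"),
            ("Staff_No", "Client_No")),
    "fd4": (("Staff_No", "Interview_Date"), ("Room_No",)),
}


def bcnf_violations(attrs, fds):
    """Names of FDs whose determinant is not a superkey of the relation."""
    rules = list(fds.values())
    return sorted(name for name, (lhs, _) in fds.items()
                  if closure(lhs, rules) != attrs)


violations = bcnf_violations(attrs, fds)
print(violations)
```

Only fd4 is reported: (Staff_No, Interview_Date) determines Room_No but nothing else, so it is a determinant that is not a candidate key.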

To reach BCNF, the relation is decomposed so that fd4 has its own relation:

Interview (Client_No, Interview_Date, Interview_Time, Staff_No)
Staff_Room (Staff_No, Interview_Date, Room_No)

Interview Relation

Client_No  Interview_Date  Interview_Time  Staff_No
CR76       13-May-98       10:30           SG5
CR56       13-May-98       12:00           SG5
CR74       13-May-98       12:00           SG37
CR56       1-Jul-98        10:30           SG5

Staff_Room Relation

Staff_No  Interview_Date  Room_No
SG5       13-May-98       G101
SG37      13-May-98       G102
SG5       1-Jul-98        G102
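A quick way to see that this decomposition loses no information is to project the original rows onto the two schemas and join them back on (Staff_No, Interview_Date). The sketch below does this with plain Python sets; the variable names are illustrative.

```python
# Rows are (Client_No, Interview_Date, Interview_Time, Staff_No, Room_No).
client_interview = [
    ("CR76", "13-May-98", "10:30", "SG5", "G101"),
    ("CR56", "13-May-98", "12:00", "SG5", "G101"),
    ("CR74", "13-May-98", "12:00", "SG37", "G102"),
    ("CR56", "1-Jul-98", "10:30", "SG5", "G102"),
]

# Projections onto the two BCNF schemas (sets remove duplicates).
interview = {(c, d, t, s) for c, d, t, s, _r in client_interview}
staff_room = {(s, d, r) for _c, d, _t, s, r in client_interview}

# Natural join on the shared attributes Staff_No and Interview_Date.
rejoined = {
    (c, d, t, s, r)
    for (c, d, t, s) in interview
    for (s2, d2, r) in staff_room
    if (s, d) == (s2, d2)
}

print(rejoined == set(client_interview), len(staff_room))
```

The join returns exactly the original four rows, while Staff_Room stores each staff member's room for a day only once.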

Review of Normalization (1NF to BCNF)

The DreamHome company manages property on behalf of the owners, and as part of this service the company undertakes regular inspections of the property by members of staff. When staff are required to undertake these inspections, they are allocated a company car for use on the day of the inspections. However, a car may be allocated to several members of staff, as required throughout the working day. A member of staff may inspect several properties on a given date, but a property is only inspected once on a given date.

Property_Inspection (Property_No, PAddress, IDate, ITime, Comments, Staff_No, SName, Car_Reg)

Property_Inspection Relation

Property_No  PAddress       IDate      ITime  Comments                  Staff_No  SName       Car_Reg
PG4          6 Lawrence St  18-Oct-96  10:00  Need to replace crockery  SG37      Ann Beech   M231 JGR
                            22-Apr-97  09:00  In good order             SG14      David Ford  M533 HDR
                            1-Oct-98   12:00  Damp rot in bathroom      SG14      David Ford  N721 HFR
PG16         5 Norwar Dr    22-Apr-96  13:00  Replace room carpet       SG14      David Ford  M533 HDR
                            24-Oct-97  14:00  Good condition            SG37      Ann Beech   N721 HFR

1NF: Property_Inspection Relation

Property_Inspection (Property_No, IDate, ITime, PAddress, Comments, Staff_No, SName, Car_Reg)

Property_No  IDate      ITime  PAddress       Comments                  Staff_No  SName       Car_Reg
PG4          18-Oct-96  10:00  6 Lawrence St  Need to replace crockery  SG37      Ann Beech   M231 JGR
PG4          22-Apr-97  09:00  6 Lawrence St  In good order             SG14      David Ford  M533 HDR
PG4          1-Oct-98   12:00  6 Lawrence St  Damp rot in bathroom      SG14      David Ford  N721 HFR
PG16         22-Apr-96  13:00  5 Norwar Dr    Replace room carpet       SG14      David Ford  M533 HDR
PG16         24-Oct-97  14:00  5 Norwar Dr    Good condition            SG37      Ann Beech   N721 HFR

[Figure: dependency diagram of Property_Inspection showing FD1 to FD6, annotated as primary key, partial dependency, transitive dependency, and two candidate keys.]

Remove the partial dependency (decompose the relation) to obtain 2NF

[Figure: dependency diagram of Property_Inspection highlighting FD1 (primary key) and FD2 (partial dependency).]

Property Relation

Property_No  PAddress
PG4          6 Lawrence St
PG16         5 Norwar Dr

Property_Inspection Relation

Property_No  IDate      ITime  Comments                  Staff_No  SName       Car_Reg
PG4          18-Oct-96  10:00  Need to replace crockery  SG37      Ann Beech   M231 JGR
PG4          22-Apr-97  09:00  In good order             SG14      David Ford  M533 HDR
PG4          1-Oct-98   12:00  Damp rot in bathroom      SG14      David Ford  N721 HFR
PG16         22-Apr-96  13:00  Replace room carpet       SG14      David Ford  M533 HDR
PG16         24-Oct-97  14:00  Good condition            SG37      Ann Beech   N721 HFR

[Figure: dependency diagram of the remaining Property_Inspection relation (Property_No, IDate, ITime, Comments, Staff_No, SName, Car_Reg) showing FD1 (primary key), FD3, FD4, FD5, and FD6, annotated with a transitive dependency and two candidate keys. The Property relation (Property_No, PAddress) is unchanged.]

Remove the transitive dependency (decompose the relation) to obtain 3NF

The Property relation is unchanged.

Property_Inspection Relation

Property_No  IDate      ITime  Comments                  Staff_No  Car_Reg
PG4          18-Oct-96  10:00  Need to replace crockery  SG37      M231 JGR
PG4          22-Apr-97  09:00  In good order             SG14      M533 HDR
PG4          1-Oct-98   12:00  Damp rot in bathroom      SG14      N721 HFR
PG16         22-Apr-96  13:00  Replace room carpet       SG14      M533 HDR
PG16         24-Oct-97  14:00  Good condition            SG37      N721 HFR

Staff Relation

Staff_No  SName
SG37      Ann Beech
SG14      David Ford

Remove remaining anomalies from functional dependencies to obtain BCNF

Property (Property_No, PAddress)
Staff (Staff_No, SName)
Staff_Car (Staff_No, IDate, Car_Reg)
Inspection (Property_No, IDate, ITime, Comments, Staff_No)

From BCNF to Fourth Normal Form (4NF)

Although BCNF removes any anomalies due to functional dependencies, further research led to the identification of another type of dependency called multi-valued dependency (MVD), which can cause similar design problems for relations in terms of data redundancy.

The following table is in BCNF but still suffers from update anomalies.

Lec_Sub_Research Relation

Lecturer_Name  Subject         Research
Yuen           Data Structure  Natural Language Processing
Yuen           Data Structure  Protocol Analyzer
Yuen           Discrete Math   Natural Language Processing
Yuen           Discrete Math   Protocol Analyzer
Yuen           Data Base       Natural Language Processing
Yuen           Data Base       Protocol Analyzer
Chalerrmsak    Data Structure  Protocol Analyzer
Chalerrmsak    Data Structure  Compiler Utilities
Chalerrmsak    Data Structure  Natural Language Processing

Multi-valued dependency (MVD): Represents a dependency between attributes (for example, A, B, and C) in a relation, such that for each value of A there is a set of values for B and a set of values for C. However, the sets of values for B and C are independent of each other.

A →→ B
A →→ C

Lecturer →→ Subject
Lecturer →→ Research

The Lec_Sub_Research relation is decomposed into two relations:

Lec_Sub Relation

Lecturer_Name  Subject
Yuen           Data Structure
Yuen           Discrete Math
Yuen           Data Base
Chalerrmsak    Data Structure

Lec_Research Relation

Lecturer_Name  Research
Yuen           Natural Language Processing
Yuen           Protocol Analyzer
Chalerrmsak    Protocol Analyzer
Chalerrmsak    Compiler Utilities
Chalerrmsak    Natural Language Processing
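The independence of the two value sets can be tested directly: for each lecturer, the stored (subject, research) pairs must equal the full cross product of that lecturer's subjects and research topics. The sketch below is illustrative; `mvd_holds` is a hypothetical helper and it assumes a relation with exactly three attributes.

```python
from itertools import product


def mvd_holds(rows):
    """rows are (lecturer, subject, research); test Lecturer ->> Subject."""
    by_lecturer = {}
    for lecturer, subject, research in rows:
        by_lecturer.setdefault(lecturer, set()).add((subject, research))
    for pairs in by_lecturer.values():
        subjects = {s for s, _ in pairs}
        topics = {r for _, r in pairs}
        # Every subject must be paired with every research topic.
        if pairs != set(product(subjects, topics)):
            return False
    return True


nlp, pa, cu = "Natural Language Processing", "Protocol Analyzer", "Compiler Utilities"
rows = [
    ("Yuen", "Data Structure", nlp), ("Yuen", "Data Structure", pa),
    ("Yuen", "Discrete Math", nlp), ("Yuen", "Discrete Math", pa),
    ("Yuen", "Data Base", nlp), ("Yuen", "Data Base", pa),
    ("Chalerrmsak", "Data Structure", pa),
    ("Chalerrmsak", "Data Structure", cu),
    ("Chalerrmsak", "Data Structure", nlp),
]

full = mvd_holds(rows)
missing_one = mvd_holds(rows[1:])  # drop one Yuen row
print(full, missing_one)
```

Dropping a single row breaks the dependency, which shows why adding one subject for Yuen would require inserting a row for each of his research topics in the undecomposed relation.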

Unnormalized form (UNF)
    Remove repeating groups
First normal form (1NF)
    Remove partial dependencies
Second normal form (2NF)
    Remove transitive dependencies
Third normal form (3NF)
    Remove remaining anomalies from functional dependencies
Boyce-Codd normal form (BCNF)
    Remove multi-valued dependencies
Fourth normal form (4NF)
