Lecture 5: Functional Dependencies and Normalization for Relational Databases
Ref. Chapters 14-15
College of Computer and Information Sciences - Information Systems Dept.
IS220: Database Fundamentals
Prepared by: L. Nouf Almujally & Aisha AlArfaj
Last example added by: L. Kholoud Baselm
How to produce a good relation schema?
• Top-down approach: such as ER modeling. The ER model is mapped to a set of relations.
• Bottom-up approach: use normalization to create a set of relations.
Both approaches aim at a set of well-designed relations.
Bottom-up approach
STEPS:
1. Start with a set of relations.
2. Define the functional dependencies for the relation to specify the PK.
3. Transform relations to normal form.
Functional Dependencies
• A functional dependency describes the relationship between attributes in a relation.
• If A and B are attributes of relation R, B is functionally dependent on A, denoted by A -> B, if each value of A is associated with exactly one value of B. A value of B may be associated with several values of A.
• In A -> B, A is the determinant and B is the dependent.
Functional Dependencies
X -> Y
• X -> Y holds if whenever two tuples have the same value for X, they must have the same value for Y
• For any two tuples t and u in any relation instance r(R): If t[X]=u[X], then t[Y]=u[Y]
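The tuple-based definition above can be sketched as a small check over a relation instance. This is a minimal illustration, not part of the lecture: the function name `fd_holds` and the dictionary representation of tuples are assumptions, and the sample tuples reuse the StaffNo/position values from the example in this lecture. Note that an instance can only show that an FD is violated; an FD is a constraint on all legal instances, so passing the check on one instance does not prove it.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if no two rows agree on lhs but differ on rhs."""
    seen = {}  # maps each lhs value combination to the rhs values first seen with it
    for row in rows:
        x = tuple(row[a] for a in lhs)
        y = tuple(row[a] for a in rhs)
        if seen.setdefault(x, y) != y:
            return False  # two tuples agree on X but disagree on Y
    return True


# Sample tuples (assumed representation) based on the StaffNo/position example.
staff = [
    {"StaffNo": "SL21", "position": "Manager"},
    {"StaffNo": "SG5", "position": "Manager"},
]

print(fd_holds(staff, ["StaffNo"], ["position"]))  # True
print(fd_holds(staff, ["position"], ["StaffNo"]))  # False
```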
[Figure: two tuples t and u with columns X and Y. If t and u agree on X, then they must agree on Y.]
Functional Dependencies
Example
• StaffNo -> position: position is functionally dependent on StaffNo (e.g., SL21 is associated with exactly one position, Manager). This is a 1:1 or M:1 relationship between attributes in a relation.
• position -> StaffNo does not hold: StaffNo is NOT functionally dependent on position (e.g., Manager is associated with both SL21 and SG5). This is a 1:M relationship between attributes in a relation.
Examples of FD constraints
• Social security number determines employee name:
  SSN -> ENAME
• Project number determines project name and location:
  PNUMBER -> {PNAME, PLOCATION}
• Employee SSN and project number determine the hours per week that the employee works on the project:
  {SSN, PNUMBER} -> HOURS
Identifying the PK
• The purpose of functional dependencies is to specify the set of integrity constraints that must hold on a relation.
• The determinant attribute(s) are a candidate key (CK) of the relation if:
  • there is a 1:1 relationship between determinant and dependent;
  • all attributes that are not part of the CK are functionally dependent on the key: CK -> all attributes of R;
  • the dependency holds for all time.
• The Primary Key (PK) is the candidate key whose determinant is the minimal set of attributes needed to maintain the functional dependency.
Identifying the PK
• If a relation schema has more than one key, each is called a candidate key.
• One of the candidate keys is arbitrarily designated to be the primary key, and the others are called Alternate keys.
• A Prime attribute must be a member of some candidate key
• A Nonprime attribute is not a prime attribute—that is, it is not a member of any candidate key.
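Candidate keys and prime attributes can be derived mechanically from a set of FDs using attribute closure. The sketch below is illustrative only: the helper names `closure` and `candidate_keys` are assumptions, and it assumes a single relation that holds all the attributes of the SSN/PNUMBER example FDs listed earlier in this lecture. It tries attribute sets from smallest to largest, so it is suitable only for small schemas.

```python
from itertools import combinations


def closure(attrs, fds):
    """Return every attribute determined by attrs under the given FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def candidate_keys(all_attrs, fds):
    """Return the minimal attribute sets whose closure is the whole relation."""
    keys = []
    for size in range(1, len(all_attrs) + 1):
        for combo in combinations(sorted(all_attrs), size):
            if any(set(k) <= set(combo) for k in keys):
                continue  # contains a smaller key, so it is not minimal
            if closure(combo, fds) == set(all_attrs):
                keys.append(combo)
    return keys


# Assumed single relation built from the example FDs in this lecture.
attrs = {"SSN", "ENAME", "PNUMBER", "PNAME", "PLOCATION", "HOURS"}
fds = [
    (("SSN",), ("ENAME",)),
    (("PNUMBER",), ("PNAME", "PLOCATION")),
    (("SSN", "PNUMBER"), ("HOURS",)),
]

keys = candidate_keys(attrs, fds)
prime = sorted(set().union(*keys))
print(keys)   # [('PNUMBER', 'SSN')]
print(prime)  # ['PNUMBER', 'SSN']
```

Every attribute outside `prime` is nonprime in this relation.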
The Purpose of Normalization
• Normalization is a bottom-up approach to database design that begins by examining the relationships between attributes. It is performed as a series of tests on a relation to determine whether it satisfies or violates the requirements of a given normal form.
• Purpose:
- Guarantees no redundancy due to FDs
- Guarantees no update anomalies
• Normal Forms:
• First Normal Form (1NF)
• Second Normal Form (2NF)
• Third Normal Form (3NF)
• Boyce-Codd Normal Form (BCNF)
Normal Forms Defined Informally
• 1st normal form
• All attributes depend on the key
• 2nd normal form
• All attributes depend on the whole key
• 3rd normal form
• All attributes depend on nothing but the key
First Normal Form (1NF)
Unnormalized form (UNF): A relation that contains one or more repeating groups.
First normal form (1NF): A relation in which the intersection of each row and column contains one and only one value.
1NF disallows:
• multivalued attributes
• nested relations; attributes whose values for an individual tuple are non-atomic (repeating groups)
First Normal Form (1NF)
CLIENT_PROPERTY, unnormalized form (UNF)
ClientNo  PropertyNo       Name
CR76      PG4, PG16        John Key
CR56      PG4, PG36, PG16  Aline Stewart
The table is not in 1NF because it contains a multivalued attribute (PropertyNo).
UNF -> 1NF
• Divide the relation into two relations:
  • Relation 1:
    • Contains all the attributes except the multivalued attribute.
  • Relation 2:
    • Contains the primary key of the first relation and the multivalued attribute.
    • Its primary key becomes the combination of that primary key and the multivalued attribute.
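The two-relation split described above can be sketched in a few lines. This is a hedged illustration: the function name `split_multivalued` and the list-valued representation of the multivalued attribute are assumptions; the data is the CLIENT_PROPERTY example from this lecture.

```python
def split_multivalued(rows, key, multi):
    """Split rows into (relation without multi, relation of key + single multi value)."""
    base, link = [], []
    for row in rows:
        # Relation 1: every attribute except the multivalued one.
        base.append({a: v for a, v in row.items() if a != multi})
        # Relation 2: one tuple per (key, value) pair.
        for value in row[multi]:
            link.append({key: row[key], multi: value})
    return base, link


# UNF relation; the multivalued attribute is modelled as a Python list (an assumption).
unf = [
    {"ClientNo": "CR76", "Name": "John Key", "PropertyNo": ["PG4", "PG16"]},
    {"ClientNo": "CR56", "Name": "Aline Stewart", "PropertyNo": ["PG4", "PG36", "PG16"]},
]

client, client_property = split_multivalued(unf, "ClientNo", "PropertyNo")
print(len(client), len(client_property))  # 2 5
```

In the second relation the primary key is the pair (ClientNo, PropertyNo), matching the rule stated above.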
UNF -> 1NF
CLIENT_PROPERTY, unnormalized form (UNF)
ClientNo  PropertyNo       Name
CR76      PG4, PG16        John Key
CR56      PG4, PG36, PG16  Aline Stewart
1NF relations
CLIENT
ClientNo  Name
CR76      John Key
CR56      Aline Stewart
CLIENT_PROPERTY
ClientNo  PropertyNo
CR76      PG4
CR76      PG16
CR56      PG4
CR56      PG36
CR56      PG16
UNF (multivalued) -> 1NF
[Figure: a relation with a multivalued attribute in unnormalized form (UNF) and the resulting 1NF relations]
UNF (nested relations) -> 1NF
• First normal form also disallows multivalued attributes that are themselves composite. These are called nested relations because each tuple can have a relation within it.
[Figure: a relation in unnormalized form (UNF) containing a nested relation]
UNF (nested relations) -> 1NF
[Figure: the nested relation in unnormalized form (UNF) and the resulting 1NF relations EMPLOYEE and EMP_PROJECT]
EMP_PROJECT
Ssn        Pnumber  Hours
123456789  1        32.5
123456789  2        7.5
666884444  3        40.0
453453453  1        20.0
453453453  2        20.0
UNF (nested relations) -> 1NF
Unnormalized form (UNF), with a nested relation (EMP_NO, EMP_NM):
DPT_NO  MG_NO  EMP_NO  EMP_NM
D101    12345  20000   Carl Sagan
               20001   Mag James
               20002   Larry Bird
D102    13456  30000   Jim Carter
               30001   Paul Simon
1NF relations
Dept
DPT_NO  MG_NO
D101    12345
D102    13456
Dept_Emp
DPT_NO  EMP_NO  EMP_NM
D101    20000   Carl Sagan
D101    20001   Mag James
D101    20002   Larry Bird
D102    30000   Jim Carter
D102    30001   Paul Simon
Summary : first normal form (1NF)
We say a relation is in 1NF if:
• all values stored in the relation are single-valued and atomic.
• There are no Multivalued Attributes.
• There are no repeating groups or nested relations in the table.
Second Normal Form (2NF)
• Prime attribute: an attribute that is a member of the primary key K.
• Full functional dependency: when a non-prime attribute is determined by the whole of a COMPOSITE primary key.
CUSTOMER
Cust_ID  Name   Order_ID
101      AT&T   1234
101      AT&T   156
125      Cisco  1250
[Figure: full functional dependency marked on the CUSTOMER relation]
Second Normal Form (2NF)
• Partial dependency: when a non-key attribute is determined by a part, but not the whole, of a COMPOSITE primary key.
CUSTOMER
Cust_ID  Name   Order_ID
101      AT&T   1234
101      AT&T   156
125      Cisco  1250
[Figure: partial dependency marked on the CUSTOMER relation]
Summary: Second Normal Form (2NF)
• A relation R is in 2NF if:
1. R is in 1NF, and
2. all non-prime attributes are fully functionally dependent on the primary key, i.e., there is no partial dependency.
Second Normal Form (2NF)
To transform the relation to 2NF:
1) Remove partial dependencies by placing the functionally dependent attributes in a new relation along with a copy of their determinants.
Example 1: 1NF -> 2NF
Remove partial dependencies by placing the functionally dependent attributes in a new relation along with a copy of their determinants.
[Figure: a 1NF relation with its partial dependencies marked]
Example 2: Second normal form (2NF)
Inventory
Product_ID  Supplier_ID  Cost  Supplier Address
There are two non-key fields. So, here are the questions:
• If I know just Product_ID, can I find out Cost? No, because we have more than one supplier for the same product.
• If I know just Supplier_ID, can I find out Cost? No, because I need to know what the product is as well.
Therefore, Cost is fully functionally dependent upon the whole PK (Product_ID, Supplier_ID).
Example 2: Second normal form (2NF)
Inventory
Product_ID  Supplier_ID  Cost  Supplier Address
• If I know just Product_ID, can I find out Supplier Address? No, because we have more than one supplier for the same product.
• If I know just Supplier_ID, can I find out Supplier Address? Yes. The address does not depend upon the product.
Therefore, Supplier Address is NOT fully functionally dependent upon the whole PK (Product_ID, Supplier_ID); it depends only on Supplier_ID, which is a partial dependency.
Example 2: Second normal form (2NF)
Inventory
Product_ID  Supplier_ID  Cost
Supplier
Supplier_ID  Supplier Address
These relations are now in 2NF.
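The Inventory decomposition can be reproduced with a short sketch that moves partially dependent attributes into a new relation together with their determinant. The function name `remove_partial_dependencies`, the underscore in `Supplier_Address`, and the dictionary layout of the result are assumptions made for illustration. The sketch only considers the FDs it is given, not FDs implied by them.

```python
def remove_partial_dependencies(attrs, primary_key, fds):
    """Return relation schemas after moving partially dependent attributes out."""
    key = set(primary_key)
    remaining = list(attrs)
    new_relations = []
    for lhs, rhs in fds:
        moved = [a for a in rhs if a not in key]
        # Partial dependency: the determinant is a proper subset of the PK.
        if set(lhs) < key and moved:
            new_relations.append({"attrs": list(lhs) + moved, "key": list(lhs)})
            remaining = [a for a in remaining if a not in moved]
    return [{"attrs": remaining, "key": list(primary_key)}] + new_relations


inventory_attrs = ["Product_ID", "Supplier_ID", "Cost", "Supplier_Address"]
inventory_pk = ("Product_ID", "Supplier_ID")
inventory_fds = [
    (("Product_ID", "Supplier_ID"), ("Cost",)),   # full dependency on the PK
    (("Supplier_ID",), ("Supplier_Address",)),    # partial dependency
]

for relation in remove_partial_dependencies(inventory_attrs, inventory_pk, inventory_fds):
    print(relation)
```

The first schema printed corresponds to Inventory (Product_ID, Supplier_ID, Cost) and the second to Supplier (Supplier_ID, Supplier Address).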
Third Normal Form (3NF)
• Transitive functional dependency
X, Y, Z are attributes of a relation, such that:
• If X -> Y and Y -> Z, then Z is transitively dependent on X via Y.
EMPLOYEE
Emp_ID  F_Name  L_Name  Dept_ID  Dept_Name
111     Mary    Jones   1        Acct
122     Sarah   Smith   2        Mktg
[Figure: transitive dependency marked on the EMPLOYEE relation]
Third Normal Form (3NF)
Examples:
• SSN -> DMGRSSN is a transitive FD
  • Since SSN -> DNUMBER and DNUMBER -> DMGRSSN
• SSN -> ENAME is non-transitive
  • Since there is no set of attributes X where SSN -> X and X -> ENAME
Third Normal Form (2)
• A relation schema R is in third normal form (3NF) if:
1. R is in 2NF, and
2. there is NO transitive dependency, i.e., no non-key attribute determines another non-key attribute.
Third Normal Form (2)
• NOTE:
• In X -> Y and Y -> Z, with X as the primary key, we consider this a problem only if Y is not a candidate key.
• When Y is a candidate key, there is no problem with the transitive dependency.
• E.g., consider EMP (SSN, Emp#, Salary).
  • Here, SSN -> Emp# -> Salary and Emp# is a candidate key.
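The note above can be turned into a small check: a dependency X -> A is flagged only when X is not a superkey and A is nonprime. The sketch below is illustrative and assumes the relation is already in 2NF. The names `closure` and `transitive_violations` are assumptions, the EMPLOYEE FDs are read off the EMPLOYEE example in this lecture, and the FD Emp# -> SSN is added because the note states that Emp# is a candidate key.

```python
def closure(attrs, fds):
    """Return every attribute determined by attrs under the given FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def transitive_violations(all_attrs, candidate_keys, fds):
    """List FDs whose determinant is not a superkey and whose dependent is nonprime."""
    prime = set().union(*map(set, candidate_keys))
    violations = []
    for lhs, rhs in fds:
        is_superkey = closure(lhs, fds) == set(all_attrs)
        nonprime_rhs = sorted(set(rhs) - prime - set(lhs))
        if not is_superkey and nonprime_rhs:
            violations.append((tuple(lhs), tuple(nonprime_rhs)))
    return violations


employee_attrs = {"Emp_ID", "F_Name", "L_Name", "Dept_ID", "Dept_Name"}
employee_fds = [
    (("Emp_ID",), ("F_Name", "L_Name", "Dept_ID")),
    (("Dept_ID",), ("Dept_Name",)),
]
print(transitive_violations(employee_attrs, [("Emp_ID",)], employee_fds))
# [(('Dept_ID',), ('Dept_Name',))]

emp_attrs = {"SSN", "Emp#", "Salary"}
emp_fds = [
    (("SSN",), ("Emp#",)),
    (("Emp#",), ("SSN",)),     # follows from Emp# being a candidate key
    (("Emp#",), ("Salary",)),
]
print(transitive_violations(emp_attrs, [("SSN",), ("Emp#",)], emp_fds))
# []
```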
Third Normal Form (3NF)
To transform the relation to 3NF:
1) Meet all the requirements of 1NF
2) Meet all the requirements of 2NF
3) Place transitively dependent attributes in a new relation along with a copy of their determinants.
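Step 3 can be illustrated at the data level with the EMPLOYEE rows shown earlier in this lecture: project the relation onto two attribute sets, then confirm that joining them on the determinant gives back the original tuples. The helper name `project` and the variable names for the two new relations are assumptions made for this sketch.

```python
def project(rows, attrs):
    """Project rows onto attrs, dropping duplicate tuples."""
    seen, out = set(), []
    for row in rows:
        t = tuple(row[a] for a in attrs)
        if t not in seen:
            seen.add(t)
            out.append(dict(zip(attrs, t)))
    return out


employee = [
    {"Emp_ID": 111, "F_Name": "Mary", "L_Name": "Jones", "Dept_ID": 1, "Dept_Name": "Acct"},
    {"Emp_ID": 122, "F_Name": "Sarah", "L_Name": "Smith", "Dept_ID": 2, "Dept_Name": "Mktg"},
]

# Dept_Name depends on Dept_ID, so it moves out with a copy of its determinant.
emp = project(employee, ["Emp_ID", "F_Name", "L_Name", "Dept_ID"])
dept = project(employee, ["Dept_ID", "Dept_Name"])

# Joining on Dept_ID recovers the original tuples.
rejoined = [{**e, **d} for e in emp for d in dept if e["Dept_ID"] == d["Dept_ID"]]
print(rejoined == employee)  # True
```

Keeping a copy of the determinant (Dept_ID) in the original relation is what makes the join possible.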
Example: 2NF -> 3NF
If transitive dependencies exist, place transitively dependent attributes in a new relation along with a copy of their determinants.
[Figure: a 2NF relation with its transitive dependencies marked, decomposed into two 3NF relations]
Example: Third normal form (3NF)
Books
Book_ID  Author's Name  Author's Address  # of Pages
• If I know # of Pages, can I find out Author's Name? No. Can I find out Author's Address? No.
• If I know Author's Name, can I find out # of Pages? No. Can I find out Author's Address? YES.
Therefore, Author's Address depends on Author's Name rather than directly on the PK; it is transitively dependent on the PK via Author's Name.
The resulting 3NF relations:
Books
Book_ID  Author's Name  # of Pages
Authors
Author's Name  Author's Address
Important Notes
To normalize a relation (table):
• Make sure you follow the correct order (1NF, 2NF, 3NF):
  o Unnormalized: there are multivalued attributes or repeating groups.
  o 1st NF: no multivalued attributes or repeating groups.
  o 2nd NF: 1st NF + no partial dependency.
  o 3rd NF: 2nd NF + no transitive dependency.
• For each NF, write a full justification:
  • State why the table is or is not in that NF.
  • Give a full and correct reason (spelling is important).
• You should give your new relations (tables) names.
• Make sure you specify the primary key for new tables.
EXTRA EXAMPLES
Example 1: Determine NF
• ISBN -> Title
• ISBN -> Publisher
• Publisher -> Address
BOOK
ISBN  Title  Publisher  Address
No multivalued attributes; therefore, the relation is at least in 1NF.
There is no COMPOSITE primary key, therefore there can't be partial dependencies. Therefore, the relation is at least in 2NF.
Publisher is a non-prime attribute, and it determines Address, another non-prime attribute. Therefore, there is a transitive dependency, which means that the relation is NOT in 3NF.
We know that the relation is at least in 2NF, and it is not in 3NF. Therefore, we conclude that the relation is in 2NF.
In your solution you will write the following justification:
1) No M/V attributes, therefore at least 1NF
2) No partial dependencies, therefore at least 2NF
3) There is a transitive dependency (Publisher -> Address), therefore not 3NF
Conclusion: The relation is in 2NF
To convert it to 3NF: place transitively dependent attributes in a new relation along with a copy of their determinants.
BOOK
ISBN  Title  Publisher
Publisher
Publisher  Address
- Now in 3NF
Example 2: Determine NF
• Product_ID -> Description
ORDER
Order_No  Product_ID  Description
No multivalued attributes; therefore, the relation is at least in 1NF.
There is a COMPOSITE Primary Key (PK) (Order_No, Product_ID), therefore there can be partial dependencies. Product_ID, which is a part of the PK, determines Description; hence, there is a partial dependency. Therefore, the relation is not in 2NF.
We know that the relation is at least in 1NF, and it is not in 2NF. Therefore, we conclude that the relation is in 1NF.
In your solution you will write the following justification:
1) No M/V attributes, therefore at least 1NF
2) There is a partial dependency (Product_ID -> Description), therefore not in 2NF
Conclusion: The relation is in 1NF
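The justification procedure used in Examples 1 and 2 can be sketched as a function. This is a simplified illustration under stated assumptions: the relation is already in 1NF, the primary key is the only candidate key, and only the listed FDs are checked. The name `highest_normal_form` is hypothetical.

```python
def highest_normal_form(primary_key, fds):
    """Classify a 1NF relation as '1NF', '2NF' or '3NF' using the listed FDs."""
    key = set(primary_key)
    # Partial dependency: part of the PK determines a non-key attribute.
    if any(set(lhs) < key and set(rhs) - key for lhs, rhs in fds):
        return "1NF"
    # Transitive dependency: a determinant that does not contain the PK
    # determines a non-key attribute.
    if any(not key <= set(lhs) and set(rhs) - key for lhs, rhs in fds):
        return "2NF"
    return "3NF"


book_fds = [
    (("ISBN",), ("Title",)),
    (("ISBN",), ("Publisher",)),
    (("Publisher",), ("Address",)),
]
order_fds = [(("Product_ID",), ("Description",))]

print(highest_normal_form(("ISBN",), book_fds))                   # 2NF
print(highest_normal_form(("Order_No", "Product_ID"), order_fds))  # 1NF
print(highest_normal_form(("ISBN",), book_fds[:2]))               # 3NF
```

The last call models BOOK after Address has been moved to its own relation, which leaves only dependencies on the key.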
To convert ORDER to 2NF: remove partial dependencies by placing the functionally dependent attributes in a new relation along with a copy of their determinants.
ORDER
Order_No  Product_ID
Product
Product_ID  Description
- Now in 2NF
- No transitive dependency, therefore also in 3NF
Example 3: Determine NF
Is Emp_info in 1NF? 2NF? 3NF? Explain why.
Emp_info
EmpNum  EmpPhone  EmpDegrees
123     233-9876  BA
333     233-1231  BA, BSc, PhD
679     233-1231  BSc, MSc
Determine 1NF?
There are multivalued attributes (EmpDegrees); therefore, Emp_info is not in 1NF. Splitting it gives two 1NF relations:
Emp_phone (1NF)
EmpNum  EmpPhone
123     233-9876
333     233-1231
679     233-1231
Emp_Degree (1NF)
EmpNum  EmpDegrees
123     BA
333     BA
333     BSc
333     PhD
679     BSc
679     MSc
Determine 2NF?
There is no partial dependency in Emp_phone or Emp_Degree; therefore, both are in 2NF.
Determine 3NF?
There is no transitive dependency in Emp_phone or Emp_Degree; therefore, both are in 3NF.
References
• “Database Systems: A Practical Approach to Design, Implementation and Management.” Thomas Connolly, Carolyn Begg. 5th Edition, Addison-Wesley, 2009.
Normalization Example
[Figure: the example relation in unnormalized form (UNF)]
[Figure: functional dependencies (FDs) of the example relation]
1NF?
• NO. The relation is not in 1NF because it contains a repeating group.
• The structure of the repeating group is: Repeating group = (propertyNo, pAddress, rentStart, rentFinish, rent, ownerNo, oName)
• We remove the repeating group by placing the repeating data along with a copy of the PK in a separate relation.
[Figure: the UNF relation and the resulting 1NF relations]
2NF
[Figure: the resulting 2NF relations]
• The resulting 3NF relations have the form:
[Figure: the resulting 3NF relations]