Upload
others
View
2
Download
0
Embed Size (px)
Citation preview
Institute of Southern Punjab, Multan
Mr. Muhammad Nouman Farooq
BSC-H (Computer Science)
MS (Telecomm. and Networks)
Honors:
Magna Cumm Laude Honors Degree
Gold Medalist!
Blog Url: noumanfarooqatisp.wordpress.com
E-Mail: [email protected]
Advance Database Systems
Lecture# 6
Advanced Normalization
Lecture 6: Advanced Normalization
Normalization (Brief Overview)
Functional Dependencies and Keys
1st Normal Form
2nd Normal Form
3rd Normal Form
3.5 Normal Form (Boyce Codd Normal Form-BCNF)
4th Normal Form
5th Normal Form (Project-Join Normal Form-PJNF)
Domain Key Normal Form (DKNF)
6th Normal Form
Class Activity
3
4
Normalization (Brief Overview)
Normalization (Brief Overview)
The main goal of Database Normalization is to restructure the
logical data model of a database to:
1. Eliminate Redundancy.
2. Organize Data Efficiently.
3. Reduce the possibility of Data Anomalies/Irregularities.
5
Continued…
Data anomalies are inconsistencies in the data stored in a database as a result of an operation such as update, insertion, and deletion.
Such inconsistencies may arise when have a particular record stored in multiple locations and not all of the copies are updated.
We can prevent such anomalies by implementing 7 different level of normalization called Normal Forms (NF)
We’ll only look at the first Three.
6
Continued…
Database Normalization was first proposed by Edgar F. Codd.
Codd defined the first three Normal Forms, which we’ll look into,of the 7 known Normal Forms.
In order to do normalization we must know what the requirementsare for each of the three Normal Forms that we’ll go over.
One of the key requirements to remember is that Normal Formsare progressive. That is, in order to achieve 3rd NF we must have2nd NF and in order to have 2nd NF we must have 1st NF.
7
Levels of Normalization
8
Levels of normalization based on the amount of redundancy
in the database.
Various levels of normalization are:
First Normal Form (1NF)
Second Normal Form (2NF)
Third Normal Form (3NF)
Boyce-Codd Normal Form (BCNF) or 3.5 NF
Fourth Normal Form (4NF)
Fifth Normal Form (Project-Join Normal Form-PJNF)
Domain Key Normal Form (DKNF)
Sixth Normal Form (6NF)
Redundancy
Decre
ase
s
Num
ber o
f Table
s Incre
ase
s
Data Anomalies
1. Insertion Anomaly:
Cannot make a record of Jone’s address because he is not taking any classes.
10
Continued…
2. Update Anomaly:
Clearly, Name and Address are redundant (larger relation and we have to update 3 rows to update the Address)
11
Continued…
3. Delete Anomaly:
Cannot delete Jones’ enrolment without loosing his address as well
12
Functional Dependencies and Keys
13
Functional Dependencies and Keys
14
. Example is shown on next slide
15
16
Prime and Non-Prime Attributes
17
In Normalization Terminology, Any attribute that is completely or at least
part is a member of a Primary Key is known as a Prime Attribute instead of
the more common term Key Attribute.
So, a Non-Prime attribute or Non-Key attribute, is not part of any
Candidate Key.
Closure Set of Attributes
18
Trivial means D and E contains only its own value
Finding Candidate Keys using Closure Set of Attributes
19
So:
There are 15 possible Keys;
We have to find Candidate
key from these possible keys
Continued..
20
There are 15 possible Keys;
We have to find Candidate
key from these possible keys
Continued..
21
Finding Candidate Keys using Closure Set of Attributes
22
23
Continued..
1st Normal Form
24
1st Normal Form
The requirements to satisfy the 1st NF are:
1) Each table has a Primary Key: minimal set of attributes which
can uniquely identify a record
2) The values in each column of a table are Atomic (No multi-
value attributes allowed eg. stdCellNo; stdEmailAddress)
3) There are no repeating groups: two columns do not store
similar information in the same table.
25
2nd Normal Form
28
2nd Normal Form
The requirements to satisfy the 2nd NF:
All requirements for 1st NF must be met
Any partial functional dependencies have been removed (i.e., non-keys are identified by the Whole Primary Key).
bookISBN --> bookTitle, bookFirstAuthorName, bookPublisher.
In the given above example; The title of a book, the name of the firstauthor, and the publisher are functionally dependent on the book'sInternational Standard Book Number (ISBN). In other words all listedattributes are identified uniquely by the whole Primary Key which is“bookISBN” and this relation/file/table/ENTITY is in 2nd Normal Form.
29
30
31
33
34
3rd Normal Form
35
Continued…
We have seen how Database Normalization can decrease redundancy,
increase efficiency and reduce anomalies by implementing three of
seven different levels of normalization called Normal Forms.
The first Three Normal Forms (3-NF) are usually sufficient for small to
medium size applications.
39
3.5 Normal Form (Boyce Codd Normal
Form-BCNF)
40
42
44
45
46
4th Normal Form
50
Conditions that Satisfies a Relation R is in 4-NF
A relation R is in 4-NF if and only if the following Conditions satisfied:
1) R must be in 3.5 NF (BCNF).
2) It must not contain Multi Valued Dependencies (MVD’s).
Note: Multi Valued Attributes and Multi Valued Dependencies are different concepts.
Multi-Valued Dependency (MVD)
It is the dependency where one attribute value
is potentially a ‘Multi Valued fact’ about
another:
Important Points:
1) There must be three or more attributes exists
in a Relation (In example three Attributes
exists which are: Person, Mobile and
Food_Like)
2) Attributes or Subset of Attributes must be
independent of each other (In example Mobile
and Food_Like attributes are Independent of
each other).
Overview of Definitions of 4-NF and Multivalued Dependencies (MVD)
Example:
Anomalies in a Relation that is not in 4NF
Class Activity:
a) If not in 4NF than Decompose the given Relation into 4NF and Design an ERD
in Information Engineering Standard.
b) Write a complete Code of an ERD of part a) in MySQL using XAMPP as
Simulator and Show Relational Schema Diagram having Database Name: 4NF
5th Normal Form (Project-Join Normal
Form-PJNF)
58
Why we Required Decomposition of a Relation
Redundancy can be reduced
Data Independence can be maintained
Anomalies can be reduced
In Natural Join Operation; After Joining tables; Repeated Records inrows are Eliminated
In Additive Lossy Join; spurious RECORD/TUPPLE/INSTANCE createdafter Natural Joining on two or more than two tables (It can be called asInformation/record retrieve that was missing in original table)
In Non-Additive Lossless Join; We are not losing any Information andnot creating any SPURIOUS/FALSE RECORDS/INSTANCES/DATA/TUPPLES ina Relation/Table/File but repetition in Records must be removed
Additive Lossy Join means EXTRA
RECORD/TUPPLE/INSTANCE created
after Natural Joining on two or more
than two tables
In Natural Join; After Joining
multiple tables; Repeated values
Eliminated as shown in figures
Projection of R1 Natural Join Projection of R2 Natural Join Projection of R3 ….. Natural Join Projection of Rn = R
R = R1 (R) ⋈ R2 (R) ⋈ …… ⋈ Rn(R)
Conditions that Satisfies a Relation R is in 5-NF A relation R is in 5-NF if and only if:
1) R must be in 4-NF.
2) If Join Dependency (JD) not exists.
a) If Non-Additive Lossless Join exists under one condition that is;decomposition of main table/relation into smaller relations that musteliminate redundant records with out losing original information and without creating false/spurious records.
b) If Additive Lossy Join exists than it will also be in 5-NF under onecondition that is; gaining meaningful information that was missing inoriginal relation (not called as spurious record in certain situation)
67
Lossy and Lossless Join Examples under Two Conditions
Condition 1: When Data is given in a Relation
Condition 2: When Data is Not given in a Relation
The following given Relation is Lossy Join or Lossless Join?Condition 1: When Data is Given in a Relation
R1 is a Super Set of R2:Condition 1: Example 2
R1 ⋈ R2 ⋈ R3
Continued..
R1 ⋈ R2 ⋈ R3
Continued..
R1 (R) ⋈ R2 (R) ⋈ R3 (R)
Condition 1: Example 3
Condition 2: When Data is not given in a Relation
Solution: -
Step 1: Write data in Tabular Form: -
Step 2:
1. Apply Functional Dependencies as provided in question:
2. We will select column of these determinants containing minimum of
two stars.
Step 2 (Continued..)
Now, We will apply functional dependency where b c and c d
Step 3:
Check number of Stars in each Rows. If a Row contains stars in all
Columns than Lossless Join otherwise Lossy Join.
81
Domain Key Normal Form (DKNF)
82
6th Normal Form
86
Class Activity
91
Class Activity 1: Answer the following Questions
92
Class Activity 2: Answer the following Questions
93
a) Indicate the Level of Normal Form for the given below Relation.
b) If the Relation is not in 4-NF, Than decompose it into 4-NF Relations.
c) Design an ERD of 4-NF Relations in Information Engineering Standard.
Recommended Readings
Chapter 5 from:
Modern Database Management-8th Edition by Jeffrey
A. Hoffer, Mary B. Presscott & Fred R. McFadden
(Page No. 211-219)
Advanced Normal Forms from:
Modern Database Management-8th Edition by Jeffrey
A. Hoffer, Mary B. Presscott & Fred R. McFadden
(Page No. 605-610)
Recommended Readings
Chapter 14 from:
Database Systems-A Practical Approach to Design,
Implementation and Management by Thomas Connolly
and Carolyn Begg, 4th Edition (Page No. 481-485)
Chapter 4 from:
Database Systems-A Pragmatic Approach by Elvis C.
Foster, Shirpad V. Godbole (Page 72-79)
Recommended Readings
Chapter 6 from:
Database Systems - Design, Implementation and
Management by Carlos Coronel, Steven Morris and Peter
Rob, 9th Edition(Page No. 204-246)
Summary of Lecture
97
Lecture 6➦
Normalization (Brief Overview)
Functional Dependencies and Keys
1st Normal Form
2nd Normal Form
3rd Normal Form
3.5 Normal Form (Boyce Codd Normal Form-BCNF)
4th Normal Form
5th Normal Form (Project-Join Normal Form-PJNF)
Domain Key Normal Form (DKNF)
6th Normal Form
Class Activity
END OF LECTURE 6
98