RDBMS Concepts/ Session 3

Objectives

In this lesson, you will learn to:
- Describe data redundancy
- Describe the first, second, and third normal forms
- Describe the Boyce-Codd Normal Form (BCNF)
- Appreciate the need for denormalization


Normalization

The logical design of the database, including the tables and the relationships between them, is the core of an optimized relational database.

A good logical database design can lay the foundation for optimal database and application performance. A poor logical database design can impair the performance of the entire system.

Normalizing a logical database design involves using formal methods to separate the data into multiple, related tables.

A greater number of narrow tables (with fewer columns) is characteristic of a normalized database. A few wide tables (with more columns) are characteristic of a non-normalized database.

Understanding Data Redundancy

- Redundancy means repetition of data.
- Redundancy increases the time involved in updating, adding, and deleting data.
- It also increases the utilization of disk space, and hence disk I/O increases.

Understanding Data Redundancy (Contd.)

Redundancy can lead to the following problems:
- Update anomalies: inserting, modifying, and deleting data may cause inconsistencies.
- Inconsistencies: errors are more likely to occur when facts are repeated.
- Unnecessary utilization of extra disk space.
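To make the update anomaly concrete, here is a minimal Python sketch. The flat table, its column names, and the branch values are hypothetical and are not taken from the slides; the point is only that a fact stored on several rows can be changed on one row and left stale on the others.

```python
# Hypothetical flat table: the branch name is repeated on every employee row.
rows = [
    {"employee_no": 1, "branch_code": "B1", "branch_name": "Downtown"},
    {"employee_no": 2, "branch_code": "B1", "branch_name": "Downtown"},
    {"employee_no": 3, "branch_code": "B2", "branch_name": "Airport"},
]

# An update that touches only one of the repeated copies.
rows[0]["branch_name"] = "City Centre"


def inconsistent_branches(rows):
    """Return branch codes that map to more than one branch name."""
    names = {}
    for row in rows:
        names.setdefault(row["branch_code"], set()).add(row["branch_name"])
    return sorted(code for code, seen in names.items() if len(seen) > 1)


print(inconsistent_branches(rows))  # branch B1 now carries two different names
```

If the branch name were stored once, in a table of its own, the same update could not leave the data in this state.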

Definition of Normalization

- Normalization is a scientific method of breaking down complex table structures into simple table structures by using certain rules.
- It allows you to reduce redundancy in a table and eliminate the problems of inconsistency and disk space usage.
- Normalization results in the formation of tables that satisfy certain specified rules and represent certain normal forms.

Normal Forms

The most important and widely used normal forms are:
- First Normal Form (1NF)
- Second Normal Form (2NF)
- Third Normal Form (3NF)
- Boyce-Codd Normal Form (BCNF)

First Normal Form

- A table is said to be in 1NF when each cell of the table contains precisely one value.

Functional Dependency

- Normalization theory is based on the fundamental notion of functional dependency.
- Given a relation R, attribute A is functionally dependent on attribute B if each value of B in R is associated with precisely one value of A. In other words, B determines A.
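A functional dependency can be checked mechanically against sample data. The sketch below is illustrative only: the function name `holds`, the column names, and the rows are assumptions made for this example. Note that sample rows can only disprove a dependency; whether it truly holds is a statement about the business rules, not about one data set.

```python
def holds(rows, determinant, dependent):
    """Return True if every value of `determinant` maps to one value of `dependent`."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in determinant)
        value = tuple(row[a] for a in dependent)
        # setdefault stores the first value seen for this key and returns it
        if seen.setdefault(key, value) != value:
            return False
    return True


# Hypothetical sample rows
rows = [
    {"cert_id": "C1", "cert_name": "SQL Basics", "employee_no": 1},
    {"cert_id": "C1", "cert_name": "SQL Basics", "employee_no": 2},
    {"cert_id": "C2", "cert_name": "Data Modeling", "employee_no": 1},
]

print(holds(rows, ["cert_id"], ["cert_name"]))      # cert_name depends on cert_id
print(holds(rows, ["employee_no"], ["cert_name"]))  # employee 1 has two cert names
```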

Un-Normalised Data

- Employee No
- Employee Name
- Branch Code
- Branch Name
- Branch Location
- Certification ID 1…n
- Certification Name 1…n
- Certification done at
- Marks obtained

Rule 1

Eliminate repeating groups: make a separate table for each set of repeated attributes and give each table a primary key.

Tables in 1NF

Table 1: Employee No, Employee Name, Branch Code, Branch Name, Branch Location

Table 2: Employee No, Certification ID, Certification Name, Certification done at, Marks obtained
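The move from the un-normalised record to two 1NF tables can be sketched in Python. The record layout, the key names, and the sample values here are hypothetical; only the column split follows the slides. The repeating certification group becomes rows of its own, each carrying the employee number so it can be linked back.

```python
# Hypothetical un-normalised record: certifications repeat inside one employee row.
unnormalized = [
    {
        "employee_no": 1, "employee_name": "Asha",
        "branch_code": "B1", "branch_name": "Downtown", "branch_location": "Pune",
        "certifications": [
            {"cert_id": "C1", "cert_name": "SQL Basics", "done_at": "Centre A", "marks": 82},
            {"cert_id": "C2", "cert_name": "Data Modeling", "done_at": "Centre B", "marks": 74},
        ],
    },
]


def to_first_normal_form(records):
    """Split each record into one employee row and one row per certification."""
    employees, employee_certs = [], []
    for record in records:
        # Everything except the repeating group stays on the employee row.
        employees.append({k: v for k, v in record.items() if k != "certifications"})
        for cert in record["certifications"]:
            # Each certification row is keyed by (employee_no, cert_id).
            employee_certs.append({"employee_no": record["employee_no"], **cert})
    return employees, employee_certs


employees, employee_certs = to_first_normal_form(unnormalized)
print(len(employees), len(employee_certs))
```

After the split, every cell holds a single value, which is the 1NF condition.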

Second Normal Form (2NF)

- A table is said to be in 2NF when it is in 1NF and every non-key attribute in the row is functionally dependent upon the whole key, and not just part of the key.

To ensure that a table is in 2NF, you should:
- Find and remove attributes that are functionally dependent on only a part of the key and not on the whole key, and place them in a different table.
- Group the remaining attributes.

Rule 2

Eliminate redundant data: if an attribute depends on only part of a composite (multi-attribute) key, move it to a separate table.

The Certification Name appears redundantly, because it depends only on Certification ID, which is just one part of the composite key (Employee No, Certification ID).

Tables in 2NF

Employee: Employee No, Employee Name, Branch Code, Branch Name, Branch Location

Certifications: Certification ID, Certification Name

Emp Certifications: Employee No, Certification ID, Certification done at, Marks obtained
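The 2NF step can be sketched as a small Python transformation. The function name, column names, and sample rows are assumptions for illustration. The certification name depends only on the certification ID, so it is moved into its own lookup, and the remaining columns stay keyed by employee number and certification ID together.

```python
# Hypothetical 1NF rows keyed by (employee_no, cert_id)
employee_certs = [
    {"employee_no": 1, "cert_id": "C1", "cert_name": "SQL Basics", "done_at": "Centre A", "marks": 82},
    {"employee_no": 2, "cert_id": "C1", "cert_name": "SQL Basics", "done_at": "Centre B", "marks": 67},
    {"employee_no": 1, "cert_id": "C2", "cert_name": "Data Modeling", "done_at": "Centre A", "marks": 74},
]


def to_second_normal_form(rows):
    """Move cert_name, which depends only on cert_id, into its own table."""
    certifications = {}      # cert_id -> cert_name, stored once per certification
    emp_certifications = []  # attributes that depend on the whole key
    for row in rows:
        certifications[row["cert_id"]] = row["cert_name"]
        emp_certifications.append({k: v for k, v in row.items() if k != "cert_name"})
    return certifications, emp_certifications


certifications, emp_certifications = to_second_normal_form(employee_certs)
print(certifications)
```

The name "SQL Basics" appeared twice before the split and appears once after it.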

Third Normal Form (3NF)

- A relation is said to be in 3NF when it is in 2NF and every non-key attribute is functionally dependent only on the primary key.

To ensure that a table is in 3NF, you should:
- Find and remove non-key attributes that are functionally dependent on attributes that are not the primary key, and place them in a different table.
- Group the remaining attributes.

Rule 3

Eliminate columns not dependent on the key.

The Employee table satisfies the first and second normal forms. But the key is Employee No, and the Branch Name and Branch Location describe only a branch, not an employee.

Tables in 3NF

Employee: Employee No, Name, Branch Code

Branch: Branch Code, Branch Name, Location

Certification: Cert. ID, Cert. Name

Emp Certification: Emp No, Cert. ID, Cert. Done at, Marks obtained
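One way to express the four 3NF tables as a schema is sketched below, using Python's built-in `sqlite3` module. The snake_case identifiers, the column types, the constraints, and the sample rows are assumptions; the slides give only the column lists.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves foreign keys off by default
conn.executescript("""
CREATE TABLE branch (
    branch_code TEXT PRIMARY KEY,
    branch_name TEXT NOT NULL,
    location    TEXT NOT NULL
);
CREATE TABLE employee (
    employee_no INTEGER PRIMARY KEY,
    name        TEXT NOT NULL,
    branch_code TEXT NOT NULL REFERENCES branch(branch_code)
);
CREATE TABLE certification (
    cert_id   TEXT PRIMARY KEY,
    cert_name TEXT NOT NULL
);
CREATE TABLE emp_certification (
    employee_no INTEGER NOT NULL REFERENCES employee(employee_no),
    cert_id     TEXT NOT NULL REFERENCES certification(cert_id),
    done_at     TEXT,
    marks       INTEGER,
    PRIMARY KEY (employee_no, cert_id)
);
""")

# Hypothetical sample data
conn.execute("INSERT INTO branch VALUES ('B1', 'Downtown', 'Pune')")
conn.execute("INSERT INTO employee VALUES (1, 'Asha', 'B1')")
conn.execute("INSERT INTO employee VALUES (2, 'Ravi', 'B1')")

# The branch name is stored once, so one UPDATE changes it for every employee.
conn.execute("UPDATE branch SET branch_name = 'City Centre' WHERE branch_code = 'B1'")

rows = conn.execute("""
    SELECT e.name, b.branch_name
    FROM employee AS e
    JOIN branch AS b ON b.branch_code = e.branch_code
    ORDER BY e.employee_no
""").fetchall()
print(rows)
```

The cost of this design is the join needed to read an employee together with the branch name.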

Boyce-Codd Normal Form

- The original definition of 3NF was inadequate in some situations.
- It was not satisfactory for tables:
  - that had multiple candidate keys
  - where the multiple candidate keys were composite
  - where the multiple candidate keys overlapped
- Therefore, a new normal form, the Boyce-Codd Normal Form (BCNF), was introduced.
- A relation is in the Boyce-Codd normal form (BCNF) if and only if every determinant is a candidate key.
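The BCNF condition can be tested with an attribute-closure computation. The Python sketch below is an illustration under stated assumptions: the function names are invented for this example, and it tests whether each determinant is a superkey (a candidate key or a superset of one), which is how the condition is commonly checked in practice. The student/course/teacher relation is a hypothetical textbook-style case with two overlapping composite candidate keys; it does not come from the slides.

```python
def closure(attrs, fds):
    """Return every attribute determined by `attrs` under the dependencies `fds`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def bcnf_violations(attributes, fds):
    """List non-trivial dependencies whose determinant is not a superkey."""
    return [
        (lhs, rhs)
        for lhs, rhs in fds
        if not set(rhs) <= set(lhs) and closure(lhs, fds) != set(attributes)
    ]


# Hypothetical relation: candidate keys (student, course) and (student, teacher) overlap.
attributes = ("student", "course", "teacher")
fds = [
    (("student", "course"), ("teacher",)),  # a student has one teacher per course
    (("teacher",), ("course",)),            # each teacher teaches one course
]
print(bcnf_violations(attributes, fds))  # teacher alone is a determinant but not a key

# The Branch table from the 3NF step has a single determinant, which is its key.
branch_fds = [(("branch_code",), ("branch_name", "location"))]
print(bcnf_violations(("branch_code", "branch_name", "location"), branch_fds))
```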

Characteristics of a normalized database

- Each table must have a key field.
- Each field must contain a single, atomic value.
- There must be no repeating fields.
- Each table must contain information about a single entity.
- Each field in a table must depend on the key field.
- All non-key fields must be mutually independent.

Understanding Denormalization

- The end product of normalization is a set of related tables that comprise the database.
- However, in the interests of speed of response to critical queries, which demand information from more than one table, it is sometimes wiser to introduce a degree of redundancy in tables.
- The intentional introduction of redundancy in a table to improve performance is called denormalization.
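As a rough illustration of the trade-off, the `sqlite3` sketch below keeps a redundant copy of the branch name on the employee table so that a frequent query avoids a join. The schema, the names, and the helper `rename_branch` are hypothetical. The price is that every change to the branch name must now be written in two places.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE branch (branch_code TEXT PRIMARY KEY, branch_name TEXT);
CREATE TABLE employee (
    employee_no INTEGER PRIMARY KEY,
    name        TEXT,
    branch_code TEXT REFERENCES branch(branch_code),
    branch_name TEXT  -- redundant copy, kept only to avoid a join
);
INSERT INTO branch VALUES ('B1', 'Downtown');
INSERT INTO employee VALUES (1, 'Asha', 'B1', 'Downtown');
INSERT INTO employee VALUES (2, 'Ravi', 'B1', 'Downtown');
""")

QUERY = "SELECT name, branch_name FROM employee ORDER BY employee_no"

# The critical query now reads a single table.
before = conn.execute(QUERY).fetchall()


def rename_branch(conn, branch_code, new_name):
    """Update both copies in one transaction so they cannot drift apart."""
    with conn:
        conn.execute(
            "UPDATE branch SET branch_name = ? WHERE branch_code = ?",
            (new_name, branch_code),
        )
        conn.execute(
            "UPDATE employee SET branch_name = ? WHERE branch_code = ?",
            (new_name, branch_code),
        )


rename_branch(conn, "B1", "City Centre")
after = conn.execute(QUERY).fetchall()
print(before, after)
```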

Summary

In this lesson, you learned that:
- Normalization is used to simplify table structures.
- Normalization results in the formation of tables that satisfy certain specified constraints, and represent certain normal forms.
- The normal forms are used to ensure that various types of anomalies and inconsistencies are not introduced in the database.
- A table structure is always in a certain normal form.
- Several normal forms have been identified.

Summary (Contd.)

- The most important and widely used of these are:
  - First Normal Form (1NF)
  - Second Normal Form (2NF)
  - Third Normal Form (3NF)
  - Boyce-Codd Normal Form (BCNF)
- The intentional introduction of redundancy in a table in order to improve performance is called denormalization.
- The decision to denormalize results in a trade-off between performance and data integrity.
- Denormalization increases disk space utilization.