ITS232 Introduction To Database Management Systems Siti Nurbaya Ismail Faculty of Computer Science & Mathematics, Universiti Teknologi MARA (UiTM), Kedah | [email protected] | http://www.sitinur151.wordpress.com | | A2-3039 | ext:2561 | 012-7760562 | CHAPTER 5 Normalization Of Database Tables (Part I: Concept & Process)


ITS232 Introduction To Database Management Systems

Siti Nurbaya Ismail

Faculty of Computer Science & Mathematics,

Universiti Teknologi MARA (UiTM), Kedah

| [email protected] | http://www.sitinur151.wordpress.com |

| A2-3039 | ext:2561 | 012-7760562 |

CHAPTER 5 Normalization Of Database Tables

(Part I: Concept & Process)


Chapter 5: Normalization Of Database Tables

5.0 Normalization Of Database Tables
5.1 Database Tables And Normalization
5.2 The Need For Normalization
5.3 The Normalization Process
5.4 Normalization And Database Design
5.5 Denormalization


5.1 Database Tables And Normalization

• A table is the basic building block of database design
• ER modeling yields good table structures, but it is still possible to create poor table structures even in a good database design
• How do you recognize a poor table structure, and how do you produce a good one?
– Normalization
• A process of evaluating and correcting table structures to minimize data redundancies, thereby reducing the possibility of data anomalies
• Works through a series of stages called normal forms


• In Chapter 4, ER Modeling, we adopted a top-down approach to database design that begins by identifying the entities and relationships

• Normalization is a bottom-up approach to database design that begins by examining the relationships between attributes


[Figure: Conceptual model made up of entities and their attributes. ER modeling works top-down, from the entities to their attributes; normalization works bottom-up, from the attributes to the entities.]


EMPLOYEE

employeeNO  name   position    salary  branchNO
S21         Johan  Manager     3000    B005
S37         Ana    Assistant   1200    B003
S14         Daud   Supervisor  1800    B003
S9          Mary   Assistant   900     B007
S5          Siti   Manager     2400    B003
S41         Jani   Assistant   900     B005

EMPLOYEE(employeeNO, name, position, salary, branchNO*)

BRANCH

branchNO  city
B005      Kepong
B007      Nilai
B003      PTP

BRANCH(branchNO, city)

[Figure: ERD in which BRANCH (branchNO, city) "has" EMPLOYEE (employeeNO, name, position, salary, branchNO*) in a 1:M relationship.]


5.2 The Need For Normalization

• Normalization is typically used with ER modeling
• There are two common situations in which database designers use normalization:

i. Designing a new database
• When designing a new database structure based on the business requirements of the end users, the database designer will construct a data model using a technique such as Crow's Foot notation ERDs
• After the initial design is complete, the designer can use normalization to analyze the relationships that exist among the attributes within each entity, to determine whether the structure can be improved


ii. Modifying existing data structures
• Sometimes database designers are asked to modify existing data structures, which can be in the form of flat files, spreadsheets, or older database structures
• The normalization process can be used to analyze the relationships among the attributes or fields in the data structure, and to improve the existing structure in order to create an appropriate database design


• Therefore, in both situations described, normalization is needed:
• to analyze the relationships among the attributes or fields in the data structures
• to improve the existing data structure in order to create an appropriate database design

– Normalization
• A process of evaluating and correcting table structures to minimize data redundancies, thereby reducing the possibility of data anomalies


5.3 The Normalization Process

• When we design a database, the main objective is to create an accurate representation of the data, the relationships between the data, and the relevant constraints on the data
• To achieve this objective, we have to identify a suitable set of relations (tables) by creating good table structures

– Normalization
• A process of evaluating and correcting table structures to minimize data redundancies, thereby reducing the possibility of data anomalies
• Works through a series of stages called normal forms


• The most commonly used normal forms:
• First Normal Form (1NF)
• Second Normal Form (2NF)
• Third Normal Form (3NF)

• 2NF is better than 1NF; 3NF is better than 2NF

• The highest level of normalization is not always desirable: more tables mean more joins, so joins must be weighed against redundancy

• For most business database design purposes, 3NF is as high as we need to go in the normalization process


• Every normal form is based on functional dependencies between attributes in a relation
• Each relation can be normalized into a specific form to avoid anomalies
• Anomalies?
• anomaly = abnormality
• Ideally, a field value change should be made only in a single place
• Data redundancy, however, fosters an abnormal condition by forcing field value changes in many different locations
• Insertion anomalies
• Deletion anomalies
• Modification anomalies
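A modification anomaly can be illustrated with a minimal sketch. It assumes a hypothetical denormalized table, here called `employee_branch`, that repeats the branch city in every employee row; the rows reuse values from the EMPLOYEE and BRANCH sample data, and the new city value is invented for the illustration.

```python
# Hypothetical denormalized table: the city of each branch is repeated
# in every employee row (values reused from the EMPLOYEE/BRANCH sample data).
employee_branch = [
    {"employeeNO": "S37", "name": "Ana",  "branchNO": "B003", "city": "PTP"},
    {"employeeNO": "S14", "name": "Daud", "branchNO": "B003", "city": "PTP"},
    {"employeeNO": "S5",  "name": "Siti", "branchNO": "B003", "city": "PTP"},
]


def cities_per_branch(rows):
    """Map each branchNO to the set of city values stored for it."""
    result = {}
    for row in rows:
        result.setdefault(row["branchNO"], set()).add(row["city"])
    return result


# Modification anomaly: the city of B003 is changed in only one of its three rows.
employee_branch[0]["city"] = "Kulim"  # "Kulim" is an invented value

# Branches that now carry more than one city are inconsistent.
inconsistent = {
    branch: cities
    for branch, cities in cities_per_branch(employee_branch).items()
    if len(cities) > 1
}
print(inconsistent)
```

Because the city is stored in three places, a change made in one place leaves the other two rows contradicting it. With a separate BRANCH table the city would be stored once.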


• A formal technique for analyzing relations based on their primary key (or candidate keys) and functional dependencies
• The technique is executed as a series of steps (stages). Each step corresponds to a specific normal form that has specific characteristics
• As normalization proceeds, the relations become progressively more restricted (stronger) in format and also less vulnerable to anomalies

[Figure: Data redundancies across 0NF/UNF, 1NF, 2NF and 3NF. Normalization moves from 0NF/UNF towards 3NF; denormalization moves in the opposite direction.]


• Relationship between the normal forms

[Figure 1: Diagrammatic illustration of the relationship between the normal forms, with normalization and denormalization as opposite directions]


Data Sources
• Users
• Users' requirements specification
• Forms/reports that are used or generated by the enterprise
• Sources describing the enterprise such as data dictionary and corporate data model

Data sources → Unnormalized Form (UNF): transfer attributes into table format
UNF → First Normal Form (1NF): remove repeating groups
1NF → Second Normal Form (2NF): remove partial dependencies
2NF → Third Normal Form (3NF): remove transitive dependencies


Normalization Process    Relation/table Format
UNF                      Has repeating group; PK not defined
1NF                      No repeating group; PK defined; test partial dependency
2NF                      No repeating group; PK defined; no partial dependency; test transitive dependency
3NF                      No repeating group; PK defined; no partial dependency; no transitive dependency

[Figure: Flowchart of the normalization process. UNF: 1) repeating group, 2) PK is not defined. 1NF (1 table): 1) remove repeating group, 2) define PK (a composite PK consists of several attributes). Test for partial dependency; if any exist, move to 2NF (more than 1 table). Test for transitive dependency; if any exist, move to 3NF. Example dependencies in the figure: (a, b → x, y), (a → c, d), (b → z), (c → d).]


Functional Dependencies

• An important concept associated with normalization is functional dependency which describes the relationship between attributes

• In this section, you will learn about functional dependency and then focus on the particular characteristics of functional dependency that are useful for normalization


Characteristics Of Functional Dependencies

• For the discussion on functional dependency, assume that a relational schema has attributes (A, B, C, …, Z) and that the database is described by a single universal relation called R = (A, B, C, …, Z). This assumption means that every attribute in the database has a unique name

Functional dependency: describes the relationship between attributes in a relation. For example, if A and B are attributes of relation R, B is functionally dependent on A (denoted A → B) if each value of A is associated with exactly one value of B. (A and B may each consist of one or more attributes.)
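The definition can be checked against sample rows with a short sketch. The function name `holds` and the sample rows are assumptions made for this illustration; note that sample data can only refute a functional dependency, since the dependency itself is a rule about all valid data rather than about one set of rows.

```python
def holds(rows, determinant, dependent):
    """Return True if determinant -> dependent is not violated by the rows.

    rows: list of dicts, one per tuple of the relation
    determinant, dependent: lists of attribute names
    """
    seen = {}
    for row in rows:
        left = tuple(row[a] for a in determinant)
        right = tuple(row[a] for a in dependent)
        # Each left-hand value must always map to the same right-hand value.
        if seen.setdefault(left, right) != right:
            return False
    return True


# Hypothetical rows of a relation R(A, B)
r = [
    {"A": 1, "B": "x"},
    {"A": 2, "B": "x"},
    {"A": 1, "B": "x"},
]
print(holds(r, ["A"], ["B"]))  # True: each A value has exactly one B value
print(holds(r, ["B"], ["A"]))  # False: B = "x" appears with A = 1 and A = 2
```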


• Consider a relation with attributes A and B, where attribute B is functionally dependent on attribute A
• Another way to describe the relationship between attributes A and B is to say that A functionally determines B

A → B (B is functionally dependent on A)

R(A, B)
A → B


• When a functional dependency exists, the attribute or group of attributes on the left-hand side of the arrow is called the determinant

Determinant: refers to the attribute, or group of attributes, on the left-hand side of the arrow of a functional dependency

A → B (B is functionally dependent on A; A is the determinant)




Unnormalized Form (UNF)/(0NF)

• Unnormalized Form (UNF)/(0NF): a table that contains one or more repeating groups
• To create an unnormalized table: transform the data from the information source (e.g. a form) into table format with columns and rows


Characteristics Of Functional Dependencies

• Consider the attributes employeeNO and position of the EMPLOYEE relation
• For a specific employeeNO (S21), we can determine that employee's position as Manager
• employeeNO functionally determines position

employeeNO → position (employeeNO functionally determines position)
S21 → Manager


• However, the next figure illustrates that the opposite is not true, as position does not functionally determine employeeNO
• An employee holds one position; however, there may be several employees with the same position

position ↛ employeeNO (position does not functionally determine employeeNO)
Manager → S21, S5
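Both directions can be checked on the EMPLOYEE sample data with a small sketch. The helper name `images` and the tuple layout are assumptions for this illustration; the rows are the six employees from the EMPLOYEE table.

```python
from collections import defaultdict

# Rows of EMPLOYEE(employeeNO, name, position, salary, branchNO)
employee = [
    ("S21", "Johan", "Manager", 3000, "B005"),
    ("S37", "Ana", "Assistant", 1200, "B003"),
    ("S14", "Daud", "Supervisor", 1800, "B003"),
    ("S9", "Mary", "Assistant", 900, "B007"),
    ("S5", "Siti", "Manager", 2400, "B003"),
    ("S41", "Jani", "Assistant", 900, "B005"),
]
EMPLOYEE_NO, POSITION = 0, 2  # column positions in each tuple


def images(rows, left_index, right_index):
    """Group the right-hand values that occur with each left-hand value."""
    groups = defaultdict(set)
    for row in rows:
        groups[row[left_index]].add(row[right_index])
    return dict(groups)


by_employee = images(employee, EMPLOYEE_NO, POSITION)
by_position = images(employee, POSITION, EMPLOYEE_NO)

# employeeNO -> position: every employeeNO maps to exactly one position
print(all(len(v) == 1 for v in by_employee.values()))  # True
# position -> employeeNO fails: Manager maps to two employees
print(sorted(by_position["Manager"]))  # ['S21', 'S5']
```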


• Functional dependency can be described by two types:
– Full functional dependency (versus partial dependency)
• Will be used to transform 1NF → 2NF
– Transitive dependency
• Will be used to transform 2NF → 3NF


– Full functional dependency (versus partial dependency)

Full functional dependency indicates that, if A and B are attributes of a relation, B is fully functionally dependent on A if B is functionally dependent on A but not on any proper subset of A

R(A, B, C)
A → B, C
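The "not on any proper subset" condition can be tested mechanically. The sketch below is an illustration under assumptions: the relation R(A, B, C, D) with composite key (A, B), its rows, and the function names are all hypothetical, and a dependency found in sample rows is only a candidate until it is confirmed against the business rules.

```python
from itertools import combinations


def holds(rows, determinant, dependent):
    """Return True if determinant -> dependent is not violated by the rows."""
    seen = {}
    for row in rows:
        left = tuple(row[a] for a in determinant)
        right = tuple(row[a] for a in dependent)
        if seen.setdefault(left, right) != right:
            return False
    return True


def partial_dependencies(rows, key, non_key):
    """List (subset, attribute) pairs where a proper subset of the key
    already determines a non-key attribute."""
    found = []
    for size in range(1, len(key)):  # proper subsets only
        for subset in combinations(key, size):
            for attr in non_key:
                if holds(rows, list(subset), [attr]):
                    found.append((subset, attr))
    return found


# Hypothetical relation R(A, B, C, D) with composite key (A, B)
rows = [
    {"A": 1, "B": 1, "C": "p", "D": "u"},
    {"A": 1, "B": 2, "C": "q", "D": "u"},
    {"A": 2, "B": 1, "C": "r", "D": "v"},
    {"A": 2, "B": 2, "C": "s", "D": "v"},
]
print(partial_dependencies(rows, ["A", "B"], ["C", "D"]))  # [(('A',), 'D')]
```

Here C is fully functionally dependent on (A, B), while D depends on A alone, so D is only partially dependent on the key.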


– Transitive dependency

A, B and C are attributes of a relation such that if A → B and B → C, then C is transitively dependent on A through B (provided that A is not functionally dependent on B or C)

R(A, B, C)
A → B
B → C


First Normal Form (1NF)

• First Normal Form (1NF): a relation in which the intersection of each row and column contains one and only one value

A relation is in 1NF if every attribute of every tuple has a value and the domain of each attribute cannot be simplified any further (its values are atomic)
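As a rough sketch, the "one and only one value" rule can be checked by looking for cells that are empty or hold a collection. The function names and the sample rows are assumptions for this illustration, and treating Python collections as the only non-atomic values is a simplification: whether a value is atomic ultimately depends on the attribute's domain.

```python
def is_atomic(value):
    """Treat lists, tuples, sets and dicts as multi-valued (non-atomic) cells."""
    return not isinstance(value, (list, tuple, set, dict))


def violates_1nf(rows):
    """Return (row_index, attribute) pairs whose cell is missing or multi-valued."""
    return [
        (i, attr)
        for i, row in enumerate(rows)
        for attr, value in row.items()
        if value is None or not is_atomic(value)
    ]


# Hypothetical rows: the second row stores a repeating group of employees
rows = [
    {"branchNO": "B007", "city": "Nilai", "employeeNO": "S9"},
    {"branchNO": "B005", "city": "Kepong", "employeeNO": ["S21", "S41"]},
]
print(violates_1nf(rows))  # [(1, 'employeeNO')]
```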


Second Normal Form (2NF)

• Second Normal Form (2NF): a relation that is in 1NF and in which every non-PK attribute is fully functionally dependent on the PK

Based on the concept of partial dependency (a dependency based on only a part of a composite PK)

2NF applies to relations with composite keys, that is, relations with a PK composed of two or more attributes

A 1NF relation with a single-attribute PK is automatically in at least 2NF


Third Normal Form (3NF)

• Third Normal Form (3NF): a relation that is in 1NF and 2NF and in which no non-PK attribute is transitively dependent on the PK

Based on the concept of transitive dependency, where:

A, B and C are attributes of a relation such that if A → B and B → C, then C is transitively dependent on A through B

(provided that A is not functionally dependent on B or C)

3NF applies to relations with transitive dependencies
A 2NF relation that has no transitive dependencies is already in 3NF


UNF To 1NF

• Nominate an attribute or group of attributes to act as the key for the unnormalized table
• Identify the repeating group(s) in the unnormalized table that repeat for the key attribute(s), and remove the repeating group by
– entering appropriate data into the empty columns of rows containing the repeating data
• Fill the blanks by duplicating the non-repeating data, where required
• This approach is commonly referred to as "flattening" the table
• This approach will produce redundancy in the relation, but it can be eliminated in the higher normal forms
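The flattening step can be sketched as follows. The nested layout of `unf`, the attribute name `employees`, and the function name `flatten` are assumptions for this illustration; the values reuse the BRANCH and EMPLOYEE sample data.

```python
def flatten(rows, repeating):
    """Produce one row per element of the repeating group,
    duplicating the non-repeating attributes into each new row."""
    flat = []
    for row in rows:
        fixed = {k: v for k, v in row.items() if k != repeating}
        for item in row[repeating]:
            flat.append({**fixed, **item})
    return flat


# Hypothetical unnormalized table: each branch row holds a repeating group
unf = [
    {"branchNO": "B005", "city": "Kepong",
     "employees": [{"employeeNO": "S21", "name": "Johan"},
                   {"employeeNO": "S41", "name": "Jani"}]},
    {"branchNO": "B007", "city": "Nilai",
     "employees": [{"employeeNO": "S9", "name": "Mary"}]},
]

first_nf = flatten(unf, "employees")
for row in first_nf:
    print(row)
```

The branch data (B005, Kepong) now appears in two rows. This is the redundancy that the later normal forms remove.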


1NF To 2NF

• Identify the primary key for the 1NF relation
• Identify the functional dependencies in the relation
• Draw a functional dependency diagram
• Write the functional dependencies in relational schema
• If partial dependencies on the primary key exist, remove them by placing them in a new relation along with a copy of their determinant
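Removing a partial dependency amounts to two projections. The sketch below assumes a hypothetical 1NF relation R(A, B, C, D) with composite key (A, B) and the partial dependency A → D; the relation, its rows, and the function name `project` are illustrative only.

```python
def project(rows, attributes):
    """Project rows onto the given attributes and drop duplicate tuples,
    keeping the order of first appearance."""
    result = []
    for row in rows:
        projected = {a: row[a] for a in attributes}
        if projected not in result:
            result.append(projected)
    return result


# Hypothetical 1NF relation R(A, B, C, D), key (A, B), partial dependency A -> D
rows = [
    {"A": 1, "B": 1, "C": "p", "D": "u"},
    {"A": 1, "B": 2, "C": "q", "D": "u"},
    {"A": 2, "B": 1, "C": "r", "D": "v"},
    {"A": 2, "B": 2, "C": "s", "D": "v"},
]

r1 = project(rows, ["A", "B", "C"])  # R1(A, B, C): attributes that need the whole key
r2 = project(rows, ["A", "D"])       # R2(A, D): partial dependency plus its determinant
print(len(r1), len(r2))  # 4 2
```

Each D value is now stored once per A value in R2 instead of once per (A, B) pair.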


2NF To 3NF

• Identify the primary key in the 2NF relation
• Identify the functional dependencies in the relation
• If transitive dependencies on the PK exist, remove them by placing them in a new relation along with a copy of their determinant
• Guidelines:

R(A, B, C) with transitive dependencies:
A → B
B → C

Decompose R into:
R1(A, B*)
R2(B, C)
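The guideline can be tried out on sample rows. This is a sketch under assumptions: the rows of R and the function names `project` and `natural_join` are hypothetical, and the join at the end checks that no information was lost by the decomposition.

```python
def project(rows, attributes):
    """Project onto the given attributes, removing duplicate tuples."""
    unique = {tuple(row[a] for a in attributes) for row in rows}
    return [dict(zip(attributes, values)) for values in sorted(unique)]


def natural_join(left, right, on):
    """Join two lists of dicts on the single shared attribute `on`."""
    return [{**l, **r} for l in left for r in right if l[on] == r[on]]


# Hypothetical R(A, B, C) with A -> B and B -> C
r = [
    {"A": "a1", "B": "b1", "C": "c1"},
    {"A": "a2", "B": "b1", "C": "c1"},
    {"A": "a3", "B": "b2", "C": "c2"},
]

r1 = project(r, ["A", "B"])  # R1(A, B*): B is a foreign key into R2
r2 = project(r, ["B", "C"])  # R2(B, C): the transitive dependency with its determinant
rejoined = natural_join(r1, r2, "B")

print(len(r2))        # 2: the pair (b1, c1) is now stored once
print(rejoined == r)  # True: joining R1 and R2 gives back the original rows
```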


5.4 Normalization & Database Design

• Normalization should be part of the design process
• Make sure the proposed entities meet the required normal form before the table structures are created
• Be aware of good design principles and procedures as well as normalization procedures

• ERD: created through an iterative process that begins by identifying relevant entities, their attributes, and their relationships. The ERD provides the big picture, or macro view, of an organization's data requirements and operations
• Normalization: focuses on the characteristics of specific entities; that is, it represents a micro view of the entities within the ERD
• Therefore, the two techniques are used in an iterative and incremental process


5.5 Denormalization

[Figure: Data redundancies across 0NF/UNF, 1NF, 2NF and 3NF. Normalization moves from 0NF/UNF towards 3NF; denormalization moves in the opposite direction.]