Chapter4 - Schema Refinement and Normalisation


  • 7/28/2019 Chapter4 - Schema Refinement and Normalisation

    1/24

    SCHEMA REFINEMENT AND NORMALIZATION


    Objectives

    After completing this chapter, you will be able to:

    Define normalization

    List the reasons for normalization

    Refine a database

    Define normal forms

    Describe the types of normal forms


    Normalization

    Normalization is a step-by-step decomposition of complex records into simple records. It results in tables that satisfy specified constraints and conform to particular normal forms.

    Normalization reduces redundancy using the principle of non-loss decomposition. A fully normalized record consists of:

    A primary key that identifies an entity

    A set of attributes that describe the entity


    Why Normalization?

    Does the design ensure that all database operations are performed efficiently, without making the DBMS perform expensive consistency checks that could be avoided? Is information unnecessarily replicated?

    Unless these issues are handled properly, difficulties such as redundancy and loss of information may arise. There are several methods to avoid these problems. One such method is database decomposition through normalization.


    The Evils of Redundancy

    Redundancy is at the root of several problems associated with relational schemas. It wastes storage and, more seriously, causes several anomalies:

    Insert

    Update

    Delete

    Main refinement technique: decomposition (replacing ABCD with, say, AB and BCD, or ACD and ABD).


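    Refining an E-R Diagram

    [Figure, before refinement: E-R diagram with entity sets Employee and Department and relationship set Works_in; attributes shown: ssn, name, lot, did, dname, budget, since.]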


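    Refining continued...

    [Figure, after refinement: E-R diagram with entity sets Employee and Department and relationship set Works_in; attributes shown: ssn, name, lot, did, dname, budget, since.]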


    Normal Forms

    Normal forms are designed to address potential problems in referring to and working with information stored in a database.

    A database is said to be in a given normal form if it satisfies the rules required by that form.

    It then does not suffer from any of the problems that the form addresses.


    Types of Normal Forms

    Several normal forms have been identified, the most

    important and widely used of which are:

    First normal form (1NF)

    Second normal form (2NF)

    Third normal form (3NF)

    Boyce-Codd normal form (BCNF)

    Fourth normal form (4NF)


    First Normal Form (1NF)

    A table is in 1NF if every row contains exactly one value for each attribute.

    1NF disallows multivalued attributes, composite attributes, and their combinations.

    1NF states that:

    The domain of an attribute must include only atomic (simple, indivisible) values, and the value of any attribute in a tuple must be a single value from the domain of that attribute.

    By definition, any relational table must be in 1NF.
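    As a rough illustration of the 1NF rule, the sketch below flattens a multivalued attribute into one atomic value per row. The employee rows, the `phones` attribute, and the helper name `to_1nf` are hypothetical and do not come from the slides.

```python
# Hypothetical non-1NF data: the "phones" attribute holds a list of values.
employees = [
    {"ssn": "111", "name": "Ann", "phones": ["555-0100", "555-0101"]},
    {"ssn": "222", "name": "Bob", "phones": ["555-0102"]},
]


def to_1nf(rows, multivalued):
    """Return rows in which `multivalued` holds exactly one atomic value."""
    flat = []
    for row in rows:
        for value in row[multivalued]:
            # Copy the single-valued attributes, then attach one value.
            atomic = {k: v for k, v in row.items() if k != multivalued}
            atomic[multivalued] = value
            flat.append(atomic)
    return flat


employees_1nf = to_1nf(employees, "phones")
for row in employees_1nf:
    print(row)
```

    Each output row now carries a single phone number, at the cost of repeating `ssn` and `name`, which is the kind of redundancy the later normal forms address.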


    Functional Dependency (FD)

    Functional dependencies provide a formal mechanism to express constraints between attributes.

    Given a relation R, attribute Y of R is functionally dependent on attribute X of R if and only if each X-value in R has associated with it precisely one Y-value in R. The dependency is written X -> Y.
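    The definition can be tested against a concrete set of rows. The sketch below is a minimal check, assuming rows are stored as dictionaries; the function name `fd_holds` and the sample rows are hypothetical. Note that a sample of rows can only refute a dependency: an FD is a constraint on every legal instance, so passing the check on one instance does not prove it.

```python
def fd_holds(rows, lhs, rhs):
    """True if each combination of lhs values maps to exactly one rhs combination."""
    seen = {}
    for row in rows:
        x = tuple(row[a] for a in lhs)
        y = tuple(row[a] for a in rhs)
        # setdefault stores y the first time x is seen, then returns the stored value.
        if seen.setdefault(x, y) != y:
            return False
    return True


# Hypothetical sample rows.
rows = [
    {"Eno": 1, "Dept": "Sales", "ProjCode": "P1"},
    {"Eno": 1, "Dept": "Sales", "ProjCode": "P2"},
    {"Eno": 2, "Dept": "HR", "ProjCode": "P1"},
]

eno_determines_dept = fd_holds(rows, ["Eno"], ["Dept"])      # True
eno_determines_proj = fd_holds(rows, ["Eno"], ["ProjCode"])  # False
```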


    Full Dependency

    An FD X -> Y is a full functional dependency if removing any attribute A from X means that the dependency no longer holds.

    Equivalently, an attribute B of a relation R is fully functionally dependent on an attribute set A of R if it is functionally dependent on A and not functionally dependent on any proper subset of A.


    Partial Dependency

    An FD X -> Y is a partial dependency if there is some attribute A in X that can be removed from X and the dependency will still hold.
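    The difference between a full and a partial dependency can be sketched by removing one attribute at a time from the left-hand side and re-testing. The helper names `fd_holds` and `is_partial` and the sample rows are hypothetical, and the test runs on one instance only, so it illustrates the definition without proving a schema-level constraint.

```python
def fd_holds(rows, lhs, rhs):
    """True if each combination of lhs values maps to exactly one rhs combination."""
    seen = {}
    for row in rows:
        x = tuple(row[a] for a in lhs)
        y = tuple(row[a] for a in rhs)
        if seen.setdefault(x, y) != y:
            return False
    return True


def is_partial(rows, lhs, rhs):
    """True if dropping some single attribute from lhs still determines rhs."""
    if len(lhs) < 2:
        return False  # a one-attribute determinant has nothing to drop
    return any(
        fd_holds(rows, [a for a in lhs if a != removed], rhs)
        for removed in lhs
    )


# Hypothetical sample rows.
rows = [
    {"Eno": 1, "Dept": "Sales", "ProjCode": "P1", "Hours": 10},
    {"Eno": 1, "Dept": "Sales", "ProjCode": "P2", "Hours": 5},
    {"Eno": 2, "Dept": "HR", "ProjCode": "P1", "Hours": 8},
]
key = ["Eno", "ProjCode"]

dept_is_partial = is_partial(rows, key, ["Dept"])    # True: Eno alone suffices
hours_is_partial = is_partial(rows, key, ["Hours"])  # False: full dependency
```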


    Constraints on Entity Set

    Original relation:

    S            N          L   R  W   H
    123-22-3666  Attishoo   48  8  10  40
    231-31-5368  Smiley     22  8  10  30
    131-24-3650  Smethurst  35  5  7   30
    434-26-3751  Guldu      35  5  7   32
    612-67-4134  Madayan    35  8  10  40

    Decomposed into two smaller relations:

    R  W
    8  10
    5  7

    S            N          L   R  H
    123-22-3666  Attishoo   48  8  40
    231-31-5368  Smiley     22  8  30
    131-24-3650  Smethurst  35  5  30
    434-26-3751  Guldu      35  5  32
    612-67-4134  Madayan    35  8  40
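    The decomposition shown on this slide can be reproduced with two projections and a join. The sketch below uses the rows of the S N L R W H relation as read from the slide; the names `project`, `wages`, and `emps2` are hypothetical. It checks that joining the two smaller relations on R gives back exactly the original rows, which is what non-loss decomposition means for this instance.

```python
COLUMNS = ("S", "N", "L", "R", "W", "H")

# Rows of the original relation, as read from the slide.
hourly_emps = [
    ("123-22-3666", "Attishoo", 48, 8, 10, 40),
    ("231-31-5368", "Smiley", 22, 8, 10, 30),
    ("131-24-3650", "Smethurst", 35, 5, 7, 30),
    ("434-26-3751", "Guldu", 35, 5, 7, 32),
    ("612-67-4134", "Madayan", 35, 8, 10, 40),
]


def project(rows, columns, keep):
    """Project rows onto the columns in `keep`; a set removes duplicates."""
    idx = [columns.index(c) for c in keep]
    return {tuple(row[i] for i in idx) for row in rows}


wages = project(hourly_emps, COLUMNS, ("R", "W"))
emps2 = project(hourly_emps, COLUMNS, ("S", "N", "L", "R", "H"))

# Natural join of the two projections on the shared column R.
rejoined = {
    (s, n, l, r, w, h)
    for (s, n, l, r, h) in emps2
    for (r2, w) in wages
    if r == r2
}

print(sorted(wages))                     # [(5, 7), (8, 10)]
print(rejoined == set(hourly_emps))      # True
```

    The W value for each R is stored five times in the original relation and only twice after the decomposition.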


    Second Normal Form (2NF)

    A relation schema R is in 2NF if:

    It is in 1NF

    Every non-prime attribute A in R is fully functionally dependent on the primary key of R.

    2NF prohibits partial dependencies.

    If a relation is not in 2NF, it can be further normalized into a number of 2NF relations.


    Converting a Database to 2NF

    Find and remove attributes that are related to only a

    part of the key.

    Group the removed items in another table.

    Assign the new table a key that consists of that part of

    the old composite key.


    Transitive Dependency

    An FD X -> Y in a relation schema R is a transitive dependency if:

    There is a set of attributes Z that is not a subset of any key of R.

    Both X -> Z and Z -> Y hold.


    Example

    Emp{Eno, Dept, ProjCode, Hours}

    Primary key: {Eno, ProjCode}

    {Eno} -> {Dept}, {Eno, ProjCode} -> {Hours}

    Test of 2NF:

    {Eno} -> {Dept}: partial dependency.

    Emp is in 1NF, but not in 2NF.

    Decomposition:

    Emp {Eno, Dept}

    Proj {Eno, ProjCode, Hours}
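    The three conversion steps (find attributes that depend on part of the key, group them in a new table, key that table by the part of the old key) can be sketched at the schema level. The function name `to_2nf` is hypothetical, and the sketch is deliberately simplified: it inspects only the FDs it is given and does not derive implied dependencies.

```python
def to_2nf(attrs, key, fds):
    """Split off attributes that depend on only part of the key.

    fds is a list of (lhs, rhs) pairs of attribute sets.
    Returns a list of attribute sets, one per resulting relation.
    """
    key = frozenset(key)
    remaining = set(attrs)
    split_off = {}  # part of the key -> attributes of the new relation
    for lhs, rhs in fds:
        lhs, rhs = frozenset(lhs), frozenset(rhs)
        if lhs and lhs < key:  # proper subset of the key: partial dependency
            moved = rhs - key
            split_off.setdefault(lhs, set(lhs)).update(moved)
            remaining -= moved
    return [remaining] + list(split_off.values())


# Schema and FDs from the Emp example.
relations = to_2nf(
    attrs={"Eno", "Dept", "ProjCode", "Hours"},
    key={"Eno", "ProjCode"},
    fds=[({"Eno"}, {"Dept"}), ({"Eno", "ProjCode"}, {"Hours"})],
)
for rel in relations:
    print(sorted(rel))
```

    For the Emp example this yields {Eno, ProjCode, Hours} and {Eno, Dept}, matching the decomposition on the slide.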


    Third Normal Form (3NF)

    A relation schema R is in 3NF if, whenever a nontrivial functional dependency X -> A holds in R, either:

    X is a superkey of R, or

    A is a prime attribute of R.

    If 3NF is violated by X -> A, one of the following holds:

    X is a proper subset of some key K, in which case (X, A) pairs are stored redundantly.

    X is not a proper subset of any key.


    Example

    Emp{Eno, Dept, Dept_Head}

    Primary key: {Eno}

    {Eno} -> {Dept}, {Dept} -> {Dept_Head}

    Test of 3NF:

    {Eno} -> {Dept} -> {Dept_Head}: transitive dependency.

    Emp is in 2NF, but not in 3NF.

    Decomposition:

    Emp {Eno, Dept}

    Dept {Dept, Dept_Head}
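    The 3NF test on this example can be sketched with the standard attribute-closure computation: X is a superkey when its closure is the whole schema, and an attribute is prime when it belongs to some candidate key. The function names below are hypothetical, the candidate-key search is brute force (suitable only for small schemas), and only the listed FDs are examined.

```python
from itertools import combinations


def closure(attrs, fds):
    """All attributes determined by attrs under the given FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def candidate_keys(schema, fds):
    """Minimal attribute sets whose closure is the whole schema."""
    keys = []
    for size in range(1, len(schema) + 1):
        for combo in combinations(sorted(schema), size):
            attrs = set(combo)
            is_minimal = not any(k <= attrs for k in keys)
            if is_minimal and closure(attrs, fds) == schema:
                keys.append(attrs)
    return keys


def violates_3nf(schema, fds):
    """List the (X, A) pairs where X -> A breaks the 3NF condition."""
    prime = set().union(*candidate_keys(schema, fds))
    bad = []
    for lhs, rhs in fds:
        for a in rhs - lhs:  # ignore the trivial part of the FD
            if closure(lhs, fds) != schema and a not in prime:
                bad.append((lhs, a))
    return bad


# Schema and FDs from the Emp example.
schema = {"Eno", "Dept", "Dept_Head"}
fds = [({"Eno"}, {"Dept"}), ({"Dept"}, {"Dept_Head"})]
violations = violates_3nf(schema, fds)
print(violations)  # [({'Dept'}, 'Dept_Head')]
```

    Running the same check on each relation of the decomposition, Emp {Eno, Dept} and Dept {Dept, Dept_Head}, returns an empty list.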


    Boyce-Codd Normal Form (BCNF)

    A relation R is said to be in BCNF if and only if every determinant is a candidate key.

    3NF does not satisfactorily handle the case of a relation possessing two or more composite or overlapping candidate keys.
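    A small sketch of the BCNF test, using the common equivalent formulation that the determinant of every nontrivial FD must be a superkey. The relation R(Student, Course, Teacher) and its FDs are a hypothetical illustration of overlapping candidate keys and do not come from the slides; the function names are also assumptions.

```python
def closure(attrs, fds):
    """All attributes determined by attrs under the given FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def bcnf_violations(schema, fds):
    """Nontrivial FDs whose determinant is not a superkey of the schema."""
    return [
        (lhs, rhs)
        for lhs, rhs in fds
        if not rhs <= lhs and closure(lhs, fds) != schema
    ]


# Hypothetical relation with overlapping candidate keys
# {Student, Course} and {Student, Teacher}.
schema = {"Student", "Course", "Teacher"}
fds = [
    ({"Student", "Course"}, {"Teacher"}),
    ({"Teacher"}, {"Course"}),
]
violations = bcnf_violations(schema, fds)
print(violations)  # [({'Teacher'}, {'Course'})]
```

    Under these assumed FDs the relation satisfies 3NF, because Course is a prime attribute, yet Teacher -> Course still breaks BCNF since Teacher alone is not a key.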


    Decomposition of a Relation Schema

    Suppose that relation R contains attributes A1 ... An. A decomposition of R consists of replacing R by two or more relations such that:

    Each new relation scheme contains a subset of the attributes of R (and no attributes that do not appear in R).

    Every attribute of R appears as an attribute of one of the new relations.
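    The two conditions translate directly into set operations. A minimal sketch, with the hypothetical function name `is_decomposition` and single-letter attribute names in the style of the ABCD example used earlier:

```python
def is_decomposition(schema, parts):
    """Check the two conditions for a decomposition of `schema` into `parts`."""
    schema = set(schema)
    parts = [set(p) for p in parts]
    each_is_subset = all(p <= schema for p in parts)  # no foreign attributes
    covers_all = set().union(*parts) == schema        # no attribute is lost
    return len(parts) >= 2 and each_is_subset and covers_all


print(is_decomposition("ABCD", ["AB", "BCD"]))  # True
print(is_decomposition("ABCD", ["AB", "BC"]))   # False: D is lost
print(is_decomposition("ABCD", ["AB", "CDE"]))  # False: E is not in R
```

    These conditions concern attributes only. They do not guarantee that joining the new relations recovers the original rows, which is a separate property.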


    Fourth Normal Form (4NF)

    Under fourth normal form (4NF):

    A record type should not contain two or more

    independent multi-valued facts about an entity.

    The record must satisfy 3NF.
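    To illustrate the 4NF rule, the sketch below stores two independent multi-valued facts about an employee (skills and languages) in one record type, then splits them. The data and names are hypothetical. Because the facts are independent, every skill must be paired with every language, and the split loses nothing.

```python
# Hypothetical record type (employee, skill, language) mixing two
# independent multi-valued facts about the same entity.
employee_facts = {
    ("Ann", "Python", "English"),
    ("Ann", "Python", "French"),
    ("Ann", "SQL", "English"),
    ("Ann", "SQL", "French"),
}

# One record type per multi-valued fact.
skills = {(emp, skill) for emp, skill, _ in employee_facts}
languages = {(emp, lang) for emp, _, lang in employee_facts}

# Joining on the employee reproduces the original combinations.
rejoined = {
    (emp, skill, lang)
    for emp, skill in skills
    for emp2, lang in languages
    if emp == emp2
}

print(len(employee_facts), len(skills), len(languages))  # 4 2 2
print(rejoined == employee_facts)                        # True
```

    Adding a third language for Ann would require two new rows in the combined record type but only one row in the split design.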


    Summary

    You now should be able to:

    Define normalization

    List the reasons for normalization

    Refine a database

    Define normal forms

    Describe the types of normal forms