Normalization ISYS 464

Page 1: Normalization

Normalization

ISYS 464

Page 2: Normalization

Database Design Based on ERD

• Strong entity: Create a table that includes all simple attributes
    – Composite
• Weak entity: add owner primary key
• Multi-valued attribute: Create a table for each multi-valued attribute
    – Key + attribute
• Relationship:
    – 1:1, 1:M
        • Relationship table: for partial participation to avoid null
        • Foreign key
    – M:M: relationship table
    – N-ary relationship: relationship table
    – Recursive relationship
• Attribute of relationship
• Superclass and subclass
• Note: The database designed according to these rules will meet the 3NF requirements.
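As a rough illustration of three of these mapping rules, the sketch below uses Python's standard `sqlite3` module to create one table for a strong entity, one for a weak entity, and one for a multi-valued attribute. The table names, column types, and the choice of the Employee/Dependent/Phone entities are assumptions made for illustration, not part of the slides.

```python
import sqlite3

# In-memory database; nothing is written to disk.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Strong entity: one table holding its simple attributes.
    CREATE TABLE Employee (
        EmpID   TEXT PRIMARY KEY,
        Ename   TEXT,
        Address TEXT
    );
    -- Weak entity: the owner's primary key is added and becomes part of the key.
    CREATE TABLE Dependent (
        EmpID    TEXT REFERENCES Employee(EmpID),
        DepName  TEXT,
        Relation TEXT,
        DepDOB   TEXT,
        PRIMARY KEY (EmpID, DepName)
    );
    -- Multi-valued attribute: its own table of key + attribute.
    CREATE TABLE EmployeePhone (
        EmpID TEXT REFERENCES Employee(EmpID),
        Phone TEXT,
        PRIMARY KEY (EmpID, Phone)
    );
""")

# List the tables that were created.
tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
```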

Page 3: Normalization

Top-Down vs Bottom-Up

• Top-Down database design:
    – Users’ requirements: forms, reports, views, etc.
    – Data model: ERD
    – Relational database design
• Bottom-Up database design:
    – Users’ requirements: forms, reports, views, etc.
    – Universal relation: A relation that contains all attributes required by users and the relationships between attributes.
        • View aggregation
    – Relational database design

Page 4: Normalization

Example

Employee/Dependent report:

EmpID: E101    Ename: Peter
Address: 123 XYZ St

DependentName  Relationship  DOB
Nancy          Daughter      1/1/95
Alan           Son           12/25/03

EmpDependent Table:

EmpID  EmpName  Address     DepName  Relation  DepDOB
E101   Peter    123 XYZ St  Nancy    D         1/1/95
E101   Peter    123 XYZ St  Alan     S         12/25/03

Note: This database is able to produce the report, but has duplicated data.

Page 5: Normalization

Update Anomalies Due To Duplication

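• Modification anomaly:
    – Inconsistent data: a duplicated value may be changed in one row but not in the others.
• Insertion anomaly:
    – Enter an employee with no dependent
    – The dependent fields would be null
• Deletion anomaly:
    – If Nancy and Alan become independent, deleting their rows also deletes the employee's data.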
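A minimal sketch of the modification and deletion anomalies, using the two EmpDependent rows from the earlier example held as plain Python dictionaries. The new address is a made-up value for illustration.

```python
# Rows of the single EmpDependent table from the report example.
emp_dependent = [
    {"EmpID": "E101", "EmpName": "Peter", "Address": "123 XYZ St",
     "DepName": "Nancy", "Relation": "D", "DepDOB": "1/1/95"},
    {"EmpID": "E101", "EmpName": "Peter", "Address": "123 XYZ St",
     "DepName": "Alan", "Relation": "S", "DepDOB": "12/25/03"},
]

def addresses_of(rows, emp_id):
    """Return the set of distinct addresses stored for one employee."""
    return {r["Address"] for r in rows if r["EmpID"] == emp_id}

# Modification anomaly: the address is changed in only one of the two rows.
emp_dependent[0]["Address"] = "456 ABC Ave"  # hypothetical new address
inconsistent = addresses_of(emp_dependent, "E101")

# Deletion anomaly: removing every dependent row also removes the employee.
remaining = [r for r in emp_dependent if r["DepName"] not in ("Nancy", "Alan")]
employee_still_known = any(r["EmpID"] == "E101" for r in remaining)
```

After the partial update the table holds two different addresses for E101, and after the deletions no row mentions E101 at all.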

Page 6: Normalization

Example

• Employee Table:
    – SSN, Ename, Sex, DOB, Phone
    – Employee may have more than 1 phone.
• If Peter has two phones, 7890 and 7892:
    – SSN, Ename, Sex, DOB, Phone
    – 1234 Peter M 7/4/75 7890
    – 1234 Peter M 7/4/75 7892

Page 7: Normalization

Normalization

• Decompose an unsatisfactory relation into smaller relations with desirable properties.
    – No duplication
• The original relation can be recovered by applying natural join to the smaller relations.
    – So that no information is lost in the process.
• Keys and functional dependency:
    – Which field is the key field of the EmpDependent table?
        • EmpID + DepName
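The recovery-by-natural-join property can be checked on the EmpDependent rows with a small sketch. It assumes the table is split into (EmpID, EmpName, Address) and (EmpID, DepName, Relation, DepDOB); the helper names `project`, `natural_join`, and `as_set` are hypothetical.

```python
emp_dependent = [
    {"EmpID": "E101", "EmpName": "Peter", "Address": "123 XYZ St",
     "DepName": "Nancy", "Relation": "D", "DepDOB": "1/1/95"},
    {"EmpID": "E101", "EmpName": "Peter", "Address": "123 XYZ St",
     "DepName": "Alan", "Relation": "S", "DepDOB": "12/25/03"},
]

def project(rows, attrs):
    """Project rows onto attrs and drop duplicate tuples (relations are sets)."""
    unique = {tuple(row[a] for a in attrs) for row in rows}
    return [dict(zip(attrs, values)) for values in sorted(unique)]

def natural_join(left, right):
    """Combine every pair of rows that agree on all shared attribute names."""
    joined = []
    for l in left:
        for r in right:
            shared = set(l) & set(r)
            if all(l[a] == r[a] for a in shared):
                joined.append({**l, **r})
    return joined

def as_set(rows):
    """Order-independent form of a relation, for comparison."""
    return {tuple(sorted(row.items())) for row in rows}

employee = project(emp_dependent, ["EmpID", "EmpName", "Address"])
dependent = project(emp_dependent, ["EmpID", "DepName", "Relation", "DepDOB"])

recovered = natural_join(employee, dependent)
lossless = as_set(recovered) == as_set(emp_dependent)
```

The employee data is now stored once, and joining the two smaller relations on EmpID gives back exactly the original rows.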

Page 8: Normalization

Functional Dependency

• Relationship between attributes
• X -> Y
    – The value of X uniquely determines the value of Y.
    – Y is functionally dependent on X.
    – A value of X is associated with only one value of Y.
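One way to make the definition concrete is to test whether a given set of rows contradicts X -> Y. The sketch below is only a check against sample data, using the Employee rows with two phones from the earlier slide; the function name `fd_holds` is an assumption.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if no two rows agree on lhs but differ on rhs."""
    seen = {}
    for row in rows:
        x = tuple(row[a] for a in lhs)
        y = tuple(row[a] for a in rhs)
        # Remember the first Y seen for each X; any different Y breaks X -> Y.
        if seen.setdefault(x, y) != y:
            return False
    return True

employee = [
    {"SSN": "1234", "Ename": "Peter", "Sex": "M", "DOB": "7/4/75", "Phone": "7890"},
    {"SSN": "1234", "Ename": "Peter", "Sex": "M", "DOB": "7/4/75", "Phone": "7892"},
]

ssn_determines_name = fd_holds(employee, ["SSN"], ["Ename", "Sex", "DOB"])
ssn_determines_phone = fd_holds(employee, ["SSN"], ["Phone"])
```

Sample rows can refute a dependency but cannot prove one: a functional dependency is a rule about all valid data, so it has to come from the users' requirements.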

Page 9: Normalization

Example

• Employee table:

    SSN  Ename  Sex  DOB
    S1   Peter  M    1/1/75
    S2   Paul   M    12/25/80
    S3   Mary   F    7/4/72

• Functional dependencies:
    – SSN -> Ename, SSN -> Sex, SSN -> DOB
    – SSN -> Ename, Sex, DOB
• Any other FD:
    – Ename -> SSN ?
    – Ename -> Sex ?
    – DOB -> SSN ?

Page 10: Normalization

• What is the key of the Employee table?
    – SSN
• Observations:
    – All non-key fields are functionally dependent on SSN.
    – There is no other FD.
    – The only FD is the key dependency.
    – There is no data duplication in the Employee table.

Page 11: Normalization

If we mix a multi-valued attribute with regular attributes in one table

• Employee Table:
    – SSN, Ename, Sex, DOB, Phone
    – Employee may have more than 1 phone.
• FD:
    – SSN -> Ename, Sex, DOB
    – SSN -> Phone ?
• Key:
• Duplication ?

Page 12: Normalization

Example 2

• EmpDependent table:
    – EmpID, Ename, Address, Depname, Relation, DepDOB
• FD:
    – EmpID -> Ename, Address
• Key: EmpID + Depname

Page 13: Normalization

If we mix two entities with 1:M relationship in one table

• FacultyStudent table:
    – Faculty Advise Student: 1:M relationship
    – FID, Fname, SID, Sname, SAddress
• FD:
    – FID -> Fname
    – SID -> Sname, SAddress, FID, Fname
• Key: SID

Page 14: Normalization

If we mix two entities with M:M relationship in one table

• BankAccount table:
    – Acct#, AcctType, CID, Cname, Address, OpenDate, Balance
• Functional dependencies:
    – Acct# -> AcctType: Yes
    – Acct# -> CID: No
    – Acct# -> Cname: No
    – Acct# -> OpenDate: Yes
    – Acct# -> Balance: Yes
    – CID -> Cname, Address: Yes
• Key: Acct# + CID
• Duplication? Yes

Page 15: Normalization

Normalization Process

• Inputs:
    – A “universal relation”
    – Functional dependencies
• Output: Normalized tables
• Process:
    – Decompose the unnormalized relation into smaller relations such that in each relation the non-key fields are functionally dependent on the key, the whole key, and nothing but the key. So help me Codd!

Page 16: Normalization

First Normal Form

• The fields of a relation are all simple attributes.
    – All relational database tables meet this requirement.
• EmpDependent table:
    – EmpID, Ename, Address, Depname, Relation, DepDOB
    – First normal form? Yes
    – Second normal form?

Page 17: Normalization

Second Normal Form

• The non-key fields are functionally dependent on the key, and the whole key.
    – FD:
        • EmpID -> Ename, Address
    – Key: EmpID + Depname
    – Ename and Address depend on only part of the key, so the table is not in 2NF.
• In 2NF, every non-key field is fully functionally dependent on the key.
• Decompose the EmpDependent table into two tables:
    – EmpID, Ename, Address
    – EmpID, Depname, Relation, DepDOB
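The 2NF test can be sketched as a search for partial dependencies: FDs whose left side is a proper subset of the key and whose right side contains non-key fields. This simplified version only inspects the FDs that are listed explicitly, not ones implied by them, and the function name `partial_dependencies` is an assumption. The FD from the whole key to Relation and DepDOB is added here because it follows from the key.

```python
def partial_dependencies(key, fds):
    """Return the listed FDs in which part of the key determines non-key fields."""
    key = set(key)
    violations = []
    for lhs, rhs in fds:
        dependents = set(rhs) - key
        # A proper subset of the key on the left means a partial dependency.
        if set(lhs) < key and dependents:
            violations.append((tuple(lhs), tuple(sorted(dependents))))
    return violations

emp_dependent_fds = [
    (("EmpID",), ("Ename", "Address")),
    (("EmpID", "Depname"), ("Relation", "DepDOB")),  # implied by the key
]
emp_dependent_violations = partial_dependencies(
    ("EmpID", "Depname"), emp_dependent_fds)

# A table with a single-field key cannot have a partial dependency.
faculty_student_fds = [
    (("FID",), ("Fname", "Office")),
    (("SID",), ("Sname", "SAddress", "FID")),
]
faculty_student_violations = partial_dependencies(("SID",), faculty_student_fds)
```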

Page 18: Normalization

• Employee Table:
    – SSN, Ename, Sex, DOB, Phone
    – Employee may have more than 1 phone.
• FD:
    – SSN -> Ename, Sex, DOB
    – SSN -> Phone ?
• Key: SSN + Phone
• 2NF? No
• Decompose into two tables:
    – SSN, Ename, Sex, DOB
    – SSN, Phone

Page 19: Normalization

• FacultyStudent table:
    – Faculty Advise Student: 1:M relationship
    – FID, Fname, Office, SID, Sname, SAddress
• FD:
    – FID -> Fname, Office
    – SID -> Sname, SAddress, FID, Fname, Office
• Key: SID
• 2NF? Yes
• Duplication? Yes
• Why?
    – All non-key fields depend on the whole key, but not on nothing but the key!
        • SID -> FID, Fname, Office
        • FID -> Fname, Office

Page 20: Normalization

Transitive Dependency

• If X -> Y, and Y -> Z, then X -> Z.
• If X is the key, Z is transitively dependent on the key.
• SID -> FID, FID -> Fname, Office
    – SID -> Fname, Office
    – Fname and Office are transitively dependent on SID.
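Transitive dependencies can be traced mechanically by computing the closure of an attribute set: keep applying FDs whose left side is already covered until nothing new is added. The sketch below applies this standard procedure to the FacultyStudent FDs; the function name `closure` and the tuple encoding of FDs are assumptions.

```python
def closure(attrs, fds):
    """Return every attribute determined by attrs under the given FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # Apply an FD once its whole left side is already determined.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

fds = [
    (("SID",), ("Sname", "SAddress", "FID")),
    (("FID",), ("Fname", "Office")),
]

# SID reaches Fname and Office only by going through FID.
from_sid = closure({"SID"}, fds)
from_fid = closure({"FID"}, fds)
```

Because the closure of SID covers every field, SID is a key. Fname and Office appear in it only through FID, which is the transitive dependency.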

Page 21: Normalization

Third Normal Form

• Every non-key field is:
    – Fully functionally dependent on the key, and
    – Non-transitively dependent on the key.
• Decompose:
    – FID, Fname, Office
    – SID, FID, Sname, SAddress

Page 22: Normalization

Example

Customer/Orders report:

CID: C101    Cname: Peter
Address: 123 XYZ St

OID  Odate    SalesPerson  Amount
O25  1/1/04   John         125
O30  2/25/04  Alan         500

CustomerOrders Table:

CID   CName  Address     OID  Odate    SalesPerson  Amount
C101  Peter  123 XYZ St  O25  1/1/04   John         125
C101  Peter  123 XYZ St  O30  2/25/04  Alan         500

Page 23: Normalization

Example

• Key: OID
• FD:
    – OID -> CID, Cname, Address, Odate, SalesPerson, Amount
    – CID -> Cname, Address
• 2NF? Yes
• 3NF? No
• Decompose:
    – CID, Cname, Address
    – OID, CID, Odate, SalesPerson, Amount

Page 24: Normalization

Example with 1:M Relationship

• FacultyStudent table:
    – Faculty Advise Student: 1:M relationship
    – FID, Fname, SID, Sname, SAddress
• FD:
    – FID -> Fname
    – SID -> Sname, SAddress
• Key: SID
• 2NF? Yes
• 3NF? No, because SID -> FID, FID -> Fname
• Decompose:
    – Table 1: FID, Fname
    – Table 2: SID, FID, Sname, SAddress

Page 25: Normalization

Example with M:M Relationship

• BankAccount table:
    – Acct#, AcctType, CID, Cname, Address, OpenDate, Balance
• Functional dependencies:
    – Acct# -> AcctType, OpenDate, Balance
    – CID -> Cname, Address
• Key: Acct# + CID
• 2NF? No
    – Decompose:
        • Table 1: Acct#, AcctType, OpenDate, Balance
        • Table 2: CID, Cname, Address
        • Table 3: Acct#, CID
• Are the decomposed tables in 3NF? Yes
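The three-table result can be tried out with Python's standard `sqlite3` module. This is only a sketch: `AcctNo` stands in for Acct#, and the table names, column types, and sample rows (including the second customer) are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Account (
        AcctNo TEXT PRIMARY KEY, AcctType TEXT, OpenDate TEXT, Balance REAL
    );
    CREATE TABLE Customer (
        CID TEXT PRIMARY KEY, Cname TEXT, Address TEXT
    );
    -- Relationship table for the M:M relationship; its key is Acct# + CID.
    CREATE TABLE AccountOwner (
        AcctNo TEXT REFERENCES Account(AcctNo),
        CID    TEXT REFERENCES Customer(CID),
        PRIMARY KEY (AcctNo, CID)
    );
    INSERT INTO Account VALUES ('A1', 'Checking', '1/1/04', 500.0);
    INSERT INTO Customer VALUES ('C101', 'Peter', '123 XYZ St');
    INSERT INTO Customer VALUES ('C102', 'Mary', '9 Elm St');
    INSERT INTO AccountOwner VALUES ('A1', 'C101');
    INSERT INTO AccountOwner VALUES ('A1', 'C102');
""")

# A joint account is reported once per owner...
rows = conn.execute("""
    SELECT a.AcctNo, a.Balance, c.Cname
    FROM AccountOwner ao
    JOIN Account a ON a.AcctNo = ao.AcctNo
    JOIN Customer c ON c.CID = ao.CID
    ORDER BY c.CID
""").fetchall()

# ...but its balance is stored in exactly one row.
balance_copies = conn.execute(
    "SELECT COUNT(*) FROM Account WHERE AcctNo = 'A1'").fetchone()[0]
```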

Page 26: Normalization

Is It Really A Ternary Relationship?

[Diagram: ternary relationship among Supplier, Project, and Part, with relationship attribute Qty]

What if each part is supplied by only one supplier?

Page 27: Normalization

• Relationship Table:
    – SID, PrtID, PjID, Qty
• Key: PrtID + PjID -> SID, Qty
• FD: PrtID -> SID
• 2NF? No
• Decompose:
    – Table 1: PrtID, SID
    – Table 2: PrtID, PjID, Qty

Page 28: Normalization

Database Design Based on ERD

• Strong entity: Create a table that includes all simple attributes
    – Composite
• Weak entity: add owner primary key
• Multi-valued attribute: Create a table for each multi-valued attribute
    – Key + attribute
• Relationship:
    – 1:1, 1:M
        • Relationship table: for partial participation to avoid null
        • Foreign key
    – M:M: relationship table
    – N-ary relationship: relationship table
    – Recursive relationship
• Attribute of relationship
• Superclass and subclass
• Note: The database designed according to these rules will meet the 3NF requirements.

Page 29: Normalization

Boyce-Codd Normal Form

• 3NF: All non-key attributes are dependent on the key and nothing else.

• Boyce-Codd Normal Form:
    – A relation is in 3NF, and
    – Key attributes cannot be dependent on non-key attributes.
        • Whenever a nontrivial FD X -> A holds, X is a key (more precisely, a superkey).

Page 30: Normalization

Example

• Relation: A, B, C

• FD:
    – A, B -> C
    – C -> B

• Key: A, B

• It is in 3NF, but not in BCNF.
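The claim for R(A, B, C) can be checked with a short sketch built on attribute closure. It uses the common textbook tests: for BCNF, every nontrivial FD must have a superkey on the left; for 3NF, the right side may alternatively consist of prime attributes (attributes of some candidate key). The function names are assumptions, and the brute-force key search is only suitable for small relations.

```python
from itertools import combinations

def closure(attrs, fds):
    """Return every attribute determined by attrs under the given FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def candidate_keys(relation, fds):
    """Find minimal attribute sets whose closure is the whole relation."""
    keys = []
    for size in range(1, len(relation) + 1):
        for combo in combinations(sorted(relation), size):
            if any(set(k) <= set(combo) for k in keys):
                continue  # contains a smaller key, so it is not minimal
            if closure(combo, fds) == set(relation):
                keys.append(combo)
    return keys

def bcnf_violations(relation, fds):
    """Nontrivial FDs whose left side is not a superkey."""
    return [(lhs, rhs) for lhs, rhs in fds
            if not set(rhs) <= set(lhs)
            and closure(lhs, fds) != set(relation)]

def is_3nf(relation, fds):
    """3NF: each FD has a superkey on the left or only prime attributes on the right."""
    prime = {a for key in candidate_keys(relation, fds) for a in key}
    for lhs, rhs in fds:
        if closure(lhs, fds) == set(relation):
            continue
        if not (set(rhs) - set(lhs)) <= prime:
            return False
    return True

R = ("A", "B", "C")
fds = [(("A", "B"), ("C",)), (("C",), ("B",))]

keys = candidate_keys(R, fds)
in_3nf = is_3nf(R, fds)
violations = bcnf_violations(R, fds)
```

The search finds two candidate keys, A + B and A + C, so B is a prime attribute. That is why C -> B is allowed by 3NF but still violates BCNF.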

Page 31: Normalization

[Diagram: Employee, Part, and Warehouse entities linked by a relationship with attribute Stock; cardinality labels M, M, M, 1]

We also know each part in a warehouse is managed by one employee, and an employee may manage many parts.

Page 32: Normalization

• Table:

    WID  PID  EID  Stock
    LA   P1   E5   40
    LA   P5   E6   50
    LA   P7   E5   45

• Key: WID + PID -> EID, Stock
• FD: EID -> WID
• 3NF? Yes
• BCNF? No
• Decompose:
    – Table 1: EID, WID
    – Table 2: EID, PID, Stock

Page 33: Normalization

Alternative Approach

Key: EID + PID -> WID, Stock

FD: EID -> WID

2NF? No

Table 1: EID, WID

Table 2: EID, PID, Stock

Potential violation of BCNF may occur when:
1. The relation has more than one composite candidate key;
2. The candidate keys have at least one attribute in common.

Candidate keys in this example:
1. WID + PID
2. EID + PID

Page 34: Normalization

Multi-Valued Dependency

• A ->> B

• A ->> C

• For each value of A there is a set of values for B and a set of values for C. However, the sets of values for B and C are independent of each other.

Page 35: Normalization

Fourth Normal Form

• A relation is in Boyce-Codd normal form and does not contain nontrivial multi-valued dependencies.

Page 36: Normalization

• Student: A student may have many majors and many phones.

    SID  Sname  Major  Phone
    S1   Peter  CIS    1234
    S1   Peter  CIS    3456
    S1   Peter  Acct   1234
    S1   Peter  Acct   3456

• Key: SID + Major + Phone
• FD: SID -> Sname
• 2NF? No
• Decompose:
    – Table 1: SID, Sname
    – Table 2: SID, Major, Phone
• Table 2 in 3NF? Yes; in 4NF? No. Decompose further:
    – SID, Major
    – SID, Phone
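A small sketch of why Table 2 fails 4NF: for each student, the stored (Major, Phone) pairs are simply every major combined with every phone. Splitting into (SID, Major) and (SID, Phone) stores the same facts in fewer rows and joins back to the original. The function name `mvd_holds` and the tuple layout are assumptions.

```python
from itertools import product

# Table 2 from the decomposition: (SID, Major, Phone).
student = [
    ("S1", "CIS", "1234"),
    ("S1", "CIS", "3456"),
    ("S1", "Acct", "1234"),
    ("S1", "Acct", "3456"),
]

def mvd_holds(rows):
    """True if each SID's (Major, Phone) pairs are the full cross product."""
    pairs_by_sid = {}
    for sid, major, phone in rows:
        pairs_by_sid.setdefault(sid, set()).add((major, phone))
    for pairs in pairs_by_sid.values():
        majors = {m for m, _ in pairs}
        phones = {p for _, p in pairs}
        if pairs != set(product(majors, phones)):
            return False
    return True

independent = mvd_holds(student)

# 4NF decomposition: one table per independent multi-valued fact.
sid_major = sorted({(s, m) for s, m, _ in student})
sid_phone = sorted({(s, p) for s, _, p in student})

# Joining on SID reproduces the original rows.
rejoined = {(s, m, p) for s, m in sid_major for s2, p in sid_phone if s == s2}
```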

Page 37: Normalization

Denormalization

• The refinement to the relational schema such that the degree of normalization for a modified relation is less than the degree of at least one of the original relations.

• Objective:
    – Speed up processing
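As one possible illustration, the sketch below starts from the normalized Customer and Orders tables of the earlier example and stores their join as a denormalized CustomerOrders table, using Python's standard `sqlite3` module. Column types are assumptions, and the sketch makes no measurement of speed; it only shows that the report can then be read without a join, at the cost of bringing the duplication back.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customer (CID TEXT PRIMARY KEY, Cname TEXT, Address TEXT);
    CREATE TABLE Orders (
        OID TEXT PRIMARY KEY, CID TEXT REFERENCES Customer(CID),
        Odate TEXT, SalesPerson TEXT, Amount REAL
    );
    INSERT INTO Customer VALUES ('C101', 'Peter', '123 XYZ St');
    INSERT INTO Orders VALUES ('O25', 'C101', '1/1/04', 'John', 125);
    INSERT INTO Orders VALUES ('O30', 'C101', '2/25/04', 'Alan', 500);

    -- Denormalized copy: the join is computed once and stored.
    CREATE TABLE CustomerOrders AS
        SELECT o.OID, c.CID, c.Cname, c.Address,
               o.Odate, o.SalesPerson, o.Amount
        FROM Orders o JOIN Customer c ON c.CID = o.CID;
""")

# The report now needs no join, but Cname and Address repeat per order.
report = conn.execute(
    "SELECT OID, Cname, Amount FROM CustomerOrders ORDER BY OID").fetchall()
```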