CSC 411/511: DBMS Design Dr. Nan Wang 1 Schema Refinement and Normal Forms Chapter 19

Schema Refinement and Normal Forms Chapter 19


Page 1: Schema Refinement and  Normal Forms Chapter 19

CSC 411/511: DBMS Design

Dr. Nan Wang 1

Schema Refinement and Normal Forms

Chapter 19

Page 2: Schema Refinement and  Normal Forms Chapter 19


The Evils of Redundancy

• Redundancy causes several problems associated with relational schemas:
  – Redundant storage
  – Insert anomalies
  – Delete anomalies
  – Update anomalies
• What are the problems in the following schema?

Hourly_Emps (ssn, name, lot, rating, hourly_wages, hours_worked)

Page 3: Schema Refinement and  Normal Forms Chapter 19


Example

S            N          L   R  W   H
123-22-3666  Attishoo   48  8  10  40
231-31-5368  Smiley     22  8  10  30
131-24-3650  Smethurst  35  5  7   30
434-26-3751  Guldu      35  5  7   32
612-67-4134  Madayan    35  8  10  40

• Hourly_Emps (ssn, name, lot, rating, hourly_wages, hours_worked)

• Abbreviate an attribute name to a single letter.
• Refer to a relation schema by a string of letters, one per attribute.
• In the example, hourly_wages is determined by rating. This IC is an example of a functional dependency (R → W).
• What does it lead to?

Page 4: Schema Refinement and  Normal Forms Chapter 19


Example

• Problems:
  – Redundant storage
  – Update anomaly: Can we change W in just the 1st tuple of SNLRWH?
  – Solutions to the anomalies?

S            N          L   R  W   H
123-22-3666  Attishoo   48  8  10  40
231-31-5368  Smiley     22  8  10  30
131-24-3650  Smethurst  35  5  7   30
434-26-3751  Guldu      35  5  7   32
612-67-4134  Madayan    35  8  10  40

Page 5: Schema Refinement and  Normal Forms Chapter 19


Example

• Problems:
  – Insertion anomaly: What if we want to insert an employee and don’t know the hourly wage for his rating?
  – Solution?
    • Null values
    • What if S is unknown?

S            N          L   R  W   H
123-22-3666  Attishoo   48  8  10  40
231-31-5368  Smiley     22  8  10  30
131-24-3650  Smethurst  35  5  7   30
434-26-3751  Guldu      35  5  7   32
612-67-4134  Madayan    35  8  10  40

Page 6: Schema Refinement and  Normal Forms Chapter 19


Example

• Problems:
  – Deletion anomaly: If we delete all employees with rating 5, we lose the information about the wage for rating 5!
  – Solutions?

S            N          L   R  W   H
123-22-3666  Attishoo   48  8  10  40
231-31-5368  Smiley     22  8  10  30
131-24-3650  Smethurst  35  5  7   30
434-26-3751  Guldu      35  5  7   32
612-67-4134  Madayan    35  8  10  40
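The redundancy, update anomaly, and deletion anomaly discussed on these slides can be reproduced on the sample instance. The following is a minimal sketch, assuming the table is held as a list of Python tuples; the variable and function names are illustrative and do not come from the slides.

```python
# Sample Hourly_Emps instance from the slides: (S, N, L, R, W, H).
hourly_emps = [
    ("123-22-3666", "Attishoo", 48, 8, 10, 40),
    ("231-31-5368", "Smiley", 22, 8, 10, 30),
    ("131-24-3650", "Smethurst", 35, 5, 7, 30),
    ("434-26-3751", "Guldu", 35, 5, 7, 32),
    ("612-67-4134", "Madayan", 35, 8, 10, 40),
]


def wages_by_rating(rows):
    """Collect the set of hourly wages recorded for each rating."""
    wages = {}
    for _s, _n, _l, rating, wage, _h in rows:
        wages.setdefault(rating, set()).add(wage)
    return wages


# Redundant storage: the pair (rating 8, wage 10) is stored several times.
copies_of_8_10 = sum(1 for row in hourly_emps if row[3:5] == (8, 10))

# Update anomaly: changing W in only the first tuple (to a hypothetical
# wage of 12) leaves rating 8 associated with two different wages.
updated = [hourly_emps[0][:4] + (12,) + hourly_emps[0][5:]] + hourly_emps[1:]
conflicting = {r: w for r, w in wages_by_rating(updated).items() if len(w) > 1}

# Deletion anomaly: removing every employee with rating 5 also removes
# the only record of the wage for rating 5.
remaining = [row for row in hourly_emps if row[3] != 5]
ratings_with_known_wage = set(wages_by_rating(remaining))

print(copies_of_8_10)
print(conflicting)
print(ratings_with_known_wage)
```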

Page 7: Schema Refinement and  Normal Forms Chapter 19


Fighting the Evils of Redundancy

• Redundancy is at the root of evil - true or not?
• Integrity constraints, in particular functional dependencies, can be used to identify schemas with such problems and to suggest refinements.
• R → W
• Decomposition (replace SNLRWH with SNLRH and RW)

Hourly_Emps2
S            N          L   R  H
123-22-3666  Attishoo   48  8  40
231-31-5368  Smiley     22  8  30
131-24-3650  Smethurst  35  5  30
434-26-3751  Guldu      35  5  32
612-67-4134  Madayan    35  8  40

Wages
R  W
8  10
5  7
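The decomposition of SNLRWH into SNLRH and RW amounts to two relational projections. The sketch below assumes the instance is a list of tuples and uses a hypothetical helper named project; it is one simple way to illustrate the idea, not the method of any particular DBMS.

```python
# Sample Hourly_Emps instance from the slides: (S, N, L, R, W, H).
ATTRS = "SNLRWH"
hourly_emps = [
    ("123-22-3666", "Attishoo", 48, 8, 10, 40),
    ("231-31-5368", "Smiley", 22, 8, 10, 30),
    ("131-24-3650", "Smethurst", 35, 5, 7, 30),
    ("434-26-3751", "Guldu", 35, 5, 7, 32),
    ("612-67-4134", "Madayan", 35, 8, 10, 40),
]


def project(rows, attrs, onto):
    """Project rows (tuples over `attrs`) onto the attributes in `onto`,
    dropping duplicate tuples as relational projection does."""
    idx = [attrs.index(a) for a in onto]
    seen, out = set(), []
    for row in rows:
        t = tuple(row[i] for i in idx)
        if t not in seen:
            seen.add(t)
            out.append(t)
    return out


hourly_emps2 = project(hourly_emps, ATTRS, "SNLRH")
wages = project(hourly_emps, ATTRS, "RW")

print(hourly_emps2)
print(wages)  # each (rating, wage) pair now appears only once
```

After the projection, the wage for a rating is recorded in a single Wages tuple, so changing it is a one-row update.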

Page 8: Schema Refinement and  Normal Forms Chapter 19


Fighting the Evils of Redundancy

• R → W
• Decomposition (replace SNLRWH with SNLRH and RW)
• Does this solve all of these problems?
  – Redundant storage
  – Insert anomalies
  – Delete anomalies
  – Update anomalies

Hourly_Emps2
S            N          L   R  H
123-22-3666  Attishoo   48  8  40
231-31-5368  Smiley     22  8  30
131-24-3650  Smethurst  35  5  30
434-26-3751  Guldu      35  5  32
612-67-4134  Madayan    35  8  40

Wages
R  W
8  10
5  7

Page 9: Schema Refinement and  Normal Forms Chapter 19


Fighting the Evils of Redundancy

• Main refinement technique: decomposition (replacing ABCD with, say, AB and BCD, or ACD and ABD).
• Problems related to decomposition:
  – Is there reason to decompose a relation?
    • Normal forms help us to decide whether or not to decompose a relation further.
    • How?
  – What problems (if any) does the decomposition cause?
    • Lossless join decomposition
    • Dependency-preservation decomposition

Page 10: Schema Refinement and  Normal Forms Chapter 19


Properties of decomposition

• Lossless join decomposition
  – Enables us to recover any instance of the decomposed relation from the corresponding instances of the smaller relations.
• Dependency-preservation decomposition
  – Enables us to enforce any constraint on the original relation by simply enforcing some constraints on each of the smaller relations.
  – Key constraints

Hourly_Emps2
S            N          L   R  H
123-22-3666  Attishoo   48  8  40
231-31-5368  Smiley     22  8  30
131-24-3650  Smethurst  35  5  30
434-26-3751  Guldu      35  5  32
612-67-4134  Madayan    35  8  40

Wages
R  W
8  10
5  7
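The lossless-join property can be probed on the sample instance by projecting, joining the projections back together, and comparing with the original. The sketch below is an illustration under stated assumptions: the helper names are hypothetical, and the second decomposition (SNLH and RWH) is a made-up counterexample that does not appear in the slides.

```python
ATTRS = "SNLRWH"
DATA = [
    ("123-22-3666", "Attishoo", 48, 8, 10, 40),
    ("231-31-5368", "Smiley", 22, 8, 10, 30),
    ("131-24-3650", "Smethurst", 35, 5, 7, 30),
    ("434-26-3751", "Guldu", 35, 5, 7, 32),
    ("612-67-4134", "Madayan", 35, 8, 10, 40),
]
rows = [dict(zip(ATTRS, t)) for t in DATA]


def project(rows, attrs):
    """Relational projection onto `attrs`; each result row is a tuple of
    (attribute, value) pairs, and duplicates collapse because it is a set."""
    return {tuple((a, row[a]) for a in attrs) for row in rows}


def natural_join(r1, r2):
    """Natural join: combine rows that agree on all shared attributes."""
    joined = set()
    for t1 in r1:
        d1 = dict(t1)
        for t2 in r2:
            d2 = dict(t2)
            if all(d1[a] == d2[a] for a in d1.keys() & d2.keys()):
                joined.add(tuple(sorted({**d1, **d2}.items())))
    return joined


def rejoins_exactly(rows, attrs1, attrs2):
    """True if joining the two projections gives back exactly `rows`."""
    original = {tuple(sorted(row.items())) for row in rows}
    return natural_join(project(rows, attrs1), project(rows, attrs2)) == original


print(rejoins_exactly(rows, "SNLRH", "RW"))   # decomposition from the slides
print(rejoins_exactly(rows, "SNLH", "RWH"))   # hypothetical alternative
```

A check on one instance can only expose a lossy decomposition; showing that a decomposition is lossless requires an argument over all allowable instances.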

Page 11: Schema Refinement and  Normal Forms Chapter 19


Good?


S            N          L   R  W   H
123-22-3666  Attishoo   48  8  10  40
231-31-5368  Smiley     22  8  10  30
131-24-3650  Smethurst  35  5  7   30
434-26-3751  Guldu      35  5  7   32
612-67-4134  Madayan    35  8  10  40

S            N          L   R  H
123-22-3666  Attishoo   48  8  40
231-31-5368  Smiley     22  8  30
131-24-3650  Smethurst  35  5  30
434-26-3751  Guldu      35  5  32
612-67-4134  Madayan    35  8  40

R  W
8  10
5  7

– List N, W, H
– How?
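One way to answer the question for the decomposed tables is to join them on R and keep N, W, and H. The sketch below assumes Wages is held as a rating-to-wage lookup, which is valid only because R → W guarantees one wage per rating; the variable names are illustrative.

```python
# Decomposed instance from the slides.
hourly_emps2 = [  # (S, N, L, R, H)
    ("123-22-3666", "Attishoo", 48, 8, 40),
    ("231-31-5368", "Smiley", 22, 8, 30),
    ("131-24-3650", "Smethurst", 35, 5, 30),
    ("434-26-3751", "Guldu", 35, 5, 32),
    ("612-67-4134", "Madayan", 35, 8, 40),
]
wages = {8: 10, 5: 7}  # R -> W

# Join on R, then project onto N, W, H.
name_wage_hours = [(n, wages[r], h) for (_s, n, _l, r, h) in hourly_emps2]

for row in name_wage_hours:
    print(row)
```

The query now needs a join that the single SNLRWH table did not, which is the usual price of a decomposition.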

Page 12: Schema Refinement and  Normal Forms Chapter 19


Good DB designer

• Normal forms, and what problems they do or do not alleviate.
• Decomposition, and what problems decompositions can cause.
• Questions:
  – Is a relation in a given normal form?
  – Is a decomposition dependency preserving?


Page 13: Schema Refinement and  Normal Forms Chapter 19


Functional Dependencies (FDs)

• A functional dependency X → Y holds over relation R if, for every allowable instance r of R:
  – t1 ∈ r, t2 ∈ r, t1.X = t2.X implies t1.Y = t2.Y
  – i.e., given two tuples in r, if the X values agree, then the Y values must also agree. (X and Y are sets of attributes.)

S            N          L   R  W   H
123-22-3666  Attishoo   48  8  10  40
231-31-5368  Smiley     22  8  10  30
131-24-3650  Smethurst  35  5  7   30
434-26-3751  Guldu      35  5  7   32
612-67-4134  Madayan    35  8  10  40
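The definition translates directly into a check on a single instance: group tuples by their X values and confirm that each group has one Y value. The following is a minimal sketch; the function name satisfies_fd is an assumption made for illustration.

```python
ATTRS = "SNLRWH"
rows = [
    ("123-22-3666", "Attishoo", 48, 8, 10, 40),
    ("231-31-5368", "Smiley", 22, 8, 10, 30),
    ("131-24-3650", "Smethurst", 35, 5, 7, 30),
    ("434-26-3751", "Guldu", 35, 5, 7, 32),
    ("612-67-4134", "Madayan", 35, 8, 10, 40),
]


def satisfies_fd(rows, attrs, lhs, rhs):
    """Return True if the instance `rows` (tuples over `attrs`) satisfies
    lhs → rhs, i.e. no two tuples agree on lhs but differ on rhs."""
    li = [attrs.index(a) for a in lhs]
    ri = [attrs.index(a) for a in rhs]
    seen = {}
    for row in rows:
        x = tuple(row[i] for i in li)
        y = tuple(row[i] for i in ri)
        # setdefault stores y the first time x is seen, then returns the
        # stored value so later tuples with the same x can be compared.
        if seen.setdefault(x, y) != y:
            return False
    return True


print(satisfies_fd(rows, ATTRS, "R", "W"))       # rating determines wage here
print(satisfies_fd(rows, ATTRS, "L", "R"))       # lot 35 has ratings 5 and 8
print(satisfies_fd(rows, ATTRS, "S", "SNLRWH"))  # ssn values are all distinct
```

A True result only says that this one instance does not violate the dependency.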

Page 14: Schema Refinement and  Normal Forms Chapter 19


Functional Dependencies (FDs)

• An FD is a statement about all allowable relations.
  – Must be identified based on semantics of application.
  – Given some allowable instance r1 of R, we can check if it violates some FD f, but we cannot tell if f holds over R!
• K is a candidate key for R means that K → R
  – However, K → R does not require K to be minimal!

Page 15: Schema Refinement and  Normal Forms Chapter 19


FDs- example


FD: AB → C

Question:
Do all tuples in the relation satisfy the functional dependency AB → C?
Can we add a tuple <a1,b1,c2,d1> to the relation? Why?

Page 16: Schema Refinement and  Normal Forms Chapter 19


Exercise

• List all the functional dependencies that this relation instance satisfies.

• Z → Y, X → Y, and XZ → Y


Page 17: Schema Refinement and  Normal Forms Chapter 19


Example: Constraints on Entity Set

• Consider relation obtained from Hourly_Emps:
  – Hourly_Emps (ssn, name, lot, rating, hrly_wages, hrs_worked)
• Notation: We will denote this relation schema by listing the attributes: SNLRWH
  – This is really the set of attributes {S,N,L,R,W,H}.
  – Sometimes, we will refer to all attributes of a relation by using the relation name. (e.g., Hourly_Emps for SNLRWH)
• Some FDs on Hourly_Emps:
  – ssn is the key: S → SNLRWH
  – rating determines hrly_wages: R → W

Page 18: Schema Refinement and  Normal Forms Chapter 19


Example (Contd.)

• Problems due to R → W:
  – Update anomaly: Can we change W in just the 1st tuple of SNLRWH?
  – Insertion anomaly: What if we want to insert an employee and don’t know the hourly wage for his rating?
  – Deletion anomaly: If we delete all employees with rating 5, we lose the information about the wage for rating 5!

S            N          L   R  W   H
123-22-3666  Attishoo   48  8  10  40
231-31-5368  Smiley     22  8  10  30
131-24-3650  Smethurst  35  5  7   30
434-26-3751  Guldu      35  5  7   32
612-67-4134  Madayan    35  8  10  40

Hourly_Emps2
S            N          L   R  H
123-22-3666  Attishoo   48  8  40
231-31-5368  Smiley     22  8  30
131-24-3650  Smethurst  35  5  30
434-26-3751  Guldu      35  5  32
612-67-4134  Madayan    35  8  40

Wages
R  W
8  10
5  7

Will 2 smaller tables be better?

Page 19: Schema Refinement and  Normal Forms Chapter 19


Reasoning About FDs

• Given some FDs, we can usually infer additional FDs:
  – For example: Workers(ssn,name,lot,did,since)
  – ssn → did holds and did → lot holds, which implies ssn → lot
• An FD f is implied by a set of FDs F if f holds whenever all FDs in F hold.
  – F+ = closure of F is the set of all FDs that are implied by a given set F of FDs.
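Whether a particular FD is implied by F can be tested without enumerating all of F+, by computing the closure of the left-hand side attributes. The slides do not present this attribute-closure technique at this point; it is a standard method brought in here for illustration, and the function names are assumptions.

```python
def attribute_closure(attrs, fds):
    """Return the set of attributes functionally determined by `attrs`
    under `fds`, a list of (lhs, rhs) pairs of frozensets."""
    closure = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If everything in lhs is already determined, rhs is too.
            if lhs <= closure and not rhs <= closure:
                closure |= rhs
                changed = True
    return closure


def implies(fds, lhs, rhs):
    """True if lhs → rhs follows from `fds`."""
    return set(rhs) <= attribute_closure(lhs, fds)


# FDs on Workers(ssn, name, lot, did, since) given on the slide.
F = [
    (frozenset({"ssn"}), frozenset({"did"})),
    (frozenset({"did"}), frozenset({"lot"})),
]

print(attribute_closure({"ssn"}, F))
print(implies(F, {"ssn"}, {"lot"}))
print(implies(F, {"lot"}, {"ssn"}))
```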

Page 20: Schema Refinement and  Normal Forms Chapter 19


Closure of a set of FDs

• Infer or compute the closure of a given set F of FDs
  – Use the following three rules (Armstrong’s Axioms)
• Armstrong’s Axioms (X, Y, Z are sets of attributes):
  – Reflexivity: If X ⊆ Y, then Y → X
  – Augmentation: If X → Y, then XZ → YZ for any Z
  – Transitivity: If X → Y and Y → Z, then X → Z
• These are sound and complete inference rules for FDs!

Page 21: Schema Refinement and  Normal Forms Chapter 19


Soundness and Completeness

• Completeness: given F, the rules allow us to determine all dependencies in F+
• Soundness: we cannot generate FDs not in F+
• Additional rules:
  – Union: if X → Y and X → Z, then X → YZ
  – Decomposition: if X → YZ, then X → Y and X → Z
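The union and decomposition rules add no inference power, since each can be derived from Armstrong's Axioms. One possible derivation is sketched below; the step order is a choice made here and is not taken from the slides.

```latex
% Union: from X -> Y and X -> Z, derive X -> YZ
\begin{align*}
X  &\to XY && \text{augmentation of } X \to Y \text{ with } X \text{ (since } XX = X\text{)} \\
XY &\to YZ && \text{augmentation of } X \to Z \text{ with } Y \\
X  &\to YZ && \text{transitivity of the two lines above}
\end{align*}

% Decomposition: from X -> YZ, derive X -> Y (X -> Z is symmetric)
\begin{align*}
YZ &\to Y && \text{reflexivity, since } Y \subseteq YZ \\
X  &\to Y && \text{transitivity with } X \to YZ
\end{align*}
```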

Page 22: Schema Refinement and  Normal Forms Chapter 19


Soundness and Completeness

• R = (A, B, C) and F = {A → B, B → C}
  – Reflexivity: if Y is a subset of X, then X → Y
  – Transitivity: if X → Y and Y → Z, then X → Z
    • A → C
  – Augmentation: if X → Y, then XZ → YZ for any Z
    • AC → BC
    • AB → BC
    • AB → AC
  – ABC
F+ = {A → B, B → C, A → C, AC → BC, AB → AC, …}
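For a schema this small, F+ can be enumerated by brute force: compute the attribute closure of every non-empty attribute set X and emit X → Y for every non-empty Y inside that closure. This is a sketch under that assumption (it is exponential in the number of attributes), with hypothetical helper names, and it counts trivial FDs such as AB → A as members of F+.

```python
from itertools import combinations


def attribute_closure(attrs, fds):
    """Attributes determined by `attrs` under `fds` ((lhs, rhs) frozensets)."""
    closure = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= closure and not rhs <= closure:
                closure |= rhs
                changed = True
    return closure


def nonempty_subsets(attrs):
    """Yield every non-empty subset of `attrs` as a frozenset."""
    items = sorted(attrs)
    for k in range(1, len(items) + 1):
        for combo in combinations(items, k):
            yield frozenset(combo)


def fd_closure(schema, fds):
    """Enumerate F+ over `schema` as a set of (lhs, rhs) pairs."""
    result = set()
    for x in nonempty_subsets(schema):
        for y in nonempty_subsets(attribute_closure(x, fds)):
            result.add((x, y))
    return result


def fd(lhs, rhs):
    """Build an FD from two strings of single-letter attribute names."""
    return (frozenset(lhs), frozenset(rhs))


F = [fd("A", "B"), fd("B", "C")]
f_plus = fd_closure("ABC", F)

print(len(f_plus))
print(fd("A", "C") in f_plus, fd("AC", "BC") in f_plus, fd("C", "A") in f_plus)
```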

Page 23: Schema Refinement and  Normal Forms Chapter 19


Reasoning About FDs

• An FD is a statement about the world as we understand it
  – ssn → did, did → lot implies ssn → lot
• Is it correct to state that
  “Since X functionally determines Y, if we know X, we know Y.”
  Or “if X → Y, then X identifies Y”

Page 24: Schema Refinement and  Normal Forms Chapter 19


Reasoning About FDs (Contd.)

• A couple of additional rules:
  – Union: If X → Y and X → Z, then X → YZ
  – Decomposition: If X → YZ, then X → Y and X → Z
• Example: Contracts(cid,sid,jid,did,pid,qty,value), and:
  – C is the key: C → CSJDPQV
  – Project purchases each part using single contract: JP → C
  – Dept purchases at most one part from a supplier: SD → P

Page 25: Schema Refinement and  Normal Forms Chapter 19


Reasoning About FDs (Contd.)

• C is the key: C → CSJDPQV
• Project purchases each part using single contract: JP → C
• Dept purchases at most one part from a supplier: SD → P
• JP → C, C → CSJDPQV imply JP → CSJDPQV
• SD → P implies SDJ → JP
• SDJ → JP, JP → CSJDPQV imply SDJ → CSJDPQV
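The conclusions of this derivation can be double-checked mechanically with attribute closures. The sketch below is an independent check, not the slide's derivation; the helper names are assumptions, and each attribute of Contracts is abbreviated to its single letter.

```python
def attribute_closure(attrs, fds):
    """Attributes determined by `attrs` under `fds` ((lhs, rhs) frozensets)."""
    closure = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= closure and not rhs <= closure:
                closure |= rhs
                changed = True
    return closure


def fd(lhs, rhs):
    """Build an FD from two strings of single-letter attribute names."""
    return (frozenset(lhs), frozenset(rhs))


SCHEMA = set("CSJDPQV")
F = [fd("C", "CSJDPQV"), fd("JP", "C"), fd("SD", "P")]

# JP and SDJ each determine every attribute of Contracts; SD alone does not.
print(attribute_closure("JP", F) == SCHEMA)
print(attribute_closure("SDJ", F) == SCHEMA)
print(sorted(attribute_closure("SD", F)))
```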

Page 26: Schema Refinement and  Normal Forms Chapter 19


Exercise

• Consider a relation R with five attributes ABCDE. You are given the following dependencies: A→B, BC→E, and ED→A.

• List all keys for R.

• CDE, ACD, BCD
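The stated answer can be confirmed by a brute-force search: try attribute sets in order of increasing size and keep those whose closure is all of ABCDE and which contain no smaller key. This is a sketch with hypothetical function names; it is exponential in the number of attributes and is intended only for small exercises like this one.

```python
from itertools import combinations


def attribute_closure(attrs, fds):
    """Attributes determined by `attrs` under `fds` ((lhs, rhs) frozensets)."""
    closure = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= closure and not rhs <= closure:
                closure |= rhs
                changed = True
    return closure


def candidate_keys(schema, fds):
    """Return all minimal attribute sets whose closure is the whole schema."""
    schema = frozenset(schema)
    keys = []
    for k in range(1, len(schema) + 1):
        for combo in combinations(sorted(schema), k):
            x = frozenset(combo)
            if any(key <= x for key in keys):
                continue  # contains a smaller key, so it is not minimal
            if attribute_closure(x, fds) == schema:
                keys.append(x)
    return keys


def fd(lhs, rhs):
    """Build an FD from two strings of single-letter attribute names."""
    return (frozenset(lhs), frozenset(rhs))


F = [fd("A", "B"), fd("BC", "E"), fd("ED", "A")]
keys = ["".join(sorted(k)) for k in candidate_keys("ABCDE", F)]
print(keys)
```

C and D never appear on the right-hand side of any dependency, so every key must contain both; CD alone determines nothing further, which is why each key has three attributes.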
