Lecture 6: Schema refinement: Functional dependencies
www.cl.cam.ac.uk/Teaching/current/Databases/

2

Recall: Database design lifecycle

• Requirements analysis
  – User needs; what must the database do?

• Conceptual design
  – High-level description; often using the E/R model

• Logical design
  – Translate the E/R model into a relational schema

• Schema refinement
  – Check the schema for redundancies and anomalies

• Physical design/tuning
  – Consider typical workloads, and further optimise

Next two lectures: schema refinement

3

Today’s lecture

• Why are some designs bad?

• What’s a functional dependency?

• What’s the theory of functional dependencies?

• (Next lecture: How can we use this theory to classify redundancy in relation design?)

4

Not all designs are equally good

• Why is this design bad?
  Data(sid, sname, address, cid, cname, grade)

• Why is this one preferable?
  Student(sid, sname, address)
  Course(cid, cname)
  Enrolled(sid, cid, grade)

5

An instance of our bad design

sid  sname     address  cid  cname      grade
124  Britney   USA      206  Database   A++
204  Victoria  Essex    202  Semantics  C
124  Britney   USA      201  S/Eng I    A+
206  Emma      London   206  Database   B-
124  Britney   USA      202  Semantics  B+

6

Evils of redundancy

• Redundancy is the root of many problems associated with relational schemas
  – Redundant storage
  – Update anomalies
  – Insertion anomalies
  – Deletion anomalies
  – LOW TRANSACTION THROUGHPUT

• In general, with higher redundancy, if transactions are correct (no anomalies), then they have to lock more objects, thus causing greater contention and lower throughput

• (Aside: Could having a dummy value, NULL, help?)
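
The anomalies listed above can be made concrete with a small Python sketch over the example instance of the bad design. The list-of-dicts representation, the helper name `addresses_for` and the new address value "UK" are illustrative assumptions, not part of the lecture.

```python
ATTRS = ["sid", "sname", "address", "cid", "cname", "grade"]
ROWS = [
    (124, "Britney", "USA", 206, "Database", "A++"),
    (204, "Victoria", "Essex", 202, "Semantics", "C"),
    (124, "Britney", "USA", 201, "S/Eng I", "A+"),
    (206, "Emma", "London", 206, "Database", "B-"),
    (124, "Britney", "USA", 202, "Semantics", "B+"),
]
data = [dict(zip(ATTRS, row)) for row in ROWS]


def addresses_for(rows, sid):
    """Return the set of distinct addresses stored for one student."""
    return {row["address"] for row in rows if row["sid"] == sid}


# Redundant storage: Britney's address is stored once per enrolment.
copies = sum(1 for row in data if row["sid"] == 124)

# Update anomaly: changing only one copy (hypothetical new value "UK")
# leaves two different addresses on record for the same student.
data[0]["address"] = "UK"
inconsistent = addresses_for(data, 124)

# Deletion anomaly: removing Victoria's only enrolment also removes
# every trace of Victoria, including her address.
remaining = [row for row in data if (row["sid"], row["cid"]) != (204, 202)]
victoria_known = any(row["sid"] == 204 for row in remaining)
```

An insertion anomaly is the mirror image: in this design a new course cannot be recorded until some student enrols on it, unless placeholder values are used.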

7

Decomposition

• We remove anomalies by replacing the schema
  Data(sid, sname, address, cid, cname, grade)
  with
  Student(sid, sname, address)
  Course(cid, cname)
  Enrolled(sid, cid, grade)

• Note the implicit extra cost here: queries that need the original rows must now join the smaller relations

• Two immediate questions:
  1. Do we need to decompose a relation?
  2. What problems might result from a decomposition?
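
A minimal Python sketch of this decomposition, assuming relations are held as lists of dicts. The helpers `project` and `natural_join` are hypothetical names written for this illustration; they show that the three smaller relations can be joined back into the original rows, at the cost of two joins.

```python
ATTRS = ["sid", "sname", "address", "cid", "cname", "grade"]
ROWS = [
    (124, "Britney", "USA", 206, "Database", "A++"),
    (204, "Victoria", "Essex", 202, "Semantics", "C"),
    (124, "Britney", "USA", 201, "S/Eng I", "A+"),
    (206, "Emma", "London", 206, "Database", "B-"),
    (124, "Britney", "USA", 202, "Semantics", "B+"),
]
data = [dict(zip(ATTRS, row)) for row in ROWS]


def project(rows, attrs):
    """Relational projection: keep only `attrs` and drop duplicate tuples."""
    seen = {tuple(row[a] for a in attrs) for row in rows}
    return [dict(zip(attrs, values)) for values in sorted(seen)]


def natural_join(left, right):
    """Natural join: combine rows that agree on all shared attribute names."""
    result = []
    for lrow in left:
        for rrow in right:
            shared = set(lrow) & set(rrow)
            if all(lrow[a] == rrow[a] for a in shared):
                result.append({**lrow, **rrow})
    return result


def as_tuples(rows):
    """Normalise a relation to a set of tuples in ATTRS order for comparison."""
    return {tuple(row[a] for a in ATTRS) for row in rows}


student = project(data, ["sid", "sname", "address"])
course = project(data, ["cid", "cname"])
enrolled = project(data, ["sid", "cid", "grade"])

# Reconstructing Data needs two joins: this is the extra cost of decomposing.
rejoined = natural_join(natural_join(student, enrolled), course)
```

Each student and each course is now stored once, so the redundancy has gone, but any query over the combined data pays for the joins.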

8

Functional dependencies

• Recall:
  – A key is a set of fields such that if a pair of tuples agree on the key, they agree everywhere

• In our bad design, if two tuples agree on sid, then they also agree on address, even though they may disagree on the remaining fields

9

Functional dependencies cont.

• We can say that sid determines address
  – We’ll write this sid → address

• This is called a functional dependency (FD)

• (Note: An FD is just another integrity constraint)

10

Functional dependencies cont.

• We’d expect the following functional dependencies to hold in our Student database
  – sid → sname,address
  – cid → cname
  – sid,cid → grade

• A functional dependency X → Y is simply a pair of sets (of field names)
  – Note: the sloppy notation A,B → C,D rather than {A,B} → {C,D}

11

Formalities

• Given a relation R=R(A1:τ1, …, An:τn), and X, Y ⊆ {A1, …, An}, an instance r of R satisfies X → Y if
  – For any two tuples t1, t2 in r, if t1.X=t2.X then t1.Y=t2.Y

• Note: This is a semantic assertion. We cannot look at an instance to determine which FDs hold (although we can tell if the instance does not satisfy an FD!)
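
The definition translates directly into a check on a single instance. The following is a minimal Python sketch, assuming an instance is a list of dicts; the function name `satisfies` is an illustrative choice.

```python
ATTRS = ["sid", "sname", "address", "cid", "cname", "grade"]
ROWS = [
    (124, "Britney", "USA", 206, "Database", "A++"),
    (204, "Victoria", "Essex", 202, "Semantics", "C"),
    (124, "Britney", "USA", 201, "S/Eng I", "A+"),
    (206, "Emma", "London", 206, "Database", "B-"),
    (124, "Britney", "USA", 202, "Semantics", "B+"),
]
data = [dict(zip(ATTRS, row)) for row in ROWS]


def satisfies(rows, X, Y):
    """Return True if the instance `rows` satisfies the FD X -> Y."""
    seen = {}  # maps each X-value to the Y-value first seen with it
    for row in rows:
        x_val = tuple(row[a] for a in X)
        y_val = tuple(row[a] for a in Y)
        # Two tuples agreeing on X but not on Y violate the FD.
        if seen.setdefault(x_val, y_val) != y_val:
            return False
    return True


holds = satisfies(data, ["sid"], ["address"])      # expected by the design
violated = not satisfies(data, ["cid"], ["grade"])  # course 206 has two grades
accidental = satisfies(data, ["grade"], ["sid"])    # holds here only by chance
```

The last call illustrates the note above: grade → sid happens to hold in this instance because all five grades differ, yet nobody would claim it as a constraint on the database. An instance can refute an FD but cannot establish one.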

12

Properties of FDs

• Assume that X → Y and Y → Z are known to hold in R. It’s clear that X → Z holds too.

• We shall say that an FD set F logically implies X → Y, and write F ⊨ X → Y
  – e.g. {X → Y, Y → Z} ⊨ X → Z

• The closure of F is the set of all FDs logically implied by F, i.e.
  F+ = {X → Y | F ⊨ X → Y}

• The set F+ can be big, even if F is small
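
To get a feel for how big F+ can be, here is a rough Python sketch that counts its members by brute force. It relies on the attribute-closure test given later in this lecture (X → Y is in F+ exactly when Y is a subset of X+); the function names and the (lhs, rhs) tuple encoding of FDs are assumptions made for this illustration. Empty left-hand and right-hand sides are counted too.

```python
from itertools import combinations

ATTRS = ["sid", "sname", "address", "cid", "cname", "grade"]
STUDENT_FDS = [
    (("sid",), ("sname", "address")),
    (("cid",), ("cname",)),
    (("sid", "cid"), ("grade",)),
]


def attribute_closure(X, fds):
    """Return X+, the set of attributes determined by X under `fds`."""
    closure = set(X)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= closure and not set(rhs) <= closure:
                closure |= set(rhs)
                changed = True
    return closure


def closure_size(attrs, fds):
    """Count the FDs X -> Y in F+, with X and Y ranging over subsets of attrs."""
    total = 0
    for k in range(len(attrs) + 1):
        for X in combinations(attrs, k):
            # Every Y that is a subset of X+ gives one member X -> Y of F+,
            # so this X contributes 2^|X+| dependencies.
            total += 2 ** len(attribute_closure(X, fds))
    return total


trivial_only = closure_size(ATTRS, [])        # F empty: only trivial FDs
student_total = closure_size(ATTRS, STUDENT_FDS)
```

Even with no FDs at all, the six Student attributes give 3^6 = 729 trivial dependencies, and the three Student FDs only add to that number.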

13

Closure of a set of FDs

• Which of the following are in the closure of our Student FDs?

  – address → address
  – cid → cname
  – cid → cname,sname
  – cid,sid → cname,sname

14

Candidate keys and FDs

• If R=R(A1:τ1, …, An:τn) with FDs F and X ⊆ {A1, …, An}, then X is a candidate key for R if
  – X → A1, …, An ∈ F+
  – For no proper subset Y ⊂ X is Y → A1, …, An ∈ F+
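
The two conditions can be tested mechanically. The sketch below is one possible Python rendering, using the attribute-closure test from later in this lecture in place of computing F+; `is_candidate_key` and the tuple encoding of FDs are hypothetical names and conventions.

```python
ATTRS = ["sid", "sname", "address", "cid", "cname", "grade"]
STUDENT_FDS = [
    (("sid",), ("sname", "address")),
    (("cid",), ("cname",)),
    (("sid", "cid"), ("grade",)),
]


def attribute_closure(X, fds):
    """Return X+, the set of attributes determined by X under `fds`."""
    closure = set(X)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= closure and not set(rhs) <= closure:
                closure |= set(rhs)
                changed = True
    return closure


def is_candidate_key(X, attrs, fds):
    """Check both conditions: X determines everything, and no proper subset does."""
    X = set(X)
    everything = set(attrs)
    if attribute_closure(X, fds) != everything:
        return False
    # Closure is monotone, so it is enough to try dropping one attribute
    # at a time: any smaller determining set lies inside one of these.
    return all(attribute_closure(X - {a}, fds) != everything for a in X)


key = is_candidate_key({"sid", "cid"}, ATTRS, STUDENT_FDS)
too_small = is_candidate_key({"sid"}, ATTRS, STUDENT_FDS)
too_big = is_candidate_key({"sid", "cid", "grade"}, ATTRS, STUDENT_FDS)
```

Here {sid, cid} passes both conditions, {sid} fails the first, and {sid, cid, grade} fails the second because it contains a smaller determining set.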

15

Armstrong’s axioms

• Reflexivity: If Y ⊆ X then F ⊢ X → Y
  – (This is called a trivial dependency)
  – Example: sname,address → address

• Augmentation: If F ⊢ X → Y then F ⊢ X,W → Y,W
  – Example: As cid → cname then cid,sid → cname,sid

• Transitivity: If F ⊢ X → Y and F ⊢ Y → Z then F ⊢ X → Z
  – Example: As sid,cid → cid and cid → cname, then sid,cid → cname

16

Consequences of Armstrong’s axioms

• Union: If F ⊢ X → Y and F ⊢ X → Z then F ⊢ X → Y,Z

• Pseudo-transitivity: If F ⊢ X → Y and F ⊢ W,Y → Z then F ⊢ X,W → Z

• Decomposition: If F ⊢ X → Y and Z ⊆ Y then F ⊢ X → Z

Exercise: Prove that these are consequences of Armstrong’s axioms

17

Proof of Union Rule

Suppose that F ⊢ X → Y and F ⊢ X → Z.

By augmentation (adding X to both sides of X → Y) we have

F ⊢ X → X,Y

since X ∪ X = X. Also by augmentation (adding Y to both sides of X → Z)

F ⊢ X,Y → Z,Y

Therefore, by transitivity we have

F ⊢ X → Z,Y

QED

18

Functional Dependencies Can be useful in Algebraic Reasoning

Suppose R(A,B,C) is a relation schema with dependency A → B, then

R = π_{A,B}(R) ⋈_A π_{A,C}(R)

(This is called Heath’s rule.)
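
Heath's rule can be checked on small examples. The Python sketch below represents an instance of R(A,B,C) as a set of (a, b, c) triples; `heath_join` and the two sample instances are made up for this illustration.

```python
def heath_join(R):
    """Project R(A,B,C) onto (A,B) and (A,C), then join the projections on A."""
    ab = {(a, b) for (a, b, c) in R}
    ac = {(a, c) for (a, b, c) in R}
    return {(a, b, c) for (a, b) in ab for (a2, c) in ac if a == a2}


# An instance that satisfies A -> B: each A-value has a single B-value.
R_good = {(1, "x", "p"), (1, "x", "q"), (2, "y", "p")}

# An instance that violates A -> B: A=1 appears with both "x" and "y".
R_bad = {(1, "x", "p"), (1, "y", "q")}

lossless = heath_join(R_good) == R_good
spurious = heath_join(R_bad) - R_bad  # tuples that were never in R_bad
```

When A → B holds the join gives back exactly R. Without the dependency, the join can invent tuples such as (1, "x", "q"), which is why the rule needs the FD as a premise.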

19

Proof of Heath’s Rule

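First show that R ⊆ π_{A,B}(R) ⋈_A π_{A,C}(R).

Suppose (a, b, c) ∈ R; then (a, b) ∈ π_{A,B}(R) and (a, c) ∈ π_{A,C}(R).

Since (a, b) and (a, c) agree on A, we have (a, b, c) ∈ π_{A,B}(R) ⋈_A π_{A,C}(R).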

20

Proof of Heath’s Rule (cont.)

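In the other direction, we must show that π_{A,B}(R) ⋈_A π_{A,C}(R) ⊆ R.

Suppose (a, b, c) ∈ π_{A,B}(R) ⋈_A π_{A,C}(R). Then there must exist records (a, b) ∈ π_{A,B}(R) and (a, c) ∈ π_{A,C}(R).

There must also exist (a, b', c) ∈ R, for some b'. Therefore, we have (a, b') ∈ π_{A,B}(R).

But the functional dependency tells us that b = b', so that (a, b, c) ∈ R.

QED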

21

Equivalence

• Two sets of FDs, F and G, are said to be equivalent if F+=G+

• For example:

{A,B → C, A → B} and

{A → C, A → B}

are equivalent

• F+ can be huge – we’d prefer to look for small equivalent FD sets
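
Equivalence can be tested without building either closure: F+ = G+ exactly when every FD of G follows from F and every FD of F follows from G. The following Python sketch does this with the attribute-closure test from later in this lecture; the function names and the encoding of FDs as (lhs, rhs) pairs of sets are illustrative assumptions.

```python
def attribute_closure(X, fds):
    """Return X+, the set of attributes determined by X under `fds`."""
    closure = set(X)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= closure and not set(rhs) <= closure:
                closure |= set(rhs)
                changed = True
    return closure


def implies(fds, lhs, rhs):
    """True if lhs -> rhs follows from `fds`, i.e. rhs is a subset of lhs+."""
    return set(rhs) <= attribute_closure(lhs, fds)


def equivalent(F, G):
    """True if F and G have the same closure."""
    return (all(implies(F, lhs, rhs) for lhs, rhs in G)
            and all(implies(G, lhs, rhs) for lhs, rhs in F))


F = [({"A", "B"}, {"C"}), ({"A"}, {"B"})]
G = [({"A"}, {"C"}), ({"A"}, {"B"})]

same = equivalent(F, G)
different = equivalent(F, [({"A"}, {"B"})])
```

For the example above, `same` comes out true, while dropping A,B → C altogether gives a strictly weaker set.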

22

Minimal cover

• An FD set, F, is said to be minimal if
  1. Every FD in F is of the form X → A, where A is a single attribute
  2. For no X → A in F is F − {X → A} equivalent to F
  3. For no X → A in F and Z ⊂ X is (F − {X → A}) ∪ {Z → A} equivalent to F

• For example, {A → C, A → B} is a minimal cover for {A,B → C, A → B}
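
One common way to obtain a minimal cover follows the three conditions in turn: split right-hand sides, shrink left-hand sides, then drop redundant FDs. The Python sketch below is a rough rendering of that procedure, not an algorithm given in the lecture; `minimal_cover` and the frozenset encoding are assumptions, and the result depends on the order in which attributes and FDs are tried.

```python
def attribute_closure(X, fds):
    """Return X+, the set of attributes determined by X under `fds`."""
    closure = set(X)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= closure and not set(rhs) <= closure:
                closure |= set(rhs)
                changed = True
    return closure


def minimal_cover(fds):
    """Return one minimal FD set equivalent to `fds` (not necessarily unique)."""
    # Condition 1: a single attribute on every right-hand side.
    cover = [(frozenset(lhs), frozenset([a])) for lhs, rhs in fds for a in rhs]
    # Condition 3: remove left-hand-side attributes that are not needed.
    for i, (lhs, rhs) in enumerate(cover):
        for b in sorted(lhs):
            if rhs <= attribute_closure(lhs - {b}, cover):
                lhs = lhs - {b}
                cover[i] = (lhs, rhs)
    # Condition 2: drop any FD that the remaining FDs already imply.
    result = list(dict.fromkeys(cover))  # remove exact duplicates, keep order
    for fd in list(result):
        rest = [g for g in result if g != fd]
        if fd[1] <= attribute_closure(fd[0], rest):
            result = rest
    return result


F = [({"A", "B"}, {"C"}), ({"A"}, {"B"})]
cover = minimal_cover(F)
```

On the example set, B is dropped from the left-hand side of A,B → C because A alone already determines B, leaving A → C and A → B.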

23

More on closures

• FACT: If F is an FD set, and X → Y ∉ F+, then there exists an attribute A ∈ Y such that X → A ∉ F+

24

Why Armstrong’s axioms?

• Soundness
  – If F ⊢ X → Y is deduced using the rules, then X → Y is true in any relation in which the dependencies of F are true

• Completeness
  – If X → Y is true in any relation in which the dependencies of F are true, then F ⊢ X → Y can be deduced using the rules

25

Soundness

• Consider the Augmentation rule:
  – We have X → Y, i.e. if t1.X=t2.X then t1.Y=t2.Y
  – If in addition t1.W=t2.W then it is clear that t1.(Y,W)=t2.(Y,W)

26

Soundness cont.

• Consider the Transitivity rule:
  – We have X → Y, i.e. if t1.X=t2.X then t1.Y=t2.Y (*)
  – We have Y → Z, i.e. if t1.Y=t2.Y then t1.Z=t2.Z (**)
  – Take two tuples s1 and s2 such that s1.X=s2.X; then from (*) s1.Y=s2.Y, and then from (**) s1.Z=s2.Z

27

Completeness

• Exercise – (You may need the fact from slide 23)

28

Attribute closure

• If we want to check whether X → Y is in the closure of the set F, we could compute F+ and check, but this is expensive

• Cheaper: We can instead compute the attribute closure, X+, using the following algorithm:

  closure := X;
  repeat until no change {
    if U → V ∈ F, where U ⊆ closure
    then closure := closure ∪ V
  };

• Then F ⊢ X → Y iff Y is a subset of X+

Try this with sid,sname → cname,grade
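
The algorithm above can be transcribed almost line for line. The following Python sketch is one such transcription; the function names and the encoding of the Student FDs as (lhs, rhs) tuples are assumptions made for this example.

```python
STUDENT_FDS = [
    (("sid",), ("sname", "address")),
    (("cid",), ("cname",)),
    (("sid", "cid"), ("grade",)),
]


def attribute_closure(X, fds):
    """Return X+, the set of attributes determined by X under `fds`."""
    closure = set(X)
    changed = True
    while changed:  # "repeat until no change"
        changed = False
        for U, V in fds:
            # if U -> V is in F and U is a subset of closure, add V
            if set(U) <= closure and not set(V) <= closure:
                closure |= set(V)
                changed = True
    return closure


def implies(fds, X, Y):
    """F |- X -> Y iff Y is a subset of X+."""
    return set(Y) <= attribute_closure(X, fds)


# {sid, cid}+ reaches every attribute of the Student database.
full = attribute_closure({"sid", "cid"}, STUDENT_FDS)

# The exercise from the slide; inspect `answer` to check your own working.
answer = implies(STUDENT_FDS, {"sid", "sname"}, {"cname", "grade"})
```

Each pass over F either adds at least one attribute or stops the loop, so the loop runs at most once per attribute plus a final pass, which is far cheaper than enumerating F+.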

29

Preview of next lecture: Goals of normalisation

• Decide whether a relation is in “good form”

• If it is not, then we will “decompose” it into a set of relations such that
  – Each relation is in “good form”
  – The decomposition has not lost any information that was present in the original relation

• The theory of this process and the notion of “good form” are both based on FDs

30

Summary

You should now understand:
• Redundancy and various forms of anomalies
• Functional dependencies
• Armstrong’s axioms

Next lecture: Schema refinement: Normalisation