Chapter 5 Integrity Constraints

5.1 Domain Constraints

5.2 Referential Integrity

5.3 Assertions

5.4 Triggers

5.5 Functional Dependencies

5.6 Exercises

The term integrity refers to the accuracy or correctness of data in the database. Integrity constraints provide a means of ensuring that changes made to the database by authorized users do not result in a loss of data consistency. Thus, integrity constraints guard against accidental damage to the database. A given database might be subject to any number of integrity constraints, of arbitrary complexity.

We classify integrity constraints in general into four broad categories: type (domain) constraints, attribute constraints, relvar constraints, and database constraints.


5.1 Domain Constraints

We have seen that a domain of possible values must be associated with every attribute. Domain constraints are the most elementary form of integrity constraint. They are tested easily by the system whenever a new data item is entered into the database.

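Defined domains:

create domain Dollars numeric(12,2)

create domain Pounds numeric(12,2)

A domain can later be removed or changed with drop domain and alter domain.

A value of one domain can be converted to another domain with a cast:

cast r.A as Pounds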


Essentially, a domain constraint is (or is logically equivalent to) just an enumeration of the legal values of the domain.

Examples (check clause):

① A check clause can ensure that an hourly wage domain allows only values greater than or equal to a specified value.

create domain hourly-wage numeric(5,2)

constraint wage-value-test

check (value >= 4.00)

② The check clause can also be used to restrict a domain to not contain any null values.

create domain account-number char(10)

constraint account-number-null-test

check (value not null)


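③ The domain can be restricted to contain only a specified set of values by using the in clause.

create domain account-type char(10)

constraint account-type-test

check (value in ('checking', 'saving'))

④ The check condition can also contain a subquery, for example to require that a branch-name value appears in the branch relation:

check (branch-name in (select branch-name from branch))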
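As a rough illustration of how such checks behave, the sketch below uses Python's sqlite3 module. SQLite does not support create domain, so the three checks are attached directly to the columns of a hypothetical employee_account table. The table name, the column names (with underscores instead of hyphens), and the sample values are assumptions made for the example.

```python
import sqlite3

# SQLite has no "create domain", so each check is attached to a column.
# Table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("""
    create table employee_account (
        hourly_wage    numeric check (hourly_wage >= 4.00),
        account_number text not null,
        account_type   text check (account_type in ('checking', 'saving'))
    )
""")

def try_insert(row):
    """Return True if the row passes every constraint, False if it is rejected."""
    try:
        conn.execute("insert into employee_account values (?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False

accepted = try_insert((5.50, "A-101", "checking"))   # satisfies all three checks
low_wage = try_insert((3.00, "A-102", "saving"))     # violates the wage check
null_number = try_insert((5.50, None, "saving"))     # violates not null
bad_type = try_insert((5.50, "A-103", "deposit"))    # violates the in clause
```

Each rejected row raises sqlite3.IntegrityError, which is the point at which the system "tests" the constraint.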


5.2 Referential Integrity

1. Basic Concepts

Foreign keys: loosely speaking, a foreign key is a set of attributes of one relvar R2 whose values are required to match values of some candidate key of some relvar R1.

Let R2 be a relvar. Then a foreign key in R2 is a set of attributes of R2, say FK, such that:

a. there exists a relvar R1 (R1 and R2 not necessarily distinct) with a candidate key CK, and

b. for all time, each value of FK in the current value of R2 is identical to the value of CK in some tuple in the current value of R1.


The examples use the following relvars:

branch (branch-name, branch-city, assets)

account (account-number, branch-name, balance)

loan (loan-number, branch-name, amount)

depositor (customer-name, account-number)

customer (customer-name, customer-street, customer-city)

borrower (customer-name, loan-number)


Points arising:

① The definition requires every value of a given foreign key to appear as a value of the matching candidate key. Note, however, that the converse is not a requirement.

② A foreign key is simple or composite according as the candidate key it matches is simple or composite.

③ Each attribute of a given foreign key must have the same name and type as the corresponding component of the matching candidate key.

④ A foreign key value represents a reference to the tuple containing the matching candidate key value (the referenced tuple).


ⅰ The problem of ensuring that the database does not include any invalid foreign key values is known as the referential integrity problem.

ⅱ The constraint that values of a given foreign key must match values of the corresponding candidate key is known as a referential constraint.

ⅲ We refer to the relvar that contains the foreign key as the referencing relvar and the relvar that contains the corresponding candidate key as the referenced relvar.


⑤ Referential diagrams: consider depositor and account. We can represent the referential constraint that exists between them by means of the following referential diagram, in which the arrow is labelled with the foreign key account-number:

depositor → account

⑥ A given relvar can of course be both referenced and referencing, as in the case of R2 here:

R3 → R2 → R1


Referential path: let relvars Rn, R(n-1), ……, R2, R1 be such that there is a referential constraint from Rn to R(n-1), a referential constraint from R(n-1) to R(n-2), ……, and a referential constraint from R2 to R1:

Rn → R(n-1) → R(n-2) → …… → R2 → R1

Then the chain of arrows from Rn to R1 represents a referential path from Rn to R1.

Referential cycle:

Rn → R(n-1) → R(n-2) → …… → R2 → R1 → Rn


2. Referential Integrity

The database must not contain any unmatched foreign key values.

The term “unmatched foreign key value” here simply means a foreign key value in some referencing relvar for which there does not exist a matching value of the relevant candidate key in the relevant referenced relvar.

Here then is the syntax for defining a foreign key:

FOREIGN KEY { <item commalist> } REFERENCES <relvar name>

ⅰ Each <item> is either an <attribute name> of the referencing relvar or an expression of the form

RENAME <attribute name> AS <attribute name>

ⅱ The <relvar name> identifies the referenced relvar.


3. Database Modification

Database modification can cause violations of referential integrity. Suppose FK is a foreign key of R2 that references the candidate key CK of R1, so that the referential constraint is

Π_FK(R2) ⊆ Π_CK(R1)

① Insert: if a tuple t2 is inserted into R2, the system must ensure that there is a tuple t1 in R1 such that t1[CK] = t2[FK]. That is, t2[FK] ∈ Π_CK(R1).

② Delete: if a tuple t1 is deleted from R1, the system must compute the set of tuples in R2 that reference t1: σ_{FK = t1[CK]}(R2). If this set is not empty, either the delete is rejected as an error or the tuples that reference t1 must themselves be deleted.

③ Update: we must consider two cases: updates to the referencing relation (R2), and updates to the referenced relation (R1).


If a tuple t2 is updated in relation R2, and the update modifies values for the foreign key FK, then a test similar to the insert case is made. Let t2’ denote the new value of tuple t2. The system must ensure that t2’[FK] ∈ Π_CK(R1).

If a tuple t1 is updated in relation R1, and the update modifies values for the candidate key CK, then a test similar to the delete case is made. The system must compute, using the old value of t1, the set σ_{FK = t1[CK]}(R2).

If this set is not empty, the update is rejected as an error, or the update is cascaded in a manner similar to delete.
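The insert and delete tests can be sketched in a few lines of Python, treating a relation as a list of dictionaries. The helper names (project, insert_allowed, referencing_tuples) and the sample tuples are hypothetical, chosen only to illustrate the two tests.

```python
def project(relation, attrs):
    """Projection: the set of value tuples of `relation` on `attrs`."""
    return {tuple(t[a] for a in attrs) for t in relation}

def insert_allowed(t2, fk, r1, ck):
    """Insert test: t2[FK] must appear among the CK values of R1."""
    return tuple(t2[a] for a in fk) in project(r1, ck)

def referencing_tuples(t1, ck, r2, fk):
    """Delete test: the tuples of R2 whose FK value equals t1[CK]."""
    key = tuple(t1[a] for a in ck)
    return [t for t in r2 if tuple(t[a] for a in fk) == key]

# Illustrative data: account references branch through branch-name.
branch = [{"branch-name": "Perryridge"}, {"branch-name": "Downtown"}]
account = [{"account-number": "A-102", "branch-name": "Perryridge"}]
key = ["branch-name"]

ok = insert_allowed({"account-number": "A-201", "branch-name": "Downtown"},
                    key, branch, key)
rejected = insert_allowed({"account-number": "A-305", "branch-name": "Mianus"},
                          key, branch, key)
blocking = referencing_tuples(branch[0], key, account, key)  # accounts at Perryridge
free = referencing_tuples(branch[1], key, account, key)      # accounts at Downtown
```

A non-empty result from referencing_tuples is the case in which the delete (or the update of the candidate key) must be rejected or cascaded.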


4. Referential Integrity in SQL

Primary keys, candidate keys, and foreign keys can be specified as part of the SQL create table statement (primary key clause, unique clause, foreign key clause).

Example:

create table account

(account-number char(10) not null,

branch-name char(10),

balance integer,

primary key (account-number),

foreign key (branch-name) references branch,

check (balance >= 0))


Referential actions:

Problem:

delete from branch

where branch-name = 'Perryridge'

Assume this delete does exactly what it says, i.e., it deletes the branch tuple for branch-name Perryridge, no more and no less. Assume too that the database does include some accounts for branch-name Perryridge, and the application does not go on to delete those accounts. When the system checks the referential constraint from account to branch, then, it will find a violation, and an error will occur.


Solution: The obvious compensating action would be for the system to delete the accounts for branch-name Perryridge “automatically”. We can achieve this effect by extending the foreign key definition as follows:

create table account (……

foreign key (branch-name) references branch

on delete cascade

on update cascade

……)


The specification on delete cascade defines a delete rule for this particular foreign key, and the specification cascade is the referential action for that delete rule.

cascade: In the case at hand, cascade means that the delete operation “cascades” to delete the matching accounts as well.

restrict: In the case at hand, restrict would mean that delete operations are “restricted” to the case where there are no matching accounts.


Omitting a referential action for a particular foreign key is equivalent to specifying the “action” no action, which means what it says: the delete is performed exactly as requested, no more and no less.

Dealing with null values:

① All attributes of the primary key are implicitly declared to be not null.

② Attributes of a unique declaration are allowed to be null, provided that they have not otherwise been declared to be nonnull.

③ Attributes of foreign keys are allowed to be null, provided that they have not otherwise been declared to be nonnull.
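The effect of the different referential actions can be tried out with Python's sqlite3 module. This is only a sketch under some assumptions: SQLite enforces foreign keys only after pragma foreign_keys = on, the identifiers use underscores instead of hyphens, and delete_branch is a hypothetical helper written for this example.

```python
import sqlite3

def delete_branch(action):
    """Delete the Perryridge branch under the given delete rule.

    Returns (outcome, number of account rows left afterwards).
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("pragma foreign_keys = on")  # off by default in SQLite
    conn.execute("create table branch (branch_name text primary key)")
    conn.execute(f"""
        create table account (
            account_number text primary key,
            branch_name    text references branch on delete {action})
    """)
    conn.execute("insert into branch values ('Perryridge')")
    conn.execute("insert into account values ('A-102', 'Perryridge')")
    try:
        conn.execute("delete from branch where branch_name = 'Perryridge'")
        outcome = "deleted"
    except sqlite3.IntegrityError:
        outcome = "rejected"  # the delete would leave an unmatched foreign key
    remaining = conn.execute("select count(*) from account").fetchone()[0]
    conn.close()
    return outcome, remaining

cascade_result = delete_branch("cascade")
restrict_result = delete_branch("restrict")
no_action_result = delete_branch("no action")
```

With cascade the matching account disappears together with the branch; with restrict or no action the delete fails and both rows remain.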

5.3 Assertions

Assertions are general constraints. An assertion is a predicate expressing a condition that we wish the database always to satisfy. Domain constraints and referential-integrity constraints are special forms of assertions.

Assertions are defined by means of the create assertion syntax:

create assertion <assertion-name> check <predicate>

Examples:

① The sum of all loan amounts for each branch must be less than the sum of all account balances at the branch.

create assertion sum-constraint check

(not exists (select * from branch

where (select sum(amount) from loan

where loan.branch-name = branch.branch-name)

>= (select sum(balance) from account

where account.branch-name = branch.branch-name)))
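Few systems implement create assertion, so one way to experiment with such a predicate is to run its inner query by hand. The sketch below does this with Python's sqlite3 module; the query lists the branches whose total loan amount is not less than their total account balance, and the assertion holds exactly when that list is empty. The helper assertion_holds, the underscore identifiers, and the sample rows are assumptions made for the example.

```python
import sqlite3

# Branches that violate the constraint "sum of loans < sum of balances".
VIOLATIONS = """
    select branch_name from branch
    where (select sum(amount) from loan
           where loan.branch_name = branch.branch_name)
       >= (select sum(balance) from account
           where account.branch_name = branch.branch_name)
"""

def assertion_holds(conn):
    """True when no branch violates the constraint (the 'not exists' test)."""
    return conn.execute(VIOLATIONS).fetchone() is None

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table branch (branch_name text primary key);
    create table loan (loan_number text, branch_name text, amount integer);
    create table account (account_number text, branch_name text, balance integer);
    insert into branch values ('Perryridge');
    insert into loan values ('L-15', 'Perryridge', 1500);
    insert into account values ('A-102', 'Perryridge', 2000);
""")

before = assertion_holds(conn)  # loans 1500, balances 2000
conn.execute("insert into loan values ('L-16', 'Perryridge', 700)")
after = assertion_holds(conn)   # loans 2200, balances 2000
```

A system that supported the assertion would reject the second loan instead of letting the check fail afterwards.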

② Every loan has at least one customer who maintains an account with a minimum balance of $1000.00.

create assertion balance-constraint check

(not exists (select * from loan

where not exists (select *

from borrower, depositor, account

where loan.loan-number = borrower.loan-number

and borrower.customer-name = depositor.customer-name

and depositor.account-number = account.account-number

and account.balance >= 1000)))

5.4 Triggers

A trigger is a statement that is executed automatically by the system as a side effect of a modification to the database.

To design a trigger mechanism, we must meet two requirements:

① Specify the conditions under which the trigger is to be executed.

② Specify the actions to be taken when the trigger executes.

Example (overdrafts): let t denote the account tuple whose balance has become negative after an update. The trigger carries out the following steps:

① insert a new tuple s in the loan relation with

s[branch-name] = t[branch-name]

s[loan-number] = t[account-number]

s[amount] = -t[balance]

② insert a new tuple u in the borrower relation with

u[customer-name] = “Jones”

u[loan-number] = t[account-number]

③ set t[balance] to 0

Using SQL to write the account-overdraft trigger:

create trigger overdraft-trigger after update on account

referencing new row as nrow

for each row

when nrow.balance < 0

begin atomic

insert into borrower

(select customer-name, account-number

from depositor

where nrow.account-number = depositor.account-number);

insert into loan values

(nrow.account-number, nrow.branch-name, -nrow.balance);

update account set balance = 0

where account.account-number = nrow.account-number

end
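To watch an overdraft trigger of this kind fire, the sketch below recreates it in SQLite through Python's sqlite3 module. SQLite's trigger syntax differs from the version above (the new row is called new, there is no referencing clause, and the body is enclosed in begin … end), identifiers use underscores, and the sample account and depositor rows are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table account (account_number text primary key,
                          branch_name text, balance integer);
    create table depositor (customer_name text, account_number text);
    create table loan (loan_number text, branch_name text, amount integer);
    create table borrower (customer_name text, loan_number text);

    -- SQLite dialect of the overdraft trigger: "new" plays the role of nrow.
    create trigger overdraft_trigger after update on account
    for each row when new.balance < 0
    begin
        insert into borrower
            select customer_name, account_number from depositor
            where depositor.account_number = new.account_number;
        insert into loan values
            (new.account_number, new.branch_name, -new.balance);
        update account set balance = 0
            where account.account_number = new.account_number;
    end;

    insert into account values ('A-102', 'Perryridge', 400);
    insert into depositor values ('Jones', 'A-102');
""")

# Withdraw 500 from an account holding 400: the balance would become -100.
conn.execute("update account set balance = balance - 500 "
             "where account_number = 'A-102'")

balance = conn.execute("select balance from account").fetchone()[0]
loans = conn.execute("select * from loan").fetchall()
borrowers = conn.execute("select * from borrower").fetchall()
```

After the update the account balance is reset to zero and the overdraft of 100 appears as a loan numbered like the account, with Jones as the borrower.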

An alternative formulation, for a system in which the updated rows are available through an inserted table:

create trigger overdraft-trigger on account

for update

as

if inserted.balance < 0

begin

insert into borrower

(select customer-name, account-number

from depositor

where inserted.account-number = depositor.account-number);

insert into loan values

(inserted.account-number, inserted.branch-name, -inserted.balance);

update account set balance = 0

from account, inserted

where account.account-number = inserted.account-number

end

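A before trigger can adjust the new values before the update takes effect, for example replacing a blank phone number with null:

create trigger setnull-trigger before update on r

referencing new row as nrow

for each row

when nrow.phone-number = ''

set nrow.phone-number = null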

5.5 Functional Dependencies

5.5.1 Introduction

Basically, a functional dependency is a many-to-one relationship from one set of attributes to another within a given relvar.

Example:

There is a functional dependency from the set of attributes {branch-name} to the set of attributes {assets}.

a. For any given value of attribute branch-name, there is just one corresponding value of attribute assets, but

b. Many distinct values of attribute branch-name can have the same corresponding value of attribute assets.

5.5.2 Basic Concepts

Now, it is very important in this area—as in so many others! —to distinguish clearly between (a) the value of a given relvar at a given point in time and (b) the set of all possible values that the given relvar might assume at different times.

Here then is the definition for case (a):

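Let r be a relation, and let X and Y be arbitrary subsets of the set of attributes of r. Then we say that Y is functionally dependent on X (in symbols, X → Y) if and only if each X value in r has associated with it precisely one Y value in r. In other words, whenever two tuples of r agree on their X value, they also agree on their Y value.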


Consider the branch relvar. A possible value for relvar branch is shown in Fig A.

branch-name    branch-city    assets

Downtown       Brooklyn       9000000

Redwood        Palo Alto      4000000

Perryridge     Horseneck      3000000

Mianus         Horseneck      11000000

Example: The relation shown in Fig A satisfies the FD

{branch-name} → {branch-city}


Here then is the definition for case (b):

Let r be a relation variable, and let X and Y be arbitrary subsets of the set of attributes of r. Then we say that Y is functionally dependent on X (in symbols, X → Y) if and only if, in every possible legal value of r, each X value has associated with it precisely one Y value. In other words, whenever two tuples agree on their X value, they also agree on their Y value.

Examples: branch-name → branch-city holds in every legal value of branch. By contrast, assets → branch-name is satisfied by the value in Fig A but does not hold for all time, because distinct branches can have the same assets.
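Whether one particular relation value satisfies an FD (case (a)) can be checked mechanically. The following sketch uses a hypothetical helper satisfies_fd and the branch value of Fig A; it says nothing about case (b), which concerns every legal value of the relvar and so cannot be decided from a single sample.

```python
def satisfies_fd(relation, x, y):
    """True if any two tuples that agree on attributes X also agree on Y."""
    seen = {}  # X value -> the Y value first seen with it
    for t in relation:
        x_val = tuple(t[a] for a in x)
        y_val = tuple(t[a] for a in y)
        if seen.setdefault(x_val, y_val) != y_val:
            return False
    return True

# The branch value of Fig A.
branch = [
    {"branch-name": "Downtown", "branch-city": "Brooklyn", "assets": 9000000},
    {"branch-name": "Redwood", "branch-city": "Palo Alto", "assets": 4000000},
    {"branch-name": "Perryridge", "branch-city": "Horseneck", "assets": 3000000},
    {"branch-name": "Mianus", "branch-city": "Horseneck", "assets": 11000000},
]

name_to_city = satisfies_fd(branch, ["branch-name"], ["branch-city"])
city_to_name = satisfies_fd(branch, ["branch-city"], ["branch-name"])
assets_to_name = satisfies_fd(branch, ["assets"], ["branch-name"])
```

Here city_to_name is False because Horseneck appears with two different branch names, while assets_to_name is True only because no two branches in this sample share the same assets.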


We now observe that if X is a candidate key of relvar r, then all attributes Y of relvar r must be functionally dependent on X.

Example: For the relvar customer,

{customer-name} → {customer-name, customer-street, customer-city}

In fact, if relvar r satisfies the FD A → B and A is not a candidate key, then r will involve some redundancy.


Problem:

Now, even if we restrict our attention to FDs that hold for all time, the complete set of FDs for a given relvar can still be very large.

Solution: Given a particular set S of FDs, therefore, it is desirable to find some other set T that is much smaller than S and has the property that every FD in S is implied by the FDs in T. If such a set T can be found, it is sufficient that the DBMS enforce just the FDs in T, and the FDs in S will then be enforced automatically.

5.5.3 Trivial and Nontrivial Dependencies

One obvious way to reduce the size of the set of FDs we need to deal with is to eliminate the trivial dependencies. A dependency is “trivial” if it cannot possibly not be satisfied.

Example: branch-name → branch-name

In fact, an FD is trivial if and only if the right-hand side is a subset (not necessarily a proper subset) of the left-hand side.

5.5.4 Closure of a Set of Functional Dependencies

We shall see that, given a set F of functional dependencies, we can prove that certain other functional dependencies hold. We say that such functional dependencies are logically implied by F.

Example:

Given a relation schema R = {A, B, C, G, H, I} and the set of functional dependencies

A → B, A → C, CG → H, CG → I, B → H

the functional dependency A → H is logically implied.

F+: Let F be a set of functional dependencies. The closure of F is the set of all functional dependencies logically implied by F. We denote the closure of F by F+.

Armstrong’s axioms: We adopt the convention of using Greek letters (α, β, γ, …) for sets of attributes, and uppercase Roman letters from the beginning of the alphabet for individual attributes. We use αβ to denote α ∪ β.

① Reflexivity rule: if α is a set of attributes and β ⊆ α, then α → β holds.

② Augmentation rule: if α → β holds and γ is a set of attributes, then γα → γβ holds.

③ Transitivity rule: if α → β holds and β → γ holds, then α → γ holds.

These rules are sound, because they do not generate any incorrect functional dependencies. The rules are complete, because, for a given set F of functional dependencies, they allow us to generate all of F+.

The following additional rules can be derived from Armstrong’s axioms:

④ Union rule: if α → β holds and α → γ holds, then α → βγ holds.

⑤ Decomposition rule: if α → βγ holds, then α → β holds and α → γ holds.

⑥ Pseudotransitivity rule: if α → β holds and γβ → δ holds, then αγ → δ holds.


Example: Given a relation schema R = {A, B, C, G, H, I} and the set of functional dependencies

A → B, A → C, CG → H, CG → I, B → H

① Use Armstrong’s axioms to show that A → H, CG → HI, and AG → I.

A → H: A → B and B → H (transitivity rule)

CG → HI: CG → H and CG → I (union rule)

AG → I: A → C and CG → I (pseudotransitivity rule)


② Suppose we are given a relation schema R = (A, B, C, D, E, F) and the set of functional dependencies

A → BC, B → E, CD → EF

Prove AD → F.

A → BC, so A → C (decomposition rule)

A → C, so AD → CD (augmentation rule)

AD → CD and CD → EF, so AD → EF (transitivity rule)

AD → EF, so AD → F (decomposition rule)

5.5.5 Closure of attribute Sets

Compute a certain subset of the closure:

Given a relvar R, a set Z of attributes of R, and a set S of FDs that hold for R, we can determine the set of all attributes of R that are functionally dependent on Z —the so-called closure Z+ of Z under S.


Algorithm:

closure[Z,S] := Z;

do “forever”;

for each FD X → Y in S

do;

if X ⊆ closure[Z,S]

then closure[Z,S] := closure[Z,S] ∪ Y;

end;

if closure[Z,S] did not change on this iteration

then leave loop;

end;


Example: Suppose we are given relvar R with attributes A, B, C, D, E, F and FDs

A → BC, E → CF, B → E, CD → EF

We now compute the closure {A,B}+ of the set of attributes {A,B} under this set of FDs.

1. We initialize the result closure[Z,S] to {A,B}.

2. Go round the inner loop four times, once for each of the given FDs. On the first iteration (for the FD A → BC) we add C to closure[Z,S], which now has the value {A,B,C}.

3. On the second iteration (for the FD E → CF) closure[Z,S] remains unchanged.

4. On the third iteration (for the FD B → E) we add E to closure[Z,S], which now has the value {A,B,C,E}.

5. On the fourth iteration (for the FD CD → EF) closure[Z,S] remains unchanged.

6. Go round the inner loop four times again. On the first iteration, the result does not change; on the second, it expands to {A,B,C,E,F}; on the third and fourth, it does not change.

7. Go round the inner loop four times again. closure[Z,S] does not change, and so the whole process terminates with {A,B}+ = {A,B,C,E,F}.


An important corollary of the foregoing is as follows: given a set S of FDs, we can easily tell whether a specific FD X → Y follows from S, because that FD will follow if and only if Y is a subset of the closure X+ of X under S.

Another important corollary is the following. The superkeys for a given relvar R are precisely those subsets K of the attributes of R such that the FD K → A holds true for every attribute A of R.
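The closure algorithm and its two corollaries translate almost directly into Python. This is a minimal sketch, assuming single-letter attribute names so that a string such as "BC" can stand for the set {B, C}; the function names closure, implies, and is_superkey are hypothetical.

```python
def closure(attrs, fds):
    """Closure of the attribute set `attrs` under `fds`, a list of (X, Y) pairs."""
    result = set(attrs)
    changed = True
    while changed:  # the 'do forever' loop: stop once a full pass adds nothing
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def implies(fds, lhs, rhs):
    """First corollary: X -> Y follows from S iff Y is a subset of X+."""
    return set(rhs) <= closure(lhs, fds)

def is_superkey(attrs, all_attrs, fds):
    """Second corollary: K is a superkey iff K+ contains every attribute of R."""
    return closure(attrs, fds) >= set(all_attrs)

# The FDs of the worked example: A -> BC, E -> CF, B -> E, CD -> EF.
fds = [("A", "BC"), ("E", "CF"), ("B", "E"), ("CD", "EF")]

ab_closure = closure("AB", fds)
ab_gives_f = implies(fds, "AB", "F")
ab_gives_d = implies(fds, "AB", "D")
ad_is_superkey = is_superkey("AD", "ABCDEF", fds)
ab_is_superkey = is_superkey("AB", "ABCDEF", fds)
```

The computed ab_closure is the set {A, B, C, E, F} reached in the worked example; since D is missing from it, {A,B} is not a superkey, whereas {A,D} is.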

5.5.6 Irreducible Sets of dependencies

Let S1 and S2 be two sets of FDs. If every FD implied by S1 is implied by S2—i.e. if S1+ is a subset of S2+ —we say that S2 is a cover for S1. What this means is that if the DBMS enforces the FDs in S2, then it will automatically be enforcing the FDs in S1.


Now we define a set S of FDs to be irreducible if and only if it satisfies the following three properties:

1. The right-hand side of every FD in S involves just one attribute.

2. The left-hand side of every FD in S is irreducible in turn, meaning that no attribute can be discarded from the determinant without changing the closure S+. We will say that such an FD is left-irreducible.

3. No FD in S can be discarded from S without changing the closure S+.


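Example: each of the following sets fails one of the three properties and so is not irreducible.

① S1 = { A → BD, A → D, A → E }

The right-hand side of A → BD involves more than one attribute.

② S2 = { AB → C, A → B, A → D, A → C }

AB → C is not left-irreducible: B can be discarded, because A → C already holds.

③ S3 = { A → A, A → B, A → C, A → D }

The trivial FD A → A can be discarded without changing the closure.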


Algorithm:

Example:

Relvar R{A,B,C,D,E,G} satisfies the following FDs:

R =

AB → C     D → EG

C → A      BE → C

BC → D     CG → BD

ACD → B    CE → AG

Find an irreducible equivalent for this set of FDs.


① Use the decomposition rule to rewrite the FDs so that each has a singleton right-hand side:

R =

AB → C     D → E      CG → D

C → A      D → G      CE → A

BC → D     BE → C     CE → G

ACD → B    CG → B

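② For each FD f in R, if deleting f from R has no effect on the closure R+, we delete f from R. In each test below, the closure of the left-hand side is computed without f (and without the FDs already deleted):

AB → C: {AB}+ = {AB}, C ∉ {AB}

ACD → B: {ACD}+ = {ABCDEG}, B ∈ {ABCDEG}, delete ACD → B

BC → D: {BC}+ = {BCA}, D ∉ {BCA}

C → A: {C}+ = {C}, A ∉ {C}

D → E: {D}+ = {DG}, E ∉ {DG}

D → G: {D}+ = {DE}, G ∉ {DE}

BE → C: {BE}+ = {BE}, C ∉ {BE}

CG → B: {CG}+ = {CGDAE}, B ∉ {CGDAE}

CG → D: {CG}+ = {CGBADE}, D ∈ {CGBADE}, delete CG → D

CE → A: {CE}+ = {CEGABD}, A ∈ {CEGABD}, delete CE → A

CE → G: {CE}+ = {CEA}, G ∉ {CEA}

R1 =

AB → C     D → G

C → A      BE → C

BC → D     CG → B

D → E      CE → G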


③ For each FD f in R1, we examine each attribute A in the left-hand side of f; if deleting A from the left-hand side of f has no effect on the closure R1+, we delete A from the left-hand side of f.

AB → C: {A}+ = {A}, {B}+ = {B}, so neither attribute can be deleted; ……

No left-hand side can be reduced, so R1 is an irreducible set of dependencies.


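If we carry out step three before step two, then:

ACD → B: {CD}+ = {CDAEGB}, B ∈ {CDAEGB}, so ACD → B is replaced by CD → B.

R2 =

AB → C     D → E      CG → D

C → A      D → G      CE → A

BC → D     BE → C     CE → G

CD → B     CG → B

Now carry out step two:

CG → B: {CG}+ = {CGDBAE}, B ∈ {CGDBAE}, delete CG → B

CE → A: {CE}+ = {CEGADB}, A ∈ {CEGADB}, delete CE → A

R =

AB → C     D → E      CE → G

C → A      D → G

BC → D     BE → C

CD → B     CG → D

This result differs from R1, so an irreducible equivalent of a given set of FDs is not necessarily unique.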
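The procedure can also be sketched in Python. The version below follows the second ordering (left-reduce first, then drop redundant FDs), because removing redundant FDs afterwards cannot reintroduce a reducible left-hand side. It assumes single-letter attribute names, and closure and irreducible_cover are hypothetical helper names. A different processing order can yield a different irreducible set.

```python
def closure(attrs, fds):
    """Closure of the attribute set `attrs` under `fds`, a list of (X, Y) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def irreducible_cover(fds):
    """Return one irreducible equivalent of `fds` as (frozenset lhs, attr) pairs."""
    # Step 1 (decomposition): one FD per right-hand attribute.
    work = [(frozenset(lhs), attr) for lhs, rhs in fds for attr in rhs]
    # Left-reduce: drop an attribute if the FD still follows without it.
    for i, (lhs, attr) in enumerate(work):
        for extra in sorted(lhs):
            smaller = lhs - {extra}
            if attr in closure(smaller, work):
                lhs = smaller
                work[i] = (lhs, attr)
    # Drop duplicates, then every FD that the remaining FDs already imply.
    kept = list(dict.fromkeys(work))
    for fd in list(kept):
        rest = [g for g in kept if g != fd]
        if fd[1] in closure(fd[0], rest):
            kept = rest
    return kept

# The FDs of the example above.
fds = [("AB", "C"), ("D", "EG"), ("C", "A"), ("BE", "C"),
       ("BC", "D"), ("CG", "BD"), ("ACD", "B"), ("CE", "AG")]
cover = irreducible_cover(fds)
```

Processing the FDs in the order listed, this sketch arrives at the same nine FDs as the final set R above.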

5.6 Exercises

1. Relvar R{A,B,C,D,E,F,G} satisfies the following FDs:

A → B

BC → DE

AEF → G

Compute the closure {AC}+ under this set of FDs. Is the FD ACF → DG implied by this set?

2. Relvar R{A,B,C,D,E,F} satisfies the following FDs:

AB → C     BE → C

C → A      CE → FA

BC → D     CF → BD

ACD → B    D → EF


3. Relvar R{A,B,C,D,E,G,H,P} satisfies the following FDs:

ABCE       CDE → P

A → C      HB → P

GPB        D → HG

EP → A     ABC → PG