Chapter :- 3 Database Designing



What is Database Designing?

Each relation schema consists of a number of attributes, and a relational database schema consists of a number of relation schemas.

So far we have assumed that attributes are grouped to form a relation schema by using the common sense of the database designer, or by mapping a database schema design from a conceptual model.

However, we still need some formal measure of why one grouping of attributes into a relation schema may be better than another.

There are two levels at which we can discuss the goodness of a relation schema:
1. Logical / Conceptual Level
2. Implementation / Storage Level

Logical Level :-

How users interpret the relation schemas and the meaning of their attributes.

Having a good relation schema at this level enables users to understand clearly the meaning of the data in the relations, and hence to formulate their queries correctly.

Implementation Level :-

How the tuples in a base relation are stored and updated.

This level applies only to the schemas of base relations, which will be physically stored as files, whereas at the logical level we are interested in the schemas of both base relations and views (virtual relations).

As with many design problems, database design may be performed using two approaches:
1. Bottom-Up Design Methodology
2. Top-Down Design Methodology

Bottom-Up Design Methodology :-

This methodology considers the basic relationships among individual attributes as the starting point and uses those to construct relation schemas.

This approach is not very popular in practice because it suffers from the problem of having to collect a large number of binary relationships among attributes as the starting point.

Top-Down Design Methodology :-

Also called Design by Analysis. It starts with a number of groupings of attributes into relations that exist together naturally, for example, on an invoice, a form, or a report.

The relations are then analyzed individually and collectively.

The theory described here is applicable to both the top-down and bottom-up design approaches, but is more practical when used with the top-down approach.

Functional Dependency

A functional dependency is a constraint between two sets of attributes from the database.

Suppose our relational database schema has N attributes A1, A2, A3, …, AN.

Let us think of the whole database as being described by a single “Universal Relation Schema” R = {A1, A2, A3, …, AN}.

We do not imply that we will actually store the database as a single table.

Instead, the database is divided into a collection of tables.

To retrieve the value of one attribute based on the value of another attribute, we have to follow some constraints (normalization).

Definition :-

“A functional dependency, denoted by X --> Y, between two sets of attributes X and Y that are subsets of R specifies a constraint on the possible tuples that can form a relation state r of R. The constraint is that, for any two tuples t1 and t2 in r that have t1[X] = t2[X], they must also have t1[Y] = t2[Y].”

This means that the values of the Y component of a tuple in r depend on, or are determined by, the values of the X component.

Alternatively, the values of the X component uniquely determine the values of the Y component.

We also say that there is a functional dependency from X to Y, or that Y is functionally dependent on X.

The abbreviation for functional dependency is FD or f.d.

The set of attributes X is called the left-hand side of the FD.

The set of attributes Y is called the right-hand side of the FD.

Thus X functionally determines Y in a relation schema R if and only if, whenever two tuples of r(R) agree on their X values, they must also agree on their Y values.
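
The definition can be tried out on a small relation state. The following is a minimal sketch, not part of the original slides: the function name fd_holds and the sample tuples are hypothetical, chosen only to illustrate the test "tuples that agree on X must agree on Y".

```python
def fd_holds(rows, lhs, rhs):
    """Return True if no pair of tuples in `rows` violates lhs --> rhs.

    rows: list of dicts (one dict per tuple); lhs, rhs: lists of attribute names.
    """
    seen = {}  # maps each X-value to the Y-value first seen with it
    for t in rows:
        x = tuple(t[a] for a in lhs)
        y = tuple(t[a] for a in rhs)
        if seen.setdefault(x, y) != y:
            return False  # two tuples agree on X but differ on Y
    return True

# Hypothetical relation state (values invented for illustration).
r = [
    {"SSN": 1, "Pnumber": 10, "Hours": 20, "Ename": "Raj"},
    {"SSN": 1, "Pnumber": 20, "Hours": 5,  "Ename": "Raj"},
    {"SSN": 2, "Pnumber": 10, "Hours": 8,  "Ename": "Ram"},
]
print(fd_holds(r, ["SSN"], ["Ename"]))             # True
print(fd_holds(r, ["SSN"], ["Hours"]))             # False
print(fd_holds(r, ["SSN", "Pnumber"], ["Hours"]))  # True
```

Note that a True result only shows that this particular state does not violate the FD. Whether the FD should hold on every state is decided by the meaning of the attributes, not by one sample.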

If a constraint on R states that there cannot be more than one tuple with a given X value in any relation instance r(R), that is, X is a candidate key of R, then X --> Y holds for any subset of attributes Y of R.

A functional dependency is a property of the semantics (meaning) of the attributes.

The database designer uses their understanding of the semantics of the attributes of R, that is, how they relate to one another, to specify the FDs that should hold on all relation states r of R.

Whenever the semantics of two sets of attributes in R indicate that an FD should hold, we specify the dependency as a constraint.

The relation extensions r(R) that satisfy the FD constraints are called legal relation states or legal extensions.

Example :-
Consider the relation schema EMP_PROJ(SSN, Pnumber, Hours, Ename, Pname, Plocation) with the following functional dependencies:

A. SSN --> Ename
B. Pnumber --> {Pname, Plocation}
C. {SSN, Pnumber} --> Hours

These functional dependencies specify that:
(A) The value of an employee’s Social Security Number (SSN) uniquely determines the employee name (Ename).
(B) The value of a project number (Pnumber) uniquely determines the project name (Pname) and project location (Plocation).
(C) A combination of SSN and Pnumber values uniquely determines the number of hours (Hours).

Inference Rules for FD :-

We denote by F the set of functional dependencies that are specified on relation schema R.

Typically, the schema designer specifies the FDs that are semantically obvious; usually, however, numerous other FDs hold in all legal relation instances that satisfy the dependencies in F, and these can be derived from F.

In real life it is impossible to specify all possible FDs for a given situation.

For Example :-

If each department has one manager, so that Dept_No uniquely determines Mgr_SSN (Dept_No --> Mgr_SSN), and a manager has a unique phone number called Mgr_Phone (Mgr_SSN --> Mgr_Phone), then these two dependencies together imply that Dept_No --> Mgr_Phone.

This is an inferred FD. Therefore, it is useful to define a concept called closure that includes all possible dependencies that can be inferred from the given set F.

Definition :-
“Formally, the set of all dependencies that includes F as well as all dependencies that can be inferred from F is called the closure of F; it is denoted by F+.”

For Example :-

Suppose that we specify the following set F of obvious FDs on the relation schema EMP_DEPT.

F = { SSN --> {Ename, Bdate, Address, Dnumber}, Dnumber --> {Dname, Dmgr_SSN} }

Some of the additional dependencies that we can infer from F are the following:

SSN --> {Dname, Dmgr_SSN}
SSN --> SSN
Dnumber --> Dname
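
One common way to test whether a single FD belongs to F+ without listing all of F+ is to compute the closure of its left-hand side attributes under F. This attribute-closure technique is not described in the slides, and the names closure and infers are illustrative. The sketch below checks the three inferred dependencies listed above.

```python
def closure(attrs, fds):
    """Return the set of attributes determined by `attrs` under `fds`.

    fds: list of (lhs, rhs) pairs, each a set of attribute names.
    """
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If everything on the left is already determined, add the right side.
            if lhs.issubset(result) and not rhs.issubset(result):
                result |= rhs
                changed = True
    return result

def infers(fds, lhs, rhs):
    """True if lhs --> rhs can be inferred from fds (rhs lies in the closure of lhs)."""
    return set(rhs).issubset(closure(lhs, fds))

F = [
    ({"SSN"}, {"Ename", "Bdate", "Address", "Dnumber"}),
    ({"Dnumber"}, {"Dname", "Dmgr_SSN"}),
]
print(infers(F, {"SSN"}, {"Dname", "Dmgr_SSN"}))  # True
print(infers(F, {"SSN"}, {"SSN"}))                # True
print(infers(F, {"Dnumber"}, {"Dname"}))          # True
print(infers(F, {"Dnumber"}, {"Ename"}))          # False
```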

An FD X --> Y is inferred from a set of dependencies F specified on R if X --> Y holds in every legal relation state r of R; that is, whenever r satisfies all the dependencies in F, X --> Y also holds in r.

The closure F+ of F is the set of all FDs that can be inferred from F.

To determine a systematic way to infer dependencies, we must discover a set of inference rules that can be used to infer new dependencies from a given set of dependencies.

We use the notation F |= X --> Y to denote that the FD X --> Y is inferred from the set of FDs F.

We use an abbreviated notation when discussing FDs. We concatenate attribute variables and drop the commas for convenience.

Hence, the FD {X,Y} --> Z is abbreviated to XY --> Z, and the FD {X,Y,Z} --> {U,V} is abbreviated to XYZ --> UV.

The following six rules, IR1 through IR6, are well-known inference rules for FDs (mnemonic: RAT DUP).

1. IR1 Reflexive Rule ------------- If Y is a subset of X, then X --> Y
2. IR2 Augmentation Rule ------- {X --> Y} |= XZ --> YZ
3. IR3 Transitive Rule ------------- {X --> Y, Y --> Z} |= X --> Z
4. IR4 Decomposition Rule ------ {X --> YZ} |= X --> Y
5. IR5 Union or Additive Rule --- {X --> Y, X --> Z} |= X --> YZ
6. IR6 Pseudotransitive Rule ---- {X --> Y, WY --> Z} |= WX --> Z

IR1 Reflexive Rule: If Y is a subset of X, then X --> Y

Proof of IR1 :-

Suppose that Y is a subset of X and that two tuples t1 and t2 exist in some relation instance r of R such that t1[X] = t2[X]. Then t1[Y] = t2[Y] because Y is a subset of X; hence, X --> Y must hold in r.

IR2 Augmentation Rule {X --> Y} |= XZ --> YZ

Proof of IR2 :-

Assume that X --> Y holds in a relation instance r of R but that XZ --> YZ does not hold. Then there must exist two tuples t1 and t2 in r such that (1) t1[X] = t2[X], (2) t1[Y] = t2[Y], (3) t1[XZ] = t2[XZ], and (4) t1[YZ] ≠ t2[YZ].

This is not possible because (5) t1[Z] = t2[Z] follows from (1) and (3), and (6) t1[YZ] = t2[YZ] follows from (2) and (5), contradicting (4).

IR3 Transitive Rule {X --> Y, Y --> Z} |= X --> Z

Proof of IR3 :-

Assume that (1) X --> Y and (2) Y --> Z both hold in a relation r.

Then, for any two tuples t1 and t2 in r such that t1[X] = t2[X], we must have (3) t1[Y] = t2[Y], from assumption (1).
Hence, we must also have (4) t1[Z] = t2[Z], from (3) and assumption (2).
Hence, X --> Z must hold in r.

IR4 Decomposition Rule {X --> YZ} |= X --> Y

Proof of IR4 :-

1. X --> YZ (given)
2. YZ --> Y (using IR1, knowing that Y is a subset of YZ)
3. X --> Y (using IR3 on 1 and 2)

IR5 Union Rule {X --> Y, X --> Z} |= X --> YZ

Proof of IR5 :-

1. X --> Y (given)
2. X --> Z (given)
3. X --> XY (using IR2 on 1 by augmenting with X; note that XX = X)
4. XY --> YZ (using IR2 on 2 by augmenting with Y)
5. X --> YZ (using IR3 on 3 and 4)

IR6 Pseudotransitive Rule {X --> Y, WY --> Z} |= WX --> Z

Proof of IR6 :-

1. X --> Y (given)
2. WY --> Z (given)
3. WX --> WY (using IR2 on 1 by augmenting with W)
4. WX --> Z (using IR3 on 3 and 2)

Armstrong’s Inference Rules

It was shown by Armstrong (1974) that inference rules IR1 through IR3 are sound and complete.

By sound, we mean that, given a set of FDs F specified on a relation schema R, any dependency that we can infer from F by using IR1 through IR3 holds in every relation state r of R that satisfies the dependencies in F.

By complete, we mean that using IR1 through IR3 repeatedly to infer dependencies, until no more dependencies can be inferred, results in the complete set of all possible dependencies that can be inferred from F.

Hence, inference rules IR1 through IR3 are known as Armstrong’s inference rules.

Normalization of Relation

Initially, Dr. E. F. Codd proposed three normal forms, which he called First Normal Form (1NF), Second Normal Form (2NF), and Third Normal Form (3NF). A stronger definition of 3NF is called Boyce-Codd Normal Form (BCNF).

These normal forms are based on a single analytical tool: the functional dependencies among the attributes of a relation.

Later, 4NF and 5NF were proposed, based on the concepts of multivalued dependency and join dependency, respectively.

Normalization of Data

It can be considered the process of analyzing the relation schemas based on their FDs and primary keys to achieve the desirable properties of:
1. Minimizing redundancy, and
2. Minimizing the insertion, deletion, and update anomalies.

Unsatisfactory relation schemas that do not meet certain conditions (the normal form tests) are decomposed into smaller relation schemas that meet the tests and hence possess the desirable properties.

Thus, the normalization procedure provides database designers with the following:
• A formal framework for analyzing relation schemas based on their keys and on the functional dependencies among their attributes.
• A series of normal form tests that can be carried out on individual relation schemas so that the relational database can be normalized to any desired degree.

Definition : The normal form of a relation refers to the highest normal form condition that it meets, and hence indicates the degree to which it has been normalized.

Normal forms, when considered in isolation from other factors, do not guarantee a good database design.

It is generally not sufficient to check separately that each relation schema in the database is, say, in BCNF or in 3NF.

Rather, the process of normalization through decomposition must also confirm the existence of additional properties that the relation schemas, taken together, should possess.

This would include two properties:
1. Lossless Join or Nonadditive Join Property
Which guarantees that the spurious tuple generation problem does not occur.

2. Dependency Preservation Property
Which ensures that each functional dependency is represented in some individual relation resulting after decomposition.

The nonadditive join property is extremely critical and must be achieved at any cost, whereas dependency preservation, although desirable, is sometimes sacrificed.

First Normal Form (1NF)

1NF is now considered to be part of the formal definition of a relation in the basic relational model.

It was defined to disallow multivalued attributes, composite attributes, and their combinations.

It states that the domain of an attribute must include only atomic (simple, indivisible) values and that the value of any attribute in a tuple must be a single value from the domain of that attribute.

The only attribute values permitted by 1NF are single atomic (or indivisible) values.

Consider the Department relation schema:

Department(Dname, Dnumber, Dmgr_ssn, Dlocation)

The domain of Dlocation contains atomic values, but some tuples can have a set of these values.

In this case, Dlocation is not functionally dependent on the primary key Dnumber.

Dname       Dnumber  Dmgr_ssn   Dlocation
Research    5        333445555  {Surat, Baroda, Rajkot}
Production  4        987654321  {Ahmadabad}
Sales       1        888665555  {Bhavnagar}

The Department relation is not in 1NF.

There are three main techniques to achieve 1NF.

1. Remove the attribute Dlocation that violates 1NF and place it in a separate relation along with the primary key Dnumber.

2. Expand the key so that there will be a separate tuple in the original Department relation for each location of a department, as shown below. The primary key then becomes the combination {Dnumber, Dlocation}.

Dname       Dnumber  Dmgr_ssn   Dlocation
Research    5        333445555  Surat
Research    5        333445555  Baroda
Research    5        333445555  Rajkot
Production  4        987654321  Ahmadabad
Sales       1        888665555  Bhavnagar

3. If a maximum number of values is known for the attribute (for example, if it is known that at most three locations can exist for a department), replace the Dlocation attribute by three atomic attributes: Dlocation1, Dlocation2, Dlocation3.

• This solution has the disadvantage of introducing NULL values if most departments have fewer than three locations.

• Of the three solutions, the first is generally considered best because it does not suffer from redundancy.

• 1NF also disallows multivalued attributes that are themselves composite. These are called nested relations.
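
The first technique can be sketched in a few lines. This is only an illustration using the Department rows from the example; the function name to_1nf and the name of the second relation (dept_locations) are assumptions, not names from the slides.

```python
# Department rows from the example; Dlocation holds a set of values (not 1NF).
department = [
    ("Research",   5, "333445555", {"Surat", "Baroda", "Rajkot"}),
    ("Production", 4, "987654321", {"Ahmadabad"}),
    ("Sales",      1, "888665555", {"Bhavnagar"}),
]

def to_1nf(rows):
    """Split rows into (Dname, Dnumber, Dmgr_ssn) and (Dnumber, Dlocation)."""
    dept = [(name, num, mgr) for name, num, mgr, _ in rows]
    # One (Dnumber, Dlocation) tuple per location; the key is the pair itself.
    locations = sorted((num, loc) for _, num, _, locs in rows for loc in locs)
    return dept, locations

dept, dept_locations = to_1nf(department)
for row in dept:
    print(row)
for row in dept_locations:
    print(row)
```

Each department now appears once in the first relation, so Dname and Dmgr_ssn are not repeated for every location as they are in the second technique.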

Emp_Proj(SSN, Ename, {Projs(Pnumber, Hours)})

SSN  Ename  Projs
            Pnumber  Hours
1    Raj    1        33
            2        15
2    Ram    3        10
3    Jay    2        10
            3        12
4    Vijay  30       16
            10       16

Using the first technique, we can decompose the same relation schema into two separate relation schemas, as below:

Emp_Proj_1(SSN, Ename)
Emp_Proj_2(SSN, Pnumber, Hours)

Second Normal Form (2NF)

2NF is based on the concept of Full Functional Dependency.

An FD X --> Y is a full functional dependency if removal of any attribute A from X means that the dependency no longer holds; that is, for any attribute A ∈ X, (X - {A}) does not functionally determine Y.

A functional dependency X --> Y is a partial dependency if some attribute A ∈ X can be removed from X and the dependency still holds; that is, for some A ∈ X, (X - {A}) --> Y.

Definition :-

“A relation schema R is in 2NF if every nonprime attribute A in R is fully functionally dependent on the primary key of R.”

The test for 2NF involves testing for functional dependencies whose left-hand side attributes are part of the primary key.

If the primary key contains a single attribute, the test need not be applied at all.
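
The 2NF test can be sketched as a search for nonprime attributes that are determined by a proper subset of the primary key. This is a minimal illustration under stated assumptions: the helper names are hypothetical, the closure routine is a standard technique not given in the slides, and the FDs are the three EMP_PROJ dependencies from the earlier example.

```python
from itertools import combinations

def closure(attrs, fds):
    """Attributes determined by `attrs` under fds (list of (lhs, rhs) sets)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs.issubset(result) and not rhs.issubset(result):
                result |= rhs
                changed = True
    return result

def partial_dependencies(key, nonprime, fds):
    """List (part_of_key, attribute) pairs that violate 2NF."""
    found = []
    for size in range(1, len(key)):  # proper, non-empty subsets of the key
        for part in combinations(sorted(key), size):
            determined = closure(part, fds)
            for a in sorted(nonprime):
                if a in determined:
                    found.append((part, a))
    return found

fds = [
    ({"SSN", "Pnumber"}, {"Hours"}),
    ({"SSN"}, {"Ename"}),
    ({"Pnumber"}, {"Pname", "Plocation"}),
]
key = {"SSN", "Pnumber"}
nonprime = {"Hours", "Ename", "Pname", "Plocation"}
for part, attr in partial_dependencies(key, nonprime, fds):
    print(part, "-->", attr)
```

With a single-attribute key the loop over proper subsets is empty, which matches the remark that the test need not be applied in that case.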

Emp_Proj(SSN, Pnumber, Hours, Ename, Pname, Plocation)
FD1: {SSN, Pnumber} --> Hours
FD2: SSN --> Ename
FD3: Pnumber --> {Pname, Plocation}

2nd Normal Form :-

Emp_Proj_1(SSN, Pnumber, Hours)
Emp_Proj_2(Pnumber, Pname, Plocation)
Emp_Proj_3(SSN, Ename)

Third Normal Form (3NF)

3NF is based on the concept of Transitive Dependency.

A functional dependency X --> Y in a relation schema R is a transitive dependency if there is a set of attributes Z that is neither a candidate key nor a subset of any key of R, and both X --> Z and Z --> Y hold.

The dependency SSN --> Dmgr_ssn is transitive through Dnumber in EMP_DEPT, because both the dependencies SSN --> Dnumber and Dnumber --> Dmgr_ssn hold and Dnumber is neither a key itself nor a subset of the key of EMP_DEPT.

Definition :-
“According to Codd’s original definition, a relation schema R is in 3NF if it satisfies 2NF and no nonprime attribute of R is transitively dependent on the primary key.”

The given relation schema EMP_DEPT is in 2NF, since no partial dependencies on a key exist.

However, EMP_DEPT is not in 3NF because of the transitive dependency of Dmgr_ssn on SSN via Dnumber.

We can normalize EMP_DEPT by decomposing it into two 3NF relations, ED1 and ED2, which represent the independent entity facts about Employee and Department.

A JOIN operation on ED1 and ED2 will recover the original relation EMP_DEPT.
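
A simple 3NF check can be sketched as follows. Note the assumptions: it uses the commonly stated general test (for each given FD X --> A, either X is a superkey or A is a prime attribute) rather than Codd's transitive-dependency wording, it assumes a single candidate key, and it only examines the FDs it is given. All function names are illustrative.

```python
def closure(attrs, fds):
    """Attributes determined by `attrs` under fds (list of (lhs, rhs) sets)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs.issubset(result) and not rhs.issubset(result):
                result |= rhs
                changed = True
    return result

def violates_3nf(attributes, key, fds):
    """Return (lhs, attribute) pairs where lhs is no superkey and attribute is nonprime."""
    prime = set(key)  # assumption: `key` is the only candidate key
    bad = []
    for lhs, rhs in fds:
        is_superkey = closure(lhs, fds).issuperset(attributes)
        for a in sorted(rhs - lhs):
            if not is_superkey and a not in prime:
                bad.append((sorted(lhs), a))
    return bad

EMP_DEPT = {"SSN", "Ename", "Bdate", "Address", "Dnumber", "Dname", "Dmgr_ssn"}
F = [
    ({"SSN"}, {"Ename", "Bdate", "Address", "Dnumber"}),
    ({"Dnumber"}, {"Dname", "Dmgr_ssn"}),
]
print(violates_3nf(EMP_DEPT, {"SSN"}, F))

# After decomposition, each relation is checked against its own FD.
ED1 = {"SSN", "Ename", "Bdate", "Address", "Dnumber"}
ED2 = {"Dnumber", "Dname", "Dmgr_ssn"}
print(violates_3nf(ED1, {"SSN"}, F[:1]))
print(violates_3nf(ED2, {"Dnumber"}, F[1:]))
```

For EMP_DEPT the check reports the dependencies with Dnumber on the left-hand side, which are exactly the ones moved into ED2.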

EMP_DEPT(SSN, Ename, Bdate, Address, Dnumber, Dname, Dmgr_ssn)

3rd Normal Form :-

ED1(SSN, Ename, Bdate, Address, Dnumber)
ED2(Dnumber, Dname, Dmgr_ssn)

LOTS(Property_ID#, Country_name, Lot#, Area, Price, Tax_Rate)
FD1: Property_ID# --> {Country_name, Lot#, Area, Price, Tax_Rate}
FD2: {Country_name, Lot#} --> {Property_ID#, Area, Price, Tax_Rate}
FD3: Country_name --> Tax_Rate
FD4: Area --> Price

1st Normal Form :-

LOTS

2nd Normal Form :-

LOTS1(Property_ID#, Country_name, Lot#, Area, Price)   FD1, FD2, FD4
LOTS2(Country_name, Tax_Rate)                          FD3

3rd Normal Form :-

LOTS1A(Property_ID#, Country_name, Lot#, Area)         FD1, FD2
LOTS1B(Area, Price)                                    FD4
LOTS2(Country_name, Tax_Rate)                          FD3

Progressive normalization of LOTS:
1NF: LOTS
2NF: LOTS1, LOTS2
3NF: LOTS1A, LOTS1B, LOTS2

Boyce-Codd Normal Form (BCNF)

BCNF was proposed as a simpler form of 3NF, but it was found to be stricter than 3NF.

That is, every relation in BCNF is also in 3NF; however, a relation in 3NF is not necessarily in BCNF.

We can see the need for a stronger normal form than 3NF in the example of the LOTS relation schema with its four FDs, FD1 to FD4.

Suppose we have thousands of lots, but only from two countries, India and USA.

Suppose also that the lot sizes in India are 1.0, 1.1, 1.2, up to 2.0,

whereas the lot sizes in USA are 2.1, 2.2, up to 3.0.

In such a situation we have an additional functional dependency FD5 : Area --> Country_name.

That is, the Area of a lot determines its country.

This relation generates redundancy by repeating the same value “India” in Country_name for Area = 1.0, 1.1, 1.2, up to 2.0. For all of these Area values, the value of Country_name is the same, “India”.

The same situation arises for the country USA.

So there is again a need to decompose the relation schema LOTS1A into two different schemas, LOTS1AX and LOTS1AY.

Definition :-
A relation schema R is in BCNF if, whenever a nontrivial functional dependency X --> A holds in R, X is a superkey of R.

This means that, before checking whether a schema is in BCNF, the schema must first be in 3NF.

Then, if any FD holds in that schema in which a nonprime attribute determines the value of a prime attribute, that FD violates the second precondition for BCNF, because its left-hand side is not a superkey.

To convert the schema into BCNF, we therefore have to decompose the relation schema again, as follows:

3rd Normal Form :-

LOTS1A(Property_ID#, Country_name, Lot#, Area)   FD1, FD2, FD5
LOTS1B(Area, Price)                              FD4
LOTS2(Country_name, Tax_Rate)                    FD3

In FD5, Area is a nonprime attribute and Country_name is a prime attribute, so FD5 violates the second precondition for BCNF. We therefore have to decompose the relation schema LOTS1A once more into two relation schemas, LOTS1AX and LOTS1AY; LOTS1AX, LOTS1AY, LOTS1B, and LOTS2 are then the BCNF form of the LOTS relation schema.
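
The BCNF condition (every nontrivial FD has a superkey on its left-hand side) can be sketched as below. The function names are illustrative. FD1 and FD2 are written here under the assumption that Property_ID# and {Country_name, Lot#} are the keys of LOTS1A; FD5 is the dependency Area --> Country_name discussed above.

```python
def closure(attrs, fds):
    """Attributes determined by `attrs` under fds (list of (lhs, rhs) sets)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs.issubset(result) and not rhs.issubset(result):
                result |= rhs
                changed = True
    return result

def bcnf_violations(attributes, fds):
    """Return the nontrivial FDs whose left-hand side is not a superkey."""
    attrs = set(attributes)
    return [
        (sorted(lhs), sorted(rhs - lhs))
        for lhs, rhs in fds
        if rhs - lhs and not closure(lhs, fds).issuperset(attrs)
    ]

LOTS1A = {"Property_ID#", "Country_name", "Lot#", "Area"}
F = [
    ({"Property_ID#"}, {"Country_name", "Lot#", "Area"}),  # FD1 (assumed key)
    ({"Country_name", "Lot#"}, {"Property_ID#", "Area"}),  # FD2 (assumed key)
    ({"Area"}, {"Country_name"}),                          # FD5
]
print(bcnf_violations(LOTS1A, F))

# LOTS1AY(Area, Country_name) holds only FD5, whose left side is its key.
LOTS1AY = {"Area", "Country_name"}
print(bcnf_violations(LOTS1AY, [F[2]]))
```

Only FD5 is reported for LOTS1A, since Area alone does not determine Property_ID# or Lot#.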

Boyce-Codd Normal Form :-

LOTS1AX(Property_ID#, Area, Lot#)
LOTS1AY(Area, Country_name)        FD5
LOTS1B(Area, Price)                FD4
LOTS2(Country_name, Tax_Rate)      FD3

Progressive normalization of LOTS:
1NF: LOTS
2NF: LOTS1, LOTS2
3NF: LOTS1A, LOTS1B, LOTS2
BCNF: LOTS1AX, LOTS1AY, LOTS1B, LOTS2

Relational Decomposition

We start with a single universal relation schema R = {A1, A2, …, An} that includes all the attributes of the database.

We implicitly make the universal relation assumption, which states that every attribute name is unique.

The set of functional dependencies F that should hold on the attributes of R is specified by the database designer.

Using the FDs, we decompose the universal relation schema R into a set of relation schemas D = {R1, R2, R3, …, Rm} that will become the relational database schema; D is called a decomposition of R.

Properties of Relational Decomposition

1. Dependency Preservation Property :-

It would be useful if each FD X --> Y specified in F either appeared directly in one of the relation schemas Ri in D or could be inferred from the FDs that appear in some Ri.

Informally, this is the dependency preservation condition.

It is not necessary that the exact dependencies specified in F appear themselves in the individual relations of the decomposition D.

It is sufficient that the union of the FDs that hold on the individual relations in D be equivalent to F (the set of FDs that hold in the universal schema R).

Definition :-

Given a set of dependencies F on R, the projection of F on Ri, where Ri is a subset of R, is the set of FDs X --> Y in F+ such that the attributes in X ∪ Y are all contained in Ri. Hence, the projection of F on each relation schema Ri in the decomposition D is the set of FDs in F+, the closure of F, such that all their left-hand side and right-hand side attributes are in Ri.

• We say that a decomposition D = {R1, R2, R3, …, Rm} of R is dependency-preserving with respect to F if the union of the projections of F on each Ri in D is equivalent to F; that is, the closure of this union equals F+.

• If a decomposition is not dependency-preserving, some dependency is lost in the decomposition.

• To check that a lost dependency holds, we must take the JOIN of two or more relations in the decomposition.
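
Dependency preservation can be tested without computing the full projections of F+. The sketch below uses a well-known iterative test that is not described in the slides: starting from the left-hand side X, it repeatedly adds whatever each Ri can contribute, and the FD is preserved if the right-hand side is reached. Function names and the second (deliberately poor) decomposition are hypothetical.

```python
def closure(attrs, fds):
    """Attributes determined by `attrs` under fds (list of (lhs, rhs) sets)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs.issubset(result) and not rhs.issubset(result):
                result |= rhs
                changed = True
    return result

def preserves(fd, decomposition, fds):
    """True if fd follows from the projections of fds on the given schemas."""
    lhs, rhs = fd
    result = set(lhs)
    changed = True
    while changed:
        changed = False
        for ri in decomposition:
            # What Ri alone can derive from the attributes gathered so far.
            gained = closure(result & ri, fds) & ri
            if not gained.issubset(result):
                result |= gained
                changed = True
    return set(rhs).issubset(result)

def dependency_preserving(decomposition, fds):
    return all(preserves(fd, decomposition, fds) for fd in fds)

F = [
    ({"SSN"}, {"Ename", "Bdate", "Address", "Dnumber"}),
    ({"Dnumber"}, {"Dname", "Dmgr_ssn"}),
]
ED1 = {"SSN", "Ename", "Bdate", "Address", "Dnumber"}
ED2 = {"Dnumber", "Dname", "Dmgr_ssn"}
print(dependency_preserving([ED1, ED2], F))  # True

# Hypothetical alternative that separates Dnumber from Dname and Dmgr_ssn.
BAD1 = {"SSN", "Ename", "Bdate", "Address", "Dname", "Dmgr_ssn"}
BAD2 = {"SSN", "Dnumber"}
print(dependency_preserving([BAD1, BAD2], F))  # False
```

In the second decomposition the dependency Dnumber --> {Dname, Dmgr_ssn} is lost, because no single relation contains both sides and it cannot be derived from the projected FDs.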

2. Nonadditive Join / Lossless Join :-

Another property that a decomposition D should possess is the nonadditive join property, which ensures that no spurious tuples are generated when a NATURAL JOIN operation is applied to the relations in the decomposition.

Definition :-

Formally, a decomposition D = {R1, R2, R3, …, Rm} of R has the lossless join property with respect to the set of dependencies F on R if, for every relation state r of R that satisfies F, the following holds, where * is the NATURAL JOIN of all the relations in D and πRi(r) is the projection of r on Ri: *(πR1(r), πR2(r), …, πRm(r)) = r

The word loss in lossless refers to loss of information, not to loss of tuples.

If a decomposition does not have the lossless join property, we may get additional spurious tuples after the NATURAL JOIN operation is applied; these additional tuples represent erroneous or invalid information.
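
Spurious tuples are easy to see on a small state. The following sketch projects a relation onto two decompositions and joins the projections back; the relation state, its values, and the helper names are all hypothetical and serve only to illustrate the definition.

```python
def project(rows, attrs):
    """Projection with duplicate elimination; rows are dicts."""
    return {tuple((a, t[a]) for a in attrs) for t in rows}

def natural_join(r1, r2):
    """Natural join of two projected relations (sets of (attr, value) tuples)."""
    out = set()
    for t1 in r1:
        for t2 in r2:
            d1, d2 = dict(t1), dict(t2)
            # Tuples combine only if they agree on all common attributes.
            if all(d1[a] == d2[a] for a in d1.keys() & d2.keys()):
                out.add(tuple(sorted({**d1, **d2}.items())))
    return out

# Hypothetical state of a relation r(SSN, Ename, Dnumber).
r = [
    {"SSN": 1, "Ename": "Raj", "Dnumber": 5},
    {"SSN": 2, "Ename": "Ram", "Dnumber": 5},
    {"SSN": 3, "Ename": "Jay", "Dnumber": 4},
]
original = {tuple(sorted(t.items())) for t in r}

# Joining on the key SSN gives back exactly r.
good = natural_join(project(r, ["SSN", "Ename"]), project(r, ["SSN", "Dnumber"]))
print(good == original)

# Joining on Dnumber pairs every SSN with every Ename of the same department.
bad = natural_join(project(r, ["SSN", "Dnumber"]), project(r, ["Ename", "Dnumber"]))
print(len(bad - original), "spurious tuples")
```

The second join contains every original tuple plus extra ones, which is why the property is also called nonadditive: no tuples are lost, but the information about which SSN belongs to which Ename is.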

Page 55: Database Designing. What is Database Designing? Each relational schema consist of a number of attributes, and the relational database schema consist of

Multivalued Dependency

So far we have discussed only functional dependency, which is by far the most important type of dependency in relational database design theory.

However, in many cases relations have constraints that can not be satisfied as FD.

In this section, we discuss the concept of Multivalued Dependency (MVD) and defined Fourth Normal Form which is based on MVD.

Page 56: Database Designing. What is Database Designing? Each relational schema consist of a number of attributes, and the relational database schema consist of

If we have two or more Multivalued independent attributes in the same relation schema, we get into a problem of having to repeat every value of one of the attributes with every value of the other attribute to keep the relation state consistent and to maintain the independence among the attributes involved.

Example :- Consider the relation EMP. A tuple in this EMP relation represents the fact

that an employee whose name is Ename works on the project whose name is Pname and has a dependent whose name is Dname.

An employee may work on several projects and may have several dependents, and the employee’s projects and dependents are independent from one another.


To keep the relation state consistent, we must have a separate tuple to represent every combination of an employee’s dependent and an employee’s project.

This constraint is specified as a multivalued dependency on the EMP relation.

EMP
Ename          Pname   Dname
Smith, Smith   X, Y    John
Smith, Smith   X, Y    Anna

Ename   Pname   Dname
Smith   X       John
Smith   Y       Anna
Smith   X       Anna
Smith   Y       John

Two MVDs :-
1. Ename -->> Pname
2. Ename -->> Dname


Formal Definition for MVD :-

A multivalued dependency X -->> Y specified on relation schema R, where X and Y are both subsets of R, specifies the following constraint on any relation state r of R: if two tuples t1 and t2 exist in r such that t1[X] = t2[X], then two tuples t3 and t4 (not necessarily distinct) should also exist in r with the following properties, where we use Z to denote (R – (X ∪ Y)):
• t3[X] = t4[X] = t1[X] = t2[X]
• t3[Y] = t1[Y] and t4[Y] = t2[Y]
• t3[Z] = t2[Z] and t4[Z] = t1[Z]


• The formal definition specifies that, given a particular value of X, the set of values of Y associated with this value of X is completely determined by X alone and does not depend on the values of the remaining attributes Z of R.

• Hence, whenever two tuples exist that have distinct values of Y but the same value of X, these values of Y must be repeated in separate tuples with every distinct value of Z that occurs with that same value of X.

• An MVD X -->> Y in R is called a trivial MVD if :
a) Y is a subset of X, or
b) X ∪ Y = R.

• An MVD that satisfies neither (a) nor (b) is called a nontrivial MVD.
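The t1 to t4 definition can be turned into a direct check on a relation state. The following is a minimal sketch in Python using the Smith tuples of the EMP example; the function names satisfies_mvd and is_trivial_mvd are assumptions made for this illustration, and the check covers one state only, not every legal state of R.

```python
def satisfies_mvd(attrs, rows, X, Y):
    """Check X -->> Y on one relation state using the t1..t4 definition."""
    idx = {a: i for i, a in enumerate(attrs)}
    Z = [a for a in attrs if a not in X and a not in Y]   # Z = R - (X u Y)
    state = {tuple(row) for row in rows}

    def pick(t, names):
        return tuple(t[idx[a]] for a in names)

    for t1 in state:
        for t2 in state:
            if pick(t1, X) != pick(t2, X):
                continue
            # t3 must take X and Y from t1 and Z from t2.
            # t4 is covered when the loop visits the pair (t2, t1).
            if not any(
                pick(t3, X) == pick(t1, X)
                and pick(t3, Y) == pick(t1, Y)
                and pick(t3, Z) == pick(t2, Z)
                for t3 in state
            ):
                return False
    return True

def is_trivial_mvd(attrs, X, Y):
    """Trivial if Y is a subset of X, or X u Y covers all attributes of R."""
    return set(Y) <= set(X) or set(X) | set(Y) == set(attrs)

ATTRS = ("Ename", "Pname", "Dname")
emp = [
    ("Smith", "X", "John"),
    ("Smith", "Y", "Anna"),
    ("Smith", "X", "Anna"),
    ("Smith", "Y", "John"),
]

print(satisfies_mvd(ATTRS, emp, ["Ename"], ["Pname"]))       # all four combinations present
print(satisfies_mvd(ATTRS, emp[:3], ["Ename"], ["Pname"]))   # (Smith, Y, John) is missing
print(is_trivial_mvd(ATTRS, ["Ename"], ["Pname"]))
```

Dropping a single combination breaks the MVD, which is exactly why every project must be repeated with every dependent.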


Fourth Normal Form (4NF)

We now present the definition of 4NF, which is violated when a relation has undesirable MVDs, and hence can be used to identify and decompose such a relation schema.

Definition :-

A relation schema R is in 4NF with respect to a set of dependencies F (that includes FDs and MVDs) if, for every nontrivial MVD X -->> Y in F+, X is a superkey for R.


Emp :-
Ename   Pname   Dname
Smith   X       John
Smith   Y       Anna
Smith   X       Anna
Smith   Y       John
Brown   W       Jim
Brown   X       Jim
Brown   Y       Jim
Brown   Z       Jim
Brown   W       Joan
Brown   X       Joan
Brown   Y       Joan
Brown   Z       Joan
Brown   W       Bond
Brown   X       Bond
Brown   Y       Bond
Brown   Z       Bond

Emp :- 4th Normal Form :-

1. Emp_Project:-
Ename   Pname
Smith   X
Smith   Y
Brown   W
Brown   X
Brown   Y
Brown   Z

2. Emp_Dept:-
Ename   Dname
Smith   John
Smith   Anna
Brown   Jim
Brown   Joan
Brown   Bond


Emp (Ename, Pname, Dname), with X = Ename, Y = Pname and Z = Dname:
MVD1   X -->> Y
MVD2   X -->> Z
We can also write these as : X -->> Y | Z

Emp :- 4th Normal Form :-
1. Emp_Project (Ename, Pname)
2. Emp_Dept (Ename, Dname)
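The decomposition of Emp into Emp_Project and Emp_Dept can be replayed as a small Python sketch. It rebuilds the 16 tuples of the example from each employee's projects and dependents, projects them onto the two 4NF schemas, and joins them back on Ename. The variable names are assumptions made for this illustration.

```python
from itertools import product

# Data taken from the Emp example: projects and dependents are independent.
projects = {"Smith": ["X", "Y"], "Brown": ["W", "X", "Y", "Z"]}
dependents = {"Smith": ["John", "Anna"], "Brown": ["Jim", "Joan", "Bond"]}

# Emp must hold every (project, dependent) combination for each employee.
emp = {
    (e, p, d)
    for e in projects
    for p, d in product(projects[e], dependents[e])
}

# Projections onto the two 4NF schemas.
emp_project = {(e, p) for e, p, _ in emp}
emp_dept = {(e, d) for e, _, d in emp}

# NATURAL JOIN of the two projections on Ename.
rejoined = {
    (e, p, d)
    for e, p in emp_project
    for e2, d in emp_dept
    if e == e2
}

print(len(emp), len(emp_project), len(emp_dept))
print(rejoined == emp)  # the join neither loses tuples nor adds spurious ones
```

The two smaller relations store each fact once, while Emp has to repeat it for every combination.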


Join Dependency and Fifth Normal Form

There may be no FD in R that violates any normal form up to BCNF, and there may be no nontrivial MVD present in R that violates 4NF.

We then resort to another dependency called Join Dependency and, if it is present, carry out a multiway decomposition into 5NF.

Such a dependency is very difficult to detect in practice; therefore normalization into 5th Normal Form is very rarely done.


• Definition :-
A Join Dependency (JD), denoted by JD(R1, R2, R3, ..., Rn), specified on relation schema R, specifies a constraint on the states r of R. The constraint states that every legal state r of R should have a nonadditive join decomposition into R1, R2, ..., Rn; that is, for every such r we have, where πRi(r) is the projection of r onto Ri:
*(πR1(r), πR2(r), ..., πRn(r)) = r

• 5th Normal Form :-

A relation schema R is in 5NF (or Project-Join Normal Form, PJNF) with respect to a set F of FDs, MVDs and JDs if, for every nontrivial Join Dependency JD(R1, R2, ..., Rn) in F+ (that is, implied by F), every Ri is a superkey of R.
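A join dependency is easiest to see on a state that cannot be split losslessly into two relations but can be split into three. The sketch below uses a hypothetical SUPPLY relation; its attribute names, its values and the helper names are assumptions for this illustration, and the check is made on one state only.

```python
from functools import reduce
from itertools import combinations

def make_relation(attrs, rows):
    """A relation state: a set of tuples, each a frozenset of (attribute, value) pairs."""
    return {frozenset(zip(attrs, row)) for row in rows}

def project(relation, attrs):
    """Projection onto attrs; duplicates collapse because the result is a set."""
    return {frozenset((a, v) for a, v in t if a in attrs) for t in relation}

def natural_join(r1, r2):
    """NATURAL JOIN: merge every pair of tuples that agree on all shared attributes."""
    joined = set()
    for t1 in r1:
        d1 = dict(t1)
        for t2 in r2:
            d2 = dict(t2)
            if all(d1[a] == d2[a] for a in d1.keys() & d2.keys()):
                joined.add(frozenset({**d1, **d2}.items()))
    return joined

def state_satisfies_jd(relation, components):
    """True if joining the projections onto the components gives back the state."""
    return reduce(natural_join, (project(relation, c) for c in components)) == relation

# Hypothetical state of SUPPLY(Sname, PartName, ProjName).
supply = make_relation(
    ("Sname", "PartName", "ProjName"),
    [
        ("s1", "p1", "j2"),
        ("s1", "p2", "j1"),
        ("s2", "p1", "j1"),
        ("s1", "p1", "j1"),
    ],
)

R1 = {"Sname", "PartName"}
R2 = {"Sname", "ProjName"}
R3 = {"PartName", "ProjName"}

print(state_satisfies_jd(supply, [R1, R2, R3]))   # three-way decomposition
for pair in combinations([R1, R2, R3], 2):        # every two-way decomposition
    print(sorted(map(sorted, pair)), state_satisfies_jd(supply, list(pair)))
```

Each pair of projections produces one spurious tuple when joined, and the third projection is what filters it out again.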


Design Phases

For a small application it may be feasible for a database designer who understands the application requirements to decide directly on the relations to be created, their attributes, and the constraints on the relations.

However, such a direct design process is difficult for real-world applications. Often no single person understands the complete data needs of an application.

The database designer must interact with the users of the application to understand its needs, represent them in a high-level fashion that can be understood by the users, and finally translate the requirements into lower levels of the design.

The data is the requirement of the users, and the database structure fulfills these requirements.

Following are the sequential tasks to be performed while designing a database :-
1. Collect Database Requirements.
2. Decide the Number of Entities and their Attributes.
3. Make Relationships between Entities.
4. Conceptual Designing.
5. Logical Designing.
6. Physical Designing.

Page 67: Database Designing. What is Database Designing? Each relational schema consist of a number of attributes, and the relational database schema consist of

The main phases of database designing :

1. Requirements and Data :-

The initial phase of database designing is to characterize fully the data needs of the database users. The database designer needs to interact extensively with domain experts and users to carry out this task.

The outcome of this phase is a specification of user requirements.

Page 68: Database Designing. What is Database Designing? Each relational schema consist of a number of attributes, and the relational database schema consist of

2. Choose Data Model :-

Next, the designer chooses a data model and, by applying the concepts of the chosen data model, translates these requirements into a conceptual schema of the database. The schema developed at this conceptual-design phase provides a detailed overview of the enterprise. The entity-relationship model is typically used to represent the conceptual design.

Typically, the conceptual-design phase results in the creation of an E-R diagram that provides a graphic representation of the schema.

Page 69: Database Designing. What is Database Designing? Each relational schema consist of a number of attributes, and the relational database schema consist of

3. Functional Requirements :-

A fully developed conceptual schema also indicates the functional requirements of the enterprise.

In a specification of functional requirements, users describe the kinds of operations (transactions) that will be performed on the data, such as :

• Modifying and Updating Data.
• Searching and Retrieving Data.
• Deleting Data.

Page 70: Database Designing. What is Database Designing? Each relational schema consist of a number of attributes, and the relational database schema consist of

The process of moving from an abstract data model to the implementation of the database proceeds in two final design phases :

• Logical-design phase
• Physical-design phase

1. Logical-design Phase :-
The designer maps the high-level conceptual schema onto the implementation data model of the database system that will be used. The implementation data model is typically the relational data model, and this step consists of mapping the conceptual schema defined using the entity-relationship model into a relational schema.
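As a rough illustration of this mapping step, the Python sketch below applies two commonly used rules: each entity becomes a relation, a many-to-many relationship becomes a relation of its own keyed by the participating keys, and a many-to-one relationship becomes a foreign key on the "many" side. The entities, attributes and relationship names (Employee, Project, Department, Works_On, Works_For) are hypothetical, and real mappings cover more cases than these two.

```python
# Hypothetical conceptual schema: entities with their key, plus relationships.
entities = {
    "Employee": {"key": ["Eno"], "attrs": ["Eno", "Ename"]},
    "Project": {"key": ["Pno"], "attrs": ["Pno", "Pname"]},
    "Department": {"key": ["Dno"], "attrs": ["Dno", "DeptName"]},
}
relationships = [
    {"name": "Works_On", "kind": "M:N", "between": ("Employee", "Project"), "attrs": ["Hours"]},
    {"name": "Works_For", "kind": "N:1", "between": ("Employee", "Department"), "attrs": []},
]

def map_to_relations(entities, relationships):
    """Map a small E-R description to relation schemas (attributes, key, foreign keys)."""
    relations = {
        name: {"attrs": list(e["attrs"]), "key": list(e["key"]), "fks": []}
        for name, e in entities.items()
    }
    for rel in relationships:
        left, right = rel["between"]
        if rel["kind"] == "M:N":
            # New relation whose key combines the keys of both participants.
            key = entities[left]["key"] + entities[right]["key"]
            relations[rel["name"]] = {
                "attrs": key + rel["attrs"],
                "key": key,
                "fks": [(entities[left]["key"], left), (entities[right]["key"], right)],
            }
        elif rel["kind"] == "N:1":
            # Foreign key placed on the "many" side (the left entity).
            fk = entities[right]["key"]
            relations[left]["attrs"] += fk + rel["attrs"]
            relations[left]["fks"].append((fk, right))
    return relations

relations = map_to_relations(entities, relationships)
for name, schema in relations.items():
    print(name, schema["attrs"], "key:", schema["key"])
```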

Page 71: Database Designing. What is Database Designing? Each relational schema consist of a number of attributes, and the relational database schema consist of

2. Physical-design Phase :-

Finally, the database designer uses the resulting system-specific database schema in the subsequent physical-design phase, in which the physical features of the database are specified. These features include the form of file organization and the internal storage structures.

• The physical schema of a database can be changed relatively easily after an application has been built. However, changes to the logical schema are usually harder to carry out, since they may affect a number of queries and updates scattered across the application code.
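The difference can be sketched with Python's built-in sqlite3 module. Adding an index is a physical-design change: the query text used by the application stays the same and returns the same rows. The table, index name and sample rows below are assumptions for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp_project (ename TEXT, pname TEXT)")
conn.executemany(
    "INSERT INTO emp_project VALUES (?, ?)",
    [("Smith", "X"), ("Smith", "Y"), ("Brown", "W")],
)

# Query as it might appear in application code.
QUERY = "SELECT pname FROM emp_project WHERE ename = ? ORDER BY pname"
before = conn.execute(QUERY, ("Smith",)).fetchall()

# Physical-design change: add an index on the search column.
conn.execute("CREATE INDEX idx_emp_project_ename ON emp_project (ename)")
after = conn.execute(QUERY, ("Smith",)).fetchall()

print(before == after)  # same query text, same result rows
```

Renaming or splitting the emp_project table, by contrast, would be a logical-schema change, and every statement that mentions it would have to be revisited.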