Database Designing. What is Database Designing? Each relational schema consist of a number of...

Preview:

Citation preview

Chapter :- 3Database Designing

What is Database Designing?

Each relational schema consist of a number of attributes, and the relational database schema consist of a number of relational schemas.

So far we have assumed that the attributes are grouped to formed a relation schema by using a common sense of the database designer or by mapping the database schema design for a conceptual model.

However we still need some formal measure of why one grouping of attributes into relation schema may be better than another.

There are two levels at which we can discuss the goodness of relation schema.1. Logical / Conceptual Level2. Implementation / Storage Level

Logical Level :-

How users interprets the relation schemas and the meaning of their attributes.

Heaving a good relation schema at this level enable users to understand clearly the meaning of the data in the relations, and hence to formulate their queries correctly.

Implementation Level :-

How the tuples in a base relation are stored and updated.

This level applies only to schemas of the base relations which will be physically stored as file whereas at logical level we are interested in schemas of both base relation or view (Virtual Relation).

As with many design problems database design may be performed using two approaches :1. Bottom-Up Design Methodology2. Top-Down Design Methodology

Bottom-Up Design Methodology :-

This methodology considers the basic relationship among individual attributes as a starting point and uses those to construct relation schema.

This approach is not very popular in practice because it suffer from the problem of heaving to collect a large number of binary relationships among attributes as the string point.

Top-Down Design Methodology :-

Also called a Design By Analysis. Starts with a numbers of groupings of

attributes into relations that exist together naturally, for example, on an Invoice, a form or a report.

The relations are then analyzed individually and collectively.

Theory describe applicable to both the top-down and bottom-up design approaches, but is more practical when used with the top-down approaches.

Functional Dependency

A functional dependency is a constraint between two sets of attributes from the database.

Suppose our relational database schema has N attributes like A1, A2, A3… AN;

Let us think of whole database as being describe by a single “Universal Relational Schema” R = {A1, A2, A3… AN}

We don’t imply that we will actually store the database as a single table.

That’s why we have to divide our database as a collection of tables.

For retrieving a value of one attribute base on another attribute value we have to follow some constraints (Normalization).

Count…

Definition :-

“A Functional Dependency denote by X Y between two sets of attributes X and Y that are subset of R specifies the constraints on the possible tuples that can form a relation state r of R. The constraints is that for any two tuples t1 and t2 in r that have t1[X] = t2[X] they must also have ……… t1[Y] = t2[Y].”

This means that value of Y component of a tuple in r depends on or are determine by the value of X component.

Alternatively the value of X component uniquely determine the value of Y component.

Count…

We also say that there is a functional dependency from X to Y, or that Y is functional dependent on X.

The abbreviation of Functional Dependency is FD or f.d.

The set of attribute X is called the Left-hand Side of FD.

The set of attribute Y is called the Right-hand Side of FD.

Thus X functionally determines Y in relational schema if and only if whenever two tuples of r(R) agree on their X values, they must agree on their Y value.

Count…

If a constraint on R state that there can not be more then one tuple with a given X-value in any relation instance r(R) that is X is a candidate key of R. This implies X Y for any subset of attributes Y of R.

A functional dependency is a property of the semantics/meaning of attributes.

The database designer will use their understanding of the semantics of the attributes of R that is how they are related to each other to specify the FD that should hold on all relation states r of R.

Count…

Whenever the semantic of two sets of attributes in R indicate that a FD should hold, we specify the dependency as a constraint.

The relation extension r(R) that satisfy the FD constraints are called Legal Relation State or Legal extension.

Example :-Consider the relation schema EMP_PROJ

A. SSN EnameB. Pnumber {Pname, Plocation}C. {SSN, Pnumber} Hours

These functional dependencies specify that… (A) The value of an Employee’s Social Security

Number (SSN) uniquely determine the Employee Name (Ename).

(B) The value of a Project Number (Pnumber) uniquely determine the Project Name (Pname) and Project Location (Plocation).

(C) A combination of SSN and Pnumber values uniquely determine the number of Hours.

SSN Pnumber

Hours Ename Pname Plocation

Inference Rules for FD :-

We denote by F the set of functional dependencies that are specified on relation schema R.

Typically schema designer specify the FDs that are semantically obvious; usually, however, numerous other FDs hold in all legal relation instances among sets of attributes that can be derived from and satisfy the dependency in F.

In real life it is impossible to specify all possible FDs for a given situation.

For Example :-

If each department has one manager, so that Dept_No uniquely determine Mgr_SSN (Dept_No Mgr_SSN) and a manager has a Unique phone number called Mgr_Phone (Mgr_SSN Mgr_Phone) then these two dependencies together imply that Dept_No Mgr_Phone.

This is an inference FD.Therefore it useful to define a concept called closure

that include all possible dependencies that can be inferred from the given set F.

Definition :-“Formally the set of all dependencies that include F

as well as all dependencies that can be inferred from F is called CLOSURE of F; it is denoted by F+.”

For Example :-

Suppose that we specify the following set F of obvious FDs on the relation schema EMP_PROJ.

F = { SSN{Ename, Bdate, Address, Dnumber}, Dnumber{Dname, Dmgr_SSN} }

Some of the additional dependencies that we can infer from F are the following :

SSN {Dname, Dmgr_SSN}SSN SSNDnumber Dname

An FD X Y is inferred from a set of dependencies F specified on R if X Y holds in every legal relation state r of R; that is whenever r satisfied all the dependencies in F, X Y also hold in r.

The closure F+ of F is the set of all FDs that can be inferred from F.

To determine a semantic way to infer dependencies, we must discover a set of Inference Rules that can be used to infer new dependencies from a given set of dependencies.

We use the notation F |= X Y to denote that FD X Y is inferred from the set of FDs F.

We use an abbreviated notation when discussing FDs. We concatenate attribute variables and drop the commas for convenience.

Hence, the FD {X,Y} Z is abbreviated to XY Z, and the FD {X,Y,Z} {U,V} is abbreviated to XYZ UV

Following six rules IR1 through IR2 are well-known inference rules for FDs.

(RAT DUP)

1. IR1 Reflexive Rule ------------- If X Y, then X Y2. IR2 Augmentation Rule ------- {X Y} |= XZ YZ3. IR3 Transitive Rule ------------- {X Y, Y Z} |= X Z4. IR4 Decomposition ------------ {X YZ} |= X Y5. IR5 Union or Additive Rule --- {X Y, X Z} |= X

YZ6. IR6 Pseudotransitive Rule ---- {X Y, WY Z} |=

WX Z

IR1 Reflexive Rule If X Y, then X Y

Proof of IR1 :-

Suppose that X Y and that two tuples t1 and t2 exist in some relation instance r or R such that t1[X] = t2[X]. Then t1[Y] = t2[Y] because X Y; hence , X Y must hold in r.

IR2 Augmentation Rule {X Y} |= XZ YZ

Proof of IR2 :-Assume that X Y hold in a relation instance r of R but that XY YZ does not hold. Then there must be exist two tuples t1 and t2 in r such that (1) t1[X] = t2[X], (2) t1[Y] = t2[Y], (3) t1[XZ] = t2[XZ], (4) t1[YZ] ≠ t2[YZ].

This is not possible because (5) t1[Z] = t2[Z] from (1) and (3),(6) t1[YZ] = t2[YZ] from (2) and (5) contradicting (4)

IR3 Transitive Rule {X Y, Y Z} |= X Z

Proof of IR3 :-

Assume that (1) X Y and (2) Y Z both hold in relation r.

Then for any two tuples t1 and t2 in r such that t1[X]=t1[X] we must have (3) t1[Y]=t2[Y] from assumption (1)Hence, we must also have (4) t1[Z]=t2[Z], from (3) and assumption (2)Hence, X Z must hold in r.

IR4 Decomposition Rule {X YZ} |= X Y

Proof of IR4 :-

1. X YZ (given)2. YZ Y (using IR! And knowing that YZ Y).3. X Y Using IR3 on 1 and 2)

IR5 Union Rule {X Y, X Z} |= X YZ

Proof of IR5 :-

1. X Y (given)2. X Z (given)3. X XY (using IR2 on 1 by assuming XX = X)4. XY YZ (using IR2 on 2 by augmenting with

Y)5. X YZ (using IR3 on 3 and 4)

IR6 Pseudotransitive Rule {X Y, WY Z} |= WX Z

Proof of IR6 :-

1. X Y (given)2. WY Z (given)3. WX WY (using IR2 on 1 by assuming with

W)4. XY YZ (using IR2 on 2 by augmenting with

Y)5. WX Z (using IR3 on 3 and 2)

Armstrong’s Inference Rules

It has been shown by Armstrong (1947) that inference rules IR1 Through IR3 are sound and complete.

By Sound, we mean that given a set of FDs F specified on a relation schema R , any dependency that we can infer from F by using IR1 through IR3 holds in every relation state r of R that satisfied the dependency in F.

By Complete, we mean that using IR1 through IR3 repeatedly to infer dependencies until no more dependency can be inferred result in the complete set of all possible dependencies that can be inferred from F.

Hence Infer Rules IR1 through IR3 are known as Armstrong’s Rule.

Normalization of Relation

Initially Dr. E. F. Codd proposed three normal forms, which he called First Normal Form (1NF), Second Normal Form (2NF), Third Normal Form (3NF). A stronger definition of 3NF is called Boyce-Codd Normal Form(BCNF).

These normal forms are based on a single analytical tool: the Functional Dependency among the attributes of relation.

Later the 4NF and 5NF were proposed based on concept of multivalued dependency and join dependency respectively.

Normalization of Data

It can be consider the process of analyzing the relation schemas based on their FDs and primary key to achieve the desirable properties of :Minimizing Redundancy andMinimizing the Insertion, Deletion and

Update Unsatisfied relation schemas that do not

meet certain conditions – the Normal Form Tests – are decomposed into smaller relation schemas that meet the test and hence posses the desirable properties.

Thus, the normalization producer provides database designers with the following:The formal framework for analyzing

relation schemas base on their keys and on the functional dependencies among their attributes.

A series of normal forms tests that can be carried out on individual relation schemas so that the relational database can be normalized to any desired degree.

Definition : The NF of the relation refers highest NF condition that is meets, and hence indicate the degree to which it has been normalized.

Normal Forms when consider in isolation from, other factors, do not guarantee a good database design.

It is generally not sufficient to check separately that each relation schema in the database is, say in BCNF or in 3NF.

Rather, the purpose of normalization through decomposition must also confirm the existence of additional properties that the relational schemas, taken together, should process.

This would include two properties :1. Lossless Join or Nonadditive Join

Which guarantees that the spurious tuple generation problem.

2. Dependency PreservationWhich ensure that each functional dependency is represented in some individual relation resulting after decomposition.

The Nonadditive join property is extremely critical and must be achieve at any cost, whereas the dependency preservation, although desirable, is sometimes sacrificed.

First Normal Form (1NF)

1NF is now considered to be part of formal definition of a relation in the basic relational model.

It was disallow Multivalued attributes, composite attributes and their combination.

It state the domain of an attributes must include only atomic value (Simple, Indivisible values) and that the value of any attribute in a tuple must be a singled value from the domain of that attribute.

The only attribute values permitted by 1NF are single atomic or indivisible value

Consider the Department Relation Schema..

Dname Dnumber Dmgr_ssn Dlocation

Department

Domain of Dlocation can have an atomic values, but some tuples can have a set of this values.

In this case, Dlocation is not functionally dependent on primary key Dnumber.Dname Dnumber Dmgr_ssn Dlocation

Research 5 333445555 {Surat, Baroda, Rajkot }

Production

4 987654321 { Ahmadabad }

Sales 1 888665555 { Bhavnagar }

The Department relation is not in 1NF.

There are three main techniques to achieve the 1NF.

1. Remove the attribute Dlocation that violate 1NF and place it in a separate relation along with the primary key Dnumber.

2. Expand the key so that there will be a separate tuple in the original Department relation for each location of Department as shown bellow…Dname Dnumber Dmgr_ssn Dlocation

Research 5 333445555 Surat

Research 5 333445555 Baroda

Research 5 333445555 Rajkot

Production 4 987654321 Ahmadabad

Sales 1 888665555 Bhavnagar

3. If a maximum number of values is known for the attribute-for example, if it is known that at most three locations can exist for a department-replace the Dlocation attribute by three atomic attributes:Dlocation1, Dlocation2, Dlocation3.

• This solution has a disadvantages of introducing NULL values if most departments have few then three locations.

• Of the three solution the first is generally considered best because it does not suffer from redundancy .

• 1NF also disallow multivalued attributes that are themselves composite. These are called Nested Relation.

Emp_Proj

SSN Ename Pnumber Hours

Projs

SSN Ename Pnumber Hours

1 Raj 12

3315

2 Ram 3 10

3 Jay 23

1012

4 Vijay 3010

1616

Using second technique we can define the same relation schema in two separate relation schemas as bellow

Emp_Proj_1

Emp_Proj_1

SSN Ename

SSN Pnumber Hours

Second Normal Form (2NF)

2NF is based on the concept of Full Functional Dependency.

A FD X Y is full functional dependency is removal of any attribute A from X means that the dependency does not holds any more; that is for any attribute A € X ,(X-{A}) does not functionally determine Y.

A functional dependency XY is Partial Dependency if some attribute A € X ,(X-{A}) Y if some attribute A € X can be removed from X and the dependency still hold.

Definition :-

“A relation schema R is in 2NF is every nonprime attribute A in R fully functionally dependent on the prime key of R.”

The test for 2NF involves testing for functional dependencies whose left-hand side attributes are the part of the primary key.

If the primary key contains a single attribute, the test need not be applied at all.

Emp_Proj

SSN Pnumber Hours Ename Pname Plocation

FD1

FD3

FD2

2nd Normal Form :-

Emp_Proj_1

Emp_Proj_2

Emp_Proj_3

SSN Pnumber Hours

Pnumber Pname Plocation

SSN Ename

Third Normal Form (3NF)

3NF is based on the concept of Transitive Dependency.

A functional Dependency X-->Y in a relation schema R is a transitive dependency if there is a set of attributes Z that is neither a candidate key nor a subset of any key of R, and both X-->Z and Z-->Y holds.

The dependency SSN->Dmgr_ssn is transitive through Dnumber in EMP_DEPT.

Because both the dependencies SSN->Dnumber and Dnumber->Dmgr_ssn hold and Dnumber is neither a key itself nor a subset of key of EMP_DEPT.

Definition :-“According to Codd’s original definition, a relation schema R is in 3NF if it satisfied 2NF and no nonprime attribute of R is the transitively dependent on the prime key.”

The given relational schema EMP_DEPT is in 2NF, since no partial dependencies on a key exist.

However EMP_DEPT is not in 3NF because of the transitive dependency of Dmgr_ssn on SSN via Dnumber.

We can normalize EMP_DEPT by decomposing it in to two 3NF relations ED1 And ED2 represent the independent entity facts about Employee and Department .

A JOIN operation on ED1 and ED2 will recover the original relation EMP_DEPT.

EMP_DEPT

SSN Ename Bdate Address Dnumber Dname Dmgr_ssn

3rd Normal Form :-

SSN Ename Bdate Address DnumberED1

Dnumber Dname Dmgr_ssnED2

LOTS

Property_ID# Country_name Lot# Area Price Tax_Rate

Property_ID# Country_name

Lot# Area Price

FD1

FD3

FD2

FD4

1st Normal Form

2nd Normal Form

LOTS1 LOTS2

Country_name Tax_Rate

FD1

FD2

FD4

FD3

Property_ID# Country_name

Lot# Area

FD1 FD3

FD2

FD4

3rd Normal Form

LOTS1A

LOTS2

Country_name Tax_Rate

LOTS1B

Area Price

LOTS

LOTS1 LOTS2

LOTS1A

LOTS1B LOTS2

1NF

2NF

3NF

Boyce-Codd Normal Form (BCNF)

BCNF was proposed as a simpler form of 3NF, but it was found to be stricter than 3NF.

That is every relation in BCNF is also in 3NF; However, a relation in 3NF is not necessary in BCNF.

We can see the need for stronger NF than 3NF in example of LOTS relation schema with its four FDs FD1 to FD4.

Suppose we have a thousands of lots for only two countries India and USA.

Suppose also that the lots size in India are of 1.0,1.0,1.2 up to 2.0.

Where as the lots size in USA are of 2.1,2.2 up to 3.0.

In such a situation we have an additional functional dependency FD5 : Area --> Country.

The Area of LOTS can determine the Country.This relation can generate the redundancy by

repeating the same value “India” in Country for Area = 1.0,1.1,1.2 up to the 2.0. For these ten values the value for Country name would be same as “India”.

And same situation in Country USA.So that there is again need for decomposition

of the relation schema LOTS1A in to two different schemas LOTS1AX and LOTS1AY.

Definition :-A relation schema R is in BCNF if whenever a nontrivial functional dependency X --> A (3NF) holds in R, then X is super key of R.

Means if we want to check that the schema is in BCNF or not for that the schema must be in 3NF.

Then after if any FD in that schema is holds that any nonprime attribute can determine the value of any prime attribute then that will violate the second pre condition for BCNF.

So that to convert that schema in to BCNF again we have to decomposed that relation schema. Like ..:

Property_ID# Country_name

Lot# Area

FD1 FD3

FD2

FD4

3rd Normal Form

LOTS1A

LOTS2

Country_name Tax_Rate

LOTS1B

Area Price

FD5

For this FD5 Area is nonprime attribute and Country_name is our prime attribute so that FD5 violate the 2nd precondition for BCNF.So that once again we have to decompose the relation schema LOTS1A in to two different relation schema LOTS1AX and LOTS1AY then after LOTS1AX, LOTS1AY, LOTS1B, LOTS2 will the BCNF conversion of LOTS relation schema.

Property_ID# Country_name

Lot#

FD1

FD3

FD2

FD4

Boyce Codd Normal Form

LOTS1AX

LOTS2

Country_name Tax_Rate

LOTS1B

Area Price

LOTS1AY

Area Country_name

FD5

LOTS

LOTS1 LOTS2

LOTS1A

LOTS1B LOTS2

1NF

2NF

3NF

LOTS1AX LOTS1AY BCNFLOTS1B LOTS2

Relational Decomposition

A single Universal Relation Schema R={A1,A2,…An} that include all the attributes of the database.

We implicitly make the universal relation assumption, which state that every attribute name is unique.

The set of functional dependency F that should be hold on the attributes of R is specified by the database designer

Using the FDs, decompose the universal relation schema R in to a set of relation schemas D={R1,R2,R3,…..Rm} that will become the relation schema; D is called the Decomposition of R .

Properties of Relational Decomposition

1. Dependency Preservation Property :-

It would be useful if each FD X-->Y specified in F either appeared directly in one of the relation schema Ri in D or could be inferred from the FDs that appear in some Ri.

Informally this is the Dependency Preservation Condition.

It is not necessary that the exact dependency specified in F appear themselves in individual relations of the decomposition D.

It is sufficient that the union of the FDs that hold on the individual relations in D be equivalent to F (set of FDs which are hold in Universal Schema R).

Definition :-

Given a set of dependencies F on R, the Projection of F in Ri where Ri is a subset of R, is the set of FDs X --> Y in F+ such that attributes in X u Y are all contain in Ri. Hence the projection of F on each relation schema Ri in the decomposition D is the set of FDs F+ the closure of F, such that all their Left-hand side and Right-hand side attributes are in Ri.

• We say that a decomposition D={R1,R2,R3…,Rm} of R is Dependency-Preserving with respect to F if the union of projection of F on each Ri in D is equivalent to F+.

• If a decomposition is not dependency-preserving, some dependency is Lost in the decomposition.

• To check a Lost dependency holds, we must take the JOIN of to or more relations.

2. Nonadditive Join / Lossless Join :-

Another property that decomposition D should possess is the nonadditive join property which ensures that no spurious tuples are generated hen a NATURAL JOIN operation is applied to the relations in the decomposition.

Definition :-

Formally a decomposition D = {R1,R2,R3…..,Rm} of R has the Lossless Join Property with respect to the set of dependencies F on R if, for every relation state r of R that satisfied F, the following holds, where * is the NATURAL JOIN of all the relation in D: *(R1(r),R2(r)…..Rm(r)) = r

The word loss in Lossless refers to loss of information not to loss of tuples.

If decomposition does not have the lossless join property, we may get additional spurious tuples after the NATURAL JOIN operation is applied: these additional tuples represented the erroneous or invalid information.

Multivalued Dependency

So far we have discussed only functional dependency, which is by far the most important type of dependency in relational database design theory.

However, in many cases relations have constraints that can not be satisfied as FD.

In this section, we discuss the concept of Multivalued Dependency (MVD) and defined Fourth Normal Form which is based on MVD.

If we have two or more Multivalued independent attributes in the same relation schema, we get into a problem of having to repeat every value of one of the attributes with every value of the other attribute to keep the relation state consistent and to maintain the independence among the attributes involved.

Example :- Consider the relation EMP. A tuple in this EMP relation represents the fact

that an employee whose name is Ename works on the project whose name is Pname and has a dependent whose name is Dname.

An employee may work on several projects and may have several dependents, and the employee’s projects and dependents are independent from one another.

To keep the relation state consistent, we must have a separate tuple to represent every combination of an employee’s dependent and an employee’s project.

This constraint is specified as an Multivalued dependency on EMP relation.

EMP Ename Pname Dname

Smith, Smith X, Y John

Smith, Smith X, Y Anna

Ename Pname Dname

Smith X John

Smith Y Anna

Smith X Anna

Smith Y John

Two MVDs :-1. Ename -->> Pname2. Ename -->> Dname

Formal Definition for MVD :-

A Multivalued dependency X -->> Y specified on relational schema R, where X and Y are both subset of R, specifies the following constraints on any relation state r of R: If two tuples t1 and t2 exist in r such that t1[x] = t2[x], then two tuples t3 and t4 should also exist in r with the following properties, where we use Z to denote ( R – ( X u Y ) ).• t3[x] = t4[x] = t1[x] = t2[x]• t3[y] = t1[y] and t4[y] = t2[y]• t3[z] = t2[z] and t4[z] = t1[z]

• The formal definition specifies that given a particular value of X, the set of values of Y determined by this value of X is completely determined by X alone and does not depend on the values of the remaining attribute Z of R.

• Hence, whenever two tuples exist that have distinct values of Y but the same value of X, these values of Y must be repeated in separate tuples with every distinct value of Z that occurs with that same value of X.

• An MVD X -->> Y in R is called Trivial MVD if :a) Y is a subset of X or,b) X u Y = R.

• An MVD that satisfied neither (a) nor (b) is called Nontrivial MVD.

Fourth Normal Form (4NF)

We now present the definition for 4NF, which is violated when a relation has undesirable MVDs, and hence can be used to identify and decomposed such a relation schema.

Definition :-

A relation schema R is in 4NF with respect to a set of dependences F (that include FDs and MVDs) if for every nontrivial MVD X -->> Y in F+, X is a super key for R.

Ename Pname Dname

Smith X John

Smith Y Anna

Smith X Anna

Smith Y John

Brown W Jim

Brown X Jim

Brown Y Jim

Brown Z Jim

Brown W Joan

Brown X Joan

Brown Y Joan

Brown Z Joan

Brown W Bond

Brown X Bond

Brown Y Bond

Brown Z Bond

Emp :- 4th Normal Form :-

Ename Pname

Smith X

Smith Y

Brown W

Brown X

Brown Y

Brown Z

Ename Dname

Smith John

Smith Anna

Brown Jim

Brown Joan

Brown Bond

1. Emp_Project:-

2. Emp_Dept:-

Ename Pname Dname

Ename Pname

Ename Dname

Emp :- 4th Normal Form :-

1. Emp_Project:-

2. Emp_Dept:-

X -->> Y

X -->> Z

MVD1

MVD2

We can also write like :X -->> Y | Z

MVD1 X -->> Y

MVD2 X -->> Z

Join Dependency and Fifth Normal Form

There may be no FD in R that violate any NF up to BCNF, and there may be no nontrivial MVD present in R that violate 4NF.

We then resort to another dependency called Join Dependency and, if it is present, carry out a multiway decomposition into 5NF.

Such a dependency is very difficult to detect in practice; therefore normalization in 5th Normal Form is very rarely done in practice.

• Definition :-A Join Dependency (JD), Denote by JD(R1,R2,R3…..,Rn), specified on relation schema R, specified a constraint on the states r of R. The constraint state that every legal state r of R should have a nonadditive join decomposition into R1,R2,….Rn; that is, for every such r we have:*(R1(r),R2(r)…..Rn(r)) = r

• 5th Normal Form :-

A relation schema R is in 5NF (or Project Join Normal Form) (PJNF) with respect to a set F of FDs, MVDs, JDs if, for every nontrivial Join Dependency JD(R1,R2….,Rn) in F+ (that is, implied by F), every Ri is a super key.

Design Phases

For small Application it may be feasible fro a database designer who understand the application requirements to decide directly on the relations to be created, their attributes and the constraints on the relations.

However such a direct design process is difficult for real-world applications. Often no one can understand the complete data need of an application.

The database designer must interact with the users of the application to understand the needs of the application, represent them in a high-level fashion that can be understand by the users.

And finally translate the requirement into a lower levels of the design.

The data is the requirement of the users and database structure fulfills these requirements.Following are the sequential tasks to be

performed while designing database :-1. Collect Database Requirements.2. Decide the Number of Entities and their

Attributes.3. Make Relationships between Entities.4. Conceptual Designing.5. Logical Designing.6. Physical Designing.

The main phases of database designing :

1. Requirements and Data :-

The initial phase of the database designing is to characterized fully the data needs of the database users. The database designer needs to interact extensively with domain experts and users to carry out this task.

The outcome of this phase is specification of user requirements.

2. Choose Data Model :-

Next, the designer chooses the data model and by applying the concept of the data model, translate these requirements into a conceptual schema of the database. Schema developed at this conceptual-design phase provides a detail overview of the enterprise. The entity relationship model is typically used to represent the conceptual design.Typically, the conceptual-design phase result in the creation of the E-R diagram that provide the graphic representation of the schema.

3. Functional Requirements :-

A fully developed conceptual schema also indicates the functional requirements of the enterprise.

In a specification of Functional Requirements user describe the kinds of operations (Transactions) that will be performed on the data like :

• Modifying and Updating Data.• Searching and Retrieving Data.• Deleting Data.

Process of moving from an abstract data model to the implementation of the database processes in two final design phases :

• Logical-design phase• Physical-design phase

1. Logical-design Phase :-The designer maps the high-level conceptual schema into the implementation data model of the database system that will be used.The implementation data model is typically the relational data model, and this step consists of mapping the conceptual schema defined using the entity-relationship model into a relational schem.

2. Physical-design Phase :-

Finally the database designer use the resulting system specific database schema in the subsequent physical-design phase, in which the physical features of the database are specified. These features include the form of file organization and the internal storage structure.

• The physical schema of a database can be change easily after an application has been built. However changes to the logical schema are usually harder to carry out, since they may affect the number of queries and update scattered across application code.

Recommended