
Introduction of Rdbms


Introduction to DBMS

Database management has evolved from a specialized computer application into a central component of the modern computing environment. As a result, knowledge of database systems has become an essential part of computer science, because every organization requires accurate and reliable data for effective decision-making. In addition, database management exerts centralized control over the database, prevents fraudulent or unauthorized users from accessing the data, and ensures the privacy of the data.

Data: Data is a collection of raw facts or figures that can be stored and processed by a computer. In other words, data is the material on which computer programs work. Data can be numbers, letters, words, special symbols and so on, but in this raw form it is unorganized.

Information: Information is processed data that helps in making decisions. It is produced by arranging data into a suitable and meaningful form; in short, the processed version of data is called information.
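As a rough illustration of the difference, the sketch below treats a list of raw marks as data and derives an average per student as information. The names `raw_marks` and `summarize` and the marks themselves are invented for this example.

```python
# Raw data: unorganized facts (hypothetical student marks).
raw_marks = [("John", 72), ("Ramesh", 58), ("John", 88), ("Ramesh", 64)]


def summarize(marks):
    """Process raw (name, mark) pairs into an average mark per student."""
    grouped = {}
    for name, mark in marks:
        grouped.setdefault(name, []).append(mark)
    return {name: sum(values) / len(values) for name, values in grouped.items()}


# Information: the processed, meaningful form of the data.
information = summarize(raw_marks)
print(information)
```

The raw list says little by itself; the averages are what a teacher would actually use to make a decision.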

File processing system: A file-based system is the traditional approach to storing data. In its manual form, people keep the records themselves, without the support of a computer, and the data is scattered across various papers and notebooks. To find a piece of data, we have to search through the records from start to end. Computer file-based systems were later developed in response to industry's need for more efficient data access and storage. They store permanent records in various files, and this arrangement is called a file system.

File: A file is a collection of records, each made up of letters, numbers and special characters.

[Figure: Data -> Process -> Information]

HIERARCHY OF DATABASE

Bit (0 or 1)

Byte (8 bits, e.g. 10101011)

Field (an attribute such as Name, Age, Address)

Record (one row in a table)

File (a table, or collection of all records)

Database (a collection of files or tables)

E.g. a database named Student may contain two tables, Personal and Academic. Each column heading in a table is a field (attribute) name, and each row is a record.
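The hierarchy can be sketched with ordinary Python values. This is only an illustration: the variable names are invented, the Academic table is left empty, and the sample values are illustrative.

```python
# A record groups fields; a file (table) groups records; a database groups files.
record_1 = {"Name": "John", "Father Name": "Albert", "Age": 24}
record_2 = {"Name": "Ramesh", "Father Name": "Suresh", "Age": 18}

personal_file = [record_1, record_2]                                # file / table
student_database = {"Personal": personal_file, "Academic": []}      # database

# At the bottom of the hierarchy, each character is stored as one byte (8 bits).
first_byte = "John".encode("ascii")[0]   # the byte for the letter "J"
bits = format(first_byte, "08b")         # that byte written as 8 bits
print(bits)
```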

Disadvantages of file processing system:

Duplication of data: In a manual system, repetition of data creates duplication. Storing the same data several times is wasteful because it requires additional storage space and raises costs, and it also creates the problem of data inconsistency.

Integrity problem: Integrity means ensuring that the data in the database is accurate. Several types of constraints must be imposed so that only the right type of data is entered, but in a file system it is not possible to apply such constraints.

Security problem: Data stored in a file system must be kept safe from outsiders. Any unauthorized user can destroy the data, which leads to data inconsistency. In a file processing system, security constraints are not easy to enforce.

Data dependence: Data is stored separately in each file. This leads to data dependence: when data is changed in one file, the corresponding data in the other files also has to be changed.

Difficult to share data: Data is stored in different files that use different formats, which are not compatible with each other, so sharing is a problem. Multiple users cannot share a single file.

Name      Father Name   Age
John      Albert        24
Ramesh    Suresh        18

Difficult to get a quick response: In a file system, the user cannot quickly get the required output. To find the data, the user has to search the whole file.

Concurrency problem: The concurrency problem arises when more than one user needs to access the same file simultaneously. When several users read, update and modify the files at the same time, the data may become inaccurate.

Unable to represent data models of the real world: The file system approach cannot be used to design a database that shows the basic entities, relationships and models.
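The duplication and data dependence problems can be shown with a small sketch. Two independent "files" each keep their own copy of a phone number; the file names and values are hypothetical.

```python
# Two separate files, each with its own copy of the same phone number.
accounts_file = {"John": {"phone": "55501"}}
library_file = {"John": {"phone": "55501"}}

# The number changes, but only one file is updated...
accounts_file["John"]["phone"] = "555012"

# ...so the two copies now disagree: the data has become inconsistent.
inconsistent = accounts_file["John"]["phone"] != library_file["John"]["phone"]
print(inconsistent)
```

Nothing in a plain file system notices this disagreement; every copy has to be found and corrected by hand.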

What is a database?

A database is a collection of related information stored so that it is available to many users for different purposes. The content of a database is obtained by combining data from all the different sources in an organization, so that the data is available to all users and redundant data can be eliminated or at least minimized. A computer database gives us an electronic filing system with a large number of ways of cross-referencing, which allows the user many different ways to reorganize and retrieve data. A database can handle business inventory, accounting and filing, and use the information in its files to prepare summaries.

Characteristics of data in the database:

Data in the database is shared by different users and applications.

Data in the database is permanent.

Data should be secured and protected from unauthorized users.

Whenever more than one data element in a database represents related data, the data values should be consistent.

Data should not be duplicated in the database.

Data should be independent in the database.

Data should be well organized.

Data should be flexible to change.

Data should be correct.

Data should be available when needed.

Significance of database:

Databases play an important role in the real world. Records were once kept in manual systems, in which a large number of files was used to store the data. The following example shows the kind of problem that a database solves.

Suppose the phone numbers of most people change from five digits to six digits. It would be difficult to correct the telephone directory, because every number in it would have to be changed. We could instead create a new directory, but that makes the job even more complex.

It is clear from this example that the traditional approach works accurately only when a small amount of data has to be processed. The manual approach does not work well when a larger amount of data has to be processed. The database approach is used to overcome the problems of the manual system.

Why a database?

A small shop's records can be handled manually, but if you have a large database and multiple users, you have to maintain a computerized database. The advantages of a database system over traditional, paper-based methods of record keeping make this more apparent. Here are some of them.

1. Compactness: No need for possibly voluminous paper files.

2. Speed: The machine can retrieve and change data faster than a human being.

3. Less drudgery: Much of the sheer tedium of maintaining files by hand is eliminated. Mechanical tasks are always better done by machines.

4. Accuracy: Accurate, up-to-date information is available on demand at any time.

Database Management System

A database can store newspaper articles, magazines, books and comics. There is already a well-defined market for specific information aimed at highly selected groups of users on almost all subjects. MEDLINE is a well-known database service providing medical information for doctors, and WESTLAW is a computer-based information service catering to the requirements of lawyers. The key to making all this possible is the manner in which the information in the database is managed. The management of data in a database system is done by means of a general-purpose software package called a Database Management System (DBMS). The DBMS is the major software component of a database system. Some commercially available DBMSs are Ingres, ORACLE and Sybase. A DBMS, therefore, is software that can be used to set up and monitor a database and to manage the updating and retrieval of the data stored in it.

Advantages of using DBMS

The DBMS has a number of advantages compared with the traditional file system. The database administrator (DBA) must keep these benefits and capabilities in mind when designing databases and when coordinating and monitoring the DBMS. The advantages of using a DBMS are as follows:

Controlling redundancy: Centralized control of data by the DBA avoids unnecessary duplication of data and reduces the storage space needed. In traditional file processing, every user maintains their own files and creates their own data independently, so much of the data is stored twice or more.

Minimizing data inconsistency: Data inconsistency exists when different copies of the same data in different places do not agree. Controlling data redundancy therefore improves data consistency.


Restricting unauthorized access: Every DBMS provides security and authorization. A database system requires you to log in before processing the data, and each user can process the data only according to the rights given by the DBA.

Improved data access: The DBMS answers users' queries quickly.

Providing compatibility with programming languages: A DBMS can be used from programming languages such as C, C++ and Java.

Data administration: Data administration means managing and organizing data from a central point by the DBA. It allows each user to access the data easily according to their requirements.

Enforcing integrity constraints: A DBMS enforces integrity constraints on the data. For example, before inserting salary information for an employee, the DBMS can check that the department budget is not exceeded.

Providing backup and recovery: Every DBMS provides facilities for recovering from hardware or software failures, such as disk crashes, power failures and software errors.
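As a minimal sketch of integrity enforcement, the example below uses SQLite through Python's built-in `sqlite3` module. The `employee` table and the rule that a salary cannot be negative are assumptions made for illustration; a budget check like the one mentioned above would need a more elaborate rule.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database

# The CHECK clause is an integrity constraint stored with the table definition.
conn.execute(
    "CREATE TABLE employee ("
    " name TEXT NOT NULL,"
    " salary INTEGER CHECK (salary >= 0)"
    ")"
)
conn.execute("INSERT INTO employee VALUES ('John', 3000)")

# The DBMS itself refuses a row that breaks the rule.
try:
    conn.execute("INSERT INTO employee VALUES ('Ramesh', -50)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM employee").fetchone()[0]
print(rejected, row_count)
```

The rule is enforced in one central place, so no application program can bypass it by accident.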

Disadvantages of using DBMS

Although a DBMS has many advantages, it also has some disadvantages. These are:

Cost of hardware and software: A high-speed processor and a large memory are required to run the DBMS software. The cost of this hardware and software is very high.

Cost of data conversion: When a computer file-based system is replaced with a database system, the data stored in data files must be converted into database files. Converting the data is difficult and costly, and database and system designers have to be hired to do it, so a lot of money has to be paid for data conversion.

Cost of staff training: Most DBMSs are complex systems, so users need training to use them. The organization has to pay a large amount for training staff to run the DBMS.

Appointing technical staff: Trained technical persons such as database administrators, application programmers and data entry operators are required to handle the DBMS. Companies have to pay high salaries to these persons.

Database damage: In most organizations, all data is integrated into a single database. If the database is damaged by a power failure or corrupted on the storage media, valuable data may be lost forever.

Privacy and security are reduced: When information is stored centrally and made available to remote users, the possibility of data being destroyed is often greater than in a traditional system.

Database administrator

A database administrator (DBA) is a person, or group of persons, responsible for managing a database and for implementing the policies of the organization. The DBA is responsible for authorizing access, monitoring database use, providing satisfactory response times, and backup and recovery after system failure. The DBA has full authority over the database.

Duties of the DBA

The main duties of the DBA are as under:

Installation of new software: It is a basic job of the DBA to install new versions of DBMS software, application software, and other software related to DBMS administration.

Data dictionary management: Management and control of the data dictionary is an important role of the DBA. The DBA creates the data definitions, the data validation rules, the documentation of the data dictionary, and so on.

Schema definition: The DBA creates the original database schema. This involves the data storage and definition language.

Storage structure and access method definition: The DBA writes a set of definitions that is processed by the data storage and definition language compiler.

Define integrity rules: The DBA is responsible for defining integrity rules for the database. These rules help to keep the data accurate and consistent.

Configuration of hardware and software with the system administrator: The DBA works with the system administrator to take decisions about hardware and software settings.

Security administration: The DBA has to monitor and check DBMS security. This involves adding and removing users and granting permissions to them.

Data analysis: The DBA has to analyze the data stored in the database and make recommendations relating to the performance and efficiency of the data storage.

Database design: The DBA has to design the database at each level, and creates the policy for the internal, conceptual and external levels.

Components of the DBMS

A database system is composed of four components:

Data

Hardware

Software

Users

These components coordinate with each other to form an effective database system.

Data: Data is a very important component of the database system. Most organizations generate, store and process large amounts of data. The data acts as a bridge between the machine and the user, and the user can access the data directly from the machine. Data may be of different types:

User data: The user data is stored in rows and columns in the form of tables.

Metadata: A description of the structure of the database is known as metadata, meaning “data about data”. It is stored in the data dictionary.

Hardware: The hardware consists of the secondary storage devices on which data is stored, such as disks (hard disk, zip disk, floppy disk), optical disks (CD-ROM) and magnetic tapes, together with the input/output devices (mouse, keyboard, printers), processors and main memory, which are used for storing and retrieving the data in a fast and efficient manner.

[Figure: users work through application programs, which reach the database (data and programs) through the DBMS.]

Software: The software acts as a bridge between the users and the database. It interacts with the users, the application programs, the database and the file system, and helps to insert, delete, update and retrieve data.

Users: Users are the persons who need information from the database. Depending on their job and requirements, they are given access to all or part of the database. The various types of users who access the database are:

Database administrator (DBA)

Database designer

End users

Application programmer

Database administrator: The database administrator (DBA) is the person responsible for managing the database and implementing the policies of the organization. The DBA is responsible for authorizing access, monitoring database use, providing satisfactory response times, and backup and recovery after system failure. The DBA has full authority over the database.

Database designer: The database designer identifies the data to be stored and chooses the database structure.

End users: End users are the users who interact with the database and use the information stored in it.


Application programmer: Application programmers are professional users who are responsible for writing application programs. An application program can be written in a general-purpose language such as C or COBOL, with the database commands embedded in the program.

Structure of DBMS:

A DBMS (Database Management System) acts as an interface between the user and the database. The user requests the DBMS to perform various operations (insert, delete, update and retrieve) on the database. The components of the DBMS perform the requested operations on the database and provide the necessary data to the users. The various components of the DBMS are shown below:

[Figure: structure of a DBMS, showing the users, query processor, DDL/DML compiler, query optimizer, data dictionary, data manager, file manager and database.]

DDL Compiler:

The Data Definition Language (DDL) compiler processes the schema definitions specified in the DDL. It uses information such as the name of each file, its data items and its storage details. It handles CREATE and ALTER statements.

DML Compiler and Query Optimizer:

DML commands such as insert, delete and update are processed by the DML compiler, which compiles the query and converts it into object code. The query optimizer determines the best way to execute the query, and the resulting code is then sent to the data manager.
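Many systems let the user see what the optimizer has decided. As a hedged sketch, SQLite offers the `EXPLAIN QUERY PLAN` statement, used here through Python's `sqlite3` module. The `account` table and the index name are invented for the example, and the exact wording of the plan differs between SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (number TEXT, city TEXT, balance INTEGER)")
conn.execute("CREATE INDEX idx_account_city ON account (city)")

# EXPLAIN QUERY PLAN asks SQLite how it intends to run the query,
# without actually running it.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT number FROM account WHERE city = 'Del'"
).fetchall()

# The last column of each row is a readable description of one step.
plan_steps = [row[-1] for row in plan]
print(plan_steps)
```

On typical builds the step reports a search using the index rather than a scan of the whole table, which is the optimizer's choice of execution strategy.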

Data Manager:

The Data Manager, also known as the Database Control System, is the central software component of the DBMS.

The main functions of the Data Manager are:

o It converts the operations in users' queries into operations on the stored files.

o It controls access to information.

o It controls the buffers in main memory.

o It enforces constraints.

o It controls the backup and recovery operations.

Data Dictionary:

The data dictionary stores the description of the data in the database. It contains:

Information about the data: names of the tables, names of the attributes of each table, lengths of the attributes, and number of rows in each table.

Relationships and links between different tables.

Constraints on data, i.e. the range of values permitted.

Detailed information on the physical database design, such as storage structures, access paths, and file and record sizes.

Access permissions or rights on tables.
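Most relational systems expose part of their data dictionary through queries. The sketch below uses SQLite, whose catalog is the `sqlite_master` table and whose `PRAGMA table_info` statement lists the columns of a table. The `personal` table is an assumption made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE personal (name TEXT, father_name TEXT, age INTEGER)")

# Table names are read from SQLite's catalog table, sqlite_master.
tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'")]

# PRAGMA table_info returns one row per column:
# index 1 is the column name and index 2 is the declared type.
columns = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(personal)")]
print(tables, columns)
```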

Importance of the data dictionary:

The data dictionary is necessary in a database for the following reasons:

It improves the control of the DBA over the information system.

It helps in documenting the database design process.

It helps in searching the views on the database.

It promotes data independence.

Data Files:

The data files contain the data portion of the database.

Compiled DML:

The DML compiler converts a high-level query into low-level commands known as compiled DML.

End User:

End users are the users who interact with the database and use the information stored in it.

Architecture of the DBMS:


The three-level architecture is also known as the three-schema architecture. Its purpose is to separate the user applications from the physical database. The three-level architecture is a convenient tool with which users can visualize data according to their level.

Schema: The view at each level is described by a schema. A schema is an outline or plan that describes the records and relationships existing in the view. It is used to define the structure of the database.

The Three-Schema Architecture:

[Figure: end users see external views; an external/conceptual mapping links these to the conceptual schema; a conceptual/internal mapping links the conceptual schema to the internal schema and the stored database.]

Objectives:

1. It allows independent customized user views. Each user should be able to access the same data through a different view, and these views should be independent of each other.

2. It hides the physical storage details from the users.

3. The database administrator should be able to change the database storage structure without affecting the users' views.

Database Schemas:

There are three different types of schema, corresponding to the three levels in the ANSI-SPARC architecture.

The external schemas describe the different external views of the data.

The conceptual schema describes all the data items and the relationships between them.

The internal schema defines the storage structure of the data on the physical device.

Levels

The three-level architecture is divided into three view levels:

External view level

Conceptual view level

Internal view level

External view:

The external view is at the highest level of database abstraction and is closest to the users. It describes how data is viewed by individual users. The external views are defined by means of external schemas.

Conceptual view:

At this level of database abstraction, all the entities and the relationships among them are included. One conceptual view represents the entire database. The conceptual view is defined by the conceptual schema and is also known as the logical view. It is mainly the concern of the DBA, who decides what information is to be kept in the database.

Internal view:

The internal view is also referred to as the physical view level. It describes the physical storage structure of the database, that is, how the data is stored. It works with the operating system and the DBMS to store data on, and retrieve data from, the storage devices.
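The three levels can be sketched with SQLite through Python's `sqlite3` module. In this illustration a table stands for the conceptual level, a view stands for one external view, and the internal level is whatever file format SQLite chooses. The `employee` table, the `staff_directory` view and the rows are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Conceptual level: the full logical table.
conn.execute("CREATE TABLE employee (name TEXT, department TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [("John", "Sales", 3000), ("Ramesh", "Accounts", 2800)],
)

# External level: a view for users who should not see salaries.
conn.execute("CREATE VIEW staff_directory AS SELECT name, department FROM employee")

directory = conn.execute("SELECT * FROM staff_directory ORDER BY name").fetchall()
print(directory)
```

Users of `staff_directory` see only their own view of the data and never deal with how or where the rows are stored.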

Mapping

The process of converting a request (from the external level) and its result between levels is called mapping. The mapping defines the correspondence between the three levels. The mapping description is also stored in the data dictionary. The DBMS is responsible for mapping between the three types of schema.

[Figure: the external level (interfaces, functional APIs, views) is linked by the external/conceptual mapping to the conceptual level (objects, relations), which is linked by the conceptual/internal mapping to the internal level (data structures, files and records).]

There are two types of mapping:

External-conceptual mapping

Conceptual-internal mapping

External-conceptual mapping

An external-conceptual mapping defines the correspondence between a particular external view and the conceptual view. It tells the DBMS which objects at the conceptual level correspond to the objects requested in a particular external view, and in what form. If changes are made to either an external view or the conceptual view, the mapping must be changed accordingly.

Conceptual-internal mapping

The conceptual-internal mapping defines the correspondence between the conceptual view and the internal view. It describes how conceptual records are stored on, and retrieved from, the storage device; in other words, it tells the DBMS how the conceptual records are physically represented. If the structure of the stored database is changed, the mapping must be changed accordingly. It is the responsibility of the DBA to manage such changes.

Data Independence

The ability to modify a schema (structure) at one level without affecting the schema definition at a higher level is called data independence. Data independence keeps data separated from all the programs that make use of it.

There are two kinds of data independence:

Physical data independence

Logical data independence

Physical data independence:

The ability to change the physical schema without changing the logical schema is called physical data independence. For example, a change to the internal schema, such as a change of storage devices or storage location, should be possible without having to change the conceptual or external schemas. Alterations to the internal schema might include:

Using new storage devices.

Using different data structures.

Switching from one access method to another.

Using a different file organization or storage structure.
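A small sketch of physical data independence, using SQLite through Python's `sqlite3` module: adding an index changes the access method, yet the query text and its result stay the same. The `account` table, the index name and the rows are chosen only for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (number TEXT, city TEXT, balance INTEGER)")
conn.executemany(
    "INSERT INTO account VALUES (?, ?, ?)",
    [("S-5353", "Del", 5000), ("F-123", "Mum", 4600), ("F-783", "Del", 10000)],
)

query = "SELECT number FROM account WHERE city = 'Del' ORDER BY number"
before = conn.execute(query).fetchall()

# Physical change: a new access path is added. The query is not rewritten.
conn.execute("CREATE INDEX idx_account_city ON account (city)")
after = conn.execute(query).fetchall()

print(before == after)
```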

Logical data independence:

The ability to change the logical (conceptual) schema without changing the external schemas or application programs is called logical data independence. For example, the addition or removal of attributes in the conceptual schema should be possible without having to change existing external schemas. Logical data independence thus gives us the freedom to change the conceptual schema without worrying about the external schemas.
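Logical data independence can be sketched in the same way with SQLite through Python's `sqlite3` module. A new attribute is added to the table, and an existing external view goes on working unchanged. The `customer` table, the `customer_names` view and the `street` column are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (name TEXT, city TEXT)")
conn.execute("INSERT INTO customer VALUES ('Rajan', 'Del')")

# An external view that existing users and programs rely on.
conn.execute("CREATE VIEW customer_names AS SELECT name FROM customer")
before = conn.execute("SELECT * FROM customer_names").fetchall()

# Conceptual change: a new attribute is added to the table.
conn.execute("ALTER TABLE customer ADD COLUMN street TEXT")
after = conn.execute("SELECT * FROM customer_names").fetchall()

print(before == after)
```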

Database Language

A DBMS is a software package that carries out different tasks, such as giving users the facilities to access and modify the information in the database.

There are three types of database language:

1. Data Definition Language (DDL)

2. Data Manipulation Language (DML)

3. Fourth-Generation Language (4GL)

Data Definition Language:

A Data Definition Language (DDL) is a computer language for defining the data structures used in the database. DDL statements are used to create, modify and remove database objects such as tables, indexes and users. Common DDL statements are CREATE, ALTER and DROP.

The main features of DDL are as under:

It defines the data dictionary (directory) that contains metadata (data about data).

It defines integrity constraints on various tables and fields.

It defines the storage structure and access methods used by the database system, which are specified by a set of definitions in a special type of DDL called a data storage and definition language.
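The three common DDL statements can be tried out with SQLite through Python's `sqlite3` module. This is only a sketch: the `student` table and its columns are invented, and other DBMSs support more forms of ALTER than SQLite does.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# CREATE defines a new table.
conn.execute("CREATE TABLE student (name TEXT, age INTEGER)")

# ALTER changes the definition of an existing table.
conn.execute("ALTER TABLE student ADD COLUMN address TEXT")
columns_after_alter = [row[1] for row in conn.execute("PRAGMA table_info(student)")]

# DROP removes the table and its definition.
conn.execute("DROP TABLE student")
remaining_tables = conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'").fetchall()

print(columns_after_alter, remaining_tables)
```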

Data Manipulation Language (DML)

Data Manipulation Language (DML) is used to retrieve, insert, delete and update data in a database. Currently the most popular data manipulation language is SQL, which is used to retrieve and manipulate data in a relational database. A DML is a language that enables users to access and manipulate data.

A DML performs the following actions on a table:

Retrieval of information from the database

Insertion of new information into the database

Deletion of information from the database

Modification of information in the database
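The four DML actions correspond to the SQL statements SELECT, INSERT, DELETE and UPDATE. A minimal sketch with SQLite through Python's `sqlite3` module follows; the `personal` table and its rows are illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE personal (name TEXT, father_name TEXT, age INTEGER)")

# Insertion of new information.
conn.execute("INSERT INTO personal VALUES ('John', 'Albert', 24)")
conn.execute("INSERT INTO personal VALUES ('Ramesh', 'Suresh', 18)")

# Modification of existing information.
conn.execute("UPDATE personal SET age = 19 WHERE name = 'Ramesh'")

# Deletion of information.
conn.execute("DELETE FROM personal WHERE name = 'John'")

# Retrieval of information.
rows = conn.execute("SELECT name, father_name, age FROM personal").fetchall()
print(rows)
```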

Fourth-generation language

A fourth-generation language consists of English-like words and phrases. When implemented on microcomputers, some of these languages include graphic devices such as icons and on-screen push buttons for use during programming. Many fourth-generation languages use Structured Query Language (SQL) as the basis for operations. PROLOG (an artificial intelligence language) is sometimes given as an example, although it is more often classed as a fifth-generation language.

Types of DBMS

A data model is a collection of concepts that can be used to describe the structure of a database and that provides the necessary means to achieve this abstraction. The structure of a database means the data types, relationships and constraints that should hold on the data. The various data models that have been proposed fall into three groups: object-based logical models, record-based logical models and physical models.

Object-Based Logical Models: They are used in describing data at the logical and view levels.

They are characterized by the fact that they provide fairly flexible structuring capabilities and allow

data constraints to be specified explicitly. There are many different models and more are likely to

come. Several of the more widely known ones are:

The E-R model

The object-oriented model

The semantic data model

The functional data model

ENTITY RELATIONSHIP MODEL

The entity-relationship (E-R) data model is based on a perception of the real world as a collection of basic objects, called entities, and relationships among these objects. An E-R diagram is built from the following components:

i) Rectangles, which represent entity sets.

ii) Ellipses, which represent attributes.

iii) Diamonds, which represent relationships among entity sets.

iv) Lines, which link attributes to entity sets and entity sets to relationships.

E.g. suppose we have two entities, customer and account; these two entities can be modeled as follows:-

[Figure: E-R diagram. The entity sets Customer (attributes: customer name, customer city) and Account (attributes: account number, balance) are linked by the relationship Deposit.]

The entity-relationship model uses three features to describe data. These are:-

i) Entities:- These specify distinct real-world items in an application.

ii) Relationships:- These connect entities and represent meaningful dependencies between them.

iii) Attributes:- These specify properties of entities and relationships.

We illustrate these terms with an example. A vendor supplying items to a company is an entity, which may be described by the attributes:

(vendor code, vendor name, address)

An item may be described by the attributes:

(item code, item name)
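One way to picture entities, attributes and relationships is with small Python classes. This is an informal sketch, not a standard notation: the class names follow the customer, account and deposit example, and the sample values are illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Customer:          # entity
    name: str            # attribute
    city: str            # attribute


@dataclass(frozen=True)
class Account:           # entity
    number: str          # attribute
    balance: int         # attribute


@dataclass(frozen=True)
class Deposit:           # relationship connecting the two entities
    customer: Customer
    account: Account


rajan = Customer(name="Rajan", city="Del")
savings = Account(number="S-5353", balance=5000)
deposit = Deposit(customer=rajan, account=savings)
print(deposit)
```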

HIERARCHICAL MODEL

The hierarchical model was developed during the 1960s. The traditional file system rapidly gave way to hierarchical databases. In this type of database, data is organized in the form of parent-child relationships forming a tree structure. This model supports one-to-many relationships.

Let us look at the following facts about the accounts of a bank:-

i) The bank has three types of account: current, saving and fixed deposit.

ii) Each of these types of account may have many customers' accounts.

iii) Each of these accounts may have many transactions.
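The bank example can be sketched as a tree built from nested dictionaries: account type, then account, then transactions. The account numbers, amounts and function names below are invented for illustration.

```python
# Root -> account types -> customer accounts -> transactions.
bank = {
    "Current": {},
    "Saving": {"S-5353": [500, -200]},
    "Fixed deposit": {"F-123": [4600]},
}


def transactions(tree, account_type, account_number):
    """Follow the single path from the root down to one account's transactions."""
    return tree[account_type][account_number]


def parents_of(tree, account_number):
    """List every account type that holds the given account."""
    return [kind for kind, accounts in tree.items() if account_number in accounts]


print(transactions(bank, "Saving", "S-5353"))
print(parents_of(bank, "F-123"))
```

Each account sits under exactly one account type, which is the one-to-many (parent-child) structure the model supports.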

ADVANTAGES OF HIERARCHICAL DATABASE

The advantages offered by hierarchical databases over the traditional file system of handling data are listed below:-

i) Simplicity:- In most practical situations, data naturally has hierarchical relationships, so it is easier to view data arranged in this manner. This makes this type of database well suited to such data, and the design process is consequently simple.

ii) Security:- These database systems can enforce varying degrees of security, unlike flat-file systems.

iii) Database integrity:- Because of the inherent parent-child structure, database integrity is highly promoted in these systems.

iv) Efficiency:- For 1:M relationships, these types of databases are very efficient.

DISADVANTAGES OF HIERARCHICAL DATABASE

Although the hierarchical database was indeed a technical breakthrough, it suffered from the disadvantages mentioned below. Due to these limitations, it rapidly lost its popularity among users:-

i) Complexity of implementation:- The actual implementation of a hierarchical database depends on the physical storage of data. This makes the implementation complicated.

ii) Difficulty in management:- Moving a data segment from one location to another requires all the programs that access it to be modified, making database management a complex affair.

iii) Structural dependence:- The database has rigidly defined relationships, and hence a change in any part of the structure of the database requires a change in the programs accessing it. This makes maintenance very difficult.

iv) Complexity of programming:- Programming a hierarchical database is relatively complex because the programmers must know the physical path to the data items.

v) Poor portability:- The databases are not easily portable, mainly because there is little or no standard for these types of databases.


NETWORK MODEL

The network database model was developed during the early 1970s with a view to eliminating the problems faced by hierarchical databases.

The structure of a network database closely resembles that of a hierarchical database. The major difference between the two is that in a hierarchical database each child record can have only one parent record, whereas in a network database a child record can have more than one parent record. This facilitates the implementation of N:M relationships, which occur more often than 1:M relationships.

An example of network database structure is given below:-

Name        Street      City   A/C no   Balance
Rajan       Vihar       Del    S-5353   5000
Mohan       P.Nagar     Mum    F-123    4600
Shyamkant   T.Nagar     Kol    F-409    60
Harpal      Salt lake   Che    E-99     4050
Tejbir      A Pura      Del    F-783    10000

ADVANTAGES OF NETWORK DATABASE

Network databases offer many advantages in addition to the ones offered by hierarchical databases. Some of them are mentioned below:-


i) Simplicity:- Since most data relationships are naturally of N:M type, this database structure is simple to design.

ii) Better relationship handling:- N:M relationships can accommodate most of the complex relationships that exist in real data scenarios.

iii) Flexibility in data access:- Data items can be navigated in more than one way, providing much-desired flexibility of data access.

iv) Standards:- Universal standards have been developed and enforced for these types of databases.

DISADVANTAGES OF NETWORK DATABASE

Although network databases were a significant improvement over hierarchical databases, they suffered from the disadvantages mentioned below:-

i) System complexity:- As expected, the implementation of these types of databases is not simple.

ii) Structural dependence:- Since access depends on the navigational paths that exist in the database at any time, the programs are not independent of the database structure and need to be modified whenever the database structure is modified.

Due to these limitations, network databases rapidly lost their popularity among users and were subsequently replaced by relational databases.

RELATIONAL MODEL


Dr E.F. Codd first introduced the relational database model in 1970. This data model was developed using rigorous theoretical considerations based on relational calculus. However, at that time its implementation was not considered practical.

The relational model allows data to be represented in a simple table-like row-column format. A table, in the parlance of relational databases, is referred to as a relation. Each data field is considered as a column and each record is considered as a row.

In a relational database, data are arranged in files called database files. A database file may contain one or more tables. A table is composed of a number of rows and columns. A row is called a record and a column is called a field.

A record contains information regarding a single person or thing of interest. A field stores data regarding a particular aspect of that person or thing. The organization is illustrated below:

Student relation (the column headings are the attributes)

Roll no   Name    Age   Address                  Phone
01        Ram     23    22, Westend, N.Delhi     6456262
02        Mohan   24    12, Jal Vihar, N.Delhi   434221
03        Gita    45    25, Sec-34, Hisar        324545
04        Sita    35    34, Sec-32, Chandigarh   534421
05        Madhu   24    222, Sec-3, Karnal       122334

A bank account database may be represented by two tables as shown below:-


Customer name       Customer street   Customer city   Account number
Anurag Srivastava   Green Park        Delhi           42073
Bhavana             Park Street       Delhi           2345
Nitin               Salt lake         Calcutta        12345
Shweta              Remington         Nd              34567

Account number   Balance
42073            45000
2345             23455
45678            45678
34567            76543
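The two tables above are linked only by the shared account number, so a customer's balance is found by joining them on that column. The sketch below loads the rows exactly as printed into an in-memory SQLite database using Python's standard sqlite3 module. The table and column names (customer, account, account_number and so on) are assumptions chosen for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer (customer_name TEXT, customer_street TEXT, "
    "customer_city TEXT, account_number INTEGER)"
)
conn.execute(
    "CREATE TABLE account (account_number INTEGER PRIMARY KEY, balance INTEGER)"
)
conn.executemany(
    "INSERT INTO customer VALUES (?, ?, ?, ?)",
    [
        ("Anurag Srivastava", "Green Park", "Delhi", 42073),
        ("Bhavana", "Park Street", "Delhi", 2345),
        ("Nitin", "Salt lake", "Calcutta", 12345),
        ("Shweta", "Remington", "Nd", 34567),
    ],
)
conn.executemany(
    "INSERT INTO account VALUES (?, ?)",
    [(42073, 45000), (2345, 23455), (45678, 45678), (34567, 76543)],
)

# Inner join: only customers whose account number appears in both tables.
rows = conn.execute(
    "SELECT c.customer_name, a.balance "
    "FROM customer AS c "
    "JOIN account AS a ON a.account_number = c.account_number "
    "ORDER BY c.customer_name"
).fetchall()
print(rows)
```

As printed, account 12345 has no row in the balance table and account 45678 has no customer, so the inner join returns three of the four customers. No navigational path is declared anywhere; the link is expressed purely through matching values.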

ADVANTAGES OF RELATIONAL DATABASE

The relational database model adds significant advantages to those offered by the previous two types of databases. Some of these advantages are:-

i) Structural independence:- Since there is no need for a navigational arrangement for data access, this database offers complete structural independence. A change in the structure of the database does not affect the DBMS's data access mechanism in any way. The programs, therefore, do not need any modification even when the structure is modified.

ii) Simplicity of design and use:- A relational database possesses both data independence and structural independence. Therefore design and implementation are easier than with the other types of databases.


iii) Advanced query capability:- Querying a relational database is simple, efficient and powerful. It supports SQL, because both SQL and the relational model have their foundation in relational algebra and relational calculus.

DISADVANTAGES OF RELATIONAL DATABASE

The advantages of the relational database are, however, not without cost. Some of the most noted disadvantages are:-

i) Increased overhead:- An RDBMS shoulders much of the user's responsibility and therefore performs more complex activities, thereby increasing system overhead.

OBJECT ORIENTED DATABASE

The databases discussed so far were found inadequate for handling objects. With the popularity of object-oriented technologies, the need for object-oriented database systems was soon felt.

When database capabilities are integrated with object programming language capabilities, the result is an object database management system (ODBMS). An ODBMS makes database objects appear as programming language objects in one or more object programming languages. An ODBMS extends the language with transparently persistent data, concurrency control, data recovery, associative queries, and other capabilities.

Object-oriented databases are designed to work well with object-oriented programming languages such as Java, C# and C++.


An ODBMS can offer a favourable combination of low development cost and good performance when using objects, because it stores objects on disk and integrates transparently with an object-oriented programming language.

ADVANTAGES OF OBJECT ORIENTED DATABASE

There are two primary benefits in using an ODBMS. Both benefits reflect one basic idea: when an ODBMS is used, the way you use your data is the way you store it.

The first benefit can be found in development. When you use an ODBMS, you write less code than if you were writing to an RDBMS. The reason for the smaller amount of code is simple: when you are using Java or C++, you do not have to translate your objects into a database sub-language such as SQL through an interface such as ODBC or JDBC.

The second benefit is related to production. If you are working with complex data, an ODBMS is claimed to give performance that is ten to a thousand times faster than an RDBMS.

OBJECT RELATIONAL DATABASE

As the name implies, the object relational model supports both object-oriented and relational concepts. It eliminates certain discrepancies in the relational model. In this model it is possible to provide well-defined interfaces for the application. It facilitates reusability of the database objects: a structure once created can be reused, which is a fundamental property of the OOP concept. By combining the object-oriented and relational concepts, Oracle now offers the best of both worlds. Under this model a data item is considered as an object having attributes and methods, and it may optionally contain other objects.

ADVANTAGES OF OBJECT RELATIONAL DATABASE


The advantages of an object relational database are:-

i) Semantic content handling:- The object representation of data carries more meaning.

ii) Simplicity of design:- Because of its object orientation, object-oriented design approaches may be employed to design an object database, which is not only easy but also powerful.

iii) Database integrity:- Database integrity is ensured to a high level because of the inherently safe character of objects.

iv) Structural and data independence:- An object relational database provides the best of the two worlds. There is complete structural and data independence, and the database is the easiest to maintain.

DISADVANTAGES OF OBJECT RELATIONAL DATABASE

Object relational databases suffer from the following limitations:-

i) Lack of standards:- Object data model standards are yet to evolve. Therefore this part of the database is still mostly vendor-specific.

ii) Increased system overhead:- Because of the high system complexity, the system overhead is proportionally high. Sometimes this results in degraded performance.

RELATIONAL DATABASE MANAGEMENT SYSTEMS


The relational database approach is relatively recent and began with a theoretical paper by Codd, which proposed that, by using a technique called normalization, the entanglement observed in tree and network structures can be replaced by a relatively neat structure. Codd's principles relate to the logical description of the data, and it is important to bear in mind that this is quite independent of the way in which the data is physically stored.

The difference in the relational approach lies in how relationships between different tables are set up. This makes use of certain mathematical operations on relations, such as projection, union and join.

These operations come from relational algebra and relational calculus. Similarly, in order to organize the data in terms of tables in a satisfactory manner, a technique called normalization is used.
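The operations named above can be sketched directly on sets of tuples. The following minimal Python illustration treats a relation as a tuple of attribute names plus a set of rows; the function names and the sample data are assumptions made for this sketch and do not come from any particular DBMS.

```python
# A relation is modelled as (attribute names, set of row tuples).


def project(attrs, rows, keep):
    """Projection: keep only the named columns; duplicates vanish in a set."""
    idx = [attrs.index(a) for a in keep]
    return tuple(keep), {tuple(r[i] for i in idx) for r in rows}


def union(rows_a, rows_b):
    """Union of two relations that have the same attributes."""
    return rows_a | rows_b


def natural_join(attrs_a, rows_a, attrs_b, rows_b):
    """Natural join: combine rows that agree on all shared attributes."""
    shared = [a for a in attrs_a if a in attrs_b]
    extra = [b for b in attrs_b if b not in shared]
    out = set()
    for ra in rows_a:
        for rb in rows_b:
            if all(ra[attrs_a.index(s)] == rb[attrs_b.index(s)] for s in shared):
                out.add(ra + tuple(rb[attrs_b.index(e)] for e in extra))
    return tuple(attrs_a) + tuple(extra), out


student_attrs = ("roll_no", "name")
students = {(1, "Ram"), (2, "Mohan")}
new_students = {(2, "Mohan"), (3, "Gita")}
city_attrs = ("roll_no", "city")
cities = {(1, "N.Delhi"), (3, "Hisar")}

all_students = union(students, new_students)
name_attrs, names = project(student_attrs, all_students, ["name"])
joined_attrs, joined = natural_join(student_attrs, all_students, city_attrs, cities)
print(sorted(all_students), sorted(names), sorted(joined))
```

Because relations are sets, the union keeps Mohan only once, and the join silently drops roll number 2, which has no matching city row.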

Normalization is a technique which helps in determining the most appropriate grouping of data items into records, segments or tuples. This is necessary because in the relational model the data items are arranged in tables, which indicate the structure, relationship and integrity in the following manner:-

i) In any given column of a table, all items are of the same kind.

ii) Each item is a simple number or a character string.

iii) All rows of a table are distinct. In other words, no two rows are identical in every column.

iv) The ordering of rows within a table is immaterial.

v) The columns of a table are assigned distinct names and the ordering of these columns is immaterial.

vi) If a table has N columns, it is said to be of degree N. This is sometimes also referred to as the arity of the table.

From a few base tables it is possible, by setting up relations, to create views which provide the necessary information to the different users of the same database.
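The idea of deriving views from base tables can be tried out with SQLite through Python's standard sqlite3 module. In this sketch the table name student, the view name student_public and the column names are assumptions; the rows reuse values from the student example earlier in the text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student (roll_no TEXT PRIMARY KEY, name TEXT, age INTEGER, phone TEXT)"
)
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?, ?)",
    [
        ("01", "Ram", 23, "6456262"),
        ("02", "Mohan", 24, "434221"),
        ("03", "Gita", 45, "324545"),
    ],
)

# A view is a named query over base tables; it stores no data of its own.
conn.execute("CREATE VIEW student_public AS SELECT roll_no, name FROM student")

base_cursor = conn.execute("SELECT * FROM student")
base_degree = len(base_cursor.description)  # number of columns in the base table

view_cursor = conn.execute("SELECT * FROM student_public ORDER BY roll_no")
view_degree = len(view_cursor.description)  # number of columns in the view
view_rows = view_cursor.fetchall()
print(base_degree, view_degree, view_rows)
```

A user who is given access only to the view sees a table of degree 2 and never sees the age or phone columns of the base table of degree 4.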

ADVANTAGES OF RELATIONAL APPROACH


The popularity of the relational database approach is due not only to the availability of a large variety of products but also to certain inherent advantages.

i) Ease of use:- The representation of information as tables consisting of rows and columns is quite natural, and therefore even first-time users find it attractive.

ii) Flexibility:- The different tables from which information has to be linked and extracted can be easily manipulated by operators such as project and join to give information in the form in which it is desired.

iii) Precision:- The use of relational algebra and relational calculus in manipulating the relations between the tables ensures that there is no ambiguity, which may otherwise arise in establishing the linkages in a complicated network-type database.

iv) Security:- Security control and authorization can also be implemented more easily by moving sensitive attributes in a given table into a separate relation with its own authorization controls.

v) Data independence:- Data independence is achieved more easily with the normalized structure used in a relational database than in the more complicated tree or network structure.

vi) Data manipulation language:- Responding to ad hoc queries by means of a language based on relational algebra and relational calculus is easy in the relational database approach. For data organized in other structures, the query language either becomes complex or is extremely limited in its capabilities.

DISADVANTAGES OF RELATIONAL APPROACH

One should not get carried away into believing that there can be no alternative to the RDBMS. This is not so. A major constraint, and therefore disadvantage, in the use of a relational database system is machine performance. If the number of tables between which relationships are to be established is large and the tables themselves are voluminous, the performance in responding to queries is definitely degraded. It must be appreciated that the simplicity of the relational database approach arises in the logical view. With an interactive system, for example, an operation like join would also depend upon the physical storage.

AN EXAMPLE OF A RELATIONAL MODEL

Let us see the important features of an RDBMS through some examples:-

A relation has the following properties:-

i) Each column contains values of the same attribute, and each table cell value must be simple (a single value).

ii) Each column has a distinct name (attribute name), and the order of the columns is immaterial.

iii) Each row is distinct, i.e. one row cannot duplicate another row for the selected key attribute columns.

iv) The sequence of the rows is immaterial.

Product Relation (the column headings are the attributes; Product# is the primary key)

Product#   Description   Price    Quantity On Hand   Relative Record#
0100       Table         50.00    42                 1
0975       Wall Unit     750.00   0                  2
1250       Chair         400.00   13                 3
1775       Dresser       500.00   8                  4

Vendor Relation

Vendor#   Vendor_Name   Vendor_City
26        Maple Hill    Denver
13        Cedar Crest   Boulder
16        Oak Peak      Franklin
12        Cherry Mtn    London

Supplies Relation

Vendor#   Product#   Vendor_Price
13        1775       250.00
16        0100       150.00
16        1250       200.00
26        1250       200.00
26        1775       275.00


As shown in the above example, a tuple is the collection of values that compose one row of a relation. A tuple is equivalent to a record instance. An n-tuple is a tuple composed of n attribute values, where n is called the degree of the relation. The number of tuples in the relation is its cardinality.

A domain is the set of possible values for an attribute. For example, the domain for QUANTITY_ON_HAND in the PRODUCT relation is all integers greater than or equal to zero. The domain for VENDOR_CITY in the VENDOR relation is a set of alphabetic character strings restricted to the names of cities.

We can use a shorthand notation to represent relations (or tables) abstractly.

The three relations of the example can be written in this notation as:-

PRODUCT (PRODUCT#, DESCRIPTION, PRICE, QUANTITY_ON_HAND, RELATIVE_RECORD#)

VENDOR (VENDOR#, VENDOR_NAME, VENDOR_CITY)

SUPPLIES (VENDOR#, PRODUCT#, VENDOR_PRICE)
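The three relations can be loaded into an in-memory SQLite database and queried, as a rough sketch of how the shorthand notation maps onto table definitions. The column names (product_no, vendor_no and so on) are assumptions, since "#" is not convenient in SQL identifiers, and only the columns needed for the queries are kept.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE product (
        product_no TEXT PRIMARY KEY,
        description TEXT,
        price REAL,
        quantity_on_hand INTEGER
    );
    CREATE TABLE vendor (vendor_no INTEGER PRIMARY KEY, vendor_name TEXT);
    CREATE TABLE supplies (
        vendor_no INTEGER REFERENCES vendor(vendor_no),
        product_no TEXT REFERENCES product(product_no),
        vendor_price REAL,
        PRIMARY KEY (vendor_no, product_no)
    );
    """
)
conn.executemany(
    "INSERT INTO product VALUES (?, ?, ?, ?)",
    [
        ("0100", "Table", 50.0, 42),
        ("0975", "Wall Unit", 750.0, 0),
        ("1250", "Chair", 400.0, 13),
        ("1775", "Dresser", 500.0, 8),
    ],
)
conn.executemany(
    "INSERT INTO vendor VALUES (?, ?)",
    [(26, "Maple Hill"), (13, "Cedar Crest"), (16, "Oak Peak"), (12, "Cherry Mtn")],
)
conn.executemany(
    "INSERT INTO supplies VALUES (?, ?, ?)",
    [(13, "1775", 250.0), (16, "0100", 150.0), (16, "1250", 200.0),
     (26, "1250", 200.0), (26, "1775", 275.0)],
)

# Which vendors supply chairs, and at what price?
chair_vendors = conn.execute(
    "SELECT v.vendor_name, s.vendor_price "
    "FROM supplies AS s "
    "JOIN vendor AS v ON v.vendor_no = s.vendor_no "
    "JOIN product AS p ON p.product_no = s.product_no "
    "WHERE p.description = 'Chair' "
    "ORDER BY v.vendor_name"
).fetchall()

# Which products have no supplier at all?
unsupplied = conn.execute(
    "SELECT p.description FROM product AS p "
    "LEFT JOIN supplies AS s ON s.product_no = p.product_no "
    "WHERE s.vendor_no IS NULL"
).fetchall()
print(chair_vendors, unsupplied)
```

SUPPLIES carries the N:M relationship between vendors and products; its primary key is the combination of the two foreign keys.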

CONCEPT OF A RELATIONAL MODEL

The relational model was propounded by E.F. Codd of IBM in 1970. The basic concept in the relational model is that of a relation.

A relation can be viewed as a table which has the following properties:-

i) Property 1:- It is column homogeneous. In other words, in any given column of a table all items are of the same kind.


ii) Property 2:- Each item is a simple number or a character string. That is, a table must be in 1NF (first normal form), which will be introduced in the next unit.

iii) Property 3:- All rows of a table are distinct.

iv) Property 4:- The ordering of rows within a table is immaterial.

v) Property 5:- The columns of a table are assigned distinct names and the

ordering of these columns is immaterial.

Example of a valid relation

S#   P#   SCITY
10   1    Bangalore
10   2    Bangalore
11   1    Bangalore
11   2    Bangalore

The relational model is an abstract theory of data that is based on certain aspects of mathematics (principally set theory and predicate logic). The relational model is a way of looking at data. It is concerned with three aspects of data: structure, integrity and manipulation (for example join, projection, etc.).

i) Structural aspect:- The data in the database is perceived by the user as tables. This means that data is arranged in tables, and a collection of tables is called a database. Structure means the design view of the database, such as data types and sizes.

ii) Integrity aspect:- The tables satisfy certain integrity constraints such as domain constraints, entity integrity, referential integrity and operational constraints.

iii) Manipulative aspect:- Operators are available to the user for manipulating the tables in the database, e.g. for retrieval of data, such as projection, join and restrict.


THE CODD COMMANDMENTS

CODD’s 12 RULES:-

E.F. Codd defined twelve rules that any DBMS should satisfy in order to qualify as an RDBMS. These twelve rules are the guidelines against which RDBMS products such as ORACLE, INGRES, SYBASE and INFORMIX are measured.

THE TWELVE RULES

Just as with the 12 rules that define a distributed product, there is a single overall rule, which in some ways covers all the others and is commonly called Rule 0. It states that any truly relational database must be manageable entirely through its own relational capabilities.

Rule One: The information rule

All information in a relational database is represented explicitly at the logical level and in

exactly one way-by values in tables.

Rule Two: The guaranteed access rule

Each and every datum in a relational database is guaranteed to be logically accessible by

resorting to a combination of table name, primary key value and column name.
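Rule Two says that a table name, a primary key value and a column name together are enough to address any single value. A minimal sketch with Python's sqlite3 module follows; the helper fetch_datum, the product table and its column names are hypothetical names introduced for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE product (product_no TEXT PRIMARY KEY, description TEXT, price REAL)"
)
conn.executemany(
    "INSERT INTO product VALUES (?, ?, ?)",
    [("0100", "Table", 50.0), ("1250", "Chair", 400.0)],
)


def fetch_datum(conn, table, key_column, key_value, column):
    """Return the single value addressed by (table, primary key value, column)."""
    # Identifiers cannot be bound as SQL parameters, so they are validated here.
    for name in (table, key_column, column):
        if not name.isidentifier():
            raise ValueError(f"not a plain identifier: {name!r}")
    row = conn.execute(
        f"SELECT {column} FROM {table} WHERE {key_column} = ?", (key_value,)
    ).fetchone()
    return None if row is None else row[0]


price = fetch_datum(conn, "product", "product_no", "1250", "price")
missing = fetch_datum(conn, "product", "product_no", "9999", "price")
print(price, missing)
```

No pointer, record position or access path is involved; the three logical names are sufficient.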

Rule Three: The Systematic treatment of null values

Null values are supported for representing missing information and inapplicable information in a systematic way, independent of data type.
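The behaviour of nulls can be observed in SQLite through Python's sqlite3 module, where Python's None is stored as SQL NULL in a column of any type. The table and column names in this sketch are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student (roll_no TEXT PRIMARY KEY, name TEXT, age INTEGER, phone TEXT)"
)
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?, ?)",
    [
        ("01", "Ram", 23, "6456262"),
        ("02", "Mohan", None, None),  # age and phone are unknown
    ],
)

# IS NULL is the systematic test for missing values, whatever the data type.
missing_phone = conn.execute(
    "SELECT name FROM student WHERE phone IS NULL"
).fetchall()

# Comparing with '=' never matches, because NULL is not equal to anything.
equals_null = conn.execute(
    "SELECT name FROM student WHERE phone = NULL"
).fetchall()

# COUNT(*) counts rows, COUNT(age) counts only non-null ages.
counts = conn.execute("SELECT COUNT(*), COUNT(age) FROM student").fetchone()
print(missing_phone, equals_null, counts)
```

The same NULL marker is used for the integer column and the text column, which is what "independent of data type" means in practice.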

Rule Four: The Dynamic On-Line catalog based on the relational model


The database description is represented at the logical level in the same way as ordinary data, so that authorized users can apply the same relational language to its interrogation as they apply to the regular data.
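SQLite gives a small, concrete instance of this rule: its catalog is itself a table, sqlite_master, that can be queried with ordinary SELECT statements. This is specific to SQLite; other systems expose their catalog under different names. The product and vendor tables below are assumptions made for the sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE product (product_no TEXT PRIMARY KEY, description TEXT, price REAL);
    CREATE TABLE vendor (vendor_no INTEGER PRIMARY KEY, vendor_name TEXT);
    """
)

# The catalog is queried with the same language as ordinary data.
tables = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
]

# Column descriptions of one table; the second field of each row is the column name.
columns = [row[1] for row in conn.execute("PRAGMA table_info(product)")]
print(tables, columns)
```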

Rule Five: The Comprehensive Data Sub-Language rule

A relational system may support several languages and various modes of terminal use. However, there must be at least one language whose statements can express all of the following:-

Data definition, view definition, data manipulation, integrity constraints, authorization, transaction boundaries.

Rule Six: View updating rule

All views that are theoretically updateable are also updateable by the system.

Rule Seven: High Level insert, Update and Delete

The capability of handling a base relation or a derived relation as a single operand applies not only to the retrieval of data but also to the insertion, update and deletion of data.

Rule Eight: Physical Data Independence

Application programs and terminal activities remain logically unimpaired whenever any changes are made in either storage representation or access methods.

Rule Nine: Logical Data Independence

Application programs and terminal activities remain logically unimpaired when information-preserving changes of any kind that theoretically permit unimpairment are made to the base tables.


Rule Ten: Integrity Independence

Integrity constraints specific to a particular relational database must be definable in the

relational data sub-language and storable in the catalogue, not in the application programs.
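As a small illustration of this rule, the sketch below declares a constraint in the table definition rather than in application code, using SQLite through Python's sqlite3 module. The table name, column names and the particular constraint (quantity on hand must not be negative) are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE product ("
    " product_no TEXT PRIMARY KEY,"
    " quantity_on_hand INTEGER NOT NULL CHECK (quantity_on_hand >= 0))"
)
conn.execute("INSERT INTO product VALUES ('0100', 42)")

# The DBMS itself rejects the bad row; no application-side check is needed.
try:
    conn.execute("INSERT INTO product VALUES ('0975', -1)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

# The constraint is stored in the catalog along with the table definition.
stored_definition = conn.execute(
    "SELECT sql FROM sqlite_master WHERE name = 'product'"
).fetchone()[0]
row_count = conn.execute("SELECT COUNT(*) FROM product").fetchone()[0]
print(rejected, row_count)
```

Every program that writes to the table is subject to the same rule, because the rule lives with the data and not with any one program.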

Rule Eleven: Distribution Independence

The data manipulation sub-language of a relational DBMS must enable application programs and enquiries to remain logically the same whether data is physically centralized or distributed.

Rule Twelve: Non-subversion Rule

If a relational system has a low-level language, that low-level language cannot be used to subvert or bypass the integrity rules and constraints expressed in the higher-level relational language.

NORMALIZATION

Normalization is the process of efficiently organizing data in a database. The basic objective of normalization is to reduce redundancy. Redundancy is the unnecessary repetition of data, which can cause problems with the storage, retrieval and updating of data.

In the process of normalization, data are grouped together in the simplest way so that changes can be made easily.


Normalization usually involves dividing a database into two or more tables and defining

relationships between the tables.

Normalization is a step-by-step process of removing redundancies and undesirable dependencies among the columns of a table.

Normalization is typically a refinement process.

GOALS OF NORMALIZATION PROCESS

i) Eliminating redundant data.

ii) Ensuring data dependencies make sense.

iii) Eliminating columns that do not depend on the key attribute.

NEED FOR NORMALIZATION

i) It improves the database design.

ii) It ensures that there is no duplication of data.

iii) It reduces the need to reorganize data when the design is modified or enhanced.

iv) It makes the database structure flexible. In other words, it should be possible to add new rows and data values without disturbing the database structure.

v) It keeps data consistent throughout the database.

ADVANTAGES OF NORMALIZATION

i) Integrity:- Each entity is described only once, so there is no repeated data.

ii) Removes duplication:- Repeated data is consolidated into a single copy, which makes everyone's job easier and leaves no room for duplication.


iii) Breaking the data into pieces:- Normalization breaks the database down into smaller tables, making it much easier for database users to work with.

iv) Security:- Security also becomes easier to control because the database administrator can check which users have access to which tables.

v) Indexing:- Indexing speeds up access to data and improves performance.

vi) Referential integrity:- Referential integrity means that a column in one table has to relate to a column in another table.

DISADVANTAGES OF NORMALIZATION

i) Normalization is a time-consuming process.

ii) As normalization increases, the performance of the database may degrade, because queries must join more tables.

iii) Complexity increases.

iv) Careless normalization produces a poor database design, which makes the database difficult to handle.

TYPES OF NORMALIZATION

There are various normal forms. These forms are numbered from one to five:-


i) First normal form (1NF):- The first normal form is the starting step of normalization. In this step the unnormalized data is converted into first normal form. It removes repeating groups of data.

Rules for 1NF

i) A table is in 1NF if it does not contain any repeating groups of data.

ii) Each cell in a table stores only one value (each value should be atomic).

iii) Create a separate table for each group of related data.

iv) Each row in a table must be uniquely identified by a primary key column.

ii) Second normal form (2NF):- A table which is in 1NF must meet additional criteria to qualify for 2NF. A 1NF table is in 2NF if and only if every non-key column is functionally dependent on the whole of each candidate key (including the primary key) and not on just a part of it. In simple words, a table is in 2NF if all the non-key columns are fully dependent on the key columns.

iii) Third normal form (3NF):- A table is in 3NF if it fulfils all the conditions of 2NF and the following:

i) All the non-key attributes are independent of the other non-key attributes.

ii) All the columns of the table depend directly on the primary key column.

iii) Any column which is not dependent on the primary key is moved out to a separate table.

iv) BCNF (Boyce-Codd normal form):-

i) A relational table is said to be in BCNF if it is in third normal form (3NF) and every determinant is a candidate key.

ii) BCNF differs from 3NF only when a table has multiple, overlapping candidate keys.

iii) There is no dependency between one multi-attribute key and another multi-attribute key.

v) Fourth normal form (4NF):- A table is in fourth normal form if it fulfils all the criteria of BCNF and the following:

i) A table is in 4NF if it has no multi-valued dependency.

ii) 4NF is a stronger normal form than BCNF.

iii) It prevents a table from containing multi-valued dependencies. In other words, 4NF is used to remove multi-valued dependency columns from the table.

vi) Fifth normal form (5NF):- 5NF deals with join dependencies. A table is in 5NF if it cannot be divided any further without losing information, so that the original table can always be recombined from its parts. It is also called project-join normal form (PJNF).
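The first steps of normalization can be traced on a small data set. The sketch below starts from an unnormalized list of vendors, each holding a repeating group of products, flattens it into 1NF, and then splits it into three tables that satisfy 2NF. The vendor and product values are taken from the Vendor, Product and Supplies relations shown earlier; the Python variable names are assumptions made for the illustration.

```python
# Unnormalized: one record per vendor, holding a repeating group of products.
unnormalized = [
    {"vendor_no": 16, "vendor_name": "Oak Peak",
     "products": [("0100", "Table", 150.0), ("1250", "Chair", 200.0)]},
    {"vendor_no": 26, "vendor_name": "Maple Hill",
     "products": [("1250", "Chair", 200.0), ("1775", "Dresser", 275.0)]},
]

# 1NF: one atomic value per cell; the key is (vendor_no, product_no).
first_nf = [
    (v["vendor_no"], v["vendor_name"], p_no, desc, price)
    for v in unnormalized
    for (p_no, desc, price) in v["products"]
]

# 2NF: remove partial dependencies on the composite key.
#   vendor_name depends only on vendor_no,
#   description depends only on product_no,
#   vendor_price depends on the whole key.
vendor = {(v_no, v_name) for (v_no, v_name, _, _, _) in first_nf}
product = {(p_no, desc) for (_, _, p_no, desc, _) in first_nf}
supplies = {(v_no, p_no, price) for (v_no, _, p_no, _, price) in first_nf}

# The decomposition is lossless: joining the three tables gives back the 1NF rows.
vendor_names = dict(vendor)
descriptions = dict(product)
rejoined = sorted(
    (v_no, vendor_names[v_no], p_no, descriptions[p_no], price)
    for (v_no, p_no, price) in supplies
)
print(len(first_nf), len(vendor), len(product), len(supplies))
```

In the 1NF table the description "Chair" is stored twice; after the split it is stored once. The three resulting tables also happen to be in 3NF, since no non-key column depends on another non-key column.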

THE DATABASE DESIGN PROCESS:-

The process of database design can be stated as follows

Design the logical and physical structure of one or more databases to accommodate the

information needs of the users in an organization for a defined set of applications.


The goals of database design are multiple:-

i) Satisfy the information content requirement of the specified users and applications.

ii) Provide a natural and easy-to-understand structuring of the information.

iii) Support processing requirements and any performance objectives such as response time,

processing time and storage space.

These goals are very hard to accomplish and measure, and they involve an inherent trade-off: if one attempts to achieve more "naturalness" and "understandability" of the model, it may be at the cost of performance. The problem is aggravated because the database design process often begins with informal and poorly defined requirements. In contrast, the result of the design activity is a rigidly defined database schema that cannot be modified easily once the database is implemented. We can identify six main phases of the database design process:

Requirements collection and analysis.

Conceptual database design.

Choice of a DBMS.

Data model mapping (also called logical database design).

Physical database design.

Database system implementation and tuning.

Phase 1: Requirements collection and analysis
    Data content and structure: data requirements
    Database applications: processing requirements

Phase 2: Conceptual database design
    Data content and structure: conceptual schema design (DBMS-independent)
    Database applications: transaction and application design (DBMS-independent)

Phase 3: Choice of DBMS

Phase 4: Data model mapping (logical design)
    Data content and structure: logical schema and view design (DBMS-dependent)
    Database applications: frequencies, performance constraints

Phase 5: Physical design
    Data content and structure: internal schema design (DBMS-dependent)

Phase 6: System implementation and tuning
    Data content and structure: DDL statements, SDL statements
    Database applications: transaction and application implementation

Figure: Phases of database design for large databases

The design process consists of two parallel activities, as shown in the figure above. The first activity involves the design of the data content and structure of the database; the second relates to the design of database applications. To keep the figure simple, we have avoided showing most of the interactions between these two sides, but the two activities are closely intertwined.

Traditionally, database design methodologies have primarily focused on the first of these activities whereas software design has focused on the second; this may be called data-driven versus process-driven design.

i) Requirements collection & analysis (Phase 1):- Before we can effectively design a database, we must know and analyze the expectations of the users and the intended uses of the database in as much detail as possible.

Typically, the following activities are part of this phase:-

i) The major application areas and user groups that will use the database, or whose work will be affected by it, are identified.

ii) Existing documentation concerning the applications is studied and analyzed.

iii) The current operating environment and planned use of the information are studied. This includes analysis of the types of transactions and their frequencies, as well as of the flow of information within the system.

iv) Written responses to sets of questions are sometimes collected from the potential database users or user groups.

v) These questions involve the users' priorities and the importance they place on various applications.

ii) Conceptual database design (Phase 2):- The goal of this phase is to produce a conceptual schema for the database that is independent of a specific DBMS.

i) We often use a high-level data model during this phase.

ii) In addition, we specify as many of the known database applications or transactions as possible, using a notation that is independent of any specific DBMS.

iii) Often the DBMS choice has already been made for the organization; the intent of conceptual design is still to keep it as free as possible from implementation considerations.

iii) Choice of DBMS (Phase 3):- The choice of a DBMS is governed by a number of factors: some technical, others economic, and still others concerned with the politics of the organization.

Technical factors are concerned with the suitability of the DBMS for the task at hand.

The economic and organizational factors that affect the choice of DBMS are:-

i) Software acquisition cost

ii) Maintenance cost

iii) Database creation & conversion cost

iv) Personnel cost

v) Training cost

vi) Operating cost

iv) Data model mapping (Phase 4):- During this phase we map the conceptual schema from the high-level data model used in Phase 2 into the data model of the chosen DBMS.

We can start this phase after choosing a specific type of DBMS, for example if we have decided to use some relational DBMS but have not yet decided on which particular one. We call the latter system-independent (but data model-dependent) logical design. The mapping can proceed in two stages.

i) System-independent mapping:- In this stage, the mapping does not consider any specific characteristics or special cases that apply to the DBMS implementation of the data model.

ii) Tailoring the schema to a specific DBMS:- Different DBMSs implement a data model by using specific modeling features and constraints.

v) Physical database design (Phase 5):- During this phase, we design the specifications for the database in terms of physical storage structures, record placement and indexes.

The following criteria are often used to guide the choice of physical database design options:

i) Response time:- This is the elapsed time between submitting a database transaction for execution and receiving a response. A major influence on response time that is under the control of the DBMS is the database access time for the data items referenced by the transaction.

ii) Space utilization:- This is the amount of storage space used by the database files and their access path structures on disk, including indexes and other access paths.

iii) Transaction throughput:- This is the average number of transactions that can be processed per minute; it is a critical parameter of transaction systems such as those used for airline reservation or banking.


vi) Database system implementation & tuning (Phase 6):- During this phase, the database and application programs are implemented, tested and eventually deployed for service.

i) Various transactions and applications are tested individually and then in conjunction with each other.

ii) Tuning:- Tuning is an ongoing activity, a part of system maintenance that continues throughout the life cycle of a database, for as long as the database and applications keep evolving and performance problems are detected.

RDBMS TECHNOLOGY

i) Attributes:- A kind of information that describes one aspect of a data object. For example, “age” is an attribute of a person and “salary” is an attribute of an employee. “Attribute” is also called “column”.

ii) Relation:- A data object defined by a set of attributes. For example “employee” is a

relation with various attributes that define the data object. “Relation” is also called “table”.

iii) Tuple:- An instance of a data object with specific values for all attributes of the relation. For example, one tuple of the “course” relation is the operating system course, with “operating system” as the value of the “course name” attribute and other values for the other attributes. “Tuple” is also called “record”.

iv) Table:- This is the structure defined to store data. It is also called an ENTITY.

v) Fields:- Fields are also known as attributes in database terminology. A field defines a property of an entity; in other words, fields are the columns of a table. For example, suppose a company gathers employee details and department details. Employee details such as emp name, age, address and so on are stored in one structure, called the EMP table.
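The terms above can be seen side by side in a small sketch. It assumes Python's built-in sqlite3 module; the course table and its columns course_no and credits are hypothetical, chosen to match the “course” example in the text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# A relation (table) is defined by its attributes (columns).
conn.execute("CREATE TABLE course (course_no INTEGER, course_name TEXT, credits INTEGER)")
# A tuple (record) supplies one value per attribute.
conn.execute("INSERT INTO course VALUES (101, 'operating system', 4)")

# PRAGMA table_info returns one row per column; position 1 is the column name.
attributes = [row[1] for row in conn.execute("PRAGMA table_info(course)")]
tuples = conn.execute("SELECT * FROM course").fetchall()

print(attributes)
print(tuples)
```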

NORMALIZATION IS REQUIRED BECAUSE

i) To structure the data so that pertinent relationships between entities or tables can be maintained.

ii) To permit simple retrieval of data in response to queries.

iii) For simple maintenance of data when updates, insertions and deletions take place.

iv) To reduce the need to restructure or reorganize data when new application requirements arise.

At each level of normalization, redundancy of data is reduced. Some related RDBMS terms are:-

i) JOIN:- We have seen that we can create a number of tables. A situation may arise where rows of one table need to be joined with rows of another by common values in corresponding columns.

ii) Referential integrity:- This is an important term in relational database management systems (RDBMS for short). A referential integrity constraint is created on a table with reference to another table or tables.

iii) Foreign key:- It is defined as a column or set of columns in a child table used to declare a referential integrity constraint.

iv) Data integrity:- Data integrity refers to the accuracy of data. It is essential for the proper maintenance and usage of a database.

v) Concept of index:- This is vital in a database. Indexing in a database is similar to the concept of pointers.

vi) Security:- It is essential to provide good security for a database so that the data is kept safe from improper access and from improper updating.

vii) Users:- It is essential that only valid users have access to the database. This is achieved in a database management system by assigning user-ids.

viii) Deadlock:- A transaction is a unit of work. A database management system will have a number of transactions running. There may be situations when two or more transactions are put into a wait state simultaneously, each waiting for a resource held by another.

RELATIONAL DATA INTEGRITY


A major part of the database developer’s task is guaranteeing the integrity and consistency of the data stored in the tables. It may seem only logical that a date field contains a date or that a numeric field contains only numbers, but this requires planning, as well as the specification of predefined rules.

Table DEPT

DEPTNO   DNAME      LOC
20       RESEARCH   DALLAS
30       SALES      CHICAGO

Rule on DEPT: each value in the DNAME column must be unique.

Table EMP

EMPNO   ENAME    (other columns)   SAL       COMM     DEPTNO
6666    MULDER                     5500.00            20
7329    SMITH                      9000.00            20
7499    ALLEN                      7500.00   100.00   30
7521    WARD                       5000.00   200.00   30
7566    JONES                      2972.00   400.00   30

Rules on EMP:

Each row must have a value for the ENAME column.

Each row must have a value for the EMPNO column, and the value must be unique.

Each value in the SAL column must be less than 10,000.00.

Figure: The DEPT and EMP tables

Domain Integrity Constraint

These constraints set a range, and any violation that takes place will prevent the user from performing the manipulation that caused the breach. There are basically two types of domain integrity constraint:

Not Null constraint

Check constraint

By default, the columns of a table can contain null values. Enforcing a not null constraint on a column ensures that the column contains a value in every row. The database will not accept the record until this is satisfied. The other type of constraint available under this classification is the ‘check’ constraint. This can be defined to allow only a particular range of values. When the limit specified in this range is violated, the database rejects the record.
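Both domain constraints can be tried out in a few lines. The sketch below assumes Python's built-in sqlite3 module rather than Oracle; the column rules follow the EMP figure (ENAME must have a value, SAL must be less than 10,000.00), and the helper try_insert is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE emp (
        empno INTEGER,
        ename TEXT NOT NULL,                 -- every row must name the employee
        sal   NUMERIC CHECK (sal < 10000)    -- salary must stay below 10,000.00
    )
    """
)

def try_insert(row):
    """Insert a row; return True if accepted, False if a constraint rejected it."""
    try:
        conn.execute("INSERT INTO emp (empno, ename, sal) VALUES (?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False

results = [
    try_insert((7329, "SMITH", 9000.00)),   # satisfies both constraints
    try_insert((7499, None, 7500.00)),      # violates NOT NULL on ename
    try_insert((7521, "WARD", 12000.00)),   # violates CHECK on sal
]
print(results)
```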

Entity Integrity Constraint

Entity constraints are of two types:

Unique constraints

Primary key constraints

The unique value rule/constraint means that each value in a particular column is unique, such as DEPTNO in the DEPT table. The database rejects duplicate values when the unique key constraint is used.

The primary key rule/constraint specifies that each row of a table must be identified by a unique value. It is similar to the unique key constraint. Its need is best felt when a relationship has to be set up between tables because, in addition to preventing duplication, it also does not allow null values.
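The difference between the two entity constraints can be sketched as follows. The example assumes Python's built-in sqlite3 module; the DEPT columns come from the figure, while the helper rejected and the extra rows are hypothetical. NOT NULL is written out on the primary key because SQLite, unlike the behaviour described above, does not always imply it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE dept (
        deptno INT  NOT NULL PRIMARY KEY,  -- identifies each row; no duplicates, no nulls
        dname  TEXT UNIQUE,                -- no two departments share a name
        loc    TEXT
    )
    """
)
conn.execute("INSERT INTO dept VALUES (20, 'RESEARCH', 'DALLAS')")
conn.execute("INSERT INTO dept VALUES (30, 'SALES', 'CHICAGO')")

def rejected(row):
    """Return True if the database refuses the row because of a constraint."""
    try:
        conn.execute("INSERT INTO dept VALUES (?, ?, ?)", row)
        return False
    except sqlite3.IntegrityError:
        return True

duplicate_key = rejected((20, "MARKETING", "BOSTON"))    # deptno 20 already exists
duplicate_name = rejected((40, "SALES", "BOSTON"))       # dname SALES already exists
null_key = rejected((None, "MARKETING", "BOSTON"))       # primary key may not be null
null_name = rejected((50, None, "BOSTON"))               # a unique column still accepts a null
```

Only the last row is accepted, which shows the point made above: a primary key adds the no-nulls rule on top of uniqueness.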

Referential Integrity Constraint

The referential integrity rule/constraint determines that the value of a field or group of fields corresponds to the key fields of another table. For example, the DEPTNO column of the EMP table accepts only the values that are registered in the DEPTNO column of the DEPT table. This rule also allows you to specify the type of data manipulation that is allowed on the referenced values.

The referential integrity constraint enforces the relationship between tables. It designates a column or combination of columns as a foreign key. The foreign key establishes a relationship with a specified primary or unique key in another table, called the referenced key. In this relationship, the table containing the foreign key is called the child table and the table containing the referenced key is called the parent table. One can either enable or disable the constraint; by default the constraint is enabled.
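A minimal sketch of the parent and child relationship follows. It assumes Python's built-in sqlite3 module, where foreign key checking is switched on per connection with a PRAGMA (in Oracle the constraint is enabled by default, as stated above). The helper violates is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.execute("CREATE TABLE dept (deptno INTEGER PRIMARY KEY, dname TEXT)")  # parent table
conn.execute(
    """
    CREATE TABLE emp (                               -- child table
        empno  INTEGER PRIMARY KEY,
        ename  TEXT,
        deptno INTEGER REFERENCES dept (deptno)      -- foreign key to the referenced key
    )
    """
)
conn.execute("INSERT INTO dept VALUES (20, 'RESEARCH'), (30, 'SALES')")
conn.execute("INSERT INTO emp VALUES (7329, 'SMITH', 20)")  # 20 is registered in dept

def violates(sql):
    """Return True if the statement is rejected by an integrity constraint."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True

orphan_insert = violates("INSERT INTO emp VALUES (7499, 'ALLEN', 99)")  # no department 99
parent_delete = violates("DELETE FROM dept WHERE deptno = 20")          # SMITH still refers to 20
```

Both statements are refused: the child table cannot point at a missing department, and a referenced parent row cannot be deleted while child rows depend on it.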

Check Constraints

As mentioned previously, check constraints specify conditions that each row must satisfy. These rules are governed by logical or Boolean expressions. Check conditions cannot contain subqueries. The following example will help us to understand check constraints better.

To cut down on storage costs and excess material in their warehouse, Tom, Dick and Harry Spares Inc. decides to set a limit on the maximum storage level for each item. The item file table that was created earlier should be dropped and recreated to impose the constraint.
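The scenario above could be sketched as below. This is an illustration only: the original table definition is not shown in these notes, so the table name itemfile, its columns, and the ceiling of 500 are all assumptions, and Python's built-in sqlite3 module stands in for the DBMS.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The earlier, unconstrained version of the table (columns are assumed).
conn.execute(
    "CREATE TABLE itemfile (itemcode TEXT, itemdesc TEXT, qty_hand INTEGER, max_level INTEGER)"
)

# Drop it and recreate it with check constraints on the storage level.
conn.execute("DROP TABLE itemfile")
conn.execute(
    """
    CREATE TABLE itemfile (
        itemcode  TEXT PRIMARY KEY,
        itemdesc  TEXT,
        qty_hand  INTEGER,
        max_level INTEGER CHECK (max_level < 500),   -- ceiling of 500 is an assumed figure
        CHECK (qty_hand <= max_level)                -- stock may not exceed the item's ceiling
    )
    """
)

def accepted(row):
    """Return True if the row passes every check constraint."""
    try:
        conn.execute("INSERT INTO itemfile VALUES (?, ?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False

ok_row = accepted(("i201", "nuts", 100, 250))
over_ceiling = accepted(("i202", "bolts", 100, 900))    # max_level breaks the ceiling
over_stocked = accepted(("i203", "panels", 300, 250))   # qty_hand exceeds max_level
```

Each condition is a Boolean expression over the columns of a single row, with no subquery, in line with the restriction stated above.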

Introduction


This unit will introduce you to the basics of Access 2000. You will learn how to work with the menus and the toolbar buttons of the Access window, how to create databases, tables and forms, and how to add, update and delete data using them.

Features of Access

Introduction to SQL


SQL was invented and developed by IBM in the early 1970s. SQL stands for Structured Query Language. IBM was able to demonstrate how to control relational databases using SQL. The SQL implemented by Oracle Corporation is 100% compliant with the ANSI/ISO standard SQL data language. Oracle’s database language is SQL, which is used for storing and retrieving information. SQL allows users to access data in relational database management systems, such as Oracle, Sybase, Informix, Microsoft SQL Server, Access, and others, by allowing users to describe the data they wish to see. A table is the primary database object of SQL that is used to store data. A table holds data in the form of rows and columns.

The following are the benefits of SQL:-

1 It is a non-procedural language, because it can access many records at once rather than one record at a time.

2 It is a common language for all relational databases. In other words, it is portable and requires very few modifications to work on other databases.

3 It has very simple commands for querying, inserting, deleting and modifying data and objects.

Characteristics of SQL

Structured Query Language (SQL), pronounced “sequel”, is the set of commands that all programs and users must use to access data within an Oracle7 database. Application programs and Oracle7 tools often allow users to access the database without directly using SQL, but these applications in turn must use SQL when executing the user’s request.

Structured Query Language

SQL is a well-defined language that has been developed for relational database users to interact with the database in a simple and efficient manner. Every implementation of a relational database, e.g., ORACLE, SYBASE, MS-ACCESS etc., understands SQL. Therefore, from a user’s point of view it is essential that she knows how to use this language. A complete coverage of this language is out of the scope of this book, yet a synopsis is presented below.

Although the exact form of an SQL command depends on its implementation, a common subset is dealt with in the following.

The SQL commands can be categorized in different groups for convenience. They are:

Data Definition, Constraints and Schema Changes in SQL

Basic Queries in SQL

More complex SQL Queries

Insert, Delete and Update statements in SQL

View in SQL

Structured Query Language was designed and implemented at IBM Research. It was created in the early 1970s under the name SEQUEL. The first standard version of SQL is called SQL-86 or SQL1. A revised version of standard SQL, called SQL2 (also known as SQL-92), is now available. SQL has since been extended with object-oriented and other recent database concepts.

SQL offers a range of data types for attributes:

Numeric            integer, real (formatted, such as DECIMAL(10,2))
Character string   fixed length, varying length
Date               in the form YYYY-MM-DD
Time               in the form HH:MM:SS
Timestamp          includes both the DATE and TIME fields
Interval           used to increase/decrease the value of a date, time, or timestamp

Basic queries in SQL

SQL allows a table (relation) to have two or more tuples that are identical in all their attribute values. Hence, an SQL table is not a set of tuples, because a set does not allow two identical members; rather, it is a multiset of tuples.
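The multiset behaviour is easy to observe. The sketch below assumes Python's built-in sqlite3 module and a hypothetical emp table declared without any key, so identical rows are accepted; SELECT DISTINCT then removes the duplicates from the result.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (ename TEXT, deptno INTEGER)")  # no key, so duplicates are allowed
conn.executemany(
    "INSERT INTO emp VALUES (?, ?)",
    [("SMITH", 20), ("SMITH", 20), ("ALLEN", 30)],
)

# The plain query returns the multiset, duplicates included.
all_rows = conn.execute("SELECT ename, deptno FROM emp ORDER BY ename").fetchall()
# DISTINCT turns the result into a set of tuples.
distinct_rows = conn.execute(
    "SELECT DISTINCT ename, deptno FROM emp ORDER BY ename"
).fetchall()

print(all_rows)
print(distinct_rows)
```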

SELECT statement

The SELECT statement used in SQL has no relationship to the SELECT operation of relational

algebra discussed earlier.

The syntax of this command is:

SELECT <attribute list>
FROM <table list>
WHERE <condition>;

Example:

Query 0: Retrieve the birth date and address of the employee(s) whose name is 'Rajesh'.

Q0: SELECT BDATE, ADDRESS
    FROM EMP
    WHERE ENAME = 'Rajesh';

Query 1: Retrieve the name and address of all employees who work for the 'Administration' department.

Q1: SELECT ENAME, ADDRESS
    FROM EMP, DEPARTMENT
    WHERE DNAME = 'Administration' AND DEPTNO = DNO;
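Q0 and Q1 can be run against a small test database. The sketch below assumes Python's built-in sqlite3 module; the table layouts are inferred from the column names used in the queries, and the sample rows are invented for illustration. An ORDER BY is added to Q1 only to make the output order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE DEPARTMENT (DEPTNO INTEGER PRIMARY KEY, DNAME TEXT)")
conn.execute("CREATE TABLE EMP (ENAME TEXT, BDATE TEXT, ADDRESS TEXT, DNO INTEGER)")
conn.executemany(
    "INSERT INTO DEPARTMENT VALUES (?, ?)",
    [(1, "Administration"), (2, "Research")],
)
conn.executemany(
    "INSERT INTO EMP VALUES (?, ?, ?, ?)",
    [
        ("Rajesh", "1975-04-12", "12 Park Street", 2),
        ("Anita", "1980-09-30", "7 Lake Road", 1),
        ("Vikram", "1978-01-05", "3 Hill View", 1),
    ],
)

# Q0: a condition on one table.
q0 = conn.execute("SELECT BDATE, ADDRESS FROM EMP WHERE ENAME = 'Rajesh'").fetchall()

# Q1: DEPTNO = DNO is the join condition that links each employee to a department.
q1 = conn.execute(
    "SELECT ENAME, ADDRESS FROM EMP, DEPARTMENT "
    "WHERE DNAME = 'Administration' AND DEPTNO = DNO ORDER BY ENAME"
).fetchall()

print(q0)
print(q1)
```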

Advantages of SQL

Benefits of SQL

This section describes many of the reasons for SQL’s widespread acceptance by relational database vendors as well as end users. The strengths of SQL benefit all ranges of users, including application programmers, database administrators, management, and end users.

Non-Procedural Language

SQL is a non-procedural language because it:

Processes sets of records rather than just one at a time;

Provides automatic navigation to the data.
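Set-at-a-time processing can be illustrated with a single UPDATE. The sketch assumes Python's built-in sqlite3 module and a hypothetical emp table; the statement only states which rows should change, and the DBMS locates and updates them without any loop in the program.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (ename TEXT, sal REAL, deptno INTEGER)")
conn.executemany(
    "INSERT INTO emp VALUES (?, ?, ?)",
    [("MULDER", 5500.0, 20), ("SMITH", 9000.0, 20), ("ALLEN", 7500.0, 30)],
)

# One statement describes the desired result for the whole set of matching rows.
cursor = conn.execute("UPDATE emp SET sal = sal + 100 WHERE deptno = 20")
rows_changed = cursor.rowcount  # number of rows the statement modified

salaries = dict(conn.execute("SELECT ename, sal FROM emp"))
print(rows_changed, salaries)
```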

SQL can be used for all types of database activity by all types of users, including:

System administrators

Database administrators

Security administrators

Application programmers

Decision support system personnel

Many other types of end users

SQL provides easy-to-learn commands that are both consistent and applicable to all users. The basic SQL commands can be learned in a few hours and even the most advanced commands can be mastered in a few days.

Unified Language

SQL provides commands for a variety of tasks, including:

Querying data;

Inserting, updating and deleting rows in a table;

Creating, replacing, altering and dropping objects;

Controlling access to the database and its objects;

Guaranteeing database consistency and integrity.

SQL unifies all the above tasks in one consistent language.


Common Language for all Relational Databases

Because all major relational database management systems support SQL, you can transfer all the skills you have gained with SQL from one database to another. In addition, since all programs written in SQL are portable, they can often be moved from one database to another with very little modification.

Embedded SQL

Embedded SQL refers to the use of standard SQL commands embedded within a procedural programming language. Embedded SQL is a collection of these commands:

All SQL commands, such as SELECT and INSERT, that are available when using SQL with interactive tools;

Flow control commands, such as PREPARE and OPEN, which integrate the standard SQL commands with a procedural programming language.

The Oracle precompilers support embedded SQL. The Oracle precompilers interpret embedded SQL statements and translate them into statements that can be understood by procedural language compilers. Each of these Oracle precompilers translates embedded SQL programs into a different procedural language:

The Pro*Ada precompiler

The Pro*C/C++ precompiler

The Pro*COBOL precompiler

The Pro*FORTRAN precompiler

The Pro*Pascal precompiler

The Pro*PL/I precompiler
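The general idea of mixing SQL with a procedural host language can be sketched without a precompiler. The example below is only an analogy, not Oracle embedded SQL: it assumes Python's built-in sqlite3 module, where the SQL text is passed as a string, a host variable is bound through a ? placeholder, and a cursor plays roughly the role that OPEN and FETCH play in a precompiled program. The function names_in_department is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (empno INTEGER, ename TEXT, deptno INTEGER)")
conn.executemany(
    "INSERT INTO emp VALUES (?, ?, ?)",
    [(7329, "SMITH", 20), (7499, "ALLEN", 30), (7521, "WARD", 30)],
)

def names_in_department(deptno):
    """Host-language procedure that runs one SQL statement with a bound host variable."""
    cursor = conn.cursor()                    # roughly: declare and open a cursor
    cursor.execute(
        "SELECT ename FROM emp WHERE deptno = ? ORDER BY ename", (deptno,)
    )
    names = []
    for (ename,) in cursor:                   # roughly: fetch rows one at a time
        names.append(ename.title())           # ordinary procedural code applied to each row
    cursor.close()
    return names

sales_staff = names_in_department(30)
print(sales_staff)
```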

Database Objects

Oracle supports two types of data objects.

Schema objects: A schema is a collection of logical structures of data, or schema objects. A schema is owned by a database user and has the same name as that user. Each user owns a single schema. Schema objects can be created and manipulated with SQL and include the following types of object:

Clusters            database links   database triggers
Indexes             packages         sequences
Snapshots           snapshot logs    stored functions
Stored procedures   synonyms         tables
Views

Non-schema objects: Other types of objects are also stored in the database and can be created and manipulated with SQL, but are not contained in a schema:

Profiles            roles
Rollback segments   tablespaces
Users

Object Naming Conventions

The following rules apply when naming objects:

Names must be from 1 to 30 characters long, with the following exceptions: names of databases are limited to 8 characters, and names of database links can be as long as 128 characters.

Names cannot contain quotation marks.

Names are not case-sensitive.

A name must begin with an alphabetic character from your database character set unless surrounded by double quotation marks.

Names can only contain alphanumeric characters from your database character set and the characters _, $ and #. You are strongly discouraged from using $ and #.

Names of database links can also contain periods (.) and ampersands (&).
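The naming rules for ordinary (unquoted) object names can be expressed as a small checker. This is a sketch under stated assumptions: ASCII letters stand in for "your database character set", the special cases for quoted names and database links are left out, and the functions is_valid_object_name and same_object_name are hypothetical helpers rather than anything Oracle provides.

```python
import re

# Unquoted name: starts with a letter, then letters, digits, _, $ or #.
# ASCII letters stand in for the database character set (an assumption).
_NAME_PATTERN = re.compile(r"[A-Za-z][A-Za-z0-9_$#]*")

def is_valid_object_name(name, max_length=30):
    """Check an unquoted object name against the length and character rules."""
    if not 1 <= len(name) <= max_length:
        return False
    return _NAME_PATTERN.fullmatch(name) is not None

def same_object_name(a, b):
    """Unquoted names are not case-sensitive, so compare them in one case."""
    return a.upper() == b.upper()

print(is_valid_object_name("PAYMENT_DUE_DATE"))        # begins with a letter, allowed characters
print(is_valid_object_name("1ST_TABLE"))               # begins with a digit
print(is_valid_object_name("ORCLPROD1", max_length=8)) # database names are limited to 8 characters
```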

Object Naming Guidelines

There are several helpful guidelines for naming objects and their parts:

Use full, descriptive, pronounceable names (or well-known abbreviations).

Use consistent naming rules.

Use the same name to describe the same entity or attribute across tables.

When naming objects, balance the objective of keeping names short and easy to use with the objective of making names as long and descriptive as possible. When in doubt, choose the more descriptive name, because many people may use the objects in the database over a period of time. Your counterpart ten years from now may have difficulty understanding a database with names like PMDD instead of PAYMENT_DUE_DATE.

Using consistent naming rules helps users to understand the part that each table plays in your application. One such rule might be to begin the names of all tables belonging to the FINANCE application with FIN_.

Use the same names to describe the same things across tables. For example, the department number columns of the EMP and DEPT tables should both be named DEPTNO.

Internal datatype   Description

VARCHAR2(size)      Variable-length character string having maximum length size bytes. Maximum size is 2000 and minimum is 1. You must specify size for a VARCHAR2.

NUMBER(p,s)         Number having precision p and scale s. The precision p can range from 1 to 38. The scale s can range from -84 to 127.

LONG                Character data of variable length up to 2 gigabytes, or 2^31-1 bytes.

DATE                Valid date range from January 1, 4712 BC to December 31, 4712 AD.

RAW(size)           Raw binary data of length size bytes. Maximum size is 255 bytes. You must specify size for a RAW value.

LONG RAW            Raw binary data of variable length up to 2 gigabytes.

CHAR(size)          Fixed-length character data of length size bytes. Maximum size is 255. Default and minimum size is 1 byte.


Character data types:

Character data types are used to manipulate words and free-form text. These data types are used to store character (alphanumeric) data in the database character set.

CHAR Data Type

The CHAR data type specifies a fixed-length character string. When you create a table with a CHAR column, you can supply the column

Introduction to Query by Example


QBE is a simple point-and-click way for non-technical users to build queries, and is built into Access. This unit will introduce you to the various kinds of queries: the SELECT query, Make-Table query, DELETE query, UPDATE query and APPEND query.

SELECT Queries

A select query is used to retrieve data from one or more tables and display the results in a datasheet, which you can save or modify. You can also use select queries to group records and calculate sums, averages and so on. Any time you create a new query, it is a select query until you tell Access to make it something else. When you run a select query, Access displays the dynaset, which you can then view and change record by record.

How to create a select query by using the wizard

The following steps show you how to create a query to retrieve information about customers and orders from the Northwind sample database that is included with Microsoft Access.

1. Start Microsoft Access, and then open the sample database Northwind.mdb.

2. On the View menu, point to Database Objects, and then click Queries.

3. On the Insert menu, click Query.

4. In the New Query dialog box, click Simple Query Wizard, and then click OK.

5. In the Simple Query Wizard dialog box, click the Customers table in the Tables/Queries list. Double-click each of the following fields to add them to the Selected Fields box: CustomerID, CompanyName, ContactName, ContactTitle.

6. On the same page of the Simple Query Wizard, click the Orders table in the Tables/Queries list. Click >> to add all the fields from the Orders table to the Selected Fields box.

7. Click Finish. The Simple Query Wizard constructs the query, and displays the results in Datasheet view.

How to Create a Select Query in Design View

1. Start Microsoft Access.


2. Open the Northwind.mdb sample database.

3. On the View menu, point to Database Objects, and then click Queries.

4. In the Database Window, double-click Create Query in Design View.

5. In the Show Table dialog box, click Customers, and then click Add.

6. Repeat step 5 for the Orders table.

7. Click Close to close the Show Table dialog box.

8. In the Customers table field list in the top half of the query design window, double-click to add the following fields: CustomerID, CompanyName, ContactName, ContactTitle.

9. In the Orders table field list in the top half of the query design window, double-click the *. Adding the * is the equivalent of selecting all the fields from a particular table.

10. On the File menu, click Save. Type qryCustomerOrders for the name of the query.

11. On the Query menu, click Run to view the results of the query.

Action Queries

While finding and organizing the information stored in your database is an immensely useful task, queries are capable of doing much more. They can actually perform specified tasks for you and make managing your data a great deal easier. Queries that perform jobs for you are known as action queries because they do things with or to the information stored in your tables. A query can look for and modify records that have an unmatched field between two related tables, update records stored in your tables, delete specified records from one or more tables, create a new table for you, or even add records to an existing table. All of these jobs would take a good while to do manually, but a carefully designed query can accomplish them in a matter of seconds.

Access provides four types of action queries:

A Make-Table query creates a new table from all or part of another table or tables.

A Delete query deletes records from a table or tables.

An Append query adds a group of records to an existing table or tables.

An Update query changes the data in a group of records.


The Make-Table Query

A Make-Table query creates a new table by retrieving the records you asked for and using them to make a new table. You can choose the fields from one or more existing tables to include in the new table and can also specify the criteria that must be met for each field.

Delete Query

To remove a number of records that meet the same criteria, it is much faster to use a Delete query than to delete each record separately.

Append Query

An Append query is an action query that adds records from one or more Access tables to another existing table. As with make-table queries, you can append records to a table in the current database or in another database.

Update Query

With an update query, you can change data in existing tables. Update queries allow you to update large quantities of information in a single action.
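Each of the four action queries corresponds to an ordinary SQL statement. The sketch below shows one possible equivalent of each, assuming Python's built-in sqlite3 module instead of Access; the orders and old_orders tables and their rows are hypothetical. Access itself writes a make-table query as SELECT ... INTO, whereas SQLite uses CREATE TABLE ... AS SELECT.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id INTEGER, customer TEXT, amount REAL, year INTEGER)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?, ?)",
    [
        (1, "ALFKI", 120.0, 1997), (2, "ANATR", 80.0, 1997),
        (3, "ALFKI", 200.0, 1998), (4, "BONAP", 50.0, 1998),
    ],
)

# Make-Table query: build a new table from selected rows of an existing one.
conn.execute("CREATE TABLE old_orders AS SELECT * FROM orders WHERE year = 1997")
# Delete query: remove every record that meets the criterion in one action.
conn.execute("DELETE FROM orders WHERE year = 1997")
# Append query: add a group of records to an existing table.
conn.execute("INSERT INTO old_orders SELECT * FROM orders WHERE customer = 'BONAP'")
# Update query: change data in a group of records.
conn.execute("UPDATE orders SET amount = amount * 1.5 WHERE year = 1998")

old_ids = [r[0] for r in conn.execute("SELECT order_id FROM old_orders ORDER BY order_id")]
current = conn.execute("SELECT order_id, amount FROM orders ORDER BY order_id").fetchall()
print(old_ids)
print(current)
```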
