BBIT2 - Database Systems 2
© Stephen Mc Kearney, 2001. 1
Database Architecture 2
References
“Fundamentals of Database Systems”, Elmasri/Navathe, Chapter 2
“Database Systems: A Practical Approach”, Connolly/Begg/Strachan, Chapter 2
Definitions
• Schema
– Description of the database
• Integrity Constraints
– Rules describing the consistency and validity of the data
• Indexes
– Data structure providing fast access to data
• Buffers
– Area of memory used for transferring data between disc and memory
• Three-level Architecture
– External, Conceptual and Internal view of DBMS structure
• SQL
– Database language
• INSERT Statement
– SQL command used to add records into a relation.
Overview
• Components of a DBMS
– Database Manager
– File Manager
– Dictionary Manager
– Language Processor
• Database Languages
– Data Definition Language
– Data Manipulation Language
– Data Query Language
• DBMS Interfaces
– Menus
– Forms
• DBMS Utilities
– Loaders
– Backup
– File Reorganisation
– Performance Monitoring
• Classifications of DBMSs
– Number of Users
– Distribution
– Cost
Components of a DBMS
[Figure: components of a DBMS, showing application programs, queries, database schema, DML pre-processor, query processor, DDL compiler, program object code, database manager, dictionary manager, file manager, access methods, system buffers and the database. From Connolly et al.]
A database management system (DBMS) is a complex piece of software. Its purpose is to store and retrieve large volumes of data in the most efficient way possible.
A DBMS normally consists of one or more program modules that each provide some part of the overall functionality of the system.
These modules include the language processors, the query processor, the database manager, the data dictionary manager and the file manager.
For most DBMSs there are three major types of input: (1) application programs that change data in the database, (2) queries that retrieve data from the database (usually in response to a user’s request) and (3) commands to change the database schema.
Some of the functions of the DBMS may be provided by the underlying operating system, for example, the file manager or the system buffers.
Ref: Connolly, sec. 2.5; Elmasri, sec. 2.4
Database Manager
• Responsible for
– Authorisation and access control
– Command processor
– Integrity checking
– Query optimisation
– Transaction management
– Recovery management
The database manager is responsible for accepting commands from the user, ensuring that commands are valid, calculating the most effective way to execute commands and executing commands. Elmasri et al call the database manager the run-time database processor.
Connolly et al identify the following major responsibilities of the database manager:
Authorisation control: All commands from the user are checked to ensure that the user is allowed to execute them. This process will involve checking the security permissions that have been given to the user and the restrictions that have been placed on the data requested.
Command processor: When a command has been authorised it is carried out by the command processor. Carrying out a command involves selecting the best method of executing the command using the query optimiser and the transaction manager.
Integrity checking: All commands that change the contents of the database must be checked to ensure that they do not introduce errors into the database. Integrity constraints are created by the database administrator.
Query optimisation: The most efficient method of executing a query must be identified. This is done by analysing a variety of possible plans and selecting the best.
Transaction & Recovery management: Large DBMSs use transaction processing to manage very large changes to the database.
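The integrity checking and transaction management roles described above can be illustrated with a small sketch. It assumes Python's built-in sqlite3 module as a stand-in DBMS (the slides themselves use Oracle SQL), and the cut-down emp table and its sample rows are hypothetical. One change in the transaction violates a CHECK constraint, so the DBMS rejects it and the whole transaction is rolled back.

```python
import sqlite3

# In-memory SQLite database used purely as an illustrative DBMS.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE emp ("
    " empno INTEGER PRIMARY KEY,"
    " ename TEXT NOT NULL,"
    " sal REAL CHECK (sal > 500))"   # integrity constraint on salary
)
conn.execute("INSERT INTO emp VALUES (1, 'SMITH', 800)")
conn.commit()

rejected = False
try:
    # "with conn" commits if the block succeeds and rolls back on an error.
    with conn:
        conn.execute("INSERT INTO emp VALUES (2, 'JONES', 1200)")  # valid
        conn.execute("INSERT INTO emp VALUES (3, 'ADAMS', 100)")   # violates CHECK
except sqlite3.IntegrityError:
    rejected = True

# The valid insert of JONES was undone together with the invalid one.
remaining = [row[0] for row in conn.execute("SELECT ename FROM emp ORDER BY empno")]
print(rejected, remaining)
```

The application never checks the salary itself; the rule is declared once in the schema and the DBMS enforces it on every change.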
Ref: Connolly, p59; Elmasri, sec. 2.4
File Manager
• Responsible for
– Allocating disc storage
– Maintaining files and indexes
– Managing system buffers in main memory
– Transferring blocks between discs and buffers
• Functionality may be provided by the underlying operating system.
The file manager is responsible for the operation of all the discs and buffers used by the DBMS. Elmasri et al call the file manager the stored data manager.
The file manager will manage the following components:
The allocation of disc space in which to store the data in the database system. The storage of data in a large DBMS can be a very complicated process because of the complexity of the data structures required to make retrieving the data efficient.
The indexes and hash functions used to improve the performance of queries and updates to the database. Indexes must be automatically updated when data is added to or removed from the database.
A large DBMS will require many main memory buffers (sometimes called caches). Buffers are used to store data that has been written to the database but not yet stored on the disc. When buffers become full their contents must be written to the disc. The use of buffers in a DBMS can greatly affect the overall performance of the database.
The functionality of the file manager may be provided by the underlying operating system, for example, UNIX or MS-DOS. That is, the operating system will be responsible for managing buffers and disc allocation. This is normally true for small-scale database systems.
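The buffering behaviour described above can be sketched as a toy write-back buffer pool. This is only an illustration under stated assumptions: the BufferPool class, its least-recently-used eviction rule and the dictionary standing in for the disc are all hypothetical, and a real file manager is far more elaborate.

```python
from collections import OrderedDict


class BufferPool:
    """Toy write-back buffer: blocks stay in memory until evicted or flushed."""

    def __init__(self, disc, capacity):
        self.disc = disc                # dict standing in for disc blocks
        self.capacity = capacity        # maximum number of buffered blocks
        self.buffers = OrderedDict()    # block_id -> (data, dirty flag)

    def _make_room(self):
        # When the pool is full, evict the least recently used block,
        # writing it to the disc first if it has been changed.
        if len(self.buffers) >= self.capacity:
            victim, (data, dirty) = self.buffers.popitem(last=False)
            if dirty:
                self.disc[victim] = data

    def read(self, block_id):
        if block_id not in self.buffers:
            self._make_room()
            self.buffers[block_id] = (self.disc[block_id], False)
        self.buffers.move_to_end(block_id)
        return self.buffers[block_id][0]

    def write(self, block_id, data):
        if block_id not in self.buffers:
            self._make_room()
        self.buffers[block_id] = (data, True)   # changed in memory only
        self.buffers.move_to_end(block_id)

    def flush(self):
        # Force every changed block out to the disc.
        for block_id, (data, dirty) in list(self.buffers.items()):
            if dirty:
                self.disc[block_id] = data
                self.buffers[block_id] = (data, False)


disc = {1: "a", 2: "b", 3: "c"}
pool = BufferPool(disc, capacity=2)
pool.write(1, "A")        # the change lives only in the buffer
before = disc[1]          # the disc still holds the old value
pool.read(2)
pool.read(3)              # pool is full, so block 1 is evicted and written
after = disc[1]
print(before, after)
```

The write to block 1 reaches the disc only when the pool runs out of space, which is why buffer management has such a large effect on performance and on what is lost after a failure.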
Ref: Connolly, p59; Elmasri, sec 2.4
Dictionary Manager
• Responsible for
– Keeping the data dictionary up-to-date
– Providing information about the database schema
– Storing integrity constraints
– Storing authorisation permissions
The dictionary manager is responsible for managing all aspects of the data dictionary. Elmasri et al call the dictionary manager the data dictionary system.
The data dictionary is a database that stores information about all the data stored in the DBMS, for example, descriptions of the tables and attributes.
The dictionary manager is responsible for:
Updating the data dictionary when the database schema changes, for example, when a new table is added.
Providing the database manager with information about the content of the database.
Storing information about the integrity constraints and authorisation permissions of the users. This allows the database manager to fulfil its role of enforcing the integrity and security constraints of the database administrator.
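As a rough illustration of a data dictionary, the sketch below uses SQLite (through Python's sqlite3 module), whose catalogue table is called sqlite_master; other DBMSs name and structure their dictionaries differently. The dept table and the catalogue_tables helper are hypothetical. It shows the catalogue being updated by a schema change and then queried for table and constraint information.

```python
import sqlite3

conn = sqlite3.connect(":memory:")


def catalogue_tables(connection):
    """List the tables currently recorded in SQLite's catalogue."""
    rows = connection.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
    return [row[0] for row in rows]


before = catalogue_tables(conn)            # nothing defined yet

# A schema change: the DBMS records the new table in its catalogue.
conn.execute("CREATE TABLE dept (deptno INTEGER PRIMARY KEY, dname TEXT NOT NULL)")
after = catalogue_tables(conn)

# The stored definition includes the integrity constraints.
definition = conn.execute(
    "SELECT sql FROM sqlite_master WHERE name = 'dept'"
).fetchone()[0]

# Column descriptions: each row is (cid, name, type, notnull, default, pk).
columns = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(dept)")]
print(before, after, columns)
```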
Ref: Connolly, p60; Elmasri, sec. 2.4
Language Processors
• Includes
– Data Manipulation Language Pre-processor
– Data Definition Language Compiler
• Responsible for
– Checking commands to the DBMS are correct
– Translating commands into machine readable form
– Work with the query processor
There are many languages in a DBMS. The language processor actually consists of a variety of different processes.
The data manipulation language pre-processor is responsible for converting manipulation commands, for example, the SQL insert command, into commands that may be executed by the database manager.
The data definition language compiler is responsible for converting commands that define the structure of the database into entries in the data dictionary.
The language processor is responsible for checking that all commands are correct and for translating each command into a form that may be executed by the database manager.
The language processor must also work with the query processor to produce the most efficient method of answering queries.
The data manipulation commands are often stored in application programs and they must be converted into commands that can be understood by the database manager.
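The checking and translating steps can be observed in a small sketch, assuming SQLite via Python's sqlite3 module as the DBMS. SQLite's EXPLAIN prefix compiles a command into its internal instructions without running it; the check helper and the emp table are hypothetical, and the instruction format is specific to SQLite.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (empno INTEGER, ename TEXT)")


def check(command):
    """Ask the DBMS to compile a command without executing it."""
    try:
        program = conn.execute("EXPLAIN " + command).fetchall()
    except sqlite3.Error as exc:
        return False, str(exc)              # rejected by the language processor
    return True, f"{len(program)} low-level instructions"


ok_valid, msg_valid = check("INSERT INTO emp VALUES (1, 'SMITH')")
ok_typo, msg_typo = check("INSER INTO emp VALUES (1, 'SMITH')")         # misspelt keyword
ok_missing, msg_missing = check("INSERT INTO staff VALUES (1, 'SMITH')")  # unknown table

# Nothing was inserted: the commands were only checked and translated.
count = conn.execute("SELECT COUNT(*) FROM emp").fetchone()[0]
print(ok_valid, ok_typo, ok_missing, count)
```

The second command fails on syntax alone, while the third is well-formed but refers to a table the data dictionary does not know about.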
Ref: Elmasri, sec. 2.3
http://www.openlinksw.com/virtuoso/virtuowp/virtuowp.htm#_Toc430023374
Database Components
Oracle Concepts Manual
Oracle Instance
Database Languages
• Users communicate with the DBMS through a database language.
• A database language is simpler than a programming language.
• Types of database languages
– Data Definition Language
– Data Manipulation Language
– Data Query Language
Users of a DBMS communicate with the database by giving it commands to execute. These commands are expressed using a database language, for example, SQL.
A database language consists of a set of commands that allow the user to change the database’s structure and content, for example, creating new tables or inserting new records.
A database language is normally not as complex as a programming language. It does not, for instance, contain while-loops or for-loops.
There are three important types of database language:
Data Definition Language
Data Manipulation Language
Data Query Language
Each database language deals with a different aspect of the database. For example, the data definition language provides commands to change the structure of the database.
Users can execute commands by either:
Sending the commands directly to the DBMS, or
Embedding the commands in a programming language as part of an application program.
Modern database languages do not distinguish between the different language types. For example, SQL contains commands for all the language types.
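Embedding commands in an application program might look like the sketch below. It assumes Python as the host language and SQLite (the sqlite3 module) as the DBMS; the cut-down emp table is hypothetical. The same connection accepts a data definition command, data manipulation commands and a query, which shows SQL covering all three language types.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Data definition: describe the structure of the database.
conn.execute(
    "CREATE TABLE emp (empno INTEGER PRIMARY KEY, ename TEXT NOT NULL, deptno INTEGER)"
)

# Data manipulation: change the content. The ? placeholders pass values
# from the host program into the SQL command.
conn.execute(
    "INSERT INTO emp (empno, ename, deptno) VALUES (?, ?, ?)",
    (7890, "JINKS", 40),
)
conn.execute("UPDATE emp SET deptno = 30 WHERE ename = 'JINKS'")

# Data query: retrieve the content back into program variables.
rows = conn.execute("SELECT empno, ename, deptno FROM emp").fetchall()
print(rows)
```

Loops and other control flow stay in the host language, while the database language only states what to define, change or retrieve.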
Ref: Connolly, sec. 2.2; Elmasri, sec 2.3
The data definition language (DDL) is used to describe the structure of the database. It allows the database administrator to describe the relations, attributes and integrity constraints of the system.
Connolly et al define the DDL as a “descriptive language that allows the DBA or user to describe and name entities required for the application and the relationships that may exist between the different entities”.
When the DBMS receives a DDL command it:
1. Creates or changes the underlying file structures that are used to implement the database. For example, a file might be created for each new relation created by a CREATE TABLE DDL command.
2. Changes the data dictionary to record the change that has been made to the database structure. For example, a new record may be added to the data dictionary describing the structure of a newly created relation.
In the three-level schema, Elmasri et al identify three different data definition languages:
1. A view definition language for creating entities and relationships at the external level and mapping them to the conceptual schema.
2. A data definition language for creating entities and relationships at the conceptual level and mapping them to the internal level.
3. A storage definition language for creating file and index structures at the internal level.
Ref: Connolly, sec. 2.2; Elmasri, sec 2.3
Data Definition Language
• Used to describe the database schema
– Creating relations, attributes, etc.
– Declaring integrity constraints
• Results of executing a DDL command
– Updated data dictionary
– New file structures
– Deleted file structures
CREATE TABLE scott.emp (
    empno    NUMBER        CONSTRAINT pk_emp PRIMARY KEY,
    ename    VARCHAR2(10)  CONSTRAINT nn_ename NOT NULL
                           CONSTRAINT upper_ename CHECK (ename = UPPER(ename)),
    job      VARCHAR2(9),
    mgr      NUMBER        CONSTRAINT fk_mgr REFERENCES scott.emp(empno),
    hiredate DATE          DEFAULT SYSDATE,
    sal      NUMBER(10,2)  CONSTRAINT ck_sal CHECK (sal > 500),
    comm     NUMBER(9,0)   DEFAULT NULL,
    deptno   NUMBER(2)     CONSTRAINT nn_deptno NOT NULL
                           CONSTRAINT fk_deptno REFERENCES scott.dept(deptno))
    PCTFREE 5 PCTUSED 75;
Data Definition - Creating a Table
Data Manipulation Language
• Used to change the content of the database
– Insertion
– Deletion
• Two types of DML
– Procedural (How to make changes)
– Non-procedural (What changes to make)
The data manipulation language (DML) is used to make changes to the content of the database. For example, in SQL the DML command INSERT INTO inserts a new tuple into a relation.
Connolly et al define the DML as a “language that provides a set of operations that support the basic data manipulation operations on the data held in the database”.
The DML includes commands to:
Insert new data into the database,
Delete existing data from the database, and
Modify existing data in the database.
In the three-level schema a DML is used to make changes at the external and conceptual levels of the schema. The internal level DML is more complex because it must handle low level file and index structures.
There are two main types of DML:
1. Procedural DMLs describe how the desired changes should be made to the database. For example, they will allow the user to describe the process to be used by the DBMS to update the database.
2. Non-procedural DMLs describe what the desired changes are but not how to actually perform the changes. The DBMS must select the best method of making the changes in the database. Relational DBMSs use non-procedural languages, for example, SQL.
A non-procedural DML allows the user to concentrate on what they require rather than how to get it.
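The difference between the two styles can be seen side by side in a short sketch. It assumes Python as the procedural language and SQLite (via sqlite3) for the non-procedural command; the three sample records are made up for illustration. Both halves give employee 15 a rise of 100.

```python
import sqlite3

records = [(14, 900), (15, 1000), (16, 1100)]   # (empno, salary)

# Procedural: the program spells out HOW to find and change the record.
procedural = []
for empno, salary in records:
    if empno == 15:
        salary = salary + 100
    procedural.append((empno, salary))

# Non-procedural: the command states WHAT should change; the DBMS
# decides how to locate the affected rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (empno INTEGER PRIMARY KEY, salary INTEGER)")
conn.executemany("INSERT INTO emp VALUES (?, ?)", records)
conn.execute("UPDATE emp SET salary = salary + 100 WHERE empno = 15")
declarative = conn.execute("SELECT empno, salary FROM emp ORDER BY empno").fetchall()

print(procedural == declarative)
```

The procedural loop must visit every record, whereas the UPDATE leaves the DBMS free to use an index or any other access path.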
Ref: Connolly, sec 2.2; Elmasri, sec 2.3
INSERT INTO emp (empno, ename, job, sal, comm, deptno)
    VALUES (7890, 'JINKS', 'CLERK', 1200, NULL, 40);

INSERT INTO bonus
    SELECT ename, job, sal, comm
    FROM emp
    WHERE comm > 0.25 * sal
       OR job IN ('PRESIDENT', 'MANAGER');

UPDATE emp SET empno = 1356 WHERE ename = 'SMITH';
Data Manipulation - Inserting Record
Data Manipulation
• Procedural
Input.openFile("emp.dat");
Output.openFile("emp.out");
r = Input.readInt();              // employee number; -1 marks the end of the file
salary = Input.readInt();
while (r != -1) {
    if (r == 15) {                // comparison (==), not assignment (=)
        salary = salary + 100;
    }
    Output.writeInt(r);           // copy every record so that none are lost
    Output.writeInt(salary);
    r = Input.readInt();
    salary = Input.readInt();
}
Output.close();
Input.close();
• Non-Procedural
update emp
set salary = salary + 100
where empno = 15;
Data Query Language
• Used to retrieve data from the database
– The DQL is part of the data manipulation language.
• e.g. the SELECT statement in SQL
• Types of DQL
– Procedural (How to retrieve the data)
– Non-procedural (What data to retrieve)
The data query language (DQL) is used to retrieve data from the database. It is part of the data manipulation language. For example, the SELECT statement in SQL is the DQL component of SQL. Using the SELECT statement the user can express all queries on the database.
Connolly et al define a DQL as a “high-level special-purpose language used to satisfy diverse requests for the retrieval of data held in the database”.
As with the data manipulation language, the DQL can be either procedural or non-procedural.
A procedural DQL describes how to answer the query. For example, a procedural DQL describes the tables that should be accessed, the indexes to use in accessing the tables and the order in which to access them.
A non-procedural DQL describes what data is required to answer the query but not how to retrieve the data. For example, a non-procedural DQL will not describe the indexes that should be accessed.
A non-procedural DQL is used in relational DBMSs because this allows the DBMS to decide the best strategy for accessing the data. This is particularly important when many users are accessing the same data.
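The point that the DBMS, not the user, chooses the access strategy can be observed in the following sketch. It assumes SQLite via Python's sqlite3 module, whose EXPLAIN QUERY PLAN command reports the chosen strategy (the wording of the report is specific to SQLite). The table contents, the plan helper and the index name idx_emp_deptno are hypothetical. The query text never changes, yet the plan does once an index exists.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (empno INTEGER PRIMARY KEY, ename TEXT, deptno INTEGER)")
conn.executemany(
    "INSERT INTO emp VALUES (?, ?, ?)",
    [(i, f"E{i}", (i % 4) * 10) for i in range(1, 101)],   # departments 0, 10, 20, 30
)

query = "SELECT * FROM emp WHERE deptno = 30"


def plan(sql):
    """Return SQLite's description of how it intends to run the query."""
    return " ".join(row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))


plan_before = plan(query)                    # no index yet: the table is scanned
rows_before = conn.execute(query).fetchall()

conn.execute("CREATE INDEX idx_emp_deptno ON emp (deptno)")

plan_after = plan(query)                     # same query, now answered via the index
rows_after = conn.execute(query).fetchall()

print(plan_before)
print(plan_after)
```

The answer is identical in both cases; only the strategy differs, which is exactly what a non-procedural query language leaves to the DBMS.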
Ref: Connolly, sec 2.2; Elmasri, sec 2.3
SELECT * FROM emp WHERE deptno = 30;

SELECT deptno, MIN(sal), MAX(sal)
    FROM emp
    WHERE job = 'CLERK'
    GROUP BY deptno;

SELECT deptno, MIN(sal), MAX(sal)
    FROM emp
    WHERE job = 'CLERK'
    GROUP BY deptno
    HAVING MIN(sal) < 1000;
Data Query - Select
Data Query
• Procedural
Input.openFile("emp.dat");
r = Input.readInt();              // employee number; -1 marks the end of the file
salary = Input.readInt();
while (r != -1) {
    if (r == 15) {                // comparison (==), not assignment (=)
        Output.writeInt(r);
        Output.writeInt(salary);
    }
    r = Input.readInt();
    salary = Input.readInt();
}
Input.close();
• Non-Procedural
select * from emp
where empno = 15;
DBMS Interfaces
• 4GL
– ‘Non-procedural’ programming languages
• Forms
– User interfaces
• Menus
– Introductory screens
• Reports
– Formal printed reports
DBMS Utilities
• Loaders
– Loads or extracts large amounts of data from the database
• Backup
– Copies data in case of a failure
• File Reorganiser
– Improves performance by reorganising data
• Performance Monitor
– Monitors the DBMS
A large DBMS provides many tools that the database administrator can use to manage the database.
Loader: A loader is a piece of software that loads data from a file into the database or extracts data from the database into a file. It is used because using many INSERT statements may be too slow.
Backup: To safeguard the data it is important to back up the database on a regular basis. Special tools are provided to perform this function.
File Reorganiser: In a large database changing the structure of files can be a slow process. It can also be difficult to understand how the current structure is performing. The file reorganiser changes the structure of the database to improve its efficiency.
Performance Monitor: The performance monitor allows the database administrator to investigate the performance of the database. It will indicate how slow or fast the DBMS is performing and indicate any problems with the system. The monitor provides statistics on all aspects of the system.
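The loader and backup utilities can be imitated in a few lines, as a sketch only. It assumes Python's csv and sqlite3 modules in place of a vendor's dedicated tools; the CSV text and table are made up, and Connection.backup requires Python 3.7 or later. The loader reads the whole file and hands all rows to the DBMS in one batch inside a single transaction rather than issuing separate INSERT commands.

```python
import csv
import io
import sqlite3

# Stand-in for a data file on disc (hypothetical contents).
csv_text = "empno,ename,sal\n1,SMITH,800\n2,ALLEN,1600\n3,WARD,1250\n"

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (empno INTEGER PRIMARY KEY, ename TEXT, sal REAL)")

# Loader: parse the file, then insert every row in one batch and one transaction.
reader = csv.DictReader(io.StringIO(csv_text))
rows = [(int(r["empno"]), r["ename"], float(r["sal"])) for r in reader]
with conn:
    conn.executemany("INSERT INTO emp VALUES (?, ?, ?)", rows)

# Backup: copy the whole database into a second database.
backup_conn = sqlite3.connect(":memory:")
conn.backup(backup_conn)

loaded = conn.execute("SELECT COUNT(*) FROM emp").fetchone()[0]
backed_up = backup_conn.execute("SELECT COUNT(*) FROM emp").fetchone()[0]
print(loaded, backed_up)
```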
Ref: Elmasri, sec 2.4.
Classifications of DBMSs
• Type of Data Model
– e.g. relational, network, hierarchical
• Number of users
– e.g. single-user, multi-user
• Distribution
– e.g. number of sites
• Cost
There are many different types of DBMS. Depending on the requirements of the user the database administrator must select the most appropriate DBMS.
For example, if the database is to hold details of one million customers and is to be accessed by 1500 salespeople from numerous locations then a very complex DBMS is required. However, if the database is to store a list of 100 customers for a small manufacturing firm and is to be accessed by the firm’s one salesperson then a simple DBMS will be sufficient.
The major differences between DBMS packages include:
The type of data model used to describe the data. The most popular data models include relational, network and hierarchical. Newer models include object-oriented databases.
The number of users that will be accessing the data, for example, 100, 1000, or 2000 users.
The distribution of the data across sites in a network. A database may be contained on one site (centralised) or many sites (distributed). The databases in a distributed system may use the same DBMS software (homogeneous) or different DBMS software (heterogeneous).
The cost of the DBMS software and equipment.
Ref: Elmasri, sec 2.5.