Chapter 1 database introduction

This document provides a clear introduction to the fundamentals of databases.

Chapter 1: Introduction to Database Systems
Fundamentals of Database Systems (INSY 321), 2013-10-22

Chapter 1 - Objectives
- Some common uses of database systems.
- Characteristics of file-based systems.
- Problems with the file-based approach.
- Meaning of the term database.
- Meaning of the term Database Management System (DBMS).
- Typical functions of a DBMS.
- Major components of the DBMS environment.
- Personnel involved in the DBMS environment.
- History of the development of DBMSs.
- Advantages and disadvantages of DBMSs.

Chapter 1 - Objectives (Contd.)
- Purpose of the three-level ANSI-SPARC database architecture.
- Contents of the external, conceptual, and internal levels.
- Purpose of the external/conceptual and conceptual/internal mappings.
- Meaning of logical and physical data independence.
- Distinction between DDL and DML.
- A classification of data models.
- Architectures for multi-user database systems.

Database Systems
Today, databases are essential to every business. They are used to maintain internal records, to present data to customers and clients on the World Wide Web, and to support many other commercial processes. Databases are likewise found at the core of many modern organizations.

Examples of Database Applications
- Purchases from the supermarket
- Purchases using your credit card
- Booking a holiday at the travel agent's
- Using the local library
- Taking out insurance
- Renting a video
- E-commerce
- Banking
- Social media (and the WWW in general)

Data Handling Approaches in Organizations
Data management has passed through several levels of development, in step with developments in technology and services. These can best be described as three levels. Each new level brings an advantage and a problem to overcome, and all three methods of data handling are still in use to some extent today.

The three major levels are:
- Manual approach
- Traditional file-based approach
- Database approach

Manual Approach
- Cards and paper are used to record data.
- Files are kept for as many events and objects as the organization has.
- Each file, containing various kinds of information, is labelled and stored in one or more cabinets.
- Cabinets may be kept in safe places (locked cabinets), depending on the sensitivity of the information they contain.
- Insertion and retrieval are done by searching first for the right cabinet, then for the right file, and then for the information.
- An indexing system can be used to facilitate access to the data.

Limitations of the manual approach
- Prone to error
- Difficult to update, retrieve, and integrate
- The data is there, but it is difficult to compile into information
- Limited to small volumes of information
- Cross-referencing is difficult

File-Based Systems
- A collection of application programs that perform services for the end users (e.g. reports).
- Each program defines and manages its own data.

[Figure: File-Based Processing]
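
To make "each program defines and manages its own data" concrete, here is a minimal sketch in Python. The two programs, their record layouts, and the sample record are all hypothetical; the point is only that the file structure lives in each program's code, so a second program with a different built-in layout cannot read the first program's records.

```python
import struct

# Hypothetical record layout owned by a "payroll" program:
# a 10-byte name followed by a 4-byte unsigned salary.
PAYROLL_LAYOUT = "<10sI"

def payroll_write(name: str, salary: int) -> bytes:
    """Encode one record exactly as the payroll program stores it."""
    return struct.pack(PAYROLL_LAYOUT, name.encode("ascii"), salary)

def payroll_read(record: bytes) -> tuple[str, int]:
    """Decode a record using the layout hard-coded in the payroll program."""
    raw_name, salary = struct.unpack(PAYROLL_LAYOUT, record)
    return raw_name.rstrip(b"\x00").decode("ascii"), salary

# A second, hypothetical "personnel" program hard-codes a different layout
# (an 8-byte name), so it cannot read the payroll program's records.
PERSONNEL_LAYOUT = "<8sI"

def personnel_read(record: bytes) -> tuple[str, int]:
    raw_name, salary = struct.unpack(PERSONNEL_LAYOUT, record)
    return raw_name.rstrip(b"\x00").decode("ascii"), salary

record = payroll_write("Abebe", 5000)  # made-up sample data
print(payroll_read(record))

try:
    personnel_read(record)
except struct.error as exc:
    print("personnel program cannot read the record:", exc)
```

Changing the name field from 10 to 12 bytes would require editing and redeploying every program that embeds the layout.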

Limitations of File-Based Approach

Separation and isolation of data
- Each program maintains its own set of data.
- Users of one program may be unaware of potentially useful data held by other programs.

Duplication of data
- The same data is held by different programs.
- Wasted space, and potentially different values and/or different formats for the same item.

Data dependence (program-data dependence)
- File structure is defined in the program code.
- Any change in the data structure necessitates a change in the program as well.

Incompatible file formats
- Programs are written in different languages (e.g. C, COBOL), and so cannot easily access each other's files.
- Data structures differ between languages.

Fixed queries / proliferation of application programs
- Programs are written to satisfy particular functions.
- Any new requirement needs a new program.

Database Approach
Problems of the file-based approach arose because:
- The definition of data was embedded in application programs, rather than being stored separately and independently of the applications.
- There was no control over access and manipulation of data beyond that imposed by application programs.

Result: the database and the Database Management System (DBMS). This approach solves the problems of the file-based approach.

Database
- A shared collection of logically related data (and a description of this data), designed to meet the information needs of an organization.
- The system catalogue (metadata) provides a description of the data to enable program-data independence.
- Logically related data comprises the entities, attributes, and relationships of an organization's information.

Database Management System (DBMS)
- A software system that enables users to define, create, maintain, and control access to the database.
- (Database) application program: a computer program that interacts with the database by issuing an appropriate request (an SQL statement) to the DBMS.

[Figure: Database Management System (DBMS)]
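
As an illustration of an application program issuing requests to a DBMS, the sketch below uses Python's built-in `sqlite3` module, with SQLite standing in for the DBMS. The `member` table and the `add_member` / `find_member` functions are assumptions made up for this example (loosely based on the "local library" use case); they are not part of the original slides.

```python
import sqlite3

# SQLite (bundled with Python) plays the role of the DBMS in this sketch.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE member (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")

def add_member(name: str) -> int:
    """Ask the DBMS to store a new member and return the generated id."""
    cur = conn.execute("INSERT INTO member (name) VALUES (?)", (name,))
    conn.commit()
    return cur.lastrowid

def find_member(member_id: int):
    """Ask the DBMS for one member's name; returns None if absent."""
    row = conn.execute(
        "SELECT name FROM member WHERE id = ?", (member_id,)
    ).fetchone()
    return row[0] if row else None

new_id = add_member("Sara")  # made-up sample data
print(find_member(new_id))
```

The program never opens or parses a data file itself; it only states requests in SQL and leaves storage and retrieval to the DBMS.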

Database Approach (Contd.)

Data definition language (DDL)
- Permits specification of data types, structures, and any data constraints. All specifications are stored in the database.
- Enables the creation, alteration, and removal of database objects.

Data manipulation language (DML)
- General enquiry facility (query language) for retrieving data.
- In addition to querying, supports manipulation of data: adding new information, updating information, and deleting information.

Database Approach - Access Control
Controlled access to the database (user/role definition, privilege assignment/revocation, access enforcement) may include:
- a security system
- an integrity system
- a concurrency control system
- a recovery control system
- a user-accessible catalogue

Database Views
- The database approach introduces some complexity for the end user.
- Although the database is a shared collection, users are interested only in their specific data needs.
- A view allows each user to have his or her own view of the database.
- A view is essentially some subset of the database; data irrelevant to a user is not visible at all.
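
The DDL, DML, and view ideas above can be sketched with Python's `sqlite3` module. The `staff` table, its columns, the `staff_directory` view, and the sample rows are hypothetical. SQLite has no user accounts or GRANT/REVOKE, so the view here only illustrates presenting a subset of the data, not enforced access control.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: define a structure, its data types, and constraints.
conn.execute("""
    CREATE TABLE staff (
        staff_no INTEGER PRIMARY KEY,
        name     TEXT NOT NULL,
        branch   TEXT NOT NULL,
        salary   INTEGER NOT NULL CHECK (salary >= 0)
    )
""")

# DML: add, update, and delete data (made-up sample rows).
conn.executemany(
    "INSERT INTO staff (staff_no, name, branch, salary) VALUES (?, ?, ?, ?)",
    [(1, "Sara", "B1", 9000), (2, "Dawit", "B1", 7000), (3, "Hana", "B2", 8000)],
)
conn.execute("UPDATE staff SET salary = 7500 WHERE staff_no = 2")
conn.execute("DELETE FROM staff WHERE staff_no = 3")

# DDL again: a view exposing only a subset of the data (no salary column).
conn.execute(
    "CREATE VIEW staff_directory AS SELECT staff_no, name, branch FROM staff"
)

# DML: query through the view.
directory = conn.execute(
    "SELECT * FROM staff_directory ORDER BY staff_no"
).fetchall()
print(directory)
```

A user who works only with `staff_directory` sees names and branches but never the salary column.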

Views - Benefits
- Reduce complexity
- Provide a level of security
- Provide a mechanism to customize the appearance of the database
- Present a consistent, unchanging picture of the structure of the database, even if the underlying database structure is changed

Components of DBMS Environment
A DBMS environment has five basic components:
- Hardware
- Software
- Data
- Procedures
- People

Hardware
- Can range from a PC to a network of computers.
- Includes all the necessary input, output, storage, and backup devices.

Software
- DBMS, operating system, network software (if necessary), and the application programs.

Data
- Data used by the organization, and a description of this data called the schema.

Components of DBMS Environment (Contd.)

Procedures
- Instructions and rules that should be applied to the design and use of the database and DBMS.

People
- The different roles taken by people while designing and using a database system.

Roles in the Database Environment

Data Administrator (DA)
- Responsible for the management of data resources. This involves database planning, and the development and maintenance of standards, policies, and procedures, at the conceptual and logical design phases.

Database Administrator (DBA)
- A more technically oriented role. The DBA is responsible for the physical realization of the database, including physical design, implementation, and security and integrity control.
- Also deals with optimizing the performance of the system.

Database Designers (Logical and Physical)
- Identify the data to be stored and choose appropriate structures to represent and store it.
- Should understand the user requirements and decide how the user views the database.
- Are involved in the design phase, before the implementation of the database system.
- There are two kinds of database designer: one involved in logical and conceptual design, the other in physical design.

Database Designers (Contd.)

Logical and conceptual database designer
- Identifies the data (entities, attributes, and relationships) relevant to the organization.
- Identifies the constraints on each data item.
- Understands the data and business rules of the organization.
- Sees the database independently of any data model at the conceptual level, and considers one specific data model at the logical design phase.

Physical database designer
- Takes the logical design specification as input and decides how it should be physically realized.
- Maps the logical data model onto the specified DBMS in terms of tables and integrity constraints (DBMS-dependent design).
- Selects the specific storage structures and access paths to the database.
- Designs the security measures required on the database.

Application Programmers
- The system analyst determines the user requirements and how the user wants to view the database.
- The application programmer implements these specifications as programs: coding, testing, debugging, documenting, and maintaining the application program.
- The application programmer determines the interface for retrieving, inserting, updating, and deleting data in the database.

End Users (Naive and Sophisticated)

Naive users
- A sizable proportion of users
- Unaware of the DBMS
- Only access the database based on their access level and demand
- Use standard, pre-specified types of queries

Sophisticated users
- Familiar with the structure of the database and the facilities of the DBMS
- Have complex requirements
- Use higher-level queries

History of Database Systems
- First generation: hierarchical and network
- Second generation: relational
- Third generation: object-relational and object-oriented

Advantages of DBMSs
- Control of data redundancy
- Data consistency
- More information from the same amount of data
- Sharing of data
- Improved data integrity
- Improved security
- Enforcement of standards
- Economy of scale
- Balance of conflicting requirements
- Improved data accessibility and responsiveness
- Increased productivity
- Improved maintenance through data independence
- Increased concurrency
- Improved backup and recovery services

Disadvantages of DBMSs
- Complexity
- Size
- Cost of DBMS
- Additional hardware costs
- Cost of conversion
- Performance
- Higher impact of a failure

Objectives of Three-Level ANSI-SPARC Architecture
- All users should be able to access the same data.
- A user's view is immune to changes made in other users' views.
- Users should not need to know physical database storage details.
- The DBA should be able to change database storage structures without affecting the users' views.
- The internal structure of the database should be unaffected by changes to physical aspects of storage.
- The DBA should be able to change the conceptual structure of the database without affecting all users.

[Figure: ANSI-SPARC Three-Level Architecture]

ANSI-SPARC Three-Level Architecture

External level
- The users' view of the database.
- Describes the part of the database that is relevant to a particular user.

Conceptual level
- The community view of the database.
- Describes what data is stored in the database and the relationships among the data.

Internal level
- The physical representation of the database on the computer.
- Describes how the data is stored in the database.
- This is how the OS and DBMS view the data.

[Figure: Differences between Three Levels of ANSI-SPARC Architecture]

Data Independence
The main concept: upper levels are immune to changes in the lower levels.

Logical data independence
- Refers to the immunity of external schemas to changes in the conceptual schema.
- Conceptual schema changes (e.g. addition or removal of entities) should not require changes to external schemas or rewrites of application programs.
- The particular users concerned will be affected, but not all users.

Physical data independence
- Refers to the immunity of the conceptual schema to changes in the internal schema.
- Internal schema changes (e.g. using different file organizations or storage structures/devices) should not require changes to the conceptual or external schemas.

Schema Mapping - Provision for Data Independence
- ANSI-SPARC describes organizational data at three levels; the DBMS is responsible for mapping between these three types of schema.
- External/conceptual mapping: enables the DBMS to map names in the user's view onto the relevant part of the conceptual schema.
- Conceptual/internal mapping: enables the DBMS to find the actual record or combination of records in physical storage that constitute a logical record in the conceptual schema, together with any constraints to be enforced on the operations for that logical record.

[Figure: Data Independence and the ANSI-SPARC Three-Level Architecture]
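
Both kinds of data independence can be observed in a small experiment with Python's `sqlite3` module. This is a sketch under assumptions: the `staff` table, the `branch_b1` view (standing in for an external schema), and the sample rows are hypothetical, and SQLite only approximates the three-level architecture.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE staff (staff_no INTEGER PRIMARY KEY, name TEXT, branch TEXT)"
)
conn.executemany(
    "INSERT INTO staff VALUES (?, ?, ?)",
    [(1, "Sara", "B1"), (2, "Dawit", "B2")],  # made-up sample rows
)

# External schema: one user's view of the data.
conn.execute(
    "CREATE VIEW branch_b1 AS "
    "SELECT staff_no, name FROM staff WHERE branch = 'B1'"
)

def user_query():
    """The application's query, written once against the view."""
    return conn.execute(
        "SELECT staff_no, name FROM branch_b1 ORDER BY staff_no"
    ).fetchall()

before = user_query()

# Internal-level change: add an index (a different access path).
conn.execute("CREATE INDEX idx_staff_branch ON staff (branch)")
after_index = user_query()

# Conceptual-level change: add a new attribute to the base table.
conn.execute("ALTER TABLE staff ADD COLUMN salary INTEGER")
after_column = user_query()

print(before == after_index == after_column)
```

The query text in `user_query` never changes: the index alters only how the data is reached, and the new column is simply not part of this user's view.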

Database Languages

Data Definition Language (DDL)
- Allows the DBA or user to describe and name the entities, attributes, and relationships required for the application, plus any associated integrity and security constraints.

Data Manipulation Language (DML)
- Provides basic data manipulation operations on data held in the database.
- Procedural DML allows the user to tell the system exactly how to manipulate data.
- Non-procedural DML allows the user to state what data is needed rather than how it is to be retrieved.
- Fourth generation languages (4GLs)
- Automated CASE tools

Data Model
An integrated collection of concepts for describing data, relationships between data, and constraints on the data in an organization.

A data model comprises:
- a structural part;
- a manipulative part;
- possibly a set of integrity rules.

There can be three types of model, in line with the ANSI-SPARC architecture:
- an external data model, to represent each user's view of the organization, sometimes called the Universe of Discourse (UoD);
- a conceptual data model, to represent the logical (or community) view that is DBMS-independent;
- an internal data model, to represent the conceptual schema in such a way that it can be understood by the DBMS.

Purpose: to represent data in an understandable way.

Categories of data models include:
- Object-based
- Record-based
- Physical

Data Models

Object-based data models (based on the concept of an entity, a distinct object)
- Entity-Relationship: considers only the data aspect.
- Object-oriented: considers both data and behaviour.

Record-based data models (hierarchical, network, and relational)
- Based on fixed-format records.
- Each record has a fixed number of fields, and each field is of a fixed length.

Physical data models
- Models for describing physical storage characteristics.

Hierarchical Data Model
- A record type is referred to as a node or segment.
- The top node is the root node.
- Nodes are arranged in a hierarchical structure, as a sort of upside-down tree.
- A parent node can have more than one child node.
- A child node can have only one parent node.
- The relationship between parent and child is one-to-many.
- Relationships are established by creating physical links between stored records (implemented as pointers).
- To add a new record type or relationship, the database must be redefined and then stored in a new form.

[Figure: Hierarchical Data Model]

Network Data Model
- Allows record types to have more than one parent, unlike the hierarchical model.
- Sees records as set members.
- Each set has an owner and one or more members.
- Allows many-to-many relationships between entities.
- Like the hierarchical model, the network model is a collection of physically linked records.
- Allows member records to have more than one owner.

[Figure: Network Data Model]

1st Generation Data Models
- Hierarchical and network.
- Navigational and procedural approach to data processing.
- The physical structure of the database must be known in order to access the data.
- Records are treated as individual objects linked by pointers, i.e. they cannot be processed in sets.
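
The navigational, record-at-a-time style described above can be imitated in plain Python. This is only an analogy, not how any real hierarchical DBMS is programmed: the `Node` class, the department/course/student hierarchy, and the sample names are all hypothetical, and object references stand in for physical pointers.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    """One stored record, linked to its children by direct references."""
    record_type: str
    name: str
    children: list["Node"] = field(default_factory=list)

# Hypothetical hierarchy: a department owns courses, a course owns students.
root = Node("Department", "Dept A", [
    Node("Course", "Course 1", [Node("Student", "Sara"), Node("Student", "Dawit")]),
    Node("Course", "Course 2", [Node("Student", "Hana")]),
])

def students_in(department: Node, course_name: str) -> list[str]:
    """Navigate root -> course -> students, one record at a time."""
    names = []
    for course in department.children:        # follow parent-to-child links
        if course.name == course_name:
            for student in course.children:   # follow the next level of links
                names.append(student.name)
    return names

print(students_in(root, "Course 1"))
```

The caller has to know the shape of the hierarchy and spell out each step of the traversal, which is the dependence on physical structure that the list above refers to.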

Relational Data Model (2nd Gen.)
- Introduced in Dr. Edgar F. Codd's famous paper "A Relational Model of Data for Large Shared Data Banks".
- The terminology originates from the branches of mathematics called set theory and predicate logic, and the model is based on the mathematical concept of a relation.
- Can define more flexible and complex relationships.
- Viewed as a collection of tables called relations, equivalent to a collection of record types.
- Relation: a two-dimensional table.
- Stores data in the form of tables, with rows and columns.
- A row of a table is called a tuple, equivalent to a record.
- A column of a table is called an attribute, equivalent to a field.
- A data value is the value of an attribute.
- Records are related by the data stored jointly in the fields of records in two tables or files. The related tables contain the information that creates the relationship.
- Uses a declarative (as opposed to procedural) approach to database processing.
- Can treat records as a group (set).

[Figure: Relational Data Model]
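
A short sketch of the relational ideas above, using Python's `sqlite3` module. The `branch` and `staff` tables and their rows are hypothetical sample data. The two tables are related only by the shared `branch_no` values, and the query states what is wanted rather than how to find it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE branch (branch_no TEXT PRIMARY KEY, location TEXT);
    CREATE TABLE staff  (staff_no INTEGER PRIMARY KEY, name TEXT,
                         branch_no TEXT REFERENCES branch (branch_no));
    INSERT INTO branch VALUES ('B1', 'Main'), ('B2', 'East');
    INSERT INTO staff  VALUES (1, 'Sara', 'B1'), (2, 'Dawit', 'B1'), (3, 'Hana', 'B2');
""")

# Declarative, set-at-a-time request: no pointers are followed by the program.
rows = conn.execute("""
    SELECT s.name, b.location
    FROM staff AS s JOIN branch AS b ON s.branch_no = b.branch_no
    WHERE b.location = 'Main'
    ORDER BY s.name
""").fetchall()
print(rows)
```

Each element of `rows` is a tuple, and the whole result is produced as a set of matching rows in one request.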

Physical Data Models
- Used in internal database design (structure, class, and type definitions).
- E.g. frame memory, unifying model.

Conceptual Modelling
- The conceptual schema is the core of a system supporting all user views.
- It should be a complete and accurate representation of an organization's data requirements.
- Conceptual modelling is the process of developing a model of information use in an organization that is independent of implementation details such as the target DBMS, application programs, programming languages, or any other physical considerations.
- The result is a conceptual data model.

Logical Modeling
Conceptual models are also referred to as logical models in some of the literature. Strictly speaking, however, there is a distinction: the conceptual model is independent of all implementation details, whereas the logical data model assumes knowledge of the underlying data model of the target DBMS.

Functions of a DBMS
- Data storage, retrieval, and update
- A user-accessible catalog
- Transaction support
- Concurrency control services
- Recovery services
- Authorization services
- Support for data communication
- Integrity services
- Services to promote data independence
- Utility services (import/export, task scheduler, etc.)

System Catalog
- A repository of information (metadata) describing the data in the database.
- One of the fundamental components of a DBMS.
- Typically stores:
  - names, types, and sizes of data items;
  - constraints on the data;
  - names of authorized users;
  - data items accessible by a user and the type of access;
  - usage statistics.

[Figure: Components of a DBMS]
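
The idea of a user-accessible system catalog, as described in the System Catalog list above, can be tried out in SQLite through Python's `sqlite3` module. The `staff` table is a hypothetical example. SQLite's catalog records object names, column names, types, and constraints, but not authorized users or usage statistics, so it covers only part of that list.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE staff (
        staff_no INTEGER PRIMARY KEY,
        name     TEXT NOT NULL,
        salary   INTEGER
    )
""")

# sqlite_master is SQLite's catalog table: one row per schema object.
objects = conn.execute(
    "SELECT type, name FROM sqlite_master ORDER BY name"
).fetchall()

# PRAGMA table_info returns one row per column:
# (cid, name, type, notnull, default value, pk).
columns = [
    (name, col_type)
    for _cid, name, col_type, _notnull, _default, _pk
    in conn.execute("PRAGMA table_info(staff)")
]
print(objects)
print(columns)
```

Because the description of the data is itself stored as queryable data, programs can discover the structure at run time instead of hard-coding it.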

[Figure: Components of Database Manager (DM)]

Multi-User DBMS Architectures
The common architectures used to implement multi-user database management systems are:
- Teleprocessing
- File-server
- Client-server

Teleprocessing
- The traditional architecture for multi-user systems was teleprocessing, where there is one computer with a single central processing unit (CPU) and a number of terminals.
- User terminals are typically "dumb" ones, incapable of functioning on their own; they are simply cabled to the main computer.

Disadvantage
This architecture placed a tremendous burden on the central computer, which not only had to run the application programs and the DBMS, but also had to carry out a significant amount of work on behalf of the terminals (such as formatting data for display on the screen).

File-Server Architecture
In a file-server environment, the processing is distributed about the network, typically a local area network (LAN). The file-server holds the files required by the applications and the DBMS. However, the applications and the DBMS run on each workstation, requesting files from the file-server when necessary. In this way, the file-server acts simply as a shared hard disk drive. The DBMS on each workstation sends requests to the file-server for all data that the DBMS requires that is stored on disk.

[Figure: File-Server Architecture (Contd.)]

The file-server architecture has three main disadvantages:
(1) There is a large amount of network traffic.
(2) A full copy of the DBMS is required on each workstation.
(3) Concurrency, recovery, and integrity control are more complex, because there can be multiple DBMSs accessing the same files at the same time.

Client-Server Architecture
- To overcome the disadvantages of the first two approaches and accommodate an increasingly decentralized business environment, the client-server architecture was developed.
- Client-server refers to the way in which software components interact to form a system.
- As the name suggests, there is a client process, which requires some resource, and a server, which provides the resource.
- There is no requirement that the client and server must reside on the same machine.
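
The network-traffic difference between the file-server and client-server architectures can be illustrated with a toy simulation in Python. Nothing here uses a real network: the in-memory `STAFF` list, the function names, and the row counts are all assumptions chosen for the sketch, with "rows sent" standing in for traffic.

```python
# Hypothetical table of 1,000 staff rows held on the server machine.
STAFF = [
    {"staff_no": n, "branch": "B1" if n % 10 == 0 else "B2"}
    for n in range(1000)
]

def file_server_fetch():
    """File-server: the whole file crosses the network to the workstation."""
    return list(STAFF)

def workstation_query_via_file_server(branch):
    shipped = file_server_fetch()  # every row is transferred
    # The DBMS copy on the workstation filters the rows locally.
    result = [row for row in shipped if row["branch"] == branch]
    return result, len(shipped)

def database_server_query(branch):
    """Client-server: the server runs the query and ships only the matches."""
    result = [row for row in STAFF if row["branch"] == branch]
    return result, len(result)

fs_result, fs_rows_sent = workstation_query_via_file_server("B1")
cs_result, cs_rows_sent = database_server_query("B1")
print(fs_rows_sent, cs_rows_sent)
```

Both approaches return the same answer, but in this sketch the file-server path transfers all 1,000 rows while the client-server path transfers only the 100 matching ones.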

[Figure: Possible combinations of the client-server topology]

[Figure: Traditional Two-Tier Applications]

Two-Tier - Problems
The need for enterprise scalability challenged the traditional two-tier client-server model. In the mid-1990s, as applications became more complex and could potentially be deployed to hundreds or thousands of end users, the client side presented two problems that prevented true scalability:
- A "fat" client, requiring considerable resources on the client's computer (disk space, RAM, and CPU power) to run effectively.
- A significant client-side administration overhead.

The solution is to move to a three-tier architecture.

[Figure: Three-Tier Client-Server Architecture]