Ch09 DBMS (Revised 20071206)

    Chapter 9

    Database Management Systems

    Accounting Information Systems, 5th edition, James A. Hall

    (Revised by Jiin-Feng Chen, National Chengchi University, for classroom use)

    COPYRIGHT 2007 Thomson South-Western, a part of The Thomson Corporation. Thomson, the Star logo, and South-Western are trademarks used herein under license.

    Objectives for Chapter 9

    Problems inherent in the flat-file approach to data management that gave rise to the database concept

    Relationships among the defining elements of the database environment

    Anomalies caused by unnormalized databases and the need for data normalization

    Stages in database design: entity identification, data modeling, constructing the physical database, and preparing user views

    Features of distributed databases and issues to consider in deciding on a particular database configuration

    Flat-File Versus Database Environments

    Computer processing involves two components: data and instructions (programs).

    Conceptually, there are two methods for designing the interface between program instructions and data:

        file-oriented processing: A specific data file is created for each application.

        data-oriented processing: A single data repository is created to support numerous applications.

    Disadvantages of file-oriented processing include redundant data and programs and varying formats for storing the redundant data.

    Flat-File Environment

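    [Figure: each user's transactions are processed by a separate program that owns its own data file. User 1 / Program 1 / data A, B, C; User 2 / Program 2 / data X, B, Y; User 3 / Program 3 / data L, B, M. Data element B is stored in all three files.]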

    Data Redundancy & Flat-File Problems

    Data Storage - creates excessive storage costs of paper documents and/or magnetic form

    Data Updating - any changes or additions must be performed multiple times

    Currency of Information - potential problem of failing to update all affected files

    Task-Data Dependency - user's inability to obtain additional information as his or her needs change
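
    The updating and currency problems can be sketched in a few lines of Python. The three dictionaries stand in for three application files that each keep their own copy of a customer address; the file names, customer number, and addresses are all made up for illustration.

```python
# Hypothetical flat files: each application keeps its own copy of the
# customer address (the shared data element in all three files).
billing_file = {"C001": {"name": "Acme", "address": "1 Old Road", "balance": 250.0}}
shipping_file = {"C001": {"address": "1 Old Road", "carrier": "Rail"}}
marketing_file = {"C001": {"address": "1 Old Road", "segment": "Retail"}}


def update_address(files, customer_id, new_address):
    """Apply an address change to every file passed in."""
    for f in files:
        f[customer_id]["address"] = new_address


def distinct_addresses(files, customer_id):
    """Return the distinct address values held across all files."""
    return {f[customer_id]["address"] for f in files}


all_files = [billing_file, shipping_file, marketing_file]

# The change reaches only two of the three files ...
update_address([billing_file, shipping_file], "C001", "9 New Street")

# ... so the files now disagree about the customer's address.
addresses = distinct_addresses(all_files, "C001")
print(addresses)
```

    Every additional file that repeats the address is one more place an update can be missed, which is the currency problem listed above.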

    Database Approach

    [Figure: the transactions of Users 1, 2, and 3 are processed by application Programs 1, 2, and 3, which all reach a single shared database (A, B, C, X, Y, L, M) through the DBMS.]

    Advantages of the Database Approach

    Data sharing/centralized database resolves flat-file problems:

        No data redundancy - Data is stored only once, eliminating data redundancy and reducing storage costs.

        Single update - Because data is in only one place, it requires only a single update, reducing the time and cost of keeping the database current.

        Current values - A change to the database made by any user yields current data values for all other users.

        Task-data independence - As users' information needs expand, the new needs can be more easily satisfied than under the flat-file approach.

    Disadvantages of the Database Approach

    Can be costly to implement

        additional hardware, software, storage, and network resources are required

    Can only run in certain operating environments

        may make it unsuitable for some system configurations

    Because it is so different from the file-oriented approach, the database approach requires training users

        there may be inertia or resistance

    Elements of the Database Approach

    [Figure: users submit transactions through user programs (applications) and submit user queries directly. The DBMS, with its data definition language, data manipulation language, and query language, works through the host operating system to reach the physical database. The database administrator and the system development process also appear as elements of the environment.]

    DBMS Features

    User Programs - makes the presence of the DBMS transparent to the user

    Direct Query - allows authorized users to access data without programming

    Application Development - user-created applications

    Backup and Recovery - copies database

    Database Usage Reporting - captures statistics on database usage (who, when, etc.)

    Database Access - authorizes access to sections of the database

    Internal Controls and DBMS

    The purpose of the DBMS is to provide controlled access to the database.

    The DBMS is a special software system programmed to know which data elements each user is authorized to access and to deny unauthorized requests for data.
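
    As a rough illustration of controlled access, the sketch below checks each request against an authority table before returning any data. The user names, field names, and the `request` function are hypothetical; a real DBMS enforces this through its own authorization facilities.

```python
# Hypothetical authority table: which data elements each user may access.
AUTHORIZED = {
    "ar_clerk": {"cust_num", "name", "balance"},
    "shipping": {"cust_num", "name", "address"},
}

# One made-up customer record held by the database.
RECORD = {"cust_num": 1, "name": "Acme", "address": "9 New Street", "balance": 250.0}


def request(user, fields):
    """Return the requested fields, or raise if any of them is not authorized."""
    allowed = AUTHORIZED.get(user, set())
    denied = set(fields) - allowed
    if denied:
        raise PermissionError(f"{user} may not access {sorted(denied)}")
    return {f: RECORD[f] for f in fields}


print(request("shipping", ["name", "address"]))
```

    A call such as `request("shipping", ["balance"])` raises `PermissionError`, because the balance is not among the elements the shipping user is authorized to see.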

    Data Definition Language (DDL)

    DDL is a programming language used to define the database to the DBMS.

    The DDL identifies the names and the relationship of all data elements, records, and files that constitute the database.

    Viewing Levels:

        internal view - physical arrangement of records (1)

        conceptual view - representation of database (1)

        user view - the portion of the database each user views (many)
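
    A minimal sketch of DDL in use, assuming SQLite (through Python's standard `sqlite3` module) as the DBMS. The table, view, and column names are invented. `CREATE TABLE` plays the role of the conceptual view and `CREATE VIEW` defines one user view; the internal view is left to SQLite and is not shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database

# Conceptual view: one logical definition of the whole table.
conn.execute("""
    CREATE TABLE customer (
        cust_num     INTEGER PRIMARY KEY,
        name         TEXT NOT NULL,
        address      TEXT,
        credit_limit REAL
    )
""")

# User view: a shipping clerk sees names and addresses, not credit limits.
conn.execute("""
    CREATE VIEW shipping_view AS
    SELECT cust_num, name, address FROM customer
""")

conn.execute("INSERT INTO customer VALUES (1, 'Acme', '9 New Street', 5000.0)")

# The columns visible through the user view.
cursor = conn.execute("SELECT * FROM shipping_view")
view_columns = [d[0] for d in cursor.description]
print(view_columns)
```

    Many such views can be defined over the same single conceptual definition, which matches the (1) and (many) counts in the list above.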

    Data Manipulation Language (DML)

    DML is the proprietary programming language that a particular DBMS uses to retrieve, process, and store data.

    Entire user programs may be written in the DML, or selected DML commands can be inserted into universal programs, such as COBOL and FORTRAN.
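
    The idea of inserting DML commands into a host program can be sketched with Python standing in for COBOL or FORTRAN and SQLite's SQL standing in for a vendor's DML. Both substitutions are assumptions made for illustration, as are the table and part names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE inventory (part_num INTEGER PRIMARY KEY, descr TEXT, qty INTEGER)"
)

# Store data (INSERT), issued from inside the host program.
conn.executemany(
    "INSERT INTO inventory VALUES (?, ?, ?)",
    [(101, "bracket", 40), (102, "hinge", 15)],
)

# Process data (UPDATE): record the sale of 5 hinges.
conn.execute("UPDATE inventory SET qty = qty - ? WHERE part_num = ?", (5, 102))

# Retrieve data (SELECT) back into host-language variables.
rows = conn.execute(
    "SELECT part_num, descr, qty FROM inventory ORDER BY part_num"
).fetchall()
print(rows)
```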

    Query Language

    The query capability permits end users and professional programmers to access data in the database without the need for conventional programs.

    ANSI's Structured Query Language (SQL) is a fourth-generation language (4GL) that has emerged as the standard query language.
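
    A small example of an ad hoc SQL query, run here through Python's `sqlite3` module only so that it is executable. The `sales` table and its rows are invented. The query states what is wanted (customers whose invoices total more than 100) and leaves the retrieval procedure to the DBMS.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales (invoice INTEGER PRIMARY KEY, customer TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [(1, "Acme", 120.0), (2, "Bolt Co", 75.0), (3, "Acme", 30.0)],
)

# Non-procedural request: no loops or file handling are written by the user.
query = """
    SELECT customer, SUM(amount) AS total
    FROM sales
    GROUP BY customer
    HAVING SUM(amount) > 100
    ORDER BY customer
"""
totals = conn.execute(query).fetchall()
print(totals)
```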

    Functions of the DBA

    Logical Data Structures

    A particular method used to organize records in a database is called the database's structure.

    The objective is to develop this structure efficiently so that data can be accessed quickly and easily.

    Four types of structures are:

        hierarchical (tree structure)

        network

        relational

        object-oriented

    The Relational Model

    The relational model portrays data in the form of two-dimensional tables:

    relation - the database table

    attributes (data elements) - form columns

    tuples (records) - form rows

    data - the intersection of rows and columns

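    RESTRICT - filtering out rows, such as the dark blue

    PROJECT - filtering out columns, such as the light blue

    JOIN

    [Figure: the table (X1, Y1), (X2, Y2), (X3, Y1) is joined with the table (Y1, Z1), (Y2, Z2), (Y3, Z3) on the common Y column, giving (X1, Y1, Z1), (X2, Y2, Z2), (X3, Y1, Z1).]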
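
    The three operations can be sketched over tables held as lists of dictionaries, using X, Y, and Z values like those in the JOIN figure. The function names are illustrative only. The sketch keeps duplicate rows after PROJECT, which strict relational algebra would remove.

```python
def restrict(table, predicate):
    """RESTRICT: keep only the rows for which predicate(row) is true."""
    return [row for row in table if predicate(row)]


def project(table, columns):
    """PROJECT: keep only the named columns of every row."""
    return [{c: row[c] for c in columns} for row in table]


def join(left, right, key):
    """JOIN: combine rows from both tables that share the same key value."""
    return [{**l, **r} for l in left for r in right if l[key] == r[key]]


xy = [{"X": "X1", "Y": "Y1"}, {"X": "X2", "Y": "Y2"}, {"X": "X3", "Y": "Y1"}]
yz = [{"Y": "Y1", "Z": "Z1"}, {"Y": "Y2", "Z": "Z2"}, {"Y": "Y3", "Z": "Z3"}]

joined = join(xy, yz, "Y")                                # match rows on Y
only_y1 = restrict(joined, lambda row: row["Y"] == "Y1")  # filter rows
x_and_z = project(joined, ["X", "Z"])                     # filter columns

for row in joined:
    print(row)
```

    The Y3 row of the second table has no partner in the first table, so it does not appear in the joined result.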

    Properly Designed Relational Tables

    No repeating values - All occurrences at the intersection of a row and column are a single value.

    The attribute values in any column must all be of the same class.

    Each column in a given table must be uniquely named.

    Each row in the table must be unique in at least one attribute, which is the primary key.

    Crow's Feet Cardinalities

    (1:0,1)

    (1:1)

    (1:0,M)

    (1:M)

    (M:M)

    Relational Model Data Linkages (>1 table)

    No explicit pointers are present. The data are viewed as a collection of independent tables.

    Relations are formed by an attribute that is common to both tables in the relation.

    Assignment of foreign keys:

        if 1 to 1 association, either table's primary key may be the foreign key.

        if 1 to many association, the primary key on the 1 side is embedded as the foreign key on the many side.

        if many to many association, may embed foreign keys or create a separate linking table.
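
    The foreign key rules for 1 to many and many to many associations can be sketched with SQLite as a stand-in DBMS. The customer, invoice, and item tables are hypothetical, and the many to many case uses the separate linking table option.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled

conn.executescript("""
    CREATE TABLE customer (cust_num INTEGER PRIMARY KEY, name TEXT);

    -- 1 to many: one customer has many invoices, so the customer's
    -- primary key is embedded in invoice as a foreign key.
    CREATE TABLE invoice (
        invoice_num INTEGER PRIMARY KEY,
        cust_num    INTEGER NOT NULL REFERENCES customer(cust_num)
    );

    CREATE TABLE item (item_num INTEGER PRIMARY KEY, descr TEXT);

    -- many to many: an invoice lists many items and an item appears on
    -- many invoices, so a separate linking table holds both keys.
    CREATE TABLE invoice_line (
        invoice_num INTEGER REFERENCES invoice(invoice_num),
        item_num    INTEGER REFERENCES item(item_num),
        qty         INTEGER,
        PRIMARY KEY (invoice_num, item_num)
    );

    INSERT INTO customer VALUES (1, 'Acme');
    INSERT INTO invoice VALUES (500, 1), (501, 1);
    INSERT INTO item VALUES (10, 'bracket'), (11, 'hinge');
    INSERT INTO invoice_line VALUES (500, 10, 4), (500, 11, 2), (501, 10, 1);
""")

# No pointers are stored; the tables are linked by matching key values.
lines = conn.execute("""
    SELECT c.name, i.invoice_num, it.descr, l.qty
    FROM customer c
    JOIN invoice i      ON i.cust_num = c.cust_num
    JOIN invoice_line l ON l.invoice_num = i.invoice_num
    JOIN item it        ON it.item_num = l.item_num
    ORDER BY i.invoice_num, it.item_num
""").fetchall()

# An invoice that names a nonexistent customer is rejected.
try:
    conn.execute("INSERT INTO invoice VALUES (502, 99)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

print(lines, orphan_rejected)
```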

    Three Types of Anomalies

    Insertion Anomaly: A new item cannot be added to the table until at least one entity uses a particular attribute item.

    Deletion Anomaly: If an attribute item used by only one entity is deleted, all information about that attribute item is lost.

    Update Anomaly: A modification on an attribute must be made in each of the rows in which the attribute appears.

    Anomalies can be corrected by creating relational tables.
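
    The three anomalies can be reproduced on a small unnormalized table. The inventory rows below are made up: supplier details repeat on every part row, and the part number is assumed to be the primary key.

```python
# Hypothetical unnormalized table: supplier details repeat on every part row.
inventory = [
    {"part": 101, "descr": "bracket", "supplier": "S1", "supplier_phone": "555-0100"},
    {"part": 102, "descr": "hinge", "supplier": "S1", "supplier_phone": "555-0100"},
    {"part": 103, "descr": "latch", "supplier": "S2", "supplier_phone": "555-0200"},
]


def phones_on_file(table, supplier):
    """Distinct phone numbers recorded for one supplier."""
    return {r["supplier_phone"] for r in table if r["supplier"] == supplier}


def can_insert(row):
    """A row needs a primary key value (the part number) to be stored."""
    return row.get("part") is not None


# Update anomaly: changing S1's phone on one row leaves the other row stale.
inventory[0]["supplier_phone"] = "555-0199"
s1_phones = phones_on_file(inventory, "S1")

# Deletion anomaly: deleting the only part bought from S2 erases S2 as well.
inventory = [r for r in inventory if r["part"] != 103]
s2_phones = phones_on_file(inventory, "S2")

# Insertion anomaly: supplier S3 supplies no part yet, so there is no
# primary key value under which its details could be recorded.
s3_insertable = can_insert({"part": None, "supplier": "S3", "supplier_phone": "555-0300"})

print(s1_phones, s2_phones, s3_insertable)
```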

    Advantages of Relational Tables

    Removes all three anomalies

    Various items of interest (customers, inventory, sales) are stored in separate tables.

    Space is used efficiently.

    Very flexible. Users can form ad hoc relationships.

    The Normalization Process

    A process which systematically splits unnormalized complex tables into smaller tables that meet two conditions:

        all nonkey (secondary) attributes in the table are dependent on the primary key

        all nonkey attributes are independent of the other nonkey attributes

    When unnormalized tables are split and reduced to third normal form, they must then be linked together by foreign keys.
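
    A minimal sketch of the split, assuming a made-up inventory table in which the supplier's phone number depends on the supplier (a nonkey attribute) rather than on the part number (the primary key). The table and field names are illustrative.

```python
unnormalized = [
    {"part": 101, "descr": "bracket", "supplier": "S1", "supplier_phone": "555-0100"},
    {"part": 102, "descr": "hinge", "supplier": "S1", "supplier_phone": "555-0100"},
    {"part": 103, "descr": "latch", "supplier": "S2", "supplier_phone": "555-0200"},
]

# supplier_phone depends on supplier, another nonkey attribute, so the
# supplier facts move to their own table keyed by supplier number.
supplier_table = {r["supplier"]: {"phone": r["supplier_phone"]} for r in unnormalized}

# The part table keeps only attributes that depend on its own key, plus the
# supplier number as a foreign key linking back to supplier_table.
part_table = {
    r["part"]: {"descr": r["descr"], "supplier": r["supplier"]} for r in unnormalized
}

# One update now corrects the phone for every part bought from S1.
supplier_table["S1"]["phone"] = "555-0199"


def part_with_supplier(part_num):
    """Rebuild the original wide row by following the foreign key."""
    part = part_table[part_num]
    phone = supplier_table[part["supplier"]]["phone"]
    return {"part": part_num, **part, "supplier_phone": phone}


print(part_with_supplier(101))
print(part_with_supplier(102))
```

    Each supplier's phone is now stored once, so the two rows for S1 cannot drift apart.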

    Steps in Normalization

    Table with repeating groups

        remove repeating groups -> First normal form (1NF)

        remove partial dependencies -> Second normal form (2NF)

        remove transitive dependencies -> Third normal form (3NF)

        remove remaining anomalies -> Higher normal forms

    Accountants and Data Normalization

    Update anomalies can generate conflicting and obsolete database values.

    Insertion anomalies can result in unrecorded transactions and incomplete audit trails.

    Deletion anomalies can cause the loss of accounting records and the destruction of audit trails.

    Accountants should understand the data normalization process and be able to determine whether a database is properly normalized.

    Six Phases in Designing Relational Databases

    1. Identify entities

        identify the primary entities of the organization

        construct a data model of their relationships

    2. Construct a data model showing entity associations

        determine the associations between entities

        model associations into an ER diagram

    Six Phases in Designing Relational Databases

    3. Add primary keys and attributes

        assign primary keys to all entities in the model to uniquely identify records

        every attribute should appear in one or more user views

    4. Normalize and add foreign keys

        remove repeating groups, partial and transitive dependencies

        assign foreign keys to be able to link tables

    Six Phases in Designing Relational Databases

    5. Construct the physical database

        create physical tables

        populate tables with data

    6. Prepare the user views

        normalized tables should support all required views of system users

        user views restrict users from having access to unauthorized data

    Distributed Data Processing

    [Figure: Sites A, B, and C connected to a centralized database at the central site.]

    Distributed Data Processing

    Data processing is organized around several information processing units (IPUs) distributed throughout the organization.

    Each IPU is placed under the control of the end user.

    DDP does not mean decentralization.

    IPUs are connected to one another and coordinated.

    Advantages of DDP

    Cost reductions in hardware and data entry tasks

    Improved cost control responsibility

    Improved user satisfaction since control is closer to the user level

    Backup of data can be improved through the use of multiple data storage sites

    Disadvantages of DDP

    Loss of control

    Mismanagement of resources

    Hardware and software incompatibility

    Redundant tasks and data

    Consolidating incompatible tasks

    Difficulty attracting qualified personnel

    Lack of standards

    Centralized Databases in DDP Environment

    The data is retained in a central location.

    Remote IPUs send requests for data.

    Central site services the needs of the remote IPUs.

    The actual processing of the data is performed at the remote IPU.

    Data Currency

    Occurs in DDP with a centralized database

    During transaction processing, data will temporarily be inconsistent as records are read and updated.

    Database lockout procedures are necessary to keep IPUs from reading inconsistent data and from writing over a transaction being written by another IPU.
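
    A lockout can be sketched with threads standing in for IPUs and a `threading.Lock` standing in for the database lock. Both are simplifying assumptions, and the `Account` class is hypothetical. The lock is held for the whole read-modify-write cycle, so no IPU can read a balance that another IPU is in the middle of updating.

```python
import threading


class Account:
    """A record in the central database, guarded by a lock."""

    def __init__(self, balance):
        self.balance = balance
        self._lock = threading.Lock()

    def post(self, amount):
        # Lock out other IPUs for the whole read-modify-write cycle.
        with self._lock:
            current = self.balance
            self.balance = current + amount


account = Account(1000)


def ipu(n_postings, amount):
    """One IPU posting a series of transactions to the shared record."""
    for _ in range(n_postings):
        account.post(amount)


threads = [threading.Thread(target=ipu, args=(10_000, 1)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(account.balance)
```

    Without the lock, two IPUs could both read the same balance and each write back its own result, so one of the two postings would be lost.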

    Distributed Databases: Partitioning

    Splits the central database into segments that are distributed to their primary users

    Advantages:

        users' control is increased by having data stored at local sites

        transaction processing response time is improved

        volume of transmitted data between IPUs is reduced

        reduces the potential data loss from a disaster

    The Deadlock Phenomenon

    Especially a problem with partitioned databases

    Occurs when multiple sites lock each other out of data that they are currently using

        One site needs data locked by another site.

    Special software is needed to analyze and resolve conflicts.

        Transactions may be terminated and restarted.

    The Deadlock Phenomenon

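    [Figure: three sites hold the data partitions A, B; C, D; and E, F. One site has locked A and is waiting for C, another has locked C and is waiting for E, and the third has locked E and is waiting for A.]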
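
    One way conflict-analysis software might detect this situation is to follow the chain of "waiting for" links and check whether it returns to its starting point. The sketch below does that for the lock pattern in the figure. The site names and the `find_deadlock` function are hypothetical, and each site is assumed to hold and wait for a single item.

```python
# Locks held and requested, following the figure: each site holds one item
# and waits for an item held by another site. Site names are made up.
holds = {"site1": "A", "site2": "C", "site3": "E"}
waits_for_item = {"site1": "C", "site2": "E", "site3": "A"}


def find_deadlock(holds, waits_for_item):
    """Return the sites that form a wait cycle, or [] if there is none."""
    owner = {item: site for site, item in holds.items()}
    # Edge: site -> the site holding the item it is waiting for.
    waits_for = {s: owner.get(item) for s, item in waits_for_item.items()}
    for start in waits_for:
        path, site = [], start
        while site is not None and site not in path:
            path.append(site)
            site = waits_for.get(site)
        if site is not None and site == start:
            return path  # the chain led back to where it started
    return []


cycle = find_deadlock(holds, waits_for_item)
print(cycle)

# With no circular wait there is nothing to resolve.
no_cycle = find_deadlock(holds, {"site1": "C"})
print(no_cycle)
```

    Once a cycle is found, terminating and restarting the transaction at any one site in it breaks the circular wait.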

    Distributed Databases: Replication

    The duplication of the entire database for multiple IPUs

    Effective for situations with a high degree of data sharing, but no primary user

        Supports read-only queries.

    Data traffic between sites is reduced considerably.

    Concurrency Problems and Control Issues

    Database concurrency is the presence of complete and accurate data at all IPU sites.

    With replicated databases, maintaining current data at all locations is difficult.

    Time stamping is used to serialize transactions.

        Prevents and resolves conflicts created by updating data at various IPUs
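
    A minimal sketch of time stamping, assuming every update carries a unique timestamp and every site eventually receives the same set of updates. The field names and values are invented. Two replicas receive the updates in different orders but apply them in timestamp order, so both end in the same state.

```python
def apply_in_timestamp_order(initial, updates):
    """Apply (timestamp, field, value) updates in timestamp order."""
    state = dict(initial)
    for _, field, value in sorted(updates, key=lambda u: u[0]):
        state[field] = value
    return state


start = {"price": 9.0, "qty": 5}

# The same three updates, as (timestamp, field, value).
updates = [(3, "price", 12.0), (1, "price", 10.0), (2, "qty", 7)]

# Site A and site B receive the updates in different arrival orders.
site_a = apply_in_timestamp_order(start, updates)
site_b = apply_in_timestamp_order(start, list(reversed(updates)))

print(site_a, site_b)
```

    If each site instead applied the updates in arrival order, the replicas could end with different prices.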