Ghani DBMS Fundamental

  • 8/7/2019 Ghani DBMS Fundamental

    1/16

    DBMS FUNDAMENTAL

    Abdul Ghani Khan, M.Tech (ICT)

    School of Information & Communication Technology, GBU Page 1

    What Is a DBMS?

    A database is a very large, integrated collection of data that models a real-world enterprise:
    - Entities (e.g., students, courses)
    - Relationships (e.g., Tarkan is taking CENG302)

    A Database Management System (DBMS) is a software package designed to store and manage databases.

    Database Management System (DBMS):

    A DBMS contains information about a particular enterprise:
    - A collection of interrelated data
    - A set of programs to access the data
    - An environment that is both convenient and efficient to use

    Database Applications:
    - Banking: all transactions
    - Airlines: reservations, schedules
    - Universities: registration, grades
    - Sales: customers, products, purchases
    - Online retailers: order tracking, customized recommendations
    - Manufacturing: production, inventory, orders, supply chain
    - Human resources: employee records, salaries, tax deductions

    Databases touch all aspects of our lives.

    Purpose of Database Systems:

    In the early days, database applications were built directly on top of file systems. Drawbacks of using file systems to store data:
    - Data redundancy and inconsistency
        - Multiple file formats, duplication of information in different files
    - Difficulty in accessing data
        - Need to write a new program to carry out each new task
    - Data isolation: multiple files and formats
    - Integrity problems
        - Integrity constraints (e.g., account balance > 0) become buried in program code rather than being stated explicitly
        - Hard to add new constraints or change existing ones
    - Atomicity of updates
        - Failures may leave the database in an inconsistent state with partial updates carried out
        - Example: a transfer of funds from one account to another should either complete or not happen at all
    - Concurrent access by multiple users
        - Concurrent access is needed for performance
        - Uncontrolled concurrent accesses can lead to inconsistencies
        - Example: two people reading a balance and updating it at the same time
    - Security problems
        - Hard to provide user access to some, but not all, data

    Database systems offer solutions to all the above problems.
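
    The funds-transfer example above can be sketched with Python's built-in sqlite3 module. This is only an illustration under stated assumptions: the account table, the account numbers A-101 and A-215, and the transfer helper are hypothetical, and the CHECK constraint stands in for whatever failure interrupts the transfer halfway.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
with conn:
    # Hypothetical schema: a balance may never go negative.
    conn.execute(
        "CREATE TABLE account ("
        " account_number TEXT PRIMARY KEY,"
        " balance INTEGER NOT NULL CHECK (balance >= 0))"
    )
    conn.executemany(
        "INSERT INTO account VALUES (?, ?)",
        [("A-101", 100), ("A-215", 50)],
    )


def transfer(conn, src, dst, amount):
    """Move `amount` from `src` to `dst` as one all-or-nothing unit."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            # Credit first, so a failed debit leaves a partial update to undo.
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE account_number = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE account_number = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def balances(conn):
    return dict(conn.execute("SELECT account_number, balance FROM account"))


first = transfer(conn, "A-101", "A-215", 30)    # succeeds
second = transfer(conn, "A-101", "A-215", 500)  # debit violates the CHECK
print(first, second, balances(conn))
```

    In the second call the credit has already been applied when the debit fails, so the rollback is what keeps the partial update from surviving.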

    File System vs. DBMS:

    Consider a company that has 500 GB of data on employees, departments, products, sales, and so on:
    - Data is accessed concurrently by several employees
    - Questions about the data must be answered quickly
    - Changes made to the data by different users must be applied consistently
    - Access to certain parts of the data must be restricted

    If the data is stored in operating system files, there are many drawbacks:
    - 500 GB of main memory is not available to hold all the data, so the data must be stored on secondary storage devices
    - Even if 500 GB of main memory were available, with 32-bit addressing we cannot refer directly to more than 4 GB of data
    - Data redundancy and inconsistency
        - Multiple file formats, duplication of information in different files
    - A special program is needed to answer each question a user may ask
    - Integrity problems
        - Integrity constraints (e.g., account balance > 0) become buried in program code rather than being stated explicitly
        - Hard to add new constraints or change existing ones
    - We must protect the data from inconsistent changes made by different users. If application programs need to address concurrency, their complexity increases manifold
    - A consistent state of the data must be restored if the system crashes while changes are being made
    - The OS provides only a password mechanism for security, which is not flexible enough if users have permission to access only subsets of the data
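
    The 32-bit addressing limit mentioned above is simple arithmetic, worked through here as a small sketch. It assumes byte-addressable memory and reads "500 GB" as 500 x 10^9 bytes; the variable names are illustrative only.

```python
ADDRESS_BITS = 32

# A 32-bit address can name 2**32 distinct bytes.
addressable_bytes = 2 ** ADDRESS_BITS
addressable_gib = addressable_bytes // 2 ** 30   # size in binary gigabytes

# Assumption: "500 GB" means 500 * 10**9 bytes.
data_bytes = 500 * 10 ** 9

fits_in_address_space = data_bytes <= addressable_bytes
ratio = data_bytes / addressable_bytes           # how many address spaces the data would fill

print(addressable_gib, fits_in_address_space, round(ratio, 1))
```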

    These drawbacks have prompted the development of database systems, which offer solutions to all the above problems.

    Levels of Abstraction:

    Physical level: describes how a record (e.g., customer) is stored.

    Logical level: describes the data stored in the database and the relationships among the data.

        type customer = record
            customer_id: string;
            customer_name: string;
            customer_street: string;
            customer_city: string;
        end;

    View level: application programs hide details of data types. Views can also hide information (such as an employee's salary) for security purposes.
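
    A view that hides the salary column, as described for the view level, might look like the following sketch. The employee table, its columns, the sample row, and the view name employee_public are all hypothetical. SQLite itself has no user accounts, so in a multi-user DBMS the view would be combined with authorization so that users can read the view but not the base table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Logical level: the full record, including the sensitive column.
conn.execute(
    "CREATE TABLE employee ("
    " employee_id TEXT PRIMARY KEY,"
    " name TEXT,"
    " department TEXT,"
    " salary INTEGER)"
)
conn.execute("INSERT INTO employee VALUES ('E-1', 'Asha', 'Sales', 52000)")

# View level: the same data with the salary left out.
conn.execute(
    "CREATE VIEW employee_public AS"
    " SELECT employee_id, name, department FROM employee"
)

cursor = conn.execute("SELECT * FROM employee_public")
view_columns = [column[0] for column in cursor.description]
view_rows = cursor.fetchall()
print(view_columns, view_rows)
```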

    Instances and Schemas:

    Schemas and instances are similar to types and variables in programming languages.

    Schema: the logical structure of the database
    - Example: the database consists of information about a set of customers and accounts and the relationship between them
    - Analogous to the type information of a variable in a program
    - Physical schema: database design at the physical level
    - Logical schema: database design at the logical level

    Instance: the actual content of the database at a particular point in time
    - Analogous to the value of a variable

    Physical data independence: the ability to modify the physical schema without changing the logical schema
    - Applications depend on the logical schema
    - In general, the interfaces between the various levels and components should be well defined so that changes in some parts do not seriously influence others
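
    The distinction between schema and instance, and the idea of physical data independence, can be tried out with a short sketch. It assumes SQLite as a stand-in DBMS; the customer rows, the helper names, and the index idx_customer_city are hypothetical. Adding an index is treated here as a change to the physical schema only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer ("
    " customer_id TEXT PRIMARY KEY,"
    " customer_name TEXT,"
    " customer_city TEXT)"
)


def schema_text(conn):
    """Logical schema of the table, as recorded by SQLite."""
    row = conn.execute(
        "SELECT sql FROM sqlite_master WHERE type = 'table' AND name = 'customer'"
    ).fetchone()
    return row[0]


def instance(conn):
    """Current content of the table (the instance at this moment)."""
    return conn.execute(
        "SELECT customer_id, customer_name, customer_city"
        " FROM customer ORDER BY customer_id"
    ).fetchall()


schema_before = schema_text(conn)

# The instance changes with every insert; the schema does not.
conn.execute("INSERT INTO customer VALUES ('192-83-7465', 'Johnson', 'Palo Alto')")
instance_1 = instance(conn)
conn.execute("INSERT INTO customer VALUES ('019-28-3746', 'Smith', 'Rye')")
instance_2 = instance(conn)

# Physical change: add an index. The application query is untouched.
query = "SELECT customer_name FROM customer WHERE customer_city = ? ORDER BY customer_name"
result_before = conn.execute(query, ("Rye",)).fetchall()
conn.execute("CREATE INDEX idx_customer_city ON customer (customer_city)")
result_after = conn.execute(query, ("Rye",)).fetchall()

schema_after = schema_text(conn)
print(schema_before == schema_after, result_before == result_after)
```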

    Data Models:

    A collection of tools for describing:
    - Data
    - Data relationships
    - Data semantics
    - Data constraints

    Data models include:
    - Relational model
    - Entity-Relationship data model (mainly for database design)
    - Object-based data models (object-oriented and object-relational)
    - Other older models: network model, hierarchical model

    Data Manipulation Language (DML):

    Language for accessing and manipulating the data organized by the appropriate data model. DML is also known as a query language.

    Two classes of languages:
    - Procedural: the user specifies what data is required and how to get those data
    - Declarative (nonprocedural): the user specifies what data is required without specifying how to get those data
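
    The contrast between the two classes can be illustrated by computing the same answer twice. In this sketch the Python loop merely stands in for a procedural language (it spells out how to walk the rows), while the SQL statement states only what is wanted. The account rows and branch names are hypothetical sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account (account_number TEXT, branch_name TEXT, balance INTEGER)"
)
conn.executemany(
    "INSERT INTO account VALUES (?, ?, ?)",
    [("A-101", "Downtown", 500), ("A-215", "Mianus", 700), ("A-102", "Downtown", 400)],
)


def downtown_total_procedural(rows):
    """Procedural style: visit every row and decide what to do with it."""
    total = 0
    for _account_number, branch_name, balance in rows:
        if branch_name == "Downtown":
            total += balance
    return total


all_rows = conn.execute(
    "SELECT account_number, branch_name, balance FROM account"
).fetchall()
procedural_total = downtown_total_procedural(all_rows)

# Declarative style: describe the result; the DBMS chooses how to compute it.
declarative_total = conn.execute(
    "SELECT SUM(balance) FROM account WHERE branch_name = 'Downtown'"
).fetchone()[0]

print(procedural_total, declarative_total)
```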

    SQL is the most widely used query language.

    Data Definition Language (DDL):

    Specification notation for defining the database schema.

    Example:

        create table account (
            account_number char(10),
            branch_name char(10),
            balance integer)

    The DDL compiler generates a set of tables stored in a data dictionary.

    The data dictionary contains metadata (i.e., data about data):
    - Database schema
    - Data storage and definition language
        - Specifies the storage structure and access methods used
    - Integrity constraints
        - Domain constraints
        - Referential integrity (e.g., branch_name must correspond to a valid branch in the branch table)
    - Authorization
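
    The DDL example and the data dictionary can be explored with a small sketch in SQLite, used here only as a convenient stand-in. Assumptions: the branch table definition and the sample rows are hypothetical, SQLite's sqlite_master catalog plays the role of the data dictionary, and foreign keys are enforced only after the PRAGMA shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled

conn.execute("CREATE TABLE branch (branch_name CHAR(10) PRIMARY KEY)")
conn.execute(
    "CREATE TABLE account ("
    " account_number CHAR(10) PRIMARY KEY,"
    " branch_name CHAR(10) REFERENCES branch (branch_name),"
    " balance INTEGER)"
)

# The catalog records the tables that the DDL statements defined.
tables = [
    name
    for (name,) in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
]

conn.execute("INSERT INTO branch VALUES ('Downtown')")
conn.execute("INSERT INTO account VALUES ('A-101', 'Downtown', 500)")

# Referential integrity: an account may not name a branch that does not exist.
try:
    conn.execute("INSERT INTO account VALUES ('A-999', 'Nowhere', 10)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

print(tables, rejected)
```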

    SQL:

    SQL is a widely used non-procedural language.

    Example: Find the name of the customer with customer-id 192-83-7465

        select customer.customer_name
        from customer
        where customer.customer_id = '192-83-7465'

    Example: Find the balances of all accounts held by the customer with customer-id 192-83-7465

        select account.balance
        from depositor, account
        where depositor.customer_id = '192-83-7465' and
              depositor.account_number = account.account_number

    Application programs generally access databases through one of:
    - Language extensions to allow embedded SQL
    - Application program interface (e.g., ODBC/JDBC), which allows SQL queries to be sent to a database
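
    The two example queries can be sent to a database from an application program roughly as follows. This sketch uses Python's sqlite3 module as a stand-in for an API such as ODBC or JDBC; the table definitions, the customer names, and the account rows are hypothetical sample data chosen so that the queries return something.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customer (customer_id TEXT PRIMARY KEY, customer_name TEXT);
    CREATE TABLE account (account_number TEXT PRIMARY KEY, balance INTEGER);
    CREATE TABLE depositor (customer_id TEXT, account_number TEXT);

    INSERT INTO customer VALUES ('192-83-7465', 'Johnson');
    INSERT INTO customer VALUES ('019-28-3746', 'Smith');
    INSERT INTO account VALUES ('A-101', 500);
    INSERT INTO account VALUES ('A-201', 900);
    INSERT INTO account VALUES ('A-305', 350);
    INSERT INTO depositor VALUES ('192-83-7465', 'A-101');
    INSERT INTO depositor VALUES ('192-83-7465', 'A-201');
    INSERT INTO depositor VALUES ('019-28-3746', 'A-305');
    """
)

customer_id = "192-83-7465"

# Query 1: the customer's name. The id is passed as a parameter, not pasted into the SQL.
customer_name = conn.execute(
    "SELECT customer.customer_name FROM customer WHERE customer.customer_id = ?",
    (customer_id,),
).fetchone()[0]

# Query 2: balances of all accounts held by that customer.
account_balances = sorted(
    balance
    for (balance,) in conn.execute(
        "SELECT account.balance FROM depositor, account"
        " WHERE depositor.customer_id = ?"
        " AND depositor.account_number = account.account_number",
        (customer_id,),
    )
)

print(customer_name, account_balances)
```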

    Database Design:

    The process of designing the general structure of the database:
    - Logical design: deciding on the database schema. Database design requires that we find a good collection of relation schemas.
        - Business decision: what attributes should we record in the database?
        - Computer science decision: what relation schemas should we have, and how should the attributes be distributed among the various relation schemas?
    - Physical design: deciding on the physical layout of the database

    The Entity-Relationship Model:

    Models an enterprise as a collection of entities and relationships:
    - Entity: a thing or object in the enterprise that is distinguishable from other objects
        - Described by a set of attributes
    - Relationship: an association among several entities

    Represented diagrammatically by an entity-relationship diagram.

    Database Application Architectures:

    - Two-tier architecture: e.g., client programs using ODBC/JDBC to communicate with a database
    - Three-tier architecture: e.g., web-based applications

    Database Management System Internals:

    - Storage management
    - Query processing
    - Transaction processing

    Storage Management:

    The storage manager is a program module that provides the interface between the low-level data stored in the database and the application programs and queries submitted to the system.

    The storage manager is responsible for the following tasks:
    - Interaction with the file manager
    - Efficient storing, retrieving and updating of data

    Issues:
    - Storage access
    - File organization
    - Indexing and hashing

    Query Processing:

    1. Parsing and translation
    2. Optimization
    3. Evaluation
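
    One way to peek at the optimization step is to ask a DBMS which evaluation strategy it picked. The sketch below uses SQLite's EXPLAIN QUERY PLAN statement as one concrete example; other systems expose plans differently. The account table, the query, and the index name idx_account_branch are hypothetical, and the exact wording of the plan text varies between SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account (account_number TEXT, branch_name TEXT, balance INTEGER)"
)

query = "SELECT balance FROM account WHERE branch_name = 'Downtown'"


def plan(conn, sql):
    """Return the human-readable step descriptions of the chosen plan."""
    # The last column of each plan row holds the description.
    return [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]


# Without an index the only option is to scan the whole table.
plan_without_index = plan(conn, query)

# With an index on the filtered column the optimizer can search instead.
conn.execute("CREATE INDEX idx_account_branch ON account (branch_name)")
plan_with_index = plan(conn, query)

print(plan_without_index)
print(plan_with_index)
```

    The query text is identical in both runs; only the plan chosen for evaluating it changes.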

    Transaction Management:

    A transaction is a collection of operations that performs a single logical function in a database application.

    The transaction-management component ensures that the database remains in a consistent (correct) state despite system failures (e.g., power failures and operating system crashes) and transaction failures.

    The concurrency-control manager controls the interaction among concurrent transactions to ensure the consistency of the database.
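
    Why the interaction among concurrent transactions needs controlling can be seen in a tiny simulation of two deposits on one balance. This is a sketch only, not how a concurrency-control manager is implemented: the schedule format, the transaction names T1 and T2, and the run_schedule helper are hypothetical, and the interleaving is replayed deterministically rather than with real threads.

```python
def run_schedule(schedule, start_balance):
    """Replay (transaction, operation, amount) steps against one shared balance."""
    balance = start_balance
    private_copy = {}  # the value each transaction read earlier
    for txn, operation, amount in schedule:
        if operation == "read":
            private_copy[txn] = balance
        elif operation == "write":
            # Each transaction writes back what it read plus its own deposit.
            balance = private_copy[txn] + amount
    return balance


# Serial schedule: T1 finishes before T2 starts.
serial = [
    ("T1", "read", 0), ("T1", "write", 50),
    ("T2", "read", 0), ("T2", "write", 50),
]

# Uncontrolled interleaving: both read the old balance before either writes.
interleaved = [
    ("T1", "read", 0), ("T2", "read", 0),
    ("T1", "write", 50), ("T2", "write", 50),
]

serial_result = run_schedule(serial, 100)
interleaved_result = run_schedule(interleaved, 100)
print(serial_result, interleaved_result)
```

    In the interleaved schedule one deposit is lost, which is the kind of inconsistency that concurrency control is meant to rule out.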

    History of Database Systems:

    1950s and early 1960s:
    - Data processing using magnetic tapes for storage
        - Tapes provide only sequential access
    - Punched cards for input

    Late 1960s and 1970s:
    - Hard disks allow direct access to data
    - Network and hierarchical data models in widespread use
    - Ted Codd defines the relational data model
        - Would win the ACM Turing Award for this work
        - IBM Research begins System R prototype
        - UC Berkeley begins Ingres prototype
    - High-performance (for the era) transaction processing

    1980s:
    - Research relational prototypes evolve into commercial systems
        - SQL becomes industry standard
    - Parallel and distributed database systems
    - Object-oriented database systems

    1990s:
    - Large decision support and data-mining applications
    - Large multi-terabyte data warehouses
    - Emergence of Web commerce

    2000s:
    - XML and XQuery standards
    - Automated database administration
    - Increasing use of highly parallel database systems
    - Web-scale distributed data storage systems

    Database Users:

    Users are differentiated by the way they expect to interact with the system:
    - Application programmers: interact with the system through DML calls
    - Sophisticated users: form requests in a database query language
    - Specialized users: write specialized database applications that do not fit into the traditional data processing framework
    - Naive users: invoke one of the permanent application programs that have been written previously
        - Examples: people accessing a database over the web, bank tellers, clerical staff

    Database Administrator:

    Coordinates all the activities of the database system and has a good understanding of the enterprise's information resources and needs.

    Database administrator's duties include:
    - Storage structure and access method definition
    - Schema and physical organization modification
    - Granting users authority to access the database
    - Backing up data
    - Monitoring performance and responding to changes
        - Database tuning

    Database Architecture:

    The architecture of a database system is greatly influenced by the underlying computer system on which the database is running:
    - Centralized
    - Client-server
    - Parallel (multiple processors and disks)
    - Distributed