Introduction to databases
© 2007 by Prentice Hall

Objectives
• Define terms
• Show limitations of conventional file processing
• Show advantages of databases
• Identify costs and risks of databases
• List components of database environment
• Describe evolution of database systems
• Describe database system development life cycle
• Explain prototyping and agile development approaches
• Explain roles of individuals
• Explain the three-schema architecture for databases

Definitions

• Database: organized collection of logically related data

• Data: stored representations of meaningful objects and events
– Structured: numbers, text, dates
– Unstructured: images, video, documents

• Information: data processed to increase knowledge in the person using the data

Data in context

Context helps users understand data

Graphical displays turn data into useful information that managers can use for decision making and interpretation

Summarized data

Metadata: Descriptions of the properties or characteristics of the data, including data types, field sizes, allowable values, and data context
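As a rough illustration of metadata, the sketch below asks a database to describe one of its own tables. It assumes Python's built-in sqlite3 module; the `employee` table and its columns are hypothetical and not taken from the slides.

```python
import sqlite3

# In-memory database with one hypothetical table.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE employee (
        emp_id INTEGER PRIMARY KEY,
        name   TEXT NOT NULL,
        salary REAL CHECK (salary >= 0)   -- an "allowable values" rule
    )
    """
)

# PRAGMA table_info returns one row per column:
# (cid, name, type, notnull, dflt_value, pk)
columns = conn.execute("PRAGMA table_info(employee)").fetchall()

# Keep the column name, its data type, and the NOT NULL / primary-key flags.
metadata = {
    name: (col_type, bool(notnull), bool(pk))
    for _, name, col_type, notnull, _, pk in columns
}

for name, info in metadata.items():
    print(name, info)
```

The point is that the descriptions of the data are stored with the data and can be queried, rather than living only inside each program.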

Old file processing systems at Pine Valley Furniture Company

Duplicate Data

Disadvantages of File Processing

• Program-Data Dependence
– All programs maintain metadata for each file they use

• Duplication of Data (data redundancy)
– Different systems/programs have separate copies of the same data

• Limited Data Sharing
– No centralized control of data

• Lengthy Development Times
– Programmers must design their own file formats

• Excessive Program Maintenance
– 80% of information systems budget

Problems with Data Dependency
• Each application program typically maintains its own data.
– One application’s data might be duplicated in another application.
– Then the users must do “double data entry”, or there needs to be an interface program that copies data from one application to the other.

• Each application program includes code for the metadata of each file (e.g., “record layouts” appear in each COBOL program).
– So, if you want to add a new field to a table (e.g., birth date added to each employee), every single program must be modified and retested.
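The record-layout problem can be sketched in a few lines. This is only an illustration in Python rather than COBOL; the layouts, field widths, and the `read_records` helper are all hypothetical.

```python
import struct

# Hypothetical fixed-width record layout hard-coded in each program:
# a 4-byte employee id followed by a 10-byte name.
OLD_LAYOUT = struct.Struct("<i10s")
# The layout after an 8-byte birth date (YYYYMMDD) is added to the file.
NEW_LAYOUT = struct.Struct("<i10s8s")


def read_records(data: bytes, layout: struct.Struct):
    """Split a flat file into records using the layout the program assumes."""
    if len(data) % layout.size != 0:
        raise ValueError("file length does not match the assumed record layout")
    return [
        layout.unpack_from(data, offset)
        for offset in range(0, len(data), layout.size)
    ]


# The employee file is rewritten with the new field.
file_bytes = (
    NEW_LAYOUT.pack(1, b"Ann".ljust(10), b"19800101")
    + NEW_LAYOUT.pack(2, b"Bob".ljust(10), b"19751231")
)

# A program that was updated to the new layout still works.
records = read_records(file_bytes, NEW_LAYOUT)

# A program that still carries the old layout can no longer read the file.
try:
    read_records(file_bytes, OLD_LAYOUT)
    old_program_ok = True
except ValueError:
    old_program_ok = False
```

Because the layout lives inside every program, each one has to be changed and retested when the file changes.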

Problems with Data Dependency
• Each application program has its own processing routines for reading, inserting, updating, and deleting data.
– Too many lines of code make the programs hard to maintain (modify while keeping out the bugs)
• Lack of coordination and central control
• Non-standard file formats

Problems with Data Redundancy

• Data inconsistency: Data changes in one file could cause inconsistencies if not changed in other copies of that data (which one is the latest/correct version?)

• Compromises in data integrity: An example of “compromised data integrity” would be allowing records that are referenced by other data to be deleted.
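To make the data-integrity example concrete, here is a minimal sketch of a database refusing to delete a record that other data still references. It assumes Python's built-in sqlite3 module; the `customer` and `customer_order` tables and their rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this setting is switched on.
conn.execute("PRAGMA foreign_keys = ON")

conn.execute(
    "CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
)
conn.execute(
    """
    CREATE TABLE customer_order (
        order_id    INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customer(customer_id)
    )
    """
)
conn.execute("INSERT INTO customer VALUES (1, 'Customer A')")
conn.execute("INSERT INTO customer_order VALUES (100, 1)")

# Order 100 references customer 1, so the delete is rejected.
try:
    conn.execute("DELETE FROM customer WHERE customer_id = 1")
    delete_blocked = False
except sqlite3.IntegrityError:
    delete_blocked = True

remaining = conn.execute("SELECT COUNT(*) FROM customer").fetchone()[0]
print(delete_blocked, remaining)
```

With separate files and no central control, nothing stops one program from removing a record that another program's data depends on.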

SOLUTION: The DATABASE Approach

• Central repository of data, accessed by all applications (so, no data redundancy).

• The data is managed (protected) by software called a Database Management System. No program can access the database except through the DBMS, so there are no data inconsistencies.

Database Management System

A software system that is used to create, maintain, and provide controlled access to user databases

DBMS manages data resources like an operating system manages hardware resources

[Figure: the Order Filing System, Invoicing System, and Payroll System all reach a central database through the DBMS. The central database contains employee, order, inventory, pricing, and customer data.]

Advantages of the Database Approach

• Program-data independence
• Less data redundancy
• Improved data consistency
• Improved data sharing
• Increased application development productivity
• Enforcement of standards
• Improved data quality
• Improved data accessibility and responsiveness
• Reduced program maintenance
• Improved decision support

Costs and Risks of the Database Approach

• Installation and conversion costs
• Management cost and complexity: specialized personnel are needed to support the database and DBMS.

• Organizational conflict: user groups may not agree about the details of the shared data, or who should be allowed to access/update it

Definitions
• Relational Database: a database that contains related tables and little/no redundancy. Typically accessed by business applications.

• Data model: Graphical representation (picture) of a database, showing its tables and how they are related.

• Database Management System (DBMS): software for managing the database. No program can see or modify the database unless it “passes through” the DBMS, which acts as a guard to ensure that
– Everyone sees only what they are allowed to see and modifies only what they are allowed to modify
– Data integrity is maintained (e.g., you can’t delete data that is being referenced by other data).

• Repository: A database management system can work with ANY data model. How? It stores the details about the data model (table names, field names and types, etc.). These details are called “METADATA”.

• Database Application:
– A business application program that supports business functions and accesses a database (e.g., add new data, update data, delete data, read data, summarize/report data)
– Usually called a Web application (if used in a browser and run over the internet) or a Windows app (if installed directly on a PC and run over a local area network).

One customer may place many orders, but each order is placed by a single customer

One-to-many relationship

One order has many order lines; each order line is associated with a single order

One-to-many relationship

One product can be in many order lines; each order line refers to a single product

One-to-many relationship

Therefore, one order involves many products and one product is involved in many orders

Many-to-many relationship
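The relationships on the last four slides can be sketched as tables. This is an illustrative schema using Python's built-in sqlite3 module; the table names, column names, and sample rows are assumptions, not the schema from the slides' figures.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE product  (product_id  INTEGER PRIMARY KEY, description TEXT);

    -- One customer, many orders: each order carries exactly one customer_id.
    CREATE TABLE orders (
        order_id    INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES customer(customer_id)
    );

    -- Each order line belongs to one order and refers to one product.
    -- Together the order lines link orders and products many-to-many.
    CREATE TABLE order_line (
        order_id   INTEGER REFERENCES orders(order_id),
        product_id INTEGER REFERENCES product(product_id),
        quantity   INTEGER,
        PRIMARY KEY (order_id, product_id)
    );

    INSERT INTO customer   VALUES (1, 'Customer A');
    INSERT INTO product    VALUES (10, 'Desk'), (20, 'Chair');
    INSERT INTO orders     VALUES (100, 1), (101, 1);
    INSERT INTO order_line VALUES (100, 10, 1), (100, 20, 4), (101, 20, 2);
    """
)

# One order involves many products...
products_in_order_100 = [
    row[0]
    for row in conn.execute(
        "SELECT p.description FROM order_line ol "
        "JOIN product p ON p.product_id = ol.product_id "
        "WHERE ol.order_id = 100 ORDER BY p.description"
    )
]

# ...and one product is involved in many orders.
orders_with_chair = [
    row[0]
    for row in conn.execute(
        "SELECT order_id FROM order_line WHERE product_id = 20 ORDER BY order_id"
    )
]

print(products_in_order_100, orders_with_chair)
```

The many-to-many relationship is never stored directly; it falls out of the two one-to-many relationships that meet in `order_line`.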

Enterprise Database Applications

• Enterprise Resource Planning (ERP)
– Integrate all enterprise functions (manufacturing, finance, sales, marketing, inventory, accounting, human resources)

• Data Warehouse
– Integrated decision support system derived from various operational databases

Enterprise Data Warehouse

– A data warehouse is an integrated decision support system derived from various operational databases. The data warehouse contains historical copies of transactions of those operational systems.

• By getting all that data into one place, users can then ask questions of the warehouse, such as which customers buy the most, when, and how often.

Enterprise Data Warehouse
• Each day, data is cleaned/reconciled (one system may call something one way and another system may call it another way) so that it can be loaded into the warehouse.

• While the data is being loaded, it is also being summarized. This is to make it more efficient to access. Otherwise there is too much detailed data that needs to be summarized on the fly and performance suffers.

• During the load process, the warehouse is not available for users (because the totals won’t be correct until the load is complete).
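The load-and-summarize step described above might look roughly like the sketch below. It assumes Python's built-in sqlite3 module; the `sales_detail` and `sales_by_customer` tables and the sample rows are hypothetical, and a real warehouse load would be far larger.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE sales_detail (customer TEXT, sale_date TEXT, amount REAL);
    -- Pre-calculated totals, stored so queries need not scan the detail rows.
    CREATE TABLE sales_by_customer (
        customer     TEXT PRIMARY KEY,
        order_count  INTEGER,
        total_amount REAL
    );
    """
)

# Hypothetical rows, already cleaned/reconciled from the operational systems.
extracted = [
    ("Customer A", "2007-01-05", 120.0),
    ("Customer B", "2007-01-05", 75.5),
    ("Customer A", "2007-01-06", 30.0),
]

# Load the detail and rebuild the summary inside one transaction, so the
# detail rows and the totals are committed together.
with conn:
    conn.executemany("INSERT INTO sales_detail VALUES (?, ?, ?)", extracted)
    conn.execute("DELETE FROM sales_by_customer")
    conn.execute(
        "INSERT INTO sales_by_customer "
        "SELECT customer, COUNT(*), SUM(amount) FROM sales_detail GROUP BY customer"
    )

# "Which customers buy the most?" is now answered from the summary table.
top_customer = conn.execute(
    "SELECT customer, total_amount FROM sales_by_customer "
    "ORDER BY total_amount DESC LIMIT 1"
).fetchone()
print(top_customer)
```

The summary table deliberately duplicates information that could be derived from the detail rows, trading redundancy for faster queries.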

Enterprise Data Warehouse
• It is interesting to note that good design principles for relational databases (no data redundancy, many constraints to maintain data integrity) are BAD design for a data warehouse.

• Data Warehouses (because they load lots of data from different sources) use lots of data redundancy (e.g., pre-calculated totals) to make queries run faster.

• Data Warehouses have few or no constraints so that the load process goes faster (so that the warehouse becomes available for users as soon as possible).

• We’ll learn more about warehouses later…

An enterprise data warehouse

• Data is extracted from multiple systems within a corporation.
• This data is transformed (as necessary) and merged (possibly adding data from external sources) into the data warehouse.

Evolution of DB Systems

Summary
• Define terms
• Show limitations of conventional file processing
• Show advantages of databases
• Identify costs and risks of databases
• Data Models
• Define Enterprise Data Warehouse
• Describe evolution of database systems