Week 5 Lecture Distributed Database Management Systems Samuel Conn, Asst Professor Suggestions for using the Lecture Slides

Week 5 Lecture

Distributed Database Management Systems

Samuel Conn, Asst Professor

Suggestions for using the Lecture Slides

In this lecture, you will learn:

What a distributed database management system (DDBMS) is and what its components are

How database implementation is affected by different levels of data and process distribution

How transactions are managed in a distributed database environment

How database design is affected by the distributed database environment

Evolution of DDBMS

Distributed database management system (DDBMS)

• Interconnected computer systems
• Data and processing functions reside on multiple sites

1970s: Centralized DBMS

1980s: Social and technical changes

• Ad hoc capability required
• Decentralized management structure common

1990s: New forces

• Internet and the World Wide Web used for data access and distribution
• Data analysis through data mining and data warehousing

DDBMS Advantages

• Data located near site with greatest demand
• Faster data access
• Faster data processing
• Growth facilitation
• Improved communications
• Reduced operating costs
• User-friendly interface
• Less danger of single-point failure
• Processor independence

DDBMS Disadvantages

• Complexity of management and control
• Security
• Lack of standards
• Increased storage requirements
• Greater difficulty in managing data environment
• Increased training costs

Distributed Processing

Shares the database’s logical processing among physically independent sites connected through a network

Figure 10.1

Distributed Database

Stores a logically related database over physically independent sites

Figure 10.2

Distributed Database vs. Distributed Processing

Distributed processing

• Does not require a distributed database
• May be based on a single database on a single computer
• Copies or parts of the database processing functions must be distributed to all data storage sites

Distributed database

• Requires distributed processing

Both

• Require a network to connect components

Functions of DDBMS

• Application/end user interface
• Validation to analyze data requests
• Transformation to determine request components
• Query optimization to find the best access strategy
• Mapping to determine the data location
• I/O interface to read or write data
• Formatting to prepare the data for presentation
• Security to provide data privacy
• Backup and recovery
• DB administration
• Concurrency control
• Transaction management

Centralized Database

Figure 10.3

Fully Distributed Database Management System

Figure 10.4

DDBMS Components

• Computer workstations
• Network hardware and software components
• Communications media
• Transaction processor (TP), also called application processor (AP) or transaction manager (TM)
• Data processor (DP), also called data manager (DM)

Distributed Database Components

Figure 10.5

DDBMS Protocols

Interface with network to transport data and commands between DPs and TPs

Synchronize data received from DPs and route to appropriate TPs

Ensure common database functions

• Security
• Concurrency control
• Backup and recovery

Levels of Data and Process Distribution

Database systems can be classified based on process distribution and data distribution

Table 10.1
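The classification can be sketched as a small function of two counts. This is only an illustration, assuming the three levels named in this lecture (SPSD, MPSD, MPMD); the function name `classify_distribution` and its parameters are hypothetical and do not come from the slides or from Table 10.1.

```python
def classify_distribution(processing_sites: int, data_sites: int) -> str:
    """Map site counts to one of the distribution levels covered in this lecture."""
    if processing_sites < 1 or data_sites < 1:
        raise ValueError("need at least one processing site and one data site")
    if data_sites == 1:
        # All data on one site: the level depends only on where processing happens.
        return "SPSD" if processing_sites == 1 else "MPSD"
    if processing_sites == 1:
        # Multiple data sites with a single processing site is not one of the
        # three levels discussed in this lecture.
        return "not applicable"
    return "MPMD"


print(classify_distribution(1, 1))
print(classify_distribution(5, 1))
print(classify_distribution(5, 3))
```

A host with dumb terminals maps to SPSD, a file server shared over a LAN maps to MPSD, and a fully distributed DDBMS maps to MPMD.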

Single-Site Processing, Single-Site Data (SPSD)

• All processing on single CPU or host computer
• All data are stored on host computer disk
• DBMS located on the host computer
• DBMS accessed by dumb terminals
• Typical of mainframe and minicomputer DBMSs
• Typical of first generation of single-user microcomputer databases

Single-Site Processing, Single-Site Data (cont’d.)

Figure 10.6

Multiple-Site Processing, Single-Site Data (MPSD)

• Requires network file server
• Applications accessed through LAN
• Variation known as client/server architecture

Figure 10.7

Multiple-Site Processing, Multiple-Site Data (MPMD)

Fully distributed DDBMS with support for multiple DPs and TPs at multiple sites

Homogeneous

• Integrate one type of centralized DBMS over the network

Heterogeneous

• Integrate different types of centralized DBMSs over a network

Heterogeneous Distributed Database Scenario

Figure 10.8

Distributed DB Transparency

Allows each end user to feel like the database’s only user

Hides complexities of distributed database

Transparency features

• Distribution
• Transaction
• Failure
• Performance
• Heterogeneity

Distribution Transparency

Allows management of a physically dispersed database as though it were centralized

Three levels

• Fragmentation transparency
• Location transparency
• Local mapping transparency

Table 10.2

Transaction Transparency

Ensures transactions maintain integrity and consistency

Completed only if all involved database sites complete their part of the transaction

Management mechanisms

• Remote request
• Remote transaction
• Distributed transaction
• Distributed request
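The four mechanisms can be told apart by how many requests a transaction contains and how many DP sites each request touches. The sketch below assumes the usual textbook definitions, which the slide itself does not spell out: a remote request is one statement against one DP, a remote transaction is several statements against one DP, a distributed transaction is several statements that each touch one DP but together span several, and a distributed request is a single statement that touches several DPs. The function name and the site labels are hypothetical.

```python
def classify_transaction(requests: list[set[str]]) -> str:
    """Classify a transaction given the set of DP sites each request touches."""
    if not requests or any(not sites for sites in requests):
        raise ValueError("every transaction needs requests, and every request a site")
    if any(len(sites) > 1 for sites in requests):
        # At least one single statement spans several DP sites.
        return "distributed request"
    all_sites = set().union(*requests)
    if len(all_sites) > 1:
        # Each statement stays on one site, but the transaction spans several.
        return "distributed transaction"
    return "remote request" if len(requests) == 1 else "remote transaction"


print(classify_transaction([{"B"}]))
print(classify_transaction([{"B"}, {"B"}]))
print(classify_transaction([{"B"}, {"C"}]))
print(classify_transaction([{"B", "C"}]))
```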

Remote Request

Figure 10.10

Remote Transaction

Figure 10.11

Distributed Transaction

Figure 10.12

Distributed Requests

Figure 10.13

Distributed Requests (cont’d.)

Figure 10.14

Distributed Concurrency Control

Multi-site, multiple-process operations more likely to create data inconsistencies and deadlocked transactions

Problems

• Transaction committed by local DP
• One DP could not commit transaction’s result
• Yields inconsistent database

Two-Phase Commit Protocol

• DO-UNDO-REDO protocol
• Write-ahead protocol

Two kinds of nodes

• Coordinator
• Subordinates

Phases

Preparation

• Coordinator sends message to all subordinates
• Confirms all are ready to commit or abort

Final commit

• Ensures all subordinates have committed or aborted
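The two phases can be sketched in a few lines of Python. This is a simplified, in-memory illustration rather than a real implementation: the `Subordinate` class, the `can_commit` flag, and the log entry strings are assumptions made for the example, and network messages, timeouts, and crash recovery (the DO-UNDO-REDO part) are left out.

```python
class Subordinate:
    """A participating node that votes in phase 1 and obeys the decision in phase 2."""

    def __init__(self, name: str, can_commit: bool = True):
        self.name = name
        self.can_commit = can_commit  # hypothetical flag standing in for local checks
        self.log = []                 # write-ahead log: an entry is appended before state changes
        self.state = "active"

    def prepare(self) -> bool:
        # Phase 1: record the vote in the log, then report it to the coordinator.
        self.log.append("PREPARED" if self.can_commit else "NOT PREPARED")
        return self.can_commit

    def commit(self) -> None:
        self.log.append("COMMIT")
        self.state = "committed"

    def abort(self) -> None:
        self.log.append("ABORT")
        self.state = "aborted"


def two_phase_commit(subordinates: list) -> str:
    """Coordinator logic: commit only if every subordinate voted to commit."""
    # Phase 1 (preparation): collect a vote from every subordinate.
    votes = [node.prepare() for node in subordinates]
    decision = all(votes)
    # Phase 2 (final commit): apply the same decision at every subordinate.
    for node in subordinates:
        if decision:
            node.commit()
        else:
            node.abort()
    return "committed" if decision else "aborted"


nodes = [Subordinate("A"), Subordinate("B", can_commit=False), Subordinate("C")]
print(two_phase_commit(nodes), [node.state for node in nodes])
```

Because one subordinate cannot commit, every node aborts, which avoids the inconsistent state in which only some sites have applied the transaction.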

Performance Transparency and Query Optimization

Objective: Minimize total cost associated with execution of request

Main costs

• Access time
• Communication
• CPU time

Basis for query optimization algorithms

• Optimum execution order
• Sites accessed to minimize communication costs

Dynamic or static optimization

Statistically based vs. rule-based query optimization algorithms
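The objective can be illustrated with a toy cost model that sums the three main costs for each candidate plan and picks the cheapest. The `Plan` class, the plan names, and all cost figures below are made up for the example; a real optimizer would estimate them from statistics or rules.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Plan:
    """A candidate execution plan with its three estimated cost components."""
    name: str
    access_cost: float         # disk I/O time
    communication_cost: float  # time to move data between sites
    cpu_cost: float            # processing time

    @property
    def total_cost(self) -> float:
        return self.access_cost + self.communication_cost + self.cpu_cost


def cheapest_plan(plans: list) -> Plan:
    """Return the plan with the lowest total estimated cost."""
    return min(plans, key=lambda plan: plan.total_cost)


# Illustrative numbers only: shipping a whole table to the requesting site
# versus filtering at the remote site and shipping only the result.
candidates = [
    Plan("ship whole table", access_cost=10, communication_cost=50, cpu_cost=5),
    Plan("filter remotely, ship result", access_cost=12, communication_cost=8, cpu_cost=6),
]
best = cheapest_plan(candidates)
print(best.name, best.total_cost)
```

In this example communication dominates the total, which is why the choice of sites accessed matters so much in a distributed setting.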

Distributed Database Design

Partition database into fragments

• Horizontal
• Vertical
• Mixed

Fragments to replicate

• Storage of data copies at multiple sites
• Fully replicated, partially replicated, or unreplicated databases

Data allocation

• Where to locate data
• Centralized, partitioned, replicated
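Horizontal and vertical fragmentation can be sketched with plain Python dictionaries standing in for table rows. The sample table, its column names, and the helper functions are all hypothetical; the sketch assumes that horizontal fragments are disjoint subsets of rows and that every vertical fragment repeats the key column so the rows can be joined back together.

```python
rows = [
    {"id": 1, "name": "Ada", "state": "TN", "balance": 100},
    {"id": 2, "name": "Bo", "state": "FL", "balance": 250},
    {"id": 3, "name": "Cy", "state": "TN", "balance": 75},
]


def horizontal_fragments(table, column):
    """Split rows into fragments by the value of one column (subsets of rows)."""
    fragments = {}
    for row in table:
        fragments.setdefault(row[column], []).append(row)
    return fragments


def vertical_fragment(table, key, columns):
    """Keep the key plus the chosen columns for every row (subset of columns)."""
    return [{key: row[key], **{c: row[c] for c in columns}} for row in table]


def rejoin(left, right, key):
    """Rebuild full rows from two vertical fragments by matching on the key."""
    right_by_key = {row[key]: row for row in right}
    return [{**row, **right_by_key[row[key]]} for row in left]


by_state = horizontal_fragments(rows, "state")
names = vertical_fragment(rows, "id", ["name", "state"])
balances = vertical_fragment(rows, "id", ["balance"])
print(sorted(by_state), len(by_state["TN"]))
print(rejoin(names, balances, "id") == rows)
```

Mixed fragmentation would apply both steps, for example splitting by `state` first and then separating the columns within each state's fragment.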

Client/Server Advantages Over DDBMS

• Client/server less expensive
• Client/server solutions allow use of microcomputer’s GUI
• More people with PC skills than mainframe skills
• PC is well established in workplace
• Numerous data analysis and query tools exist
• Considerable cost advantages to off-loading application development

Client/Server Disadvantages

• Creates more complex environment with different platforms
• Increased number of users and sites creates security problems
• Training issues become more complex and expensive

Date’s 12 Commandments for Distributed Databases

1. Local Site Independence
2. Central Site Independence
3. Failure Independence
4. Location Transparency
5. Fragmentation Transparency
6. Replication Transparency

Date’s 12 Commandments for Distributed Databases

7. Distributed Query Processing
8. Distributed Transaction Processing
9. Hardware Independence
10. Operating System Independence
11. Network Independence
12. Database Independence