605.741 Distributed Database Systems - Distributed Database Design, David Silberberg

8 Distributed Database Design


Page 1: 8 Distributed Database Design

605.741 Distributed Database Systems

Distributed Database Design

David Silberberg

Page 2: 8 Distributed Database Design

Computer Architectures

The conceptual design and fundamental operational structure of a computer system

It is the functional description

Requirements (especially speeds and interconnections)

Design implementations for the various parts of a system

Focuses largely on how the system performs

Architecture typically refers to the internal structure of the system that enables logical operations

Architecture design issues revolve around:

Tradeoffs between cost and performance

Reliability

Feature set

Expandability


Page 3: 8 Distributed Database Design

DBMS Architectures

Architecture

The structure of a computer system

Usually based on a reference model, which is an idealized architecture model

Conceptual framework that divides system into manageable pieces

Demonstrates how the pieces are related

Architecture standards have 3 types of bases:

Component architecture

Functional architecture

Data architecture


Page 4: 8 Distributed Database Design

Component Architectures

Describes components of a system and their interrelationships

Each component provides functionality

Their interaction provides overall system functionality

This is the best decomposition if you want to build a system

Not the best for conceptually understanding the system

Usually, multiple components form a function


Page 5: 8 Distributed Database Design

Functional Architectures

Different classes of users are defined

A Functional Architecture defines the functions that users can perform

Usually, user classes are decomposed hierarchically

ANSI/SPARC architecture is an example

Does not help you build the system

Does not help you understand the complexity of the system


Page 6: 8 Distributed Database Design

Data Architectures

Data representations and views are defined

Architecture framework helps understand how they are realized

Since data is the central DBMS resource, this is the representation of choice

However, to fully define a system, you also need functions and components


Page 7: 8 Distributed Database Design

ANSI/X3/SPARC Committee DBMS Reference Architecture

User and Data views


[Figure: three-level architecture. Multiple external views map to a single conceptual view, which maps to the internal view.]

Page 8: 8 Distributed Database Design

ANSI/X3/SPARC Committee DBMS Reference Architecture (cont.)

External views are views of the data shared by end users and apps

    CREATE VIEW GOOD_CUST
    AS SELECT DISTINCT cust_name, cust_address
    FROM CUST, ORDER
    WHERE CUST.cust_no = ORDER.cust_no AND
          ORDER.quantity > 10

Conceptual view is the entire world view of the data environment

    CUST(cust_no, cust_name, cust_address) key is cust_no

Internal view is how the database views the structure of the data

    CUST( index on C#;
          C# : 4 bytes,
          C-name: 30 bytes,
          C-addr : 100 bytes )
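
The three views above can be tried out with a minimal sketch, assuming Python's built-in sqlite3 module as a stand-in DBMS. The sample rows and the helper good_customers are hypothetical, and the order table is named ORDERS here because ORDER is a reserved word in SQL. The internal view (index and byte layout) is left to SQLite, which keeps the PRIMARY KEY indexed on its own.

```python
import sqlite3


def good_customers(conn):
    """Return the rows visible through the GOOD_CUST external view."""
    return conn.execute(
        "SELECT cust_name, cust_address FROM GOOD_CUST ORDER BY cust_name"
    ).fetchall()


conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Conceptual view: the relations and their keys.
    CREATE TABLE CUST (
        cust_no      INTEGER PRIMARY KEY,
        cust_name    TEXT,
        cust_address TEXT
    );
    -- Assumption: named ORDERS because ORDER is a reserved word.
    CREATE TABLE ORDERS (
        cust_no  INTEGER REFERENCES CUST(cust_no),
        quantity INTEGER
    );

    -- External view: what one class of users and applications sees.
    CREATE VIEW GOOD_CUST AS
        SELECT DISTINCT cust_name, cust_address
        FROM CUST, ORDERS
        WHERE CUST.cust_no = ORDERS.cust_no AND ORDERS.quantity > 10;

    -- Hypothetical sample data.
    INSERT INTO CUST VALUES (1, 'Ann', '1 Main St'), (2, 'Bob', '2 Oak Ave');
    INSERT INTO ORDERS VALUES (1, 25), (1, 40), (2, 5);
""")

print(good_customers(conn))
```

Users of GOOD_CUST never see cust_no or the ORDERS table; that separation is what the external level is meant to provide.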


Page 9: 8 Distributed Database Design

Partial Schematic of ANSI/SPARC Architecture

All the different representations are integrated in the data dictionary/directory (not necessarily located in one place)

Enterprise administrator takes care of Conceptual DB

DB administrator

Takes care of Internal DB

Affects/uses Conceptual DB

Application administrator

Takes care of External DB

Affects/uses Conceptual DB

Data dictionary/directory

Enables schema integration

Transforms different representations

Defines internal and external database structures for applications


Page 10: 8 Distributed Database Design

Partial Schematic of ANSI/SPARC Architecture


[Figure: partial schematic of the ANSI/SPARC architecture. Role labels: Database Administrator, Enterprise Administrator, Application System Administrator, Application System Programmer, Application Programmer. Component labels: internal database schema processor, conceptual database schema processor, external database schema processor, internal storage/database transform, internal database/conceptual transform, conceptual/external database transform, internal database application program, external database application program, schema integration.]

Page 11: 8 Distributed Database Design

Architecture Models for Distributed DBs

Three categories:

Autonomy – Type of data independence

Distribution – Level of distribution

Heterogeneity – Differences in supported systems


Page 12: 8 Distributed Database Design

Definitions of Autonomy

Different definitions

Gligor and Popescu-Zeletin

Local operations not affected by participation in global multi-database systems

Query processing and optimization not affected by global query access

System consistency not compromised when DBs are added to/removed from global databases

Du and Elmagarmid

Design autonomy: DBs use data models and transaction management they want

Communication autonomy: DBs decide which data to provide to other DBs or applications

Execution autonomy: each DB executes queries that are presented to it in its own way


Page 13: 8 Distributed Database Design

Ozsu’s Definition of Autonomy

Classes of autonomy

0. Tight integration -- single image of DB available

1. Semi-autonomous

Databases determine what parts of database they want to share

Must be modified to exchange information with each other

2. Total isolation - stand-alone databases


Page 14: 8 Distributed Database Design

Database Distribution

0. None

1. Client/Server -- distributes functionality of DB

2. Peer-to-peer

Fully distributed

Act in concert with each other


Page 15: 8 Distributed Database Design

Heterogeneity

0. Homogeneous

1. Heterogeneous

Different hardware

Different data models

Different query languages

Different transaction models


Page 16: 8 Distributed Database Design

Examples

A0,D2,H0

Tightly integrated system

DBMSs located peer-to-peer

Same access, platforms, etc.

A1,D0,H1

Semiautonomous -- different type of data (video & text)

No distribution

Heterogeneous access

Heterogeneous, federated DBMS

A2, D1, H1

Autonomous database systems

Client/server architecture

Heterogeneous access

Functionality in middleware – three-layer architecture
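
The A/D/H labels used in these examples can be generated mechanically from the three classification axes. The following is a minimal sketch; the class and member names are hypothetical, and only the numeric levels come from the slides.

```python
from dataclasses import dataclass
from enum import IntEnum


class Autonomy(IntEnum):
    TIGHT_INTEGRATION = 0
    SEMI_AUTONOMOUS = 1
    TOTAL_ISOLATION = 2


class Distribution(IntEnum):
    NONE = 0
    CLIENT_SERVER = 1
    PEER_TO_PEER = 2


class Heterogeneity(IntEnum):
    HOMOGENEOUS = 0
    HETEROGENEOUS = 1


@dataclass(frozen=True)
class ArchitectureClass:
    """One point in the autonomy/distribution/heterogeneity space."""
    autonomy: Autonomy
    distribution: Distribution
    heterogeneity: Heterogeneity

    def label(self) -> str:
        # Build the compact form used on the slides, e.g. "A0,D2,H0".
        return (f"A{int(self.autonomy)},"
                f"D{int(self.distribution)},"
                f"H{int(self.heterogeneity)}")

    def describe(self) -> str:
        # Spell out the level chosen on each axis.
        axes = (self.autonomy, self.distribution, self.heterogeneity)
        return ", ".join(a.name.replace("_", " ").lower() for a in axes)


peer = ArchitectureClass(Autonomy.TIGHT_INTEGRATION,
                         Distribution.PEER_TO_PEER,
                         Heterogeneity.HOMOGENEOUS)
print(peer.label(), "->", peer.describe())
```

With 3 autonomy levels, 3 distribution levels, and 2 heterogeneity levels, the space has 18 combinations; the slides discuss only a few of them.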


Page 17: 8 Distributed Database Design

Client/Server Architecture (A*, D1, H*)

Multiple-client/single-server

Like a single DBMS available on one machine

Some differences with respect to transaction management

Multiple-client/multiple-server

Two options

Application manages data access

Application accesses one server, and it manages requests for information that reside on other servers

The first option places the burden on the client

The second option places the burden on the server -- "light clients"

From user perspective, client/server and peer-to-peer appear the same

Differences are in architecture
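
The two multiple-server options can be contrasted with a minimal sketch. The server names, relations, and function names below are hypothetical, and the network is replaced by dictionary lookups.

```python
# Hypothetical placement: which server holds which relation.
SERVERS = {
    "s1": {"CUST": ["Ann", "Bob"]},
    "s2": {"ORDERS": [25, 5]},
}


def client_managed_fetch(relation):
    """Option 1: the application knows the placement and picks the server."""
    for name, relations in SERVERS.items():
        if relation in relations:
            return name, relations[relation]
    raise KeyError(relation)


def server_managed_fetch(home_server, relation):
    """Option 2: a light client asks one server, which forwards if needed."""
    local = SERVERS[home_server]
    if relation in local:
        return home_server, local[relation]
    # The home server contacts the other servers on the client's behalf.
    for name, relations in SERVERS.items():
        if name != home_server and relation in relations:
            return name, relations[relation]
    raise KeyError(relation)


print(client_managed_fetch("ORDERS"))
print(server_managed_fetch("s1", "ORDERS"))
```

Both calls return the same data; what differs is which side has to hold the placement knowledge, which is the architectural difference the slide points to.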


Page 18: 8 Distributed Database Design

Client/Server Architecture


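[Figure: client/server architecture. Client side: user interface, application program, client DBMS, and communication software on the client operating system. Server side: communication software, semantic data controller, query optimizer, transaction manager, recovery manager, and runtime support processor on the server operating system. SQL queries flow from client to server and results flow back.]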

Page 19: 8 Distributed Database Design

Peer-to-Peer Architecture (A0, D2, H*)

Features

Data independence

Network transparency - supported by global schemas & mapping

Users query independent of location

Global mapping taken care of at GCS level

Local mapping taken care of at LCS/LIS level
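
The two mapping levels can be illustrated with a minimal sketch. It assumes, purely for illustration, that a global relation is split horizontally across two sites; the site names, fragment names, and rows are hypothetical.

```python
# Global mapping (GCS level): global relation -> (site, local relation).
GLOBAL_MAPPING = {
    "CUST": [("site1", "CUST_EAST"), ("site2", "CUST_WEST")],
}

# Local mapping (LCS/LIS level): each site resolves its own local relation
# to stored rows. A dictionary stands in for local storage here.
LOCAL_STORAGE = {
    "site1": {"CUST_EAST": [(1, "Ann")]},
    "site2": {"CUST_WEST": [(2, "Bob")]},
}


def scan_global(relation):
    """Read a global relation without the caller naming any site."""
    rows = []
    for site, local_relation in GLOBAL_MAPPING[relation]:
        rows.extend(LOCAL_STORAGE[site][local_relation])
    return sorted(rows)


print(scan_global("CUST"))
```

The caller names only CUST; moving a fragment to another site changes GLOBAL_MAPPING but not the query, which is the sense in which users query independent of location.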


Page 20: 8 Distributed Database Design

Peer-to-Peer Architecture (High-Level)


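[Figure: high-level peer-to-peer architecture. External schemas ES1, ES2, ES3 are defined over a single global conceptual schema (GCS), which maps to local conceptual schemas LCS1, LCS2, LCS3, each with its own local internal schema LIS1, LIS2, LIS3. Figure annotation: "for replication, etc."]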

Page 21: 8 Distributed Database Design

Peer-to-Peer Architecture


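[Figure: components of a peer-to-peer distributed DBMS. A user request passes through the UI handler, semantic data controller, global query optimizer, and global execution monitor, which use the external schema, GCS, and GD/D; it then reaches the local query processor, local recovery manager, and runtime support processor, which use the LCS, system log, and LIS.]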

Page 22: 8 Distributed Database Design

Peer-to-Peer Architecture Elements

UI handler - interprets user commands

Semantic data controller - checks integrity constraints, authorizations, syntax, etc.

Global query optimizer and decomposer - minimizes cost of global query, finds best strategy, etc.

Distributed execution monitor - coordinates distributed execution of the request, distributed transaction manager, etc.

Local query processor - chooses best access path, minimizes cost of query, finds best strategy, etc.

Local recovery manager - maintains the database consistency

Run-time support - O/S routines that interact with database data files


Page 23: 8 Distributed Database Design

Multiple Database Architecture (A2, D*, H*)

Difference between MDBMS and distributed DBMS

MDBMS

Bottom-up design

GCS describes some of the databases

GCS is subset of the databases

Distributed DBMSs

Top-down design

GCS describes all of the databases

GCS is union of the databases
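
The subset/union contrast can be made concrete with sets. This is a minimal sketch with hypothetical relation names; it shows only what the GCS covers in each case and does not model the top-down or bottom-up design process itself.

```python
# Hypothetical local schemas, each reduced to a set of relation names.
LOCAL_SCHEMAS = {
    "db1": {"CUST", "ORDERS", "PAYROLL"},
    "db2": {"PARTS", "SUPPLIERS"},
}

# In an MDBMS, each local database chooses what it is willing to share.
EXPORTED = {"db1": {"CUST", "ORDERS"}, "db2": {"PARTS"}}


def distributed_gcs(local_schemas):
    """Distributed DBMS: the GCS covers everything (union of the databases)."""
    return set().union(*local_schemas.values())


def multidatabase_gcs(local_schemas, exported):
    """MDBMS: the GCS covers only what each database exports (a subset)."""
    return set().union(
        *(local_schemas[db] & exported.get(db, set()) for db in local_schemas)
    )


print(sorted(distributed_gcs(LOCAL_SCHEMAS)))
print(sorted(multidatabase_gcs(LOCAL_SCHEMAS, EXPORTED)))
```

Here PAYROLL and SUPPLIERS remain private to their local databases in the MDBMS case, so the MDBMS GCS is a subset of the distributed one.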


Page 24: 8 Distributed Database Design

Multiple Database Architecture


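[Figure: MDBMS architecture with a GCS. Global external schemas GES1, GES2, GES3 are defined over the GCS. Each local database has its own local external schemas (LES) defined over its local conceptual schema (LCS), which maps to a local internal schema (LIS). The subscripts on the local schemas are not recoverable from the extracted text.]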

Page 25: 8 Distributed Database Design

Unilingual vs. Multilingual

Unilingual MDBMS – for local and global data

Users need to use different data models

Users may need to use different access languages

Burden is on the user

Multilingual MDBMS – for local and global data

GCS must understand many query languages

Users use one data model through external schema

Users use one (their own) access language

Burden is on the MDBMS


Page 26: 8 Distributed Database Design

Models Without GCS

Manages several databases without a global schema

Responsibility for providing access to multiple databases lies with each of the applications

They must provide the mappings between ESs and LCSs

Different from a distributed database

Each local system has a DBMS

MDBMS provides a layer of software that runs on top of DBMSs and allows applications to access them
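
Without a GCS, each application carries its own mapping from its external schema to the local conceptual schemas. The following minimal sketch shows such a mapping; the database names, relation names, and column names are all hypothetical, and dictionaries stand in for the local DBMSs.

```python
# Two local DBMSs, each with its own local conceptual schema (LCS).
LOCAL_DBS = {
    "db1": {"customers": [{"id": 1, "name": "Ann"}]},
    "db2": {"clients": [{"client_no": 2, "client_name": "Bob"}]},
}

# The application's own ES -> LCS mapping: for each external relation,
# the local relations that feed it and how their columns are renamed.
ES_TO_LCS = {
    "all_customers": [
        ("db1", "customers", {"id": "cust_no", "name": "cust_name"}),
        ("db2", "clients", {"client_no": "cust_no",
                            "client_name": "cust_name"}),
    ],
}


def read_external(relation):
    """Assemble an external relation directly from the local schemas."""
    rows = []
    for db, local_relation, rename in ES_TO_LCS[relation]:
        for row in LOCAL_DBS[db][local_relation]:
            rows.append({rename[col]: value for col, value in row.items()})
    return rows


print(read_external("all_customers"))
```

Every application that needs all_customers must maintain a mapping like ES_TO_LCS itself, which is the responsibility the slide assigns to the applications.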


Page 27: 8 Distributed Database Design

Models Without GCS (cont.)


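[Figure: MDBMS architecture without a GCS. Multi DB layer: external schemas ES1, ES2, ES3, defined directly over the local conceptual schemas. Local system layer: LCS1, LCS2, LCS3 and their local internal schemas LIS1, LIS2, LIS3.]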

Page 28: 8 Distributed Database Design

Conclusion

Data architectures provide a framework for distributed database systems

Data architectures focus on

User views

Data views

Levels of autonomy

Levels of distribution

Levels of heterogeneity

The tradeoffs among these issues will drive

Distributed data design

Distributed query processing algorithms
