Distributed Databases
• Distributed Systems goal:
  – to offer local DB autonomy at geographically distributed locations
• Multiple CPUs – each has DBMS, but data distributed
• Loosely coupled
  – homogeneous
  – heterogeneous - different DBMSs - need ODBC, standard SQL
Advantages of DDBs
• distributed nature of some DB applications (bank branches)
• increased reliability and availability if a site fails - data can also be replicated at more than one site
• data sharing but also local control
• improved performance - smaller DBs exist at each site
• easier expansion
Client-Server
• Client-Server, (b) in figure
  – Client sends request for service (strict – fixed roles)
  – 3-tier architecture
    • Presentation tier
    • Logic tier
    • Data tier
Distributed DBSs (DDBS)
• Distributed DB, (c) in figure
  – WAN
  – Multiple CPUs – each has DBMS, but data distributed
  – lower communication rates
  – Heterogeneous machines
  – Homogeneous DDBS
    • same DBMSs
  – Heterogeneous DDBS
    • different DBMSs - need ODBC, standard SQL
Heterogeneous distributed DBSs (HDDBs)
• Data distributed and each site has own DBMS: ORACLE at one site, DB2 at another, etc.
• need ODBC, standard SQL
• usually transaction manager responsible for cooperation among sites
• must coordinate distributed transaction
• need data conversion and to access data at other sites
Federated DB - FDBS
• federated DB is a multidatabase that is autonomous, (a) in figure
• collection of cooperating DBSs that are heterogeneous
• preexisting DBs form new database
• Each DB specifies import/export schema (view)
  – keeps a partial view of total schema
• Each DB has its own local users, local transparency and DBA
  – appears centralized for local autonomous users
  – appears distributed for global users
Replication
• Full vs. partial replication
• Which copy to access
• Improves performance for global queries, but updates are a problem
• Ensure consistency of replicated copies of data
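One simple way to answer both questions above (which copy to access, and how to keep the copies consistent) is a read-one, write-all policy. The sketch below is only an illustration: the class `ReplicatedItem` and the site names are hypothetical, and it ignores site failures and concurrent transactions.

```python
class ReplicatedItem:
    """A single data item replicated at several sites (hypothetical model)."""

    def __init__(self, sites, value):
        # Every site starts with an identical copy.
        self.copies = {site: value for site in sites}

    def read(self, local_site):
        # Read-one: prefer the local copy; otherwise use any available copy.
        if local_site in self.copies:
            return self.copies[local_site]
        return next(iter(self.copies.values()))

    def write(self, value):
        # Write-all: an update must reach every copy to keep them consistent.
        for site in self.copies:
            self.copies[site] = value
        return len(self.copies)  # number of physical copies touched


item = ReplicatedItem(["site1", "site2", "site3"], value=100)
local_value = item.read("site2")   # served from the local copy
copies_updated = item.write(250)   # one logical update, three physical writes
```

Reads stay cheap because any copy will do, while every update pays for all copies, which is why replication helps queries but makes updates the problem.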
Data fragments
• Can distribute a whole relation at a site
or
• Data fragments
  – logical units of the DB assigned for storage at various sites
  – horizontal fragmentation - subset of tuples in the relation (select)
  – vertical fragmentation - keeps only certain attributes of relation (project); need a PK
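The two kinds of fragmentation can be sketched as a select and a project over a small in-memory relation. This is a minimal illustration, not a DBMS feature: the `EMPLOYEE` data, the `ssn` primary key, and the helper names are all assumptions made for the example.

```python
# Hypothetical EMPLOYEE relation; ssn is assumed to be the primary key.
EMPLOYEE = [
    {"ssn": 1, "name": "Ann", "salary": 4000, "dno": 4},
    {"ssn": 2, "name": "Bob", "salary": 6000, "dno": 4},
    {"ssn": 3, "name": "Cid", "salary": 4500, "dno": 5},
]

def horizontal_fragment(relation, predicate):
    """Select: keep the tuples that satisfy the fragment condition."""
    return [row for row in relation if predicate(row)]

def vertical_fragment(relation, attributes, pk="ssn"):
    """Project: keep only some attributes, always including the PK."""
    keep = [pk] + [a for a in attributes if a != pk]
    return [{a: row[a] for a in keep} for row in relation]

def rejoin(left, right, pk="ssn"):
    """Rebuild full tuples by joining two vertical fragments on the PK."""
    by_key = {row[pk]: row for row in right}
    return [{**row, **by_key[row[pk]]} for row in left]

dept4 = horizontal_fragment(EMPLOYEE, lambda r: r["dno"] == 4)
pay = vertical_fragment(EMPLOYEE, ["salary"])
names = vertical_fragment(EMPLOYEE, ["name", "dno"])
rebuilt = rejoin(names, pay)  # possible only because both fragments carry the PK
```

The `rejoin` step shows why a vertical fragment needs the PK: without it there is no way to match the pieces of a tuple stored at different sites.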
Fragments cont’d
• Horizontal fragments:
  – disjoint - each tuple is a member of only 1 fragment, e.g. fragment condition salary < 5000 and dno=4
  – complete - set of fragments whose conditions include every tuple
• Complete vertical fragment:
  – L1 ∪ L2 ∪ ... ∪ Ln = attributes of R
  – Li ∩ Lj = PK(R) for any i ≠ j
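These conditions can be checked mechanically. The following is a small sketch under assumed data: the rows, the predicates, and the function names are hypothetical, and fragment conditions are modelled as plain Python predicates.

```python
def is_complete(relation, predicates):
    """Every tuple satisfies at least one fragment condition."""
    return all(any(p(row) for p in predicates) for row in relation)

def is_disjoint(relation, predicates):
    """No tuple satisfies more than one fragment condition."""
    return all(sum(1 for p in predicates if p(row)) <= 1 for row in relation)

def is_complete_vertical(attribute_lists, all_attributes, pk):
    """Union of the lists equals the attributes of R; any two lists share only the PK."""
    union = set().union(*attribute_lists)
    pairs_ok = all(
        set(a) & set(b) == set(pk)
        for i, a in enumerate(attribute_lists)
        for b in attribute_lists[i + 1:]
    )
    return union == set(all_attributes) and pairs_ok

# Hypothetical sample tuples.
rows = [
    {"ssn": 1, "salary": 4000, "dno": 4},
    {"ssn": 2, "salary": 6000, "dno": 4},
    {"ssn": 3, "salary": 4500, "dno": 5},
]
by_dept = [lambda r: r["dno"] == 4, lambda r: r["dno"] == 5]
overlapping = [lambda r: r["salary"] < 5000, lambda r: r["dno"] == 4]

by_dept_ok = is_complete(rows, by_dept) and is_disjoint(rows, by_dept)
overlap_disjoint = is_disjoint(rows, overlapping)  # tuple 1 matches both conditions
vertical_ok = is_complete_vertical(
    [["ssn", "name"], ["ssn", "salary", "dno"]],
    ["ssn", "name", "salary", "dno"],
    pk=["ssn"],
)
```

Fragmenting by department number alone is both complete and disjoint here, while the two overlapping conditions are complete but not disjoint.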
Example replication/fragmentation
• Example of fragments for company DB:
  – site 1 - company headquarters gets entire DB
  – sites 2, 3 – horizontal fragments based on dept. no.
Increased complexity
Additional functions needed:
• global vs. local queries
• keep track of data and replication
• execution strategies if data at > 1 site
  – which copy to access
  – maintain consistency of copies
To process a query
• Must use data dictionary that includes info on data distribution among servers
• Ensure atomicity
• Parse user query
  – decomposed into independent site queries
  – each site query sent to appropriate server site
  – site processes local query, sends result to result site
  – result site combines results of subqueries
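The steps above can be sketched with a toy data dictionary that records which site holds which horizontal fragment. Everything named here (`DATA_DICTIONARY`, `SITE_DATA`, the site names, the helper functions) is hypothetical, the "network" is a plain function call, and atomicity is not addressed at all.

```python
# Hypothetical data dictionary: which site stores which horizontal fragment.
DATA_DICTIONARY = {
    "site2": {"dno": 4},
    "site3": {"dno": 5},
}

# Hypothetical local data held at each site.
SITE_DATA = {
    "site2": [{"name": "Ann", "salary": 4000, "dno": 4},
              {"name": "Bob", "salary": 6000, "dno": 4}],
    "site3": [{"name": "Cid", "salary": 4500, "dno": 5}],
}

def decompose(wanted_dnos):
    """Use the dictionary to pick only the sites that can hold matching tuples."""
    return [site for site, info in DATA_DICTIONARY.items()
            if wanted_dnos is None or info["dno"] in wanted_dnos]

def run_at_site(site, min_salary):
    """Stand-in for shipping a site query and processing it locally."""
    return [row["name"] for row in SITE_DATA[site] if row["salary"] >= min_salary]

def global_query(min_salary, wanted_dnos=None):
    """Result site: gather the partial results and combine them."""
    combined = []
    for site in decompose(wanted_dnos):
        combined.extend(run_at_site(site, min_salary))
    return sorted(combined)

well_paid = global_query(4200)            # touches both sites
dept5_only = global_query(0, {5})         # dictionary prunes site2
```

The dictionary lookup in `decompose` is what lets a query restricted to one department skip sites that cannot contribute any tuples.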
Architectures
• Distributed Systems goal: to offer local DB autonomy at geographically distributed locations
versus
• Parallel Systems goal: to construct a faster centralized computer
  – Improve performance through parallelization
  – Distribution of data governed by performance
  – Processing, I/O simultaneously
Parallel DBSs
• Shared-memory multiprocessor
  – get N times as much work with N CPUs
  – MIMD, SIMD - equal access to same data, massively parallel
• Parallel shared nothing
  – data split among CPUs; each node has its own CPU + memory - called a site
  – divide work for transactions, communicate over high-speed networks (LANs)
  – homogeneous machines
Query Parallelism
• Decompose query into parts that can be executed in parallel at several sites
  – Intra query parallelism
• If shared nothing & horizontally fragmented:
  Select name, phone from account where age > 65
  – Decompose into K different queries
  – Result site accepts all and puts together (order by, count)
• What if a join and table is fragmented?
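For the simple select above, intra-query parallelism over horizontal fragments can be sketched as follows. This is an illustration under stated assumptions: each "site" is simulated by a private in-memory SQLite database, the `account` rows are made up, and threads stand in for separate machines.

```python
import sqlite3
from concurrent.futures import ThreadPoolExecutor

# Hypothetical horizontal fragments of ACCOUNT, one per site (K = 3).
FRAGMENTS = [
    [("Ann", "555-0101", 70), ("Bob", "555-0102", 40)],
    [("Cid", "555-0103", 66)],
    [("Dee", "555-0104", 30), ("Eve", "555-0105", 81)],
]

SITE_QUERY = "SELECT name, phone FROM account WHERE age > 65"

def run_site_query(fragment):
    """Simulate one site: load its fragment, run the same query locally."""
    conn = sqlite3.connect(":memory:")  # private in-memory DB per site
    try:
        conn.execute("CREATE TABLE account (name TEXT, phone TEXT, age INTEGER)")
        conn.executemany("INSERT INTO account VALUES (?, ?, ?)", fragment)
        return conn.execute(SITE_QUERY).fetchall()
    finally:
        conn.close()

def parallel_query(fragments):
    """Run K site queries concurrently; the result site merges and orders."""
    with ThreadPoolExecutor(max_workers=len(fragments)) as pool:
        partials = list(pool.map(run_site_query, fragments))
    merged = [row for part in partials for row in part]
    return sorted(merged), len(merged)  # global order by name, and count

rows, count = parallel_query(FRAGMENTS)
```

Each site can answer its part without seeing the other fragments, so only the final ordering and counting happen at the result site. A join over fragmented tables does not decompose this cleanly, because matching tuples may be stored at different sites.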