Upload
farooq-shad
View
225
Download
0
Embed Size (px)
Citation preview
7/29/2019 Chap 12 Corrected
1/48
Chapter 12
Data and
DatabaseAdministration
Modern Database ManagementModern Database Management
88thth EditionEdition
Jeffrey A. Hoffer, Mary B. Prescott,Jeffrey A. Hoffer, Mary B. Prescott,
Fred R. McFaddenFred R. McFadden2
007byPrenticeHall
FAROOQ
1
7/29/2019 Chap 12 Corrected
2/48
Chapter 12
Objectives
Definition of terms
List functions and roles of data/databaseadministration
Describe role of data dictionaries and information
repositories Compare optimistic and pessimistic concurrency
control
Describe problems and techniques for data
security Describe problems and techniques for data
recovery
Describe database tuning issues and list areaswhere changes can be done to tune the database
Describe importance and measures of data
2
7/29/2019 Chap 12 Corrected
3/48
Chapter 12
Traditional AdministrationDefinitions Data AdministrationData Administration: A high-level
function that is responsible for theoverall management of data resourcesin an organization, includingmaintaining corporate-wide definitionsand standards
Database AdministrationDatabase Administration: Atechnical function that is responsiblefor physical database design and fordealing with technical issues such assecurity enforcement, databaseperformance, and backup and recovery
3
7/29/2019 Chap 12 Corrected
4/48
Chapter 12
Traditional DataAdministration Functions Data policies, procedures, standards
Planning
Data conflict (ownership) resolution
Managing the information repository
Internal marketing of DA concepts
4
7/29/2019 Chap 12 Corrected
5/48
Chapter 12
Traditional DatabaseAdministration Functions Selection of DBMS and software tools
Installing/upgrading DBMS
Tuning database performance
Improving query processing performance
Managing data security, privacy, and integrity
Data backup and recovery
5
7/29/2019 Chap 12 Corrected
6/48
Chapter 12
Evolving Approaches toData Administration Blend data and database administration into
one role
Fast-track development monitoringdevelopment process (analysis, design,
implementation, maintenance) Procedural DBAsmanaging quality of triggers
and stored procedures
eDBAmanaging Internet-enabled database
applications PDA DBAdata synchronization and personal
database management
Data warehouse administration 6
D W h
7/29/2019 Chap 12 Corrected
7/48
Chapter 12
Data WarehouseAdministration New role, coming with the growth in data
warehouses Similar to DA/DBA roles
Emphasis on integration and coordination ofmetadata/data across many data sources
Specific roles: Support DSS applications
Manage data warehouse growth
Establish service level agreements regarding datawarehouses and data marts
7
7/29/2019 Chap 12 Corrected
8/48
Chapter 12
Open Source DBMSs
An alternative to proprietary packages such asOracle, Microsoft SQL Server, or Microsoft Access
mySQL is an example of open-source DBMS
Less expensive than proprietary packages
Source code available, for modification
8
7/29/2019 Chap 12 Corrected
9/48
Chapter 12
9
Figure 12-2 Data modeling responsibilities
7/29/2019 Chap 12 Corrected
10/48
Chapter 12
Database Security
Database Security:Database Security:Protection of the data against
accidental or intentional loss,destruction, or misuse
Increased difficulty due to
Internet access andclient/server technologies
10
7/29/2019 Chap 12 Corrected
11/48
Chapter 12
11
Figure 12-3 Possible locations of data security threats
7/29/2019 Chap 12 Corrected
12/48
Chapter 12
Threats to Data Security
Accidental losses attributable to: Human error
Software failure
Hardware failure
Theft and fraud
Improper data access: Loss of privacy (personal data)
Loss of confidentiality (corporate data) Loss of data integrity
Loss of availability (through, e.g.sabotage)
12
7/29/2019 Chap 12 Corrected
13/48
Chapter 12
13
Figure 12-4 Establishing Internet Security
7/29/2019 Chap 12 Corrected
14/48
Chapter 12
Web Security
Static HTML files are easy to secure Standard database access controls Place Web files in protected directories on
server
Dynamic pages are harder Control of CGI scripts User authentication Session security
SSL for encryption Restrict number of users and open ports Remove unnecessary programs
14
7/29/2019 Chap 12 Corrected
15/48
Chapter 12
W3C Web PrivacyStandard Platform for Privacy Protection (P3P)
Addresses the following:
Who collects data
What data is collected and for what purpose Who is data shared with
Can users control access to their data
How are disputes resolved
Policies for retaining data
Where are policies kept and how can they beaccessed 15
7/29/2019 Chap 12 Corrected
16/48
Chapter 12
Database SoftwareSecurity Features
Views or subschemas
Integrity controls
Authorization rules
User-defined procedures Encryption
Authentication schemes
Backup, journalizing, and checkpointing
16
7/29/2019 Chap 12 Corrected
17/48
Chapter 12
Views and IntegrityControls Views
Subset of the database that is presented toone or more users
User can be given access privilege to viewwithout allowing access privilege tounderlying tables
Integrity Controls
Protect data from unauthorized use Domainsset allowable values
Assertionsenforce database conditions 17
7/29/2019 Chap 12 Corrected
18/48
Chapter 12
Authorization Rules
Controls incorporated in the datamanagement system
Restrict: access to data
actions that people can take on data
Authorization matrix for: Subjects
Objects
Actions
Constraints18
7/29/2019 Chap 12 Corrected
19/48
Chapter 12
19
Figure 12-5 Authorization matrix
7/29/2019 Chap 12 Corrected
20/48
Chapter 12
20
Some DBMSs also provide
capabilities for
to customize theauthorization process
Figure 12-6a Authorization table for subjects (salespeople)
Figure 12-6b Authorization table for objects (orders)
Figure 12-7 Oracle privileges
Implementing
authorizationrules
7/29/2019 Chap 12 Corrected
21/48
Chapter 12
21
the codingor scrambling of data sothat humans cannot read
them
Secure Sockets Layer(SSL) is a popularencryption scheme for
TCP/IP connections
Figure 12-8 Basic two-key encryption
7/29/2019 Chap 12 Corrected
22/48
Chapter 12
Authentication Schemes Goal obtain a positive identification of the user
Passwords: First line of defense Should be at least 8 characters long
Should combine alphabetic and numeric data
Should not be complete words or personalinformation
Should be changed frequently
22
u en ca on c emes
7/29/2019 Chap 12 Corrected
23/48
Chapter 12
u en ca on c emes(cont.)
Strong Authentication
Passwords are flawed: Users share them with each other
They get written down, could be copied
Automatic logon scripts remove need to explicitlytype them in
Unencrypted passwords travel the Internet
Possible solutions:Two factore.g. smart card plus PIN
Three factore.g. smart card, biometric, PIN
Biometric devicesuse of fingerprints, retinalscans, etc. for positive ID
Third-party mediated authenticationusingsecret keys, digital certificates 23
7/29/2019 Chap 12 Corrected
24/48
Chapter 12
Procedures Personnel controls
Hiring practices, employee monitoring, security training
Physical access controls Equipment locking, check-out procedures, screen
placement
Maintenance controls Maintenance agreements, access to source code, quality
and availability standards
Data privacy controls Adherence to privacy legislation, access rules
24
7/29/2019 Chap 12 Corrected
25/48
Chapter 12
Database Recovery
Mechanism for restoring a database quickly andaccurately after loss or damage
Recovery facilities:
Backup Facilities
Journalizing Facilities Checkpoint Facility
Recovery Manager
25
7/29/2019 Chap 12 Corrected
26/48
Chapter 12
Back-up Facilities
Automatic dump facility that produces backup copyof the entire database
Periodic backup (e.g. nightly, weekly)
Cold backupdatabase is shut down during backup
Hot backupselected portion is shut down andbacked up at a given time
Backups stored in secure, off-site location
26
7/29/2019 Chap 12 Corrected
27/48
Chapter 12
Journalizing Facilities Audit trail of transactions and database updates
Transaction logrecord of essential data for eachtransaction processed against the database
Database change logimages of updated data Before-imagecopy before modification
After-imagecopy after modification
27Produces an
7/29/2019 Chap 12 Corrected
28/48
Chapter 12
28
Figure 12-9 Database audit trail
From the backup and
logs, databases can berestored in case ofdamage or loss
7/29/2019 Chap 12 Corrected
29/48
Chapter 12
Checkpoint Facilities
DBMS periodically refuses to accept new transactions
system is in a quietstate
Database and transaction logs are synchronized
29
7/29/2019 Chap 12 Corrected
30/48
Chapter 12
Recovery and RestartProcedures
Disk Mirroringswitch between identicalcopies of databases
Restore/Rerunreprocess transactions
against the backupTransaction Integritycommit or abort
all transaction changes Backward Recovery (Rollback)apply
before images Forward Recovery (Roll Forward)apply
after images (preferable torestore/rerun) 30
7/29/2019 Chap 12 Corrected
31/48
Chapter 12
Transaction ACIDProperties Atomic
Transaction cannot be subdivided
Consistent Constraints dont change from before
transaction to after transaction Isolated
Database changes not revealed to usersuntil after transaction has completed
Durable Database changes are permanent
31
7/29/2019 Chap 12 Corrected
32/48
Chapter 12
32
Figure 12-10 Basic recovery techniquesa) Rollback
Figure 12 10 Basic recovery techniques (cont )
7/29/2019 Chap 12 Corrected
33/48
Chapter 12
33
Figure 12-10 Basic recovery techniques (cont.)b) Rollforward
Database Failure
7/29/2019 Chap 12 Corrected
34/48
Chapter 12
Database FailureResponses
Aborted transactions Preferred recovery: rollback Alternative: Rollforward to state just prior to abort
Incorrect data Preferred recovery: rollback
Alternative 1: rerun transactions not including inaccurate
data updates Alternative 2: compensating transactions
System failure (database intact) Preferred recovery: switch to duplicate database
Alternative 1: rollback
Alternative 2: restart from checkpoint Database destruction
Preferred recovery: switch to duplicate database
Alternative 1: rollforward
Alternative 2: reprocess transactions
34
7/29/2019 Chap 12 Corrected
35/48
Chapter 12
Concurrency Control
Problemin a multiuser environment, simultaneousaccess to data can result in interference and dataloss
SolutionConcurrency ControlConcurrency Control
The process of managing simultaneous operationsagainst a database so that data integrity ismaintained and the operations do not interfere witheach other in a multi-user environment
35
Figure 12 11 Lost update (no concurrency control in effect)
7/29/2019 Chap 12 Corrected
36/48
Chapter 12
36
Figure 12-11 Lost update (no concurrency control in effect)
Simultaneous access causes updates to cancel each other
A similar problem is the problem
7/29/2019 Chap 12 Corrected
37/48
Chapter 12
Concurrency ControlTechniques
Serializability Finish one transaction before starting another
Locking Mechanisms The most common way of achieving serialization
Data that is retrieved for the purpose of updating islocked for the updater
No other user can perform update until unlocked
37
Fi 12 12 U d t ith l ki ( t l)
7/29/2019 Chap 12 Corrected
38/48
Chapter 12
38
Figure 12-12: Updates with locking (concurrency control)
7/29/2019 Chap 12 Corrected
39/48
Chapter 12
Locking Mechanisms Locking level:
Databaseused during database updates
Tableused for bulk updates
Block or pagevery commonly used
Recordonly requested row; fairly commonly
used Fieldrequires significant overhead;
impractical
Types of locks: Shared lockRead but no update permitted.
Used when just reading to prevent anotheruser from placing an exclusive lock on therecord
Exclusive lockNo access permitted. Usedwhen preparing to update
39
7/29/2019 Chap 12 Corrected
40/48
Chapter 12
Deadlock An impasse that results when two or more
transactions have locked common resources,and each waits for the other to unlock theirresources
40
Figure 12-13The problem of deadlock
John and Marsha will waitJohn and Marsha will wait
forever for each other toforever for each other to
release their lockedrelease their locked
resources!resources!
7/29/2019 Chap 12 Corrected
41/48
Chapter 12
Managing Deadlock
Deadlock prevention: Lock all records required at the beginning of
a transaction
Two-phase locking protocol Growing phase
Shrinking phase
May be difficult to determine all neededresources in advance
Deadlock Resolution:
Allow deadlocks to occur Mechanisms for detecting and breaking
them Resource usage matrix
41
7/29/2019 Chap 12 Corrected
42/48
Chapter 12
Versioning
Optimistic approach to concurrencycontrol
Instead of locking
Assumption is that simultaneousupdates will be infrequent
Each transaction can attempt an updateas it wishes
The system will reject an update when itsenses a conflict
Use of rollback and commit for this 42
7/29/2019 Chap 12 Corrected
43/48
Chapter 12
43
Figure 12-15 The use of versioning
Better performance than locking
7/29/2019 Chap 12 Corrected
44/48
Chapter 12
Managing Data Quality
Causes of poor data quality External data sources Redundant data storage Lack of organizational commitment
Data quality improvement Perform data quality audit Establish data stewardship program (data
stewardis a liaison between IT and businessunits)
Apply total quality management (TQM)practices
Overcome organizational barriers Apply modern DBMS technology
Estimate return on investment
44
Data Dictionaries and
7/29/2019 Chap 12 Corrected
45/48
Chapter 12
Data Dictionaries andRepositories
Data dictionary
Documents data elements of a database
System catalog
System-created database that describes alldatabase objects
Information Repository Stores metadata describing data and data
processing resources Information Repository Dictionary System (IRDS)
Software tool managing/controlling access to
information repository
45
Figure 12-16 Three components of the repository system
7/29/2019 Chap 12 Corrected
46/48
Chapter 12
46
g p p y yarchitecture
A schema of the
repositoryinformation
Softwarethat
managestherepositoryobjects
Where repositoryobjects are stored
: adapted from Bernstein, 1996.
Database Performance
7/29/2019 Chap 12 Corrected
47/48
Chapter 12
Database PerformanceTuning
DBMS Installation
Setting installation parameters Memory Usage
Set cache levels
Choose background processes
Input/Output (I/O) Contention Use striping
Distribution of heavily accessed files
CPU Usage Monitor CPU load
Application tuning Modification of SQL code in applications 47
7/29/2019 Chap 12 Corrected
48/48
Data Availability
Downtime is expensive How to ensure availability
Hardware failuresprovide redundancy for fault tolerance
Loss of datadatabase mirroring
Maintenance downtimeautomated and nondisruptivemaintenance utilities
Network problemscareful traffic monitoring, firewalls,and routers
48