Upload
rahul
View
760
Download
3
Embed Size (px)
Citation preview
©Silberschatz, Korth and Sudarshan2.2Database System Concepts
HOW TO START MODELING
ENTERPRISE MODEL
BUSINESS PROCESS MODEL
REQUIREMENT OF INFORMATION
RELATIONSHIP AMONG INFORMATION
DECIDE ON THE TYPE OF MODEL
©Silberschatz, Korth and Sudarshan2.3Database System Concepts
HOW TO CONSTRUCT A HOUSE
MOBILISE THE FINANCES
SELECT A PLACE/SITE & BUY IT
REQUIREMENT OF MATERIALS AS PER PLAN
HIRING OF LABOUR FORCE
COMPLETE CONSTRUCTION & OCCUPY
©Silberschatz, Korth and Sudarshan2.4Database System Concepts
PUBLISHING A BOOK
DECIDE THE SUBJECT ON WHICH YOU ARE GOING TO WRITE A BOOK AND PUBLISH.
COLLECT ALL RELEVANT DATA FROM VARIOUS SOURCES ORGANISE THE SUBJECT CHAPTERWISE & SEQUENCE
THE CHAPTERS. DEVELOP THE SUBJECT IN EACH CHAPTER ADD ILLUSTRATIONS, EXAMPLES, QUOTES SUMMARY, REVIEW QUESTIONS, SELF STUDY WORK,
CASE STUDY AT THE END OF EACH CHAPTER ONCE ALL THE CHAPTERS ARE READY, INDEX/CONTENT
PREFACE, APPENDICES, BIBLIOGRAPHY, REFERENCES NAME INDEX, SUBJECT INDEX PUBLISHING AGENCY, PRICING, MARKETING
©Silberschatz, Korth and Sudarshan2.5Database System Concepts
WHAT IS DATA MODELING
It is a conceptual framework which defines the logical relationship among the data elements needed to support a basic business process or other activities.
It is the underlying structure of a database. It is a map or diagram that represents entities and
their relationships A collection of tools for describing
data data relationshipsdata semanticsdata constraints
©Silberschatz, Korth and Sudarshan2.6Database System Concepts
What is database all about
Database model is defined to consist a combination of the following components:
A collection of data object types, which form the basic building blocks for any database that confirms to the model.
A collection of general integrity rules, which constrain the set of occurrences of these object types that can largely appear in any such database.
A collection of operators, which can be applied to such object occurrences for retrieval and other purposes.
©Silberschatz, Korth and Sudarshan2.7Database System Concepts
TYPES OF MODELS AVAILABLE
OBJECT BASED LOGICAL MODELS.
RECORD BASED LOGICAL MODELS.
PHYSICAL MODELS.
©Silberschatz, Korth and Sudarshan2.8Database System Concepts
TYPES OF MODELS AVAILABLE
OBJECT BASED LOGICAL MODELS. Entity – relationship (ER) model Object oriented model Semantic data model Functional data model
RECORD BASED LOGICAL MODELS.Relational modelNetwork modelHierarchical model
PHYSICAL MODELS.Unifying modelFrame-memory model
©Silberschatz, Korth and Sudarshan2.9Database System Concepts
What is this Object based Logical Model
Object based logical models are used in describing data at the logical and view levels.
They are characterised by the fact that they provide fairly flexible structuring capabilities.
They allow data constraints to be specified explicitly.
©Silberschatz, Korth and Sudarshan2.10Database System Concepts
What is this ER model
Used at the design stage of the database.
It is like the flow chart in computer programming.
Certain symbols are used in flow charts, similarly here in ER modeling too, symbols are used.
Entity is a physical thing that exists, live or otherwise that is distinguishable from other objects. It can be explained by some of its characteristics called attributes.
©Silberschatz, Korth and Sudarshan2.11Database System Concepts
Entity Sets
A database can be modeled as:
a collection of entities,
relationship among entities. An entity is an object that exists and is distinguishable from
other objects.
Example: specific person, company, event, plant Entities have attributes
Example: people have names and addresses An entity set is a set of entities of the same type that share the
same properties.
Example: set of all persons, companies, trees, holidays
©Silberschatz, Korth and Sudarshan2.12Database System Concepts
Entity Relationship Model
Entities (objects) E.g. customers, accounts, bank branch
Relationships between entities E.g. Account A-101 is held by customer Sridhar Relationship-set depositor associates customers with
accounts
Widely used for database design
Database design in E-R model usually converted to design in the relational model (coming up next) which is used for storage and processing
©Silberschatz, Korth and Sudarshan2.13Database System Concepts
Entity Sets customer and loan
customer-id customer- customer- customer- loan- amount name street city number
ANJU
PATTA
MARY
KUNDU
NISHU
MADAN
MUKHU
TIRUVALLA
CHE’CHERY
ANAIYUR
KESTOPUR
RANCHI
NEW DELHI
KOLKATA
©Silberschatz, Korth and Sudarshan2.14Database System Concepts
Attributes
An entity is represented by a set of attributes, that is descriptive properties possessed by all members of an entity set.
Domain – the set of permitted values for each attribute Attribute types:
Simple and composite attributes. Single-valued and multi-valued attributes
E.g. multivalued attribute: phone-numbers Derived attributes
Can be computed from other attributes E.g. age, given date of birth
Example:
customer = (customer-id, customer-name, customer-street, customer-city)
loan = (loan-number, amount)
©Silberschatz, Korth and Sudarshan2.16Database System Concepts
Relationship Sets
A relationship is an association among several entities
Example:Sridhar depositor A-102customer entity relationship set account entity
A relationship set is a mathematical relation among n 2 entities, each taken from entity sets
{(e1, e2, … en) | e1 E1, e2 E2, …, en En}
where (e1, e2, …, en) is a relationship
Example:
(Sridhar, A-102) depositor
©Silberschatz, Korth and Sudarshan2.17Database System Concepts
Relationship Set borrower
customer-id customer- customer- customer- loan- amount name street city number
ANJU
PATTA
MARY
KUNDU
NISHU
MADAN
MUKHU
TIRUVALLA
CHE’CHERY
ANAIYUR
KESTOPUR
RANCHI
NEW DELHI
KOLKATA
©Silberschatz, Korth and Sudarshan2.18Database System Concepts
Relationship Sets (Cont.) An attribute can also be property of a relationship set. For instance, the depositor relationship set between entity sets
customer and account may have the attribute access-date
©Silberschatz, Korth and Sudarshan2.19Database System Concepts
Degree of a Relationship Set
Refers to number of entity sets that participate in a relationship set. Relationship sets that involve two entity sets are binary (or degree two). Generally,
most relationship sets in a database system are binary. Relationship sets may involve more than two entity sets. Binary (Two), Ternary
(Three), Quaternary (Four), Quinary (Five), Senary (Six) and so on…..
Relationships between more than two entity sets are rare. Most relationships are binary. (More on this later.)
E.g. Suppose employees of a bank may have jobs (responsibilities) at multiple branches, with different jobs at different branches. Then there is a ternary relationship set between entity sets employee, job and branch
©Silberschatz, Korth and Sudarshan2.20Database System Concepts
Mapping Cardinalities
Express the number of entities to which another entity can be associated via a relationship set.
Most useful in describing binary relationship sets. For a binary relationship set the mapping cardinality must be
one of the following types: One to one
One to many
Many to one
Many to many
©Silberschatz, Korth and Sudarshan2.21Database System Concepts
Mapping Cardinalities
One to one One to many
Note: Some elements in A and B may not be mapped to any elements in the other set
©Silberschatz, Korth and Sudarshan2.22Database System Concepts
Mapping Cardinalities
Many to one Many to many
Note: Some elements in A and B may not be mapped to any elements in the other set
©Silberschatz, Korth and Sudarshan2.23Database System Concepts
Mapping Cardinalities affect ER Design
Can make access-date an attribute of account, instead of a relationship attribute, if each account can have only one customer I.e., the relationship from account to customer is many to one,
or equivalently, customer to account is one to many
©Silberschatz, Korth and Sudarshan2.24Database System Concepts
E-R Diagrams
Rectangles represent entity sets. Diamonds represent relationship sets. Lines link attributes to entity sets and entity sets to relationship sets. Ellipses represent attributes
Double ellipses represent multivalued attributes. Dashed ellipses denote derived attributes.
Underline indicates primary key attributes (will study later)
©Silberschatz, Korth and Sudarshan2.25Database System Concepts
E-R Diagram With Composite, Multivalued, and Derived Attributes
©Silberschatz, Korth and Sudarshan2.27Database System Concepts
Roles
Entity sets of a relationship need not be distinct The labels “manager” and “worker” are called roles; they specify how
employee entities interact via the works-for relationship set. Roles are indicated in E-R diagrams by labeling the lines that connect
diamonds to rectangles. Role labels are optional, and are used to clarify semantics of the
relationship
©Silberschatz, Korth and Sudarshan2.28Database System Concepts
Cardinality Constraints
We express cardinality constraints by drawing either a directed line (), signifying “one,” or an undirected line (—), signifying “many,” between the relationship set and the entity set.
E.g.: One-to-one relationship: A customer is associated with at most one loan via the relationship
borrower
A loan is associated with at most one customer via borrower
©Silberschatz, Korth and Sudarshan2.29Database System Concepts
One-To-Many Relationship
In the one-to-many relationship a loan is associated with at most one customer via borrower, a customer is associated with several (including 0) loans via borrower
©Silberschatz, Korth and Sudarshan2.30Database System Concepts
Many-To-One Relationships
In a many-to-one relationship a loan is associated with several (including 0) customers via borrower, a customer is associated with at most one loan via borrower
©Silberschatz, Korth and Sudarshan2.31Database System Concepts
Many-To-Many Relationship
A customer is associated with several (possibly 0) loans via borrower
A loan is associated with several (possibly 0) customers via borrower
©Silberschatz, Korth and Sudarshan2.32Database System Concepts
Participation of an Entity Set in a Relationship Set
Total participation (indicated by double line): every entity in the entity set participates in at least one relationship in the relationship set E.g. participation of loan in borrower is total
every loan must have a customer associated to it via borrower Partial participation: some entities may not participate in any
relationship in the relationship set E.g. participation of customer in borrower is partial
©Silberschatz, Korth and Sudarshan2.33Database System Concepts
Alternative Notation for Cardinality Limits
Cardinality limits can also express participation constraints
©Silberschatz, Korth and Sudarshan2.34Database System Concepts
Keys
A super key of an entity set is a set of one or more attributes whose values uniquely determine each entity.
A candidate key of an entity set is a minimal super key Customer-id is candidate key of customer
account-number is candidate key of account
Although several candidate keys may exist, one of the candidate keys is selected to be the primary key.
©Silberschatz, Korth and Sudarshan2.35Database System Concepts
Keys for Relationship Sets
The combination of primary keys of the participating entity sets forms a super key of a relationship set. (customer-id, account-number) is the super key of depositor
NOTE: this means a pair of entity sets can have at most one relationship in a particular relationship set. E.g. if we wish to track all access-dates to each account by each
customer, we cannot assume a relationship for each access. We can use a multivalued attribute though
Must consider the mapping cardinality of the relationship set when deciding the what are the candidate keys
Need to consider semantics of relationship set in selecting the primary key in case of more than one candidate key
©Silberschatz, Korth and Sudarshan2.36Database System Concepts
E-R Diagram with a Ternary Relationship
©Silberschatz, Korth and Sudarshan2.37Database System Concepts
Cardinality Constraints on Ternary Relationship
We allow at most one arrow out of a ternary (or greater degree) relationship to indicate a cardinality constraint
E.g. an arrow from works-on to job indicates each employee works on at most one job at any branch.
If there is more than one arrow, there are two ways of defining the meaning. E.g a ternary relationship R between A, B and C with arrows to B and C
could mean
1. each A entity is associated with a unique entity from B and C or
2. each pair of entities from (A, B) is associated with a unique C entity, and each pair (A, C) is associated with a unique B
Each alternative has been used in different formalisms
To avoid confusion we outlaw more than one arrow
©Silberschatz, Korth and Sudarshan2.38Database System Concepts
Binary Vs. Non-Binary Relationships
Some relationships that appear to be non-binary may be better represented using binary relationships E.g. A ternary relationship parents, relating a child to his/her father and
mother, is best replaced by two binary relationships, father and mother Using two binary relationships allows partial information (e.g. only
mother being know)
But there are some relationships that are naturally non-binary E.g. works-on
©Silberschatz, Korth and Sudarshan2.39Database System Concepts
Converting Non-Binary Relationships to Binary Form
In general, any non-binary relationship can be represented using binary relationships by creating an artificial entity set. Replace R between entity sets A, B and C by an entity set E, and three
relationship sets:
1. RA, relating E and A 2.RB, relating E and B
3. RC, relating E and C Create a special identifying attribute for E Add any attributes of R to E For each relationship (ai , bi , ci) in R, create
1. a new entity ei in the entity set E 2. add (ei , ai ) to RA
3. add (ei , bi ) to RB 4. add (ei , ci ) to RC
©Silberschatz, Korth and Sudarshan2.40Database System Concepts
Converting Non-Binary Relationships (Cont.)
Also need to translate constraints Translating all constraints may not be possible
There may be instances in the translated schema thatcannot correspond to any instance of R
Exercise: add constraints to the relationships RA, RB and RC to ensure that a newly created entity corresponds to exactly one entity in each of entity sets A, B and C
We can avoid creating an identifying attribute by making E a weak entity set (described shortly) identified by the three relationship sets
©Silberschatz, Korth and Sudarshan2.41Database System Concepts
Design Issues
Use of entity sets vs. attributesChoice mainly depends on the structure of the enterprise being modeled, and on the semantics associated with the attribute in question.
Use of entity sets vs. relationship setsPossible guideline is to designate a relationship set to describe an action that occurs between entities
Binary versus n-ary relationship setsAlthough it is possible to replace any nonbinary (n-ary, for n > 2) relationship set by a number of distinct binary relationship sets, a n-ary relationship set shows more clearly that several entities participate in a single relationship.
Placement of relationship attributes
©Silberschatz, Korth and Sudarshan2.44Database System Concepts
Summary of Symbols Used in E-R Notation
©Silberschatz, Korth and Sudarshan2.48Database System Concepts
the better we understand the data,
the more effective
the discovery and retrieval will be.
©Silberschatz, Korth and Sudarshan2.49Database System Concepts
TYPES OF MODELS AVAILABLE
OBJECT BASED LOGICAL MODELS. Entity – relationship (ER) model Object oriented model Semantic data model Functional data model
RECORD BASED LOGICAL MODELS. Relational model Network model Hierarchical model
PHYSICAL MODELS. Unifying model Frame-memory model
©Silberschatz, Korth and Sudarshan2.50Database System Concepts
What is this Record Based Logical Model
To describe data at the logical and view levels. Can be used to specify the overall logical structure. Can also be used to provide higher level description of the
implementation. Fixed format records from structured database.
©Silberschatz, Korth and Sudarshan2.51Database System Concepts
Database Models
Employee2
Employee2
A
Empno Ename Etitle Dept
1
2 B
3 C
Relational Structure
Network Structure
Hierarchical Structure
Employee3
Project BProject A
Dept Dname Dloc Dmgr
A
B
C
Employee2
Employee1
Project A
Employee1
Dept A Dept BDept
Project B
©Silberschatz, Korth and Sudarshan2.52Database System Concepts
Hierarchical Database Model
• Organize data in tree structure• Hierarchy of parent-child relationship• One-to-many relationship• Restricts the child segment to having only one parent segment• Redundant data• Used for structured, routine types of transactions• Not flexible in support of databases• Cannot easily handle ad hoc requests
©Silberschatz, Korth and Sudarshan2.53Database System Concepts
Relational Database Model
• Organizes data in the form of tables consisting of rows and columns• Many-to-many relationship• Based on Relational Algebra such as the application of Boolean operators, Projection,
Cartesian Product• Redundancy in the data can be avoided by normalization rules• Used for unstructured types of transactions• More flexible in support of databases• Can easily handle ad hoc requests
©Silberschatz, Korth and Sudarshan2.54Database System Concepts
What is DBMS
D
B
M
S
DEFINE THE (TYPE OF) DATABASE
BUILD THE STRUCTURE (FOR STORAGE)
MANIPULATE DATA (FOR RETRIEVAL)
SAFETY OF THE DATASYSTEM CRASH
UNAUTHORISED INTRUSIONS
©Silberschatz, Korth and Sudarshan2.55Database System Concepts
DDL AND DML
DBMS
DATA DEFINITION PART: DATA DEFINITION LANGUAGES (DDL)COBOL, SQL, Creation of tables, Entering data,Normalisation
DATA MANIPULATION PARTDATA MANIPULATION LANGUAGES (DML)Structured query languages (SQL) QBEQuery, Report generation and writing, Checking errors, other control features
©Silberschatz, Korth and Sudarshan2.56Database System Concepts
Database Management Systems
DBMS is a set of computer programs that facilitates Creation of database by programmers Maintenance and security of database by DBAs and Application of database by end users.
DDL– Data Definition Language is for defining tables (files)
DML - Data Manipulation Language is for processing data and records (e.g., update, sort, …)
DCL - A Data Control Language is a computer language for controlling access to data in a database. Examples of DCL commands are : GRANT and REVOKE
Data dictionary- computer-based catalog or directory containing
metadata.
©Silberschatz, Korth and Sudarshan2.57Database System Concepts
DATA DICTIONARY
It is a database management catalog, prepared by database designers to help individuals to enter data.
It contains metadata, i.e. Data on data, like, why the data item is needed, how often it should be updated, on which form and reports the data appears.
It relies on a DBMS software component to manage a database of data definitions i.e. metadata about the structure, data elements, and other characteristics of a database.
It contains the names and descriptions of all types of data records and their inter relationships.
It also has information on end user’s access requirements, use of application programs, database maintenance and security.
It can be queried by database administrator. He can make some changes in data dictionary.
©Silberschatz, Korth and Sudarshan2.60Database System Concepts
Types of keys
Primary key
1. Simple Primary key
2. Composite Primary Key Secondary Key Foreign Key Reference Key
------------------------------------------------------------------------------
Primary key: Unique identifier attribute for an entity.
Simple Primary key : Based on single attribute
Composite Primary Key : Based on two or more attributes
Secondary Key : All attributes which is not primary
Foreign Key : Primary key in one table, which is not primary in another table.
Reference Key : Column in child table, which is referred by primary key by parent table.
©Silberschatz, Korth and Sudarshan2.61Database System Concepts
TYPES OF DATABASES
OPERATIONAL DATABASE ANALYTICAL DATABASE DATA WAREHOUSING DISTRIBUTED DATABASE HYPERMEDIA DATABASE
©Silberschatz, Korth and Sudarshan2.62Database System Concepts
OPERATIONAL DATABASE
DATABASE ON ABOUT THE OPERATION IN AN ORGANISATION. CAN BE SUB-DIVIDED INTO SUBJECT AREAS LIKE:
TRANSACTION DATABASE
MARKETING DATABASE
PRODUCTION DATABASE
STORE INFORMATION RELATED TO DAY TO DAY OPERATIONS, RELATED TO CUSTOMERS
INVENTORY
EMPLOYEES
TRANSACTION INFORMATION SYSTEM (TPS) USE THIS DATABASE.
©Silberschatz, Korth and Sudarshan2.63Database System Concepts
ANALYTICAL DATABASE
INFORMATION EXTRACTED FROM OPERATIONAL AND EXTERNAL.
INFORMATION REQUIRED FOR MIDDLE LEVEL MANAGERS TO TAKE DECISIONS.
CAN ALSO BE CALLED AS MANAGEMENT DATABASES. USE MULTI-DIMENSIONAL DATA STRUCTURE TO
ORGANISE DATA. ONLINE ANALYTICAL PROCESSING, DECISION SUPPORT
SYSTEMS, EXECUTIVE SUPPORT SYSTEMS USE THIS DATABASE.
©Silberschatz, Korth and Sudarshan2.64Database System Concepts
DISTRIBUTED DATABASE
It is database where the data is not stored in one physical location, but is distributed over many locations (computers/ servers), which are geographically dispersed and connected by networking.
Example. Advantages. Disadvanteges
©Silberschatz, Korth and Sudarshan2.65Database System Concepts
HYPERMEDIA DATABASES
Hypermedia databases are a repository of home pages and other hyperlinked pages of multimedia.
Can store text, graphics, photo or video files. World wide web uses hypermedia databases to access HTML
files, GIF files and video files.
©Silberschatz, Korth and Sudarshan2.66Database System Concepts
Data Warehousing
Large organizations have complex internal organizations, and have data stored at different locations, on different operational (transaction processing) systems, under different schemas
Data sources often store only current data, not historical data Corporate decision making requires a unified view of all
organizational data, including historical data A data warehouse is a repository (archive) of information
gathered from multiple sources, stored under a unified schema, at a single site
Greatly simplifies querying, permits study of historical trends
Shifts decision support query load away from transaction processing systems
©Silberschatz, Korth and Sudarshan2.67Database System Concepts
Definition of data warehousing
According to W.H.Inmon
A data warehouse is a subject-oriented, integrated, time-variant and non-volatile collection of data in support of management’s decision making process.
©Silberschatz, Korth and Sudarshan2.70Database System Concepts
DATA WAREHOUSING
A data warehouse archives information gathered from multiple sources, and stores it under a unified schema, at a single site.
Important for large businesses which generate data from multiple divisions, possibly at multiple sites
Data may also be purchased externally
This has been depicted pictorially in the next slide.
Concept is much alike to physical warehouses (Godowns)
It supports online analysis of sales, inventory and other vital business data, that has been called from operational systems.
It also enables to go into the past history, to improve upon the future operations, to provide more customer satisfaction, introduce customer oriented schemes etc.,
©Silberschatz, Korth and Sudarshan2.71Database System Concepts
DATA MART
One way of putting it across what is data mart is:
“It is data warehouse that has limited or specific scope”. It contains selected information from the data warehouse such
that each separate data mart is customised for the decision support applications of a particular end-user group.
Data stored in a centralised location of data warehouse is edited, standardised, integrated and updated so that it can be used by managers for decision making.
Some of the organisations, instead of having one centralised warehouse, might opt for multiple data marts, each one one concentrating on specific aspect.
©Silberschatz, Korth and Sudarshan2.72Database System Concepts
DATA MART
Data warehouse
Data mart(Production)
Data mart(Finance)
Data mart(Marketing)
©Silberschatz, Korth and Sudarshan2.73Database System Concepts
Data Mining
Broadly speaking, data mining is the process of semi-automatically analyzing large databases to find useful patterns
Like knowledge discovery in artificial intelligence data mining discovers statistical rules and patterns
Differs from machine learning in that it deals with large volumes of data stored primarily on disk.
Some types of knowledge discovered from a database can be represented by a set of rules.
e.g.,: “Young women with annual incomes greater than $50,000 are most likely to buy sports cars”
Other types of knowledge represented by equations, or by prediction functions
Some manual intervention is usually required
Pre-processing of data, choice of which type of pattern to find, postprocessing to find novel patterns
©Silberschatz, Korth and Sudarshan2.74Database System Concepts
What is Data Mining?
Data Mining is:
(1) The efficient discovery of previously unknown, valid, potentially useful, understandable patterns in large datasets
(2) The analysis of (often large) observational data sets to find unsuspected relationships and to summarize the data in novel ways that are both understandable and useful to the data owner
©Silberschatz, Korth and Sudarshan2.75Database System Concepts
What is Data Mining?
Very little functionality in database systems to support mining applications
Beyond SQL Querying:
SQL (OLAP) Query:- How many computers did we sell in the 1st Qtr of 1999
in Chennai vs New Delhi?
Data Mining Queries:- Which sales region (Chennai, Mumbai, Kolkata & New
Delhi) had anomalous sales in the 1st Qtr of 2005?
- How do the buyers of computers in Chennai and New Delhi differ?
- What else do the buyers of computers in Chennai buy along with computers, as compared to New Delhi?
©Silberschatz, Korth and Sudarshan2.76Database System Concepts
Relation between data mining and OLAP
Data mining can be viewed as an advanced stage of On-line Analytical Processing (OLAP).
However, data mining goes far beyond the narrow scope of summarization style analytical processing of data warehouse systems by incorporating more advanced techniques for data understanding.
©Silberschatz, Korth and Sudarshan2.77Database System Concepts
Applications of Data Mining
Prediction based on past history
Predict if a credit card applicant poses a good credit risk, based on some attributes (income, job type, age, ..) and past history
Predict if a customer is likely to switch brand loyalty
Predict if a customer is likely to respond to “junk mail”
Predict if a pattern of phone calling card usage is likely to be fraudulent
Descriptive Patterns
Associations Find books that are often bought by the same
customers. If a new customer buys one such book, suggest that he buys the others too.
Other similar applications: camera accessories, clothes, etc.
Data mining is the method used by the companies to sort and analyse information to better understand their customers, products, markets or any other aspect/phase of their business.
©Silberschatz, Korth and Sudarshan2.79Database System Concepts
OLTP AND OLAP
ONLINE TRANSACTION PROCESSING (OLTP) And
ONLINE ANALYTICAL PROCESSING (OLAP)
ARE TWO IMPORTANT ASPECTS OF DATA MINING. OLTP: It refers to the immediate and automated response to the
requests of the user. Example: Send a query mail to HDFC or ICICI bank, you will get
immediate response (sort of acknowledgement). OLTP is designed to handle multiple concurrent transactions
from the customers. It has a fixed number of inputs (type of query) and standard
format output. OLTP is a big part of interactive e-Commerce applications.
©Silberschatz, Korth and Sudarshan2.80Database System Concepts
EXAMPLE OF OLTP
From: [email protected] [[email protected]]Sent: Saturday, Jun 4 2005 2:26PMTo: [email protected] [[email protected]]Subject: Banker's Cheque Inquiry
Name : MR. T MOHANAKRISHNANCustomer ID: 7488678Account No : 0821050216585
Sir/Madam, I have deposited a US cheque for $250, about 15 days back, through Lloyds Road ATM center. I would like to get confirmation of receipt of this cheque, as well as approximate time to realise the cheque, please.Thanking you,T Mohanakrishnan.
©Silberschatz, Korth and Sudarshan2.81Database System Concepts
EXAMPLE FOR OLTP
Dear Customer,
Thank you for writing to us. This auto acknowledgement confirms the receipt of your e-mail. This interaction is being tracked through the subject line. We request you not to change the subject line.
If you are an existing account holder and have not mentioned your Customer Identification Number / Account Number in your earlier mail, please re-send your mail with the details.
Please ignore this message if you have quoted your Customer Identification Number / Account Number.
We will get back to you shortly.
Warm regards,
HDFC Bank Ltd
©Silberschatz, Korth and Sudarshan2.82Database System Concepts
OLAP
A graphical software tools that provide complex analysis of data stored in database.
Why it is called “Online” analytical processing?? Purpose of OLAP server links data and special functions. Analysis by OLAP goes beyond normal database queries. It can provide time series and trend analysis views of the
data. “what if” and “why” analysis.
©Silberschatz, Korth and Sudarshan2.83Database System Concepts
WHAT IF ANALYSIS
A CAPABILITY OF SOME INFORMATION SYSTEMS (EX: DECISION SUPPORT SYSTEM) THAT ALLOWS USER TO MAKE HYPOTHETICAL CHANGES TO THE DATA ASSOCIATED WITH THE PROBLEM AND OBSERVE HOW THESE CHANGES INFLUENCE THE RESULTS OR OUTCOME.
©Silberschatz, Korth and Sudarshan2.84Database System Concepts
WHAT IF ANALYSIS
Take the case of pay roll of a company. Basic pay, DA, other allowances. If DA is increased by some %, what is the net result or load on
ex-chequer. If Basic pay is increased, what are the effects? (HRA, DA etc.,
are % of Basic)
©Silberschatz, Korth and Sudarshan2.85Database System Concepts
OLAP
OLAP applications are found in the area of financial modeling (budgeting, planning), sales forecasting, customer and product profitability etc
Provides leverage to library managers by providing the ability to model real life projections and a more efficient use of resources
OLAP enables the organization as a whole to respond more quickly to market demands and improve revenue and profitability
©Silberschatz, Korth and Sudarshan2.86Database System Concepts
Additional Database Models
East
WestDenver
FebActual Budget
Margin
Sales TV
VCR
TV
VCR
MultidimensionalDatabase Structure
Attributes• Customer• BalanceOperations• Deposit• Withdraw
Bank Account Object
Attributes• Credit Line• Mthly StatementOperations• Calculate Interest• Print Mthly Statement
Checking AccountObject
Attributes• Credit Line• Mthly StatementOperations• Calculate Interest• Print Mthly Statement
Savings AccountObject
Object-OrientedDatabase Structure
©Silberschatz, Korth and Sudarshan2.87Database System Concepts
Multidimensional Model Enables Realistic Description of Data The multidimensional model is also a natural fit for describing
and storing complex data. Developers can create data structures that accurately represent real-world data, thus making it faster to develop applications and easier to maintain them.
Multidimensional Model
©Silberschatz, Korth and Sudarshan2.88Database System Concepts
Object-Oriented Database Model
• Offers more complex data types to overcome the restriction of normalization rules for relational databases.
• Object-Oriented databases is based on the following principles:
• Encapsulation• Inheritance• Polymorphism• Aggregation:
©Silberschatz, Korth and Sudarshan2.90Database System Concepts
Object-Oriented Database Model
• Encapsulation: An application (another object) can only communicate with an object via messages. The operations provided by an object define the set of messages which can be understood by it; no other operations can be applied to an object.
• Inheritance: New object classes can be derived from another class (the super-class) by inheritance. The new classes inherit the attributes and methods of the super-class and offer additional attributes and operations. The relation between a derived class and its super-class is called ``is A" relation because an instance of the derived class also is an instance of the super-class.
• Polymorphism: This feature is closely connected to inheritance. Derived classes may re-define methods of their super-class(es). This is very useful for achieving class-specific behaviour using messages already available for the super-class.
Aggregation: Composite objects may be constructed as consisting of a set of elementary objects. The container object can communicate with its contained objects via their methods. The relation between the container object and its components is called ``partOf" relation because a component is a part of the container object.
©Silberschatz, Korth and Sudarshan2.91Database System Concepts
CHARACTERISTICS OF A WELL DEFINED DATABASE SYSTEM
DATA INTEGRITY : Ability to navigate and ability to manipulate to produce results.
DATA INDEPENDENCE: Logical independence: not affected by a change in application program Physical Independence: not affected by a change in data structure
PREVENTION OF DATA REDUNDANCY & INCONSISTENCYRedundancy is proportional to storage space
SHARING OF DATAFlexibility in accessing & viewing data by different usersAvailability of data to existing as well as new applications.
DATA MAINTENANCE Checking the integrity of the system and data Provision for modification, addition, deletion of records
DATA SECURITY and REGULATION OF STANDARDSSystem failure/ crash, backup Unauthorised users
SOP, provision of pass words, DBA’s tasks