Information System Solutions: A Project Approach
Chapter Four
Data Modeling
McGraw-Hill/Irwin © 2006 The McGraw-Hill Companies, Inc., All Rights Reserved.
Data Models
• Represent the data content of an information system
• May provide little or no information on process and infrastructure
• Follow a structure of rules and conventions to facilitate good communication
Entity Relationship (ER) Models
The ER model represents the system in the following terms:
• Entities – the things about which the organization wishes to maintain data
• Organizational relationships between the entities
• Attributes – the items of data the client wishes to collect for each entity
Entities
• Person – customer, student, employee, member
• Event – rental, sale, repair, enrollment, flight
• Object – video, product, vehicle, tool
• Place – city, zip-code, area, store, plant
• Concept – military unit, department, bank account
Entity Class and Instance
• Entity Class – set or group of things, for example, the entity class Customer represents all of the customers for an organization
• Entity Instance – one thing or member within an entity class, for example, one customer with the name of Joe Smith
In this text “entity” means entity class
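As a rough illustration of the class/instance distinction, the sketch below models the entity class as a Python dataclass and one entity instance as an object of that class. The attribute names `member_no` and `name` are assumptions chosen for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Customer:
    """Entity class: describes every customer the organization tracks."""
    member_no: int  # hypothetical identifier attribute
    name: str


# Entity instance: one member of the entity class.
joe = Customer(member_no=1, name="Joe Smith")
print(joe)
```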
Attributes
• Define the properties or characteristics of the entity
• Can include such properties as names, descriptions, dates, sizes, and others
• Can include properties of tel-no, name, & address for the entity Customer
An entity must have an attribute with unique values over all instances to serve as the primary key or identifier
Relationships
• Link or connect instances of one entity to instances of another entity
• Describe the way the organization operates
In the GB Video example, the entity Customer is related to the entity Rental in that a customer can engage in or make rental transactions.
Maximum Cardinalities of Relationships
• One-to-one – one instance of entity A can link to at most one instance of entity B, and vice versa
• One-to-many – one instance of A can link to more than one instance of B, but each instance of B links to at most one instance of A
• Many-to-many – one instance of A can link to more than one instance of B, and one instance of B can link to more than one instance of A
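The three cases can be told apart by counting links on each side. The helper below is a minimal sketch with a hypothetical name, `max_cardinality`, that classifies a set of observed (A, B) instance pairs. Sample data can only show that a "many" side exists; the true maximum cardinality is a business rule, so treat the result as a lower bound.

```python
from collections import Counter


def max_cardinality(links):
    """Classify a relationship from observed (a, b) instance pairs."""
    pairs = set(links)  # ignore duplicate links
    a_counts = Counter(a for a, _ in pairs)
    b_counts = Counter(b for _, b in pairs)
    a_many = any(n > 1 for n in a_counts.values())  # some A links to several B
    b_many = any(n > 1 for n in b_counts.values())  # some B links to several A
    if a_many and b_many:
        return "many-to-many"
    if a_many or b_many:
        return "one-to-many"
    return "one-to-one"


# A customer with two rentals, each rental made by one customer.
print(max_cardinality([("c1", "r1"), ("c1", "r2")]))
```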
ERD Symbols (Figure 4.2)
[Figure: ERD symbols for entity, relationship, attribute, and multi-valued attribute, plus the line symbols for the three relationship types: one to one, one to many, many to many]
Entity Relationship Diagram (ERD) (Figure 4.3)
[Figure: ERD for GB Video with entities CUSTOMER, RENTAL, and VIDEO and relationships Makes and Contains. Attribute labels in the figure: Member-No, Name, Tel-No, Address (Street, City, State, Zip), Credit-Card-No, Expire-Date, Rental-No, Date, Employee-No, Pay-Type, Video-No, Title, Date-Acquired, Vendor, Rent-Charge/Day, Cost, Due-Date, Return-Date, Overdue-Charge]
Basic ERD Rules
• An entity is a thing about which the organization wishes to keep data
• An entity contains more than one instance
• An entity has a primary key attribute with unique values for every instance
• An entity has two or more attributes
• A many-to-many relationship may have attributes
• A relationship represents a situation that exists or the organization wants to exist
ERD Naming Rules
• Every component on an ERD has a unique name and/or label
• An Entity name consists of a singular noun in all capital letters
• A Relationship name consists of a verb or a phrase in upper and lower case letters
• Attribute names are nouns or noun phrases in upper and lower case letters
• When an attribute serves as the primary key for an entity, the attribute name is underlined
Multivalued Attributes
• A multivalued attribute can have several values for a single instance of an entity
• For example, Customer may contain attributes for the names of family members. A single customer (instance) may have more than one family member – spouse, children, etc.
• An entity can replace a multivalued attribute
Multivalued Attributes (Figure 4.4)
[Figure: entity CUSTOMER with the multivalued attributes Person-first-name and Person-last-name]
Weak Entity (Figure 4.5)
[Figure: entity CUSTOMER linked to the weak entity FAMILY MEMBER, which has the attributes Person-ID, Person-first-name, and Person-last-name]
The Role of an Associative Entity
• An associative entity (AE) replaces or resolves a many-to-many relationship (m:n)
• One-to-many relationships connect the AE to the original entities in the m:n
• Many sides always connect to the AE
• Attributes of the m:n relationship become attributes of the AE
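The resolution of an m:n relationship can be sketched with Python's built-in `sqlite3` module. The table and column names (`rental_video`, `due_date`, and so on) are adapted from the GB Video example, and the rows are made-up sample data; this is one possible layout under those assumptions, not the text's own implementation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE rental (rental_no INTEGER PRIMARY KEY);
CREATE TABLE video  (video_no  INTEGER PRIMARY KEY, title TEXT);
-- Associative entity: one row per (rental, video) pair.
CREATE TABLE rental_video (
    rental_no INTEGER NOT NULL REFERENCES rental(rental_no),
    video_no  INTEGER NOT NULL REFERENCES video(video_no),
    due_date  TEXT,  -- attribute that belonged to the m:n relationship
    PRIMARY KEY (rental_no, video_no)
);
INSERT INTO rental VALUES (1), (2);
INSERT INTO video  VALUES (10, 'Title A'), (11, 'Title B');
INSERT INTO rental_video VALUES
    (1, 10, '2006-01-05'), (1, 11, '2006-01-05'), (2, 10, '2006-01-09');
""")

# Each side of the original m:n is now reached through a 1:m link to the AE.
videos_on_rental_1 = [row[0] for row in conn.execute(
    "SELECT video_no FROM rental_video WHERE rental_no = 1 ORDER BY video_no")]
rentals_of_video_10 = [row[0] for row in conn.execute(
    "SELECT rental_no FROM rental_video WHERE video_no = 10 ORDER BY rental_no")]
print(videos_on_rental_1, rentals_of_video_10)
```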
Associative Entity (Figure 4.6)
[Figure: associative entity RENTAL/VIDEO placed between RENTAL and VIDEO, with the relationships Contains and Held by]
Degree of a Relationship
The number of entity classes that participate in a relationship defines the degree of the relationship
• A unary relationship links an entity to itself
• A binary relationship links two entities
• A ternary relationship links three entities
Unary Relationships (Figure 4.7)
[Figure: entity POLICE OFFICER with two unary relationships, Commands and Partner of]
Ternary Relationship (Figure 4.8)
[Figure: ternary relationship Performs linking ARTIST, HALL, and WORK, with the attribute Time-Period]
Ternary Relationship with an Associative Entity (Figure 4.9)
[Figure: associative entity PERFORMANCE linking ARTIST, HALL, and WORK, with the attribute Time-Period]
Cardinality
• Maximum cardinality specifies the maximum number of instances of entity B that can link to one instance of entity A by use of the straight line or crowfoot symbols
• Minimum cardinality or optionality specifies the minimum number of instances of B that can link to one instance of A using the 0 (optional) or 1 (mandatory) symbols
Minimum and Maximum Cardinalities (Figure 4.10)
[Figure: CUSTOMER Makes RENTAL, with minimum and maximum cardinality symbols on the relationship line]
A customer may make zero (minimum) or many (maximum) rentals; a rental is made by one (minimum) and only one (maximum) customer
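One way to see both cardinalities at work is a small `sqlite3` sketch, assuming hypothetical table names and made-up rows. The `NOT NULL` foreign key stands for the mandatory "one and only one customer" side, and the `LEFT JOIN` count shows that a customer with zero rentals is still valid.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer (member_no INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE rental (
    rental_no INTEGER PRIMARY KEY,
    -- minimum 1, maximum 1: every rental names exactly one customer
    member_no INTEGER NOT NULL REFERENCES customer(member_no)
);
INSERT INTO customer VALUES (1, 'A'), (2, 'B');
INSERT INTO rental VALUES (100, 1), (101, 1);
""")

# Minimum 0, maximum many: count rentals per customer, keeping customers with none.
counts = dict(conn.execute("""
    SELECT c.member_no, COUNT(r.rental_no)
    FROM customer c LEFT JOIN rental r ON r.member_no = c.member_no
    GROUP BY c.member_no
"""))

# A rental without a customer violates the mandatory side.
try:
    conn.execute("INSERT INTO rental VALUES (102, NULL)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
print(counts, rejected)
```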
Supertype and Subtype Entities
• A supertype has one or more subtypes
• A supertype entity holds the attributes common to all of its subtype entities
• A subtype may have additional attributes
• A subtype entity may participate in relationships with other entities
• Supertype and subtype entities are linked by one-to-one relationships
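A common relational rendering of these rules gives the subtype table the same primary key as the supertype table, which yields the one-to-one link. The sketch below uses `sqlite3` with assumed table names (`video`, `ed_video`) and made-up rows; it is an illustration, not the schema used in the text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Supertype: attributes common to every video.
CREATE TABLE video (video_no INTEGER PRIMARY KEY, title TEXT);
-- Subtype: shares the supertype key (1:1) and adds its own attributes.
CREATE TABLE ed_video (
    video_no  INTEGER PRIMARY KEY REFERENCES video(video_no),
    area      TEXT,
    age_group TEXT
);
INSERT INTO video VALUES (1, 'Regular title'), (2, 'Educational title');
INSERT INTO ed_video VALUES (2, 'Science', '8-12');
""")

# Video 1 has no subtype row, so its subtype columns come back as NULL.
rows = conn.execute("""
    SELECT v.video_no, v.title, e.area
    FROM video v LEFT JOIN ed_video e ON e.video_no = v.video_no
    ORDER BY v.video_no
""").fetchall()
print(rows)
```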
Supertype with a Subtype (Figure 4.11)
[Figure: supertype/subtype diagram with the entity labels VIDEO, MATERIAL, and ED-VIDEO and the attribute labels Video-No, Area, and Age-Group]
Specialization
• Total specialization – every instance of the supertype must link to one instance in one of the subtypes
• Partial specialization – an instance in the supertype may link to zero instances in all of the subtypes. In the GB Video example, some instances of regular videos link to zero instances in the subtype.
Disjoint and Overlap
A supertype instance can link to subtype instances in one of two ways:
• Disjoint – a supertype instance can link to an instance of only one subtype
• Overlap – a supertype instance can link to one instance in each of several subtypes
Simplified ERD Rules (SERD)
• Omit the relationship diamonds
• Keep the relationship names
• List attributes in the entity box
• Replace composite attributes with the component attributes
• Replace many-to-many relationships with associative entities
• Replace multi-valued attributes with entities
A Simplified ERD or SERD (Figure 4.12)
CUSTOMER: Member-No, Name, Street, City, State, Zip, Tel-No, Credit-Card-No, Expire-Date
RENTAL: Rental-No, Date, Employee-No, Pay-Type
VIDEO: Video-No, Title, Date-Acquired, Rent-Charge/Day, Vendor
RENTAL/VIDEO: Rental-No, Video-No, Due-Date, Cost, Return-Date, Overdue-Charge
Relationships: Makes, Contains, Held by
[Figure: each relationship line carries minimum and maximum cardinality symbols]
Model Types
• Conceptual – a conceptual data model presents data with no constraints from physical technologies
• Logical – a logical data model observes constraints within a technology class, for example, relational tables
• Physical – a physical data model follows the constraints of one specific technology such as MS Access
Conceptual Data Model (CDM) (Figure 4.13), SERD Format
CUSTOMER: Cust-No, F-Name, L-Name, Ads1, Ads2, City, State, Zip, Tel-No, CC-No, Expire
RENTAL: Rental-No, Date, Clerk-No, Pay-Type, CC-No, Expire, CC-Approval
LINE: Line-No, Due-Date, Return-Date, OD-Charge, Pay-Type
VIDEO: Video-No, One-Day-Fee, Extra-Days, Weekend
TITLE: Title-No, Name, Vendor-No, Cost
Relationships: Requestor of, Owner of, Holder of, Name for
[Figure: each relationship line carries minimum and maximum cardinality symbols]
Entity Metadata (Table 4.1)
Entity – Description
• CUSTOMER – Contains all the available information about each customer who has made a transaction in the last year
• LINE – Contains the information on each video associated with a rental transaction
• RENTAL – Contains the information on each rental transaction
• TITLE – Contains information on each distinct title of the videos
• VIDEO – Contains information on each individual video
Attribute Metadata for RENTAL (Table 4.1)
Attribute – Description
• Rental-No – Unique key assigned to each rental
• Date – Date of the rental
• Clerk-No – Employee number of the clerk entering the rental
• Pay-Type – Cash, check or credit card
• CC-No – Credit card number
• Expire – Expiration date of the credit card
• CC-Approval – Credit card approval code
Relationship Metadata (Table 4.1)
Columns: Relationship Name; Description; Entity 1 with (min, max) cardinality; Entity 2 with (min, max) cardinality
• Requestor of – Links each customer to rentals made by the customer
  Entity 1: CUSTOMER – a rental must be for one customer (1, 1)
  Entity 2: RENTAL – a customer may make many rentals (0, many)
• Owner of – Links each rental to the associative entity
  Entity 1: RENTAL – a line must belong to one rental (1, 1)
  Entity 2: LINE – a rental must contain one or many lines (1, many)
• Holder of – Links the associative entity to a specific videotape
  Entity 1: LINE – a video may be held by many lines (0, many)
  Entity 2: VIDEO – a line must hold one videotape (1, 1)
• Name for – Links a title to the videotapes that use the title
  Entity 1: VIDEO – a title may name many videos (0, many)
  Entity 2: TITLE – a videotape must be named by one title (1, 1)
Enterprise Data Model (EDM) Rules
• The EDM provides an overview of the organizational area under study
• Associative entities are included if they represent an important “thing” in the organization. Weak entities are omitted
• Attributes of the entities are not shown
• Relationships are shown with maximum cardinalities only and relationship phrases; relationship diamonds are omitted. Many-to-many relationships are acceptable
Enterprise Data Model (EDM) (Figure 4.14)
[Figure: entities CUSTOMER, RENTAL, EMPLOYEE, VENDOR, and VIDEO connected by the relationships Requests, Makes, Supplies, and Contains]
Logical Data Models
• Logical data models translate conceptual data models into a specific data storage structure
• Many information systems built between 1950 and 1980 used a logical structure of sequential or “flat” files stored physically on magnetic tape
• Today, the most common logical model for data storage is the relational model. Many physical database implementations use the relational model
ERD and Corresponding Relational Tables (Figure 4.15)
[Figure: ERD in which CUSTOMER (Cust-No, L-Name, City, State) Makes RENTAL (Rental-No, Date, Clerk-No)]
CUSTOMER
Cust-No  L-Name   City         State
23278    Clinton  Little Rock  AR
10995    Dole     Wichita      KS
22671    Kerry    Boston       MA
00987    Bush     Crawford     TX
RENTAL
Rental-No  Date      Clerk-No  Cust-No
1176       0102200x  11        22671
2235       0203200x  07        00987
4450       1121200x  07        22671
0067       0102200x  09        10995
3309       0510200x  11        10995
2621       0330200x  08        22671
Foreign key: Cust-No in RENTAL refers to Cust-No in CUSTOMER
ERDs and Relational Models (Table 4.2)
E-R Model Terms and Concepts – Relational Model Terms and Concepts
• Entity (regular, weak or associative) – Table or relation
• Single-valued attribute – Column or attribute
• Multi-valued attribute – (Not allowed)
• Instance – Row or tuple
• Primary key (or primary identifier) – Primary key
• One-to-one or one-to-many relationship – Foreign key
• Many-to-many relationship – (Not allowed)
Rules for Relational Models
• Every table name and the full name of every column must be unique
• A column must have a single value for each row
• The meaning of a column is determined only by the name
• A row is defined only by the content of the information in the row
• The content of each row must be unique
Foreign Keys
Foreign keys express the relationships between instances in different tables
• A foreign key may have any unique full-name
• For 1:m relationships, the foreign key always goes in the table on the many side of the relationship
• For 1:1 relationships, the foreign key may appear in either table
• Relational tables do not allow for the implementation of m:n relationships
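The 1:m rule can be tried out with `sqlite3`, using a few rows from the CUSTOMER and RENTAL tables of Figure 4.15. The snake_case column names and the rejected row ('9999', '00000') are assumptions made for the example. Note that SQLite enforces foreign keys only after `PRAGMA foreign_keys = ON`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.executescript("""
CREATE TABLE customer (cust_no TEXT PRIMARY KEY, l_name TEXT);
CREATE TABLE rental (
    rental_no TEXT PRIMARY KEY,
    -- foreign key placed in the table on the many side of the 1:m relationship
    cust_no   TEXT NOT NULL REFERENCES customer(cust_no)
);
INSERT INTO customer VALUES ('22671', 'Kerry'), ('10995', 'Dole');
INSERT INTO rental VALUES ('1176', '22671'), ('4450', '22671'), ('0067', '10995');
""")

# Follow the foreign key to list one customer's rentals.
kerry_rentals = [row[0] for row in conn.execute("""
    SELECT r.rental_no FROM rental r
    JOIN customer c ON c.cust_no = r.cust_no
    WHERE c.l_name = 'Kerry' ORDER BY r.rental_no
""")]

# A rental that points at a missing customer breaks referential integrity.
try:
    conn.execute("INSERT INTO rental VALUES ('9999', '00000')")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True
print(kerry_rentals, orphan_rejected)
```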
Relational Schema Representations (Figure 4.16)
Column Heading Schema:
CUSTOMER: Cust-No | L-Name | City | State
RENTAL: Rental-No | Date | Clerk-No | Cust-No
Set Notation Schema:
CUSTOMER(Cust-No, City, L-Name, State)
RENTAL(Rental-No, Cust-No, Date, Clerk-No)
Box Schema:
[Figure: box CUSTOMER (Cust-No, L-Name, City, State) joined to box RENTAL (Rental-No, Date, Cust-No, Clerk-No) by a line marked 1 and *]
Rules for Converting ERDs to Relational Schemas (I)
• Convert all the many-to-many relationships, if any, to associative entities
• Convert all multi-valued attributes to entities
• Convert every entity to a table with column for each attribute of the entity
• For every one-to-many relationship between two entities, add a foreign key to the table that corresponds to the entity on the many side of the relationship
Rules for Converting ERDs to Relational Schemas (II)
• For a one to one relationship between two entities, add or identify a foreign key in either of the tables
• Add a referential integrity arrow or line from each foreign key to the corresponding primary key
• With unary relationships, add the foreign key to the single table. The referential integrity arrow connects the primary and foreign key in the same table
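For the unary case, the foreign key simply points back at the primary key of its own table. The `sqlite3` sketch below uses a police officer table with a hypothetical `commander` column and made-up rows, then joins the table to itself to read the relationship.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE police_officer (
    id        INTEGER PRIMARY KEY,
    name      TEXT,
    -- foreign key that refers to the primary key of the same table
    commander INTEGER REFERENCES police_officer(id)
);
INSERT INTO police_officer VALUES
    (1, 'Chief', NULL), (2, 'Officer A', 1), (3, 'Officer B', 1);
""")

# Self-join: pair each officer with the officer who commands them.
commanded = conn.execute("""
    SELECT c.name, o.name
    FROM police_officer o
    JOIN police_officer c ON c.id = o.commander
    ORDER BY o.id
""").fetchall()
print(commanded)
```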
Relational Schema for GB Video (Figure 4.17), Box Schema Format
CUSTOMER: Member-No, Name, Street, City, State, Zip, Tel-No, Credit-Card-No, Expire-Date
RENTAL: Rental-No, Member-No, Date, Employee-No, Pay-Type
VIDEO: Video-No, Title, Date-Acquired, Vendor, Rent-Charge/Day
RENTAL/VIDEO: Rental-No, Video-No, Due-Date, Cost, Return-Date, Overdue-Charge
[Figure: three connecting lines between the boxes, each marked 1 at one end and * at the other]
Unary Relationships in Relational Tables (Figure 4.18)
[Figure: entity POLICE OFFICER (ID#, Name, Rank) with the unary relationships Partner of and Commands]
POLICE OFFICER table columns: ID#, Name, Rank, Partner#, Commander#
A Unary m:n Relationship in an ERD (Figure 4.19)
[Figure: ERD representation for the bill of materials problem. Entity PART (ID#, Description, Weight) with the unary m:n relationship “Goes into or contains”, which has the attribute Quantity]
Other Unary m:n Representations (Figures 4.20 & 4.21)
Associative Entity Representation:
[Figure: PART (ID#, Description, Cost) linked to the associative entity PART/PART (Line#, Quantity) by the relationships Contains and Goes into]
Relational Schema Representation:
PART: ID#, Description, Cost
PART/PART: Line#, Assembly-Part#, Component-Part#, Quantity
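The PART and PART/PART schema can be exercised with a small `sqlite3` sketch. The parts (bicycle, wheel, spoke) and quantities are made-up sample data, and the recursive query is one possible way to "explode" an assembly into its components; it is not taken from the text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE part (id INTEGER PRIMARY KEY, description TEXT);
-- Associative entity for the unary m:n relationship: two foreign keys to PART.
CREATE TABLE part_part (
    line_no        INTEGER PRIMARY KEY,
    assembly_part  INTEGER NOT NULL REFERENCES part(id),
    component_part INTEGER NOT NULL REFERENCES part(id),
    quantity       INTEGER NOT NULL
);
INSERT INTO part VALUES (1, 'bicycle'), (2, 'wheel'), (3, 'spoke');
INSERT INTO part_part VALUES (1, 1, 2, 2), (2, 2, 3, 32);
""")

# Walk the bill of materials for part 1, multiplying quantities level by level.
totals = conn.execute("""
    WITH RECURSIVE explode(part_id, qty) AS (
        SELECT component_part, quantity FROM part_part WHERE assembly_part = 1
        UNION ALL
        SELECT pp.component_part, e.qty * pp.quantity
        FROM explode e JOIN part_part pp ON pp.assembly_part = e.part_id
    )
    SELECT p.description, SUM(e.qty)
    FROM explode e JOIN part p ON p.id = e.part_id
    GROUP BY p.description ORDER BY p.description
""").fetchall()
print(totals)
```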
Normalization
• Addresses the improvement of a logical data design to avoid possible problems with data duplication and with deletion and updating of data.
• Converts an un-normalized table into two or more smaller, normalized tables
• Consists of six steps that successively transform a relation into First, Second, Third, Boyce/Codd, Fourth and Fifth Normal Forms
Normalization Steps 1 – 3
1. First Normal Form – remove multivalued attributes
2. Second Normal Form – remove partial dependency: the value of a non-key attribute depends on only part of the composite primary key
3. Third Normal Form – remove transitive dependency: a non-key attribute depends on another non-key attribute
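The dependencies named in steps 2 and 3 can be probed on sample rows. The helper below, with the hypothetical name `depends_on`, reports whether one set of columns determines another column in the data it is given. Sample rows can disprove a dependency but never prove one, so the result is only a hint; the rows are made up, with column names adapted from the rental example.

```python
def depends_on(rows, determinant, dependent):
    """Return True if no two rows agree on `determinant` but differ on `dependent`."""
    seen = {}
    for row in rows:
        key = tuple(row[col] for col in determinant)
        value = row[dependent]
        # setdefault stores the first value seen for this key and returns it afterwards
        if seen.setdefault(key, value) != value:
            return False
    return True


# Composite primary key: (rental_no, video_no).
lines = [
    {"rental_no": 1, "video_no": 10, "due_date": "d1", "title": "A"},
    {"rental_no": 2, "video_no": 10, "due_date": "d2", "title": "A"},
    {"rental_no": 1, "video_no": 11, "due_date": "d1", "title": "B"},
]

# title depends on video_no alone, only part of the key: a second normal form problem.
print(depends_on(lines, ["video_no"], "title"))
# due_date is not determined by video_no alone.
print(depends_on(lines, ["video_no"], "due_date"))
```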
Second Normal Form (Table 4.3 & Figure 4.22)
Second normal form violation:
(Rental-No, Video-No, Due-Date, Title)
Tables in second normal form:
(Rental-No, Video-No, Due-Date)
(Video-No, Title)
Third Normal Form (Table 4.4 and Figure 4.23)
Third normal form violation:
(Video-No, Title-No, Vendor, Date-Acquired)
Tables in third normal form:
(Video-No, Title-No, Date-Acquired)
(Title-No, Vendor)
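The third normal form split can be sketched in plain Python: project the un-normalized rows onto the two smaller column sets, then rejoin them to confirm nothing was lost. The helper name `project` and the sample rows are assumptions made for illustration.

```python
def project(rows, columns):
    """Project rows onto `columns`, dropping duplicate tuples (order preserved)."""
    seen, out = set(), []
    for row in rows:
        values = tuple(row[col] for col in columns)
        if values not in seen:
            seen.add(values)
            out.append(dict(zip(columns, values)))
    return out


# Un-normalized: vendor depends on title_no, a non-key attribute.
videos = [
    {"video_no": 1, "title_no": 100, "vendor": "V1", "date_acquired": "2005-01-01"},
    {"video_no": 2, "title_no": 100, "vendor": "V1", "date_acquired": "2005-02-01"},
    {"video_no": 3, "title_no": 200, "vendor": "V2", "date_acquired": "2005-02-01"},
]

video_table = project(videos, ["video_no", "title_no", "date_acquired"])
title_table = project(videos, ["title_no", "vendor"])  # each vendor stored once per title

# Rejoin on title_no to check that the decomposition loses no information.
vendor_of = {t["title_no"]: t["vendor"] for t in title_table}
rejoined = [{**v, "vendor": vendor_of[v["title_no"]]} for v in video_table]
print(len(title_table), rejoined == videos)
```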
Structured Query Language (SQL)
• A programming language for relational databases that exists at both the logical and physical level
• SQL provides commands for a:
– Data Definition Language (DDL) to create, alter and drop tables
– Data Manipulation Language (DML) to insert, update, delete and retrieve data
– Data Control Language (DCL) to grant and revoke access privileges for a database
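The DDL and DML command groups can be tried from Python's built-in `sqlite3` module. The table and values below are illustrative, borrowed from the CUSTOMER example. SQLite has no GRANT or REVOKE statements, so the data control commands are not shown here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: create and alter a table.
conn.execute("CREATE TABLE customer (cust_no TEXT PRIMARY KEY, l_name TEXT)")
conn.execute("ALTER TABLE customer ADD COLUMN city TEXT")

# DML: insert, update, retrieve, delete.
conn.execute("INSERT INTO customer VALUES (?, ?, ?)", ("22671", "Kerry", None))
conn.execute("UPDATE customer SET city = ? WHERE cust_no = ?", ("Boston", "22671"))
after_update = conn.execute("SELECT cust_no, l_name, city FROM customer").fetchall()
conn.execute("DELETE FROM customer WHERE cust_no = ?", ("22671",))
after_delete = conn.execute("SELECT COUNT(*) FROM customer").fetchall()[0][0]

# DDL again: drop the table.
conn.execute("DROP TABLE customer")
print(after_update, after_delete)
```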
Dimensional Data Models
Basic idea – use static data generated by operations to gain insight for strategic and tactical decisions
• The typical dimensional model data structure is a data mart designed around a central fact table that contains numeric values for analysis
• The data mart model supplies the framework for creating a data warehouse
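A central fact table surrounded by dimension tables can be sketched with `sqlite3`. Everything in this example is assumed for illustration: the table names (`fact_rental`, `dim_title`, `dim_date`), the numeric measure `rent_charge`, and the rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Dimension tables describe the context of each fact.
CREATE TABLE dim_title (title_no INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE dim_date  (date_key INTEGER PRIMARY KEY, month TEXT);
-- Central fact table: foreign keys to the dimensions plus a numeric measure.
CREATE TABLE fact_rental (
    title_no    INTEGER REFERENCES dim_title(title_no),
    date_key    INTEGER REFERENCES dim_date(date_key),
    rent_charge REAL
);
INSERT INTO dim_title VALUES (1, 'Title A'), (2, 'Title B');
INSERT INTO dim_date  VALUES (1, '2006-01'), (2, '2006-02');
INSERT INTO fact_rental VALUES (1, 1, 3.0), (1, 2, 3.0), (2, 2, 4.5);
""")

# Analysis query: total the measure along one dimension.
revenue_by_title = conn.execute("""
    SELECT t.name, SUM(f.rent_charge)
    FROM fact_rental f JOIN dim_title t ON t.title_no = f.title_no
    GROUP BY t.name ORDER BY t.name
""").fetchall()
print(revenue_by_title)
```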