
    8. CONCEPTUAL DATA MODELING: FUNDAMENTAL CONCEPTS

    First step in designing a database: specifying the conceptual schema.

    2009 John P. Shewchuk ISE 3024 Course Notes 8 1

    This is a critical step. Without an accurate conceptual schema, the resulting information system may not meet our needs or even function properly.

    The conceptual schema is the blueprint for information system design.

    Let's first describe fundamental data modeling concepts, so that we know how to specify a conceptual schema.

    We will do this in the context of an old-fashioned bookstore.

    We'll then see how we can use the discovery process to develop a conceptual schema.

    8.1 Entities, Entity Classes, Attributes and Attribute Values

    Entity = some particular object of interest to us.

    e.g., a book having ISBN 0-13-056517-2
          Thomson International (a book supplier)
          shelf s4 (a shelf at Anna's Books which holds books)

    Entity Class = collection of similar entities

    e.g., Book entity class
          Customer entity class
          Supplier entity class
          Shelf entity class

    We can also think of entities as instances of a given class.

    Attributes = characteristics used to define similar entities.

    e.g., Book entities defined by ISBN, title, author, publisher, year, price, bookId
          Customer: lastName, firstName, address
          Supplier: name, state, phone, rating
          Shelf: shelfId

    Attribute Values = values for attributes for a particular entity.

    e.g., Book entities:

    bookId  title                  author                publisher         ISBN           year  price
    12      Star Wars: Jedi Trial  Dan Cragg             Random House      0-3454-6115-0  2005  $6.99
    27      Band of Brothers       Stephen Ambrose       Simon & Schuster  0-7432-2454-X  2001  $17.82
    9       Circle of Friends      Maeve Binchy          Dell Publishing   0-4402-1126-3  1995  $7.99
    37      Message In A Bottle    Nicholas Sparks       Time Warner       1-5948-3636-1  1999  $7.50
    23      The Long Road Home     Danielle Steel        Dell Publishing   0-4402-4344-0  2006  $14.99
    4       Star Wars: Jedi Trial  Dan Cragg             Random House      0-3454-6115-0  2005  $6.99
    18      Beyond Valor           Patrick O'Donnell     Simon & Schuster  0-6848-7385-0  2001  $8.99
    32      Curious George         Margret and H.A. Rey  Houghton Mifflin  0-3951-5023-X  1973  $14.00
    17      Beyond Valor           Patrick O'Donnell     Simon & Schuster  0-6848-7385-0  2001  $8.99

    Customer entities:

                          address
    lastName  firstName   street                  city            state  zip
    Smith     John        214 Church St.          Blacksburg      VA     24073
    Davis     Diane       185 Clay St.            Blacksburg      VA     24073
    Smith     John        24 Range Road           Radford         VA     24141
    Morgan    Leslie      240 Kent St., Apt. 27   Christiansburg  VA     24061
    Carter    Brian       14 2nd Street           Lafayette       IN     47096

    Supplier entities:

    name                   state  phone                           rating
    Thomson International  CA     (216)-175-5065                  A
    ABC Books              IN     (216)-831-9175, (704)-668-2203  A
    BookWorld              VA     (331)-791-4982, (501)-773-7381  C

    Shelf entities:

    shelfId
    s1
    s2
    s3
    s4
    s5
    s6
    :
    s40
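
    The four ideas above (entity class, entity, attribute, attribute value) can be sketched in code. This is only an illustration, not part of the course notes: Python dataclasses stand in for entity classes, and the snake_case field names are assumed spellings of the attributes listed above.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class Book:
    """Entity class: every instance is one Book entity (one physical copy)."""
    book_id: int
    title: str
    author: str
    publisher: str
    isbn: str
    year: int
    price: float


@dataclass(frozen=True)
class Shelf:
    """Entity class with a single attribute."""
    shelf_id: str


# Entities are instances of an entity class; the constructor arguments
# are the attribute values for that particular entity.
message_in_a_bottle = Book(37, "Message In A Bottle", "Nicholas Sparks",
                           "Time Warner", "1-5948-3636-1", 1999, 7.50)
s4 = Shelf("s4")

# The attributes belong to the class, the values to the entity.
print([f.name for f in fields(Book)])
print(message_in_a_bottle.title, message_in_a_bottle.price)
```

    The class definition plays the role of the attribute list, while each constructed object carries its own attribute values.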

    8.2 More on Attributes

    Every attribute is one of three types:

    - single-valued, e.g., title (attribute of Book)
    - multivalued, e.g., phone (attribute of Supplier)
    - composite, e.g., address

    Additionally, every attribute is either a key or non-key attribute.

    Key Attribute(s) = set of one or more attributes which uniquely identify entities within a class.

    Possible key attributes for our book store entity classes:

    Book:     bookId
    Customer: {lastName, firstName, address}
    Supplier: name
    Shelf:    shelfId

    If an entity class has more than one key (i.e., sets of key attributes), one of the keys is designated the primary key. The other keys are then called secondary keys.

    Primary key: single attribute, value unlikely to change

    Specification of key attributes for entity classes is an important part of data modeling. So be careful!
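
    A quick way to test a candidate key against sample data is to check that no two entities share the same values for the chosen attributes. The sketch below is illustrative only; the function name is_key and the dictionary representation of entities are assumptions, and the customer rows are taken from the Customer entities shown earlier.

```python
def is_key(entities, attrs):
    """Return True if the combined values of `attrs` are unique across `entities`."""
    seen = set()
    for entity in entities:
        value = tuple(entity[a] for a in attrs)
        if value in seen:
            return False  # two entities share the same candidate-key value
        seen.add(value)
    return True


customers = [
    {"lastName": "Smith", "firstName": "John",
     "address": "214 Church St., Blacksburg, VA 24073"},
    {"lastName": "Davis", "firstName": "Diane",
     "address": "185 Clay St., Blacksburg, VA 24073"},
    {"lastName": "Smith", "firstName": "John",
     "address": "24 Range Road, Radford, VA 24141"},
]

print(is_key(customers, ["lastName", "firstName"]))             # False
print(is_key(customers, ["lastName", "firstName", "address"]))  # True
```

    Passing this check on sample data is necessary but not sufficient: a key is a statement about every entity the class could ever contain, so the final choice still rests on knowledge of the application.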

    Key attributes must always have values. But what about the remaining (i.e., non-key) attributes?

    Such attributes should normally always have values as well. The special case of an empty space (field) is called a null value (or simply null).

    While null values are sometimes necessary, they should be avoided if possible due to the ambiguity they create. For example, null could mean

    - the attribute is not applicable for that entity.
    - the value is known but missing (i.e., not recorded).
    - the value is unknown.

    e.g., (blank fields are null values)

                          address
    lastName  firstName   street           city        state  zip
    Smith     John        214 Church St.   Blacksburg  VA     24073
    Davis     Diane
    O'Keefe   Patrick     35 Cambria St.               VA     24061
    Smith     John        24 Range Road    Radford     VA     24141
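
    The table above can be mirrored in code to show why nulls are ambiguous. This is a minimal sketch, assuming Python's None as the stand-in for a null value; the Customer tuple and the helper null_attributes are hypothetical names introduced for the illustration.

```python
from typing import NamedTuple, Optional


class Customer(NamedTuple):
    last_name: str
    first_name: str
    street: Optional[str]
    city: Optional[str]
    state: Optional[str]
    zip_code: Optional[str]


customers = [
    Customer("Smith", "John", "214 Church St.", "Blacksburg", "VA", "24073"),
    Customer("Davis", "Diane", None, None, None, None),
    Customer("O'Keefe", "Patrick", "35 Cambria St.", None, "VA", "24061"),
    Customer("Smith", "John", "24 Range Road", "Radford", "VA", "24141"),
]


def null_attributes(entity):
    """List the attribute names whose value is null (None) for this entity."""
    return [name for name, value in entity._asdict().items() if value is None]


for c in customers:
    print(c.first_name, c.last_name, null_attributes(c))
```

    The code can report which fields are null, but nothing in the data says whether a given None means "not applicable", "not recorded", or "unknown". That missing information is exactly the ambiguity described above.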

    8.3 Defining Attributes

    Every attribute is defined via the following:

    i) a name
    ii) a data type
        - e.g., integer, string, currency, date
    iii) a length
        - e.g., n-digit integer, n-character string
    iv) a format
        - e.g., #.## for decimal
    v) a domain, or set of possible values
    vi) a description

    e.g.,

    Name     Data Type  Length              Format        Domain     Description               Example
    title    string     unbounded           none          none       Title of a book           Beyond Valor
    address  composite  Two strings of 36   none          none       An address consisting of  214 Church St.,
             string     characters, one                              street, city, state, and  Blacksburg,
                        2-char string, one                           zipcode                   VA, 24073
                        5-char string
    phone    string     10 char             xxx-xxx-xxxx  none       A phone number            216-175-5065
    price    currency   5 digits            $xxx.xx       none       Price ($, cents)          17.42
    shelfId  string     2 char              none          s1-s6      Shelf identifier          s3
    year     integer    4 digits            none          Values     Year (Julian calendar)    1994
                                                          1900-2010
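
    An attribute definition of this kind can be turned into a simple validator. The sketch below is one possible encoding, not something prescribed by the notes: the class AttributeDef, its field names, the use of a regular expression for the format, and the use of a predicate function for the domain are all assumptions.

```python
import re
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class AttributeDef:
    name: str
    data_type: type
    max_length: Optional[int] = None   # None means unbounded
    format: Optional[str] = None       # assumed: regular expression for the format
    domain: Optional[Callable[[object], bool]] = None  # assumed: membership test
    description: str = ""

    def is_valid(self, value) -> bool:
        """Check a value against data type, length, format, and domain."""
        if not isinstance(value, self.data_type):
            return False
        if self.max_length is not None and len(str(value)) > self.max_length:
            return False
        if self.format is not None and not re.fullmatch(self.format, str(value)):
            return False
        if self.domain is not None and not self.domain(value):
            return False
        return True


year = AttributeDef("year", int, max_length=4,
                    domain=lambda y: 1900 <= y <= 2010, description="Year")
phone = AttributeDef("phone", str, format=r"\d{3}-\d{3}-\d{4}",
                     description="A phone number")

print(year.is_valid(1994), year.is_valid(2019))
print(phone.is_valid("216-175-5065"), phone.is_valid("(216)-175-5065"))
```

    The year and phone definitions follow the corresponding rows of the table; the other rows could be encoded in the same way.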

    Attribute values may also have constraints as follows:

    Key: values for associated attributes must be unique for every entity in that class.

    - this constraint on key values is known as a key constraint or entity integrity constraint.
    - database system responsible for enforcing.

    Not Null: the attribute must always have a value.

    Derived: attribute is not entered directly, but rather is calculated from some other information.
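
    A derived attribute maps naturally onto a computed property. In the sketch below, book_count is a hypothetical derived attribute of Shelf (the notes do not name one); it is calculated from the stored list of book identifiers and is never entered directly.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Shelf:
    shelf_id: str
    book_ids: List[int] = field(default_factory=list)  # stored information

    @property
    def book_count(self) -> int:
        """Derived attribute: computed on demand from the stored book ids."""
        return len(self.book_ids)


s4 = Shelf("s4", [17, 18, 32])
print(s4.book_count)

s4.book_ids.append(4)   # the stored data changes ...
print(s4.book_count)    # ... and the derived value follows automatically
```

    Because the value is recomputed each time, it cannot drift out of step with the data it is derived from.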

    8.4 Relationships and Relationship Types

    In many situations, entities are related to one another.

    e.g., whenever a customer buys a book, a relationship is created between that customer and that book.

    Important part of data modeling: establishing what kinds of relationships can occur and incorporating this information into the data model.

    Relationship Type = specification of the nature of an association that one entity may have with another (entities of different classes usually)

    e.g., Customer Purchases Book

    Relationship = specific association between two actual entities, per given relationship type.

    e.g., John Smith buys Band of Brothers

    We can also think of relationships as instances of a given relationship type.

    How are relationship types designated? Each relationship type consists of a verb phrase which can be used in a sentence with the entity classes, e.g.,

    Phrase                      Relationship Type  Between
    A customer buys a book      Purchases          Customer, Book
    Books are found on shelves  FoundOn            Book, Shelf

    We can also specify relationship types in reverse order.

    e.g., Shelf Holds Book

    Relationship types may have their own attributes.

    e.g., Customer Purchases Book

    - whenever a customer buys a book, we may want to record the date of the sale.
    - the corresponding attribute, saleDate, only exists for a given sale, i.e., customer and book.

    e.g., Book AvailableFrom Supplier

    - price = cost of that book from that supplier
            = attribute of relationship type AvailableFrom
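
    One way to see why price belongs to the relationship rather than to Book or Supplier is to key it by the pair of entities involved. The sketch below is illustrative only: the book ids and supplier names come from the earlier tables, but the supplier prices are made-up values, and the helper names are assumptions.

```python
# Relationship instances of type AvailableFrom:
# (bookId, supplier name) -> price charged by that supplier for that book.
# The prices are invented for illustration.
available_from = {
    (27, "Thomson International"): 12.50,
    (27, "ABC Books"): 11.95,
    (32, "ABC Books"): 9.00,
}


def offers_for(book_id):
    """All suppliers offering the given book, with their prices."""
    return {supplier: price
            for (book, supplier), price in available_from.items()
            if book == book_id}


def cheapest_supplier(book_id):
    """Return (supplier, price) for the lowest-priced offer of a book."""
    return min(offers_for(book_id).items(), key=lambda item: item[1])


print(offers_for(27))
print(cheapest_supplier(27))
```

    The price cannot be stored on the book alone (it differs by supplier) or on the supplier alone (it differs by book), which is why it is an attribute of the relationship type.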

    8.5 Constraints on Relationship Types

    Cardinality Ratio Constraint = specification of how many entities from either class can be involved in a relationship of a given type.

    Three types of cardinality ratio constraints.

    One-to-One (1:1): an entity from either class may be related to at most one entity from the other class.

    e.g., relationship type isMarriedTo between Man and Woman

    Man to Woman is 1:1

    - Each man may be married to at most one woman.
    - Each woman may be married to at most one man.

    One-to-Many (1:M): an entity from the first class may be related to multiple entities from the second; an entity from the second may be related to at most one entity from the first.

    e.g., Customer Purchases Book

    Customer to Book is 1:M

    - Each customer may purchase many books.

    Many-to-Many (M:N): an entity from either class may be related to multiple entities from the other class.

    e.g., Book AvailableFrom Supplier

    Book to Supplier is M:N

    - Each book may be available from multiple suppliers.
    - Each supplier may supply multiple books.
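
    The three cardinality ratios can be recognized in a set of relationship instances by counting how many partners each entity has. The function below is a sketch under that assumption; its name and the pair representation are hypothetical, and the example pairs are invented from the entities used in these notes.

```python
from collections import Counter


def cardinality_ratio(pairs):
    """Classify (first, second) relationship instances as 1:1, 1:M, M:1 or M:N."""
    pairs = set(pairs)
    first_counts = Counter(a for a, _ in pairs)    # partners per first-class entity
    second_counts = Counter(b for _, b in pairs)   # partners per second-class entity
    first_has_many = any(n > 1 for n in first_counts.values())
    second_has_many = any(n > 1 for n in second_counts.values())
    if first_has_many and second_has_many:
        return "M:N"
    if first_has_many:
        return "1:M"
    if second_has_many:
        return "M:1"
    return "1:1"


purchases = [("John Smith", 27), ("John Smith", 9), ("Diane Davis", 37)]
available_from = [(27, "Thomson International"), (27, "ABC Books"), (32, "ABC Books")]

print(cardinality_ratio(purchases))       # 1:M
print(cardinality_ratio(available_from))  # M:N
```

    Observed data can only show that a ratio is at least this permissive. The constraint itself is a design decision about what the system must allow, so it comes from the requirements rather than from a sample.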

    Other types of constraints on relationships:

    Min/Max Constraints specify how many relationships of a given type an entity may have.

    e.g., Shelf Holds Book. Each shelf can hold no more than 60 books.

    - max cardinality of Book = 60
    - min cardinality of Book = 0

    every Shelf entity can be associated with between 0 and 60 Book entities.

    some Shelf entities may not be associated with any Book entities.

    Each book, however, is found on exactly one shelf.

    - max cardinality of Shelf = 1
    - min cardinality of Shelf = 1

    every Book entity can be associated with between 1 and 1 Shelf entity.

    every Book entity must be associated with exactly one Shelf entity.

    Participation Constraints specify whether or not entities must participate in associated relationship types.

    e.g., Customer Purchases Book

    - participation of Customer is optional
      (Customer can exist without any books)
    - participation of Book is optional

    e.g., Shelf Holds Book

    - participation of Shelf is optional
      (Some shelves may not have any books)
    - participation of Book is mandatory
      (Every book must be located on some shelf)

    Note that

    - optional participation when min cardinality of related entity class = 0.
      e.g., cardinality of Book = (0..60) => participation of Shelf optional

    - mandatory participation when min cardinality of related entity class > 0.
      e.g., cardinality of Shelf = (1..1) => participation of Book mandatory

    Thus, we can use min cardinality to determine and/or specify whether participation is mandatory or optional.
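
    Both rules lend themselves to a mechanical check. The sketch below assumes relationship counts are already available per entity; the function names and the sample counts are hypothetical, chosen to exercise the Shelf Holds Book limits of (0..60) and (1..1).

```python
def participation(min_cardinality_of_related_class):
    """Participation is optional exactly when the related class has min cardinality 0."""
    return "optional" if min_cardinality_of_related_class == 0 else "mandatory"


def violations(counts, lo, hi):
    """Entities whose number of relationships falls outside lo..hi."""
    return [entity for entity, n in counts.items() if not lo <= n <= hi]


# Shelf Holds Book: invented counts for illustration.
books_per_shelf = {"s1": 60, "s2": 0, "s3": 61}   # cardinality of Book = (0..60)
shelves_per_book = {17: 1, 18: 1, 32: 0}          # cardinality of Shelf = (1..1)

print(violations(books_per_shelf, 0, 60))   # shelves holding too many books
print(violations(shelves_per_book, 1, 1))   # books not on exactly one shelf

print(participation(0))  # Shelf: related class Book has min cardinality 0
print(participation(1))  # Book: related class Shelf has min cardinality 1
```

    An empty shelf (s2) is allowed because the minimum is 0, while a book on no shelf (32) is flagged because the minimum is 1.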

    8.6 The Discovery Process

    To obtain the data needed to construct a conceptual schema, the discovery process is used.

    This consists of interviewing the people involved with the data management application, analyzing current documents, etc., to determine

    - the user requirements. What kind of operations are the users interested in performing on the data? What types of queries will they be interested in?

    - the objects of interest (entities), their characteristics (attributes), and how they interact (relationships), in terms of satisfying the user requirements.

    The results of interviews, document collection, etc., are narrative descriptions of how the system (or some part of it) works or what information must be extracted from it.

    Such descriptions are often incomplete, conflicting, ambiguous, and/or contain irrelevant information.

    It is then necessary to collect, compile, and analyze the descriptions to identify entity classes, attributes, and relationship types.

    This is often an iterative process, and can result in tensions and conflicts. So good people skills are critical to this step.

    Note also that the discovery process often results in information describing how the data management application must behave.

    While not part of the conceptual schema, this information must still be documented as it will be needed later on in developing the database application.

    Such information is often referred to as the behavioral model of the system.

    8.7 Example: Conceptual Data Modeling for Anna's Books

    Anna's Books is a small, locally-operated bookstore which sells books the old-fashioned way (i.e., person-to-person, with no online sales).

    The store owner has hired you as a consultant to design and implement an information system to aid in selling books.

    The discovery process has resulted in the following description of user requirements and system operation:

    Anna's Books is a small bookstore which sells books to walk-in customers. The owner wants computer support in

    - locating where (i.e., on what shelf) a given book can be found,
    - determining how many copies of a given title are on hand,
    - tracking customer sales, and
    - keeping track of what books are available from which suppliers, and at what price.

    Customers purchase one or more books at a time: for each purchase, the date must be recorded. The first and last name of each customer is known: it is desirable to also know each customer's address (for advertising purposes). The title, author, publisher, year, price, and ISBN # are recorded for each book.

    Books are located on one of forty shelves. Each shelf can hold up to sixty books. All books of the same type (i.e., title) are to be kept on the same shelf.

    Books are obtained from different suppliers, each of whom has a rating (A, B, or C), is located in a particular state, and can be contacted via one or more phone numbers. Every book is available from at least two different suppliers: a single supplier may supply many different books. Suppliers not able to provide any books should still be kept on record. For each book and possible supplier, we need to know the book price.

    (a) Entity classes, attributes, and additional attribute information:

    Entity Class  Attribute  Attribute Type and/or Constraints
    Customer      firstName  Partial key
                  lastName   Partial key
                  address    Partial key; composite attribute consisting of
                             street, city, state, zip
    Book          bookId     Key
                  title
                  author
                  publisher
                  year
                  price      Currency; not null
                  ISBN
    Supplier      name       Key
                  state
                  phone      Multivalued attribute
                  rating     Value from {A, B, C}
    Shelf         shelfId    Key

    (b) Relationship types, cardinality ratio constraints, and attributes:

    Relationship Type and Entity Classes  Cardinality Ratio  Attributes
    Customer Purchases Book               1:M                saleDate
    Book AvailableFrom Supplier           M:N                price
    Shelf Holds Book                      1:M

    Note that as (i) Shelf has no other attributes besides key, and (ii) has only a simple relation with Book, we could simply make shelfId an attribute of Book.

    Best to leave as a separate entity, however:

    - meaning clearer.
    - may want to provide additional shelf information later.

    (c) Participation constraints and min/max cardinality constraints:

    Relationship Type  Entity Class  Participation Constraint  Min/Max Cardinality
    Purchases          Customer      optional                  (0..1)
                       Book          optional                  (0..M)
    AvailableFrom      Book          mandatory                 (0..M)
                       Supplier      optional                  (2..N)
    Holds              Shelf         optional                  (1..1)
                       Book          mandatory                 (0..60)
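
    The table in (c) can be cross-checked against the rule from Section 8.5 (participation of a class is optional exactly when the min cardinality of the related class is 0) and against the cardinality ratios in (b). The sketch below is one hypothetical encoding of the table; the data structure and function names are assumptions made for the illustration.

```python
# relationship type -> two rows of (entity class, participation, (min, max))
constraints = {
    "Purchases":     [("Customer", "optional", (0, 1)), ("Book", "optional", (0, "M"))],
    "AvailableFrom": [("Book", "mandatory", (0, "M")), ("Supplier", "optional", (2, "N"))],
    "Holds":         [("Shelf", "optional", (1, 1)), ("Book", "mandatory", (0, 60))],
}


def inconsistent_rows(table):
    """Rows whose stated participation disagrees with the other row's min cardinality."""
    problems = []
    for rel, (first, second) in table.items():
        for this, other in ((first, second), (second, first)):
            name, stated, _ = this
            expected = "optional" if other[2][0] == 0 else "mandatory"
            if stated != expected:
                problems.append((rel, name, stated, expected))
    return problems


def ratio(first, second):
    """Cardinality ratio implied by the two max cardinalities."""
    left = "1" if first[2][1] == 1 else "M"
    if second[2][1] == 1:
        right = "1"
    else:
        right = "N" if left == "M" else "M"
    return f"{left}:{right}"


print(inconsistent_rows(constraints))
print({rel: ratio(*rows) for rel, rows in constraints.items()})
```

    An empty list from inconsistent_rows means every participation entry agrees with the min cardinalities, and the derived ratios can be compared directly with the Cardinality Ratio column in (b).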

    8.8 Example: Conceptual Data Modeling for JA Industries

    You are an IE at JA Industries, a company employing cellular manufacturing.

    The plant manager has charged you with improving productivity on the shop floor.

    The discovery process has resulted in the following description of user requirements and system operation.

    JA Industries produces various machined components via the use of manufacturing cells. The company aims to improve productivity on the shop floor.

    Typical questions arising in support of this effort:

    - Where (i.e., what machine and/or cell) are parts located?
    - What is the status (e.g., busy) of each machine?
    - What processes are required for a given part?
    - Which machines are capable of a given process?

    Each cell at JA Industries has a unique cell number, is of a specific type, and consists of 2-4 machines. Some cells also contain a robot for loading/unloading machines, in which case the single robot handles all machines in the cell. Every robot has a particular number, and is of a specific type and model. Shop workers perform load/unload if there is no robot in a cell.

    Each machine has a unique name, and is of a specific type. A separate number identifies the machine amongst all machines of that type (e.g., 3rd machine of type T4).

    Every machine can perform up to 5 different processes, where each process is defined by a unique identifier and a process description. Every process can be performed by at least one machine: some can be done by multiple machines. Not all machines perform a given process equally: the efficiency may vary from one machine to another.

    Each part produced in the facility has a unique serial number (SN) and is of a given type. For each part type, a unique process plan exists. The process plan specifies the sequence of operations (between 1 and 4) for that part type, where each operation is defined by machine, process, setup time, and run time.

    Also associated with each process plan is a list of the planners responsible for the plan. Some processes and machines may not be required for any operations.

    At any given time during production, every part in the system is located at a machine (i.e., in process or waiting to be processed). A given machine may not have any parts, however. Additionally, at any time, each machine is either busy, idle, or down.

    (a) Entity classes, attributes, and additional attribute information:

    Entity Class  Attribute           Attribute Type and/or Constraints
    Cell          cellNo              Key
                  cellType
    Robot         robotNo             Key
                  robotType
                  robotModel
    Machine       machName            Key
                  machNo              Partial secondary key
                  machType            Partial secondary key
                  status              Values from {busy, idle, down}
    Part          partSN              Key
    Process       processId           Key
                  processDescription
    ProcessPlan   partType            Key
                  planners            Multivalued attribute
                  date
    Operation     opno                Key
                  setup
                  run

    (b) Relationship types, cardinality ratio constraints, and attributes:

    Relationship Type and Entity Classes  Cardinality Ratio  Attributes
    Cell Has Machine                      1:M
    Cell Contains Robot                   1:1
    Robot Serves Machine                  1:M
    Machine Has Part                      1:M
    Process DoneAt Machine                M:N                efficiency
    Process DoneVia Operation             1:M
    Machine Performs Operation            1:M
    ProcessPlan Has Operation             1:M
    ProcessPlan For Part                  1:M

    (c) Participation constraints and min/max cardinality constraints:

    Relationship Type  Entity Class  Participation Constraint  Min/Max Cardinality
    Has                Cell          mandatory                 (1..1)
                       Machine       mandatory                 (2..4)
    Contains           Cell          optional                  (1..1)
                       Robot         mandatory                 (0..1)
    Serves             Robot         mandatory                 (0..1)
                       Machine       optional                  (2..4)
    Has                Machine       optional                  (1..1)
                       Part          mandatory                 (0..M)
    DoneAt             Process       mandatory                 (1..5)
                       Machine       mandatory                 (1..M)
    DoneVia            Process       optional                  (1..1)
                       Operation     mandatory                 (0..M)
    Performs           Machine       optional                  (1..1)
                       Operation     mandatory                 (0..M)
    Has                ProcessPlan   mandatory                 (1..1)
                       Operation     mandatory                 (1..4)
    For                ProcessPlan   optional                  (1..1)
                       Part          mandatory                 (0..M)