78164802

Embed Size (px)

Citation preview

  • 7/30/2019 78164802

    1/29

    RESEARCH NOTE

    ARESEARCH NOTE ON REPRESENTING PARTWHOLE

    RELATIONS IN CONCEPTUAL MODELING1

    Gove N. Allen

    Marriot School of Management, Brigham Young University, 783 Tanner Building,

    Provo, UT 84602 U.S.A. {[email protected]}

    Salvatore T. March

    Owen Graduate School of Management, Vanderbilt University,

    Nashville, TN 37203 U.S.A. {[email protected]}

    Empirical research is an important methodology for the study of conceptual modeling practices. The recently

    published article Representing PartWhole Relations in Conceptual Modeling: An Empirical Evaluation

    (Shanks et al. 2008) uses the lens of ontology to study a relatively sophisticated aspect of conceptual modeling

    practice, the representation of aggregation and composition. It contends that some analysts argue that a

    composite should be represented as a relationship while others argue that a composite should be represented

    as an entity. We find no evidence of such a dispute in the data modeling literature. We observe that composites

    are objects. By definition, all object-types should be represented as entities. Therefore, using the relationship

    construct to represent composites should not be seen as a viable alternative. Additionally, we found significant

    conceptual and methodological issues within the study that call its conclusions into question. As a way to offer

    insight into the requisite methodological procedures for research in this area, we conducted two experiments

    that both explicate and address the issues raised. Our results call into question the utility of using ontology

    as a foundation for conceptual modeling practice. Furthermore, they suggest a contrary but at least equally

    plausible explanation for the results reported by Shanks et al. In conducting this work we hope to encourage

    dialogue that will be beneficial for future endeavors aimed at identifying, developing, and evaluating

    appropriate foundations for the discipline of conceptual modeling.

    Keywords: Conceptual modeling, empirical research, ontology, information systems development, compo-

    sition, UML, entityrelationship model

    Introduction1

    Academic research in the field of information systems haslong been interested in examining the relationship between

    conceptual models (grammars, methods, and scripts) and

    human performance in tasks related to information system

    development and use (Batra et al. 1990; Bodart et al. 2001;

    Burton-Jones and Meso 2006; Kim and March 1995). Oneimportant question deals with identifying a sound theoretical

    basis for the development and assessment of conceptual

    modeling grammars and practices (Wand and Weber 2002).

    Bunges (1977) ontology has been suggested as such a theo-

    retical basis (Wand et al. 1995; Wand et al. 1999; Wand and

    Weber 1993). Several articles have reported experiments that

    study the effects of conformance to Bunges ontology on

    human performance in various tasks (Bodart et al. 2001;

    1Dale L. Goodhue was the accepting senior editor for this paper. Sandeep

    Purao served as the associate editor.

    The appendices for this paper are located in the Online Supplements

    section of theMIS Quarterlys website (http://www.misq.org).

    MIS Quarterly Vol. 36 No. 3, pp. 945-964/September 2012 945

  • 7/30/2019 78164802

    2/29

    Allen & March/Representing PartWhole Relations in Conceptual Meaning

    Bowen et al. 2006, 2009; Gemino and Wand 2005; Shanks et

    al. 2008.

    Establishing that a specific ontological foundation consis-

    tently leads to superior human performance in conceptual

    modeling tasks would be a significant contribution to our

    field. Shanks et al. (2008) draw this conclusion, contending

    that

    our results add strength to a growing body of empi-

    rical work that supports the usefulness of ontological

    theories, especially Bunges (1977) theory, as a

    means of predicting the strengths and weaknesses of

    conceptual modeling models and practices (p. 570).

    However, we found a number of apparent problems with the

    Shanks et al. study, in both its conceptual underpinnings and

    in the execution of the empirical tests that cast considerable

    doubt on such a conclusion.

    Articles in MIS Quarterly are often the starting point for

    further research. Because reliance on results with problematic

    conceptualization and methodology will have detrimental

    effects on the development of a cumulative research tradition

    within the discipline (Straub 2008), we feel it is important to

    discuss these problems, to recast and retest the assertions of

    Shanks et al., and to provide an exemplar of careful work in

    this difficult area of research.

    While we focus on what we see as critical concerns with the

    study reported by Shanks et al., it is important to note the

    significant contributions of the work. First, this is among avery small number of studies aimed at empirically testing the

    utility of a well-established philosophical ontology as the

    basis for conceptual modeling. Empirical studies of this

    nature are crucial. Second, it presents an innovate framework

    for data analysis that combines a quantitative assessment of

    subjects answers with a qualitative assessment of explana-

    tions for their answers and their interpretations of the modeled

    domain. It presents a detailed description of procedures used

    in this qualitative data analysis. Analysis and categorization

    of verbal protocols is a difficult and time-consuming process,

    but one that can yield significant insight into subjects

    reasoning processes.

    In undertaking this work, our intent is to encourage dialogue

    that will be beneficial for future endeavors as our field works

    toward identifying, developing, and evaluating appropriate

    foundations for the discipline of conceptual modeling. We

    identify and explicate two concerns with respect to the con-

    ceptualization and five with respect to the execution (method-

    ology) of the study presented in Shanks et al. Further, we

    conducted two experiments that validate our concerns and

    demonstrate how such research should be conducted.

    With respect to conceptualization, we have two fundamental

    concerns. First, the study is structured as an empirical evalua-

    tion of alternative conceptual-modeling representations of

    partwhole relations (p. 554) contending that, in practice,

    some analysts argue [that] composites should be represented

    as a relationship (p. 553) rather than as entities.2 We find no

    support for this contention in the literature. Composition is a

    relatively sophisticated construct available in the Unified

    Modeling Language (UML). It is used when a modeler deems

    it important to represent the fact that a relationship has a

    specialized meaning, that is, the association of part objects

    that constitute a composite object as opposed to a simple

    association (Fowler 2003). Representing the composite object

    itself as a relationship violates the basic definitions of the

    UML and generally accepted conceptual modeling practice

    (Carlis and Maguire 2001; Keet 2008; Kent 1978).

    Second, and more importantly, the study appeals to Bunges

    ontology as the theoretical foundation for predicting dif-

    ferences in human performance; but it does not conform to

    Bunges ontology in the conceptualization or design of the

    experiment. The experiment uses two conceptual models

    (diagrams). The first, termed ontologically clear, is argued to

    be consistent with Bunges ontology with respect to the

    representation of composite objects. The second, termed

    ontologically unclear, is argued to be inconsistent with

    Bunges ontology in this respect. The underlying proposition

    is that the ontologically clear model will result in superior

    user performance over the ontologically unclear model

    because of its conformance to Bunges ontology. However,

    upon close scrutiny and as discussed in detail below, neither

    of these models conforms to Bunges ontology with respect to

    the definition of composite objects.

    We have five concerns with respect to the studys execution.

    First, it does not successfully operationalize the independent

    variable, ontological clarity. Appealing to the theory of

    ontological clarity (Wand and Weber 1993), the study con-

    tends that the ontologically unclear model lacks ontological

    clarity because it exhibits construct overload. Specifically, it

    contends that the UML relationship construct is used to

    explicitly represent associations and to implicitly representcomposites (Shanks et al. 2008, p. 561). Association and

    composition are two distinct ontological constructs; however,

    2Although the precise terms for the diamonds and rectangles in an ER

    diagram are relationship types and entity types, to be consistent with Shanks

    et al. and for ease in reading, we will refer to them as relationships and

    entities. We will use the terms relationship instance and entity instance to

    refer to members of those sets represented in a diagram.

    946 MIS Quarterly Vol. 36 No. 3/September 2012

  • 7/30/2019 78164802

    3/29

    Allen & March/Representing PartWhole Relations in Conceptual Meaning

    UML has two graphical symbols to represent relationships:

    a labeled arc and a diamond (Rumbaugh et al. 2005). The

    ontologically unclear model exclusively uses labeled arcs to

    explicitly represent association relationships and exclusively

    uses diamonds to implicitly represent composite objects.

    Thus it does not overload either graphical symbol.

    Second is a concern with the development of the experimental

    materials. The study describes a procedure for generating the

    ontologically unclear model from the ontologically clear

    model by transforming composites into ternary relationships

    intended to implicitly represent those composites. It does not,

    however, follow that procedure. As a result there is an incon-

    sistent mapping between the models with respect to the

    representation of composite objects. Furthermore, the ternary

    relationships in the ontologically unclear model need not be

    interpreted as representing composites at all. They are more

    reasonably interpreted as representing relationships.

    Third is a concern that the treatments are confounded. The

    ontologically unclear model uses binary and ternary relation-

    ships while the ontologically clear model uses only binary

    relationships. The semantics of ternary relationships are

    significantly more difficult to understand than are the seman-

    tics of binary relationships, especially for novice users (Topi

    and Ramesh 2002). Inclusion of ternary relationships in the

    ontologically unclear treatment but not in the ontologically

    clear treatment represents a substantial confound, making it

    impossible to determine the source of observed performance

    differences. We examine the strength of this confound in our

    first experiment. Moreover the questions that make up the

    experimental task systematically favor the ontologically cleardiagram in a manner that is independent from the experi-

    mental treatment, further strengthening the confounding

    effects.

    Fourth is a concern that subjects were instructed that the

    diagrams they were asked to interpret were standard UML

    diagrams; however, the instructions they were given about

    interpreting the diagrams were not consistent with the defini-

    tions of the UML constructs used, specifically with respect to

    the definitions of ternary relationship constraints. Although

    this may not be a concern for novices who are unfamiliar with

    UML, many of the subjects had modeling experience and a

    number had explicit experience with UML. Such subjectslikely experienced cognitive dissonance with respect to the

    nonstandard definitions.

    Fifth is a concern with the analysis of participants perfor-

    mance. Correct responses are reported for only two of the

    questions used in the experimental task. In one of these

    questions, subjects answers were incorrectly scored for one

    of the experimental treatments.

    Each of these concerns is assessed and explained in the

    following sections. Our goal in this endeavor is to sharpen

    the understanding of researchers with respect to the difficulty

    and the care required in working in this challenging but

    important area of inquiry. We hope to encourage future

    research in the development and evaluation of appropriate

    theoretical underpinnings for conceptual modeling and to help

    improve the conceptualization and execution of empirical

    research in this area. Moreover, we seek to open a dialog

    among researchers in developing and using appropriate

    theoretical foundations for conceptual modeling and in

    developing and using appropriate experimental protocols in

    conceptual modeling research.

    Foundations in Conceptual ModelingPractice

    Shanks et al. base their study on the contention that in con-

    ceptual models composite things are sometimes represented

    as entities (classes) and sometimes represented via relation-

    ships (associations) between the components of the com-

    posite (p. 554). However, doing so violates the basic defi-

    nitions of both the entityrelationship (ER) grammar (Chen

    1976) and the Unified Modeling Language (UML) grammar

    (Rumbaugh et al. 2005). In both grammars, entities (classes)

    are defined to represent collections of objects (instances) and

    relationships are defined to represent collections of associa-

    tions among objects. An instance of a relationship is the

    association of one instance of each related entity.

    However, the definitions of conceptual modeling grammars

    and the way in which those grammars are appropriated in

    practice may vary widely (Feldman and Pentland 2003). The

    study presented by Shanks et al. would be important if,

    indeed, a significant proportion of analysts argue composites

    should be represented as a relationship or association (p.

    553), independent of the fact that doing so violates the defini-

    tion of the relationship construct. Shanks et al. contend that

    widely used conceptual modeling/database textbooks,

    including Elmasri and Navathe (2004) and Teorey et al.

    (2006), show a composite represented implicitly via a rela-

    tionship or association construct (p. 555). We disagree with

    this interpretation of the conceptual models presented in these

    textbooks, as do the authors of these textbooks.3 The strength

    of a conceptual model is that domain semantics are repre-

    sented explicitly. As discussed above, the entities in a con-

    ceptual model represent collections, or types of objects; the

    3Personal communications with the authors of these textbooks between

    May 12, 2008, and June 30, 2008.

    MIS Quarterly Vol. 36 No. 3/September 2012 947

  • 7/30/2019 78164802

    4/29

    Allen & March/Representing PartWhole Relations in Conceptual Meaning

    relationships represent collections or types of associations

    among them; object types not explicitly represented are not

    part of the conceptual model (Carlis and Maguire 2001).

    Figures 1 and 2 are the examples used by Shanks et al. to

    motivate their study. A detailed description of the syntax

    used in Figure 1 is found in Appendix A, Section 1; a detailed

    description of the syntax used in Figure 2 is found in Appen-

    dix A, Section 5. Figure 1 is a portion of a conceptual model

    taken from Elmasri and Navathe (2004, p. 102). It shows a

    committee relationship between a faculty member and a

    graduate student. Shanks et al. argue that a composite entity

    committee is represented implicitly by the so labeled

    diamond. They argue that Elmasri and Navathe clearly

    intend the committee to be a composite entity that has

    faculty entities and student entities as components (p.

    555). While readers of a conceptual modeling diagram will

    use their existing domain knowledge to make sense of a

    diagram, if the designer of a diagram intends to convey some

    meaning, it should be done explicitly rather than implicitly

    (Carlis and Maguire 2001).

    Our interpretation of the diagram is that the committee rela-

    tionship indicates that there is an association between

    individual graduate students and individual faculty members

    by virtue of their mutual participation in a committee, even

    though the committee itself is notrepresented in the model.

    If Elmasri and Navathe intended to represent the committee

    object they would have done so using an entity. The label

    committee on the relationship simply stands in place of

    another, potentially more descriptive label such as advises/

    is advised by representing the role played by each entity inthe relationship. This interpretation is more consistent with

    the way in which graduate committees are typically con-

    ceptualized. That is, an instance of the committee relationship

    indicates that a faculty member participates in a committee

    that advises a student; only faculty members can be members

    of a committee; the graduate student is not a member of (or

    any other kind of component of) a committee. Furthermore,

    a relationship instance is defined and identified by the com-

    bination of the related entity instances (Elmasri and Navathe

    2004). Hence, a committee relationship instance, as repre-

    sented in Figure1, has exactly one faculty member and exactly

    one graduate student. The cardinality constraints, M and N in

    the diagram, indicate that one faculty member may be asso-ciated with many graduate students and one graduate student

    may be associated with many faculty members.

    Using the name committee for the relationship may evoke

    reasoning processes related to academic committee objects

    with which the reader is familiar; however, all that can be

    known from the diagram is that faculty members and graduate

    students are associated with each other and that committee

    is deemed an appropriate, although potentially misleading,

    name for that relationship. If the common meaning of aca-

    demic committees is assumed (i.e., each student has one

    committee and each faculty member can serve on many

    committees), then Figure 1 does allow for reasoningabout

    committees. The members of a students committee are those

    faculty members associated with that student through the

    committee relationship; however, the committee itself is not

    represented in the diagram.

    Shanks et al. use a second motivating example (Figure 2),

    reproduced from Teorey et al. (2006, p. 91), and describe it as

    follows: Engineers are divided into groups for certain

    projects. Each group has a leader. By this, Shanks et al.

    contend that the association with the aggregation symbol in

    Figure 2 implicitly represents a group composite that has

    engineers as components (p. 555). Here the use of the

    language divided into groups may suggest that groups are

    composite entities made up of engineers; however, thatcomposite is not represented in the diagram. The relationship

    shows that some engineers are related to others by virtue of

    either being a group leader of other engineers or being led by

    another engineer who is a group leader.

    Moreover, the purpose of that figure is not to explicate the

    semantics of composition but rather to illustrate rules for

    transforming recursive one-to-many relationships into SQL

    for implementation (Teorey et al. 2006). The use of the

    diamond was not intended to represent a composition (aggre-

    gation) relationship, but rather an association relationship as

    in the corresponding ER diagram from the prior page in the

    textbook. If the intention were to implicitly represent an

    aggregation relationship as Shanks et al. contend, then the

    interpretation would be as follows: The UML notation for

    aggregation (the open diamond at one end of an association

    arc) indicates that members of the class touching the diamond

    are composed of members of the class at the other end of the

    association arc. Thus the semantics of the diagram do not

    indicate thatgroups are composed of engineers; rather they

    indicate that engineers are composed of engineers. This is

    neither congruent with the English language understanding of

    what an engineer is nor is it valid UML syntax. The UML

    does not allow aggregation hierarchies to be circular as in the

    recursive relationship of Figure 2; an object may not directlyor indirectly be a part of itself (Rumbaugh et al. 2005, p.

    164). Hence, Figure 2 cannot be interpreted to implicitly

    represent any composite.4

    4In a personal communication on June 10, 2008, Tom Nadeau (Teorey et al.)

    confirmed that the intended meaning of the diagram was simple association,

    not aggregation. He also confirmed that if the group composite object were

    to be modeled, an entity would have been used to explicitly represent it.

    948 MIS Quarterly Vol. 36 No. 3/September 2012

  • 7/30/2019 78164802

    5/29

    Allen & March/Representing PartWhole Relations in Conceptual Meaning

    Figure 1. Portion of an ER Diagram that Shankset al. Argue Uses a Relationship to ImplicitlyRepresent a Committee Object (Source: Elmasri/

    Navathe FUNDAMENTALS OF DATABASE SYSTEMS, p. 102, Figure

    4.9, An EER Schema for a University Database, 2007, 2004

    Shamkant B. Navanthe & Ramez A. Elmasri. Reproduced by

    permission Pearson Education, Inc.)

    Figure 2. Portion of a UML Diagram that Shankset al. Argue Uses an Association to Implicitly

    Represent a Composite Group Object (Source:Database Modeling and Design: Logical Design (4th ed.), T. Teory, S.

    Lightstone, and T. Nadeau, p. 91. Copyright 2006, Elsevier)

    Therefore, neither of these textbook examples uses the rela-

    tionship construct to represent composites; in both examples

    the relationship construct represents association. Thus neither

    example serves as support for the claim that some textbook

    authors use relationships to represent composition. As dis-

    cussed above, such a claim is inconsistent with the definitions

    of both the ER and the UML grammars.

    Two related concepts have been discussed in the conceptual

    modeling literature, reification and association classes (El-

    masri and Navathe 2004; Halpin and Morgan 2008; Hansen

    and Hansen 1996; Rumbaugh et al. 2005). Reification, also

    termed objectification, refers to a process of transforming

    phenomena initially represented as a relationship into an

    entity or class. Association classes combine the properties of

    entities and relationships into a single construct. Neither,

    however, infers that an ordinary relationship can or should be

    used to represent a composite entity. In fact, these constructs

    were introduced because the relationship construct alone is

    insufficient to express phenomena that have characteristics of

    both entities and relationships, specifically the ability to

    participate in additional relationships.

    Finally, Shanks et al. use ternary relationships in the ontologi-

    cally unclear treatment. A ternary relationship associates

    three entities. As with any relationship, an instance of a ter-

    nary relationship associates exactly one instance of each of

    the related entities. However, the semantics of ternary rela-

    tionship constraints are complex and frequently misunder-

    stood (Topi and Ramesh 2002). Furthermore, there are

    significant differences in the definitions of ternary relation-

    ship cardinality constraints between UML and a number of

    extended ER model proposals. Rather than using UML,

    Shanks et al. use one of these extended ER proposals in the

    ontologically unclear treatment. Details of the relevant

    definitions are presented in Appendix A.

    Theoretical Foundations

    Shanks et al. indicate that they are testing the theory of onto-

    logical clarity posed by Wand and Weber (1993). This theory

    asserts that when the constructs of a conceptual modeling

    grammar exist in a bijective correspondence with the con-

    structs of an ontology, scripts expressed in that grammar will

    better communicate meaning to users than will scripts formed

    using grammars with ontological mappings that are either

    surjective orinjective in either direction. In a bijective corre-

    spondence, each construct in the conceptual modeling gram-

    mar maps to exactly one construct in the ontology and each

    construct in the ontology maps to exactly one construct in the

    grammar. In a surjective correspondence, multiple constructs

    in the conceptual modeling grammar map to the same con-

    struct in the ontology or multiple constructs in the ontology

    map to the same construct in the conceptual modeling gram-

    FACULTY

    COMMITTEE

    M

    N

    GRAD-STUDENT

    Engineer 0..*

    is-led-by

    is-group-leader-of

    1

    MIS Quarterly Vol. 36 No. 3/September 2012 949

  • 7/30/2019 78164802

    6/29

    Allen & March/Representing PartWhole Relations in Conceptual Meaning

    mar. In an injective correspondence, there may be constructs

    in the conceptual modeling grammar that have no corre-

    sponding construct in the ontology or constructs in the on-

    tology that have no corresponding construct in the ontology.

    The specific aspect of ontological clarity tested by Shanks et

    al. is the existence of a surjective correspondence where one

    construct in a conceptual modeling grammar is used to

    represent multiple constructs in an ontology. This type of

    surjective relationship is commonly termed construct over-

    load (Burton-Jones et al. 2009).

    The authors select the ontology of Mario Bunge (1977) as the

    referent ontology for their study. This presents problems for

    conceptual modeling because Bunges ontology is intended to

    represent concrete objects (things) that possess substantial

    properties. Concrete objects exist objectively in space and

    time, independent of human interpretation and ascription of

    meaning. Conceptual models, however, must frequentlyrepresent conceptualobjects and attributes that exist subjec-

    tively, are based exclusively on human interpretation, beliefs

    and collective agreement, and convey ascribed meaning (Kent

    1978; Searle 1995, 2006). These are explicitly excluded from

    Bunges ontology (Allen and March 2006a; Wyssusek 2006).

    In this regard Wand et al. (1999) deviate from Bunges

    ontology by asserting that the notion of a concrete thing

    applies to anythingperceivedas aspecific objectby someone,

    whether it exists in physical reality or only in someones

    mind (p. 497). In our opinion, this deviation severely

    reduces any guidance a modeler would obtain from Bunges

    ontology in the identification and classification of phenomena

    to be represented in a conceptual model. In this deviation,phenomena are represented as the modeler perceives to be

    appropriate, not as they exist in what Bunge (p. 157) calls

    real existence. A complete assessment of Bunges ontology

    as a foundation for conceptual modeling is beyond the scope

    of this paper and is the subject of ongoing research activities.

    Hence, we restrict our discussion to the ontological issues

    surrounding the diagrams used in the experimental treatments.

    Problems with the Experimental

    Treatments

    Bunges definition of a composite is quite different from the

    notion of composition in the UML. None of the composites

    in the ontologically clear UML diagram (Figure 3) conforms

    to the definition of composition in Bunges ontology.

    Appendix A contains a detailed review of the syntax used in

    Figure 3. Bunge (p. 29) states that an individual is com-

    posite [if and only if] it is composed of individuals other than

    itself and the null individual. The composite Team vio-

    lates this definition because it permits a Team to be composed

    of one Team Leader and a minimum of zero Team Members.

    That is, a Team Leader composed with no individuals other

    than itself (i.e., having no team members) is represented to be

    a Team. In Bunges ontology, a Team Leader composed with

    no other individual is the Team Leader, not a Team.

    Thus defining a team as the composition of a leader and some

    number of members but specifying that a team can exist with-

    out any members violates Bunges definition of composition.

    It is ontologically equivalent to defining a water molecule as

    being composed of oxygen and hydrogen but specifying that

    a water molecule can exist without any hydrogen. This

    violation could have been eliminated by specifying that a

    team requires a team leader and at least one member. Similar

    violations exist in the definitions of Project and Project

    Plana Project can be composed of zero Phases (i.e.,

    composed of nothing); a Project Plan can be composed ofzero Budgets and zero Scopes (i.e., composed of nothing).

    Although Purchase Requisition requires its component parts,

    the parts cannot exist without the whole, as indicated by the

    use of the solid diamond. In Bunges ontology, nothing is

    created or destroyed; composite concrete objects are com-

    posed from existingconcrete objects (p. 35). The components

    of a composite must have independent existence from the

    composite itself. Thus the ontologically clear conceptual

    model is not consistent with Bunges ontological definition of

    reality. To be consistent with Bunges ontology, each part

    must have existence prior to being composed. Hence, we

    conclude that Team, Project, Project Plan, and PurchaseRequisition as represented in the ontologically clear diagram

    are not composites as defined by Bunges ontology.

    Finally, although the ontologically unclear diagram (Figure 4)

    is purported to violate Bunges ontology because it uses ter-

    nary relationships to implicitly represent composites, it need

    not be interpreted as representing composites at all (see Ap-

    pendix A for a detailed review of the syntax used in Figure 4).

    It is wholly consistent with the predominant application of

    Bunges ontology to conceptual modeling to view these ter-

    nary relationships as representing mutual properties of distinct

    things rather than as representing composites (Wand et al.

    1999). Thus the ternary relationships in the ontologicallyunclear diagram need not be interpreted as a departure from

    ontological clarity. Given these problems with the instan-

    tiation of ontological concepts in the ontologically clear and

    ontologically unclear treatments (Figures 3 and 4, respec-

    tively), we argue that no conclusions should be drawn from

    this study regarding the suitability of Bunges ontology as a

    foundation for conceptual modeling.

    950 MIS Quarterly Vol. 36 No. 3/September 2012

  • 7/30/2019 78164802

    7/29

    Allen & March/Representing PartWhole Relations in Conceptual Meaning

    Department

    Employee

    Team MemberTeam Leader

    Team

    Client

    Project

    Phase Consumable

    Project Plan

    Budget

    Scope

    Requisition Line Requisition Header

    Purchase

    Requisition

    Supplier

    Require

    Has

    Requests

    Ordered Via

    Sent to

    Responsible For0..* 0..*

    0..*

    0..*

    0..*

    0..*

    1

    0..1

    0..1

    0..1

    0..*

    1

    1

    1

    1 0..*

    1..* 1

    1..*

    1..*1..*

    0..*

    1

    1

    1

    UML Class Diagram 1

    "ET Technologies"

    - Client #- Address

    - Contact Phone #

    - Project Plan #

    - Project Plan Version #

    - Consumable #

    - Consumable Description

    - Department #- Department Name

    - Employee #- Employee Name- DOB- Internal Phone #- Skill Level

    - Qualification Level

    - Phase #

    - Phase Description- Phase Goal- Work Hours Required- Priority Level

    - Project #

    - Project Description- Required Completion Date

    - Actual Completion Date

    - Purchase Requisition #

    - Quantity

    - Purchase Header #- Date of Requisition- Total Value of Requisition

    - Supplier #- Supplier Name- ABN- ACN- Address- Contact Phone #

    - Budget #

    - Budget Description- Budget Version #

    - Team #- Team Name- Team Size- Date Formed

    Key Deliverable

    Belong To

    Assigned To

    0..*

    1

    0..*

    - Deliverable #- Deliverable Name- Date Due

    1 1

    - Scope #- Scope Description- Scope Version #

    Department

    Employee

    Team MemberTeam Leader

    Team

    Client

    Project

    Phase Consumable

    Project Plan

    Budget

    Scope

    Requisition Line Requisition Header

    Purchase

    Requisition

    Supplier

    Require

    Has

    Requests

    Ordered Via

    Sent to

    Responsible For0..* 0..*

    0..*

    0..*

    0..*

    0..*

    1

    0..1

    0..1

    0..1

    0..*

    1

    1

    1

    1 0..*

    1..* 1

    1..*

    1..*1..*

    0..*

    1

    1

    1

    UML Class Diagram 1

    "ET Technologies"

    - Client #- Address

    - Contact Phone #

    - Project Plan #

    - Project Plan Version #

    - Consumable #

    - Consumable Description

    - Department #- Department Name

    - Employee #- Employee Name- DOB- Internal Phone #- Skill Level

    - Qualification Level

    - Phase #

    - Phase Description- Phase Goal- Work Hours Required- Priority Level

    - Project #

    - Project Description- Required Completion Date

    - Actual Completion Date

    - Purchase Requisition #

    - Quantity

    - Purchase Header #- Date of Requisition- Total Value of Requisition

    - Supplier #- Supplier Name- ABN- ACN- Address- Contact Phone #

    - Budget #

    - Budget Description- Budget Version #

    - Team #- Team Name- Team Size- Date Formed

    Key Deliverable

    Belong To

    Assigned To

    0..*

    1

    0..*

    - Deliverable #- Deliverable Name- Date Due

    1 1

    - Scope #- Scope Description- Scope Version #

    Figure 3. Experimental Materials, Ontologically Clear Diagram (Source: Figure 5 from Shanks et al.

    2008, p. 561)

    Construct Operationalization

    As discussed above, the theory of ontological clarity predicts

    that users of conceptual models will better understand the

    domain semantics represented by the diagrams when there is

    a bijective mapping between constructs in the modeling

    grammar and constructs in the ontology than when there is a

    surjective or injective mapping between them. When a bijec-

    tive mapping exists in a given diagram, the diagram is said to

    be ontologically clear.

    Shanks et al. argue that using the relationship construct to

    represent both associations among objects and to implicitly

    represent composite objects results in construct overload, a

    surjective mapping from the ontology to the conceptual

    model. While we agree that using the same construct to

    represent both associations and composite objects would

    result in construct overload, the study does not permit such an

    assessment. In each experimental treatment, associations are

    represented with one graphical symbol while objects are

    represented with another. That is, with respect to the repre-

    sentation of association and composite objects there is no

    construct overload in either treatment. In the ontologically

    clear diagram (Figure 3), objects are represented using the

    rectangular class symbol, association relationships are

    represented using labeled arcs, and aggregation/composition

    relationships are represented using small diamonds on the

    composite object side of the relationship. Shanks et al. state

    that the ontologically unclear diagram (Figure 4) was pro-

    duced from the ontologically clear diagram by converting

    each aggregate/composite class to a diamond, that is, a

    UML association (p. 561).

    On the surface, this appears to operationalize different levels

    of ontological clarity by creating construct overload; the rela-

    tionship construct is intended to convey two different meanings.

    MIS Quarterly Vol. 36 No. 3/September 2012 951

  • 7/30/2019 78164802

    8/29

    Allen & March/Representing PartWhole Relations in Conceptual Meaning

    Department

    Employee

    Team MemberTeam Leader

    Client

    Project

    Consumable

    Budget

    Scope

    Requisition Line Requisition Header

    Supplier

    Requests

    Ordered Via

    1..*

    0..*

    1..*

    0..1

    0..*

    1

    0..*

    1..*

    1 0..*

    11..*

    0..*

    1

    1

    Project Plan

    Phase

    PurchaseRequisition

    Team

    UML Class Diagram 2"ET Technologies"

    - Client #

    - Address- Contact Phone #

    - Consumable #

    - Consumable Description

    - Department #- Department Name

    - Employee #

    - Employee Name

    - DOB- Internal Phone #- Skill Level

    - Qualification Level

    - Project #

    - Project Description- Required Completion Date

    - Actual Completion Date

    - Consumable Type #- Quantity

    - Purchase Header #

    - Date of Requisition- Total Value of Requisition

    - Supplier #

    - Supplier Name- ABN- ACN

    - Address- Contact Phone #

    Key Deliverable

    0..*

    0..*

    - Deliverable #

    - Deliverable Name- Date Due

    0..*

    Belong To

    - Budget #- Budget Description- Budget Version #

    - Scope #- Scope Description- Scope Version #

    Figure 4. Experimental Materials, Ontologically Unclear Diagram (Source: Figure 6 from Shanks et al.

    p. 562)

    However, in UML a diamond is defined to represent an

    association, not a composite object. Hence, the diamonds in

    Figure 4 should be interpreted as representing ternary rela-

    tionships (associations), even though the authors intend them

    to communicate the existence of composite objects. More

    importantly, all and only ternary relationships intended to

    communicate the existence of composites are represented

    using the diamond symbol. All binary relationships (associa-

    tions) are represented using labeled arcs. Consequently we

    see no construct overload in Figure 4; rectangles are used to

    represent objects; labeled arcs are used to represent binary

    relationships; diamonds are used to represent ternary rela-

    tionships intended to communicate the existence of composite

    objects.

    We conclude that there are significant problems with the

    operationalization of the independent variable, construct

    overload.

    Experimental Implementation

    Shanks et al. indicate that they produced the ontologically

    unclear diagram (Figure 4) from the ontologically clear

    diagram by converting each aggregate/composite class to a

    diamond (p. 561). Figure 3 contains four aggregate/com-

    posite classes: Team, Project, Project Plan, and Purchase

    Requisition. If the stated procedure had been followed, then

    each of these would appear as a diamond in Figure 4. How-

    ever, only Team, Project Plan, and Purchase Requisition ap-

    pear as diamonds. Project has inexplicably been represented

    as a UML class (a rectangle). Conversely, each diamond in

    Figure 4 should correspond to an aggregate/ composite class

    in Figure 3. Again, one of the four is not mapped according

    to the specified procedure. The Phase relationship in Figure 4

    corresponds to a class in Figure 3 that is neither aggregate nor

    composite. However, if ternary relationships are intended to

    implicitly represent composites then, according to the rea-

    952 MIS Quarterly Vol. 36 No. 3/September 2012

  • 7/30/2019 78164802

    9/29

    Allen & March/Representing PartWhole Relations in Conceptual Meaning

    soning of Shanks et al., this change has the effect of asserting

    that projects are composed of phases in the ontologically clear

    diagram, but that phases are composed of projects (and con-

    sumables and key deliverables) in the ontologically unclear

    diagram. This is problematic itself; however, the implications

    for the studys validity are deeper than the introduction of

    counterintuitive domain semantics in one treatment.

    Recall that Shanks et al. set out to study differences in user

    performance when composites are represented explicitly as

    classes as compared to implicitly as relationships. To effect

    this examination, they implement the ontologically clear

    treatment where the four composite entities are represented as

    classes (Figure 3). Their experimental protocol calls for a

    second, ontologically unclear treatment, analogous to the first,

    where the four composites are represented implicitly by using

    relationships (Figure 4). However, in Figure 4, Project and

    Phase are represented in a way that is inconsistent with the

    experimental protocol. As described by Shanks et al., sub-

    jects answering questions involving projects should be

    examining a class in one treatment and a relationship in the

    other. This difference is the basis for drawing inferences

    about differences in subject performance. However, because

    Project is presented as a UML class in both diagrams, no such

    inferences can be made. Moreover, because Phase is repre-

    sented as a relationship in Figure 4 even though it is not a

    composite in Figure 3, differences in subject performance will

    undoubtedly be introduced that cannot be attributed to repre-

    senting a composite explicitly as a class versus implicitly as

    a relationship.

    These errors in the experimental materials might not be

    critical if Project and Phase were of only peripheral impor-

    tance to the experimental task. However, each of the 11

    questions refers directly to either Project or to Phase and most

    (6 of 11) refer to both (see Appendix B for the questions used

    in the Shanks et al. study). Accordingly, performance on

    every part of the experimental task is influenced by this error

    in the experimental materials.

    Confounds in the ExperimentalTreatment

    Two factors confound the experimental treatments: the

    imbalanced use of ternary relationships and the experimental

    task favoring one diagram in a manner distinct from the

    experimental treatment. The ontologically unclear treatment

    (Figure 4), with which subjects had inferior performance, uses

    ternary relationships while the ontologically clear treatment

    (Figure 3) does not. That is, even if the ontologically unclear

    treatment represented construct overload, the effects of this

    factor would be confounded by the use of ternary relation-

    ships. The ontologically unclear treatment would differ from

    the ontologically clear treatment in that it would include both

    construct overload and ternary relationships while the onto-

    logically clear treatment contained neither. The semantics of

    ternary relationships are more difficult to understand than the

    semantics of binary relationships, especially for novices

    (Batra et al. 1990; Rumbaugh et al. 2005; Topi and Ramesh

    2002). Thus, even without any ontology-based differences in

    the diagrams, one would expect Figure 4 to result in lower

    participant performance. The qualitative analysis presented

    by Shanks et al. suggests that confusion around the ternary

    relationship construct was fundamental to differences in

    observed performance. Reported subject statements (p. 568)

    included the ternary relationship is what confuses me and

    I cant understand why consumables and deliverables are

    related (via a ternary relationship). This conjecture is

    supported by our first experimental study presented below.

    Moreover the questions that make up the experimental task

    systematically favor the ontologically clear diagram in a

    manner that is independent from the intended experimental

    treatment (ontological clarity), further strengthening the

    confounding effects. Several of the questions in the experi-

    mental task are more difficult to answer in the ontologically

    unclear diagram than in the ontologically clear diagram.

    Other questions make statements that are incompatible with

    the semantics expressed in ontologically unclear diagram,

    likely causing confusion among subjects in this treatment.

    Details are discussed in Appendix B.

    Deviations from UML Definitions

    Shanks et al. indicate that they made a single deviation from

    standard UML class diagram notation: To avoid having

    cluttered diagrams, we placed attributes beside class boxes

    rather than in them. Otherwise we have used standard UML

    notation (p. 560). However, the experimental materials

    exhibit two additional notational deviations that may have had

    significant, unintended influence on the studys results.

    First is the positioning of the names of the ternary rela-

    tionships in Figure 4. In ER notation, the names of relation-

    ships appear inside the diamond; however, according to therules of UML the name of the association (if any) is shown

    near the diamond not in the diamond (Rumbaugh et al. 2005,

    p. 470).5 Of the symbols used in Figure 4, the only construct

    5We use Rumbaugh et al. (2005) as the authoritative reference for the UML

    because its authors are the creators of the UML and because it is the source

    cited by Shanks et al.

    MIS Quarterly Vol. 36 No. 3/September 2012 953

  • 7/30/2019 78164802

    10/29

    Allen & March/Representing PartWhole Relations in Conceptual Meaning

    allowed to have writing inside its borders is the rectangular

    class symbol. Placing the names of associations inside the

    symbol borders could have the effect of making them seem

    more like classes. This, in conjunction with using nouns to

    name the ternary relationships, encourages the misreading of

    the relationships as if they were classes. Such a misreading

    is likely to lead to the erroneous interpretation of the car-

    dinality constraints, which could cause subjects to conclude

    that an instance of a ternary relationship can exist without an

    instance ofeach of the related classes (e.g., a phase rela-

    tionship instance without a related key deliverable). This is

    clearly prohibited by the definition of a relationship and doing

    so would result in incorrect answers for several questions.

    The second problematic departure from standard UML

    notation pertains to multiplicity (cardinality constraints). For

    ternary relationships (associations), Rumbaugh et al. state,

    the multiplicity on an association end represents the potential

    number of values at the end when the values at the other [two]

    ends are fixed (p. 470). Clarifying this definition they con-

    tinue, for example, given a ternary association among classes

    (A, B, C), the multiplicity of the C end states how many C

    objects may appear in association with a particular pair of A

    and B objects. If the multiplicity of this association is (many,

    many, one),6 then for each possible (A, B) pair there is a

    unique C value (p. 472).

    Consider the Team ternary association from Figure 4. If the

    multiplicities are read according to the UML definition, the

    0..* near the Project class indicates that each possible

    combination of a team member and a team leader need not be

    associated with any project, but could be associated with

    several, which seems a reasonable constraint. However,

    consider the 1..* near the Team Leader class. This notation

    signifies that each possible combination of team member and

    project must be associated with at least one team leader.

    Accordingly, every team member must be associated with

    every project according to the UML definition.

    However, Shanks et al. describe in footnote 7 (p. 560) that

    subjects were taught to interpret the multiplicity symbols of

    ternary relationships differently. According to the instruction

    that subjects received, they should have read the 0..* near

    Project to indicate that any given project might not participate

    in the Team association or might participate in it many times.

    Likewise, the 1..* near Team Leader should be interpreted

    to mean that each Team Leader must participate in the Team

    association at least once but can participate in it multiple

    times. This definition conforms to the notion of participation

    constraints, as specified in some variants of the ER model

    (Liddle et al. 1993); however, it differs significantly from the

    definition of cardinality constraints in UML (see Ap-

    pendix A).

    The effects of using nonstandard definition of cardinality

    constraints depends on subjects experience with UML.

    Subjects with extensive UML experience were likely to be

    confused by the nonstandard notation and perhaps apply the

    standard definitions to the model, resulting in incorrect

    answers for some of the questions. While Shanks et al.

    disclose that most of their subjects had prior modeling

    experience, including experience with UML, neither the

    number of subjects having UML experience nor the level of

    that experience is reported. Accordingly, the severity of any

    concern resulting from this second departure from the UML

    standard is uncertain.

    Analysis of Performance

    Ascribing a numerical representation to subjects performance

    is fundamental to analyzing the results of any experiment.

    The validity of the statistical analysis is only as strong as the

    scoring of participant performance on the experimental task.

    In the study conducted by Shanks et al., the experimental task

    required participants to answer 11 questions by referencing

    the UML class diagram they received (each participant

    received either Figure 3 or Figure 4). Although all 11 ques-

    tions are presented, answers were provided for only questions

    2 and 10. As discussed below, question 2 was scored incor-rectly for the ontologically clear treatment. Question 2 reads

    as follows:

    A team leader has resigned. Does the model allow

    the team to continue to work on the project without

    him?

    Shanks et al. indicate that the correct answer for the onto-

    logically clear treatment (Figure 3) is possible and that the

    correct answer for the ontologically unclear treatment

    (Figure 4) is not possible. We agree that the correct answer

    for the ontologically unclear treatment is not possible. As

    discussed above, the existence of an instance of a relationship

    requires exactly one instance of each related entity; hence, a

    team cannot exist without its leader. However, that is also the

    correct answer for the ontologically clear treatment.

    Explaining their rationale for claiming that the correct answer

    for the ontologically clear model is possible, Shanks et al.

    state that because team leader is weakly aggregated in the

    6The designation (many, many, one) for the relationship (A, B, C) indicates

    that the cardinality constraint is 0..* on the A end, 0..* the B end, and 1..1 on

    the C end.

    954 MIS Quarterly Vol. 36 No. 3/September 2012

  • 7/30/2019 78164802

    11/29

    Allen & March/Representing PartWhole Relations in Conceptual Meaning

    ontologically clear diagram, the team can still exist if the

    leader resigns (p. 565). This statement is inconsistent with

    the definition of aggregation in the UML grammar. Shanks

    et al. assert that weak aggregation allows a team to exist

    without all of its constituents.7 However, aggregation does

    not indicate that the aggregate can exist without its parts; but

    rather, that a partmay exist independently from the

    aggregate (Rumbaugh et al. 2005, p. 164). Rumbaugh et al.

    further explain that constraints such as existence dependency

    are specified by the multiplicity, not the aggregation marker

    (p. 166). Thus, to answer the question of whether the team

    can exist without its leader, the multiplicity of the aggregate

    association must be examined. On the Team side of the

    relationship the multiplicity (1..*) indicates that a team leader

    may be associated with one or more teams; the multiplicity (1)

    on the Team Leader side indicates that a team must be

    associated with exactly one team leader.8

    Accordingly, a team must have a leader to exist and thecorrect answer for question 2 in the ontologically clear treat-

    ment (Figure 3) is not possible. Because of this misunder-

    standing of the UML notation for aggregation, subjects

    responses for question 2 in the ontologically clear treatment

    were scored incorrectly. According to Figure 3, the team

    cannot exist without its leader.

    Without more information about how the other nine questions

    were scored, it is not possible to determine if this is an

    isolated or a pervasive problem. However, with such prob-

    lems in the scoring of participant answers and the afore-

    mentioned confounds, the statistical analysis of the experi-

    mental results involving subjects performance cannot be

    interpreted as supporting the hypotheses.

    New Experimental Evidence

    We argue that the significant statistical results reported by

    Shanks et al. could be explained equally well by the asym-

    metric use of ternary relationships in their experimental

    materials, as opposed to Shanks et al.s contention they were

    caused by construct overload. To investigate this, we con-

    ducted two experiments. The first experiment tested the

    conjecture that users of conceptual modeling diagrams better

    understand domain semantics when expressed using binary

    relationships than when expressed using ternary relationships.

    Our results, described in more detail below, clearly support

    this conjecture.

    The second experiment further explicates the effects of con-

    founding construct overload and conceptual modeling con-

    structs and demonstrates requisite methodological procedures

    to test the effects of construct overload while considering the

    effects of differences in conceptual modeling constructs. As

    described below, the results of this study show that construct

    overload does not have a significant effect on subjects

    performance, at least in the context studied.

    Subjects in both experiments were university students

    majoring in Information Systems at a large private university.

    Each had received several months training on UML including

    binary and ternary associations, association classes, andaggregationthe main concepts used in the experimental

    treatments. Neither of the authors was involved in the

    training. In all, 33 subjects participated in the first experiment

    and 82 participated in the second experiment. There was no

    overlap of subjects in the two experiments. In each experi-

    ment, subjects were randomly assigned to treatments.

    The two treatments used for experiment 1 are illustrated in

    Figure 5. Subjects were told that, due to public health con-

    cerns, Wasatch Pork Distributors (WPD) had been required to

    track which pigs were involved in purchases made by each of

    its customers and that two different diagrams were proposed

    by competing consultants. Each subject was asked to answer

    nine questions about the semantics conveyed by each diagram

    (order of presentation was randomized). The decision of

    casting subjects in the role of evaluating proposed diagrams

    was made to allow subjects to identify what they perceived to

    be errors in the diagrams without feeling the need to reconcile

    the diagram to the given description of the business domain.

    The diagram proposed by Consultant 1 contained a ternary

    relationship while the diagram proposed by Consultant 2 con-

    tained three binary relationships. Each diagram expresses

    similar, though not identical, domain semantics. Both al-

    lowed WPD to track which customers purchased portions ofwhich pigs.

    Subjects overall performance for the diagram containing the

    ternary relationship was significantly worse than subjects

    performance for the binary-only diagram (n = 33; p = 0.0002;

    within subjects analysis). Subjects performed almost 50 per-

    cent better with the binary treatment than they did with the

    ternary treatment, correctly answering, on average, 6.6 as com-

    7The term weak aggregation is not expressly defined in the UML, although

    it may reasonably be used to distinguish aggregation from a stronger from

    of aggregation, called composition (Rumbaugh et al. 2005, p. 164). In

    general, the term used to distinguish aggregation from its stronger form

    (composition) is shared aggregation (Rumbaugh et al. 2005, p. 164).

    8When a single integer is used to express multiplicity, it stands for both the

    minimum and the maximum value (Rumbaugh et al. 2005, p. 466).

    MIS Quarterly Vol. 36 No. 3/September 2012 955

  • 7/30/2019 78164802

    12/29

  • 7/30/2019 78164802

    13/29

    Allen & March/Representing PartWhole Relations in Conceptual Meaning

    Ontological Clarity

    UML Syntactic Construct Studied Overloaded Not Overloaded

    Association Class Figure 7a Figure 9a

    Relationship Figure 9b Figure 7b

    Figure 6. Experimental Treatment Summary

    a. Association Class Overloaded

    Association class notation is used to represent a simple association

    (i.e., Works For and Contracts With) as well as a composite (i.e.,

    Team and Assignment)

    b. Association Class Not Overloaded

    Labeled arc notation is used to represent a simple association (i.e.,

    Works For and Contract With) while aggregation is used to

    represent a composite (i.e., Team and Assignment).

    Figure 7. Experiment 2 (Form 1): Collaborative Auditing Incorporated

    Each experimental form individually replicates the Shanks etal. study illustrating the methodological difficulties they

    encountered. These can be overcome simply by a combined

    analysis of the two forms. In the first experimental form

    (Figure 7), two UML 2.0 class diagrams were presented that

    represent a common underlying business domain, each pur-

    ported to have been developed by a different consultant.

    Figure 7a exhibits construct overload by using association

    classes to represent both composition and regular association.

    Figure 7b avoids construct overload by using labeled arcs torepresent regular association and the aggregation notation (the

    open diamond at one end of a labeled arc) to represent com-

    position. Each subject answered the same set of 11 questions

    (shown with answers in Figure 8) for each representation. In

    all cases the answers to the questions were the same for both

    representations, thereby eliminating the possibility of perfor-

    mance differences arising as a result of unbalanced counter-

    intuitive domain semantics. Because each subject answered

    MIS Quarterly Vol. 36 No. 3/September 2012 957

  • 7/30/2019 78164802

    14/29

    Allen & March/Representing PartWhole Relations in Conceptual Meaning

    Questions Answers

    1. An account rep is concerned that client employees who have more experience than auditors might unduly influence

    assignment findings. Can she list all teams for a given client where the client employee has a hire date that is earlier than

    the auditors?

    Yes

    2. A team is assigned to perform five tasks. Does the model make provisions for more than one auditor to work on some of

    those assignments?

    No

    3. A client entered a contract which will begin in five months time. CAI wants to create a team with an auditor now, and give

    the client the next five months to select an appropriate client employee. Is this possible?

    No

    4. A contract as a number of assignments that can be performed simultaneously. Is it possible for a single client employee to

    work on two concurrent assignments?

    Yes

    5. A client has supplied CAI with the list of employees that will be used for an audit; however, no decision has been made

    about which auditor will be assigned. Does the model allow CAI to record the names of the client employees?

    No

    6. An auditor wishes to check her skill appropriateness for an upcoming contract by checking the tasks to which she will be

    assigned. At this point, the client employees have been selected but they have not been paired with auditors to form

    teams. Does the model allow assignments to be made before auditors are paired with client employees?

    No

    7. An auditor and a client employee have been formed into a team on a contract that has been running for 3 months. Upon

    reviewing the contract details, they feel that the current set of assignments is inadequate to fulfill the contract. Can they

    produce a list of tasks which are planned for the current contract but which have not yet been assigned?

    No

    8. Because of financial difficulties, a client has requested a reduction in the scope of a contract. CAI is willing to renegotiate

    the contract scope. Can each team list the risks of their remaining assignments?

    Yes

    9. The Account Rep and client on a given contract are working to determine the set of tasks that will be completed to meet

    the contract terms. Does the model allow CAI to establish teams before any assignments are made?

    No

    10. CAI has hired a number of new auditors. Does the model allow CAI to track auditors who are not currently a part of any

    team?

    Yes

    11. A client has requested CAI to perform work that is not included in the task list. Does the model allow CAI to assign a team

    to perform that work without creating any new tasks?

    No

    Figure 8. Collaborative Auditing Incorporated Experimental Questions

    the questions for a diagram with construct overload and for adiagram without construct overload, we balance subjects

    ability across treatments, reducing exogenous variability in

    subject responses. We randomized the order in which sub-

    jects received the two treatments to eliminate any systematic

    ordering bias.

    However, this experimental form suffers some of the same

    methodological difficulties as the Shanks et al. study. As

    illustrated in Figure 6, it varies two factors, the UML con-

    struct studied and construct overload. That is, although we do

    not use ternary relationships, our study uses association

    classes in only one treatmentthe one with construct over-

    load. If subjects perform worse on that treatment, we cannottell if the performance difference is due to construct overload

    or to the complexities of the association class notation. Our

    experience in teaching UML suggests that association classes,

    like ternary relationships, are difficult to understand. To

    correct for this methodological concern, we concurrently

    administered a second form of the experiment, randomly

    assigning subjects to one form or to the other.

    In this form (Figure 9), again two UML 2.0 Class Diagramswere presented that represent the same underlying business

    domain, each purported to have been developed by a different

    consultant. Figure 9a is similar to Figure 7a in that both make

    use of association classes; however, Figure 9a avoids con-

    struct overload by using association classes to represent

    composition and labeled arcs to represent simple associations.

    Figure 9b exhibits construct overload by using labeled arcs to

    represent both simple association and to represent composi-

    tion. Again, as illustrated in Figure 6, this form varies two

    factors, the UML construct studied and construct overload.

    However, in this form, the more complex construct, associa-

    tion class, is not overloaded while the simpler construct,

    association, is overloaded. Combining the results from bothforms enables us to differentiate the effects of construct over-

    load from the effects of association classes. The complete

    experimental design is summarized in Figure 10.

    A total of 82 subjects participated in this experiment, each

    randomly assigned to one of the two experimental forms.

    Each subject answered each of the 11 questions for each of

    958 MIS Quarterly Vol. 36 No. 3/September 2012

  • 7/30/2019 78164802

    15/29

    Allen & March/Representing PartWhole Relations in Conceptual Meaning

    a. Labeled Arc Not Overloaded

    Labeled arc notation is used to represent a simple association (i.e.,

    Works For and Contracts With) while association class notation is

    used to represent composites (i.e., Team and Assignment)

    b. Labeled Arc Overloaded

    Labeled arc notation is used to represent simple association (i.e.,

    Works For and Contracts With) as well as to represent composition

    (i.e., Team and Assignment)

    Figure 9. Experiment 2 (Form 2): Collaborative Auditing Incorporated

    the two diagrams in that experimental form. A correct answer

    was given a value of 1; an incorrect answer was given a value

    of 0. To facilitate a within-subject analysis, each subject

    received a score of 1, 0, or 1 for each of the 11 questions. A

    score of 1 indicates that, for a particular question, the subject

    produced the correct answer under conditions of no overload

    and the incorrect answer under conditions of overload. A

    score of 1 indicates the opposite. A score of 0 indicates no

    treatment effect (the subject answered the question correctly

    or incorrectly for both treatments).

    If the two experimental forms are evaluated independently,

    the effect of construct overload is confounded by the degreeto which subjects understood association classes. Table 1

    illustrates the results of doing so. In the first experimental

    form, all significant performance differences favored the treat-

    ment with no overload and no association classes (Figure 7b

    over Figure 7a). However, in the second experimental form

    all significant performance differences favored the treatment

    with overload and no association classes (Figure 9b over

    Figure 9a). That is, if the first experimental form were used,

    and the effects of association classes were ignored, then the

    analysis would suggest that construct overload is detrimental

    to human performance. However, if the second experimental

    form were used, and the effects of association classes were

    ignored, then the analysis would suggest that construct over-

    load is beneficial to human performance. Only by combining

    the experimental forms can the analysis separate the effects of

    construct overload from the effects of association classes.

    The results of the combined analysis are summarized in

    Table 2. Each of the 11 questions is analyzed in two ways.

    First, combining the results from the treatments with constructoverload (Figure 7a and Figure 9b) compared to the

    treatments with no construct overload (Figure 7b and Figure

    9a), we performed a within-subjects (paired observations)

    analysis, using the scoring calculation described above.

    Second, we performed two between-subjects analyses using

    a rank sum mean difference scoring, and isolating the effects

    of association classes.

    MIS Quarterly Vol. 36 No. 3/September 2012 959

  • 7/30/2019 78164802

    16/29

    Allen & March/Representing PartWhole Relations in Conceptual Meaning

    Figure 10. Experimental Analysis Summary

    Table 1. Independent Analysis of Experimental Forms

    Experimental Form 1 Experimental Form 2

    # n Mean Difference P-Value Mean Difference P-Value

    1 41 0.12 0.267 0.00 1.000

    2 41 0.37 < 0.001* -0.37 0.003*

    3 41 0.29 0.004* -0.15 0.210

    4 41 0.02 1.000 -0.05 0.688

    5 41 0.12 0.227 -.05 0.688

    6 41 -0.07 0.509 0.07 0.549

    7 41 0.10 0.388 -0.20 0.008*

    8 41 0.00 1.000 -0.10 0.289

    9 41 0.27 0.013* -0.24 0.013*

    10 41 -0.02 1.000 0.07 0.549

    11 41 -0.12 0.180 0.15 0.238*Significant at = 0.05.

    Note: Mean differences are calculated as performance on treatment without overload minus performance on treatment with overload.

    Composition

    Association Class

    Composition

    Aggregation

    Composition

    Association Class

    Composition

    Labeled Arc

    Composition

    Labeled Arc

    Composition

    Association Class

    Composition

    Labeled Arc

    Composition

    Labeled Arc

    Figure 7a: n = 41 Figure 7b: n = 41

    Figure 9b: n = 41 Figure 9a: n = 41

    n = 82n = 82

    Within-Subject Analysis

    Between-Subject Analysis

    Experimental

    Form 1

    Figure 7

    Experimental

    Form 2

    Figure 9

    960 MIS Quarterly Vol. 36 No. 3/September 2012

  • 7/30/2019 78164802

    17/29

    Allen & March/Representing PartWhole Relations in Conceptual Meaning

    Table 2. Results of Statistical Tests for Experiment 2

    # N

    Within Subjects Analysis

    Between Subjects Analysis

    Figure 7a vs. Figure 9a Figure 7b vs. Figure 9b

    Mean

    Difference* p-value

    Rank Sum

    Mean

    Difference* p-value

    Rank Sum

    Mean

    Difference* p-Value1 82 0.061 0.332 8 0.030 -3 0.341

    2 82 0.000 1.000 3 0.514 -3 0.401

    3 82 0.073 0.377 4 0.384 2 0.615

    4 82 -0.012 1.000 -3 0.375 2 0.509

    5 82 0.037 0.629 -1 0.770 4 0.313

    6 82 0.000 1.000 0 1.000 0 1.000

    7 82 -0.049 0.503 0 1.000 -4 0.313

    8 82 -0.049 0.455 -2 0.586 -2 0.541

    9 82 0.012 1.000 4 0.295 -3 0.515

    10 82 0.024 0.832 -1 0.792 3 0.457

    11 82 0.012 1.000 -4 0.221 5 0.228

    *Null hypothesis: population mean difference = 0. Negative mean difference indicates that raw mean score on the diagram with construct overload

    as higher than its counterpart for that question.

    Note: Question 1 is not included in the performance analysis. It was included in the experimental treatment to force subjects using the overloaded

    diagrams from Figures 7a and 9b to engage the parts of the diagrams that overloaded constructs. Because of the unbalanced complexity this

    introduces in Figure 7a, performance differences cannot be attributed to construct overload for this question.

    The within-subjects analysis yielded no significant perfor-

    mance differences for any question. The p-values ranged

    from 0.332 to 1.000 (Table 2, Within-Subjects Analysis).

    This provides compelling evidence that, at least in this con-

    text, construct overload is not predictive of user performance.

    Beyond the within-subject analysis, our experimental design

    allows us to conduct two meaningful between-subjects

    analyses to further examine any effect of construct overload

    while isolating the confounding effects of variations in UML

    syntax, specifically association classes. By comparing the

    performance of subjects on the overload treatment in experi-

    mental form 1 (Figure 7a) with the performance of subjects on

    the no-overload treatment of experimental form 2 (Figure 9a)

    we can examine the effect of construct overload with only

    minor variation in UML syntax. Both treatments used asso-

    ciation classes. Likewise, we can compare performance on

    Figure 7b with that of Figure 9b to get another look at the

    effect of construct overload with very little variation in UMLsyntax. In this comparison, neither treatment uses association

    classes. In brief, none of the performance differences is signi-

    ficant even at an alpha of 0.20 (Table 2, Between-Subjects

    Analysis).

    In summary, if construct overload is predictive of human per-

    formance in problem solving tasks using conceptual models,

    we would expect to see a preponderance of statistically

    significant results among the more than 30 tests (question

    comparisons) conducted. In fact, we observe none. This

    finding strongly suggests that construct overload is not a

    salient predictor of human performance in this context.

    Hence, based on our own experimental evidence, we argue

    that the results presented by Shanks et al. are explained

    simply and parsimoniously by the complexities of ternary

    relationships and unbalanced counterintuitive domain seman-

    tics rather than by any differences in construct overload.

    Conclusions

    Based on the above analysis of conceptual underpinnings,

    experimental procedures, and data analysis, we argue that

    there is insufficient evidence upon which to base the conclu-

    sions reported in Shanks et al., that their results add strengthto a growing body of empirical work that supports the

    usefulness of ontological theories, especially Bunges (1977)

    theory, as a means of predicting the strengths and weaknesses

    of conceptual modeling grammars and practices (p. 570).

    Our own experimental results lead us to the opposite conclu-

    sion. Moreover, describing the theoretical underpinning for

    their study, Shanks et al. assert

    MIS Quarterly Vol. 36 No. 3/September 2012 961

  • 7/30/2019 78164802

    18/29

    Allen & March/Representing PartWhole Relations in Conceptual Meaning

    The theory of ontological clarity is also not a con-

    tingency theory. In other words, according to the

    theory, instances of construct overload, construct

    redundancy, construct deficit, and construct excess

    will always undermine users understanding of

    conceptual models (p. 557).

    Given that we have provided compelling evidence that

    construct overload does not result in inferior human perfor-

    mance in the use of conceptual models, the theory of

    ontological clarity has been falsified (Popper 1963) and must

    either be discarded or modified to account for the new

    experimental findings. To the extent that prior studies rely on

    or support the theory of ontological clarity, they should be

    reexamined to determine the conditions under which the

    theory is predictive. For example, while we have falsified the

    theory in the context of construct overload, Bodart et al.

    (2001) argue support in the context of construct excess. All

    such studies that have reported support for the predictions ofthe theory of ontological expressiveness should be reviewed

    in detail for clues about the proper disposition of the theory.

    We again emphasize the importance of experimental research

    on conceptual modeling and we appreciate the difficulty of

    formulating research questions and appropriately developing

    and executing experimental procedures. Our critical com-

    ments are offered in the spirit of improving subsequent

    research in this area. The difficulties we describe can be

    addressed in future studies. If Bunges definition of compo-

    sition is to be evaluated as a guide to conceptual modelers in

    their representation of phenomena, then that definition must

    be applied precisely and accurately and not confounded with

    differences in conceptual modeling constructs. Otherwise the

    experiment cannot be construed as providing evidence that

    supports the use of the ontology as a means of predicting the

    strengths and weaknesses of conceptual modeling models and

    practices (Shanks et al. 2008, p. 570).

    However, we see no compelling evidence to indicate that

    Bunges ontology holds significant promise as an underlying

    foundation for conceptual modeling. With no construct to

    represent conceptual objects or events, we contend that

    Bunges ontology suffers significant construct deficit with

    respect to the representation of business domains. This is byno means a criticism of Bunges ontology. Bunge developed

    his ontology for the expressed purpose of representing objec-

    tive, scientific knowledge. It was not developed to represent

    human commerce or phenomena that occur in human

    industry. It was developed to represent concrete objects and

    only concrete objects, objects that exist in space and time,

    independent of human knowledge or intentions. We render

    no judgment on its value for the representation of such

    objects. However, we assert that it was not intended to be

    used as a referent ontology for conceptual modeling of busi-

    ness systems and using it in that context is a misappropriation.

    We contend that social ontology, epistemology, cognitive

    theories, and theories of human memory structures and

    communication are a more appropriate basis for the evaluation

    of conceptual modeling grammars and practices (Allen and

    March 2006b; Gemino and Wand 2005). Searle (1995, 2006),

    for example, recognizes the significance of socially con-

    structed objects such as corporations, contracts, agreements,

    commitments, money, and intellectual property in the

    execution of human endeavor. Such objects are socially

    defined and created by performative acts of social entities.

    They depend on human recognition and beliefs (episte-

    mology) for their existence. That is, they exist only because

    people believe (or agree) they exist. Such objects are expli-

    citly excluded from Bunges ontology, yet they are crucial

    components of business domains and must be included in

    conceptual models.

    The fundamental question for researchers, elegantly expli-

    cated by Wand and Weber (2002), is how can we model such

    domains to better facilitate our developing, implementing,

    using, and maintaining more valuable information systems?

    To answer this question we must first establish a concep-

    tualization for information systems within business contexts.

    A fundamental role of an information system is that of a

    communication mechanism, providing a medium in which not

    only facts but intentions, obligations, commands, and inter-

    pretations of facts (knowledge and meaning) are recorded and

    disseminated within an organization (Kent 1978). Informa-

    tion systems play an active role in business processes (March

    and Allen 2007), participating in the gathering of business

    intelligence and the execution of business activities based

    upon the interpretation of the information gathered. These are

    cognitive tasks. Conceptual modeling grammars and prac-

    tices should attempt to leverage human information pro-

    cessing capabilities rather than merely follow philosophical

    conceptualizations of existence.

    For example, humans have an innate competency for pro-

    cessing events. Human memory for events and past experi-

    ences is psychologically and physiologically different fromhuman memory for facts and concepts (Nyberg 1998; Tulving

    1983, 2002). Moreover, events are fundamental to narrative

    thinking (Robinson and Hawpe 1986) and to the represen-

    tation of causality (Pillemer 1998; Ramesh and Browne

    1999). Both are principal processes in human sense-making

    (Gee 1985). Furthermore, humans use this narrative/event

    processing competency as a powerful tool for verbal and

    written communication (Orr 1990).

    962 MIS Quarterly Vol. 36 No. 3/September 2012

  • 7/30/2019 78164802

    19/29

    Allen & March/Representing PartWhole Relations in Conceptual Meaning

    Ontologically, events are categorically distinct from objects

    (Bunge 1977; Davidson 1980). Hence, it can be argued that

    using the UML class construct to represent both events and

    objects will result in construct overload. However, studies in

    psychology (e.g., Wakslak et al. 2006; Zacks et al. 2007;

    Zacks and Tversky 2001) indicate that people conceptualize

    events and objects in a similar manner. That is, people

    ascribe identification and attributes to events and recall them

    in the same manner in which they recall physical objects.

    They conceptualize events as having parts (sub-events) and

    events constitute a significant component of human memory

    structures. Using the class construct to represent both events

    and objects can be viewed as an abstraction mechanism used

    to manage the complexity inherent in the modeled domain.

    This has been recognized by a number of researchers (e.g.,

    Geerts and McCarthy 2002) and empirical research has

    indicated that representing events as entities is beneficial to

    human performance in certain information processing tasks

    (Allen and March 2006b). We encourage future researchefforts aimed at gaining an understanding of the effects of

    using such abstraction mechanisms in conceptual modeling

    grammars and practices.

    We hope that our assessment of the work reported by Shanks

    et al. and the discussion of alternative foundational disciplines

    for conceptual modeling will stimulate discussion and will

    lead to the design and execution of additional experiments to

    test the implications of such proposed foundations.

    Acknowledgments

    We thank Gordon B. Davis, John V. Carlis, and Toby Teorey for

    their insightful comments and suggestions on an earlier version of

    this paper. We also thank Sham Navathe and Tom Nadeau for

    clarifying the intended meaning of data models used in their

    textbooks, Stephen W. Liddle for clarifying the meaning of the

    cardinality constraints for ternary relationships in the UML, and

    Mario Bunge for clarifying the definition of concrete objects in his

    ontology. Finally, we thank the senior editor, associate editor, and

    four anonymous reviewers for helpful comments and suggestions.

    References

    Allen, G. N., and March, S. T. 2006a. A Critical Assessment of

    the BungeWandWeber Ontology for Conceptual Modeling,

    in Proceedings of the 16th Annual Workshop on Information

    Technologies and Systems, Milwaukee WI, pp. 25-30.

    Allen, G. N., and March, S. T. 2006b. The Effects of State-Based

    and Event-Based Data Representation on User Performance in

    Query Formulation Tasks, MIS Quarterly (30:2), pp. 269-290.

    Batra, D., Hoffer, J. A., and Bostrom, R. P. 1990. Comparing

    Representations with Relational and EER Models, Com-

    munications of the ACM(33:2), pp. 126-139.

    Bodart, F., Patel, A., Sim, M., and Weber, R. 2001. Should

    Optional Properties Be Used in Conceptual Modeling? A Theory

    and Three Empirical Tests, Information Systems Research

    (12:4), pp. 384-405.Bowen, P. L., OFarrell, R. A., and Rohde, F. H. 2006. Analysis

    of Competing Data Structures: Does Ontological Clarity Produce

    Better End User Query Performance,Journal of the AIS(7:1),

    Article 22.

    Bowen, P. L., OFarrell, R. A., and Rohde, F. H. 2009. An

    Empirical Investigation of End User Query Development: The

    Effects of Improved Model Expressiveness versus Complexity,

    Information Systems Research (20:4), pp. 565-584.

    Bunge, M. 1977. Treatise on Basic Philosophy: Volume 3:

    Ontology I: The Furniture of the World, Boston: Reidel.

    Burton-Jones, A., and Meso, P. 2006. Conceptualizing Systems

    for Understanding: An Empirical Test of Decomposition

    Principles in Object-Oriented Analysis, Information SystemsResearch (17:1), pp. 38-60.

    Burton-Jones, A., Wand, Y., and Weber, R. 2009 Guidelines for

    Empirical Evaluations of Conceptual Modeling Grammars,

    Journal of the AIS(10:6), Article 1.

    Carlis, J., and Maguire, J. 2001. Mastering Data Modeling: A

    User-Driven Approach, Boston: Addison-Wesley.

    Chen, P. P-S. 1976. The Entity-Relationship ModelToward a

    Unified View of Data,ACM Transactions on Database Systems

    (1:1), pp. 9-36.

    Davidson, D. 1980. Essays on Actions and Events, New York:

    Oxford University Press.

    Elmasri, R., and Navathe, S. B. 2004. Fundamentals of Database

    Systems (4

    th

    ed.), Boston: Pearson/Addison-Wesley.Feldman, M. S., and Pentland, B. T. 2003. Reconceptualizing

    Organizational Routines as a Source of Flexibility and Change,

    Administrative Science Quarterly (48:1), pp. 94-118.

    Fowler, M. 2003. UML Distilled: A Brief Guide to the Standard

    Object Modeling Language, Boston: Addison-Wesley.

    Gee, J. P. 1985. The Narrativization of Experience in the Oral

    Style,Journal of Education (167), pp. 9-35.

    Geerts, G. L., and McCarthy, W. E. 2002. An Ontological

    Analysis of the Economic Primitives of the Extended-REA

    Enterprise Information Architecture,International Journal of

    Accounting Information Systems (3:1), pp. 1-16.

    Gemino, A., and Wand, Y. 2005. Complexity and Clarity in

    Conceptual Modeling: Comparison of Mandatory and Optional

    Properties,Data and Knowledge Engineering(55), pp. 301-326.

    Halpin, T., and Morgan, T. 2008. Information Modeling and

    Relational Databases (2nd ed.), Burlington, MA: Morgan

    Kaufmann Publishers.

    Hansen, G., and Hansen, J. 1996. Database Management and

    Design (2nd ed.), Upper Saddle River, NJ: Prentice-Hall.

    Keet, C. M. 2008. Representing and Reasoning Over a Taxonomy

    of PartWhole Relations,Applied Ontology (3), pp. 91-110.

    MIS Quarterly Vol. 36 No. 3/September 2012 963

  • 7/30/2019 78164802

    20/29

    Allen & March/Representing PartWhole Relations in Conceptual Meaning

    Kent, W. 1978. Data and Reality. Amsterdam: Elsevier Science

    Limited.

    Kim, Y. G., and March, S. T. 1995. Comparing Data Modeling

    Formalisms, Communications of the ACM(38:6), pp.103-115.

    Liddle, S., Embley, D., and Woodfield, S. 1993. Cardinality

    Constraints in Semantic Data Models, Data and Knowledge

    Engineering, (11), pp. 235-270.March, S. T., and Allen, G. 2007. Ontological Foundations for