78164802

7/30/2019 78164802

1/29

RESEARCH NOTE

A RESEARCH NOTE ON REPRESENTING PART-WHOLE

RELATIONS IN CONCEPTUAL MODELING¹

Gove N. Allen

Marriott School of Management, Brigham Young University, 783 Tanner Building,

Provo, UT 84602 U.S.A. {[email protected]}

Salvatore T. March

Owen Graduate School of Management, Vanderbilt University,

Nashville, TN 37203 U.S.A. {[email protected]}

Empirical research is an important methodology for the study of conceptual modeling practices. The recently

published article "Representing Part-Whole Relations in Conceptual Modeling: An Empirical Evaluation"

(Shanks et al. 2008) uses the lens of ontology to study a relatively sophisticated aspect of conceptual modeling

practice, the representation of aggregation and composition. It contends that some analysts argue that a

composite should be represented as a relationship while others argue that a composite should be represented

as an entity. We find no evidence of such a dispute in the data modeling literature. We observe that composites

are objects. By definition, all object-types should be represented as entities. Therefore, using the relationship

construct to represent composites should not be seen as a viable alternative. Additionally, we found significant

conceptual and methodological issues within the study that call its conclusions into question. As a way to offer

insight into the requisite methodological procedures for research in this area, we conducted two experiments

that both explicate and address the issues raised. Our results call into question the utility of using ontology

as a foundation for conceptual modeling practice. Furthermore, they suggest a contrary but at least equally

plausible explanation for the results reported by Shanks et al. In conducting this work we hope to encourage

dialogue that will be beneficial for future endeavors aimed at identifying, developing, and evaluating

appropriate foundations for the discipline of conceptual modeling.

Keywords: Conceptual modeling, empirical research, ontology, information systems development, composition,

UML, entity-relationship model

Introduction

Academic research in the field of information systems has long been interested in examining the relationship between

conceptual models (grammars, methods, and scripts) and

human performance in tasks related to information system

development and use (Batra et al. 1990; Bodart et al. 2001;

Burton-Jones and Meso 2006; Kim and March 1995). One important question deals with identifying a sound theoretical

basis for the development and assessment of conceptual

modeling grammars and practices (Wand and Weber 2002).

Bunge's (1977) ontology has been suggested as such a theoretical

basis (Wand et al. 1995; Wand et al. 1999; Wand and

Weber 1993). Several articles have reported experiments that

study the effects of conformance to Bunge's ontology on

human performance in various tasks (Bodart et al. 2001;

¹Dale L. Goodhue was the accepting senior editor for this paper. Sandeep

Purao served as the associate editor.

The appendices for this paper are located in the Online Supplements

section of the MIS Quarterly's website (http://www.misq.org).

MIS Quarterly Vol. 36 No. 3, pp. 945-964/September 2012 945


Bowen et al. 2006, 2009; Gemino and Wand 2005; Shanks et

al. 2008).

Establishing that a specific ontological foundation consis-

tently leads to superior human performance in conceptual

modeling tasks would be a significant contribution to our

field. Shanks et al. (2008) draw this conclusion, contending

that

our results add strength to a growing body of empirical

work that supports the usefulness of ontological

theories, especially Bunge's (1977) theory, as a

means of predicting the strengths and weaknesses of

conceptual modeling models and practices (p. 570).

However, we found a number of apparent problems with the

Shanks et al. study, in both its conceptual underpinnings and

in the execution of the empirical tests that cast considerable

doubt on such a conclusion.

Articles in MIS Quarterly are often the starting point for

further research. Because reliance on results with problematic

conceptualization and methodology will have detrimental

effects on the development of a cumulative research tradition

within the discipline (Straub 2008), we feel it is important to

discuss these problems, to recast and retest the assertions of

Shanks et al., and to provide an exemplar of careful work in

this difficult area of research.

While we focus on what we see as critical concerns with the

study reported by Shanks et al., it is important to note the

significant contributions of the work. First, this is among a very small number of studies aimed at empirically testing the

utility of a well-established philosophical ontology as the

basis for conceptual modeling. Empirical studies of this

nature are crucial. Second, it presents an innovative framework

for data analysis that combines a quantitative assessment of

subjects' answers with a qualitative assessment of explanations

for their answers and their interpretations of the modeled

domain. It presents a detailed description of procedures used

in this qualitative data analysis. Analysis and categorization

of verbal protocols is a difficult and time-consuming process,

but one that can yield significant insight into subjects'

reasoning processes.

In undertaking this work, our intent is to encourage dialogue

that will be beneficial for future endeavors as our field works

toward identifying, developing, and evaluating appropriate

foundations for the discipline of conceptual modeling. We

identify and explicate two concerns with respect to the con-

ceptualization and five with respect to the execution (method-

ology) of the study presented in Shanks et al. Further, we

conducted two experiments that validate our concerns and

demonstrate how such research should be conducted.

With respect to conceptualization, we have two fundamental

concerns. First, the study is structured as an empirical evaluation

of alternative conceptual-modeling representations of

part-whole relations (p. 554) contending that, in practice,

"some analysts argue [that] composites should be represented

as a relationship" (p. 553) rather than as entities.² We find no

support for this contention in the literature. Composition is a

relatively sophisticated construct available in the Unified

Modeling Language (UML). It is used when a modeler deems

it important to represent the fact that a relationship has a

specialized meaning, that is, the association of part objects

that constitute a composite object as opposed to a simple

association (Fowler 2003). Representing the composite object

itself as a relationship violates the basic definitions of the

UML and generally accepted conceptual modeling practice

(Carlis and Maguire 2001; Keet 2008; Kent 1978).

Second, and more importantly, the study appeals to Bunge's ontology as the theoretical foundation for predicting differences in human performance; but it does not conform to Bunge's ontology in the conceptualization or design of the experiment. The experiment uses two conceptual models (diagrams). The first, termed ontologically clear, is argued to be consistent with Bunge's ontology with respect to the representation of composite objects. The second, termed ontologically unclear, is argued to be inconsistent with Bunge's ontology in this respect. The underlying proposition is that the ontologically clear model will result in superior user performance over the ontologically unclear model because of its conformance to Bunge's ontology. However, upon close scrutiny and as discussed in detail below, neither of these models conforms to Bunge's ontology with respect to the definition of composite objects.

We have five concerns with respect to the study's execution.

First, it does not successfully operationalize the independent

variable, ontological clarity. Appealing to the theory of

ontological clarity (Wand and Weber 1993), the study con-

tends that the ontologically unclear model lacks ontological

clarity because it exhibits construct overload. Specifically, it

contends that the UML relationship construct is used to

explicitly represent associations and to implicitly represent composites (Shanks et al. 2008, p. 561). Association and

composition are two distinct ontological constructs; however,

²Although the precise terms for the diamonds and rectangles in an ER

diagram are relationship types and entity types, to be consistent with Shanks

et al. and for ease in reading, we will refer to them as relationships and

entities. We will use the terms relationship instance and entity instance to

refer to members of those sets represented in a diagram.



UML has two graphical symbols to represent relationships:

a labeled arc and a diamond (Rumbaugh et al. 2005). The

ontologically unclear model exclusively uses labeled arcs to

explicitly represent association relationships and exclusively

uses diamonds to implicitly represent composite objects.

Thus it does not overload either graphical symbol.

Second is a concern with the development of the experimental

materials. The study describes a procedure for generating the

ontologically unclear model from the ontologically clear

model by transforming composites into ternary relationships

intended to implicitly represent those composites. It does not,

however, follow that procedure. As a result there is an incon-

sistent mapping between the models with respect to the

representation of composite objects. Furthermore, the ternary

relationships in the ontologically unclear model need not be

interpreted as representing composites at all. They are more

reasonably interpreted as representing relationships.

Third is a concern that the treatments are confounded. The

ontologically unclear model uses binary and ternary relation-

ships while the ontologically clear model uses only binary

relationships. The semantics of ternary relationships are

significantly more difficult to understand than are the seman-

tics of binary relationships, especially for novice users (Topi

and Ramesh 2002). Inclusion of ternary relationships in the

ontologically unclear treatment but not in the ontologically

clear treatment represents a substantial confound, making it

impossible to determine the source of observed performance

differences. We examine the strength of this confound in our

first experiment. Moreover, the questions that make up the

experimental task systematically favor the ontologically clear

diagram in a manner that is independent from the experimental treatment, further strengthening the confounding

effects.

Fourth is a concern that subjects were instructed that the

diagrams they were asked to interpret were standard UML

diagrams; however, the instructions they were given about

interpreting the diagrams were not consistent with the defini-

tions of the UML constructs used, specifically with respect to

the definitions of ternary relationship constraints. Although

this may not be a concern for novices who are unfamiliar with

UML, many of the subjects had modeling experience and a

number had explicit experience with UML. Such subjects likely experienced cognitive dissonance with respect to the

nonstandard definitions.

Fifth is a concern with the analysis of participants' performance.

Correct responses are reported for only two of the

questions used in the experimental task. In one of these

questions, subjects' answers were incorrectly scored for one

of the experimental treatments.

Each of these concerns is assessed and explained in the

following sections. Our goal in this endeavor is to sharpen

the understanding of researchers with respect to the difficulty

and the care required in working in this challenging but

important area of inquiry. We hope to encourage future

research in the development and evaluation of appropriate

theoretical underpinnings for conceptual modeling and to help

improve the conceptualization and execution of empirical

research in this area. Moreover, we seek to open a dialog

among researchers in developing and using appropriate

theoretical foundations for conceptual modeling and in

developing and using appropriate experimental protocols in

conceptual modeling research.

Foundations in Conceptual Modeling Practice

Shanks et al. base their study on the contention that in conceptual

models composite things are sometimes represented

as entities (classes) and sometimes represented via relationships

(associations) between the components of the composite

(p. 554). However, doing so violates the basic definitions

of both the entity-relationship (ER) grammar (Chen

1976) and the Unified Modeling Language (UML) grammar

(Rumbaugh et al. 2005). In both grammars, entities (classes)

are defined to represent collections of objects (instances) and

relationships are defined to represent collections of associa-

tions among objects. An instance of a relationship is the

association of one instance of each related entity.

However, the definitions of conceptual modeling grammars

and the way in which those grammars are appropriated in

practice may vary widely (Feldman and Pentland 2003). The

study presented by Shanks et al. would be important if,

indeed, a significant proportion of analysts argue composites

should be represented as a relationship or association (p.

553), independent of the fact that doing so violates the defini-

tion of the relationship construct. Shanks et al. contend that

widely used conceptual modeling/database textbooks,

including Elmasri and Navathe (2004) and Teorey et al.

(2006), show a composite represented implicitly via a rela-

tionship or association construct (p. 555). We disagree with

this interpretation of the conceptual models presented in these

textbooks, as do the authors of these textbooks.³ The strength

of a conceptual model is that domain semantics are repre-

sented explicitly. As discussed above, the entities in a con-

ceptual model represent collections, or types of objects; the

³Personal communications with the authors of these textbooks between

May 12, 2008, and June 30, 2008.



relationships represent collections or types of associations

among them; object types not explicitly represented are not

part of the conceptual model (Carlis and Maguire 2001).

Figures 1 and 2 are the examples used by Shanks et al. to

motivate their study. A detailed description of the syntax

used in Figure 1 is found in Appendix A, Section 1; a detailed

description of the syntax used in Figure 2 is found in Appen-

dix A, Section 5. Figure 1 is a portion of a conceptual model

taken from Elmasri and Navathe (2004, p. 102). It shows a

committee relationship between a faculty member and a

graduate student. Shanks et al. argue that a composite entity

"committee" is represented implicitly by the so-labeled

diamond. They argue that Elmasri and Navathe "clearly

intend the committee to be a composite entity that has

faculty entities and student entities as components" (p.

555). While readers of a conceptual modeling diagram will

use their existing domain knowledge to make sense of a

diagram, if the designer of a diagram intends to convey some

meaning, it should be done explicitly rather than implicitly

(Carlis and Maguire 2001).

Our interpretation of the diagram is that the committee rela-

tionship indicates that there is an association between

individual graduate students and individual faculty members

by virtue of their mutual participation in a committee, even

though the committee itself is not represented in the model.

If Elmasri and Navathe intended to represent the committee

object they would have done so using an entity. The label

"committee" on the relationship simply stands in place of

another, potentially more descriptive label such as "advises/

is advised by" representing the role played by each entity in the relationship. This interpretation is more consistent with

the way in which graduate committees are typically con-

ceptualized. That is, an instance of the committee relationship

indicates that a faculty member participates in a committee

that advises a student; only faculty members can be members

of a committee; the graduate student is not a member of (or

any other kind of component of) a committee. Furthermore,

a relationship instance is defined and identified by the com-

bination of the related entity instances (Elmasri and Navathe

2004). Hence, a committee relationship instance, as represented

in Figure 1, has exactly one faculty member and exactly

one graduate student. The cardinality constraints, M and N in

the diagram, indicate that one faculty member may be associated with many graduate students and one graduate student

may be associated with many faculty members.

Using the name "committee" for the relationship may evoke

reasoning processes related to academic committee objects

with which the reader is familiar; however, all that can be

known from the diagram is that faculty members and graduate

students are associated with each other and that "committee"

is deemed an appropriate, although potentially misleading,

name for that relationship. If the common meaning of aca-

demic committees is assumed (i.e., each student has one

committee and each faculty member can serve on many

committees), then Figure 1 does allow for reasoning about

committees. The members of a student's committee are those

faculty members associated with that student through the

committee relationship; however, the committee itself is not

represented in the diagram.
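The reading of Figure 1 given above can be made concrete with a minimal sketch. It assumes, purely for illustration, that the COMMITTEE relationship is stored as a set of (faculty, student) pairs; the names and the helper functions `faculty_of` and `students_of` are hypothetical and do not come from Elmasri and Navathe or from Shanks et al.

```python
# Hypothetical instances of the COMMITTEE relationship in Figure 1.
# Each instance pairs exactly one faculty member with exactly one
# graduate student; all names are invented for illustration.
committee = {
    ("Adams", "Lee"),
    ("Baker", "Lee"),
    ("Adams", "Moss"),
}


def faculty_of(student, relationship):
    """Return the faculty members associated with the given student."""
    return {faculty for faculty, s in relationship if s == student}


def students_of(faculty, relationship):
    """Return the graduate students associated with the given faculty member."""
    return {student for f, student in relationship if f == faculty}


# Adding the same pair again changes nothing: a relationship instance is
# identified solely by the combination of the related entity instances.
committee.add(("Adams", "Lee"))

print(faculty_of("Lee", committee))
print(students_of("Adams", committee))
```

In this sketch the members of a student's committee can be derived from the pairs, but no committee object exists anywhere in the data: there is nothing to which a committee name, formation date, or further relationship could be attached.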

Shanks et al. use a second motivating example (Figure 2),

reproduced from Teorey et al. (2006, p. 91), and describe it as

follows: "Engineers are divided into groups for certain

projects. Each group has a leader." By this, Shanks et al.

contend that the association with the aggregation symbol in

Figure 2 implicitly represents a group composite that has

engineers as components (p. 555). Here the use of the

language "divided into groups" may suggest that groups are

composite entities made up of engineers; however, that composite is not represented in the diagram. The relationship

shows that some engineers are related to others by virtue of

either being a group leader of other engineers or being led by

another engineer who is a group leader.

Moreover, the purpose of that figure is not to explicate the

semantics of composition but rather to illustrate rules for

transforming recursive one-to-many relationships into SQL

for implementation (Teorey et al. 2006). The use of the

diamond was not intended to represent a composition (aggre-

gation) relationship, but rather an association relationship as

in the corresponding ER diagram from the prior page in the

textbook. If the intention were to implicitly represent an

aggregation relationship as Shanks et al. contend, then the

interpretation would be as follows: The UML notation for

aggregation (the open diamond at one end of an association

arc) indicates that members of the class touching the diamond

are composed of members of the class at the other end of the

association arc. Thus the semantics of the diagram do not

indicate that groups are composed of engineers; rather they

indicate that engineers are composed of engineers. This is

neither congruent with the English language understanding of

what an engineer is nor is it valid UML syntax. The UML

does not allow aggregation hierarchies to be circular as in the

recursive relationship of Figure 2; an object may not directly or indirectly be a part of itself (Rumbaugh et al. 2005, p.

164). Hence, Figure 2 cannot be interpreted to implicitly

represent any composite.⁴
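The circularity argument can be illustrated with a small sketch. It assumes, as a simplification, that aggregation is recorded at the class level as a mapping from each whole to its parts; the function name `has_aggregation_cycle` and the "Group" class in the second example are hypothetical and are not taken from the UML specification or from Teorey et al.

```python
def has_aggregation_cycle(aggregations):
    """Return True if some class is directly or indirectly a part of itself.

    aggregations maps each whole class to the set of its part classes.
    """
    WHITE, GRAY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    color = {}

    def visit(node):
        color[node] = GRAY
        for part in aggregations.get(node, ()):
            state = color.get(part, WHITE)
            if state == GRAY:
                return True  # reached a class already on the current path
            if state == WHITE and visit(part):
                return True
        color[node] = BLACK
        return False

    return any(
        color.get(node, WHITE) == WHITE and visit(node)
        for node in list(aggregations)
    )


# Reading Figure 2 as aggregation makes Engineer a part of Engineer.
figure_2_as_aggregation = {"Engineer": {"Engineer"}}

# Hypothetical alternative with an explicit Group class as the whole.
explicit_group = {"Group": {"Engineer"}}

print(has_aggregation_cycle(figure_2_as_aggregation))
print(has_aggregation_cycle(explicit_group))
```

Under these assumptions the recursive reading is rejected while the version with an explicit whole is accepted, which mirrors the point that a composite, if intended, would need its own entity.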

⁴In a personal communication on June 10, 2008, Tom Nadeau (Teorey et al.)

confirmed that the intended meaning of the diagram was simple association,

not aggregation. He also confirmed that if the group composite object were

to be modeled, an entity would have been used to explicitly represent it.




Figure 1. Portion of an ER Diagram that Shanks et al. Argue Uses a Relationship to Implicitly Represent a Committee Object (Source: Elmasri/Navathe, FUNDAMENTALS OF DATABASE SYSTEMS, p. 102, Figure 4.9, An EER Schema for a University Database, © 2007, 2004 Shamkant B. Navathe & Ramez A. Elmasri. Reproduced by permission of Pearson Education, Inc.)

Figure 2. Portion of a UML Diagram that Shanks et al. Argue Uses an Association to Implicitly Represent a Composite Group Object (Source: Database Modeling and Design: Logical Design (4th ed.), T. Teorey, S. Lightstone, and T. Nadeau, p. 91. Copyright 2006, Elsevier)

Therefore, neither of these textbook examples uses the rela-

tionship construct to represent composites; in both examples

the relationship construct represents association. Thus neither

example serves as support for the claim that some textbook

authors use relationships to represent composition. As dis-

cussed above, such a claim is inconsistent with the definitions

of both the ER and the UML grammars.

Two related concepts have been discussed in the conceptual

modeling literature, reification and association classes (Elmasri

and Navathe 2004; Halpin and Morgan 2008; Hansen

and Hansen 1996; Rumbaugh et al. 2005). Reification, also

termed objectification, refers to a process of transforming

phenomena initially represented as a relationship into an

entity or class. Association classes combine the properties of

entities and relationships into a single construct. Neither,

however, implies that an ordinary relationship can or should be

used to represent a composite entity. In fact, these constructs

were introduced because the relationship construct alone is

insufficient to express phenomena that have characteristics of

both entities and relationships, specifically the ability to

participate in additional relationships.

Finally, Shanks et al. use ternary relationships in the ontologi-

cally unclear treatment. A ternary relationship associates

three entities. As with any relationship, an instance of a ter-

nary relationship associates exactly one instance of each of

the related entities. However, the semantics of ternary rela-

tionship constraints are complex and frequently misunder-

stood (Topi and Ramesh 2002). Furthermore, there are

significant differences in the definitions of ternary relation-

ship cardinality constraints between UML and a number of

extended ER model proposals. Rather than using UML,

Shanks et al. use one of these extended ER proposals in the

ontologically unclear treatment. Details of the relevant

definitions are presented in Appendix A.

Theoretical Foundations

Shanks et al. indicate that they are testing the theory of onto-

logical clarity posed by Wand and Weber (1993). This theory

asserts that when the constructs of a conceptual modeling

grammar exist in a bijective correspondence with the con-

structs of an ontology, scripts expressed in that grammar will

better communicate meaning to users than will scripts formed

using grammars with ontological mappings that are either

surjective or injective in either direction. In a bijective correspondence, each construct in the conceptual modeling grammar maps to exactly one construct in the ontology and each construct in the ontology maps to exactly one construct in the grammar. In a surjective correspondence, multiple constructs in the conceptual modeling grammar map to the same construct in the ontology or multiple constructs in the ontology map to the same construct in the conceptual modeling grammar. In an injective correspondence, there may be constructs in the conceptual modeling grammar that have no corresponding construct in the ontology or constructs in the ontology that have no corresponding construct in the grammar. The specific aspect of ontological clarity tested by Shanks et al. is the existence of a surjective correspondence where one construct in a conceptual modeling grammar is used to represent multiple constructs in an ontology. This type of surjective relationship is commonly termed construct overload (Burton-Jones et al. 2009).

[Figure 1 diagram labels: FACULTY, COMMITTEE, GRAD-STUDENT; cardinalities M and N.]

[Figure 2 diagram labels: Engineer; is-led-by; is-group-leader-of; multiplicities 0..* and 1.]
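As a minimal sketch of these definitions, the correspondence between a grammar and an ontology can be written as a set of (grammar construct, ontology construct) pairs and checked mechanically. The construct names and the functions `overloaded_constructs` and `is_bijective` are illustrative assumptions, not part of Wand and Weber's formalization or of the Shanks et al. study.

```python
from collections import Counter


def overloaded_constructs(mapping):
    """Return grammar constructs that represent more than one ontology construct.

    mapping is a set of (grammar_construct, ontology_construct) pairs.
    """
    targets = {}
    for grammar_construct, ontology_construct in mapping:
        targets.setdefault(grammar_construct, set()).add(ontology_construct)
    return {g: t for g, t in targets.items() if len(t) > 1}


def is_bijective(mapping, grammar, ontology):
    """True if every construct on each side appears in exactly one pair."""
    grammar_uses = Counter(g for g, _ in mapping)
    ontology_uses = Counter(o for _, o in mapping)
    return all(grammar_uses[g] == 1 for g in grammar) and all(
        ontology_uses[o] == 1 for o in ontology
    )


# Hypothetical one-to-one correspondence.
clear = {("entity", "composite"), ("relationship", "association")}

# Hypothetical overloaded correspondence: the relationship construct
# represents both associations and composites.
overloaded = {("relationship", "association"), ("relationship", "composite")}

print(is_bijective(clear, {"entity", "relationship"}, {"composite", "association"}))
print(overloaded_constructs(overloaded))
```

The check operates on whatever is declared to be a grammar construct, so its outcome depends on whether constructs are identified with graphical symbols or with the broader notion of a relationship.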

The authors select the ontology of Mario Bunge (1977) as the

referent ontology for their study. This presents problems for

conceptual modeling because Bunge's ontology is intended to represent concrete objects (things) that possess substantial properties. Concrete objects exist objectively in space and time, independent of human interpretation and ascription of meaning. Conceptual models, however, must frequently represent conceptual objects and attributes that exist subjectively, are based exclusively on human interpretation, beliefs and collective agreement, and convey ascribed meaning (Kent 1978; Searle 1995, 2006). These are explicitly excluded from Bunge's ontology (Allen and March 2006a; Wyssusek 2006).

In this regard Wand et al. (1999) deviate from Bunge's ontology by asserting that "the notion of a concrete thing applies to anything perceived as a specific object by someone, whether it exists in physical reality or only in someone's mind" (p. 497). In our opinion, this deviation severely reduces any guidance a modeler would obtain from Bunge's ontology in the identification and classification of phenomena to be represented in a conceptual model. In this deviation, phenomena are represented as the modeler perceives to be appropriate, not as they exist in what Bunge (p. 157) calls "real existence." A complete assessment of Bunge's ontology

as a foundation for conceptual modeling is beyond the scope

of this paper and is the subject of ongoing research activities.

Hence, we restrict our discussion to the ontological issues

surrounding the diagrams used in the experimental treatments.

Problems with the Experimental Treatments

Bunge's definition of a composite is quite different from the notion of composition in the UML. None of the composites in the ontologically clear UML diagram (Figure 3) conforms to the definition of composition in Bunge's ontology. Appendix A contains a detailed review of the syntax used in Figure 3. Bunge (p. 29) states that "an individual is composite [if and only if] it is composed of individuals other than itself and the null individual." The composite Team violates this definition because it permits a Team to be composed of one Team Leader and a minimum of zero Team Members. That is, a Team Leader composed with no individuals other than itself (i.e., having no team members) is represented to be a Team. In Bunge's ontology, a Team Leader composed with no other individual is the Team Leader, not a Team.

Thus defining a team as the composition of a leader and some number of members but specifying that a team can exist without any members violates Bunge's definition of composition. It is ontologically equivalent to defining a water molecule as being composed of oxygen and hydrogen but specifying that a water molecule can exist without any hydrogen. This violation could have been eliminated by specifying that a team requires a team leader and at least one member. Similar violations exist in the definitions of Project and Project Plan: a Project can be composed of zero Phases (i.e., composed of nothing); a Project Plan can be composed of zero Budgets and zero Scopes (i.e., composed of nothing).
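The multiplicity argument can be sketched as a simple check on the smallest composition each definition permits. The lower bounds below are those stated in the text; the corrected Team definition, the function name `smallest_composition`, and the rule that a single part is treated the same way as the Team Leader case are illustrative assumptions, not a formalization taken from Bunge.

```python
# Minimum number of each part permitted by the multiplicities described
# in the text for the ontologically clear diagram.
MIN_PARTS = {
    "Team": {"Team Leader": 1, "Team Member": 0},
    "Project": {"Phase": 0},
    "Project Plan": {"Budget": 0, "Scope": 0},
}

# Hypothetical correction suggested in the text: at least one member.
CORRECTED_TEAM = {"Team Leader": 1, "Team Member": 1}


def smallest_composition(min_parts):
    """Classify the smallest whole that the lower bounds allow."""
    total = sum(min_parts.values())
    if total == 0:
        return "composed of nothing"
    if total == 1:
        # Assumption: one part alone is that part, not a composite,
        # following the Team Leader argument above.
        return "single part only"
    return "at least two parts"


for name, parts in MIN_PARTS.items():
    print(name, "->", smallest_composition(parts))
print("Corrected Team ->", smallest_composition(CORRECTED_TEAM))
```

A check of this kind examines only lower bounds; it says nothing about whether the parts can exist independently of the whole, which is a separate requirement.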

Although Purchase Requisition requires its component parts, the parts cannot exist without the whole, as indicated by the use of the solid diamond. In Bunge's ontology, nothing is created or destroyed; composite concrete objects are composed from existing concrete objects (p. 35). The components of a composite must have independent existence from the composite itself. Thus the ontologically clear conceptual model is not consistent with Bunge's ontological definition of reality. To be consistent with Bunge's ontology, each part must have existence prior to being composed. Hence, we conclude that Team, Project, Project Plan, and Purchase Requisition as represented in the ontologically clear diagram are not composites as defined by Bunge's ontology.

Finally, although the ontologically unclear diagram (Figure 4) is purported to violate Bunge's ontology because it uses ternary relationships to implicitly represent composites, it need not be interpreted as representing composites at all (see Appendix A for a detailed review of the syntax used in Figure 4). It is wholly consistent with the predominant application of Bunge's ontology to conceptual modeling to view these ternary relationships as representing mutual properties of distinct things rather than as representing composites (Wand et al. 1999). Thus the ternary relationships in the ontologically unclear diagram need not be interpreted as a departure from ontological clarity. Given these problems with the instantiation of ontological concepts in the ontologically clear and ontologically unclear treatments (Figures 3 and 4, respectively), we argue that no conclusions should be drawn from this study regarding the suitability of Bunge's ontology as a foundation for conceptual modeling.




[Figure 3 diagram: UML Class Diagram 1, "ET Technologies." Classes shown: Department, Employee, Team Member, Team Leader, Team, Client, Project, Phase, Consumable, Project Plan, Budget, Scope, Key Deliverable, Requisition Line, Requisition Header, Purchase Requisition, Supplier. Association labels shown: Require, Has, Requests, Ordered Via, Sent to, Responsible For, Belong To, Assigned To. Attribute lists and multiplicities are not reproduced here.]

Figure 3. Experimental Materials, Ontologically Clear Diagram (Source: Figure 5 from Shanks et al.

2008, p. 561)

Construct Operationalization

As discussed above, the theory of ontological clarity predicts

that users of conceptual models will better understand the

domain semantics represented by the diagrams when there is

a bijective mapping between constructs in the modeling

grammar and constructs in the ontology than when there is a

surjective or injective mapping between them. When a bijec-

tive mapping exists in a given diagram, the diagram is said to

be ontologically clear.
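
The mapping criterion can be illustrated with a short Python sketch. It is not part of either study; the function names and the construct labels are assumptions chosen for the example. It flags construct overload (one grammar construct standing for more than one ontological construct) and checks whether the listed pairs form a one-to-one mapping.

```python
from collections import defaultdict

def overloaded_constructs(pairs):
    """Return grammar constructs that represent more than one ontological construct.

    `pairs` is an iterable of (grammar_construct, ontological_construct) tuples.
    """
    meanings = defaultdict(set)
    for grammar, ontological in pairs:
        meanings[grammar].add(ontological)
    return {g: sorted(m) for g, m in meanings.items() if len(m) > 1}

def is_one_to_one(pairs):
    """True when every grammar construct and every ontological construct occurs in exactly one pair."""
    pairs = set(pairs)
    grammar = {g for g, _ in pairs}
    ontological = {o for _, o in pairs}
    return len(grammar) == len(pairs) and len(ontological) == len(pairs)

# Hypothetical labels, for illustration only.
clear = [("class", "object"), ("relationship", "association")]
overloaded = clear + [("relationship", "composite object")]

print(is_one_to_one(clear))               # True
print(is_one_to_one(overloaded))          # False
print(overloaded_constructs(overloaded))  # {'relationship': ['association', 'composite object']}
```

In this sketch the second mapping fails the one-to-one check only because the relationship construct carries two meanings, which is the situation the theory labels construct overload.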

Shanks et al. argue that using the relationship construct both to represent associations among objects and to implicitly represent composite objects results in construct overload, a surjective mapping from the ontology to the conceptual model. While we agree that using the same construct to represent both associations and composite objects would result in construct overload, the study does not permit such an assessment. In each experimental treatment, associations are represented with one graphical symbol while objects are represented with another. That is, with respect to the representation of association and composite objects, there is no construct overload in either treatment. In the ontologically clear diagram (Figure 3), objects are represented using the rectangular class symbol, association relationships are represented using labeled arcs, and aggregation/composition relationships are represented using small diamonds on the composite object side of the relationship. Shanks et al. state that the ontologically unclear diagram (Figure 4) was produced from the ontologically clear diagram by "converting each aggregate/composite class to a diamond, that is, a UML association" (p. 561).

On the surface, this appears to operationalize different levels

of ontological clarity by creating construct overload; the rela-

tionship construct is intended to convey two different meanings.




[Figure 4 image: UML Class Diagram 2, "ET Technologies," showing classes including Department, Employee, Client, Project, Consumable, Budget, Scope, Supplier, and Key Deliverable, with Project Plan, Phase, Purchase Requisition, and Team drawn as diamonds.]

Figure 4. Experimental Materials, Ontologically Unclear Diagram (Source: Figure 6 from Shanks et al. 2008, p. 562)

However, in UML a diamond is defined to represent an

association, not a composite object. Hence, the diamonds in

Figure 4 should be interpreted as representing ternary rela-

tionships (associations), even though the authors intend them

to communicate the existence of composite objects. More

importantly, all and only ternary relationships intended to

communicate the existence of composites are represented

using the diamond symbol. All binary relationships (associa-

tions) are represented using labeled arcs. Consequently we

see no construct overload in Figure 4; rectangles are used to

represent objects; labeled arcs are used to represent binary

relationships; diamonds are used to represent ternary rela-

tionships intended to communicate the existence of composite

objects.

We conclude that there are significant problems with the

operationalization of the independent variable, construct

overload.

Experimental Implementation

Shanks et al. indicate that they produced the ontologically unclear diagram (Figure 4) from the ontologically clear diagram by "converting each aggregate/composite class to a diamond" (p. 561). Figure 3 contains four aggregate/composite classes: Team, Project, Project Plan, and Purchase

Requisition. If the stated procedure had been followed, then

each of these would appear as a diamond in Figure 4. How-

ever, only Team, Project Plan, and Purchase Requisition ap-

pear as diamonds. Project has inexplicably been represented

as a UML class (a rectangle). Conversely, each diamond in

Figure 4 should correspond to an aggregate/composite class

in Figure 3. Again, one of the four is not mapped according

to the specified procedure. The Phase relationship in Figure 4

corresponds to a class in Figure 3 that is neither aggregate nor

composite. However, if ternary relationships are intended to

implicitly represent composites then, according to the rea-




soning of Shanks et al., this change has the effect of asserting

that projects are composed of phases in the ontologically clear

diagram, but that phases are composed of projects (and con-

sumables and key deliverables) in the ontologically unclear

diagram. This is problematic in itself; however, the implications for the study's validity are deeper than the introduction of counterintuitive domain semantics in one treatment.
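
The mismatch between the stated conversion procedure and the materials can be expressed as a simple set comparison. The following Python sketch is illustrative only; the variable names are assumptions, and the class names are those listed in the discussion of Figures 3 and 4 above.

```python
# Aggregate/composite classes in Figure 3 and diamonds in Figure 4, as described in the text.
composites_in_figure_3 = {"Team", "Project", "Project Plan", "Purchase Requisition"}
diamonds_in_figure_4 = {"Team", "Project Plan", "Purchase Requisition", "Phase"}

# Composites that the stated procedure should have converted but did not.
not_converted = composites_in_figure_3 - diamonds_in_figure_4

# Diamonds that do not correspond to any aggregate/composite class.
unexpected_diamonds = diamonds_in_figure_4 - composites_in_figure_3

print(sorted(not_converted))        # ['Project']
print(sorted(unexpected_diamonds))  # ['Phase']
```

If the procedure had been followed, both differences would be empty.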

Recall that Shanks et al. set out to study differences in user

performance when composites are represented explicitly as

classes as compared to implicitly as relationships. To effect

this examination, they implement the ontologically clear

treatment where the four composite entities are represented as

classes (Figure 3). Their experimental protocol calls for a

second, ontologically unclear treatment, analogous to the first,

where the four composites are represented implicitly by using

relationships (Figure 4). However, in Figure 4, Project and

Phase are represented in a way that is inconsistent with the

experimental protocol. As described by Shanks et al., sub-

jects answering questions involving projects should be

examining a class in one treatment and a relationship in the

other. This difference is the basis for drawing inferences

about differences in subject performance. However, because

Project is presented as a UML class in both diagrams, no such

inferences can be made. Moreover, because Phase is repre-

sented as a relationship in Figure 4 even though it is not a

composite in Figure 3, differences in subject performance will

undoubtedly be introduced that cannot be attributed to repre-

senting a composite explicitly as a class versus implicitly as

a relationship.

These errors in the experimental materials might not be

critical if Project and Phase were of only peripheral impor-

tance to the experimental task. However, each of the 11

questions refers directly to either Project or to Phase and most

(6 of 11) refer to both (see Appendix B for the questions used

in the Shanks et al. study). Accordingly, performance on

every part of the experimental task is influenced by this error

in the experimental materials.

Confounds in the Experimental Treatment

Two factors confound the experimental treatments: the

imbalanced use of ternary relationships and the experimental

task favoring one diagram in a manner distinct from the

experimental treatment. The ontologically unclear treatment

(Figure 4), with which subjects had inferior performance, uses

ternary relationships while the ontologically clear treatment

(Figure 3) does not. That is, even if the ontologically unclear

treatment represented construct overload, the effects of this

factor would be confounded by the use of ternary relation-

ships. The ontologically unclear treatment would differ from

the ontologically clear treatment in that it would include both

construct overload and ternary relationships while the onto-

logically clear treatment contained neither. The semantics of

ternary relationships are more difficult to understand than the

semantics of binary relationships, especially for novices

(Batra et al. 1990; Rumbaugh et al. 2005; Topi and Ramesh

2002). Thus, even without any ontology-based differences in

the diagrams, one would expect Figure 4 to result in lower

participant performance. The qualitative analysis presented

by Shanks et al. suggests that confusion around the ternary

relationship construct was fundamental to differences in

observed performance. Reported subject statements (p. 568) included "the ternary relationship is what confuses me" and "I can't understand why consumables and deliverables are related" (via a ternary relationship). This conjecture is supported by our first experimental study, presented below.

Moreover, the questions that make up the experimental task

systematically favor the ontologically clear diagram in a

manner that is independent from the intended experimental

treatment (ontological clarity), further strengthening the

confounding effects. Several of the questions in the experi-

mental task are more difficult to answer in the ontologically

unclear diagram than in the ontologically clear diagram.

Other questions make statements that are incompatible with the semantics expressed in the ontologically unclear diagram, likely causing confusion among subjects in this treatment.

Details are discussed in Appendix B.

Deviations from UML Definitions

Shanks et al. indicate that they made a single deviation from standard UML class diagram notation: "To avoid having cluttered diagrams, we placed attributes beside class boxes rather than in them. Otherwise we have used standard UML notation" (p. 560). However, the experimental materials exhibit two additional notational deviations that may have had significant, unintended influence on the study's results.

First is the positioning of the names of the ternary relationships in Figure 4. In ER notation, the names of relationships appear inside the diamond; however, according to the rules of UML, "the name of the association (if any) is shown near the diamond," not in the diamond (Rumbaugh et al. 2005, p. 470).5 Of the symbols used in Figure 4, the only construct allowed to have writing inside its borders is the rectangular class symbol. Placing the names of associations inside the symbol borders could have the effect of making them seem more like classes. This, in conjunction with using nouns to name the ternary relationships, encourages the misreading of the relationships as if they were classes. Such a misreading is likely to lead to the erroneous interpretation of the cardinality constraints, which could cause subjects to conclude that an instance of a ternary relationship can exist without an instance of each of the related classes (e.g., a phase relationship instance without a related key deliverable). This is clearly prohibited by the definition of a relationship, and such a conclusion would result in incorrect answers for several questions.

5 We use Rumbaugh et al. (2005) as the authoritative reference for the UML because its authors are the creators of the UML and because it is the source cited by Shanks et al.

The second problematic departure from standard UML notation pertains to multiplicity (cardinality constraints). For ternary relationships (associations), Rumbaugh et al. state, "the multiplicity on an association end represents the potential number of values at the end when the values at the other [two] ends are fixed" (p. 470). Clarifying this definition they continue, "for example, given a ternary association among classes (A, B, C), the multiplicity of the C end states how many C objects may appear in association with a particular pair of A and B objects. If the multiplicity of this association is (many, many, one),6 then for each possible (A, B) pair there is a unique C value" (p. 472).

Consider the Team ternary association from Figure 4. If the multiplicities are read according to the UML definition, the "0..*" near the Project class indicates that each possible combination of a team member and a team leader need not be associated with any project, but could be associated with several, which seems a reasonable constraint. However, consider the "1..*" near the Team Leader class. This notation signifies that each possible combination of team member and project must be associated with at least one team leader. Because the constraint applies to every such combination, the UML definition implies that every team member must be associated with every project.

However, Shanks et al. describe in footnote 7 (p. 560) that subjects were taught to interpret the multiplicity symbols of ternary relationships differently. According to the instruction that subjects received, they should have read the "0..*" near Project to indicate that any given project might not participate in the Team association or might participate in it many times. Likewise, the "1..*" near Team Leader should be interpreted to mean that each Team Leader must participate in the Team association at least once but can participate in it multiple times. This definition conforms to the notion of participation constraints, as specified in some variants of the ER model (Liddle et al. 1993); however, it differs significantly from the definition of cardinality constraints in UML (see Appendix A).
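
The difference between the two readings can be made concrete with a small Python sketch. It is a minimal illustration under stated assumptions: the function names, the instance data (m1, p1, l1, and so on), and the representation of a ternary association as a set of links are all hypothetical and are not drawn from either study's materials.

```python
from itertools import product

def uml_end_ok(links, domains, end, lower, upper=None):
    """UML reading: for each combination of objects at the other ends,
    the number of distinct objects linked at `end` must lie in [lower, upper]."""
    other_positions = [i for i in range(len(domains)) if i != end]
    for combo in product(*(domains[i] for i in other_positions)):
        linked = {link[end] for link in links
                  if all(link[i] == v for i, v in zip(other_positions, combo))}
        if len(linked) < lower or (upper is not None and len(linked) > upper):
            return False
    return True

def participation_ok(links, domains, end, lower, upper=None):
    """Participation reading: each object at `end` must appear in [lower, upper] links."""
    for obj in domains[end]:
        count = sum(1 for link in links if link[end] == obj)
        if count < lower or (upper is not None and count > upper):
            return False
    return True

# Hypothetical instance data; each link is (team member, project, team leader).
members, projects, leaders = ["m1", "m2"], ["p1", "p2"], ["l1"]
domains = [members, projects, leaders]
links = {("m1", "p1", "l1"), ("m2", "p2", "l1")}
LEADER_END = 2

print(participation_ok(links, domains, LEADER_END, lower=1))  # True
print(uml_end_ok(links, domains, LEADER_END, lower=1))        # False

# Under the UML reading, "1..*" at the leader end holds only if every
# (member, project) pair is linked to a leader.
all_pairs = {(m, p, "l1") for m in members for p in projects}
print(uml_end_ok(all_pairs, domains, LEADER_END, lower=1))    # True
```

The same "1..*" annotation is satisfied under the participation reading but violated under the UML reading, because the pair (m1, p2) has no leader.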

The effect of using a nonstandard definition of cardinality constraints depends on subjects' experience with UML.

Subjects with extensive UML experience were likely to be

confused by the nonstandard notation and perhaps apply the

standard definitions to the model, resulting in incorrect

answers for some of the questions. While Shanks et al.

disclose that most of their subjects had prior modeling

experience, including experience with UML, neither the

number of subjects having UML experience nor the level of

that experience is reported. Accordingly, the severity of any

concern resulting from this second departure from the UML

standard is uncertain.

Analysis of Performance

Ascribing a numerical representation to subjects' performance

is fundamental to analyzing the results of any experiment.

The validity of the statistical analysis is only as strong as the

scoring of participant performance on the experimental task.

In the study conducted by Shanks et al., the experimental task

required participants to answer 11 questions by referencing

the UML class diagram they received (each participant

received either Figure 3 or Figure 4). Although all 11 questions are presented, answers were provided for only questions 2 and 10. As discussed below, question 2 was scored incorrectly for the ontologically clear treatment. Question 2 reads as follows:

A team leader has resigned. Does the model allow

the team to continue to work on the project without

him?

Shanks et al. indicate that the correct answer for the ontologically clear treatment (Figure 3) is "possible" and that the correct answer for the ontologically unclear treatment (Figure 4) is "not possible." We agree that the correct answer for the ontologically unclear treatment is "not possible." As discussed above, the existence of an instance of a relationship requires exactly one instance of each related entity; hence, a team cannot exist without its leader. However, that is also the correct answer for the ontologically clear treatment.

Explaining their rationale for claiming that the correct answer for the ontologically clear model is "possible," Shanks et al. state that "because team leader is weakly aggregated in the ontologically clear diagram, the team can still exist if the leader resigns" (p. 565). This statement is inconsistent with the definition of aggregation in the UML grammar. Shanks et al. assert that weak aggregation allows a team to exist without all of its constituents.7 However, aggregation does not indicate that the aggregate can exist without its parts but, rather, that a part may exist independently from the aggregate (Rumbaugh et al. 2005, p. 164). Rumbaugh et al. further explain that constraints such as existence dependency are specified by the multiplicity, not the aggregation marker (p. 166). Thus, to answer the question of whether the team can exist without its leader, the multiplicity of the aggregate association must be examined. On the Team side of the relationship, the multiplicity (1..*) indicates that a team leader may be associated with one or more teams; the multiplicity (1) on the Team Leader side indicates that a team must be associated with exactly one team leader.8

6 The designation (many, many, one) for the relationship (A, B, C) indicates that the cardinality constraint is 0..* on the A end, 0..* on the B end, and 1..1 on the C end.

Accordingly, a team must have a leader to exist, and the correct answer for question 2 in the ontologically clear treatment (Figure 3) is "not possible." Because of this misunderstanding of the UML notation for aggregation, subjects' responses for question 2 in the ontologically clear treatment were scored incorrectly. According to Figure 3, the team cannot exist without its leader.
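
The reasoning that existence dependency follows from the multiplicity rather than from the aggregation marker can be sketched in Python. This is a minimal illustration; the helper names `parse_multiplicity` and `can_exist_without` are assumptions introduced for the example, and the parser handles only the simple forms discussed here ("1", "0..1", "1..*", "0..*", "*").

```python
def parse_multiplicity(text):
    """Parse a UML multiplicity into (lower, upper); upper is None when unbounded."""
    if ".." in text:
        low, high = text.split("..")
    else:
        low = high = text  # a single integer is both the minimum and the maximum
    lower = 0 if low == "*" else int(low)
    upper = None if high == "*" else int(high)
    return lower, upper

def can_exist_without(multiplicity_at_partner_end):
    """An object may exist with no linked partner only if the lower bound
    at the partner's end of the association is 0."""
    lower, _ = parse_multiplicity(multiplicity_at_partner_end)
    return lower == 0

# Multiplicity on the Team Leader side of the Team aggregation, as described above.
print(parse_multiplicity("1"))    # (1, 1)
print(can_exist_without("1"))     # False: a team requires exactly one leader

# A contrasting, hypothetical multiplicity.
print(can_exist_without("0..1"))  # True
```

Note that the aggregation marker never enters the check; only the lower bound of the multiplicity does.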

Without more information about how the other nine questions

were scored, it is not possible to determine if this is an

isolated or a pervasive problem. However, with such problems in the scoring of participant answers and the aforementioned confounds, the statistical analysis of the experimental results involving subjects' performance cannot be interpreted as supporting the hypotheses.

New Experimental Evidence

We argue that the significant statistical results reported by Shanks et al. could be explained equally well by the asymmetric use of ternary relationships in their experimental materials, as opposed to Shanks et al.'s contention that they were caused by construct overload. To investigate this, we conducted two experiments. The first experiment tested the

conjecture that users of conceptual modeling diagrams better

understand domain semantics when expressed using binary

relationships than when expressed using ternary relationships.

Our results, described in more detail below, clearly support

this conjecture.

The second experiment further explicates the effects of con-

founding construct overload and conceptual modeling con-

structs and demonstrates requisite methodological procedures

to test the effects of construct overload while considering the

effects of differences in conceptual modeling constructs. As

described below, the results of this study show that construct overload does not have a significant effect on subjects' performance, at least in the context studied.

Subjects in both experiments were university students

majoring in Information Systems at a large private university.

Each had received several months of training on UML, including binary and ternary associations, association classes, and aggregation, which are the main concepts used in the experimental treatments. Neither of the authors was involved in the

training. In all, 33 subjects participated in the first experiment

and 82 participated in the second experiment. There was no

overlap of subjects in the two experiments. In each experi-

ment, subjects were randomly assigned to treatments.

The two treatments used for experiment 1 are illustrated in

Figure 5. Subjects were told that, due to public health con-

cerns, Wasatch Pork Distributors (WPD) had been required to

track which pigs were involved in purchases made by each of

its customers and that two different diagrams were proposed

by competing consultants. Each subject was asked to answer

nine questions about the semantics conveyed by each diagram

(order of presentation was randomized). The decision of

casting subjects in the role of evaluating proposed diagrams

was made to allow subjects to identify what they perceived to

be errors in the diagrams without feeling the need to reconcile

the diagram to the given description of the business domain.

The diagram proposed by Consultant 1 contained a ternary

relationship while the diagram proposed by Consultant 2 con-

tained three binary relationships. Each diagram expresses

similar, though not identical, domain semantics. Both allowed WPD to track which customers purchased portions of which pigs.

Subjects' overall performance for the diagram containing the ternary relationship was significantly worse than subjects' performance for the binary-only diagram (n = 33; p = 0.0002;

within subjects analysis). Subjects performed almost 50 per-

cent better with the binary treatment than they did with the

ternary treatment, correctly answering, on average, 6.6 as com-

7 The term "weak aggregation" is not expressly defined in the UML, although it may reasonably be used to distinguish aggregation from a stronger form of aggregation, called composition (Rumbaugh et al. 2005, p. 164). In general, the term used to distinguish aggregation from its stronger form (composition) is "shared aggregation" (Rumbaugh et al. 2005, p. 164).

8 When a single integer is used to express multiplicity, it stands for both the minimum and the maximum value (Rumbaugh et al. 2005, p. 466).




                                     Ontological Clarity
UML Syntactic Construct Studied      Overloaded      Not Overloaded
Association Class                    Figure 7a       Figure 9a
Relationship                         Figure 9b       Figure 7b

Figure 6. Experimental Treatment Summary

a. Association Class Overloaded: Association class notation is used to represent a simple association (i.e., Works For and Contracts With) as well as a composite (i.e., Team and Assignment).

b. Association Class Not Overloaded: Labeled arc notation is used to represent a simple association (i.e., Works For and Contracts With) while aggregation is used to represent a composite (i.e., Team and Assignment).

Figure 7. Experiment 2 (Form 1): Collaborative Auditing Incorporated

Each experimental form individually replicates the Shanks et al. study, illustrating the methodological difficulties they

encountered. These can be overcome simply by a combined

analysis of the two forms. In the first experimental form

(Figure 7), two UML 2.0 class diagrams were presented that

represent a common underlying business domain, each pur-

ported to have been developed by a different consultant.

Figure 7a exhibits construct overload by using association

classes to represent both composition and regular association.

Figure 7b avoids construct overload by using labeled arcs to represent regular association and the aggregation notation (the

open diamond at one end of a labeled arc) to represent com-

position. Each subject answered the same set of 11 questions

(shown with answers in Figure 8) for each representation. In

all cases the answers to the questions were the same for both

representations, thereby eliminating the possibility of perfor-

mance differences arising as a result of unbalanced counter-

intuitive domain semantics. Because each subject answered




Questions and Answers

1. An account rep is concerned that client employees who have more experience than auditors might unduly influence assignment findings. Can she list all teams for a given client where the client employee has a hire date that is earlier than the auditor's?
Answer: Yes

2. A team is assigned to perform five tasks. Does the model make provisions for more than one auditor to work on some of those assignments?
Answer: No

3. A client entered a contract which will begin in five months' time. CAI wants to create a team with an auditor now, and give the client the next five months to select an appropriate client employee. Is this possible?
Answer: No

4. A contract has a number of assignments that can be performed simultaneously. Is it possible for a single client employee to work on two concurrent assignments?
Answer: Yes

5. A client has supplied CAI with the list of employees that will be used for an audit; however, no decision has been made about which auditor will be assigned. Does the model allow CAI to record the names of the client employees?
Answer: No

6. An auditor wishes to check her skill appropriateness for an upcoming contract by checking the tasks to which she will be assigned. At this point, the client employees have been selected but they have not been paired with auditors to form teams. Does the model allow assignments to be made before auditors are paired with client employees?
Answer: No

7. An auditor and a client employee have been formed into a team on a contract that has been running for 3 months. Upon reviewing the contract details, they feel that the current set of assignments is inadequate to fulfill the contract. Can they produce a list of tasks which are planned for the current contract but which have not yet been assigned?
Answer: No

8. Because of financial difficulties, a client has requested a reduction in the scope of a contract. CAI is willing to renegotiate the contract scope. Can each team list the risks of their remaining assignments?
Answer: Yes

9. The Account Rep and client on a given contract are working to determine the set of tasks that will be completed to meet the contract terms. Does the model allow CAI to establish teams before any assignments are made?
Answer: No

10. CAI has hired a number of new auditors. Does the model allow CAI to track auditors who are not currently a part of any team?
Answer: Yes

11. A client has requested CAI to perform work that is not included in the task list. Does the model allow CAI to assign a team to perform that work without creating any new tasks?
Answer: No

Figure 8. Collaborative Auditing Incorporated Experimental Questions

the questions for a diagram with construct overload and for a diagram without construct overload, we balance subjects' ability across treatments, reducing exogenous variability in

subject responses. We randomized the order in which sub-

jects received the two treatments to eliminate any systematic

ordering bias.

However, this experimental form suffers some of the same

methodological difficulties as the Shanks et al. study. As

illustrated in Figure 6, it varies two factors, the UML con-

struct studied and construct overload. That is, although we do

not use ternary relationships, our study uses association classes in only one treatment, the one with construct overload. If subjects perform worse on that treatment, we cannot tell whether the performance difference is due to construct overload or to the complexities of the association class notation. Our

experience in teaching UML suggests that association classes,

like ternary relationships, are difficult to understand. To

correct for this methodological concern, we concurrently

administered a second form of the experiment, randomly

assigning subjects to one form or to the other.

In this form (Figure 9), two UML 2.0 class diagrams were again presented that represent the same underlying business

domain, each purported to have been developed by a different

consultant. Figure 9a is similar to Figure 7a in that both make

use of association classes; however, Figure 9a avoids con-

struct overload by using association classes to represent

composition and labeled arcs to represent simple associations.

Figure 9b exhibits construct overload by using labeled arcs to represent both simple association and composition. Again, as illustrated in Figure 6, this form varies two factors, the UML construct studied and construct overload. However, in this form, the more complex construct, association class, is not overloaded while the simpler construct, association, is overloaded. Combining the results from both forms enables us to differentiate the effects of construct overload from the effects of association classes. The complete experimental design is summarized in Figure 10.

A total of 82 subjects participated in this experiment, each

randomly assigned to one of the two experimental forms.

Each subject answered each of the 11 questions for each of




a. Labeled Arc Not Overloaded: Labeled arc notation is used to represent a simple association (i.e., Works For and Contracts With) while association class notation is used to represent composites (i.e., Team and Assignment).

b. Labeled Arc Overloaded: Labeled arc notation is used to represent simple association (i.e., Works For and Contracts With) as well as to represent composition (i.e., Team and Assignment).

Figure 9. Experiment 2 (Form 2): Collaborative Auditing Incorporated

the two diagrams in that experimental form. A correct answer was given a value of 1; an incorrect answer was given a value of 0. To facilitate a within-subject analysis, each subject received a score of -1, 0, or 1 for each of the 11 questions. A score of 1 indicates that, for a particular question, the subject produced the correct answer under conditions of no overload and the incorrect answer under conditions of overload. A score of -1 indicates the opposite. A score of 0 indicates no treatment effect (the subject answered the question correctly or incorrectly for both treatments).
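
The paired scoring rule can be sketched as follows. This is a minimal illustration of the calculation described above; the function name `paired_scores` and the sample answers are hypothetical and do not reproduce any subject's data.

```python
def paired_scores(no_overload_correct, overload_correct):
    """Per-question paired score: the no-overload result minus the overload result.

    Each input is a sequence of booleans (True = correct answer).
    The result for each question is 1, 0, or -1.
    """
    return [int(a) - int(b) for a, b in zip(no_overload_correct, overload_correct)]

# Hypothetical answers for one subject on the 11 questions.
no_overload = [True, True, False, True, False, True, True, False, True, True, False]
overload    = [True, False, False, True, True, True, False, False, True, True, False]

scores = paired_scores(no_overload, overload)
print(scores)  # [0, 1, 0, 0, -1, 0, 1, 0, 0, 0, 0]
```

Averaging these per-question scores across subjects gives a mean difference in which positive values favor the treatment without overload.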

If the two experimental forms are evaluated independently,

the effect of construct overload is confounded by the degree to which subjects understood association classes. Table 1

illustrates the results of doing so. In the first experimental

form, all significant performance differences favored the treat-

ment with no overload and no association classes (Figure 7b

over Figure 7a). However, in the second experimental form

all significant performance differences favored the treatment

with overload and no association classes (Figure 9b over

Figure 9a). That is, if the first experimental form were used,

and the effects of association classes were ignored, then the

analysis would suggest that construct overload is detrimental

to human performance. However, if the second experimental

form were used, and the effects of association classes were

ignored, then the analysis would suggest that construct over-

load is beneficial to human performance. Only by combining

the experimental forms can the analysis separate the effects of

construct overload from the effects of association classes.

The results of the combined analysis are summarized in

Table 2. Each of the 11 questions is analyzed in two ways.

First, we pooled the results from the treatments with construct overload (Figure 7a and Figure 9b) and compared them to the treatments with no construct overload (Figure 7b and Figure 9a) in a within-subjects (paired observations) analysis, using the scoring calculation described above.

Second, we performed two between-subjects analyses using

a rank sum mean difference scoring, and isolating the effects

of association classes.




[Figure 10 image: Experimental Form 1 (Figure 7) with treatments Figure 7a (n = 41) and Figure 7b (n = 41), and Experimental Form 2 (Figure 9) with treatments Figure 9b (n = 41) and Figure 9a (n = 41); comparisons are labeled Within-Subject Analysis and Between-Subject Analysis, and the combined groups are marked n = 82.]

Figure 10. Experimental Analysis Summary

Table 1. Independent Analysis of Experimental Forms

               Experimental Form 1            Experimental Form 2
 #     n     Mean Difference   P-Value     Mean Difference   P-Value
 1    41          0.12          0.267            0.00         1.000
 2    41          0.37        < 0.001*          -0.37         0.003*
 3    41          0.29          0.004*          -0.15         0.210
 4    41          0.02          1.000           -0.05         0.688
 5    41          0.12          0.227           -0.05         0.688
 6    41         -0.07          0.509            0.07         0.549
 7    41          0.10          0.388           -0.20         0.008*
 8    41          0.00          1.000           -0.10         0.289
 9    41          0.27          0.013*          -0.24         0.013*
10    41         -0.02          1.000            0.07         0.549
11    41         -0.12          0.180            0.15         0.238

*Significant at α = 0.05.

Note: Mean differences are calculated as performance on treatment without overload minus performance on treatment with overload.




Table 2. Results of Statistical Tests for Experiment 2

                                                   Between Subjects Analysis
            Within Subjects Analysis    Figure 7a vs. Figure 9a     Figure 7b vs. Figure 9b
                Mean                    Rank Sum Mean               Rank Sum Mean
 #     N     Difference*   p-value      Difference*    p-value      Difference*    p-value
 1    82       0.061        0.332            8          0.030           -3          0.341
 2    82       0.000        1.000            3          0.514           -3          0.401
 3    82       0.073        0.377            4          0.384            2          0.615
 4    82      -0.012        1.000           -3          0.375            2          0.509
 5    82       0.037        0.629           -1          0.770            4          0.313
 6    82       0.000        1.000            0          1.000            0          1.000
 7    82      -0.049        0.503            0          1.000           -4          0.313
 8    82      -0.049        0.455           -2          0.586           -2          0.541
 9    82       0.012        1.000            4          0.295           -3          0.515
10    82       0.024        0.832           -1          0.792            3          0.457
11    82       0.012        1.000           -4          0.221            5          0.228

*Null hypothesis: population mean difference = 0. Negative mean difference indicates that raw mean score on the diagram with construct overload was higher than its counterpart for that question.

Note: Question 1 is not included in the performance analysis. It was included in the experimental treatment to force subjects using the overloaded diagrams from Figures 7a and 9b to engage the parts of the diagrams that overloaded constructs. Because of the unbalanced complexity this introduces in Figure 7a, performance differences cannot be attributed to construct overload for this question.

The within-subjects analysis yielded no significant perfor-

mance differences for any question. The p-values ranged

from 0.332 to 1.000 (Table 2, Within-Subjects Analysis).

This provides compelling evidence that, at least in this con-

text, construct overload is not predictive of user performance.

Beyond the within-subject analysis, our experimental design

allows us to conduct two meaningful between-subjects

analyses to further examine any effect of construct overload

while isolating the confounding effects of variations in UML

syntax, specifically association classes. By comparing the

performance of subjects on the overload treatment in experi-

mental form 1 (Figure 7a) with the performance of subjects on

the no-overload treatment of experimental form 2 (Figure 9a)

we can examine the effect of construct overload with only

minor variation in UML syntax. Both treatments used asso-

ciation classes. Likewise, we can compare performance on

Figure 7b with that of Figure 9b to get another look at the

effect of construct overload with very little variation in UML syntax. In this comparison, neither treatment uses association

classes. In brief, none of the performance differences is signi-

ficant even at an alpha of 0.20 (Table 2, Between-Subjects

Analysis).
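A between-subjects comparison of this kind can be illustrated with a Wilcoxon rank-sum test. The sketch below is an illustration under stated assumptions, not the procedure used in this study: it uses the normal approximation with a tie correction and no continuity correction, and the function names and sample data are hypothetical.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # average of positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def rank_sum_test(group_a, group_b):
    """Return (rank sum of group_a, two-sided p-value), normal approximation."""
    pooled = list(group_a) + list(group_b)
    n1, n2, n = len(group_a), len(group_b), len(pooled)
    ranks = average_ranks(pooled)
    w = sum(ranks[:n1])

    expected_w = n1 * (n + 1) / 2
    tie_counts = {}
    for v in pooled:
        tie_counts[v] = tie_counts.get(v, 0) + 1
    tie_term = sum(t ** 3 - t for t in tie_counts.values())
    variance_w = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if variance_w == 0:
        return w, 1.0  # every observation is tied

    z = (w - expected_w) / math.sqrt(variance_w)
    p_value = math.erfc(abs(z) / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))
    return w, p_value


# Invented scores for two independent groups of subjects.
w, p = rank_sum_test([3, 5, 4, 2, 5], [4, 1, 2, 3, 2])
print(w, round(p, 3))
```

For samples as small as those in the usage lines, an exact rank-sum distribution would be preferable to the normal approximation; the approximation is used here only to keep the sketch short.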

In summary, if construct overload is predictive of human per-

formance in problem solving tasks using conceptual models,

we would expect to see a preponderance of statistically

significant results among the more than 30 tests (question

comparisons) conducted. In fact, we observe none. This

finding strongly suggests that construct overload is not a

salient predictor of human performance in this context.

Hence, based on our own experimental evidence, we argue

that the results presented by Shanks et al. are explained

simply and parsimoniously by the complexities of ternary

relationships and unbalanced counterintuitive domain seman-

tics rather than by any differences in construct overload.
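To put the count of tests in perspective, the following minimal sketch computes how many significant results chance alone would be expected to produce. It assumes 30 tests that are mutually independent and each run at a significance level of 0.05; both are simplifying assumptions made for illustration (the tests here share subjects, so they are not strictly independent), and the function name is hypothetical.

```python
def chance_significance(num_tests, alpha):
    """Expected false positives and P(at least one) when every null is true.

    Assumes the tests are independent, which is an idealization.
    """
    expected_false_positives = num_tests * alpha
    prob_at_least_one = 1 - (1 - alpha) ** num_tests
    return expected_false_positives, prob_at_least_one


expected, prob_any = chance_significance(num_tests=30, alpha=0.05)
print(round(expected, 2), round(prob_any, 3))
```

Under these assumptions, roughly one or two nominally significant results would be unsurprising even if no effect existed, so observing none is at least consistent with the absence of an effect.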

Conclusions

Based on the above analysis of conceptual underpinnings,

experimental procedures, and data analysis, we argue that

there is insufficient evidence upon which to base the conclusions

reported in Shanks et al., that their results add strength to a growing body of empirical work that supports the

usefulness of ontological theories, especially Bunge's (1977)

theory, as a means of predicting the strengths and weaknesses

of conceptual modeling grammars and practices (p. 570).

Our own experimental results lead us to the opposite conclu-

sion. Moreover, describing the theoretical underpinning for

their study, Shanks et al. assert




The theory of ontological clarity is also not a con-

tingency theory. In other words, according to the

theory, instances of construct overload, construct

redundancy, construct deficit, and construct excess

will always undermine users' understanding of

conceptual models (p. 557).

Given that we have provided compelling evidence that

construct overload does not result in inferior human perfor-

mance in the use of conceptual models, the theory of

ontological clarity has been falsified (Popper 1963) and must

either be discarded or modified to account for the new

experimental findings. To the extent that prior studies rely on

or support the theory of ontological clarity, they should be

reexamined to determine the conditions under which the

theory is predictive. For example, while we have falsified the

theory in the context of construct overload, Bodart et al.

(2001) argue support in the context of construct excess. All

such studies that have reported support for the predictions of the theory of ontological expressiveness should be reviewed

in detail for clues about the proper disposition of the theory.

We again emphasize the importance of experimental research

on conceptual modeling and we appreciate the difficulty of

formulating research questions and appropriately developing

and executing experimental procedures. Our critical com-

ments are offered in the spirit of improving subsequent

research in this area. The difficulties we describe can be

addressed in future studies. If Bunge's definition of composition

is to be evaluated as a guide to conceptual modelers in

their representation of phenomena, then that definition must

be applied precisely and accurately and not confounded with

differences in conceptual modeling constructs. Otherwise the

experiment cannot be construed as providing evidence that

supports the use of the ontology as a means of predicting the

strengths and weaknesses of conceptual modeling models and

practices (Shanks et al. 2008, p. 570).

However, we see no compelling evidence to indicate that

Bunge's ontology holds significant promise as an underlying

foundation for conceptual modeling. With no construct to

represent conceptual objects or events, we contend that

Bunge's ontology suffers significant construct deficit with

respect to the representation of business domains. This is by no means a criticism of Bunge's ontology. Bunge developed

his ontology for the express purpose of representing objective,

scientific knowledge. It was not developed to represent

human commerce or phenomena that occur in human

industry. It was developed to represent concrete objects and

only concrete objects, objects that exist in space and time,

independent of human knowledge or intentions. We render

no judgment on its value for the representation of such

objects. However, we assert that it was not intended to be

used as a referent ontology for conceptual modeling of busi-

ness systems and using it in that context is a misappropriation.

We contend that social ontology, epistemology, cognitive

theories, and theories of human memory structures and

communication are a more appropriate basis for the evaluation

of conceptual modeling grammars and practices (Allen and

March 2006b; Gemino and Wand 2005). Searle (1995, 2006),

for example, recognizes the significance of socially con-

structed objects such as corporations, contracts, agreements,

commitments, money, and intellectual property in the

execution of human endeavor. Such objects are socially

defined and created by performative acts of social entities.

They depend on human recognition and beliefs (episte-

mology) for their existence. That is, they exist only because

people believe (or agree) they exist. Such objects are explicitly

excluded from Bunge's ontology, yet they are crucial

components of business domains and must be included in

conceptual models.

The fundamental question for researchers, elegantly expli-

cated by Wand and Weber (2002), is how can we model such

domains to better facilitate our developing, implementing,

using, and maintaining more valuable information systems?

To answer this question we must first establish a concep-

tualization for information systems within business contexts.

A fundamental role of an information system is that of a

communication mechanism, providing a medium in which not

only facts but intentions, obligations, commands, and inter-

pretations of facts (knowledge and meaning) are recorded and

disseminated within an organization (Kent 1978). Informa-

tion systems play an active role in business processes (March

and Allen 2007), participating in the gathering of business

intelligence and the execution of business activities based

upon the interpretation of the information gathered. These are

cognitive tasks. Conceptual modeling grammars and prac-

tices should attempt to leverage human information pro-

cessing capabilities rather than merely follow philosophical

conceptualizations of existence.

For example, humans have an innate competency for processing

events. Human memory for events and past experiences

is psychologically and physiologically different from human memory for facts and concepts (Nyberg 1998; Tulving

1983, 2002). Moreover, events are fundamental to narrative

thinking (Robinson and Hawpe 1986) and to the represen-

tation of causality (Pillemer 1998; Ramesh and Browne

1999). Both are principal processes in human sense-making

(Gee 1985). Furthermore, humans use this narrative/event

processing competency as a powerful tool for verbal and

written communication (Orr 1990).




Ontologically, events are categorically distinct from objects

(Bunge 1977; Davidson 1980). Hence, it can be argued that

using the UML class construct to represent both events and

objects will result in construct overload. However, studies in

psychology (e.g., Wakslak et al. 2006; Zacks et al. 2007;

Zacks and Tversky 2001) indicate that people conceptualize

events and objects in a similar manner. That is, people

ascribe identification and attributes to events and recall them

in the same manner in which they recall physical objects.

They conceptualize events as having parts (sub-events) and

events constitute a significant component of human memory

structures. Using the class construct to represent both events

and objects can be viewed as an abstraction mechanism used

to manage the complexity inherent in the modeled domain.

This has been recognized by a number of researchers (e.g.,

Geerts and McCarthy 2002) and empirical research has

indicated that representing events as entities is beneficial to

human performance in certain information processing tasks

(Allen and March 2006b). We encourage future research efforts aimed at gaining an understanding of the effects of

using such abstraction mechanisms in conceptual modeling

grammars and practices.

We hope that our assessment of the work reported by Shanks

et al. and the discussion of alternative foundational disciplines

for conceptual modeling will stimulate discussion and will

lead to the design and execution of additional experiments to

test the implications of such proposed foundations.

Acknowledgments

We thank Gordon B. Davis, John V. Carlis, and Toby Teorey for

their insightful comments and suggestions on an earlier version of

this paper. We also thank Sham Navathe and Tom Nadeau for

clarifying the intended meaning of data models used in their

textbooks, Stephen W. Liddle for clarifying the meaning of the

cardinality constraints for ternary relationships in the UML, and

Mario Bunge for clarifying the definition of concrete objects in his

ontology. Finally, we thank the senior editor, associate editor, and

four anonymous reviewers for helpful comments and suggestions.

References

Allen, G. N., and March, S. T. 2006a. A Critical Assessment of the Bunge-Wand-Weber Ontology for Conceptual Modeling, in Proceedings of the 16th Annual Workshop on Information Technologies and Systems, Milwaukee WI, pp. 25-30.

Allen, G. N., and March, S. T. 2006b. The Effects of State-Based and Event-Based Data Representation on User Performance in Query Formulation Tasks, MIS Quarterly (30:2), pp. 269-290.

Batra, D., Hoffer, J. A., and Bostrom, R. P. 1990. Comparing Representations with Relational and EER Models, Communications of the ACM (33:2), pp. 126-139.

Bodart, F., Patel, A., Sim, M., and Weber, R. 2001. Should Optional Properties Be Used in Conceptual Modeling? A Theory and Three Empirical Tests, Information Systems Research (12:4), pp. 384-405.

Bowen, P. L., O'Farrell, R. A., and Rohde, F. H. 2006. Analysis of Competing Data Structures: Does Ontological Clarity Produce Better End User Query Performance, Journal of the AIS (7:1), Article 22.

Bowen, P. L., O'Farrell, R. A., and Rohde, F. H. 2009. An Empirical Investigation of End User Query Development: The Effects of Improved Model Expressiveness versus Complexity, Information Systems Research (20:4), pp. 565-584.

Bunge, M. 1977. Treatise on Basic Philosophy: Volume 3: Ontology I: The Furniture of the World, Boston: Reidel.

Burton-Jones, A., and Meso, P. 2006. Conceptualizing Systems for Understanding: An Empirical Test of Decomposition Principles in Object-Oriented Analysis, Information Systems Research (17:1), pp. 38-60.

Burton-Jones, A., Wand, Y., and Weber, R. 2009. Guidelines for Empirical Evaluations of Conceptual Modeling Grammars, Journal of the AIS (10:6), Article 1.

Carlis, J., and Maguire, J. 2001. Mastering Data Modeling: A User-Driven Approach, Boston: Addison-Wesley.

Chen, P. P-S. 1976. The Entity-Relationship Model: Toward a Unified View of Data, ACM Transactions on Database Systems (1:1), pp. 9-36.

Davidson, D. 1980. Essays on Actions and Events, New York: Oxford University Press.

Elmasri, R., and Navathe, S. B. 2004. Fundamentals of Database Systems (4th ed.), Boston: Pearson/Addison-Wesley.

Feldman, M. S., and Pentland, B. T. 2003. Reconceptualizing Organizational Routines as a Source of Flexibility and Change, Administrative Science Quarterly (48:1), pp. 94-118.

Fowler, M. 2003. UML Distilled: A Brief Guide to the Standard Object Modeling Language, Boston: Addison-Wesley.

Gee, J. P. 1985. The Narrativization of Experience in the Oral Style, Journal of Education (167), pp. 9-35.

Geerts, G. L., and McCarthy, W. E. 2002. An Ontological Analysis of the Economic Primitives of the Extended-REA Enterprise Information Architecture, International Journal of Accounting Information Systems (3:1), pp. 1-16.

Gemino, A., and Wand, Y. 2005. Complexity and Clarity in Conceptual Modeling: Comparison of Mandatory and Optional Properties, Data and Knowledge Engineering (55), pp. 301-326.

Halpin, T., and Morgan, T. 2008. Information Modeling and Relational Databases (2nd ed.), Burlington, MA: Morgan Kaufmann Publishers.

Hansen, G., and Hansen, J. 1996. Database Management and Design (2nd ed.), Upper Saddle River, NJ: Prentice-Hall.

Keet, C. M. 2008. Representing and Reasoning Over a Taxonomy of Part-Whole Relations, Applied Ontology (3), pp. 91-110.




Kent, W. 1978. Data and Reality, Amsterdam: Elsevier Science Limited.

Kim, Y. G., and March, S. T. 1995. Comparing Data Modeling Formalisms, Communications of the ACM (38:6), pp. 103-115.

Liddle, S., Embley, D., and Woodfield, S. 1993. Cardinality Constraints in Semantic Data Models, Data and Knowledge Engineering (11), pp. 235-270.

March, S. T., and Allen, G. 2007. Ontological Foundations for
