Expressing causal relationships in conceptual database schemas
V. Ramesh a,*, Glenn J. Browne b
a Department of Accounting and Information Systems, Kelley School of Business, Indiana University, 1309 E. 10th St., Bloomington, IN 47405, USA
b Information Systems and Quantitative Sciences, College of Business Administration, Texas Tech University, Lubbock, Texas, USA
Received 1 April 1998; accepted 22 October 1998
Abstract
Conceptual schema design is a crucial phase in the database design process. The quality of the final database (regardless of logical implementation model) is dependent largely upon the quality of the conceptual schema. Since conceptual schemas serve as formal representations of the requirements specification for a database, it is critical that a schema capture the requirements as completely and unambiguously as possible. Many studies have shown that semantic models, such as the Extended Entity-Relationship model, are better for conceptual database design than traditional models such as relational, hierarchical, and network models. This is primarily because of their ability to capture explicitly many "natural" cognitive relationship types that are likely to occur in requirements specifications, e.g., association, generalization/specialization, and aggregation. However, the relationships that can be specified in a semantic model represent only a subset of the relationships that are likely to be used by people in describing an application environment. Thus, using current semantic models for conceptual database design may result in abstractions of application environments in which some important information from the requirements is either not represented or is represented inappropriately.
This paper seeks to help bridge the gap between requirements specifications and data modeling by hypothesizing the need for supporting additional cognitive relationship types in conceptual models. In the paper, we demonstrate the need for one such relationship type, causation. Specifically, we investigate the effects of the lack of constructs in semantic models for capturing causation on analysts' ability to express causal relationships mentioned in a requirements document.
We found that subjects not familiar with data modeling expressed causal relationships better in their representations than did subjects who had some prior exposure to data modeling. This seems to indicate that the lack of constructs for capturing causation in semantic models hinders the ability of people trained in data modeling techniques to recognize and express causal relationships in conceptual schemas. The results also suggest the need to develop semantic models that provide constructs for capturing causation and other cognitive relationships. © 1999 Elsevier Science Inc. All rights reserved.
Keywords: Semantic modeling; Conceptual schemas; Requirements determination; Causation; Cognitive relationships; Database design
1. Introduction
Conceptual schema design is a crucial phase in the database design process. The quality of the final database (regardless of logical implementation model) and its applications depends largely upon the quality of the conceptual schema. A conceptual model is intended to serve as a formal representation of the requirements specifications. Hence, it is important that a conceptual schema capture the requirements specified as completely and unambiguously as possible (Jarvenpaa et al., 1989). The need to bridge the gap between requirements specifications and data modeling has been identified as a critical area of research (Navathe, 1992). This paper seeks to help bridge this gap by hypothesizing the need for additional cognitive relationship types in conceptual schemas, and demonstrating the need for one such construct, causation.
To understand the weakness of current data models, it is important to examine the kinds of relationships that are likely to be present in a requirements document. Managers and other end-users are typically not trained in database design, and therefore express requirements as they perceive them in the world using natural relationships among objects. These expressions of requirements reflect the natural relationship types humans use to organize knowledge. Research in fields as diverse as cognitive psychology, philosophy, and rhetoric has identified numerous such relationships between entities, such as causal, motivational, and hierarchical relationships (Brockriede et al., 1960; Browne et al., 1998; Curley et al., 1995). A list of these relationships appears in Table 1.
The Journal of Systems and Software 45 (1999) 225–232
* Corresponding author. Tel.: +1-812-855-2641; fax: +1-812-855-8679; e-mail: [email protected]
0164-1212/99/$ – see front matter © 1999 Elsevier Science Inc. All rights reserved.
PII: S0164-1212(98)10081-X
Given the breadth of relationship types that are likely to occur in requirements specifications, it is not surprising that semantically rich data models that permit explicit specification of cognitive relationships such as association (sign in Table 1), generalization/specialization (generalization and individuation in Table 1), and aggregation (various hierarchical relationships in Table 1) have been found to be better for conceptual schema design than traditional models such as relational, hierarchical, and network models (Hull et al., 1987). For example, in a study of the literature comparing the usability of various conceptual data models (traditional and semantic), Batra et al. (1994) found that semantic models such as the Entity-Relationship Model (Chen, 1976) and its derivatives were best suited for supporting conceptual database design. Further, Jarvenpaa et al. (1989) found that end-users were able to express relationships better using semantic models. Navathe (1992) identified five characteristics that a good conceptual model must possess: Expressiveness, Simplicity, Minimality, Formality, and Unique Interpretation. The key characteristic distinguishing semantic models from traditional models is the expressiveness of the relationship constructs supported by them (Burt et al., 1990). This expressiveness allows designers to create abstractions of real-world information by mapping that information into basic human concepts (Tsichritzis et al., 1982). Thus, a semantic model can better capture the user's perception of data relevant to an application (as defined by the requirements) (Navathe, 1992).
As noted, most semantic models allow explicit specification of association, generalization/specialization, and aggregation relationships. However, a review of Table 1 shows that these represent only a subset of the relationships that are likely to be used by people in describing an application environment. Hence, while semantic models may have higher image fidelity than traditional data models, i.e., schemas created using semantic models may conform better to users' views of the world (Everest, 1986), they are still limited in the types of relationships available in them. Thus, using current semantic models for conceptual schema design may result in abstractions of application environments in which some important information is either not represented or is represented inappropriately. 1
1.1. Causation
One relationship type that is not supported in current semantic data models is causation. Causation is a fundamental aspect of cognition, and is the most common type of relationship revealed in studies of human reasoning (Curley et al., 1995; Schustack, 1988). For example, in an empirical study of managerial reasoning, two-thirds of the relationships expressed by subjects were causal in nature (Curley et al., 1995). Hence, causal relationships undoubtedly are part of users' representations of problems, and it is likely that such relationships will be found in requirements specifications.
Data modelers are most likely to encounter causal relationships in the form of business rules (McFadden et al., 1999) or conditional requirements statements. Such a rule or statement, though not representing causality in its purest form, is an informal use of causation; it provides a condition whose presence makes a critical difference to the occurrence of an outcome (Schustack, 1988). The importance of causal statements in requirements documents is likely to increase in the future, because embedding business rules in the form of triggers is becoming increasingly prevalent in commercial databases.
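To make the notion of a business rule embedded as a trigger concrete, the following is a minimal sketch using Python's standard sqlite3 module. The schema is purely illustrative: the table names, column names, and the trigger name are assumptions invented for this example, loosely modeled on the hospital case in Appendix A, and are not taken from the study materials.

```python
import sqlite3

# Hypothetical schema: every table, column, and trigger name below is an
# illustrative assumption, loosely based on the hospital case in Appendix A.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE physician (
    id       INTEGER PRIMARY KEY,
    name     TEXT NOT NULL,
    on_leave INTEGER NOT NULL DEFAULT 0
);
CREATE TABLE duty_assignment (
    physician_id INTEGER NOT NULL REFERENCES physician(id),
    shift        TEXT NOT NULL CHECK (shift IN ('day', 'night'))
);
-- Informal causal rule expressed as a trigger: a physician going on
-- leave causes that physician's duty assignments to be removed.
CREATE TRIGGER physician_leave_changes_schedule
AFTER UPDATE OF on_leave ON physician
WHEN NEW.on_leave = 1
BEGIN
    DELETE FROM duty_assignment WHERE physician_id = NEW.id;
END;
""")

conn.execute("INSERT INTO physician (id, name) VALUES (1, 'Dr. A'), (2, 'Dr. B')")
conn.execute("INSERT INTO duty_assignment VALUES (1, 'day'), (2, 'night')")

# Marking physician 1 as on leave fires the trigger.
conn.execute("UPDATE physician SET on_leave = 1 WHERE id = 1")
remaining = conn.execute(
    "SELECT physician_id, shift FROM duty_assignment"
).fetchall()
print(remaining)
```

In this sketch the causal rule lives only in the trigger definition; nothing in the two table definitions records that leave status affects scheduling, which is one way of seeing the gap between requirements and schema that the paper describes.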
Although the pervasiveness of causation in problem solvers' representations has been empirically demonstrated (e.g., Curley et al., 1995; Tversky et al., 1980; Wilkin, 1996), none of the models used for database design provide sufficient means for capturing causal relationships (Hull et al., 1987). The inability to express these relationships is likely to lead to conceptual schemas that do not completely represent the requirements. The focus of this paper is on investigating the effects of the lack of constructs in semantic models for capturing causation on analysts' ability to express causal relationships mentioned in a requirements document.
1 It should be noted that we are not implying that all relationships mentioned in Table 1 need to be supported in semantic models. In fact, some of the relationships that people use in their cognitive representations may not have relevance for data modeling. The usefulness of such relationships is an empirical question.
Table 1
Example relationships (Adapted from Browne and Curley, 1998; Curley, Browne, Smith, and Benson, 1995)
Relationship category | Relationship type | Description | Example
Causal | Causal | States conditions whose presence makes a critical difference in the occurrence of an event. | The company issued a good earnings forecast; therefore, its stock price rose the next day.
 | Motivational | Intentions of human agents are assumed as reasons for actions. | The employees will use this information system because they know it will help them with their jobs.
Covariational | Sign | An observed or assumed covariation between entities allows one entity to be used as a symptom or clue for concluding the other entity is present. | The product manager believed that the high initial sales of the new product indicated that it would be a best seller.
Hierarchical | Generalization | The inductive argument: reasoning that what is true for specific instances will also be true for other instances within a more general category. | I have tried two bottles of the new Dolphin soft drink and have liked the taste; therefore, I like the new soft drink.
 | Individuation | Reasoning from the general to the specific; what is true for the general category is claimed to be true for individual instances within that category. | All products introduced by this company have been successful; therefore, this new product will be successful.
 | Categorization | Used to support generalization and individuation arguments; applies when the presence of features is sufficient to conclude that an entity belongs to a superordinate category. | This product had sales of $100 million its first year; therefore, this is a successful product.
 | Hierarchical Exclusion | When a category contains a set of mutually exclusive instances, the presence of one instance allows the arguer to conclude the absence of the remaining instances. | The person who committed the crime was a woman; therefore, it was not a man.
 | Hierarchical Combination | When a category contains a set of collectively exhaustive instances, the presence of all instances allows the arguer to conclude the presence of the superset category. | The product is now sold in 50 states; therefore, it now has a national presence.
Similarity | Parallel Case | An intra-domain similarity exists between two entities. | The last time we introduced a product under these circumstances, it was successful; therefore, this product should also be successful.
 | Analogy | An inter-domain similarity exists between two entities. | Sales of a new product are like a seedling; to grow, they must be carefully nurtured.
Testimony | Authority | The arguer utilizes a statement made by an external knowledge source. | The doctor said I am perfectly healthy.
2. Hypotheses
Two groups of subjects were sought for the study, one familiar with semantic data modeling techniques and one unfamiliar. The rationale for the two groups was as follows. Research has demonstrated that people organize information using causation under appropriate circumstances (Schustack, 1988). Hence, subjects unfamiliar with database modeling (the database-naive group) should use causal relationships as naturally appropriate in modeling an application environment. However, because current data modeling training and practice do not support the representation of causation, we hypothesize that subjects familiar with data modeling (the database-knowledgeable group) will not use such relationships. Rather they will force causal relationships into types supported by current data models or not express them at all. Therefore, our first hypothesis is:
(1) Database-naive modelers will express causal relationships in an application scenario to a greater extent than database-knowledgeable modelers.
Further, since current semantic data models do not provide adequate means for expressing causal relationships, we hypothesize that database-knowledgeable modelers will use textual statements to represent any causal relationships they could not express in their data model. However, since we expect that the database-naive modelers will not restrict themselves to a particular graphical representation, we anticipate that they will represent causal relationships diagrammatically. Hence, our second hypothesis is:
(2) Database-knowledgeable modelers will use a greater number of textual statements to express causal relationships than database-naive modelers.
3. Methodology
An experimental hypothesis testing methodology was used to investigate users' ability to represent causal relationships in an application scenario. Subjects were 78 students recruited from information systems classes at an eastern US university who received course credit for their participation. Subjects were categorized as either database-naive ("Naive") or database-knowledgeable ("Knowledgeable") for purposes of analysis. A brief questionnaire distributed after the experimental task was used to facilitate this determination (the principal question concerned subjects' ability to define the terms "entity" and "cardinality" in an E/R modeling context). Thirty-five subjects were categorized as database-naive and 43 subjects were categorized as database-knowledgeable. 2
The stimulus material was a short case describing the requirements for a hospital database application. The case appears in the appendix to this paper. Four informal causal relationships were deliberately and explicitly embedded in the case. 3 These statements appear in Table 2. Subjects in both groups were instructed to sketch a graphical description of the case situation using paper and pencil (note that the knowledgeable subjects were not explicitly asked to create an ER diagram). Subjects were given 45 minutes to complete the task. All subjects finished their representations during this time period.
4. Results
As a check on whether the naive group was handicapped by the lack of formal training in modeling, we tested to see whether members of the two groups captured the essential entities in the model to the same extent. Six entities were identified by the researchers as critical to representing the content of the scenario. The number of entities expressed by each subject was tallied (to be counted, the entity had to be explicitly stated by the subject). The mean number of entities expressed by group was as follows: Naive = 5.17; Knowledgeable = 5.28. These means were not significantly different (t(63) = 0.53; p = 0.60), indicating that subjects in the naive group were able to express the important entities in the scenario as well as subjects in the knowledgeable group. 4
Subjects' representations were coded independently by two coders. To prevent the possibility of subconscious biases, the coders were unaware of which group a particular subject fell within during coding. The coders used a four-point scale to rate the extent to which subjects expressed the causal relationships embedded in the case scenario. The coding scale is described in Table 3.
The coding scheme is necessarily subjective, and interrater reliabilities were calculated to assess the extent to which the coding was performed consistently. For all four statements across 78 subjects, the two coders agreed on 83.3% of the statements (260 out of 312). Codes for relationships on which there was disagreement were resolved through discussion between the two coders. These agreed-upon codes were used for the analyses that follow.
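The agreement figure above is a simple percent-agreement statistic, and its arithmetic can be checked with a short sketch. The function name is hypothetical; the inputs (four statements, 78 subjects, 260 agreed codes) are the values reported in the text.

```python
def percent_agreement(agreed: int, total: int) -> float:
    """Share of coding decisions on which two coders agreed, in percent."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * agreed / total


# Four causal statements coded for each of 78 subjects.
total_codes = 4 * 78
agreement = percent_agreement(260, total_codes)
print(f"{agreement:.1f}% agreement on {total_codes} codes")
```

Percent agreement does not adjust for agreement expected by chance. A chance-corrected measure such as Cohen's kappa would require the full cross-tabulation of the two coders' ratings, which is not reported here.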
Of primary interest was whether the naive group and knowledgeable group differed in the extent to which they expressed the causal relationships present in the scenario. As a preliminary procedure, we removed all relationships that had been coded as zero by the coders. As noted, a code of zero indicated that a relationship was not expressed at all by a subject. In other words, some subjects simply did not express the relationships of interest. 5 Since the ultimate question of interest is whether subjects expressed the causation present in relationships they captured, we treated these subjects as not having performed that portion of the task. For those subjects who were able to capture some relationship between entities in the data, were there differences in the extent to which they expressed causation?
2 Note that "database-knowledgeable" does not mean "expert." Subjects in the knowledgeable group simply had had some training in constructing entity-relationship diagrams. They were not necessarily expert data modelers. However, our argument is that people trained in data modeling will not express causality because semantic modeling does not support causation. Thus, there is no reason to believe that experienced data modelers (in organizations) will be any more sensitive to causality than the "knowledgeable" subjects in our study.
3 The causal relationships included in the case may be termed "informal" because they did not explicitly meet all the criteria for causation; e.g., they did not explicitly rule out all other possible causes. The relationships reflected causation in the common everyday sense, i.e., providing a condition that leads to an effect.
4 Although number of entities captured is relevant to a model's quality, we do not mean to imply anything regarding the quality of subjects' models in the two groups. No judgments of overall quality were made.
5 For the naive group, 31% of the codes were zeros. For the knowledgeable group, 27% of the codes were zeros.
In analyzing the data, significant violations of the homogeneity of variance assumption underlying the analysis of variance procedure were observed. Hence, the Wilcoxon Sum of Ranks test (corrected for tied values), a non-parametric procedure, was performed to test whether the groups differed in terms of the causal ratings assigned to their representations of the causal relationships. A sum of ranks was performed within each causal statement, to control for possible differences between the statements. The results are shown in Table 4. As can be seen, differences between the naive and knowledgeable groups were found for statements 3 and 4. The ratings assigned to these statements for subjects in the naive group were significantly higher, at an α = 0.05 level (indicating greater expression of causation). Naive subjects had higher mean ratings for statements 1 and 2 as well, although these differences did not reach statistical significance. These results support Hypothesis 1. 6 The following conclusion may be drawn: Subjects in the naive group expressed the causation present in the application scenario to a greater extent than subjects in the knowledgeable group.
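For readers unfamiliar with the procedure, the following is a minimal sketch of a two-sided Wilcoxon rank-sum test with a tie correction, using the large-sample normal approximation and no continuity correction. It is not the authors' analysis code, and whether they used an exact or approximate p-value is not stated. The function name and the two rating lists are hypothetical; the raw per-subject ratings are not reported in the paper, so the sample data are invented solely to show the call.

```python
import math
from collections import Counter


def rank_sum_test(x, y):
    """Two-sided Wilcoxon rank-sum test (normal approximation, tie-corrected).

    Returns (w, z, p), where w is the sum of the ranks of sample x.
    """
    x, y = list(x), list(y)
    n1, n2 = len(x), len(y)
    n = n1 + n2
    counts = Counter(x + y)

    # Assign midranks: tied values share the average of the ranks they span.
    midrank = {}
    next_rank = 1
    for value in sorted(counts):
        c = counts[value]
        midrank[value] = next_rank + (c - 1) / 2
        next_rank += c

    w = sum(midrank[v] for v in x)
    mean_w = n1 * (n + 1) / 2
    # The tie correction shrinks the variance by sum(t^3 - t) over tie groups.
    tie_term = sum(c ** 3 - c for c in counts.values())
    var_w = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_w == 0:
        raise ValueError("all observations are identical; test is undefined")

    z = (w - mean_w) / math.sqrt(var_w)
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail probability
    return w, z, p


# Invented ratings on the 1-3 scale (zero codes already removed).
naive_ratings = [1, 1, 2, 1, 3, 1, 2, 1]
knowledgeable_ratings = [1, 1, 1, 1, 1, 1, 1, 1]
w, z, p = rank_sum_test(naive_ratings, knowledgeable_ratings)
print(f"W = {w}, z = {z:.3f}, p = {p:.3f}")
```

Because the ratings take only a few distinct values, ties are heavy and the correction term matters; without it the variance would be overstated and the test would be conservative.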
As a second method of analyzing the extent to which subjects expressed the causal relationships in the scenario, we counted the number of times subjects' representations for each relationship were rated as a "2" (indicating implicit expression of causation) or a "3" (indicating explicit expression of causation). Table 5 lists these numbers, and Figs. 1 and 2 show examples of subjects' representations that were coded as "2" or "3" (only a portion of each subject's diagram is shown). As can be seen in Table 5, subjects in the naive group were much more likely to express causal relationships either implicitly or explicitly than were subjects in the knowledgeable group. Although only 14 of the 78 total subjects expressed causation in their representations, 13 of 35 subjects in the naive group expressed causation. These results are implicit in Table 4 above, but Table 5 provides further explicit support for Hypothesis 1.
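The totals quoted above can be reproduced from the cell counts in Table 5. The sketch below simply transcribes those counts and sums them; the variable names are hypothetical.

```python
# Counts transcribed from Table 5: for each of the four statements,
# (number of "2" ratings, number of "3" ratings).
table5 = {
    "Naive": [(1, 0), (1, 2), (3, 1), (4, 1)],
    "Knowledgeable": [(0, 0), (1, 0), (0, 0), (0, 0)],
}

totals = {
    group: sum(twos + threes for twos, threes in cells)
    for group, cells in table5.items()
}
grand_total = sum(totals.values())
print(totals, grand_total)
```

The sums agree with the figures of 13 and 14 given in the text. Table 5 tallies rated representations per statement, so reading these sums as counts of distinct subjects assumes that no subject contributed more than one representation rated "2" or "3".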
The second question of interest in the current study was whether subjects in the knowledgeable group would utilize textual statements to a greater extent to represent causal relationships than subjects in the naive group. Supplementary textual statements made by subjects were coded independently by two coders for their causal content. 7 A three-point scale was used (Table 6).
Table 2
List of causal statements in scenario
Number | Causal statement
1 | When a physician is on leave or vacation, this causes changes in hospital scheduling
2 | The equipment located in each room causes certain patients to be given certain rooms
3 | The availability of physicians causes a particular physician to be assigned to a particular patient
4 | Each nurse on practical training is supervised by a physician, with the particular physician assigned determined by the physician's specialization
Table 3
Scheme for coding graphical representations
Level | Description
0 | Subject did not express the relationship in his or her diagram
1 | Subject expressed the relationship with no indication of causation
2 | Subject expressed the relationship and expressed causation in some implicit way
3 | Subject expressed the causal relationship explicitly, using either causal language or specifically-defined causal symbols
Table 4
Mean ratings and significance for four causal statements
Causal statement | Naive group mean rating | Knowledgeable group mean rating | p-value
1 | 1.06 | 1.00 | .11
2 | 1.19 | 1.03 | .10
3 | 1.19 | 1.00 | .01
4 | 1.21 | 1.00 | .01
6 We should note that although we found a statistically significant difference between naive and knowledgeable subjects in the extent to which they expressed the causation present in relationships, the mean ratings for both groups of subjects were quite low. This may indicate that people have difficulty representing causation graphically. That is, these data seem to indicate that anyone asked to represent causal relationships using graphical representations may have difficulty expressing the causation. Further research is needed to test whether certain representational forms are more useful than others for expressing causal relationships. (There are, however, graphical tools explicitly designed to help people express causal relationships. Cause maps are one example of such a tool (for a review, see Huff, 1990). However, in the absence of explicit instructions to use such tools, causation may be difficult to express).
7 Textual statements that were exact copies of statements in the requirements were not counted by the coders. We interpreted such statements as simple restatements of the problem for "problem-solving" purposes.
Table 7 shows examples of textual statements coded as "1" and "2".
The mean ratings for textual statements made by the two groups were as follows: Naive group = 0.46; Knowledgeable group = 0.48. These means were not statistically different (using Wilcoxon Sum of Ranks, z = 0.05; p = 0.98); thus, Hypothesis 2 was not supported. Subjects in the naive group made a total of 28 textual statements to qualify their diagrams, and subjects in the knowledgeable group used a total of 27 qualifying statements. Of the 28 statements made by naive group members, one was coded as a 1 (implicit expression of causation) by the coders, and six were coded as a 2 (explicit expression of causation). Coincidentally, of the 27 statements made by knowledgeable group members, one was also coded as a 1 and six were also coded as a 2. Hence, in both groups, only about 25% of the textual statements made by subjects to supplement their diagrams were related to causation.
Fig. 1. Example Representation (Naive group) coded as a 2.
Fig. 2. Example Representation (Naive group) coded as a 3.
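Table 5
Causal representations rated as 2 or 3
Group | Statement 1: 2s | Statement 1: 3s | Statement 2: 2s | Statement 2: 3s | Statement 3: 2s | Statement 3: 3s | Statement 4: 2s | Statement 4: 3s
Naive | 1 | 0 | 1 | 2 | 3 | 1 | 4 | 1
Knowledgeable | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0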
Table 7
Examples of textual statements coded as "1" and "2"
Code | Example
1 | "Hospital scheduling needs to know each physician's leave or vacation to assign available physicians to day or night hours."
2 | "Availability of physicians causes physicians to be assigned to patients."
Table 6
Scheme for coding textual statements
Level | Description
0 | Textual statement makes no attempt to express causal relationship
1 | Textual statement makes an implicit attempt to express causal relationship
2 | Textual statement explicitly expresses causal relationship
Several possible conclusions can be drawn from these results. The relatively low number of textual statements used to express causal relationships (14/312 = 4.5%) suggests that both groups relied heavily on their diagrams to communicate the information in the scenario. As expected, subjects in the naive group may have felt that they could adequately represent the causal relationships in their diagrams, and in fact they did a better job of doing so overall than did the knowledgeable group. Subjects in the knowledgeable group may have ignored the causation present in the scenario because causation is not one of the types of relationships analysts are trained to look for when creating semantic models.
5. Conclusions and future research
Our objective in this paper has been to investigate the effects of the lack of causal constructs in semantic models on analysts' ability to express causal relationships in conceptual schemas. We reported the outcome of an experiment that examined the extent to which database-naive and database-knowledgeable people were able to model causal statements embedded in a short case. We found that the conceptual representations created by the naive subjects expressed causal relationships better than those created by more knowledgeable subjects.
Although causation is a natural construct used by people to represent relationships, the results of our study suggest that the lack of adequate support for capturing causation in semantic models hinders the ability of more knowledgeable subjects to recognize and express causal relationships during conceptual modeling (hence the lack of causation in the knowledgeable group's representations). The data also suggest that exposure to a modeling technique causes people to suppress their natural inclination to express certain types of relationships and to use only the relationship constructs supported by the semantic models. The result is a less faithful representation of the requirements in knowledgeable subjects' models.
Therefore, future research might investigate ways in which causation can be incorporated into semantic models. Such incorporation must be accomplished without adversely affecting a semantic model's ability to serve as a medium of communication among people (Tsichritzis et al., 1982). Another natural extension of the current research is an evaluation of the role that other types of cognitive relationships (identified in Table 1), such as similarity relationships, might play in reducing differences between requirements specifications and conceptual schemas intended to represent such requirements. Since these relationships are used by people in their descriptions of problem domains, they need to be recognized and represented by analysts. Prompts designed to elicit such relationships explicitly can be useful in helping users to articulate their beliefs about an application environment (Browne et al., 1997). However, representational forms must be available to connect the requirements to conceptual schema design. The key point is that various types of relationships will be stated when managers and end-users specify requirements, and it is important to develop techniques that explicitly map these relationships from the requirements to the conceptual model.
Finally, semantic modeling practice suggests that the mere existence of useful modeling constructs is not enough to guarantee their widespread use. For example, although most semantic models provide support for aggregation, the construct is seldom used during conceptual schema design (e.g., an aggregation is often represented as multiple association relationships). Hence, another important goal in this research is to investigate how training in conceptual schema design can be modified to encourage the use of new relationship types.
The authors contributed equally to the preparation of this article.
Appendix A. Instructions
Please organize the information in the following case. In particular, please use pencil and paper to sketch a graphical description of the company's business processes and information important to those business processes. Please also carefully describe any information or relationships that you see but cannot capture in your diagram.
Mountain View Community Hospital
Mountain View Community Hospital serves several cities in northwestern Baltimore County. Mountain View is planning to implement an information system to help manage its operations, particularly information on patient administration and hospital personnel.
The hospital needs to keep records concerning its physicians, patients, departments, equipment, and billing information. Physicians specialize in only one area, and the hospital is particularly interested in keeping track of pediatricians, heart specialists, and cancer-treatment specialists. When a physician is on leave or vacation, this causes changes in hospital scheduling procedures. Therefore, the hospital needs this information to assign available physicians to day or night duty hours.
Patients are treated as in-patients or out-patients. In-patients are assigned individual rooms. The equipment located in each room causes certain patients to be given certain rooms. Each patient may be treated by one or more physicians; the availability of the physicians causes a particular physician to be assigned to a particular patient. Each physician may treat one or more patients. Treatment history for each patient is maintained by the hospital. This includes such factors as when a patient was treated, who treated him or her, what medication was prescribed, and any side effects that the treatment may have had.
Mountain View keeps track of two categories of employees, full-time and practical-training. The hospital is particularly interested in monitoring the progress of nurses who are on practical training. Information tracked includes the date practical training began, the expected end date, university affiliation (if any), status of practical training, and progress reports (if any). Each nurse on practical training is supervised by a physician, with the particular physician assigned determined by the physician's specialization.
References
Batra, D., Antony, S.R., 1994. Effects of data model and task characteristics on designer performance: a laboratory study. International Journal of Human-Computer Studies 41, 481–508.
Brockriede, W., Ehninger, D., 1960. Toulmin on argument: an interpretation and application. Quarterly Journal of Speech, 44–53.
Browne, G.J., Curley, S.P., Benson, P.G., 1997. Evoking information in probability assessment: knowledge maps and reasoning-based directed questions. Management Science 43, 1–14.
Browne, G.J., Curley, S.P., 1998. Reasoning with category knowledge in probability forecasting: typicality and perceived variability effects. In: Wright, G., Goodwin, P. (Eds.), Forecasting with Judgment, Wiley, Chichester, pp. 169–200.
Burt, P.V., Kinnucan, M.T., 1990. Information models and modeling techniques for information systems. Annual Review of Information Science and Technology 25, 175–208.
Chen, P.P., 1976. The entity-relationship model: toward a unified view of data. ACM Transactions on Database Systems 1 (1), 9–36.
Curley, S.P., Browne, G.J., Smith, G.F., Benson, P.G., 1995. Arguments in the practical reasoning underlying constructed probability responses. Journal of Behavioral Decision Making 8, 1–20.
Everest, G.C., 1986. Database management: objectives, system functions, and administration. McGraw-Hill, New York.
Huff, A.S., 1990. Mapping Strategic Thought. In: Huff, A.S. (Ed.), Mapping Strategic Thought, Wiley, Chichester, pp. 11–49.
Hull, R., King, R., 1987. Semantic database modeling: survey, applications and research issues. ACM Computing Surveys 19, 201–260.
Jarvenpaa, S., Machesky, J., 1989. Data analysis and learning: an experimental study of data modeling tools. International Journal of Man-Machine Studies 31, 367–391.
McFadden, F.R., Hoffer, J.A., Prescott, M.B., 1999. Modern Database Management (5th ed.), Addison-Wesley, Reading, MA.
Navathe, S.B., 1992. Evolution of data modeling formalisms. Communications of the ACM 35, 112–123.
Schustack, M.W., 1988. Thinking about causality. In: Sternberg, R.J., Smith, E.E. (Eds.), The Psychology of Human Thought, Cambridge University Press, Cambridge, pp. 92–115.
Tsichritzis, D.C., Lochovsky, F.H., 1982. Data Models. Prentice-Hall, Englewood Cliffs, NJ.
Tversky, A., Kahneman, D., 1980. Causal schemas in judgments under uncertainty. In: Fishbein, M. (Ed.), Progress in Social Psychology, Erlbaum, Hillsdale, NJ, pp. 49–72.
Wilkin, N.E., 1996. An Empirical Investigation of Practical Reasoning in the Construction of Beliefs Regarding Medication by Arthritis Patients, Doctoral Dissertation, University of Maryland, Baltimore.
V. Ramesh is an Assistant Professor in the Department of Accounting and Information Systems, Kelley School of Business at Indiana University. His research interests are in heterogeneous databases, database modeling, and group support systems. His papers have been published in ACM Transactions on Information Systems, IEEE Expert, Information Systems and other journals. He received his Ph.D. in Business Administration (MIS) from the University of Arizona. He also holds an M.S. in Computer Science from the University of Iowa and a B.E. in Computer Science from the Birla Institute of Technology, Mesra (Ranchi), India.
Glenn J. Browne received his Ph.D. in MIS and Decision Sciences from the University of Minnesota. His research interests include systems development, semantic modeling, and basic decision-making processes. His papers have appeared in Management Science and other journals.