Evaluating the Usability of a Professional Modeling Tool Repurposed for Middle School Learning
Vanessa L. Peters • Nancy Butler Songer
Published online: 23 October 2012
© Springer Science+Business Media New York 2012
Abstract This paper reports the results of a three-stage
usability test of a modeling tool designed to support
learners’ deep understanding of the impacts of climate
change on ecosystems. The design process involved
repurposing an existing modeling technology used by
professional scientists into a learning tool specifically
designed for middle school students. To evaluate usability,
we analyzed students' task performance and task completion time as they worked on an activity with the repurposed
modeling technology. In stage 1, we conducted remote
testing of an early modeling prototype with urban middle
school students (n = 84). In stages 2 and 3, we used
screencasting software to record students’ mouse and
keyboard movements during collaborative think-alouds
(n = 22) and conducted a qualitative analysis of their peer
discussions. Taken together, the study findings revealed
two kinds of usability issues that interfered with students’
productive use of the tool: issues related to the use of data
and information, and issues related to the use of the
modeling technology. The study findings resulted in design
improvements that led to stronger usability outcomes and
higher task performance among students. In this paper, we
describe our methods for usability testing, our research
findings, and our design solutions for supporting students’
use of the modeling technology and use of data. The paper
concludes with implications for the design and study of
modeling technologies for science learning.
Keywords Usability · Computer modeling · Learning technologies · GIS · Climate change
Introduction
Recent policy documents in the United States, such as the Framework for K-12 Science Education (NRC 2012), point to a new sense of urgency around science, technology, engineering, and mathematics (STEM) education. These and other policy documents highlight the importance of integrating complex technologies in teaching students how scientific knowledge is developed and applied. Complex technologies are routinely used in most scientific domains, and they are a fundamental part of developing new forms of knowledge. Indeed, some of our most pressing scientific problems, such as predicting the impacts of global climate change on humans and other organisms, can only be studied with the use of sophisticated computer-based modeling systems. In STEM education, the use of models and modeled data is identified as a critical practice needed for constructing arguments and making predictions with scientific evidence (College Board 2009; NRC 2012).
To prepare a citizenry of informed science consumers for both personal and civic decision-making, STEM education must also provide opportunities for developing a deep understanding of fundamental knowledge in science and engineering. According to the National Research Council (NRC 2012), this fundamental knowledge consists of three dimensions that are inextricably linked with one another. These dimensions include the following: (1) Practices—the work used by scientists and engineers for investigating questions, solving problems, and building models and theories about the world; (2) Crosscutting concepts—large themes that have applications across many domains of science and engineering (e.g., cause and effect); and (3) Core disciplinary ideas—the fundamental principles and disciplinary knowledge in science and engineering (NRC 2012). In the Framework for K-12 Science Education, the integration of these three dimensions of knowledge is expressed in
performance expectations. Table 1 presents examples of two performance expectations that are associated with climate change and the history of the Earth (NRC 2012). Beside each example are the practices, disciplinary core ideas, and crosscutting concepts that are associated with that performance expectation.
While the framework presents an articulate goal for fostering three-dimensional fused knowledge, it provides little information on how to design curricular or technological resources that support the development of fused knowledge. Indeed, the document states that research on the design of such resources is beyond the scope of its work (NRC 2012).
The goal of the current study was to gather empirical data on the design of a modeling tool intended to foster fused knowledge associated with the impact of climate change on ecosystems. In particular, we focus on the following dimensions of science knowledge:

Practices: The use of models to make predictions and develop explanations about the effects of climate change on natural phenomena

Crosscutting concepts: Scale, proportion, and quantity

Core disciplinary ideas: Global climate change, weather, and climate
The study was driven by the following research questions:

1. What kinds of usability problems do students experience when working with a professional modeling tool that has been redesigned to foster fused knowledge about climate change impacts?

2. What are the most important design features to consider when repurposing a professional modeling tool that supports students' development of fused knowledge in science?
Our design resource was a professional Geographic Information System (GIS) tool called Lifemapper (http://www.lifemapper.org). Lifemapper provides predictive distribution modeling and modeled data that professional scientists use to predict the possible impacts of a changing climate on the environment. In the first phase of our work, the Lifemapper tool was redesigned and simplified to support middle school students' learning about the effects of global climate change on species distribution.
Our research study included three cycles of iterative design and usability evaluation of our middle school modeling tool. In this paper, we report our research findings and describe our interface design solutions for subsequent development work. We conclude with a discussion about the role of modeling technologies in supporting middle school students' development of fused knowledge.
Fused Knowledge About Climate Modeling
In STEM education, providing students with opportunities to experience the practice of modeling is important for preparing them to become informed decision-makers:

Science and engineering affect diverse domains—agriculture, medicine, housing, transportation … modeling … can help provide insight into the consequences of actions beyond the scale of place, time or system complexity that individual human judgments can readily encompass, thereby informing both personal and societal decision-making. (NRC 2012, p. 212)
Using models (both computerized and non-computerized) for making predictions about the natural world requires a sophisticated understanding of all three dimensions of science knowledge. For example, to make a prediction that answers the question, "Will a future climate scenario impact the distribution of the red squirrel?" students will require an understanding of the following knowledge dimensions:

Practice: Students must understand what a model is, and how modeled data are used as evidence when answering scientific questions

Crosscutting concepts: Students must understand scale and proportion when applying modeled data about species distribution and future scenarios
Table 1 Sample performance expectations focused on climate change and Earth history

HS.ESS-CC Climate change: e. Use global climate models in combination with other geologic data to predict and explain how human activities and natural phenomena affect climate, providing the scientific basis for planning for humanity's future needs
  Practices: Developing and using models
  Disciplinary core ideas: Weather and climate; global climate change
  Crosscutting concepts: Cause and effect; stability and change

MS.ESS-HE The history of Earth: b. Use models of the geological timescale in order to organize major events in Earth's history
  Practices: Developing and using models
  Disciplinary core ideas: The history of planet Earth
  Crosscutting concepts: Scale, proportion, and quantity
Core disciplinary ideas: Students must understand that if
Earth’s mean temperature continues to rise, the lives of humans
and other organisms will be affected in many different ways.
To truly appreciate the nature of scientific knowledge, it is
important that students experience the practice of science
while learning core disciplinary ideas and concepts (NRC
2012). With the use of advanced computer models, students
can develop an understanding of the consequences of human
activity on natural systems by analyzing climate data and
species distribution data that are represented in models.
Although modeling technologies provide a valuable tool for fostering fused knowledge about science, they often come with an associated cost. Learning a new technology can take a tremendous amount of time, consuming valuable class time that would otherwise be available for learning content. Science teachers, in particular, are already challenged to cover the curriculum content standards within their allotted class time. When learning technologies present usability problems, it becomes even more difficult to promote students' fused understanding of core disciplinary ideas and modeling. In industry, usability testing is standard, and significant resources are spent on obtaining user feedback from the target audience. In education, however, few studies have specifically investigated how K-12 students engage and interact with complex technologies (Shapiro 2008). More often than not, efforts to design computer-based learning tools have relied on developers' assumptions about how students engage with digital material (Nielsen 2010).
For many years, professional scientists have used models to visually represent data and make sound predictions in scientific experimentation. Models can be described as analogous representations of real-world systems that enable scientists to test hypotheses about non-observable phenomena (Lehrer and Schauble 2006). In STEM education, the use of computerized models allows students to participate more deeply in the practice of science and supports their deep understanding of core content and prediction making that are not feasible through other means (Baker and White 2003; Edelson and Gordin 1998; Kaplan and Black 2003). Justifying predictions and reasoning with data are a central aspect of using models to investigate real-world problems. The NRC (2012) recommends that "… more sophisticated types of models should increasingly be used across the grades, both in instruction and curriculum materials, as students progress through their science education" (pp. 3–9). In the classroom, educators have used computer-based models to teach students about various scientific phenomena including physical processes (e.g., molecular diffusion), Earth systems (e.g., plate tectonics), chemical reactions (e.g., photosynthesis), and celestial astronomy (e.g., planetary formation). Recent advances in computer technology and increases in computer processing capacity provide new opportunities for model-based scientific experiments that, if used appropriately, can be adapted for classroom instruction.
Few would dispute the potential of sophisticated modeling tools for teaching students science. However, to be productive for learning, the technology must be designed in a way that is appropriate and accessible for learners (Edelson and Gordin 1998). Using modeling technologies effectively for teaching science requires a carefully designed resource that is sensitive to the audience and the purpose of learning (Soloway et al. 1994). Modeling tools that project geospatial data may be particularly difficult for students to understand. The digital representation of maps, for example, can be confusing for students because maps often combine different data sets such as species distribution, temperature, and landscape features that must be analyzed together. To support a deep understanding of science, modeling resources need to be purpose-driven and appropriate for the topic (Taylor et al. 2003), and structured in ways that support a meaningful progression of science content knowledge (Alibrandi 2003). Learners also benefit when models are revisited and practiced multiple times in a curriculum (Baker 2005), and when their understanding about phenomena is transferred to new contexts and situations (Pallant and Tinker 2004; Schwarz and White 2005).
Researchers have uncovered several strategies for leveraging the affordances of modeling tools. Analogies, for example, can be useful for helping students make sense of new scientific ideas (Coll et al. 2005). Students' experimentation strategies can become more explicit by manipulating individual variables associated with modeled data (Varma and Linn 2011). Collins (2011) describes an epistemological framework for models that scaffolds students in forming hypotheses, identifying variables, and evaluating alternative explanations when constructing theories. Researchers have also used peer critique (Gobert and Pallant 2004) and three-dimensional representations (Keating et al. 2002) as mechanisms for fostering a more sophisticated understanding of models. More recently, researchers have explored how multi-agent-based models can support students' understanding of complex emergent processes by building on their prior knowledge about systems phenomena (Sengupta and Wilensky 2011). As modeling technologies continue to assume a central role in knowledge development, it will become even more important to develop coherent curricular materials and instructional strategies for working with complex modeling tools.
Usability for Learning
Like any technology, a modeling tool will only be purposeful
if it is usable for the target audience. Many K-12 students,
especially those in younger grades, lack the prior knowledge
and experience with abstract representations that are required
for making sense of a complicated modeling interface. Along
with conceptual challenges and infrastructure compatibility
issues, usability is considered to be one of the biggest challenges to using modeling technologies in the classroom (Edelson 2004).
Several researchers have investigated the role of usability in developing educational technologies. Crowther et al. (2004) report a case study that examines the relationship between usability testing and educational assessment, explaining how the former is essential to improving the quality and effectiveness of computer-supported learning and instruction. They point out that instructional environments that are ill-designed are unlikely to have much pedagogical value and that early usability tests can avoid time-consuming and costly modifications or replacements of technology. Unfortunately, however, most usability testing in educational studies is conducted on fully developed products, which lessens the potential for usability findings to improve the design and functionality of the learning technology (Johnson et al. 2007; Sullivan 1989).
Usability problems can compromise productive learning in a number of ways. Poorly designed interfaces are distracting and can interrupt the completion of a task, resulting in user frustration and anxiety with computers (Shneiderman and Bederson 2005). Research on computer use in the workplace has shown that up to fifty percent of workers' computer time is spent dealing with frustrating experiences (Ceaparu et al. 2004). When frustration happens on a regular basis, users waste large amounts of time and feel helpless and resigned in completing their tasks. When responding to frustrating situations, users are more likely to ask someone for help rather than consult a manual or online help guide (Ceaparu et al. 2004). Common causes of computer frustration include system error messages, slow Internet connections, application crashes, and unpredictable and unclear interface features (Lazar et al. 2006; Preece et al. 2002).
Frustration with computers is particularly detrimental to
student learning and can lead to maladaptive behavior that
lowers motivation and goal-oriented performance (Shorkey and
Crocker 1981), thus diminishing the time spent on relevant
tasks. To move forward with learning, students need to work
unencumbered in their activities in a state of uninterrupted
flow and sustained concentration (Csikszentmihalyi 1997).
Addressing usability issues is a clear opportunity for reducing
episodes of frustration and increasing student productivity as
the benefits are immediate and are applicable to most users
(Nielsen 2003). Moreover, even small interface design changes
have been shown to have a significant positive impact on the
usability and functionality of a technology (Benyon 1993) and
on the level of user enjoyment (Sim et al. 2006).
When evaluating the usability of a learning technology,
it is important that testing be performed on users from the
target audience. The ideal sample size for a usability study
has been investigated by a number of researchers. Virzi
(1992), for example, used Monte Carlo simulations and
data from three large usability evaluations to compute the
probability of problem-finding by participants. Virzi’s
study revealed that over 80 % of usability problems could
be identified in evaluations with only four to five study
participants and that additional participants did not increase
the likelihood of identifying additional problems. Furthermore, it was found that the first few participants in a study discovered the most serious usability problems.
Based on these and other study findings (e.g., Nielsen and
Landauer 1993; Turner et al. 2006), usability experts agree
that an iterative study design with a small sample size is
ideal for usability evaluations.
Research Context
The modeling tool evaluated in this study was developed as part of a larger effort funded by the National Science Foundation to develop dynamic, age-appropriate modeling tools, curriculum units, and assessment instruments to foster fused knowledge focused on the impacts of global climate change. The work began with the development of a learning progression (Songer 2006; Songer et al. 2009) that articulates the sequence of core disciplinary ideas (e.g., weather and climate, Earth history), crosscutting concepts (e.g., scale, proportion, and quantity; system models), and practices (e.g., use of models to make predictions) that should be emphasized and revisited throughout a multi-week curricular unit.
Conceptualizing the knowledge for our learning progressions and curricular units has been a particularly challenging part of the development process because of the need to bring together core disciplinary ideas from different subject areas into a coherent sequence of activities focused on climate change and its impacts. For example, our learning progression includes science topics that are often taught separately in units of chemistry, biodiversity, ecology, and atmospheric science. Additionally, we have found that although many states address climate change and climate change impacts within their state standards, the core disciplinary ideas are poorly understood by students and are rarely, if ever, discussed relative to each other (e.g., Michigan GLCEs v. 1.09). Our curricular decisions involved both the selection of the dimensions of the knowledge to emphasize in our activities (the what) and our plans for presenting them (the how) so they may be best understood by teachers and students.
In addition to identifying the essential knowledge to
include in our learning progression, we were challenged by
the strategic simplification of the material so that it was well
suited for middle school students. Prioritizing knowledge
was important so that students could have enough time on
each core idea to support deep understandings about climate
change and its impacts. Our challenge was compounded by
the amount of relevant material, as many scientific ideas are
interrelated and deeply connected to the topic of climate
change. It would be undesirable, for example, for a climate
change biology curriculum to not include the carbon cycle,
and yet to deeply examine the nature of carbon involves a
foray into chemistry that may not be within the scope of a
particular curricular program. We were also challenged by
the complexity of the science. For example, the driving
factors used by the Intergovernmental Panel on Climate
Change (IPCC) for determining the various future climate
scenarios include economic, scientific, and sociocultural
dimensions (see IPCC 2007). Our decisions are, therefore,
both dynamic and ongoing, involving much negotiation
between scientists, educational researchers, and the technology specialists who were involved in developing our modeling tool.
Repurposing a Professional Modeling Tool
An important part of curriculum development involved repurposing the GIS system Lifemapper (Beach et al. 2002) into a web-based learning environment called SPECIES (Students Predicting the Effects of Climate In EcoSystems). Central to the SPECIES technology is a learner-focused predicted distribution modeling (PDM) tool for teaching students about the possible impacts of climate change on species distribution. PDM is an innovative GIS-based method used to produce current and predictive maps of where elements (i.e., species, ecological elements) are likely to occur and not occur under different predicted climate scenarios. Modeling species distribution data is itself a complex process, made all the more so by the complicated GIS technologies that are typically used by scientists (e.g., SAGA, Quantum). Since learning these programs can be daunting even for adults, we needed to repurpose Lifemapper into a modeling tool that would be compatible with sophisticated geospatial data, yet simple enough to be easily navigated by middle school students. Working with a technology developer, we created a customized mapping tool using the platform MapServer and the open source JavaScript code from OpenLayers. The result was an interactive Google-like map application that could be used with authentic species distribution data and the IPCC future climate scenarios.
To produce climate models, Lifemapper leverages massive caches of online geospatial occurrence data to create maps for current and future predictions of different animal species. Lifemapper does this by combining data on species observations (i.e., known locations of where animals live) with environmental data (e.g., precipitation and temperature data) to create map models that predict where animals can live in the future based on where they are known to live now.
Repurposing these data into digestible forms required modifications to the database of species and niche distributions to support the future climate scenarios developed by the IPCC that are used in our curriculum. To format the data for our modeling tool, the Lifemapper system required web server refactoring, database redesign, geographic information system scripting, and new server hardware for supporting our data needs. Although data preparation is a behind-the-scenes part of the modeling tool development process, it is a critical one given that importing and exporting GIS data are among the biggest challenges to using modeling systems in schools (Edelson 2004).
Method
For our usability evaluation, we used an iterative study
design based on recommendations from experts in the fields
of computer engineering and interface design (e.g., Bury
1984; Nielsen 1993). In an iterative study design, usability
evaluation coincides with different stages of the technology
development process. In test cases of usability studies, an
iterative evaluation approach was shown to improve overall
usability by 165 %, with an average increase of 38 % per
iteration (Nielsen 1993). Unlike a pilot test, an iterative study
design is not a small-scale replication of a larger research
project, but a stand-alone study based on cycles of evaluation
and refinement of a technology innovation.
Participants and Data Sources
We evaluated the SPECIES modeling tool in three stages of the development process. In stage 1, students completed an online survey after completing a series of tasks using a modeling tool prototype (n = 84). In stage 2, we collected screencast video data using collaborative think-alouds as students worked with a fully functional version of the SPECIES modeling tool (n = 8). In stage 3, after a redesign of the interface, we collected additional screencast data from a second round of student think-alouds (n = 14) and performed a qualitative analysis of students' think-aloud discussions. The study procedure and data analysis for stages 1–3 are described in further detail in their respective sections.
Materials
We used the screencasting software ScreenFlow™ to record students' moment-to-moment computer interactions as they completed tasks with the modeling tool. Screencasts are video recordings of a computer monitor; they record a user's mouse and keyboard actions and keep time logs of all on-screen activity. The software uses the computer's built-in video camera and microphone for recording, which enabled us to capture students' facial expressions and dialog as they completed tasks with the modeling tool. The screencast videos provided us with an in-depth view of how students were engaging with the technology, and identified where in the task
sequence students were experiencing difficulties. Figure 1 shows a screenshot of ScreenFlow's™ interface, including the video annotation tool (circled) and audio timeline used for data analysis.
Stage 1: Remote Testing of Modeling Prototype
The goal of the first stage of usability testing was to gain a
broad overview of how students worked with data overlays in
the modeling interface. To achieve this, it was necessary to
obtain feedback from a larger number of users than is typically
collected in usability evaluations. Evaluating a technology
early in the development cycle is particularly valuable, since
early feedback still has potential for altering the course of the
design (Nielsen 2003). In our case, the design of the modeling
tool would inform the design of the associated curricular
activities that were simultaneously under development.
One of our first design decisions involved the selection
of data layers to include in the modeling tool. Although it
was desirable to let students choose from a wide range of
data sets (e.g., animal species, biomes, and cover, etc.), we
felt too many options might be distracting for students and
impede their completion of the activity. To learn how
students would interact with multiple data layers, we
developed a prototype map activity using Google maps and
MapServer. Since the Lifemapper GIS data were not yet
available, we created mock data layers for the purpose of
usability testing (see Fig. 2). These data layers were not
based on authentic geospatial data, but rather were simple
colored overlays that were deliberately designed in a way
that required students to access different areas of the map.
In the modeling interface, the colored data layers were
labeled to represent three different tree types: deciduous,
coniferous, and pine forest.
Procedure
To recruit student participants, we sent an email to several
middle school teachers inviting them to participate in the
study. Four teachers replied and agreed to implement the
online modeling activity in their classrooms.
Before beginning the modeling activity, students were
presented with written instructions for navigating the map
interface. Using the modeling prototype, students were
asked to complete four tasks using the tree distribution data
layers in the map application. Students were then asked to
answer five questions about the distribution of trees in the
United States (all states were clearly labeled on the map).
For example, one question asked students, "Can you find coniferous trees in Texas?" To answer this question correctly, students had to first click on the appropriate data
layer in the interface (i.e., coniferous trees) and then use
the navigation and panning features to see if the coniferous
tree data layer covered all or part of Texas. After answering
the tree questions, students were presented with four Likert
scale questions that asked them about their experience
using the modeling tool. The Likert scale questions asked
students to rate their agreement to a statement using a five-point scale (strongly agree, agree, neutral, disagree, and strongly disagree).

Fig. 1 Screenshot of screencasting software used for student think-alouds
Analysis
A total of 84 students completed the activity using the modeling tool prototype. The data from all four classes were compiled and analyzed for task completion and accuracy based on students' responses to the tree distribution questions. Each question was coded as either "correct" or "incorrect"; questions that were left blank were coded as "not attempted."
All data were tabulated and summarized with descriptive
statistics. Since there was no reliable method for determining
how long students spent on each task, stage 1 did not include
an analysis of task completion time.
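As a concrete illustration of this tabulation step, the short sketch below computes per-task percentages from coded responses. The column names and sample codes are hypothetical, not taken from the study data.

```python
# Illustrative tabulation of coded stage 1 responses with pandas.
# The records are hypothetical; each row is one student's code
# ("correct", "incorrect", or "not attempted") on one task.
import pandas as pd

records = pd.DataFrame({
    "task": ["task1", "task1", "task2", "task2", "task3", "task3"],
    "code": ["correct", "incorrect", "correct", "not attempted",
             "incorrect", "correct"],
})

# Percentage of each response code within each task.
summary = (records.groupby("task")["code"]
           .value_counts(normalize=True)
           .mul(100).round(1))
print(summary)
```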
Results
Students’ answers to the tree distribution questions suggested
they experienced some usability challenges when using the
map data layers. Since the questions were straightforward
with only one correct response, we expected a relatively high
accuracy across the four tasks. However, as shown in Fig. 3,
the percentage of correct responses was low, with a combined
mean of 55.29 % (SD = 10.08) for all four classes.
Students’ task performance level was not consistent with
their reported experiences of using the modeling tool.
When asked about the instructions for navigating the map
interface, the majority of students (82.1 %) agreed or
strongly agreed that the navigation instructions were easy
to understand. In addition, approximately three-quarters of
students (70.2 %) agreed or strongly agreed to having no
problems when finding the information they needed to
answer the tree distribution questions. Based on these early
findings, we decided that in subsequent development work,
we would limit the number of available data sets that
students could work with at any one time to no more than
four.
Stage 2: Alpha Testing of Modeling Tool
In stage 2, we evaluated a fully functional version of the modeling tool that used the authentic species distribution data provided by Lifemapper. Although remote testing in stage 1 enabled us to obtain feedback from a large number of students, it did not provide us with a detailed account of how students were interacting with the modeling technology. In order to obtain a more in-depth view of usability, we conducted collaborative think-alouds as a method for capturing students' use of the modeling tool.
Our approach to collaborative think-alouds was similar to
Miyake’s (1986) method of constructive interaction. In this
method, two or more students are observed as they work
together to solve a problem or complete a task, without
interruption from the researcher. This approach has several
advantages over traditional think-alouds, where individuals
are asked to verbalize their thinking while working through
some activity. First, students are not required to voice their
decision-making processes, an action that is unnatural for
most people and especially for children. Second, asking a peer for help is something that most students are inclined to do anyway. Third, the dialog that takes place between students as they work together can provide valuable insight into their collaborative interactions.

Fig. 2 Mock data overlays used for remote testing of modeling tool prototype

Fig. 3 Task completion and accuracy in stage 1 of usability testing
One disadvantage to this approach is the possibility of the "Hawthorne effect" (Landsberger 1958), where participants alter their behavior on account of their awareness of being observed. Originally, we intended to take observation notes during the collaborative think-alouds; however, after the first think-aloud began, it became apparent that students were uncomfortable with a researcher in the room. Since it was important to document authentic interaction with the modeling tool, we decided that students could work alone in the classroom with their peer. Since we used the built-in video cameras in the laptops (as opposed to an external video camera mounted on a tripod) as a recording device, the study environment was less invasive, which further increased the candidness of students' behavior. This approach to data collection provided a clear window into how students were engaging with the modeling tool.
Procedure
Eight students from an urban public middle school participated in the first round of collaborative think-alouds. Students worked on individual laptops in a separate room, away from their classmates and the teacher. The researcher introduced the activity and provided students with written instructions for using the modeling tool. To reduce the amount of background knowledge needed in order to complete the tasks, we used simple environmental data layers (i.e., temperature and precipitation) in the modeling activity. Working in pairs, each student was asked to complete the following three tasks using the modeling tool:
Task 1: Using the drawing tool, select any city in the
United States and draw a circle around it. What is the
average annual temperature of your chosen city?
Task 2: Using the drawing tool, circle the area on the map where the average annual temperature is between 3 and 18 °C.
Task 3: Using the drawing tool, circle the area on the
map where the average annual precipitation is between 60
and 240 cm.
Analysis
Task Performance
We measured students' task performance by analyzing their mouse and keyboard movements as recorded in the screencast videos. When working on a task, students had to perform certain computer actions to complete the task correctly, such as clicking on a particular data layer or drawing a circle on the map. For this reason, task performance was based on video evidence of students having completed the necessary computer actions. For each task, we assigned one of the following scores: "correct" (the student performed all the necessary actions and completed the task correctly), "incorrect" (the student performed the necessary actions, but did not complete the task correctly), and "not attempted" (the student did not perform any actions at all). To provide an example, for task 2 to be scored correct, the student had to click the temperature layer, select the drawing tool, and draw a line around the colored area on the map where the average annual temperature was between 3 and 18 °C. However, if the student had selected the precipitation layer instead of the temperature layer, or if they selected the temperature layer but circled the wrong temperature range, then task 2 would be scored as incorrect. The task would be scored as not attempted had the student not performed any actions toward completing the task at all.
Task Completion Time
We measured task completion time by analyzing the screencast time log, which keeps track of all onscreen activity in hours, minutes, and seconds. Using the time logs, it was possible to determine how much time students spent performing actions that were both necessary and unnecessary for completing a task. Necessary actions were those that were relevant to the task, such as panning or navigating the map interface, turning on a data layer, or using the drawing tool. Unnecessary actions were considered non-relevant to the task, for example, opening and closing the browser window, using other applications on the computer, or surfing the Internet. The following measures were used when calculating students' task completion time:
Computer session: The total amount of time a student
spent at the computer, regardless of what he or she was
doing.
Task start time: The first instance when a student used the mouse or keyboard to perform an action that was relevant to completing the task (e.g., clicking a data layer or drawing a circle on the map interface).
Task end time: The last instance when a student used the
mouse or keyboard to perform an action that was relevant
to completing the task. Task end time was also signaled
when a student completed the task (either correctly or
incorrectly), when they moved onto another task, or when
they left the computer or classroom.
Total task time: Total task time was the sum of all relevant onscreen activity that a student performed toward completing the three tasks. When analyzing the time log, any periods of computer inactivity (i.e., no onscreen movement) or interruptions to relevant activity of 5 or more seconds were excluded from the total task time calculation. In other words, total task time was not determined from a single start and end time, but from summing multiple intervals of relevant onscreen activity. The only exception was when students were reading the activity instructions or when they were discussing a task with their peer. Both of these actions were considered as being relevant toward the completion of a task.
Task completion time (%): Task completion time was the percentage of overall computing time that was spent performing actions that were relevant to the tasks. Task completion time was calculated by dividing the total amount of time spent completing tasks (Total task time) by the total amount of time a student spent at the computer (Computer session).
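To make these measures concrete, the following sketch shows one way the calculation could be implemented. The function names, the handling of the 5-s gap threshold, and the sample time log are our own illustration of the rules described above, not code from the study.

```python
# A minimal sketch of the task-time measures described above, assuming
# the screencast time log has been reduced to timestamps (in seconds)
# of task-relevant mouse/keyboard actions. Names and values are
# hypothetical.

def total_task_time(action_times, max_gap=5):
    """Sum intervals of relevant onscreen activity.

    Consecutive actions separated by less than `max_gap` seconds are
    treated as one continuous burst of relevant activity; pauses of
    5 or more seconds are excluded, mirroring the rule above.
    """
    total = 0
    for earlier, later in zip(action_times, action_times[1:]):
        gap = later - earlier
        if gap < max_gap:
            total += gap
    return total

def task_completion_pct(action_times, session_seconds):
    """Task completion time (%): relevant time over the whole session."""
    return 100.0 * total_task_time(action_times) / session_seconds

# Example: relevant actions clustering into three bursts during an
# 11 min 13 s (673 s) computer session.
log = [30, 32, 35, 36, 40, 120, 123, 124, 300, 302, 304, 306]
print(round(task_completion_pct(log, session_seconds=673), 1))  # 3.0
```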
Results
Task Performance
Students' task performance results suggested they were still experiencing some usability problems with the modeling tool. For all three tasks, both the completion and accuracy rates were relatively low (Fig. 4). For example, in task 1, 37.5 % of students were successful in circling a city with the drawing tool and noting the correct average annual temperature range for that city. For tasks 2 and 3, none of the students could successfully use the drawing tool to circle either the correct average temperature or precipitation area for a specific range. Moreover, after working on task 1, only 12.5 % of students went on to attempt task 2, and no students went on to attempt task 3.
Data from the screencast videos revealed that students' task performance and completion rates were associated with their use of the interactive tools in the modeling interface. In task 1, which involved circling a city and reporting the average annual temperature, students had no problems locating a city on the map. They did, however, have difficulties using the polygon tool for drawing a circle around the city. Polygon tools are typical in GIS systems, but they can be awkward to use as they rotate around a central pivot point and require coordinated mouse movements of clicking and dragging. Additionally, a polygon tool creates shapes that include shading, which increased the complexity of the map interface by adding another color layer. Students also struggled with the checkbox feature when selecting the temperature and precipitation data layers for viewing on the map. Geospatial projections, including those used by our modeling tool, involve large data files that typically take several seconds to render on a digital map. If the temperature and precipitation data were not immediately visible to students after clicking the checkbox, they would keep clicking repeatedly. Students did not realize that by doing this they were essentially turning the data layers on and off, which only further slowed the loading of the map projections and added to their frustration.
Task Completion Time
Students spent an average of 28.52 % (SD = 23.51 %) of their total computer time on actions that were relevant to completing the tasks (Table 2). Tasks that took longer to complete were those that had higher performance scores. For example, students spent an average of 3 min and 11 s working on task 1 (where 62.5 % of students attempted the task), 14 s on task 2 (where 12.5 % of students attempted the task), and zero time on task 3 (where no student attempted the task). The percentage of total computer time spent on non-relevant task actions was 71.48 %.
As a result of these findings, we made some design modifications to the modeling interface before continuing with usability testing. Specifically, we changed the checkbox tool in the data panel to large on/off slider buttons similar to those used in iPhones. In addition, we switched the polygon tool for a more intuitive free-form drawing tool that could outline an area on the map without adding any shading. Figure 5 highlights the interactive modeling tools that were used in the alpha version in stage 2 of usability testing; Fig. 6 highlights the changes that were made to improve usability in the beta version that was evaluated in stage 3.
Stage 3: Beta Testing of Modeling Tool
Procedure
In stage 3, we conducted a second round of collaborative
think-alouds following the same procedure described in
stage 2. Fourteen different students completed the same
three tasks using the redesigned interactive modeling tools.
Analysis
Task Performance and Completion Time
As in stage 2, we measured task performance and completion
time using the screencast videos and time logs collected
during student think-alouds. Both task performance and task completion time were analyzed using the same metrics that were described in stage 2.

Fig. 4 Task completion and accuracy in stage 2 of usability testing
Qualitative Analysis of Think-Aloud Discussions
To triangulate our study findings, we conducted a qualitative
analysis of students’ think-aloud discussions. To determine
whether there were additional usability problems not related
to the modeling interface, we developed a coding scheme
that reflected the use of models as a practice and the use of
modeled data for making inferences about climate (Table 3).
After transcribing the discussions, we segmented the data
into meaning units before completing several iterations of
coding. The unit of analysis, the meaning unit, was defined as
a sentence or a set of adjacent sentences that were part of the
same thought or idea (Miles and Huberman 1994). Students’
think-aloud discussions from stages 2 and 3 were included in
the qualitative analysis.
The coding rubric was organized into two categories of observed errors that represented the practice and core disciplinary idea related to the use of models to explain and predict the impacts of climate change on ecosystems. The first category, called Modeling Technology, refers to the use or misuse of the online modeling features, such as the drawing tool and map zooming feature. We categorize the knowledge needed to overcome this error type as largely associated with the practice of scientific modeling (i.e., the correct use of models). The second category, called Modeling Data, refers to the use or misuse of the data represented in the model; for example, being able to determine a specific region based on inferring data values between legend intervals. We categorize the knowledge needed to overcome this error type as including both content knowledge and the ability to interpret data. Two independent researchers coded the discussion transcripts, with an inter-rater reliability rate of 0.91 (Cohen's Kappa).
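For illustration, agreement of this kind can be computed with a standard library routine; the sketch below uses scikit-learn's cohen_kappa_score, and the two code sequences are hypothetical stand-ins for the codes each researcher assigned to the same meaning units.

```python
# Minimal sketch of an inter-rater reliability check with Cohen's kappa.
# Each list element is the code one researcher assigned to a meaning
# unit; the sequences below are hypothetical, not the study's data.
from sklearn.metrics import cohen_kappa_score

coder_a = ["TEC-MZOOM", "DAT-RNG", "DAT-LGD", "TEC-DRTOOL", "DAT-AVGT"]
coder_b = ["TEC-MZOOM", "DAT-RNG", "DAT-LGD", "TEC-DRTOOL", "DAT-RNG"]

# Kappa corrects raw percent agreement for agreement expected by chance.
print(cohen_kappa_score(coder_a, coder_b))
```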
The coding process was guided by the question: What is the
source or context of the usability problem students faced when
working with the modeling tool? Prior to coding, we established
criteria for code assignment for the purpose of inter-rater reliability. When considering a piece of data, we asked ourselves
the following: What specific action is the student performing
when experiencing the usability problem? What is he or she
trying to accomplish in that action? Since the screencasting
software recorded students’ dialog, it was possible to connect
students’ mouse and keyboard movements to precise moments
during their think-aloud discussions. This allowed us to analyze
the discourse at specific time points in the task sequence,
including those times when students were performing actions
that were non-relevant to completing the tasks.
The purpose of creating two coding categories was to identify usability problems that could interfere with students' fused knowledge development about the use of models for predicting the impacts of climate change. Early in the coding process, it became evident that there was much overlap between codes, and there were few instances when only one code category was relevant to the data. As a result of this overlap, we adjusted our coding protocol to allow data segments to be coded with one or more codes from either of the code categories (Modeling Technology and Modeling Data). Table 4 presents examples of coded data that were assigned codes from both categories.
After the discussion transcripts were coded, we looked for meaningful relationships among the assigned codes.
Table 2 Students’ task completion time in stage 2 (in min:s)
Min. Max. Mean SD
Task 1 time 00:00 09:47 03:11 03:08
Task 2 time 00:00 01:28 00:14 00:31
Task 3 time 00:00 00:00 00:00 00:00
Total task time 00:00 12:19 03:34 03:54
Computer session 05:55 21:15 11:13 05:44
Task completion time (%) 00.00 57.96 28.52 23.51
Fig. 5 Interactive tools in alpha version of modeling interface (stage 2)
For example, was the DAT-RNG (data range) code assigned more or less frequently with certain codes than with others? And if so, what could this tell us? This line of inquiry was ultimately not helpful, as it did not provide insight into the source or context of a potential usability problem. It did, however, highlight the fact that codes from the Modeling Data category were assigned much more frequently than codes from the Modeling Technology category.
Results
Task Performance
After modifying the interactive tools, there was a significant improvement in both task completion and accuracy for task 1 (p = .041, Fisher's exact test) and task 3 (p = .018, Fisher's exact test) (Fig. 7). Unlike in the previous stage of usability testing, the majority of stage 3 students (78.6 %) were successful in locating a city on the map and circling it with the free-form drawing tool (task 1). When working on task 2 (circling a temperature range on the map), 21.4 % of students circled the correct average temperature range, 28.6 % circled an incorrect average temperature range, and 50 % did not attempt the task. Task 3 performance was similar, with the same number of students (28.6 %) correctly and incorrectly circling the specified average precipitation range, and 42.8 % not attempting the task at all.
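As an illustration of how such a comparison can be run, the sketch below applies a Fisher's exact test to a 2 × 2 table (correct vs. not correct, by stage). The cell counts are hypothetical placeholders; the paper reports percentages only, and the exact contingency tables behind the reported p values are not given.

```python
# Hedged sketch of a Fisher's exact test comparing task outcomes across
# stages 2 and 3. Rows are stages; columns are (correct, not correct).
# The counts are hypothetical, not reconstructed from the paper.
from scipy.stats import fisher_exact

table = [[3, 5],    # stage 2 (n = 8), hypothetical
         [11, 3]]   # stage 3 (n = 14), hypothetical
odds_ratio, p_value = fisher_exact(table, alternative="two-sided")
print(p_value)
```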
Task Completion Time
Students in stage 3 were able to complete the same tasks in
less time than students in stage 2, a difference which was
statistically significant t(20) = 2.265, p = .035 (Table 5).
Students who completed task 1 did so quickly, spending an
average of 33 s working on the task (as opposed to 3 min,
11 s as in stage 2). Half of the students (50 %) completed task 2 in an average of 48 s, a 37.5 percentage-point increase in task attempts over the previous stage. On average, students in stage 3 spent 83.64 %
(SD = 9.19) of their total computer time on actions that were
not relevant to completing the modeling tasks.
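A two-sample t-test of this kind can be run directly from summary statistics; the sketch below applies scipy's ttest_ind_from_stats to the task completion time percentages reported in Tables 2 and 5, assuming a pooled-variance test (consistent with the reported df of 20). Because the text does not spell out exactly which time measure entered the reported t(20) = 2.265, this illustrates the method rather than reproduces the value.

```python
# Sketch of the stage 2 vs. stage 3 comparison, computed from the
# summary statistics in Tables 2 and 5 (task completion time, %).
# Pooled variance is assumed; df = 8 + 14 - 2 = 20 as reported.
from scipy.stats import ttest_ind_from_stats

t, p = ttest_ind_from_stats(mean1=28.52, std1=23.51, nobs1=8,   # stage 2
                            mean2=16.36, std2=9.19, nobs2=14,   # stage 3
                            equal_var=True)
print(t, p)
```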
Qualitative Analysis of Think-Aloud Discussions
Coding for errors resulted in the identification of two task objectives that were related to using both the modeling technology and the modeling data. These were navigating to a specific area within the map and reporting the information that was represented in the map. These task objectives were the underlying cause of many of students' usability problems and were relevant to fused knowledge development about the practice of modeling and the ideas and data that are important for understanding climate change. When analyzing the think-aloud discussions and screencast videos together, the task objectives could be linked to the crosscutting concept of scale, proportion, and quantity, which students needed to apply in order to complete the modeling tasks correctly.
Fig. 6 Interactive tools in beta version of modeling interface (stage 3)
Table 3 Coding rubric for identifying the source of usability problems

Modeling technology codes
  TEC: map zooming (TEC-MZOOM)
  TEC: map panning (TEC-MPAN)
  TEC: page nav. (TEC-PGNAV)
  TEC: data display (TEC-DATDIS)
  TEC: drawing tool (TEC-DRTOOL)

Modeling data codes
  DAT: average temp. (DAT-AVGT)
  DAT: average precip. (DAT-AVGP)
  DAT: map scaling (DAT-MSCL)
  DAT: data range (DAT-RNG)
  DAT: data legend (DAT-LGD)
Required Crosscutting Concept: Scale, Proportion, and Quantity
Developing and using models to make predictions about climate requires an understanding of the data and information that are represented in the model.
In this study, the environmental data used in the modeling tool were projected as ranges of average temperature and precipitation. Before beginning the tasks, students were given a short reading that explained these ideas at a reading level appropriate for middle school students. In the modeling interface, this information was presented in data legends, where the temperature legend included 11 ranges (-30 through 29.9 °C) and the precipitation legend included 9 (0 through 449.9 cm). Although students were given this information before beginning the tasks, they still had difficulties understanding that the colors in the data layers represented ranges of average annual temperature and precipitation, and not single values. For example, in task 2, students were asked to circle the area on the map that was between 3 and 18 °C. When they could not find those exact same numbers on the temperature legend (which was in increments of 5 °C), they often became frustrated and abandoned the activity.
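To make the mismatch concrete, the sketch below checks the queried range of 3–18 °C against legend bins in 5 °C increments. The bin edges are our own approximation for illustration; the paper does not list the tool's actual legend boundaries.

```python
# Sketch of the legend-interval mismatch described above. The task asks
# for 3-18 °C, but the legend exposes only fixed bins, so students had
# to infer which bins overlap the requested range. Bin edges here are
# approximate 5 °C increments, not the tool's actual legend.
edges = list(range(-30, 30, 5)) + [29.9]   # -30, -25, ..., 25, 29.9
bins = list(zip(edges, edges[1:]))

def overlapping_bins(lo, hi, bins):
    """Return the legend bins that overlap the queried range."""
    return [(a, b) for a, b in bins if a < hi and b > lo]

# The endpoints 3 and 18 appear in no bin label, which is what confused
# students; the queried range actually spans four legend bins.
print(overlapping_bins(3, 18, bins))
```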
The quantity of the projected ranges for the temperature and precipitation data also appeared to confuse students. For example, when circling the average annual precipitation range of 60–240 cm, students could not understand how it could cover the entire west side of the United States. Other times, students' unfamiliarity with weather and climate data appeared to cause them to rely on the colors of average temperature ranges instead of the actual data values. In these situations, students would base the temperature of their city on their association of "hot" and "cold" color regions—where blue shades represented cold regions and red shades represented hot regions.
A limited understanding about map scaling was another cause of students' usability problems. When locating a city on the map, students tended to zoom in far too closely, clicking the zoom button five or six times when only one or two clicks were necessary. When they were zoomed in that far, the temperature and precipitation data layers would typically display only one or two colors because of the large scale at which students were viewing the map. When viewing the model at this scale, students became confused and disoriented, not understanding why a city or other landscape feature was no longer in view. In addition, students did not seem to understand that the same map detail they could view at a small scale would no longer be visible if they zoomed in closely on the same area. When this happened, students typically panned to different locations
Table 4 Examples of single and multi-coded data segments

"I don't know what I'm doing. I just want my map tiny again!" (TEC-MZOOM)

"This map is dumb—it doesn't show anything. I picked Chicago but the dot went in Wisconsin" (TEC-MPAN, DAT-MSCL)

"Between what? 60 and 240? 60, so it has to be yellow?" (DAT-AVGP, DAT-RNG, DAT-LGD)

"I went on temperature [in the data panel] and clicked it. Nothing happened though" (TEC-DATDIS, DAT-AVGT)

"The shape isn't saving. Stop! Ugh, whatever!" (TEC-DRTOOL)

"What color am I supposed to pick? Detroit is cold, so I guess I'll pick blue?" (DAT-AVGT, DAT-RNG, DAT-LGD)

"Where the heck is the 'Next' button?" (TEC-PGNAV)

"I don't know how to do this, it says draw a line on the map that's between 3 and 18 °C. Well, there is no 3 and 18! It just says 0–4, 5–9, 10–19!" (DAT-AVGT, DAT-RNG, DAT-LGD)

"My city disappeared and now I'm in the ocean! Why am I in the ocean!?" (TEC-MZOOM, DAT-MSCL)

"Why are these circles so big? It's like, half the country" (DAT-AVGP, DAT-RNG)
Fig. 7 Task completion and accuracy in stage 3 of usability testing
Table 5 Students’ task completion time in stage 3 (in min:s)
Min. Max. Mean SD
Task 1 time 00:00 01:40 00:33 00:26
Task 2 time 00:00 02:30 00:48 00:55
Task 3 time 00:00 02:57 01:15 00:54
Total task time 00:00 05:46 02:30 01:26
Computer session 12:47 20:35 15:19 02:48
Task completion time (%) 00.00 33.95 16.36 09.19
on the map without first reducing the scale, which failed to
help them become reoriented with their positioning within
the map.
Discussion
Our goal in this study was to identify students' usability problems when working with a technology tool that was redesigned to foster middle school students' learning of fused knowledge focused on modeling climate change impacts. Our findings indicate that students struggled with the use of models in two ways—with the practice of modeling itself and with the use of modeling data for interpreting information about climate. From the screencasting videos, it was evident that students became frustrated when the interactive tools did not work as expected, and that students spent a significant amount of their computer time reacting to errors made with the modeling tool. From the analysis of students' think-aloud discussions, we discovered that many of the usability problems were closely tied to the crosscutting concept of scale, proportion, and quantity.
The design improvements made to SPECIES as part of our evaluation resulted in higher usability and performance outcomes for all three modeling tasks. These improvements also increased students' productive learning time by reducing the amount of time it took for students to complete the tasks. Despite the gains in both usability and performance, we recognize that more could be done to support students' use of a professional modeling tool that has been redesigned to foster fused knowledge about the potential impacts of climate change on ecosystems.
Strategies for Repurposing a Professional Modeling Tool
In science, models are used to construct explanations and
make predictions about natural systems and phenomenon. In
this study, we discovered that many students lacked knowl-
edge of a crosscutting concept that is important for under-
standing climate change and its impacts. To analyze and
interpret climate data, students need to apply the concepts of
averages, range, scale, and proportion. Applying knowledge
of a crosscutting concept can be surprisingly complex.
Rescaling large-scale and small-scale maps, for example, is
not an intuitive process and is a common cause of miscon-
ceptions among students (Nelson et al. 1992; Uttal 2000).
Some of the difficulty with scale is related to the way size is
referenced (i.e., the words ‘‘small’’ and ‘‘large’’) when
describing the scale ratio of the maps. To illustrate, a large-
scale map shows a smaller area in more detail because the
representative fraction for the ratio of map distance to land
distance is larger (e.g., 1/24,000). On a digital map, one can
easily increase the scale by zooming in to see the map in
greater detail. Conversely, a small-scale map shows a larger
area in less detail because the representative fraction for the
ratio of map to land distance is comparatively smaller (e.g.,
1/240,000). Thus, a large-scale map shows a smaller area
with more detail (when zooming in), and a small-scale map
shows a larger area with less detail (when zooming out). This
example shows the depth of a crosscutting concept (i.e., scale
and proportion) that is part of the fused knowledge about
using models (the practice) to explain and predict the impacts
of climate change (core disciplinary idea).
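To make the scale arithmetic concrete, the following minimal sketch computes how much ground distance one map millimetre represents at the two representative fractions discussed above. The numbers are illustrative only and are not taken from the SPECIES interface.

```typescript
// Illustrative sketch of representative fraction vs. ground distance.
// The scale denominators mirror the 1/24,000 and 1/240,000 examples above;
// nothing here is taken from the SPECIES implementation.

/** Ground distance (in metres) represented by one millimetre on the map. */
function groundMetresPerMapMm(scaleDenominator: number): number {
  // 1 mm on the map corresponds to `scaleDenominator` mm on the ground.
  return scaleDenominator / 1000;
}

const largeScale = 24_000;  // 1/24,000: larger fraction, smaller area, more detail
const smallScale = 240_000; // 1/240,000: smaller fraction, larger area, less detail

console.log(groundMetresPerMapMm(largeScale)); // 24 m of ground per map mm
console.log(groundMetresPerMapMm(smallScale)); // 240 m of ground per map mm

// Zooming in raises the representative fraction (1/24,000 > 1/240,000), so each
// screen millimetre covers less ground: a smaller area shown in more detail.
```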
To prepare students for STEM careers and promote
scientific literacy, educational technology tools and cur-
ricular resources must co-support the three dimensions of
knowledge in students’ science learning. Previous work
(Quintana et al. 2004) has synthesized the design features
of educational software tools into a scaffolding design
framework for supporting science inquiry. Here, we discuss
design strategies specific to repurposing a professional
modeling tool, emphasizing the role of usability for sup-
porting students’ knowledge development about science
and engineering practices, crosscutting concepts, and core
disciplinary ideas.
Sense-making Scaffolds for Modeling Data
Using technology to model systems and system processes
often requires the use of complex forms of data. When using
a computer model, both the model and the technology
become increasingly complex when multiple data sets are
used. Even a single data set, such as temperature or precip-
itation, can combine several different types of information. When
these and other data sets are projected together into future
climate models, the amount and complexity of concepts and
disciplinary ideas increase considerably. Problems can arise
when the representation of data (e.g., ranges of average
annual temperature or precipitation) cannot be simplified
within the model, and students fail to see how multiple ele-
ments fit together when interpreting data.
How can technology designers reconcile the cognitive
demands of complex data with the need for improved
usability? One strategy is to develop
scaffolds that help students make sense of complex
forms of data. Given the complexity of computer-based
modeling, a first step might involve an inventory of the
underlying crosscutting concepts in the data. Breaking data
down into its constituent parts, either by building in defi-
nitions or adding a supplemental activity, can help students
make sense of complex data representations (Quintana
et al. 2004). However, because of time constraints or other
factors, it may not always be possible or practical to
develop technology scaffolds that give individual treatment
to crosscutting concepts. To address multiple concepts
simultaneously, scaffolds could take the form of a video or
other media that introduce students to the underlying
properties of the data that are important for developing fused
knowledge about modeling the impacts of climate change.
Creating a video, animation, or other resource that uses the
language and terminology used in the curriculum is another
strategy for helping students make sense of complex forms
of data.
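One way of picturing the ‘‘building in definitions’’ strategy is a simple lookup structure that pairs each data layer with a plain-language definition of the crosscutting concept it depends on. The sketch below is hypothetical: the layer names and wording are invented for illustration and are not taken from SPECIES.

```typescript
// A minimal sketch of a sense-making scaffold: each data layer carries a
// student-facing definition of the crosscutting concept it depends on.
// Layer identifiers and definition text are hypothetical.

interface LayerScaffold {
  layerId: string;           // data layer shown in the model
  concept: string;           // underlying crosscutting concept
  studentDefinition: string; // plain-language definition surfaced in the interface
}

const scaffolds: LayerScaffold[] = [
  {
    layerId: 'avg-annual-temperature',
    concept: 'average and range',
    studentDefinition:
      'Each color band shows a RANGE of yearly averages, not a single reading.',
  },
  {
    layerId: 'avg-annual-precipitation',
    concept: 'scale and proportion',
    studentDefinition:
      'Circle size is proportional to rainfall; compare sizes, not exact edges.',
  },
];

/** Look up the definition to display when a student toggles a layer on. */
function definitionFor(layerId: string): string | undefined {
  return scaffolds.find((s) => s.layerId === layerId)?.studentDefinition;
}

// Example: definitionFor('avg-annual-temperature') returns the range definition.
```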
Adopt Design Features of Ubiquitous Technologies
An important part of improving usability is reducing the
amount of time it takes for users to learn how to use the
technology (Preece et al. 2002). One approach to decreasing
technology learning time is to increase the familiarity of the
interface. Complex interfaces can be made more usable for
students by adopting design features of technologies that
may be familiar to students. Students using the computer
conference Pepper (Hewitt and Brett 2011), for example, can
use a ‘‘like’’ button similar to the one in Facebook to indicate
their agreement and support for their classmates’ ideas.
Other learning environments (Manouselis et al. 2011) have
used the recommender system popularized by Amazon to
suggest activities to students based on their prior perfor-
mance on activities in the learning environment.
When designing SPECIES, the use of OpenLayers
resulted in a map interface that was similar to Google
Maps, a map application common to many web sites. In the
data panel of our modeling tool, using sliding on/off but-
tons like the ones used on the iPhone improved usability by
clearly indicating the view status of the temperature and
precipitation layers. In addition, by switching the center-
pivoting polygon tool for a simple line-drawing tool typical
in most drawing applications, students were able to make
significantly more annotations on the map interface. These
findings demonstrate that even minor design modifications
can have a positive impact on how students use a sophis-
ticated modeling technology.
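A minimal sketch of this ‘‘familiar interface’’ strategy appears below, written against the current OpenLayers (‘ol’) npm package, which postdates the version used to build SPECIES; the element IDs and layer choice are hypothetical. It wires an iPhone-style on/off switch to a map layer's visibility.

```typescript
// Sketch: an OpenLayers map whose data layer is driven by a familiar
// on/off toggle. Uses the modern 'ol' package API (an assumption; SPECIES
// was built on an earlier OpenLayers release). Element IDs are hypothetical.

import Map from 'ol/Map';
import View from 'ol/View';
import TileLayer from 'ol/layer/Tile';
import OSM from 'ol/source/OSM';

const temperatureLayer = new TileLayer({ source: new OSM(), visible: true });

const map = new Map({
  target: 'map', // <div id="map"> in the host page
  layers: [temperatureLayer],
  view: new View({ center: [0, 0], zoom: 4 }),
});

// A checkbox styled as a sliding switch makes the layer's view status
// explicit, mirroring a control students already know from phones.
const toggle = document.getElementById('temperature-toggle') as HTMLInputElement;
toggle.addEventListener('change', () => {
  temperatureLayer.setVisible(toggle.checked);
});
```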
Align Data Sets to Learning Goals
To practice scientific modeling, students require a level of
technological proficiency for analyzing and interpreting the
data in the model. In stages 1 and 2 of our evaluation,
students often expressed frustration when resizing maps on
the modeling interface. While difficulties using the navi-
gation tool would seem the likely cause, we realized that
students were facing an additional impediment. Using a
map-based modeling tool requires knowledge of spatial
and proportional reasoning (Bausmith and Leinhardt 1998),
concepts that many of our students appeared to lack. This
deficit may explain why students were confused when they
overclicked the zoom button and found themselves in the
ocean—they did not understand that by increasing the scale
of the model, they were simultaneously decreasing the
viewable area of the map.
Analyzing and interpreting data for making predictions
about climate change impacts is an important component in
many of our curricular activities. Our study findings suggest
that students’ lack of knowledge about concepts associated
with temperature and precipitation data contributed to stu-
dents’ usability problems with the modeling tool. In most
online computer models, increasing the volume of available
data will also increase the demand and load time on the
computer (Edelson 2004). As the number of data sets
increases, so does the complexity of the technology. Limiting
data sets to include only those that address the learning goals
will contribute toward a more simplified and usable modeling
interface. In some cases, transforming dynamic map models
into a static image (thus eliminating the need for map scaling)
may be all that is necessary to strategically simplify a model.
Imposing task boundaries in this way can support students’
inquiry learning by scaffolding the data management aspect of
the modeling system (Quintana et al. 2004). This can also
increase students’ productive learning time by eliminating
interface features that are tangential to the learning goals.
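One way to picture this strategy is to configure the model from an explicit list of goal-aligned layers so that tangential data sets are never loaded. The sketch below uses hypothetical layer names chosen for illustration only.

```typescript
// Sketch: aligning data sets to learning goals by filtering the available
// layers against an explicit whitelist before the map is built.
// All layer identifiers are hypothetical.

const LEARNING_GOAL_LAYERS = [
  'avg-annual-temperature',
  'avg-annual-precipitation',
];

const ALL_AVAILABLE_LAYERS = [
  'avg-annual-temperature',
  'avg-annual-precipitation',
  'soil-moisture',
  'wind-speed',
  'elevation',
];

// Layers outside the learning goals are dropped up front, simplifying the
// interface and reducing data volume and load time on the computer.
const layersToLoad = ALL_AVAILABLE_LAYERS.filter((id) =>
  LEARNING_GOAL_LAYERS.includes(id),
);

console.log(layersToLoad); // ['avg-annual-temperature', 'avg-annual-precipitation']
```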
It should be noted that not all modeling design features
can be modified. There are often
certain design parameters that researchers and technology
specialists must work within when repurposing a profes-
sional modeling tool. Certain navigational controls, such as
map zooming and panning features, are often standardized
and unchangeable in packaged web applications. How-
ever, the movement toward free access to geospatial
technologies (e.g., Open Source Geospatial Foundation) is
increasing the GIS functionality options that are available
to developers. This has implications for educational
researchers working with modeling tools, since the appli-
cation can be customized to be compatible with different
forms of data and designed in a manner that is appropriate
for a specific target audience. Although this study focused
on GIS systems, the applicability and relevance of the
findings—supporting students’ learning of tridimensional
knowledge—can be generalized to the broader context of
computer-supported science learning.
Looking forward, it is important to realize that challenges
to using computer-based models are likely to increase.
Advancements in technology have made it possible for sci-
entists to work with complex digital data sets, many of which
were unavailable less than a decade ago. This presents a
challenge for researchers and educators wishing to incorporate
fused knowledge about scientific modeling into the curricu-
lum. GIS systems and other modeling tools will also become
more sophisticated as the technology becomes more powerful.
These developments highlight the importance of repurposed
modeling tools like SPECIES that limit functionality to focus
students’ attention on what is important—analyzing and
interpreting modeling data for making predictions and con-
structing explanations about complex ideas in science.
Conclusion
This study evaluated the usability of a professional modeling
technology that was repurposed to support middle school
students’ learning about fused knowledge associated with
climate change impacts. Usability testing identified a number
of problematic features in the interface design of the modeling
tool. More importantly, it identified a number of data-related
usability problems related to the crosscutting concepts that
needed to be applied to the modeling data. Evaluating the
usability of a repurposed modeling technology is important for
establishing classroom efficacy and for having broader edu-
cational impacts through product scalability.
The findings from this study have implications for cur-
riculum developers and science educators. New technologies
bring with them new opportunities for developing a deep
understanding of science. In this study, many students were
unable to apply the concepts of scale, proportion, and
quantity when working with the temperature and precipitation
data, concepts that are necessary for analyzing and interpreting
climate data. Subsequent versions of SPECIES will include
dedicated scaffolds for supporting students’ fused knowl-
edge development about the use of models for predicting the
impacts of climate change. To improve the learning experi-
ence with a repurposed professional modeling tool, we rec-
ommend early usability testing for identifying the aspects of
tridimensional knowledge that require scaffolding. We also
recommend the use of screencast videos as a method for
capturing authentic data from student think-alouds. Finally,
we encourage additional work that increases the range and
quality of empirical information available for supporting
fused knowledge development that includes practices,
crosscutting concepts, and core disciplinary ideas in science.
Acknowledgments This research is based upon work supported by
the National Science Foundation under Grant No. 0918590. Any
opinions, findings, and conclusions or recommendations expressed in
this material are those of the author(s) and do not necessarily reflect
the views of the National Science Foundation.
References
Alibrandi M (2003) GIS in the classroom: using geographic
information systems in social studies and environmental science.
Heinemann, Portsmouth
Baker TR (2005) Internet-based GIS mapping in support of K-12
education. Prof Geogr 57(1):44–50
Baker TR, White SH (2003) The effects of G.I.S. on students'
attitudes, self-efficacy, and achievement in middle school
science classrooms. J Geogr 102(6):243–254
Bausmith JM, Leinhardt G (1998) Middle-school students’ map
construction: understanding complex spatial displays. J Geogr
97(3):93–107
Beach JH, Stewart AM, Vorontsov GY (2002) Mapping distributed
life with distributed computation. In: Proceedings of the 22nd
annual Esri international user conference. Esri Publications,
Redlands
Benyon D (1993) Adaptive systems: a solution to usability problems.
User Model User Adap Inter 3(1):65–87
College Board (2009) Science: college board standards for college
success. The College Board Press, New York
Bury KF (1984) The iterative development of usable computer
interfaces. In: Proceedings of IFIP INTERACT'84 international
conference on human–computer interaction, London, UK,
Sept 4–7, pp 743–748
Ceaparu I, Lazar J, Bessiere K, Robinson J, Shneiderman B (2004)
Determining causes and severity of end-user frustration. Int J
Hum Comput Interact 17(3):333–356
Coll R, France B, Taylor I (2005) The role of models and analogies in
science education: implications from research. Int J Sci Edu
27(2):183–198
Collins A (2011) A study of expert theory formation: the role of
different model types and domain frameworks. In: Khine MS,
Saleh IM (eds) Models and modeling: cognitive tools for
scientific enquiry. Springer, New York, pp 23–40
Crowther MS, Keller CC, Waddoups GL (2004) Improving the quality
and effectiveness of computer-mediated instruction through
usability evaluations. British J Edu Technol 35(3):289–303
Csikszentmihalyi M (1997) Finding flow: the psychology of engage-
ment with everyday life. Basic Books, New York
Edelson DC (2004) Designing GIS software for education: a
workshop report for the GIS community. The Geographic Data
in Education Initiative at Northwestern University
Edelson DC, Gordin D (1998) Visualization for learning: a framework
for adapting scientists' tools. Comput Geosci 24(7):607–616
Gobert JD, Pallant A (2004) Fostering students’ epistemologies of
models via authentic model-based tasks. J Sci Educ Technol
13(1):7–22
Hewitt J, Brett C (2011) Engaging learners in the identification of key
ideas in complex online discussions. In: Spada H, Stahl G,
Miyake N, Law N (eds) Proceedings of the computer-supported
collaborative learning conference, Hong Kong, July 4–8, pp 960–961
IPCC (2007) Climate change 2007: synthesis report. In: Core writing team,
Pachauri RK, Reisinger A (eds) Contribution of working groups I, II
and III to the fourth assessment report of the intergovernmental panel
on climate change. IPCC, Geneva, Switzerland
Johnson RR, Salvo MJ, Zoetewey MW (2007) User-centered technology
in participatory culture: two decades ‘‘Beyond a narrow conception of
usability testing''. IEEE Trans Prof Commun 50(4):320–332
Kaplan DE, Black JB (2003) Mental models and computer-based
scientific inquiry learning: effects of mechanism cues on
adolescent representation and reasoning about causal systems.
J Sci Educ Technol 12(4):483–493
Keating T, Barnett MG, Barab SA, Hay KE (2002) The virtual
solar system project: developing conceptual understanding of
astronomical concepts through building three-dimensional com-
putational models. J Sci Educ Technol 11(3):261–275
Landsberger HA (1958) Hawthorne revisited: management and
the worker, its critics, and developments in human relations in
industry. Cornell University Press, Ithaca
Lazar J, Jones A, Shneiderman B (2006) Workplace user frustration
with computers: an exploratory investigation of the causes and
severity. Behav Inf Technol 25(3):239–251
Lehrer R, Schauble L (2006) Scientific thinking and science literacy.
In: Damon W, Lerner R, Renninger KA, Sigel IE (eds)
Handbook of child psychology, vol 4, 6th edn. Wiley, Hoboken
Manouselis N, Drachsler H, Vuorikari R, Hummel H, Koper R
(2011) Recommender systems in technology enhanced learning.
In: Ricci F, Rokach L, Shapira B, Kantor PB (eds) Recommender
systems handbook. Springer, New York, pp 387–415
Michigan Department of Education. K-7 Content grade level expectations
in science, version 1.09. Retrieved from: http://michigan.gov/mde/
Miles MB, Huberman AM (1994) Qualitative data analysis: an expanded
sourcebook, 2nd edn. Sage Publications, Thousand Oaks
Miyake N (1986) Constructive interaction and the iterative process of
understanding. Cognitive Science 10:151–177
National Research Council (2012) A framework for K-12 science
education: practices, crosscutting concepts, and core ideas.
National Academies Press, Washington
Nelson BD, Aron RH, Francek MA (1992) Clarification of selected
misconceptions in physical geography. J Geogr 91(2):76–80
Nielsen J (1993) Iterative user-interface design. IEEE Comput 26(11):
32–41
Nielsen J (2003) Usability 101: an introduction to usability. Jakob
Nielsen's Alertbox. Article retrieved 2 Nov 2010, from:
http://www.useit.com/alertbox/20030825.html
Nielsen J (2010) Children's websites: usability issues in designing for
kids. Jakob Nielsen’s Alertbox. Article retrieved 2 Nov 2010,
from: http://www.useit.com/alertbox/children.html
Nielsen J, Landauer TK (1993) A mathematical model of the finding
of usability problems. In: Proceedings of ACM INTERCHI'93
conference. ACM Press, Amsterdam, pp 206–213
Pallant A, Tinker RF (2004) Reasoning with atomic-scale molecular
dynamic models. J Sci Educ Technol 13(1):51–66
Preece J, Rogers Y, Sharp H (2002) Interaction design: beyond
human-computer interaction. Wiley, New York
Quintana C, Reiser BJ, Davis EA, Krajcik J, Fretz E, Duncan RG, Kyza
E, Edelson D, Soloway E (2004) A scaffolding design framework
for software to support science inquiry. J Learn Sci 13(3):337–386
Schwarz CV, White BY (2005) Metamodeling knowledge: developing
students' understanding of scientific modeling. Cogn Instr 23(2):
165–205
Sengupta P, Wilensky U (2011) Lowering the learning threshold:
multi-agent-based models and learning electricity. In: Khine MS,
Saleh IM (eds) Models and modeling: cognitive tools for
scientific enquiry. Springer, New York, pp 141–171
Shapiro AM (2008) Hypermedia design as learner scaffolding. Edu
Tech Res Dev 56:29–44
Shneiderman B, Bederson BB (2005) Maintaining concentration to
achieve task completion. In: Proceedings of the 2005 designing
user experience conference. AIGA Press, San Francisco
Shorkey CT, Crocker SB (1981) Frustration theory: a source of unifying
concepts for generalist practice. Soc Work 26(5):374–379
Sim G, MacFarlane S, Read J (2006) All work and no play: measuring
fun, usability and learning in software for children. Comput
Educ 46(3):235–248
Soloway E, Guzdial M, Hay KE (1994) Learner-centered design: the
challenge for HCI in the 21st century. Interactions 1(2):36–48
Songer NB (2006) BioKIDS: an animated conversation on the
development of complex reasoning in science. In: Sawyer RK (ed)
Cambridge handbook of the learning sciences. Cambridge
University Press, New York, pp 355–369
Songer NB, Kelcey B, Gotwals AW (2009) How and when does
complex reasoning occur? Empirically driven development of a
learning progression focused on complex reasoning about
biodiversity. J Res Sci Teach 46(6):610–631
Sullivan P (1989) Beyond a narrow conception of usability testing.
IEEE Trans Prof Commun 32(4):256–264
Taylor I, Barker M, Jones A (2003) Promoting mental model building
in astronomy education. Int J Sci Edu 25(10):1205–1225
Turner CW, Lewis JR, Nielsen J (2006) Determining usability test sample
size. In: Karwowski W (ed) International encyclopedia of ergonomics
and human factors, vol 4, 6th edn. CRC Press, Boca Raton
Uttal DH (2000) Seeing the big picture: map use and the development
of spatial cognition. Dev Sci 3(3):247–264
Varma K, Linn MC (2011) Using interactive technology to support
students’ understanding of the greenhouse effect and global
warming. J Sci Educ Technol. doi:10.1007/s10956-011-9337-9
Virzi RA (1992) Refining the test phase of usability evaluation: how
many subjects is enough? Hum Factors 34(4):457–468