Evaluating the Usability of a Professional Modeling Tool Repurposed for Middle School Learning
Vanessa L. Peters • Nancy Butler Songer
Published online: 23 October 2012
© Springer Science+Business Media New York 2012
Abstract This paper reports the results of a three-stage
usability test of a modeling tool designed to support
learners’ deep understanding of the impacts of climate
change on ecosystems. The design process involved
repurposing an existing modeling technology used by
professional scientists into a learning tool specifically
designed for middle school students. To evaluate usability,
we analyzed students' task performance and task completion time as they worked on an activity with the repurposed
modeling technology. In stage 1, we conducted remote
testing of an early modeling prototype with urban middle
school students (n = 84). In stages 2 and 3, we used
screencasting software to record students’ mouse and
keyboard movements during collaborative think-alouds
(n = 22) and conducted a qualitative analysis of their peer
discussions. Taken together, the study findings revealed
two kinds of usability issues that interfered with students’
productive use of the tool: issues related to the use of data
and information, and issues related to the use of the
modeling technology. The study findings resulted in design
improvements that led to stronger usability outcomes and
higher task performance among students. In this paper, we
describe our methods for usability testing, our research
findings, and our design solutions for supporting students’
use of the modeling technology and use of data. The paper
concludes with implications for the design and study of
modeling technologies for science learning.
Keywords Usability · Computer modeling · Learning technologies · GIS · Climate change
Introduction
Recent policy documents in the United States, such as the Framework for K-12 Science Education (NRC 2012), point to a new sense of urgency around science, technology, engineering, and mathematics (STEM) education. These and other policy documents highlight the importance of integrating complex technologies in teaching students how scientific knowledge is developed and applied. Complex technologies are routinely used in most scientific domains, and they are a fundamental part of developing new forms of knowledge. Indeed, some of our most pressing scientific problems, such as predicting the impacts of global climate change on humans and other organisms, can only be studied with the use of sophisticated computer-based modeling systems. In STEM education, the use of models and modeled data is identified as a critical practice needed for constructing arguments and making predictions with scientific evidence (College Board 2009; NRC 2012).
To prepare a citizenry of informed science consumers for both personal and civic decision-making, STEM education must also provide opportunities for developing a deep understanding of fundamental knowledge in science and engineering. According to the National Research Council (NRC 2012), this fundamental knowledge consists of three dimensions that are inextricably linked with one another. These dimensions include the following: (1) Practices—the work used by scientists and engineers for investigating questions, solving problems, and building models and theories about the world; (2) Crosscutting concepts—large themes that have applications across many domains of science and engineering (e.g., cause and effect); and (3) Core disciplinary ideas—the fundamental principles and disciplinary knowledge in science and engineering (NRC 2012). In the Framework for K-12 Science Education, the integration of these three dimensions of knowledge is expressed in
performance expectations. Table 1 presents examples of two performance expectations that are associated with climate change and the history of the Earth (NRC 2012). Beside each example are the practices, disciplinary core ideas, and crosscutting concepts that are associated with that performance expectation.
While the framework presents an articulate goal for fostering three-dimensional fused knowledge, it provides little information on how to design curricular or technological resources that support the development of fused knowledge. Indeed, the document states that research on the design of such resources is beyond the scope of its work (NRC 2012).
The goal of the current study was to gather empirical data on the design of a modeling tool intended to foster fused knowledge associated with the impact of climate change on ecosystems. In particular, we focus on the following dimensions of science knowledge:

Practices: The use of models to make predictions and develop explanations about the effects of climate change on natural phenomena

Crosscutting concepts: Scale, proportion, and quantity

Core disciplinary ideas: Global climate change, weather, and climate
The study was driven by the following research questions:

1. What kinds of usability problems do students experience when working with a professional modeling tool that has been redesigned to foster fused knowledge about climate change impacts?

2. What are the most important design features to consider when repurposing a professional modeling tool that supports students' development of fused knowledge in science?
Our design resource was a professional Geographic Information System (GIS) tool called Lifemapper (http://www.lifemapper.org). Lifemapper provides predictive distribution modeling and modeled data that professional scientists use to predict the possible impacts of a changing climate on the environment. In the first phase of our work, the Lifemapper tool was redesigned and simplified to support middle school students' learning about the effects of global climate change on species distribution.
Our research study included three cycles of iterative design and usability evaluation of our middle school modeling tool. In this paper, we report our research findings and describe our interface design solutions for subsequent development work. We conclude with a discussion about the role of modeling technologies in supporting middle school students' development of fused knowledge.
Fused Knowledge About Climate Modeling
In STEM education, providing students with opportunities to experience the practice of modeling is important for preparing them to become informed decision-makers:

Science and engineering affect diverse domains—agriculture, medicine, housing, transportation … modeling … can help provide insight into the consequences of actions beyond the scale of place, time or system complexity that individual human judgments can readily encompass, thereby informing both personal and societal decision-making. (NRC 2012, p. 212)
Using models (both computerized and non-computerized) for making predictions about the natural world requires a sophisticated understanding of all three dimensions of science knowledge. For example, to make a prediction that answers the question, "Will a future climate scenario impact the distribution of the red squirrel?" students will require an understanding of the following knowledge dimensions:

Practice: Students must understand what a model is, and how modeled data are used as evidence when answering scientific questions

Crosscutting concepts: Students must understand scale and proportion when applying modeled data about species distribution and future scenarios
Table 1 Sample performance expectations focused on climate change and Earth history

HS.ESS-CC Climate change: e. Use global climate models in combination with other geologic data to predict and explain how human activities and natural phenomena affect climate, providing the scientific basis for planning for humanity's future needs
  Practices: Developing and using models
  Disciplinary core ideas: Weather and climate; global climate change
  Crosscutting concepts: Cause and effect; stability and change

MS.ESS-HE The history of Earth: b. Use models of the geological timescale in order to organize major events in Earth's history
  Practices: Developing and using models
  Disciplinary core ideas: The history of planet Earth
  Crosscutting concepts: Scale, proportion, and quantity
Core disciplinary ideas: Students must understand that if
Earth’s mean temperature continues to rise, the lives of humans
and other organisms will be affected in many different ways.
To truly appreciate the nature of scientific knowledge, it is
important that students experience the practice of science
while learning core disciplinary ideas and concepts (NRC
2012). With the use of advanced computer models, students
can develop an understanding of the consequences of human
activity on natural systems by analyzing climate data and
species distribution data that are represented in models.
Although modeling technologies provide a valuable tool for fostering fused knowledge about science, they often come with an associated cost. Learning a new technology can take a tremendous amount of time, consuming valuable class time that would otherwise be available for learning content. Science teachers, in particular, are already challenged to cover the curriculum content standards within their allotted class time. When learning technologies present usability problems, it becomes even more difficult to promote students' fused understanding of core disciplinary ideas and modeling. In industry, usability testing is standard, and significant resources are spent on obtaining user feedback from the target audience. In education, however, few studies have specifically investigated how K-12 students engage and interact with complex technologies (Shapiro 2008). More often than not, efforts to design computer-based learning tools have relied on developers' assumptions about how students engage with digital material (Nielsen 2010).
For many years, professional scientists have used models to visually represent data and make sound predictions in scientific experimentation. Models can be described as analogous representations of real-world systems that enable scientists to test hypotheses about non-observable phenomena (Lehrer and Schauble 2006). In STEM education, the use of computerized models allows students to participate more deeply in the practice of science and supports their deep understanding of core content and prediction making that are not feasible through other means (Baker and White 2003; Edelson and Gordin 1998; Kaplan and Black 2003). Justifying predictions and reasoning with data are a central aspect of using models to investigate real-world problems. The NRC (2012) recommends that "… more sophisticated types of models should increasingly be used across the grades, both in instruction and curriculum materials, as students progress through their science education" (pp. 3–9). In the classroom, educators have used computer-based models to teach students about various scientific phenomena including physical processes (e.g., molecular diffusion), Earth systems (e.g., plate tectonics), chemical reactions (e.g., photosynthesis), and celestial astronomy (e.g., planetary formation). Recent advances in computer technology and increases in computer processing capacity provide new opportunities for model-based scientific experiments that, if used appropriately, can be adapted for classroom instruction.
Few would dispute the potential of sophisticated modeling tools for teaching students science. However, to be productive for learning, the technology must be designed in a way that is appropriate and accessible for learners (Edelson and Gordin 1998). Using modeling technologies effectively for teaching science requires a carefully designed resource that is sensitive to the audience and the purpose of learning (Soloway et al. 1994). Modeling tools that project geospatial data may be particularly difficult for students to understand. The digital representation of maps, for example, can be confusing for students because maps often combine different data sets such as species distribution, temperature, and landscape features that must be analyzed together. To support a deep understanding of science, modeling resources need to be purpose-driven and appropriate for the topic (Taylor et al. 2003), and structured in ways that support a meaningful progression of science content knowledge (Alibrandi 2003). Learners also benefit when models are revisited and practiced multiple times in a curriculum (Baker 2005), and when their understanding about phenomena is transferred to new contexts and situations (Pallant and Tinker 2004; Schwarz and White 2005).
Researchers have uncovered several strategies for leveraging the affordances of modeling tools. Analogies, for example, can be useful for helping students make sense of new scientific ideas (Coll et al. 2005). Students' experimentation strategies can become more explicit by manipulating individual variables associated with modeled data (Varma and Linn 2011). Collins (2011) describes an epistemological framework for models that scaffolds students in forming hypotheses, identifying variables, and evaluating alternative explanations when constructing theories. Researchers have also used peer critique (Gobert and Pallant 2004) and three-dimensional representations (Keating et al. 2002) as mechanisms for fostering a more sophisticated understanding of models. More recently, researchers have explored how multi-agent-based models can support students' understanding of complex emergent processes by building on their prior knowledge about systems phenomena (Sengupta and Wilensky 2011). As modeling technologies continue to assume a central role in knowledge development, it will become even more important to develop coherent curricular materials and instructional strategies for working with complex modeling tools.
Usability for Learning
Like any technology, a modeling tool will only be purposeful
if it is usable for the target audience. Many K-12 students,
especially those in younger grades, lack the prior knowledge
and experience with abstract representations that are required
for making sense of a complicated modeling interface. Along
with conceptual challenges and infrastructure compatibility
issues, usability is considered to be one of the biggest challenges to using modeling technologies in the classroom (Edelson 2004).
Several researchers have investigated the role of usability in developing educational technologies. Crowther et al. (2004) report a case study that examines the relationship between usability testing and educational assessment, explaining how the former is essential to improving the quality and effectiveness of computer-supported learning and instruction. They point out that instructional environments that are ill-designed are unlikely to have much pedagogical value and that early usability tests can avoid time-consuming and costly modifications or replacements of technology. Unfortunately, however, most usability testing in educational studies is conducted on fully developed products, which lessens the potential for usability findings to improve the design and functionality of the learning technology (Johnson et al. 2007; Sullivan 1989).
Usability problems can compromise productive learning in a number of ways. Poorly designed interfaces are distracting and can interrupt the completion of a task, resulting in user frustration and anxiety with computers (Shneiderman and Bederson 2005). Research on computer use in the workplace has shown that up to fifty percent of workers' computer time is spent dealing with frustrating experiences (Ceaparu et al. 2004). When frustration happens on a regular basis, users waste large amounts of time and feel helpless and resigned in completing their tasks. When responding to frustrating situations, users are more likely to ask someone for help rather than consult a manual or online help guide (Ceaparu et al. 2004). Common causes of computer frustration include system error messages, slow Internet connections, application crashes, and unpredictable and unclear interface features (Lazar et al. 2006; Preece et al. 2002).
Frustration with computers is particularly detrimental to
student learning and can lead to maladaptive behavior that
lowers motivation and goal-oriented performance (Shorkey and
Crocker 1981), thus diminishing the time spent on relevant
tasks. To move forward with learning, students need to work
unencumbered in their activities in a state of uninterrupted
flow and sustained concentration (Csikszentmihalyi 1997).
Addressing usability issues is a clear opportunity for reducing
episodes of frustration and increasing student productivity as
the benefits are immediate and are applicable to most users
(Nielsen 2003). Moreover, even small interface design changes
have been shown to have a significant positive impact on the
usability and functionality of a technology (Benyon 1993) and
on the level of user enjoyment (Sim et al. 2006).
When evaluating the usability of a learning technology,
it is important that testing be performed on users from the
target audience. The ideal sample size for a usability study
has been investigated by a number of researchers. Virzi
(1992), for example, used Monte Carlo simulations and
data from three large usability evaluations to compute the
probability of problem-finding by participants. Virzi’s
study revealed that over 80 % of usability problems could
be identified in evaluations with only four to five study
participants and that additional participants did not increase
the likelihood of identifying additional problems. Furthermore, it was found that the first few participants in a study discovered the most serious usability problems.
Based on these and other study findings (e.g., Nielsen and
Landauer 1993; Turner et al. 2006), usability experts agree
that an iterative study design with a small sample size is
ideal for usability evaluations.
Research Context
The modeling tool evaluated in this study was developed as part of a larger effort funded by the National Science Foundation to develop dynamic, age-appropriate modeling tools, curriculum units, and assessment instruments to foster fused knowledge focused on the impacts of global climate change. The work began with the development of a learning progression (Songer 2006; Songer et al. 2009) that articulates the sequence of core disciplinary ideas (e.g., weather and climate, Earth history), crosscutting concepts (e.g., scale, proportion, and quantity; system models), and practices (e.g., use of models to make predictions) that should be emphasized and revisited throughout a multi-week curricular unit.
Conceptualizing the knowledge for our learning progressions and curricular units has been a particularly challenging part of the development process because of the need to bring together core disciplinary ideas from different subject areas into a coherent sequence of activities focused on climate change and its impacts. For example, our learning progression includes science topics that are often taught separately in units of chemistry, biodiversity, ecology, and atmospheric science. Additionally, we have found that although many states address climate change and climate change impacts within their state standards, the core disciplinary ideas are poorly understood by students and are rarely, if ever, discussed relative to each other (e.g., Michigan GLCEs v. 1.09). Our curricular decisions involved both the selection of the dimensions of the knowledge to emphasize in our activities (the what) and our plans for presenting them (the how) so they may be best understood by teachers and students.
In addition to identifying the essential knowledge to
include in our learning progression, we were challenged by
the strategic simplification of the material so that it was well
suited for middle school students. Prioritizing knowledge
was important so that students could have enough time on
each core idea to support deep understandings about climate
change and its impacts. Our challenge was compounded by
the amount of relevant material, as many scientific ideas are
interrelated and deeply connected to the topic of climate
change. It would be undesirable, for example, for a climate
change biology curriculum to not include the carbon cycle,
and yet to deeply examine the nature of carbon involves a
foray into chemistry that may not be within the scope of a
particular curricular program. We were also challenged by
the complexity of the science. For example, the driving
factors used by the Intergovernmental Panel on Climate
Change (IPCC) for determining the various future climate
scenarios include economic, scientific, and sociocultural
dimensions (see IPCC 2007). Our decisions are, therefore,
both dynamic and ongoing, involving much negotiation
between scientists, educational researchers, and the technology specialists who were involved in developing our modeling tool.
Repurposing a Professional Modeling Tool
An important part of curriculum development involved repurposing the GIS system Lifemapper (Beach et al. 2002) into a web-based learning environment called SPECIES (Students Predicting the Effects of Climate In EcoSystems). Central to the SPECIES technology is a learner-focused predicted distribution modeling (PDM) tool for teaching students about the possible impacts of climate change on species distribution. PDM is an innovative GIS-based method used to produce current and predictive maps of where elements (i.e., species, ecological elements) are likely to occur and not occur under different predicted climate scenarios. Modeling species distribution data is itself a complex process, made all the more so by the complicated GIS technologies that are typically used by scientists (e.g., SAGA, Quantum). Since learning these programs can be daunting even for adults, we needed to repurpose Lifemapper into a modeling tool that would be compatible with sophisticated geospatial data, yet simple enough to be easily navigated by middle school students. Working with a technology developer, we created a customized mapping tool using the platform MapServer and the open source JavaScript code from OpenLayers. The result was an interactive Google-like map application that could be used with authentic species distribution data and the IPCC future climate scenarios.
To produce climate models, Lifemapper leverages massive caches of online geospatial occurrence data to create maps for current and future predictions of different animal species. Lifemapper does this by combining data on species observations (i.e., known locations of where animals live) with environmental data (e.g., precipitation and temperature data) to create map models that predict where animals can live in the future based on where they are known to live now.
Repurposing these data into digestible forms required modifications to the database of species and niche distributions to support the future climate scenarios developed by the IPCC that are used in our curriculum. To format the data for our modeling tool, the Lifemapper system required web server refactoring, database redesign, geographic information system scripting, and new server hardware for supporting our data needs. Although data preparation is a behind-the-scenes part of the modeling tool development process, it is a critical one given that importing and exporting GIS data are among the biggest challenges to using modeling systems in schools (Edelson 2004).
Method
For our usability evaluation, we used an iterative study
design based on recommendations from experts in the fields
of computer engineering and interface design (e.g., Bury
1984; Nielsen 1993). In an iterative study design, usability
evaluation coincides with different stages of the technology
development process. In test cases of usability studies, an
iterative evaluation approach was shown to improve overall
usability by 165 %, with an average increase of 38 % per
iteration (Nielsen 1993). Unlike a pilot test, an iterative study
design is not a small-scale replication of a larger research
project, but a stand-alone study based on cycles of evaluation
and refinement of a technology innovation.
Participants and Data Sources
We evaluated the SPECIES modeling tool in three stages of the development process. In stage 1, students completed an online survey after completing a series of tasks using a modeling tool prototype (n = 84). In stage 2, we collected screencast video data using collaborative think-alouds as students worked with a fully functional version of the SPECIES modeling tool (n = 8). In stage 3, after a redesign of the interface, we collected additional screencast data from a second round of student think-alouds (n = 14) and performed a qualitative analysis of students' think-aloud discussions. The study procedure and data analysis for stages 1–3 are described in further detail in their respective sections.
Materials
We used the screencasting software ScreenFlow™ to record students' moment-to-moment computer interactions as they completed tasks with the modeling tool. Screencasts are video recordings of a computer monitor; they record a user's mouse and keyboard actions and keep time logs of all on-screen activity. The software uses the computer's built-in video camera and microphone for recording, which enabled us to capture students' facial expressions and dialog as they completed tasks with the modeling tool. The screencast videos provided us with an in-depth view of how students were engaging with the technology, and identified where in the task
sequence students were experiencing difficulties. Figure 1 shows a screenshot of ScreenFlow's™ interface, including the video annotation tool (circled) and audio timeline used for data analysis.
Stage 1: Remote Testing of Modeling Prototype
The goal of the first stage of usability testing was to gain a
broad overview of how students worked with data overlays in
the modeling interface. To achieve this, it was necessary to
obtain feedback from a larger number of users than is typically
collected in usability evaluations. Evaluating a technology
early in the development cycle is particularly valuable, since
early feedback still has potential for altering the course of the
design (Nielsen 2003). In our case, the design of the modeling
tool would inform the design of the associated curricular
activities that were simultaneously under development.
One of our first design decisions involved the selection
of data layers to include in the modeling tool. Although it
was desirable to let students choose from a wide range of
data sets (e.g., animal species, biomes, and cover, etc.), we
felt too many options might be distracting for students and
impede their completion of the activity. To learn how
students would interact with multiple data layers, we
developed a prototype map activity using Google maps and
MapServer. Since the Lifemapper GIS data were not yet
available, we created mock data layers for the purpose of
usability testing (see Fig. 2). These data layers were not
based on authentic geospatial data, but rather were simple
colored overlays that were deliberately designed in a way
that required students to access different areas of the map.
In the modeling interface, the colored data layers were
labeled to represent three different tree types: deciduous,
coniferous, and pine forest.
Procedure
To recruit student participants, we sent an email to several
middle school teachers inviting them to participate in the
study. Four teachers replied and agreed to implement the
online modeling activity in their classrooms.
Before beginning the modeling activity, students were
presented with written instructions for navigating the map
interface. Using the modeling prototype, students were
asked to complete four tasks using the tree distribution data
layers in the map application. Students were then asked to
answer five questions about the distribution of trees in the
United States (all states were clearly labeled on the map).
For example, one question asked students, "Can you find coniferous trees in Texas?" To answer this question correctly, students had to first click on the appropriate data
layer in the interface (i.e., coniferous trees) and then use
the navigation and panning features to see if the coniferous
tree data layer covered all or part of Texas. After answering
the tree questions, students were presented with four Likert
scale questions that asked them about their experience
using the modeling tool. The Likert scale questions asked
students to rate their agreement to a statement using a five-point scale (strongly agree, agree, neutral, disagree, and strongly disagree).

Fig. 1 Screenshot of screencasting software used for student think-alouds
Analysis
A total of 84 students completed the activity using the modeling tool prototype. The data from all four classes were compiled and analyzed for task completion and accuracy based on students' responses to the tree distribution questions. Each question was coded as either "correct" or "incorrect"; questions that were left blank were coded as "not attempted."
All data were tabulated and summarized with descriptive
statistics. Since there was no reliable method for determining
how long students spent on each task, stage 1 did not include
an analysis of task completion time.
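As a concrete illustration of this tabulation step, the short sketch below computes per-task percentages from coded responses. The column names and sample codes are hypothetical, not taken from the study data.

```python
# Illustrative tabulation of coded stage 1 responses with pandas.
# The records are hypothetical; each row is one student's code
# ("correct", "incorrect", or "not attempted") on one task.
import pandas as pd

records = pd.DataFrame({
    "task": ["task1", "task1", "task2", "task2", "task3", "task3"],
    "code": ["correct", "incorrect", "correct", "not attempted",
             "incorrect", "correct"],
})

# Percentage of each response code within each task.
summary = (records.groupby("task")["code"]
           .value_counts(normalize=True)
           .mul(100).round(1))
print(summary)
```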
Results
Students’ answers to the tree distribution questions suggested
they experienced some usability challenges when using the
map data layers. Since the questions were straightforward
with only one correct response, we expected a relatively high
accuracy across the four tasks. However, as shown in Fig. 3,
the percentage of correct responses was low, with a combined
mean of 55.29 % (SD = 10.08) for all four classes.
Students’ task performance level was not consistent with
their reported experiences of using the modeling tool.
When asked about the instructions for navigating the map
interface, the majority of students (82.1 %) agreed or
strongly agreed that the navigation instructions were easy
to understand. In addition, approximately three-quarters of
students (70.2 %) agreed or strongly agreed to having no
problems when finding the information they needed to
answer the tree distribution questions. Based on these early
findings, we decided that in subsequent development work,
we would limit the number of available data sets that
students could work with at any one time to no more than
four.
Stage 2: Alpha Testing of Modeling Tool
In stage 2, we evaluated a fully functional version of the modeling tool that used the authentic species distribution data provided by Lifemapper. Although remote testing in stage 1 enabled us to obtain feedback from a large number of students, it did not provide us with a detailed account of how students were interacting with the modeling technology. In order to obtain a more in-depth view of usability, we conducted collaborative think-alouds as a method for capturing students' use of the modeling tool.
Our approach to collaborative think-alouds was similar to
Miyake’s (1986) method of constructive interaction. In this
method, two or more students are observed as they work
together to solve a problem or complete a task, without
interruption from the researcher. This approach has several
advantages over traditional think-alouds, where individuals
are asked to verbalize their thinking while working through
some activity. First, students are not required to voice their
decision-making processes, an action that is unnatural for
most people and especially for children. Second, asking a peer for help is something that most students are inclined to do anyway. Third, the dialog that takes place between students as they work together can provide valuable insight into their collaborative interactions.

Fig. 2 Mock data overlays used for remote testing of modeling tool prototype

Fig. 3 Task completion and accuracy in stage 1 of usability testing
One disadvantage to this approach is the possibility of the "Hawthorne effect" (Landsberger 1958), where participants alter their behavior on account of their awareness of being observed. Originally, we intended to take observation notes during the collaborative think-alouds; however, after the first think-aloud began, it became apparent that students were uncomfortable with a researcher in the room. Since it was important to document authentic interaction with the modeling tool, we decided that students could work alone in the classroom with their peer. Since we used the built-in video cameras in the laptops (as opposed to an external video camera mounted on a tripod) as a recording device, the study environment was less invasive, which further increased the candidness of students' behavior. This approach to data collection provided a clear window into how students were engaging with the modeling tool.
Procedure
Eight students from an urban public middle school participated in the first round of collaborative think-alouds. Students worked on individual laptops in a separate room, away from their classmates and the teacher. The researcher introduced the activity and provided students with written instructions for using the modeling tool. To reduce the amount of background knowledge needed in order to complete the tasks, we used simple environmental data layers (i.e., temperature and precipitation) in the modeling activity. Working in pairs, each student was asked to complete the following three tasks using the modeling tool:
Task 1: Using the drawing tool, select any city in the
United States and draw a circle around it. What is the
average annual temperature of your chosen city?
Task 2: Using the drawing tool, circle the area on the map where the average annual temperature is between 3 and 18 °C.
Task 3: Using the drawing tool, circle the area on the
map where the average annual precipitation is between 60
and 240 cm.
Analysis
Task Performance
We measured students' task performance by analyzing their mouse and keyboard movements as recorded in the screencast videos. When working on a task, students had to perform certain computer actions to complete the task correctly, such as clicking on a particular data layer or drawing a circle on the map. For this reason, task performance was based on video evidence of students having completed the necessary computer actions. For each task, we assigned one of the following scores: "correct" (the student performed all the necessary actions and completed the task correctly), "incorrect" (the student performed the necessary actions, but did not complete the task correctly), and "not attempted" (the student did not perform any actions at all). To provide an example, for task 2 to be scored correct, the student had to click the temperature layer, select the drawing tool, and draw a line around the colored area on the map where the average annual temperature was between 3 and 18 °C. However, if the student had selected the precipitation layer instead of the temperature layer, or if they selected the temperature layer but circled the wrong temperature range, then task 2 would be scored as incorrect. The task would be scored as not attempted had the student not performed any actions toward completing the task at all.
Task Completion Time
We measured task completion time by analyzing the screencast time log, which keeps track of all onscreen activity in hours, minutes, and seconds. Using the time logs, it was possible to determine how much time students spent performing actions that were both necessary and unnecessary for completing a task. Necessary actions were those that were relevant to the task, such as panning or navigating the map interface, turning on a data layer, or using the drawing tool. Unnecessary actions were considered non-relevant to the task, for example, opening and closing the browser window, using other applications on the computer, or surfing the Internet. The following measures were used when calculating students' task completion time:
Computer session: The total amount of time a student
spent at the computer, regardless of what he or she was
doing.
Task start time: The first instance when a student used the mouse or keyboard to perform an action that was relevant to completing the task (e.g., clicking a data layer or drawing a circle on the map interface).
Task end time: The last instance when a student used the
mouse or keyboard to perform an action that was relevant
to completing the task. Task end time was also signaled
when a student completed the task (either correctly or
incorrectly), when they moved onto another task, or when
they left the computer or classroom.
Total task time: Total task time was the sum of all relevant onscreen activity that a student performed toward completing the three tasks. When analyzing the time log, any periods of computer inactivity (i.e., no onscreen movement) or interruptions to relevant activity of 5 or more seconds were excluded from the total task time calculation. In other words, total task time was not determined from a single start and end time, but from summing multiple intervals of relevant onscreen activity. The only exception was when students were reading the activity instructions or when they were discussing a task with their peer. Both of these actions were considered as being relevant toward the completion of a task.
Task completion time (%): Task completion time was the percentage of overall computing time that was spent performing actions that were relevant to the tasks. Task completion time was calculated by dividing the total amount of time spent completing tasks (Total task time) by the total amount of time a student spent at the computer (Computer session).
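To make these measures concrete, the following sketch shows one way the calculation could be implemented. The function names, the handling of the 5-s gap threshold, and the sample time log are our own illustration of the rules described above, not code from the study.

```python
# A minimal sketch of the task-time measures described above, assuming
# the screencast time log has been reduced to timestamps (in seconds)
# of task-relevant mouse/keyboard actions. Names and values are
# hypothetical.

def total_task_time(action_times, max_gap=5):
    """Sum intervals of relevant onscreen activity.

    Consecutive actions separated by less than `max_gap` seconds are
    treated as one continuous burst of relevant activity; pauses of
    5 or more seconds are excluded, mirroring the rule above.
    """
    total = 0
    for earlier, later in zip(action_times, action_times[1:]):
        gap = later - earlier
        if gap < max_gap:
            total += gap
    return total

def task_completion_pct(action_times, session_seconds):
    """Task completion time (%): relevant time over the whole session."""
    return 100.0 * total_task_time(action_times) / session_seconds

# Example: relevant actions clustering into three bursts during an
# 11 min 13 s (673 s) computer session.
log = [30, 32, 35, 36, 40, 120, 123, 124, 300, 302, 304, 306]
print(round(task_completion_pct(log, session_seconds=673), 1))  # 3.0
```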
Results
Task Performance
Students' task performance results suggested they were still experiencing some usability problems with the modeling tool. For all three tasks, both the completion and accuracy rates were relatively low (Fig. 4). For example, in task 1, 37.5 % of students were successful in circling a city with the drawing tool and noting the correct average annual temperature range for that city. For tasks 2 and 3, none of the students could successfully use the drawing tool to circle either the correct average temperature or precipitation area for a specific range. Moreover, after working on task 1, only 12.5 % of students went on to attempt task 2, and no students went on to attempt task 3.
Data from the screencast videos revealed that students' task performance and completion rates were associated with their use of the interactive tools in the modeling interface. In task 1, which involved circling a city and reporting the average annual temperature, students had no problems locating a city on the map. They did, however, have difficulties using the polygon tool for drawing a circle around the city. Polygon tools are typical in GIS systems, but they can be awkward to use as they rotate around a central pivot point and require coordinated mouse movements of clicking and dragging. Additionally, a polygon tool creates shapes that include shading, which increased the complexity of the map interface by adding another color layer. Students also struggled with the checkbox feature when selecting the temperature and precipitation data layers for viewing on the map. Geospatial projections, including those used by our modeling tool, involve large data files that typically take several seconds to render on a digital map. If the temperature and precipitation data were not immediately visible to students after clicking the checkbox, they would keep clicking repeatedly. Students did not realize that by doing this they were essentially turning the data layers on and off, which only further slowed the loading of the map projections and added to their frustration.
Task Completion Time
Students spent an average of 28.52 % (SD = 23.51 %) of their total computer time on actions that were relevant to completing the tasks (Table 2). Tasks that took longer to complete were those that had higher performance scores. For example, students spent an average of 3 min and 11 s working on task 1 (where 62.5 % of students attempted the task), 14 s on task 2 (where 12.5 % of students attempted the task), and zero time on task 3 (where no student attempted the task). The percentage of total computer time spent on non-relevant task actions was 71.48 %.
As a result of these findings, we made some design modifications to the modeling interface before continuing with usability testing. Specifically, we changed the checkbox tool in the data panel to large on/off slider buttons similar to those used in iPhones. In addition, we switched the polygon tool for a more intuitive free-form drawing tool that could outline an area on the map without adding any shading. Figure 5 highlights the interactive modeling tools that were used in the alpha version in stage 2 of usability testing; Fig. 6 highlights the changes that were made to improve usability in the beta version that was evaluated in stage 3.
Stage 3: Beta Testing of Modeling Tool
Procedure
In stage 3, we conducted a second round of collaborative
think-alouds following the same procedure described in
stage 2. Fourteen different students completed the same
three tasks using the redesigned interactive modeling tools.
Analysis
Task Performance and Completion Time
As in stage 2, we measured task performance and completion
time using the screencast videos and time logs collected
during student think-alouds. Both task performance and task completion time were analyzed using the same metrics that were described in stage 2.

Fig. 4 Task completion and accuracy in stage 2 of usability testing
Qualitative Analysis of Think-Aloud Discussions
To triangulate our study findings, we conducted a qualitative
analysis of students’ think-aloud discussions. To determine
whether there were additional usability problems not related
to the modeling interface, we developed a coding scheme
that reflected the use of models as a practice and the use of
modeled data for making inferences about climate (Table 3).
After transcribing the discussions, we segmented the data
into meaning units before completing several iterations of
coding. The unit of analysis, the meaning unit, was defined as
a sentence or a set of adjacent sentences that were part of the
same thought or idea (Miles and Huberman 1994). Students’
think-aloud discussions from stages 2 and 3 were included in
the qualitative analysis.
The coding rubric was organized into two categories of observed errors that represented the practice and core disciplinary idea related to the use of models to explain and predict the impacts of climate change on ecosystems. The first category, called Modeling Technology, refers to the use or misuse of the online modeling features, such as the drawing tool and map zooming feature. We categorize the knowledge needed to overcome this error type as largely associated with the practice of scientific modeling (i.e., the correct use of models). The second category, called Modeling Data, refers to the use or misuse of the data represented in the model; for example, being able to determine a specific region based on inferring data values between legend intervals. We categorize the knowledge needed to overcome this error type as including both content knowledge and the ability to interpret data. Two independent researchers coded the discussion transcripts, with an inter-rater reliability rate of 0.91 (Cohen's Kappa).
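For illustration, agreement of this kind can be computed with a standard library routine; the sketch below uses scikit-learn's cohen_kappa_score, and the two code sequences are hypothetical stand-ins for the codes each researcher assigned to the same meaning units.

```python
# Minimal sketch of an inter-rater reliability check with Cohen's kappa.
# Each list element is the code one researcher assigned to a meaning
# unit; the sequences below are hypothetical, not the study's data.
from sklearn.metrics import cohen_kappa_score

coder_a = ["TEC-MZOOM", "DAT-RNG", "DAT-LGD", "TEC-DRTOOL", "DAT-AVGT"]
coder_b = ["TEC-MZOOM", "DAT-RNG", "DAT-LGD", "TEC-DRTOOL", "DAT-RNG"]

# Kappa corrects raw percent agreement for agreement expected by chance.
print(cohen_kappa_score(coder_a, coder_b))
```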
The coding process was guided by the question: What is the
source or context of the usability problem students faced when
working with the modeling tool? Prior to coding, we established
criteria for code assignment for the purpose of inter-rater reliability. When considering a piece of data, we asked ourselves
the following: What specific action is the student performing
when experiencing the usability problem? What is he or she
trying to accomplish in that action? Since the screencasting
software recorded students’ dialog, it was possible to connect
students’ mouse and keyboard movements to precise moments
during their think-aloud discussions. This allowed us to analyze
the discourse at specific time points in the task sequence,
including those times when students were performing actions
that were non-relevant to completing the tasks.
The purpose of creating two coding categories was to identify usability problems that could interfere with students' fused knowledge development about the use of models for predicting the impacts of climate change. Early in the coding process, it became evident that there was much overlap between codes, and there were few instances when only one code category was relevant to the data. As a result of this overlap, we adjusted our coding protocol to allow data segments to be coded with one or more codes from either of the code categories (Modeling Technology and Modeling Data). Table 4 presents examples of coded data that were assigned codes from both categories.
After the discussion transcripts were coded, we looked for meaningful relationships among the assigned codes.
Table 2 Students’ task completion time in stage 2 (in min:s)
Min. Max. Mean SD
Task 1 time 00:00 09:47 03:11 03:08
Task 2 time 00:00 01:28 00:14 00:31
Task 3 time 00:00 00:00 00:00 00:00
Total task time 00:00 12:19 03:34 03:54
Computer session 05:55 21:15 11:13 05:44
Task completion time (%) 00.00 57.96 28.52 23.51
Fig. 5 Interactive tools in alpha version of modeling interface (stage 2)
For example, was the DAT-RNG (data range) code assigned more or less frequently with certain codes than with others? And if so, what could this tell us? This line of inquiry was ultimately not helpful, as it did not provide insight into the source or context of a potential usability problem. It did, however, highlight the fact that codes from the Modeling Data category were assigned much more frequently than codes from the Modeling Technology category.
Results
Task Performance
After modifying the interactive tools, there was a significant improvement in both task completion and accuracy for task 1 (p = .041, Fisher's exact test) and task 3 (p = .018, Fisher's exact test) (Fig. 7). Unlike in the previous stage of usability testing, the majority of stage 3 students (78.6 %) were successful in locating a city on the map and circling it with the free-form drawing tool (task 1). When working on task 2 (circling a temperature range on the map), 21.4 % of students circled the correct average temperature range, 28.6 % circled an incorrect average temperature range, and 50 % did not attempt the task. Task 3 performance was similar, with the same number of students (28.6 %) correctly and incorrectly circling the specified average precipitation range, and 42.8 % not attempting the task at all.
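As an illustration of how such a comparison can be run, the sketch below applies a Fisher's exact test to a 2 × 2 table (correct vs. not correct, by stage). The cell counts are hypothetical placeholders; the paper reports percentages only, and the exact contingency tables behind the reported p values are not given.

```python
# Hedged sketch of a Fisher's exact test comparing task outcomes across
# stages 2 and 3. Rows are stages; columns are (correct, not correct).
# The counts are hypothetical, not reconstructed from the paper.
from scipy.stats import fisher_exact

table = [[3, 5],    # stage 2 (n = 8), hypothetical
         [11, 3]]   # stage 3 (n = 14), hypothetical
odds_ratio, p_value = fisher_exact(table, alternative="two-sided")
print(p_value)
```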
Task Completion Time
Students in stage 3 were able to complete the same tasks in
less time than students in stage 2, a difference which was
statistically significant t(20) = 2.265, p = .035 (Table 5).
Students who completed task 1 did so quickly, spending an
average of 33 s working on the task (as opposed to 3 min,
11 s as in stage 2). Half of the students (50 %) completed task 2 in an average of 48 s, a 37.5 percentage-point increase in task attempts over the previous stage. On average, students in stage 3 spent 83.64 %
(SD = 9.19) of their total computer time on actions that were
not relevant to completing the modeling tasks.
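A two-sample t-test of this kind can be run directly from summary statistics; the sketch below applies scipy's ttest_ind_from_stats to the task completion time percentages reported in Tables 2 and 5, assuming a pooled-variance test (consistent with the reported df of 20). Because the text does not spell out exactly which time measure entered the reported t(20) = 2.265, this illustrates the method rather than reproduces the value.

```python
# Sketch of the stage 2 vs. stage 3 comparison, computed from the
# summary statistics in Tables 2 and 5 (task completion time, %).
# Pooled variance is assumed; df = 8 + 14 - 2 = 20 as reported.
from scipy.stats import ttest_ind_from_stats

t, p = ttest_ind_from_stats(mean1=28.52, std1=23.51, nobs1=8,   # stage 2
                            mean2=16.36, std2=9.19, nobs2=14,   # stage 3
                            equal_var=True)
print(t, p)
```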
Qualitative Analysis of Think-Aloud Discussions
Coding for errors resulted in the identification of two task objectives that were related to using both the modeling technology and the modeling data. These were navigating to a specific area within the map and reporting the information that was represented in the map. These task objectives were the underlying cause of many of students' usability problems and were relevant to fused knowledge development about the practice of modeling and the ideas and data that are important for understanding climate change. When analyzing the think-aloud discussions and screencast videos together, the task objectives could be linked to the crosscutting concept of scale, proportion, and quantity, which students needed to apply in order to complete the modeling tasks correctly.
Fig. 6 Interactive tools in beta version of modeling interface (stage 3)
Table 3 Coding rubric for identifying the source of usability problems

Modeling technology codes
  TEC: map zooming (TEC-MZOOM)
  TEC: map panning (TEC-MPAN)
  TEC: page nav. (TEC-PGNAV)
  TEC: data display (TEC-DATDIS)
  TEC: drawing tool (TEC-DRTOOL)

Modeling data codes
  DAT: average temp. (DAT-AVGT)
  DAT: average precip. (DAT-AVGP)
  DAT: map scaling (DAT-MSCL)
  DAT: data range (DAT-RNG)
  DAT: data legend (DAT-LGD)
Required Crosscutting Concept: Scale, Proportion, and Quantity
Developing and using models to make predictions about climate requires an understanding of the data and information that are represented in the model.
In this study, the environmental data used in the modeling tool were projected as ranges of average temperature and precipitation. Before beginning the tasks, students were given a short reading that explained these ideas at a reading level appropriate for middle school students. In the modeling interface, this information was presented in data legends, where the temperature legend included 11 ranges (-30 through 29.9 °C) and the precipitation legend included 9 (0 through 449.9 cm). Although students were given this information before beginning the tasks, they still had difficulties understanding that the colors in the data layers represented ranges of average annual temperature and precipitation, and not single values. For example, in task 2, students were asked to circle the area on the map that was between 3 and 18 °C. When they could not find those exact same numbers on the temperature legend (which was in increments of 5 °C), they often became frustrated and abandoned the activity.
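To make the mismatch concrete, the sketch below checks the queried range of 3–18 °C against legend bins in 5 °C increments. The bin edges are our own approximation for illustration; the paper does not list the tool's actual legend boundaries.

```python
# Sketch of the legend-interval mismatch described above. The task asks
# for 3-18 °C, but the legend exposes only fixed bins, so students had
# to infer which bins overlap the requested range. Bin edges here are
# approximate 5 °C increments, not the tool's actual legend.
edges = list(range(-30, 30, 5)) + [29.9]   # -30, -25, ..., 25, 29.9
bins = list(zip(edges, edges[1:]))

def overlapping_bins(lo, hi, bins):
    """Return the legend bins that overlap the queried range."""
    return [(a, b) for a, b in bins if a < hi and b > lo]

# The endpoints 3 and 18 appear in no bin label, which is what confused
# students; the queried range actually spans four legend bins.
print(overlapping_bins(3, 18, bins))
```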
The quantity of the projected ranges for the temperature and precipitation data also appeared to confuse students. For example, when circling the average annual precipitation range of 60–240 cm, students could not understand how it could cover the entire west side of the United States. Other times, students' unfamiliarity with weather and climate data appeared to cause them to rely on the colors of average temperature ranges instead of the actual data values. In these situations, students would base the temperature of their city on their association of "hot" and "cold" color regions—where blue shades represented cold regions and red shades represented hot regions.
A limited understanding about map scaling was another cause of students' usability problems. When locating a city on the map, students tended to zoom in far too closely, clicking the zoom button five or six times when only one or two clicks were necessary. When they were zoomed in that far, the temperature and precipitation data layers would typically display only one or two colors because of the large scale at which students were viewing the map. When viewing the model at this scale, students became confused and disoriented, not understanding why a city or other landscape feature was no longer in view. In addition, students did not seem to understand that the same map detail they could view at a small scale would no longer be visible if they zoomed in closely on the same area. When this happened, students typically panned to different locations
Table 4 Examples of single and multi-coded data segments

"I don't know what I'm doing. I just want my map tiny again!" (TEC-MZOOM)

"This map is dumb—it doesn't show anything. I picked Chicago but the dot went in Wisconsin" (TEC-MPAN, DAT-MSCL)

"Between what? 60 and 240? 60, so it has to be yellow?" (DAT-AVGP, DAT-RNG, DAT-LGD)

"I went on temperature [in the data panel] and clicked it. Nothing happened though" (TEC-DATDIS, DAT-AVGT)

"The shape isn't saving. Stop! Ugh, whatever!" (TEC-DRTOOL)

"What color am I supposed to pick? Detroit is cold, so I guess I'll pick blue?" (DAT-AVGT, DAT-RNG, DAT-LGD)

"Where the heck is the 'Next' button?" (TEC-PGNAV)

"I don't know how to do this, it says draw a line on the map that's between 3 and 18 °C. Well, there is no 3 and 18! It just says 0–4, 5–9, 10–19!" (DAT-AVGT, DAT-RNG, DAT-LGD)

"My city disappeared and now I'm in the ocean! Why am I in the ocean!?" (TEC-MZOOM, DAT-MSCL)

"Why are these circles so big? It's like, half the country" (DAT-AVGP, DAT-RNG)
Fig. 7 Task completion and accuracy in stage 3 of usability testing
Table 5 Students’ task completion time in stage 3 (in min:s)
Min. Max. Mean SD
Task 1 time 00:00 01:40 00:33 00:26
Task 2 time 00:00 02:30 00:48 00:55
Task 3 time 00:00 02:57 01:15 00:54
Total task time 00:00 05:46 02:30 01:26
Computer session 12:47 20:35 15:19 02:48
Task completion time (%) 00.00 33.95 16.36 09.19
on the map without first reducing the scale, which failed to
help them become reoriented with their positioning within
the map.
Discussion
Our goal in this study was to identify students' usability problems when working with a technology tool that was redesigned to foster middle school students' learning of fused knowledge focused on modeling climate change impacts. Our findings indicate that students struggled with the use of models in two ways—with the practice of modeling itself and with the use of modeling data for interpreting information about climate. From the screencasting videos, it was evident that students became frustrated when the interactive tools did not work as expected, and that students spent a significant amount of their computer time reacting to errors made with the modeling tool. From the analysis of students' think-aloud discussions, we discovered that many of the usability problems were closely tied to the crosscutting concept of scale, proportion, and quantity.
The design improvements made to SPECIES as part of our evaluation resulted in higher usability and performance outcomes for all three modeling tasks. These improvements also increased students' productive learning time by reducing the amount of time it took for students to complete the tasks. Despite the gains in both usability and performance, we recognize that more could be done to support students' use of a professional modeling tool that has been redesigned to foster fused knowledge about the potential impacts of climate change on ecosystems.
Strategies for Repurposing a Professional Modeling Tool
In science, models are used to construct explanations and
make predictions about natural systems and phenomenon. In
this study, we discovered that many students lacked knowl-
edge of a crosscutting concept that is important for under-
standing climate change and its impacts. To analyze and
interpret climate data, students need to apply the concepts of
averages, range, scale, and proportion. Applying knowledge
of a crosscutting concept can be surprisingly complex.
Rescaling large-scale and small-scale maps, for example, is
not an intuitive process and is a common cause of miscon-
ceptions among students (Nelson et al. 1992; Uttal 2000).
Some of the difficulty with scale is related to the way size is
referenced (i.e., the words ‘‘small’’ and ‘‘large’’) when
describing the scale ratio of the maps. To illustrate, a large-
scale map shows a smaller area in more detail because the
representative fraction for the ratio of map distance to land
distance is larger (e.g., 1/24,000). On a digital map, one can
easily increase the scale by zooming in to see the map in
greater detail. Conversely, a small-scale map shows a larger
area in less detail because the representative fraction for the
ratio of map to land distance is comparatively smaller (e.g.,
1/240,000). Thus, a large-scale map shows a smaller area
with more detail (when zooming in), and a small-scale map
shows a larger area with less detail (when zooming out). This
example shows the depth of a crosscutting concept (i.e., scale
and proportion) that is part of the fused knowledge about
using models (the practice) to explain and predict the impacts
of climate change (core disciplinary idea).
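To make the scale arithmetic concrete, the following minimal sketch computes how much ground distance one map millimetre represents at the two representative fractions discussed above. The numbers are illustrative only and are not taken from the SPECIES interface.

```typescript
// Illustrative sketch of representative fraction vs. ground distance.
// The scale denominators mirror the 1/24,000 and 1/240,000 examples above;
// nothing here is taken from the SPECIES implementation.

/** Ground distance (in metres) represented by one millimetre on the map. */
function groundMetresPerMapMm(scaleDenominator: number): number {
  // 1 mm on the map corresponds to `scaleDenominator` mm on the ground.
  return scaleDenominator / 1000;
}

const largeScale = 24_000;  // 1/24,000: larger fraction, smaller area, more detail
const smallScale = 240_000; // 1/240,000: smaller fraction, larger area, less detail

console.log(groundMetresPerMapMm(largeScale)); // 24 m of ground per map mm
console.log(groundMetresPerMapMm(smallScale)); // 240 m of ground per map mm

// Zooming in raises the representative fraction (1/24,000 > 1/240,000), so each
// screen millimetre covers less ground: a smaller area shown in more detail.
```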
To prepare students for STEM careers and promote
scientific literacy, educational technology tools and cur-
ricular resources must co-support the three dimensions of
knowledge in students’ science learning. Previous work
(Quintana et al. 2004) has synthesized the design features
of educational software tools into a scaffolding design
framework for supporting science inquiry. Here, we discuss
design strategies specific to repurposing a professional
modeling tool, emphasizing the role of usability for sup-
porting students’ knowledge development about science
and engineering practices, crosscutting concepts, and core
disciplinary ideas.
Sense-making Scaffolds for Modeling Data
Using technology to model systems and system processes
often requires the use of complex forms of data. When using
a computer model, both the model and the technology
become increasingly complex when multiple data sets are
used. Even a single data set, such as temperature or precip-
itation, can combine several different types of information. When
these and other data sets are projected together into future
climate models, the amount and complexity of concepts and
disciplinary ideas increase considerably. Problems can arise
when the representation of data (e.g., ranges of average
annual temperature or precipitation) cannot be simplified
within the model, and students fail to see how multiple ele-
ments fit together when interpreting data.
How can technology designers reconcile the cognitive
demands of complex data with the need for improved
usability? One strategy is to develop
scaffolds that help students make sense of complex
forms of data. Given the complexity of computer-based
modeling, a first step might involve an inventory of the
underlying crosscutting concepts in the data. Breaking data
down into its constituent parts, either by building in defi-
nitions or adding a supplemental activity, can help students
make sense of complex data representations (Quintana
et al. 2004). However, because of time constraints or other
factors, it may not always be possible or practical to
develop technology scaffolds that give individual treatment
to crosscutting concepts. To address multiple concepts
simultaneously, scaffolds could take the form of a video or
other media that introduce students to the underlying
properties of the data that are important for developing fused
knowledge about modeling the impacts of climate change.
Creating a video, animation, or other resource that uses the
language and terminology used in the curriculum is another
strategy for helping students make sense of complex forms
of data.
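One way of picturing the ‘‘building in definitions’’ strategy is a simple lookup structure that pairs each data layer with a plain-language definition of the crosscutting concept it depends on. The sketch below is hypothetical: the layer names and wording are invented for illustration and are not taken from SPECIES.

```typescript
// A minimal sketch of a sense-making scaffold: each data layer carries a
// student-facing definition of the crosscutting concept it depends on.
// Layer identifiers and definition text are hypothetical.

interface LayerScaffold {
  layerId: string;           // data layer shown in the model
  concept: string;           // underlying crosscutting concept
  studentDefinition: string; // plain-language definition surfaced in the interface
}

const scaffolds: LayerScaffold[] = [
  {
    layerId: 'avg-annual-temperature',
    concept: 'average and range',
    studentDefinition:
      'Each color band shows a RANGE of yearly averages, not a single reading.',
  },
  {
    layerId: 'avg-annual-precipitation',
    concept: 'scale and proportion',
    studentDefinition:
      'Circle size is proportional to rainfall; compare sizes, not exact edges.',
  },
];

/** Look up the definition to display when a student toggles a layer on. */
function definitionFor(layerId: string): string | undefined {
  return scaffolds.find((s) => s.layerId === layerId)?.studentDefinition;
}

// Example: definitionFor('avg-annual-temperature') returns the range definition.
```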
Adopt Design Features of Ubiquitous Technologies
An important part of improving usability is reducing the
amount of time it takes for users to learn how to use the
technology (Preece et al. 2002). One approach to decreasing
technology learning time is to increase the familiarity of the
interface. Complex interfaces can be made more usable for
students by adopting design features of technologies that
may be familiar to students. Students using the computer
conference Pepper (Hewitt and Brett 2011), for example, can
use a ‘‘like’’ button similar to the one in Facebook to indicate
their agreement and support for their classmates’ ideas.
Other learning environments (Manouselis et al. 2011) have
used the recommender system popularized by Amazon to
suggest activities to students based on their prior perfor-
mance on activities in the learning environment.
When designing SPECIES, the use of OpenLayers
resulted in a map interface that was similar to Google
Maps, a map application common to many web sites. In the
data panel of our modeling tool, using sliding on/off but-
tons like the ones used on the iPhone improved usability by
clearly indicating the view status of the temperature and
precipitation layers. In addition, by switching the center-
pivoting polygon tool for a simple line-drawing tool typical
in most drawing applications, students were able to make
significantly more annotations on the map interface. These
findings demonstrate that even minor design modifications
can have a positive impact on how students use a sophis-
ticated modeling technology.
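A minimal sketch of this ‘‘familiar interface’’ strategy appears below, written against the current OpenLayers (‘ol’) npm package, which postdates the version used to build SPECIES; the element IDs and layer choice are hypothetical. It wires an iPhone-style on/off switch to a map layer's visibility.

```typescript
// Sketch: an OpenLayers map whose data layer is driven by a familiar
// on/off toggle. Uses the modern 'ol' package API (an assumption; SPECIES
// was built on an earlier OpenLayers release). Element IDs are hypothetical.

import Map from 'ol/Map';
import View from 'ol/View';
import TileLayer from 'ol/layer/Tile';
import OSM from 'ol/source/OSM';

const temperatureLayer = new TileLayer({ source: new OSM(), visible: true });

const map = new Map({
  target: 'map', // <div id="map"> in the host page
  layers: [temperatureLayer],
  view: new View({ center: [0, 0], zoom: 4 }),
});

// A checkbox styled as a sliding switch makes the layer's view status
// explicit, mirroring a control students already know from phones.
const toggle = document.getElementById('temperature-toggle') as HTMLInputElement;
toggle.addEventListener('change', () => {
  temperatureLayer.setVisible(toggle.checked);
});
```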
Align Data Sets to Learning Goals
To practice scientific modeling, students require a level of
technological proficiency for analyzing and interpreting the
data in the model. In stages 1 and 2 of our evaluation,
students often expressed frustration when resizing maps on
the modeling interface. While difficulties using the navi-
gation tool would seem the likely cause, we realized that
students were facing an additional impediment. Using a
map-based modeling tool requires knowledge of spatial
and proportional reasoning (Bausmith and Leinhardt 1998),
concepts that many of our students appeared to lack. This
deficit may explain why students were confused when they
overclicked the zoom button and found themselves in the
ocean—they did not understand that by increasing the scale
of the model, they were simultaneously decreasing the
viewable area of the map.
Analyzing and interpreting data for making predictions
about climate change impacts is an important component in
many of our curricular activities. Our study findings suggest
that students’ lack of knowledge about concepts associated
with temperature and precipitation data contributed to stu-
dents’ usability problems with the modeling tool. In most
online computer models, increasing the volume of available
data will also increase the demand and load time on the
computer (Edelson 2004). As the number of data sets
increases, so does the complexity of the technology. Limiting
data sets to include only those that address the learning goals
will contribute toward a more simplified and usable modeling
interface. In some cases, transforming dynamic map models
into a static image (thus eliminating the need for map scaling)
may be all that is necessary to strategically simplify a model.
Imposing task boundaries in this way can support students’
inquiry learning by scaffolding the data management aspect of
the modeling system (Quintana et al. 2004). This can also
increase students’ productive learning time by eliminating
interface features that are tangential to the learning goals.
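One way to picture this strategy is to configure the model from an explicit list of goal-aligned layers so that tangential data sets are never loaded. The sketch below uses hypothetical layer names chosen for illustration only.

```typescript
// Sketch: aligning data sets to learning goals by filtering the available
// layers against an explicit whitelist before the map is built.
// All layer identifiers are hypothetical.

const LEARNING_GOAL_LAYERS = [
  'avg-annual-temperature',
  'avg-annual-precipitation',
];

const ALL_AVAILABLE_LAYERS = [
  'avg-annual-temperature',
  'avg-annual-precipitation',
  'soil-moisture',
  'wind-speed',
  'elevation',
];

// Layers outside the learning goals are dropped up front, simplifying the
// interface and reducing data volume and load time on the computer.
const layersToLoad = ALL_AVAILABLE_LAYERS.filter((id) =>
  LEARNING_GOAL_LAYERS.includes(id),
);

console.log(layersToLoad); // ['avg-annual-temperature', 'avg-annual-precipitation']
```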
It should be noted that not all modeling design features
can be modified. There are often
certain design parameters that researchers and technology
specialists must work within when repurposing a profes-
sional modeling tool. Certain navigational controls, such as
map zooming and panning features, are often standardized
and unchangeable in packaged web applications. How-
ever, the movement toward free access to geospatial
technologies (e.g., Open Source Geospatial Foundation) is
increasing the GIS functionality options that are available
to developers. This has implications for educational
researchers working with modeling tools, since the appli-
cation can be customized to be compatible with different
forms of data and designed in a manner that is appropriate
for a specific target audience. Although this study focused
on GIS systems, the applicability and relevance of the
findings—supporting students’ learning of tridimensional
knowledge—can be generalized to the broader context of
computer-supported science learning.
Looking forward, it is important to realize that challenges
to using computer-based models are likely to increase.
Advancements in technology have made it possible for sci-
entists to work with complex digital data sets, many of which
were unavailable less than a decade ago. This presents a
challenge for researchers and educators wishing to incorporate
fused knowledge about scientific modeling into the curricu-
lum. GIS systems and other modeling tools will also become
more sophisticated as the technology becomes more powerful.
These developments highlight the importance of repurposed
modeling tools like SPECIES that limit functionality to focus
students’ attention on what is important—analyzing and
interpreting modeling data for making predictions and con-
structing explanations about complex ideas in science.
Conclusion
This study evaluated the usability of a professional modeling
technology that was repurposed to support middle school
students’ learning about fused knowledge associated with
climate change impacts. Usability testing identified a number
of problematic features in the interface design of the modeling
tool. More importantly, it identified a number of data-related
usability problems related to the crosscutting concepts that
needed to be applied to the modeling data. Evaluating the
usability of a repurposed modeling technology is important for
establishing classroom efficacy and for having broader edu-
cational impacts through product scalability.
The findings from this study have implications for cur-
riculum developers and science educators. New technologies
bring with them new opportunities for developing a deep
understanding of science. In this study, many students were
unable to apply the concepts of scale, proportion, and
quantity when working with the temperature and precipitation
data, concepts that are necessary for analyzing and interpreting
climate data. Subsequent versions of SPECIES will include
dedicated scaffolds for supporting students’ fused knowl-
edge development about the use of models for predicting the
impacts of climate change. To improve the learning experi-
ence with a repurposed professional modeling tool, we rec-
ommend early usability testing for identifying the aspects of
tridimensional knowledge that require scaffolding. We also
recommend the use of screencast videos as a method for
capturing authentic data from student think-alouds. Finally,
we encourage additional work that increases the range and
quality of empirical information available for supporting
fused knowledge development that includes practices,
crosscutting concepts, and core disciplinary ideas in science.
Acknowledgments This research is based upon work supported by
the National Science Foundation under Grant No. 0918590. Any
opinions, findings, and conclusions or recommendations expressed in
this material are those of the author(s) and do not necessarily reflect
the views of the National Science Foundation.
References
Alibrandi M (2003) GIS in the classroom: using geographic
information systems in social studies and environmental science.
Heinemann, Portsmouth
Baker TR (2005) Internet-based GIS mapping in support of K-12
education. Prof Geogr 57(1):44–50
Baker TR, White SH (2003) The effects of G.I.S. on students'
attitudes, self-efficacy, and achievement in middle school
science classrooms. J Geogr 102(6):243–254
Bausmith JM, Leinhardt G (1998) Middle-school students’ map
construction: understanding complex spatial displays. J Geogr
97(3):93–107
Beach JH, Stewart AM, Vorontsov GY (2002) Mapping distributed
life with distributed computation. In: Proceedings of the 22nd
annual Esri international user conference. Esri Publications,
Redlands
Benyon D (1993) Adaptive systems: a solution to usability problems.
User Model User Adap Inter 3(1):65–87
College Board (2009) Science: college board standards for college
success. The College Board Press, New York
Bury KF (1984) The iterative development of usable computer
interfaces. In: Proceedings of IFIP INTERACT'84 international
conference on human–computer interaction, London, UK,
Sept 4–7, pp 743–748
Ceaparu I, Lazar J, Bessiere K, Robinson J, Shneiderman B (2004)
Determining causes and severity of end-user frustration. Int J
Hum Comput Interact 17(3):333–356
Coll R, France B, Taylor I (2005) The role of models and analogies in
science education: implications from research. Int J Sci Edu
27(2):183–198
Collins A (2011) A study of expert theory formation: the role of
different model types and domain frameworks. In: Khine MS,
Saleh IM (eds) Models and modeling: cognitive tools for
scientific enquiry. Springer, New York, pp 23–40
Crowther MS, Keller CC, Waddoups GL (2004) Improving the quality
and effectiveness of computer-mediated instruction through
usability evaluations. British J Edu Technol 35(3):289–303
Csikszentmihalyi M (1997) Finding flow: the psychology of engage-
ment with everyday life. Basic Books, New York
Edelson DC (2004) Designing GIS software for education: a
workshop report for the GIS community. The Geographic Data
in Education Initiative at Northwestern University
Edelson DC, Gordin D (1998) Visualization for learning: a framework
for adapting scientists' tools. Comput Geosci 24(7):607–616
Gobert JD, Pallant A (2004) Fostering students’ epistemologies of
models via authentic model-based tasks. J Sci Educ Technol
13(1):7–22
Hewitt J, Brett C (2011) Engaging learners in the identification of key
ideas in complex online discussions. In: Spada H, Stahl G,
Miyake N, Law N (eds) Proceedings of the computer-supported
collaborative learning conference, Hong Kong, July 4–8, pp 960–961
IPCC (2007) Climate change 2007: synthesis report. In: Core writing team,
Pachauri RK, Reisinger A (eds) Contribution of working groups I, II
and III to the fourth assessment report of the intergovernmental panel
on climate change. IPCC, Geneva, Switzerland
Johnson RR, Salvo MJ, Zoetewey MW (2007) User-centered technology
in participatory culture: two decades ‘‘Beyond a narrow conception of
usability testing''. IEEE Trans Prof Commun 50(4):320–332
Kaplan DE, Black JB (2003) Mental models and computer-based
scientific inquiry learning: effects of mechanism cues on
adolescent representation and reasoning about causal systems.
J Sci Educ Technol 12(4):483–493
Keating T, Barnett MG, Barab SA, Hay KE (2002) The virtual
solar system project: developing conceptual understanding of
astronomical concepts through building three-dimensional com-
putational models. J Sci Educ Technol 11(3):261–275
Landsberger HA (1958) Hawthorne revisited: management and
the worker, its critics, and developments in human relations in
industry. Cornell University Press, Ithaca
Lazar J, Jones A, Shneiderman B (2006) Workplace user frustration
with computers: an exploratory investigation of the causes and
severity. Behav Inf Technol 25(3):239–251
Lehrer R, Schauble L (2006) Scientific thinking and science literacy.
In: Damon W, Lerner R, Renninger KA, Sigel IE (eds)
Handbook of child psychology, vol 4, 6th edn. Wiley, Hoboken
Manouselis N, Drachsler H, Vuorikari R, Hummel H, Koper R
(2011) Recommender systems in technology enhanced learning.
In: Ricci F, Rokach L, Shapira B, Kantor PB (eds) Recommender
systems handbook. Springer, New York, pp 387–415
Michigan Department of Education. K-7 Content grade level expectations
in science, version 1.09. Retrieved from: http://michigan.gov/mde/
Miles MB, Huberman AM (1994) Qualitative data analysis: an expanded
sourcebook, 2nd edn. Sage Publications, Thousand Oaks
Miyake N (1986) Constructive interaction and the iterative process of
understanding. Cognitive Science 10:151–177
National Research Council (2012) A framework for K-12 science
education: practices, crosscutting concepts, and core ideas.
National Academies Press, Washington
Nelson BD, Aron RH, Francek MA (1992) Clarification of selected
misconceptions in physical geography. J Geogr 91(2):76–80
Nielsen J (1993) Iterative user-interface design. IEEE Comput 26(11):
32–41
Nielsen J (2003) Usability 101: an introduction to usability. Jakob
Nielsen's Alertbox. Article retrieved 2 Nov 2010, from:
http://www.useit.com/alertbox/20030825.html
Nielsen J (2010) Children's websites: usability issues in designing for
kids. Jakob Nielsen’s Alertbox. Article retrieved 2 Nov 2010,
from: http://www.useit.com/alertbox/children.html
Nielsen J, Landauer TK (1993) A mathematical model of the finding
of usability problems. In: Proceedings of ACM INTERCHI'93
conference. ACM Press, Amsterdam, pp 206–213
Pallant A, Tinker RF (2004) Reasoning with atomic-scale molecular
dynamic models. J Sci Educ Technol 13(1):51–66
Preece J, Rogers Y, Sharp H (2002) Interaction design: beyond
human-computer interaction. Wiley, New York
Quintana C, Reiser BJ, Davis EA, Krajcik J, Fretz E, Duncan RG, Kyza
E, Edelson D, Soloway E (2004) A scaffolding design framework
for software to support science inquiry. J Learn Sci 13(3):337–386
Schwarz CV, White BY (2005) Metamodeling knowledge: developing
students' understanding of scientific modeling. Cogn Instr 23(2):
165–205
Sengupta P, Wilensky U (2011) Lowering the learning threshold:
multi-agent-based models and learning electricity. In: Khine MS,
Saleh IM (eds) Models and modeling: cognitive tools for
scientific enquiry. Springer, New York, pp 141–171
Shapiro AM (2008) Hypermedia design as learner scaffolding. Edu
Tech Res Dev 56:29–44
Shneiderman B, Bederson BB (2005) Maintaining concentration to
achieve task completion. In: Proceedings of the 2005 designing
user experience conference. AIGA Press, San Francisco
Shorkey CT, Crocker SB (1981) Frustration theory: a source of unifying
concepts for generalist practice. Soc Work 26(5):374–379
Sim G, MacFarlane S, Read J (2006) All work and no play: measuring
fun, usability and learning in software for children. Comput
Educ 46(3):235–248
Soloway E, Guzdial M, Hay KE (1994) Learner-centered design: the
challenge for HCI in the 21st century. Interactions 1(2):36–48
Songer NB (2006) BioKIDS: an animated conversation on the
development of complex reasoning in science. In: Sawyer RK (ed)
Cambridge handbook of the learning sciences. Cambridge
University Press, New York, pp 355–369
Songer NB, Kelcey B, Gotwals AW (2009) How and when does
complex reasoning occur? Empirically driven development of a
learning progression focused on complex reasoning about
biodiversity. J Res Sci Teach 46(6):610–631
Sullivan P (1989) Beyond a narrow conception of usability testing.
IEEE Trans Prof Commun 32(4):256–264
Taylor I, Barker M, Jones A (2003) Promoting mental model building
in astronomy education. Int J Sci Edu 25(10):1205–1225
Turner CW, Lewis JR, Nielsen J (2006) Determining usability test sample
size. In: Karwowski W (ed) International encyclopedia of ergonomics
and human factors, vol 4, 6th edn. CRC Press, Boca Raton
Uttal DH (2000) Seeing the big picture: map use and the development
of spatial cognition. Dev Sci 3(3):247–264
Varma K, Linn MC (2011) Using interactive technology to support
students’ understanding of the greenhouse effect and global
warming. J Sci Educ Technol. doi:10.1007/s10956-011-9337-9
Virzi RA (1992) Refining the test phase of usability evaluation: how
many subjects is enough? Hum Factors 34(4):457–468