Children’s Mathematical Reasoning in Online Games: Can Data Mining Reveal Strategic Thinking?

CHILD DEVELOPMENT PERSPECTIVES


Shalom M. Fisch,¹ Richard Lesh,² Elizabeth Motoki,² Sandra Crespo,³ and Vincent Melfi³

¹MediaKidz Research & Consulting, ²Indiana University, and ³Michigan State University

ABSTRACT: Children’s interaction with educational computer games reflects not only their game-playing expertise but also their knowledge and skills about embedded educational content. Recent pilot data, drawn from an ongoing evaluation of children’s learning from educational media, illustrate that, much as earlier research on formal classroom mathematics has found, children may engage in cycles of increasingly sophisticated mathematical thinking over the course of playing an online game. It is possible to detect these shifts in strategies not only through in-person observations, but via data mining of online tracking data as well. This article discusses implications for the study of mathematical reasoning, children’s use of educational games, and assessment.

KEYWORDS: media; computer games; online; mathematics; reasoning; assessment

This research was funded as part of a grant from the National Science Foundation (DRL-0723829). We gratefully acknowledge the staff, teachers, and students of the participating school. We also thank the Cyberchase production team (especially Sandra Sheppard, Frances Nankin, and Michael Templeton) for their support, and online producers David Hirmes and Brian Lee for building the tracking software we used here. Finally, we are grateful to the field researchers who helped collect our pilot data: Meredith Bissu, Susan R.D. Fisch, Carmina Marcial, Jennifer Shulman, Nava Silton, Faith Smith, and Carolyn Volpe. Without them, this article (and the development of this methodological approach) would have been impossible.

Correspondence concerning this article should be addressed to Shalom M. Fisch, MediaKidz Research & Consulting, 78 Grayson Pl., Teaneck, NJ 07666; e-mail: [email protected].

© 2011 The Authors

Child Development Perspectives © 2011 The Society for Research in Child Development

Volume 5, Number 2, 2011, Pages 88–92

None of us is born with a separate part of our brain that we use exclusively for playing computer games. While playing games, we apply the same sorts of knowledge, inferences, and cognitive skills that we use in our offline lives.

With that in mind, researchers who study human–computer interaction have sometimes drawn on established theories of human cognition to explain users’ thinking while playing games (e.g., Mayer & Moreno, 2003; Moreno, 2006) or have noted similarities that exist between online and offline thinking and behavior (e.g., Gee, 2003). Constructs such as social schemas certainly play vital roles in online social networking (e.g., Subrahmanyam & Greenfield, 2008). Moreover, even when users know that they are interacting with machines rather than other live users, research has shown that the same sorts of social schemas that govern interactions with other people influence these interactions, too, regardless of whether the device in question is an animatronic, talking doll (Strommen, 2003) or a desktop computer (Reeves & Nass, 1996).

By the same token, when children play educational computer games, we might expect their reasoning to follow the same sorts of paths that they use while figuring out similar educational content in real (offline) life. If so, this would not only help us understand children’s use of educational technology but also present a significant methodological opportunity for research. Successful educational games have a tremendous reach among children; for example, the mathematics-based Cyberchase website (http://www.pbskids.org/cyberchase) has logged more than one billion page views to date. Given the countless bits of data generated while playing a game, data mining could yield a vast pool of data for investigating applied reasoning during naturalistic play.

As part of a major, multiyear study of children’s mathematics learning from Cyberchase, our research team explored the possibility of using online Cyberchase games both as instructional tools and as a means of assessing children’s problem solving. The field of computer-assisted instruction (CAI) represents a long history of teaching and assessing knowledge via interactive games (e.g., Price, 1989; Rudestam & Schoenholtz-Read, 2002; Suppes & Macken, 1978). However, unlike the kinds of software traditionally used in CAI, Cyberchase was not originally designed for assessment. In addition, whereas assessment in CAI frequently focuses on measuring the state of users’ knowledge or skills (to determine the types of exercises that the software will provide next; e.g., Corbett & Anderson, 1995; Gunzelmann & Gluck, 2004), we were more interested in observing the evolution of children’s problem-solving strategies and mathematical thinking over the course of playing a game.

Does game play reflect children’s understanding of educational content and strategies for problem solving? If so, is data mining sufficiently rich to model the process of reasoning, as opposed to knowledge states or simply counting right answers? To find out, our research team observed 74 third and fourth graders (27 girls and 47 boys) in person as they played three Cyberchase online games, which dealt with decimals, quantity and volume, and proportional reasoning. For example, in the Railroad Repair game (http://pbskids.org/cyberchase/games/decimals), players fill gaps in a train track by using pieces labeled with decimals between .1 and 1.0 (see Figure 1). Multiple correct solutions are possible. However, players can use each length of track only once per screen. Thus, children must add decimals in order to create the appropriate lengths of track, and find multiple ways to make a given sum when they need the same length of track in several places on the same screen. As the game progresses, children also must plan ahead in deciding which pieces to use in constructing each sum, to make sure that all of the necessary pieces will be available when needed.
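The arithmetic behind these rules can be made concrete with a short sketch. The Python below is illustrative only and is not part of the game or the study software. It assumes an inventory of one piece per tenth from .1 to 1.0 (the text says only that pieces are labeled with decimals in that range), stores lengths as integer tenths so that sums are exact, and uses a hypothetical helper, ways_to_fill, to list every combination of distinct pieces that fills a gap.

```python
from itertools import combinations

# Assumption: one piece per tenth from .1 to 1.0, stored as integer tenths
# so that sums are exact (no floating-point rounding).
PIECES = list(range(1, 11))


def ways_to_fill(gap_tenths, available=PIECES):
    """Return every combination of distinct available pieces that sums to the gap."""
    found = []
    for size in range(1, len(available) + 1):
        for combo in combinations(available, size):
            if sum(combo) == gap_tenths:
                found.append(combo)
    return found


def fmt(combo):
    """Render a combination of tenths as decimals, e.g. (1, 7) -> '0.1 + 0.7'."""
    return " + ".join(f"{p / 10:.1f}" for p in combo)


# Every way to fill a .8 gap with the full (assumed) inventory.
for combo in ways_to_fill(8):
    print(fmt(combo))
```

Under these assumptions a .8 gap can be filled in six ways, and five of them remain once the .8 piece itself has been used. That is the situation in which a player who started by matching has to begin adding.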

Children played each of the three games for up to approximately 15 min per game; they stopped sooner if they either completed the entire game or gave up before 15 min elapsed. Children played each game in pairs, to facilitate conversation that could reveal ideas and strategies as they played.

As each pair of children played each game, a researcher conducted in-person observations of their game play, recording the moves they made in the game and any conversation or behavior that took place while playing. Simultaneously, custom-built tracking software automatically recorded the children’s mouse clicks and keyboard input. At the end of each game, interviewers asked each pair of children directly about their strategies for playing the game and solving its mathematical problems: how they figured out their answers, whether they changed their strategy at any point (and, if so, why), and any tips that they would offer to a friend to help him or her play the game well.

Figure 1. Sample screen from Cyberchase Railroad Repair game.

Just as one might expect in offline mathematical reasoning, we found a range of sophistication in the mathematical strategies that children used while playing the games. Moreover, parallel to research on classroom mathematics (e.g., Lesh, Hoover, Hole, Kelly, & Post, 2000) and findings within the developmental literature on children’s use of strategy (e.g., Siegler, 2007), those children who used more sophisticated strategies often did not apply them immediately. Rather, they engaged in cycles of problem solving that began with less sophisticated strategies and progressed to more sophisticated approaches when necessary.

In the game Railroad Repair, many children began by using a matching strategy in which they matched the decimals shown (e.g., a .8 piece of track to fill a .8 gap). When this strategy later proved insufficient (e.g., they ran out of .8 pieces, or needed to fill a gap that was larger than the 1.0 piece), some switched to an additive strategy (e.g., combining .6 and .2 to fill a .8 gap). When this strategy, too, proved insufficient (e.g., they needed the .2 piece later), some adopted an advanced strategy in which they planned ahead, considered alternate ways to make the necessary sums, and reserved pieces they would need later. Below, we present percentages of children who reached each level of sophistication.
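The gap between the additive and the advanced strategy can be pictured as a small search problem: filling one gap at a time can use up a piece that a later gap needs, whereas planning ahead means choosing sums that leave every gap fillable. The sketch below is one hedged way to express that idea in Python. It is not a model of how children think and is not taken from the study; the screen layout (two .8 gaps and one .1 gap), the one-piece-per-tenth inventory, and the function name plan_screen are all assumptions made for illustration.

```python
from itertools import combinations


def plan_screen(gaps, pieces):
    """Assign a disjoint set of pieces to every gap, or return None if impossible.

    Lengths are integer tenths, and each piece may be used at most once.
    The search backtracks when an earlier choice leaves a later gap unfillable.
    """
    if not gaps:
        return []
    first, rest = gaps[0], gaps[1:]
    for size in range(1, len(pieces) + 1):
        for combo in combinations(pieces, size):
            if sum(combo) != first:
                continue
            remaining = list(pieces)
            for p in combo:
                remaining.remove(p)      # reserve these pieces for this gap
            tail = plan_screen(rest, remaining)
            if tail is not None:         # the rest of the screen is still solvable
                return [combo] + tail
    return None


# Hypothetical screen: two .8 gaps and one .1 gap, one piece per tenth.
print(plan_screen([8, 8, 1], list(range(1, 11))))
# [(8,), (2, 6), (1,)]
```

In this hypothetical screen the first additive candidate for the second .8 gap is .1 + .7, which would consume the .1 piece that the last gap requires. The search rejects it and settles on .2 + .6 instead, mirroring the kind of reservation of pieces that the advanced strategy involves.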

As in past research on classroom mathematics (e.g., Lesh et al., 2000), such changes in strategy were apparent from in-person observations of behavior and conversations that occurred during game play. For example, Table 1 presents an excerpt of observations of one pair of girls as they played Railroad Repair. In this excerpt, taken from the middle of the game, the players begin by continuing to use the additive strategy that they employed successfully on the previous screen, only to discover that the strategy fails when they run out of the necessary pieces. They recognize the failure, consider other ways to approach the task (e.g., via subtraction, which the game does not permit), and then reapproach the task of filling the gaps on the screen by adopting a more sophisticated advanced strategy that also takes the order of pieces into account. Their shift from an additive to an advanced strategy is signaled both by the conversation between the 2 players and by aspects of their behavior (e.g., their choices among the available pieces of track, their use of the “clear” button to clear the pieces from the screen and start over).

We conducted a qualitative analysis to determine whether it was possible to use online tracking data to detect such strategies and shifts in strategies. Specifically, we examined the tracking data to evaluate whether we could identify patterns of responses that were associated with the types of mathematical strategies we described earlier, whether tracking data could detect shifts between strategies, and whether the identification of strategies via tracking data was consistent with conclusions we drew from in-person observations and interviews with the same pairs of children.

Table 1

Excerpt From In-Person Observations of One Pair of Girls Playing Railroad Repair

Observation: “I think we’re supposed to use the 1 and then the 10.” The players attempt to make the desired sum.
Interpretation: On this screen, which appears in the middle of the game, the 2 players begin by continuing the additive strategy that they employed successfully on the previous screen. Here, they attempt to add 1.0 + .1 to fill one of the gaps on the screen.

Observation: “Uh-oh. Can [we] subtract?” “This is too confusing.”
Interpretation: The additive strategy breaks down as they realize that they have already used one of the two desired pieces, so it is no longer available. Note that subtraction is not possible in the game.

Observation: The players clear the pieces from the screen.
Interpretation: Use of the “clear” button indicates that the players recognize that their attempt has failed, and is a precursor to trying again with either the same strategy or a new one.

Observation: “This time, we’ll start with the mini-pieces . . .”
Interpretation: The players transition from an additive strategy to an advanced strategy, in which they consider not only which pieces to choose but also the order in which to use the pieces.

Table 3

Sample Tracking Data From Railroad Repair (Second Screen)

Row no.  Event       Piece   Round  Successful placement?  Elapsed time
5        piecepress  track8  2      n/a                    22.090
6        piecedrop   track8  2      Success                24.101
7        piecepress  track7  2      n/a                    25.329
8        piecedrop   track7  2      Success                26.942
9        piecepress  track1  2      n/a                    28.503
10       piecedrop   track1  2      Wrong                  28.711
11       piecepress  track1  2      n/a                    29.099
12       piecedrop   track1  2      Success                30.910

Note. n/a = not applicable.

In fact, this qualitative analysis revealed that data mining could detect the same types of strategies (and shifts among strategies) that were evident in in-person observations and self-report interview data. Tracking data showed consistent patterns of online responses reflecting each strategy (matching, additive, or advanced) and clusters of errors at points when children’s strategies broke down and they shifted to new strategies. Consider, for example, some partial output of the tracking software for one session of Railroad Repair. On the first screen, the tracking software shows evidence of the player adopting a matching strategy, picking up a .4 piece (piecepress) and placing it (piecedrop) to fill a .4 gap (Table 2). After accidentally putting the piece in the wrong location (row 2 of Table 2), the player then places it correctly (row 4).

On the next screen (Table 3), the player continues the matching strategy, using a .8 piece to fill a .8 gap (rows 5–6 of Table 3). However, there is more than one .8 gap on this screen and only one .8 piece. Thus, after using the .8 piece, the player switches to an additive strategy, using two pieces (.7 and .1) to fill the second gap. After accidentally misplacing the .1 piece (rows 9–10), the player places it successfully (rows 11–12).

Table 2

Sample Tracking Data From Railroad Repair (First Screen)

Row no.  Event       Piece   Round  Successful placement?  Elapsed time
1        piecepress  track4  1      n/a                    7.161
2        piecedrop   track4  1      Wrong                  7.272
3        piecepress  track4  1      n/a                    8.172
4        piecedrop   track4  1      Success                10.200

Note. Pieces of track are identified by the decimal with which they are labeled in the game (e.g., track4 indicates a piece labeled .4). Elapsed time is the cumulative number of seconds that have elapsed from the beginning of the game through the event reflected in that row of the table. n/a = not applicable.

For the next several screens, the player continues to use the additive strategy, until arriving at a screen where this strategy is no longer sufficient (Table 4). After filling several large gaps, the player combines a .5 piece and a .4 piece to fill a .9 gap (rows 13–15), only to find that all of the smaller pieces have been used up, which makes it impossible to fill the remaining small gaps on the screen. Recognizing this, the player hits the “clear” button to clear the screen and start over (row 16). Then, the player employs an advanced strategy, in which the player plans ahead and reserves pieces that are necessary to fill specific gaps instead of using those same pieces to construct sums elsewhere on the screen. Thus, in this second attempt, the player fills the smaller gaps on the screen first (rows 17–20) to ensure that the smaller pieces are available when needed. Afterward, the player uses the remaining pieces to fill the larger gaps, which have more flexibility in the variety of ways they can be filled.

Table 4

Sample Tracking Data From Railroad Repair (Fifth Screen)

Row no.  Event       Piece   Round  Successful placement?  Elapsed time
13       piecedrop   track5  5      Success                280.019
14       piecepress  track4  5      n/a                    282.065
15       piecedrop   track4  5      Success                283.587
16       clear       n/a     5      n/a                    285.864
17       piecepress  track6  5      n/a                    289.234
18       piecedrop   track6  5      Success                290.996
19       piecepress  track5  5      n/a                    291.982
20       piecedrop   track5  5      Success                293.233

Note. n/a = not applicable.

¹ This point is not limited to interactive media. For a similar point regarding children’s comprehension of educational television programs, see Fisch (2000, 2004).
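For readers who want to work with rows like those in the tracking tables, the following Python sketch shows one way to load them into records and recover the sequence of pieces a player placed successfully. It is a minimal illustration, not the study's software: it assumes the rows have been exported as whitespace-separated text with the six columns shown in the tables, and the names Event, parse_row, piece_tenths, and successful_pieces are hypothetical. The sample rows are the second-screen data from Table 3.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Event:
    row: int
    event: str               # "piecepress", "piecedrop", or "clear"
    piece: Optional[str]     # e.g. "track4" for the piece labeled .4; None if n/a
    round_no: int
    outcome: Optional[str]   # "Success", "Wrong", or None if n/a
    elapsed: float           # cumulative seconds since the game began


def parse_row(line: str) -> Event:
    """Parse one whitespace-separated row; 'n/a' fields become None."""
    row, event, piece, round_no, outcome, elapsed = line.split()
    return Event(
        int(row),
        event,
        None if piece == "n/a" else piece,
        int(round_no),
        None if outcome == "n/a" else outcome,
        float(elapsed),
    )


def piece_tenths(piece: str) -> int:
    """'track4' -> 4, following the naming explained in the note to Table 2."""
    return int(piece[len("track"):])


def successful_pieces(events: List[Event], round_no: int) -> List[int]:
    """Lengths (in tenths) of the pieces placed successfully in one round, in order."""
    return [
        piece_tenths(e.piece)
        for e in events
        if e.round_no == round_no and e.event == "piecedrop" and e.outcome == "Success"
    ]


# Rows 5-12 of Table 3 (second screen), with 'n/a' written without spaces.
ROWS = """\
5 piecepress track8 2 n/a 22.090
6 piecedrop track8 2 Success 24.101
7 piecepress track7 2 n/a 25.329
8 piecedrop track7 2 Success 26.942
9 piecepress track1 2 n/a 28.503
10 piecedrop track1 2 Wrong 28.711
11 piecepress track1 2 n/a 29.099
12 piecedrop track1 2 Success 30.910
"""

events = [parse_row(line) for line in ROWS.splitlines()]
print(successful_pieces(events, 2))   # [8, 7, 1]
```

The resulting sequence (.8, then .7 and .1) is the pattern described in the text for the second screen: one matched piece followed by a two-piece sum.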

As the above examples illustrate, we found that children’s shifts in strategies were detectable, not only via in-person observations or interviews but through online tracking data too. Changes in strategies were often associated with clusters of errors (indicating the player trying unsuccessfully to use different pieces to fill a gap), use of the “clear” button (indicating the player’s recognition that a strategy was not working), and/or simply not having the necessary pieces available to fill gaps that remained on the screen. Thus, we could identify, and differentiate among, instances when children failed to progress beyond basic strategies, proceeded through more difficult problems via trial and error (without necessarily employing a fundamental change in their thinking), or shifted to more sophisticated strategies over the course of a game.
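Two of the markers just described, use of the “clear” button and clusters of errors, lend themselves to simple automated flagging. The sketch below is a hedged illustration of that idea and should not be read as the coding scheme used in the study: the function name shift_candidates, the simplified three-field row, and the threshold of two consecutive unsuccessful drops are all assumptions. Flagged rows are only candidates for a strategy shift and would still need to be interpreted against the surrounding play.

```python
from typing import List, Optional, Tuple

Row = Tuple[int, str, Optional[str]]   # (row number, event, outcome)


def shift_candidates(rows: List[Row], min_errors: int = 2) -> List[Tuple[int, str]]:
    """Flag rows that may mark a change of strategy.

    A row is flagged when the player presses "clear", or when it completes a
    run of `min_errors` consecutive unsuccessful drops (an assumed threshold).
    """
    flagged = []
    run = 0   # consecutive "Wrong" drops seen so far
    for number, event, outcome in rows:
        if event == "clear":
            flagged.append((number, "clear"))
            run = 0
        elif event == "piecedrop":
            if outcome == "Wrong":
                run += 1
                if run == min_errors:
                    flagged.append((number, "error cluster"))
            else:
                run = 0
    return flagged


# Rows 13-20 of Table 4 (fifth screen), reduced to the three fields used here.
table4 = [
    (13, "piecedrop", "Success"), (14, "piecepress", None),
    (15, "piecedrop", "Success"), (16, "clear", None),
    (17, "piecepress", None), (18, "piecedrop", "Success"),
    (19, "piecepress", None), (20, "piecedrop", "Success"),
]
print(shift_candidates(table4))   # [(16, 'clear')]
```

With this threshold, the single misplaced drop in the first-screen data would not be flagged, which is consistent with the text's reading of it as an accident rather than a breakdown in strategy.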

The usefulness of online tracking data in assessing children’s thinking (along with our initial conclusions about the process of children’s mathematical thinking) was later confirmed in the full study that followed our initial pilot test, which involved a second, larger sample of children who played the games as part of the study’s experimental treatment. As in the pilot test, patterns of responses and errors in tracking data indicated instances of matching, additive, and advanced strategies, as well as shifts from one type of strategy to another.

For example, in the full study, 145 fourth graders played Railroad Repair at least once. While playing the game for the first time, 68% of these children showed evidence of at least one use of an advanced strategy during the game, 28% progressed as far as an additive strategy, 1% never moved beyond a matching strategy, and 2% did not employ any of these strategies (and, as a result, did not provide any correct answers). Yet, despite the fact that 96% of the children used more sophisticated strategies (additive and/or advanced) at some point during the game, all but 2 of the 145 children used a matching strategy at the beginning of the game. Thus, even those children who were capable of more sophisticated reasoning typically began by using a more basic strategy and subsequently progressed to more sophisticated strategies in response to the increasing demands of the game.

CONCLUSION

Taken together, the observation, interview, and tracking data from our three games hold implications for researchers and practitioners interested in computer games, mathematics education, and/or assessment. For those interested in children’s use of educational games, the parallels between online and offline reasoning highlight the degree to which game play is influenced by players’ experience and skill level in playing games as well as by their knowledge and skills regarding educational content embedded in such games.¹ As in formal classroom mathematics, children often do not display the same level of sophistication throughout a game, even if they are capable of relatively sophisticated reasoning. Rather, their mathematical reasoning may begin at a fairly basic level but become more sophisticated over the course of a game, when necessary to respond to the game’s demands.

For math educators, this similarity between online and offline reasoning also shows that games can provide a naturalistic, out-of-school context for assessing mathematical reasoning. Indeed, even without in-person observations, data mining of online tracking data can provide a window into rich processes of reasoning and problem solving. When recorded and coded appropriately, such data can reflect both the outcomes and the process of problem solving. (As a result, as we noted earlier, we have chosen to include tracking software among the assessments in our current research on children’s learning from Cyberchase media.)

Yet, our experiences during pilot testing also point toward several challenges that researchers must overcome if they are to use tracking data effectively to measure reasoning. First, as anyone who has analyzed any sort of web-based tracking data knows, users’ clicks produce massive amounts of data. The sample data we presented above come from a single session, and even that one game produced a spreadsheet containing more than 120 rows of data. When multiplied by the literally thousands of users who might play an online game in a single day, the volume of data can become staggering, posing challenges for both storage and analysis (even if the analysis can be partially automated).

Second, online tracking data must be limited to information that can be collected legally. The Children’s Online Privacy Protection Act (COPPA) places strict limitations on the kinds of information that researchers can collect from children online. To help interpret data on game play, reasoning, or problem solving, researchers naturally look to characteristics of the players (such as age, gender, level of experience or prior knowledge), but COPPA can make it difficult to gather such information online. Because our project was part of a larger research study, and we had parents’ signed consent for their children’s participation, we designed the tracking software to record data only for players whose user names matched those in our study. Outside the context of such studies, however, researchers must either find alternate ways to gather demographic data or do without it.
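The practice of recording data only for enrolled players amounts to checking each user name against a list before anything is stored. A minimal Python sketch of that check follows. It is an assumption-laden illustration rather than the authors' implementation: the function record_event, the ENROLLED set, and the example user names are all hypothetical, and a real deployment would need to satisfy whatever legal and ethical review applies.

```python
from typing import Dict, List, Set

# Hypothetical user names of children whose parents gave signed consent.
ENROLLED: Set[str] = {"pilot_user_01", "pilot_user_02"}


def record_event(log: List[Dict], user_name: str, event: Dict,
                 enrolled: Set[str] = ENROLLED) -> bool:
    """Store the event only for enrolled players and report whether it was kept."""
    if user_name not in enrolled:
        return False              # nothing is stored for players outside the study
    log.append({"user": user_name, **event})
    return True


log: List[Dict] = []
record_event(log, "pilot_user_01", {"event": "piecedrop", "piece": "track4"})
record_event(log, "anonymous_visitor", {"event": "piecedrop", "piece": "track8"})
print(len(log))   # 1
```

Filtering at the moment of recording, rather than deleting rows afterward, means that data from players outside the study are never stored in the first place.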

Third, tracking data are effective only for behavior that players perform clearly and unambiguously on the screen. Whereas the use of tracking data was highly successful for Railroad Repair, it was only partially successful for Sleuths on the Loose (http://pbskids.org/cyberchase/games/bodymath), a game about measurement and proportional reasoning in which children use proportional reasoning to infer the size of “baby creatures” and “mama creatures” from the size of their footprints. In Sleuths on the Loose, we could accurately record and code the answers that children provided, but it was harder to gauge their use of measurement for two reasons. Instead of using the on-screen “ruler” that served as a measuring tool, some children measured via alternate means such as holding their fingers up to the screen; the software could not detect these sorts of offline behavior. In addition, even when children did use the on-screen ruler, tracking data alone were not always a reliable indicator of whether a player was attempting to measure, because some children simply moved the ruler idly around the screen while thinking. Thus, we could tell whether players’ answers were correct or incorrect, and identify some instances when players used the on-screen ruler for measurement (by establishing parameters for valid placement of the ruler). However, only in-person observation could identify other cases of measurement.
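The article does not specify which parameters defined a valid placement of the ruler, so the following Python sketch is purely hypothetical. It illustrates one plausible form such a rule could take: the ruler counts as a measurement attempt only if it sits close to a footprint's edge and stays there briefly, which would filter out idle dragging. The class RulerSample, the function is_measurement, the pixel tolerance, and the hold time are all invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class RulerSample:
    x: float        # position of the ruler's zero mark, in screen pixels
    y: float
    held_s: float   # seconds the ruler stayed at this position


def is_measurement(sample: RulerSample, target_x: float, target_y: float,
                   tolerance_px: float = 10.0, min_hold_s: float = 1.0) -> bool:
    """Treat a placement as a measurement attempt only if it is near the
    target point and held for at least `min_hold_s` seconds.

    Both thresholds are hypothetical defaults, not values from the study.
    """
    near = (abs(sample.x - target_x) <= tolerance_px
            and abs(sample.y - target_y) <= tolerance_px)
    return near and sample.held_s >= min_hold_s


# A ruler parked at a footprint's edge versus one dragged past it.
print(is_measurement(RulerSample(102.0, 198.0, 2.5), 100.0, 200.0))   # True
print(is_measurement(RulerSample(102.0, 198.0, 0.2), 100.0, 200.0))   # False
```

Any rule of this kind can only capture on-screen behavior; as the text notes, measuring with fingers held up to the screen would remain invisible to it.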

As our experience in this study makes clear, educational games can provide a rich context for assessing and studying children’s naturalistic reasoning. In addition, understanding children’s facility with educational content is an important part of understanding how they play educational games. Games and tracking software must be designed carefully in order to produce useful, reliable data. But if they are designed properly, data mining can provide us with deep insight into children’s thinking and reasoning, without our having to peek over children’s shoulders to do it.

REFERENCES

Corbett, A. T., & Anderson, J. R. (1995). Knowledge tracing: Modeling the acquisition of procedural knowledge. User Modeling and User-Adapted Interaction, 4, 253–278.

Fisch, S. M. (2000). A capacity model of children’s comprehension of educational content on television. Media Psychology, 2, 63–91.

Fisch, S. M. (2004). Children’s learning from educational television: Sesame Street and beyond. Mahwah, NJ: Erlbaum.

Gee, J. P. (2003). What video games have to teach us about learning and literacy. New York: Palgrave-MacMillan.

Gunzelmann, G., & Gluck, K. A. (2004, May). Knowledge tracing for complex training applications: Beyond Bayesian mastery estimates. Paper presented at the Simulation Interoperability Standards Organization’s 13th conference on Behavior Representation in Modeling and Simulation, Orlando, FL.

Lesh, R. A., Hoover, M., Hole, B., Kelly, A., & Post, T. (2000). Principles for developing thought-revealing activities for students and teachers. In A. E. Kelly & R. A. Lesh (Eds.), Handbook of research design in mathematics and science education (pp. 591–646). Mahwah, NJ: Erlbaum.

Mayer, R. E., & Moreno, R. (2003). Nine ways to reduce cognitive load in multimedia learning. Educational Psychologist, 38, 43–52.

Moreno, R. (2006). Learning in high-tech and multimedia environments. Current Directions in Psychological Science, 15, 63–67.

Price, R. (1989). An historical perspective on the design of computer-assisted instruction: Lessons from the past. Computers in the Schools, 6, 145–157.

Reeves, B., & Nass, C. (1996). The media equation: How people treat computers, television, and new media like real people and places. New York: Cambridge University Press.

Rudestam, K. E., & Schoenholtz-Read, J. (Eds.). (2002). Handbook of online learning: Innovations in higher education and corporate training. Thousand Oaks, CA: Sage.

Siegler, R. S. (2007). Cognitive variability. Developmental Science, 10, 104–109.

Strommen, E. F. (2003, April). Interacting with people versus interacting with machines: Is there a meaningful difference from the point of view of theory? In S. M. Fisch (Chair), Theoretical approaches toward integrating cognitive and social processing of media. Symposium presented at the biennial meeting of the Society for Research in Child Development, Tampa, FL.

Subrahmanyam, K., & Greenfield, P. M. (2008). Virtual worlds in development: Implications of social networking sites. Journal of Applied Developmental Psychology, 29, 417–419.

Suppes, P., & Macken, E. (1978). The historical path from research and development to operation use of CAI. Educational Technology, 18(4), 9–11.
