VizEval: An Experimental System for the Study of Program Visualization Quality

Philippa Rhodes, Eileen Kraemer, Ashley Hamilton-Taylor, Sujith Thomas, and Matthew Ross
Computer Science Dept., The University of Georgia
Athens, GA USA 30602
{rhodes, eileen, ataylor, sujith, ross}@cs.uga.edu

Elizabeth Davis, Kenneth Hailston, Keith Main
School of Psychology, Georgia Institute of Technology
Atlanta, GA
[email protected], [email protected]
Abstract
The VizEval Suite* is an environment designed to support experimentation with and evaluation of program visualization attributes that affect the user's ability to grasp essential concepts. In this paper, we describe the VizEval Suite and an initial experiment conducted both as a test-bed of the VizEval Suite and to study how perceptual/cognitive characteristics of the visualization affect the user's understanding of the program visualization. VizEval is designed to simplify the creation and analysis of such studies. Our experimental results show that some perceptual/cognitive characteristics that help one task (e.g., detection of critical information) may harm another (e.g., localization of critical items), and vice versa. The VizEval software is available for download at http://www.cs.uga.edu/~eileen/VizEval/.
1. Introduction
In the context of this paper we define program visualization (PV) as the use of animated views of computer programs to enhance human understanding of the behavior and properties of those programs. We include animation of data structures, source code, performance, and other aspects of program structure or behavior.
Although many programmers, instructors, and programming students have a strong intuitive belief that visualization is valuable for communicating information about the state and behavior of programs, empirical studies have yielded mixed results. One contributing factor may be that viewers have difficulty understanding the message the designer is trying to convey.

* This work was supported by the National Science Foundation under grants NSF-IIS 0308117 and NSF-IIS 0308063.
In this work we look at the ability of the visualization to help the user understand this message. We are working to identify and evaluate perceptual, attentional, and cognitive features of program visualizations that affect viewer comprehension, and to categorize and quantify these results. This work is performed in the context of a larger project that involves observational studies of instructors, empirical studies of PV effectiveness at the level of algorithm understanding (SSEA: System to Study the Effectiveness of Animations [11]), and the development of improved presentation and interaction techniques for program visualization in the context of computer science education (SKA: Support Kit for Animation [6]).
We have developed a testing environment, the VizEval Suite, that permits us to easily design, conduct, and evaluate the results of these experiments. In this paper we describe VizEval and present results of an initial study of attributes commonly used in animations of sorting algorithms. In these animations, bars represent data elements: a bar's location indicates the element's index within an array, and its height indicates the element's value. Bars change height as elements are exchanged, and cueing is used to draw the viewer's attention to elements that are being compared or swapped. We look at the effects of various perceptual characteristics on detection (the user notices that a change has occurred) and localization (the user is able to identify which element has changed value).
2. Related work
Empirical studies of the effectiveness of program visualizations for teaching computer algorithms have had mixed results, with some studies showing advantages for the use of animations [7,12,13,18], some showing benefits at least partially attributable to visualization [2,9], other studies failing to show clear benefit [15,16], and at least one study showing a significant disadvantage to the use of visualization [14].
A good survey and meta-study of studies conducted prior to 2002 can be found in [10].

Gurka and Citrin [5] address factors that must be considered when evaluating these results and provide a framework for experiments performed in this field. Our research focuses on the quality of the visualizations, one of the factors enumerated by Gurka and Citrin. We define quality loosely to mean those attributes that correlate with the ability of a visualization to convey a desired concept.

In the work described in this paper we have narrowed our focus to concentrate on low-level studies of attributes, to determine their individual effects on perception and cognition. One software package used frequently in such cognitive and perceptual research is E-Prime, developed by Psychology Software Tools, Inc. [4]. However, it lacks some features needed for experiments targeted at program visualization.
3. VizEval Suite
The VizEval Suite allows the experimenter to develop an experiment, to deploy that experiment, and to collect and organize the output. Automation is important because of the size and complexity of these experiments, which may consist of hundreds of trials and complex ordering within a trial or across test participants. An experiment consists of a number of blocks, and each block contains some number of trials. In each trial, the participant views a short animation and is then asked a series of questions about what she saw and understood.
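As a rough illustration of this experiment/block/trial hierarchy, the sketch below models it as plain Java classes. The class and field names are ours, chosen for illustration; the paper does not document VizEval's internal data model.

// Hypothetical sketch of the hierarchy described above; names are
// illustrative, not taken from VizEval's actual source.
import java.util.List;

class Experiment {
    List<Block> blocks;        // an experiment consists of blocks
}

class Block {
    List<Trial> trials;        // each block contains some number of trials
}

class Trial {
    String animationFile;      // the short animation the participant views
    List<String> questions;    // the questions asked after the animation
}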
Figure 1 depicts the system architecture of the VizEval Suite, which consists of SKA (the Support Kit for Animation) [6], TestCreator, FileCreator, TestTaker, and Utility modules.

SKA [6] is a combination of a visual data structure library, a visual data structure diagram manipulation environment, and an algorithm animation system, all designed for use in an instructional setting. In the context of the VizEval Suite, SKA serves as the graphics and animation engine. As a result, we are able to directly apply lessons learned in the VizEval environment to the continuing refinement of SKA.
Figure 1. The VizEval Architecture
Graphical objects and their animations are specified in graphics and animation files, respectively. These are simple text files that are processed by SKA at run-time. While such files may be created manually using a text editor, it is desirable in the case of large experiments to use FileCreator to automate this process.
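The paper does not give the syntax of these text files, so the following is purely a hypothetical sketch of what a graphics file and an animation file might contain:

# hypothetical graphics file: declare bars with positions and heights
bar b0 x=10 height=44
bar b1 x=40 height=110

# hypothetical animation file: script the cueing and height changes
flash b1             # cue the viewer's attention by flashing
setHeight b0 198     # the element's value (bar height) changes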
TestCreator facilitates the design and generation of experiment test files. It leads the experiment designer step-by-step through the process of specifying each block, the trials within each block, and the graphics, animations, and questions associated with each trial. TestCreator supports five different types of questions: mouse (requires user interaction with a mouse), keyboard (requires interaction through the keyboard), multiple choice, N-Point (e.g., Likert scale), and Yes/No (True/False). Additional customized question types may be created by extending the class Question.
The experiment file generated by TestCreator contains all the information needed to run the experiment, including user instructions, questions, start method (enter key, mouse click, space bar), attractor (countdown timer, etc.), the graphics and animation files to be used, as well as various timing and flow-of-control parameters.
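Again purely as a hypothetical illustration (the experiment file format is not specified in the paper), such a file might group these settings like this:

# hypothetical experiment file fragment
instructions = "Fixate the top center of the screen."
startMethod  = spacebar          # enter key, mouse click, or space bar
attractor    = countdown         # countdown timer, etc.
trial 1:
    graphics  = setsize8.gfx
    animation = cue1change1.anim
    questions = detect.yesno, locate.mouse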
TestTaker is the execution environment for the experiments. It keeps track of the user data, the date and time at which the experiment was conducted, and other information such as the height and width of the screen, the distance of the eyes from the screen, and the directory into which the log files are written. Through TestTaker, animations and associated questions are displayed. User responses and other needed information are written into a log file. The utility programs parse these log files to extract and analyze the data.
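The log file layout and the utility programs' code are not described in the paper; the fragment below only sketches the kind of extraction they might perform, assuming a simple comma-separated record per response.

// Hypothetical log-parsing sketch; the comma-separated layout
// (trial, question, response, responseTimeMs) is an assumption,
// not the actual VizEval log format.
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class LogSummary {
    public static void main(String[] args) throws IOException {
        long totalMs = 0;
        int count = 0;
        for (String line : Files.readAllLines(Path.of(args[0]))) {
            String[] fields = line.split(",");
            totalMs += Long.parseLong(fields[3].trim()); // response time
            count++;
        }
        System.out.printf("mean response time: %.1f ms%n",
                          (double) totalMs / count);
    }
}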
Figure 2 depicts a sample user session. In this case, eight bars are shown. One or two bars have been cued (by flashing), and zero, one, or two bars have changed
height. The user is then asked a series of questions to determine whether they noticed that something changed (detection) and whether they can identify the object that changed (localization).
Figure 2. The TestTaker interface during a session.
4. Experiment
We performed an experiment both as a test-bed for the VizEval Suite and to study how varying perceptual/cognitive aspects of the visual display affect users' comprehension of the program visualization. We explored two simple, fundamental tasks for processing information presented in the animation: detection of whether critical changes had occurred, and localization of where they had occurred.
4.1 Method

Subjects: Georgia Tech students participated in this study. All 36 subjects had 20/20 vision after any necessary refractive correction.
Apparatus: Two Dell Dimension desktop computers with Sony Trinitron 19-inch color monitors were used. Subjects interacted with the TestTaker module of the VizEval Suite, which managed the graphics and animations and logged subjects' responses.
Stimuli: A variable number of bars (set size of 4, 8, or 16) were displayed across the screen, as shown in Figure 2. The saturated green bars were presented against a faint gray background. Each bar was approximately 0.75° of visual angle in width, so that the individual bars were clearly visible [1,3,8,17]. Bar height represents the value or weight of each data element. The bars varied in height from 44 to 374 pixels, in increments of 22 pixels ((374 − 44)/22 + 1 = 16 distinct height levels). Preliminary studies showed that an increment of 22 pixels was clearly detectable, even in the far peripheral portion of the screen. The screen subtended 20° of visual angle, all within the useful field of view (UFOV).
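For reference, the standard psychophysics relation (not stated in the paper) between a stimulus of physical size s, viewing distance d, and subtended visual angle θ is

\theta = 2 \arctan\left( \frac{s}{2d} \right)

which is presumably why TestTaker records the distance of the eyes from the screen: the same pixel width subtends a different visual angle at a different viewing distance.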
Design: Four variables were manipulated for each subject: (a) labeled (a letter presented underneath each bar) vs. unlabeled bars, (b) display set size (4, 8, or 16 bars), (c) number of bars cued (1 or 2) by flashing, and (d) number of bars that changed height (0, 1, or 2).
Procedure: Each subject completed two blocks of trials, one with labels and one without, and the order was counterbalanced across subjects. The number of bars displayed (set size), the number of bars that changed height, and the number of cues were randomly varied within each block of 288 trials. When two bars either were cued or changed heights, the bars always appeared on opposite sides of the display. This encouraged subjects to simultaneously monitor both sides of the display.
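As an illustration of this within-block randomization, the sketch below enumerates the factor combinations and shuffles them. The 16 repetitions per condition are our inference from 288 trials over 3 × 3 × 2 = 18 combinations; the paper does not say how VizEval actually orders trials.

// Sketch: randomized trial order for one 288-trial block.
// The 16 repetitions per condition follow from 288 / (3*3*2) = 16;
// this is inferred, not stated in the paper.
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

record Condition(int setSize, int numChanged, int numCues) {}

public class BlockBuilder {
    public static List<Condition> build() {
        List<Condition> trials = new ArrayList<>();
        for (int setSize : new int[] {4, 8, 16})
            for (int numChanged : new int[] {0, 1, 2})
                for (int numCues : new int[] {1, 2})
                    for (int rep = 0; rep < 16; rep++)
                        trials.add(new Condition(setSize, numChanged, numCues));
        Collections.shuffle(trials);   // randomize order within the block
        return trials;                 // 3 * 3 * 2 * 16 = 288 trials
    }
}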
Subjects fixated the top center of the screen, and then pressed the spacebar to begin a countdown presented at the point of fixation. The trial began a brief, randomized period after the countdown had ended. On each trial, subjects answered the following five questions by responding with the mouse:

1. Did any of the bars change height? (Yes or No)
2. Please click on the bar most likely to have changed height.
3. How confident are you that this bar actually changed height? (1 = least, 7 = most confident)
4. Please click on the bar second most likely to have changed height.
5. How confident are you that this bar actually changed height? (1 = least, 7 = most confident)
Subjects always had to select two bars and, if they had not detected a change, provide a low-confidence rating.
5. Results and discussion
The VizEval Suite allowed us to modify aspects of the display animations that can affect perceptual and cognitive processing. In doing this, we uncovered some intriguing findings. The results for detection and localization of changes in bar height differed, such that some perceptual/cognitive characteristics helped detection but hurt localization, and vice versa. First, detection was significantly better when two bars simultaneously changed height than when only one changed (F(1,33) = 62.96, p
localization performance. For localization there was a set-size effect, so performance was worse when more bars were displayed; however, this set-size effect was not caused by either a failure to detect changes in bar heights or a failure to attend to widely separated locations. Third, labels significantly improved localization performance (F(1,33) = 7.88, p