VizEval: An Experimental System for the Study of Program Visualization Quality

Philippa Rhodes, Eileen Kraemer, Ashley Hamilton-Taylor, Sujith Thomas, and Matthew Ross
Computer Science Dept., The University of Georgia
Athens, GA USA 30602
{rhodes, eileen, ataylor, sujith, ross}@cs.uga.edu

Elizabeth Davis, Kenneth Hailston, Keith Main
School of Psychology, Georgia Institute of Technology
Atlanta, GA
[email protected], [email protected]
Abstract
The VizEval Suite* is an environment designed to support experimentation with and evaluation of program visualization attributes that affect the user's ability to grasp essential concepts. In this paper, we describe the VizEval Suite and an initial experiment conducted both as a test-bed of the VizEval Suite and to study how perceptual/cognitive characteristics of the visualization affect the user's understanding of the program visualization. VizEval is designed to simplify the creation and analysis of such studies. Our experimental results show that some perceptual/cognitive characteristics that help one task (e.g., detection of critical information) may harm another (e.g., localization of critical items), and vice versa. The VizEval software is available for download at http://www.cs.uga.edu/~eileen/VizEval/.
1. Introduction
In the context of this paper we define program visualization (PV) as the use of animated views of computer programs to enhance human understanding of the behavior and properties of those programs. We include animation of data structures, source code, performance, and other aspects of program structure or behavior.
Although many programmers, instructors, and programming students have a strong intuitive belief that visualization is valuable for communicating information about the state and behavior of programs, empirical studies have yielded mixed results. One contributing factor may be that viewers have difficulty understanding the message the designer is trying to convey.

* This work was supported by the National Science Foundation under grants NSF-IIS 0308117 and NSF-IIS 0308063.
In this work we look at the ability of the visualization to help the user understand this message. We are working to identify and evaluate perceptual, attentional, and cognitive features of program visualizations that affect viewer comprehension, and to categorize and quantify these results. This work is performed in the context of a larger project that involves observational studies of instructors, empirical studies of PV effectiveness at the level of algorithm understanding (SSEA: System to Study the Effectiveness of Animations [11]), and the development of improved presentation and interaction techniques for program visualization in the context of computer science education (SKA: Support Kit for Animation [6]).
We have developed a testing environment, the VizEval Suite, that permits us to easily design, conduct, and evaluate the results of these experiments. In this paper we describe VizEval and present results of an initial study of attributes commonly used in animations of sorting algorithms. In these animations, bars represent data elements: a bar's location indicates the element's index within an array, and its height indicates the element's value. Bars change height as elements are exchanged, and cueing is used to draw the viewer's attention to elements that are being compared or swapped. We look at the effects of various perceptual characteristics on detection (the user notices that a change has occurred) and localization (the user is able to identify which element has changed value).
2. Related work
Empirical studies of the effectiveness of program visualizations for teaching computer algorithms have had mixed results, with some studies showing advantages for the use of animations [7,12,13,18], some showing benefits at least partially attributable to visualization [2,9], other studies failing to show clear benefit [15,16], and at least one study showing a significant disadvantage to the use of visualization [14].
A good survey and meta-study of studies conducted prior to 2002 can be found in [10].

Gurka and Citrin [5] address factors that must be considered when evaluating these results and provide a framework for experiments performed in this field. Our research focuses on the quality of the visualizations, one of the factors enumerated by Gurka and Citrin. We define quality loosely to mean those attributes that correlate with the ability of a visualization to convey a desired concept.

In the work described in this paper we have narrowed our focus to concentrate on low-level studies of attributes, to determine their individual effects on perception and cognition. One software package used frequently in such cognitive and perceptual research is E-Prime, developed by Psychology Software Tools, Inc. [4]. However, it lacks some features needed for experiments targeted at program visualization.
3. VizEval Suite
The VizEval Suite allows the experimenter to develop an experiment, to deploy that experiment, and to collect and organize the output. Automation is important because of the size and complexity of these experiments, which may consist of hundreds of trials and complex ordering within a trial or across test participants. An experiment consists of a number of blocks, and each block contains some number of trials. In each trial, the participant views a short animation and is then asked a series of questions about what she saw and understood.
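As a rough illustration of this experiment/block/trial hierarchy, the sketch below models it as plain Java classes. The class and field names are ours, chosen for illustration; the paper does not document VizEval's internal data model.

// Hypothetical sketch of the hierarchy described above; names are
// illustrative, not taken from VizEval's actual source.
import java.util.List;

class Experiment {
    List<Block> blocks;        // an experiment consists of blocks
}

class Block {
    List<Trial> trials;        // each block contains some number of trials
}

class Trial {
    String animationFile;      // the short animation the participant views
    List<String> questions;    // the questions asked after the animation
}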
Figure 1 depicts the system architecture of the VizEval Suite, which consists of SKA (the Support Kit for Animation) [6], TestCreator, FileCreator, TestTaker, and Utility modules.

SKA [6] is a combination of a visual data structure library, a visual data structure diagram manipulation environment, and an algorithm animation system, all designed for use in an instructional setting. In the context of the VizEval Suite, SKA serves as the graphics and animation engine. As a result, we are able to directly apply lessons learned in the VizEval environment to the continuing refinement of SKA.
Figure 1. The VizEval Architecture
Graphical objects and their animations are specified in graphics and animation files, respectively. These are simple text files that are processed by SKA at run-time. While such files may be created manually using a text editor, it is desirable in the case of large experiments to use FileCreator to automate this process.
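The paper does not give the syntax of these text files, so the following is purely a hypothetical sketch of what a graphics file and an animation file might contain:

# hypothetical graphics file: declare bars with positions and heights
bar b0 x=10 height=44
bar b1 x=40 height=110

# hypothetical animation file: script the cueing and height changes
flash b1             # cue the viewer's attention by flashing
setHeight b0 198     # the element's value (bar height) changes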
TestCreator facilitates the design and generation of experiment test files. It leads the experiment designer step-by-step through the process of specifying each block, the trials within each block, and the graphics, animations, and questions associated with each trial. TestCreator supports five different types of questions: mouse (requires user interaction with a mouse), keyboard (requires interaction through the keyboard), multiple choice, N-Point (e.g., Likert scale), and Yes/No (True/False). Additional customized question types may be created by extending the class Question.
The experiment file generated by TestCreator contains all the information needed to run the experiment, including user instructions, questions, start method (enter key, mouse click, space bar), attractor (countdown timer, etc.), the graphics and animation files to be used, as well as various timing and flow-of-control parameters.
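Again purely as a hypothetical illustration (the experiment file format is not specified in the paper), such a file might group these settings like this:

# hypothetical experiment file fragment
instructions = "Fixate the top center of the screen."
startMethod  = spacebar          # enter key, mouse click, or space bar
attractor    = countdown         # countdown timer, etc.
trial 1:
    graphics  = setsize8.gfx
    animation = cue1change1.anim
    questions = detect.yesno, locate.mouse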
TestTaker is the execution environment for the experiments. It keeps track of the user data, the date and time at which the experiment was conducted, and other information such as the height and width of the screen, the distance of the eyes from the screen, and the directory into which the log files are written. Through TestTaker, animations and associated questions are displayed. User responses and other needed information are written into a log file. The utility programs parse these log files to extract and analyze the data.
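The log file layout and the utility programs' code are not described in the paper; the fragment below only sketches the kind of extraction they might perform, assuming a simple comma-separated record per response.

// Hypothetical log-parsing sketch; the comma-separated layout
// (trial, question, response, responseTimeMs) is an assumption,
// not the actual VizEval log format.
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class LogSummary {
    public static void main(String[] args) throws IOException {
        long totalMs = 0;
        int count = 0;
        for (String line : Files.readAllLines(Path.of(args[0]))) {
            String[] fields = line.split(",");
            totalMs += Long.parseLong(fields[3].trim()); // response time
            count++;
        }
        System.out.printf("mean response time: %.1f ms%n",
                          (double) totalMs / count);
    }
}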
Figure 2 depicts a sample user session. In this case, eight bars are shown. One or two bars have been cued (by flashing), and zero, one, or two bars have changed
height. The user is then asked a series of questions to determine whether they noticed that something changed (detection) and whether they can identify the object that changed (localization).
Figure 2. The TestTaker interface during a session.
4. Experiment
We performed an experiment both as a test-bed for the VizEval Suite and to study how varying perceptual/cognitive aspects of the visual display affect users' comprehension of the program visualization. We explored two simple, fundamental tasks for processing information presented in the animation: detection of whether critical changes had occurred, and localization of where they had occurred.
4.1 Method

Subjects: Georgia Tech students participated in this study. All 36 subjects had 20/20 vision after any necessary refractive correction.
Apparatus: Two Dell Dimension desktop computers with Sony Trinitron 19-inch color monitors were used. Subjects interacted with the TestTaker module of the VizEval Suite, which managed the graphics and animations and logged subjects' responses.
Stimuli: A variable number of bars (set size of 4, 8, or 16) were displayed across the screen, as shown in Figure 2. The saturated green bars were presented against a faint gray background. Each bar was approximately 0.75° of visual angle in width, so that the individual bars were clearly visible [1,3,8,17]. Bar height represents the value or weight of each data element. The bars varied in height from 44 to 374 pixels, in increments of 22 pixels ((374 − 44)/22 + 1 = 16 distinct height levels). Preliminary studies showed that an increment of 22 pixels was clearly detectable, even in the far peripheral portion of the screen. The screen subtended 20° of visual angle, all within the useful field of view (UFOV).
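For reference, the standard psychophysics relation (not stated in the paper) between a stimulus of physical size s, viewing distance d, and subtended visual angle θ is

\theta = 2 \arctan\left( \frac{s}{2d} \right)

which is presumably why TestTaker records the distance of the eyes from the screen: the same pixel width subtends a different visual angle at a different viewing distance.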
Design: Four variables were manipulated for each subject: (a) labeled (a letter presented underneath each bar) vs. unlabeled bars, (b) display set size (4, 8, or 16 bars), (c) number of bars cued (1 or 2) by flashing, and (d) number of bars that changed height (0, 1, or 2).
Procedure: Each subject completed two blocks of trials, one with labels and one without, and the order was counterbalanced across subjects. The number of bars displayed (set size), the number of bars that changed height, and the number of cues were randomly varied within each block of 288 trials. When two bars either were cued or changed heights, the bars always appeared on opposite sides of the display. This encouraged subjects to simultaneously monitor both sides of the display.
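As an illustration of this within-block randomization, the sketch below enumerates the factor combinations and shuffles them. The 16 repetitions per condition are our inference from 288 trials over 3 × 3 × 2 = 18 combinations; the paper does not say how VizEval actually orders trials.

// Sketch: randomized trial order for one 288-trial block.
// The 16 repetitions per condition follow from 288 / (3*3*2) = 16;
// this is inferred, not stated in the paper.
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

record Condition(int setSize, int numChanged, int numCues) {}

public class BlockBuilder {
    public static List<Condition> build() {
        List<Condition> trials = new ArrayList<>();
        for (int setSize : new int[] {4, 8, 16})
            for (int numChanged : new int[] {0, 1, 2})
                for (int numCues : new int[] {1, 2})
                    for (int rep = 0; rep < 16; rep++)
                        trials.add(new Condition(setSize, numChanged, numCues));
        Collections.shuffle(trials);   // randomize order within the block
        return trials;                 // 3 * 3 * 2 * 16 = 288 trials
    }
}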
Subjects fixated the top center of the screen, and then pressed the spacebar to begin a countdown presented at the point of fixation. The trial began a brief, randomized period after the countdown had ended. On each trial, subjects answered the following five questions by responding with the mouse:

1. Did any of the bars change height? (Yes or No)
2. Please click on the bar most likely to have changed height.
3. How confident are you that this bar actually changed height? (1 = least, 7 = most confident)
4. Please click on the bar second most likely to have changed height.
5. How confident are you that this bar actually changed height? (1 = least, 7 = most confident)
Subjects always had to select two bars and, if they had not detected a change, provide a low-confidence rating.
5. Results and discussion
The VizEval Suite allowed us to modify aspects of the display animations that can affect perceptual and cognitive processing. In doing this, we uncovered some intriguing findings. The results for detection and localization of changes in bar height differed, such that some perceptual/cognitive characteristics helped detection but hurt localization, and vice versa. First, detection was significantly better when two bars simultaneously changed height than when only one changed (F(1,33) = 62.96, p
localization performance. For localization there was a set-size effect, so performance was worse when more bars were displayed; however, this set-size effect was not caused by either a failure to detect changes in bar heights or a failure to attend to widely separated locations. Third, labels significantly improved localization performance (F(1,33) = 7.88, p