
VizEval: An Experimental System for the Study of Program Visualization Quality

Philippa Rhodes, Eileen Kraemer, Ashley Hamilton-Taylor, Sujith Thomas, and Matthew Ross
Computer Science Dept., The University of Georgia
Athens, GA USA 30602
{rhodes, eileen, ataylor, sujith, ross}@cs.uga.edu

Elizabeth Davis, Kenneth Hailston, Keith Main
School of Psychology
Georgia Institute of Technology, Atlanta, GA
[email protected], [email protected], [email protected]

    Abstract

The VizEval Suite* is an environment designed to support experimentation with and evaluation of program visualization attributes that affect the user's ability to grasp essential concepts. In this paper, we describe the VizEval Suite and an initial experiment conducted both as a test-bed of the VizEval Suite and to study how perceptual/cognitive characteristics of the visualization affect the user's understanding of the program visualization. VizEval is designed to simplify the creation and analysis of such studies. Our experimental results show that some perceptual/cognitive characteristics that help one task (e.g., detection of critical information) may harm another (e.g., localization of critical items), and vice versa. The VizEval software is available for download at http://www.cs.uga.edu/~eileen/VizEval/.

    1. Introduction

In the context of this paper we define program visualization (PV) as the use of animated views of computer programs to enhance human understanding of the behavior and properties of those programs, and include animation of data structures, source code, performance, and other aspects of the program's structure or behavior.

Although many programmers, instructors, and programming students have a strong intuitive belief that visualization is valuable for communicating information about the state and behavior of programs, empirical studies have yielded mixed results. One contributing factor may be that viewers may have difficulty in understanding the message the designer is trying to convey.

* This work was supported by the National Science Foundation under grants NSF-IIS 0308117 and NSF-IIS 0308063.

In this work we look at the ability of the visualization to help the user understand this message. We are working to identify and evaluate perceptual, attentional, and cognitive features of program visualizations that affect viewer comprehension, and to categorize and quantify these results. This work is performed in the context of a larger project that involves observational studies of instructors, empirical studies of PV effectiveness at the level of algorithm understanding (SSEA: System to Study the Effectiveness of Animations [11]), and the development of improved presentation and interaction techniques for program visualization in the context of computer science education (SKA: Support Kit for Animation [6]).

We have developed a testing environment, the VizEval Suite, that permits us to easily design, conduct, and evaluate the results of these experiments. In this paper we describe VizEval and present results of an initial study of attributes commonly used in animations of sorting algorithms. In these animations, bars represent data elements: a bar's location indicates the index within an array and its height indicates the value. Bars change height as elements are exchanged, and cueing is used to draw the viewer's attention to elements that are being compared or swapped. We look at the effects of various perceptual characteristics on detection (the user notices that a change has occurred) and localization (the user is able to identify which element has changed value).


    2. Related work

Empirical studies of the effectiveness of program visualizations for teaching computer algorithms have had mixed results, with some studies showing advantages for the use of animations [7,12,13,18], some showing benefits at least partially attributable to visualization [2,9], other studies failing to show clear benefit [15,16], and at least one study showing a significant disadvantage to the use of visualization [14]. A good survey and meta-study of studies conducted prior to 2002 can be found in [10].

Gurka and Citrin [5] address factors that must be considered when evaluating these results and provide a framework for experiments performed in this field. Our research focuses on the quality of the visualizations, one of the factors enumerated by Gurka and Citrin. We define quality loosely to mean those attributes that correlate with the ability of a visualization to convey a desired concept.

In the work described in this paper we have narrowed our focus to concentrate on low-level studies of attributes to determine their individual effects on perception and cognition. One software package used frequently in such cognitive and perceptual research is E-Prime, developed by Psychology Software Tools, Inc. [4]. However, it lacks some features needed for experiments targeted at program visualization.

    3. VizEval Suite

The VizEval Suite allows the experimenter to develop an experiment, to deploy that experiment, and to collect and organize the output. Automation is important because of the size and complexity of these experiments, which may consist of hundreds of trials and complex ordering within a trial or across test participants. An experiment consists of a number of blocks; each block contains some number of trials. In each trial, the participant views a short animation and is then asked a series of questions about what she saw and understood.
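As a minimal sketch of this experiment/block/trial hierarchy (all type and field names below are illustrative assumptions, not the actual VizEval types):

    // Illustrative data model for the hierarchy described above;
    // none of these names come from the VizEval sources.
    import java.util.List;

    record Trial(String graphicsFile, String animationFile, List<String> questions) {}
    record Block(List<Trial> trials) {}
    record Experiment(List<Block> blocks) {}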

Figure 1 depicts the system architecture of the VizEval Suite, which consists of SKA (the Support Kit for Animation) [6], TestCreator, FileCreator, TestTaker, and Utility modules.

SKA [6] is a combination of a visual data structure library, a visual data structure diagram manipulation environment, and an algorithm animation system, all designed for use in an instructional setting. In the context of the VizEval Suite, SKA serves as the graphics and animation engine. As a result, we are able to directly apply lessons learned in the VizEval environment to continuing refinement of SKA.

    Figure 1. The VizEval Architecture

Graphical objects and their animations are specified in graphics and animation files, respectively. These are simple text files that are processed by SKA at run-time. While such files may be created manually using a text editor, it is desirable in the case of large experiments to use FileCreator to automate this process.
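The paper does not reproduce the file syntax, so the fragment below is purely illustrative; every keyword and name in it is a hypothetical stand-in for whatever SKA actually accepts:

    # graphics file (hypothetical syntax): declare two bars
    bar b0 x=40  height=110
    bar b1 x=100 height=242

    # animation file (hypothetical syntax): cue b1, then change its height
    flash     b1 duration=500
    setHeight b1 264 delay=1500

FileCreator would emit many such file pairs for a large experiment, varying set size, cues, and height changes across trials.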

TestCreator facilitates the design and generation of experiment test files. It leads the experiment designer step-by-step through the process of specifying each block, the trials within each block, and the graphics, animations, and questions associated with each trial. TestCreator supports five different types of questions: mouse (requires user interaction with a mouse), keyboard (requires interaction through the keyboard), multiple choice, N-point (e.g., Likert scale), and Yes/No (True/False). Additional customized question types may be created by extending the class Question, as sketched below.

The experiment file generated by TestCreator contains all the information needed to run the experiment, including user instructions, questions, start method (enter key, mouse click, space bar), attractor (countdown timer, etc.), graphics and animation files to be used, as well as various timing and flow-of-control parameters.
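As with the graphics and animation files, the experiment-file grammar is not given in the paper; combining the elements listed above, a single trial entry might look roughly like this (all keywords hypothetical):

    # one trial (hypothetical syntax)
    trial
      graphics   bars8.gfx
      animation  cue2change1.anim
      start      spacebar
      attractor  countdown
      question   yesno "Did any of the bars change height?"
      question   mouse "Please click on the bar most likely to have changed height."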

TestTaker is the execution environment for the experiments. It keeps track of the user data, the date and time at which the experiment was conducted, and other metrics such as the height and width of the screen, the distance of the eyes from the screen, and the directory into which the log files are written. Through TestTaker, animations and associated questions are displayed. User responses and other needed information are written into a log file. The utility programs parse these log files to extract and analyze the data.
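The log format is likewise not specified in the paper; a minimal parsing utility, assuming one comma-separated response record per line, might look like this:

    // Minimal sketch of a log-parsing utility. The field layout
    // (trial id, question id, response, latency in ms) is an assumed
    // example, not VizEval's actual log format.
    import java.io.IOException;
    import java.nio.file.Files;
    import java.nio.file.Path;

    class LogParser {
        public static void main(String[] args) throws IOException {
            for (String line : Files.readAllLines(Path.of(args[0]))) {
                String[] f = line.split(",");
                System.out.printf("trial=%s question=%s response=%s latency=%sms%n",
                                  f[0], f[1], f[2], f[3]);
            }
        }
    }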

Figure 2 depicts a sample user session. In this case eight bars are shown. One or two bars have been cued (by flashing) and zero, one, or two bars have changed height. The user is then asked a series of questions to determine whether she noticed that something changed (detection) and can identify the object that changed (localization).

Figure 2. The TestTaker interface during a session.

    4. Experiment

We performed an experiment both as a test-bed for the VizEval Suite and to study how varying perceptual/cognitive aspects of the visual display affect users' comprehension of the program visualization. We explored two simple, fundamental tasks for processing information presented in the animation: detection of whether critical changes had occurred and localization of where they had occurred.

4.1 Method

Subjects: Georgia Tech students participated in this study. All 36 subjects had 20/20 vision after any necessary refractive correction.

Apparatus: Two Dell Dimension desktop computers with Sony Trinitron 19-inch color monitors were used. Subjects interacted with the TestTaker module of the VizEval Suite, which managed the graphics and animations and logged subjects' responses.

Stimuli: A variable number of bars (set size of 4, 8, or 16) were displayed across the screen, as shown in Figure 2. The saturated green bars were presented against a faint gray background; each bar was approximately 0.75° of visual angle in width, so that the individual bars were clearly visible [1,3,8,17]. Bar height represents the value or weight of each data element. The bars varied in height from 44 to 374 pixels, in increments of 22 pixels. Preliminary studies showed that an increment of 22 pixels was clearly detectable, even in the far peripheral portion of the screen. The screen subtended 20° of visual angle, all within the useful field of view (UFOV).
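For readers unfamiliar with visual-angle units, an object of size s viewed from distance d subtends

    θ = 2 · arctan( s / (2d) )

At the conventional viewing distance of about 57 cm, 1 cm on the screen subtends roughly 1° of visual angle, so a 0.75° bar corresponds to roughly 0.75 cm of width at that distance. The actual eye-to-screen distance in this study was logged by TestTaker but is not reported here.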

Design: Four variables were manipulated for each subject: (a) labeled (a letter presented underneath each bar) vs. unlabeled bars, (b) display set size (4, 8, or 16 bars), (c) number of bars cued (1 or 2) by flashing, and (d) number of bars that changed height (0, 1, or 2).
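Crossing these factors gives 2 × 3 × 2 × 3 = 36 distinct conditions. Since the label factor was varied across blocks (see Procedure below), each 288-trial block crossed set size, cues, and height changes (3 × 2 × 3 = 18 cells), which would give 16 trials per cell if the randomization was balanced; the paper does not report per-cell counts, so this is only an inference.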

Procedure: Each subject completed two blocks of trials, one with labels and one without, and the order was counterbalanced across subjects. The number of bars displayed (set size), number of bars that changed height, and number of cues were randomly varied within each block of 288 trials. When two bars either were cued or changed heights, the bars always appeared on opposite sides of the display. This encouraged subjects to simultaneously monitor both sides of the display.

Subjects fixated the top center of the screen, and then pressed the spacebar to begin a countdown presented at the point of fixation. The trial began a brief, randomized period after the countdown had ended. On each trial, subjects answered the following five questions by responding with the mouse:

1. Did any of the bars change height? (Yes or No)
2. Please click on the bar most likely to have changed height.
3. How confident are you that this bar actually changed height? (1 = least confident, 7 = most confident)
4. Please click on the bar second most likely to have changed height.
5. How confident are you that this bar actually changed height? (1 = least confident, 7 = most confident)

Subjects always had to select two bars and, if they had not detected a change, provide a low-confidence rating.

    5. Results and discussion

The VizEval Suite allowed us to modify aspects of the display animations that can affect perceptual and cognitive processing. In doing this, we uncovered some intriguing findings. The results for detection and localization of changes in bar height differed, so that some perceptual/cognitive characteristics helped detection but hurt localization, and vice versa. First, detection was significantly better when two bars simultaneously changed height than if only one changed (F(1,33) = 62.96, p


localization performance. For localization there was a set-size effect, so performance was worse when more bars were displayed; however, this set-size effect was not caused by either failure to detect changes in bar heights or failure to attend to widely separated locations. Third, labels significantly improved localization performance (F(1,33) = 7.88, p