
Eurographics Conference on Visualization (EuroVis) 2015
H. Carr, K.-L. Ma, and G. Santucci (Guest Editors)

Volume 34 (2015), Number 3

Evaluating 2D Flow Visualization Using Eye Tracking

Hsin-Yang Ho1, I-Cheng Yeh2, Yu-Chi Lai3, Wen-Chieh Lin1, and Fu-Yin Cherng1

1 National Chiao Tung University, Hsinchu, Taiwan
2 Innovation Center for Big Data and Digital Convergence, Yuan Ze University, Chung-Li, Taiwan

3 National Taiwan University of Science and Technology, Taipei, Taiwan

Abstract
Flow visualization is recognized as an essential tool in many scientific research fields, and a variety of visualization approaches have been proposed. Several studies have been conducted to evaluate their effectiveness, but these studies rarely examine performance from the perspective of visual perception. In this paper, we explore how users' visual perception is influenced by different 2D flow visualization methods. An eye tracker is used to analyze users' visual behaviors as they perform free viewing, advection prediction, flow feature detection, and flow feature identification tasks on flow field images generated by different visualization methods. We evaluate the illustration capability of five representative visualization algorithms. Our results show that eye-tracking-based evaluation provides additional insights for quantitatively analyzing the effectiveness of these visualization methods.

Categories and Subject Descriptors (according to ACM CCS): I.3.6 [Computer Graphics]: Methodology and Techniques—Ergonomics; I.3.8 [Computer Graphics]: Applications

1. Introduction

Flow fields have become a widespread research medium in the domains of oceanographic-atmospheric modeling, computational fluid dynamics simulation, and electromagnetic field analysis. Flow visualization displays flow fields to let scientists gain quick and accurate insight into flow patterns. The demands of different applications have motivated the development of various flow visualization methods [CL93, LHS08, She98, TB96]. However, only a few studies [LDM∗01, LKJ∗05, LCS∗12] are dedicated to evaluating the performance of these methods.

These evaluation studies can be divided into two categories. The first category comprises explicit/implicit task methods [FCL09, LKJ∗05, LDM∗01, LCS∗12], which ask participants to perform certain tasks, such as feature location and identification, while behavioral responses such as response time, locating error, or correctness are measured. The measured quantities, along with designed questionnaires, provide evidence of the effectiveness of a visualization. The second category comprises subjective scoring methods [AJDL08], which ask participants to rate the visualization results. The behavioral responses and subjective scores used in these two categories can be easily acquired; however, they do not directly reflect users' visual behaviors, which are critical for evaluating the effectiveness of visualization. To fill this gap, we suggest evaluating the performance of visualization methods by analyzing users' visual behaviors. By incorporating an eye tracker to observe users' visual behaviors, we hope to gain more insight into, and a better understanding of, the relative merits of each flow visualization method.

Along with the quantitative measurements, we design four experiments to evaluate the influence of different visualization techniques on users' visual behaviors. The first experiment lets a participant freely view the visualization result without any instructions. This helps us understand which visualization technique better attracts a participant's attention. The second experiment asks a participant to predict the advection of a vector field. The third experiment asks a participant to locate the critical points in a field. The fourth experiment asks a participant to identify a specific type of critical point from several possible candidates. These experiments evaluate the performance of flow visualization from different aspects, which allows us to better understand how users interpret a vector field and to recognize the relationship between eye gazes and visualization results.

Eye tracking analysis allows us to discover several issues that have not been explored in past evaluation studies: (1) Researchers generally assume that more visual attention is paid to highly varying regions. Our study verifies this assumption by using an eye tracker to discover the connection between visual attention and the characteristics of flow fields (e.g., the variation of flow velocity and the positions of critical points) in the free-viewing condition. (2) Traditional evaluation methods [LKJ∗05, LCS∗12] cannot determine whether a visualization result clearly reveals the flow pattern, but an eye tracker can analyze the similarity of scanpaths to judge how expressively flow patterns are conveyed by recording users' eye movements and scanpaths. (3) The ability to express the features of a field is conventionally evaluated by measuring users' performance, such as the time and accuracy of locating a feature. We can instead analyze the relationship between eye gaze distribution and feature locations to objectively assess whether the visualization results attract users' attention to the features. (4) When searching for a certain target, our eyes move around the image to determine the target location. This visual behavior can be utilized to check how well the visualization results present the types of critical points by comparing the number of back-and-forth movements among the candidates.

© 2015 The Author(s). Computer Graphics Forum © 2015 The Eurographics Association and John Wiley & Sons Ltd. Published by John Wiley & Sons Ltd.

2. Related Work

Flow visualization evaluation methods have conventionally used questionnaires to evaluate the performance of flow visualization. Laidlaw et al. [LKJ∗05, LDM∗01] designed three tasks (locating and identifying critical points, and flow advection) to evaluate 2D vector field visualization. Liu et al. [LCS∗12] further evaluated and analyzed the influence of color mapping and symmetry on flow visualization. Martin et al. [MSM∗08] analyzed background effects on 2D hurricane visualization. Acevedo et al. [AJDL08] found high correlation between the grades given by experts and the study results in [LKJ∗05]. Recently, Pineo and Ware [PW12] used a computational model, which relies on a simulation of information processing in the retina and primary visual cortex, to automatically evaluate and optimize visualizations. Isenberg et al. [IIC∗13] presented a systematic survey of visualization evaluation. These studies evaluated the effectiveness of visualizations from indirect perceptual indications and may be easily affected by other human factors. Besides these, there are also studies using computational metrics, including saliency maps [JC10, JWC∗11] and image-quality maps [MK13, War08], to evaluate flow visualizations. These computational methods are objective but may not directly reflect users' visual perception of flow visualization.

Eye tracking analysis has been applied to visualization evaluation, including dual-view visualization [CCY∗03], eye gaze strategies for 2D and 3D view arrangement [MTY05], the comparison of radial and Cartesian variants of transaction sequences in information hierarchies [BBBD08], the readability of orthogonal, force-directed, and hierarchical graph layouts [PSD09], and the effectiveness of spider, bar, and line graph methods [GH10]. Kurzhals et al. [KFBW14, KBB∗14] also provided an eye tracking benchmark for various visualization methods. Netzel et al. [NBW14] conducted a comparative study of different trajectory visualization techniques. Besides visualization evaluation, an eye tracker has also been used to manipulate flow visualization [MWFI04]. To the best of our knowledge, eye tracking has never been applied to evaluate flow visualizations.

3. Research Goals

Our goal is to explore the possibility of applying eye tracking analysis to evaluate flow visualization methods from the perspective of visual perception. We investigate how eye gazes can be utilized to complement other behavioral measures, such as reaction time and correct rate, to provide a more elaborate evaluation. To observe how visualization methods influence visual behaviors, we first design an experiment to analyze participants' visual attention on the visualizations of flow fields in a free-viewing condition (Exp 1). This helps verify whether a visualization method can attract participants' eye gazes to important flow features at first glance, which is one of the fundamental goals of flow visualization but has never been verified in previous studies.

Furthermore, we also study how different types of flow visualization methods affect participants' performance and visual behaviors in executing different tasks. We design three experiments to evaluate flow visualization methods on three tasks: advection prediction (Exp 2), feature detection (Exp 3), and feature identification (Exp 4). These tasks are chosen according to experts' suggestions and previous studies [LKJ∗05, LCS∗12] because they are critical for judging visualization methods' illustration capability for flow speed, flow direction, and critical points, which are three of the most important flow features.

4. Method

We adopt a within-subject design in all experiments. To avoid learning effects, the flow fields used in each experiment are different, and the orders of flow fields and visualization methods are randomized in all experiments. In Exp 1, Exp 2, and Exp 3, 40 flow images (8 fields × 5 methods) are displayed in a random order to each participant in each experiment. In Exp 4, 25 flow images (5 fields × 5 methods) are displayed randomly to each participant using a Latin square. The randomization of visual stimuli minimizes learning effects in the within-subject design.

4.1. Flow Data Sets

In each experiment of this study, we generate flow fields with distinct properties using mathematical synthesis and physical simulation. The details are described as follows.

Mathematically Synthesized Flow Fields: To set up appropriate vector fields for different tasks, we develop an intuitive field design tool adapted from Zhang et al.'s vector field design method [ZMT06] to place critical points at the desired locations. All flow fields have the same resolution of 800×800. We generate two types of fields: (1) random placement and (2) fixed placement of critical points. The first type is used in Exp 1, 2, and 3. It is generated by randomly placing three to seven critical points of various types inside a square of 560×560 pixels. To avoid artifacts in the visualizations caused by extreme flow speeds, nonlinear scaling is applied such that the maximal flow speed is 400 pixels/sec and 90% of samples have a flow speed larger than 40% of the maximal speed. The second type is used in Exp 4 and is generated by placing five critical points on the vertices of a pentagon with a radius of 200 pixels in the center of the image (Figure 1(a)), using a larger decay factor to minimize interference between adjacent critical points. Each field contains five different types of critical points, including saddle, node sink/source, and focus sink/source, i.e., each critical point in a field has a unique type. Nonlinear scaling is also applied to restrict flow speeds to within 400 pixels/sec, with 80% of samples having speeds greater than 10% of the maximal speed. Please refer to the supplemental materials for more details of the field settings and generated patterns.
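The paper does not specify the exact form of the nonlinear scaling. One simple way to satisfy the two stated constraints (a fixed maximal speed, and a lower bound on the 10th-percentile speed) is a power-law remapping whose exponent is solved from the percentile constraint. The sketch below is our own illustration, not the authors' implementation; the function name and parameters are hypothetical.

```python
import numpy as np

def rescale_speeds(speeds, s_max=400.0, pct=10.0, floor_frac=0.4):
    """Power-law remapping of flow speeds: the fastest sample maps to
    s_max and the pct-th percentile maps to floor_frac * s_max, so
    (100 - pct)% of samples end up above floor_frac of the maximum."""
    s = np.asarray(speeds, dtype=float)
    ratio = np.percentile(s, pct) / s.max()      # in (0, 1)
    gamma = np.log(floor_frac) / np.log(ratio)   # solves ratio**gamma == floor_frac
    return s_max * (s / s.max()) ** gamma
```

Because the remapping is monotone, it preserves the ordering of samples; only the speed distribution is reshaped.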

Physically Simulated Flow Fields: Synthesized fields are conventionally used for evaluation because of their good controllability and availability, but they may lack realism. Hence, we also use vector fields generated by physical simulation to better approximate real fields. ANSYS Fluent is used for its ability to model complex fluid behavior with high precision. After discussing with an expert in computational fluid dynamics, we set the simulation attributes as follows: incompressible water with a density of 1000 kg/m3 and a viscosity of 0.001 Pa·s. We keep the fluid temperature invariant and run the simulation on a 2D grid of 801×801 whose physical size is 8×8 m2. After simulation, we scale the flow speeds of all fields to within 400 pixels/sec. To create complex and realistic flow features, turbulence simulation is enabled. The boundary conditions are described in the supplemental materials. In total, we generate four different vector fields, which have eight, five, nine, and eleven critical points, respectively.

4.2. Flow Visualization Methods

To evaluate the influences of different types of visualizations on participants' performance and visual perception, we select five representative methods from different categories: glyph-based, geometry-based, and texture-based. These methods are commonly evaluated in previous studies [LKJ∗05, LDM∗01, AJDL08, LCS∗12, War06], which facilitates comparing our findings with theirs. Similarly, we also refer to the previous studies to set the parameters of the visualization methods whenever possible. The following summarizes these methods and the parameters used in this study.

Hedgehogs Plot (HP) is a glyph-based method. It illustrates the flow speed and direction in 2D flows using simple arrows (length proportional to the speed, and direction aligned with the flow direction). We set n = 35 to divide the image into uniform-sized blocks and the maximal length of an arrow to be 95% of the block size to avoid overlapping among arrows.

Image-Guided Streamline (IGS) [TB96] is a geometry-based method. It iteratively places streamlines based on a streamline-density-difference energy function. Arrows are added to indicate streamline direction. We use the authors' implementation to generate visualization images, setting the density factor to 0.02 and the style to fancy, which is similar to the settings in previous studies.

Illustrative Streamline (IS) [LHS08] is a geometry-based method. It illustrates a vector field using a seeding strategy that compactly generates a minimal set of streamlines on n×n uniform-sized blocks. We choose n = 201, the local dissimilarity threshold as 0.2, and the global dissimilarity threshold as 0.05, as suggested in the paper [LHS08].

Line Integral Convolution (LIC) illustrates a vector field as a gray-scale image with texture advection. The gray value of a pixel is computed by convolving a random gray-scale image with a low-pass kernel and integrating the convolution values of all pixels along the forward and backward streamlines of a fixed arc length passing through the pixel. We adopt the LIC implemented by Laidlaw et al. [LKJ∗05] and set the kernel as a box filter of size 20×20 pixels without contrast enhancement.

Unsteady Flow Line Integral Convolution (UFLIC) [She98] uses a time-accurate value scattering scheme to model unsteady texture flow. To obtain dynamic visual effects on steady fields, we modify UFLIC by changing the color of stable pixels when the steady state lasts longer than one second. We set the life cycle of each particle to four seconds for comfortable visualization, and set the sampling rate and filtering kernel to minimize artifacts.
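The hedgehog-glyph geometry described above (an n×n block grid, arrow length proportional to speed and capped at 95% of the block size) can be sketched as follows. This is our own illustration of the layout rule, not the study's rendering code; the function name is hypothetical.

```python
import numpy as np

def hedgehog_glyphs(u, v, n=35, image_size=800, max_len_frac=0.95):
    """Sample a flow field (u, v) at the centers of an n-by-n block grid
    and return arrow segments (x0, y0, dx, dy). Arrow length is
    proportional to speed, with the fastest arrow capped at max_len_frac
    of the block size so arrows never overlap neighboring blocks."""
    block = image_size / n
    c = (np.arange(n) + 0.5) * block          # block-center coordinates
    xs, ys = np.meshgrid(c, c)
    ix = xs.astype(int).clip(0, u.shape[1] - 1)
    iy = ys.astype(int).clip(0, u.shape[0] - 1)
    du, dv = u[iy, ix], v[iy, ix]
    speed = np.hypot(du, dv)
    length = max_len_frac * block * speed / max(speed.max(), 1e-12)
    scale = np.divide(length, speed, out=np.zeros_like(speed), where=speed > 0)
    return xs, ys, du * scale, dv * scale
```

The returned segments could then be drawn with any 2D plotting backend (e.g. a quiver-style call).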

4.3. Procedures

Before the experiments, participants watched an instruction video that introduced basic knowledge of flow fields, the types of critical points, examples of the five visualization methods, and the procedures of each experiment. If they had any questions after watching the video, we answered them. Before each experiment, participants practiced the tasks until they understood their content and could successfully perform them. The flow fields used in the practice tasks were not used in the experiments. Participants then took the calibration session of the eye tracker and performed the four experiments in sequence. After the experiments, participants filled out a feedback form. We describe the detailed procedures of each experiment below.

Exp 1: Free Viewing. In Exp 1, participants were asked to freely view the visualization results without any specific purpose. The visual stimuli were 40 flow images: (4 mathematically synthesized fields + 4 physically simulated fields) × 5 methods. Each participant had to watch all 40 flow images displayed in a random order. Before each flow image was shown, a participant needed to click a white dot located in the center of a black screen. This guided the participant's eye gaze back to the center of the screen. Once the participant clicked the white dot, a flow image was displayed on the screen for five seconds and the participant's eye gazes were recorded during this period. The participant performed the clicking and viewing tasks alternately until all 40 images had been displayed.

Figure 1: Examples of the five visualization methods evaluated in our user study. (a) Hedgehogs Plot, (b) Image-Guided Streamline, (c) Illustrative Streamline, (d) Line Integral Convolution, (e) a frame of the animated result generated by Unsteady Flow LIC.

Exp 2: Advection Prediction. Participants were instructed to observe and predict the flow advection from a pixel location in a flow image. The stimuli were 40 flow images (8 synthesized fields × 5 methods) displayed to a participant in a random order. Each flow image was overlaid with a circle (radius = 200 pixels) and a dot at the center of the circle, as shown in Figure 6(a). Participants had to observe the flow advection at the dot, predict where the pathline originating from the dot intersects the circle, and click the predicted intersection point on the circle. Each participant's response time, mouse click position, and eye gaze movements were recorded. Similar to the procedure of Exp 1, before each flow image was shown, participants had to click the white dot in the center of a black screen so that their gazes returned to the center of the screen. There was no time limit for participants to complete the prediction task.
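The ground-truth answer for this task can be obtained by numerically integrating the pathline from the center dot until it crosses the circle. A minimal sketch using fixed-step RK4 integration is given below; this is our own illustration under the assumption that the field is available as a callable, not the authors' evaluation code.

```python
import numpy as np

def pathline_circle_hit(field, start, radius=200.0, dt=0.005, max_steps=200000):
    """Integrate a pathline from `start` through `field` (a function
    (x, y) -> (u, v) in pixels/sec) with fixed-step RK4 and return the
    first point at which it leaves the circle of `radius` around `start`."""
    p = np.asarray(start, dtype=float)
    c = p.copy()
    f = lambda q: np.asarray(field(q[0], q[1]), dtype=float)
    for _ in range(max_steps):
        k1 = f(p)
        k2 = f(p + 0.5 * dt * k1)
        k3 = f(p + 0.5 * dt * k2)
        k4 = f(p + dt * k3)
        p = p + (dt / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)
        if np.hypot(*(p - c)) >= radius:
            return p
    return None  # pathline stayed inside the circle (e.g. trapped by a sink)
```

Comparing each recorded click against this crossing point yields the angular prediction error for the trial.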

Exp 3: Feature Detection. This experiment evaluated the capability of a flow visualization method to highlight the critical points of a flow field. The visual stimuli were 40 flow images (8 synthesized flow fields × 5 methods). Each field had two to five randomly placed critical points; in total there were 28 critical points in the eight fields, including eight saddles, five node sinks, five node sources, five focus sinks, and five focus sources. To prevent participants' gazes from moving off the screen, the distance from any critical point to the screen boundaries was larger than 50 pixels. Before each flow image was shown, participants had to click the white dot on a black screen, after which a flow image was displayed. They then located each critical point by clicking it on the screen. Each participant had to detect flow features on 40 flow images shown in a random order. Each participant's response time, mouse click positions, and eye movement data were recorded.

Exp 4: Feature Identification. This experiment used the second type of synthesized flow field (Figure 1(a)). Each flow field had five critical points of different types placed at the vertices of a pentagon. A circle with a radius of 100 pixels was overlaid on the region of each critical point. This design ensured that the measured response time was dominated by the identification task rather than the locating task. Each participant had to perform the identification task on 25 flow images (5 fields × 5 methods) displayed in a random order. At the beginning of each trial, a participant was informed by a message on the screen which type of critical point s/he needed to identify. A white dot was displayed at the center of a black screen, and the participant had to click it to move his/her gaze to the center. Then a flow image was displayed and the participant had to identify the specified type of critical point by clicking one of the five critical points. After the mouse click, the participant began a new trial and repeated the above procedure until all 25 flow images had been displayed. To avoid order effects, a 5×5 Latin square was used to determine the order of the critical point types to be identified for each participant. This ensured that each type of critical point appeared exactly once at each fixed location. Response time, mouse click position, and eye movement data were recorded.
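A balanced ordering of this kind can be produced with a cyclically shifted Latin square, where every symbol (here, critical point type) appears exactly once in each row and each column. A minimal sketch (our own, not the authors' randomization code):

```python
def latin_square(n=5):
    """n-by-n Latin square by cyclic shifting: row i is the sequence
    (i, i+1, ..., i+n-1) mod n, so every symbol appears exactly once
    in each row and each column."""
    return [[(i + j) % n for j in range(n)] for i in range(n)]
```

Row i can then serve as the identification order for participant group i, mapping symbols 0-4 to the five critical point types.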

4.4. Device and Participants

A Tobii T120 eye tracker was employed to observe participants' visual behaviors. Its data rate was 120 Hz and its accuracy was 0.5 degrees at a tracking distance of 50-80 cm. The eye tracker was integrated with a 17" LCD monitor with a resolution of 1280×1024 pixels for presenting the visual stimuli.

There were 24 participants in the study. Four of them were excluded because their eye gazes could not be tracked correctly. The remaining participants (seven females) included five fluid dynamics researchers and 15 undergraduate/graduate students, aged 19 to 25 years and majoring in science and engineering disciplines. We did not compare expert and non-expert participants in our study because, in general, these two groups do not exhibit a statistically significant difference, as reported in [LKJ∗05]. With the training session and the introduction to flow visualization, any difference would be further mitigated.

To ensure that the collected data reflected natural visual reactions, our lab settings allowed participants to complete the experimental tasks in a comfortable and distraction-free environment. We recorded a participant's physiological responses, including eye scanpaths, gaze distribution, and gaze response time, as well as other behavioral responses such as keystrokes and mouse clicks. The Tobii Fixation Filter with the default settings was used to preprocess the collected data.

4.5. Data Cleaning and Measurement

Before analysis, eye tracking and mouse click data containing obvious errors were eliminated based on the following criteria. First, when the eye tracker failed to track eye gazes, the results were discarded. Second, when a participant's response was unreasonable, that response was discarded. For example, in Exp 2, a participant's mouse click was considered erroneous if s/he clicked a position far away from the circle (farther than 20 pixels, one tenth of the radius). Likewise, in Exp 4, if a participant did not click inside any of the circles around the five critical points, that mouse click was invalid. In Table 1, the first row shows the number of all data collected from the 20 participants, the second row shows the number of data remaining after applying Criterion 1, and the third row gives the number of data that passed Criteria 1 and 2 and were used for analysis.
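The two click-validity rules reduce to simple geometric predicates: distance to the circle for Exp 2, and membership in any critical point circle for Exp 4. A sketch (our own naming, not the authors' cleaning script):

```python
import math

def valid_exp2_click(click, center, radius=200.0, tol=20.0):
    """Exp 2 rule: the click must land within `tol` pixels of the circle
    (tol = 20 px, one tenth of the 200-px radius)."""
    d = math.hypot(click[0] - center[0], click[1] - center[1])
    return abs(d - radius) <= tol

def valid_exp4_click(click, critical_points, radius=100.0):
    """Exp 4 rule: the click must fall inside at least one of the
    100-px circles drawn around the five critical points."""
    return any(math.hypot(click[0] - p[0], click[1] - p[1]) <= radius
               for p in critical_points)
```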

Our study evaluated the performance of the different visualization methods based on three types of measurements: response time, mouse click position, and eye tracking. The first two measurements, which were commonly used in previous studies [LKJ∗05, LCS∗12], reflect the efficiency and accuracy with which a task is completed. The third measurement is the focus of this study. Depending on the goal of each experiment, we applied different analyses to eye fixations and scanpaths.

5. Results

We applied a repeated measures ANOVA to study the effect of visualization methods. The null hypothesis of the ANOVA was that all five visualization methods had equal population means. When the p-value was less than 0.05, the null hypothesis was rejected and Tukey's HSD test was used as the post hoc test, which further examined whether two methods were significantly different by computing their p-value. If a pair of methods were significantly different, we labeled them with two different letters; if not, we labeled them with the same letter. For example, in Figure 3(a), there were statistically significant differences between HP (labeled "A") and any of IGS, IS, LIC, and UFLIC (labeled "B" or "BC"), while LIC (labeled "C") had no statistically significant difference from IS or UFLIC (labeled "BC"). Furthermore, Mauchly's test was used to check whether sphericity had been violated; if so, the degrees of freedom were corrected using Greenhouse-Geisser estimates of sphericity. The detailed analysis results are provided in the supplemental materials.

Table 1: The number of original data and valid data.

                         Exp 1   Exp 2   Exp 3   Exp 4
  Collected               800     640     800    2500
  Passing 1st             680     532     530    2119
  Passing 1st and 2nd     680     524     530    2104
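The F statistic of a one-way repeated measures ANOVA (before any sphericity correction) partitions the total variance into condition, subject, and residual terms. A minimal numpy sketch of that computation is shown below; the authors presumably used a statistics package, so this is only an illustration of the underlying arithmetic.

```python
import numpy as np

def rm_anova_F(data):
    """One-way repeated-measures ANOVA F statistic (no sphericity
    correction). `data` is an (n_subjects, k_conditions) array.
    Returns (F, df_condition, df_error)."""
    data = np.asarray(data, dtype=float)
    n, k = data.shape
    grand = data.mean()
    ss_cond = n * ((data.mean(axis=0) - grand) ** 2).sum()   # between conditions
    ss_subj = k * ((data.mean(axis=1) - grand) ** 2).sum()   # between subjects
    ss_total = ((data - grand) ** 2).sum()
    ss_err = ss_total - ss_cond - ss_subj                    # condition-by-subject residual
    df_cond, df_err = k - 1, (n - 1) * (k - 1)
    return (ss_cond / df_cond) / (ss_err / df_err), df_cond, df_err
```

Removing the subject term from the error is what distinguishes this from a between-subjects one-way ANOVA and matches the within-subject design used here.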

5.1. Exp 1: Free Viewing Results

In this experiment, we studied the effect of different visualization methods on participants' visual attention in the free-viewing condition by analyzing the relationship between the geometric expression of flow fields and visual attention. The normalized scanpath saliency (NSS) [IKN98] was used to measure this relationship. A higher NSS indicates that a participant's attention (fixations) focused more on the salient geometric expression of a flow field. Specifically, we represented the geometric expression of a vector field by a scalar field, whose values were normalized to a distribution with zero mean and unit variance. Let S be the normalized scalar field. The NSS of a participant's scanpath was the average of the values of S along the scanpath. In this paper, we defined an NSS score by weighting the NSS by the duration of fixations. If the NSS score was larger than zero, most fixations fell in the salient regions (since S was normalized to zero mean and unit variance). In other words, the higher the NSS score, the more closely the participant's attention focused on the salient regions of the scalar field (regions with large values). Figure 2 illustrates an example of NSS computation. Please refer to [PIIK05] for more details.


Figure 2: An example of NSS score estimation. (a) A participant's eye fixations and scanpath. (b) A scalar field obtained by computing the directional difference of the flow field, where large values are shown in red. The orange dots represent eye gazes. (c) A histogram of (b) normalized by z-score; the blue line shows the weighted mean over the entire scanpath.

We considered two kinds of scalar fields to represent the geometric expression of a flow field: a speed map and a distance map. The former was simply the magnitude of the flow field. The latter was obtained by computing the distance of each point in the flow field to the nearest critical point.

© 2015 The Author(s). Computer Graphics Forum © 2015 The Eurographics Association and John Wiley & Sons Ltd.


Hsin-Yang Ho, I-Cheng Yeh, Yu-Chi Lai, Wen-Chieh Lin, and Fu-Yin Cherng / Evaluating 2D Flow Visualization Using Eye Tracking

[Figure 3 bar charts: NSS score per method (HP, IGS, IS, LIC, UFLIC); panels (a) Speed map and (b) Distance map.]

Figure 3: NSS scores in Exp 1. (a) F(4,659) = 18.52, p < 1e-4. (b) F(4,659) = 11.55, p < 1e-4. The error bars show the standard errors.

That is, at an arbitrary point q of the distance field, the distance value is calculated by d(q) = \min_{1 \le i \le n} \|p_i - q\|, where p_i is the i-th critical point in the flow field. Both the speed map and the distance map were normalized to zero mean and unit variance. The distance map was further multiplied by -1 so that a larger value means closer to a critical point. Figure 3(a) shows the average NSS score of all participants for each method. The error bars in all plots of this paper show the standard error. It appears that participants' attention was not drawn to the regions with large flow speed, as all five methods had negative NSS scores. Among the five methods, though, HP drew participants' attention to the regions with large speeds more than the others. On the other hand, Figure 3(b) shows that all methods attracted participants' attention to critical points, as their NSS scores were all greater than 0.5. It is an interesting finding that, in the free viewing condition, participants tended to pay more attention to critical points. IS and LIC were clearly better than HP at drawing participants' attention to critical points.
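The speed map and the distance map can be sketched as follows. The function names and the grid convention are our assumptions; the distance map implements d(q) = min_i ||p_i - q|| as defined above, followed by the z-normalization and sign flip described in the text:

```python
import numpy as np

def speed_map(vx, vy):
    """Magnitude of the flow at every grid point."""
    return np.hypot(vx, vy)

def distance_map(shape, critical_points):
    """Distance from every grid point to its nearest critical point:
    d(q) = min_i ||p_i - q||."""
    rows, cols = np.mgrid[0:shape[0], 0:shape[1]]
    q = np.stack([rows, cols], axis=-1).astype(float)        # (H, W, 2)
    p = np.asarray(critical_points, dtype=float)             # (n, 2)
    d = np.linalg.norm(q[:, :, None, :] - p[None, None, :, :], axis=-1)
    return d.min(axis=2)

def z_normalize(field):
    """Normalize a scalar field to zero mean and unit variance."""
    return (field - field.mean()) / field.std()

d = distance_map((64, 64), [(10, 10), (50, 40)])
S = -z_normalize(d)   # sign flip: larger value = closer to a critical point
```

After the sign flip, the field S peaks exactly at the critical points, which is what makes a positive NSS on it indicate attention on critical-point regions.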

Based on our observation, participants had a tendency to focus on regions with a complex flow pattern, such as critical points or turbulence. To verify this tendency, a directional difference map was defined to predict the gaze region of a participant. The map was computed by measuring the maximal difference between the direction at a point in the flow field and the directions in the vicinity of that point. The larger the directional difference, the more complex the flow pattern. To compute the map, the angular difference between points x and y in a flow field v is defined as follows:

D_A(x,y) = \cos^{-1}\left( \frac{v(x) \cdot v(y)}{\|v(x)\|\,\|v(y)\|} \right).

The directional difference D_V, measuring the complexity of the flow at point x with respect to its neighborhood, is defined by:

D_V(x) = \max\{\, D_A(x,y) \cdot G(\|x-y\|;\mu,\sigma) \mid y \in \mathbb{R}^2 \,\},

where G(d;\mu,\sigma) is a Gaussian function defined by

G(d;\mu,\sigma) = \frac{1}{\sigma\sqrt{2\pi}} \, e^{-\frac{1}{2}\left(\frac{d-\mu}{\sigma}\right)^2}.

\mu and \sigma are chosen as 0 and 33.973, respectively.

[Figure 4 panels: (a) Directional diff. map (image); (b) NSS score bar chart per method (HP, IGS, IS, LIC, UFLIC).]

Figure 4: (a) An example of a directional difference map. Red represents larger directional differences. (b) The NSS scores of the directional difference map, F(4,659) = 12.67, p < 1e-4.

[Figure 5 bar charts per method (HP, IGS, IS, UFLIC): (a) Response time (sec), (b) Prediction error (degrees), (c) Trajectory diff.]

Figure 5: Exp 2: Advection prediction results. (a) F(2.6,338.2) = 12.2, p < 1e-4. (b) F(2,265) = 2, p = 0.15. (c) F(2.7,346.1) = 29.03, p < 1e-4.

The scalar field D_V is smoothed by a Gaussian filter to get the directional difference map. Figure 4(a) shows a directional difference map. Figure 4(b) lists the NSS scores of the directional difference maps computed from 40 flow images. All five methods had a tendency to attract visual attention to the regions with a complex flow pattern. In particular, IS and LIC had a stronger tendency than the other three techniques.
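The D_A and D_V formulas above can be sketched as follows. Two assumptions of ours: the maximum is taken over a finite pixel window rather than all of R^2 (the Gaussian weight makes distant points negligible anyway), and the final Gaussian smoothing of D_V mentioned in the text is omitted:

```python
import numpy as np

def directional_difference_map(vx, vy, sigma=33.973, radius=None):
    """D_V(x) = max over nearby y of D_A(x, y) * G(|x - y|; 0, sigma),
    where D_A is the angular difference between the flow vectors at x and y.

    The search is restricted to a (2*radius+1)^2 pixel window around x,
    an assumption made for tractability in this sketch.
    """
    if radius is None:
        radius = int(2 * sigma)
    H, W = vx.shape
    norm = np.sqrt(vx**2 + vy**2) + 1e-12
    ux, uy = vx / norm, vy / norm                  # unit direction vectors
    coef = 1.0 / (sigma * np.sqrt(2 * np.pi))      # Gaussian normalization
    dv = np.zeros((H, W))
    for i in range(H):
        for j in range(W):
            i0, i1 = max(0, i - radius), min(H, i + radius + 1)
            j0, j1 = max(0, j - radius), min(W, j + radius + 1)
            dot = np.clip(ux[i, j] * ux[i0:i1, j0:j1]
                          + uy[i, j] * uy[i0:i1, j0:j1], -1.0, 1.0)
            ang = np.arccos(dot)                   # D_A(x, y)
            yy, xx = np.mgrid[i0:i1, j0:j1]
            dist = np.hypot(yy - i, xx - j)
            g = coef * np.exp(-0.5 * (dist / sigma) ** 2)
            dv[i, j] = (ang * g).max()             # D_V(x)
    return dv

# Demo: a field that turns 90 degrees halfway across has large
# directional differences at the boundary and near-zero elsewhere.
vx = np.ones((8, 8)); vy = np.zeros((8, 8))
vx[:, 4:], vy[:, 4:] = 0.0, 1.0
dv = directional_difference_map(vx, vy, sigma=2.0, radius=3)
```

The O(HW·radius²) loop is deliberately naive; it is meant to make the formula concrete, not to be efficient.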

5.2. Exp 2: Advection Prediction Results

This experiment evaluated the guidance ability of different visualization methods based on participants' response time and prediction error, and on the difference between their eye scanpaths and the pathline. Because LIC only shows the orientation of the flow, not the flow direction, we did not evaluate it in this experiment. The response time measured the period from the moment the stimulus was shown to a participant to the moment the participant clicked a predicted position on the screen. It represented the total time that a participant spent on understanding and predicting the flow direction in a flow field. Figure 5(a) shows that IGS and UFLIC had shorter response times than HP and IS. The prediction error was measured as the central angle between the predicted position and the advected position, as shown in Figure 6. Figure 5(b) shows that participants' prediction errors were not affected by the flow visualization methods.
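The central-angle prediction error can be sketched as follows (function name and coordinate convention are ours; the angle is measured between the two positions as seen from the circle's center, per Figure 6):

```python
import numpy as np

def prediction_error_deg(center, predicted, advected):
    """Central angle (degrees) between the predicted position and the
    true advected position, both viewed from the circle's center."""
    u = np.asarray(predicted, float) - np.asarray(center, float)
    w = np.asarray(advected, float) - np.asarray(center, float)
    cos = np.dot(u, w) / (np.linalg.norm(u) * np.linalg.norm(w))
    # Clip guards against round-off pushing |cos| slightly past 1.
    return float(np.degrees(np.arccos(np.clip(cos, -1.0, 1.0))))

# A prediction 45 degrees off the true advected direction:
err = prediction_error_deg((0, 0), (1, 0), (1, 1))
```

Note that only the angle matters, so the predicted click does not need to lie exactly on the circle.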

[Figure 6 panels: (a) example flow image; (b) diagram of the central angle θ.]

Figure 6: Definition of prediction error. (a) An example of the flow images used in Exp 2. (b) Suppose the green dot is a participant's predicted position of flow advected from the center. The blue curve is a pathline starting from the center, and its intersection with the circle is the real advected position. The central angle between the predicted and advected positions is defined as the prediction error.

Ideally, a good flow visualization method should guide a participant's scanpath to follow a pathline when the participant predicts the advection of flow. This motivated us to adopt a scanpath similarity metric [JHN10] to measure the trajectory difference between an eye scanpath and a pathline in the flow field. Compared to the prediction error, this trajectory difference provided more insight for evaluating the guidance ability of flow visualization. Figure 7 illustrates five scanpaths from small to large trajectory differences. Figure 5(c) compares the trajectory differences of the four visualization methods. IS had a significantly larger trajectory difference than HP, IGS, and UFLIC. This finding shows that eye tracking can reveal more information when evaluating flow visualization methods, as the prediction error alone could not distinguish the performance of the different visualization methods (compare Figure 5(b) and (c)).
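The paper uses the vector-based multidimensional scanpath measure of [JHN10]; as a rough illustrative stand-in (NOT that measure), one can resample both curves by arc length and average the pointwise distances. Everything here is our simplification:

```python
import numpy as np

def resample(poly, n=50):
    """Resample a 2D polyline to n points at equal arc-length spacing."""
    p = np.asarray(poly, float)
    seg = np.linalg.norm(np.diff(p, axis=0), axis=1)
    t = np.concatenate([[0.0], np.cumsum(seg)])
    s = np.linspace(0.0, t[-1], n)
    return np.column_stack([np.interp(s, t, p[:, k]) for k in range(2)])

def trajectory_difference(scanpath, pathline, n=50):
    """Mean distance between arc-length-aligned samples of two curves.

    A simplified stand-in for the Jarodzka et al. metric: it still ranks
    a gaze path that hugs the pathline as closer than one that wanders.
    """
    a, b = resample(scanpath, n), resample(pathline, n)
    return float(np.linalg.norm(a - b, axis=1).mean())

path = [(0, 0), (50, 10), (100, 30)]
close = trajectory_difference([(2, 1), (48, 12), (99, 28)], path)
far = trajectory_difference([(0, 80), (60, 90), (100, 95)], path)
```

The real metric additionally compares saccade directions, lengths, and durations, which this pointwise sketch ignores entirely.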

5.3. Exp 3: Feature Detection Results

This experiment evaluated the flow visualization methods' ability to illustrate the features of flow fields based on participants' response time, correctness, locating error, and the difference between the distribution of their fixations and the critical points in flow fields. Figure 8(a) shows that HP had a longer response time than IGS, LIC, and UFLIC, while the other four methods had no statistically significant differences among one another. We considered a critical point as correctly detected if the distance from it to the mouse clicking position was within 40 pixels. A flow image was considered correctly detected if all critical points in the field were correctly detected†. As shown in Figure 8(b), the texture-based methods (LIC and UFLIC) had higher correct rates than the others because they provided straightforward information for locating features.

† The human visual system has high acuity at the fovea of the retina, which corresponds to about two degrees of the visual field. For a 17-inch LCD monitor (1280 × 1024 pixels) with a participant sitting at a distance of 60 cm, this high-sensitivity region spans about 80 pixels, i.e., a radius of about 40 pixels.
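The footnote's numbers can be checked with a few lines of arithmetic; our assumption is a standard 5:4 panel so that the 17-inch diagonal fixes the pixel density:

```python
import math

# 17-inch diagonal at 1280x1024 -> pixels per centimeter on screen.
diag_cm = 17 * 2.54
diag_px = math.hypot(1280, 1024)
px_per_cm = diag_px / diag_cm

# Two degrees of visual angle at a 60 cm viewing distance:
# the span is 2 * d * tan(1 deg) (half-angle of 1 degree on each side).
fovea_cm = 2 * 60 * math.tan(math.radians(1.0))
fovea_px = fovea_cm * px_per_cm   # diameter of the high-acuity region
```

The span works out to roughly 80 pixels across, i.e., a radius of about 40 pixels, which is consistent with the 40-pixel detection threshold used in Exp 3.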

For each correctly detected critical point, we computed the deviated distance from the critical point to the position of the mouse click that detected it. The locating error was then calculated by averaging the deviated distances of all detected critical points. As with the correct rate, the texture-based methods (LIC and UFLIC) had significantly smaller locating errors than HP and IGS (Figure 8(c)). Additionally, because the streamlines in IS converge to sinks and sources, participants could easily detect these types of critical points and thus had a much smaller locating error than with HP and IGS.
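The correctness and locating-error measures can be sketched as follows. One assumption is ours: each critical point is matched to its nearest mouse click, since the paper does not spell out the matching procedure:

```python
import numpy as np

def detection_results(critical_points, clicks, tol=40.0):
    """Per-image detection outcome and locating error.

    A critical point counts as detected if its nearest click is within
    `tol` pixels; the image is correct only if every critical point is
    detected. The locating error averages the deviated distances of the
    detected points only.
    """
    cps = np.asarray(critical_points, float)
    cls = np.asarray(clicks, float)
    dists = np.linalg.norm(cps[:, None, :] - cls[None, :, :], axis=-1)
    nearest = dists.min(axis=1)          # deviated distance per CP
    detected = nearest <= tol
    locating_error = float(nearest[detected].mean()) if detected.any() else np.nan
    return bool(detected.all()), locating_error

# Two critical points, two nearby clicks, one stray click far away.
ok, err = detection_results([(100, 100), (300, 200)],
                            [(110, 95), (290, 210), (50, 50)])
```

The stray click is simply ignored here; a stricter protocol might penalize extra clicks, which this sketch does not model.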

We further investigated the distribution of participants' fixations to analyze why some visualization methods perform better in assisting participants to detect critical points. We followed the analysis proposed by Santella and DeCarlo [SD04] to compare the distributions of eye fixations and critical points. A gaze distance was defined as the shortest distance from a fixation to its nearest critical point in a flow field. The average gaze distance of a flow visualization was then the mean gaze distance over all fixations of all participants in that visualization. A shorter average gaze distance meant that participants could detect critical points more easily, since their attention was drawn to critical points. Figure 8(d) shows the average gaze distance of each visualization method. IS had a significantly shorter average gaze distance than LIC and UFLIC, which means that IS could better attract participants' attention to critical points (sinks and sources). This finding could not be discovered from the response time, correct rate, or locating error. Because IS could not illustrate saddles, its correct rate was still worse than those of LIC and UFLIC (note that the locating error only counted the average error of correctly detected critical points).
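The gaze-distance measure is a one-liner once fixations and critical points are available as coordinate arrays; names here are our own:

```python
import numpy as np

def average_gaze_distance(fixations, critical_points):
    """Mean, over all fixations, of the distance from each fixation to
    its nearest critical point; shorter means attention was drawn to
    the flow features."""
    f = np.asarray(fixations, float)
    p = np.asarray(critical_points, float)
    d = np.linalg.norm(f[:, None, :] - p[None, :, :], axis=-1)
    return float(d.min(axis=1).mean())

# One fixation exactly on a critical point, one a short distance away.
gd = average_gaze_distance([(10, 10), (20, 20)], [(10, 10), (100, 100)])
```

In the study this mean is taken over all fixations of all participants for a given visualization method.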

5.4. Exp 4: Feature Identification Results

This experiment evaluated the performance of the visualization methods in illustrating the type of flow features based on the participants' response time and eye gazes. The definition of each measurement was similar to that used in Exp 3. Because LIC does not show the direction of the flow, its visualization results cannot present nodes and foci well. Thus, LIC was excluded from this experiment. The statistics of the response time are listed in Figure 9(a). IGS and UFLIC performed faster than the others, allowing the participants to identify the type of a critical point within three seconds on average. IS visualizations took participants the longest time to determine the feature type.

The correct rates of all methods were above 98% across the 431 identification trials performed for each visualization method. There were no statistically significant differences among the methods. For the eye tracking data, we evaluated the visualization methods by counting the total number of critical point visits and the sequence of visits. A critical point visit was defined as a participant's eye gaze moving into a circle




(a) diff. = 11.18 (b) diff. = 66.37 (c) diff. = 77.66 (d) diff. = 59.77 (e) diff. = 127.56
Figure 7: Five eye scanpaths and their trajectory differences with the pathline in Exp 2. (a) Eye gaze moved with the pathline. (b) Eye gaze kept scanning the pathline repeatedly. (c) The participant's eye gaze moved to a distant location. (d)(e) The participant focused on arrows first and traced the pathline later. Please see the supplemental materials for enlarged images.

[Figure 8 bar charts per method (HP, IGS, IS, LIC, UFLIC): (a) Response time (sec), (b) Correct rate (%), (c) Locating error (pixels), (d) Average gaze distance (pixels).]

Figure 8: Results of Exp 3. (a) F(3.4,360.5) = 5.80, p = 0.0004. (b) F(2.6,273.5) = 10.96, p < 1e-4. (c) F(3.2,339.7) = 66.66, p < 1e-4. (d) F(3.3,347.8) = 12.71, p < 1e-4.

(with a radius of 100 pixels) centered at a critical point. The percentage of first critical point visits that were of the correct type is shown in Figure 9(b). A higher percentage meant that participants were more likely to notice the correct type of critical point at first glance; however, none of the methods outperformed the others. In addition, we defined first-notice visits as the number of critical point visits made before finding the correct one, and total visits as the number of all critical point visits made by a participant in the experiment. Figure 9(c) shows these two measurements. The average count of first-notice visits was approximately three. IGS performed better than HP, as both its first-notice visits and total visits were smaller. This indicates that participants noticed and identified the correct type of critical point earlier.
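Visit counting over a gaze sequence can be sketched as below. The exact entry/exit bookkeeping (a visit starts when the gaze enters a circle it was previously outside of) is our assumption, as is the definition of first-notice visits as those made strictly before the correct critical point is first entered:

```python
import numpy as np

def count_visits(gaze_points, critical_points, correct_index, radius=100.0):
    """Count critical-point visits along a gaze sequence.

    Returns (total visits, first-notice visits), where a visit is the
    gaze entering the `radius`-pixel circle around a critical point,
    and first-notice visits are those made before the correct critical
    point is first entered.
    """
    cps = np.asarray(critical_points, float)
    inside_prev = np.zeros(len(cps), dtype=bool)
    total, first_notice = 0, None
    for g in np.asarray(gaze_points, float):
        inside = np.linalg.norm(cps - g, axis=1) <= radius
        entered = inside & ~inside_prev        # circles entered this step
        if first_notice is None and entered[correct_index]:
            first_notice = total               # visits made before the correct one
        total += int(entered.sum())
        inside_prev = inside
    return total, first_notice

# Gaze wanders to the wrong CP 1, leaves, then finds the correct CP 0.
total, first = count_visits(
    [(500, 500), (210, 200), (500, 500), (20, 10)],
    critical_points=[(0, 0), (200, 200)], correct_index=0)
```

With this bookkeeping, re-entering the same circle after leaving it counts as a new visit, matching the idea of repeated checks of a candidate feature.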

[Figure 9 bar charts per method (HP, IGS, IS, UFLIC): (a) Response time (sec), (b) Correct 1st visit (%), (c) # of CP visits.]

Figure 9: Results of Exp 4. (a) F(2.5,1090.1) = 60.94, p < 1e-4. (b) Percentage of first critical point (CP) visits that were correct. F(3,1689) = 1.005, p = 0.3896. (c) Green: first-notice visits, F(2.9,1216.9) = 4.76, p = 0.0029; red: total visits, F(2.8,1184.6) = 17.59, p < 1e-4.

6. Discussion

Our experiments demonstrate the feasibility of applying eye tracking analysis to evaluate flow visualization methods. We summarize our findings and discuss the limitations of this study as follows.

Influences of flow visualization methods on participant performance. Participants using HP had worse performance in almost all four tasks. HP is poor for flow advection prediction and feature identification. In the free viewing condition, participants' eye gazes on HP visualizations were very different from those on the other methods: they did not concentrate on the regions with flow features. Therefore, it is hard to make sense of a vector field with HP. IGS performs about average among the five visualization methods. Because IGS tends to place streamlines uniformly in the visualization result, the tails of streamlines might not converge to a critical point; therefore, its flow guidance can be a little fuzzy. IS can accurately locate flow features because all streamlines converge exactly to one of the critical points. However, its sparse presentation makes it difficult to follow the streamlines of a flow field in the empty regions, and the participants' fixation distribution verifies this. LIC is very effective for feature detection in terms of response time and locating error. However, from the perception perspective, LIC poorly attracts eye gazes toward the features. This might be because participants have difficulty focusing on features under the low-contrast texture representation. UFLIC extends LIC; thus, its performance is similar to or slightly better than LIC's. UFLIC also performs poorly in attracting eye gazes toward features. Despite this, UFLIC outperforms the other methods in almost all experiments.

New findings. In previous studies [LKJ∗05, LCS∗12], explicit or implicit tasks were designed to understand the performance of different visualization methods by analyzing participants' behaviors in experiments. Eye movements, which reveal participants' direct visual reactions to flow visualizations, are very important behaviors but had not yet been utilized in existing studies on flow visualization evaluation. Benefiting from eye tracking analysis, our study made several new findings. First, the free viewing experiment, which had not been conducted before, showed that participants tend to focus on critical points or highly varying regions, and that IS illustrates critical points particularly well. Such findings cannot be observed in conventional evaluation studies. Second, the advection prediction experiment evaluated the guiding ability of flow visualization methods by directly comparing the difference between the participants' eye scanpaths and streamlines. This provided useful information for evaluating flow visualization methods, especially when the response time and prediction error were similar (Figure 5). For example, HP and IS had similar response times and prediction errors, but the trajectory difference revealed that HP guides users' gazes much better. Similarly, in Figure 8, the performance of IS, LIC, and UFLIC cannot be distinguished by response time and mouse clicking error, but the average gaze distance showed that IS better attracts users' attention to flow features.

It is also interesting to compare the differences between experts' and non-experts' visual behaviors. We only investigated this issue in the free viewing condition, since it had never been explored in previous studies. We found that experts had a much stronger tendency than non-experts to focus on regions with critical points and large directional variations. Due to the small number of experts, this finding may not be reliable and needs further study to verify.

Limitations and future work. In this study, the visual stimuli and the parameters for the visualization methods were mainly chosen according to previous studies on flow visualization evaluation. This helped us compare our study with previous ones and independently judge the capabilities of eye tracking analysis for visualization evaluation. The selection of visual stimuli and parameter settings could be further improved, though. For instance, UFLIC was animated, unlike the other four static methods. Besides, the box filter used in LIC may slightly blur the flow patterns, and the contrast of the LIC images was not fine-tuned; these might degrade LIC's performance. An interesting direction for future study is to choose visual stimuli from the perspective of perception instead of visualization techniques. For example, Ware [War08] suggested optimizing flow visualization results from the perspective of visual perception. Our evaluation approach based on eye tracking analysis can help validate and improve the effectiveness of Ware's approach. In the future, we would like to conduct experiments along this direction.

Besides, this study did not consider the influences of the density and shape of arrows, which might affect participants' eye gazes significantly, especially when predicting flow advection. According to the participants' eye scanpaths shown in Figures 7(d) and 7(e), participants would find arrows first and then determine the flow direction. Therefore, the arrangement of arrows might have an impact on the performance measurement. This issue needs to be taken into account in experiment design.

7. Conclusion

This study introduced a new approach for evaluating flow visualization methods based on eye tracking analysis. Five visualization methods were evaluated on four experimental tasks. The free viewing experiment confirmed that participants tend to focus on regions with critical points or large directional variation. The advection prediction experiment evaluated the guiding abilities of different methods by comparing the similarity of participants' scanpaths and pathlines in a field. The feature detection experiment examined the methods' abilities to express flow features by analyzing the relationship between eye fixations and flow features. The feature identification experiment utilized the number and order of critical point visits to evaluate how well each method represents the type of flow features. Compared to past studies, our work exploited eye tracking to analyze the influence of flow visualization methods on participants' visual perception. Thanks to the additional perspective offered by eye tracking, several new findings were revealed.

8. Acknowledgments

The authors would like to thank Prof. Chen-Chao Tao for providing the eye tracker, Prof. Han-Wei Shen for the UFLIC code, and the anonymous reviewers for their insightful comments. This work was supported in part by the Taiwan Ministry of Science and Technology (MOST) under grants 102-2221-E-009-082-MY3, 101-2628-E-009-021-MY3, and 103-2221-E-011-114-MY2, the UST-UCSD International Center of Excellence in Advanced Bioengineering sponsored by the MOST I-RiCE Program under grant MOST 103-2911-I-009-101-, and the Innovation Center for Big Data and Digital Convergence, Yuan Ze University.

References

[AJDL08] ACEVEDO D., JACKSON C. D., DRURY F., LAIDLAW D. H.: Using visual design experts in critique-based evaluation of 2D vector visualization methods. IEEE Transactions on Visualization and Computer Graphics 14, 4 (2008), 877–884.

[BBBD08] BURCH M., BOTT F., BECK F., DIEHL S.: Cartesian vs. radial - a comparative evaluation of two visualization tools. In International Symposium on Visual Computing (2008), pp. 151–160.

[CCY∗03] CONVERTINO G., CHEN J., YOST B., RYU Y.-S., NORTH C.: Exploring context switching and cognition in dual-view coordinated visualizations. In International Conference on Coordinated and Multiple Views in Exploratory Visualization (2003), pp. 55–62.

[CL93] CABRAL B., LEEDOM L. C.: Imaging vector fields using line integral convolution. In Proc. SIGGRAPH (1993), pp. 263–270.

[FCL09] FORSBERG A., CHEN J., LAIDLAW D.: Comparing 3D vector field visualization methods: A user study. IEEE Transactions on Visualization and Computer Graphics 15, 6 (2009), 1219–1226.

[GH10] GOLDBERG J. H., HELFMAN J. I.: Comparing information graphics: A critical look at eye tracking. In Proceedings of the 3rd BELIV'10 Workshop: BEyond Time and Errors: Novel evaLuation Methods for Information Visualization (2010), pp. 71–78.

[IIC∗13] ISENBERG T., ISENBERG P., CHEN J., SEDLMAIR M., MÖLLER T.: A systematic review on the practice of evaluating visualization. IEEE Transactions on Visualization and Computer Graphics 19, 12 (2013), 2818–2827.

[IKN98] ITTI L., KOCH C., NIEBUR E.: A model of saliency-based visual attention for rapid scene analysis. IEEE Transactions on Pattern Analysis and Machine Intelligence 20, 11 (1998), 1254–1259.

[JC10] JÄNICKE H., CHEN M.: A salience-based quality metric for visualization. Computer Graphics Forum 29, 3 (2010), 1183–1192.

[JHN10] JARODZKA H., HOLMQVIST K., NYSTRÖM M.: A vector-based, multidimensional scanpath similarity measure. In Proc. ETRA (2010), pp. 211–218.

[JWC∗11] JÄNICKE H., WEIDNER T., CHUNG D., LARAMEE R. S., TOWNSEND P., CHEN M.: Visual reconstructability as a quality metric for flow visualization. Computer Graphics Forum 30, 3 (2011), 781–790.

[KBB∗14] KURZHALS K., BOPP C. F., BÄSSLER J., EBINGER F., WEISKOPF D.: Benchmark data for evaluating visualization and analysis techniques for eye tracking for video stimuli. In Proc. BELIV (2014), pp. 54–60.

[KFBW14] KURZHALS K., FISHER B., BURCH M., WEISKOPF D.: Evaluating visual analytics with eye tracking. In Proc. BELIV (2014), pp. 61–69.

[LCS∗12] LIU Z., CAI S., SWAN J., MOORHEAD R., MARTIN J., JANKUN-KELLY T.: A 2D flow visualization user study using explicit flow synthesis and implicit task design. IEEE Transactions on Visualization and Computer Graphics 18, 5 (2012), 783–796.

[LDM∗01] LAIDLAW D. H., DAVIDSON J. S., MILLER T. S., DA SILVA M., KIRBY R. M., WARREN W. H., TARR M.: Quantitative comparative evaluation of 2D vector field visualization methods. In Proc. IEEE Visualization (2001), pp. 143–150.

[LHS08] LI L., HSIEH H.-H., SHEN H.-W.: Illustrative streamline placement and visualization. In Proc. IEEE PacificVis (2008), pp. 79–86.

[LKJ∗05] LAIDLAW D., KIRBY R., JACKSON C., DAVIDSON J., MILLER T., DA SILVA M., WARREN W., TARR M.: Comparing 2D vector field visualization methods: A user study. IEEE Transactions on Visualization and Computer Graphics 11, 1 (2005), 59–70.

[MK13] MATVIENKO V., KRÜGER J.: A metric for the evaluation of dense vector field visualizations. IEEE Transactions on Visualization and Computer Graphics 19, 7 (2013), 1122–1132.

[MSM∗08] MARTIN J. P., SWAN II J. E., MOORHEAD II R. J., LIU Z., CAI S.: Results of a user study on 2D hurricane visualization. Computer Graphics Forum 27, 3 (2008), 991–998.

[MTY05] TORY M., ATKINS M. S., KIRKPATRICK A. E., NICOLAOU M., YANG G.-Z.: Eyegaze area-of-interest analysis of 2D and 3D combination displays. In Proc. IEEE Visualization (2005), pp. 519–526.

[MWFI04] MAO X., WATANABE D., FUJITA M., IMAMIYA A.: Gaze-directed flow visualization. In Proc. SPIE (2004).

[NBW14] NETZEL R., BURCH M., WEISKOPF D.: Comparative eye tracking study on node-link visualizations of trajectories. IEEE Transactions on Visualization and Computer Graphics 20, 12 (2014).

[PIIK05] PETERS R. J., IYER A., ITTI L., KOCH C.: Components of bottom-up gaze allocation in natural images. Vision Research 45, 18 (2005), 2394–2416.

[PSD09] POHL M., SCHMITT M., DIEHL S.: Comparing readability of graph layouts using eyetracking and task-oriented analysis. In Proc. Computational Aesthetics (2009), pp. 49–56.

[PW12] PINEO D., WARE C.: Data visualization optimization via computational modeling of perception. IEEE Transactions on Visualization and Computer Graphics 18, 2 (2012), 309–320.

[SD04] SANTELLA A., DECARLO D.: Visual interest and NPR: an evaluation and manifesto. In Proc. NPAR (2004).

[She98] SHEN H.-W.: A new line integral convolution algorithm for visualizing time-varying flow fields. IEEE Transactions on Visualization and Computer Graphics 4, 2 (1998), 98–108.

[TB96] TURK G., BANKS D.: Image-guided streamline placement. In Proc. SIGGRAPH (1996).

[War06] WARE C.: 3D contour perception for flow visualization. In Proc. APGV (2006), pp. 101–106.

[War08] WARE C.: Toward a perceptual theory of flow visualization. IEEE Computer Graphics and Applications 28, 2 (2008), 6–11.

[ZMT06] ZHANG E., MISCHAIKOW K., TURK G.: Vector field design on surfaces. ACM Transactions on Graphics 25, 4 (2006), 1294–1326.
