22
Journal of Visual Languages and Computing (2002) 13, 601^622 doi:10.1006/S1045-926X(02)00026-5 available online at http://www.idealibrary.com on The Role of Visual Perception in DataVisualization Mehdi Dastani Institute of Information and Computing Sciences, Universiteit Utrecht, Padualaan 14, De Uithof 3584CH Utrecht, The Netherlands, e-mail: [email protected] Received 12 July 2001; accepted 27 March 2002 This paper presents a perceptually motivated formal framework for e¡ective visualiza- tion of relational data. In this framework, the intended structure of data and the per- ceptual structure of visualizations are formally and uniformly de¢ned in terms of relations that are induced on data and visual elements by data and visual attributes, respectively.Visual attributes are analyzed and classi¢ed from the perceptual point of view and in terms of perceptual relations that they induce on visual elements. The presented framework satis¢es a necessary condition for e¡ective data visualizations. This condition is formulated in terms of a structure preserving map between the intended structure of data and the perceptual structure of visualization. r 2002 Published by Elsevier Science Ltd. 1. Introduction T heories of eFFective datavisualization aim to analyze and explain the relation between data and their e¡ective visualizations [1^6].This relation is essentially a denota- tion relation in the sense that visual elements and relations denote data elements and relations, respectively. For example, the number of cars produced by di¡erent factories can be visualized by a bar-chart. Each bar represents a car factory and its length represents the number of produced cars by that factory. The perceivable length relations between bars denote relations between the numbers of cars produced by di¡erent factories. The e¡ectiveness of a visualization depends on many conditions such as conventions, subjective user preferences, cultural settings, and the structural correspondence with its denoting data [4, 5, 7^9]. For example, the e¡ectiveness of a temperature map in which temperature values are visualized by color hue values increases when conventional color hue values are used for high and low temperatures, i.e. red for high temperatures and blue for low temperatures. Although the satisfaction of some conditions enhance the e¡ec- tiveness of visualizations, the satisfaction of other conditions are necessary for e¡ective data visualization. In general, we distinguish conditions for e¡ective data visualization into necessary and su⁄cient conditions. This distinction is closely related to the well-known distinction between expressiveness and e¡ectiveness [2, 3].There are at least two reasons for not using the expressiveness/e¡ectiveness distinction. First, the expressiveness criterion is de¢ned 1045-926X/02/$ - see front matter r 2002 Published by Elsevier Science Ltd.

The Role of Visual Perception in Data Visualizationpdfs.semanticscholar.org/95d1/24d8748ac5bddcd29b2b09a8585b1a697e63.pdf · with respectto agraphical languagewhichwewantto abstract

  • Upload
    others

  • View
    1

  • Download
    0

Embed Size (px)

Citation preview

Page 1: The Role of Visual Perception in Data Visualizationpdfs.semanticscholar.org/95d1/24d8748ac5bddcd29b2b09a8585b1a697e63.pdf · with respectto agraphical languagewhichwewantto abstract

Journal ofVisual Languages and Computing (2002) 13, 601^622doi:10.1006/S1045-926X(02)00026-5 available online at http://www.idealibrary.com on

The Role of Visual Perception in DataVisualization

Mehdi Dastani

Institute of Information and Computing Sciences, Universiteit Utrecht, Padualaan14,DeUithof 3584CHUtrecht,TheNetherlands, e-mail: [email protected]

Received12 July 2001; accepted 27March 2002

This paper presents a perceptually motivated formal framework for e¡ective visualiza-tion of relational data. In this framework, the intended structure of data and the per-ceptual structure of visualizations are formally and uniformly de¢ned in terms ofrelations that are induced on data and visual elements by data and visual attributes,respectively.Visual attributes are analyzed and classi¢ed from the perceptual point ofview and in terms of perceptual relations that they induce on visual elements. Thepresented framework satis¢es a necessary condition for e¡ective data visualizations.This condition is formulated in terms of a structure preserving map betweenthe intended structure of data and the perceptual structure of visualization. r 2002Published by Elsevier Science Ltd.

1. Introduction

Theories of eFFective datavisualization aim to analyze and explain the relationbetween data and their e¡ective visualizations [1^6].This relation is essentially a denota-tion relation in the sense that visual elements and relations denote data elements andrelations, respectively. For example, the number of cars produced by di¡erent factoriescanbe visualized byabar-chart. Eachbar represents a car factoryand its length representsthe number of produced cars by that factory.The perceivable length relations betweenbars denote relations between the numbers of cars produced by di¡erent factories.

The e¡ectiveness of a visualization depends on many conditions such as conventions,subjective user preferences, cultural settings, and the structural correspondence with itsdenoting data [4, 5, 7^9]. For example, the e¡ectiveness of a temperature map in whichtemperature values are visualized by color hue values increases when conventional colorhue values are used for high and low temperatures, i.e. red for high temperatures and bluefor low temperatures. Although the satisfaction of some conditions enhance the e¡ec-tiveness of visualizations, the satisfaction of other conditions are necessary for e¡ectivedata visualization.

In general, we distinguish conditions for e¡ective datavisualization into necessary andsu⁄cient conditions. This distinction is closely related to the well-known distinctionbetween expressiveness and e¡ectiveness [2, 3].There are at least two reasons for notusingthe expressiveness/e¡ectiveness distinction. First, the expressiveness criterion is de¢ned

1045-926X/02/$ - see front matterr 2002 Published by Elsevier Science Ltd.

Page 2: The Role of Visual Perception in Data Visualizationpdfs.semanticscholar.org/95d1/24d8748ac5bddcd29b2b09a8585b1a697e63.pdf · with respectto agraphical languagewhichwewantto abstract

M. DASTANI602

with respect to a graphical language which we want to abstract from. The aim of thispaper is to propose a general framework for e¡ective data visualization, which is inde-pendent of graphical languages and thus not limited to their expressions. Second, theexpressiveness criterion is claimed to be satis¢ed if and only if data are represented inthe visual structure and the e¡ectiveness criterion is claimed to be satis¢ed if themechan-ismof the humanvisual system is taken into account [2].Webelieve that the expressivenesscriterion makes only sense if visual structures are perceivable by human viewers. Thisimplies that both expressiveness and e¡ectiveness criteria depend on the human visualsystem and that the expressiveness/e¡ectiveness distinction is not a distinction betweenperceptually depending and perceptually not depending conditions.

In the context of e¡ective data visualization, the structural correspondence conditionis a necessary conditionwithout which the e¡ectiveness of visualizations cannot be guar-anteed.This condition should be de¢ned between the structure of data and the percep-tual structure of its visualization. The emphasize on perceptual structure in thisformulation is based on the assumption that the e¡ectiveness of visualizations dependon the e¡ective use of the capabilities of the humanvisual system to perceive visual struc-tures.We argue that a theory of e¡ective data visualization should therefore be based ontheories of human visual system that describe human capabilities of perceiving visualrelations. For example, the human capability to perceive relation between color hue va-lues is limited to the identity relation. For this reason, a visualization inwhich the age ofindividuals is visualized by color hue values is not e¡ective if the intended data relationsare quantitative relations between ages.

In this paper, we focus on the structural correspondence condition for e¡ective datavisualization and propose a formal framework for e¡ective data visualization, which sa-tis¢es the structural correspondence condition. This paper is organized as follows. InSection 2, we propose a process model for e¡ective data visualization and identify thesteps through which the structural correspondence between data and visualizationsshould be established. For this model, we discuss the class of input data structures, theprojection step from data structures to perceptual structures, the layout of perceptualstructures, and ¢nally the necessary condition for e¡ective data visualization. In Section3, some general types of data attributes that are important for data visualization are intro-duced and a classi¢cation of attribute-based data structures is proposed. Similarly, variousvisual attributes are analyzed from the perceptual point of view and a classi¢cationof perceptual structures is proposed. In Section 4, we build on formal de¢nitions of at-tributes and specify data and perceptual structures in a formal way. The structuralcorrespondence condition for e¡ective data visualization is then de¢ned as a structure-preserving relation between data and perceptual structures. Finally, in Section 5, weconclude the paper and discuss future research directions.

2. A Process Model for E¡ective DataVisualization

The process of data visualization is considered here as the inverse of the interpretationprocess, i.e. the visualization process generates for an input data a visualization theinterpretation of which results the original data. Since the e¡ectiveness of a visualizationcan be measured in terms of the easiness and directness of acquiring its intendedinterpretation and because the interpretation of a visualization depends on humanvisual

Page 3: The Role of Visual Perception in Data Visualizationpdfs.semanticscholar.org/95d1/24d8748ac5bddcd29b2b09a8585b1a697e63.pdf · with respectto agraphical languagewhichwewantto abstract

AnalysisData Data

Generation

Mapping Structure

VisualData

Structural Perceptual

Perception Layout

Pattern

DataStructured

Visualization Interpretation

Figure 1. A process model for e¡ective data visualization

VISUAL PERCEPTION IN DATAVISUALIZATION 603

perception, an e¡ective visualization should strongly bene¢t from the capabilities of thehumanvisual system.We formulate, therefore, the e¡ectiveness of visualization as follows:a visualizationpresents the inputdata e¡ectively ifthe intended structure ofthedata and theperceptual structureofthe visualization coincide. Based on this view, a process model for e¡ective datavisualization[10] is schematically illustrated in Figure 1.

This process startswith an inputdatawhich is assumed to consist of data elements.The¢rst step in data visualization is to determine the structure of data, i.e. howdata elementsare related to each other.This is not a trivial decision process since for example a set ofintegers may be related to each other nominally (i.e. integers are used as identi¢ers), ord-inally (i.e. integers are used as ordinals), or quantitatively (integers are used as quantities).This step in the process of data visualization transforms the input data to what is calledstructured data.The second step in data visualization is to determine visual elements thatrepresent data elements in such a way that the perceptual structure of the decided visualelements represents the structure of the data.We assume that the result of this step is notnecessarily a drawable visualization, but an abstract perceptual structure.Therefore, thethird and the last step in data visualization is the layout process which transforms theabstract perceptual structure to a drawable visualization.

On the other hand, the interpretation process starts with a visualization, possibly to-gether with a legend. Since the input of the interpretation process is a visualization andbecause visualizationsmust be perceived before theycan be understood as denotingdata,the ¢rst interpretation step is perception.The process of perception determines percei-vable relations among visual elements on the basis of their visual attribute values. Asvisual attribute values are assigned to visual elements by the layout process, perceptionis considered as the inverse of the layout process. Subsequently, visual elements aremapped to data elements. This mapping is possibly supported by a legend that deter-mines the correspondence betweenvisual and data attributes and their values. In the caseof e¡ective data visualization, the relations between data elements are structurally iden-tical to the perceivable relationsbetween their representingvisual elements. Finally, giventhe structured data, the table of attribute values can be trivially generated. In the rest ofthis section, we elaborate on di¡erent steps of this process model.

2.1. Structuring the Input Data

In this paper, the input data for the visualization process is assumed to be nested rela-tional data. This class of data can be represented as m� n data tables consisting of

Page 4: The Role of Visual Perception in Data Visualizationpdfs.semanticscholar.org/95d1/24d8748ac5bddcd29b2b09a8585b1a697e63.pdf · with respectto agraphical languagewhichwewantto abstract

Table 1. A data table describing the cooperationdegrees between car companies

company1 cooperation-degree company 2

d1 d 01 good d 02d2 d 01 best d 03d3 d 02 good d 01d4 d 03 best d 01d5 d 03 normal d 04d6 d 04 normal d 03

M. DASTANI604

m columns and n rows. A columnAi ði ¼ 1;y; mÞ consists of values of one data attri-bute and a row Bj ð j ¼ 1;y; nÞ consists of m values each belongs to one of attributesA1;y;Am:A rowof a table is called a data entrya.The values of attributesA1;y; Ammay themselves be rows of another k� l table that consists ofA0

1;y; A0k columns and

B01;y; B0

l rows. In such a case, attributes A1;y; Am are called aggregated attributes.An attribute value, which is a single value, is called an atomic value.

For example, consider Table 1, which describes the cooperation degree between pairsof car companies. In this table, company1and company2 are aggregated attributes since theirvalues are rows fromTable 2.This table describes the number of sold cars having a certaintype and being produced in a certain country. In this data table, car-type, sold-numbers, andcountry are attributes that have atomic values.These two data tables can be combined intoone single data table by means of aggregated attributes.Table 3 is obtained by combiningTables 1 and 2.

According to the proposed data visualization process model, the ¢rst step is to deter-mine the intended structure of the input data. In general, we assume that an input datacan be structured in manydi¡erent ways and that the intended structure of the input datais determined by human users.We also assume that human users have access to di¡erenttransformations that provide di¡erent structures of the input data [2].

Two types of data structures are distinguished.The structures of the ¢rst type are con-stituted by relations that are intended to exist among values of individual data attributes.These relations characterize the type of the corresponding attributes. For instance, valuesof the sold-number attribute are from the domain of integers and theymaybe intended to berelated to each other by some arithmetical relations.These arithmetical relations charac-terize the type of the sold-number attribute. Adata attribute together with its type is called atyped data attribute and a table that is de¢ned on typed data attributes is called a typeddata table. In a typed data table, relations that are de¢ned on values of individual attri-butes are induced on data entries in the sense that two data entries are related in the sameway as their attribute values are related. Relations on data entries may be induced by dif-ferent data attributes. For instance, data entries inTable 2maybe considered as equivalent

aMany researchers in this ¢eld, such as Bertin [1] and Card etal. [2], have used tables as the format of theinput data for the visualization process. However, they have used di¡erent terminology than used in thispaper. For example, Bertin uses the term data characteristics for data attributes and data objects for data entries,while Card etal. use data variables for data attributes and data cases for data entries.The terminology used in thispaper is motivated by the relational database terminology.

Page 5: The Role of Visual Perception in Data Visualizationpdfs.semanticscholar.org/95d1/24d8748ac5bddcd29b2b09a8585b1a697e63.pdf · with respectto agraphical languagewhichwewantto abstract

Table 2. A data table describing car companies interms of car types, the number of sold cars, and the

countries they produce cars

car-type sold-number country

d 01 vw 40 000 Germanyd 02 toyota 50 000 Japand 03 opel 30 000 Germanyd 04 mazda 20 000 Japan

Table 3. Tables 1 and 2 are combined

company1 cooperation-degree company 2

car-type sold-number country car-type sold-number country

vw 40 000 Germany good toyota 50 000 Japanvw 40 000 Germany best opel 30 000 GermanyToyota 50 000 Japan good vw 40 000 GermanyOpel 30 000 Germany best vw 40 000 GermanyOpel 30 000 Germany normal mazda 20 000 JapanMazda 20 000 Japan normal opel 30 000 Germany

VISUAL PERCEPTION IN DATAVISUALIZATION 605

or non-equivalent according to the car-type attribute, while they are quantitatively relatedto each other according to the sold-number attribute.

The structures of the second type are constituted by binary relations that are de¢nedonvalues of two di¡erent aggregated attributes.The values of these aggregated attributesshould be the entries of one and the same table. In fact, the characterizing relation foreach single aggregated attribute is the identity relation, but since the values of two aggre-gated attributes are fromone and the same domain (i.e. the entries fromone and the sametable), two aggregated attributes together de¢ne a binary relation between domain ele-ments. These binary relations are implicit in the data table and do not characterizeany single attribute. For instance, the binary relation de¢ned on the values of thecompany1 and company2 attributes in Table 1 can be represented as: fðd 01; d 02Þ; ðd 01; d 03Þ;ðd 02; d 01Þ; ðd 03; d 01Þ; ðd 03; d 04Þ; ðd 04; d 03Þg: This binary relation relates data entries fromTable 2.Binary relations can be characterized by their structural properties such as functional(injective, surjective, bijective), transitive, symmetric, and re£exive properties. InTable 1the values of the company1 and company2 attributes de¢ne such a binary relation which isnon-functional, symmetric, non-re£exive, and non-transitive. Notice that this type ofstructures can be extended by allowing n-ary relations between aggregated attributevalues. In such a case, data tables consist of n aggregated attributes.

2.2. Projecting Structured Data to Perceptual Structure

The second step invisualizing data is the projection of the structured data into a percep-tual structure. A perceptual structure is de¢ned as a typed visual table which is basically

Page 6: The Role of Visual Perception in Data Visualizationpdfs.semanticscholar.org/95d1/24d8748ac5bddcd29b2b09a8585b1a697e63.pdf · with respectto agraphical languagewhichwewantto abstract

Table 4. Avisual table for the data described inTable 1

connection1 Size connection 2

v1 v01 size2 v02v2 v01 size3 v03v3 v02 size2 v01v4 v03 size3 v01v5 v03 size1 v04v6 v04 size1 v03

M. DASTANI606

a table constituted by typed visual attributes.The type of a visual attribute is determinedby the relations that can be perceived among values of that attribute. Aswewill explain inthe next section, the perceivable relation for each visual attribute is determined before-hand based on theoretical and empirical studies of humanvisual perception.The rows ofa typed visual table will then be called visual entries.

The projection of the structured data into a perceptual structure consists of a bijectivefunction from typed data attributes to typed visual attributes and another bijective func-tion from rows of the data table (i.e. data entries) to rows of the visual table (i.e. visualentries).There are di¡erent functions that map typed data attributes to typed visual attri-butes.These di¡erent functions specify alternative visualizations of data.We assume thatthe second function does not map data attribute values to visual attribute values, but tovariables that stand for visual attribute values. These variables will be instantiated withactual values by the layout process. As we will explain in Section 4, for e¡ective datavisualization the projection should be a structure-preserving projection. This can beguaranteed by ensuring that the ¢rst function maps each typed data attribute to a visualattribute with identical type, and that the second function maps data attribute values tovisual variables in such a way that whenever two data attribute values are in a certainrelation (characterizing the type of that data attribute), their corresponding visual vari-ables are in the corresponding perceptual relation as well.

This projection speci¢es abstract visual entries that are related to each other by percep-tual relations induced by the typed visual attributes. For example, considerTable 1 onceagain.The ¢rst function maps the company1data attribute to the connection1visual attribute,company2 to connection2, cooperation-degree to size, car-type to xpos, sold-number to ypos, and countryto color-hue.The second function maps data entries to abstract visual entries such that thestructure preserving condition is satis¢ed. For instance, consider the data attribute countrywhich is characterizedby the identity relation.Whenever twovalues of the country attributeare identical, their corresponding color-hue variables are identical as well.This projectionresults typed visual tables illustrated inTables 4 and 5.

2.3. The Layout of Visualization

The variables in the typed visual table should be instantiated with visual attribute valuesto generate a drawable visualization of data. However, the typed visual table consists ofonly those visual attributes towhich a data attribute is projected.Therefore, it may be thecase that some visual attributes, the values of which are necessary to generate a drawable

Page 7: The Role of Visual Perception in Data Visualizationpdfs.semanticscholar.org/95d1/24d8748ac5bddcd29b2b09a8585b1a697e63.pdf · with respectto agraphical languagewhichwewantto abstract

VW

<(TOYOTA,50000,JAPAN),(VW,40000,GERMANY)><(OPEL,30000,GERMANY),(MAZDA,20000,JAPAN)>

<(VW,40000,GERMANY),(OPEL,30000,GERMANY)>

MAZDA

TOTOTA

OPEL

0

10000

20000

50000

30000

40000

NORMAL

BEST

GOOD

sold-numbers

vw toyota opel mazdacar-types

germany

japan normal-relationgood-relationbest-relation

japan

germany

big = best-relationmiddle = good-relationsmall = normal-relation

(a) (b)

(c)

Figure 2. Visualizations of the data described byTable 1

Table 5. Avisual table for the data described inTable 2

xpos ypos color-hue

v01 seg1 pos4 hue1v02 seg2 pos5 hue2v03 seg3 pos3 hue1v04 seg4 pos2 hue2

VISUAL PERCEPTION IN DATAVISUALIZATION 607

visualization, are not used by the projection. For instance, if the position attribute is notusedbya projection, thenvisual entries should receive positionvalues in order to becomedrawable.These unused visual attributeswill be called undecided visual attributes in con-trast to the decided attributes to which data attributes are projected.

Consider the visualizations that are illustrated in Figure 2. These are three possiblevisualizations for the data represented inTables 1 and 2. In visualization 2-A, shape andsize are undecided visual attributes. Moreover, in visualization 2-B position and shape,and invisualization 2-Cposition, are undecidedvisual attributes. Invisualization 2-C, thelabel attribute is decided for the car-type, sold-number, and country attributes. This exampleshows that more than one data attribute can be projected to a visual attribute. Of course,this does not hold for all visual attributes. For example, only one data attribute can beprojected to the color-hue attribute since otherwise the e¡ectiveness of visualization cannotbe guaranteed.We assume that the projection function is de¢ned based on this generalvisualization knowledge.Visualization 2-C shows also that a drawable visualization doesnot need to have values for all visual attributes. For example, in this visualization visualentries do not need shape values to become drawable.We assume that the combination ofdecided visual attributes determines which undecided visual attribute values should bespeci¢ed for visual entries to become drawable.

The generation of values for decided and undecided visual attributes is a constraintsatisfaction process. The constraints on the values of the decided visual attributes areimposed by the perceptual structure determined at the previous visualization step.Although the perceptual structure does not impose any constraints on the values of

Page 8: The Role of Visual Perception in Data Visualizationpdfs.semanticscholar.org/95d1/24d8748ac5bddcd29b2b09a8585b1a697e63.pdf · with respectto agraphical languagewhichwewantto abstract

M. DASTANI608

undecided visual attributes, these values cannot be generated arbitrarily since otherwiseunwanted visual implicature [11] mayoccur. In the next section, wewill discuss this issuein more details.The determination of values for decided and undecided visual attributeswill be called the layout process.

It should be noted that the textual information inserted in these visualizations mayeither be the values of the label attribute or it may be a part of the interpretation function.In order to distinguish these two kinds of textual information, we use capital letters onlyto indicate that a textual information is a value of the visual label attribute.

2.4. E¡ective DataVisualization

Adata table canbevisualized inmanydi¡erentwaysbydecidingdi¡erent visual attributesfor each data attribute and by specifying di¡erent visual entries for each data entry. InFigure 3, two di¡erent visualizations for Tables 1 and 2 are illustrated. In visualization3-A, the xpos, ypos, and color-hue attributes visualize the car-type, sold-number, and country attri-butes, respectively. Moreover, the links betweenvisual elements, de¢ned by the connection1and connection2 attributes, visualize the company1and company2 attributes, respectively. Final-ly, the thickness attribute of the links visualizes the cooperation-degree attribute. Invisualization3-B, the shape, color-hue, and label attributes visualize the sold-number, country, and car-type attri-butes, respectively. Like visualization 3-A, the links and their thickness visualize thecompanies and their degrees of cooperations, respectively. In visualizations 3-A and3-B the inserted numbers are parts of the interpretation function (i.e. legend).

Although these two visualizations represent the same data, visualization 3-B is not ane¡ective visualization if the quantitative relations between the numbers of sold cars arethe intended data relations. In this visualization, the perceivable relation induced by theshape attribute is the identity relation which implies that one can only perceive that thenumbers of sold cars are equal or not. Of course, one may deduce the intended quanti-tative data relations indirectly by means of the interpretation information that are in-serted into visualization 3-B. For example, the interpretation values 20 000 and 40 000implies that the second is twice the ¢rst. However, this inference is based on the knowl-edge of the perceiver about the structure of real numbers rather than being based onperceivable relations.The e¡ectiveness of visualization is the ability to perceive data rela-tions directly by structurally similar perceivable relations.

In order to illustrate the direct perception of relations, consider visualizations 4-Aand 4-B (Figure 4). These visualizations are the same as visualizations 3-A and 3-B,

0

10000

20000

50000

30000

40000

OPEL30000

MAZDA20000 VW

40000

TOYOTA50000

sold-numbers

car-typesvw toyota opel

normal-relationgood-relationbest-relation

japan

germany

normal-relationgood-relationbest-relation

japan

germany

mazda(a) (b)

Figure 3. Alternative visualizations of the data described inTable 1

Page 9: The Role of Visual Perception in Data Visualizationpdfs.semanticscholar.org/95d1/24d8748ac5bddcd29b2b09a8585b1a697e63.pdf · with respectto agraphical languagewhichwewantto abstract

0

OPELTOYOTA

MAZDAVW

sold-numbers

car-typesopeltoyotavw mazda

japan

germany

best-relationgood-relationnormal-relation normal-relation

good-relationbest-relation

japan

germany

(a) (b)

Figure 4. Visualizations without any interpretation values

VISUAL PERCEPTION IN DATAVISUALIZATION 609

respectively, except that in these visualizations the interpretation information are not pre-sented.Knowing that the shape attribute visualizes the sold-number attribute invisualization4-B, it is impossible to perceive any quantitative relation between visual elements andtherefore it is impossible to infer any quantitative data relation. In contrast, knowing thatthe ypos attribute visualizes the sold-number attribute in visualization 4-A, it is easy to per-ceive quantitative relations between visual elements and therefore easy to infer quantita-tive data relations. In the rest of this paper, whenever we refer to the perceptual structureof visualizationswemean the structure of visualizationswithout using any interpretationinformation.

The above examples show that the choice of visual attributes for data attributes mayin£uence the e¡ectiveness of visualizations. However, the speci¢cations of visual entriesby the layout process may in£uence the e¡ectiveness of visualizations as well.We arguedthat the layout process should specify values for both decided and undecided visual attri-butes. In the context of e¡ective data visualization, the generation of values for thedecided visual attributes is constrained by the perceivable relations. For example, in vi-sualization 3-Awhere the color-hue attribute is decided for the country attribute, the layoutprocess has generated identical color-hue values whenever their corresponding data attri-bute values are identical.

The generation of values for the undecided visual attributes may also in£uence thee¡ectiveness of visualizations. For example, consider again visualization 2-A. Althoughthis visualization represents the same information as visualization 3-A, it is not an e¡ec-tive visualization.The reason is that a humanviewer perceives di¡erences between shapesof visual elements and may infer that these di¡erences visualize di¡erences in the pre-sented data. In order to avoid this type of unwanted visual implicatures, the values ofundecided visual attributes should be speci¢ed in such a way to induce an identity rela-tion on the involved visual elements. In this way, visual elements will be perceived asbeing identical to each other with respect to undecided visual attributes.This issue willbe further explained in Section 4.3

3. An Attribute-based Classi¢cation of Structures

An important step in data visualization is the projection of data structures (typed datatable) to perceptual structures (types visual table). In this section, a unifying approach isproposed to analyze di¡erent classes of data and perceptual structures.This approach is

Page 10: The Role of Visual Perception in Data Visualizationpdfs.semanticscholar.org/95d1/24d8748ac5bddcd29b2b09a8585b1a697e63.pdf · with respectto agraphical languagewhichwewantto abstract

M. DASTANI610

based on a formalization of attributes in terms of their characterizing relations.The ad-vantage of this approach is that data and perceptual structures are both formalized in auniform way such that the projection between them can be formulated in a formal andintuitiveway. Based on this formalization a classi¢cation ofdata and perceptual structuresis proposed.

In order to de¢ne attributes in a formal way, we use some notions from themeasurementtheory [12, 13].The theory of measurement provides both a classi¢cation of attributes aswell as theirmathematical de¢nitionsby introducingdi¡erent types ofmeasurementscales. Infact, a measurement scale is a map from a structured set of elements to a structured set ofvalues like the set of real numbers, the set of integers, or a set of strings. In order tomodelaggregated attributes, we also allow an attribute to map a structured set of elements toanother structured set of elements. In the measurement theory, relational systems areused to represent structured sets.

De¢nition 1. A relational system is a pair /A;R1;y; RnS where A is a set ofelements, and R1;y; Rn are relations de¢ned onA:

We consider an attribute as a measurement scale, which map a structured set intoanother structured set.

De¢nition 2. An attribute is a homomorphism H from a relational system/A;R1;y; RnS into a relational system /B; S1;y; SnS:The set A is the set of ele-ments and the set B is either a set of elements or a set of attribute values such as the set ofreal numbers, the set of integers, or a set of strings.

In the context of data visualization,A can be either a set of visual elements or a set ofdata elements,R1;y; Rn can be either perceptual relations or data relations,B is either aset of elements or a set of values, and S1;y; Sn are characterizing relations de¢ned onB:These characterizing relations are abstract mathematical relations such as ¼; r and þ:The homomorphism guarantees that the relations that an attribute induces on elementshave identical structural properties as its characterizing relations. In the next subsections,we introduce data attributes and their corresponding data structures, followed by visualattributes and their corresponding perceptual structures.

3.1. Data Attributes

Below are ¢ve di¡erent types of data attributes that are relevant for data visualization.

Nominal attributes: A nominal attribute is a homomorphic map H from a relationalsystem /A;ES into a relational system /B;¼ S: Note that the homomorphic mapassigns values such as real numbers or strings to elements ofA:

Ordinal attributes: An ordinal attribute is a homomorphic mapH from a relational sys-tem/A;$S into the relational system/B;rS:The homomorphic map matches theordinal relation among attribute values with the ordinal relation among elements ofA:

Interval attributes: An interval attribute is a homomorphic mapH from a relational sys-tem/A;$; 3S into the relational system/Rk;r;~S;where~ is a quaternary me-trical relation de¢ned on real numbers. In the context of data visualization, we may usethe quaternary metrical relation~ de¢ned as: ab~cd3ja bjrjc dj:This attribute

Page 11: The Role of Visual Perception in Data Visualizationpdfs.semanticscholar.org/95d1/24d8748ac5bddcd29b2b09a8585b1a697e63.pdf · with respectto agraphical languagewhichwewantto abstract

VISUAL PERCEPTION IN DATAVISUALIZATION 611

is often called the absolute interval attribute because we are interested in the absolutedi¡erence between pairs of elements.

Ratio attributes: A ratio attribute is a homomorphic map H from a relational system/A; $; 3;NullS into the relational system /ðRZ0Þk;r;�; 0S;where � is a binarymetric operator de¢ned on real numbers and 0 is a zero-place operator identifying thezero number. Note that the zero number has the following property:8eAðR>0Þk 0� e ¼ 0: This zero-place operator identi¢es the origin element. In thecontext of data visualization, we may use the binary metrical operator � de¢ned as:a� b3a=b where aAðRZ0Þk and bAðR>0Þk; i.e. b is not the absolute origin element0 (ba0).

Aggregated attributes: An aggregated attribute is a homomorphic map H from a rela-tional system/A;ES into a relational system/B;¼ S;whereA andB are two distinctsets of data elements.This is in contrast with other attributes since the set B is the set ofdata elements instead of atomic values.

Although only interval and ratio attributes are introduced here, there may be manyquantitative attributes, each of which uses di¡erent sets of arithmetical relations andproperties of real numbers. This approach is general enough to de¢ne alternative dataattribute types.

3.2. AClassi¢cation of Data Structures

The proposed ¢ve data attributes result in several classes of data structures. First, wedistinguish between separable data structures that are constituted by characterizing rela-tions of separate data attributes, and aggregated data structures that are based on two ormore aggregated attributes. For example,Table 2 has a separable structurewhileTable1hasan aggregated structure. Separable structures can be subdivided into qualitative andquantitative structures. Subsequently, qualitative data structures can be subdivided intonominal and ordinal structures and quantitative structures into interval and ratio.

Aggregated data structures can be subdivided into pure andmixed structures depend-ing on the existence of non-aggregated attributes. The pure aggregated structures areconstituted onlybyaggregated attributes,whilemixed aggregated structures are also con-stituted by relations that characterize additional non-aggregated attributes. For example,Table 1has a mixed aggregated structure since it is based on two aggregated the company1and company2 attributes and the additional non-aggregated cooperation-degree attribute.With-out this non-aggregated attribute,Table 1 would have a pure aggregated data structure.These data structures can be classi¢ed hierarchically as illustrated in Figure 5.The formalde¢nitions of these data structures are given in Section 4.1.

3.3. Visual Attributes

Avisual attribute maps a relational system/A;R1;y; RnS;where A is a set of visualelements, into a relational system /B; S1;y; SnS; where B is either a set of visual ele-ments or a set of atomic visual attribute values. Since we are interested in perceptualcharacterizations of visual attributes, relations S1;y; Sn for a visual attribute shouldcharacterize the perceivable relations that are induced on visual elements by that visualattribute. For example, the human visual system can identify the equality of shape values

Page 12: The Role of Visual Perception in Data Visualizationpdfs.semanticscholar.org/95d1/24d8748ac5bddcd29b2b09a8585b1a697e63.pdf · with respectto agraphical languagewhichwewantto abstract

Aggregated

Qualitative

Nominal Ordinal Interval Ratio

Separable

Datastructures

Quantitative PureMixed

Figure 5. A partial classi¢cation of data structures

M. DASTANI612

and perceives that a red square and a green square have the same shapes, while it cannotidentify an ordering among a red circle and a red square. In the rest of this section, westudy a number of visual attributes from the perceptual point of view.

3.3.1. Non-spatialAttributes

The non-spatial visual attributes are studied by Bertin [1]b.These attributes include hue,saturation, brightness, size, shape, label, and texture.

Color-hueattribute:The color hue attribute can be de¢ned as a homomorphic map from arelational system, consisting of a set of visual elements and an equivalence relation, into arelational system consisting of a set of color hue values and the equality relation. Thisde¢nition of the color hue attribute assumes that the humanvisual system can only iden-tify the equality of color hue values, but cannot order themc.

Color-saturation attribute:The color saturation attribute is a homomorphic map from arelational system, consisting of a set of visual elements and an ordered equivalence rela-tion, into a relational system consisting of the set of color saturationvalues and the order-ing relationr: It is then assumed thatboth the identity and the ordering relations amongcolor saturation values can be perceived.

Color-brightness attribute:The color brightness attribute is a homomorphic map from arelational system, consisting of a set of visual elements and an ordered equivalencerelation, into a relational system consisting of the set of color brightness values and theordering relationr: It is then assumed that both the identity and the ordering relationsamong color brightness values can be perceived.

Shapeattribute:The shape attribute can be de¢ned in a similar way as the color hue attri-bute.The range of the shape attribute is then a pair consisting of a set of shape values and

bIn his book,GraphicsandGraphic InformationProcessing, he uses the term‘visual variables’rather than ‘visualattributes’.

cOf course, this is a narrow characterization of the color hue attribute. In fact, color hue values can insome cases be perceived as being ordered. For example, the rainbow colors (a certain sequence of the colorhue values) are perceived as having an ordering structure.This ordering structure can be characterized as aring structure that is constituted by an abstract mathematical relation like modulo relation.The character-ization of the color hue attribute becomes even more di⁄cult whenwe consider the fact that people (espe-cially visual artists) can know that the purple hue value is between blue and red, orange between red andyellow, but have no awareness of what may be between orange and blue. Nevertheless, we assume that theproperties of the color hue attribute can be speci¢ed by abstract mathematical relations, so that we cande¢ne this attribute as a homomorphic map between relational systems.

Page 13: The Role of Visual Perception in Data Visualizationpdfs.semanticscholar.org/95d1/24d8748ac5bddcd29b2b09a8585b1a697e63.pdf · with respectto agraphical languagewhichwewantto abstract

VISUAL PERCEPTION IN DATAVISUALIZATION 613

an equality relation de¢ned on it. It is then assumed that one can perceive the equality ofshapes but cannot perceive how one shape is related to another oned.

Sizeattribute:The range of the size attribute can be de¢ned as a pair consisting of a set ofpossible size values and an order relation de¢ned on it. However, the equality of sizevalues can be perceived by the human visual system only in special cases where the setof size values is a small ¢nite sete.

Label and Texture attributes:The label (texture) attribute is de¢ned as homomorphic mapfor which the range consists of a set of label (texture) values.The relation for these rela-tional systems is the equality relation.

3.3.2. SpatialAttributes

A rather di¡erent type of visual attributes concerns various uses of the extension of thespace [14].These attributes are called spatial attributes.The spatial attributes of multidi-mensional spaces interact.This implies that spatial attributes for multidimensional spacesmay employcomplex propertieswhich cannotbe de¢ned as properties of individual one-dimensional spaces. Here, we consider only those properties of multidimensional spacesthat can be de¢ned in terms of properties ofone-dimensional spaces.Therefore, we studydi¡erent spatial attributes for one-dimensional space, henceforth1D-space, and consideronly those spatial attributes of the 2D-space or the 3D-space that can be de¢ned by twoor three spatial attributes of the 1D-space.

Segmentation attribute:The use of1D-space bydividing it into subspaces is called segmen-tation. In this way, the total space is divided into a ¢nite number of segments.The seg-mentation attribute can then be de¢ned as a homomorphic map from a relational system,consistingof a set of visual elements and the equivalence relation, into a relational systemconsisting of a set of real numbers or names, each of which identi¢es a certain segment,and the equality relation de¢ned on it. It is then assumed that the humanvisual system canperceive the space as consisting of a number of perceptually distinguishable segments.

Ordered segmentation attribute:One can also use the ordinal property of 1D-space and con-sider it as an ordered set of subspaces.This consideration results the ordered segmenta-tion attribute, which can be de¢ned as a homomorphic map from a relational system,consistingof a set of visual elements and an ordered equivalence relation, into a relationalsystem consisting of a set of real numbers or names, each of which identi¢es a certainsegment, and an order relation de¢ned on this set.

Relative space attribute:The space can also be viewed in terms of its quantitative proper-ties.The quantitative properties of the 1D-space can be used in two di¡erent ways: the¢rst use of the quantitative properties of the 1D-space is by considering the space as arelative 1D-space consisting of positions in which no origin is de¢ned. In this case, thespace between two positions (interval) is a quantity that can be perceived and meaning-fully used, e.g. two intervals can be compared to each other.The resulting visual attributewill be called relative space attribute. This attribute can be de¢ned as a homomorphic

dIf we consider only a speci¢c set of shapes, we may perceive an ordering relation. For instance, in thecase of polygons, shapes can be ordered according to the number of edges (nodes) involved.

eOne can also argue that the size attribute is a quantitative attribute such that one can perceive that onesquare is twice as big as a second square. In this case, the domain and the range of the size attribute shouldcontain somemetric relation thatde¢nes a quantitative structure on the set of visual elements and the set ofattribute values, respectively.

Page 14: The Role of Visual Perception in Data Visualizationpdfs.semanticscholar.org/95d1/24d8748ac5bddcd29b2b09a8585b1a697e63.pdf · with respectto agraphical languagewhichwewantto abstract

M. DASTANI614

map from a relational system consisting of a set of visual elements and a set of relationsconsisting of an equivalence relation, an ordering relation, and a metrical relation.Therange of the map is a relational system consistingof the set of real numbers, each of whichidenti¢es a position, onwhich the equality relation, the ordering relation and the absolutedi¡erence relation between pairs of real numbers are de¢ned.

Absolute space attribute:The second use of the quantitative properties of the space is byconsidering the space as an absolute space in which one position is (explicitly or impli-citly) de¢ned to be the origin of the space. In this case, each position in the space is aquantity such that the proportionality between positions can be perceived and meaning-fully used.The resulting visual attributewill be called absolute space attribute.This visualattribute can be de¢ned as a homomorphic map from a relational system consisting of aset of visual elements and a set of relations consisting of an equivalence relation, anordering relation, and a metrical relation. The range of the map is a relational systemconsisting of the set of real numbers, each of which identi¢es a position, on which theequality relation, the ordering relation and the proportionality relation are de¢ned onreal numbers.

Combined attributes:The separate analysis of one-dimensional spatial attributes suggeststhat some spatial attributes for two- and three-dimensional spaces can be de¢ned as con-sisting of two and three one-dimensional spatial attributes, respectively. Given the abovefour spatial attributes as they are de¢ned for 1D-space, there are 16 possibilities for two-dimensional spatial attributes. But of course, some of those combinationsmayyield iden-tical results. For example, the two-dimensional spatial attribute consisting of the absolutespace attribute and the ordered segmentation attribute may be the same as the two-dimensional spatial attribute consisting of the ordered segmentation attribute and theabsolute space attribute.The perceptual structure of most visualizations is induced by acombination of visual attributes. An example of the combination of visual attributes istwo-dimensional bar-chart. In a bar-chart, one 1D-space (e.g. x-axis) is structured bydividing it into various segments while the second 1D-space (e.g. y-axis) is structuredby its metrical quantities.

3.3.3. TopologicalAttributes

A special type of visual attributes concerns various uses of topological properties of thespace.The perceptual structures that are constituted by perceivable topological relationsare for example, inside, outside, overlap, and connectedness. Topological relations areoften used invisualizations like trees, £owcharts, networkdiagrams, path diagrams, sub-way maps, andVenn diagrams. In these visualizations, the perceivable topological rela-tions are either explicitly represented by visual elements such as arrow-connections andshaded areas, or implicitly represented by placing visual elements in relations such asinclusion, exclusion, or overlap. For example, arrow connections in visualization 6-A(Figure 6) represent topological relations between circles explicitly, shaded areas in visua-lization 6-B represent topological relationsbetween closed-curves explicitly, and overlapsbetween disks in visualization 6-C represent topological relations implicitly.

Topological relations are assumed to be induced on a set of visual elements V bymeans of two or more topological attributes.The values of these topological attributesare elements of V: If topological relations are explicitly represented by some visual ele-ments, like arrow-connections or shaded areas, then topological attributes are considered

Page 15: The Role of Visual Perception in Data Visualizationpdfs.semanticscholar.org/95d1/24d8748ac5bddcd29b2b09a8585b1a697e63.pdf · with respectto agraphical languagewhichwewantto abstract

Figure 6. Examples of explicitly (A and B) and implicitly (C) represented topological relations

VISUAL PERCEPTION IN DATAVISUALIZATION 615

to be attributes of these visual elements.Visual elements that explicitly represent topolo-gical relations may also have other attributes such as size, color, and texture. Each topo-logical attribute induces only the identity relation on visual elements that representtopological relations explicitly. For example, in visualization 6-A the arrows are de¢nedon the basis of two topological attributes Start-link and End-linkwhich assign two circlesto each arrow and therefore relate two circles to each other. Each of these two attributesinduces only the identity relation on arrows.The size values of these arrows induce anordinal relation on them such that one perceives one arrow as thicker or thinner thananother arrow. Similarly, in visualization 6-B the shaded areas are de¢ned on the basisof two topological attributes included-in-1st-element and included-in-2th-element. In this way, ashaded area relates two closed-curves to each other.

In the case of implicitly represented topological relations, topological attributes generatea set of tuples, each of which relates visual elements to each other. These tuples do notrepresent any visual elements. For example, in visualization 6-C, two attributes ¢rst-overlap-element and second-overlap-element generate a set of binary tuples relating two disks implicitly.

An explicitly represented topological attribute is de¢ned as a homomorphic map from arelational system, consisting of a set of visual elements (i.e. those that explicitly representtopological relations) and an equivalence relation, into a relational system consistingof a setof visual elements (i.e. those that are values of topological attributes) and the identity rela-tion. Notice that explicitly represented topological relations, such as arrow-connections,can alsobe related to each otherbymeans of visual attributes such as size attribute. Similarly,an implicit topological attribute is a homomorphicmap froma relational system, consistingof a set of invisible elements and an equivalence relation, into a relational system consistingof a set of visual elements (e.g. Euler circles) and the identity relation. Although this treat-ment of implicit topological attributes is not intuitive since it assumes invisible elements, ithas the advantage of modelling all visual attributes uniformly.

Topological relations may have various structural properties like being functional (in-jective, surjective, bijective), transitive, re£exive, and symmetric.Topological structurescan be further subdivided into structures such as tree-structures, cyclic and acyclicgraph-structures. A detailed classi¢cation of topological relations and structures ispresented in [15].

3.4. AClassi¢cation of Perceptual Structures

The proposed visual attributes result in several classes of perceptual structures. First,we distinguish between spatial/non-spatial perceptual structures that are constituted bycharacterizing relations of spatial and non-spatial attributes, and topological structuresthat are based on two or more topological attributes. For example,Table 5 has a spatial/non-spatial structure whileTable 4 has a topological structure. Spatial/non-spatial struc-tures can be subdivided into continuous and discontinuous structures. Subsequently,

Page 16: The Role of Visual Perception in Data Visualizationpdfs.semanticscholar.org/95d1/24d8748ac5bddcd29b2b09a8585b1a697e63.pdf · with respectto agraphical languagewhichwewantto abstract

Topological

Discontinuous

Nominal Ordinal Interval Ratio

Spatial/Non-Spatial

Perceptualstructures

Continuous ImplicitExplicit

Figure 7. A partial classi¢cation of perceptual structures that are used in data visualization

M. DASTANI616

discontinuous structures can be subdivided into nominal and ordinal structures and con-tinuous structures into interval and ratio.

Topological structures can be subdivided into implicit and explicit structures depend-ing on the existence of non-spatial attributes. Implicit topological structures are consti-tuted only by topological attributes while explicit topological structures are alsoconstituted by relations that characterize additional non-spatial attributes. For example,Table 4 has an explicit topological structure since it is based on the two topological con-nection1 and connection2 attributes and the additional non-spatial size attribute.Without thisnon-spatial attribute,Table 4 would have an implicit topological structure.This classi¢ca-tion can be extended by including more speci¢c types of perceptual structures. For ex-ample, explicit topological structures can be classi¢ed further into speci¢c types such asgraphs and trees.

The formal de¢nitions of these perceptual structures are given in Section 4.1.Theseperceptual structures can be classi¢ed hierarchically as illustrated in Figure 7. Note theclose similarity between this classi¢cation of perceptual structures and the classi¢cationof data structures illustrated in Figure 5.

This classi¢cation does not account for the interaction between visual attributes,which may in£uence the perceptual structure of visualizations [16, 17], Shepard-A. Ofcourse, the interaction between visual attributes can be prevented by de¢ning new attri-butes each of which is a constraint combination of the interacting attributes. For example,consider the color-hue, color-saturation, and color-brightness attributes, which are three interact-ing attributes. One may de¢ne a new color-1 attribute for which the values are triples con-sisting of a hue, a saturation, and a brightness value.To prevent the interaction betweenthese three attributes, the triples can be constraint in such away that each triple consists ofone and the same saturation value and one and the same brightness value.This impliesthat the use of one visual attribute in a visualization excludes the use of all interactingvisual attributes in that visualization.

4. A Formal Framework For DataVisualization

In this section, we build on de¢nitions from Section 3 and provide a formal descriptionof the processmodel outlined in Section 2. In particular, we de¢ne two relational systemsthat are constructed based on data attributes and visual attributes, respectively. One rela-tional system speci¢es the structure of the data and the second relational system speci¢es

Page 17: The Role of Visual Perception in Data Visualizationpdfs.semanticscholar.org/95d1/24d8748ac5bddcd29b2b09a8585b1a697e63.pdf · with respectto agraphical languagewhichwewantto abstract

VISUAL PERCEPTION IN DATAVISUALIZATION 617

the perceptual structure of the visualization.The structural correspondence condition,which is a necessary condition for the e¡ectiveness of visualizations, is then modeled bydemanding a structure-preserving mapping between these two relational systems.

4.1. Structured Data and Perceptual Structures

As explained in Section 2.1, an m� n typed table consists of a set of attributesA1;y; Am and a set D ¼ fd1;y; dng of table entries. In this section, an m� n typedtable is formally de¢ned in terms of a combination of the mappings between relationalsystems where each mapping speci¢es one attribute of the table.

De¢nition 3. Let attributes A1;y; Am be de¢ned as in Section 3, i.e.A1 : /D;R1

1;y; R1kS-oV1; S1

1;y; S1k >

y

Am : /D;Rm1 ;y; Rm

l S-oVm; Sm1 ;y; Sml >These attributes share the set of elements D and can be combined to de¢ne a com-

pound mapping as follows:M : /D;R1

1;y; R1k;y; Rm

1 ;y; Rml S-/V1 �y� Vm; S1

1;y; S1k;y;

Sm1 ;y; Sml SAnm� n typed table based on attributesA1;y; Am is then the followingrelational system:/D; R1

1;y; R1k;y; Rm

1 ;y; Rml S:

It is important to note that all attributes of one table share one and the same set ofelementsD:The value of attributeAi for each element djAD is from the setVi such thateach element can be speci¢ed as an n-tuple of values from V1 �?�Vm:This de¢ni-tion can be used to de¢ne typed data tables as well as typed visual tables. For example,consider againTables 1 and 2 describing the number of sold cars by di¡erent car compa-nies in various countries and their cooperation degrees.

LetD0 ¼ fd 01;y; d 04gbe the set ofdata entries of Table 2 consistingof values of the car-type, sold-number, and country attributes, and let same car;more car; null; and quantity berelations de¢ned on D0:These relations are the identity relation between car types, theordinal relation between amounts of cars, the zero amount, and the quantitative relationbetween amounts of cars, respectively. Let also D ¼ fd1;y; d6g be the set of data en-tries of Table 1consisting of values of the company1, cooperation-degree and company2 attributes,and let better and same company be relations de¢ned onD:These relations are the ordi-nal relation between cooperation degrees of car companies and the identity relation be-tween car companies, respectively.Then, attributes of Tables1and 2 can be de¢ned as thefollowing homomorphic maps:

car-type : /D0; same-carS-/fvw; toyota; opel;mazdag;¼ S

sold-number : /D0; more-car; quantity; nullS-/Rþ;r;�; 0S

where 8a; bARþif ba0 then a� b ¼ a=b:

country : /D0; same countryS-/fgermany; japang;¼ S

cooperation degree : /D; betterS-/fnormal; good ; bestg;rS

company1 : /D; same-companyS-/D0;¼ S

company2 : /D; same-companyS-/D0;¼ S

Page 18: The Role of Visual Perception in Data Visualizationpdfs.semanticscholar.org/95d1/24d8748ac5bddcd29b2b09a8585b1a697e63.pdf · with respectto agraphical languagewhichwewantto abstract

M. DASTANI618

Note that company1 and company2 map elements from D to elements from D0; which arethemselves mapped to values by other data attributes.This is due to the involved aggre-gated data attributes. Tables 1 and 2 can now be represented as the following relationalsystem:

/D,D0; same-car;more-car; quantity; null; same-country; better; same-companyS

In a similar way, Tables 4 and 5 can be represented as a relational system. Let V0 ¼fv01;y; v04g be the set of visual entries of Table 5 consisting of values of the xpos, ypos,and color-hue attributes, and let same-seg, longer, null, and length be relations de¢ned on V0:These relations are the identity relation between segments, the ordinal relation betweenpositions, the zero position, and the quantitative relation between positions, respectively.Let also V ¼ fv1;y; v6g be the set of visual entries of Table 4 consisting of values ofconnection1, size, and connection2 attributes, and let thicker and same element be relations de-¢ned onV:These relations are the ordinal relation between the thickness of arrows andthe identity relation betweenvisual elements, respectively.Then, the attributes of Tables 4and 5 can be de¢ned as the following homomorphic maps:

xpos : /V0; same-segS-/fseq1; seg2; seg3; seg4g;¼ S

ypos : /V0; longer; length; nullS-/Rþ;r;�; 0S

where 8a; bARþif ba0 then a� b ¼ a=b:

color-hue : /V0; same-hueS-/fblack;whiteg;¼ S

size : /V; betterS-/fsize1; size2; size3g;rS

conection1 : /V; same-elementS-/V0;¼ S

connection2 : /V; same-elementS-/V0;¼ S

Tables 4 and 5 can now be represented as the following relational system:

/V,V0; same-seg; longer; length; null; same-hue; thicker; same-elementS

4.2. Structure-preserving Mapping

In the previous section, typed data tables and typed visual tables are de¢ned as relationalsystems that are obtained by combining data and visual attributes, respectively.The visualization relation between a typed data table and a typed visual tablecan now be de¢ned as a mapping between their corresponding relational systems.Although any mapping from the relation system of the data table to the relationalsystem of the visual table provides a visualization, only a subset of them can be claimedto satisfy the structural correspondence condition. This condition will be satis¢edby demanding a structure-preserving map. In fact, the structural correspondencecondition will be satis¢ed if and only if there is an isomorphism between relational sys-tem that represents the typed data table and relational system that represents the typedvisual table.

De¢nition 4.An isomorphismbetween/D1;R1;y; RnS and/D2; S1;y; SnS is aone-to-one and onto mapping between D1 and D2 and a one-to-one mapping between

Page 19: The Role of Visual Perception in Data Visualizationpdfs.semanticscholar.org/95d1/24d8748ac5bddcd29b2b09a8585b1a697e63.pdf · with respectto agraphical languagewhichwewantto abstract

VISUAL PERCEPTION IN DATAVISUALIZATION 619

relationsRi and Si (for i ¼ 1;y; n) which satis¢es the following condition: If a relationRi holds between two elements of D1; the corresponding relation Si holds between thecorresponding elements of D2; and if Ri does not hold between two elements of D1; Sidoes not hold between the corresponding elements of D2:

The isomorphic mapping guarantees that each data entry corresponds to a visualentry, and whenever there exists an intended relation between two data entries theircorresponding visual entries are related to each other by a structurally identicalperceptual relation. The isomorphic mapping requires a correspondence betweendata relations and visual relations that have the same type. This condition is a partialreformulation of the e¡ectiveness criterion, which states that in an e¡ective visualizationdata attributes are visualized by visual attributes that have the same attribute type.It is important to note that the isomorphic mapping is a necessary but not a su⁄cientcondition for e¡ective visualization. In fact, other design factors such as cultural conven-tions and subjective preferences play important roles in the e¡ectiveness of visualizationstoo.

4.3. UndecidedVisual Attributes and E¡ectiveVisualization

The isomorphic relation is de¢ned between a relational system that represents atyped data table and a relational system that represents a typed visual table. However,the typed visual table is only based on decided visual attributes. In Section 2.4, it isargued that the choice of values for undecided visual attributes play also an importantrole in the e¡ectiveness of visualization. It is shown that a wrong choice of values ofthe undecided visual attributes may result unwanted visual implicatures, as illustratedin visualization 2-A.

In this section, we brie£y elaborate on the choice of values for undecided visual attri-butes to prevent a certain class of unwanted visual implicatures.The unwanted implica-tures that we aim to prevent are perceivable relations that do not represent any intendeddata relations. For example, consider visualizations that are illustrated in Figure 8.Thedi¡erence between these two visualizations is the position value of the second box at themiddle layer of the graph. In visualization 8-A, boxes 1,3, and 4 are perceived as beingrelated to each other, but di¡erentiated from box 2. If there is no di¡erence between dataelements that are visualized byboxes1,3, and 4 on the one hand and the data element thatis visualized by box 2 on the other hand, then the suggested relation between boxes 1,3,and 4 and their di¡erence with respect to box 2 does not represent any data relation andtherefore it forms an unwanted visual implicature. The same type of unwanted visual

Figure 8. Boxes 1,2,3, and 4 in diagram A cause unwanted visual implicature while in diagram B they donot, assuming that they both visualize the same information and that diagram B visualize it correctly

Page 20: The Role of Visual Perception in Data Visualizationpdfs.semanticscholar.org/95d1/24d8748ac5bddcd29b2b09a8585b1a697e63.pdf · with respectto agraphical languagewhichwewantto abstract

M. DASTANI620

implicature has also occurred invisualization 2-Awhere the perceivable relation betweenshapes of visual elements did not represent any data relation.

This class of unwanted visual implicatures can be avoided by demanding that eachundecided visual attribute should only induce the identity relation on visual elements.Visual elements will then be perceived as being in one perceptual groupwith respect toundecided visual attributes. Many visual attributes such as color-hue, size, shape, andtexture induce the identity relation on visual elements when they assign identical valuesto visual elements. For other visual attributes such as position attribute, visual elementscannot have identical values since otherwise visual elements will be placed on the top ofeach other. However, these visual attributes can induce the identity relation on visualelements if there exits perceptually motivated regularity between values that they assigntovisual elementsf. For example, invisualization 8-B, the equidistance regularitybetweenposition values induces the identity relation on the four boxes at the middle layer of thegraph. A detail analysis of the classes of perceptually motivated regularities for di¡erentvisual attributes can be found in [10, 18].

We assume a set of empirically validated and application-dependent perceptually mo-tivated regularity relations for various visual attributes and de¢ne undecided visual attri-butes in terms of these regularity relations. In particular, undecided visual attributes arede¢ned as attributes that induce the identity relation on visual elements by assigningvalues, which are related to each other by perceptually motivated regularity relation, tovisual elements.

De¢nition 5. Let B be the set of values of the visual attribute a among which thereexists a perceptually motivated regularity relation Reg:Then, a is an undecided visualattribute in a visualization if and only if it is a homomorphic map from the relationalsystem /A;ES; where A is the set of visual elements, into a relational system/B;RegS:

For example, consider againvisualization 8-B and suppose that the xpos attribute is anundecided visual attribute. LetN be the set of integers considered as the positionvalueson the x-axis and þ1 be a perceptually motivated regularity relation. Then, the unde-cided visual attribute xpos can be de¢ned as the following mapping:

xpos : /A;ES-/N;þ1S:

It is important to note that the regularity relations, which are supposed to induce theidentity relation on visual elements, are context sensitive. An example of this contextdependency is the Mueller^Lyer illusion, which is illustrated in Figure 9.

This example shows that the regularities of attribute values of visual elements (in thiscase horizontal lines) do not induce the identity relation on them such that humanviewercannot perceive them as being in one perceptual group.This e¡ect is due to the presenceof some other visual elements (in this case the attachedS and/ elements). However, this

fThere is a close relationship between Gestalt laws, which explain the perceptual groupings of visualelements, and the regularity principle. In fact, according to the psychologically motivated Structural In-formationTheory [18], regularity is the basic principle that governs human perceptual system such that allGestalt laws can be explained in terms of the regularity principle.

Page 21: The Role of Visual Perception in Data Visualizationpdfs.semanticscholar.org/95d1/24d8748ac5bddcd29b2b09a8585b1a697e63.pdf · with respectto agraphical languagewhichwewantto abstract

Figure 9. The Mueller^Lyer illusion

VISUAL PERCEPTION IN DATAVISUALIZATION 621

context problem is out of the scope of this paper since the aimwas to formulate necessaryconditions for e¡ective data visualization and not su⁄cient conditions.

5. Conclusion and Future Research

The main focus of this paper was to formulate the structural correspondence condition,which is a necessary condition for e¡ective data visualization.We argued that the corre-spondence between data and its visualization should be perceptually motivated in thesense that the intended structure of data should coincide with the perceptual structureof its visualization.To this end, we have studied perceptually motivated structures thatcan be used in visualizations to represent data structures.These perceptual structures areformally de¢ned in terms of perceivable relations that are induced on visual elements bymeans of visual attributes. Based on the formal de¢nitions of data and perceptual struc-tures, we have formulated the structural correspondence condition as a structure-preser-ving map between data and perceptual structures.

In this paper, we have avoided the term ‘visual languages’, which is often used in theliterature of data visualization [3,19, 20]. In these studies, each visualization is consideredas an expression or a sentence of a visual language. Avisual language can be de¢ned inour proposed framework as consisting of two relational systems, a data system and avisual system, and a mapping between them.The mapping between these relational sys-tems represents the denotation relation between visualizations and data. Di¡erent pairsof relational systems specify di¡erent visual languages. Moreover, for each pair of rela-tional systems, there are many mappings possible, each of which speci¢es a di¡erentvisual language. Avisual language may generate di¡erent visualizations by using eitherdi¡erent number of elements from visual systems or by using di¡erent values for theirvisual attributes.Two di¡erent bar-charts that di¡er from each other in the number ofbarsor by their visual attribute values are considered as di¡erent expressions of one and thesame visual language, namely the bar-chart language.Various kinds of maps, diagrams,graphs, and £ow-charts can be considered as di¡erent visual languages.We leave the de-tails of de¢ning various visual languages in our framework for the future research.

An important issue which is not worked out here is the interaction between variousvisual attributes.These interactions are essential since they in£uence the perceptual struc-ture of visualizations [16,17]. In order to study the interaction of attributes, we may use apsychological notion called integrality of attributes. Two attributes are called integralwhen the perceptual structure is not induced by one or the other attribute separately,but rather by the overall similarities of visual elements caused by the combination ofthe integral attributes. For example, using the hue, the saturation, and the brightnessattributes of visual elements, the induced structure is not byone of these three attributes,but it is induced by the overall similarities of elements which is caused by the hue, thesaturation, and the brightness attributes together. In contrast, two attributes are callednon-integral when the perceptual structure of elements is induced by one or the other

Page 22: The Role of Visual Perception in Data Visualizationpdfs.semanticscholar.org/95d1/24d8748ac5bddcd29b2b09a8585b1a697e63.pdf · with respectto agraphical languagewhichwewantto abstract

M. DASTANI622

attribute. For example, considering the shape and the texture attributes of visual elements,a perceptual structure may be induced by the shape attribute, by the texture attribute, orby both of them. In the last case, each attribute supports that perceptual structure sepa-rately.We have focused on perceptual structures that are de¢ned in terms of non-integralattributes. A speci¢cation of structures that are de¢ned in terms of integral attributesremains an open problem for the future research.

References

1. J. Bertin, (1981)Graphics andGraphic Information-Processing.Walter de Gruyter, Berlin NewYork.2. S.K. Card, Mackinglay, J.D. & B. Shneiderman (1999) Readings in InformationVisualization-

FUsingVision toThink. Morgan Kaufmann, San Francisco.3. J.Mackinlay (1986)Automating the design of graphical presentations of relational informa-

tion.ACMTransactions onGraphics 5,110^141.4. E.R.Tufte (1983) TheVisual Display of Quantitative Information. Graphics Press, Cheshire, CT.5. E.R.Tufte (1990) Envisioning Information. Graphics Press, Cheshire, CT.6. M. Twyman (1979) A schema for the study of graphic language. In: Processing of Visible

Language (P.A. Kolers, M.E.Wrolstad, & H. Bouma, eds),Vol.1. Plenum Press, NewYork.7. W.J. Mitchell (1993) A computational view of design creativity. In: Modeling Creativity and

Knowledge-basedCreativeDesign (J.S. Gero&M.L. Maher, eds). Lawrence ErlbaumAssociates,Hillsdale, NJ, pp. 25^42.

8. M. Petre &T.R.G. Green (1993) Learning to read graphics: Some evidence that ‘seeing’aninformation display is an acquired Skill. Journal ofVisual Languages and Computing 4, 55^70.

9. S.S. Stevens (1946) On the theory of scales of measurement. Science 103, 667^680.10. M. Dastani (1998) Languages of perception. Ph.D. thesis, Institute of Logic, Language, and

Computation (ILLC),The Netherlands.11. J. Marks & E. Reiter (1990) Avoiding unwanted conversational implicatures in text and

graphics. ProceedingAAAI, Menlo Park, CA.12. J. Pfanzagl (1968) Theory ofMeasurement. Physica-Verlag,W.urzburg-Wien.13. K. Reichenberger, T. Kamps & G. Golovchinsky (1995) Towards a generative theory of

diagram design. In: IEEEProceedings on InformationVisualization (N. Gershon & S. Eick, eds),October 1995. IEEE Computer Society Press, Silver Spring, MD, pp.11^18.

14. Y. Engelhardt, J. de Bruin,T. Janssen, &R. Scha (1996) The visual grammar of informationgraphics. Arti¢cial Intelligence inDesign (AID’96), in theWorkshop onVisual Representation, Reasoningand Interaction inDesign, 24^27 June, StanfordUniversity, California.

15. T.M.Kamps (1997) Aconstructive theory for diagram design and its algorithmic implemen-tation. Ph.D. thesis.

16. W.R. Garner (1974) The Processing of Information and Structure. Lawrence Erlbaum Associates,Potomac, MD.

17. R.N. Shepard (1991) Integrality versus separability of stimulus dimensions: from an earlyconvergence of evidence to a proposed theoretical basis. In: The Perception of Structure(G.R. Lockhead&J.R. Pomerantz, eds). American Psychological Association,Washington,DC, pp. 53^71.

18. P.Van der Helm (1994) The dynamics of pr.agnanz. Psychological Research 56, 224^236.19. C.A.Gurr (1999)E¡ective diagrammatic communication: syntactic, semantic and pragmatic

issues. Journal ofVisual Languages and Computing, 10, 4.20. D.Wang, & H. Zeevat (1998) A syntax-directed approach to picture semantics. In:Visual

LanguageTheory:Towards a Human^Computer Interaction Perspective. (B. Meyer & K. Marriott, eds)Springer-Verlag, Berlin, 307^324.