
INTERACTING WITH GIS: FROM PAPER CARTOGRAPHY TO VIRTUAL ENVIRONMENTS

Tiziana Catarci, Fabrizio d'Amore, Dipartimento di Informatica e Sistemistica, Università di Roma "La Sapienza", ITALY

Paul Janecek, Stefano Spaccapietra, EPFL, Lausanne, Switzerland

Keywords: Visual information processing, perception, visual interaction, advanced interfaces

Contents

1. Introduction
2. Cognitive science aspects of GIS
   2.1. Perceptual aspects of maps
   2.2. Semiotic aspects of maps
   2.3. Navigation using maps
   2.4. Cognitive maps: internal representation of space
      2.4.1. Cognitive spatial development in children
      2.4.2. Cognitive spatial development in adults
      2.4.3. Accuracy of spatial knowledge from maps
      2.4.4. The influence of the environment
   2.5. Design issues for navigational maps
3. Multimedia and hypermedia systems
   3.1. Components of hypermedia
      3.1.1. Common features of nodes and links
      3.1.2. Nodes
      3.1.3. Links
   3.2. Problems with hypermedia
   3.3. Functionalities of hypermedia
      3.3.1. Views
      3.3.2. Navigation
      3.3.3. Annotation
   3.4. Design issues for hypermaps
4. Virtual reality and 3D GIS
   4.1. High level interaction
   4.2. 3D GIS
   4.3. 3D visualization
   4.4. The KarmaVI project
5. Visual Querying
6. Animation
   6.1. Animation in GIS
7. Conclusions

Glossary

Animation: A graphical interface feature that covers all object changes that have a visual effect, including changes in position, color, structure, texture, rendering, etc.

Automatic perceptual processing: Direct perception of a concept from a sign without decoding. Automatic processing is fast, parallel and unconscious.

Cognitive map: A mental representation that allows a person to find their way through a spatial environment such as a city, a large building, or a virtual world. A cognitive map is based on experience and cognitive spatial ability, and may have properties of the different types of mental representations (i.e., a cognitive map may be a collection of mental representations of the environment of interest).

Controlled perceptual processing: A studied decoding of a sign from its external expression to an internal concept in the mind of the reader. Controlled processing is slow, serial, and requires conscious thought.

Environment: With respect to cognitive aspects of maps, the environment is the spatial region that is the subject of the map (e.g., a city, library, or hypermedia network).

Geographical Information System (GIS): A software system providing the user with several ad hoc features and tools for assembling, storing, integrating, checking, manipulating, displaying and analysing geographically referenced information, i.e., data identified according to their locations. Typically, GIS make extensive use of different graphical maps and allow the user to interact with several views of the data, possibly at different levels of detail. Primary application areas include government, town planning, local authority and public utility management, environmental and resource management, engineering, business, marketing, and distribution.

Graphical map: A graphical representation of an area of interest showing the spatial relationships between objects. Graphical maps are usually printed on paper or displayed on a computer screen, and are visually perceived and interpreted by a reader.

Hypermedia: A dynamic, associative network of nodes and links for organizing, relating, accessing, and sharing information.

Hypermap: A hypermedia version of a graphical map with embedded links to related information, such as images, text, and adjacent or more detailed maps.

HTML (HyperText Markup Language): A language for defining hypertexts, with the capability of specifying text, graphics, multimedia options, scripting languages, style sheets and printing facilities. HTML is developed by the World Wide Web Consortium (W3C) and is the most widely used publishing language of the World Wide Web, even though several new standards are emerging.

Mental Representation: Johnson-Laird, in a seminal 1983 book on mental models, proposed three types of mental representations: propositional representations, which are symbolic and allow logical manipulation; mental models, which are structural and allow visual reasoning; and images, which are perceptual and “show” a model from a particular point of view.

Stevens’ Law: The perceived magnitude of a scale is the actual physical magnitude raised to a power, β, where the value of β varies for different visual attributes and from person to person.

Triangulation: The process of relating the geometry of the map with the geometry of the real world.

Usability: The extent to which a product can be used by specified users to achieve specified goals with effectiveness, efficiency and satisfaction in a specified context of use (from ISO/DIS 9241-11).

Virtual Reality: Computer-generated environment with realistic appearance, behaviour, and interaction techniques.

VRML: An abbreviation for Virtual Reality Modeling Language. It is a scene description language that describes the geometry and behavior of a 3D scene or "world". It is used mainly on the Internet as an open standard for modeling 3D multimedia and shared virtual worlds.

Weber’s Law: The probability of perceiving a difference between two visual values (e.g., length, position, area) is linearly related to the percentage difference in magnitude rather than the absolute difference.

Summary

Recent GIS provide several mechanisms to support users in exploring, analyzing, and querying geographical information. The starting point for the development of such mechanisms is the comprehension and modeling of how the human mind perceives and processes spatial data. This paper first analyzes issues related to cognitive aspects of GIS and then discusses recent improvements to the interaction mechanisms available in GIS, which have been made possible by advanced visual techniques such as 3D, virtual reality, and animation.

1. Introduction

Maps have traditionally been used to explore the Earth and to exploit its resources. GIS technology has enhanced the efficiency and analytic power of traditional mapping, also enriching the set of available visualizations with drawings, images and animations. In particular, multimedia GIS is the combination of the use of GIS to georeference, structure and analyze geographical data and the creation of multimedia presentations with links to spatial features. The combination of audio and video along with traditional data types can improve the effectiveness of GIS for its users. Multimedia GIS can create multiple representations of the same phenomenon. This characteristic enables users to analyze information in different contexts and helps them understand complex geographic phenomena. As an example, the changes in vegetation vigor through a growing season can be animated to determine when drought was most extensive in a particular region. Working with two variables over time allows researchers to detect regional differences in the lag between a decline in rainfall and its effect on vegetation.

GIS and related technology help greatly in the management and analysis of large volumes of geographically referenced data, allowing for better understanding of terrestrial processes and better management of human activities to maintain environmental quality and world economic vitality. Since GIS have to be used by humans, it is vital that the software and the data be comprehensible to human users. This means that the basic metaphors and conventions used by people in communicating about space and time have to be analyzed and suitably exploited by modern GIS. Since maps are still the primary means of storing, communicating, and analyzing spatially related information, it is fundamental to understand how people perceive and process maps and how they build "cognitive maps". A cognitive map is the mental representation of the world that allows a person to recognize their location, retrace a route they have taken before, derive a new shortcut, and estimate distance and direction between points. This knowledge is usually gained incrementally over time through direct experience navigating in the environment, but, as in the case of the interaction with a GIS, may also be gained indirectly through appropriate simulated experience (multimedia and 2D/3D interactive representations of space).

In addition to their role as communication and analytical tools, maps are also the primary tools people use for navigation in spatial environments (e.g., the network of roads in a city). Navigating with a map requires several distinct cognitive and perceptual operations. A well-designed navigational map can greatly reduce the cognitive effort that navigation requires of the user. Advanced technology can help in implementing such a design. For instance, hypermedia, now widely diffused, is a dynamic, associative network of nodes and links for organizing, relating, accessing, and sharing information. Given these characteristics, navigating in hypermedia shares many similarities with map navigation. As a consequence, GIS may profit from multimedia and hypermedia extensions to electronic maps without losing their key features. More realistic navigation can be obtained by exploiting sophisticated visualizations of 3D data and letting users interact deeply with the modelled 3D scenario through so-called virtual reality. Virtual reality is a tool for providing the users of a computer-based system, dealing with real or realistic data, with a highly vivid interaction with the data. Such an interaction can be obtained by means of software tools, for instance a VRML browser, or through ad hoc devices, such as head-mounted displays or stereoscopic surround projectors.

High-level interaction answers the need to widen the set of potential users of a computer-based system to include people without specific technical training. In the case of GIS this issue is crucial when dealing with 3D data. Both the spatial analyst, an expert user aware of the capabilities of the GIS tool, and the planner, who is frequently unaware of them but interested in performing "what-if" simulations, need to perceive fully the visual impact of different decisions. This is the process known as "visual analysis". The more vivid and direct the interaction is, the more effectively visual analysis can be carried out. Thus, in the last 15 years, much effort has been devoted to the integration of 3D GIS and virtual reality. What is somewhat surprising is the independent evolution of virtual reality, developed in pure computer graphics and computer vision environments, which has led to a set of tools immediately available for the realistic viewing of 3D spatial data. This has shifted the interest of applied researchers towards the study of the most suitable techniques for feeding a GIS with real 3D data. Although substantial improvement still appears possible, the latest technology already offers appealing solutions that are spreading across the wide landscape of GIS users.

Finally, multimedia GIS are typically equipped with tools supporting the user in the following basic interaction activities: focusing (highlighting subsets of data), brushing (highlighting specific display objects by pointing at them on the display), linking (simultaneous highlighting in multiple views) and querying (spatial queries). Focusing applied to single or multiple map views allows users to adjust a data threshold dynamically. With multiple simultaneous data views, focusing or brushing is usually combined with linking. Spatial queries include conditions on the mutual placements of objects on a map.

The rest of the paper is organized as follows. Section 2 introduces basic notions on how spatial information is processed by humans and on the effects this processing has on the way people perceive and use maps. Section 3 discusses the similarities between maps and hypertexts/hypermedia and how hypermaps extend GIS functionalities. Section 4 analyzes the role that 3D and virtual reality could play in improving the quality of the user interaction with GIS. Analogously, Section 5 presents visual mechanisms to support spatial querying and Section 6 discusses the role of animation in interactive GIS. Section 7 draws some conclusions.

2. Cognitive Science Aspects of GIS

Maps are the primary means of storing, communicating, and analyzing spatially related information. At a basic level, maps are a visual communication medium, using graphical symbols and the spatial relations between them to encode information. A fundamental aspect of this approach is that symbols have two components: the content, which is the conceptual information that is represented; and the expression, which is the graphical representation that is perceived. The field of cognitive science studies how the human mind works, drawing from research in anthropology, psychology, linguistics, neuroscience, and artificial intelligence. The following two sections give a broad overview of two complementary areas of research: results from psychophysical research that characterize aspects of perception; and semiotic and linguistic research that attempts to explain the multiple levels of the communication process. The subsequent sections describe the cognitive aspects of maps for navigation, focusing primarily on the development of cognitive maps, which are the mental representations of geographic space people use to navigate through their environment.

2.1 Perceptual Aspects of Maps

The human perceptual system processes visual information in two ways: controlled and automatic. Controlled processing is a studied decoding of a sign from its external expression to an internal concept in the mind of the reader. For example, reading is a type of controlled processing, requiring recognition of a pattern of symbols as a word that is decoded into a concept. Automatic processing, on the other hand, allows direct perception of a concept from a sign without decoding. For example, finding objects of the same color is automatic. Controlled processing is slow, serial, and requires conscious thought. Automatic processing is fast, parallel and unconscious. The components of a sign that require controlled processing are called internal, and the components that can be automatically processed are called external. Graphical representations of information, such as maps, diagrams and charts, are powerful because they support both the “reading” of labeled symbols and the “seeing” of graphical and spatial relationships.

Data domains and visual features both have an intrinsic dimensionality or scale. Nominal information allows distinction between elements, such as the set {“orange”, “pear”, “apple”} and the colors {“red”, “green”, “blue”}. Ordinal information has an intrinsic order, such as the set {“cold”, “warm”, “hot”} and a grayscale from white to black. Interval information allows addition and subtraction, such as the set {“Monday”, “Tuesday”, “Wednesday”} and the orientation of hands on a clock. Ratio information has an origin, and allows multiplication and division, such as the set {“100m”, “700m”, “10m”} and position along an axis.

An encoding scheme that matches the dimensionality of the data with the dimensionality of the visual feature will be more effective at communicating information (i.e., will allow automatic perceptual processing). When a visual feature represents data with greater dimensionality than its own, it requires controlled processing to decode. On the other hand, a visual feature that represents data with lower dimensionality than its own may lead to misperception. For example, encoding the set {“Japan”, “Peru”, “Thailand”} with symbols of different areas suggests an order {“Japan”, “Thailand”, “Peru”}, which may be misleading.

Figure 1. Population and area (ratio information) of three countries encoded with texture (nominal), grayscale (ordinal), and position (ratio). When the dimensionality of information does not match the dimensionality of the visual encoding, the missing dimensions must be stored internally.

As an example, Figure 1 shows the population and area (ratio information) of three countries using three different visual encodings: texture (nominal), grayscale (ordinal), and position along a common scale (ratio). The first encoding allows the reader to perceive patterns (e.g., the population/area ratios of Japan and Peru are inversely related), but all higher dimensional information must be calculated internally. The second encoding allows the reader to perceive ordinal information (e.g., the population of Peru is less than that of Thailand), and the third encoding allows the reader to perceive ratio information (e.g., Peru is four times larger than Japan but with one fourth the population).
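The matching rule described above can be sketched in a few lines of Python. This is only an illustration of the reasoning in the text: the function name `encoding_fit` and the three labels it returns are assumptions introduced here, not terminology from the literature.

```python
# Measurement scales named in the text, ordered by increasing dimensionality.
SCALES = ["nominal", "ordinal", "interval", "ratio"]


def encoding_fit(data_scale: str, visual_scale: str) -> str:
    """Compare the scale of the data with the scale of the visual feature.

    The labels are hypothetical shorthand for the three cases in the text:
    - "matched": automatic perceptual processing is possible
    - "under-encoded": the data has more dimensions than the visual feature,
      so the missing ones need controlled (internal) processing
    - "over-encoded": the visual feature suggests structure (e.g., an order)
      that the data does not have, which may lead to misperception
    """
    data_rank = SCALES.index(data_scale)
    visual_rank = SCALES.index(visual_scale)
    if data_rank == visual_rank:
        return "matched"
    if data_rank > visual_rank:
        return "under-encoded"
    return "over-encoded"


# Population (ratio data) shown with texture (a nominal visual feature).
print(encoding_fit("ratio", "nominal"))
# Country names (nominal data) shown with area (a ratio visual feature).
print(encoding_fit("nominal", "ratio"))
```

The first call corresponds to the texture encoding of Figure 1, and the second to the country-name example, where symbol area suggests an order that the names do not have.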

Figure 2. Stevens’ Law. The graph shows the actual ratio vs. the perceived ratio between measurements for volume, length and area.

There are several fundamental laws of human perception that help explain the effective dimensionality of a visual feature. For example, Stevens’ Law, also called Stevens’ Power Law, characterizes the difference between the perceived magnitude and the physical magnitude of a visual feature (as well as other sensory stimuli). Stevens’ Law predicts that the perceived scale is the actual physical scale raised to a power, β, where the value of β varies for different visual attributes and from person to person. Figure 2 shows graphically the actual vs. perceived ratio for area, length and volume, assuming average values for β of 0.7, 1.0, and 1.3 respectively. Stevens’ Law suggests that physical relationships that are represented accurately can be grossly misperceived. For example, the relative lengths of road segments will be correctly perceived, but a lake represented on a map with an area graphically 10 times larger than another will be perceived as only 5 times larger. On the other hand, a volume 10 times larger will be perceived as 20 times larger. This suggests that when the purpose of the graphic is to compare quantitative information (i.e., interval and ratio information), length ratios are more likely to be correctly perceived than area or volume ratios.
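The arithmetic behind these figures can be checked with a short Python sketch. It assumes the average exponents quoted above (0.7 for area, 1.0 for length, 1.3 for volume); the names `BETA` and `perceived_ratio` are introduced here for illustration only.

```python
# Average exponents quoted in the text; real values vary between people.
BETA = {"area": 0.7, "length": 1.0, "volume": 1.3}


def perceived_ratio(actual_ratio: float, attribute: str) -> float:
    """Stevens' power law: perceived ratio = actual ratio raised to beta."""
    return actual_ratio ** BETA[attribute]


for attribute in ("length", "area", "volume"):
    ratio = perceived_ratio(10, attribute)
    print(f"{attribute}: 10x actual is perceived as about {ratio:.1f}x")
```

With these exponents, a tenfold difference is perceived as roughly tenfold for length, fivefold for area, and twentyfold for volume, which matches the lake and volume examples in the text.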

Researchers have identified two fundamental perceptual and cognitive tasks required to decode and interpret graphical information: information retrieval and information comparison. Information retrieval, in this sense, is the process of identifying the value associated with an object along a dimension of interest. This requires search along a base dimension (the dimension used to encode the information), and identification along a target dimension (e.g., the value associated with a specific area or texture). For example, in Figure 1, finding the population of Peru (an information retrieval task) would require finding the symbol representing the population of Peru in the legend, and then identifying the value associated with the symbol.

Information comparison is the process of decoding and comparing values within or between dimensions. Similar to the information retrieval task, comparison tasks have a base dimension and a set of target dimensions. For example, (using Figure 1) a within-dimension task would be used to compare the populations of Peru and Thailand, and a between-dimension task for comparing the population and area of Peru.

The perceptual encoding and the match between representing and represented dimensionality affect to what extent these tasks may be completed externally (using automatic processing) versus internally (using controlled processing). The composition of symbols into higher levels of meaning, such as the alignment used to distinguish the rows and columns of the tables in Figure 1, or the lines and scales of the graph in Figure 2, is the subject of the following section.

2.2 Semiotic Aspects of Maps

Semiotics is the study of how symbols are used to communicate information. Linguistics is an area of semiotics that studies languages. Researchers have drawn from these two fields to attempt to understand and formalize the way maps (and graphics in general) are able to effectively communicate information.

Jacques Bertin’s Sémiologie Graphique in 1967 was the first extensive study on the relationship between data and their graphical representation. He identified three basic types of graphical marks (i.e., points, lines, and zones), and seven “visual variables” (i.e., plan (position), size, value (grayscale), texture, hue (color), orientation, and shape). Bertin classified each of these variables according to the scale of information (i.e., nominal, ordinal, and quantitative) that they are most appropriate for encoding. He then explained in detail how each of these variables can be manipulated so that the range of the data is effectively perceived. Table 1 shows a summary of Bertin’s classification of visual variables. All the variables can be used for expressing nominal information; however, grayscale and size will perceptually express ordinal and quantitative information that may not be intended. The three extent variables (grayscale, size and position) are most effective for ordinal and quantitative information. Texture is marginally effective for ordinal information.

Table 1. The Perceptual Effectiveness of Visual Variables for Communicating Different Scales of Data. Visual variables are grouped by whether they are primarily for differentiating information or indicating extent along a scale. A filled box is more effective at representing a particular type of data.
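The prose summary of Bertin's classification can be written down as a small lookup, sketched here in Python. It encodes only what the paragraph above states, not the full contents of Table 1; the function name `rating` and the labels "usable", "effective", "marginal" and "unrated" are assumptions made for this illustration.

```python
# Bertin's seven visual variables as listed in the text.
VARIABLES = ["position", "size", "grayscale", "texture",
             "hue", "orientation", "shape"]

# The three "extent" variables named in the text.
EXTENT = {"position", "size", "grayscale"}


def rating(variable: str, scale: str) -> str:
    """Return the effectiveness stated in the prose summary.

    "unrated" means the paragraph says nothing about that combination;
    it is not a claim that the variable is ineffective.
    """
    if variable not in VARIABLES:
        raise ValueError(f"unknown visual variable: {variable}")
    if scale == "nominal":
        return "usable"          # all variables can express nominal data
    if scale in ("ordinal", "quantitative") and variable in EXTENT:
        return "effective"       # extent variables suit ordered data
    if scale == "ordinal" and variable == "texture":
        return "marginal"        # texture is marginally effective
    return "unrated"


for scale in ("nominal", "ordinal", "quantitative"):
    good = [v for v in VARIABLES if rating(v, scale) in ("usable", "effective")]
    print(scale, "->", ", ".join(good))
```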

Bertin’s work is still an essential reference for understanding the process of using graphics to communicate and analyze information. More recent research from semiotics and linguistics has attempted to build a broader theoretical framework of map-based communication of geographical information.

First, there is general agreement that the basic unit of communication is the sign, which has a conceptual content and a graphic expression. There is also agreement that signs can be divided (disarticulated) into graphical features that do not have meaning (e.g., Bertin’s visual variables) and content features that do not have expression (e.g., attribute dimensionality). Signs can also be composed (articulated) into units of progressively higher meaning (e.g., clusters of buildings represent villages). However, beyond this there are pronounced differences between semiotic and linguistic approaches. Linguistic approaches assume that, for the composition of signs to be considered a language, the minimum unit and each unit increment must have both content and meaning. Semiotic approaches do not have this requirement.

Second, there is the perceptual and cognitive significance of the communication of spatial information in map representations. Unlike linguistic communication, which requires translation, spatial information can be perceptually communicated through the spatial correspondence between the signs on the map and the represented features in the world. At the sign level, spatial features of geographical objects (e.g., location, size, shape, and orientation) may be directly reflected in their representation. For example, the size and shape of the symbol for a city may represent the spatial area and boundaries of the city. At the map level, the correspondence between geographical space and map space may be transformed in three ways: projection, reduction and orientation.

Third, research has focused on the role of the map-reader in understanding map symbols, and assembling symbols into higher-order concepts. Researchers have identified and developed a large number of different types of codes for representing different aspects of the relationship between content and expression (e.g., locational, iconic, temporal, historical, thematic, quantitative). A basic example is the iconic encoding of locational information. A sign that graphically reflects a component of its content is called “motivated”, a sign that relies on convention for understanding is called “conventional”, and a sign without any visual similarity to what it represents is “arbitrary”. For example, the location of a cathedral may be represented by its outline (motivated), by a symbol for a church (conventional), or simply by a triangle (arbitrary).

Semiotics of maps is still an active research domain investigating the variation and composition of visual symbols to represent layers of meaning. Effectively communicating and analyzing spatially related information using maps requires an understanding of these different issues.

2.3 Navigation using Maps

In addition to their role as communication and analytical tools, maps are also the primary tools people use for navigation in spatial environments (e.g., the network of roads in a city). Navigating with a map requires several distinct cognitive and perceptual operations:

• Landmark recognition. A person must identify one or more locations on the map corresponding to identifiable locations in the real world.

• Orientation and Triangulation. Triangulation is the process of relating the geometry of the map with the geometry of the real world. This requires knowing the directions to two landmarks, or the direction and distance to a single landmark.

• Mental rotation. The map and the individual must be oriented in the same direction by physically rotating the map, by turning the body, or by mentally rotating the map or the body. Mental rotation requires cognitive effort, and time proportional to the difference between the two orientations.

• Route selection. The individual must select a route from the current location to the destination on the map, and identify the immediate direction to move.

• Translation. Moving through the world requires a corresponding translation on the map.

A person navigating with a map will continuously cycle through these operations until they reach their destination. In unfamiliar environments, these maps are often graphical, but may also be oral or written. As a person becomes more familiar with an environment, they rely less on external representations of space and more on mental representations of space, called cognitive maps.
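The geometry behind the triangulation and mental rotation operations listed above can be sketched in Python. This is a minimal illustration under stated assumptions, not a model of human cognition: the map is treated as a flat plane with x pointing east and y pointing north, bearings are compass angles in degrees clockwise from north, and all function names are hypothetical.

```python
import math


def _unit(bearing_deg):
    """Unit vector (east, north) for a compass bearing in degrees."""
    r = math.radians(bearing_deg)
    return math.sin(r), math.cos(r)


def locate_from_one(landmark, bearing_deg, distance):
    """Position from the direction and distance to a single landmark."""
    dx, dy = _unit(bearing_deg)
    return landmark[0] - distance * dx, landmark[1] - distance * dy


def locate_from_two(a, bearing_a, b, bearing_b):
    """Position from the directions to two landmarks a and b."""
    ax, ay = _unit(bearing_a)
    bx, by = _unit(bearing_b)
    det = bx * ay - ax * by
    if abs(det) < 1e-12:
        raise ValueError("bearings are parallel; position is not determined")
    ex, ey = a[0] - b[0], a[1] - b[1]
    # Distance from the observer to landmark a along bearing_a.
    t = (bx * ey - by * ex) / det
    return a[0] - t * ax, a[1] - t * ay


def rotation_needed(map_up_deg, facing_deg):
    """Smallest angle (0 to 180 degrees) between map 'up' and the heading."""
    return abs((facing_deg - map_up_deg + 180) % 360 - 180)


# A tower due north and a bridge to the north-east fix the observer's position.
print(locate_from_two((2, 8), 0, (6, 7), 45))
# A north-up map read while facing west needs a quarter turn.
print(rotation_needed(0, 270))
```

Because the text states that mental rotation time is proportional to the difference between the two orientations, the value returned by `rotation_needed` can be read as a rough proxy for that effort.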

2.4 Cognitive Maps: Internal Representations of Space

A cognitive map is the mental representation of the world that allows a person to recognize their location, retrace a route they have taken before, derive a new shortcut, and estimate distance and direction between points. This knowledge is usually gained incrementally over time through direct experience navigating in the environment, but may also be gained indirectly through studying maps (graphical representations of space), through written or spoken directions from others (linguistic representations of space), or through simulated experience (multimedia and 2D/3D interactive representations of space).

Spatial knowledge is based on a number of closely interacting perceptual and cognitive systems. Neurophysiological studies of human spatial behavior have found that there are at least three separate neural systems that process visual sensory information to create a cognitive map. The first system analyzes visual properties of objects, leading to object recognition. The second system analyzes the spatial location of objects relative to each other and to the person. A third system ties these two types of information together into spatial memory, and allows a person to locate and orient themselves with respect to their environment and their destination. To navigate through the environment, people may use either the recognition or location system alone, or both of these systems together. The integration of these systems allows people to perceive and conceptualize space.

The actual structure of a cognitive map depends on personal experience and cognitive spatial ability. Johnson-Laird, in a seminal 1983 book on mental models, proposed three types of mental representations: propositional representations, which are symbolic and allow logical manipulation; mental models, which are structural and allow visual reasoning; and images, which are perceptual and “show” a model from a particular point of view. A cognitive map has properties of all three of these types of representations. The following sections discuss the development of cognitive maps, and their properties in greater detail.

2.4.1 Cognitive Spatial Development in Children

According to the well-known study by Piaget and Inhelder in 1967, children progress through three stages in the development of their cognitive spatial ability (see Table 2). Perceptual spatial ability, which refers to the ability to perceive the spatial relationships between objects, develops earlier than conceptual spatial ability, the ability to build and manipulate a mental model of the environment to solve spatial problems. The development of conceptual spatial ability allows the integration of spatial relationships into a cognitive map. Table 2 shows the stages in the development of spatial knowledge in children, and relates them to the stages of spatial knowledge in adults (described in the following section).

Children up to the age of 6 are in the preoperational stage of cognitive development. The internal model of children in this stage is egocentric, meaning that they locate objects in their environment with respect to themselves. They understand limited topological spatial relationships such as proximity, separation, and open/closed.

The second stage of cognitive development is the concrete operational stage, which occurs when children are between 7 and 9 years old. In this stage children develop a cognitive map with a fixed frame of reference, which allows them to imagine a view and orientation outside their body. Using this external frame of reference, they develop an understanding of more complex topological relations, such as order and enclosure, and they begin to develop projective relations, such as before/behind, and left/right.

Stage of cognitive development in children | Frame of reference | Spatial relations: topological | Spatial relations: projective | Spatial relations: Euclidean | Type of spatial knowledge in adults
Preoperational (2-7 years) | Egocentric | Proximity, Separation, Open/Closed | - | - | Landmark
Concrete Operational (7-11 years) | Fixed | Between, Order, Enclosure | Before/Behind, Left/Right, Triangulation | - | Route
Formal Operational (≥ 11 years) | Coordinate | Continuity | - | Proportional Scale Reduction, Distance Estimation, Metric Coordinates | Survey

Table 2. The stages in children’s development of spatial knowledge. Based on Piaget’s theory of cognitive development.

The final stage of cognitive development in children is the formal operational stage, which begins around the age of 11. In this stage, children develop a coordinate frame of reference, where individual routes blend into a network of locations in fixed positions relative to each other. They develop an understanding of Euclidean spatial relations, such as estimating straight-line relative distances, and proportional reduction of scale.

2.4.2 Cognitive Spatial Development in Adults

As shown on the right side of Table 2, adults progress through three similar levels in their development of spatial knowledge. Of course, whereas children are limited in progressing from one level to the next by their cognitive maturation, adults are limited by other factors, such as their spatial abilities and their experience in the environment.

Landmark knowledge, the lowest level, is a type of declarative knowledge: it consists of an unordered collection of the objects of interest in the environment and their attributes. Landmarks are unique, recognizable objects or places in the real world that are used as spatial reference points. Navigation using landmarks is egocentric, relying primarily on visual recognition, and the constraints (barriers) and affordances (natural flow) of the environment.

The second level of spatial knowledge is route knowledge, which is a type of procedural knowledge. A route is a sequence of landmarks, and the directional knowledge or instructions required to move between them. A person at this level is able to navigate along known routes, and create new routes by combining parts of known routes.

The highest level of spatial knowledge is survey knowledge, which is a kind of configurational knowledge. Survey knowledge is acquired by overlapping routes into a topological network, and is characterized by the “emergent” ability of triangulation. Triangulation allows a person to orient themselves in their cognitive map using direction and distance to a known point, or direction to two known points. Using triangulation, a person can orient themselves to distant (unseen) landmarks and derive shortcuts. Survey knowledge may also include a sense of metric relations between objects (distances), but there is some debate in the research community as to whether this can be developed without the aid of physical maps.

The rate at which an individual progresses through these levels varies widely based on a number of factors, such as individual cognitive ability, and the complexity and imageability of the environment. Adults are often able to learn both landmark and route knowledge for complex routes in a single exposure. However, survey knowledge develops more slowly, and even experts (for example, experienced taxi-drivers) rarely develop a model of the environment with correct distances and directions between locations.

2.4.3 Accuracy of Spatial Knowledge from Maps

It is tempting to think of a cognitive map as an internal version of a graphical map. However, even the cognitive maps of experts are often only topologically accurate. Psychometric experiments, which compare a person’s conception of space to actual space, have found the following examples of distortions in cognitive maps:

1. Distance errors. Perceived distance is biased by several factors: the number of directional changes, travel time, and the physical and cognitive effort required to follow a route. For example, the distance uphill from point A to point B will often be perceived as spatially farther than the downhill distance from point B to point A. Routes with multiple changes in direction are often perceived as longer than similar routes with fewer directional changes.

2. Rotation and alignment errors. When people store spatial relationships in memory, they tend to position objects in alignment with each other, and to rotate objects into alignment with larger frames of reference. For example, adjacent streets are often remembered as parallel to each other, and aligned with major axes, such as the cardinal directions.

3. Directional errors. There are at least three different factors that affect angular distortions in memory. First, human memory for angles is often biased towards right angles. Second, the estimated direction to a location will often be biased towards the direction of the route taken to get there. Third, objects are stored in a hierarchy of superordinate geographical areas, such as neighborhood, district, and country. Directions between locations at one level in the hierarchy are biased by the direction between the centers of their containing areas.

4. Detail. Cognitive maps are abstract and incomplete. People are able to recognize many more landmarks and routes in their environment than they are able to recall.

Experiments have also found a number of differences between spatial knowledge learned from direct experience and from graphical maps:

1. Distance errors. People who learn from direct experience are more accurate in estimating route distance than straight-line distance, whereas people who learn from maps are equally accurate for both estimates.

2. Rotation and alignment errors. The rotation and alignment errors mentioned earlier for cognitive maps also apply to spatial relationships learned from maps. These are errors created during storage in memory, not errors of recall. Additionally, spatial knowledge learned from maps is oriented to the image of the map, whereas spatial knowledge learned from navigating in the environment is oriented to the view from the ground.

3. Directional errors. People who learn from maps are often poorer at orienting themselves (without the map) to distant locations in the environment than people with direct experience.

In spite of these errors, most people are quite effective and resourceful at navigating through their environment using only their cognitive map and the information in the environment. For tasks requiring metric and angular precision, however, people must rely on external representations of space.

2.4.4 The Influence of the Environment

The environment influences the development of cognitive maps by the distinctiveness and visibility of landmarks, the complexity of routes, and the difficulty of relating and integrating the two into survey knowledge. Kevin Lynch published a very influential study in 1960 on how features of urban environments influence the development of spatial knowledge. This study focused on the components of the collective cognitive map of a community to determine the features of the environment that were “imageable”. He found five general spatial elements:

1. Paths. These are the typical or potential channels along which an observer moves. Paths are memorable for their functionality (ease and reliability of use, access to important locations), as well as their visual and physical characteristics (the views they give access to, their shape and continuity).

2. Edges. Edges are usually physical barriers or lines of demarcation that separate areas, such as bodies of water, walls, or rows of buildings.

3. Districts. Districts are thematic divisions of space into areas with common characteristics. They may be derived for a variety of reasons, such as architecture, inhabitants, texture, activity, and topography.

4. Nodes. Nodes are the intersections of paths, and are important because people must make decisions about changes in direction. They also often mark the transition between districts, or areas of great activity.

5. Landmarks. There are two general types of landmarks:
• Distant landmarks are visually prominent from a distance and provide primarily directional information (e.g., a tall building, the ocean, the north star);
• Local landmarks are recognizable by their visual contrast or personal/social importance and provide locational information (e.g., a fountain, the gas station on the corner, the house where a famous person lived).

Lynch found that environments that were highly irregular often led people to develop detailed cognitive maps with topological errors. For example, roads were often assumed to be parallel or perpendicular, which led to confusion when the same roads intersected more than once. In spite of these errors, people were able to navigate in these environments using the imageable features mentioned above. On the other hand, people in uniform environments, without imageable features, developed cognitive maps with large empty regions. People in these environments depended on the regular grid pattern of roads as paths and edges for navigation, and were easily disoriented.

Lynch’s research focused on the features in the environment that are remembered (in a cognitive map) and used for navigation, and especially how to design environments that weave these imageable features into a cohesive whole. This study has strongly influenced urban planners, architects, and more recently designers of virtual environments, interested in creating imageable, navigable spaces.

2.5 Design Issues for Navigational Maps

The design of navigational maps can greatly simplify the perceptual and cognitive tasks required for navigation that were discussed earlier. For example, static “you-are-here” maps positioned in the environment can support the following tasks:
• Landmark recognition. Face the map towards identifiable landmarks depicted on the map.
• Orientation and triangulation. Place a representation of both the map and the person in the map.
• Mental rotation. Align the map with the viewing direction in the environment.
• Route selection. Indicate routes to common destinations.
• Translation. Place multiple maps in the environment, each tailored to its location.

The design of electronic maps, which may be dynamic, requires at least two additional considerations. First, there are two alternative frames of reference for electronic displays: egocentric and exocentric. An egocentric frame of reference maintains the alignment of the map with the current direction of travel, whereas an exocentric frame of reference is fixed, usually aligned with the cardinal directions. An egocentric frame of reference removes the need to mentally rotate the map to align with the world, which makes navigation easier. On the other hand, an exocentric frame of reference aids in the development of survey or configurational knowledge by maintaining orientation with distant landmarks (e.g., magnetic north).
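The difference between the two frames of reference can be illustrated with a small coordinate transformation. The sketch below is only an illustration under stated assumptions: map points are (east, north) pairs in an exocentric, north-up frame, the heading is measured in degrees clockwise from north, and the function name to_egocentric is hypothetical.

```python
import math

def to_egocentric(point, viewer, heading_deg):
    """Convert a north-up (exocentric) map point to a heading-up (egocentric) frame.

    point, viewer: (east, north) map coordinates.
    heading_deg: direction of travel in degrees, clockwise from north.
    Returns (right, ahead): offsets relative to the viewer's direction of travel.
    """
    dx = point[0] - viewer[0]
    dy = point[1] - viewer[1]
    h = math.radians(heading_deg)
    # Project the offset onto the viewer's "right" and "ahead" axes.
    right = dx * math.cos(h) - dy * math.sin(h)
    ahead = dx * math.sin(h) + dy * math.cos(h)
    return (right, ahead)
```

With a heading of 90 degrees (travelling east), a landmark 10 units east of the viewer appears directly ahead, while a landmark to the north appears on the left. An exocentric display would simply draw the untransformed (east, north) coordinates.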

A second important design consideration is the use of visual momentum. The term originates from cinema, where the transitions between scenes are edited to maintain visual coherence. Applied to transitions between maps or related computer displays, visual momentum helps the user orient and identify corresponding objects between views. There are several guidelines for maintaining visual momentum. First, use consistent or overlapping representations. For example, books of maps often use consistent alignment between adjacent maps as well as some amount of overlap. Second, use intermediate views or animation to reduce disorientation during changes in direction, position, or scale. For example, books of maps often have an overview map that shows the entire extent of the area covered. Third, highlight invariant features between views. For example, an outline on an overview map can show the extent and orientation of a detailed view. Visual momentum, as well as a consistent frame of reference, can ease the cognitive tasks associated with navigation using electronic maps.

In summary, people develop spatial knowledge for navigation in two ways: through direct experience or by using maps. People who use direct experience progress through three stages of spatial knowledge: landmark, route, and survey. People who learn from maps develop the directional and distance accuracy of survey-type knowledge, but have more difficulty with the landmark and route knowledge. Using maps for navigation requires several cognitive operations, such as triangulation and mental rotation, which can be simplified by careful design of physical and electronic maps.

3. Multimedia and Hypermedia

Hypermedia is a dynamic, associative network of nodes and links for organizing, relating, accessing, and sharing information. Given these characteristics, navigating in hypermedia shares many similarities with map navigation. As a consequence, GIS may profit from multimedia and hypermedia extensions to electronic maps without losing their key features. Today, the most popular example of hypermedia is the World Wide Web (WWW), which is quickly becoming one of the primary media for information access and distribution. However, the WWW is only a recent (and limited) example from a research domain with a relatively long history. Vannevar Bush introduced the concept of hypermedia in a 1945 article, where he imagined a machine called a memex that managed a vast network of interrelated public and private documents. Ted Nelson introduced the term “hypertext” in 1965 to describe his Xanadu project, and the first working systems (Andries van Dam’s Hypertext Editing System, and Douglas Engelbart’s NLS) were introduced shortly thereafter. Since then, hypermedia researchers have explored and identified a wide range of important features and potential problems in the application of this technology to different domains.

The defining feature of hypermedia is the use of computer-navigated, relational links rather than linear structure to bind a collection of information together. Text documents are traditionally linear (i.e., they are read from beginning to end), although they often use references to direct the reader to material that supports the main text. For example, references in books and paper documents direct the reader to remote areas within the same text (e.g., references to footnotes and figures), or even to other documents (e.g., bibliographic references).

Figure 3. Hypermedia documents and the underlying structure connecting them. The documents on the left show the embedded links represented by their labels, and the diagram on the right shows the network structure with documents represented as nodes and links as arrows.

In contrast, hypermedia is primarily non-linear, which allows readers to choose paths through the information depending on their interests. For example, the simple hypermedia in Figure 3 shows four documents (A, B, C, D) connected by six links (u, v, w, x, y, z). A reader could browse or navigate through the documents in the sequence A→B→C→D by selecting the links v, w, and x (shown in bold). When a link is selected, the new document automatically replaces the current document, giving the impression of movement through the structure.

The nodes in early hypertext systems were text-only. The term hypermedia was coined to emphasize systems where the nodes may be text or any type of multimedia, such as images and sound. In practice, the terms hypertext and hypermedia are used interchangeably because the high-level organizational structure is the same. This section will present many of the basic components, features, and potential problems of hypermedia, independent of any particular implementation. Special attention is given to how hypermedia technology can be applied in GIS to explicitly link spatially related information within and between maps into a coherent, navigable structure.

3.1 Components of Hypermedia

At a conceptual level, hypermedia is built on a network of nodes interrelated by links. A node is a chunk of information, such as a paragraph, document or image. A link is the explicit connection within and between nodes that gives the information its structure. An author uses links to express different types of relationships between information, such as the definition of a term or the next document in a sequence. A reader follows links to access information in the order and at the level of detail desired.

At the implementation level, there is a wide range of definitions of nodes and links. The Dexter Hypertext Reference Model, created in the late 1980s by a small group of the leading hypertext researchers, is an attempt to abstract from these different implementations to define standard terminology and a formal model of the abstract structure of hypertext systems. Although existing systems do not implement all of the features and constraints of the Dexter model, it is useful for giving an overview of hypermedia concepts.

There are three layers in the Dexter model: a storage layer, a run-time layer, and a within-component layer. The storage layer represents the underlying “database” that contains the node/link structure of the hypertext system. Nodes and links are both generally referred to as “components” in the storage layer. The within-component layer models the content and internal structure of different types of components. The run-time layer models the dynamic aspects of hypertext, such as interaction and presentation. Between the storage and within-component layers, there is an interface for anchoring, which allows links to refer to specific information contained within a component. Similarly, between the storage and run-time layers, there is an interface for presentation specifications, which allows the system to present content differently according to the specific context.

3.1.1 Common Features of Nodes and Links

In the Dexter model, nodes and links are both components, so they share a set of basic properties. For example, all components have a set of attribute/value pairs and presentation-specific information. The attribute/value pairs support node and link typing. A node can be classified by its contents, and a link by the relationship between its end-points. The actual semantic types used are dependent on the application domain. For example, in a system to help authors create a hypertext version of a book, semantic node/link categories may exist for “definition”/“defines”, “further details”/“explains”, and “author”/“authored”. These semantic types, when they are visible in the interface, help the author in structuring information, and the reader in understanding the relationships between information. The hypermedia browser may also use semantic types (and the component’s presentation-specific information) to dynamically choose the appropriate node to display based on the context (e.g., a detailed explanation for a novice versus a label for an expert, or a summary for a guest versus confidential information for an administrator).

3.1.2 Nodes

Nodes are containers for the information in hypermedia. As shown in Figure 4, the Dexter model describes two basic types of nodes: atomic and composite. Atomic nodes are containers for different types of information, and are stored as unique objects in the hypermedia database. Composite nodes are like atomic nodes but with the additional property that they may contain any number of atomic and composite nodes. A simple example of a composite node is a hypermedia document that includes a picture. The same picture may be used in many different documents, and any changes to the picture are immediately reflected in all documents that include it. The containment hierarchy of composite nodes must be a directed acyclic graph (DAG), which prevents a component from containing itself.

Figure 4. Atomic and composite nodes. The hypertext documents on the left are represented diagrammatically on the right. A, B and D are atomic nodes, which may contain any type of information. C is a composite node containing both information and the atomic node D.

A node may contain a wide range of data types, such as text, sounds, or digitized images. Multimedia data types in particular may require additional control structures to be properly viewed, such as synchronization between movies, soundtracks, and subtitles. In the Dexter model, these are distributed across both the within-component layer, which understands the internal structure of different media types, and the run-time layer, which would control synchronization and context-related issues, such as which set of subtitles to use.
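The containment rule for composite nodes can be sketched in a few lines. The following is a minimal illustration, not part of the Dexter model itself: the class names Node and CompositeNode and the helper _contains are hypothetical, and content is reduced to a plain value. Children are held by reference, so a shared picture changes everywhere at once, and an addition that would let a node contain itself is rejected.

```python
class Node:
    """An atomic node: a container for one piece of content."""
    def __init__(self, name, content=None):
        self.name = name
        self.content = content

class CompositeNode(Node):
    """A composite node: its own content plus references to child nodes."""
    def __init__(self, name, content=None):
        super().__init__(name, content)
        self.children = []

    def add(self, child):
        # Reject any addition that would make a node contain itself,
        # which keeps the containment hierarchy a directed acyclic graph.
        if child is self or _contains(child, self):
            raise ValueError(f"adding {child.name} to {self.name} would create a cycle")
        self.children.append(child)

def _contains(node, target):
    """Return True if target is reachable from node through containment."""
    for child in getattr(node, "children", []):
        if child is target or _contains(child, target):
            return True
    return False
```

For example, if a picture node D is added to two different composite documents, assigning a new value to D.content is visible through both, because each document stores a reference rather than a copy.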

3.1.3 Links

Links explicitly express the relationships within and between nodes. In the Dexter model, links have a direction and references to two or more other components. The end-points of links are anchors, and may refer to the entire component (the simplest case), or a specific point or region within a component. For example, in Figure 5, link x represents a relationship between nodes A and B, and link y represents a relationship within node C between anchor points 1 and 2. A browser following link x would replace document A with document B, and following link y causes the browser to move from position 1 to position 2 in document C.

Figure 5. Links between and within nodes. Nodes and links are shown as hypermedia documents on the left and structurally on the right. The left pair of nodes, A and B, are connected by link x. Link y connects points 1 and 2 within node C.

There are two basic types of links: referential and organizational. Referential links are used to express semantic relationships between nodes. Organizational links provide hierarchical and sequential structure for the hypermedia. The Dexter model supports link typing through the general attribute/value pairs associated with all components.
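The two links of Figure 5 can be written down as data to make the role of anchors concrete. This is a simplified sketch under stated assumptions: each link has exactly two end-points (the Dexter model allows more), an anchor position of None stands for the whole component, and the names Anchor, Link and follow, as well as the "type" attribute value, are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass(frozen=True)
class Anchor:
    node: str                        # identifier of the component
    position: Optional[int] = None   # None refers to the entire component

@dataclass(frozen=True)
class Link:
    name: str
    source: Anchor
    target: Anchor
    attributes: Tuple[Tuple[str, str], ...] = ()  # attribute/value pairs used for typing

def follow(link, current_node):
    """Return the (node, position) a browser would display after following link."""
    if link.source.node != current_node:
        raise ValueError("link does not start in the current node")
    return (link.target.node, link.target.position)

# Link x connects node A to node B; link y connects points 1 and 2 within node C.
x = Link("x", Anchor("A"), Anchor("B"), (("type", "referential"),))
y = Link("y", Anchor("C", 1), Anchor("C", 2))
```

Following x from A yields ("B", None), meaning the whole of document B replaces A, while following y from C yields ("C", 2), meaning the browser stays in C and moves to position 2.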

3.2 Problems with Hypermedia

Hypermedia networks are inherently abstract, often complex, and can quickly become immense. There are many similarities between navigation in large hypermedia structures and navigation in the environment, which makes many of the cognitive issues for navigation discussed earlier very applicable in this context. The two most significant problems users of hypermedia structures have are disorientation and cognitive overhead.

Disorientation occurs when a reader becomes lost in the hypermedia structure (i.e., unable to find the way to a known location), or unable to find a path to a goal. Most links in hypermedia can only be followed one way, making it easy to forget the path taken to a particular point. The labels associated with links often give very little information about where they lead, which makes it difficult to determine if a node is of interest without following it.

Cognitive overhead refers to the concentration and mental effort required in creating and navigating through a hyperspace. An author can quickly become overwhelmed with the many ways in which material may be linked, in addition to the difficulties associated with creating nodes and links with the hypermedia tools. A reader has all the difficulties associated with creating a cognitive map in an abstract environment (recognizing/finding landmarks, creating/following routes, integrating experience into surveys).

One approach to solving these problems is to improve the imageability of the local and global hypermedia environment. Descriptive labels for links make the relationships between nodes more apparent. Alternative navigational structures, such as hierarchies and sequential paths, can reduce the complexity of the hypermedia network. These alternative structures can be created externally to the existing hypermedia (like a map), or by embedding structural links into the existing nodes.

Two examples of alternative organizational structures implemented in some hypermedia systems are paths and guided tours. Paths, also called trails, are sequences of nodes created by the author as an alternative thematic navigational structure through the hypermedia. Paths are often tailored for particular subjects, such as following a “Dutch Painters” trail through a museum, or for particular readers, such as novices or experts. Paths are not enforced by the hypermedia, so the proper links to follow must be clearly marked, and the reader is always free to leave the path to follow interesting detours at any time. Guided tours, on the other hand, force the reader to remain on the path by disabling unrelated links.

Disorientation and cognitive overhead are significant problems for authors and readers of hypermedia. In addition to the structural changes to the hypermedia described above, there are many tools and features of hypermedia that attempt to reduce the effect of these problems.

3.3 Functionalities of Hypermedia

Hypermedia environments provide authors and readers with a rich set of features for exploration and organization. The following section discusses three different types of features: views, navigation, and annotations.

3.3.1 Views

Many hypermedia systems provide alternative views as an aid both for understanding local and global structure and for rapid navigation to remote areas of the hypermedia.

Local and global hypermedia maps are very useful for helping readers understand the content and structure of the hypermedia. Maps may be textual, like a table of contents, or graphical, like the structural map in Figure 6. Maps allow a person to browse the structure of the hypermedia and move directly to an area of interest. These maps may be static or dynamic, i.e., created by an author or generated by a tool. Static maps are tailored to a particular area of the hypermedia, and must be maintained as the hypermedia changes. Dynamic maps automatically reflect changes to the underlying structure and may offer interactive features, such as an indication of the current position in the hypermedia and variable levels of detail.

Since most hypermedia structures are too complex to be viewed at one time, some hypermedia tools support different levels of detail. The more detailed view can be expanded in place, or opened in a second linked window (e.g., replacing the contents of the current window, or opened in its own window). For example, an outline tool or a dynamic table of contents allows the user to expand (or collapse) the detail in the context of the overall structure.

3.3.2 Navigation

The defining characteristic of hypermedia is its support for “browsing” along relational links. An alternative navigational strategy used when traveling to distant or unknown destinations is searching by query. A third navigational strategy is browsing through the recent and global history of documents visited.

Browsing is the process of following the relational links that give the hypermedia its structure. When a link is selected, the destination node replaces the current node or opens in a separate window. Browsing allows readers to select the order and level of detail of the content they want to read, as well as to discover related information in a context of use.

Hypermedia structures are often too large to be traversed by browsing alone, which makes search a necessary navigational alternative. A user conducts a search by entering a query and specifying the area to search (e.g., the entire hypermedia, or limited to within a single node). Search is generally content-based, but advanced methods may support structure-based search (e.g., finding all nodes that are a certain distance from a given node).
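A structure-based search of the kind mentioned above can be sketched as a breadth-first traversal of the link network. The sketch assumes that distance means the number of links followed, and that the network is stored as a dictionary mapping each node to the nodes its outgoing links lead to; the function name nodes_within is hypothetical.

```python
from collections import deque

def nodes_within(links, start, max_hops):
    """Breadth-first search over directed links.

    links: dict mapping a node to a list of nodes reachable by one link.
    Returns a dict {node: hops} for every node at most max_hops links from start.
    """
    hops = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if hops[node] == max_hops:
            continue  # do not expand beyond the requested distance
        for nxt in links.get(node, []):
            if nxt not in hops:
                hops[nxt] = hops[node] + 1
                queue.append(nxt)
    return hops
```

Nodes at exactly a given distance can then be obtained by filtering the result, for example `[n for n, h in nodes_within(links, "A", 2).items() if h == 2]`.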

Backtracking through history is an essential navigational feature supported in most browsers. History allows the user to move backwards and forwards along the path they have traveled through the hypermedia. Since most links are unidirectional, a browser without this functionality would often strand a reader on a node without a mechanism to return to the node they just came from. Some hypermedia systems provide a graphical history, but most only support a list of the nodes visited recently. This list is pruned to reflect the most current path through the hypermedia network.

Figure 6. Pruning of local history. The top part of the figure shows a collection of hypermedia documents on the left and a diagram of their structure on the right. The bottom part of the figure shows a diagram of the path A→B→A→C on the left and the history after each link on the right.

Most readers navigating in a hypermedia start at a root node and systematically explore a number of different paths, returning to the root node between each excursion. History mechanisms support this type of browsing by pruning these subpaths from the local history. For example, Figure 6 shows a collection of hypermedia documents, their structure, and the history of navigation. The user followed the path A→B→A→C through the network, but the circular path A→B→A was pruned from the history as shown on the bottom right.
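The pruning behaviour in this example can be reproduced with a very small rule. The sketch below assumes one plausible rule, namely that returning to a node already in the history discards everything visited after it; actual browsers may prune differently, and the function name visit is hypothetical.

```python
def visit(history, node):
    """Return the pruned history (a new list) after visiting node."""
    if node in history:
        # Returning to an earlier node: drop the excursion made since then.
        return history[: history.index(node) + 1]
    return history + [node]

# Replay the path A -> B -> A -> C from the example.
history = []
for node in ["A", "B", "A", "C"]:
    history = visit(history, node)
```

After the third step the history shrinks back to A alone, so the final history contains only A and C, matching the pruned path described above.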

3.3.3 Annotation

Annotations are reader-created content that is related to existing hypermedia documents. These annotations may be comments, such as highlighting, margin notes, and “stickies”, or indexes, such as references, and bookmark lists. The distinction between annotation and normal hypermedia content is that annotations are often only accessible to their author.

There are many different implementations of annotations. Some systems modify the existing document, which makes an annotation similar to authoring new content. Other systems store annotations separately, allowing them to be viewed only by the intended target audience.

Comments are reader-created text or graphics that are associated with a particular region of a hypermedia document. Comments are usually separate from the original document, and can often be filtered. In some multi-user environments, comments are stored with the author name and timestamp to aid cooperative work.

Bookmarks are reader-created links to hypermedia documents that are stored either in the user’s browser, or in a new document. Bookmarks are often grouped into categories or assembled into hierarchies. These bookmarks allow the user to move directly to an area of interest without following existing links.

3.4 Design Issues for Hypermaps

Hypermaps are a hypermedia version of graphical maps that link to spatially related information such as images, text, and adjacent or more detailed maps. The general components, problems and functionality described for hypermedia, as well as the cognitive issues for maps described in the previous section, are all directly applicable to the design of hypermaps. For example, a hypermedia version of a road atlas could use the overview and detailed maps as nodes, and create links between them for navigation. In addition, the user could create annotations of specific regions within a map, such as comments or bookmarks.

The extension of GIS databases to hypermedia structures offers exciting possibilities for relating and navigating through spatially related information. The design guidelines for visual momentum are essential for avoiding the problems of disorientation and cognitive overhead in hypermedia environments, which are more abstract than interactive electronic maps. These guidelines would apply both at the map level (e.g., the design of the physical maps), as well as at the hypermedia level (e.g., the transitions between maps).

4. Virtual Reality and 3D GIS

A step forward in applying advanced interaction technologies to GIS is represented by the joining of GIS and virtual reality. GIS and virtual reality were developed in different areas of information technology, attracting, from the very beginning, different sets of researchers and explicitly addressing different communities of users. One of the major motivations behind the growth of virtual environments is the desire to provide new kinds of vivid and realistic interfaces, and this requirement holds equally for interaction with GIS. Nevertheless, for many years the two areas developed independently, without trying to define common bases for their integration.

Virtual reality is the main tool for providing the users of a computer-based system that deals with real or realistic data with a highly vivid interaction with the data. Users have a realistic perception of a 3D scenario, based on advanced 2D visualisation or on more immersive interactions, up to so-called fully immersive 3D navigation. While real data are information originating from some measurement and have the purpose of representing some fragment of reality, realistic data are artificially built with the objective of presenting ideas, what-if analyses and simulations. In both cases, present virtual reality technologies offer an extremely promising way of improving the interaction with GIS, and several examples and advanced prototypes are available at the time of writing (2001).

4.1 High level interaction

The task of widening the set of potential GIS users to include people without technical or specific training, coupled with the importance of deeply interacting with spatial data, has recently directed the efforts of researchers towards integrating GIS functionalities and virtual environments. This requires the availability of 3D data to be supplied to the visualising system. On the other hand, the first tools for capturing spatial and geographical data produced two-dimensional (2D) spatial data, in some cases integrated with semantic information. As a consequence, the first GIS dealt with 2D maps, at most exploiting semantic information to produce thematic maps.

Starting from the mid nineties, the issue of adding three-dimensional (3D) capabilities to the visualisation of GIS data emerged as a strong requirement that has oriented the applied research on GIS functionalities and GIS integration with complex visualisation systems. 3D visualisation is requested, for instance, to rapidly sketch and visualise design ideas in urban design or in the final stage of a complex design process. More generally, it emerges as a basic tool for visual analysis and spatial cognition, when one is mainly concerned with the general arrangement of objects, as well as size, dimension and the relationships between objects.

The problem of the 3D visualisation of GIS data can be decomposed into the sub-problems of getting and of visualising 3D spatial data. Although there are several methodologies for dealing with these sub-problems, there are no leading technologies, and the choices of the implementers still strongly depend on the application and the expected user profile.

4.2 3D GIS

3D GIS are capable of representing real or realistic scenarios and therefore have a full set of functionalities for dealing with 3D data. As a first, rough simplification, there are two methodologies, not mutually exclusive, for providing GIS with 3D data: augmenting 2D data and directly obtaining 3D data. The former consists of augmenting 2D data with further information derived either from the semantics of the data (the height of the sea surface is zero; the vertical size of manufactured objects cannot exceed the order of one kilometre) or directly from the available attributes. Another way of augmenting 2D data consists of integrating the original GIS data with 3D data resulting from some computer-aided design (CAD) process. For instance, modern CAD systems have functionalities allowing direct 3D drawings that can be effectively filled with predefined schemes, textures and photo-originated landscapes.
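The first way of augmenting 2D data can be illustrated with a minimal Python sketch. It assumes a hypothetical `Feature2D` record with a kind, a footprint and an attribute table; the `height_m` attribute name and the 3-metre default are illustrative assumptions and do not come from any particular GIS. The two semantic rules are the ones quoted above.

```python
from dataclasses import dataclass, field

SEA_SURFACE_HEIGHT_M = 0.0          # semantic rule: the sea surface has height zero
MAX_MANUFACTURED_HEIGHT_M = 1000.0  # semantic rule: at most about one kilometre


@dataclass
class Feature2D:
    """Hypothetical 2D GIS feature: a kind, a footprint and free attributes."""
    kind: str                                   # e.g. "sea", "building"
    footprint: list                             # list of (x, y) vertices
    attributes: dict = field(default_factory=dict)


def assign_height(feature, default_height=3.0):
    """Derive a height in metres from semantics first, then from attributes."""
    if feature.kind == "sea":
        return SEA_SURFACE_HEIGHT_M
    # "height_m" is an assumed attribute name; fall back to a default if absent.
    height = float(feature.attributes.get("height_m", default_height))
    # Clamp to the plausible range for manufactured objects.
    return max(0.0, min(height, MAX_MANUFACTURED_HEIGHT_M))


def extrude(feature):
    """Return the bottom and top rings of 3D vertices for a 2D footprint."""
    h = assign_height(feature)
    base = [(x, y, 0.0) for x, y in feature.footprint]
    top = [(x, y, h) for x, y in feature.footprint]
    return base, top
```

A real system would also build the side faces and place the base on the terrain; the sketch only shows where the third coordinate comes from.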

The second methodology consists of directly obtaining 3D data. This is in principle more effective and attractive, but it suffers from some limits due to the intrinsic difficulty of the task. The main approaches are presently based on software that processes photo-originated landscapes or other geo-referenced data to obtain clean and simplified 3D models, such as triangular irregular networks (TINs), which consist of a large number of connected triangles used for representing 3D surfaces (e.g., the earth's surface). An example is shown in Figure 7.

Figure 7. The TIN (Triangular Irregular Network) of the whole sea floor of the Mediterranean Sea (Black Sea excluded), with a precision of 20 meters, derived from a bathymetric data set (polylines, points and grid) through a specific tool (ESRI-Spatial Analyst).
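To make the TIN representation concrete, the following Python sketch stores a TIN as a list of 3D vertices plus index triples, and estimates the surface height at an arbitrary (x, y) position by barycentric interpolation inside the containing triangle. The data layout and function names are assumptions chosen for illustration; production tools use spatial indexes rather than the linear scan shown here.

```python
def barycentric(p, a, b, c):
    """Barycentric weights of 2D point p in triangle (a, b, c), or None if degenerate."""
    px, py = p
    ax, ay = a[0], a[1]
    bx, by = b[0], b[1]
    cx, cy = c[0], c[1]
    det = (by - cy) * (ax - cx) + (cx - bx) * (ay - cy)
    if det == 0:
        return None  # the three vertices are collinear
    w_a = ((by - cy) * (px - cx) + (cx - bx) * (py - cy)) / det
    w_b = ((cy - ay) * (px - cx) + (ax - cx) * (py - cy)) / det
    return w_a, w_b, 1.0 - w_a - w_b


def tin_height(vertices, triangles, x, y, eps=1e-9):
    """Interpolated height at (x, y), or None if the point lies outside the TIN."""
    for i, j, k in triangles:
        a, b, c = vertices[i], vertices[j], vertices[k]
        weights = barycentric((x, y), a, b, c)
        # The point is inside the triangle when no weight is negative.
        if weights is not None and all(w >= -eps for w in weights):
            return weights[0] * a[2] + weights[1] * b[2] + weights[2] * c[2]
    return None


# A tiny hypothetical sea-floor patch: (x, y, depth) vertices and two triangles.
VERTICES = [(0, 0, 0), (10, 0, -20), (0, 10, -40), (10, 10, -60)]
TRIANGLES = [(0, 1, 2), (1, 3, 2)]
```

For example, `tin_height(VERTICES, TRIANGLES, 5, 0)` lies halfway along the edge between the first two vertices and therefore evaluates to a depth halfway between 0 and -20.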

TINs are basic 3D representations that are typically integrated with real or artificial 3D data to obtain quite realistic visualisations. Figure 8 illustrates the process of such an integration.

Figure 8. Visualisation of a TIN in the KarmaVI (K2VI) system. (a) Visualisation of the original TIN. (b) 2D GIS data are placed on top of the TIN. (c) 3D CAD models are used for improving the presentation.

4.3 3D visualisation

In the visualisation of spatial data the major task is to present 3D data to users so that they have a realistic perception of the modelled scenario. Modern computer graphics offers traditional visualisation tools and, in recent years, virtual reality (VR). Traditional visualisation tools are based on geometric transformations that map the original 3D data to the screen through an articulated process, which can be considered mainly static. This approach has the disadvantage of offering just a 2D screen representation of a modelled 3D reality. Because of this limitation, spatial perception is conveyed only when the observer performs an internal abstraction, and this process is far from objective, since it depends heavily on the observer's perceptual experience and competence. The observer needs to mentally convert the picture seen on the screen to three dimensions and to the appropriate scale. In contrast, VR gives the observer a more vivid, immersive and realistic perception of the 3D reality. Indeed, VR is based on the direct coupling between the viewing position used to generate the image on a display and the head position and viewing direction of the user. The head position of the user is constantly tracked and fed into the display algorithm to calculate a correct stereoscopic and perspective display of the scene. This strong coupling between the current eye position of the user and the image offered by the display system gives the user the illusion of “immersiveness”. VR technologies can be based, for instance, on head-mounted displays (HMD) or stereoscopic surround projection (see Figure 9).

Figure 9. Standard VR technologies.
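The coupling between tracked head position and displayed image can be sketched in a few lines of Python. The sketch below assumes a simple pinhole camera, a z-up world and an eye separation of 6.5 cm; these values and all function names are illustrative assumptions. Real VR systems, in particular surround projection systems, use more elaborate off-axis projections.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def _unit(a):
    n = math.sqrt(_dot(a, a))
    return tuple(x / n for x in a)


def camera_basis(view_dir, up=(0.0, 0.0, 1.0)):
    """Forward, right and up unit vectors; view_dir must not be parallel to up."""
    forward = _unit(view_dir)
    right = _unit(_cross(forward, up))
    return forward, right, _cross(right, forward)


def eye_positions(head, view_dir, eye_separation=0.065):
    """Left and right eye positions derived from the tracked head position."""
    _, right, _ = camera_basis(view_dir)
    half = eye_separation / 2.0
    left_eye = tuple(h - half * r for h, r in zip(head, right))
    right_eye = tuple(h + half * r for h, r in zip(head, right))
    return left_eye, right_eye


def project(point, eye, view_dir, focal=1.0):
    """Perspective projection of a 3D point; None if the point is behind the eye."""
    forward, right, true_up = camera_basis(view_dir)
    d = _sub(point, eye)
    depth = _dot(d, forward)
    if depth <= 0:
        return None
    return (focal * _dot(d, right) / depth, focal * _dot(d, true_up) / depth)


def stereo_pair(point, head, view_dir):
    """Image coordinates of one scene point for the left and the right eye."""
    left_eye, right_eye = eye_positions(head, view_dir)
    return project(point, left_eye, view_dir), project(point, right_eye, view_dir)
```

Calling `stereo_pair` again whenever the tracker reports a new head position yields images that shift as the user moves, and the horizontal difference between the two projections is the disparity that produces the impression of depth.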

The VR approach, although offering a more vivid visualisation, has the intrinsic limit of reducing the observer's capacity to interact effectively with the visualisation system, making it more difficult in practice to formulate queries about the scene. Special glove devices are available, but they are still limited, as they do not allow precise interaction. A good compromise is presently offered by less immersive tools that are still based on the VR paradigm but use screen-based visualisation and a graphical user interface (GUI), through which functionalities such as walking, flying, panning, turning, rolling and planning are offered to the observer. This approach is based on the Virtual Reality Modelling Language (VRML), a platform-independent language for modelling 3D reality so that it can be visualised through a so-called VRML browser, ad hoc software often made available as a plug-in for some well-known WWW browser. These characteristics have made VRML an emerging standard for Web-integrated visualisation on the Internet and on intranets.
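As an illustration of how GIS-derived geometry might be handed to a VRML browser, the following Python sketch serialises a triangulated surface as a VRML97 `IndexedFaceSet`. The function name and the input layout (a vertex list plus index triples) are assumptions made for this example; note also that VRML treats the y axis as "up", so a real exporter would have to reorder GIS coordinates accordingly.

```python
def tin_to_vrml(vertices, triangles):
    """Return VRML97 text describing the triangles as one IndexedFaceSet."""
    points = ",\n".join(f"        {x} {y} {z}" for x, y, z in vertices)
    # Each face is a list of vertex indices terminated by -1.
    faces = ",\n".join(f"      {i}, {j}, {k}, -1" for i, j, k in triangles)
    return (
        "#VRML V2.0 utf8\n"
        "Shape {\n"
        "  appearance Appearance { material Material { } }\n"
        "  geometry IndexedFaceSet {\n"
        "    solid FALSE\n"          # render both sides of the surface
        "    coord Coordinate {\n"
        "      point [\n"
        f"{points}\n"
        "      ]\n"
        "    }\n"
        "    coordIndex [\n"
        f"{faces}\n"
        "    ]\n"
        "  }\n"
        "}\n"
    )


if __name__ == "__main__":
    # A single hypothetical triangle, printed as a complete VRML world.
    print(tin_to_vrml([(0, 0, 0), (1, 0, 0), (0, 1, 0)], [(0, 1, 2)]))
```

The resulting text is a static description of the scene, which is exactly the limitation discussed next: any change in the underlying GIS data requires the description to be generated again.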

The limit of present technology lies in the coupling of the VR visualisation system and the GIS, specifically in the generation of a suitable description of 3D data, which is still basically static, and in the ability of the user to formulate queries precisely, even through a GUI, which is normally strongly tied to the VR system.

4.4 The KarmaVI project

K2VI (KarmaVI version 2) is an application for modelling, manipulating and analysing two- and three-dimensional spatial data within a virtual reality environment. It supports stereoscopic viewing and 3D interaction devices, and provides a 3D user interface for easier modelling and manipulation.

K2VI is developed in the framework of the 3-D GIS & Virtual Reality project, funded by the Dutch Land Water Environment Information Technology (LWI) foundation, by the Delft University of Technology in The Netherlands and by Asset Information Systems in New Zealand.

The system provides multiple views to support 3D GIS interaction in a virtual environment (see Figure 10): plan view, model view and world view.

Figure 10. Different view modes visualising a part of the centre of the city of Utrecht in The Netherlands: (a) plan view, (b) model view, (c) world view. The yellow dot in the plan view represents the position of the viewpoint for the model view and world view.

In the plan view, geographic data are visualised in a conventional cartographic map format (Figure 10 (a)). Spatial objects are represented by 2D points, polylines, polygons and symbols, while thematic data are visualised using standard text annotation and classification techniques. The model view offers a so-called bird's eye view of geographic data (Figure 10 (b)), which makes it possible to visualise data that contain rough information along the third dimension. Visualisation in this view is a simplified model of the users' 3D geographic data. The world view is meant to give a realistic impression of the changes in the landscape, using both visual and audio output. The user can "walk through" the geographic data, which are visualised using detailed 3D CAD models and textures (Figure 10 (c)).

Each view has its own preferred display system. The plan view is designed for standard GUI interaction on a monitor. For the model view, a virtual workbench is a more appropriate display system. It is a non-immersive virtual reality display system in which the user views a large stationary display that generates stereoscopic imagery similar to that used in 3D movies. Because of the stereoscopic imagery, the virtual objects appear to exist above and below the physical display surface (Figure 11 (a)). The world view is best viewed with a truly immersive display, such as an HMD or a surround projection display, for example the CAVE (Cave Automatic Virtual Environment), a fully immersive, multiple-screen projective display system that offers stereoscopic surround projection to several users simultaneously (Figure 11 (b)). In addition, to avoid the inconvenience of having to change display system whenever switching to another view, all views are in practice supported on all available display systems.

Figure 11. (a) Virtual workbench at the College of Computing, Georgia Institute of Technology, and (b) surround projection in a CAVE (Cave Automatic Virtual Environment) (EVL 1999).

The three views require different ways of navigating through and interacting with the data. The plan view is best suited for specifying, selecting and exploring data. The user can navigate by scrolling, panning and zooming, or by browsing via hypertext links. The user interface accommodates most features of today's GIS, using standard interface components such as menus and icons. The model view lets a user arrange objects such as buildings, roads, trees and bridges. When using a virtual workbench, navigation and manipulation can be steered by gestures, by real head movements or by more conventional devices such as a trackball. In the world view, attention is directed to navigation and evaluation by 'looking at' the objects and their surroundings. Because of the lack of overview, manipulation of objects is not considered useful in this view. The user's main activity is navigating through the 3D scene from a first-person viewpoint, gathering information such as object attributes and distance from the viewpoint. In an immersive VR environment, a data glove or "magic wand" is used as the input device, to support (and enhance) the user's sense of immersion. GIS functions (select by SQL query, identify, buffering, measuring distances) are supported in all views, although each view has specific ways of interacting to select objects and to specify the query operators. For instance, in the world view a spoken-language interface would be most natural, while in the plan view a more conventional interaction method such as a keyboard may give better user support. Other available functionalities include manipulation and navigation.

5. Visual Querying

As discussed in the sections above, geographical information is most naturally conveyed in visual format. Maps and diagrams (i.e., schematic maps such as a bus network map) are the core means in all user interaction activities, including spatial querying. A typical query would be “show me on a city map where the post office closest to this location is”. The query itself would most likely be expressed using preformatted forms and menus to select the city and the reference location. The result would be a blinking or otherwise highlighted point in the displayed map. Once a map is displayed, as the result of a previous query or as an initial background screen in a query formulation interaction, the map can be used to specify a new query. This typically supports queries such as “give me more information on this”, where the value of the parameter “this” is specified by pointing in some way to a location in the map (i.e., a point on the screen). Thus, in some sense, visual interaction is common practice in GIS, owing to the intrinsically spatial reference that is associated with the data.
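The "closest post office" query can be sketched as a nearest-neighbour search over point features. The Python fragment below assumes features stored as dictionaries with hypothetical `kind`, `name` and `xy` fields and planar coordinates in metres; an actual GIS would use a spatial index and proper geodetic distances.

```python
import math


def nearest(features, kind, location):
    """Return the feature of the given kind closest to location, or None."""
    candidates = [f for f in features if f["kind"] == kind]
    if not candidates:
        return None
    # Straight-line distance in the plane; adequate at city scale.
    return min(candidates, key=lambda f: math.dist(f["xy"], location))


# Hypothetical city data, for illustration only.
CITY = [
    {"kind": "post_office", "name": "Central", "xy": (1200.0, 800.0)},
    {"kind": "post_office", "name": "Station", "xy": (300.0, 450.0)},
    {"kind": "school", "name": "North", "xy": (310.0, 460.0)},
]
```

Here the reference location plays the role of the point selected on the map, and the returned feature is the one that the interface would highlight.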

Although not yet implemented in commercial systems, different ways have been proposed in the research literature to support query formulation in a more visual way than just forms and menus. The issue is actually more complex than for traditional databases. Formulating a query visually implies using the space on the screen to display the query elements. In traditional databases the way query elements are positioned on the screen is irrelevant: a person-owns-car schema diagram has the same semantics whether the person type is on the right-hand side and the car type on the left-hand side, or vice versa. The only thing that matters is whether the query is looking for persons or for cars. In a geographical database, a query to retrieve cities that have a lake within 5 km of the city limits will similarly require that the city and lake types are displayed on the screen. But now the question arises: is the relative positioning of the city and lake visual notations relevant to the query or not? Does the fact that city has been positioned left of lake on the screen mean that the user is looking for cities that have a lake east of the city, or is the relative positioning irrelevant? To correctly understand a query that is formulated as a configuration of a set of objects on a screen, the system has to be able to determine to what extent the organization of the space on the screen corresponds to a search criterion on the organization of the database objects in the geographical space they belong to. A disambiguation interaction with the user is needed, for which different techniques may be used (e.g., resorting to additional specifications, entering a dialogue with the user, or having the query reformulated by the system in natural language for validation by the user).

More recent approaches formulate a spatial query visually through icons or sketches. In the icon-based case, each object type in the database has an associated icon. To formulate a query, the user selects the icons of interest and positions them along axes that represent the directions supported by the system (e.g., North, East, West and South). Icons positioned adjacent to each other express an adjacency topological relationship that must hold between the selected instances. Icons that do not touch each other imply that the corresponding instances are separated by some metric distance, which can be specified separately if a distance criterion is to be part of the query. Although very intuitive, these approaches have obvious limitations. The first is intrinsic to iconic systems: it is difficult to define a large number of icons that users can recognize and understand. This makes iconic interfaces best suited to databases with a relatively small number of object types, which is definitely not the case for many geographical databases. The second limitation concerns expressive power and is due to the limited number of positions available on the screen for arranging the icons that participate in the query. For instance, an icon can only be adjacent to four other icons (one in each direction), and only two icons may visually intersect on the screen. Menus can be used to overcome such limitations, but this degrades the visual quality of the interface. Figure 12 shows two examples of such iconic query formulations.

Figure 12: Visual Queries in iconic interfaces

The first one, on the left, shows how the query "Which paths from Paris to Vienna avoid towns of more than 1000 inhabitants?" is formulated. Two labeled town icons denote Paris and Vienna, and one more is used to state the predicate that restricts towns to those with the requested population. The geometries of Town and Road objects are drawn as an area for Town and as a line for Road. Putting those two geometries apart is meant to express a disjointness predicate. Finally, a question mark on the Road icon shows that the query is to retrieve roads that satisfy the selection criterion, i.e., being disjoint from towns with a population greater than 1000. Referring to the previous discussion on the ambiguity of spatial configurations in visual queries, it is worth noting that in this example the relative position of the road line and the town area is irrelevant as long as they are disjoint. The distance between the road and the town is irrelevant too. The second query, on the right, shows a labeled road icon, denoting road RN7, whose geometry is associated with a balloon that includes a question mark, a crash icon and a timeline with the December 1988 interval highlighted. This configuration conveys the spatial and temporal query "Where did crashes occur on road RN7 in December 1988?". Finally, Figure 13 shows a spatio-temporal query involving both moving objects and objects' lifecycles. The query is "Which trucks drove on a riverside expressway five hours before a flood?" The query is split into two parts. Part (a) represents the requested spatial configuration, i.e., the road and the river being adjacent. Part (b) holds the temporal configuration, i.e., the requested 5-hour delay between the passing of the truck and the flood. The labeled anchors in part (a) link the temporal events to the spatial configuration.

Figure 13: A Visual Spatio-Temporal Query
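The disjointness predicate of the first query in Figure 12 can be sketched as follows. This is a deliberately simplified Python illustration and not the evaluation strategy of any of the cited systems: towns are approximated by circles, roads by polylines, and the path-finding part of the query (from Paris to Vienna) is left out. All field names are assumptions.

```python
import math


def point_segment_distance(p, a, b):
    """Shortest distance from point p to the segment a-b in the plane."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return math.hypot(px - ax, py - ay)
    # Parameter of the orthogonal projection, clamped to the segment.
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def road_disjoint_from_town(road, town):
    """True when no segment of the road touches the circular town area."""
    line = road["line"]
    return all(
        point_segment_distance(town["centre"], a, b) > town["radius"]
        for a, b in zip(line, line[1:])
    )


def roads_avoiding_large_towns(roads, towns, min_population=1000):
    """Names of the roads disjoint from every town above the population limit."""
    large = [t for t in towns if t["population"] > min_population]
    return [
        r["name"] for r in roads
        if all(road_disjoint_from_town(r, t) for t in large)
    ]
```

As in the iconic query, only the fact that the two geometries are disjoint matters here; neither their relative position nor the distance between them enters the selection.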

A different paradigm aims at supporting query by sketch. Sketch-based interfaces do not use predefined icons, but let users draw a sketch of the geographical configuration they are looking for. Forms drawn by the user are considered to denote regions, lines and point features. The exact shape that is drawn is not considered (the form is a symbolic notation rather than an approximate drawing of an existing shape). The semantics of these features, i.e., the object types they correspond to, has to be specified separately. Further specifications are also needed to distinguish relevant spatial configurations from accidental ones, as discussed above. Based on pragmatic rules, topological relationships are considered first, while metric conditions are seen as second-order criteria. The approach is clearly rooted in similar sketching facilities provided by some well-known query systems for image databases, where the sketch is used as a prototypical skeleton to select images from the database.

While most GIS developments focus on topological relationships, some efforts have been directed towards supporting orientation relationships, e.g., north of. These relationships are often used in multimedia databases for image retrieval by content. They allow selection predicates such as "images where an object of type A appears northeast of an object of type B" (e.g., a picture showing the moon over a palm tree).
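An orientation predicate such as "northeast of" can be sketched with a simple projection-based model, in which each object is reduced to a reference point and the two coordinate differences are inspected separately. The Python sketch below assumes x growing eastwards and y growing northwards; other models of orientation (for example cone-based ones) exist and give different answers near the axes.

```python
def direction_of(a, b):
    """Cardinal direction of point a relative to point b (x east, y north)."""
    dx, dy = a[0] - b[0], a[1] - b[1]
    north_south = "north" if dy > 0 else "south" if dy < 0 else ""
    east_west = "east" if dx > 0 else "west" if dx < 0 else ""
    return (north_south + east_west) or "same"


def appears(a, relation, b):
    """Selection predicate: does a lie in the given direction from b?"""
    return direction_of(a, b) == relation
```

For instance, `appears((2, 3), "northeast", (0, 0))` holds, whereas a point directly above the reference point is classified as "north" only.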

6. Animation

The last, but not least important, visual feature that has been added to GIS interfaces to facilitate interaction is animation. Animation is a graphical interface feature that covers all object changes that have a visual effect, including changes in position, color, structure, texture, rendering, etc. It is typically used in conjunction with other advanced visual features, such as image visualization, virtual reality and hypermedia.

Conventional animation, which is not necessarily computer-supported, is based on storyboards, that is, sequences of sketches showing the structure and ideas of the animation. Several steps of conventional animation are suited to computer assistance. For instance, in a storyboard only some key frames are drawn, namely those from which the missing intermediate frames can be derived. Specific interpolation methods have been devised to automate this procedure, but open problems remain, especially with complex figures such as the human body.
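The simplest of these interpolation methods is linear in-betweening, sketched below in Python for the position of a single object. The key-frame layout, a sorted list of (frame number, position) pairs, is an assumption made for this example; practical systems usually prefer splines and must handle articulated figures, which is where the open problems mentioned above arise.

```python
def interpolate_keyframes(keyframes, frame):
    """Position at the given frame, linearly interpolated between key frames.

    keyframes is a list of (frame_number, (x, y)) pairs sorted by frame number.
    """
    first_frame, first_pos = keyframes[0]
    last_frame, last_pos = keyframes[-1]
    if frame <= first_frame:
        return first_pos
    if frame >= last_frame:
        return last_pos
    for (f0, p0), (f1, p1) in zip(keyframes, keyframes[1:]):
        if f0 <= frame <= f1:
            t = (frame - f0) / (f1 - f0)  # 0 at the first key, 1 at the second
            return tuple(a + t * (b - a) for a, b in zip(p0, p1))


# Three hypothetical key frames; every other frame is derived from them.
KEYS = [(0, (0.0, 0.0)), (10, (10.0, 20.0)), (20, (10.0, 0.0))]
```

With these key frames, frame 5 falls halfway between the first two keys and is placed at (5.0, 10.0).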

Complete languages for describing animation have also been proposed, and new ones continue to appear. The first languages have a linear-list notation in which each event is described by a starting and ending frame number and an action that is to take place. Another way to describe animations is to embed animation capability within a general-purpose programming language. However, in both cases it is difficult for an animator to see what will take place in an animation just by looking at the code. More recent graphical animation languages describe animation in a more visual way. These languages are used for expressing, editing and comprehending the changes taking place in an animation. The main idea is to provide pictures of the actions rather than descriptions.
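A linear-list notation can be mimicked in Python as a plain list of events, each carrying a starting frame, an ending frame and an action. The sketch below is not the syntax of any historical animation language; the `Event` type, the action names and the targets are hypothetical and serve only to show the structure.

```python
from typing import NamedTuple


class Event(NamedTuple):
    start: int    # first frame on which the action applies
    end: int      # last frame on which the action applies
    action: str   # what is to take place
    target: str   # the object the action is applied to


# A hypothetical three-event script in linear-list form.
SCRIPT = [
    Event(1, 24, "translate", "truck"),
    Event(12, 36, "rotate", "camera"),
    Event(30, 48, "fade_out", "title"),
]


def actions_at(script, frame):
    """Actions that are active on the given frame, in script order."""
    return [(e.action, e.target) for e in script if e.start <= frame <= e.end]
```

Even in this tiny script, working out what is on screen at frame 32 requires scanning every entry, which illustrates why such notations are hard for an animator to read.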

Animations have not only to be described but also to be controlled in some way. Controlling an animation is largely independent of the language used for describing it. Animation-control mechanisms range from full explicit control, in which the animator describes the position and attributes of every object in the scene by means of translations, rotations, etc., to the automated control provided by knowledge-based systems, where the animator just gives a high-level description of the scene.

Modern animation extends its range of application from traditional cartoon characters to realistic 3D and 4D (adding the time dimension) environments. Moving to 4D introduces additional problems, such as the need to render many similar images and to avoid temporal aliasing. Temporal aliasing occurs when some aspect of an animation changes too quickly with respect to the number of frames displayed per second (for instance, when wagon wheels seem to turn backward), and correction techniques then need to be introduced.
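The wagon-wheel effect can be worked through numerically. In the Python sketch below, a wheel with identical spokes looks the same after turning by one spoke spacing, so only the fractional part of the per-frame rotation is visible; the function names and the sample figures are assumptions chosen for illustration.

```python
def true_step(rev_per_s, spokes, fps):
    """Actual rotation between two frames, measured in spoke spacings."""
    return rev_per_s * spokes / fps


def apparent_step(rev_per_s, spokes, fps):
    """Perceived rotation per frame, folded into the range [-0.5, 0.5).

    A negative value means that the wheel appears to turn backward.
    """
    return (true_step(rev_per_s, spokes, fps) + 0.5) % 1.0 - 0.5


def is_aliased(rev_per_s, spokes, fps):
    """Aliasing sets in once the wheel advances half a spoke spacing per frame."""
    return abs(true_step(rev_per_s, spokes, fps)) >= 0.5
```

For a 12-spoke wheel filmed at 24 frames per second, 0.5 revolutions per second gives a step of 0.25 spacings and is rendered faithfully, whereas 1.9 revolutions per second gives a true step of 0.95 spacings that is perceived as a slow backward rotation of about 0.05 spacings per frame.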

6.1 Animation in GIS

Animation is used in modern GIS, which exploit the power of sophisticated visualizations to support the exploration, analysis, synthesis and presentation of georeferenced information. In particular, animation is often used for depicting time series of geographically enumerated quantities. It is particularly interesting to investigate whether animated maps let spatiotemporal patterns emerge, help users form different conceptual models of spatio-temporal processes, and cause expert users to change their hypotheses about processes.

A small number of usability experiments have been performed to clarify the above points. In particular, users have been given two kinds of tasks: spatial pattern analysis and comparison (with the subtasks: highlight low and high data values; enhance the visibility of regions, clusters or trends; relate data in geographic and attribute spaces; explore associations between different variables) and spatiotemporal analysis (with the subtasks: examine time series; examine time series for selected attribute ranges; compare change over time for two variables). The experimental results show the usefulness of animation in performing such tasks. Participants typically had few problems understanding how to use the animated maps and associated tools. Some errors occurred, such as seeing false patterns and missing real ones, but spatiotemporal analysis was generally carried out more effectively with the help of animated maps.

There are basically two kinds of animation in GIS: temporal and cartographic. In temporal animation, a change in the presented objects is based on temporal data. In cartographic (spatial) animation, the change depends on factors other than time, such as a change in the camera position, the light source or other non-temporal variables. In some systems the two kinds of animation are used together, giving rise to so-called spatio-temporal animation.

TEMPORAL ANIMATION

Some systems provide the user with time navigation techniques for investigating the temporal dimension of the visualization. As in 3D navigation, a special viewpoint in the time dimension is needed, from which the user currently views the temporal world. In this view, only those objects and representations are shown that are valid at the time given by the current viewpoint. Changing this viewpoint allows the viewer to move through the temporal dimension of the world.
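The notion of a temporal viewpoint can be sketched as a filter over objects that carry a validity period. In the Python fragment below, the `valid_from` and `valid_to` fields, the half-open interval convention and the sample objects are all assumptions made for illustration.

```python
def visible_at(objects, viewpoint_time):
    """Names of the objects that are valid at the current temporal viewpoint.

    Each object is valid from valid_from (inclusive) to valid_to (exclusive);
    a valid_to of None means that the object still exists.
    """
    return [
        o["name"] for o in objects
        if o["valid_from"] <= viewpoint_time
        and (o["valid_to"] is None or viewpoint_time < o["valid_to"])
    ]


# Hypothetical buildings with construction and demolition years.
BUILDINGS = [
    {"name": "building A", "valid_from": 1300, "valid_to": None},
    {"name": "building B", "valid_from": 1500, "valid_to": 1830},
    {"name": "building C", "valid_from": 1870, "valid_to": None},
]
```

Moving the temporal viewpoint from 1600 to 1900 replaces building B by building C in the displayed scene, while building A remains visible throughout.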

Temporal animation is often used in the study of environmental phenomena, especially when datasets are available in the form of time series and for highly dynamic phenomena. In this case, remotely sensed datasets are suitably modeled and manipulated in order to eliminate possible data gaps, and scene composition is applied in the way most suitable to emphasize the aspects under study. It is worth noting that animation is particularly effective in so-called confirmatory visualization, as opposed to exploratory visualization. Exploratory visualization is used for the initial exploration of very large quantities of data in order to make some sense of them, while confirmatory visualization is used to confirm a hypothesis that is based on theoretical or empirically derived knowledge.

CARTOGRAPHIC ANIMATION

Realistic visualization of maps and terrains is very important for many application fields, such as flight simulation and environmental assessment. In landscape visualization it is possible to enhance the readability of an image by using different threshold functions and focus points. The user can fly over the terrain by steering the viewpoint like a virtual airplane. The threshold function increases linearly with distance from the current viewpoint, and therefore changes as the viewpoint moves.
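A distance-dependent threshold of this kind can be sketched as follows. The text does not say how the threshold is applied, so the Python fragment below makes an explicit assumption: a terrain feature is drawn only if its size is at least the threshold at its distance from the viewpoint. The `base` and `slope` parameters and the sample features are likewise hypothetical.

```python
import math


def detail_threshold(distance, base=1.0, slope=0.02):
    """Threshold that increases linearly with distance from the viewpoint."""
    return base + slope * distance


def visible_features(features, viewpoint):
    """Names of the features large enough to be drawn from this viewpoint.

    features is a list of (name, (x, y), size) tuples in planar coordinates.
    """
    return [
        name for name, xy, size in features
        if size >= detail_threshold(math.dist(xy, viewpoint))
    ]


# Hypothetical terrain features: name, position and characteristic size.
TERRAIN = [
    ("ridge", (0.0, 100.0), 5.0),
    ("boulder", (0.0, 100.0), 2.0),
    ("hut", (0.0, 10.0), 1.5),
]
```

As the virtual airplane flies from (0, 0) towards (0, 90), the boulder comes into view while the hut, now far behind, drops out, so that the amount of detail follows the viewpoint.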

SPATIO-TEMPORAL ANIMATION

Time navigation should be possible in parallel with cartographic navigation, because the two dimensions are independent. It should also be applied synchronously to all objects defined in the temporal world. Typically, the navigation is controlled through instruments such as a time display, which shows the current valid time; navigation controls, which allow the user to change the valid time; and a virtual time line, which visualizes the valid periods of the various temporal objects in relation to the current valid time. For instance, the historical development of a city, as reflected in the construction of its buildings, could be effectively illustrated. VRML extended with a Java-based control of the navigation can be used effectively to implement spatio-temporal navigation interfaces.

Many GIS databases contain spatio-temporal data in which the geographically distributed values change continuously over time (e.g., population and precipitation distributions). For such data, a suitable animation that displays value-by-area cartograms at different time instants can reveal much information that is otherwise hidden in the huge database.

Finally, the simulation of transportation systems has been an important area for computer applications since the early 70s. Animation is a valuable feature currently used in traffic simulation systems, especially for detecting collisions of entities, analyzing the reasons for traffic jams, and validating the traffic model for the final users. Advanced traffic simulation systems effectively couple GIS and traffic management technology, exploiting different kinds of animation to reflect dynamic traffic properties, for instance the spatio-temporal animation of traffic data values for a specific collection of network components (which may be a user-defined route).

7. Conclusions

Sophisticated visual techniques, graphics, multimedia environments and general advances in hardware and software technology will help users to overcome the problems inherent in map navigation, such as disorientation, difficulties in route selection, distance errors and cognitive overhead. They will also offer users new mechanisms to explore and query GIS data, so as to make sense of them and discover hidden relationships and non-obvious features. However, many problems remain open, and there is a need to integrate different features and techniques in a homogeneous environment that is really usable and effective in supporting the various needs of the disparate GIS users.

Bibliography

Bertin, J. (1967). Sémiologie Graphique. Paris. [This is a seminal work on the components of graphics that can be manipulated and assembled to both explore and communicate information].

Bevan N. and Macleod M. (1993), Usability Assessment and Measurement, in: The management of software quality, M. Kelly (ed), Ashgate Technical/Gower Press. [This paper defines the notion of software usability and highlights its role among the most important software qualities for interactive systems].

Bonhomme C., Aufaure M.-A., and Trépied C. (2000). Metaphors for Visual Querying of Spatio-Temporal Databases. In Proceedings of Fourth Int. Conf. on Visual Information Systems - Visual 2000. [An interesting proposal of iconic interfaces for expressing spatio-temporal queries].

Calcinelli D. and Mainguenaud M. (1994). Cigales, a Visual Language for Geographical Information System: the User Interface, Journal of Visual Languages and Computing, vol. 5, no. 2, pp 113-13. [This is a key article for the icon-based approach to visual querying in GIS].

Câmara, A. S. and Raper, J. (Eds.). (1999). Spatial Multimedia and Virtual Reality: Taylor and Francis. [A collection of articles discussing design issues and implementations for spatial hypermedia].

Card, S. K., Mackinlay, J. D., and Shneiderman, B. (1999). Readings in Information Visualization: Using Vision to Think. San Francisco, Calif.: Morgan Kaufmann Publishers. [A compilation of influential information visualization research articles that discuss the state of the art in computer interfaces for exploring and understanding large complex collections of abstract information, such as networks and databases].

Chase, W.G. (1986) Visual Information Processing. In Boff, et al. (Eds.) Handbook of Perception and Human Performance: Vol II, Cognitive Processes and Performance. Wiley and Sons, New York. [A survey of experimental research results investigating how people perceive and process visual information].

Conklin, J. (1987). Hypertext: An Introduction and Survey. Computer, 17-41. [A survey article of early research in hypertext systems, which discusses the essential characteristics, features, and problems of hypermedia].

Egenhofer M.J. (1996). Spatial-Query-by-Sketch, in Proceedings of VL'96 (IEEE Symposium on Visual Languages), pp 60-67. [This is an article that both surveys relevant approaches to spatial querying and presents a sketch-based query system].

Foley J.D., van Dam A., Feiner S.K. and Hughes J.F. (1996), Computer Graphics - Principles and Practice, Addison Wesley. [This is a book that presents fundamental principles and applications of computer graphics].

Germs R., Van Maren G., Verbree E., and Jansen F.W. (1999) A multi-view VR interface for 3D GIS, Journal of Computer and Graphics, vol. 23, no 4, pp 497-506. [An article describing an interesting 3D GIS, equipped with a virtual reality based interface].

Laurini, R., and Thompson, D. (1992). Fundamentals of Spatial Information Systems. [A broad, introductory textbook on Geographical Information Systems].

Luttermann, H. and Grauer M. (1999), VRML History: Storing and Browsing Temporal 3D-Worlds, in Proceedings of VRML’99, ACM Press, pp. 153-160. [This paper presents a system supporting temporal navigation, which is based on extensions of VRML].

Lynch, K. (1960). The Image of the City, The Technology Press and Harvard University Press. [The results of a study on the features that people remember in cities, their effect on navigation, and recommendations for urban design].

MacEachren A.M., Boscoe F.P., Haug D. and Pickle L.W. (1998), Geographic visualization: designing manipulable maps for exploring temporally varying georeferenced statistics, in Proceedings of IEEE Symposium on Information Visualization, pp. 87-94. [This paper reports on usability experiments on a prototype system that facilitates data exploration for spatiotemporal analysis].

Mark, D. M., and Frank, A. U. (Eds.). (1990). Cognitive and Linguistic Aspects of Geographic Space: Kluwer Academic Publishers. [An excellent collection of articles discussing research on conceptualization and navigation in geographic space, as well as semiotic, linguistic, and mathematical aspects of spatial representations].

Meisner R.E., Bittner M. and Dech S.W. (1997), Computer animation of remote sensing-based time series data sets, IEEE Transactions on Geoscience and Remote Sensing, Vol. 37, N. 2, pp. 1100-1106. [This is a paper presenting an advanced visual tool for analyzing time series datasets].

Nielsen, J. (1990). Hypertext and Hypermedia. Academic Press. [This is an introductory book describing the history, architecture, and usability of early hypermedia systems].

Nyerges, T. L., Mark, D. M., Laurini, R., and Egenhofer, M. J. (Eds.). (1994). Cognitive Aspects of Human-Computer Interaction for Geographic Information Systems (Vol. 83): Kluwer Academic Publishers. [A collection of articles from an interdisciplinary workshop covering a wide range of related research topics, such as spatial cognition, the design of GIS user interfaces, and cultural influences on the use and meaning of spatial information].

Ouyang M. and Revesz, P. (2000), Algorithms for cartogram animation, in Proceedings of the International Database Engineering and Applications Symposium, pp. 231-235. [This paper compares different algorithms for spatio-temporal animation].

Soomro T.R., Zheng K. and Pan Y. (1999), HTML and Multimedia Web GIS, in Proceedings of the Third International Conference on Computational Intelligence and Multimedia Applications, IEEE, pp. 371-382. [This paper introduces multimedia GIS and their advantages with respect to more traditional systems].
