
Int. J. Human-Computer Studies (1997) 46, 409–441

Content-based visualization for intelligent problem-solving environments

JEFF YOST, ANDREW F. VARECKA AND MICHAEL M. MAREFAT

Department of Electrical and Computer Engineering, University of Arizona, Tucson, AZ 85721, USA. email: marefat@ece.arizona.edu

(Received 15 December 1995 and accepted in revised form 15 October 1996)

Mechanisms are proposed to allow visualization to become an active agent in the problem-solving environment. Two main problems are addressed: dealing with too much data and allowing simulation steering. The agent solves these problems by extracting semantically interesting data for spatial and temporal understanding, enabling a dynamic and flexible behavior for simulation-integrated control, and intelligently creating visualizations that intuitively display the selected data. A significant event language is proposed to capture the semantically interesting data through event expressions which can then be parsed and monitored. Behavior is achieved through programmable, hierarchical finite-state machines with events mapped to the arcs and interactions with the visualization and simulation mapped to the nodes. A knowledge-based data mapping process has been designed which uses visual perceptual knowledge and problem-specific knowledge represented in cognitive maps to create visualizations. This architecture has been applied within a watershed simulation application and a prototype has been developed.

© 1997 Academic Press Limited

1. Introduction

Scientific visualization combines techniques from computer graphics, image processing, computer vision, user-interface studies, and cognitive science to provide a method for understanding large, complex data sets (Kaufman, 1994). Much work has been done in each of these fields, improving the visualization framework. However, little research has been done in studying the visualization framework in terms of how it can be used more effectively and efficiently. More specifically, the visualization framework needs to become an active component in problem solving, and issues related to this have not been completely analysed.

A 1987 report by the ACM special interest group on computer graphics identified two main problems in the domain of visualization. The problems are: dealing with too much data and steering calculation (McCormick et al., 1987; DeFanti & Brown, 1991). Solving the problems defined by the ACM report requires three key processes. First, the appropriate subset of data must be selected for viewing. Brute force searching to determine data of semantic interest is ineffective and inefficient. In this research, a significant event language is developed that allows semantically interesting data patterns or trends to be defined as events. Occurrences of these events can then be automatically recognized. The definition/recognition scheme provides machine assistance in answering questions such as: "When does a certain data pattern or trend occur?" and "Where does a certain data pattern or trend occur?". Second, once the data are selected they must be mapped to graphic parameters so that the data can be displayed.


FIGURE 1. Two-dimensional map of Brown’s Pond data. Elevation, building and road information is color-mapped in the display according to the key.


To efficiently create intuitive, well-designed visualizations, expert visualization assistance can be very helpful. To address this need, a cognitive mapping method is developed to assist the user in the design of visualizations. Through cognitive mapping, well-designed visualizations that best represent the data can be created for the scientist using expert visualization assistance, so that different scenarios can be set up quickly, different "what-if" questions can be asked, and the results visualized efficiently. Third, visualization as an active component in a problem-solving environment needs the capability of simulation steering. By enabling a flexible and modifiable behavior for the problem-solving environment, a flexible control system is designed to allow simulation steering.

The methods outlined here have been applied to the development of an environment for hydrologic/ecologic problem solving and simulation using Geographic Information System (GIS) databases. For example, suppose the simulation is being used to determine the best location for new construction; one can use the methods described here in the analysis of the potential sites under various rainfall conditions. The scenario is shown in Figure 1, where three potential sites, A, B and C, need to be analysed. For instance, the following tasks can be modeled: (1) show all locations having a significant amount of water runoff; and (2) focus the simulation on areas identified as having runoffs and increase simulation resolution. The figure shows altitude and road information for the geographic area near Brown's Pond. The tasks are accomplished by representing the water runoff at each of these sites as significant events, defined by the user through valid language expressions.


FIGURE 2. Three-dimensional display of water runoff. Water runoff and elevation information is displayed. Regions A and C are affected as water flows from area A down to the valley.


Next, reactions to these significant events are specified by the user or through the cognitive-data-mapping method. For this example, cognitive mapping specifies a three-dimensional display to show the occurrence of the water runoff. The second task is achieved by specifying a control behavior to change the resolution and focus of the simulation based on recognized events. Thus, simulations could be run at a low resolution of 100 m² cells over the entire spatial region of the Brown's Pond data. When, in a certain rain scenario, a significant water runoff is detected at point C, the system generates the display shown in Figure 2, and the simulation scenario is changed to focus on the sub-region surrounding point C at a higher resolution of, say, 25 m² cells.

In general, the language allows semantic processing of data which can then be selectively visualized using a graphical presentation plan generated by a cognitive-mapping technique which intelligently maps data to displays based on data characteristics and semantics. The overall framework enables the visualization to become an active and integrated component in simulation-based problem solving. The following summarizes the contributions of this paper in the development of an intelligent visualization agent (IVA); these contributions directly address the three processes discussed earlier.


FIGURE 3. Abstract IVA architecture. The three main components of IVA are the data monitor, reaction executor and automatic graphic planner.


(1) A significant event language that enables specification and recognition of semantic meaning for patterns or trends in data.

(2) A visualization planner that uses a knowledge-based approach to create graphics and displays of data based on perceptual rules and data dependencies.

(3) A flexible and dynamic control structure which enables the system to have a data-driven behavior based on significant events.

1.1. IVA SYSTEM OVERVIEW

IVA consists of three main components, each of which will be discussed in more detail in later sections. First is the data monitor. This component uses a predicate language to define significant events. It then monitors the data for occurrences of the specified events (details of this component are discussed in Section 3). The second component is an automatic graphics planner. Visualization is a valuable tool for better understanding data. However, creating effective visualizations is often a burden for the researcher. The graphics planner uses knowledge of visual communications to create effective and expressive visualizations (details of this component are discussed in Section 4). The last component of IVA is the reaction executor. Once an event is detected by the data monitor, the reaction executor will perform a specified set of tasks. Tasks include creating new displays or altering (steering) the simulation (details of this component are discussed in Section 5). Figure 3 shows an abstract view of IVA's architecture.

2. Related work

Previous research works have addressed the issue of combining computation and visualization into a dynamic environment. One of the earliest works discussed RSYST, which was developed at the University of Stuttgart (Lang, Lang & Ruhle, 1991).


The problem that RSYST addressed was creating an integrated software system that could provide direct feedback of visualized scientific calculations for simulation steering. The approach taken was to create a consistent homogeneous environment. The environment included problem modeling, input generation, simulation and visualization. The problem was solved by creating a modular system where each part of the environment became one or more modules. With a fully integrated system, modules can be put into a loop, interrupted and restarted as needed. Through the interruptions and restarts the simulation is steered as needed.

GRASPARC (Brodlie, Poon, Wright, Brankin, Banecki & Gay, 1993) is another system that integrates computation and visualization. It does so by creating a framework which assists in the management of a set of visualization and computation tools. GRASPARC was designed from the problem-solving point of view. The problem of creating an integrated system is solved through the use of a history tree. This tree records information about the computation process, thereby allowing the computation to be backed up and restarted as needed. The framework was designed so that any numerical and graphical software could be integrated. The main components of the framework that allow the integration of these various systems are the data management system and application managers. The data management system saves an audit trail which is used by the history tree. The application managers interface between GRASPARC and the numerical and graphical systems.

Both of the systems described have goals similar to those of IVA. GRASPARC has the distinct advantage of having the most control over the simulation process. The history tree allows the simulation to be backed up and restarted. One disadvantage is that the application managers have to be programmed for each numerical and graphical package before GRASPARC can be used with the packages. Two areas that are addressed by IVA were not addressed by either GRASPARC or RSYST. First, the problem of data reduction is not addressed. With both GRASPARC and RSYST the user is responsible for searching for and finding the data that might be of interest or that may result in the simulation being steered in another direction. Second, automatic response based on the computation results is not addressed. IVA addresses these issues by specifying an event language. The language enables IVA to highlight events that are interesting to the user. These events are directly associated with computational actions which can exert control over the simulation, automatically, upon detection of an event.

3. Significant events

Many current visualization systems are data-driven (Upson, Faulhaber, Kamins, Laidlaw, Schlegel & Van Dam, 1989; Dyer, 1990; Silicon Graphics Corporation, 1991; Khoros Group, 1992), meaning that displays are created or updated whenever new data are available. However, for large dynamic databases it may not be possible to show all of the data because of display and data passing limitations. Also, large data sets may hinder the scientist's ability to recognize semantically interesting or important information. A method for reducing data is needed. A method based on significant events has been developed to address the issue of data reduction. A significant event is an occurrence or trend in the data that is semantically interesting to the researcher.


By keying in on these events, IVA can provide more useful information to the scientist. The concept of the significant event allows the scientist to look beyond the raw data to extract useful and meaningful information.

The main goal of running a simulation in a problem-solving environment is to set up different scenarios, to be able to ask different "what-if" questions, and to comprehend the database. From this comprehension the user is able to make decisions about the modeled system. For a given scenario certain data patterns or trends are relevant or significant in the outcome and analysis. For example, if a model is being used to determine the best construction site in terms of water runoff and flooding, the occurrence of significant water runoff is relevant and semantically interesting. Significant events can represent the water runoff, and occurrences of the events can be used for graphical communication of pertinent data to the user, and for focusing or steering simulation (parameters or entire scenarios). Semantic events of interest may have spatial dependencies; in terms of our example, the scientist may only be interested in runoff in regions A, B or C, but not at other locations. Thus, the proposed mechanism must allow for such spatial dependencies. Also, not recognizing and/or not analysing all occurrences of such semantically significant data may lead to less-informed decisions. Scientists know that the information of interest in large data sets is usually episodic: it tends to be clumped in space and/or time, and data sets usually have much data of little importance. Relying on the operator's eyes to find occurrences of all semantically important data (in a brute force way) is inefficient, and it may also lead to overlooking some data with semantic importance. The new approach to visualization needs to be able to specify and recognize all semantically important data and intelligently display (graphically) pertinent information to the user. As a result, a great reduction in the amount of data to be displayed is achieved. This intelligent reduction in data allows the visualization to be more effective by disregarding unimportant data.

Enabling the use of significant events requires solutions to two sub-problems: (i) providing mechanisms for defining events and (ii) providing mechanisms for detecting the events (Yost, Marefat & Kim, 1995). Event definitions represent the semantically interesting data, and the detection recognizes the occurrence of the events. The method for defining events should flexibly allow any generic pattern and/or trend in data to be defined as a significant event. The event definition problem is one of developing an appropriate language that has elements for data comparisons and rules that allow event descriptions to be properly formed (Yost, 1995).

3.1. SIGNIFICANT EVENT LANGUAGE

A major contribution of this paper is the development of a significant event language. This language accomplishes two tasks. First, it enables the scientist to specify specific occurrences or trends in the data as significant events. This allows IVA to focus on specific research goals. Second, the language associates the significant event with specific computational actions. The detection of a significant event triggers an automatic response from IVA. The essence of the language is to transform data (raw numbers) into information, an interpretation of the data. The significant event language is similar to spatiotemporal languages. However, the event language is not limited to spatial or temporal queries. Events can be defined within any context.


A language is composed of two parts: syntax and semantics. The syntax defines legitimate expressions and the semantics define the meaning of expressions. The semantics are supported by a vocabulary whose elements are combined to form the semantics of the language expressions. The vocabulary of the significant event language is made of elements for comparing data and constants, matching data, calculating particular properties of the data subset or other pattern recognition operations which are used to represent interesting data patterns or trends. The significant event language is developed as a predicate-calculus-based language without qualifications. Predicate calculus provides a simple yet powerful syntax that is familiar to many. The significant event language consists of predicates, functions, connectives, variables and constants (Charniak & McDermott, 1985), and an event is specified by a formula of predicate terms and connectives combining terms. For example,

AND[GREATER_THAN[20, Average(rainfall)], GREATER_THAN[10, Minimum(rainfall)]]

is a valid event which determines if both the average rainfall and minimum rainfall over a region are above certain values. Average and Minimum are functions that return a numerical value. GREATER_THAN is a predicate that compares its first and second parameters to determine the truth value of the predicate. AND is a logical connective. The exact functions and predicates implemented should depend on the problem domain, and each implementation should be flexible such that new predicates and functions can be added with little programming effort. The basics of each part of the language are discussed below along with some examples based on the prototype developed for application in hydrology/ecology.

Predicates. Examples include

• LESS_THAN (LT)
• LESS_THAN_OR_EQUAL (LE)
• EQUAL (EQ)
• GREATER_THAN (GT)
• GREATER_THAN_OR_EQUAL (GE)

Functions. Basic categories of functions include: statistical, calculus and image processing. Example functions are

• Minimum (Min)
• Maximum (Max)
• Average (Avg)
• Total (Tot)
• Standard_Deviation (Sd)
• Slope
• Rate

The above functions operate as expected, with Slope calculating a spatial derivative and Rate calculating a temporal derivative.

Connectives. AND, OR and NOT are the three connectives that combine predicates in a logical manner.

Variables. Variables are objects that are used in event expressions, such as the rainfall variable in the event example given above. The variables need to be defined separately from the event expression. Within a heterogeneous simulation environment, variables are defined in terms of machine name, file name, spatial area of relevance and temporal period of relevance. The variable specification fields are used to locate the data of the variable domain and to define the domain of the variable. Spatial and temporal specification is necessary since the domain of the variable may be a subset of the entire data map or simulation run time.


Source: Specifies the location of the database and the location of the data within the database.

Spatial information: Variables representing some phenomena generally take place within a certain region; therefore geometric coordinate specifications are required.

Temporal information: Simulation variables are time-dependent, so a starting time/ending time for monitoring as well as a sampling interval may be needed.

An example of a variable specification is

rainfall: source = /data/bpond/rainfall.dat
          ip = 255.255.255.255
          coord = [100, 345, 560, 890, 0, 0]
          Tstart = 120
          Tend = 700

The specification defines a variable, rainfall, that represents a sub-region of the rainfall data found on a machine in the rainfall.dat file. The sub-region is a rectangular region defined by the coordinates (100, 560) to (345, 890). The variable is significant only during the simulation time period of 120–700.
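For illustration, such a variable specification can be represented as a simple record. The following sketch is in Python (the prototype itself was written in C++), and the class and field names are hypothetical, chosen only to mirror the fields of the rainfall example above.

from dataclasses import dataclass

@dataclass
class VariableSpec:
    # Hypothetical record for a significant-event variable (cf. the rainfall example).
    name: str
    source: str      # path of the data file within the database
    ip: str          # machine holding the data
    coord: tuple     # spatial sub-region: (x1, y1, x2, y2, z1, z2)
    t_start: int     # first simulation time step of relevance
    t_end: int       # last simulation time step of relevance

rainfall = VariableSpec(
    name="rainfall",
    source="/data/bpond/rainfall.dat",
    ip="255.255.255.255",
    coord=(100, 345, 560, 890, 0, 0),
    t_start=120,
    t_end=700,
)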

Below are five example events to show some of the characteristics and capabilities of the significant event language:

EVENT1: GT[runoff1, 20.0]
EVENT2: GT[runoff2, 10.0]
EVENT3: GT[runoff3, 15.0]
EVENT4: AND[EVENT1, GE[Avg(rainfall1), 12.0], NOT[EVENT2], NOT[EVENT3]]
EVENT5: AND[EVENT1, LT[Avg(rainfall1), 12.0], OR[EVENT2, EVENT3]]

In the expressions, predicates and connectives are shown in all capital letters with brackets enclosing their arguments. Functions have only their first letter capitalized, with their arguments enclosed in parentheses. Variables are shown in all lowercase. Figure 4 shows the region for which the significant events are defined. Four variables are used in the event definitions. Variables runoff1, runoff2 and runoff3 correspond to water runoff at points p1, p2 and p3, respectively. The variable rainfall1 is defined over the region r1. EVENT1, EVENT2 and EVENT3 represent the occurrences of water runoffs greater than the specified amount. For example, if the water runoff at point p1 is above 20.0 ft³/s, EVENT1 is triggered. EVENT4 and EVENT5 are more complicated events that show the hierarchical and reusable nature of the significant event language. Hierarchy is attained by using previously defined events to define new events. This allows separation of definitions into more primitive parts, making them easier to understand. EVENT4 and EVENT5 are used to isolate the cause of excessive water runoff at point p1. EVENT4 is true if four conditions are met: first, if EVENT1 occurs; second, if the average rainfall is greater than 12 mm/h; third, if runoff does not occur at p2; and fourth, if no runoff occurs at p3. This represents the case of water runoff at point p1 caused by the rainfall in region r1 and not from water runoff at point p2 or p3. In a similar way, EVENT5 represents excessive water runoffs upstream causing the runoff at point p1, rather than rainfall in the region.


FIGURE 4. Region of investigation for example events. Example significant events are in terms of the two-dimensional display. The display shows streams, wetlands and ponds using a color map of orange, green and blue for each water type, respectively. Runoff variables relate to points p1, p2 and p3. Rainfall, infiltration and evapotranspiration variables are defined in reference to region r1.


3.2. EVENT DETECTION

Along with the language, an evaluation scheme is needed to enable the use of significant events. To maintain the logical structure of the expressions as well as to provide a form that can readily be evaluated with the database, a parse-tree structure is developed for event detection. Event expressions are parsed into the tree form, which is then used to evaluate the event.

There are two specialized types of nodes in the parse-tree. Worker objects are used to represent the predicates, and coordinator objects represent connectives. Worker objects execute a task defined by the represented predicate and each returns a Boolean value to its parent node. A coordinator object collects the Boolean values from its children nodes and logically combines them to return a value to its parent node. Leaf nodes of the tree are either constants or variables.

The parse-tree is generated by assigning the first predicate or connective of the event expression to the root node, and then recursively making the parameters of the predicates, connectives and functions children nodes of the parent node. The recursion continues until all leaf nodes are constants or data variables. Evaluating the tree begins by replacing the defined variables with actual data using the specifications shown earlier for such variables. Each branch is processed by invoking the functions and executing the worker and coordinator nodes. Figure 5 shows the parse-trees for two events, EVENT1 and EVENT5, which are defined in the previous section. In Figure 5, coordinator objects are denoted as circles and worker objects as rectangles.
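A minimal sketch of this evaluation scheme is given below, again in Python rather than the authors' C++; the class names, the resolve helper and the evaluation interface are assumptions made for illustration. Worker nodes evaluate predicates over resolved leaf values, coordinator nodes combine the Boolean results of their children, and leaves are constants, variable names, or function-variable pairs.

# Illustrative sketch (assumed names, not the authors' implementation) of
# parse-tree evaluation: worker nodes evaluate predicates, coordinator nodes
# combine Boolean results with AND/OR/NOT, and leaves hold constants or data.

PREDICATES = {
    "GT": lambda a, b: a > b,
    "GE": lambda a, b: a >= b,
    "LT": lambda a, b: a < b,
}
FUNCTIONS = {
    "Avg": lambda xs: sum(xs) / len(xs),
    "Min": min,
    "Max": max,
}

class Worker:
    # Predicate node: returns a Boolean for its two arguments.
    def __init__(self, predicate, left, right):
        self.predicate, self.left, self.right = predicate, left, right
    def evaluate(self, data):
        return PREDICATES[self.predicate](resolve(self.left, data),
                                          resolve(self.right, data))

class Coordinator:
    # Connective node: logically combines the Boolean values of its children.
    def __init__(self, connective, children):
        self.connective, self.children = connective, children
    def evaluate(self, data):
        results = [c.evaluate(data) for c in self.children]
        if self.connective == "AND":
            return all(results)
        if self.connective == "OR":
            return any(results)
        return not results[0]            # NOT takes a single child

def resolve(leaf, data):
    # Replace a leaf with actual data: a constant, a variable, or Function(variable).
    if isinstance(leaf, (int, float)):
        return leaf
    if isinstance(leaf, tuple):          # e.g. ("Avg", "rainfall1")
        func, var = leaf
        return FUNCTIONS[func](data[var])
    return data[leaf]                    # bare variable name

# EVENT4 from the text: runoff at p1, enough rain over r1, and no runoff at p2 or p3.
event4 = Coordinator("AND", [
    Worker("GT", "runoff1", 20.0),                        # EVENT1
    Worker("GE", ("Avg", "rainfall1"), 12.0),
    Coordinator("NOT", [Worker("GT", "runoff2", 10.0)]),  # NOT EVENT2
    Coordinator("NOT", [Worker("GT", "runoff3", 15.0)]),  # NOT EVENT3
])

data = {"runoff1": 22.5, "runoff2": 4.0, "runoff3": 6.1, "rainfall1": [14.0, 13.2, 11.8]}
print(event4.evaluate(data))             # True for this sample data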


FIGURE 5. Parse-tree examples. The parse-trees for EVENT1 and EVENT5 defined above allow the events to be evaluated to true or false.


The first step of evaluation for Figure 5(a) involves replacing the runoff1 data variable with the actual data values. The evaluation of the second parse-tree [Figure 5(b)] is done in the same manner with a few more steps.

The use of semantic events is the first step in creating a problem-solving environment that can deal with large dynamic databases. Important data patterns or trends are defined as significant events, and subsequently the occurrences of these events are detected and recognized, pertinent information is displayed and/or the focus of the simulation is modified. Significant events provide IVA with the ability to selectively display data that are of semantic interest. In Pang (1991) and Neumann et al. (1995), semantic methods for data selection are provided, but the region selection process is almost completely user controlled. They more or less provide a method for highlighting certain data patterns, whereas IVA provides a way of defining, recognizing and extracting the important data automatically.


4. Cognitive data mapping

Cognitive data mapping maps data of interest to graphical presentations. The purpose of scientific visualization is not simply to present data but to explore and gain insight into the analysis and understanding of the data being studied. To assist this aspect, cognitive mapping uses knowledge about visual perception and graphic techniques along with an understanding of data characteristics to generate a graphical representation for the data of interest.

A few other research works have dealt with automated generation of graphical presentations. One of the earliest works was APT (A Presentation Tool) by Mackinlay (1986). APT used a graphical language to describe visualization techniques. The language defined each technique in terms of expressiveness and effectiveness. The expressiveness criteria are used to generate a list of candidate techniques, techniques that are capable of expressing the data variable. The effectiveness criteria are then used to rank the candidate list by how effective each technique is. However, APT's inferencing does not deal with subjective criteria. Also, APT does not address semantic issues in data mapping. Two other systems, AutoVisual by Beshers and Feiner (1993) and Vista by Senay and Ignatius (1994), address automated graphic generation by encoding expressiveness and effectiveness criteria as a series of rules in a knowledge base. The rules are used to guide a depth-first state-space search which maps data to graphics. The difficulty with a depth-first search is that it does not consider all possibilities, so the final mapping cannot be guaranteed to be optimal.

In order to develop effective graphical presentations for visualization, we have used cognitive maps to develop an inferencing engine based on a decision-making paradigm. A cognitive map is a directed graph with the nodes representing the concepts of the decision domain. The concepts are connected by arcs that indicate causal influence between nodes (Axelrod, 1972, 1976). The arcs are valued either positive or negative to show a causally increasing or causally decreasing relationship, respectively. A causal relationship between characteristics and properties increases or decreases the influence a property has on the generated visualization. To make an informed decision about how to map the variables to visualization techniques and presentations, this method based on cognitive maps considers all factors and influences simultaneously to determine the utility of the mapping choices. Based on the utility of the possible mappings, a relative ranking of available techniques is generated. These are subsequently used to construct an optimal final visualization scene based on a desired criterion (scene composition approach).

4.1. COGNITIVE MAPPING IN VISUALIZATION

Each variable that is to be visualized is represented by a cognitive map. The cognitive map is a directed graph consisting of three types of nodes: data type, visual property and visual technique, connected by either an expressiveness or an effectiveness causal arc. The causal arc types correspond to Mackinlay's evaluation criteria. Expressiveness criteria/arcs indicate whether a technique is capable of encoding a data variable. Effectiveness measures how well the visual technique communicates with the user (Mackinlay, 1986). Expressiveness arcs represent quantitative relationships and are weighted +1 for a causally increasing relationship, -1 for a decreasing relationship and 0 for no relationship.


However, effectiveness arcs represent subjective, qualitative properties and therefore these arcs are weighted with real values to capture the subjectivity. These weights represent relative importance as subjectively viewed by the user. Noncausal relationships (no arc present) are weighted as a 0 value. This value is treated in the same manner as any other weight. It can, therefore, be changed according to the user or to the domain the map is being used for. This means that arcs can be added or removed easily for any purpose. Thus, a cognitive map can be tailored to any user or any domain.

The data type and visual property nodes are used to characterize the data variable. The data type indicates the kind of data being visualized. There are five data types: nominal, ordinal, scalar, vector and tensor. A data variable will map to a single data type. For instance, consider a data variable representing temperature. This data variable is a set of numeric values. Therefore, it maps to the data type: scalar. The second type of node used to characterize the data variable is visual property. A visual property node is described by a syntactic structure. The syntactic structure represents the relation mapped by the variable. For example, the temperature variable is mapped vs. longitude and latitude. This relation can be represented by two syntactic structures: (x*y)→z and x*y*z, where x and y are longitude and latitude and z represents the temperature. The first structure shows a two-dimensional plot with z being mapped to a visual characteristic such as color or texture. The second structure is a three-dimensional plot with temperature plotted along the z-axis.

Graphic techniques are characterized by a single node, the visual technique node. Every graphic tool available for visualizing the data variable is represented by its own visual technique node. The visual techniques are grouped together according to syntactic structure. The syntactic structure indicates the type of relation a technique is capable of representing. For instance, a surface diagram has the syntactic structure x*y*z since it is a three-dimensional plot. The pseudocolor image has the syntactic structure (x*y)→z since it is a two-dimensional plot with a third dimension encoded by a color map. The syntactic structure of the visual techniques corresponds to the syntactic structure of the data variable. By using the syntactic structure, the data variable can be mapped to a set of visual techniques. For example, the temperature variable maps to the syntactic structure (x*y)→z. This means that visual techniques with the same syntactic structure could potentially encode the temperature variable.

Figure 6(a) shows the expressiveness arcs for the temperature example. The temperature variable is linked to the data type node and visual property nodes by causally increasing expressiveness arcs. These nodes characterize the data variable. The visual property nodes have causally increasing arcs to all the visual techniques that have the same syntactic structure. The data variable is mapped to visual techniques that can possibly encode it through the visual property node. The data type node also maps to the visual techniques. The causally decreasing link is used to eliminate visual techniques that are mapped to the data variable through the syntactic structure, but are not capable of representing the variable's data type. For example, in Figure 6(a), the data type node is negatively linked to the two-dimensional arrow plot node and the flow ribbon node. These visual techniques are capable of representing the syntactic structure of the temperature variable, but both techniques encode vector data.


FIGURE 6. (a) This cognitive map shows the expressiveness arcs for the temperature variable. The structure of this map indicates which graphic techniques can encode the variable. (b) This cognitive map shows the effectiveness arcs for the temperature variable. Effectiveness arcs are subjective and are defined by the user's preference.


The temperature variable is of type scalar and therefore these techniques should not be used.

Figure 6(b) shows the effectiveness arcs for the same example. One should keep in mind that effectiveness arcs are subjective and qualitative, thus reflecting user preferences and semantic qualities. For this example the pseudocolor image is preferred to the contour plot, a three-dimensional plot is preferred to a two-dimensional plot, and a surface diagram is preferred to the three-dimensional scatter plot. These preferences are reflected in the effectiveness arcs.


An additional concept used to characterize the data variable is the data semantic. The data semantic concept represents what the data mean. Often the meaning of the data will determine the best way to represent it visually. For instance, if the data variable represented water, then the color blue is preferred to green when using a color map. The data semantic can also represent domain-specific preferences. The data semantic concept is not represented as a node in the cognitive map; rather, it is an additional set of effectiveness arcs that are added to the cognitive map when the data semantic concept is asserted.

Cognitive maps can be represented using an adjacency matrix. This matrix is a square n × n matrix, where n is the number of nodes in the map. The value at position (i, j) represents the influence of node i on node j. The value at position (i, j) is the value of the arc between the nodes, or 0, indicating no arc exists between the nodes. Diagonal elements are all zero since a node will not influence itself (Axelrod, 1976; Kosko, 1986, 1993).

From the maps in Figures 6(a) and (b) an adjacency matrix can be generated (Axelrod, 1972, 1976; McKenna, 1980). When constructing the matrix, the data variable itself only excites the cognitive map, and thus it is not an integral part of the map. The matrix for this example is

A = (rows/columns correspond to the nodes in the cognitive map, numbered as listed below)

        1     2     3     4     5     6     7     8     9
   1 [  0     0     0     0     0    -1    -1     0     0 ]
   2 [  0     0     0     1     1     1     0     0     0 ]
   3 [  0  -0.3     0     0     0     0     1     1     1 ]
   4 [  0     0     0     0  -0.5     0     0     0     0 ]
   5 [  0     0     0     0     0     0     0     0     0 ]
   6 [  0     0     0     0     0     0     0     0     0 ]
   7 [  0     0     0     0     0     0     0     0     0 ]
   8 [  0     0     0     0     0     0     0     0     0 ]
   9 [  0     0     0     0     0     0     0  -0.5     0 ]

1. Data type node
2. Visual property node: (x*y)→z
3. Visual property node: x*y*z
4. Visual technique node: pseudocolor image
5. Visual technique node: contour plot
6. Visual technique node: two-dimensional arrow plot
7. Visual technique node: flow ribbons
8. Visual technique node: three-dimensional scatter plot
9. Visual technique node: surface diagram

The rows and columns of the matrix are ordered according to the concept type. There are a total of nine concepts, i.e. data type, visual property and visual techniques, in the example. The first row and column of the matrix correspond to the data type in this example, i.e. scalar. The visual properties (x*y)→z and x*y*z are the next concepts in the matrix; in this example (x*y)→z corresponds to row and column 2 and x*y*z corresponds to row and column 3. The last group of concepts are the visual techniques. In this example, the visual techniques are a pseudocolor image, contour plot, two-dimensional arrow plot, flow ribbons, three-dimensional scatter plot and surface diagram, corresponding to rows and columns 4–9, respectively. The value located at position (i, j) in the adjacency matrix indicates the influence of concept i on concept j.
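As an illustration, the adjacency matrix above can be assembled directly from the arc list of Figure 6. The sketch below uses Python with NumPy (not the authors' implementation); the node names and the arc list are a hypothetical encoding of the arcs described in the text.

import numpy as np

# Node ordering follows the legend above:
# 1 data type (scalar), 2 (x*y)->z, 3 x*y*z, 4 pseudocolor image, 5 contour plot,
# 6 two-dimensional arrow plot, 7 flow ribbons, 8 3-D scatter plot, 9 surface diagram.
nodes = ["scalar", "(x*y)->z", "x*y*z", "pseudocolor", "contour",
         "arrow2d", "ribbons", "scatter3d", "surface"]
index = {name: i for i, name in enumerate(nodes)}

# Expressiveness arcs (+1 / -1) and effectiveness arcs (subjective real weights)
# taken from Figures 6(a) and (b) as described in the text.
arcs = [
    ("scalar", "arrow2d", -1), ("scalar", "ribbons", -1),      # wrong data type
    ("(x*y)->z", "pseudocolor", 1), ("(x*y)->z", "contour", 1),
    ("(x*y)->z", "arrow2d", 1),
    ("x*y*z", "ribbons", 1), ("x*y*z", "scatter3d", 1), ("x*y*z", "surface", 1),
    ("x*y*z", "(x*y)->z", -0.3),          # 3-D plots preferred over 2-D plots
    ("pseudocolor", "contour", -0.5),     # pseudocolor preferred over contour
    ("surface", "scatter3d", -0.5),       # surface preferred over 3-D scatter
]

A = np.zeros((len(nodes), len(nodes)))
for src, dst, weight in arcs:
    A[index[src], index[dst]] = weight    # A[i, j]: influence of node i on node j

print(A)    # reproduces the adjacency matrix shown above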

4.2. INFERENCING USING A COGNITIVE MAP

Inferencing uses the knowledge stored within the cognitive map to map the data variables to a final visualization. The inferencing process for a cognitive map is a series of state transitions (Kosko, 1993).


The initial state contains an assertion of the variables that should be visualized. The final state is a vector containing the utility factors for various visual techniques. A utility factor is a measure of the value of a decision. The utility factors are used to compose the semantically interesting variables into an optimal visualization. States are written in the form of an n-length vector, n being the number of concepts in the cognitive map. State transitions are performed by iteratively multiplying the state vector with an adjacency matrix representation of the cognitive map:

q_{t+1} = q_t A

The iterative process continues until the final state vector becomes zero.

The initial state indicates which concepts are asserted by the data variable through the expressiveness arcs, giving those concepts the value 1. All other concepts should be zero, since they have yet to be asserted by another concept. For the above example the initial state vector is

q_1 = [1 1 1 0 0 0 0 0 0]

For the example, the state transitions of the state vector all become zeros after the fourth iteration.

After the state transitions, each variable of semantic interest will have a utility vector U associated with it. Based on the state vectors which are iteratively generated before the final state vector of 0 is reached, a utility vector is computed which represents the cumulative effects of the cognitive map on each visual technique. This vector contains a utility factor for each visual technique. The utility vector for each variable is calculated by

U = [ Σ_{t=1}^{n} q_t ] C

where C is a matrix containing only ones and zeros and n is the number of state transitions carried out in the iterative process. The nonzero block of C is an identity matrix I of dimension m, where m is equal to the number of visual techniques in the cognitive map. In our example, m = 6 because there are six visual techniques used. The matrix C is used to isolate the utility factors for the visual techniques from the other concepts. The result of the multiplication is the utility vector for the variable, which contains the utility values for each visual technique relative to the specific variable being visualized. For our example, C is

C = (rows correspond to the nine nodes of the cognitive map; columns correspond to the six visual technique nodes, 4–9)

        4   5   6   7   8   9
   1 [  0   0   0   0   0   0 ]
   2 [  0   0   0   0   0   0 ]
   3 [  0   0   0   0   0   0 ]
   4 [  1   0   0   0   0   0 ]
   5 [  0   1   0   0   0   0 ]
   6 [  0   0   1   0   0   0 ]
   7 [  0   0   0   1   0   0 ]
   8 [  0   0   0   0   1   0 ]
   9 [  0   0   0   0   0   1 ]

1. Data type node
2. Visual property node: (x*y)→z
3. Visual property node: x*y*z
4. Visual technique node: pseudocolor image
5. Visual technique node: contour plot
6. Visual technique node: two-dimensional arrow plot
7. Visual technique node: flow ribbons
8. Visual technique node: three-dimensional scatter plot
9. Visual technique node: surface diagram


As previously mentioned, there are four intermediate state vectors q_t in this example. The resulting utility vector is

U = [0.7 0.35 -0.3 0 0.5 1.0]

Individual elements in the resulting utility vector are utility factors, and these utility factors rank each technique according to the effectiveness and expressiveness criteria present in the map. According to this utility vector the best method for representing the temperature variable is a surface diagram, which is the technique that corresponds to the last entry of the utility vector. The next best is a pseudocolor image, corresponding to the first entry, then a scatter plot and then the contour plot. The other two techniques are negative or zero. This indicates that they should not be used. The cognitive map provides a mechanism for all influences to exert themselves, and the result of these influences is seen in the utility factor (in U) for each technique.
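The inference for this example can be reproduced numerically. The sketch below (Python with NumPy, not the authors' implementation) iterates q_{t+1} = q_t A from the initial assertion, accumulates the intermediate state vectors, and projects out the technique concepts with C; up to floating-point rounding it yields the utility vector quoted above.

import numpy as np

# Adjacency matrix A for the temperature example (node order as in the legend above).
A = np.array([
    [0,    0,  0, 0,    0, -1, -1,    0, 0],   # 1 data type: scalar
    [0,    0,  0, 1,    1,  1,  0,    0, 0],   # 2 visual property (x*y)->z
    [0, -0.3,  0, 0,    0,  0,  1,    1, 1],   # 3 visual property x*y*z
    [0,    0,  0, 0, -0.5,  0,  0,    0, 0],   # 4 pseudocolor image
    [0,    0,  0, 0,    0,  0,  0,    0, 0],   # 5 contour plot
    [0,    0,  0, 0,    0,  0,  0,    0, 0],   # 6 two-dimensional arrow plot
    [0,    0,  0, 0,    0,  0,  0,    0, 0],   # 7 flow ribbons
    [0,    0,  0, 0,    0,  0,  0,    0, 0],   # 8 three-dimensional scatter plot
    [0,    0,  0, 0,    0,  0,  0, -0.5, 0],   # 9 surface diagram
])

# Initial state: the temperature variable asserts its data type and visual properties.
q = np.array([1, 1, 1, 0, 0, 0, 0, 0, 0], dtype=float)

total = np.zeros(9)
while np.any(q != 0):          # iterate q_{t+1} = q_t A until the state vector is zero
    total += q
    q = q @ A

# C selects the six visual-technique concepts (nodes 4-9 of the cognitive map).
C = np.vstack([np.zeros((3, 6)), np.eye(6)])

U = total @ C
print(U)   # approximately [0.7  0.35  -0.3  0.  0.5  1.]  -> surface diagram ranks highest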

4.3. COMPOSITION

Composition uses the utility factors calculated previously to assign a specific technique to each variable and combine those techniques into a single scene. When composing a scene, the overall utility of the visualization should be maximized. The choices are evaluated according to the utility level of the final composed scene. The product of the composition process is a scene definition. The complete scene definition is a listing of the variables, the techniques used to represent the variables and the parameters of the visualization techniques. The scene definition becomes a part of the control structure discussed in the next section.

The composition process consists of three steps.

(1) All simple scene definitions are generated.
(2) Techniques are grouped according to graphic composability within each scene definition.
(3) The definition with the highest utility, weighted according to the approach used, is selected and composed.

The first step in the composition process generates simple definitions from the sets of techniques associated with each variable. All combinations of techniques are considered, using one technique from each set. For example, assume a visualization needs to be generated for three variables: elevation, temperature and water concentration. The available techniques are: (1) color map, (2) contour plot, (3) three-dimensional scatter plot and (4) surface diagram. The cognitive mapping process may reveal that all techniques are capable of encoding each of the variables. Assume that the utility vectors for each variable are

elevation: [0.7 0.35 0.5 1]
temperature: [0.7 0.21 -0.79 0.609]
water: [1 0.65 -0.2 0.22].


There are 48 possible scene definitions that can be generated from the four available choices. Examples of possible scene definitions are: [1:1 2:1 3:1], [1:1 2:2 3:3], [1:3 2:4 3:2], [1:4 2:2 3:1], etc. A simple definition is written with the variable number followed by a colon and the technique number. For example, the second definition listed would indicate that elevation would be encoded by a surface plot, temperature by a color map and water concentration by a contour plot.

The second step groups the techniques within scene definitions according to graphic composability. Graphic composition is possible only if the variable relations being composed have the same dimensionality and are defined over the same space. If these conditions are met, then it is possible to combine the individual graphic representations into a single representation that encodes all the relations at once. If the above conditions are not met, it is necessary to create multiple displays in order to show all the variables. Techniques that can be composed are grouped together. A technique that cannot be combined forms another separate group, indicating that it requires a separate display. For example, the definition [1:1 2:1 3:1] is divided into three display groups, since most techniques cannot be composed with themselves in the same display. The modified definition looks like [1:1, 2:1, 3:1]. The techniques in the definition [1:1 2:2 3:3] can all be combined; therefore, the scene definition looks the same, having only one display group.

The third step associates a utility factor with each scene definition. The utility value for a scene definition is calculated from the utility factors of the techniques in the definition, weighted according to one of three approaches. The approaches are as follows.

(1) Maximize variable utility.
(2) Maximize scene utility.
(3) Scene reduction.

The first approach attempts to maximize the utility of each variable. The idea behind this approach is that it is better to use the technique with the highest utility for each variable rather than sacrifice individual utility for overall scene utility. The second approach attempts to maximize the utility of the entire scene. The idea behind this approach is that the utility for individual techniques can be lower than maximum if it means the utility of the overall scene will increase. The third approach attempts to generate as few displays as possible. The idea with this approach is that cluttered scenes are less effective; therefore it is better to have as reduced a scene as possible.

For example, if variable maximization is used then the utility value, called the principal utility factor, is the sum of the utility factors of the techniques that comprise the definition. No additional weight is given for graphic composition. For example, the principal utility factor for the scene [1:4 2:1, 3:1] would be 2.7. Under the other approaches the principal utility factor is different and reflects the number of displays needed. The definition with the highest utility factor, relative to the composition approach used, is selected.
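Under the variable-maximization approach the composition step can be sketched as follows (Python, illustrative only). The utility vectors are those of the example above; the grouping rule is simplified to "identical techniques cannot share a display", a stand-in for the full dimensionality and spatial-domain composability test, and the function names are hypothetical.

from itertools import product

# Utility vectors from the cognitive-mapping step (techniques: 1 color map,
# 2 contour plot, 3 three-dimensional scatter plot, 4 surface diagram).
utilities = {
    "elevation":   [0.7, 0.35,  0.5,  1.0],
    "temperature": [0.7, 0.21, -0.79, 0.609],
    "water":       [1.0, 0.65, -0.2,  0.22],
}
variables = list(utilities)

def principal_utility(definition):
    # Variable-maximization weighting: sum of the chosen techniques' utility factors.
    return sum(utilities[var][tech - 1] for var, tech in zip(variables, definition))

def display_groups(definition):
    # Simplified grouping rule: identical techniques are placed in separate displays
    # (a stand-in for the full dimensionality/spatial-domain composability test).
    groups = []
    for var, tech in zip(variables, definition):
        for group in groups:
            if tech not in [t for _, t in group]:
                group.append((var, tech))
                break
        else:
            groups.append([(var, tech)])
    return groups

# Enumerate simple scene definitions (one technique per variable) and rank them.
definitions = list(product(range(1, 5), repeat=len(variables)))
best = max(definitions, key=principal_utility)

print(best, principal_utility(best))   # (4, 1, 1) with principal utility 2.7
print(display_groups(best))            # [[('elevation', 4), ('temperature', 1)], [('water', 1)]]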

Once a scene definition is selected, the complete scene definition is generated. The complete definition is a display specification that will be used by the graphics engine to construct the scene. In the above example two display specifications are generated. The first scene definition encodes two variables using a surface diagram and a color map. This is specified as a three-dimensional display with the height determined by the elevation variable and the surface color mapped according to temperature.


The second display will show a two-dimensional color map of the water levels. In our implementation the generated specifications look like the following:

two-dimensional display (water level)
three-dimensional display (SURFACE = temperature, HEIGHT = elevation)

These specifications are used by the control structure as discussed in the next section (see Section 5).

5. Flexible control

The previous two sections developed methods for semantically selecting important data and intelligently mapping pertinent or desired data sets to graphical presentations. To effectively use these methods to create an event-driven problem-solving environment, IVA needs a flexible control structure. The control structure needs the capability to be driven by significant events and react by providing appropriate responses. Responses include generating graphical presentations, executing visualization plans developed by the mapping process and/or steering the simulation focus. The control structure is developed through the use of behavior modeling and implemented using programmable, hierarchical finite-state machines (Ehrman, Marefat & Yost, 1994).

5.1. BEHAVIOR

A first step in creating a control structure which can have a complex behavior is to consider behavior from a hierarchical point of view. A behavior can be looked at from various levels of abstraction. Each simple perception-action is a primitive behavior (Draper, Fennema, Rochwerger, Riseman & Hanson, 1994), and a set of primitive behaviors can be combined to create a complex behavior. In terms of the hydrology/ecology application being discussed, a behavior could be reacting to a runoff event through a complex process involving many steps or by a combination of primitive behaviors such as: display a hydrograph for surface runoff at point (200, 200), display a two-dimensional color map of the water level for the whole sub-region, and stop the rain input to the simulation. Each step is a generic primitive behavior (i.e. creating a two-dimensional color map), but the combination of the primitive behaviors results in a complex behavior.

By modeling primitive generic behaviors, and combining instances of these primitive behaviors, the system is able to generate a more complex and desired behavior. Primitive behaviors are needed for every simple action that might be desired. The example in the previous paragraph included a few primitives such as creating a two-dimensional display, creating a hydrograph and stopping the simulation. In general, for the visualization agent, primitives for creating displays, interfacing with the simulation and interacting with the user are needed. Using generic primitives facilitates development of a system with dynamic and modifiable behavior because the primitives can be combined in various ways, creating different behaviors. Once facilities for a set of primitives have been developed, adding to the primitives in the set is straightforward and it can result in an extensible system.


5.2. PROGRAMMABLE, HIERARCHICAL FINITE STATE MACHINE

The control structure is modeled as a finite-state machine (FSM). There are two parts in modeling the behavior as described above using finite-state structures. Events must be mapped to control structure transitions or changes in state, and the actions which involve interacting with the graphics engine and simulation need to be mapped to the states. A formal definition for the FSM (Hopcroft & Ullman, 1979) is

FSM = ⟨Q, Σ, δ, q_0⟩

where Q is the set of all possible states; Σ the set of all possible inputs; δ the transition table; and q_0 the initial state.

IVA uses a specialization of an FSM (as described above) to model behaviors. This control structure enables IVA to work in an intuitive manner. The nodes of the control structure describe the actions the system will be executing when it is in the given state or node. Events are followed by reactions to events, which describe semantically interesting data in visual terms to the user and/or steer the simulation. These reactions are represented and encoded in the nodes of the control structure. Reactions may be composed of several individual actions such as creating individual displays and/or changing simulation parameters. Two types of inputs affect the FSM: first, the data generated through recognized significant events and, second, user input. Automatic transitions are also implemented; they are triggered by primitives that call a next primitive as soon as they complete executing their task. Generic primitives are represented and encoded by scripts. Scripts are used to specify the tasks to be executed by a generic primitive. The script format is similar to the BDL developed by Draper, Fennema, Rochwerger, Riseman and Hanson (1994); the three-dimensional map script, for example, causes a three-dimensional display to be created upon its execution. Appropriate scripts are invoked by the control structure (FSM) when a transition from one state to another takes place. The invoked script is an instance of the primitive and performs the desired actions while that particular FSM state is active.
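A minimal sketch of such a programmable finite-state machine is shown below in Python (the prototype uses C++ with BDL-like scripts); the interfaces are hypothetical. Events and user inputs key the transitions, and entering a state runs the script instances attached to that state.

# Illustrative sketch of a programmable FSM for behavior control (hypothetical
# interfaces, not the authors' implementation). Transitions are keyed by event
# or user input; entering a state runs the scripts attached to that state.

class ProgrammableFSM:
    def __init__(self, initial_state):
        self.state = initial_state
        self.transitions = {}      # (state, event) -> next state
        self.actions = {}          # state -> list of callables (script instances)

    def add_transition(self, state, event, next_state):
        self.transitions[(state, event)] = next_state

    def add_action(self, state, script):
        self.actions.setdefault(state, []).append(script)

    def handle(self, event):
        # Advance on a recognized significant event (or user input) and execute
        # the reaction scripts encoded in the new state.
        key = (self.state, event)
        if key not in self.transitions:
            return                          # event not relevant in this state
        self.state = self.transitions[key]
        for script in self.actions.get(self.state, []):
            script()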

5.3. EXAMPLE

Consider the hydrology/ecology application which has been discussed so far. Three events are of interest. First, a flood that occurs in region 1, which is a housing development. Second, a flood that occurs in region 2, which is uninhabited. Third, the ending of the rain, an event which indicates the simulation needs to be stopped. If the first event occurs, a three-dimensional map of the elevation and water level over region 1 and a two-dimensional map of water levels and housing information over region 1 need to be displayed. If event two occurs, the elevation and water level over region 2 need to be displayed. If event 3 occurs, the simulation is halted. From a user's perspective, the desired behavior is defined by specifying reactions and relating reactions to events. The system uses the user input to generate, automatically, the appropriate FSM capturing the desired behavior. Reactions can be fully specified graphics commands, simulation scenarios or a list of pertinent data variables to be graphically presented. In the third case, the user defines the pertinent data to be shown in response to an event, but the user does not specify graphical details, display types or parameters for presentation.


FIGURE 7. An example user-input specification. The specification is used by the system to create the FSM for control shown in Figure 8.

FIGURE 8. FSM for the example shown in Figure 7.


In that case, the cognitive mapping procedure generates the appropriate graphics commands, which are executed by the graphics engine. Figure 7 shows the user-entered specifications for this particular example. Three event reactions are defined, relating to events e1, e2 and e3. The first is a user-specified visual response. In the second, the user has specified what data should be displayed; the cognitive mapping defines a visual response for this case. The third reaction stops the simulation. Figure 8 shows the resulting FSM, generated automatically by the system, which captures and executes the desired system behavior.
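Continuing the sketch above, the Figure 7 specification could be encoded roughly as follows; the state names and the print statements standing in for the generated graphics and simulation commands are hypothetical.

# Hypothetical encoding of the Figure 7 behavior using the ProgrammableFSM
# class from the sketch in Section 5.2.
fsm = ProgrammableFSM("monitoring")

# Reaction to e1: user-specified displays over region 1.
fsm.add_transition("monitoring", "e1", "react_e1")
fsm.add_action("react_e1", lambda: print("3-D map: elevation + water level, region 1"))
fsm.add_action("react_e1", lambda: print("2-D map: water level + housing, region 1"))

# Reaction to e2: pertinent data only; cognitive mapping supplies the display plan.
fsm.add_transition("monitoring", "e2", "react_e2")
fsm.add_action("react_e2", lambda: print("planner-generated display: region 2 elevation + water"))

# Reaction to e3: stop the simulation.
fsm.add_transition("monitoring", "e3", "halted")
fsm.add_action("halted", lambda: print("simulation halted"))

# Reactions return to monitoring so further events can be handled.
fsm.add_transition("react_e1", "done", "monitoring")
fsm.add_transition("react_e2", "done", "monitoring")

fsm.handle("e1")    # runs the two display scripts for region 1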

6. Implementation and experimentation

6.1. IVA’S ARCHITECTURE

IVA interacts with the database to determine the occurrence of significant events. The interaction between IVA and the simulation allows for simulation steering, as IVA can change simulation parameters or update whole scenarios.


FIGURE 9. Abstract IVA architecture. IVA receives information from the user and then monitors the database for certain data. If the data are found, IVA sends commands to the visualizer or simulator.


By interfacing with the graphics engine, IVA can design and display visualizations. Figure 9 shows the architecture developed for IVA.

The definitions module holds information about the semantically interesting data and about the types of reactions to be executed. If an event occurs, the data monitor sends a message to the reaction executor regarding the event. The reaction executor executes the desired behavior. Reactions consist of creating displays, steering the simulation and changing the structure of IVA's control system or event definitions. High-level commands are sent by the reaction executor to the graphics engine, which creates the appropriate graphics display. The graphics planner uses the described cognitive mapping to design visualizations that intuitively display the data. The planner is invoked if the user has specified the data to be displayed but has not specified the graphics commands to display the data.
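As a rough illustration of this message flow, the following sketch pairs a data monitor with a reaction executor through a simple callback registry. The class names and the notify interface are assumptions made for this example, not IVA's documented API.

#include <functional>
#include <map>
#include <string>
#include <vector>

// Executes the reactions registered for each significant event.
class ReactionExecutor {
    std::map<std::string, std::vector<std::function<void()>>> reactions_;
public:
    void define(const std::string& event, std::function<void()> action) {
        reactions_[event].push_back(std::move(action));
    }
    // Called by the data monitor when an event is recognized.
    void notify(const std::string& event) {
        for (auto& act : reactions_[event]) act();
    }
};

// Watches the incoming data and reports significant events to the executor.
class DataMonitor {
    ReactionExecutor& executor_;
public:
    explicit DataMonitor(ReactionExecutor& e) : executor_(e) {}
    // In IVA this check would evaluate the event parse-trees; a fixed
    // threshold stands in for that here.
    void checkData(const std::map<std::string, double>& averages) {
        if (averages.at("runoff2") > 2.231) executor_.notify("EVENT1");
    }
};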

IVA is an integrated system; however, each component is maintained modularly and individually. Each component of IVA is responsible for maintaining its own structures and tools. Therefore, the automatic graphics planner will store and manage all the cognitive maps needed for data mapping. The data monitor is responsible for the parse-trees used to detect significant events. The FSMs are stored and maintained by the reaction executor.

IVA is developed as a component of a hydrology problem-solving environment. The overall architecture of this problem-solving environment is shown in Figure 10. A discrete event simulation (DEVS) engine is implemented on a CM-5. GIS databases are stored on a Sun Sparc 10 workstation, along with data handler objects which allow access to the data. IVA and the graphics engine have been developed on a Silicon Graphics Indigo2 workstation and are implemented using object-oriented design in C++. Motif, OpenGL and RPC libraries were used to create a client-server system.


FIGURE 10. High-performance computing environment for ecosystem simulation. The HPCC project is built on a heterogeneous platform of computers, exploiting the advantages each machine architecture has to offer.


IVA is the server and graphics displays are clients. Data and control signals are passed between components of the system via PVM (parallel virtual machine).
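As a sketch of how a high-level graphics command might travel from the IVA server to a display client, the fragment below uses the standard PVM 3 C interface (pvm_initsend, pvm_pkstr, pvm_send, pvm_recv, pvm_upkstr). The message tag, the command strings and the two helper functions are assumptions; the paper does not describe the actual message protocol.

#include <pvm3.h>
#include <cstring>

const int GRAPHICS_CMD = 42;  // illustrative message tag, not IVA's

// Server side: pack a command string and send it to a display client.
void sendDisplayCommand(int clientTid, const char* command) {
    pvm_initsend(PvmDataDefault);           // start a fresh send buffer
    pvm_pkstr(const_cast<char*>(command));  // pack the command string
    pvm_send(clientTid, GRAPHICS_CMD);      // deliver it to the client task
}

// Client side: receive commands and hand them to the local graphics engine.
void displayClientLoop(int serverTid) {
    char command[256];
    for (;;) {
        pvm_recv(serverTid, GRAPHICS_CMD);  // block until a command arrives
        pvm_upkstr(command);                // unpack the string
        if (std::strcmp(command, "QUIT") == 0) break;
        // ...interpret the command and update the display...
    }
    pvm_exit();
}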

6.2. EXPERIMENTATION WITH GIS DATA

This example shows the visualization system operating with a GIS database. Brown's Pond is an area in New England, and the data contain information about the region's altitude, water, roads, trees, buildings and soils. The data are represented by a series of 180 × 180 maps with a resolution of 50 m along each map axis. Figure 11 shows an example from these data, showing the roads and elevation data. The goal of the experiment was to investigate the effect of water runoff on bridges and buildings in the area in which they are located.

Figure 12 shows the inputs to IVA. The variables runoff1, runoff2 and runoff3 represent runoff at regions A, B and C in Figure 11. Two events have been defined. EVENT1 is triggered if there is significant water runoff in regions B and C (representing a runoff that takes place by water flow coming from the southwest side of the valley). EVENT2 is triggered if there is significant water runoff in regions A and C (representing runoff from the northeast side of the valley). Figure 13 shows the parse-trees for significant events EVENT1 and EVENT2. If EVENT1 is detected, a two-dimensional color-map showing water runoff over the rectangular region (0, 0)-(180, 180) will be displayed. If EVENT2 is detected, a three-dimensional color-map showing water runoff over the rectangular region (0, 0)-(180, 180), with the altitude mapped to the z-coordinate, will be displayed, and the simulation will be focused on region C with an increase in resolution to 25 m² cells.
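As a minimal sketch of how an event expression such as EVENT1 in Figure 12 (an AND over GT, LT, NOT and Avg terms) could be evaluated against the monitored variables, the following parse-tree evaluator is given. The Node representation and the function names are assumptions for illustration, not IVA's actual parser.

#include <map>
#include <memory>
#include <numeric>
#include <string>
#include <vector>

using DataSet = std::map<std::string, std::vector<double>>;

// One node of an event parse-tree.
struct Node {
    std::string op;      // "AND", "NOT", "GT", "LT", "AVG-VAR", "CONST"
    double value = 0.0;  // used by CONST
    std::string var;     // used by AVG-VAR
    std::vector<std::unique_ptr<Node>> children;
};

double average(const std::vector<double>& v) {
    return v.empty() ? 0.0 : std::accumulate(v.begin(), v.end(), 0.0) / v.size();
}

// Returns a truth value (1/0) for logical nodes and a number for value nodes.
double eval(const Node& n, const DataSet& data) {
    if (n.op == "CONST")   return n.value;
    if (n.op == "AVG-VAR") return average(data.at(n.var));  // Avg(variable)
    if (n.op == "GT")  return eval(*n.children[0], data) > eval(*n.children[1], data);
    if (n.op == "LT")  return eval(*n.children[0], data) < eval(*n.children[1], data);
    if (n.op == "NOT") return eval(*n.children[0], data) == 0.0;
    if (n.op == "AND") {
        for (const auto& c : n.children)
            if (eval(*c, data) == 0.0) return 0.0;
        return 1.0;
    }
    return 0.0;
}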


FIGURE 11. Two-dimensional map of Brown's Pond data. Elevation, building and road information data are contained in the display. A, B and C show regions of interest during the simulation. Events are defined to monitor these areas.


The FSM controlling the system behavior, which is automatically constructed from the input in Figure 12, is shown in Figure 14. The simulation was run in conjunction with IVA for two separate rain scenarios.

6.2.1. Event 1

In the first scenario, rain input was centered in the south central area (the area around point B) of Brown's Pond. As the simulation progresses, the rain model continually updates the runoff data. The water begins flowing into the valley and builds up sufficiently in regions B and C (Figure 11) to trigger EVENT1.

As a result, a two-dimensional color map is generated. This map, which is shown in Figure 15, shows the water runoff in shades of red; the intensity of the color matches the amount of runoff. The runoff flowed down the valley as expected and is largest near the east central region, corresponding to the lowest elevation. The elevation and road data are also shown in the map. The figure shows that both regions B and C are affected, with region C most heavily affected. By using tools provided with the display, exact flooding amounts can be determined. Using this information, an evaluation can be made of whether a new bridge in region B could withstand a rainfall with characteristics similar to the input.


runoff1: source=runoff1.dat  ip=255.255.255.255  coord[30,40,80,90,0,0]  Tstart=0  Tend=200

runoff2: source=runoff2.dat  ip=255.255.255.255  coord[125,135,150,170,0,0]  Tstart=0  Tend=200

runoff3: source=runoff3.dat  ip=255.255.255.255  coord[125,135,100,110,0,0]  Tstart=0  Tend=200

EVENT1: AND[GT[Avg(runoff1),2.32], GT[Avg(runoff2),2.231], NOT[GT[Avg(runoff3),10.0]]];

EVENT2: AND[LT[Avg(runoff1),2.32], GT[Avg(runoff2),2.231], GT[Avg(runoff3),1.12]];

EVENT: EVENT1
SHOW: 2D(runoff(HIGH=red, LOW=black),
         elevation(HIGH=green, LOW=black),
         roads(ROADS=pink, EMPTY=black), 0, 0, 180, 180)

EVENT: EVENT2
SHOW: 3D(SURFACE=runoff(HIGH=red, LOW=black),
         elevation(HIGH=green, LOW=black),
         HEIGHT=elevation, 0, 0, 180, 180)
SIMULATE: (resolution, 25) (region, C)

FIGURE 12. Event and control information. Three variables are defined to represent runoff in areas A, B and C of Figure 11, two events are defined based on the amount of runoff at these areas, and reactions to the two events are defined to specify IVA's behavior.


6.2.2. Event 2

A second experiment was carried out with a different rain scenario. For this case, rain input modeled a rainfall occurring in the northeastern area (the area around point A) of Brown's Pond. During this experiment, with this rainfall scenario, the significant event EVENT2 is detected and a three-dimensional map is created. This three-dimensional map is shown in Figure 16. Elevation is mapped to the z-coordinate of the three-dimensional display, and elevation and water runoff are color-mapped to the surface.


FIGURE 13. Parse-trees of Experiment 2 events. The parse-trees generated are used by IVA to determine if an event has occurred.


The elevation is color-blended between green and black, and runoff is color-blended between blue and black. The water flows from the storm area to the valley. As the runoff nears the valley and the outlet of Brown's Pond, the water builds up and potentially causes flooding, as can be seen from the figure. In this case regions A and C are affected by the rain storm. Again, using the provided graphic tools, the exact runoff amount can be determined for the regions under investigation. Region C is equally affected in both rainfall scenarios.
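The colour blending described here can be sketched as an additive combination of two black-anchored ramps; the struct and function names below are assumptions for illustration, not IVA's rendering code.

// Blend elevation and runoff contributions for one surface cell, as in
// Figure 16: elevation ramps from black to green, runoff from black to blue.
struct Color { float r, g, b; };

// Map a normalized value in [0, 1] onto a black-to-'full' colour ramp.
Color ramp(float t, Color full) { return {full.r * t, full.g * t, full.b * t}; }

Color blendCell(float elevationNorm, float runoffNorm) {
    Color e = ramp(elevationNorm, {0.0f, 1.0f, 0.0f});  // green ramp
    Color w = ramp(runoffNorm,    {0.0f, 0.0f, 1.0f});  // blue ramp
    return {e.r + w.r, e.g + w.g, e.b + w.b};           // additive blend; the renderer clamps to [0, 1]
}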

In addition to creating the display, the reaction executor alters the simulation parameters. When EVENT2 is detected, the simulation region and resolution are changed to focus on region C at a higher resolution. This is an example of simulation steering: the user is able to alter simulation parameters based on the occurrence of significant events.
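In code, a steering reaction of this kind might reduce to something like the following sketch; SimulationInterface and its members are assumed names standing in for the commands IVA sends to the simulation engine.

#include <iostream>
#include <string>

// Assumed steering interface; the real system would forward these as
// commands to the DEVS simulation engine.
struct SimulationInterface {
    void setRegion(const std::string& region) { std::cout << "focus on region " << region << "\n"; }
    void setResolution(double metres)         { std::cout << "cell size " << metres << " m\n"; }
};

// Reaction corresponding to "SIMULATE: (resolution, 25) (region, C)" in
// Figure 12, issued while the simulation is still running.
void steerOnEvent2(SimulationInterface& sim) {
    sim.setRegion("C");
    sim.setResolution(25.0);
}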


FIGURE 14. Control structure for Experiment 2. The FSM determines the response of IVA. Depending on the data, one of two displays will be created.


IVA allows a larger number of scenarios to be simulated more effectively and more efficiently than a brute force search. In the case of the above experiment, many rainfall scenarios can be investigated. The rainfalls can be varied by region, by intensity and by duration in order to study the effect on the land and other structures. Hence, the system can support better analysis and decision-making capability.

6.3. USER EVALUATION

Three researchers in hydrology and GIS tested and evaluated IVA. Each user was presented with the two scenarios described in Section 6.2. The users were asked to evaluate the usefulness of IVA in exploring these scenarios and to compare IVA to traditional research methods. Each user was given a set of questions to address in their evaluations. The results of the testing and evaluations are described in the remainder of this section and are summarized in Figure 17.

The first two tests and questions dealt with the speed and accuracy of IVA. Two events were defined. Each of these events monitored runoff data for significant water levels in specified regions (details of the events are given in Section 6.2). The users were asked to compare the ability of IVA's data monitoring and event detection to traditional methods. Traditional methods to these researchers generally meant a brute force search of the GIS data for characteristics associated with the semantic events.


FIGURE 15. Reaction to EVENT1 of the experiment. Water runoff and elevation information data are displayed. The water flows from the rainfall area down through the valley. Regions B and C are affected by the runoff, with region C most heavily affected. This information may be particularly useful in the design of bridges, roads and buildings.


There was a consensus among the users (as shown by the three user columns in Figure 17) that IVA was faster and more accurate at detecting events. IVA also eliminated subjectivity in what a hydrologic event meant to different researchers by explicitly defining the semantic events.

The next two tests and questions dealt with the event language. The users were shown how to define events, variables and event reactions. They were asked to evaluate whether the language was sufficiently expressive to meet their needs as researchers using IVA and how difficult the language was to use. The users thought the event language was adequate for the needs of most users but that it could be more complete. User 1 also thought the language could be more user-friendly.

The remaining questions dealt with the efficiency of IVA. All users thought that the integrated problem-solving environment made IVA more efficient than traditional methods. Specific reasons cited, in addition to speed and accuracy, were the ability to view events as they occurred in the simulation and the ability to alter the simulation parameters immediately upon detection of an event to refocus the simulation.


FIGURE 16. Reaction to the second rainfall scenario in the experiment. In this case significant event EVENT2 occurs, which generates a three-dimensional display. Water runoff and elevation information are displayed. Regions A and C are affected as water flows from area A down to the valley.


6.3.1. User 1

User 1 is a university professor with extensive experience in computational science, including GIS and visualization. He has performed research in natural resources, hydrology and watershed models for more than a decade.

User 1 described four benefits of IVA. First, IVA provides speed and accuracy in finding interesting events. Second, the event language provides a common basis for defining events, enabling scientists to communicate better with other scientists and nonscientists. Third, the event detection coupled with the reaction executor was valuable for pursuing specific research goals. Last, the automatic graphics planner was cited as providing clear, intuitive visualizations that are very useful for better understanding the data.

User 1 indicated two areas of IVA that need improvement. First, whereas the event detection is valuable for specific goals, it does not provide significant benefit when using IVA for more general observations which do not have defined events. Second, the event language is merely adequate for defining events; it could be made more user-friendly, using natural language and fuzzy definitions.

6.3.2. User 2

User 2 is a Ph.D. candidate with significant research experience in hydrology modeling. He has fairly limited experience with computer programming and visualization.


Questions (response options in parentheses) and responses by User 1, User 2 and User 3:

1. How does the search time for semantically interesting data compare to traditional methods?
   (Faster / Same / Slower)
   User 1: Faster.  User 2: Faster.  User 3: Faster.

2. How does the accuracy in finding interesting data compare to traditional methods?
   (More accurate / Same / Less accurate)
   User 1: More accurate.  User 2: More accurate.  User 3: More accurate.

3. How difficult is it to define events?
   (Not difficult / Adequate / Very difficult)
   User 1: Adequate.  User 2: Not difficult.  User 3: Not difficult.

4. Is the language able to express desired events?
   (Able to express all desired events / Able to express most desired events / Adequate for needs / Unable to express desired events)
   User 1: Adequate for needs.  User 2: Adequate for needs.  User 3: Able to express most desired events.

5. How does research efficiency using significant events compare to traditional methods?
   (More efficient / Same / Less efficient)
   User 1: More efficient.  User 2: More efficient.  User 3: More efficient.

FIGURE 17. Summary of responses of users evaluating IVA. Users' responses indicate that using IVA provides more efficient and accurate research. The area most in need of improvement is the completeness of the significant event language.


User 2 described three benefits of IVA over his experience with traditional methods. First, IVA is fast and accurate at detecting specific events. Second, IVA associates expert opinion with real data and simulation models, which is valuable for decision making. Third, the visualization tool provides a clear description of the data and enables direct comparison of multiple scenarios.

User 2 indicated that the event language needs some improvement. The language provides a good framework for defining events; however, it is not complete. The language should provide simple mechanisms for extending functionality.

6.3.3. User 3

User 3 holds a Ph.D. in renewable natural resources and hydrology. His research focus is hydrologic modeling and simulation. He has moderate experience with programming and visualization.

User 3 stated that the primary benefits of IVA over traditional methods are speed, efficiency and ease of use. Other benefits described by User 3 were, first, the ability to focus on specific questions: the event detection mechanism of IVA enables the user to focus directly on specific research goals. Second, the events are detected as the simulation is running.


FIGURE 18. Users' ratings of IVA vs. their computer experience. All users had similar natural resource and hydrology scientific knowledge. For each experience level, the first bar represents questions 1, 2 and 5 (search time, accuracy in finding data of interest and research efficiency), which all received the same highest-level response; the remaining bars represent questions 3 and 4. The answer axis quantifies the response choices listed with each question in Figure 17: the value 3 represents the highest response (faster, more accurate, etc.), and other responses are quantified relative to 3 and to the number of responses available.


This allows the user to directly associate the system behavior with the simulation model. IVA could also be used as a decision support system.

6.3.4. Summary of experimentation and conclusions

The results of the testing support this paper's assertion that IVA provides a more effective research environment. Figure 18 shows the users' ratings of IVA vs. the users' computer experience. The y-axis shows a quantified value for the users' responses to the test questions. Each of the users' responses was quantified between 0 and 3, relative to the number of responses available. The x-axis is the users' level of computer experience. The first bar in the graph represents the responses to questions 1, 2 and 5. These questions dealt with search time, search accuracy and research efficiency, respectively. All three of these questions had the same highest-level response for all users and are therefore shown as a single bar. The graph shows that IVA provided a faster, more accurate and more efficient research environment, regardless of computer experience. For the less experienced computer users, IVA's significant event language was simple to use. The more experienced user stated that the event language should be made more user-friendly by incorporating natural language conventions. He also tied the complexity of the language to the completeness of the language. All levels of users stated that the language could be more complete.

All users, with varied levels of experience, agreed that IVA was faster, more accurate and more efficient than traditional research methods. Also, IVA's integrated problem-solving environment provides additional capabilities such as automatic visualization generation and simulation steering. The area indicated as most in need of improvement was the significant event language.



The users stated that the language was adequate for meeting researchers' needs for defining events; however, they thought the language needed to be more complete.

7. Summary and conclusions

This paper has presented an integrated problem-solving environment. Such an environment provides the scientist with needed information contained within the data being studied. The focus of IVA is to obtain information and knowledge from raw data and to enable the scientist to understand and use this information to control the simulation. Three developments are made to reach this goal. First, a significant event language is used to extract useful information from large dynamic databases. In such environments more data are generated than can effectively be analysed and understood without machine assistance, and much of the data are semantically insignificant. Significant event definition provides a mechanism for recognizing and displaying semantically important data. Second, an automatic graphics planner is used to create informative visualizations to assist in understanding the data. Even for those who have experience with generating presentations and visualizations, the task can be tedious. A graphics planner was presented that exploits perceptual knowledge about mapping data to graphics, and problem-specific knowledge containing insight about the data, to design visualization plans. Third, a dynamic control structure is introduced to enable the user to use the information provided by the visualization to focus the simulation as needed. The typical paradigm for a simulation is to input a certain scenario, run the simulation and then view the results. If the desired results are not achieved, or the desired resolution of results is not achieved, the simulation is rerun. This approach is not very efficient. A better model allows simulation steering: the state of the simulation can be monitored and the simulation focus adjusted as needed.

The significant event description language developed attaches semantic meaning to data patterns and/or trends. Once specified, data can be searched for and tasks are executed based on semantic data content. Significant event mechanisms allow machine understanding of data, and thus machine assistance in viewing and interpreting data. They allow large dynamic databases to be searched, viewed and analysed more efficiently. Event specification also helps the user focus on a specific research task by highlighting only semantically interesting data. The data set is reduced to information about the task at hand.

The graphics planner uses intelligent visualization techniques to create graphic presentations and takes care of tedious details. Graphical presentation plans are generated based on cognitive maps. Cognitive maps consider subjective and semantic influences to make an optimal quantitative decision about how to represent the data.

Dynamic programmable FSMs are used to control the integrated problem-solving environment. Events are the arcs in these FSMs, and scripts are executed when the nodes of an FSM are activated. Scripts encode the generic tasks to be executed by a primitive. Tasks include creating displays, redefining simulation scenarios, etc. The arcs and nodes of the FSM model a certain desired behavior. This approach allows the behavior of the system to be extensible and modifiable without the need for reprogramming.



References

AGARWAL, S. & MITRA, S. (1994). Specification and automated implementation of coordinated protocols in distributed controls for flexible manufacturing cells. In Proceedings of the 1994 IEEE International Conference on Robotics and Automation, pp. 2877-2882. San Diego, CA.

AXELROD, R. M. (1972). Framework for a general theory of cognition and choice. Ph.D. Thesis, Institute of International Studies, University of California, Berkeley, CA, USA.

AXELROD, R. M. (1976). Structure of Decision. Princeton, NJ: Princeton University Press.

BESHERS, C. & FEINER, S. (1993). AutoVisual: rule-based design of interactive multivariate visualizations. IEEE Computer Graphics & Applications, 13, 41-49.

BRODLIE, K., POON, A., WRIGHT, H., BRANKIN, L., BANECKI, G. & GAY, A. (1993). GRASPARC - a problem solving environment integrating computation and visualization. Proceedings of IEEE Visualization '93, pp. 102-109. San Jose, CA, October.

CHARNIAK, E. & MCDERMOTT, D. (1985). Introduction to Artificial Intelligence. Menlo Park, CA: Addison-Wesley.

DEFANTI, T. A. & BROWN, M. D. (1991). Visualization in scientific computing. Advances in Computing, 33, 247-305.

DRAPER, B., FENNEMA, C., ROCHWERGER, B., RISEMAN, E. & HANSON, A. (1994). Integration for navigation on the UMASS mobile perception lab. Proceedings of AIAA/NASA Conference on Intelligent Robots in Factory, Field, and Service, pp. 473-482. Houston, TX, March.

DYER, D. S. (1990). A dataflow toolkit for visualization. IEEE Computer Graphics & Applications, 10, 60-69.

EHRMAN, J., MAREFAT, M. M. & YOST, J. (1994). Intelligent visualization agents and management of large scale ecosystem simulations. Proceedings of Decision Support 2001, pp. 463-472. Toronto, Canada, September.

GOMI, T. & LAURENCE, J. (1993). Behavior-based AI techniques for vehicle control. Proceedings of IEEE-IEE Vehicle Navigation and Information Systems Conference, pp. 555-558. Ottawa, Canada, October.

HOPCROFT, J. & ULLMAN, J. (1979). Introduction to Automata Theory, Languages, and Computation. Reading, MA: Addison-Wesley.

KAUFMAN, A. E. (1994). Visualization. IEEE Computer, 28, 18-19.

KHOROS GROUP (1992). Khoros User's Manual. Department of ECE, University of New Mexico.

KOSKO, B. (1986). Fuzzy cognitive maps. International Journal of Man-Machine Studies, 24, 65-75.

KOSKO, B. (1993). Virtual worlds as fuzzy cognitive maps. 1993 IEEE Annual Virtual Reality International Symposium, pp. 471-477. Seattle, WA, September.

LANG, U., LANG, R. & RUHLE, R. (1991). Integration of visualization and scientific calculation in a software system. Proceedings of IEEE Visualization '91, pp. 268-273. San Diego, CA, October.

MACKINLAY, J. (1986). Automating the design of graphical presentations of relational information. ACM Transactions on Graphics, 5, 110-141.

MCCORMICK, B. H. et al., Eds. (1987). Visualization in scientific computing. Computer Graphics, 21, 1-14.

MCKENNA, C. K. (1980). Quantitative Methods For Public Decision Making. New York, NY: McGraw-Hill.

NEUMANN, U., YOO, T. S., FUCHS, H., PIZER, S. M., CULLIP, T., RHOADES, J. & WHITAKER, R. (1995). Achieving direct volume visualization with interactive semantic region selection. Technical Report TR91-012, Department of Computer Science, University of North Carolina at Chapel Hill, NC, USA.

PANG, A. (1994). Spray rendering. IEEE Computer Graphics & Applications, 14, 57-63.

SENAY, H. & IGNATIUS, E. (1994). A knowledge-based system for visualization design. IEEE Computer Graphics & Applications, 14, 36-47.

SILICON GRAPHICS CORPORATION (1991). Explorer's User's Guide. Mountain View, CA.

UPSON, C., FAULHABER, T. A., JR., KAMINS, D., LAIDLAW, D., SCHLEGEL, D. & VAN DAM, A. (1989). The application visualization system: a computational environment for scientific visualization. IEEE Computer Graphics & Applications, 9, 30-42.



YOST, J., MAREFAT, M. M. & KIM, J. (1995). Dynamic, simulation-integrated, intelligent visualization: methodology, and applications to ecosystem simulation. Proceedings of 6th IFAC/IFIP Symposium on Man-Machine Systems. Cambridge, MA, July.

YOST, J. (1995). From graphical algorithms to intelligent visualization agents: architecture, methodology, and application to eco-system simulation. Master's Thesis, Electrical and Computer Engineering Department, University of Arizona, Tucson, AZ, USA.

Paper accepted for publication by Associate Editor, Dr. M. Linster.
