Phenomena – A Visual Environment for Querying Heterogeneous Spatial Data

Elsevier Editorial System(tm) for Journal of Visual Languages and Computing

Manuscript Draft

Manuscript Number: JVLC08-19

Title: Phenomena - A Visual Environment for Querying Heterogeneous Spatial Data

Article Type: Full Length Article

Section/Category:

Keywords: Visual environments; visual query languages; geographic information systems; continuous fields; usability evaluation.

Corresponding Author: Dr. Monica Sebillo,

Corresponding Author's Institution: Università di Salerno

First Author: Monica Sebillo

Order of Authors: Monica Sebillo; Luca Paolino; Genoveffa Tortora; Giuliana Vitiello; Robert Laurini

Abstract: The need to perform complex analysis and decision-making tasks has motivated growing interest in Geographic Information Systems (GIS) as a means to compare different scenarios and simulate the evolution of a phenomenon. However, data and function complexity may critically affect human interaction and system performance during planning and prevention activities. This is especially true when the scenarios of interest involve continuous fields in addition to discrete objects.

In the present paper we describe the visual environment Phenomena, where continuous and discrete data may be handled through a uniform approach. We illustrate how users' activity is supported by a visual framework where they can interact with, manipulate and query heterogeneous data with very little training effort. A preliminary experimental study suggests that, when users perform complex tasks, a higher degree of usability may be achieved than with a textual spatial SQL.


Dear Editor-in-Chief, Dr. Chang,

We would like to suggest the following researchers as possible reviewers for the manuscript “Phenomena – A Visual Environment for Querying Heterogeneous Spatial Data,” which we have submitted to JVLC:

1) Angela Guercio, Kent State Univ., USA, [email protected]

2) Erland Jungert, Swedish Defence Research Agency, Linköping, er-[email protected]

3) Alfonso Cardenas, University of California Los Angeles, USA, carde-[email protected]

Thank you for your kind attention. Best regards,

Monica Sebillo, corresponding author


Phenomena – A Visual Environment for Querying Heterogeneous Spatial Data

Luca Paolino¹, Monica Sebillo¹,*, Genoveffa Tortora¹, Giuliana Vitiello¹, Robert Laurini²

¹ Dipartimento di Matematica e Informatica, Università di Salerno, via ponte don Melillo, 84084 Fisciano (SA), Italy, +39 (0)89963324, {lpaolino, msebillo, tortora, gvitiello}@unisa.it

² LIRIS – INSA de Lyon, 69621 Villeurbanne Cedex, France, +33 472438172, [email protected]

*Corresponding Author


1 Introduction

Geographic Information Systems (GIS) are attracting growing interest among environmental experts and territorial organizations because of their ability to manage spatial data [20, 21]. Several application domains may benefit from this capability, which proves crucial for performing complex analysis tasks and supporting decision making. Indeed, providing professionals and researchers with a means to compare different scenarios and simulate the evolution of a phenomenon may significantly support planning and prevention tasks, where data and function complexity affects human interaction and system performance.

Motivation

To reach this aim, the present research faces several challenges. First, the specification of increasingly advanced spatial analysis procedures is a crucial requirement, which arises from the complexity of real-world scenarios. Second, in order to handle the heterogeneity of spatial data, which may refer to both discrete objects and continuous fields, a uniform framework has to be defined in which the specified data structure may be queried in agreement with the standard SQL formalism, properly extended with spatial specifications. Last but not least, much attention has to be devoted to designing systems that effectively support experts' activity through usable yet powerful interfaces, which embed all types of functionality and present them through a uniform approach, thus hiding the inner complexity of data and functions.

As for the first issue, expert user communities are involved in the definition of algorithms and functions to better understand and represent the status of the Earth's surface and the evolution of phenomena related to it. The interest in such activities is also demonstrated by the substantial investments made by European and US organizations in related projects [35].

As for the management of data and the specification of query languages, much work has been done with respect to discrete data. Leading companies offer different solutions for data management and querying, while the literature proposes several approaches to the definition of spatial query languages. In particular, when users need to query spatial information systems to locate or compute spatial data, two main approaches are available. The first lets users discover information by means of programming languages. This is the case of ArcGIS™, which provides users with the ability to navigate and select spatial data by means of specific APIs available for several programming languages such as Visual Basic, Java, etc. [34]. This approach is hard for most users because it requires specific computer science expertise. Alternatively, spatial data may be organized as databases in a specific DBMS and queried through high-level spatial query languages, defined in agreement with the Open Geospatial Consortium specification [37]. This approach is easier than the previous one because it allows users to retrieve information declaratively, through natural language-like sentences. However, it is still a complex solution for most categories of GIS end users, especially for advanced spatial analysis tasks.

A different approach should be considered when dealing with continuous fields. They represent features which are continuously distributed over an area, without a specific extent, such as temperature and other environmental indicators.

The idea of defining a continuous field as an abstract data type involving all the information necessary to simulate continuity was first introduced by one of the authors, Robert Laurini, and Silvia Gordillo in [12]. They defined a framework for modelling continuous fields based on the object-oriented paradigm. Within such a framework, continuous data are managed in terms of samplings, on which a query can be posed in order to build an interpolating function characterizing the resulting field.

Since then, different proposals have been made which take into account several aspects, ranging from the underlying query languages to data visualization, and from architectures aimed at storing samples and geometries of continuous fields to functions managing complex queries.

Finally, as for the design of systems able both to embed spatial data and functionality and to support expert users with an intuitive approach, visual query languages and environments represent a well-recognized solution. Indeed, several studies have shown that visual metaphors are a promising means for allowing unskilled users to query geographic databases and to interpret and possibly reuse recorded queries [28].

Visual environments for discrete data are well established [2-5, 15, 25-26, 28, 30, 33]. Visual query languages are used to associate data and functions with a visual representation which users may manipulate and spatially arrange in order to compose visual queries. The underlying query languages are then automatically invoked to determine the resulting sets. Much work has also been done to develop query-by-example methodologies based on iconic, diagrammatic, graph-based or multi-modal approaches [18, 23]. Additionally, some work on spatial DBMSs allows users to sketch queries on dedicated interfaces [10, 14].

However, an integrated management of continuous and discrete data is still missing in most systems aimed at supporting expert users in analysis and decision making about real-world scenarios. As a matter of fact, increasing attention is being devoted to interpreting the mental model which users adopt when dealing with continuous phenomena. Acquiring samples, applying interpolating functions and computing data require a tailored expertise which goes beyond the common abilities required to manage information systems.

Goals of the present research

The research we have carried out is meant to provide users with a visual environment where heterogeneous data may be handled through a uniform approach. In particular, we aim to support users' activity by means of a framework where they can interact with, manage and query discrete objects and continuous data with very little training effort.

To reach this aim, in [27] we first introduced an extended OpenGeospatial-based architecture able to store continuous data along with their spatial and temporal properties. Moreover, the associated query language extends the OpenGeospatial specification with respect to either the binary or the numeric strategy, depending on the SQL specifications of the underlying DBMS.

In this paper we present Phenomena, a visual environment embedding a visual query language,

which inherits features of the previously defined language and associates visual representations

with its components for discrete objects as well as continuous data manipulation.

Our previous work on visual representations for discrete data [28, 26, 8] forms the basis for the present proposal, together with the research on modelling continuous fields carried out in [12].

The main extension we propose with Phenomena consists in querying a continuous field by manipulating it directly as a geographic datum, characterized by a domain with its geometry and by a function describing its behaviour. This allows us to consider topological relationships as well and, generally speaking, to handle continuous fields and discrete objects in a uniform way. In particular, the visual environment of Phenomena provides users with a uniform style of interaction with the world, which is conceptually modelled as a composition of continuous fields and discrete objects. Fields and data are associated with a visual representation, named geometaphor, able to capture the double nature of geographic data, made up of a geometric component (needed to define spatial properties and relations) and a thematic component (referring to a real-world phenomenon). Query composition is also handled through a visual approach, where a proper visual representation is associated with both spatial and continuous operators. Thus, geometaphors can be combined in order to represent complex events, where phenomena involving both discrete data and continuous fields occur.

The expressive power of Phenomena is enhanced by supplying users with visual support for expressing complex conditions and aggregate functions. Indeed, the difficulty of using a fixed syntax, logical operators and parentheses is overcome by means of two new metaphors, named Condition Tree and Nested Rectangle, which enable users to easily compose either complex predicates for selecting spatial data or complex functions for their computation.

Another important feature of Phenomena is the ability to translate any visual arrangement composed within the environment into sentences compliant with the OpenGeospatial specification for SQL, so that queries may be run on the most common spatial database systems.

The usability of the environment has also been evaluated, in terms of efficacy and user satisfaction, through an experimental study comparing Phenomena against an extension of spatial SQL, the ESSQL language [27].

The paper is organized as follows. In Section 2, we introduce the concept of continuous field, as defined in physics, and describe the formal notation we adopt to structure a continuous field in terms of time, domain and surface. Section 3 provides an overview of the Phenomena environment, describing some particular features of the interface and its organization in terms of components and communication flow. Phenomena components, metaphors and objects are then described in Section 4 by means of a running example concerning fire risk prevention activities. The results of the comparative usability study are reported in Section 5. A discussion of related work is given in Section 6. Some final remarks conclude the paper.

2 Preliminaries

The concept of field originates from physics and describes an entity distributed over a space $A$, whose properties are functions of space coordinates. Formally, a function

$$f : D \subseteq A \rightarrow V \qquad (1)$$

which assigns to every location $s$ belonging to $D$ a unique value $f(s) \in V$, is named a continuous field on $D$. $D \subseteq A$ represents the domain of $f$, and $V$ corresponds to its range of values (or value domain).

In spatial data handling, the previous definition should be customized in order to better represent Earth's phenomena. In such a case, $A$ refers to the Earth's surface and is therefore commonly 2- or sometimes 3-dimensional, $D$ refers to the portion of the Earth's surface under investigation, and the range $V$ is usually a subset of $\mathbb{R}^n$. When $n = 1$, the field is named a scalar field, otherwise it is a vector field. Figure 1 shows an example of a continuous temperature field as computed from weather station measurements, which can be useful for a reliable weather forecast.

<< Figure 1 >>
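To make the idea of a scalar field computed from station samples concrete, the following minimal sketch estimates a temperature value at an arbitrary location by inverse distance weighting. The choice of interpolation method, the function name `idw_temperature` and the station values are assumptions made for illustration only; the text does not state which interpolating function is used.

```python
import math

def idw_temperature(samples, x, y, power=2.0):
    """Estimate a scalar field value at (x, y) by inverse distance weighting.

    samples is a list of (x, y, value) tuples, e.g. station measurements.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for sx, sy, value in samples:
        d = math.hypot(x - sx, y - sy)
        if d == 0.0:
            # The query point coincides with a sample: return it unchanged.
            return value
        w = 1.0 / d ** power
        weighted_sum += w * value
        weight_total += w
    if weight_total == 0.0:
        raise ValueError("at least one sample is required")
    return weighted_sum / weight_total

# Hypothetical stations: (x, y, temperature in degrees Celsius).
stations = [(0.0, 0.0, 30.0), (10.0, 0.0, 34.0), (0.0, 10.0, 32.0)]
print(idw_temperature(stations, 0.0, 0.0))  # value at a station
print(idw_temperature(stations, 5.0, 0.0))  # value between stations
```

In the terms of definition (1), `samples` plays the role of the acquired measurements and the returned function value approximates $f(s)$ for a location $s$ in $D$.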

Besides the previous parameters, it is sometimes necessary to take temporal properties into account as well. Indeed, data acquired over time at the same geographic point can be extremely different, even within a brief temporal range, depending on several factors which cannot be precisely evaluated. As an example, the temporal validity period of a pressure system is a few hours, while a digital terrain model varies over many centuries. Thus, the time variability of phenomena cannot be disregarded if a deeper comprehension of the Earth's phenomena is to be achieved.

Taking into account the previous considerations, a continuous field has been introduced as a structure with three parameters, namely:

$$CF = (T, D, F),$$

where:

- $T$ is the time period when the continuous field representation is valid,

- $D$ is the continuous field domain, and

- $F$ is the function representing the phenomenon surface.

The time parameter is handled as a variable characterizing the temporal evolution of phenomena: each sample is acquired with a temporal value indicating the validity period of the corresponding snapshot.
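The triple $CF = (T, D, F)$ can be sketched as a small data structure. The following is only an illustration under stated assumptions: the class name `ContinuousField`, the representation of $D$ as a bounding box, and the example dates and surface function are all hypothetical and are not taken from the Phenomena implementation.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Callable, Tuple

Point = Tuple[float, float]

@dataclass(frozen=True)
class ContinuousField:
    valid_from: datetime                       # start of the validity period T
    valid_to: datetime                         # end of the validity period T
    domain: Tuple[float, float, float, float]  # D, simplified to (xmin, ymin, xmax, ymax)
    surface: Callable[[Point], float]          # F, the phenomenon surface

    def is_valid_at(self, when: datetime) -> bool:
        return self.valid_from <= when <= self.valid_to

    def contains(self, p: Point) -> bool:
        xmin, ymin, xmax, ymax = self.domain
        return xmin <= p[0] <= xmax and ymin <= p[1] <= ymax

    def value_at(self, p: Point, when: datetime) -> float:
        """Evaluate F at p, but only inside D and within T."""
        if not self.is_valid_at(when):
            raise ValueError("field representation is not valid at this time")
        if not self.contains(p):
            raise ValueError("location is outside the field domain")
        return self.surface(p)

# Hypothetical temperature field valid during one month.
temperature = ContinuousField(
    valid_from=datetime(2008, 8, 1),
    valid_to=datetime(2008, 8, 31),
    domain=(0.0, 0.0, 10.0, 10.0),
    surface=lambda p: 30.0 + 0.5 * p[0],
)
print(temperature.value_at((4.0, 2.0), datetime(2008, 8, 15)))
```

Keeping the validity period next to the domain and the surface function mirrors the observation above that a sample is meaningful only together with its temporal value.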


3 Overview of Phenomena

Most visual environments handling spatial data focus on discrete objects. They allow users to

manage and query geometric and descriptive properties of data through a visual approach, and

display resulting sets in terms of tables, forms and highlighted objects. The underlying query

language may automatically run when visual sentences are built, thus letting users disregard

details about its syntax, as well as the inner complexity of geographic data.

A similar approach may be followed in order to provide expert users with visual support for the management of continuous fields. As a matter of fact, besides efforts related to the specification of data models, structures and functions devoted to data visualization and processing, recent literature has paid increasing attention to the interpretation of the mental model which users adopt when dealing with spatially continuous phenomena such as temperature, population density or surface elevation [12, 16, 17, 27]. Indeed, given the capability of such continuous fields to describe and represent variations across the Earth's surface, it is crucial to understand how people perceive them in order to obtain a more realistic view of the involved phenomena and manage them effectively.

In this Section we describe a visual environment, named Phenomena, embedding a visual

query language, able to handle both discrete and continuous data, with their double descriptive

and spatial nature. Within Phenomena, users may query discrete objects and continuous data

through a uniform approach, which is based on homogeneous visual iconic representations and

on two new metaphors, named Condition Tree and Nested Rectangles, which may be used to

compose complex query conditions and invoke spatial operators, respectively.

Moreover, Phenomena offers the additional capability to associate the structure of frequent or complex queries with visual representations, which may subsequently be selected any time those queries must be invoked, possibly after setting some parameters.

As for the design of the graphical user interface (GUI), we aimed to guarantee high-level graphical access to the system and to provide efficient communication facilities for database querying and analysis. This is a crucial requirement in the fields of GIS and spatial decision support, where the inner complexity of data strongly affects human-computer interaction.

As shown in Figure 2, the interface of the environment is divided into four parts: the Dictionary, containing the iconic representation of the data on which visual queries can be posed, and three interactive working areas, named WHERE, SELECT and PUBLISHED, where users may respectively select objects, elaborate filtered data and display the final visual query. In particular, the WHERE area enables users to visually define the content of the SQL WHERE clause, which expresses the condition of a typical SELECT… FROM… WHERE statement. It contains some basic panels, which allow users to visually build representations of simple conditions, and a special panel, named COMBINE, where those representations may be merged in order to build more complex queries.

<< Figure 2 >>

Similarly, the SELECT working area allows users to build a SELECT clause. In this case, two basic panels provide a means for managing the sets of surface and domain functions, respectively. Moreover, a special panel of the SELECT area, named FUNCTIONS, provides a general visual representation for any class of functions other than domain or surface functions. Their output may be sent either to other function panels, in order to be further processed, or to the PUBLISHED area, which contains the set of composed visual representations that will eventually appear within the final SELECT clause (see Figure 3).

Additional buttons and menus at the top of the interface provide general operations such as loading or removing a panel and saving or running queries.

As for the communication flow, Figure 3 illustrates how Phenomena integrates different components and data through specific panels. In particular, each panel is related to an XML descriptor which contains information about the specific kinds of input and output it may receive and send. As an example, a panel that manages geometric functions such as intersection, union and difference can receive and send only visual elements of type GEOMETRY or SURFACE.

<< Figure 3>>

When users select a visual element from the Dictionary, the system compares the XML representation of the element with the XML descriptor of each panel in order to determine which panels within the SELECT and WHERE areas may receive that visual element. A visual representation of the element is then presented within each of the identified panels for possible manipulation.
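The matching step between a Dictionary element and the panel descriptors can be sketched as a simple type check. This is an illustration only: the descriptors are modelled here as Python dictionaries rather than XML, the panel name "Geometric functions" is a placeholder, and treating POLYLINE and POLYGON as specialisations of GEOMETRY is an assumption; only the accepted types themselves follow the descriptions given in this paper.

```python
# Hypothetical in-memory stand-in for the XML panel descriptors.
PANEL_DESCRIPTORS = {
    "Topological": {"area": "WHERE", "accepts": {"GEOMETRY", "SURFACE"}},
    "Temporal": {"area": "WHERE", "accepts": {"SURFACE"}},
    "Geometric functions": {"area": "SELECT", "accepts": {"GEOMETRY", "SURFACE"}},
}

# Assumption: concrete geometry types count as GEOMETRY for matching purposes.
GEOMETRY_SUBTYPES = {"POINT", "POLYLINE", "POLYGON"}

def receiving_panels(element_type, descriptors=PANEL_DESCRIPTORS):
    """Return the names of the panels whose descriptor accepts the element type."""
    wanted = {element_type.upper()}
    if wanted & GEOMETRY_SUBTYPES:
        wanted.add("GEOMETRY")
    return sorted(
        name for name, descriptor in descriptors.items()
        if descriptor["accepts"] & wanted
    )

print(receiving_panels("SURFACE"))   # a continuous field such as Temperature
print(receiving_panels("POLYLINE"))  # a discrete object such as Roads
```

Under these assumptions a SURFACE element reaches all three panels, while a POLYLINE element is not offered to the Temporal panel.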

For each panel, the available functions and operators may be applied in order to construct subparts of the target visual query, as well as the underlying XML representations. In particular, when a panel of the SELECT area is involved, its output may be forwarded either to the PUBLISHED area, in order to compose the final SELECT clause, or to other panels of the SELECT area, in order to specify more complex visual SELECT clauses. On the other hand, in order to build the WHERE clause, the atomic conditions specified in the different panels of the WHERE area are sent to the COMBINE panel to be visually merged.

Finally, the visual representations within both the PUBLISHED area and the WHERE area

will be put together to generate the corresponding SQL code according to the specifications

given in [27].

A more detailed description of Phenomena functionality is given in Section 4.

4 The Role of the Phenomena Components

In this section we provide a detailed description of the components of the Phenomena visual environment. They are grouped into three main sets, namely the Geometaphor Dictionary, the WHERE working area and the SELECT working area, depending on the role they play in the query composition task. In particular, we focus on the definition of the objects that are manipulated within the environment, and we give a complete description of the visual query language underlying the system.

In order to simplify the comprehension of the single components and illustrate how they work, in the following subsection we describe a scenario that will be used to exemplify Phenomena functionality.

4.1 The Running Example

The following example represents a sufficiently complete scenario to describe the basic and innovative features of Phenomena. It involves both discrete and continuous data, as well as the time parameter.

Users are interested in areas of the Italian Campania region where the likelihood of fires is particularly high during August.

Fire risk mainly depends on three sets of factors:

- fixed environmental elements like slope, exposure and lighting system, and variable elements like temperature, precipitation, humidity, wind, etc.;

- vegetation ground cover and its characteristics, like density, humidity and height;

- human activity in all its forms and its interaction with the environment.

For the sake of simplicity, in the following we consider only four parameters, which seem to be the most important, namely two continuous fields, Temperature and Vegetation, and two discrete objects, Roads and the boundaries of the Campania region. The previous factors may then be interpreted as follows.

Users are interested in looking for areas where:

- the temperature is higher than 35 °C,

- the vegetation status is dry,

- the distance from roads is less than 100 m (in order to consider the possible human influence).


These requirements may be described as a spatial SQL statement according to the specification

proposed in [27, 37], that is:

<<Figure 4>>

In particular, the WHERE clause determines both the roads belonging to the Campania region and the Vegetation samples joined with the Temperature samples acquired during July or August. The elements which satisfy such conditions are then processed by the SELECT clause to extract the areas resulting from the intersection of the buffer zone around the roads, the areas where the temperature is higher than 35 °C and the areas where the vegetation dryness is higher than 65%.
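The three selection criteria can be illustrated outside SQL with a small point-based sketch. It is a deliberate simplification under stated assumptions: fields are reduced to sampled cells with a position in metres, the road is a single polyline, and all names and values (`at_risk`, the cell records, the road coordinates) are hypothetical. It does not reproduce the spatial SQL statement of Figure 4.

```python
import math

def distance_to_segment(p, a, b):
    """Euclidean distance from point p to the segment a-b (planar coordinates)."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    length_sq = dx * dx + dy * dy
    if length_sq == 0.0:
        return math.hypot(px - ax, py - ay)
    # Clamp the projection of p onto the segment to the range [0, 1].
    t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / length_sq))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def distance_to_polyline(p, polyline):
    return min(distance_to_segment(p, a, b) for a, b in zip(polyline, polyline[1:]))

def at_risk(cells, road, temp_threshold_c=35.0, dryness_threshold=65.0, buffer_m=100.0):
    """Return the ids of cells that satisfy all three criteria of the scenario."""
    return [
        cell["id"]
        for cell in cells
        if cell["temperature"] > temp_threshold_c
        and cell["dryness"] > dryness_threshold
        and distance_to_polyline(cell["position"], road) < buffer_m
    ]

# Hypothetical data: one straight road and three sampled cells.
road = [(0.0, 0.0), (1000.0, 0.0)]
cells = [
    {"id": "A", "position": (200.0, 50.0), "temperature": 37.0, "dryness": 70.0},
    {"id": "B", "position": (200.0, 150.0), "temperature": 38.0, "dryness": 80.0},
    {"id": "C", "position": (500.0, 20.0), "temperature": 33.0, "dryness": 90.0},
]
print(at_risk(cells, road))
```

Only cell A is selected here: cell B lies outside the 100 m buffer and cell C is below the temperature threshold.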

4.2 The Geometaphor Dictionary

Following the traditional approach of visual query languages, in Phenomena queries are expressed in terms of spatial arrangements of visual elements, which may represent objects, operators and functions.

As for the objects featuring in the Dictionary area, Figure 5 depicts their underlying structure, which takes into account the complex nature of geographic (discrete or continuous) data by visually integrating an iconic component and a property component. The iconic representation has been chosen for its ability to support users' visual cognitive styles. Indeed, a visual element, named geometaphor, assembles a physical part, corresponding to the graphical aspect, and a meaning, referring to the semantic component. In this way, users may quickly perceive the meaning associated with the objects and use them properly. As for the property component of a Dictionary object, the description of a real-world object is divided into two parts: a type attribute representing the object type used to store data (e.g., POLYLINE for rivers or SURFACE for pollution degree), and a source indicating where data should be retrieved from (e.g., a table or view name, an SQL query or a function).

<<Figure 5>>

As an example of a Dictionary object, let us consider the Campania district geometaphor contained in the CampaniaDistrict layer. It may be described through an icon having an image as its physical representation and “Campania District” as the corresponding meaning. It is POLYGON typed and has the SQL code SELECT * FROM districts WHERE region = 'Campania' as its source. The code shown in Figure 6 describes the XML representation of this example:

<< Figure 6>>
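As a rough illustration of how such a description could be serialized, the sketch below builds an XML document with the four pieces of information named in the text: icon, meaning, type and source. The element and attribute names (`geometaphor`, `physical`, `icon`, `meaning`, `properties`, `type`, `source`) and the icon path are assumptions made for this example and do not reproduce the schema of Figure 6.

```python
import xml.etree.ElementTree as ET

def geometaphor_to_xml(name, meaning, icon_path, data_type, source):
    """Serialize a geometaphor description using hypothetical tag names."""
    root = ET.Element("geometaphor", {"name": name})
    physical = ET.SubElement(root, "physical")
    ET.SubElement(physical, "icon").text = icon_path      # graphical aspect
    ET.SubElement(root, "meaning").text = meaning         # semantic component
    properties = ET.SubElement(root, "properties")
    ET.SubElement(properties, "type").text = data_type    # e.g. POLYGON, SURFACE
    ET.SubElement(properties, "source").text = source     # table, view, query or function
    return ET.tostring(root, encoding="unicode")

xml_text = geometaphor_to_xml(
    name="CampaniaDistrict",
    meaning="Campania District",
    icon_path="icons/campania.png",  # hypothetical path
    data_type="POLYGON",
    source="SELECT * FROM districts WHERE region = 'Campania'",
)
print(xml_text)
```

The same function would describe a continuous field by passing SURFACE as the type and the corresponding query as the source.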

Another example, concerning a continuous field, is Temperature. Its physical representation is an icon, its meaning is “Temperature”, it is SURFACE typed and it has the SQL code SELECT * FROM Temperature as its source. The code shown in Figure 7 describes the XML representation of this example:

<< Figure 7 >>

Having explained the object representation in terms of both geometaphor and XML, we can now describe how users interact with Phenomena when manipulating those objects in order to retrieve and process information from the underlying spatial database.

It is worth noting that, when geometaphors are selected from the Dictionary, each working area receives them, or subparts of them, in order to prearrange its panels for the interaction required by users, which may consist of building a subpart of a query or merging the subparts together. More details about the working areas are given in the following subsections.

4.3 The WHERE Working Area

The reference example described in Section 4.1 embeds the three kinds of conditions that can be expressed in a WHERE clause, namely alphanumeric, spatial and temporal conditions. The requirements specified by users in the example scenario may be translated in terms of conditions as follows:

- roads contained within the Campania region, and

- continuous fields sampled during July and August.

Within the WHERE working area, users may visually represent topological and temporal relationships by interacting with the Topological and Temporal panels, respectively.

In particular, the Topological panel is configured to capture both the geometries (as defined by the OpenGeospatial specification) and the SURFACE data type. Moreover, when the panel receives such components, it draws the geometries of continuous data in green, to distinguish them from the black geometries of discrete objects. Finally, in order to provide users with a deeper comprehension, the iconic representation is also visualized (see Figure 8).

Four geometaphors are dragged from the Dictionary and dropped into the WHERE working

area, namely Temperature, Vegetation, Campania and Roads, which are SURFACE,

SURFACE, POLYGON and POLYLINE typed, respectively.


<< Figure 8 >>

Once geometaphors are positioned in the panel, users may spatially arrange them, in order to

make up the visual representation of the required operation. According to the running example,

Figure 9 shows a visual representation for the overlap operator applied to the geometries

representing the Campania region and the Roads.

<< Figure 9>>

An extensive description of visual representations associated with operators can be found in

[28].

As for temporal conditions, Phenomena offers a direct manipulation panel, named Temporal

panel, where users may compose temporal conditions through simple mouse actions. The panel

can receive only the SURFACE data type.

When it receives a geometaphor component, the validity time periods are set by means of red

bars, named Temporal bars. Figure 10 shows the Temporal panel containing the representation

of validity time periods for Temperature and Vegetation.

<< Figure 10>>

Basically, bars represent the validity time periods of continuous fields, namely the periods during which the sampling of continuous fields may be considered constant. By arranging bars within the panel, users are able to give a visual representation of all the temporal operators. Figure 11 shows the correspondence between temporal operators and visual representations, as defined in [1].

<< Figure 11>>
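The temporal operators of [1] are the thirteen interval relations defined by Allen. A minimal sketch of how the relation between two validity periods can be determined is given below; the function names and the example dates are hypothetical, and the sketch says nothing about how Phenomena itself evaluates the bars.

```python
from datetime import date

def allen_relation(a, b):
    """Return the Allen relation of interval a = (start, end) with respect to b."""
    a0, a1 = a
    b0, b1 = b
    if a0 >= a1 or b0 >= b1:
        raise ValueError("intervals must have start < end")
    if a1 < b0:
        return "before"
    if a1 == b0:
        return "meets"
    if b1 < a0:
        return "after"
    if b1 == a0:
        return "met-by"
    if a0 == b0 and a1 == b1:
        return "equals"
    if a0 == b0:
        return "starts" if a1 < b1 else "started-by"
    if a1 == b1:
        return "finishes" if a0 > b0 else "finished-by"
    if b0 < a0 and a1 < b1:
        return "during"
    if a0 < b0 and b1 < a1:
        return "contains"
    # The only remaining cases are proper overlaps.
    return "overlaps" if a0 < b0 else "overlapped-by"

def intersects(a, b):
    """True when the two intervals share more than a single boundary instant."""
    return allen_relation(a, b) not in {"before", "meets", "after", "met-by"}

# Hypothetical validity period of a temperature sampling versus the month of August.
sampling = (date(2008, 7, 20), date(2008, 8, 10))
august = (date(2008, 8, 1), date(2008, 9, 1))
print(allen_relation(sampling, august))
print(intersects(sampling, august))
```

A condition such as "validity period intersects August" then corresponds to any relation other than the four disjoint ones.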

In the given scenario, we need to select Temperature and Vegetation continuous fields whose

validity time periods intersect July or August. As an instance, in order to create the part of the


query which selects the “temperatures sampled during August”, we can select the period we need. As a result, new temporal bars appear on the panel. Finally, we have to compose the bars as shown by the visual arrangement in Figure 10. This action should also be repeated for July.
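
The temporal condition of the running example can be illustrated with a minimal Python sketch. It assumes that a validity time period is a closed interval of dates; the `Period` type, the `intersects` helper and the year 2002 are hypothetical names and values chosen for illustration, not part of Phenomena. The single `intersects` test lumps together all the interval relations of [1] in which the two periods share at least one instant.

```python
from datetime import date
from typing import NamedTuple


class Period(NamedTuple):
    """Closed time interval [start, end]; a hypothetical stand-in for a Temporal bar."""
    start: date
    end: date


def intersects(a: Period, b: Period) -> bool:
    """True when the two closed intervals share at least one day."""
    return a.start <= b.end and b.start <= a.end


# Illustrative query periods; the year is an arbitrary assumption.
JULY = Period(date(2002, 7, 1), date(2002, 7, 31))
AUGUST = Period(date(2002, 8, 1), date(2002, 8, 31))


def valid_in_july_or_august(validity: Period) -> bool:
    """Condition of the running example: the validity period intersects July or August."""
    return intersects(validity, JULY) or intersects(validity, AUGUST)
```

A field whose validity period runs from 20 June to 5 July would satisfy the condition, whereas one valid only in September would not.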

Once time and space conditions have been created and uploaded into the COMBINE panel of

the WHERE working area, it is possible to combine them by using the Condition Tree, as de-

scribed in the following.

4.3.1 Combining Simple Conditions

One of the most important problems of visual languages for database querying, both spatial and

traditional, is the low expressiveness of visual techniques to represent complex conditions, in-

volving Boolean operators, such as (P1 AND (P2 OR P3)) OR P4.

In this section we present a new visual technique, named Condition Tree, which is used to represent complex logical expressions without any textual input. The Condition Tree supports users in defining complex conditions visually through a tree structure where nodes represent simple conditions, edges along a path represent AND connectors, and edges starting from the same node are connected to each other by OR. For example, let P1, P2, P3 and P4 be simple conditions composed according to the tree shown in Figure 12. Then the elements selected are those satisfying P1 and P2, or P1 and P3, or P4.

<< Figure 12>>

Figure 13 describes the algorithm that translates the tree into an SQL statement.

<< Figure 13>>

Two nested for loops iterate over the root-to-leaf paths and over the single nodes in each path. In the previous example, the algorithm returns the (Root, P1, P2), (Root, P1, P3) and (Root, P4) paths, in sequence.

Once a node reference has been obtained, the system checks whether it is the root; if it is not, its textual representation is appended to the query string. Analogously, if the node is not a leaf, the AND string is concatenated to the query in row 5. Finally, the last statement of the outer loop checks whether the current path is the last one. If not, the OR operator is appended to the query.

In the example, when the algorithm terminates, the textual query (P1 AND P2 OR P1 AND P3 OR P4) is returned. Since AND takes precedence over OR in SQL, no additional parentheses are needed.
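
The translation described above can be sketched in a few lines of Python. This is only an illustrative reconstruction under our own assumptions, not the pseudocode of Figure 13: the `Node` class and the function names are hypothetical, and the root is treated as a placeholder that carries no condition.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Node:
    """A Condition Tree node; `label` is the text of a simple condition."""
    label: str
    children: List["Node"] = field(default_factory=list)


def root_to_leaf_paths(node: Node, prefix: Tuple[Node, ...] = ()) -> List[Tuple[Node, ...]]:
    """Collect every path from `node` down to a leaf, left to right."""
    path = prefix + (node,)
    if not node.children:
        return [path]
    paths: List[Tuple[Node, ...]] = []
    for child in node.children:
        paths.extend(root_to_leaf_paths(child, path))
    return paths


def tree_to_condition(root: Node) -> str:
    """AND the nodes along each path and OR the paths together."""
    paths = root_to_leaf_paths(root)
    query = ""
    for i, path in enumerate(paths):          # outer loop: paths
        for node in path:                     # inner loop: nodes of one path
            if node is root:
                continue                      # the root carries no condition
            query += node.label
            if node.children:                 # not a leaf: more conditions follow
                query += " AND "
        if i < len(paths) - 1:                # not the last path
            query += " OR "
    return query


# Tree of the example: Root -> P1 -> {P2, P3}, Root -> P4
example = Node("Root", [Node("P1", [Node("P2"), Node("P3")]), Node("P4")])
```

Calling `tree_to_condition(example)` yields the string `P1 AND P2 OR P1 AND P3 OR P4`, matching the query given in the text.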

Figure 14 illustrates the application of the Condition Tree metaphor to the outputs produced by the Topological and Temporal panels, defined according to the requirements of the running example.

<< Figure 14>>

4.3 The SELECT Working Area

Once the user has exploited the WHERE working area to set conditions on both the continuous fields and the discrete data of interest, geometaphors can be manipulated through spatial functions in order to produce new information that meets the user’s specific needs. A typical example is the SELECT clause of the SQL statement shown in the reference example,

where the intersection among a buffer zone and two areas selected through specific functions is

required.

To this aim, the SELECT working area in Phenomena provides three panels named DOMAIN, SURFACE and FUNCTIONS, which enable users to apply functions on geometries, functions on surfaces, and spatial aggregation functions, respectively.

Figure 15 shows the application of the Domain buffer function to the road geometaphor in order to isolate the regions lying within 100 meters of any road.
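
To make the meaning of the buffer concrete, the following Python sketch tests whether a point falls inside the 100-meter buffer of a road. It assumes planar coordinates expressed in meters and a road given as a list of vertices; the function names are hypothetical and do not reflect how Phenomena computes buffers internally.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def distance_to_segment(p: Point, a: Point, b: Point) -> float:
    """Euclidean distance from point p to the segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:                      # degenerate segment: a == b
        return math.hypot(px - ax, py - ay)
    # Projection of p onto the line through a and b, clamped to the segment.
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def in_buffer(p: Point, polyline: List[Point], radius: float = 100.0) -> bool:
    """True when p lies within `radius` of at least one segment of the polyline."""
    return any(
        distance_to_segment(p, a, b) <= radius
        for a, b in zip(polyline, polyline[1:])
    )


# Illustrative road: a straight 1 km stretch along the x axis.
road = [(0.0, 0.0), (1000.0, 0.0)]
```

With this road, the point (500, 80) lies inside the buffer, while (500, 120) lies outside it.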

<< Figure 15>>

As for the Surface panel, it may be used to detect continuous field subparts satisfying particular

surface conditions. In the running example, such subparts correspond to regions where tempera-

ture is higher than 35 degrees and vegetation is higher than 65%, respectively. Thus, we have to

apply the functions:

Temperature.getValue(">35"), and

Vegetation.getValue(">65").
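
The effect of these two conditions can be sketched as follows, assuming purely for illustration that each continuous field is sampled on the same regular grid. The grid representation, the sample values and the function name `cells_above` are hypothetical; the paper does not specify how GetValue is evaluated internally.

```python
from typing import List, Set, Tuple

Grid = List[List[float]]
Cell = Tuple[int, int]


def cells_above(field: Grid, threshold: float) -> Set[Cell]:
    """Return the (row, column) cells whose sampled value exceeds the threshold."""
    return {
        (r, c)
        for r, row in enumerate(field)
        for c, value in enumerate(row)
        if value > threshold
    }


# Made-up samples on a 2 x 2 grid, used only to exercise the function.
temperature = [[30.0, 36.0],
               [38.0, 34.0]]
vegetation = [[70.0, 60.0],
              [80.0, 90.0]]

hot = cells_above(temperature, 35.0)    # analogue of Temperature.getValue(">35")
dry = cells_above(vegetation, 65.0)     # analogue of Vegetation.getValue(">65")
both = hot & dry                        # cells satisfying both surface conditions
```

On these made-up samples only the cell (1, 0) satisfies both conditions.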

In order to provide users with a visual approach for this kind of function as well, the basic function metaphor has been implemented within the Surface panel of Phenomena. It resembles a generic 2D function diagram featuring all the critical points of a continuous field, i.e., maximum and minimum points, concave and convex regions, and inflection points, as well as the Gradient and the GetValue functions, as shown in Figure 16.

<< Figure 16>>

According to the principles of direct manipulation, sensitive zones become visible whenever the mouse pointer moves over them. By clicking, the surface function associated with the sensitive zone is selected and its visual representation is shown, together with any parameters. Figure 17 depicts the visual representation resulting from the application of the GetValue function within the running example.

<< Figure 17>>

Referring again to the running example, so far we have distributed the different subparts of the query under construction among several panels of Phenomena. In order to assemble the visual representation of the SELECT clause, the geometries resulting from the Domain and Surface panels should be compared to determine the common subpart satisfying both surface conditions. This step may be performed by using the FUNCTIONS panel, which implements the Nested Rectangle metaphor.

The FUNCTIONS panel is divided into two subparts, that is, a working area containing the ge-

ometaphors involved in the query, and a set of buttons that allow users to select a specific func-

tion. The latter is further divided into two subsets, that is, the property function buttons and the

aggregate function buttons. The first set indicates the property which can be extracted from each

feature table instance (Density, Area, Integral, Surface), whereas the second set indicates how

the properties can be aggregated (MIN, MAX, Mean, Sum, Count, Intersection, Union, Differ-

ence).

Once either a property or an aggregate function is chosen, a black rectangle may be drawn around the geometaphors, which implies that the chosen function will be applied to the included geometaphors. This operation can be repeated in order to use the result of an applied aggregate function as a parameter for a new one, i.e., by nesting the inner rectangle inside the new rectangle as one of its parameters. In terms of SQL syntax, a black rectangle represents a pair of parentheses within an SQL query, which can be applied recursively until the needed result is obtained. It is worth noting that input type errors cannot occur at run time, because input and output types are verified when a function is applied.
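
The correspondence between nested rectangles and parentheses can be sketched as a small recursive rendering routine. The `Rect` class and the `render` function are hypothetical names introduced for illustration, and the sketch models only the nesting itself; it does not reproduce the type and compatibility checks that Phenomena performs.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class Rect:
    """A rectangle drawn in the FUNCTIONS panel: one function applied to its contents."""
    function: str
    items: List[Union[str, "Rect"]]   # geometaphor names or inner rectangles


def render(item: Union[str, Rect]) -> str:
    """Turn a geometaphor name or a (possibly nested) rectangle into function-call text."""
    if isinstance(item, str):
        return item
    arguments = ", ".join(render(inner) for inner in item.items)
    return f"{item.function}({arguments})"


# An inner rectangle applying a property function, wrapped by an aggregate function.
nested = Rect("Sum", [Rect("Area", ["Vegetation"])])
```

Here `render(nested)` produces `Sum(Area(Vegetation))`: each rectangle contributes one pair of parentheses, and the innermost rectangle is evaluated first.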

At present Phenomena performs several checks to prevent users from building incorrect queries. In particular, controls have been provided to avoid that:

more than one geometaphor is selected after selecting a column aggregate function,

just one geometaphor is selected after selecting a row aggregate function,

incompatible property and aggregate functions are selected as a pair, such as Intersection and Density,

an aggregate function contains another aggregate function,

a property function contains another property function, or a property function contains an aggregate function.

Figure 18 shows the FUNCTIONS panel as it appears at the end of visual query construction.

<< Figure 18>>

5 A Comparative Usability Study

In order to evaluate the usability of the proposed environment, we performed a comparative

study of Phenomena against the ESSQL textual query language, which contains operators and

functions to manipulate discrete and continuous data. ESSQL was chosen as the most appropri-

ate comparative language, because the underlying SQL language is the most common query

language, adopted by both expert and novice users.

The experiment was meant to quantify users’ ability to solve some complex tasks by using either ESSQL or Phenomena. To this aim, we conducted two separate studies, targeted at measuring user accuracy and user satisfaction, respectively. Accuracy was evaluated by performing a t-Test analysis meant to provide a quantitative measurement of users’ ability to solve problems with either language, whereas satisfaction was measured by submitting a questionnaire to the involved subjects in order to obtain a qualitative evaluation of their feelings about the software being tested.

We did not measure the time required to solve each problem (efficiency) because it would have

required the use of a tool for managing the ESSQL.

5.1 Independent Variables

The independent variables used to control the experiment were:

1 User skill level (Non-Expert Programmers vs. Expert Programmers )

2 Query language (ESSQL vs. Phenomena)

To select the subjects of our experiment, we randomly recruited 10 students from the first-year course of the Computer Science degree to play the role of non-expert programmers. As for the group of expert programmers, we considered the students with expertise in databases who had passed last term’s exams in Geographic Information Systems, and recruited the 10 students who had gained the highest marks. Both courses are taught at the University of Salerno by one of the authors, the former being a fundamental course of the Computer Science degree, the latter an advanced course. Table 1 describes the groups we identified depending on the skill level and the language used in the experiment. The same tasks were attempted by each group. Within each skill group, 5 subjects used ESSQL and 5 subjects used Phenomena.

<< Table 1 >>

5.2 Environment and Evaluation

While we have a well-established implementation of the Phenomena visual query language, we could not provide the ESSQL group subjects with a tool, because no DBMS currently extends SQL in such a way. Thus, we decided to administer this part of the experiment as a paper and pencil test. This technique has been frequently employed in previous experiments and can be efficiently managed with multiple subjects simultaneously. Each subject was provided with the material required to perform this kind of experiment.

The exam comprised six tasks, which were worded as English sentences. The tasks were presented as follows.

T1. Find out areas where temperature is higher than 20°C.

T2. Find out areas where temperature is lower than 30°C in 2002.

T3. Select the pressure continuous fields overlapping the Italian region of Campania.

T4. Select regions where the pressure continuous field is higher than 100 mb and temperature is lower than 20°C.

T5. Select the areas of the Italian region of Campania where in August the temperature is higher than 35°C and the vegetation is drier than 65%.

T6. Select areas where each point is at most 100 meters from a road in Campania, temperature is higher than 35°C and vegetation is drier than 65% (see Fig. 18).

The experiment began with a training session of 2 hours, during which subjects were instructed on how to compose a complex query by using the language assigned to them. After the training phase, expert and non-expert programmers were asked to perform the tasks. The order in which the tasks were presented was controlled, so that the order of task execution could vary among participants.

5.3 Dependent Variables

The dependent variables that the two studies were meant to evaluate were:

User’s accuracy and

User’s satisfaction.

User’s accuracy is a quantitative measurement of user’s ability to solve problems with a spe-

cific tool, and represents the degree of efficacy of use. User’s satisfaction is instead evaluated

to verify whether a user is encouraged to use the tool.

5.3.1 User’s Accuracy

In order to determine the accuracy of a user’s solution, for each task we considered the most

serious error the user made, and assigned it the score according to the following list:

Score 3: Correct solution which we indicate with C.

Score 2: Essentially correct solution. This category comprises minor data errors D (the data is not supplied completely as required, e.g., only an abbreviation of a county is given when the full name is required), and minor language errors M (e.g., misspellings and punctuation).

Score 1: Partially correct solution. We indicate semantic errors with S (these are valid

queries that would run but produce the wrong answer), and syntactic errors with F

(invalid queries)

Score 0: Query not attempted at all.

When assigning the above scores, we grouped D and M values because we considered that

both kinds of problems are due to distraction rather than to inability. Conversely, S and F

problems are due to more serious troubles.

Then, we summed the scores of each subject and calculated the average of these totals.
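
The scoring scheme can be summarized by the following Python sketch. The error codes C, D, M, S and F come from the list above, whereas the code N for a task not attempted, the function names and the sample outcomes are our own illustrative assumptions.

```python
from typing import Dict, List

# Score attached to each outcome code; "N" (not attempted) is a hypothetical label.
SCORES: Dict[str, int] = {"C": 3, "D": 2, "M": 2, "S": 1, "F": 1, "N": 0}


def task_score(outcomes: List[str]) -> int:
    """Score of one task: only the most serious (lowest-scoring) outcome counts."""
    return min(SCORES[code] for code in outcomes)


def subject_total(tasks: List[List[str]]) -> int:
    """Total score of one subject over all the attempted tasks."""
    return sum(task_score(outcomes) for outcomes in tasks)


def group_mean(subjects: List[List[List[str]]]) -> float:
    """Average of the subject totals within one group."""
    return sum(subject_total(tasks) for tasks in subjects) / len(subjects)


# Invented outcomes for one subject over six tasks (not data from the study).
sample_subject = [["C"], ["D"], ["M", "S"], ["F"], ["N"], ["C"]]
```

For the invented subject above the total is 3 + 2 + 1 + 1 + 0 + 3 = 10, out of a maximum of 18 for six tasks.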

5.3.2 User’s Satisfaction

In order to determine satisfaction, we submitted a questionnaire to the involved subjects to obtain a qualitative evaluation of their feelings about the software being tested. In particular, subjects were asked to answer some questions after completing the tasks. Questions mainly concerned three topics, namely, general reactions to the language used, specific

comments on the performed tasks and on the difficulties encountered, and support achieved

during the query composition.

5.4 Hypotheses

The null hypotheses we tried to reject with this experiment are:

H1: In Phenomena, there are no differences in accuracy based on the user skill level.

H2: There are no differences in accuracy between ESSQL and Phenomena for the P

group of participants.

H3: There are no differences in accuracy between ESSQL and Phenomena for the NP

subjects.

5.5 The Accuracy Evaluation Process

The statistical test employed for this analysis is the standard t-Test, which is widely used for comparing two methodologies that aim to solve the same problem [6, 13, 25, 31]. To test significance, we need to set a risk level. In most social research, the "rule of thumb" is to set the alpha (risk) level at 0.05 [22].

This analysis is appropriate whenever we want to compare the means of two groups, as we do in this study, because the t-Test assesses whether these values are statistically different from each other.
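
As an illustration of the computation, the following Python sketch evaluates the two-sample t statistic with pooled variance, which has n1 + n2 - 2 degrees of freedom. We assume here that this equal-variance form is the one intended by "standard t-Test"; the function name is hypothetical and the two score lists are invented for the example, not taken from the study. The constant 2.306 is the two-tailed critical value of the t distribution for 8 degrees of freedom at the 0.05 level.

```python
import math
from statistics import mean, variance
from typing import Sequence, Tuple


def pooled_t(a: Sequence[float], b: Sequence[float]) -> Tuple[float, int]:
    """Two-sample t statistic with pooled variance and its degrees of freedom.

    Assumes each sample has at least two values and that the pooled
    variance is not zero.
    """
    n1, n2 = len(a), len(b)
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * variance(a) + (n2 - 1) * variance(b)) / df
    standard_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean(a) - mean(b)) / standard_error, df


# Two-tailed critical value for 8 degrees of freedom at alpha = 0.05.
T_CRITICAL_DF8 = 2.306

# Invented total scores for two groups of five subjects (illustration only).
group_a = [12, 14, 13, 15, 11]
group_b = [10, 9, 12, 11, 8]

t_value, degrees = pooled_t(group_a, group_b)
significant = abs(t_value) > T_CRITICAL_DF8
```

With these invented scores the statistic equals 3.0 with 8 degrees of freedom, so the difference of means would be judged significant at the 0.05 level.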

Tables 2, 3 and 4 highlight that, considering the whole set of essentially correct responses, the benefits gained by NP subjects using Phenomena were remarkably higher than the benefits registered for P subjects. In particular, non-expert subjects who performed the

tasks with Phenomena achieved a 40% success rate, whereas those who used ESSQL correctly performed only 16.6% of queries. By contrast, ESSQL-P and Phenomena-P subjects performed similarly, both achieving a success rate of about 70%.

< < Table 2, 3, 4> >

In particular, we observe that the rate of totally correct responses among NP subjects rose from 6.6% of cases for ESSQL to 40% of cases for Phenomena. Results reported in Tables 1 and 2 show that Phenomena prevents minor errors. As we expected, this kind of error disappeared because the syntax of the query language is completely hidden from the user by means of visual representations.

As for incorrect responses (S, F), Table 3 shows a reduction in serious errors. Although the total rate of this kind of error decreased from 70% to 53.3%, S errors rose whereas F errors decreased. This may be explained by the method we used to score the tasks: we consider just the worst error for each task. Hence, many S errors were hidden by F errors in the case of ESSQL-NP. The problem did not occur with Phenomena-NP subjects, who built syntactically valid queries (no F error) in many more cases than ESSQL-NP subjects. However, they could still make semantic errors (S).

Another important difference concerns the number of tasks not attempted, which was approximately halved with Phenomena compared to ESSQL. As a matter of fact, NP-Phenomena subjects gave up on 6.6% of the assigned tasks, whereas NP-ESSQL subjects gave up on 13.3% of them. This is plausibly because visual applications make subjects more confident in solving this kind of problem.

As for the hypotheses we made, some further considerations can be made. Each result was tested at a 0.05 significance level.

H1: The difference between the mean values reported in Table 5, obtained by programmers and non-programmers performing the tasks with Phenomena, is conspicuous (13.8 vs. 10.4). In this sense, we claim that expert programmers were more comfortable performing the Phenomena tasks than non-expert programmers. Moreover, because the p value is 0.012, i.e., less than 0.05, we can state that the difference of means is statistically significant. That is to say, the H1 null hypothesis should be rejected.

The same conclusion follows from the standard table of significance (Table 6): with 8 degrees of freedom (the number of considered subjects minus 2), the absolute value of t (3.24) exceeds the critical value at the 0.05 level, which is 2.306 for a two-tailed test.

< < Table 5 > >

< < Table 6> >

H2: The second hypothesis claims that “There are no differences in accuracy between ESSQL and Phenomena for the P group of participants.” In order to verify this assertion, we analyzed the difference of means between the ESSQL and Phenomena P subjects. The difference is quite close to zero; that is to say, there is no appreciable difference in accuracy when performing the tasks with either ESSQL or Phenomena. Because the p value of this two-tailed t-test is greater than 0.05, we deduce that there was no statistically significant

difference in the means of the two P sub-groups. In other words, we failed to reject the null

hypothesis.

H3: Finally, we analyzed the third hypothesis, i.e., “There are no differences in accuracy between ESSQL and Phenomena for the NP group of participants”. In this case, the difference of means (10.4 vs. 6.6) is in favor of Phenomena, and the t-test gives t = -3.03 and p = 0.01. This implies that we may reject the hypothesis, and in particular we may state that Phenomena yields higher accuracy than ESSQL for non-expert programmers.

Overall, the results suggest that Phenomena improves the ability of non-expert users to perform spatial queries involving both discrete and continuous objects when compared with users who perform the same queries with an SQL-like textual language (H3), while expert users performed equally well with both languages (H2). Moreover, the rejection of H1 indicates that users who have specific skills in the GIS and database fields are more comfortable performing spatial queries than unskilled users.

5.6 The Satisfaction Evaluation Process

In order to monitor user satisfaction, subjects were asked to answer some questionnaires after performing the tasks. Questions mainly concerned three topics, namely general reactions to the language used, specific comments on the performed tasks and on the difficulties encountered, and support received during query composition.

As for the first topic, answers may be divided into four parts according to the external subdivision (Non-Programmers and Programmers) and internal subdivision (ESSQL and Phenomena) of the groups.

Programmers found no difficulty in composing the queries. However, subjects belonging to the Phenomena group observed that the use of metaphors gave them notable support in performing the tasks. As a matter of fact, although textual languages are more concise than visual languages, people generally prefer to compose queries in a visual way rather than in a textual way. According to the answers we received, programmers particularly appreciated that they did not have to devote effort to writing clauses, functions or parentheses correctly.

On the other hand, non-programmers performing the tasks by using ESSQL considered the language hard to use and remember, and reported that they felt uncomfortable in performing the most complex search tasks. By contrast, non-programmers belonging to the Phenomena group observed that the given visual environment encouraged them to carry out the assigned complex tasks, thanks to both adequate feedback and the ability to recover from wrong actions.

In general, answers highlight that subjects preferred Phenomena to the SQL-like language. In particular, they valued both the ease of composing queries and selecting the correct functions, and the awareness of what they were writing at any moment. This benefit was especially appreciated by non-programmers, who felt encouraged to use Phenomena also in performing complete tasks.

6 Related Work

Since their appearance, visual query languages have proved to be one of the most effective methods for reproducing the user’s mental model to discover and manipulate data [28]. Much work has

been done along this line and several proposals have been discussed, also meant to provide

designers with intuitive solutions to embed traditional and object-oriented databases into visual

environments. An excellent survey about visual query languages can be found in [5], where

significant work has been analysed and relevant issues have been outlined for the design of next

generation visual query systems.

The aspect that visual query languages share is the usage of icons, diagrams, graphs and multi-

modal approaches, whose goal is twofold. They can be used to associate data with visual nota-

tions which recall their meaning, and they can be spatially manipulated and arranged to compose

queries, which are then automatically translated into the underlying textual query language.

The independence of visual notations from both the application domain and the nature of data,

led researchers to investigate the applicability of this approach also for spatial databases, where

the inner complexity of data could be overcome by the expressiveness of visual languages. In-

deed, given the need for spatial query languages in several application domains, ranging from

image databases, to remote sensing, to GIS, a visual notation may embed heterogeneous components, such as meaning, geometry, location and properties, into a homogeneous structure expressed in terms of graphics and labels. Moreover, the introduction of 3D icons has allowed users to deal also with scenarios where the third dimension is necessary for representing and analysing data [7].

In recent literature, many text-based spatial query languages have been proposed which for-

malize users’ requirements and concepts about spatial data. Basically, the common feature of all

those query languages has been to extend the standardized SQL query language, by integrating

the SELECT-FROM-WHERE construct with new operators and command sets [9, 19, 32]. In particular, the SQL3 multimedia specification (SQL/MM) [36] and the OpenGIS Simple Features Specification for SQL are considered the two major standards for storage and management of

spatial data. However, such extensions reflect the complexity of geographic data and the adop-

tion of visual notations may simplify the usage of those languages, while retaining the benefits

coming from their expressive power. Moreover, such languages are mainly useful to manage

discrete objects, which are represented by vector structures. Queries can be applied to vector

data in order to both verify topology and perform spatial analysis. On the contrary, continuous

data are represented by raster formats and can be therefore manipulated only through a limited

set of grid-based functions.

An important visual approach to GIS querying is represented by sketch-based visual languages. Basically, they adopt the query-by-example approach, where users draw particular configurations of the spatial elements that the system should be able to interpret. The depicted configuration represents an example of the result that should be displayed. Sketch! was one of the first languages to adopt that approach for composing spatial queries [23]. In Spatial-Query-By-Sketch [10] users interact with a touch-sensitive screen to sketch the example spatial configurations. They can raise or lower the similarity ranking to modify the accuracy threshold for the resulting matches.

Cigales [4] is another sketch based query system for GIS, able to draw visual queries based on

the features and the operations that users select when composing queries on a graphical inter-

face. A weakness of Cigales lies in the possible ambiguity deriving from the multiple visual

representations and interpretations of queries, as well as in the absence of logical operators. A

(limited) attempt to solve the interpretation ambiguities can be found in LVIS introduced as an

extension of Cigales, which was also integrated into a customisable visual environment [2].

VISCO is another visual query system that adopts deductive reasoning to support users in

(geometric and topological) query specification [33]. VISCO uses standard elements to describe

query objects as well as relationships between objects, and spatial operators to derive compound

objects. The query semantics must be deduced through a Description Logic subsumption

mechanism. However, the involved query objects can be associated only with simple thematic

components (e.g., “city”).

In general, sketch and drawing based approaches are suitable for expressing similarity-based queries. However, such methods become cumbersome in the case of composite queries, because it may be difficult to sketch the sample query so that it includes all and only the characterizing elements the user is looking for. Besides, sketch and drawing based approaches rely on the user’s ability to express spatial relationships in a sketch. Indeed, even if some approaches offer support to the user during the drawing phase, exact queries can generally be ambiguous due to the several possible interpretations of the visual configurations [11].

A different approach, very close to our work, was followed with the Spatial Exploration En-

vironment (SEE) [15]. SEE is an integrated framework, that adopts the visual paradigm for

spatial query specification and result visualization, relying on a visual query interface for two-

dimensional spatial data and an underlying visual query system, SVIQUEL, which allows the

specification of topological and directional relationships between objects through direct manipulation. Like Phenomena, the system allows for incremental query specification and refinement, and benefits from the adoption of the direct manipulation paradigm [29] to query spatial data.

However, only relationships between two sets of objects at a time can be specified in SEE, and

no graphical notation is provided to compose complex query conditions and invoke compound

spatial computations, as is the case for Phenomena with the Condition Tree and the Nested

Rectangle metaphors, respectively.

The issue of specifying any complex spatial query conditions connected by Boolean opera-

tors, has been mainly faced with two basic approaches in the literature. In the former, single

conditions are combined just by the AND operator, and the final result indicates all conditions

that must be satisfied to select objects. The visual query language Spatial Query by Sketch is an

example of such an approach. It allows users to sketch a drawing illustrating all the spatial con-

ditions that must be satisfied. As for the second approach, a combination of visual and textual

representations is used. Simple conditions are visually built, while logical operators are applied

through the corresponding buttons. By selecting visual operators and pressing buttons, the cor-

responding textual SQL condition appears, which may be used to check the query construction.

An example of such an approach is the query builder of the desktop ArcGIS component, named

ArcView. A further, purely visual, approach is proposed with Filter/Flow, where users use the pipe metaphor to describe Boolean logic [24]. Each condition acts like a filter on the water flow: if two conditions must be satisfied at the same time (AND), they are placed as a sequence of taps, while if at least one condition must be satisfied (OR), the flow is divided into two minor flows, each of which may be interrupted by a tap representing a condition.

What mainly distinguishes Phenomena from the previous systems is that users may benefit

from direct manipulation of visual elements (i.e., Nested Rectangles) also when performing

complex spatial computational tasks. The latter may require handling discrete as well as con-

tinuous geographic information, making Phenomena an effective visual environment for GIS

management, able to support experts’ activity through usable yet powerful interfaces.

7 Conclusion

In this paper we have presented a visual environment, named Phenomena, where heterogeneous data may be handled through a uniform approach. We have shown that this environment allows users to compose queries that address a wider range of problems involving spatial data because, unlike other approaches, it enables users to visually build conditions on discrete data as well as continuous data.

The intended users of the system are domain experts (e.g., geologists, meteorologists, arc-

haeologists, sociologists, etc.), who accomplish their analyses on their own, supported by a

visual query language. The friendly and easy-to-learn environment provided by Phenomena

allows such users to perform complex computations, also on continuous fields, that have so

far required the intervention of expert computer scientists.

The effectiveness of Phenomena has been illustrated through a running example concerning a typical real-world problem, namely the detection of fire risk zones. Its usability has also been evaluated, in terms of efficacy and user satisfaction, through a comparative experimental study, which suggests that Phenomena compares favorably with ESSQL, an extension of spatial SQL.

We conclude by remarking that further analysis is needed for the usability assessment of Phenomena. Indeed, in the present paper, we have provided a preliminary empirical evaluation, which has relied on the observation of two groups of subjects, chosen among Computer Science students, and on a set of six significant tasks. The results of the comparative analysis

are quite encouraging. Nevertheless, we are aware that the involvement of students may

threaten the external validity of the experiment and that in order to achieve a more reliable

usability validation, the experiment should be replicated, involving real GIS domain experts,

with different skills in programming. The real world GIS applications we are developing in

the context of a scientific cooperation with the Agriculture Council Department of Campania

Region, will provide a suitable experimental environment for future analyses.

References

[1] J. F. Allen, Maintaining knowledge about temporal intervals, Communications of the ACM 26 (1983) 832-843.

[2] M.A. Aufaure-Portier, C. Bonhomme, A High Level Visual Language for Spatial Data Manage-

ment, in: Lecture Notes in Computer Science 1614, Springer, Amsterdam, the Netherland, 1999,

325-332.

[3] D. Blaser, M. Egenhofer, A Visual Tool for Querying Geographic Databases, in: Proceedings of

the 5th Conference on Advanced Visual Interfaces, ACM Press, Palermo, Italy, 2000, 211-216.

[4] D. Calcinelli, M. Mainguenaud, Cigales, a Visual Query Language for a Geographical Informa-

tion System: the User Interface, Journal of Visual Languages and Computing 5(2) (1994) 113-

132.

[5] T. Catarci, M.F. Costabile, S. Levialdi, C. Batini,Visual Query Systems for Databases: a Survey.

Journal of Visual Languages and Computing 8(2) (1997) 215-260.

[6] T. Catarci, What Happened When Database Researchers Met Usability. Information System

25(3) (2000) 177-212

http://www.sigmod.org/dblp/db/journals/is/is25.html#Catarci00

http://www.sigmod.org/dblp/db/journals/is/is25.html#Catarci00

39

[7] V. Del Fatto, L. Paolino, F. Pittarello, M. Sebillo, G. Vitiello, WebMGISQL 3D – Iterating the Design Process passing through a Usability Study, in: Proceedings of the 20th BCS HCI Group conference, London, September 11-15, 2006, 69-73.

[8] V. Del Fatto, L. Paolino, F. Pittarello, A usability-driven approach to the development of a 3D web-GIS environment, Journal of Visual Languages and Computing, Special Issue on Visual Languages and Techniques for Human-GIS Interaction, 18(3) (2007) 280-314.

[9] M. Egenhofer, Spatial SQL: A Query and Presentation Language, IEEE Transactions on Knowledge and Data Engineering 6(1) (1994) 86-95.

[10] M. Egenhofer, Query Processing in Spatial Query by Sketch, Journal of Visual Languages and Computing 8(4) (1997) 403-424.

[11] F. Ferri, E. Pourabbas, M. Rafanelli, The syntactic and semantic correctness of pictorial configuration query geographic databases by PQL, in: Proceedings of the 17th ACM Annual Symposium on Applied Computing (ACM SAC2002), Madrid, Spain, 2002, 432-437.

[12] S. Gordillo, R. Laurini, Field Orientation for Continuous Spatio-temporal Phenomena, in: Proceedings of International Workshop on Emerging Technologies for Geo-Based Applications, Ascona, Switzerland, 2000, 77-101.

[13] D. P. Groth, An Evaluation of a Rule-Based Language for Classification Queries, LNCS 3392 Applications of Declarative Programming and Knowledge Management, Springer, Berlin/Heidelberg, 2005, 79-97.

[14] V. Haarslev, M. Wessel, Querying GIS with Animated Spatial Sketches, in: Proceedings of IEEE Symposium on Visual Languages, Capri, Italy, 1997, 201-208.

[15] S. Kaushik, E.A. Rundensteiner, SEE: A Spatial Exploration Environment Based on a Direct-Manipulation Paradigm, IEEE Transactions on Knowledge and Data Engineering 13(4) (2001) 654-670.

[16] K. K. Kemp, Fields as a framework for integrating GIS and environmental process models. Part one: Representing spatial continuity, Transactions in GIS 1(3) (1997) 219-234.

[17] K. K. Kemp, Fields as a framework for integrating GIS and environmental process models. Part two: Specifying field variables, Transactions in GIS 1(3) (1997) 235-246.

[18] Y.C. Lee, F.L. Chin, An Iconic Query Language for Topological Relationships in GIS, International Journal of GIS 9(1) (1995) 25-46.

[19] H. Lin, B. Huang, SQL/SDA: A Query Language for Supporting Spatial Data Analysis and Its Web-Based Implementation, IEEE Transactions on Knowledge and Data Engineering 13(4) (2001) 671-682.

[20] C.P. Lo, A.K.W. Yeung, Concepts and Techniques of Geographic Information Systems, Prentice Hall (2002) ISBN 0130804274.

[21] P.A. Longley, M.F. Goodchild, D.J. Maguire, D.W. Rhind, Geographic Information Systems and Science, John Wiley (2001) ISBN 0471495212.

[22] W. Mendenhall, R. J. Beaver, B. M. Beaver, Introduction to Probability and Statistics, 12th edition, Duxbury Press, 2002.

[23] B. Meyer, Beyond Icons: Towards new metaphors for visual query languages for spatial information systems, in: Proceedings of International Workshop on Interfaces to Database Systems (IDS 92), Glasgow, UK, 1992, 113-135.

[24] A.J. Morris, A.I. Abdelmoty, B.A. El-Geresy, C.B. Jones, A Filter Flow Visual Querying Language and Interface for Spatial Databases, GeoInformatica 8(2) (2004) 107-141.

[25] N. Murray, N.W. Paton, C.A. Goble, J. Bryce, Kaleidoquery—A Flow-based Visual Language

and its Evaluation, Journal of Visual Languages and Computing 11(2) (2000) 151-189.

[26] L. Paolino, F. Pittarello, M. Sebillo, G. Tortora, G. Vitiello, WebMGISQL - A 3D Visual Environment for GIS Querying, in: Proceedings of International Conference on Visual Languages and Computing (VLC’03), Miami, Florida, USA, 2003, 294-299.

[27] L. Paolino, M. Sebillo, G. Tortora, G. Vitiello, Extending the OpenGIS© for Managing Discrete and Continuous Time Dependent Data, in: The European Information Society Leading the Way with Geo-information, Lecture Notes in Geoinformation and Cartography, Springer, Berlin, 2007, 265-286.

[28] M. Sebillo, G. Tortora, G. Vitiello, The Metaphor GIS Query Language, Journal of Visual Languages and Computing 11(4) (2000) 439-454.

[29] B. Shneiderman, Designing the User Interface – Strategies for Effective Human-Computer Interaction, Addison-Wesley, 3rd edition, 1998.

[30] K. Silvervarg, E. Jungert, A Visual Query Language for Uncertain Spatial and Temporal Data, in: Proceedings of the Conference on Visual Information Systems 2005 (VISUAL 2005), Amsterdam, The Netherlands, July 2005, 163-176.

[31] M. Yi-Miin Yen, R. W. Scamell, A Human Factors Experimental Comparison of SQL and QBE, IEEE Transactions on Software Engineering 19(4) (1993) 390-409.

[32] F. Wang, J. Sha, H. Chen, S. Yang, GeoSQL: A Spatial Query Language of Object-oriented GIS, in: Proceedings of 2nd International Workshop on Computer Science and Information Technologies (CSIT'2000), Ufa, Russia, 2000, 215-219.

[33] M. Wessel, V. Haarslev, VISCO: Bringing Visual Spatial Querying to Reality, in: Proceedings

of the IEEE Symposium on Visual Languages, Nova Scotia, Canada, 1998, 170-177.

[34] ArcGIS http://www.esri.com

[35] Seventh Research Framework Programme (FP7), programme on 'Cooperation' - Thematic area "Environment". http://cordis.europa.eu/fp7/cooperation/environment_en

[36] SQL Multimedia and Application Packages (SQLMM, Part3: Spatial), ISO Working

[37] The Open Geospatial Consortium, Inc.® (OGC) http://www.opengeospatial.org/

Figure 1. Temperature continuous field in Ethiopia in 1995 (delimited within the dark grey area).

Figure 2. The Phenomena environment.

Figure 3. Communication Panel flow.

SELECT Intersect(Vegetation.getValue(value > 65), Temperature.getValue(value > 35), Roads.buffer(100))
FROM Temperature, Vegetation, Roads, Campania
WHERE Roads.overlaps(Campania) AND Temperature.during(Vegetation) AND (Temperature.during(August) OR Temperature.during(July))

Figure 4. The reference example code as implemented through the Extended Spatial SQL.

Figure 5. A schematic description of a Dictionary object

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE objects SYSTEM "objects.dtd">
<objects>
  <type>Polygon</type>
  <img>image\districts.gif</img>
  <meaning>Campania District</meaning>
  <source>SELECT geometry FROM CampaniaDistrict WHERE region = Campania</source>
  <description />
</objects>

Figure 6. The XML description of the Campania district Dictionary object.
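As an illustration of how such a description might be consumed, the following Python sketch reads the fields of a Dictionary object with the standard library parser. It is only an example under stated assumptions: the helper name load_dictionary_object is hypothetical, the XML declaration and DOCTYPE are omitted for brevity, and the element tags are written in well-formed form.

```python
import xml.etree.ElementTree as ET

# Dictionary object modelled on the Campania district description
# (declaration and DOCTYPE omitted; tags written in well-formed form).
DICTIONARY_OBJECT = """\
<objects>
  <type>Polygon</type>
  <img>image\\districts.gif</img>
  <meaning>Campania District</meaning>
  <source>SELECT geometry FROM CampaniaDistrict WHERE region = Campania</source>
  <description />
</objects>
"""


def load_dictionary_object(xml_text: str) -> dict:
    """Return a tag -> text mapping for the child elements of the root.

    Hypothetical helper, not part of Phenomena: empty elements map to "".
    """
    root = ET.fromstring(xml_text)
    return {child.tag: (child.text or "").strip() for child in root}


obj = load_dictionary_object(DICTIONARY_OBJECT)
print(obj["type"], "->", obj["source"])
```

The source field carries the query that retrieves the geometry, so a client could pass it to the underlying spatial database once the object has been selected.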

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE objects SYSTEM "objects.dtd">
<objects>
  <type>SURFACE</type>
  <img>image\temperature.gif</img>
  <meaning>Temperature</meaning>
  <source>SELECT continuousField FROM temperature</source>
  <description />
</objects>

Figure 7. The XML description of the Temperature Dictionary object.

Figure 8. Black shapes represent continuous field domains and green shapes represent discrete data objects.

Figure 9. A composition of the Overlap relationships as depicted in Phenomena.

Figure 10. A composition of the During Temporal relationships as depicted in Phenomena.

Figure 11. A visual description of the temporal relationships.

Figure 12. An example of Boolean expression depicted according to the Condition Tree.

query = empty string
for each path in tree
    for each node in path
        if node is not root then
            query concat node
            if node is not leaf then query concat “AND”
    if path is not last then query concat “OR”
return query

Figure 13. The Algorithm translating trees into SQL code.
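The translation of Figure 13 can be sketched in Python as follows. This is an illustrative sketch rather than the Phenomena implementation: the Node class, its label and children fields, and the function names are hypothetical, and each non-root node is assumed to carry one atomic condition as text. Every root-to-leaf path yields a conjunction, and the paths are joined by OR; since AND binds more tightly than OR in SQL, no parentheses are emitted.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Tuple


@dataclass
class Node:
    """Hypothetical Condition Tree node: `label` is one atomic condition."""
    label: str
    children: List["Node"] = field(default_factory=list)


def root_to_leaf_paths(
    node: Node, prefix: Tuple[Node, ...] = ()
) -> Iterator[Tuple[Node, ...]]:
    """Yield every path from the root down to a leaf, left to right."""
    path = prefix + (node,)
    if not node.children:
        yield path
    for child in node.children:
        yield from root_to_leaf_paths(child, path)


def tree_to_condition(root: Node) -> str:
    """Join the nodes of each path with AND and the paths with OR."""
    conjunctions = []
    for path in root_to_leaf_paths(root):
        labels = [node.label for node in path[1:]]  # the root carries no condition
        if labels:
            conjunctions.append(" AND ".join(labels))
    return " OR ".join(conjunctions)


# Example tree mirroring the WHERE clause of Figure 4 (labels are plain strings).
tree = Node("root", [
    Node("Roads.overlaps(Campania)", [
        Node("Temperature.during(Vegetation)", [
            Node("Temperature.during(August)"),
            Node("Temperature.during(July)"),
        ]),
    ]),
])
print(tree_to_condition(tree))
```

The example tree produces two conjunctions that share the first two conditions and differ only in the month, which is logically equivalent to the factored form used in the WHERE clause of Figure 4.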

Figure 14. An example of Boolean query composition in Phenomena.

Figure 15. An example of buffer operation in the Phenomena environment.

Figure 16. The schema of the Basic Function.

Figure 17. An example of application of the GetValue operation.

Figure 18. An example of application of the Nested Rectangle.

Table 1. The base of experimental subjects.

Non-expert Programmers, NP (10)        Expert Programmers, P (10)
ESSQL (5)        Phenomena (5)         ESSQL (5)        Phenomena (5)

Table 2: Results of the experiment for the two groups of non-programmers (top) and programmers (bottom).

      ESSQL - NP                 Phenomena - NP
      S1   S2   S3   S4   S5     S6   S7   S8   S9   S10
T1    C    C    M    S    F      C    C    C    C    C
T2    D    D    F    F    F      C    C    C    C    S
T3    S    F    F    F    F      C    C    F    C    S
T4    F    F    F    F    F      S    S    F    S    F
T5    F    F    F    F    F      S    F    F    F    F
T6    F    N    N    N    N      F    N    N    F    F

      ESSQL - P                  Phenomena - P
      S11  S12  S13  S14  S15    S16  S17  S18  S19  S20
T1    C    C    C    C    C      C    C    C    C    C
T2    C    C    C    C    C      C    C    C    C    C
T3    C    M    C    C    F      C    C    C    C    F
T4    C    M    C    C    F      C    S    C    C    C
T5    D    F    F    C    N      S    F    S    F    F
T6    F    F    F    S    N      C    C    N    F    F

Table 3: Number of errors for each group.

                              Non-Programmers        Programmers
Response Category             ESSQL    Phenomena     ESSQL    Phenomena
C (Correct)                   2        12            17       20
D (Minor data error)          2        0             1        0
M (Minor language error)      1        0             2        0
S (Error of substance)        2        6             1        3
F (Error of form)             19       10            7        6
N (Not attempted)             4        2             2        1
Total                         30       30            30       30

Table 4: Percentage of query responses in each category

                              Non-Programmers        Programmers
Response Category             ESSQL    Phenomena     ESSQL    Phenomena
C (Correct)                   6.67     40.00         56.67    66.67
D (Minor data error)          6.67     0.00          3.33     0.00
M (Minor language error)      3.33     0.00          6.67     0.00
S (Error of substance)        6.67     20.00         3.33     10.00
F (Error of form)             63.33    33.33         23.33    20.00
N (Not attempted)             13.33    6.67          6.67     3.33
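
The counts of Table 3 and the percentages of Table 4 follow directly from the responses listed in Table 2. As a cross-check, the short Python sketch below recomputes them; the dictionary layout and the helper names tally and percentages are ours and serve only this illustration.

```python
from collections import Counter

# Rows T1..T6 of Table 2; each string lists the five subjects of one group in order.
TABLE_2 = {
    "ESSQL-NP":     ["CCMSF", "DDFFF", "SFFFF", "FFFFF", "FFFFF", "FNNNN"],
    "Phenomena-NP": ["CCCCC", "CCCCS", "CCFCS", "SSFSF", "SFFFF", "FNNFF"],
    "ESSQL-P":      ["CCCCC", "CCCCC", "CMCCF", "CMCCF", "DFFCN", "FFFSN"],
    "Phenomena-P":  ["CCCCC", "CCCCC", "CCCCF", "CSCCC", "SFSFF", "CCNFF"],
}
CATEGORIES = "CDMSFN"  # response categories in the order used by Tables 3 and 4


def tally(rows):
    """Count how many responses of a group fall in each category."""
    return Counter("".join(rows))


def percentages(counts):
    """Express each category count as a percentage of all responses of the group."""
    total = sum(counts.values())
    return {c: round(100 * counts[c] / total, 2) for c in CATEGORIES}


for group, rows in TABLE_2.items():
    counts = tally(rows)
    print(group, [counts[c] for c in CATEGORIES], percentages(counts))
```

Each group contributes 5 subjects times 6 tasks, that is 30 responses, so every percentage is a count divided by 30.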

Table 5: Mean accuracy scores for each group, as a percentage of total (standard deviation)

                              Non-Programmers        Programmers
Language Category             ESSQL    Phenomena     ESSQL    Phenomena
Mean accuracy                 6.6      10.4          13       13.8
Standard Deviation            1.8      1.7           3.1      1.4

Table 6: The standard table of significance for the three sessions of the experiment.

      t        p
H1    -3.24    0.012
H2    -        0.6
H3    -3.03    0.01
