21
Spatial Data Modeling: Issues and Implications on Geographic Information Systems Jinsoo Park Information and Decision Sciences Department Carlson School of Management University of Minnesota Minneapolis, MN 55455 E-mail: [email protected] URL: http://kimchi.csom.umn.edu Abstract Even though the potential ability of Geographic Information System (GIS) as decision support for business activities has been widely recognized, the topic of Geographic Information Systems (GIS) has received little attention in the business literature. Spatial database systems provide the underlying database technology for Geographic Information Systems and other applications. Modeling spatial objects and their operations in spatial databases is a relatively new research area. Spatial data models should include constructs of high-level abstractions, spatial entities, relationships, operators and a query language, which provides rich concepts to efficiently and effectively handle spatial data. We present a comprehensive survey of the current state-of-the-art in spatial databases. This paper also examines issues related with modeling spatial complexity. We conclude the paper with directions for future research in spatial databases. Working Paper Last revised on September 2000

Spatial Data Modeling

  • Upload
    others

  • View
    9

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Spatial Data Modeling

Spatial Data Modeling: Issues and Implications on Geographic

Information Systems

Jinsoo Park

Information and Decision Sciences Department Carlson School of Management

University of Minnesota Minneapolis, MN 55455 E-mail: [email protected]

URL: http://kimchi.csom.umn.edu

Abstract

Even though the potential ability of Geographic Information System (GIS) as decision support for business activities has been widely recognized, the topic of Geographic Information Systems (GIS) has received little attention in the business literature. Spatial database systems provide the underlying database technology for Geographic Information Systems and other applications. Modeling spatial objects and their operations in spatial databases is a relatively new research area. Spatial data models should include constructs of high-level abstractions, spatial entities, relationships, operators and a query language, which provides rich concepts to efficiently and effectively handle spatial data. We present a comprehensive survey of the current state-of-the-art in spatial databases. This paper also examines issues related with modeling spatial complexity. We conclude the paper with directions for future research in spatial databases.

Working Paper Last revised on September 2000

Page 2: Spatial Data Modeling

1

1 Introduction Even though the potential ability of Geographic Information System (GIS) as decision support for business activities has been widely recognized [Kelly 1996; Murphy 1996; Youtie and Brown 1996], the topic of Geographic Information Systems (GIS) has received little attention in business schools [Kuehn et al. 1994]. The applicability of GIS to business applications in areas, such as target marketing, strategic planning, banking, customer location analysis, real estate, site selection, transportation, etc., is discussed in [Kuehn et al. 1994; Kelly 1996]. A GIS is an information system that is designed to work with data referenced by geographic coordinates [Star and Estes 1990]. In addition, a GIS describes objects from the real world in terms of (1) locational information – their positions with respect to a known coordinate system (i.e., the object is geographic-referenced), (2) descriptive information – their attributes, such as soil type, elevation, demographic information, etc., that are not directly related to location, (3) spatial relationships – their spatial interrelationships with other objects, which describe how they are linked together, and (4) temporal information – information about the evolution of both spatial and non-spatial data in time.

There are many open research areas in GIS, such as spatial modeling and analysis, multidimensional visualization, and data format standardization. Among these research issues, one of the most immediate and apparent problems that must be resolved is to identify ways to manage complex spatial data objects, since the majority of these issues relies on GIS data management. Modern GISs require the ability to handle a large volume of data from different sources, such as digitizing or satellite remote sensing. The major challenge of GIS is the effective management of huge amounts of data. Conventional GIS packages are built on a relatively simple file system which make them unsuitable for dealing with very large data sets. A Database Management System (DBMS) would seem more suitable for GIS, since DBMSs provides tools needed to handle and integrate the data, in addition to high-level data definition, integrity constraints, flexible queries, physical data independencies and transaction management for multi-users. However, since GISs require special data manipulation techniques such as multi-layer analysis, map generation and abstraction, polygon overlay and dissolve, grid cell analysis, and digital terrain analysis, as well as special devices such as digitizers for graphical input [Dangermond 1993], commercially available relational database management systems (RDBMSs) are not appropriate for GIS. Limitations of RDBMSs include extensibility, graphical input and output, and long-term transactions. Therefore, we need a special kind of database system, a spatial database system, which supports GIS operations and provides other distinct services which current GIS applications lack.

Güting [1994] defines a spatial database system as a database system which provides spatial data types such as points, lines and polygons in its data model and query language. It also supports spatial data types in its implementation, providing at least spatial indexing and algorithms for efficient spatial join. The importance of the spatial database system with regard to GIS is mentioned in several studies [Frank 1988; Guenther and Buchmann 1990; Smith and Frank 1990; Maguire 1991]. The spatial database system should provide spatial data types (e.g., multi-dimensional geometry [Herring 1992; Wiegand and Adams 1994] and spatial relationships and operators [Freeman 1975; Frank and Mark 1991; Güting 1994; Medeiros and Pires 1994; Egenhofer 1994a]), multiple representations of the same object at different scales [Angelaccio et al. 1990; Guptill 1990; Smith and Frank 1990], version control for temporal changes in spatial objects [Newell et al. 1992], long-term transactions for complex objects input (e.g., digitizing) and update [Kim 1990; Bertino and Martino 1991; Newell et al. 1992; Cattell 1994].

Since both GISs and spatial databases use spatial data, some people use the terms interchangeably. However, there are important distinctions between the two. The main purpose of a GIS is spatial analysis [Cowen 1988; Maguire 1991]. Cowen [1998] says “…the GIS actually would have created new information rather than just have retrieved previously encoded information. This ability to both automatically synthesize existing layers of spatial data and update a database of spatial entities is the key to a functional definition of a GIS.” Thus, Cowen argues that, “the emphasis of the GIS operations must be on the integration of different layers; not their creation,” and “…of all of the operations… spatial search and overlay are the only ones unique to GIS.” Further, spatial data is almost never used “as-is” [Alonso and Abbsadi 1994] and spatial analysis and operations are fundamental components of GISs. The spatial overlay operation is a typical operation in raster-based systems [Dangermond 1993]. Maguire [1991] points out that GISs are a subset (not a superset) of four different information systems, such as computer-added design (CAD) systems, computer cartography systems, remote sensing systems, and DBMS.

Spatial databases, on the other hand, emphasize data management aspects, such as data integrity, spatial query processing, concurrency control for multi-users, as well as support for spatial data types and operators. The application areas of spatial databases are not limited to GIS. Possible application areas include CAD, robotics,

Page 3: Spatial Data Modeling

2

medical image visualizations, and ecology simulations where spatial data is used. Thus, the scope of application areas of spatial databases is much broader than that of GIS. Since a spatial database utilizes spatial data management, it can be one of the major components of a GIS architecture. However, only a few GIS (e.g., TIGRIS [Herring 1992], PROBE [Orenstein and Manola 1988], GRAL [Güting 1989]) are built on spatial databases. Table 1 shows the differences between GIS and spatial databases.

-- Insert Table 1 around here --

In this paper, we survey data modeling in spatial databases. We first present fundamental concepts of basic spatial data types and their representation in computer-based systems. Modeling spatial objects and their operations in spatial databases is a relatively new research area. In order to model complex spatial phenomena, we should understand the nature and structure of spatial objects, their relationships and operations. Hence, we review these spatial constructs as well as two different views of modeling “space.” We then introduce a novel approach for modeling data in spatial databases. Finally, we discuss additional issues related with spatial data models.

2 Basic Spatial Data Types and Their Representation To understand the nature of spatial data, it is necessary to explain some basic concepts, such as points, lines and polygons, and their representations under two different systems, raster- and vector-based systems. A point might represent a single location in a coordinate system (e.g., {x, y}) such as a building or a city. A line consists of a series of points in which the two end points are in different coordinates (e.g., {x1y1, x2y2,…,xnyn}). Rivers and roads can be represented by lines. A polygon also consists of a series of points, but the two end points are in the same location (e.g., {x1y1, x2y2,…, x1y1}). Examples of polygons may include states, lakes and ponds. A spatial data object, however, can be represented by more than one of these types. For example, a city can be portrayed by either a point or a polygon, depending on the purpose of the application or on the different scales of the system. These data types are fundamental elements of spatial data and can be stored and represented in either raster or vector format. Which format is better for spatial data depends on the purpose of the application. Detailed surveys of these types are found in [Peuquet 1993]. Goodchild [1992] argues that neither raster- nor vector-based data types are suitable for modeling the complexity of the real world. An alternative approach that combines both raster and vector types was proposed by [Peuquet 1994]. However, raster and vector types have been and will be commonly used to represent spatial data in GIS.

In the raster-based system, space is divided into regularly sized and shaped (usually square) cells. Each cell is called a grid. The geographic location of each cell is implied by its position in the cell matrix. Thus, the smallest addressable unit of surface is a cell. The finer the grid, the better the resolution of the information. Since locational information is based on its position in the cell matrix, the geographic coordinates of the cells need not be stored. This type of system accommodates discrete areas (such as soil types) equally as well as continuous data (such as topography) and facilitates intermixing them. Since its emphasis is on space rather than objects, the raster-based data type is “location-based” [Peuquet 1994]. In the raster-based data type, data is stored as a number representing the information contained in a specific cell. For example, as illustrated in Figure 1, ‘1’ represents hardwood trees, while ‘2’ represents softwood trees. The values inside the grids appear on the computer screen not as numbers but as, for example, points of different colors.

-- Insert Figure 1 around here --

In the vector-based data type, as shown in Figure 2, each bit of information is represented as a set of connected points in which the line segment between two points can be considered a vector. This type of data is based on mathematical topology and includes operations to determine the boundary of a given object. Since the data are not stored in cells, the geographic locations must be explicitly defined. The basic addressable units of surface are “points,” “lines,” and “polygons.” Spatial data objects such as forest types, roads and streams can be individually and effectively retrieved and processed because each object is stored as a point, a line, or a polygon. Thus, it is easier to associate a variety of descriptive non-spatial data with a single geographic feature. Another advantage of using a vector-based system is that the output of graphics and paper plots are of higher quality than the output from

Page 4: Spatial Data Modeling

3

raster-based systems. Because of its emphasis on objects rather than space, the location information and attributes of an object are usually stored separately, but linked logically together. Thus, a vector-based system is “object-based” [Peuquet 1994]. The graphical output of an example will actually show a smoother connection between points.

-- Insert Figure 2 around here --

3 Two Different Views of Modeling Spatial Phenomena The need for abstract concepts to describe spatial objects and their operations in spatial databases is mentioned by many researchers [Morehouse 1990; Frank and Mark 1991; Lee and Isdale 1991; Burrough 1992; Falcidieno et al. 1992; Frank 1992]. It is also noted that a geoscientist’s model of space and time needs to be reconciled with the conceptual models developed in the database community [Guptill 1990]. Conceptual modeling provides a tool that is used to construct a high level description of abstract concepts in order to describe the real world and to capture more meaning from databases. The conceptual model is independent of the physical implementation of a particular database system. One of the purposes of the conceptual model design is “to describe the information content of the database, rather than the storage structures that will be required to manage this information” [Batini et al. 1992]. Worboys et al. [1990] provide a clear statement of the role of conceptual modeling:

A data model provides a tool for specifying the structural and behavioral properties of a database and ideally should provide a language which allows the user and database designer to express their requirements in ways that they find appropriate, while being capable of transformation to structures suitable for implementation in a database management system. The purpose of data modeling is to bring about the design of a database which performs efficiently; contains correct information (and which makes the entry of incorrect data as difficult as possible); whose logical structure is natural enough to be understood by users; and is as easy as possible to maintain and extend.

Lee and Isdale [1991] argue that there is a need for a special-purpose conceptual data model suitable for GIS applications – the “spatial data model.” The spatial data model should include constructs of high-level abstractions, spatial entities, relationships, operators, and a query language which provides rich concepts to efficiently and effectively handle spatial data. The spatial data model should provide well-defined spatial relationships and operators to manipulate the spatial nature of interest. However, conceptual modeling for spatial data has not received enough attention.

The question then is how to model spatial complexity; how to capture and explicitly represent the semantic meanings of spatial data objects, their relationships and constraints; and how to integrate temporal aspects in space. To answer these questions, we should first understand the nature and structure of space and spatial data objects in order to model them appropriately. The modeling of spatial phenomena depends on different perceptions of the real world because human beings employ different methods for conceptualizing space [Frank 1992]. Several studies discuss these different views [Goodchild 1992; Madero 1992; Güting 1994; Peuquet 1994]. They generally agree that there are two different approaches-- one views the world as a set of fully definable, discrete objects; the other views the world as complex continua in space. The former represents the “object-based view” [Goodchild 1992; Madero 1992; Güting 1994; Peuquet 1994], and the latter refers to either “field view” [Goodchild 1992; Medeiros and Pires 1994] or “space-based view” [Güting 1994; Peuquet 1994]. We will use the terms object-based view and space-based view to describe the two views. In the object-based view, the world is regarded as a set of descriptive objects, each object consists of its name, attributes, relationships and rules. For example, a river has its name, “the Mississippi River,” which runs through Minnesota (a spatial relationship with the state of Minnesota); any river cannot overlap highways and the Mississippi River does not cross Minneapolis (rules or constraints). Since each object is independent, two objects can be overlapped in the same location. This view is usually represented by vector-data model structures so that the data is processed as points, lines, polygons or a combination of these. In the space-based view, the complex continua of space can be represented by discretized surfaces (layers). The space-based view can be represented by raster type data in which the data is described in a set of cells. Spatial object information is explicitly recorded in the cells. Figure 3 provides a comprehensive diagram of these different views

Page 5: Spatial Data Modeling

4

from conceptual modeling to implementation. Although another important aspect of modeling space is time, the temporal dimension of spatial data models will not be fully addressed in this paper.

-- Insert Figure 3 around here --

4 Spatial Relationships and Operators To model complex spatial phenomena, we need, from the point of view of conceptual modeling, a set of formally defined spatial data types, spatial relationships, spatial operators and integrity constraints. These constructs should be highly expressive. The conceptual model should provide a mechanism for the evolution of spatial objects and their relationships. This allows users to add to and to change the model as time passes. It also should provide additional constructs to capture the dynamic nature of spatial phenomena. To the best of our knowledge, there is no widely accepted set of spatial constructs, relationships and operators. This may be due to the fact that different types of applications require such different functions [Medeiros and Pires 1994].

Spatial relationships are important because most spatial queries are related to them. For example, if an ecologist wanted to obtain information about the possible habitat of the long-toothed bat, the query might be “retrieve all hardwood tree areas within 80 miles of Minneapolis and near lakes, which do not cover any human-built constructions.” Spatial relationships can be categorized into five different types based on the their properties [Freeman 1975; Egenhofer and Herring 1990; Kainz 1990; Egenhofer 1994a], and they are formally defined in [Ram et al. 1999]:

• Topological relationships describe relative location (in time and/or space), and are based on a topological property that is independent of the existence of a distance. Thus, they are invariant under topological transformations such as scaling and rotation of the referenced spatial objects [Egenhofer and Franzosa 1991]. Examples include covers, disjoint, equal, inside, touching and overlap. Topological relationships are relatively well defined in mathematical terms. A discussion of them can be found in [Egenhofer and Franzosa 1991].

• Spatial order relationships are directional relationships based on the definition of order [Egenhofer 1994a]. Thus, in general, each relationship has a “semantic inverse” [Freeman 1975]. Hence, a binary relationship may be coupled with a relation that is its semantic inverse; for example above ↔ below, left ↔ right, behind ↔ in_front_of, near ↔ far, inside ↔ outside, etc.

• Metric relationships describe distances and directions between spatial objects. These relationships can be described by beyond, within, near, and far. An example query is: “List all public buildings within 3 miles from Carson School of Management.” In general, since spatial order relationships are based on order or direction and can be used for queries based on a distance matrix, these relationships may be regarded as a subset of metric relationships.

• Fuzzy relationships can be used if the direction and distance relationships between spatial objects are dependent on the perceptions of humans. For example, if users want to know “all the public schools near the city hall,” one might perceive “within 3 miles” as near, while another person thinks of “within 1 mile” as near. Some of the relationships that are associated with distance and direction might be perceived differently. Thus, a subset of spatial order relationships (i.e., left, right, behind, in_front_of ) and metric relationships (i.e., near, far) can be viewed as fuzzy relationships.

• Motion relationships describe the motion of one spatial object with relation to another object [Lee and Chin 1995]. For example, “a fire spreads over a particular vegetation area.” Since these relationships can depict the movement of a spatial object, it may be useful to describe some dynamic nature of an object (e.g., fire) with relation to other spatial objects.

Even though some of the relationships, such as fuzzy and motion relationships, are not fundamental and overlap with other relationships, these relationships are useful from the semantic point of view since they allow the

Page 6: Spatial Data Modeling

5

conceptual schema to capture more semantics from the relationships. The importance of the semantic aspects of spatial modeling is discussed in [Ram et al. 1999].

Another classification can be based on the number of spatial objects involved in a relationship. For example, if “the city hall is located between the football stadium and the police station,” three objects are involved, i.e., “city hall,” “football stadium,” and “police station.” Therefore, the spatial relationship between is a ternary relationship. If only two objects are involved in the relationship, then the relationship is binary. Table 2 shows classification and examples of spatial relationships that appear in the literature [Freeman 1975; Egenhofer and Herring 1990; Lee and Chin 1991; Egenhofer 1994a; Lee and Chin 1995].

-- Insert Table 2 around here --

One of the most fundamental and important relationships is the topological relationship. Topological relationships are fundamental in the sense that these are a basic set of spatial relations. They are important because many other relationships can be derived from the composition of these relationships [Egenhofer 1991; Egenhofer 1994b; Sharma and Flewelling 1995], thus providing spatial constraints. Some intensive (but partial) work toward defining formal, mathematical concepts in topological relationships have been conducted and are summarized below [Egenhofer 1989; Egenhofer 1990; Egenhofer and Herring 1990; Egenhofer 1991; Egenhofer and Franzosa 1991; Egenhofer 1993; Egenhofer 1994b].

Based on the concepts of set and topology in mathematics, a theory of topological relationships in the 2-dimensional plane has been developed. These relationships are defined in terms of the intersections of the boundaries (denoted as Y°) and interiors (denoted as ∂Y) of two point sets, where the boundaries and interiors are either empty (∅) or non-empty (¬∅). Then, a topological relationship between X and Y can be described by four intersection sets {∂X ∩ ∂Y, X° ∩ Y°, ∂X ∩ Y°, X° ∩ ∂Y} resulting in sixteen different combinations (24 = 16). However, not all sixteen of the topological relationships exist between the two 2-dimensional spatial objects. Only eight topological relationships (disjoint, meet, equal, contains, inside, covers, covered_by and overlap) are valid (see Table 3 and Figure 4). Since two pairs of topological relationships ({contains, inside} and {covers, covered_by}) are converse [Egenhofer 1991], only six distinct topological relationships remain.

-- Insert Table 3 around here --

-- Insert Figure 4 around here --

The underlying assumption here is that relationships exist between 2-dimensional objects (regions) in a 2-dimensional space (plane); that is, the relationships are dimension-dependent. The topological relationships between 1-dimensional objects (lines) in a 2-dimensional space cannot be considered in this framework [Egenhofer and Franzosa 1991]. Thus, an additional intersection of the point set, complement Y -1 is considered. Since the notions of boundary, interior and complement are mutually exclusive, the union of the three sets provides a complete partition of the universe [Egenhofer 1991]. The intersections of these are described by nine intersection sets {∂X ∩ ∂Y, X° ∩ Y°, X -1 ∩ Y -1, ∂X ∩ Y°, ∂X ∩ Y -1, X° ∩ ∂Y, X° ∩ Y -1, X -1 ∩ ∂Y, X -1 ∩ Y°}. Since relationships among the nine intersection sets are considered with respect to the embedding space, it is possible to determine whether an object is completely included in an another object part if the dimensions of object and space are different [Egenhofer 1991]. Using the nine intersections and transitive laws of sets, compositions of topological relationships can be derived [Egenhofer 1991; Egenhofer 1994b]. For example, if A is inside B and B meets C, then A is disjoint from C. An extended work combining topology and spatial order relationships was done by [Sharma and Flewelling 1995].

We previously mentioned that spatial relationships are important for spatial queries. However, the logical expressive power of a query language should be based on formally defined algebra (e.g., relational algebra), and this algebra is expressed in terms of operators and predicates. For example, in relational databases, such operators are selection, join, projection, union, intersection, difference and product. Thus, in a spatial database, the query language should provide a set of well-defined spatial operators (e.g., geo-relational algebra [Güting 1988]). The spatial database should also provide extensibility in such a way that if the data structure is extended the operators could also be extended.

Page 7: Spatial Data Modeling

6

Table 4 shows the comprehensive classification of spatial operators from the literature [Güting 1988; Egenhofer 1990; Mainguenaud and M.A. 1990a; Mainguenaud and M.A. 1990b; Lee and Chin 1991; Egenhofer 1994a; Lee and Chin 1995]. Unary spatial operators are operations on a single spatial object. The classes of the other operators are categorized, as their names imply, based on their functional associations with the spatial relationships. For example, topological relationships can be derived based on the topological operators in queries. The first vertical table indicates the type of information returned. If the operator returns true/false, then the return information is “Boolean.” If the operator returns some numeric value(s) such as the size of an object, then the return information is “Numeric value(s).” The operator also may return the information as textual description(s) (e.g., the name of the city) or graphical representation(s) (e.g., point, assuming the city is coded as a point) of the object, depending on the implementation of the system.

-- Insert Table 4 around here --

5 USM* — Formal Semantic Data Model for Spatial Databases The purpose of data modeling is to create an isomorphic mapping between the intended real world semantics of a given universe and its symbolic representation on a machine [Sheth et al. 1993]. We have proposed a novel approach for modeling data in spatial databases, called USM* (Unifying Semantic Model) [Ram et al. 1999]. USM* serves as a formal specification mechanism for describing a Universe of Discourse (spatial data and model bases). It extends existing semantic model concepts [Hammer and McLeod 1981; Hull and King 1987; Ram 1995] and incorporates new constructs to support spatial and temporal objects which, to the best of our knowledge, are not found in existing data models. These new constructs are spatiotemporal entity classes, dynamic entity classes, spatiotemporal aggregate entity classes, spatiotemporal aggregate relationships, spatiotemporal interaction relationships, and cause-effect relationships. The formal definitions of these constructs can be found in [Ram et al. 1999]. The model captures the behavior of dynamic and process-oriented spatiotemporal objects, such as fire, storm, flood, etc. For example, fire is a process that not only affects vegetation composition and watershed response, but also changes relationships between spatial objects. Thus, fire is a “contact process” in which the influence within a model is always from some current point or cell (also known as an “affector”) to adjacent areas or cells (the place being affected). While it may be argued that object-oriented data models can capture the dynamic behavior of spatial objects, this behavior is hidden in the methods associated with each object class. Our proposed semantic model – USM*, however, can explicitly capture and show these data, using a higher level of abstraction to explain their behavior. For example, our semantic model has constructs called “dynamic entity class” and “cause-effect relationship” which are not found in object-oriented models. The latter do not distinguish between different types of classes and relationships. USM* can be used to access the underlying spatial databases or facilitate the interface among the simulation models and the spatial data. The detailed description of the implementation and use of USM* as well as formal definitions of USM* constructs are described in [Ram et al. 1999].

A schema describes the structure of data, including entity classes and their relationships. The schema of USM* is a 2-tuple Φ = (S, R), where S is the set of all spatiotemporal entity classes defined in [Ram et al. 1999] and R is the set of all spatiotemporal relationships defined in [Ram et al. 1999]. The schema Φ above presents only the extended part of USM*. The complete USM* consists of all the constructs defined in the USM [Ram 1995] as well as the constructs described above. Figure 5 depicts the USM* metamodel.

-- Insert Figure 5 around here --

6 Discussions and Research Issues This paper examines important issues of modeling spatial objects. Spatial query languages that are another important issue in this category, however, have not been discussed due to the space limit. Since users must be able to interact with spatial databases for GIS, it is very important to provide an effective user interface between users and GIS. Previously, substantial effort has been made to construct spatial queries and command languages [Berman

Page 8: Spatial Data Modeling

7

and Stonebraker 1977; Frank 1982; Güting 1988; Joseph and Cardenas 1988; Orenstein and Manola 1988; Roussopoulos et al. 1988; Goh 1989; Egenhofer 1990; Mainguenaud and M.A. 1990b; Raper and Bundock 1991; Egenhofer 1994a; Lee and Chin 1995]. Frank and Mark [1991] deal with the implications of using natural and formal languages and review the role of languages in spatial databases. They argue for designing a system that reflects the way people conceptualize their problem domains. Thus, a good user interface must provide a user environment which integrates the data modeling process, the availability of spatial operations, and queries. Guenther and Buchmann [1990] state that “… the goal of a spatial query language is to allow the easy formulation of queries that involve both spatial and non-spatial predicates without a loss of spatial semantics.”

It is generally agreed that the user interface of a spatial database is of major significance, and that it should be a topic of major future research. Two specific issues regarding interfaces to spatial databases are raised: (1) the importance of providing users with methods for finding the relevant data, which includes the whole set of methods through which users can access the data, and (2) the importance of providing users with methods for dealing with the data,which includes, but is not limited to, graphical presentations of the data in a format the users can understand [Smith and Frank 1990]. In both cases, it is important that the user interface be designed from the users’ viewpoint based on the concepts which they are familiar with.

In addition, browsing capabilities that users peruse until relevant data to the task are found, are very important because spatial browsing tools allow the user to find data on spatial location (e.g. through a sequence of index maps, from the continents through nations, states, counties to a street map) or based on features visible on one map that are indicative of others (finding mountain ranges where one may expect a certain ecosystem). During a browsing mode, data might be presented in such a way that the user need not wait for completion of some activity, (e.g. transmission of a large image) before a judgment can be made as to whether the data will be useful for a task. The tools to find the data should also inform the user about the availability and the cost/time delays involved in using data sets – this is especially important when accessing large databases (e.g. as in the case of terabyte-sized databases) particularly with heterogeneous and distributed data.

Other important research areas in spatial databases include the integration of spatial and non-spatial data, spatial data integrity constraints (e.g., contour lines should not cross), interaction with the relational model (e.g., it is very difficult to capture hierarchy of complex objects – nested definition; some spatial operations, such as ‘nearest to’, don’t fit to SQL), visualization of spatial input and output (e.g., graphical query language and multi-resolution output), extensibility (e.g., how to easily add new operations or new index types?), and performance (e.g., time to load a spatial index and build a spatially-indexed output is important; sequence of spatial operations – cascaded spatial join).

Page 9: Spatial Data Modeling

8

References Alonso, G. and A.E. Abbsadi, “Cooperative Modeling in Applied Geographic Research,” International Journal of

Intelligent and Cooperative Information Systems, Vol. 3, No. 1, 1994, pp. 83-102. Angelaccio, M., T. Catarci and G. Santucci, “Query by Diagram: A Fully Visual Query System,” Journal of Visual

Languages and Computing, Vol. 1,, 1990, pp. 255-273. Batini, C., S. Ceri and S. Navathe, Conceptual Database Design: An Entity-Relationship Approach,

Benjamin/Cummings, 1992. Berman, R. and M. Stonebraker, “Geo-Quel, A System for the Manipulation and Display of Geographic Data,”

Computer Graphics, Vol. 11, No. 2, Summer 1977, pp. 186-191. Bertino, E. and L. Martino, “Object-Oriented Database Management Systems: Concepts and Issues,” IEEE

Computer, April 1991, pp. 33-47. Burrough, P. A., “Are GIS Data Structures Too Simple Minded?,” Computers & Geosciences, Vol. 18, No. 4, 1992,

pp. 395-400. Cattell, R. G. G., Object Data Management: Object-Oriented and Extended Relational Database Systems, Revised

Edition, Addition-Wesley Publishing Company, 1994. Cowen, D. J., “GIS versus CAD versus DBMS: What are the Differences?,” Photogrammetric Engineering and

Remote Sensing, Vol. 54, No. 11, November 1988, pp. 1551-1555. Dangermond, J., “A Classification of Software Components Commonly Used in Geographic Information Systems, ”

in Introductory Readings in Geographic Information Systems, D. J. Peuquet and D. F. Marble (Eds.), Taylor & Francis,, 1993, pp. 30-51.

Egenhofer, M. J., “A Formal Definition of Binary Topological Relationships,” in Proceedings of the 3rd International Conference on Foundations of Data Organization and Algorithms, Paris, France, June, 1989, pp. 457-472.

Egenhofer, M. J., “Interaction with Geographic Information Systems via Spatial Queries,” Journal of Visual Languages and Computing, Vol. 1, 1990, pp. 389-413.

Egenhofer, M. J., “Reasoning about Binary Topological Relations,” in Proceedings of the 2nd International Symposium on Large Spatial Databases-SSD '91, Zürich, Switzerland, August 28–30, 1991, p. 143–160.

Egenhofer, M. J., “A Model for Detailed Binary Topological Relationships,” Geomatica, Vol. 47, No. 3 & 4, Autumn 1993, pp. 261-273.

Egenhofer, M. J., “Spatial SQL: A Query and Presentation Language,” IEEE Transactions on Knowledge and Data Engineering, Vol. 6, No. 1, February 1994a, p. 86–95.

Egenhofer, M. J., “Deriving the Composition of Binary Topological Relations,” Journal of Visual Languages and Computing, Vol. 5,, 1994b, pp. 133-149.

Egenhofer, M. J. and R. D. Franzosa, “Point-Set Topological Spatial Relations,” International Journal of Geographical Information Systems, Vol. 5, No. 2, April-June 1991, pp. 161-174.

Egenhofer, M. J. and J. R. Herring, “A Mathematical Framework for the Definition of Topological Relationships,” in Proceedings of the 4th International Symposium on Spatial Data Handling, Zürich, Switzerland, July 23–27, 1990, p. 803–813.

Falcidieno, B., C. Pienovi and M. Spagnuolo, “Descriptive Modeling and Prescriptive Modeling in Spatial Data Handling, ” in Proceedings of International Conference on GIS-From Space to Territory: Theories and Methods of Spatio-Temporal Reasoning in Geographic Space, Pisa, Italy1992, pp. 122-135.

Frank, A. U., “MAPQUERY: Data Base Query Language for Retrieval of Geometric Data and Their Graphical Representation,” Computer Graphics, Vol. 16, No. 3, July 1982, pp. 199-207.

Frank, A. U., “Requirements for a Database Management System for a GIS,” Photogrammetric Engineering and Remote Sensing, Vol. 54, No. 11, November 1988, pp. 1557-1564.

Frank, A. U., “Spatial Concepts, Geometric Data Models, and Geometric Data Structures,” Computers & Geosciences, Vol. 18, No. 4, 1992, pp. 409-417.

Page 10: Spatial Data Modeling

9

Frank, A. U. and D. M. Mark, “Language Issues for GIS,” in Geographic Information Systms: Principles and Applications, D. J. Maguire, M. F. Goodchild and D. W. Rhind (Eds.) Zürich, Longman Scientific & Technical, Vol. 2, 1991, pp. 147-163.

Freeman, J., “The Modelling of Spatial Relations,” Computer Graphics and Image Processing, Vol. 4, No. 2, 1975, p. 156–171.

Goh, P. C., “A Graphic Query Language for Cartographic and Land Information Systems,” International Journal of Geographical Information Systems, Vol. 3, No. 3, 1989, pp. 245-255.

Goodchild, M. F., “Geographical Data Modeling,” Computers & Geosciences, Vol. 18, No. 4, 1992, pp. 401-408. Guenther, O. and A. Buchmann, “Research Issues in Spatial Databases,” SIGMOD Record, Vol. 19, No. 4, 1990, pp.

61-68. Guptill, S. C., “Multiple Representations of Geographic Entities Through Space and Time, ” in Proceedings of the

4th International Symposium on Spatial Data Handling, Zürich, Switzerland, July 23–27, 1990, p. 859–868. Güting, R. H., “Geo-Relational Algebra: A Model and Query Language for Geometric Database Systems,” in

Proceedings of International Conference on Extending Database Technology—EDBT '88, Venice, Italy, March 14–18, 1988, p. 506–527.

Güting, R. H., “GRAL: An Extensible Relational Database for Geometric Applications,” in Proceedings of the 15th International Conference on Very Large Data Bases, Amsterdam, The Netherlands, August 22-26, 1989.

Güting, R. H., “An Introduction to Spatial Database Systems,” The VLDB Journal, Vol. 3, No. 4, October 1994, pp. 357-399.

Hammer, M. and D. McLeod, “Database Description with SDM: A Semantic Database Model,” ACM Transactions on Database Systems, Vol. 6, No. 3, September 1981, pp. 351-386.

Herring, J. R., “TIGRIS: A Data Model for an Object-Oriented Geographic Information System,” Computers & Geosciences, Vol. 18, No. 4, 1992, p. 443–452.

Hull, R. and R. King, “Semantic Database Modeling: Survey, Applications, and Research Issues,” ACM Computing Surveys, Vol. 19, No. 3, September 1987, pp. 201-260.

Joseph, T. and A. F. Cardenas, “PICQUERY: A High Level Query Language for Pictorial Database Management,” IEEE Transactions on Software Engineering, Vol. 14, No. 5, 1988, pp. 630-638.

Kainz, W., “Spatial Relationships—Topology versus Order,” in Proceedings of the 4th International Symposium on Spatial Data Handling, Zürich, Switzerland, July 23–27, 1990, p. 814–819.

Kelly, K., “GIS In Economic Development and Business Decision Support,” in Proceedings of the 2nd Americas Conference on Information Systems, Phoenix, AZ, August 16-18, 1996, pp. 560-562.

Kim, W., Introduction to Object-Oriented Databases, Cambridge, The MIT Press, 1990. Kuehn, A. A.,T. W. McGuire,N. P. Melone and V. R. Penninti, “High-value PC-based GIS Applications in

Business,” in Proceedings of the 27th Annual Hawaii International Conference on System Sciences, Wailea, Hawaii, January 4-7, 1994, pp. 327-336.

Lee, Y. C. and F. L. Chin, “A Query Language Based on a Spatially Intuitive Grammar,” in Proceedings of Canadian Conference on GIS-91, March 18–22, 1991, p. 868–879.

Lee, Y. C. and F. L. Chin, “An Iconic Query Language for Topological Relationships in GIS,” International Journal of Geographical Information Systems, Vol. 9, No. 1, January-February 1995, pp. 25-46.

Lee, Y. C. and M. Isdale, “The Need for a Spatial Data Model,” in Proceedings of Canadian Conference on GIS-91, March 18–22, 1991, p. 530–530j.

Madero, F. J., “A Rule-Based Mediator Implementation for Solving Semantic Conflicts in SQL,” Bachelor of Science's Theis, Electrical Science and Engineering, MIT, Cambridge, Mass., 86 pages, 1992.

Maguire, D. J., “An Overview and Definition of GIS,” in Geographical Information Systems: Principles and Applications, M. D.J., M. F. Goodchild and D. W. Rhind (Eds.), Longman Scientific & Technical, Vol. 1, 1991, pp. 9-20.

Page 11: Spatial Data Modeling

10

Mainguenaud, M. and P. M.A., “CIGALES: A Graphical Query Language for Geographical Information Systems,” in Proceedings of the 4th International Symposium on Spatial Data Handling, Zürich, Switzerland, July 23-27, 1990a, p. 393–404.

Mainguenaud, M. and P. M.A., “Definition of CIGALES: A Geographical Information System Query Language,” in Proceedings of International Conference on Database and Expert Systems Applications, Vienna, Austria, August 29-31, 1990b, p. 275–280.

Medeiros, C. B. and F. Pires, “Databases for GIS,” SIGMOD Record, Vol. 23, No. 1, March 1994, pp. 107-115. Morehouse, S., “The Role of Semantics in Geographic Data Modelling,” in Proceedings of the 4th International

Symposium on Spatial Data Handling, Zürich, Switzerland, July 23–27, 1990, p. 689–698. Murphy, L. D., “Competing in Space: The Strategic Roles of Geographic Information Systems,” in Proceedings of

the 2nd Americas Conference on Information Systems, Phoenix, AZ, August 16-18, 1996, pp. 563-565. Newell, R. G.,D. Theriault and M. Easterfield, “Temporal GIS—Modeling the Evolution of Spatial Data in Time,”

Computers & Geosciences, Vol. 18, No. 4, 1992, pp. 427-433. Orenstein, J. A. and F. Manola, “PROBE Spatial Data Modeling and Query Processing in an Image Database

Application,” IEEE Transactions on Software Engineering, Vol. 14,, 1988, pp. 611-629. Peuquet, D. J., “A Conceptual Framework and Comparison of Spatial Data Models,” in Introductory Readings in

Geographic Information Systems, D. J. Peuquet and D. F. Marble (Eds.), Taylor & Francis,, 1993, p. 250–285. Peuquet, D. J., “It’s About Time: A Conceptual Framework for the Representation of Temporal Dynamics in

Geographic Information Systems,” Annals of the Association of American Geographers, Vol. 84, No. 3, 1994, pp. 441-461.

Ram, S., “Intelligent Database Design Using the Unifying Semantic Model,” Information and Management, Vol. 29,, 1995, pp. 191-206.

Ram, S., J. Park and G. Ball, “Semantic Model Support for Geographic Information Systems,” IEEE Computer, Vol. 32, No. 5, May 1999, pp. 74-81.

Raper, J. F. and M. S. Bundock, “UGIX: A Layer Based Model for a GIS User Interface,” in Cognitive and Linguistic Aspects of Geographic Space, D. M. Mark and A. U. Frank (Eds.), Kluwer Academic Publishers,, 1991, p. 449–475.

Roussopoulos, N.,C. Faloutsos and T. Sellis, “An Efficient Pictorial Database System for PSQL,” IEEE Transactions on Software Engineering, Vol. 14, No. 5, May 1988, pp. 639-650.

Sharma, J. and D. M. Flewelling, “Inferences from Combined Knowledge about Topology and Directions,” in Proceedings of the 4th International Symposium on Large Spatial Databases-SSD '95, Portland, ME, August 6-9, 1995, p. 279–291.

Sheth, A. P., S. K. Gala and S. B. Navathe, “On Automatic Reasoning For Schema Integration,” International Journal of Intelligent and Cooperative Information Systems, Vol. 2, No. 1, 1993, pp. 23-50.

Smith, T. R. and A. U. Frank, “Report on Workshop on Very Large Spatial Databases,” Journal of Visual Languages and Computing, Vol. 1, No. 3, 1990, pp. 291-309.

Star, J. and J. Estes, Geographic Information Systems: An Introduction, Englewood Cliffs, Prentice-Hall, 1990. Wiegand, N. and T. M. Adams, “Using Object-Oriented Database Management for Feature-Based Geographic

Information Systems,” Journal of the Urban and Regional Information Systems Association, Vol. 6, No. 1, Spring 1994, pp. 21-36.

Worboys, M., H. Hearnshaw and D. Maguire, “Object-Oriented Data Modeling for Spatial Databases,” International Journal of Geographical Information Systems, Vol. 4, No. 4, October-December 1990, p. 369–383.

Youtie, J. and A. Brown, “Using GIS to Support Marketing and Business Development Efforts by Small Manufacturers,” in Proceedings of the 2nd Americas Conference on Information Systems, Phoenix, AZ, August 16-18, 1996, pp. 558-559.

Page 12: Spatial Data Modeling

11

Table 1. The Differences between GIS and Spatial Databases

GIS Spatial Databases

Domain Application-oriented information systems

Database Management Systems

Focus Spatial analysis (e.g., spatial overlay, digital terrain analysis, etc.)

Data management (e.g., data integrity, spatial query processing, concurrency control, etc.)

Scope of application areas Narrower o Geographic-referenced

information systems

Broader o Geographic-referenced

information systems o Space-referenced information

systems – CAD – Robotics – Medical Image Visualization

Page 13: Spatial Data Modeling

12

Table 2. Classification of Spatial Relationships

Binary Ternary

Topological Relationships covers, covered_by, disjoint, equal, inside, contains, meet, touching, overlap

Spatial Order Relationships above, on, below, left, right, north, south, west, east, in_front_of, behind

Metric Relationships spatial order relationships beyond, beside, next_to, within, near, far

between

Fuzzy Relationships left, right, in_front_of, behind, near, far Motion Relationships cross, through, spread

Page 14: Spatial Data Modeling

13

Table 3. Specification of Eight Topological Relationships (adopted from [Egenhofer and Herring 1990])

∂X ∩ ∂Y X° ∩ Y° ∂X ∩ Y° X° ∩ ∂Y X disjoint Y ∅ ∅ ∅ ∅ X meet Y ¬∅ ∅ ∅ ∅ X equal Y ¬∅ ¬∅ ∅ ∅ X inside Y ∅ ¬∅ ¬∅ ∅ X covered_by Y ¬∅ ¬∅ ¬∅ ∅ X contains Y ∅ ¬∅ ∅ ¬∅ X covers Y ¬∅ ¬∅ ∅ ¬∅ X overlap Y ¬∅ ¬∅ ¬∅ ¬∅

Page 15: Spatial Data Modeling

14

Table 4. Classification of Spatial Operators Unary spatial

operators Topological operators

Spatial order operator Metric operators

Boolean

at connected_to, disjoint_from, inside, encloses, inclusion, is_neighbour_of

above, below, begins_at, ends_at, in_front_of, behind, north_of, south_of, west_of, east_of, left_of, right_of, on

adjacent_to, crossing, in_between, in_or_crossing, intersects, near, far, outside, within numeric_unit of

Numeric value(s)

area, area_of, boundary, coordinates_of, dimension, interior, length, length_of, perimeter, perimeter_of, volume, xcoord, ycoord, zcoord

direction distance, distance_between, dist, mindist, maxdist

Textual description / graphical representation

begins_at, bounds_of, buffer, center, center_of, circular_range_search, closest_object_search, rectangular_range_search, clip, cursor, diameter, ends_at, extension, lines_of, nodes_of, points_of, vertices

overlay, difference, spatial_difference, intersection, spatial_intersection, union, spatial_union

adjacency, closest, convex_hull, intersection_point, line_between, link, path, voronoi,

Page 16: Spatial Data Modeling

15

1 1 1 11 1 1 11 1 1 11 1 1 11 1

1 11 1 1 11 1 1 1

1 1

1 1

2 2 2 22 2 2 2

2 2 2 22 2 2 2

2 2 2 22 2 2 2

2 2 2 22 2 2 2 2 2

0 0 0 00 0 0 0

0 0 0 0 0 00 00 0 0 0 0 0

00000000000 0

0000

0000

0000

00000 0

00

0

0 = Null Value1 = Hardwood Trees2 = Softwood Trees

Figure 1. Simple Example of Raster Data Type

Page 17: Spatial Data Modeling

16

1

4

2

3

Spatial DataID Obj_type Coordinations 1 polygon x1y1, x2y2,…, x1y1

2 line x1y1, x2y2,…, xnyn 3 point x, y 4 polygon x1y1, x2y2,…, x1y1

Non-Spatial DataVegetation TableID Name Area Perim Type Density 1 Conanda 54 89 Hardwood 45 4 Ventana 69 101 Grass 39

River TableID Name Length Volume 2 Colorado 126 86

Building TableID Name Address Class 3 McClelland 25 Speedway Public

Figure 2. Simple Example of Vector Data Type

Page 18: Spatial Data Modeling

17

Real World

Object-Based View(Discrete View)

Space-Based View(Continuous View)

Relational ModelSemantic (ER) Model

Object-Oriented ModelDiscretized Surface (Grid) Mathematical Fuctions

Vector Raster

SpatialReasoning

Heterogeneity

RepresentationHeterogeneity

InternalHeterogeneity

Filtering and ObjectRecognition

Sets of FunctionParameters

Figure 3. Differences in Spatial Modeling

Page 19: Spatial Data Modeling

18

disjoint meet inside contains

equal overlap covered_by covers

X

Y

XY

XY X

Y

X YX

YX

Y XY

Figure 4. Graphical Representation of Eight Topological Relationships

Page 20: Spatial Data Modeling

19

SpatiotemporalEntity Classes

SpatiotemporalAggregate

Entity Classes

DynamicEntity Classes

SimpleSpatiotemporalEntity Classes

SpatialAggregate

Entity Classes

TemporalAggregate

Entity Classes

SimpleSpatiotemporal

Agg Entity Classes

SS S

S SS

P

P

USM*Models

Agg

ObjectClasses

S

SpatiotemporalRelationships

SpatiotemporalAggregate

Relationships

SpatiotemporalInteraction

Relationships

SpatialAggregate

Relationships

TemporalAggregate

Relationships

SimpleSpatiotemporal

Agg Relationships

S S

S SS

P

P

S SS

TopologicalRelationships

S

FuzzyRelationships

MetricRelationships

Spatial OrderRelationships

MotionRelationships

S

X

Cause-EffectRelationships

S

Related By

RelatesCreated By

Spatial Index

Temporal Index

Attributes

S SS

SS

Time Stam

pTim

e Interval

StructuralAttributes

SimpleAttributes

DerivedAttributes

Entity Class

Interaction Entity Class

Composite/Grouping Entity Class

Weak Entity Class

S

Agg

Generalization/Specialization

Aggregate

Partition (Coverwith No Overlap)

Mutually Exclusive(No Overlap)X

P

Legends

Inteaction Relationship

Figure 5. The USM* Meta Model

Page 21: Spatial Data Modeling

20

About Author Jinsoo Park is Assistant Professor of Information and Decision Sciences Department in the Carlson School of Management at the University of Minnesota. He joined the Carlson School of Management in September 1999 after completing a Ph.D. in Management Information Systems from the University of Arizona where he was a recipient of multiple research fellowships. He was also a participant in the 1998 Doctoral Consortiums at ICIS and AIS. His research interests are in the areas of semantic interoperability and metadata management in interorganizational information systems, heterogeneous database management and integration, knowledge networking and sharing, data modeling, spatial database systems, Geographic Information Systems, and intelligent agents for information resource management. His current research deals with the design of semantic conflict resolution environment for heterogeneous Internet information systems, semantic interoperability in B2B, and agent-based knowledge networking and sharing. His particular teaching interests include database management systems, object-oriented system development, and web-based e-business applications.