Using formal ontology for integrated spatial data mining Julie Sungsoon Hwang Department of Geography State University of New York at Buffalo ICCSA04 Perugia, Italy May 14, 2004

Page 1: Using formal ontology for integrated spatial data mining

Using formal ontology for integrated spatial data mining

Julie Sungsoon Hwang
Department of Geography
State University of New York at Buffalo
ICCSA04, Perugia, Italy
May 14, 2004

Page 2: Using formal ontology for integrated spatial data mining

Research purposes

Clarify the role of formal ontology in knowledge discovery in databases (KDD)

Propose the conceptual framework for ontology-based spatial data mining

Case study: ontology-based spatial clustering algorithms

Page 3: Using formal ontology for integrated spatial data mining

Problems in focus (cont.)

No single algorithm is best suited to all research purposes and application domains.

Without considering domain knowledge, the same algorithm can yield results inconsistent with the facts

The same data may have to be analyzed in different ways depending on the user’s goal

Page 4: Using formal ontology for integrated spatial data mining

Problems in focus

[Diagram: developing new algorithms (Algorithm A, Algorithm B, Algorithm C, Algorithm D) versus re-using existing algorithms (Algorithm D’, suited to Domain and Task)]

How can algorithms be customized to varying domains and tasks?

Page 5: Using formal ontology for integrated spatial data mining

Relation between data mining and ontology construction

[Diagram, along an axis labeled "Level of abstraction": Data Mining (knowledge discovery) leads from Data to Information to Knowledge; Ontology Construction (knowledge acquisition) leads from Knowledge to Ontology]

Page 6: Using formal ontology for integrated spatial data mining

Role of formal ontology in KDD

Provide the context in which the knowledge extracted from data is interpreted and evaluated

Guide algorithms such that they can be suitable for domain-specific and task-oriented concepts

[Figure: KDD process diagram]

Page 7: Using formal ontology for integrated spatial data mining

Using ontology for spatial data mining

Ontology formalizes how knowledge is conceptualized, thereby making implicit meaning explicit

Data mining extracts high-level knowledge from low-level data, thereby enhancing the level of understanding

[Diagram: Ontology (Domain Model, Task Model) feeding Spatial Data Mining, which turns low-level data into high-level knowledge]

Page 8: Using formal ontology for integrated spatial data mining

Domain-specific spatial data mining

Let’s compare two different domains: traffic accidents versus retailers

Is-a
  Domain of traffic accidents: Event
  Domain of retailers: Physical object

Spatial constraints
  Domain of traffic accidents: In road network
  Domain of retailers: Outside of road network

Spatial data mining algorithms should take into account different conceptualizations (domain-specific properties)

Page 9: Using formal ontology for integrated spatial data mining

Task-oriented spatial data mining

Let’s compare two different tasks: detecting hotspots of traffic accidents versus partitioning market areas based on the location of retailers

Number of clusters k
  Detect hotspots of traffic accidents: depends on spatial distribution
  Partition market areas for a retailer: given (resource constraint)

Level of detail
  Detect hotspots of traffic accidents: varies with scale (depends on area of users’ interest)
  Partition market areas for a retailer: doesn’t vary with scale

Spatial data mining algorithms should take into account different tasks and users’ needs

Page 10: Using formal ontology for integrated spatial data mining

Ontology as an active component of information system

[Diagram, from Guarino, 1998: Top-level Ontology (e.g. space, time, matter, object, event); Domain Ontology (e.g. medicine); Task Ontology (e.g. diagnosing); Application Ontology. Diagram labels: dependence, subject]

Page 11: Using formal ontology for integrated spatial data mining

Conceptual framework for ontology-based spatial data mining (OBSDM)

Page 12: Using formal ontology for integrated spatial data mining

Component of OBSDM

Page 13: Using formal ontology for integrated spatial data mining

OBSDM:: Input:: Metadata

The tag structure of XML metadata can be used to inform the domain ontology of the semantics of the data
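As a rough illustration of reading semantics from metadata tags, the sketch below pulls the term inside a "theme" tag and uses it as a lookup key for a domain ontology. The sample record, the registry dictionary, and the function name locate_domain_ontology are all hypothetical; only the "theme" tag and the Traffic Accident theme come from the slides.

```python
import xml.etree.ElementTree as ET

# Hypothetical metadata record; every element except <theme> is an assumption.
METADATA_XML = """
<metadata>
  <title>Crash records</title>
  <theme>Traffic Accident</theme>
</metadata>
"""

# Hypothetical registry mapping normalized theme terms to ontology names.
DOMAIN_ONTOLOGIES = {
    "traffic accident": "TRAFFIC-ACCIDENT-DOMAIN",
}


def locate_domain_ontology(xml_text, registry):
    """Return the ontology registered for the first <theme> term, or None."""
    root = ET.fromstring(xml_text.strip())
    theme = root.find(".//theme")
    if theme is None or theme.text is None:
        return None
    # Collapse whitespace and ignore case so the term works as a token.
    token = " ".join(theme.text.split()).lower()
    return registry.get(token)


print(locate_domain_ontology(METADATA_XML, DOMAIN_ONTOLOGIES))
```

A record with no theme tag, or a theme that is not registered, yields None, which a caller could treat as "no domain ontology available".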

Page 14: Using formal ontology for integrated spatial data mining

Component of OBSDM

Page 15: Using formal ontology for integrated spatial data mining

OBSDM:: OBSDMM:: Domain Ont.

Terms within the “theme” tag in the metadata are used as a token to locate the appropriate domain ontology

Domain ontology specifies the definition, class, and properties
  Class example: Accident is a Subclass-Of Temporal-Thing
  Properties example: Road has a Geographic-Region as a Value-Type

Class properties are inherited from the top-level ontology
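The class and property examples above can be sketched as a small dictionary-based hierarchy in which a class picks up the properties of its ancestors. This is only an illustrative encoding, not the representation used in the talk: the slot names "occurs-at" and "located-in" and the "Time" value type on Temporal-Thing are assumptions.

```python
# Each class records its parent (Subclass-Of) and its own property slots
# (slot name -> Value-Type). Slot names are hypothetical; the class names
# Accident, Temporal-Thing, Road and Geographic-Region echo the slide.
ONTOLOGY = {
    "Temporal-Thing": {"subclass_of": None, "properties": {"occurs-at": "Time"}},
    "Accident": {"subclass_of": "Temporal-Thing", "properties": {}},
    "Road": {"subclass_of": None, "properties": {"located-in": "Geographic-Region"}},
}


def properties_of(cls, ontology):
    """Collect a class's properties, including those inherited from ancestors."""
    chain = []
    while cls is not None:
        chain.append(cls)
        cls = ontology[cls]["subclass_of"]
    merged = {}
    # Apply ancestors first so a subclass can override an inherited slot.
    for name in reversed(chain):
        merged.update(ontology[name]["properties"])
    return merged


print(properties_of("Accident", ONTOLOGY))
print(properties_of("Road", ONTOLOGY))
```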

Page 16: Using formal ontology for integrated spatial data mining

Domain ontology := Traffic accident

Theory TRAFFIC-ACCIDENT-DOMAIN

As a spatial thing,
  Point(x) On(x, y) Roadway(y) Line(y) In(y, z) Geographic-Region(z)

As a temporal thing,
  Point(x) At(x, y) Time(y)
  Event(x) <=> Occurrence(x) Notification(x) Response(x) Arrival(x)
  Before(Occurrence(x), Notification(x))

As an intangible thing,
  Accident(x) RelatedTo(x, y) Vehicle(y)

Page 17: Using formal ontology for integrated spatial data mining

Component of OBSDM

Page 18: Using formal ontology for integrated spatial data mining

OBSDM:: Input:: User Interface

Users can specify a goal, level of detail, and geographic area of interest through the user interface (UI)

Page 19: Using formal ontology for integrated spatial data mining

Component of OBSDM

Page 20: Using formal ontology for integrated spatial data mining

OBSDM:: OBSDMM:: Task Ont.

The inputs specified by users in the user interface are translated into the task ontology

The task ontology explicitly specifies the goal, methods, requirements, and constraints
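A minimal sketch of that translation step, assuming the user interface hands over a plain dictionary of form fields. The field names Goal, LevelOfDetail, and PlaceName follow the case-study settings; the TaskSpec class and the task_from_ui function are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TaskSpec:
    """Hypothetical container for the task-related inputs."""
    goal: str
    level_of_detail: Optional[str]
    place_name: Optional[str]


def task_from_ui(form):
    """Translate raw UI form values into an explicit task specification."""
    def clean(key):
        # Treat missing or blank fields as unspecified (None).
        value = (form.get(key) or "").strip()
        return value or None

    goal = clean("Goal")
    if goal is None:
        raise ValueError("a goal is required")
    return TaskSpec(goal, clean("LevelOfDetail"), clean("PlaceName"))


spec = task_from_ui(
    {"Goal": "identify hot spots", "LevelOfDetail": "State", "PlaceName": "New York"}
)
print(spec)
```

Leaving unspecified fields as None keeps "not provided" distinct from any real value.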

Page 21: Using formal ontology for integrated spatial data mining

Task ontology := Spatial clustering

Theory SPATIAL-CLUSTERING-TASK

Documentation: This theory defines a task ontology for the spatial clustering task. The spatial clustering task, which is a class of clustering task, is a problem of grouping similar spatial objects into classes.

Super classes: Clustering
Subclasses:
Sub goal: “Find hot spots”, “Group similar patterns”, “Partition into k-clusters”

Requirement: Assignment-Object
  Source: Spatial Objects
  Target: Clusters
  Geographic-Scale
  Detail-Level

Constraint: Spatial Objects Operational Constraints

Page 22: Using formal ontology for integrated spatial data mining

Component of OBSDM

Page 23: Using formal ontology for integrated spatial data mining

OBSDM:: OBSDMM:: Alg. Builder
OBSDM:: Output:: GVis tool

The algorithm builder assembles the requirements for building the algorithm best suited to the domain of the data and the users’ input (task).

Data content is filtered through the domain ontology, and the users’ requirements are filtered through the task ontology.

The geographic visualization tool displays the results (the patterns discovered)
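One way to picture the algorithm builder is as a function that merges domain-derived and task-derived settings into a single parameter set. The sketch below is an assumption-laden illustration, not the builder described in the talk: the function name, the dictionary keys, and the "k" input field are hypothetical. The rule that k is given for partitioning but left open for hot spot detection follows the task comparison earlier in the deck.

```python
def build_algorithm_config(domain, task):
    """Combine domain- and task-derived settings into one parameter set."""
    goal = task["Goal"].strip().lower()
    if goal == "partition into k-clusters":
        # The number of clusters is given by the user (a resource constraint).
        k = task.get("k")
        if k is None:
            raise ValueError("partitioning needs a cluster count k")
    else:
        # For hot spot detection, k is left to the spatial distribution.
        k = None
    return {
        "goal": goal,
        "k": k,
        "level_of_detail": task.get("LevelOfDetail"),
        "place_name": task.get("PlaceName"),
        "constraint": domain.get("Constraint"),
    }


config = build_algorithm_config(
    {"Constraint": "Named-Roadway"},
    {"Goal": "identify hot spots", "LevelOfDetail": "State", "PlaceName": "New York"},
)
print(config)
```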

Page 24: Using formal ontology for integrated spatial data mining

Case study: ontology-based spatial clustering of traffic accidents

OBSC

Input: 353 features in Erie

Setting
  Metadata
    Theme := Traffic Accident
  User interface
    Goal := “identify hot spots”
    LevelOfDetail := State
    PlaceName := New York

Method
  Algorithm := SMTIN
  Constraint := Named-Roadway

Output: 18 clusters in Erie County

Page 25: Using formal ontology for integrated spatial data mining

Case study: Effect of scale (Task ontology)

OBSC clusters reflect spatial distribution specific to the scale of users’ interest

Control Algorithm
  TASK
    LevelOfDetail := Null
    PlaceName := Null
  DOMAIN
    Constraint := Roadway

OBSC Algorithm
  TASK
    LevelOfDetail := County
    PlaceName := New York
  DOMAIN
    Constraint := Roadway

Specifying area of interest doesn’t mask details

Page 26: Using formal ontology for integrated spatial data mining

Case study: Effect of constraint (Domain ontology)

OBSC clusters identify the physical barrier, owing to a concept implicit in the domain

Control Algorithm
  TASK
    LevelOfDetail := State
    PlaceName := New York
  DOMAIN
    Constraint := Null

OBSC Algorithm
  TASK
    LevelOfDetail := State
    PlaceName := New York
  DOMAIN
    Constraint := Roadway

Separated by body of water
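The effect of a domain constraint can be illustrated with a toy grouping routine. This is not the SMTIN algorithm named in the case study; it is a plain distance-threshold grouping (connected components via union-find) with an optional same-road rule standing in for the Roadway constraint. The point data, road labels, and function name are hypothetical.

```python
from math import dist


def cluster(points, max_gap, same_road_only=False):
    """Group points into connected components.

    points: list of (x, y, road) tuples.
    Two points are linked when they lie within max_gap of each other and,
    if same_road_only is set, carry the same road label.
    Returns clusters as sorted lists of point indices.
    """
    parent = list(range(len(points)))

    def find(i):
        # Follow parent links to the root, compressing the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, (xi, yi, ri) in enumerate(points):
        for j in range(i + 1, len(points)):
            xj, yj, rj = points[j]
            close = dist((xi, yi), (xj, yj)) <= max_gap
            allowed = (ri == rj) or not same_road_only
            if close and allowed:
                parent[find(i)] = find(j)

    groups = {}
    for i in range(len(points)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Two roads that run close together but are treated as separate features.
points = [(0, 0, "A"), (1, 0, "A"), (1, 1, "B"), (2, 1, "B")]
print(cluster(points, 1.0))                       # constraint off
print(cluster(points, 1.0, same_road_only=True))  # constraint on
```

With the constraint off, nearby points on different roads merge into one cluster; with it on, the grouping splits along the road labels, loosely mirroring how the constrained run keeps apart accidents separated by a body of water.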

Page 27: Using formal ontology for integrated spatial data mining

Case study: Benefit of using ontology in spatial clustering

Incorporating ontology in spatial clustering algorithms enhances the quality of spatial clustering results

Task ontology makes clusters usable: responsive to users’ view

Domain ontology makes clusters natural: dictated by concepts implicit in the domain

Page 28: Using formal ontology for integrated spatial data mining

Conclusion (cont.)

Presents how ontologies are incorporated in spatial data mining algorithms

Semantic linkage between ontologies and algorithms through parameterization

Scale as a task-oriented property
Constraint as a domain-specific property

Page 29: Using formal ontology for integrated spatial data mining

Conclusion

Ontology is examined as a means to customize algorithms to varying domains and tasks

Ontology enables algorithms to reflect concepts implicit in the domain, and to adapt to users’ view

Ontology provides a semantically plausible way to re-use existing algorithms

Ontology provides a systematic way of organizing the various factors that dictate the mechanisms underlying the data mining process