Bio-inspired computational techniques applied to the analysis and visualization of spatio-temporal cluster dynamics Miguel Arturo Barreto Sánz [email protected] Faculté des Hautes Etudes Commerciales (HEC) Institut des Systèmes d'information (ISI)


Page 1: Bio inspired computational techniques applied to the analysis and visualization of spatio-temporal cluster dynamics

Bio-inspired computational techniques applied to the analysis and visualization of spatio-temporal cluster dynamics

Miguel Arturo Barreto Sánz

[email protected]

Faculté des Hautes Etudes Commerciales (HEC)

Institut des Systèmes d'information (ISI)

Page 2: Bio inspired computational techniques applied to the analysis and visualization of spatio-temporal cluster dynamics

Outline

● Introduction
  Data mining in spatio-temporal datasets

● Research plan
  Specific Goals
  Challenges in mining spatio-temporal datasets
  State of the art
  Approaches

● Preliminary results and discussion

Page 3: Bio inspired computational techniques applied to the analysis and visualization of spatio-temporal cluster dynamics

Introduction

● An increasing number of complex datasets are associated with geographical areas.

● Huge volumes of data describing human and natural behavior are routinely captured.

For instance:

Page 4: Bio inspired computational techniques applied to the analysis and visualization of spatio-temporal cluster dynamics

Introduction

Information sources

Information received from remote sensing systems and environmental monitoring devices used in:

● Agriculture
● Weather prediction
● Cartography

Page 5: Bio inspired computational techniques applied to the analysis and visualization of spatio-temporal cluster dynamics

Introduction

Data mining in spatio-temporal datasets

These datasets are critical for decision support, but their value depends on the ability to extract useful information for studying and understanding the phenomena governing the data source.

Page 6: Bio inspired computational techniques applied to the analysis and visualization of spatio-temporal cluster dynamics

Introduction

Data mining in spatio-temporal datasets

Currently

● Data mining of geospatial data takes only a static view of geospatial phenomena.

However

● Geographic phenomena evolve over time.
● Mining spatio-temporal data addresses the temporal dynamics of geospatial data, which is crucial to our understanding of geographic processes and events.

Goal

● Describe the manner in which spatial patterns change through time.

Page 7: Bio inspired computational techniques applied to the analysis and visualization of spatio-temporal cluster dynamics

Introduction

Data mining in spatio-temporal datasets

Some fields and applications include:

● Agro-ecology
● Environmental change
● Species distribution
● Disease propagation
● Urban dynamics
● Migration patterns

Page 8: Bio inspired computational techniques applied to the analysis and visualization of spatio-temporal cluster dynamics

Introduction

Data mining in spatio-temporal datasets

Manage and understand changing spatial patterns of yields

● Which variables make some regions produce more than others?

● Why do some regions maintain their production over time?

Page 9: Bio inspired computational techniques applied to the analysis and visualization of spatio-temporal cluster dynamics

Introduction

Data mining in spatio-temporal datasets

Environmental change (satellite images)

The Normalized Difference Vegetation Index (NDVI) gives a measure of the vegetative cover on the land surface over wide areas.

● Which variables are related to the changes in vegetative cover?

[Figure: NDVI satellite images for Summer 1989, 1990, 1991, 1992, 1993, 1994, 1996, 1997, 1998, 1999, 2000, and 2001]
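As a reminder of what the index measures, NDVI is conventionally computed per pixel from near-infrared (NIR) and red reflectance as (NIR - Red) / (NIR + Red). The sketch below is only illustrative: the function name and the reflectance values are assumptions, not data from the images on this slide.

```python
def ndvi(nir: float, red: float) -> float:
    """Normalized Difference Vegetation Index for one pixel.

    nir and red are non-negative reflectances in the near-infrared
    and red bands. Returns 0.0 when both bands are zero.
    """
    total = nir + red
    if total == 0:
        return 0.0
    return (nir - red) / total


# Made-up reflectances for the same pixel in two different summers.
summer_a = ndvi(nir=0.50, red=0.10)   # denser vegetation
summer_b = ndvi(nir=0.30, red=0.20)   # sparser vegetation
change = summer_b - summer_a          # negative: vegetative cover decreased
```

Comparing such per-pixel values between years is one simple way to pose the question of which variables accompany a change in vegetative cover.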

Page 10: Bio inspired computational techniques applied to the analysis and visualization of spatio-temporal cluster dynamics

Introduction

Data mining in spatio-temporal datasets

New methodologies

It is very important to conduct research on data mining of spatio-temporal datasets in order to:

● Develop methodologies
● Assist knowledge extraction from spatio-temporal datasets
● Improve decision-making processes

Page 11: Bio inspired computational techniques applied to the analysis and visualization of spatio-temporal cluster dynamics

Introduction

Data mining in spatio-temporal datasets

New methodologies

New methodologies for mining spatio-temporal datasets

Visualization of spatio-temporal cluster dynamics

To provide insights into the nature of cluster change

To deal with the inherent characteristics of spatio-temporal datasets:

● Multivariate and temporal mapping
● Visualization of very large datasets
● Changing spatial patterns

For instance …

Page 12: Bio inspired computational techniques applied to the analysis and visualization of spatio-temporal cluster dynamics

Introduction

Data mining in spatio-temporal datasets

New methodologies

[Figure: similarity of sugarcane-growing environmental conditions (1999-2001), obtained using self-organizing maps]

Page 13: Bio inspired computational techniques applied to the analysis and visualization of spatio-temporal cluster dynamics

Introduction

Data mining in spatio-temporal datasets

New methodologies

● Which variable or variables cause two clusters to merge into one?
● Are there sites that change from one cluster to another year after year? Why does that happen?
● Is it possible to find recurrent patterns in the dynamics of the clusters?
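The second and third questions can be made concrete with a small sketch. It assumes that each site already has a cluster label per year and that labels are comparable across years (a strong assumption; aligning clusters between years is itself part of the problem). Site names, years, and labels are hypothetical.

```python
from collections import Counter

# Hypothetical cluster label of each site per year (labels assumed aligned).
labels = {
    "site_A": {1999: 0, 2000: 0, 2001: 1},
    "site_B": {1999: 1, 2000: 1, 2001: 1},
    "site_C": {1999: 2, 2000: 0, 2001: 0},
}


def transitions(site_labels):
    """Return (year_from, year_to, cluster_from, cluster_to) for each change."""
    years = sorted(site_labels)
    return [
        (y0, y1, site_labels[y0], site_labels[y1])
        for y0, y1 in zip(years, years[1:])
        if site_labels[y0] != site_labels[y1]
    ]


# Sites that changed cluster at least once, with their transitions.
movers = {}
for site, by_year in labels.items():
    changes = transitions(by_year)
    if changes:
        movers[site] = changes

# How often each (from, to) move occurs; frequent moves hint at recurrent patterns.
move_counts = Counter(
    (c_from, c_to)
    for changes in movers.values()
    for (_, _, c_from, c_to) in changes
)
```

Here `movers` lists the sites that switch clusters, and `move_counts` gives a first, crude view of recurrence.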

Page 14: Bio inspired computational techniques applied to the analysis and visualization of spatio-temporal cluster dynamics

Research plan

Specific Goals

Development of bio-inspired methodologies for the detection and tracking of changes in spatio-temporal clusters.

● Agro-ecological datasets will be used as a case study.

● This approach involves finding clusters of sites with similar characteristics in time and space.

Development of bio-inspired methodologies for the visualization of spatio-temporal cluster dynamics.

Page 15: Bio inspired computational techniques applied to the analysis and visualization of spatio-temporal cluster dynamics

Research plan

Specific Goals

Clusters of sites with similar characteristics in time and space

Which crops or varieties are likely to perform well, where, and when.

Homologous places for Colombian coffee production: Brazil, Ecuador, East Africa, and New Guinea.

[Figure labels: Soil, Climate, Genotype]

Page 16: Bio inspired computational techniques applied to the analysis and visualization of spatio-temporal cluster dynamics

Research plan

Specific Goals

Clusters of sites with similar characteristics in time and space

Harvests of the same crop at different times

Page 17: Bio inspired computational techniques applied to the analysis and visualization of spatio-temporal cluster dynamics

Research plan

Specific Goals

Clusters of sites with similar characteristics in time and space

The COCH project

For commercial (mass-production) crops (rice, corn), the “when” and “where” are known.

For native crops (guanabana, lulo) or special types of crops (coffee varieties), this is not the case.

DAPA (Diversification Agriculture Project Alliance)

When and what should I cultivate? Market demand

Page 18: Bio inspired computational techniques applied to the analysis and visualization of spatio-temporal cluster dynamics

Research plan

Challenges in mining spatio-temporal datasets

The special nature of spatio-temporal data poses several challenges to the knowledge extraction process.

For instance:

● Heterogeneity in sources of information and in scales of time and space
● Spatial autocorrelation
● Boundaries in geospatial data
● Temporal relationships between spatial objects
● Visualization of spatio-temporal cluster dynamics
● Geographic space and feature space

Page 19: Bio inspired computational techniques applied to the analysis and visualization of spatio-temporal cluster dynamics

Research plan

Challenges in mining spatio-temporal datasets

Heterogeneity in sources of information

Conventional methods are not effective at handling mixtures of data types and sources.

Page 20: Bio inspired computational techniques applied to the analysis and visualization of spatio-temporal cluster dynamics

Research plan

Challenges in mining spatio-temporal datasets

Heterogeneity in scales of time and space

Methodologies are needed to evaluate clusters at different scales in order to find “interesting” patterns between levels.

The analysis of cluster structure at different scales can be improved by creating representations of the clusters that facilitate their selection at each scale.

Page 21: Bio inspired computational techniques applied to the analysis and visualization of spatio-temporal cluster dynamics

Research plan

Challenges in mining spatio-temporal datasets

Spatial autocorrelation

Spatial autocorrelation can be defined as the degree of relationship that exists between two or more spatial-data variables.
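One widely used statistic for the spatial autocorrelation of a single variable is Moran's I, which measures how similar values at neighbouring sites are. The sketch below is an illustration under stated assumptions, not the measure used in this work: the four sites form a simple chain in which only adjacent sites are neighbours, and all values are made up.

```python
def morans_i(values, weights):
    """Moran's I for one variable observed at n sites.

    weights[i][j] is the spatial weight between sites i and j
    (here 1 for neighbours, 0 otherwise). Values must not all be equal.
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    w_total = sum(sum(row) for row in weights)
    cross = sum(
        weights[i][j] * dev[i] * dev[j]
        for i in range(n)
        for j in range(n)
    )
    variance_sum = sum(d * d for d in dev)
    return (n / w_total) * (cross / variance_sum)


# Four sites in a row; each site neighbours only the adjacent ones.
chain = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]

smooth = morans_i([1, 2, 3, 4], chain)       # neighbours are similar: positive
alternating = morans_i([1, 4, 1, 4], chain)  # neighbours differ: negative
```

A positive value indicates that nearby sites tend to carry similar values, which is the situation clustering methods for geo-referenced data need to account for.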

Page 22: Bio inspired computational techniques applied to the analysis and visualization of spatio-temporal cluster dynamics

Research plan

Challenges in mining spatio-temporal datasets

Boundaries in geospatial data

Algorithms for knowledge discovery in spatio-temporal databases have to consider the neighbors of the geo-referenced data.

Part of the complexity of the problem lies in the fact that the boundaries of these neighborhoods are not hard, but soft.

Page 23: Bio inspired computational techniques applied to the analysis and visualization of spatio-temporal cluster dynamics

Research plan

Challenges in mining spatio-temporal datasets

Temporal relationships between spatial objects

The relationships between spatial objects can change over time.

These dynamic relationships can be observed, for instance, in how clusters change over time.

[Figure: similarity of sugarcane-growing environmental conditions (1999-2001), obtained using self-organizing maps]

Page 24: Bio inspired computational techniques applied to the analysis and visualization of spatio-temporal cluster dynamics

Research plan

Challenges in mining spatio-temporal datasets

Geographic space and feature space

Geographic space is concerned with surface features, such as the terrain we walk on.

Feature space visualization is concerned with the representation of similarities associated with geo-referenced sites in the geographic space.

[Figure panels: Geographic space, Feature space]

Page 25: Bio inspired computational techniques applied to the analysis and visualization of spatio-temporal cluster dynamics

Research plan

Challenges in mining spatio-temporal datasets

Visualization of spatio-temporal cluster dynamics

● Visualization of the overall structure of the dataset.

● Exploration of correlations and relationships.

● Visualization of temporal patterns.

[Figure: each 1 km × 1 km cell is 1 point; 1,336,025 points just for Colombia]

Page 26: Bio inspired computational techniques applied to the analysis and visualization of spatio-temporal cluster dynamics

Research plan

State of the art

Myra Spiliopoulou et al. MONIC: modeling and monitoring cluster transitions. In KDD ’06: Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining.

Daniel B. Neill et al. Detection of emerging space-time clusters. In KDD ’05: Proceedings of the eleventh ACM SIGKDD international conference on Knowledge discovery in data mining.

Geoffrey M. Jacquez. Spatial Cluster Analysis (The Handbook of Geographic Information Science). John Wilson (University of Southern California), 2008.

● Small databases
● No agro-ecological or environmental databases
● Recorded in controlled conditions
● Based on statistical models

Page 27: Bio inspired computational techniques applied to the analysis and visualization of spatio-temporal cluster dynamics

Research plan

Approaches

● Unsupervised learning, to handle heterogeneous data
  Used to analyze data when there is only a low level of knowledge about the dataset.

● Hierarchical methods, to handle heterogeneity in scales of time and space

Page 28: Bio inspired computational techniques applied to the analysis and visualization of spatio-temporal cluster dynamics

Research plan

Approaches

● Data abstraction methods, to handle heterogeneity in scales of time and space

[Figure labels: Examples, Prototype]

Page 29: Bio inspired computational techniques applied to the analysis and visualization of spatio-temporal cluster dynamics

Research plan

Approaches

● Self-Organizing Map (SOM), to handle spatial autocorrelation

A Self-Organizing Map (SOM) applies a learning strategy used in neural structures like the cortex. It presents several advantages that we will exploit in our research in order to gain insights into the spatial autocorrelation present in the geographic zones.

[Figure: the neighbourhood function h_ck(t) of a SOM, centred over the best-matching neuron m_c]
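To make the role of the neighbourhood function concrete, here is a minimal sketch of one SOM training step. The slide names h_ck(t) but does not give its form; the Gaussian kernel, the fixed learning rate `alpha`, the width `sigma`, and all numeric values below are assumptions chosen for illustration.

```python
import math


def best_matching_unit(codebook, x):
    """Index of the neuron whose weight vector is closest to x."""
    return min(
        range(len(codebook)),
        key=lambda k: sum((w - xi) ** 2 for w, xi in zip(codebook[k], x)),
    )


def neighbourhood(pos_c, pos_k, alpha, sigma):
    """Gaussian h_ck: learning rate damped by lattice distance to the BMU."""
    d2 = sum((a - b) ** 2 for a, b in zip(pos_c, pos_k))
    return alpha * math.exp(-d2 / (2.0 * sigma ** 2))


def som_step(codebook, positions, x, alpha, sigma):
    """Move every neuron toward x, weighted by h_ck; return the BMU index."""
    c = best_matching_unit(codebook, x)
    for k, weights in enumerate(codebook):
        h = neighbourhood(positions[c], positions[k], alpha, sigma)
        codebook[k] = [w + h * (xi - w) for w, xi in zip(weights, x)]
    return c


# A 1x3 lattice with 2-dimensional weight vectors (illustrative values).
positions = [(0, 0), (0, 1), (0, 2)]
codebook = [[0.0, 0.0], [0.5, 0.5], [1.0, 1.0]]
bmu = som_step(codebook, positions, x=[0.9, 1.0], alpha=0.5, sigma=1.0)
```

Because neurons close to the best-matching unit on the lattice are updated more strongly than distant ones, neighbouring cells end up representing similar inputs, which is the property the slide proposes to exploit.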

Page 30: Bio inspired computational techniques applied to the analysis and visualization of spatio-temporal cluster dynamics

Research plan

Approaches

● Self-Organizing Map (SOM), to relate geographic space and feature space

The clusters found in feature space are in many cases not the same as those found in geographic space.

A SOM represents clusters of a multidimensional space by mapping multidimensional data onto a two-dimensional lattice of cells.

[Figure: similarity of sugarcane-growing environmental conditions (1999-2005), obtained using self-organizing maps]

Page 31: Bio inspired computational techniques applied to the analysis and visualization of spatio-temporal cluster dynamics

Research plan

Approaches

● Self-Organizing Map (SOM), for the visualization of spatio-temporal cluster dynamics

Visualization of the overall structure of the dataset, that is, its clustering, patterns (similarities), and irregularities.

Exploration of correlations and relationships. This is primarily based on component plane displays in multiple views.

Visualization of temporal patterns. Examples are ordered component displays and trajectories.

[Figure label: Partial Correlation]
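A component plane is the set of values that one input variable takes across all neurons of a trained SOM, so comparing planes is a way of hunting for related variables. The sketch below uses plain Pearson correlation between planes, which is a simplification and not the partial correlation shown on the slide; the codebook and the variable names are hypothetical.

```python
import math


def component_plane(codebook, j):
    """Values of variable j across all neurons of the map."""
    return [weights[j] for weights in codebook]


def pearson(a, b):
    """Pearson correlation between two equally long, non-constant sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    norm_a = math.sqrt(sum((x - mean_a) ** 2 for x in a))
    norm_b = math.sqrt(sum((y - mean_b) ** 2 for y in b))
    return cov / (norm_a * norm_b)


# Hypothetical trained codebook: each neuron holds (temperature, rainfall, yield).
codebook = [
    [20.0, 100.0, 5.0],
    [22.0, 120.0, 6.0],
    [24.0, 140.0, 7.0],
    [26.0, 90.0, 8.0],
]

temperature = component_plane(codebook, 0)
rainfall = component_plane(codebook, 1)
crop_yield = component_plane(codebook, 2)

r_temp_yield = pearson(temperature, crop_yield)  # planes vary together
r_rain_yield = pearson(rainfall, crop_yield)     # planes are nearly unrelated
```

Planes with a high absolute correlation would be displayed side by side, which is the kind of ordering a multi-view component plane display aims for.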

Page 32: Bio inspired computational techniques applied to the analysis and visualization of spatio-temporal cluster dynamics

Research plan

Approaches

● Fuzzy logic, to handle boundaries in geospatial data

In many applications, crisp partitions are not the optimal representation of clusters.

Fuzzy logic, which makes it possible to represent degrees of membership, is a feature that could be added to the model.
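Degrees of membership can be illustrated with the membership formula of fuzzy c-means, in which a site belongs more strongly to the clusters whose prototypes are closer. The slides do not specify a membership function, so this choice, the fuzzifier `m`, and the distances are assumptions.

```python
def fuzzy_memberships(distances, m=2.0):
    """Fuzzy c-means style memberships of one site in each cluster.

    distances[i] is the distance from the site to prototype i;
    m > 1 is the fuzzifier. The memberships sum to 1.
    """
    # A site lying exactly on one or more prototypes belongs fully to them.
    zeros = [d == 0 for d in distances]
    if any(zeros):
        share = 1.0 / sum(zeros)
        return [share if is_zero else 0.0 for is_zero in zeros]
    exponent = 2.0 / (m - 1.0)
    return [
        1.0 / sum((d_i / d_k) ** exponent for d_k in distances)
        for d_i in distances
    ]


near_first = fuzzy_memberships([1.0, 3.0])   # mostly in the first cluster
on_boundary = fuzzy_memberships([2.0, 2.0])  # equally in both clusters
```

A site on the boundary between two clusters receives comparable memberships in both, instead of being forced into one crisp partition.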

Page 33: Bio inspired computational techniques applied to the analysis and visualization of spatio-temporal cluster dynamics

Research plan

Approaches

● Growing hierarchical Self-Organizing Structures, to handle non-stationary relationships between spatial objects

Dealing with non-stationary relationships implies finding relationships that vary through time and space.

This challenge involves creating methodologies capable of adapting their models in order to reveal the dynamics of the clusters and represent their characteristics as accurately as possible.

Growing hierarchical Self-Organizing Structures could be used as a basis for hybrid models in order to detect, reveal, and analyze spatio-temporal cluster dynamics.

Page 34: Bio inspired computational techniques applied to the analysis and visualization of spatio-temporal cluster dynamics

Research plan

Approaches

I propose ...

An unsupervised model based on self-organization that allows data abstraction, hierarchical organization of the clusters, and automatic detection of interesting changes in the dynamics of spatio-temporal clusters.

The model must have the following characteristics:

● It adapts its structure.

● Changes in its structure reveal cluster dynamics such as merging, emergence, mutation, and parallel dynamics.
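What detecting merging or emergence could mean operationally can be sketched by comparing the site membership of clusters in two consecutive years. The sketch is loosely in the spirit of the cluster-transition monitoring cited in the state of the art, not the proposed self-organizing model; the overlap rule, the threshold `tau`, and the data are assumptions.

```python
def overlap(old, new):
    """Fraction of the old cluster's sites found in the new cluster."""
    return len(old & new) / len(old)


def detect_dynamics(clusters_t1, clusters_t2, tau=0.5):
    """Label each new cluster as 'merged', 'continued', or 'emerged'.

    Clusters are dicts mapping a cluster id to a non-empty set of site ids.
    """
    result = {}
    for new_id, new_sites in clusters_t2.items():
        sources = [
            old_id
            for old_id, old_sites in clusters_t1.items()
            if overlap(old_sites, new_sites) >= tau
        ]
        if len(sources) > 1:
            result[new_id] = ("merged", sources)
        elif len(sources) == 1:
            result[new_id] = ("continued", sources)
        else:
            result[new_id] = ("emerged", sources)
    return result


# Hypothetical clusters of sites in two consecutive years.
clusters_1999 = {"A": {"s1", "s2"}, "B": {"s3", "s4"}, "C": {"s5", "s6"}}
clusters_2000 = {
    "X": {"s1", "s2", "s3", "s4"},
    "Y": {"s5", "s6"},
    "Z": {"s7", "s8"},
}
dynamics = detect_dynamics(clusters_1999, clusters_2000)
```

In this toy case, clusters A and B merge into X, C continues as Y, and Z emerges from sites not seen before.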

Page 35: Bio inspired computational techniques applied to the analysis and visualization of spatio-temporal cluster dynamics

Research plan

Approaches

I propose ...

● The hierarchical structure will make it possible to tackle the scale effect (navigation of the clustering structure at different levels).

● The model will work with fuzzy memberships to address the problem of boundaries in geospatial data.

● The unsupervised methodology will help to find relationships that may be hidden in very large and heterogeneous datasets (heterogeneity in sources of information).

Page 36: Bio inspired computational techniques applied to the analysis and visualization of spatio-temporal cluster dynamics

Preliminary results and discussion

[1] Miguel Barreto-Sanz and Andrés Pérez-Uribe. Classification of similar productivity zones in the sugar cane culture using clustering of SOM component planes based on the SOM distance matrix. In The 6th International Workshop on Self-Organizing Maps (WSOM), 2007.

[2] Miguel Barreto-Sanz and Andrés Pérez-Uribe. Improving the correlation hunting in a large quantity of SOM component planes. In ICANN 2007: Proceedings of the 17th international conference on Artificial Neural Networks.

[3] Miguel Barreto-Sanz and Andrés Pérez-Uribe. Tree-structured self-organizing map component planes as a visualization tool for data exploration in agro-ecological modeling. In Proc. of the 6th European Conf. on Ecological Modelling, Trieste, Italy, 2007.

Page 37: Bio inspired computational techniques applied to the analysis and visualization of spatio-temporal cluster dynamics

Preliminary results and discussion

[4] Miguel Barreto-Sanz, Andrés Pérez-Uribe, Carlos-Andres Peña-Reyes, and Marco Tomassini. Fuzzy growing hierarchical self-organizing networks. In ICANN 2008: Proceedings of the 18th international conference on Artificial Neural Networks.

[5] Miguel Barreto-Sanz, Andrés Pérez-Uribe, Carlos-Andres Peña-Reyes, and Marco Tomassini. Tuning parameters in the fuzzy growing hierarchical self-organizing networks. To appear in: Studies in Computational Intelligence, Constructive Neural Networks, Springer, 2009.

Page 38: Bio inspired computational techniques applied to the analysis and visualization of spatio-temporal cluster dynamics

Thanks for new ideas and directions to explore!