Predicting Locations Using Map Similarity (PLUMS): A Framework for Spatial Data Mining
Sanjay Chawla (Vignette Corporation)
Shashi Shekhar, Weili Wu (CS, Univ. of Minnesota)
Uygar Ozesmi (Erciyes University, Turkey)
http://www.cs.umn.edu/research/shashi-group
Outline
• Motivation
• Application Domain
• Distinguishing characteristics of spatial data mining
• Problem Definition
• Spatial Statistics Approach
• Our approach: PLUMS
• Experiments, Results, Conclusion and Future Work
Motivation
• Historical examples of spatial data exploration
– Asiatic cholera, 1855
– Theory of Gondwanaland
– Effect of fluoride on dental hygiene
• A potential application in the news
– Tracking the West Nile virus
Application Domain
• Wetland management: predicting locations of bird (red-winged blackbird) nests in wetlands
• Why we chose this application:
– Strong spatial component
– Domain expertise
– Classical data mining techniques (logistic regression, neural nets) had already been applied
Application Domain (continued)
[Figure: map panels titled Nest Locations, Distance to Open Water, Vegetation Durability, and Water Depth]
Unique characteristics of spatial data mining
Spatial Autocorrelation Property
Unique characteristics of spatial data mining (continued)
Average Distance to Nearest Prediction (ADNP):
$$\mathrm{ADNP}(A, P) = \frac{1}{K} \sum_{k=1}^{K} d\bigl(A_k,\ A_k.\mathrm{nearest}(P)\bigr)$$
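As a rough illustration of ADNP, the sketch below averages, over every actual location, the distance to its nearest predicted location. It assumes that A holds the actual nest locations, P the predicted ones, and that d is Euclidean distance on (x, y) coordinates; the function name `adnp` and the sample coordinates are hypothetical and not taken from the slides.

```python
import math


def adnp(actual, predicted):
    """Average Distance to Nearest Prediction.

    actual, predicted: lists of (x, y) tuples.
    Assumption: d is Euclidean distance; the slides do not fix the metric.
    """
    if not actual or not predicted:
        raise ValueError("both maps need at least one location")
    total = 0.0
    for a in actual:
        # Distance from this actual location to its nearest prediction.
        total += min(math.dist(a, p) for p in predicted)
    return total / len(actual)


# Hypothetical data: neither prediction map hits an actual nest exactly,
# so both score the same on cell-by-cell classification accuracy.
actual_nests = [(0, 0), (3, 4)]
near_predictions = [(0, 1), (3, 3)]
far_predictions = [(10, 10), (12, 9)]

print(adnp(actual_nests, near_predictions))
print(adnp(actual_nests, far_predictions))
```

A lower ADNP means the predictions fall closer to the actual locations, which is the kind of spatial accuracy that a cell-by-cell classification score cannot distinguish.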
Location Prediction: Problem Formulation
• Given: a spatial framework $S$
– Explanatory functions $f_{X_k}: S \to R^{k}$
– Dependent function $f_Y: S \to R^{Y} = \{0, 1\}$
– A family $F$ of learning model functions mapping $R^{k} \times \cdots \times R^{k} \to R^{Y}$
• Find: an element $\hat{f}_Y \in F$
• Objective: maximize map_similarity = classification_accuracy + spatial_accuracy
• Constraints: spatial autocorrelation exists
Spatial Statistics Approach
1. $y = X\beta + \epsilon$
2. $y = \rho W y + X\beta + \epsilon$
Logistic regression: $\Pr(y = 1) = \dfrac{e^{X\beta}}{1 + e^{X\beta}}$
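To make the role of the spatial term concrete, here is a minimal sketch of the logistic probability Pr(y = 1) = e^z / (1 + e^z), where z is the usual linear term Xβ plus a neighborhood term ρWy. Placing ρWy inside the logit is an assumption made for illustration, as are the function name `spatial_logistic_probs`, the row-normalized neighbor matrix W for four cells in a line, and all numeric values.

```python
import math


def spatial_logistic_probs(y, W, X, beta, rho):
    """Pr(y_i = 1) for each location i, with z_i = rho * (W y)_i + x_i . beta.

    Setting rho = 0 gives ordinary (non-spatial) logistic regression.
    """
    n = len(y)
    probs = []
    for i in range(n):
        spatial_lag = sum(W[i][j] * y[j] for j in range(n))  # (W y)_i
        linear = sum(x * b for x, b in zip(X[i], beta))      # x_i . beta
        z = rho * spatial_lag + linear
        # 1 / (1 + e^-z) is algebraically the same as e^z / (1 + e^z).
        probs.append(1.0 / (1.0 + math.exp(-z)))
    return probs


# Hypothetical example: four cells in a line, each row of W sums to 1.
W = [
    [0.0, 1.0, 0.0, 0.0],
    [0.5, 0.0, 0.5, 0.0],
    [0.0, 0.5, 0.0, 0.5],
    [0.0, 0.0, 1.0, 0.0],
]
y = [1, 1, 0, 0]                         # observed nest (1) / no nest (0)
X = [[1.0], [1.0], [1.0], [1.0]]         # intercept-only design matrix
beta = [-1.0]

classical = spatial_logistic_probs(y, W, X, beta, rho=0.0)
spatial = spatial_logistic_probs(y, W, X, beta, rho=2.0)
print(classical)
print(spatial)
```

With rho = 0 every cell gets the same probability because the explanatory data are identical; with rho > 0 the cells next to observed nests get a higher probability, which is how the model expresses spatial autocorrelation.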
Spatial Statistics: Solution Techniques
• Least squares estimation: biased and inconsistent
• Maximum likelihood: involves computing a large determinant (from W)
• Bayesian: Markov chain Monte Carlo (e.g., Gibbs sampling)
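The determinant in the maximum likelihood bullet can be seen from the log-likelihood of a spatial autoregressive model $y = \rho W y + X\beta + \epsilon$. The expression below is the standard form under the assumption of Gaussian errors $\epsilon \sim N(0, \sigma^2 I)$ over $n$ locations; the slides do not state the error distribution, so treat this as an illustrative sketch.

```latex
\ln L(\rho, \beta, \sigma^2)
  = \ln \lvert I - \rho W \rvert
  - \frac{n}{2} \ln(2\pi\sigma^2)
  - \frac{1}{2\sigma^2}
    (y - \rho W y - X\beta)^{T} (y - \rho W y - X\beta)
```

Because $W$ is an $n \times n$ matrix over all locations, the term $\ln \lvert I - \rho W \rvert$ has to be re-evaluated for each candidate value of $\rho$, which is what makes the estimation expensive on large maps.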
Our Approach
Experiment Setup
Result (1)
$$\mathrm{TPR} = \frac{TP}{TP + FN} \qquad \mathrm{FPR} = \frac{FP}{FP + TN}$$
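The two rates can be computed directly from binary actual and predicted labels, as in the short sketch below. The function names and the sample labels are hypothetical; the zero returned when a denominator is empty is a convention chosen here, not something stated in the slides.

```python
def confusion_counts(actual, predicted):
    """Count TP, FP, TN, FN for binary labels (1 = nest, 0 = no nest)."""
    tp = fp = tn = fn = 0
    for a, p in zip(actual, predicted):
        if a == 1 and p == 1:
            tp += 1
        elif a == 0 and p == 1:
            fp += 1
        elif a == 0 and p == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def tpr_fpr(actual, predicted):
    """Return (TPR, FPR) = (TP / (TP + FN), FP / (FP + TN))."""
    tp, fp, tn, fn = confusion_counts(actual, predicted)
    # Convention assumed here: a rate with an empty denominator is 0.0.
    tpr = tp / (tp + fn) if (tp + fn) else 0.0
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    return tpr, fpr


# Hypothetical labels for eight cells.
actual_labels = [1, 1, 1, 0, 0, 0, 0, 0]
predicted_labels = [1, 1, 0, 1, 0, 0, 0, 0]

tpr, fpr = tpr_fpr(actual_labels, predicted_labels)
print(tpr, fpr)
```

Sweeping a decision threshold and plotting the resulting (FPR, TPR) pairs gives an ROC curve, a common way to compare classifiers on these two rates.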
Result(2)
Conclusion and Future work
• PLUMS >> Classical Data Mining techniques
• PLUMS ≈ state-of-the-art spatial statistics approaches
• Better performance (two orders of magnitude)
• Try other configurations of the PLUMS framework and formalize!