Clustering of location-based data

Mohammad Rezaei

May 2013

Data mining and Clustering

- Huge amount of location-based data

- Need for mechanisms to extract knowledge

- Clustering as an important field in spatio-temporal data mining

Clustering

Some applications

- Routing

- Interesting places

- Recommendation of services

- Marketing management

- Users with same interests

- Visualization

Clustering Problems in Mopsi

- Clutter of markers on the map

- Similar services or photos in a list

- Categorization of services

- Distribution of users’ locations

- Timeline view of photos

- Clustering of events

Clutter of markers

Search results

Clustering

Photos

Solutions

Grid based clustering

Distance based clustering

Google Maps version 3.0

- Using location in pixels for grid-based clustering

- 22 zoom levels

- 256*256 pixels at zoom level 0 to 536870912*536870912 pixels at zoom level 21

- ≈ 60*10^12 cells at zoom level 21 with cell size (60, 80)
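The map sizes quoted above follow from a 256-pixel map at zoom level 0 that doubles in width and height at each level. A minimal sketch of that arithmetic, assuming the cell size (60, 80) is given in pixels; the function names are hypothetical and not part of any Google Maps API:

```python
def world_size_px(zoom: int, tile_size: int = 256) -> int:
    """Side length of the whole map in pixels at the given zoom level."""
    return tile_size * 2 ** zoom


def grid_cell_count(zoom: int, cell_w: int = 60, cell_h: int = 80) -> float:
    """Approximate number of grid cells needed to cover the whole map."""
    size = world_size_px(zoom)
    return (size / cell_w) * (size / cell_h)


print(world_size_px(0))              # 256
print(world_size_px(21))             # 536870912
print(f"{grid_cell_count(21):.1e}")  # 6.0e+13, i.e. about 60 * 10^12
```

A grid of this size cannot be stored cell by cell, which is one reason to build the grid only over the current view.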

Some issues

- Photos are added or deleted dynamically

- Querying for a certain time, certain user or according to photo description

- Different zoom levels, moving map

Hierarchical Clustering on server

Individual clustering for different zoom levels

Clustering of whole data

How to extract clusters for a specific query?

Can clusters for a lower zoom level be derived from those of a higher level?

Client side clustering

- Query from server (Resulting N objects)

- Take the zoom view

Not too many cells

- Take the objects in the zoom view and cluster only them (M objects)

- Finding the objects in the zoom view takes O(N)!
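The O(N) cost mentioned above comes from testing every query result against the bounds of the view. A minimal sketch, assuming objects are already given as (x, y) pixel coordinates; `objects_in_view` is a hypothetical helper name:

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (x, y) in map pixels


def objects_in_view(objects: List[Point], xmin: float, ymin: float,
                    xmax: float, ymax: float) -> List[Point]:
    """Linear scan: keep only the objects inside the current zoom view."""
    return [(x, y) for (x, y) in objects
            if xmin <= x <= xmax and ymin <= y <= ymax]


all_objects = [(10, 10), (500, 40), (120, 300), (90, 95)]  # N = 4
visible = objects_in_view(all_objects, 0, 0, 200, 200)     # M = 2
print(visible)  # [(10, 10), (90, 95)]
```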

Input
- Location (lat, lon) of markers
- Width and height of markers (Hm, Wm)
- Width and height of cells in the grid (H, W)

Output
- Location of clusters

Location of the marker

Representation - Middle of cell

- No overlap

- Locations can be misleading

Representation - First object

Representation - Average location
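The three representations can be compared side by side. A minimal sketch, assuming a cell is identified by its 0-based row and column and that cluster members are (x, y) points; all function names are illustrative:

```python
from typing import List, Tuple

Point = Tuple[float, float]


def middle_of_cell(row: int, col: int, xmin: float, ymin: float,
                   W: float, H: float) -> Point:
    """Center of the grid cell: no overlap, but may be far from the data."""
    return (xmin + (col + 0.5) * W, ymin + (row + 0.5) * H)


def first_object(members: List[Point]) -> Point:
    """Location of the first object that fell into the cell."""
    return members[0]


def average_location(members: List[Point]) -> Point:
    """Mean of the member locations."""
    n = len(members)
    return (sum(x for x, _ in members) / n, sum(y for _, y in members) / n)


members = [(10.0, 20.0), (30.0, 40.0), (20.0, 60.0)]
print(middle_of_cell(0, 0, 0.0, 0.0, 100.0, 80.0))  # (50.0, 40.0)
print(first_object(members))                        # (10.0, 20.0)
print(average_location(members))                    # (20.0, 40.0)
```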

Proposed approach

- Grids start from beginning of the whole map

- Extend the grid in current zoom view

When the map is moved, the clusters do not change

- Average location for representative

(xmin, ymin)

(xmax, ymax)

Algorithm

1. nRow = ceil((ymax-ymin)/H)
2. nColumn = ceil((xmax-xmin)/W)
3. nCell = nRow * nColumn
4. Clusters = all cells // empty clusters
5. For all the markers
6.   row = floor((y-ymin)/H)
7.   column = floor((x-xmin)/W)
8.   cellNum = row*nColumn + column
9.   Add the marker to Clusters[cellNum]
10.  Update the cluster: Clusters[cellNum]

Cell number (grid between (xmin, ymin) and (xmax, ymax), numbered row by row)

1 2 3 4 5

6 7 8 9 10
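The grid assignment can be sketched in Python as follows. This is an illustration under stated assumptions, not the author's implementation: rows are derived from y and columns from x, cell numbers are 0-based, only non-empty cells are stored (a dictionary instead of an array of all cells), and the cluster update is taken to be a running average of the member locations.

```python
import math
from typing import Dict, List, Tuple

Point = Tuple[float, float]  # (x, y) in map pixels


def grid_clusters(markers: List[Point], xmin: float, ymin: float,
                  xmax: float, ymax: float, W: float, H: float
                  ) -> Dict[int, Dict[str, float]]:
    """Assign each marker to a grid cell; keep size and average location."""
    n_row = math.ceil((ymax - ymin) / H)
    n_col = math.ceil((xmax - xmin) / W)
    clusters: Dict[int, Dict[str, float]] = {}
    for x, y in markers:
        # Clamp so markers lying exactly on the far edge stay in the grid.
        row = min(int((y - ymin) // H), n_row - 1)
        col = min(int((x - xmin) // W), n_col - 1)
        cell = row * n_col + col  # 0-based cell number
        c = clusters.setdefault(cell, {"n": 0, "x": 0.0, "y": 0.0})
        # Running average keeps the representative up to date in O(1).
        c["x"] = (c["n"] * c["x"] + x) / (c["n"] + 1)
        c["y"] = (c["n"] * c["y"] + y) / (c["n"] + 1)
        c["n"] += 1
    return clusters


markers = [(10, 10), (30, 50), (290, 10), (130, 100)]
result = grid_clusters(markers, 0, 0, 300, 160, W=60, H=80)
for cell, c in sorted(result.items()):
    print(cell, c)  # cells 0, 4 and 7 of a 2 x 5 grid are occupied
```

Each marker is handled once, so the pass is linear in the number of markers in the view.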

Merging algorithm - Average location as representative

1. MergeClusters(clusters)
2.   sort the clusters in descending order of size
3.   set the parent of each cluster to the cluster itself
4.   k = 1 // K is the number of clusters
5.   while (k < K)
6.     if (k is not “processed”)
7.       checkNeighbors(k)
8.       mark the cluster k “processed”
9.     k = k+1

10. checkNeighbors(k)
11.   cluster1 = clusters[k]
12.   For all 8 neighbors
13.     cluster2 = one of the neighbors
14.     if cluster2 is not an empty cell
15.       checkNeighbor(cluster1, cluster2)

Merging algorithm

1. checkNeighbor(cluster1, cluster2)
2.   find the distance d between the two clusters
3.   if d < T // distance threshold T
4.     while (cluster2 is “processed”) // means it has been merged
5.       cluster2 = clusters[cluster2.parent]
6.     MergeClusters(cluster1, cluster2)

7. MergeClusters(cluster1, cluster2)
8.   n1 and n2: size of the clusters
9.   (x1,y1) and (x2,y2): location of clusters
10.  x = (n1*x1+n2*x2)/(n1+n2)
11.  y = (n1*y1+n2*y2)/(n1+n2)
12.  x1 ← x and y1 ← y
13.  mark the second cluster “processed”
14.  cluster2.parent = k
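The merging idea can be sketched as below. This is a simplified variant under several assumptions rather than a transcription of the slide: clusters are keyed by (row, column), a cluster counts as merged when its parent is no longer itself (replacing the separate "processed" flag), the distance is measured after following parent links to the surviving cluster, and the merged size is updated so later weighted averages stay correct. The name `merge_neighbors` is hypothetical.

```python
import math
from typing import Dict, Tuple

Cell = Tuple[int, int]  # (row, column)


def merge_neighbors(clusters: Dict[Cell, dict], T: float) -> Dict[Cell, dict]:
    """Merge clusters in neighboring cells whose representatives are closer than T.

    Each cluster is a dict with keys "n" (size), "x" and "y" (average location).
    The input dicts are updated in place.
    """
    parent = {cell: cell for cell in clusters}  # every cluster is its own parent
    order = sorted(clusters, key=lambda c: clusters[c]["n"], reverse=True)

    def root(cell: Cell) -> Cell:
        while parent[cell] != cell:  # follow links of merged clusters
            cell = parent[cell]
        return cell

    for cell in order:
        if parent[cell] != cell:  # already merged into another cluster
            continue
        r, c = cell
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                nb = (r + dr, c + dc)
                if nb == cell or nb not in clusters:  # skip self and empty cells
                    continue
                other = root(nb)
                if other == cell:
                    continue
                a, b = clusters[cell], clusters[other]
                if math.hypot(a["x"] - b["x"], a["y"] - b["y"]) < T:
                    n = a["n"] + b["n"]
                    a["x"] = (a["n"] * a["x"] + b["n"] * b["x"]) / n
                    a["y"] = (a["n"] * a["y"] + b["n"] * b["y"]) / n
                    a["n"] = n
                    parent[other] = cell
    return {cell: clusters[cell] for cell in clusters if parent[cell] == cell}


clusters = {
    (0, 0): {"n": 3, "x": 55.0, "y": 40.0},
    (0, 1): {"n": 1, "x": 65.0, "y": 42.0},   # just across the cell border
    (1, 3): {"n": 2, "x": 200.0, "y": 100.0},
}
merged = merge_neighbors(clusters, T=30)
print(merged)  # (0, 1) is absorbed by (0, 0); (1, 3) is unchanged
```

Processing the largest clusters first means a big cluster absorbs small ones near its border, rather than the other way around.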

Width and height of a cell: H > Hm and W > Wm

Minimum distance of the markers to avoid overlap: $d = \sqrt{H_m^2 + W_m^2}$

Marker

Location of marker
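Assuming the minimum distance is the diagonal of a marker, it can serve as the distance threshold T: two equally sized, axis-aligned markers overlap only if their centers differ by less than Wm horizontally and less than Hm vertically, so centers at least one diagonal apart never overlap. A small sketch with a hypothetical function name:

```python
import math


def min_marker_distance(Hm: float, Wm: float) -> float:
    """Diagonal of a marker: centers at least this far apart cannot overlap."""
    return math.hypot(Hm, Wm)


print(min_marker_distance(40, 30))  # 50.0
```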

Distance based clustering

Input
- Location (lat, lon) of markers
- Width and height of markers (Hm, Wm)

Output
- Location of clusters

Time complexity: O(N^2)

Algorithm

1. i = 0
2. While (i < N) // N = number of markers
3.   if (marker i is not clustered)
4.     Label marker i as clustered
5.     Calculate distance (dj) to other non-clustered markers
6.     for all markers j
7.       If dj < T // T: distance threshold
8.         merge the markers i and j
9.         Label marker j as clustered
10.  i = i+1
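The distance-based steps can be sketched as follows, assuming markers are (x, y) pixel coordinates and that distances are measured from the original location of marker i rather than from an updated cluster center. The function returns index lists so that any representative can be computed afterwards; `distance_clusters` is an illustrative name.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def distance_clusters(markers: List[Point], T: float) -> List[List[int]]:
    """Greedy O(N^2) clustering: each unclustered marker absorbs all
    unclustered markers closer than T. Returns lists of marker indices."""
    clustered = [False] * len(markers)
    clusters: List[List[int]] = []
    for i, (xi, yi) in enumerate(markers):
        if clustered[i]:
            continue
        clustered[i] = True
        members = [i]
        for j, (xj, yj) in enumerate(markers):
            if not clustered[j] and math.hypot(xi - xj, yi - yj) < T:
                members.append(j)
                clustered[j] = True
        clusters.append(members)
    return clusters


markers = [(0, 0), (3, 4), (100, 100), (102, 101), (50, 50)]
groups = distance_clusters(markers, T=10)
print(groups)  # [[0, 1], [2, 3], [4]]
```

The nested loops over all markers give the quadratic running time.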

Timeline view of photos

Displaying n photos in a limited space

Input
- Timestamps
- Number of clusters

Output
- Partitions

Algorithm
- K-means
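A minimal sketch of k-means on timestamps, assuming plain Lloyd iterations in one dimension and a deterministic initialization that spreads the centroids evenly over the sorted data. The initialization and the sample values are assumptions for illustration only.

```python
from typing import List


def kmeans_1d(timestamps: List[float], k: int, iterations: int = 100) -> List[List[float]]:
    """Lloyd's k-means in one dimension; returns k time-ordered partitions."""
    data = sorted(timestamps)
    # Assumed initialization: k centroids spread evenly over the sorted data.
    centroids = [data[(2 * i + 1) * len(data) // (2 * k)] for i in range(k)]
    parts: List[List[float]] = []
    for _ in range(iterations):
        parts = [[] for _ in range(k)]
        for t in data:
            nearest = min(range(k), key=lambda c: abs(t - centroids[c]))
            parts[nearest].append(t)
        # Keep the old centroid if a partition ends up empty.
        updated = [sum(p) / len(p) if p else centroids[c]
                   for c, p in enumerate(parts)]
        if updated == centroids:  # converged
            break
        centroids = updated
    return parts


# Photo times in hours since the first photo (made-up values).
times = [0, 1, 2, 48, 49, 50, 51, 200, 201]
partitions = kmeans_1d(times, k=3)
print(partitions)  # [[0, 1, 2], [48, 49, 50, 51], [200, 201]]
```

Each partition can then be shown as one stack of photos on the timeline.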

Location clusters

Homes of users

Shop

Walking street

Marketplace

Swimhall

Sciencepark

Clustering of trajectories

Similarity or distance

Start and end of the routes

Speed, length, acceleration, time, etc.

Example route speeds: 70 km/h, 72 km/h, 50 km/h, 30 km/h, 60 km/h

The two routes with the closest speeds (70 km/h and 72 km/h) are more similar in speed than the others

Closeness of points and shape (comparing whole routes or segments of the routes)

Figure: points of two routes labeled by time (t2 to t7 and t2 to t6)

Closest pair distance

Sum of pair distance
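Both measures can be sketched for routes given as lists of (x, y) points. The slide does not define how points are paired for the sum, so the pairing below (each point of the first route with its nearest point on the second) is an assumption, and the function names are illustrative.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def closest_pair_distance(a: List[Point], b: List[Point]) -> float:
    """Smallest distance between any point of route a and any point of route b."""
    return min(math.dist(p, q) for p in a for q in b)


def sum_of_pair_distances(a: List[Point], b: List[Point]) -> float:
    """Sum, over the points of a, of the distance to the nearest point of b.

    Assumed pairing; note that it is not symmetric in a and b.
    """
    return sum(min(math.dist(p, q) for q in b) for p in a)


route_a = [(0, 0), (1, 0), (2, 0)]
route_b = [(0, 3), (1, 4), (2, 3)]
print(closest_pair_distance(route_a, route_b))  # 3.0
print(sum_of_pair_distances(route_a, route_b))
```

The closest pair distance only says that two routes come near each other somewhere, while the sum reflects how close they stay along their whole length.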

Cluttering problem for routes
