Ad-hoc Distributed Spatial Joins on Mobile Devices Panos Kalnis, Xiaochen Li National University of Singapore Nikos Mamoulis The University of Hong Kong Spiridon Bakiras Hong Kong University of Science and Technology


Ad-hoc Distributed Spatial Joins on Mobile Devices

Panos Kalnis, Xiaochen Li
National University of Singapore

Nikos Mamoulis
The University of Hong Kong

Spiridon Bakiras
Hong Kong University of Science and Technology


Motivation

Users are equipped with a mobile device (e.g., PDA)

Ad-hoc spatial queries
Combine data from remote servers

[Figure: two remote servers, Hotels and Restaurants]

“Find hotels which are within 500m of a seafood restaurant”

Servers do not collaborate with each other
The query is executed on the mobile device


Mediators?

Services may only allow end-user connections (e.g., subscribers only)

Access through mediators may be more expensive

Requests are ad-hoc; existing mediators may not support them

[Figure: Hotels and Restaurants servers accessed through a Mediator]


Cost

Telecommunication companies typically charge by the volume of transferred data (e.g., GPRS) rather than by connection time.

Goal: Minimize the amount of transferred data.


Solution

Issue aggregate queries to estimate the data distribution (i.e., statistics)

Partition the space recursively to achieve sub-linear transfer cost

Choose the physical operator independently for each partition


Related Work

Hash-based methods (e.g., PBSM): require all data to be transferred

R-tree based methods (e.g., [Tan et al., TKDE, 2000]): require access to the internal index

Mediators:
  HERMES: statistics from previous queries
  DISCO, Garlic: statistics during initialization
  Tukwila: optimize parts of the execution tree


Operators

WINDOW query: return all objects intersecting a window w

COUNT query: return the number of objects intersecting w

ε-RANGE query: return all objects within range ε from a point p

NO access to the internal indices!

[Figure: a window w, and a range ε around a point p]
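The three operators can be sketched as a minimal in-memory stand-in for a remote service. The class and method names (`SpatialServer`, `window`, `count`, `eps_range`) and the point-only data model are assumptions made for illustration; a real server would answer these queries over the network and would not expose its index.

```python
from dataclasses import dataclass
from math import hypot


@dataclass(frozen=True)
class Window:
    x1: float
    y1: float
    x2: float
    y2: float

    def contains(self, p):
        x, y = p
        return self.x1 <= x <= self.x2 and self.y1 <= y <= self.y2


class SpatialServer:
    """Hypothetical stand-in for a remote service that stores point objects."""

    def __init__(self, points):
        self._points = list(points)

    def window(self, w):
        # WINDOW query: all objects intersecting window w
        return [p for p in self._points if w.contains(p)]

    def count(self, w):
        # COUNT query: only the number of objects intersecting w
        return sum(1 for p in self._points if w.contains(p))

    def eps_range(self, p, eps):
        # eps-RANGE query: all objects within distance eps of point p
        return [q for q in self._points
                if hypot(q[0] - p[0], q[1] - p[1]) <= eps]


hotels = SpatialServer([(1, 1), (2, 5), (8, 8)])
w = Window(0, 0, 3, 6)
print(hotels.window(w))
print(hotels.count(w))
print(hotels.eps_range((0, 0), 2.0))
```

The COUNT query returns a single number, which is why it is cheap enough to be used for gathering statistics before any objects are downloaded.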


Query Types

Intersection Join
  Find hotels which are inside parks

ε-range Join
  Find restaurants which are within 500m of a hotel

Iceberg Semi-join
  Find hotels which are close to at least 3 restaurants
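To make the three query types concrete, here is a brute-force sketch of their semantics on local lists. The function names and the threshold parameter `m` are assumptions for illustration; these definitions say nothing about how the joins are evaluated across remote servers.

```python
from math import dist


def intersects(a, b):
    # Rectangles are (x1, y1, x2, y2) with x1 <= x2 and y1 <= y2.
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def intersection_join(R, S):
    # All pairs of rectangles (r, s) that intersect.
    return [(r, s) for r in R for s in S if intersects(r, s)]


def eps_range_join(R, S, eps):
    # All pairs of points (r, s) at distance at most eps.
    return [(r, s) for r in R for s in S if dist(r, s) <= eps]


def iceberg_semi_join(R, S, eps, m):
    # Points of R that have at least m points of S within distance eps.
    return [r for r in R if sum(dist(r, s) <= eps for s in S) >= m]


hotels = [(0, 0), (10, 10)]
restaurants = [(0, 1), (1, 0), (1, 1), (10, 12)]
print(eps_range_join(restaurants, hotels, 1.5))
print(iceberg_semi_join(hotels, restaurants, 1.5, 3))
print(intersection_join([(1, 1, 2, 2), (6, 6, 7, 7)], [(0, 0, 5, 5)]))
```

Note that the iceberg semi-join returns objects of one dataset only, so the matching restaurants never need to reach the result.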


Hash Based Spatial Join

Each partition must fit in memory


Recursive evaluation

Retrieve statistics for each subpart
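A minimal sketch of the recursive step, assuming point data, a quadrant split, and a hypothetical `capacity` parameter standing for the number of objects that fit in the device memory. The local `count` function stands in for a remote COUNT query. Windows where either dataset is empty are pruned, which is valid for an intersection join; an ε-range join would need the windows extended by ε, which this sketch omits.

```python
def quadrants(w):
    """Split window w = (x1, y1, x2, y2) into four equal quadrants."""
    x1, y1, x2, y2 = w
    mx, my = (x1 + x2) / 2, (y1 + y2) / 2
    return [(x1, y1, mx, my), (mx, y1, x2, my),
            (mx, my, x2, y2), (x1, my, mx, y2)]


def count(points, w):
    """Local stand-in for a remote COUNT query (half-open window, so a
    point on an inner boundary is counted in exactly one quadrant)."""
    x1, y1, x2, y2 = w
    return sum(1 for x, y in points if x1 <= x < x2 and y1 <= y < y2)


def windows_to_join(R, S, w, capacity, depth=0, max_depth=8):
    """Return the windows whose objects must be downloaded and joined."""
    n_r, n_s = count(R, w), count(S, w)
    if n_r == 0 or n_s == 0:
        return []                      # nothing can join here: prune
    if n_r + n_s <= capacity or depth == max_depth:
        return [w]                     # small enough to join in memory
    result = []
    for q in quadrants(w):             # otherwise get statistics per subpart
        result.extend(windows_to_join(R, S, q, capacity, depth + 1, max_depth))
    return result


R = [(1, 1), (9, 9)]
S = [(1.5, 1.5), (2, 8)]
print(windows_to_join(R, S, (0, 0, 10, 10), capacity=2))
```

In this toy input only the lower-left quadrant contains objects from both datasets, so the other three quadrants are discarded after a COUNT query each, without transferring any objects.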


Inefficient HBSJ


Nested Loop Spatial Join

Recursive HBSJ : 4 QRY + 2 RCV + 5 RCV

NLSJ : 2 RCV + 2 SND + 2 RES
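The nested loop idea can be sketched as follows: receive the objects of one dataset once, then send each of them as a probe to the other server and collect the answers. The `PointServer` class, its query counter, and the use of an ε-RANGE probe are assumptions for illustration.

```python
from math import dist


class PointServer:
    """Hypothetical stand-in for a remote service; counts received queries."""

    def __init__(self, points):
        self.points = list(points)
        self.queries = 0

    def window(self, w):
        self.queries += 1
        x1, y1, x2, y2 = w
        return [p for p in self.points
                if x1 <= p[0] <= x2 and y1 <= p[1] <= y2]

    def eps_range(self, p, eps):
        self.queries += 1
        return [q for q in self.points if dist(p, q) <= eps]


def nlsj(outer, inner, w, eps):
    """Nested loop spatial join of the objects inside window w."""
    pairs = []
    for r in outer.window(w):                 # receive outer objects once
        for s in inner.eps_range(r, eps):     # one probe per outer object
            pairs.append((r, s))
    return pairs


hotels = PointServer([(1, 1), (4, 4)])
restaurants = PointServer([(1, 2), (9, 9)])
print(nlsj(hotels, restaurants, (0, 0, 5, 5), eps=1.5))
print(restaurants.queries)
```

The number of probes grows with the size of the outer dataset, so this operator pays off only when the outer side is small relative to the data that a full download would transfer.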


Inefficient NLSJ


Cost Model

TCP/IP: MTU = MSS + B_H

T_B(B_D) = B_D + B_H \cdot \lceil B_D / MSS \rceil

c1: download |R_w| objects from R and |S_w| objects from S and join them on the PDA

c2, c3: download |R_w| objects from R, send them as window queries to S and retrieve the results

c4: repartition w, retrieve detailed statistics and apply the algorithm recursively
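A rough sketch of how such costs could be computed, assuming each packet carries at most MSS payload bytes plus B_H header bytes, so that B_D payload bytes cost B_D + B_H · ⌈B_D / MSS⌉ bytes on the wire. The constants (MTU of 1500 bytes, 40 header bytes), the per-object size `b_obj`, the per-query size `b_qry`, and the estimated `matches_per_probe` are all hypothetical parameters, not values from the slides.

```python
from math import ceil

MTU = 1500          # assumed maximum transmission unit in bytes
B_H = 40            # assumed TCP/IP header bytes per packet
MSS = MTU - B_H     # maximum payload per packet (MTU = MSS + B_H)


def t_b(b_d):
    """Total transferred bytes for b_d payload bytes, headers included."""
    return b_d + B_H * ceil(b_d / MSS)


def cost_download_both(n_r, n_s, b_obj, b_qry):
    """Sketch of c1: one query to each server, then download both sides."""
    return 2 * t_b(b_qry) + t_b(n_r * b_obj) + t_b(n_s * b_obj)


def cost_probe(n_outer, b_obj, b_qry, matches_per_probe):
    """Sketch of c2/c3: download the outer side, then send one probe per
    outer object and retrieve an estimated number of matches for each."""
    download = t_b(b_qry) + t_b(n_outer * b_obj)
    per_probe = t_b(b_qry) + t_b(matches_per_probe * b_obj)
    return download + n_outer * per_probe


print(t_b(1460), t_b(1461))
print(cost_download_both(100, 100, b_obj=20, b_qry=50))
print(cost_probe(2, b_obj=20, b_qry=50, matches_per_probe=1))
```

Because every probe pays at least one header, many small messages can cost more than one bulk transfer of the same payload, which is the trade-off the cost model has to capture.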


UpJoin (Uniform Partition Join)

Decide if datasets are uniform

If HBSJ is cheaper and both datasets are uniform then perform HBSJ

If NLSJ is cheaper and the larger dataset is uniform then perform NLSJ

Else repartition
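The decision rule above can be written as a small function. This is a sketch under assumptions: the costs and uniformity flags are taken as given inputs, ties between the two costs favour HBSJ, and when both datasets have the same size the first one is treated as the larger. None of these tie-breaking choices come from the slides.

```python
def upjoin_step(c_hbsj, c_nlsj, n_r, n_s, r_uniform, s_uniform):
    """Return the action for one window: 'HBSJ', 'NLSJ' or 'REPARTITION'.

    c_hbsj, c_nlsj: estimated transfer costs of the two operators
    n_r, n_s: number of objects of R and S in the window
    r_uniform, s_uniform: outcome of the uniformity check per dataset
    """
    if c_hbsj <= c_nlsj:
        # HBSJ is cheaper: trust the estimate only if both are uniform.
        if r_uniform and s_uniform:
            return "HBSJ"
    else:
        # NLSJ is cheaper: only the larger dataset must be uniform.
        larger_is_uniform = r_uniform if n_r >= n_s else s_uniform
        if larger_is_uniform:
            return "NLSJ"
    return "REPARTITION"


print(upjoin_step(100, 200, 50, 50, True, True))
print(upjoin_step(300, 200, 500, 10, True, False))
print(upjoin_step(300, 200, 10, 500, True, False))
```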


Uniformity check

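|D_{w'_i} - D_w / 4| \le \alpha \cdot D_w, for each quadrant w'_i (i = 0, ..., 3)

D_w: number of objects in window w; D_{w'_0}, ..., D_{w'_3}: number of objects in its four quadrants

α: % variation from uniform distribution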

Note: UpJoin will not repartition if the cost for retrieving statistics is larger than the cost of joining
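One possible reading of the check, as a sketch: a window with D_w objects counts as uniform when every quadrant count stays within α · D_w of the expected D_w / 4. The exact inequality and the function name `is_uniform` are assumptions; the slide only states that α is the tolerated percentage of variation from the uniform distribution.

```python
def is_uniform(d_w, quadrant_counts, alpha):
    """Assumed test: each quadrant count deviates from d_w / 4 by at most
    alpha * d_w. The quadrant counts come from four COUNT queries."""
    expected = d_w / 4
    return all(abs(d - expected) <= alpha * d_w for d in quadrant_counts)


print(is_uniform(100, [25, 25, 25, 25], alpha=0.05))
print(is_uniform(100, [30, 20, 25, 25], alpha=0.10))
print(is_uniform(100, [40, 20, 20, 20], alpha=0.10))
```

A larger α accepts more skew as "uniform" and stops the recursion earlier, trading estimation accuracy for fewer statistics queries.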


Inefficient UpJoin


SR-Join (Similarity Related Join)

Density of window w: D_w / A_w; density of quadrant w_i: D_{w_i} / A_{w_i} (A: area)

ρ: % variation of density

Identify dense and sparse quadrants

If the distribution is similar then apply HBSJ or NLSJ

Else repartition
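The dense/sparse classification could look like the sketch below. It rests on several assumptions that the slides do not confirm: a quadrant is labelled dense or sparse when its density differs from the density of the whole window by more than a fraction ρ, and two datasets count as "similar" when their per-quadrant labels coincide. The names `density_labels` and `similar` are hypothetical.

```python
def density_labels(counts, areas, rho):
    """Label each quadrant relative to the density of the whole window.

    counts: objects per quadrant (from COUNT queries)
    areas:  area of each quadrant (must not all be zero)
    rho:    tolerated fractional variation of density (assumed meaning)
    """
    window_density = sum(counts) / sum(areas)
    labels = []
    for d, a in zip(counts, areas):
        density = d / a
        if density > (1 + rho) * window_density:
            labels.append("dense")
        elif density < (1 - rho) * window_density:
            labels.append("sparse")
        else:
            labels.append("average")
    return labels


def similar(labels_r, labels_s):
    # Assumed similarity test: identical dense/sparse pattern.
    return labels_r == labels_s


areas = [1, 1, 1, 1]
labels_r = density_labels([50, 10, 25, 15], areas, rho=0.1)
labels_s = density_labels([100, 20, 50, 30], areas, rho=0.1)
print(labels_r)
print(similar(labels_r, labels_s))
```

In this example S has twice as many objects as R but the same spatial pattern, so the two label lists match even though neither dataset is uniform.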


Experimental setup

Implementation
  Server: Unix
  Client: HP iPAQ PDA (WiFi network, 400MHz RISC CPU, 64MB RAM, Windows Pocket PC)

Datasets:
  Synthetic: 1K – 10K points, varying skew
  Real: Roads and railways of Germany


Setting the parameters

[Plots: α (for UpJoin); ρ (for SR-Join)]


Real Dataset



Comparison with SemiJoin

• SemiJoin: Use intermediate levels of R-Tree index
• We cannot use it in practice, because we cannot access the index



Conclusions

Distributed spatial joins on mobile devices
  No mediator – non-collaborative servers – limited set of supported operators

Two algorithms
  UpJoin
  SR-Join
  Both estimate the datasets’ distribution

Future work
  Support multi-way spatial joins
  Improve the accuracy of the cost model


Questions?