Efficient Progressive Processing of Skyline Queries in Peer-to-Peer Systems INFOSCALE’06

Outline

Introduction

Algo

Evaluation

Conclusion

Introduction

Example: finding a hotel that is both close to the beach and low in price.

[Figure: hotels plotted on the axes Distance and Price]

Semantic Small World

Each peer chooses the centroid of its largest data cluster as its semantic label.

Each node in the network knows its local neighbors, called short-range contacts.

Each node also knows a small number of randomly chosen nodes, called long-range contacts.

Each peer is responsible for managing data objects, as well as the location information of data objects stored at other peers, referred to as foreign indexes.
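The peer state described above can be sketched as a small data structure. This is only an illustration, not the SSW implementation: the `Peer` class, its field names, and the assumption that each peer's data is already grouped into clusters are all hypothetical.

```python
from dataclasses import dataclass, field


def centroid(points):
    """Component-wise mean of a non-empty list of equal-length tuples."""
    dims = len(points[0])
    return tuple(sum(p[d] for p in points) / len(points) for d in range(dims))


def semantic_label(clusters):
    """Centroid of the largest local cluster (clusters assumed precomputed)."""
    largest = max(clusters, key=len)
    return centroid(largest)


@dataclass
class Peer:
    # All field names are hypothetical; they mirror the terms used on the slide.
    peer_id: str
    clusters: list                                        # local data, grouped into clusters
    short_range: list = field(default_factory=list)       # local neighbors
    long_range: list = field(default_factory=list)        # randomly chosen distant peers
    foreign_indexes: dict = field(default_factory=dict)   # object key -> id of the peer storing it

    @property
    def label(self):
        return semantic_label(self.clusters)


peer = Peer("p1", clusters=[[(0, 0), (2, 2)], [(5, 5)]])
print(peer.label)
```

Here the largest cluster holds two points, so the label is their mean.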

Cont.

[Figure: SSW overlay structure and foreign indexes, with short-range and long-range links]

Problem Definition

For a 4-dimensional SSW with attributes {a0, a1, a2, a3}:

A skyline query Q = {a0: min, a2: max}

Q relates only to attribute dimensions a0 and a2.
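A skyline query of this form can be read as a dominance test restricted to the queried attributes. The sketch below uses the standard definition of dominance (no worse on every queried attribute, strictly better on at least one). The dictionary representation of objects and queries is an assumption made for illustration.

```python
def dominates(p, q, query):
    """True if object p dominates object q under the skyline query.

    `query` maps an attribute name to "min" or "max"; attributes that the
    query does not mention (a1 and a3 in the example) are ignored.
    """
    strictly_better = False
    for attr, preference in query.items():
        a, b = p[attr], q[attr]
        if preference == "max":
            a, b = -a, -b          # turn maximization into minimization
        if a > b:
            return False           # p is worse on this attribute
        if a < b:
            strictly_better = True
    return strictly_better


Q = {"a0": "min", "a2": "max"}
p = {"a0": 1, "a1": 9, "a2": 5, "a3": 0}
q = {"a0": 2, "a1": 0, "a2": 4, "a3": 0}
print(dominates(p, q, Q), dominates(q, p, Q))
```

Although q is better than p on a1, that attribute is outside Q, so p still dominates q.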

Algo.

Exact Algo.

Step: Locate the Origin Cluster

Find the boundary value of the skyline query (vbound).

Inter-Cluster Pruning

The query is forwarded to peers in a neighboring cluster as long as that cluster is not dominated by vbound.

Intra-Cluster Pruning

Prune irrelevant peers.

Skyline Computing
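The pruning and skyline-computing steps can be sketched as follows. Several things here are assumptions not stated on the slides: all queried attributes are minimized, vbound is treated as a point in the queried subspace, and a cluster is summarized by the best (lowest) corner of its bounding box. The skyline routine is a plain block-nested-loop pass, swapped in for illustration and not necessarily the method used in the paper. The sample points are the hotel records from the multi-path example later in the deck.

```python
def dominates(p, q):
    """p dominates q when every attribute is minimized."""
    return all(a <= b for a, b in zip(p, q)) and any(a < b for a, b in zip(p, q))


def cluster_is_pruned(vbound, best_corner):
    """A cluster can be skipped if vbound dominates the best corner of its
    bounding box: every point inside the box is then dominated as well."""
    return dominates(vbound, best_corner)


def skyline(points):
    """Block-nested-loop skyline: keep only points that no other point dominates."""
    result = []
    for p in points:
        if any(dominates(s, p) for s in result):
            continue                                       # p is dominated, drop it
        result = [s for s in result if not dominates(p, s)]  # p evicts points it dominates
        result.append(p)
    return result


# (price, distance to beach) for candidates A..G
hotels = [(79, 65), (73, 73), (88, 59), (182, 65), (103, 70), (69, 84), (90, 68)]
print(skyline(hotels))
print(cluster_is_pruned((70, 60), (80, 65)))
```

With these records, D, E and G are all dominated by A, so the local skyline consists of A, B, C and F.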

Exact Algo.

Approximate Algo.(Single-Path)

In cases where a semantic overlay network does not exist.

On receiving an incoming skyline query Q, the initial peer must decide, based on its knowledge of its contacts, which candidate peer the query is forwarded to next.

Semantic Distance

$\mathrm{Score} = \min\{(v'_0 - v_0),\ (v'_1 - v_1),\ \ldots,\ (v'_{j-1} - v_{j-1})\}$

$v'_i$: attribute $i$ of the candidate peer

$v_i$: attribute $i$ of the current peer

Single-Path

Current peer: Price 101, Distance to Beach 66
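A minimal sketch of single-path forwarding follows. It takes the score to be the minimum of the per-attribute differences between a candidate and the current peer, using the candidate values from the multi-path example later in the deck. The rule that the query goes to the candidate with the smallest score is an assumption, since the slides do not spell out the selection rule, and the function names are hypothetical.

```python
def single_path_score(candidate, current):
    """Minimum over attributes of (candidate value - current value)."""
    return min(c - v for c, v in zip(candidate, current))


def single_path_next(candidates, current):
    """Assumed rule: forward to the one candidate with the smallest score."""
    return min(candidates, key=lambda pid: single_path_score(candidates[pid], current))


current = (101, 66)                      # (price, distance to beach)
candidates = {
    "A": (79, 65), "B": (73, 73), "C": (88, 59), "D": (182, 65),
    "E": (103, 70), "F": (69, 84), "G": (90, 68),
}
print(single_path_next(candidates, current))
```

Under this rule the query follows a single contact, which is why a peer that is strong on another attribute can be missed.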

Discussion and Improvement

Consider A,B in the candidate list

A is cheaper.

B is nearer to the beach.

Case: A is chosen.

If B contains many hotel records that are near the beach,

then an important portion of a good skyline is neglected.

Multi-path

Semantic Distance

$\mathrm{Score} = \{(v'_0 - v_0),\ (v'_1 - v_1),\ \ldots,\ (v'_{j-1} - v_{j-1})\}$

The score function returns a j-tuple (one value per attribute) instead of a single result.

Cont.

Current peer: Price 101, Distance to Beach 66

Peer ID   Price   Distance to Beach   Score
A         79      65                  (-22, -1)
B         73      73                  (-28, 7)
C         88      59                  (-13, -7)
D         182     65                  (81, -1)
E         103     70                  (2, 4)
F         69      84                  (-32, 18)
G         90      68                  (-11, 2)

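F and C will be selected: F has the smallest score on price (-32) and C has the smallest score on distance to the beach (-7).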
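The selection in this example can be reproduced with a short sketch. It assumes that multi-path forwarding picks, for each attribute, the candidate with the smallest score component. This rule is inferred from the table and is consistent with F and C being chosen. The function names are hypothetical.

```python
def score(candidate, current):
    """Per-attribute differences (candidate value - current value)."""
    return tuple(c - v for c, v in zip(candidate, current))


def multi_path_next(candidates, current):
    """Assumed rule: for each attribute, pick the candidate with the smallest
    score component; a peer that wins several attributes is listed once."""
    scores = {pid: score(attrs, current) for pid, attrs in candidates.items()}
    chosen = []
    for d in range(len(current)):
        best = min(scores, key=lambda pid: scores[pid][d])
        if best not in chosen:
            chosen.append(best)
    return chosen


current = (101, 66)                      # (price, distance to beach)
candidates = {
    "A": (79, 65), "B": (73, 73), "C": (88, 59), "D": (182, 65),
    "E": (103, 70), "F": (69, 84), "G": (90, 68),
}
selected = multi_path_next(candidates, current)
print(selected)
```

Forwarding along one path per attribute keeps both the cheap end and the near-the-beach end of the skyline reachable.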

Evaluation: Result Quality

Result quality is measured as the area between an approximate skyline and the complete exact one, which takes all the data objects in the network into consideration.
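One way to read this metric in two dimensions is sketched below. It is an interpretation, not the paper's definition. The assumptions are that both attributes are minimized, that all points lie inside a bounding box with upper corner (x_max, y_max), and that the area between the two skylines is the difference between the regions they dominate inside that box.

```python
def dominated_area(points, x_max, y_max):
    """Area of the part of [0, x_max] x [0, y_max] dominated by at least one
    point, with both attributes minimized (a staircase-shaped region)."""
    pts = sorted(points)
    area = 0.0
    best_y = y_max
    for i, (x, _y) in enumerate(pts):
        best_y = min(best_y, pts[i][1])              # lowest y seen so far
        next_x = pts[i + 1][0] if i + 1 < len(pts) else x_max
        area += (next_x - x) * (y_max - best_y)      # vertical strip up to next point
    return area


def skyline_gap(exact, approximate, x_max, y_max):
    """Difference between the regions dominated by the exact and the
    approximate skyline; 0 means the approximation loses nothing."""
    return dominated_area(exact, x_max, y_max) - dominated_area(approximate, x_max, y_max)


exact = [(1, 3), (2, 1)]
approximate = [(2, 2)]
print(skyline_gap(exact, approximate, 4, 4))
```

A smaller gap means the approximate skyline lies closer to the exact one.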

Cont.

Conclusion