Parallel Computation of Skyline Queries Verification

COSC6490A Fall 2007 Slawomir Kmiec

Page 1: Parallel Computation of Skyline Queries Verification

Parallel Computation of Skyline Queries Verification

COSC6490A Fall 2007 Slawomir Kmiec

Page 2: Parallel Computation of Skyline Queries Verification

Presentation Outline

Skyline Concepts

The Parallel Algorithm

JPF Experience

JPF Issues

Abstraction

Results

Future Work

Summary

Questions

Page 3: Parallel Computation of Skyline Queries Verification

Skyline Concepts

In a set of points (or records), identify the points for which no other point is at least as good on every chosen attribute and strictly better on at least one.

Name        Rating   Avg. Price
Parthenon   5        $45.00
Olympus     4        $40.00
Coliseum    4        $30.00
Pyramid     3        $25.00
Bombay      5        $35.00
Paris       5        $40.00
Roma        4        $35.00
Palermo     3        $30.00

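Point p_a is said to dominate point p_b if x_i(p_a) ≤ x_i(p_b) for all i with 1 ≤ i ≤ d, and at least one of those inequalities is strict. Here d is the number of attributes and x_i(p) is the value of attribute i at point p, so smaller values are treated as better.

A point p in a set S is a skyline point if it is not dominated by any other point in S. The skyline of S is denoted sky(S).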
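The dominance test and the skyline definition can be sketched in a few lines of Java. This is a minimal illustration, not the presenter's code: the class name SkylineSketch and its methods are hypothetical, the skyline is computed naively in O(n^2), and it assumes that a higher rating and a lower price are preferred (the slide does not state the preference), so the rating is negated to match the "smaller is better" form of the definition.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class SkylineSketch {

    /** A named point; coordinates follow the convention that smaller is better. */
    static final class Point {
        final String name;
        final double[] x;
        Point(String name, double... x) { this.name = name; this.x = x; }
    }

    /** True if a dominates b: a <= b in every dimension and a < b in at least one. */
    static boolean dominates(Point a, Point b) {
        boolean strict = false;
        for (int i = 0; i < a.x.length; i++) {
            if (a.x[i] > b.x[i]) return false;
            if (a.x[i] < b.x[i]) strict = true;
        }
        return strict;
    }

    /** Naive skyline: keep every point that no other point dominates. */
    static List<String> skyline(List<Point> s) {
        List<String> result = new ArrayList<>();
        for (Point p : s) {
            boolean dominated = false;
            for (Point q : s) {
                if (dominates(q, p)) { dominated = true; break; }
            }
            if (!dominated) result.add(p.name);
        }
        return result;
    }

    /** The table from the slide; rating is negated (assumption: higher rating is better). */
    static List<Point> restaurants() {
        return Arrays.asList(
            new Point("Parthenon", -5, 45.00),
            new Point("Olympus",   -4, 40.00),
            new Point("Coliseum",  -4, 30.00),
            new Point("Pyramid",   -3, 25.00),
            new Point("Bombay",    -5, 35.00),
            new Point("Paris",     -5, 40.00),
            new Point("Roma",      -4, 35.00),
            new Point("Palermo",   -3, 30.00));
    }

    public static void main(String[] args) {
        System.out.println(skyline(restaurants())); // [Coliseum, Pyramid, Bombay]
    }
}
```

Under these assumptions Bombay dominates Parthenon, Olympus, Paris and Roma, and Coliseum dominates Palermo, which leaves three skyline points.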

Page 4: Parallel Computation of Skyline Queries Verification

The Parallel Algorithm (A)

Principles:
→ data is divided equally and distributed
→ a local skyline is computed at each peer
→ the size of the local skyline is shared with peers
→ if the combined results fit on every processor then
    → local skylines are exchanged with peers
    → processor p_i picks the ith chunk of the combined skyline and eliminates the points in it that the combined skyline dominates
    → local results are sent to the central process
→ end // of processing
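The "fits" branch can be sketched as a sequential simulation. This is only an illustration under stated assumptions: the class ParallelSkylineA and its methods are hypothetical names, the workers are simulated by a loop rather than by separate processes, the data is split round-robin, and the size-sharing step and the memory check are omitted because the slides do not specify them.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ParallelSkylineA {

    /** True if a dominates b (smaller is better in every dimension). */
    static boolean dominates(double[] a, double[] b) {
        boolean strict = false;
        for (int i = 0; i < a.length; i++) {
            if (a[i] > b[i]) return false;
            if (a[i] < b[i]) strict = true;
        }
        return strict;
    }

    /** Points of pts that no point of against dominates. */
    static List<double[]> filter(List<double[]> pts, List<double[]> against) {
        List<double[]> kept = new ArrayList<>();
        for (double[] p : pts) {
            boolean dominated = false;
            for (double[] q : against) {
                if (dominates(q, p)) { dominated = true; break; }
            }
            if (!dominated) kept.add(p);
        }
        return kept;
    }

    /** Branch A for n simulated workers; returns what the central process receives. */
    static List<double[]> run(List<double[]> data, int n) {
        // Divide the data equally (round-robin is an assumption).
        List<List<double[]>> parts = new ArrayList<>();
        for (int i = 0; i < n; i++) parts.add(new ArrayList<>());
        for (int i = 0; i < data.size(); i++) parts.get(i % n).add(data.get(i));

        // Each peer computes its local skyline; exchanging them yields the combined skyline.
        List<double[]> combined = new ArrayList<>();
        for (List<double[]> part : parts) combined.addAll(filter(part, part));

        // Worker i takes the i-th chunk and drops points the combined skyline dominates.
        List<double[]> central = new ArrayList<>();
        int chunk = (combined.size() + n - 1) / n;
        for (int i = 0; i < n; i++) {
            int from = Math.min(i * chunk, combined.size());
            int to = Math.min(from + chunk, combined.size());
            central.addAll(filter(combined.subList(from, to), combined));
        }
        return central;
    }

    /** Sample input as {-rating, price}, so that smaller is better (assumption). */
    static List<double[]> sample() {
        return Arrays.asList(
            new double[]{-5, 45}, new double[]{-4, 40}, new double[]{-4, 30},
            new double[]{-3, 25}, new double[]{-5, 35}, new double[]{-5, 40},
            new double[]{-4, 35}, new double[]{-3, 30});
    }

    public static void main(String[] args) {
        for (double[] p : run(sample(), 2)) System.out.println(Arrays.toString(p));
    }
}
```

Splitting the final filtering by chunk means each worker compares only its share of the combined skyline against the whole, which is where the parallel speed-up in this branch would come from.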

Page 5: Parallel Computation of Skyline Queries Verification

The Parallel Algorithm (A cont.)

Page 6: Parallel Computation of Skyline Queries Verification

The Parallel Algorithm (B)

Principles (continued):
→ else // the combined results do not fit on some p_i
    → loop until the required number of results is available or all p_i have finished do
        → each processor p_i picks a random set of points (in proportion to the size of its local skyline)
        → this set is submitted to all peers, which mark the points that they dominate and return the marked points to the sender
        → each processor p_i collects the points it submitted, removes the marked ones from the original set, and sends the remaining ones to the central processor
    → end loop
→ end // of processing
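The iterative branch can be sketched the same way. Again this is a sequential simulation with hypothetical names (ParallelSkylineB, run, sampleParts), not the presenter's implementation. It assumes a fixed batch size per round instead of one proportional to the local skyline, assumes peers mark points against their full local skylines, and omits the "required number of results" stop condition so that the loop runs until every worker has finished.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Random;

class ParallelSkylineB {

    /** True if a dominates b (smaller is better in every dimension). */
    static boolean dominates(double[] a, double[] b) {
        boolean strict = false;
        for (int i = 0; i < a.length; i++) {
            if (a[i] > b[i]) return false;
            if (a[i] < b[i]) strict = true;
        }
        return strict;
    }

    static boolean dominatedBy(List<double[]> pts, double[] p) {
        for (double[] q : pts) if (dominates(q, p)) return true;
        return false;
    }

    static List<double[]> localSkyline(List<double[]> part) {
        List<double[]> sky = new ArrayList<>();
        for (double[] p : part) if (!dominatedBy(part, p)) sky.add(p);
        return sky;
    }

    /** Branch B: workers submit random batches to peers until all have finished. */
    static List<double[]> run(List<List<double[]>> parts, int batch, long seed) {
        Random rnd = new Random(seed);
        List<List<double[]>> local = new ArrayList<>();   // used by peers for marking
        List<List<double[]>> pending = new ArrayList<>(); // points not yet submitted
        for (List<double[]> part : parts) {
            List<double[]> sky = localSkyline(part);
            local.add(sky);
            List<double[]> todo = new ArrayList<>(sky);
            Collections.shuffle(todo, rnd);               // random pick order
            pending.add(todo);
        }
        List<double[]> central = new ArrayList<>();
        boolean working = true;
        while (working) {
            working = false;
            for (int i = 0; i < pending.size(); i++) {
                List<double[]> todo = pending.get(i);
                int k = Math.min(batch, todo.size());
                if (k == 0) continue;                     // worker i has finished
                working = true;
                List<double[]> submitted = new ArrayList<>(todo.subList(0, k));
                todo.subList(0, k).clear();
                for (double[] p : submitted) {
                    boolean marked = false;               // marked = dominated by some peer
                    for (int j = 0; j < local.size() && !marked; j++) {
                        if (j != i) marked = dominatedBy(local.get(j), p);
                    }
                    if (!marked) central.add(p);          // unmarked points go to central
                }
            }
        }
        return central;
    }

    /** Two partitions of {-rating, price} points (assumption: smaller is better). */
    static List<List<double[]>> sampleParts() {
        return Arrays.asList(
            Arrays.asList(new double[]{-5, 45}, new double[]{-4, 30},
                          new double[]{-5, 35}, new double[]{-4, 35}),
            Arrays.asList(new double[]{-4, 40}, new double[]{-3, 25},
                          new double[]{-5, 40}, new double[]{-3, 30}));
    }

    public static void main(String[] args) {
        for (double[] p : run(sampleParts(), 1, 42L)) System.out.println(Arrays.toString(p));
    }
}
```

Because a point that survives its own local skyline and is not dominated by any peer's local skyline cannot be dominated globally, every point sent to the central processor in this sketch is already a final result, which is what lets results be delivered incrementally.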

Page 7: Parallel Computation of Skyline Queries Verification

The Parallel Algorithm (B cont.)

Page 8: Parallel Computation of Skyline Queries Verification

JPF Experience

getting JPF

getting JPF to run

the Eclipse way

the Linux way

incremental examples

configuration options

JPF value-added services

Page 9: Parallel Computation of Skyline Queries Verification

JPF Issues

• independent processors
  - restricted to threads
• eliminate native code classes
  - no Swing, Sockets, NIO, Regex (Eclipse)
  - out of 15 just java.util.ArrayList left
  - eliminate Socket-oriented developed classes
• search-state-space reduction
  - input: 10 points
  - 2 worker threads
  - operation abstraction
  - output discarded

Page 10: Parallel Computation of Skyline Queries Verification

Abstraction

• 2 types of developed classes left
  - SkylineMain and SkylineWorker - workflow classes
  - “Handler” classes - request handling classes

[Class diagram. SkylineMain side: SkylineMainListener, SkylineMainHandler, Thread, Socket, ServerSocket. SkylineWorker side: SkylineWorkerListener, SkylineWorkerHandler, Thread, Socket, ServerSocket.]

Page 11: Parallel Computation of Skyline Queries Verification

Abstraction (cont.)

• high volume of work:
  - due to a lot of original code
• removed all GUI:
  - remove Swing and AWT elements
• asynchronous Socket messaging done as:
  - keep references to workers instead of addresses
  - eliminate the “Listener” classes
  - each message done as an instance of the handler
  - create a handler for the destination worker
  - execute synchronous (blocking) part of data sending
  - start handler to execute asynchronous processing
  - each type of message split into a synchronous and an asynchronous part
• file IO done as:
  - store parameters as static constants
  - store input data as an array
  - replace input scanning with referencing the array
  - display or discard output
• String.split() method (Regex) done as:
  - re-done as a String manipulation method
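The socket-to-handler substitution described above might look roughly like the following. The classes Worker and WorkerHandler and the method send are hypothetical stand-ins, not the presenter's SkylineWorker or SkylineWorkerHandler, and the message payload is invented for illustration. The sketch also synchronizes all access to the worker's state, which the original code may not have done.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class HandlerSketch {

    /** Stand-in for a worker: holds state that incoming messages update. */
    static final class Worker {
        private final List<String> inbox = new ArrayList<>();
        private final List<String> processed = new ArrayList<>();

        /** Synchronous (blocking) part: the sender hands the payload over directly. */
        synchronized void deliver(String payload) { inbox.add(payload); }

        /** Asynchronous part: executed on the handler's own thread. */
        synchronized void process() {
            processed.addAll(inbox);
            inbox.clear();
        }

        synchronized List<String> processed() { return new ArrayList<>(processed); }
    }

    /** One message is one handler instance, created for the destination worker. */
    static final class WorkerHandler extends Thread {
        private final Worker destination;
        WorkerHandler(Worker destination) { this.destination = destination; }
        @Override public void run() { destination.process(); }
    }

    /** Replaces "open a socket to the worker's address and write the message". */
    static WorkerHandler send(Worker destination, String payload) {
        WorkerHandler handler = new WorkerHandler(destination);
        destination.deliver(payload); // synchronous part, on the caller's thread
        handler.start();              // asynchronous part, on a new thread
        return handler;
    }

    /** Sends one message and waits for the handler to finish. */
    static List<String> demo() {
        Worker worker = new Worker();
        WorkerHandler handler = send(worker, "local-skyline-size=2");
        try {
            handler.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return worker.processed();
    }

    public static void main(String[] args) {
        System.out.println(demo()); // [local-skyline-size=2]
    }
}
```

Holding a direct reference to the destination removes the need for ServerSocket listeners, and starting a thread per message preserves the interleavings that a model checker restricted to threads can still explore.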

Page 12: Parallel Computation of Skyline Queries Verification

Results

• issues reported
  - different issues at different settings
  - large volume of output to be analyzed
• uncaught-exception conditions
  - issues regarding unsynchronized access
  - the above reported as IllegalMonitorStateException
• deadlock conditions
  - issues regarding termination conditions
• PreciseRaceDetector
  - “Unprotected Variable Access” severe warnings
• possibly more
  - it ran for a long time with no other errors
  - it did not finish in the time given
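As a reminder of how unsynchronized access turns into an IllegalMonitorStateException in Java: calling wait(), notify() or notifyAll() on an object without holding its monitor throws that exception. The class below is a generic illustration with hypothetical names; it is not the code that JPF flagged.

```java
class MonitorSketch {
    private final Object lock = new Object();

    /** Buggy: notifyAll() is called without owning the monitor of lock. */
    void signalUnsynchronized() {
        lock.notifyAll();
    }

    /** Fixed: the call is made inside a synchronized block on the same object. */
    void signalSynchronized() {
        synchronized (lock) {
            lock.notifyAll();
        }
    }

    /** Runs r and reports whether it threw IllegalMonitorStateException. */
    static boolean throwsIllegalMonitorState(Runnable r) {
        try {
            r.run();
            return false;
        } catch (IllegalMonitorStateException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        MonitorSketch m = new MonitorSketch();
        System.out.println(throwsIllegalMonitorState(() -> m.signalUnsynchronized())); // true
        System.out.println(throwsIllegalMonitorState(() -> m.signalSynchronized()));   // false
    }
}
```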

Page 13: Parallel Computation of Skyline Queries Verification

Future Work

• atomize code
  - wrap code fragments into atomic operations
• protect shared variable access
  - use locks or synchronized blocks
  - re-run PreciseRaceDetector
• run it for an extended period of time
  - to search the complete state space
• analyze the applicability of issues found
  - with respect to their applicability to the original app
  - not as a result of the abstraction or transformation
• reduce shared data interaction
  - handlers to create private data structures to be quickly accepted by the corresponding main process
  - this will allow greater robustness and redundancy
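Protecting a shared variable with a synchronized block, as proposed above, could be as simple as the following sketch. The class and field names are hypothetical, and the counter merely stands in for whatever state the workers and handlers actually share.

```java
class SharedCounterSketch {
    private final Object lock = new Object();
    private int results = 0; // shared between worker threads

    /** Every read and write of results goes through the same lock. */
    void addResult() {
        synchronized (lock) {
            results++;
        }
    }

    int results() {
        synchronized (lock) {
            return results;
        }
    }

    /** Starts the given number of workers, each adding perWorker results. */
    static int runWorkers(int workers, int perWorker) {
        SharedCounterSketch shared = new SharedCounterSketch();
        Thread[] threads = new Thread[workers];
        for (int i = 0; i < workers; i++) {
            threads[i] = new Thread(() -> {
                for (int k = 0; k < perWorker; k++) shared.addResult();
            });
            threads[i].start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return shared.results();
    }

    public static void main(String[] args) {
        System.out.println(runWorkers(2, 10000)); // 20000
    }
}
```

Without the synchronized blocks the increments could interleave and lose updates, which is the kind of access a race detector would be expected to report.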

Page 14: Parallel Computation of Skyline Queries Verification

Summary

• JPF is a flexible and complex tool

• JPF is memory- and time-intensive

• JPF is a valuable verification tool

• the application had to be changed extensively to work with JPF

• potential issues were found by JPF

• verification = value-added service
  - extra testing
  - code refinement (robustness)

Page 15: Parallel Computation of Skyline Queries Verification

Questions

???