Web Service Query Service
Manivasakan Sabesan and Tore Risch
Uppsala DataBase Laboratory
Dept. of Information Technology
Uppsala University
Sweden
Outline
WSMED
Research Area
Adaptive Query Parallelization
Conclusion & Future work
WSMED (Web Service MEDiator) System
• WSMED provides general query capabilities over data-providing web services.
• Users only need to provide the WSDL URLs of the web services.
• WSMED automatically creates an SQL view for each web service operation.
• This makes every web service operation queryable without any programming.
• Users can pose any SQL query over the automatically created SQL views.
Service Oriented Architecture of WSMED
[Figure: the WSMED server imports WSDL metadata 1..n from the web services WS1..WSn, each offering its own set of operations (WS Operation 1..p, WS Operation 1..q), and exposes them as SQL View 1..m. Clients reach the server by SOAP calls to the WSMED Web Service Interface, whose operations are IMPORTWSDL, AUTHENTICATION, QUERY, TABLEINFO and EXIT_SINIT.]
WSMED Demo
• WSMED provides a web service query service.
• The WSMED demo is accessible from a web browser.
• JavaScript is used to invoke the WSMED web service directly.
Research Problems
• Queries calling data-providing web services share a similar pattern: dependent calls.
• Web service calls incur high latency and high message setup costs.
• A naïve implementation of an application that makes these calls sequentially is time consuming.
• The challenge is to develop methods that speed up such queries with dependent web service calls.
[Figure: sequence of web services WS1, WS2, WS3, ..., WSn]
Example Query
select gl.City, gl.TypeId
from GetAllStates gs, GetPlacesWithin gp, GetPlaceList gl
where gs.state=gp.state and gp.distance=15.0
  and gp.placeTypeToFind='City' and gp.place='Atlanta'
  and gl.placeName=gp.ToPlace+' ,'+gp.ToState
  and gl.MaxItems=100 and gl.imagePresence='true'
Finds information about places located within 15 km of each city named 'Atlanta' in all US states.
• Invokes 300 web service calls and returns a stream of 360 tuples.
[Figure: data flow of the query: GetAllStates → <state> → GetPlacesWithin (constants <15, 'City', 'Atlanta'>) → <ToPlace, ToState> → GetPlaceList (constants <100, 'true'>) → <City, TypeId>]
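The dependency between the three operations can be made concrete with a small Python sketch of the naïve, strictly sequential evaluation. The three functions are hypothetical in-memory stand-ins for the real SOAP operations, the data they return is made up, and `CALL_LOG` is an illustrative helper that counts the calls; none of this is WSMED code.

```python
from typing import Iterator, List, Tuple

CALL_LOG: List[str] = []  # illustrative helper: records every simulated call

# Hypothetical stand-ins for the web service operations (made-up data).
def get_all_states() -> List[str]:
    CALL_LOG.append("GetAllStates")
    return ["GA", "TX"]

def get_places_within(place: str, state: str, distance: float,
                      place_type: str) -> List[Tuple[str, str]]:
    CALL_LOG.append("GetPlacesWithin")
    fake = {"GA": [("Decatur", "GA"), ("Marietta", "GA")],
            "TX": [("Queen City", "TX")]}
    return fake.get(state, [])

def get_place_list(place_name: str, max_items: int,
                   image_presence: str) -> List[Tuple[str, int]]:
    CALL_LOG.append("GetPlaceList")
    city = place_name.split(" ,")[0]
    return [(city, 1)][:max_items]

def run_sequentially() -> Iterator[Tuple[str, int]]:
    """Nested loops: each call waits for the previous one to return."""
    for state in get_all_states():
        for to_place, to_state in get_places_within("Atlanta", state, 15.0, "City"):
            # Same concatenation as gp.ToPlace+' ,'+gp.ToState in the query.
            place_name = to_place + " ," + to_state
            yield from get_place_list(place_name, 100, "true")

results = list(run_sequentially())
```

With this toy data the loops issue one GetAllStates call, one GetPlacesWithin call per state and one GetPlaceList call per place found, so the total latency is the sum of all call latencies.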
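Query Processing in WSMED
[Figure: query processing pipeline. Phase 1: the SQL query passes through the calculus generator and the non-parallel plan optimizer, producing a non-parallel plan. Phase 2: the plan splitter, plan function generator and parallel pipeliner turn it into a parallel query plan.]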
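Non-Parallel Plan
[Figure: non-parallel plan as a chain of operators, each passing its output tuples to the next:]
γGetAllStates() → <state>
γGetPlacesWithin('Atlanta', state, 15.0, 'City') → <city, state2>
γconcat(city, ', ', state2) → <str>
γGetPlaceList(str, 100, 'true') → <City, TypeId>
Split points 1 and 2 cut the plan into the plan functions PF1 (γGetPlacesWithin and γconcat, taking <state>) and PF2 (γGetPlaceList, taking <str>).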
Adaptive Parallel Plan
γGetAllStates() → <state>
AFF_APPLYP(PF1, state) → <str>
AFF_APPLYP(PF2, str) → <City, TypeId>
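The nesting of the two operators can be sketched in Python. The sketch assumes a sequential stand-in called `apply_stream` in place of AFF_APPLYP, and stub services that return made-up data; only the shape of the composition is taken from the plan.

```python
from typing import Callable, Iterable, Iterator, List, Tuple

def apply_stream(pf: Callable, pstream: Iterable) -> Iterator:
    """Sequential stand-in for AFF_APPLYP: call pf on every parameter
    value and flatten the results into one output stream."""
    for p in pstream:
        yield from pf(p)

# Stub services with made-up data (assumptions, not real results).
def get_all_states() -> Iterator[str]:
    yield from ["GA", "TX"]

def get_places_within(place: str, state: str, distance: float,
                      place_type: str) -> List[Tuple[str, str]]:
    fake = {"GA": [("Decatur", "GA")], "TX": [("Queen City", "TX")]}
    return fake.get(state, [])

def get_place_list(name: str, max_items: int,
                   image_presence: str) -> List[Tuple[str, int]]:
    return [(name, 1)]

def pf1(state: str) -> Iterator[str]:
    """PF1: GetPlacesWithin followed by concat, producing <str>."""
    for city, state2 in get_places_within("Atlanta", state, 15.0, "City"):
        yield city + ", " + state2

def pf2(s: str) -> Iterator[Tuple[str, int]]:
    """PF2: GetPlaceList, producing <City, TypeId>."""
    yield from get_place_list(s, 100, "true")

# Same nesting as the plan: the output stream of PF1 feeds PF2.
plan = apply_stream(pf2, apply_stream(pf1, get_all_states()))
rows = list(plan)
```

Because every stage is a generator, tuples flow through the plan one at a time, which mirrors the streamed, non-materialized style of the plan.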
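Parallel Process Tree
[Figure: process tree for the example query. The coordinator q0 receives the query and calls GetAllStates; the level 1 processes q1 and q2 run PF1 (GetPlacesWithin); the level 2 processes q3 to q8 run PF2 (GetPlaceList).]
qi: query process (i = 0, 1, ..., n)
PFj: plan function (j = 1, ..., m)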
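Adaptive First Finished Apply in Parallel (AFF_APPLYP)
AFF_APPLYP(Function PF, Stream pstream) → Stream result
• PF – plan function
• pstream – stream of parameter values pi
• result – stream of results ri
• Asynchronous operator
[Figure: AFF_APPLYP hands the parameter values p1, p2, p3 to the child processes q3, q4, q5, each running PF. Results are emitted as the children finish (here r1, r3, r2), and each child that finishes receives the next pending parameter (p4, p5, p6).]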
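The first-finished behaviour can be illustrated with a minimal Python sketch. It assumes a thread pool in place of WSMED's query processes and a fixed fanout, so it shows only the asynchronous, completion-ordered delivery of results and not the adaptation. The name `first_finished_apply` and the plan function `slow_pf` are hypothetical.

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Callable, Iterable, Iterator, List

def first_finished_apply(pf: Callable[[int], List[str]],
                         pstream: Iterable[int],
                         fanout: int = 3) -> Iterator[str]:
    """Apply pf to every parameter value with `fanout` workers and yield
    results in the order the calls finish, not in parameter order."""
    with ThreadPoolExecutor(max_workers=fanout) as pool:
        # Simplification: the whole parameter stream is submitted up front.
        futures = [pool.submit(pf, p) for p in pstream]
        for done in as_completed(futures):
            yield from done.result()

def slow_pf(p: int) -> List[str]:
    # Hypothetical plan function: the sleep imitates web service latency.
    time.sleep(0.05 * p)
    return [f"r{p}"]

results = list(first_finished_apply(slow_pf, [3, 1, 2]))
```

Since the call for parameter 1 sleeps the shortest time, its result normally arrives first even though it was submitted second; a worker that becomes free picks up the next pending parameter.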
Functionalities of AFF_APPLYP
1. AFF_APPLYP initially forms a binary process tree by always setting the fanout to 2 (the init stage).
[Figure: binary process tree after the init stage. Coordinator: q0; level 1: q1, q2; level 2: q3, q4, q5, q6.]
2. A monitoring cycle for a non-leaf query process ends when the number of end-of-call messages it has received equals the number of its children.
2.1 After the first monitoring cycle, AFF_APPLYP adds p new child processes (an add stage).
3. When an added node has several levels of children, the init stages of the AFF_APPLYPs in the children produce a binary sub-tree.
[Figure: process tree after an add stage. Coordinator: q0; level 1: q1, q2 and the added q7; level 2: q3 to q6 and the added q8 to q11.]
4. In each monitoring cycle i, AFF_APPLYP records the average time ti to produce an incoming tuple from the children.
4.1 If ti decreases by more than a threshold (25%), the add stage is rerun.
4.2 If ti increases, AFF_APPLYP either adds no more children or runs a drop stage that drops one child and its children.
[Figure: process tree after further adaptation. Coordinator: q0; level 1: q1, q2; level 2: q3, q4, q5, q6, q10, q11, q12.]
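The init, add and drop stages can be summarised for a single non-leaf process in a short Python sketch. Several points are assumptions made for illustration: the function names, the `time_per_tuple` cost model that stands in for measured monitoring cycles, keeping the fanout unchanged when ti improves by less than the threshold, and stopping after one drop. The slides do not specify those cases.

```python
from typing import Callable

def next_action(t_prev: float, t_cur: float, threshold: float = 0.25,
                use_drop_stage: bool = False) -> str:
    """Decide what to do after a monitoring cycle."""
    if t_cur < t_prev * (1.0 - threshold):
        return "add"                      # ti fell by more than the threshold
    if t_cur > t_prev:
        return "drop" if use_drop_stage else "stop"
    return "stop"                         # assumption: small gain, keep fanout

def adapt_fanout(time_per_tuple: Callable[[int], float], p: int = 1,
                 use_drop_stage: bool = False, max_cycles: int = 20) -> int:
    """Return the fanout reached by one non-leaf process."""
    fanout = 2                            # init stage: binary tree
    t_prev = time_per_tuple(fanout)       # first monitoring cycle
    fanout += p                           # add stage after the first cycle
    for _ in range(max_cycles):
        t_cur = time_per_tuple(fanout)
        action = next_action(t_prev, t_cur, 0.25, use_drop_stage)
        if action == "add":
            fanout += p
        elif action == "drop":
            fanout -= 1                   # drop one child (and its subtree)
            break                         # assumption: stop after one drop
        else:
            break
        t_prev = t_cur
    return fanout

# Hypothetical cost model: parallel speed-up plus a per-child overhead.
model = lambda f: 1.0 / f + 0.02 * f
best = adapt_fanout(model, p=1)
```

With this made-up model the process grows from fanout 2 to 4 and then stops, because the step from 3 to 4 children improves ti by less than 25%.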
Adaptive Results: Example Query
[Figure: bar chart of execution time (sec), axis from 0 to 300, for the example query. Legend entries, with fo1 and fo2 the fanouts on levels 1 and 2:]
• Non-parallel plan
• p=1, no drop stage: fo1=3, fo2=3
• p=1, drop stage: fo1=2, fo2=3
• p=2, no drop stage: fo1=4, fo2=5
• p=2, drop stage: fo1=3, fo2=3
• p=3, no drop stage: fo1=5, fo2=3.4
• p=3, drop stage: fo1=4, fo2=3.25
• p=4, no drop stage: fo1=6, fo2=8.7
• p=4, drop stage: fo1=5, fo2=4.2
• p=5, no drop stage: fo1=7, fo2=7.5
• p=5, drop stage: fo1=6, fo2=7.8
AFF_APPLYP observations
• For the example query:
– The execution time with p=4 and no drop stage is the best.
– It is more than 4 times faster than the sequential (non-parallel) execution.
• The execution time with p=2 and no drop stage is reasonably close to the best execution time (80%).
• The drop stage makes insignificant changes to the execution time.
• The fanout of each level of a process tree depends on the execution time of the web service invoked on that level.
– AFF_APPLYP finds the optimized fanout for each level.
Related work
• Like WSMS (U. Srivastava, J. Widom, K. Munagala, and R. Motwani, Query Optimization over Web Services, VLDB 2006), WSMED invokes web service calls in parallel. In contrast to WSMS, WSMED supports automated adaptive parallelization.
• In contrast to WSQ/DSQ (R. Goldman and J. Widom, WSQ/DSQ: A Practical Approach for Combined Querying of Databases and the Web, SIGMOD 2000), WSMED produces non-materialized adaptive parallel plans based on parameter streams.
• Runtime optimization techniques (A. Gounaris et al., Robust Runtime Optimization of Data Transfer in Queries over Web Services, ICDE 2008) investigate the adaptation of buffer sizes in web service calls; they do not deal with adaptive parallelism of web service calls.
Conclusion
• WSMED can be accessed:
– through a URL: http://udbl2.it.uu.se/WSMED/wsmed.html
– without installing any software.
• Queries are expressed in SQL to dynamically compose data-providing web services without any programming.
– Makes any web service queryable with SQL.
• AFF_APPLYP:
– automatically parallelizes web service calls.
– adapts the process tree at runtime, based on the flow of the result stream, without any static cost model.
• An adaptive parallel plan with AFF_APPLYP makes it possible to run expensive queries.
Future work
• Generalize the strategy to queries that mix dependent and independent web service calls, as well as bushy trees (ongoing work).
• Investigate different process arrangement strategies with the algebra operators.
• Set up a benchmark to simulate the parallel invocation of web services.
Thank you for your attention
“The un-queried life is not worth living”