
1. Outline Background: Geographic Information Systems and Open Geographic Standards Motivations and Motivating Use Cases Research Issues Architecture:


Outline

• Background: Geographic Information Systems and Open Geographic Standards
• Motivations and Motivating Use Cases
• Research Issues
• Architecture: Federated Service-Oriented Geographic Information System
• Performance-enhancing designs: measurements and analysis
• Contribution
• Future Work


Geographic Information Systems (GIS)

• A GIS is a system for creating, storing, sharing, analyzing, manipulating and displaying spatial data and associated attributes.
• GIS evolved from mainframe systems to desktop systems to distributed systems.
• Modern GIS require:
– Distributed data access for spatial databases
– Use of remote analysis, simulation or visualization tools
• Problems with traditional distributed GIS approaches:
– Distributed nature of the geo-data; various client-server models, databases, HTTP, FTP


Open Geographic Standards

• The aim is to make geographic information and services neutral and available across any network, application, or platform.
• Two major well-known standards bodies: the Open Geospatial Consortium (OGC) and ISO/TC211.
• OGC specifications define online services and data models:
– Data format specs: Geography Markup Language (GML)
– Service specs: Web Feature Service (WFS), Web Map Service (WMS)
• OGC services are HTTP-based, which limits their data transport capabilities.
• HTTP-based services are request-response type services; centralized and synchronous applications.


Motivations

• Requirements for interoperable service-oriented Geographic Information Systems
– Necessity for sharing and integrating heterogeneous data and computation resources
  • Uniform data access/query/display from a single access point
• Responsive and interactive GIS systems
– GIS applications require quick response
  • Emergency early warning systems
  • Homeland security and natural disasters


Motivating Use Cases

• Earthquake science applications
– Pattern Informatics (PI)
  • Earthquake forecasting code developed by Prof. John Rundle (UC Davis) and collaborators; uses seismic archives.
– Virtual California (VC)
  • Time series analysis code that can be applied to GPS and seismic archives, for both real-time and archival data.
• Interdependent Energy Infrastructure Simulation System (IEISS), Los Alamos National Laboratory (LANL)
– Models infrastructure networks (e.g. electric power systems and natural gas pipelines) and simulates their physical behavior and the interdependencies between systems.


Research Issues

• Interoperability
– Adoption of Open Geographic Standards
– Applying Web Service principles to GIS data services
• Flexibility and extensibility
– The system should bridge the GIS and Web Service communities by adapting standards from both
– Other GIS applications should be able to consume data without costly format conversions
• Federation
– Federation of GIS systems
  • Unified data access/query/display from a single access point
– Principles for generalizing the proposed federated GIS system
  • In terms of components, framework and requirements
• Addressing high-performance support for responsiveness


Interoperable Service-oriented GIS

• Composed of two types of online services: Web Map Services (WMS) and Web Feature Services (WFS)
• And two types of data:
– Binary data: map images (provided by WMS)
– Structured data: GML, with content (core data) and presentation (attribute and geometry elements) (provided by WFS)
• WMS and WFS each have their own type of capability metadata, defined by the Open Geographic specs. They exchange capabilities through the "getCapability" service interface in order to make valid requests and get valid responses.
• UDDI-based registry services
• Components are Web Services and all control goes through SOAP messages

[Figure: Relation of the components and data flow. Labels: WMS (GML rendering) and WFS (mediator), each with a WSDL interface; GML and binary data; WMS operations getCapability, getMap, getFeatureInfo; WFS operations getCapability, getFeature, DescribeFeatureType.]

Federated Interoperable GIS

• Unified data access/query/display from a single access point
• Providing application-based hierarchical data definitions
– Layer-based data and service (WMS and WFS) compositions
• Federation is done by aggregating the capabilities metadata of GIS Web Services
• A capability is basically metadata about data + service:
– The server's information content and acceptable request parameter values

[Figure: Federation architecture. Labels: Browser; User Portal with Interactive Map-Tools; Federator (capability federation, map rendering); distributed WMS and WFS instances; numbered request lines 1, 2 and 3.]

1. GetCapability (data and operations available)
2. GetMap (get map data in a set of layers)
3. GetFeatureInfo (query the attributes of data)

Sample layers for IEISS:
a. Gas-pipeline
b. Electric-power
c. NASA satellite
d. State-boundaries

Application-based hierarchical data:
[Application] IEISS
– [Layer-1] Gas-pipeline over Satellite
  • [Data-1] Gas-pipeline (WFS-1)
  • [Data-2] Satellite-Image (WMS-2)
– [Layer-2]
  • Google map (WMS-1)
– [Layer-3] Electric-power
  • [Data-1] Electric-power (WFS-3)
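
The application-based hierarchy for IEISS can be pictured as a small nested data structure. The following is only a sketch under assumptions: the dictionary layout and the helper names `services_for_layer` and `all_services` are hypothetical and are not taken from the federator's actual capability format.

```python
# Hypothetical in-memory model of the application -> layer -> data hierarchy.
# Service labels (WFS-1, WMS-2, ...) are copied from the slide.
IEISS = {
    "application": "IEISS",
    "layers": [
        {"name": "Layer-1: Gas-pipeline over Satellite",
         "data": [("Gas-pipeline", "WFS-1"), ("Satellite-Image", "WMS-2")]},
        {"name": "Layer-2",
         "data": [("Google map", "WMS-1")]},
        {"name": "Layer-3: Electric-power",
         "data": [("Electric-power", "WFS-3")]},
    ],
}


def services_for_layer(app, layer_index):
    """Return the services that must be queried to compose one layer."""
    return [service for _name, service in app["layers"][layer_index]["data"]]


def all_services(app):
    """Return every distinct service referenced by the application, sorted."""
    return sorted({service
                   for layer in app["layers"]
                   for _name, service in layer["data"]})


layer1_services = services_for_layer(IEISS, 0)
federated_services = all_services(IEISS)
```

With such a structure, a federator could decide which WMS and WFS instances to contact for a requested layer without the client knowing where each data set lives.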


Why Capability Metadata

• Web Services provide key low-level capability but do not define an information or data architecture.
• These are left to domain-specific capabilities metadata and a data description language (GML).
• Machine- and human-readable information
– Enables easy integration and federation
• Enables developing application-based, standard, interactive, re-usable tools
– For data query, display and analysis
– Seamless data access/query


Generalizing the Proposed Architecture - I

• One can define a GIS-style information model in many application areas, such as Chemistry and Astronomy
– Application Specific Information Systems (ASIS)
• We have investigated the requirements and principles needed to generalize the proposed federated GIS approach.
– From GML to ASL (Application Specific Language)
  • Data description language in the form of domain-specific features
– From WFS to ASFS (Feature Services)
  • Provides data in ASL with standard service interfaces
– From WMS to ASVS (Visualization Services)
  • Domain-specific display format definitions and standard services
  • Visualizes information and provides a way of navigating ASFS-compatible databases (cf. GetFeatureInfo for GIS)
– Need to define application-specific capabilities metadata for ASVS and ASFS.


Generalization of the Proposed Architecture - II

• Mediators: query and data format conversions
• ASFS -> provide ASL (structured data covering content and presentation tags)
• ASVS -> provide common data representations from ASL, as binary images
• The Federator federates the capabilities of distributed ASVS and ASFS to create an application-based hierarchy of distributed data and service resources

[Figure: ASIS architecture. Labels: AS Sensor; AS Repository; services such as filter, transformation, reasoning, data-mining, analysis; messages using ASL; Mediator with standard service API; Federator ASVS with capability federation, ASL rendering and standard service API; unified data query/access/display.]


Performance-enhancing designs: measurements and analysis


Performance Investigation

• Interoperability requirements bring some compliance costs:
– Common data model (GML)
– Web Services (SOAP protocol for communication)
• Approaches: enhancing the responsiveness of GIS systems
– Streaming GIS Web Services
– Pre-fetching
– Parallel processing with caching
• Testing with large-scale science applications using large-scale data and resource-consuming processes
– Earthquake forecasting (PI)
– Virtual California (VC)
• Turning compliance requirements into competitiveness


Limits of Conventional OGC-GIS Systems

• On-demand data access, single-threaded and no caching
• Related projects: Deegree and UMN-Minnesota Map Servers
• Baseline performance tests over systems developed with Open Geographic Standards:
– Local-area network, from database to user ends
– For small data sets (less than 500 KB), response times are acceptable
– For larger data sizes, the performance is not sufficient.


Design & Measurement-1: Large-sized structured data transfer

• XML representations of data tend to be significantly larger than binary representations.
• Larger data sizes consume more network bandwidth.
• In the initial development of the proposed SOA-based GIS we used GIS Web Services with SOAP over HTTP as the transfer protocol.
• However, this limited performance.
• We investigated "Streaming Data Transfer": topic-based publish-subscribe messaging systems for exchanging SOAP messages and data payloads.


Streaming GIS Web-Services

• Lines 1, 2 and 3 show the classic publish-find-bind triangle of Web Services
• SOAP is used for negotiation (line 3): a standard getFeature request
– Publisher information is returned as a (topic, IP, port) triple.
• The publisher streams; the subscriber receives.
• The performance gain is 40% on average
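
The negotiate-then-stream pattern described above can be sketched in a few lines. This is an illustrative sketch only: `Broker` is a toy in-process stand-in for a topic-based publish-subscribe system, and `StreamingWFS`, its method names, the topic naming scheme, and the default IP and port are all hypothetical, not the NaradaBrokering or WFS APIs.

```python
from collections import defaultdict


class Broker:
    """Toy in-process stand-in for a topic-based publish-subscribe broker."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, callback):
        self._subscribers[topic].append(callback)

    def publish(self, topic, chunk):
        for callback in self._subscribers[topic]:
            callback(chunk)


class StreamingWFS:
    """Hypothetical WFS that answers getFeature with a topic instead of data."""

    def __init__(self, broker, ip="127.0.0.1", port=5045):
        self.broker, self.ip, self.port = broker, ip, port

    def get_feature(self, request_id):
        # Negotiation step: only the (topic, IP, port) triple is returned.
        return (f"wfs/{request_id}", self.ip, self.port)

    def stream(self, topic, gml_chunks):
        # Data step: the publisher pushes GML chunks to the negotiated topic.
        for chunk in gml_chunks:
            self.broker.publish(topic, chunk)


broker = Broker()
wfs = StreamingWFS(broker)
topic, ip, port = wfs.get_feature("req-1")   # negotiation (SOAP in the slides)
received = []
broker.subscribe(topic, received.append)     # subscriber side
wfs.stream(topic, ["<gml:part1/>", "<gml:part2/>"])
```

The point of the split is that the request-response exchange stays small, while the bulk GML payload travels over the messaging channel.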

[Figure: Streaming GIS Web Services. Labels: (A) WMS and WFS; client, server and UDDI registry joined by lines 1, 2 and 3; WSDL; NaradaBrokering server with topic "Topic-wfs"; publisher and subscriber exchanging GML; getFeature returning (topic, IP, port).]

Design & Measurement-2: GML Data Processing

• Processing XML data: parsing and rendering to create map images.
• Two well-known approaches are document models (DOM) and push models (SAX).
• We use the pull approach for XML processing:
– Parses only what is asked for
– No support for document validation (a major performance gain)
– Doesn't build a complete object model in memory (unlike DOM)
– Contents are returned directly to the application from calls to the parser (unlike SAX)

Total rendering timings (1 GB allocated VM)

Data size (KB)    DOM (dom4j)    Pull (Xpp)
1                 469.22         15.59
10                494.06         72.81
100               625.54         183.06
1,000             760.20         270.47
5,000             1,422.91       671.74
10,000            3,557.44       1,025.67
100,000           OUT OF MEM     7,059.72
150,000           OUT OF MEM     11,047.89
200,000           OUT OF MEM     14,949.12
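
To illustrate the idea of processing features as they are read rather than building a whole document tree, here is a minimal sketch using Python's standard `xml.etree.ElementTree.iterparse` as a stand-in for the Xpp pull parser used in the measurements. The element names `featureCollection`, `feature` and `coordinates` are simplified assumptions and are not real GML.

```python
import io
import xml.etree.ElementTree as ET

# Simplified, GML-like sample document (hypothetical element names).
SAMPLE = b"""<featureCollection>
  <feature id="f1"><coordinates>-110.0,35.0</coordinates></feature>
  <feature id="f2"><coordinates>-105.5,37.2</coordinates></feature>
</featureCollection>"""


def iter_points(stream):
    """Yield (feature id, x, y) one feature at a time."""
    for _event, elem in ET.iterparse(stream, events=("end",)):
        if elem.tag == "feature":
            x, y = (float(v) for v in elem.findtext("coordinates").split(","))
            yield elem.get("id"), x, y
            elem.clear()  # drop the children of the feature just processed


points = list(iter_points(io.BytesIO(SAMPLE)))
```

Because each feature is handed to the application and then cleared, memory use stays roughly proportional to one feature rather than to the whole document, which is the property the out-of-memory rows in the DOM column highlight.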


Federator-Oriented Performance Enhancement Approaches


Geo-Data Characteristic

• A data item is described by a location attribute: (x, y) coordinates.
• A set of data is described by a bounding box (bbox): (a, b, c, d)
• Geo-data is unevenly distributed and variable-sized according to its location attributes.
– Ex. human population
• The workload therefore cannot be shared evenly.
• Supporting alternative techniques based on data characteristics:
1. Pre-fetching
2. Parallel processing with caching through attribute-based query decomposition

[Figure: Two panels, (1) and (2), showing a bbox with corners (a,b) and (c,d) divided into regions R1 to R4, with split points ((a+c)/2, b) and (c, (b+d)/2).]


Design & Measurement-3: Pre-fetching (PM)

• Getting the GML data before it is needed
• Overcomes the network bandwidth problem and repeated data conversions.
• For infrequently changing archived data
– Otherwise it might cause consistency problems
• Red curve: pre-fetching the data (data is brought to the federator, ready to use)
• Black curve: on-demand fetching of the data from remote heterogeneous resources
• PM runs a pre-defined task at a pre-defined periodicity, independent of the application

[Figure: Pre-fetching architecture. Labels: User Portal with Interactive Tools; Federator; Processor; PM; NB; temporary storage of GML on the local file system; WMS and WFS instances.]

PM: Pre-fetching module; NB: NaradaBrokering; WMS: Web Map Service; WFS: Web Feature Service


Pre-fetching vs. On-demand Fetching

Average response times

Data size (KB)    Pre-fetching    StdDev    On-demand     StdDev
1                 1,006.47        176.84    2,375.24      152.40
10                1,040.33        233.24    2,578.69      252.49
100               1,148.44        233.11    7,973.16      374.12
1,000             1,687.44        421.92    59,335.69     343.76
10,000            2,785.37        282.39    573,324.66    836.46

• For 10 MB, pre-fetching is about 200 times faster than conventional on-demand fetching.
• The larger the data size, the higher the performance gain.
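
As a quick check of the "about 200 times" figure, the ratio of on-demand to pre-fetching response time can be computed directly from the averages listed on this slide. The dictionary and function names below are illustrative only.

```python
# data size in KB -> (pre-fetching, on-demand) average response times,
# copied from the timings on this slide.
TIMINGS = {
    1: (1006.47, 2375.24),
    10: (1040.33, 2578.69),
    100: (1148.44, 7973.16),
    1000: (1687.44, 59335.69),
    10000: (2785.37, 573324.66),
}


def speedup(size_kb):
    """Return how many times faster pre-fetching is than on-demand fetching."""
    prefetch, on_demand = TIMINGS[size_kb]
    return on_demand / prefetch


speedups = [speedup(size) for size in sorted(TIMINGS)]
```

The ratio grows with every step in data size, which matches the second bullet: the gain from pre-fetching increases as the data gets larger.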


Design 4: Parallel Processing and Caching

[Figure: Parallel processing and caching. Labels: cached data and a successive request; un-cached regions R1 to R4; main query with cached data extraction and rectangulation; sub-queries r1, r2, r3, ..., rPn sent as GetFeature requests to the WFS (critical data provider in GML), returning GML1, GML2, ..., GMLPn alongside the cached GML; critical data layer; layers from other WFS and WMS; critical data falling into partitioned regions; steps numbered 1 to 4.]

Main query -> cached-data extraction -> rectangulation {Rectangles [Ri]} -> partitioning {sub-queries [ri]} -> assigning separate threads -> assembling the results
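
One way to picture the rectangulation step is as a rectangle difference: the part of the requested bbox that is not covered by the cached bbox is split into at most four non-overlapping rectangles. The sketch below is an assumption about how such a step could work, not the method used in the presented system; the function names are hypothetical and bboxes are (minx, miny, maxx, maxy) tuples.

```python
def intersect(a, b):
    """Return the overlap of two bboxes, or None if they do not overlap."""
    minx, miny = max(a[0], b[0]), max(a[1], b[1])
    maxx, maxy = min(a[2], b[2]), min(a[3], b[3])
    if minx >= maxx or miny >= maxy:
        return None
    return (minx, miny, maxx, maxy)


def uncached_rectangles(request, cached):
    """Split (request minus cached) into at most four rectangles."""
    overlap = intersect(request, cached)
    if overlap is None:
        return [request]  # nothing cached: the whole request is un-cached
    rminx, rminy, rmaxx, rmaxy = request
    ominx, ominy, omaxx, omaxy = overlap
    rects = []
    if rminy < ominy:   # full-width strip below the cached area
        rects.append((rminx, rminy, rmaxx, ominy))
    if omaxy < rmaxy:   # full-width strip above the cached area
        rects.append((rminx, omaxy, rmaxx, rmaxy))
    if rminx < ominx:   # strip to the left, limited to the overlap's height
        rects.append((rminx, ominy, ominx, omaxy))
    if omaxx < rmaxx:   # strip to the right, limited to the overlap's height
        rects.append((omaxx, ominy, rmaxx, omaxy))
    return rects


half_cached = uncached_rectangles((-110, 35, -100, 40), (-110, 35, -105, 40))
fully_cached = uncached_rectangles((-110, 35, -100, 40), (-120, 30, -90, 45))
centre_cached = uncached_rectangles((-110, 35, -100, 40), (-108, 36, -102, 39))
```

An empty result corresponds to complete usage of cached data, a single rectangle equal to the request corresponds to no usage, and anything in between is the partial-usage case.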


Attribute-based Query Decomposition over Un-cached Regions

• Finding the number of partitions needed for each rectangle
– Calculate the cached data density
– Compare with the pre-defined threshold value
  • The threshold defines a region's maximum possible size
– Then divide the region into equal-sized (in bbox) sub-regions whose size is less than or equal to the threshold value
• Creating sub-queries and assembling the result sets
– Sub-queries for the partitions inherit all the attributes from the main query.
– The only difference is the bbox values

Sub-query bbox     Request
-110,35,-100,36    GetFeature-1
-110,36,-100,37    GetFeature-2
-110,37,-100,38    GetFeature-3
-110,38,-100,39    GetFeature-4
-110,39,-100,40    GetFeature-5
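
The decomposition listed above can be reproduced with a short sketch. It assumes the number of partitions (here called `pn`) has already been determined, and it represents a query as a plain dictionary; the function names and the `typeName` attribute value are hypothetical.

```python
def partition_bbox(bbox, pn):
    """Split a (minx, miny, maxx, maxy) bbox into pn equal strips along y."""
    minx, miny, maxx, maxy = bbox
    sy = (maxy - miny) / pn  # height of each strip
    return [(minx, miny + i * sy, maxx, miny + (i + 1) * sy) for i in range(pn)]


def make_subqueries(main_query, pn):
    """Copy every attribute of the main query; only the bbox differs."""
    return [{**main_query, "bbox": part}
            for part in partition_bbox(main_query["bbox"], pn)]


main_query = {"request": "GetFeature",
              "typeName": "example-layer",  # hypothetical attribute
              "bbox": (-110, 35, -100, 40)}
subqueries = make_subqueries(main_query, 5)
```

For the bbox (-110, 35, -100, 40) and five partitions, this yields the five strips listed above, each one degree high in y.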


Caching

• Caching
– Basically removes repeated jobs
– One-time caching: recently fetched data is kept for successive requests
– For each session (browser), separate short-term cache data
• Session tracking for caching
– How do servers know which request came from whom?
– Mapping browser-based sessions to Web Services
  • Standard Web Service interfaces and message formats
  • Each request initiated from the same browser will have the same sessionID.
  • Adding a new entry, "sessionID", to the header of the SOAP request

requestObj.setHeader(service_address, channel_name, sessionID)
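
A per-session cache keyed by the sessionID header entry might look like the sketch below. Everything here is an assumption for illustration: the header is modelled as a plain dictionary rather than a real SOAP header, and `build_header`, `SessionCache`, the example address and the channel name are hypothetical.

```python
def build_header(service_address, channel_name, session_id):
    """Model the three header fields named on the slide as a plain dict."""
    return {"service_address": service_address,
            "channel_name": channel_name,
            "sessionID": session_id}


class SessionCache:
    """Keep the most recently fetched data separately for each session."""

    def __init__(self):
        self._entries = {}

    def store(self, header, bbox, gml):
        # One-time caching: a new fetch replaces the session's previous entry.
        self._entries[header["sessionID"]] = (bbox, gml)

    def lookup(self, header):
        return self._entries.get(header["sessionID"])


cache = SessionCache()
header_a = build_header("http://example.org/wfs", "topic-wfs", "browser-A")
header_b = build_header("http://example.org/wfs", "topic-wfs", "browser-B")
cache.store(header_a, (-110, 35, -100, 40), "<gml/>")
hit = cache.lookup(header_a)
miss = cache.lookup(header_b)
```

Requests from the same browser carry the same sessionID and therefore see the same cached entry, while another browser starts with an empty cache.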


Measurement-4: Performance Tests – Parallel Processing and Caching

• Comparing the bbox of the cached data with that of the request gives 3 possible scenarios
– Case 1: No usage of cached data
– Case 2: Complete usage of cached data
  • The best case looks like pre-fetching
– Case 3: Partial usage of cached data


Data Access Timings (No Cached Data)

• T_data-access = T_query-conversion (getFeature to SQL) + T_GML-conversion + T_streaming (data from WFS to federator) + T_GML-building (at federator)

[Figure: data access timings between Federator and WFS.]


Overhead and Response Timings (ex. case: 10-threaded parallel processing)

• Performance does not increase in the same ratio as the number of threads
– Overheads: query partitioning, sub-query creation, map creation and map transfer.
– There is no performance gain below a threshold data size.

[Figure: overhead and response timings. Labels: Browser; User portal with interactive map tools; Federator; WFS.]


Partial Usage of Cached Data (1/2 cached)

• There is no performance gain for small data sizes, due to the overheads.
• For 10 MB, the proposed system is about 8 times faster than the ordinary on-demand single-threaded system.
• As the data size increases, the performance gain increases.
• As the overlapped cached region increases, the performance gain increases
– 100% overlap looks like the pre-fetching case

Comparison of the response times

GML data     Half cached, half with parallel processing    Ordinary systems
size (MB)    time         StdDev                            time          StdDev
0.01         7,063.38     357.46                            2,578.69      252.49
0.1          9,702.49     322.20                            7,973.16      374.12
0.5          12,892.12    361.53                            30,868.52     482.83
1            14,692.18    414.89                            59,635.69     343.76
5            45,401.40    590.89                            288,594.12    772.41
10           70,494.98    475.19                            574,825.16    836.46


Contributions (Systems Research)

• A framework for federated Service-Oriented GIS
– Integrated Web Services with Open Geographic Standards for supporting interoperability at both data and application levels
– Capability definitions and federation
• Principles for Application Specific Information Systems
– Conditions and requirements
• Investigating performance-efficient designs and detailed benchmarking
– Streaming GIS Web Services and pre-fetching
– Attribute-based query partitioning and caching for parallel processing
  • Mapping browser-based sessions to Web Services
  • Forecasting workload from the cached data


Contributions (Systems Software)

• Developing a Web Map Server (WMS) in Open Geographic Standards
• Developing the GIS Federator
• Interactive map tools for data display, query and analysis
• Sci-Plot (scientific data plotting) GIS Web Services
– To integrate geo-science application data with the Geo-data Grid


Future Research Directions

• Developing a generic framework for application specific information systems (ASIS)
– Considering semantics of data and services
– Distributed capability federation
– Capability files and application specific languages
– Inter-service communications through capability exchange
• Integrating ASIS with science applications
– Science plotting services as a gateway between science data grid and applications
– Handling processed data
  • Storage, overlay and association with raw (input) data


Acknowledgement

• Galip Aydin: Web Feature Server (WFS)
• Mehmet Aktas: Universal Description, Discovery and Integration (UDDI) services
• The work described in this presentation is part of the QuakeSim project, which is supported by the Advanced Information Systems Technology Program of NASA's Earth-Sun System Technology Office.
• This collaboration is part of the NASA ACCESS ROSES funded project, Modeling and On-the-fly Solutions in Solid Earth Science.


Thanks!


BACK-UP SLIDES


General Structure of AS-Tools: ASF(V)S-based mediation

• To be concrete, let's analyze WFS-based mediation
• Query conversion
– From "GetFeature" to a local query (ex. SQL for a database)
• Data set conversion and composition
– Local query result to GML
• Common service API
– GetCapability
– GetFeature
– DescribeFeatureType

[Figure: ASF(V)S mediator structure. Labels: ASF(V)S Service Layer (getCapability, getFeature, describeFeatureType) with WSDL as the standard service API; Request Handler; Composition; Mapping: query re-creation; Source Connection/Execution; request and response; data/information sources (databases, file systems or other remote/local heterogeneous sources).]


Capabilities Federation: Capability Files for Standard Services

Callouts: General Service Metadata (<Service>); Operations, i.e. Web Service Interfaces (<Request>); Metadata about provided data/information (<LayerList>, <DataList>)

WMS:
<Capabilities>
  <Service>
    <Name>
    <OnlineResource>
    <ContactInfo>
  </Service>
  <Capability>
    <Request> <GetCapability> <GetMap> <GetFeatureInfo> </Request>
    <LayerList> <Layer-1: Satellite img> <Layer-2: gas-pipeline> <Layer-2: Google-map> </LayerList>
  </Capability>
</Capabilities>

WFS:
<Capabilities>
  <Service>
    <Name>
    <OnlineResource>
    <ContactInfo>
  </Service>
  <Capability>
    <Request> <GetCapability> <GetFeature> <DescribeFeatureType> </Request>
    <DataList> <Data-1: gas-pipeline> <Data-2: electric-power> <Data-2: other-data> </DataList>
  </Capability>
</Capabilities>


Parallel processing with caching through attribute-based query decomposition - I

1. Determining the number of partitions (Pn)

• The attribute is the bounding box (bbox), defined as (minx, miny, maxx, maxy)
• CD_size_br2 = (maxxc - minxc) * (maxyc - minyc)
• R_size_br2 = (maxx - minx) * (maxy - miny)
• A pre-defined thr (threshold) value determines whether partitioning is required for a rectangle (bbox)
• Pn: the number of partitions calculated for a rectangle

[Figure: cached-data bbox with corners (minxc, minyc) and (maxxc, maxyc), and a rectangle bbox with corner (minx, miny).]


Parallel processing with caching through attribute-based query decomposition - II

2. How to partition a rectangle in bbox

– We know the rectangle's bbox and Pn.
– Since we do not know in advance the workload that falls in that bbox, we partition the rectangle into equal sizes.
– There are two options here, vertical partitioning and horizontal partitioning. Let's pick vertical and explain the algorithm.

[Figure: rectangle from (minx, miny) to (maxx, maxy) partitioned along the y coordinate into strips 1, 2, ..., Pn, each of height sy.]

Calculating the bboxes of the partitioned regions, where sy = (maxy - miny) / Pn is the height of each partition:

for (i = 0; i < Pn*sy; i = i + sy) print(minx, miny + i, maxx, miny + i + sy);


Parallel processing with caching through attribute-based query decomposition - III

3. How to create sub-queries

– Once the bbox values of the partitioned regions have been computed in the previous step, the corresponding sub-queries are created.
– Each partition is differentiated by its bbox values calculated above. Other attributes are inherited from the main query.
– Ex: the main query bbox is "-110, 35, -100, 40" and let's assume we found that Pn = 5

A rectangle from the rectangulation process: -110, 35, -100, 40

Decomposing the rectangle according to Pn and sy, then creating queries for these bbox values:

-110, 35, -100, 36    GetFeature-1
-110, 36, -100, 37    GetFeature-2
-110, 37, -100, 38    GetFeature-3
-110, 38, -100, 39    GetFeature-4
-110, 39, -100, 40    GetFeature-5
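
Once the sub-queries exist, they can be issued on separate threads and their results assembled in order. The sketch below uses Python's standard `concurrent.futures.ThreadPoolExecutor`; `fetch_features` is a hypothetical stand-in for a real GetFeature call and simply returns a placeholder string per bbox.

```python
from concurrent.futures import ThreadPoolExecutor

# The five sub-query bboxes from the example above.
SUB_BBOXES = [(-110, 35 + i, -100, 36 + i) for i in range(5)]


def fetch_features(bbox):
    """Hypothetical stand-in for a GetFeature request against a WFS."""
    return [f"features in {bbox}"]


def run_subqueries(bboxes, max_workers=5):
    """Run one sub-query per thread and concatenate the partial results."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map returns results in the same order as the inputs.
        parts = list(pool.map(fetch_features, bboxes))
    return [feature for part in parts for feature in part]


assembled = run_subqueries(SUB_BBOXES)
```

Because the partial results come back in input order, the assembled list follows GetFeature-1 through GetFeature-5 regardless of which thread finishes first.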


Performance Tests – Based on Case Scenarios

• As a result of comparing the bbox of the cached data and the request:
– (1) No usage of cached data
– (2)-(3) Complete usage of cached data
– (4) Partial usage of cached data

[Figure: Parallel processing and caching. Labels: cached data and a successive request; un-cached regions R1 to R4; main query with cached data extraction and rectangulation; sub-queries r1, r2, r3, ..., rPn sent as GetFeature requests to the WFS (critical data provider in GML), returning GML1, GML2, ..., GMLPn alongside the cached GML; critical data layer; layers from other WFS and WMS; critical data falling into partitioned regions; cases numbered 1 to 4.]

Main query -> rectangulation -> Rectangles [Rs] -> partition -> sub-queries [rs]