Distributed Heterogeneous Data Warehouse For Grid Analysis

Harvey B. Newman, Julian Bunn, Saima Iqbal

CALTECH (California Institute of Technology)


OUTLINE

- Introduction
  - What is a relational data warehouse
  - Distributed Heterogeneous Relational Data Warehouse (DHRD) databases and the Grid
  - How DHRD could be integrated with the Grid
- Why web services?
  - Building blocks of Web Services
  - Vital parts of Web Services
  - How DHRD could be integrated with the Grid as a Web Service
- Grid services
  - Grid services architecture (GGF) [draft 16th Feb. 2003]
  - Grid services client infrastructure (GGF) [draft 5th Jun. 2003]
- Proposed web services architecture based on Grid services to use DHRD in Grid environment
  - Technologies employed
  - UDDI compliant registry service
- Working of web services prototype (demo)
- Conclusion
- Future work
- Questions?


INTRODUCTION

Can databases be integrated with the Grid?

Most of the existing and proposed Grid applications are file based.

Very little work has been done on how Distributed Heterogeneous Databases can be made available on the Grid.

Web Services can help in accessing Distributed Heterogeneous Databases as a single “Virtual Database” across the Grid.


Distributed Data Warehouse

A distributed database system allows applications to access data from both local and remote databases.

It makes it possible to move some of the data and some of the users to separate servers and databases.

It allows the data used by a particular workgroup to be kept at Tier 2 and Tier 3, on a nearby server.

It reduces the need for massive central computing and cuts network delays.


Distributed Heterogeneous Relational Data Warehouse (DHRD) Databases and Grid

Is it possible to access DHRD databases across the Grid by adopting the existing Grid services that handle files?

Existing Grid services handle files, whereas relational databases offer a much richer set of operations, such as queries and transactions.

DBMSs differ from one another much more than file systems do.

Even within one paradigm, different database products (Oracle, MS-SQL, DB2) vary in their functionality and interfaces.


How DHRD Could Be Integrated with The Grid

The diversity of DHRD databases makes it difficult to design a single solution for integrating them with the Grid.

The Open Grid Services Architecture (OGSA) for distributed systems provides the concept of Grid Services (similar to Web Services) for accessing resources across distributed and heterogeneous environments.

These Grid Services/Web Services can help expose distributed databases across the Grid as a “Virtual Database System”.
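The “Virtual Database System” idea can be illustrated with a minimal sketch. It is not the prototype's implementation: the `VirtualDatabase` class, the table names and the sample rows are all hypothetical, and two in-memory SQLite databases stand in for the heterogeneous servers (for example Oracle and MS-SQL) that the slides describe.

```python
import sqlite3


class VirtualDatabase:
    """Hypothetical facade: routes a query to whichever backend owns the table."""

    def __init__(self):
        self._owners = {}  # table name -> backend connection

    def register(self, table, connection):
        self._owners[table] = connection

    def select_all(self, table):
        if table not in self._owners:
            raise KeyError(f"no backend registered for table {table!r}")
        conn = self._owners[table]
        # Table names come from the registry above, never from user input.
        return conn.execute(f"SELECT * FROM {table} ORDER BY 1").fetchall()


def build_demo():
    # Assumption: each in-memory SQLite database plays the role of one remote site.
    site_a = sqlite3.connect(":memory:")
    site_a.execute("CREATE TABLE runs (run_id INTEGER, detector TEXT)")
    site_a.executemany("INSERT INTO runs VALUES (?, ?)", [(1, "barrel"), (2, "endcap")])

    site_b = sqlite3.connect(":memory:")
    site_b.execute("CREATE TABLE events (event_id INTEGER, run_id INTEGER)")
    site_b.executemany("INSERT INTO events VALUES (?, ?)", [(10, 1), (11, 2)])

    vdb = VirtualDatabase()
    vdb.register("runs", site_a)
    vdb.register("events", site_b)
    return vdb
```

The caller of `select_all` never needs to know which site holds a table, which is the property a single virtual database is meant to provide.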


Why Web Services?

Web Services are centered on the service definition and messages.

Web Services build on a set of well-established technologies and protocols:

- XML is used for service description and data interchange
- HTTP is used as the transport protocol; it is widely deployed, with trusted security features

Web Services standards are structured and extensible:

- interfaces can evolve without breaking what is already working

Web Services provide a solution for accessing heterogeneous, web-wide resources.


Building blocks Of Web Services

Web Services are modular software components, wrapped inside a specific set of Internet communication protocols, that can be run over the Internet.

At the heart of the web services architecture is the need for program-to-program communication.

Key roles in the web services architecture are:

- a service provider
- a service registry
- a service requestor


Building blocks Of Web Services (cont’d)

- Together they perform three operations on web services: Publish, Find and Bind

[Figure: the Service Provider, Service Registry and Service Requestor connected by three numbered operations]

1. Publish: make the service description publicly available
2. Find: discover the service
3. Bind: allows the service to be used by the requestor
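The Publish, Find and Bind cycle can be sketched in a few lines. This is an in-memory illustration only: the class and method names are hypothetical, there is no network or UDDI involved, and the "endpoint" is simply the provider object itself.

```python
class ServiceRegistry:
    """Holds published service descriptions (stand-in for a real registry)."""

    def __init__(self):
        self._entries = {}

    def publish(self, name, description, endpoint):
        self._entries[name] = {"description": description, "endpoint": endpoint}

    def find(self, keyword):
        # Case-insensitive keyword match against the published descriptions.
        return {
            name: entry
            for name, entry in self._entries.items()
            if keyword.lower() in entry["description"].lower()
        }


class ServiceProvider:
    def __init__(self, name, description, implementation):
        self.name = name
        self.description = description
        self._implementation = implementation

    def publish_to(self, registry):
        # Step 1: Publish the service description.
        registry.publish(self.name, self.description, endpoint=self)

    def invoke(self, *args):
        return self._implementation(*args)


class ServiceRequestor:
    def find_and_bind(self, registry, keyword):
        # Step 2: Find candidate services in the registry.
        matches = registry.find(keyword)
        if not matches:
            raise LookupError(f"no service matches {keyword!r}")
        # Step 3: Bind to the first match (alphabetical, for determinism).
        return matches[sorted(matches)[0]]["endpoint"]
```

A requestor that binds this way only depends on the registry and on the description, not on knowing the provider in advance.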


Vital Parts of Web Services

SOAP (Simple Object Access Protocol) is the protocol through which the service provider, service registry and service requestor communicate.

WSDL (Web Services Description Language) is the language used to create the service description.

UDDI (Universal Description, Discovery and Integration) is the directory technology used by service registries; a registry contains the descriptions of web services and allows the directory to be searched for a particular web service.
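To make the SOAP part concrete, here is a minimal sketch of building and reading a SOAP 1.1 request with Python's standard library. The operation name `getRunInfo`, its parameter and the `urn:example:dhrd` namespace are assumptions made for illustration; the slides do not show the prototype's actual messages.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
SERVICE_NS = "urn:example:dhrd"  # hypothetical service namespace


def build_request(operation, params):
    """Wrap an operation call and its parameters in a SOAP envelope."""
    ET.register_namespace("soapenv", SOAP_ENV)
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    call = ET.SubElement(body, f"{{{SERVICE_NS}}}{operation}")
    for name, value in params.items():
        ET.SubElement(call, name).text = str(value)
    return ET.tostring(envelope, encoding="unicode")


def parse_request(xml_text):
    """Recover the operation name and parameters from a SOAP envelope."""
    envelope = ET.fromstring(xml_text)
    body = envelope.find(f"{{{SOAP_ENV}}}Body")
    call = body[0]  # first child of Body is the operation element
    operation = call.tag.split("}", 1)[1]  # strip the "{namespace}" prefix
    return operation, {child.tag: child.text for child in call}
```

Note that parameter values come back as strings after the round trip, since XML text carries no type information by itself.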


How DHRD Could Be Integrated with The Grid As A Web Service

The Distributed Heterogeneous Relational Databases can register themselves as web services in a UDDI registry.

A client can then access these web services through a web application by using their WSDL descriptions.

In this architecture the client plays a central role: it can dynamically discover services and configure remote calls on the basis of the inputs it gets from an HTTP call.
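One way a client might configure calls dynamically is to read the operations and the endpoint address out of a WSDL document. The sketch below assumes a heavily abridged WSDL 1.1 file; the service name, operation names and URL are hypothetical, and a real description would also carry messages and bindings.

```python
import xml.etree.ElementTree as ET

WSDL_NS = {
    "wsdl": "http://schemas.xmlsoap.org/wsdl/",
    "soap": "http://schemas.xmlsoap.org/wsdl/soap/",
}

# Abridged, hypothetical WSDL 1.1 description of a database web service.
SAMPLE_WSDL = """
<definitions xmlns="http://schemas.xmlsoap.org/wsdl/"
             xmlns:soap="http://schemas.xmlsoap.org/wsdl/soap/"
             name="RunCatalogue">
  <portType name="RunCataloguePort">
    <operation name="getRunInfo"/>
    <operation name="listTables"/>
  </portType>
  <service name="RunCatalogueService">
    <port name="RunCataloguePort">
      <soap:address location="http://localhost:8080/dhrd/runcatalogue"/>
    </port>
  </service>
</definitions>
"""


def describe(wsdl_text):
    """Return the operation names and endpoint URL found in a WSDL document."""
    root = ET.fromstring(wsdl_text.strip())
    operations = [
        op.get("name")
        for op in root.findall("wsdl:portType/wsdl:operation", WSDL_NS)
    ]
    address = root.find("wsdl:service/wsdl:port/soap:address", WSDL_NS)
    return {"operations": operations, "endpoint": address.get("location")}
```

With this information in hand, the client knows where to send a request and which operations it may name, without either being hard-coded.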


Grid Services

The OGSA integrates key Grid technologies (including the Globus toolkit) with Web Services mechanisms to create a distributed system framework around the OGSI (Open Grid Services Infrastructure).

A Grid Service is a Web Service that conforms to a set of conventions (interfaces & behavior) that define how a client interacts with services available across the Grid.


Grid Services Architecture (cont’d) (Grid Database Service specification (GGF))

[Figure: Requester Using Grid Data Service Ports]

- Components: Requester, GridDataService
- Ports: GridServicePort, GridDataServicePort, GridDataTransportPort
- Messages: FindServiceData / <ServiceData>, Perform / <Response>, Put/get / <Response>


Grid Services Architecture (Grid Database Service specification (GGF))

[Figure: Creating a Grid Data Service]

- Components: Requester, GridServiceRegistry, GridDataServiceFactory, GridDataService, Database Servers
- Messages: FindServiceData, GSH (Grid Service Handle), CreateService, <ServiceInformation>, create
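The interaction in this figure can be read as a factory pattern: the requester asks the registry for a factory handle, then asks the factory to create a data service bound to a database. The sketch below is an assumption-laden illustration, not the GGF specification's API: the Python class and method names, the `gsh://` handle format and the in-memory SQLite database are all hypothetical.

```python
import itertools
import sqlite3


class GridDataService:
    """A service instance bound to one database connection."""

    def __init__(self, handle, connection):
        self.handle = handle
        self._connection = connection

    def perform(self, sql, params=()):
        return self._connection.execute(sql, params).fetchall()


class GridDataServiceFactory:
    def __init__(self, handle, connect):
        self.handle = handle
        self._connect = connect  # callable that opens a database connection
        self._counter = itertools.count(1)

    def create_service(self):
        # Each created instance gets its own handle derived from the factory's.
        handle = f"{self.handle}/instance-{next(self._counter)}"
        return GridDataService(handle, self._connect())


class GridServiceRegistry:
    def __init__(self):
        self._factories = {}

    def register(self, factory):
        self._factories[factory.handle] = factory

    def find_service_data(self, keyword):
        """Return the handles of registered factories that match the keyword."""
        return sorted(h for h in self._factories if keyword in h)

    def resolve(self, handle):
        return self._factories[handle]


def open_demo_database():
    # Assumption: an in-memory SQLite database stands in for a database server.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE runs (run_id INTEGER)")
    conn.execute("INSERT INTO runs VALUES (1)")
    return conn
```

A requester would call `find_service_data`, resolve the returned handle to a factory, call `create_service`, and then send statements through `perform`.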


Grid Services Client Infrastructure (Grid Database Service specification (GGF))

[Figure: A Client-Side runtime architecture]

- Client Application
- Client-Server Interface
- Proxy
- Binding Selection
- Protocol 1 (binding) Specific stub
- Protocol 2 (binding) Specific stub
- Invocation of Web Service
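The client-side runtime in this figure suggests a proxy that picks a protocol-specific stub before invoking the service. A minimal sketch of that binding selection follows; the `BindingProxy` class, the protocol names and the stub behavior are hypothetical and only illustrate the selection step.

```python
class BindingProxy:
    """Chooses the first available protocol-specific stub, in preference order."""

    def __init__(self, stubs, preferred):
        self._stubs = stubs          # protocol name -> callable stub
        self._preferred = preferred  # protocol names, most preferred first

    def invoke(self, operation, **params):
        for protocol in self._preferred:
            stub = self._stubs.get(protocol)
            if stub is not None:
                return stub(operation, params)
        raise RuntimeError("no stub available for any preferred protocol")


def soap_stub(operation, params):
    # Placeholder: a real stub would marshal the call into a SOAP message.
    return ("http-soap", operation, params)
```

The client application only calls `invoke`; which binding carries the call is decided inside the proxy.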


Proposed Web Services Architecture Based on Grid Services To Use DHRD In Grid Environment

[Figure: architecture diagram. Labels recoverable from the slide:]

- Roles: Service Provider, Service Requestor, Service Registry
- Database servers: Oracle9i server data (metadata), shown twice; MS-SQL data (metadata)
- Server with Master Database
- Server with Materialized View Database
- Data Replication through SSL
- Java XML API to connect with the database server
- Web Server
- SOAP Processor
- WSDL file
- HTTP Server
- Client Web Application to connect with the database
- Bind with the provided service
- UDDI Registry Server
- UDDI SOAP Request and Response
- SOAP (two links)
- MonALISA


Technologies Employed

- Java Web Services Developer Pack 1.0 (JWSDP)
- Apache Tomcat 4.1.2 for Java Web Services Developer Pack 1.0
  - Apache web server
  - Tomcat servlet engine
- Java API for XML Registries (JAXR) 1.0_02
- Java API for XML-based RPC (JAX-RPC) 1.0_01
- Web Application Deployment Tool for JWSDP
- XRPCC tool to generate WSDL
- JWSDP Registry Server 1.0_02
  - Xindice database, the repository for registry data
  - implements Version 2.00 of the Universal Description, Discovery and Integration (UDDI) specification


UDDI Compliant Service Registry

- A standardized, transparent mechanism for describing the service
- A simple mechanism for invoking the service
- An accessible central registry of services
- Makes use of XML and SOAP
- Provides a service discovery platform on the WWW
- Suitable for a “Black Box” web environment
- Allows as much detail about a service and its implementation to be stored as desired
- The UDDI version 2.0 API defines approx. 40 messages to perform inquiry and publishing functions against any UDDI compliant service registry
- The schema defines 25 requests and 15 responses
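As an illustration of one of these inquiry messages, the sketch below builds a UDDI version 2 `find_service` request wrapped in a SOAP envelope. It assumes the `urn:uddi-org:api_v2` namespace and the `generic="2.0"` attribute used by the version 2 API; the search pattern is hypothetical, and the message is only constructed here, not sent to a registry.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
UDDI_V2 = "urn:uddi-org:api_v2"  # UDDI version 2 API namespace


def build_find_service(name_pattern):
    """Build a find_service inquiry that searches services by name."""
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    find = ET.SubElement(body, f"{{{UDDI_V2}}}find_service", {"generic": "2.0"})
    ET.SubElement(find, f"{{{UDDI_V2}}}name").text = name_pattern
    return ET.tostring(envelope, encoding="unicode")
```

In the prototype's stack such messages would normally be produced by JAXR rather than by hand; the sketch only shows what travels over the wire.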


Working of Web Services Prototype

[Figure: prototype workflow with steps numbered 1 to 10. Labels recoverable from the slide:]

- Nodes: Database Server, Registry Server, Web server, Web Service Requester
- Requester side: Program Interface, Stubs, JAX-RPC Runtime
- Server side: Program Implementation, Ties, JAX-RPC Runtime
- Links: http, JAXR, Find-service, SOAP Message (three links), JAX-RPC


Working of Web Services Prototype

DEMO


Conclusion

It appears possible to make Distributed Heterogeneous Relational Data Warehouse (DHRD) databases available across the Grid in the form of Web Services/Grid Services.


Future Work

Integrate MonALISA (a Grid monitoring tool) to locate the required web service with optimal network resources

Exploit the full functionality of UDDI

Provide an API to integrate this Grid Services based Web Services prototype into the Globus toolkit


Questions?