Building Problem Solving Environments with Application Web Service Toolkits

Choonhan Youn and Marlon Pierce
Computer Science, Syracuse University, and Community Grid Labs, Indiana University


Presentation Outline
• Introduction
– What is a computational web portal?
– Gateway: a computational web portal
– Limitations of the traditional approach
• Web Service-Based Computing Portal Architecture
• Core Web services for Computing Portals
– Job submission
– File manipulation
– Context management
– Script generation
– Job monitoring
• Application Web services
• Web service negotiation
• Conclusion

Computational Web Portals

• Computational Web Portals provide seamless access to HPC resources.
– You can log in from anywhere through any general web browser.
• Portals simplify the use of HPC systems for novice users.
– Basics: batch script generation, job submission and monitoring, file service and ……
– Computational grid services: Globus, Condor
• Portals can simplify the use of unfamiliar codes.
– GEM codes: disloc, simplex
• Portals provide a work management environment for all users.
– You can see what you did last week.
• Other PSE Web portals
– NASA Information Power Grid LaunchPad
– NPACI HotPage
– Pacific Northwest National Laboratory’s Ecce system; UNICORE
– Our own Gateway/ServoGrid projects

Gateway project

• Gateway is a computational web portal project funded through:
– DoD HPCMO PET Portal: Kerberos security in a computational web portal
– GEM science: support for codes developed by the earthquake modeling consortium
– Alliance: contributions to the NCSA portal
– SciDAC (Scientific Discovery through Advanced Computing): DOE project to build portal services for plasma physics
• Our goal is to provide building-block components that can be used to build specific portals.
• We also develop browser-based interfaces for basic services and specific science codes.
• Gateway was developed to support typical, if simple, high performance computing services.
– Batch script generation, job submission and monitoring, file management and transfer.
– Do it all securely.

Problems with Traditional Portal Architecture

• Portals access heterogeneous back ends and grids through a particular middle tier.
• Most portal projects are not interoperable.
– Middle-tier software is incompatible.
– A wide range of protocols is in use.
• Why do we need portal interoperability?
– Portal developers don’t have to reinvent every single important service (lesson from GGF GCE).
– Users will have access to more services than any one project can provide.
– Users will be able to pick the best available implementation of a service.

[Figure: two separate portal stacks, each made up of a web browser, services, and back end resources, with a "?" between them.]

Web Service-Based Computing Portal Architecture

Legend:
JS: Job Submission
JM: Job Monitoring
FT: File Transfer
CM: Context Manager
SG: Script Generation
AWS: Application Web Service
HIS: Host Independent Service
HSS: Host Specific Service

[Figure: architecture diagram. Labeled components: Web Browser; User Interface Server (SOAP Client, Repository Client); Portal Server; Service Repository; Web Services Provider; Middle Tier (Web Server) hosting a Simulation Component (JS, JM, FT) and a Data Component (FT, JS, JM); host-independent services (CM, SG, AWS) and host-specific services (HSS); Backend Resources (HPC, Data Base). Links are labeled HTTP, SOAP, and Publish.]

Core Web services – 1

• Given WSDL and SOAP, what can you build?
• Host-Specific Services (HSS)
– Instances of these services are bound to particular hosts.
– Job Submission
– File Transfer
– Job & Host Monitoring
• Host-Independent Services (HIS)
– Informational services that are not tied to specific service points
– The service provided does not depend on the location.
– Context Management
– Script Generation
• These core services are simple and stateless.

Core Web services - 2

• Job Submission
– Allows users to execute scientific applications.
– Executes operating system calls directly, or may interact with Grid services through, for example, the CoG client API to Globus.
– We use Java Runtime processes to run external (non-Java) commands, for example, PBS qsub.
• File Manipulation
– Users can upload and download files between their desktops and various backend destinations.
– Allows users to transparently move, rename, and copy files on remote back ends and crossload between different backend sites.
– The file uploading and downloading services illustrate the use of SOAP messages with attachments in the RPC messaging style.
– SOAP attachments are non-XML files that are appended to the SOAP message and are useful for sending binary data and files with known MIME formats.
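
As a rough illustration of the job submission idea above (launching an external, non-portal command as a child process and returning its result), here is a minimal sketch. The slides describe doing this with Java Runtime processes; Python's subprocess module is swapped in here purely for illustration. The names submit_job and SubmissionResult are hypothetical, and a harmless stand-in command replaces qsub so the sketch runs without a PBS installation.

```python
import subprocess
import sys
from dataclasses import dataclass
from typing import Sequence


@dataclass
class SubmissionResult:
    """What a submission service might hand back to the portal (hypothetical)."""
    exit_code: int
    stdout: str
    stderr: str


def submit_job(command: Sequence[str], timeout_s: float = 30.0) -> SubmissionResult:
    """Run an external command as a child process and capture its output."""
    completed = subprocess.run(
        list(command),
        capture_output=True,  # collect stdout/stderr instead of inheriting them
        text=True,            # decode bytes to str
        timeout=timeout_s,
        check=False,          # report failures through exit_code, do not raise
    )
    return SubmissionResult(completed.returncode,
                            completed.stdout.strip(),
                            completed.stderr.strip())


# On a PBS host the command might be ["qsub", "job.pbs"]. A stand-in command
# is used here so the sketch runs anywhere Python is installed.
demo = submit_job([sys.executable, "-c", "print('demo-job-id')"])
print(demo.exit_code, demo.stdout)
```

Returning the exit code and captured output as plain data, rather than raising, keeps the service stateless and easy to expose through a remote interface.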

Core Web services - 3

• Context Management (CM)
– Archives interactions with the computational portal and stores all of the metadata associated with user sessions.
– Provides the simplest possible data model.
    • CM provides an easy interface to an arbitrarily deep and complex tree-shaped data structure.
    • Context data nodes are defined by a recursive schema; each node holds optional, unbounded name/value pairs and child nodes.
– We use CM to store locations of job scripts, miscellaneous file URIs, users’ application instance XML files, etc.
– CM metadata is stored on file systems, XML-native databases, ….
    • Actual data may be anywhere.
– Actual service interface for manipulating contexts and the context data:
    • Add one or more contexts.
    • Search and store the context data with XPath queries.
    • Remove the specified context.
    • List the child contexts.
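
The context tree and its four operations (add, search, remove, list children) can be sketched in a few lines. This is only an illustration under stated assumptions: the class name ContextStore, the element names context and property, and the example URI are all hypothetical, and the search uses the limited XPath subset supported by Python's xml.etree.ElementTree rather than the actual Context Manager implementation.

```python
import xml.etree.ElementTree as ET


class ContextStore:
    """In-memory tree of contexts; each node holds name/value pairs and children."""

    def __init__(self):
        self.root = ET.Element("context", {"name": "root"})

    def _find(self, path):
        """Walk a slash-separated path such as 'alice/run1' from the root."""
        node = self.root
        for name in filter(None, path.split("/")):
            node = next((c for c in node.findall("context")
                         if c.get("name") == name), None)
            if node is None:
                raise KeyError(path)
        return node

    def add_context(self, parent_path, name):
        return ET.SubElement(self._find(parent_path), "context", {"name": name})

    def set_property(self, path, name, value):
        prop = ET.SubElement(self._find(path), "property", {"name": name})
        prop.text = value

    def search(self, xpath):
        """Return the text of every node matched by an ElementTree XPath query."""
        return [node.text for node in self.root.findall(xpath)]

    def list_children(self, path):
        return [c.get("name") for c in self._find(path).findall("context")]

    def remove_context(self, parent_path, name):
        parent = self._find(parent_path)
        parent.remove(self._find(parent_path + "/" + name))


store = ContextStore()
store.add_context("", "alice")
store.add_context("alice", "run1")
# Hypothetical URI, used only to show a stored job script location.
store.set_property("alice/run1", "jobScript", "file:///tmp/run1.pbs")
hits = store.search(".//context[@name='run1']/property[@name='jobScript']")
children = store.list_children("alice")
store.remove_context("alice", "run1")
```

Only metadata lives in the tree; the property values point at where the actual data is kept.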

Context Manager Architecture

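[Figure: a Client talks over SOAP/HTTP to the Axis Servlet, which exposes the ContextManager through a shared WSDL interface. The ContextManager reaches the Context Data, held in a file system (FS) or XML database (XMLDB), through internal communication.]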

Core Web services - 4

• Script Generation
– For users who are unfamiliar with HPC systems.
– The choices a user makes while interacting with the portal are stored as the user’s application instance XML document.
– Generates the job script, which can be broken down into two parts: a queue script for a particular queuing system (such as PBS, LSF, or LoadLeveler) and a user script for running the application code.
• Job Monitoring
– Built using the polling method.
– Monitors the execution of a job running in a queuing system.
– Given the user name and the type of queuing system as input parameters, the job monitoring method returns an array of a generated WSDL complex type, effectively an XML data object that contains the job status reported by the scheduler.
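
A minimal sketch of the two-part script generation described above: a queue-specific header plus a user script that runs the code. All function names, dictionary keys, and file names here are assumptions made for illustration, and a plain dictionary stands in for the application instance XML document. Only a PBS builder is shown; other queuing systems would register their own builder.

```python
def pbs_queue_script(job_name, nodes, walltime):
    """Queue part: PBS directives for the batch system."""
    return "\n".join([
        "#!/bin/sh",
        f"#PBS -N {job_name}",
        f"#PBS -l nodes={nodes}",
        f"#PBS -l walltime={walltime}",
    ])


def user_script(workdir, executable, input_file, output_file):
    """User part: the commands that actually run the application code."""
    return "\n".join([
        f"cd {workdir}",
        f"{executable} < {input_file} > {output_file}",
    ])


# One builder per queuing system; LSF or LoadLeveler builders would be added here.
QUEUE_SCRIPT_BUILDERS = {"PBS": pbs_queue_script}


def generate_job_script(instance):
    """Combine the queue part and the user part into one job script."""
    builder = QUEUE_SCRIPT_BUILDERS.get(instance["queue_system"])
    if builder is None:
        raise ValueError(f"no script builder for {instance['queue_system']}")
    queue_part = builder(instance["job_name"], instance["nodes"], instance["walltime"])
    user_part = user_script(instance["workdir"], instance["executable"],
                            instance["input_file"], instance["output_file"])
    return queue_part + "\n" + user_part


# Hypothetical user choices, standing in for the application instance document.
instance = {
    "queue_system": "PBS", "job_name": "disloc_run", "nodes": 1,
    "walltime": "00:10:00", "workdir": "/scratch/portal",
    "executable": "./disloc", "input_file": "disloc.in",
    "output_file": "disloc.out",
}
script = generate_job_script(instance)
print(script)
```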

File manipulation service

[Screenshot: lists user files on the selected host, Solar. File operations include upload, download, copy, rename, and crossload.]

Job monitoring service

[Screenshot: lists the user’s job status on the selected host, Solar, which is running the PBS queuing system.]
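
Polling-based monitoring, as used by the job monitoring service, can be sketched as a loop that asks for the job status at a fixed interval until a terminal state is seen. The status names, the poll_job function, and the fake scheduler below are hypothetical; a real service would query the queuing system instead.

```python
import time
from typing import Callable, List

TERMINAL_STATES = {"finished", "failed"}  # hypothetical status names


def poll_job(get_status: Callable[[], str],
             interval_s: float = 5.0,
             max_polls: int = 100) -> List[str]:
    """Poll until a terminal state is seen or max_polls is reached."""
    history: List[str] = []
    for _ in range(max_polls):
        status = get_status()
        history.append(status)
        if status in TERMINAL_STATES:
            break
        time.sleep(interval_s)
    return history


# A fake scheduler that reports three successive states.
fake_scheduler = iter(["queued", "running", "finished"])
history = poll_job(lambda: next(fake_scheduler), interval_s=0.0)
print(history)
```

The max_polls bound keeps a stuck job from holding the monitoring call open indefinitely.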

Application Web Services (AWS)

• Application: specifically, some code developed by the scientific community.
– Examples: finite element codes, grid generation codes, and so on.
• AWS are designed to turn scientific applications (e.g., earthquake modeling codes) into Grid resources.
• An actual application is wrapped by a Java program.
• We need a meaningful metadata model for applications.
– Describe application-specific requirements.
– Describe bindings of applications to host environments and to Web services in a general way that is independent of the particular portal.
• Scientific applications are composed from several core Web services.
– Get files to the right place, generate script submission instructions, submit the job, get notified at various states.

AWS Lifecycle

• Applications can exist in four states:
– Abstract: describes the optional choices and configurations that are available.
– Ready: specific choices have been made.
– Submitted: the application is running.
– Completed: the application has finished, but we need to archive information about it.
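
The four lifecycle states can be modeled as a small state machine. The slides name the states but do not spell out the allowed transitions, so the strictly linear order used below (abstract, ready, submitted, completed) is an assumption, as are the names AppState and ApplicationInstance.

```python
from enum import Enum


class AppState(Enum):
    ABSTRACT = "abstract"    # options and configurations are described
    READY = "ready"          # specific choices have been made
    SUBMITTED = "submitted"  # the application is running
    COMPLETED = "completed"  # finished; information is archived


# Assumed linear lifecycle: each state may only advance to the next one.
ALLOWED = {
    AppState.ABSTRACT: {AppState.READY},
    AppState.READY: {AppState.SUBMITTED},
    AppState.SUBMITTED: {AppState.COMPLETED},
    AppState.COMPLETED: set(),
}


class ApplicationInstance:
    def __init__(self, name):
        self.name = name
        self.state = AppState.ABSTRACT

    def advance(self, new_state):
        """Move to new_state, rejecting transitions the lifecycle does not allow."""
        if new_state not in ALLOWED[self.state]:
            raise ValueError(f"cannot go from {self.state.value} to {new_state.value}")
        self.state = new_state


app = ApplicationInstance("simplex")
app.advance(AppState.READY)
app.advance(AppState.SUBMITTED)
print(app.state.value)
```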

AWS Schema Structure

• Two sets of XML schema:
– Application Descriptors:
    • Describe the abstract state.
    • Describe application options. Used by the application developer to deploy his/her service into the portal.
– Application Instance Descriptors:
    • Describe particular instance states (ready, running, archived).
    • Describe particular user choices and archive them for later browsing and resubmission.
• Schema sets are arranged hierarchically.
– Applications contain hosts.
– Schema are designed to be pluggable.
    • Don’t like my queue description schema? Plug in your own.

AWS XML Descriptors

• Application description schema
– A “basic information” element that contains information such as application name, version, and option flags.
– An “internal communication” element that contains child elements for describing input, output, and error fields for the code.
– An “execution environment” element that contains a list of core services needed to execute the application.
– An optional, generic parameter to hold arbitrary information about the application.
• Host description schema
– Contains information about the resource, such as DNS name and IP address.
– Contains all of the information needed to invoke the parent application on that resource, such as the location of the executable, the location of the workspace or scratch directory, and so on.
• Queue description schema
– Contains information needed to perform queue submissions, such as memory size, number of CPUs, and so on (in the case of PBS).
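
To make the nesting of application, host, and queue descriptions concrete, here is a sketch of what such a descriptor might look like and how a portal could read it. Every element name, attribute name, and value below is invented for illustration and is not taken from the actual AWS schema; the host name and address use reserved documentation ranges.

```python
import xml.etree.ElementTree as ET

# Hypothetical descriptor: an application containing one host, which contains
# one queue description. The real schema's element names are not shown in the
# slides, so these are stand-ins.
DESCRIPTOR_XML = """
<application>
  <basicInformation name="simplex" version="1.0" flags=""/>
  <internalCommunication input="simplex.in" output="simplex.out" error="simplex.err"/>
  <executionEnvironment>
    <coreService>FileTransfer</coreService>
    <coreService>ScriptGeneration</coreService>
    <coreService>JobSubmission</coreService>
  </executionEnvironment>
  <host dnsName="hpc.example.org" ipAddress="192.0.2.10">
    <executable>/opt/codes/simplex</executable>
    <workspace>/scratch/portal</workspace>
    <queue system="PBS" memory="512mb" cpus="4"/>
  </host>
</application>
"""


def summarize(xml_text):
    """Pull out the fields a portal would need to deploy and run the code."""
    app = ET.fromstring(xml_text.strip())
    hosts = []
    for host in app.findall("host"):
        queue = host.find("queue")
        hosts.append({
            "dns": host.get("dnsName"),
            "executable": host.findtext("executable"),
            "queue_system": queue.get("system"),
            "cpus": int(queue.get("cpus")),
        })
    return {
        "name": app.find("basicInformation").get("name"),
        "services": [s.text for s in app.findall("executionEnvironment/coreService")],
        "hosts": hosts,
    }


summary = summarize(DESCRIPTOR_XML)
print(summary["name"], summary["services"], summary["hosts"][0]["dns"])
```

Because the queue description is just a child element of the host, a different queue schema could be substituted without touching the application or host parts.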

[Screenshot: example of deploying an application code, Simplex, on a particular host as a service. This form is used to edit the Application XML descriptor file.]

[Screenshot: sample generated user view of the application code Simplex. This form is generated from the Application XML descriptor for a particular application run: the input files used, the location of the output, the resources used for the computation, etc.]

Portal Stack

• Core services provide the basic connection to back end “Grid” services.

• Application services combine core services and application metadata.

• User interface portlets are built for each service.

• Portals aggregate portlet components into portals.

[Figure: layered portal stack with the layers Core Web Services, Application Web Services and Workflow, User Interfaces, and Aggregate Portals, with Message Security and Information alongside the stack.]

Portlets for User Interface Components

• Web services define XML interfaces for accessing services.

• User interface components (such as JSPs) combine service stubs into useful objects for human interaction.

• So we actually have two points of interoperability:
– At the WSDL interface
– At the user interface
• Portlets combine HTML (and other) user interfaces into aggregate portal interfaces.
– Example: Jetspeed from Jakarta

Reliability of Distributed Services

• Distributed service systems have some important reliability problems.
– Information must be up to date.
    • The system must adjust when servers become available or unavailable.
    • Service metadata should match the actual capabilities of the system.
– Messages should reach the services.
• We are automating application service metadata through publish/subscribe mechanisms.
– Servers contain embedded publisher/subscriber clients.
– Information aggregators publish requests for information to JMS-style brokers.
– All available servers subscribed to the request topic publish their information back to the aggregator.
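
The request/reply pattern in the last three bullets can be sketched with a tiny in-memory broker. This is not JMS or the project's messaging system; the classes Broker, Server, and Aggregator, the topic names, and the synchronous delivery are all simplifying assumptions meant only to show how an aggregator's view tracks which servers are currently available.

```python
from collections import defaultdict


class Broker:
    """Minimal topic-based broker: delivers each message to all subscribers."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, callback):
        self._subscribers[topic].append(callback)

    def publish(self, topic, message):
        for callback in list(self._subscribers[topic]):
            callback(message)


class Server:
    """A service host with an embedded subscriber/publisher client."""

    def __init__(self, name, services, broker):
        self.name, self.services, self.broker = name, services, broker
        self.available = True
        broker.subscribe("info.request", self._on_request)

    def _on_request(self, request):
        if self.available:  # unavailable servers stay silent
            self.broker.publish(request["reply_topic"],
                                {"server": self.name, "services": self.services})


class Aggregator:
    """Publishes a request for information and collects the replies."""

    def __init__(self, broker):
        self.broker = broker
        self.replies = {}
        broker.subscribe("info.reply", self._on_reply)

    def _on_reply(self, reply):
        self.replies[reply["server"]] = reply["services"]

    def refresh(self):
        self.replies = {}
        self.broker.publish("info.request", {"reply_topic": "info.reply"})
        return dict(self.replies)


broker = Broker()
server_a = Server("a", ["JobSubmission"], broker)
server_b = Server("b", ["FileTransfer"], broker)
aggregator = Aggregator(broker)
first_view = aggregator.refresh()
server_b.available = False  # simulate a server going down
second_view = aggregator.refresh()
print(first_view, second_view)
```

Rebuilding the view on every refresh, instead of caching old replies, is what keeps the published metadata in step with the servers that can actually answer.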

Bridging Between Client-Server and Messaging Services

[Figure: a Browser connects over HTTP to a Dynamic User Interface Component, which sends a Web service request for information over SOAP to the Aggregator. The Aggregator and a Broker connect to several Tomcat Servers. Servers run Narada notifiers; peers register themselves to the Aggregator.]

Conclusions

• Traditional portals have “stovepipes” with interoperability problems.
• By designing and implementing several core portal services and Application Web Services around Web services, we gain interoperability and reusability.
• The emphasis is on the development of reusable services that can form the basis for multiple PSEs.
• The portal developer can construct specific implementations and composites of primitive service components and can also provide services that may be shared among different portals.
• Application-specific services and data models can be used to encapsulate entire applications independently of the portal implementation.
• User interfaces to application services become distributed portlets.
• Everything is distributed.
– Core Web Services -> Application Web Services -> User Interface Portlets -> Portals
– Uses HTTP, SOAP, WSDL, ….
• It all has to be secured.
– A flexible, message-based security system that can be bound to multiple mechanisms and multiple message formats.
– The general approach: use assertions.
– SAML, WS-Security
– Kerberos, PKI