Status of Grid Technology/Middleware (e-Science, Cyberinfrastructure) May 29 2003 Geoffrey Fox, Indiana University Note the terms Grid, e-Science Technology/Middleware, and Cyberinfrastructure are NOT distinguished


Page 1: Status of Grid Technology/Middleware (e-Science, Cyberinfrastructure)

Status of Grid Technology/Middleware (e-Science, Cyberinfrastructure)

May 29 2003

Geoffrey Fox, Indiana University

Note the terms Grid, e-Science Technology/Middleware, and Cyberinfrastructure are NOT distinguished

Page 2: Status of Grid Technology/Middleware (e-Science, Cyberinfrastructure)

What is a Grid I?
• Collaborative Environment (Ch2.2,18)
• Combining powerful resources, federated computing and a security structure (Ch38.2)
• Coordinated resource sharing and problem solving in dynamic multi-institutional virtual organizations (Ch6)
• Data Grids as Managed Distributed Systems for Global Virtual Organizations (Ch39)
• Distributed Computing or distributed systems (Ch2.2,10)
• Enabling Scalable Virtual Organizations (Ch6)
• Enabling use of enterprise-wide systems, and someday nationwide systems, that consist of workstations, vector supercomputers, and parallel supercomputers connected by local and wide area networks. Users will be presented the illusion of a single, very powerful computer, rather than a collection of disparate machines. The system will schedule application components on processors, manage data transfer, and provide communication and synchronization in such a manner as to dramatically improve application performance. Further, boundaries between computers will be invisible, as will the location of data and the failure of processors. (Ch10)

Page 3: Status of Grid Technology/Middleware (e-Science, Cyberinfrastructure)

What is a Grid II?
• Supporting e-Science representing increasing global collaborations of people and of shared resources that will be needed to solve the new problems of Science and Engineering (Ch36)
• As infrastructure that will provide us with the ability to dynamically link together resources as an ensemble to support the execution of large-scale, resource-intensive, and distributed applications. (Ch1)
• Makes high-performance computers superfluous (Ch6)
• Metasystems or metacomputing systems (Ch10,37)
• Middleware as the services needed to support a common set of applications in a distributed network environment (Ch6)
• Next Generation Internet (Ch6)
• Peer-to-peer Network (Ch10, 18)
• Realizing thirty year dream of science fiction writers that have spun yarns featuring worldwide networks of interconnected computers that behave as a single entity. (Ch10)

Page 4: Status of Grid Technology/Middleware (e-Science, Cyberinfrastructure)

What is Grid Technology?
• Grids support distributed collaboratories or virtual organizations integrating concepts from
  – The Web
  – Distributed Objects (CORBA, Java/Jini, COM)
  – Globus, Legion, Condor, NetSolve, Ninf and other High Performance Computing activities
  – Peer-to-peer Networks
• With perhaps the Web being the most important for “Information Grids” and Globus for “Compute Grids”
• Use Information Grids and not usual Data Grids as “distributed file systems” (holding lots of data!) are handled in Compute Grids

Page 5: Status of Grid Technology/Middleware (e-Science, Cyberinfrastructure)

Taxonomy of Grid Functionalities
Name of Grid Type: Description of Grid Functionality
• Compute/File Grid: Run multiple jobs with distributed compute and data resources (Global “UNIX Shell”)
• Desktop Grid: “Internet Computing” and “Cycle Scavenging” with secure sandbox on large numbers of untrusted computers
• Information Grid: Grid service access to distributed information, data and knowledge repositories
• Complexity or Hybrid Grid: Hybrid combination of Information and Compute/File Grid emphasizing integration of experimental data, filters and simulations
• Campus Grid: Grid supporting University community computing
• Enterprise Grid: Grid supporting a company’s enterprise infrastructure

Note: Term Data Grid not used consistently in community so avoided

Page 6: Status of Grid Technology/Middleware (e-Science, Cyberinfrastructure)

Complexity Grid Computing Model

[Figure: Complexity Grid Computing Model. Labels: HPC Simulation; Data and Filter (repeated pairs); “Distributed Filters massage data for simulation”; Other Grid and Web Services; Analysis; Control; Visualize; Grid; OGSA-DAI Grid Services; “This type of Grid integrates with parallel computing as on TeraGrid”.]

Page 7: Status of Grid Technology/Middleware (e-Science, Cyberinfrastructure)

Taxonomy of Grid Operational Style
Name of Grid Style: Description of Grid Operational or Architectural Style
• Semantic Grid: Integration of Grid and Semantic Web meta-data and ontology technologies
• Peer-to-peer Grid: Grid built with peer-to-peer mechanisms
• Lightweight Grid: Grid designed for rapid deployment and minimum life-cycle support costs
• Collaboration Grid: Grid supporting collaborative tools like the Access Grid, whiteboard and shared applications.
• R3 or Autonomic Grid: Fault tolerant and self-healing Grid (R3: Robust, Reliable, Resilient)

Page 8: Status of Grid Technology/Middleware (e-Science, Cyberinfrastructure)

“Central” Architecture/Functionality/Style Gaps
• Substantial comments on “hosting environments”, OGSI and “permeating principles”
  – Agreement on Web service model

[Figure: numbered architecture layers. 1: Hosting Environment (WS, WS, WS, WS); 2: OGSI Web service Enhancements; 3: Permeating Principles and Policies; 4: Key OGSA Services; 5: OGSA-compliant System Grid Services; 6: Domain-Specific (Application) Grid Services. Annotations: “Central Services and Architecture” (Central Gaps); “Modular” Services natural for distributed teams (Specific Gaps).]

Page 9: Status of Grid Technology/Middleware (e-Science, Cyberinfrastructure)

An OGSA Grid Architecture in detail (from GGF GPA)

Page 10: Status of Grid Technology/Middleware (e-Science, Cyberinfrastructure)

Permeating Principles and Policies
• Meta-data rich Message-linked Web Services as the permeating paradigm
• “User” Component Model such as “Enterprise JavaBean (EJB)” or .NET.
• Service Management framework including a possible Factory mechanism
• High level Invocation Framework describing how you interact with system components.
  – This could for example be used to allow the system to be built from either W3C or GGF style (OGSI) Web Services and to protect the user from changes in their specifications.
• Security is a service but the need for fine grain selective authorization encourages
• Policy context that sets the rules for each particular Grid.
  – Currently OGSA supports policies for routing, security and resource use.
• The Grid Fabric or set of resources needs mechanisms to manage them. This includes automatic recording of meta-data and configuration of software.
• Quality of service (QoS) for the Network and this implies performance monitoring and bandwidth reservation services.
  – Challenging as end-to-end and not just backbone QoS is needed.
• Messaging systems like MQSeries from IBM provide robustness from asynchronous delivery and can abstract destination and allow customization of content such as converting between different interface specifications.
• Messaging is built on transport mechanisms which can be used to support mechanisms to implement QoS and to virtualize ports

Page 11: Status of Grid Technology/Middleware (e-Science, Cyberinfrastructure)

World Wide Grid Service Activities I
• Commercial activities especially those of IBM, Avaki, Platform, Sun, Entropia and United Devices
• The GT2 and GT3 Globus Toolkits. Here we are effectively covering not just the Globus team but the major projects such as the NASA Information Power Grid that have blazed the trail of “productizing” Grids.
  – Note that we can “already” see GT3 (Grid Service) like functionality from GT2 wrapped with the various (Java, Perl, Python, CORBA) CoG kits. So GT2 capabilities can be classified as Services
• Trillium (GriPhyN, iVDGL and PPDG) and NeesGrid; the major NSF (DoE for PPDG) projects in the USA.
  – Condor from the University of Wisconsin which is being integrated into Grid services through the Trillium and NMI activities.
• The NSF Middleware Initiative (NMI) packaging a suite of Globus, Condor and Internet2 software.
  – This has overlaps with the VDT (Virtual Data Toolkit from GriPhyN)

Page 12: Status of Grid Technology/Middleware (e-Science, Cyberinfrastructure)

World Wide Grid Service Activities II
• Unicore (GRIP), GridLab, the European Data Grid (EDG) and LCG (LHC Computing Grid)
  – Many other (20) EU Projects but these have most of technology development
• Storage Resource Broker SRB-MCAT from SDSC
• The DoE Science Grid and related activities such as the Common Component Architecture (CCA) project
• Examination of services from a collection of portal projects in the US from Argonne, Indiana, Michigan, NCSA and Texas.
  – This includes best practice discussion from Global Grid Forum in portals.
• Review of contributions to the recent book Grid Computing: Making the Global Infrastructure a Reality edited by Fran Berman, Geoffrey Fox and Tony Hey, John Wiley & Sons, Chichester, England, ISBN 0-470-85319-0, March 2003
  – This includes other major projects like Cactus, NetSolve, Ninf
• Some 6 Core and other application specific UK e-Science Projects

Page 13: Status of Grid Technology/Middleware (e-Science, Cyberinfrastructure)

Categorization of Technical Gaps and Grid Services
Section Numbers in Report available Mid June

[Figure: categorization diagram. Labels: Architecture and Style 8.1; Basic Technology; Runtime and Hosting Environment 8.2; Grid Services: Security 8.3, Workflow 8.4, Notification 8.5, Meta-data 8.6, Information 8.7, Compute/File 8.8, Other 8.9; Portals, PSE’s 8.10; Network 8.11; Information; Compute Resources; Application Specific; Resource Specific; Generic.]

Page 14: Status of Grid Technology/Middleware (e-Science, Cyberinfrastructure)

Categories of Worldwide Grid Services
• 1) Types of Grid
  – R3
  – Lightweight
  – P2P
  – Federation and Interoperability
• 2) Core Infrastructure and Hosting Environment
  – Service Management
  – Component Model
  – Service wrapper/Invocation
  – Messaging
• 3) Security Services
  – Certificate Authority
  – Authentication
  – Authorization
  – Policy
• 4) Workflow Services and Programming Model
  – Enactment Engines (Runtime)
  – Languages and Programming
  – Compiler
  – Composition/Development
• 5) Notification Services
• 6) Metadata and Information Services
  – Basic including Registry
  – Semantically rich Services and meta-data
  – Information Aggregation (events)
  – Provenance
• 7) Information Grid Services
  – OGSA-DAI/DAIT
  – Integration with compute resources
  – P2P and database models
• 8) Compute/File Grid Services
  – Job Submission
  – Job Planning Scheduling Management
  – Access to Remote Files, Storage and Computers
  – Replica (cache) Management
  – Virtual Data
  – Parallel Computing
• 9) Other services including
  – Grid Shell
  – Accounting
  – Fabric Management
  – Visualization Data-mining and Computational Steering
  – Collaboration
• 10) Portals and Problem Solving Environments
• 11) Network Services
  – Performance
  – Reservation
  – Operations

Page 15: Status of Grid Technology/Middleware (e-Science, Cyberinfrastructure)

Features of Worldwide Grid Services
• UK activities have a strong web service and Information Grid emphasis
  – Important compute/file activities as well (White Rose, RealityGrid, UK part of EDG etc.)
• Non UK activities are dominantly focused on compute/file Grids
  – Submit jobs in distributed UNIX shell (Gridshell) fashion
  – Gather data from instruments (accelerator, satellite, medical device); process in batch mode mapping between filesets
• Little emphasis on lightweight or R3 Grids but NSF in USA and EDG have aimed at better support and software quality
  – EDG has useful “tension” between technology and application focus working groups
  – NMI and even GT3 have changed packaging and added a service view but have not changed the “underlying” architecture for robustness
• Coordinated set of Portal activities in USA
• Little work on integrating parallel computing and Grid although TeraGrid in USA could change this

Page 16: Status of Grid Technology/Middleware (e-Science, Cyberinfrastructure)

Central Gaps: Gaps in Grid Styles and Execution Environment
• Need for both robust (fault tolerant) and lightweight (suitable for small groups) Grid styles identified
  – Peer-to-peer style supports smaller decentralized virtual organizations

• Note opportunities for modern middleware ideas to be used – lightweight, message-based

• Note that Enterprise JavaBeans not optimized for Science which has high volume dataflow

• Federated Grid Architecture natural for integration of heterogeneous functionality, style and security

• Bioinformatics and other fields require integration of Information and Compute/File Grids

Page 17: Status of Grid Technology/Middleware (e-Science, Cyberinfrastructure)

Overlapping Heterogeneous Dynamic Grid Islands

[Figure: overlapping Grid islands. Labels: Information Grid; Enterprise Grid; Compute Grid; Campus Grid; R1; R2; Teacher; Students; Dynamic light-weight Peer-to-peer Collaboration Training Grid.]

Page 18: Status of Grid Technology/Middleware (e-Science, Cyberinfrastructure)

[Figure, two panels. (a) Layered OGSA Grid. Labels: Core Service (repeated); Application Service (repeated); OGSA Interface. (b) Federated OGSA Grid. Labels: Grid-1; Grid-2; Core Service (repeated); Appl. Service (repeated); OGSA or non OGSA Interface-1; OGSA or non OGSA Interface-2; OGSA Mediation.]

Page 19: Status of Grid Technology/Middleware (e-Science, Cyberinfrastructure)

Many Gaps in Generic Services
• Some gaps like Workflow and Notification are to make production versions of current projects
  – Just in UK workflow from DAME, DiscoveryNet, EDG, Geodise, ICENI, myGrid, Unicore plus Cardiff, NEReSC ….
• RGMA and Semantic Grid offer improved meta-data and Information services compared to UDDI and MDS (Globus)
  – Need comprehensive federated Information service

• Security requires architecture supporting dynamic fine-grain authorization

• UK e-Science has pioneered Information Grids but gap is continuation of OGSA-DAI, integration with other services and P2P decentralized models

• Functionality of Compute/File Grids quite advanced but services probably not robust enough for LCG or Campus Grids

Page 20: Status of Grid Technology/Middleware (e-Science, Cyberinfrastructure)

Gaps in Other Grid services
• Portals and User Interfaces – Noted gap that many are not using Grid Computing Environment “best practice” with component based user-interfaces matching component-based middleware
• Programming Models (using workflow runtime)
• Fabric Management (should be integrated with central service management and Information system), Computational Steering, Visualization, Datamining, Accounting, Gridmake, Debugging, Semantic Grid tools (consistent with Information system), Collaboration, provenance
• Application-specific services
• Note new production central Infrastructure can support both research and production services of this type

Page 21: Status of Grid Technology/Middleware (e-Science, Cyberinfrastructure)

PPPH: Paradigms Protocols Platforms and Hosting I

• We will start from the Web view and assert that basic paradigm is

• Meta-data rich Web Services communicating via messages

• These have some basic support from some runtime such as .NET, Jini (pure Java), Apache Tomcat+Axis (Web Service toolkit), Enterprise JavaBeans, WebSphere (IBM) or GT3 (Globus Toolkit 3)
  – These are the distributed equivalent of operating system functions as in UNIX Shell

• Called Hosting Environment or platform

Page 22: Status of Grid Technology/Middleware (e-Science, Cyberinfrastructure)

OGSA OGSI & Hosting Environments
• Start with Web Services in a hosting environment
• Add OGSI to get a Grid service and a component model
• Add OGSA to get Interoperable Grid “correcting” differences in base platform and adding key functionalities

[Figure: OGSA Platform layering. Labels: Hosting Environment (twice); Transport; Protocol; Host. Env. & Protocol Bindings; OGSI; OGSA Platform services: registry, authorization, monitoring, data access, etc.; Models for resources & other entities; Other models; More specialized & domain-specific services; Domain-specific profiles; Environment-specific profiles; OGSA Platform.]

Page 23: Status of Grid Technology/Middleware (e-Science, Cyberinfrastructure)

Functional Level above OGSA
• Systems Management and Automation
• Workload / Performance Management
• Security
• Availability / Service Management
• Logical Resource Management
• Clustering Services
• Connectivity Management
• Physical Resource Management
• Perhaps Data Access belongs here

Page 24: Status of Grid Technology/Middleware (e-Science, Cyberinfrastructure)

OGSI Open Grid Services Infrastructure
• http://www.gridforum.org/ogsi-wg
• It is a “component model” for web services.
• It defines a set of behavior patterns that each OGSI service must exhibit.
• Every “Grid Service” portType extends a common base type.
  – Defines an introspection model for the service
  – You can query it (in a standard way) to discover
    • What methods/messages a port understands
    • What other port types the service provides
    • If the service is “stateful”, what the current state is
• A set of standard portTypes for
  – Message subscription and notification
  – Service collections
• Each service is identified by a URI called the “Grid Service Handle” (GSH)
• GSHs are bound dynamically to Grid Service References (GSRs, typically WSDL docs)
  – A GSR may be transient. GSHs are fixed.
  – Handle map services translate GSHs into GSRs.
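The handle/reference indirection on this slide can be sketched in a few lines of Python. This is an illustrative model only, not the OGSI interface itself: the names `GridServiceReference`, `HandleMap`, `bind`, `resolve` and all example URIs are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GridServiceReference:
    """A GSR: binding details that may change over a service's lifetime."""
    endpoint: str   # where the service can currently be reached
    wsdl_url: str   # description of the port types it offers


class HandleMap:
    """Hypothetical handle map: resolves fixed GSHs to current GSRs."""

    def __init__(self):
        self._bindings = {}

    def bind(self, gsh, gsr):
        # Rebinding an existing handle models a transient GSR being replaced.
        self._bindings[gsh] = gsr

    def resolve(self, gsh):
        try:
            return self._bindings[gsh]
        except KeyError:
            raise LookupError(f"no reference bound to handle {gsh!r}") from None


handle_map = HandleMap()
gsh = "http://example.org/grid/services/simulation/42"  # made-up handle
handle_map.bind(gsh, GridServiceReference(
    "http://host-a.example.org:8080/sim", "http://host-a.example.org/sim.wsdl"))
# The service migrates: the handle stays fixed, only the reference changes.
handle_map.bind(gsh, GridServiceReference(
    "http://host-b.example.org:8080/sim", "http://host-b.example.org/sim.wsdl"))
```

A client that stores only the handle keeps working after the migration, because it asks the handle map for the current reference each time it needs one.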

Page 25: Status of Grid Technology/Middleware (e-Science, Cyberinfrastructure)

OGSI and Stateful Services
• Sometimes you can send a message to a service, get a result and that’s the end

– This is a statefree service

• However most non-trivial services need state to allow persistent asynchronous interactions

• OGSI is designed to support Stateful services through two mechanisms

– Information Port: where you can query for SDE (Service Data Elements)

– “Factories” that allow one to view a Service as a “class” (in an object-oriented language sense) and create separate instances for each Service invocation

• There are several interesting issues here

– Difference between Stateful interactions and Stateful services

– System or Service managed instances
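As a rough sketch of the two mechanisms named on this slide, the Python below models a factory that treats a service as a “class” and a per-instance query method standing in for the information port. The names `CounterService`, `CounterFactory`, `create_service` and `find_service_data` are hypothetical and do not reproduce any toolkit's API.

```python
import itertools


class CounterService:
    """A stateful service instance: its state outlives any one message."""

    def __init__(self, instance_id):
        self.instance_id = instance_id
        self._count = 0

    def increment(self):
        self._count += 1
        return self._count

    def find_service_data(self, name):
        # Stand-in for the information port: state is queried by name.
        service_data = {"instanceId": self.instance_id, "count": self._count}
        return service_data[name]


class CounterFactory:
    """Views the service as a class and creates one instance per request."""

    def __init__(self):
        self._ids = itertools.count(1)
        self.instances = {}

    def create_service(self):
        instance = CounterService(f"counter-{next(self._ids)}")
        self.instances[instance.instance_id] = instance
        return instance


factory = CounterFactory()
first = factory.create_service()
second = factory.create_service()
first.increment()
first.increment()
second.increment()
```

Each instance keeps its own state, and that state can be read outside the interaction that changed it, which is the sense of “stateful service” used here.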

Page 26: Status of Grid Technology/Middleware (e-Science, Cyberinfrastructure)

Factories and OGSI
• Stateful interactions are typified by amazon.com where messages carry correlation information allowing multiple messages to be linked together
  – Amazon preserves state in this fashion which is in fact preserved in its database permanently
• Stateful services have state that can be queried outside a particular interaction
• Also note difference between implicit and explicit factories
  – Some claim that implicit factories scale as each service manages its own instances and so do not need to worry about registering instances and lifetime management
• See WS-Addressing from largely IBM and Microsoft: http://msdn.microsoft.com/webservices/default.aspx?pull=/library/en-us/dnglobspec/html/ws-addressing.asp

[Figure: Explicit Factory and Implicit Factory, each drawn as a FACTORY with instances numbered 1 to 4.]
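A stateful interaction, as opposed to a stateful service, can be sketched with a handler that holds no state of its own: each message carries a correlation token and the state lives in a store. The sketch below is a toy model under that assumption; `CartStore`, `handle_add_item` and the `session_id` field are hypothetical names.

```python
class CartStore:
    """Stand-in for the permanent database behind the service."""

    def __init__(self):
        self._carts = {}

    def load(self, session_id):
        return list(self._carts.get(session_id, []))

    def save(self, session_id, items):
        self._carts[session_id] = list(items)


def handle_add_item(store, message):
    """Stateless handler: all it needs arrives in the message or the store."""
    items = store.load(message["session_id"])
    items.append(message["item"])
    store.save(message["session_id"], items)
    return {"session_id": message["session_id"], "items": items}


store = CartStore()
handle_add_item(store, {"session_id": "s-1", "item": "book"})
reply = handle_add_item(store, {"session_id": "s-1", "item": "lamp"})
other = handle_add_item(store, {"session_id": "s-2", "item": "pen"})
```

The correlation token links the two `s-1` messages together, while the handler itself could be restarted or replicated between calls.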

Page 27: Status of Grid Technology/Middleware (e-Science, Cyberinfrastructure)

Two-level Programming I
• The paradigm implicitly assumes a two-level Programming Model
• We make a Service (same as a “distributed object” or “computer program” running on a remote computer) using conventional technologies
  – C++ Java or Fortran Monte Carlo module
  – Data streaming from a sensor or Satellite
  – Specialized (JDBC) database access
• Such nuggets accept and produce data from users, files and databases
• The Grid is built by coordinating such nuggets assuming we have solved problem of programming the nugget

[Figure: a Nugget exchanging Data.]

Page 28: Status of Grid Technology/Middleware (e-Science, Cyberinfrastructure)

Two-level Programming II
• The Grid is discussing the linkage and distribution of the nuggets with the only addition being runtime interfaces to the Grid as opposed to UNIX data streams
• Familiar from use of UNIX Shell, PERL or Python scripts to produce real applications from core programs
• Such interpretative environments are the single processor analog of Grid Programming or Workflow
• Some projects like GrADS from Rice University are looking at integration between nugget levels but dominant effort looks at each level separately

[Figure: Nugget1, Nugget2, Nugget3 and Nugget4 linked together.]
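The two levels can be illustrated with a small Python script in which ordinary functions play the role of nuggets and a separate function only links them. This is a single-process analogy, as the slide suggests, not Grid code; all function names are invented for the example.

```python
def simulation_nugget(samples):
    """Level one: a conventional module (deterministic stand-in here)."""
    return [i * i for i in range(samples)]


def filter_nugget(values, threshold):
    """Level one: keeps only the values at or above a threshold."""
    return [v for v in values if v >= threshold]


def summary_nugget(values):
    """Level one: reduces a list of values to a small report."""
    return {"count": len(values), "total": sum(values)}


def run_workflow(samples, threshold):
    """Level two: only linkage between nuggets, no science logic."""
    raw = simulation_nugget(samples)
    kept = filter_nugget(raw, threshold)
    return summary_nugget(kept)


result = run_workflow(samples=5, threshold=4)
```

On a Grid the three nuggets would be remote services and `run_workflow` would be expressed in a workflow language, but the division of labour between the levels is the same.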

Page 29: Status of Grid Technology/Middleware (e-Science, Cyberinfrastructure)

[Figure: layered view of Grid Computing Environments. Labels: Raw (HPC) Resources; Middleware; Database; Portal Services; System Services (repeated); Application Service; Grid Computing Environments; User Services; “Core” Grid; Application Metadata; Actual Application.]

Page 30: Status of Grid Technology/Middleware (e-Science, Cyberinfrastructure)

PPPH: Paradigms Protocols Platforms and Hosting II

• Self-describing programs/interfaces are key to scaling
  – Minimize amount of work system has to do
  – Hide as much as possible in services and applications
• Protocols describe (in “principle” at least) those rules that system obeys and uses to deliver information between services (processes)
• Interfaces tell the service what to do to interpret the results of communication
• HTTP is the dominant transport protocol of the Web
• HTML is the “interface” telling browser how to render
• But you can extend interface to allow PDF, multimedia, PowerPoint using “helper applications” which are (with more or less convenience) “automatically” downloaded if not already available
  – “Mime types” essentially “self-describe” each interface

Page 31: Status of Grid Technology/Middleware (e-Science, Cyberinfrastructure)

Protocol/Interface Analogy with Web II
• HTTP and HTML are the analogies on the client side
• A “Web Service” generalizes a CGI Script on server side
  – CGI is essentially a Distributed Object technology allowing server to access an arbitrary program labeled by a URL plus an ugly syntax to specify name and parameters of program to run
• Roughly WSDL (Web Services Description Language) is a better way to specify program name and its parameters
• Web uses other protocols – HTTPS for secure links and RTP etc. for multimedia (UDP) streams
  – These again are required to integrate system – codecs like MPEG are interfaces interpreted by client
  – There are further protocols like H323 and SIP which will be replaced (IMHO) by HTTP plus RTP etc. We should minimize number of protocols to get maintainable systems

Page 32: Status of Grid Technology/Middleware (e-Science, Cyberinfrastructure)

PPPH: Paradigms Protocols Platforms and Hosting III

• There is a set of system capabilities which cannot be captured as standalone services and permeate Grid
• Meta-data rich Message-linked Web Services is permeating paradigm
• Component Model such as “Enterprise JavaBean (EJB)” or OGSI describes the formal structure of services
  – EJB if used lives inside OGSI in our Grids
• Invocation Framework describes how you interact with system
• Security in fine grain fashion to provide selective authorization (Globus and EDG WP6)
• Policy context describes rules for this particular Grid
• Transport mechanisms abstract concepts like ports and Quality of Service
• Messaging abstracts destination and customization of content
• Network (monitoring, performance) EDG WP7
• Fabric (resources) EDG WP4

Page 33: Status of Grid Technology/Middleware (e-Science, Cyberinfrastructure)

Architecture in Pictures I

[Figure: abstract versus actual architecture. Labels: ABSTRACT; ACTUAL; Abstract Model OGSI; Invocation Framework; Services (twice); Messaging (twice); Resources; Network; “Hosting Environment determines physical model”.]

Page 34: Status of Grid Technology/Middleware (e-Science, Cyberinfrastructure)

Architecture in Pictures II: OGSA Interoperable Grid

[Figure labels: OGSA Interfaces Exposed by every OGSI Grid Services; Messaging; Network Monitoring and Scheduling; Resources; Network.]

Page 35: Status of Grid Technology/Middleware (e-Science, Cyberinfrastructure)

Architecture in Pictures III: OGSA Federated Grid

[Figure labels: Native Services not necessarily OGSI; Mediation Service converting between OGSA and “native” services; Mediation Service; Messaging; Network Monitoring and Scheduling; Resources; Network.]

Page 36: Status of Grid Technology/Middleware (e-Science, Cyberinfrastructure)

Standards Compliant InterGrids Federation Environment

[Figure labels: Jini Grid; GT3 Grid; IBM Grid; Avaki Grid; Platform; JXTA; Resource1; Resource2; Service1; Service2; AWS1; AWS2; AWS3.]

Federation/Interoperability Problem
• Have a collection of Web Services running in Grids defined by different suppliers
• Interoperability – “particular application Web Service of supplier X” can utilize “core service of supplier Y”
• Federation – “core service of supplier X” can be integrated with “core service of supplier Y” to provide an integration/amalgam that is also a realization of core service.
• Need mediation to link different Grid Islands
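One way to picture mediation between Grid islands is as an adapter that exposes a common interface and translates each call into the convention of a native service. The sketch below assumes two invented conventions purely for illustration; `NativeJobService`, `MediationService` and their methods are hypothetical and stand in for no real product.

```python
class NativeJobService:
    """Hypothetical non-OGSA service with its own call convention."""

    def submit_job_string(self, spec):
        executable, _, args = spec.partition(" ")
        return f"native-job:{executable}:{args}"


class MediationService:
    """Exposes a common interface and translates calls to the native one."""

    def __init__(self, native):
        self._native = native

    def submit(self, executable, arguments):
        # Convert the structured request into the native string format.
        spec = " ".join([executable, *arguments])
        return self._native.submit_job_string(spec)


mediator = MediationService(NativeJobService())
job_id = mediator.submit("simulate", ["--steps", "10"])
```

An application written against `submit` would not need to change if the native service behind the mediator were replaced by one from a different supplier.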

Page 37: Status of Grid Technology/Middleware (e-Science, Cyberinfrastructure)

Federation Architecture

[Figure: federation architecture. Four Grid Instances, each with Services and Resources, linked through Routing Nodes (R) and Mediation Nodes (M).]

Page 38: Status of Grid Technology/Middleware (e-Science, Cyberinfrastructure)

Virtualization
• The Grid could and sometimes does virtualize various concepts
• Location: URI (Uniform Resource Identifier) virtualizes URL
• Replica management (caching) virtualizes file location generalized by GriPhyN virtual data concept
• Protocol: message transport and WSDL bindings virtualize transport protocol as a QoS request
• P2P or Publish-subscribe messaging virtualizes matching of source and destination services
• Semantic Grid virtualizes Knowledge as a meta-data query
• Brokering virtualizes resource allocation
• Virtualization implies references can be indirect
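The replica-management bullet is a concrete case of an indirect reference: a client names a logical file and a catalog decides which physical copy to use. The Python below is a minimal sketch under that reading; `ReplicaCatalog`, the `lfn:` naming style, the host names and the reachability check are all assumptions made for the example.

```python
class ReplicaCatalog:
    """Hypothetical catalog mapping one logical file name to many copies."""

    def __init__(self):
        self._replicas = {}  # logical name -> list of physical URLs

    def register(self, logical_name, physical_url):
        self._replicas.setdefault(logical_name, []).append(physical_url)

    def resolve(self, logical_name, is_reachable):
        """Return the first registered copy that the caller can reach."""
        for url in self._replicas.get(logical_name, []):
            if is_reachable(url):
                return url
        raise LookupError(f"no reachable replica of {logical_name!r}")


catalog = ReplicaCatalog()
catalog.register("lfn:run42/events.dat", "gsiftp://site-a.example.org/data/events.dat")
catalog.register("lfn:run42/events.dat", "gsiftp://site-b.example.org/cache/events.dat")

offline_hosts = {"site-a.example.org"}  # pretend site A is down


def is_reachable(url):
    return not any(host in url for host in offline_hosts)


chosen = catalog.resolve("lfn:run42/events.dat", is_reachable)
```

The client code never mentions a site; swapping the selection rule (nearest copy, least loaded site) would change `resolve` without touching its callers.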

Page 39: Status of Grid Technology/Middleware (e-Science, Cyberinfrastructure)

IFS: Interfaces and Functionality and Semantics I
• The Grid platform tries to minimize detail in protocols and maximize detail in interfaces to enhance scaling
• However rich meta-data and semantics are critical for correct and interesting operation
  – Put as much semantic interpretation as you can into specific services
  – Lack of Semantic interoperation is in fact main weakness of today’s Grids and Web services
• Everything becomes a service (See example of education) whether system or application level
• There are some very important “Global Services”
  – Discovery (look up) and Registration of service metadata
  – Workflow
  – MetaSchedulers

Page 40: Status of Grid Technology/Middleware (e-Science, Cyberinfrastructure)

IFS: Interfaces and Functionality and Semantics II
• There are many other generally important services
• OGSA-DAI The Database Service
• Portal Service linked to by WSRP (Web services for Remote Portals)
• Notification of events
• Job submission
• Provenance – interpret meta-data about history of data
• File Interfaces
• Sensor service – satellites …
• Visualization
• Basic brokering/scheduling

Page 41: Status of Grid Technology/Middleware (e-Science, Cyberinfrastructure)

OGSA/OGSI Top Level View

• OGSA is the set of “core” Grid services
  – Stuff you can’t live without
  – If you built a Grid you would need to invent these things

[Figure: OGSA layering. Labels: Hosting Environment (twice); Transport; Protocol; Host. Env. & Protocol Bindings; OGSI; Broadly applicable services: registry, authorization, monitoring, data access, etc.; Models for resources & other entities; Other models; More specialized services: data replication, workflow, etc.; Domain-specific services.]

Chapters 7 to 9 of Book
http://www.gridforum.org/Meetings/ggf7/docs/default.htm
http://www.globusworld.org/globusworld_web/jw2_program_tut.htm

Page 42: Status of Grid Technology/Middleware (e-Science, Cyberinfrastructure)

Open Grid Services Architecture
• OGSA-WG chaired by
  – Ian Foster, ANL and Univ. of Chicago
  – Jeff Nick, IBM
  – Dennis Gannon, IU
• Active Members from
  – IBM, Fujitsu, NEC, SUN, Hitachi, Avaki
  – Univ. of Mich, Chicago, Indiana (not much academic involvement)

Page 43: Status of Grid Technology/Middleware (e-Science, Cyberinfrastructure)

OGSA Core Services I

• Registries, and namespace bindings
  – Registry is a collection of services indexed by service metadata.
    • “find me a service with property X.”
  – Directory is a map from a namespace to GSHs.
  – A namespace is a human understandable version of a Grid Handle
• Queues
  – For building schedulers and resource brokers
  – Jobs and other requests are in queues
  – This is high-level messaging
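The difference between a registry (lookup by metadata) and a directory (lookup by name) can be shown with a short Python sketch. It is a toy in-memory model, not an OGSA service: the `Registry` class, the metadata keys and the handle URIs are hypothetical.

```python
class Registry:
    """Collection of service handles indexed by service metadata."""

    def __init__(self):
        self._entries = {}  # GSH -> metadata dict

    def publish(self, gsh, **metadata):
        self._entries[gsh] = dict(metadata)

    def find(self, **required):
        """Answer 'find me a service with property X' (all properties match)."""
        return sorted(
            gsh for gsh, metadata in self._entries.items()
            if all(metadata.get(key) == value for key, value in required.items())
        )


registry = Registry()
registry.publish("http://example.org/gsh/blast-1", kind="alignment", queue="batch")
registry.publish("http://example.org/gsh/blast-2", kind="alignment", queue="interactive")
registry.publish("http://example.org/gsh/viz-1", kind="visualization", queue="interactive")

# A directory is just a map from human-readable names to handles.
directory = {"/bio/alignment/default": "http://example.org/gsh/blast-1"}

matches = registry.find(kind="alignment")
interactive = registry.find(kind="alignment", queue="interactive")
```

The registry answers questions about properties, while the directory gives a stable, readable name to one particular handle.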

Page 44: Status of Grid Technology/Middleware (e-Science, Cyberinfrastructure)

Security
• Base this on Web Services Security
• Authentication
  – 2-way. Who are you and who am I?
• Authorization
  – What am I authorized to use/see/modify
• Accounting/Billing
  – (not really security – see monitoring)
• Privacy
• Group Access
  – Easily create a group to share access to a virtual Grid.
• Very complex issues related to services and message delivery.

Page 45: Status of Grid Technology/Middleware (e-Science, Cyberinfrastructure)

Common Resource Model

• Every resource on the grid that is manageable is represented by a service instance
  – CRM is the Schema hierarchy that defines each resource (with its meta-data)
  – Service for a resource presents its management interface to authorized parties.

Page 46: Status of Grid Technology/Middleware (e-Science, Cyberinfrastructure)

Policy Management
• Policy management services
  – Mechanism to publish policy and the services it applies to.
  – Policy life-cycle mgmt.
• Policy languages exist for routing, security, resource use

[Figure: policy architecture. Labels: Admin GUI / Autonomic Manager; Policy Service Manager; Policy Service Agent; Policy Enforcement Point; XML Repository; Canonical Policies; Non-Canonical; Policy Service Core (Policy Transformation Service, Policy Validation Service, Policy Resolution Service); Common Resource Model; Device / Resource; Producer of Policies; Consumer of Policies.]

Policy Component Requirements:
• A management control point for policy lifecycle (PSM)
• A canonical way to express policies (AC 4-tuple)
• A distribution point for policy dissemination (PSA)
• A way to express that a service is “policy aware” (PEP)
• A way to effect change on a resource (CRM)

Page 47: Status of Grid Technology/Middleware (e-Science, Cyberinfrastructure)

Grid Service Orchestration
• Creating new services by composing other services
• Two types of Orchestration
  – Composition in space
    • One service directly invokes another
  – Composition in time
    • Managing the workflow
      – First do this.
      – Then do this and that
      – When that is done do this
        » If something goes wrong do this
      – And so on…
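Composition in time, including the “if something goes wrong” branch, can be sketched as a tiny sequential enactment loop. This is a simplification (steps run strictly one after another) and every name in it is invented for illustration.

```python
def run_workflow(steps, on_error):
    """Run named steps in order; on failure, run the recovery step and stop."""
    log = []
    for name, step in steps:
        try:
            step()
            log.append(f"{name}: ok")
        except Exception as exc:
            log.append(f"{name}: failed ({exc})")
            on_error()
            log.append("recovery: ok")
            break
    return log


def stage_data():
    return "staged"


def run_simulation():
    raise RuntimeError("node unavailable")


def publish_results():
    return "published"


def clean_up():
    return "cleaned up"


log = run_workflow(
    [("stage", stage_data), ("simulate", run_simulation), ("publish", publish_results)],
    on_error=clean_up,
)
```

Composition in space would instead appear inside a step, for example `publish_results` calling a storage service directly, without the enactment loop being involved.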

Page 48: Status of Grid Technology/Middleware (e-Science, Cyberinfrastructure)

Data Services

• Distributed Data Access

• Data Caching

• Data Replication Services

• Metadata Catalog Services

• Storage Services

Page 49: Status of Grid Technology/Middleware (e-Science, Cyberinfrastructure)

Metering Resource Consumption

• At what granularity do services report resource consumption?

• How do they report it?

• How are services metered?

[Figure: metering and accounting diagram. Labeled components: Resource Instrumentation; Metering Handler; Logging Service; Meter event adaption; Aggregation and Correlation; Usage Information; Rating; Rate Packages; Billable Record Listener; Accounting; Accounts; Contract Service; Billing; ASPIC CBI.]

Page 50: Status of Grid Technology/Middleware (e-Science, Cyberinfrastructure)

Transactions

• Two threads/workflows must synchronize and agree they have done so before moving on.
  – Usually involves modification to two or more persistent states
  – WS-transactions has been “proposed”.
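One well-known way to reach this kind of agreement across two or more persistent states is two-phase commit. The sketch below is a generic in-memory illustration of that idea, not the WS-transactions proposal itself; the `Participant` class and `two_phase_commit` function are hypothetical names.

```python
class Participant:
    """Hypothetical holder of one persistent state (e.g. a database)."""

    def __init__(self, name, state, can_commit=True):
        self.name = name
        self.state = state
        self.pending = None
        self.can_commit = can_commit

    def prepare(self, new_state):
        # Phase 1: stage the change and vote on whether it can be committed.
        if not self.can_commit:
            return False
        self.pending = new_state
        return True

    def commit(self):
        # Phase 2, success: make the staged change permanent.
        self.state, self.pending = self.pending, None

    def rollback(self):
        # Phase 2, failure: discard the staged change.
        self.pending = None


def two_phase_commit(updates):
    """updates: list of (participant, new_state). True if all committed."""
    prepared = []
    for participant, new_state in updates:
        if participant.prepare(new_state):
            prepared.append(participant)
        else:
            # One "no" vote aborts everyone who had already prepared.
            for p in prepared:
                p.rollback()
            return False
    for participant, _ in updates:
        participant.commit()
    return True
```

Either every state changes or none does, which is the "agree they have done so before moving on" property in miniature.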

Page 51: Status of Grid Technology/Middleware (e-Science, Cyberinfrastructure)

Messaging, Events, Logging

• Messaging
  – Delivery Model
  – Queuing and Pub/Sub message delivery (not clear to me why these are different, as publish/subscribe is implemented as topic-labeled queues)
• Events
  – Time stamped messages
  – Standard XML schemas
• Standard Logging
• MQSeries (IBM), JMS (Java Message Service) and NaradaBrokering (Indiana) provide this but most naturally at level of “platform/hosting environment”
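The remark that publish/subscribe is just topic-labeled queues can be made concrete with a small in-memory sketch. `TopicBroker` is a hypothetical class for illustration and does not reflect the API of MQSeries, JMS or NaradaBrokering; events are plain dictionaries rather than XML.

```python
import time
from collections import defaultdict, deque

class TopicBroker:
    """Hypothetical in-memory broker: one queue per (topic, subscriber)."""

    def __init__(self):
        self._queues = defaultdict(dict)   # topic -> {subscriber: deque}

    def subscribe(self, topic, subscriber):
        # Subscribing creates a queue labeled by the topic.
        self._queues[topic].setdefault(subscriber, deque())

    def publish(self, topic, body):
        # An event is a time-stamped message; it is queued for each subscriber.
        event = {"topic": topic, "timestamp": time.time(), "body": body}
        for queue in self._queues[topic].values():
            queue.append(event)
        return event

    def receive(self, topic, subscriber):
        # Plain queue semantics: oldest event first, None when empty.
        queue = self._queues[topic].get(subscriber)
        return queue.popleft() if queue else None
```

With a single subscriber per topic this reduces to ordinary point-to-point queuing, which is the sense in which the two delivery models coincide.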

Page 52: Status of Grid Technology/Middleware (e-Science, Cyberinfrastructure)

Where should Messaging be?
• One can define messaging at the OGSA level “above the hosting environment” but that makes it difficult to virtualize messaging and support network performance
  – Publish-subscribe or better queued messaging naturally supports optimized routing based on network performance
• One can naturally support collaborative Web services in the same fashion, in a way that is MUCH easier than Groove Networks and other collaborative environments (WebEx, PlaceWare (Microsoft)) do, as long as every application is a Web service
• OGSA location of messages is fine for low volume logging or notification events
  – Not good for events on “video” application where each frame is an update event

Page 53: Status of Grid Technology/Middleware (e-Science, Cyberinfrastructure)

The Overall Architecture
• The Grid is defined by a collection of distributed Services
  – For many users the primary interaction with the Grid will be through a portal

[Figure: architecture diagram. Labeled components: Portal Server; MyProxy Server; Metadata Directory Service(s); Directory & index Services; Application Factory Services; Messaging and group collaboration; Event and logging Services.]

Page 54: Status of Grid Technology/Middleware (e-Science, Cyberinfrastructure)

Web Services as a Portlet
• Each Web Service naturally has a user interface specified as “just another port”
  – Customizable for universal access
• This gives each Web Service a Portlet view specified (in XML as always) by WSRP (Web Services for Remote Portlets)
• So component model for resources “automatically” gives a component model for user interfaces
  – When you build your application, you define portlet at same time

[Figure: Web Service diagram. Labels: Application or Content source; WSDL; Web Service; WSRP; Application as a WS; General Application Ports Interface with other Web Services; User Face of Web Service; WSRP Ports define WS as a Portlet.]

Web Services have other ports (Grid Service) to be OGSI compliant

Page 55: Status of Grid Technology/Middleware (e-Science, Cyberinfrastructure)

Online Knowledge Center built from Portlets

• Web Services provide a component model for the middleware (see large “common component architecture” effort in Dept. of Energy)

• Should match each WSDL component with a corresponding user interface component

• Thus one “must use” a component model for the portal with again an XML specification (portalML) of portal component

A set of UIComponents

Page 56: Status of Grid Technology/Middleware (e-Science, Cyberinfrastructure)
Page 57: Status of Grid Technology/Middleware (e-Science, Cyberinfrastructure)

InterGrids
Develop Grid software including both the packaging of Grid Islands and federation between Grids
Key underlying principles:
• Support small Grids (university departments, Private Grids) with easy deployment (e.g. JXTA, Jini …)
• Support composability (P2P) and federation with itself and other Grids
• Use existing (Avaki, Globus, JXTA, ICENI …) bits and pieces if possible – encourage such projects to produce modules usable in other Grid systems like InterGrids
• Good test/benchmark/tutorial material and good support

Page 58: Status of Grid Technology/Middleware (e-Science, Cyberinfrastructure)

Features of InterGrids I
At outer levels of Grid, just need interoperable services but these services would need mediation as they access resources in a different grid
• Interoperable Services have interfaces allowing them to access appropriate other services in their own grid and through mediation the analogous services in other grids
At inner levels of Grid, need to federate services
• Federated services produce results which need to be merged with other services of the same type when Grids are joined together
In case of Jini based Grid, we keep Jini services as Jini and don’t wrap them
Rather the OGSA wrapping of Jini occurs in the mediator
Issues if multiple Grids overlap
• Can multiple brokers manage same resources (schedulers on computers)
• What happens if one shares resources between virtual organizations (Grids)

Page 59: Status of Grid Technology/Middleware (e-Science, Cyberinfrastructure)

Features of InterGrids II
Mediation service integrates
• VPN with adaptable port
• Firewalls and security negotiation between domains
• Mediation
• Difference between PURL/URI and URL
Mediation Service is distributed i.e. is not a single proxy server
Each logical message (this could be several physical messages) in system is examined – destination URI is “just” mapped to URL if internal
• If external routed to “best mediator” in same way NaradaBrokering optimally routes messages
• Logical messages are self contained i.e. have enough information to fully specify any transformations needed in mediator i.e. they are “all” the Schema instance

Page 60: Status of Grid Technology/Middleware (e-Science, Cyberinfrastructure)

Features of InterGrids III
When message arrives at mediator service, its source and destination Grids are examined
• Note Grids must register themselves in mediation service and in that registration (or a later update) give specification for service and any needed transformations
• This registration involves introduction of concept of Grid Gridtype with specification attached to a particular Gridtype
• If they have same specification for this service, then message is routed on unchanged
• If specification different, the mediator looks to see if any special translations available for given source or destination
• If no special actions, the source mapper is used to convert message to FGSA (Federated Grid Service Architecture -- hopefully same as OGSA) and the destination inverse mapper is used to map FGSA to destination format
This can generate an error returned to calling service and logged for both Grids
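The mediator's decision sequence above can be sketched as follows. Everything here is an assumption made for illustration: the `Mediator` class, its method names, and the use of Python dictionaries in place of XML messages and schemas are not taken from any InterGrids implementation.

```python
class Mediator:
    """Hypothetical mediator; all names here are illustrative assumptions."""

    def __init__(self):
        self.grids = {}          # grid name -> registration record
        self.translations = {}   # (source, destination, service) -> callable
        self.error_log = []

    def register(self, grid, gridtype, specs, to_fgsa, from_fgsa):
        # specs: service name -> specification identifier for this Grid
        # to_fgsa / from_fgsa: source mapper and destination inverse mapper
        self.grids[grid] = {
            "gridtype": gridtype, "specs": specs,
            "to_fgsa": to_fgsa, "from_fgsa": from_fgsa,
        }

    def route(self, message, source, destination, service):
        try:
            src, dst = self.grids[source], self.grids[destination]
            if src["specs"][service] == dst["specs"][service]:
                return message                       # same spec: unchanged
            special = self.translations.get((source, destination, service))
            if special is not None:
                return special(message)              # special translation
            neutral = src["to_fgsa"](message)        # source mapper to FGSA
            return dst["from_fgsa"](neutral)         # inverse mapper to dest
        except Exception as exc:
            # The entry names both Grids; the error then goes to the caller.
            self.error_log.append((source, destination, repr(exc)))
            raise
```

Going through a neutral FGSA form means each Grid type supplies one mapper pair rather than one translator per pair of Grids; special translations remain available as an override.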

Page 61: Status of Grid Technology/Middleware (e-Science, Cyberinfrastructure)

Features of InterGrids IV
This mediation is sufficient for an “interoperable” service
Some services such as “search” or “look-up” are registered as federated
• i.e. requests to them are multi-cast to multiple grids and results merged
• Rules must be defined for federated services defining nature of result
  – Is more than one answer allowed?
  – If >1 answer possible, then rules for merging results of services in each Grid must be specified
Another complication is if a single service in one Grid corresponds to composition of multiple services in another Grid
• One sees this for look-up which could involve different levels of meta-data in different Grids
• This seems complicated but it is relatively clear what one has to do in dynamically composing services
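A federated look-up with an explicit merge rule might look like the sketch below. The function name, its parameters and the default merge rule (concatenate and drop exact duplicates) are assumptions for illustration; the multicast is simulated by calling each Grid's look-up in turn.

```python
def federated_lookup(query, grids, allow_multiple=True, merge=None):
    """Multicast `query` to every Grid and merge the answers.

    grids: mapping of grid name -> callable(query) returning a list.
    merge: hypothetical merge rule taking {grid: results}; defaults to
           concatenation with exact duplicates removed.
    """
    answers = {name: lookup(query) for name, lookup in grids.items()}
    if merge is None:
        merged = []
        for results in answers.values():
            for item in results:
                if item not in merged:
                    merged.append(item)
    else:
        merged = merge(answers)
    # Rule on the nature of the result: is more than one answer allowed?
    if not allow_multiple and len(merged) > 1:
        raise ValueError(f"{len(merged)} answers for {query!r}, only one allowed")
    return merged
```

Passing a different `merge` callable is where Grid-specific rules would go, for example combining near-duplicate responses instead of only removing exact ones.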

Page 62: Status of Grid Technology/Middleware (e-Science, Cyberinfrastructure)

Features of InterGrids V
There are two problems mentioned earlier
Shared Resource Reference Services – this must be a standard issue in federation. This occurs when federation of say look up services leads to duplicate results. The look-up may not be quite the same and one would want to remove or combine duplicate responses
Shared Resource Access Services – these are services in different Grids that access and affect the same backend resource. This issue comes up even inside a single Grid
• It is possible that mediator could be used to resolve this problem but an heuristic needs to be developed for this