Grid Services Presented by Karan Bhatia



Page 1: Grid Services

Grid Services

Presented by

Karan Bhatia

Page 2: Grid Services

2

Hype Curve

Page 3: Grid Services

3

Overview

• Grid Computing Background
  – Definition
  – Opportunities
  – Markets
• Technical Challenges
  – Security Infrastructure
  – Resource Management
  – Service Interoperability
• Summary

Page 4: Grid Services

4

Grid Computing is …

• “Co-ordinated resource sharing and problem solving in dynamic, multi-institutional virtual organizations.” [Foster, Kesselman, Tuecke]

– Co-ordinated - multiple resources working in concert, e.g. disk & CPU, or instruments & database, etc.

– Resources - compute cycles, databases, files, application services, instruments.

– Problem solving - focus on solving scientific problems

– Dynamic - environments that are changing in unpredictable ways

– Virtual Organization - resources spanning multiple organizations and administrative domains, security domains, and technical domains

Page 5: Grid Services

5

Grid Computing is … (Industry)

• “about finding distributed, underutilized compute resources (systems, desktops, storage) and provisioning those resources to users or applications requiring them.” [The Grid Report, Clabby Analytics]

– Distributed - all the resources lying around in departments or server rooms.

– Underutilized - typical utilization of “big iron” is 5 to 10%. Organizations save money by increasing utilization versus purchasing new resources.

– Resources - servers and server cycles, applications, data resources

– Provisioning - predict and schedule resource use depending on load.

Page 6: Grid Services

6

Types of Grids…

• Compute Grids
  – SETI@home, Entropia, United Devices, Condor
• Data Grids
  – Storage Resource Broker (SRB), Avaki, BIRN, GEON
• Collaboration Grids
  – Instrumentation (telescience), applications
• Enterprise Grids
  – Majority of commercial interest
• Partner Grids
  – B2B, Academic/Govt Grids
• Service Grids
  – “Utility” Computing, “On Demand”, pervasive, autonomic, etc.

Page 7: Grid Services

7

A Grid is …

• “the next generation Internet,”

• “all about free cycles ala SETI@HOME,”

• “a distributed object system,”

• “a new programming model,”

• “a replacement for high performance computing,”

Page 8: Grid Services

8

Example… TeleScience Grid

[Diagram labels: Imaging Instruments; Computational Resources; Large-Scale Databases; Data Acquisition, Analysis; Advanced Visualization]

Page 9: Grid Services

9

Grid Resources - Networks

Page 10: Grid Services

10

Grid Resources - Compute

Page 11: Grid Services

11

Top 500.org

Page 12: Grid Services

12

Page 13: Grid Services

13

Another Grid Example … Google

• Queries
  – 150 M queries/day (2000/s)
  – 100 countries
  – 3.3 B documents
• Hardware
  – 15,000 Linux systems in 6 data centers
  – 15 Tflop/s and 1000 TB total capacity
  – 40-80 1U/2U servers per cabinet
  – 100 Mb/s Ethernet switches per cabinet, with gigabit uplinks
  – Growth from 4000 systems (18 M queries/day)
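The figures on this slide can be cross-checked with a little arithmetic. The sketch below uses only the numbers quoted above; the variable names are illustrative. It suggests that the "2000/s" figure is a rounded or peak rate rather than the exact daily average.

```python
# Back-of-the-envelope check of the Google figures quoted on the slide.
SECONDS_PER_DAY = 24 * 60 * 60

queries_per_day = 150_000_000
systems = 15_000

# Average query rate if the load were spread evenly over the day.
average_queries_per_second = queries_per_day / SECONDS_PER_DAY

# Daily load per machine now, and at the earlier 4000-system size.
queries_per_system_now = queries_per_day / systems
queries_per_system_before = 18_000_000 / 4_000

print(f"average rate: {average_queries_per_second:.0f} queries/s")
print(f"per system now: {queries_per_system_now:.0f} queries/day")
print(f"per system before: {queries_per_system_before:.0f} queries/day")
```

On these numbers the per-system load roughly doubled while the number of systems grew by a factor of 3.75.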

Page 14: Grid Services

14

Grid Resources - Data

• SDSC Resources
  – HPSS:
    • SDSC's central long-term data storage system
    • One of the world's largest IBM High Performance Storage System (HPSS) units
    • Currently holds more than a petabyte (a million gigabytes) of data in approximately 21 million files
    • Has the capacity to store six petabytes of data; files are added at an average rate of 10,000 gigabytes per month
  – Storage-Area Network (SAN):
    • A 72-processor Sun Microsystems SunFire 15K high-end server and 11 Brocade switches (1,400 ports)
    • 225,000 gigabytes of networked disk storage for data-oriented applications
• 1 TB of data = $2500

Page 15: Grid Services

15

Protein Data Bank (PDB)

Page 16: Grid Services

16

Putting it all together… TeraGrid

Page 17: Grid Services

17

Grid Market

Page 18: Grid Services

18

Grid Companies

• IBM
  – “on demand” solutions
• Sun Microsystems
  – N1 initiative
• Oracle
  – 10g
• Dell
• HP
  – “utility” computing
• Platform Computing
  – LSF, metaclustering
• United Devices
  – Desktop grids
• DataSynapse
• Akamai
• Google?
• Sony Online Entertainment?
• Where’s Microsoft?

Page 19: Grid Services

19

Grid Organizations

• Global Grid Forum (GGF)

• Organization for the Advancement of Structured Information Standards (OASIS)

• Distributed Management Task Force (DMTF)

• World Wide Web Consortium (W3C)

• Globus Alliance

• NSF Middleware Initiative (NMI)

• NASA IPG

• DOE Science Grid

• EU DataGrid

• NSF TeraGrid

Page 20: Grid Services

20

Technical Challenges for Grid Computing

Page 21: Grid Services

21

Challenges: Security

• Grids traverse organizational boundaries
  – Different administration domains have different authentication mechanisms
  – Resources have different use agreements and sharing priorities
• Single sign-on
  – Multiple passwords difficult to manage
• Rights delegation
• Trust
  – Authentication of users
  – Authorization of users
  – Resource access

Page 22: Grid Services

22

Security

• Public Key Infrastructure
  – Each party A holds a public key A.public and a private key A.private
• Supports encryption
  – Message from A to B:
    • A computes m’ = F(m, B.public) and sends m’ to B
    • B receives m’ and recovers m = F’(m’, B.private)
• Digital signatures
  – Signed message from A to B:
    • m’ = (m, F(m, A.private))
  – The receiver uses A.public to verify that m’ is from A and has not been tampered with
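A minimal sketch of the two operations, assuming the usual convention that a message for B is encrypted with B's public key and a signature is made with A's private key. It uses textbook RSA with no padding, purely to make F and F’ concrete. The function names, the choice of Mersenne primes and the SHA-256 digest are assumptions for illustration and are not taken from the talk. Real systems should use a vetted cryptography library.

```python
import hashlib

def make_keypair(p, q, e=65537):
    """Build a textbook RSA key pair from two primes (demo only, no padding)."""
    n = p * q
    phi = (p - 1) * (q - 1)
    d = pow(e, -1, phi)          # modular inverse; needs Python 3.8+
    return (e, n), (d, n)        # (public key, private key)

def F(m, key):
    """Plays the role of both F and F': modular exponentiation with a key."""
    exponent, n = key
    return pow(m, exponent, n)

def digest(message, n):
    """Hash the message bytes and reduce the hash into the key's range."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % n

def sign(message, private_key):
    """Signature = F(digest(m), A.private)."""
    return F(digest(message, private_key[1]), private_key)

def verify(message, signature, public_key):
    """Accept only if F(signature, A.public) reproduces the digest."""
    return F(signature, public_key) == digest(message, public_key[1])

# Hypothetical parties A and B; the primes are known Mersenne primes.
A_public, A_private = make_keypair(2**61 - 1, 2**89 - 1)
B_public, B_private = make_keypair(2**107 - 1, 2**127 - 1)

# Encryption: A sends m to B using B's public key.
m = int.from_bytes(b"hello", "big")
ciphertext = F(m, B_public)
recovered = F(ciphertext, B_private)

# Signature: A signs with A.private, anyone checks with A.public.
signature = sign(b"run job", A_private)
```

Only B can recover `m`, because only B holds `B_private`. Anyone holding `A_public` can check the signature, and changing the message makes `verify` fail.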

Page 23: Grid Services

23

Grid Security Infrastructure (GSI)

• A central concept in GSI authentication is the certificate.

• Every user and service on the Grid is identified via a certificate, a text file containing the following information:
  – a subject name identifying the person or object that the certificate represents,
  – the public key belonging to the subject,
  – the identity of a Certificate Authority (CA) that has signed the certificate to certify that the public key and the identity both belong to the subject,
  – the digital signature of the named CA.


Page 24: Grid Services

24

Proxy Certificate

• A proxy consists of a new certificate with a new public and private key.

• The new certificate contains the owner's identity modified slightly to indicate that it is a proxy.

• The new certificate is signed by the owner rather than a CA.

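– The owner therefore acts as the issuer in place of a CA, so a verifier follows the chain from the proxy through the owner's certificate to the CA.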

• The certificate also includes a time notation after which the proxy should no longer be accepted by others.

• Proxies have limited lifetimes in order to minimize the security vulnerability.

• Because the proxy isn't valid for very long, it doesn't have to be kept quite as secure as the owner's private key.
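The certificate fields and the proxy's limited lifetime can be sketched as a small data structure. This is an illustrative model only: keys and signatures are placeholder strings, and the class, the function names and the "/CN=proxy" suffix are assumptions for the example, not the Globus API.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

@dataclass(frozen=True)
class Certificate:
    subject: str                           # who or what the certificate represents
    public_key: str                        # placeholder for the subject's public key
    issuer: str                            # the CA, or the owner in the case of a proxy
    signature: str                         # placeholder for the issuer's signature
    not_after: Optional[datetime] = None   # time after which it must be rejected

def make_proxy(owner: Certificate, proxy_public_key: str,
               now: datetime, lifetime: timedelta) -> Certificate:
    """Create a short-lived proxy signed by the owner rather than a CA."""
    return Certificate(
        subject=owner.subject + "/CN=proxy",   # owner's identity, marked as a proxy
        public_key=proxy_public_key,           # the proxy has its own new key pair
        issuer=owner.subject,
        signature=f"signed-by:{owner.subject}",
        not_after=now + lifetime,
    )

def is_acceptable(cert: Certificate, now: datetime) -> bool:
    """A certificate with an expiry time is rejected once that time has passed."""
    return cert.not_after is None or now <= cert.not_after

# Hypothetical user certificate issued by a hypothetical CA.
issued_at = datetime(2004, 1, 1, 12, 0)
owner = Certificate("/O=Grid/CN=Alice", "alice-public-key",
                    "/O=Grid/CN=Example CA", "signed-by:/O=Grid/CN=Example CA")
proxy = make_proxy(owner, "proxy-public-key", issued_at, timedelta(hours=12))
```

Because the proxy expires after twelve hours in this example, a stolen proxy key is useful to an attacker only for that window.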


Page 25: Grid Services

25

Mutual Authentication


Page 26: Grid Services

26

Additional Challenges

• Certificate Management
  – MyProxy
• Role-based Access Control
  – CAS, VOMS
• Authorization services
• Integration with applications & portals


Page 27: Grid Services

27

Challenges: Resource Management

• Resources loosely-coupled
  – Higher network latencies
  – Planned and unplanned disruptions
• How to provide QoS guarantees?
• Case Study: Entropia Desktop Grids
  – Additional trust/security issues

Page 28: Grid Services

29

Entropia 1: GIMPS

• Over 1.5 billion CPU hours served
• 300,000+ machines, over 4 years operational
• Every PC and hardware config imaginable (proc, memory, disk, etc.)
• Every networking hookup imaginable
• Found 35th, 36th, 37th, 38th, and 39th Mersenne Primes

Page 29: Grid Services

30

Entropia 2: FightAids@home

• Sept 2000 launch
• Internet-based
• 54,657 total machines
• 10,770,506 total hours of computation
• 27,881 peak billions of calculations/sec

Page 30: Grid Services

31

Entropia 3: DCGrid

• Enterprise focus
  – Tremendous resources available in enterprise
  – Complements other HPC resources
• Computing Platform
  – Arbitrary application (open scheduling model)
  – Security, unobtrusiveness, manageability guaranteed
• Focus on
  – Pharmaceuticals, Chemicals, and Materials
  – Financial Services

Page 31: Grid Services

32

DCGrid Architecture

Page 32: Grid Services

35

Server vs. Desktop Grids

• Server environment
  – Fixed IP, always connected
  – Always-on operation
  – Moderate number of systems (10’s – 100’s)
  – Dedicated use, trusted systems
• Desktop environment
  – Dynamic, temporary IP, intermittent connection
  – Off evenings, off weekends, off lunch
  – Large numbers of systems (100’s – 1000’s - ?)
  – Shared resources, potentially untrusted users
• These differences give rise to desktop Grid challenges

Page 33: Grid Services

36

Typical PC-Grid Environment

[Figure: x-axis Time (hours), 552 to 720; y-axis scale 0 to 700]

Page 34: Grid Services

37

PC-Grid Challenges

• Provide a stable compute environment for apps
  – Isolate app from variable desktop environment
• Operate in environment of dynamic use
  – Unobtrusiveness and Fault Tolerance are key!
• Provide simple application integration
  – Support ANY Application without modification
• Provide centralized management console
  – Zero additional management costs

Page 35: Grid Services

38

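[Diagram: Entropia layered architecture. Layers and their components: Job Management (Job Manager), Resource Scheduling (Subjob Scheduler), Physical Node Management (Node Manager). Other labels: End-user, Entropia Clients, computation, resource, resource description, Workflow. Step markers: 1 to 8, a, b.]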

Page 36: Grid Services

39

Stable Compute Environment

• Entropia Proprietary Sandbox
  – Binary-level protection
  – System virtualization (registry, file system, network)
• Open Scheduling Infrastructure
  – Intelligent scheduling (match resources to subjob requirements)
  – Manage subjob redundancy/fault tolerance
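The idea of matching resources to subjob requirements, with redundant copies for fault tolerance, can be sketched as a simple greedy matcher. This is a hypothetical illustration and not Entropia's scheduler. The `Node` and `Subjob` fields and the first-fit policy are assumptions.

```python
from dataclasses import dataclass

@dataclass
class Node:
    name: str
    memory_mb: int
    disk_mb: int

@dataclass
class Subjob:
    name: str
    min_memory_mb: int
    min_disk_mb: int
    copies: int = 1   # redundant copies to run, for fault tolerance

def schedule(subjobs, nodes):
    """First-fit matching: each copy of a subjob gets a distinct free node
    that satisfies the subjob's memory and disk requirements."""
    free = list(nodes)
    assignments = {}
    for job in subjobs:
        chosen = []
        for node in list(free):          # iterate over a copy while removing
            if len(chosen) == job.copies:
                break
            if node.memory_mb >= job.min_memory_mb and node.disk_mb >= job.min_disk_mb:
                chosen.append(node.name)
                free.remove(node)
        assignments[job.name] = chosen   # may hold fewer nodes than requested
    return assignments

# Hypothetical desktop pool and two subjobs; the first runs two redundant copies.
pool = [Node("pc1", 256, 1000), Node("pc2", 1024, 5000), Node("pc3", 1024, 5000)]
jobs = [Subjob("dock-1", 512, 2000, copies=2), Subjob("dock-2", 128, 500)]
plan = schedule(jobs, pool)
```

A production scheduler would also handle nodes that disappear mid-run by rescheduling the affected copies, which this sketch leaves out.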

Page 37: Grid Services

40

Manage Dynamic Use

• PC primary use must be respected!
• Entropia Proprietary Sandbox
  – Guaranteed to run at idle priority
  – Limit application capability
  – Monitor page faults, network access
• Management
  – Provide time-of-use windows
  – Different levels of unobtrusiveness
• Gathers 95+% of cycles

Page 38: Grid Services

41

Application Integration

• Support any Win32 binary
  – Language neutral (C, C++, Fortran, Java, C#, etc.)
  – Compiler/library neutral

[Diagram labels: App A, App B, App C; Application Preparation Tools; Open Grid Platform; Run Applications (qsub, qstat, …); Client1 *, Client2 *]

Page 39: Grid Services

42

Manageability

Page 40: Grid Services

43

Application Performance

[Figure: four plots of application throughput against Number of Clients. Recoverable labels: applications HMMER, GOLD, AUTODOCK and DOCK; y-axes “Sequences per hour”, “Throughput (Packets per Hour)” and “Compounds per Hour” (two panels); legend entries Entropia, 1CPU SGI, 1CPU SUN, Linear (Entropia).]

Page 41: Grid Services

44

Scheduling Performance

[Figure: Job 14 Nodes (94 clients); x-axis Time (secs), 0 to 21600; y-axis Client ID, 0 to 100]

Page 42: Grid Services

45

Challenges: Service Interoperability

• Trying to force homogeneity on users is futile. Everyone has their own preferences, sometimes even dogma.

• The Internet provides the model…

Page 43: Grid Services

46

Typical Application

[Diagram: components of a typical Grid application, arranged in four layers]

• Layers
  – Users work with client applications
  – Application services organize VOs & enable access to other services
  – Collective services aggregate &/or virtualize resources
  – Resources implement standard access & management interfaces
• Components shown: Web Browser, Data Viewer Tool, Chat Tool, Simulation Tool, Telepresence Monitor, Web Portal, Registration Service, Credential Repository, Certificate Authority, Data Catalog, Compute Server (x2), Database service (x3), Camera (x2)

Page 44: Grid Services

47

Typical Application

• Implementations are provided by a mix of– Application-specific code

– “Off the shelf” tools and services

– Tools and services from the Globus Toolkit

– Tools and services from the Grid community (compatible with GT)

• Glued together by…– Application development

– System integration

Page 45: Grid Services

48

How it Really Happens(without the Grid)

[Diagram: the typical-application components, each marked by who supplies it; the diagram also carries the markers A to E]

• Layers
  – Users work with client applications
  – Application services organize VOs & enable access to other services
  – Collective services aggregate &/or virtualize resources
  – Resources implement standard access & management interfaces
• Components shown: Web Browser, Data Viewer Tool, Chat Tool, Simulation Tool, Telepresence Monitor, Web Portal, Registration Service, Credential Repository, Certificate Authority, Data Catalog, Compute Server (x2), Database service (x3), Camera (x2)
• Component count by source
  – Grid Community: 0
  – Globus Toolkit: 0
  – Off the Shelf: 13
  – Application Developer: 9

Page 46: Grid Services

49

How it Really Happens(with the Grid)

[Diagram: the same application rebuilt with Grid components]

• Layers
  – Users work with client applications
  – Application services organize VOs & enable access to other services
  – Collective services aggregate &/or virtualize resources
  – Resources implement standard access & management interfaces
• Components shown: Web Browser, Data Viewer Tool, portlet, Simulation Tool, Telepresence Monitor, Portal, Globus Index Service, MyProxy, Certificate Authority, Globus MCS/RLS, Globus GRAM (x2), Globus DAI (x3), Compute Server (x2), Database service (x3), Camera (x2)
• Component count by source
  – Grid Community: 4
  – Globus Toolkit: 4
  – Off the Shelf: 9
  – Application Developer: 2

Page 47: Grid Services

50

Theory -> Practice

Page 48: Grid Services

51

What You Get in the Globus Toolkit

• OGSI (3.x) / WSRF (4.x) Core Implementation
  – Used to develop and run OGSA-compliant Grid Services (Java, C/C++)
• Basic Grid Services
  – Popular among current Grid users, common interfaces to the most typical services; includes both OGSA and non-OGSA implementations
• Developer APIs
  – C/C++ libraries and Java classes for building Grid-aware applications and tools
• Tools and Examples
  – Useful tools and examples based on the developer APIs

Page 49: Grid Services

52

Components in Globus Toolkit 3.0

[Diagram: toolkit components grouped into columns]

• Columns: Security, Data Management, Resource Management, Information Services, WS Core
• Components: GSI, WS-Security, RFT (OGSI), RLS, WU GridFTP, Java WS Core (OGSI), OGSI C Bindings, MDS2, WS-Index (OGSI), Pre-WS GRAM, WS GRAM (OGSI)

Page 50: Grid Services

53

Components in Globus Toolkit 3.2

[Diagram: toolkit components grouped into columns]

• Columns: Security, Data Management, Resource Management, Information Services, WS Core
• Components: GSI, WS-Security, CAS (OGSI), SimpleCA, RFT (OGSI), RLS, OGSI-DAI, WU GridFTP, XIO, Java WS Core (OGSI), OGSI C Bindings, MDS2, WS-Index (OGSI), Pre-WS GRAM, WS GRAM (OGSI), OGSI Python Bindings (contributed), pyGlobus (contributed)

Page 51: Grid Services

54

Planned Components in GT 4.0

[Diagram: toolkit components grouped into columns]

• Columns: Security, Data Management, Resource Management, Information Services, WS Core
• Components: GSI, WS-Security, CAS (WSRF), SimpleCA, Authz Framework, RFT (WSRF), RLS, OGSI-DAI, New GridFTP, XIO, Java WS Core (WSRF), C WS Core (WSRF), MDS2, WS-Index (WSRF), Pre-WS GRAM, WS-GRAM (WSRF), CSF (contribution), pyGlobus (contributed)

Page 52: Grid Services

55

Grid and Web Services Convergence

The definition of WSRF means that the Grid and Web services communities can move forward on a common base.

Page 53: Grid Services

Grid Services Example

• (from Sotomayor tutorial)
• MathService API:
  – add(int x)
  – subtract(int x)
  – getValue()

Note 1: How is this different from
  – Web Services?
  – CORBA?
  – COM/DCOM?

Note 2: This is too simple! What about
  – co-ordination/workflows
  – personalization
  – presentation
  – security
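The point of the MathService example is that each service instance keeps state between calls. A minimal local sketch of that behaviour, with no networking or WSDL involved, is shown here. The Python class, the snake_case method name and the initial value of 0 are assumptions for illustration.

```python
class MathService:
    """Stateful counterpart of the three operations listed on the slide."""

    def __init__(self):
        self._value = 0          # assumed initial state of a new instance

    def add(self, x: int) -> None:
        self._value += x         # the operation returns nothing, only changes state

    def subtract(self, x: int) -> None:
        self._value -= x

    def get_value(self) -> int:
        return self._value

# Two instances hold independent state, like two grid service instances.
first = MathService()
second = MathService()
first.add(5)
first.subtract(2)
second.add(10)
```

After these calls `first.get_value()` and `second.get_value()` differ, which is the per-instance state a plain stateless web service would not give you.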

Page 54: Grid Services

OGSI (or what is a grid service?)

• Using web service infrastructure

– MathService is defined by WSDL (like IDL)

<?xml version="1.0" encoding="UTF-8"?>
...
<types>
  <xsd:schema targetNamespace="http://www.gt3tutorial.org/namespaces/0.2/core/gwsdl/Math"
      attributeFormDefault="qualified"
      elementFormDefault="qualified"
      xmlns="http://www.w3.org/2001/XMLSchema">
    <xsd:element name="add">
      <xsd:complexType>
        <xsd:sequence>
          <xsd:element name="value" type="xsd:int"/>
        </xsd:sequence>
      </xsd:complexType>
    </xsd:element>
    <xsd:element name="addResponse">
      <xsd:complexType/>
    </xsd:element>
    ...
</types>

<message name="AddInputMessage">
  <part name="parameters" element="tns:add"/>
</message>
<message name="AddOutputMessage">
  <part name="parameters" element="tns:addResponse"/>
</message>
...

<gwsdl:portType name="MathPortType" extends="ogsi:GridService">
  <operation name="add">
    <input message="tns:AddInputMessage"/>
    <output message="tns:AddOutputMessage"/>
    <fault name="Fault" message="ogsi:FaultMessage"/>
  </operation>
  <operation name="subtract">
    <input message="tns:SubtractInputMessage"/>
    <output message="tns:SubtractOutputMessage"/>
    <fault name="Fault" message="ogsi:FaultMessage"/>
  </operation>
  <operation name="getValue">
    <input message="tns:GetValueInputMessage"/>
    <output message="tns:GetValueOutputMessage"/>
    <fault name="Fault" message="ogsi:FaultMessage"/>
  </operation>
</gwsdl:portType>

</definitions>

Page 55: Grid Services

Basic Concepts

Page 56: Grid Services

The GridService PortType

• a “grid service” is a web service that implements the GridService PortType

<portType name="GridService">
  <operation name="setServiceData"> [snip] </operation>
  <operation name="destroy"> [snip] </operation>
  <operation name="requestTerminationAfter"> [snip] </operation>
  <operation name="requestTerminationBefore"> [snip] </operation>
  <operation name="findServiceData"> [snip] </operation>
</portType>

<gwsdl:portType name="GridService">
  <sd:serviceData maxOccurs="unbounded" minOccurs="1" modifiable="false" mutability="constant" name="interface" nillable="false" type="xsd:QName"/>
  <sd:serviceData maxOccurs="unbounded" minOccurs="0" modifiable="false" mutability="mutable" name="serviceDataName" nillable="false" type="xsd:QName"/>
  <sd:serviceData maxOccurs="1" minOccurs="1" modifiable="false" mutability="mutable" name="factoryLocator" nillable="true" type="ogsi:LocatorType"/>
  <sd:serviceData maxOccurs="unbounded" minOccurs="0" modifiable="false" mutability="extendable" name="gridServiceHandle" nillable="false" type="ogsi:HandleType"/>
  <sd:serviceData maxOccurs="unbounded" minOccurs="1" modifiable="false" mutability="mutable" name="gridServiceReference" nillable="false" type="ogsi:ReferenceType"/>
  <sd:serviceData maxOccurs="unbounded" minOccurs="1" modifiable="false" mutability="static" name="findServiceDataExtensibility" nillable="false" type="ogsi:OperationExtensibilityType"/>
  <sd:serviceData maxOccurs="unbounded" minOccurs="1" modifiable="false" mutability="static" name="setServiceDataExtensibility" nillable="false" type="ogsi:OperationExtensibilityType"/>
  <sd:serviceData maxOccurs="1" minOccurs="1" modifiable="false" mutability="mutable" name="terminationTime" nillable="false" type="ogsi:TerminationTimeType"/>
  <sd:staticServiceDataValues>
    <ogsi:findServiceDataExtensibility inputElement="ogsi:queryByServiceDataNames"/>
    <ogsi:setServiceDataExtensibility inputElement="ogsi:setByServiceDataNames"/>
    <ogsi:setServiceDataExtensibility inputElement="ogsi:deleteByServiceDataNames"/>
  </sd:staticServiceDataValues>
</gwsdl:portType>

Page 57: Grid Services

GridService PortType

• FindServiceData()
• QueryByServiceDataNames()
• GetServiceData()
• SetByServiceDataNames()
• DeleteByServiceDataNames()
• RequestTerminationAfter()
• RequestTerminationBefore()
• Destroy()

Page 58: Grid Services

Capabilities of a Grid Service

• 2-level naming (GSH vs. GSR)

• Factories

• Lifetime management

• Service Data Elements

• Event Notification

• ServiceGroups

Page 59: Grid Services

GSH versus GSR

• A GSH (Grid Service Handle) is a unique name for a Grid Service Instance

• A GSR (Grid Service Reference) is a perhaps temporary mechanism to access the Grid Service Instance
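The two-level naming can be sketched as a lookup table: the handle stays fixed for the life of the instance, while the reference it resolves to may change. The `HandleResolver` class and the example URLs below are hypothetical and stand in for whatever resolution service a deployment uses.

```python
class HandleResolver:
    """Maps permanent handles (GSH) to current references (GSR)."""

    def __init__(self):
        self._references = {}

    def bind(self, gsh: str, gsr: str) -> None:
        """Record or replace the current reference for a handle."""
        self._references[gsh] = gsr

    def resolve(self, gsh: str) -> str:
        """Return the reference that is valid right now for this handle."""
        return self._references[gsh]

resolver = HandleResolver()
handle = "http://example.org/handles/math/1"          # permanent name

resolver.bind(handle, "http://hostA.example.org:8080/services/math/1")
before_move = resolver.resolve(handle)

# The instance migrates to another host: the GSR changes, the GSH does not.
resolver.bind(handle, "http://hostB.example.org:8080/services/math/1")
after_move = resolver.resolve(handle)
```

Clients that store only the handle keep working after the move, because they resolve it again when the old reference stops responding.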

Page 60: Grid Services

Factories

• Create new instances of services dynamically

• Individualized Instances

• lifetime management techniques
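Factories and lifetime management fit together: a factory creates an instance with a termination time, clients extend that time while they still need the instance, and expired instances are reclaimed. The sketch below is a simplified local model under those assumptions. The class names, the numeric clock and the `sweep` method are illustrative and do not reproduce the OGSI operations exactly.

```python
import itertools

class ServiceInstance:
    def __init__(self, handle, termination_time):
        self.handle = handle
        self.termination_time = termination_time
        self.destroyed = False

    def request_termination_after(self, time):
        """Simplified keep-alive: never terminate earlier than `time`."""
        self.termination_time = max(self.termination_time, time)

    def destroy(self):
        self.destroyed = True

class Factory:
    """Creates individually named, individually expiring instances."""

    def __init__(self, base_url):
        self._base_url = base_url
        self._counter = itertools.count(1)
        self.instances = {}

    def create_service(self, now, lifetime):
        handle = f"{self._base_url}/{next(self._counter)}"
        instance = ServiceInstance(handle, now + lifetime)
        self.instances[handle] = instance
        return instance

    def sweep(self, now):
        """Destroy and forget instances whose termination time has passed."""
        for handle, instance in list(self.instances.items()):
            if instance.destroyed or now >= instance.termination_time:
                instance.destroy()
                del self.instances[handle]

# Times are plain seconds on a hypothetical clock.
factory = Factory("http://example.org/services/math")
abandoned = factory.create_service(now=0, lifetime=60)
kept_alive = factory.create_service(now=0, lifetime=60)
kept_alive.request_termination_after(120)   # client still interested
factory.sweep(now=90)
```

This soft-state approach means an instance whose client has crashed or lost interest is cleaned up without any explicit destroy call.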

Page 61: Grid Services

Service Data Elements

• Generalized State

– useful for describing capability

– Get/Set model similar to JavaBeans properties

• Can specify initial values in WSDL

• Integrated with Notification mechanism

Page 62: Grid Services

Service Data Elements: GridService

• Interface

• ServiceDataName

• FactoryLocator

• GridServiceHandle

• GridServiceReference

• TerminationTime

Page 63: Grid Services

Notifications

• Source
  – implements NotificationSourcePortType
  – sends a notification message (XML element) to Sinks
• Sink
  – implements NotificationSinkPortType
  – sends a notification subscription request to the Source
  – causes a GridService instance of portType NotificationSubscription to be created
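The source/sink relationship, and its link to service data elements, follows the observer pattern. A local sketch under that assumption is shown here. The Python classes and method names are illustrative stand-ins for the portTypes named above, not generated bindings.

```python
class Subscription:
    """Stands in for the NotificationSubscription instance created on subscribe."""

    def __init__(self, name, sink):
        self.name = name
        self.sink = sink
        self.active = True

    def destroy(self):
        self.active = False      # ending the subscription stops delivery

class NotificationSink:
    def __init__(self):
        self.received = []

    def deliver_notification(self, name, value):
        self.received.append((name, value))

class NotificationSource:
    """Holds service data and notifies subscribed sinks when it changes."""

    def __init__(self):
        self._service_data = {}
        self._subscriptions = []

    def subscribe(self, name, sink):
        subscription = Subscription(name, sink)
        self._subscriptions.append(subscription)
        return subscription

    def set_service_data(self, name, value):
        self._service_data[name] = value
        for subscription in self._subscriptions:
            if subscription.active and subscription.name == name:
                subscription.sink.deliver_notification(name, value)

    def find_service_data(self, name):
        return self._service_data[name]

source = NotificationSource()
sink = NotificationSink()
subscription = source.subscribe("value", sink)

source.set_service_data("value", 3)      # delivered to the sink
source.set_service_data("status", "ok")  # different element, not delivered
subscription.destroy()
source.set_service_data("value", 4)      # subscription ended, not delivered
```

The sink sees only changes to the element it subscribed to, and only while its subscription is still alive.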

Page 64: Grid Services

ServiceGroups

• A grid service that maintains information about other grid services
• Can be used to implement a classic registry model
• Can be used for dataset replication
• A grid service can belong to more than one ServiceGroup
• Membership in a ServiceGroup can be homogeneous or heterogeneous
• ServiceGroup portTypes are optional

Page 65: Grid Services

Grid Services: Summary

• Extends Web Services to support Transient Services
  – WSDL 1.2 expected to include extensions
• Requires support for factories, lifetime management, soft-state management, and notifications
• Java implementation pretty solid
  – Security implementation still shaky

Page 66: Grid Services

69

Other Challenges

• Developing user interfaces

• Data Management

• Scheduling/co-scheduling of resources

• Failure management

• Application development

• Performance

• Many others…

Page 67: Grid Services

70

What I hope you got from this talk

• Grid Computing is about
  – Co-ordinated use of different resources
  – Provisioning resources for increased utilization
  – Scaling to large numbers of resources, services and users

• Many systems being built

• Many Applications being developed