42
Brain Meets Brawn: Why Grid and Agents Need Each Other Ian Foster Argonne National Laboratory University of Chicago Globus Alliance In collaboration with Nick Jennings & Carl Kesselman http://www.aamas2004.org/proceedings/003-foster_i_grid.pdf

Brain Meets Brawn: Why Grid and Agents Need Each Other Ian Foster Argonne National Laboratory University of Chicago Globus Alliance In collaboration with

Embed Size (px)

Citation preview

Page 1: Brain Meets Brawn: Why Grid and Agents Need Each Other Ian Foster Argonne National Laboratory University of Chicago Globus Alliance In collaboration with

Brain Meets Brawn: Why Grid and Agents Need

Each Other

Ian Foster

Argonne National Laboratory

University of Chicago

Globus Alliance

In collaboration with Nick Jennings & Carl Kesselman

http://www.aamas2004.org/proceedings/003-foster_i_grid.pdf

Page 2: Brain Meets Brawn: Why Grid and Agents Need Each Other Ian Foster Argonne National Laboratory University of Chicago Globus Alliance In collaboration with

What is the Grid?

“The Grid is an international project that looks in detail at a terrorist cell operating on a global level and a team of American and British counter-terrorists who are tasked to stop it”

Gareth Neame, BBC's head of drama

Page 3: Brain Meets Brawn: Why Grid and Agents Need Each Other Ian Foster Argonne National Laboratory University of Chicago Globus Alliance In collaboration with

Well, Not Exactly!

“The Grid is an international project that looks in detail at scientific collaborations operating on a global level and a team of computer scientists who are tasked to enable it”

At least, that’s where it started …

Page 4: Brain Meets Brawn: Why Grid and Agents Need Each Other Ian Foster Argonne National Laboratory University of Chicago Globus Alliance In collaboration with

[email protected] ARGONNE CHICAGO

Open Distributed Systems

Interoperable infrastructure & tools for secure and

reliable resource sharing

GridAgents

Methodologies, algorithms for autonomous problem solvers in open systems

(brain) (brawn)

The two needeach other!

Page 5: Brain Meets Brawn: Why Grid and Agents Need Each Other Ian Foster Argonne National Laboratory University of Chicago Globus Alliance In collaboration with

[email protected] ARGONNE CHICAGO

Overview

Where’s the muscle? What Grid is about

The need for brains The importance of automation

Bringing the two together Working with the Open Science Grid Research problems

Page 6: Brain Meets Brawn: Why Grid and Agents Need Each Other Ian Foster Argonne National Laboratory University of Chicago Globus Alliance In collaboration with

[email protected] ARGONNE CHICAGO

Overview

Where’s the muscle? What Grid is about

The need for brains The importance of automation

Bringing the two together Working with the Open Science Grid Research problems

Page 7: Brain Meets Brawn: Why Grid and Agents Need Each Other Ian Foster Argonne National Laboratory University of Chicago Globus Alliance In collaboration with

[email protected] ARGONNE CHICAGO

Why the Grid?Origins: Revolution in Science

Pre-Internet Theorize &/or experiment, alone

or in small teams; publish paper Post-Internet

Construct and mine large databases of observational or simulation data

Develop simulations & analyses Access specialized devices remotely Exchange information within

distributed multidisciplinary teams

Page 8: Brain Meets Brawn: Why Grid and Agents Need Each Other Ian Foster Argonne National Laboratory University of Chicago Globus Alliance In collaboration with

[email protected] ARGONNE CHICAGO

Origins in Science

Engage via telepresence in an experiment at a remote facility

Integrate data from multiple sources in support of global change research

Harness computers across sites to process data from a physics experiment

Discover & access a genome analysis service (running on high-end computer)

Page 10: Brain Meets Brawn: Why Grid and Agents Need Each Other Ian Foster Argonne National Laboratory University of Chicago Globus Alliance In collaboration with

[email protected] ARGONNE CHICAGO

Why the Grid?New Driver: Revolution in Business

Pre-Internet Central data processing facility

Post-Internet Enterprise computing is highly distributed,

heterogeneous, inter-enterprise (B2B) Business processes increasingly

computing- & data-rich Outsourcing becomes feasible

service providers of various sorts Growing complexity & need for

more efficient management

Page 11: Brain Meets Brawn: Why Grid and Agents Need Each Other Ian Foster Argonne National Laboratory University of Chicago Globus Alliance In collaboration with

[email protected] ARGONNE CHICAGO

Grid Hype—and Products

Page 12: Brain Meets Brawn: Why Grid and Agents Need Each Other Ian Foster Argonne National Laboratory University of Chicago Globus Alliance In collaboration with

[email protected] ARGONNE CHICAGO

Common Requirements Dynamically link resources/services

From collaborators, customers, eUtilities, … (members of evolving “virtual organization”)

Into a “virtual computing system” Dynamic, multi-faceted system spanning

institutions and industries Configured to meet instantaneous needs, for:

Multi-faceted QoX for demanding workloads Security, performance, reliability, …

Page 13: Brain Meets Brawn: Why Grid and Agents Need Each Other Ian Foster Argonne National Laboratory University of Chicago Globus Alliance In collaboration with

[email protected] ARGONNE CHICAGO

The Grid “Resource sharing & coordinated

problem solving in dynamic, multi-institutional virtual organizations”

1. Enable integration of distributed resources

2. Using general-purpose protocols & infrastructure

3. To achieve better-than-best-effort service

Page 14: Brain Meets Brawn: Why Grid and Agents Need Each Other Ian Foster Argonne National Laboratory University of Chicago Globus Alliance In collaboration with

[email protected] ARGONNE CHICAGO

“Protocols & Infrastructure”:Open Standards & Software

Standardized & interoperable mechanisms for secure & reliable: Authentication, authorization, policy, … Representation & management of state Initiation & management of computation Data access & movement Communication & notification

Good quality open source implementations to accelerate adoption & development E.g., Globus Toolkit

Page 15: Brain Meets Brawn: Why Grid and Agents Need Each Other Ian Foster Argonne National Laboratory University of Chicago Globus Alliance In collaboration with

[email protected] ARGONNE CHICAGO

Incr

ease

d fu

nctio

nalit

y,st

anda

rdiz

atio

n

Customsolutions

1990 1995 2000 2005

Open GridServices Arch

Real standardsMultiple implementations

Web services, etc.

Managed sharedvirtual systems

Computer science research

Globus Toolkit

Defacto standardSingle implementation

Internetstandards

Grid Infrastructure:Open Software and Standards

2010

Page 16: Brain Meets Brawn: Why Grid and Agents Need Each Other Ian Foster Argonne National Laboratory University of Chicago Globus Alliance In collaboration with

[email protected] ARGONNE CHICAGO

WS Core Enables Frameworks:E.g., Resource Management

Web services(WSDL, SOAP, WS-Security, WS-ReliableMessaging, …)

WS-Resource Framework & WS-Notification(Resource identity, lifetime, inspection, subscription, …)

WS-Agreement(Agreement negotiation)

WS Distributed Management(Lifecycle, monitoring, …)

Applications of the framework(Compute, network, storage provisioning,

job reservation & submission, data management,application service QoS, …)

Page 17: Brain Meets Brawn: Why Grid and Agents Need Each Other Ian Foster Argonne National Laboratory University of Chicago Globus Alliance In collaboration with

[email protected] ARGONNE CHICAGO

Web Servicesand Stateful Resources

“State” appears in almost all applications Data in a purchase order Current usage agreement for resources Metrics associated with work load on a server

Web services can model, access and manage state in many different ways Ad-hoc, per-application approaches OGSI/WSRF proposes a standard approach

Page 18: Brain Meets Brawn: Why Grid and Agents Need Each Other Ian Foster Argonne National Laboratory University of Chicago Globus Alliance In collaboration with

[email protected] ARGONNE CHICAGO

Building systems by composition of heterogeneous components demands that we standardize common patterns Approach to resource identification Lifetime management interfaces Inspection & monitoring interfaces Base fault representation Service and resource groups Notification And many more…

Standards encourage tooling & code re-use Build services more quickly & reliably

Why Standardize An Approach?

Page 19: Brain Meets Brawn: Why Grid and Agents Need Each Other Ian Foster Argonne National Laboratory University of Chicago Globus Alliance In collaboration with

[email protected] ARGONNE CHICAGO

WSRF & WS-Notification Naming and bindings (basis for virtualization)

Every resource can be uniquely referenced, and has one or more associated services for interacting with it

Lifecycle (basis for fault resilient state management) Resources created by services following factory pattern Resources destroyed immediately or scheduled

Information model (basis for monitoring & discovery) Resource properties associated with resources Operations for querying and setting this info Asynchronous notification of changes to properties

Service Groups (basis for registries & collective svcs) Group membership rules & membership management

Base Fault type

Page 20: Brain Meets Brawn: Why Grid and Agents Need Each Other Ian Foster Argonne National Laboratory University of Chicago Globus Alliance In collaboration with

[email protected] ARGONNE CHICAGO

Grid in Practice:Earthquake Engineering Example

Secure, reliable, on-demand access to data,software, people, and other resources(ideally all via a Web Browser!)

Page 21: Brain Meets Brawn: Why Grid and Agents Need Each Other Ian Foster Argonne National Laboratory University of Chicago Globus Alliance In collaboration with

[email protected] ARGONNE CHICAGO

How it Really Happens(with the Globus Toolkit)

WebBrowser

ComputeServer

GlobusMCS/RLS

DataViewer

Tool

CertificateAuthority

CHEF ChatTeamlet

MyProxy

CHEF

ComputeServer

Resources implement standard access & management interfaces

Collective services aggregate &/or

virtualize resources

Users work with client applications

Application services organize VOs & enable

access to other services

Databaseservice

Databaseservice

Databaseservice

SimulationTool

Camera

Camera

TelepresenceMonitor

Globus IndexService

GlobusGRAM

GlobusGRAM

GlobusDAI

GlobusDAI

GlobusDAI

Application Developer

2

Off the Shelf

9

Globus Toolkit

4

Grid Community

4

Page 22: Brain Meets Brawn: Why Grid and Agents Need Each Other Ian Foster Argonne National Laboratory University of Chicago Globus Alliance In collaboration with

[email protected] ARGONNE CHICAGO

0

10

20

30

40

50

60

70

8:0

0

8:3

0

9:0

0

9:3

0

10

:00

10

:30

11

:00

11

:30

12

:00

12

:30

13

:00

13

:30

14

:00

14

:30

15

:00

15

:30

16

:00

16

:30

17

:00

17

:30

18

:00

18

:30

Nu

mb

er

of

Pa

rtic

ipa

nts

UIUC

Colorado

NEESgridMultisite OnlineSimulation Test

(July 2003)

Illin

ois

Colo

rado

Illinois (simulation)

Page 23: Brain Meets Brawn: Why Grid and Agents Need Each Other Ian Foster Argonne National Laboratory University of Chicago Globus Alliance In collaboration with

[email protected] ARGONNE CHICAGO

Overview

Where’s the muscle? What Grid is about

The need for brains The importance of automation

Bringing the two together Working with the Open Science Grid Research problems

Page 24: Brain Meets Brawn: Why Grid and Agents Need Each Other Ian Foster Argonne National Laboratory University of Chicago Globus Alliance In collaboration with

Grid2003: An Operational Grid 28 sites (2100-2800 CPUs) & growing 400-1300 concurrent jobs 7 substantial applications + CS experiments Running since October 2003

Korea

http://www.ivdgl.org/grid2003

Page 25: Brain Meets Brawn: Why Grid and Agents Need Each Other Ian Foster Argonne National Laboratory University of Chicago Globus Alliance In collaboration with

[email protected] ARGONNE CHICAGO

Open Science Grid Components Computers & storage at 28 sites (to date)

2800+ CPUs Uniform service environment at each site

Globus Toolkit provides basic authentication, execution management, data movement

Pacman installation system enables installation of numerous other VDT and application services

Global & virtual organization services Certification & registration authorities, VO membership

services, monitoring services Client-side tools for data access & analysis

Virtual data, execution planning, DAG management, execution management, monitoring

IGOC: iVDGL Grid Operations Center

Page 26: Brain Meets Brawn: Why Grid and Agents Need Each Other Ian Foster Argonne National Laboratory University of Chicago Globus Alliance In collaboration with

(Very Small)Example

Workflows

Genome sequence analysis

Physicsdata

analysis

Sloan digital sky

survey

Page 28: Brain Meets Brawn: Why Grid and Agents Need Each Other Ian Foster Argonne National Laboratory University of Chicago Globus Alliance In collaboration with

[email protected] ARGONNE CHICAGO

The Need for Automation:Critical if Grid is to Scale

Who contributes & gets to consume what? Policy negotiation, enforcement, auditing

How do I schedule jobs & data movement? Adaptive scheduling

Who can be trusted to do what? Community membership, reputation, trust

negotiation, intrusion detection Why do things fail?

Failure detection, problem determination, fault isolation, system adaptation

Page 29: Brain Meets Brawn: Why Grid and Agents Need Each Other Ian Foster Argonne National Laboratory University of Chicago Globus Alliance In collaboration with

[email protected] ARGONNE CHICAGO

Adaptive Scheduling Adaptive placement

of data & computation Experiments on Grid3

0

10

20

30

40

50

60

70

80

22-Jan 23-Jan 24-Jan 25-Jan 26-Jan 27-Jan 28-Jan

Aver

age

num

ber o

f Idl

e C

PUs

FNAL_CMS

IU_ATLAS_Tier2 0

500

1000

1500

2000

2500

BN

L_

AT

LA

S

Ca

lTe

ch

-Gri

d3

UFlo

rid

a-P

G

IU_

AT

LA

S_

Tie

r2

UB

uff

alo

-CC

R

UC

Sa

nD

ieg

oP

G

UFlo

rid

a-G

rid

3

UM

_A

TLA

S

Va

nd

erb

ilt

AN

L_

HE

P

Ca

lTe

ch

-PG

Tota

l I/O

tra

ffic

(M

Byte

s)

0

200

400

600

800

Least Loaded At Data Cost Function

Tot

al R

espo

nse

Tim

e (s

econ

ds)

(K. Ranganathan)

Page 30: Brain Meets Brawn: Why Grid and Agents Need Each Other Ian Foster Argonne National Laboratory University of Chicago Globus Alliance In collaboration with

0

50

100

150

200

250

300

350

400

0 50 100 150 200IP delay (ms)

Ove

rlay

del

ay (

ms)

.

AdaptiveUnstructured Multicast

“UMM: A dynamically adaptive, unstructured multicast overlay” M. Ripeanu et al.

A

E

B

D

C

A’

E’

B’

D’

C’

A”

E”

B”

D”

C”

Applicationoverlay

Baseoverlay

Physicaltopology

0

2

4

6

8

10

0

240

480

720

960

1200

1440

1680

1920

2160

2400

2640

2880

3120

3360

3600

3840

Time (sec)

RD

P

0

2

4

6

8

10

12

Max

xim

um li

nk s

tres

s .

MaxRDP

95% RDP

90%RDP

Stress

10 nodes fail then

rejoin 900s later

RDP=1

RDP=2

Page 31: Brain Meets Brawn: Why Grid and Agents Need Each Other Ian Foster Argonne National Laboratory University of Chicago Globus Alliance In collaboration with

[email protected] ARGONNE CHICAGO

WS Core Enables Frameworks:E.g., Resource Management

Web services(WSDL, SOAP, WS-Security, WS-ReliableMessaging, …)

WS-Resource Framework & WS-Notification(Resource identity, lifetime, inspection, subscription, …)

WS-Agreement(Agreement negotiation)

WS Distributed Management(Lifecycle, monitoring, …)

Applications of the framework(Compute, network, storage provisioning,

job reservation & submission, data management,application service QoS, …)

Page 32: Brain Meets Brawn: Why Grid and Agents Need Each Other Ian Foster Argonne National Laboratory University of Chicago Globus Alliance In collaboration with

[email protected] ARGONNE CHICAGO

Agreement Negotiation All interesting interactions will be based on

agreements between requestors & services Encode a negotiated quality of experience

Challenge is to establish the framework within which agreements can be: Established in an oversubscribed environment Transformed, composed, decomposed Managed like any other resource Evolved as a result of faults

WS-Agreement is the next step on this path FIPA Request Interaction Protocol?

Page 33: Brain Meets Brawn: Why Grid and Agents Need Each Other Ian Foster Argonne National Laboratory University of Chicago Globus Alliance In collaboration with

[email protected] ARGONNE CHICAGO

WS-Agreement Model

Consumer Provider

create()

foo()Application Instance

Factory

Manager

create()Factory Agreement

Operations:Terminate()GetResourceProperty()...

Resource Properties:

Terms RelatedStatusAgrmts

inspect()

Agreement Layer

Service Layer

Consumer Provider

create()

foo()Application Instance

Factory

Manager

create()Factory Agreement

Operations:Terminate()GetResourceProperty()...

Resource Properties:

Terms RelatedStatusAgrmts

inspect()

Agreement Layer

Service Layer

Page 34: Brain Meets Brawn: Why Grid and Agents Need Each Other Ian Foster Argonne National Laboratory University of Chicago Globus Alliance In collaboration with

[email protected] ARGONNE CHICAGO

WS-Agreement Applicability

WS-Agreement to be used/specialized by specs that define domain-specific services

Data management services Storage provisioning, transfer mgt, replication

Job/execution management services Cluster provisioning, image/service deployment, job

reservation and submission, specialized services, etc. Network provisioning services

Optical path, firewall traversal, etc. Application- and domain-specific services

Page 35: Brain Meets Brawn: Why Grid and Agents Need Each Other Ian Foster Argonne National Laboratory University of Chicago Globus Alliance In collaboration with

[email protected] ARGONNE CHICAGO

Network

RRR

A

ServiceLevel

Bringing it All TogetherScenario: Resource management & scheduling

Storage

RRRIBM

IBM

Blades

RRR

Notification

GridScheduler

WS-Resource used to “model” physical

processor resources

WS-Resource Properties “project” processor status (like utilization)

Local processor manageris “front-ended” with A Web service interface

Other kinds of resources are also“modeled” as WS-Resources

JJ

J

WS-Notification can be used to “inform” the

scheduler when processor utilization

changes

Grid “Jobs” and “tasks” are also modeled using

WS-Resources and Resource Properties

Grid Scheduleris a

Web Service

Service Level Agreement

is modeled as a WS-Resource

Lifetime of SLA Resource tied to the duration

of the agreement

Page 36: Brain Meets Brawn: Why Grid and Agents Need Each Other Ian Foster Argonne National Laboratory University of Chicago Globus Alliance In collaboration with

[email protected] ARGONNE CHICAGO

Overview

Where’s the muscle? What Grid is about

The need for brains The importance of automation

Bringing the two together Working with the Open Science Grid Research problems

Page 37: Brain Meets Brawn: Why Grid and Agents Need Each Other Ian Foster Argonne National Laboratory University of Chicago Globus Alliance In collaboration with

[email protected] ARGONNE CHICAGO

Grid: Two Reasons to Care

Grid technology can ease creation of secure, robust, scalable agent systems E.g., Globus Toolkit

Grid deployments can serve as testbeds for agent concepts & algorithms E.g., Grid3 is evolving to Open Science Grid

with 1000s of CPUs & dozens of sites

Run ultra-large agent systems

Help tackle fundamental problems: trust, fault detection, adaptation, negotiation, etc.

Page 38: Brain Meets Brawn: Why Grid and Agents Need Each Other Ian Foster Argonne National Laboratory University of Chicago Globus Alliance In collaboration with

[email protected] ARGONNE CHICAGO

Working on the

What it gives you Large collection of distributed resources Standard services Mechanisms for deploying further software

What you can do Use for large-scale agent experiments Define and deploy new services: e.g., intrusion

detection, fault determination Define challenge problems to motivate

development of adaptive techniques

Page 39: Brain Meets Brawn: Why Grid and Agents Need Each Other Ian Foster Argonne National Laboratory University of Chicago Globus Alliance In collaboration with

[email protected] ARGONNE CHICAGO

10 Research Challenges (1)

Service architecture Robust foundation for autonomous behaviors

Trust negotiation & management Expressing & reasoning about trust

System management & troubleshooting Autonomic management of large systems

Negotiation Establishing, monitoring, evolving agreements

Service composition Describing, discovering, composing services

Page 40: Brain Meets Brawn: Why Grid and Agents Need Each Other Ian Foster Argonne National Laboratory University of Chicago Globus Alliance In collaboration with

[email protected] ARGONNE CHICAGO

10 Research Challenges (2)

VO formation & management Lifecycle issues

System predictability Guarantees of emergent behavior

Human-computer collaboration Interactions in hybrid teams

Evaluation Benchmarks, challenge problems, testbeds

Semantic integration Ontology definition, schema mediation

Page 41: Brain Meets Brawn: Why Grid and Agents Need Each Other Ian Foster Argonne National Laboratory University of Chicago Globus Alliance In collaboration with

[email protected] ARGONNE CHICAGO

Summary

Agents & Grid share a common interest in robust, scalable open distributed systems

Complementary foci in work to date Grid: secure & robust infrastructure (brawn) Agents: autonomous problem solvers (brain)

Both brain and brawn required for progress Specific proposals for convergence

Use Grids as testbeds for agent technology Coordinated attack on open problems

Page 42: Brain Meets Brawn: Why Grid and Agents Need Each Other Ian Foster Argonne National Laboratory University of Chicago Globus Alliance In collaboration with

[email protected] ARGONNE CHICAGO

For More Information Globus Alliance

www.globus.org Global Grid Forum

www.ggf.org Open Science Grid

www.opensciencegrid.org Background information

www.mcs.anl.gov/~foster GlobusWORLD 2005

Feb 7-11, Boston

2nd Editionwww.mkp.com/grid2