How Shibboleth can work with job schedulers to create grids to support everyone

Exposing Computational Resources Across Administrative Domains

H. David Lambert, Stephen Moore, Arnie Miles, Chad La Joie, Brent Putman, Jess Cannata



Page 2: The Paradox of Grid Computing

A large amount of computing power goes untapped, yet researchers typically cannot find the computing power they need.

Resource owners must set policies for the use of their equipment.

Users must find and leverage resources that fit their needs.

Page 3

Secure grid-like installations are not growing beyond small groups of known players.

But... why?

The only method currently available for ensuring the security of a resource involves personal interaction between resource owners and resource consumers.

Enabling a user or resource to access a resource requires manually adding the user to a local map file.

Various methods of grouping users and resources to share certificates have sprung up.

Page 4: On the other hand

Grids that encourage resource owners to connect their machines to a central portal that only allows specific efforts to run have exploded:

S.E.T.I.
United Devices Grid.org
IBM's World Community Grid

What does this mean?

Historically, getting massive quantities of resources on the grid has been a challenge.

However, in situations where the potential resource owners are relieved of heavy administrative burdens, resource owners flock to the grid.

When massive numbers of resources are made available to researchers, real work gets accomplished.

Page 5: How are jobs executed?

Modern job scheduling software includes:

Condor
Sun Grid Engine (N1)
PBS (Pro and Open)
Platform LSF

Page 6

Job scheduling software is unsurpassed in environments where there is only one administrative domain:

Beowulf clusters
High-performance n-way devices

Unfortunately, as soon as you begin to cross any sort of administrative line, these products become less robust:

Intra-campus grids
Inter-campus grids

Attempts to leverage existing grid tools to handle this have resulted in compromises:

Groups of users sharing one certificate.
User management issues.
Accounting issues.

Page 7

In general, job scheduling software accepts a job description file that describes the work to be done.

The job file is free-form text containing name-value pairs.

We can therefore add anything we want to these files, as long as we teach the execution machines to understand it.
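
To make the name-value idea concrete, here is a minimal Python sketch of a reader for this kind of file. It is not Condor's own parser; it only illustrates that an extra pair passes through untouched alongside the standard ones. The `UserAffiliation` entry is a made-up attribute, and lower-casing the names is a convenience of this sketch.

```python
def parse_job_file(text):
    """Parse free-form name-value job text into (settings, commands)."""
    settings = {}
    commands = []
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if "=" in line:
            # Split on the first "=" only, so values may contain "=" themselves.
            name, value = line.split("=", 1)
            settings[name.strip().lower()] = value.strip()
        else:
            commands.append(line)  # bare keyword such as "Queue"
    return settings, commands


EXAMPLE = """\
# Example job description
Universe = vanilla
Executable = /home/arnie/condor/my_job.condor
Arguments = -arg1 -arg2
UserAffiliation = faculty
Queue
"""

settings, commands = parse_job_file(EXAMPLE)
print(settings["executable"])
print(settings["useraffiliation"])  # hypothetical extra pair, carried along like any other
print(commands)
```

Nothing in the reader needs to know in advance which names exist, which is what makes the job file a convenient place to carry additional information.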

Page 8: Example Submission File (Condor)

# Example condor_submit input file
# (Lines beginning with # are comments)
Universe = vanilla
Executable = /home/arnie/condor/my_job.condor
Input = my_job.stdin
Output = my_job.stdout
Error = my_job.stderr
Arguments = -arg1 -arg2
InitialDir = /home/arnie/condor/run_1
Queue

Page 9: Condor in the Beowulf, supercomputer, or campus grid world

Universe = vanilla
Executable = /home/arnie/condor/my_job.condor
Input = my_job.stdin
Output = my_job.stdout
Error = my_job.stderr
Arguments = -arg1 -arg2
InitialDir = /home/arnie/condor/run_1
Queue

The user has an account on the cluster or HP device, and all nodes are in a closely controlled administrative domain.

Page 10: Condor Grid with Flocking

[Diagram labels: Submit Machine with Schedd; Central Manager (CONDOR_HOST) with Collector and Negotiator; Pool-Foo Central Manager with Collector and Negotiator; Pool-Bar Central Manager with Collector and Negotiator]

“Flocks” are introduced to each other by hostname or IP address.
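
The following Python sketch is a deliberately simplified model of that introduction step, not Condor's implementation. It assumes each pool keeps a list of submit hosts it will accept (loosely modeled on Condor's `FLOCK_FROM` setting) and that the submit machine keeps an ordered list of pools to try (loosely modeled on `FLOCK_TO`). All hostnames and the `Pool` class are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Pool:
    central_manager: str   # hostname of the pool's central manager
    idle_slots: int        # machines currently free to run a job
    flock_from: set = field(default_factory=set)  # submit hosts this pool accepts


def choose_pool(submit_host, local_pool, flock_to):
    """Return the first pool willing and able to run a job, or None."""
    if local_pool.idle_slots > 0:
        return local_pool  # the local pool is always tried first
    for pool in flock_to:
        # A remote pool only helps if it has been told about this submit host.
        if submit_host in pool.flock_from and pool.idle_slots > 0:
            return pool
    return None


local = Pool("cm.site-a.example", idle_slots=0)
foo = Pool("cm.pool-foo.example", idle_slots=4)  # never introduced to our host
bar = Pool("cm.pool-bar.example", idle_slots=2,
           flock_from={"submit.site-a.example"})

chosen = choose_pool("submit.site-a.example", local, [foo, bar])
print(chosen.central_manager)
```

Pool-Foo has free machines but is skipped because nobody listed the submit host there, which is the manual, host-by-host introduction the slide describes.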

Page 11: Job Scheduling with Conventional “Grid” Products: Globus and Condor-G

The user submits a job via a Globus-enabled version of Condor.

Any number of resources “on the grid” accept jobs through a Globus Gatekeeper, which passes them to Globus JobManagers for distribution to the resources.

Each resource must manually map a Globus X.509 certificate to a local user account.
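
As an illustration of what that mapping involves, the Python sketch below reads a small map file and looks up a certificate subject. The layout (a quoted certificate subject followed by a local account name) follows the Globus grid-mapfile convention as we understand it; the subjects and account names are invented for the example.

```python
import shlex


def parse_map_file(text):
    """Return a dict mapping certificate subject to local account name."""
    mapping = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        parts = shlex.split(line)  # honours the quotes around the subject
        if len(parts) != 2:
            raise ValueError("expected a quoted subject and one account: " + line)
        subject, account = parts
        mapping[subject] = account
    return mapping


# Hypothetical entries; each one was typed in by the resource's administrator.
MAP_FILE = """\
"/C=US/O=Example University/CN=Jane Researcher" jresearcher
"/C=US/O=Example University/CN=Sam Student" sstudent
"""

accounts = parse_map_file(MAP_FILE)
print(accounts.get("/C=US/O=Example University/CN=Jane Researcher"))
print(accounts.get("/C=US/O=Another Campus/CN=New Colleague"))
```

The second lookup finds nothing: a user from another campus cannot run anything until an administrator at the resource adds a line for them.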

Page 12: Summary of Limitations from Previous Examples

User and Resource Management Problems

How does the owner of a grid resource grant access to large numbers of individuals?

How does the owner of a grid resource know when a user who was granted access through membership in an organization leaves that organization?

How does a user easily get added to a resource?

How does a user find available resources?

Page 13

SAML-based solutions give a resource secure access to attributes about a user, which makes them a powerful partner to existing batch job schedulers.

While Condor was already able to leverage user attributes from a local LDAP store, this project demonstrates for the first time that Condor can consume user attributes from a remote store.

Page 14: What we are doing now with Shibboleth, LDAP, and Condor

[Diagram with numbered step markers: User at Site 'A' and Resource at Site 'B'. Labels: WAYF, IdP, LDAP, DB, Shib/Condor Portal, Condor Schedd (two), Job ClassAd, Resource ClassAd, Condor Startd, Running Job]

The User at Site 'A' is aware of a Resource at Site 'B', and the Owner of Resource 'B' has granted access to Site 'A'.

We leverage the free-text job submission files to add attributes from SAML to our jobs.
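
One plausible way to do this, sketched in Python below, is to write each attribute into the submit file as a custom job attribute using Condor's `+Name = "value"` submit syntax. The slides do not say how the portal actually performs this step, so the function, the attribute names `SamlAffiliation` and `SamlHomeOrg`, and the choice to reject awkward characters rather than escape them are all assumptions of this sketch.

```python
import re

# Conservative rule for attribute names: letters, digits, underscore.
_ATTR_NAME = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def add_saml_attributes(submit_text, attributes):
    """Insert one '+Name = "value"' line per attribute before the first
    Queue statement, or append them if the text has no Queue statement."""
    extra = []
    for name, value in sorted(attributes.items()):
        if not _ATTR_NAME.match(name):
            raise ValueError("unsupported attribute name: " + name)
        if any(ch in value for ch in '"\\\n'):
            raise ValueError("unsupported character in value for " + name)
        extra.append('+{} = "{}"'.format(name, value))
    lines = submit_text.splitlines()
    for index, line in enumerate(lines):
        if line.strip().lower().startswith("queue"):
            return "\n".join(lines[:index] + extra + lines[index:]) + "\n"
    return "\n".join(lines + extra) + "\n"


SUBMIT = """\
Universe = vanilla
Executable = /home/arnie/condor/my_job.condor
Queue
"""

# Hypothetical attribute values, as they might arrive from an identity provider.
print(add_saml_attributes(SUBMIT, {
    "SamlAffiliation": "faculty",
    "SamlHomeOrg": "site-a.example",
}))
```

The attributes are placed ahead of `Queue` because settings that follow a queue statement do not apply to the jobs it has already queued.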

Page 15

Now, Resource Owners can grant access to users based upon their attributes instead of their identities.

Management of users is again the responsibility of the local administration, as it should be.

When Resource Owners can easily set policies without worrying about user management and group memberships, they will become willing to attach their resources to this new computational Grid.
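
A small Python sketch of the difference: the owner's policy below names acceptable attribute values and never mentions an individual user. The attribute names and values are hypothetical, and in Condor such a rule would more likely be written as a ClassAd expression in the resource's configuration than as Python.

```python
def is_authorized(user_attributes, policy):
    """True if, for every attribute the policy names, the user's value is
    one the resource owner accepts. An empty policy admits everyone."""
    return all(
        user_attributes.get(name) in accepted
        for name, accepted in policy.items()
    )


# The owner's policy: no user names, only attributes.
POLICY = {
    "affiliation": {"faculty", "staff"},
    "home_organization": {"site-a.example"},
}

faculty_member = {"affiliation": "faculty", "home_organization": "site-a.example"}
former_member = {"affiliation": "alum", "home_organization": "site-a.example"}

print(is_authorized(faculty_member, POLICY))
print(is_authorized(former_member, POLICY))
```

When someone leaves Site 'A', their home institution changes the attribute it asserts and the resource owner's policy needs no edit.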

Page 16: Intelligent Resource Management

Users have their own policy decisions to make: processor type, operating system type, executable location, data location, memory requirements, etc.

In a perfect world, Users will have multiple Resources to choose from. These Resources will have different configurations that can match the User's policy requirements.

These varied Resources will also have ever-changing availability!

An Intelligent Resource Management System will allow Users to launch jobs from their portal and trust that the work will be sent to the Resource that not only correctly matches the User's job policy, but also has the least load on it.

This will be done without the User being aware of where the work will be executed.

This solution will be scheduler agnostic.
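
The selection rule described above (match the job policy first, then prefer the least-loaded Resource) can be sketched in a few lines of Python. The field names, resource names, and load figures are hypothetical, and a real system would obtain them from the schedulers rather than from a hard-coded list.

```python
def pick_resource(job, resources):
    """Return the least-loaded available resource that satisfies the job's
    requirements, or None if nothing matches."""
    candidates = [
        r for r in resources
        if r["available"]
        and r["arch"] == job["arch"]
        and r["os"] == job["os"]
        and r["memory_mb"] >= job["memory_mb"]
    ]
    if not candidates:
        return None
    return min(candidates, key=lambda r: r["load"])


JOB = {"arch": "x86_64", "os": "linux", "memory_mb": 2048}

RESOURCES = [
    {"name": "cluster-a", "arch": "x86_64", "os": "linux",
     "memory_mb": 4096, "load": 0.80, "available": True},
    {"name": "cluster-b", "arch": "x86_64", "os": "linux",
     "memory_mb": 8192, "load": 0.25, "available": True},
    {"name": "cluster-c", "arch": "x86_64", "os": "linux",
     "memory_mb": 1024, "load": 0.05, "available": True},   # too little memory
    {"name": "cluster-d", "arch": "x86_64", "os": "linux",
     "memory_mb": 8192, "load": 0.01, "available": False},  # currently offline
]

best = pick_resource(JOB, RESOURCES)
print(best["name"])
```

Because the function only reads generic fields, nothing in it depends on which scheduler sits behind each Resource, in keeping with the scheduler-agnostic goal.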

Page 17: Example of Intelligent Agent

[Diagram labels: Identity Provider; Job Submission Client; User Job File; Resource Discovery Network; Company “A”; University “B”; three Resource Discovery Network Nodes; four Schedulers, each with a Running Job]

Page 18: Acknowledgments

Georgetown University: Charlie Leonhardt, Steve Moore, Arnie Miles, Chad La Joie, Brent Putman, Jess Cannata

University of Wisconsin: Miron Livny, Todd Tannenbaum, Ian Alderman

Internet2: Ken Klingenstein, Mike McGill