53
The Ibis e-Science Software Framework Frank J. Seinstra, Jason Maassen, Niels Drost Jungle Computing Research & Applications Group Department of Computer Science VU University, Amsterdam, The Netherlands

The Ibis e-Science Software Framework ·  · 2011-11-28Please help us define this properly ... Easily combines resources in multiple administrative domains ... ‘Intelligent dispatching’

Embed Size (px)

Citation preview

The Ibis e-Science Software Framework

Frank J. Seinstra, Jason Maassen, Niels Drost

Jungle Computing Research & Applications Group Department of Computer Science

VU University, Amsterdam, The Netherlands

Mini-course: General Overview

●  10:00 – 11:00: Introduction to Ibis ●  11:00 – 12:00: Ibis as ‘Master Key’ ●  12:00 – 13:00: Lunch (free) ●  13:00 – 14:00: Ibis as ‘Glue’ ●  14:00 – 15:00: Ibis today, tomorrow, and beyond ●  During and after course time for discussion

●  Note: ●  Second day (hands-on session) on Monday December 12

2 Ibis Mini-Course November 2011

Introduction: Overview

●  Ibis: ‘Problem Solving’ vs. ‘System Fighting’ ●  Disclaimers ●  Jungle Computing ●  Domain Example #1: Computational Astrophysics ●  Domain Example #2: Multimedia Content Analysis ●  The Ibis Software Framework

●  Requirements and Tools

●  The 3 Common Uses of Ibis ●  Ibis as ‘Master Key’, Ibis as ‘Glue’, Ibis as ‘HPC solution’

3 Ibis Mini-Course November 2011

Ibis: ‘Problem Solving’ vs. ‘System Fighting’

4 Ibis Mini-Course November 2011

●  DACH 2008, Japan ●  Distributed multi-cluster system

●  Heterogeneous ●  Distributed database (image pairs)

●  Large vs small databases/images ●  Partial replication

●  Image-pair comparison given (in C)

●  Find all supernova candidates ●  Task 1: As fast as possible ●  Task 2: Idem, under system crashes

5 Ibis Mini-Course November 2011

A Random Example: Supernova Detection

‘Problem Solving’ vs. ‘System Fighting’

●  All participating teams struggled (1 month) ●  Middleware instabilities… ●  Connectivity problems… ●  Load balancing…

●  But not the Ibis team ●  Winner (by far) in both categories ●  Note: many Japanese teams with years of experience

●  Hardware, middleware, network, C-code, image data… ●  Focus on ‘problem solving’, not ‘system fighting’

●  incl. ‘opening’ of black-box C-code

6 Ibis Mini-Course November 2011

Ibis Results: Awards & Prizes

7

1st Prize: SCALE 2008

AAAI-VC 2007 Most Visionary Research Award

1st Prize: DACH 2008 - BS 1st Prize: DACH 2008 - FT

WebPie: A Web-Scale Parallel Inference Engine

J. Urbani, S. Kotoulas,

J. Maassen, N. Drost, F.J. Seinstra, F. van Harmelen, and H.E. Bal

3rd Prize: ISWC 2008

1st Prize: SCALE 2010

Ibis Mini-Course November 2011 Finalist: EYR3 2011

●  Many domains; data/compute intensive, real-time...

Ibis Users…

8 Ibis Mini-Course November 2011

…and many more

Disclaimers

9 Ibis Mini-Course November 2011

Disclaimer #1

●  During this mini-course I (= Frank) will take the position of an enthusiastic user / domain expert… ●  Because I am / used to be:

●  Multimedia Content Analysis (UvA, 1996 - 2006) ●  2005/2006:

●  Ibis provided all tools to allow me to continue my research ●  Was immediately successful: award at AAAI 2007 ●  Became Ibis ‘evangelist’: “If I can do it, anyone can”

●  Now leader of JCRA group ●  Direction, focus, scope ●  Outreach to application domains / developers

10 Ibis Mini-Course November 2011

Disclaimer #2

●  The Ibis projected started in 2002 ●  Dozens of people have worked on it and/or co-implemented it ●  Many more have used it ●  Hence:

●  Ibis reflects many important lessons learned over +/- 10 years

●  In this tutorial we can only ‘scratch the surface’ ●  Attempt to give best possible impression of what Ibis has to offer ●  Never complete – focus on:

●  most important lessons learned ●  best match with ‘you’

11 Ibis Mini-Course November 2011

Disclaimer #3

●  Ibis is not the-be-all-and-end-all solution ●  A modular toolbox

●  Many (integrated) solutions to fundamental problems ●  Some Ibis-tools are low-level

●  but higher level APIs, models, GUIs often available ●  Further research required ●  Further development / engineering required

●  Note: ●  Code itself (in principle) less important than the knowledge it

contains

12 Ibis Mini-Course November 2011

Disclaimer #4

●  Ibis does not (explicitly) provide solutions for ●  World peace, cold fusion, stock market prediction, … J ●  … ●  Security

●  Well, there is no Ibis-specific security ●  Do integrate existing solutions, e.g. ssh tunnels, grid certificates

●  Co-allocation ●  Well, there is a JavaGAT call, but adaptors must implement it ●  Do provide support for malleability and migration

●  It’s all a matter of ‘scope’: ●  Please help us define this properly

13 Ibis Mini-Course November 2011

Disclaimer #5

●  Outreach to Application Domains ●  We are happy to support any user, but time is against us

●  We must do research primarily ●  Currently we do ‘cherry-picking’ ●  Future: through NLeSC, or…?

●  Outreach to Expert HPDC Developers ●  Ibis is open source; contributions always welcome

●  So far: mostly JavaGAT adaptors (e.g. from D-Grid) ●  Future: through NLeSC, or…?

●  Roadmap to ‘Ibis Coding Community’ required

14 Ibis Mini-Course November 2011

Jungle Computing

15 Ibis Mini-Course November 2011

Jungle Computing

●  ‘Worst case’ computing as required by end-users ●  Distributed ●  Heterogeneous ●  Hierarchical (incl. multi-/many-cores)

Ibis Mini-Course November 2011 16

Why Jungle Computing?

●  Scientists often forced to use a wide variety of resources simultaneously to solve computational problems, e.g. due to:

●  Desire for scalability ●  Distributed nature of (input) data ●  Software heterogeneity (e.g.: mix of C/MPI and CUDA) ●  Ad hoc hardware availability ●  …

●  Note: most users do not need ‘worst case’ jungle ●  Ibis aims to apply to any subset

Ibis Mini-Course November 2011 17

Example Application Domains

●  Computational Astrophysics (Leiden) ●  AMUSE: multi-model / multi-kernel simulations ●  “Simulating the Universe on an Intercontinental

Grid” - Portegies Zwart et al (IEEE Computer, Aug 2010)

●  Climate Modeling (Utrecht) ●  CPL: multi-model / multi-kernel simulations

●  Atmosphere, ocean, source rock formation, … - hardware:

(potentially) very diverse - high resolution => speed & scalability

- …

18 Ibis Mini-Course November 2011

Domain Example #1: Computational Astrophysics

19 Ibis Mini-Course November 2011

20 Ibis Mini-Course November 2011

Domain Example #1: Computational Astrophysics

Demonstrated live at SC’11, Nov 12-18, 2011, Seattle, USA (two week ago)

21 Ibis Mini-Course November 2011

Domain Example #1: Computational Astrophysics

AMUSE

radiative transport

gravitational dynamics

hydro-dynamics

stellar evolution

●  The AMUSE system (Leiden University) ●  Early Star Cluster Evolution, including gas

●  Gravitational dynamics (N-body): GPU / GPU-cluster ●  Stellar evolution: Beowulf cluster / Cloud ●  Hydro-dynamics, Radiative transport: Supercomputer

22 Ibis Mini-Course November 2011

Domain Example #1: Computational Astrophysics

Demonstrated live at SC’11, Nov 12-18, 2011, Seattle, USA (two weeks ago)

Domain Example #2: Multimedia Content Analysis

23 Ibis Mini-Course November 2011

Multimedia Content Analysis (MMCA)

●  Aim: ●  Automatic extraction of ‘semantic concepts’ from image sets and

video streams

●  Depending on specific problem & size of data set: ●  May take hours, days, weeks, months, years…

Ibis Mini-Course November 2011 24

●  Need for user-friendly programming tools ●  Shield domain-experts from all complexities of parallel, distributed,

heterogeneous, and hierarchical computing ●  Familiar (sequential) programming model(s)

Ibis Mini-Course November 2011 25

Solution: tool to make parallel & distributed computing

transparent to user

- familiar programming - easy execution Jungle Computing Systems User

Multimedia Content Analysis (MMCA)

●  Applications in (a.o): ●  Remote Sensing ●  Security / Surveillance ●  Medical Imaging ●  Document Analysis ●  Multimedia Systems ●  Astronomy

●  Application types: ●  Real-time vs. off-line ●  Fine-grained vs. coarse-grained ●  Data-intensive / compute-intensive / information-intensive

Ibis Mini-Course November 2011 26

Multimedia Content Analysis (MMCA)

Domain Example #2: Color-based Object Recognition by a Grid-connected Robot Dog

27 Ibis Mini-Course November 2011

Seinstra et al (IEEE Multimedia, Oct-Dec 2007) Seinstra et al (AAAI’07: Most Visionary Research Award)

Ibis Mini-Course November 2011 28

Successful…

●  …but many fundamental problem unsolved! ●  Scaling up to very large systems ●  Platform independence ●  Middleware independence ●  Connectivity (a.o. firewalls, …) ●  Fault-tolerance ●  …

●  Software support tool(s) urgently needed! ●  Jungle-aware + transparent + efficient ●  No progress until ‘discovery’ of Ibis

Ibis Mini-Course November 2011 29

The Ibis Software Framework

30 Ibis Mini-Course November 2011

The Ibis Software Framework

●  Offers all functionality to efficiently & transparently implement & run Jungle Computing applications

●  Designed for dynamic / hostile environments

●  Modular and flexible ●  Allow replacement of Ibis components by external ones, including

native code

●  Open source ●  Download: http://www.cs.vu.nl/ibis/

Ibis Mini-Course November 2011 31

General Requirements

●  Resource independence ●  Transparent / easy deployment

●  Middleware independence & interoperability ●  Jungle-aware middleware

●  Jungle-aware communication ●  Robust connectivity ●  System-support for malleability and fault-tolerance ●  Globally unique naming

●  Transparent parallelism & application-level fault-tolerance ●  Easy integration with external software (legacy codes)

●  MPI, CUDA, C++, Python, Java, Fortran, …

Ibis Mini-Course November 2011 32

Ibis Software Stack

Ibis Mini-Course November 2011 33

Resource Independence: Java

Transparent / Easy Deployment:

Jungle-aware Middleware

Middleware Independence & Interoperability

Transp. Parallelism Application-level FT

Jungle-aware Communication Unique naming Malleability & FT

Robust Connectivity

Constellation External software

JavaGAT

●  Java Grid Application Toolkit ●  High-level API for developing (Grid) applications independently of

the underlying (Grid) middleware ●  Use (Grid) services; file cp, resource discovery, job submission, … ●  Overcomes problems, incl:

●  Functionality may not work on all sites, or for all users, … ●  Middleware version differences & complex codes…

●  Note: SAGA API standardized by OGF ●  Simple API for Grid Applications (a.o. with LSU) ●  SAGA on top of JavaGAT (and v.v.)

Ibis Mini-Course November 2011 34

Zorilla (not discussed; future)

●  A prototype P2P middleware ●  A Zorilla system consists of a collection of nodes, connected by a

P2P network ●  Each node independent & implements all middleware functionality ●  No central components ●  Supports fault-tolerance and malleability ●  Easily combines resources in multiple administrative domains

Ibis Mini-Course November 2011 35

IbisDeploy

Ibis Mini-Course November 2011 36

Ibis Portability Layer (IPL)

●  Java-centric ‘run-anywhere’ communication library ●  Sent along with your application ●  “MPI for the Grid” (quote 2005)

●  Supports fault-tolerance and malleability ●  Resource tracking (JEL model) ●  Open-world / Closed world

●  Efficient ●  Highly optimized object serialization ●  Can use optimized native libraries (e.g. MPI, MX)

Ibis Mini-Course November 2011 37

SmartSockets

●  Robust connection setup

●  Always connection in 30 different scenarios Ibis Mini-Course November 2011 38

Problems: Firewalls

Network Address Translation (NAT) Non-routed networks

Multi-homing …

Ibis Programming Models

●  IPL-based programming models, a.o.: ●  Satin:

●  A divide-and-conquer model ●  MPJ:

●  The MPI binding for Java ●  RMI:

●  Object-Oriented remote Procedure Call ●  Jorus:

●  A ‘user transparent’ parallel model for multimedia applications

●  Note: ●  In most cases intended as ‘proof-of-concept’ (= scientific) ●  Not discussed in this mini-course

Ibis Mini-Course November 2011 39

“The Future of Ibis”

●  Ibis/Constellation: ●  Generalized programming framework for

‘all’ Jungle Computing applications ●  Automatically maps any application activity

(task) onto any appropriate executor (HW) ●  By way of ‘contexts’, for example:

●  Activity's context: “I need a GPU” ●  Executor’s context: “I represent a GPU”

●  Note: ●  Activities may represent any type of task:

●  Also legacy codes, scripts, 3rd party software, …

40 Ibis Mini-Course November 2011

The 3 Common Uses of Ibis

41 Ibis Mini-Course November 2011

Ibis as ‘Master Key’ (or ‘Passepartout’)

●  Use JavaGAT to access ‘any’ system ●  Develop/run applications independently of available middlewares ●  JavaGAT ‘adaptors’ required for each middleware ●  ‘Intelligent dispatching’ even allows for transparent use of multiple

middlewares

●  Example: file copy ●  JavaGAT vs. Globus

●  Simple, portable, … ●  SAGA API standardized

42 Ibis Mini-Course November 2011

package org.gridlab.gat.io.cpi.rftgt4; import java.net.MalformedURLException; import java.net.URL; import java.rmi.RemoteException; import java.security.cert.X509Certificate; import java.util.Calendar; import java.util.HashMap;

import java.util.LinkedList; import java.util.List; import java.util.Map; import java.util.Vector; import javax.xml.namespace.QName; import javax.xml.rpc.ServiceException;

import javax.xml.rpc.Stub; import javax.xml.soap.SOAPElement; import org.apache.axis.message.addressing.EndpointReferenceType; import org.apache.axis.types.URI.MalformedURIException; import org.globus.axis.util.Util; import org.globus.delegation.DelegationConstants;

import org.globus.delegation.DelegationException; import org.globus.delegation.DelegationUtil; import org.globus.gsi.GlobusCredential; import org.globus.gsi.GlobusCredentialException; import org.globus.gsi.gssapi.GlobusGSSCredentialImpl; import org.globus.gsi.jaas.JaasGssUtil; import org.globus.rft.generated.BaseRequestType;

import org.globus.rft.generated.CreateReliableFileTransferInputType; import org.globus.rft.generated.CreateReliableFileTransferOutputType; import org.globus.rft.generated.DeleteRequestType; import org.globus.rft.generated.DeleteType; import org.globus.rft.generated.OverallStatus; import org.globus.rft.generated.RFTFaultResourcePropertyType; import org.globus.rft.generated.RFTOptionsType;

import org.globus.rft.generated.ReliableFileTransferFactoryPortType; import org.globus.rft.generated.ReliableFileTransferPortType; import org.globus.rft.generated.Start; import org.globus.rft.generated.TransferRequestType; import org.globus.rft.generated.TransferType; import org.globus.transfer.reliable.client.BaseRFTClient; import org.globus.transfer.reliable.service.RFTConstants;

import org.globus.wsrf.NotificationConsumerManager; import org.globus.wsrf.NotifyCallback; import org.globus.wsrf.ResourceException; import org.globus.wsrf.WSNConstants; import org.globus.wsrf.container.ContainerException; import org.globus.wsrf.container.ServiceContainer; import org.globus.wsrf.core.notification.ResourcePropertyValueChangeNotificationElementType;

import org.globus.wsrf.encoding.DeserializationException; import org.globus.wsrf.encoding.ObjectDeserializer; import org.globus.wsrf.impl.security.authentication.Constants; import org.globus.wsrf.impl.security.authorization.Authorization; import org.globus.wsrf.impl.security.authorization.HostAuthorization; import org.globus.wsrf.impl.security.authorization.IdentityAuthorization; import org.globus.wsrf.impl.security.authorization.SelfAuthorization; import org.globus.wsrf.impl.security.descriptor.ClientSecurityDescriptor;

import org.globus.wsrf.impl.security.descriptor.ContainerSecurityDescriptor; import org.globus.wsrf.impl.security.descriptor.GSISecureMsgAuthMethod; import org.globus.wsrf.impl.security.descriptor.GSITransportAuthMethod; import org.globus.wsrf.impl.security.descriptor.ResourceSecurityDescriptor; import org.globus.wsrf.impl.security.descriptor.SecurityDescriptorException; import org.globus.wsrf.security.SecurityManager; import org.gridlab.gat.CouldNotInitializeCredentialException;

import org.gridlab.gat.CredentialExpiredException; import org.gridlab.gat.GATContext; import org.gridlab.gat.GATInvocationException; import org.gridlab.gat.GATObjectCreationException; import org.gridlab.gat.Preferences; import org.gridlab.gat.URI; import org.gridlab.gat.io.cpi.FileCpi;

import org.gridlab.gat.security.globus.GlobusSecurityUtils; import org.ietf.jgss.GSSCredential; import org.ietf.jgss.GSSException; import org.oasis.wsn.Subscribe; import org.oasis.wsn.TopicExpressionType; import org.oasis.wsrf.faults.BaseFaultType; import org.oasis.wsrf.lifetime.SetTerminationTime;

import org.oasis.wsrf.properties.GetMultipleResourcePropertiesResponse; import org.oasis.wsrf.properties.GetMultipleResourceProperties_Element; import org.oasis.wsrf.properties.ResourcePropertyValueChangeNotificationType; class RFTGT4NotifyCallback implements NotifyCallback { RFTGT4FileAdaptor transfer; OverallStatus status;

public RFTGT4NotifyCallback(RFTGT4FileAdaptor transfer) { super(); this.transfer = transfer; this.status = null; }

@SuppressWarnings("unchecked") public void deliver(List topicPath, EndpointReferenceType producer, Object messageWrapper) { try { ResourcePropertyValueChangeNotificationType message = ((ResourcePropertyValueChangeNotificationElementType) messageWrapper) .getResourcePropertyValueChangeNotification(); this.status = (OverallStatus) message.getNewValue().get_any()[0]

.getValueAsType(RFTConstants.OVERALL_STATUS_RESOURCE, OverallStatus.class); if (status.getFault() != null) { transfer.setFault(getFaultFromRP(status.getFault())); } // RunQueue.getInstance().add(this.resourceKey); } catch (Exception e) {

} transfer.setStatus(status); } private BaseFaultType getFaultFromRP(RFTFaultResourcePropertyType fault) { if (fault == null) { return null;

} if (fault.getDelegationEPRMissingFaultType() != null) { return fault.getDelegationEPRMissingFaultType(); } else if (fault.getRftAuthenticationFaultType() != null) { return fault.getRftAuthenticationFaultType(); } else if (fault.getRftAuthorizationFaultType() != null) {

return fault.getRftAuthorizationFaultType(); } else if (fault.getRftDatabaseFaultType() != null) { return fault.getRftDatabaseFaultType(); } else if (fault.getRftRepeatedlyStartedFaultType() != null) { return fault.getRftRepeatedlyStartedFaultType(); } else if (fault.getTransferTransientFaultType() != null) { return fault.getTransferTransientFaultType();

} else if (fault.getRftTransferFaultType() != null) { return fault.getRftTransferFaultType(); } else { return null; } } }

@SuppressWarnings("serial") public class RFTGT4FileAdaptor extends FileCpi { public static final Authorization DEFAULT_AUTHZ = HostAuthorization .getInstance(); Integer msgProtectionType = Constants.SIGNATURE; static final int TERM_TIME = 20;

static final String PROTOCOL = "https"; private static final String BASE_SERVICE_PATH = "/wsrf/services/"; public static final int DEFAULT_DURATION_HOURS = 24; public static final Integer DEFAULT_MSG_PROTECTION = Constants.SIGNATURE; public static final String DEFAULT_FACTORY_PORT = "8443"; private static final int DEFAULT_GRIDFTP_PORT = 2811; NotificationConsumerManager notificationConsumerManager; EndpointReferenceType notificationConsumerEPR;

EndpointReferenceType notificationProducerEPR; String securityType;

String factoryUrl; GSSCredential proxy; Authorization authorization; String host; OverallStatus status; BaseFaultType fault; String locationStr; ReliableFileTransferFactoryPortType factoryPort;

public RFTGT4FileAdaptor(GATContext gatContext, Preferences preferences, URI location) throws GATObjectCreationException { super(gatContext, preferences, location); if (!location.isCompatible("gsiftp") && !location.isCompatible("gridftp")) { throw new GATObjectCreationException("cannot handle this URI");

} String globusLocation = System.getenv("GLOBUS_LOCATION"); if (globusLocation == null) { throw new GATObjectCreationException("$GLOBUS_LOCATION is not set"); } System.setProperty("GLOBUS_LOCATION", globusLocation);

System.setProperty("axis.ClientConfigFile", globusLocation + "/client-config.wsdd"); this.host = location.getHost(); this.securityType = Constants.GSI_SEC_MSG; this.authorization = null; this.proxy = null;

try { proxy = GlobusSecurityUtils.getGlobusCredential(gatContext, preferences, "globus", location, DEFAULT_GRIDFTP_PORT); } catch (CouldNotInitializeCredentialException e) { // TODO Auto-generated catch block e.printStackTrace(); } catch (CredentialExpiredException e) {

// TODO Auto-generated catch block e.printStackTrace(); } this.notificationConsumerManager = null; this.notificationConsumerEPR = null; this.notificationProducerEPR = null;

this.status = null; this.fault = null; factoryPort = null; this.factoryUrl = PROTOCOL + "://" + host + ":" + DEFAULT_FACTORY_PORT + BASE_SERVICE_PATH + RFTConstants.FACTORY_NAME; locationStr = setLocationStr(location); }

String setLocationStr(URI location) { if (location.getScheme().equals("any")) { return "gsiftp://" + location.getHost() + ":" + location.getPort() + "/" + location.getPath(); } else { return location.toString(); }

} protected boolean copy2(String destStr) throws GATInvocationException { EndpointReferenceType credentialEndpoint = getCredentialEPR(); TransferType[] transferArray = new TransferType[1]; transferArray[0] = new TransferType();

transferArray[0].setSourceUrl(locationStr); transferArray[0].setDestinationUrl(destStr); RFTOptionsType rftOptions = new RFTOptionsType(); rftOptions.setBinary(Boolean.TRUE); // rftOptions.setIgnoreFilePermErr(false); TransferRequestType request = new TransferRequestType();

request.setRftOptions(rftOptions); request.setTransfer(transferArray); request.setTransferCredentialEndpoint(credentialEndpoint); setRequest(request); while (!transfersDone()) { try {

Thread.sleep(1000); } catch (InterruptedException e) { throw new GATInvocationException("RFTGT4FileAdaptor: " + e); } } return transfersSucc(); }

public void copy(URI dest) throws GATInvocationException { String destUrl = setLocationStr(dest); if (!copy2(destUrl)) { throw new GATInvocationException( "RFTGT4FileAdaptor: file copy failed"); }

} public void subscribe(ReliableFileTransferPortType rft) throws GATInvocationException { Map<Object, Object> properties = new HashMap<Object, Object>(); properties.put(ServiceContainer.CLASS, "org.globus.wsrf.container.GSIServiceContainer"); if (this.proxy != null) {

ContainerSecurityDescriptor containerSecDesc = new ContainerSecurityDescriptor(); SecurityManager.getManager(); try { containerSecDesc.setSubject(JaasGssUtil .createSubject(this.proxy)); } catch (GSSException e) { throw new GATInvocationException(

"RFTGT4FileAdaptor: ContainerSecurityDescriptor failed, " + e); } properties.put(ServiceContainer.CONTAINER_DESCRIPTOR, containerSecDesc); } this.notificationConsumerManager = NotificationConsumerManager

.getInstance(properties); try { this.notificationConsumerManager.startListening(); } catch (ContainerException e) { throw new GATInvocationException( "RFTGT4FileAdaptor: NotificationConsumerManager failed, " + e);

} List<Object> topicPath = new LinkedList<Object>(); topicPath.add(RFTConstants.OVERALL_STATUS_RESOURCE); ResourceSecurityDescriptor securityDescriptor = new ResourceSecurityDescriptor(); String authz = null; if (authorization == null) { authz = Authorization.AUTHZ_NONE;

} else if (authorization instanceof HostAuthorization) { authz = Authorization.AUTHZ_NONE; } else if (authorization instanceof SelfAuthorization) { authz = Authorization.AUTHZ_SELF; } else if (authorization instanceof IdentityAuthorization) { // not supported throw new GATInvocationException(

"RFTGT4FileAdaptor: identity authorization not supported"); } else { // throw an sg throw new GATInvocationException( "RFTGT4FileAdaptor: set authorization failed"); } securityDescriptor.setAuthz(authz);

Vector<Object> authMethod = new Vector<Object>(); if (this.securityType.equals(Constants.GSI_SEC_MSG)) { authMethod.add(GSISecureMsgAuthMethod.BOTH); } else { authMethod.add(GSITransportAuthMethod.BOTH); } try { securityDescriptor.setAuthMethods(authMethod);

} catch (SecurityDescriptorException e) {

throw new GATInvocationException( "RFTGT4FileAdaptor: setAuthMethods failed, " + e); } RFTGT4NotifyCallback notifyCallback = new RFTGT4NotifyCallback(this); try { notificationConsumerEPR = notificationConsumerManager .createNotificationConsumer(topicPath, notifyCallback,

securityDescriptor); } catch (ResourceException e) { throw new GATInvocationException( "RFTGT4FileAdaptor: createNotificationConsumer failed, " + e); } Subscribe subscriptionRequest = new Subscribe();

subscriptionRequest.setConsumerReference(notificationConsumerEPR); TopicExpressionType topicExpression = null; try { topicExpression = new TopicExpressionType( WSNConstants.SIMPLE_TOPIC_DIALECT, RFTConstants.OVERALL_STATUS_RESOURCE); } catch (MalformedURIException e) {

throw new GATInvocationException( "RFTGT4FileAdaptor: create TopicExpressionType failed, " + e); } subscriptionRequest.setTopicExpression(topicExpression); try { rft.subscribe(subscriptionRequest);

} catch (RemoteException e) { throw new GATInvocationException( "RFTGT4FileAdaptor: subscription failed, " + e); } } protected EndpointReferenceType getCredentialEPR()

throws GATInvocationException { this.status = null; URL factoryURL = null; try { factoryURL = new URL(factoryUrl); } catch (MalformedURLException e) { throw new GATInvocationException(

"RFTGT4FileAdaptor: set factoryURL failed, " + e); } try { factoryPort = BaseRFTClient.rftFactoryLocator .getReliableFileTransferFactoryPortTypePort(factoryURL); } catch (ServiceException e) { throw new GATInvocationException(

"RFTGT4FileAdaptor: set factoryPort failed, " + e); } setSecurityTypeFromURL(factoryURL); return populateRFTEndpoints(factoryPort); } protected void setRequest(BaseRequestType request) throws GATInvocationException {

CreateReliableFileTransferInputType input = new CreateReliableFileTransferInputType(); if (request instanceof TransferRequestType) { input.setTransferRequest((TransferRequestType) request); } else { input.setDeleteRequest((DeleteRequestType) request); }

Calendar termTimeDel = Calendar.getInstance(); termTimeDel.add(Calendar.MINUTE, TERM_TIME); input.setInitialTerminationTime(termTimeDel); CreateReliableFileTransferOutputType response = null; try { response = factoryPort.createReliableFileTransfer(input); } catch (RemoteException e) {

throw new GATInvocationException( "RFTGT4FileAdaptor: set createReliableFileTransfer failed, " + e); } EndpointReferenceType reliableRFTEndpoint = response .getReliableTransferEPR(); ReliableFileTransferPortType rft = null;

try { rft = BaseRFTClient.rftLocator .getReliableFileTransferPortTypePort(reliableRFTEndpoint); } catch (ServiceException e) { throw new GATInvocationException( "RFTGT4FileAdaptor: getReliableFileTransferPortTypePort failed, " + e);

} setStubSecurityProperties((Stub) rft); subscribe(rft); Calendar termTime = Calendar.getInstance(); termTime.add(Calendar.MINUTE, TERM_TIME); SetTerminationTime reqTermTime = new SetTerminationTime(); reqTermTime.setRequestedTerminationTime(termTime);

try { rft.setTerminationTime(reqTermTime); } catch (RemoteException e) { throw new GATInvocationException( "RFTGT4FileAdaptor: setTerminationTime failed, " + e); } try { rft.start(new Start());

} catch (RemoteException e) { throw new GATInvocationException( "RFTGT4FileAdaptor: start failed, " + e); } } private void setSecurityTypeFromURL(URL url) {

if (url.getProtocol().equals("http")) { securityType = Constants.GSI_SEC_MSG; } else { Util.registerTransport(); securityType = Constants.GSI_TRANSPORT; } }

private void setStubSecurityProperties(Stub stub) { ClientSecurityDescriptor secDesc = new ClientSecurityDescriptor(); if (this.securityType.equals(Constants.GSI_SEC_MSG)) { secDesc.setGSISecureMsg(this.getMessageProtectionType()); } else {

secDesc.setGSITransport(this.getMessageProtectionType()); } secDesc.setAuthz(getAuthorization()); if (this.proxy != null) { // set proxy credential

secDesc.setGSSCredential(this.proxy); } stub._setProperty(Constants.CLIENT_DESCRIPTOR, secDesc); } public Integer getMessageProtectionType() {

return (this.msgProtectionType == null) ? RFTGT4FileAdaptor.DEFAULT_MSG_PROTECTION : this.msgProtectionType; } public Authorization getAuthorization() { return (authorization == null) ? DEFAULT_AUTHZ : this.authorization; }

private EndpointReferenceType populateRFTEndpoints( ReliableFileTransferFactoryPortType factoryPort) throws GATInvocationException { EndpointReferenceType[] delegationFactoryEndpoints = fetchDelegationFactoryEndpoints(factoryPort); EndpointReferenceType delegationEndpoint = delegate(delegationFactoryEndpoints[0]); return delegationEndpoint; }

private EndpointReferenceType delegate( EndpointReferenceType delegationFactoryEndpoint) throws GATInvocationException { GlobusCredential credential = null; if (this.proxy != null) { credential = ((GlobusGSSCredentialImpl) this.proxy) .getGlobusCredential(); } else {

try { credential = GlobusCredential.getDefaultCredential(); } catch (GlobusCredentialException e) { throw new GATInvocationException("RFTGT4FileAdaptor: " + e); } }

int lifetime = DEFAULT_DURATION_HOURS * 60 * 60; ClientSecurityDescriptor secDesc = new ClientSecurityDescriptor(); if (this.securityType.equals(Constants.GSI_SEC_MSG)) { secDesc.setGSISecureMsg(this.getMessageProtectionType()); } else { secDesc.setGSITransport(this.getMessageProtectionType());

} secDesc.setAuthz(getAuthorization()); if (this.proxy != null) { secDesc.setGSSCredential(this.proxy); }

// Get the public key to delegate on. X509Certificate[] certsToDelegateOn = null; try { certsToDelegateOn = DelegationUtil.getCertificateChainRP( delegationFactoryEndpoint, secDesc); } catch (DelegationException e) { throw new GATInvocationException("RFTGT4FileAdaptor: " + e);

} X509Certificate certToSign = certsToDelegateOn[0]; // FIXME remove when there is a DelegationUtil.delegate(EPR, ...) String protocol = delegationFactoryEndpoint.getAddress().getScheme(); String host = delegationFactoryEndpoint.getAddress().getHost(); int port = delegationFactoryEndpoint.getAddress().getPort();

String factoryUrl = protocol + "://" + host + ":" + port + BASE_SERVICE_PATH + DelegationConstants.FACTORY_PATH; // send to delegation service and get epr. EndpointReferenceType credentialEndpoint = null; try { credentialEndpoint = DelegationUtil.delegate(factoryUrl,

credential, certToSign, lifetime, false, secDesc); } catch (DelegationException e) { throw new GATInvocationException("RFTGT4FileAdaptor: " + e); } return credentialEndpoint; } public EndpointReferenceType[] fetchDelegationFactoryEndpoints(

ReliableFileTransferFactoryPortType factoryPort) throws GATInvocationException { GetMultipleResourceProperties_Element request = new GetMultipleResourceProperties_Element(); request .setResourceProperty(new QName[] { RFTConstants.DELEGATION_ENDPOINT_FACTORY }); GetMultipleResourcePropertiesResponse response;

try { response = factoryPort.getMultipleResourceProperties(request); } catch (RemoteException e) { e.printStackTrace(); throw new GATInvocationException( "RFTGT4FileAdaptor: getMultipleResourceProperties, " + e); }

SOAPElement[] any = response.get_any(); EndpointReferenceType epr1 = null; try { epr1 = (EndpointReferenceType) ObjectDeserializer.toObject(any[0], EndpointReferenceType.class); } catch (DeserializationException e) {

throw new GATInvocationException( "RFTGT4FileAdaptor: ObjectDeserializer, " + e); } EndpointReferenceType[] endpoints = new EndpointReferenceType[] { epr1 }; return endpoints; }

synchronized void setStatus(OverallStatus status) { this.status = status; } public int transfersActive() { if (status == null) { return 1;

} return status.getTransfersActive(); } public int transfersFinished() { if (status == null) { return 0; }

return status.getTransfersFinished(); } public int transfersCancelled() { if (status == null) { return 0; }

return status.getTransfersCancelled(); } public int transfersFailed() { if (status == null) { return 0; }

return status.getTransfersFailed(); } public int transfersPending() { if (status == null) { return 1; }

return status.getTransfersPending(); } public int transfersRestarted() { if (status == null) { return 0; }

return status.getTransfersRestarted(); } public boolean transfersDone() { return (transfersActive() == 0 && transfersPending() == 0 && transfersRestarted() == 0); }

public boolean transfersSucc() { return (transfersDone() && transfersFailed() == 0 && transfersCancelled() == 0); } /* * private BaseFaultType getFaultFromRP(RFTFaultResourcePropertyType fault) { * if (fault == null) { return null; }

* * if (fault.getRftTransferFaultType() != null) { return * fault.getRftTransferFaultType(); } else if * (fault.getDelegationEPRMissingFaultType() != null) { return * fault.getDelegationEPRMissingFaultType(); } else if * (fault.getRftAuthenticationFaultType() != null) { return * fault.getRftAuthenticationFaultType(); } else if * (fault.getRftAuthorizationFaultType() != null) { return

* fault.getRftAuthorizationFaultType(); } else if * (fault.getRftDatabaseFaultType() != null) { return * fault.getRftDatabaseFaultType(); } else if * (fault.getRftRepeatedlyStartedFaultType() != null) { return * fault.getRftRepeatedlyStartedFaultType(); } else if * (fault.getTransferTransientFaultType() != null) { return * fault.getTransferTransientFaultType(); } else { return null; } }

*/ /* * private BaseFaultType deserializeFaultRP(SOAPElement any) throws * Exception { return getFaultFromRP((RFTFaultResourcePropertyType) * ObjectDeserializer .toObject(any, RFTFaultResourcePropertyType.class)); } */

void setFault(BaseFaultType fault) { this.fault = fault; } }

package tutorial; import org.gridlab.gat.GAT; import org.gridlab.gat.GATContext; import org.gridlab.gat.URI; import org.gridlab.gat.io.File; public class RemoteCopy {

public static void main(String[] args) throws Exception { GATContext context = new GATContext(); URI src = new URI(args[0]); URI dest = new URI(args[1]); File file = GAT.createFile(context, src);

file.copy(dest); GAT.end(); } }

Ibis as ‘Glue’

●  Use IPL + SmartSockets, generally for wide-area communication ●  Linking up separate ‘activities’ of an application

●  Activities: often largely ‘independent’ tasks implemented in any popular language or model (e.g. C/MPI, CUDA, Fortran, Java…)

●  Each typically running on a single GPU/node/Cluster/Cloud/… ●  Automatically circumvent connectivity problems

●  Example:

43 Ibis Mini-Course November 2011

With SmartSockets: No SmartSockets:

Ibis as ‘HPC Solution’

●  Use Ibis as replacement for e.g. C++/MPI code ●  Benefits:

●  (better) portability ●  malleability (open world) ●  fault-tolerance ●  (run-time) task migration

●  Downside: ●  requires recoding

●  Comparable speedups: ●  Not discussed here

44 Ibis Mini-Course November 2011

C++/MPI

Sockets + SSH Tunneling

SSH

●  Code pre-installed at each cluster site ●  Instable / faulty communication ●  Connectivity problems ●  Execution on each cluster ‘by hand’

MMCA: Situation in 2004/2005

Parallel

Horus

Client

Parallel

Horus

Server

Parallel

Horus

Client

Ibis Mini-Course November 2011 45

C++/MPI

Sockets + SSH Tunneling

JavaGAT + IbisDeploy

Phase 1: Ibis as ‘Master Key’ (2006)

Parallel

Horus

Client

Parallel

Horus

Server

Parallel

Horus

Client

●  Code pre-installed at each cluster site ●  Instable / faulty communication ●  Connectivity problems ●  Execution on each cluster ‘by hand’

Ibis Mini-Course November 2011 46

C++/MPI

IPL + SmartSockets

JavaGAT + IbisDeploy

Phase 2: Ibis as ‘Glue’ (2006/2007)

Parallel

Horus

Client

Parallel

Horus

Server

Parallel

Horus

Client

●  Code pre-installed at each cluster site ●  Instable / faulty communication ●  Connectivity problems ●  Execution on each cluster ‘by hand’

Ibis Mini-Course November 2011 47

Ibis/Java

IPL + SmartSockets

JavaGAT + IbisDeploy

Phase 3: Ibis as ‘HPC Solution’ (2008)

Parallel

Jorus

Client

Parallel

Jorus

Server

Parallel

Jorus

Client

●  Code pre-installed at each cluster site ●  Instable / faulty communication ●  Connectivity problems ●  Execution on each cluster ‘by hand’

Ibis Mini-Course November 2011 48

‘Master Key’ + ‘Glue’ + ‘HPC’

●  Step-wise conversion to 100% Ibis / Java ●  Phase 1: JavaGAT as ‘Master Key’ ●  Phase 2: IPL + SmartSockets as ‘Glue’ ●  Phase 3: Ibis as ‘HPC Solution’ ●  After each phase a fully functional, working solution was available!

●  Eventual result: ●  ‘wall-socket computing from a memory stick’ ●  Remember: the ‘Promise of the Grid’? ●  Awards at AAAI 2007 and CCGrid 2008

49 Ibis Mini-Course November 2011

Ibis Mini-Course November 2011 50

100% Ibis Implementation (2008++)

Seinstra, Maassen, Drost et al. (SCALE 2008 @ CCGrid 2008: First Prize Winner) Bal, Maassen, Drost, Seinstra et al. (IEEE Computer, Aug 2010)

Conclusions

●  Ibis enables problem solving (avoids system fighting)

●  Successfully applied in many domains ●  Astronomy, multimedia analysis, climate modeling,

remote sensing, semantic web, medical imaging, … ●  Data intensive, compute intensive, real-time…

●  One of key technologies selected by NLeSC

●  Open source, download: ●  www.cs.vu.nl/ibis/

51 Ibis Mini-Course November 2011

Conclusions (2)

●  Jungle Computing is hard

●  High-Performance Jungle Computing even harder

●  While research into efficient & transparent Jungle-aware programming models has only just begun…

●  …Ibis provides the basic functionality to efficiently & transparently overcome most Jungle Computing complexities

Ibis Mini-Course November 2011 52

End of Introduction

www.cs.vu.nl/ibis/

53 Ibis Mini-Course November 2011