Building Science Gateways
Marlon Pierce
Community Grids Laboratory
Indiana University
What Is a Web Portal?
- Web container that aggregates content from multiple sources into a single display.
- Start Pages
  - Typically consume RSS/Atom news feeds.
  - More powerful versions these days support Flickr, calendars, games, etc.
  - Gadgets, widgets
- Examples: iGoogle, Netvibes, My Yahoo!
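As a rough illustration of the aggregation idea, the sketch below pulls item titles out of several RSS documents and groups them by source, which is the core of what a start page does before rendering. The feed contents and the names `FEED_A`, `FEED_B`, `item_titles`, and `aggregate` are made up for this example; a real portal would fetch the feeds over HTTP.

```python
import xml.etree.ElementTree as ET

# Two tiny, made-up RSS 2.0 documents standing in for remote feeds.
FEED_A = """<rss version="2.0"><channel><title>Feed A</title>
<item><title>First item</title></item>
<item><title>Second item</title></item>
</channel></rss>"""

FEED_B = """<rss version="2.0"><channel><title>Feed B</title>
<item><title>Only item</title></item>
</channel></rss>"""


def item_titles(rss_text):
    """Return the title of every <item> in one RSS document."""
    root = ET.fromstring(rss_text)
    return [item.findtext("title") for item in root.iter("item")]


def aggregate(feeds):
    """Combine several feeds into one structure, keyed by source name."""
    return {source: item_titles(text) for source, text in feeds.items()}


page = aggregate({"Feed A": FEED_A, "Feed B": FEED_B})
for source, titles in page.items():
    print(source, titles)
```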
Grid Computing Overview
- Grid computing software is designed to integrate large supercomputing facilities.
  - TeraGrid, Open Science Grid, EGEE, etc.
- This is done via network services.
- Key Service Components
  - Authentication and authorization framework (MyProxy)
  - Remote process access and control (GRAM, Condor)
  - Remote file, I/O access (GridFTP)
- Additional Services
  - Information services, replica management, database federation, storage management, schedulers, etc.
- Example Grid Software Stacks: CTSS and VDT
TeraGrid Supercomputing Resources (GPIR)
Science Portals and Gateways
- Science Gateways adapt Web portal technology to build user interfaces to the Grid.
- Science portals resemble standard portals, but must also
  - Support access to computing and storage resources.
  - Allow users remote, Unix-like access to these resources.
  - Provide access to science applications and data sets.
- And we must provide value-added services as well as user interfaces.
[Figure: My 2002 octopus SOA diagram, from the archives. Labels: Host 1, Host 2, Host 3; protocols: HTTP(S), SOAP/HTTP, JDBC.]
Terminology
- Portlet: a standard Java component that generates HTML and can also act as a client to a remote service.
  - Lives in a portal container.
  - I will also use this term generically.
- Web Service: a remotely invocable function on the Internet.
  - SOAP: the XML message envelope for carrying commands over HTTP.
  - WSDL: describes the service's API in XML.
  - REST: a variation of this approach.
- Lots more info: http://grids.ucs.indiana.edu/ptliupages/presentations/I590WebService.ppt
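To make the SOAP term concrete, here is a minimal sketch that wraps a remote call in a SOAP 1.1 envelope and unpacks it again, using only Python's standard library. The operation name `runDisloc`, the namespace `urn:example:geo`, and the `faultName` parameter are hypothetical; a real service would define them in its WSDL.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
APP_NS = "urn:example:geo"  # hypothetical application namespace


def build_envelope(operation, params):
    """Client side: wrap an operation name and its parameters in an envelope."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    call = ET.SubElement(body, f"{{{APP_NS}}}{operation}")
    for name, value in params.items():
        ET.SubElement(call, f"{{{APP_NS}}}{name}").text = str(value)
    return ET.tostring(envelope, encoding="unicode")


def parse_envelope(xml_text):
    """Service side: recover the operation name and parameters."""
    root = ET.fromstring(xml_text)
    body = root.find(f"{{{SOAP_NS}}}Body")
    call = list(body)[0]  # first child of Body is the call element
    operation = call.tag.split("}", 1)[1]  # strip the "{namespace}" prefix
    params = {child.tag.split("}", 1)[1]: child.text for child in call}
    return operation, params


message = build_envelope("runDisloc", {"faultName": "demo-fault"})
print(parse_envelope(message))
```

In practice this string would be sent as the body of an HTTP POST; toolkits such as Apache Axis generate this plumbing from the WSDL.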
But Why?
- Three-tiered Service Oriented Architecture is the network equivalent of the famous Model-View-Controller design pattern.
  - View: the user interface components.
  - Controller: Web service middleware.
  - Model: the backend resources.
- Independence of tiers gives flexibility.
  - Services can be reused with alternative user interfaces.
    - Workflow composers like Taverna
  - User interfaces can work with different service implementations.
- Drawback: reliability and robustness are issues.
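The flexibility claim can be sketched in a few lines: two user interfaces share one service, and one user interface works unchanged against two service implementations. All class and function names here, and the fault record, are invented for illustration.

```python
class DictFaultService:
    """Service tier backed by an in-memory dict (standing in for the model tier)."""

    def __init__(self, faults):
        self._faults = dict(faults)

    def get_fault(self, name):
        return self._faults[name]


class CachingFaultService:
    """Alternative implementation: wraps another service and caches its answers."""

    def __init__(self, inner):
        self._inner = inner
        self._cache = {}

    def get_fault(self, name):
        if name not in self._cache:
            self._cache[name] = self._inner.get_fault(name)
        return self._cache[name]


def render_text(service, name):
    """One view: plain text."""
    fault = service.get_fault(name)
    return f"{name}: length {fault['length_km']} km"


def render_html(service, name):
    """Another view over the same service interface: an HTML list item."""
    fault = service.get_fault(name)
    return f"<li>{name}: {fault['length_km']} km</li>"


base = DictFaultService({"demo-fault": {"length_km": 20}})
cached = CachingFaultService(base)
print(render_text(base, "demo-fault"))
print(render_html(cached, "demo-fault"))
```

The views depend only on `get_fault`, so either tier can be replaced without touching the other.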
Two Approaches to the Middle Tier
[Figure: fat client and thin client approaches. Labels: Fat Client, Thin Client, Grid Protocol (SOAP), HTTP + SOAP, Grid Protocol (SOAP).]
Disloc output converted to KML and plotted.
GeoFEST Finite Element Modeling portlet and plotting tools
What's in the Screenshots?
- GeoFEST and Disloc Portlets
  - Live on gf7.ucs.indiana.edu
  - Manage the user's display: Web forms, links to output, graphics.
  - Save user session state persistently.
- QuakeTables Fault DB Web Service
  - Lives on gf2.ucs.indiana.edu
  - Contains geometric fault models.
- GeoFEST and Disloc Execution Web Services
  - Live on gf19.ucs.indiana.edu
  - Generate input files from fault models.
  - Run and manage codes.
Best Practice for Scientific Web Services
- There are many tools to choose from.
  - .NET, Apache Axis, Sun WS, Ruby on Rails, etc.
- Make them self-contained.
  - If possible, generate input files within the service.
  - Or have an input file generating service.
  - Remember that they may be used by other people with other client tools.
- Communicate data files with URLs.
- Be very careful about exposing the state of the service.
- Don't assume persistent connections.
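A minimal sketch of several of these practices together: the service call takes only model parameters, generates its own input file, keeps no state between calls, and hands back URLs rather than file contents. The function name, file names, and parameters are hypothetical, and `file://` URLs stand in for the HTTP URLs a deployed service would return.

```python
import tempfile
from pathlib import Path


def run_model_service(fault_model, work_dir):
    """Self-contained call: build the input file, run the code, return URLs."""
    work_dir = Path(work_dir).resolve()  # as_uri() needs an absolute path
    input_path = work_dir / "model.input"
    output_path = work_dir / "model.output"

    # Generate the input file inside the service, not on the client.
    lines = [f"{key} {value}" for key, value in sorted(fault_model.items())]
    input_path.write_text("\n".join(lines) + "\n")

    # Placeholder for launching the real science code: just record a summary.
    output_path.write_text(f"processed {len(lines)} parameters\n")

    # Return locations of the data files; nothing is remembered between calls.
    return {"input_url": input_path.as_uri(), "output_url": output_path.as_uri()}


result = run_model_service({"depth": 10, "strike": 45}, tempfile.mkdtemp())
print(result["output_url"])
```

Because every call carries everything it needs, any client tool can invoke the service, and a dropped connection costs only a repeated request.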
Components for Portals
Open Grid Computing Environments examples. See http://www.collab-ogce.org/
Components for Science Portals
- OGCE is founded on the principle that portals should be built out of reusable parts.
- Key standard in our first phase: the JSR 168 portlet specification.
  - Portlets can run in multiple containers.
  - uPortal, Sakai, GridSphere, Liferay, etc.
- Allows us to build Grid-specific components and deploy them alongside other goodies: Sakai collaboration tools, contributed portlets, etc.
- Future: OpenSocial-compliant Google Gadgets
Dashboard Portlet
The dashboard portlet allows users to track jobs on the selected resource. Users can view either their own jobs or information on all submitted jobs.
Queue forecasting portlets work with the NWS QBETS to predict wait times and deadlines.
PURSe portlets manage user requests for portal accounts and Grid credentials.
Condor and Condor-G
OGCE IFrame Portlet can be used to integrate external sites.
Client Libraries for Grid Computing
Two Major Grid Client Efforts
- The Java CoG Kit
  - Supports several versions of Globus and SSH.
- Condor-G
  - Has a Web Service interface (BirdBath) and Java client libraries.
  - Supports Globus (v2 and v4) and several other Grid middleware systems.
- You can build either portlets or Web services with either of these.
  - OGCE portlets primarily use CoG.
  - We prefer Condor-G based Web services for long-running jobs.
[Figure: CoG Abstraction Layers. Labels: Nanomaterials, Bio-Informatics, Disaster Management, Portals, Development Support, SSH, Others.]
[Class diagram: Task, TaskHandler, Service, TaskSpecification, SecurityContext, ServiceContact.]
- The class diagram is the same for all grid tasks (running jobs, modifying files, moving data).
- Classes also abstract toolkit provider differences. You set these as parameters: GT2, GT4, etc.
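The shape of this abstraction can be sketched as follows. This is not the Java CoG Kit API: it is a Python analogy that borrows the class names from the diagram, with invented fields, an invented `register` method, and a hypothetical host name, to show how one task model can serve every task type while the provider is just a parameter.

```python
from dataclasses import dataclass


@dataclass
class SecurityContext:
    credential: str  # e.g. a proxy credential; plain string in this sketch


@dataclass
class ServiceContact:
    host: str


@dataclass
class Service:
    provider: str  # toolkit provider chosen as a parameter: "GT2", "GT4", ...
    contact: ServiceContact
    security: SecurityContext


@dataclass
class Task:
    kind: str  # "job", "file operation", or "file transfer"
    specification: dict  # task details, same container for every kind
    service: Service
    status: str = "unsubmitted"


class TaskHandler:
    """Dispatches any task to the code registered for its provider."""

    def __init__(self):
        self._providers = {}

    def register(self, provider, submit_fn):
        self._providers[provider] = submit_fn

    def submit(self, task):
        submit_fn = self._providers[task.service.provider]
        task.status = submit_fn(task)
        return task.status


handler = TaskHandler()
handler.register("GT2", lambda task: f"{task.kind} submitted via GT2")
handler.register("GT4", lambda task: f"{task.kind} submitted via GT4")

service = Service("GT4", ServiceContact("grid.example.org"), SecurityContext("proxy"))
job = Task("job", {"executable": "/bin/date"}, service)
print(handler.submit(job))
```

Switching the same task to another toolkit means changing only the `provider` string on the `Service`.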
Coupling CoG Tasks
- The CoG abstractions also simplify creating coupled tasks.
- Tasks can be assembled into task graphs with dependencies.
  - Do Task B after successful Task A.
- Graphs can be nested.
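A small sketch of the task-graph idea, assuming Python 3.9+ for the standard `graphlib` module. The `TaskGraph` class is invented for this example and is not the CoG Kit's implementation. A task runs only if all of its prerequisites succeeded, and because `run` is itself a callable that returns success or failure, one graph can be added to another as a single task.

```python
from graphlib import TopologicalSorter


class TaskGraph:
    """Tasks plus 'run B after successful A' dependencies."""

    def __init__(self):
        self._actions = {}
        self._deps = {}
        self.results = {}

    def add(self, name, action, after=()):
        """Register a task; every name in `after` must also be added."""
        self._actions[name] = action
        self._deps[name] = set(after)

    def run(self):
        """Run tasks in dependency order; return True if all succeeded."""
        results = {}
        for name in TopologicalSorter(self._deps).static_order():
            if all(results[dep] for dep in self._deps[name]):
                results[name] = bool(self._actions[name]())
            else:
                results[name] = False  # skipped because a prerequisite failed
        self.results = results
        return all(results.values())


log = []

# Inner graph: stage-in must succeed before the compute step.
inner = TaskGraph()
inner.add("stage_in", lambda: log.append("stage_in") or True)
inner.add("compute", lambda: log.append("compute") or True, after=["stage_in"])

# Outer graph nests the inner graph as one task.
outer = TaskGraph()
outer.add("authenticate", lambda: log.append("authenticate") or True)
outer.add("inner", inner.run, after=["authenticate"])

print(outer.run(), log)
```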
Problems with Grid Client Development
- Grid portlets typically wrap each single Grid capability in a separate portlet.
  - Problem is that Grid portlets need to combine these operations.
- Portlets are entire web applications, so we need a component model for portlets: reusable portlet parts.
- Even with the CoG Abstraction Layer, we must still do a lot of coding to build new applications.
- To address these problems we have adopted JavaServer Faces (JSF).
  - Provides several nice Model-View-Controller features.
  - JSF provides an extensible framework (tag libraries) for making reusable components.
  - Apache JSF portlet bridge allows you to convert standalone JSF applications (development phase) into portlets (deployment phase).
Grid Tags: Associated Grid Beans and Features
- ComponentBuilderBean: Creating components, job handlers, submitting jobs
- MonitorBean: Handling monitoring page actions
- MultitaskBean: Constructing simple workflows
- MultitaskBean: Defining dependencies among sub jobs
- MyproxyBean: Retrieving MyProxy credentials
- FileOprationBean: Providing GridFTP operations
- JobSubmitBean: Providing GRAM job submissions
- FileTransferBean: Providing GridFTP file transfer
- ResourceBean: Describes common properties among all tags and beans. Passes values given by standard visual JSF components.
Managing Scientific Workflows
Scientific Workflows
- Portal interfaces encode scientific use cases.
- If you have a rich set of services, it is a lot of work to make portlets for all possible use cases.
  - And power users will always want something more.
- Example: our CICC project has dozens of chemical informatics Web services.
  - http://www.chembiogrid.org/wiki/
- Workflow composers can simplify this.
  - Allow users to encode and execute their own use cases.
Web Services and Workflows
- Perform a similarity search on the NIH DTP Human Tumor data.
- Filter the results based on pharmacokinetic properties (FILTER).
- Convert to 3D (OMEGA).
- Dock into a pre-defined protein (FRED).
- Visualize (JMOL).
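The steps above amount to a linear pipeline in which each service consumes the previous one's output. The sketch below shows only that chaining pattern: every function is a local stand-in for a remote Web service, the compound records are made up, and the filtering rule is arbitrary.

```python
from functools import reduce


def run_workflow(stages, data):
    """Feed the output of each stage into the next, in order."""
    return reduce(lambda value, stage: stage(value), stages, data)


def similarity_search(query):
    # Stand-in for the similarity search service; returns made-up records.
    return [{"id": f"{query}-{n}", "steps": []} for n in range(3)]


def pk_filter(records):
    # Stand-in for FILTER; the rule (drop the first record) is arbitrary.
    kept = records[1:]
    return [{**r, "steps": r["steps"] + ["FILTER"]} for r in kept]


def make_stage(tool):
    """Build a stand-in stage that only records which tool handled a record."""
    def stage(records):
        return [{**r, "steps": r["steps"] + [tool]} for r in records]
    return stage


workflow = [
    similarity_search,
    pk_filter,
    make_stage("OMEGA"),  # convert to 3D
    make_stage("FRED"),   # docking
    make_stage("JMOL"),   # visualization
]

results = run_workflow(workflow, "query")
for record in results:
    print(record["id"], record["steps"])
```

A workflow composer lets users assemble and reorder such a list graphically instead of in code.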
Taverna workflow connects remote services.
OGCE's XBaya Workflow Composer
Future of Science Gateways
Updating the Octopus
[Figure: updated SOA diagram. Labels: Host 1, Host 2, Host 3; protocols: HTTP(S), RSS, JSON/HTTP, JDBC.]
Semantic Web: RDF, OWL, ontologies
Microformats, folksonomies
Microformats, KML, and GeoRSS feeds used to deliver SAR data to multiple clients.
More Information
- Contact me: email@example.com
- See what I'm up to: http://communitygrids.blogspot.com/
- OGCE software: http://collab-ogce.org/
- QuakeSim: http://www.quakesim.org/
- CICC: http://www.chembiogrid.org/wiki/
- Lots of people worked on all of these.