Quarterly Meeting, December 5-7, 2006 Science Gateways Update Nancy Wilkins-Diehr Science Gateways Area Director


Page 1: Science Gateways Update

Quarterly Meeting, December 5-7, 2006

Science Gateways Update

Nancy Wilkins-Diehr
Science Gateways Area Director

Page 2: Science Gateways Update


Today’s Presentation

• Q1FY07
  – Progress on top five goals
  – Activities
• Gateway Futures
• Gateway Highlights
• Q2FY07
  – Top five goals
• Policy discussion on Thursday
  – Community accounts
  – Community software areas

Page 3: Science Gateways Update


Top 5 Goals for Q4FY06

1. Turn Primer into Gateway documentation
   • Primer available from public gateway page
2. Improve public web page and involvement with ER group
   • Some image collection, but need more content from gateways
3. Provide capability to regularly report number of production users
   • Still needs work
4. GCE06, SC06 talks and BOF
   – Paper presentation at GCE06
   – Many successful booth talks and BOF
   – Tremendous interest in the program
5. Nancy to develop demo expertise with several gateways
   • SPRUCE demo at USC Geosciences workshop
• Increase usage of community accounts (Science Gateway link at http://accounts.teragrid.org/user_services/TGU/)
   • Still needs work

Page 4: Science Gateways Update


Q1FY07 Activities

• October
  – Web services work
    • WebMDS discussions, new view deployed
    • Improvements needed in TG documentation and in deployment of TG web services
    • Additional issues to be addressed by new gateway staffer Steve Mock
  – Von Welch presents on attribute-based authentication
  – Collection of content for SC06: posters, film reels
  – GRAM audit capability tested, recommended for inclusion in CTSSv4
• November
  – DN mapping discussions: can we support the VO model?
    • For example, an OSG user may have their DN mapped to a single team uid, as well as the same DN mapped to their own uid. This could potentially cause problems for global filesystems and other grid tools that rely on the ability to uniquely identify a user on a particular grid.
  – SC06
• December
  – Presentation from JP on information services
  – Introduction of web services staffer Steve Mock
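
The DN mapping concern raised under November can be illustrated with a small check. This is only a sketch, not a TeraGrid tool: it assumes mappings are kept in a Globus-style grid-mapfile, where each line holds a quoted DN followed by one or more comma-separated local accounts. The function name `find_ambiguous_dns`, the sample DNs, and the account names are hypothetical.

```python
import shlex
from collections import defaultdict


def find_ambiguous_dns(mapfile_text):
    """Return {dn: sorted local accounts} for every DN mapped to more than one account.

    Assumes grid-mapfile-style lines: "<quoted DN>" account[,account...]
    """
    accounts_by_dn = defaultdict(set)
    for raw_line in mapfile_text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        parts = shlex.split(line)  # honours the double quotes around the DN
        if len(parts) < 2:
            continue  # malformed line: no local account given
        dn, accounts = parts[0], parts[1]
        for account in accounts.split(","):
            if account:
                accounts_by_dn[dn].add(account)
    return {dn: sorted(accts) for dn, accts in accounts_by_dn.items() if len(accts) > 1}


# Hypothetical sample: one DN mapped both to a personal uid and a team uid.
SAMPLE = '''
# personal mapping
"/C=US/O=Example Grid/CN=Alice Example" alice
# team (VO) mapping for the same person
"/C=US/O=Example Grid/CN=Alice Example" team01
"/C=US/O=Example Grid/CN=Bob Example" bob
'''

for dn, accounts in find_ambiguous_dns(SAMPLE).items():
    print(dn, "->", ", ".join(accounts))
```

A DN that shows up in the result is exactly the case described above: tools that expect one uid per user on a given grid cannot tell which account applies.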

Page 5: Science Gateways Update


Gateway Timeline

• October, 2004: “TeraGrid Science Gateway” term originates
  – We will help them build gateway portals that leverage TeraGrid capabilities and provide web-based interfaces to community tools. Typical services provided will include access to the following:
    • Data: metadata catalogs for the community data resources, the user’s experiments, and remote files, with access via browsable directories, query interfaces, or indexes.
    • Analysis: hyperlinked visualization and other data analysis and grid-enabled desktop tools.
    • Applications: applications encapsulated as web services and given a user interface in the portal. The portal manages back-end job management and, based on the user’s authorization capabilities, the level of resources applied to the user’s request.
    • Collaboration: newsgroups, shared data spaces, “publication” mechanisms.
    • Workflow: tools that enable the user to compose TeraGrid and application services to create new applications to be “published” for others to use.
• Spring, 2005: Science Gateway RAT begins, “origin of the RAT”
• October, 2005: Most gateways begin charging in earnest due to funding delays

Page 6: Science Gateways Update


Gateway Timeline

• 2006 Program Plan: these are the milestones we’ve been addressing all year and reporting on in these meetings
  – Changes in allocation procedures, the mechanisms used to evaluate science impact, and models for identity management, authentication and authorization that are more tuned to virtual organizations.
  – The science gateways work also involves deployment of Web services.
  – We must design a set of procedures and policies that allow an RP to enable a new science gateway using a standard approach.
    • Similarly, we must present the science gateway community with a consistent set of RP services and interfaces.
  – While much of the “Deep” computing area deals with computation and management of large data sets, science gateways are often focused on access to data collections, repositories, and libraries. Integrating these approaches with simulation and analysis tools will require that we specify a set of standard interfaces for data collections, building on our current standard data interfaces (e.g., SRB, GridFTP).
  – Instruments will be the next type of resource for which we must provide interface recommendations and, along with data, allocations and accounting processes.
  – As we move to GT4 and Web services, we anticipate the need to evaluate workflow packages and begin to support at least a core set of tools in this area.
  – A key focus area in 2006 is information services. Multiple information services running on the TeraGrid, some of which are specifically deployed for individual efforts

Page 7: Science Gateways Update


Gateway Futures

• Gateways move into production in FY07
  – Identify and address issues blocking production
• Future activities may include
  – Targeted support for new gateways
    • ASTA-like model
  – Generalized help desk support
    • Gateway developers are a growing community with unique needs
  – Getting started support and documentation
    • Portal frameworks
    • Data management approaches
    • Workflow tools
    • Collaboration tools
    • Gateway-in-a-box
  – Development of tools production gateways can use
    • Tracking number of users, use of TG resources
    • Accounting/authorization tools
    • Citation capabilities
    • Proposal tips
    • RP capabilities: CTSS SGW kit
    • Other general capabilities as needed
• Conversations ongoing with gateway PIs
  – Preparing for January Program Plan
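
One of the tools listed above, tracking the number of users, could be sketched as follows. This is illustrative only and assumes things the slides do not specify: that a gateway using a community account records its own per-user job submissions as `(gateway, gateway_user, submit_date)` tuples, and that the fiscal year begins on October 1 (inferred from the Q1FY07 activities being dated October through December 2006). All function and gateway names are hypothetical.

```python
from collections import defaultdict
from datetime import date


def fiscal_quarter(day):
    """Label a date like 'Q1FY07', assuming the fiscal year starts on October 1."""
    if day.month >= 10:
        # October-December belong to Q1 of the next fiscal year.
        return f"Q1FY{(day.year + 1) % 100:02d}"
    quarter = (day.month - 1) // 3 + 2  # Jan-Mar -> Q2, Apr-Jun -> Q3, Jul-Sep -> Q4
    return f"Q{quarter}FY{day.year % 100:02d}"


def users_per_quarter(job_records):
    """Count distinct gateway users per (gateway, fiscal quarter).

    job_records: iterable of (gateway, gateway_user, submit_date) tuples,
    a hypothetical log kept by the gateway itself, since jobs run under a
    community account all look like one user to the resource provider.
    """
    seen = defaultdict(set)
    for gateway, user, day in job_records:
        seen[(gateway, fiscal_quarter(day))].add(user)
    return {key: len(users) for key, users in seen.items()}


records = [
    ("example-gateway", "user-a", date(2006, 10, 3)),
    ("example-gateway", "user-a", date(2006, 11, 20)),  # same user, counted once
    ("example-gateway", "user-b", date(2006, 12, 1)),
    ("example-gateway", "user-a", date(2007, 1, 15)),
]
print(users_per_quarter(records))
```

Counting distinct users rather than jobs is the design choice here: the goal stated in these slides is the number of production users, and a single active user can submit many jobs.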

Page 8: Science Gateways Update


OSG Interop Highlights [1/2]

• Q1FY07 Accomplishments
  – I worked with two groups providing general frameworks for application hosting to interface with TeraGrid’s grid services
    • P-Grade/GEMLCA
    • Application Hosting Environment (AHE)
  – Both AHE and P-Grade teams had successful demos at SC in the TeraGrid booth
  – These demonstrations showed
    • At a workflow level and service hosting level, interoperability between OSG, TG, NGS, EGEE is possible
    • The development effort to add new applications to be hosted in the framework is small enough that it could be used as a means to significantly scale the number of “gateways” using TeraGrid

Page 9: Science Gateways Update


Application Hosting Framework Plans [2/2]

• Q2FY07 Plans
  – I will continue to work with AHE and P-Grade groups to further enhance the capabilities in their frameworks, such as
    • Service persistence / recovery
    • Throughput
    • Scalability
    • Resource selection
    • Throttling controls
  – I will write a paper on Application Hosting Frameworks that can be used to coordinate more development effort from other grids that are investigating similar application hosting framework infrastructure

Page 10: Science Gateways Update


NVO/HEP Gateway Highlights

• Q1FY07 Accomplishments
  – New access mechanism: fat browser
    – connects directly to TG resource, cert is stored in browser
    – complements existing web proxy and command-line access
  – New HEP service available
    – Pythia event simulator
    – complements existing services: Mosaic, Multicutout, Coaddition
  – Documentation elaborated
    – http://us-vo.org/nesssi
    – New chapter for NVO book (publication spring 07)
  – Science
    – Cutouts used for real-time assessment of astronomical transients
    – Galaxy morphology code compiled and running
• Q2FY07 Plans
  – Reduce code base by making ServiceDescription.xml
    – makes forms for users without code
  – Run Galaxy Morphology code
    – using entire SDSS image set at SDSC
• Number of users using TG resources through the gateway
  – Still only a handful. We need to have the courage to advertise ....

Page 11: Science Gateways Update


GISolve Q1FY07 Accomplishments Highlights [1/2]

• Background
  – Geographic Information Science (GIScience), an interdisciplinary field involving geography and other social sciences, computer science, geodesy, information sciences, and statistics to study generic issues about the development and use of geographic information systems (GIS) technologies
• Milestones
  – Regular number of users: approximately 25
  – A demo and talk at SC|06 TeraGrid booth
  – A proposal has been approved to use GISolve in the following courses at the University of Iowa during 2007
    • Foundations of Geographic Information Systems (undergraduate)
    • Principles of Geographic Information Systems (undergraduate and graduate)
    • Bayesian Statistics (graduate)
    • Computing in Statistics (undergraduate and graduate)
• Describe impact on science
  – Produced the following research publications
    • Yan, J., Cowles, M. K., Wang, S., and Armstrong, M. P. 2006. “Parallelizing MCMC for Bayesian Spatiotemporal Geostatistical Models.” Statistics and Computing, Under Review
    • Wang, S., and Armstrong, M. P. 2006. “A Theory of the Spatial Computational Domain to Support Use of Cyberinfrastructure in Geographical Analysis.” the International Journal of Geographical Information Science, revised and resubmitted
  – Provided user-friendly capabilities for performing geographic information analysis using computational Grids
  – Helped non-technical users directly benefit from accessing Grid capabilities

Page 12: Science Gateways Update


Q2FY07 Plans [2/2]

• Milestones
  – Double the number of users
    • This will be accomplished through teaching use in next spring semester
  – Add a new analysis module (portlet)
  – Develop some initial on-line documentation to train developers for creating new gateways
  – Write one paper that describes service-oriented GIScience based on GISolve experience
• Describe impact on science
  – GISolve can support scientific investigations and decision-making in a wide variety of application domains (e.g., environmental science, public health, and social sciences) by enabling grid-based computationally intensive spatial-temporal data analyses
  – GISolve allows users to interactively steer computationally intensive spatial-temporal data analyses using TeraGrid

Page 13: Science Gateways Update


RENCI Bioportal

• Q1FY07 Accomplishments
  – Security enhancements required for job submissions from Workflows to TG
  – Jobs generated via Workflows can now be sent to TG resources
  – Continued Workflow developments with targeted researchers such as Erik Jacobsen and Fred Wright: focused efforts with individual high profile scientists
  – Additional testing of TG core services: auditing and accounting
  – SC’06 presentations, including Erik Jacobsen’s Motif Network workflow
  – Workshops for students and faculty at NC universities
• Q2FY07 Plans
  – Hardening and production deployment of TG powered Workflows
  – Work with two more high profile scientists to develop workflows
  – Assist with requirements and testing of TG core web services developments
  – Introduce a platform to enable domain scientists to develop complex scientific workflows that leverage the TeraGrid, with little to no assistance from Computer Scientists or IT professionals.
  – Q3FY07: Provide publicly available Bio web services powered by TG resources

Page 14: Science Gateways Update


Life Science Gateway [1/1]

• Q1FY07 Accomplishments
  – Completed high-throughput bioinformatics web service
    • Performance is lower than expected
  – Scientists happy to have resources, but need them to perform faster
  – The set of applications is sufficient, but a strategy for more resources is desired
  – Presented LSGW progress at SC’06, had a user meeting to gather feedback in person
• Q2FY07 Plans
  – Analyzing performance issue, determining the corrective action
  – Working with community to refine interfaces to web services so they are simpler to use
  – Integrate more resources, as available
  – Preparing for a potential ?RAC proposal to get more resources for the regular, large jobs the community wants to run

Page 15: Science Gateways Update


LEAD Quarterly Gateway Highlights

• Q1FY07 Accomplishments
  – At Supercomputing 2006 in Tampa, LEAD demonstrated the following new capabilities
    • Scheduled LEAD workflows using SPRUCE Emergency Scheduling tokens.
    • Integrated with VGrADS (NSF sponsored large ITR) to schedule LEAD workflows with a deadline
  – LEAD Gateway accomplishments in Q1FY07
    • Alpha release to a selected atmospheric science community composed of faculty, researchers and educational users (about 20 testers).
    • Demonstrated the capability to dynamically launch forecasting workflows based on a trigger generated by data mining radar data from a user-selected region.
    • Based on user feedback, integrated extensive data search capabilities into the gateway.
      – User selects a region of the country, the type of data they need and a time period of interest.
      – Data search consults a metadata catalog that is built by crawling on-line weather data sources.
      – Returns results to the user, who may import them to their private directory or visualize them with a standard desktop tool.
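
The data search flow described above (region, data type, time period, then a metadata catalog lookup) can be sketched as a simple filter. This is not the LEAD implementation; the `CatalogEntry` fields, the bounding-box representation of a region, and the sample entries are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class CatalogEntry:
    """One hypothetical metadata record produced by crawling a data source."""
    name: str
    data_type: str
    lat: float
    lon: float
    observed: datetime


def search_catalog(catalog, data_type, lat_range, lon_range, start, end):
    """Return entries of the requested type inside the region and time window."""
    lat_min, lat_max = lat_range
    lon_min, lon_max = lon_range
    return [
        entry for entry in catalog
        if entry.data_type == data_type
        and lat_min <= entry.lat <= lat_max
        and lon_min <= entry.lon <= lon_max
        and start <= entry.observed <= end
    ]


catalog = [
    CatalogEntry("radar-scan-1", "radar", 41.9, -87.6, datetime(2006, 11, 14, 18, 0)),
    CatalogEntry("radar-scan-2", "radar", 35.2, -97.4, datetime(2006, 11, 14, 18, 0)),
    CatalogEntry("surface-obs-1", "surface", 41.8, -87.7, datetime(2006, 11, 14, 18, 0)),
]

hits = search_catalog(
    catalog,
    data_type="radar",
    lat_range=(41.0, 43.0),
    lon_range=(-89.0, -87.0),
    start=datetime(2006, 11, 14, 0, 0),
    end=datetime(2006, 11, 15, 0, 0),
)
print([entry.name for entry in hits])
```

The matching entries are what the gateway would then offer for import into the user's private directory or for visualization.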

Page 16: Science Gateways Update


Q2 Plans

• Q2FY07 Plans
  – Educational partners to use LEAD gateway to participate in National collegiate forecasting contest
  – Major science goal: Continue to develop the capability to launch dynamic adaptive workflows based on detected weather events. This requires:
    • Tight integration of data mining with complex event stream processing and workflow.
    • The ability for the workflow to be modified on-the-fly by a user.
    • Integration of dynamic resource allocation based on storm intensity and size.
  – Integration of an experimental data/workflow provenance system into the gateway.

Page 17: Science Gateways Update


Driving Dynamic Workflows from the Gateway

–“track the weather over Chicago for the next 400 hours and run a tornado workflow if a supercell is detected.”
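
A request like the one quoted above can be read as "watch an event stream for a fixed window and launch a workflow on the first matching detection". The sketch below shows that control flow only. It assumes detections arrive as time-ordered `(timestamp, region, event_type)` tuples from some data mining step; the function names and the `launch` callback are hypothetical, not LEAD APIs.

```python
from datetime import datetime, timedelta


def watch_region(detections, region, start, hours, launch):
    """Launch a workflow on the first supercell seen in `region` within the window.

    detections: time-ordered iterable of (timestamp, region, event_type) tuples.
    launch: callback invoked as launch(region, timestamp); its result is returned.
    Returns None if the window ends without a matching detection.
    """
    deadline = start + timedelta(hours=hours)
    for timestamp, where, event_type in detections:
        if timestamp > deadline:
            break  # monitoring window has expired
        if timestamp >= start and where == region and event_type == "supercell":
            return launch(region, timestamp)
    return None


def launch_tornado_workflow(region, timestamp):
    # Stand-in for submitting a real forecasting workflow.
    return f"tornado workflow for {region} triggered at {timestamp.isoformat()}"


start = datetime(2006, 12, 5, 0, 0)
stream = [
    (datetime(2006, 12, 5, 6, 0), "Chicago", "rain"),
    (datetime(2006, 12, 6, 3, 0), "Tulsa", "supercell"),
    (datetime(2006, 12, 7, 21, 30), "Chicago", "supercell"),
]
print(watch_region(stream, "Chicago", start, 400, launch_tornado_workflow))
```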

Page 18: Science Gateways Update


TeraGrid Visualization Gateway

• Q4FY06 Accomplishments
  – Presentations & demonstrations of TeraGrid Visualization beta portal at SC06
  – Collaborative and Remote Visualization Functionality
    • Remote ParaView on UC/ANL systems
    • Remote & Collaborative Visualization on Maverick
  – Offers support for TeraGrid User Portal accounts and community accounts
    • Current TeraGrid User Portal users can login with their TGUP account and have full access to Viz Gateway.
    • Community users can create their own accounts with restricted access.
• Q1FY07 & Future Plans
  – Continue development to production quality portal
    • Clean up and polish existing portlets
    • Add auditing on portal end and through GRAM auditing
    • Continue adding functionality for community users, in addition to segregating community user data
  – Integrate Additional Visualization Tools
    • Expand current set of visualization tools and work with other RP sites to see what they can offer.
    • Look into registering ‘visualization services’ with the portal, either through web services or importing additional functionality.
  – Milestones:
    • Have portal v1.0 complete and in production by TG07 Conference (and TG Viz User’s Workshop)
    • Paper submission to TG07 Conference

Page 19: Science Gateways Update


Remote Visualization Portlet

Page 20: Science Gateways Update


ParaView Portlet

Page 21: Science Gateways Update


Neutron Science TeraGrid Gateway (NSTG) Highlights

• Q1FY07 Accomplishments
  – Neutron Portal up and running at SNS.
    • XCAMS AA
    • Live Data Archive (paced by SNS commissioning)
    • Publicly available: http://portal.sns.gov/snsportal
    • Searchable; Integrated Viz.; NeXus format for data (community standard)
  – Coupling: TeraGrid access and SNS simulations. Results in portal
  – Milestone: Work with GIG efforts and RP operations on advanced scheduling, especially advanced reservations. Provide requirements input from NSTG and neutron users’ needs and help develop resource policy environment to deploy advanced scheduling tools.
    • Input provided: Advanced reservations needed. (One of many expressed needs for advanced scheduling)
    • New SDSC “on-demand” resource interesting future prospect.
  – Continued Collaborations
    • PSI workshop “International Workshop on Applications of Advanced Monte Carlo Simulations in Neutron Scattering” http://lns00.psi.ch/mcworkshop/
• Q2FY07 Upcoming
  – Ongoing: SNS collaboration during ramp up.
  – “Look at” multi-portal access to SNS and other facility data
  – “Look at” continued Collaborations: ESnet, UK eScience, OSG, NCAR/ESG, ITER, Relevant SciDAC efforts
• In the Candy Store Window
  – RLM: SDSC will provide a modest on-demand experimental resource; match to expressed neutron user community need for time-slot resource reservations.
  – Information Services: Application service and XML publishing
• We are Hiring: Looking for a good postdoc with Portal experience, storage management, high-performance data movement.

Portal Access to SNS Data

Portal Access to Simulation

Note simulation corresponds to SNS Backscattering Spectrometer, including realistic flight path and crystal geometry.

Compare to corresponding real data. Black holes correspond to tube dead spots. Simulations allow calibrations of dead spots.

Page 22: Science Gateways Update


Q1FY07 Top 5ish

1. CY07 Program Plan
2. Identify issues that are blocking current production
3. Provide capability to regularly report number of production users
4. Begin work on web services

Page 23: Science Gateways Update


Thank you for allowing me to present remotely!