Environment from the Molecular Level: An e-science project for modelling the atomistic processes involved in environmental issues (funded by NERC)



Molecular Environmental Issues

Rocks and Mineral Structures

• Radioactive waste disposal

• Crystal growth and scale inhibition

• Pollution: molecules and atoms on mineral surfaces

• Crystal dissolution and weathering

The “Grand Challenge”.

[Figure: the problem space, spanned by three axes]

• Level of theory: Quantum Monte Carlo; large empirical models; linear-scaling quantum mechanics

• Adsorbing surface: clays, micas; aluminosilicates; natural organic matter; phosphates; carbonates; oxides/hydroxides; sulphides

• Contaminant: organic molecules; halogens; metallic elements

Requires scientists to work together in teams - a Virtual Organisation

Design

Approach taken:

– Over approximately 3 years we have engaged in many workshops, tutorials and prototyping sessions with developers and users, teaching users what e-Science can “do for them”, including security.

• Cooperation between CCLRC and NIEeS in Cambridge.

– Planned to integrate some tools which had already been developed or prototyped at CCLRC, UCL and Reading.

• A service-oriented approach is used for certain aspects: Grid, data management, user interfaces, metadata management. Workflow was found to be important to users, e.g. for combinatorial studies.

• Several iterations of software have enabled some usability issues to be addressed.

– Originally envisaged an “Integrated Portal Architecture” linking HPCPortal, DataPortal and visualisation services.

• We thought we knew what users would like, but actually they preferred a simpler incremental approach;

• Workflow scripting was preferred to a single portal. There are now several separate tools in use.
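The combinatorial studies mentioned above can be illustrated with a short script. This is only a sketch: the axis values are copied from the “Grand Challenge” slide, while the job naming scheme and the dictionary layout are assumptions made for illustration, not the project's actual workflow scripts.

```python
from itertools import product

# Axis values come from the "Grand Challenge" slide. The identifiers,
# job names and dictionary layout below are illustrative assumptions.
LEVELS_OF_THEORY = [
    "quantum-monte-carlo",
    "large-empirical-model",
    "linear-scaling-qm",
]
SURFACES = [
    "clays-micas",
    "aluminosilicates",
    "natural-organic-matter",
    "phosphates",
    "carbonates",
    "oxides-hydroxides",
    "sulphides",
]
CONTAMINANTS = ["organic-molecules", "halogens", "metallic-elements"]


def enumerate_jobs(levels, surfaces, contaminants):
    """Return one job description per (level, surface, contaminant) triple."""
    jobs = []
    combinations = product(levels, surfaces, contaminants)
    for index, (level, surface, contaminant) in enumerate(combinations):
        jobs.append(
            {
                "name": f"job{index:03d}",
                "level_of_theory": level,
                "surface": surface,
                "contaminant": contaminant,
            }
        )
    return jobs


if __name__ == "__main__":
    all_jobs = enumerate_jobs(LEVELS_OF_THEORY, SURFACES, CONTAMINANTS)
    print(len(all_jobs), "jobs; first:", all_jobs[0])
```

Even this small grid produces dozens of independent calculations, which is one reason a scripted workflow may suit such studies better than submitting each job by hand through a portal.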

E-Minerals Portal

Technical Strategy

• Technology considerations:

– Considered: Globus GT2, SRB, Harness, CCF, Portal, Web services, visualisation tools

• Various tool sets were tried and the users “voted with their feet”

– Used: Globus, Condor, SRB, AG, MAST, RCommands, Metadata Editor, Workflow scripts, Web services, XML/ RDF/ OWL for data interoperability.

• Infrastructure

– E-Minerals “mini-Grid” was a great success, based on earlier work at Daresbury and Manchester on Grid evaluation. The mini-Grid focuses the resources of the e-Minerals VO and includes large campus Condor pools and parallel computers. It uses Globus, Condor and GSI, with data managed using SRB.

• Collaboration tools

– Access Grid, MAST, Wiki

Integrated Portal Architecture

Generic portal design using Globus and Web Services:

[Diagram: Visualisation, DataPortal and HPCPortal, each with a Web Services interface, connected to the HPC Systems and Data Systems through Globus, GSI and GridFTP.]

Working with GGF Grid Computing Environments Research Group

Development Issues

• Constraints and other issues:

– Project divided from outset into:

• development team; • application team; • science team.

– All teams work together and collaborate on papers

– Tools written in C to integrate with existing “heritage” applications, e.g. from the Collaborative Computational Projects (CCPs)

– Other interoperability issues addressed using Web services, e.g. gSOAP (client) +AXIS (server), XML-based data models and Semantic Grid technologies RDF+OWL

– Constraints: short term goals, no prior experience of e-Science, new technology must not disrupt current work.

– High requirements on computing resources for simulation studies

• This led to a focus on workflows for repeated calculations, data management for storing and retrieving results, and semantic Web technologies for data interoperability between codes

Evaluation

• Papers presented at All Hands 2005 included:

– E-Science Usability: the e-Minerals Experience (paper 425)

– The e-Minerals Project: Developing the Concept of the Virtual Organisation to support Collaborative Work on Molecular-scale Environmental Simulations (paper 518)

• User engagement and evaluation:

– Looked at the Usability Task Force metrics.

– Our approach did not readily map onto them, but there are overlaps

– Key: understand the science users, their needs, and their natural ways of working.

– Good and bad points summarised on next slides

Lessons Learnt

What was usable?

– Keep it simple – use effective lightweight tools for the job

– Condor and Globus – Condor job scripts were accepted readily. Condor-G and DAGMan now used. RSL also embedded in scripts.

– SRB – required little training and was found to be useful; SCommands are used in scripts.

– Resource Management – Globus-based resource-monitoring tool was developed (in the Portal). A meta-scheduler is being developed.

– Security – GSI proved “easy for users to work with”. The Portal uses MyProxy to ensure pervasive access. Certificates were not a problem – we offered training from Day 1.

– Collaboration tools – desktop use of AG enables ad hoc meetings + MAST (Multi-cast Application Sharing Tool). Wiki and Instant Messaging also used.

– Semantic technologies – CML was initially used with XSLT and SVG. This is now extended in the AgentX toolkit.
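As an illustration of the Condor job scripts and DAGMan workflows mentioned above, the sketch below generates a submit description and a DAG file as plain text. The executable name, file names and job tags are hypothetical, and only the vanilla universe is shown; the settings needed for Condor-G submission to Globus resources are not covered here.

```python
def submit_description(executable, arguments, tag):
    """Build the text of a Condor submit description for one job.

    The executable, arguments and tag are placeholders chosen by the caller.
    """
    lines = [
        "universe = vanilla",
        f"executable = {executable}",
        f"arguments = {arguments}",
        f"output = {tag}.out",
        f"error = {tag}.err",
        f"log = {tag}.log",
        "queue",
    ]
    return "\n".join(lines) + "\n"


def dag_description(simulation_tags, collect_tag):
    """Build a DAGMan input file: all simulations run before one collect job."""
    lines = [f"JOB {tag} {tag}.sub" for tag in simulation_tags]
    lines.append(f"JOB {collect_tag} {collect_tag}.sub")
    lines.append(f"PARENT {' '.join(simulation_tags)} CHILD {collect_tag}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical names: run_sim.sh and the input files are not from the slides.
    tags = ["sim000", "sim001"]
    for tag in tags:
        print(submit_description("run_sim.sh", f"{tag}.in", tag))
    print(dag_description(tags, "collect"))
```

Writing these two strings to files would give DAGMan a workflow in which the collect step starts only after every simulation has finished.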

Lessons Learnt

What was not usable?

– Client tools * – installation has caused difficulties, e.g. Globus. Initially used “submit machines”. Solutions investigated include:

• Portal – hides the complexity behind a Web interface, user doesn’t install anything;

• Web service interfaces – for Condor (Chapman et al.), GROWL for Globus and SRB (Allan et al.);

• BPEL interface – work at UCL/ OMII – plug-in for Eclipse.

– Firewall issues – for both users and infrastructure – changes to rules lead to instability. Portal and Web services solve this problem for users.

– Meta-data – tools are available, but automatic harvesting is required to avoid mistakes. RCommands were developed to improve this and can be linked into the workflow scripts.

* A recent workshop “Lightweight Grid Computing” was held 2-3/5/06 at Losehill Hall. Attendees from GROWL, RealityGrid, Imperial College, e-Minerals, e-CCP… Transcript of discussions on usability issues is available giving more detailed information.

Future Plans

Current and future development plans:

– New tools are being developed; for instance, the meta-data editor and RCommands were recently added to the suite.

– AgentX data-interoperability tools have been added from e-CCP extending the use of CML. Such work is now timely and illustrates how existing large codes, e.g. Siesta and GULP from CCP5 can be integrated easily with visualisation tools.

– Development staff also work on other projects and with other developers. E-Minerals tools are now being evaluated in other areas, e.g. Integrative Biology and e-CCP. There are key synergies and critical mass, sharing of experiences and code/ services.

– Full integration via a portal interface was not initially wanted, and also could not be achieved at the start of the project as the technology was not adequate (we tried PHP, now have JSR-168). This is now being re-visited as it provides a good solution to many of the problems highlighted.

– Portlet-based tools from the NGS Portal can be re-used; this has already been done for Integrative Biology and other projects. They can be combined with Wiki etc.

The following slides show more details of some of the tools.

Blatant advert: Portals and Portlets 2006 http://www.nesc.ac.uk/esi/events/686/

AgentX Framework - Overview

Specify how to locate data (XML, CML, XLink) with a particular meaning

Applications can use tools (AgentX library) that work with the specification to obtain information

Classes and properties of entities are specified in an ontology (OWL, RDF/XML)

Mappings (RDF/XML) associate classes and properties with fragment identifiers (XPointer)

Fragment identifiers can be used to locate logical collections (classes) and data items (properties)

[Figure: Ontology, Mappings, Data. The ontology terms MOLECULE, ATOM and xCoordinate are mapped to the fragment identifiers “Mol_frag_id”, “Atom_frag_id” and “xCoor_frag_id”, each acting as a locator into the data, here a water molecule:]

Atom   x        y        z
O      0.000    0.000    0.000
H      0.000    0.757    0.587
H      0.000   -0.757    0.587
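The chain from ontology term to mapping to data item can be sketched in a few lines of Python. This is not the AgentX API: the element names, the term names other than xCoordinate, and the functions `select` and `get_value` are assumptions for illustration, and simple ElementTree paths stand in for the XPointer fragment identifiers.

```python
import xml.etree.ElementTree as ET

# A simplified, CML-like document holding the water molecule from the slide.
# Element and attribute names are illustrative, not real CML.
DATA = """
<molecule id="water">
  <atom element="O" x="0.000" y="0.000" z="0.000"/>
  <atom element="H" x="0.000" y="0.757" z="0.587"/>
  <atom element="H" x="0.000" y="-0.757" z="0.587"/>
</molecule>
"""

# Mappings: an ontology class maps to a locator for a collection of elements,
# and an ontology property maps to the attribute holding the data item.
CLASS_LOCATORS = {"Atom": "./atom"}
PROPERTY_LOCATORS = {
    "elementType": "element",
    "xCoordinate": "x",
    "yCoordinate": "y",
    "zCoordinate": "z",
}


def select(root, class_name):
    """Return the elements that the mapping associates with an ontology class."""
    return root.findall(CLASS_LOCATORS[class_name])


def get_value(node, property_name):
    """Return the data item that the mapping associates with a property."""
    return node.get(PROPERTY_LOCATORS[property_name])


if __name__ == "__main__":
    document = ET.fromstring(DATA)
    for atom in select(document, "Atom"):
        coordinates = [
            float(get_value(atom, name))
            for name in ("xCoordinate", "yCoordinate", "zCoordinate")
        ]
        print(get_value(atom, "elementType"), coordinates)
```

The calling code only ever names ontology terms; supporting a different file layout would mean changing the two mapping dictionaries rather than the application.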

AgentX Framework - Example

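DL_POLY3 (CCP5) integrated with CCP1 GUI

[Diagram: DL_POLY3, with its CONTROL and CONFIG.xml files and their mappings, uses the AgentX core through the Fortran wrapper. The CCP1 GUI, with REVCON.xml and its mappings, uses the AgentX core through the Python wrapper. The standard ontology and standard mappings sit between the two.]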

AgentX

- Core library written in C

- Wrappers for Python, Perl and Fortran

- Hides the complexities of dealing with XML

- Simple API

- Enables straightforward exchange of information

RCommands

• RCommands are shell tools and associated Web services for meta-data manipulation

• The primary use case for RCommands is within the e-Minerals workflow, i.e. to allow automatic insertion of meta-data as a post-processing action

Function Domain             RCommand
Authentication / Session    Rinit, Rexit, Rpasswd
Entity Operations           Rls, Rcreate, Rrm
Parameter Operations        Rannotate, Rsearch
Permissions                 Rchmod
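The post-processing step described above can be sketched as a small harvesting function that pulls name-value pairs out of a simulation output file. Both the output format and the function name are hypothetical. The slides do not give the command-line syntax of Rannotate, so the sketch stops at producing the pairs and does not invoke any RCommand.

```python
import re

# Hypothetical output format: "name = value" lines mixed with free text.
PAIR_PATTERN = re.compile(r"^\s*([A-Za-z_][\w ]*?)\s*=\s*(\S.*?)\s*$")

SAMPLE_OUTPUT = """\
run started
temperature = 300.0
pressure = 1.0
final energy = -1234.5 eV
run finished
"""


def harvest_pairs(output_text, wanted_names):
    """Collect the wanted name-value pairs from the text of an output file."""
    pairs = {}
    for line in output_text.splitlines():
        match = PAIR_PATTERN.match(line)
        if match and match.group(1) in wanted_names:
            pairs[match.group(1)] = match.group(2)
    return pairs


if __name__ == "__main__":
    harvested = harvest_pairs(SAMPLE_OUTPUT, {"temperature", "final energy"})
    for name, value in harvested.items():
        # In a real workflow each pair would be handed to the meta-data tools.
        print(f"{name}: {value}")
```

Harvesting the values directly from the output avoids the transcription mistakes that manual meta-data entry invites.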

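RCommands Service-based Architecture

[Diagram: on the client side, the RCommands (using gSOAP) and a BPEL engine communicate over SOAP with the server side, where the RCommand server code runs under Axis and reaches a relational database through JDBC. The SOAP interface links RCommands into workflows.]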

Subset of Schema

[Diagram labels: Name Value Pairs, and three groups of fields:]

• Title, Description, Notes, Start / End Dates, Originator

• Name, Description

• Name, URI

Royal Institution

University of Reading