EMBL-EBI
Structural Proteomics Automatic Target Selection
Gordon Whamond
Project Overview
Aim:
• Provide a resource that facilitates the automatic selection of potential targets for protein structure determination while minimising human interaction with the software (if required).
Input:
• Raw amino acid sequence
• UniProt accession number
• UniProt accession number and a sequence range
Output:
• Query sequence showing possible domains
• All candidates for structure determination
• Recommendation for which sequence to use
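The three input types listed in the overview could be told apart with a small parser. This is a minimal sketch, not the project's implementation: the accession pattern is the one UniProt publishes, but the `ACCESSION:START-END` range syntax, the `Query` class and the `parse_query` function are assumptions made for illustration.

```python
import re
from dataclasses import dataclass
from typing import Optional, Tuple

# UniProt's published accession pattern.
ACCESSION = r"(?:[OPQ][0-9][A-Z0-9]{3}[0-9]|[A-NR-Z][0-9](?:[A-Z][A-Z0-9]{2}[0-9]){1,2})"
ACCESSION_RE = re.compile(rf"^{ACCESSION}$")
# Assumed syntax for "accession plus range": ACCESSION:START-END (1-based, inclusive).
RANGE_RE = re.compile(rf"^({ACCESSION}):(\d+)-(\d+)$")
# The 20 standard amino acids plus X for unknown residues.
SEQUENCE_RE = re.compile(r"^[ACDEFGHIKLMNPQRSTVWYX]+$")


@dataclass
class Query:
    kind: str                                 # "sequence", "accession" or "accession_range"
    value: str                                # the sequence or the accession
    region: Optional[Tuple[int, int]] = None  # only set for "accession_range"


def parse_query(text: str) -> Query:
    """Classify user input as one of the three supported input types."""
    cleaned = "".join(text.split()).upper()   # drop whitespace and line breaks
    match = RANGE_RE.match(cleaned)
    if match:
        start, end = int(match.group(2)), int(match.group(3))
        if start < 1 or end < start:
            raise ValueError(f"invalid range {start}-{end}")
        return Query("accession_range", match.group(1), (start, end))
    if ACCESSION_RE.match(cleaned):
        return Query("accession", cleaned)
    if SEQUENCE_RE.match(cleaned):
        return Query("sequence", cleaned)
    raise ValueError("input is neither a sequence nor a UniProt accession")
```

Accessions always contain digits and raw sequences never do, so the order of the checks does not create ambiguity between the two.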
Considerations
• Is there a known structure?
• Are there Classified Structural (CATH, SCOP) Domains?
• Are there Known Sequence (Pfam) Domains?
• Are there Predicted Structural (Gene3D, Superfamily) Domains?
• Do Domain Boundaries Conform to Secondary Structure Restrictions?
• Which Species has a Representative Domain that is the Most Compactly Folded?
• The core implementation needs to be extensible and easily maintainable.
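One way to read the secondary structure consideration above is that a domain boundary should not cut through a helix or strand. The sketch below assumes a per-residue secondary structure string using H (helix), E (strand) and C (coil), and moves each boundary to the nearest coil residue. The function names and the `max_shift` limit are hypothetical; the slides do not say which method is used.

```python
from typing import Tuple


def boundary_in_element(ss: str, position: int) -> bool:
    """True if the 1-based residue position lies in a helix (H) or strand (E)."""
    return ss[position - 1] in "HE"


def nearest_loop_position(ss: str, position: int, max_shift: int = 10) -> int:
    """Move a boundary to the closest coil residue within max_shift residues.

    When two coil residues are equally close, the lower position wins.
    """
    for shift in range(max_shift + 1):
        for candidate in (position - shift, position + shift):
            if 1 <= candidate <= len(ss) and not boundary_in_element(ss, candidate):
                return candidate
    raise ValueError(f"no loop residue within {max_shift} of position {position}")


def adjust_domain(ss: str, start: int, end: int) -> Tuple[int, int]:
    """Shift both domain boundaries so that neither splits a helix or strand."""
    return nearest_loop_position(ss, start), nearest_loop_position(ss, end)
```

For example, with `ss = "CCHHHHHHCCCEEEECC"`, a predicted domain at residues 4 to 14 starts inside the helix and ends inside the strand, so `adjust_domain(ss, 4, 14)` widens it to the flanking coil residues.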
Taverna
The software is to be implemented using the Taverna workbench, a tool for formulating the workflow and implementing each of its processes as distributed web services.
Tom Oinn - http://taverna.sourceforge.net/
Advantages:
• Distributed computing reduces resource requirements
• Easily extensible system
• Maintenance issues shifted to external providers
Disadvantages:
• Learning curve
• Convincing service providers to adopt a standard format
• Maintenance issues shifted to external providers
Taverna
The prototype workflow is quite complex when expanded to show all of the incorporated sub-workflows.
Luckily, Taverna can provide a top-level view.
Taverna
Dealing With DAS
Taverna
Process Data
Secondary Structure Elements: (Method not yet chosen)
Sequence Domains: Pfam, Gene3D, Superfamily, etc.
Protein Folding: RONN, FoldIndex, DisEMBL
Rank Target Selection: Based on loop lengths, folding predictions, etc.
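The ranking step could combine loop lengths and folding predictions into a single score. The following is only a sketch: the `Candidate` fields, the linear scoring function and its weights are assumptions chosen for illustration, not the scheme used by the project.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Candidate:
    name: str
    longest_loop: int          # residues in the longest predicted loop
    disorder_fraction: float   # fraction of residues predicted disordered (0 to 1)


def score(c: Candidate, loop_weight: float = 0.02, disorder_weight: float = 1.0) -> float:
    """Lower is better: penalise long loops and predicted disorder (assumed weights)."""
    return loop_weight * c.longest_loop + disorder_weight * c.disorder_fraction


def rank(candidates: List[Candidate]) -> List[Candidate]:
    """Return candidates ordered from most to least promising."""
    return sorted(candidates, key=score)


if __name__ == "__main__":
    pool = [
        Candidate("construct_B", longest_loop=30, disorder_fraction=0.10),
        Candidate("construct_A", longest_loop=10, disorder_fraction=0.05),
        Candidate("construct_C", longest_loop=5, disorder_fraction=0.40),
    ]
    for c in rank(pool):
        print(f"{c.name}: {score(c):.2f}")
```

In practice the disorder fraction would come from whichever folding predictors are consulted, and the weights would need tuning against constructs with known outcomes.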
Starting the Process
Monitoring Progress
Assess Data
Review Results
Extensibility
Java Services
• Straightforward to provide as a web service using Tomcat and Axis
• WSDL (describing the service) can be generated automatically
Legacy Software
• Any command-line tool can be wrapped into a web service using Soaplab
• For example, the EMBOSS tools are already available
Extensibility
Output Format:
To ensure generic service compatibility, it helps to define a common results format. We are therefore using the e-Family service schema (http://www.efamily.org.uk/).
Current collaborators include:
The Weizmann Institute - FoldIndex
University of Oxford - RONN
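The value of a common results format is that downstream steps handle one record type regardless of which service produced it. The sketch below illustrates that idea only. It does not reproduce the e-Family schema, and the `Feature` record, both adapter functions and both raw input shapes are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Feature:
    source: str    # name of the service that produced the feature
    kind: str      # e.g. "domain" or "disorder"
    start: int     # 1-based, inclusive
    end: int       # 1-based, inclusive
    label: str = ""


def from_domain_hit(source: str, hit: Dict) -> Feature:
    """Adapt a hypothetical domain-service hit: {"id": ..., "from": ..., "to": ...}."""
    return Feature(source, "domain", int(hit["from"]), int(hit["to"]), hit["id"])


def from_disorder_mask(source: str, mask: str) -> List[Feature]:
    """Adapt a hypothetical per-residue mask ('D' = disordered) into regions."""
    features: List[Feature] = []
    start = None
    # The appended "-" is a sentinel that closes a region reaching the last residue.
    for i, flag in enumerate(mask + "-", start=1):
        if flag == "D" and start is None:
            start = i
        elif flag != "D" and start is not None:
            features.append(Feature(source, "disorder", start, i - 1))
            start = None
    return features
```

With both adapters in place, a ranking or viewing step can iterate over a single list of `Feature` records, and adding a new service means writing one more adapter rather than changing the consumers.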
Results Viewers
http://www.efamily.org.uk/software/dasclients/spice/
Conclusions
Taverna and Web Services:
• Taverna facilitates the provision of complex distributed systems that utilise web services
• This reduces maintenance overheads and keeps technology requirements at a reasonable level
• It is also easily extensible to accommodate new services
Availability:
• Hopefully the core system will be ready by the end of the year
• This will provide the basic workflow for users to customise according to their needs
Acknowledgments
Thanks to:
Tom Oinn
Andreas Prlic
The RONN and FoldIndex teams
The MSD Group