
BioMOBY 2005: It's working! Now What?!

Benjamin Good

Wilkinson Laboratory

iCAPTURE Centre

University of British Columbia

http://bioinfo.icapture.ubc.ca/bgood









Acknowledgements

Mark Wilkinson, Edward Kawas, Nina Opushneva – iCAPTURE @ UBC
Phillip Lord, Martin Senger – myGrid @ U Manchester
Heiko Schoof, Rebecca Ernst – MIPS
Paul Gordon – University of Calgary
Carole Goble – myGrid @ U Manchester
Lincoln Stein – CSHL
Damian Gessler, Andrew Farmer, Gary Schiltz – NCGR
Bill Crosby, Matthew Links, Luke McCarthy – U of S
Midori Harris – EBI & GO Consortium
Mike Niemi – IBM
Fiona Cunningham, Shuly Avraham – CSHL
Ken Stuebe – SDSC

Outline

• What BioMOBY is

• Why it was needed

• How it works

• What is being done with it now

• What might be next.

What BioMOBY is

A generic solution for sharing distributed computational resources

Why it was needed

High throughput Biology

[Figure, built up over several slides: separate databases (SGD, TAIR, NCGR, MIPS, GO) accumulate one by one, ending in "?!?!?"]

Integration?

DB1 Program DB2

Dis-

Moby DIC Meeting Sept. 2001

• Model Organism Bring Your Own Database Interface Conference

• All model organism databases invited

• Some could not attend because it happened right after September 11th

- BioMOBY project emerged from this meeting


Note the Target Audience

• Not NCBI
• Small to medium sized resource providers
  • First priority to support their own users
  • Limited time and money
• Makes certain options impossible
  • No massive data warehouse
  • No standardization of implementation (database, programming language)

The Moby plan

1. Design an ontological framework for data-type creation

2. Let independent service providers build data-types using this framework

3. Use these data-types to define web service interfaces.

4. Register these interfaces in a “yellow pages”

• Machines can find an appropriate service
• Machines can execute that service unattended
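The "yellow pages" idea in step 4 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the MOBY Central API: the `Service` and `Registry` classes, the service names, and the endpoint URLs are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(frozen=True)
class Service:
    """A registered interface: one input data-type and one output data-type."""
    name: str
    input_type: str
    output_type: str
    endpoint: str  # hypothetical URL, for illustration only


@dataclass
class Registry:
    """A toy 'yellow pages': providers register, clients search by data-type."""
    services: List[Service] = field(default_factory=list)

    def register(self, service: Service) -> None:
        self.services.append(service)

    def find(self, input_type: str, output_type: Optional[str] = None) -> List[Service]:
        """Return services that accept input_type (and, optionally, produce output_type)."""
        return [
            s for s in self.services
            if s.input_type == input_type
            and (output_type is None or s.output_type == output_type)
        ]


registry = Registry()
registry.register(Service("getSequence", "Object", "DNASequence", "http://example.org/getSequence"))
registry.register(Service("getGeneName", "Object", "String", "http://example.org/getGeneName"))

# A client holding an Object can ask what it is able to do next.
for service in registry.find("Object"):
    print(service.name, "->", service.output_type)
```

Because every interface is described only by shared data-type names, a program can pick a service without a human reading its documentation, which is what makes unattended execution plausible.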

Object Ontology

• Data types defined in an open, shared GO-like ontology
  – Nodes define data Classes
  – Edges define the relationships between Classes
• Edges define one of three relationships
  – ISA
    • Inheritance relationship
    • All properties of the parent are present in the child
  – HASA
    • Container relationship of ‘exactly 1’
  – HAS
    • Container relationship with ‘1 or more’
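The three edge types can be modelled with a small data structure. The sketch below is an assumption-laden illustration, not the project's implementation: the `DataType` class and helper functions are hypothetical, and the type names are borrowed from the sequence example used in these slides.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class DataType:
    """One node of the ontology, with its outgoing edges."""
    name: str
    isa: Optional[str] = None                            # parent class; None for the root
    hasa: Dict[str, str] = field(default_factory=dict)   # articleName -> type, exactly 1
    has: Dict[str, str] = field(default_factory=dict)    # articleName -> type, 1 or more


ONTOLOGY = {t.name: t for t in [
    DataType("Object"),
    DataType("Integer", isa="Object"),
    DataType("String", isa="Object"),
    DataType("VirtualSequence", isa="Object", hasa={"length": "Integer"}),
    DataType("GenericSequence", isa="VirtualSequence", hasa={"SequenceString": "String"}),
    DataType("DNASequence", isa="GenericSequence"),
]}


def lineage(name: str) -> List[str]:
    """Follow ISA edges from `name` up to the root."""
    chain = []
    current: Optional[str] = name
    while current is not None:
        chain.append(current)
        current = ONTOLOGY[current].isa
    return chain


def is_a(child: str, ancestor: str) -> bool:
    """True if `child` is `ancestor` or inherits from it."""
    return ancestor in lineage(child)


def hasa_members(name: str) -> Dict[str, str]:
    """HASA members of a type, including those inherited through ISA."""
    collected: Dict[str, str] = {}
    for type_name in reversed(lineage(name)):  # root first, most specific last
        collected.update(ONTOLOGY[type_name].hasa)
    return collected


print(lineage("DNASequence"))
print(hasa_members("DNASequence"))
```

Inheritance is what lets a service that accepts a parent type also accept any of its children, since every property of the parent is guaranteed to be present.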


Data-typing is the key

• Each Object in the ontology maps to a simple, concise XML Schema

• This rigid yet easily extensible structure facilitates serialization and parsing in any language.

• Sharing a framework for creating data-types turns out to be largely sufficient to achieve interoperability


The Simplest Data-Type

<Object namespace="NCBI_gi" id="111076"/>

Object

The combination of a namespace and an identifier within that namespace uniquely identifies a data ‘entity’.

(Not its representation)

MOBY Primitives

[Diagram: Integer, String, Float, and DateTime each ISA Object]

<Integer namespace="" id="">38</Integer>


A MOBY Data-Type

<VirtualSequence namespace="NCBI_gi" id="111076">
  <Integer namespace="" id="" articleName="length">38</Integer>
</VirtualSequence>

[Diagram: Integer, String, and VirtualSequence each ISA Object; VirtualSequence HASA Integer]

A MOBY Data-Type

<GenericSequence namespace="NCBI_gi" id="111076">
  <Integer namespace="" id="" articleName="length">38</Integer>
  <String namespace="" id="" articleName="SequenceString">
    ATGATGATAGATAGAGGGCCCGGCGCGCGCGCGCGC
  </String>
</GenericSequence>

[Diagram: Integer, String, and VirtualSequence each ISA Object; VirtualSequence HASA Integer; GenericSequence ISA VirtualSequence; GenericSequence HASA String]

A MOBY Data-Type

<DNASequence namespace="NCBI_gi" id="111076">
  <Integer namespace="" id="" articleName="length">38</Integer>
  <String namespace="" id="" articleName="SequenceString">
    ATGATGATAGATAGAGGGCCCGGCGCGCGCGCGCGC
  </String>
</DNASequence>

[Diagram: Integer, String, and VirtualSequence each ISA Object; VirtualSequence HASA Integer; GenericSequence ISA VirtualSequence; GenericSequence HASA String; DNASequence ISA GenericSequence]
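To illustrate why this structure is easy to parse in any language, here is a short Python sketch that reads the DNASequence example with the standard library. It assumes the simplified notation shown on the slides (straight quotes, no XML namespace prefixes), which may differ from the full message format; `parse_moby_object` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# The DNASequence example from the slide, in its simplified notation.
DNA_SEQUENCE_XML = """
<DNASequence namespace="NCBI_gi" id="111076">
  <Integer namespace="" id="" articleName="length">38</Integer>
  <String namespace="" id="" articleName="SequenceString">
    ATGATGATAGATAGAGGGCCCGGCGCGCGCGCGCGC
  </String>
</DNASequence>
"""


def parse_moby_object(xml_text: str) -> dict:
    """Turn one data-type element into a plain dictionary."""
    root = ET.fromstring(xml_text.strip())
    return {
        "type": root.tag,                    # the ontology class, e.g. DNASequence
        "namespace": root.get("namespace"),  # namespace + id identify the entity
        "id": root.get("id"),
        "members": {
            # articleName -> (member type, text value with whitespace trimmed)
            child.get("articleName"): (child.tag, (child.text or "").strip())
            for child in root
        },
    }


parsed = parse_moby_object(DNA_SEQUENCE_XML)
print(parsed["type"], parsed["namespace"], parsed["id"])
print(parsed["members"]["length"])
```

The element name carries the ontology class, the attributes carry the identity, and each child is labelled by its `articleName`, so no service-specific parser is needed.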

A portion of the MOBY-S Object Ontology

…community-built!

137 registered by 34 authorities

How it works

[Diagram: Service Providers (Gene names, Sequence, Express., Protein, Alleles…), the MOBY Central registry, and a Client]


Moby Stats

• Mailing list count: 162 members
• Google Scholar – ‘BioMOBY’: 103
  – Citations of original BioMOBY paper: 52
• Google links to biomoby.org: 322

Deployed Moby Services

http://castor.brc.mcw.edu/files/mobysphere/

[Map of deployed services; legend: > 10, < 10]

Thanks to Simon Twigger

• Services registered: 272 total, 249 non-test
• Service developers (by contact email): 69
• Budget: US$230,000 over 3 years


Major Implementations

• PlaNet consortium
  – European consortium of plant databases
  – 121 Services
• National Bioinformatics Institute of Spain
  – Nationwide initiative
  – 35 Services
• CGIAR-GCP & ACPFG


Registry use 2004-2005

[Chart: requests to Moby per month, Jan-04 to Apr-05; y-axis “Requests” from 0 to 400,000; x-axis “Month”; annotation: PlaNet implements separate Moby registry]

It seems to be working! Why?

• It provides useful functionality for the target audience.

• Functionality not currently available from any other WS/SWS project

• It is not difficult to deploy services.


Is it useful outside of these consortia?

• Many public services now available (via passive altruism).

• As a result, interesting clients are emerging.

Client style 1,2,3

1. Power User: when you want to do what you already know how to do
  – Taverna
    • Produced by the myGrid Consortium
    • Graphical workflow composer and invoker
    • Supports BioMOBY services (and many others)


Taverna

Client style 1,2,3

2. Quick and Dirty: you know what you have and what you want, but you don’t know how to make it happen
  – MobyGraphs
    • Martin Senger of myGrid
    • Discovers service connectivity between two datatypes
  – PlaNet Service Aggregator
    • Precomputes all possible workflows starting from a single input
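Discovering connectivity between two datatypes amounts to a path search over a graph whose edges are services. The sketch below uses a plain breadth-first search to show the idea; it is not how MobyGraphs or the PlaNet Service Aggregator are implemented. The service list is hypothetical, as are the `AminoAcidSequence` and `DomainAnnotation` type names, and types are matched by exact name, ignoring ISA inheritance.

```python
from collections import deque

# Hypothetical services: (name, input data-type, output data-type)
SERVICES = [
    ("getSequence", "Object", "DNASequence"),
    ("translate", "DNASequence", "AminoAcidSequence"),
    ("findDomains", "AminoAcidSequence", "DomainAnnotation"),
    ("getGeneName", "Object", "String"),
]


def find_workflow(services, start_type, goal_type):
    """Breadth-first search for a shortest chain of services from start to goal.

    Returns a list of service names, an empty list if start equals goal,
    or None if the two data-types are not connected.
    """
    queue = deque([(start_type, [])])
    seen = {start_type}
    while queue:
        current, path = queue.popleft()
        if current == goal_type:
            return path
        for name, input_type, output_type in services:
            if input_type == current and output_type not in seen:
                seen.add(output_type)
                queue.append((output_type, path + [name]))
    return None


print(find_workflow(SERVICES, "Object", "DomainAnnotation"))
```

Running the same search from one input type toward every reachable output type would give the kind of precomputed workflow list described for the aggregator.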

Client style 1,2,3

3. Exploration Mode
  – Gbrowse_moby
  – Ahab

Starting Data

Ahab

• Java Server Pages
• Simultaneous service invocations
• Session stored as RDF graph
• Results displayed with clickable graph
• 0_1: Runs all possible services
• 0_2: Gives user control

http://bioinfo.icapture.ubc.ca/bgood/Ahab.html


Current Development

1. Make service development even easier
2. Expand myGrid collaboration
  – Migrate to their registry & service ontology
  – Enhance support for BioMOBY in Taverna
    • Validation of workflows
    • Workflow construction “wizards”
3. Continue development of Ahab
  – Visualization

Current Research

1. mySWeb
  – “Ishmael” MOBY exploration tool
  – Unattended construction of a personalized semantic web centered around user requests
2. Minimally curated community ontology construction
  – It can work
  – How can we use and improve the process?


Summary

• BioMOBY was designed to allow distributed communities to share their computational resources, and it seems to be working

• Many new opportunities for real distributed data integration are starting to appear

• New ways of thinking about the Semantic Web are arising!

Conclusion

If the Service Web and the Semantic Web are to succeed as the WWW has, the end-users and the novice developers must be able to contribute easily

BioMOBY is working because it makes this possible

Sponsors

BioMOBY

BC Bioinformatics Training Program

National Science Foundation (NSF), USA

Canadian Bioinformatics Resource, NRC, Halifax

Open-Bio Foundation

IBM

http://biomoby.org


Contact

http://biomoby.org/