A new language for a new biology: How SBML and other tools are transforming models of life

Michael Hucka, Ph.D.
Department of Computing + Mathematical Sciences
California Institute of Technology
Pasadena, CA, USA

Victorian Systems Biology Symposium, Australia, August 2013

Email: [email protected]
Twitter: @mhucka


Presentation given at the Victorian Systems Biology Symposium (http://www.emblaustralia.org/About_us/news/mike-hucka.aspx) at the Walter and Eliza Hall Institute in Melbourne, Australia, on 20 August 2013.

Page 2: A new language for a new biology: How SBML and other tools are transforming models of life

Outline

Background and introduction

The Systems Biology Markup Language (SBML)

Complementary efforts: MIRIAM and SED-ML

COMBINE: the Computational Modeling in Biology Network

Conclusion

Page 4: A new language for a new biology: How SBML and other tools are transforming models of life

Research today: experimentation, computation, cogitation

Page 5: A new language for a new biology: How SBML and other tools are transforming models of life

“The nature of systems biology”, Bruggeman & Westerhoff, Trends Microbiol. 15 (2007).

Page 6: A new language for a new biology: How SBML and other tools are transforming models of life

Large-scale integrative models are growing

Page 7: A new language for a new biology: How SBML and other tools are transforming models of life

Many models have traditionally been published this way

Problems:

• Errors in printing

• Missing information

• Dependencies on implementation

• Outright errors

• Can be a huge effort to recreate

Is it enough to communicate the model in a paper?

Page 8: A new language for a new biology: How SBML and other tools are transforming models of life

Is it enough to make your (software X) code available?

It’s vital for good science:

• Someone with access to the same software can try to run it, understand it, verify the computational results, build on them, etc.

• Opinion: you should always do this in any case

Page 9: A new language for a new biology: How SBML and other tools are transforming models of life

Is it enough to make your (software X) code available?

It’s vital for good science:

• Someone with access to the same software can try to run it, understand it, build on it, etc.

• Opinion: you should always do this in any case

But it’s still not ideal for communication of scientific results:

• Doesn’t necessarily encode biological semantics of the model

• What if they don’t have access to the same software?

• What if they don’t want to use that software?

• What if they want to use a different conceptual framework?

• And how will people be able to relate the model to other work?

Page 10: A new language for a new biology: How SBML and other tools are transforming models of life

Different tools ⇒ different interfaces & languages

Page 11: A new language for a new biology: How SBML and other tools are transforming models of life

Outline

Background and introduction

The Systems Biology Markup Language (SBML)

Complementary efforts: MIRIAM and SED-ML

COMBINE: the Computational Modeling in Biology Network

Conclusion

Page 12: A new language for a new biology: How SBML and other tools are transforming models of life

SBML: a lingua franca for software

Page 13: A new language for a new biology: How SBML and other tools are transforming models of life

SBML = Systems Biology Markup Language

Format for representing computational models of biological processes

• Data structures + usage principles + serialization to XML

• (Mostly) declarative, not procedural; not a scripting language

Neutral with respect to modeling framework

• E.g., ODE, stochastic systems, etc.

Important: software reads/writes SBML, not humans

<Beginning of SBML model definition>

  List of function definitions
  List of unit definitions
  List of compartments
  List of molecular species
  List of parameters
  List of rules
  List of reactions
  List of events

<End of SBML model definition>

Page 14: A new language for a new biology: How SBML and other tools are transforming models of life

The raw SBML (as XML)
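The XML shown on this slide did not survive extraction. As a rough stand-in, the sketch below uses only Python's standard library to emit a tiny SBML-style document for a single reaction A -> B. The model, compartment, species, and reaction ids are invented for illustration, the kinetic law is omitted, and the output has not been checked against the SBML specification, so treat it as an indication of the general shape rather than a valid model.

```python
import xml.etree.ElementTree as ET

# Namespace of SBML Level 3 Version 1 Core.
SBML_NS = "http://www.sbml.org/sbml/level3/version1/core"


def build_minimal_sbml() -> str:
    """Return a small SBML-style document for the toy reaction A -> B."""
    ET.register_namespace("", SBML_NS)

    def q(tag: str) -> str:
        # Qualify a tag name with the SBML namespace.
        return f"{{{SBML_NS}}}{tag}"

    sbml = ET.Element(q("sbml"), {"level": "3", "version": "1"})
    model = ET.SubElement(sbml, q("model"), {"id": "toy_model"})  # hypothetical id

    compartments = ET.SubElement(model, q("listOfCompartments"))
    ET.SubElement(compartments, q("compartment"),
                  {"id": "cell", "size": "1", "constant": "true"})

    species_list = ET.SubElement(model, q("listOfSpecies"))
    for species_id, amount in (("A", "10"), ("B", "0")):
        ET.SubElement(species_list, q("species"), {
            "id": species_id,
            "compartment": "cell",
            "initialAmount": amount,
            "hasOnlySubstanceUnits": "true",
            "boundaryCondition": "false",
            "constant": "false",
        })

    reactions = ET.SubElement(model, q("listOfReactions"))
    reaction = ET.SubElement(reactions, q("reaction"),
                             {"id": "conversion", "reversible": "false", "fast": "false"})
    reactants = ET.SubElement(reaction, q("listOfReactants"))
    ET.SubElement(reactants, q("speciesReference"),
                  {"species": "A", "stoichiometry": "1", "constant": "true"})
    products = ET.SubElement(reaction, q("listOfProducts"))
    ET.SubElement(products, q("speciesReference"),
                  {"species": "B", "stoichiometry": "1", "constant": "true"})

    return ET.tostring(sbml, encoding="unicode")


if __name__ == "__main__":
    print(build_minimal_sbml())
```

Even for one reaction the markup is verbose, which is consistent with the earlier point that SBML is meant to be read and written by software rather than by hand.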

Page 15: A new language for a new biology: How SBML and other tools are transforming models of life

Core SBML concepts are fairly simple

The process is central

• Literally called a “reaction” in SBML

• Participants are pools of entities (biochemical species)

Models can further include:

• Compartments

• Other constants & variables

• Discontinuous events

• Other, explicit math

• Unit definitions

• Annotations
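To make the "process is central" idea concrete, here is a minimal sketch in which a model is just a list of reactions acting on pools of species. The class and function names are hypothetical, mass-action kinetics is an assumption made for the example, and the forward-Euler integrator is only one possible interpretation of the same data; a stochastic simulator could consume the identical reaction list.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Reaction:
    """A process that draws from reactant pools and feeds product pools."""
    name: str
    reactants: Dict[str, int]  # species id -> stoichiometry
    products: Dict[str, int]   # species id -> stoichiometry
    rate_constant: float       # assumed mass-action rate constant


def mass_action_rate(reaction: Reaction, amounts: Dict[str, float]) -> float:
    """Rate = k * product of reactant amounts raised to their stoichiometries."""
    rate = reaction.rate_constant
    for species_id, stoich in reaction.reactants.items():
        rate *= amounts[species_id] ** stoich
    return rate


def simulate_euler(reactions: List[Reaction], amounts: Dict[str, float],
                   dt: float, steps: int) -> Dict[str, float]:
    """Interpret the reaction list as ODEs and integrate with forward Euler."""
    state = dict(amounts)
    for _ in range(steps):
        delta = {species_id: 0.0 for species_id in state}
        for reaction in reactions:
            change = mass_action_rate(reaction, state) * dt
            for species_id, stoich in reaction.reactants.items():
                delta[species_id] -= stoich * change
            for species_id, stoich in reaction.products.items():
                delta[species_id] += stoich * change
        for species_id in state:
            state[species_id] += delta[species_id]
    return state


if __name__ == "__main__":
    network = [Reaction("conversion", {"A": 1}, {"B": 1}, rate_constant=0.1)]
    print(simulate_euler(network, {"A": 10.0, "B": 0.0}, dt=0.01, steps=1000))
```

Keeping the reaction list separate from the integrator mirrors the declarative stance described earlier: the model says what happens, and each tool decides how to compute it.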

Page 16: A new language for a new biology: How SBML and other tools are transforming models of life

SBML is now widely used

Dozens of journals accept models in SBML format

100’s of software tools available today

1000’s of models available in SBML format today

[Chart: counts per year, 2001 to 2012, on a vertical axis from 0 to 300; annotation: 254+ today]

Page 17: A new language for a new biology: How SBML and other tools are transforming models of life

Contents of BioModels Database

Contents today:

• 142,000+ pathway models (converted from KEGG)

• 460+ hand-curated quantitative models

• 460+ non-curated quantitative models

[Pie chart of model categories: signal transduction, metabolic process, multicellular organismal process, rhythmic process, cell cycle, homeostatic process, response to stimulus, cell death, localization, others (e.g., developmental process). Slice sizes as extracted: 8%, 2%, 3%, 6%, 6%, 7%, 8%, 9%, 24%, 27%.]

Database data from 2013

Page 18: A new language for a new biology: How SBML and other tools are transforming models of life

Free software libraries – libSBML

Reads, writes, validates SBML

Can check & convert units

Written in portable C++

Runs on Linux, Mac, Windows

APIs for C, C++, C#, Java, Octave, Perl, Python, R, Ruby, MATLAB

Well documented API

Open-source (LGPL)

http://sbml.org/Software/libSBML
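The sketch below does not use libSBML and does not reproduce its API. It relies only on Python's standard library to illustrate, in miniature, the read-then-check workflow that a library like libSBML provides far more thoroughly: parse a document, list what it declares, and flag an inconsistency. The embedded document and the function name are hypothetical, and the single check shown (every species reference must point at a declared species) is just one example of a consistency rule.

```python
import xml.etree.ElementTree as ET

NS = {"s": "http://www.sbml.org/sbml/level3/version1/core"}

# Hypothetical, deliberately broken document: the product "C" is never declared.
EXAMPLE = """
<sbml xmlns="http://www.sbml.org/sbml/level3/version1/core" level="3" version="1">
  <model id="toy_model">
    <listOfSpecies>
      <species id="A" compartment="cell"/>
      <species id="B" compartment="cell"/>
    </listOfSpecies>
    <listOfReactions>
      <reaction id="conversion">
        <listOfReactants><speciesReference species="A"/></listOfReactants>
        <listOfProducts><speciesReference species="C"/></listOfProducts>
      </reaction>
    </listOfReactions>
  </model>
</sbml>
"""


def undeclared_species(sbml_text: str):
    """Return (reaction id, species id) pairs that refer to undeclared species."""
    root = ET.fromstring(sbml_text.strip())
    declared = {sp.get("id") for sp in root.findall(".//s:species", NS)}
    problems = []
    for reaction in root.findall(".//s:reaction", NS):
        for ref in reaction.findall(".//s:speciesReference", NS):
            if ref.get("species") not in declared:
                problems.append((reaction.get("id"), ref.get("species")))
    return problems


if __name__ == "__main__":
    for reaction_id, species_id in undeclared_species(EXAMPLE):
        print(f"reaction {reaction_id!r} refers to undeclared species {species_id!r}")
```

For real work, the validation built into libSBML is the better choice, since it covers the full set of rules in the specification plus unit checking.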

Page 19: A new language for a new biology: How SBML and other tools are transforming models of life

Evolution of SBML continues

Today: SBML Level 3

• Level 3 Core provides framework for common models

• Level 3 packages add additional constructs to the Core

Page 20: A new language for a new biology: How SBML and other tools are transforming models of life

Level 3 package                 What it enables                                 Status

Hierarchical model composition  Models containing submodels                     ✔
Flux balance constraints        Constraint-based models                         ✔
Qualitative models              Petri net models, Boolean models                ✔
Graph layout                    Diagrams of models                              ✔
Multicomponent/state species    Entities w/ structure; also rule-based models   draft
Spatial                         Nonhomogeneous spatial models                   draft
Graph rendering                 Diagrams of models                              draft
Groups                          Arbitrary grouping of components                draft
Distributions                   Numerical values as statistical distributions   in dev
Arrays & sets                   Arrays or sets of entities                      in dev
Dynamic structures              Creation & destruction of components            in dev
Annotations                     Richer annotation syntax

Page 21: A new language for a new biology: How SBML and other tools are transforming models of life

SBML funding sources over the past 13+ years

• National Institute of General Medical Sciences (USA)

• European Molecular Biology Laboratory (EMBL)

• JST ERATO Kitano Symbiotic Systems Project (Japan) (to 2003)

• JST ERATO-SORST Program (Japan)

• ELIXIR (UK)

• Beckman Institute, Caltech (USA)

• Keio University (Japan)

• International Joint Research Program of NEDO (Japan)

• Japanese Ministry of Agriculture

• Japanese Ministry of Educ., Culture, Sports, Science and Tech.

• BBSRC (UK)

• National Science Foundation (USA)

• DARPA IPTO Bio-SPICE Bio-Computation Program (USA)

• Air Force Office of Scientific Research (USA)

• STRI, University of Hertfordshire (UK)

• Molecular Sciences Institute (USA)

Page 22: A new language for a new biology: How SBML and other tools are transforming models of life

Outline

Background and introduction

The Systems Biology Markup Language (SBML)

Complementary efforts: MIRIAM and SED-ML

COMBINE: the Computational Modeling in Biology Network

Conclusion

Page 26: A new language for a new biology: How SBML and other tools are transforming models of life

Raw models alone are insufficient

Need standard schemes for machine-readable annotations

• Identify entities

• Mathematical semantics

• Links to other data resources

• Authorship & pub. info

Modelers want to use their own conventions

Low info content

No standard identifiers

Page 27: A new language for a new biology: How SBML and other tools are transforming models of life

MIRIAM (Minimum Information Requested In the Annotation of Models)

Addresses 2 general areas of annotation needs:

• Requirements for reference correspondence

• Scheme for encoding annotations

- Annotations for attributing model creators & sources

- Annotations for referring to external data resources

MIRIAM is not specific to SBML

Page 30: A new language for a new biology: How SBML and other tools are transforming models of life

Example of a problem that can be solved with annotations

http://www.ebi.ac.uk/chebi

Low info content

Known by different names – do you want to write all of them into your model?

salicylic acid

Page 31: A new language for a new biology: How SBML and other tools are transforming models of life

MIRIAM annotations for external references

Goal: link model constituents to corresponding entities in bioinformatics resources (e.g., databases, controlled vocabularies)

• Supports:

- Precise identification of model constituents

- Discovery of models that concern the same thing

- Comparison of model constituents between different models

MIRIAM approach avoids putting data content directly in the model

• Instead, it points at external resources that contain the data
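The benefits listed on this slide can be illustrated with a small sketch: two models use different local names, but because each name is annotated with a resource URI, software can tell when they refer to the same entity. Both models and the function name are hypothetical, and the ChEBI identifiers are quoted from memory, so they should be checked against ChEBI before being reused.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical annotated models: local species name -> set of annotation URIs.
# Identifiers are assumptions quoted from memory; verify them against ChEBI.
MODEL_ONE: Dict[str, Set[str]] = {
    "SA": {"http://identifiers.org/chebi/CHEBI:16914"},   # salicylic acid (assumed id)
    "ATP": {"http://identifiers.org/chebi/CHEBI:15422"},  # ATP (assumed id)
}
MODEL_TWO: Dict[str, Set[str]] = {
    "2-hydroxybenzoic acid": {"http://identifiers.org/chebi/CHEBI:16914"},
    "glc": {"http://identifiers.org/chebi/CHEBI:17234"},  # glucose (assumed id)
}


def shared_entities(model_a: Dict[str, Set[str]],
                    model_b: Dict[str, Set[str]]) -> List[Tuple[str, str, str]]:
    """Pair up constituents of two models that carry the same annotation URI."""
    matches = []
    for name_a, uris_a in sorted(model_a.items()):
        for name_b, uris_b in sorted(model_b.items()):
            for uri in sorted(uris_a & uris_b):
                matches.append((name_a, name_b, uri))
    return matches


if __name__ == "__main__":
    for name_a, name_b, uri in shared_entities(MODEL_ONE, MODEL_TWO):
        print(f"{name_a!r} and {name_b!r} both point at {uri}")
```

Neither model has to list every synonym of a compound; the shared URI carries the identity, and the external resource holds the names.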

Page 32: A new language for a new biology: How SBML and other tools are transforming models of life

How do we create globally unique identifiers consistently?

Long story short: developed by the Le Novère group at the EBI

• Resource identifiers (URIs) combine 2 parts:

- namespace: identifies a dataset

- entity identifier: identifies a datum within the dataset

• There’s a registry for namespaces: MIRIAM Registry

- Allows people & software to use the same namespace identifiers

• There’s a URI resolution service: MIRIAM Resources & identifiers.org

- Allows people & software to take a given identifier and figure out what it points to
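The two-part scheme on this slide can be sketched as a small value type that joins a namespace and an entity identifier into one URI and splits it back again. The base address and the `base/namespace/entity` layout are assumptions made for illustration, as are the class and function names; the authoritative namespace list is the registry mentioned above, not this code.

```python
from typing import NamedTuple

RESOLVER_BASE = "http://identifiers.org"  # assumed base address of the resolver


class ResourceId(NamedTuple):
    """A two-part identifier: which dataset, and which datum inside it."""
    namespace: str  # identifies a dataset, e.g. "chebi"
    entity: str     # identifies a datum within the dataset

    def to_uri(self) -> str:
        return f"{RESOLVER_BASE}/{self.namespace}/{self.entity}"


def parse_uri(uri: str) -> ResourceId:
    """Split a URI of the assumed form base/namespace/entity into its parts."""
    prefix = RESOLVER_BASE + "/"
    if not uri.startswith(prefix):
        raise ValueError(f"not a resolver URI: {uri!r}")
    namespace, separator, entity = uri[len(prefix):].partition("/")
    if not separator or not namespace or not entity:
        raise ValueError(f"expected namespace and entity in {uri!r}")
    return ResourceId(namespace, entity)


if __name__ == "__main__":
    rid = ResourceId("chebi", "CHEBI:16914")  # identifier assumed, verify before reuse
    print(rid.to_uri())
    print(parse_uri(rid.to_uri()))
```

Because the namespace comes from a shared registry, two tools that build the URI independently should arrive at the same string.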

Page 33: A new language for a new biology: How SBML and other tools are transforming models of life

Another problem: software can’t read figure legends

?

BIOMD0000000319 in BioModels Database

Decroly & Goldbeter, PNAS, 1982

Page 34: A new language for a new biology: How SBML and other tools are transforming models of life

SED-ML = Simulation Experiment Description ML

Application-independent format

• Captures procedures, algorithms, parameter values

Can be used for

• Simulation experiments encoding parametrizations & perturbations

• Simulations using more than one model and/or method

• Data manipulations to produce plot(s)

http://sedml.org

[Diagram: SED-ML components: Model, Simulation, Task, Data generators, Reports]
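The component names on this slide can be tied together in a short sketch. This is not SED-ML syntax and not any SED-ML library: the classes below are hypothetical Python stand-ins whose names mirror the slide, and the "simulation" is a closed-form exponential decay chosen only so the example runs on its own. It shows one model reused under a perturbed parameter, two tasks sharing one simulation setup, and data generators feeding a report.

```python
import math
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Model:
    id: str
    source: str                   # where the model lives, e.g. a database identifier
    parameters: Dict[str, float]  # parameter values, possibly perturbed


@dataclass
class Simulation:
    id: str
    end_time: float
    n_points: int                 # number of intervals between 0 and end_time


@dataclass
class Task:
    id: str
    model: Model
    simulation: Simulation


@dataclass
class DataGenerator:
    id: str
    task: Task
    compute: Callable[[float, Dict[str, float]], float]


def time_grid(simulation: Simulation) -> List[float]:
    step = simulation.end_time / simulation.n_points
    return [i * step for i in range(simulation.n_points + 1)]


def run_report(generators: List[DataGenerator]) -> Dict[str, List[float]]:
    """Evaluate each data generator on the time grid of its task."""
    report = {}
    for gen in generators:
        times = time_grid(gen.task.simulation)
        report[gen.id] = [gen.compute(t, gen.task.model.parameters) for t in times]
    return report


def decay(t: float, p: Dict[str, float]) -> float:
    # Stand-in for a real simulation: A(t) = A0 * exp(-k * t).
    return p["A0"] * math.exp(-p["k"] * t)


if __name__ == "__main__":
    base = Model("base", "hypothetical-source", {"A0": 10.0, "k": 0.5})
    fast = Model("fast", base.source, {**base.parameters, "k": 1.0})  # perturbation
    sim = Simulation("course", end_time=2.0, n_points=4)
    gens = [DataGenerator("A_base", Task("t1", base, sim), decay),
            DataGenerator("A_fast", Task("t2", fast, sim), decay)]
    print(run_report(gens))
```

Separating the model from the simulation setup and from the output description is what lets the same experiment be rerun in a different tool.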

Page 35: A new language for a new biology: How SBML and other tools are transforming models of life

Efforts like SED-ML improve reproducibility of publications

Waltemath et al., BMC Sys Bio 5, 2011.

Page 36: A new language for a new biology: How SBML and other tools are transforming models of life

Outline

Background and introduction

The Systems Biology Markup Language (SBML)

Complementary efforts: MIRIAM and SED-ML

COMBINE: the Computational Modeling in Biology Network

Conclusion

Page 38: A new language for a new biology: How SBML and other tools are transforming models of life

Need interoperable formats, but developing them is not easy

Need people with a diverse set of knowledge & skills

• Scientific needs

• Technical implementation skills

• Practical experience

Need to manage multiple phases of a standardization effort

• Creation

• Evolution

• Support

This is just for the specification of the standards, to say nothing of the necessary software and other infrastructure!

Page 39: A new language for a new biology: How SBML and other tools are transforming models of life

Motivations for the creation of COMBINE

Realizations about the state of affairs in the late 2000s:

• Many standardization efforts overlapped, but lacked coordination

• Efforts were inventing their own processes from scratch

• Many individual meetings meant more travel for many people

• Limited and fragile funding didn’t support a solid, coherent base

COMBINE = Computational Modeling in Biology Network

• Coordinate standards development

• Develop common procedures & tools (but not impose them!)

• Coordinate meetings

• Provide a recognized voice

Page 40: A new language for a new biology: How SBML and other tools are transforming models of life

Standardization efforts represented in COMBINE today

BioPAX

Qualifiers

GPML

COMBINE Standards

Associated Standardization Efforts

Related Standardization Efforts

Page 41: A new language for a new biology: How SBML and other tools are transforming models of life

COMBINE formats cover many types of models (figure from Nicolas Le Novère)

Page 42: A new language for a new biology: How SBML and other tools are transforming models of life

Examples of community organization

Two main annual meetings, plus ad hoc workshops

• COMBINE meeting: status updates, presentations, outreach

- Next COMBINE: Paris, Sep 16–20, 2013

• HARMONY: Hackathon on Resources for Modeling in Biology

- Software development, interoperability hacking

[Photos: COMBINE 2012, Toronto; COMBINE 2011, Heidelberg]

Page 43: A new language for a new biology: How SBML and other tools are transforming models of life

COMBINE is open to all—and COMBINE needs you!

http://co.mbine.org

Current coordinators:

• Nicolas Le Novère, Mike Hucka, Falk Schreiber, Gary Bader

Page 44: A new language for a new biology: How SBML and other tools are transforming models of life

Outline

Background and introduction

The Systems Biology Markup Language (SBML)

Complementary efforts: MIRIAM and SED-ML

COMBINE: the Computational Modeling in Biology Network

Conclusion

Page 45: A new language for a new biology: How SBML and other tools are transforming models of life

Some things we (maybe?) got right with SBML

Time it well

• Too early and too late are bad

Start with actual stakeholders

• Address real needs, not perceived ones

Start with small team of dedicated developers

• Can work faster, more focused; also avoids “designed-by-committee”

Engage people constantly, in many ways

• Electronic forums, email, electronic voting, surveys, hackathons

Make the results free and open-source

• Makes people comfortable knowing it will always be available

Be creative about seeking funding

Page 46: A new language for a new biology: How SBML and other tools are transforming models of life

Some things we certainly got wrong

Not waiting for implementations before freezing specifications

• Sometimes finalized specification before implementations tested it

- Especially bad when we failed to do a good job

‣ E.g., “forward thinking” features, or “elegant” designs

Not formalizing the development process sufficiently

• Especially early in the history, did not have a very open process

Not resolving intellectual property issues from the beginning

• Industrial users ask “who has the right to give any rights to this?”

Page 47: A new language for a new biology: How SBML and other tools are transforming models of life

This work was made possible thanks to a great community

Original PI’s: John C. Doyle, Hiroaki Kitano

SBML Team: Mike Hucka, Sarah Keating, Frank Bergmann, Lucian Smith, Andrew Finney, Herbert Sauro, Hamid Bolouri, Ben Bornstein, Bruce Shapiro, Akira Funahashi, Akiya Juraku, Ben Kovitz

SBML Editors: Mike Hucka, Nicolas Le Novère, Sarah Keating, Frank Bergmann, Lucian Smith, Chris Myers, Stefan Hoops, Sven Sahle, James Schaff, Darren Wilkinson

BioModels DB: Nicolas Le Novère, Henning Hermjakob, Camille Laibe, Chen Li, Lukas Endler, Nico Rodriguez, Marco Donizelli, Viji Chelliah, Mélanie Courtot, Harish Dharuri

And a huge thanks to many others in the COMBINE community

[Photo: Attendees at SBML 10th Anniversary Symposium, Edinburgh, 2010]

Page 48: A new language for a new biology: How SBML and other tools are transforming models of life

URLs

SBML http://sbml.org

BioModels Database http://biomodels.net/biomodels

MIRIAM http://biomodels.net/miriam

identifiers.org http://identifiers.org

SED-ML http://biomodels.net/sed-ml

SBO http://biomodels.net/sbo

SBGN http://sbgn.org

COMBINE http://co.mbine.org

Page 49: A new language for a new biology: How SBML and other tools are transforming models of life

I’d like your feedback!

You can use this anonymous form:

http://tinyurl.com/mhuckafeedback