http://sems.uni-rostock.de
Dagmar Waltemath, September 2015, Rostock-Warnemünde | dcite
Reproducibility of model-based results: Standards, infrastructure and recognition
What is a model?
Fig.: Modeling Cellular Reprogramming Using Network-based Models. Courtesy Antonio del Sol Mesa, LCSB Luxembourg
Fig.: Modeling the cell cycle using ODE systems. Goldbeter (1991), http://www.ncbi.nlm.nih.gov/pubmed/1833774
Fig.: Modeling large-scale networks. Lee et al (2013), http://www.nature.com/articles/srep02197.
In systems biology, a computational model represents biological facts in
the computer. Often, the representation is simulated to help understand
the system's dynamic behavior.
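As a minimal sketch of what "simulating a representation" can mean, the block below integrates a single hypothetical reaction S → P with mass-action kinetics using forward Euler. The reaction, the rate constant `k`, the step size and the function name are illustrative assumptions and are not taken from any of the models shown in the figures.

```python
def simulate_conversion(s0, p0, k, dt, steps):
    """Forward-Euler integration of dS/dt = -k*S and dP/dt = +k*S."""
    s, p = s0, p0
    trajectory = [(0.0, s, p)]
    for i in range(1, steps + 1):
        rate = k * s              # mass-action rate of S -> P
        s -= rate * dt            # substrate is consumed
        p += rate * dt            # product is formed at the same rate
        trajectory.append((i * dt, s, p))
    return trajectory


# Illustrative parameter values only.
trajectory = simulate_conversion(s0=10.0, p0=0.0, k=0.5, dt=0.01, steps=1000)
t_end, s_end, p_end = trajectory[-1]
print(f"t={t_end:.1f}  S={s_end:.3f}  P={p_end:.3f}")
```

Real studies would typically use an adaptive ODE solver from a simulation tool; the fixed-step loop is only meant to make the idea of a dynamic simulation concrete.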
Re[usea|produci]bility challenge (2003)
Slide courtesy Mike Hucka @ 2012 Computational Cell Biology Summer School
“With greater interaction between tools, and a common format for publications and databases, users would be better able to spend more time on actual research rather than on struggling with data format issues.” (SBML L1)
→ Standardised model representation
Ron Henkel et al. Database 2015;2015:bau130
Re[usea|produci]bility challenge (2010)
Fig.: Nature Blogs: Of Schemes and Dreams (2014)
Nine Worrying Stats on the Effect of Poor Scientific Data Management
Vijayalakshmi Chelliah et al. Nucl. Acids Res. 2015;43:D542-D548
Finding relevant models.
→ Strategies for model similarity, ranking, clustering, filtering
Fig.: Henkel et al 2010 http://www.biomedcentral.com/1471-2105/11/423/
Fig.: Schulz et al 2011 DOI: 10.1038/msb.2011.41
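One simple way to picture similarity-based ranking is to compare the sets of ontology terms attached to each model. The sketch below ranks models by Jaccard similarity to a query term set. This is a deliberately simple stand-in and not the ranking or similarity method of the cited papers; the model names and their term sets are hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity of two collections, treated as sets."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def rank_models(query_annotations, models):
    """Return (score, model name) pairs, best match first."""
    scored = [(jaccard(query_annotations, annotations), name)
              for name, annotations in models.items()]
    # Sort by descending score, then by name for a stable order.
    return sorted(scored, key=lambda pair: (-pair[0], pair[1]))


# Hypothetical models annotated with Gene Ontology term identifiers.
models = {
    "model_a": {"GO:0007049", "GO:0000278"},
    "model_b": {"GO:0006096"},
    "model_c": {"GO:0007049"},
}
ranking = rank_models({"GO:0007049", "GO:0000278"}, models)
for score, name in ranking:
    print(f"{name}: {score:.2f}")
```

The same scores could also feed clustering or filtering, for example by keeping only models above a chosen threshold.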
Fig.: CellCycle Models. Alm et al (2014) doi:10.1186/s13326-015-0014-4
Re[usea|produci]bility challenge (2012)
Reproducing published models.
→ Standardised simulation descriptions
Fig.: Waltemath et al (2011) doi:10.1186/1752-0509-5-198
Re[usea|produci]bility challenge (2014)
Model-related data in the systems biology workflow
Linking the relevant files.
→ Retrieval and archiving of simulation studies and associated files
Linking model-related data
Give me all the files I need to run this simulation study.
Which are the most frequently used GO annotations in my model set?
Which models contain reactions with 'ATP' as reactant and 'ADP' as product?
Find good candidates for features describing my set of models.
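To make one of these queries concrete, here is a minimal sketch that answers the 'ATP' as reactant and 'ADP' as product question for a single SBML file using only the Python standard library. It matches on the `name` attribute of species, which is a simplifying assumption: a robust search would rely on annotations rather than names. The toy document is a stripped-down fragment, not a complete valid SBML model, and the function name is hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace of SBML Level 3 Version 1 Core.
SBML_NS = {"sbml": "http://www.sbml.org/sbml/level3/version1/core"}


def reactions_converting(sbml_text, reactant_name, product_name):
    """Return ids of reactions that consume reactant_name and produce product_name."""
    root = ET.fromstring(sbml_text)
    # Map species ids to their display names (fall back to the id).
    names = {sp.get("id"): sp.get("name", sp.get("id"))
             for sp in root.iterfind(".//sbml:listOfSpecies/sbml:species", SBML_NS)}
    hits = []
    for rxn in root.iterfind(".//sbml:listOfReactions/sbml:reaction", SBML_NS):
        reactants = {names.get(ref.get("species")) for ref in
                     rxn.iterfind("sbml:listOfReactants/sbml:speciesReference", SBML_NS)}
        products = {names.get(ref.get("species")) for ref in
                    rxn.iterfind("sbml:listOfProducts/sbml:speciesReference", SBML_NS)}
        if reactant_name in reactants and product_name in products:
            hits.append(rxn.get("id"))
    return hits


TOY_MODEL = """
<sbml xmlns="http://www.sbml.org/sbml/level3/version1/core" level="3" version="1">
  <model id="toy">
    <listOfSpecies>
      <species id="s_atp" name="ATP"/>
      <species id="s_adp" name="ADP"/>
      <species id="s_glc" name="glucose"/>
      <species id="s_g6p" name="glucose 6-phosphate"/>
    </listOfSpecies>
    <listOfReactions>
      <reaction id="hexokinase">
        <listOfReactants>
          <speciesReference species="s_glc"/>
          <speciesReference species="s_atp"/>
        </listOfReactants>
        <listOfProducts>
          <speciesReference species="s_g6p"/>
          <speciesReference species="s_adp"/>
        </listOfProducts>
      </reaction>
      <reaction id="other_reaction">
        <listOfReactants><speciesReference species="s_adp"/></listOfReactants>
        <listOfProducts><speciesReference species="s_atp"/></listOfProducts>
      </reaction>
    </listOfReactions>
  </model>
</sbml>
"""

matches = reactions_converting(TOY_MODEL, "ATP", "ADP")
print(matches)
```

Running the same function over every file in a model collection would give a naive version of the cross-model query; a repository would normally answer it from an index instead.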
State of affairs in 2015
● Standards:
– support for all steps of the modeling cycle
– support of various modeling techniques
– Still: some modeling concepts are not yet covered (→ Report of the Whole-Cell Modeling workshop, Waltemath et al 2015 (under review))
● Infrastructures:
– Software tools export/import standards
– Open model repositories and management systems
– Education
● Recognition
COMBINE Standards
● COmputational Modeling in BIology Network
● Goals:
– Avoid overlap of standardisation efforts
– Coordinate standard developments
– Coordinate meetings
– Coordinate development of procedures & tools
– Common infrastructure for specification development, semantic annotation, and dissemination
● All specifications now citable and accessible in one place: Schreiber et al. (2015) http://journal.imbio.de/articles/pdf/jib-258.pdf
COMBINE Standards
Fig. : COMBINE standards today. Slide courtesy M. Hucka. http://www.slideshare.net/thehuck/a-summary-of-various-combine-standardization-activities
COMBINE Standards
● Data formats
– Community-developed representation formats for models and related data
– Format: XML, OWL, RDF/XML
● Minimum Information/Reporting guidelines:
– Minimum amount of data and information required to reproduce and interpret an experiment
– Format: human-readable specification documents
● Basis for the specification of data models and metadata
● Bio-ontologies
SBML
Fig.: SBML Level 3 Packages. Slide courtesy M. Hucka (ICSB 2014).
Lucky modelers: you should not need to worry about the details of these (XML) formats; the tools should handle import and export! (Tool developers should, though.)
Minimum Information Guidelines
● Reporting guidelines and checklists
● Narrative description of the information necessary to reproduce a model-based result
● MIRIAM: Minimum Information Requested in the Annotation of Models
● MIASE: Minimum Information about a Simulation Experiment
● MIAPE, MIAME… for experimental setups
MIRIAM – information to provide about a model
● Models must
– be encoded in a public machine-readable format
– be clearly linked to a single publication
– reflect the structure of the biological processes described in the reference paper (list of reactions, …)
– be instantiable in a simulation (possess initial conditions, …)
– be able to reproduce the results given in the reference paper
– contain creator’s contact details
– unambiguously identify each model constituent through annotation
You should worry about the details of the guidelines, as they help you to check whether you provide all necessary information.
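Two of the checklist items lend themselves to a quick automated self-check: initial conditions and annotation of each constituent. The sketch below covers only those two items for species, and only approximately. It assumes initial values are given as attributes, although SBML also allows them to be set by initial assignments or rules. It is not an official MIRIAM validator, and the function name is hypothetical.

```python
import xml.etree.ElementTree as ET

SBML_NS = {"sbml": "http://www.sbml.org/sbml/level3/version1/core"}


def check_species(sbml_text):
    """Return a list of human-readable problems found for the species of a model."""
    root = ET.fromstring(sbml_text)
    problems = []
    for species in root.iterfind(".//sbml:listOfSpecies/sbml:species", SBML_NS):
        sid = species.get("id")
        has_initial = (species.get("initialAmount") is not None
                       or species.get("initialConcentration") is not None)
        if not has_initial:
            problems.append(f"species '{sid}' has no initial value")
        if species.find("sbml:annotation", SBML_NS) is None:
            problems.append(f"species '{sid}' has no annotation")
    return problems


# Stripped-down fragment for illustration; not a complete valid SBML model.
EXAMPLE = """
<sbml xmlns="http://www.sbml.org/sbml/level3/version1/core" level="3" version="1">
  <model id="toy">
    <listOfSpecies>
      <species id="s1" initialConcentration="1.0"><annotation/></species>
      <species id="s2"/>
    </listOfSpecies>
  </model>
</sbml>
"""

problems = check_species(EXAMPLE)
for problem in problems:
    print(problem)
```

The remaining items, such as the link to a publication or reproduction of the published results, still need a human reading of the guidelines.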
Bio-ontologies for model annotation
● Major ontologies
● Linking framework: RDF/XML
● Annotation scheme: used to semantically enrich model files with detailed descriptions of the underlying biological entities, mathematical concepts or algorithms used during analysis
● De facto standard: SBML annotation scheme
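For orientation, the sketch below builds the kind of RDF/XML fragment that the SBML annotation scheme uses to link a model element to an external resource. It uses the RDF namespace and the BioModels.net biology qualifiers namespace. The `metaid` value and the function name are hypothetical, and the fragment would still need to be wrapped in the element's `annotation` child by a real tool.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
BQBIOL_NS = "http://biomodels.net/biology-qualifiers/"

# Use the conventional prefixes when serialising.
ET.register_namespace("rdf", RDF_NS)
ET.register_namespace("bqbiol", BQBIOL_NS)


def build_rdf_annotation(metaid, resources, qualifier="is"):
    """Build an rdf:RDF element stating '<metaid> <qualifier> <resources>'."""
    rdf = ET.Element(f"{{{RDF_NS}}}RDF")
    description = ET.SubElement(
        rdf, f"{{{RDF_NS}}}Description", {f"{{{RDF_NS}}}about": f"#{metaid}"})
    relation = ET.SubElement(description, f"{{{BQBIOL_NS}}}{qualifier}")
    bag = ET.SubElement(relation, f"{{{RDF_NS}}}Bag")
    for uri in resources:
        ET.SubElement(bag, f"{{{RDF_NS}}}li", {f"{{{RDF_NS}}}resource": uri})
    return rdf


# "meta_species_1" is a hypothetical metaid of some model element.
annotation = build_rdf_annotation("meta_species_1", ["urn:miriam:uniprot:P00439"])
xml_text = ET.tostring(annotation, encoding="unicode")
print(xml_text)
```

In practice modeling tools and libraries write these blocks for you; the point here is only to show what "semantic enrichment" looks like inside the file.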
Fig.: SBO annotation of an enzymatic reaction: enzyme (urn:miriam:SBO:0000014), substrate (urn:miriam:SBO:0000015), product (urn:miriam:SBO:0000011), catalytic rate constant (urn:miriam:SBO:0000025), enzymatic rate law.
Fig.: Annotated reaction involving Tyrosine, Phenylalanine-4-hydroxylase and Tetrahydrobiopterin. Annotations shown: urn:miriam:uniprot:P00439, urn:miriam:uniprot:Q03393, urn:miriam:uniprot:P07101.
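The identifiers on these slides follow the pattern `urn:miriam:<collection>:<identifier>`. A minimal sketch for splitting such a URN into its parts is shown below. The function and tuple names are hypothetical, and the percent-decoding step is included because identifiers inside MIRIAM URNs may be percent-encoded.

```python
from collections import namedtuple
from urllib.parse import unquote

MiriamRef = namedtuple("MiriamRef", ["collection", "identifier"])


def parse_miriam_urn(urn):
    """Split 'urn:miriam:<collection>:<identifier>' into its two parts."""
    parts = urn.split(":", 3)
    well_formed = (len(parts) == 4 and parts[0] == "urn"
                   and parts[1] == "miriam" and parts[2] and parts[3])
    if not well_formed:
        raise ValueError(f"not a MIRIAM URN: {urn!r}")
    # Decode percent-encoded characters in the identifier, if any.
    return MiriamRef(parts[2], unquote(parts[3]))


protein = parse_miriam_urn("urn:miriam:uniprot:P00439")
term = parse_miriam_urn("urn:miriam:SBO:0000011")
print(protein)
print(term)
```

Grouping parsed references by `collection` is one way to count which ontologies or databases a model set relies on.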
Levels of standardisation
Fig.: COMBINE standards that are relevant to this workshop; adapted from (Chelliah et al., 2009, DILS)
Software tool support
● Standard converters (SBML ↔ SBGN; SBML ↔ CellML...)
● Standard support in software
● Interoperability tools
– Cytoscape for network analysis and visualization (SBML, SBGN, BioPAX)
– The Virtual Cell for modeling (SBML, BioPAX)
– VANTED for network analysis, visualization and manipulation (SBML, SBGN)
Check the COMBINE website for details
Software tool support in SBML
Fig.: Software supporting SBML. Slide courtesy M. Hucka (ICSB 2014).
Also check the SBML Software Matrix
Open model repositories
● Structured, type-specific archives
● Offer download of curated, annotated, published models and associated files (visual representations, simulation descriptions, publication…)
CCDB
Model management systems
Fig.: The SEEK. Wolstencroft et al (2015). doi:10.1186/s12918-015-0174-y
Model management tasks:
● Storage & Integration of data
● Search & Retrieval
● Version Control
● Provenance
Getting involved
● COMBINE user meeting → next: COMBINE 2015, Oct 11-16, Salt Lake City
● COMBINE developers meeting → next: HARMONY 2016, June 7-11, Auckland
● FAIR-DOM activities: webinars, blogs, foundries
● COMBINE activities: workshops, presentations, tutorials
● Help through specification documents, show cases, mailing lists, ...
http://co.mbine.org/ http://fair-dom.org/
Recognition
1) Higher visibility of research
2) Long-term availability
3) Link to other resources
4) Quality-checks
Fig.: Piwowar and Vision (2013) Data reuse and the open data citation advantage. PeerJ
Model curation and publication in BioModels Database
Fig.: Li et al (2010)
Functional curation of models through virtual experiments
Fig.: Functional curation of models in the Web Lab. Cooper et al (2015) https://peerj.com/preprints/1338/ ; Cooper et al (2014) doi:10.1016/j.pbiomolbio.2014.10.001
Try out the Cardiac physiology Web Lab
Enabling model version control
Fig.: courtesy Martin Scharm, BudHat
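To give a feel for what version control of models has to detect, the sketch below compares two versions of an SBML file by the ids of their species and reactions. It is a naive set difference, not the method behind BudHat: it ignores changes to math, parameters, annotations and renamed elements. The helper `toy_model` and both function names are hypothetical.

```python
import xml.etree.ElementTree as ET

SBML_NS = {"sbml": "http://www.sbml.org/sbml/level3/version1/core"}


def element_ids(sbml_text, tag):
    """Collect the ids of all elements with the given SBML tag name."""
    root = ET.fromstring(sbml_text)
    return {el.get("id") for el in root.iterfind(f".//sbml:{tag}", SBML_NS)}


def diff_versions(old_text, new_text):
    """Report species and reactions that were added or removed between versions."""
    report = {}
    for tag in ("species", "reaction"):
        old_ids = element_ids(old_text, tag)
        new_ids = element_ids(new_text, tag)
        report[tag] = {"added": sorted(new_ids - old_ids),
                       "removed": sorted(old_ids - new_ids)}
    return report


def toy_model(species_ids):
    """Build a stripped-down SBML fragment (illustration only, not a valid model)."""
    species = "".join(f'<species id="{sid}"/>' for sid in species_ids)
    return ('<sbml xmlns="http://www.sbml.org/sbml/level3/version1/core" '
            'level="3" version="1">'
            f'<model id="toy"><listOfSpecies>{species}</listOfSpecies></model></sbml>')


changes = diff_versions(toy_model(["A", "B"]), toy_model(["A", "C"]))
print(changes)
```

A dedicated model-diff tool would additionally map renamed elements onto each other and describe changes to the equations.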
Enabling on-the-fly reproduction of the model-based results
Fig.: Software supporting SBML and SED-ML. Waltemath et al (2011). doi:10.1186/1752-0509-5-198
So much for the theory… and in practice?
● Check for existing standards and specifications thereof: http://co.mbine.org
● Get involved in standard development → through the relevant mailing lists
● Problems with getting your model into the right format?
– Is it a problem with finding the appropriate format or tool? → Ask on the relevant mailing list... people are friendly and happy to help.
– Is it a tool problem? → Complain to the tool developers... who will hopefully change it.
– Is it a problem with the lack of a standard? → Feed back into the standards community… people are friendly and happy to improve the standard.
● Follow best practices when aiming at publishing a result.
Best practices for publishing reproducible modeling results
1) Encode the model in a standard format, e.g. SBML.
2) Annotate the SBML model, following MIRIAM.
3) Publish the simulation experiment descriptions in standard format, e.g. SED-ML. If unsure what to include, consult the MIASE guidelines.
4) Try to reproduce the results *yourself*.
5) Ask a colleague to reproduce the results.
6) If successful: Archive all steps that led to your results.
7) Disseminate model code and simulation description through an open repository.
Adapted from: Waltemath et al (2013), doi:10.1007/978-94-007-6803-1_10
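Step 6 can be approximated with very little tooling. The sketch below bundles a model file and a simulation description into a plain zip file and adds a SHA-256 checksum listing, so that a colleague can verify they received exactly the files you used. This is a generic zip under assumed file names (`model.xml`, `experiment.sedml`, `CHECKSUMS.sha256`), not a specific community archive format, and the function name is hypothetical.

```python
import hashlib
import tempfile
import zipfile
from pathlib import Path


def archive_study(files, archive_path):
    """Zip the given files together with a SHA-256 checksum listing."""
    checksum_lines = []
    with zipfile.ZipFile(archive_path, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for path in map(Path, files):
            data = path.read_bytes()
            zf.writestr(path.name, data)
            checksum_lines.append(f"{hashlib.sha256(data).hexdigest()}  {path.name}")
        zf.writestr("CHECKSUMS.sha256", "\n".join(checksum_lines) + "\n")
    return checksum_lines


# Demonstration with placeholder file contents in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    tmp_dir = Path(tmp)
    model = tmp_dir / "model.xml"
    experiment = tmp_dir / "experiment.sedml"
    model.write_text("<sbml/>")
    experiment.write_text("<sedML/>")
    archive = tmp_dir / "study.zip"
    checksums = archive_study([model, experiment], archive)
    with zipfile.ZipFile(archive) as zf:
        archived_names = zf.namelist()

print(archived_names)
```

For dissemination through an open repository, check which archive format and metadata the repository itself expects before uploading.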