FAIR Data, Operations and Model management for Systems Biology and Systems Medicine Projects
Prof Carole Goble
The FAIRDOM [email protected]
http://fair-dom.org, http://fairdomhub.org
1st Conference of the European Association of Systems Medicine, 26-28 October 2016, Berlin
Asset Management and Sharing
• Access to publicly funded research
• Reproducible results
• Value and cite all research outcomes
• Sustained data infrastructure
Findable
Accessible
Interoperable
Reusable
(Intelligible)
(Reproducible)
(Citable)
(Trackable)
https://www.force11.org/group/fairgroup/fairprinciples
Projects ... and Programmes ...
funder and research project legacy
P1. BaCell-SysMO: The transition from growing to non-growing Bacillus subtilis cells - A systems biology approach
P2. COSMIC: Systems Biology of Clostridium acetobutylicum - a possible answer to dwindling crude oil reserves
P3. SUMO: Systems Understanding of Microbial Oxygen Responses Escherichia coli
P4. KOSMOBAC: Ion and solute homeostasis in enteric bacteria Escherichia coli
P5. SysMO-LAB: Comparative Systems Biology: Lactic Acid Bacteria: Lactococcus lactis, Enterococcus faecalis, Streptococcus pyogenes
P6. PSYSMO: Systems analysis of biotech induced stresses: towards a quantum increase in process performance in the cell factory Pseudomonas putida
P7. SCaRAB: Systems Biology of a genetically engineered Pseudomonas fluorescens with inducible exo-polysaccharide production: analysis of the dynamics and robustness of metabolic networks
P8. MOSES: MicroOrganism Systems Biology: Energy and Saccharomyces cerevisiae
P9. TRANSLUCENT: Gene interaction networks and models of cation homeostasis in Saccharomyces cerevisiae
P10. STREAM: Global metabolic switching in Streptomyces coelicolor
P11. SulfoSYS: Silicon cell model for the central carbohydrate metabolism of the archaeon Sulfolobus solfataricus under temperature variation
P12. SysMO-DB: Data management group
• Reuse
• Compliance
• Retention
• Dissemination
• Collaboration
• Reproducibility
• Resource & Skills Limitations
Findable, Accessible, Interoperable, Reusable
Data, Operations, Models
Sponsors
FAIRDOM Association e.V. Partners
open innovation, not for profit
LifeGlimmer GmbH, SB-ScienceManagement GmbH, New Forest Ventures Ltd
FAIRDOM Pillars
Project Support
Community Actions
Platforms, Tools
Public Project Commons
Systems Approach…
people, assets, processes pragmatics
• Multiple, interrelated assets
• Multiple, dispersed repositories
• Multi-partner, -discipline projects
• Team science practices
• Experiment – Asset lifecycles
• Academic innovation drivers
Multiple, interrelated assets
structured formats, standards, ontologies context
Analytics & Pipelines
Literature
SBML, CellML, PharmML
Matlab, Mathematica
Fortran, R, Python
SOPs
Multiple omics: genomics, transcriptomics, proteomics, metabolomics, fluxomics, reactomics
Images
Reaction kinetics
Samples, Specimens, Strains
Human data
STANDARDS
versioning, tracking: provenance, parameters, citation
Operations
Data
SOPs
Models
FAIR Data and Metadata Standards that help to improve understanding and exchange….
Nicolas Le Novère, Babraham Institute, UK.
…researchers do not always use them....
Format Metadata
Metadata Ontologies
*top three most popular
The evolution of standards and data management practices in systems biology (2015). Stanford et al, Molecular Systems Biology, 11(12):851
… makes model reuse tricky…
Stanford et al The evolution of standards and data management practices in systems biology, Molecular Systems Biology (2015) 11: 851 DOI 10.15252/msb.20156053
Specialist Public Repositories
General archives
Multi Repository Repertoire
Access, Reuse
Local Data Stores
sharing/publishing assets in public archives…
Data Models
*top three most popular
Multi-partner, multi-disciplinary projects
where sharing and metadata collection isn’t second nature
Consortia
Grp 1
Grp 2
Grp 3
Multi-partner, multi-disciplinary projects
SOPs and Yellow Pages top request…
Who is working with which organism?
What methods are being used to determine enzyme activity?
Under which experimental conditions are my partners working for the measurement of glucose concentration?
What is the provenance of the parameters for this version of the model?
What SOP was used for this sample?
Where is the validation data for this model?
Is there any group generating kinetic data?
Is this data available?
Track versions of my model
What's the relationship between the data and model?
Which data belong to which publications?
Downstream assets discovery and sharing
Organisation Communication Dissemination
Navigate through assets
Reuse later
Enable team to reuse/reproduce
Help others find out
Reuse with new partners
Tell more, take credit
Standardised metadata practices
Assets
Find… own hard disk for storage…
Samiul Hasan, GSK
Biocuration need in Pharma: Drivers from a Translational Bioinformatics Perspective, Poster S16
The FAIR Project Challenge
Track collection of data and metadata X X
Maintain the experimental context X
Find and exchange assets X X
Retain results beyond a project X X
Share, disseminate and publish assets sensitively X X X
Consistent reporting for interpretation, interoperability & comparison X X
Promote standardised metadata practices X X
Organise and link assets X X
Reuse tools and community archives X
Respect local and legacy solutions X X
Support reproducible publications X X X X
Credit owners X X
Community, Knowledge Hub
http://www.fair-dom.org
Know-how, Guides, Templates, Workshops, Training, Webinars, Standards and Policy Forums
Project Support
Processes, Practices, People… take time and persuasion
Community support
Special project support
Special project support
PALs project ambassadors
best practices, forums, training
curation handholding
SBML model technical curation
Asset Management Platforms
an ecosystem of resources
Front end
Web based rich interface
Catalogue and Commons
All about the metadata
Results repository
http://seek4science.org
Back end
Scaled LIMS and analytics
Auto-archiving
Instruments data repository
https://sis.id.ethz.ch/software/openbis.html
A community Commons….self managed workspaces
Controlled sharing and publishing
• Licenses
• Negotiated access
• Embargos
• Permission controls
• Staged sharing
• Private walled gardens
FAIR Play Practices
Using FAIRDOM my own lab colleagues saw what I was doing and called to collaborate!
Jurgen Hannstra, Vrije Universiteit Amsterdam, Netherlands
Investigation
Study Analysis
Data
Model
SOP(Assay)
….organised in an ISA (Investigation, Study, Assay/Analysis) format.
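The ISA nesting on this slide can be sketched as a small data structure. This is only an illustration under stated assumptions: the class names, field names and example titles below are hypothetical and are not taken from FAIRDOM-SEEK or any ISA tooling.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Asset:
    kind: str   # "data", "model" or "sop" (labels chosen for this sketch)
    title: str


@dataclass
class Assay:
    title: str
    assay_type: str  # "experimental" or "modelling"
    assets: List[Asset] = field(default_factory=list)


@dataclass
class Study:
    title: str
    assays: List[Assay] = field(default_factory=list)


@dataclass
class Investigation:
    title: str
    studies: List[Study] = field(default_factory=list)

    def assets_of_kind(self, kind: str) -> List[Asset]:
        # Walk Investigation -> Study -> Assay and collect matching assets.
        return [
            asset
            for study in self.studies
            for assay in study.assays
            for asset in assay.assets
            if asset.kind == kind
        ]


# Hypothetical example content, for illustration only.
investigation = Investigation(
    title="Example investigation",
    studies=[
        Study(
            title="Example study",
            assays=[
                Assay(
                    "Enzyme activity measurement",
                    "experimental",
                    [Asset("data", "Kinetics spreadsheet"), Asset("sop", "Assay protocol")],
                ),
                Assay(
                    "Kinetic model fit",
                    "modelling",
                    [Asset("model", "SBML model"), Asset("data", "Kinetics spreadsheet")],
                ),
            ],
        )
    ],
)

model_titles = [a.title for a in investigation.assets_of_kind("model")]
```

Because every asset hangs off an assay, and every assay off a study, a question such as "which data belong to this model?" becomes a walk over the tree rather than a search of someone's hard disk.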
Linking, “Packaging” & Citing Codes, Data, Models, SOPs, Samples, Strains, Articles, People, Projects….
Packaging
Retaining Context
Supporting Decision making
INVESTIGATION, STUDY, ASSAY
Experimental assay
Modeling assay
Publication
[Maksim Zakhartsev]
... a “Research Object” Catalogue
metadata aggregated across repositories
retaining context to support decision making and reuse
Local Stores
ExternalDatabases
Publishing services
Secure Stores
Model Resources
… with integrated tooling
metadata annotation against standards
model validation, comparison and simulation
SBML Model simulation
Model comparison
Model versioning
Reproducing simulations
[Jacky Snoep, Dagmar Waltemath, Martin Peters, Martin Scharm]
Retaining context, supporting decision making
Towards data harmonisation and indexing
[Susanna Sansone]
Stealthy Ramps for helping with Metadata Standards
Tooling for annotations and templates for different types of assay data, towards data harmonisation. Incentive by side effect.
Embed ontologies into Excel templates
Excel spreadsheets enriched with ontology annotations
Upload, extract metadata and register
http://www.rightfield.org.uk
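The "upload, extract metadata and register" step can be illustrated with a minimal sketch. It does not use RightField or its file format: a CSV string stands in for the spreadsheet, and the vocabulary, column names and `EX:` term identifiers are all hypothetical placeholders rather than real ontology terms.

```python
import csv
import io

# Hypothetical controlled vocabulary: column name -> {allowed label: term id}.
# The "EX:" identifiers are placeholders, not real ontology terms.
VOCAB = {
    "organism": {"Escherichia coli": "EX:0001", "Bacillus subtilis": "EX:0002"},
    "assay_type": {"metabolomics": "EX:0101", "proteomics": "EX:0102"},
}


def extract_metadata(sheet_text):
    """Annotate each row with term ids and report values outside the vocabulary."""
    annotated, problems = [], []
    reader = csv.DictReader(io.StringIO(sheet_text))
    # Data rows start at spreadsheet row 2 (row 1 is the header).
    for row_no, row in enumerate(reader, start=2):
        record = dict(row)
        for column, terms in VOCAB.items():
            value = row.get(column, "")
            if value in terms:
                record[column + "_term"] = terms[value]
            else:
                problems.append((row_no, column, value))
        annotated.append(record)
    return annotated, problems


# Stand-in for an uploaded template; the second row uses a free-text organism name.
SHEET = """sample,organism,assay_type
s1,Escherichia coli,metabolomics
s2,E. coli,proteomics
"""

records, problems = extract_metadata(SHEET)
```

When the vocabulary is embedded in the template itself, as the slide describes, the free-text value in the second row would not be entered in the first place; the check here only shows what the registration side gains from constrained cells.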
Exchange and Publishing
Supplementary information
Annotation file
Stoichiometric matrix
SBML Stationary fluxes
[Maksim Zakhartsev]
https://doi.org/10.15490/seek.1.investigation.56
Penkler et al (2015) FEBSJ 282:1481-1511.
Reproducible Exchange and Publishing
and better credit
reviewer
Author List: Joe Bloggs; Jane Doe
Title: My Investigation
Date: September 2016
DOI: https://doi.org/10.15490/seek##
information travels with the data and models
FAIRDOM-SEEK local or public commons
*Troup, E.; Clark, I; Swain, P; Millar, AJ; Zielinski, T (2015) Practical evaluation of SEEK and openBIS for biological data management in SynthSys http://hdl.handle.net/1842/12236
FAIRDOMHub.org
Vrije Universiteit
Yellow Pages
IMOMESIC pathway
Integrating Modelling of Metabolism and Signalling towards an Application in Liver Cancer
https://fairdomhub.org/projects/24
[Ursula Klingmüller, Martin Böhm]
What about FAIR Systems Medicine?
Olaf Wolkenhauer et al, Enabling multiscale modeling in systems medicine, Genome Medicine 2014 6:21*
1. Samples
2. Access to sensitive data
3. Multi-models
*DOI: 10.1186/gm538, http://genomemedicine.com/content/6/3/21
Samples metadata framework
BBMRI, ELIXIR, Biosamples, FAIRDOM, UKCRC Tissue Directory, UK Synthetic Biology Centres
User defined sample models
Interlinking between sample types
Sample type defines a sharable standard
Template tooling
Auto extraction
Tied to assay processes
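A user-defined sample type that acts as a small sharable standard, with links between sample types, might look like the sketch below. The classes, the `validate` helper and the example "strain" and "culture" types are hypothetical and are not the FAIRDOM samples framework itself.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class SampleType:
    name: str
    required: List[str]                # attributes every sample of this type must carry
    parent_type: Optional[str] = None  # name of the sample type it must derive from


@dataclass
class Sample:
    sample_id: str
    sample_type: SampleType
    attributes: Dict[str, str]
    derived_from: Optional["Sample"] = None


def validate(sample: Sample) -> List[str]:
    """Return the list of problems; an empty list means the sample conforms."""
    issues = [a for a in sample.sample_type.required if a not in sample.attributes]
    expected_parent = sample.sample_type.parent_type
    if expected_parent is not None:
        parent = sample.derived_from
        if parent is None or parent.sample_type.name != expected_parent:
            issues.append("derived_from:" + expected_parent)
    return issues


# Hypothetical sample types: a culture must be linked to the strain it was grown from.
strain_type = SampleType("strain", required=["organism", "genotype"])
culture_type = SampleType("culture", required=["medium", "temperature"], parent_type="strain")

strain = Sample("st-1", strain_type, {"organism": "Escherichia coli", "genotype": "wild type"})
good_culture = Sample("cu-1", culture_type, {"medium": "M9", "temperature": "37"}, derived_from=strain)
bad_culture = Sample("cu-2", culture_type, {"medium": "M9"})

good_issues = validate(good_culture)
bad_issues = validate(bad_culture)
```

Publishing the type definition alongside the samples is what lets a partner group check its own records against the same rules.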
FAIR Sensitive Data, certified repositories
walled gardens and registration flags for Catalogue
legal restrictions for sharing anonymised and non-anonymised data
Open Data
Register metadata
Upload data
Register link
Register access method
Register metadata
Register access method
Local AAI service
Register metadata
Closed Data
Closed Data
Model Laissez-Faire
• Navigation between
• Single standards at 1 scale
• Multi-model hosting
Linking models…
• connecting (experimental/simulation) data to models
• connecting the single standards?
• interfacing between the different scales?
In summary…
Pragmatic FAIR support for projects
people, assets, processes
• Multiple, interrelated assets
• Multiple, dispersed repositories
• Multi-partner, -discipline projects
• Multiple community tools
• Team science practices
• Experiment – Asset lifecycles
• Academic innovation drivers
ISA structured
“Research Objects”
Repository spanning catalogue
metadata
Standards-based tools
Challenges to FAIR Asset Management
Free Puppies
FAIR Play microscopes -> data scopes, sharing citizenship, incentives by side effects
PI leadership
Sticking to conventions
Local responsibility
Time and resource
Curation recognition
Trust
• Tribal trading behaviours
• Enclave sharing
• Not public donation
• Reciprocity & credit
Drivers …
• External dominate
• Personal productivity
affecting behavioural change through libertarian paternalism
[Kristian Garza]
Jon Olav Vik, Norwegian University of Life Science
Maksim Zakhartsev, University Hohenheim, Stuttgart, Germany
Alexey Kolodkin, Siberian Branch, Russian Academy of Sciences
Tomasz Zieliński, SynthSys Centre, University Edinburgh, UK
Martin Peters, Martin Scharm, Systems Biology Bioinformatics, University of Rostock, Germany
3rd Foundry meeting, Dec 1-2 2016
Frankfurt
Developers Foundry
Support developers of Systems Biology tools and platforms