E-BIOGENOUEST: A REGIONAL LIFE SCIENCES INITIATIVE FOR DATA INTEGRATION Datacite Annual Conference 2014 - Nancy Olivier Collin – IRISA/INRIA [email protected] http://www.genouest.org

Page 1: E- Biogenouest : a regional Life Sciences initiative for data  integration

E-BIOGENOUEST: A REGIONAL LIFE SCIENCES INITIATIVE FOR DATA INTEGRATION

Datacite Annual Conference 2014 - Nancy

Olivier Collin – IRISA/INRIA

[email protected]

http://www.genouest.org

Page 2: E- Biogenouest : a regional Life Sciences initiative for data  integration

Agenda

• Context
• Biogenouest
• Biology
• The e-biogenouest project
• “Bridging data, metadata and computation”
• A system of systems: collaborative portal, metadata management environment, data analysis portal

Page 3: E- Biogenouest : a regional Life Sciences initiative for data  integration

Biogenouest

Biogenouest is a network bringing together technological core facilities dedicated to Life and Environmental Sciences in the West of France

Page 4: E- Biogenouest : a regional Life Sciences initiative for data  integration

Biogenouest

Created in 2002, Biogenouest coordinates 31 technological core facilities based in the regions of Brittany and Pays de la Loire, with the aim of organizing and pooling interregional resources.

Biogenouest also federates 70 research units involved in thematic research covering 4 areas of activity: Marine resources, Agri-food, Health and Bioinformatics.

Page 5: E- Biogenouest : a regional Life Sciences initiative for data  integration

GenOuest: Bioinformatics core facility

• Member of the Biogenouest network
• Member of the IFB: French Bioinformatics Institute
• National recognition: IBiSA platform
• Regional strategic facility for INRA (National Institute of Agronomical Research)
• ISO9001:2008 certified
• Established in 2002
• 10 to 12 people
• Computing infrastructure, storage, software development, expertise, R&D projects

Page 6: E- Biogenouest : a regional Life Sciences initiative for data  integration

[Figure: R&D projects diagram. Labels: Computation, Data, Workflows, Portals, Collaboration, Grid, Cloud, Cluster, BioMAJ, SeqCrawler, MetaData, EMME, HubZero, Galaxy, Mobyle, Ontologies, Biosciences, Mobyle2]

Page 7: E- Biogenouest : a regional Life Sciences initiative for data  integration

[Figure: the R&D projects diagram from the previous slide, with the label E-Biogenouest added]

Page 8: E- Biogenouest : a regional Life Sciences initiative for data  integration
Page 9: E- Biogenouest : a regional Life Sciences initiative for data  integration

Context

Kahn. On the future of genomic data. Science (2011) vol. 331 (6018) pp. 728-9

Now : Genomics : Next Generation Sequencing

Next : Proteomics

Next : Bio-imaging

Digital data: huge amounts, heterogeneous

Critical situation for some laboratories

Page 10: E- Biogenouest : a regional Life Sciences initiative for data  integration

E-BIOGENOUEST

Page 11: E- Biogenouest : a regional Life Sciences initiative for data  integration

E-Biogenouest

• Started in May 2012 for 3 years
• Funded by Brittany and Pays de la Loire
• E-science initiative for the Biogenouest network
• Community building
• Training/workshops
• Roadmap preparation
• Experimentation/Pilot project: Virtual Research Environment (VRE)

Page 12: E- Biogenouest : a regional Life Sciences initiative for data  integration

A system of systems

• Combination of various tools
• A data analysis portal: Galaxy
• A metadata management tool: ISAtools suite
• A collaborative portal: HubZero
• Additional utilities:
  • Pydio: file transfer
• Some software glue to make it work…
  • BioBlend: Galaxy API
  • In-house developments
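
As an illustration of the kind of glue BioBlend allows, here is a minimal sketch that creates a Galaxy history and uploads files into it. It is not the project's actual in-house code: the function names `connect` and `push_files`, the server URL and the API key are assumptions made for this example. Only `GalaxyInstance`, `histories.create_history` and `tools.upload_file` come from BioBlend itself.

```python
from typing import Iterable


def connect(url: str, api_key: str):
    """Return a BioBlend client for a Galaxy server.

    BioBlend is a third-party package (pip install bioblend); it is imported
    lazily so the rest of this sketch can be read and tested without it.
    """
    from bioblend.galaxy import GalaxyInstance
    return GalaxyInstance(url=url, key=api_key)


def push_files(gi, history_name: str, paths: Iterable[str]) -> str:
    """Create a new Galaxy history, upload each file into it, return its id."""
    history = gi.histories.create_history(name=history_name)
    for path in paths:
        gi.tools.upload_file(path, history["id"])
    return history["id"]
```

A call might look like `push_files(connect("https://galaxy.example.org", "YOUR_API_KEY"), "my-study", ["reads.fastq"])`, where both the URL and the key are placeholders.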

Page 13: E- Biogenouest : a regional Life Sciences initiative for data  integration

Galaxy portal

• Galaxy: a web-based portal for biomedical data analysis
  • Intuitive interface
  • Workflows
• Galaxy@Genouest
  • 800 tools (transcriptomics, population genetics, quantitative genetics, metagenomics, proteomics, etc.)
• http://galaxyproject.org/

Giardine B, Riemer C, Hardison RC, Burhans R, Elnitski L, Shah P, Zhang Y, Blankenberg D, Albert I, Taylor J, Miller W, Kent WJ, Nekrutenko A. "Galaxy: a platform for interactive large-scale genome analysis." Genome Research. 2005 Oct; 15(10):1451-5.

Page 14: E- Biogenouest : a regional Life Sciences initiative for data  integration

ISAtools Suite

• Open Source tools for experimental metadata management
  • Enforces the description of experiments with standards or ontologies
  • Creates a local repository
  • Allows publication to public repositories
• ISA@GenOuest = EMME
  • Additional developments and auxiliary tools
• http://www.isa-tools.org/

Rocca-Serra, P. et al. ISA software suite: supporting standards-compliant experimental annotation and enabling curation at the community level. Bioinformatics 26, 2354–6 (2010).
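
To make "description of experiments with standards" more concrete: ISA-Tab metadata is stored as tab-delimited text tables, so a study table can be read with standard tools. The sketch below parses a tiny study table and checks that two columns are present. The sample rows are made up for this example, and `read_study_table` and `REQUIRED_COLUMNS` are hypothetical names, not part of the ISAtools suite.

```python
import csv
import io

# Made-up study table in the ISA-Tab style (tab-delimited, one row per sample).
STUDY_TSV = (
    "Source Name\tCharacteristics[organism]\tSample Name\n"
    "plant1\tBrassica napus\tleaf_sample_1\n"
    "plant2\tBrassica napus\tleaf_sample_2\n"
)

# Columns this sketch insists on; a real validator would check far more.
REQUIRED_COLUMNS = ("Source Name", "Sample Name")


def read_study_table(text: str) -> list:
    """Parse a tab-delimited study table and check the required columns exist."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    missing = [c for c in REQUIRED_COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    return list(reader)


rows = read_study_table(STUDY_TSV)
samples = [row["Sample Name"] for row in rows]
```

Rejecting a table that lacks a required column is the simplest form of the enforcement the slide describes; ontology checks on the cell values would be a further step.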

Page 15: E- Biogenouest : a regional Life Sciences initiative for data  integration

EMME

[Figure: EMME diagram. Labels: Wet Lab Experiment, Data, MetaData, IsaTools, ISAtab files, ISAarchive, Link to raw data]

Page 16: E- Biogenouest : a regional Life Sciences initiative for data  integration

EMME

[Figure: EMME to Galaxy diagram. Labels: Wet Lab Experiment, Data, MetaData, ISAarchive, Galaxy, Import, Decompress, Import, Data Analysis]
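
The import and decompress step in this diagram can be pictured with a short sketch. It assumes the ISAarchive is a zip file, which the slides do not state, and it relies on the usual ISA-Tab file naming (`i_` for investigation, `s_` for study, `a_` for assay). The function name `unpack_isa_archive` and the demo file names are hypothetical.

```python
import tempfile
import zipfile
from pathlib import Path

# Usual ISA-Tab filename prefixes: investigation (i_), study (s_), assay (a_).
PREFIXES = {"i_": "investigation", "s_": "study", "a_": "assay"}


def unpack_isa_archive(archive_path, dest_dir) -> dict:
    """Extract a zip archive and group the extracted file names by ISA-Tab role."""
    with zipfile.ZipFile(archive_path) as zf:
        zf.extractall(dest_dir)
    groups = {"investigation": [], "study": [], "assay": [], "other": []}
    for path in sorted(Path(dest_dir).rglob("*")):
        if not path.is_file():
            continue
        is_isatab = path.suffix == ".txt" and path.name[:2] in PREFIXES
        role = PREFIXES[path.name[:2]] if is_isatab else "other"
        groups[role].append(path.name)
    return groups


# Demo on a throwaway archive with made-up, empty files.
with tempfile.TemporaryDirectory() as tmp:
    archive = Path(tmp) / "study.zip"
    with zipfile.ZipFile(archive, "w") as zf:
        for name in ("i_investigation.txt", "s_study.txt", "a_assay.txt", "reads.fastq"):
            zf.writestr(name, "")
    demo_groups = unpack_isa_archive(archive, Path(tmp) / "unpacked")
```

Files landing in the `other` group would be the raw data that the metadata tables point to, ready to be handed to the analysis portal.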

Page 17: E- Biogenouest : a regional Life Sciences initiative for data  integration

HubZero

• Scientific web portal
  • Collaboration: wiki, blog, etc.
  • Resources: results, articles, presentations, etc.
  • Lightweight project management
• https://hubzero.org/

M. McLennan, R. Kennell, "HUBzero: A Platform for Dissemination and Collaboration in Computational Science and Engineering," Computing in Science and Engineering, 12(2), pp. 48-52, March/April, 2010

Page 18: E- Biogenouest : a regional Life Sciences initiative for data  integration

Continuum

• Continuum for the management and analysis of biological data

• Collaborative environment

[Figure: diagram with the labels HubZero, Galaxy, EMME]

Page 19: E- Biogenouest : a regional Life Sciences initiative for data  integration

VRE: Virtual Research Environment

[Figure: VRE components. Data (versioning, provenance, security, sharing); Workflows (versioning, provenance, security, sharing); Web portal (project management, collaboration, dissemination); Data infrastructure; Computing infrastructure]

Page 20: E- Biogenouest : a regional Life Sciences initiative for data  integration

A paradigm shift

[Figure: two panels labelled "From…" and "To…", each showing Data and IT Environment]

Page 21: E- Biogenouest : a regional Life Sciences initiative for data  integration

Next steps

• What we learned:
  • Acceptance and adoption are the key issues
• What we will do:
  • Switch to a production environment
  • Identity federation
  • ISA-Dataflow: metadata for bioinformatics workflows
• What we need to do:
  • Connect to other initiatives
  • Define the perimeter:
    • Big changes for bioinformatics facilities

Page 22: E- Biogenouest : a regional Life Sciences initiative for data  integration

Conclusion

• Biology becomes a digital science
• New technologies with lower costs create a dangerous situation
• A system of systems: « metadata + collaborative tool + analysis portal »
• Continuum: data centered philosophy

« Bring back Biology to the biologist »

Page 23: E- Biogenouest : a regional Life Sciences initiative for data  integration

Questions?

• [email protected]
• http://www.genouest.org
• https://www.e-biogenouest.org