EBI as a research infrastructure Graham Cameron, EBI

Page 1: EBI as a research infrastructure Graham Cameron, EBI

EBI as a research infrastructure

Graham Cameron, EBI

Page 2: EBI as a research infrastructure Graham Cameron, EBI

Heidelberg

Hinxton

Monterotondo

Hamburg

Grenoble

Service Research Training Industry

EMBL

EBI

Page 3: EBI as a research infrastructure Graham Cameron, EBI

Member States of EMBL

• Austria

• Belgium

• Denmark

• Finland

• France

• Portugal

• Spain

• Sweden

• Switzerland

• United Kingdom

• Germany

• Greece

• Israel

• Italy

• The Netherlands

• Norway

Page 4: EBI as a research infrastructure Graham Cameron, EBI

Hinxton

Service Research Training Industry

EBI

Page 5: EBI as a research infrastructure Graham Cameron, EBI
Page 6: EBI as a research infrastructure Graham Cameron, EBI

Wellcome Trust

Economic & Social Research Council

Council for the Central Laboratory of the Research Councils

Natural Environment Research Council

Engineering & Physical Sciences Research Council

Particle Physics & Astronomy Research Council

Biotechnology & Biological Sciences Research Council

Medical Research Council

Arts & Humanities Research Council

~ €3.8 Billion

Page 7: EBI as a research infrastructure Graham Cameron, EBI

We have amassed a wealth of knowledge about the molecular processes of living systems

• Biomacromolecules
• Biologically active molecules
• The behaviour and interactions of these molecules
• The phenotypic effects of molecular changes
  • Mutations
  • Drugs
  • Nutrients
• The molecular adjuncts of phenotypic changes
  • Disease
  • Aging

• Databases
• Web access
• Tools to explore the information
• Systems to capture the information
• Service centres

Page 8: EBI as a research infrastructure Graham Cameron, EBI

DNA

Page 9: EBI as a research infrastructure Graham Cameron, EBI

Protein Sequences

Page 10: EBI as a research infrastructure Graham Cameron, EBI

Expression

Page 11: EBI as a research infrastructure Graham Cameron, EBI

Structures

Page 12: EBI as a research infrastructure Graham Cameron, EBI

PDB code 1DIF

HIV-1 Protease/Inhibitor Complex A79285 (Difluoroketone)

molecules interact

Page 13: EBI as a research infrastructure Graham Cameron, EBI

Pathways

Page 14: EBI as a research infrastructure Graham Cameron, EBI

• Reactome

• EnsEMBL: Genome Annotation

• EMBL-Bank: DNA sequences

• UniProt: Protein Sequences

• Array-Express: Microarray Expression Data

• EMSD: Macromolecular Structure Data

• IntAct: Protein Interactions

Page 15: EBI as a research infrastructure Graham Cameron, EBI
Page 16: EBI as a research infrastructure Graham Cameron, EBI

Usage

• Basic research
• Industry
  • Pharma
  • Diagnostics
  • Medical device research
  • Personal care
  • Nutrition
  • Agriculture
  • Forestry
  • Fisheries
• Patent searching and provenance

Page 17: EBI as a research infrastructure Graham Cameron, EBI

Using the information

Not Salt Tolerant / Salt Tolerant

Disease Prone / Disease Resistant

Low Yield / High Yield

Diseased / Healthy

Suppose a gene’s variation seems important

Page 18: EBI as a research infrastructure Graham Cameron, EBI

Using the information

Not Salt Tolerant / Salt Tolerant

Disease Prone / Disease Resistant

Low Yield / High Yield

Diseased / Healthy

Look in databases for similar genes, their products, and functions, structures, interactions and expression patterns. The processes in which they are involved.

Page 19: EBI as a research infrastructure Graham Cameron, EBI

Using the information

Not Salt Tolerant / Salt Tolerant

Disease Prone / Disease Resistant

Low Yield / High Yield

Diseased / Healthy

Can we influence the processes in which they are involved?

Page 21: EBI as a research infrastructure Graham Cameron, EBI

• Working out in the lab what a gene does could easily be a year’s work

• Searching databases can do it in half an hour

Page 22: EBI as a research infrastructure Graham Cameron, EBI

[Figure: Nucleotide Sequence Database Growth. Vertical axis: Megabases (0 to 120,000). Horizontal axis: Date (June 1982 to June 2005).]

A new sequence once a second

Page 23: EBI as a research infrastructure Graham Cameron, EBI

[Figure: Average Web Hits per Day, Including Ensembl. Vertical axis: Average Hits per Day (0 to 2,500,000). Horizontal axis: Quarter Year (1st quarter 1999 to 3rd quarter 2005).]

Note: Ensembl is a joint project with The Wellcome Trust Sanger Institute. Equivalent usage data have only been available since 2004.

• A few hundred thousand unique users per month

• A million unique users per year

Page 24: EBI as a research infrastructure Graham Cameron, EBI

European Context

• BioSapiens

• EMBRACE

• ENFIN

• (and many others)

Page 25: EBI as a research infrastructure Graham Cameron, EBI

BioSapiens

• European Molecular Biology Laboratory - European Bioinformatics Institute, Hinxton, Cambridge, UK.

• European Molecular Biology Laboratory, Heidelberg, Germany.

• German National Centre for Environment and Health, Neuherberg, Munich, Germany

• Université Libre de Bruxelles, Brussels, Belgium

• Consejo Superior de Investigaciones Cientificas, Madrid, Spain

• Institut Municipal d'Assistència Sanitària, Barcelona, Spain

• Genome Research Ltd, Hinxton, Cambridge, UK.

• Max-Planck Institute for Informatics, Saarbrücken, Germany

• The Hebrew University of Jerusalem, Givat Ram, Israel

• Department of Biochemical Sciences University of Rome "La Sapienza", Rome, Italy

• University of Stockholm, Stockholm, Sweden

• University of Oxford, Oxford, UK.

• University College London, London, UK.

• Radboud University Nijmegen, Nijmegen, The Netherlands

• Swiss Institute of Bioinformatics, Geneva, Switzerland

• Technical University of Denmark, Lyngby, Denmark

• University of Helsinki, Helsinki, Finland

• University of Geneva, Geneva, Switzerland

• Institute of Enzymology, Hungarian Academy of Sciences, Budapest, Hungary

• University of Cologne, Cologne, Germany

• Institut Pasteur, Paris, France

• BioInfo Bank Institute, Poznan, Poland

• Max Planck Institute for Molecular Genetics, Berlin, Germany

• Genoscope, Evry, France

• University of Bologna, Bologna, Italy

Page 26: EBI as a research infrastructure Graham Cameron, EBI

EMBRACE

• European Molecular Biology Laboratory - European Bioinformatics Institute, Hinxton, Cambridge, UK.

• European Molecular Biology Laboratory, Heidelberg, Germany.

• Institute of Biomedical Technologies, Section Bari, CNR, Bari, Italy

• University of Manchester, UK

• Swiss Institute of Bioinformatics, Geneva, Switzerland

• Swedish University of Agricultural Sciences, The Linnaeus Centre for Bioinformatics, Sweden

• Centre National de la Recherche Scientifique, Clermont-Ferrand and Lyon, France

• Centre for Biological Sequence Analysis, Technical University of Denmark, Lyngby, Denmark

• Centro Nacional de Biotecnologia/Consejo Superior de Investigaciones Cientificas, Madrid, Spain

• University of Stockholm, Stockholm Bioinformatics Centre, Sweden

• Institut National de la Recherche Agronomique, Toulouse, France

• Max Planck Institute for Molecular Genetics, Berlin, Germany

• CSC, the Finnish IT Center for Science, Espoo, Finland

• University College London, London, UK.

• The Weizmann Institute, Rehovot, Israel

• Centre for Molecular and Biomolecular Informatics, University of Nijmegen, The Netherlands

• Carretera de Ajalvir, km. 4, 28850 Torrejon de Ardoz, Madrid

Page 27: EBI as a research infrastructure Graham Cameron, EBI

ENFIN

• The European Bioinformatics Institute / The European Molecular Biology Laboratory, Europe

• The University of Dundee, UK

• Technical University of Denmark

• University of Rome Tor Vergata, Italy

• Medical Research Council Mammalian Genetics Unit (MRCMGU), UK

• Ludwig Institute for Cancer Research, Uppsala (LICR-UPP), Sweden

• The Max Planck Institute, Germany

• University of Helsinki (UH), Finland

• University College London (UCL), UK

• National Center for Research and Technology, Hellas (CERTH), Greece

• Universitaet zu Koeln (UNIK), Germany

• Weizmann Institute (Weizmann), Israel

• Egeen (EGEEN), Estonia

• Serono Pharmaceutical Research Institute (SPRI), Switzerland

• Consejo Superior de Investigaciones Científicas (CSIC), Spain

• Centre for Integrative Bioinformatics VU (IBIVU), Netherlands

Page 28: EBI as a research infrastructure Graham Cameron, EBI

Global Picture

• DNA – tripartite international collaboration (including patent data acquisition)

• Protein sequences – UniProt collaboration

• Macromolecular structures – tripartite international collaboration

• IntAct international agreements

• Reactome – USA-Europe collaboration

• Etc.

Page 29: EBI as a research infrastructure Graham Cameron, EBI

Flybase

MGD

SGD

BRENDA

Chemical data resources

Medical data resources

Biodiversity data resources

IMGT

Pasteur DBs

Eumorphia/Phenotypes

Core biomolecular resources

Specialist biomolecular data resource examples

Mutants

Large resources in related disciplines

Model organism resource examples

Mouse Atlas

Page 30: EBI as a research infrastructure Graham Cameron, EBI

Large resources in related disciplines

Biodiversity data resources

Flybase

MGD

SGD

BRENDA

Chemical data resources

Medical data resources

IMGT

Pasteur DBs

Eumorphia/Phenotypes

Core biomolecular resources

Specialist biomolecular data resource examples

Mutants

Model organism resource examples

Mouse Atlas

Page 31: EBI as a research infrastructure Graham Cameron, EBI

Medical data resources

Core biomolecular resources

Page 33: EBI as a research infrastructure Graham Cameron, EBI

[Figure: Web Hits by country. Segments: USA, UK, Germany, France, Japan, Italy, Spain, Canada, Sweden, Norway, Netherlands, Switzerland, Belgium, Israel, Australia, Taiwan, Denmark, Austria, Finland, Other.]

Page 34: EBI as a research infrastructure Graham Cameron, EBI

EBI Total Running Budget 2005 = €26 million

• EMBL: 50%

• EU: 22%

• USA: 8%

• Wellcome Trust: 7%

• UK Research Councils: 7%

• Industry: 3%

• Other: 3%

Projected budget 2011 = €43 million
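
The percentage shares above can be turned into approximate euro amounts, and the two budget totals into an implied growth rate. The following is a minimal sketch using only the figures on this slide; the function names are illustrative, and the assumption of steady compound growth over the six years from 2005 to 2011 is ours, not the presenter's.

```python
# Funding shares (percent) and totals (millions of euros) as read off the slide.
SHARES_2005 = {
    "EMBL": 50,
    "EU": 22,
    "USA": 8,
    "Wellcome Trust": 7,
    "UK Research Councils": 7,
    "Industry": 3,
    "Other": 3,
}
BUDGET_2005_MEUR = 26.0
BUDGET_2011_MEUR = 43.0  # projected figure from the slide


def share_amounts(shares, total_meur):
    """Convert percentage shares into amounts in millions of euros."""
    return {source: total_meur * pct / 100 for source, pct in shares.items()}


def compound_annual_growth(start, end, years):
    """Annual growth rate implied by start and end values.

    Assumes steady compound growth, which is an illustrative simplification.
    """
    return (end / start) ** (1 / years) - 1


if __name__ == "__main__":
    for source, amount in share_amounts(SHARES_2005, BUDGET_2005_MEUR).items():
        print(f"{source:22s} EUR {amount:5.2f} million")
    rate = compound_annual_growth(BUDGET_2005_MEUR, BUDGET_2011_MEUR, 2011 - 2005)
    print(f"Implied annual growth 2005-2011: {rate:.1%}")
```

Because the shares are rounded to whole percentages on the slide, the euro amounts are approximate.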

Page 35: EBI as a research infrastructure Graham Cameron, EBI

[Figure: bar chart. Vertical axis: Millions of euros (€0 to €60). Bars: NCBI 2004/5 + PDB, EBI 2005, EBI 2011.]

Page 36: EBI as a research infrastructure Graham Cameron, EBI

[Figure: bar chart. Vertical axis: Millions of euros (€0 to €3,000). Bars: Cost of the data, NCBI 2004/5 + PDB, EBI 2005, EBI 2011.]

Page 37: EBI as a research infrastructure Graham Cameron, EBI

Read-only or dynamic

• There’s nothing particularly difficult about archiving unchanging data
  • But most aren’t
• Today’s best bet
  • E.g., Ensembl
• Provenance
  • E.g., patent searching
  • N.B. Versioning (complex!)
• Citation

Page 38: EBI as a research infrastructure Graham Cameron, EBI

How much data

• Canonical vs. episodic
  • Genomes, expression profiles
• Raw vs. processed
  • Sequence traces
  • Structure factors

Page 39: EBI as a research infrastructure Graham Cameron, EBI

Custodianship, acquisition and ownership

• Widely accepted obligation to deposit data

• Depend on the goodwill of the community

• Add “organisation”

• Add “services”

• Add “value”

Page 40: EBI as a research infrastructure Graham Cameron, EBI

Annotation as added value

• First/second/third party annotation
• Computational vs. experimental
• Bundled vs. distributed
  • (DAS)

Page 41: EBI as a research infrastructure Graham Cameron, EBI

Openness

• We approve of it

• Data must be made available as soon as they are discussed in a publication

• Data from “community” projects should be made available immediately

• Confidentiality issues must be addressed

Page 42: EBI as a research infrastructure Graham Cameron, EBI

Federation

• Monolithic solutions fail

• Centralisation yields more than the sum of the parts

• Aggregation of institutional repositories is essential

Page 43: EBI as a research infrastructure Graham Cameron, EBI

Slice it vertically or horizontally?

• E.g., the EBI and AstroGrid are domain specific
• Would it be better if they were jointly managed by data experts?
• Standardisation
  • Mixed success

Page 44: EBI as a research infrastructure Graham Cameron, EBI

Supporting the electronic record of science

• This is more like libraries than research projects
  • Needs long-term commitment
  • With accountability
• Current funding structures are not well adapted to the task
• Pitting the information providers in competition with their research community is damaging.

Page 45: EBI as a research infrastructure Graham Cameron, EBI

Bioinformatics Infrastructure

• Has captured the data from several billion Euros worth of science

• Serves a community of perhaps a million users

• Supports science on which the UK alone spends €3-4 billion a year

• Cuts years of lab work down to hours of computer work

• Is crucial to human well being from medicine to agriculture

• Sees data volume and usage growing exponentially

• Might cost a few tens of millions (at most a couple of percent of the cost of the science it supports).
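
The "couple of percent" claim can be checked roughly against figures quoted earlier in the deck. The sketch below is illustrative only: it assumes the "~ €3.8 Billion" figure on the UK funders slide is an annual spend (consistent with the "€3-4 billion a year" bullet) and compares it with the EBI running budgets of €26 million (2005) and €43 million (projected 2011). The names used are our own.

```python
# Figures taken from earlier slides, in millions of euros.
UK_SCIENCE_SPEND_MEUR = 3800.0  # "~ EUR 3.8 Billion"; assumed here to be per year
EBI_BUDGETS_MEUR = {
    "EBI 2005": 26.0,
    "EBI 2011 (projected)": 43.0,
}


def percent_of(cost_meur, total_meur):
    """Express a cost as a percentage of a larger total."""
    return 100.0 * cost_meur / total_meur


if __name__ == "__main__":
    for label, budget in EBI_BUDGETS_MEUR.items():
        pct = percent_of(budget, UK_SCIENCE_SPEND_MEUR)
        print(f"{label}: {pct:.2f}% of the UK spend")
```

This compares against UK spending alone, so the share of the total science supported worldwide would be smaller still.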