EMBL-EBI Building the Database with International Isolates
Guy Cochrane, PhD
Structured
Neutral
Sustainable
Rapid
Data & Analysis
Requirements
COMPARE
COMPARE: the enabling system for rapid identification, containment and mitigation of emerging infectious diseases and foodborne outbreaks by generation and comparison of genomic information on samples and pathogens across sectors, time and locations, with additional contextual data.
A global platform for the sequence-based rapid identification of pathogens
EMBL European Bioinformatics Institute
Genes, genomes & variation: European Nucleotide Archive, 1000 Genomes, Ensembl, Ensembl Genomes, European Genome-phenome Archive, Metagenomics portal
Gene, protein & metabolite expression: ArrayExpress, Expression Atlas, MetaboLights, PRIDE
Protein sequences, families & motifs: InterPro, Pfam, UniProt
Chemical biology: ChEMBL, ChEBI
Literature & ontologies: Europe PubMed Central, Gene Ontology, Experimental Factor Ontology
Molecular structures: Protein Data Bank in Europe, Electron Microscopy Data Bank
Reactions, interactions & pathways: IntAct, Reactome, MetaboLights
Systems: BioModels, Enzyme Portal, BioSamples
European Nucleotide Archive (ENA)
http://www.ebi.ac.uk/ena/
• Globally comprehensive scientific record and European node of INSDC
• A broad platform for the management, sharing, integration and dissemination of sequence data
• Established in the early 1980s, extended for new technologies and applications
• Connectivity with broader EMBL-EBI resources
• Sequence data foundation
• Sustained within EMBL-EBI under EMBL funding with additional support from EC, UK Research Councils, Wellcome Trust, etc.
• Substantial scale: 1.3 petabase pairs across >1 million taxa, 2,000-5,000 active data providers, global consumer userbase
• Rich submission, discovery and retrieval software, tools and services
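As an illustration of programmatic retrieval, the sketch below builds a search URL for read runs under a given taxon. It assumes the present-day ENA Portal API search endpoint and its `result`, `query`, `fields`, `format` and `limit` parameters, none of which are named in these slides. The function name `build_search_url` is hypothetical. No network request is made.

```python
from urllib.parse import urlencode

# Assumption: the current ENA Portal API search endpoint (not stated in the slides).
ENA_SEARCH = "https://www.ebi.ac.uk/ena/portal/api/search"


def build_search_url(taxon_id: int, fields: list, limit: int = 10) -> str:
    """Build a URL that asks for read runs from a taxon and all of its descendants."""
    params = {
        "result": "read_run",               # search sequencing runs
        "query": f"tax_tree({taxon_id})",   # taxon plus its subtree
        "fields": ",".join(fields),         # columns to return
        "format": "tsv",
        "limit": limit,
    }
    # urlencode percent-encodes the parentheses and commas in the values.
    return f"{ENA_SEARCH}?{urlencode(params)}"


url = build_search_url(562, ["run_accession", "fastq_ftp"])
```

Here 562 is the NCBI taxonomy identifier for Escherichia coli. The resulting URL can be passed to any HTTP client to fetch a tab-separated table.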
This project has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement No 643476.
Structured data sharing
Data (primary & derived)
Analysis (routine & ad hoc)
Storage Compute
Data Hubs
Data (primary & derived)
Analysis (routine & ad hoc)
Storage, Compute
Data Hubs, COMPARE-VM, Notebooks
Data Hubs
• Reporting and sharing system
– Quarantined pre-publication confidential data and public data
– Set up for data providers and data consumers
– Data / metadata:
  • reported through the systematic reporting system Webin, with interactive and programmatic interfaces
  • structured and validated
  • upon release, embargoed data pass to INSDC
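The reporting rules above (validated metadata, confidential hold, then release) can be sketched as follows. This is a minimal illustration only, not the Webin implementation. The field names in `REQUIRED_FIELDS`, the `Submission` class and the `visibility` states are all hypothetical, since the slides do not specify the actual checklist or release logic.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

# Hypothetical mandatory fields; the real checklist is not given in the slides.
REQUIRED_FIELDS = ("sample_id", "collection_date", "country", "host")


@dataclass
class Submission:
    metadata: dict
    release_date: Optional[date] = None  # None: held until the provider releases it


def validate(metadata: dict) -> list:
    """Return a list of problems; an empty list means the record is valid."""
    problems = []
    for name in REQUIRED_FIELDS:
        value = metadata.get(name)
        if value is None or str(value).strip() == "":
            problems.append(f"missing field: {name}")
    return problems


def visibility(sub: Submission, today: date) -> str:
    """'rejected' if invalid, 'hub-only' while embargoed, 'public' once released."""
    if validate(sub.metadata):
        return "rejected"
    if sub.release_date is None or today < sub.release_date:
        return "hub-only"
    return "public"
```

A record that passes validation stays visible only inside its data hub until its release date, after which it is treated as public.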
Data reporting
CGE batch upload
dcc_sibelius
Data access
Systematic analysis
COMPARE Data Reporting
COMPARE Data Archive
COMPARE
Selected workflow
Status of workflows
Jupyter Notebooks
• Live, low barrier exploration
• http://148.6.80.191/menu.html
• https://148.6.80.191/user/demo/notebooks/menu.ipynb
COMPARE reference genomes
GUI: http://www.ebi.ac.uk/ena/data/xref/search
Data Portal
Data (primary & derived)
Analysis (routine & ad hoc)
Storage, Compute, Data Portal
People
DTU: Jose Luis Bellod Cisneros, Martin Christen Frølund Thomsen, Johanne Ahrenfeldt, Rolf Sommer Kaas, Lukasz Dariusz Dynowski, Ole Lund, Frank Aarestrup, Jeffrey Skiby
WIGNER: János Márk Szalai-Gindl, László Oroszlány, Dávid Visontai, Dezso Ribli, Istvan Csabai
EMBL-EBI: Nima Pakseresht, Clara Amid, Nicole Silvester, Marc Rossello, Neil Goodgame, Suran Jayathilaka, Ana Luisa Toribio, Ana Cerdeño Tarraga, Petra ten Hoopen, Rasko Leinonen, Guy Cochrane
EMC: Ron Fouchier, Marion Koopmans, Saskia Smits, David van de Vijver, Marjolein Poen
FLI: Martin Beer, Anne Pohlmann, Dirk Hoeper, Claudia Wylezich, Ariane Belka
APHA: Sharon Brooks, Amanda Seekings, Jill Banks, Javier Nunez, Richard Ellis, Ian Brown
SSI: Eva Litrup, Eva Møller Nielsen