Thomas Hankemeier, Amy Harms Netherlands Metabolomics Centre Biomedical Metabolomics Facility Leiden Leiden Academic Centre for Drug Research Leiden University, The Netherlands Data Stewardship and integration of Biomedical OMICS data

Page 1: Data Stewardship and integration of Biomedical OMICS data

Thomas Hankemeier, Amy Harms

Netherlands Metabolomics Centre
Biomedical Metabolomics Facility Leiden
Leiden Academic Centre for Drug Research
Leiden University, The Netherlands

Page 2: Metabolomics workflow

[Figure: metabolomics workflow]

Workflow steps:
• Biological question
• Experimental design
• Sampling
• Sample preparation
• Data acquisition
• Data pre-processing
• Data analysis
• Biological interpretation

Items passed along the workflow: protocol; samples; raw data; list of peaks/biomolecules (identification); relevant biomolecules/connectivities & models; metabolites

Page 3: Biomedical Metabolomics Facility Leiden

• Robust & validated protocols; quality system & trained personnel
• > 15,000 samples/year
• Various types of samples: blood, urine, biopsies, cells, etc.
• Large number of clinical/preclinical studies with academia, clinics, industry (cardiovascular and metabolic diseases, diabetes, infectious diseases, CNS diseases and nutritional studies)
• Access for academic & clinical researchers & industry (international pharma & nutrition)

[Figure: metabolomics workflow (biological question, experimental design, sampling, sample preparation, data acquisition, data pre-processing, data analysis, biological interpretation), annotated with "Metabolomics Facility" and "Advice"]

www.bmfl.nl

Page 4

[Figure labels: oxidative stress, metabolic stress, inflammatory stress]

Page 5: Validated metabolomics platforms

Platforms, arranged from biology-driven to global profiling (class: number of metabolites, technique):
• Medium polar, central carbon/energy metabolism: > 200, GC-MS
• Apolar metabolites: > 400, LC-MS
• Apolar lipids: > 800, LC-MS
• Polar lipids: > 150, LC-MS
• Biogenic amines: > 90, LC-MS/MS
• Endocannabinoids: > 40, LC-MS/MS
• Oxylipins (pro/anti-inflammatory lipid mediators): > 120, LC-MS/MS
• Oxidative/nitrosative stress: > 60, LC-MS/MS

Overall: > 2500 metabolites, > 1000 identified, > 500 quantitative. Variation < 10%!

More details: www.bmfl.nl

Page 6: Global profiling of lipids using RP-UPLC-TOF MS

[Figure: lipid profiles of human plasma, low energy trace, in +ve ESI and -ve ESI modes; labelled regions: lyso-GPCho, lyso-GPEtn; FA; PG, PI, PSer, PE; GPCho, SM, GPGro, GPEtn; DG, TG & ChoE]

Lipid classes: bile acids, FFA, LPC, LPE, PI, PE, PG, PC, SM, DG, TG, CE, CER

Instruments: Waters Synapt qTOF-MS, Agilent qTOF MS

Castro-Perez et al., J. Proteome Res., 2010

Page 7: Data processing: combining targeted & untargeted

• ‘pseudo targeted’ using target list
  • Identified
  • Quantitative (if reference compounds available)
• ‘known unknowns’
• ‘unknowns’
• MZextract (Van der Kloet/NMC, new!)

Example: lipid profile, TG(52:1)

[Figure: comparison of quantification, targeted vs MZextract; labels: "conventional", "untargeted approach"; legend: QC samples, regular samples]

Feature set: several m/z of one analyte

Page 8: Good Practices

Sample randomization:
• Important to randomize case/control, treated/untreated to avoid artifacts introduced by changes in instrumental drift
• Experimental design dictates the randomization strategy
• Within-batch variation is lower than between-batch variation, so the batch design should block related samples to minimize variation

Blinded analysis:
• For important clinical studies, the person running the samples should be blinded to the sample identity
• The lab is unblinded for data analysis only after data have been deposited in a database or with a collaborator
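The randomization and blocking points above can be illustrated with a minimal Python sketch. It assumes a hypothetical sample list in which every sample carries a block label (for example the subject it came from) and a case/control label. Related samples are kept in the same batch, and the run order is shuffled within each batch. The function name, field names and batch size are illustrative assumptions, not the facility's protocol, and the sketch does not balance groups across batches, which a real design would usually also consider.

```python
import random
from collections import defaultdict

def blocked_run_order(samples, batch_size, seed=0):
    """Assign samples to batches, keeping each block together, then shuffle within batch.

    samples: list of dicts with hypothetical keys "id", "block" (e.g. subject), "group".
    Returns a list of batches; each batch is a list of sample dicts in run order.
    """
    rng = random.Random(seed)  # fixed seed so the design can be reproduced later
    blocks = defaultdict(list)
    for s in samples:
        blocks[s["block"]].append(s)

    block_keys = sorted(blocks)
    rng.shuffle(block_keys)  # random order in which blocks are placed into batches

    batches, current = [], []
    for key in block_keys:
        members = blocks[key]
        if len(members) > batch_size:
            raise ValueError(f"block {key!r} does not fit in one batch")
        if len(current) + len(members) > batch_size:
            batches.append(current)  # start a new batch rather than split a block
            current = []
        current.extend(members)
    if current:
        batches.append(current)

    for batch in batches:
        rng.shuffle(batch)  # randomize run order inside each batch
    return batches

# Hypothetical study: 6 subjects with 2 samples each, alternating case/control subjects.
samples = [
    {"id": f"S{i:02d}", "block": f"subject{i // 2}", "group": "case" if i % 4 < 2 else "control"}
    for i in range(12)
]
batches = blocked_run_order(samples, batch_size=6, seed=42)
for n, batch in enumerate(batches, start=1):
    print(n, [(s["id"], s["group"]) for s in batch])
```

Recording the seed together with the resulting run order keeps the design itself traceable.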

Page 9: Quality Control Tools

During routine analysis, calibration lines, blanks and QCs are prepared together with the samples. A statistical tool has been developed to apply corrections to the data and to output quality parameters.

For data analysis, all peak areas are corrected for internal standard response, followed by a QC correction. This tool corrects for instrumental and experimental drift within and over batches. QC samples (pooled study samples) bracket ~15 study samples within a batch.
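The two correction steps described above can be sketched in Python as follows. The slide does not say how the facility's tool models drift, so this sketch makes its own assumption: the drift within one run sequence is estimated by linear interpolation between the bracketing QC injections. The function name and the simulated numbers are hypothetical, and correction over several batches is not covered.

```python
import numpy as np

def qc_corrected_ratios(area, is_area, is_qc):
    """Internal-standard correction followed by a simple QC-based drift correction.

    area, is_area: peak areas of one analyte and its internal standard, in run order.
    is_qc: boolean array marking the pooled QC injections.
    The drift model (linear interpolation between QCs) is an assumption of this sketch.
    """
    area = np.asarray(area, dtype=float)
    is_area = np.asarray(is_area, dtype=float)
    is_qc = np.asarray(is_qc, dtype=bool)

    ratio = area / is_area         # step 1: internal standard correction
    order = np.arange(ratio.size)  # injection order
    # Step 2: estimate the drift at every injection from the QC responses.
    # np.interp holds the first/last QC value constant outside the QC range.
    qc_trend = np.interp(order, order[is_qc], ratio[is_qc])
    # Divide out the drift and rescale to the median QC level.
    return ratio / qc_trend * np.median(ratio[is_qc])

# Simulated run: QCs at positions 0, 16 and 32 bracket 15 study samples each.
n = 33
is_qc = np.arange(n) % 16 == 0
drift = 1.0 + 0.01 * np.arange(n)  # simulated linear drift of the analyte response
area = 1000.0 * drift
is_area = np.full(n, 500.0)        # internal standard unaffected in this toy example
corrected = qc_corrected_ratios(area, is_area, is_qc)
print(corrected.min(), corrected.max())
```

In this toy example the drift is exactly linear, so the corrected values are constant across the run. Real data would leave residual variation, which is what the QC quality parameters are meant to report.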

Page 10: Assuring Traceability

Make sure that the results that we deliver can be proven and explained not only now but also 5 years from now.

Proper data management should facilitate research based on (existing) research.

An (easy to use) exchange format, using controlled vocabularies/ontologies, gives certainty about what was measured and how it was measured.

Researchers need to share the information required to reproduce the results (https://biosharing.org/pages/about/). This means sharing:
• SOPs
• Scripts/software to (pre-)process the data
• Decisions made, for example why data was discarded
• Etc.

Page 11: Our efforts to assure traceability

Experimental design: Assure reproducible data that can be shared and linked to other resources (proteomics, transcriptomics, and genomics)

Traceable data: Starting with ELN coupled to our in-house developed LIMS

Interpretable data: Use external identifiers and controlled vocabularies to present/report data

Data analysis: Freeze data + scripts/algorithms with output; rerunning the data analysis pipeline on the same data should produce the same output. Scripts and software should be open (accessible) to understand what happens to data.
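One way to picture the "freeze data + scripts with output" idea is a checksum manifest: record a hash of every input, script and output, then check after a rerun that nothing differs. The Python sketch below is only an illustration under that assumption; the file names, the manifest layout and the helper functions are hypothetical and do not describe the facility's actual ELN/LIMS setup.

```python
import hashlib
import json
import tempfile
from pathlib import Path

def sha256_of(path):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()

def freeze(paths, manifest_path):
    """Record checksums of data, scripts and outputs in a JSON manifest.

    Files are keyed by name only, so names must be unique in this sketch.
    """
    manifest = {Path(p).name: sha256_of(p) for p in paths}
    Path(manifest_path).write_text(json.dumps(manifest, indent=2, sort_keys=True))
    return manifest

def changed_files(paths, manifest_path):
    """Return the names of files whose current checksum differs from the manifest."""
    manifest = json.loads(Path(manifest_path).read_text())
    return sorted(Path(p).name for p in paths if manifest.get(Path(p).name) != sha256_of(p))

# Demonstration with throwaway files; all file names are hypothetical.
workdir = Path(tempfile.mkdtemp())
data = workdir / "peaks.csv"
script = workdir / "analysis.py"
output = workdir / "result.csv"
data.write_text("sample,area\nS01,1000\n")
script.write_text("# analysis steps\n")
output.write_text("sample,ratio\nS01,2.0\n")

files = [data, script, output]
freeze(files, workdir / "manifest.json")
unchanged = changed_files(files, workdir / "manifest.json")    # nothing has changed yet
output.write_text("sample,ratio\nS01,2.1\n")                   # simulate a rerun with a different result
after_rerun = changed_files(files, workdir / "manifest.json")
print(unchanged, after_rerun)
```

A non-empty result after a rerun on frozen data and scripts signals that the pipeline is not reproducible as frozen, for example because of an unpinned software version or an unrecorded random seed.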

We are working hard to deposit our studies in MetaboLights (1 live, 3 under curation and 4 more in preparation).

Leiden leads MetabolomeXchange, an international data aggregation and notification service for metabolomics.

Page 12: What we tried to make data available: Data support platform (NMC)

• Easily access and analyze experimental metabolomics data with the data support platform (DSP):
  • a metabolomics data warehouse
  • a data processing infrastructure

Page 13: MetabolomeXchange

International data aggregation and notification service for metabolomics, set up by Leiden

Easy to search for and subscribe to publicly available data sets

Page 14: PhenoMeNal Consortium

• H2020 Societal Challenge in Health, Demographic Change, and Wellbeing
• 3 years
• 13 partners
• 8 million euros
• 830 PM (person-months)

e-infrastructure for the processing, analysis and information-mining of the massive amount of medical molecular phenotyping and genotyping data that will be generated by metabolomics applications now entering research and clinic.

Page 15: The Aim

[Figure: workflows linking data collection, QC, data pre-processing and statistical analysis; biomedical data & metadata]

Page 16: DTL / FAIR DATA

Findable, Accessible, Interoperable, and Re-usable

Leiden University, partner of DTL (Dutch Techcentre for Life Sciences), supports the idea and developments of international FAIR Data principles.

Page 17: Linking with vendors?

• Discussions are ongoing between Leiden and vendors to see how the experience and expertise gained through many NMC can be used to enhance their workflow.

• Leiden has been participating in EU-funded grants for improving the infrastructure for metabolomics communication.

• Vendors and community are both developing software tools to integrate metabolomics tools in a more systems biology approach, and we should work together.

Page 18: Summary & discussion points

• Workflow and data management crucial and different for each facility and field

• Share good practices

• For Metabolomics:

• Absolute concentrations are key!?

• Benefits for validation and replication!!

• Some main facilities for high throughput!?

• Benefits can be achieved in omics integration; NL/DTL to lead by example?

• Sharing metadata often bottleneck

Page 19: Acknowledgement

• Ruud Berger: Biochemical interpretation
• Amy Harms: Leader Metabolomics Facility
• Jan van der Greef: Systems approaches & SDPPM
• Ronan Fleming: Metabolic modelling (guest)
• Paul Vulto: Organ-on-a-chip & microfluidics (guest)
• Slavik Koval: Data analysis
• Peter Lindenburg: Metabolomics technology
• Rawi Ramautar: Metabolomics technology

www.bmfl.nl and www.metabolomicscentre.nl

Page 20: Acknowledgement

PhD students: Amar Oedit, Vasu Kantae, Bas Trietsch, Junzeng Fu, Robert-Jan Raterink, Can Gulersonmez, Min He, Nelus Schoeman, Mengmeng Sun, Vincent van Duinen, Rosilene Rossetto-Burgos, Abidemi Junaid, Wei Zhang, Renate Buijink

Post docs: Oskar Gonzalez, Michel van Weeghel, Anne-Charlotte Dubbelman, Petri Kylli, Estefania Moreno-Gordaliza, Marek Noga

Technicians: Gerwin Spijksma, Faisa Guled, Anthanasis Giannitsis (clean room), Sabine Bos, Lieke Lamont-de Vries, Hyung Elfrink, Belèn Gonzàlez Amoros, Marian Martinez Zapata, Sandra Pous-Torres, Monique Nieman

Scientific Programmer: Michael van Vliet

Mechanical Workshop: Raphael Zwier

www.metabolomicscentre.nl