
Agenda:

• Applications of GC/Q-TOF for Metabolomics Jennifer Gushue, Ph.D.

• Data-Directed Multi-Omics of Biological Pathways Theodore Sana, Ph.D.

• Integrated Biology is a Software Challenge Norton Kitagawa, Ph.D.

September 18, 2012

Metabolomics Society


Diversity


Agilent LC/MS Instrument Portfolio

Olive Oil Classification using the

Agilent 7200 Series GC/Q-TOF System

UC Davis Olive Center

Stephan Baumann, Agilent Technologies


Olive Oil Demand is Growing


The United States market is expected to surpass $1.8

billion by 2013.*

• Increased interest in Mediterranean foods

• Health benefits associated with olive oil:

– Reduced risk of coronary heart disease

– Rich in antioxidants and anti-inflammatory compounds**

* Packaged Facts Market Report

** US FDA

Olive Oil Standards

International Olive Council and USDA Standards:

• Chemical tests

• Tasting panel sensory test

Limited understanding of organoleptic smell and taste.

Most Common Olive Oil Defects

Rancidity

A flavor in the olive oil usually accompanied by a greasy

mouth feel. In sensory tests, this greasiness is often

noticed first.

Fusty

It is caused by fermentation in the absence of oxygen; this

occurs within the olives before they are milled. A fusty smell

has been compared to sweaty socks, swampy vegetation,

or a too-wet compost heap.

Winey-vinegary

Caused by fermentation in the presence of oxygen; it can be reminiscent of vinegar or nail polish.

Musty

Caused by moldy olives, it tastes of dusty, musty old

clothes, or the basement floor.

More Demand than Supply

International Olive Council and USDA Standards:

• Chemical tests

• Tasting panel sensory test

Sensory tests are expensive and subjective.

Imported Extra Virgin Olive Oils account for 99% of the US supply,* and they often fail the EVOO sensory test.

* UC Davis Olive Center

Possible Solution: Chemical Screening

Develop a chemical screen to predict whether an olive

oil will pass the sensory test.

• Allows producers to submit only those olive oils for sensory

testing that have a high probability of passing

• Reduces certification costs

• Increases the quality of the EVOO available in the

marketplace

Olive Oil Characterization


1. To create a model that could predict whether an olive oil sample would pass or fail the sensory test

2. To find statistically significant olive oil components that are present at distinct levels depending on whether the samples passed or failed the sensory test

7200 Series GC/Q-TOF

• High resolution full acquisition spectra

• Accurate mass measurements

• Fast acquisition of full spectra

• MS/MS mode

– Full spectrum of Product Ions

• With high resolution and accurate mass

• High sensitivity structural elucidation tool

Ideal tool for solving complex analytical problems


New Removable Ion Source includes repeller, ion volume, extraction lens and dual filaments


Hot Quartz Monolithic Quadrupole analyzer identical to the 7000 Quadrupole MS/MS


Hexapole Collision Cell accelerates ions through the cell to enable faster generation of high-quality MS/MS spectra without cross-talk


Dual-Stage Ion Mirror improves second-order time focusing for high mass resolution

Pre dual-stage ion mirror Post dual-stage ion mirror

Proprietary INVAR flight tube sealed in a vacuum-insulated shell eliminates thermal

mass drift due to temperature changes to maintain excellent mass accuracy, 24/7.


4 GHz ADC electronics enable a high sampling rate (32 Gbit/s) that improves the resolution, mass accuracy, and sensitivity for low-abundance samples. Dual gain amplifiers simultaneously process detector signals through both low-gain and high-gain channels, extending the dynamic range to 10^5.


Analog-to-digital (ADC) Detector:

Unlike time-to-digital (TDC) detectors

which record single ion events, ADC

detection records multiple ion events,

allowing very accurate mass assignments

over a wide mass range and dynamic

range of concentrations.

EXPERIMENTAL DESIGN


Experimental Design


Pass/Fail

• Olive oil samples had been subjected to a sensory test and classified as passed or failed.

• Samples were analyzed on the 7200 GC/Q-TOF. Data was acquired in both EI and PCI modes.

• MassHunter Qual was used for deconvolution and library searches.

• Mass Profiler Professional (MPP) was used for statistical evaluation of the data, including construction of a class prediction model to predict whether a sample would pass or fail the sensory test.

EI/CI – MassHunter Qual and MPP

MassHunter Qualitative Analysis Deconvolution and Library Searches


Identification of Compounds using Library Search

Mass Profiler Professional Data filtering, PCA, ANOVA, Volcano Plot


RESULTS


EVOOs that Pass Sensory Evaluation

This is why we need powerful data analysis software!

EVOOs that Pass and Fail Sensory Evaluation

442 unique compounds

were distinguished by

chromatographic

deconvolution, most of

which occur only once or

twice and are filtered out by

MPP.

The table shows how many of these 442

compounds were actually found in each sample.
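
The frequency filtering described above can be sketched as follows. This is only a minimal illustration of the idea, not MPP's actual filter: the function name, the `min_samples` threshold, and the sample and compound names are all hypothetical.

```python
from collections import Counter

def filter_by_frequency(samples, min_samples=3):
    """Keep only compounds detected in at least `min_samples` samples.

    `samples` maps a sample name to the set of compound IDs found in it.
    """
    # Count in how many samples each compound was detected.
    counts = Counter(cpd for found in samples.values() for cpd in set(found))
    kept = {cpd for cpd, n in counts.items() if n >= min_samples}
    # Drop the rarely observed compounds from every sample.
    return {name: set(found) & kept for name, found in samples.items()}

# Hypothetical detections, not data from the study.
samples = {
    "oil_A": {"cpd1", "cpd2", "cpd3"},
    "oil_B": {"cpd1", "cpd2"},
    "oil_C": {"cpd1", "cpd4"},
    "oil_D": {"cpd1", "cpd2", "cpd5"},
}
filtered = filter_by_frequency(samples, min_samples=3)
print(sorted(filtered["oil_A"]))
```

Compounds seen only once (`cpd3`, `cpd4`, `cpd5` here) drop out, which is the effect the slide attributes to the MPP filter.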

Olive Oil Characterization: Data Filtering

PCA shows how the pass/fail data clusters. The samples that failed the sensory test are marked in red and the ones that passed are blue.

Principal Component Analysis is used to Visualize

failed

passed

The Volcano Plot (on the right) shows fold-change for each entity on the x-axis and significance on the y-axis.

Compounds accumulated

in the samples that failed

the sensory test.
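
The two quantities behind a volcano plot can be sketched for a single compound as below. This is a simplified illustration, not MPP's implementation: the function name and the abundance values are hypothetical, and the Welch t statistic stands in for the significance axis (a plot would normally convert it to a p-value and show -log10 of that).

```python
import math
from statistics import mean, variance

def volcano_coordinates(failed, passed):
    """Return (log2 fold change, Welch t statistic) for one compound."""
    # x-axis: fold change of failed over passed, on a log2 scale.
    log2_fc = math.log2(mean(failed) / mean(passed))
    # Basis of the y-axis: Welch's t statistic (unequal variances allowed).
    se = math.sqrt(variance(failed) / len(failed) + variance(passed) / len(passed))
    t_stat = (mean(failed) - mean(passed)) / se
    return log2_fc, t_stat

# Hypothetical abundances for one compound, not data from the study.
failed = [7.0, 8.0, 9.0]
passed = [1.0, 2.0, 3.0]
log2_fc, t_stat = volcano_coordinates(failed, passed)
print(log2_fc, round(t_stat, 3))
```

A compound that lands far to the right and high on the plot is both strongly increased in the failed samples and statistically distinguishable from the passed ones.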

Fold Change Analysis

Raw Data Verification

It pays to go back to the raw data to visually inspect MPP (multivariate

statistical) results. Here we see the raw data verification of the peak at

27.54 minutes.

Building the Classification Model


Training the model with data:

• Two data classes were established with the compounds (markers) increased in the failed samples

• Create a model that predicts whether an olive oil sample will

pass the sensory test
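
As a rough sketch of what training and applying a two-class model involves, the example below uses a nearest-centroid classifier. The slides do not say which algorithm MPP used, so the technique is a stand-in chosen for brevity, and the marker abundances are hypothetical.

```python
import math

def centroid(rows):
    """Column-wise mean of a list of equal-length abundance vectors."""
    return [sum(col) / len(col) for col in zip(*rows)]

def train(training):
    """`training` maps a class label to its list of marker-abundance vectors."""
    return {label: centroid(rows) for label, rows in training.items()}

def predict(model, sample):
    """Assign the class whose centroid is closest (Euclidean distance)."""
    return min(model, key=lambda label: math.dist(model[label], sample))

# Hypothetical abundances of two marker compounds, not data from the study.
training = {
    "fail": [[9.0, 7.5], [8.5, 8.0]],
    "pass": [[2.0, 1.5], [1.5, 2.5]],
}
model = train(training)
print(predict(model, [8.0, 7.0]))
print(predict(model, [2.5, 2.0]))
```

Samples held out of `training` play the role of the rows whose training parameter is set to 'None' when the model is tested.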


Testing the Model

All samples correctly predicted. The samples that were not

used for building the prediction model are listed with the training

parameter set as ‘None’.


Testing the Model

Commercial unit mass EI spectral libraries can be searched using accurate mass

EI GC/Q-TOF data to identify compounds

Compound spectrum

NIST library spectrum

Compound spectrum

(accurate mass)

EI

Library Searching

Fragment ion formulas and mass errors:

• C12H17: 5.11 ppm

• C9H11: -3.58 ppm

• C8H9: -2.63 ppm

• C10H13: 0.93 ppm

α-Cubebene, full scan: C15H24

α-Cubebene, MS/MS: Precursor: 204, CE: 10 eV

[Figure: (replib) α-Cubebene library spectrum, m/z 40 to 230, relative abundance 0 to 100; labeled peaks at m/z 41, 55, 69, 77, 81, 91, 105, 119, 133, 147, 161, 175, 189, 204]

Accurate masses of ion fragments are consistent with molecular formula
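
The ppm figures on this slide compare a measured m/z with the theoretical m/z of a proposed fragment formula. The sketch below shows that arithmetic, assuming singly charged cations and standard monoisotopic atomic masses. The helper names are hypothetical, and because the measured m/z values are not given in the slides, the example only converts the reported ppm errors into absolute deviations.

```python
import re

# Monoisotopic atomic masses (u) and the electron mass.
MONO = {"C": 12.0, "H": 1.00782503223, "N": 14.00307400443, "O": 15.99491461957}
ELECTRON = 0.000548579909

def cation_mz(formula):
    """Theoretical monoisotopic m/z of a singly charged cation."""
    mass = 0.0
    for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        mass += MONO[element] * (int(count) if count else 1)
    return mass - ELECTRON  # one electron removed on ionization

def ppm_error(measured, theoretical):
    """Relative mass error in parts per million."""
    return (measured - theoretical) / theoretical * 1e6

# Formulas and ppm errors as reported on the slide.
reported = {"C12H17": 5.11, "C9H11": -3.58, "C8H9": -2.63, "C10H13": 0.93}
for formula, ppm in reported.items():
    mz = cation_mz(formula)
    # Absolute deviation implied by the reported ppm error, in mDa.
    print(formula, round(mz, 4), round(mz * ppm * 1e-3, 2), "mDa")
```

At these masses a few ppm corresponds to well under 1 mDa, which is why the fragment formulas can be checked against the proposed molecular formula.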

MS/MS Analysis

Molecular Structure Correlator

Odor Profile of Compounds Increased in

Failed Samples

Searching the flavor company catalogs by CAS number provided

odor profile information on the up-regulated compounds.

Proposed NIST ID                  Formula     CAS          Odor

n-Hexadecanoic acid               C16H32O2    57-10-3      Faint oily (1)

Octadecanoic acid, ethyl ester    C20H40O2    111-61-5     Waxy (2)

Squalene                          C30H50      111-02-4     Floral (2)

α-Cubebene                        C15H24      17699-14-8   Herbal (2)

(1) Bedoukian Research

(2) The Good Scents Company

Summary: A model that predicts the classification of extra virgin olive oils was constructed using

data from the 7200 GC/Q-TOF and the MassHunter software suite.


High Resolution provides the required selectivity

Mass Accuracy is essential for defining compounds

MS/MS is critical for structural confirmation of unknowns

Comprehensive Software is vital for turning MS data into pertinent and relevant results

Data-Directed Multi-Omics of Biological

Pathways

Theodore Sana, PhD

Senior Scientist

Life Sciences Group, Agilent Technologies

Metabolomics and Integrated Biology Workflow

6550 QTOF Ion Funnel Technology:

• General HW overview

• Untargeted, unlabeled metabolite workflow

• Metabolite Identification

New Software Tools:

• Personal Compound Databases & Library (PCDL)

• Pathway Metabolite Database Creator (PMDC)

• Molecular Structure Correlator (MSC)

• MS/MS Library

Data driven multi-omics analysis for Pathway Mapping and Targeted Proteomics:

– GeneSpring

– Pathway Analysis Capabilities

• Jet Stream Technology: generating more ions

• Hexabore Inlet Capillary: sampling more ions

• Dual Ion Funnel: focusing more ions

6550 Q-TOF: NEW TECHNOLOGY IN ION SAMPLING IMPROVES

SENSITIVITY

Metabolites Detected

6520

6550

8-fold increase in coverage

(Courtesy of Prof. Nicola Zamboni, ETH, Zurich)

Drastically Improved Metabolite Coverage of

Central Carbon Metabolome Map

Increasing Your Confidence in Compound

Identification

1. MS/MS library and retention time matching

2. MS/MS library matching

3. Molecular Structure Correlation (MSC): a tool that uses

compound MS/MS spectral information to predict

bonds/compound structure

4. Accurate Mass + RT: Database matching accurate mass

with isotope pattern matching and Retention Time

5. Database matching, with isotope pattern matching

6. Database matching using accurate mass measurement:

Untargeted and/or Targeted (PMDC) mining of data

Confidence

Metabolomics – Untargeted & Targeted Data

Mining

“Untargeted Data Acquisition” – TOF / QTOF

Profiling of “unknowns”; all detectable metabolites; relative quantitation

– Untargeted data mining:

• Samples are typically unlabeled

• Naïve or discovery based analysis (novel biomarkers/signatures)

• Finds hundreds/thousands of metabolites

• Metabolites need to be “identified”

• Map metabolites onto biological pathways

– Targeted data mining:

• Analyze data files using a database of compound names and target formulas (PCDL)

• Map results onto pathways

METLIN Personal Compound Database (PCD)

An accurate mass LC-MS database

• Based on public METLIN database

• Metabolomics specific database

• Contains > 50,000 compounds

• ~8000 lipids from LipidMaps

• 679 metabolites have an Agilent

provided retention time

• Customizable by user

• Works with other Agilent software

• MassHunter Qual

• ID Browser

• PCDL Manager


Untargeted Data Mining: MassHunter Molecular Feature

Extraction (MFE), a Naïve Peak Finding algorithm

Find compound signals

• Find co-eluting ions that are related

• Include isotopes (13C, 15N, 2H, 18O)

• Include adducts, such as Na+ or K+

• Include dimers, such as (2M+H)+

• Create a compound chromatogram

(ECC)

• Sum all ion signals into one value

(Feature)

• Create a compound spectrum

• Report results as retention time

and neutral mass

• Fully automated processing

• Create data file for export

Neutral Mass = 113.0589
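
The step of reporting related ions as one neutral mass can be illustrated as below. This is a minimal sketch and not the MFE algorithm: the adduct table covers only the ion types named on the slide, and the observed m/z values are hypothetical, chosen to be consistent with the neutral mass of 113.0589 shown above.

```python
PROTON = 1.007276467  # mass of H+ (u)

# adduct name: (number of molecules M in the ion, mass of the charge carrier)
ADDUCTS = {
    "[M+H]+": (1, PROTON),
    "[M+Na]+": (1, 22.98922),   # Na minus one electron
    "[M+K]+": (1, 38.96316),    # K minus one electron
    "[2M+H]+": (2, PROTON),
}

def neutral_mass(mz, adduct):
    """Convert the m/z of a singly charged adduct ion to the neutral mass M."""
    n_molecules, carrier_mass = ADDUCTS[adduct]
    return (mz - carrier_mass) / n_molecules

# Hypothetical co-eluting ions, not values taken from the slides.
observed = {"[M+H]+": 114.0662, "[M+Na]+": 136.0481, "[2M+H]+": 227.1251}
masses = {adduct: neutral_mass(mz, adduct) for adduct, mz in observed.items()}
for adduct, mass in masses.items():
    print(adduct, round(mass, 4))
```

Ions that collapse to the same neutral mass at the same retention time are the ones that would be grouped into a single feature.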

“Targeted” Data Mining of data acquired in Untargeted

mode: Pathway to Metabolite Database Creator (PMDC)

Convert pathway metabolite information into Agilent personal compound database

• Select one or more pathways

• Remove redundant metabolites

PMDC

Search Pathway Database

Search Pathway

• Text search using:

– Match pathways: name in pathway

– Reaction partners: compounds related to the typed compound

Add Pathways to Create New Database

Add Pathways

• Select pathways from matching pathway list

• Press add pathway

• Create database

• Redundant compounds are removed
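
The select, merge, and de-duplicate steps can be sketched as below. This illustrates the idea only and does not use any PMDC interface: the function name is hypothetical and the two pathway excerpts are abbreviated examples.

```python
def build_database(pathways, selected):
    """Merge the compounds of the selected pathways, keeping each compound once."""
    database = {}
    for name in selected:
        for compound_id, formula in pathways[name]:
            # setdefault keeps the first entry and skips redundant repeats.
            database.setdefault(compound_id, formula)
    return database

# Abbreviated, illustrative pathway excerpts (compound name, formula).
pathways = {
    "Glycolysis": [("pyruvate", "C3H4O3"), ("glucose", "C6H12O6")],
    "TCA cycle": [("pyruvate", "C3H4O3"), ("citrate", "C6H8O7")],
}
db = build_database(pathways, ["Glycolysis", "TCA cycle"])
print(sorted(db))
```

Pyruvate appears in both selected pathways but only once in the resulting database, which can then serve as the target list for mining untargeted data.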

METLIN Personal Compound Database & Library

(PCDL)

METLIN PCD plus an accurate mass LC-MS/MS library (QTOF)

• MS/MS spectra from mono-isotopic ion

• MS/MS spectra are collected in ESI positive and negative ion mode

• Fragmentation data is collected at three collision energies: 10, 20 and 40 eV

• MS/MS spectra are curated for quality

• Fragment ions are confirmed

• Fragment ions are mass corrected

• Noise ions removed

• Manually reviewed

• MS/MS searches use MassHunter Qual

• MS/MS Library contains > 2000 compounds

10, 20, 40 eV

MS/MS spectral difference matching for Sample vs Library:

[Figure: forward and reverse library match, counts vs. mass-to-charge (m/z), m/z 50 to 600.

Sample spectrum: Cpd 1, -ESI Product Ion (0.444-0.528 min, 4 Scans), Frag=140.0V, precursor 558.0644 [z=1]; peaks at m/z 78.9593, 210.9986, 290.9646, 346.0542, 408.0103, 522.0420, 558.0635.

Library spectrum: N1-(5-Phospho-D-ribosyl)-AMP, C15H23N5O14P2, Product Ion, Frag=140.0V, Metlin_AM …; peaks at m/z 78.9591, 210.9997, 290.9676, 346.0545, 408.0099, 522.0433, 558.0644.]
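
A simplified way to see how closely the sample and library spectra agree is to pair their fragment m/z values within a ppm tolerance. The sketch below does only that; it ignores intensities and is not the MassHunter scoring algorithm. The function name and the 20 ppm default are assumptions, and the "forward" and "reverse" fractions are a loose analogy to the two match directions. The m/z values are the ones labeled in the figure.

```python
def match_peaks(query, reference, tol_ppm=20.0):
    """Pair each query m/z with the closest reference m/z inside the tolerance."""
    pairs = []
    for mz in query:
        closest = min(reference, key=lambda ref: abs(ref - mz))
        ppm = (mz - closest) / closest * 1e6
        if abs(ppm) <= tol_ppm:
            pairs.append((mz, closest, round(ppm, 2)))
    return pairs

# Fragment m/z values labeled on the sample and library spectra.
sample = [78.9593, 210.9986, 290.9646, 346.0542, 408.0103, 522.0420, 558.0635]
library = [78.9591, 210.9997, 290.9676, 346.0545, 408.0099, 522.0433, 558.0644]

forward = match_peaks(sample, library)   # sample peaks explained by the library
reverse = match_peaks(library, sample)   # library peaks found in the sample
print(len(forward) / len(sample), len(reverse) / len(library))
for mz, ref, ppm in forward:
    print(mz, ref, ppm)
```

Tightening `tol_ppm` shows which fragment assignments depend on a generous tolerance.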

MassHunter MS/MS Structural Correlation (MSC)

Search database of known compounds using empirical formula or mass

• Database matches must have compound structures

• Assign fragment ions to substructures of the proposed parent structure

• Assign probability to fragment forming

• Calculate a probability the proposed structure fits the MS/MS data

• Need to confirm via standards

Search ChemSpider – 30,000,000 entries

MS/MS Fragmentation

Structure Explained

GeneSpring: A Bioinformatics Suite of Integrated Modules

• GX: mRNA, Alternative Splicing, microRNA, Genome-wide association, Copy Number Variation

• MPP: MS-Proteomics, MS-Metabolomics

• Integrated Biology: Joint Pathway Analysis, Computational Network Discovery

• NGS: SureSelect Target Enrichment, Whole Genome Sequencing, DNA Variation, Chromosomal Rearrangements, RNA-Seq, Gene Fusion Detection, Alternative Splicing

LC/MS

GC/MS

Microarrays Biological

Pathways

MassHunter Qual/Quant

ChemStation AMDIS

Feature Extraction GeneSpring Platform

Data Driven multi–omics technologies & Pathway mapping

Alignment to Reference Genome NGS


MOA Results: Venn Diagram of Enriched Pathways


Multi-Dataset Visualization & Analysis

Supported Metabolite Databases:

1. KEGG

2. HMDB

3. LMP

4. ChEBI

5. CAS

Protein Databases:

1. Swiss-Prot

2. UniProt

3. UniProt/TrEMBL

Gene Databases:

Entrez Gene, GenBank, Ensembl, EC Number,

RefSeq, UniGene, HUGO, HGNC, EMBL

BridgeDb resolves the mapping problem

between databases for small molecules,

genes, or protein identifiers

Gene, Protein,

and Metabolite ID

Map entities

Table of compounds

BridgeDb Pathway

Db

Met1 CAS KEGG1

Met2 ------ ChEBI2

Met3 KEGG3 ------

KEGG1

ChEBI

KEGG3
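
The mapping problem in the table above (each metabolite carries identifiers from a different mix of databases) can be illustrated with a small lookup. This sketch does not call BridgeDb: the function name and the preference order are assumptions, and the identifiers are the placeholders from the slide, with `CAS1` added as a placeholder for the unnamed CAS entry.

```python
def pathway_id(compound_ids, preferred=("KEGG", "ChEBI", "CAS")):
    """Return (database, identifier) from the first preferred database with an ID."""
    for database in preferred:
        identifier = compound_ids.get(database)
        if identifier:
            return database, identifier
    return None  # no usable identifier for pathway mapping

# Placeholder identifiers mirroring the table of compounds on the slide.
compounds = {
    "Met1": {"CAS": "CAS1", "KEGG": "KEGG1"},
    "Met2": {"ChEBI": "ChEBI2"},
    "Met3": {"KEGG": "KEGG3"},
}
mapped = {name: pathway_id(ids) for name, ids in compounds.items()}
for name, hit in mapped.items():
    print(name, hit)
```

Each metabolite resolves to one identifier the pathway database understands, even though no single database covers all three.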

Joint Pathway Analysis Tyrosine Metabolism

Microarray and

Metabolite

Data Overlay

Microarray or

Metabolite Data

Results

Heatmap of all pathway entities, dynamically linked to pathway

selection for comparative analysis

Active tab

Genes

Propose new experiments based on

pathway analysis

1. Re-mine originally acquired (or legacy)

untargeted metabolomics data based

on pathway analysis—create db

2. Design new experiments (metabolite,

protein or genes) based on pathway

results interpretation

Pathway-directed re-mining of metabolite data

Build custom metabolite

database

PCDL

Export protein IDs to Peptide

Selector for targeted MS/MS

Spectrum Mill

Upload select pathway genes

for custom microarray or NGS

design

eArray

Acknowledgements

Steve Fischer (Metabolomics & Proteomics Marketing Manager)

Norton Kitagawa (ID Browser, LC/MS: MPP software)

D. Benjamin Gordon (GeneSpring-IB architecture/PMDC)

Joe Roark and the MH Qual software engineering team

Christine Miller (Proteomics Applications)

Dave Peterson (GC/MS: MPP software)

Michael Janis, Michael Rosenberg (GeneSpring Technical Marketing)

Jayati Ghosh and Ashutosh (GeneSpring R&D)

Allan Kuchinsky for Cytoscape Plug-in, plus Agilent Labs team

Kyu Rhee (Weill Cornell Medical Center)