Agenda:
• Applications of GC/Q-TOF for Metabolomics Jennifer Gushue, Ph.D.
• Data-Directed Multi-Omics of Biological Pathways Theodore Sana, Ph.D.
• Integrated Biology is a Software Challenge Norton Kitagawa, Ph.D.
September 18, 2012
Metabolomics Society
Olive Oil Classification using the
Agilent 7200 Series GC/Q-TOF System
UC Davis Olive Center
Stephan Baumann, Agilent Technologies
Olive Oil Demand is Growing
The United States market is expected to surpass $1.8
billion by 2013.*
• Increased interest in Mediterranean foods
• Health benefits associated with olive oil:
– Reduced risk of coronary heart disease
– Rich in antioxidants and anti-inflammatory compounds**
* Packaged Facts Market Report
** US FDA
Olive Oil Standards
International Olive Council and USDA Standards:
• Chemical tests
• Tasting panel sensory test
Limited understanding of organoleptic smell and taste.
Most Common Olive Oil Defects
Rancidity
A flavor in the olive oil usually accompanied by a greasy
mouth feel. In sensory tests, this greasiness is often
noticed first.
Fusty
It is caused by fermentation in the absence of oxygen; this
occurs within the olives before they are milled. A fusty smell
has been compared to sweaty socks, swampy vegetation,
or a too-wet compost heap.
Winey-vinegary
Caused by fermentation in the presence of oxygen; it can be
reminiscent of vinegar or nail polish.
Musty
Caused by moldy olives, it tastes of dusty, musty old
clothes, or the basement floor.
More Demand than Supply
International Olive Council and USDA Standards:
• Chemical tests
• Tasting panel sensory test
Sensory tests are expensive and subjective.
Imported extra virgin olive oils (EVOO) account for 99% of the
US supply,* and they often fail the EVOO sensory test.
* UC Davis Olive Center
Possible Solution: Chemical Screening
Develop a chemical screen to predict whether an olive
oil will pass the sensory test.
• Allows producers to submit only those olive oils for sensory
testing that have a high probability of passing
• Reduces certification costs
• Increases the quality of the EVOO available in the
marketplace
Olive Oil Characterization
1. To create a model that could predict
whether an olive oil sample would pass
or fail the sensory test
2. To find statistically significant olive oil
components that are present at
distinct levels depending on whether
the sample passed or failed the sensory test
7200 Series GC/Q-TOF
• High resolution full acquisition spectra
• Accurate mass measurements
• Fast acquisition of full spectra
• MS/MS mode
– Full spectrum of Product Ions
• With high resolution and accurate mass
• High sensitivity structural elucidation tool
Ideal tool for solving complex analytical problems
New Removable Ion Source includes repeller, ion volume, extraction lens and dual filaments
Hot Quartz Monolithic Quadrupole analyzer identical to the 7000 Quadrupole MS/MS
Hexapole Collision Cell accelerates ions through the cell to enable faster generation of
high-quality MS/MS spectra without cross-talk
Dual-Stage Ion Mirror improves second-order time focusing for high mass resolution
Pre dual-stage ion mirror Post dual-stage ion mirror
Proprietary INVAR flight tube sealed in a vacuum-insulated shell eliminates thermal
mass drift due to temperature changes to maintain excellent mass accuracy, 24/7.
4 GHz ADC electronics enable a high
sampling rate (32 Gbit/s) that improves
resolution, mass accuracy, and sensitivity
for low-abundance samples. Dual gain
amplifiers simultaneously process
detector signals through both low-gain and
high-gain channels, extending the dynamic
range to 10^5.
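As a quick consistency check on the numbers above: a 4 GHz sampling rate yields 32 Gbit/s if each sample is digitized at 8 bits. The bit depth is an assumption made here for illustration; the slide does not state it.

```python
# Back-of-the-envelope check of the quoted ADC data rate.
SAMPLING_RATE_HZ = 4e9   # 4 GHz, from the slide
BITS_PER_SAMPLE = 8      # assumption: not stated in the source


def adc_data_rate_gbit_per_s(sampling_rate_hz: float, bits_per_sample: int) -> float:
    """Raw digitizer output in Gbit/s: samples per second times bits per sample."""
    return sampling_rate_hz * bits_per_sample / 1e9


def time_per_sample_ns(sampling_rate_hz: float) -> float:
    """Spacing between consecutive samples in nanoseconds."""
    return 1e9 / sampling_rate_hz


rate = adc_data_rate_gbit_per_s(SAMPLING_RATE_HZ, BITS_PER_SAMPLE)
spacing = time_per_sample_ns(SAMPLING_RATE_HZ)
```

Under that assumption, the detector signal is sampled every quarter of a nanosecond, which is what lets an ADC record peak shapes rather than single ion events.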
Analog-to-digital (ADC) Detector:
Unlike time-to-digital (TDC) detectors
which record single ion events, ADC
detection records multiple ion events,
allowing very accurate mass assignments
over a wide mass range and dynamic
range of concentrations.
Experimental Design
Pass/Fail
• Olive oil samples had been subjected to a sensory test and
classified as passed or failed.
• Samples were analyzed on the 7200 GC/Q-TOF. Data were
acquired in both EI and PCI modes.
• MassHunter Qual was used for deconvolution and library
searches.
• Mass Profiler Professional (MPP) was used for statistical
evaluation of the data, including construction of a class
prediction model to predict whether a sample would pass or
fail the sensory test.
EI/CI – MassHunter Qual and MPP
MassHunter Qualitative Analysis Deconvolution and Library Searches
Identification of Compounds using Library Search
Mass Profiler Professional Data filtering, PCA, ANOVA, Volcano Plot
442 unique compounds
were distinguished by
chromatographic
deconvolution, most of
which occur only once or
twice and are filtered out by
MPP.
The table shows how many of these 442
compounds were actually found in each sample.
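The frequency filter described above can be sketched in a few lines. This is a minimal illustration, not MPP's implementation: the sample names, compound names, and the `min_samples` threshold are hypothetical, and MPP's actual filter settings are not given in the slides.

```python
from collections import Counter


def filter_by_frequency(samples: dict, min_samples: int) -> set:
    """Keep compounds detected in at least `min_samples` samples.

    `samples` maps a sample name to the set of compounds found in it.
    """
    counts = Counter(c for compounds in samples.values() for c in compounds)
    return {c for c, n in counts.items() if n >= min_samples}


# Hypothetical detections after chromatographic deconvolution.
detections = {
    "oil_A": {"squalene", "alpha-cubebene", "cpd_17"},
    "oil_B": {"squalene", "alpha-cubebene"},
    "oil_C": {"squalene", "cpd_42"},
}

# Compounds seen in only one sample are dropped.
kept = filter_by_frequency(detections, min_samples=2)
```

Filtering this way removes one-off deconvolution artifacts before any statistics are run, so later tests compare only entities that recur across samples.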
Olive Oil Characterization: Data Filtering
PCA shows how the pass/fail data clusters. The samples that failed the
sensory test are marked in red and the ones that passed are blue.
Principal Component Analysis is used to Visualize
failed
passed
The Volcano Plot (on the right) shows fold-change for each entity on the
x-axis and significance on the y-axis.
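The two volcano plot axes can be computed per entity as follows. This is a sketch under stated assumptions: the p-value is taken as already computed by some statistical test, and the 2-fold and p = 0.05 cutoffs are common defaults chosen here for illustration, not values taken from the slides.

```python
import math


def volcano_point(mean_failed: float, mean_passed: float, p_value: float) -> tuple:
    """Return (log2 fold change, -log10 p) for one entity."""
    if mean_failed <= 0 or mean_passed <= 0:
        raise ValueError("mean abundances must be positive")
    if not 0 < p_value <= 1:
        raise ValueError("p_value must be in (0, 1]")
    return math.log2(mean_failed / mean_passed), -math.log10(p_value)


def is_significant(point: tuple, min_abs_log2_fc: float = 1.0, max_p: float = 0.05) -> bool:
    """True if the entity clears both the fold-change and the p-value cutoff."""
    log2_fc, neg_log10_p = point
    return abs(log2_fc) >= min_abs_log2_fc and neg_log10_p >= -math.log10(max_p)


# Hypothetical entity: four times more abundant in failed samples, p = 0.001.
point = volcano_point(mean_failed=400.0, mean_passed=100.0, p_value=0.001)
```

With this sign convention, entities that accumulate in failed samples land on the right-hand side of the plot, and the most significant ones sit highest.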
Compounds accumulated
in the samples that failed
the sensory test.
Fold Change Analysis
Raw Data Verification
It pays to go back to the raw data to visually inspect MPP (multivariate
statistical) results. Here we see the raw data verification of the peak at
27.54 minutes.
Building the Classification Model
Training the model with data:
• Two data classes were established with the compounds
(markers) increased in the failed samples
• Create a model that predicts whether an olive oil sample will
pass the sensory test
All samples correctly predicted. The samples that were not
used for building the prediction model are listed with the training
parameter set as ‘None’.
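The train-then-test idea above can be illustrated with a deliberately simple classifier. The slides do not say which algorithm MPP used, so a nearest-centroid classifier is swapped in here purely for illustration. The feature values are hypothetical log-abundances of two marker compounds.

```python
import math


def centroid(rows: list) -> list:
    """Column-wise mean of a list of equal-length feature vectors."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def train(samples: list) -> dict:
    """Build {label: centroid} from a list of (features, label) pairs."""
    by_label = {}
    for features, label in samples:
        by_label.setdefault(label, []).append(features)
    return {label: centroid(rows) for label, rows in by_label.items()}


def predict(model: dict, features: list) -> str:
    """Assign the label whose centroid is closest in Euclidean distance."""
    return min(model, key=lambda label: math.dist(model[label], features))


# Hypothetical training set: markers are elevated in failed samples.
training = [
    ([2.1, 1.9], "fail"), ([2.4, 2.2], "fail"),
    ([0.3, 0.5], "pass"), ([0.6, 0.2], "pass"),
]
model = train(training)

# Samples held out of training play the role of the 'None' rows.
held_out_prediction = predict(model, [2.0, 2.0])
```

Holding some samples out of training, as the slide describes, is what makes the "all samples correctly predicted" result meaningful.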
Testing the Model
Commercial unit mass EI spectral libraries can be searched using accurate mass
EI GC/Q-TOF data to identify compounds
Compound spectrum
NIST library spectrum
Compound spectrum
(accurate mass)
EI
Library Searching
[Figure: α-Cubebene (C15H24) spectra: full scan; MS/MS (precursor: m/z 204, CE: 10 eV); and the NIST (replib) library spectrum, m/z 40 to 230.
Annotated fragment formulas with mass errors: C12H17 (5.11 ppm), C9H11 (-3.58 ppm), C8H9 (-2.63 ppm), C10H13 (0.93 ppm).
Labeled peaks (m/z): 41, 55, 69, 77, 81, 91, 105, 119, 133, 147, 161, 175, 189, 204.]
Accurate masses of ion fragments are consistent with molecular formula
MS/MS Analysis
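The ppm values annotated on the fragments are relative mass errors between the measured m/z and the theoretical m/z of the proposed formula. A minimal sketch of that calculation follows. The measured value used in the example is hypothetical, because the slides report only the resulting errors.

```python
# Monoisotopic masses (u) of the lightest isotopes, and the electron mass.
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503207}
ELECTRON_MASS = 0.00054857990946


def cation_mz(formula_counts: dict, charge: int = 1) -> float:
    """Theoretical m/z of a positive ion: neutral mass minus the lost electrons."""
    neutral = sum(MONOISOTOPIC[element] * n for element, n in formula_counts.items())
    return (neutral - charge * ELECTRON_MASS) / charge


def ppm_error(measured: float, theoretical: float) -> float:
    """Relative mass error in parts per million."""
    return (measured - theoretical) / theoretical * 1e6


# Fragment formula taken from the slide; the measured m/z is hypothetical.
theoretical = cation_mz({"C": 10, "H": 13})
measured = 133.1013
error = ppm_error(measured, theoretical)
```

Errors of a few ppm, as on the slide, mean the measured fragment masses agree with the proposed formulas to within a few ten-thousandths of a mass unit at this m/z.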
Odor Profile of Compounds Increased in
Failed Samples
Searching the flavor company catalogs by CAS number provided
odor profile information on the up-regulated compounds.
Proposed NIST ID                  Formula     CAS          Odor
n-Hexadecanoic acid               C16H32O2    57-10-3      Faint oily (1)
Octadecanoic acid, ethyl ester    C20H40O2    111-61-5     Waxy (2)
Squalene                          C30H50      111-02-4     Floral (2)
α-Cubebene                        C15H24      17699-14-8   Herbal (2)
(1) Bedoukian Research  (2) The Good Scents Company
Summary: A model that predicts the classification of extra virgin olive oils was constructed using
data from the 7200 GC/Q-TOF and the MassHunter software suite.
High Resolution provides the required
selectivity
Mass Accuracy is essential for
defining compounds
MS/MS is critical for structural
confirmation of unknowns
Comprehensive Software is vital for
turning MS data into pertinent and
relevant results
Data-Directed Multi-Omics of Biological
Pathways
Theodore Sana, PhD
Senior Scientist
Life Sciences Group, Agilent Technologies
Metabolomics and Integrated Biology Workflow
6550 QTOF Ion Funnel Technology:
• General HW overview
• Untargeted, unlabeled metabolite workflow
• Metabolite Identification
New Software Tools:
• Personal Compound Databases & Library (PCDL)
• Pathway Metabolite Database Creator (PMDC)
• Molecular Structure Correlator (MSC)
• MS/MS Library
Data driven multi-omics analysis for Pathway Mapping and Targeted Proteomics:
– GeneSpring
– Pathway Analysis Capabilities
Jet Stream Technology Hexabore Inlet Capillary Dual Ion Funnel
Generating more ions Sampling more ions Focusing more ions
6550 Q-TOF: NEW TECHNOLOGY IN ION SAMPLING IMPROVES
SENSITIVITY
Metabolites Detected
6520
6550
(Courtesy of Prof. Nicola Zamboni, ETH, Zurich) 8-fold increase in coverage
Drastically Improved Metabolite Coverage of
Central Carbon Metabolome Map
Increasing Your Confidence in Compound
Identification
1. MS/MS library and retention time matching
2. MS/MS library matching
3. Molecular Structure Correlation (MSC): a tool that uses
compound MS/MS spectral information to predict
bonds/compound structure
4. Accurate Mass + RT: Database matching accurate mass
with isotope pattern matching and Retention Time
5. Database matching, with isotope pattern matching
6. Database matching using accurate mass measurement:
Untargeted and/or Targeted (PMDC) mining of data
Confidence
Metabolomics – Untargeted & Targeted Data
Mining
“Untargeted Data Acquisition” – TOF / QTOF
Profiling of “unknowns”; all detectable metabolites; relative quantitation
– Untargeted data mining:
• Samples are typically unlabeled
• Naïve or discovery based analysis (novel biomarkers/signatures)
• Finds hundreds/thousands of metabolites
• Metabolites need to be “identified”
• Map metabolites onto biological pathways
– Targeted data mining:
• Analyze data files using a database of compound names and target formulas (PCDL)
• Map results onto pathways
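Targeted mining against a compound database amounts to looking up each observed neutral mass within a mass tolerance. The sketch below assumes a 10 ppm window and a three-entry illustrative database. Neither the tolerance nor the database contents come from the slides, and a real PCDL search also uses formulas, isotope patterns, and retention times.

```python
def match_by_mass(neutral_mass: float, database: dict, tol_ppm: float = 10.0) -> list:
    """Return the names of database entries within `tol_ppm` of `neutral_mass`."""
    hits = []
    for name, db_mass in database.items():
        if abs(neutral_mass - db_mass) / db_mass * 1e6 <= tol_ppm:
            hits.append(name)
    return sorted(hits)


# Illustrative entries only: compound name -> monoisotopic neutral mass (u).
EXAMPLE_DB = {
    "creatinine": 113.05891,
    "proline": 115.06333,
    "glucose": 180.06339,
}

hits = match_by_mass(113.0589, EXAMPLE_DB)
```

A mass-only hit is the lowest-confidence level of identification; isomers share the same mass, so a match like this still needs confirmation by retention time or MS/MS.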
METLIN Personal Compound Database (PCD)
An accurate mass LC-MS database
• Based on public METLIN database
• Metabolomics specific database
• Contains > 50,000 compounds
• ~8000 lipids from LipidMaps
• 679 metabolites have an Agilent
provided retention time
• Customizable by user
• Works with other Agilent software
• MassHunter Qual
• ID Browser
• PCDL Manager
Untargeted Data Mining: MassHunter Molecular Feature
Extraction (MFE), a Naïve Peak Finding algorithm
Find compound signals
• Find co-eluting ions that are related
• Include isotopes (13C, 15N, 2H, 18O)
• Include adducts, such as Na+ or K+
• Include dimers, such as (2M+H)+
• Create a compound chromatogram
(ECC)
• Sum all ion signals into one value
(Feature)
• Create a compound spectrum
• Report results as retention time
and neutral mass
• Fully automated processing
• Create data file for export
Neutral Mass = 113.0589
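The adduct and dimer relationships listed above are what allow several ions to be collapsed into one neutral mass. The sketch below covers only that mass arithmetic for singly charged positive ions. It is not the MFE algorithm, which also uses co-elution and isotope patterns. The observed m/z list and the 10 ppm tolerance are assumptions.

```python
# Adduct name -> (number of molecules M per ion, mass shift in u), for z = +1.
ADDUCTS = {
    "[M+H]+": (1, 1.007276),
    "[M+Na]+": (1, 22.989221),
    "[M+K]+": (1, 38.963158),
    "[2M+H]+": (2, 1.007276),
}


def expected_mz(neutral: float, adduct: str) -> float:
    """m/z at which a given adduct of the neutral molecule should appear."""
    n, shift = ADDUCTS[adduct]
    return n * neutral + shift


def neutral_mass(mz: float, adduct: str) -> float:
    """Invert expected_mz: neutral mass implied by an ion and its adduct type."""
    n, shift = ADDUCTS[adduct]
    return (mz - shift) / n


def group_ions(mzs: list, candidate_neutral: float, tol_ppm: float = 10.0) -> dict:
    """Label each observed m/z that one candidate neutral mass can explain."""
    labels = {}
    for mz in mzs:
        for adduct in ADDUCTS:
            if abs(mz - expected_mz(candidate_neutral, adduct)) / mz * 1e6 <= tol_ppm:
                labels[mz] = adduct
    return labels


# Hypothetical co-eluting ions; the last one is unrelated.
observed = [114.0662, 136.0481, 227.1251, 150.0000]
labels = group_ions(observed, candidate_neutral=113.0589)
```

Ions that receive a label are summed into a single feature reported at the neutral mass; the unlabeled ion is left for another compound.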
“Targeted” Data Mining of data acquired in Untargeted
mode: Pathway to Metabolite Database Creator (PMDC)
Convert pathway metabolite information into Agilent personal compound database
• Select one or more pathways
• Remove redundant metabolites
PMDC
Search Pathway Database
Search Pathway
• Text search using:
– Match pathways: name in pathway
– Reaction partners: compounds related to the typed compound
Add Pathways to Create New Database
Add Pathways
• Select pathways from the matching pathway list
• Press add pathway
• Create database
• Redundant compounds are removed
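The merge step can be pictured as a union of the selected pathways' metabolite lists with repeats dropped. This is a minimal sketch with hypothetical pathway contents. It is not PMDC's implementation, which would match on compound identifiers rather than on names.

```python
def build_database(pathways: dict, selected: list) -> list:
    """Combine the metabolites of the selected pathways, keeping each one once.

    Order of first appearance is preserved.
    """
    seen = set()
    database = []
    for name in selected:
        for metabolite in pathways[name]:
            if metabolite not in seen:
                seen.add(metabolite)
                database.append(metabolite)
    return database


# Hypothetical pathway contents, for illustration only.
pathways = {
    "Glycolysis": ["glucose", "pyruvate", "ATP"],
    "TCA cycle": ["pyruvate", "citrate", "ATP"],
}

db = build_database(pathways, ["Glycolysis", "TCA cycle"])
```

Removing the redundant entries matters because shared metabolites such as cofactors appear in many pathways and would otherwise produce duplicate hits during targeted mining.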
METLIN Personal Compound Database & Library
(PCDL)
METLIN PCD plus an accurate mass LC-MS/MS library (QTOF)
• MS/MS spectra from mono-isotopic ion
• MS/MS spectra are collected in ESI positive and negative ion mode
• Fragmentation data is collected at three collision energies: 10, 20, and 40 eV
• MS/MS spectra are curated for quality
• Fragment ions are confirmed
• Fragment ions are mass corrected
• Noise ions removed
• Manually reviewed
• MS/MS searches use MassHunter Qual
• MS/MS Library contains > 2000 compounds
10, 20, 40 eV
MS/MS spectral difference matching for Sample vs Library:
[Figure: mirror plot of sample vs. library MS/MS spectra, Counts vs. Mass-to-Charge (m/z), m/z 50 to 600, with forward and reverse match results.
Sample: Cpd 1, -ESI Product Ion (0.444-0.528 min, 4 Scans), Frag=140.0V, precursor 558.0644 (z=1). Fragments (m/z): 78.9593, 210.9986, 290.9646, 346.0542, 408.0103, 522.0420, 558.0635.
Library (METLIN): N1-(5-Phospho-D-ribosyl)-AMP, C15H23N5O14P2, Product Ion, Frag=140.0V. Fragments (m/z): 78.9591, 210.9997, 290.9676, 346.0545, 408.0099, 522.0433, 558.0644.]
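Forward and reverse matching differ in how they treat sample peaks that are missing from the library spectrum: a forward score penalizes them, a reverse score ignores them. The sketch below uses a plain cosine similarity to show the distinction. MassHunter's actual scoring function is not described in the slides, and the intensities and the 0.01 m/z tolerance are hypothetical.

```python
import math


def _cosine(a, b) -> float:
    """Cosine similarity of two equal-length intensity vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def match_scores(sample: dict, library: dict, tol: float = 0.01) -> tuple:
    """Return (forward, reverse) scores for two {m/z: intensity} spectra."""
    if not library:
        raise ValueError("library spectrum is empty")
    pairs = []             # (sample intensity, library intensity) per library peak
    matched_sample = set()
    for lib_mz, lib_int in library.items():
        hit = next((mz for mz in sample
                    if abs(mz - lib_mz) <= tol and mz not in matched_sample), None)
        if hit is None:
            pairs.append((0.0, lib_int))
        else:
            matched_sample.add(hit)
            pairs.append((sample[hit], lib_int))
    # Reverse: only peaks present in the library spectrum are scored.
    reverse = _cosine(*zip(*pairs))
    # Forward: sample peaks absent from the library count against the match.
    extra = [(inten, 0.0) for mz, inten in sample.items() if mz not in matched_sample]
    forward = _cosine(*zip(*(pairs + extra)))
    return forward, reverse


# m/z values from the figure; intensities and the 150.0 peak are hypothetical.
library = {78.9591: 100.0, 346.0545: 60.0, 558.0644: 30.0}
sample = {78.9593: 100.0, 346.0542: 60.0, 558.0635: 30.0, 150.0: 50.0}
forward, reverse = match_scores(sample, library)
```

A high reverse score with a lower forward score, as in this example, suggests the library compound is present but the sample spectrum also contains ions from something else.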
MassHunter MS/MS Structural Correlation (MSC)
Search database of known compounds using empirical formula or mass
• Database matches must have compound structures
• Assign fragment ions to substructures of the proposed parent structure
• Assign a probability to the formation of each fragment
• Calculate the probability that the proposed structure fits the MS/MS data
• Need to confirm via standards
Search ChemSpider – 30,000,000 entries
MS/MS Fragmentation
Structure Explained
GX
mRNA
Alternative Splicing
microRNA
Genome-wide association
Copy Number Variation
MPP
MS-Proteomics
MS-Metabolomics
Integrated Biology
Joint Pathway Analysis
Computational Network Discovery
NGS
SureSelect Target Enrichment
Whole Genome Sequencing
DNA Variation
Chromosomal Rearrangements
RNA-Seq
Gene Fusion Detection
Alternative Splicing
GeneSpring:
A Bioinformatics Suite of Integrated Modules
LC/MS
GC/MS
Microarrays Biological
Pathways
MassHunter Qual/Quant
ChemStation AMDIS
Feature Extraction GeneSpring Platform
Data-driven multi-omics technologies & pathway mapping
Alignment to Reference Genome NGS
Multi-Dataset Visualization & Analysis
Supported Metabolite Databases:
1. KEGG
2. HMDB
3. LMP
4. ChEBI
5. CAS
Protein Databases:
1. Swiss-Prot
2. UniProt
3. UniProt/TrEMBL
Gene Databases:
Entrez Gene, GenBank, Ensembl, EC Number,
RefSeq, UniGene, HUGO, HGNC, EMBL
BridgeDb resolves the mapping problem
between databases for small-molecule,
gene, or protein identifiers
Gene, Protein,
and Metabolite ID
Map entities
Table of compounds
BridgeDb Pathway
Db
Met1 CAS KEGG1
Met2 ------ ChEBI2
Met3 KEGG3 ------
KEGG1
ChEBI
KEGG3
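The mapping problem in the diagram can be sketched as a cross-reference lookup: each metabolite carries identifiers from some databases but not others, and the mapper translates whichever identifier is available into the one the pathway database uses. This example does not use the BridgeDb API. It is a plain dictionary lookup, and every identifier in it is a placeholder.

```python
from typing import Optional

# Placeholder cross-reference rows: one dict of known identifiers per metabolite.
XREFS = [
    {"CAS": "CAS-0001", "KEGG": "KEGG1"},
    {"CAS": "CAS-0002", "ChEBI": "ChEBI2"},
    {"KEGG": "KEGG3", "HMDB": "HMDB-0003"},
]


def translate(identifier: str, source_db: str, target_db: str, xrefs: list) -> Optional[str]:
    """Return the target-database identifier for a source identifier, if known."""
    for row in xrefs:
        if row.get(source_db) == identifier:
            return row.get(target_db)
    return None


def map_to_pathway_ids(compounds: list, preferred: list, xrefs: list) -> dict:
    """Map (identifier, database) pairs to the first available preferred ID type."""
    mapped = {}
    for identifier, source_db in compounds:
        for target_db in preferred:
            target = translate(identifier, source_db, target_db, xrefs)
            if target is not None:
                mapped[identifier] = target
                break
    return mapped


table = [("CAS-0001", "CAS"), ("CAS-0002", "CAS"), ("HMDB-0003", "HMDB")]
mapped = map_to_pathway_ids(table, preferred=["KEGG", "ChEBI"], xrefs=XREFS)
```

Falling back through a list of preferred identifier types mirrors the diagram, where one metabolite resolves to a KEGG entry and another only to a ChEBI entry.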
Joint Pathway Analysis Tyrosine Metabolism
Microarray and
Metabolite
Data Overlay
Microarray or
Metabolite Data
Results
Heatmap of all pathway entities, dynamically linked to pathway
selection for comparative analysis
Active tab
Genes
Propose new experiments based on
pathway analysis
1. Re-mine originally acquired (or legacy)
untargeted metabolomics data based
on pathway analysis (create a database)
2. Design new experiments (metabolite,
protein or genes) based on pathway
results interpretation
Pathway-directed re-mining of metabolite data
Build custom metabolite
database
PCDL
Export protein IDs to Peptide
Selector for targeted MS/MS
Spectrum Mill
Upload select pathway genes
for custom microarray or NGS
design
eArray
Acknowledgements
Steve Fischer (Metabolomics & Proteomics Marketing Manager)
Norton Kitagawa (ID Browser, LC/MS: MPP software)
D. Benjamin Gordon (GeneSpring-IB architecture/PMDC)
Joe Roark and the MH Qual software engineering team
Christine Miller (Proteomics Applications)
Dave Peterson (GC/MS: MPP software)
Michael Janis, Michael Rosenberg (GeneSpring Technical Marketing)
Jayati Ghosh and Ashutosh (GeneSpring R&D)
Allan Kuchinsky for Cytoscape Plug-in, plus Agilent Labs team
Kyu Rhee (Weill Cornell Medical Center)