Upload
others
View
6
Download
0
Embed Size (px)
Citation preview
Designed Biofluid Mixtures Allow
Feature-Wise Evaluation of Metabolic
Profiling Analytical Platforms
Toby Athersuch
Isaac Newton Institute for Mathematical Sciences
30/08/2011
MRC-HPA Centre for Environment and Health
• NIH definition:– “a characteristic that is objectively measured and evaluated as an indicator of normal biologic processes,
pathogenic processes, or pharmacologic responses to a therapeutic intervention “
Biomarkers Definitions Working Group (2001).
• Where do we find them?– Many sampling source (urine / plasma / tissue / breath / etc.)
– Combinations of measurements
• Considerations for sampling sources:– Urine Time-average systemic excretion
– Blood „Snapshot‟ of system
– Tissue Likely site of injury/concern/action/effect
• Biomarkers can potentially:– Report more specifically on type of exposure
– Give a better idea on the exposure that has actually occurred
– Reflect a variety of stages in the initiation and progression of disease, and efficacy of treatment
3
Biomarkers
• Finding biomarkers– EXPOSURE
• External exposure
• Internal dose
– BIOLOGICALLY EFFECTIVE DOSE
– EFFECT• Biological response / early biological effect
• Altered biological structure or function
• Disease onset
– SUSCEPTIBILITY• Underlying characteristics that facilitate/modulate exposure and response
• Adapted from: Committee on Biological Markers of the National Research Council. Biological markers in
environmental health research. Environ. Health Perspect. 74: 3-9, 1987.
Internal
Dose
Biologically
Effective
Dose
Early
Biological
Effect
Altered
Structure /
Function
Clinical
DiseaseExposure
Susceptibility
Biomarker
Marker progression
Influence of individual
susceptibility on progression
Key
Biomarkers
• Measuring many (10‟s – 1000‟s) of biomolecules at once– Largely untargeted approach
– (Hopefully) good coverage of molecular classes of interest (e.g. metabolites, gene transcripts, etc.)
– Minimise assumptions
– Useful for hypothesis generation – getting a list of potential candidates for further validation
• The „profile‟ may be considered as a whole– Combinations of molecules may respond in concert
– Potentially more sensitivity and specificity
• Approach across „omics‟ sciences have numerous common elements– Differ according to level/type of biological information being investigated and associated methodologies
• Generate a multiparametric dataset
• Apply multivariate statistics to derive putative biomarkers
5
Omics and Global Profiling
Metabonomics
“…the quantitative measurement of the dynamic multiparametric
metabolic response of living systems to pathophysiological stimuli or
genetic modification…” - MetabonomeNicholson JK, Lindon JC, Holmes E. 1999. 'Metabonomics': understanding the metabolic responses of living systems
to pathophysiological stimuli via multivariate statistical analysis of biological NMR spectroscopic data. Xenobiotica.
29:1181-1189.
Metabolomics
“...the complete set of metabolites / low-molecular-weight
intermediates, which are context dependent, varying according to the
physiology, developmental or pathological state of the cell, tissue,
organ or organism…” - MetabolomeFiehn O. 2002. Metabolomics – the link between genotypes and phenotypes. Plant Mol. Biol. 48(1-2) 155-171.
Metabolic Profiling
General term for „omics‟ metabolite measurements MRC-HPA Centre for Environment & Health
Some Definitions
• Profiles can be any description of the small molecule composition of a „biological
sample‟
– For each sample measure as many different molecules as possible
– Measure in as unbiased a manner as possible
– Usually derived using spectroscopy
• Can simultaneously generate (multiple) hypotheses and test them
– No a priori assumptions needed about what is important (untargeted analysis)
– Can be used to focus on specific classes or set of metabolites (targeted)
– Multivariate analysis makes this process efficient for biomarker discovery
– Combinations of biomarkers can be more sensitive and specific than one
• Profiles must be:
– Amenable to quantitative interpretation
– Reproducible (relative to „biological‟ variation)
• Profiles need not be:
– Comprehensive (limited by instrumental considerations)
– Fully resolved and annotatedMRC-HPA Centre for Environment & Health
Metabolic Profiling
PreparationCollection Analysis Data Analysis
Outline of Analysis Strategy
1. Generate profiles that reflect/report on metabolic state of biological system under study
2. Use multivariate analysis / pattern recognition to indicate profile features related to intervention/disease…
3. Identify metabolites responsible for these profile features – putative biomarkers
4. Generate testable hypotheses to validate putative biomarkers
5. Use profiles or features to derive classification models
MRC-HPA Centre for Environment & Health
Mass Spectrometry
NMR Spectroscopy Chromatography
Single quad MS
Triple quad MS
Time-of-flight MS
Ion mobility MS
Ion trap MS
FTMS
Gas chromatography
Capillary electrophoresis
UPLC
HPLC
SPE
1D NMR
Multidimensional NMR
Magic angle spinning NMR
Flow injection NMR
Capillary NMR
Hyphenation
High ThroughputPreparation Robotics
Automation
LIMS
MRC-HPA Centre for Environment & Health
Analytical Platforms for Metabolic Profiling
• Spectra of biofluids (urine, plasma, bile, CSF) are typically very complex, with many
overlapped peaks
• Spectral assignment is complex, but necessary in drug metabolism work and
metabonomics
H ppm
600 MHz 1H NMR spectrum of rat urine
Biofluid NMR for Metabolic Profiling
• 1D 1H NMR data amenable to analysis
– Full resolution• e.g. 32 K points / spectrum
• high computational load
• Potentially affected by minor chemical shift
variation
H ppm
– Integrated regions• Arbitrary / targeted
• Ability to incorporate moderate
chemical shift variation
• Less specific output from data analysis
Region
1 2 3 4
Sampl
e 1
Sampl
e 2
Sampl
e 3
…
Spectral Data for Multivariate Analysis
• 1D 1H NMR data amenable to analysis
– Targeted analysis• Fit model spectra of well-characterised metabolites to target spectrum
• Quantification possible through reference to internal standard
Spectral Data for Multivariate Analysis
Example: „Targeted Profiling‟ spectral fitting using Chenomx NMR Suite:
Spectrum to be fitted
Metabolite 1
Metabolite 2
Sumline
MRC-HPA Centre for Environment & Health
Summary
Designed Mixtures in Biofluid Analysis
Background:
Traditional „spike-in‟ experiments used for the assessment of precision and accuracy of
analyte measurements, are adversely affected by variation in the background matrix
composition. Such approaches my underestimate the risk of matrix effects confounding
accurate measurement of major metabolites
Aim:
To evaluate and illustrate how mixing intact biofluids according to a know, suitable
experimental design can help:
- evaluate individual measurements
- compare performance
- enable optimisation
… of quantitative spectral analysis tools under conditions that better approximate a real
metabonomics experiment.
Summary
From: NIST/SEMATECH e-Handbook of Statistical Methods, http://www.itl.nist.gov/div898/handbook/, Updated 6/2010
Pilot Study – Human Urine
STUDY DESIGN - Simplex Lattice
A {q, m} simplex-lattice design for q components consists of points defined by the following
coordinate settings:
The proportions assumed by each component take the m+1 equally spaced values from 0 -1,
xi = 0, 1/m, 2/m, ... , 1 for i = 1, 2, ... , q
All of the possible combinations of the proportions described are sampled in the
experimental.
STRATEGY
If a metabolite is accurately measured by targeted analysis from this data set, then the
variation in the measurements will fit a linear regression to this experimental design. In this
study we have compared a ‘targeted profiling’ approach using Chenomx NMR Suite with a
‘targeted bucketing’ approach.
Experimental - 1
Pilot Study – Human Urine
Schematic showing the experimental design.
Pooled urines taken from a previously
completed human nutritional intervention were
mixed in known proportions
Human urine
(condition A)
Human urine
(condition B)
Human urine
(control)
-10000
0
10000
-60000 -50000 -40000 -30000 -20000 -10000 0 10000 20000 30000 40000 50000 60000
t[2]
t[1]
TJA_NHLI_GA_110711_DM_SIMCA_v1.M8 (PCA-X)
t[Comp. 1]/t[Comp. 2]
Colored according to Obs ID (Primary)
R2X[1] = 0.830682 R2X[2] = 0.0653338
Ellipse: Hotelling T2 (0.95)
001b
001a
002a
002b
003a003b
004b004a
005b005a
006a
006b
007b
007a
008a008b
009b
009a
010a
010b
SIMCA-P+ 11.5 - 23/08/2011 11:39:16
Results - 1
Pilot Study – Human Urine
Student (1st) attempt at preparing „real‟ biofluid samples
PCA revealed relatively weak concordance with experimental design
Experimental - 2
Pilot Study - Standard MixturesThree different standard compounds were mixed in
known proportions
Glycine Citrate
Hippurate
Results - 2
Internal
Standard
Citrate
Glycine
Pilot Study - Standard MixturesSpectra are sparse
No spectral overlap between compounds
Hippurate
Hippurate
Poor shimming Adequate shimming
Results - 2
-10000
0
10000
-20000 -10000 0 10000 20000
t[2]
t[1]
TJA_STD_GA_110810_DM_SIMCA.M8 (PCA-X)
t[Comp. 1]/t[Comp. 2]
Colored according to Obs ID (Primary)
R2X[1] = 0.574136 R2X[2] = 0.303233
Ellipse: Hotelling T2 (0.95)
001c001a
001b
002b
002a002c
003c003b003a
004a004b004c
005b005c005a
006b006a
007a007b
007c
008a008c008b
009c
009b009a
010a010b010c
SIMCA-P+ 11.5 - 23/08/2011 16:20:16
-10000
0
10000
-20000 -10000 0 10000 20000
t[2]
t[1]
TJA_STD_GA_110808_DM_SIMCA.M6 (PCA-X)
t[Comp. 1]/t[Comp. 2]
Colored according to Obs ID (Primary)
R2X[1] = 0.534172 R2X[2] = 0.295828
Ellipse: Hotelling T2 (0.95)
001c001a001b
002b002a002c
003c003b
003a
004a004b004c
005b005c
005a
006b006a
006c
007a007b007c
008a008c008b
009c009b009a
010a010b010c
SIMCA-P+ 11.5 - 23/08/2011 16:18:23
Results - 2
-14000
-12000
-10000
-8000
-6000
-4000
-2000
0
2000
4000
6000
8000
10000
12000
14000
-10000 0 10000
t[2]
t[1]
TJA_STD_GA_ALL_DM_SIMCA.M7 (PCA-X)
t[Comp. 1]/t[Comp. 2]
Colored according to Obs ID (Primary)
R2X[1] = 0.525247 R2X[2] = 0.295904
Ellipse: Hotelling T2 (0.95)
0
1
001c
005b
008a
005c
002b
010a
003c
006b
009c
004a
009b
007a
010b
002a
003b
007b
002c
004b
007c
003a
001a
010c
005a
001b
006a
009a
008c
004c
008b
006c
1001c
1005b
1008a
1005c
1002b
1010a
1003c
1006b
1009c
1004a
1009b
1007a
1010b
1002a
1003b
1007b
1002c
1004b
1007c
1003a
1001a
1010c
1005a
1001b
1006a
1009a
1008c
1004c
1008b
SIMCA-P+ 11.5 - 23/08/2011 16:41:22
Pilot Study – Standard Mixtures
Comparison of all samples by PCA revealed minimal contribution from shimming artifact
Explained by this being the result of a similar change to all measured signals.
Poor shimming
Adequate shimming
Experimental - 3
Study - Different Urines
Schematic showing the experimental design.
Three different urines with substantially different
metabolic profiles were mixed in known proportions
Rat urine
(day)
Rat urine
(night)
Human urine
Experimental - 3
EXPERIMENTAL METHODS
NMR Acquisition
Quantitative 1D 1H NMR spectra were obtained
600 MHz
NOESYPR1D pulse sequence
Sum of 128 free
Spectral width of 20 ppm.
Analytical triplicates were used for each biofluid sample
NMR Data Processing
Line broadening of 1 Hz prior to Fourier transformation.
Phase and baseline distortions
Internally to the TSP-d4 resonance relative to DSS (1H 0.00).
Targeted Profiling
Quantification of individual urinary metabolites
Chenomx NMR Suite 4.6 (Chenomx Inc., Edmonton, Canada).
Reference spectra from the Chenomx 600 MHz library
Relative concentrations of each metabolite present determined.
Targeted Bucketing
Spectral regions were defined for each metabolite of interest
The integral area of these regions was calculated
Pattern Recognition
Partial least squares regression
- Metabolite concentration data (X)
- Experimental design matrix (Y)
Goodness-of-fit (R2) and goodness-of-prediction (Q2) estimates
Targeted Fitting
Targeted Integration
Mixture
Proportions
Design Mixtures
Spectra
PLS
Results - 3
PLS
Com
pone
nt
2
PLS Component 1
12
4
5
3
7
6
98
10
SampleStudy – Different Urine
PLS scores for “Targeted Profiling”
method.
Show a relatively tight fit to the
experimental design
Samples clustered in replicates
Several metabolites that were
affected by overlap or could not be
accurately quantified identified
R2 < 0.6 = 5 of 54
R2 < 0.8 = 14 of 54
MRC-HPA Centre for Environment & Health
1,3
-Dim
eth
ylu
rate
1,6
-Anhydro
-ß-D
-glu
cose
N,N
-Dim
eth
ylfo
rma
mid
e
Trim
eth
yla
min
e N
-oxid
e-0.2
-0.1
-0.0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1.01
-Me
thyln
ico
tinam
ide
2-H
yd
roxyis
obuty
rate
2-O
xo
glu
tara
te2
-Oxo
isocapro
ate
3-H
yd
roxyis
ovale
rate
3-I
ndoxyls
ulfate
Ace
tate
Ala
nin
eA
llan
toin
Arg
inin
e
Be
tain
eC
ho
line
cis
-Aco
nitate
Citra
teC
rea
tinin
eD
ime
thyla
min
e
Eth
anol
Eth
anola
min
eF
orm
ate
Fu
co
se
Fu
ma
rate
Glu
cose
Gly
cin
eG
lyco
late
Gu
anid
oaceta
te
Hip
pura
teIs
ole
ucin
eL
acta
te
Le
ucin
eM
alo
nate
Me
thanol
Me
thylm
alo
nate
Me
thyls
uccin
ate
N,N
-Dim
eth
ylg
lycin
eN
-Ace
tylg
lycin
eO
-Ph
osphocholin
e
Oxa
laceta
teP
henyla
cety
lgly
cin
eP
yru
va
te
Su
ccin
ate
Ta
uri
ne
Th
reonin
e
tra
ns-A
conitate
Trig
on
elli
ne
Try
pto
phan
Tyro
sin
eU
racil
Ure
aV
alin
eX
ylo
se
PLS model goodness-of-fit (R2 ) and goodness-of-prediction (Q2 ) metrics for individual metabolite concentrations.
Results - 3
Results - 3
Study – Different Urines
Gross urine concentration varies considerable between samples as a consequence of highly
variable dilution factor.
To remove/reduce this effect from the spectra being models, probabilistic quotient
normalisation (PQN, Dieterle et al. 2006) was applied.
The normalised spectra were regressed against the experimental design.
Remaining R2 values should therefore better reflect how well the metabolite is measured
across all samples, without the influence of overall urine dilution.
Results - 3
-0.2
-0.1
-0.0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1.0
1,3
-Dim
eth
ylu
rate
1-M
eth
yln
icotinam
ide
2-H
ydro
xyis
ob
uty
rate
2-O
xoglu
tara
te
2-O
xois
ocapro
ate
3-H
ydro
xyis
ova
lera
te
3-I
ndoxyls
ulfate
Aceta
te
Ala
nin
e
Alla
nto
in
Arg
inin
e
Beta
ine
Cholin
e
cis
-Aconitate
Citra
te
Cre
atinin
e
Dim
eth
yla
min
e
Eth
anol
Eth
anola
min
e
Form
ate
Fucose
Fum
ara
te
Glu
cose
Gly
cin
e
Gly
co
late
Guanid
oaceta
te
Hip
pura
te
Isole
ucin
e
Lacta
te
Leucin
e
Malo
nate
Meth
anol
Meth
ylm
alo
nate
Meth
yls
uccin
ate
N,N
-Dim
eth
ylform
am
ide
N,N
-Dim
eth
ylg
lycin
e
N-A
cety
lgly
cin
e
O-P
hosphocholin
e
Oxala
ceta
te
Phenyla
cety
lgly
cin
e
Pyru
vate
Succin
ate
Taurine
Thre
onin
e
trans-A
conitate
Trigonelli
ne
Trim
eth
yla
min
e N
-oxid
e
Try
pto
phan
Tyro
sin
e
Ura
cil
Ure
a
Valin
e
Xylo
se
1,6
-Anhydro
-ß-D
-glu
cose
PLS model goodness-of-fit (R2 ) and goodness-of-prediction (Q2 ) metrics for individual metabolite concentrations.
Targeted Profiling
Results - 3
-0.1
0.0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1,3
-Dim
eth
ylu
rate
1,6
-Anhydro
-ß-D
-glu
cose
1-M
eth
yln
icotinam
ide
2-H
ydro
xyis
ob
uty
rate
2-O
xoglu
tara
te
2-O
xois
ocapro
ate
3-H
ydro
xyis
ova
lera
te
3-I
ndoxyls
ulfate
Aceta
te
Ala
nin
e
Alla
nto
in
Arg
inin
e
Beta
ine
Cholin
e
cis
-Aconitate
Citra
te
Cre
atinin
e
Dim
eth
yla
min
e
Eth
anol
Eth
anola
min
e
Form
ate
Fucose
Fum
ara
te
Glu
cose
Gly
cin
e
Gly
cola
te
Guanid
oaceta
te
Hip
pura
te
Isole
ucin
e
Lacta
te
Leucin
e
Malo
nate
Meth
anol
Meth
ylm
alo
nate
Meth
yls
uccin
ate
N,N
-Dim
eth
ylform
am
ide
N,N
-Dim
eth
ylg
lycin
e
N-A
cety
lgly
cin
e
O-P
hosphocholin
e
Oxala
ceta
te
Phenyla
cety
lgly
cin
e
Pyru
vate
Succin
ate
Taurine
Thre
onin
e
trans-A
conitate
Trigonelli
ne
Trim
eth
yla
min
e N
-oxid
e
Try
pto
phan
Tyro
sin
e
Ura
cil
Ure
a
Valin
e
Xylo
se
PLS model goodness-of-fit (R2 ) and goodness-of-prediction (Q2 ) metrics for individual metabolite concentrations.
Targeted Integration
Results - 3
Study – Different Urines
Consequences of Normalisation
For the analysis of the „Targeted Profiling‟ approach, variation in most metabolites still
strongly associated with the differences between the original samples,
A greater number of metabolites with lower performance were identified:
R2 < 0.6 = 16 of 54
R2 < 0.8 = 31 of 54
Comparison of Approaches
An equivalent analysis using a „Targeted Integration‟ approach to represent the same
metabolites indicated that such a lower distribution of
R2 < 0.6 = 19 of 54
R2 < 0.8 = 38 of 54
MRC-HPA Centre for Environment & Health
Discussion
Study – Different Urines
By being able to predict the relationship between measurements on different samples, each
with a different background, we were able to estimate the reliability of quantification –
based on explained variance (R2) - on a metabolite-by-metabolite basis, and to refine the
set of metabolites that we could safely interpret from the urinary data.
There was broad agreement between „Targeted Profiling‟ and „Targeted Integration‟
approaches.
Targeted Profiling has an advantage in quantifying metabolites with peaks in overlapped or
baseline-dominated spectral regions.
Targeted Integration has the advantage of being very rapid, and requiring minimal
repetitive, time-consuming spectral fits to be made (a process that has not been entirely
automated yet).
Using designed biofluid mixtures provides a rational basis for exploiting information from
several samples whereas spectral deconvolution is typically applied to one spectrum at a
time. Thus, the strategy may have general benefits for quantitative analysis in biofluid
metabolic profiling.
MRC-HPA Centre for Environment & Health
References
References
1. Weljie AM, Newton J, Mercier P, Carlson E, Slupsky C M: Targeted profiling: quantitative
analysis of 1H NMR metabolomics data. Anal. Chem. 2006. 78, 4430-4442.
2. Eriksson L, Johansson E, Kettaneh-Wold N, Wold S. Multi- and megavariate data analysis.
Principles and applications. Umetrics AB, Umea, Sweden. 2001.
3. Dieterle F, Ross A, Schlotterbeck G, Senn H: Probabilistic quotient normalization as robust
method to account for dilution of complex biological mixtures. Application in 1H NMR
metabonomics. Anal. Chem. 2006. 78, 4281-4290.
Evaluate Simple Mixtures
No matrix
Spiked into matrix
Variety of Matrices
Implementation in Future Studies
Start routinely adding in mixtures of biofluids incorporating known experimental groups
(e.g. control vs treatment A vs treatment B)
Investigate Higher-Dimensionality Mixtures
Investigate which experimental designs fit best within a routine profiling context
Suggestions welcome!
MRC-HPA Centre for Environment & Health
Future Work
• Metabolic profiles are information-rich phenotypes that can be characterised by
spectroscopic platforms such as NMR
• Such platforms provide information on an array of metabolites simultaneously.
• Multivariate analysis of the spectral sets can help derive biomarkers in a variety of
contexts (e.g. toxicology, efficacy, disease progression, diet, etc).
• Measurements are often made on complex biofluids with varying background
interference from the underlying matrix. Characterising how this can influence the
measurements across a sample set can help inform the analyst of their reliability.
• Using a simplex lattice design, we evaluated the performance of NMR spectroscopy
to characterise >50 common metabolites using two alternative methods for metabolite
quantification.
• We conclude that inclusion of designed mixtures in routine metabolic profiling work
may improve our understanding of which putative markers are reliable.
MRC-HPA Centre for Environment & Health
Summary
Biomolecular Medicine• Prof Jeremy Nicholson
• Prof Elaine Holmes
• Prof John Lindon
• Dr Hector Keun
• Dr Tim Ebbels
• Dr Muireann Coen
• Dr Jake Bundy
• Dr James Ellis
• Dr Rachel Cavill
• Dr Olaf Beckonert
• Dr Jake Pearce
• Dr Anas Kamleh
Epidemiology & Biostatistics• Prof Paul Elliott
• Prof Paolo Vineis
• Dr Anna Hansell
• Dr Claus Svendsen
• Dr David Spurgeon
• Dr Marta Blangiardo
• Dr John Gulliver
IARC• Prof Chris Wild
• Dr Mazda Jenab
• Prof Pierre Hainaut
U. Maastricht• Dr Theo de Kok
U. Utrecht• Dr Roel Vermeulen
NHRF• Soterios Kyrtopoulos
SeRMN• Dr Teodor Parella
• Dr Miriam Perez-Trujillo
• Mr Sergi Gil
Acknowledgements and Collaborators
Chenomx• Aalim Weljie
• Jack Newton
• Shahid Malik
http://www.environment-health.ac.uk
Funding and Support