Neuroimaging meta-analysis: Pitfalls and emerging solutions
Tor D. Wager
Department of Psychology and Neuroscience and The Institute for Cognitive Science
The University of Colorado, Boulder
http://wagerlab.colorado.edu
PET FMRI
Wager, Lindquist, & Hernandez, 2007
• Consistency
• Sensitivity
- Borrowing strength from prior studies to both increase effect sizes and provide unbiased measures of them
• Specificity
- Across a variety of candidate psychological conditions
Meta-analysis: Roadmap
1. Why meta-analysis: The promise
2. What do you want to know? Making the right inference
3. Pitfalls and solutions: Inferences on consistent activation
4. Expanding the toolbox: New kinds of inferences
5. The Limits of Meta-Analysis: The proof is in the pudding
Inference
• Most common type of inference: Activation consistency
• Is there consistent activation across studies during Task x?
• Are there any overlapping brain regions activated by Tasks x and y?
• Are there significant differences in activation consistency for Tasks x vs. y?
• What would you like to conclude from your meta-analysis?
[Figure panels: Working memory, Executive WM, Long-term memory, Inhibition, Task switching. Wager et al., in press; Van Snellenberg & Wager, 2010]
• Most common type of inference: Activation consistency
• This type of inference does not tell us many things we want to
know!
• What would you like to conclude from your meta-analysis?
• Other types of inference
• Spatial inference:
- Where is the epicenter of activation for Task x?
- Do Tasks x and y activate the same locations or spatial patterns?
- This is DIFFERENT from above!!
• Decoding/‘reverse inference’:
- What psychological process is implied by activation in region/pattern A?
Pitfall #1: Failure to assess generalization across studies
Desired inference:
• Is there consistent activation across studies during Task x?
Problems:
• We typically analyze peak coordinates
• Grouping peaks together only allows us to make inferences about new peaks, not new studies
• Studies report different numbers of peaks, some more, some fewer
• Results can be dominated by one or a few studies!
Meta-analysis v1.0: Does not generalize across studies
1. Peak coordinates, combined across studies
2. Kernel convolution (density kernel)
3. Apply significance threshold
4. Significant results
Density kernel: Chein, 1998; Phan et al., 2002; Wager et al., 2007; Lindquist et al. 2012
Gaussian density kernel + ALE: Turkeltaub et al., 2002; Laird et al., 2005; others
Ignores the fact that some studies report more peaks than
others!
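The problem with pooling peaks can be seen in a toy example. The sketch below assumes a one-dimensional "brain" and made-up peak lists (the study names, positions, and radius are all hypothetical); it counts peaks within a fixed radius of each voxel, ignoring which study each peak came from.

```python
# Hypothetical data: peak positions (mm along one axis) for three studies.
peaks_by_study = {
    "study_A": [10.0],
    "study_B": [12.0],
    "study_C": [30.0, 31.0, 32.0, 33.0, 34.0],  # reports many nearby peaks
}
RADIUS_MM = 5.0                           # assumed kernel radius
GRID = [float(x) for x in range(0, 51)]   # 1-D "brain" with 1 mm voxels


def peak_density(voxel, peaks, radius=RADIUS_MM):
    """Count peaks within `radius` of `voxel` (a uniform density kernel)."""
    return sum(1 for p in peaks if abs(p - voxel) <= radius)


# v1.0 approach: pool all peaks and ignore which study they came from.
pooled = [p for peaks in peaks_by_study.values() for p in peaks]
density = [peak_density(v, pooled) for v in GRID]

# Find the voxel with the highest density and see which studies drive it.
best_voxel = max(GRID, key=lambda v: peak_density(v, pooled))
studies_contributing = [
    name for name, peaks in peaks_by_study.items()
    if peak_density(best_voxel, peaks) > 0
]
print(best_voxel, max(density), studies_contributing)
```

In this toy case the densest location is produced entirely by the one study that reported many peaks, while the location where two independent studies agree gets a lower score.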
Meta-analysis 2.0: Inference across studies
Multilevel kernel density analysis
- Spherical convolution: Interpretable metric (contrast counts)
- Weighting by sample size
- Weighting by fixed/random effects (and other quality metrics)
Wager, Lindquist, & Kaplan 2007; Eickhoff et al. 2010
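A minimal sketch of the study-level idea, not the MKDA toolbox itself: each contrast contributes a 0/1 indicator (does it report any peak within the sphere?), and indicators are averaged with weights based on sample size and analysis type. The data, the one-dimensional geometry, and the 0.75 discount for fixed-effects studies are assumptions made for illustration.

```python
import math

# Hypothetical contrasts: peaks (mm along one axis), sample size, analysis type.
contrasts = [
    {"peaks": [10.0], "n": 20, "random_effects": True},
    {"peaks": [12.0], "n": 16, "random_effects": True},
    {"peaks": [30.0, 31.0, 32.0, 33.0, 34.0], "n": 9, "random_effects": False},
]
RADIUS_MM = 5.0                 # assumed sphere radius
FIXED_EFFECTS_DISCOUNT = 0.75   # assumed value, used only for this example


def indicator(voxel, peaks, radius=RADIUS_MM):
    """1 if the contrast reports any peak within `radius` of `voxel`, else 0."""
    return int(any(abs(p - voxel) <= radius for p in peaks))


def weight(contrast):
    """Square root of sample size, discounted for fixed-effects analyses."""
    quality = 1.0 if contrast["random_effects"] else FIXED_EFFECTS_DISCOUNT
    return math.sqrt(contrast["n"]) * quality


def weighted_proportion(voxel):
    """Weighted proportion of contrasts that activate near `voxel`."""
    total = sum(weight(c) for c in contrasts)
    active = sum(weight(c) * indicator(voxel, c["peaks"]) for c in contrasts)
    return active / total


print(round(weighted_proportion(11.0), 3))  # two contrasts agree here
print(round(weighted_proportion(32.0), 3))  # one contrast with many peaks
```

Because each contrast counts at most once per voxel, reporting five peaks in one place no longer outweighs two independent contrasts that agree.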
Pitfall #2: Improper accounting for biases
Desired inference:
• Is there consistent activation across studies during Task x?
Problems:
• Studies vary dramatically in sample sizes
• Some procedures are more valid than others (e.g., fixed vs. random effects)
• Some studies simply report more peaks (software differences; this is arbitrary!!)
Solutions?
• No correction is inefficient (low power) – want large studies to dominate
• Weighting by square root of sample size is good – proportional to the inverse of the standard error
• Weighting by random vs. fixed effects is good (random = more weight, but no theoretically optimal weight)
• Weighting by Z-scores is probably a bad idea.
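One way to see why the square root of sample size is a natural weight, assuming (as an idealization not stated on the slide) that every study estimates the same effect with a common standard deviation σ, where N_c is the sample size of contrast c and w_c its weight:

```latex
\mathrm{SE}_c = \frac{\sigma}{\sqrt{N_c}}
\quad\Longrightarrow\quad
w_c \;\propto\; \frac{1}{\mathrm{SE}_c} \;\propto\; \sqrt{N_c},
\qquad
\text{e.g. } N_1 = 9,\; N_2 = 36 \;\Rightarrow\; w_1 : w_2 = 3 : 6 = 1 : 2 .
```

Under this assumption a study with four times the sample counts twice as much, not four times as much.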
Modified Galbraith plots: Activation in likely ‘true positive’ regions (working memory)
Wager et al. 2009, Neuroimage
• Z-scores are (much) higher for fixed effects!
• Very weak relationship between sample size and Z-score, esp. for random effects studies
• Small studies have HIGHER variance, which means MORE VARIANCE in Z-scores across the brain, which means MORE PEAKS overall and higher Z-scores in false positive regions!
• Weighting by Z-score or number of peaks is probably a bad idea.
Meta-analysis 2.0: Inference across studies
Multilevel kernel density analysis
- Spherical convolution: Interpretable metric (contrast counts)
- Weighting by sample size
- Weighting by fixed/random effects (and possibly other quality metrics)
- If we permute peaks within studies, those that report many coordinates will dominate
- Loss of efficiency, small-sample-size studies dominate
- So: We do ‘blob’-level permutation
Wager, Lindquist, & Kaplan 2007; Eickhoff et al. 2010
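The thresholding step can be sketched with a Monte Carlo null. This is a deliberate simplification rather than the published procedure: instead of relocating contiguous blobs inside a brain mask, it moves each contrast's peaks together as one rigid unit to a random position on a one-dimensional ring. All data and parameters are hypothetical.

```python
import math
import random

GRID_SIZE = 60   # 1-D ring of 60 voxels (hypothetical)
RADIUS = 3       # kernel radius in voxels
N_ITER = 500     # number of Monte Carlo iterations

# Hypothetical contrasts: (peak voxel indices, sample size).
contrasts = [
    ([20, 21], 16),
    ([22], 25),
    ([19, 23], 12),
    ([21], 20),
    ([45, 46, 47, 48, 49, 50], 9),
]


def proportion_map(peak_sets):
    """Weighted proportion of contrasts with a peak within RADIUS of each voxel."""
    weights = [math.sqrt(n) for _, n in contrasts]
    total = sum(weights)
    out = []
    for v in range(GRID_SIZE):
        active = sum(
            w for w, peaks in zip(weights, peak_sets)
            if any(min(abs(p - v), GRID_SIZE - abs(p - v)) <= RADIUS for p in peaks)
        )
        out.append(active / total)
    return out


def null_max(rng):
    """Relocate each contrast's peaks as one unit; return the map maximum."""
    shifted = []
    for peaks, _ in contrasts:
        offset = rng.randrange(GRID_SIZE)
        shifted.append([(p + offset) % GRID_SIZE for p in peaks])
    return max(proportion_map(shifted))


rng = random.Random(0)
observed = proportion_map([peaks for peaks, _ in contrasts])
null = sorted(null_max(rng) for _ in range(N_ITER))
threshold = null[int(0.95 * N_ITER)]   # 95th percentile of the null maxima
significant = [v for v, p in enumerate(observed) if p > threshold]
print(threshold, significant)
```

Keeping each contrast's peaks together preserves how many coordinates it reported and how they cluster, so a contrast with many peaks cannot inflate the null or the observed map on its own.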
Pitfall #3: Erroneous spatial inference
- Where is the epicenter of activation for Task x?
- Do Tasks x and y activate the same locations or spatial patterns?
• How much overlap is enough?
• We can focus on either the commonalities or the distinctions –
not meaningful because there is no formal null hypothesis test
here.
• Looking at amount of overlap in thresholded maps will not cut
it!!!
Overlapping fMRI activity in 4 types of attention shifting
(N = 40; p < .05 corrected) Wager et al. 2005, CABN
Spatial inference
- Where is the epicenter of activation for Task x?
- Do Tasks x and y activate the same locations or spatial patterns?
• Peak coordinates are distributed in space
• Need spatial confidence intervals on where
Spatial confidence regions for approach (green) vs. avoidance (red)
in emotion studies Wager et al. 2003, Neuroimage
Spatial inference: Spatial tests
- Where is the epicenter of activation for Task x?
- Do Tasks x and y activate the same locations or spatial patterns?
Two approaches:
1. Spatial MANOVA test/discriminant analysis on coordinates
2. Spatial models, inference on distribution of coordinates
(Always need to consider the right unit of analysis: study-level, not peak-level!)
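To illustrate the unit-of-analysis point, here is a sketch of a study-level spatial test. It is a label-permutation test on the distance between group centroids, used here as a simple stand-in for the MANOVA approach named above, not an implementation of it. The coordinates are invented.

```python
import math
import random

# Hypothetical peak coordinates (x, y, z in mm), grouped by study, for two tasks.
task_x = [
    [(-40, 20, 30), (-42, 22, 28)],
    [(-38, 18, 32)],
    [(-44, 24, 26), (-41, 19, 31), (-39, 21, 29)],
    [(-43, 23, 27)],
]
task_y = [
    [(-30, 10, 40)],
    [(-28, 12, 42), (-32, 8, 38)],
    [(-31, 11, 41)],
    [(-29, 9, 39), (-27, 13, 43)],
]


def centroid(points):
    """Mean coordinate of a list of 3-D points."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(3))


def group_distance(studies_a, studies_b):
    """Distance between group centroids, using one centroid per study."""
    ca = centroid([centroid(s) for s in studies_a])
    cb = centroid([centroid(s) for s in studies_b])
    return math.dist(ca, cb)


def permutation_p(studies_a, studies_b, n_perm=2000, seed=0):
    """Shuffle task labels across studies (not peaks) to build the null."""
    rng = random.Random(seed)
    observed = group_distance(studies_a, studies_b)
    pooled = studies_a + studies_b
    count = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        a, b = pooled[:len(studies_a)], pooled[len(studies_a):]
        # Small tolerance guards against floating-point reordering effects.
        if group_distance(a, b) >= observed - 1e-9:
            count += 1
    return observed, (count + 1) / (n_perm + 1)


observed, p_value = permutation_p(task_x, task_y)
print(round(observed, 1), round(p_value, 3))
```

Each study contributes one centroid regardless of how many peaks it reported, and labels are shuffled across whole studies, so the test generalizes to new studies rather than new peaks.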
[Figure: Overlapping fMRI activity. Location switch (blue), Attribute switch (yellow), Rule switch (cyan). Wager et al. 2004, Neuroimage]
Also: 3-D Kolmogorov-Smirnov test Murphy, Nimmo-Smith, &
Lawrence ( )
Spatial inference: Spatial models
- Where is the epicenter of activation for Task x?
- Do Tasks x and y activate the same locations or spatial patterns?
Kang et al. 2011: Bayesian generative model
• Peaks drawn from ‘activation centers’ for studies
• Activation centers across studies drawn from ‘population centers’
• Explicit spatial modeling, population inference
Kang et al. 2011, JASA
• “Reverse inference”
• Formal assessment of the probability of a task type T given a set of activation data (e.g., on/off activation values in a set of voxels)
• Can provide estimates of sensitivity and specificity for T given:
• (1) A defined set of k tasks
• (2) Activation data for j voxels, Aj = 1 or 0
• (3) A classification model (e.g., Naïve Bayes, Yarkoni et al., 2011; Bayesian Spatial Point Process Model, Kang et al. 2011, JASA; 2012)
Reverse Inference
Kross et al. 2011, PNAS
N = 40 – 180 studies per task
P(Pain | SII) = 0.87
P(Task = Pain) given (SII Activation = Yes) = 0.87
Positive predictive value of SII activation
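A positive predictive value of this kind follows from Bayes' rule. The sketch below uses invented activation rates and priors (none of the numbers come from Kross et al.) to show how P(task | activation) depends on both the per-task activation rates and the assumed base rates of the tasks.

```python
def posterior_given_activation(p_act_given_task, priors):
    """Bayes' rule: P(task | activation in the region) for every task."""
    joint = {t: p_act_given_task[t] * priors[t] for t in priors}
    evidence = sum(joint.values())   # P(activation), summed over tasks
    return {t: joint[t] / evidence for t in joint}


# Hypothetical activation rates for one region under three task types.
p_act = {"pain": 0.60, "working_memory": 0.05, "emotion": 0.15}

# Assumption 1: all tasks equally likely a priori.
flat_priors = {t: 1 / 3 for t in p_act}
post_flat = posterior_given_activation(p_act, flat_priors)

# Assumption 2: pain studies are rare relative to the others.
rare_pain = {"pain": 0.10, "working_memory": 0.45, "emotion": 0.45}
post_rare = posterior_given_activation(p_act, rare_pain)

print(round(post_flat["pain"], 2), round(post_rare["pain"], 2))
```

The same activation rates give a posterior of 0.75 under flat priors and 0.40 when pain is assumed to be rare, which is why the set of tasks compared and their base rates matter when interpreting a reverse inference.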
http://wagerlab.colorado.edu/tools
Multidimensional scaling Graphs
WIKI: http://wagerlab.colorado.edu/wiki/fmri_tools_documentation
Naïve Bayes classifier for meta-analysis: meta_NBC.m in MKDA tools
Neurosynth.org
• “Reverse inference”:
• Requires a classification model (e.g., Naïve Bayes, Yarkoni et al., 2011; Bayesian Spatial Point Process Model, Kang et al. 2011, JASA; 2012)
• Neurosynth:
• One-way chi-square test comparing P(A) vs. P(not A) given Task k to P(A) vs. P(not A) overall
• Interpretation: “Activation (A) in this voxel is more likely given Task = k,” “Task is more likely to be k than the average task given A”
• Details: m-estimator to smooth voxels with few activations [esp.] towards base rate of 0.5 (e.g., Mitchell 1996; Yarkoni et al. 2011)
• $p(A_j = 1 \mid T_k = 1) = \dfrac{\sum_i A_{ij} T_{ik} + m p}{\sum_i T_{ik} + m}$
• A reflects activation, T reflects the term (e.g., “person”), j indexes voxel, k indexes task, and i indexes study; m is the smoothing weight and p = 0.5 is the base rate.
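The two calculations above can be sketched in a few lines. This is an illustration on invented data, not Neurosynth's code; the function names are hypothetical, and m = 2 is an assumed smoothing weight (p = 0.5 follows the slide).

```python
import math


def smoothed_p_activation(A, T, j, k, m=2.0, p=0.5):
    """m-estimate of p(A_j = 1 | T_k = 1).

    A[i][j] is 1 if study i activates voxel j; T[i][k] is 1 if study i is
    tagged with term/task k. m = 2 is an assumed value; p is the base rate.
    """
    hits = sum(A[i][j] * T[i][k] for i in range(len(A)))
    n_term = sum(T[i][k] for i in range(len(T)))
    return (hits + m * p) / (n_term + m)


def chi_square_vs_base_rate(A, T, j, k):
    """One-way chi-square (1 df): activation counts among term-k studies vs.
    counts expected from voxel j's overall activation rate.
    Assumes the overall rate is strictly between 0 and 1."""
    n_all = len(A)
    base_rate = sum(A[i][j] for i in range(n_all)) / n_all
    n_term = sum(T[i][k] for i in range(n_all))
    active = sum(A[i][j] * T[i][k] for i in range(n_all))
    observed = [active, n_term - active]
    expected = [n_term * base_rate, n_term * (1 - base_rate)]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    p_value = math.erfc(math.sqrt(chi2 / 2))  # chi-square survival function, 1 df
    return chi2, p_value


# Hypothetical data: 10 studies, one voxel (j = 0), one term (k = 0).
A = [[1], [1], [1], [0], [1], [0], [0], [0], [0], [0]]
T = [[1], [1], [1], [1], [0], [0], [0], [0], [0], [0]]

p_smooth = smoothed_p_activation(A, T, j=0, k=0)
chi2, p_value = chi_square_vs_base_rate(A, T, j=0, k=0)
print(round(p_smooth, 3), round(chi2, 3), round(p_value, 3))
```

With only four tagged studies, the raw rate of 3/4 is pulled toward 0.5 (to 4/6), which is the intended effect for voxels and terms with few observations.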
Neurosynth.org “reverse inference”
Activation coordinates from ~10,000 studies
Top hits for this pattern:
Noxious, heat, somatosensory, painful, sensation, stimulation,
muscle, temperature
Romantic rejection
Kross et al. 2011, PNAS
Pitfall #4: Not remembering the shortcomings of what we’re
doing
Bayesian Spatial Point Process model: New opportunities
• Model joint likelihood of the set of reported peak activation points conditional on emotion category
Three levels in generative model (Kang et al. 2011, JASA):
- Level 3: Population centers conditioned on emotion category
- Level 2: Study-level activation centers, distributed around population centers with Gaussian covariance
- Level 1: Observed data are peaks within studies, distributed around study-level centers
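The three levels can be read as a recipe for simulating data. The sketch below is a forward simulation under simplifying assumptions of my own (isotropic Gaussian spread at both levels, a fixed number of peaks per center, a fixed probability that a study expresses a given center, and invented coordinates). It is not the estimation procedure of Kang et al.

```python
import random

# All numbers below are hypothetical and chosen only for illustration.
POPULATION_CENTERS = {                       # level 3, one list per category
    "fear": [(-22.0, -4.0, -18.0), (40.0, 20.0, 0.0)],
    "happy": [(0.0, 46.0, -8.0)],
}
STUDY_SD_MM = 6.0   # spread of study-level centers around a population center
PEAK_SD_MM = 3.0    # spread of reported peaks around a study-level center


def jitter(center, sd, rng):
    """Draw a point from an isotropic Gaussian around `center`."""
    return tuple(rng.gauss(c, sd) for c in center)


def simulate_study(category, rng, p_express=0.8, peaks_per_center=2):
    """Generate one study's reported peaks from the three-level hierarchy."""
    peaks = []
    for pop_center in POPULATION_CENTERS[category]:
        if rng.random() > p_express:          # this study misses the center
            continue
        study_center = jitter(pop_center, STUDY_SD_MM, rng)       # level 2
        peaks.extend(jitter(study_center, PEAK_SD_MM, rng)        # level 1
                     for _ in range(peaks_per_center))
    return peaks


rng = random.Random(1)
studies = [simulate_study("fear", rng) for _ in range(5)]
for i, peaks in enumerate(studies):
    print(i, [tuple(round(c) for c in p) for p in peaks])
```

Fitting the model runs this logic in reverse: given only the reported peaks, it infers where the study-level and population-level centers are likely to be for each category.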
[Figure panels: Model of study (level 2) and population (level 3) centers; Classification model]
The model
- Estimation:
  - Spatial birth and death function for population centers
  - Markov Chain Monte Carlo (MCMC) estimation of posterior distribution
- Classification model:
  - Based on posterior probability of each emotion label given
Result: A single, generative model for activation/co-activation in each emotion category
Classification based on brain pattern
[Figure: classification results; axis label: Actual]
A rich model: Multiple features
[Figure panels: Anger, Disgust, Fear, Happy, Sad]
• Model can be ‘interrogated’ flexibly
• Draw samples from the posterior, examine co-activation and other properties (e.g., graph theoretic measures)
Marginal intensity functions (maps) are only one aspect…
Emotions are distinguishable based on cortical activity profiles
[Figure labels: Motor, Parietal, Cortex, Basal ganglia, Cerebellum]
‘Constellations’ for each emotion type
[Figure panels: Anger, Disgust, Fear, Happy, Sad]
• Links between mind and brain
• Better a priori models of brain function
• Maps that are diagnostic of task type
• Better classification/decoding in new studies
• Clinical predictions
Meta-analytic maps
Classification of mental states using brain maps
Group 1 (N = 79): Strong vs. mild pain
Group 2 (N = 94): Working memory vs. rest
Group 3 (N = 108): Negative vs. neutral pictures
[Figure labels: A, P, N, T]
Yarkoni et al. 2011
Decoding of individual subjects
Summary: How meta-analysis can be useful
• Region or pattern of interest
- Replicability
- Sensitivity
- Specificity
• Pitfalls
- Unit of analysis: Improper generalization
- Problematic weighting/input (Z-
• New models: Value of