Bayesian inference in EEG/MEG©rémies_Mattout... · Bayesian inference in EEG/MEG Workshop on Bayesian approaches in neuroscience –Karolinska Institute –2nd of Frebruary 2016

Bayesian inference in EEG/MEG

Workshop on Bayesian approaches in neuroscience – Karolinska Institute – 2nd of Frebruary 2016

Jérémie MattoutLyon Neuroscience Research Center

“Will it ever happen that mathematicians will know enough about the physiology of the brain, andneurophysiologists enough of mathematical discovery, for efficient cooperation to be possible”

Jacques Hadamard (mathematician, 1865-1963)

Previously…“What I cannot create, I cannot understand”

Richard Feynman (Nobel laureate in Physics, 1918-1988)

• Bayesian or Probabilistic modelling: accounts for knowledge/uncertainty in the process of interest

• Generative or phenomenological models

𝒙 = 𝒇 𝒙, 𝜽𝒆𝒗𝒐𝒍, 𝒖

𝒚 = 𝒈 𝒙, 𝜽𝒐𝒃𝒔 + 𝜺

Hidden states Unknown parameters

Observations(data features)

Inputs

Ex. Dynamic causal models (DCM)

evolution

observation Fixed

Variable

Data

Y

𝜺𝜽𝒐𝒃𝒔

𝒙

𝜽𝒆𝒗𝒐𝒍

f

g

graphicalrepresentation

Previously…“All models are wrong but some are useful”

George E. P. Box (statistician, 1919-2013)



• Inference on models

Model M : your scientific hypothesis put into predictive equations (functions f and g + prior assumptions)

• Usually approximated (FE, AIC, BIC)• Trades accuracy and complexity• Model comparison through Bayes Factors and model posteriors• Models can only be compared on the same data

MYP

MPMYPMYP

,,

Likelihoodpriors

Evidence

Previously…



• Inference on models

• Inference on model parameters

MYP

MPMYPMYP

,,

Most plausible model or model family

Posterior

• Inference at the group level

Outline

• 1 - EEG/MEG Source reconstruction

• 2 - Inferring learning mechanisms from EEG/MEG responses

• 3 - Online Bayesian inference and design optimization

1 – EEG/MEG source reconstructionAn ill-posed inverse problem

Inverse Problem

Y

Likelihood Prior

),|( MYp )|( Mp

Generative model M

Data

Parameters

No unique solution!

Additional informationis needed

Posterior Evidence

)|( MYp),|( MYp

Bayesian inference

1 – EEG/MEG source reconstructionTypical model families

• Biophysical model

Y: sensor data (incl. noise)

),|( MYp

Geometrical and physical propertiesof the head tissues

Static (forward) model family

• Source model

Equivalent current dipole (ECD) model family Distributed sources or Imaging model family

: dipole locations,orientations & intensity

)|( Mp : number of dipoles,initial location

: dipole intensities

)|( Mp : number of dipoles,fixed locations

1 – EEG/MEG source reconstructionDetailed example: the distributed source model


𝒀 = 𝑲. 𝜽 + 𝜺

Evoked response: Nsens x Time

Forward operator or lead-field matrix: Nsens x Nsources

Unknown source dynamics: Nsources x Time

Additive noise

Cortical mesh (3D surface)

• Likelihood

𝒑 𝒀 𝜽,𝑴 = 𝑵(𝑲. 𝜽, 𝑪𝜺)

𝒑 𝜺 𝑴 = 𝑵(𝟎, 𝑪𝜺)

The proba. of a largenoise is small

• Prior𝒑 𝜽 𝑴 = 𝑵(𝟎, 𝑪𝜽)

1 – EEG/MEG source reconstructionDetailed example: the distributed source model


𝒀 = 𝑲. 𝜽 + 𝜺

Evoked response: Nsens x Time

Forward operator or lead-field matrix: Nsens x Nsources

Unknown source dynamics: Nsources x Time

Additive noise

Cortical mesh (3D surface)

• Likelihood

𝒑 𝒀 𝜽,𝑴 = 𝑵(𝑲. 𝜽, 𝑪𝜺)

𝒑 𝜺 𝑴 = 𝑵(𝟎, 𝑪𝜺)

The proba. of a largeintensity is small

• Prior𝒑 𝜽 𝑴 = 𝑵(𝟎, 𝑪𝜽)

1 – EEG/MEG source reconstructionCommonly used priors

• Priors𝒑 𝜽 𝑴 = 𝑵(𝟎, 𝑪𝜽)

Nsources x Nsources

2 4 6 8 10 12 14 16

2

4

6

8

10

12

14

16

i.i.d or Minimum norm

2 4 6 8 10 12 14 16

2

4

6

8

10

12

14

16

Single dipole

2 4 6 8 10 12 14 16

2

4

6

8

10

12

14

16

0

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

Smoothness (like LORETA)

2 4 6 8 10 12 14 16

2

4

6

8

10

12

14

160

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

fMRI based

246810121416

2

4

6

8

10

12

14

160

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

Lead-field based(Beamformer or MSP)

Philips et al. 2005Mattout et al. 2005 & 2006Wipf and Nagarajan 2009

1 – EEG/MEG source reconstructionComparing priors

5010

015

020

025

030

035

040

0-0

.2

-0.1

5

-0.1

-0.0

50

0.050.

1

0.15

estim

ated

res

pons

e -

cond

ition

1

at -

54, -

27, 1

0 m

m

time

ms

PP

M a

t 188

ms

(100

per

cent

con

fiden

ce)

512

dipo

les

Per

cent

var

ianc

e ex

plai

ned

11.2

1 (1

0.26

)

log-

evid

ence

= 5

3.1

5010

015

020

025

030

035

040

0-0

.25

-0.2

-0.1

5

-0.1

-0.0

50

0.050.

1

0.150.

2

0.25

estim

ated

res

pons

e -

cond

ition

1

at 5

0, -

26, 1

1 m

m

time

ms

PP

M a

t 188

ms

(100

per

cent

con

fiden

ce)

512

dipo

les

Per

cent

var

ianc

e ex

plai

n ed

99.7

8 (9

1.33

)

log-

evid

ence

= 1

12.8

5010

015

020

025

030

035

040

0-0

.03

-0.0

2

-0.0

10

0.01

0.02

0.03

estim

ated

res

pons

e -

cond

ition

1

at 5

7, -

26, 9

mm

time

ms

PP

M a

t 188

ms

(100

per

cent

con

fiden

ce)

512

dipo

les

Per

cent

var

ianc

e ex

plai

ned

99.1

8 (9

0.78

)

log-

evid

ence

= 9

9.9

Predicted

-0.025

-0.02

-0.015

-0.01

-0.005

0

0.005

0.01

0.015

0.02

2 4 6 8 10 12 14 16

2

4

6

8

10

12

14

16

Explained variance

Predicted

-0.08

-0.06

-0.04

-0.02

0

0.02

0.04

0.06

0.08

2 4 6 8 10 12 14 16

2

4

6

8

10

12

14

16

Predicted

-0.08

-0.06

-0.04

-0.02

0

0.02

0.04

0.06

0.08

0.1

246810121416

2

4

6

8

10

12

14

160

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

Predicted

-0.06

-0.04

-0.02

0

0.02

0.04

0.06

0.08

0.1

5010

015

020

025

030

035

040

0-4-3-2-101234

x 10

-3es

timat

ed r

espo

nse

- co

nditio

n 1

at 6

5, -

21, 1

1 m

m

time

ms

PP

M a

t 188

ms

(100

per

cent

con

fiden

ce)

512

dipo

les

Per

cent

var

ianc

e ex

plai

ned

98.4

9 (9

0.16

)

log-

evid

ence

= 9

3.1

2 4 6 8 10 12 14 16

2

4

6

8

10

12

14

16

Measured

-0.08

-0.06

-0.04

-0.02

0

0.02

0.04

0.06

0.08

0.1

11 % 96% 97% 98%

Alternative priors

𝒀 : observed data

𝜽 : estimatedsource intensities

𝒀 : predicted data

0

20

40

Log-evidence

1 – EEG/MEG source reconstructionEmpirical Bayes: multiple sparse priors (MSP)

𝑸𝟏 𝑸𝟐 𝑸𝟑

…

𝑸𝒏

𝑪𝜽 = 𝝀𝟏. 𝑸𝟏 + 𝝀𝟐. 𝑸𝟐 + 𝝀𝟑. 𝑸𝟑 +⋯+ 𝝀𝒏. 𝑸𝒏

Model (Hyper)parameters

0 100 200 300 400 500 600 700-8

-6

-4

-2

0

2

4

6

8

estimated response - condition 1

at 36, -18, -23 mm

time ms

PPM at 229 ms (100 percent confidence)

512 dipoles

Percent variance explained 99.15 (65.03)

log-evidence = 8157380.8

0 100 200 300 400 500 600 700-2

-1.5

-1

-0.5

0

0.5

1

1.5

2


at 46, -31, 4 mm

time ms


512 dipoles



0 100 200 300 400 500 600 700-2

-1.5

-1

-0.5

0

0.5

1

1.5

2


at 46, -31, 4 mm

time ms


512 dipoles



0 50 100 15010

2

104

106

108

Iteration

Free energy

Accuracy

Complexity

0 100 200 300 400 500 600 700-2

-1.5

-1

-0.5

0

0.5

1

1.5

2


at 46, -31, 4 mm

time ms


512 dipoles



AccuracyFree EnergyCompexity

Inference (iterative) process

Friston et al. 2008

1 – EEG/MEG source reconstructionModel comparison on other dimensions

Mattout et al. 2007

Spherical head model Realistic surfacic model (BEM)

Biophysics

Anatomy

Mesh resolution (High / Low)Dipole oreintation (fixed / free)

Sources

Henson et al. 2009

Henson et al. 2007Model comparison on real MEG data

• No evidence in favor of individual vs. Inverse-norm mesh• Evidence in favor of BEM head model• Evidence in favor of high + fixed vs. low + free

1 – EEG/MEG source reconstructionGraphical representation of MSP

Fixed

Variable

Data

...

Y

,η Ω

...

Directed Acyclic Graph (DAG) representation

𝑸𝟐𝑸𝟏

𝝀𝒊𝑪𝜽 𝑪𝜺𝝀𝒊𝜺

𝑸𝟏𝜺 𝑸𝟐

𝜺

𝜺

K

𝜽

Friston et al. 2008

1 – EEG/MEG source reconstructionModel extension to group level inference

Fixed

Variable

Data

...

,η Ω

...


𝑸𝟐𝑸𝟏

𝝀𝒊𝑪𝜽 𝑪𝟏,𝜺𝝀𝟏,𝒊𝜺

𝑸𝟏,𝟏𝜺 𝑸𝟏,𝟐

𝜺

𝜽𝟏

Litvak and Friston 2008

𝑲𝟏 𝑲𝟐

𝜺𝟏 𝜽𝟐 𝜺𝟐

𝒀𝟏 𝒀𝟐

...𝑸𝟐,𝟏𝜺 𝑸𝟐,𝟐

𝜺

𝑪𝟐,𝜺𝝀𝟐,𝒊𝜺

A two-step procedure:- Estimating the group prior variance- Estimating the individual source intensities

1 – EEG/MEG source reconstructionModel extension to EEG-MEG data fusion

Fixed

Variable

Data

...

,η Ω


𝑸𝟐𝑸𝟏

𝝀𝒊𝑪𝜽

Henson et al. 2011

𝑲𝑬𝑬𝑮 𝑲𝑴𝑬𝑮𝒀𝑬𝑬𝑮 𝒀𝑴𝑬𝑮

𝜽 𝜺𝑴𝑬𝑮

...𝑸𝑴𝑬𝑮,𝟏𝜺 𝑸𝑴𝑬𝑮,𝟐

𝜺

𝑪𝑴𝑬𝑮,𝜺𝝀𝑴𝑬𝑮,𝒊𝜺

𝜺𝑬𝑬𝑮

...𝑸𝑬𝑬𝑮,𝟏𝜺 𝑸𝑬𝑬𝑮,𝟐

𝜺

𝑪𝑬𝑬𝑮,𝜺𝝀𝑬𝑬𝑮,𝒊𝜺

1 – EEG/MEG source reconstructionAnother model family

Dynamic causal model (DCM) family

• Source model

• Biophysical model

1

2

4

3

24

u

Golgi Nissl

internal granular

layer

internal pyramidal

layer

external pyramidal

layer

external granular

layer

EP

EI

IIModel parameters:- Source intrinsic parameters

(e.g. synaptic gain)- Extrinsic coupling- Source orientation

𝒙 = 𝒇 𝒙, 𝜽𝒆𝒗𝒐𝒍, 𝒖

𝒚 = 𝒈 𝒙, 𝜽𝒐𝒃𝒔 + 𝜺

1 – EEG/MEG source reconstructionAn MMN example

Characterizing a group response in terms of modulations of effective connectivity

……S S S D S S

Classical auditory oddball paradigm

S: standard toneD: (frequency) deviant tone

rIFG

rSTG

rA1

lSTG

lA1

/

Garrido et al. 2009

Boly et al. 2011

Ctrls: ControlsMCS: Minimally Conscious StateVS: Vegetative State

2 – Inferring learning mechanisms from EEG/MEG data


• In line with the Bayesian brain hypothesis and its neuronal instantiation (predictive coding), it has been suggested that evoked responses reflect prediction error

Friston 2005

• Deviance responses should be modulated by predictability

• Single trial modulations of evoked responses should reflect perceptual learning

Neurophysiological correlates of implicit perceptual learning


Lecaignard et al. 2015

Effect of predictability

US

PS

US - PS

USPS

US: Unpredictable sequencePS: Predictable sequence

• Deviance responses should be modulated by predictabilityNeurophysiological correlates of implicit perceptual learning



M1

M2

M3

M4

M5

Bayesian learnerwith different forgetting

parameter value

M6

M7

M8

Change detection

models

Null model (noise)

Alternative models

Meta-Bayesian approach

Ostwald et al. 2012

Data features: individual single-trial source activity in clusters estimated from EEG-MEG group inversion

Lecaignard et al. in prep (a)

𝒙 = 𝒇 𝒙, 𝜽, 𝒖 𝒚 = 𝒈 𝒙,𝝋 + 𝜺



Preliminary results

Model comparison in a single subject, single block (condition US)

time time

Relative Free-energies or Bayes Factors w.r.t null model (M8)

M1

M2

M3

M4

M5

Bayesian learnerwith different forgetting

parameter value

M6

M7

M8

Change detection

models

Null model (noise)

Lecaignard et al. in prep (b)VBA toolbox, Daunizeau et al. 2014

3 – Online Bayesian inference and design optimization


Fine neurobiological and/or computational models of evoked responses and their modulations may tell us more about information processing in the brain

Given such models, we may be in a position to disentangle between alternative hypothesis about ongoing perceptual inference and learning in a given subject/patient

However, what design to choose? How to optimize it for each individual?

Basic idea

To make use of our ability to process data in real-time in order to optimize design parameters online, for a given individual, in the aim of selecting the most likely model both accurately and fast

Wald 1947, Cavagnaro et al. 2009, Myung et al. 2013

Motivation for Adaptive Design Optimization

3 – Online Bayesian inference and design optimizationGeneral principle of Adaptive Design Optimization

Sanchez et al. 2014

3 – Online Bayesian inference and design optimizationA simple example: behavioural task of word memorization

Cavagnaro et al. 2009, Myung et al. 2013

Y (observations) = number of recalled words

u (design variable) = Lag time

Models of “forgetting”

Predictions(Low uncertainty about model parameters)

EXP

POW

Predictions(High uncertainty about model parameters)

Experimental context

Hypothesis

3 – Online Bayesian inference and design optimizationOptimization criterion

u = 40

u = 10

Y

p(Y|H1 ,u)

p(Y|H2 ,u)

p(e|u)u* = argmin {p(e|u)}

u

…P(Y|H)

H1 : EXP

H2 : POW

u = 1

Minimizing the risk of model selection error

Danizeau et al. 2011

3 – Online Bayesian inference and design optimizationExtension to more complex (DCM-like) models

Sanchez et al. 2014

Generative model of the

“world”

ERPs

>>

Evolution function f

),,( evoluxfx ),( obsxgyObservation function g

+

Extension to more complex, multidimensional models

Danizeau et al. 2011

3 – Online Bayesian inference and design optimizationDisentangling between learning models

Sanchez et al. 2014

A perceptual learning model family: hierarchical Gaussian filters

Evolution function f

),,( evoluxfx ),( obsxgy

Observer function g

+Category (S or D)

(x1 = u)

Probability of D(u = 1)

Volatility

x1(t)

x2(t)

ω

x3(t)

ϑ

κ

KL-divergencebetween prior and post beliefs

Mathis et al. 2011 & 2014


Sanchez et al. 2014

Models ω valuesκ values

(if κ = 0, no 3rd level)ϑ values

Ability to track

events probabilities

Ability to track

environmental volatility

M1 -Inf 0 - No No

M2 - 5 0 - Low learning No

M3 - 4 0 - High learning No

M4 - 5 1 0.2 Low learning Yes

M5 - 4 1 0.2 High learning Yes

Deviant

Standard

Simulations: alternative models and predictionsDisentangling between learning models


Sanchez et al. 2014

Disentangling between learning models



Ability to track


Ability to track


M1 -Inf 0 - No No





Deviant

Standard

Simulations: alternative models and predictions


Sanchez et al. 2014




Ability to track


Ability to track


M1 -Inf 0 - No No





Deviant

Standard



Sanchez et al. 2014




Ability to track


Ability to track


M1 -Inf 0 - No No





Deviant

Standard



Sanchez et al. 2014




Ability to track


Ability to track


M1 -Inf 0 - No No





Deviant

Standard



Sanchez et al. 2014


Static designs to be compared with ADO

Classic -stable

Classic -volatile

p(u=1) = 0.2 0.1 0.3 0.1 0.2


Sanchez et al. 2014

Results on Monte Carlo simulations

Model 1 Model 2 Model 3 Model 4 Model 5

0

20

40

60

80

100

Optimal

Volatile Classical

Stable Classical

% of non-conclusive experiments

Trials beforereaching conclusion

Model 1 Model 2 Model 3 Model 4 Model 5

0

50

100

150

200

250

300

350

Optimal

Volatile Classical

Stable Classical


Conclusion

• 1 - EEG/MEG Source reconstruction- Bayesian inference on models and parameters- Group level inference- EEG/MEG data fusion

• 2 - Inferring learning mechanisms through EEG/MEG responses- meta-Bayesian inference- Model comparison

• 3 - Online Bayesian inference and design optimization

Acknowledgments

Olivier BertrandGareth BarnesAnne CaclinJean DaunizeauKarl FristonRik HensonFrançoise LecaignardEmmanuel MabyGaëtan Sanchez

Documents

Bayesian inference in EEG/MEG©rémies_Mattout... · Bayesian inference in EEG/MEG Workshop on Bayesian approaches in neuroscience –Karolinska Institute –2nd of Frebruary 2016