Modeling fMRI data generated by overlapping cognitive processes with unknown onsets using Hidden Process Models

Rebecca A. Hutchinson (1), Tom M. Mitchell (1,2)
(1) Computer Science Department, Carnegie Mellon University
(2) Machine Learning Department, Carnegie Mellon University

Statistical Analyses of Neuronal Data (SAND4), May 30, 2008


Page 1: (title slide)

Page 2

Hidden Process Models

• HPMs are a new probabilistic model for time series data.

• HPMs are designed for data that is:
  – generated by a collection of latent processes that have overlapping spatial-temporal signatures;
  – high-dimensional, sparse, and noisy;
  – accompanied by limited prior knowledge about when the processes occur.

• HPMs can simultaneously recover the start times and spatial-temporal signatures of the latent processes.

Page 3

Example

(Figure: schematic time series for Process 1, …, Process P over voxels d1, …, dN, plotted against time t.)

Prior knowledge:
• An instance of Process 1 begins in this window.
• An instance of Process P begins in this window.
• An instance of either Process 1 OR Process P begins in this window.
• There are a total of 6 processes in this window of data.

Page 4

Simple Case: Known Timing

Apply the General Linear Model: Y = XW

(Figure: the T × D data matrix Y is the product of a 0/1 convolution matrix X, whose columns are grouped by process p1, p2, p3, and the stacked response signatures W(1), W(2), W(3).)

X =
1 0 0 0 0 0 0 0 0
0 1 0 1 0 0 1 0 0
0 0 1 0 1 0 0 1 0
0 0 0 0 0 1 0 0 1
… … …

Data Y; convolution matrix X; unknown parameters W.

[Dale 1999]
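With known timing, W has a closed-form least-squares solution. A minimal synthetic sketch in NumPy (the sizes, onset times, and noise level are invented for illustration, not taken from the slides):

```python
import numpy as np

T, D, V = 60, 8, 5                   # time points, response duration, voxels
onsets = {0: [3, 25], 1: [10, 40]}   # known start times for 2 processes
P = len(onsets)

# Build the T x (P*D) convolution matrix X: row t has a 1 in column
# p*D + d whenever an instance of process p started d time steps before t.
X = np.zeros((T, P * D))
for p, starts in onsets.items():
    for s in starts:
        for d in range(D):
            if s + d < T:
                X[s + d, p * D + d] = 1.0

# Simulate Y = X W + noise from a ground-truth W, then recover W by
# ordinary least squares, as in the General Linear Model.
rng = np.random.default_rng(0)
W_true = rng.normal(size=(P * D, V))
Y = X @ W_true + 0.01 * rng.normal(size=(T, V))
W_hat, *_ = np.linalg.lstsq(X, Y, rcond=None)
```

Because the onsets (and hence X) are known, the estimate is accurate whenever X has full column rank; the next slide shows what breaks when the timing is uncertain.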

Page 5

Challenge: Unknown Timing

(Figure: the same Y = XW decomposition as on the previous slide, but now the 0/1 entries of the convolution matrix X are unknown.)

X =
1 0 0 0 0 0 0 0 0
0 1 0 1 0 0 1 0 0
0 0 1 0 1 0 0 1 0
0 0 0 0 0 1 0 0 1
… … …

Uncertainty about the processes essentially makes the convolution matrix a random variable.

Page 6

fMRI Data

(Figure: the hemodynamic response — signal amplitude over time in seconds, following a brief burst of neural activity.)

Features: 10,000 voxels, imaged every second.
Training examples: 10-40 trials (task repetitions).
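The hemodynamic response sketched above is commonly approximated by a double-gamma function; the specific shape parameters below are a conventional choice for illustration, not values from the slides:

```python
import numpy as np
from scipy.stats import gamma

# Double-gamma HRF: a positive peak a few seconds after neural activity,
# followed by a small undershoot (shape parameters 6 and 16 are a
# conventional choice, with the undershoot scaled down by 1/6).
t = np.arange(0, 30, 0.5)            # seconds, sampled every 0.5 s
hrf = gamma.pdf(t, a=6) - gamma.pdf(t, a=16) / 6.0
hrf /= hrf.max()                     # normalize the peak to 1

peak_time = t[np.argmax(hrf)]        # peak lags the neural event by ~5 s
```

This lag and smearing is why overlapping processes mix in the measured signal: responses to events a few seconds apart superimpose.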

Page 7

Goals for fMRI

• To track cognitive processes over time.
  – Estimate process hemodynamic responses.
  – Estimate process timings.

• Allowing processes that do not directly correspond to the stimulus timing is a key contribution of HPMs!

• To compare hypotheses of cognitive behavior.

Page 8

Our Approach

• The model of each process contains a probability distribution over when the process occurs relative to a known event (called a timing landmark).

• When predicting the underlying processes, use prior knowledge about timing to limit the hypothesis space.

Page 9

Study: Pictures and Sentences

• Task: Decide whether the sentence describes the picture correctly; indicate with a button press.

• 13 normal subjects, 40 trials per subject.
• Sentences and pictures describe 3 symbols: *, +, and $, using ‘above’, ‘below’, ‘not above’, ‘not below’.
• Images are acquired every 0.5 seconds.

(Figure: trial timeline — after Fixation, Read Sentence and View Picture are presented at t = 0 and t = 4 sec., followed by Press Button and Rest at t = 8 sec.; the sequence then repeats.)

[Keller et al, 2001]

Page 10

Processes of the HPM:

Process 1: ReadSentence
  Response signature W: time courses for voxels v1, v2
  Duration d: 11 sec. Offsets Ω: {0, 1}. P(Ω): {θ0, θ1}

Process 2: ViewPicture
  Response signature W: time courses for voxels v1, v2
  Duration d: 11 sec. Offsets Ω: {0, 1}. P(Ω): {θ0, θ1}

Input stimulus Δ: sentence, picture
Timing landmarks λ: λ1, λ2

One configuration c of process instances π1, π2, …, πk (with prior γc).
Example: process instance π2 has process h = 2, timing landmark λ2, and offset O = 1 (start time: λ2 + O).

Predicted mean: the sum of the instances' response signatures, each shifted to its start time; the observed signal for voxels v1 and v2 adds per-voxel noise N(0, σ1) and N(0, σ2).
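The predicted mean above is just each instance's response signature shifted to its start time (landmark plus offset) and summed, with i.i.d. per-voxel Gaussian noise on top. A minimal sketch (NumPy; all sizes and values are invented for illustration):

```python
import numpy as np

T, V, D = 20, 2, 11                 # time points, voxels (v1, v2), duration
rng = np.random.default_rng(1)
W = {1: rng.normal(size=(D, V)),    # response signature for process 1
     2: rng.normal(size=(D, V))}    # response signature for process 2

# A configuration: one (process id, landmark, offset) triple per instance.
config = [(1, 0, 0),                # instance 1 starts at lambda1 + O = 0
          (2, 2, 1)]                # instance 2 starts at lambda2 + O = 3

# Predicted mean: sum of response signatures shifted to their start times.
mean = np.zeros((T, V))
for h, lam, O in config:
    start = lam + O
    end = min(start + D, T)
    mean[start:end] += W[h][: end - start]

sigma = np.array([0.1, 0.2])        # per-voxel std devs sigma1, sigma2
Y = mean + rng.normal(size=(T, V)) * sigma
```

Note how the two instances overlap in time, so individual time points mix contributions from both signatures.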

Page 11

HPM Formalism

HPM = <H, C, γ, σ>

H = <h1, …, hH>, a set of processes (e.g. ReadSentence)
  h = <W, d, Ω, Θ>, a process
    W = response signature
    d = process duration
    Ω = allowable offsets
    Θ = multinomial parameters over values in Ω

C = <c1, …, cC>, a set of configurations
  c = <π1, …, πL>, a set of process instances
    π = <h, λ, O>, a process instance (e.g. ReadSentence(S1))
      h = process ID
      λ = timing landmark (e.g. stimulus presentation of S1)
      O = offset (takes values in Ωh)

γ = <γ1, …, γC>, priors over C
σ = <σ1, …, σV>, standard deviation for each voxel

[Hutchinson et al, 2006]

Page 12

Encoding Experiment Design

Processes: ReadSentence = 1, ViewPicture = 2, Decide = 3
Input stimulus Δ; timing landmarks λ1, λ2.

Configurations 1-4: (Figure: each configuration lays out its process instances against the timing landmarks.)

Constraints encoded (h(i) and o(i) denote the process ID and offset of instance i):
h(1) = {1,2}
h(2) = {1,2}
h(1) != h(2)
o(1) = 0
o(2) = 0
h(3) = 3
o(3) = {1,2}
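The four configurations follow mechanically from the constraints; a sketch of the enumeration with itertools (process IDs as on the slide, instance tuples as invented (process, offset) pairs):

```python
from itertools import product

# Processes: ReadSentence = 1, ViewPicture = 2, Decide = 3.
# Instances 1 and 2 have offset 0; instance 3 is Decide with offset in {1, 2}.
configs = []
for h1, h2, o3 in product([1, 2], [1, 2], [1, 2]):
    if h1 != h2:                       # instances 1 and 2 differ
        configs.append(((h1, 0), (h2, 0), (3, o3)))

print(len(configs))                    # 4 configurations
```

The constraints cut the raw 2 x 2 x 2 = 8 assignments down to 4, matching Configurations 1-4 on the slide.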

Page 13

Inference

• Inference is over configurations.
• Choose the most likely configuration:

  C* = argmaxC P(C | Y, Δ, HPM)

• C = configuration, Y = observed data, Δ = input stimuli, HPM = model.
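With the parameters fixed, the argmax over configurations can be found by brute force: build each candidate configuration's design, score the data under the resulting Gaussian mean, and keep the best. A toy single-process, single-voxel sketch (all numbers invented):

```python
import numpy as np

T, D = 30, 8
rng = np.random.default_rng(2)
W = rng.normal(size=D)               # response signature (one voxel)
sigma = 0.1                          # noise std dev

def mean_for(start):
    """Predicted mean if the process instance starts at `start`."""
    m = np.zeros(T)
    end = min(start + D, T)
    m[start:end] = W[: end - start]
    return m

# Candidate configurations: the instance starts at offset 0..3 after t=5.
candidates = [5 + o for o in range(4)]

true_start = 7
Y = mean_for(true_start) + sigma * rng.normal(size=T)

def log_lik(start):
    r = Y - mean_for(start)
    return -0.5 * np.sum((r / sigma) ** 2)   # up to an additive constant

best = max(candidates, key=log_lik)
```

This enumeration is exponential in the number of uncertain instances, which is why the deck restricts the hypothesis space with prior timing knowledge.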

Page 14

Learning

• Parameters to learn:
  – Response signature W for each process
  – Timing distribution Θ for each process
  – Standard deviation σ for each voxel

• Expectation-Maximization (EM) algorithm to estimate W and Θ.
  – E step: estimate a probability distribution over configurations.
  – M step: update estimates of W (using reweighted least squares) and Θ (using standard MLEs) based on the E step.
  – After convergence, use standard MLEs for σ.
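The EM loop can be sketched as follows, with the E step enumerating configurations and the M step solving a reweighted least-squares problem. This is a toy version with one process, a handful of candidate start times, and a unit-variance E step; all sizes are invented:

```python
import numpy as np

T, D, V = 40, 6, 3                   # time points, duration, voxels
rng = np.random.default_rng(3)

def design(start):
    """T x D indicator matrix for one process instance starting at `start`."""
    X = np.zeros((T, D))
    for d in range(D):
        if start + d < T:
            X[start + d, d] = 1.0
    return X

candidates = [5, 12, 19]             # allowable start times (configurations)
true_start = 12
W_true = rng.normal(size=(D, V))
Y = design(true_start) @ W_true + 0.05 * rng.normal(size=(T, V))

W = np.zeros((D, V))                 # initial response signature
for _ in range(10):
    # E step: posterior over configurations (uniform prior, unit variance).
    ll = np.array([-0.5 * np.sum((Y - design(s) @ W) ** 2)
                   for s in candidates])
    post = np.exp(ll - ll.max())
    post /= post.sum()
    # M step: reweighted least squares over the stacked, sqrt-weighted designs.
    Xs = np.vstack([np.sqrt(p) * design(s) for p, s in zip(post, candidates)])
    Ys = np.vstack([np.sqrt(p) * Y for p in post])
    W, *_ = np.linalg.lstsq(Xs, Ys, rcond=None)

# After convergence: sigma by standard MLE under the most likely configuration.
best = candidates[int(np.argmax(post))]
sigma = np.sqrt(np.mean((Y - design(best) @ W) ** 2))
```

On this synthetic data the posterior concentrates on the true start time and W converges to the generating signature.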

Page 15

Uncertain Timings

• The convolution matrix models several choices for each time point.

(Figure: the T × PD convolution matrix is expanded to T′ > T rows by repeating each time point once per consistent set of configurations, e.g.:)

1 0 0 0 0 0 0 0 0   t=1 (configurations 3,4)
0 0 0 1 0 0 0 0 0   t=1 (configurations 1,2)
0 1 0 0 0 0 0 0 0   t=2 (configurations 3,4)
0 0 0 0 1 0 0 0 0   t=2 (configurations 1,2)
… … …
0 0 1 0 0 0 0 1 0   t=18 (configuration 3)
0 0 1 0 0 0 0 0 1   t=18 (configuration 4)
0 0 0 0 0 1 0 1 0   t=18 (configuration 1)
0 0 0 0 0 1 0 0 1   t=18 (configuration 2)
… … …

Page 16

Uncertain Timings

• Weight each row with probabilities from the E-step.

(Figure: each row of the expanded convolution matrix carries a weight e1, e2, e3, e4, …, and Y equals the weighted matrix times W, e.g.:)

1 0 0 0 0 0 0 0 0   weight e1 (configurations 3,4)
0 0 0 1 0 0 0 0 0   weight e2 (configurations 1,2)
0 1 0 0 0 0 0 0 0   weight e3 (configurations 3,4)
0 0 0 0 1 0 0 0 0   weight e4 (configurations 1,2)
… … …

e1 = P(C=3 | Y, W_old, Θ_old, σ_old) + P(C=4 | Y, W_old, Θ_old, σ_old)
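The row weighting amounts to weighted least squares: scale each duplicated row of the expanded matrix (and the corresponding row of Y) by the square root of its E-step weight and solve an ordinary least-squares problem. A small numerical check of that equivalence (invented sizes):

```python
import numpy as np

rng = np.random.default_rng(4)
X = rng.normal(size=(12, 4))         # stand-in for the expanded matrix
Y = rng.normal(size=(12, 2))
e = rng.uniform(0.1, 1.0, size=12)   # E-step row weights e1, e2, ...

# Weighted least squares via sqrt-scaled rows and ordinary lstsq:
sw = np.sqrt(e)[:, None]
W1, *_ = np.linalg.lstsq(sw * X, sw * Y, rcond=None)

# The same solution via the weighted normal equations (X^T E X) W = X^T E Y:
E = np.diag(e)
W2 = np.linalg.solve(X.T @ E @ X, X.T @ E @ Y)
```

Both routes give the same W, so the M step reduces to one standard least-squares solve over the weighted, expanded system.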

Pages 17-20: (figure-only slides)

Potential Processes

These can be grouped in many ways to form different HPMs.

Page 21

Comparing HPMs

(Table: cross-validated data log-likelihood, one row per participant. All values are ×10^6.)

Page 22

Are we learning the right number of processes?

For each training set, the table shows the average (over 30 runs) test-set log-likelihood of each of 3 HPMs (with 2, 3, and 4 processes) on each of 3 synthetic data sets (generated with 2, 3, and 4 processes).

Each cell is reported as mean ± standard deviation.

NOTE: All values in this table are ×10^5.

Page 23

Ongoing Research

• Regularization for process response signatures (adding bias for temporal and/or spatial smoothness, spatial priors, spatial sparsity).

• Modeling process response signatures with basis functions.

• Allowing continuous start times (decoupling process starts from the data acquisition rate).

• A Dynamic Bayes Net formulation of HPMs.

Page 24

References

• Dale, A.M., Optimal experimental design for event-related fMRI, 1999, Human Brain Mapping, 8, 109-114.

• Hutchinson, R.A., Mitchell, T.M., & Rustandi, I., Hidden Process Models, 2006, Proceedings of the 23rd International Conference on Machine Learning, 433-440.

• Keller, T.A., Just, M.A., & Stenger, V.A., Reading span and the time-course of cortical activation in sentence-picture verification, 2001, Annual Convention of the Psychonomic Society.