69
MULTI-VARIATE/MODALITY IMAGE ANALYSIS Duygu Tosun-Turgut, Ph.D. Center for Imaging of Neurodegenerative Diseases Department of Radiology and Biomedical Imaging [email protected]

MULTI-VARIATE /MODALITY IMAGE ANALYSIScind.ucsf.edu/sites/cind.ucsf.edu/files/wysiwyg/education/... · MULTI-VARIATE /MODALITY IMAGE ANALYSIS ... Curse of dimensionality ... • For

Embed Size (px)

Citation preview

Page 1: MULTI-VARIATE /MODALITY IMAGE ANALYSIScind.ucsf.edu/sites/cind.ucsf.edu/files/wysiwyg/education/... · MULTI-VARIATE /MODALITY IMAGE ANALYSIS ... Curse of dimensionality ... • For

MULTI-VARIATE/MODALITY IMAGE ANALYSIS Duygu Tosun-Turgut, Ph.D. Center for Imaging of Neurodegenerative Diseases Department of Radiology and Biomedical Imaging [email protected]

Page 2: MULTI-VARIATE /MODALITY IMAGE ANALYSIScind.ucsf.edu/sites/cind.ucsf.edu/files/wysiwyg/education/... · MULTI-VARIATE /MODALITY IMAGE ANALYSIS ... Curse of dimensionality ... • For

Curse of dimensionality • A major problem in imaging statistics is the curse of dimensionality.

•  If the data x lies in high dimensional space, then an enormous amount of data is required to learn distributions or decision rules.

Page 3: MULTI-VARIATE /MODALITY IMAGE ANALYSIScind.ucsf.edu/sites/cind.ucsf.edu/files/wysiwyg/education/... · MULTI-VARIATE /MODALITY IMAGE ANALYSIS ... Curse of dimensionality ... • For

Curse of dimensionality • One way to deal with dimensionality is to assume that we know the form of the probability distribution.

• For example, a Gaussian model in N dimensions has N + N(N-1)/2 parameters to estimate.

• Requires O(N2) data to learn reliably. This may not be practical.

Page 4: MULTI-VARIATE /MODALITY IMAGE ANALYSIScind.ucsf.edu/sites/cind.ucsf.edu/files/wysiwyg/education/... · MULTI-VARIATE /MODALITY IMAGE ANALYSIS ... Curse of dimensionality ... • For

Typical solution: divide into sub-problems

• Each sub-problem relates single voxel to clinical variable

• Known as voxel-wise, univariate, or pointwise regression

• Approach popularized by SPM (Friston, 1995)

• Dependencies between spatial variables neglected!!

Page 5: MULTI-VARIATE /MODALITY IMAGE ANALYSIScind.ucsf.edu/sites/cind.ucsf.edu/files/wysiwyg/education/... · MULTI-VARIATE /MODALITY IMAGE ANALYSIS ... Curse of dimensionality ... • For

Dimension reduction • One way to avoid the curse of dimensionality is by projecting the data onto a lower-dimensional space.

• Techniques for dimension reduction: • Principal Component Analysis (PCA) •  Fisher’s Linear Discriminant • Multi-dimensional Scaling. •  Independent Component Analysis (ICA)

Page 6: MULTI-VARIATE /MODALITY IMAGE ANALYSIScind.ucsf.edu/sites/cind.ucsf.edu/files/wysiwyg/education/... · MULTI-VARIATE /MODALITY IMAGE ANALYSIS ... Curse of dimensionality ... • For

A dual goal • Find a good representation

• The features part

• Reduce redundancy in the data • A side effect of “proper” features

Page 7: MULTI-VARIATE /MODALITY IMAGE ANALYSIScind.ucsf.edu/sites/cind.ucsf.edu/files/wysiwyg/education/... · MULTI-VARIATE /MODALITY IMAGE ANALYSIS ... Curse of dimensionality ... • For

A “good feature” • “Simplify” the explanation of the input

• Represent repeating patterns • When defined makes the input simpler

• How do we define these abstract qualities? • Onto the math …

Page 8: MULTI-VARIATE /MODALITY IMAGE ANALYSIScind.ucsf.edu/sites/cind.ucsf.edu/files/wysiwyg/education/... · MULTI-VARIATE /MODALITY IMAGE ANALYSIS ... Curse of dimensionality ... • For

Linear features

Z = WX

↑ "#

$%

Weight Matrix

samples→

=↑

"

#

'''

$

%

(((

Feature Matrix

dimensions→

↑ "#

$%

Input Matrix

samples→

feat

ures

feat

ures

dim

ensi

ons

Page 9: MULTI-VARIATE /MODALITY IMAGE ANALYSIScind.ucsf.edu/sites/cind.ucsf.edu/files/wysiwyg/education/... · MULTI-VARIATE /MODALITY IMAGE ANALYSIS ... Curse of dimensionality ... • For

A 2D example Z=WX=

z1T

z2T

!

"

##

$

%

&&=

w1T

w2T

!

"

##

$

%

&&

x1T

x2T

!

"

##

$

%

&&

Scatter plot of same data

X1

X2

X1

X2

Matrix representation of data

Page 10: MULTI-VARIATE /MODALITY IMAGE ANALYSIScind.ucsf.edu/sites/cind.ucsf.edu/files/wysiwyg/education/... · MULTI-VARIATE /MODALITY IMAGE ANALYSIS ... Curse of dimensionality ... • For

Defining a goal • Desirable feature features • Give “simple” weights • Avoid feature similarity

• How do we define these?

Page 11: MULTI-VARIATE /MODALITY IMAGE ANALYSIScind.ucsf.edu/sites/cind.ucsf.edu/files/wysiwyg/education/... · MULTI-VARIATE /MODALITY IMAGE ANALYSIS ... Curse of dimensionality ... • For

One way to proceed • “Simple weights”

• Minimize relation of the two dimensions • Decorrelate:

• “Feature similarity” • Same thing! • Decorrelate:

z1Tz2 = 0

w1Tw2 = 0

Page 12: MULTI-VARIATE /MODALITY IMAGE ANALYSIScind.ucsf.edu/sites/cind.ucsf.edu/files/wysiwyg/education/... · MULTI-VARIATE /MODALITY IMAGE ANALYSIS ... Curse of dimensionality ... • For

Diagonalizing the covariance • Covariance matrix

• Diagonalizing the covariance suppresses cross-dimensional co-activity

•  if z1 is high, z2 won’t be…

Cov(z1, z2 )=z1Tz1 z1

Tz2z2Tz1 z2

Tz2

!

"

##

$

%

&&/ N

Cov(z1, z2 )=z1Tz1 z1

Tz2z2Tz1 z2

Tz2

!

"

##

$

%

&&/ N = 1 0

0 1

!

"#

$

%&= Ι

Page 13: MULTI-VARIATE /MODALITY IMAGE ANALYSIScind.ucsf.edu/sites/cind.ucsf.edu/files/wysiwyg/education/... · MULTI-VARIATE /MODALITY IMAGE ANALYSIS ... Curse of dimensionality ... • For

Problem definition • For a given input X

• Find a feature matrix W

• So that the weights decorrelate

• Solving for diagonalization!

WX( ) WX( )T = NΙ ⇒ ZZT = NΙ

⇒WCov(X)WT = Ι

Page 14: MULTI-VARIATE /MODALITY IMAGE ANALYSIScind.ucsf.edu/sites/cind.ucsf.edu/files/wysiwyg/education/... · MULTI-VARIATE /MODALITY IMAGE ANALYSIS ... Curse of dimensionality ... • For

Solving for diagonalization Covariance matrices are positive definite • Therefore symmetric

•  have orthogonal eigenvectors and real eigenvalues

• and are factorizable by:

where U has eigenvectors of A in its columns Λ=diag(λi), where λi are the eigenvalues of A

UTAU=Λ

Page 15: MULTI-VARIATE /MODALITY IMAGE ANALYSIScind.ucsf.edu/sites/cind.ucsf.edu/files/wysiwyg/education/... · MULTI-VARIATE /MODALITY IMAGE ANALYSIS ... Curse of dimensionality ... • For

Decorrelation • An implicit Gaussian assumption

•  N-D data has N directions of variance

Page 16: MULTI-VARIATE /MODALITY IMAGE ANALYSIScind.ucsf.edu/sites/cind.ucsf.edu/files/wysiwyg/education/... · MULTI-VARIATE /MODALITY IMAGE ANALYSIS ... Curse of dimensionality ... • For

Undoing the variance •  The decorrelating matrix W contains two vectors that

normalize the input’s variance

Page 17: MULTI-VARIATE /MODALITY IMAGE ANALYSIScind.ucsf.edu/sites/cind.ucsf.edu/files/wysiwyg/education/... · MULTI-VARIATE /MODALITY IMAGE ANALYSIS ... Curse of dimensionality ... • For

Resulting transform •  Input gets scaled to a well behaved Gaussian with unit

variance in all dimensions

Input data Transformed data (weights)

Page 18: MULTI-VARIATE /MODALITY IMAGE ANALYSIScind.ucsf.edu/sites/cind.ucsf.edu/files/wysiwyg/education/... · MULTI-VARIATE /MODALITY IMAGE ANALYSIS ... Curse of dimensionality ... • For

A more complex case • Having correlation between two dimensions

•  We still find the directions of maximal variance •  But we also rotate in addition to scaling

Input data Transformed data (weights)

Page 19: MULTI-VARIATE /MODALITY IMAGE ANALYSIScind.ucsf.edu/sites/cind.ucsf.edu/files/wysiwyg/education/... · MULTI-VARIATE /MODALITY IMAGE ANALYSIS ... Curse of dimensionality ... • For

One more detail • So far we considered zero-mean inputs

•  The transforming operation was a rotation

•  If the input mean is not zero bad things happen! •  Make sure that your data is zero-mean!

Input data Transformed data (weights)

Page 20: MULTI-VARIATE /MODALITY IMAGE ANALYSIScind.ucsf.edu/sites/cind.ucsf.edu/files/wysiwyg/education/... · MULTI-VARIATE /MODALITY IMAGE ANALYSIS ... Curse of dimensionality ... • For

Principal component analysis • This transform is known as PCA

• The features are the principal components • They are orthogonal to each other • And produce orthogonal (white) weights

• Major tool in statistics • Removes dependencies from multivariate data

• Also known as the KLT • Karhunen-Loeve transform

Page 21: MULTI-VARIATE /MODALITY IMAGE ANALYSIScind.ucsf.edu/sites/cind.ucsf.edu/files/wysiwyg/education/... · MULTI-VARIATE /MODALITY IMAGE ANALYSIS ... Curse of dimensionality ... • For

A better way to compute PCA • The Singular Value Decomposition way • Relationship to eigendecomposition

•  In our case (covariance input A), U and S will hold the eigenvectors/values of A

• Why the SVD? • More stable, more robust, fancy extensions

U,S,V[ ] = SVD(A)⇒ A=USVT

Page 22: MULTI-VARIATE /MODALITY IMAGE ANALYSIScind.ucsf.edu/sites/cind.ucsf.edu/files/wysiwyg/education/... · MULTI-VARIATE /MODALITY IMAGE ANALYSIS ... Curse of dimensionality ... • For

Dimensionality reduction • PCA is great for high dimensional data

• Allows us to perform dimensionality reduction • Helps us find relevant structure in data • Helps us throw away things that won’t matter

Page 23: MULTI-VARIATE /MODALITY IMAGE ANALYSIScind.ucsf.edu/sites/cind.ucsf.edu/files/wysiwyg/education/... · MULTI-VARIATE /MODALITY IMAGE ANALYSIS ... Curse of dimensionality ... • For

What is the number of dimensions? •  If the input was M dimensional, how many dimensions do we keep? • No solid answer (estimators exist)

• Look at the singular-/eigen-values •  They will show the variance of each component, at

some point it will be small

Page 24: MULTI-VARIATE /MODALITY IMAGE ANALYSIScind.ucsf.edu/sites/cind.ucsf.edu/files/wysiwyg/education/... · MULTI-VARIATE /MODALITY IMAGE ANALYSIS ... Curse of dimensionality ... • For

Example • Eigenvalues of 1200 dimensional video data

•  Little variance after component 30 •  We don’t need to keep the rest of the data

Dimension reduction occurs by ignoring the directions in which the covariance is small.

Page 25: MULTI-VARIATE /MODALITY IMAGE ANALYSIScind.ucsf.edu/sites/cind.ucsf.edu/files/wysiwyg/education/... · MULTI-VARIATE /MODALITY IMAGE ANALYSIS ... Curse of dimensionality ... • For

Recap… PCA decorrelates multivariate data, finds useful components, reduces dimensionality. • PCA is only powerful if the biological question is related to the highest variance in the dataset

•  If not other techniques are more useful : Independent Component Analysis

Page 26: MULTI-VARIATE /MODALITY IMAGE ANALYSIScind.ucsf.edu/sites/cind.ucsf.edu/files/wysiwyg/education/... · MULTI-VARIATE /MODALITY IMAGE ANALYSIS ... Curse of dimensionality ... • For

Limitations of PCA • PCA may not find the best directions for discriminating

between two classes. •  1st eigenvector is best for representing the probabilities. •  2nd eigenvector is best for discrimination.

Page 27: MULTI-VARIATE /MODALITY IMAGE ANALYSIScind.ucsf.edu/sites/cind.ucsf.edu/files/wysiwyg/education/... · MULTI-VARIATE /MODALITY IMAGE ANALYSIS ... Curse of dimensionality ... • For

Fisher’s linear discriminant analysis (LDA) •  Performs dimensionality reduction “while preserving as much of the

class discriminatory information as possible”.

•  Seeks to find directions along which the classes are best separated.

•  Takes into consideration the scatter within-classes but also the scatter between-classes.

z1 z2

Page 28: MULTI-VARIATE /MODALITY IMAGE ANALYSIScind.ucsf.edu/sites/cind.ucsf.edu/files/wysiwyg/education/... · MULTI-VARIATE /MODALITY IMAGE ANALYSIS ... Curse of dimensionality ... • For

LDA via PCA •  PCA is first applied to the data set to reduce its dimensionality.

•  LDA is then applied to find the most discriminative directions.

x1x2xN

!

"

#####

$

%

&&&&&

PCA' →''''

y1y2yK

!

"

#####

$

%

&&&&&

y1y2yK

!

"

#####

$

%

&&&&&

LDA' →''''

z1z2zC−1

!

"

#####

$

%

&&&&&

Page 29: MULTI-VARIATE /MODALITY IMAGE ANALYSIScind.ucsf.edu/sites/cind.ucsf.edu/files/wysiwyg/education/... · MULTI-VARIATE /MODALITY IMAGE ANALYSIS ... Curse of dimensionality ... • For

PCA fails…

•  Focus on uncorrelated and Gaussian components

•  Second-order statistics •  Orthogonal transformation

•  Focus on independent and non-Gaussian components

•  Higher-order statistics •  Non-orthogonal transformation

Page 30: MULTI-VARIATE /MODALITY IMAGE ANALYSIScind.ucsf.edu/sites/cind.ucsf.edu/files/wysiwyg/education/... · MULTI-VARIATE /MODALITY IMAGE ANALYSIS ... Curse of dimensionality ... • For

Statistical independence •  If we know something about x, that should tell us nothing

about y.

0 0.5 10

0.5

1

1.5

0 0.5 10

0.2

0.4

0.6

0.8

1

Dependent Independent

Page 31: MULTI-VARIATE /MODALITY IMAGE ANALYSIScind.ucsf.edu/sites/cind.ucsf.edu/files/wysiwyg/education/... · MULTI-VARIATE /MODALITY IMAGE ANALYSIS ... Curse of dimensionality ... • For

Concept of ICA •  A given signal (x) is generated by linear mixing (A) of

independent components (s) •  ICA is a statistical analysis method to estimate those

independent components (z) and mixing rule (W)

WAsWxz ==

s A x W z

s1 s2 s3 sM

Aij x1 x2 x3 xM

Wij z1 z2 z3 zM

Observed Unknown

Blind source

separation

Page 32: MULTI-VARIATE /MODALITY IMAGE ANALYSIScind.ucsf.edu/sites/cind.ucsf.edu/files/wysiwyg/education/... · MULTI-VARIATE /MODALITY IMAGE ANALYSIS ... Curse of dimensionality ... • For

Example: PCA and ICA

⎥⎦

⎤⎢⎣

⎡⎥⎦

⎤⎢⎣

⎡=⎥

⎤⎢⎣

2

1

2221

1211

2

1

ss

aaaa

xx

Model:

Page 33: MULTI-VARIATE /MODALITY IMAGE ANALYSIScind.ucsf.edu/sites/cind.ucsf.edu/files/wysiwyg/education/... · MULTI-VARIATE /MODALITY IMAGE ANALYSIS ... Curse of dimensionality ... • For

How it works?

Page 34: MULTI-VARIATE /MODALITY IMAGE ANALYSIScind.ucsf.edu/sites/cind.ucsf.edu/files/wysiwyg/education/... · MULTI-VARIATE /MODALITY IMAGE ANALYSIS ... Curse of dimensionality ... • For

Rational of ICA •  Find the components Si that are as independent as

possible in the sens of maximizing some function F(s1,s2,…,sk) that measures indepedence

• All ICs (except 1) should be non-Gaussian

•  The variance of all ICs is 1

•  There is no hierarchy between ICs

Page 35: MULTI-VARIATE /MODALITY IMAGE ANALYSIScind.ucsf.edu/sites/cind.ucsf.edu/files/wysiwyg/education/... · MULTI-VARIATE /MODALITY IMAGE ANALYSIS ... Curse of dimensionality ... • For

How to find ICs ? • Many choices of objective function F • Mutual information

• Use the kurtosis of the variables to approximate the distribution function

•  The number of ICs is chosen by the user

)()...()(),...,,(),...,,(

2211

2121

kk

kk sfsfsf

sssfLogsssfMI ∫=

Page 36: MULTI-VARIATE /MODALITY IMAGE ANALYSIScind.ucsf.edu/sites/cind.ucsf.edu/files/wysiwyg/education/... · MULTI-VARIATE /MODALITY IMAGE ANALYSIS ... Curse of dimensionality ... • For

Difference with PCA •  It is not a dimensionality reduction technique.

•  There is no single (exact) solution for components (in R: FastICA, PearsonICA, MLICA)

•  ICs are of course uncorrelated but also as independent as possible

• Uninteresting for Gaussian distributed variables

Page 37: MULTI-VARIATE /MODALITY IMAGE ANALYSIScind.ucsf.edu/sites/cind.ucsf.edu/files/wysiwyg/education/... · MULTI-VARIATE /MODALITY IMAGE ANALYSIS ... Curse of dimensionality ... • For

Feature Extraction in ECG data (Raw Data)

Page 38: MULTI-VARIATE /MODALITY IMAGE ANALYSIScind.ucsf.edu/sites/cind.ucsf.edu/files/wysiwyg/education/... · MULTI-VARIATE /MODALITY IMAGE ANALYSIS ... Curse of dimensionality ... • For

Feature Extraction in ECG data (PCA)

Page 39: MULTI-VARIATE /MODALITY IMAGE ANALYSIScind.ucsf.edu/sites/cind.ucsf.edu/files/wysiwyg/education/... · MULTI-VARIATE /MODALITY IMAGE ANALYSIS ... Curse of dimensionality ... • For

Feature Extraction in ECG data (Extended ICA)

Page 40: MULTI-VARIATE /MODALITY IMAGE ANALYSIScind.ucsf.edu/sites/cind.ucsf.edu/files/wysiwyg/education/... · MULTI-VARIATE /MODALITY IMAGE ANALYSIS ... Curse of dimensionality ... • For

Feature Extraction in ECG data (flexible ICA)

Page 41: MULTI-VARIATE /MODALITY IMAGE ANALYSIScind.ucsf.edu/sites/cind.ucsf.edu/files/wysiwyg/education/... · MULTI-VARIATE /MODALITY IMAGE ANALYSIS ... Curse of dimensionality ... • For

Application domains of ICA

•  Image denoising • Medical signal processing – fMRI, ECG, EEG •  Feature extraction, face recognition • Compression, redundancy reduction • Watermarking • Clustering •  Time series analysis (stock market, microarray data) •  Topic extraction • Econometrics: Finding hidden factors in financial data

Page 42: MULTI-VARIATE /MODALITY IMAGE ANALYSIScind.ucsf.edu/sites/cind.ucsf.edu/files/wysiwyg/education/... · MULTI-VARIATE /MODALITY IMAGE ANALYSIS ... Curse of dimensionality ... • For

Gene clusters [Hsiao et al. 2002] • Each gene is mapped to a

point based on the value assigned to the gene in the 14th (x-axis), 15th (y-axis) and 55th (z-axis) independent components: - Red - enriched with liver-

specific genes - Orange – enriched with

muscle-specific genes - Green – enriched with vulva-

specific genes - Yellow – genes not annotated

as liver-, muscle-, or vulva-specific

Page 43: MULTI-VARIATE /MODALITY IMAGE ANALYSIScind.ucsf.edu/sites/cind.ucsf.edu/files/wysiwyg/education/... · MULTI-VARIATE /MODALITY IMAGE ANALYSIS ... Curse of dimensionality ... • For

Segmentation by ICA Image, I ICA Filter Bank

With n filters I1, I2,…, In

Clustering Segmented image, C

Above is an unsupervised setting. Segmentation (i.e., classification in this context) can also be performed by a supervised method on the output feature images I1, I2 , …, In.

A texture image Segmentation

Jenssen and Eltoft, “ICA filter bank for segmentation of textured images,” 4th International symposium on ICA and BSS, Nara, Japan, 2003

These are filter responses

Page 44: MULTI-VARIATE /MODALITY IMAGE ANALYSIScind.ucsf.edu/sites/cind.ucsf.edu/files/wysiwyg/education/... · MULTI-VARIATE /MODALITY IMAGE ANALYSIS ... Curse of dimensionality ... • For

Recap: PCA vs LDA vs ICA • PCA : Proper for dimension reduction

•  LDA : Proper for pattern classification if the number of training samples of each class are large

•  ICA : Proper for blind source separation or classification using ICs when class id of training data is not available

Page 45: MULTI-VARIATE /MODALITY IMAGE ANALYSIScind.ucsf.edu/sites/cind.ucsf.edu/files/wysiwyg/education/... · MULTI-VARIATE /MODALITY IMAGE ANALYSIS ... Curse of dimensionality ... • For

Multimodal Data Collection

EEG fMRI

T1/T2

Other*

DTI

Brain Function (spatiotemporal,

task-related)

Brain Structure (spatial)

fMRI fMRI EEG EEG

Covariates (Age, etc.)

Task 1-N Task 1-N

Genetic Data (spatial/chromosomal)

SNP Gene

Expression

Page 46: MULTI-VARIATE /MODALITY IMAGE ANALYSIScind.ucsf.edu/sites/cind.ucsf.edu/files/wysiwyg/education/... · MULTI-VARIATE /MODALITY IMAGE ANALYSIS ... Curse of dimensionality ... • For

Joint ICA • Variation on ICA to look for components that appear jointly

across features or modalities

ΧModality 1 ΧModality 2"#$

%&'= Α× SModality 1 SModality 2

"#$

%&'

=

Page 47: MULTI-VARIATE /MODALITY IMAGE ANALYSIScind.ucsf.edu/sites/cind.ucsf.edu/files/wysiwyg/education/... · MULTI-VARIATE /MODALITY IMAGE ANALYSIS ... Curse of dimensionality ... • For

jICA Results: Multitask fMRI One component

Significant at p<0.01

Calhoun VD, Adali T, Kiehl KA, Astur RS, Pekar JJ, Pearlson GD. (2006): A Method for Multi-task fMRI Data Fusion Applied to Schizophrenia. Hum.Brain Map. 27(7):598-610.

Page 48: MULTI-VARIATE /MODALITY IMAGE ANALYSIScind.ucsf.edu/sites/cind.ucsf.edu/files/wysiwyg/education/... · MULTI-VARIATE /MODALITY IMAGE ANALYSIS ... Curse of dimensionality ... • For

jICA Results: Joint Histograms

Calhoun VD, Adali T, Kiehl KA, Astur RS, Pekar JJ, Pearlson GD. (2006): A Method for Multi-task fMRI Data Fusion Applied to Schizophrenia. Hum.Brain Map. 27(7):598-610.

Page 49: MULTI-VARIATE /MODALITY IMAGE ANALYSIScind.ucsf.edu/sites/cind.ucsf.edu/files/wysiwyg/education/... · MULTI-VARIATE /MODALITY IMAGE ANALYSIS ... Curse of dimensionality ... • For

Parallel ICA: Two Goals

A1

X1

S2 S1

W2 W1

X2

Data1: (fMRI) Data2: (SNP)

A2

Identify Hidden Features

Identify Hidden Features

Identify Linked Features

J. Liu, O. Demirci, and V. D. Calhoun, "A Parallel Independent Component Analysis Approach to Investigate Genomic Influence on Brain Function," IEEE Signal Proc. Letters, vol. 15, pp. 413-416, 2008.

MAX : {H(Y1) + H(Y2) }, <Infomax> 21i 2j2

1 21i 2j

Cov(a ,a )g( )=Correlation(A ,A ) =

Var(a )×Var(a )⋅Subject to: arg max g{W1,W2 ŝ1,ŝ2 },

Page 50: MULTI-VARIATE /MODALITY IMAGE ANALYSIScind.ucsf.edu/sites/cind.ucsf.edu/files/wysiwyg/education/... · MULTI-VARIATE /MODALITY IMAGE ANALYSIS ... Curse of dimensionality ... • For

Example 2: Default Mode Network & Multiple Genetic Factors

fMRI Component t-map fMRI Component Mixing Coefficients (ρ=0.31)

SNPs Component

rs1011313 rs10503929 rs10936143 rs2150157 rs2236797

rs4784320 rs6136 rs6578993 rs7702598 rs7961819

rs3757934 rs3772917 rs8027035

J. Liu, G. D. Pearlson, A. Windemuth, G. Ruano, N. I. Perrone-Bizzozero, and V. D. Calhoun, "Combining fMRI and SNP data to investigate connections between brain function and genetics using parallel ICA," Hum.Brain Map., vol. 30, pp. 241-255, 2009.

Page 51: MULTI-VARIATE /MODALITY IMAGE ANALYSIScind.ucsf.edu/sites/cind.ucsf.edu/files/wysiwyg/education/... · MULTI-VARIATE /MODALITY IMAGE ANALYSIS ... Curse of dimensionality ... • For

Partial least squares (PLS) • Understanding relationships between process & response variables

Page 52: MULTI-VARIATE /MODALITY IMAGE ANALYSIScind.ucsf.edu/sites/cind.ucsf.edu/files/wysiwyg/education/... · MULTI-VARIATE /MODALITY IMAGE ANALYSIS ... Curse of dimensionality ... • For

PLS fundamentals

X Space Y Space

Planes Projections

X1

X2

X3

Y1

Y2

Y3

Page 53: MULTI-VARIATE /MODALITY IMAGE ANALYSIScind.ucsf.edu/sites/cind.ucsf.edu/files/wysiwyg/education/... · MULTI-VARIATE /MODALITY IMAGE ANALYSIS ... Curse of dimensionality ... • For

PLS is the multivariate version of regression • PLS uses two different PCA models, one for the X’s and

one for the Y’s, and finds the links between the two.

• Mathematically, the difference is as follows: •  In PCA, we are maximizing the variance that is explained by the

model. •  In PLS, we are maximizing the covariance.

Page 54: MULTI-VARIATE /MODALITY IMAGE ANALYSIScind.ucsf.edu/sites/cind.ucsf.edu/files/wysiwyg/education/... · MULTI-VARIATE /MODALITY IMAGE ANALYSIS ... Curse of dimensionality ... • For

Conceptually how PLS works? A step-wise process: • PLS finds a set of orthogonal components that :

•  maximize the level of explanation of both X and Y •  provide a predictive equation for Y in terms of the X’s

•  This is done by: •  fitting a set of components to X (as in PCA) •  similarly fitting a set of components to Y •  reconciling the two sets of components so as to maximize

explanation of X and Y

• PLS finds components of X that are also relevant to Y •  Latent vectors are components that simultaneously decompose X

and Y •  Latent vectors explain the covariance between X and Y

Page 55: MULTI-VARIATE /MODALITY IMAGE ANALYSIScind.ucsf.edu/sites/cind.ucsf.edu/files/wysiwyg/education/... · MULTI-VARIATE /MODALITY IMAGE ANALYSIS ... Curse of dimensionality ... • For

PLS – the “Inner Relation” •  The way PLS works visually is by “tweeking” the two PCA

models (X and Y) until their covariance is optimised. It is this “tweeking” that led to the name partial least-squares.

All 3 are solved simultaneously via numerical methods

Page 56: MULTI-VARIATE /MODALITY IMAGE ANALYSIScind.ucsf.edu/sites/cind.ucsf.edu/files/wysiwyg/education/... · MULTI-VARIATE /MODALITY IMAGE ANALYSIS ... Curse of dimensionality ... • For

Canonical correlation analysis (CCA) •  CCA was developed first by H. Hotelling.

•  H. Hotelling. Relations between two sets of variates. Biometrika, 28:321-377, 1936.

•  CCA measures the linear relationship between two multidimensional variables.

•  CCA finds two bases, one for each variable, that are optimal with respect to correlations.

•  Applications in economics, medical studies, bioinformatics and other areas.

Page 57: MULTI-VARIATE /MODALITY IMAGE ANALYSIScind.ucsf.edu/sites/cind.ucsf.edu/files/wysiwyg/education/... · MULTI-VARIATE /MODALITY IMAGE ANALYSIS ... Curse of dimensionality ... • For

Canonical correlation analysis (CCA) •  Two multidimensional variables

•  Two different measurement on the same set of objects •  Web images and associated text •  Protein (or gene) sequences and related literature (text) •  Protein sequence and corresponding gene expression •  In classification: feature vector and class label

•  Two measurements on the same object are likely to be correlated. •  May not be obvious on the original measurements. •  Find the maximum correlation on transformed space.

Page 58: MULTI-VARIATE /MODALITY IMAGE ANALYSIScind.ucsf.edu/sites/cind.ucsf.edu/files/wysiwyg/education/... · MULTI-VARIATE /MODALITY IMAGE ANALYSIS ... Curse of dimensionality ... • For

Canonical correlation analysis (CCA)

TXXW

Correlation

TYYW

measurement transformation Transformed data

Page 59: MULTI-VARIATE /MODALITY IMAGE ANALYSIScind.ucsf.edu/sites/cind.ucsf.edu/files/wysiwyg/education/... · MULTI-VARIATE /MODALITY IMAGE ANALYSIS ... Curse of dimensionality ... • For

Problem definition •  Find two sets of basis vectors, one for x and the other for

y, such that the correlations between the projections of the variables onto these basis vectors are maximized.

><→ ywy y ,

: and yx ww

Given

Compute two basis vectors

Page 60: MULTI-VARIATE /MODALITY IMAGE ANALYSIScind.ucsf.edu/sites/cind.ucsf.edu/files/wysiwyg/education/... · MULTI-VARIATE /MODALITY IMAGE ANALYSIS ... Curse of dimensionality ... • For

Problem definition •  Compute the two basis vectors so that the correlations

of the projections onto these vectors are maximized.

60

Page 61: MULTI-VARIATE /MODALITY IMAGE ANALYSIScind.ucsf.edu/sites/cind.ucsf.edu/files/wysiwyg/education/... · MULTI-VARIATE /MODALITY IMAGE ANALYSIS ... Curse of dimensionality ... • For

Least squares regression l Estimate the parameters β based on a set of training data:

(x1, y1)…(xN, yN)

l Minimize residual sum of squares

2

1 10)( ∑ ∑

= =⎟⎟

⎜⎜

⎛−−=

N

i

p

jjiji xyRSS βββ

( ) yTT XXX 1ˆ −=β poor numeric properties!

Page 62: MULTI-VARIATE /MODALITY IMAGE ANALYSIScind.ucsf.edu/sites/cind.ucsf.edu/files/wysiwyg/education/... · MULTI-VARIATE /MODALITY IMAGE ANALYSIS ... Curse of dimensionality ... • For

Subset selection •  We want to eliminate unnecessary features •  Use additional penalties to reduce coefficients: coefficient

shrinkage •  Ridge Regression

•  Minimize least squares s.t.

•  The Lasso •  Minimize least squares s.t.

•  Principal Components Regression •  Regress on M < p principal components of X

•  Partial Least Squares •  Regress on M < p directions of X weighted by y

∑=

≤p

jj s

1

|| β

∑=

≤p

jj s

1

Page 63: MULTI-VARIATE /MODALITY IMAGE ANALYSIScind.ucsf.edu/sites/cind.ucsf.edu/files/wysiwyg/education/... · MULTI-VARIATE /MODALITY IMAGE ANALYSIS ... Curse of dimensionality ... • For

Prostate Cancer Data Example

Page 64: MULTI-VARIATE /MODALITY IMAGE ANALYSIScind.ucsf.edu/sites/cind.ucsf.edu/files/wysiwyg/education/... · MULTI-VARIATE /MODALITY IMAGE ANALYSIS ... Curse of dimensionality ... • For

Error comparison

Page 65: MULTI-VARIATE /MODALITY IMAGE ANALYSIScind.ucsf.edu/sites/cind.ucsf.edu/files/wysiwyg/education/... · MULTI-VARIATE /MODALITY IMAGE ANALYSIS ... Curse of dimensionality ... • For

Shrinkage methods: Ridge regression

Page 66: MULTI-VARIATE /MODALITY IMAGE ANALYSIScind.ucsf.edu/sites/cind.ucsf.edu/files/wysiwyg/education/... · MULTI-VARIATE /MODALITY IMAGE ANALYSIS ... Curse of dimensionality ... • For

Shrinkage methods: Lasso regression

Page 67: MULTI-VARIATE /MODALITY IMAGE ANALYSIScind.ucsf.edu/sites/cind.ucsf.edu/files/wysiwyg/education/... · MULTI-VARIATE /MODALITY IMAGE ANALYSIS ... Curse of dimensionality ... • For

A unifying view • We can view all the linear regression techniques under a

common framework

λ includes bias, q indicates a prior distribution on β •  λ=0: least squares •  λ>0, q=0: subset selection (counts number of nonzero parameters) •  λ>0, q=1: the lasso •  λ>0, q=2: ridge regression

⎪⎭

⎪⎬⎫

⎪⎩

⎪⎨⎧

+⎟⎟

⎜⎜

⎛−−= ∑∑ ∑

== =

p

j

qj

N

i

p

jjiji xy

1

2

1 10 ||minargˆ βλβββ β

Page 68: MULTI-VARIATE /MODALITY IMAGE ANALYSIScind.ucsf.edu/sites/cind.ucsf.edu/files/wysiwyg/education/... · MULTI-VARIATE /MODALITY IMAGE ANALYSIS ... Curse of dimensionality ... • For

Family of shrinkage regression

Page 69: MULTI-VARIATE /MODALITY IMAGE ANALYSIScind.ucsf.edu/sites/cind.ucsf.edu/files/wysiwyg/education/... · MULTI-VARIATE /MODALITY IMAGE ANALYSIS ... Curse of dimensionality ... • For

Atrophy associated with delayed memory

Spatial correlation

Ridge regression