
Independent Component Analysis and Unsupervised Learning

Jen-Tzung Chien

TABLE OF CONTENTS

1. Independent Component Analysis

2. Case Study I: Speech Recognition
   • Independent voices
   • Nonparametric likelihood ratio ICA

3. Case Study II: Blind Source Separation
   • Convex divergence ICA
   • Nonstationary Bayesian ICA
   • Online Gaussian process ICA

4. Summary

Introduction

• Independent component analysis (ICA) is essential for blind source separation.

• ICA is applied to separate the mixed signals and find the independent components.

• The demixed components can be grouped into clusters where the intra-cluster elements are dependent and the inter-cluster elements are independent.

• ICA provides an unsupervised learning approach to acoustic modeling, signal separation, and many other tasks.

APSIPA DL: Independent Component Analysis and Unsupervised Learning 3

Blind Source Separation

• Cocktail-party problem

• Goal
  − Unknown: A and s
  − Reconstruct the source signals via demixing matrix W
  − Mixing matrix A is assumed to be fixed


Independent Component Analysis

• Three assumptions
  − the sources are statistically independent
  − the independent components have non-Gaussian distributions
  − the mixing matrix is square

The linear mixing model is

x_t = A s_t,   i.e.   x_m(t) = Σ_{i=1}^{M} a_{mi} s_i(t),   m = 1, …, M,

where s_t collects the M independent sources, A is the square mixing matrix, and x_t the observed mixtures.
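As a minimal illustration of the mixing model (the signals and matrix values below are made up for the demo), mixing two independent non-Gaussian sources with a fixed square A and then applying W = A⁻¹ recovers them exactly; ICA must instead estimate W blindly from x alone:

```python
import numpy as np

rng = np.random.default_rng(0)

# Two independent, non-Gaussian sources (hypothetical example signals):
# a uniform (sub-Gaussian) source and a Laplacian (super-Gaussian) source.
n = 1000
s = np.vstack([rng.uniform(-1, 1, n),
               rng.laplace(0, 1, n)])

# Square mixing matrix A (assumed fixed); observations x = A s.
A = np.array([[0.8, 0.2],
              [0.3, 0.7]])
x = A @ s

# With A known, the ideal demixing matrix is W = A^{-1}; ICA has to
# estimate W without seeing A or s, up to permutation and scaling.
W = np.linalg.inv(A)
s_hat = W @ x
print(np.allclose(s_hat, s))  # True
```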


ICA Objective Function


ICA Learning Rule

• The ICA demixing matrix can be estimated by optimizing an objective function via a gradient-descent or natural-gradient algorithm.
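The slides do not spell the update out, but a common concrete instance is the Infomax-style natural-gradient rule W ← W + η (I − E[φ(y) yᵀ]) W; in the sketch below the score function φ = tanh, the step size, and the demo signals are all assumptions, not the deck's exact algorithm:

```python
import numpy as np

def natural_gradient_ica(x, n_iter=500, lr=0.05, seed=0):
    """Estimate a demixing matrix W by the natural-gradient rule
    W <- W + lr * (I - E[phi(y) y^T]) W, with phi(y) = tanh(y),
    a score function suited to super-Gaussian sources."""
    m, T = x.shape
    rng = np.random.default_rng(seed)
    W = np.eye(m) + 0.01 * rng.standard_normal((m, m))  # near-identity init
    for _ in range(n_iter):
        y = W @ x
        phi = np.tanh(y)                                # score function
        W += lr * (np.eye(m) - (phi @ y.T) / T) @ W     # natural gradient step
    return W

# Demo on two mixed Laplacian (super-Gaussian) sources.
rng = np.random.default_rng(1)
s = rng.laplace(size=(2, 5000))
A = np.array([[0.9, 0.4], [0.3, 0.8]])
W = natural_gradient_ica(A @ s)

# W @ A should be close to a scaled permutation matrix (the usual
# ICA indeterminacies).
P = W @ A
print(np.round(P, 2))
```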


TABLE OF CONTENTS

1. Independent Component Analysis

2. Case Study I: Speech Recognition
   • Independent voices
   • Nonparametric likelihood ratio ICA

3. Case Study II: Blind Source Separation
   • Convex divergence ICA
   • Nonstationary Bayesian ICA
   • Online Gaussian process ICA

4. Summary

ICA for Speech Recognition

• Mismatch between training and test data always exists, so adaptation of HMM parameters is important.

• Eigenvoice (PCA) versus independent voice (ICA)
  − PCA performs a linear decorrelation process
  − ICA extracts higher-order statistics

uncorrelatedness (PCA):   E[e_1 e_2 ⋯ e_M] = E[e_1] E[e_2] ⋯ E[e_M]

higher-order correlations are zero (ICA):   E[s_1^r s_2^r ⋯ s_M^r] = E[s_1^r] E[s_2^r] ⋯ E[s_M^r]


Sparseness & Information Redundancy

• The degree of sparseness in the distribution of the transformed signals is proportional to the amount of information conveyed by the transformation.

• Sparseness measurement
  − fourth-order statistics (kurtosis) measure non-Gaussianity

• Information redundancy reduction using ICA is higher than that using PCA.

kurt(s) = E[s^4] − 3 (E[s^2])^2
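The kurtosis measure above is straightforward to compute; a quick sketch (the sample sizes and distributions are chosen only for illustration):

```python
import numpy as np

def kurt(s):
    """Fourth-order sparseness measure: kurt(s) = E[s^4] - 3 (E[s^2])^2.
    Zero for a Gaussian; positive for super-Gaussian (sparse, peaky)
    signals; negative for sub-Gaussian ones."""
    s = s - s.mean()
    return np.mean(s**4) - 3 * np.mean(s**2) ** 2

rng = np.random.default_rng(0)
n = 200_000
print(kurt(rng.normal(size=n)))          # ≈ 0  (Gaussian)
print(kurt(rng.laplace(size=n)))         # > 0  (super-Gaussian, sparse)
print(kurt(rng.uniform(-1, 1, size=n)))  # < 0  (sub-Gaussian)
```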



Eigenvoices versus Independent Voices


Evaluation of Kurtosis

[Figure: kurtosis versus voice index for independent voices and eigenvoices.]


Word Error Rates on Aurora2

K: number of components; L: number of adaptation sentences. Results shown for K = 10 and K = 15.

[Figure: word error rates (%) for no adaptation, eigenvoice (L = 5, 10, 15), and independent voice (L = 5, 10, 15).]

TABLE OF CONTENTS

1. Independent Component Analysis

2. Case Study I: Speech Recognition
   • Independent voices
   • Nonparametric likelihood ratio ICA

3. Case Study II: Blind Source Separation
   • Convex divergence ICA
   • Nonstationary Bayesian ICA
   • Online Gaussian process ICA

4. Summary

Test of Independence

• Given the demixed signals, the null hypothesis H0 (the components are independent) and the alternative hypothesis H1 (the components are dependent) are defined.

• If y is Gaussian distributed, we are testing whether the correlation between each pair of components is equal to zero.


Likelihood Ratio

• The likelihood ratio (LR) serves as the test statistic which measures the confidence for the alternative hypothesis against the null hypothesis.

• The LR is a measure of independence and can act as an objective function for finding the ICA demixing matrix.

• However, Gaussianity cannot be assumed in the ICA problem.


Nonparametric Approach

• Let each sample be transformed by the demixing matrix.

• Instead of assuming Gaussianity, we apply kernel density estimation using a Gaussian kernel.

• The kernel centroids are given by the transformed samples.
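A minimal sketch of such a kernel density estimate (the bandwidth, data, and function name are assumptions, not the slides' exact estimator):

```python
import numpy as np

def gaussian_kde(y, centroids, h):
    """Parzen-window estimate with a Gaussian kernel:
    p_hat(y) = (1/N) sum_n N(y; mu_n, h^2),
    avoiding any parametric (e.g. Gaussian) assumption on the signals."""
    y = np.atleast_1d(y)[:, None]            # evaluation points (column)
    mu = np.asarray(centroids)[None, :]      # kernel centroids (the samples)
    k = np.exp(-0.5 * ((y - mu) / h) ** 2) / (np.sqrt(2 * np.pi) * h)
    return k.mean(axis=1)

# Sanity check on a Laplacian sample (hypothetical data): the density
# estimate should carry total probability mass close to 1.
rng = np.random.default_rng(0)
sample = rng.laplace(size=2000)
grid = np.linspace(-10, 10, 2001)
p = gaussian_kde(grid, sample, h=0.3)
print(p.sum() * (grid[1] - grid[0]))  # ≈ 1.0
```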


Nonparametric Likelihood Ratio

• NLR objective function

with a multivariate Gaussian kernel


ICA Learning Procedure

Output W


• Log-likelihood ratio for the null and alternative hypotheses

• Maximizing it with respect to the model parameters, we obtain the update rules.

[Diagram: training data are Viterbi-aligned with the lexicon; segment-based supervectors X are collected for each subword unit; NLR-ICA transforms X into Y; K-means groups the result into clusters 1…M; cluster-dependent HMMs 1…M are trained and applied by the speech recognizer to the test data to produce the recognition result.]

Segment-Based Supervector

[Diagram: aligned utterances yield segment supervectors x_1, x_2, …, x_T, which are stacked into the segment-based supervector matrix X = [x_1 x_2 ⋯ x_T].]

TABLE OF CONTENTS

1. Independent Component Analysis

2. Case Study I: Speech Recognition
   • Independent voices
   • Nonparametric likelihood ratio ICA

3. Case Study II: Blind Source Separation
   • Convex divergence ICA
   • Nonstationary Bayesian ICA
   • Online Gaussian process ICA

4. Summary

ICA Objective Function

• Negentropy
• Maximum likelihood
• Kurtosis
• Divergence measures
  − Kullback-Leibler (KL) divergence
  − Euclidean divergence
  − Cauchy-Schwarz divergence
  − α-divergence
  − Convex divergence (C-DIV)

Mutual Information & KL Divergence


• Mutual information between two variables is defined by using the Shannon entropy.

• It can be formulated as the KL divergence, or relative entropy, between the joint distribution and the product of the marginal distributions:

I(y_i ; y_j) = D_KL( p(y_i, y_j) ‖ p(y_i) p(y_j) ) = Σ p(y_i, y_j) log [ p(y_i, y_j) / ( p(y_i) p(y_j) ) ]
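For discrete distributions this relation is easy to verify numerically (the joint tables below are made-up examples):

```python
import numpy as np

def mutual_information(p_xy):
    """I(X;Y) = KL( p(x,y) || p(x) p(y) )
             = sum_{x,y} p(x,y) log[ p(x,y) / (p(x) p(y)) ]."""
    px = p_xy.sum(axis=1, keepdims=True)   # marginal p(x)
    py = p_xy.sum(axis=0, keepdims=True)   # marginal p(y)
    mask = p_xy > 0                        # 0 log 0 := 0
    return float((p_xy[mask] * np.log(p_xy[mask] / (px @ py)[mask])).sum())

# Independent variables: joint = product of marginals, so I = 0.
p_ind = np.outer([0.5, 0.5], [0.25, 0.75])
print(mutual_information(p_ind))   # 0.0

# Fully dependent (X = Y): I equals the Shannon entropy H(X) = log 2.
p_dep = np.array([[0.5, 0.0], [0.0, 0.5]])
print(mutual_information(p_dep))   # ≈ 0.693 (= log 2)
```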

Divergence Measures


• Euclidean divergence

• Cauchy-Schwarz divergence

• α-divergence

Divergence Measures


• f-divergence

• Jensen-Shannon divergence

Entropy is a concave function.

Convex Function

f(·): convex function


• A convex function must satisfy Jensen's inequality:

f(λ x_1 + (1 − λ) x_2) ≤ λ f(x_1) + (1 − λ) f(x_2),   0 ≤ λ ≤ 1

• A general convex function is defined over a convex combination with arbitrary weights.
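A quick numerical check of Jensen's inequality; the convex function f(x) = x log x (which underlies KL-type divergences) and the sample are chosen arbitrarily for illustration:

```python
import numpy as np

# Jensen's inequality for a convex f:  f(E[x]) <= E[f(x)].
# Applied to the empirical distribution of a sample, it holds exactly.
rng = np.random.default_rng(0)
x = rng.uniform(0.1, 2.0, 10_000)

f = lambda v: v * np.log(v)   # convex on v > 0
print(f(x.mean()) <= f(x).mean())  # True
```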

Convex Divergence

• By assuming equal weights, we have the convex divergence (C-DIV).

• For a particular value of α, C-DIV is derived as a special case with the corresponding convex function.


Different Divergence Measures


Convex Divergence ICA

• C-ICA learning algorithm

• A nonparametric C-ICA is established by using the Parzen window density function.


Simulated Experiments

• A parametric demixing matrix

W = [ cos θ_1   sin θ_1
      cos θ_2   sin θ_2 ]

• Two sources: one super-Gaussian and one sub-Gaussian distribution
  − Source 1 (sub-Gaussian, kurtosis −1.13): uniform density p(s_1) = 1/2 for s_1 ∈ [−1, 1], and 0 otherwise
  − Source 2 (super-Gaussian, kurtosis 2.23): Laplacian-type density p(s_2) ∝ exp(−|s_2| / σ_2)

[Figure: separation results using KL-DIV, C-DIV (α = 1), and C-DIV (α = −1).]

Learning Curves


Experiments on Blind Source Separation

• One music signal and two speech signals from two male speakers were sampled from ICA’99 BSS Test Sets at http://sound.media.mit.edu/ica-bench/

• Mixing matrix

A = [ 0.8  0.2  0.3
      0.3  0.8  0.2
      0.3  0.7  0.3 ]

• Evaluation metric: signal-to-interference ratio (SIR)

SIR(dB) = 10 log10 [ Σ_{t=1}^{T} s_t² / Σ_{t=1}^{T} (y_t − s_t)² ]
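The SIR metric can be sketched directly (the signals here are synthetic stand-ins, not the benchmark data):

```python
import numpy as np

def sir_db(s_true, s_est):
    """Signal-to-interference ratio in dB:
    SIR = 10 log10( sum_t s_t^2 / sum_t (y_t - s_t)^2 )."""
    err = s_est - s_true
    return 10 * np.log10(np.sum(s_true**2) / np.sum(err**2))

# Additive interference whose energy is 1% of the signal energy
# gives exactly 20 dB.
rng = np.random.default_rng(0)
s = rng.laplace(size=5000)
noise = rng.normal(size=5000)
y = s + 0.1 * noise * np.sqrt(np.sum(s**2) / np.sum(noise**2))
print(round(sir_db(s, y), 1))  # 20.0
```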


Comparison of Different Methods

[Figure: separation results of PC-ICA and NC-ICA.]

TABLE OF CONTENTS

1. Independent Component Analysis

2. Case Study I: Speech Recognition
   • Independent voices
   • Nonparametric likelihood ratio ICA

3. Case Study II: Blind Source Separation
   • Convex divergence ICA
   • Nonstationary Bayesian ICA
   • Online Gaussian process ICA

4. Summary

Why Nonstationary Source Separation?

• Real-world blind source separation
  − the number of sources is unknown
  − BSS is a dynamic, time-varying system
  − the mixing process is nonstationary

• Why nonstationary?
  − a Bayesian method using ARD can determine the changing number of sources
  − recursive Bayesian learning allows online tracking of nonstationary conditions
  − a Gaussian process provides a nonparametric solution to represent the temporal structure of the time-varying mixing system


Nonstationary Mixing Systems

• Time-varying mixing matrix
• Source signals may abruptly appear or disappear

[Diagram: sources S_1, S_2, S_3 at distances d_1, d_2, d_3 from the sensors; as sources or sensors move, the mixing coefficients become time-varying, e.g. a_22(t+1) ≠ a_22(t) and a_23(t+1) ≠ a_23(t), and a disappearing source gives a_21(t+2) = 0.]

Nonstationary Bayesian (NB) Learning

• Maximum a posteriori estimation of NB-ICA parameters and compensation parameters


[Diagram: at each learning epoch t, the parameters estimated at epoch t − 1 serve as the prior, which is updated with the current data (prior updating) and propagated to epoch t + 1.]

Model Construction

• Noisy ICA model

• Likelihood function of an observation

• Distribution of model parameters

−source

−mixing matrix

−noise


Prior & Marginal Distributions

• Prior distributions−precision of noise

−precision of mixing matrix

• Marginal likelihood of NB-ICA model


Automatic Relevance Determination

• Detection of source signals

−number of sources can be determined


Compensation for Nonstationary ICA

• Prior density of compensation parameter−conjugate prior (Wishart distribution)


Graphical Model for NB-ICA


[Graphical model of NB-ICA: a plate over L epochs containing observations x_t^(l), sources s_t^(l) with parameters Π^(l), mixing matrix A^(l) with hyperparameters α^(l) and H^(l), noise precision R^(l), and compensation parameters whose hyperparameters are updated from epoch l − 1; inner plates of sizes M and N.]

Experiments

• Nonstationary blind source separation
  − ICA'99: http://sound.media.mit.edu/ica-bench/

• Scenarios
  − state of source signals: active or inactive
  − source signals or sensors are moving: nonstationary mixing matrix


Source Signals and ARD Curves


Blue: first source signal; red: second source signal.


TABLE OF CONTENTS

1. Independent Component Analysis

2. Case Study I: Speech Recognition
   • Independent voices
   • Nonparametric likelihood ratio ICA

3. Case Study II: Blind Source Separation
   • Convex divergence ICA
   • Nonstationary Bayesian ICA
   • Online Gaussian process ICA

4. Summary

Online Gaussian Process (OLGP)

• Basic ideas
  − incrementally detect the status of source signals and estimate the corresponding distributions from online observation data
  − the temporal structure of the time-varying mixing coefficients is characterized by a Gaussian process
  − a Gaussian process is a nonparametric model which defines the prior distribution over functions for Bayesian inference


Model Construction

• Noisy ICA model

• Likelihood function

• Distribution of model parameters
  − source

− noise

−P


Gaussian Process

• Mixing matrix
  − the mixing coefficients are generated by a latent function
  − a GP is adopted to describe the distribution of this latent function
  − the kernel function is specified by its hyperparameters
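One common concrete choice is the squared-exponential kernel; the sketch below (kernel form and hyperparameter values are assumptions, not the slides' exact model) draws smooth time-varying mixing coefficients from the GP prior:

```python
import numpy as np

def rbf_kernel(t1, t2, sigma_f=1.0, ell=10.0):
    """Squared-exponential covariance
    k(t, t') = sigma_f^2 exp(-(t - t')^2 / (2 ell^2));
    sigma_f and ell play the role of the kernel hyperparameters."""
    d = t1[:, None] - t2[None, :]
    return sigma_f**2 * np.exp(-0.5 * (d / ell) ** 2)

# Sample trajectories a(t) from the GP prior: each draw is one plausible
# smoothly varying mixing coefficient over time.
t = np.arange(100.0)
K = rbf_kernel(t, t) + 1e-8 * np.eye(t.size)   # jitter for stability
rng = np.random.default_rng(0)
a = rng.multivariate_normal(np.zeros(t.size), K, size=3)
print(a.shape)  # (3, 100)
```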


Graphical Model for OLGP-ICA


[Graphical model of OLGP-ICA: a plate over L frames containing observations x_t^(l), sources s_t^(l), noise ε_t^(l), and time-varying mixing matrix A_t^(l), whose GP and source hyperparameters are updated from frame l − 1; inner plates of sizes M and N.]

Experimental Setup

• Nonstationary source separation using source signals from http://www.kecl.ntt.co.jp/icl/signal/

• Nonstationary scenarios
  − status of source signals: active or inactive
  − source signals or sensors are moving: nonstationary mixing matrix


[Figure: separation results for the male speech, music, and female speech signals.]

Comparison of Different Methods

• Signal-to-interference ratios (SIRs) (dB)

Method           Demixed signal 1   Demixed signal 2
VB-ICA                 7.97              -3.23
BICA-HMM               9.04              -1.50
Switching-ICA         12.06              -4.82
Online VB-ICA         11.26               4.47
OLGP-ICA              17.24               9.96

Summary

• We presented a speaker adaptation method based on independent voices from the ICA perspective.

• A nonparametric likelihood ratio ICA was proposed according to hypothesis test theory.

• A convex divergence was developed as an optimization metric for the ICA algorithm.

• A nonstationary Bayesian ICA was proposed to deal with nonstationary mixing systems.

• An online Gaussian process ICA was presented for nonstationary and temporally correlated source separation.

• ICA methods could be extended to solve nonnegative matrix factorization and single-channel separation.


References

• J.-T. Chien and B.-C. Chen, “A new independent component analysis for speech recognition and separation”, IEEE Transactions on Audio, Speech and Language Processing, vol. 14, no. 4, pp. 1245-1254, 2006.

• J.-T. Chien, H.-L. Hsieh and S. Furui, “A new mutual information measure for independent component analysis”, in Proc. ICASSP, pp. 1817-1820, 2008.

• H.-L. Hsieh, J.-T. Chien, K. Shinoda and S. Furui, “Independent component analysis for noisy speech recognition”, in Proc. ICASSP, pp. 4369-4372, 2009.

• H.-L. Hsieh and J.-T. Chien, “Online Bayesian learning for dynamic source separation”, in Proc. ICASSP, pp. 1950-1953, 2010.

• H.-L. Hsieh and J.-T. Chien, “Online Gaussian process for nonstationary speech separation”, in Proc. INTERSPEECH, pp. 394-397, 2010.

• H.-L. Hsieh and J.-T. Chien, “Nonstationary and temporally-correlated source separation using Gaussian process”, in Proc. ICASSP, pp. 2120-2123, 2011.

• J.-T. Chien and H.-L. Hsieh, “Convex divergence ICA for blind source separation”, IEEE Transactions on Audio, Speech and Language Processing, vol. 20, no. 1, pp. 290-301, 2012.



Thanks to

H.-L. Hsieh K. Shinoda S. Furui

jtchien@nctu.edu.tw
