
The Use of Sparsity Hypothesis for Source Separation

Prasad Sudhakar, PhD
Post-doctoral researcher, ICTEAM/ELEN, UCL

prasad.sudhakar@uclouvain.be

13 October, 2011

SPS Seminar Series, ICTEAM/ELEN, UCL

1

2

Context

[Figure: sources → mixing process → mixtures]

3

Source separation

[Figure: M mixtures → unmixing process → N source estimates; M < N: underdetermined]

4

Acoustic mixing model

[Figure: N sources pass through mixing filters of length L to produce M convolutive mixtures]

5

Blind filter estimation and source separation

• Source separation: known mixtures, unknown sources
• Filter estimation: known mixtures, unknown filters
• Mathematically ill-posed, and hence impossible to solve as such
• Needs further hypotheses

[Diagram: Filter Estimation and Source Estimation blocks]

6

Enabling hypotheses

• Independent Component Analysis (ICA)
  • Hypothesis: sources are statistically independent
  • Objective: estimate the sources by maximising independence (e.g. minimise mutual information)

• Non-negative Matrix Factorisation (NMF)
  • Hypothesis: the mixtures can be factored into two non-negative matrices
  • Objective: seek factor matrices which well approximate the mixtures (a minimal sketch follows this list)

• Sparse Component Analysis (SCA)
  • Hypothesis: sources and/or mixing filters are sparse
  • Objective: seek sparse sources and/or filters from the mixtures

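As a concrete illustration of the NMF hypothesis above (a generic sketch, not the specific algorithm behind any work cited in this talk), the following toy code factors a non-negative matrix with the classical Lee-Seung multiplicative updates; the rank, iteration count, and random test matrix are illustrative assumptions.

```python
import numpy as np

def nmf(X, rank, n_iter=200, eps=1e-9):
    """Approximate a non-negative matrix X as W @ H using multiplicative updates."""
    rng = np.random.default_rng(0)
    W = rng.random((X.shape[0], rank)) + eps
    H = rng.random((rank, X.shape[1])) + eps
    for _ in range(n_iter):
        H *= (W.T @ X) / (W.T @ W @ H + eps)   # update activations
        W *= (X @ H.T) / (W @ H @ H.T + eps)   # update basis vectors
    return W, H

# Toy usage: factor a random non-negative "spectrogram" into 2 components
X = np.abs(np.random.default_rng(1).standard_normal((64, 100)))
W, H = nmf(X, rank=2)
print("relative approximation error:", np.linalg.norm(X - W @ H) / np.linalg.norm(X))
```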
7

Filter types

[Figure: filter types: Instantaneous, Anechoic, Sparse, Fully convolutive]

8

Sparse sources

• Many audio sources are sparse in the time-frequency domain
• Short-Time Fourier Transform (STFT): commonly used analysis tool

[Figure: time-domain waveform → STFT → sparse time-frequency representation]

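A minimal sketch of the point above: a toy "audio" signal is sparse in the STFT domain in the sense that a small fraction of time-frequency coefficients carries almost all of the energy. The signal, sampling rate, and window size are arbitrary illustrative choices.

```python
import numpy as np
from scipy.signal import stft

fs = 8000
t = np.arange(0, 1.0, 1 / fs)
# Toy "audio" signal: two tones switching on at different times
x = np.sin(2 * np.pi * 440 * t) * (t > 0.2) + 0.5 * np.sin(2 * np.pi * 1320 * t) * (t > 0.6)

f, frames, X = stft(x, fs=fs, nperseg=512)
energy = np.abs(X).ravel() ** 2

# Fraction of TF coefficients needed to capture 99% of the energy: small => sparse
sorted_energy = np.sort(energy)[::-1]
cumulative = np.cumsum(sorted_energy) / sorted_energy.sum()
k = int(np.searchsorted(cumulative, 0.99)) + 1
print(f"{k} of {energy.size} TF coefficients ({100 * k / energy.size:.1f}%) carry 99% of the energy")
```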
9

Can Sparsity still help?

10

Landscape of methods

[Figure: landscape of methods. Horizontal axis: mixing complexity, from Instantaneous/Anechoic to Fully convolutive. Vertical axis: relative number of sources (N = 1, N = M, N <= M, N > M) and source sparsity. State of the art in the different regions: Sparse Component Analysis (SCA) for sparse sources, Cross-Relation (CR) based approaches, and 1) Independent Component Analysis (ICA), 2) Non-negative Matrix Factorisation (NMF).]

10

What is sparsity?

• Sparsity: few significant coefficients compared to the size of the signal

• True sparsity measure: the $\ell_0$ pseudo-norm $\|\mathbf{x}\|_0 = \#\{i : x_i \neq 0\}$
  • Discrete, sensitive to noise
• Relaxed convex norm: the $\ell_1$ norm $\|\mathbf{x}\|_1 = \sum_i |x_i|$
• $k$-sparse vector: $\|\mathbf{x}\|_0 \leq k$

11

Relevance of sparsity

• Helps reduce problem complexity

• Aids data compression: data can be sparse in a transform domain

• e.g. DCT of natural images, STFT of audio sources, etc.

• Of late, a popular prior for solving linear inverse problems: sparse recovery problems

• compressed sensing

12

Sparsity for linear underdetermined systems

• Suppose we need to solve $\mathbf{b} = \mathbf{A}\mathbf{x}$, with $\mathbf{A}$ having fewer rows than columns
• Under-determined: non-uniqueness of the solution
• Prior: the solution is sparse
• Ideally: $\min_{\mathbf{x}} \|\mathbf{x}\|_0$ subject to $\mathbf{b} = \mathbf{A}\mathbf{x}$
• Sparse approximation: $\min_{\mathbf{x}} \|\mathbf{x}\|_0$ subject to $\|\mathbf{b} - \mathbf{A}\mathbf{x}\|_2 \leq \epsilon$

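The $\ell_0$ problems above are combinatorial; in practice one solves a surrogate. A minimal sketch of the convex-relaxation route (anticipating the next slide) is iterative soft-thresholding (ISTA) for the $\ell_1$-penalised least-squares problem; the matrix sizes, sparsity level, and penalty are illustrative assumptions, and this is not the solver used in the talk.

```python
import numpy as np

def ista(A, b, lam=0.01, n_iter=2000):
    """ISTA for min_x 0.5*||A x - b||^2 + lam*||x||_1 (an l1 relaxation)."""
    step = 1.0 / np.linalg.norm(A, 2) ** 2           # 1 / Lipschitz constant
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        x = x - step * (A.T @ (A @ x - b))           # gradient step on the data term
        x = np.sign(x) * np.maximum(np.abs(x) - lam * step, 0.0)  # soft threshold
    return x

# Toy under-determined system: 30 equations, 100 unknowns, 5-sparse ground truth
rng = np.random.default_rng(0)
A = rng.standard_normal((30, 100)) / np.sqrt(30)
x_true = np.zeros(100)
x_true[rng.choice(100, 5, replace=False)] = rng.standard_normal(5)
x_hat = ista(A, A @ x_true)
print("relative error:", np.linalg.norm(x_hat - x_true) / np.linalg.norm(x_true))
```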
13

Algorithmic families for sparse recovery

Greedy methods
• Principle: iteratively estimate the non-zero coefficients
• Sub-optimal and heuristic
• Easy on computations
• Theoretically not very well understood
• Ex.: Matching Pursuit, OMP, StOMP, etc.

Convex relaxation
• Principle: replace the $\ell_0$ norm by convex norms
• Provides provably optimal solutions in several settings
• Computation intensive
• Well rounded theory, thanks to 'Compressed Sensing'
• Ex.: Basis Pursuit, BPDN, etc.

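For the greedy family above, a compact Orthogonal Matching Pursuit sketch is given below; the dictionary size and the 3-sparse test vector are illustrative, and no claim is made that this matches the exact variants used in the works cited in this talk.

```python
import numpy as np

def omp(A, b, k):
    """Orthogonal Matching Pursuit: greedily select k columns of A to explain b."""
    residual, support = b.copy(), []
    for _ in range(k):
        j = int(np.argmax(np.abs(A.T @ residual)))   # atom most correlated with residual
        if j not in support:
            support.append(j)
        coeffs, *_ = np.linalg.lstsq(A[:, support], b, rcond=None)  # least-squares re-fit
        residual = b - A[:, support] @ coeffs
    x = np.zeros(A.shape[1])
    x[support] = coeffs
    return x

rng = np.random.default_rng(1)
A = rng.standard_normal((40, 120))
A /= np.linalg.norm(A, axis=0)                       # unit-norm atoms
x_true = np.zeros(120)
x_true[[3, 17, 58]] = [1.0, -2.0, 0.5]
x_hat = omp(A, A @ x_true, k=3)
print("recovered support:", np.flatnonzero(np.abs(x_hat) > 1e-8).tolist())
```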
14

Sparse component analysis

15

Estimating sparse sources with known parameters

• Suppose the mixing matrix A is known; then the sources can be estimated by sparse recovery ($\ell_1$ minimisation subject to the mixing constraint)

• Likewise when the mixing operator represents convolutive mixtures

What if the mixing coefficients are unknown?

• Model: Instantaneous mixtures of sparse sources

16

Instantaneous mixtures of sparse sources

• Model: sparse and disjoint sources, instantaneous mixture

[Figure: sparse and disjoint source spectrograms and their instantaneous mixture]

References: A. Jourjine et al., 2000; P. Bofill & M. Zibulevsky, 2001; M. Zibulevsky & B.A. Pearlmutter, 2001

• Suppose the active TF sets of the sources are known: use time-frequency masking and invert the transform to estimate the sources

• How to obtain the sets? Use geometric ideas

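A minimal illustration of the masking step described above, on a mono toy mixture with an oracle mask (the mask is computed from the true sources; obtaining such sets blindly is exactly the problem the following slides address). Sampling rate, tones, and window length are arbitrary choices.

```python
import numpy as np
from scipy.signal import stft, istft

fs = 8000
t = np.arange(0, 1.0, 1 / fs)
s1 = np.sin(2 * np.pi * 300 * t) * (t < 0.5)      # source 1: low tone, first half
s2 = np.sin(2 * np.pi * 2000 * t) * (t >= 0.5)    # source 2: high tone, second half
x = s1 + s2                                       # mono instantaneous mixture

_, _, X = stft(x, fs=fs, nperseg=256)
_, _, S1 = stft(s1, fs=fs, nperseg=256)
_, _, S2 = stft(s2, fs=fs, nperseg=256)

# Oracle mask: TF points where source 1 dominates (estimating such sets
# blindly is the real problem)
mask1 = np.abs(S1) > np.abs(S2)

_, s1_hat = istft(X * mask1, fs=fs, nperseg=256)
_, s2_hat = istft(X * ~mask1, fs=fs, nperseg=256)
err = np.linalg.norm(s1_hat[:len(s1)] - s1) / np.linalg.norm(s1)
print(f"relative error on source 1: {err:.3f}")
```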
17

Use of scatter plots to identify which source is active at which TF points (stereo, M = 2)

[Scatter plot of the stereo mixture coefficients: colours indicate sources, lengths indicate scaling]

• Knowledge of these sets can then be used for: 1) filter estimation and 2) separation
• Up to a permutation and scaling: $\tilde{\mathbf{A}} \approx \mathbf{A}\,\Lambda\,\mathbf{P}$

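A rough sketch of the scatter-plot idea on synthetic sparse, essentially disjoint sources: points where a single source is active line up along the corresponding column of the mixing matrix, so a simple direction histogram reveals the mixing directions, up to permutation and scaling. The mixing matrix, sparsity levels, and the crude histogram clustering are illustrative stand-ins for the cited methods.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000
# Two sparse, essentially disjoint sources (most coefficients are zero)
s = np.zeros((2, n))
s[0, rng.choice(n, 400, replace=False)] = rng.standard_normal(400)
s[1, rng.choice(n, 400, replace=False)] = rng.standard_normal(400)

A = np.array([[0.9, 0.4],
              [0.44, 0.9]])        # unknown stereo mixing matrix (to be estimated)
x = A @ s                          # the two mixtures

# At points where a single source is active, (x1, x2) lies on the line
# spanned by the corresponding column of A
active = np.abs(x).sum(axis=0) > 1e-6
angles = np.arctan2(x[1, active], x[0, active]) % np.pi

# Crude clustering: pick the two most populated direction bins
# (columns are recovered only up to permutation and scaling)
hist, edges = np.histogram(angles, bins=180)
peaks = np.sort(np.argsort(hist)[-2:])
est_angles = (edges[peaks] + edges[peaks + 1]) / 2

print("true column directions:", np.sort(np.arctan2(A[1], A[0])))
print("estimated directions  :", est_angles)
```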
18

Generalisations and improvements

• Anechoic mixtures:

• The mixing coefficients also involve a phase parameter

• In the stereo case, only one phase needs to be estimated (fixing the other)

• Extension to quasi-disjoint sources

• In a generic setting, one can allow at most M-1 sources to be active at a given TF location

• Statistical confidence measures to obtain the sets

• Convolutive mixtures
• Narrowband approximation through a suitable transform, like the STFT:

$$x_i(t) = \sum_{j=1}^{N} (a_{ij} \star s_j)(t) \;\longrightarrow\; x_i(\tau, f) \approx \sum_{j=1}^{N} a_{ij}(\tau, f)\, s_j(\tau, f), \quad 1 \leq i \leq M$$

• One convolutive problem reduces to a set of complex instantaneous problems (one per frequency bin)

Can we use the techniques developed for instantaneous mixtures?

19

Convolutive mixtures: Permutation and scaling

[Scatter plot of the estimated mixing parameters]

• Frequency-dependent scaling

Conclusions

• Scatter plots alone are not sufficient

• Need permutation and scaling correction

• Time/frequency-domain ICA approaches also suffer

Consequences
• Each frequency bin is arbitrarily scaled and permuted
• The estimated filter coefficients at each frequency equal the true ones only up to a frequency-dependent scaling and permutation of the sources

20

Sparse filters for permutation correction

21

Relevance of sparse filters

An example of an underwater communication channel

• Few reflection paths = time-domain sparsity
  - Underwater acoustics
  - Communications engineering: wide-band CDMA
  - Biomedical: ultrasound imaging

Picture courtesy: Berger et al., 2009

22

Permutation correction problem

• Setting: Permutation only

• Claim: suppose the filters are sufficiently sparse; then the permutation-corrected filters have the smallest $\ell_1$ norm, and hence the permutations can be recovered by $\ell_1$ minimisation

• Problem: given the frequency-domain filters permuted at an unknown set of frequencies, recover the permutations

[Plot: success/failure of permutation recovery as a function of the number of permuted frequencies]

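A small empirical check of the claim above, with p = 1 for concreteness: two sparse filters are generated, their frequency responses are swapped on a few random bins, and the ℓ1 norm of the resulting time-domain filters is compared with that of the true filters. Filter length, sparsity, and the number of permuted bins are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)
L = 256

def sparse_filter(k):
    a = np.zeros(L)
    a[rng.choice(L, k, replace=False)] = rng.standard_normal(k)
    return a

a1, a2 = sparse_filter(8), sparse_filter(8)
A1, A2 = np.fft.fft(a1), np.fft.fft(a2)

def l1_after_permutation(n_perm):
    """Swap the two frequency responses on n_perm random bins (and their
    conjugate-symmetric partners, so the filters stay real) and return the
    total l1 norm of the resulting time-domain filters."""
    B1, B2 = A1.copy(), A2.copy()
    for f in rng.choice(np.arange(1, L // 2), n_perm, replace=False):
        for g in (f, L - f):
            B1[g], B2[g] = A2[g], A1[g]
    b1, b2 = np.fft.ifft(B1).real, np.fft.ifft(B2).real
    return np.abs(b1).sum() + np.abs(b2).sum()

print("true filters, l1 =", round(np.abs(a1).sum() + np.abs(a2).sum(), 3))
for n_perm in (5, 20, 60):
    print(n_perm, "permuted bins, l1 =", round(l1_after_permutation(n_perm), 3))
```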
23

Permutation correction by $\ell_1$ minimisation

• $\mathbf{A}_{TD}$: true time-domain filter matrix; $\tilde{\mathbf{A}}_{TD}$: estimated filter matrix
• Recovery performance measure: based on the discrepancy between $\tilde{\mathbf{A}}_{TD}$ and $\mathbf{A}_{TD}$
• Conclusion: a sparsity prior on the filters helps solve the permutation problem, in the absence of scaling.

24

Scaling correction

25

Single input - two output setting

• Model: $x_1 = a_1 \star s$, $\quad x_2 = a_2 \star s$

Cross-relation (CR)
• Property: $a_2 \star x_1 = a_1 \star x_2$, i.e. $(a_2 \star x_1 - a_1 \star x_2)(t) = 0$ for all $t$
• Consequence: removes TF-dependent source scaling
• Matrix form: $\mathbf{X}\,\mathbf{a} = \mathbf{0}$, where $\mathbf{X}$ is a double Toeplitz matrix built from the mixtures and $\mathbf{a} = [a_1; a_2]$

• Issues:
  1. Normalisation of the solution
  2. Possibly non-unique solution
• Conclusion: needs a prior

References: H. Liu et al., 1994; G. Xu et al., 1995

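A numerical check of the cross-relation on synthetic sparse channels, together with the double-Toeplitz matrix form stated above; the conv_matrix helper and all sizes are illustrative, not the talk's exact construction.

```python
import numpy as np

rng = np.random.default_rng(0)
T, L = 2000, 16
s = rng.standard_normal(T)                              # unknown source
a1 = np.zeros(L); a1[[0, 5, 11]] = [1.0, -0.6, 0.3]     # sparse channel 1
a2 = np.zeros(L); a2[[2, 7]] = [0.8, 0.4]               # sparse channel 2
x1, x2 = np.convolve(a1, s), np.convolve(a2, s)         # the two observed mixtures

# Cross-relation: a2 * x1 = a1 * x2, since both equal a1 * a2 * s
lhs, rhs = np.convolve(a2, x1), np.convolve(a1, x2)
print("max CR violation:", np.max(np.abs(lhs - rhs)))   # zero up to round-off

def conv_matrix(x, L):
    """Tall Toeplitz matrix C with C @ a == np.convolve(x, a) for len(a) == L."""
    C = np.zeros((len(x) + L - 1, L))
    for k in range(L):
        C[k:k + len(x), k] = x
    return C

# Double Toeplitz matrix [T(x2) | -T(x1)] annihilates the stacked true filters
X = np.hstack([conv_matrix(x2, L), -conv_matrix(x1, L)])
print("||X a|| =", np.linalg.norm(X @ np.concatenate([a1, a2])))
```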
26

Formulation of sparse filter estimation problem

• Filter recovery problem formulation: $\min_{\mathbf{a}} \|\mathbf{a}\|_1$ subject to $\|\mathbf{X}\mathbf{a}\|_2 \leq \epsilon$, with a normalisation constraint to exclude the trivial solution $\mathbf{a} = \mathbf{0}$
• In standard sparse recovery problems, $\ell_1$ minimisation seeks sparse solutions
• Convex, and can be solved using standard packages
• Noise aware
• Issue: how to exploit this approach in the multiple-source setting?

References: A. Aïssa-El-Bey et al., 2008

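A minimal sketch of the formulation above in an idealised noiseless case: the constraint on ||Xa|| is replaced by the equality Xa = 0, and the normalisation is taken to be a1[0] = 1 (which assumes the first filter tap is non-zero). With the standard split a = u - v this becomes a linear programme; this is an assumption-laden toy, not the noise-aware formulation of the cited work.

```python
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(0)
T, L = 400, 12
s = rng.standard_normal(T)
a1 = np.zeros(L); a1[[0, 4, 9]] = [1.0, -0.5, 0.25]     # note: a1[0] = 1 matches the normalisation
a2 = np.zeros(L); a2[[1, 6]] = [0.7, 0.35]
x1, x2 = np.convolve(a1, s), np.convolve(a2, s)

def conv_matrix(x, L):
    C = np.zeros((len(x) + L - 1, L))
    for k in range(L):
        C[k:k + len(x), k] = x
    return C

X = np.hstack([conv_matrix(x2, L), -conv_matrix(x1, L)])  # cross-relation matrix

# min ||a||_1  s.t.  X a = 0  and  a1[0] = 1, as an LP with the split a = u - v
n = 2 * L
e0 = np.zeros(n); e0[0] = 1.0
A_eq = np.vstack([np.hstack([X, -X]), np.hstack([e0, -e0])])
b_eq = np.zeros(A_eq.shape[0]); b_eq[-1] = 1.0
res = linprog(c=np.ones(2 * n), A_eq=A_eq, b_eq=b_eq, bounds=(0, None))
a_hat = res.x[:n] - res.x[n:]

a_true = np.concatenate([a1, a2])
print("relative error:", np.linalg.norm(a_hat - a_true) / np.linalg.norm(a_true))
```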
27

Revisiting the landscape

[Figure: the landscape of methods revisited. Axes as before: mixing complexity (Instantaneous/Anechoic to Fully convolutive) and relative number of sources (N = 1, N = M, N <= M, N > M, sparse). Regions are marked with: scatter plots, Sparse Component Analysis (SCA) for source sparsity, Cross-Relation (CR) based approaches, and the multiple filter estimation framework, which must deal with permutations and scaling.]

28

Multiple sparse filter estimation

29

References: A. Aïssa-El-Bey et al., 2009

Multiple filter estimation using time-domain disjointness and CR

• Suppose there exist intervals where only one source is active

• The mixtures contain intervals where only one source contributes

• Mixtures in those intervals satisfy time-domain CR

• Filters can be estimated by solving the sparse recovery problem

What if the sources are not disjoint?

30

Multiple filter estimation using time-frequency (TF) domain disjointness and CR

• Given the mixtures, if we know which source is active at which TF locations,
• and build, for each source $j$, a matrix $\mathbf{B}^{\Omega_j}_{NB}$ or $\mathbf{B}^{\Omega_j}_{WB}$ from the TF points in $\Omega_j$ where only source $j$ is active,
• then we can express the CR in the TF domain in two different forms:
  1) narrowband (NB) approximation
  2) wideband (WB) formulation

31

Narrowband and Wideband CR

• We have: the time-domain cross-relation (CR-TD)
• Given the active TF sets, it leads to:
  • CR-NB: the narrowband CR, using the narrowband approximation
  • CR-WB: the wideband CR, with no narrowband approximation

32

Multiple filter estimation using TF domain CR

• Set $\mathbf{B} = \mathbf{B}^{\Omega_j}_{NB}$ (narrowband) or $\mathbf{B} = \mathbf{B}^{\Omega_j}_{WB}$ (wideband)

• Further, use the matrix $\mathbf{B}$ to solve problem (A), the sparse filter recovery problem of the earlier formulation with $\mathbf{B}$ in place of $\mathbf{X}$, and estimate the filters associated with source $j$

• A single $N$-filter estimation problem is reduced to $N$ single-filter estimation problems

33

A two-stage framework

• For each source:
  1) Time-frequency clustering: identify the time-frequency regions where only one source is active
  2) Filter estimation: solve the optimisation problem to estimate the filters

• Clustering: a difficult problem; it dictates the performance of filter estimation
• Filter estimation: using generic toolboxes for convex programming

34

Experiments with controlled setting

• Setup: two sources, flute and guitar; one is mixed instantaneously, the other through a sparse filter

[Figure: the sparse filter and the flute and guitar source signals]

• Main issue: blind clustering of TF points where only source 2 is active
• Remove the points corresponding to the instantaneously mixed source

35

Experiments: Blind clustering

• STFT magnitude of a mixture
• Use DEMIX or a similar approach to identify the regions where source 1 is active
• Remove the points corresponding to source 1
• Remaining points correspond to source 2

Reference: Arberet et al., 2010

36

Performance measure

• SNR measure of the estimated filters

• Takes care of the global shift and global scale ambiguities inherent to the problem formulation

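One plausible implementation of such a measure (the talk does not spell out its exact formula): the SNR in dB between the true and estimated filter, maximised over a global integer shift and the least-squares optimal global scale; circular shifts are used here for simplicity.

```python
import numpy as np

def filter_snr(a_true, a_est):
    """SNR (dB) between true and estimated filters, maximised over a global
    circular shift and the least-squares optimal global scale."""
    best = -np.inf
    for shift in range(len(a_true)):
        a = np.roll(a_est, shift)
        scale = np.dot(a, a_true) / (np.dot(a, a) + 1e-12)   # optimal scale
        err = a_true - scale * a
        snr = 10 * np.log10(np.dot(a_true, a_true) / (np.dot(err, err) + 1e-12))
        best = max(best, snr)
    return best

# Example: the estimate is a shifted, rescaled, slightly noisy copy of the truth
rng = np.random.default_rng(0)
a_true = np.zeros(64); a_true[[3, 10, 40]] = [1.0, -0.5, 0.2]
a_est = 2.5 * np.roll(a_true, 7) + 0.01 * rng.standard_normal(64)
print(f"filter SNR: {filter_snr(a_true, a_est):.1f} dB")
```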
37

Results

• Debiasing: extract the valid support and readjust the coefficients by performing a least-squares minimisation on that support

• The wideband method with debiasing outperforms the state of the art by at least 10 dB

Reference: C. Knapp & G. Carter, 1976

38

Summary and perspectives

39

Summary

Sparsity hypothesis: can it be used to solve problems beyond standard linear inverse problems?

• Yes: a framework for the multiple filter estimation problem, central to convolutive source separation,
• by combining the notions of time-frequency domain sparsity of sources and time-domain sparsity of filters
• Empirical results show the ability of the two-stage framework to estimate the filters

40

Perspectives (1/2)

• Central issue with the framework: blind clustering
  • Ideas from anechoic settings?
  • Anechoic approximations using DOA information
  • Cluster initialisation using filter approximations?

[Diagram: Clustering and Filter Estimation blocks, with noise]

• From filter estimation to source estimation

41

Perspectives (2/2)

• Connections with subspace learning

• Sparse vector orthogonal to the matrix B
• Subspaces characterised by sparse vectors

• Exploiting sparsity in non-standard domains• Connections with synthesis and analysis priors

• Exploiting structured sparsity

• Theoretical analysis of the filter estimation framework• Understanding identifiability and recovery conditions

42

Some relevant publications

1) P. Sudhakar. Sparse models and convex optimisation for convolutive blind source separation. PhD thesis, University of Rennes 1, France, February 2011.

2) A. Benichoux, P. Sudhakar, R. Gribonval. Well-posedness of the frequency permutation problem in sparse filter estimation with lp minimization. In SPARS'11, June 2011.

3) P. Sudhakar, S. Arberet and R. Gribonval. Sparse models for multiple mixing filter estimation from stereo convolutive mixtures. Submitted to IEEE TASLP, June 2011.

4) S. Arberet, P. Sudhakar, and R. Gribonval. A wideband doubly-sparse approach for MITO sparse filter estimation. In Proceedings of ICASSP 2011, May 2011.

5) P. Sudhakar, S. Arberet, and R. Gribonval. Double Sparsity: Towards Blind Estimation of Multiple Channels. In Proceedings of LVA/ICA 2010, St. Malo, France.

6) P. Sudhakar and R. Gribonval. A sparsity-based method to solve the permutation indeterminacy in frequency domain convolutive blind source separation. In Proceedings of ICA 2009, Paraty, Brazil.

•Clipart from: http://www.clker.com/

Thanks to my collaborators

• Remi Gribonval, METISS, INRIA Rennes-Bretagne Atlantique, France remi.gribonval@inria.fr

• Simon Arberet, LTS2, EPFL, Switzerland simon.arberet@epfl.ch

43

44

Structure of NB matrix

• CR at a given point
• For a given frame index
• If there are frames, then define
  (where the forward Fourier matrix has size $F \times F$)
• It satisfies:

45

Projection of convolved sequences

Lemma 1: Let be a bounded real valued signal, let be a finite real valued signal and let be a finite signal, possibly complex,

then

where

46

Structure of WB matrix

TF domain CR

If is a STFT dictionary of one sample shift, then

By lemma 1

47

Structure of WB matrix

If we have

and

then we can define which satisfies

48

Time-frequency disjointness in NB formulation

Consider

If then

49

Time-frequency disjointness in WB formulation

Consider

If then

Note that

By lemma 1

50

Oracle clustering

True filters satisfy the CR for source k

CR-NB

CR-WB

51

Blind clustering

Cluster initialisation using filter approximations

[Diagram: Clustering and Filter Estimation blocks, with noise]

52

Permutation correction: Disjoint time supports

Theorem 1: Let the filters have mutually disjoint supports and let the permuted filters be obtained after frequency-domain permutations at a given set of frequency indices; then the permuted filters have a larger norm than the true filters.

• The result is independent of:
  1) Sparsity
  2) Number of permutations

53

Theorem 2: Let $a_1$ and $a_2$ be two sparse filters and let $\tilde{a}_1$ and $\tilde{a}_2$ be the filters obtained after frequency-domain permutations at frequency indices in a given set; then

a)


Permutation correction: Sparse filters

• Doesn’t assume disjoint supports

• Gives a regime of sparsity and number of permutations for which norm minimisation recovers the permutations

54

Permutation correction: Equality case

b) If the two filters have disjoint supports, then

Further, equality in the above equation implies
EITHER I)
OR II)

• The inequality result comes from Theorem 1
• The equality condition implies a global permutation
• Conclusion: under the assumed conditions
  • permuted filters have a larger norm than the corresponding true filters
  • true filters can be uniquely recovered by norm minimisation

55

Variation of norm against permutations

• Performance measure:

• Objective: to assess whether permutations increase the norm of the filters

• Conclusion: empirically, the bigger the number of permutations, the larger the increase in norm

• Results:

56

Sparsity in the thesis

In this work, we use the sparsity hypothesis twice

1. Source sparsity in the time-frequency domain

2. Filter sparsity in the time domain

57

Plan of the talk

1. Tools

i. Sparse component analysis

ii. Cross-relation based approaches

2. Permutation correction using sparsity

3. Framework for multiple filter estimation

4. Summary and perspectives

58

Time-frequency masking

• Model: sparse and disjoint sources, instantaneous mixture

• Consequence: one just needs to know which source is active at which TF locations for source localisation and separation

• How to identify these locations? Use scatter plots

References: A. Jourjine et al., 2000; P. Bofill & M. Zibulevsky, 2001; M. Zibulevsky & B.A. Pearlmutter, 2001

59

Clustering

• In this work, two kinds of experiments are done:

Experiments with synthetic data
• Goal: to assess the performance of the filter estimation step
• Oracle clustering:
  • uses the knowledge of the true filters (ground truth)
  • depends on a threshold

Experiments with audio data
• Goal: to assess the overall performance in a realistic setting
• Blind clustering: assumes all sources except one are instantaneously mixed

60

Experiments with synthetic data

• Source model: sum of sinusoids with Gaussian envelopes of random lengths

• Study the effect of:
  1) STFT window size F
  2) Clustering threshold
  on the filter recovery performance, using oracle clustering

• Sparse filters of length L = 256

• Number of sources N = 3

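A rough stand-in for the synthetic setup described on this slide: sources built as sums of sinusoids with Gaussian envelopes of random lengths, N = 3 sources, stereo mixtures, sparse filters of length L = 256. The number of envelopes, frequency range, and the choice of 8 non-zero filter taps are assumptions; the original experiments' exact parameters are not given here.

```python
import numpy as np

def synthetic_source(T, n_events=20, fs=8000, seed=0):
    """Sum of sinusoids with Gaussian envelopes of random lengths."""
    rng = np.random.default_rng(seed)
    t = np.arange(T) / fs
    s = np.zeros(T)
    for _ in range(n_events):
        f0 = rng.uniform(100, 3000)          # random frequency
        centre = rng.uniform(0, T / fs)      # random position in time
        width = rng.uniform(0.01, 0.1)       # random envelope width (seconds)
        env = np.exp(-0.5 * ((t - centre) / width) ** 2)
        s += env * np.sin(2 * np.pi * f0 * t + rng.uniform(0, 2 * np.pi))
    return s / np.max(np.abs(s))

# N = 3 sources, stereo mixtures, sparse filters of length L = 256
rng = np.random.default_rng(1)
N, L, T = 3, 256, 4 * 8000
filters = np.zeros((2, N, L))
for m in range(2):
    for j in range(N):
        filters[m, j, rng.choice(L, 8, replace=False)] = rng.standard_normal(8)
sources = np.stack([synthetic_source(T, seed=j) for j in range(N)])
mixtures = np.stack([sum(np.convolve(filters[m, j], sources[j])[:T] for j in range(N))
                     for m in range(2)])
print(mixtures.shape)   # (2, 32000)
```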
61

Results: Effect of STFT window size

[Plots: filter recovery performance vs STFT window size, for CR-NB and CR-WB]

• The NB approach gets better as the window size increases relative to the filter length

• The WB approach performs better when the window size is less than or equal to the filter length

62

Results: Effect of clustering threshold

[Plots: filter recovery performance vs clustering threshold, for CR-NB and CR-WB, with window sizes 1024 and 128]

• The NB approach degrades as the threshold increases, due to the smaller number of observations

• The WB approach performs better as the threshold increases, due to more accurate CR
