Robust Music Signal Separation Based on Supervised Nonnegative Matrix Factorization with Prevention of Basis Sharing Daichi Kitamura, Hiroshi Saruwatari, Kosuke Yagi, Kiyohiro Shikano ( Nara Institute of Science and Technology, Japan ) Yu Takahashi, Kazunobu Kondo ( Yamaha Corporation, Japan ) IEEE International Symposium on Signal Processing and Information Technology December 12-15, 2013 - Athens, Greece Session T.B3: Speech – Audio - Music
1. Robust Music Signal Separation Based on Supervised Nonnegative Matrix Factorization with Prevention of Basis Sharing
Daichi Kitamura, Hiroshi Saruwatari, Kosuke Yagi, Kiyohiro Shikano (Nara Institute of Science and Technology, Japan)
Yu Takahashi, Kazunobu Kondo (Yamaha Corporation, Japan)
IEEE International Symposium on Signal Processing and Information Technology, December 12-15, 2013, Athens, Greece
Session T.B3: Speech - Audio - Music
2. Outline
1. Research background
2. Conventional method
   Nonnegative matrix factorization
   Supervised nonnegative matrix factorization
   Problem of conventional method: basis sharing
3. Proposed method
   Penalized supervised nonnegative matrix factorization
   Orthogonality penalty
   Maximum-divergence penalty
4. Experiments
   Two-source case
   Four-source case
5. Conclusions
4. Background
Sound signal separation extracts a target source from an observed mixed signal, for example speech from noise or a specific instrumental sound from a mixture. Separation is typically performed in the time-frequency domain.
[Figure: spectrogram (time vs. frequency) of a mixture of a first tone and a second tone; separation extracts one of them.]
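As a minimal sketch of what "the time-frequency domain" means here, the following computes an amplitude spectrogram with a short-time Fourier transform. The function name `magnitude_spectrogram`, the frame length, the hop size, and the two-tone test signal are assumptions made for illustration; the slides do not state the analysis settings that were used.

```python
import numpy as np

def magnitude_spectrogram(signal, frame_len=256, hop=128):
    """Return an amplitude spectrogram of shape (frequency bins, time frames)."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    # Slice the signal into overlapping, windowed frames (one frame per column).
    frames = np.stack(
        [signal[i * hop : i * hop + frame_len] * window for i in range(n_frames)],
        axis=1,
    )
    # One-sided FFT of every frame; the magnitude is nonnegative by construction.
    return np.abs(np.fft.rfft(frames, axis=0))

# Hypothetical mixture of a "first tone" (500 Hz) and a "second tone" (1500 Hz).
fs = 8000
t = np.arange(fs) / fs
mixture = np.sin(2 * np.pi * 500 * t) + np.sin(2 * np.pi * 1500 * t)
Y = magnitude_spectrogram(mixture)
print(Y.shape)
```

Each column of `Y` is the spectrum of one short frame, so the two tones show up as two horizontal ridges. A nonnegative matrix of this kind is what the factorization methods on the following slides operate on.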
6. Nonnegative matrix factorization [Lee, et al., 2012]
Nonnegative matrix factorization (NMF) is a sparse representation algorithm that can extract significant features from the observed matrix. However, it is difficult to cluster the extracted bases into specific sources.
[Figure: the observed matrix (amplitude spectrogram, frequency bins by time frames) is approximated by the product of a basis matrix (spectral patterns, frequency bins by bases) and an activation matrix (time-varying gains, bases by time frames).]
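A minimal sketch of NMF, assuming the Euclidean distance as the cost and the classic multiplicative update rules. The names `Y` (observed matrix), `F` (basis matrix), and `G` (activation matrix) are assumptions, since the symbols on the slide did not survive extraction, and the random test matrix is purely illustrative.

```python
import numpy as np

def nmf(Y, n_bases, n_iter=200, seed=0, eps=1e-12):
    """Approximate a nonnegative matrix Y by F @ G with multiplicative updates."""
    rng = np.random.default_rng(seed)
    n_freq, n_frames = Y.shape
    F = rng.random((n_freq, n_bases)) + eps    # basis matrix (spectral patterns)
    G = rng.random((n_bases, n_frames)) + eps  # activation matrix (time-varying gains)
    for _ in range(n_iter):
        # Each update multiplies by a nonnegative ratio, so F and G stay nonnegative.
        F *= (Y @ G.T) / (F @ G @ G.T + eps)
        G *= (F.T @ Y) / (F.T @ F @ G + eps)
    return F, G

# Hypothetical rank-2 nonnegative "spectrogram".
rng = np.random.default_rng(1)
Y_demo = rng.random((20, 2)) @ rng.random((2, 30))
F, G = nmf(Y_demo, n_bases=2)
print(np.linalg.norm(Y_demo - F @ G))
```

Nothing in these updates ties a particular column of `F` to a particular source, which is why clustering the bases afterwards is difficult.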
7. Supervised NMF (SNMF) [Smaragdis, et al., 2007]
SNMF utilizes sample sounds of the target.
Training process: construct the supervised basis matrix (spectral dictionary) of the target sound from sample sounds of the target signal, e.g., a musical scale.
Separation process: with the supervised basis matrix fixed, optimize the remaining matrices to decompose the mixed signal into the target signal and the other signal.
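The separation process can be sketched as follows, assuming a Euclidean cost and multiplicative updates. The names are assumptions: `F` is the fixed supervised basis matrix, `G` its activations, and `H`, `U` the bases and activations of the other signal. In practice `F` would come from running NMF on the sample sounds; here a random matrix stands in for a trained dictionary.

```python
import numpy as np

def snmf_separate(Y, F, n_other, n_iter=200, seed=0, eps=1e-12):
    """Decompose Y as F @ G + H @ U while keeping the supervised bases F fixed."""
    rng = np.random.default_rng(seed)
    n_freq, n_frames = Y.shape
    G = rng.random((F.shape[1], n_frames)) + eps  # activations of the fixed bases
    H = rng.random((n_freq, n_other)) + eps       # other bases (optimized)
    U = rng.random((n_other, n_frames)) + eps     # activations of the other bases
    for _ in range(n_iter):
        # F is never updated; only G, H and U are optimized.
        G *= (F.T @ Y) / (F.T @ (F @ G + H @ U) + eps)
        H *= (Y @ U.T) / ((F @ G + H @ U) @ U.T + eps)
        U *= (H.T @ Y) / (H.T @ (F @ G + H @ U) + eps)
    return F @ G, H @ U  # estimated target and estimated other signal

# Hypothetical data: a stand-in dictionary and a mixture of two components.
rng = np.random.default_rng(1)
F_demo = rng.random((16, 3))
Y_demo = F_demo @ rng.random((3, 40)) + rng.random((16, 2)) @ rng.random((2, 40))
target_est, other_est = snmf_separate(Y_demo, F_demo, n_other=2)
print(np.linalg.norm(Y_demo - target_est - other_est))
```

Note that the cost only measures how well the sum `F @ G + H @ U` fits the mixture. It does not say which of the two terms should explain a given component.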
8. Problem of SNMF: basis sharing
There is no constraint between the supervised basis matrix and the other basis matrix, because the cost function is defined only as the distance between the mixed signal and its reconstruction. The other bases may therefore also learn the target spectral patterns. When they do, part of the target signal is assigned to the estimated other signals, and the estimated target signal loses some of the target components.
9. Basis sharing problem: example of SNMF
[Figure: spectrograms of the mixed signal, the target signal only (oracle), and the signal separated by SNMF.]
10. Basis sharing problem: example of SNMF
[Figure: spectrograms of the target signal only (oracle), the signal separated by SNMF, and the mixed signal.]
11. Basis sharing problem: example of SNMF
[Figure: spectrogram of the separated (estimated) signal obtained by SNMF.]
The estimated signal loses some of the target components because of the basis sharing problem.
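13. Proposed method: penalized SNMF (PSNMF)
In SNMF, the other basis matrix may contain the same spectral patterns as the supervised basis matrix (the basis sharing problem). We propose to make the other basis matrix as different as possible from the supervised basis matrix by introducing a penalty term in the cost function.
[Figure: the mixed signal is decomposed into the target signal, whose supervised basis matrix is fixed, and the other signal, whose basis matrix is optimized to be as different as possible from the supervised one.]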
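14. Decomposition model and cost function
Decomposition model: the mixed signal is approximated by the sum of a target component built on the supervised basis matrix (fixed) and an other-signal component.
Cost function in SNMF: the divergence between the observed matrix and the decomposition model.
Generalized divergence function: β-divergence [Eguchi, et al., 2001]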
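For reference, a minimal sketch of the β-divergence in its commonly used parameterization, where β = 2 gives half the squared Euclidean distance, β = 1 the generalized Kullback-Leibler divergence, and β = 0 the Itakura-Saito divergence. The function name `beta_divergence` and the small offset `eps` are assumptions, and the exact convention used on the slide is not preserved in this transcript.

```python
import numpy as np

def beta_divergence(Y, X, beta, eps=1e-12):
    """Sum of element-wise beta-divergences between nonnegative arrays Y and X."""
    Y = np.asarray(Y, dtype=float) + eps  # eps guards against log(0) and division by 0
    X = np.asarray(X, dtype=float) + eps
    if beta == 0:  # Itakura-Saito divergence (limit beta -> 0)
        ratio = Y / X
        return float(np.sum(ratio - np.log(ratio) - 1))
    if beta == 1:  # generalized Kullback-Leibler divergence (limit beta -> 1)
        return float(np.sum(Y * np.log(Y / X) - Y + X))
    # General case; beta = 2 reduces to 0.5 * (Y - X) ** 2.
    num = Y**beta + (beta - 1) * X**beta - beta * Y * X ** (beta - 1)
    return float(np.sum(num / (beta * (beta - 1))))

for b in (0, 1, 2):
    print(b, beta_divergence([3.0], [1.0], b))
```

All three cases are zero when the two arguments coincide and positive otherwise, which is what lets a single parameter select the distance measure.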
15. Decomposition model and cost function
The decomposition model and the fixed supervised basis matrix are the same as in SNMF. The cost function in PSNMF is the cost function in SNMF plus a penalty term. We propose two types of penalty terms.
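The structure described on this slide can be written out as follows. The symbols are assumptions, because the original equations are not preserved in this transcript: Y is the observed matrix, F the fixed supervised basis matrix with activations G, H the other basis matrix with activations U, D_β the generalized divergence, μ a penalty weight, and P(F, H) a generic penalty that grows as H resembles F.

```latex
\begin{align}
  Y &\simeq FG + HU, \\
  \mathcal{J}_{\mathrm{SNMF}} &= D_{\beta}\!\left(Y \,\middle\|\, FG + HU\right), \\
  \mathcal{J}_{\mathrm{PSNMF}} &= D_{\beta}\!\left(Y \,\middle\|\, FG + HU\right) + \mu\, P(F, H).
\end{align}
```

With μ = 0 the penalized cost reduces to the SNMF cost, so the penalty is the only term that distinguishes the two methods.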
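16. Orthogonality penalty
The orthogonality penalty optimizes the other basis matrix so that the inner product between the supervised basis matrix and the other basis matrix is minimized. If the other basis matrix includes a basis similar to a supervised basis, the penalty becomes larger. All the bases are normalized to one, and a weighting parameter controls the strength of the penalty.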
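A minimal sketch of such a penalty, assuming it is the squared Frobenius norm of the matrix of inner products between the two sets of bases and that "normalized to one" means unit Euclidean norm per basis. Both choices, along with the names `F`, `H`, and `orthogonality_penalty`, are assumptions for illustration; the slide's equation is not preserved.

```python
import numpy as np

def normalize_columns(B, eps=1e-12):
    """Scale every basis (column) to unit Euclidean norm."""
    return B / (np.linalg.norm(B, axis=0, keepdims=True) + eps)

def orthogonality_penalty(F, H):
    """Sum of squared inner products between every basis in F and every basis in H."""
    inner = normalize_columns(F).T @ normalize_columns(H)
    return float(np.sum(inner**2))

# Hypothetical supervised bases occupying frequency bins 0 and 1.
F_demo = np.array([[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]])
H_shared = np.array([[2.0], [0.0], [0.0]])    # same pattern as the first supervised basis
H_disjoint = np.array([[0.0], [0.0], [3.0]])  # no overlap with the supervised bases
print(orthogonality_penalty(F_demo, H_shared), orthogonality_penalty(F_demo, H_disjoint))
```

Because every entry is nonnegative, the penalty is zero only when the other bases have no spectral overlap with the supervised bases. Normalizing first prevents the penalty from being reduced merely by shrinking the bases.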
17. Maximum-divergence penalty
The maximum-divergence penalty optimizes the other basis matrix so that the divergence between the supervised basis matrix and the other basis matrix is maximized. If the other basis matrix includes a basis similar to a supervised basis, the divergence becomes smaller. All the bases are normalized to one. The penalty is controlled by a weighting parameter and a sensitivity parameter.
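One plausible form of such a penalty is sketched below: a decaying exponential of the divergence, so that minimizing the penalty maximizes the divergence. This functional form is an assumption, not taken from the slides. So are the use of the generalized Kullback-Leibler divergence, the sum over all pairs of bases, the sum-to-one normalization, and the parameter names `weight` and `sensitivity`.

```python
import numpy as np

def max_divergence_penalty(F, H, weight=1.0, sensitivity=1.0, eps=1e-12):
    """weight * exp(-sensitivity * D(F, H)), with D summed over all basis pairs."""
    # Normalize each basis to sum to one; eps keeps the logarithm finite.
    Fn = F / (F.sum(axis=0, keepdims=True) + eps) + eps
    Hn = H / (H.sum(axis=0, keepdims=True) + eps) + eps
    f = Fn[:, :, None]  # shape (frequency, supervised bases, 1)
    h = Hn[:, None, :]  # shape (frequency, 1, other bases)
    divergence = np.sum(f * np.log(f / h) - f + h)  # generalized KL over all pairs
    return float(weight * np.exp(-sensitivity * divergence))

# Hypothetical bases: one supervised, one similar, one different.
F_demo = np.array([[0.7], [0.2], [0.1]])
H_similar = np.array([[0.6], [0.3], [0.1]])
H_different = np.array([[0.1], [0.2], [0.7]])
print(max_divergence_penalty(F_demo, H_similar), max_divergence_penalty(F_demo, H_different))
```

In this sketch the weight sets the maximum size of the penalty, and the sensitivity sets how quickly it vanishes once the bases start to differ.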
18. Derivation of optimal variables in PSNMF
The optimal variables are derived with the auxiliary function method, an optimization scheme that uses an upper-bound function. We design an auxiliary function for each of the two penalized cost functions and minimize the original cost functions indirectly by minimizing the auxiliary functions.
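The general principle behind the auxiliary function method can be stated as follows. The notation is an assumption chosen for this note: θ denotes the variables being optimized, θ̃ the auxiliary variables, J the original cost, and J⁺ its upper bound.

```latex
\begin{align}
  \mathcal{J}(\theta) &= \min_{\tilde{\theta}} \mathcal{J}^{+}(\theta, \tilde{\theta}), \\
  \tilde{\theta}^{(n+1)} &= \arg\min_{\tilde{\theta}} \mathcal{J}^{+}\big(\theta^{(n)}, \tilde{\theta}\big), \\
  \theta^{(n+1)} &= \arg\min_{\theta} \mathcal{J}^{+}\big(\theta, \tilde{\theta}^{(n+1)}\big), \\
  \mathcal{J}\big(\theta^{(n+1)}\big)
    &\le \mathcal{J}^{+}\big(\theta^{(n+1)}, \tilde{\theta}^{(n+1)}\big)
     \le \mathcal{J}^{+}\big(\theta^{(n)}, \tilde{\theta}^{(n+1)}\big)
     = \mathcal{J}\big(\theta^{(n)}\big).
\end{align}
```

The last line shows why alternating the two minimizations never increases the original cost, even though that cost is never minimized directly.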
19. Derivation of optimal variables in PSNMF
The second and third terms become convex or concave functions depending on the value of the divergence parameter β. Convex terms are bounded with Jensen's inequality; concave terms are bounded with the tangent line inequality.
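The two inequalities take the following standard forms. The notation is an assumption for this note: φ is a convex function, ψ a differentiable concave function, α_k are positive weights that sum to one, and x̃ is the point of tangency. Both α_k and x̃ play the role of auxiliary variables.

```latex
\begin{align}
  \phi\Big(\sum_{k} x_{k}\Big)
    &\le \sum_{k} \alpha_{k}\, \phi\Big(\frac{x_{k}}{\alpha_{k}}\Big),
    \qquad \alpha_{k} > 0,\ \sum_{k} \alpha_{k} = 1, \\
  \psi(x) &\le \psi(\tilde{x}) + \psi'(\tilde{x})\,(x - \tilde{x}).
\end{align}
```

Equality holds in the first bound when α_k = x_k / Σ_j x_j (for positive x_k) and in the second when x̃ = x. Applied to a sum over bases, the first bound splits a function of the sum into separate terms per basis, which is what makes closed-form updates possible.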
20. Derivation of optimal variables in PSNMF
This term is always a convex function, so it is bounded with Jensen's inequality, which introduces an auxiliary variable.
21. Derivation of optimal variables in PSNMF
The auxiliary functions are designed from these bounds, and the update rules for optimization are obtained by setting the partial derivatives of the auxiliary functions to zero.
22. Update rules for optimization of PSNMF
Update rules with orthogonality penalty
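The update equations from this slide are not preserved in the transcript. As an illustration only, the sketch below gives multiplicative updates for the special case of a Euclidean cost with a squared inner-product penalty, obtained by splitting the gradient into positive and negative parts. These are not the authors' update rules. All names (`F`, `G`, `H`, `U`, `weight`) are assumptions, and the basis normalization mentioned on the penalty slide is omitted for brevity.

```python
import numpy as np

def psnmf_cost(Y, F, G, H, U, weight):
    """0.5 * ||Y - FG - HU||^2 + 0.5 * weight * ||F^T H||^2 (Frobenius norms)."""
    X = F @ G + H @ U
    return float(0.5 * np.sum((Y - X) ** 2) + 0.5 * weight * np.sum((F.T @ H) ** 2))

def psnmf_ortho(Y, F, n_other, weight=1.0, n_iter=200, seed=0, eps=1e-12):
    """Penalized SNMF sketch: F is fixed, H is pushed away from the span of F."""
    rng = np.random.default_rng(seed)
    n_freq, n_frames = Y.shape
    G = rng.random((F.shape[1], n_frames)) + eps
    H = rng.random((n_freq, n_other)) + eps
    U = rng.random((n_other, n_frames)) + eps
    for _ in range(n_iter):
        G *= (F.T @ Y) / (F.T @ (F @ G + H @ U) + eps)
        # The penalty gradient weight * F F^T H is nonnegative, so it is added to the
        # denominator and shrinks the entries of H that overlap with F.
        H *= (Y @ U.T) / ((F @ G + H @ U) @ U.T + weight * (F @ (F.T @ H)) + eps)
        U *= (H.T @ Y) / (H.T @ (F @ G + H @ U) + eps)
    return G, H, U

# Hypothetical data: a stand-in dictionary and a mixture of two components.
rng = np.random.default_rng(1)
F_demo = rng.random((16, 3))
Y_demo = F_demo @ rng.random((3, 40)) + rng.random((16, 2)) @ rng.random((2, 40))
G, H, U = psnmf_ortho(Y_demo, F_demo, n_other=2, weight=0.5)
print(psnmf_cost(Y_demo, F_demo, G, H, U, weight=0.5))
```

Setting `weight=0` recovers plain SNMF updates in this sketch, which makes it easy to compare the two behaviours on the same data.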
23. Update rules for optimization of PSNMF
Update rules with maximum-divergence penalty
25. Experimental conditions
We produced four melodies using a MIDI synthesizer. As the supervision (training) sound, we used the same MIDI sounds of the target instruments, containing two octaves of notes that cover all the notes of the target signal. We evaluated a two-source case and a four-source case: there are 12 combinations in the two-source case and 4 patterns in the four-source case.
26. Experimental conditions
Observed signal: 2 or 4 signals mixed at the same power
Training signal: the same MIDI sounds of the target signal, containing two octaves of notes
Divergence criteria: all combinations of the divergence parameter values
Number of bases: supervised bases: 100; other bases: 50
Parameters: experimentally determined
Methods: conventional SNMF, proposed PSNMF
Evaluation score [Vincent, 2006]: source-to-distortion ratio (SDR), which indicates the total quality of the separated signal
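A minimal sketch of two of the conditions above: mixing signals at the same power and scoring an estimate in decibels. The `simple_sdr` function is a simplified energy ratio, not the full SDR of [Vincent, 2006], which first projects the estimate onto the reference to separate allowed distortions. The function names and the sinusoidal test signals are assumptions.

```python
import numpy as np

def scale_to_power(x, power=1.0):
    """Rescale a signal so that its mean squared amplitude equals `power`."""
    return x * np.sqrt(power / np.mean(x**2))

def simple_sdr(reference, estimate):
    """Energy ratio in dB between the reference and the estimation error."""
    error = reference - estimate
    return float(10 * np.log10(np.sum(reference**2) / np.sum(error**2)))

# Hypothetical target and interfering signals, mixed at the same power.
t = np.arange(8000) / 8000
target = scale_to_power(np.sin(2 * np.pi * 440 * t))
other = scale_to_power(0.3 * np.sin(2 * np.pi * 660 * t))
mixture = target + other

print(simple_sdr(target, mixture))       # unprocessed mixture used as the estimate
print(simple_sdr(target, 0.9 * target))  # estimate that only loses 10% amplitude
```

Using the unprocessed mixture as the estimate gives a baseline against which any improvement from separation can be read.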
29. Example of separation (Cello & Oboe)
[Figure: spectrograms of the mixed signal, the cello signal, the signal separated by SNMF, and the signal separated by PSNMF (Ortho.).]
30. Conclusions
Conventional supervised NMF has a basis sharing problem that degrades the separation performance. We propose adding a penalty term to the cost function that forces the other bases to become uncorrelated with the supervised bases. Penalized supervised NMF can achieve high separation accuracy.
Thank you for your attention!