Music Information Retrieval in Polyphonic Mixtures Steve Tjoa MIR Workshop CCRMA, Stanford University June 27, 2013


Music Information Retrieval in Polyphonic Mixtures

Steve Tjoa

MIR Workshop
CCRMA, Stanford University

June 27, 2013

Quick Review

What are the three main components of any classification system?

What are some useful features for MIR?

What are some problems and applications addressed by MIR?

Music Transcription

From this song... get the “piano roll”:

[Figure: piano roll of “Beatles - No Reply”. Axes: Time (ticks) and Pitch, with pitch labels from e3 to d6.]


Music Source Separation

Isolate, amplify, or suppress a musical voice/instrument.

Example: From these beats... isolate the kick drum and snare drum.


A Really Special Tool

Nonnegative Matrix Factorization (NMF):

Given X nonnegative, find W and H, both nonnegative, that minimize some distance d(X, WH).

Easy! And it works.

Meaningful to humans.

Widely used.


Why NMF?

The energy of musical events is nonnegative.

Brief Refresher: Matrix Multiplication

$$\begin{bmatrix} 1 & 2 \end{bmatrix} \begin{bmatrix} a \\ b \end{bmatrix} = a + 2b$$

$$\begin{bmatrix} 3 \\ 4 \end{bmatrix} \begin{bmatrix} a & b & c \end{bmatrix} = \begin{bmatrix} 3a & 3b & 3c \\ 4a & 4b & 4c \end{bmatrix}$$

$$w \begin{bmatrix} a & b & c \end{bmatrix} = \begin{bmatrix} aw & bw & cw \end{bmatrix}$$

$$\begin{bmatrix} 3 \\ 4 \end{bmatrix} h = \begin{bmatrix} 3h \\ 4h \end{bmatrix}$$
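The four products in this refresher can be checked numerically. The sketch below uses NumPy with arbitrary example values for a, b, c, w, and h (these numbers are illustrative and do not come from the slides). The column-times-row case is the one that matters for NMF: one column of W times one row of H gives a rank-one matrix.

```python
import numpy as np

# Arbitrary example values (assumptions for illustration only).
a, b, c = 5.0, 7.0, 11.0
w, h = 2.0, 10.0

row = np.array([[1.0, 2.0]])           # 1 x 2
col = np.array([[a], [b]])             # 2 x 1
inner = row @ col                      # 1 x 1, equals a + 2b

col34 = np.array([[3.0], [4.0]])       # 2 x 1
abc = np.array([[a, b, c]])            # 1 x 3
outer = col34 @ abc                    # 2 x 3, rows are 3*[a b c] and 4*[a b c]

scaled_row = w * abc                   # [aw bw cw]
scaled_col = col34 * h                 # [3h; 4h]

print(inner, outer, scaled_row, scaled_col, sep="\n")
```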


Nonnegative Matrix Factorization

Top right: X. Left: W. Bottom: H. Three piano notes:

[Figure: X plotted as Frequency versus Time, with components 1, 2, and 3 marked in W and H.]

Top right: X. Left: W. Bottom: H. Kick and snare:

[Figure: Spectrogram X plotted as Frequency versus Time, with components 1 and 2 marked in W and H.]

NMF Algorithms

Multiplicative update rules:

$$W \leftarrow W \odot \frac{X H^T}{W H H^T}$$

$$H \leftarrow H \odot \frac{W^T X}{W^T W H}$$

Here $\odot$ and the fraction bar denote elementwise multiplication and division.

See [Lee and Seung, NIPS 2001].

Easy to implement!

Python:

    # X, W, H are nonnegative NumPy arrays: @ is the matrix product, * and / are elementwise.
    for _ in range(maxiter):
        W = W * (X @ H.T) / (W @ H @ H.T)
        H = H * (W.T @ X) / (W.T @ W @ H)

Matlab:

    for iter = 1:maxiter
        W = W .* (X*H') ./ (W*H*H');
        H = H .* (W'*X) ./ (W'*W*H);
    end
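For a version that runs end to end, the loop needs an initialization and some protection against division by zero. The sketch below is one possible way to wrap the update rules, not the workshop's own code: the function name nmf, the random initialization, the eps guard, and the toy matrix X are all assumptions made for illustration.

```python
import numpy as np

def nmf(X, K, maxiter=200, eps=1e-9, seed=0):
    """Factor a nonnegative (M x N) matrix X into W (M x K) and H (K x N).

    Uses the multiplicative updates for the Euclidean distance.
    eps is an added guard that keeps the denominators strictly positive.
    """
    rng = np.random.default_rng(seed)
    M, N = X.shape
    W = rng.random((M, K)) + eps   # random nonnegative initialization (an assumption)
    H = rng.random((K, N)) + eps
    for _ in range(maxiter):
        W *= (X @ H.T) / (W @ H @ H.T + eps)
        H *= (W.T @ X) / (W.T @ W @ H + eps)
    return W, H

# Toy data: a nonnegative matrix that is exactly rank 2.
rng = np.random.default_rng(1)
X = rng.random((6, 2)) @ rng.random((2, 40))

W, H = nmf(X, K=2)
rel_err = np.linalg.norm(X - W @ H) / np.linalg.norm(X)
print(f"relative reconstruction error: {rel_err:.4f}")
```

Because every factor in the updates is nonnegative, W and H stay nonnegative without any explicit projection step.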


Example: Source Separation

kick and snare:

[kick drum] and [snare drum]

oboe and horn:

Duan et al.: [oboe] and [horn]

Wang et al.: [oboe] and [horn]

Tjoa and Liu: [oboe] and [horn]

Vivaldi, Winter, Four Seasons:

[solo] and [accompaniment]


Example: Instrument Recognition

Use NMF to identify the instruments in a musical signal. Observe these atoms:

[Figure: Spectrogram plotted as Frequency versus Time, with the atoms of components 1 and 2.]

Filter the temporal atoms from NMF [Tjoa and Liu, 2010]:

[Figure: filtered temporal atoms for ta = 0.010, 0.020, 0.040, 0.080, 0.160, 0.320, 0.640, and 1.280, plotted against Time (seconds), 0 to 5.]

Use a support vector machine (SVM) to classify the processed spectral and temporal atoms.

Feature Vector of Kick Drum

[Figure: panels labeled Input Atom, n = 1.2, n = 1.5, n = 2.0, n = 3.0, and max. Axes: Attack Time (seconds), 0.020 to 1.000; Time (seconds), 0.00 to 0.40.]

Feature Vector of Snare Drum

[Figure: panels labeled Input Atom, n = 1.2, n = 1.5, n = 2.0, n = 3.0, and max. Axes: Attack Time (seconds), 0.020 to 1.000; Time (seconds), 0.00 to 0.40.]

Feature Vector of Trumpet

[Figure: panels labeled Input Atom, n = 1.2, n = 1.5, n = 2.0, n = 3.0, and max. Axes: Attack Time (seconds), 0.020 to 1.000; Time (seconds), 0.0 to 2.0.]

Feature Vector of Violin

[Figure: panels labeled Input Atom, n = 1.2, n = 1.5, n = 2.0, n = 3.0, and max. Axes: Attack Time (seconds), 0.020 to 1.000; Time (seconds), 0 to 6.]

Results: Isolated Instrument Recognition

Experiments on isolated instrument sounds:

Accuracy: 92.3%

These results reflect state-of-the-art performance for isolated instrument recognition among as many as 24 classes.

[Figure: confusion matrix over the 24 classes (Bassoon, Clarinet, Flute, Oboe, Saxophone, Horn, Trombone, Trumpet, Tuba, Cello, Viola, Violin, Cello Pizz, Viola Pizz, Violin Pizz, Glockenspiel, Guitar, Marimba, Piano, Xylophone, Kick, Snare, Timpani, Toms), with a color scale from 0.01 to 1.00.]

Results: Solo Melodic Phrases

Instrument classifications. One decision per signal. Accuracy: 96.2%.

[Figure: confusion matrix over bassoon, clarinet, flute, oboe, horn, trombone, trumpet, tuba, cello, viola, and violin, with a color scale from 0.00 to 1.00.]

Family classifications. One decision per signal. Accuracy: 97.4%.

[Figure: confusion matrix over wind, brass, and strings, with a color scale from 0.00 to 1.00.]

Current and Future Work

Existing algorithms cannot handle “complicated” music.


Related work:

smoothness

harmonicity

statistical priors

Sparse Coding

What if you already have a large dictionary?

$$\min_s d(x, As)$$

Solution: Impose sparsity on s.

Benefits: guaranteed spectral structure; labels already known.


Related work:

matching pursuit (MP)

orthogonal matching pursuit (OMP)

basis pursuit (BP)

Disadvantages:

Complexity that is linear in the dictionary size.

Neither fast nor scalable.

Example: Orthogonal Matching Pursuit

OMP [Pati et al., 1993]:

Input: $x \in \mathbb{R}^M$; $A = [a_1, a_2, \ldots, a_K] \in \mathbb{R}^{M \times K}$ s.t. $\|a_k\|_2 = 1$ for all $k$.

Output: $s \in \mathbb{R}^K$

Initialize: $S \leftarrow \emptyset$; $s \leftarrow 0$; $r \leftarrow x$; $\epsilon > 0$.

While $\|r\| > \epsilon$:

1. $k \leftarrow \arg\max_j a_j^T r$

2. $S \leftarrow S \cup \{k\}$

3. Solve for $\{s_j \mid j \in S\}$: $\min_{s_j \mid j \in S} \| x - \sum_{j \in S} a_j s_j \|$

4. $r \leftarrow x - As$

Return $s$.
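The OMP steps above can be sketched in a few lines of NumPy. This is a minimal illustration rather than a reference implementation: the function name omp, the max_atoms cap, and the early exit when an atom is selected twice are assumptions added so the loop always terminates. Step 1 follows the slide and maximizes the signed correlation; many general-purpose OMP variants use its absolute value instead.

```python
import numpy as np

def omp(x, A, eps=1e-6, max_atoms=None):
    """Orthogonal matching pursuit over a dictionary A with unit-norm columns."""
    M, K = A.shape
    max_atoms = K if max_atoms is None else max_atoms
    s = np.zeros(K)
    r = x.astype(float).copy()
    S = []                                  # indices of selected atoms
    while np.linalg.norm(r) > eps and len(S) < max_atoms:
        k = int(np.argmax(A.T @ r))         # step 1: atom most correlated with the residual
        if k in S:                          # added guard: no new atom can be found
            break
        S.append(k)                         # step 2
        coef, *_ = np.linalg.lstsq(A[:, S], x, rcond=None)   # step 3: least squares on S
        s[:] = 0.0
        s[S] = coef
        r = x - A @ s                       # step 4
    return s

# Toy usage: a trivial dictionary (identity) and a 2-sparse signal.
A = np.eye(5)
x = np.array([0.0, 2.0, 0.0, 3.0, 0.0])
s = omp(x, A)
print(s)
```

The cost of step 1 is one inner product per dictionary atom on every iteration, which is the linear-in-dictionary-size complexity that the slides list as a disadvantage.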

Proposed Algorithm: Approximate Matching Pursuit

AMP [Tjoa and Liu]:

Input: $x \in \mathbb{R}^M$; $A = [a_1, a_2, \ldots, a_K] \in \mathbb{R}^{M \times K}$ s.t. $\|a_k\|_2 = 1$ for all $k$.

Output: $s \in \mathbb{R}^K$

Initialize: $S \leftarrow \emptyset$; $s \leftarrow 0$; $r \leftarrow x$; $\epsilon > 0$.

While $\|r\| > \epsilon$:

1. Find any $k$ such that $a_k$ and $r$ are near neighbors.

2. $S \leftarrow S \cup \{k\}$

3. Solve for $\{s_j \mid j \in S\}$: $\min_{s_j \mid j \in S} \| x - \sum_{j \in S} a_j s_j \|$

4. $r \leftarrow x - As$

Return $s$.

Locality Sensitive Hashing

Idea: Hash nearby points into the same bin.
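One common way to realize this idea is to hash a vector by the signs of a few random projections, so that vectors pointing in similar directions tend to share a key. The slides do not say which hash family is used, so the sketch below is only an assumed example; make_hasher, build_table, the number of bits, and the random dictionary are all hypothetical.

```python
import numpy as np

def make_hasher(dim, n_bits, seed=0):
    """Return a function that maps a vector to a tuple of n_bits sign bits."""
    rng = np.random.default_rng(seed)
    planes = rng.standard_normal((n_bits, dim))   # one random hyperplane per bit

    def hash_vec(v):
        bits = (planes @ v) >= 0                  # which side of each hyperplane
        return tuple(int(b) for b in bits)        # hashable dictionary key

    return hash_vec

def build_table(A, hash_vec):
    """Group the column indices of A by their hash key."""
    table = {}
    for k in range(A.shape[1]):
        table.setdefault(hash_vec(A[:, k]), []).append(k)
    return table

# Toy usage: hash 100 random unit-norm atoms of dimension 16 into 2^8 bins.
rng = np.random.default_rng(1)
A = rng.standard_normal((16, 100))
A /= np.linalg.norm(A, axis=0)
hash_vec = make_hasher(dim=16, n_bits=8)
table = build_table(A, hash_vec)

# Candidate near neighbors of a query are the atoms stored in the query's bin.
query = A[:, 0]
candidates = table.get(hash_vec(query), [])
print(candidates)
```

Looking up a bin costs one hash computation instead of a scan over every atom, which is the kind of shortcut that a near-neighbor search in step 1 of AMP can exploit.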


Experiments: Music Transcription

[Figure: two transcriptions plotted as Pitch (MIDI number), 20 to 110, versus Time (seconds), 0 to 9. Panels: C Major Scale (OMP); C Major Scale (AMP, L=8, k=8).]

[Figure: two transcriptions plotted as Pitch (MIDI number), 20 to 110, versus Time (seconds), 0 to 12. Panels: Debussy Clair de Lune, mm. 1-4 (OMP); Debussy Clair de Lune, mm. 1-4 (AMP, L=8, k=8).]

[Figure: two transcriptions plotted as Pitch (MIDI number), 20 to 110, versus Time (seconds), 0 to 12. Panels: Debussy Clair de Lune, mm. 5-8 (OMP); Debussy Clair de Lune, mm. 5-8 (AMP, L=8, k=8).]

Execution times in seconds.

Song               OMP       AMP (8,8)   AMP (10,10)
C-major scale       81.05     43.63       21.03
Debussy mm. 1-4    118.57     88.45       29.01
Debussy mm. 5-8    123.05    121.73      121.84

Where to Learn More

Conferences:

Int. Society for Music Information Retrieval Conference (ISMIR)

MIR Evaluation Exchange (MIREX)

Int. Computer Music Conference (ICMC)

IEEE Int. Conf. Acoustics, Speech, and Signal Processing (ICASSP)

ACM Multimedia

Journals:

IEEE Trans. Audio, Speech, and Language Processing

Journal of New Music Research

Computer Music Journal

Lab 4

Lab 4: Summary

Summary:

4.1 Separate sources.

4.2 Separate noisy sources.

4.3 Classify separated sources.

Lab 4: Matlab Programming Tips

Pressing the up and down arrows lets you scroll through the command history.

A semicolon at the end of a line simply means “suppress output”.

Type help <command> for instant documentation. For example, help wavread, help plot, help sound. Use help liberally!

Lab 4.1: Source Separation

1. In Matlab: Select File → Set Path. Select “Add with Subfolders”. Select /usr/ccrma/courses/mir2011/lab3skt.

2. As in Lab 1, load the file, listen to it, and plot it.

    [x, fs] = wavread('simpleLoop.wav');  % samples and sampling rate
    sound(x, fs)                          % listen
    t = (0:length(x)-1)/fs;               % time axis in seconds
    plot(t, x)
    xlabel('Time (seconds)')

3. Compute and plot a short-time Fourier transform, i.e., the Fourier transform over consecutive frames of the signal.

    frame_size = 0.100;
    hop = 0.050;
    X = parsesig(x, fs, frame_size, hop);
    imagesc(abs(X(200:-1:1,:)))  % magnitudes of rows 1 to 200 of X, flipped so row 1 is at the bottom

Type help parsesig, help imagesc, and help abs for more information. This step gives you some visual intuition about how sounds (might) overlap.
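To make the framing idea concrete outside Matlab, here is a small NumPy sketch of a short-time Fourier transform. It is not the lab's parsesig function, whose internals are not shown in the slides: the name stft_frames, the Hann window, and the reading of frame_size and hop as durations in seconds are assumptions.

```python
import numpy as np

def stft_frames(x, fs, frame_size=0.100, hop=0.050):
    """Return a (frequency x time) matrix of FFTs over consecutive frames.

    frame_size and hop are assumed to be durations in seconds.
    """
    n = int(round(frame_size * fs))        # samples per frame
    step = int(round(hop * fs))            # samples between frame starts
    window = np.hanning(n)                 # assumed window choice
    starts = range(0, len(x) - n + 1, step)
    frames = np.stack([x[i:i + n] * window for i in starts], axis=1)
    return np.fft.rfft(frames, axis=0)     # one column of spectrum per frame

# Toy usage: one second of a 440 Hz sine sampled at 8 kHz.
fs = 8000
t = np.arange(fs) / fs
x = np.sin(2 * np.pi * 440 * t)
X = stft_frames(x, fs)

n = int(round(0.100 * fs))
peak_bin = int(np.argmax(np.abs(X[:, 0])))
peak_hz = peak_bin * fs / n                # bin spacing is fs / n
print(X.shape, peak_hz)
```

The magnitude of this matrix, abs(X), is the nonnegative input that NMF factors into W and H.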

4. Let’s separate sources!

    K = 2;  % number of sources
    [y, W, H] = sourcesep(x, fs, K);

Type help sourcesep for more information.

5. Plot and listen to the separated signals.

    plot(t, y)
    xlabel('Time (seconds)')
    legend('Signal 1', 'Signal 2')
    sound(y(:,1), fs)
    sound(y(:,2), fs)

Feel free to replace Signal 1 and Signal 2 with Kick and Snare (depending upon which is which).
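The slides do not show how sourcesep turns W and H back into separate signals. One common approach, sketched below as an assumption rather than as the lab's actual method, is to build a soft mask for each component from its share of the model WH and apply that mask to the complex spectrogram. The function name separate_spectrograms and the toy data are hypothetical, and the inverse transform back to audio is left out.

```python
import numpy as np

def separate_spectrograms(X, W, H, eps=1e-9):
    """Split a complex spectrogram X into one spectrogram per NMF component.

    Component k receives the fraction (w_k h_k) / (W H) of every time-frequency bin.
    """
    V = W @ H + eps                                   # full model, kept strictly positive
    return [X * (np.outer(W[:, k], H[k]) / V) for k in range(W.shape[1])]

# Toy usage with random nonnegative factors and random phases.
rng = np.random.default_rng(0)
W = rng.random((5, 2)) + 0.1
H = rng.random((2, 8)) + 0.1
X = (W @ H) * np.exp(1j * rng.uniform(0, 2 * np.pi, (5, 8)))

parts = separate_spectrograms(X, W, H)
print(len(parts), parts[0].shape)
```

Because the masks sum to (almost exactly) one in every bin, the component spectrograms add back up to the original mixture.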

6. Plot the outputs from NMF.

    figure
    plot(W(1:200,:))
    legend('Signal 1', 'Signal 2')
    figure
    plot(H')
    legend('Signal 1', 'Signal 2')

What do you observe from W and H? Does it agree with the sounds you heard?

7. Repeat the earlier steps for different audio files.

125BOUNC-mono.WAV

58BPM.WAV

CongaGroove-mono.wav

Cstrum chord mono.wav

... and more.

Experiment with different values for the number of sources, K. Where does this separation method succeed? Where does it fail?

Lab 4.2: Noise Robustness

Begin with simpleLoop.wav. Then try others.

1. Add noise to the input signal, plot, and listen.

    xn = x + 0.01*randn(length(x),1);  % add white Gaussian noise with standard deviation 0.01
    plot(t, xn)
    sound(xn, fs)
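To put a number on how noisy the signal is, the signal-to-noise ratio can be computed alongside the noise step. The NumPy sketch below mirrors the Matlab line above; the helper names add_noise and snr_db and the test tone are assumptions for illustration, and the SNR measurement itself is not part of the lab handout.

```python
import numpy as np

def add_noise(x, sigma=0.01, seed=0):
    """Add white Gaussian noise with standard deviation sigma to a 1-D signal."""
    rng = np.random.default_rng(seed)
    return x + sigma * rng.standard_normal(len(x))

def snr_db(clean, noisy):
    """Signal-to-noise ratio in decibels, given the clean reference signal."""
    noise = noisy - clean
    return 10 * np.log10(np.sum(clean ** 2) / np.sum(noise ** 2))

# Toy usage: a full-scale 220 Hz sine, one second at 8 kHz.
fs = 8000
t = np.arange(fs) / fs
x = np.sin(2 * np.pi * 220 * t)
xn = add_noise(x)
snr = snr_db(x, xn)
print(f"SNR: {snr:.1f} dB")
```

The same snr_db helper could be applied to separated outputs when a clean reference for each source is available, which gives one way to approach the robustness questions in this lab.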

2. Separate, plot, and listen.

    [yn, Wn, Hn] = sourcesep(xn, fs, K);
    plot(t, yn)
    sound(yn(:,1), fs)
    sound(yn(:,2), fs)

How robust to noise is this separation method? Compared to the noisy input signal, how much noise is left in the output signals? Which output contains more noise? Why?

Lab 4.3: Classification

Follow the K-NN example in Lab 1, but classify the separated signals.

1. As in Lab 1, extract features from each training sample in the kick and snare drum directories.

2. Train a K-NN model using the kick and snare drum samples.

    % One row per training sample: 10 rows of [1 0], then 10 rows of [0 1].
    labels = [[ones(10,1) zeros(10,1)];
              [zeros(10,1) ones(10,1)]];
    model_snare = knn(5, 2, 1, trainingFeatures, labels);
    [voting, model_output] = knnfwd(model_snare, featuresScaled)
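For readers who want to see what a K-NN classifier does internally, here is a minimal NumPy sketch. It does not reproduce the lab's knn and knnfwd helpers: the function knn_predict, the integer class labels (instead of the one-hot rows used above), and the toy feature vectors are all assumptions.

```python
import numpy as np

def knn_predict(train_X, train_y, query_X, k=1):
    """Classify each query row by majority vote among its k nearest training rows."""
    preds = []
    for q in np.atleast_2d(query_X):
        dist = np.linalg.norm(train_X - q, axis=1)        # Euclidean distance to every sample
        nearest = train_y[np.argsort(dist)[:k]]           # labels of the k closest samples
        values, counts = np.unique(nearest, return_counts=True)
        preds.append(values[np.argmax(counts)])           # most frequent label wins
    return np.array(preds)

# Toy usage: two well-separated clusters standing in for kick (0) and snare (1) features.
train_X = np.array([[0.0, 0.0], [0.0, 1.0], [5.0, 5.0], [5.0, 6.0]])
train_y = np.array([0, 0, 1, 1])
queries = np.array([[0.2, 0.1], [5.1, 5.4]])

pred_k1 = knn_predict(train_X, train_y, queries, k=1)
pred_k3 = knn_predict(train_X, train_y, queries, k=3)
print(pred_k1, pred_k3)
```

As in Lab 1, features should be scaled the same way for training and query samples, since K-NN depends directly on distances.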

3. Extract features from the drum signals that you separated in Lab 4.1. Classify them using the K-NN model that you built. Does K-NN accurately classify the separated signals? Repeat for different numbers of separated signals (i.e., the parameter K in NMF).

4. Overseparate the signal using K = 20 or more. For those separated components that are classified as snare, add them together using sum. Then listen to the sum signal. Is it coherent, i.e., does it sound like a single separated drum?

...and more!

If you have another idea that you would like to try out, please ask me!

Please collaborate with a partner. Together, brainstorm your own problems, if you want!


Good luck!
