53
Periodicity detection Juan Pablo Bello MPATE-GE 2623 Music Information Retrieval New York University 1

periodicity.pdf

Embed Size (px)

Citation preview

7/17/2019 periodicity.pdf

http://slidepdf.com/reader/full/periodicitypdf 1/53

Periodicity detection

Juan Pablo Bello

MPATE-GE 2623 Music Information Retrieval

New York University

1

7/17/2019 periodicity.pdf

http://slidepdf.com/reader/full/periodicitypdf 2/53

Periodicity detection

• Formally, a periodic signal

is defined as:

• Detect the fundamental

period/frequency (and

phase)

x(t) = x(t + T 0), ∀t

2

7/17/2019 periodicity.pdf

http://slidepdf.com/reader/full/periodicitypdf 3/53

 Applications

• Pitch tracking -> pitch-synchronous

analysis, transcription

• Beat tracking -> segmentation, beat-

synchronous analysis, understanding rhythm

3

7/17/2019 periodicity.pdf

http://slidepdf.com/reader/full/periodicitypdf 4/53

Difficulties

• Quasi-periodicities

• Multiple periodicities

associated with f0

• transient, temporal variations

and ambiguous events

• Polyphonies: overlap +

harmonicity

4

7/17/2019 periodicity.pdf

http://slidepdf.com/reader/full/periodicitypdf 5/53

Difficulties

• Quasi-periodicities

• Multiple periodicities

associated with f0

• transient, temporal variations

and ambiguous events

• Polyphonies: overlap +

harmonicity

5

7/17/2019 periodicity.pdf

http://slidepdf.com/reader/full/periodicitypdf 6/53

Polyphonies

6

7/17/2019 periodicity.pdf

http://slidepdf.com/reader/full/periodicitypdf 7/53

Difficulties

• Quasi-periodicities

• Multiple periodicities

associated with f0

• transient, temporal variations

and ambiguous events

• Polyphonies: overlap +

harmonicity

7

7/17/2019 periodicity.pdf

http://slidepdf.com/reader/full/periodicitypdf 8/53

 Architecture

 Audio

   t   i  m  e

 f 0  s  m

  o  o   t   h   i  n

  g 

Break intoblocks

Detection function +peak-picking

Integration overtime

8

7/17/2019 periodicity.pdf

http://slidepdf.com/reader/full/periodicitypdf 9/53

Overview of Methods

• DFT

• Autocorrelation

• Spectral Pattern Matching

• Cepstrum

• Spectral Autocorrelation

• YIN

• Auditory model

9

7/17/2019 periodicity.pdf

http://slidepdf.com/reader/full/periodicitypdf 10/53

DFT 

T 0

 f 0

10

7/17/2019 periodicity.pdf

http://slidepdf.com/reader/full/periodicitypdf 11/53

DFT 

T 0

 f 0

11

7/17/2019 periodicity.pdf

http://slidepdf.com/reader/full/periodicitypdf 12/53

DFT 

T 0

 f 0

12

7/17/2019 periodicity.pdf

http://slidepdf.com/reader/full/periodicitypdf 13/53

DFT 

T 0

 f 0

13

7/17/2019 periodicity.pdf

http://slidepdf.com/reader/full/periodicitypdf 14/53

• Cross-product measures similarity across time

• Cross-correlation of two real-valued signals x and y:

• Unbiased (short-term) Autocorrelation Function (ACF):

 Autocorrelation

rxy(l) =   1N 

N −1

n=0

x(n)y(n + l)

l = 0, 1, 2, · · ·   , N  − 1

modulo N 

rx(l) =  1

N  − l

N −1−

n=0

x(n)x(n + l)

l = 0, 1, 2, · · ·   , L− 1

14

7/17/2019 periodicity.pdf

http://slidepdf.com/reader/full/periodicitypdf 15/53

 Autocorrelation

15

7/17/2019 periodicity.pdf

http://slidepdf.com/reader/full/periodicitypdf 16/53

 Autocorrelation

• The short-term ACF can also be computed as:

rx(l) =   1

N −

l real(IFFT  (|X |2))

X  → FFT (x)

x  zero-padded to next power of 2

after (N  + L) − 1

16

7/17/2019 periodicity.pdf

http://slidepdf.com/reader/full/periodicitypdf 17/53

 Autocorrelation

• This is equivalent to the following correlation:

rx(l) =  1

K − l

K −1

k=0

cos(2πlk/K )|X (k)|2

17

7/17/2019 periodicity.pdf

http://slidepdf.com/reader/full/periodicitypdf 18/53

Pattern Matching 

• Comb filtering is a common strategy

• Any other template that realistically fits the magnitude spectrum

• Templates can be specific to instruments (e.g. inharmonicity for piano

analysis).

• Matching strategies vary: correlation, likelihood, distance, etc.

18

7/17/2019 periodicity.pdf

http://slidepdf.com/reader/full/periodicitypdf 19/53

Pattern Matching

19

7/17/2019 periodicity.pdf

http://slidepdf.com/reader/full/periodicitypdf 20/53

• Treat the log magnitude spectrum as if it were a signal -> take its (I)DFT

• Measures rate of change across frequency bands (Bogert et al., 1963)

• Cepstrum -> Anagram of Spectrum (same for quefrency, liftering, etc)

• For a real-valued signal is defined as:

Cepstrum

cx(l) = real(IFFT  (log(|FFT (x)|)))

20

7/17/2019 periodicity.pdf

http://slidepdf.com/reader/full/periodicitypdf 21/53

Cepstrum

21

7/17/2019 periodicity.pdf

http://slidepdf.com/reader/full/periodicitypdf 22/53

Spectral ACF

• Spectral location -> sensitive to quasi-periodicities

• (Quasi-)Periodic Spectrum, Spectral ACF.

• Exploits intervalic information (more stable than locations of partials), while

adding shift-invariance.

rX(lf ) =  1

N  − lf 

N −

1−

lf 

k=0

|X (k)||X (k + lf )|

lf   = 0, 1, 2, · · ·   , L− 1

22

7/17/2019 periodicity.pdf

http://slidepdf.com/reader/full/periodicitypdf 23/53

Spectral ACF

23

7/17/2019 periodicity.pdf

http://slidepdf.com/reader/full/periodicitypdf 24/53

 YIN

• Alternative to the ACF that uses the squared difference function

(deCheveigne, 02):

• For (quasi-)periodic signals, this functions cancel itself at l = 0, l 0 and its

multiples. Zero-lag bias is avoided by normalizing as:

d(l) =

N −1−l

n=0

(x(n)− x(n +  l))2

d̂(l) =

  1   l = 0d(l)/[(1/l)

l

u=1 d(u)] otherwise

24

7/17/2019 periodicity.pdf

http://slidepdf.com/reader/full/periodicitypdf 25/53

 YIN

25

7/17/2019 periodicity.pdf

http://slidepdf.com/reader/full/periodicitypdf 26/53

 Auditory model

 Auditory filterbank 

Inner hair-

cell model

Periodicity

Summarization

Inner hair-

cell model

Periodicity

x(n)

xc(n)

zc(n)

rc(n)

Summary periodicity function26

7/17/2019 periodicity.pdf

http://slidepdf.com/reader/full/periodicitypdf 27/53

 Auditory model

• Auditory filterbank: gammatone filters (Slaney, 93; Klapuri, 06):

27

7/17/2019 periodicity.pdf

http://slidepdf.com/reader/full/periodicitypdf 28/53

 Auditory model

The Equivalent Rectangular Bandwidths (ERB)of the filters:

bc  = 0.108f c + 24.7

f c  = 229×

10ψ/21.4 − 1

ψ = ψmin : (ψmax − ψmin)/F   : ψmax

ψmin/max = 21.4× log10(0.00437f min/max + 1)

F  = number of filters.

28

7/17/2019 periodicity.pdf

http://slidepdf.com/reader/full/periodicitypdf 29/53

 Auditory model

• Beating: interference

between sounds of

frequencies f 1 and f 2 

• Fluctuation of amplitude

envelope of frequency |f 2 - f 1|

• The magnitude of the

beating is determined by

the smaller of the two

amplitudes

29

7/17/2019 periodicity.pdf

http://slidepdf.com/reader/full/periodicitypdf 30/53

• Inner hair-cell (IHC) model:

 Auditory model

30

7/17/2019 periodicity.pdf

http://slidepdf.com/reader/full/periodicitypdf 31/53

 Auditory model

• Sub-band periodicity

analysis using ACF

• Summing across

channels (Summary

 ACF)

• Weighting of the

channels changes the

topology of the SACF

31

7/17/2019 periodicity.pdf

http://slidepdf.com/reader/full/periodicitypdf 32/53

 Auditory model

32

7/17/2019 periodicity.pdf

http://slidepdf.com/reader/full/periodicitypdf 33/53

Comparing detection functions

33

7/17/2019 periodicity.pdf

http://slidepdf.com/reader/full/periodicitypdf 34/53

Comparing detection functions

34

7/17/2019 periodicity.pdf

http://slidepdf.com/reader/full/periodicitypdf 35/53

Comparing detection functions

35

7/17/2019 periodicity.pdf

http://slidepdf.com/reader/full/periodicitypdf 36/53

Comparing detection functions

36

7/17/2019 periodicity.pdf

http://slidepdf.com/reader/full/periodicitypdf 37/53

 Tempo

• Tempo refers to the pace of a piece of music and is usually given in beats per

minutes (BPM).

• Global quality vs time-varying local characteristic.

• Thus, in computational terms we differentiate between tempo estimation and

tempo (beat) tracking.

• In tracking, beats are described by both their rate and phase.

• Vast literature: see, e.g. Hainsworth, 06; or Goto, 06 for reviews.

37

7/17/2019 periodicity.pdf

http://slidepdf.com/reader/full/periodicitypdf 38/53

• Novelty function (NF): remove local mean + half-wave rectify

• Periodicity: dot multiply ACF of NF with a weighted comb filterbank

 Tempo estimation and tracking (Davies, 05)

Rw(l) = (l/b2)e−l

2

2b2

*From Davies and Plumbley, ICASSP 200538

7/17/2019 periodicity.pdf

http://slidepdf.com/reader/full/periodicitypdf 39/53

 Tempo estimation and tracking (Davies, 05)

• Choose lag that maximizes the ACF

39

7/17/2019 periodicity.pdf

http://slidepdf.com/reader/full/periodicitypdf 40/53

 Tempo estimation and tracking (Davies, 05)

• Choose filter that maximizes the dot product

40

7/17/2019 periodicity.pdf

http://slidepdf.com/reader/full/periodicitypdf 41/53

 Tempo estimation and tracking (Davies, 05)

• Phase: dot multiply DF with shifted versions of selected comb filter

41

7/17/2019 periodicity.pdf

http://slidepdf.com/reader/full/periodicitypdf 42/53

 Tempo estimation and tracking (Grosche, 09)

• DFT of novelty function  !(n) for frequencies:

• Choose frequency that maximizes the magnitude spectrum at each frame

• Construct a sinusoidal kernel:

• In Grosche, 09 phase is computed as:

• Alternatively, we can find the phase that maximizes the dot product of  !(n)

with shifted versions of the kernel, as before.

κ(m) =  w(m−

n)cos(2π(ω̂m−

 ϕ̂))

ω  ∈ [30 : 600]/(60 × fsγ )

ϕ̂ =  1

2πarccos

Re(F (ω̂, n))

|F (ω̂, n)|

42

7/17/2019 periodicity.pdf

http://slidepdf.com/reader/full/periodicitypdf 43/53

 Tempo estimation and tracking (Grosche, 09)

• tracking function: Overlap-add of optimal local kernels + half-wave rectify

*From Grosche and Mueller, WASPAA 200943

7/17/2019 periodicity.pdf

http://slidepdf.com/reader/full/periodicitypdf 44/53

 Tempo estimation and tracking (Davies, 05)

44

7/17/2019 periodicity.pdf

http://slidepdf.com/reader/full/periodicitypdf 45/53

 Tempo estimation and tracking (Grosche, 09)

45

7/17/2019 periodicity.pdf

http://slidepdf.com/reader/full/periodicitypdf 46/53

 Tempo estimation and tracking (Davies, 05)

46

7/17/2019 periodicity.pdf

http://slidepdf.com/reader/full/periodicitypdf 47/53

 Tempo estimation and tracking (Grosche, 09)

47

7/17/2019 periodicity.pdf

http://slidepdf.com/reader/full/periodicitypdf 48/53

 Tempo estimation and tracking (Davies, 05)

48

7/17/2019 periodicity.pdf

http://slidepdf.com/reader/full/periodicitypdf 49/53

 Tempo estimation and tracking (Grosche, 09)

49

7/17/2019 periodicity.pdf

http://slidepdf.com/reader/full/periodicitypdf 50/53

 Tempo estimation and tracking (Davies, 05)

50

7/17/2019 periodicity.pdf

http://slidepdf.com/reader/full/periodicitypdf 51/53

 Tempo estimation and tracking (Grosche, 09)

51

7/17/2019 periodicity.pdf

http://slidepdf.com/reader/full/periodicitypdf 52/53

References

• Wang, D. and Brown, G. (Eds) “Computational Auditory Scene Analysis”. Wiley-

Interscience (2006): chapter 2, de Cheveigné, A. “Multiple F0 Estimation”; and

chapter 8, Goto, M. “Analysis of Musical Audio Signals”.

• Klapuri, A. and Davy, M. (Eds) “Signal Processing Methods for Music Transcription”.

Springer (2006): chapter 1, Klapuri, A. “Introduction to Music Transcription”; chapter

4, Hainsworth, S. “Beat Tracking and Musical Metre Analysis”; and chapter 8,Klapuri, A. “Auditory Model-based methods for Multiple Fundamental Frequency

Estimation”

• Slaney, M. “An efficient implementation of the Patterson Holdsworth auditory filter

bank”. Technical Report 35, Perception Group, Advanced Technology Group, Apple

Computer, 1993.

• Smith, J.O. “Mathematics of the Discrete Fourier Transform (DFT)”. 2nd Edition,

W3K Publishing (2007): chapter 8, “DFT Applications”.

52

7/17/2019 periodicity.pdf

http://slidepdf.com/reader/full/periodicitypdf 53/53

References

• Grosche, P. and Müller, M. “Computing Predominant Local Periodicity Information in

Music Recordings”. Proceedings of the IEEE Workshop on Applications of Signal

Processing to Audio and Acoustics (WASPAA), New Paltz, NY, 2009.

• Davies, M.E.P. and Plumbley, M.D. “Beat Tracking With A Two State Model”. In

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal

Processing (ICASSP), Vol. 3, pp 241-244 Philadelphia, USA, March 19-23, 2005.

• This lecture borrows heavily from: Emmanuel Vincent’s lecture notes on pitch

estimation (QMUL - Music Analysis and Synthesis); and from Anssi Klapuri’s lecture

notes on F0 estimation and automatic music transcription (ISMIR 2004 Graduate

School: http://ismir2004.ismir.net/graduate.html )