MIRtoolbox - UiO

MIRtoolbox: Sound and music analysis of

audio recordings using MatlabMUS4831, Olivier Lartillot, 26.10.2017

Part 1

• MIRtoolbox overview

• Basic signal processing operators

• Auditory models

• Pitch estimation

• Timbral descriptions

Part 2 (in 2 weeks)

• Rhythm, metrical structure

• Tonal analysis

• Segmentation, structure

• Statistical descriptions, similarity

• Music & emotion

• Advanced use

Lecture-Workshop

• Lecture slides in PDF

• Workshop handout

• We will install MIRtoolbox together…

• Sound examples

• Useful: MIRtoolbox User’s Guide

1. Overview & syntax

General Principles• Why did we create MIRtoolbox?

• Research project about music & emotion

• Analysis tool for students from various background

• Modular framework: Building blocks

• Simple and adaptive syntax

• User can focus on the general design.

• MIRtoolbox takes care of the technical details.

• Free software, open source

• One standard tool for MIR study and research (10000s downloads)

MIRtoolbox Features

miraudio

mirinharmonicitymirrms mirlowenergy

mirplay

mirsave

mirlength

mirfilterbankmirframe

mirsegment

mirsum

mirnovelty mirpeaks

mirgetdata

mirexport

mirstat

mirzerocross

mirhisto

mircentroid

mirspread

mirskewness

mirkurtosis

mirflatness

mircluster

mirclassify

mirfeatures

mirzerocross

mirdist

mirquery

mirpeaks mirtempomirautocor

mirspectrum

*

mirpulseclarity

mirattackslopemirpeaks

mirattacktime

mireventdensity

mirsimatrix

mirtonalcentroid mirflux mirhcdf

mirpeaks

mirmode

mirkey

mirkeysom

mirchromagram mirkeystrength

mirpeaks mirpitch*

mirautocor

mircepstrum

mirpeaks

mirrolloff

mirbrightness

mirmfcc

mirroughness

mirregularity

mirspectrum

mirflux

mironsetsmirenvelope

mirbeatstrength

mirfluctuationmirspectrum

Let’s now install MIRtoolbox…

Requires:

• Matlab,

• Signal Processing toolbox,

• Statistics and Machine Learning toolbox

http://bit.ly/mirtoolbox

http://bit.ly/mirtoolbox

Basic Operationsmiraudio(’ragtime.wav’)

miraudio(’Folder’) ‘Folder’ = all files in Current Directory

.wav.mp3.mp4.m4a.ogg.flac.au

miraudio(..., ‘Extract’) extraction options

• miraudio(..., ‘Extract’, 1, 2)extracts signal from 1 s to 2 s after the start

a = miraudio(‘ragtime.wav’)b = miraudio(a, ‘Extract’,1,2)

mirplay(b)b = miraudio(‘ragtime.wav’, ‘Extract’,1,2)

mirsave(b)

miraudio(..., ‘Trim’) trimming options

• miraudio(‘ragtime.wav’, ‘Trim’)

trims (pseudo-)silence at start and end

• miraudio(..., ‘TrimStart’) at start only

• miraudio(..., ‘TrimEnd’) at end only

• miraudio(..., ‘TrimThreshold’, t)

specifies the silence threshold t = .06

Silent frames have RMS amplitude below t times the medium RMS amplitude of the whole audio file.

2. Basic signal processing operators

mirspectrum(‘trumpet.wav’) Discrete Fourier Transform

of audio signal x:

• mirspectrum(..., ‘Min’, f1) f1=0 Hz

• mirspectrum(..., ‘Max’, f2) f2=sampling rate/2

• mirspectrum(..., ‘Window’, ‘hamming’ )

0 1000 2000 3000 4000 5000 60000

200

400

600

800

1000Spectrum

frequency (Hz)

magnitude

f2f1

Xk =

��

N�1X

n=0

xne� 2⇡i

N kn

�� , k = 0, . . . , N/2

mirspectrum(…, ‘Terhardt’) auditory model: outer-ear filter

0 0.5 1 1.5 2 2.5

x 104

0

200

400Spectrum

frequency (Hz)

magnitude

0 0.5 1 1.5 2 2.5

x 104

0

50

100Spectrum

frequency (Hz)

magnitude

based on MA toolbox

• mirspectrum

• mirspectrum(..., ‘Terhardt’)

5000Hz

5000Hz

mirspectrum(…, ‘Mel’) auditory model: Mel-band spectrum

based on Auditory toolbox

mfcc

28

mfcc

PurposeMel-frequency cepstral coefficient transform of an audio signal

Synopsis[ceps,freqresp,fb,recon] = mfcc(input, samplingRate)

DescriptionFind the cepstral coefficients (ceps) corresponding to the input. Three other quanti-ties are optionally returned that represent the detailed FFT magnitude (freqresp), the log10 mel-scale filter bank output (fb), and the reconstruction of the filter bank output by inverting the cosine transform.

The sequence of processing includes for each chunk of data:Window the data with a hamming window,Shift it into FFT order,Find the magnitude of the FFT,Convert the FFT data into filter bank outputs,Find the log base 10,Find the cosine transform to reduce dimensionality.

The filter bank is constructed using 13 linearly-spaced filters (133.33Hz between center frequencies,) followed by 27 log-spaced filters (separated by a factor of 1.0711703 in frequency.) Each filter is constructed by combining the amplitude of FFT bin as shown in the figure below.

The forty filters look like this.

CF - 133Hz or CF CF + 133Hz orCF/1.0718 CF*1.0718

Frequency

1030

0.005

0.01

Frequency

0 5 10 15 20 25 30 35 400

5

10Mel!Spectrum

Mel bands

ma

gn

itu

de

• frequency bands equally spaced on mel scale

• in each mel band, perceptually same pitch range

mirspectrum(…, ‘Bark’) auditory model: Bark-band spectrum

based on MA toolbox

0 5 10 15 20 250

5000

10000Bark!Spectrum

Bark bands

ma

gn

itu

de

• another similar auditory model, decomposing the frequency axis into bands

mirspectrum pitch-based distribution

• mirspectrum(..., ‘Cents’)

6000 7000 8000 9000 10000 11000 12000 130000

500

1000Spectrum

pitch (in midicents)

mag

nitu

de0 2000 4000 6000 8000 10000 12000

0

500

1000Spectrum

frequency (Hz)

mag

nitu

de

• mirspectrum(‘...’)

0 200 400 600 800 1000 12000

1000

2000Spectrum

Cents

magnitude• mirspectrum(...,

‘Collapsed’)

0000

0000

0000

0000

0000

00

0000

0000

0000

C D E F G A B C

https://www.youtube.com/watch?v=MTrxqihjH98








mirautocor autocorrelation function

!"#$% !"#& !"#&% !"#' !"#'% !"#( !"#(% !") !")!% !")*

!!"%

!

!"%

+,-./012345/67

8.7409:;

27<=.8,-4

0 0.005 0.01 0.015 0.02 0.025 0.03 0.035 0.04 0.045 0.05!0.05

0

0.05

0.1

0.15Waveform autocorrelation

lag (s)

co

eff

icie

nts

x

j

xR x

miraudio(‘trumpet.wav’)

mirautocor(‘trumpet.wav’)

mirautocor autocorrelation function

• mirautocor(…, ‘Min’, t1, ‘s’) t1=0 s

• mirautocor(..., ‘Max’, t2, ‘s’) t2=.05 s (audio) or t2=2 s (envelope)

• mirautocor(..., ‘Freq’) lags in Hz.

0 0.005 0.01 0.015 0.02 0.025 0.03 0.035 0.04 0.045 0.05!0.05

0

0.05

0.1

0.15Waveform autocorrelation

lag (s)

co

eff

icie

nts

mirautocor(..., ‘Compres’) “compressed” autocorrelation

mirautocor(‘Cmaj.wav’, ’Freq’)

mirautocor(‘Cmaj.wav’, ’Freq’, ‘Compres’)

• Autocorrelation (by default):

• y = IDFT(|DFT(x)|2)

• “Compressed” autocorrelation:

• y = IDFT(|DFT(x)|k)

• mirautocor(…, ‘Compres’, k) k=.67

Tolonen, Karjalainen. A Computationally Efficient Multipitch Analysis Model, IEEE Transactions on Speech and Audio Processing, 8(6), 2000.

0 0.002 0.004 0.006 0.008 0.01 0.012 0.014!0.01

0

0.01

0.02

0.03

0.04

0.05

0.06

0 0.002 0.004 0.006 0.008 0.01 0.012 0.0140

0.005

0.01

0.015

0.02

0.025

0.03

0 0.002 0.004 0.006 0.008 0.01 0.012 0.0140

0.005

0.01

0.015

0.02

0.025

0.03

0 0.002 0.004 0.006 0.008 0.01 0.012 0.0140

0.005

0.01

0.015

0.02

0.025

0.03

0 0.002 0.004 0.006 0.008 0.01 0.012 0.0140

0.005

0.01

0.015

0.02

0.025

0.03

0 0.002 0.004 0.006 0.008 0.01 0.012 0.0140

0.005

0.01

0.015

0.02

0.025

0.03

mirautocor(..., ‘Enhanced’) enhanced autocorrelation

• mirautocor(‘Amin3’, ‘Enhanced’, 2:10)

Tolonen, Karjalainen. A Computationally Efficient Multipitch Analysis Model, IEEE Transactions on Speech and Audio Processing, 8(6), 2000.

234567

mirautocor(‘Cmaj.wav’, ’Freq’, ‘Compres’, ‘Enhanced’)

mirautocor(‘Cmaj.wav’, ’Freq’, ‘Compres’, ‘Enhanced’, 2:20)

• mirframe(..., ‘WinLength’, l, ‘s’) unit: ‘s’ (seconds), ‘sp’ (samples)

• mirframe(..., ‘Hop’, h, ‘/1’) unit: ‘/1’ (ratio from 0 to 1),‘%’ (percentage), ‘s’, ‘sp’

• mirframe(‘.., l, ‘s’, h, ‘/1’)

mirframe frame decomposition

0 1 2 3 4 5 6!0.4

!0.3

!0.2

!0.1

0

0.1

0.2

0.3Audio waveform

time (s)

am

plit

ude

h

llmirframe(‘ragtime.wav’,1,.5)

mirframe syntax

mirframef

miraudioa

a = miraudio(‘mysong’)

f = mirframe(a)

or: s = mirspectrum(‘mysong’, ‘Frame’)

mirspectrums

f = mirframe(‘mysong’) s = mirspectrum(f)

•miraudio(..., ‘Frame’, l, ‘s’, h, ‘/1’)

•mirspectrum(..., ‘Frame’, l, ‘s’, h, ‘/1’)

•mirspectrum(’mysong’, ’Frame’, 1, .5, ’Mel’)

‘Frame’ option syntax

mirflux distance between successive frames

s = mirspectrum(a, ‘Frame’)

mirflux(s)

0 0.5 1 1.5 2 2.5 3 3.5 4 4.50

0.01

0.02

0.03

frames

co

eff

icie

nt

va

lue

(in

)

Spectral flux

• mirflux(a) = mirflux(mirspectrum(a, ‘Frame’, .05, .5))

• ac = mirautocor(a, ‘Frame’), mirflux(ac)

• mirflux(..., ‘Dist’, d) d = ‘Euclidean’, ‘City’, ‘Cosine’

mirrms root mean square

0 0.5 1 1.5 2 2.5 3 3.5 4 4.50

0.01

0.02

0.03

0.04

frames

coeffic

ient valu

e (

in )

RMS energy

mirrms(‘ragtime.wav’)

The RMS energy related to file ragtime is 0.017932

mirrms(‘ragtime.wav’, ‘Frame’)

Default frame size .05 s, frame hop = .5

mirenvelope envelope extraction

0 0.5 1 1.5 2 2.5 3 3.5 4 4.50

0.005

0.01

0.015

0.02

0.025Envelope

time (s)

am

plit

ud

e

0 0.5 1 1.5 2 2.5 3 3.5 4 4.5!0.1

!0.05

0

0.05

0.1Audio waveform

time (s)

am

plit

ud

e

a:

e = mirenvelope(a) mirplay(e)mirsave(e)

0 0.5 1 1.5 2 2.5 3 3.5 4 4.50

0.005

0.01

0.015

0.02

0.025Envelope

time (s)

am

plit

ude

e = mirenvelope(a)

mirrms(a, ‘Frame’)

0 0.5 1 1.5 2 2.5 3 3.5 4 4.50

0.01

0.02

0.03

0.04

frames

co

eff

icie

nt

va

lue

(in

)

RMS energy

mirenvelope(..., ‘Filter’) based on low-pass filtering

mirenvelope(..., ‘Tau’, .02): time constant (in s.)

Down-Sampling N mirenvelope(..., ‘PostDecim’, N) N=16

mirenvelope(..., ‘Sampling’, f)

Low-Pass Filter

LPF

Full-wave rectificationabs0 0.5 1 1.5 2 2.5 3 3.5 4 4.5

!0.1

!0.05

0

0.05

0.1Audio waveform

time (s)

am

plit

ud

e

mirenvelope post-processing options

• mirenvelope(..., ‘Center’)

‘HalfWaveCenter’)

‘Diff’)

‘HalfWaveDiff’)

• mirenvelope(..., ‘Power’)

• mirenvelope(..., ‘Normal’)

• mirenvelope(..., ‘Smooth’,o) moving average, order o = 30

• mirenvelope(..., ‘Gauss’,o) gaussian, std deviation o = 30 sp

0 0.5 1 1.5 2 2.5 3 3.5 4 4.5!0.02

0

0.02Envelope

time (s)

am

plit

ud

e

0 0.5 1 1.5 2 2.5 3 3.5 4 4.50

0.01

0.02Envelope (half!wave rectified)

time (s)

am

plit

ude

! !"# $ $"# % %"# & &"# ' '"#!$

!

$

%

()$!!' *+,,-.-/0+10-2)-/3-456-

0+7-)89:

1764+0;2-

0 0.5 1 1.5 2 2.5 3 3.5 4 4.50

2

4x 10

!4 Envelope (half!wave rectified)

time (s)a

mp

litu

de

mirplay

MakeERBFilters

24

MakeERBFilters

PurposeDesign the filters needed to implement an ERB cochlear model.

Synopsisfcoefs = MakeERBFilters(fs,numChannels,lowFreq)

DescriptionThis function computes the filter coefficients for a bank of Gammatone filters. These filters were defined by Patterson and Holdworth for simulating the cochlea.

The result is returned as an array of filter coefficients. Each row of the filter arrays contains the coefficients for four second order filters. The transfer function for these four filters share the same denominator (poles) but have different numerators (zeros). All of these coefficients are assembled into one vector that the ERBFilterBank func-tion can take apart to implement the filter.

The filter bank contains numChannels channels that extend from half the sampling rate (fs) to lowFreq.

Note this implementation fixes a problem in the original code. The original version calculated a single eighth-order polynomial representing the entire filter function. This new version computes four separate second order filters. This avoids a problem with round off errors in cases with very small characteristic frequencies (<100Hz) and large sample rates (>44kHz). The problem is caused by roundoff error when a number of poles are combined, all very close to the unit circle. Small errors in the eighth-order coefficient, are magnified when the eighth root is taken to give the pole location. These small errors lead to poles outside the unit circle and instability. Thanks to Julius Smith for pointer me to the proper explanation.

ExamplesThe ten ERB filters between 100 and 8000Hz are computed using

»fcoefs = MakeERBFilters(16000,10,100);

The resulting frequency response is given by»y = ERBFilterBank([1 zeros(1,511)], fcoefs);

»resp = 20*log10(abs(fft(y')));

»freqScale = (0:511)/512*16000;

»semilogx(freqScale(1:255),resp(1:255,:));

»axis([100 16000 -60 0])

»xlabel('Frequency (Hz)');

»ylabel('Filter Response (dB)');

A simple cochlear model can be formed by filtering an utterance with these filters. To convert this data into an image we pass each row of the cochleagram through a half-wave-rectifier, a low-pass filter, and then decimate by a factor of 100. A cochleagram of the ‘A huge tapestry hung in her hallway’ utterance from the TIMIT database

102

103

104

-60

-40

-20

0

• mirfilterbank(..., ‘Gammatone’)

Equivalent Rectangular Bandwidth (ERB) Gammatone filterbank

• f = mirfilterbank(..., ‘NbChannels’, N=10)

• mirplay(f)

miraudio

filterN

filter2filter1

...

Filter frequency responses

0 0.5 1 1.5 2 2.5 3 3.5 4 4.5!0.01

0

0.01

1

0 0.5 1 1.5 2 2.5 3 3.5 4 4.5!0.05

0

0.05

2

0 0.5 1 1.5 2 2.5 3 3.5 4 4.5!0.1

0

0.1

3

0 0.5 1 1.5 2 2.5 3 3.5 4 4.5!0.1

0

0.1

4

0 0.5 1 1.5 2 2.5 3 3.5 4 4.5!0.05

0

0.05

5

0 0.5 1 1.5 2 2.5 3 3.5 4 4.5!0.02

0

0.02

6

0 0.5 1 1.5 2 2.5 3 3.5 4 4.5!0.02

0

0.02

7

0 0.5 1 1.5 2 2.5 3 3.5 4 4.5!0.01

0

0.01

8

0 0.5 1 1.5 2 2.5 3 3.5 4 4.5!2

0

2x 10

!3

9

0 0.5 1 1.5 2 2.5 3 3.5 4 4.5!1

0

1x 10

!3 Audio waveform

10

Hz

dB

mirfilterbankfilterbank decomposition

mirfilterbank filterbank decomposition

• mirfilterbank(..., ‘Manual’, [44, 88, 176, 352, 443])

0 0.005 0.01 0.015 0.02 0.025

−40

−30

−20

−10

0

N li d F ( d/ l )

Mag

nitu

de (d

B)

Magnitude Response (dB)

• mirfilterbank(..., ‘Manual’, [-Inf 200 400 800 1600 Inf])

0.02 0.04 0.06 0.08 0.1 0.12

−50

−40

−30

−20

−10

0

N li d F ( d/ l )

Mag

nitu

de (d

B)

Magnitude Response (dB)

mirsum across-channels summation

e = mirenvelope(f)

a

e 0 0.5 1 1.5 2 2.5 3 3.5 4 4.50

0.1

0.2

1

0 0.5 1 1.5 2 2.5 3 3.5 4 4.50

0.5

1

2

0 0.5 1 1.5 2 2.5 3 3.5 4 4.50

0.2

0.4

3

0 0.5 1 1.5 2 2.5 3 3.5 4 4.50

0.05

0.1

4

0 0.5 1 1.5 2 2.5 3 3.5 4 4.50

5x 10

!3 Envelope

5

mirenvelope

mirfilterbank f = mirfilterbank(a)0 0.5 1 1.5 2 2.5 3 3.5 4 4.5

!1

0

1

1

0 0.5 1 1.5 2 2.5 3 3.5 4 4.5!5

0

5

2

0 0.5 1 1.5 2 2.5 3 3.5 4 4.5!2

0

2

3

0 0.5 1 1.5 2 2.5 3 3.5 4 4.5!1

0

1

4

0 0.5 1 1.5 2 2.5 3 3.5 4 4.5!0.1

0

0.1Audio waveform (centered)

5

f

mirsum

s

s = mirsum(e)

0 0.5 1 1.5 2 2.5 3 3.5 4 4.5!0.08

!0.06

!0.04

!0.02

0

0.02

0.04

0.06Envelope

time (s)

am

plit

ud

e

mirsum stereo summation

e = mirenvelope(a)

miraudioa

a = miraudio(..., ‘Mono’, 0)

emirenvelope

0 0.5 1 1.5 2 2.5 3 3.5 4 4.50

0.01

0.02

0.03

time (s)

1

0 0.5 1 1.5 2 2.5 3 3.5 4 4.50

0.01

0.02

0.03Envelope

2

mirsum

s

s = mirsum(e)

0 0.5 1 1.5 2 2.5 3 3.5 4 4.50

0.01

0.02

0.03

0.04

0.05Envelope

time (s)

ampl

itude

mirsum summary

e = mirenvelope(f)

amirfilterbank

ff = mirfilterbank(a)

0 0.5 1 1.5 2 2.5 3 3.5 4 4.5!1

0

1

1

0 0.5 1 1.5 2 2.5 3 3.5 4 4.5!5

0

5

2

0 0.5 1 1.5 2 2.5 3 3.5 4 4.5!2

0

2

3

0 0.5 1 1.5 2 2.5 3 3.5 4 4.5!1

0

1

4

0 0.5 1 1.5 2 2.5 3 3.5 4 4.5!0.1

0

0.1Audio waveform (centered)

5

e0 0.5 1 1.5 2 2.5 3 3.5 4 4.5

0

0.1

0.2

1

0 0.5 1 1.5 2 2.5 3 3.5 4 4.50

0.5

1

2

0 0.5 1 1.5 2 2.5 3 3.5 4 4.50

0.2

0.4

3

0 0.5 1 1.5 2 2.5 3 3.5 4 4.50

0.05

0.1

4

0 0.5 1 1.5 2 2.5 3 3.5 4 4.50

5x 10

!3 Envelope

5

mirenvelope

ac = mirautocor(e)ac

mirautocor

0 0.2 0.4 0.6 0.8 1 1.2 1.40

1

2x 10

!7

1

0 0.2 0.4 0.6 0.8 1 1.2 1.4!5

0

5x 10

!6

2

0 0.2 0.4 0.6 0.8 1 1.2 1.4!1

0

1x 10

!6

3

0 0.2 0.4 0.6 0.8 1 1.2 1.4!5

0

5x 10

!8

4

0 0.2 0.4 0.6 0.8 1 1.2 1.4

!0.50

0.5

x 10!10 Envelope autocorrelation

5

mirsums

s = mirsum(e)

0 0.2 0.4 0.6 0.8 1 1.2 1.4!5

0

5x 10

!6 Envelope autocorrelation

lag (s)

coeffic

ients

mirpeaks peak picking

• p = mirpeaks(..., ‘Total’, 3, ‘NoBegin’)

• To get peak positions:

• mirgetdata(p)

• To get peak amplitudes:

• get(p, ‘PeakVal’)

default: t=0

default: c=.1

! " #! #" $! $" %! %" &!!

"

#!

#"

$!'()!*+(,-./0

'()123456

03748-/5(

*

* OKOK

OKnot OK

mirpeaks parameters specification

• mirpeaks(..., ‘Threshold’, t)

• mirpeaks(..., ‘Contrast’, c)

t

1

00

1

c

3. Feature extractors

• Pitch / f0

• Timbre

• Tempo

• Tonality

• Segmentation(Wed

nesd

ay)

mirpitch pitch estimation

mirautocor

mirfilterbank

mirframe

mirpeaks

mirsum

mirpitch(...,

‘2Channels’,

‘Enhanced’, 2:10, ‘Compress’, .5

‘Total’, Inf, ‘Min’, 75, ‘Max’, 2400, ‘Contrast’, .1, ‘Threshold’, .4)

or ‘NoFilterbank’,

Timbre

• Zero-crossing rate: mirzerocross

• Spectral distribution: mirrolloff, mirbrightness, mircentroid, mirspread, …

• Mel-Frequency Cepstral Coefficients: mirmfcc

• Sensory Dissonance: mirroughness

• mirregularity

mirzerocross waveform sign-change rate

• Is supposed to indicate how noisy the sound is.

• But highly dependent on the presence of low or high frequency components in the sound.

1.701 1.702 1.70& 1.704 1.705 1.706 1.707 1.708 1.709

!0.0&

!0.02

!0.01

0

0.01

0.02

0.0&

Audio waveform

time (s)

am

plit

ude

mirrolloff high-frequency energy (I)

• mirrolloff(..., ‘Threshold’, .85)

0 2000 4000 6000 8000 10000 120000

500

1000

1500

2000

2500

3000

3500

Spectrum

frequency (Hz)

magnitude 85% of the energy

5640.53 Hz

mirbrightness high-frequency energy (II)

• mirbrightness(..., ‘CutOff’, 1500) (in Hz)

• mirbrightness(..., ‘Unit’, u) u = ‘/1’ or ‘%’

0 2000 4000 6000 8000 10000 120000

500

1000

1500

2000

2500

3000

3500

Spectrum

frequency (Hz)

magnitude

1500 Hz

53.96% of the energy

mircentroid geometric center of spectral distribution

µ1 =�

xf(x)dx

µ1

mirspread spectral dispersion

⇥2 = µ2 =�

(x� µ1)2f(x)dxsecond moment:

µ1

mirkurtosis spectral pickiness

< 0 > 00-2

mirflatness smooth vs. spiky

geometric mean

arithmetic mean

N

⌅⇤N�1n=0 x(n)

�PN�1n=0 x(n)

N

⇥

mirmfcc mel-frequency cepstral coefficients

• Description of spectral shape.

mirspectrum (‘Mel’)Abs LogDiscreteCosine

Transform

mirroughness sensory dissonance

• mirroughness(..., ‘Sethares’)

Dissonance produced by two sinusoids

depending on their frequency ratio

mirspectrum mirpeaks dissonanceestimated for

each pair of peaks

sum

mironsets onset detection function

time (s)0.5 1 1.5 2 2.5 3 3.5 4

ampl

itude

0.2

0.4

0.6

0.8

1Onset curve (Envelope)

• mironsets• mirpeaks(mirsum(mirspectrum(…, ‘Frame’)))

• mironsets(…, ‘Filter’)• mirpeaks(mirsum(mirenvelope(mirfilterbank(…, ‘NbChannels’, 40))))

• mironsets(…, ‘SpectralFlux’)• mirpeaks(mirflux(…, ‘Inc’, ‘Halfwave’))

mironsets(…, ‘Contrast’,

…)

mironsets(..., ‘Attack’)

time (s)0.5 1 1.5 2 2.5 3 3.5 4

ampl

itude

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1Onset curve (Envelope)

mirattackslope average slope of note attacks

• o = mironsets(‘george.wav’, …)

• mirattackslope(o)

4 5 6 7 8 9

!0.5

0

0.5

1

1.5

2

2.5

Envelope (centered)

time (s)

am

plit

ude

mirattackleap amplitude of note attacks

• o = mironsets(‘george.wav’, …)

• mirattackleap(o)

4 5 6 7 8 9

!0.5

0

0.5

1

1.5

2

2.5

Envelope (centered)

time (s)

am

plit

ude

Part 2 (in 2 weeks)

• Rhythm, metrical structure

• Tonal analysis

• Segmentation, structure

• Statistical descriptions, similarity

• Music & emotion

• Advanced use

Documents

MIRtoolbox - UiO