Upload
others
View
4
Download
0
Embed Size (px)
Citation preview
Semi-supervisedMusical Instrument Recognition
Master’s Thesis Presentation
Aleksandr Diment1
1Tampere University of Technology, Finland
Supervisors:
Adj.Prof. Tuomas Virtanen, MSc Toni Heittola
17 May 2013
1 Introduction
2 System description
3 Evaluation
4 Conclusions
Outline
1 Introduction
2 System description
3 Evaluation
4 Conclusions
Semi-supervised instrument recognition – A. Diment 17.5.2013 2/25
1 Introduction
Musical instrumentrecognition
Semi-supervisedlearning
Objectives and mainresults
2 System description
3 Evaluation
4 Conclusions
1 Introduction
2 System description
3 Evaluation
4 Conclusions
Semi-supervised instrument recognition – A. Diment 17.5.2013 3/25
1 Introduction
Musical instrumentrecognition
Semi-supervisedlearning
Objectives and mainresults
2 System description
3 Evaluation
4 Conclusions
IntroductionMusical instrument recognition
Music information retrieval: obtaining information of variouskinds from music.
Situationally tailored playlisting, personalised radio etc.
Instrument recognition ⇒
automatic music database annotation;
automatic music transcription;
musical genre classification.
Semi-supervised instrument recognition – A. Diment 17.5.2013 4/25
1 Introduction
Musical instrumentrecognition
Semi-supervisedlearning
Objectives and mainresults
2 System description
3 Evaluation
4 Conclusions
IntroductionSemi-supervised learning
Traditional learning paradigms:
unsupervised: no additional knowledge about the datasamples
supervised: a label is assigned to each data sample
Supervised: large amounts of annotated training data needed.
SSL: only part of training data needs to be annotated.
Not yet applied for instrument recognition.
Semi-supervised instrument recognition – A. Diment 17.5.2013 5/25
1 Introduction
Musical instrumentrecognition
Semi-supervisedlearning
Objectives and mainresults
2 System description
3 Evaluation
4 Conclusions
IntroductionObjectives and main results
Objectives:
studying techniques for musical instrument recognition;
studying various SSL schemes;
Main results:
developed pattern recognition system for instrumentrecognition with SSL;
two algorithms are implemented.
Semi-supervised instrument recognition – A. Diment 17.5.2013 6/25
1 Introduction
2 System description
Building blocks
Feature extraction
Training algorithms
Labelled dataweighting
One-class-at-a-timetraining
3 Evaluation
4 Conclusions
1 Introduction
2 System description
3 Evaluation
4 Conclusions
Semi-supervised instrument recognition – A. Diment 17.5.2013 7/25
1 Introduction
2 System description
Building blocks
Feature extraction
Training algorithms
Labelled dataweighting
One-class-at-a-timetraining
3 Evaluation
4 Conclusions
System descriptionBuilding blocks
Preprocessing
Feature extraction
Training
Models
Classification
Input data
FeaturesTraining set Testing set
Decision
Figure 1: A block diagram of a typical pattern classification system.
Semi-supervised instrument recognition – A. Diment 17.5.2013 8/25
1 Introduction
2 System description
Building blocks
Feature extraction
Training algorithms
Labelled dataweighting
One-class-at-a-timetraining
3 Evaluation
4 Conclusions
System descriptionFeature extraction
Relevant information is within the timbre, unique to a particulargroup of instruments.
Set of tonal qualities which characterise a particular musicalsound.
Everything except pitch, loudness and duration.
Semi-supervised instrument recognition – A. Diment 17.5.2013 9/25
1 Introduction
2 System description
Building blocks
Feature extraction
Training algorithms
Labelled dataweighting
One-class-at-a-timetraining
3 Evaluation
4 Conclusions
System descriptionFeature extraction
Pre-emphasis
Frame blockingand windowing
|FFT|2
Log
DCT ddt
Input signal
Static coefficients Delta coefficients
Figure 2: A block diagram of MFCCs calculation.
Semi-supervised instrument recognition – A. Diment 17.5.2013 10/25
1 Introduction
2 System description
Building blocks
Feature extraction
Training algorithms
Labelled dataweighting
One-class-at-a-timetraining
3 Evaluation
4 Conclusions
System descriptionTraining algorithms
The EM algorithm ⇒ MLEs of model parameters when treatingobservations as incomplete data.
Used when obtaining direct equations for the model is impossible.
Extending EM for SSL: the iterative and incremental EM-basedalgorithms1.
1 P. J. Moreno and S. Agarwal, “An experimental study of EM-based algorithms for semi-supervised learning in audioclassification”, in Proc. of the ICML-2003 Workshop on the Continuum from Labeled to Unlabeled Data, 2003.
Semi-supervised instrument recognition – A. Diment 17.5.2013 11/25
Initial iteration
Piano Guitar
L
U
U
L
U U
U
U training sample from the unlabelled dataset L training sample from the labelled dataset
Iteration t = 1
Piano Guitar
L
U
U
L
U
U
U
Iteration t = 2. . .
Piano Guitar
L
U
U
L
U
UU
Training dataset
Training
Models
Classification
...
Unlabelled samples
Labelled samples, piano
Labelled samples, guitar
Unlabelled samples classified as piano
Unlabelled samples classified as guitar
Figure 3: Schematic and feature space representation of the trainingstage with three iterations of the iterative EM-based algorithm.
Initial iteration
Piano Guitar
L
U
U
L
U U
U
U training sample from the unlabelled dataset L training sample from the labelled dataset
Iteration t = 1
Piano Guitar
L
U
L
U
U
U
U
Iteration t = 2. . .
Piano Guitar
L
U
U
L
U
U
U
Training dataset
Training
Models
Classification
...
Unlabelled samples
Labelled samples, piano
Labelled samples, guitar
Unlabelled samples classified as piano
Unlabelled samples classified as guitar
Figure 4: Schematic and feature space representation of the trainingstage with three iterations of the incremental EM-based algorithm.
1 Introduction
2 System description
Building blocks
Feature extraction
Training algorithms
Labelled dataweighting
One-class-at-a-timetraining
3 Evaluation
4 Conclusions
System descriptionLabelled data weighting
The algorithms improve the performance in case the initiallabelled data size is relatively low.
Not much difference when adding large or small amount ofunlabelled data.
⇒ de-weight the contribution of the unlabelled data.
Simply: Sl = ω(t)♦Sl.
Semi-supervised instrument recognition – A. Diment 17.5.2013 14/25
1 Introduction
2 System description
Building blocks
Feature extraction
Training algorithms
Labelled dataweighting
One-class-at-a-timetraining
3 Evaluation
4 Conclusions
System descriptionOne-class-at-a-time training
Another issue of the iterative algorithm: resulting classificationaccuracy is oscillating along the iteration axis.
0 5 10 15 20
80
90
100
Iteration
Accuracy, %
Semi-supervised instrument recognition – A. Diment 17.5.2013 15/25
1 Introduction
2 System description
Building blocks
Feature extraction
Training algorithms
Labelled dataweighting
One-class-at-a-timetraining
3 Evaluation
4 Conclusions
System descriptionOne-class-at-a-time training
One-class-at-a time approach: at an iteration only one class isretrained.
⇒ previous models of one class are unaffected by traininganother one
⇒ fewer peaks
⇒ safer to chose an arbitrary iteration index for termination.
Semi-supervised instrument recognition – A. Diment 17.5.2013 16/25
1 Introduction
2 System description
3 Evaluation
Datasets
Baseline results
Results with theiterative EM-basedalgorithm
4 Conclusions
1 Introduction
2 System description
3 Evaluation
4 Conclusions
Semi-supervised instrument recognition – A. Diment 17.5.2013 17/25
1 Introduction
2 System description
3 Evaluation
Datasets
Baseline results
Results with theiterative EM-basedalgorithm
4 Conclusions
EvaluationDatasets
RWC Music Database2.
Three variations per instrument.
Separate notes within the whole range with a step of asemitone.
44.1 kHz, 16 bit.
2 M. Goto et al., “RWC music database: music genre database and musical instrument sound database”, in Proc. ofthe 4th Int. Conf. on Music Information Retrieval (ISMIR), 2003, pp. 229–230.
Semi-supervised instrument recognition – A. Diment 17.5.2013 18/25
1 Introduction
2 System description
3 Evaluation
Datasets
Baseline results
Results with theiterative EM-basedalgorithm
4 Conclusions
EvaluationDatasets
Table 1: List of instruments and number of recordings of the notesused in the smaller and larger sets, respectively.
Instrument # notes
Acoustic Guitar 702Electric Guitar 702Tuba 270Bassoon 360
Total 2 212
Instrument # notes
Pianoforte 792Classic Guitar 702Electric Guitar 702Electric Bass 507Trombone 278Tuba 270Horn 288Bassoon 360Clarinet 360Banjo 941
Total 5 200
Semi-supervised instrument recognition – A. Diment 17.5.2013 19/25
1 Introduction
2 System description
3 Evaluation
Datasets
Baseline results
Results with theiterative EM-basedalgorithm
4 Conclusions
EvaluationBaseline results
Baseline scenario: treating all available data as labelled. Upperbound for possible SSL performance.
Table 2: Average classification accuracy across all classes infully-supervised case.
Instrument set size Recognition accuracy, %
4 instruments 92.110 instruments 82.9
Semi-supervised instrument recognition – A. Diment 17.5.2013 20/25
1 Introduction
2 System description
3 Evaluation
Datasets
Baseline results
Results with theiterative EM-basedalgorithm
4 Conclusions
EvaluationResults with the iterative EM-based algorithm
Table 3: Classification results with the iterative EM-based SSLalgorithm when incorporating both modifications.
Instrument set size Recognition accuracy, %
initial maximum absolute gain relative gain
4 instruments 89.19 95.30 6.11 6.8510 instruments 58.73 68.43 9.70 14.18
Semi-supervised instrument recognition – A. Diment 17.5.2013 21/25
1 Introduction
2 System description
3 Evaluation
Datasets
Baseline results
Results with theiterative EM-basedalgorithm
4 Conclusions
0 2 4 6 8 10 12 14 16 18 20
80
90
100
(Macro)iteration
Accuracy, %
Initial approachOne class at a time
Labelled data weightingBoth approaches
Figure 5: Comparison of the modifications to the iterative algorithmwith their combination and the initial version, smaller instrument set.
Semi-supervised instrument recognition – A. Diment 17.5.2013 22/25
1 Introduction
2 System description
3 Evaluation
Datasets
Baseline results
Results with theiterative EM-basedalgorithm
4 Conclusions
5 6 8 10 15 20 30 100
80
90
labelled/total dataset sizes ratio, %
Accuracy, %
Initial accuracyFinal accuracy
Figure 6: Comparison of the accuracies of the initial models and thefinal iteration of the incremental EM-based SSL algorithm as a functionof the relative labelled dataset size with the smaller instrument set.
Semi-supervised instrument recognition – A. Diment 17.5.2013 23/25
1 Introduction
2 System description
3 Evaluation
4 Conclusions
1 Introduction
2 System description
3 Evaluation
4 Conclusions
Semi-supervised instrument recognition – A. Diment 17.5.2013 24/25
1 Introduction
2 System description
3 Evaluation
4 Conclusions
Conclusions
SSL for instrument recognition works. Gain 9.7%;
as little as 7% data needs to be annotated;
proposed extensions simplify termination and increaseaccuracy.
Future work:
more complex scenarios (more instruments, noise,reverberation. . . );
neighbouring problems.
Semi-supervised instrument recognition – A. Diment 17.5.2013 25/25