Automatic music classification and the importance of instrument identification
Cory McKay and Ichiro Fujinaga
Music Technology Area
Faculty of Music
McGill University
Montreal, Canada
10 March 2005 CIM Montreal / McKay & Fujinaga
Overview
Examination of the relative importance of different high-level features in automatic music classification
Performed an experiment involving automatic genre classification of MIDI files
Found that features based on instrumentation (an abstraction of timbre) were of particular importance
Topics
Introduction to automatic music classification
Related research
Details of experiment performed
Features used
Feature weighting
Taxonomies used
Classifiers and training data used
Results
Conclusions
Introduction to automatic music classification
There are many ways in which computers can classify music
Genre
Composer
Performer
Geographical/temporal/cultural origin
etc.
Music classification can be difficult for both humans and computers
Rarely have precise, clear and consistent guidelines delineating the musical characteristics of categories
Applications of automatic music classification
Discovery of probable authorship of anonymous compositions
Sociological and psychological research into how humans construct the notion of musical similarity and form musical groupings
Automatic sorting of large databases
Music recommendation systems
Sorting of personal music collections
e.g. based on mood or listening scenarios
Automated transcription
Detection of pirated recordings
Advantages of automatic music classification
Computers can perform classifications faster and more consistently than humans
Computers can analyze music in novel and non-intuitive ways that might not occur to humans
Computers can avoid human preconceptions that might contaminate experimental results
How automatic classification works
“Feature” extraction
Properties or characteristics of recordings
Percepts that classifiers base decisions on
Can be extracted from audio (e.g. MP3) or symbolic (e.g. MIDI) recordings
Good features are essential to successful classification
Classification can be done using
Expert systems: utilize pre-set heuristics
Machine learning (AI): supervised or unsupervised learning
Features
Low-level features
Signal processing quantities
e.g. spectral centroid and spectral flux
Can be effective practically
Can have psychoacoustic significance
Have little direct theoretical meaning musicologically or sociologically
High-level features
Based on musical abstractions
e.g. tempo and meter
Currently difficult or impossible to extract from audio recordings
Have more theoretical relevance than low-level features
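As an illustration of what low-level "signal processing quantities" look like in practice, the following is a minimal sketch of spectral centroid and spectral flux for short audio frames. It is not taken from the presentation: the function names, the naive DFT, and the particular flux definition (sum of squared magnitude differences, one of several in use) are assumptions made for the example.

```python
import cmath
import math

def magnitude_spectrum(frame):
    """Magnitudes of DFT bins 0..N/2 of a real-valued frame (naive O(N^2) DFT)."""
    n = len(frame)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame)))
        for k in range(n // 2 + 1)
    ]

def spectral_centroid(frame, sample_rate):
    """Magnitude-weighted mean frequency (Hz) of one frame."""
    mags = magnitude_spectrum(frame)
    total = sum(mags)
    if total == 0:
        return 0.0
    n = len(frame)
    return sum(k * sample_rate / n * m for k, m in enumerate(mags)) / total

def spectral_flux(prev_frame, frame):
    """Sum of squared differences between the magnitude spectra of two frames
    (assumed definition; other normalizations exist)."""
    a = magnitude_spectrum(prev_frame)
    b = magnitude_spectrum(frame)
    return sum((y - x) ** 2 for x, y in zip(a, b))

def sine(freq, sample_rate, n):
    """Hypothetical test signal: n samples of a pure sine tone."""
    return [math.sin(2 * math.pi * freq * t / sample_rate) for t in range(n)]

low = sine(1000, 8000, 64)   # falls exactly on DFT bin 8
high = sine(2000, 8000, 64)  # falls exactly on DFT bin 16
```

For the pure 1000 Hz tone the centroid sits at about 1000 Hz, and the flux between two identical frames is zero. Such numbers describe the signal well but, as the slide notes, say little directly about musical structure.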
Overview of this experiment
Empirical examination of which features are most useful to classifiers
Used high-level features because of their theoretical significance
Used test task of genre classification
A particularly difficult type of classification
Related to many other types of classification
Features useful for this task likely to be particularly robust
Related research
Relatively little work has been done on features that could be useful for arbitrary types of music
Cantometrics project (Lomax 1968)
Tagg (1982)
Cope (1991)
Aarden and Huron (2001)
Studied the correlation between musical features and geographical regions
Automatic genre classification has received considerable attention recently
Audio classification work of Tzanetakis and Cook (2002) is often cited
Best results to date with symbolic data have been achieved by McKay and Fujinaga (2004)
Bodhidharma
Experiments carried out with the Bodhidharma system
A general-purpose symbolic feature extraction and classification system
Easy-to-use
Portable
Applicable to a wide range of research tasks
Features studied
111 high-level features implemented:
Instrumentation
e.g. whether modern instruments are present
Musical Texture
e.g. standard deviation of the average melodic leap of different lines
Rhythm
e.g. standard deviation of note durations
Dynamics
e.g. average note to note change in loudness
Pitch Statistics
e.g. fraction of notes in the bass register
Melody
e.g. fraction of melodic intervals comprising a tritone
Largest available set of implemented high-level features
42 more features have been proposed, but have not been implemented yet
More information available in Cory McKay’s master’s thesis (2004)
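Three of the example features named above are simple enough to sketch directly. The code below is illustrative only and is not Bodhidharma's implementation: the note representation (MIDI pitch, duration in beats), the bass-register cutoff of MIDI pitch 54, and the choice to count only intervals of exactly six semitones as tritones are all assumptions.

```python
from statistics import pstdev

# Hypothetical melody: (MIDI pitch, duration in beats), in melodic order.
melody = [(60, 1.0), (66, 0.5), (67, 0.5), (48, 2.0), (54, 1.0)]

def note_duration_std(notes):
    """Rhythm: population standard deviation of note durations."""
    return pstdev([duration for _, duration in notes])

def bass_register_fraction(notes, threshold=54):
    """Pitch statistics: fraction of notes at or below an assumed bass cutoff."""
    return sum(1 for pitch, _ in notes if pitch <= threshold) / len(notes)

def tritone_fraction(notes):
    """Melody: fraction of successive melodic intervals spanning six semitones."""
    pitches = [pitch for pitch, _ in notes]
    intervals = [abs(b - a) for a, b in zip(pitches, pitches[1:])]
    if not intervals:
        return 0.0
    return sum(1 for i in intervals if i == 6) / len(intervals)
```

Each function reduces a whole piece to a single number, which is what allows features from different groups to be combined in one feature vector.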
Features to use
An insufficient number of features can fail to provide classifiers with enough information to make good decisions
Too many features can overwhelm and confuse classifiers
Can be difficult to predict in advance which features will work well together
Individual performance of a feature is not necessarily indicative of its performance in combination with other features
Feature weighting
Feature weighting is a technique for experimentally determining the importance of various features by assigning weights to them
Used genetic algorithms here
“Evolves” a good set of weights
The weights produced by the genetic algorithm provide an indication of the importance of particular features in particular contexts
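The idea of "evolving" feature weights can be sketched with a very small genetic algorithm. This is a toy under stated assumptions, not the configuration used in the experiment: the fitness function (leave-one-out accuracy of a weighted 1-nearest-neighbour classifier), truncation selection, uniform crossover, Gaussian mutation, and all parameter values and data are hypothetical.

```python
import random

def weighted_distance(a, b, weights):
    """Squared Euclidean distance with one weight per feature."""
    return sum(w * (x - y) ** 2 for w, x, y in zip(weights, a, b))

def loo_accuracy(weights, samples, labels):
    """Fitness: leave-one-out accuracy of a weighted 1-nearest-neighbour classifier."""
    correct = 0
    for i, query in enumerate(samples):
        nearest = min(
            (j for j in range(len(samples)) if j != i),
            key=lambda j: weighted_distance(query, samples[j], weights),
        )
        correct += labels[nearest] == labels[i]
    return correct / len(samples)

def evolve_weights(samples, labels, pop_size=20, generations=30, seed=0):
    rng = random.Random(seed)
    n = len(samples[0])
    fitness = lambda w: loo_accuracy(w, samples, labels)
    # Start from uniform weights plus random candidates in [0, 1).
    population = [[1.0] * n] + [[rng.random() for _ in range(n)]
                                for _ in range(pop_size - 1)]
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        parents = population[: pop_size // 2]  # keep the fitter half (elitist)
        children = []
        while len(parents) + len(children) < pop_size:
            mum, dad = rng.sample(parents, 2)
            child = [rng.choice(pair) for pair in zip(mum, dad)]  # uniform crossover
            k = rng.randrange(n)  # mutate one weight, clamped to [0, 1]
            child[k] = min(1.0, max(0.0, child[k] + rng.gauss(0, 0.2)))
            children.append(child)
        population = parents + children
    return max(population, key=fitness)

# Hypothetical toy data: feature 0 separates the classes, feature 1 is noise.
samples = [(0.1, 0.9), (0.2, 0.1), (0.15, 0.5), (0.9, 0.8), (0.8, 0.2), (0.85, 0.55)]
labels = ["Jazz", "Jazz", "Jazz", "Punk", "Punk", "Punk"]
best = evolve_weights(samples, labels)
```

Because the fitter half of each generation is carried over unchanged, the returned weights never score worse than the uniform starting point. The relative sizes of the evolved weights are what give the indication of feature importance described above.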
Types of classification performed
The choice of “best” features is context-dependent
e.g. the best features for distinguishing between Baroque and Romantic differ from those for comparing Punk and Heavy Metal
Performed three types of classification:
Flat
Hierarchical
Round-robin
Hierarchical and round-robin feature weighting allowed classifiers to use specialized weightings in order to improve performance
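Round-robin classification can be sketched as one small classifier per pair of categories, each free to use its own feature weights, with the final label chosen by vote. The example below is an assumption-laden illustration rather than the system's actual design: it uses a nearest-centroid rule for each pair, and the function names, `pair_weights` parameter, and training data are hypothetical.

```python
from collections import Counter
from itertools import combinations

def centroid(rows):
    """Per-feature mean of a list of feature vectors."""
    return [sum(col) / len(col) for col in zip(*rows)]

def distance(a, b, weights):
    """Squared Euclidean distance with one weight per feature."""
    return sum(w * (x - y) ** 2 for w, x, y in zip(weights, a, b))

def round_robin_classify(query, training, pair_weights=None):
    """training maps category -> list of feature vectors.
    pair_weights optionally maps frozenset({a, b}) -> per-feature weights,
    so each pairwise contest can use a specialized weighting."""
    n = len(query)
    votes = Counter()
    for a, b in combinations(sorted(training), 2):
        weights = (pair_weights or {}).get(frozenset((a, b)), [1.0] * n)
        da = distance(query, centroid(training[a]), weights)
        db = distance(query, centroid(training[b]), weights)
        votes[a if da <= db else b] += 1  # winner of this pairwise contest
    return votes.most_common(1)[0][0]

# Hypothetical two-feature training data.
training = {
    "Baroque": [[0.1, 0.2], [0.2, 0.1]],
    "Romantic": [[0.2, 0.8], [0.1, 0.9]],
    "Punk": [[0.9, 0.5], [0.8, 0.4]],
}
```

With this data, `round_robin_classify([0.15, 0.85], training)` selects Romantic, since it wins both of the pairwise contests it takes part in.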
Taxonomies used
Used hierarchical taxonomies
A recording could belong to more than one category
A category could be a child of multiple parents in the taxonomical hierarchy
Performed experiments with two taxonomies:
Large (38 leaf categories):
Used to test system under realistic conditions
Small (9 leaf categories):
Used to loosely compare system to existing systems
Large taxonomy
[Figure: large taxonomy diagram]
Small taxonomy
Jazz: Bebop, Jazz Soul, Swing
Popular: Rap, Punk, Country
Western Classical: Baroque, Modern Classical, Romantic
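The small taxonomy is compact enough to write down as a data structure, which also makes the difference between leaf-level and root-level scoring concrete. This is a sketch under assumptions: it treats the small taxonomy as a strict tree (one parent per leaf, which the general taxonomies did not require), and it assumes a prediction counts as correct at the root level when the predicted and true leaves share a root. The function and variable names are hypothetical.

```python
# Root genre -> leaf genres, as listed for the small taxonomy.
SMALL_TAXONOMY = {
    "Jazz": ["Bebop", "Jazz Soul", "Swing"],
    "Popular": ["Rap", "Punk", "Country"],
    "Western Classical": ["Baroque", "Modern Classical", "Romantic"],
}

# Reverse lookup: leaf genre -> its root genre (assumes a single parent).
ROOT_OF = {leaf: root for root, leaves in SMALL_TAXONOMY.items() for leaf in leaves}

def success_rates(predicted, actual):
    """Return (leaf success rate, root success rate) for paired leaf labels."""
    pairs = list(zip(predicted, actual))
    leaf = sum(p == a for p, a in pairs) / len(pairs)
    root = sum(ROOT_OF[p] == ROOT_OF[a] for p, a in pairs) / len(pairs)
    return leaf, root
```

Confusing Swing with Bebop is an error at the leaf level but not at the root level, which is why root success rates are at least as high as leaf success rates under this scoring.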
Training and testing
Used ensembles of k-nearest neighbour and neural network classifiers
950 MIDI files
Hand-classified for training based on a variety of on-line databases
5-fold cross-validation
80% training, 20% testing
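The 80%/20% figures follow directly from 5-fold cross-validation: each fold serves once as the test set while the other four are used for training. A minimal sketch of such a split over 950 items is shown below; the shuffling, the seed, and the absence of per-genre stratification are assumptions, since the slides do not say how the folds were drawn.

```python
import random

def k_fold_splits(n_items, k=5, seed=0):
    """Yield (train_indices, test_indices) for each of k folds."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # k near-equal, disjoint folds
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test

splits = list(k_fold_splits(950))
```

With 950 recordings each split has 760 training and 190 testing items, and every recording is tested exactly once across the five folds.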
Average success rates
9 Category Taxonomy
Leaf: 90%
Root: 98%
38 Category Taxonomy
Leaf: 57%
Root: 81%
[Figure: Classification Performance. Success rate (%) for root genres and leaf genres, shown for Classical, Jazz, Pop, Average and Chance]
Success rates achieved in previous research
Audio results:
Many systems have been implemented
Generally only used 10 categories or fewer
Success rates generally below 80% for more than 5 categories
Symbolic results:
84% for 2-way classifications (Shan & Kuo 2003)
89% for 2-way classifications (Ponce de Leon & Inesta 2004)
63% for 3-way classifications (Chai & Vercoe 2001)
60-70% for 6-way classifications (Basili, Serafini & Stellato 2004)
Feature performance
Feature Group      Number of Features   Weighting Scaled by Number of Features (%)
Instrumentation    20                   46.1
Pitch              25                   24.5
Rhythm             30                   14.3
Melody             18                   11.6
Texture            14                   1.7
Dynamics           4                    1.6
Features based on instrumentation were assigned 46.1% of all weightings (after scaling)
At least one instrumentation feature played a major role in almost every classifier
Two of the top three features were based on instrumentation
Importance of instrumentation
Features based on instrumentation clearly dominant
A high-level abstraction of timbre
Implies that audio classification systems could benefit from instrument identification modules
Caveat:
These results present the overall averages of weightings
Other features played a dominant role in certain stages of classification
The best results were achieved by including a wide variety of features and applying feature weighting
Conclusions
Features based on instrumentation can play an essential role in automatic music classification, and should be used if possible
High-level features can produce good results, and should not be neglected in favour of low-level features
Bodhidharma’s large feature library combined with feature weighting is an effective approach
Very good genre classification success rates can be achieved with small taxonomies, and we are at least approaching a point where large taxonomies can be dealt with effectively