Improving Music Genre Classification Using Collaborative Tagging Data Ling Chen, Phillip Wright *,...


Improving Music Genre Classification Using Collaborative Tagging Data

Ling Chen, Phillip Wright*, Wolfgang Nejdl

Leibniz University Hannover

*Georgia Institute of Technology

WSDM 2009

Introduction – Music Information Retrieval

People need to search music by its content.

Music genre

A top-level description of content, e.g., Jazz, Rock, Country.

Critical for music information retrieval.

Microsoft required 30 musicologists over one year to manually label a “few hundred thousand songs”.

Introduction – Music Genre Classification

Challenge: music is an evolving art.

Past works trained classifiers on low-level features from audio signals: timbral texture, rhythmic content, melodic and harmonic content.

Tags of music tracks provide high-level features.

Is utilizing tags trivial? Tags may be useful information or noise.

Problem Description

A set of music tracks X = {x1, x2, …, xn}

A set of genres C = {c1, c2, …, ck}

Classification: assign each track xi a genre label C(xi) ∈ C

Γ(xi) = audio signal features of xi

T(xi) = the set of tags of xi

Graph of Tracks

Adjacent nodes are semantically similar tracks, in terms of tags.

Goal: use the tag information indirectly, because of the data sparsity problem.

Sim(xi, xj): cosine similarity over TF-IDF-weighted tag vectors.

xi and xj are adjacent if Sim(xi, xj) > the threshold ε
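The graph construction described on this slide can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the function names, the toy tag lists, and the particular TF-IDF variant (raw term frequency times log(n / df)) are assumptions, since the slide only names cosine similarity, TF-IDF weighting, and the threshold ε.

```python
import math
from collections import Counter


def tfidf_vectors(track_tags):
    """track_tags: dict track_id -> list of tags (repeats count as term frequency)."""
    n = len(track_tags)
    # Document frequency: number of tracks that carry each tag.
    df = Counter(tag for tags in track_tags.values() for tag in set(tags))
    vectors = {}
    for track, tags in track_tags.items():
        tf = Counter(tags)
        # Assumed weighting: raw term frequency times log(n / df).
        vectors[track] = {t: c * math.log(n / df[t]) for t, c in tf.items()}
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v[t] for t, w in u.items() if t in v)
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def build_graph(track_tags, eps):
    """Connect two tracks when their tag similarity exceeds eps."""
    vectors = tfidf_vectors(track_tags)
    tracks = sorted(vectors)
    neighbors = {t: set() for t in tracks}
    for i, a in enumerate(tracks):
        for b in tracks[i + 1:]:
            if cosine(vectors[a], vectors[b]) > eps:
                neighbors[a].add(b)
                neighbors[b].add(a)
    return neighbors


# Hypothetical toy data: two jazz-like and two rap-like tracks.
toy_tags = {
    "x1": ["jazz", "saxophone", "smooth"],
    "x2": ["jazz", "saxophone", "live"],
    "x3": ["rap", "hip-hop", "beats"],
    "x4": ["rap", "hip-hop", "live"],
}
graph = build_graph(toy_tags, eps=0.2)
```

In this toy data the generic tag "live" also links x2 and x4, which shows how a single uninformative tag can create an edge between tracks of different genres.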

Single-layer Classification

Assumption: the audio content of a track has no direct coupling with its neighbors’ genres.

Double-layer Classification

Idea: learn also from the unknown tracks, whose genre labels need to be predicted.

The relaxation labeling technique is adopted.

Δk = all of the known information: the audio content of all tracks and the genre labels of the known tracks.

Find the class ci for xi that maximizes Pr(ci|Δk)

Framework of Double-layer Classification

Base classifier: Naïve Bayes classifier using audio content information

Iterative process

Nu(xi) = the set of unknown neighbors of xi

Nk(xi) = the set of known neighbors of xi
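The iterative process can be illustrated with a generic relaxation-labeling loop. The update rule below (base-classifier probability multiplied by the average current belief of the neighbors, then renormalized) is an assumption chosen for simplicity; the slides do not preserve the update formula, so this sketch should not be read as the authors' method. The function name and toy values are hypothetical.

```python
def relaxation_labeling(neighbors, known_labels, base_probs, classes, iterations=10):
    """Return (predictions, beliefs) for a graph with known and unknown tracks.

    neighbors:    dict track -> set of adjacent tracks in the tag graph
    known_labels: dict track -> genre, for tracks whose genre is known
    base_probs:   dict unknown track -> {genre: probability} from a base
                  classifier (e.g. Naive Bayes on audio features)
    """
    def one_hot(label):
        return {c: 1.0 if c == label else 0.0 for c in classes}

    # Known tracks keep a fixed one-hot belief; unknown tracks start
    # from the base classifier's output.
    beliefs = {
        x: one_hot(known_labels[x]) if x in known_labels else dict(base_probs[x])
        for x in neighbors
    }
    unknown = [x for x in neighbors if x not in known_labels]

    for _ in range(iterations):
        updated = dict(beliefs)
        for x in unknown:
            scores = {}
            for c in classes:
                if neighbors[x]:
                    # Average belief in class c over known and unknown neighbors.
                    support = sum(beliefs[n][c] for n in neighbors[x]) / len(neighbors[x])
                else:
                    support = 1.0 / len(classes)
                # Assumed update: content evidence times neighborhood support.
                scores[c] = base_probs[x][c] * support
            total = sum(scores.values())
            if total > 0:
                updated[x] = {c: s / total for c, s in scores.items()}
        beliefs = updated

    predictions = {x: max(classes, key=lambda c: beliefs[x][c]) for x in unknown}
    return predictions, beliefs


# Hypothetical toy graph: a and b are known Jazz tracks, u1 and u2 are unknown.
classes = ["Jazz", "Rap"]
neighbors = {"a": {"u1"}, "b": {"u1"}, "u1": {"a", "b", "u2"}, "u2": {"u1"}}
known_labels = {"a": "Jazz", "b": "Jazz"}
base_probs = {
    "u1": {"Jazz": 0.6, "Rap": 0.4},
    "u2": {"Jazz": 0.45, "Rap": 0.55},
}
predictions, beliefs = relaxation_labeling(neighbors, known_labels, base_probs, classes)
```

Here the base classifier alone would label u2 as Rap, but u2's only neighbor is u1, which is itself pulled toward Jazz by its known neighbors, so the iteration ends with both unknown tracks labeled Jazz. This is the sense in which unknown tracks contribute information to each other.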

Experiment Data

MP3 files crawled from Last.fm.

Ground-truth genre data collected from All Music Guide.

2,262 tracks remain, in 6 genres.

Each track has between 1 and 99 tags; 29.9 tags on average.

Baseline Performance

Performance of Single-layer Classification

The similarity threshold ε is set to 0.2

Performance of Double-layer Classification

Misclassification Analysis

The performance is limited when using a smaller set of training data

Misclassification usually occurs among Rock, R&B, and Rap.

Reason: many cross-class edges between tracks of the three genres, caused by the noise problem of tag data.

Optimizing strategies: tag discrimination, tag augmentation, content combination.

Tag Discrimination

Idea: assign a higher weight to a tag with a lower class entropy:

TF-IDF(tj, xi) → TF-IDF(tj, xi) / EC(tj)

The similarity values decrease, so ε is set to 0.05.
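A small sketch of the entropy-based reweighting may help. It assumes that the class entropy EC(tj) is the Shannon entropy of the genre distribution among known tracks carrying tag tj; the slide does not give the definition. The `smoothing` constant, which avoids division by zero for tags that occur in a single genre, and the choice to leave tags unseen in the known tracks unchanged are also assumptions, as are all names and toy values.

```python
import math
from collections import Counter, defaultdict


def class_entropy(track_tags, known_labels):
    """Entropy (in bits) of the genre distribution of each tag over known tracks."""
    counts = defaultdict(Counter)
    for track, label in known_labels.items():
        for tag in set(track_tags[track]):
            counts[tag][label] += 1
    entropy = {}
    for tag, per_class in counts.items():
        total = sum(per_class.values())
        entropy[tag] = -sum(
            (n / total) * math.log2(n / total) for n in per_class.values()
        )
    return entropy


def discriminate(weights, entropy, smoothing=0.1):
    """Divide each tag weight by its (smoothed) class entropy.

    Tags without an entropy estimate keep their original weight.
    """
    return {
        tag: w / (entropy[tag] + smoothing) if tag in entropy else w
        for tag, w in weights.items()
    }


# Hypothetical toy data: "jazz" occurs in one genre only, "live" in two.
track_tags = {"x1": ["jazz", "live"], "x2": ["rap", "live"], "x3": ["jazz"]}
known_labels = {"x1": "Jazz", "x2": "Rap", "x3": "Jazz"}

entropy = class_entropy(track_tags, known_labels)
weights = discriminate({"jazz": 1.0, "live": 1.0}, entropy)
```

With equal starting weights, the genre-specific tag "jazz" ends up weighted well above the generic tag "live", which is the intended effect of the strategy.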

Performance of Tag Discrimination

Tag Augmentation

Idea: increase the number of in-class edges.

For each known track, its original tag vector is augmented by adding the tags of its neighbors.

Similarity between two tracks after augmentation:

Performance of Tag Augmentation

α = 0.6, ε = 0.2

Content Combination

Idea: augment features with other information sources.

SC(xi, xj) = content-based similarity between xi and xj

Overall similarity
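One plausible reading of the overall similarity is a convex combination of the tag-based and content-based similarities. The sketch below assumes that form, and assumes that β weights the tag-based term; the slide text does not preserve the formula, so both points are guesses. The defaults reuse the β and ε values reported with the performance results, and the function names are hypothetical.

```python
def combined_similarity(tag_sim, content_sim, beta=0.6):
    """Assumed form: beta * tag similarity + (1 - beta) * content similarity."""
    if not 0.0 <= beta <= 1.0:
        raise ValueError("beta must lie in [0, 1]")
    return beta * tag_sim + (1.0 - beta) * content_sim


def adjacent(tag_sim, content_sim, beta=0.6, eps=0.5):
    """Two tracks are linked when the combined similarity exceeds eps."""
    return combined_similarity(tag_sim, content_sim, beta) > eps


# Hypothetical similarity values for two pairs of tracks.
strong_tags = combined_similarity(0.8, 0.3)   # tags agree, audio differs
linked = adjacent(0.8, 0.3)
not_linked = adjacent(0.2, 0.9)               # audio agrees, tags differ
```

Under these assumptions a pair with similar audio but dissimilar tags is not linked at ε = 0.5, because the tag term carries the larger weight.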

Performance of Content Combination

β = 0.6, ε = 0.5

Conclusions

While most existing approaches to automatic music genre classification focus on finding better low-level features, here we explore the use of social tags for this task.

Tag information is used to construct a graph of tracks.

Two classification methods are introduced, and the double-layer classifier performs better.

Several feature-processing strategies are considered to improve performance.
