A Music Search Engine Built upon Audio-based and Web-based Similarity Measures. P. Knees, T. Pohle, M. Schedl, G. Widmer. SIGIR 2007



INTRODUCTION

Nearly all existing music search systems index the underlying music collection with manually assigned, subjective meta-information such as genre or style. This requires explicit manual annotations and covers only a small set of meta-data.

Recent approaches: content-based analysis of the audio files; collaborative recommendations; incorporating information from different sources.

RELATED WORK

Query-by-example: Query-by-Humming/Singing (QBHS); operates on MIDI; music piece → meta-data.

Cross-media: semantic ontology.

Semantic relations: crawler on “audio blogs”; word sense disambiguation; text surrounding the links to audio files; Last.fm (listening habits and tags).

PREPROCESSING THE COLLECTION

ID3 tags: artist, album, title.

Ignored: speech-only pieces (e.g. skits in rap), intros and outros, and pieces with a duration below 1 minute.

WEB-BASED FEATURES

Queries to Google:
1. “artist” music
2. “artist” “album” music review
3. “artist” “title” music review -lyrics

For each query, retrieve the 100 top-ranked pages. Remove HTML tags and stop words in 6 languages.
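The three query patterns can be sketched as a small helper. The function name `build_queries` and the sample track are illustrative assumptions; only the query patterns themselves come from the slide.

```python
def build_queries(artist: str, album: str, title: str) -> list[str]:
    """Return the three Google query strings for one track."""
    return [
        f'"{artist}" music',
        f'"{artist}" "{album}" music review',
        f'"{artist}" "{title}" music review -lyrics',
    ]

# Hypothetical ID3 values, used only to show the resulting strings.
queries = build_queries("Radiohead", "OK Computer", "Karma Police")
for q in queries:
    print(q)
```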

WEB-BASED FEATURES (CONT.)

Term list of each music piece: remove all terms with $df_{t,m} \le 2$.

Global term list: remove all terms that co-occur with fewer than 0.1% of the music pieces.

This results in 78,000 terms (dimensions).

weight(t, m): tf × idf, where $N$ is the number of music pieces and $mpf_t$ is the music piece frequency of term $t$.

Cosine normalization removes the influence of the length of the pages.
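A minimal sketch of the weighting and normalization step. The slide states only "tf * idf" with N and the music piece frequency, so the log-scaled form used here is an assumption, as are the function names and the sample counts.

```python
import math

def weight(tf: int, mpf: int, n_pieces: int) -> float:
    """tf * idf weight of one term for one music piece.

    ASSUMPTION: log-scaled tf and idf; the slide only says "tf * idf".
    tf: occurrences of the term in the pages retrieved for the piece.
    mpf: number of music pieces whose term list contains the term.
    """
    if tf <= 0 or mpf <= 0:
        return 0.0
    return (1.0 + math.log2(tf)) * math.log2(n_pieces / mpf)

def cosine_normalize(vector: list[float]) -> list[float]:
    """Scale a vector to unit Euclidean length (zero vectors stay zero)."""
    norm = math.sqrt(sum(w * w for w in vector))
    return [w / norm for w in vector] if norm > 0 else list(vector)

# Hypothetical (tf, mpf) counts for three terms in a collection of 1000 pieces.
raw = [weight(tf, mpf, 1000) for tf, mpf in [(8, 10), (1, 500), (0, 3)]]
vec = cosine_normalize(raw)
```

After normalization every non-empty track vector has length 1, so a track with many long web pages does not dominate one with only a few short pages.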

AUDIO-BASED SIMILARITY

MFCCs, Gaussian Mixture Model, KL divergence

Problems: hubs (pieces that are frequently similar to others); outliers (pieces that are never similar to others); the distance measure does not fulfill the triangle inequality.

The authors’ previous work addresses these problems.
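To make the basic distance concrete, here is a simplified sketch. The slide names a Gaussian Mixture Model, and the KL divergence between two mixtures has no closed form, so this example swaps in a single Gaussian with diagonal covariance per track, where the divergence can be written directly. The function names and the sample models are assumptions.

```python
import math

def kl_diag_gauss(mean_p, var_p, mean_q, var_q):
    """KL(p || q) for two Gaussians with diagonal covariance."""
    return 0.5 * sum(
        math.log(vq / vp) + (vp + (mp - mq) ** 2) / vq - 1.0
        for mp, vp, mq, vq in zip(mean_p, var_p, mean_q, var_q)
    )

def d_basic(track_a, track_b):
    """Symmetrised KL divergence between two (mean, variance) track models."""
    (ma, va), (mb, vb) = track_a, track_b
    return kl_diag_gauss(ma, va, mb, vb) + kl_diag_gauss(mb, vb, ma, va)

# Hypothetical two-dimensional MFCC statistics for two tracks.
a = ([0.0, 1.0], [1.0, 2.0])
b = ([1.0, 1.0], [1.0, 2.0])
print(d_basic(a, b))
```

Summing both directions makes the value symmetric, but it is still not a metric, which is one reason the triangle inequality can fail.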

AUDIO-BASED SIMILARITY (CONT.)

Always similar (hubs): $\mathrm{ndist}(A)$ = distance from $A$ to its $n$th nearest neighbour

$g(A, P_i) = D_{\mathrm{basic}}(A, P_i) / \mathrm{ndist}(P_i)$, for all $i$

Sort $g(A, P_i)$ in ascending order and pick the $n$th value as $f(A)$

$D_{n\text{-NN norm}}(A, B) = D_{\mathrm{basic}}(A, B) / (f(A) \cdot f(B))$

Never similar (outliers): handled like above

Triangle inequality: sort $D_{\mathrm{basic}}(A, P_i)$, for all $i$

Interpolate $D_{\mathrm{basic}}(A, B)$ into the sorted $D_{\mathrm{basic}}(A, P_i)$

$D_P(A, B)$ is the rank of $D_{\mathrm{basic}}(A, B)$ among the $D_{\mathrm{basic}}(A, P_i)$

$D_{pv}(A, B) = D_P(A, B) + D_P(B, A)$
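The two corrections above can be sketched on a small distance matrix. This is an illustration under assumptions, not the authors' implementation: a track is excluded from its own neighbour list, the rank is a plain integer count rather than an interpolated value, and all names are hypothetical.

```python
def nth_smallest(values, n):
    """Return the nth smallest value (1-based)."""
    return sorted(values)[n - 1]

def nn_normalize(d_basic, n):
    """n-NN normalization: D(A, B) / (f(A) * f(B))."""
    size = len(d_basic)
    others = lambda a: [p for p in range(size) if p != a]
    # ndist(P): distance from P to its nth nearest neighbour.
    ndist = [nth_smallest([d_basic[p][q] for q in others(p)], n)
             for p in range(size)]
    # f(A): nth smallest of g(A, P_i) = D_basic(A, P_i) / ndist(P_i).
    f = [nth_smallest([d_basic[a][p] / ndist[p] for p in others(a)], n)
         for a in range(size)]
    return [[d_basic[a][b] / (f[a] * f[b]) for b in range(size)]
            for a in range(size)]

def rank_distance(d_basic):
    """D_pv(A, B) = D_P(A, B) + D_P(B, A), using integer ranks."""
    size = len(d_basic)
    def rank(a, b):
        # Number of other tracks strictly closer to a than b is.
        return sum(1 for p in range(size)
                   if p != a and d_basic[a][p] < d_basic[a][b])
    return [[rank(a, b) + rank(b, a) for b in range(size)]
            for a in range(size)]

# Hypothetical symmetric distance matrix for three tracks.
D = [[0, 1, 4],
     [1, 0, 2],
     [4, 2, 0]]
norm = nn_normalize(D, n=1)
pv = rank_distance(D)
```

Two identical tracks would give a zero `ndist` and a division by zero, so a real collection would need a guard for that case.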

DIMENSIONALITY REDUCTION

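χ² test, with s: the 100 most similar tracks and d: the 100 most dissimilar tracks. Calculate χ²(t, s) for each term t.

The n terms with the highest values are then joined into a global list.

Contingency counts for a term t:

         s    d
t        A    B
not t    C    D

Resulting dimensionality for different n:

n                none    50     100    150
dimensionality   78000   4679   6975   8866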
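The term selection can be sketched with the standard χ² statistic for a 2×2 contingency table, using the counts A, B, C, D from the table above. The slide does not preserve the formula, so treating it as the standard statistic is an assumption, as are the function names and the sample counts.

```python
def chi_square(a: int, b: int, c: int, d: int) -> float:
    """Chi-square statistic for a 2x2 table.

    a: tracks in s containing the term, b: tracks in d containing it,
    c: tracks in s without it,          d: tracks in d without it.
    """
    n = a + b + c + d
    denom = (a + b) * (a + c) * (b + d) * (c + d)
    return n * (a * d - b * c) ** 2 / denom if denom else 0.0

def top_terms(counts: dict[str, tuple[int, int]], group_size: int, n: int) -> list[str]:
    """Return the n terms with the highest chi-square value.

    counts[term] = (tracks in s containing term, tracks in d containing term).
    """
    scored = {
        term: chi_square(in_s, in_d, group_size - in_s, group_size - in_d)
        for term, (in_s, in_d) in counts.items()
    }
    return sorted(scored, key=scored.get, reverse=True)[:n]

# Hypothetical counts for one track with |s| = |d| = 100.
counts = {"guitar": (80, 20), "the": (100, 100), "piano": (30, 60)}
selected = top_terms(counts, group_size=100, n=2)
```

A term that appears in every track of both groups, such as "the" here, scores 0 and is dropped. The global list would then be the union of the selected terms over all tracks.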

VECTOR ADAPTATION

Particularly necessary for tracks for which no related information could be retrieved from the web.

A simple smoothing of the term vectors is performed.
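The smoothing formula is not preserved on the slide. As a purely illustrative sketch, assume each track's term vector is blended with the vectors of its nearest audio neighbours, with weights that decay by neighbour rank. The function name, the geometric weights, and the sample vectors are all assumptions.

```python
def smooth_vector(own, neighbours, decay=0.5):
    """Blend a term vector with those of its audio neighbours.

    neighbours: term vectors ordered from most to least audio-similar.
    ASSUMPTION: geometric rank weights 1, decay, decay**2, ...
    """
    vectors = [own] + list(neighbours)
    weights = [decay ** rank for rank in range(len(vectors))]
    total = sum(weights)
    return [
        sum(w * v[i] for w, v in zip(weights, vectors)) / total
        for i in range(len(own))
    ]

# A track with no web information (all-zero vector) inherits terms
# from its two nearest audio neighbours.
empty = [0.0, 0.0]
smoothed = smooth_vector(empty, [[1.0, 0.0], [0.0, 1.0]])
```

The closest neighbour contributes most, so the smoothed vector leans towards the terms of the most similar-sounding track.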

QUERYING THE MUSIC SEARCH ENGINE

Original query + “music” -site:last.fm

Google search: retrieve the 10 top-most web pages, map them to the vector space, and calculate Euclidean distances.
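A sketch of the query step, assuming the retrieved pages have already been turned into one query vector in the same term space as the tracks. The function names, the exact quoting of the extended query, and the toy vectors are assumptions.

```python
import math

def build_search_query(user_query: str) -> str:
    """Extend the original query with the music constraint and site filter."""
    return f"{user_query} music -site:last.fm"

def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))

def rank_tracks(query_vec, track_vecs):
    """Return track ids ordered from nearest to farthest from the query."""
    return sorted(track_vecs, key=lambda t: euclidean(query_vec, track_vecs[t]))

# Hypothetical two-term vector space with two tracks.
tracks = {"track_a": [0.0, 1.0], "track_b": [0.9, 0.1]}
ranking = rank_tracks([1.0, 0.0], tracks)
print(build_search_query("rock with great riffs"), ranking)
```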

AUDIOSCROBBLER GROUND TRUTH

Common approach: genre information as ground truth, which has several drawbacks.

http://www.audioscrobbler.net: web services to access Last.fm data. Tag information provided by Last.fm; this also has drawbacks.

Uses the top tags for tracks (227 tags in total).

PERFORMANCE EVALUATION

Dimensionality reduction: passes the significance test; χ²/50 performs best. Compared with a random permutation.

PERFORMANCE EVALUATION (CONT.)

Vector adaptation (re-weighting): no significant difference.

PERFORMANCE EVALUATION (CONT.)

Overall: precision after 10 documents.
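Precision after 10 documents can be computed as below. This sketch assumes a track counts as relevant when the ground truth assigns it the tag used as the query; the function name and the sample ids are hypothetical.

```python
def precision_at_k(ranked_ids, relevant_ids, k=10):
    """Fraction of the first k returned tracks that are relevant."""
    top = ranked_ids[:k]
    return sum(1 for track in top if track in relevant_ids) / k

# Hypothetical ranking of ten tracks, two of which carry the query tag.
ranked = list("abcdefghij")
relevant = {"a", "c", "z"}
p10 = precision_at_k(ranked, relevant)
```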

EXAMPLES

Rock with great riffs

Punk

Relaxing music

FUTURE WORK

[Figure: system pipeline diagram, repeated on each slide. Boxes: 12601 tracks, ID3 tag, Google search, Web-based feature, Audio similarity, Vector adaptation, Dimensionality reduction; Query, Google search, Vector space, results.]

Points annotated on the diagram, one per slide:

Compilation albums, remixes
Lyrics
Indexing documents
PLSA
Computational inefficiency
Ground truth?