LAM: Musical Audio Similarity

Michael Casey

Centre for Cognition, Computation and Culture

Department of Computing

Goldsmiths College, University of London

Overview

• Machine Music Understanding
  – Features / Classes / Clusters
• Real-Time Audio Matching
  – Feature Extraction
  – Feature Similarity (Indexing / Retrieval)
  – PD/MSP Tools
• Music Similarity Applications
  – Sound object matching
  – Texture matching

Sound Understanding

Signal Processing Sound Understanding

Feature Extraction

[Figure: an audio source divided into overlapping frames (frame 1, frame 2, frame 3); time axis marked at 10 ms, 20 ms, 30 ms and 40 ms]
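The framing step in the figure can be sketched as follows. This is a minimal illustration, not the talk's implementation: the 20 ms frame length, the 10 ms hop, the 16 kHz sample rate and the name `frame_signal` are all assumptions chosen to be consistent with the tick marks on the slide.

```python
import numpy as np

def frame_signal(x, sample_rate, frame_ms=20.0, hop_ms=10.0):
    """Split a 1-D signal into overlapping frames, one frame per row.

    frame_ms and hop_ms are assumed values, not taken from the slides.
    """
    x = np.asarray(x)
    frame_len = int(round(sample_rate * frame_ms / 1000.0))
    hop_len = int(round(sample_rate * hop_ms / 1000.0))
    if len(x) < frame_len:
        return np.empty((0, frame_len))
    n_frames = 1 + (len(x) - frame_len) // hop_len
    # Row i covers samples [i * hop_len, i * hop_len + frame_len)
    idx = np.arange(frame_len)[None, :] + hop_len * np.arange(n_frames)[:, None]
    return x[idx]

sample_rate = 16000                                    # assumed sample rate
x = np.arange(sample_rate * 40 // 1000, dtype=float)   # 40 ms of dummy samples
frames = frame_signal(x, sample_rate)
print(frames.shape)
```

With these assumed settings, 40 ms of audio yields three frames, each sharing half of its samples with its neighbour.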


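Statistical Learning for Decision Making

$$P(\text{class} \mid x) = \frac{p(x \mid \text{class})\,P(\text{class})}{p(x)}$$

x: feature vector; class: Music or Speech

[Figure: partitioning of feature space into Music and Speech regions by a decision boundary]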
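A small worked example of a Bayes decision between two classes may help here. It assumes a single scalar feature with Gaussian class-conditional densities; the means, standard deviations, priors and function names are hypothetical and are not taken from the talk.

```python
import numpy as np

def gaussian_pdf(x, mean, std):
    """Density of a 1-D Gaussian at x."""
    return np.exp(-0.5 * ((x - mean) / std) ** 2) / (std * np.sqrt(2.0 * np.pi))

def posteriors(x, class_params, priors):
    """Return P(class | x) for every class using Bayes' rule."""
    joint = {c: gaussian_pdf(x, *class_params[c]) * priors[c] for c in class_params}
    evidence = sum(joint.values())            # p(x), the normalising term
    return {c: joint[c] / evidence for c in joint}

# Hypothetical (mean, std) per class and equal priors
class_params = {"music": (0.0, 1.0), "speech": (2.0, 1.0)}
priors = {"music": 0.5, "speech": 0.5}

post = posteriors(1.0, class_params, priors)
decision = max(post, key=post.get)
print(post, decision)
```

With equal priors and equal variances, the decision boundary of this toy model lies midway between the two class means; on either side of it one class has the larger posterior.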

MPEG-7 Audio Tools

Audio
→ Log Frequency Spectrogram (AudioSpectrumEnvelopeD)
→ Log Amplitude
→ Decorrelating Transform / Dimension Reduction (AudioSpectrumProjectionD)
→ Hidden Markov Model (SoundModelDS)
→ State Path (SoundModelStatePathD)

Use estimated state sequence as a feature
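The first three stages of this chain (log-frequency spectrum, log amplitude, decorrelating projection) can be sketched generically as below. This is not the normative MPEG-7 extraction: the band layout, the Hann window, the use of a plain SVD as the decorrelating transform and all function names are assumptions, and the hidden Markov model stage is left out.

```python
import numpy as np

def log_band_spectrogram(frames, n_bands=8):
    """Power spectrum per frame, pooled into logarithmically spaced bands."""
    window = np.hanning(frames.shape[1])
    power = np.abs(np.fft.rfft(frames * window, axis=1)) ** 2
    n_bins = power.shape[1]
    # Band edges spaced logarithmically over FFT bin indices (DC bin excluded);
    # np.unique may return fewer bands if rounded edges coincide.
    edges = np.unique(np.round(np.geomspace(1, n_bins, n_bands + 1)).astype(int))
    bands = [power[:, lo:hi].sum(axis=1) for lo, hi in zip(edges[:-1], edges[1:])]
    return np.stack(bands, axis=1)

def to_decibels(envelope, floor=1e-10):
    """Log-amplitude scaling with a floor to avoid log(0)."""
    return 10.0 * np.log10(np.maximum(envelope, floor))

def project(envelope_db, n_components=3):
    """Decorrelate and reduce dimension with an SVD of the centred data."""
    centred = envelope_db - envelope_db.mean(axis=0)
    _, _, vt = np.linalg.svd(centred, full_matrices=False)
    basis = vt[:n_components].T
    return centred @ basis, basis

rng = np.random.default_rng(0)
frames = rng.standard_normal((50, 320))      # stand-in for framed audio
env = log_band_spectrogram(frames)
proj, basis = project(to_decibels(env))
print(env.shape, proj.shape)
```

The projected columns are mutually uncorrelated by construction, which is the property a decorrelating transform is meant to provide before the features are passed to a statistical model.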

MPEG-7 Audio Strings: Acoustic Lexicons

Audio
→ Log Frequency Spectrogram (AudioSpectrumEnvelopeD)
→ Log Amplitude
→ Decorrelating Transform / Dimension Reduction (AudioSpectrumProjectionD)
→ Hidden Markov Model (SoundModelDS)
→ State Path (SoundModelStatePathD)

State path as a SYMBOL STRING: ? 7 1 V 7 1 0 1 ...

State Symbol Sequence (40 State Model)

?71V7101 ...


SoundModelStateHistogramD

[Figure: plots with axes labelled "state index", "seconds" and "0.01s Frames"]
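A state histogram of this kind can be sketched as the normalised count of how often each state occurs in a state path. The integer state indices below are hypothetical stand-ins for the symbols shown on the slides, and `state_histogram` is an illustrative name, not an MPEG-7 API.

```python
import numpy as np

def state_histogram(state_path, n_states):
    """Relative frequency of each state index in a state path."""
    counts = np.bincount(np.asarray(state_path, dtype=int), minlength=n_states)
    total = counts.sum()
    if total == 0:
        return counts.astype(float)   # empty path: all-zero histogram
    return counts / total

path = [7, 1, 7, 1, 0, 1]             # hypothetical state indices, one per frame
hist = state_histogram(path, n_states=40)
print(hist[:8])
```

The histogram discards the order of the states, so two clips with the same state content but different temporal structure map to the same vector.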

Self-Similarity Matrix

$$d(a,b) = \cos^{-1}\!\left(\frac{a^{T} b}{\|a\|\,\|b\|}\right)$$

[Figure: S-Matrix built by comparing feature vectors a and b]
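A self-similarity matrix can be sketched by comparing every feature frame with every other frame using the normalised dot product. The function name and the toy feature vectors are assumptions for illustration; taking the arccosine of each entry turns the similarity into an angle.

```python
import numpy as np

def self_similarity(features, eps=1e-12):
    """Cosine similarity between every pair of rows (frames)."""
    features = np.asarray(features, dtype=float)
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    unit = features / np.maximum(norms, eps)   # guard against zero vectors
    return np.clip(unit @ unit.T, -1.0, 1.0)

features = np.array([[1.0, 0.0],
                     [0.0, 2.0],
                     [3.0, 0.0]])
S = self_similarity(features)
angles = np.arccos(S)                          # angle between each frame pair
print(S)
```

Frames 0 and 2 point in the same direction and so are maximally similar despite their different magnitudes, while frame 1 is orthogonal to both.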

Efficient Storage / Retrieval

• Real-Time Access

• Large Databases

• Distributed Databases

PostgreSQL Database Representation of State Path “Strings” and Histograms

Similarity

• Compute distance between feature pairs
• Features == SoundModelStateHistogramD
• Similarity Metric
  – dist(a,b) >= 0
  – dist(a,b) == 0 iff a == b
  – dist(a,b) + dist(b,c) >= dist(a,c)

• Vector Dot Product

$$d(a,b) = \cos^{-1}\!\left(\frac{a^{T} b}{\|a\|\,\|b\|}\right)$$

Similarity of Feature Trajectories

Dynamic Time Warping
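Dynamic time warping can be sketched with the textbook dynamic-programming recurrence below. This is a generic version, not the talk's implementation: the Euclidean local cost, the unconstrained step pattern and the name `dtw_distance` are assumptions.

```python
import numpy as np

def dtw_distance(a, b):
    """Accumulated cost of the best monotonic alignment of two sequences.

    a and b are sequences of scalars or of equal-length feature vectors.
    """
    a = np.asarray(a, dtype=float).reshape(len(a), -1)
    b = np.asarray(b, dtype=float).reshape(len(b), -1)
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = np.linalg.norm(a[i - 1] - b[j - 1])
            # Extend the cheapest of: insertion, deletion, or match
            cost[i, j] = local + min(cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])
    return cost[n, m]

print(dtw_distance([0, 1, 2], [0, 1, 1, 2]))   # same shape, different timing
print(dtw_distance([0, 1, 2], [0, 1, 3]))
```

The first pair differs only in timing, so the warp absorbs the difference; the second pair differs in value at the final frame, which no warp can remove.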

Acousticon Strings

• Distance Metric
  – String Edit Distance (Levenshtein)
• Scalable to Large Databases
  – PostgreSQL Implementation
  – Can use built-in Index Structures
• Scalable to Real-Time Implementation
  – matching and audio streaming (< 20 ms)
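The string edit distance named above can be sketched with the standard two-row dynamic program. This is a plain Python illustration, not the PostgreSQL implementation mentioned on the slide; the second example string is a hypothetical variant of the state-path string shown earlier.

```python
def edit_distance(s, t):
    """Levenshtein distance: minimum number of single-symbol insertions,
    deletions and substitutions needed to turn s into t."""
    prev = list(range(len(t) + 1))          # distances from "" to prefixes of t
    for i, cs in enumerate(s, start=1):
        curr = [i]                          # distance from s[:i] to ""
        for j, ct in enumerate(t, start=1):
            curr.append(min(prev[j] + 1,               # delete cs
                            curr[j - 1] + 1,           # insert ct
                            prev[j - 1] + (cs != ct))) # substitute or match
        prev = curr
    return prev[-1]

print(edit_distance("?71V7101", "?71V701"))   # one state symbol removed
print(edit_distance("kitten", "sitting"))     # classic example, distance 3
```

Because each state is a single symbol, one edit corresponds to one inserted, dropped or replaced frame state, which makes the distance tolerant of small timing differences between clips.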

Information Retrieval for Creativity

• Use an existing sound database as a source of new material

• Take the structure of a music clip but replace the content.

• New interfaces for music creativity.

Audio Information Retrieval

Audio Query → Extract → Segment → Match → Result List

• MPEG-7 Database: a pre-indexed collection of sounds
• Extract: feature extraction from audio
• Segment: partitioning of audio into chunks
• Match: find similar chunks of audio
• Result List: a sound, a scene, or a list of sounds
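The matching stage can be sketched as ranking pre-indexed feature vectors by their distance to a query. Everything here is illustrative: the in-memory dictionary stands in for the database, the three-bin histograms and their names are invented, and the angular distance is one possible choice of metric.

```python
import numpy as np

def angular_distance(a, b, eps=1e-12):
    """Angle between two feature vectors, in radians."""
    cos = np.dot(a, b) / max(np.linalg.norm(a) * np.linalg.norm(b), eps)
    return float(np.arccos(np.clip(cos, -1.0, 1.0)))

def rank_matches(query, database, top_k=3):
    """Return the top_k (name, distance) pairs, nearest first."""
    scored = [(name, angular_distance(query, feat)) for name, feat in database.items()]
    return sorted(scored, key=lambda item: item[1])[:top_k]

# Hypothetical pre-indexed collection: name -> state histogram
database = {
    "drum_loop": np.array([0.7, 0.2, 0.1]),
    "speech": np.array([0.1, 0.1, 0.8]),
    "bass_line": np.array([0.6, 0.3, 0.1]),
}
query = np.array([0.65, 0.25, 0.10])   # histogram extracted from the query audio
results = rank_matches(query, database, top_k=2)
print(results)
```

A linear scan like this is adequate for a small collection; for large or distributed databases an index structure would replace the loop over all entries.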


Musaics: Real-Time Matching