Transcript
Page 1: From hits to niches? ...or how popular artists can bias music recommendation and discovery

2nd netflix workshop // ACM KDD / Las Vegas, US // August, 24th 2008

From Hits to Niches? ...or how popular artists can bias music recommendation and discovery

Òscar Celma, Pedro Cano (Music Technology Group ~ UPF)

(Barcelona Music and Audio Technologies, BMAT)

Page 2: From hits to niches? ...or how popular artists can bias music recommendation and discovery

ACM KDD // 2nd netflix workshop // 2008 // òscar celma & pedro cano

The problem

Page 3: From hits to niches? ...or how popular artists can bias music recommendation and discovery

music overload

• Today (August, 2007)

iTunes: 6M tracks / 3B sales

P2P: 15B tracks

53% buy music online

• Tomorrow

All music will be online

Billions of tracks

Millions more arriving every week

• Finding new, relevant music is hard!

Page 4: From hits to niches? ...or how popular artists can bias music recommendation and discovery

Can recommender systems help us?

Page 5: From hits to niches? ...or how popular artists can bias music recommendation and discovery

the Long Tail of popularity

• Help me find it!

Page 6: From hits to niches? ...or how popular artists can bias music recommendation and discovery

If you like The Beatles you might like ...

Page 7: From hits to niches? ...or how popular artists can bias music recommendation and discovery

the musical Turing Test

• Which recommendation is from a human, which is from a machine?

List #1

• Bob Dylan

• Beach Boys

• Billy Joel

• Rolling Stones

• Animals

• Aerosmith

• The Doors

• Simon & Garfunkel

• Crosby, Stills, Nash & Young

• Paul Simon

List #2

• Chuck Berry

• Harry Nilsson

• XTC

• Marshall Crenshaw

• Super Furry Animals

• Badfinger

• The Raspberries

• The Flaming Lips

• Jason Faulkner

• Michael Penn

Page 8: From hits to niches? ...or how popular artists can bias music recommendation and discovery

the musical Turing Test

• Which recommendation is from a human, which is from a machine?

Machine: Up to 11

Human

Page 9: From hits to niches? ...or how popular artists can bias music recommendation and discovery

Page 10: From hits to niches? ...or how popular artists can bias music recommendation and discovery

Page 11: From hits to niches? ...or how popular artists can bias music recommendation and discovery

• popularity bias

• low novelty

Page 12: From hits to niches? ...or how popular artists can bias music recommendation and discovery

Our proposal

Page 13: From hits to niches? ...or how popular artists can bias music recommendation and discovery

Long Tail popularity + item similarity network

Page 15: From hits to niches? ...or how popular artists can bias music recommendation and discovery

Item-to-item Complex Network Analysis

Page 16: From hits to niches? ...or how popular artists can bias music recommendation and discovery

complex network analysis

• Datasets: artist recommendation networks

CF*: Social-based, incl. item-based CF (Last.fm): “people who listen to X also listen to Y”

CB: Content-based audio similarity: “X and Y sound similar”

EX: Human expert-based (AllMusicGuide): “X similar to (or influenced by) Y”

Page 17: From hits to niches? ...or how popular artists can bias music recommendation and discovery

complex network analysis

• Small-world networks

Network traverse in a few clicks
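One way to make "a few clicks" concrete is the average shortest-path length of the similarity graph. The sketch below is only illustrative: the toy graph and the function names are hypothetical, not taken from the talk, and a full small-world test would also compare clustering against a random graph.

```python
from collections import deque

def shortest_path_lengths(graph, source):
    """BFS distances (number of clicks) from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nxt in graph.get(node, ()):
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist

def average_path_length(graph):
    """Mean shortest-path length over all ordered pairs that are reachable."""
    total, pairs = 0, 0
    for source in graph:
        for target, d in shortest_path_lengths(graph, source).items():
            if target != source:
                total += d
                pairs += 1
    return total / pairs if pairs else float("inf")

# Hypothetical directed "similar artists" graph: artist -> list of similar artists.
toy_graph = {
    "a": ["b", "c"],
    "b": ["c"],
    "c": ["a", "d"],
    "d": ["a"],
}

print(average_path_length(toy_graph))
```

A small average path length relative to the number of nodes is what lets a listener reach any artist in a handful of clicks.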

Page 18: From hits to niches? ...or how popular artists can bias music recommendation and discovery

complex network analysis

• Last.fm is a scale-free network

power law exponent for the cumulative indegree distribution

A few artists (hubs) control the network
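A rough sketch of how a cumulative indegree distribution and its power-law exponent might be estimated. The least-squares fit on log-log axes is a simplification chosen for brevity (not necessarily the fitting method used in the talk), and all names here are hypothetical.

```python
import math
from collections import Counter

def cumulative_indegree_distribution(indegree_values):
    """Return sorted (k, P(K >= k)) pairs for the observed indegrees."""
    n = len(indegree_values)
    hist = Counter(indegree_values)
    points = []
    remaining = n  # number of nodes with indegree >= current k
    for k in sorted(hist):
        points.append((k, remaining / n))
        remaining -= hist[k]
    return points

def loglog_slope(points):
    """Least-squares slope of log(y) against log(x); requires x > 0 and y > 0."""
    xs = [math.log(x) for x, _ in points]
    ys = [math.log(y) for _, y in points]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den

# Hypothetical indegrees: many artists with few incoming links, one hub.
toy_indegrees = [1, 1, 1, 1, 2, 2, 4]
points = cumulative_indegree_distribution(toy_indegrees)
print(points, loglog_slope(points))
```

On a log-log plot a scale-free network shows an approximately straight line, and the negated slope is an estimate of the exponent of the cumulative distribution.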

Page 22: From hits to niches? ...or how popular artists can bias music recommendation and discovery

complex network analysis

• Indegree – avg. neighbor indegree correlation

Kin(Robert Palmer) = 430 => avg(Kin(sim(Robert Palmer))) = 342

Kin(Ed Alton) = 11 => avg(Kin(sim(Ed Alton))) = 18

Page 23: From hits to niches? ...or how popular artists can bias music recommendation and discovery

complex network analysis

• Indegree – avg. neighbor indegree correlation

• Last.fm presents assortative mixing

Artists with high indegree are connected together, similarly for low indegree artists

r = Pearson correlation
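The indegree vs. average neighbor indegree correlation can be sketched as follows. This is an illustrative reading of the slide, assuming "neighbors" means the artists listed as similar to a given artist; the toy data and function names are hypothetical.

```python
import math
from collections import Counter

def pearson(xs, ys):
    """Pearson correlation coefficient r of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)

def indegree_vs_neighbor_indegree(similar):
    """For each artist, pair Kin(artist) with avg(Kin(sim(artist)))."""
    kin = Counter(t for targets in similar.values() for t in targets)
    xs, ys = [], []
    for artist, targets in similar.items():
        if targets:
            xs.append(kin[artist])
            ys.append(sum(kin[t] for t in targets) / len(targets))
    return xs, ys

# Hypothetical network: artist -> artists recommended as similar.
toy_similar = {
    "hub1": ["hub2"],
    "hub2": ["hub1"],
    "n1": ["hub1", "hub2"],
    "n2": ["hub1", "n3"],
    "n3": ["n2"],
}

xs, ys = indegree_vs_neighbor_indegree(toy_similar)
print(pearson(xs, ys))
```

A clearly positive r indicates assortative mixing: well-connected artists point to other well-connected artists, which is how popular artists can keep a listener inside the head of the curve.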

Page 24: From hits to niches? ...or how popular artists can bias music recommendation and discovery

complex network analysis

• Summary

|-------------|---------|-----|-----------|
|             | Last.fm | CB  | Exp (AMG) |
|-------------|---------|-----|-----------|
| Small World | yes     | yes | yes       |
| Scale-free  | yes     | no  | no        |
| Ass. mixing | yes     | no  | no        |
|-------------|---------|-----|-----------|

Page 25: From hits to niches? ...or how popular artists can bias music recommendation and discovery

complex network analysis

• But, still some remaining questions...

Are the hubs the most popular artists?

Can we navigate through the Long Tail, via artist similarity?

Page 26: From hits to niches? ...or how popular artists can bias music recommendation and discovery

Long Tail popularity + item similarity network

Page 28: From hits to niches? ...or how popular artists can bias music recommendation and discovery

Item popularity analysis

Page 30: From hits to niches? ...or how popular artists can bias music recommendation and discovery

the Long Tail in music

• last.fm dataset ~260K artists

the beatles (50,422,827)

radiohead (40,762,895)

red hot chili peppers (37,564,100)

muse (30,548,064)

death cab for cutie (29,335,085)

pink floyd (28,081,366)

coldplay (27,120,352)

metallica (25,749,442)

Page 31: From hits to niches? ...or how popular artists can bias music recommendation and discovery

the Long Tail model [Kilkki, 2007]

• F(x) = Cumulative distribution up to x
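For reference, the model in [Kilkki, 2007] is commonly written in the form below. The slide itself only defines F(x), so the symbols α, β and N50 follow that paper rather than the talk: N50 is the number of top-ranked items that account for half of the total volume, β is the share the curve approaches as x grows, and α controls how steeply the curve rises.

```latex
F(x) = \frac{\beta}{\left(\frac{N_{50}}{x}\right)^{\alpha} + 1}
```

Setting x = N50 gives F(N50) = β / 2, which matches the interpretation of N50 as the median rank of the volume.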

Page 32: From hits to niches? ...or how popular artists can bias music recommendation and discovery

the Long Tail model [Kilkki, 2007]

• Top-8 artists: F(8)~ 3.5% of total plays

50,422,827   the beatles

40,762,895   radiohead

37,564,100   red hot chili peppers

30,548,064   muse

29,335,085   death cab for cutie

28,081,366   pink floyd

27,120,352   coldplay

25,749,442   metallica
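As a quick check of the arithmetic, the eight playcounts can be summed and combined with the stated F(8) of about 3.5% to back out the approximate size of the whole dataset. The implied total is a derived estimate, not a figure given in the talk.

```python
# Playcounts of the top-8 artists as listed on the slide.
top8_plays = {
    "the beatles": 50_422_827,
    "radiohead": 40_762_895,
    "red hot chili peppers": 37_564_100,
    "muse": 30_548_064,
    "death cab for cutie": 29_335_085,
    "pink floyd": 28_081_366,
    "coldplay": 27_120_352,
    "metallica": 25_749_442,
}

top8_total = sum(top8_plays.values())

# F(8) ~ 3.5% is approximate, so the implied total is only a rough estimate.
implied_total_plays = top8_total / 0.035

print(f"top-8 plays: {top8_total:,}")
print(f"implied total plays: {implied_total_plays:,.0f}")
```

The top-8 artists account for roughly 270 million plays, which puts the full dataset at somewhere around 7.7 billion plays if the 3.5% figure is taken at face value.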

Page 33: From hits to niches? ...or how popular artists can bias music recommendation and discovery

the Long Tail model [Kilkki, 2007]

• Split the curve in three parts

Head (82 artists), Mid (6,573 artists), Tail (~254K artists)
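A minimal sketch of assigning artists to the three parts by popularity rank. The segment sizes are the ones on the slide; how those boundaries were derived from the model is not shown there, so the code simply takes them as given. The function names are hypothetical.

```python
HEAD_SIZE = 82     # artists in the first part, from the slide
MID_SIZE = 6_573   # artists in the second part, from the slide

def segment_for_rank(rank, head_size=HEAD_SIZE, mid_size=MID_SIZE):
    """Map a 1-based popularity rank to 'head', 'mid' or 'tail'."""
    if rank < 1:
        raise ValueError("rank is 1-based")
    if rank <= head_size:
        return "head"
    if rank <= head_size + mid_size:
        return "mid"
    return "tail"

def segment_artists(playcounts):
    """Rank artists by playcount (descending) and label each one."""
    ranked = sorted(playcounts, key=playcounts.get, reverse=True)
    return {artist: segment_for_rank(rank)
            for rank, artist in enumerate(ranked, start=1)}

# Two playcounts quoted elsewhere in the talk; both rank inside the head here
# only because this toy dictionary contains just two artists.
print(segment_artists({"the beatles": 50_422_827, "Mike Shupp": 577}))
```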

Page 34: From hits to niches? ...or how popular artists can bias music recommendation and discovery

Combining complex network analysis (item-item similarity) with the Long Tail (item popularity)

Page 35: From hits to niches? ...or how popular artists can bias music recommendation and discovery

Long Tail popularity + item similarity network

Page 36: From hits to niches? ...or how popular artists can bias music recommendation and discovery

artist indegree vs. artist popularity

• Are the network hubs the most popular items?

???

Page 37: From hits to niches? ...or how popular artists can bias music recommendation and discovery

artist indegree vs. artist popularity

Last.fm: correlation between Kin and playcounts

Page 38: From hits to niches? ...or how popular artists can bias music recommendation and discovery

artist indegree vs. artist popularity

Audio CB similarity: no correlation

Page 39: From hits to niches? ...or how popular artists can bias music recommendation and discovery

artist indegree vs. artist popularity

Expert: correlation between Kin and playcounts

Page 40: From hits to niches? ...or how popular artists can bias music recommendation and discovery

artist similarity vs. artist popularity

• “From Hits to Niches” navigation in the Long Tail using artist similarity

how many clicks?

Page 41: From hits to niches? ...or how popular artists can bias music recommendation and discovery

artist similarity vs. artist popularity

• “From Hits to Niches” Audio CB similarity example (VIDEO)

Page 44: From hits to niches? ...or how popular artists can bias music recommendation and discovery

artist similarity vs. artist popularity

• “From Hits to Niches” Audio CB similarity example

Bruce Springsteen (14,433,411 plays)

The Rolling Stones (27,720,169 plays)

Mike Shupp (577 plays)

Page 45: From hits to niches? ...or how popular artists can bias music recommendation and discovery

artist similarity vs. artist popularity

• navigation in the Long Tail

Similar artists, given an artist in the HEAD part:

|-----|--------|--------|--------|
|     | Head   | Mid    | Tail   |
|-----|--------|--------|--------|
| CF  | 45.32% | 54.68% | 0%     |
| CB  | 6.46%  | 64.74% | 28.80% |
| EXP | 5.82%  | 60.92% | 33.26% |
|-----|--------|--------|--------|

Also, it can be seen as a Markovian stochastic process...

Page 47: From hits to niches? ...or how popular artists can bias music recommendation and discovery

artist similarity vs. artist popularity

• navigation in the Long Tail

Markov transition matrix

Page 48: From hits to niches? ...or how popular artists can bias music recommendation and discovery

artist similarity vs. artist popularity

• navigation in the Long Tail

Last.fm Markov transition matrix
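A transition matrix of this kind could be estimated by counting, for every similarity link, which part of the curve the source and target artists fall in, then normalising each row. This is a sketch under that assumption; the toy network, the segment labels and the function name are hypothetical.

```python
SEGMENTS = ("head", "mid", "tail")

def transition_matrix(similar, segment_of):
    """Row-normalised 3x3 matrix: P(segment of similar artist | segment of artist)."""
    counts = {s: {t: 0 for t in SEGMENTS} for s in SEGMENTS}
    for artist, targets in similar.items():
        for target in targets:
            counts[segment_of[artist]][segment_of[target]] += 1
    matrix = []
    for s in SEGMENTS:
        row_total = sum(counts[s].values())
        matrix.append([counts[s][t] / row_total if row_total else 0.0
                       for t in SEGMENTS])
    return matrix

# Hypothetical similarity links and segment labels.
toy_similar = {
    "h": ["m1", "h2"],
    "h2": ["h"],
    "m1": ["t1", "m2", "h"],
    "m2": ["t1"],
    "t1": ["m1"],
}
toy_segment_of = {"h": "head", "h2": "head", "m1": "mid", "m2": "mid", "t1": "tail"}

for name, row in zip(SEGMENTS, transition_matrix(toy_similar, toy_segment_of)):
    print(name, row)
```

Each row sums to 1 and reads as "given an artist in this part, where do its similar artists live".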

Page 49: From hits to niches? ...or how popular artists can bias music recommendation and discovery

artist similarity vs. artist popularity

• navigation in the Long Tail

From Head to Tail, with P(T|H) > 0.4

Number of clicks needed:

CF: 5

CB: 2

EXP: 2

(Figure: path from HEAD to TAIL, #clicks?)
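One plausible reading of "number of clicks needed" is the smallest k for which the k-step transition probability from Head to Tail exceeds the threshold, i.e. the Head-to-Tail entry of the k-th power of the transition matrix. The sketch below follows that reading. The matrix is a made-up example, not one of the matrices from the talk, and the function names are hypothetical.

```python
HEAD, MID, TAIL = 0, 1, 2

def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    size = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]

def clicks_from_head_to_tail(matrix, threshold=0.4, max_clicks=50):
    """Smallest k with (matrix^k)[HEAD][TAIL] > threshold, or None if never reached."""
    power = matrix
    for clicks in range(1, max_clicks + 1):
        if power[HEAD][TAIL] > threshold:
            return clicks
        power = mat_mul(power, matrix)
    return None

# Hypothetical transition matrix (rows: from Head, Mid, Tail; columns: to Head, Mid, Tail).
toy_matrix = [
    [0.5, 0.5, 0.0],
    [0.1, 0.6, 0.3],
    [0.0, 0.3, 0.7],
]

print(clicks_from_head_to_tail(toy_matrix))
```

When the Head row has no direct mass on the Tail, as in this toy matrix, a listener has to pass through the Mid part first, which is why more clicks are needed.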

Page 50: From hits to niches? ...or how popular artists can bias music recommendation and discovery

artist popularity

Summary

|-------------------------|---------|-----|-----------|
|                         | Last.fm | CB  | Exp (AMG) |
|-------------------------|---------|-----|-----------|
| Indegree / popularity   | yes     | no  | yes       |
| Similarity / popularity | yes     | no  | no        |
|-------------------------|---------|-----|-----------|

Page 51: From hits to niches? ...or how popular artists can bias music recommendation and discovery

conclusions

|-------------------------|---------|-----|-----------|
|                         | Last.fm | CB  | Exp (AMG) |
|-------------------------|---------|-----|-----------|
| Small World             | yes     | yes | yes       |
| Scale-free              | yes     | no  | no        |
| Ass. mixing             | yes     | no  | no        |
|-------------------------|---------|-----|-----------|
| Indegree / popularity   | yes     | no  | yes       |
| Similarity / popularity | yes     | no  | no        |
|-------------------------|---------|-----|-----------|
| POPULARITY BIAS         | YES     | NO  | FAIRLY    |
|-------------------------|---------|-----|-----------|

Page 52: From hits to niches? ...or how popular artists can bias music recommendation and discovery

future work

Page 55: From hits to niches? ...or how popular artists can bias music recommendation and discovery

future work

• 3D

Long Tail item popularity

recommendation network (item-to-item)

user profile

relevant items

relevant and novel?

...paper @ ACM RecSys'08

Page 56: From hits to niches? ...or how popular artists can bias music recommendation and discovery

THANKS!!!