Context-Aware Recommendation
Intelligent Database Systems Lab
School of Computer Science & Engineering
Seoul National University, Seoul, Korea
Dongjoo Lee
Center for E-Business Technology
Seoul National University
Seoul, Korea
Copyright 2008 by CEBT
Introduction
Traditional recommendation methods
Content-based recommendation
– What are the features that can describe an item?
Collaborative filtering
– Item based CF
– User based CF
– Hybrid CF
Issues in using context information in recommendation
1) What is context information?
2) How to use context information?
3) Is it really useful to use context information in recommendation?
[Figure: user-item rating matrix with scores between 0 and 1]
Recommendation
MusicRecommender
Collaborative Filtering
(User1 hasSimilarTasteWithf User2) (User2 likesf Song2) (User1 notListened Song2) => Recommendf Song2 to User1
Assumption. If User1 and User2 gave similar scores, User1 and User2 have similar taste.
Content based Recommendation
(User1 likesf Song1) (Song2 isSimilarWithf Song1) (User1 notListened Song2) => Recommendf Song2 to User1
Assumption. If music1 and music2 have similar feature values, music1 and music2 are similar.
Assumption. If there is no listen log for a song, the user has not listened to that song before.
Assumption. If a user listens to a song frequently, the user likes it.
Vocabularies: Recommendf, hasSimilarTasteWithf, isSimilarWithf, notListened, likesf, Listen (when, where, …), HasFeatureValue
Input: User1, User DB, Song DB, User Listen Log
Output: Sorted Song List
[Figure: users U1 and U2 and songs m1 to m7; interpreters turn the logs and rules into the vocabulary predicates used by Recommend]
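The two rules on this slide can be sketched in a crisp (non-fuzzy) form. This is only an illustration: the function name `recommend_by_rules` and the dictionary inputs are assumptions, and the fuzzy degrees implied by the "f" predicates are ignored.

```python
def recommend_by_rules(user, likes, similar_users, similar_songs, listened):
    """Return songs suggested by the two rules, minus songs already listened to.

    All arguments except `user` are hypothetical dicts mapping a key to a set.
    """
    candidates = set()
    # Collaborative filtering rule: songs liked by users with similar taste.
    for other in similar_users.get(user, set()):
        candidates |= likes.get(other, set())
    # Content-based rule: songs similar to songs the user already likes.
    for song in likes.get(user, set()):
        candidates |= similar_songs.get(song, set())
    # notListened condition shared by both rules.
    return candidates - listened.get(user, set())


likes = {"User1": {"Song1"}, "User2": {"Song2"}}
similar_users = {"User1": {"User2"}}
similar_songs = {"Song1": {"Song3"}}
listened = {"User1": {"Song1"}}
suggested = recommend_by_rules("User1", likes, similar_users, similar_songs, listened)
```

A fuzzy version would carry a degree for each predicate and combine the degrees instead of using set membership.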
Recommendation (cont’d)
Recommendation
$score(m) = P_{like}(m \mid u)$
Context-Aware Recommendation
$score(m) = P_{like}(m \mid c, u)$
1. Context-Aware Collaborative Filtering
2. Context-Aware Content-based Recommendation
1) Context abstraction
2) Context grouping
3) Item abstraction
4) Item grouping
5) User profiling
1. Context-Aware Collaborative Filtering
[Figure: a separate user-item rating matrix for each context, stacked along a third "context" axis]
There are too many possible contexts.
Separate each user's preferences by context and perform collaborative filtering per context. Find the stored context that corresponds to the active context (the current situation in which a recommendation is needed) and recommend accordingly.
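A minimal sketch of per-context collaborative filtering follows. It assumes exact context labels and a user-based scheme with cosine similarity; the names `recommend_in_context` and `cosine` and the rating tuples are illustrative, not taken from the slides.

```python
from collections import defaultdict
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two sparse rating dicts."""
    common = set(a) & set(b)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if not common or norm_a == 0 or norm_b == 0:
        return 0.0
    return sum(a[i] * b[i] for i in common) / (norm_a * norm_b)


def recommend_in_context(ratings, user, active_context, top_n=3):
    """ratings: iterable of (user, item, context, score) tuples."""
    # Keep only the ratings observed in the active context.
    by_user = defaultdict(dict)
    for u, item, ctx, score in ratings:
        if ctx == active_context:
            by_user[u][item] = score
    target = by_user.get(user, {})
    scores = defaultdict(float)
    for other, prefs in by_user.items():
        if other == user:
            continue
        sim = cosine(target, prefs)
        if sim <= 0:
            continue
        for item, score in prefs.items():
            if item not in target:
                scores[item] += sim * score
    return sorted(scores, key=scores.get, reverse=True)[:top_n]


ratings = [
    ("u1", "a", "morning", 0.9),
    ("u2", "a", "morning", 0.8),
    ("u2", "b", "morning", 0.7),
    ("u1", "c", "night", 0.2),
    ("u2", "c", "night", 0.9),
]
```

Because each context gets its own matrix, the data per context becomes sparse quickly, which is the problem the slide points out.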
2. Context-Aware Content-based Recommendation
USER_ID ARTIST TITLE TIME LOCATION
2monkeyflower Ada Our Love Never Dies 2008 04 4 05:05:00
2monkeyflower Ada Our Love Never Dies 2008 05 23 07:59:00
2monkeyflower Adam Green Hard to Be a Girl 2007 07 3 00:00:00
2monkeyflower Adam Green Salty Candy 2007 07 5 00:00:00
2monkeyflower Adriana Calcanhoto Segundos 2008 10 20 15:37:00
2monkeyflower Aerospace December Slow 2007 10 29 00:00:00
2monkeyflower Aesop Rock Basic Cable 2007 07 5 00:00:00
2monkeyflower Afterlife (feat. Neve) Elijah 2008 11 10 07:13:00
2monkeyflower Afterlife (feat. Neve) Elijah 2008 11 14 08:20:00
2monkeyflower Afterlife (feat. Neve) Elijah 2008 11 7 05:51:00
2monkeyflower Aim Linctus 2008 07 24 05:47:00
2monkeyflower Air Alone in Kyoto 2007 04 12 00:00:00
2monkeyflower Air Cherry Blossom Girl 2007 07 12 00:00:00
2monkeyflower Air Dead Bodies 2007 07 18 00:00:00
2monkeyflower Air Traffic Shooting Star (Demo) 2008 05 26 01:57:00
2monkeyflower Aitch Beautiful Girl 2008 07 17 06:18:00
2monkeyflower Akron/Family Future Myth 2008 01 31 08:37:00
2monkeyflower Akron/Family Gone Beyond 2007 05 28 00:00:00
2monkeyflower Akron/Family Love And Space 2007 06 20 00:00:00
…
Last.fm user listen logs
ARTIST TITLE
Ada Our Love Never Dies
Adam Green Salty Candy
Adriana Calcanhoto Segundos
Aerospace December Slow
Afterlife (feat. Neve) Elijah
Aim Linctus
Air Alone in Kyoto
Airbase Genie
Aitch Beautiful Girl
Akron/Family Future Myth
Akron/Family Love And Space
David Bowie Come And Buy My Toys
Diana Ross (I Love) Being in Love With You
Duran Duran All She Wants Is
Endre Kallocain (Robert Nickson Remix)
Frank Sinatra The Way You Look Tonight
Jose Amnesia Hentai (dj tom-x mix)
Kyau vs. Albert Made Of Sun [KvA Volume Three Mix]
Liberty X Never Give Up
Midway Inca
Norah Jones Humble Me
Nu NRG Le Mirage
Queen Don't Try So Hard
R.E.M. Walk Unafraid
Rachel Stevens Sweet Dreams My L.A. Ex (Bimbo Jones club mix)
Robbie Williams Mack the Knife
Salt Tank Sargasso Sea
Solar Factor Urban Shakedown (original mix)
Sunny Lax M.I.R.A.
The Cure Trap
VDM Shamu
Wet Wet Wet Morning (96 Remix)
Whitney Houston Heartbreak Hotel
…
Last.fm music
[Figure: Context Group 1 and Context Group 2 linked to Music Group 1 and Music Group 2 with weights 0.9 and 0.3]
These songs can be recommended.
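2. Context-Aware Content-based Recommendation
Model based recommendation
[Figure: the active context c maps to context groups cg1, cg2, …, cgm; for the user, context groups map to item groups mg1, mg2, …; item groups map to items]
$$
\begin{aligned}
score(m) &= P_{like}(m \mid c, u) \\
&= \sum_{mg \in MG} P(m \mid mg)\, P(mg \mid c, u) \\
&= \sum_{mg \in MG} \sum_{cg \in CG} P(m \mid mg)\, P(mg \mid cg, u)\, P(cg \mid c)
\end{aligned}
$$
c: Active context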
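The model-based score is a double sum over item groups and context groups, which can be sketched as below. The nested-dict layout and the function name `score` are assumptions made for illustration; the probabilities in the example are invented.

```python
def score(m, c, u, p_m_given_mg, p_mg_given_cg_u, p_cg_given_c):
    """score(m) = sum over mg, cg of P(m|mg) * P(mg|cg,u) * P(cg|c).

    p_m_given_mg[m]     -> {mg: P(m|mg)}
    p_mg_given_cg_u[u]  -> {(mg, cg): P(mg|cg,u)}   (the user profile)
    p_cg_given_c[c]     -> {cg: P(cg|c)}
    """
    total = 0.0
    for mg, p_m in p_m_given_mg[m].items():
        for cg, p_cg in p_cg_given_c[c].items():
            total += p_m * p_mg_given_cg_u[u].get((mg, cg), 0.0) * p_cg
    return total


# Invented numbers: one song in one item group, an active context split
# evenly between two context groups.
p_m_given_mg = {"song": {"mg1": 1.0}}
p_cg_given_c = {"c": {"cg1": 0.5, "cg2": 0.5}}
p_mg_given_cg_u = {"u": {("mg1", "cg1"): 0.9, ("mg1", "cg2"): 0.3}}
example_score = score("song", "c", "u", p_m_given_mg, p_mg_given_cg_u, p_cg_given_c)
```

Only the middle factor depends on the user, so the item grouping and context grouping can be computed once and shared across users.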
Abstraction in Music Domain
Context Abstraction
Semantic Annotation = Abstraction with domain concepts
Context can be obtained from users’ listen logs
Learn user’s preference from listen logs
[Figure: music domain schema with entities User, Song, Tag, Album, Artist, and Context Concept; relationships listen (time), tagged (count), songBy, trackOf, and belongs; attributes name, id, gender, age, country, Tag_count, Track_count; context labels location and occasion]
Users listen logs
User  Time  Location  Music
A     …     …         M1
…
1) Context Abstraction
[Figure: Sensor -> sensed data -> Filter -> filtered data -> fuzzy membership function -> Context Concepts (for example, Cool) -> context]
1) Context Abstraction – Fuzzy Join
Context Data
Temp.
39
28
17
7
-1
-20
Abstract Context Concepts
Concept
Hot
Cool
Cold
…
Fuzzy Join Functions
[Figure: membership curves for Hot, Cool, and Cold; x-axis Temperature, y-axis Fuzziness]
Product of two relations
Temp.  Concept  Fuzziness
39     Hot      0.98
28     Hot      0.84
17     Hot      0.20
7      Hot      0
-1     Hot      0
-20    Hot      0
39     Cool     0
28     Cool     0.1
17     Cool     0.87
7      Cool     0.5
-1     Cool     0.05
-20    Cool     0
39     Cold     0
28     Cold     0
17     Cold     0
7      Cold     0.5
-1     Cold     0.87
-20    Cold     0.99
Fuzzy Join Result
Temp.  Concept  Fuzziness
39     Hot      0.98
28     Hot      0.84
17     Hot      0.20
17     Cool     0.87
7      Cool     0.5
7      Cold     0.5
-1     Cold     0.87
-20    Cold     0.99
α-cut may improve query performance
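The fuzzy join with an α-cut can be sketched using the membership degrees listed on this slide. The slide does not state the cut value; α = 0.2 is an assumption chosen because it reproduces the eight-row result, and `fuzzy_join` is an illustrative name.

```python
TEMPS = [39, 28, 17, 7, -1, -20]

# Membership degrees copied from the product relation on the slide,
# listed in the same order as TEMPS.
MEMBERSHIP = {
    "Hot":  [0.98, 0.84, 0.20, 0.0, 0.0, 0.0],
    "Cool": [0.0, 0.1, 0.87, 0.5, 0.05, 0.0],
    "Cold": [0.0, 0.0, 0.0, 0.5, 0.87, 0.99],
}


def fuzzy_join(temps, membership, alpha):
    """Pair every temperature with every concept, keeping degrees >= alpha."""
    rows = []
    for concept, degrees in membership.items():
        for temp, degree in zip(temps, degrees):
            if degree >= alpha:
                rows.append((temp, concept, degree))
    return rows


# alpha=0.2 is assumed, not given in the slides.
result = fuzzy_join(TEMPS, MEMBERSHIP, alpha=0.2)
```

Dropping low-degree pairs early shrinks the joined relation from 18 rows to 8 here, which is the kind of saving the α-cut remark refers to.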
1) Context Abstraction – Fuzzy Equi-Join
Normal Equi-Join
SELECT T1.*, T2.*
FROM table1 T1 JOIN table2 T2 ON T1.a = T2.b
Fuzzy Equi-Join
SELECT T1.*, T2.*, FuzzyValue
FROM table1 T1 JOIN table2 T2 ON T1.a ≈ T2.b
WHERE FuzzyValue > THETA
The most important part is the fuzzy function (≈) that compares two values and returns a fuzzy membership degree
Performance Improvement
– Sort-Merge Join using partial order of fuzzy similarity
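A nested-loop version of the fuzzy equi-join can be sketched as follows. The comparison function is an assumption: it uses `difflib.SequenceMatcher` from the Python standard library as a stand-in for ≈, and it does not implement the sort-merge optimization mentioned above. The table contents are invented.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Stand-in for the fuzzy function (≈): a ratio in [0, 1], case-insensitive."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def fuzzy_equi_join(table1, table2, key1, key2, theta):
    """Join rows whose key values are similar with degree > theta."""
    rows = []
    for t1 in table1:
        for t2 in table2:
            fuzzy_value = similarity(t1[key1], t2[key2])
            if fuzzy_value > theta:
                rows.append({**t1, **t2, "FuzzyValue": fuzzy_value})
    return rows


logs = [{"user": "U2", "a": "Beautiful day"}]
tracks = [
    {"track_id": "M1", "b": "Beautiful Day"},
    {"track_id": "M2", "b": "Rose"},
]
matches = fuzzy_equi_join(logs, tracks, "a", "b", theta=0.8)
```

The nested loop compares every pair, so it is only practical for small tables; that cost is what motivates the sort-merge idea.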
1) Context Abstraction – Periodic Membership Function
Modified Cosine Function
Because temporal values are periodic, a periodic function is appropriate for calculating the membership degree to temporal concepts.
$f(x) = \max(0, \min(1, a \cos(2\pi (x - b) / c) - d))$
dawn: f(x) = max(min(10.0 * cos(2pi * (x - 150) / 1440) - 8.5, 1), 0)
midnight: f(x) = max(min(7.0 * cos(2pi * (x - 60) / 1440) - 5.5, 1), 0)
Spring: f(x) = max(min(4.0 * cos(2pi * (x - 172800) / 525600) - 2.4, 1), 0)
Here x is a time in minutes: a day has 1440 minutes and a year has 525600 minutes.
Temporal concepts
Dawn, Morning, Noon, Afternoon, Evening, Night, Midnight
Monday, Tuesday, Wednesday, Thursday, Friday, Saturday, Sunday
Spring, Summer, Autumn, Winter
New Year’s Day, Valentine’s Day, White Day, Children’s Day, Parents’ Day, Christmas
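The modified cosine function can be sketched directly from the general form f(x) = max(0, min(1, a·cos(2π(x − b)/c) − d)). The parameter values for dawn and Spring are the ones shown on the slide; the function names are illustrative, and reading the offset d as applied outside the cosine is an interpretation of the garbled formula.

```python
import math


def periodic_membership(x, a, b, c, d):
    """Clipped cosine: peak at x = b, period c; a and d set the plateau width."""
    return max(0.0, min(1.0, a * math.cos(2 * math.pi * (x - b) / c) - d))


def dawn(minute_of_day):
    # Slide parameters: a=10.0, b=150 minutes (02:30), c=1440, d=8.5.
    return periodic_membership(minute_of_day, 10.0, 150, 1440, 8.5)


def spring(minute_of_year):
    # Slide parameters: a=4.0, b=172800 minutes (day 120), c=525600, d=2.4.
    return periodic_membership(minute_of_year, 4.0, 172800, 525600, 2.4)
```

Because the cosine repeats every c minutes, a time just before the end of one period and just after the start of the next get similar degrees, which a plain interval test would not give.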
2) Context Grouping
1) Atomic Context Concept
Assume concepts are independent
$P(cg \mid c) = c.w_{cg}$
2) Clustering
K-means
Fuzzy C-means
Hierarchical clustering
Mixture of Gaussians
http://home.dei.polimi.it/matteucc/Clustering/tutorial_html/index.html
$P(cg \mid c) = distance(cg, c)$
3) Group Meaningful Context
$P(cg \mid c) = ?$
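3) Music Abstraction
Use annotations
[Figure: songs (My Fist Your Face, Rose, I've Got to See You Again, Thinking of You, Sleeping Beauty, …) linked to annotations (rock, alternative, seen live, indie, 90s, electro, romance, jazz)]
Sleeping Beauty {(rock, w1), (90s, w2)}
Rose {(rock, w1), (indie, w2)}
Modeling
S = {(t, w) | t ∈ T, w ∈ R and 0 ≤ w ≤ 1}
Weight calculation
Tf-idf
$$
w_{dc_i, s_j} = \frac{count(dc_i \text{ in } s_j)}{\sum_{dc \in DC} count(dc \text{ in } s_j)} \cdot \log \frac{N}{|\{s \mid dc_i \text{ in } s\}|}
$$
BM25
$$
w_{dc_i, s_j} = \frac{tf_{dc_i, s_j}\,(k_1 + 1)}{tf_{dc_i, s_j} + k_1 \left(1 - b + b \cdot \frac{dl}{avgdl}\right)} \cdot \log \frac{N - |\{s \mid dc_i \text{ in } s\}| + 0.5}{|\{s \mid dc_i \text{ in } s\}| + 0.5}
$$
$$
tf_{dc_i, s_j} = \frac{count(dc_i \text{ in } s_j)}{\sum_{dc \in DC} count(dc \text{ in } s_j)}
$$
Here dc_i is an annotation, s_j is a song, and N is the number of songs; dl and avgdl are the length of s_j and the average length, as in standard BM25.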
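The Tf-idf weighting of annotations can be sketched as below, treating each song as a document and each tag as a term. The tag counts in the example are invented and the name `tfidf_weights` is illustrative.

```python
from math import log


def tfidf_weights(song_tags):
    """song_tags: {song: {tag: count}} -> {song: {tag: tf-idf weight}}."""
    n_songs = len(song_tags)
    # Document frequency: in how many songs each tag appears.
    df = {}
    for tags in song_tags.values():
        for tag in tags:
            df[tag] = df.get(tag, 0) + 1
    weights = {}
    for song, tags in song_tags.items():
        total = sum(tags.values())
        weights[song] = {
            tag: (count / total) * log(n_songs / df[tag])
            for tag, count in tags.items()
        }
    return weights


# Invented counts for two songs named on the slide.
song_tags = {
    "Sleeping Beauty": {"rock": 3, "90s": 1},
    "Rose": {"rock": 2, "indie": 2},
}
weights = tfidf_weights(song_tags)
```

A tag that appears on every song, such as rock here, gets weight 0. The raw weights are not guaranteed to lie in [0, 1], so a normalization step would still be needed to fit the model S.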
4) Music Grouping
Similar to context grouping
1) $P(mg \mid m) = m.w_{mg}$
2) $P(mg \mid m) = distance(mg, m)$
3) $P(mg \mid m) = ?$
5) User Profiling
[Figure: the user’s listen log is abstracted into context (example degrees 0.5, 0.3, 0.9, 0.8) and music; the user profile is a matrix of weights with context groups (cg1, cg2, …, cgm) on one axis and item groups (mg1, mg2, …) on the other]
5) User Profiling – Fuzzy Join and Aggregation
User listen logs with context
User     Time             Music
urisj27  2008.02.25 8:30  Beautiful Day
Music with annotations
Music          Tag     Fuzziness
Beautiful Day  Dreamy
Context concepts and fuzzy function
Concept    Fuzzy Function
Morning
Afternoon
…          ...
Fuzzy-equivalent Join (Time)
User     Time             Music          Context  Fuzziness
urisj27  2008.02.25 8:30  Beautiful Day  Morning  1.0
Equivalent Join (Music)
User     Context  Domain  Fuzziness  Fuzziness
urisj27  Morning  Dreamy  1.0
Aggregation (context grouping, item grouping)
User     Context  Domain  p(mg|cg,u)
urisj27  Morning  Dreamy  0.7
$$
L = \{\, l(u, cg, mg, fuzziness) \mid u \in U,\ cg \in CG,\ mg \in MG \,\}
$$
$$
w_{u, cg_i, mg_j} = p(mg_j \mid cg_i, u) = \frac{\sum_{l \in L,\ l.u = u,\ l.cg = cg_i,\ l.mg = mg_j} l.fuzziness}{\sum_{l \in L,\ l.u = u,\ l.cg = cg_i} l.fuzziness}
$$
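The aggregation step, p(mg | cg, u) as a ratio of summed fuzziness values, can be sketched as follows. The log rows are invented (chosen so that the Morning/Dreamy weight comes out near the 0.7 shown on the slide), and `build_profile` is an illustrative name.

```python
from collections import defaultdict


def build_profile(logs):
    """logs: iterable of (user, context_group, music_group, fuzziness).

    Returns {(user, cg, mg): p(mg | cg, user)}.
    """
    numerator = defaultdict(float)    # sum of fuzziness per (u, cg, mg)
    denominator = defaultdict(float)  # sum of fuzziness per (u, cg)
    for user, cg, mg, fuzziness in logs:
        numerator[(user, cg, mg)] += fuzziness
        denominator[(user, cg)] += fuzziness
    return {
        key: value / denominator[key[:2]]
        for key, value in numerator.items()
        if denominator[key[:2]] > 0
    }


# Invented joined log rows for one user.
logs = [
    ("urisj27", "Morning", "Dreamy", 1.0),
    ("urisj27", "Morning", "Dreamy", 0.4),
    ("urisj27", "Morning", "Rock", 0.6),
]
profile = build_profile(logs)
```

For each user and context group the weights sum to 1, so each row of the user profile can be read as a distribution over music groups.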
Contribution
Model-based context-aware recommendation
Does not depend on ambiguous relationships among concepts, users, and items
Not from the name or description
But from the semantic annotations, tags
Abstract context concepts by using fuzzy membership functions
Distinguish context concepts from domain concepts
There is no reason to put them together
Even though they have the same name, we have to consider them as different.
– Domain concepts are only meaningful when they are used in that domain. They may have different meaning when they are used in different domains.
How to Evaluate?
How to evaluate the effect of context?
Divide logs into training set and test set
Give both paths the same information and compare the results of the path that does not use context with the path that uses context
– If the recommended song list contains the song from the test set, it counts as a hit.
– Top k recommendation results.
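The top-k check can be sketched as a hit rate over held-out test cases. Both recommenders are passed in as plain callables; the name `hit_rate_at_k` and the test-case layout are assumptions for illustration.

```python
def hit_rate_at_k(recommend, test_cases, k):
    """Fraction of test cases whose held-out song appears in the top k.

    recommend(user, context) must return a ranked list of songs;
    test_cases is a list of (user, context, held_out_song) tuples.
    """
    if not test_cases:
        return 0.0
    hits = 0
    for user, context, held_out_song in test_cases:
        if held_out_song in recommend(user, context)[:k]:
            hits += 1
    return hits / len(test_cases)
```

Running it once with a recommender that ignores the context argument and once with one that uses it, on the same test cases, gives the two numbers to compare.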
Experiments
Two domains
Music domain
– Last.fm
Movie domain
– IMDb
They have different characteristics
Publication Schedule
Target conference
The 2009 IEEE/WIC/ACM International Conference on Web Intelligence (WI ’09)
– Info: 15-18 September 2009, Milan, Italy
– Due date: April 10, 2009
– Notification: June 3, 2009
– Format: IEEE 2 column format, max 8 pages
Deep Research Topic
1) What are the important features of context and music?
2) What is the optimal model?
Additional Issues
Crawling
Data sampling
Relationship extraction
Approximate string matching
Crawling last.fm
735,000 users
South Korea, North Korea, Japan, United Kingdom, USA
5,855,000 tracks
includes many duplicated tracks
913,720/3,322,000 …… still crawling
69,725,000 recent tracks listened to by users
69,000,000 listened tracks of thousands of users
6,659,000 tracks loved by users
2,311,000 user tags
Data sampling
Test only on roughly the top 100 users of United States nationality who listened to the most music
select * from lfm_user where country = 'United States' and track_count2 > 1000 order by track_count2 desc
Select the songs that these roughly 100 top users listened to most
select * from lfm_rel_user_track_2 where user_id = 'thetasteofink'
Abstract the music using the tags on the songs that these roughly 100 top users listened to most
How to make use of artist and album is on hold for now
Consider how to handle album names and song titles that do not match.
Applying approximate string matching is a separate problem
Approximate String Matching
Not exact link data – Approximate string matching
Tracks are recorded under the song names that users supplied, so the same song can have multiple names.
Last.fm does not enforce strict foreign key constraints.
[Figure: bipartite graph linking users u1 to u4 with songs m1 to m4]
User: U1, U2, U3, U4
Music: M1, M2, M3, M4
User  Music  Time
U1    M1
U1    M3
U2    M2’
U2    M4’
We have to match by using fuzzy match methods
Approximate String Matching
Approximate string search
Levenshtein Distance (Edit Distance)
This calculates the minimum number of insertions, deletions, and substitutions necessary to convert one string into another.
http://www.merriampark.com/ld.htm
Gestalt
SoundEx
Its goal is to group letters that sound alike, then convert the name into a series of numbers that can represent the name
Jaccard Similarity
http://en.wikipedia.org/wiki/Jaccard_Similarity_Coefficient
Cosine Similarity
http://en.wikipedia.org/wiki/Cosine_similarity
Dice Similarity
http://en.wikipedia.org/wiki/Dice%27s_coefficient
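Three of the measures listed above can be sketched in a few lines. Levenshtein distance uses the usual dynamic program; Jaccard and Dice are computed here over character bigrams, which is one common choice and an assumption, since the slides do not say which token unit was used.

```python
def levenshtein(a, b):
    """Minimum number of insertions, deletions, and substitutions from a to b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(
                prev[j] + 1,               # delete ca
                cur[j - 1] + 1,            # insert cb
                prev[j - 1] + (ca != cb),  # substitute (free if equal)
            ))
        prev = cur
    return prev[-1]


def bigrams(s):
    """Set of lowercase character bigrams (assumed token unit)."""
    s = s.lower()
    return {s[i:i + 2] for i in range(len(s) - 1)}


def jaccard(a, b):
    x, y = bigrams(a), bigrams(b)
    return len(x & y) / len(x | y) if x | y else 1.0


def dice(a, b):
    x, y = bigrams(a), bigrams(b)
    return 2 * len(x & y) / (len(x) + len(y)) if x or y else 1.0
```

Edit distance suits short typos in a title, while the set-based measures tolerate reordered or extra words such as remix suffixes.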
Packages
FLAMINGO Package
http://flamingo.ics.uci.edu/releases/2.0/
Additional Research Topics
Approximate string matching for Korean
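Because Hangul syllables decompose into an initial, medial, and final sound (choseong, jungseong, jongseong), matching needs to work on these phoneme units rather than on whole characters.
Vowels with the same pronunciation, such as ‘ㅔ’ and ‘ㅐ’, and final consonants (batchim) such as ‘ㄷ’ and ‘ㅌ’ need to be taken into account.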
What can it be used for?
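Search query suggestion and spelling correction
Data mining on data that includes Korean text
Data on the web contains many typos and spelling errors, so these must be taken into account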
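Phoneme-level processing of Hangul can be sketched with the standard Unicode arithmetic for precomposed syllables (U+AC00 onward, 19 initials × 21 medials × 28 finals). The `SOUND_ALIKE` table is a deliberately tiny, hypothetical mapping that only merges ‘ㅐ’ into ‘ㅔ’; handling of sound-alike final consonants is left out.

```python
CHOSEONG = "ㄱㄲㄴㄷㄸㄹㅁㅂㅃㅅㅆㅇㅈㅉㅊㅋㅌㅍㅎ"
JUNGSEONG = "ㅏㅐㅑㅒㅓㅔㅕㅖㅗㅘㅙㅚㅛㅜㅝㅞㅟㅠㅡㅢㅣ"
# Index 0 (a space) stands for "no final consonant".
JONGSEONG = " ㄱㄲㄳㄴㄵㄶㄷㄹㄺㄻㄼㄽㄾㄿㅀㅁㅂㅄㅅㅆㅇㅈㅊㅋㅌㅍㅎ"


def decompose(syllable):
    """Split one precomposed Hangul syllable into (initial, medial, final)."""
    code = ord(syllable) - 0xAC00
    if not 0 <= code < 11172:
        return (syllable,)  # not a Hangul syllable: return unchanged
    cho, rest = divmod(code, 588)   # 588 = 21 medials * 28 finals
    jung, jong = divmod(rest, 28)
    return (CHOSEONG[cho], JUNGSEONG[jung], JONGSEONG[jong].strip())


# Hypothetical sound-alike table; a real one would need linguistic review.
SOUND_ALIKE = {"ㅐ": "ㅔ"}


def phoneme_key(text):
    """Phoneme string with sound-alike vowels merged, for approximate matching."""
    key = []
    for ch in text:
        key.extend(SOUND_ALIKE.get(j, j) for j in decompose(ch) if j)
    return "".join(key)
```

Comparing `phoneme_key` outputs with an edit distance then counts a wrong vowel as one phoneme edit instead of a whole-character mismatch.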