FlexRecs: Expressing and Combining Flexible Recommendations
IDS Lab. Seminar
Winter 2010
Minsuk Kahng
Jan. 8th, 2010
G. Koutrika, B. Bercovitz, H. Garcia-Molina
SIGMOD 2009
Center for E-Business Technology, Seoul National University, Seoul, Korea
Stanford InfoLab
Intelligent Database Systems Lab.
강 민 석
Copyright 2010 by CEBT
Abstract
FlexRecs
Recommendation systems have a number of limitations.
The algorithm is typically hard-wired into the system.
The authors propose a recommendation framework, FlexRecs, that
decouples the definition of a recommendation process from its execution
lets designers declaratively define a recommendation process as a high-level workflow
builds workflows from traditional relational operators and new recommendation operators
Prototype flexible recommendation engine
Realizes the proposed framework, FlexRecs
Contents
Introduction
Related Work
Recommendation Framework
System Architecture
Experiments
Conclusions
Introduction
Recommendation System
Provides advice on movies, products, travel, and many other topics
Has become very popular in deployed systems
e.g. Google News, Amazon, MovieLens
Many recommendation approaches have been proposed.
Motivation
CourseRank
Stanford InfoLab has developed CourseRank
A social tool that helps students make informed choices about classes
Offers official course information, course bulletin boards, grade distributions, course reviews, schedules, four-year plans, recommendations, etc.
Motivation
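Challenges
The need for flexibility and expressivity
The initial version offered no choices: a recommendation result was just a list of n items
There was no way to see more recommendations related to one of the n recommended items.
Users could not restrict the type of result, get recommendations based on friends' histories, or get recommendations from students with similar grades.
The need for experimentation and higher productivity
Combining several recommendation methods: whether method X or method Y works better depends on the setting
The two methods need to be combined with appropriate weights
Implementing several recommendation methods is time-consuming and counter-productive
The resulting code is not easily expandable or manageable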
Introduction
Limitations of Recommendation Systems
Hard-Wired
NOT expressed declaratively – the algorithm is typically embedded in the system code
This makes it hard to modify the algorithms or experiment with different approaches
No Flexibility
Recommendation results are fixed. End users are given few choices
Users may expect diverse recommendations in different contexts.
Users are unable to request recommendations under user-defined constraints
Limited World Model
Recommenders generally deal with only two types of entities: users & items
Providing recommendations over richer data representations is not straightforward.
Introduction
FlexRecs, Proposed Framework
Flexible recommendations
that can be easily defined, customized, and processed over structured data
Decouples the definition of a recommendation process from its execution
The recommendation process is defined declaratively as a high-level workflow
Any recommendation can be generated with the same engine
A recommendation is expressed as a high-level workflow that
contains traditional relational operators
plus new recommendation operators
can handle data in relational form
Designers can create multiple, customizable workflows
Prototype flexible recommendation engine that realizes the framework
Executes a workflow over a conventional DBMS
Related Work
Limitations of Recommendation Systems
Algorithms are hard-wired in the system code.
Designing, implementing, and experimenting with new methods can be time-consuming.
Systems generate only a predefined and fixed set of recommendations
Several attempts address problems of existing methods (content-based, CF), such as over-reliance on past history and the cold-start problem in CF
But different recommendations may be required under different circumstances by different users
Limited World
In many real applications, much richer data resides in the DB.
Different types of entities may co-exist in a single DB.
Current systems are not very expressive
Some extensions
Incorporate multi-criteria ratings into recommendations
The RQL language
Allows users to formulate recommendations in a flexible manner
But it is not very expressive because queries are formulated on a pre-specified multidimensional cube of ratings
Contents
Introduction
Related Work
Recommendation Framework
Data Model
Operators
Recommendation Workflow
System Architecture
Experiments
Conclusions
Data Model
Data Model
Data resides in structured form, and in particular in relational form.
The focus is on databases that follow the relational model
Base Relation
A database comprises a set of relations.
A relation has a set of attributes
An attribute instantiated to a single value is called a base attribute.
A relation with only base attributes is called a base relation.
Data Model
Extended Relation
The authors introduce the concept of an extended relation.
In an extended relation, an attribute value can itself be a relation.
Data Model
Extended Relation
Examples
can be thought of as “views”
Generalized?
The model and language could be generalized to arbitrary nesting, but practical scenarios do not need that generality
Materialized or not?
This issue is orthogonal to the definition; an extended relation may not be stored in the DB
Operators
Base Operators
can operate on base and extended relations
Operators
Select: selects the tuples of a relation for which the condition holds
the condition refers only to base attributes
the result is a base or an extended relation, depending on the type of the input
Project: projects the relation onto a smaller set of its attributes
A is a list of base, embedded, or extended attributes
Join: combines tuples of two relations that meet some condition
the condition refers only to base attributes
About nested relational algebra
Such generality is not necessary for practical recommendations.
Operators
The Extend Operator
Information that conceptually refers to one entity is often found in several relations.
Extend creates extended attributes in the tuples of a relation
Example
The ratings made by each student, treated as a single “unit of information” per student
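The extend idea can be sketched in a few lines of Python. This is only an illustration under assumptions: the function name `extend`, the dictionary-based tuples, and the sample students and ratings are hypothetical and do not come from the paper.

```python
from collections import defaultdict

def extend(base_rows, key, related_rows, related_key, new_attr):
    """Attach to each base tuple the related tuples that share its key value."""
    groups = defaultdict(list)
    for row in related_rows:
        # Drop the join key from the nested tuple; it is implied by the parent.
        nested = {k: v for k, v in row.items() if k != related_key}
        groups[row[related_key]].append(nested)
    # Each output tuple gains one extended attribute whose value is a relation.
    return [{**row, new_attr: groups.get(row[key], [])} for row in base_rows]

# Hypothetical sample data.
students = [{"StudID": 1, "Name": "Alice"}, {"StudID": 2, "Name": "Bob"}]
ratings = [
    {"StudID": 1, "CourseID": "C22", "Score": 5},
    {"StudID": 1, "CourseID": "C23", "Score": 4},
    {"StudID": 2, "CourseID": "C22", "Score": 3},
]
extended = extend(students, "StudID", ratings, "StudID", "Ratings")
```

Each student tuple now carries its ratings as one nested "unit of information", which later comparison functions can treat as a single extended value.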
The Recommend Operator
Comparison function
Recommendations are based on comparisons
e.g. Courses are rated by comparing their topics to a student’s interests.
e.g. User-user similarity in CF
The framework has a library of comparison functions for recommendation tasks
Comparison Function
P can basically contain attributes.
The Recommend Operator
Comparison function
Examples
Comparisons of string values – Jaccard similarity
Comparisons of numerical values – Simple Distance
Using conditional probabilities
Comparisons of extended values
Comparisons of single values to extended values
The Recommend Operator
Aggregation Comparison Function
Comparison functions compare one tuple to another tuple.
It is often desirable to compare one tuple to a set of tuples
An aggregation function combines all partial values into a final one (e.g. max, avg)
Example
Weighted average of the partial comparison values
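As a small illustration of the weighted-average example, the sketch below combines partial values that each come with a weight. The function name and the (value, weight) pair representation are assumptions made for this sketch, not the paper's definition.

```python
def weighted_average(partials):
    """Combine (value, weight) pairs into one score; 0.0 if all weights are zero."""
    total_weight = sum(weight for _, weight in partials)
    if total_weight == 0:
        return 0.0
    return sum(value * weight for value, weight in partials) / total_weight

# Hypothetical partial comparison values with their weights.
partials = [(5.0, 0.5), (3.0, 0.25), (1.0, 0.25)]
score = weighted_average(partials)        # (2.5 + 0.75 + 0.25) / 1.0
best = max(value for value, _ in partials)  # the simpler "max" aggregation
```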
The Recommend Operator
Recommend Operator
The score of each tuple is produced by comparing it to other tuples
Each tuple ri of Ri is compared with every tuple of Rj using the comparison function cf,
and the partial results are aggregated with the aggregation function a into the value v
Each tuple ri of Ri, i.e. each recommendation candidate, thus receives a score.
Example
Recommend courses to Alice
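The description above (compare each candidate tuple with every reference tuple using cf, then aggregate with a) might be sketched as follows. The function `recommend`, the toy comparison function on a hypothetical `Units` attribute, and the sample courses are all assumptions for illustration only.

```python
from statistics import mean

def recommend(candidates, references, cf, agg):
    """Score each candidate against every reference tuple and rank by score."""
    scored = []
    for cand in candidates:
        partials = [cf(cand, ref) for ref in references]
        scored.append({**cand, "Score": agg(partials) if partials else 0.0})
    return sorted(scored, key=lambda row: row["Score"], reverse=True)

def unit_closeness(a, b):
    # Assumed toy comparison function: closer unit counts are more similar.
    return 1.0 / (1.0 + abs(a["Units"] - b["Units"]))

# Hypothetical data: candidate courses and the courses Alice has taken.
courses = [{"CourseID": "C30", "Units": 5}, {"CourseID": "C41", "Units": 3}]
taken_by_alice = [{"CourseID": "C22", "Units": 3}, {"CourseID": "C23", "Units": 4}]

ranked = recommend(courses, taken_by_alice, unit_closeness, mean)
ranked_max = recommend(courses, taken_by_alice, unit_closeness, max)
```

Passing `cf` and `agg` as parameters mirrors the idea that one engine can serve many recommendation strategies; swapping `mean` for `max` changes the aggregation without touching the operator.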
The Blend Operator
Blend
Combines recommendations generated by two different processing paths
e.g. recommendations based on courses friends have taken + recommendations based on courses required for graduation
Blending methods
Occurrence-based blending
Normalized blending
Weighted average blending
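The slides only name the blending methods, so the sketch below covers weighted average blending alone and rests on assumptions: scores are kept in item-to-score dictionaries, an item missing from one path counts as 0 there, and the function name `blend_weighted` is hypothetical. The 0.7 : 1 weights echo the ratio used in the blending example of the workflow slides.

```python
def blend_weighted(recs_a, recs_b, weight_a, weight_b):
    """Blend two {item: score} maps by a weighted average of their scores."""
    blended = {}
    for item in recs_a.keys() | recs_b.keys():
        # Assumption: an item missing from one path contributes a score of 0 there.
        a = recs_a.get(item, 0.0)
        b = recs_b.get(item, 0.0)
        blended[item] = (weight_a * a + weight_b * b) / (weight_a + weight_b)
    # Highest score first; ties broken by item id for a stable order.
    return sorted(blended.items(), key=lambda kv: (-kv[1], kv[0]))

# Hypothetical scores from two processing paths.
content_based = {"C23": 0.5, "C25": 0.2}
collaborative = {"C23": 0.8, "C30": 0.6}
ranked = blend_weighted(content_based, collaborative, 0.7, 1.0)
```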
Recommendation Workflows
Recommend and Blend Operators
capture the essence of most recommendation approaches
can be composed and combined with select, project, and join to describe a recommendation
Recommendation Workflow
Examples
The following slides walk through several examples
A hypothetical student (user), Alice, issues the requests
A few obvious steps are left out, e.g. excluding items Alice has already consumed
Recommendation Workflows
Recommendation Workflow Examples
Example 1: Related Courses
Alice is currently viewing the course “Programming: Part One” (C22)
Recommend courses offered in 2008 that are similar to this course
The comparison function is Jaccard similarity on the course title (Title)
CourseID  Title                             Score
C23       Programming: Part Two             2/4 = 0.5
C25       Advanced Programming Methodology  1/5 = 0.2
C30       Computer Graphics                 0/5 = 0
…         …                                 …
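The scores in the table are consistent with a word-level Jaccard similarity between each title and “Programming: Part One”. The sketch below reproduces them; the tokenization (lower-casing and splitting on non-alphanumeric characters) is an assumption, since the slides do not say how titles are split.

```python
import re

def title_tokens(title):
    """Lower-case a title and split it into a set of word tokens."""
    return set(re.findall(r"[a-z0-9]+", title.lower()))

def jaccard(a, b):
    """Jaccard similarity of two token sets: |A ∩ B| / |A ∪ B|."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

current = "Programming: Part One"  # C22, the course Alice is viewing
offered_2008 = {
    "C23": "Programming: Part Two",
    "C25": "Advanced Programming Methodology",
    "C30": "Computer Graphics",
}
scores = {
    course_id: jaccard(title_tokens(current), title_tokens(title))
    for course_id, title in offered_2008.items()
}
```

For C23, the titles share two words (programming, part) out of four distinct words, giving 2/4 = 0.5, which matches the table.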
Recommendation Workflows
Recommendation Workflow Examples
Example 2: Content-based Recommendation
Alice (StudID=1234) has already taken courses related to literature and writing
She wants recommendations for this year (2008) that are similar to the courses she has taken so far
Recommendation Workflows
Recommendation Workflow Examples
Example 3: Nearest-neighbor collaborative filtering
Find students with tastes similar to the student with SuID=444 and recommend based on their histories
Each course's score gives more weight to the ratings of students with more similar tastes: a course is rated by taking the weighted average of the ratings provided by these students.
Comparisons of single values to extended values
Recommendation Workflows
Recommendation Workflow Examples
Example 5: Blending
Blend the content-based result from Ex. 2 and the CF result from Ex. 3 with weights in the ratio 0.7 : 1
Recommendation Workflows
Recommendation Workflow Examples
Ex. Many recommend and blend operators
Recommend by considering both students who took similar courses and students with a similar GPA
Ex. Classification
Decide whether Alice is an honor student by measuring how similar she is to honor students
Ex. Recommending a major
Items other than courses (e.g. majors) can also be recommended
Ex. Item-to-item movie recommendation
Item-based CF
Contents
Introduction
Related Work
Recommendation Framework
System Architecture
Architecture
Recommendation Plan Generator
Experiments
Conclusions
System Architecture
Architecture
Workflow Manager
allows the designer to define recommendation workflows
hides the details
Workflow Parser
constructs an expression tree
Recommendation Plan Generator
generates a recommendation execution plan
a plan is a sequence of SQL statements and function calls
Recommendation Generator
executes a plan and returns the recommendations
sends SQL to the DB engine
Recommendation Plan Generator
Recommendation Plan Generation
Builds a recommendation plan by traversing an expression tree
Query 1 – similar users (creates a temporary in-memory table)
Query 2 & 3 – one recommendation (Example 3)
Query 4 – blend
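To make the idea of a plan as a sequence of SQL statements concrete, here is a minimal sketch of the first steps (similar users, then a weighted-average score per course). It is not the paper's generated SQL: it uses Python's built-in sqlite3 instead of the Java and MySQL prototype, and the table layout, the similarity formula, and the sample ratings are all assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Ratings (StudID INTEGER, CourseID TEXT, Score REAL);
INSERT INTO Ratings VALUES
  (444, 'C22', 5), (444, 'C23', 4),
  (1, 'C22', 5), (1, 'C23', 4), (1, 'C30', 2), (1, 'C41', 4),
  (2, 'C22', 1), (2, 'C23', 2), (2, 'C30', 5);
-- Temporary table that will hold the similar users found in step 1.
CREATE TEMP TABLE SimilarStudents (StudID INTEGER, Sim REAL);
""")

target = 444
plan = [
    # Step 1: similarity of every other student to the target, computed
    # over co-rated courses (assumed formula: 1 / (1 + mean abs difference)).
    ("""INSERT INTO SimilarStudents
        SELECT o.StudID, 1.0 / (1.0 + AVG(ABS(t.Score - o.Score)))
        FROM Ratings t JOIN Ratings o ON t.CourseID = o.CourseID
        WHERE t.StudID = ? AND o.StudID <> ?
        GROUP BY o.StudID""", (target, target)),
    # Step 2: similarity-weighted average rating for courses the target
    # has not rated yet.
    ("""SELECT r.CourseID, SUM(s.Sim * r.Score) / SUM(s.Sim) AS Predicted
        FROM Ratings r JOIN SimilarStudents s ON r.StudID = s.StudID
        WHERE r.CourseID NOT IN (SELECT CourseID FROM Ratings WHERE StudID = ?)
        GROUP BY r.CourseID
        ORDER BY Predicted DESC""", (target,)),
]

results = []
for sql, params in plan:
    # Each plan step is sent to the database engine in order.
    results = conn.execute(sql, params).fetchall()
```

After the loop, `results` holds the rows of the last step, i.e. the ranked courses. A blend step would follow the same pattern as one more statement over two such result tables.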
Experiments
Objective
Examine the feasibility and performance of flexible recommendations
Study different workflows with different characteristics
Real data is used
The prototype is written in Java on top of MySQL
Workflows
Collaborative Filtering
Major recommendation
Related courses
Friends-of-friends
more complex than the content-based and CF ones
Experiments
Workflow: Collaborative Filtering
Average time per user when, for every user, similarity to all other users is computed to produce recommendations
Gen time is the time to generate the SQL; it is small compared to the execution time
Time grows linearly with the number of users
Results are similar regardless of which comparison function is used
Summary
It is easy to create multiple workflows and execute them transparently
over the same flexible recommendation system, which combines extensibility with reasonable performance
Conclusions
Contributions
decouple the definition of a recommendation process from its execution
introduce an extend operator that generates a virtual nested relation
define recommend & blend operators that capture the essence of recommendation workflows
provide several examples that show how common recommendations can be expressed
describe a prototype flexible recommendation engine that realizes the proposed framework
New operators can be compiled into standard SQL for execution.
present experimental results that show the potential of FlexRecs
Future Work
the framework makes it possible to study the optimization of multiple recommendation workflows
the authors are currently working on scaling over very large inputs
automatically balance complexity and effectiveness and identify the best option
It would be interesting to define flexible recommendations for XML or ontologies.
design appropriate user interfaces that enable users to express flexible recommendations
Discussion
Flexible
make “flexible”
Synergy
Decouple the Definition of Recommendation
Generalized through the Recommend operator
use Nested Relation
This is how the nested relational model is used
In practice, GROUP BY would do the job
SQL
use conventional DBMS
Recommendations can already be implemented with SQL, so is a performance evaluation necessary?