Mining customer ratings for product recommendation using the support vector machine and the latent class model William K. Cheung, James T. Kwok, Martin H. Law, Kwok-Ching Tsui Intelligent Systems Research Group, BT Laboratories Hong Kong Baptist University

Mining customer ratings for product recommendation using the support vector machine and the latent class model

William K. Cheung, James T. Kwok, Martin H. Law, Kwok-Ching Tsui

Intelligent Systems Research Group, BT Laboratories

Hong Kong Baptist University

What is a Recommender System?

[Figure: a recommender system drawing on the records of other customers (possibly with ratings)]

Product Recommendation in E-commerce

Products

Recommendations

www.amazon.com

Product Recommendation in E-commerce

Products

Recommendations

www.cdnow.com

Overview

[Figure: a content-based recommender system (inputs: personal profile and ratings), realised with the Support Vector Machine (SVM), alongside a collaborative recommender system (inputs: ratings and the records of other customers, possibly with ratings), realised with the Extended Latent Class Model (ELCM)]

Presentation Outline

• Content-based Recommendation
  – Existing Solutions and Their Limitations
  – Our Proposed Solution - the SVM

• Collaborative Recommendation
  – Existing Solutions and Their Limitations
  – Our Proposed Solution - the Extended LCM

• Experimental Evaluation

• Conclusion and Future Work

Content-based Recommendation

• Matching between the personal profile and the features extracted from product descriptions.

• Assumptions:

– Customer personal profiles are available.

– Detailed product descriptions are available so that a set of representative features can be extracted.

– Both the profiles and the product descriptions share the same representation.

[Figure: content-based recommender system driven by the personal profile]

Some Existing Solutions

• Keyword Matching
  – problems of synonymy and polysemy.

• Pattern Classification Approaches
  – $f(y) = \{ f_1(y), f_2(y), \ldots, f_m(y) \}$: the set of features for product $y$
  – $a_x(f(y))$: the classifier output for customer $x$’s interest, obtained via training, such that

    $a_x(f(y)) = 1$ if $x$ is interested in $y$
    $a_x(f(y)) = 0$ if $x$ is NOT interested in $y$

  – Examples of classifiers: Naïve Bayes, k-NN, C4.5 (decision tree)

Feature Selection Problem

• The performance of content-based recommendation depends heavily on the discriminative power of the features selected to be extracted.
  – Too few features => hard to learn useful profiles (shallow analysis)
  – Too many features => hard to estimate the classifier’s parameters with good generalisation performance.

Our Proposed Solution - the use of SVM

• The Support Vector Machine has been shown to achieve good generalisation performance when classifying high-dimensional data sets, and its training can be framed as a quadratic programming problem.

• => One can simply use all extracted features as the input, with no need for feature selection at all.

Pattern Classification...

Which line is the best? (Training and Generalization)

Support Vector Machine (SVM)

• Intuitively, maximize the margin between classes

• Theoretically sound
  – related to minimizing the VC-dimension under the theory of structural risk minimization

[Figure: separating line and its margin]
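To make the margin idea concrete, the following is a minimal sketch that compares two candidate separating lines by their geometric margin. The toy points and both candidate lines are made up for illustration and are not taken from the slides; an SVM would search over all lines rather than compare two.

```python
import math

def geometric_margin(w, b, points, labels):
    """Smallest signed distance from any training point to the line w.x + b = 0.

    A positive result means every point lies on its correct side of the line.
    """
    norm = math.sqrt(sum(wi * wi for wi in w))
    return min(
        label * (sum(wi * xi for wi, xi in zip(w, x)) + b) / norm
        for x, label in zip(points, labels)
    )

# Hypothetical toy data: two classes in the plane, labelled +1 and -1.
points = [(2.0, 2.0), (3.0, 3.0), (0.0, 0.0), (-1.0, 0.0)]
labels = [1, 1, -1, -1]

# Two hypothetical candidate lines; both separate the data perfectly.
line_a = ((1.0, 1.0), -2.0)   # x1 + x2 - 2 = 0
line_b = ((1.0, 0.0), -1.5)   # x1 - 1.5 = 0

margin_a = geometric_margin(line_a[0], line_a[1], points, labels)
margin_b = geometric_margin(line_b[0], line_b[1], points, labels)

# The maximum-margin principle prefers the line with the larger margin.
best = "A" if margin_a > margin_b else "B"
```

Both lines classify the training data correctly, so training error alone cannot choose between them; the margin gives a criterion that also reflects generalisation.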

Solving for the line

• Computationally, this leads to a quadratic programming problem
  – maximize a quadratic objective function subject to some linear constraints
  – no local maxima other than the global one (cf. neural networks)

Support Vectors

• The line depends only on a small number of training examples.

Nonlinear Cases

• Use another coordinate system such that the “curve” becomes a “line”

Kernels

• Only inner products, $\phi(x)^T \phi(y)$, are involved in the calculation

• Under certain conditions, there exists a kernel $K$ such that $K(x, y) = \phi(x)^T \phi(y)$
  – e.g. Polynomial of degree $d$: $K(x, y) = (x^T y + 1)^d$

• replace $x^T y$ by $\phi(x)^T \phi(y)$
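The kernel identity can be checked numerically. The sketch below, for the degree-2 polynomial kernel in two dimensions, compares the kernel value with the inner product of an explicit feature map; the function names and the sample vectors are illustrative choices, not part of the slides.

```python
import math

def poly_kernel(x, y, d=2):
    """Polynomial kernel K(x, y) = (x^T y + 1)^d."""
    return (sum(a * b for a, b in zip(x, y)) + 1.0) ** d

def phi_degree2(x):
    """Explicit feature map for d = 2 and 2-D input (six coordinates)."""
    x1, x2 = x
    r2 = math.sqrt(2.0)
    return (1.0, r2 * x1, r2 * x2, x1 * x1, x2 * x2, r2 * x1 * x2)

x, y = (1.0, 2.0), (3.0, -1.0)

# Computed in the original 2-D space.
via_kernel = poly_kernel(x, y)

# Computed as an inner product in the 6-D feature space.
via_map = sum(a * b for a, b in zip(phi_degree2(x), phi_degree2(y)))
```

Both routes give the same number, but the kernel never builds the higher-dimensional vectors, which is what keeps the nonlinear case tractable.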

Overlapping Cases

• Impossible to perfectly separate the two classes
  – Include an error term

• Instead of maximizing the margin, minimize a weighted combination of the error and 1 / margin

• Again, involves only quadratic programming

Collaborative Recommendation

• Matching the customer’s ratings against the ratings of others (the word-of-mouth approach).

• Assumptions:
  – Ratings from a reasonably large group of customers are available.
  – Each product has been rated by some of the customers.
  – The product ratings overlap to a certain degree.

[Figure: collaborative recommender system taking product ratings and the records of other customers (possibly with ratings)]

Some Existing Solutions

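• Memory-based Approach
  – Pearson Correlation Coefficient

    $$ w(x_A, x_B) = \frac{(v_A - \bar{v}_A)^T (v_B - \bar{v}_B)}{\sqrt{(v_A - \bar{v}_A)^T (v_A - \bar{v}_A) \, (v_B - \bar{v}_B)^T (v_B - \bar{v}_B)}} $$

    where $v_A$ and $v_B$ are the rating vectors of customers $x_A$ and $x_B$, and $\bar{v}_A$, $\bar{v}_B$ their mean ratings.

  – … and its variants
  – suffer from the sparsity and the first-rater problems.

• Model-based Approach
  – solve the sparsity problem by incorporating a priori models.
  – E.g., Naïve Bayes Classifier, Bayesian Network, Latent Class Model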
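As an illustration of the memory-based approach, here is a minimal sketch of the Pearson weight between two customers. It assumes the correlation is computed over co-rated products only and returns 0.0 when there is no overlap or no variance; both conventions, along with the customer names and ratings, are assumptions made for this example.

```python
import math

def pearson_weight(ratings_a, ratings_b):
    """Pearson correlation between two customers over their co-rated products.

    ratings_a, ratings_b: dicts mapping product id -> rating.
    Returns 0.0 if there are no co-rated products or one side has no variance
    (an assumed convention, not taken from the slides).
    """
    common = sorted(set(ratings_a) & set(ratings_b))
    if not common:
        return 0.0
    va = [ratings_a[p] for p in common]
    vb = [ratings_b[p] for p in common]
    mean_a = sum(va) / len(va)
    mean_b = sum(vb) / len(vb)
    da = [v - mean_a for v in va]
    db = [v - mean_b for v in vb]
    num = sum(p * q for p, q in zip(da, db))
    den = math.sqrt(sum(p * p for p in da) * sum(q * q for q in db))
    return num / den if den > 0 else 0.0

# Hypothetical customers and ratings.
alice = {"m1": 5, "m2": 3, "m3": 1}
bob = {"m1": 4, "m2": 3, "m3": 2, "m4": 5}
carol = {"m1": 1, "m2": 3, "m3": 5}
dave = {"m9": 4}

w_similar = pearson_weight(alice, bob)     # same taste on co-rated items
w_opposite = pearson_weight(alice, carol)  # opposite taste
w_sparse = pearson_weight(alice, dave)     # no co-rated products
```

The last call shows the sparsity problem in miniature: without co-rated products the weight carries no information, however many ratings each customer has.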

Limitations

• The sparsity problem (lacking sufficient ratings)

• The first-rater problem (encountering new products)

[Figure: rating matrix for customers x1, x2, x3 and a new customer xn; most entries are unrated ("-")]

Grouping Preference Ratings

– to solve the sparsity problem

[Figure: the rating matrix for customers x1, x2, x3 and a new customer xn, grouped into Preference Pattern #1 and Preference Pattern #2, with unrated products marked "Recommended!"]

Integrating Product Contents

– to solve the first-rater problem

[Figure: the rating matrix for customers x1, x2, x3 and a new customer xn, with Preference Pattern #1 and Preference Pattern #2, and a product marked "Recommended!"]

Our Proposed Solution - the use of LCM

• The latent class model was proposed by Thomas Hofmann et al. at IJCAI’99 for clustering preference ratings, with promising results.

• Limitation: only capable of recommending products to customers in the training set.

• We extend their model so that
  – a) Existing products can be recommended to customers not in the training set
  – b) New products can be recommended to existing customers (not described in the paper).

Latent Class Model

[Figure: Customer X and Product Y (observed) are linked through Preference Pattern Z (hidden)]

Model Training: Learn P(z), P(x|z) and P(y|z) using the EM algorithm. The model initialization is done by K-means clustering.

$$ P(x, y) = \sum_z P(z) \, P(x \mid z) \, P(y \mid z) $$
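The training step can be sketched as a plain EM loop over a customer-by-product count matrix. This is a minimal illustration under stated assumptions: it uses a deterministic asymmetric start instead of the K-means initialization mentioned above, treats each matrix entry as an observation count, and runs on a made-up 4x4 matrix.

```python
import math

def _normalise(values):
    total = sum(values)
    return [v / total for v in values]

def em_latent_class(counts, n_z, n_iter=30):
    """Fit P(z), P(x|z), P(y|z) to counts[x][y] by EM.

    Returns the three parameter sets and the log-likelihood per iteration.
    """
    n_x, n_y = len(counts), len(counts[0])
    # Deterministic, asymmetric start (assumption; the slides use K-means).
    p_z = [1.0 / n_z] * n_z
    p_x_z = [_normalise([1.0 + 0.1 * z * x for x in range(n_x)]) for z in range(n_z)]
    p_y_z = [_normalise([1.0 + 0.1 * z * y for y in range(n_y)]) for z in range(n_z)]
    history = []
    for _ in range(n_iter):
        new_z = [0.0] * n_z
        new_x = [[0.0] * n_x for _ in range(n_z)]
        new_y = [[0.0] * n_y for _ in range(n_z)]
        log_lik = 0.0
        for x in range(n_x):
            for y in range(n_y):
                n = counts[x][y]
                if n == 0:
                    continue
                # E-step: posterior of each pattern z for the observed pair (x, y).
                joint = [p_z[z] * p_x_z[z][x] * p_y_z[z][y] for z in range(n_z)]
                p_xy = sum(joint)
                log_lik += n * math.log(p_xy)
                for z in range(n_z):
                    resp = n * joint[z] / p_xy
                    new_z[z] += resp
                    new_x[z][x] += resp
                    new_y[z][y] += resp
        history.append(log_lik)
        # M-step: re-estimate the parameters from the expected counts.
        p_x_z = [[v / new_z[z] for v in new_x[z]] for z in range(n_z)]
        p_y_z = [[v / new_z[z] for v in new_y[z]] for z in range(n_z)]
        p_z = _normalise(new_z)
    return p_z, p_x_z, p_y_z, history

# Hypothetical data: customers 0-1 chose products 0-1, customers 2-3 chose products 2-3.
counts = [[1, 1, 0, 0], [1, 1, 0, 0], [0, 0, 1, 1], [0, 0, 1, 1]]
p_z, p_x_z, p_y_z, history = em_latent_class(counts, n_z=2)
```

The recorded log-likelihood never decreases from one iteration to the next, which is the usual sanity check for an EM implementation.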

Existing Products to Existing Customers

• Compute the probability that x is interested in y:

$$ P(y \mid x) = \sum_z P(z \mid x) \, P(y \mid z) = \frac{\sum_z P(x \mid z) \, P(z) \, P(y \mid z)}{\sum_z P(x \mid z) \, P(z)} $$

• Products can then be sorted according to the values of P(y|x) for recommendation.
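The scoring rule for an existing customer can be sketched directly from the trained parameters. The parameter values below are hypothetical (two preference patterns, three customers, four products) and were chosen by hand rather than learned.

```python
def recommend(p_z, p_x_given_z, p_y_given_z, x):
    """Return P(y|x) for every product y, using P(y|x) = sum_z P(z|x) P(y|z)."""
    n_z = len(p_z)
    # Bayes' rule: P(z|x) is proportional to P(x|z) P(z).
    joint_xz = [p_x_given_z[z][x] * p_z[z] for z in range(n_z)]
    total = sum(joint_xz)
    p_z_given_x = [v / total for v in joint_xz]
    n_y = len(p_y_given_z[0])
    return [
        sum(p_z_given_x[z] * p_y_given_z[z][y] for z in range(n_z))
        for y in range(n_y)
    ]

# Hypothetical model parameters.
p_z = [0.5, 0.5]
p_x_given_z = [[0.8, 0.1, 0.1], [0.1, 0.1, 0.8]]
p_y_given_z = [[0.4, 0.4, 0.1, 0.1], [0.1, 0.1, 0.4, 0.4]]

scores = recommend(p_z, p_x_given_z, p_y_given_z, x=0)

# Sort product indices by decreasing P(y|x) to obtain the recommendation list.
ranking = sorted(range(len(scores)), key=lambda y: -scores[y])
```

Customer 0 belongs mostly to the first pattern, so the two products that pattern favours come first in the ranking.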

Extension 1: Existing Products to New Customers

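$$ P(y \mid x_n) = \sum_z P(z \mid x_n) \, P(y \mid z) $$

x_n is not inside the training set. Thus, we don’t have P(z|x_n). It is approximated from the ratings n(x_n, y_h) that x_n has given to the products y_h in a set Y_h, normalised over z:

$$ P(z \mid x_n) \approx \hat{P}(z \mid x_n, Y_h) \propto \sum_{y_h \in Y_h} \hat{P}(z \mid y_h) \, n(x_n, y_h) \propto \sum_{y_h \in Y_h} P(y_h \mid z) \, P(z) \, n(x_n, y_h) $$

The sum over y_h is the inner product of the pdf of pattern z and the ratings of x_n.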
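A minimal sketch of this fold-in step follows. It assumes that P(z|x_n) is obtained by normalising P(z) times the inner product of P(y_h|z) with the new customer's ratings, and that a customer with no ratings falls back to the prior P(z); the fallback and all numbers are assumptions for illustration.

```python
def fold_in_customer(p_z, p_y_given_z, ratings):
    """Estimate P(z|x_n) for a customer outside the training set.

    ratings: dict mapping product index y_h -> rating n(x_n, y_h).
    Uses P(z|x_n) proportional to P(z) * sum_h P(y_h|z) n(x_n, y_h).
    """
    weights = [
        p_z[z] * sum(p_y_given_z[z][y] * n for y, n in ratings.items())
        for z in range(len(p_z))
    ]
    total = sum(weights)
    if total == 0:
        # No usable ratings: fall back to the prior (assumed convention).
        return list(p_z)
    return [w / total for w in weights]

def predict(p_z_given_x, p_y_given_z):
    """P(y|x_n) = sum_z P(z|x_n) P(y|z) for every product y."""
    n_y = len(p_y_given_z[0])
    return [
        sum(pz * row[y] for pz, row in zip(p_z_given_x, p_y_given_z))
        for y in range(n_y)
    ]

# Hypothetical trained parameters: 2 patterns, 4 products.
p_z = [0.5, 0.5]
p_y_given_z = [[0.4, 0.4, 0.1, 0.1], [0.1, 0.1, 0.4, 0.4]]

# The new customer has rated only product 0, with a score of 5.
new_customer = {0: 5}
p_z_given_xn = fold_in_customer(p_z, p_y_given_z, new_customer)
scores = predict(p_z_given_xn, p_y_given_z)
```

A single rating is enough to place the new customer mostly in the first pattern, so product 1 is scored above products 2 and 3 even though the customer never rated it.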

Extension 2: New Products to Existing Customers

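$$ P(y_n \mid x) = \sum_z P(z \mid x) \, P(y_n \mid z) $$

y_n is not inside the training set. Thus, we don’t have P(y_n|z).

$$ P(y_n \mid z) = \frac{P(z \mid y_n) \, \hat{P}(y_n)}{P(z)}, \qquad P(z \mid y_n) = \frac{\exp(-\lVert f(y_n) - f(z) \rVert)}{\sum_{z'} \exp(-\lVert f(y_n) - f(z') \rVert)} $$

Here $\lVert f(y_n) - f(z) \rVert$ is the distance between y_n and z in the feature space.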
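The new-product extension can be sketched as a softmax over negative feature-space distances. Several ingredients here are assumptions: the Euclidean distance, the way each pattern z is given a feature vector f(z), the value of the prior estimate for the new product, and all numbers.

```python
import math

def pattern_posterior(f_new, pattern_features):
    """P(z|y_n): softmax over negative Euclidean distances in feature space."""
    exps = [math.exp(-math.dist(f_new, f_z)) for f_z in pattern_features]
    total = sum(exps)
    return [e / total for e in exps]

def new_product_likelihood(f_new, pattern_features, p_z, p_y_new):
    """P(y_n|z) = P(z|y_n) * P(y_n) / P(z) for each pattern z."""
    post = pattern_posterior(f_new, pattern_features)
    return [post[z] * p_y_new / p_z[z] for z in range(len(p_z))]

# Hypothetical feature vectors f(z) for two preference patterns.
pattern_features = [(0.0, 0.0), (3.0, 4.0)]
p_z = [0.5, 0.5]

# Hypothetical new product: features equal to pattern 0, assumed prior 0.1.
f_new = (0.0, 0.0)
post = pattern_posterior(f_new, pattern_features)
lik = new_product_likelihood(f_new, pattern_features, p_z, p_y_new=0.1)

# Score for an existing customer whose pattern posterior P(z|x) is known.
p_z_given_x = [0.8, 0.2]
score = sum(pz * l for pz, l in zip(p_z_given_x, lik))
```

Because the score depends only on the product's features, a product with no ratings at all can still be ranked, which is how the first-rater problem is addressed.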

Performance Measures

• accuracy: the percentage of correct recommendations

• recall: the percentage of interesting products that can be located in the output list

• precision: the percentage of products in the output list which are really interesting to the customer.

• break-even point: the point where recall = precision

• expected utility:

$$ R_i = \sum_j \frac{\max(v_{ij} - d, 0)}{2^{(j-1)/(\alpha-1)}}, \qquad \text{utility} = \frac{R_i - R_i^{\min}}{R_i^{\max} - R_i^{\min}} $$

  – its value is high if the products rated high appear early in the output list.
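The measures can be sketched as follows. The choice of R_max and R_min as the utilities of the best (descending) and worst (ascending) orderings of the same ratings is an assumption, as are the default rating d = 3, the half-life alpha = 2, and the sample data.

```python
def precision_recall(output_list, interesting):
    """Precision and recall of an output list against the set of interesting products."""
    hits = sum(1 for y in output_list if y in interesting)
    precision = hits / len(output_list) if output_list else 0.0
    recall = hits / len(interesting) if interesting else 0.0
    return precision, recall

def expected_utility(ranked_ratings, d, alpha):
    """R_i = sum_j max(v_ij - d, 0) / 2 ** ((j - 1) / (alpha - 1)), with j = 1, 2, ..."""
    return sum(
        max(v - d, 0.0) / 2 ** ((j - 1) / (alpha - 1))
        for j, v in enumerate(ranked_ratings, start=1)
    )

def normalised_utility(r, r_min, r_max):
    """(R - R_min) / (R_max - R_min)."""
    return (r - r_min) / (r_max - r_min)

# Hypothetical output list and ground truth.
precision, recall = precision_recall(["a", "b", "c", "d"], {"a", "c", "e"})

# True ratings of the products, in the order the recommender listed them.
ranked = [5, 1, 4]
r = expected_utility(ranked, d=3, alpha=2)
r_max = expected_utility(sorted(ranked, reverse=True), d=3, alpha=2)  # assumed best case
r_min = expected_utility(sorted(ranked), d=3, alpha=2)                # assumed worst case
utility = normalised_utility(r, r_min, r_max)
```

Placing the product rated 4 in third position instead of second costs utility, which is the behaviour the measure is meant to capture.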

Experiment One: Setup(content-based by SVM)

• Product ratings data set
  – EachMovie (from DEC)

• Product description data set
  – Internet Movie Database (http://www.imdb.com)
  – Size of feature set = 6620, including
    • Release date, Runtime, Language, Director, Producer, Original music, Writing credit, ...

• No. of products = 1628
  – 5-fold cross-validation
  – ~1200 for training and remaining for testing

• No. of customers = 100

Experiment One: Results(content-based by SVM)

Method            Accuracy (%)   Break-even Point (%)   Utility (%)
SVM               77             80.3                   65
Naïve Bayes       76             78.8                   61
C4.5rules (100)   74             76.0                   52
C4.5rules (400)   75             75.1                   52
1-NN              69             76.2                   45
C4.5 (100)        74             -                      -
C4.5 (400)        74             -                      -
majority          75             -                      -

Experiment Two: Setup(collaborative by ELCM)

• Ratings data set
  – EachMovie (from DEC)

• Training
  – No. of products = 500
  – No. of customers = 90

• Testing
  – No. of customers = 10
  – No. of products = 250
  – Size of the product set where ratings are considered for matching, L = {10, 63, 83, 125, 250}

Experiment Two: Results(collaborative by ELCM)

      No. of     Accuracy (%)     Break-even point (%)   Utility (%)
      latent
L     classes    ELCM   P-Corr    ELCM   P-Corr          ELCM   P-Corr
250    6         63     61        75.6   75.6            57     65
      10         62               77.0                   62
      15         63               75.5                   59
125    6         61     60        75.6   73.5            57     61
      10         62               76.6                   63
      15         62               75.3                   60
 83    6         61     50        75.6   73.3            58     52
      10         60               77.3                   61
      15         62               72.3                   60
 63    6         61     51        75.6   71.0            59     52
      10         58               77.2                   62
      15         60               72.3                   59
 10    6         62     40        72.0   63.1            56     45
      10         61               71.0                   56
      15         62               70.5                   55

Conclusion and Future Work

• SVM and ELCM are empirically shown to be promising for content-based recommendation and collaborative recommendation, respectively.

• Future work
  – ELCM
    • Model Enhancement - BiELCM, hierarchical, ...
    • Scalability issue of the EM algorithm for ELCM
    • Modelling dynamic preference patterns
    • Applications to cross-selling?
  – Integration of SVM and ELCM for improvement