
[IEEE 2009 Ninth International Conference on Hybrid Intelligent Systems - Shenyang, China (2009.08.12-2009.08.14)] 2009 Ninth International Conference on Hybrid Intelligent Systems


A Cold-start Recommendation Algorithm Based on New User's Implicit Information and Multi-Attribute Rating Matrix

YIN Hang, CHANG Guiran, WANG Xingwei

School of Information Science and Engineering, Northeastern University, Shenyang, China

[email protected]

Abstract—Traditional collaborative filtering recommendation algorithms face the cold-start problem. A collaborative filtering recommendation algorithm based on the implicit information of the new users and multi-attribute rating matrix is proposed to solve the problem. The implicit information of the new users is collected as the first-hand interest information. It is combined with other rating information to create a User-Item Rating Matrix (UIRM). Singular Value Decomposition is used to reduce the dimensionality of the UIRM, resulting in the initial neighbor set for target users and a new user-item rating matrix. The user ratings are mapped to the relevant item attributes and the user attributes respectively to generate a User-Item Attribute Rating Matrix and a User Attribute-Item Attribute Rating Matrix (UAIARM). The attributes of new items and UAIARM are matched to find the N users with the highest match degrees as the target of the new items. The attributes of the new users are matched with UAIARM to find the N items with the highest match degrees as the recommended items. Experiment results validate the feasibility of the algorithm.

Keywords-collaborative filtering; implicit information; attribute rating matrix; cold-start; recommendation algorithm

I. INTRODUCTION

With the popularization of the Internet and e-Commerce, recommendation systems have become an important research area. Recommendation systems can be divided into two main types: content-based and collaboration-based. Content-based systems generate recommendations from the user's past preference information about the target items. Collaborative filtering systems, in contrast, adopt statistical techniques to find neighbor consumers who share the interests of the target consumer, and then generate recommendations according to the selections of those neighbors. These systems are therefore also called user-based collaborative filtering recommendation systems [1].

As the numbers of users and items grow, traditional collaborative filtering recommendation algorithms face the problem of sparse user rating data, which quickly degrades the quality of recommendations. They also face the cold-start problem.

Many algorithms have been proposed to solve the problem of sparse user rating data, such as item-based filtering [2], but the problem is still not solved satisfactorily.

In this paper, a collaborative filtering recommendation algorithm based on the implicit information of new users [3] and a Multi-Attribute Rating Matrix (MARM) is proposed to solve the problem. The implicit information of the new users is used to create the initial neighbor set [4] of the target user and the user attribute matrix. Together with the item attribute matrix, the user attribute matrix converts the ratings given by a user to an item into ratings of the item attributes, and further into ratings given by the user attributes to the item attributes. The User Attribute-Item Attribute Rating Matrix (UAIARM) and the MARM are then created. In this way, the fuzzy identification of the similarity between a user and the neighbor users becomes an accurate identification of the similarity between attributes, and the user's preference for item attributes becomes the preference of certain user attributes for certain item attributes. This attribute-level preference can effectively alleviate the sparse data problem [1,2], because different users can directly use the ratings made from user attributes to item attributes without needing common ratings of the same items.

II. RELATED CONCEPTS

A. Collaborative Filtering Recommendation Algorithm

The Collaborative Filtering Recommendation Algorithm (CFRA) generates a recommendation list according to the users' opinions. The algorithm assumes that if the ratings of the users to some items are similar, then their ratings to other items will be alike too. CFRA searches for the nearest neighbor users of the target user, and then creates a recommendation list from the ratings of those nearest neighbors [5].

Definition 1: $D=(U,I,R)$ is the data source of the recommendation system. $U=\{U_1,U_2,\ldots,U_m\}$ is the set of basic users, $|U|=m$; $I=\{I_1,I_2,\ldots,I_n\}$ is the set of items, $|I|=n$; the $m \times n$ matrix $R$ is the rating matrix of the basic users to the items, and $r_{ij}$ is the rating given by user $i$ in $U$ to item $j$ in $I$.

Because the numbers of users (m>1000) and items (n>1000) at a large-scale e-Commerce site are large and increase continuously, R becomes a high-dimensional matrix in which users have few or no commonly rated items. The similarity computation therefore takes a long time, which affects online recommendation to the user. In particular, the quality and efficiency of cold-start recommendation for new items are very low.
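The neighbor search and prediction described for CFRA can be illustrated with a minimal sketch. The paper does not fix a similarity measure at this point, so cosine similarity over commonly rated items is an assumption here, as are the names `cosine_sim` and `predict` and all rating values.

```python
from math import sqrt

# Sparse rating matrix R stored as {user: {item: rating}}; values are made-up illustration data.
R = {
    "U1": {"I1": 5, "I2": 3, "I3": 4},
    "U2": {"I1": 4, "I2": 3, "I3": 5, "I4": 4},
    "U3": {"I1": 1, "I2": 5, "I4": 2},
}

def cosine_sim(a, b):
    """Cosine similarity over the items both users rated (assumed similarity measure)."""
    common = set(a) & set(b)
    if not common:
        return 0.0
    dot = sum(a[i] * b[i] for i in common)
    norm_a = sqrt(sum(a[i] ** 2 for i in common))
    norm_b = sqrt(sum(b[i] ** 2 for i in common))
    return dot / (norm_a * norm_b)

def predict(R, user, item, k=2):
    """Similarity-weighted mean of the ratings given to `item` by the k nearest neighbors."""
    neighbors = [(cosine_sim(R[user], R[v]), v) for v in R if v != user and item in R[v]]
    neighbors = sorted(neighbors, reverse=True)[:k]
    total = sum(s for s, _ in neighbors)
    if total == 0:
        return None  # no usable neighbor: the sparse or cold-start case
    return sum(s * R[v][item] for s, v in neighbors) / total

p_known = predict(R, "U1", "I4")  # item rated by neighbors U2 and U3
p_new = predict(R, "U1", "I9")    # brand-new item: nobody has rated it
```

The second call returns `None`, which is the cold-start situation for new items that the rest of the paper addresses.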

2009 Ninth International Conference on Hybrid Intelligent Systems

978-0-7695-3745-0/09 $25.00 © 2009 IEEE

DOI 10.1109/HIS.2009.184


B. The cold-start problem

The cold-start problem for new item recommendation is often referred to as "the first commentator problem", that is, "which users should the new items be recommended to". The essential problem is the extreme sparseness of rating data [6].

Because the first user rating the new items cannot benefit from the action, introducing an incentive mechanism to encourage users to make a rating cannot solve the cold-start problem. A simple solution to the problem is to search its similar neighbor set based on the contents of the new items and use the average rating value of similar neighbors as the predicted rating, or to use formula (1) to compute the weighted approximation of the predicted rating [7]

$$P_{u,i} = \frac{\sum_{j=1}^{n} R_{u,j} \cdot sim(i,j)}{\sum_{j=1}^{n} |sim(i,j)|} \qquad (1)$$

where $P_{u,i}$ is the predicted rating given to new item $i$ by user $u$, $n$ is the number of similar neighbors of item $i$, $R_{u,j}$ is the rating given to the similar neighbor item $j$ by user $u$, and $sim(i,j)$ is the similarity between the new item $i$ and the similar neighbor item $j$.
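As a small illustration of formula (1), the sketch below computes the weighted prediction for one user and one new item. The function name `predict_new_item` and the numbers are hypothetical, and restricting the sums to neighbor items the user has actually rated is an assumption made so that the code also works on incomplete data.

```python
def predict_new_item(user_ratings, sims):
    """Formula (1): sum_j R[u,j] * sim(i,j) / sum_j |sim(i,j)|.

    user_ratings maps neighbor item j to the rating R[u,j];
    sims maps neighbor item j to sim(i, j) for the new item i.
    Neighbors the user has not rated are skipped (assumption).
    """
    rated = [j for j in sims if j in user_ratings]
    numerator = sum(user_ratings[j] * sims[j] for j in rated)
    denominator = sum(abs(sims[j]) for j in rated)
    return numerator / denominator if denominator else None

# Hypothetical data: two neighbor items of the new item i.
user_ratings = {"j1": 4.0, "j2": 2.0}
sims = {"j1": 0.75, "j2": 0.25}
p = predict_new_item(user_ratings, sims)  # (4*0.75 + 2*0.25) / (0.75 + 0.25) = 3.5
```

If both neighbor ratings were low, the prediction would be low as well, regardless of how well the new item matches the user.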

But this method has some shortcomings [7]. If the ratings of the similar neighbors of the new item are mostly low, the average rating will be low, and the new item cannot be recommended to the user. The method also needs to compare the new item with other items by content to obtain its similar neighbor set, but the specific contents of many items at e-Commerce sites are difficult or impossible to determine.

Three solutions are proposed in [8] to solve this problem.

1) Simple average method: Use the average of the ratings that the target user has given to all items as the predicted rating of the target user for the new items.

2) Statistical mode method: Because the ratings a user gives to items are usually similar, the statistical mode of those ratings can be used to predict the rating given by the target user to the new items.

3) Information entropy method: Because the amount of information contained in the ratings of each item is different, its effect on the recommendation differs. We can therefore select some items based on information entropy, compute the average score the target user gave to these items, and use it as the predicted rating.
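The three baseline predictors can be sketched as follows. This is only an illustration: the text does not say how ties in the mode are broken or which items the entropy method selects, so taking the smallest mode and picking the `top` items with the highest rating entropy are both assumptions, and all names and numbers are hypothetical.

```python
from collections import Counter
from math import log2
from statistics import mean, multimode

def simple_average(ratings):
    """Method 1: mean of all ratings the target user has given."""
    return mean(ratings)

def statistical_mode(ratings):
    """Method 2: most frequent rating; the smallest value wins ties (assumed tie-break)."""
    return min(multimode(ratings))

def rating_entropy(ratings):
    """Shannon entropy (in bits) of the rating distribution of one item."""
    n = len(ratings)
    return -sum(c / n * log2(c / n) for c in Counter(ratings).values())

def entropy_average(user_ratings, all_ratings, top=2):
    """Method 3: mean of the user's ratings on the `top` highest-entropy items (assumed rule)."""
    ranked = sorted(user_ratings, key=lambda i: rating_entropy(all_ratings[i]), reverse=True)
    return mean(user_ratings[i] for i in ranked[:top])

# Hypothetical data: one user's ratings, and every user's ratings per item.
user_ratings = {"a": 5, "b": 3, "c": 1}
all_ratings = {"a": [1, 1, 5, 5], "b": [3, 3, 3, 3], "c": [1, 2, 4, 5]}
history = [4, 5, 4, 3, 4, 2]
```

With this data, item "c" has the most spread-out ratings and item "b" the least, so the entropy method averages the user's ratings on "c" and "a".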

III. A COLD-START RECOMMENDATION ALGORITHM

A. Basic ideas

During online shopping, the browsing behavior of a user, including the first click or inquiries, the order of clicks, the repeated clicks and the visiting time, reflects the preference of the user for the goods. We can collect the implicit information of the user by analyzing this behavior. The collection methods can be divided into server-log-based and client-based. The preference of a new user for different attributes can help to form the similar neighbor set [8].

At an e-Commerce website, the information of each item is stored in a relational database, and the attribute values of the item are stored as the column values of a two-dimensional data table. A user usually buys an item because of his/her interest in certain attributes of the merchandise, so the attributes of the merchandise bought by the users reflect their implicit preferences. In this way, the ratings given by the user to the items can be transformed into a clearer rating matrix based on the attributes.

Similarly, the user information can be transformed into attributes and stored in a two-dimensional data table, such as age group, the preferred style of movies, and the preferred actors. For old users, these can be identified from previous feedback. For new users, these can be identified from the implicit information collected by analyzing their behavior. Different categories of user attributes usually correspond to different interests, so the user attributes carry the preference of the user.

Therefore, the MARM constructed from the user attributes and the item attributes can transform the user preference for one item into the user preference for the attributes of the item, and even into the preference of the user attributes for certain attributes of the item. The fuzzy rating by the user is thereby transformed into an accurate attribute-to-attribute rating. This reduces the sparse data problem and realizes new item and new user recommendation as follows (see Figure 1).

1) Collect implicit information:

a) Collect implicit information based on logs: The website can get the user preference from historical data.

b) Collect implicit information based on behavior: The website can get the user preference from the browsing behavior of the user and the user preference degree. This method is suitable for first-time visitors [9].

2) Create the User-Item Rating Matrix (UIRM): Create a UIRM based on the implicit information of the user.

3) Obtain the initial neighbor set through SVD dimension reduction: SVD matrix factorization is used to reduce the dimension of the UIRM, to get a compact initial neighbor set, and to reduce computation time [10].

4) Create the MARM:

a) Create the two-dimensional table of item attributes and map the ratings by the target user and the neighbor users to the attributes of each item to create the User-Item Attribute Rating Matrix (UIARM).

b) Create the two-dimensional table of user attributes based on the user preference collected from the implicit information of the user, and map the UIARM to the attributes of each user to generate the User Attribute-Item Attribute Rating Matrix (UAIARM).

c) Merge similar items in UAIARM to create the simplified UAIARM, that is, the MARM: The user similarity based on MARM makes it possible to compute the similarity between users who have no commonly rated items but some commonly rated attributes. This reduces the sparse data problem, yields the nearest neighbors of the target user, and allows items to be recommended based on their ratings.


Figure 1. The filtering recommendation algorithm based on MARM

5) Recommend new item:

a) When a new item is added to the system, the recommendation system reads the item attributes and matches them with UAIARM to get the attribute with the highest rating, and then chooses the users who have such attributes as the Recommended Users (RU).

b) Validate the recommendation: Match the attributes of the new items with UIARM to find the users giving the highest rating to the attributes and choose them as RU.

c) When a recommended user logs in, the new items are recommended immediately.

d) Once a new item is given a rating by a user, recalculate UIRM with the rating and update UIARM and UAIARM.

6) Recommend to new user:

a) When a new user enters the system, the recommendation system collects the implicit information of the user through the analysis of his/her actions, gets the attribute values of the new user, and then creates UAIARM.

b) Match the attributes of the new user with UAIARM to find the item attributes with the highest rating and choose the items with such attributes as the Recommended Items (RI) for the new user.

c) When one of the new users logs in, the recommended items are presented.

d) Once a new user gives a rating to an RI, recalculate UAIARM.

B. Collect Implicit Information of New Users

1) Build a three-dimensional model: Usually a three-dimensional model is built to describe the relations between user, goods, and attribute. We can get the proportion of the browsing time on one item in the total time and use the proportion of an attribute to get the composite score. The steps are as follows.


STEP 1 The time $t^{(ij)}_{sk}$ of the $s$th browsing by user $i$ for attribute $k$ of item $j$ is as shown in equation (2):

$$t^{(ij)}_{sk} = t^{(i)}_{s} \cdot \frac{f^{(ij)}_{sk}}{\sum_{k=1}^{K_j} f^{(ij)}_{sk}} \qquad (2)$$

where $t^{(i)}_{s}$ is the time user $i$ spent at the $s$th browsing.

STEP 2 Calculate the implicit rating by the user to the attributes of the items with formula (3).

$$R_{ijk} = \frac{t^{(ijk)}}{\sum_{s=1}^{S_i} t^{(i)}_{s}} \qquad (3)$$

STEP 3 Calculate the User-Item Attribute Matrix about the proportion of attributes $W=\{w_{jk}\}$. It reflects the preference degree of all users to attribute $k$ of item $j$.

$$w_{jk} = \frac{\sum_{i=1}^{I} t^{(ijk)}}{\sum_{i=1}^{I}\sum_{k=1}^{K_j} t^{(ijk)}} \qquad (4)$$

where $I$ is the number of users and $K_j$ is the number of attributes of item $j$.

STEP 4 $R_{ij}$ is the composite score given by user $i$ to item $j$:

$$R_{ij} = \sum_{k=1}^{K_j} R_{ijk}\, w_{jk} \qquad (5)$$
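Steps 1 to 4 can be traced with a short sketch that turns browsing sessions into implicit scores. Several things here are assumptions rather than the authors' specification: the weight f of an attribute within a session is not defined in the text and is treated as a per-attribute count, the total time a user spends on an attribute is taken as the sum of the per-session times from equation (2), and the log format and all values are hypothetical.

```python
from collections import defaultdict

# Hypothetical browsing log: (user, item, session_seconds, {attribute: f weight}).
log = [
    ("u1", "movieA", 60.0, {"genre": 2, "actor": 1}),
    ("u1", "movieA", 30.0, {"genre": 1, "actor": 1}),
    ("u2", "movieA", 40.0, {"genre": 3, "actor": 1}),
]

def attribute_times(log):
    """Eq. (2) summed over sessions: share each session's time among attributes by f."""
    t = defaultdict(float)
    for user, item, seconds, f in log:
        total_f = sum(f.values())
        for attr, weight in f.items():
            t[(user, item, attr)] += seconds * weight / total_f
    return t

def implicit_scores(log):
    t = attribute_times(log)
    user_time = defaultdict(float)  # denominator of eq. (3): total browsing time of a user
    for user, _, seconds, _ in log:
        user_time[user] += seconds
    R_attr = {key: v / user_time[key[0]] for key, v in t.items()}  # eq. (3)

    item_time = defaultdict(float)  # denominator of eq. (4): all users, all attributes
    attr_time = defaultdict(float)  # numerator of eq. (4): all users, one attribute
    for (user, item, attr), v in t.items():
        item_time[item] += v
        attr_time[(item, attr)] += v
    w = {key: v / item_time[key[0]] for key, v in attr_time.items()}  # eq. (4)

    R = defaultdict(float)  # eq. (5): composite score per (user, item)
    for (user, item, attr), r in R_attr.items():
        R[(user, item)] += r * w[(item, attr)]
    return R_attr, w, dict(R)

R_attr, w, R = implicit_scores(log)
```

In this toy log, user u2 spends most of the time on the attribute that all users favor, so u2 ends up with a higher composite score for the item than u1.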

2) Combine the user attributes for implicit rating to the item attributes: Import the user attributes and represent them by the vector $U(u_1, u_2, u_3, u_4, u_5)$.

Define a UAIARM $M1=\{m1_{ijk}\}$ which reflects the rating given by the first attribute of user $i$ to the $k$th hidden attribute of item $j$. Because the same types of users and items usually have the same attribute set, we analyze the preference degree given by a certain type of user attributes to a certain type of item attributes to get the simple UAIARM, that is, the MARM.

Define a MARM $MS=\{ms_{lk}\}$ which reflects the average rating given by attribute $l$ of users to attribute $k$ of items, as shown in Table 1.

TABLE 1. User Attributes-Item Attributes Rating/Preference Table

user attributes \ item attributes   f1   f2   f3   f4   f5   f6   f7
U1                                  1    0    0    0    1    0    1
U2                                  1    1    1    0    0    0    1
…                                   …    …    …    …    …    …    …
Un                                  0    0    0    0    1    1    1

C. SVD Dimensionality Reduction of UIRM

An $m \times n$ matrix $R$ can be decomposed into three matrices by SVD: $R = U S V^{T}$. $U$ and $V$ are an $m \times r$ orthogonal matrix and an $n \times r$ orthogonal matrix, respectively. $S = \mathrm{diag}(\sigma_1, \sigma_2, \ldots, \sigma_r)$ is an $r \times r$ diagonal matrix ($r \le \min(m, n)$), whose diagonal elements are the singular values and satisfy $\sigma_1 \ge \sigma_2 \ge \ldots \ge \sigma_r$.

Under the Frobenius norm, the best low-dimensional approximation of the original matrix $R$ can be constructed through SVD. A reduced matrix $S_k$ ($k < r$) is obtained by retaining the $k$ largest diagonal elements (the singular values) of $S$. Processing $U$ and $V$ in the same way gives a reconstructed matrix $R_k = U_k S_k V_k^{T}$ that has rank $k$ and is the matrix most similar to $R$: among all matrices of rank $k$, $R_k$ minimizes the Frobenius norm of the difference $R - R_k$, so that $R_k \approx R$. The value of $k$ can be set with the method introduced in [10].

Before reducing the dimensionality of matrix $R$ with SVD, the missing elements of each column of $R$ are filled with the average value of that column to solve the sparseness problem. Each row of $R$ must also be normalized to reduce the influence of users who have rated a large number of items, that is, the original score $R_{i,j}$ is replaced by $R_{i,j} - \bar{R}_i$, where $\bar{R}_i$ is the average score given by user $i$. The resulting matrix $R_{norm}$ is then processed as follows:

a) Use SVD to decompose the matrix $R_{norm}$ to get the matrices $U$, $S$ and $V$;

b) Reduce $S$ to a $k$-dimensional matrix to get the matrix $S_k$ ($k < r$);

c) Reduce the matrix $U$ accordingly to get $U_k$;

d) Calculate the square root of $S_k$ to get $S_k^{1/2}$;

e) Calculate the matrix $U_k S_k^{1/2}$: it is the $k$-dimensional space representation of the $m$ users, and the size of the matrix is reduced.
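The preprocessing that precedes the decomposition (column-mean filling followed by subtracting each user's average) can be sketched in plain Python. The function name and data are hypothetical, and computing the user average over the ratings the user actually gave, rather than over the filled row, is an assumption, since the text does not specify which is meant.

```python
def fill_and_normalize(R):
    """Fill gaps with item (column) means, then subtract each user's mean rating.

    R is a list of user rows; None marks a missing rating.
    """
    m, n = len(R), len(R[0])
    col_avg = []
    for j in range(n):
        known = [R[i][j] for i in range(m) if R[i][j] is not None]
        col_avg.append(sum(known) / len(known) if known else 0.0)
    normalized = []
    for row in R:
        known = [x for x in row if x is not None]
        # Assumption: the user average covers only the ratings the user actually gave.
        user_avg = sum(known) / len(known) if known else 0.0
        normalized.append([(x if x is not None else col_avg[j]) - user_avg
                           for j, x in enumerate(row)])
    return normalized

# Hypothetical 3 users x 3 items rating matrix with gaps.
R = [
    [5, None, 3],
    [4, 2, None],
    [None, 4, 1],
]
R_norm = fill_and_normalize(R)
# R_norm == [[1.0, -1.0, -1.0], [1.0, -1.0, -1.0], [2.0, 1.5, -1.5]]
```

The truncated decomposition itself could then be computed on `R_norm` with a linear-algebra routine such as NumPy's `numpy.linalg.svd`, keeping the first k singular values and the matching columns of U.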

D. Construct MARM

At the business website, two-dimensional tables for item attributes and user information are used to decompose the rating matrix. The two-dimensional table for item attributes is usually as shown in Table 2.

TABLE 2. Two-Dimensional Table for Item Attributes

        f(1)         f(2)         f(3)         …    f(j)
I(1)    Attri(1,1)   Attri(1,2)   Attri(1,3)   …    Attri(1,j)
I(2)    Attri(2,1)   Attri(2,2)   Attri(2,3)   …    Attri(2,j)
…       …            …            …            …    …
I(i)    Attri(i,1)   Attri(i,2)   Attri(i,3)   …    Attri(i,j)

The structure of the user attributes table is similar. In Table 2, each row corresponds to an item (or user), each column corresponds to an item (or user) attribute, and Attri(i, j) is the attribute value.

The UIARM is created first. The item attributes are mapped to the scores in UIRM, so that the scores for the items are transformed into scores for all the attributes of the items. Let Attri(i, j) be the score given by user $u$ to attribute $j$ of item $i$, then take the average of all these values as the score given by user $u$ to attribute $j$ of all items:

$$P(u,j) = \frac{\sum_{n=1}^{N} R_n(u, Attri(i,j))}{N} \qquad (6)$$

where $N$ is the total number of times that user $u$ gives a score to attribute $j$, and $R_n(u, Attri(i,j))$ is the corresponding score.
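A minimal sketch of formula (6) follows. It assumes binary item attributes as in Table 1 and assumes that a user's rating of an item counts in full as a score for every attribute the item carries; the function name `build_uiarm` and the data are hypothetical.

```python
from collections import defaultdict

def build_uiarm(ratings, item_attrs):
    """Formula (6): P(u, j) = average of user u's scores over items carrying attribute j."""
    sums = defaultdict(float)
    counts = defaultdict(int)  # N in formula (6), per (user, attribute) pair
    for (user, item), score in ratings.items():
        for attr in item_attrs[item]:
            sums[(user, attr)] += score
            counts[(user, attr)] += 1
    return {key: sums[key] / counts[key] for key in sums}

# Hypothetical UIRM entries and item attribute table.
ratings = {("u1", "m1"): 5, ("u1", "m2"): 3, ("u2", "m2"): 4}
item_attrs = {"m1": {"comedy", "1990s"}, "m2": {"comedy", "drama"}}
uiarm = build_uiarm(ratings, item_attrs)
# uiarm[("u1", "comedy")] is (5 + 3) / 2 = 4.0
```

Users u1 and u2 now both have a score for "comedy" and "drama" even though they share only one rated item, which is how the attribute mapping eases sparsity.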

Then the UAIARM is created. The user attributes are mapped to the scores in UIARM. The scores given by the users to the item attributes are assumed to be the sum of the scores given by the attributes of each user to each attribute of the items. Let Attri(r, i, j) be the score given by attribute $r$ of user $u$ to attribute $j$ of item $i$, then take the average of all these values as the score given by attribute $r$ of user $u$ to attribute $j$ of all items:

$$P(u[r], j[k]) = \frac{\sum_{n=1}^{N} R_n(u[r], Attri(r,i,j))}{N} \qquad (7)$$

where $u[r]$ is attribute $r$ of user $u$, $N$ is the total number of times that attribute $r$ of user $u$ gives a score to attribute $j$ of all items, and $R_n(u[r], Attri(r,i,j))$ is the corresponding score.

Finally, UAIARM is simplified. Combine the $P(u[r], j[k])$ values of all users ($M$ in total) into a composite score, i.e., sum the scores given by attribute $r$ of all users to attribute $j$ of all items and divide by $M$. The simplified formula is as follows:

$$P(r,j) = \frac{\sum_{m=1}^{M} P(u[r], j[k])}{M} \qquad (8)$$
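Formulas (7) and (8) can be combined into a short sketch that collapses per-user attribute scores into the MARM. Two assumptions are made: each attribute of a user simply inherits that user's UIARM score (the text leaves the split between user attributes open), and M is taken as the total number of users, as formula (8) is written. The names and values are hypothetical.

```python
from collections import defaultdict

def build_marm(uiarm, user_attrs):
    """Formulas (7)-(8): sum scores per (user attribute, item attribute) and divide by M."""
    M = len(user_attrs)  # total number of users
    sums = defaultdict(float)
    for (user, item_attr), score in uiarm.items():
        for r in user_attrs[user]:
            # Assumption: attribute r of the user inherits the user's score for item_attr.
            sums[(r, item_attr)] += score
    return {key: total / M for key, total in sums.items()}

# Hypothetical UIARM scores and user attribute table.
uiarm = {
    ("u1", "comedy"): 4.0,
    ("u1", "drama"): 2.0,
    ("u2", "comedy"): 5.0,
    ("u3", "drama"): 4.0,
}
user_attrs = {"u1": {"young"}, "u2": {"young", "female"}, "u3": {"female"}}
marm = build_marm(uiarm, user_attrs)
# marm[("young", "comedy")] is (4.0 + 5.0) / 3 = 3.0
```

The result has one entry per pair of user attribute and item attribute, which is far smaller than the user-item matrix it was derived from.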

Because only a small number of elements have score values in the final MARM, the computation time to find a neighbor set with this matrix is small.

E. Cold-start Recommendation Based on UAIARM

1) New item recommendation: When a new item is added to the system, the recommendation system reads the item attributes and matches them with UAIARM to get the attribute with the highest rating, and then chooses the users who have such attributes as the backup recommended users U1. At the same time, the new item attributes are matched with UIARM, and the users who give the highest scores to these attributes are chosen as the backup recommended users U2. The two sets are validated and intersected to get the candidate recommended user set.

Algorithm 1: PROGRAM2 (I, UGDB, GAT, UAT, T)

Input: New item I, User-Item Rating Database UGDB, item attribute table GAT, user attribute table UAT, the number of recommended users T.

Output: The recommended user set for new item I.

Steps:

Create the UIRM by UGDB, GAT and UAT;

Map the item attribute table to get UIARM;

Map the user attribute table to get UAIARM;

Match the new item attribute I[j] with Attri(r, i, j) in UAIARM, choose attribute r of user u which gives the highest score to attribute j of item I, and insert it into the user attribute set R={r1, r2, ..., rk};

Calculate the score of users with the set R and the weight of the attribute, and choose the first k users as the backup recommended user set U1={u1, u2, ..., uk};

Match the new item attribute I[j] with Attri(i, j) in UIARM, choose user u who gives the highest score to attribute j of item I, and insert it into the backup recommended user set U2={u1, u2, ..., uk};

Validate the sets U1 and U2 and take their intersection U={u1, u2, ..., uy} as the candidate recommended user set, where y is the number of users;

If T ≥ y, take the entire candidate recommended user set U as the final recommended user set; if T<y, use formula (2) to rate the candidate recommended user set and choose the first T users as the final recommended user set.
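The matching and intersection steps of Algorithm 1 can be sketched as below. This is an illustration under stated assumptions, not the authors' PROGRAM2: the UAIARM is given in its simplified attribute-to-attribute form, U1 is ranked by the number of matching user attributes, U2 is ranked by the summed UIARM score, and the final cut to T users follows the U2 order instead of the rating formula referenced in the last step. All names and values are hypothetical.

```python
def recommend_users_for_new_item(item_attrs, uaiarm, uiarm, user_attrs, k=3, T=2):
    """Sketch of Algorithm 1; the ranking rules for U1 and U2 are assumptions."""
    # User attributes that give the highest UAIARM score to each attribute of the new item.
    best = set()
    for j in item_attrs:
        scores = {r: s for (r, jj), s in uaiarm.items() if jj == j}
        if scores:
            best.add(max(scores, key=scores.get))
    # U1: the k users who carry the most attributes from that set.
    overlap = {u: len(attrs & best) for u, attrs in user_attrs.items()}
    u1 = sorted((u for u in overlap if overlap[u] > 0), key=lambda u: (-overlap[u], u))[:k]
    # U2: the k users with the highest summed UIARM score on the new item's attributes.
    total = {}
    for (u, j), s in uiarm.items():
        if j in item_attrs:
            total[u] = total.get(u, 0.0) + s
    u2 = sorted(total, key=lambda u: (-total[u], u))[:k]
    # Candidate set: intersection of U1 and U2, capped at T users.
    candidates = [u for u in u2 if u in u1]
    return candidates[:T]

# Hypothetical matrices: (user attribute, item attribute) and (user, item attribute) scores.
uaiarm = {("young", "comedy"): 4.5, ("senior", "comedy"): 2.0,
          ("young", "drama"): 3.0, ("senior", "drama"): 4.0}
uiarm = {("u1", "comedy"): 5.0, ("u2", "comedy"): 4.0,
         ("u3", "comedy"): 2.0, ("u3", "drama"): 4.0}
user_attrs = {"u1": {"young"}, "u2": {"young"}, "u3": {"senior"}}

targets = recommend_users_for_new_item({"comedy"}, uaiarm, uiarm, user_attrs)
```

For a new comedy, the attribute "young" scores highest, so u1 and u2 pass both checks while u3 is filtered out by the intersection.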

2) New user recommendation: When a new user enters the system, the recommendation system collects the implicit information of the user through the analysis of his/her actions and gets the attribute values of the new user. The attributes of the new user are matched with UAIARM to find the item attributes with the highest rating, and the items with such attributes form the recommended item set for the new user.

Algorithm 2: PROGRAM3 (U, UGDB, GAT, UAT, T)

Input: New user U, User-Item Rating Database UGDB, item attribute table GAT, user attribute table UAT, the number of recommended items T.

Output: The recommended item set of new user U.

Steps:

Create the UIRM by UGDB, GAT and UAT;

Map the item attribute table to get UIARM;

Map the user attribute table to get UAIARM;

Match the new user attribute U[r] with Attri(r, i, j) in UAIARM, choose attribute j of item I which is given the highest score by attribute r of user u, and insert it into the item attribute set J={j1, j2, ..., jk};

Calculate the score of items with the set J and the weight of attributes, and choose the first y items as the candidate recommended item set I={i1, i2, ..., iy}, where y is the number of items;

If T ≥ y, take the entire candidate recommended item set I as the recommended item set; if T<y, use formula (2) to rate the candidate recommended item set and choose the first T items as the recommended item set.
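The new-user case can be sketched in the same spirit. Again this is an illustration and not the authors' PROGRAM3: the UAIARM is assumed to be in its simplified attribute-to-attribute form, candidate items are scored by the number of matched attributes (the text only says "the score of items with the set J and the weight of attributes"), and ties are broken by item name. All names and values are hypothetical.

```python
def recommend_items_for_new_user(new_user_attrs, uaiarm, item_attrs, T=2):
    """Sketch of Algorithm 2; the item scoring rule is an assumption."""
    # Item attributes that receive the highest score from each attribute of the new user.
    best = set()
    for r in new_user_attrs:
        scores = {j: s for (rr, j), s in uaiarm.items() if rr == r}
        if scores:
            best.add(max(scores, key=scores.get))
    # Candidate items: those carrying at least one preferred attribute, best match first.
    score = {i: len(attrs & best) for i, attrs in item_attrs.items()}
    ranked = sorted((i for i in score if score[i] > 0), key=lambda i: (-score[i], i))
    # Keep the whole candidate set if it has at most T items, otherwise the first T.
    return ranked[:T]

# Hypothetical (user attribute, item attribute) scores and item attribute table.
uaiarm = {("young", "comedy"): 4.5, ("senior", "comedy"): 2.0,
          ("young", "drama"): 3.0, ("senior", "drama"): 4.0}
item_attrs = {"m1": {"comedy"}, "m2": {"drama", "comedy"}, "m3": {"drama"}}

items = recommend_items_for_new_user({"senior"}, uaiarm, item_attrs)
```

A new user whose only known attribute is "senior" is matched to "drama", so the two dramas are returned without the user having rated anything.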

F. Cold-start Recommendation Based on MARM

Consider the new item/user recommendation. When a new item or user enters the system, the system analyzes the attribute table and/or collects the implicit information of the user to get the attribute values of the new item or user. The new attributes are then matched with MARM to get the attributes with the highest score, and the users or items with such attributes form the recommended user set or the recommended item set.

Algorithm 3: PROGRAM4 (D, UGDB, GAT, UAT, T)

Input: New user/item D, User-Item Rating Database UGDB, item attribute table GAT, user attribute table UAT, the number of recommended items or recommended users T.

Output: The recommended user set or item set.

Steps:

Create the UIRM by UGDB, GAT and UAT;

Map the item attribute table to get UIARM;

Map the user attribute table to get UAIARM;

Simplify UAIARM to get MARM;

Get the attribute set A1={a1, a2, ..., ak} from D;

Treat the attributes in A1 as r or j and match them with P(r, j) in MARM, choose the attribute j or r with the highest score, and insert it into the attribute set A2={a1, a2, ..., ak};

Calculate the score of items/users with the set A2 and the weight of attributes, and choose the first y items/users as the candidate data set I={i1, i2, ..., iy}, where y is the size of the candidate set;

If T ≥ y, take the entire candidate data set I as the recommended item set or recommended user set; if T<y, use formula (2) to rate the candidate data set and choose the first T entries as the recommended item set or recommended user set.

IV. EXPERIMENT RESULTS AND ANALYSIS

The performance of the proposed algorithms is evaluated through experiments and compared with other collaborative filtering recommendation algorithms.

A. Experimental environment and data

The hardware platform is a PC with a Pentium 4 2.4 GHz CPU and 1 GB RAM, the database is SQL Server 2000, and the development tools include PowerBuilder 10.0, SVD, and Matlab. The MovieLens data set is used as the experiment data (http://www.grouplens.org/). The data set used in the experiments contains 14014 ratings given by 120 users to 212 films. Each user rates at least 10 movies.

The data set contains five disjoint subsets. Each time, one subset is selected as the test set and the other four as the base data set, resulting in five pairs of base data sets and test data sets. This 5-fold cross-validation method is used in the experiments. Each time, a pair of base data set and test data set is chosen, the records in the base data set are used as basic users, and the recommendation algorithms are tested with the target users. The average deviation computed over the 5 folds is regarded as the experiment result [11].
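The pairing of base and test sets can be sketched as follows. The paper's data set already comes as five disjoint subsets, so the seeded shuffle and striding used here to create them are assumptions for illustration, as is the function name.

```python
import random

def five_fold_pairs(records, seed=0):
    """Split records into 5 disjoint folds and return 5 (base, test) pairs."""
    records = list(records)
    random.Random(seed).shuffle(records)  # assumed way of forming the subsets
    folds = [records[i::5] for i in range(5)]
    pairs = []
    for t in range(5):
        test = folds[t]
        base = [r for i, fold in enumerate(folds) if i != t for r in fold]
        pairs.append((base, test))
    return pairs

pairs = five_fold_pairs(range(20))
```

Each record appears in exactly one test set, and the experiment result is the average of the error measured on the five test sets.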

B. Performance Evaluation Criteria

There are two main criteria for evaluating the quality of the result of a recommendation system: statistical accuracy measurement and decision support accuracy measurement [12]. The mean absolute error (MAE) used in this experiment is a statistical accuracy measure that directly measures the quality of the recommendation.

Definition 2: Assume the predicted rating set of the target users is $\{p_1, p_2, \ldots, p_N\}$ and the corresponding real rating set is $\{q_1, q_2, \ldots, q_N\}$; then the mean absolute error (MAE) is defined as:

$$MAE = \frac{\sum_{i=1}^{N} |p_i - q_i|}{N} \qquad (9)$$

MAE measures the accuracy of the prediction by calculating the deviation between the predicted score of the target users and the actual score. Thus, the smaller the MAE is, the higher the quality of recommendation [13].
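Formula (9) translates directly into a few lines of code. The function name and the sample ratings below are hypothetical.

```python
def mean_absolute_error(predicted, actual):
    """Formula (9): average absolute deviation between predicted and real ratings."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("rating lists must be non-empty and of equal length")
    return sum(abs(p - q) for p, q in zip(predicted, actual)) / len(predicted)

mae = mean_absolute_error([4, 3, 5], [5, 3, 3])  # (1 + 0 + 2) / 3 = 1.0
```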

C. Experiment Results and Analysis

The following algorithms are used in the experiments for comparison: the traditional collaborative filtering recommendation algorithm (user-based), a simple recommendation algorithm based on item attributes (based on UIARM), the cold-start recommendation algorithm based on UAIARM, and the cold-start recommendation algorithm based on MARM proposed in this paper.

1) Experiment results of new item cold-start recommendation: 25 new items are added in the experiment, and recommendations are made with algorithm 1 based on UAIARM and algorithm 3 based on MARM, respectively. The results are shown in Figure 2.

Figure 2. New items recommended by different algorithms

The results show that both the algorithm based on UAIARM and the algorithm based on MARM successfully make recommendations in the cold-start case, and that the algorithm based on MARM generates a good number of recommendations even when there are only two neighbors.

2) Experiment results of new user cold-start recommendation: 22 new users enter the system in the experiment. The results are shown in Figure 3.

Figure 3. New users recommended by different algorithms

The results show that the recommendation algorithm based on MARM can generate a good number of recommendations when there are only two neighbors.

V. CONCLUSION

A cold-start collaborative filtering recommendation algorithm based on the implicit information of the new user and a multi-attribute rating matrix is proposed to solve the cold-start and sparse data problems. UIARM is used to measure the similarity between the users and to reduce the data sparseness effectively. The new item attributes are matched with the user attributes in UAIARM, and the user attribute with the highest score is regarded as the necessary parameter of the recommended users; thus a new item cold-start recommendation is achieved. Similarly, the new user attributes obtained by analyzing the browsing behavior of the users are matched with the item attributes in UAIARM, and the item attribute with the highest score is regarded as the necessary parameter of the recommended items; thus a new user cold-start recommendation is achieved. The extreme case of recommending a new item to a new user can also be handled. The attributes can also be matched with UIARM in order to verify the accuracy and improve the quality of the recommendation.

In addition, the simplified UAIARM, MARM, can provide better recommendations with the relationship between the user attributes and the item attributes.

ACKNOWLEDGMENT

This work is supported by the National Natural Science Foundation of China under Grant No. 60673159 and No. 70671020; the Key Project of Chinese Ministry of Education under Grant No. 108040; Specialized Research Fund for the Doctoral Program of Higher Education under Grant No. 20060145012 and No. 20070145017.

REFERENCES

[1] J. B. Schafer, J. A. Konstan and J. Riedl, “E-Commerce Recommendation Applications [J],” Data Mining and Knowledge Discovery, 2001, 5(1-2), pp. 115-153.

[2] J. B. Schafer, J. A. Konstan and J. Riedl, “Recommender Systems in E-Commerce [C]//Proceedings of the ACM Conference on Electronic Commerce,” New York, ACM Press, 1999, pp. 158-166.

[3] ITU-T Recommendation H.323 (Version 4), “Packet-Based Multimedia Communications Systems [S],” 2000.

[4] ITU-T Recommendation H.225.0, “Call signaling protocols and media stream packetization for Packet-Based Multimedia Communications Systems[S],” 2002.

[5] X. H. Sun, “The Sparseness of Collaborative Filtering Communications Systems and Cold-Start Problem [D],” Hangzhou, Zhejiang University, 2005.

[6] S. Deerwester, S. T. Dumais, G. W. Furnas, “Indexing by Latent Semantic Analysis[J],” Journal of the American Society for Information Science, 1990, 41(6), pp. 391-407.

[7] C. Li, C. Y. Liang, “A Collaborative Filtering Recommendation Algorithm Based on Attributes-value Preference Matrix [J],” Intelligence Journal, 2008, 27(7), pp. 884-890.

[8] Y. H. Jiang, L. Q. Gao, “The Research of Implicit Information Collection in Personalized Recommendation System,” Theoretical exploration, 2006, 11.

[9] L. Q. Gao, L. Z. Li, “The Recommendation Algorithm Based on Customers' Behavior [J],” Computer Engineering and Applications, 2005, (3), pp. 188-190.

[10] B. M. Sarwar, G. Karypis, J. A. Konstan, “Application of Dimensionality Reduction in Recommender System-a case study[R],” Minneapolis, MN: University of Minnesota, 2000.

[11] J. Basilico, T. Hofmann, “Unifying collaborative and content-based filtering [C]//Proc. of the 21st International Conference on Machine Learning,” New York: ACM Press, 2004, pp. 65-72.

[12] J. D. M. Rennie, N. Srebro, “Fast maximum margin matrix factorization for collaborative prediction [C]//Proc. of the 22nd International Conference on Machine Learning,” New York: ACM Press, 2005, pp. 713-719.

[13] K. Yu, A. Schwaighofer, V. Tresp, “Probabilistic memory-based collaborative filtering [J],” IEEE Transactions on Knowledge and Data Engineering, 2004, 16(1), pp. 56-69.
