
https://doi.org/10.1007/s10489-018-01402-3

User profile as a bridge in cross-domain recommender systems for sparsity reduction

Ashish Kumar Sahu1 · Pragya Dwivedi1

© Springer Science+Business Media, LLC, part of Springer Nature 2019

Abstract In the past two decades, recommender systems have been successfully applied in many e-commerce companies. One of the promising techniques for generating personalized recommendations is collaborative filtering. However, it suffers from the sparsity problem. To alleviate this problem, cross-domain recommender systems came into existence, in which a transfer learning mechanism is applied to exploit knowledge from other related domains. When applying transfer learning, some information must overlap between the source and target domains. Several attempts have been made to enhance the performance of collaborative filtering with the help of other related domains in the cross-domain recommender systems framework. Nevertheless, exploiting knowledge from other domains remains a challenging and open problem in recommender systems. In this paper, we propose a method named User Profile as a Bridge in Cross-domain Recommender Systems (UP-CDRSs) for transferring knowledge between domains through user profiles. First, we build a user profile using the demographic information of a user, explicit ratings and the content information of user-rated items. Thereafter, a probabilistic graphical model is employed to learn the latent factors of users and items in both domains by maximizing the posterior probability. Finally, the prediction for an unrated item is estimated by the inner product of the corresponding latent factors of the user and the item. To validate the proposed UP-CDRSs method, we conduct a series of experiments at various sparsity levels using a cross-domain dataset. The results demonstrate that the proposed method substantially outperforms other methods, both with and without transfer learning, in terms of accuracy.

Keywords Cross-domain recommender systems · Recommender systems · Transfer learning · User profile · Matrix factorization

1 Introduction

With the development and explosion of Internet technologies and the continuous growth in accessibility of the World Wide Web, the amount of digital information generated by humans increases exponentially. We are easily overwhelmed by the huge amount of information and unable to find what we really desire. Recommendation Systems (RSs) [1, 2] are software tools that help us find the most relevant items out of millions of items in a database. Several techniques [3–5] are used for generating personalized recommendations; among them, Collaborative Filtering (CF) is one of the most promising techniques in recent years. CF focuses on user preference data provided explicitly by the user, such as numerical ratings, likes/dislikes, etc. It can be classified into two categories: memory-based and model-based. The former category focuses on a similarity strategy between co-rated users or items, followed by top-K neighbor (kNN) selection, after which a weighted average strategy is used for prediction. There exist multiple variations [5–8] of memory-based CF, which are based on modifying the similarity measure and the top-K neighbor selection.

1 Motilal Nehru National Institute of Technology Allahabad, Prayagraj, 211004, India

In the case of model-based CF, or the Latent Factor Model (LFM), a model is first constructed, and predictions are then made. One of the leading methods in this category is Matrix Factorization (MF) [9], in which both items and users are characterized by a small number of factors inferred from the user-item rating matrix. Several variations [10–14] of MF have been proposed, each with its own limitations.

Although CF has achieved great success in recent years, one of its major problems is data sparsity, because users provide feedback (in terms of numerical ratings) on only a limited number of items out of millions. When sparse data are used for generating predictions and then recommendations, the performance of the system may degrade. To address this problem, several authors have used additional in-domain feedback such as likes/dislikes [15, 16], user reviews [17], history records [18], etc. to mitigate the sparsity problem.

Applied Intelligence (2019) 49:2461–2481

Published online: 19 January 2019

http://orcid.org/0000-0003-4731-961X

Rather than focusing on heterogeneous in-domain feedback, another solution to this problem is Cross-domain Recommender Systems (CDRSs) [19–22], which leverage knowledge from other related domains (source domains) to improve performance in the target domain. This strategy can be realized by a transfer learning mechanism [23]. Transfer learning is a new paradigm of machine learning that aims to transfer useful knowledge from one or more source domains to the target domain in order to increase performance in the target domain under some assumptions. When applying transfer learning in CDRSs, two assumptions are made: 1) the source domain must be denser than the target domain, and 2) some information must overlap between the source and target domains. According to [24], CDRS models can be categorized into three types based on the overlap of users/items between the source and target domains, i.e., full users/items overlap, partial users/items overlap, and no overlap of users and items.

This paper focuses on the third category of CDRS models, i.e., no overlap of users and items. In this category, two approaches have been proposed: tag information transfer [25–28] and rating pattern knowledge transfer [19]. In the first approach, tags (lightweight user reviews in text form) must be common to both domains. A limitation of tag-based transfer learning methods in CDRSs is that finding overlapping tags between domains is too expensive.

The other approach in the non-overlap category is rating pattern knowledge transfer. Li et al. [19] proposed the Codebook Transfer (CBT) method, wherein a compact rating pattern is extracted from a source domain and then transferred to the target domain. The authors focused on numerical rating feedback only and assumed that both domains share the same rating distribution. In a practical scenario, however, every domain has its own rating distribution, which cannot be assumed to be the same.

In this paper, we propose a novel method named User Profile as a Bridge in Cross-domain Recommender Systems (UP-CDRSs) for exploiting knowledge from a source domain in order to enhance prediction accuracy in the target domain. We focus on an additional domain in which neither users nor items overlap with the target domain, and tag information is not available in either domain. According to [23], some information must overlap between domains; therefore, to build the bridge, we focus on implicit information about users as well as items. On the user side, this internal information may be demographic data such as the user's age, gender, occupation, etc.; on the item side, it is the content information of an item (e.g., the set of genres if the item is a movie). For instance, if we consider two different movie domains, these types of information may be common to both domains. The novelty of the proposed method compared with existing methods is that we are able to exploit the knowledge of an additional domain even when no explicit information, i.e., overlapping users and items, overlapping tags, etc., is shared between domains.

In our proposed method, we combine the information of both domains through graphical model theory [29], in which user latent factors, item latent factors, ratings and the similarity of user profiles between distinct domains act as random variables. We then solve the graphical model by maximizing the posterior probability of the user latent factors and item latent factors of both domains.

The proposed method consists of four steps. First, we build a user profile in both domains using the demographic information of a user, the explicit ratings provided by the user, and the content information of user-rated items. We then calculate the similarity between user profiles of distinct domains. After that, we build a probabilistic graphical model in which each node represents a random variable and links express probabilistic relationships between these variables. The random variables in our proposed UP-CDRSs method are: a vector of user latent factors, a vector of item latent factors, rating information and the similarity of user profiles. We solve the probabilistic graphical model to learn the latent factors of users and items in both domains. In the last step, the prediction for unrated items is estimated through the inner product of the corresponding latent factors of users and items in the target domain. The major contributions of our work can be summarized as follows:

– We present a new CDRS method for exploiting knowledge from a source domain in order to enhance accuracy in the target domain. The knowledge is transferred in terms of user profiles, which are built from explicit ratings and the internal information of users and items.

– Applying transfer learning in CDRSs requires some overlapping information between the domains. The novelty of our proposed method is that we focus on the category with no overlap of users and items. Hence, we extract implicit information about users and items to map user profiles across distinct domains.

– To the best of our knowledge, this is the first attempt in CDRSs in which knowledge is transferred in terms of the internal information of users and items by a probabilistic graphical model where each node acts as a random variable. After solving the graphical model by maximizing the posterior probability, we are able to extract the latent factors of users and items more precisely.

– Experiments are conducted on a CDRS dataset with no overlap of users and items. The proposed method is compared with methods both with and without transfer learning. For the methods without transfer learning, we used only the target domain rating matrix and applied their approaches. For the transfer learning methods, we used the rating matrices of both domains (source and target) as the training set and applied their approaches. We used two evaluation metrics to compare all existing related works with our proposed UP-CDRSs method.

The rest of this paper is organized as follows. In Section 2, we introduce some basic definitions, mathematical notations and the problem formulation. The literature review is presented in Section 3, where we briefly describe RSs and CDRSs in the transfer learning framework, followed by related works dealing with the sparsity problem. In Section 4, we describe our proposed UP-CDRSs method. In Section 5, we show the results of experiments conducted on a cross-domain dataset to verify the effectiveness of the proposed method. Finally, we conclude our work and provide future directions in Section 6.

2 Basic definitions, notations and the problem formulation

In this section, we describe some basic definitions in Section 2.1, followed by the mathematical notations used in this paper in Section 2.2; the problem formulation is described in Section 2.3.

2.1 Basic definitions

– Matrix factorization (MF): MF is one of the model-based CF methods. The underlying assumption of MF is that the interaction between users and items is governed by a small number of hidden factors called latent factors. Therefore, each user and each item is described in vector form, and the size of the vector is equal to the number of hidden factors. For example, in a movie recommender system, each movie vector measures the distribution over latent factors (e.g., Science Fiction, Comedy, Action, Love, etc.), and each user vector represents the user's taste for those latent factors. The overall taste of a user for an item is then estimated by the inner product of the corresponding user and item latent factors.

– Transfer learning: Transfer learning [23], or knowledge transfer, is a part of the machine learning framework that reapplies knowledge acquired from one or more source domains to the target domain in order to improve performance in the target domain even though less recorded data are available for learning the model. When applying the transfer learning framework, some information must overlap between domains.

In mathematical notation, a domain and a task are defined as follows. A domain $\mathcal{D} = \{\mathcal{X}, P(X)\}$ consists of two terms: a feature space $\mathcal{X}$ and a marginal probability distribution $P(X)$, where $X = \{x_1, \ldots, x_n\} \in \mathcal{X}$. Domains can differ in either term, i.e., if two domains are different, they may have different feature spaces or different marginal probability distributions. In machine learning terms, a dataset is a domain, and given a domain, a task has two components: a label space $\mathcal{Y}$ and an objective function $f(\cdot)$. For a task $\mathcal{T} = \{\mathcal{Y}, f(\cdot)\}$, the objective function has to be learned from training data consisting of pairs $\{x_i, y_i\}$, where $x_i \in X$ and $y_i \in \mathcal{Y}$. The learned function $f(\cdot)$ is then used to predict the label of a new instance $x$.

– Cross-domain recommender systems (CDRSs): CDRSs [20] are a new framework for recommender systems that mitigates the problems of traditional RS techniques. It uses additional information from one or more source domains to improve the quality of recommendations in the target domain.

– User profile: A user profile is a representation of the knowledge and personal characteristics of a user. The profiling information can be elicited from demographic information (e.g., the user's age, gender, occupation, etc.), user-rated items, and the content information of those rated items, such as movie title, genre, director, etc. (in the case of a movie domain). A user profile is represented in the form of a vector (see Fig. 7).
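As a minimal sketch of the inner-product prediction described in the MF definition above, the following Python snippet scores two items for one user. The factor values, the number of factors (f = 3) and the helper name `predict` are assumptions chosen only for illustration; they are not taken from the paper.

```python
import numpy as np

# Hypothetical latent factors with f = 3; the numbers are illustrative only.
user_factors = np.array([0.9, 0.1, 0.4])    # one user's taste per latent factor
item_factors = np.array([[1.0, 0.0, 0.5],   # item 0
                         [0.0, 1.0, 0.2]])  # item 1


def predict(user_vec, item_vec):
    """Estimate a rating as the inner product of user and item latent factors."""
    return float(np.dot(user_vec, item_vec))


# Score every item for the user and pick the one with the highest estimate.
scores = [predict(user_factors, v) for v in item_factors]
best_item = int(np.argmax(scores))
```

Here the user's taste is concentrated on the first factor, so the item that loads on that factor receives the higher score.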

2.2 Notations

In this subsection, we describe the notations used throughout the paper. The list of notations is as follows:

$R_{i,j}$ : Prediction on item $j$ for user $i$ of the target domain
$D^k$ : Domain $k$
$S(P_i^s, P_{i'}^t)$ : Similarity between user profiles $i$ and $i'$ in domains $s$ and $t$, respectively
$P_i^k$ : User profile of user $i$ in domain $k$
$R^k \in \mathbb{R}^{N^k \times M^k}$ : User-item rating matrix of domain $k$
$R_{i,j}^k$ : Rating provided by user $i$ on item $j$ in domain $k$
$M^k$ : Number of users in domain $k$
$N^k$ : Number of items in domain $k$
$U_i^k \in \mathbb{R}^{1 \times f}$ : A row vector of latent factors of user $i$ in domain $k$
$f$ : Number of latent factors
$I^k \in \mathbb{R}^{N^k \times M^k}$ : Binary mask of rating matrix $R^k$
$V_j^k \in \mathbb{R}^{1 \times f}$ : A row vector of latent factors of item $j$ in domain $k$

2.3 Problem formulation

This subsection describes the motivation behind our proposed method in the CDRSs framework. How is transfer learning carried out, and what knowledge of the source domain is exploited when users and items do not overlap? Both questions are answered using an example.

Our solution for establishing a bridge between domains is the user profile. As mentioned earlier, a user profile is a representation of the knowledge and personal characteristics of a user. We can correlate two users of distinct domains through their user profiles. If two users of distinct domains have similar user profiles, we can say that their behavior is also similar. For instance, let user $u_i^s$ and user $u_{i'}^t$ have user profiles $P_i^s$ and $P_{i'}^t$. If the two profiles are similar, then the similarity value $S(P_i^s, P_{i'}^t)$ tends to 1.

Another intuition is based on the LFM, in which a user's characteristics are represented in vector form. A vector of latent factors represents how much the user likes or dislikes each latent factor. If two users belong to two distinct domains and their latent vectors have similar likes/dislikes on all corresponding latent factors, then the inner product of both vectors tends to 1. For instance, if user $u_i^s$ and user $u_{i'}^t$ have vectors of latent factors $U_i^s$ and $U_{i'}^t$, respectively, then $U_i^s (U_{i'}^t)^T \approx 1$.

If two users $(u_i^s, u_{i'}^t)$ have similar user profiles and also have similar likes/dislikes in terms of latent factors, then the difference $S(P_i^s, P_{i'}^t) - U_i^s (U_{i'}^t)^T \approx 0$. If the difference does not tend to zero, an error is present, and it can be minimized only by updating the latent factor values of the users, because the user profile values are static.
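The mismatch between profile similarity and latent factors can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the squared-error objective, the plain gradient step, the learning rate and all function names are hypothetical choices made here, whereas the actual learning procedure is the probabilistic graphical model of Section 4. The sketch keeps the similarity matrix `S` fixed and updates only the latent factors, as the text above requires.

```python
import numpy as np


def profile_latent_error(S, U_s, U_t):
    """Return E with E[i, j] = S[i, j] - U_s[i] . U_t[j]."""
    return S - U_s @ U_t.T


def squared_error(S, U_s, U_t):
    """Half the sum of squared mismatches (an assumed objective)."""
    return 0.5 * np.sum(profile_latent_error(S, U_s, U_t) ** 2)


def gradient_step(S, U_s, U_t, lr=0.1):
    """One gradient step on the latent factors only; S is static."""
    E = profile_latent_error(S, U_s, U_t)
    # Gradient of the squared error w.r.t. U_s is -E U_t, w.r.t. U_t is -E^T U_s.
    return U_s + lr * (E @ U_t), U_t + lr * (E.T @ U_s)


# One source user, two target users, two latent factors (illustrative values).
S = np.array([[1.0, 0.0]])          # similar to target user 0, dissimilar to 1
U_s = np.array([[0.5, 0.5]])
U_t = np.array([[0.5, 0.5],
                [0.5, 0.5]])

before = squared_error(S, U_s, U_t)
U_s_new, U_t_new = gradient_step(S, U_s, U_t)
after = squared_error(S, U_s_new, U_t_new)
```

In this toy setting both target users start with identical latent vectors although only one of them has a similar profile, so the step pulls the two target vectors apart and the error decreases.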

According to [23], in order to apply a transfer learning mechanism that uses the knowledge of a source domain, some information must overlap. In our case, users and items are both disjoint, but the content information of items and the demographic data of users overlap. If both of these types of information were also disjoint, we could not exploit the knowledge of the source domain: according to our formulation, the similarities between all user profiles of distinct domains would be zero, and our cross-domain solution would reduce to a single-domain RS.

We take an example for better understanding. Figure 1 shows the user-item matrices of both domains. The source and target domains are movies and books, respectively. There are five users and six items in each domain. Ratings are in numerical form $\{1, 2, 3, 4, 5, ?\}$, where '?' represents the rating of an unrated item. In both domains, we initialize the latent factor values of all users. Figure 2 shows the users' latent factors, where the size of $f$ is 5. The additional information about users and items is demographic information and content information, respectively. In this example, we use genre information as the content information. Each item is described by five genres: if a particular genre is present in an item, its entry is 1, otherwise 0. Similarly, two types of user demographic information are presented: the user's age and gender. The user's age is represented by seven age ranges, i.e., $\langle 1{-}17, 18{-}24, 25{-}34, 35{-}44, 45{-}50, 51{-}55, 56{+} \rangle$, and the user's gender is represented as 'M'/'F' or 1/0. Figure 3 describes the content information of items and the demographic information of users. We can see that both the content information and the demographic information overlap between domains.
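To make the encoding in this example concrete, the following Python sketch builds a profile vector from the seven age ranges, the gender bit and the five genre indicators, and compares two profiles. It is a hedged illustration only: the rating-weighted genre average, the cosine similarity and the names `encode_age`, `build_profile` and `cosine` are assumptions introduced here, while the paper's own profile construction and similarity measure are given in Section 4.

```python
import numpy as np

# Seven age ranges from the example; None marks the open-ended last range.
AGE_BINS = [(1, 17), (18, 24), (25, 34), (35, 44), (45, 50), (51, 55), (56, None)]


def encode_age(age):
    """One-hot encode an age into the seven ranges."""
    vec = np.zeros(len(AGE_BINS))
    for k, (lo, hi) in enumerate(AGE_BINS):
        if age >= lo and (hi is None or age <= hi):
            vec[k] = 1.0
            break
    return vec


def build_profile(age, gender, rated_item_genres, ratings):
    """Concatenate age one-hot, gender bit and rating-weighted genre preference."""
    genres = np.asarray(rated_item_genres, dtype=float)   # shape (n_items, 5)
    weights = np.asarray(ratings, dtype=float)
    genre_pref = weights @ genres / weights.sum()         # assumed weighting
    gender_bit = np.array([1.0 if gender == "M" else 0.0])
    return np.concatenate([encode_age(age), gender_bit, genre_pref])


def cosine(p, q):
    """Cosine similarity between two profile vectors (an assumed measure)."""
    return float(p @ q / (np.linalg.norm(p) * np.linalg.norm(q)))


# Source user and two target users with hypothetical ratings and genres.
p_a = build_profile(22, "M", [[1, 0, 1, 0, 0], [1, 0, 1, 0, 0]], [5, 4])
p_b = build_profile(20, "M", [[1, 0, 1, 0, 0]], [5])
p_c = build_profile(60, "F", [[0, 1, 0, 0, 1]], [5])

sim_ab = cosine(p_a, p_b)
sim_ac = cosine(p_a, p_c)
```

Users with the same age range, the same gender and the same preferred genres obtain a similarity close to 1, whereas users who differ in all three obtain a similarity of 0, which mirrors the behavior the example relies on.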

From Figs. 1, 2 and 3, some observations can be made as follows:

In the source domain, user $u_1^s$ provided high ratings on items $i_1$ and $i_3$ (see Fig. 1), and items $i_1$ and $i_3$ contain genres $g_1$ and $g_3$ (see Fig. 3, wherein the values of $(i_1, g_1)$, $(i_1, g_3)$, $(i_3, g_1)$ and $(i_3, g_3)$ are 1). Therefore, we can say that user $u_1^s$ comparatively prefers genres $g_1$ and $g_3$.

For a detailed explanation, we consider the following three scenarios from the target domain:

Scenario 1: In the target domain, user $u_7^t$ provided high ratings on items $i_{10}$ and $i_{12}$. Both items contain genres $g_1$ and $g_3$, so we can also say that user $u_7^t$ comparatively prefers genres $g_1$ and $g_3$. In terms of demographic information, the ages of $u_1^s$ and $u_7^t$ belong to the same age group, and their gender is also the same. Although the two users belong to distinct domains, users $(u_1^s, u_7^t)$ are similar in terms of their characteristics and behavior, so the similarity between the two user profiles is $S(P_1^s, P_7^t) \approx 1$. Moreover, their vectors of latent factors (shown in Fig. 2) are also similar, i.e., $U_1^s (U_7^t)^T \approx 1$. So, the difference $S(P_1^s, P_7^t) - U_1^s (U_7^t)^T \approx 0$. If the difference does not tend to zero, then an error is present.

Fig. 1 Illustration of source and target domains user-item rating matrices

Fig. 2 Representation of users latent factor in a 2-D matrix (f = 5)

Scenario 2: In scenario 2, user $u_9^t$ provided high ratings on items $i_{11}$ and $i_{12}$, and both items contain genres $g_1$ and $g_3$. In this scenario, we can also say that user $u_9^t$ comparatively prefers genres $g_1$ and $g_3$. In terms of demographic information, the ages of $u_1^s$ and $u_9^t$ belong to the same age group, and their gender is also the same. In this scenario, users $(u_1^s, u_9^t)$ are also similar, so $S(P_1^s, P_9^t) \approx 1$. In terms of latent vectors, however, $U_1^s (U_9^t)^T \approx 1$ does not hold. In this situation, we have to adjust the latent factor values so that the error can be minimized.

Scenario 3: Similarly, user $u_{10}^t$ provided high ratings on items $i_8$ and $i_9$. Both items contain genres $g_2$ and $g_5$. The ages of $u_1^s$ and $u_{10}^t$ are dissimilar, and their genders are opposite. In this scenario, the user profile similarity $S(P_1^s, P_{10}^t)$ tends to zero, and the latent factors of the two users should not be similar either. However, the condition $U_1^s (U_{10}^t)^T \approx 1$ is satisfied. If we calculate the error between the user profile similarity and the latent factor vectors, $S(P_1^s, P_{10}^t) - U_1^s (U_{10}^t)^T$, it must be high. Therefore, we again have to adjust the latent factor values so that the error can be minimized.

These scenarios motivate the objective of our proposed work. How to build a user profile, how transfer learning is carried out in the CDRSs framework, and how to minimize the error are all described in Section 4.

3 Literature review

In this section, we briefly describe recommender systems and their techniques, the limitations of these techniques, and the existing solutions that leverage knowledge from additional domains. After that, we describe related works in detail.

RSs have been successfully applied in many areas such as movies [3, 28, 30], social networks [31–33], music [34], books [35, 36], medical science [37, 38], e-learning [39], etc. Several RS techniques have been proposed for making recommendations to users. According to the literature review paper [5], there are three filtering techniques: content-based, collaborative and hybrid. Content-based filtering approaches analyze a set of descriptions of items previously rated by the user and then create a profile of that user's interests on the basis of the features of the rated items. Collaborative filtering (CF) focuses on explicit ratings given by users as their opinions on seen items. The main hypothesis of CF is that if two users were similar in the past, they will be similar in the future. The hybrid RS technique combines both filtering techniques and makes use of the advantages of each. CF has gained prominence among the filtering techniques in recent years because it is more versatile.

Fig. 3 Representation of demographical data and content information in a 2-D matrix

CF [1, 3, 4] can be classified into two categories: memory-based and model-based. The former category of methods focuses on a similarity strategy between co-rated users or items, followed by top-K selection and then a weighted average of similarities with other co-rated users or items for prediction. Various modified similarity formulae [6, 7, 40] have been proposed to calculate enriched similarity between users or items. Although memory-based CF provides good performance, it is not suitable for large databases. In model-based methods, we first build a model from historical records, and predictions are then made. Various models [8, 10–13] based on the latent factor model have been proposed. Matrix factorization (MF) [9] is one of the most popular models in model-based CF. MF tries to characterize users and items by a small number of latent factors inferred from ratings. This method was very popular in 2009 and played a central role in the Netflix competition.¹ Although CF has achieved great success in recent years, it still suffers from the sparsity problem and the cold-start user/item problem [41]. The reason for the sparsity problem is that users provide ratings on a limited number of items out of millions. The item cold-start problem occurs when a new item has been added to the system, so no ratings are available for it. Similarly, the user cold-start problem occurs when a user has just entered the system and has not yet provided any ratings on items.

To handle the cold-start user problem, the approach of [42] is to recommend a few items to a cold-start user and use the feedback to learn a profile. The learned profile can then be used to make good recommendations to the cold-start user. A key question is how to select the items to recommend to a cold-start user. In the literature review paper [43], the authors classified the relevant studies on the cold-start user problem into three groups: 1) those that make use of additional data sources; 2) those that select the most prominent groups of analogous users; 3) those that enhance the prediction using hybrid methods. A limitation of the first group is that additional data sources (tags, demographic information, etc.) are needed; if these are not available, the approach fails to address the problem. In the second group, a brute-force algorithm has to be used to find the optimal number of groups, which is not convenient. In the third group, i.e., hybrid methods, the computation of similarities is too expensive.

¹ https://www.netflixprize.com

To handle the sparsity problem of CF, several researchers have used additional in-domain feedback such as likes/dislikes [15, 16, 18], user reviews [17], history records [18], etc. to mitigate the sparsity problem. In two literature review papers [20, 44], several authors have focused on one or more additional domains and proposed methods that leverage knowledge from multiple source domains in order to increase the performance of target prediction by transfer learning [23]. This type of strategy is called Cross-domain Recommender Systems (CDRSs).

Cremonesi et al. [24] classified CDRSs into three categories based on the overlap of users/items: full users/items overlap, partial users/items overlap, and no overlap of users and items. For the case of full or partial users/items overlap, the first paper was presented by [45] in 2008. Similarly, [46] proposed a method for CDRSs that aggregates user rating vectors from different domains and applies traditional memory-based CF. Hu et al. [47] proposed a cross-domain version of matrix factorization, in which an augmented user-item rating matrix is constructed by horizontally concatenating all matrices. These methods use multi-task transfer learning, where both domains are used simultaneously. Cremonesi et al. [24] modeled the classical similarity relationships (with partially overlapping users and/or items) as a directed graph and explored all possible paths connecting users or items in order to find new cross-domain relationships. However, all the above-mentioned methods rely on overlapping users or items across domains for knowledge transfer, which is not a realistic setting because finding the same user or item in two distinct domains is too difficult.

In the third category, i.e., no overlap of users and items, [19] proposed a method named Codebook Transfer (CBT), based on cluster-level rating patterns, for knowledge transfer between domains. CBT is fully based on the rating pattern: it extracts the rating pattern from the source domain using a two-way k-means algorithm and then transfers it to the target domain. The authors assumed that two different domains have similar rating patterns. In a practical scenario, however, every domain has its own rating distribution, which cannot be assumed to be the same. Rather than focusing on rating pattern transfer, several researchers [25–28] have also addressed the non-overlap category by using additional tag information, assuming that tags overlap in both domains. The limitation of tag-based transfer learning methods in CDRSs is that finding overlapping tags between domains is too expensive. Table 1 shows the classification of methods based on the type of knowledge exploited, from either additional feedback (in-domain) or additional related domains.

3.1 Related work

Various recommendation techniques have been developed for single-domain as well as cross-domain recommender systems. This paper focuses on efficient cross-domain recommendation using the user profile, which acts as a bridge for knowledge transfer from a source domain. According to the literature studied, no one has used the user profile in the CDRSs framework. In this section, we summarize the works that are most representative and relevant to this study.

The existing work can be classified into two categories: without transfer and with transfer. In the former category, researchers [5, 8, 12] have focused on a single specific domain only, i.e., no transfer learning is applied. Here, collaborative filtering is one of the best state-of-the-art techniques. As mentioned earlier, the two types of CF are memory-based and model-based. One of the traditional memory-based methods is kNN-CF, where the top-K neighbors are found based on the similarities between an active user and other users. The similarity can be calculated by the Pearson correlation formula [40] as:

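$$sim_{a,u} = \frac{\sum_{i \in I_{a,u}} (r_{a,i} - \bar{r}_a)(r_{u,i} - \bar{r}_u)}{\sqrt{\sum_{i \in I_{a,u}} (r_{a,i} - \bar{r}_a)^2} \, \sqrt{\sum_{i \in I_{a,u}} (r_{u,i} - \bar{r}_u)^2}} \tag{1}$$

After calculating the similarity, the prediction is made by a weighted average strategy as follows:

$$r_{a,i} = \bar{r}_a + \frac{\sum_{u=1}^{topK} sim_{a,u} (r_{u,i} - \bar{r}_u)}{\left| \sum_{u=1}^{topK} sim_{a,u} \right|} \tag{2}$$

where $a, u \in M$, $i \in N$, $sim_{a,u}$ represents the similarity between users $a$ and $u$, and $I_{a,u}$ is the set of items co-rated by users $a$ and $u$. $topK$ denotes the set of the $k$ neighbor users that are most similar to user $a$. $\bar{r}_a$ and $\bar{r}_u$ are the mean ratings of users $a$ and $u$ over all their rated items, respectively.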
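The two formulas above can be sketched in Python as follows. This is a minimal illustration rather than a reference implementation: the toy rating matrix, the use of `NaN` for unrated entries, the restriction of neighbors to users who rated the target item, and the function names are all assumptions made here.

```python
import numpy as np


def pearson(R, a, u):
    """Pearson similarity of Eq. (1) over the items co-rated by users a and u."""
    co = ~np.isnan(R[a]) & ~np.isnan(R[u])
    if not co.any():
        return 0.0
    # Deviations from each user's mean over all of that user's rated items.
    da = R[a, co] - np.nanmean(R[a])
    du = R[u, co] - np.nanmean(R[u])
    denom = np.sqrt(np.sum(da ** 2)) * np.sqrt(np.sum(du ** 2))
    return float(np.sum(da * du) / denom) if denom > 0 else 0.0


def predict(R, a, i, top_k=2):
    """Weighted-average prediction of Eq. (2) from the top_k most similar users."""
    mean_a = np.nanmean(R[a])
    # Assumption: only users who actually rated item i can act as neighbors.
    candidates = [(pearson(R, a, u), u)
                  for u in range(R.shape[0])
                  if u != a and not np.isnan(R[u, i])]
    neighbours = sorted(candidates, reverse=True)[:top_k]
    num = sum(s * (R[u, i] - np.nanmean(R[u])) for s, u in neighbours)
    den = abs(sum(s for s, _ in neighbours))
    return mean_a if den == 0 else mean_a + num / den


# Toy user-item matrix (rows: users, columns: items); NaN marks an unrated item.
R = np.array([[5.0, 4.0, np.nan, 1.0],
              [5.0, 5.0, 4.0, 1.0],
              [1.0, 2.0, 2.0, 5.0]])

sim_01 = pearson(R, 0, 1)
sim_02 = pearson(R, 0, 2)
pred = predict(R, 0, 2, top_k=1)
```

With a single neighbor the similarity cancels, so the prediction equals the active user's mean plus the neighbor's deviation from their own mean. Because Eq. (2) normalizes by the absolute value of the summed similarities, neighbors with similarities of opposite sign can make the denominator very small, which is one reason the choice of neighbors matters.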

Because the available datasets are sparse, calculating the similarity to every user is not a good idea. As another solution, User Preference Clustering (UPC-CF) has been proposed by [7]. UPC-CF is based on user preference clustering to reduce the impact of data sparsity. The authors used three clusters, $C_o$, $C_p$ and $C_n$, which represent the optimistic user group, the pessimistic user group and the neutral user group, respectively. The intuition of the UPC-CF method is that users could have starkly different views on an item. The clustering is done based on the mean rating of a user with threshold values; for instance, the threshold values (based on the numerical rating range 1–5) are $\{\{1, 2, 3\}, \{4, 5\}, \{3\}\}$ for the pessimistic, optimistic and neutral user groups, respectively. If the mean rating of user $u$ is 4.12, then user $u$ belongs to the optimistic user group. After all users are categorized based on their mean ratings, the cluster center of each of the three clusters is calculated as:

$$C_{*} = u \Leftarrow \max |I_u| \tag{3}$$

Table 1 Classification of methods based on types of knowledge exploitation from either additional feedbacks (in-domain) or additional other related domains

Columns: Research paper | Additional domains based on the overlap of users and items (fully users/items, partial users/items, non-overlap users and items) | Additional data

Winoto and Tang [45], 2008: ✓
Li et al. [19], 2009: ✓* (Rating pattern)
Pan et al. [48], 2010: ✓
Cremonesi et al. [24], 2011: ✓
Shi et al. [25], 2011: ✓ (Tags)
Pan and Yang [15], 2013: ✓
Enrich et al. [26], 2013: ✓ (Tags)
Hu et al. [47], 2013: ✓
Fernández-Tobías [27], 2014: ✓ (Tags)
Xin et al. [17], 2015: ✓
Fang et al. [49], 2016: ✓ (Tags)
Zhao et al. [50], 2017: ✓
Zhu et al. [21], 2017: ✓
Sahu et al. [28], 2018: ✓ (Tags)
Yu et al. [32], 2018: ✓


Here, the cluster center is the user who has provided the maximum number of ratings in the specific cluster of users. After the cluster centers of all three clusters are calculated, the preference of an active user is identified using the modified similarity equation:

$$sim^{UPC}_{a,u} = \exp\left( - \frac{\sum_{i \in I_{a,u}} (r_{a,i} - r_{u,i}) \cdot |\bar{r}_a - \bar{r}_u|}{|I_{a,u}|} \right) \cdot \frac{|I_a \cap I_u|}{|I_a \cup I_u|} \tag{4}$$

Prediction can be calculate from (2) by replacing thesimilarity formula. An advantage of UPC-CF is that it doesnot require to calculate similarities from all users.

Another paper in same direction has been proposed by[12] with different strategies to find topK neighbors. Theauthors have proposed a modified version of memory-based CF name as Neighbor Users by Subspace Clustering(NUSC-CF). The authors have tried to find users in thecorresponding subspaces of items. An intubation is thatusers grouped under the same cluster share similar interests.These subspaces are then used to find a tree of neighborusers. The amount of similarity of every user to the targetuser determines his position on the tree. A disadvantageof NUSC-CF is that number of subspaces may growexponentially, and it depends on total number of items indataset.

In the model-based approach, MF, as mentioned earlier, tries to describe users' tastes and items' information in a small number of latent factors. These latent factors can be estimated through a learning mechanism [29] using historical records. After both latent factor matrices have been estimated, a prediction is made by taking the inner product of the corresponding user and item latent factors.

In traditional MF, rather than directly applying a learning algorithm to estimate the latent factors of users and items, we first remove the user and item biases from the ratings, i.e., the bias of critical users (who rate more critically than others) and of highly popular items, so that the values can be estimated more precisely. This baseline is the Average Filling with bias (AF with bias) method [51] of MF. The user and item biases are estimated as:

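$$ \min_{b_u, b_i} \sum_{u=1}^{m} \sum_{i=1}^{n} I_{u,i} \odot \left\{ \frac{1}{2}\left(r_{u,i} - \mu - b_u - b_i\right)^2 + \frac{\lambda_{b_u}}{2} b_u^2 + \frac{\lambda_{b_i}}{2} b_i^2 \right\} \tag{5} $$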

where b_u, b_i and μ are the user bias, the item bias and the average of all available ratings in the dataset, respectively. λ_{b_u} and λ_{b_i} are regularization parameters that control over-fitting.

After removing the biases, the latent factor matrices U and V can be estimated from the user-item rating matrix. The loss function in MF is as follows:

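$$ \min_{U,V} \frac{1}{2} \left\| I \odot \left( Y - U V^{\top} \right) \right\|_F^2 + \frac{\lambda_u}{2} \|U\|_F^2 + \frac{\lambda_v}{2} \|V\|_F^2 \tag{6} $$

where ‖x‖_F is the Frobenius norm and y_{u,i} ∈ Y is the bias-free rating, y_{u,i} = r_{u,i} − b_u − b_i − μ.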
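As an illustration of the loss in (6), the following minimal NumPy sketch evaluates the masked, regularized squared error for given factor matrices. The function name `mf_loss` and the toy numbers are hypothetical and are not taken from the paper; the sketch assumes Y already holds the bias-free ratings and I is a 0/1 mask of observed entries.

```python
import numpy as np

def mf_loss(Y, I, U, V, lam_u, lam_v):
    """Masked, regularized squared-error loss in the form of Eq. (6)."""
    residual = I * (Y - U @ V.T)              # unobserved entries contribute zero
    return (0.5 * np.sum(residual ** 2)
            + 0.5 * lam_u * np.sum(U ** 2)    # (lambda_u / 2) * ||U||_F^2
            + 0.5 * lam_v * np.sum(V ** 2))   # (lambda_v / 2) * ||V||_F^2

# Toy example (hypothetical values): 2 users, 2 items, 1 latent factor.
Y = np.array([[1.0, 0.0], [0.0, 2.0]])
I = np.array([[1.0, 0.0], [0.0, 1.0]])        # only the diagonal is observed
U = np.array([[1.0], [1.0]])
V = np.array([[1.0], [1.0]])
loss = mf_loss(Y, I, U, V, lam_u=0.1, lam_v=0.1)
```

Only the observed entry (2, 2) has a non-zero residual here, so the data term is 0.5 and the two regularizers add 0.1 each.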

In the latter category, i.e., with transfer, CBT is one of the state-of-the-art techniques that use transfer learning. An advantage of CBT is that it does not require overlapping users or items from the source domain. Most memory-based CF methods rely on similarities estimated from the items co-rated by users. If there are not enough co-rated items, the similarity value may not be accurate; therefore, some researchers [5] have filled unknown ratings with the mean of the known ratings. However, this is not a good idea. So, [19] have proposed the codebook transfer approach, wherein ratings are filled using a compact codebook extracted from another related source domain. After the unknown ratings are filled, traditional kNN-CF is applied. CBT has two phases: extraction of the codebook and expansion of the codebook.

Figure 4 shows an example of the CBT approach in which two rating matrices are used, one for the source domain and one for the target domain. In the extraction phase, the rating matrix of the source domain is permuted through 2-way k-means clustering [52], and a compact user-item rating pattern, called the codebook, is then extracted. In the second phase, the extracted codebook is expanded into the target domain. The filled missing rating entries are shown in Fig. 4 with the '*' symbol in the target domain.

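The objective function of CBT is:

$$ \min_{U^s, B, V^s} \left\| I^s \odot \left( R^s - U^s B V^{s\top} \right) \right\|_F \tag{7} $$

$$ \text{s.t.} \quad U^{s\top} U^s = I, \quad V^{s\top} V^s = I $$

After minimizing (7), the codebook is constructed as follows:

$$ B = \left[ U^{s\top} R^s V^s \right] \oslash \left[ U^{s\top} \mathbf{1} \mathbf{1}^{\top} V^s \right] \tag{8} $$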

where ⊘ means entry-wise division. Equation 8 averages all the ratings in each user-item co-cluster to form an entry in the codebook, i.e., the cluster-level rating pattern. U^s, U^t, V^s, V^t ∈ {0, 1} represent cluster indicator matrices. After the extraction phase of CBT on the source domain, the same procedure is applied in reverse order to transfer the codebook to the target domain, and the traditional kNN-CF method is then applied for prediction. The limitation of CBT is that it shares the same cluster-level rating pattern across domains, which is not realistic in practice because the distribution of ratings in distinct domains may not follow the same cluster-level rating pattern.
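The codebook construction in (8) can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the source matrix is taken to be fully observed, the 0/1 indicator matrices are given (in CBT they come from co-clustering), every co-cluster is non-empty, and the name `build_codebook` and the toy matrices are hypothetical.

```python
import numpy as np

def build_codebook(R, U, V):
    """Average rating of every user-cluster / item-cluster block, as in Eq. (8).

    R: (users x items) rating matrix, assumed fully observed.
    U: (users x user_clusters) 0/1 indicator matrix.
    V: (items x item_clusters) 0/1 indicator matrix.
    """
    ones = np.ones_like(R)            # plays the role of 1 1^T
    block_sums = U.T @ R @ V          # sum of ratings in each co-cluster
    block_sizes = U.T @ ones @ V      # number of entries in each co-cluster
    return block_sums / block_sizes   # entry-wise division

# Toy source domain: users 0-1 and 2-3 form two clusters, likewise for items.
R = np.array([[5, 5, 1, 1],
              [5, 5, 1, 1],
              [2, 2, 4, 4],
              [2, 2, 4, 4]], dtype=float)
U = np.array([[1, 0], [1, 0], [0, 1], [0, 1]], dtype=float)
V = np.array([[1, 0], [1, 0], [0, 1], [0, 1]], dtype=float)
B = build_codebook(R, U, V)
```

Because each block of the toy matrix is constant, the resulting codebook simply holds one rating per block.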



Fig. 4 An example of the CBT method using transfer learning

Table 2 summarizes the related works. In previous related works, modifications and enhancements of collaborative filtering are mainly embodied in three scenarios: the similarity measure followed by topK neighbor selection [3, 6, 7], the latent factor model [9, 14], and knowledge transfer [19] from another domain. All of these methods try to mitigate the sparsity problem of CF under some assumptions. Although researchers have tried to address the sparsity problem, it remains a challenging and open problem of RSs.

This paper falls into two of these scenarios: knowledge transfer and the latent factor model. We propose a novel method for CF that mitigates the sparsity problem and enhances the performance of rating prediction by using another related domain through a transfer learning strategy in the CDRSs framework. As mentioned earlier, leveraging knowledge from other related domains requires some overlapping information between domains. Here, we use the user profile as a bridge even when no users or items overlap between domains. The other aspect is the latent factor model, i.e., how to learn the hidden factors of users and items. For this we use a probabilistic matrix factorization model, which helps to learn the hidden factors efficiently. Several authors have focused on one of the first two scenarios, i.e., neighbor selection and the latent factor model. Our method differs from the related work by combining the last two scenarios (latent factor model and transfer learning).

4 Proposed UP-CDRSs method

In this section, we describe our proposed UP-CDRSs method, which consists of four folds: 1) build a user profile in both domains; 2) calculate the similarity between user profiles of distinct domains; 3) merge both domains into a single set and build a probabilistic model to find the relationships between variables, then solve it using probability theory to obtain the objective function, which is minimized by the alternating least squares approach; 4) in the last fold, predict the ratings of unrated items in the target domain. Figure 5 shows the architecture of the UP-CDRSs method.

4.1 Build user profile

Firstly, we build a user profile using the demographical information of a user, the explicit ratings provided by the user, and the content information of the user-rated items. The block diagram for building a user profile is shown in Fig. 6. We capture the user's preferences (in the form of a vector) using the explicit ratings and the content information of the items rated by the user. After that, we augment the user's preferences vector with the demographical information. Figure 7 shows a user profile vector.

User's preferences: The benefit of capturing a user's preferences is that we learn the content preferences of the user. For capturing the user's preferences,



Table 2 Summary of previous approaches

| Research paper | Transfer learning applied | Approach | Remark |
|---|---|---|---|
| Candillier et al. [3], 2007 | No | kNN-CF: find the topK neighbors using Pearson correlation coefficients and apply a weighted average of the corresponding similarity and rating of the other co-rated users | Similarity calculation is too expensive; problem of data scalability |
| Salakhutdinov and Mnih [14], 2007 | No | cPMF: use the concept of hidden latent factors and represent a user and an item by a small number of hidden factors | Handles the problem of data scalability; handles the semi cold-start user problem, i.e., users who have not provided enough ratings |
| Koren et al. [9], 2009 | No | MF: use the concept of hidden latent factors and represent a user and an item by a small number of hidden factors | Handles the problem of data scalability; handles the problem of critical users and items by removing the bias from the ratings |
| Li et al. [19], 2009 | Yes | CBT: find the cluster-level rating pattern in a source domain and transfer it to the target domain, followed by the kNN-CF method | Handles the problem of data sparsity; too expensive to find the compact rating pattern |
| Zhang et al. [7], 2016 | No | UPC-CF: clustering is used to find similar user groups; three clusters represent the optimistic, pessimistic and neutral user groups, followed by a modified kNN method | Mitigates the problem of calculating similarity to all other users in the dataset; may not work well on huge datasets |
| Koohi et al. [6], 2017 | No | NUSC-CF: find the users in the corresponding subspaces of items; these subspaces are then used to find a tree of neighbor users, and the similarity of every user to the target user determines that user's position in the tree | Finding subspaces is too expensive |

we use two formulas: Relative Genre Ratings (RGR) and Modified Relative Genre Frequency (MRGF) [53]. In this paper, we use genre information as content information because we focus on movie recommendations. After calculating both values, a particular user's preference is calculated as the harmonic mean of the two. The formulas of RGR and MRGF are as follows:

RGR: The ratio of user u's ratings on highly rated items of each genre to the user's total ratings.

$$ RGR(u, g) = \frac{GR(u, g)}{TR(u)} \tag{9} $$

$$ TR(u) = \sum_{s \in S_u} r_{u,s}, \quad \text{and} \quad GR(u, g) = \sum_{s \in G_g \subset S_u,\; r \ge 3} r_{u,s} $$

Here, TR is the total of the ratings of user u, S_u is the set of items rated by user u, and the genre rating (GR) is computed over the highly rated (r ≥ 3) items of genre G_g for user u.

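MRGF: Rather than focusing only on item ratings, the frequency of the genres preferred by a user is also an important concern. MRGF is the ratio of user u's ratings (with respect to frequency) on highly rated items of each genre to the total ratings (with respect to frequency).

$$ MRGF(u, g) = \frac{\sum_{s \in G_g \subset S_u} \delta_3(r_{u,s}) + 2\,\delta_4(r_{u,s}) + 3\,\delta_5(r_{u,s})}{3 \cdot TF(u)} \tag{10} $$

$$ GF(u, g) = \sum_{s \in G_g \subset S_u} \delta_k(r_{u,s}) \quad \text{s.t.} \quad k \in \{3, 4, 5\} $$

$$ \delta_k(r_{u,s}) = \begin{cases} 1 & k = r_{u,s} \\ 0 & k \ne r_{u,s} \end{cases}, \quad \text{and} \quad TF(u) = |S_u| $$

where |S_u| is the total number of ratings given by user u.

$$ \text{user's preference}(u, g) = \frac{2 \cdot nf \cdot RGR(u, g) \cdot MRGF(u, g)}{RGR(u, g) + MRGF(u, g)} \tag{11} $$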



Fig. 5 Architecture of the UP-CDRSs method

where nf is a normalization factor. User's preference(u, g) indicates how likely user u is to prefer genre g. Similarly, we calculate the preferences of user u for all genres in the given dataset.

Demographical information: When a user registers on the system, he/she provides some personal information,

Fig. 6 Block diagram of building the user profile

such as age, gender, country name, occupation, etc. This type of data is called demographic or demographical information.

In this paper, we use only age and gender as the demographical information of a user. Each user's age belongs to one of the group ranges < 1−17, 18−24, 25−34, 35−44, 45−49, 50−55, 56+ >. This information can be encoded in binary form; for instance, if the age of user u is 43, the binary encoded form is < 0, 0, 0, 1, 0, 0, 0 >. The other demographical information is gender, for which only two options are provided, i.e., 'M' and 'F' for male and female, respectively.

After capturing the user's preferences and extracting the demographical information, we concatenate them into a single vector called the user profile (refer to Fig. 7).

For better understanding, we consider an example of a movie recommender system (shown in Table 3). There are three users u1, u2 and u3 and ten movies i1, i2, ..., i10. Explicit ratings are expressed as numerical values from 1 to 5. Unrated items are shown as '?'. Here, the genre description is considered as content information. Five genres are



Fig. 7 Illustration of a user profile vector

given in the form < g1, g2, g3, g4, g5 >. If genre gi is present in the movie, it is denoted as 1, otherwise 0.

Firstly, we calculate RGR and MRGF values of all userson each genre using (9) and (10), respectively. Tables 4 and5 show RGR and MRGF values, respectively.
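The worked example can be reproduced with a short NumPy sketch. The data below are transcribed from Table 3 (0 stands for an unrated item); the function names are hypothetical. One assumption is made loudly here: the sketch divides the MRGF numerator by TF(u), not by 3·TF(u) as written in (10), because that is the variant that reproduces the values listed in Table 5, and consequently the preferences of Table 6 with nf = 0.8.

```python
import numpy as np

# Genre vectors <g1..g5> of items i1..i10, transcribed from Table 3.
G = np.array([
    [1, 0, 1, 0, 0], [0, 0, 1, 1, 0], [1, 0, 1, 0, 0], [0, 1, 0, 0, 1],
    [0, 0, 1, 1, 0], [1, 0, 0, 0, 1], [0, 0, 1, 1, 0], [0, 1, 1, 0, 1],
    [0, 1, 0, 0, 1], [0, 0, 1, 1, 0],
])
# Ratings of u1..u3 (rows) on i1..i10 (columns); 0 marks an unrated item.
R = np.array([
    [5, 0, 4, 0, 3, 2, 4, 0, 0, 3],
    [0, 3, 0, 4, 0, 5, 0, 5, 4, 2],
    [0, 2, 0, 5, 0, 4, 0, 4, 0, 0],
])

def rgr(R, G, threshold=3):
    """Relative Genre Ratings, Eq. (9): GR(u, g) / TR(u)."""
    high = np.where(R >= threshold, R, 0)     # keep only ratings >= 3
    return (high @ G) / R.sum(axis=1, keepdims=True)

def mrgf(R, G):
    """Frequency-based score; divides by TF(u) (assumption, matches Table 5)."""
    weights = np.select([R == 3, R == 4, R == 5], [1, 2, 3], default=0)
    tf = (R > 0).sum(axis=1, keepdims=True)   # number of rated items, TF(u)
    return (weights @ G) / tf

def preferences(R, G, nf=0.8):
    """Scaled harmonic mean of RGR and MRGF, Eq. (11)."""
    a, b = rgr(R, G), mrgf(R, G)
    out = np.zeros_like(a, dtype=float)
    np.divide(2 * nf * a * b, a + b, out=out, where=(a + b) > 0)
    return out

prefs = preferences(R, G)
```

Rows of `prefs` correspond to u1, u2 and u3; genres that a user never rated highly get a preference of 0 rather than a division by zero.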

After calculating both values, the users' preferences can be calculated using (11). Table 6 shows the users' preferences. The value nf = 0.8 is used. We can observe from Table 6 that user u1 comparatively prefers genre g3 (0.9030), and similarly user u3 comparatively prefers genre g5 (0.9274).

After calculating the users' preferences, we extract the demographical information. In Fig. 3, the age and gender of users are described. We consider seven age groups < a1, a2, a3, a4, a5, a6, a7 >. The ranges of the groups are < 1−17, 18−24, 25−34, 35−44, 45−49, 50−55, 56+ >. Each user's age belongs to a particular group; for instance, if the age of user u is 39, then the encoded vector is < 0, 0, 0, 1, 0, 0, 0 >. The other demographical information is gender, for which only two options exist, i.e., 'M' and 'F' for male and female, respectively. Here, we use a single scalar value in binary form (1 and 0). Combining both pieces of information into a single vector gives a vector of size eight; for instance, u1.age = 18 and u1.gender = 'M', so the demographic information in vector form is < 0, 1, 0, 0, 0, 0, 0, 1 >.

After calculating both kinds of information, we again concatenate them into a single vector called the user profile ($P_i^k$) vector. For instance, the user's preferences vector of user u1 is < 0.4528, 0, 0.9030, 0.4444, 0 >, and the demographic information in vector

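Table 3 An example of a movie-domain rating matrix

| Item | Genres <g1, g2, g3, g4, g5> | u1 | u2 | u3 |
|---|---|---|---|---|
| i1 | <1,0,1,0,0> | 5 | ? | ? |
| i2 | <0,0,1,1,0> | ? | 3 | 2 |
| i3 | <1,0,1,0,0> | 4 | ? | ? |
| i4 | <0,1,0,0,1> | ? | 4 | 5 |
| i5 | <0,0,1,1,0> | 3 | ? | ? |
| i6 | <1,0,0,0,1> | 2 | 5 | 4 |
| i7 | <0,0,1,1,0> | 4 | ? | ? |
| i8 | <0,1,1,0,1> | ? | 5 | 4 |
| i9 | <0,1,0,0,1> | ? | 4 | ? |
| i10 | <0,0,1,1,0> | 3 | 2 | ? |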

form is < 0, 1, 0, 0, 0, 0, 0, 1 >. Thereafter, we concatenate both vectors into a single one, i.e., < 0.4528, 0, 0.9030, 0.4444, 0, 0, 1, 0, 0, 0, 0, 0, 1 >.
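The profile construction described above (one-hot age group, binary gender, concatenation with the preference vector) can be sketched as follows. The helper names are hypothetical, and the age-bin lower bounds assume the fourth group starts at 35, as in the MovieLens age coding.

```python
import numpy as np

# Lower bounds of the seven age groups a1..a7 (assumed: 1-17, 18-24, 25-34,
# 35-44, 45-49, 50-55, 56+).
AGE_BIN_LOWER = [1, 18, 25, 35, 45, 50, 56]

def encode_age(age):
    """One-hot vector of length 7 marking the age group of `age` (age >= 1)."""
    vec = [0] * len(AGE_BIN_LOWER)
    idx = max(i for i, lower in enumerate(AGE_BIN_LOWER) if age >= lower)
    vec[idx] = 1
    return vec

def encode_gender(gender):
    """Single binary value: 1 for 'M', 0 for 'F'."""
    return [1 if gender == "M" else 0]

def build_profile(preferences, age, gender):
    """Concatenate genre preferences with the demographic encoding."""
    return np.concatenate([preferences, encode_age(age), encode_gender(gender)])

# User u1 of the running example: preferences from Table 6, age 18, gender 'M'.
profile_u1 = build_profile([0.4528, 0, 0.9030, 0.4444, 0], age=18, gender="M")
```

With five genres the profile has 5 + 7 + 1 = 13 entries; with the eight overlapping genres used in the experiments the same construction yields 16 entries.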

4.2 Similarity calculation between user profiles

In the second fold, we calculate the similarity $S(P_i^s, P_{i'}^t)$ between all user profiles $P_i^s$ and $P_{i'}^t$ from domains s and t, respectively. The similarity is calculated using the cosine formula as follows:

$$ S(P_i^s, P_{i'}^t) = \frac{P_i^s \cdot P_{i'}^t}{\left\| P_i^s \right\| \left\| P_{i'}^t \right\|} \tag{12} $$
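A vectorized sketch of (12), computing the full source-by-target similarity matrix in one step. The function name `profile_similarity`, the `eps` guard against all-zero profiles, and the toy profiles are assumptions made for illustration.

```python
import numpy as np

def profile_similarity(P_s, P_t, eps=1e-12):
    """Cosine similarity between every source profile and every target profile.

    P_s: (M_s x d) source-domain profiles, P_t: (M_t x d) target-domain profiles.
    Returns an (M_s x M_t) matrix S with S[i, i2] as in Eq. (12).
    """
    P_s = np.asarray(P_s, dtype=float)
    P_t = np.asarray(P_t, dtype=float)
    norms = (np.linalg.norm(P_s, axis=1)[:, None]
             * np.linalg.norm(P_t, axis=1)[None, :])
    return (P_s @ P_t.T) / np.maximum(norms, eps)   # eps avoids division by zero

# Toy profiles (hypothetical values).
S = profile_similarity([[1, 0], [1, 1]], [[1, 0], [0, 1]])
```

Since profile entries are non-negative, every similarity lies between 0 and 1.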

4.3 Probabilistic graphical model of UP-CDRSs method

Figure 8 shows the probabilistic graphical model [29, 54], where each node represents a random variable and the links express probabilistic relationships between these variables. The random variables in our proposed method are $U_i^k$, $V_j^k$, $R_{i,j}$ and $S(P_i^s, P_{i'}^t)$ for the user latent factors, item latent factors, ratings and similarity between user profiles, respectively. How the variables are related to one another is shown in Fig. 8. According to graphical model theory, the joint distribution of all variables is expressed as:

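$$
\begin{aligned}
p(U^s, V^s, U^t, V^t, R^s, R^t, S, \sigma_{U^s}, \sigma_{V^s}, \sigma_{U^t}, \sigma_{V^t}, \sigma_s, \sigma_t, \sigma_p) ={}& p(R^s \mid U^s, V^s, \sigma_s)\, p(R^t \mid U^t, V^t, \sigma_t)\, p(S \mid U^s, U^t, \sigma_p)\, p(U^s \mid \sigma_{U^s}) \\
& p(V^s \mid \sigma_{V^s})\, p(U^t \mid \sigma_{U^t})\, p(V^t \mid \sigma_{V^t})\, p(S \mid \sigma_p) \\
& P(\sigma_{U^s})\, P(\sigma_{V^s})\, P(\sigma_{U^t})\, P(\sigma_{V^t})\, P(\sigma_s)\, P(\sigma_t)\, P(\sigma_p)
\end{aligned}
\tag{13}
$$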

Neglecting the constant prior probabilities in (13), the log-posterior probability over the latent variables is:

Table 4 RGR values of an example

| User # | TR | GR (r >= 3) | RGR |
|---|---|---|---|
| u1 | 21 | <9, 0, 19, 10, 0> | <0.4286, 0, 0.9048, 0.4762, 0> |
| u2 | 23 | <5, 13, 8, 3, 18> | <0.2174, 0.5652, 0.3478, 0.1304, 0.7826> |
| u3 | 15 | <4, 9, 4, 0, 13> | <0.2667, 0.6000, 0.2667, 0, 0.8667> |



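$$
\begin{aligned}
\log p(U^s, V^s, U^t, V^t \mid R^s, R^t, S, \sigma_{U^s}, \sigma_{V^s}, \sigma_{U^t}, \sigma_{V^t}, \sigma_s, \sigma_t, \sigma_p) \propto \log \big[ & p(R^s \mid U^s, V^s, \sigma_s)\, p(R^t \mid U^t, V^t, \sigma_t)\, p(S \mid U^s, U^t, \sigma_p) \\
& p(U^s \mid \sigma_{U^s})\, p(V^s \mid \sigma_{V^s})\, p(U^t \mid \sigma_{U^t})\, p(V^t \mid \sigma_{V^t})\, p(S \mid \sigma_p) \big]
\end{aligned}
\tag{14}
$$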

The conditional distribution over the observed ratings is:

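$$ p(R^k \mid U^k, V^k, \sigma_k) = \prod_{i=1}^{M_k} \prod_{j=1}^{N_k} \left[ \mathcal{N}\left( R_{i,j}^k \mid U_i^k V_j^{k\top}, \sigma_k \right) \right]^{I_{i,j}^k} \tag{15} $$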

where $I_{i,j}^k$ is a binary mask that equals 1 if user i rated movie j in domain k and 0 otherwise. The conditional distribution over the user profile similarities is:

$$ p(S \mid U^s, U^t, \sigma_p) = \prod_{i=1}^{M_s} \prod_{i'=1}^{M_t} \left[ \mathcal{N}\left( S(P_i^s, P_{i'}^t) \mid U_i^s U_{i'}^{t\top}, \sigma_p \right) \right] \tag{16} $$

where $\mathcal{N}(x \mid \mu, \sigma_k)$ denotes the probability density function of a Gaussian distribution with mean μ and variance σ_k. The zero-mean spherical Gaussian priors on the user and item vectors are:

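$$ p(U^k \mid \sigma_{U^k}) = \prod_{i=1}^{M_k} \mathcal{N}\left( U_i^k \mid 0, \sigma_{U^k} \right), \qquad p(V^k \mid \sigma_{V^k}) = \prod_{j=1}^{N_k} \mathcal{N}\left( V_j^k \mid 0, \sigma_{V^k} \right) \tag{17} $$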

Substituting (15), (16) and (17) into (14) for k ∈ {s, t}, we get:

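$$
\begin{aligned}
\log p(U^s, V^s, U^t, V^t \mid R^s, R^t, S, \sigma_{U^s}, \sigma_{V^s}, \sigma_{U^t}, \sigma_{V^t}, \sigma_s, \sigma_t, \sigma_p) = \log \Bigg[ & \prod_{i=1}^{M_s} \prod_{j=1}^{N_s} \left[ \mathcal{N}\left( R_{i,j}^s \mid U_i^s V_j^{s\top}, \sigma_s \right) \right]^{I_{i,j}^s} \prod_{i=1}^{M_t} \prod_{j=1}^{N_t} \left[ \mathcal{N}\left( R_{i,j}^t \mid U_i^t V_j^{t\top}, \sigma_t \right) \right]^{I_{i,j}^t} \\
& \prod_{i=1}^{M_s} \prod_{i'=1}^{M_t} \left[ \mathcal{N}\left( S(P_i^s, P_{i'}^t) \mid U_i^s U_{i'}^{t\top}, \sigma_p \right) \right] \prod_{i=1}^{M_s} \mathcal{N}\left( U_i^s \mid 0, \sigma_{U^s} \right) \prod_{j=1}^{N_s} \mathcal{N}\left( V_j^s \mid 0, \sigma_{V^s} \right) \\
& \prod_{i=1}^{M_t} \mathcal{N}\left( U_i^t \mid 0, \sigma_{U^t} \right) \prod_{j=1}^{N_t} \mathcal{N}\left( V_j^t \mid 0, \sigma_{V^t} \right) \Bigg] + C
\end{aligned}
\tag{18}
$$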

where C is a term containing the user profile variance, the rating variances and the prior variances. The constant term C does not

Table 5 MRGF values of an example

| User # | TF | GFδ3 | GFδ4 | GFδ5 | MRGF |
|---|---|---|---|---|---|
| u1 | 6 | <0, 0, 2, 2, 0> | <1, 0, 2, 1, 0> | <1, 0, 1, 0, 0> | <0.8333, 0, 1.5000, 0.6667, 0> |
| u2 | 6 | <0, 0, 1, 1, 0> | <0, 2, 0, 0, 2> | <1, 1, 1, 0, 2> | <0.5, 1.1667, 0.6667, 0.1667, 1.6667> |
| u3 | 4 | <0, 0, 0, 0, 0> | <1, 1, 1, 0, 2> | <0, 1, 0, 0, 1> | <0.5000, 1.2500, 0.5000, 0, 1.7500> |

depend on the parameters. Applying the product rule of logarithms to (18), we get:

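$$
\begin{aligned}
& -\frac{1}{2\sigma_s^2} \sum_{i=1}^{M_s} \sum_{j=1}^{N_s} I_{i,j}^s \left( R_{i,j}^s - U_i^s V_j^{s\top} \right)^2 - \frac{1}{2\sigma_t^2} \sum_{i=1}^{M_t} \sum_{j=1}^{N_t} I_{i,j}^t \left( R_{i,j}^t - U_i^t V_j^{t\top} \right)^2 \\
& -\frac{1}{2\sigma_p^2} \sum_{i=1}^{M_s} \sum_{i'=1}^{M_t} \left( S(P_i^s, P_{i'}^t) - U_i^s U_{i'}^{t\top} \right)^2 \\
& -\frac{1}{2\sigma_{us}^2} \sum_{i=1}^{M_s} U_i^s U_i^{s\top} - \frac{1}{2\sigma_{vs}^2} \sum_{j=1}^{N_s} V_j^s V_j^{s\top} - \frac{1}{2\sigma_{ut}^2} \sum_{i=1}^{M_t} U_i^t U_i^{t\top} - \frac{1}{2\sigma_{vt}^2} \sum_{j=1}^{N_t} V_j^t V_j^{t\top} + C
\end{aligned}
\tag{19}
$$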



Table 6 An example of users' preferences

| User # | Users' preferences |
|---|---|
| u1 | <0.4528, 0, 0.9030, 0.4444, 0> |
| u2 | <0.2424, 0.6092, 0.3657, 0.1171, 0.8521> |
| u3 | <0.2783, 0.6486, 0.2783, 0, 0.9274> |

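We assume that all variances are equal, i.e., $\sigma_s^2 = \sigma_t^2 = \sigma_p^2 = \sigma_{us}^2 = \sigma_{vs}^2 = \sigma_{ut}^2 = \sigma_{vt}^2$. Therefore, the objective function E is:

$$
\begin{aligned}
E(U^s, V^s, U^t, V^t) ={}& \frac{1}{2} \sum_{i=1}^{M_s} \sum_{j=1}^{N_s} I_{i,j}^s \left( R_{i,j}^s - U_i^s V_j^{s\top} \right)^2 + \frac{\lambda_s}{2} \left( \sum_{i=1}^{M_s} U_i^s U_i^{s\top} + \sum_{j=1}^{N_s} V_j^s V_j^{s\top} \right) \\
& + \frac{1}{2} \sum_{i=1}^{M_t} \sum_{j=1}^{N_t} I_{i,j}^t \left( R_{i,j}^t - U_i^t V_j^{t\top} \right)^2 + \frac{\lambda_t}{2} \left( \sum_{i=1}^{M_t} U_i^t U_i^{t\top} + \sum_{j=1}^{N_t} V_j^t V_j^{t\top} \right) \\
& + \frac{\alpha}{2} \sum_{i=1}^{M_s} \sum_{i'=1}^{M_t} \left( S(P_i^s, P_{i'}^t) - U_i^s U_{i'}^{t\top} \right)^2
\end{aligned}
\tag{20}
$$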

Fig. 8 Probabilistic graphical model of UP-CDRSs method

where λs and λt are constant trade-off parameters that avoid over-fitting by penalizing the magnitudes of the parameters, and α is a constant trade-off parameter that controls the influence of the user profile similarity.

To minimize the error E of the objective function (20), we have used the alternating least squares approach [51] to learn all the parameters.

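Let

$$ E(U_i^s) = \frac{1}{2} \sum_{j=1}^{N_s} I_{i,j}^s \left( R_{i,j}^s - U_i^s V_j^{s\top} \right)^2 + \frac{\lambda_s}{2} \left( \sum_{i=1}^{M_s} U_i^s U_i^{s\top} + \sum_{j=1}^{N_s} V_j^s V_j^{s\top} \right) + \frac{\alpha}{2} \sum_{i=1}^{M_s} \sum_{i'=1}^{M_t} \left( S(P_i^s, P_{i'}^t) - U_i^s U_{i'}^{t\top} \right)^2 $$

Its partial derivative is:

$$ \frac{\partial E}{\partial U_i^s} = - \sum_{j=1}^{N_s} I_{i,j}^s \left( R_{i,j}^s - U_i^s V_j^{s\top} \right) V_j^s - \frac{\alpha}{2} \sum_{i'=1}^{M_t} \left[ S(P_i^s, P_{i'}^t) - U_i^s U_{i'}^{t\top} \right] U_{i'}^t + \lambda_s U_i^s $$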

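Similarly, in the target domain, for $E(U_{i'}^t)$ of user i',

$$ \frac{\partial E}{\partial U_{i'}^t} = - \sum_{j=1}^{N_t} I_{i',j}^t \left( R_{i',j}^t - U_{i'}^t V_j^{t\top} \right) V_j^t - \frac{\alpha}{2} \sum_{i=1}^{M_s} \left[ S(P_i^s, P_{i'}^t) - U_i^s U_{i'}^{t\top} \right] U_i^s + \lambda_t U_{i'}^t $$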

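The item latent factors in the source and target domains can be optimized as follows:

$$ \frac{\partial E}{\partial V_j^s} = - \sum_{i=1}^{M_s} I_{i,j}^s \left( R_{i,j}^s - U_i^s V_j^{s\top} \right) U_i^s + \lambda_s V_j^s $$

$$ \frac{\partial E}{\partial V_j^t} = - \sum_{i=1}^{M_t} I_{i,j}^t \left( R_{i,j}^t - U_i^t V_j^{t\top} \right) U_i^t + \lambda_t V_j^t $$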

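So, $\frac{\partial E}{\partial U_i^s} = \frac{\partial E}{\partial U_{i'}^t} = \frac{\partial E}{\partial V_j^s} = \frac{\partial E}{\partial V_j^t} = 0$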

After calculating the gradients of all four parameters, the values are updated as follows:

$$ \theta \leftarrow \theta - \gamma \cdot \frac{\partial E}{\partial \theta} \tag{21} $$

where γ is the learning rate for the objective function E, and θ ∈ {U^s, V^s, U^t, V^t}.

This process is repeated iteratively until it converges to a locally optimal state.
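The objective (20), its gradients and the update rule (21) can be sketched in NumPy as below. This is a minimal illustration, not the authors' implementation: the function names, the random toy data and the default step size are assumptions, and the updates are plain simultaneous gradient steps. The gradients are derived directly from (20), so the similarity term carries a factor α where the printed derivative shows α/2; since α is a tuning constant this only rescales it.

```python
import numpy as np

def objective(params, Rs, Is, Rt, It, S, lam_s, lam_t, alpha):
    """Value of the objective E in Eq. (20)."""
    Us, Vs, Ut, Vt = params
    es = Is * (Rs - Us @ Vs.T)          # masked source residuals
    et = It * (Rt - Ut @ Vt.T)          # masked target residuals
    ep = S - Us @ Ut.T                  # profile-similarity residuals
    return (0.5 * np.sum(es ** 2) + 0.5 * np.sum(et ** 2)
            + 0.5 * lam_s * (np.sum(Us ** 2) + np.sum(Vs ** 2))
            + 0.5 * lam_t * (np.sum(Ut ** 2) + np.sum(Vt ** 2))
            + 0.5 * alpha * np.sum(ep ** 2))

def gradients(params, Rs, Is, Rt, It, S, lam_s, lam_t, alpha):
    """Exact gradients of Eq. (20) w.r.t. Us, Vs, Ut, Vt (in that order)."""
    Us, Vs, Ut, Vt = params
    es = Is * (Rs - Us @ Vs.T)
    et = It * (Rt - Ut @ Vt.T)
    ep = S - Us @ Ut.T
    return (-es @ Vs - alpha * ep @ Ut + lam_s * Us,
            -es.T @ Us + lam_s * Vs,
            -et @ Vt - alpha * ep.T @ Us + lam_t * Ut,
            -et.T @ Ut + lam_t * Vt)

def fit(Rs, Is, Rt, It, S, f=2, lam_s=0.002, lam_t=0.002, alpha=0.001,
        gamma=0.005, iters=100, seed=0):
    """Gradient descent with the update of Eq. (21); returns factors and E history."""
    rng = np.random.default_rng(seed)
    params = [0.1 * rng.standard_normal((n, f))
              for n in (Rs.shape[0], Rs.shape[1], Rt.shape[0], Rt.shape[1])]
    args = (Rs, Is, Rt, It, S, lam_s, lam_t, alpha)
    history = [objective(params, *args)]
    for _ in range(iters):
        grads = gradients(params, *args)
        params = [p - gamma * g for p, g in zip(params, grads)]
        history.append(objective(params, *args))
    return params, history

# Toy data (hypothetical): 4 source users x 5 items, 3 target users x 4 items.
rng = np.random.default_rng(1)
Rs = rng.integers(1, 6, size=(4, 5)).astype(float)
Is = np.ones_like(Rs)
Rt = rng.integers(1, 6, size=(3, 4)).astype(float)
It = (rng.random(Rt.shape) < 0.7).astype(float)
S = rng.random((4, 3))                  # stand-in for profile similarities
(Us, Vs, Ut, Vt), history = fit(Rs, Is, Rt, It, S)
R_hat = Ut @ Vt.T                       # target-domain predictions, Eq. (22)
```

In practice the learning rate and the number of iterations would be tuned on a validation set, and a convergence test on `history` would replace the fixed iteration count.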



4.4 Prediction

After learning all four parameters (θ) of the objective function E, the unknown ratings in the target domain are predicted as:

$$ \hat{R}_{i,j} = U_i^t V_j^{t\top} \tag{22} $$

5 Experiments and results

In this section, we first describe the datasets used in this paper, followed by the data preprocessing. After that, we describe the experiment protocols and evaluation metrics. Finally, we discuss the compared methods and summarize the experimental results.

5.1 Datasets

We have used two publicly available benchmark datasets for RSs: MovieLens² and EachMovie³. The two datasets are similar in terms of recommendations because both are used for movie recommender systems. A brief overview of the datasets follows:

– The MovieLens (ML) rating dataset contains 1,000,209 ratings of 3,952 movies rated by 6,040 users. Ratings are made on a 5-star scale, i.e., 1−5. Other information about users and items consists of demographical information (gender, age, occupation and zip code) and genre information, respectively. A user's age is given as one of seven group ranges, i.e., 〈1−17, 18−24, 25−34, 35−44, 45−49, 50−55, 56+〉, and a user's gender is given as 'M'/'F'. Each movie is described by a binary combination of eighteen genres.

– The EachMovie (EM) dataset contains 2,811,983 ratings of 1,628 movies given by 72,916 users. Rating scores are given on six levels < 0, 0.2, 0.4, 0.6, 0.8, 1 >. Other details of users and items consist of demographical information (gender, age and zip code) and genre information, respectively. A user's age is given as a numerical value, and a user's gender is given as 'M'/'F'. Each movie is described by a binary combination of ten genres.

5.2 Data preprocessing

This paper uses the transfer learning mechanism to exploit the knowledge of a source domain under the constraint that no users or items overlap between domains. Both mentioned

² https://grouplens.org/datasets/movielens/10m/
³ http://www.cs.cmu.edu/?lebanon/IR-lab.htm

datasets belong to movie recommender systems, so the same movies may appear in both. Figure 9 shows the cross-domain structure of both datasets. To obtain a non-overlap scenario, we have to remove the overlapping movies from one of the datasets. The preprocessing steps are as follows:

– Firstly, we find the movies that overlap in both datasets by movie name. In this process, 1,361 movies are found that belong to both datasets. These overlapping movies are discarded from one of the datasets. In our scenario, we chose ML and discarded the 1,361 overlapping movies. The remaining 2,591 movies are kept in ML (domain 1). After that, the rating matrix, in triplet form (userID, itemID, rating), is refined, and we keep only 589,395 out of 1,000,209 ratings.

– In the EM dataset (domain 2), we consider the same number of users as in domain 1. To this end, 6,040 users are chosen randomly with the constraint that they must have provided demographical information. After that, the rating matrix is refined, and only 284,886 out of 2,811,983 ratings are kept.

– To build a user profile, genre information and demographical information are used. In the case of genre information, only 7 genres overlap by name between domains: 'Action' ('Act'), 'Animation' ('Ani'), 'Comedy' ('Com'), 'Drama' ('Dra'), 'Horror' ('Hor'), 'Romance' ('Rom') and 'Thriller' ('Thr').

– An interesting point is that one more hidden overlapping genre is present between the domains. To find this hidden overlapping genre, we use the Jaccard similarity between genres as follows:

∗ As mentioned earlier, 1,361 movies are the same in both domains. The intuition is that if two movies have the same name in distinct domains, then the genres present in the movie should also be the same. Here, we have calculated the Jaccard

Fig. 9 Cross-domain structure of MovieLens and EachMovie datasets




similarity between genres based on the 1,361 overlapping items. The similarity is calculated as:

$$ \text{jaccard\_sim}(g_1, g_2) = \frac{|A \cap B|}{|A \cup B|} \tag{23} $$

∗ Seven genre names are the same in both domains, and for each of them the highest similarity is obtained with the genre of the same name. In Table 7, boldface values (e.g., 0.6130) show the highest similarities between genres in the two distinct domains. 'Chi' (children) and 'Fam' (family) are the same type of genre, because their similarity is high (italic-boldface 0.6884). Although the names of the 'Chi' and 'Fam' genres differ, their meanings are the same, so we consider them the same genre. In total, eight genres overlap between domains.

– In the ML dataset, 7 bins (hard encoded) are used for age information, and gender information is provided in the form of 'M' and 'F', so it only needs a scalar value (0/1). In the EM dataset, age is provided as a number. For consistency between the datasets, we have applied a numeric range filter and transformed the ages into the same seven bins as in ML.

– Finally, the size of a user profile ($P_*^k$) vector is 8+7+1=16 (refer to Fig. 7), i.e., 8 genres, 7 age range groups and gender in scalar form.

– For rating-scale consistency, we have replaced the rating values {0, 0.2, 0.4, 0.6, 0.8, 1} with {1, 2, 3, 4, 5, 5}, as [19] have done in the CBT method.
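Two of the preprocessing steps above lend themselves to a short sketch: the Jaccard similarity of (23) between genres and the EachMovie rating rescaling. The paper does not define A and B explicitly; the sketch assumes A is the set of overlapping movies tagged with genre g1 in one domain and B the set tagged with g2 in the other. All names and the toy movie titles are hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity |A ∩ B| / |A ∪ B| of two collections, Eq. (23)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def genre_similarity(ml_genres, em_genres):
    """Similarity of every (ML genre, EM genre) pair.

    Both arguments map a genre name to the set of overlapping movie titles
    that carry this genre in the respective dataset (assumed layout).
    """
    return {(g1, g2): jaccard(m1, m2)
            for g1, m1 in ml_genres.items()
            for g2, m2 in em_genres.items()}

# Mapping of EachMovie scores to the 1-5 scale, as stated in the text.
EM_TO_ML = dict(zip([0, 0.2, 0.4, 0.6, 0.8, 1], [1, 2, 3, 4, 5, 5]))

# Toy example with made-up titles.
ml = {"Chi": {"m1", "m2", "m3"}, "Act": {"m4"}}
em = {"Fam": {"m2", "m3", "m5"}, "Act": {"m4"}}
sims = genre_similarity(ml, em)
```

Picking, for each genre in one domain, the genre with the highest similarity in the other domain then yields the pairing that Table 7 reports.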

Table 8 describes the datasets after the preprocessing steps. We observe that the sparsity level of ML is lower than that of EM. To validate the effectiveness of the transfer learning mechanism, we consider ML and EM as the source domain and target domain, respectively.

5.3 Experiment protocols

Furthermore, to validate the effectiveness of the proposed UP-CDRSs method in the CDRSs model, experiments are run at various sparsity levels of the target domain. The target domain's ratings are divided based on the number of ratings of each user, at the levels {1%, 1.5%, 2%, 2.5%, 2.9%}. For instance, with 6,040 users and 1,628 items in a domain, taking 1% of the ratings of each user gives a total of 0.01 × 6,040 × 1,628 ≈ 98,331 ratings. In total, five experiments are run, one for each number of known ratings.

We have adopted a 10-fold cross-validation process, in which the ratings of the target domain are divided into 10 equal parts. Nine parts are used for training and the remaining part for testing. The training part of the target domain is then concatenated with the source domain. The same process is applied ten times, once for each test part. After each part has been evaluated, the mean over all parts gives the overall performance of the method. We have also used a 95% confidence interval

Table 7 Similarities between genres

Act Ani Art Cla Com Dra Fam Hor Rom Thr

Act 0.6130 0.0130 0.0222 0.0295 0.0506 0.0579 0.0256 0.0420 0.0063 0.1551

Adv 0.2385 0.0464 0.0138 0.0767 0.0210 0.0291 0.1942 0.0154 0.0083 0.0701

Ani 0.0000 0.8333 0.0095 0.0364 0.0075 0.0000 0.2222 0.0000 0.0000 0.0000

Chi 0.0143 0.2832 0.0108 0.0764 0.0465 0.0034 0.6884 0.0000 0.0000 0.0000

Com 0.0274 0.0221 0.0641 0.0946 0.6255 0.0647 0.1000 0.0141 0.1232 0.0171

Cri 0.1034 0.0000 0.0079 0.0508 0.0300 0.0747 0.0049 0.0533 0.0000 0.1013

Doc 0.0044 0.0000 0.0419 0.0000 0.0122 0.0684 0.0057 0.0079 0.0000 0.0000

Dra 0.0440 0.0016 0.1828 0.0950 0.0921 0.5157 0.0350 0.0169 0.0946 0.0681

Fan 0.0103 0.0545 0.0157 0.0048 0.0158 0.0040 0.0896 0.0103 0.0000 0.0000

FNo 0.0211 0.0000 0.0000 0.0615 0.0000 0.0020 0.0000 0.0105 0.0000 0.0444

Hor 0.0122 0.0091 0.0041 0.0397 0.0186 0.0036 0.0000 0.6237 0.0000 0.1198

Mus 0.0044 0.2222 0.0182 0.1014 0.0272 0.0095 0.1282 0.0000 0.0298 0.0000

Mys 0.0326 0.0000 0.0185 0.0780 0.0073 0.0253 0.0000 0.0081 0.0120 0.1134

Rom 0.0462 0.0082 0.0850 0.0815 0.1364 0.1201 0.0182 0.0070 0.3891 0.0216

Sci-fi 0.1713 0.0268 0.0203 0.0391 0.0255 0.0239 0.0100 0.0690 0.0050 0.1027

Thr 0.2078 0.0086 0.0278 0.0724 0.0181 0.0611 0.0031 0.0916 0.0063 0.4818

War 0.0396 0.0000 0.0446 0.0779 0.0240 0.0650 0.0000 0.0000 0.0339 0.0177

Wes 0.0361 0.0000 0.0000 0.0288 0.0156 0.0264 0.0067 0.0000 0.0137 0.0104



Table 8 Datasets description after preprocessing

ML EM

# of users 6,040 6,040

# of items 2,591 1,628

# of ratings 589,395 284,886

Sparsity level 96.27% 97.10%

# genres 8 8

while calculating the average value over all test sets. To find the best trade-off parameters, 20% of the training data is used as a validation set.
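The protocol can be sketched as follows. The paper does not state how the 95% confidence interval is computed; the sketch assumes a normal approximation (mean ± 1.96 standard errors). The function names and the `train_and_score` callable, which stands in for training a method on the source data plus the target training folds and scoring it on the held-out fold, are hypothetical.

```python
import numpy as np

def k_fold_indices(n_ratings, k=10, seed=0):
    """Shuffle rating indices and split them into k (nearly) equal folds."""
    rng = np.random.default_rng(seed)
    return np.array_split(rng.permutation(n_ratings), k)

def mean_with_ci(scores, z=1.96):
    """Mean and half-width of a normal-approximation 95% confidence interval."""
    scores = np.asarray(scores, dtype=float)
    half_width = z * scores.std(ddof=1) / np.sqrt(len(scores))
    return scores.mean(), half_width

def cross_validate(target, source, train_and_score, k=10, seed=0):
    """Run k-fold CV over target-domain rating triplets (userID, itemID, rating).

    Each fold is held out once; the other folds of the target domain are
    passed to `train_and_score` together with the full source domain.
    """
    scores = []
    for test_idx in k_fold_indices(len(target), k, seed):
        is_test = np.zeros(len(target), dtype=bool)
        is_test[test_idx] = True
        scores.append(train_and_score(source, target[~is_test], target[is_test]))
    return scores
```

A call such as `mean_with_ci(cross_validate(target, source, my_scorer))` then gives one "mean ± half-width" entry of the kind reported in the result tables; the 20% validation split for parameter tuning would be carved out of each training portion inside `my_scorer`.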

5.4 Evaluation metrics

To evaluate our proposed work, we have used Mean Absolute Error (MAE) and Root Mean Square Error (RMSE) as the evaluation metrics, which are defined as:

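$$ MAE = \frac{\sum_{(i,j) \in R_T} \left| R_{i,j} - \hat{R}_{i,j} \right|}{|R_T|} \tag{24} $$

$$ RMSE = \sqrt{\frac{\sum_{(i,j) \in R_T} \left( R_{i,j} - \hat{R}_{i,j} \right)^2}{|R_T|}} \tag{25} $$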

where R_T denotes the set of predicted ratings in the target domain and |R_T| their total number. R_{i,j} denotes the actual rating and $\hat{R}_{i,j}$ the predicted rating of user i on item j.
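Both metrics are straightforward to compute over the held-out ratings. A minimal sketch, with hypothetical function names and toy values:

```python
import numpy as np

def mae(actual, predicted):
    """Mean Absolute Error over the test ratings, Eq. (24)."""
    actual, predicted = np.asarray(actual, float), np.asarray(predicted, float)
    return float(np.mean(np.abs(actual - predicted)))

def rmse(actual, predicted):
    """Root Mean Square Error over the test ratings, Eq. (25)."""
    actual, predicted = np.asarray(actual, float), np.asarray(predicted, float)
    return float(np.sqrt(np.mean((actual - predicted) ** 2)))

# Toy example: errors of 1, 0 and 2 rating points.
example_mae = mae([4, 3, 5], [3, 3, 3])
example_rmse = rmse([4, 3, 5], [3, 3, 3])
```

RMSE penalizes large errors more heavily than MAE, which is why the two metrics can rank methods differently.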

5.5 Compared methods and parameter settings

In this subsection, we describe the methods against which our proposed UP-CDRSs method is compared. This subsection also covers the parameter settings, because applying any method for validation requires some threshold values, learning parameter values, etc.

The Average filling (AF) method takes very little time to predict ratings. In the user-item rating matrix, we simply fill in the mean value of all observed item ratings provided by a user. The prediction is as follows:

$$ \hat{R}_{i,*} = \frac{\sum_{j=1}^{N_t} I_{i,j}^t \odot R_{i,j}^t}{\sum_{j=1}^{N_t} I_{i,j}^t} \tag{26} $$

where the symbol ⊙ denotes the element-wise product. Another method in the same vein is Average filling with bias (AF with bias): rather than blindly filling in the mean value of the ratings, it learns the user bias and item bias from the observed ratings to account for critical users and items, and predicts the ratings as follows [9]:

$$ \hat{R}_{i,j} = \mu + b_i + b_j \tag{27} $$

where μ is the global mean of the user-item rating matrix, and b_i and b_j are the user bias and item bias, respectively. Here, we have used λ_{b_i} = λ_{b_j} = 0.001 as trade-off parameters to avoid the over-fitting problem.
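Both baselines are easy to express with NumPy. The sketch below assumes a dense rating matrix R with a 0/1 mask I of observed entries; the function names and toy values are hypothetical, and the bias vectors are taken as given rather than learned.

```python
import numpy as np

def average_filling(R, I):
    """Per-user mean of observed ratings, Eq. (26); 0 for users without ratings."""
    R, I = np.asarray(R, float), np.asarray(I, float)
    sums = (I * R).sum(axis=1)
    counts = I.sum(axis=1)
    return np.divide(sums, counts, out=np.zeros_like(sums), where=counts > 0)

def af_with_bias(mu, b_user, b_item):
    """Prediction matrix mu + b_i + b_j of Eq. (27) for all user-item pairs."""
    return mu + np.asarray(b_user, float)[:, None] + np.asarray(b_item, float)[None, :]

# Toy example: user 0 rated items 0 and 2, user 1 rated item 1.
R = [[5, 0, 3], [0, 2, 0]]
I = [[1, 0, 1], [0, 1, 0]]
user_means = average_filling(R, I)
pred = af_with_bias(3.0, [0.5, -0.5], [0.0, 1.0, -1.0])
```

AF predicts `user_means[i]` for every unrated item of user i, whereas AF with bias additionally shifts the prediction by the item's bias.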

Collaborative filtering with topK neighbors (kNN-CF) is one of the most traditional techniques of RSs. We have fixed the value topK = 50. Two variants of kNN-CF are UPC-CF and NUSC-CF. The first method uses clustering to find groups of similar users and then selects the topK neighbors, so we have fixed topK = 50, the same as for kNN-CF. The second method uses the subspace concept, so we have fixed d = 5, where d denotes the dimensionality [6]. Another compared method is MF [9], which provides a lower-rank approximation of the user-item matrix. The prediction is made using the inner product of the corresponding user and item vectors and is estimated as follows:

$$ \hat{R}_{i,j} = U_i^t V_j^{t\top} + b_{i,j} \tag{28} $$

where U_i, V_j ∈ R^{1×f} and b_{i,j} = μ + b_i + b_j. We have used λu = λv = 0.001 as trade-off parameters, and the size of the latent factors is fixed at f = 10. Similarly, constrained Probabilistic Matrix Factorization (cPMF) is an extended version of probabilistic matrix factorization (PMF) [14]. Rather than learning only two latent vectors, one for the user and one for the item, it adds a constraint on the user-specific feature vectors for infrequent users. The prediction can be estimated as:

$$ \hat{R}_{i,j} = \left( U_i^t + \frac{\sum_{j'=1}^{N_t} I_{i,j'}^t \odot W_{j'}}{\sum_{j'=1}^{N_t} I_{i,j'}^t} \right) V_j^{t\top} \tag{29} $$

where W_j ∈ R^{1×f} is an additional vector of latent factors. In this method, we first scale the ratings to the interval [0, 1] using the function f(x) = (x − 1)/(k − 1), where k is the maximum rating value of the given matrix. The trade-off parameter values are λu = λv = λw = 0.002, and the size of the latent factors is fixed at f = 30.

For comparison with a transfer learning mechanism in RSs, we use the Codebook Transfer (CBT) method [19], which has been proposed for knowledge transfer between domains through a codebook. The cluster size of CBT is set to 50, and the number of topK nearest neighbors is fixed at K = 30 for Pearson correlation coefficients. For our proposed UP-CDRSs method, the trade-off parameter values are λs = λt = 0.002 and α = 0.001, and the size of the latent factors is fixed at f = 30.

5.6 Summary of the experimental results

All the state-of-the-art methods described in the preceding subsection and our proposed method are evaluated on publicly available datasets. In addition, to validate and analyse our UP-CDRSs method under the transfer learning mechanism, the target domain's ratings are divided based on the number of ratings given by each user. To analyse the performance of our UP-CDRSs method, we use MAE and RMSE as evaluation metrics. Tables 9 and 10 show the MAE and RMSE of the proposed method and the existing methods, respectively. Table 11 shows the overall performance based on the average MAE and average RMSE over the different sparsity levels. In Tables 9, 10 and 11, the best performance (lowest MAE and RMSE) is obtained by our proposed UP-CDRSs method.

Table 9 Comparison results in terms of MAE

Methods       % of observed ratings
              1%               1.5%             2%               2.5%             2.9%
AF            1.1747 ± 0.0058  1.0921 ± 0.004   1.0792 ± 0.0036  1.0735 ± 0.0024  1.0639 ± 0.0031
kNN-CF        1.0542 ± 0.0054  1.0374 ± 0.0082  1.0329 ± 0.0091  1.0281 ± 0.0048  1.0151 ± 0.0078
AF with bias  0.8535 ± 0.0058  0.8485 ± 0.0038  0.8416 ± 0.0036  0.8408 ± 0.0026  0.8371 ± 0.0028
MF            0.8006 ± 0.0039  0.7907 ± 0.0025  0.7830 ± 0.0029  0.7765 ± 0.004   0.7392 ± 0.0039
cPMF          0.7921 ± 0.0049  0.7816 ± 0.0064  0.7790 ± 0.0038  0.7459 ± 0.0028  0.7245 ± 0.0045
CBT           0.7839 ± 0.0058  0.7845 ± 0.0049  0.7746 ± 0.0035  0.7576 ± 0.0035  0.7249 ± 0.0038
UPC-CF        0.8542 ± 0.0074  0.8452 ± 0.0041  0.8256 ± 0.0052  0.8011 ± 0.0075  0.7888 ± 0.0091
NUSC-CF       0.7912 ± 0.0082  0.7875 ± 0.0042  0.7653 ± 0.0046  0.7622 ± 0.0074  0.7503 ± 0.0081
UP-CDRSs      0.7746 ± 0.0025  0.7617 ± 0.0035  0.7538 ± 0.004   0.7215 ± 0.0043  0.7042 ± 0.0026

Table 10 Comparison results in terms of RMSE

Methods       % of observed ratings
              1%               1.5%             2%               2.5%             2.9%
AF            1.4442 ± 0.0093  1.3878 ± 0.0048  1.3771 ± 0.0051  1.3745 ± 0.0026  1.3659 ± 0.0039
kNN-CF        1.1549 ± 0.0079  1.1254 ± 0.0041  1.1081 ± 0.0045  1.0973 ± 0.0028  1.0882 ± 0.0053
AF with bias  1.1139 ± 0.0073  1.1106 ± 0.005   1.1015 ± 0.0034  1.0869 ± 0.0029  1.0741 ± 0.004
MF            1.056 ± 0.0058   1.0483 ± 0.0054  1.0384 ± 0.0042  1.0289 ± 0.0044  1.0188 ± 0.0026
cPMF          1.0526 ± 0.0039  1.0485 ± 0.0036  1.0257 ± 0.0032  1.0146 ± 0.0044  1.0085 ± 0.0038
CBT           1.0498 ± 0.0048  1.0347 ± 0.0066  1.0212 ± 0.0032  1.0047 ± 0.0044  0.9914 ± 0.0057
UPC-CF        1.1058 ± 0.0028  1.0951 ± 0.0091  1.0746 ± 0.0079  1.0731 ± 0.0076  1.0701 ± 0.0025
NUSC-CF       1.1019 ± 0.0042  1.0945 ± 0.0046  1.0891 ± 0.0018  1.0745 ± 0.0078  1.0684 ± 0.0086
UP-CDRSs      1.0315 ± 0.0034  1.0234 ± 0.0045  1.0179 ± 0.0065  0.9987 ± 0.005   0.9841 ± 0.0061

Table 11 The overall performance based on average MAE and average RMSE of different sparsity levels

Methods       The overall performance
              MAE              RMSE
AF            1.0967 ± 0.0038  1.3899 ± 0.0051
kNN-CF        1.0335 ± 0.0071  1.1148 ± 0.0049
AF with bias  0.8443 ± 0.0037  1.0974 ± 0.0045
MF            0.7780 ± 0.0034  1.0381 ± 0.0045
cPMF          0.7646 ± 0.0045  1.0300 ± 0.0038
CBT           0.7651 ± 0.0043  1.0204 ± 0.0050
UPC-CF        0.8230 ± 0.0067  1.0837 ± 0.0060
NUSC-CF       0.7713 ± 0.0065  1.0850 ± 0.0054
UP-CDRSs      0.7432 ± 0.0034  1.0111 ± 0.0051
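For reference, the two evaluation metrics used in these tables, MAE and RMSE, can be sketched as follows. This is a minimal illustration under the standard definitions, not the authors' evaluation code; the function names and the toy ratings are assumptions made for the example.

```python
import math


def mae(actual, predicted):
    """Mean absolute error over paired (actual, predicted) ratings."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean squared error over paired (actual, predicted) ratings."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))


# Toy held-out ratings (hypothetical values, not taken from the paper).
actual = [4.0, 3.0, 5.0, 2.0]
predicted = [3.5, 3.0, 4.0, 3.0]
print(mae(actual, predicted), rmse(actual, predicted))
```

Both metrics are errors, so lower values indicate better accuracy; RMSE penalises large individual errors more heavily than MAE.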

The following observations can be made from the results:

(i) The first method, AF, gave the worst results in terms of accuracy. Moreover, increasing the size of the training set brought little improvement in its accuracy.

(ii) kNN-CF is one of the traditional CF methods. It gave better results than the AF method, and its performance improved as the training data increased.

(iii) AF with bias and MF both gave good results because both are learning-based methods. Compared with kNN-CF, MF performed considerably better, improving MAE and RMSE by 28% and 18%, respectively.

(iv) cPMF is a variant of the MF method in which one extra latent factor is used for critical users, so it gave better results than MF.

(v) Among the compared methods, CBT, a transfer learning method, gave the best results relative to the non-transfer-learning methods. This indicates that transfer learning performs well by using the knowledge of another related domain.

(vi) UPC-CF and NUSC-CF, the variants of kNN-CF, gave better results than kNN-CF.

(vii) Our UP-CDRSs method gave superior results compared with all state-of-the-art methods, both with and without transfer learning. Compared with CBT, it reduces MAE and RMSE by 3% and 1%, respectively. Compared with NUSC-CF, it reduces MAE by 3.6%.
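The text does not state how the percentages in these observations were derived. The sketch below assumes they are relative error reductions computed from the averages in Table 11; the function names are hypothetical. Under that assumption, the comparisons with CBT and NUSC-CF come out at about 2.9% and 0.9% (MAE and RMSE against CBT) and 3.6% (MAE against NUSC-CF), in line with the rounded figures in observation (vii).

```python
# Average (MAE, RMSE) per method, copied from Table 11.
overall = {
    "kNN-CF": (1.0335, 1.1148),
    "MF": (0.7780, 1.0381),
    "CBT": (0.7651, 1.0204),
    "NUSC-CF": (0.7713, 1.0850),
    "UP-CDRSs": (0.7432, 1.0111),
}


def relative_reduction(baseline, proposed):
    """Percentage by which `proposed` lowers the error of `baseline`."""
    return 100.0 * (baseline - proposed) / baseline


def compare(baseline_name, proposed_name="UP-CDRSs"):
    """Return the (MAE, RMSE) relative reductions of one method over another."""
    b_mae, b_rmse = overall[baseline_name]
    p_mae, p_rmse = overall[proposed_name]
    return relative_reduction(b_mae, p_mae), relative_reduction(b_rmse, p_rmse)


for name in ("CBT", "NUSC-CF"):
    d_mae, d_rmse = compare(name)
    print(f"UP-CDRSs vs {name}: MAE -{d_mae:.1f}%, RMSE -{d_rmse:.1f}%")
```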

6 Conclusion and future direction

In this paper, we have proposed a novel method named User Profile as a Bridge in Cross-domain Recommender Systems (UP-CDRSs) for knowledge transfer from the source to the target domain. To establish a bridge between domains, we use a user profile built from the user's preferences and demographic information. After building the user profiles in both domains, we calculate the similarities between the user profiles of the distinct domains, and we then apply the alternating least squares approach to the objective function formulated from the probabilistic graphical model. After learning all latent factors of users and items, predictions on unrated items in the target domain are made using the corresponding latent factors of the user and the item. We have conducted five experiments to validate the UP-CDRSs method and analysed how effective transfer learning is in the CDRSs framework. The experimental results show that our proposed UP-CDRSs method performs significantly better than several methods with and without transfer learning.

In the future, we will extend our method to slightly different types of domains, for instance movies versus books, to check how effective the transfer learning framework is.
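The two generic ingredients named here, alternating least squares and prediction by an inner product of latent factors, can be sketched as follows. This is plain regularised matrix factorization on a single rating matrix, not the UP-CDRSs objective: it omits the user-profile similarity term and the coupling between source and target domains. The function names (`als_step`, `fit`, `predict`) and the toy matrix are assumptions for illustration, and the defaults f = 30 and reg = 0.002 merely echo the reported settings.

```python
import numpy as np


def als_step(R, mask, fixed, reg):
    """Solve the regularised least-squares problem for each row of R
    while the other side's factors (`fixed`) are held constant."""
    f = fixed.shape[1]
    solved = np.zeros((R.shape[0], f))
    for u in range(R.shape[0]):
        idx = mask[u]                  # observed entries of row u
        if not idx.any():
            continue                   # no ratings: keep a zero factor
        Q = fixed[idx]                 # factors of the observed columns
        A = Q.T @ Q + reg * np.eye(f)  # regularised normal equations
        b = Q.T @ R[u, idx]
        solved[u] = np.linalg.solve(A, b)
    return solved


def fit(R, mask, f=30, reg=0.002, iters=20, seed=0):
    """Alternate between updating user factors P and item factors Q."""
    rng = np.random.default_rng(seed)
    P = rng.normal(scale=0.1, size=(R.shape[0], f))
    Q = rng.normal(scale=0.1, size=(R.shape[1], f))
    for _ in range(iters):
        P = als_step(R, mask, Q, reg)      # users, items fixed
        Q = als_step(R.T, mask.T, P, reg)  # items, users fixed
    return P, Q


def predict(P, Q, user, item):
    """Predicted rating: inner product of the user and item factors."""
    return float(P[user] @ Q[item])


# Toy rank-1 rating matrix with two unobserved entries (hypothetical data).
R = np.outer([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 1.0])
mask = np.ones_like(R, dtype=bool)
mask[0, 2] = False
mask[3, 0] = False

P, Q = fit(R, mask, f=2, reg=0.002, iters=50)
observed_rmse = float(np.sqrt(np.mean((P @ Q.T - R)[mask] ** 2)))
print(observed_rmse, predict(P, Q, 0, 2))
```

Each half-step is a closed-form ridge regression, which is why alternating between the two sides never increases the regularised objective.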

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

References

1. Adomavicius G, Tuzhilin A (2005) Toward the next generation of recommender systems: a survey of the state-of-the-art and possible extensions. IEEE Trans Knowl Data Eng 17(6):734–749

2. Li Z, Zhao H, Liu Q, Huang Z, Mei T, Chen E (2018) Learning from history and present: Next-item recommendation via discriminatively exploiting user behaviors. In: KDD, pp 1734–1743. ACM

3. Candillier L, Meyer F, Boulle M (2007) Comparing state-of-the-art collaborative filtering systems. Lect Notes Comput Sci 4571:548

4. Jiang L, Cheng Y, Li Y, Li J, Yan H, Wang X (2018) A trust-based collaborative filtering algorithm for E-commerce recommendation system. J Ambient Intell Humaniz Comput 0(0):0

5. Bobadilla J, Ortega F, Hernando A, Gutierrez A (2013) Recommender systems survey. Knowl-Based Syst 46:109–132

6. Koohi H, Kiani K, Hwangbo H, Kim Y (2017) A new method to find neighbor users that improves the performance of Collaborative Filtering. Expert Syst Appl 89:254–265

7. Zhang J, Lin Y, Lin M, Liu J (2016) An effective collaborative filtering algorithm based on user preference clustering. Appl Intell 45(2):230–240

8. Dakhel AM, Malazi HT, Mahdavi M (2018) A social recommender system using item asymmetric correlation. Appl Intell 48(3):527–540

9. Koren Y, Bell R, Volinsky C (2009) Matrix factorization techniques for recommender systems. Computer 42(8):30–37

10. Li Y, Wang D, He H, Jiao L, Xue Y (2017) Mining intrinsic information by matrix factorization-based approaches for collaborative filtering in recommender systems. Neurocomputing 249:48–63

11. Zhang F, Lu Y, Chen J, Liu S, Ling Z (2017) Robust collaborative filtering based on non-negative matrix factorization and R1-norm. Knowl-Based Syst 118:177–190

12. Hernando A, Bobadilla J, Ortega F (2016) A non negative matrix factorization for collaborative filtering recommender systems based on a Bayesian probabilistic model. Knowl-Based Syst 97:188–202

13. Himabindu TVR, Padmanabhan V, Pujari AK (2018) Conformal matrix factorization based recommender system. Information Sciences

14. Salakhutdinov R, Mnih A (2007) Probabilistic matrix factorization. In: Proceedings of the 20th International Conference on Neural Information Processing Systems, NIPS’07, pp 1257–1264, USA. Curran Associates Inc

15. Pan W, Yang Q (2013) Transfer learning in heterogeneous collaborative filtering domains. Artif Intell 197:39–55

16. Pan W (2016) A survey of transfer learning for collaborative recommendation with auxiliary data. Neurocomputing 177:447–453

17. Xin X, Liu Z, Lin C-Y, Huang H, Wei X, Guo P (2015) Cross-domain collaborative filtering with review text. In: Proceedings of the 24th international conference on artificial intelligence, IJCAI’15, pp 1827–1833. AAAI Press

18. Guo G, Qiu H, Tan Z, Liu Y, Ma J, Wang X (2017) Resolving data sparsity by multi-type auxiliary implicit feedback for recommender systems. Knowl-Based Syst 138:202–207

19. Li B, Yang Q, Xue X (2009) Can movies and books collaborate?: Cross-domain collaborative filtering for sparsity reduction. In: Proceedings of the 21st International Joint Conference on Artificial Intelligence, IJCAI’09, pp 2052–2057, San Francisco, CA, USA. Morgan Kaufmann Publishers Inc

20. Khan MM, Ibrahim R, Ghani I (2017) Cross domain recommender systems: a systematic literature review. ACM Comput Surv 50(3):1–34

21. Zhu F, Wang Y, Chen C, Liu G, Orgun M, Wu J (2017) A Deep Framework for Cross-Domain and Cross-System Recommendations. pp 3711–3717

22. He M, Zhang J, Yang P, Yao K (2018) Robust transfer learning for cross-domain collaborative filtering using multiple rating patterns approximation. In: Proceedings of the 11th ACM international conference on web search and data mining - WSDM ’18, pp 225–233

23. Pan SJ, Yang Q (2010) A survey on transfer learning. IEEE Trans Knowl Data Eng 22(10):1345–1359

24. Cremonesi P, Tripodi A, Turrin R (2011) Cross-domain recommender systems. In: ICDMW2011: IEEE 11th international conference on data mining workshops, pp 496–503

25. Shi Y, Larson M, Hanjalic A (2011) Tags as bridges between domains: improving recommendation with tag-induced cross-domain collaborative filtering. In: Proceedings of the 19th international conference on user modeling, adaption, and personalization, UMAP’11. Springer-Verlag, Berlin, pp 305–316

26. Enrich M, Braunhofer M, Ricci F (2013) Cold-start management with cross-domain collaborative filtering and tags. Springer, Berlin, pp 101–112

27. Fernández-Tobías I (2014) Exploiting social tags in matrix factorization models for cross-domain collaborative filtering. In: CBRecSys@RecSys, pp 34–41

28. Sahu AK, Dwivedi P, Kant V (2018) Tags and item features as a bridge for cross-domain recommender systems. Procedia Comput Sci 125:624–631

29. Bishop CM (2006) Pattern recognition and machine learning (information science and statistics). Springer-Verlag New York, Inc., Secaucus

30. Al-Shamri MYH (2016) User profiling approaches for demographic recommender systems. Knowl-Based Syst 100:175–187

31. Ma H, Yang H, Lyu MR, King I (2008) SoRec: social recommendation using probabilistic matrix factorization. In: Proceedings of the 17th ACM conference on information and knowledge management, CIKM ’08, pp 931–940, New York, NY, USA. ACM

32. Yu X, Chu Y, Jiang F, Guo Y, Gong D (2018) SVMs classification based two-side cross domain collaborative filtering by inferring intrinsic user and item features. Knowl-Based Syst 141:80–91

33. Zheng X, Luo Y, Sun L, Ding X, Ji Z (2018) A novel social network hybrid recommender system based on hypergraph topologic structure. World Wide Web 21(4):985–1013

34. Chou S-Y, Yang Y-H, Jang J-SR, Lin Y-C (2016) Addressing cold start for next-song recommendation. In: Proceedings of the 10th ACM conference on recommender systems - RecSys ’16, pp 115–118

35. Valdez ERN, Lovelle JMC, Martínez SO, García-Díaz V, Ordonez de Pablos P, Marín CEM (2012) Implicit feedback techniques on recommender systems applied to electronic books. Comput Hum Behav 28(4):1186–1193

36. Crespo RG, Martínez OS, Lovelle JMC, García-Bustelo CPB, Gayo JEL, Ordonez de Pablos P (2011) Recommendation system based on user interaction data applied to intelligent electronic books. Comput Hum Behav 27(4):1445–1449

37. Dang Thanh N, Son LH, Ali M (2017) Neutrosophic recommender system for medical diagnosis based on algebraic similarity measure and clustering. In: 2017 IEEE International Conference on Fuzzy Systems (FUZZ-IEEE), pp 1–6. https://doi.org/10.1109/FUZZ-IEEE.2017.8015387

38. Le HS, Thong NT (2015) Intuitionistic fuzzy recommender systems: an effective tool for medical diagnosis. Knowl-Based Syst 74:133–150

39. Dwivedi P, Bharadwaj KK (2015) E-learning recommender system for a group of learners based on the unified learner profile approach. Expert Syst 32(2):264–276

40. Liu H, Hu Z, Mian A, Tian H, Zhu X (2014) A new user similarity model to improve the accuracy of collaborative filtering. Knowl-Based Syst 56(Supplement C):156–166

41. Son LH (2015) HU-FCF++: a novel hybrid method for the new user cold-start problem in recommender systems. Eng Appl Artif Intell 41:207–222

42. Biswas S, Lakshmanan LVS, Roy SB (2017) Combating the cold start user problem in model based collaborative filtering. CoRR, arXiv:1703.00397

43. Son LH (2016) Dealing with the new user cold-start problem in recommender systems: a comparative review. Inf Syst 58:87–104

44. Fernández-Tobías I, Cantador I, Kaminskas M, Ricci F (2012) Cross-domain recommender systems: a survey of the state of the art. In: Spanish conference on information retrieval

45. Winoto P, Tang T (2008) If you like the devil wears prada the book, will you also enjoy the devil wears prada the movie? a study of cross-domain recommendations. N Gener Comput 26(3):209–225

46. Berkovsky S, Kuflik T, Ricci F (2007) Cross-domain mediation in collaborative filtering. User Model 4511:355–359

47. Hu L, Cao J, Xu G, Cao L, Gu Z, Zhu C (2013) Personalized recommendation via cross-domain triadic factorization. In: Proceedings of the 22nd international conference on World Wide Web - WWW ’13, pp 595–606

48. Pan W, Xiang EW, Liu NN, Yang Q (2010) Transfer learning in collaborative filtering for sparsity reduction. In: Proceedings of the 24th AAAI conference on artificial intelligence, AAAI’10, pp 230–235. AAAI Press

49. Fang Z, Gao S, Li B, Li J, Liao J (2016) Cross-domain recommendation via tag matrix transfer. In: Proceedings - 15th IEEE international conference on data mining workshop, ICDMW 2015, pp 1235–1240

50. Zhao L, Pan SJ, Yang Q (2017) A unified framework of active transfer learning for cross-system recommendation. Artif Intell 245:38–55

51. Koren Y, Bell R (2015) Advances in collaborative filtering. Recommender systems handbook, 2nd edn. pp 77–118

52. Li T, Ding C (2006) The relationships among various nonnegative matrix factorization methods for clustering. In: Proceedings - IEEE international conference on data mining, ICDM, (1):362–371

53. Al-Shamri MYH, Bharadwaj KK (2008) Fuzzy-genetic approach to recommender systems based on a novel hybrid user model. Expert Syst Appl 35(3):1386–1399

54. Huang J, Zhu K, Zhong N (2016) A probabilistic inference model for recommender systems. Appl Intell 45(3):686–694
