12
Research Article An Improved Collaborative Filtering Recommendation Algorithm and Recommendation Strategy Xiaofeng Li 1 and Dong Li 2 1 Department of Information Engineering, Heilongjiang International University, Harbin 150025, China 2 School of Computer Science and Technology, Harbin Institute of Technology, Harbin 150001, China Correspondence should be addressed to Xiaofeng Li; [email protected] Received 25 September 2018; Accepted 7 April 2019; Published 7 May 2019 Guest Editor: Jaegeol Yim Copyright © 2019 Xiaofeng Li and Dong Li. is is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. e e-commerce recommendation system mainly includes content recommendation technology, collaborative filtering rec- ommendation technology, and hybrid recommendation technology. e collaborative filtering recommendation technology is a successful application of personalized recommendation technology. However, due to the sparse data and cold start problems of the collaborative recommendation technology and the continuous expansion of data scale in e-commerce, the e-commerce recommendation system also faces many challenges. is paper has conducted useful exploration and research on the col- laborative recommendation technology. Firstly, this paper proposed an improved collaborative filtering algorithm. Secondly, the community detection algorithm is investigated, and two overlapping community detection algorithms based on the central node and k-based faction are proposed, which effectively mine the community in the network. Finally, we select a part of user communities from the user network projected by the user-item network as the candidate neighboring user set for the target user, thereby reducing calculation time and increasing recommendation speed and accuracy of the recommendation system. is paper has a perfect combination of social network technology and collaborative filtering technology, which can greatly increase recommendation system performance. is paper used the MovieLens dataset to test two performance indexes which include MAE and RMSE. e experimental results show that the improved collaborative filtering algorithm is superior to other two collaborative recommendation algorithms for MAE and RMSE performance. 1. Introduction As the amount of information continues to increase, the rapid development of Internet technology has brought us into the era of information explosion. e simultaneous appearance of massive information makes it difficult for users to find the parts they are interested in; on the other hand, it also makes a large number of people’s information to be “dark in- formation” in the network, which cannot be obtained by ordinary users. erefore, people need to spend a lot of time browsing and finding information of interest. People use search engines to retrieve information, such as traditional Google, Yahoo, and Baidu. Although information retrieval technology can meet people’s needs to a certain extent, due to its universal characteristics, it still cannot meet user requests of different backgrounds, different purposes, and different periods. Personalized recommendation service technology came into being. It can provide different services for different users to meet the need for personalized services [1, 2]. ere are mainly some problems in the current recommended technologies, like the more common collaborative recom- mendation technology, such as user-based and item-based collaborative filtering algorithms. Due to the sparsity of data, new users, and new items, the performance and accuracy of the recommendation system are limited. In addition, some online recommendation systems, real-time recommendation, are not guaranteed due to large-scale processing of data. erefore, it is still a hot topic to build a high accuracy and high scalability recommendation algorithm. As an important branch of data mining, social network analysis has received more and more attention [3]. Social network analysis is a kind of link analysis technology, which uses social networks as a research object to analyze the structure and behavior of social networks. Since there is a large amount of information in the Hindawi Mobile Information Systems Volume 2019, Article ID 3560968, 11 pages https://doi.org/10.1155/2019/3560968

AnImprovedCollaborativeFilteringRecommendation ...downloads.hindawi.com/journals/misy/2019/3560968.pdf · the collaborative recommendation technology and the continuous expansion

  • Upload
    others

  • View
    1

  • Download
    0

Embed Size (px)

Citation preview

Page 1: AnImprovedCollaborativeFilteringRecommendation ...downloads.hindawi.com/journals/misy/2019/3560968.pdf · the collaborative recommendation technology and the continuous expansion

Research ArticleAn Improved Collaborative Filtering RecommendationAlgorithm and Recommendation Strategy

Xiaofeng Li 1 and Dong Li2

1Department of Information Engineering Heilongjiang International University Harbin 150025 China2School of Computer Science and Technology Harbin Institute of Technology Harbin 150001 China

Correspondence should be addressed to Xiaofeng Li lipf2012126com

Received 25 September 2018 Accepted 7 April 2019 Published 7 May 2019

Guest Editor Jaegeol Yim

Copyright copy 2019 Xiaofeng Li and Dong Li -is is an open access article distributed under the Creative Commons AttributionLicense which permits unrestricted use distribution and reproduction in any medium provided the original work is properly cited

-e e-commerce recommendation system mainly includes content recommendation technology collaborative filtering rec-ommendation technology and hybrid recommendation technology -e collaborative filtering recommendation technology is asuccessful application of personalized recommendation technology However due to the sparse data and cold start problems ofthe collaborative recommendation technology and the continuous expansion of data scale in e-commerce the e-commercerecommendation system also faces many challenges -is paper has conducted useful exploration and research on the col-laborative recommendation technology Firstly this paper proposed an improved collaborative filtering algorithm Secondly thecommunity detection algorithm is investigated and two overlapping community detection algorithms based on the central nodeand k-based faction are proposed which effectively mine the community in the network Finally we select a part of usercommunities from the user network projected by the user-item network as the candidate neighboring user set for the target userthereby reducing calculation time and increasing recommendation speed and accuracy of the recommendation system-is paperhas a perfect combination of social network technology and collaborative filtering technology which can greatly increaserecommendation system performance -is paper used the MovieLens dataset to test two performance indexes which includeMAE and RMSE -e experimental results show that the improved collaborative filtering algorithm is superior to other twocollaborative recommendation algorithms for MAE and RMSE performance

1 Introduction

As the amount of information continues to increase the rapiddevelopment of Internet technology has brought us into theera of information explosion-e simultaneous appearance ofmassive information makes it difficult for users to find theparts they are interested in on the other hand it also makes alarge number of peoplersquos information to be ldquodark in-formationrdquo in the network which cannot be obtained byordinary users -erefore people need to spend a lot of timebrowsing and finding information of interest People usesearch engines to retrieve information such as traditionalGoogle Yahoo and Baidu Although information retrievaltechnology can meet peoplersquos needs to a certain extent due toits universal characteristics it still cannot meet user requestsof different backgrounds different purposes and differentperiods Personalized recommendation service technology

came into being It can provide different services for differentusers to meet the need for personalized services [1 2] -ereare mainly some problems in the current recommendedtechnologies like the more common collaborative recom-mendation technology such as user-based and item-basedcollaborative filtering algorithms Due to the sparsity of datanew users and new items the performance and accuracy ofthe recommendation system are limited In addition someonline recommendation systems real-time recommendationare not guaranteed due to large-scale processing of data-erefore it is still a hot topic to build a high accuracy andhigh scalability recommendation algorithm As an importantbranch of data mining social network analysis has receivedmore andmore attention [3] Social network analysis is a kindof link analysis technology which uses social networks as aresearch object to analyze the structure and behavior of socialnetworks Since there is a large amount of information in the

HindawiMobile Information SystemsVolume 2019 Article ID 3560968 11 pageshttpsdoiorg10115520193560968

social network that can be used for analysis andmining socialnetwork analysis has been introduced into various applicationfields -e existing P2P trust technology has actually adoptedthe social network analysismethod It has been shown that thepersonalized recommendation based on the social networkcan solve the problem of data sparseness new users and newitems in the traditional personalized recommendation Basedon the research of existing recommendation technologies thispaper proposes to introduce social network analysis tech-nology into the recommendation system to realize a highefficiency and high scalability recommendation system tosolve the problems of new users and new items [4]

-e collaborative filtering recommendation techniquehas both practical values and shortcomings From theperspective of recommendation system implementation thememory-based collaborative recommendation is to load theuser and product information into the internal memory andcarry out real-time recommendations through the calcula-tion [5ndash7] It is characterized by simple implementation andno training overhead [8 9] However the large amount ofinformation will lead to excessive system overhead and slowrecommendation speed Although the model-based rec-ommendation is faster the system will retain the data toobtain the new model and produce large overhead whenadding new users new items or new scoring [10 11] Inrecent years there have been many improved recommen-dation algorithms based on collaborative filtering to solve theproblems For example Rashid et al [12] put forward thecollaborative filtering algorithm based on the clusteringmodel Xue et al proposed to adopt the clustering smoothcollaborative filtering algorithm -erefore how to create anaccurate fast and effective high-scalable recommendationsystem has become the trend of research-is paper combinesthe social network technology with the collaborative filteringrecommendation technique applies the community detectionidea to the collaborative recommendation and adopts thescoring pretreatment mechanism during the collaborativerecommendation to prevent the data scarcity [13] After-wards this paper introduces the improved community de-tection algorithm and organically combined the communitydetection algorithm with the collaborative recommendationalgorithm and studied the collaborative filtering recom-mendation algorithms based on the community detection

2 Improved Community Detection Algorithm

21 Overlapping Community Detection Algorithm Based onCentral Nodes

211 Basic Idea Taking the ldquocentral noderdquo as the initialcommunity C this paper adds the neighbor node with thelargest contribution to the community C to this communityand a community will be built when the global contributionreaches the maximum if there are more nodes with largercontribution to the community these nodes will be added tothese communities After extracting the community C thenodes and edges of community C are not deleted from thenetwork in order to mine the overlapping nodes betweencommunities

212 Basic Concept To illustrate the algorithm proposed inthis section first define the following

(1) Local contribution q [14]

q Lin

Lin + Lout (1)

Lin if it is not entitled to map it represents thenumber of internal sides of the community if it is aright to map it is the sum of the weights of all theedges within the community Lout if it is not entitledto map it represents the number of external sides ofthe community if it is the right to map it is the sumof the weights of all the sides of the community andthe outside q the greater the q the greater thecontribution to the community and vice versa

(2) Global contribution degree Q this represents thecurrent maximum contribution in the communitydetection process-is indicator is used to determinewhether the community structure achieves the beststate

(3) Overlap degree the overlap of Ci and Cj in thecommunity is defined as S

S Ci capCj

11138681113868111386811138681113868

11138681113868111386811138681113868

Ci cupCj

11138681113868111386811138681113868

11138681113868111386811138681113868 (2)

213 Algorithm Flow Algorithm is divided into two stages-e first stage is to dig the community in the network andthe second stage is to adjust the mining community

(1) Community mining

Step 1 to calculate the degree of each node in thenetwork select the largest node i as the initialcommunity Ci and the node i to do the tag initialglobal contribution Q 0Step 2 find all the nodes connected to the com-munity Ci in the network and put them into theneighbor node set UStep 3 for each node j in the U according toequation (1) to calculate the contribution of node jto the community qj If the offer J degrees with thelargest node degree qmax maxqj)geQ then addnode j to the community Ci Return to step 2 tocontinue otherwise turn to step 4Step 4 global contribution of Q has reached amaximum value Get community CiStep 5 if the network has no unlabeled nodes allcommunity networks have been detected the endof the process otherwise the node never labeledselect nodes as the initial community of new Ci-en return to step 2 to continue

(2) Community adjustment in fact some overlappingnodes belong to multiple communities which haveoverlapping nodes between communities commu-nity Ci and Cj when the overlap threshold reached T(07) Ci and Cj are closely related so the community

2 Mobile Information Systems

Ci and Cj merge -e specific process of communityadjustment is as follows

Step 1 to calculate the degree of overlap S betweenCi and Cj in any two communitiesStep 2 Ci and Cj are merged into a communitywhen the overlap degree of S is greater than thethreshold value of TStep 3 if any of the two communities overlapcoefficient is less than the threshold T adjust theend otherwise return to step 1 continue to adjust

(3) Experimental analysis and comparison we selectthree classic network datasets

(1) 34-node Zacharyrsquos Karate Club network [15](2) 115-node American College Football network

[16](3) 62-node Dolphins network [17]

Experiment 1 Experiment 2 and Experiment 3 respec-tively are testing in the datasets (1) (2) and (3) on the centernode based on the overlapping community detection algorithm

Experiment 1 the dataset Zachary Karate Club networkcontains 34 nodes -e real network contains twocommunities-rough the community dividing according to theoverlapping the community detection algorithm basedon central nodes the original network is divided intofour communities where nodes 9 10 and 31 areoverlapping nodes -ere is no significant differencebetween the final community results dividing accordingto the algorithm and two communities of the realnetwork Two communities in the funnel network areresegmented-e node sets (5 6 7 11 and 17) and (2526 29 and 32) are extracted as separate communitiesand the four communities in Figure 1 are built In factthe node set (5 6 7 11 and 17) is closely connected andcan be extracted as a separate community the node set(25 26 29 and 32) is less connected to other nodes inthe original community and can also be used as aseparate community Moreover nodes 9 10 and 31 areboundary nodes of two communities and have anequivalent contribution to two communities so theycan belong to two communities simultaneously It isshown in Figure 1Experiment 2 actually American College Footballnetwork has 115 nodes and the real network contains12 independent communities -is paper takes ad-vantage of the overlapping community based on centralnodes mines the American College Football networkthrough the detection algorithm and mines 10 com-munities in total It is shown in Figure 2-e vast majority of 10 communities dividing accordingto the overlapping community detection algorithmbased on central nodes are consistent with the AmericanCollege Football network of which seven communitieshave exactly the same nodes and the node divisionaccuracy rate is up to 85 or higher In the actual

network one community has fewer internal edges thanthe one between communities -erefore when usingthe proposed algorithm this paper shall distribute allnodes within the community to other communities -ered nodes (60 43 91 5 94 15 59 and 98) in Figure 2represent overlapping nodes between two communities-rough careful analysis it is found that those nodeshave larger contributions to the communities they be-long to -erefore the overlapping nodes are consistentwith the actual situationExperiment 3 the dataset Dolphins network is thecategoric dolphin network -e actual network is di-vided into two communities-e community is dividedby taking advantage of the proposed algorithm andfour communities are obtained It is shown in Figure 3-e overlapping community detection algorithm basedon central nodes divides the Dolphins network into twocommunities where nodes 8 and 40 are overlappingnodes Compared with two communities in the actualnetwork except the overlapping nodes the results ofthe other nonnodes are identical -e nodes 8 and 40have the same number of connections with respect tocommunications and have the same contribution to thetwo communities so it is natural for them to belong totwo communities as overlapping nodes

22 Overlapping CommunityDetectionAlgorithmBased on k-Faction

221 Basic Idea In a sense the community can be regardedas a set of interconnected ldquosmall fully coupled networksrdquo-ese ldquofully coupled networkrdquo is called the ldquofactionrdquo wherek-faction indicates the number of nodes in the fully couplednetwork is k

-is paper introduces the overlapping community de-tection algorithm based on k-faction -e algorithm firstdetects all the k-faction from the network as the initialcommunity Afterwards this paper merges the initialcommunity yet there are still some nodes not adding to anyfaction By adding these nodes to the most closely connectedcommunity this paper will get all communities on thenetwork To make the community division results moreaccurate the algorithm finally further optimizes the com-munity and the optimization criteria are to make themodularity achieve the maximum

222 Basic Concept

(1) Faction the complete graph of the network(2) -e largest faction the network does not belong to

any other faction(3) Overlap it is defined as the number of nodes in the

community of twotwo communities with a smallnumber of nodes in the community

(4) Connectivity between communities it is definedas the number of connected sides of the two

Mobile Information Systems 3

communitiesthe total number of connections withinthe two communities

(5) e point of contact with the community is closelyrelated to M the number of vertices and edges ofthe community and the number of vertices in thecommunity

(6) Module Degree Qc

Qc 1Lsumi

sumvw

avciawci Avw minuskvkwL

( ) (3)

223 Algorithm Flow e algorithm is divided intofour processes First process from the network to nd allthe 2 factions more than the largest faction and these

11

5

717

68

3

14 4

13 12

18

22

202

110

9

31

26 2529

3233

21151916

24

23

27

30

28

34

Figure 1 Overlapping community detection algorithm based on the center node

28 63

96

7

77

11488

2197

57

18

66

34

26

110 238

104

46

90 106

112

869

9

23

109

22

78

5279

105

4294

10

17

1 24

91

5

8180

30

83

36

95

102

56

20

31

43

59

60

98

15

7025 85 53

99

103

108

82

7311

756

1229

51

4

41

374558

64

11367 76

49

92

8793

115

47

89

50

74 84

54

111

68

16

314

61

65

7 107 40

48

101

33

Figure 2 American College Football network divided into 10 communities

4 Mobile Information Systems

factions as the initial community second processaccording to the community between the point and edgeoverlap to merge community third process join the nodesto join the most close community community expansionfourth process tuning of the second phase of thecommunity

(1) Finding the largest faction BronndashKerbosch algo-rithm is used to nd the maximum clique in thenetwork and the maximum number of nodes in thenetwork is more than 3 as the initial community

(2) Merger community because there are a number ofoverlapping nodes in all of the biggest factions thesefactions can be combined into a community whenthe overlap is over a threshold value of T in additionif the connection between the two communities iscloser than the threshold value of CONN the twocommunities should also be merged into the samecommunity e combined community consists oftwo steps the rst step is based on the point of themerger the second step is based on the combinationof edge

(3) Experimental analysis and comparison we selectthree classic network datasets

(1) 34-node Zacharyrsquos Karate Club network(2) 115-node American College Football network(3) 62-node Dolphins network

Experiment 4 Experiment 5 and Experiment 6 re-spectively are testing in the datasets (1) (2) and (3) on thecenter node based on the overlapping community detection

algorithm e eect of the algorithm is investigated Ex-perimental parameters are setting CONN 05 and T 065

Experiment 4 the classic dataset Zachary Karate Clubnetwork contains 34 nodes By using the overlappingcommunity detection algorithm based on k faction thealgorithm is divided into two communities as shown inFigure 4From Figure 4 we can learn that the proposed algo-rithm is almost identical to the original network exceptfor the node 10 In fact 10 nodes and two communityconnection numbers as overlapping nodes were addedin two communities in accordance with the actualsituationExperiment 5 American College Football network isbased on the overlapping community detection algo-rithm based on k-faction e network is divided intothe community and the results are shown in Figure 5From Figure 5 we can see that a total of 13 commu-nities and the original network compared to a total of87 of the nodes are correctly divided compared to theoriginal network of which 7 communities are exactlythe same and there are 4 almost identical In allcommunities there is not any overlapping nodeExperiment 6 dataset Dolphins network is a classicdolphin network e actual network is divided intotwo communities Use this section to carry out com-munity division and get four communities as shown inFigure 6e overlapping community detection algorithm basedon the k-faction is divided into four communities

282

27

26

3223

14

7

1857

55

49

58

20

42

610

33

61

40

8

45

21

2931

37

48

35

17

38

3

1

4324

12

5

19

56

25

3622

30

1652

4611

60

9

4

51

44

53

5462

13

1541

34

4750

3959

Figure 3 Dolphins network divided into 2 communities

Mobile Information Systems 5

33

29

30 32

26

28

26

27

34 9

15 2331

16

21

19

24

3

10

2

188 22

12

20

14

1

11

4 13

5

7

17

6

Figure 4 Community based on the overlapping community detection algorithm based on k-faction

4

75108

8599

103

82

41

73

536

11

80 95 102

8330

81

3120

56

36

94

1710

105

142

524

93

67

49

58

92

87

113

4576

3219

35

72

62

55 100

922

69

112

52

879

78 23

109

57

66

97

28

8821

71

63

77 18

96

11439 44

43

86

27

13

15

37

110

90

34

38

2

104

26

46

106

89

84

50

544768

115

74111

33

6516

3

107

101

48

61

14

740

59

60

64

98

2012

7091

5125

Figure 5 American College Football network divided into 13 communities

6 Mobile Information Systems

while the original network has two communities But itcan be found that the four communities are dividedinto two subcommunities in two communities in theoriginal network Node 4 and node 9 are overlappingnodes

-ese three experiments are the results of communitydivision showing a better effect compared with the originalnetwork and there is a higher accuracy rate

-is paper analyzes the modularity Qc Shen putforward the criterion for evaluating the closeness ofoverlapping community indicating that the greater the Qcwas the closer the community was the sparser thecommunity is connected the better the community di-vision results are Based on this view this paper calculatesthe modularity of the above two algorithms and comparesthem with other overlapping community detection al-gorithms in detail

-ese algorithms are proposed by Shen et al [18] andZhang et al [19] Evans and Lambiotte [20] and Duanbinget al [21] proposed overlapping community detection al-gorithm and the results of the experimental comparison areshown in Table 1

Algorithm one is the overlapping community detectionalgorithm based on the center node algorithm two repre-sents the overlapping community detection algorithm basedon k-faction

From Table 1 the overlapping community detectionalgorithm based on k-faction has a better effect For the34-node Club network the algorithm in the module ofShen Zhang and Evans and other algorithms proposedwere slightly smaller and for the Football network and

Dolphins network this algorithm is better than otheralgorithms

-erefore the overlapping community detection algo-rithm based on the k-faction has a better effect whether it isfrom the accuracy of the community division or the angle ofthe module

3 Collaborative Filtering RecommendationAlgorithm Based on theCommunity Detection

31 Basic Idea Aiming at the problems in the traditionalcollaborative filtering recommendation system a collabo-rative recommendation algorithm based on communitydetection is proposed in this paper In collaborative filteringrecommendation algorithms based on the community de-tection the users themselves contain their own feature in-formation User attribute content largely reflects the similar

45

5939

13

15

50 47

5462

44 51

3435

1721

37 3841 9

4

16

25

36

30

19

56

22

524660

51224

53

148

113

43

29

31

20

8

27

4940

5855

2

28

57

7

10

6

61

33 14

23 32

2618

42

Figure 6 Dolphins network divided into four communities

Table 1 Comparison of algorithm modules

Dataset34-nodeClub

network

115-nodeFootballnetwork

62-nodeDolphinsnetwork

Algorithm 1 045 057 051Algorithm 2 042 060 055Actual network partitioning 042 056 040Shen 067 mdash 054Zhang 042 058 mdashEvans 046 mdash mdashChen 038 mdash 033

Mobile Information Systems 7

relationship between friends At the same time the userrsquosfeature attributes are stable and can reflect the relationshipbetween users -erefore in the collaborative filtering rec-ommendation the userrsquos tag information feature is partic-ularly important

By introducing the user feature attribute the similarityof the feature attributes of the userrsquos neighbors canbe extracted and the weight of the unrelated items of thetarget item in the similar items can be reduced so that thecalculation of the neighbor user set is more accurate Rec-ommendation items in collaborative filtering recommen-dation based on community detection are classified and it isshown in the following equation

I C1 cup C2 cup cupCK

C1 i11 i12 i1ji1113966 1113967

CK ik1 ik2 ikjk1113966 1113967

(4)

where U represents the set of all users in the system and CK

indicates the kth class

32 User Feature Matrix Construction In the case of similarcalculations the scoring value of the improvement factorshould be fully considered -erefore this paper designs animproved user similarity calculation method based on thecollaborative filtering recommendation system Using thelinear combinationmethod the corresponding weightX is setand the traditional item H similarity result and the user labelcategory similarity result are combined to jointly serve as thesimilarity between the two items -e equation is as follows

sim(i j) (1minus λ) times simR(i j) + λ times simitemminuscate(i j) (5)

where simR(i j) is the similarity calculation of the userscoring matrix according to the traditional three similaritymeasurement methods

Sarwar used the smallest dataset of MovieLens tocompare the three similarities and used MAE as a mea-surement indicator -e experimental results show that theoptimal MAE can be obtained by using the corrected cosinesimilarity for scoring prediction

-e experiment is similar to the dataset in the paper-erefore this paper also uses the modified cosine similarityas the item similar calculation method

simitemminuscate(i j) is used to calculate the similarity in theuser-category matrix Since the matrix table is a binary valueit is reasonable to use the cosine similarity calculation -ecalculation equation is as follows

sim(i j)cate cos( irarr

jrarr

) i

rarrmiddot jrarr

irarr

middot jrarr

(6)

where irarr

and jrarr

are the corresponding row vectors of user i

and user j in the user-category matrix respectively andindicate the type contained by user i and user j

Community detection algorithm is used to divide agiven target social network into a number of communitiesthat match the actual situation Each community representsa social circle that is not used by everyone Potential

collaborative filtering is recommended for target users inthese relatively closely related and smaller communities Inthis way the subsequently constructed user-label matrixshrinks the matrix size Moreover most of the nodes in thecommunity have similar tag attributes so the sparseness ofthe collaborative filtering matrix can be reduced Construct auser-label scoring matrix using the proposed user-labelassociation strong concept in the community where thetarget user is located and calculate the similarity between theuser and other users in the community according to theimproved similarity calculation formula -e target userrsquoscommunity the user-label association strength concept isconstructed by using the proposed user-label associationstrength concept and the similarity between the user andother users in the community is calculated according to theimproved similarity calculation formula -en recommendthe top K users with the highest similarity as potentialfriends for the target users

4 Experiment Design and Discussion

41 Experimental Dataset In order to test the algorithm ofthis paper we have downloaded the classic MovieLens fromthe Internet and dataset and stored in local database

MovieLens (httpmovielensumnedu) was found in1997 to a recommendation system web film based on thenumber of users tens of thousands of daily access to thesystem and the system can accept user score of the film andmovie recommendation list for the user Currently the sitersquosusers have more than 43000 people -e user rating of thefilm has more than 3500

We selected 943 users from the user-rating database and1682 movies a total of 100000 scoring information each ofwhich has at least 20 score information score from 1 to 5 ofthe integer the greater the score the higher the degree of theuserrsquos preference for the film conversely the lower

42 Experimental Evaluation Index Different similarityindicators will result in different prediction scores Meanabsolute error (MAE) and root mean square error (RMSE)are two common indicators for measuring the accuracy ofsimilarity methods -e smaller the value the better theprediction accuracy -e usual approach is to divide theoriginal dataset into the training set and test set then use thetraining set to get the prediction model and the test set toevaluate the prediction results that is the prediction resultsand the actual score of the MAE and RMSE values It isdefined as follows

(1) Mean absolute error (MAE) is the average of theabsolute error of the userrsquos predicted score rui

prime andthe true score rui in the scoring test set

MAE 1

|p|1113944

(ui)isinprui minus ruiprime

11138681113868111386811138681113868111386811138681113868 (7)

(2) Root-mean-square error (RMSE) is the mean squareroot of the true score value rui and the predictedscore value rui

prime of the user in the test set

8 Mobile Information Systems

RMSE

1|p|

1113944(ui)isinp

rui minus ruiprime( 1113857

2

11139741113972

(8)

|p| in equations (7) and (8) represents the size of thetest set

43 Experimental Program To facilitate the evaluation ofperformance indicators the original MovieLens datasetsare divided into training sets and test sets -e authorrandomly selects a part from the user scoring set as thetraining set and the remaining scoring data shall becomethe test set -e training set is used for model trainingwhereas the test set is used for testing the pros and cons-is paper tests the collaborative filtering algorithm basedon the community detection as well as MAE and RMSEindicators of the collaborative filtering algorithm by usingPearsonrsquos correlation and cosine similarity -ere are twofactors significantly affected by the collaborative filteringrecommendation the scarcity of the dataset and the size ofneighbor user sets (ie the value of k)-ese two factors arealso considered in the validation of the collaborative fil-tering algorithm

Based on the above factors this paper designed thefollowing experimental scheme

Experiment 1 select different proportions of trainingsets and test sets in the MovieLens dataset If theneighbor user set size k is constant consider comparingthe performance of the above algorithms in the case ofdifferent data sparsity -ereby it is possible to verifythe extent to which the recommendation effect is af-fected under the condition that the system obtains avalid amount of informationExperiment 2 select a certain percentage of trainingand test sets in the MovieLens dataset When theneighbor user set size k is changed the performance ofthe above algorithm is compared to verify the influenceof the nearest neighbor user set size on the recom-mendation effect

In order to avoid the training set and test set of a certainpartition the experimental results are brought by chanceWe use the same method to divide the MovieLens datasetinto 5 and get 5 sets of different training sets and thecorresponding test sets So each training set and test set has5 experiments and take the average results of the 5 ex-periments is at the final result of the experiment

44 Experimental Results and Analysis In order to suc-cessfully carry out the experiment the need is to set someparameters In this paper we recommend that the size |U| ofthe candidate user setU in the algorithm be set to about 30of the number of users in the training set For conveniencethe collaborative filtering algorithm based on communitydetection in this paper uses CFCD (collaborative filteringcommunity detection) -e cooperative filtering algorithm

based on cosine similarity is expressed by CFC (collaborativefiltering cosine) -e collaborative filtering algorithm basedon Pearsonrsquos correlation is expressed by CFP (collaborativefiltering Pearson)

Experiment 1 with the MovieLens dataset the size ofthe training set is changed if the neighbor user set sizek value remains unchanged Table 2 is based on thepremise that the neighbor user set size k is 30 -eresults of experiments on the proportions of eachtraining set are shown in Table 2

We draw the line and bar chart corresponding to Ta-ble 2 -e details are shown in Figures 7 and 8 It can beseen from Figure 7 that the MAE value of the CFCDalgorithm proposed in this paper is smaller than that ofthe CFP algorithm and CFC algorithm Figure 8 showsthat the RMSE value of the CFCD algorithm proposedin this paper is also smaller than that of the CFP al-gorithm and the CFC algorithm -us the CFCD al-gorithm is better than the CFP algorithm and CFCalgorithmExperiment 2 with the MovieLens dataset in thecase of a certain size of the training set to change thesize of the K value of the nearest neighbor user setTable 3 shows the comparison of MAE and RMSEvalues at different k values for the training set ratio of80

We draw out Table 3 corresponding to the line chart andbar chart in detail It is shown in Figures 9 and 10

Experimental results show the MAE and RMSE values ofthe collaborative recommendation algorithm based oncommunity detection are smaller than the other two algo-rithms -e reaction is on the line graph and in the case of acertain proportion of the training set the value of k in-creases -e MAE and RSME polylines of CFCD are belowthose of CFC and CFP indicating that the proposed algo-rithm CFCD is better than the other two algorithms

When k takes about 30 in fact the MAE and RSME ofthe three algorithms are the smallest when the size of theneighbor user set is 30 the recommended effect is bestFigures 9 and 10 are the corresponding line graphs ofTable 3

It can be seen that CFCD algorithmrsquos MAE and RMSEvalues are lower than those of the other two algorithms Itcan be shown that the CFCD algorithm is better than theCFC and CFP algorithms

5 Conclusions

-is paper introduces social network-related technologiesinto collaborative recommendation technology and pro-poses a collaborative recommendation algorithm basedon community detection In order to solve the problemsexisting in the traditional and recommended technolo-gies this paper proposes a collaborative recommendationmethod based on community detection based on com-munity discovery technology and collaborative recom-mendation technology in social networks -e algorithm

Mobile Information Systems 9

adopts the method of rough selection of neighboring usersets which eectively reduce the system calculation timeand at the same time the scoring preprocessing method isadopted to prevent the problem caused by data sparsenesse performance of the proposed algorithm is veried byexperiments is paper designed two experiments usingthe MovieLens dataset e performance indicators MAE

and RMSE of the community-based collaborative recom-mendation technology and the collaborative recommen-dation technique based on Pearsonrsquos similarity and cosinesimilarity are tested separately e experimental resultsshow that the proposed technique is superior to the per-formance of collaborative recommendation based onPearsonrsquos similarity and Cosine similarity

Table 2 Experimental results for dierent training set ratios(k 30)

Ratio oftraining set ()

MAE RMSECFCD CFC CFP CFCD CFC CFP

20 0814 0847 0847 0995 1011 100130 0803 0848 0825 0965 0995 098740 0794 0834 0814 0968 0978 096850 0780 0835 0800 0917 0957 093560 0771 0830 0798 0867 0934 091470 0770 0828 0787 0827 0899 086780 0765 0823 0777 0812 0844 0822

MA

E

Training set ratio ()

CFCDCFPCFC

087

085

083

079

077

075

081

200 40 60 10080

Figure 7 MAE results for training set ratio changes (k 30)

20 30Training set ratio ()

40 50 60 70 80

087

085

083

079

077

075

081

RMSE

CFCDCFPCFC

Figure 8 RMSE results for training set ratio changes (k 30)

083082081080079078077076

0 20 40 60 80 100 120

MA

E

k

CFCDCFPCFC

Figure 9 K value variation of the MAE results (training set ratio of80)

10 30

085

084

083

081

08

079

082

k

RMSE

CFCDCFPCFC

20 40 50 60 70 80 90 100

Figure 10 K value variation of the RMSE results (training set ratioof 80)

Table 3 Experimental results for the training set ratio of 80

KMAE RMSE

CFCD CFC CFP CFCD CFC CFP10 0772 0829 0708 0819 0845 082320 0767 0823 0777 0812 0846 082430 0765 0824 0776 0813 0845 082740 0763 0826 0778 0815 0847 082350 0771 0823 0779 0816 0847 082460 0772 0826 0780 0817 0848 082570 0775 0824 0781 0816 0845 082680 0778 0823 0782 0817 0849 082790 0776 0824 0783 0818 0846 0824100 0781 0831 0783 0819 0851 0827

10 Mobile Information Systems

Further research will be carried out in the followingareas

(1) Information acquisition in addition to obtainingbasic user information it is necessary to dig out moreinformation implied by users In case if the new userrating information is insufficient the userrsquos non-evaluation information may be considered as theuserrsquos supplementary information so that the newuser can be more accurately recommended

(2) Recommended technical aspects in order to improvethe accuracy and scalability of the recommendationsystem and ensure that the system performs real-timerecommendation under large-scale data it is neces-sary to conduct in-depth exploration and research onthe recommended technology

Data Availability

-e data used to support the findings of this study are in-cluded within the article Readers can access the data sup-porting the conclusions of the study fromMovieLens (httpmovielensumnedu)

Conflicts of Interest

-e authors declare that they have no conflicts of interest

References

[1] H Zarzour F M Mohamed S Chaouki and C ChemamldquoAn improved collaborative filtering recommendationalgorithm for big datardquo Computational Intelligence and ItsApplications pp 660ndash668 2018

[2] H Zarzour Z A Sharif M A Ayyoub and Y Jararweh ldquoAnew collaborative filtering recommendation algorithm basedon dimensionality reduction and clustering techniquesrdquo inProceedings of the 2018 9th International Conference on In-formation and Communication Systems (ICICS) pp 102ndash106Lille France April 2018

[3] Z Chen Z Ning Q Xiong and M S Obaidat ldquoA collab-orative filtering recommendation-based scheme for WLANswith differentiated access servicerdquo IEEE Systems Journalvol 12 no 1 pp 1004ndash1014 2018

[4] C Lu ldquoRecommendation algorithm based on collaborativefiltering under social network environmentrdquo BioTechnologyAn Indian Journal vol 10 no 13 pp 7087ndash7092 2014

[5] Q-J Tan H Wu C Wang and Q Guo ldquoA personalizedhybrid recommendation strategy based on user behaviors andits applicationrdquo in Proceedings of the 2017 InternationalConference on Security Pattern Analysis and Cyberneticspp 181ndash186 SPAC Shenzhen China December 2017

[6] L Guo J LiangYing ZY Luo and L Sun ldquoCollaborativefiltering recommendation based on trust and emotionrdquoJournal of Intelligent Information Systems pp 1ndash23 2018

[7] J Aranda I Givoni J Handcock and D Tarlow An OnlineSocial Network-Based Recommendation System University ofToronto Ontario Canada 2007

[8] J Stan F Muhlenbach and C Largeron ldquoRecommendersystems using social network analysis challenges and futuretrendsrdquo in Encyclopedia of Social Network Analysis andMining pp 1ndash22 Springer New York NY USA 2014

[9] K G Saranya and G Sudha Sadasivam ldquoPersonalized newsarticle recommendation with novelty using collaborativefiltering based rough set theoryrdquo Mobile Networks andApplications vol 22 no 4 pp 719ndash729 2017

[10] L O Colombo-Mendoza R Valencia-Garcıa R Colomo-Palacios and G Alor-Hernandez ldquoA knowledge-basedmulti-criteria collaborative filtering approach for discover-ing services in mobile cloud computing platformsrdquo Journal ofIntelligent Information Systems pp 1ndash25 2018

[11] H-G Ko I-Y Ko and D Lee ldquoMulti-criteria matrixlocalization and integration for personalized collaborativefiltering in IoT environmentsrdquo Multimedia Tools and Appli-cations vol 77 no 4 pp 4697ndash4730 2018

[12] A M Rashid K L Shyong and KL George ldquoA highlyscalable hybrid model amp memory-based CF algorithmrdquo inProceeding of the WEBKDD of ACM Macau China June2014

[13] G-R Xue C Lin Q Yang et al ldquoScalable collaborativefiltering using cluster-based smoothingrdquo in Proceedings ofthe 20th Annual International ACM SIGIR Conferencepp 15ndash19 Tampere Finland August 2015

[14] A Clauset ldquoFinding local community structure in networksrdquoPhysical Review E vol 72 no 2 pp 56ndash70 2015

[15] W W Zachary ldquoAn information flow model for conflict andfission in small groupsrdquo Journal of Anthropological Researchvol 33 no 4 pp 452ndash473 1977

[16] M Girvan and M E J Newman ldquoCommunity structure insocial and biological networksrdquo Proceedings of the NationalAcademy of Sciences vol 99 no 12 pp 7821ndash7826 2012

[17] M E J Newman and M Girvan ldquoFinding and evaluatingcommunity structure in networksrdquo Physical Review E vol 69no 2 pp 25ndash32 2014

[18] H W Shen X-Q Cheng and J F Guo ldquoQuantifying andidentifying the overlapping community structure in net-worksrdquo Journal of Statistical Mechanics eory and Experi-ment vol 2009 no 7 Article ID P07042 42 pages 2009

[19] S Zhang R-S Wang and X-S zhang ldquoIdentification ofoverlapping community structure in complex networks usingfuzzy-means clusteringrdquo Physica A Statistical Mechanics andIts Applications vol 374 no 1 pp 483ndash490 2007

[20] T S Evans and R Lambiotte ldquoLine graphs link partitionsand overlapping communitiesrdquo Physical Review vol 80 no 1pp 145ndash148 2009

[21] C Duanbing Y Fu and M S Shang ldquoAn efficient algorithmfor overlapping community detection in complexrdquo GlobalCongress on Intelligent Systems pp 244ndash247 2009

Mobile Information Systems 11

Computer Games Technology

International Journal of

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom

Journal ofEngineeringVolume 2018

Advances in

FuzzySystems

Hindawiwwwhindawicom

Volume 2018

International Journal of

ReconfigurableComputing

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Applied Computational Intelligence and Soft Computing

thinspAdvancesthinspinthinsp

thinspArtificial Intelligence

Hindawiwwwhindawicom Volumethinsp2018

Hindawiwwwhindawicom Volume 2018

Civil EngineeringAdvances in

Hindawiwwwhindawicom Volume 2018

Electrical and Computer Engineering

Journal of

Journal of

Computer Networks and Communications

Hindawiwwwhindawicom Volume 2018

Hindawi

wwwhindawicom Volume 2018

Advances in

Multimedia

International Journal of

Biomedical Imaging

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Engineering Mathematics

International Journal of

RoboticsJournal of

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Computational Intelligence and Neuroscience

Hindawiwwwhindawicom Volume 2018

Mathematical Problems in Engineering

Modelling ampSimulationin EngineeringHindawiwwwhindawicom Volume 2018

Hindawi Publishing Corporation httpwwwhindawicom Volume 2013Hindawiwwwhindawicom

The Scientific World Journal

Volume 2018

Hindawiwwwhindawicom Volume 2018

Human-ComputerInteraction

Advances in

Hindawiwwwhindawicom Volume 2018

Scientic Programming

Submit your manuscripts atwwwhindawicom

Page 2: AnImprovedCollaborativeFilteringRecommendation ...downloads.hindawi.com/journals/misy/2019/3560968.pdf · the collaborative recommendation technology and the continuous expansion

social network that can be used for analysis andmining socialnetwork analysis has been introduced into various applicationfields -e existing P2P trust technology has actually adoptedthe social network analysismethod It has been shown that thepersonalized recommendation based on the social networkcan solve the problem of data sparseness new users and newitems in the traditional personalized recommendation Basedon the research of existing recommendation technologies thispaper proposes to introduce social network analysis tech-nology into the recommendation system to realize a highefficiency and high scalability recommendation system tosolve the problems of new users and new items [4]

-e collaborative filtering recommendation techniquehas both practical values and shortcomings From theperspective of recommendation system implementation thememory-based collaborative recommendation is to load theuser and product information into the internal memory andcarry out real-time recommendations through the calcula-tion [5ndash7] It is characterized by simple implementation andno training overhead [8 9] However the large amount ofinformation will lead to excessive system overhead and slowrecommendation speed Although the model-based rec-ommendation is faster the system will retain the data toobtain the new model and produce large overhead whenadding new users new items or new scoring [10 11] Inrecent years there have been many improved recommen-dation algorithms based on collaborative filtering to solve theproblems For example Rashid et al [12] put forward thecollaborative filtering algorithm based on the clusteringmodel Xue et al proposed to adopt the clustering smoothcollaborative filtering algorithm -erefore how to create anaccurate fast and effective high-scalable recommendationsystem has become the trend of research-is paper combinesthe social network technology with the collaborative filteringrecommendation technique applies the community detectionidea to the collaborative recommendation and adopts thescoring pretreatment mechanism during the collaborativerecommendation to prevent the data scarcity [13] After-wards this paper introduces the improved community de-tection algorithm and organically combined the communitydetection algorithm with the collaborative recommendationalgorithm and studied the collaborative filtering recom-mendation algorithms based on the community detection

2 Improved Community Detection Algorithm

21 Overlapping Community Detection Algorithm Based onCentral Nodes

211 Basic Idea Taking the ldquocentral noderdquo as the initialcommunity C this paper adds the neighbor node with thelargest contribution to the community C to this communityand a community will be built when the global contributionreaches the maximum if there are more nodes with largercontribution to the community these nodes will be added tothese communities After extracting the community C thenodes and edges of community C are not deleted from thenetwork in order to mine the overlapping nodes betweencommunities

212 Basic Concept To illustrate the algorithm proposed inthis section first define the following

(1) Local contribution q [14]

q Lin

Lin + Lout (1)

Lin if it is not entitled to map it represents thenumber of internal sides of the community if it is aright to map it is the sum of the weights of all theedges within the community Lout if it is not entitledto map it represents the number of external sides ofthe community if it is the right to map it is the sumof the weights of all the sides of the community andthe outside q the greater the q the greater thecontribution to the community and vice versa

(2) Global contribution degree Q this represents thecurrent maximum contribution in the communitydetection process-is indicator is used to determinewhether the community structure achieves the beststate

(3) Overlap degree the overlap of Ci and Cj in thecommunity is defined as S

S Ci capCj

11138681113868111386811138681113868

11138681113868111386811138681113868

Ci cupCj

11138681113868111386811138681113868

11138681113868111386811138681113868 (2)

213 Algorithm Flow Algorithm is divided into two stages-e first stage is to dig the community in the network andthe second stage is to adjust the mining community

(1) Community mining

Step 1 to calculate the degree of each node in thenetwork select the largest node i as the initialcommunity Ci and the node i to do the tag initialglobal contribution Q 0Step 2 find all the nodes connected to the com-munity Ci in the network and put them into theneighbor node set UStep 3 for each node j in the U according toequation (1) to calculate the contribution of node jto the community qj If the offer J degrees with thelargest node degree qmax maxqj)geQ then addnode j to the community Ci Return to step 2 tocontinue otherwise turn to step 4Step 4 global contribution of Q has reached amaximum value Get community CiStep 5 if the network has no unlabeled nodes allcommunity networks have been detected the endof the process otherwise the node never labeledselect nodes as the initial community of new Ci-en return to step 2 to continue

(2) Community adjustment in fact some overlappingnodes belong to multiple communities which haveoverlapping nodes between communities commu-nity Ci and Cj when the overlap threshold reached T(07) Ci and Cj are closely related so the community

2 Mobile Information Systems

Ci and Cj merge -e specific process of communityadjustment is as follows

Step 1 to calculate the degree of overlap S betweenCi and Cj in any two communitiesStep 2 Ci and Cj are merged into a communitywhen the overlap degree of S is greater than thethreshold value of TStep 3 if any of the two communities overlapcoefficient is less than the threshold T adjust theend otherwise return to step 1 continue to adjust

(3) Experimental analysis and comparison we selectthree classic network datasets

(1) 34-node Zacharyrsquos Karate Club network [15](2) 115-node American College Football network

[16](3) 62-node Dolphins network [17]

Experiment 1 Experiment 2 and Experiment 3 respec-tively are testing in the datasets (1) (2) and (3) on the centernode based on the overlapping community detection algorithm

Experiment 1 the dataset Zachary Karate Club networkcontains 34 nodes -e real network contains twocommunities-rough the community dividing according to theoverlapping the community detection algorithm basedon central nodes the original network is divided intofour communities where nodes 9 10 and 31 areoverlapping nodes -ere is no significant differencebetween the final community results dividing accordingto the algorithm and two communities of the realnetwork Two communities in the funnel network areresegmented-e node sets (5 6 7 11 and 17) and (2526 29 and 32) are extracted as separate communitiesand the four communities in Figure 1 are built In factthe node set (5 6 7 11 and 17) is closely connected andcan be extracted as a separate community the node set(25 26 29 and 32) is less connected to other nodes inthe original community and can also be used as aseparate community Moreover nodes 9 10 and 31 areboundary nodes of two communities and have anequivalent contribution to two communities so theycan belong to two communities simultaneously It isshown in Figure 1Experiment 2 actually American College Footballnetwork has 115 nodes and the real network contains12 independent communities -is paper takes ad-vantage of the overlapping community based on centralnodes mines the American College Football networkthrough the detection algorithm and mines 10 com-munities in total It is shown in Figure 2-e vast majority of 10 communities dividing accordingto the overlapping community detection algorithmbased on central nodes are consistent with the AmericanCollege Football network of which seven communitieshave exactly the same nodes and the node divisionaccuracy rate is up to 85 or higher In the actual

network one community has fewer internal edges thanthe one between communities -erefore when usingthe proposed algorithm this paper shall distribute allnodes within the community to other communities -ered nodes (60 43 91 5 94 15 59 and 98) in Figure 2represent overlapping nodes between two communities-rough careful analysis it is found that those nodeshave larger contributions to the communities they be-long to -erefore the overlapping nodes are consistentwith the actual situationExperiment 3 the dataset Dolphins network is thecategoric dolphin network -e actual network is di-vided into two communities-e community is dividedby taking advantage of the proposed algorithm andfour communities are obtained It is shown in Figure 3-e overlapping community detection algorithm basedon central nodes divides the Dolphins network into twocommunities where nodes 8 and 40 are overlappingnodes Compared with two communities in the actualnetwork except the overlapping nodes the results ofthe other nonnodes are identical -e nodes 8 and 40have the same number of connections with respect tocommunications and have the same contribution to thetwo communities so it is natural for them to belong totwo communities as overlapping nodes

22 Overlapping CommunityDetectionAlgorithmBased on k-Faction

221 Basic Idea In a sense the community can be regardedas a set of interconnected ldquosmall fully coupled networksrdquo-ese ldquofully coupled networkrdquo is called the ldquofactionrdquo wherek-faction indicates the number of nodes in the fully couplednetwork is k

-is paper introduces the overlapping community de-tection algorithm based on k-faction -e algorithm firstdetects all the k-faction from the network as the initialcommunity Afterwards this paper merges the initialcommunity yet there are still some nodes not adding to anyfaction By adding these nodes to the most closely connectedcommunity this paper will get all communities on thenetwork To make the community division results moreaccurate the algorithm finally further optimizes the com-munity and the optimization criteria are to make themodularity achieve the maximum

222 Basic Concept

(1) Faction the complete graph of the network(2) -e largest faction the network does not belong to

any other faction(3) Overlap it is defined as the number of nodes in the

community of twotwo communities with a smallnumber of nodes in the community

(4) Connectivity between communities it is definedas the number of connected sides of the two

Mobile Information Systems 3

communitiesthe total number of connections withinthe two communities

(5) e point of contact with the community is closelyrelated to M the number of vertices and edges ofthe community and the number of vertices in thecommunity

(6) Module Degree Qc

Qc 1Lsumi

sumvw

avciawci Avw minuskvkwL

( ) (3)

223 Algorithm Flow e algorithm is divided intofour processes First process from the network to nd allthe 2 factions more than the largest faction and these

11

5

717

68

3

14 4

13 12

18

22

202

110

9

31

26 2529

3233

21151916

24

23

27

30

28

34

Figure 1 Overlapping community detection algorithm based on the center node

28 63

96

7

77

11488

2197

57

18

66

34

26

110 238

104

46

90 106

112

869

9

23

109

22

78

5279

105

4294

10

17

1 24

91

5

8180

30

83

36

95

102

56

20

31

43

59

60

98

15

7025 85 53

99

103

108

82

7311

756

1229

51

4

41

374558

64

11367 76

49

92

8793

115

47

89

50

74 84

54

111

68

16

314

61

65

7 107 40

48

101

33

Figure 2 American College Football network divided into 10 communities

4 Mobile Information Systems

factions as the initial community second processaccording to the community between the point and edgeoverlap to merge community third process join the nodesto join the most close community community expansionfourth process tuning of the second phase of thecommunity

(1) Finding the largest faction BronndashKerbosch algo-rithm is used to nd the maximum clique in thenetwork and the maximum number of nodes in thenetwork is more than 3 as the initial community

(2) Merger community because there are a number ofoverlapping nodes in all of the biggest factions thesefactions can be combined into a community whenthe overlap is over a threshold value of T in additionif the connection between the two communities iscloser than the threshold value of CONN the twocommunities should also be merged into the samecommunity e combined community consists oftwo steps the rst step is based on the point of themerger the second step is based on the combinationof edge

(3) Experimental analysis and comparison we selectthree classic network datasets

(1) 34-node Zacharyrsquos Karate Club network(2) 115-node American College Football network(3) 62-node Dolphins network

Experiment 4 Experiment 5 and Experiment 6 re-spectively are testing in the datasets (1) (2) and (3) on thecenter node based on the overlapping community detection

algorithm e eect of the algorithm is investigated Ex-perimental parameters are setting CONN 05 and T 065

Experiment 4 the classic dataset Zachary Karate Clubnetwork contains 34 nodes By using the overlappingcommunity detection algorithm based on k faction thealgorithm is divided into two communities as shown inFigure 4From Figure 4 we can learn that the proposed algo-rithm is almost identical to the original network exceptfor the node 10 In fact 10 nodes and two communityconnection numbers as overlapping nodes were addedin two communities in accordance with the actualsituationExperiment 5 American College Football network isbased on the overlapping community detection algo-rithm based on k-faction e network is divided intothe community and the results are shown in Figure 5From Figure 5 we can see that a total of 13 commu-nities and the original network compared to a total of87 of the nodes are correctly divided compared to theoriginal network of which 7 communities are exactlythe same and there are 4 almost identical In allcommunities there is not any overlapping nodeExperiment 6 dataset Dolphins network is a classicdolphin network e actual network is divided intotwo communities Use this section to carry out com-munity division and get four communities as shown inFigure 6e overlapping community detection algorithm basedon the k-faction is divided into four communities

282

27

26

3223

14

7

1857

55

49

58

20

42

610

33

61

40

8

45

21

2931

37

48

35

17

38

3

1

4324

12

5

19

56

25

3622

30

1652

4611

60

9

4

51

44

53

5462

13

1541

34

4750

3959

Figure 3 Dolphins network divided into 2 communities

Mobile Information Systems 5

33

29

30 32

26

28

26

27

34 9

15 2331

16

21

19

24

3

10

2

188 22

12

20

14

1

11

4 13

5

7

17

6

Figure 4 Community based on the overlapping community detection algorithm based on k-faction

4

75108

8599

103

82

41

73

536

11

80 95 102

8330

81

3120

56

36

94

1710

105

142

524

93

67

49

58

92

87

113

4576

3219

35

72

62

55 100

922

69

112

52

879

78 23

109

57

66

97

28

8821

71

63

77 18

96

11439 44

43

86

27

13

15

37

110

90

34

38

2

104

26

46

106

89

84

50

544768

115

74111

33

6516

3

107

101

48

61

14

740

59

60

64

98

2012

7091

5125

Figure 5 American College Football network divided into 13 communities

6 Mobile Information Systems

while the original network has two communities But itcan be found that the four communities are dividedinto two subcommunities in two communities in theoriginal network Node 4 and node 9 are overlappingnodes

-ese three experiments are the results of communitydivision showing a better effect compared with the originalnetwork and there is a higher accuracy rate

-is paper analyzes the modularity Qc Shen putforward the criterion for evaluating the closeness ofoverlapping community indicating that the greater the Qcwas the closer the community was the sparser thecommunity is connected the better the community di-vision results are Based on this view this paper calculatesthe modularity of the above two algorithms and comparesthem with other overlapping community detection al-gorithms in detail

-ese algorithms are proposed by Shen et al [18] andZhang et al [19] Evans and Lambiotte [20] and Duanbinget al [21] proposed overlapping community detection al-gorithm and the results of the experimental comparison areshown in Table 1

Algorithm one is the overlapping community detectionalgorithm based on the center node algorithm two repre-sents the overlapping community detection algorithm basedon k-faction

From Table 1 the overlapping community detectionalgorithm based on k-faction has a better effect For the34-node Club network the algorithm in the module ofShen Zhang and Evans and other algorithms proposedwere slightly smaller and for the Football network and

Dolphins network this algorithm is better than otheralgorithms

-erefore the overlapping community detection algo-rithm based on the k-faction has a better effect whether it isfrom the accuracy of the community division or the angle ofthe module

3 Collaborative Filtering RecommendationAlgorithm Based on theCommunity Detection

31 Basic Idea Aiming at the problems in the traditionalcollaborative filtering recommendation system a collabo-rative recommendation algorithm based on communitydetection is proposed in this paper In collaborative filteringrecommendation algorithms based on the community de-tection the users themselves contain their own feature in-formation User attribute content largely reflects the similar

45

5939

13

15

50 47

5462

44 51

3435

1721

37 3841 9

4

16

25

36

30

19

56

22

524660

51224

53

148

113

43

29

31

20

8

27

4940

5855

2

28

57

7

10

6

61

33 14

23 32

2618

42

Figure 6 Dolphins network divided into four communities

Table 1 Comparison of algorithm modules

Dataset34-nodeClub

network

115-nodeFootballnetwork

62-nodeDolphinsnetwork

Algorithm 1 045 057 051Algorithm 2 042 060 055Actual network partitioning 042 056 040Shen 067 mdash 054Zhang 042 058 mdashEvans 046 mdash mdashChen 038 mdash 033

Mobile Information Systems 7

relationship between friends At the same time the userrsquosfeature attributes are stable and can reflect the relationshipbetween users -erefore in the collaborative filtering rec-ommendation the userrsquos tag information feature is partic-ularly important

By introducing the user feature attribute the similarityof the feature attributes of the userrsquos neighbors canbe extracted and the weight of the unrelated items of thetarget item in the similar items can be reduced so that thecalculation of the neighbor user set is more accurate Rec-ommendation items in collaborative filtering recommen-dation based on community detection are classified and it isshown in the following equation

I C1 cup C2 cup cupCK

C1 i11 i12 i1ji1113966 1113967

CK ik1 ik2 ikjk1113966 1113967

(4)

where U represents the set of all users in the system and CK

indicates the kth class

32 User Feature Matrix Construction In the case of similarcalculations the scoring value of the improvement factorshould be fully considered -erefore this paper designs animproved user similarity calculation method based on thecollaborative filtering recommendation system Using thelinear combinationmethod the corresponding weightX is setand the traditional item H similarity result and the user labelcategory similarity result are combined to jointly serve as thesimilarity between the two items -e equation is as follows

sim(i j) (1minus λ) times simR(i j) + λ times simitemminuscate(i j) (5)

where simR(i j) is the similarity calculation of the userscoring matrix according to the traditional three similaritymeasurement methods

Sarwar used the smallest dataset of MovieLens tocompare the three similarities and used MAE as a mea-surement indicator -e experimental results show that theoptimal MAE can be obtained by using the corrected cosinesimilarity for scoring prediction

-e experiment is similar to the dataset in the paper-erefore this paper also uses the modified cosine similarityas the item similar calculation method

simitemminuscate(i j) is used to calculate the similarity in theuser-category matrix Since the matrix table is a binary valueit is reasonable to use the cosine similarity calculation -ecalculation equation is as follows

sim(i j)cate cos( irarr

jrarr

) i

rarrmiddot jrarr

irarr

middot jrarr

(6)

where irarr

and jrarr

are the corresponding row vectors of user i

and user j in the user-category matrix respectively andindicate the type contained by user i and user j

Community detection algorithm is used to divide agiven target social network into a number of communitiesthat match the actual situation Each community representsa social circle that is not used by everyone Potential

collaborative filtering is recommended for target users inthese relatively closely related and smaller communities Inthis way the subsequently constructed user-label matrixshrinks the matrix size Moreover most of the nodes in thecommunity have similar tag attributes so the sparseness ofthe collaborative filtering matrix can be reduced Construct auser-label scoring matrix using the proposed user-labelassociation strong concept in the community where thetarget user is located and calculate the similarity between theuser and other users in the community according to theimproved similarity calculation formula -e target userrsquoscommunity the user-label association strength concept isconstructed by using the proposed user-label associationstrength concept and the similarity between the user andother users in the community is calculated according to theimproved similarity calculation formula -en recommendthe top K users with the highest similarity as potentialfriends for the target users

4 Experiment Design and Discussion

41 Experimental Dataset In order to test the algorithm ofthis paper we have downloaded the classic MovieLens fromthe Internet and dataset and stored in local database

MovieLens (httpmovielensumnedu) was found in1997 to a recommendation system web film based on thenumber of users tens of thousands of daily access to thesystem and the system can accept user score of the film andmovie recommendation list for the user Currently the sitersquosusers have more than 43000 people -e user rating of thefilm has more than 3500

We selected 943 users from the user-rating database and1682 movies a total of 100000 scoring information each ofwhich has at least 20 score information score from 1 to 5 ofthe integer the greater the score the higher the degree of theuserrsquos preference for the film conversely the lower

42 Experimental Evaluation Index Different similarityindicators will result in different prediction scores Meanabsolute error (MAE) and root mean square error (RMSE)are two common indicators for measuring the accuracy ofsimilarity methods -e smaller the value the better theprediction accuracy -e usual approach is to divide theoriginal dataset into the training set and test set then use thetraining set to get the prediction model and the test set toevaluate the prediction results that is the prediction resultsand the actual score of the MAE and RMSE values It isdefined as follows

(1) Mean absolute error (MAE) is the average of theabsolute error of the userrsquos predicted score rui

prime andthe true score rui in the scoring test set

MAE 1

|p|1113944

(ui)isinprui minus ruiprime

11138681113868111386811138681113868111386811138681113868 (7)

(2) Root-mean-square error (RMSE) is the mean squareroot of the true score value rui and the predictedscore value rui

prime of the user in the test set

8 Mobile Information Systems

RMSE

1|p|

1113944(ui)isinp

rui minus ruiprime( 1113857

2

11139741113972

(8)

|p| in equations (7) and (8) represents the size of thetest set

43 Experimental Program To facilitate the evaluation ofperformance indicators the original MovieLens datasetsare divided into training sets and test sets -e authorrandomly selects a part from the user scoring set as thetraining set and the remaining scoring data shall becomethe test set -e training set is used for model trainingwhereas the test set is used for testing the pros and cons-is paper tests the collaborative filtering algorithm basedon the community detection as well as MAE and RMSEindicators of the collaborative filtering algorithm by usingPearsonrsquos correlation and cosine similarity -ere are twofactors significantly affected by the collaborative filteringrecommendation the scarcity of the dataset and the size ofneighbor user sets (ie the value of k)-ese two factors arealso considered in the validation of the collaborative fil-tering algorithm

Based on the above factors this paper designed thefollowing experimental scheme

Experiment 1 select different proportions of trainingsets and test sets in the MovieLens dataset If theneighbor user set size k is constant consider comparingthe performance of the above algorithms in the case ofdifferent data sparsity -ereby it is possible to verifythe extent to which the recommendation effect is af-fected under the condition that the system obtains avalid amount of informationExperiment 2 select a certain percentage of trainingand test sets in the MovieLens dataset When theneighbor user set size k is changed the performance ofthe above algorithm is compared to verify the influenceof the nearest neighbor user set size on the recom-mendation effect

In order to avoid the training set and test set of a certainpartition the experimental results are brought by chanceWe use the same method to divide the MovieLens datasetinto 5 and get 5 sets of different training sets and thecorresponding test sets So each training set and test set has5 experiments and take the average results of the 5 ex-periments is at the final result of the experiment

44 Experimental Results and Analysis In order to suc-cessfully carry out the experiment the need is to set someparameters In this paper we recommend that the size |U| ofthe candidate user setU in the algorithm be set to about 30of the number of users in the training set For conveniencethe collaborative filtering algorithm based on communitydetection in this paper uses CFCD (collaborative filteringcommunity detection) -e cooperative filtering algorithm

based on cosine similarity is expressed by CFC (collaborativefiltering cosine) -e collaborative filtering algorithm basedon Pearsonrsquos correlation is expressed by CFP (collaborativefiltering Pearson)

Experiment 1 with the MovieLens dataset the size ofthe training set is changed if the neighbor user set sizek value remains unchanged Table 2 is based on thepremise that the neighbor user set size k is 30 -eresults of experiments on the proportions of eachtraining set are shown in Table 2

We draw the line and bar chart corresponding to Ta-ble 2 -e details are shown in Figures 7 and 8 It can beseen from Figure 7 that the MAE value of the CFCDalgorithm proposed in this paper is smaller than that ofthe CFP algorithm and CFC algorithm Figure 8 showsthat the RMSE value of the CFCD algorithm proposedin this paper is also smaller than that of the CFP al-gorithm and the CFC algorithm -us the CFCD al-gorithm is better than the CFP algorithm and CFCalgorithmExperiment 2 with the MovieLens dataset in thecase of a certain size of the training set to change thesize of the K value of the nearest neighbor user setTable 3 shows the comparison of MAE and RMSEvalues at different k values for the training set ratio of80

We draw out Table 3 corresponding to the line chart andbar chart in detail It is shown in Figures 9 and 10

Experimental results show the MAE and RMSE values ofthe collaborative recommendation algorithm based oncommunity detection are smaller than the other two algo-rithms -e reaction is on the line graph and in the case of acertain proportion of the training set the value of k in-creases -e MAE and RSME polylines of CFCD are belowthose of CFC and CFP indicating that the proposed algo-rithm CFCD is better than the other two algorithms

When k takes about 30 in fact the MAE and RSME ofthe three algorithms are the smallest when the size of theneighbor user set is 30 the recommended effect is bestFigures 9 and 10 are the corresponding line graphs ofTable 3

It can be seen that CFCD algorithmrsquos MAE and RMSEvalues are lower than those of the other two algorithms Itcan be shown that the CFCD algorithm is better than theCFC and CFP algorithms

5 Conclusions

-is paper introduces social network-related technologiesinto collaborative recommendation technology and pro-poses a collaborative recommendation algorithm basedon community detection In order to solve the problemsexisting in the traditional and recommended technolo-gies this paper proposes a collaborative recommendationmethod based on community detection based on com-munity discovery technology and collaborative recom-mendation technology in social networks -e algorithm

Mobile Information Systems 9

adopts the method of rough selection of neighboring usersets which eectively reduce the system calculation timeand at the same time the scoring preprocessing method isadopted to prevent the problem caused by data sparsenesse performance of the proposed algorithm is veried byexperiments is paper designed two experiments usingthe MovieLens dataset e performance indicators MAE

and RMSE of the community-based collaborative recom-mendation technology and the collaborative recommen-dation technique based on Pearsonrsquos similarity and cosinesimilarity are tested separately e experimental resultsshow that the proposed technique is superior to the per-formance of collaborative recommendation based onPearsonrsquos similarity and Cosine similarity

Table 2 Experimental results for dierent training set ratios(k 30)

Ratio oftraining set ()

MAE RMSECFCD CFC CFP CFCD CFC CFP

20 0814 0847 0847 0995 1011 100130 0803 0848 0825 0965 0995 098740 0794 0834 0814 0968 0978 096850 0780 0835 0800 0917 0957 093560 0771 0830 0798 0867 0934 091470 0770 0828 0787 0827 0899 086780 0765 0823 0777 0812 0844 0822

MA

E

Training set ratio ()

CFCDCFPCFC

087

085

083

079

077

075

081

200 40 60 10080

Figure 7 MAE results for training set ratio changes (k 30)

20 30Training set ratio ()

40 50 60 70 80

087

085

083

079

077

075

081

RMSE

CFCDCFPCFC

Figure 8 RMSE results for training set ratio changes (k 30)

083082081080079078077076

0 20 40 60 80 100 120

MA

E

k

CFCDCFPCFC

Figure 9 K value variation of the MAE results (training set ratio of80)

10 30

085

084

083

081

08

079

082

k

RMSE

CFCDCFPCFC

20 40 50 60 70 80 90 100

Figure 10 K value variation of the RMSE results (training set ratioof 80)

Table 3 Experimental results for the training set ratio of 80

KMAE RMSE

CFCD CFC CFP CFCD CFC CFP10 0772 0829 0708 0819 0845 082320 0767 0823 0777 0812 0846 082430 0765 0824 0776 0813 0845 082740 0763 0826 0778 0815 0847 082350 0771 0823 0779 0816 0847 082460 0772 0826 0780 0817 0848 082570 0775 0824 0781 0816 0845 082680 0778 0823 0782 0817 0849 082790 0776 0824 0783 0818 0846 0824100 0781 0831 0783 0819 0851 0827

10 Mobile Information Systems

Further research will be carried out in the followingareas

(1) Information acquisition in addition to obtainingbasic user information it is necessary to dig out moreinformation implied by users In case if the new userrating information is insufficient the userrsquos non-evaluation information may be considered as theuserrsquos supplementary information so that the newuser can be more accurately recommended

(2) Recommended technical aspects in order to improvethe accuracy and scalability of the recommendationsystem and ensure that the system performs real-timerecommendation under large-scale data it is neces-sary to conduct in-depth exploration and research onthe recommended technology

Data Availability

-e data used to support the findings of this study are in-cluded within the article Readers can access the data sup-porting the conclusions of the study fromMovieLens (httpmovielensumnedu)

Conflicts of Interest

-e authors declare that they have no conflicts of interest

References

[1] H Zarzour F M Mohamed S Chaouki and C ChemamldquoAn improved collaborative filtering recommendationalgorithm for big datardquo Computational Intelligence and ItsApplications pp 660ndash668 2018

[2] H Zarzour Z A Sharif M A Ayyoub and Y Jararweh ldquoAnew collaborative filtering recommendation algorithm basedon dimensionality reduction and clustering techniquesrdquo inProceedings of the 2018 9th International Conference on In-formation and Communication Systems (ICICS) pp 102ndash106Lille France April 2018

[3] Z Chen Z Ning Q Xiong and M S Obaidat ldquoA collab-orative filtering recommendation-based scheme for WLANswith differentiated access servicerdquo IEEE Systems Journalvol 12 no 1 pp 1004ndash1014 2018

[4] C Lu ldquoRecommendation algorithm based on collaborativefiltering under social network environmentrdquo BioTechnologyAn Indian Journal vol 10 no 13 pp 7087ndash7092 2014

[5] Q-J Tan H Wu C Wang and Q Guo ldquoA personalizedhybrid recommendation strategy based on user behaviors andits applicationrdquo in Proceedings of the 2017 InternationalConference on Security Pattern Analysis and Cyberneticspp 181ndash186 SPAC Shenzhen China December 2017

[6] L Guo J LiangYing ZY Luo and L Sun ldquoCollaborativefiltering recommendation based on trust and emotionrdquoJournal of Intelligent Information Systems pp 1ndash23 2018

[7] J Aranda I Givoni J Handcock and D Tarlow An OnlineSocial Network-Based Recommendation System University ofToronto Ontario Canada 2007

[8] J Stan F Muhlenbach and C Largeron ldquoRecommendersystems using social network analysis challenges and futuretrendsrdquo in Encyclopedia of Social Network Analysis andMining pp 1ndash22 Springer New York NY USA 2014

[9] K G Saranya and G Sudha Sadasivam ldquoPersonalized newsarticle recommendation with novelty using collaborativefiltering based rough set theoryrdquo Mobile Networks andApplications vol 22 no 4 pp 719ndash729 2017

[10] L O Colombo-Mendoza R Valencia-Garcıa R Colomo-Palacios and G Alor-Hernandez ldquoA knowledge-basedmulti-criteria collaborative filtering approach for discover-ing services in mobile cloud computing platformsrdquo Journal ofIntelligent Information Systems pp 1ndash25 2018

[11] H-G Ko I-Y Ko and D Lee ldquoMulti-criteria matrixlocalization and integration for personalized collaborativefiltering in IoT environmentsrdquo Multimedia Tools and Appli-cations vol 77 no 4 pp 4697ndash4730 2018

[12] A M Rashid K L Shyong and KL George ldquoA highlyscalable hybrid model amp memory-based CF algorithmrdquo inProceeding of the WEBKDD of ACM Macau China June2014

[13] G-R Xue C Lin Q Yang et al ldquoScalable collaborativefiltering using cluster-based smoothingrdquo in Proceedings ofthe 20th Annual International ACM SIGIR Conferencepp 15ndash19 Tampere Finland August 2015

[14] A Clauset ldquoFinding local community structure in networksrdquoPhysical Review E vol 72 no 2 pp 56ndash70 2015

[15] W W Zachary ldquoAn information flow model for conflict andfission in small groupsrdquo Journal of Anthropological Researchvol 33 no 4 pp 452ndash473 1977

[16] M Girvan and M E J Newman ldquoCommunity structure insocial and biological networksrdquo Proceedings of the NationalAcademy of Sciences vol 99 no 12 pp 7821ndash7826 2012

[17] M E J Newman and M Girvan ldquoFinding and evaluatingcommunity structure in networksrdquo Physical Review E vol 69no 2 pp 25ndash32 2014

[18] H W Shen X-Q Cheng and J F Guo ldquoQuantifying andidentifying the overlapping community structure in net-worksrdquo Journal of Statistical Mechanics eory and Experi-ment vol 2009 no 7 Article ID P07042 42 pages 2009

[19] S Zhang R-S Wang and X-S zhang ldquoIdentification ofoverlapping community structure in complex networks usingfuzzy-means clusteringrdquo Physica A Statistical Mechanics andIts Applications vol 374 no 1 pp 483ndash490 2007

[20] T S Evans and R Lambiotte ldquoLine graphs link partitionsand overlapping communitiesrdquo Physical Review vol 80 no 1pp 145ndash148 2009

[21] C Duanbing Y Fu and M S Shang ldquoAn efficient algorithmfor overlapping community detection in complexrdquo GlobalCongress on Intelligent Systems pp 244ndash247 2009

Mobile Information Systems 11

Computer Games Technology

International Journal of

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom

Journal ofEngineeringVolume 2018

Advances in

FuzzySystems

Hindawiwwwhindawicom

Volume 2018

International Journal of

ReconfigurableComputing

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Applied Computational Intelligence and Soft Computing

thinspAdvancesthinspinthinsp

thinspArtificial Intelligence

Hindawiwwwhindawicom Volumethinsp2018

Hindawiwwwhindawicom Volume 2018

Civil EngineeringAdvances in

Hindawiwwwhindawicom Volume 2018

Electrical and Computer Engineering

Journal of

Journal of

Computer Networks and Communications

Hindawiwwwhindawicom Volume 2018

Hindawi

wwwhindawicom Volume 2018

Advances in

Multimedia

International Journal of

Biomedical Imaging

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Engineering Mathematics

International Journal of

RoboticsJournal of

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Computational Intelligence and Neuroscience

Hindawiwwwhindawicom Volume 2018

Mathematical Problems in Engineering

Modelling ampSimulationin EngineeringHindawiwwwhindawicom Volume 2018

Hindawi Publishing Corporation httpwwwhindawicom Volume 2013Hindawiwwwhindawicom

The Scientific World Journal

Volume 2018

Hindawiwwwhindawicom Volume 2018

Human-ComputerInteraction

Advances in

Hindawiwwwhindawicom Volume 2018

Scientic Programming

Submit your manuscripts atwwwhindawicom

Page 3: AnImprovedCollaborativeFilteringRecommendation ...downloads.hindawi.com/journals/misy/2019/3560968.pdf · the collaborative recommendation technology and the continuous expansion

Ci and Cj merge -e specific process of communityadjustment is as follows

Step 1 to calculate the degree of overlap S betweenCi and Cj in any two communitiesStep 2 Ci and Cj are merged into a communitywhen the overlap degree of S is greater than thethreshold value of TStep 3 if any of the two communities overlapcoefficient is less than the threshold T adjust theend otherwise return to step 1 continue to adjust

(3) Experimental analysis and comparison we selectthree classic network datasets

(1) 34-node Zacharyrsquos Karate Club network [15](2) 115-node American College Football network

[16](3) 62-node Dolphins network [17]

Experiment 1 Experiment 2 and Experiment 3 respec-tively are testing in the datasets (1) (2) and (3) on the centernode based on the overlapping community detection algorithm

Experiment 1 the dataset Zachary Karate Club networkcontains 34 nodes -e real network contains twocommunities-rough the community dividing according to theoverlapping the community detection algorithm basedon central nodes the original network is divided intofour communities where nodes 9 10 and 31 areoverlapping nodes -ere is no significant differencebetween the final community results dividing accordingto the algorithm and two communities of the realnetwork Two communities in the funnel network areresegmented-e node sets (5 6 7 11 and 17) and (2526 29 and 32) are extracted as separate communitiesand the four communities in Figure 1 are built In factthe node set (5 6 7 11 and 17) is closely connected andcan be extracted as a separate community the node set(25 26 29 and 32) is less connected to other nodes inthe original community and can also be used as aseparate community Moreover nodes 9 10 and 31 areboundary nodes of two communities and have anequivalent contribution to two communities so theycan belong to two communities simultaneously It isshown in Figure 1Experiment 2 actually American College Footballnetwork has 115 nodes and the real network contains12 independent communities -is paper takes ad-vantage of the overlapping community based on centralnodes mines the American College Football networkthrough the detection algorithm and mines 10 com-munities in total It is shown in Figure 2-e vast majority of 10 communities dividing accordingto the overlapping community detection algorithmbased on central nodes are consistent with the AmericanCollege Football network of which seven communitieshave exactly the same nodes and the node divisionaccuracy rate is up to 85 or higher In the actual

network one community has fewer internal edges thanthe one between communities -erefore when usingthe proposed algorithm this paper shall distribute allnodes within the community to other communities -ered nodes (60 43 91 5 94 15 59 and 98) in Figure 2represent overlapping nodes between two communities-rough careful analysis it is found that those nodeshave larger contributions to the communities they be-long to -erefore the overlapping nodes are consistentwith the actual situationExperiment 3 the dataset Dolphins network is thecategoric dolphin network -e actual network is di-vided into two communities-e community is dividedby taking advantage of the proposed algorithm andfour communities are obtained It is shown in Figure 3-e overlapping community detection algorithm basedon central nodes divides the Dolphins network into twocommunities where nodes 8 and 40 are overlappingnodes Compared with two communities in the actualnetwork except the overlapping nodes the results ofthe other nonnodes are identical -e nodes 8 and 40have the same number of connections with respect tocommunications and have the same contribution to thetwo communities so it is natural for them to belong totwo communities as overlapping nodes

22 Overlapping CommunityDetectionAlgorithmBased on k-Faction

221 Basic Idea In a sense the community can be regardedas a set of interconnected ldquosmall fully coupled networksrdquo-ese ldquofully coupled networkrdquo is called the ldquofactionrdquo wherek-faction indicates the number of nodes in the fully couplednetwork is k

-is paper introduces the overlapping community de-tection algorithm based on k-faction -e algorithm firstdetects all the k-faction from the network as the initialcommunity Afterwards this paper merges the initialcommunity yet there are still some nodes not adding to anyfaction By adding these nodes to the most closely connectedcommunity this paper will get all communities on thenetwork To make the community division results moreaccurate the algorithm finally further optimizes the com-munity and the optimization criteria are to make themodularity achieve the maximum

222 Basic Concept

(1) Faction the complete graph of the network(2) -e largest faction the network does not belong to

any other faction(3) Overlap it is defined as the number of nodes in the

community of twotwo communities with a smallnumber of nodes in the community

(4) Connectivity between communities it is definedas the number of connected sides of the two

Mobile Information Systems 3

communitiesthe total number of connections withinthe two communities

(5) e point of contact with the community is closelyrelated to M the number of vertices and edges ofthe community and the number of vertices in thecommunity

(6) Module Degree Qc

Qc 1Lsumi

sumvw

avciawci Avw minuskvkwL

( ) (3)

223 Algorithm Flow e algorithm is divided intofour processes First process from the network to nd allthe 2 factions more than the largest faction and these

11

5

717

68

3

14 4

13 12

18

22

202

110

9

31

26 2529

3233

21151916

24

23

27

30

28

34

Figure 1 Overlapping community detection algorithm based on the center node

28 63

96

7

77

11488

2197

57

18

66

34

26

110 238

104

46

90 106

112

869

9

23

109

22

78

5279

105

4294

10

17

1 24

91

5

8180

30

83

36

95

102

56

20

31

43

59

60

98

15

7025 85 53

99

103

108

82

7311

756

1229

51

4

41

374558

64

11367 76

49

92

8793

115

47

89

50

74 84

54

111

68

16

314

61

65

7 107 40

48

101

33

Figure 2 American College Football network divided into 10 communities

4 Mobile Information Systems

factions as the initial community second processaccording to the community between the point and edgeoverlap to merge community third process join the nodesto join the most close community community expansionfourth process tuning of the second phase of thecommunity

(1) Finding the largest faction BronndashKerbosch algo-rithm is used to nd the maximum clique in thenetwork and the maximum number of nodes in thenetwork is more than 3 as the initial community

(2) Merger community because there are a number ofoverlapping nodes in all of the biggest factions thesefactions can be combined into a community whenthe overlap is over a threshold value of T in additionif the connection between the two communities iscloser than the threshold value of CONN the twocommunities should also be merged into the samecommunity e combined community consists oftwo steps the rst step is based on the point of themerger the second step is based on the combinationof edge

(3) Experimental analysis and comparison we selectthree classic network datasets

(1) 34-node Zacharyrsquos Karate Club network(2) 115-node American College Football network(3) 62-node Dolphins network

Experiment 4 Experiment 5 and Experiment 6 re-spectively are testing in the datasets (1) (2) and (3) on thecenter node based on the overlapping community detection

algorithm e eect of the algorithm is investigated Ex-perimental parameters are setting CONN 05 and T 065

Experiment 4 the classic dataset Zachary Karate Clubnetwork contains 34 nodes By using the overlappingcommunity detection algorithm based on k faction thealgorithm is divided into two communities as shown inFigure 4From Figure 4 we can learn that the proposed algo-rithm is almost identical to the original network exceptfor the node 10 In fact 10 nodes and two communityconnection numbers as overlapping nodes were addedin two communities in accordance with the actualsituationExperiment 5 American College Football network isbased on the overlapping community detection algo-rithm based on k-faction e network is divided intothe community and the results are shown in Figure 5From Figure 5 we can see that a total of 13 commu-nities and the original network compared to a total of87 of the nodes are correctly divided compared to theoriginal network of which 7 communities are exactlythe same and there are 4 almost identical In allcommunities there is not any overlapping nodeExperiment 6 dataset Dolphins network is a classicdolphin network e actual network is divided intotwo communities Use this section to carry out com-munity division and get four communities as shown inFigure 6e overlapping community detection algorithm basedon the k-faction is divided into four communities

282

27

26

3223

14

7

1857

55

49

58

20

42

610

33

61

40

8

45

21

2931

37

48

35

17

38

3

1

4324

12

5

19

56

25

3622

30

1652

4611

60

9

4

51

44

53

5462

13

1541

34

4750

3959

Figure 3 Dolphins network divided into 2 communities

Mobile Information Systems 5

33

29

30 32

26

28

26

27

34 9

15 2331

16

21

19

24

3

10

2

188 22

12

20

14

1

11

4 13

5

7

17

6

Figure 4 Community based on the overlapping community detection algorithm based on k-faction

4

75108

8599

103

82

41

73

536

11

80 95 102

8330

81

3120

56

36

94

1710

105

142

524

93

67

49

58

92

87

113

4576

3219

35

72

62

55 100

922

69

112

52

879

78 23

109

57

66

97

28

8821

71

63

77 18

96

11439 44

43

86

27

13

15

37

110

90

34

38

2

104

26

46

106

89

84

50

544768

115

74111

33

6516

3

107

101

48

61

14

740

59

60

64

98

2012

7091

5125

Figure 5 American College Football network divided into 13 communities

6 Mobile Information Systems

while the original network has two communities But itcan be found that the four communities are dividedinto two subcommunities in two communities in theoriginal network Node 4 and node 9 are overlappingnodes

-ese three experiments are the results of communitydivision showing a better effect compared with the originalnetwork and there is a higher accuracy rate

-is paper analyzes the modularity Qc Shen putforward the criterion for evaluating the closeness ofoverlapping community indicating that the greater the Qcwas the closer the community was the sparser thecommunity is connected the better the community di-vision results are Based on this view this paper calculatesthe modularity of the above two algorithms and comparesthem with other overlapping community detection al-gorithms in detail

-ese algorithms are proposed by Shen et al [18] andZhang et al [19] Evans and Lambiotte [20] and Duanbinget al [21] proposed overlapping community detection al-gorithm and the results of the experimental comparison areshown in Table 1

Algorithm one is the overlapping community detectionalgorithm based on the center node algorithm two repre-sents the overlapping community detection algorithm basedon k-faction

From Table 1 the overlapping community detectionalgorithm based on k-faction has a better effect For the34-node Club network the algorithm in the module ofShen Zhang and Evans and other algorithms proposedwere slightly smaller and for the Football network and

Dolphins network this algorithm is better than otheralgorithms

-erefore the overlapping community detection algo-rithm based on the k-faction has a better effect whether it isfrom the accuracy of the community division or the angle ofthe module

3 Collaborative Filtering RecommendationAlgorithm Based on theCommunity Detection

31 Basic Idea Aiming at the problems in the traditionalcollaborative filtering recommendation system a collabo-rative recommendation algorithm based on communitydetection is proposed in this paper In collaborative filteringrecommendation algorithms based on the community de-tection the users themselves contain their own feature in-formation User attribute content largely reflects the similar

45

5939

13

15

50 47

5462

44 51

3435

1721

37 3841 9

4

16

25

36

30

19

56

22

524660

51224

53

148

113

43

29

31

20

8

27

4940

5855

2

28

57

7

10

6

61

33 14

23 32

2618

42

Figure 6 Dolphins network divided into four communities

Table 1 Comparison of algorithm modules

Dataset34-nodeClub

network

115-nodeFootballnetwork

62-nodeDolphinsnetwork

Algorithm 1 045 057 051Algorithm 2 042 060 055Actual network partitioning 042 056 040Shen 067 mdash 054Zhang 042 058 mdashEvans 046 mdash mdashChen 038 mdash 033

Mobile Information Systems 7

relationship between friends At the same time the userrsquosfeature attributes are stable and can reflect the relationshipbetween users -erefore in the collaborative filtering rec-ommendation the userrsquos tag information feature is partic-ularly important

By introducing the user feature attribute the similarityof the feature attributes of the userrsquos neighbors canbe extracted and the weight of the unrelated items of thetarget item in the similar items can be reduced so that thecalculation of the neighbor user set is more accurate Rec-ommendation items in collaborative filtering recommen-dation based on community detection are classified and it isshown in the following equation

I C1 cup C2 cup cupCK

C1 i11 i12 i1ji1113966 1113967

CK ik1 ik2 ikjk1113966 1113967

(4)

where U represents the set of all users in the system and CK

indicates the kth class

32 User Feature Matrix Construction In the case of similarcalculations the scoring value of the improvement factorshould be fully considered -erefore this paper designs animproved user similarity calculation method based on thecollaborative filtering recommendation system Using thelinear combinationmethod the corresponding weightX is setand the traditional item H similarity result and the user labelcategory similarity result are combined to jointly serve as thesimilarity between the two items -e equation is as follows

sim(i j) (1minus λ) times simR(i j) + λ times simitemminuscate(i j) (5)

where simR(i j) is the similarity calculation of the userscoring matrix according to the traditional three similaritymeasurement methods

Sarwar used the smallest dataset of MovieLens tocompare the three similarities and used MAE as a mea-surement indicator -e experimental results show that theoptimal MAE can be obtained by using the corrected cosinesimilarity for scoring prediction

-e experiment is similar to the dataset in the paper-erefore this paper also uses the modified cosine similarityas the item similar calculation method

simitemminuscate(i j) is used to calculate the similarity in theuser-category matrix Since the matrix table is a binary valueit is reasonable to use the cosine similarity calculation -ecalculation equation is as follows

sim(i j)cate cos( irarr

jrarr

) i

rarrmiddot jrarr

irarr

middot jrarr

(6)

where irarr

and jrarr

are the corresponding row vectors of user i

and user j in the user-category matrix respectively andindicate the type contained by user i and user j

Community detection algorithm is used to divide agiven target social network into a number of communitiesthat match the actual situation Each community representsa social circle that is not used by everyone Potential

collaborative filtering is recommended for target users inthese relatively closely related and smaller communities Inthis way the subsequently constructed user-label matrixshrinks the matrix size Moreover most of the nodes in thecommunity have similar tag attributes so the sparseness ofthe collaborative filtering matrix can be reduced Construct auser-label scoring matrix using the proposed user-labelassociation strong concept in the community where thetarget user is located and calculate the similarity between theuser and other users in the community according to theimproved similarity calculation formula -e target userrsquoscommunity the user-label association strength concept isconstructed by using the proposed user-label associationstrength concept and the similarity between the user andother users in the community is calculated according to theimproved similarity calculation formula -en recommendthe top K users with the highest similarity as potentialfriends for the target users

4 Experiment Design and Discussion

41 Experimental Dataset In order to test the algorithm ofthis paper we have downloaded the classic MovieLens fromthe Internet and dataset and stored in local database

MovieLens (httpmovielensumnedu) was found in1997 to a recommendation system web film based on thenumber of users tens of thousands of daily access to thesystem and the system can accept user score of the film andmovie recommendation list for the user Currently the sitersquosusers have more than 43000 people -e user rating of thefilm has more than 3500

We selected 943 users from the user-rating database and1682 movies a total of 100000 scoring information each ofwhich has at least 20 score information score from 1 to 5 ofthe integer the greater the score the higher the degree of theuserrsquos preference for the film conversely the lower

42 Experimental Evaluation Index Different similarityindicators will result in different prediction scores Meanabsolute error (MAE) and root mean square error (RMSE)are two common indicators for measuring the accuracy ofsimilarity methods -e smaller the value the better theprediction accuracy -e usual approach is to divide theoriginal dataset into the training set and test set then use thetraining set to get the prediction model and the test set toevaluate the prediction results that is the prediction resultsand the actual score of the MAE and RMSE values It isdefined as follows

(1) Mean absolute error (MAE) is the average of theabsolute error of the userrsquos predicted score rui

prime andthe true score rui in the scoring test set

MAE 1

|p|1113944

(ui)isinprui minus ruiprime

11138681113868111386811138681113868111386811138681113868 (7)

(2) Root-mean-square error (RMSE) is the mean squareroot of the true score value rui and the predictedscore value rui

prime of the user in the test set

8 Mobile Information Systems

RMSE

1|p|

1113944(ui)isinp

rui minus ruiprime( 1113857

2

11139741113972

(8)

|p| in equations (7) and (8) represents the size of thetest set

43 Experimental Program To facilitate the evaluation ofperformance indicators the original MovieLens datasetsare divided into training sets and test sets -e authorrandomly selects a part from the user scoring set as thetraining set and the remaining scoring data shall becomethe test set -e training set is used for model trainingwhereas the test set is used for testing the pros and cons-is paper tests the collaborative filtering algorithm basedon the community detection as well as MAE and RMSEindicators of the collaborative filtering algorithm by usingPearsonrsquos correlation and cosine similarity -ere are twofactors significantly affected by the collaborative filteringrecommendation the scarcity of the dataset and the size ofneighbor user sets (ie the value of k)-ese two factors arealso considered in the validation of the collaborative fil-tering algorithm

Based on the above factors this paper designed thefollowing experimental scheme

Experiment 1 select different proportions of trainingsets and test sets in the MovieLens dataset If theneighbor user set size k is constant consider comparingthe performance of the above algorithms in the case ofdifferent data sparsity -ereby it is possible to verifythe extent to which the recommendation effect is af-fected under the condition that the system obtains avalid amount of informationExperiment 2 select a certain percentage of trainingand test sets in the MovieLens dataset When theneighbor user set size k is changed the performance ofthe above algorithm is compared to verify the influenceof the nearest neighbor user set size on the recom-mendation effect

In order to avoid the training set and test set of a certainpartition the experimental results are brought by chanceWe use the same method to divide the MovieLens datasetinto 5 and get 5 sets of different training sets and thecorresponding test sets So each training set and test set has5 experiments and take the average results of the 5 ex-periments is at the final result of the experiment

44 Experimental Results and Analysis In order to suc-cessfully carry out the experiment the need is to set someparameters In this paper we recommend that the size |U| ofthe candidate user setU in the algorithm be set to about 30of the number of users in the training set For conveniencethe collaborative filtering algorithm based on communitydetection in this paper uses CFCD (collaborative filteringcommunity detection) -e cooperative filtering algorithm

based on cosine similarity is expressed by CFC (collaborativefiltering cosine) -e collaborative filtering algorithm basedon Pearsonrsquos correlation is expressed by CFP (collaborativefiltering Pearson)

Experiment 1 with the MovieLens dataset the size ofthe training set is changed if the neighbor user set sizek value remains unchanged Table 2 is based on thepremise that the neighbor user set size k is 30 -eresults of experiments on the proportions of eachtraining set are shown in Table 2

We draw the line and bar chart corresponding to Ta-ble 2 -e details are shown in Figures 7 and 8 It can beseen from Figure 7 that the MAE value of the CFCDalgorithm proposed in this paper is smaller than that ofthe CFP algorithm and CFC algorithm Figure 8 showsthat the RMSE value of the CFCD algorithm proposedin this paper is also smaller than that of the CFP al-gorithm and the CFC algorithm -us the CFCD al-gorithm is better than the CFP algorithm and CFCalgorithmExperiment 2 with the MovieLens dataset in thecase of a certain size of the training set to change thesize of the K value of the nearest neighbor user setTable 3 shows the comparison of MAE and RMSEvalues at different k values for the training set ratio of80

We draw out Table 3 corresponding to the line chart andbar chart in detail It is shown in Figures 9 and 10

Experimental results show the MAE and RMSE values ofthe collaborative recommendation algorithm based oncommunity detection are smaller than the other two algo-rithms -e reaction is on the line graph and in the case of acertain proportion of the training set the value of k in-creases -e MAE and RSME polylines of CFCD are belowthose of CFC and CFP indicating that the proposed algo-rithm CFCD is better than the other two algorithms

When k takes about 30 in fact the MAE and RSME ofthe three algorithms are the smallest when the size of theneighbor user set is 30 the recommended effect is bestFigures 9 and 10 are the corresponding line graphs ofTable 3

It can be seen that CFCD algorithmrsquos MAE and RMSEvalues are lower than those of the other two algorithms Itcan be shown that the CFCD algorithm is better than theCFC and CFP algorithms

5 Conclusions

-is paper introduces social network-related technologiesinto collaborative recommendation technology and pro-poses a collaborative recommendation algorithm basedon community detection In order to solve the problemsexisting in the traditional and recommended technolo-gies this paper proposes a collaborative recommendationmethod based on community detection based on com-munity discovery technology and collaborative recom-mendation technology in social networks -e algorithm

Mobile Information Systems 9

adopts the method of rough selection of neighboring usersets which eectively reduce the system calculation timeand at the same time the scoring preprocessing method isadopted to prevent the problem caused by data sparsenesse performance of the proposed algorithm is veried byexperiments is paper designed two experiments usingthe MovieLens dataset e performance indicators MAE

and RMSE of the community-based collaborative recom-mendation technology and the collaborative recommen-dation technique based on Pearsonrsquos similarity and cosinesimilarity are tested separately e experimental resultsshow that the proposed technique is superior to the per-formance of collaborative recommendation based onPearsonrsquos similarity and Cosine similarity

Table 2 Experimental results for dierent training set ratios(k 30)

Ratio oftraining set ()

MAE RMSECFCD CFC CFP CFCD CFC CFP

20 0814 0847 0847 0995 1011 100130 0803 0848 0825 0965 0995 098740 0794 0834 0814 0968 0978 096850 0780 0835 0800 0917 0957 093560 0771 0830 0798 0867 0934 091470 0770 0828 0787 0827 0899 086780 0765 0823 0777 0812 0844 0822

MA

E

Training set ratio ()

CFCDCFPCFC

087

085

083

079

077

075

081

200 40 60 10080

Figure 7 MAE results for training set ratio changes (k 30)

20 30Training set ratio ()

40 50 60 70 80

087

085

083

079

077

075

081

RMSE

CFCDCFPCFC

Figure 8 RMSE results for training set ratio changes (k 30)

083082081080079078077076

0 20 40 60 80 100 120

MA

E

k

CFCDCFPCFC

Figure 9 K value variation of the MAE results (training set ratio of80)

10 30

085

084

083

081

08

079

082

k

RMSE

CFCDCFPCFC

20 40 50 60 70 80 90 100

Figure 10 K value variation of the RMSE results (training set ratioof 80)

Table 3 Experimental results for the training set ratio of 80

KMAE RMSE

CFCD CFC CFP CFCD CFC CFP10 0772 0829 0708 0819 0845 082320 0767 0823 0777 0812 0846 082430 0765 0824 0776 0813 0845 082740 0763 0826 0778 0815 0847 082350 0771 0823 0779 0816 0847 082460 0772 0826 0780 0817 0848 082570 0775 0824 0781 0816 0845 082680 0778 0823 0782 0817 0849 082790 0776 0824 0783 0818 0846 0824100 0781 0831 0783 0819 0851 0827

10 Mobile Information Systems

Further research will be carried out in the followingareas

(1) Information acquisition in addition to obtainingbasic user information it is necessary to dig out moreinformation implied by users In case if the new userrating information is insufficient the userrsquos non-evaluation information may be considered as theuserrsquos supplementary information so that the newuser can be more accurately recommended

(2) Recommended technical aspects in order to improvethe accuracy and scalability of the recommendationsystem and ensure that the system performs real-timerecommendation under large-scale data it is neces-sary to conduct in-depth exploration and research onthe recommended technology

Data Availability

-e data used to support the findings of this study are in-cluded within the article Readers can access the data sup-porting the conclusions of the study fromMovieLens (httpmovielensumnedu)

Conflicts of Interest

-e authors declare that they have no conflicts of interest

References

[1] H Zarzour F M Mohamed S Chaouki and C ChemamldquoAn improved collaborative filtering recommendationalgorithm for big datardquo Computational Intelligence and ItsApplications pp 660ndash668 2018

[2] H Zarzour Z A Sharif M A Ayyoub and Y Jararweh ldquoAnew collaborative filtering recommendation algorithm basedon dimensionality reduction and clustering techniquesrdquo inProceedings of the 2018 9th International Conference on In-formation and Communication Systems (ICICS) pp 102ndash106Lille France April 2018

[3] Z Chen Z Ning Q Xiong and M S Obaidat ldquoA collab-orative filtering recommendation-based scheme for WLANswith differentiated access servicerdquo IEEE Systems Journalvol 12 no 1 pp 1004ndash1014 2018

[4] C Lu ldquoRecommendation algorithm based on collaborativefiltering under social network environmentrdquo BioTechnologyAn Indian Journal vol 10 no 13 pp 7087ndash7092 2014

[5] Q-J Tan H Wu C Wang and Q Guo ldquoA personalizedhybrid recommendation strategy based on user behaviors andits applicationrdquo in Proceedings of the 2017 InternationalConference on Security Pattern Analysis and Cyberneticspp 181ndash186 SPAC Shenzhen China December 2017

[6] L Guo J LiangYing ZY Luo and L Sun ldquoCollaborativefiltering recommendation based on trust and emotionrdquoJournal of Intelligent Information Systems pp 1ndash23 2018

[7] J Aranda I Givoni J Handcock and D Tarlow An OnlineSocial Network-Based Recommendation System University ofToronto Ontario Canada 2007

[8] J Stan F Muhlenbach and C Largeron ldquoRecommendersystems using social network analysis challenges and futuretrendsrdquo in Encyclopedia of Social Network Analysis andMining pp 1ndash22 Springer New York NY USA 2014

[9] K G Saranya and G Sudha Sadasivam ldquoPersonalized newsarticle recommendation with novelty using collaborativefiltering based rough set theoryrdquo Mobile Networks andApplications vol 22 no 4 pp 719ndash729 2017

[10] L O Colombo-Mendoza R Valencia-Garcıa R Colomo-Palacios and G Alor-Hernandez ldquoA knowledge-basedmulti-criteria collaborative filtering approach for discover-ing services in mobile cloud computing platformsrdquo Journal ofIntelligent Information Systems pp 1ndash25 2018

[11] H-G Ko I-Y Ko and D Lee ldquoMulti-criteria matrixlocalization and integration for personalized collaborativefiltering in IoT environmentsrdquo Multimedia Tools and Appli-cations vol 77 no 4 pp 4697ndash4730 2018

[12] A M Rashid K L Shyong and KL George ldquoA highlyscalable hybrid model amp memory-based CF algorithmrdquo inProceeding of the WEBKDD of ACM Macau China June2014

[13] G-R Xue C Lin Q Yang et al ldquoScalable collaborativefiltering using cluster-based smoothingrdquo in Proceedings ofthe 20th Annual International ACM SIGIR Conferencepp 15ndash19 Tampere Finland August 2015

[14] A Clauset ldquoFinding local community structure in networksrdquoPhysical Review E vol 72 no 2 pp 56ndash70 2015

[15] W W Zachary ldquoAn information flow model for conflict andfission in small groupsrdquo Journal of Anthropological Researchvol 33 no 4 pp 452ndash473 1977

[16] M Girvan and M E J Newman ldquoCommunity structure insocial and biological networksrdquo Proceedings of the NationalAcademy of Sciences vol 99 no 12 pp 7821ndash7826 2012

[17] M E J Newman and M Girvan ldquoFinding and evaluatingcommunity structure in networksrdquo Physical Review E vol 69no 2 pp 25ndash32 2014

[18] H W Shen X-Q Cheng and J F Guo ldquoQuantifying andidentifying the overlapping community structure in net-worksrdquo Journal of Statistical Mechanics eory and Experi-ment vol 2009 no 7 Article ID P07042 42 pages 2009

[19] S Zhang R-S Wang and X-S zhang ldquoIdentification ofoverlapping community structure in complex networks usingfuzzy-means clusteringrdquo Physica A Statistical Mechanics andIts Applications vol 374 no 1 pp 483ndash490 2007

[20] T S Evans and R Lambiotte ldquoLine graphs link partitionsand overlapping communitiesrdquo Physical Review vol 80 no 1pp 145ndash148 2009

[21] C Duanbing Y Fu and M S Shang ldquoAn efficient algorithmfor overlapping community detection in complexrdquo GlobalCongress on Intelligent Systems pp 244ndash247 2009

Mobile Information Systems 11

Computer Games Technology

International Journal of

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom

Journal ofEngineeringVolume 2018

Advances in

FuzzySystems

Hindawiwwwhindawicom

Volume 2018

International Journal of

ReconfigurableComputing

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Applied Computational Intelligence and Soft Computing

thinspAdvancesthinspinthinsp

thinspArtificial Intelligence

Hindawiwwwhindawicom Volumethinsp2018

Hindawiwwwhindawicom Volume 2018

Civil EngineeringAdvances in

Hindawiwwwhindawicom Volume 2018

Electrical and Computer Engineering

Journal of

Journal of

Computer Networks and Communications

Hindawiwwwhindawicom Volume 2018

Hindawi

wwwhindawicom Volume 2018

Advances in

Multimedia

International Journal of

Biomedical Imaging

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Engineering Mathematics

International Journal of

RoboticsJournal of

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Computational Intelligence and Neuroscience

Hindawiwwwhindawicom Volume 2018

Mathematical Problems in Engineering

Modelling ampSimulationin EngineeringHindawiwwwhindawicom Volume 2018

Hindawi Publishing Corporation httpwwwhindawicom Volume 2013Hindawiwwwhindawicom

The Scientific World Journal

Volume 2018

Hindawiwwwhindawicom Volume 2018

Human-ComputerInteraction

Advances in

Hindawiwwwhindawicom Volume 2018

Scientic Programming

Submit your manuscripts atwwwhindawicom

Page 4: AnImprovedCollaborativeFilteringRecommendation ...downloads.hindawi.com/journals/misy/2019/3560968.pdf · the collaborative recommendation technology and the continuous expansion

communitiesthe total number of connections withinthe two communities

(5) e point of contact with the community is closelyrelated to M the number of vertices and edges ofthe community and the number of vertices in thecommunity

(6) Module Degree Qc

Qc 1Lsumi

sumvw

avciawci Avw minuskvkwL

( ) (3)

223 Algorithm Flow e algorithm is divided intofour processes First process from the network to nd allthe 2 factions more than the largest faction and these

11

5

717

68

3

14 4

13 12

18

22

202

110

9

31

26 2529

3233

21151916

24

23

27

30

28

34

Figure 1 Overlapping community detection algorithm based on the center node

28 63

96

7

77

11488

2197

57

18

66

34

26

110 238

104

46

90 106

112

869

9

23

109

22

78

5279

105

4294

10

17

1 24

91

5

8180

30

83

36

95

102

56

20

31

43

59

60

98

15

7025 85 53

99

103

108

82

7311

756

1229

51

4

41

374558

64

11367 76

49

92

8793

115

47

89

50

74 84

54

111

68

16

314

61

65

7 107 40

48

101

33

Figure 2 American College Football network divided into 10 communities

4 Mobile Information Systems

factions as the initial community second processaccording to the community between the point and edgeoverlap to merge community third process join the nodesto join the most close community community expansionfourth process tuning of the second phase of thecommunity

(1) Finding the largest faction BronndashKerbosch algo-rithm is used to nd the maximum clique in thenetwork and the maximum number of nodes in thenetwork is more than 3 as the initial community

(2) Merger community because there are a number ofoverlapping nodes in all of the biggest factions thesefactions can be combined into a community whenthe overlap is over a threshold value of T in additionif the connection between the two communities iscloser than the threshold value of CONN the twocommunities should also be merged into the samecommunity e combined community consists oftwo steps the rst step is based on the point of themerger the second step is based on the combinationof edge

(3) Experimental analysis and comparison we selectthree classic network datasets

(1) 34-node Zacharyrsquos Karate Club network(2) 115-node American College Football network(3) 62-node Dolphins network

Experiment 4 Experiment 5 and Experiment 6 re-spectively are testing in the datasets (1) (2) and (3) on thecenter node based on the overlapping community detection

algorithm e eect of the algorithm is investigated Ex-perimental parameters are setting CONN 05 and T 065

Experiment 4 the classic dataset Zachary Karate Clubnetwork contains 34 nodes By using the overlappingcommunity detection algorithm based on k faction thealgorithm is divided into two communities as shown inFigure 4From Figure 4 we can learn that the proposed algo-rithm is almost identical to the original network exceptfor the node 10 In fact 10 nodes and two communityconnection numbers as overlapping nodes were addedin two communities in accordance with the actualsituationExperiment 5 American College Football network isbased on the overlapping community detection algo-rithm based on k-faction e network is divided intothe community and the results are shown in Figure 5From Figure 5 we can see that a total of 13 commu-nities and the original network compared to a total of87 of the nodes are correctly divided compared to theoriginal network of which 7 communities are exactlythe same and there are 4 almost identical In allcommunities there is not any overlapping nodeExperiment 6 dataset Dolphins network is a classicdolphin network e actual network is divided intotwo communities Use this section to carry out com-munity division and get four communities as shown inFigure 6e overlapping community detection algorithm basedon the k-faction is divided into four communities

282

27

26

3223

14

7

1857

55

49

58

20

42

610

33

61

40

8

45

21

2931

37

48

35

17

38

3

1

4324

12

5

19

56

25

3622

30

1652

4611

60

9

4

51

44

53

5462

13

1541

34

4750

3959

Figure 3 Dolphins network divided into 2 communities

Mobile Information Systems 5

33

29

30 32

26

28

26

27

34 9

15 2331

16

21

19

24

3

10

2

188 22

12

20

14

1

11

4 13

5

7

17

6

Figure 4 Community based on the overlapping community detection algorithm based on k-faction

4

75108

8599

103

82

41

73

536

11

80 95 102

8330

81

3120

56

36

94

1710

105

142

524

93

67

49

58

92

87

113

4576

3219

35

72

62

55 100

922

69

112

52

879

78 23

109

57

66

97

28

8821

71

63

77 18

96

11439 44

43

86

27

13

15

37

110

90

34

38

2

104

26

46

106

89

84

50

544768

115

74111

33

6516

3

107

101

48

61

14

740

59

60

64

98

2012

7091

5125

Figure 5 American College Football network divided into 13 communities

6 Mobile Information Systems

while the original network has two communities But itcan be found that the four communities are dividedinto two subcommunities in two communities in theoriginal network Node 4 and node 9 are overlappingnodes

-ese three experiments are the results of communitydivision showing a better effect compared with the originalnetwork and there is a higher accuracy rate

-is paper analyzes the modularity Qc Shen putforward the criterion for evaluating the closeness ofoverlapping community indicating that the greater the Qcwas the closer the community was the sparser thecommunity is connected the better the community di-vision results are Based on this view this paper calculatesthe modularity of the above two algorithms and comparesthem with other overlapping community detection al-gorithms in detail

-ese algorithms are proposed by Shen et al [18] andZhang et al [19] Evans and Lambiotte [20] and Duanbinget al [21] proposed overlapping community detection al-gorithm and the results of the experimental comparison areshown in Table 1

Algorithm one is the overlapping community detectionalgorithm based on the center node algorithm two repre-sents the overlapping community detection algorithm basedon k-faction

From Table 1 the overlapping community detectionalgorithm based on k-faction has a better effect For the34-node Club network the algorithm in the module ofShen Zhang and Evans and other algorithms proposedwere slightly smaller and for the Football network and

Dolphins network this algorithm is better than otheralgorithms

-erefore the overlapping community detection algo-rithm based on the k-faction has a better effect whether it isfrom the accuracy of the community division or the angle ofthe module

3 Collaborative Filtering RecommendationAlgorithm Based on theCommunity Detection

31 Basic Idea Aiming at the problems in the traditionalcollaborative filtering recommendation system a collabo-rative recommendation algorithm based on communitydetection is proposed in this paper In collaborative filteringrecommendation algorithms based on the community de-tection the users themselves contain their own feature in-formation User attribute content largely reflects the similar

45

5939

13

15

50 47

5462

44 51

3435

1721

37 3841 9

4

16

25

36

30

19

56

22

524660

51224

53

148

113

43

29

31

20

8

27

4940

5855

2

28

57

7

10

6

61

33 14

23 32

2618

42

Figure 6 Dolphins network divided into four communities

Table 1 Comparison of algorithm modules

Dataset34-nodeClub

network

115-nodeFootballnetwork

62-nodeDolphinsnetwork

Algorithm 1 045 057 051Algorithm 2 042 060 055Actual network partitioning 042 056 040Shen 067 mdash 054Zhang 042 058 mdashEvans 046 mdash mdashChen 038 mdash 033

Mobile Information Systems 7

relationship between friends At the same time the userrsquosfeature attributes are stable and can reflect the relationshipbetween users -erefore in the collaborative filtering rec-ommendation the userrsquos tag information feature is partic-ularly important

By introducing the user feature attribute the similarityof the feature attributes of the userrsquos neighbors canbe extracted and the weight of the unrelated items of thetarget item in the similar items can be reduced so that thecalculation of the neighbor user set is more accurate Rec-ommendation items in collaborative filtering recommen-dation based on community detection are classified and it isshown in the following equation

I C1 cup C2 cup cupCK

C1 i11 i12 i1ji1113966 1113967

CK ik1 ik2 ikjk1113966 1113967

(4)

where U represents the set of all users in the system and CK

indicates the kth class

32 User Feature Matrix Construction In the case of similarcalculations the scoring value of the improvement factorshould be fully considered -erefore this paper designs animproved user similarity calculation method based on thecollaborative filtering recommendation system Using thelinear combinationmethod the corresponding weightX is setand the traditional item H similarity result and the user labelcategory similarity result are combined to jointly serve as thesimilarity between the two items -e equation is as follows

sim(i j) (1minus λ) times simR(i j) + λ times simitemminuscate(i j) (5)

where simR(i j) is the similarity calculation of the userscoring matrix according to the traditional three similaritymeasurement methods

Sarwar used the smallest dataset of MovieLens tocompare the three similarities and used MAE as a mea-surement indicator -e experimental results show that theoptimal MAE can be obtained by using the corrected cosinesimilarity for scoring prediction

-e experiment is similar to the dataset in the paper-erefore this paper also uses the modified cosine similarityas the item similar calculation method

simitemminuscate(i j) is used to calculate the similarity in theuser-category matrix Since the matrix table is a binary valueit is reasonable to use the cosine similarity calculation -ecalculation equation is as follows

sim(i j)cate cos( irarr

jrarr

) i

rarrmiddot jrarr

irarr

middot jrarr

(6)

where irarr

and jrarr

are the corresponding row vectors of user i

and user j in the user-category matrix respectively andindicate the type contained by user i and user j

Community detection algorithm is used to divide agiven target social network into a number of communitiesthat match the actual situation Each community representsa social circle that is not used by everyone Potential

collaborative filtering is recommended for target users inthese relatively closely related and smaller communities Inthis way the subsequently constructed user-label matrixshrinks the matrix size Moreover most of the nodes in thecommunity have similar tag attributes so the sparseness ofthe collaborative filtering matrix can be reduced Construct auser-label scoring matrix using the proposed user-labelassociation strong concept in the community where thetarget user is located and calculate the similarity between theuser and other users in the community according to theimproved similarity calculation formula -e target userrsquoscommunity the user-label association strength concept isconstructed by using the proposed user-label associationstrength concept and the similarity between the user andother users in the community is calculated according to theimproved similarity calculation formula -en recommendthe top K users with the highest similarity as potentialfriends for the target users

4 Experiment Design and Discussion

41 Experimental Dataset In order to test the algorithm ofthis paper we have downloaded the classic MovieLens fromthe Internet and dataset and stored in local database

MovieLens (httpmovielensumnedu) was found in1997 to a recommendation system web film based on thenumber of users tens of thousands of daily access to thesystem and the system can accept user score of the film andmovie recommendation list for the user Currently the sitersquosusers have more than 43000 people -e user rating of thefilm has more than 3500

We selected 943 users from the user-rating database and1682 movies a total of 100000 scoring information each ofwhich has at least 20 score information score from 1 to 5 ofthe integer the greater the score the higher the degree of theuserrsquos preference for the film conversely the lower

42 Experimental Evaluation Index Different similarityindicators will result in different prediction scores Meanabsolute error (MAE) and root mean square error (RMSE)are two common indicators for measuring the accuracy ofsimilarity methods -e smaller the value the better theprediction accuracy -e usual approach is to divide theoriginal dataset into the training set and test set then use thetraining set to get the prediction model and the test set toevaluate the prediction results that is the prediction resultsand the actual score of the MAE and RMSE values It isdefined as follows

(1) Mean absolute error (MAE) is the average of theabsolute error of the userrsquos predicted score rui

prime andthe true score rui in the scoring test set

MAE 1

|p|1113944

(ui)isinprui minus ruiprime

11138681113868111386811138681113868111386811138681113868 (7)

(2) Root-mean-square error (RMSE) is the mean squareroot of the true score value rui and the predictedscore value rui

prime of the user in the test set

8 Mobile Information Systems

RMSE

1|p|

1113944(ui)isinp

rui minus ruiprime( 1113857

2

11139741113972

(8)

|p| in equations (7) and (8) represents the size of thetest set

43 Experimental Program To facilitate the evaluation ofperformance indicators the original MovieLens datasetsare divided into training sets and test sets -e authorrandomly selects a part from the user scoring set as thetraining set and the remaining scoring data shall becomethe test set -e training set is used for model trainingwhereas the test set is used for testing the pros and cons-is paper tests the collaborative filtering algorithm basedon the community detection as well as MAE and RMSEindicators of the collaborative filtering algorithm by usingPearsonrsquos correlation and cosine similarity -ere are twofactors significantly affected by the collaborative filteringrecommendation the scarcity of the dataset and the size ofneighbor user sets (ie the value of k)-ese two factors arealso considered in the validation of the collaborative fil-tering algorithm

Based on the above factors this paper designed thefollowing experimental scheme

Experiment 1 select different proportions of trainingsets and test sets in the MovieLens dataset If theneighbor user set size k is constant consider comparingthe performance of the above algorithms in the case ofdifferent data sparsity -ereby it is possible to verifythe extent to which the recommendation effect is af-fected under the condition that the system obtains avalid amount of informationExperiment 2 select a certain percentage of trainingand test sets in the MovieLens dataset When theneighbor user set size k is changed the performance ofthe above algorithm is compared to verify the influenceof the nearest neighbor user set size on the recom-mendation effect

In order to avoid the training set and test set of a certainpartition the experimental results are brought by chanceWe use the same method to divide the MovieLens datasetinto 5 and get 5 sets of different training sets and thecorresponding test sets So each training set and test set has5 experiments and take the average results of the 5 ex-periments is at the final result of the experiment

44 Experimental Results and Analysis In order to suc-cessfully carry out the experiment the need is to set someparameters In this paper we recommend that the size |U| ofthe candidate user setU in the algorithm be set to about 30of the number of users in the training set For conveniencethe collaborative filtering algorithm based on communitydetection in this paper uses CFCD (collaborative filteringcommunity detection) -e cooperative filtering algorithm

based on cosine similarity is expressed by CFC (collaborativefiltering cosine) -e collaborative filtering algorithm basedon Pearsonrsquos correlation is expressed by CFP (collaborativefiltering Pearson)

Experiment 1 with the MovieLens dataset the size ofthe training set is changed if the neighbor user set sizek value remains unchanged Table 2 is based on thepremise that the neighbor user set size k is 30 -eresults of experiments on the proportions of eachtraining set are shown in Table 2

We draw the line and bar chart corresponding to Ta-ble 2 -e details are shown in Figures 7 and 8 It can beseen from Figure 7 that the MAE value of the CFCDalgorithm proposed in this paper is smaller than that ofthe CFP algorithm and CFC algorithm Figure 8 showsthat the RMSE value of the CFCD algorithm proposedin this paper is also smaller than that of the CFP al-gorithm and the CFC algorithm -us the CFCD al-gorithm is better than the CFP algorithm and CFCalgorithmExperiment 2 with the MovieLens dataset in thecase of a certain size of the training set to change thesize of the K value of the nearest neighbor user setTable 3 shows the comparison of MAE and RMSEvalues at different k values for the training set ratio of80

We draw out Table 3 corresponding to the line chart andbar chart in detail It is shown in Figures 9 and 10

Experimental results show the MAE and RMSE values ofthe collaborative recommendation algorithm based oncommunity detection are smaller than the other two algo-rithms -e reaction is on the line graph and in the case of acertain proportion of the training set the value of k in-creases -e MAE and RSME polylines of CFCD are belowthose of CFC and CFP indicating that the proposed algo-rithm CFCD is better than the other two algorithms

When k takes about 30 in fact the MAE and RSME ofthe three algorithms are the smallest when the size of theneighbor user set is 30 the recommended effect is bestFigures 9 and 10 are the corresponding line graphs ofTable 3

It can be seen that CFCD algorithmrsquos MAE and RMSEvalues are lower than those of the other two algorithms Itcan be shown that the CFCD algorithm is better than theCFC and CFP algorithms

5 Conclusions

-is paper introduces social network-related technologiesinto collaborative recommendation technology and pro-poses a collaborative recommendation algorithm basedon community detection In order to solve the problemsexisting in the traditional and recommended technolo-gies this paper proposes a collaborative recommendationmethod based on community detection based on com-munity discovery technology and collaborative recom-mendation technology in social networks -e algorithm

Mobile Information Systems 9

adopts the method of rough selection of neighboring usersets which eectively reduce the system calculation timeand at the same time the scoring preprocessing method isadopted to prevent the problem caused by data sparsenesse performance of the proposed algorithm is veried byexperiments is paper designed two experiments usingthe MovieLens dataset e performance indicators MAE

and RMSE of the community-based collaborative recom-mendation technology and the collaborative recommen-dation technique based on Pearsonrsquos similarity and cosinesimilarity are tested separately e experimental resultsshow that the proposed technique is superior to the per-formance of collaborative recommendation based onPearsonrsquos similarity and Cosine similarity

Table 2 Experimental results for dierent training set ratios(k 30)

Ratio oftraining set ()

MAE RMSECFCD CFC CFP CFCD CFC CFP

20 0814 0847 0847 0995 1011 100130 0803 0848 0825 0965 0995 098740 0794 0834 0814 0968 0978 096850 0780 0835 0800 0917 0957 093560 0771 0830 0798 0867 0934 091470 0770 0828 0787 0827 0899 086780 0765 0823 0777 0812 0844 0822

MA

E

Training set ratio ()

CFCDCFPCFC

087

085

083

079

077

075

081

200 40 60 10080

Figure 7 MAE results for training set ratio changes (k 30)

20 30Training set ratio ()

40 50 60 70 80

087

085

083

079

077

075

081

RMSE

CFCDCFPCFC

Figure 8 RMSE results for training set ratio changes (k 30)

083082081080079078077076

0 20 40 60 80 100 120

MA

E

k

CFCDCFPCFC

Figure 9 K value variation of the MAE results (training set ratio of80)

10 30

085

084

083

081

08

079

082

k

RMSE

CFCDCFPCFC

20 40 50 60 70 80 90 100

Figure 10 K value variation of the RMSE results (training set ratioof 80)

Table 3 Experimental results for the training set ratio of 80

KMAE RMSE

CFCD CFC CFP CFCD CFC CFP10 0772 0829 0708 0819 0845 082320 0767 0823 0777 0812 0846 082430 0765 0824 0776 0813 0845 082740 0763 0826 0778 0815 0847 082350 0771 0823 0779 0816 0847 082460 0772 0826 0780 0817 0848 082570 0775 0824 0781 0816 0845 082680 0778 0823 0782 0817 0849 082790 0776 0824 0783 0818 0846 0824100 0781 0831 0783 0819 0851 0827

10 Mobile Information Systems

Further research will be carried out in the followingareas

(1) Information acquisition in addition to obtainingbasic user information it is necessary to dig out moreinformation implied by users In case if the new userrating information is insufficient the userrsquos non-evaluation information may be considered as theuserrsquos supplementary information so that the newuser can be more accurately recommended

(2) Recommended technical aspects in order to improvethe accuracy and scalability of the recommendationsystem and ensure that the system performs real-timerecommendation under large-scale data it is neces-sary to conduct in-depth exploration and research onthe recommended technology

Data Availability

-e data used to support the findings of this study are in-cluded within the article Readers can access the data sup-porting the conclusions of the study fromMovieLens (httpmovielensumnedu)

Conflicts of Interest

-e authors declare that they have no conflicts of interest

References

[1] H Zarzour F M Mohamed S Chaouki and C ChemamldquoAn improved collaborative filtering recommendationalgorithm for big datardquo Computational Intelligence and ItsApplications pp 660ndash668 2018

[2] H Zarzour Z A Sharif M A Ayyoub and Y Jararweh ldquoAnew collaborative filtering recommendation algorithm basedon dimensionality reduction and clustering techniquesrdquo inProceedings of the 2018 9th International Conference on In-formation and Communication Systems (ICICS) pp 102ndash106Lille France April 2018

[3] Z Chen Z Ning Q Xiong and M S Obaidat ldquoA collab-orative filtering recommendation-based scheme for WLANswith differentiated access servicerdquo IEEE Systems Journalvol 12 no 1 pp 1004ndash1014 2018

[4] C Lu ldquoRecommendation algorithm based on collaborativefiltering under social network environmentrdquo BioTechnologyAn Indian Journal vol 10 no 13 pp 7087ndash7092 2014

[5] Q-J Tan H Wu C Wang and Q Guo ldquoA personalizedhybrid recommendation strategy based on user behaviors andits applicationrdquo in Proceedings of the 2017 InternationalConference on Security Pattern Analysis and Cyberneticspp 181ndash186 SPAC Shenzhen China December 2017

[6] L Guo J LiangYing ZY Luo and L Sun ldquoCollaborativefiltering recommendation based on trust and emotionrdquoJournal of Intelligent Information Systems pp 1ndash23 2018

[7] J Aranda I Givoni J Handcock and D Tarlow An OnlineSocial Network-Based Recommendation System University ofToronto Ontario Canada 2007

[8] J Stan F Muhlenbach and C Largeron ldquoRecommendersystems using social network analysis challenges and futuretrendsrdquo in Encyclopedia of Social Network Analysis andMining pp 1ndash22 Springer New York NY USA 2014

[9] K G Saranya and G Sudha Sadasivam ldquoPersonalized newsarticle recommendation with novelty using collaborativefiltering based rough set theoryrdquo Mobile Networks andApplications vol 22 no 4 pp 719ndash729 2017

[10] L O Colombo-Mendoza R Valencia-Garcıa R Colomo-Palacios and G Alor-Hernandez ldquoA knowledge-basedmulti-criteria collaborative filtering approach for discover-ing services in mobile cloud computing platformsrdquo Journal ofIntelligent Information Systems pp 1ndash25 2018

[11] H-G Ko I-Y Ko and D Lee ldquoMulti-criteria matrixlocalization and integration for personalized collaborativefiltering in IoT environmentsrdquo Multimedia Tools and Appli-cations vol 77 no 4 pp 4697ndash4730 2018

[12] A M Rashid K L Shyong and KL George ldquoA highlyscalable hybrid model amp memory-based CF algorithmrdquo inProceeding of the WEBKDD of ACM Macau China June2014

[13] G-R Xue C Lin Q Yang et al ldquoScalable collaborativefiltering using cluster-based smoothingrdquo in Proceedings ofthe 20th Annual International ACM SIGIR Conferencepp 15ndash19 Tampere Finland August 2015

[14] A Clauset ldquoFinding local community structure in networksrdquoPhysical Review E vol 72 no 2 pp 56ndash70 2015

[15] W W Zachary ldquoAn information flow model for conflict andfission in small groupsrdquo Journal of Anthropological Researchvol 33 no 4 pp 452ndash473 1977

[16] M Girvan and M E J Newman ldquoCommunity structure insocial and biological networksrdquo Proceedings of the NationalAcademy of Sciences vol 99 no 12 pp 7821ndash7826 2012

[17] M E J Newman and M Girvan ldquoFinding and evaluatingcommunity structure in networksrdquo Physical Review E vol 69no 2 pp 25ndash32 2014

[18] H W Shen X-Q Cheng and J F Guo ldquoQuantifying andidentifying the overlapping community structure in net-worksrdquo Journal of Statistical Mechanics eory and Experi-ment vol 2009 no 7 Article ID P07042 42 pages 2009

[19] S Zhang R-S Wang and X-S zhang ldquoIdentification ofoverlapping community structure in complex networks usingfuzzy-means clusteringrdquo Physica A Statistical Mechanics andIts Applications vol 374 no 1 pp 483ndash490 2007

[20] T S Evans and R Lambiotte ldquoLine graphs link partitionsand overlapping communitiesrdquo Physical Review vol 80 no 1pp 145ndash148 2009

[21] C Duanbing Y Fu and M S Shang ldquoAn efficient algorithmfor overlapping community detection in complexrdquo GlobalCongress on Intelligent Systems pp 244ndash247 2009

Mobile Information Systems 11

Computer Games Technology

International Journal of

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom

Journal ofEngineeringVolume 2018

Advances in

FuzzySystems

Hindawiwwwhindawicom

Volume 2018

International Journal of

ReconfigurableComputing

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Applied Computational Intelligence and Soft Computing

thinspAdvancesthinspinthinsp

thinspArtificial Intelligence

Hindawiwwwhindawicom Volumethinsp2018

Hindawiwwwhindawicom Volume 2018

Civil EngineeringAdvances in

Hindawiwwwhindawicom Volume 2018

Electrical and Computer Engineering

Journal of

Journal of

Computer Networks and Communications

Hindawiwwwhindawicom Volume 2018

Hindawi

wwwhindawicom Volume 2018

Advances in

Multimedia

International Journal of

Biomedical Imaging

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Engineering Mathematics

International Journal of

RoboticsJournal of

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Computational Intelligence and Neuroscience

Hindawiwwwhindawicom Volume 2018

Mathematical Problems in Engineering

Modelling ampSimulationin EngineeringHindawiwwwhindawicom Volume 2018

Hindawi Publishing Corporation httpwwwhindawicom Volume 2013Hindawiwwwhindawicom

The Scientific World Journal

Volume 2018

Hindawiwwwhindawicom Volume 2018

Human-ComputerInteraction

Advances in

Hindawiwwwhindawicom Volume 2018

Scientic Programming

Submit your manuscripts atwwwhindawicom

Page 5: AnImprovedCollaborativeFilteringRecommendation ...downloads.hindawi.com/journals/misy/2019/3560968.pdf · the collaborative recommendation technology and the continuous expansion

factions as the initial community second processaccording to the community between the point and edgeoverlap to merge community third process join the nodesto join the most close community community expansionfourth process tuning of the second phase of thecommunity

(1) Finding the largest faction BronndashKerbosch algo-rithm is used to nd the maximum clique in thenetwork and the maximum number of nodes in thenetwork is more than 3 as the initial community

(2) Merger community because there are a number ofoverlapping nodes in all of the biggest factions thesefactions can be combined into a community whenthe overlap is over a threshold value of T in additionif the connection between the two communities iscloser than the threshold value of CONN the twocommunities should also be merged into the samecommunity e combined community consists oftwo steps the rst step is based on the point of themerger the second step is based on the combinationof edge

(3) Experimental analysis and comparison we selectthree classic network datasets

(1) 34-node Zacharyrsquos Karate Club network(2) 115-node American College Football network(3) 62-node Dolphins network

Experiment 4 Experiment 5 and Experiment 6 re-spectively are testing in the datasets (1) (2) and (3) on thecenter node based on the overlapping community detection

algorithm e eect of the algorithm is investigated Ex-perimental parameters are setting CONN 05 and T 065

Experiment 4 the classic dataset Zachary Karate Clubnetwork contains 34 nodes By using the overlappingcommunity detection algorithm based on k faction thealgorithm is divided into two communities as shown inFigure 4From Figure 4 we can learn that the proposed algo-rithm is almost identical to the original network exceptfor the node 10 In fact 10 nodes and two communityconnection numbers as overlapping nodes were addedin two communities in accordance with the actualsituationExperiment 5 American College Football network isbased on the overlapping community detection algo-rithm based on k-faction e network is divided intothe community and the results are shown in Figure 5From Figure 5 we can see that a total of 13 commu-nities and the original network compared to a total of87 of the nodes are correctly divided compared to theoriginal network of which 7 communities are exactlythe same and there are 4 almost identical In allcommunities there is not any overlapping nodeExperiment 6 dataset Dolphins network is a classicdolphin network e actual network is divided intotwo communities Use this section to carry out com-munity division and get four communities as shown inFigure 6e overlapping community detection algorithm basedon the k-faction is divided into four communities

282

27

26

3223

14

7

1857

55

49

58

20

42

610

33

61

40

8

45

21

2931

37

48

35

17

38

3

1

4324

12

5

19

56

25

3622

30

1652

4611

60

9

4

51

44

53

5462

13

1541

34

4750

3959

Figure 3 Dolphins network divided into 2 communities

Mobile Information Systems 5

33

29

30 32

26

28

26

27

34 9

15 2331

16

21

19

24

3

10

2

188 22

12

20

14

1

11

4 13

5

7

17

6

Figure 4 Community based on the overlapping community detection algorithm based on k-faction

4

75108

8599

103

82

41

73

536

11

80 95 102

8330

81

3120

56

36

94

1710

105

142

524

93

67

49

58

92

87

113

4576

3219

35

72

62

55 100

922

69

112

52

879

78 23

109

57

66

97

28

8821

71

63

77 18

96

11439 44

43

86

27

13

15

37

110

90

34

38

2

104

26

46

106

89

84

50

544768

115

74111

33

6516

3

107

101

48

61

14

740

59

60

64

98

2012

7091

5125

Figure 5 American College Football network divided into 13 communities

6 Mobile Information Systems

while the original network has two communities But itcan be found that the four communities are dividedinto two subcommunities in two communities in theoriginal network Node 4 and node 9 are overlappingnodes

-ese three experiments are the results of communitydivision showing a better effect compared with the originalnetwork and there is a higher accuracy rate

-is paper analyzes the modularity Qc Shen putforward the criterion for evaluating the closeness ofoverlapping community indicating that the greater the Qcwas the closer the community was the sparser thecommunity is connected the better the community di-vision results are Based on this view this paper calculatesthe modularity of the above two algorithms and comparesthem with other overlapping community detection al-gorithms in detail

-ese algorithms are proposed by Shen et al [18] andZhang et al [19] Evans and Lambiotte [20] and Duanbinget al [21] proposed overlapping community detection al-gorithm and the results of the experimental comparison areshown in Table 1

Algorithm one is the overlapping community detectionalgorithm based on the center node algorithm two repre-sents the overlapping community detection algorithm basedon k-faction

From Table 1 the overlapping community detectionalgorithm based on k-faction has a better effect For the34-node Club network the algorithm in the module ofShen Zhang and Evans and other algorithms proposedwere slightly smaller and for the Football network and

Dolphins network this algorithm is better than otheralgorithms

-erefore the overlapping community detection algo-rithm based on the k-faction has a better effect whether it isfrom the accuracy of the community division or the angle ofthe module

3 Collaborative Filtering RecommendationAlgorithm Based on theCommunity Detection

31 Basic Idea Aiming at the problems in the traditionalcollaborative filtering recommendation system a collabo-rative recommendation algorithm based on communitydetection is proposed in this paper In collaborative filteringrecommendation algorithms based on the community de-tection the users themselves contain their own feature in-formation User attribute content largely reflects the similar

45

5939

13

15

50 47

5462

44 51

3435

1721

37 3841 9

4

16

25

36

30

19

56

22

524660

51224

53

148

113

43

29

31

20

8

27

4940

5855

2

28

57

7

10

6

61

33 14

23 32

2618

42

Figure 6 Dolphins network divided into four communities

Table 1 Comparison of algorithm modules

Dataset34-nodeClub

network

115-nodeFootballnetwork

62-nodeDolphinsnetwork

Algorithm 1 045 057 051Algorithm 2 042 060 055Actual network partitioning 042 056 040Shen 067 mdash 054Zhang 042 058 mdashEvans 046 mdash mdashChen 038 mdash 033

Mobile Information Systems 7

relationship between friends At the same time the userrsquosfeature attributes are stable and can reflect the relationshipbetween users -erefore in the collaborative filtering rec-ommendation the userrsquos tag information feature is partic-ularly important

By introducing the user feature attribute the similarityof the feature attributes of the userrsquos neighbors canbe extracted and the weight of the unrelated items of thetarget item in the similar items can be reduced so that thecalculation of the neighbor user set is more accurate Rec-ommendation items in collaborative filtering recommen-dation based on community detection are classified and it isshown in the following equation

I C1 cup C2 cup cupCK

C1 i11 i12 i1ji1113966 1113967

CK ik1 ik2 ikjk1113966 1113967

(4)

where U represents the set of all users in the system and CK

indicates the kth class

32 User Feature Matrix Construction In the case of similarcalculations the scoring value of the improvement factorshould be fully considered -erefore this paper designs animproved user similarity calculation method based on thecollaborative filtering recommendation system Using thelinear combinationmethod the corresponding weightX is setand the traditional item H similarity result and the user labelcategory similarity result are combined to jointly serve as thesimilarity between the two items -e equation is as follows

sim(i j) (1minus λ) times simR(i j) + λ times simitemminuscate(i j) (5)

where simR(i j) is the similarity calculation of the userscoring matrix according to the traditional three similaritymeasurement methods

Sarwar used the smallest dataset of MovieLens tocompare the three similarities and used MAE as a mea-surement indicator -e experimental results show that theoptimal MAE can be obtained by using the corrected cosinesimilarity for scoring prediction

-e experiment is similar to the dataset in the paper-erefore this paper also uses the modified cosine similarityas the item similar calculation method

simitemminuscate(i j) is used to calculate the similarity in theuser-category matrix Since the matrix table is a binary valueit is reasonable to use the cosine similarity calculation -ecalculation equation is as follows

sim(i j)cate cos( irarr

jrarr

) i

rarrmiddot jrarr

irarr

middot jrarr

(6)

where irarr

and jrarr

are the corresponding row vectors of user i

and user j in the user-category matrix respectively andindicate the type contained by user i and user j

Community detection algorithm is used to divide agiven target social network into a number of communitiesthat match the actual situation Each community representsa social circle that is not used by everyone Potential

collaborative filtering is recommended for target users inthese relatively closely related and smaller communities Inthis way the subsequently constructed user-label matrixshrinks the matrix size Moreover most of the nodes in thecommunity have similar tag attributes so the sparseness ofthe collaborative filtering matrix can be reduced Construct auser-label scoring matrix using the proposed user-labelassociation strong concept in the community where thetarget user is located and calculate the similarity between theuser and other users in the community according to theimproved similarity calculation formula -e target userrsquoscommunity the user-label association strength concept isconstructed by using the proposed user-label associationstrength concept and the similarity between the user andother users in the community is calculated according to theimproved similarity calculation formula -en recommendthe top K users with the highest similarity as potentialfriends for the target users

4 Experiment Design and Discussion

41 Experimental Dataset In order to test the algorithm ofthis paper we have downloaded the classic MovieLens fromthe Internet and dataset and stored in local database

MovieLens (httpmovielensumnedu) was found in1997 to a recommendation system web film based on thenumber of users tens of thousands of daily access to thesystem and the system can accept user score of the film andmovie recommendation list for the user Currently the sitersquosusers have more than 43000 people -e user rating of thefilm has more than 3500

We selected 943 users from the user-rating database and1682 movies a total of 100000 scoring information each ofwhich has at least 20 score information score from 1 to 5 ofthe integer the greater the score the higher the degree of theuserrsquos preference for the film conversely the lower

42 Experimental Evaluation Index Different similarityindicators will result in different prediction scores Meanabsolute error (MAE) and root mean square error (RMSE)are two common indicators for measuring the accuracy ofsimilarity methods -e smaller the value the better theprediction accuracy -e usual approach is to divide theoriginal dataset into the training set and test set then use thetraining set to get the prediction model and the test set toevaluate the prediction results that is the prediction resultsand the actual score of the MAE and RMSE values It isdefined as follows

(1) Mean absolute error (MAE) is the average of theabsolute error of the userrsquos predicted score rui

prime andthe true score rui in the scoring test set

MAE 1

|p|1113944

(ui)isinprui minus ruiprime

11138681113868111386811138681113868111386811138681113868 (7)

(2) Root-mean-square error (RMSE) is the mean squareroot of the true score value rui and the predictedscore value rui

prime of the user in the test set

8 Mobile Information Systems

RMSE

1|p|

1113944(ui)isinp

rui minus ruiprime( 1113857

2

11139741113972

(8)

|p| in equations (7) and (8) represents the size of thetest set

43 Experimental Program To facilitate the evaluation ofperformance indicators the original MovieLens datasetsare divided into training sets and test sets -e authorrandomly selects a part from the user scoring set as thetraining set and the remaining scoring data shall becomethe test set -e training set is used for model trainingwhereas the test set is used for testing the pros and cons-is paper tests the collaborative filtering algorithm basedon the community detection as well as MAE and RMSEindicators of the collaborative filtering algorithm by usingPearsonrsquos correlation and cosine similarity -ere are twofactors significantly affected by the collaborative filteringrecommendation the scarcity of the dataset and the size ofneighbor user sets (ie the value of k)-ese two factors arealso considered in the validation of the collaborative fil-tering algorithm

Based on the above factors this paper designed thefollowing experimental scheme

Experiment 1 select different proportions of trainingsets and test sets in the MovieLens dataset If theneighbor user set size k is constant consider comparingthe performance of the above algorithms in the case ofdifferent data sparsity -ereby it is possible to verifythe extent to which the recommendation effect is af-fected under the condition that the system obtains avalid amount of informationExperiment 2 select a certain percentage of trainingand test sets in the MovieLens dataset When theneighbor user set size k is changed the performance ofthe above algorithm is compared to verify the influenceof the nearest neighbor user set size on the recom-mendation effect

In order to avoid the training set and test set of a certainpartition the experimental results are brought by chanceWe use the same method to divide the MovieLens datasetinto 5 and get 5 sets of different training sets and thecorresponding test sets So each training set and test set has5 experiments and take the average results of the 5 ex-periments is at the final result of the experiment

44 Experimental Results and Analysis In order to suc-cessfully carry out the experiment the need is to set someparameters In this paper we recommend that the size |U| ofthe candidate user setU in the algorithm be set to about 30of the number of users in the training set For conveniencethe collaborative filtering algorithm based on communitydetection in this paper uses CFCD (collaborative filteringcommunity detection) -e cooperative filtering algorithm

based on cosine similarity is expressed by CFC (collaborativefiltering cosine) -e collaborative filtering algorithm basedon Pearsonrsquos correlation is expressed by CFP (collaborativefiltering Pearson)

Experiment 1 with the MovieLens dataset the size ofthe training set is changed if the neighbor user set sizek value remains unchanged Table 2 is based on thepremise that the neighbor user set size k is 30 -eresults of experiments on the proportions of eachtraining set are shown in Table 2

We draw the line and bar chart corresponding to Ta-ble 2 -e details are shown in Figures 7 and 8 It can beseen from Figure 7 that the MAE value of the CFCDalgorithm proposed in this paper is smaller than that ofthe CFP algorithm and CFC algorithm Figure 8 showsthat the RMSE value of the CFCD algorithm proposedin this paper is also smaller than that of the CFP al-gorithm and the CFC algorithm -us the CFCD al-gorithm is better than the CFP algorithm and CFCalgorithmExperiment 2 with the MovieLens dataset in thecase of a certain size of the training set to change thesize of the K value of the nearest neighbor user setTable 3 shows the comparison of MAE and RMSEvalues at different k values for the training set ratio of80

We draw out Table 3 corresponding to the line chart andbar chart in detail It is shown in Figures 9 and 10

Experimental results show the MAE and RMSE values ofthe collaborative recommendation algorithm based oncommunity detection are smaller than the other two algo-rithms -e reaction is on the line graph and in the case of acertain proportion of the training set the value of k in-creases -e MAE and RSME polylines of CFCD are belowthose of CFC and CFP indicating that the proposed algo-rithm CFCD is better than the other two algorithms

When k takes about 30 in fact the MAE and RSME ofthe three algorithms are the smallest when the size of theneighbor user set is 30 the recommended effect is bestFigures 9 and 10 are the corresponding line graphs ofTable 3

It can be seen that CFCD algorithmrsquos MAE and RMSEvalues are lower than those of the other two algorithms Itcan be shown that the CFCD algorithm is better than theCFC and CFP algorithms

5 Conclusions

-is paper introduces social network-related technologiesinto collaborative recommendation technology and pro-poses a collaborative recommendation algorithm basedon community detection In order to solve the problemsexisting in the traditional and recommended technolo-gies this paper proposes a collaborative recommendationmethod based on community detection based on com-munity discovery technology and collaborative recom-mendation technology in social networks -e algorithm

Mobile Information Systems 9

adopts the method of rough selection of neighboring usersets which eectively reduce the system calculation timeand at the same time the scoring preprocessing method isadopted to prevent the problem caused by data sparsenesse performance of the proposed algorithm is veried byexperiments is paper designed two experiments usingthe MovieLens dataset e performance indicators MAE

and RMSE of the community-based collaborative recom-mendation technology and the collaborative recommen-dation technique based on Pearsonrsquos similarity and cosinesimilarity are tested separately e experimental resultsshow that the proposed technique is superior to the per-formance of collaborative recommendation based onPearsonrsquos similarity and Cosine similarity

Table 2 Experimental results for dierent training set ratios(k 30)

Ratio oftraining set ()

MAE RMSECFCD CFC CFP CFCD CFC CFP

20 0814 0847 0847 0995 1011 100130 0803 0848 0825 0965 0995 098740 0794 0834 0814 0968 0978 096850 0780 0835 0800 0917 0957 093560 0771 0830 0798 0867 0934 091470 0770 0828 0787 0827 0899 086780 0765 0823 0777 0812 0844 0822

MA

E

Training set ratio ()

CFCDCFPCFC

087

085

083

079

077

075

081

200 40 60 10080

Figure 7 MAE results for training set ratio changes (k 30)

20 30Training set ratio ()

40 50 60 70 80

087

085

083

079

077

075

081

RMSE

CFCDCFPCFC

Figure 8 RMSE results for training set ratio changes (k 30)

083082081080079078077076

0 20 40 60 80 100 120

MA

E

k

CFCDCFPCFC

Figure 9 K value variation of the MAE results (training set ratio of80)

10 30

085

084

083

081

08

079

082

k

RMSE

CFCDCFPCFC

20 40 50 60 70 80 90 100

Figure 10 K value variation of the RMSE results (training set ratioof 80)

Table 3 Experimental results for the training set ratio of 80

KMAE RMSE

CFCD CFC CFP CFCD CFC CFP10 0772 0829 0708 0819 0845 082320 0767 0823 0777 0812 0846 082430 0765 0824 0776 0813 0845 082740 0763 0826 0778 0815 0847 082350 0771 0823 0779 0816 0847 082460 0772 0826 0780 0817 0848 082570 0775 0824 0781 0816 0845 082680 0778 0823 0782 0817 0849 082790 0776 0824 0783 0818 0846 0824100 0781 0831 0783 0819 0851 0827

10 Mobile Information Systems

Further research will be carried out in the followingareas

(1) Information acquisition in addition to obtainingbasic user information it is necessary to dig out moreinformation implied by users In case if the new userrating information is insufficient the userrsquos non-evaluation information may be considered as theuserrsquos supplementary information so that the newuser can be more accurately recommended

(2) Recommended technical aspects in order to improvethe accuracy and scalability of the recommendationsystem and ensure that the system performs real-timerecommendation under large-scale data it is neces-sary to conduct in-depth exploration and research onthe recommended technology

Data Availability

-e data used to support the findings of this study are in-cluded within the article Readers can access the data sup-porting the conclusions of the study fromMovieLens (httpmovielensumnedu)

Conflicts of Interest

-e authors declare that they have no conflicts of interest

References

[1] H Zarzour F M Mohamed S Chaouki and C ChemamldquoAn improved collaborative filtering recommendationalgorithm for big datardquo Computational Intelligence and ItsApplications pp 660ndash668 2018

[2] H Zarzour Z A Sharif M A Ayyoub and Y Jararweh ldquoAnew collaborative filtering recommendation algorithm basedon dimensionality reduction and clustering techniquesrdquo inProceedings of the 2018 9th International Conference on In-formation and Communication Systems (ICICS) pp 102ndash106Lille France April 2018

[3] Z Chen Z Ning Q Xiong and M S Obaidat ldquoA collab-orative filtering recommendation-based scheme for WLANswith differentiated access servicerdquo IEEE Systems Journalvol 12 no 1 pp 1004ndash1014 2018

[4] C Lu ldquoRecommendation algorithm based on collaborativefiltering under social network environmentrdquo BioTechnologyAn Indian Journal vol 10 no 13 pp 7087ndash7092 2014

[5] Q-J Tan H Wu C Wang and Q Guo ldquoA personalizedhybrid recommendation strategy based on user behaviors andits applicationrdquo in Proceedings of the 2017 InternationalConference on Security Pattern Analysis and Cyberneticspp 181ndash186 SPAC Shenzhen China December 2017

[6] L Guo J LiangYing ZY Luo and L Sun ldquoCollaborativefiltering recommendation based on trust and emotionrdquoJournal of Intelligent Information Systems pp 1ndash23 2018

[7] J Aranda I Givoni J Handcock and D Tarlow An OnlineSocial Network-Based Recommendation System University ofToronto Ontario Canada 2007

[8] J Stan F Muhlenbach and C Largeron ldquoRecommendersystems using social network analysis challenges and futuretrendsrdquo in Encyclopedia of Social Network Analysis andMining pp 1ndash22 Springer New York NY USA 2014

[9] K G Saranya and G Sudha Sadasivam ldquoPersonalized newsarticle recommendation with novelty using collaborativefiltering based rough set theoryrdquo Mobile Networks andApplications vol 22 no 4 pp 719ndash729 2017

[10] L O Colombo-Mendoza R Valencia-Garcıa R Colomo-Palacios and G Alor-Hernandez ldquoA knowledge-basedmulti-criteria collaborative filtering approach for discover-ing services in mobile cloud computing platformsrdquo Journal ofIntelligent Information Systems pp 1ndash25 2018

[11] H-G Ko I-Y Ko and D Lee ldquoMulti-criteria matrixlocalization and integration for personalized collaborativefiltering in IoT environmentsrdquo Multimedia Tools and Appli-cations vol 77 no 4 pp 4697ndash4730 2018

[12] A M Rashid K L Shyong and KL George ldquoA highlyscalable hybrid model amp memory-based CF algorithmrdquo inProceeding of the WEBKDD of ACM Macau China June2014

[13] G-R Xue C Lin Q Yang et al ldquoScalable collaborativefiltering using cluster-based smoothingrdquo in Proceedings ofthe 20th Annual International ACM SIGIR Conferencepp 15ndash19 Tampere Finland August 2015

[14] A Clauset ldquoFinding local community structure in networksrdquoPhysical Review E vol 72 no 2 pp 56ndash70 2015

[15] W W Zachary ldquoAn information flow model for conflict andfission in small groupsrdquo Journal of Anthropological Researchvol 33 no 4 pp 452ndash473 1977

[16] M Girvan and M E J Newman ldquoCommunity structure insocial and biological networksrdquo Proceedings of the NationalAcademy of Sciences vol 99 no 12 pp 7821ndash7826 2012

[17] M E J Newman and M Girvan ldquoFinding and evaluatingcommunity structure in networksrdquo Physical Review E vol 69no 2 pp 25ndash32 2014

[18] H W Shen X-Q Cheng and J F Guo ldquoQuantifying andidentifying the overlapping community structure in net-worksrdquo Journal of Statistical Mechanics eory and Experi-ment vol 2009 no 7 Article ID P07042 42 pages 2009

[19] S Zhang R-S Wang and X-S zhang ldquoIdentification ofoverlapping community structure in complex networks usingfuzzy-means clusteringrdquo Physica A Statistical Mechanics andIts Applications vol 374 no 1 pp 483ndash490 2007

[20] T S Evans and R Lambiotte ldquoLine graphs link partitionsand overlapping communitiesrdquo Physical Review vol 80 no 1pp 145ndash148 2009

[21] C Duanbing Y Fu and M S Shang ldquoAn efficient algorithmfor overlapping community detection in complexrdquo GlobalCongress on Intelligent Systems pp 244ndash247 2009

Mobile Information Systems 11

Computer Games Technology

International Journal of

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom

Journal ofEngineeringVolume 2018

Advances in

FuzzySystems

Hindawiwwwhindawicom

Volume 2018

International Journal of

ReconfigurableComputing

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Applied Computational Intelligence and Soft Computing

thinspAdvancesthinspinthinsp

thinspArtificial Intelligence

Hindawiwwwhindawicom Volumethinsp2018

Hindawiwwwhindawicom Volume 2018

Civil EngineeringAdvances in

Hindawiwwwhindawicom Volume 2018

Electrical and Computer Engineering

Journal of

Journal of

Computer Networks and Communications

Hindawiwwwhindawicom Volume 2018

Hindawi

wwwhindawicom Volume 2018

Advances in

Multimedia

International Journal of

Biomedical Imaging

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Engineering Mathematics

International Journal of

RoboticsJournal of

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Computational Intelligence and Neuroscience

Hindawiwwwhindawicom Volume 2018

Mathematical Problems in Engineering

Modelling ampSimulationin EngineeringHindawiwwwhindawicom Volume 2018

Hindawi Publishing Corporation httpwwwhindawicom Volume 2013Hindawiwwwhindawicom

The Scientific World Journal

Volume 2018

Hindawiwwwhindawicom Volume 2018

Human-ComputerInteraction

Advances in

Hindawiwwwhindawicom Volume 2018

Scientic Programming

Submit your manuscripts atwwwhindawicom

Page 6: AnImprovedCollaborativeFilteringRecommendation ...downloads.hindawi.com/journals/misy/2019/3560968.pdf · the collaborative recommendation technology and the continuous expansion

33

29

30 32

26

28

26

27

34 9

15 2331

16

21

19

24

3

10

2

188 22

12

20

14

1

11

4 13

5

7

17

6

Figure 4 Community based on the overlapping community detection algorithm based on k-faction

4

75108

8599

103

82

41

73

536

11

80 95 102

8330

81

3120

56

36

94

1710

105

142

524

93

67

49

58

92

87

113

4576

3219

35

72

62

55 100

922

69

112

52

879

78 23

109

57

66

97

28

8821

71

63

77 18

96

11439 44

43

86

27

13

15

37

110

90

34

38

2

104

26

46

106

89

84

50

544768

115

74111

33

6516

3

107

101

48

61

14

740

59

60

64

98

2012

7091

5125

Figure 5 American College Football network divided into 13 communities

6 Mobile Information Systems

while the original network has two communities But itcan be found that the four communities are dividedinto two subcommunities in two communities in theoriginal network Node 4 and node 9 are overlappingnodes

-ese three experiments are the results of communitydivision showing a better effect compared with the originalnetwork and there is a higher accuracy rate

-is paper analyzes the modularity Qc Shen putforward the criterion for evaluating the closeness ofoverlapping community indicating that the greater the Qcwas the closer the community was the sparser thecommunity is connected the better the community di-vision results are Based on this view this paper calculatesthe modularity of the above two algorithms and comparesthem with other overlapping community detection al-gorithms in detail

-ese algorithms are proposed by Shen et al [18] andZhang et al [19] Evans and Lambiotte [20] and Duanbinget al [21] proposed overlapping community detection al-gorithm and the results of the experimental comparison areshown in Table 1

Algorithm one is the overlapping community detectionalgorithm based on the center node algorithm two repre-sents the overlapping community detection algorithm basedon k-faction

From Table 1 the overlapping community detectionalgorithm based on k-faction has a better effect For the34-node Club network the algorithm in the module ofShen Zhang and Evans and other algorithms proposedwere slightly smaller and for the Football network and

Dolphins network this algorithm is better than otheralgorithms

-erefore the overlapping community detection algo-rithm based on the k-faction has a better effect whether it isfrom the accuracy of the community division or the angle ofthe module

3 Collaborative Filtering RecommendationAlgorithm Based on theCommunity Detection

31 Basic Idea Aiming at the problems in the traditionalcollaborative filtering recommendation system a collabo-rative recommendation algorithm based on communitydetection is proposed in this paper In collaborative filteringrecommendation algorithms based on the community de-tection the users themselves contain their own feature in-formation User attribute content largely reflects the similar

45

5939

13

15

50 47

5462

44 51

3435

1721

37 3841 9

4

16

25

36

30

19

56

22

524660

51224

53

148

113

43

29

31

20

8

27

4940

5855

2

28

57

7

10

6

61

33 14

23 32

2618

42

Figure 6 Dolphins network divided into four communities

Table 1 Comparison of algorithm modules

Dataset34-nodeClub

network

115-nodeFootballnetwork

62-nodeDolphinsnetwork

Algorithm 1 045 057 051Algorithm 2 042 060 055Actual network partitioning 042 056 040Shen 067 mdash 054Zhang 042 058 mdashEvans 046 mdash mdashChen 038 mdash 033

Mobile Information Systems 7

relationship between friends At the same time the userrsquosfeature attributes are stable and can reflect the relationshipbetween users -erefore in the collaborative filtering rec-ommendation the userrsquos tag information feature is partic-ularly important

By introducing the user feature attribute the similarityof the feature attributes of the userrsquos neighbors canbe extracted and the weight of the unrelated items of thetarget item in the similar items can be reduced so that thecalculation of the neighbor user set is more accurate Rec-ommendation items in collaborative filtering recommen-dation based on community detection are classified and it isshown in the following equation

I C1 cup C2 cup cupCK

C1 i11 i12 i1ji1113966 1113967

CK ik1 ik2 ikjk1113966 1113967

(4)

where U represents the set of all users in the system and CK

indicates the kth class

32 User Feature Matrix Construction In the case of similarcalculations the scoring value of the improvement factorshould be fully considered -erefore this paper designs animproved user similarity calculation method based on thecollaborative filtering recommendation system Using thelinear combinationmethod the corresponding weightX is setand the traditional item H similarity result and the user labelcategory similarity result are combined to jointly serve as thesimilarity between the two items -e equation is as follows

sim(i j) (1minus λ) times simR(i j) + λ times simitemminuscate(i j) (5)

where simR(i j) is the similarity calculation of the userscoring matrix according to the traditional three similaritymeasurement methods

Sarwar used the smallest dataset of MovieLens tocompare the three similarities and used MAE as a mea-surement indicator -e experimental results show that theoptimal MAE can be obtained by using the corrected cosinesimilarity for scoring prediction

-e experiment is similar to the dataset in the paper-erefore this paper also uses the modified cosine similarityas the item similar calculation method

simitemminuscate(i j) is used to calculate the similarity in theuser-category matrix Since the matrix table is a binary valueit is reasonable to use the cosine similarity calculation -ecalculation equation is as follows

sim(i j)cate cos( irarr

jrarr

) i

rarrmiddot jrarr

irarr

middot jrarr

(6)

where irarr

and jrarr

are the corresponding row vectors of user i

and user j in the user-category matrix respectively andindicate the type contained by user i and user j

Community detection algorithm is used to divide agiven target social network into a number of communitiesthat match the actual situation Each community representsa social circle that is not used by everyone Potential

collaborative filtering is recommended for target users inthese relatively closely related and smaller communities Inthis way the subsequently constructed user-label matrixshrinks the matrix size Moreover most of the nodes in thecommunity have similar tag attributes so the sparseness ofthe collaborative filtering matrix can be reduced Construct auser-label scoring matrix using the proposed user-labelassociation strong concept in the community where thetarget user is located and calculate the similarity between theuser and other users in the community according to theimproved similarity calculation formula -e target userrsquoscommunity the user-label association strength concept isconstructed by using the proposed user-label associationstrength concept and the similarity between the user andother users in the community is calculated according to theimproved similarity calculation formula -en recommendthe top K users with the highest similarity as potentialfriends for the target users

4 Experiment Design and Discussion

41 Experimental Dataset In order to test the algorithm ofthis paper we have downloaded the classic MovieLens fromthe Internet and dataset and stored in local database

MovieLens (httpmovielensumnedu) was found in1997 to a recommendation system web film based on thenumber of users tens of thousands of daily access to thesystem and the system can accept user score of the film andmovie recommendation list for the user Currently the sitersquosusers have more than 43000 people -e user rating of thefilm has more than 3500

We selected 943 users from the user-rating database and1682 movies a total of 100000 scoring information each ofwhich has at least 20 score information score from 1 to 5 ofthe integer the greater the score the higher the degree of theuserrsquos preference for the film conversely the lower

42 Experimental Evaluation Index Different similarityindicators will result in different prediction scores Meanabsolute error (MAE) and root mean square error (RMSE)are two common indicators for measuring the accuracy ofsimilarity methods -e smaller the value the better theprediction accuracy -e usual approach is to divide theoriginal dataset into the training set and test set then use thetraining set to get the prediction model and the test set toevaluate the prediction results that is the prediction resultsand the actual score of the MAE and RMSE values It isdefined as follows

(1) Mean absolute error (MAE) is the average of theabsolute error of the userrsquos predicted score rui

prime andthe true score rui in the scoring test set

MAE 1

|p|1113944

(ui)isinprui minus ruiprime

11138681113868111386811138681113868111386811138681113868 (7)

(2) Root-mean-square error (RMSE) is the mean squareroot of the true score value rui and the predictedscore value rui

prime of the user in the test set

8 Mobile Information Systems

RMSE

1|p|

1113944(ui)isinp

rui minus ruiprime( 1113857

2

11139741113972

(8)

|p| in equations (7) and (8) represents the size of thetest set

43 Experimental Program To facilitate the evaluation ofperformance indicators the original MovieLens datasetsare divided into training sets and test sets -e authorrandomly selects a part from the user scoring set as thetraining set and the remaining scoring data shall becomethe test set -e training set is used for model trainingwhereas the test set is used for testing the pros and cons-is paper tests the collaborative filtering algorithm basedon the community detection as well as MAE and RMSEindicators of the collaborative filtering algorithm by usingPearsonrsquos correlation and cosine similarity -ere are twofactors significantly affected by the collaborative filteringrecommendation the scarcity of the dataset and the size ofneighbor user sets (ie the value of k)-ese two factors arealso considered in the validation of the collaborative fil-tering algorithm

Based on the above factors this paper designed thefollowing experimental scheme

Experiment 1 select different proportions of trainingsets and test sets in the MovieLens dataset If theneighbor user set size k is constant consider comparingthe performance of the above algorithms in the case ofdifferent data sparsity -ereby it is possible to verifythe extent to which the recommendation effect is af-fected under the condition that the system obtains avalid amount of informationExperiment 2 select a certain percentage of trainingand test sets in the MovieLens dataset When theneighbor user set size k is changed the performance ofthe above algorithm is compared to verify the influenceof the nearest neighbor user set size on the recom-mendation effect

In order to avoid the training set and test set of a certainpartition the experimental results are brought by chanceWe use the same method to divide the MovieLens datasetinto 5 and get 5 sets of different training sets and thecorresponding test sets So each training set and test set has5 experiments and take the average results of the 5 ex-periments is at the final result of the experiment

44 Experimental Results and Analysis In order to suc-cessfully carry out the experiment the need is to set someparameters In this paper we recommend that the size |U| ofthe candidate user setU in the algorithm be set to about 30of the number of users in the training set For conveniencethe collaborative filtering algorithm based on communitydetection in this paper uses CFCD (collaborative filteringcommunity detection) -e cooperative filtering algorithm

based on cosine similarity is expressed by CFC (collaborativefiltering cosine) -e collaborative filtering algorithm basedon Pearsonrsquos correlation is expressed by CFP (collaborativefiltering Pearson)

Experiment 1 with the MovieLens dataset the size ofthe training set is changed if the neighbor user set sizek value remains unchanged Table 2 is based on thepremise that the neighbor user set size k is 30 -eresults of experiments on the proportions of eachtraining set are shown in Table 2

We draw the line and bar chart corresponding to Ta-ble 2 -e details are shown in Figures 7 and 8 It can beseen from Figure 7 that the MAE value of the CFCDalgorithm proposed in this paper is smaller than that ofthe CFP algorithm and CFC algorithm Figure 8 showsthat the RMSE value of the CFCD algorithm proposedin this paper is also smaller than that of the CFP al-gorithm and the CFC algorithm -us the CFCD al-gorithm is better than the CFP algorithm and CFCalgorithmExperiment 2 with the MovieLens dataset in thecase of a certain size of the training set to change thesize of the K value of the nearest neighbor user setTable 3 shows the comparison of MAE and RMSEvalues at different k values for the training set ratio of80

We draw out Table 3 corresponding to the line chart andbar chart in detail It is shown in Figures 9 and 10

Experimental results show the MAE and RMSE values ofthe collaborative recommendation algorithm based oncommunity detection are smaller than the other two algo-rithms -e reaction is on the line graph and in the case of acertain proportion of the training set the value of k in-creases -e MAE and RSME polylines of CFCD are belowthose of CFC and CFP indicating that the proposed algo-rithm CFCD is better than the other two algorithms

When k takes about 30 in fact the MAE and RSME ofthe three algorithms are the smallest when the size of theneighbor user set is 30 the recommended effect is bestFigures 9 and 10 are the corresponding line graphs ofTable 3

It can be seen that CFCD algorithmrsquos MAE and RMSEvalues are lower than those of the other two algorithms Itcan be shown that the CFCD algorithm is better than theCFC and CFP algorithms

5 Conclusions

-is paper introduces social network-related technologiesinto collaborative recommendation technology and pro-poses a collaborative recommendation algorithm basedon community detection In order to solve the problemsexisting in the traditional and recommended technolo-gies this paper proposes a collaborative recommendationmethod based on community detection based on com-munity discovery technology and collaborative recom-mendation technology in social networks -e algorithm

Mobile Information Systems 9

adopts the method of rough selection of neighboring usersets which eectively reduce the system calculation timeand at the same time the scoring preprocessing method isadopted to prevent the problem caused by data sparsenesse performance of the proposed algorithm is veried byexperiments is paper designed two experiments usingthe MovieLens dataset e performance indicators MAE

and RMSE of the community-based collaborative recom-mendation technology and the collaborative recommen-dation technique based on Pearsonrsquos similarity and cosinesimilarity are tested separately e experimental resultsshow that the proposed technique is superior to the per-formance of collaborative recommendation based onPearsonrsquos similarity and Cosine similarity

Table 2 Experimental results for dierent training set ratios(k 30)

Ratio oftraining set ()

MAE RMSECFCD CFC CFP CFCD CFC CFP

20 0814 0847 0847 0995 1011 100130 0803 0848 0825 0965 0995 098740 0794 0834 0814 0968 0978 096850 0780 0835 0800 0917 0957 093560 0771 0830 0798 0867 0934 091470 0770 0828 0787 0827 0899 086780 0765 0823 0777 0812 0844 0822

MA

E

Training set ratio ()

CFCDCFPCFC

087

085

083

079

077

075

081

200 40 60 10080

Figure 7 MAE results for training set ratio changes (k 30)

20 30Training set ratio ()

40 50 60 70 80

087

085

083

079

077

075

081

RMSE

CFCDCFPCFC

Figure 8 RMSE results for training set ratio changes (k 30)

083082081080079078077076

0 20 40 60 80 100 120

MA

E

k

CFCDCFPCFC

Figure 9 K value variation of the MAE results (training set ratio of80)

10 30

085

084

083

081

08

079

082

k

RMSE

CFCDCFPCFC

20 40 50 60 70 80 90 100

Figure 10 K value variation of the RMSE results (training set ratioof 80)

Table 3 Experimental results for the training set ratio of 80

KMAE RMSE

CFCD CFC CFP CFCD CFC CFP10 0772 0829 0708 0819 0845 082320 0767 0823 0777 0812 0846 082430 0765 0824 0776 0813 0845 082740 0763 0826 0778 0815 0847 082350 0771 0823 0779 0816 0847 082460 0772 0826 0780 0817 0848 082570 0775 0824 0781 0816 0845 082680 0778 0823 0782 0817 0849 082790 0776 0824 0783 0818 0846 0824100 0781 0831 0783 0819 0851 0827

10 Mobile Information Systems

Further research will be carried out in the followingareas

(1) Information acquisition in addition to obtainingbasic user information it is necessary to dig out moreinformation implied by users In case if the new userrating information is insufficient the userrsquos non-evaluation information may be considered as theuserrsquos supplementary information so that the newuser can be more accurately recommended

(2) Recommended technical aspects in order to improvethe accuracy and scalability of the recommendationsystem and ensure that the system performs real-timerecommendation under large-scale data it is neces-sary to conduct in-depth exploration and research onthe recommended technology

Data Availability

-e data used to support the findings of this study are in-cluded within the article Readers can access the data sup-porting the conclusions of the study fromMovieLens (httpmovielensumnedu)

Conflicts of Interest

-e authors declare that they have no conflicts of interest

References

[1] H Zarzour F M Mohamed S Chaouki and C ChemamldquoAn improved collaborative filtering recommendationalgorithm for big datardquo Computational Intelligence and ItsApplications pp 660ndash668 2018

[2] H Zarzour Z A Sharif M A Ayyoub and Y Jararweh ldquoAnew collaborative filtering recommendation algorithm basedon dimensionality reduction and clustering techniquesrdquo inProceedings of the 2018 9th International Conference on In-formation and Communication Systems (ICICS) pp 102ndash106Lille France April 2018

[3] Z Chen Z Ning Q Xiong and M S Obaidat ldquoA collab-orative filtering recommendation-based scheme for WLANswith differentiated access servicerdquo IEEE Systems Journalvol 12 no 1 pp 1004ndash1014 2018

[4] C Lu ldquoRecommendation algorithm based on collaborativefiltering under social network environmentrdquo BioTechnologyAn Indian Journal vol 10 no 13 pp 7087ndash7092 2014

[5] Q-J Tan H Wu C Wang and Q Guo ldquoA personalizedhybrid recommendation strategy based on user behaviors andits applicationrdquo in Proceedings of the 2017 InternationalConference on Security Pattern Analysis and Cyberneticspp 181ndash186 SPAC Shenzhen China December 2017

[6] L Guo J LiangYing ZY Luo and L Sun ldquoCollaborativefiltering recommendation based on trust and emotionrdquoJournal of Intelligent Information Systems pp 1ndash23 2018

[7] J Aranda I Givoni J Handcock and D Tarlow An OnlineSocial Network-Based Recommendation System University ofToronto Ontario Canada 2007

[8] J Stan F Muhlenbach and C Largeron ldquoRecommendersystems using social network analysis challenges and futuretrendsrdquo in Encyclopedia of Social Network Analysis andMining pp 1ndash22 Springer New York NY USA 2014

[9] K G Saranya and G Sudha Sadasivam ldquoPersonalized newsarticle recommendation with novelty using collaborativefiltering based rough set theoryrdquo Mobile Networks andApplications vol 22 no 4 pp 719ndash729 2017

[10] L O Colombo-Mendoza R Valencia-Garcıa R Colomo-Palacios and G Alor-Hernandez ldquoA knowledge-basedmulti-criteria collaborative filtering approach for discover-ing services in mobile cloud computing platformsrdquo Journal ofIntelligent Information Systems pp 1ndash25 2018

[11] H-G Ko I-Y Ko and D Lee ldquoMulti-criteria matrixlocalization and integration for personalized collaborativefiltering in IoT environmentsrdquo Multimedia Tools and Appli-cations vol 77 no 4 pp 4697ndash4730 2018

[12] A M Rashid K L Shyong and KL George ldquoA highlyscalable hybrid model amp memory-based CF algorithmrdquo inProceeding of the WEBKDD of ACM Macau China June2014

[13] G-R Xue C Lin Q Yang et al ldquoScalable collaborativefiltering using cluster-based smoothingrdquo in Proceedings ofthe 20th Annual International ACM SIGIR Conferencepp 15ndash19 Tampere Finland August 2015

[14] A Clauset ldquoFinding local community structure in networksrdquoPhysical Review E vol 72 no 2 pp 56ndash70 2015

[15] W W Zachary ldquoAn information flow model for conflict andfission in small groupsrdquo Journal of Anthropological Researchvol 33 no 4 pp 452ndash473 1977

[16] M Girvan and M E J Newman ldquoCommunity structure insocial and biological networksrdquo Proceedings of the NationalAcademy of Sciences vol 99 no 12 pp 7821ndash7826 2012

[17] M E J Newman and M Girvan ldquoFinding and evaluatingcommunity structure in networksrdquo Physical Review E vol 69no 2 pp 25ndash32 2014

[18] H W Shen X-Q Cheng and J F Guo ldquoQuantifying andidentifying the overlapping community structure in net-worksrdquo Journal of Statistical Mechanics eory and Experi-ment vol 2009 no 7 Article ID P07042 42 pages 2009

[19] S Zhang R-S Wang and X-S zhang ldquoIdentification ofoverlapping community structure in complex networks usingfuzzy-means clusteringrdquo Physica A Statistical Mechanics andIts Applications vol 374 no 1 pp 483ndash490 2007

[20] T S Evans and R Lambiotte ldquoLine graphs link partitionsand overlapping communitiesrdquo Physical Review vol 80 no 1pp 145ndash148 2009

[21] C Duanbing Y Fu and M S Shang ldquoAn efficient algorithmfor overlapping community detection in complexrdquo GlobalCongress on Intelligent Systems pp 244ndash247 2009

Mobile Information Systems 11

Computer Games Technology

International Journal of

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom

Journal ofEngineeringVolume 2018

Advances in

FuzzySystems

Hindawiwwwhindawicom

Volume 2018

International Journal of

ReconfigurableComputing

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Applied Computational Intelligence and Soft Computing

thinspAdvancesthinspinthinsp

thinspArtificial Intelligence

Hindawiwwwhindawicom Volumethinsp2018

Hindawiwwwhindawicom Volume 2018

Civil EngineeringAdvances in

Hindawiwwwhindawicom Volume 2018

Electrical and Computer Engineering

Journal of

Journal of

Computer Networks and Communications

Hindawiwwwhindawicom Volume 2018

Hindawi

wwwhindawicom Volume 2018

Advances in

Multimedia

International Journal of

Biomedical Imaging

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Engineering Mathematics

International Journal of

RoboticsJournal of

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Computational Intelligence and Neuroscience

Hindawiwwwhindawicom Volume 2018

Mathematical Problems in Engineering

Modelling ampSimulationin EngineeringHindawiwwwhindawicom Volume 2018

Hindawi Publishing Corporation httpwwwhindawicom Volume 2013Hindawiwwwhindawicom

The Scientific World Journal

Volume 2018

Hindawiwwwhindawicom Volume 2018

Human-ComputerInteraction

Advances in

Hindawiwwwhindawicom Volume 2018

Scientic Programming

Submit your manuscripts atwwwhindawicom

Page 7: AnImprovedCollaborativeFilteringRecommendation ...downloads.hindawi.com/journals/misy/2019/3560968.pdf · the collaborative recommendation technology and the continuous expansion

while the original network has two communities But itcan be found that the four communities are dividedinto two subcommunities in two communities in theoriginal network Node 4 and node 9 are overlappingnodes

-ese three experiments are the results of communitydivision showing a better effect compared with the originalnetwork and there is a higher accuracy rate

-is paper analyzes the modularity Qc Shen putforward the criterion for evaluating the closeness ofoverlapping community indicating that the greater the Qcwas the closer the community was the sparser thecommunity is connected the better the community di-vision results are Based on this view this paper calculatesthe modularity of the above two algorithms and comparesthem with other overlapping community detection al-gorithms in detail

-ese algorithms are proposed by Shen et al [18] andZhang et al [19] Evans and Lambiotte [20] and Duanbinget al [21] proposed overlapping community detection al-gorithm and the results of the experimental comparison areshown in Table 1

Algorithm one is the overlapping community detectionalgorithm based on the center node algorithm two repre-sents the overlapping community detection algorithm basedon k-faction

From Table 1 the overlapping community detectionalgorithm based on k-faction has a better effect For the34-node Club network the algorithm in the module ofShen Zhang and Evans and other algorithms proposedwere slightly smaller and for the Football network and

Dolphins network this algorithm is better than otheralgorithms

-erefore the overlapping community detection algo-rithm based on the k-faction has a better effect whether it isfrom the accuracy of the community division or the angle ofthe module

3 Collaborative Filtering RecommendationAlgorithm Based on theCommunity Detection

31 Basic Idea Aiming at the problems in the traditionalcollaborative filtering recommendation system a collabo-rative recommendation algorithm based on communitydetection is proposed in this paper In collaborative filteringrecommendation algorithms based on the community de-tection the users themselves contain their own feature in-formation User attribute content largely reflects the similar

45

5939

13

15

50 47

5462

44 51

3435

1721

37 3841 9

4

16

25

36

30

19

56

22

524660

51224

53

148

113

43

29

31

20

8

27

4940

5855

2

28

57

7

10

6

61

33 14

23 32

2618

42

Figure 6 Dolphins network divided into four communities

Table 1 Comparison of algorithm modules

Dataset34-nodeClub

network

115-nodeFootballnetwork

62-nodeDolphinsnetwork

Algorithm 1 045 057 051Algorithm 2 042 060 055Actual network partitioning 042 056 040Shen 067 mdash 054Zhang 042 058 mdashEvans 046 mdash mdashChen 038 mdash 033

Mobile Information Systems 7

relationship between friends At the same time the userrsquosfeature attributes are stable and can reflect the relationshipbetween users -erefore in the collaborative filtering rec-ommendation the userrsquos tag information feature is partic-ularly important

By introducing the user feature attribute the similarityof the feature attributes of the userrsquos neighbors canbe extracted and the weight of the unrelated items of thetarget item in the similar items can be reduced so that thecalculation of the neighbor user set is more accurate Rec-ommendation items in collaborative filtering recommen-dation based on community detection are classified and it isshown in the following equation

I C1 cup C2 cup cupCK

C1 i11 i12 i1ji1113966 1113967

CK ik1 ik2 ikjk1113966 1113967

(4)

where U represents the set of all users in the system and CK

indicates the kth class

32 User Feature Matrix Construction In the case of similarcalculations the scoring value of the improvement factorshould be fully considered -erefore this paper designs animproved user similarity calculation method based on thecollaborative filtering recommendation system Using thelinear combinationmethod the corresponding weightX is setand the traditional item H similarity result and the user labelcategory similarity result are combined to jointly serve as thesimilarity between the two items -e equation is as follows

sim(i j) (1minus λ) times simR(i j) + λ times simitemminuscate(i j) (5)

where simR(i j) is the similarity calculation of the userscoring matrix according to the traditional three similaritymeasurement methods

Sarwar used the smallest dataset of MovieLens tocompare the three similarities and used MAE as a mea-surement indicator -e experimental results show that theoptimal MAE can be obtained by using the corrected cosinesimilarity for scoring prediction

-e experiment is similar to the dataset in the paper-erefore this paper also uses the modified cosine similarityas the item similar calculation method

simitemminuscate(i j) is used to calculate the similarity in theuser-category matrix Since the matrix table is a binary valueit is reasonable to use the cosine similarity calculation -ecalculation equation is as follows

sim(i j)cate cos( irarr

jrarr

) i

rarrmiddot jrarr

irarr

middot jrarr

(6)

where irarr

and jrarr

are the corresponding row vectors of user i

and user j in the user-category matrix respectively andindicate the type contained by user i and user j

Community detection algorithm is used to divide agiven target social network into a number of communitiesthat match the actual situation Each community representsa social circle that is not used by everyone Potential

collaborative filtering is recommended for target users inthese relatively closely related and smaller communities Inthis way the subsequently constructed user-label matrixshrinks the matrix size Moreover most of the nodes in thecommunity have similar tag attributes so the sparseness ofthe collaborative filtering matrix can be reduced Construct auser-label scoring matrix using the proposed user-labelassociation strong concept in the community where thetarget user is located and calculate the similarity between theuser and other users in the community according to theimproved similarity calculation formula -e target userrsquoscommunity the user-label association strength concept isconstructed by using the proposed user-label associationstrength concept and the similarity between the user andother users in the community is calculated according to theimproved similarity calculation formula -en recommendthe top K users with the highest similarity as potentialfriends for the target users

4 Experiment Design and Discussion

41 Experimental Dataset In order to test the algorithm ofthis paper we have downloaded the classic MovieLens fromthe Internet and dataset and stored in local database

MovieLens (httpmovielensumnedu) was found in1997 to a recommendation system web film based on thenumber of users tens of thousands of daily access to thesystem and the system can accept user score of the film andmovie recommendation list for the user Currently the sitersquosusers have more than 43000 people -e user rating of thefilm has more than 3500

We selected 943 users from the user-rating database and1682 movies a total of 100000 scoring information each ofwhich has at least 20 score information score from 1 to 5 ofthe integer the greater the score the higher the degree of theuserrsquos preference for the film conversely the lower

42 Experimental Evaluation Index Different similarityindicators will result in different prediction scores Meanabsolute error (MAE) and root mean square error (RMSE)are two common indicators for measuring the accuracy ofsimilarity methods -e smaller the value the better theprediction accuracy -e usual approach is to divide theoriginal dataset into the training set and test set then use thetraining set to get the prediction model and the test set toevaluate the prediction results that is the prediction resultsand the actual score of the MAE and RMSE values It isdefined as follows

(1) Mean absolute error (MAE) is the average of theabsolute error of the userrsquos predicted score rui

prime andthe true score rui in the scoring test set

MAE 1

|p|1113944

(ui)isinprui minus ruiprime

11138681113868111386811138681113868111386811138681113868 (7)

(2) Root-mean-square error (RMSE) is the mean squareroot of the true score value rui and the predictedscore value rui

prime of the user in the test set

8 Mobile Information Systems

RMSE

1|p|

1113944(ui)isinp

rui minus ruiprime( 1113857

2

11139741113972

(8)

|p| in equations (7) and (8) represents the size of thetest set

43 Experimental Program To facilitate the evaluation ofperformance indicators the original MovieLens datasetsare divided into training sets and test sets -e authorrandomly selects a part from the user scoring set as thetraining set and the remaining scoring data shall becomethe test set -e training set is used for model trainingwhereas the test set is used for testing the pros and cons-is paper tests the collaborative filtering algorithm basedon the community detection as well as MAE and RMSEindicators of the collaborative filtering algorithm by usingPearsonrsquos correlation and cosine similarity -ere are twofactors significantly affected by the collaborative filteringrecommendation the scarcity of the dataset and the size ofneighbor user sets (ie the value of k)-ese two factors arealso considered in the validation of the collaborative fil-tering algorithm

Based on the above factors this paper designed thefollowing experimental scheme

Experiment 1 select different proportions of trainingsets and test sets in the MovieLens dataset If theneighbor user set size k is constant consider comparingthe performance of the above algorithms in the case ofdifferent data sparsity -ereby it is possible to verifythe extent to which the recommendation effect is af-fected under the condition that the system obtains avalid amount of informationExperiment 2 select a certain percentage of trainingand test sets in the MovieLens dataset When theneighbor user set size k is changed the performance ofthe above algorithm is compared to verify the influenceof the nearest neighbor user set size on the recom-mendation effect

In order to avoid the training set and test set of a certainpartition the experimental results are brought by chanceWe use the same method to divide the MovieLens datasetinto 5 and get 5 sets of different training sets and thecorresponding test sets So each training set and test set has5 experiments and take the average results of the 5 ex-periments is at the final result of the experiment

44 Experimental Results and Analysis In order to suc-cessfully carry out the experiment the need is to set someparameters In this paper we recommend that the size |U| ofthe candidate user setU in the algorithm be set to about 30of the number of users in the training set For conveniencethe collaborative filtering algorithm based on communitydetection in this paper uses CFCD (collaborative filteringcommunity detection) -e cooperative filtering algorithm

based on cosine similarity is expressed by CFC (collaborativefiltering cosine) -e collaborative filtering algorithm basedon Pearsonrsquos correlation is expressed by CFP (collaborativefiltering Pearson)

Experiment 1 with the MovieLens dataset the size ofthe training set is changed if the neighbor user set sizek value remains unchanged Table 2 is based on thepremise that the neighbor user set size k is 30 -eresults of experiments on the proportions of eachtraining set are shown in Table 2

We draw the line and bar chart corresponding to Ta-ble 2 -e details are shown in Figures 7 and 8 It can beseen from Figure 7 that the MAE value of the CFCDalgorithm proposed in this paper is smaller than that ofthe CFP algorithm and CFC algorithm Figure 8 showsthat the RMSE value of the CFCD algorithm proposedin this paper is also smaller than that of the CFP al-gorithm and the CFC algorithm -us the CFCD al-gorithm is better than the CFP algorithm and CFCalgorithmExperiment 2 with the MovieLens dataset in thecase of a certain size of the training set to change thesize of the K value of the nearest neighbor user setTable 3 shows the comparison of MAE and RMSEvalues at different k values for the training set ratio of80

We draw out Table 3 corresponding to the line chart andbar chart in detail It is shown in Figures 9 and 10

Experimental results show the MAE and RMSE values ofthe collaborative recommendation algorithm based oncommunity detection are smaller than the other two algo-rithms -e reaction is on the line graph and in the case of acertain proportion of the training set the value of k in-creases -e MAE and RSME polylines of CFCD are belowthose of CFC and CFP indicating that the proposed algo-rithm CFCD is better than the other two algorithms

When k takes about 30 in fact the MAE and RSME ofthe three algorithms are the smallest when the size of theneighbor user set is 30 the recommended effect is bestFigures 9 and 10 are the corresponding line graphs ofTable 3

It can be seen that CFCD algorithmrsquos MAE and RMSEvalues are lower than those of the other two algorithms Itcan be shown that the CFCD algorithm is better than theCFC and CFP algorithms

5 Conclusions

-is paper introduces social network-related technologiesinto collaborative recommendation technology and pro-poses a collaborative recommendation algorithm basedon community detection In order to solve the problemsexisting in the traditional and recommended technolo-gies this paper proposes a collaborative recommendationmethod based on community detection based on com-munity discovery technology and collaborative recom-mendation technology in social networks -e algorithm

Mobile Information Systems 9

adopts the method of rough selection of neighboring usersets which eectively reduce the system calculation timeand at the same time the scoring preprocessing method isadopted to prevent the problem caused by data sparsenesse performance of the proposed algorithm is veried byexperiments is paper designed two experiments usingthe MovieLens dataset e performance indicators MAE

and RMSE of the community-based collaborative recom-mendation technology and the collaborative recommen-dation technique based on Pearsonrsquos similarity and cosinesimilarity are tested separately e experimental resultsshow that the proposed technique is superior to the per-formance of collaborative recommendation based onPearsonrsquos similarity and Cosine similarity

Table 2 Experimental results for dierent training set ratios(k 30)

Ratio oftraining set ()

MAE RMSECFCD CFC CFP CFCD CFC CFP

20 0814 0847 0847 0995 1011 100130 0803 0848 0825 0965 0995 098740 0794 0834 0814 0968 0978 096850 0780 0835 0800 0917 0957 093560 0771 0830 0798 0867 0934 091470 0770 0828 0787 0827 0899 086780 0765 0823 0777 0812 0844 0822

MA

E

Training set ratio ()

CFCDCFPCFC

087

085

083

079

077

075

081

200 40 60 10080

Figure 7 MAE results for training set ratio changes (k 30)

20 30Training set ratio ()

40 50 60 70 80

087

085

083

079

077

075

081

RMSE

CFCDCFPCFC

Figure 8 RMSE results for training set ratio changes (k 30)

083082081080079078077076

0 20 40 60 80 100 120

MA

E

k

CFCDCFPCFC

Figure 9 K value variation of the MAE results (training set ratio of80)

10 30

085

084

083

081

08

079

082

k

RMSE

CFCDCFPCFC

20 40 50 60 70 80 90 100

Figure 10 K value variation of the RMSE results (training set ratioof 80)

Table 3 Experimental results for the training set ratio of 80

KMAE RMSE

CFCD CFC CFP CFCD CFC CFP10 0772 0829 0708 0819 0845 082320 0767 0823 0777 0812 0846 082430 0765 0824 0776 0813 0845 082740 0763 0826 0778 0815 0847 082350 0771 0823 0779 0816 0847 082460 0772 0826 0780 0817 0848 082570 0775 0824 0781 0816 0845 082680 0778 0823 0782 0817 0849 082790 0776 0824 0783 0818 0846 0824100 0781 0831 0783 0819 0851 0827

10 Mobile Information Systems

Further research will be carried out in the followingareas

(1) Information acquisition in addition to obtainingbasic user information it is necessary to dig out moreinformation implied by users In case if the new userrating information is insufficient the userrsquos non-evaluation information may be considered as theuserrsquos supplementary information so that the newuser can be more accurately recommended

(2) Recommended technical aspects in order to improvethe accuracy and scalability of the recommendationsystem and ensure that the system performs real-timerecommendation under large-scale data it is neces-sary to conduct in-depth exploration and research onthe recommended technology

Data Availability

-e data used to support the findings of this study are in-cluded within the article Readers can access the data sup-porting the conclusions of the study fromMovieLens (httpmovielensumnedu)

Conflicts of Interest

-e authors declare that they have no conflicts of interest

References

[1] H Zarzour F M Mohamed S Chaouki and C ChemamldquoAn improved collaborative filtering recommendationalgorithm for big datardquo Computational Intelligence and ItsApplications pp 660ndash668 2018

[2] H Zarzour Z A Sharif M A Ayyoub and Y Jararweh ldquoAnew collaborative filtering recommendation algorithm basedon dimensionality reduction and clustering techniquesrdquo inProceedings of the 2018 9th International Conference on In-formation and Communication Systems (ICICS) pp 102ndash106Lille France April 2018

[3] Z Chen Z Ning Q Xiong and M S Obaidat ldquoA collab-orative filtering recommendation-based scheme for WLANswith differentiated access servicerdquo IEEE Systems Journalvol 12 no 1 pp 1004ndash1014 2018

[4] C Lu ldquoRecommendation algorithm based on collaborativefiltering under social network environmentrdquo BioTechnologyAn Indian Journal vol 10 no 13 pp 7087ndash7092 2014

[5] Q-J Tan H Wu C Wang and Q Guo ldquoA personalizedhybrid recommendation strategy based on user behaviors andits applicationrdquo in Proceedings of the 2017 InternationalConference on Security Pattern Analysis and Cyberneticspp 181ndash186 SPAC Shenzhen China December 2017

[6] L Guo J LiangYing ZY Luo and L Sun ldquoCollaborativefiltering recommendation based on trust and emotionrdquoJournal of Intelligent Information Systems pp 1ndash23 2018

[7] J Aranda I Givoni J Handcock and D Tarlow An OnlineSocial Network-Based Recommendation System University ofToronto Ontario Canada 2007

[8] J Stan F Muhlenbach and C Largeron ldquoRecommendersystems using social network analysis challenges and futuretrendsrdquo in Encyclopedia of Social Network Analysis andMining pp 1ndash22 Springer New York NY USA 2014

[9] K G Saranya and G Sudha Sadasivam ldquoPersonalized newsarticle recommendation with novelty using collaborativefiltering based rough set theoryrdquo Mobile Networks andApplications vol 22 no 4 pp 719ndash729 2017

[10] L O Colombo-Mendoza R Valencia-Garcıa R Colomo-Palacios and G Alor-Hernandez ldquoA knowledge-basedmulti-criteria collaborative filtering approach for discover-ing services in mobile cloud computing platformsrdquo Journal ofIntelligent Information Systems pp 1ndash25 2018

[11] H-G Ko I-Y Ko and D Lee ldquoMulti-criteria matrixlocalization and integration for personalized collaborativefiltering in IoT environmentsrdquo Multimedia Tools and Appli-cations vol 77 no 4 pp 4697ndash4730 2018

[12] A M Rashid K L Shyong and KL George ldquoA highlyscalable hybrid model amp memory-based CF algorithmrdquo inProceeding of the WEBKDD of ACM Macau China June2014

[13] G-R Xue C Lin Q Yang et al ldquoScalable collaborativefiltering using cluster-based smoothingrdquo in Proceedings ofthe 20th Annual International ACM SIGIR Conferencepp 15ndash19 Tampere Finland August 2015

[14] A Clauset ldquoFinding local community structure in networksrdquoPhysical Review E vol 72 no 2 pp 56ndash70 2015

[15] W W Zachary ldquoAn information flow model for conflict andfission in small groupsrdquo Journal of Anthropological Researchvol 33 no 4 pp 452ndash473 1977

[16] M Girvan and M E J Newman ldquoCommunity structure insocial and biological networksrdquo Proceedings of the NationalAcademy of Sciences vol 99 no 12 pp 7821ndash7826 2012

[17] M E J Newman and M Girvan ldquoFinding and evaluatingcommunity structure in networksrdquo Physical Review E vol 69no 2 pp 25ndash32 2014

[18] H W Shen X-Q Cheng and J F Guo ldquoQuantifying andidentifying the overlapping community structure in net-worksrdquo Journal of Statistical Mechanics eory and Experi-ment vol 2009 no 7 Article ID P07042 42 pages 2009

[19] S Zhang R-S Wang and X-S zhang ldquoIdentification ofoverlapping community structure in complex networks usingfuzzy-means clusteringrdquo Physica A Statistical Mechanics andIts Applications vol 374 no 1 pp 483ndash490 2007

[20] T S Evans and R Lambiotte ldquoLine graphs link partitionsand overlapping communitiesrdquo Physical Review vol 80 no 1pp 145ndash148 2009

[21] C Duanbing Y Fu and M S Shang ldquoAn efficient algorithmfor overlapping community detection in complexrdquo GlobalCongress on Intelligent Systems pp 244ndash247 2009

Mobile Information Systems 11

Computer Games Technology

International Journal of

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom

Journal ofEngineeringVolume 2018

Advances in

FuzzySystems

Hindawiwwwhindawicom

Volume 2018

International Journal of

ReconfigurableComputing

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Applied Computational Intelligence and Soft Computing

thinspAdvancesthinspinthinsp

thinspArtificial Intelligence

Hindawiwwwhindawicom Volumethinsp2018

Hindawiwwwhindawicom Volume 2018

Civil EngineeringAdvances in

Hindawiwwwhindawicom Volume 2018

Electrical and Computer Engineering

Journal of

Journal of

Computer Networks and Communications

Hindawiwwwhindawicom Volume 2018

Hindawi

wwwhindawicom Volume 2018

Advances in

Multimedia

International Journal of

Biomedical Imaging

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Engineering Mathematics

International Journal of

RoboticsJournal of

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Computational Intelligence and Neuroscience

Hindawiwwwhindawicom Volume 2018

Mathematical Problems in Engineering

Modelling ampSimulationin EngineeringHindawiwwwhindawicom Volume 2018

Hindawi Publishing Corporation httpwwwhindawicom Volume 2013Hindawiwwwhindawicom

The Scientific World Journal

Volume 2018

Hindawiwwwhindawicom Volume 2018

Human-ComputerInteraction

Advances in

Hindawiwwwhindawicom Volume 2018

Scientic Programming

Submit your manuscripts atwwwhindawicom

Page 8: AnImprovedCollaborativeFilteringRecommendation ...downloads.hindawi.com/journals/misy/2019/3560968.pdf · the collaborative recommendation technology and the continuous expansion

relationship between friends At the same time the userrsquosfeature attributes are stable and can reflect the relationshipbetween users -erefore in the collaborative filtering rec-ommendation the userrsquos tag information feature is partic-ularly important

By introducing the user feature attribute the similarityof the feature attributes of the userrsquos neighbors canbe extracted and the weight of the unrelated items of thetarget item in the similar items can be reduced so that thecalculation of the neighbor user set is more accurate Rec-ommendation items in collaborative filtering recommen-dation based on community detection are classified and it isshown in the following equation

I C1 cup C2 cup cupCK

C1 i11 i12 i1ji1113966 1113967

CK ik1 ik2 ikjk1113966 1113967

(4)

where U represents the set of all users in the system and CK

indicates the kth class

32 User Feature Matrix Construction In the case of similarcalculations the scoring value of the improvement factorshould be fully considered -erefore this paper designs animproved user similarity calculation method based on thecollaborative filtering recommendation system Using thelinear combinationmethod the corresponding weightX is setand the traditional item H similarity result and the user labelcategory similarity result are combined to jointly serve as thesimilarity between the two items -e equation is as follows

sim(i j) (1minus λ) times simR(i j) + λ times simitemminuscate(i j) (5)

where simR(i j) is the similarity calculation of the userscoring matrix according to the traditional three similaritymeasurement methods

Sarwar used the smallest dataset of MovieLens tocompare the three similarities and used MAE as a mea-surement indicator -e experimental results show that theoptimal MAE can be obtained by using the corrected cosinesimilarity for scoring prediction

-e experiment is similar to the dataset in the paper-erefore this paper also uses the modified cosine similarityas the item similar calculation method

simitemminuscate(i j) is used to calculate the similarity in theuser-category matrix Since the matrix table is a binary valueit is reasonable to use the cosine similarity calculation -ecalculation equation is as follows

sim(i j)cate cos( irarr

jrarr

) i

rarrmiddot jrarr

irarr

middot jrarr

(6)

where irarr

and jrarr

are the corresponding row vectors of user i

and user j in the user-category matrix respectively andindicate the type contained by user i and user j

Community detection algorithm is used to divide agiven target social network into a number of communitiesthat match the actual situation Each community representsa social circle that is not used by everyone Potential

collaborative filtering is recommended for target users inthese relatively closely related and smaller communities Inthis way the subsequently constructed user-label matrixshrinks the matrix size Moreover most of the nodes in thecommunity have similar tag attributes so the sparseness ofthe collaborative filtering matrix can be reduced Construct auser-label scoring matrix using the proposed user-labelassociation strong concept in the community where thetarget user is located and calculate the similarity between theuser and other users in the community according to theimproved similarity calculation formula -e target userrsquoscommunity the user-label association strength concept isconstructed by using the proposed user-label associationstrength concept and the similarity between the user andother users in the community is calculated according to theimproved similarity calculation formula -en recommendthe top K users with the highest similarity as potentialfriends for the target users

4 Experiment Design and Discussion

41 Experimental Dataset In order to test the algorithm ofthis paper we have downloaded the classic MovieLens fromthe Internet and dataset and stored in local database

MovieLens (httpmovielensumnedu) was found in1997 to a recommendation system web film based on thenumber of users tens of thousands of daily access to thesystem and the system can accept user score of the film andmovie recommendation list for the user Currently the sitersquosusers have more than 43000 people -e user rating of thefilm has more than 3500

We selected 943 users from the user-rating database and1682 movies a total of 100000 scoring information each ofwhich has at least 20 score information score from 1 to 5 ofthe integer the greater the score the higher the degree of theuserrsquos preference for the film conversely the lower

42 Experimental Evaluation Index Different similarityindicators will result in different prediction scores Meanabsolute error (MAE) and root mean square error (RMSE)are two common indicators for measuring the accuracy ofsimilarity methods -e smaller the value the better theprediction accuracy -e usual approach is to divide theoriginal dataset into the training set and test set then use thetraining set to get the prediction model and the test set toevaluate the prediction results that is the prediction resultsand the actual score of the MAE and RMSE values It isdefined as follows

(1) Mean absolute error (MAE) is the average of theabsolute error of the userrsquos predicted score rui

prime andthe true score rui in the scoring test set

MAE 1

|p|1113944

(ui)isinprui minus ruiprime

11138681113868111386811138681113868111386811138681113868 (7)

(2) Root-mean-square error (RMSE) is the mean squareroot of the true score value rui and the predictedscore value rui

prime of the user in the test set

8 Mobile Information Systems

RMSE

1|p|

1113944(ui)isinp

rui minus ruiprime( 1113857

2

11139741113972

(8)

|p| in equations (7) and (8) represents the size of thetest set

43 Experimental Program To facilitate the evaluation ofperformance indicators the original MovieLens datasetsare divided into training sets and test sets -e authorrandomly selects a part from the user scoring set as thetraining set and the remaining scoring data shall becomethe test set -e training set is used for model trainingwhereas the test set is used for testing the pros and cons-is paper tests the collaborative filtering algorithm basedon the community detection as well as MAE and RMSEindicators of the collaborative filtering algorithm by usingPearsonrsquos correlation and cosine similarity -ere are twofactors significantly affected by the collaborative filteringrecommendation the scarcity of the dataset and the size ofneighbor user sets (ie the value of k)-ese two factors arealso considered in the validation of the collaborative fil-tering algorithm

Based on the above factors this paper designed thefollowing experimental scheme

Experiment 1 select different proportions of trainingsets and test sets in the MovieLens dataset If theneighbor user set size k is constant consider comparingthe performance of the above algorithms in the case ofdifferent data sparsity -ereby it is possible to verifythe extent to which the recommendation effect is af-fected under the condition that the system obtains avalid amount of informationExperiment 2 select a certain percentage of trainingand test sets in the MovieLens dataset When theneighbor user set size k is changed the performance ofthe above algorithm is compared to verify the influenceof the nearest neighbor user set size on the recom-mendation effect

In order to avoid the training set and test set of a certainpartition the experimental results are brought by chanceWe use the same method to divide the MovieLens datasetinto 5 and get 5 sets of different training sets and thecorresponding test sets So each training set and test set has5 experiments and take the average results of the 5 ex-periments is at the final result of the experiment

44 Experimental Results and Analysis In order to suc-cessfully carry out the experiment the need is to set someparameters In this paper we recommend that the size |U| ofthe candidate user setU in the algorithm be set to about 30of the number of users in the training set For conveniencethe collaborative filtering algorithm based on communitydetection in this paper uses CFCD (collaborative filteringcommunity detection) -e cooperative filtering algorithm

based on cosine similarity is expressed by CFC (collaborativefiltering cosine) -e collaborative filtering algorithm basedon Pearsonrsquos correlation is expressed by CFP (collaborativefiltering Pearson)

Experiment 1 with the MovieLens dataset the size ofthe training set is changed if the neighbor user set sizek value remains unchanged Table 2 is based on thepremise that the neighbor user set size k is 30 -eresults of experiments on the proportions of eachtraining set are shown in Table 2

We draw the line and bar chart corresponding to Ta-ble 2 -e details are shown in Figures 7 and 8 It can beseen from Figure 7 that the MAE value of the CFCDalgorithm proposed in this paper is smaller than that ofthe CFP algorithm and CFC algorithm Figure 8 showsthat the RMSE value of the CFCD algorithm proposedin this paper is also smaller than that of the CFP al-gorithm and the CFC algorithm -us the CFCD al-gorithm is better than the CFP algorithm and CFCalgorithmExperiment 2 with the MovieLens dataset in thecase of a certain size of the training set to change thesize of the K value of the nearest neighbor user setTable 3 shows the comparison of MAE and RMSEvalues at different k values for the training set ratio of80

We draw out Table 3 corresponding to the line chart andbar chart in detail It is shown in Figures 9 and 10

Experimental results show the MAE and RMSE values ofthe collaborative recommendation algorithm based oncommunity detection are smaller than the other two algo-rithms -e reaction is on the line graph and in the case of acertain proportion of the training set the value of k in-creases -e MAE and RSME polylines of CFCD are belowthose of CFC and CFP indicating that the proposed algo-rithm CFCD is better than the other two algorithms

When k takes about 30 in fact the MAE and RSME ofthe three algorithms are the smallest when the size of theneighbor user set is 30 the recommended effect is bestFigures 9 and 10 are the corresponding line graphs ofTable 3

It can be seen that CFCD algorithmrsquos MAE and RMSEvalues are lower than those of the other two algorithms Itcan be shown that the CFCD algorithm is better than theCFC and CFP algorithms

5 Conclusions

-is paper introduces social network-related technologiesinto collaborative recommendation technology and pro-poses a collaborative recommendation algorithm basedon community detection In order to solve the problemsexisting in the traditional and recommended technolo-gies this paper proposes a collaborative recommendationmethod based on community detection based on com-munity discovery technology and collaborative recom-mendation technology in social networks -e algorithm

Mobile Information Systems 9

adopts the method of rough selection of neighboring usersets which eectively reduce the system calculation timeand at the same time the scoring preprocessing method isadopted to prevent the problem caused by data sparsenesse performance of the proposed algorithm is veried byexperiments is paper designed two experiments usingthe MovieLens dataset e performance indicators MAE

and RMSE of the community-based collaborative recom-mendation technology and the collaborative recommen-dation technique based on Pearsonrsquos similarity and cosinesimilarity are tested separately e experimental resultsshow that the proposed technique is superior to the per-formance of collaborative recommendation based onPearsonrsquos similarity and Cosine similarity

Table 2 Experimental results for dierent training set ratios(k 30)

Ratio oftraining set ()

MAE RMSECFCD CFC CFP CFCD CFC CFP

20 0814 0847 0847 0995 1011 100130 0803 0848 0825 0965 0995 098740 0794 0834 0814 0968 0978 096850 0780 0835 0800 0917 0957 093560 0771 0830 0798 0867 0934 091470 0770 0828 0787 0827 0899 086780 0765 0823 0777 0812 0844 0822

MA

E

Training set ratio ()

CFCDCFPCFC

087

085

083

079

077

075

081

200 40 60 10080

Figure 7 MAE results for training set ratio changes (k 30)

20 30Training set ratio ()

40 50 60 70 80

087

085

083

079

077

075

081

RMSE

CFCDCFPCFC

Figure 8 RMSE results for training set ratio changes (k 30)

083082081080079078077076

0 20 40 60 80 100 120

MA

E

k

CFCDCFPCFC

Figure 9 K value variation of the MAE results (training set ratio of80)

10 30

085

084

083

081

08

079

082

k

RMSE

CFCDCFPCFC

20 40 50 60 70 80 90 100

Figure 10 K value variation of the RMSE results (training set ratioof 80)

Table 3 Experimental results for the training set ratio of 80

KMAE RMSE

CFCD CFC CFP CFCD CFC CFP10 0772 0829 0708 0819 0845 082320 0767 0823 0777 0812 0846 082430 0765 0824 0776 0813 0845 082740 0763 0826 0778 0815 0847 082350 0771 0823 0779 0816 0847 082460 0772 0826 0780 0817 0848 082570 0775 0824 0781 0816 0845 082680 0778 0823 0782 0817 0849 082790 0776 0824 0783 0818 0846 0824100 0781 0831 0783 0819 0851 0827

10 Mobile Information Systems

Further research will be carried out in the followingareas

(1) Information acquisition in addition to obtainingbasic user information it is necessary to dig out moreinformation implied by users In case if the new userrating information is insufficient the userrsquos non-evaluation information may be considered as theuserrsquos supplementary information so that the newuser can be more accurately recommended

(2) Recommended technical aspects in order to improvethe accuracy and scalability of the recommendationsystem and ensure that the system performs real-timerecommendation under large-scale data it is neces-sary to conduct in-depth exploration and research onthe recommended technology

Data Availability

-e data used to support the findings of this study are in-cluded within the article Readers can access the data sup-porting the conclusions of the study fromMovieLens (httpmovielensumnedu)

Conflicts of Interest

-e authors declare that they have no conflicts of interest

References

[1] H Zarzour F M Mohamed S Chaouki and C ChemamldquoAn improved collaborative filtering recommendationalgorithm for big datardquo Computational Intelligence and ItsApplications pp 660ndash668 2018

[2] H Zarzour Z A Sharif M A Ayyoub and Y Jararweh ldquoAnew collaborative filtering recommendation algorithm basedon dimensionality reduction and clustering techniquesrdquo inProceedings of the 2018 9th International Conference on In-formation and Communication Systems (ICICS) pp 102ndash106Lille France April 2018

[3] Z Chen Z Ning Q Xiong and M S Obaidat ldquoA collab-orative filtering recommendation-based scheme for WLANswith differentiated access servicerdquo IEEE Systems Journalvol 12 no 1 pp 1004ndash1014 2018

[4] C Lu ldquoRecommendation algorithm based on collaborativefiltering under social network environmentrdquo BioTechnologyAn Indian Journal vol 10 no 13 pp 7087ndash7092 2014

[5] Q-J Tan H Wu C Wang and Q Guo ldquoA personalizedhybrid recommendation strategy based on user behaviors andits applicationrdquo in Proceedings of the 2017 InternationalConference on Security Pattern Analysis and Cyberneticspp 181ndash186 SPAC Shenzhen China December 2017

[6] L Guo J LiangYing ZY Luo and L Sun ldquoCollaborativefiltering recommendation based on trust and emotionrdquoJournal of Intelligent Information Systems pp 1ndash23 2018

[7] J Aranda I Givoni J Handcock and D Tarlow An OnlineSocial Network-Based Recommendation System University ofToronto Ontario Canada 2007

[8] J Stan F Muhlenbach and C Largeron ldquoRecommendersystems using social network analysis challenges and futuretrendsrdquo in Encyclopedia of Social Network Analysis andMining pp 1ndash22 Springer New York NY USA 2014

[9] K G Saranya and G Sudha Sadasivam ldquoPersonalized newsarticle recommendation with novelty using collaborativefiltering based rough set theoryrdquo Mobile Networks andApplications vol 22 no 4 pp 719ndash729 2017

[10] L O Colombo-Mendoza R Valencia-Garcıa R Colomo-Palacios and G Alor-Hernandez ldquoA knowledge-basedmulti-criteria collaborative filtering approach for discover-ing services in mobile cloud computing platformsrdquo Journal ofIntelligent Information Systems pp 1ndash25 2018

[11] H-G Ko I-Y Ko and D Lee ldquoMulti-criteria matrixlocalization and integration for personalized collaborativefiltering in IoT environmentsrdquo Multimedia Tools and Appli-cations vol 77 no 4 pp 4697ndash4730 2018

[12] A M Rashid K L Shyong and KL George ldquoA highlyscalable hybrid model amp memory-based CF algorithmrdquo inProceeding of the WEBKDD of ACM Macau China June2014

[13] G-R Xue C Lin Q Yang et al ldquoScalable collaborativefiltering using cluster-based smoothingrdquo in Proceedings ofthe 20th Annual International ACM SIGIR Conferencepp 15ndash19 Tampere Finland August 2015

[14] A Clauset ldquoFinding local community structure in networksrdquoPhysical Review E vol 72 no 2 pp 56ndash70 2015

[15] W W Zachary ldquoAn information flow model for conflict andfission in small groupsrdquo Journal of Anthropological Researchvol 33 no 4 pp 452ndash473 1977

[16] M Girvan and M E J Newman ldquoCommunity structure insocial and biological networksrdquo Proceedings of the NationalAcademy of Sciences vol 99 no 12 pp 7821ndash7826 2012

[17] M E J Newman and M Girvan ldquoFinding and evaluatingcommunity structure in networksrdquo Physical Review E vol 69no 2 pp 25ndash32 2014

[18] H W Shen X-Q Cheng and J F Guo ldquoQuantifying andidentifying the overlapping community structure in net-worksrdquo Journal of Statistical Mechanics eory and Experi-ment vol 2009 no 7 Article ID P07042 42 pages 2009

[19] S Zhang R-S Wang and X-S zhang ldquoIdentification ofoverlapping community structure in complex networks usingfuzzy-means clusteringrdquo Physica A Statistical Mechanics andIts Applications vol 374 no 1 pp 483ndash490 2007

[20] T S Evans and R Lambiotte ldquoLine graphs link partitionsand overlapping communitiesrdquo Physical Review vol 80 no 1pp 145ndash148 2009

[21] C Duanbing Y Fu and M S Shang ldquoAn efficient algorithmfor overlapping community detection in complexrdquo GlobalCongress on Intelligent Systems pp 244ndash247 2009

Mobile Information Systems 11

Computer Games Technology

International Journal of

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom

Journal ofEngineeringVolume 2018

Advances in

FuzzySystems

Hindawiwwwhindawicom

Volume 2018

International Journal of

ReconfigurableComputing

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Applied Computational Intelligence and Soft Computing

thinspAdvancesthinspinthinsp

thinspArtificial Intelligence

Hindawiwwwhindawicom Volumethinsp2018

Hindawiwwwhindawicom Volume 2018

Civil EngineeringAdvances in

Hindawiwwwhindawicom Volume 2018

Electrical and Computer Engineering

Journal of

Journal of

Computer Networks and Communications

Hindawiwwwhindawicom Volume 2018

Hindawi

wwwhindawicom Volume 2018

Advances in

Multimedia

International Journal of

Biomedical Imaging

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Engineering Mathematics

International Journal of

RoboticsJournal of

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Computational Intelligence and Neuroscience

Hindawiwwwhindawicom Volume 2018

Mathematical Problems in Engineering

Modelling ampSimulationin EngineeringHindawiwwwhindawicom Volume 2018

Hindawi Publishing Corporation httpwwwhindawicom Volume 2013Hindawiwwwhindawicom

The Scientific World Journal

Volume 2018

Hindawiwwwhindawicom Volume 2018

Human-ComputerInteraction

Advances in

Hindawiwwwhindawicom Volume 2018

Scientic Programming

Submit your manuscripts atwwwhindawicom

Page 9: AnImprovedCollaborativeFilteringRecommendation ...downloads.hindawi.com/journals/misy/2019/3560968.pdf · the collaborative recommendation technology and the continuous expansion

RMSE

1|p|

1113944(ui)isinp

rui minus ruiprime( 1113857

2

11139741113972

(8)

|p| in equations (7) and (8) represents the size of thetest set

43 Experimental Program To facilitate the evaluation ofperformance indicators the original MovieLens datasetsare divided into training sets and test sets -e authorrandomly selects a part from the user scoring set as thetraining set and the remaining scoring data shall becomethe test set -e training set is used for model trainingwhereas the test set is used for testing the pros and cons-is paper tests the collaborative filtering algorithm basedon the community detection as well as MAE and RMSEindicators of the collaborative filtering algorithm by usingPearsonrsquos correlation and cosine similarity -ere are twofactors significantly affected by the collaborative filteringrecommendation the scarcity of the dataset and the size ofneighbor user sets (ie the value of k)-ese two factors arealso considered in the validation of the collaborative fil-tering algorithm

Based on the above factors this paper designed thefollowing experimental scheme

Experiment 1 select different proportions of trainingsets and test sets in the MovieLens dataset If theneighbor user set size k is constant consider comparingthe performance of the above algorithms in the case ofdifferent data sparsity -ereby it is possible to verifythe extent to which the recommendation effect is af-fected under the condition that the system obtains avalid amount of informationExperiment 2 select a certain percentage of trainingand test sets in the MovieLens dataset When theneighbor user set size k is changed the performance ofthe above algorithm is compared to verify the influenceof the nearest neighbor user set size on the recom-mendation effect

In order to avoid the training set and test set of a certainpartition the experimental results are brought by chanceWe use the same method to divide the MovieLens datasetinto 5 and get 5 sets of different training sets and thecorresponding test sets So each training set and test set has5 experiments and take the average results of the 5 ex-periments is at the final result of the experiment

44 Experimental Results and Analysis In order to suc-cessfully carry out the experiment the need is to set someparameters In this paper we recommend that the size |U| ofthe candidate user setU in the algorithm be set to about 30of the number of users in the training set For conveniencethe collaborative filtering algorithm based on communitydetection in this paper uses CFCD (collaborative filteringcommunity detection) -e cooperative filtering algorithm

based on cosine similarity is expressed by CFC (collaborativefiltering cosine) -e collaborative filtering algorithm basedon Pearsonrsquos correlation is expressed by CFP (collaborativefiltering Pearson)

Experiment 1 with the MovieLens dataset the size ofthe training set is changed if the neighbor user set sizek value remains unchanged Table 2 is based on thepremise that the neighbor user set size k is 30 -eresults of experiments on the proportions of eachtraining set are shown in Table 2

We draw the line and bar chart corresponding to Ta-ble 2 -e details are shown in Figures 7 and 8 It can beseen from Figure 7 that the MAE value of the CFCDalgorithm proposed in this paper is smaller than that ofthe CFP algorithm and CFC algorithm Figure 8 showsthat the RMSE value of the CFCD algorithm proposedin this paper is also smaller than that of the CFP al-gorithm and the CFC algorithm -us the CFCD al-gorithm is better than the CFP algorithm and CFCalgorithmExperiment 2 with the MovieLens dataset in thecase of a certain size of the training set to change thesize of the K value of the nearest neighbor user setTable 3 shows the comparison of MAE and RMSEvalues at different k values for the training set ratio of80

We draw out Table 3 corresponding to the line chart andbar chart in detail It is shown in Figures 9 and 10

Experimental results show the MAE and RMSE values ofthe collaborative recommendation algorithm based oncommunity detection are smaller than the other two algo-rithms -e reaction is on the line graph and in the case of acertain proportion of the training set the value of k in-creases -e MAE and RSME polylines of CFCD are belowthose of CFC and CFP indicating that the proposed algo-rithm CFCD is better than the other two algorithms

When k takes about 30 in fact the MAE and RSME ofthe three algorithms are the smallest when the size of theneighbor user set is 30 the recommended effect is bestFigures 9 and 10 are the corresponding line graphs ofTable 3

It can be seen that CFCD algorithmrsquos MAE and RMSEvalues are lower than those of the other two algorithms Itcan be shown that the CFCD algorithm is better than theCFC and CFP algorithms

5 Conclusions

-is paper introduces social network-related technologiesinto collaborative recommendation technology and pro-poses a collaborative recommendation algorithm basedon community detection In order to solve the problemsexisting in the traditional and recommended technolo-gies this paper proposes a collaborative recommendationmethod based on community detection based on com-munity discovery technology and collaborative recom-mendation technology in social networks -e algorithm

Mobile Information Systems 9

adopts the method of rough selection of neighboring usersets which eectively reduce the system calculation timeand at the same time the scoring preprocessing method isadopted to prevent the problem caused by data sparsenesse performance of the proposed algorithm is veried byexperiments is paper designed two experiments usingthe MovieLens dataset e performance indicators MAE

and RMSE of the community-based collaborative recom-mendation technology and the collaborative recommen-dation technique based on Pearsonrsquos similarity and cosinesimilarity are tested separately e experimental resultsshow that the proposed technique is superior to the per-formance of collaborative recommendation based onPearsonrsquos similarity and Cosine similarity

Table 2 Experimental results for dierent training set ratios(k 30)

Ratio oftraining set ()

MAE RMSECFCD CFC CFP CFCD CFC CFP

20 0814 0847 0847 0995 1011 100130 0803 0848 0825 0965 0995 098740 0794 0834 0814 0968 0978 096850 0780 0835 0800 0917 0957 093560 0771 0830 0798 0867 0934 091470 0770 0828 0787 0827 0899 086780 0765 0823 0777 0812 0844 0822

MA

E

Training set ratio ()

CFCDCFPCFC

087

085

083

079

077

075

081

200 40 60 10080

Figure 7 MAE results for training set ratio changes (k 30)

20 30Training set ratio ()

40 50 60 70 80

087

085

083

079

077

075

081

RMSE

CFCDCFPCFC

Figure 8 RMSE results for training set ratio changes (k 30)

083082081080079078077076

0 20 40 60 80 100 120

MA

E

k

CFCDCFPCFC

Figure 9 K value variation of the MAE results (training set ratio of80)

10 30

085

084

083

081

08

079

082

k

RMSE

CFCDCFPCFC

20 40 50 60 70 80 90 100

Figure 10 K value variation of the RMSE results (training set ratioof 80)

Table 3 Experimental results for the training set ratio of 80

KMAE RMSE

CFCD CFC CFP CFCD CFC CFP10 0772 0829 0708 0819 0845 082320 0767 0823 0777 0812 0846 082430 0765 0824 0776 0813 0845 082740 0763 0826 0778 0815 0847 082350 0771 0823 0779 0816 0847 082460 0772 0826 0780 0817 0848 082570 0775 0824 0781 0816 0845 082680 0778 0823 0782 0817 0849 082790 0776 0824 0783 0818 0846 0824100 0781 0831 0783 0819 0851 0827

10 Mobile Information Systems

Further research will be carried out in the followingareas

(1) Information acquisition in addition to obtainingbasic user information it is necessary to dig out moreinformation implied by users In case if the new userrating information is insufficient the userrsquos non-evaluation information may be considered as theuserrsquos supplementary information so that the newuser can be more accurately recommended

(2) Recommended technical aspects in order to improvethe accuracy and scalability of the recommendationsystem and ensure that the system performs real-timerecommendation under large-scale data it is neces-sary to conduct in-depth exploration and research onthe recommended technology

Data Availability

-e data used to support the findings of this study are in-cluded within the article Readers can access the data sup-porting the conclusions of the study fromMovieLens (httpmovielensumnedu)

Conflicts of Interest

-e authors declare that they have no conflicts of interest

References

[1] H Zarzour F M Mohamed S Chaouki and C ChemamldquoAn improved collaborative filtering recommendationalgorithm for big datardquo Computational Intelligence and ItsApplications pp 660ndash668 2018

[2] H Zarzour Z A Sharif M A Ayyoub and Y Jararweh ldquoAnew collaborative filtering recommendation algorithm basedon dimensionality reduction and clustering techniquesrdquo inProceedings of the 2018 9th International Conference on In-formation and Communication Systems (ICICS) pp 102ndash106Lille France April 2018

[3] Z Chen Z Ning Q Xiong and M S Obaidat ldquoA collab-orative filtering recommendation-based scheme for WLANswith differentiated access servicerdquo IEEE Systems Journalvol 12 no 1 pp 1004ndash1014 2018

[4] C Lu ldquoRecommendation algorithm based on collaborativefiltering under social network environmentrdquo BioTechnologyAn Indian Journal vol 10 no 13 pp 7087ndash7092 2014

[5] Q-J Tan H Wu C Wang and Q Guo ldquoA personalizedhybrid recommendation strategy based on user behaviors andits applicationrdquo in Proceedings of the 2017 InternationalConference on Security Pattern Analysis and Cyberneticspp 181ndash186 SPAC Shenzhen China December 2017

[6] L Guo J LiangYing ZY Luo and L Sun ldquoCollaborativefiltering recommendation based on trust and emotionrdquoJournal of Intelligent Information Systems pp 1ndash23 2018

[7] J Aranda I Givoni J Handcock and D Tarlow An OnlineSocial Network-Based Recommendation System University ofToronto Ontario Canada 2007

[8] J Stan F Muhlenbach and C Largeron ldquoRecommendersystems using social network analysis challenges and futuretrendsrdquo in Encyclopedia of Social Network Analysis andMining pp 1ndash22 Springer New York NY USA 2014

[9] K G Saranya and G Sudha Sadasivam ldquoPersonalized newsarticle recommendation with novelty using collaborativefiltering based rough set theoryrdquo Mobile Networks andApplications vol 22 no 4 pp 719ndash729 2017

[10] L O Colombo-Mendoza R Valencia-Garcıa R Colomo-Palacios and G Alor-Hernandez ldquoA knowledge-basedmulti-criteria collaborative filtering approach for discover-ing services in mobile cloud computing platformsrdquo Journal ofIntelligent Information Systems pp 1ndash25 2018

[11] H-G Ko I-Y Ko and D Lee ldquoMulti-criteria matrixlocalization and integration for personalized collaborativefiltering in IoT environmentsrdquo Multimedia Tools and Appli-cations vol 77 no 4 pp 4697ndash4730 2018

[12] A M Rashid K L Shyong and KL George ldquoA highlyscalable hybrid model amp memory-based CF algorithmrdquo inProceeding of the WEBKDD of ACM Macau China June2014

[13] G-R Xue C Lin Q Yang et al ldquoScalable collaborativefiltering using cluster-based smoothingrdquo in Proceedings ofthe 20th Annual International ACM SIGIR Conferencepp 15ndash19 Tampere Finland August 2015

[14] A Clauset ldquoFinding local community structure in networksrdquoPhysical Review E vol 72 no 2 pp 56ndash70 2015

[15] W W Zachary ldquoAn information flow model for conflict andfission in small groupsrdquo Journal of Anthropological Researchvol 33 no 4 pp 452ndash473 1977

[16] M Girvan and M E J Newman ldquoCommunity structure insocial and biological networksrdquo Proceedings of the NationalAcademy of Sciences vol 99 no 12 pp 7821ndash7826 2012

[17] M E J Newman and M Girvan ldquoFinding and evaluatingcommunity structure in networksrdquo Physical Review E vol 69no 2 pp 25ndash32 2014

[18] H W Shen X-Q Cheng and J F Guo ldquoQuantifying andidentifying the overlapping community structure in net-worksrdquo Journal of Statistical Mechanics eory and Experi-ment vol 2009 no 7 Article ID P07042 42 pages 2009

[19] S Zhang R-S Wang and X-S zhang ldquoIdentification ofoverlapping community structure in complex networks usingfuzzy-means clusteringrdquo Physica A Statistical Mechanics andIts Applications vol 374 no 1 pp 483ndash490 2007

[20] T S Evans and R Lambiotte ldquoLine graphs link partitionsand overlapping communitiesrdquo Physical Review vol 80 no 1pp 145ndash148 2009

[21] C Duanbing Y Fu and M S Shang ldquoAn efficient algorithmfor overlapping community detection in complexrdquo GlobalCongress on Intelligent Systems pp 244ndash247 2009

Mobile Information Systems 11

Computer Games Technology

International Journal of

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom

Journal ofEngineeringVolume 2018

Advances in

FuzzySystems

Hindawiwwwhindawicom

Volume 2018

International Journal of

ReconfigurableComputing

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Applied Computational Intelligence and Soft Computing

thinspAdvancesthinspinthinsp

thinspArtificial Intelligence

Hindawiwwwhindawicom Volumethinsp2018

Hindawiwwwhindawicom Volume 2018

Civil EngineeringAdvances in

Hindawiwwwhindawicom Volume 2018

Electrical and Computer Engineering

Journal of

Journal of

Computer Networks and Communications

Hindawiwwwhindawicom Volume 2018

Hindawi

wwwhindawicom Volume 2018

Advances in

Multimedia

International Journal of

Biomedical Imaging

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Engineering Mathematics

International Journal of

RoboticsJournal of

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Computational Intelligence and Neuroscience

Hindawiwwwhindawicom Volume 2018

Mathematical Problems in Engineering

Modelling ampSimulationin EngineeringHindawiwwwhindawicom Volume 2018

Hindawi Publishing Corporation httpwwwhindawicom Volume 2013Hindawiwwwhindawicom

The Scientific World Journal

Volume 2018

Hindawiwwwhindawicom Volume 2018

Human-ComputerInteraction

Advances in

Hindawiwwwhindawicom Volume 2018

Scientic Programming

Submit your manuscripts atwwwhindawicom

Page 10: AnImprovedCollaborativeFilteringRecommendation ...downloads.hindawi.com/journals/misy/2019/3560968.pdf · the collaborative recommendation technology and the continuous expansion

adopts the method of rough selection of neighboring usersets which eectively reduce the system calculation timeand at the same time the scoring preprocessing method isadopted to prevent the problem caused by data sparsenesse performance of the proposed algorithm is veried byexperiments is paper designed two experiments usingthe MovieLens dataset e performance indicators MAE

and RMSE of the community-based collaborative recom-mendation technology and the collaborative recommen-dation technique based on Pearsonrsquos similarity and cosinesimilarity are tested separately e experimental resultsshow that the proposed technique is superior to the per-formance of collaborative recommendation based onPearsonrsquos similarity and Cosine similarity

Table 2 Experimental results for dierent training set ratios(k 30)

Ratio oftraining set ()

MAE RMSECFCD CFC CFP CFCD CFC CFP

20 0814 0847 0847 0995 1011 100130 0803 0848 0825 0965 0995 098740 0794 0834 0814 0968 0978 096850 0780 0835 0800 0917 0957 093560 0771 0830 0798 0867 0934 091470 0770 0828 0787 0827 0899 086780 0765 0823 0777 0812 0844 0822

MA

E

Training set ratio ()

CFCDCFPCFC

087

085

083

079

077

075

081

200 40 60 10080

Figure 7 MAE results for training set ratio changes (k 30)

20 30Training set ratio ()

40 50 60 70 80

087

085

083

079

077

075

081

RMSE

CFCDCFPCFC

Figure 8 RMSE results for training set ratio changes (k 30)

083082081080079078077076

0 20 40 60 80 100 120

MA

E

k

CFCDCFPCFC

Figure 9 K value variation of the MAE results (training set ratio of80)

10 30

085

084

083

081

08

079

082

k

RMSE

CFCDCFPCFC

20 40 50 60 70 80 90 100

Figure 10 K value variation of the RMSE results (training set ratioof 80)

Table 3 Experimental results for the training set ratio of 80

KMAE RMSE

CFCD CFC CFP CFCD CFC CFP10 0772 0829 0708 0819 0845 082320 0767 0823 0777 0812 0846 082430 0765 0824 0776 0813 0845 082740 0763 0826 0778 0815 0847 082350 0771 0823 0779 0816 0847 082460 0772 0826 0780 0817 0848 082570 0775 0824 0781 0816 0845 082680 0778 0823 0782 0817 0849 082790 0776 0824 0783 0818 0846 0824100 0781 0831 0783 0819 0851 0827

10 Mobile Information Systems

Further research will be carried out in the followingareas

(1) Information acquisition in addition to obtainingbasic user information it is necessary to dig out moreinformation implied by users In case if the new userrating information is insufficient the userrsquos non-evaluation information may be considered as theuserrsquos supplementary information so that the newuser can be more accurately recommended

(2) Recommended technical aspects in order to improvethe accuracy and scalability of the recommendationsystem and ensure that the system performs real-timerecommendation under large-scale data it is neces-sary to conduct in-depth exploration and research onthe recommended technology

Data Availability

-e data used to support the findings of this study are in-cluded within the article Readers can access the data sup-porting the conclusions of the study fromMovieLens (httpmovielensumnedu)

Conflicts of Interest

-e authors declare that they have no conflicts of interest

References

[1] H Zarzour F M Mohamed S Chaouki and C ChemamldquoAn improved collaborative filtering recommendationalgorithm for big datardquo Computational Intelligence and ItsApplications pp 660ndash668 2018

[2] H Zarzour Z A Sharif M A Ayyoub and Y Jararweh ldquoAnew collaborative filtering recommendation algorithm basedon dimensionality reduction and clustering techniquesrdquo inProceedings of the 2018 9th International Conference on In-formation and Communication Systems (ICICS) pp 102ndash106Lille France April 2018

[3] Z Chen Z Ning Q Xiong and M S Obaidat ldquoA collab-orative filtering recommendation-based scheme for WLANswith differentiated access servicerdquo IEEE Systems Journalvol 12 no 1 pp 1004ndash1014 2018

[4] C Lu ldquoRecommendation algorithm based on collaborativefiltering under social network environmentrdquo BioTechnologyAn Indian Journal vol 10 no 13 pp 7087ndash7092 2014

[5] Q-J Tan H Wu C Wang and Q Guo ldquoA personalizedhybrid recommendation strategy based on user behaviors andits applicationrdquo in Proceedings of the 2017 InternationalConference on Security Pattern Analysis and Cyberneticspp 181ndash186 SPAC Shenzhen China December 2017

[6] L Guo J LiangYing ZY Luo and L Sun ldquoCollaborativefiltering recommendation based on trust and emotionrdquoJournal of Intelligent Information Systems pp 1ndash23 2018

[7] J Aranda I Givoni J Handcock and D Tarlow An OnlineSocial Network-Based Recommendation System University ofToronto Ontario Canada 2007

[8] J Stan F Muhlenbach and C Largeron ldquoRecommendersystems using social network analysis challenges and futuretrendsrdquo in Encyclopedia of Social Network Analysis andMining pp 1ndash22 Springer New York NY USA 2014

[9] K G Saranya and G Sudha Sadasivam ldquoPersonalized newsarticle recommendation with novelty using collaborativefiltering based rough set theoryrdquo Mobile Networks andApplications vol 22 no 4 pp 719ndash729 2017

[10] L O Colombo-Mendoza R Valencia-Garcıa R Colomo-Palacios and G Alor-Hernandez ldquoA knowledge-basedmulti-criteria collaborative filtering approach for discover-ing services in mobile cloud computing platformsrdquo Journal ofIntelligent Information Systems pp 1ndash25 2018

[11] H-G Ko I-Y Ko and D Lee ldquoMulti-criteria matrixlocalization and integration for personalized collaborativefiltering in IoT environmentsrdquo Multimedia Tools and Appli-cations vol 77 no 4 pp 4697ndash4730 2018

[12] A M Rashid K L Shyong and KL George ldquoA highlyscalable hybrid model amp memory-based CF algorithmrdquo inProceeding of the WEBKDD of ACM Macau China June2014

[13] G-R Xue C Lin Q Yang et al ldquoScalable collaborativefiltering using cluster-based smoothingrdquo in Proceedings ofthe 20th Annual International ACM SIGIR Conferencepp 15ndash19 Tampere Finland August 2015

[14] A Clauset ldquoFinding local community structure in networksrdquoPhysical Review E vol 72 no 2 pp 56ndash70 2015

[15] W W Zachary ldquoAn information flow model for conflict andfission in small groupsrdquo Journal of Anthropological Researchvol 33 no 4 pp 452ndash473 1977

[16] M Girvan and M E J Newman ldquoCommunity structure insocial and biological networksrdquo Proceedings of the NationalAcademy of Sciences vol 99 no 12 pp 7821ndash7826 2012

[17] M E J Newman and M Girvan ldquoFinding and evaluatingcommunity structure in networksrdquo Physical Review E vol 69no 2 pp 25ndash32 2014

[18] H W Shen X-Q Cheng and J F Guo ldquoQuantifying andidentifying the overlapping community structure in net-worksrdquo Journal of Statistical Mechanics eory and Experi-ment vol 2009 no 7 Article ID P07042 42 pages 2009

[19] S Zhang R-S Wang and X-S zhang ldquoIdentification ofoverlapping community structure in complex networks usingfuzzy-means clusteringrdquo Physica A Statistical Mechanics andIts Applications vol 374 no 1 pp 483ndash490 2007

[20] T S Evans and R Lambiotte ldquoLine graphs link partitionsand overlapping communitiesrdquo Physical Review vol 80 no 1pp 145ndash148 2009

[21] C Duanbing Y Fu and M S Shang ldquoAn efficient algorithmfor overlapping community detection in complexrdquo GlobalCongress on Intelligent Systems pp 244ndash247 2009

Mobile Information Systems 11

Computer Games Technology

International Journal of

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom

Journal ofEngineeringVolume 2018

Advances in

FuzzySystems

Hindawiwwwhindawicom

Volume 2018

International Journal of

ReconfigurableComputing

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Applied Computational Intelligence and Soft Computing

thinspAdvancesthinspinthinsp

thinspArtificial Intelligence

Hindawiwwwhindawicom Volumethinsp2018

Hindawiwwwhindawicom Volume 2018

Civil EngineeringAdvances in

Hindawiwwwhindawicom Volume 2018

Electrical and Computer Engineering

Journal of

Journal of

Computer Networks and Communications

Hindawiwwwhindawicom Volume 2018

Hindawi

wwwhindawicom Volume 2018

Advances in

Multimedia

International Journal of

Biomedical Imaging

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Engineering Mathematics

International Journal of

RoboticsJournal of

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Computational Intelligence and Neuroscience

Hindawiwwwhindawicom Volume 2018

Mathematical Problems in Engineering

Modelling ampSimulationin EngineeringHindawiwwwhindawicom Volume 2018

Hindawi Publishing Corporation httpwwwhindawicom Volume 2013Hindawiwwwhindawicom

The Scientific World Journal

Volume 2018

Hindawiwwwhindawicom Volume 2018

Human-ComputerInteraction

Advances in

Hindawiwwwhindawicom Volume 2018

Scientic Programming

Submit your manuscripts atwwwhindawicom

Page 11: AnImprovedCollaborativeFilteringRecommendation ...downloads.hindawi.com/journals/misy/2019/3560968.pdf · the collaborative recommendation technology and the continuous expansion

Further research will be carried out in the followingareas

(1) Information acquisition in addition to obtainingbasic user information it is necessary to dig out moreinformation implied by users In case if the new userrating information is insufficient the userrsquos non-evaluation information may be considered as theuserrsquos supplementary information so that the newuser can be more accurately recommended

(2) Recommended technical aspects in order to improvethe accuracy and scalability of the recommendationsystem and ensure that the system performs real-timerecommendation under large-scale data it is neces-sary to conduct in-depth exploration and research onthe recommended technology

Data Availability

-e data used to support the findings of this study are in-cluded within the article Readers can access the data sup-porting the conclusions of the study fromMovieLens (httpmovielensumnedu)

Conflicts of Interest

-e authors declare that they have no conflicts of interest

References

[1] H Zarzour F M Mohamed S Chaouki and C ChemamldquoAn improved collaborative filtering recommendationalgorithm for big datardquo Computational Intelligence and ItsApplications pp 660ndash668 2018

[2] H Zarzour Z A Sharif M A Ayyoub and Y Jararweh ldquoAnew collaborative filtering recommendation algorithm basedon dimensionality reduction and clustering techniquesrdquo inProceedings of the 2018 9th International Conference on In-formation and Communication Systems (ICICS) pp 102ndash106Lille France April 2018

[3] Z Chen Z Ning Q Xiong and M S Obaidat ldquoA collab-orative filtering recommendation-based scheme for WLANswith differentiated access servicerdquo IEEE Systems Journalvol 12 no 1 pp 1004ndash1014 2018

[4] C Lu ldquoRecommendation algorithm based on collaborativefiltering under social network environmentrdquo BioTechnologyAn Indian Journal vol 10 no 13 pp 7087ndash7092 2014

[5] Q-J Tan H Wu C Wang and Q Guo ldquoA personalizedhybrid recommendation strategy based on user behaviors andits applicationrdquo in Proceedings of the 2017 InternationalConference on Security Pattern Analysis and Cyberneticspp 181ndash186 SPAC Shenzhen China December 2017

[6] L Guo J LiangYing ZY Luo and L Sun ldquoCollaborativefiltering recommendation based on trust and emotionrdquoJournal of Intelligent Information Systems pp 1ndash23 2018

[7] J Aranda I Givoni J Handcock and D Tarlow An OnlineSocial Network-Based Recommendation System University ofToronto Ontario Canada 2007

[8] J Stan F Muhlenbach and C Largeron ldquoRecommendersystems using social network analysis challenges and futuretrendsrdquo in Encyclopedia of Social Network Analysis andMining pp 1ndash22 Springer New York NY USA 2014

[9] K G Saranya and G Sudha Sadasivam ldquoPersonalized newsarticle recommendation with novelty using collaborativefiltering based rough set theoryrdquo Mobile Networks andApplications vol 22 no 4 pp 719ndash729 2017

[10] L O Colombo-Mendoza R Valencia-Garcıa R Colomo-Palacios and G Alor-Hernandez ldquoA knowledge-basedmulti-criteria collaborative filtering approach for discover-ing services in mobile cloud computing platformsrdquo Journal ofIntelligent Information Systems pp 1ndash25 2018

[11] H-G Ko I-Y Ko and D Lee ldquoMulti-criteria matrixlocalization and integration for personalized collaborativefiltering in IoT environmentsrdquo Multimedia Tools and Appli-cations vol 77 no 4 pp 4697ndash4730 2018

[12] A M Rashid K L Shyong and KL George ldquoA highlyscalable hybrid model amp memory-based CF algorithmrdquo inProceeding of the WEBKDD of ACM Macau China June2014

[13] G-R Xue C Lin Q Yang et al ldquoScalable collaborativefiltering using cluster-based smoothingrdquo in Proceedings ofthe 20th Annual International ACM SIGIR Conferencepp 15ndash19 Tampere Finland August 2015

[14] A Clauset ldquoFinding local community structure in networksrdquoPhysical Review E vol 72 no 2 pp 56ndash70 2015

[15] W W Zachary ldquoAn information flow model for conflict andfission in small groupsrdquo Journal of Anthropological Researchvol 33 no 4 pp 452ndash473 1977

[16] M Girvan and M E J Newman ldquoCommunity structure insocial and biological networksrdquo Proceedings of the NationalAcademy of Sciences vol 99 no 12 pp 7821ndash7826 2012

[17] M E J Newman and M Girvan ldquoFinding and evaluatingcommunity structure in networksrdquo Physical Review E vol 69no 2 pp 25ndash32 2014

[18] H W Shen X-Q Cheng and J F Guo ldquoQuantifying andidentifying the overlapping community structure in net-worksrdquo Journal of Statistical Mechanics eory and Experi-ment vol 2009 no 7 Article ID P07042 42 pages 2009

[19] S Zhang R-S Wang and X-S zhang ldquoIdentification ofoverlapping community structure in complex networks usingfuzzy-means clusteringrdquo Physica A Statistical Mechanics andIts Applications vol 374 no 1 pp 483ndash490 2007

[20] T S Evans and R Lambiotte ldquoLine graphs link partitionsand overlapping communitiesrdquo Physical Review vol 80 no 1pp 145ndash148 2009

[21] C Duanbing Y Fu and M S Shang ldquoAn efficient algorithmfor overlapping community detection in complexrdquo GlobalCongress on Intelligent Systems pp 244ndash247 2009

Mobile Information Systems 11

Computer Games Technology

International Journal of

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom

Journal ofEngineeringVolume 2018

Advances in

FuzzySystems

Hindawiwwwhindawicom

Volume 2018

International Journal of

ReconfigurableComputing

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Applied Computational Intelligence and Soft Computing

thinspAdvancesthinspinthinsp

thinspArtificial Intelligence

Hindawiwwwhindawicom Volumethinsp2018

Hindawiwwwhindawicom Volume 2018

Civil EngineeringAdvances in

Hindawiwwwhindawicom Volume 2018

Electrical and Computer Engineering

Journal of

Journal of

Computer Networks and Communications

Hindawiwwwhindawicom Volume 2018

Hindawi

wwwhindawicom Volume 2018

Advances in

Multimedia

International Journal of

Biomedical Imaging

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Engineering Mathematics

International Journal of

RoboticsJournal of

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Computational Intelligence and Neuroscience

Hindawiwwwhindawicom Volume 2018

Mathematical Problems in Engineering

Modelling ampSimulationin EngineeringHindawiwwwhindawicom Volume 2018

Hindawi Publishing Corporation httpwwwhindawicom Volume 2013Hindawiwwwhindawicom

The Scientific World Journal

Volume 2018

Hindawiwwwhindawicom Volume 2018

Human-ComputerInteraction

Advances in

Hindawiwwwhindawicom Volume 2018

Scientic Programming

Submit your manuscripts atwwwhindawicom

Page 12: AnImprovedCollaborativeFilteringRecommendation ...downloads.hindawi.com/journals/misy/2019/3560968.pdf · the collaborative recommendation technology and the continuous expansion

Computer Games Technology

International Journal of

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom

Journal ofEngineeringVolume 2018

Advances in

FuzzySystems

Hindawiwwwhindawicom

Volume 2018

International Journal of

ReconfigurableComputing

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Applied Computational Intelligence and Soft Computing

thinspAdvancesthinspinthinsp

thinspArtificial Intelligence

Hindawiwwwhindawicom Volumethinsp2018

Hindawiwwwhindawicom Volume 2018

Civil EngineeringAdvances in

Hindawiwwwhindawicom Volume 2018

Electrical and Computer Engineering

Journal of

Journal of

Computer Networks and Communications

Hindawiwwwhindawicom Volume 2018

Hindawi

wwwhindawicom Volume 2018

Advances in

Multimedia

International Journal of

Biomedical Imaging

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Engineering Mathematics

International Journal of

RoboticsJournal of

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Computational Intelligence and Neuroscience

Hindawiwwwhindawicom Volume 2018

Mathematical Problems in Engineering

Modelling ampSimulationin EngineeringHindawiwwwhindawicom Volume 2018

Hindawi Publishing Corporation httpwwwhindawicom Volume 2013Hindawiwwwhindawicom

The Scientific World Journal

Volume 2018

Hindawiwwwhindawicom Volume 2018

Human-ComputerInteraction

Advances in

Hindawiwwwhindawicom Volume 2018

Scientic Programming

Submit your manuscripts atwwwhindawicom