CoNMF: Exploiting User Comments for Clustering Web2.0 Items
Presenter: He Xiangnan
28 June 2013
Email: [email protected]
School of Computing
National University of Singapore
Introduction
• Motivations:
– Users comment on items based on their own interests.
– Most users’ interests are limited.
– The categories of items can be inferred from the comments.
• Proposed problem:
– Clustering items by exploiting user comments.
• Applications:
– Improve search diversity.
– Automatic tag generation from comments.
– Group-based recommendation.
Challenges
• Traditional solution:
– Represent items in a feature space.
– Apply any clustering algorithm, e.g. k-means (see the baseline sketch below).
• Key challenges:
– Items have heterogeneous features:
1. Own features (e.g. words for articles, pixels for images)
2. Comments: usernames and textual contents
– Simply concatenating all features does not perform well.
– How to meaningfully combine the heterogeneous views to produce better clustering (i.e. multi-view clustering)?
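As a concrete reference point, here is a minimal sketch of that traditional single-view baseline: concatenate the views into one feature space and run k-means. The feature matrices below are hypothetical random stand-ins, not the actual Last.fm data.

```python
import numpy as np
from scipy.sparse import csr_matrix, hstack
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)

# Hypothetical stand-ins for the three item-feature views (rows = items);
# the real views are described on the Experiments slide.
X_desc  = csr_matrix(rng.random((100, 50)))                  # description words (TF-IDF)
X_comm  = csr_matrix(rng.random((100, 80)))                  # comment words (TF-IDF)
X_users = csr_matrix((rng.random((100, 200)) > 0.9) * 1.0)   # commenting users (Boolean)

# The "simple concatenation" approach: one long feature vector per item.
X = hstack([X_desc, X_comm, X_users]).toarray()

# Cluster with k-means (21 clusters, matching the ground truth used later).
labels = KMeans(n_clusters=21, n_init=10, random_state=0).fit_predict(X)
```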
Proposed solution
• Extend NMF (Nonnegative Matrix Factorization) to support multi-view clustering…
NMF (Non-negative Matrix Factorization)
• Factorize the data matrix V (#doc × #words) as V ≈ WH,
– where W is #doc × k, H is k × #words, and each entry is non-negative.
• Goal is minimizing the objective function ||V − WH||²,
– where || || denotes the Frobenius norm.
• Alternating optimization:
– With Lagrange multipliers, differentiate with respect to W and H respectively (multiplicative updates sketched below).
– Reaches a local optimum, not the global one!
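A minimal NumPy sketch of the standard multiplicative updates (Lee & Seung) that carry out this alternating minimization of ||V − WH||²; the talk does not show its implementation, so treat this as an illustration rather than the presenter's exact code:

```python
import numpy as np

def nmf(V, k, n_iter=200, eps=1e-9, seed=0):
    """Minimize ||V - WH||_F^2 with multiplicative updates."""
    rng = np.random.default_rng(seed)
    n, m = V.shape
    W = rng.random((n, k))   # #doc x k, non-negative
    H = rng.random((k, m))   # k x #words, non-negative
    for _ in range(n_iter):
        # Each update keeps all entries non-negative and never
        # increases the objective; convergence is to a local optimum.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

# For clustering, each item is assigned to the latent dimension
# with the largest coefficient in its row of W:
# labels = W.argmax(axis=1)
```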
Characteristics of NMF
• Matrix factorization with a non-negative constraint
• Reduces the dimension of the data; derives the latent space
• Difference with SVD (LSI), illustrated below:

Characteristic     SVD   NMF
Orthogonal basis   Yes   No
Negative entries   Yes   No
Post-clustering    Yes   No

• Theoretically proven suitable for clustering (Ding et al. 2005)
• Practically shown superior performance to SVD and k-means in document clustering (Xu et al. 2003)
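To illustrate the "post-clustering" row: SVD's latent factors can be negative, so a separate clustering step is needed on top of them, whereas NMF's non-negative W can be read directly as cluster memberships. A sketch, reusing the X matrix and nmf() function from the earlier snippets:

```python
from sklearn.decomposition import TruncatedSVD
from sklearn.cluster import KMeans

# SVD: latent factors may be negative, so a post-clustering step is required.
Z = TruncatedSVD(n_components=21, random_state=0).fit_transform(X)
svd_labels = KMeans(n_clusters=21, n_init=10, random_state=0).fit_predict(Z)

# NMF: the non-negative W reads directly as (soft) cluster memberships.
W, H = nmf(X, k=21)             # nmf() from the previous sketch
nmf_labels = W.argmax(axis=1)   # hard assignment: largest coefficient per item
```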
Extensions of NMF
• Relationships with other clustering algorithms:
– K-means: Orthogonal NMF = K-means
– PLSI: KL-divergence NMF = PLSI
– Spectral clustering
• Extensions:
– Tri-factorization of NMF (V = W S H) (Ding et al. 2006)
– NMF with sparseness constraints (Hoyer 2004)
– NMF with graph regularization (Cai et al. 2011)
– However, studies on NMF-based multi-view clustering approaches are quite limited (Liu et al. 2013).
• My proposal:– Extend NMF to support multi-view clustering
Proposed solution - CoNMF
• Idea:
– Couple the factorization processes of NMF across views.
• Example:
– Single NMF:
Factorization equation: V ≈ WH
Objective function: min ||V − WH||²
Constraints: all entries of W and H are non-negative.
– 2-view CoNMF:
Factorization equations: V1 ≈ W1H1, V2 ≈ W2H2
Objective function: λ1||V1 − W1H1||² + λ2||V2 − W2H2||² + a regularizer coupling W1 and W2 (options on the next slide)
CoNMF Framework
• Coupling the factorization processes of multiple matrices (i.e., views) via regularization.
• Objective function:
Σs λs ||Vs − WsHs||² + regularization term R
– A similar alternating optimization with Lagrange multipliers can solve it (sketched below).
• Different options for the regularization R:
– Centroid-based (Liu et al. 2013): pull every Ws toward a consensus matrix W*, e.g. Σs ||Ws − W*||²
– Mutual-based:
Point-wise: penalize pairwise entry differences, e.g. Σs,t ||Ws − Wt||²
Cluster-wise: penalize pairwise differences of cluster correlations, e.g. Σs,t ||WsᵀWs − WtᵀWt||²
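A minimal sketch of the 2-view case with point-wise coupling, assuming the regularizer is lam·||W1 − W2||²; the multiplicative updates below are one standard way to derive updates from such a coupled objective and are not claimed to be the talk's exact algorithm:

```python
import numpy as np

def conmf_pointwise(V1, V2, k, lam=1.0, n_iter=200, eps=1e-9, seed=0):
    """Sketch: couple two NMFs, V1 ~ W1 H1 and V2 ~ W2 H2 (same items
    as rows), by adding lam * ||W1 - W2||_F^2 to the two reconstruction
    errors. Multiplicative updates derived from that coupled objective."""
    rng = np.random.default_rng(seed)
    W1 = rng.random((V1.shape[0], k)); H1 = rng.random((k, V1.shape[1]))
    W2 = rng.random((V2.shape[0], k)); H2 = rng.random((k, V2.shape[1]))
    for _ in range(n_iter):
        H1 *= (W1.T @ V1) / (W1.T @ W1 @ H1 + eps)
        H2 *= (W2.T @ V2) / (W2.T @ W2 @ H2 + eps)
        # The lam terms pull the two coefficient matrices toward each other.
        W1 *= (V1 @ H1.T + lam * W2) / (W1 @ H1 @ H1.T + lam * W1 + eps)
        W2 *= (V2 @ H2.T + lam * W1) / (W2 @ H2 @ H2.T + lam * W2 + eps)
    return W1, W2, H1, H2
```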
Experiments
• Last.fm dataset:

#Items   #Users    #Comments   #Clusters
9,694    131,898   2,500,271   21

• 3 views:

View                #Items   #Features   Token type
Items-Desc. words   9,694    14,076      TF-IDF
Items-Comm. words   9,694    31,172      TF-IDF
Items-Users         9,694    131,898     Boolean

• Ground truth:
– Music type of each artist, provided by Last.fm
• Evaluation metrics:
– Accuracy and F1 (accuracy sketched below)
• Average performance over 20 runs.
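Clustering accuracy is typically computed by optimally matching predicted cluster IDs to ground-truth classes with the Hungarian algorithm; assuming that standard definition (the talk does not spell it out), a sketch:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def clustering_accuracy(y_true, y_pred):
    """Fraction of items correctly clustered after the best one-to-one
    mapping of cluster IDs to class IDs (Hungarian algorithm)."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    n = max(y_true.max(), y_pred.max()) + 1
    # Confusion counts: rows are predicted clusters, columns true classes.
    count = np.zeros((n, n), dtype=np.int64)
    for t, p in zip(y_true, y_pred):
        count[p, t] += 1
    rows, cols = linear_sum_assignment(count, maximize=True)
    return count[rows, cols].sum() / len(y_true)
```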
Statistics of datasets
[Figures: distributions of #items per user and #clusters per user]
P(T ≤ 3) = 0.6229, P(T ≤ 5) = 0.8474, P(T ≤ 10) = 0.9854 (T = #clusters per user)
This verifies our assumption: each user usually comments on a limited number of music types.
Experimental results (Accuracy)
Initialization   Method                Desc.   Comm.   Users   Comb.
Random           k-means               0.25    0.28    0.34    0.415
–                SVD                   0.29    0.31    0.28    0.294
Random           NMF                   0.24    0.27    0.32    0.313
K-means          NMF                   0.26    0.32    0.40    0.417
K-means          CoNMF – point         –       –       –       0.460
K-means          CoNMF – cluster       –       –       –       0.420
NMF              Multi-NMF (SDM'13)    –       –       –       0.369
Random           MM-LDA (WSDM'09)      –       –       –       0.366

Observations:
1. For k-means: Users > Comm. > Desc., and the combined views are best.
2. SVD performs badly on users (non-textual features).
3. For randomly initialized NMF: Users > Comm. > Desc., but the combined views do worse.
4. Initialization is important for NMF.
5. CoNMF-point performs best.
6. The last two rows are other state-of-the-art baselines.
Experimental results (F1)
Initialization   Method                Desc.   Comm.   Users   Combined
Random           k-means               0.15    0.16    0.15    0.254
–                SVD                   0.25    0.25    0.24    0.249
Random           NMF                   0.13    0.18    0.21    0.216
K-means          NMF                   0.15    0.21    0.27    0.298
K-means          CoNMF – point         –       –       –       0.320
K-means          CoNMF – cluster       –       –       –       0.284
NMF              Multi-NMF (SDM'13)    –       –       –       0.265
Random           MM-LDA (WSDM'09)      –       –       –       0.286
Conclusions
• Comments benefit clustering.
• Mining different views from the comments is important:
– The two views (commenting words and users) contribute differently to clustering.
– For this Last.fm dataset, the user view is more useful.
– Combining all views works best.
• For NMF-based methods, initialization is important.
Ongoing
• More experiments on other datasets.
• Improve the CoNMF framework by adding sparseness constraints.
• Study the influence of normalization on CoNMF.
Thanks!
Q&A
References (I)
• Chris Ding, Xiaofeng He, and Horst D. Simon. 2005. On the equivalence of nonnegative matrix factorization and spectral clustering. In Proc. of SIAM Data Mining Conference (SDM'05).
• Wei Xu, Xin Liu, and Yihong Gong. 2003. Document clustering based on non-negative matrix factorization. In Proc. of SIGIR 2003.
• Chris Ding, Tao Li, and Wei Peng. 2006. Orthogonal nonnegative matrix tri-factorizations for clustering. In Proc. of SIGKDD 2006.
• Patrik O. Hoyer. 2004. Non-negative matrix factorization with sparseness constraints. Journal of Machine Learning Research 2004.
• Deng Cai, Xiaofei He, Jiawei Han, and Thomas S. Huang. 2011. Graph regularized nonnegative matrix factorization for data representation. IEEE Trans. Pattern Anal. Mach. Intell. 2011.
• Jialu Liu, Chi Wang, Jing Gao, and Jiawei Han. 2013. Multi-view clustering via joint nonnegative matrix factorization. In Proc. of SIAM Data Mining Conference (SDM'13).