The Infinite Hierarchical Factor Regression Model
Piyush Rai and Hal Daume III, NIPS 2008
Presented by Bo Chen, March 26, 2009
Outline
• Introduction
• The Infinite Hierarchical Factor Regression Model
• Indian Buffet Process and Beta Process
• Experiment
• Summary
Introduction
• Benefits of the latent factor representation:
  1. discovering the latent process underlying the data;
  2. simpler predictive modeling through a compact data representation.
• The "large P, small N" setting (vs. the usual rule of thumb N >= 10 · d · C).
• Fundamental advantages over the standard FA model:
  1. does not assume a known number of factors;
  2. does not assume the factors are independent;
  3. does not assume all features are relevant to the factor analysis.
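The compactness argument above can be sketched in a few lines. This is a minimal synthetic example, not the paper's model: the sizes (50 genes, 20 samples, 4 factors) and the linear-Gaussian form are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

P, N, K = 50, 20, 4              # hypothetical: 50 genes, 20 samples, 4 factors

Lam = rng.normal(size=(P, K))    # factor loading matrix (genes x factors)
F = rng.normal(size=(K, N))      # factor score matrix (factors x samples)
X = Lam @ F + 0.1 * rng.normal(size=(P, N))   # observed "large P, small N" data

# predictive modeling then works on the compact K-dimensional representation
w = rng.normal(size=K)
y = w @ F + 0.05 * rng.normal(size=N)         # responses regressed on factors, not genes
print(X.shape, F.shape, y.shape)
```

The regression weight vector has K = 4 entries instead of P = 50, which is the "simpler predictive modeling" benefit the slide refers to.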
Algorithm Model
Graphical Model
T is used to eliminate spurious genes or noise features, so T_p determines whether the p-th customer enters the restaurant to eat any dish.
Indian Buffet Process: from latent classes to latent features
• For a finite feature model: π_k | α ~ Beta(α/K, 1), z_ik | π_k ~ Bernoulli(π_k)
(Tom Griffiths, 2006)
• Taking K → ∞ gives an Indian restaurant with countably infinite dishes
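The finite beta-Bernoulli model can be simulated directly to see why the infinite limit is well behaved: as K grows, the expected number of active features per object stays near α rather than blowing up. The values of α and N below are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
alpha, N = 2.0, 100            # assumed values, for illustration only

def finite_feature_model(K):
    # pi_k | alpha ~ Beta(alpha/K, 1);  z_ik | pi_k ~ Bernoulli(pi_k)
    pi = rng.beta(alpha / K, 1.0, size=K)
    return rng.random((N, K)) < pi

# as K grows, the mean number of active features per object stays near alpha
for K in (10, 100, 1000):
    Z = finite_feature_model(K)
    print(K, Z.sum(axis=1).mean())
```

In expectation each object activates K · (α/K)/(α/K + 1) = α/(1 + α/K) features, which tends to α as K → ∞; this is the limit the IBP makes exact.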
Differences between DP and IBP
• DP class matrix: each object belongs to exactly one class (one active entry per row)
• IBP 'class' matrix: each object can possess several latent features (multiple active entries per row)
Different styles match different problems: 1. latent features; 2. clustering; 3. others.
Two-Parameter Finite Model
• the first customer samples Poisson(α) dishes
• the i-th customer samples a previously sampled dish k with probability m_k / (β + i - 1),
  then samples Poisson(αβ / (β + i - 1)) new dishes
(Z. Ghahramani et al., 2006)
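The culinary process above translates almost line for line into a sampler. This is a sketch of the two-parameter IBP generative process, not the paper's inference code; setting β = 1 recovers the one-parameter IBP.

```python
import numpy as np

rng = np.random.default_rng(2)

def two_param_ibp(N, alpha, beta):
    """Sample a binary feature matrix from the two-parameter IBP culinary process."""
    counts = []                       # counts[k] = customers who have taken dish k
    rows = []
    for i in range(1, N + 1):
        row = []
        for k in range(len(counts)):
            # previously sampled dish k taken with probability m_k / (beta + i - 1)
            take = rng.random() < counts[k] / (beta + i - 1)
            row.append(take)
            counts[k] += int(take)
        # then Poisson(alpha * beta / (beta + i - 1)) brand-new dishes
        k_new = rng.poisson(alpha * beta / (beta + i - 1))
        counts.extend([1] * k_new)
        row.extend([True] * k_new)
        rows.append(row)
    Z = np.zeros((N, len(counts)), dtype=int)
    for i, row in enumerate(rows):
        Z[i, :len(row)] = row
    return Z

Z = two_param_ibp(20, alpha=3.0, beta=1.0)   # beta = 1 gives the one-parameter IBP
print(Z.shape)
```

Every column of the resulting matrix is a dish that at least one customer took, so the number of columns (active features) is random, which is exactly the "unknown number of factors" property used in the model.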
Beta Process vs. IBP
• Beta Process (with mass parameter γ and concentration c):
  the first customer samples Poisson(γ) dishes
  the i-th customer samples a previously sampled dish k with probability m_k / (c + i - 1),
  then samples Poisson(γc / (c + i - 1)) new dishes
Hierarchical Factor Prior
• Kingman's Coalescent: a distribution over the genealogy of a countably infinite set of individuals; used to construct the tree structure.
• Brownian diffusion: a Markov process that encodes a message (mean and covariance) at each node of the above tree.
Y. W. Teh, H. Daume III, and D. M. Roy. Bayesian Agglomerative Clustering with Coalescents. In NIPS, 2008.
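A finite restriction of Kingman's coalescent is easy to simulate: with k lineages alive, every one of the k(k-1)/2 pairs merges at rate 1, so the next merge arrives after an Exp(k(k-1)/2) waiting time and the merging pair is chosen uniformly. This sketch only builds the genealogy; the Brownian message passing on top of it is not shown.

```python
import numpy as np

rng = np.random.default_rng(3)

def kingman_coalescent(n):
    """Simulate one genealogy over n leaves under Kingman's coalescent."""
    lineages = [(i,) for i in range(n)]
    t, merges = 0.0, []
    while len(lineages) > 1:
        k = len(lineages)
        # each of the k*(k-1)/2 pairs merges at rate 1
        t += rng.exponential(2.0 / (k * (k - 1)))
        i, j = sorted(rng.choice(k, size=2, replace=False))
        a, b = lineages[j], lineages[i]
        del lineages[j], lineages[i]          # remove the larger index first
        lineages.append(a + b)
        merges.append((t, a + b))
    return merges

tree = kingman_coalescent(5)
print(len(tree))   # n - 1 = 4 merges build a binary genealogy
```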
Feature Selection Prior
• Some genes are spurious: before selecting dishes, these 'spurious' customers should leave the restaurant.
Provided by Piyush Rai
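The effect of the gate variable T can be shown with a toy mask. The Bernoulli prior probability ρ and the matrix sizes below are illustrative assumptions, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(4)
P, K = 50, 6                   # hypothetical: 50 genes, 6 factors
rho = 0.8                      # assumed prior probability that a gene is relevant
T = rng.random(P) < rho        # T_p = 0 means gene p is spurious
Z = rng.random((P, K)) < 0.3   # stand-in IBP-style binary loading pattern
Z = Z & T[:, None]             # spurious customers leave before selecting any dish
print(int(Z[~T].sum()))        # rows with T_p = 0 are forced to all zeros -> 0
```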
Experimental results
E-coli data: 100 samples, 50 genes, 8 underlying factors
Breast cancer data: 251 samples, 226 genes, 5 underlying factors
1. The hierarchy can be used to find factors in order of their prominence.
2. Hierarchical modeling results in better predictive performance for the factor regression task.
3. The factor hierarchy leads to faster convergence, since most unlikely configurations are never visited: they are constrained away by the hierarchy.
The Comparison of Factor Loading Matrices Learned from Different Methods
Panels: Ground Truth; NIPS Method; Sparse BPFA on factor loadings (VB); Sparse BPFA on factor scores (VB)
Factor Regression
Training and test data are combined, and test responses are treated as missing values to be imputed.
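A point-estimate caricature of this transductive setup: recover factor scores from the combined data, fit the regression on the training responses only, then impute the held-out responses. This is a least-squares sketch under the simplifying assumption that the loadings are known; the paper instead does full Bayesian inference.

```python
import numpy as np

rng = np.random.default_rng(5)
P, K, Ntr, Nte = 40, 3, 30, 10          # hypothetical sizes

Lam = rng.normal(size=(P, K))
F = rng.normal(size=(K, Ntr + Nte))
X = Lam @ F + 0.1 * rng.normal(size=(P, Ntr + Nte))   # train + test combined
w = rng.normal(size=K)
y = w @ F                                             # test responses treated as missing

F_hat, *_ = np.linalg.lstsq(Lam, X, rcond=None)       # scores from the combined data
w_hat, *_ = np.linalg.lstsq(F_hat[:, :Ntr].T, y[:Ntr], rcond=None)
y_imputed = w_hat @ F_hat[:, Ntr:]                    # impute the held-out responses
print(np.abs(y_imputed - y[Ntr:]).mean())
```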
The Existing Similar FA Models
• Putting the binary matrix on the factor score matrix:
David Knowles and Zoubin Ghahramani. Infinite Sparse Factor Analysis and Infinite Independent Components Analysis. ICA 2007.
John Paisley et al. Nonparametric Factor Analysis with Beta Process Priors. In submission, 2009.
Summary: 1. For 'large P, small N' problems, the first approach is faster, since it only needs to learn the small K x N factor score matrix; with an MCMC solution, it is difficult for the second approach to handle problems with tens of thousands of genes. 2. The second approach can explain the relationship between a gene and a factor (pathway).
• Putting the binary matrix on the factor loading matrix:
Piyush Rai and Hal Daume III. The Infinite Hierarchical Factor Regression Model. NIPS 2008.
The New Developments of IBP
F. Doshi, K. T. Miller, J. Van Gael, and Y. W. Teh. Variational Inference for the Indian Buffet Process. AISTATS 2009.
J. Van Gael, Y. W. Teh, and Z. Ghahramani. The Infinite Factorial Hidden Markov Model. NIPS 2008.
K. A. Heller and Z. Ghahramani. A Nonparametric Bayesian Approach to Modeling Overlapping Clusters. AISTATS 2007.