1
Self-paced Learning for Latent Variable Models
Presented by Zhou Yu
M. Pawan Kumar, Ben Packer, Daphne Koller, Stanford University
2
Aim: To learn an accurate set of parameters for latent variable models
Intuitions from human learning:
• Presenting all information at once may be confusing => bad local minima
• Start with “easy” examples the learner is prepared to handle
“Self-paced” schedule of examples is automatically set by learner
• task-specific
• onerous on user
• “easy for human” ≠ “easy for computer”
• “easy for Learner A” ≠ “easy for Learner B”
Adapted from Kumar’s poster
3
Latent Variable Models
x : input or observed variables
y : output or observed variables
h : hidden/latent variables
4
Latent Variable Models
• Latent variable models are used in many tasks:
  – Object localization
  – Action recognition
  – Human pose detection
y = “Deer”
h = Bounding Box
x = Entire image
5
Learning Latent Variable Models
Goal: Given D = {(x1,y1), …, (xn,yn)}, learn parameters w.
Maximize log likelihood:
$$\max_w \sum_i \log P(x_i, y_i; w) = \max_w \Big( \sum_i \log P(x_i, y_i, h_i; w) - \sum_i \log P(h_i \mid x_i, y_i; w) \Big)$$
Expectation Maximization. Iterate:
• Find the expected value of the hidden variables using the current w
• Update w to maximize the log likelihood subject to this expectation
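The two EM steps above can be illustrated with a minimal sketch. It is a toy, not the model from these slides: it assumes a 1-D mixture of two unit-variance, equal-weight Gaussians, where the hidden variable is the component that generated each point and the parameters w are the two means. The function name `em_two_gaussians` and the synthetic data are hypothetical.

```python
import math
import random


def em_two_gaussians(xs, mu=(-1.0, 1.0), iters=50):
    """Toy EM: estimate the two means of a unit-variance, equal-weight mixture."""
    mu0, mu1 = mu
    for _ in range(iters):
        # E-step: posterior probability that each point belongs to component 1,
        # computed with the current parameters.
        resp = []
        for x in xs:
            p0 = math.exp(-0.5 * (x - mu0) ** 2)
            p1 = math.exp(-0.5 * (x - mu1) ** 2)
            resp.append(p1 / (p0 + p1))
        # M-step: update each mean as a responsibility-weighted average.
        w1 = sum(resp)
        w0 = len(xs) - w1
        mu0 = sum((1.0 - r) * x for r, x in zip(resp, xs)) / w0
        mu1 = sum(r * x for r, x in zip(resp, xs)) / w1
    return mu0, mu1


# Synthetic data (assumption): two well-separated clusters around -3 and +3.
random.seed(0)
data = [random.gauss(-3.0, 1.0) for _ in range(200)]
data += [random.gauss(3.0, 1.0) for _ in range(200)]
est0, est1 = em_two_gaussians(data)
print(est0, est1)
```

With well-separated clusters the estimates should settle near the two cluster centers; with a poor starting point or overlapping clusters EM can stall in a bad local optimum, which is the motivation for the self-paced schedule discussed later.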
6
Learning Latent Variable Models
Goal: Given D = {(x1,y1), …, (xn,yn)}, learn parameters w.
• Latent Structural SVM
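$$\min_{w,\ \xi \ge 0} \ \frac{1}{2}\|w\|^2 + \frac{C}{n} \sum_{i=1}^{n} \xi_i$$
$$\text{s.t.} \quad \max_{h_i \in H} w^T \big( \Phi(x_i, y_i, h_i) - \Phi(x_i, \hat{y}_i, \hat{h}_i) \big) \ge \Delta(y_i, \hat{y}_i) - \xi_i, \quad \forall \hat{y}_i \in Y,\ \forall \hat{h}_i \in H,\ i = 1, \ldots, n$$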
y = “Deer”
h = Bounding Box
x = Entire image
Solver: Concave-convex procedure (CCCP)
7
Self-Paced Learning
Note: v_i = 1 means sample i is easy; v_i = 0 means it is hard
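The objective that this note refers to did not survive extraction. As a hedged reconstruction, following the self-paced learning formulation of the paper these slides present, and writing r(w) for a regularizer and l_i(w) for the loss of sample i (r is an assumed symbol, not taken from the slide text), the joint problem over parameters and sample indicators can be written as:

```latex
\min_{w,\; v \in \{0,1\}^n} \; r(w) + \sum_{i=1}^{n} v_i \, l_i(w) - \frac{1}{K} \sum_{i=1}^{n} v_i
```

For a fixed w, each v_i can be chosen independently: setting v_i = 1 lowers the objective exactly when l_i(w) < 1/K, so a sample counts as easy when its loss falls below the threshold 1/K.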
8
Self-Paced Learning
CCCP: all samples at once
Self-paced learning: easy samples first
[Figure: samples ordered from easy to hard, shown at iterations 1, 2 and 3 for each method]
9
Optimization in Self-Paced Learning using ACS
Initialize K to be large
Iterate:
    Run inference over h
    Alternately update w and v (ACS: alternate convex search):
        v is set by sorting l_i(w) and comparing to the threshold 1/K
        Perform the normal update for w over the selected subset of data
    Until convergence
    Anneal K ← K/μ
Until all v_i = 1 and the objective cannot be reduced within tolerance
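The loop above can be sketched on a toy problem. This is a minimal illustration under stated assumptions, not the latent structural SVM solver from the slides: there are no latent variables (so the inference step over h is skipped), the model is a one-parameter least-squares fit y ≈ w·x, and the stopping rule is simply "all v_i = 1". The names `fit`, `self_paced_fit`, and the data are hypothetical. Sorting the losses is not needed here because comparing each loss to 1/K gives the same selection.

```python
def fit(xs, ys, v):
    """Least-squares slope over the samples with v_i = 1 (None if none usable)."""
    num = sum(x * y for x, y, s in zip(xs, ys, v) if s)
    den = sum(x * x for x, s in zip(xs, v) if s)
    return num / den if den > 0 else None


def self_paced_fit(xs, ys, K=100.0, mu=1.3, max_inner=100):
    n = len(xs)
    # Initialization: start from the model trained on all samples (all v_i = 1).
    w0 = fit(xs, ys, [1] * n)
    w = 0.0 if w0 is None else w0
    history = []  # number of selected samples after each outer iteration
    while True:
        v = None
        # Alternate convex search: update v for fixed w, then w for fixed v.
        for _ in range(max_inner):
            losses = [(y - w * x) ** 2 for x, y in zip(xs, ys)]
            new_v = [1 if loss < 1.0 / K else 0 for loss in losses]
            if new_v == v:
                break  # selection is stable, so w is stable too
            v = new_v
            new_w = fit(xs, ys, v)
            if new_w is not None:
                w = new_w
        history.append(sum(v))
        if all(v):
            return w, history
        K /= mu  # anneal: raise the threshold 1/K so harder samples enter


# Four points on the line y = 2x plus one outlier (assumed toy data).
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 6.0, 8.0, 30.0]
w, history = self_paced_fit(xs, ys)
print(w, history)
```

In this toy run the four consistent points are admitted before the outlier, and the final model is trained on every sample, mirroring the "until all v_i = 1" condition.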
10
Initialization
How we obtain w0:
1. Initially set v_i = 1 for all samples
2. Run the original CCCP to solve the latent structural SVM for a fixed number of iterations T0
Concern: It is questionable to initialize from a model that we believe is not good. Different initializations could lead to different final performance.
11
Experiment: Object Localization
6 different mammals (approximately 45 images per mammal)
[Figure: localization results of self-paced learning and CCCP on easy and hard examples]
12
Experiment: Pascal VOC 2007
5 categories out of 20, using a random sample of 50 percent of the data
A: Use human-labeled information to decide which samples are easy (non-truncated, non-occluded objects are easy), and use this as the initialization for the self-paced learning model
B: Use CCCP results as the initialization for the self-paced learning model
C: CCCP
[Bar chart: AP (axis from 0 to 0.9) of methods A, B and C on Car, Cat, Chair, Cow and Bird]
13
Experiment
Some random cat images from Google
[Figure: original images and results after 10 iterations]
14
Conclusion
• Latent variable models
  – Latent structural SVM
  – E.g., object detection, human pose estimation, human action recognition, tracking
• Initialization
  – What is a good initialization?
  – Maybe multiple initializations?