Self-paced Learning for Latent Variable Models, presented by Zhou Yu

1

Self-paced Learning for Latent Variable Models

Presented by Zhou Yu

M. Pawan Kumar, Ben Packer, Daphne Koller, Stanford University

2

Aim: To learn an accurate set of parameters for latent variable models

Intuitions from human learning:
• All information at once may be confusing => bad local minima
• Start with “easy” examples the learner is prepared to handle

“Self-paced” schedule of examples is automatically set by learner

• task-specific
• onerous on user

• “easy for human” ≠ “easy for computer”
• “easy for Learner A” ≠ “easy for Learner B”

Adapted from Kumar’s poster

3

Latent Variable Models

x : input variables (observed)

y : output variables (observed)

h : hidden/latent variables

4

Latent Variable Models

• Latent variable models are used in many tasks:
  – Object localization
  – Action recognition
  – Human pose detection

Example (object localization):

x = Entire image

y = “Deer”

h = Bounding box

5

Learning Latent Variable Models

Goal: Given $D = \{(x_1, y_1), \ldots, (x_n, y_n)\}$, learn parameters $w$.

Maximize log likelihood:

$\max_w \sum_i \log P(x_i, y_i; w)$

$= \max_w \left( \sum_i \log P(x_i, y_i, h_i; w) - \sum_i \log P(h_i \mid x_i, y_i; w) \right)$

• Expectation Maximization, iterate:
  – Find the expected value of the hidden variables using the current w
  – Update w to maximize the log likelihood subject to this expectation
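The two EM steps can be sketched on a toy problem. The example below is only an illustration, not the models from the talk: it assumes a 1-D mixture of two unit-variance Gaussians, where the hidden variable h is the component that generated each point. The function name `em_two_gaussians` and the data are hypothetical.

```python
import math
import random

def em_two_gaussians(xs, n_iters=50):
    """EM for a 1-D mixture of two unit-variance Gaussians (toy assumption)."""
    mu = [min(xs), max(xs)]   # crude initial means
    pi = [0.5, 0.5]           # initial mixing weights
    for _ in range(n_iters):
        # E-step: posterior over the hidden component h for every point
        resp = []
        for x in xs:
            logp = [math.log(pi[k]) - 0.5 * (x - mu[k]) ** 2 for k in range(2)]
            m = max(logp)                      # subtract max for stability
            p = [math.exp(l - m) for l in logp]
            s = sum(p)
            resp.append([pk / s for pk in p])
        # M-step: re-estimate parameters given the expected assignments
        for k in range(2):
            nk = sum(r[k] for r in resp)
            mu[k] = sum(r[k] * x for r, x in zip(resp, xs)) / nk
            pi[k] = nk / len(xs)
    return mu, pi

random.seed(0)
xs = ([random.gauss(-3.0, 1.0) for _ in range(200)]
      + [random.gauss(3.0, 1.0) for _ in range(200)])
mu, pi = em_two_gaussians(xs)
print(mu, pi)
```

On well-separated data like this the estimated means should land near the two true centers; with a poor initialization EM can instead settle in a bad local optimum, which is the problem self-paced learning targets.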

6

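Learning Latent Variable Models

Goal: Given $D = \{(x_1, y_1), \ldots, (x_n, y_n)\}$, learn parameters $w$.

• Latent Structural SVM

$\min_{w,\, \xi \ge 0} \; \frac{1}{2}\|w\|^2 + \frac{C}{n} \sum_{i=1}^{n} \xi_i$

s.t. $\max_{h_i \in H} w^T \left( \Phi(x_i, y_i, h_i) - \Phi(x_i, \hat{y}_i, \hat{h}_i) \right) \ge \Delta(y_i, \hat{y}_i) - \xi_i, \quad \forall \hat{y}_i \in Y,\ \forall \hat{h}_i \in H,\ i = 1, \ldots, n$

Example: x = Entire image, y = “Deer”, h = Bounding box

Solver: Concave-convex procedure (CCCP)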

7

Self-Paced Learning

Note: $v_i = 1$ means sample i is easy; $v_i = 0$ means it is hard.
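The objective itself does not appear in this transcript. As a sketch, the self-paced formulation in the Kumar, Packer, and Koller paper has the following form, where $r(w)$ denotes a regularizer and $l_i(w)$ the loss of sample i (notation assumed here), and K is the parameter that controls how many samples count as easy:

```latex
(w_{t+1}, v_{t+1}) = \arg\min_{w \in \mathbb{R}^d,\; v \in \{0,1\}^n}
  \left( r(w) + \sum_{i=1}^{n} v_i \, l_i(w) - \frac{1}{K} \sum_{i=1}^{n} v_i \right)

% For fixed w, the optimal v has a closed form:
v_i =
\begin{cases}
  1 & \text{if } l_i(w) < 1/K \\
  0 & \text{otherwise}
\end{cases}
```

Under this form, a sample is included only when its loss is below the threshold 1/K, so decreasing K admits progressively harder samples.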

8

Self-Paced Learning

[Figure: samples ordered from easy to hard over Iterations 1, 2, and 3. CCCP uses all samples at once; self-paced learning uses easy samples first.]

9

Optimization in Self-paced Learning Using ACS

Initialize K to be large
Iterate:
  Run inference over h
  Alternately update w and v (ACS: alternate convex search):
    v set by sorting $l_i(w)$ and comparing to the threshold 1/K
    Perform the normal update for w over the selected subset of data
  Until convergence
  Anneal K ← K/μ
Until all $v_i = 1$ and the objective cannot be reduced within tolerance
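The loop above can be sketched on a toy latent-variable problem. This is a minimal illustration under stated assumptions, not the latent structural SVM from the talk: the model is a 1-D two-center clustering, the latent h is the index of the nearest center, and the loss is the squared distance to that center. The names `infer_h` and `self_paced_fit`, the choice of initial K (set so about half the samples start as easy), and μ = 1.3 are all assumptions made for the example.

```python
import random

def infer_h(x, w):
    """Inference over h: index of the center nearest to x."""
    return min(range(len(w)), key=lambda k: (x - w[k]) ** 2)

def self_paced_fit(xs, w0, mu=1.3, acs_iters=20, max_outer=100):
    w = list(w0)
    # Assumption: pick the initial K so roughly half the samples are "easy".
    h = [infer_h(x, w) for x in xs]
    losses = sorted((x - w[hi]) ** 2 for x, hi in zip(xs, h))
    K = 1.0 / max(losses[len(losses) // 2], 1e-12)
    for _ in range(max_outer):
        w_start = list(w)
        h = [infer_h(x, w) for x in xs]          # run inference over h
        for _ in range(acs_iters):               # ACS: alternate v and w
            loss = [(x - w[hi]) ** 2 for x, hi in zip(xs, h)]
            v = [1 if l < 1.0 / K else 0 for l in loss]  # easy iff loss < 1/K
            new_w = list(w)
            for k in range(len(w)):
                sel = [x for x, hi, vi in zip(xs, h, v) if vi and hi == k]
                if sel:                          # update w on the easy subset
                    new_w[k] = sum(sel) / len(sel)
            if new_w == w:                       # ACS converged
                break
            w = new_w
        if all(v) and w == w_start:              # all samples in, w stable
            break
        K /= mu                                  # anneal: threshold 1/K grows
    return w, v

random.seed(0)
xs = ([random.gauss(0.0, 1.0) for _ in range(100)]
      + [random.gauss(10.0, 1.0) for _ in range(100)])
w, v = self_paced_fit(xs, w0=[4.0, 6.0])
print(w, sum(v))
```

Here thresholding the losses directly replaces the explicit sort, since only the comparison with 1/K determines v. The centers are first fitted to the samples closest to the initial guess and harder samples enter as K is annealed.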

10

Initialization

How to obtain the initial parameters $w_0$:

1. Initially set $v_i = 1$ for all samples.

2. Run the original CCCP on the latent structural SVM for a fixed number of iterations $T_0$.

Concern: It seems unreasonable to initialize from a model that we believe is not good. Different initializations could lead to different final performance.

11

Experiment: Object Localization

6 different mammals (approximately 45 images per mammal)

[Figure: localization results of self-paced learning and CCCP on easy and hard examples]

12

Experiment: Pascal VOC 2007

5 categories out of 20; a random sample of 50 percent of the data

A: Use human-labeled information to decide which samples are easy (non-truncated, non-occluded objects are easy), and use this as initialization for the self-paced learning model
B: Use CCCP results as initialization for the self-paced learning model
C: CCCP

[Bar chart: AP for methods A, B, and C on Car, Cat, Chair, Cow, and Bird]

13

Experiment

Some random cat images from Google

[Figure: original images and results after 10 iterations]

14

Conclusion

• Latent variable models
  – Latent structural SVM
  – E.g., object detection, human pose estimation, human action recognition, tracking

• Initialization
  – What is a good initialization?
  – Maybe multiple initializations?
