Efficient Discriminative Learning of Parts-based Models
M. Pawan Kumar, Andrew Zisserman, Philip Torr

http://www.robots.ox.ac.uk/~vgg http://cms.brookes.ac.uk/research/visiongroup

Aim: To efficiently learn parts-based models which discriminate between positive and negative poses of the object category.

Results - Sign Language

Efficient Reformulation

Results - Buffy

Parts-based Model

$G = (V, E)$, restricted to a tree

$f : V \to$ Pose of $V$ ($h$ values)

$Q(f) = \sum_{a} Q_a(f(a)) + \sum_{(a,b)} Q_{ab}(f(a), f(b))$

$Q_a(f(a))$: Unary potential for $f(a)$, computed using features

$Q_{ab}(f(a), f(b))$: Pairwise potential for validity of $(f(a), f(b))$, restricted to Potts

$Q_a(f(a)) = w_a^\top \Phi(f(a))$, $Q_{ab}(f(a), f(b)) = w_{ab}\, \Phi(f(a), f(b))$, $Q(f) = w^\top \Phi(f)$

The Learning Problem

$\min \|w\| + C \sum_i \xi_i$ (maximize margin, minimize hinge loss)

$w^\top \Phi(f_i^+) + b \ge 1 - \xi_i^+$ (high energy for all positive examples)

$w^\top \Phi(f_{ij}^-) + b \le -1 + \xi_i^-$ (low energy for all negative examples)
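As an illustration of the energy Q(f) and the two margin constraints, here is a minimal Python sketch. It assumes the unary potentials have already been evaluated into a table per part, and that the Potts pairwise term adds a scalar weight whenever a pose pair is valid. The part names, numbers, and helper names (`energy`, `slack_positive`, `slack_negative`) are hypothetical and not taken from the poster.

```python
def energy(f, unary, edges, w_pair, valid):
    """Q(f) = sum_a Q_a(f(a)) + sum_(a,b) Q_ab(f(a), f(b)), Potts pairwise terms."""
    q = sum(unary[a][f[a]] for a in unary)
    for a, b in edges:
        # Potts: add the pairwise weight only if the pose pair is valid.
        if (f[a], f[b]) in valid[(a, b)]:
            q += w_pair[(a, b)]
    return q


def slack_positive(q, bias):
    """Smallest slack satisfying q + bias >= 1 - slack."""
    return max(0.0, 1.0 - (q + bias))


def slack_negative(q, bias):
    """Smallest slack satisfying q + bias <= -1 + slack."""
    return max(0.0, 1.0 + (q + bias))


# Hypothetical two-part tree with two poses per part (assumed values).
unary = {"torso": [1.0, -0.5], "arm": [0.25, 0.75]}
edges = [("torso", "arm")]
w_pair = {("torso", "arm"): 0.5}
valid = {("torso", "arm"): {(0, 1)}}
bias = -1.0

q_pos = energy({"torso": 0, "arm": 1}, unary, edges, w_pair, valid)
q_neg = energy({"torso": 0, "arm": 0}, unary, edges, w_pair, valid)
print(q_pos, slack_positive(q_pos, bias))
print(q_neg, slack_negative(q_neg, bias))
```

The positive pose satisfies its margin with zero slack, while the chosen negative pose violates its margin and so incurs hinge loss.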

Related Work

Local Iterative Support Vector Machine (ISVM-1)
• Start with a small subset of negative examples (1 per image)
• Solve for w and b
• Replace negative examples with current MAP estimates
• Converges to local optimum

Global Iterative Support Vector Machine (ISVM-2)
• Start with a small subset of negative examples (1 per image)
• Solve for w and b
• Add current MAP estimates to set of negative examples
• Converges to global optimum

Drawback: Requires obtaining MAP estimate of each image at each iteration (computationally expensive)

Our: 86.4%; Buehler et al., 2008: 87.7%

Our: 39.2%; Ferrari et al., 2008: 41.0%

100 training images, 95 test images

196 training images, 204 test images

[Figure panels: ISVM-1, ISVM-2, Our]

ISVMs run for twice as long

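Potts pairwise feature: $\Phi(f(a), f(b)) = 1$ if $(f(a), f(b)) \in L_{ab}$, $= 0$ otherwise.

Original constraints (number exponential in $|V|$):

$w^\top \Phi(f_{ij}^-) + b \le -1 + \xi_i^-$, for all $j$

Reformulated constraints:

$M^i_{ba}(k) \ge w_b^\top \Phi_b(l)$, for all $l$

$M^i_{ba}(k) \ge w_b^\top \Phi_b(l) + w_{ab}$, for all $(k, l) \in L_{ab}$

$w_a^\top \Phi_a(k) + \sum_b M^i_{ba}(k) + b \le -1 + \xi_i^-$

Number of constraints: linear in $|V|$, linear in $h$, linear in $|L_{ab}|$

[Figure: tree with parts $a$ and $b$]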
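The message bounds above can be sketched in Python for a two-part tree. This is only an illustration under stated assumptions: the unary scores are precomputed lists, the pairwise weight is non-negative (so the two families of lower bounds are tight at the maximum), and the function names and numbers are hypothetical. The sketch checks the message-based maximum against a brute-force maximum over all pose pairs.

```python
from itertools import product


def potts_message(unary_b, w_ab, valid_pairs, h):
    """Tightest M_ba(k) satisfying both families of lower bounds."""
    # "for all l" bounds: M_ba(k) >= unary_b[l]
    msg = [max(unary_b)] * h
    # "for all (k, l) in L_ab" bounds: M_ba(k) >= unary_b[l] + w_ab
    for k, l in valid_pairs:
        msg[k] = max(msg[k], unary_b[l] + w_ab)
    return msg


def max_energy_messages(unary_a, unary_b, w_ab, valid_pairs):
    """Maximum energy over poses of a, using the message from b."""
    h = len(unary_a)
    msg = potts_message(unary_b, w_ab, valid_pairs, h)
    return max(unary_a[k] + msg[k] for k in range(h))


def max_energy_brute_force(unary_a, unary_b, w_ab, valid_pairs):
    """Maximum energy by enumerating every pose pair (k, l)."""
    h = len(unary_a)
    return max(
        unary_a[k] + unary_b[l] + (w_ab if (k, l) in valid_pairs else 0.0)
        for k, l in product(range(h), repeat=2)
    )


# Hypothetical scores for h = 3 poses per part (assumed values).
unary_a = [0.2, -1.0, 0.7]
unary_b = [0.5, 0.1, -0.3]
w_ab = 1.5  # assumed non-negative
valid_pairs = {(0, 1), (1, 0), (2, 2)}

fast = max_energy_messages(unary_a, unary_b, w_ab, valid_pairs)
slow = max_energy_brute_force(unary_a, unary_b, w_ab, valid_pairs)
print(fast, slow)
```

The message computation touches each pose of b once and each valid pair once, which mirrors the linear dependence on h and on the number of valid pairs, whereas the brute-force check enumerates every joint labelling.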

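Problem (1)

$\max\ \alpha_{ab}^\top \mathbf{1} - \alpha_{ab}^\top K_{ab}\, \alpha_{ab}$

s.t. $\alpha_{ab}^\top y = 0$, $\alpha_{ab} \ge 0$

$0 \le \sum \alpha^i_{ab}(k) + \sum \alpha^i_{ab}(k, l)$

$\sum \alpha^i_{ab}(k) + \sum \alpha^i_{ab}(k, l) \le C$

Constraint (3): $\sum_k \alpha^i_{ab}(k) = \sum_l \alpha^i_{ba}(l)$

Constraint (3) results in a large minimal problem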

Dual Decomposition

Master: Update Lagrange multiplier of (3)

Problem (1), Problem (2): SVM-like problems, minimal problem size = 2

Modified SVMLight

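$\min \sum_i g_i(x)$, subject to $x \in P$

$\min \sum_i g_i(x_i)$, s.t. $x_i \in P$, $x_i = x$

$\max_{\lambda} \min \sum_i g_i(x_i) + \lambda_i (x_i - x)$, s.t. $x_i \in P$

KKT Condition: $\sum_i \lambda_i = 0$

Solve $\min \sum_i g_i(x_i) + \lambda_i x_i$

$\lambda_i = \lambda_i + x_i^*$

Project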
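The solve, update, project loop above can be illustrated on a toy problem. The following Python sketch is not the poster's SVM setting: it assumes quadratic terms g_i(x) = (x - c_i)^2, a box constraint set P = [lo, hi], and a fixed step size, all chosen only so that each subproblem has a closed-form solution. The function name and parameters are hypothetical.

```python
def dual_decomposition(centers, lo, hi, step=1.0, iters=60):
    """Toy dual decomposition for min_x sum_i (x - c_i)^2 with x in [lo, hi]."""
    n = len(centers)
    lam = [0.0] * n  # one multiplier per copy x_i, kept at sum(lam) == 0
    xs = list(centers)
    for _ in range(iters):
        # Solve each subproblem: argmin over [lo, hi] of (x_i - c_i)^2 + lam_i * x_i.
        xs = [min(hi, max(lo, c - l / 2.0)) for c, l in zip(centers, lam)]
        # Update the multipliers with the subproblem solutions (assumed step size).
        lam = [l + step * x for l, x in zip(lam, xs)]
        # Project back onto the condition sum_i lam_i = 0.
        mean = sum(lam) / n
        lam = [l - mean for l in lam]
    return xs, lam


xs, lam = dual_decomposition([1.0, 2.0, 6.0], lo=0.0, hi=10.0)
print(xs, lam)
```

In this toy case the copies x_i agree on the mean of the centers once the multipliers settle, which is the minimizer of the original coupled problem.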

Problem (1) learns the unary weight vector $w_a$ and pairwise weight $w_{ab}$

Problem (2) learns the unary weight vector $w_b$ and pairwise weight $w_{ab}$

$M^i_{ba}(k)$ is analogous to messages in Belief Propagation (BP)

Efficient BP using distance transform: Felzenszwalb and Huttenlocher, 2004

Solving the Dual

Implementation Details

Features
• Shape: HOG
• Appearance: $(x, x^2)$, $x$ = fraction of skin pixels

Data
• Positive examples: Provided by user
• Negative examples: All other poses

Occlusion
• Each putative pose can be occluded (twice the number of labels)
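A minimal Python sketch of two of the details above: the appearance feature built from the fraction of skin pixels, and the doubling of the label set when every putative pose may also be occluded. The binary skin mask input, the function names, and the pose identifiers are assumptions for illustration; how skin pixels are detected is not specified here.

```python
def appearance_feature(skin_mask):
    """Return (x, x^2), where x is the fraction of skin pixels in a binary mask."""
    pixels = [p for row in skin_mask for p in row]
    x = sum(1 for p in pixels if p) / len(pixels)
    return (x, x * x)


def labels_with_occlusion(poses):
    """Pair every putative pose with a visible and an occluded variant."""
    return [(pose, occluded) for pose in poses for occluded in (False, True)]


# Hypothetical 2x2 mask with three skin pixels, and three putative poses.
feature = appearance_feature([[1, 0], [1, 1]])
labels = labels_with_occlusion(["pose0", "pose1", "pose2"])
print(feature, len(labels))
```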


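Problem (2)

$\max\ \alpha_{ba}^\top \mathbf{1} - \alpha_{ba}^\top K_{ba}\, \alpha_{ba}$

s.t. $\alpha_{ba}^\top y = 0$, $\alpha_{ba} \ge 0$

$0 \le \sum \alpha^i_{ba}(k) + \sum \alpha^i_{ba}(k, l)$

$\sum \alpha^i_{ba}(k) + \sum \alpha^i_{ba}(k, l) \le C$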
