62
Structured Regression for Efficient Object Detection Christoph Lampert www.christoph-lampert.org Max Planck Institute for Biological Cybernetics, Tübingen December 3rd, 2009 [C.L., Matthew B. Blaschko, Thomas Hofmann. CVPR 2008] [Matthew B. Blaschko, C.L. ECCV 2008] [C.L., Matthew B. Blaschko, Thomas Hofmann. PAMI 2009]

Structured regression for efficient object detection

  • Upload
    zukun

  • View
    736

  • Download
    2

Embed Size (px)

Citation preview

Page 1: Structured regression for efficient object detection

Structured Regression forEfficient Object Detection

Christoph Lampertwww.christoph-lampert.org

Max Planck Institute for Biological Cybernetics, Tübingen

December 3rd, 2009

• [C.L., Matthew B. Blaschko, Thomas Hofmann. CVPR 2008]• [Matthew B. Blaschko, C.L. ECCV 2008]• [C.L., Matthew B. Blaschko, Thomas Hofmann. PAMI 2009]

Page 2: Structured regression for efficient object detection

Category-Level Object Localization

Page 3: Structured regression for efficient object detection

Category-Level Object Localization

What objects are present? person, car

Page 4: Structured regression for efficient object detection

Category-Level Object Localization

Where are the objects?

Page 5: Structured regression for efficient object detection

Object Localization ⇒ Scene Interpretation

A man inside of a car A man outside of a car⇒ He’s driving. ⇒ He’s passing by.

Page 6: Structured regression for efficient object detection

Algorithmic Approach: Sliding Window

f(y1) = 0.2 f(y2) = 0.8 f(y3) = 1.5

Use a (pre-trained) classifier function f :• Place candidate window on the image.• Iterate:

I Evaluate f and store result.I Shift candidate window by k pixels.

• Return position where f was largest.

Page 7: Structured regression for efficient object detection

Algorithmic approach: Sliding Window

f(y1) = 0.2 f(y2) = 0.8 f(y3) = 1.5

Drawbacks:• single scale, single aspect ratio→ repeat with different window sizes/shapes• search on grid→ speed–accuracy tradeoff• computationally expensive

Page 8: Structured regression for efficient object detection

New view: Generalized Sliding Window

Assumptions:• Objects are rectangular image regions of arbitrary size.• The score of f is largest at the correct object position.

Mathematical Formulation:

yopt = argmaxy∈Y

f(y)

with Y = {all rectangular regions in image}

Page 9: Structured regression for efficient object detection

New view: Generalized Sliding Window

Mathematical Formulation:

yopt = argmaxy∈Y

f(y)

with Y = {all rectangular regions in image}

• How to choose/construct/learn the function f ?• How to do the optimization efficiently and robustly?

(exhaustive search is too slow, O(w2h2) elements).

Page 10: Structured regression for efficient object detection

New view: Generalized Sliding Window

Mathematical Formulation:

yopt = argmaxy∈Y

f(y)

with Y = {all rectangular regions in image}

• How to choose/construct/learn the function f ?• How to do the optimization efficiently and robustly?

(exhaustive search is too slow, O(w2h2) elements).

Page 11: Structured regression for efficient object detection

New view: Generalized Sliding Window

Use the problem’s geometric structure:

• Calculate scores forsets of boxes jointly.

• If no element cancontain the maximum,discard the box set.

• Otherwise, split thebox set and iterate.

→ Branch-and-boundoptimization

• finds global maximum yopt

Page 12: Structured regression for efficient object detection

New view: Generalized Sliding Window

Use the problem’s geometric structure:• Calculate scores for

sets of boxes jointly.

• If no element cancontain the maximum,discard the box set.

• Otherwise, split thebox set and iterate.

→ Branch-and-boundoptimization

• finds global maximum yopt

Page 13: Structured regression for efficient object detection

New view: Generalized Sliding Window

Use the problem’s geometric structure:• Calculate scores for

sets of boxes jointly.

• If no element cancontain the maximum,discard the box set.

• Otherwise, split thebox set and iterate.

→ Branch-and-boundoptimization

• finds global maximum yopt

Page 14: Structured regression for efficient object detection

Representing Sets of Boxes

• Boxes: [l, t, r,b] ∈ R4.

Boxsets: [L,T,R,B] ∈ (R2)4

Splitting:• Identify largest interval. Split at center: R 7→ R1∪R2.• New box sets: [L,T,R1,B] and [L,T,R2,B].

Page 15: Structured regression for efficient object detection

Representing Sets of Boxes

• Boxes: [l, t, r,b] ∈ R4. Boxsets: [L,T,R,B] ∈ (R2)4

Splitting:• Identify largest interval. Split at center: R 7→ R1∪R2.• New box sets: [L,T,R1,B] and [L,T,R2,B].

Page 16: Structured regression for efficient object detection

Representing Sets of Boxes

• Boxes: [l, t, r,b] ∈ R4. Boxsets: [L,T,R,B] ∈ (R2)4

Splitting:• Identify largest interval.

Split at center: R 7→ R1∪R2.• New box sets: [L,T,R1,B] and [L,T,R2,B].

Page 17: Structured regression for efficient object detection

Representing Sets of Boxes

• Boxes: [l, t, r,b] ∈ R4. Boxsets: [L,T,R,B] ∈ (R2)4

Splitting:• Identify largest interval. Split at center: R 7→ R1∪R2.

• New box sets: [L,T,R1,B] and [L,T,R2,B].

Page 18: Structured regression for efficient object detection

Representing Sets of Boxes

• Boxes: [l, t, r,b] ∈ R4. Boxsets: [L,T,R,B] ∈ (R2)4

Splitting:• Identify largest interval. Split at center: R 7→ R1∪R2.• New box sets: [L,T,R1,B]

and [L,T,R2,B].

Page 19: Structured regression for efficient object detection

Representing Sets of Boxes

• Boxes: [l, t, r,b] ∈ R4. Boxsets: [L,T,R,B] ∈ (R2)4

Splitting:• Identify largest interval. Split at center: R 7→ R1∪R2.• New box sets: [L,T,R1,B] and [L,T,R2,B].

Page 20: Structured regression for efficient object detection

Calculating Scores for Box Sets

Example: Linear Support-Vector-Machine f(y) := ∑pi∈y wi.

+

fupper(Y) =∑

pi∈y∩min(0,wi) +

∑pi∈y∪

max(0,wi)

Can be computed in O(1) using integral images.

Page 21: Structured regression for efficient object detection

Calculating Scores for Box Sets

Histogram Intersection Similarity: f(y) := ∑Jj=1 min(h′j,h

yj ).

fupper(Y) =∑J

j=1 min(h′j,hy∪

j )

As fast as for a single box: O(J) with integral histograms.

Page 22: Structured regression for efficient object detection

Evaluation: Speed (on PASCAL VOC 2006)

Sliding Window Runtime:• always: O(w2h2)

Branch-and-Bound (ESS) Runtime:• worst-case: O(w2h2)• empirical: not more than O(wh)

Page 23: Structured regression for efficient object detection

Extensions:

Action classification: (y, t)opt = argmax(y,t)∈Y×T fx(y, t)

• J. Yuan: Discriminative 3D Subvolume Search for Efficient Action Detection, CVPR 2009.

Page 24: Structured regression for efficient object detection

Extensions:

Localized image retrieval: (x , y)opt = argmaxy∈Y, x∈D fx(y)

• C.L.: Detecting Objects in Large Image Collections and Videos by Efficient Subimage Retrieval, ICCV 2009

Page 25: Structured regression for efficient object detection

Extensions:

Hybrid – Branch-and-Bound with Implicit Shape Model

• A. Lehmann, B. Leibe, L. van Gool: Feature-Centric Efficient Subwindow Search, ICCV 2009

Page 26: Structured regression for efficient object detection
Page 27: Structured regression for efficient object detection

Generalized Sliding Window

yopt = argmaxy∈Y

f(y)

with Y = {all rectangular regions in image}

• How to choose/construct/learn f ?• How to do the optimization efficiently and robustly?

Page 28: Structured regression for efficient object detection

Traditional Approach: Binary Classifier

Training images:• x+

1 , . . . ,x+n show the object

• x−1 , . . . ,x−m show something else

Train a classifier, e.g.• support vector machine,• boosted cascade,• artificial neural network,. . .

Decision function f : {images} → R• f > 0 means “image shows the object.”• f < 0 means “image does not show

the object.”

Page 29: Structured regression for efficient object detection

Traditional Approach: Binary Classifier

Drawbacks:

• Train distribution6= test distribution

• No control over partialdetections.

• No guarantee to even findtraining examples again.

Page 30: Structured regression for efficient object detection

Object Localization as Structured Output Regression

Ideal setup:• function

g : {all images}→ {all boxes}

to predict object boxes from images• train and test in the same way, end-to-end

gcar

=

Page 31: Structured regression for efficient object detection

Object Localization as Structured Output Regression

Ideal setup:• function

g : {all images}→ {all boxes}

to predict object boxes from images• train and test in the same way, end-to-end

Regression problem:• training examples (x1, y1), . . . , (xn, yn) ∈ X × Y

I xi are images, yi are bounding boxes• Learn a mapping

g : X→Ythat generalizes from the given examples:

I g(xi) ≈ yi , for i = 1, . . . ,n,

Page 32: Structured regression for efficient object detection

Structured Support Vector Machine

SVM-like framework by Tsochantaridis et al.:• Positive definite kernel k : (X × Y)× (X × Y)→R.ϕ : X × Y → H : (implicit) feature map induced by k.

• ∆ : Y × Y → R: loss function

• Solve the convex optimization problem

minw,ξ12‖w‖

2 + Cn∑

i=1ξi

subject to margin constraints for i =1, . . . , n :

∀y ∈ Y \ {yi} : ∆(y, yi) + 〈w,ϕ(xi , y)〉−〈w,ϕ(xi , yi)〉 ≤ ξi ,

• unique solution: w∗ ∈ H

• I. Tsochantaridis, T. Joachims, T. Hofmann, Y. Altun: Large Margin Methods for Structured and InterdependentOutput Variables, Journal of Machine Learning Research (JMLR), 2005.

Page 33: Structured regression for efficient object detection

Structured Support Vector Machine

• w∗ defines compatiblity function

F(x , y)=〈w∗,ϕ(x , y)〉

• best prediction for x is the most compatible y:

g(x) := argmaxy∈Y

F(x , y).

• evaluating g : X → Y is like generalized Sliding Window:I for fixed x, evaluate quality function for every box y ∈ Y.I for example, use previous branch-and-bound procedure!

Page 34: Structured regression for efficient object detection

Joint Image/Box-Kernel: Example

Joint kernel: how to compare one (image,box)-pair (x , y) withanother (image,box)-pair (x ′, y ′)?

kjoint

(,

)= k

(,

)is large.

kjoint

(,

)= k

(,

)is small.

kjoint

(,

)= kimage

(,

)could also be large.

Page 35: Structured regression for efficient object detection

Loss Function: Example

Loss function: how to compare two boxes y and y ′?

∆(y, y ′) := 1− area overlap between y and y ′

= 1− area(y∩ y ′)area(y∪ y ′)

Page 36: Structured regression for efficient object detection

Structured Support Vector Machine

• S-SVM Optimization: minw,ξ12‖w‖

2 + Cn∑

i=1ξi

subject to for i =1, . . . , n :

∀y ∈ Y \ {yi} : ∆(y, yi) + 〈w,ϕ(xi , y)〉−〈w,ϕ(xi , yi)〉 ≤ ξi ,

• Solve via constraint generation:• Iterate:

I Solve minimization with working set of contraintsI Identify argmaxy∈Y ∆(y, yi) + 〈w,ϕ(xi , y)〉I Add violated constraints to working set and iterate

• Polynomial time convergence to any precision ε

• Similar to bootstrap training, but with a margin.

Page 37: Structured regression for efficient object detection

Structured Support Vector Machine

• S-SVM Optimization: minw,ξ12‖w‖

2 + Cn∑

i=1ξi

subject to for i =1, . . . , n :

∀y ∈ Y \ {yi} : ∆(y, yi) + 〈w,ϕ(xi , y)〉−〈w,ϕ(xi , yi)〉 ≤ ξi ,

• Solve via constraint generation:• Iterate:

I Solve minimization with working set of contraintsI Identify argmaxy∈Y ∆(y, yi) + 〈w,ϕ(xi , y)〉I Add violated constraints to working set and iterate

• Polynomial time convergence to any precision ε

• Similar to bootstrap training, but with a margin.

Page 38: Structured regression for efficient object detection

Evaluation: PASCAL VOC 2006

Example detections for VOC 2006 bicycle, bus and cat.

Precision–recall curves for VOC 2006 bicycle, bus and cat.

• Structured regression improves detection accuracy.• New best scores (at that time) in 6 of 10 classes.

Page 39: Structured regression for efficient object detection

Why does it work?

Learned weights from binary (center) and structured training (right).

• Both methods assign positive weights to object region.• Structured training also assigns negative weights to

features surrounding the bounding box position.• Posterior distribution over box coordinates becomes more

peaked.

Page 40: Structured regression for efficient object detection

More Recent Results (PASCAL VOC 2009)

aeroplane

Page 41: Structured regression for efficient object detection

More Recent Results (PASCAL VOC 2009)

bicycle

Page 42: Structured regression for efficient object detection

More Recent Results (PASCAL VOC 2009)

bird

Page 43: Structured regression for efficient object detection

More Recent Results (PASCAL VOC 2009)

boat

Page 44: Structured regression for efficient object detection

More Recent Results (PASCAL VOC 2009)

bottle

Page 45: Structured regression for efficient object detection

More Recent Results (PASCAL VOC 2009)

bus

Page 46: Structured regression for efficient object detection

More Recent Results (PASCAL VOC 2009)

car

Page 47: Structured regression for efficient object detection

More Recent Results (PASCAL VOC 2009)

cat

Page 48: Structured regression for efficient object detection

More Recent Results (PASCAL VOC 2009)

chair

Page 49: Structured regression for efficient object detection

More Recent Results (PASCAL VOC 2009)

cow

Page 50: Structured regression for efficient object detection

More Recent Results (PASCAL VOC 2009)

diningtable

Page 51: Structured regression for efficient object detection

More Recent Results (PASCAL VOC 2009)

dog

Page 52: Structured regression for efficient object detection

More Recent Results (PASCAL VOC 2009)

horse

Page 53: Structured regression for efficient object detection

More Recent Results (PASCAL VOC 2009)

motorbike

Page 54: Structured regression for efficient object detection

More Recent Results (PASCAL VOC 2009)

person

Page 55: Structured regression for efficient object detection

More Recent Results (PASCAL VOC 2009)

pottedplant

Page 56: Structured regression for efficient object detection

More Recent Results (PASCAL VOC 2009)

sheep

Page 57: Structured regression for efficient object detection

More Recent Results (PASCAL VOC 2009)

sofa

Page 58: Structured regression for efficient object detection

More Recent Results (PASCAL VOC 2009)

train

Page 59: Structured regression for efficient object detection

More Recent Results (PASCAL VOC 2009)

tvmonitor

Page 60: Structured regression for efficient object detection

Extensions:

Image segmentation with connectedness constraint:

CRF segmentation connected CRF segmentation

• S. Nowozin, C.L.: Global Connectivity Potentials for Random Field Models, CVPR 2009.

Page 61: Structured regression for efficient object detection

Summary

Object Localization is a step towards image interpretation.

Conceptual approach instead of algorithmic :• Branch-and-bound evaluation:

I don’t slide a window, but solve an argmax problem,⇒ higher efficiency

• Structured regression training:I solve the prediction problem, not a classification proxy.⇒ higher localization accuracy

• Modular and kernelized:I easily adapted to other problems/representations, e.g.

image segmentations

Page 62: Structured regression for efficient object detection