Download pptx - Object Detection

Object Detection01 – Advance Hough

TransformationJJCAO

2

Line and curve detection• The HTis a standard tool in image analysis that allows

recognition of global patterns in an image space by recognition of local patterns in a transformed parameter space.

• HT: Elegant method for direct object recognition– Edges need not be connected– Complete object need not be visible– Key Idea: Edges VOTE for the possible model

Detect partially occluded lines

3HT for Lines• polar representation of lines

Parameter space

(b,m)

Image space

y=mx+b

(x,y)

Hough_Grd• Recall: when we detect an

edge point, we also know its gradient direction

• But this means that the line is uniquely determined!

• Modified Hough transform:

• For each edge point (x,y) θ = gradient orientation at (x,y)ρ = x cos θ + y sin θA(θ, ρ) = A(θ, ρ) + 1

end Θ=[0-360] so there is a conversion

Hough transform for circles

),(),( yxIryx

x

y

(x,y)),(),( yxIryx

xy

r

image space Hough parameter space

6

Generalizing the H.T.

(xc,yc)

Pi

firi ai

xc = xi + ricos(ai)

yc = yi + risin(ai)

• Suppose, there were m different gradient orientations: (m <= n)

f1

f2

.

.

.fm

(r11,a1

1),(r12,a1

2),…,(r1n1,a1

n1)(r2

1,a21),(r2

2,a12),…,(r2

n2,a1n2)

.

.

.(rm

1,am1),(rm

2,am2),…,(rm

nm,amnm)

fj

rj

aj

R-table

Generalized Hough TransformFind Object Center given edges

Create Accumulator Array

Initialize:

For each edge point

For each entry in table, compute:

Increment Accumulator:

Find Local Maxima in

),( cc yxA),(0),( cccc yxyxA

),,( iii yx f

1),(),( cccc yxAyxA

),( cc yxA

ik

ikic

ik

ikic

ryy

rxx

a

a

sin

cos

ikr

),( cc yx ),,( iii yx f

•Assumption: translation is the only transformation here, i.e., orientation and scale are fixed

Voting schemes

• Let each feature vote for all the models that are compatible with it

• Hopefully the noise features will not vote consistently for any single model

• Missing data doesn’t matter as long as there are enough features remaining to agree on a good model

Application in recognition• Instead of indexing displacements by gradient

orientation, index by “visual codeword”

Combined Object Categorization and Segmentation with an Implicit Shape Model_ECCV04Object Detection Using a Max-Margin Hough Transform_CVPR09

training image

visual codeword withdisplacement vectors

Application in recognition• Instead of indexing displacements by gradient

orientation, index by “visual codeword”

test image

Combined Object Categorization and Segmentation with an Implicit Shape Model_ECCV04Object Detection Using a Max-Margin Hough Transform_CVPR09

Implicit shape models: Training

1. Build codebook of patches around extracted interest points using clustering (more on this later in the course)


1. Build codebook of patches around extracted interest points using clustering

2. Map the patch around each interest point to closest codebook entry


1. Build codebook of patches around extracted interest points using clustering

2. Map the patch around each interest point to closest codebook entry

3. For each codebook entry, store all positions it was found, relative to object center

Implicit shape models: Testing1. Given test image, extract patches, match to

codebook entry 2. Cast votes for possible positions of object

center3. Search for maxima in voting space4. Extract weighted segmentation mask based

on stored masks for the codebook occurrences

Implicit shape models: Details• Supervised training

– Need reference location and segmentation mask for each training car

• Voting space is continuous, not discrete– Clustering algorithm needed to find maxima

• How about dealing with scale changes?– Option 1: search a range of scales, as in Hough transform for

circles– Option 2: use scale-covariant interest points

• Verification stage is very important– Once we have a location hypothesis, we can overlay a more

detailed template over the image and compare pixel-by-pixel, transfer segmentation masks, etc.

Hough transform: Discussion• Pros

– Can deal with non-locality and occlusion– Can detect multiple instances of a model– Some robustness to noise: noise points unlikely to

contribute consistently to any single bin• Cons

– Complexity of search time increases exponentially with the number of model parameters

– Non-target shapes can produce spurious peaks in parameter space

– It’s hard to pick a good grid size

• Hough transform vs. RANSAC vs. Geometric hashing

On Geometric Hashing and the generalized hough transform_tsmc94

19

Detection of multiple object instances

Detection of multiple object instances using Hough transform_cvpr10

Olga BarinovaGraphics&Media Lab

Moscow State University

Victor LempitskyYandex company

MoscowVisual Geometry

Group, University of Oxford – postdoc

Pushmeet KohliMachine Learning and Perception

Microsoft Research Cambridge

• Slides from CVPR 2010 [zip]• Talk at CVPR 2010 [link]• C++ code for pedestrians detection original Visual Studio 2005 solution or Linux

Port by Dr. Rodrigo Benenson.• C++ code for lines detection the latest version, which is much faster and more

accurate

http://graphics.cs.msu.ru/files/research/presentation_CVPR_v2.0.zip

http://videolectures.net/cvpr2010_barinova_odmo/

http://graphics.cs.msu.ru/files/tmp/ObjectDetection.zip

https://github.com/rodrigob/barinova_pedestrians_detection

https://github.com/rodrigob/barinova_pedestrians_detection

http://graphics.cs.msu.ru/files/download/LINES_DETECTION_V2.0.zip

http://graphics.cs.msu.ru/files/download/LINES_DETECTION_V2.0.zip

20

Major flaw of HT• Lacks a consistent probabilistic model

– Does not allow hypotheses to explain away the voting elements

• Maximum in Hough image corresponds to a correctly detected object

• The voting elements that were generated by this object also cast votes for other hypotheses

• The strength of those spurious votes is not inhibited => pseudo maximum

• Various non-maxima suppression (NMS) heuristics have to be used to localized peaks in the Hough image, which involve specification and tuning of several parameters:

– sweep-plane approach (Real-time line detection through an improved Hough transform voting scheme_pr08)

– …