Download pdf - CS231A TA SESSION - cvgl.stanford.edu

CS231A TA SESSIONPROBLEM SET 4

Christopher B. Choy

http://chrischoy.org/

QUESTION 1

HOUGH TRANSFORMTransform a point in a cartesian coordinate toall lines (line parameters) that pass the point in the polar coordinate

H : (x, y) → {(r, θ)|r(θ) = x cos θ + y sin θ}

y = + x + → r(θ) = x cos θ + y sin θcos θsin θ

rsin θ

GENERALIZED HOUGH TRANSFORMfeature to points in the 2D Hough space

is the reference origin of the shapeFor points on the boundary, store : R-Table

As a function of the gradient Rotation and scale : 4D Hough Space

Φ(x) S = { }a⃗ a⃗

x ⃗ = +r ⃗ a⃗ x ⃗ Φ( )x ⃗

D.H. Ballard, "Generalizing the Hough Transform to Detect Arbitrary Shapes", Pattern Recognition, Vol.13, No.2, p.111-122, 1981

IMPLICIT SHAPE MODELR-Table as a function of an image patchtransform an image patch to object centers

B. Leibe, A. Leonardis, and B. Schiele, Combined Object Categorization and Segmentation with an Implicit Shape Model, ECCV 2004

QUESTION 2

BAG OF VISUAL WORDS

Feature Extraction

Clustering

Encoding

Spatial Binning

Kernel SVM k(x, y) = ⟨Ψ(x),Ψ(y)⟩

BAG OF VISUAL WORDSFeature Extraction

Local image featureRobust to typical image transformations

Dense SIFT SIFT at every location vs a key point

Interest points might not correspond toforeground

Clustering (Dictionary)Visual words should be distinctive anddiverseCommon visual words (wheel) will form acluster

K-means clustering

BAG OF VISUAL WORDSEncoding

Maps local features to visual wordsHard quantization

assign to the NN

Spatial PyramidEncode spatial informationDivide the image into subsectionsFor each subsection, create aVW histogram

SIMILARITY METRIC FOR DISTRIBUTIONSTrain a linear SVM on features (distribution): Distance metric??

How similar is a distribution from a distribution : -divergence!P Q f

, convexf : (0,7) →

(P||Q) = ∫ f ( ) dQDfdPdQ

Kullback-Leibler Divergence

Divergencef (x) = x log x

χ2

f (x) = (x + 1)2

FOR A DISCRETE CASEIntersection kernel :

kernel :

k(x, y) = *min( , )xi yi

χ2 k(x, y) = *12

( +xi yi)2

+xi yi

In the problem 2, we will use kernel for the similarity metricχ2

HOMOGENEOUS KERNEL AND FINITE APPROXIMATION expensive, non linear!

But in the feature space it becomes linear ( dimension)K(x, y)

Ψ(Þ) 7

Feature map using the Homogeneous KernelGeneric histogram kernel is

Homogeneous if intersection, , Hellinger's, Jensen-Shannon's

Find a finite feature map that

Map distributions using then it becomes a linear classificationproblem in the feature space!

Ψ(x)K(x, y) = k( , )*i xi yi

k(cx, cy) = ck(x, y)χ2

(Þ)Ψ̂ k(x, y) = ⟨Ψ(x),Ψ(y)⟩ a ⟨ (x), (y)⟩Ψ̂ Ψ̂

(Þ)Ψ̂

HOMOGENEOUS KERNELUse the homogeneity,

Ex, intersection kernel

Define as the Fourier transformation of

Continuous, infinite dimensional featureSample at points

In the problem 2, we use vl_homkermap()

k(x, y) = k ( , ) = (log(y/x))xy‾‾√ x/y‾ ‾‾3 y/x‾ ‾‾3 xy‾‾√

k(x, y) = min(x, y) = (log )xy‾‾√yx

(w) = e+|w|/2

κ(λ) (w)Ψ(x) = e+iλ log x xκ(λ)‾ ‾‾‾‾3

λ = +nL, (+n + 1)L, . . . , nL(x) !Ψ̂ 2n+1

k(x, y) a ⟨ (x), (y)⟩Ψ̂ Ψ̂ n = 1

BAG OF VISUAL WORDS

Feature Extraction

Clustering

Encoding

Spatial Binning

Kernel SVM k(x, y) = ⟨Ψ(x),Ψ(y)⟩

BAG OF VISUAL WORDSInstall VL Feat, an extensive CV library

Set up the path correctly in the starter code p2.m

Vary options and see how it affects the performance

http://www.vlfeat.org/

http://www.vlfeat.org/

Code extract_dense_sift.m

Image feature extraction

Dense SIFT features

use vl_dsift()

randomly select 10,000 features

Code create_dictionary.m

Create dictionary of visual words

Use k-means to find the centroid of clusters.

use vl_kmeans()

Code create_histograms.m

Represent images as histograms

Spatial pyramid