Advanced Topics – I (EENG 4010) Computer Vision & Image Analysis (EENG 5640)


Page 1: Advanced Topics – I  (EENG 4010) Computer Vision & Image Analysis (EENG 5640)

Advanced Topics – I (EENG 4010)

Computer Vision & Image Analysis (EENG 5640)

Page 2: Advanced Topics – I  (EENG 4010) Computer Vision & Image Analysis (EENG 5640)

Introduction to Computer Vision

Image Processing System: Image → Image

Computer Vision / Image Analysis / Image Understanding System: Image → Image/Scene Description

Pattern Recognition System: Pattern Vector (with image measurements as components in the current application) → Pattern Classification Label

Computer vision generally involves pattern recognition.

Page 3: Advanced Topics – I  (EENG 4010) Computer Vision & Image Analysis (EENG 5640)

Typical Computer Vision Applications

Medical Imaging
Automated Manufacturing (some experts use "machine vision" as the term for computer vision in industrial applications; others use it as a synonym for computer vision)
Remote Sensing
Character Recognition
Robotics

Page 4: Advanced Topics – I  (EENG 4010) Computer Vision & Image Analysis (EENG 5640)

Binary Image Analysis

Grey scale to binary transformation (Otsu’s method)
Counting holes
Counting objects
Connected Component Labeling Algorithms
– Recursive Algorithm
– Two-Pass Row-by-Row Labeling Algorithm
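The grey scale to binary step can be sketched as follows. This is a minimal NumPy version of Otsu's method, assuming an 8-bit grey level image; the function name `otsu_threshold` is illustrative and not taken from the slides. It picks the threshold that maximizes the between-class variance of the two resulting pixel classes.

```python
import numpy as np

def otsu_threshold(gray):
    """Return the threshold t that maximizes between-class variance.

    Pixels with value <= t form one class, pixels > t the other.
    Assumes `gray` holds integer grey levels in 0..255.
    """
    hist = np.bincount(gray.ravel(), minlength=256).astype(float)
    prob = hist / hist.sum()
    levels = np.arange(256)
    w0 = np.cumsum(prob)            # probability of class 0 for each t
    mu = np.cumsum(prob * levels)   # cumulative first moment up to t
    mu_total = mu[-1]
    w1 = 1.0 - w0
    valid = (w0 > 0) & (w1 > 0)     # skip thresholds with an empty class
    sigma_b = np.zeros(256)
    sigma_b[valid] = (mu_total * w0[valid] - mu[valid]) ** 2 / (w0[valid] * w1[valid])
    return int(np.argmax(sigma_b))

# Toy image with two grey levels (illustrative data)
demo = np.array([[50, 50, 200], [50, 200, 200]], dtype=np.uint8)
t = otsu_threshold(demo)
binary = demo > t
```

Thresholding with `demo > t` then gives the binary image used by the counting and labeling steps.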

Page 5: Advanced Topics – I  (EENG 4010) Computer Vision & Image Analysis (EENG 5640)

Two-Pass Algorithm: Illustrative Example

[Figure: a binary image, its provisional labels (1 to 4) after the first row-by-row pass, and the equivalence tree over labels 1, 2, 3, and 4]

Label:  1 2 3 4
Parent: 0 1 1 2
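The two-pass labeling idea can be sketched as below. This is a minimal version assuming 4-connectivity and a parent table in which parent 0 marks a root, as in the Label/Parent table of the example; the function name `two_pass_label` is illustrative.

```python
import numpy as np

def two_pass_label(binary):
    """4-connected component labeling with a union-find parent table."""
    rows, cols = binary.shape
    labels = np.zeros((rows, cols), dtype=int)
    parent = [0]                      # parent[k] == 0 marks label k as a root

    def find(k):
        while parent[k] != 0:
            k = parent[k]
        return k

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[max(ra, rb)] = min(ra, rb)

    # Pass 1: assign provisional labels and record equivalences
    for r in range(rows):
        for c in range(cols):
            if not binary[r, c]:
                continue
            up = labels[r - 1, c] if r > 0 else 0
            left = labels[r, c - 1] if c > 0 else 0
            neighbors = [n for n in (up, left) if n]
            if not neighbors:
                parent.append(0)      # new label, initially its own root
                labels[r, c] = len(parent) - 1
            else:
                labels[r, c] = min(neighbors)
                if len(neighbors) == 2:
                    union(up, left)

    # Pass 2: replace every provisional label by its root label
    for r in range(rows):
        for c in range(cols):
            if labels[r, c]:
                labels[r, c] = find(labels[r, c])
    return labels

# U-shaped object plus a separate bar (illustrative data)
img = np.array([[1, 0, 1, 0, 0],
                [1, 0, 1, 0, 1],
                [1, 1, 1, 0, 1]])
result = two_pass_label(img)
```

The two arms of the U receive different provisional labels in the first pass and are merged through the parent table when the bottom row joins them.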

Page 6: Advanced Topics – I  (EENG 4010) Computer Vision & Image Analysis (EENG 5640)

Binary Image analysis (Contd.)

Morphological Processing
– Dilation, Erosion, Opening, and Closing operations
– Example to illustrate the effects of the operations
Region Properties
– Area
– Perimeter
– Circularity

Page 7: Advanced Topics – I  (EENG 4010) Computer Vision & Image Analysis (EENG 5640)

Medical Application of Morphology

Page 8: Advanced Topics – I  (EENG 4010) Computer Vision & Image Analysis (EENG 5640)

Industrial Application of Morphology

Page 9: Advanced Topics – I  (EENG 4010) Computer Vision & Image Analysis (EENG 5640)

Grey Level Image Processing

Image Enhancement Methods
– Histogram Equalization and Contrast Stretching
– Mitigation of Noise Effects
  Image Smoothing
  Median Filtering
  Frequency Domain Operations (Low Pass Filtering)
Image Sharpening and Edge Detection
– High Pass Filtering
– Differencing Masks (Prewitt, Sobel, Roberts, Marr-Hildreth operators)
– Canny Edge Detection and Linking

Page 10: Advanced Topics – I  (EENG 4010) Computer Vision & Image Analysis (EENG 5640)

Histogram Equalization- Original Image

Page 11: Advanced Topics – I  (EENG 4010) Computer Vision & Image Analysis (EENG 5640)

Histogram Equalization- Equalized Image

Page 12: Advanced Topics – I  (EENG 4010) Computer Vision & Image Analysis (EENG 5640)

Low Contrast Image

Page 13: Advanced Topics – I  (EENG 4010) Computer Vision & Image Analysis (EENG 5640)

Contrast Stretching (Linear Interpolation between 79-136)

Page 14: Advanced Topics – I  (EENG 4010) Computer Vision & Image Analysis (EENG 5640)

Histogram Equalization

Page 15: Advanced Topics – I  (EENG 4010) Computer Vision & Image Analysis (EENG 5640)

Color Fundamentals

[Figure: RGB color cube with its vertices labeled]
[0, 0, 0] Black
[1, 0, 0] Red
[0, 1, 0] Green
[0, 0, 1] Blue
[1, 1, 0] Yellow
[0, 1, 1] Cyan
[1, 0, 1] Magenta
[1, 1, 1] White

Page 16: Advanced Topics – I  (EENG 4010) Computer Vision & Image Analysis (EENG 5640)

RGB and HSI (HSV) Systems

Page 17: Advanced Topics – I  (EENG 4010) Computer Vision & Image Analysis (EENG 5640)

RGB-HSI Conversion

Page 18: Advanced Topics – I  (EENG 4010) Computer Vision & Image Analysis (EENG 5640)

RGB to HSI Conversion- Final Formulae

Page 19: Advanced Topics – I  (EENG 4010) Computer Vision & Image Analysis (EENG 5640)

HSI-RGB Conversion

The method is given. You need to reason out why it works, or explore the web for the answer.

Page 20: Advanced Topics – I  (EENG 4010) Computer Vision & Image Analysis (EENG 5640)

YIQ and YUV Systems for TVs, etc.

The YIQ system is used in TV signals. Its components are:
Luminance: Y = 0.30R + 0.59G + 0.11B
R-Cyan: I = 0.60R - 0.28G - 0.32B
Magenta-Green: Q = 0.21R - 0.52G + 0.31B

In some digital products and JPEG/MPEG compression algorithms, the following YUV system is used:
Y = 0.30R + 0.59G + 0.11B
U = 0.493 (B - Y)
V = 0.877 (R - Y)

Advantage: Luminance and Chromaticity components can be coded with different number of bits.
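The two conversions can be written directly from the coefficients above. The sketch below assumes RGB values scaled to [0, 1] and stored in the last array axis; the function names are illustrative.

```python
import numpy as np

# Rows give Y, I, Q as linear combinations of R, G, B (coefficients from the slide)
RGB_TO_YIQ = np.array([[0.30, 0.59, 0.11],
                       [0.60, -0.28, -0.32],
                       [0.21, -0.52, 0.31]])

def rgb_to_yiq(rgb):
    """Convert an (..., 3) RGB array to YIQ."""
    return np.asarray(rgb, dtype=float) @ RGB_TO_YIQ.T

def rgb_to_yuv(rgb):
    """Convert an (..., 3) RGB array to YUV using U = 0.493(B - Y), V = 0.877(R - Y)."""
    rgb = np.asarray(rgb, dtype=float)
    y = rgb @ RGB_TO_YIQ[0]
    u = 0.493 * (rgb[..., 2] - y)
    v = 0.877 * (rgb[..., 0] - y)
    return np.stack([y, u, v], axis=-1)

white_yiq = rgb_to_yiq([1.0, 1.0, 1.0])
white_yuv = rgb_to_yuv([1.0, 1.0, 1.0])
```

For any grey pixel (R = G = B) both chromaticity components vanish, which is what makes it possible to code luminance and chromaticity with different numbers of bits.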

Page 21: Advanced Topics – I  (EENG 4010) Computer Vision & Image Analysis (EENG 5640)

Optical Illusion - I

Page 22: Advanced Topics – I  (EENG 4010) Computer Vision & Image Analysis (EENG 5640)

Optical Illusion - II

Page 23: Advanced Topics – I  (EENG 4010) Computer Vision & Image Analysis (EENG 5640)

Optical Illusion - III

Page 24: Advanced Topics – I  (EENG 4010) Computer Vision & Image Analysis (EENG 5640)

Texture

Texture is a pattern caused by a regular spatial arrangement of pixel colors or intensities.

Two approaches
– Structural or syntactic (usually used for synthetic images, by defining a grammar on texels).
– Statistical or quantitative (more useful in natural texture analysis; can be used to identify texture primitives (texels) in the image).

Page 25: Advanced Topics – I  (EENG 4010) Computer Vision & Image Analysis (EENG 5640)

Quantitative Texture Measures

Edge related
– Edginess (proportion of strong edges in a small window around each pixel).
– Edge direction histograms (the pattern vector constituted by the proportions of edgels in the horizontal, vertical, and other quantized directions among the total pixels in a chosen window around the pixel (i, j) under consideration).
Co-occurrence matrix based

Page 26: Advanced Topics – I  (EENG 4010) Computer Vision & Image Analysis (EENG 5640)

Co-occurrence matrix based Measures

Construction of the co-occurrence matrix $C_d[i, j]$, where d is the displacement of j from i (e.g. (0, 1), (1, 1), etc.).

Normalized and symmetric co-occurrence matrices $N_d[i, j]$ and $S_d[i, j]$.

Zucker and Terzopoulos’s chi-square metric to choose the best d (i.e. the d with the most structure).

Numeric measures from $N_d[i, j]$
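The construction of the co-occurrence matrix can be sketched as below. The displacement d is taken as (row offset, column offset), and the symmetric matrix is formed as C plus its transpose; both conventions are assumptions made for the illustration.

```python
import numpy as np

def cooccurrence(img, d, levels):
    """Count pairs (i, j): grey level i at a pixel, grey level j at that pixel displaced by d."""
    dr, dc = d
    rows, cols = img.shape
    C = np.zeros((levels, levels), dtype=int)
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dr, c + dc
            if 0 <= r2 < rows and 0 <= c2 < cols:
                C[img[r, c], img[r2, c2]] += 1
    return C

# Small 3-level test image (illustrative data)
img = np.array([[1, 1, 0, 0],
                [1, 1, 0, 0],
                [0, 0, 2, 2],
                [0, 0, 2, 2]])
C = cooccurrence(img, (0, 1), levels=3)
N = C / C.sum()   # normalized co-occurrence matrix
S = C + C.T       # symmetric co-occurrence matrix
```

Numeric texture measures such as energy (`(N ** 2).sum()`) can then be computed from `N`.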

Page 27: Advanced Topics – I  (EENG 4010) Computer Vision & Image Analysis (EENG 5640)

Choice of the Best Co-occurrence Matrix and Computation of Features

Page 28: Advanced Topics – I  (EENG 4010) Computer Vision & Image Analysis (EENG 5640)

Laws’ Texture Energy Measures

Simple because masks are used.
2-D masks are created using 1-D masks:
– L5 (Level) = [ 1 4 6 4 1]
– E5 (Edge) = [-1 -2 0 2 1]
– S5 (Spot) = [-1 0 2 0 -1]
– W5 (Wave) = [-1 2 0 -2 1] (not in the text!)
– R5 (Ripple) = [ 1 -4 6 -4 1]

(e.g. the 5x5 matrix of the L5E5 mask is obtained by multiplying the transpose of L5 by E5).
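Multiplying the transpose of one 1-D mask by another is an outer product, so all 2-D masks can be generated in a few lines. The dictionary names below are illustrative.

```python
import numpy as np

# 1-D Laws vectors as listed on the slide
VECTORS = {
    "L5": [1, 4, 6, 4, 1],
    "E5": [-1, -2, 0, 2, 1],
    "S5": [-1, 0, 2, 0, -1],
    "W5": [-1, 2, 0, -2, 1],
    "R5": [1, -4, 6, -4, 1],
}

# Every ordered pair (a, b) gives the 5x5 mask a^T b
masks = {a + b: np.outer(VECTORS[a], VECTORS[b]) for a in VECTORS for b in VECTORS}

l5e5 = masks["L5E5"]
```

Each row of `l5e5` is E5 scaled by the corresponding entry of L5, so the mask smooths vertically while responding to horizontal intensity changes.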

Page 29: Advanced Topics – I  (EENG 4010) Computer Vision & Image Analysis (EENG 5640)

Laws’ Algorithm for Texture Energy Pattern Vector Construction

Page 30: Advanced Topics – I  (EENG 4010) Computer Vision & Image Analysis (EENG 5640)

Laws’ Texture Segmentation Results- I

Page 31: Advanced Topics – I  (EENG 4010) Computer Vision & Image Analysis (EENG 5640)

Laws’ Texture Segmentation Results- II

Page 32: Advanced Topics – I  (EENG 4010) Computer Vision & Image Analysis (EENG 5640)

Laws’ Texture Segmentation Results- III

Page 33: Advanced Topics – I  (EENG 4010) Computer Vision & Image Analysis (EENG 5640)

Gabor Filter Based Texture Analysis

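The Gabor filter is mathematically represented by (refer to Wikipedia):

$$g(x, y; \lambda, \theta, \psi, \sigma, \gamma) = \exp\left(-\frac{x'^2 + \gamma^2 y'^2}{2\sigma^2}\right)\cos\left(2\pi\frac{x'}{\lambda} + \psi\right)$$

where

$$x' = x\cos\theta + y\sin\theta, \qquad y' = -x\sin\theta + y\cos\theta$$

and

θ: Orientation of the normal to the parallel stripes
λ: Wavelength, or inverse of the frequency, of the cosine function
γ: Spatial aspect ratio
σ: Sigma of the Gaussian function
ψ: Phase offset of the cosine function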

Page 34: Advanced Topics – I  (EENG 4010) Computer Vision & Image Analysis (EENG 5640)

Image Segmentation

Image Segmentation

Contour-Based Methods (e.g. Canny Edge Detection and Linking)

Region-Based Methods

Partitioning/Clustering (e.g. K-Means Clustering, Isodata clustering, Ohlander et al.’s recursive histogram-based technique)

Region Growing (e.g. Haralick and Shapiro Method)

Page 35: Advanced Topics – I  (EENG 4010) Computer Vision & Image Analysis (EENG 5640)

Clustering Algorithms

Clustering (partitioning) of pixels in the pattern space.

Each pixel is represented by a pattern vector of properties. For example, in the case of a color image, the pattern vector could hold the color components of the pixel at (i, j).

The pattern vector could be of any dimensionality (even 1, i.e. a scalar, as in the case of a grey level image).

Depending upon the problem, it may include other measurements on texture, shading, etc. that constitute additional components of the pattern vector.

Note: i and j denote the pixel row and column.

Page 36: Advanced Topics – I  (EENG 4010) Computer Vision & Image Analysis (EENG 5640)

K-Means Clustering algorithm
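A minimal sketch of the standard K-means iteration on pattern vectors: assign every vector to its nearest cluster mean, recompute the means, and repeat until they stop changing. The random initialization from the data and the function name `kmeans` are assumptions of this sketch, not necessarily the variant presented in class.

```python
import numpy as np

def kmeans(X, k, n_iter=100, seed=0):
    """Cluster the rows of X (one pattern vector per row) into k groups."""
    rng = np.random.default_rng(seed)
    # Start from k distinct pattern vectors chosen at random
    means = X[rng.choice(len(X), size=k, replace=False)].astype(float)
    for _ in range(n_iter):
        # Distance of every pattern vector to every cluster mean
        dist = np.linalg.norm(X[:, None, :] - means[None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        # Recompute means; keep the old mean if a cluster became empty
        new_means = np.array([
            X[labels == c].mean(axis=0) if np.any(labels == c) else means[c]
            for c in range(k)
        ])
        if np.allclose(new_means, means):
            break
        means = new_means
    return labels, means

# Two well-separated groups of 2-D pattern vectors (illustrative data)
X = np.array([[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]], dtype=float)
labels, means = kmeans(X, k=2)
```

For image segmentation, `X` would hold one row per pixel, and `labels` reshaped to the image size gives the region map.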

Page 37: Advanced Topics – I  (EENG 4010) Computer Vision & Image Analysis (EENG 5640)

Isodata Clustering Algorithm


Page 38: Advanced Topics – I  (EENG 4010) Computer Vision & Image Analysis (EENG 5640)

ISODATA Clustering Problem

[Figure: clusters plotted in the r-g plane, with a query point X]

To which cluster does X belong?

If the split threshold is TS = 3.0 and the merge threshold is TM = 1.0, what will the new cluster configuration be? Compute the new cluster means in case of a split.

Page 39: Advanced Topics – I  (EENG 4010) Computer Vision & Image Analysis (EENG 5640)

Image Databases- Content-Based Image Retrieval

Any Problem with Traditional Text (in Caption) Based Retrieval?

Typical SQL (Structured Query Language) query:

SELECT * FROM IMAGEDB
WHERE CATEGORY = 'GEMS' AND SOURCE = 'SMITHSONIAN'
AND (KEYWORD = 'AMETHYST' OR KEYWORD = 'CRYSTAL' OR KEYWORD = 'PURPLE')

This retrieves images from the gem collection of the Smithsonian Institution in its IMAGEDB database, restricting the search by the logical combination of the keywords specified.

Looks like no problem here!

Page 40: Advanced Topics – I  (EENG 4010) Computer Vision & Image Analysis (EENG 5640)

Limitations of the Key Word Based Retrieval

Human coding of keywords is expensive, and even then some keywords by which one would like to retrieve an image may not be foreseen and hence may be left out. Keywords may sometimes retrieve unexpected images as well!

What kind of images do you expect with the key word ‘pigs’?

Page 41: Advanced Topics – I  (EENG 4010) Computer Vision & Image Analysis (EENG 5640)

Unexpected Retrieval- An Example

Page 42: Advanced Topics – I  (EENG 4010) Computer Vision & Image Analysis (EENG 5640)

Content-Based Image Retrieval

Uses the Query-By-Example (QBE) concept
IBM’s QBIC (Query By Image Content) is the first system
In QBE systems, you specify an example plus some constraints
Typical example images for specification:
– A digital photograph
– A user-painted drawing
– A line-drawing sketch

Page 43: Advanced Topics – I  (EENG 4010) Computer Vision & Image Analysis (EENG 5640)

Matching- Image Distance (Similarity Measures)

4 major classes:
– Color similarity
– Texture similarity
– Shape similarity
– Object and relationship similarity

Page 44: Advanced Topics – I  (EENG 4010) Computer Vision & Image Analysis (EENG 5640)

Color Similarity Measures

QBIC lets the user choose up to 5 colors from the color table and specify their percentages

Color histograms (K-bin) can be used:

$$D_{hist}(I, Q) = (h(I) - h(Q))^T A\,(h(I) - h(Q))$$

Here h(I) and h(Q) are the K-bin histograms of images I and Q, and A is a (K x K) similarity matrix.
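The histogram distance is a quadratic form and is a one-liner in NumPy. The sketch below uses a 3-bin example; the similarity values in `A_similar` are made up for illustration.

```python
import numpy as np

def hist_distance(h_i, h_q, A):
    """Quadratic-form distance (h(I) - h(Q))^T A (h(I) - h(Q))."""
    diff = np.asarray(h_i, dtype=float) - np.asarray(h_q, dtype=float)
    return float(diff @ A @ diff)

h_i = [1.0, 0.0, 0.0]   # all pixels in bin 0
h_q = [0.0, 1.0, 0.0]   # all pixels in bin 1

A_identity = np.eye(3)                    # bins treated as unrelated
A_similar = np.array([[1.0, 0.5, 0.0],    # bins 0 and 1 treated as similar colors
                      [0.5, 1.0, 0.0],
                      [0.0, 0.0, 1.0]])

d_identity = hist_distance(h_i, h_q, A_identity)
d_similar = hist_distance(h_i, h_q, A_similar)
```

With the identity matrix the measure reduces to the squared Euclidean distance between histograms; off-diagonal similarity entries reduce the distance between images whose colors fall in neighboring bins.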

Page 45: Advanced Topics – I  (EENG 4010) Computer Vision & Image Analysis (EENG 5640)

Color-Layout-Based Similarity

Distances between corresponding grid squares of the database and example images are found and summed up:

$$D_{gridded\_color}(I, Q) = \sum_{g} D_{color}(C^I(g), C^Q(g))$$

Each grid square spans multiple pixels. Then how do you compare grid squares?
– Use mean color
– Use mean and standard deviation
– Use a multi-bin histogram

Page 46: Advanced Topics – I  (EENG 4010) Computer Vision & Image Analysis (EENG 5640)

Texture-Based Similarity Measure

Pick-and-click distance:

$$D_{pick\_and\_click}(I, Q) = \min_{i \in I} \|T(i) - T(Q)\|^2$$

Grid-based texture similarity can be found by the same process as in the gridded color case:

$$D_{gridded\_texture}(I, Q) = \sum_{g} D_{texture}(T^I(g), T^Q(g))$$

Page 47: Advanced Topics – I  (EENG 4010) Computer Vision & Image Analysis (EENG 5640)

Shape-Based Similarity Measures

Histogram approach is difficult to apply particularly when you want scale and rotation invariance.

Boundary Matching
Granlund’s Fourier Descriptors for translation, scale, starting point (for boundary tracing), and rotation invariant matching.

Page 48: Advanced Topics – I  (EENG 4010) Computer Vision & Image Analysis (EENG 5640)

Boundary (Sketch) Matching

Obtain a normalized image: reduce the original image to a fixed size, e.g. (64x64), and median filter it.

2-stage edge detection with global and local thresholds.

Perform linking and thinning.

Find the correlation between each grid square of the line drawing L and various shifts (n) of the corresponding grid square of the DB image A, and sum up the best correlations:

$$D_{sketch}(I, Q) = \frac{1}{\sum_{g} \max_{n}\left[D_{correlation}(shift_n(A^I(g)), L^Q(g))\right]}$$

Page 49: Advanced Topics – I  (EENG 4010) Computer Vision & Image Analysis (EENG 5640)

Line Sketch of a Horse

Page 50: Advanced Topics – I  (EENG 4010) Computer Vision & Image Analysis (EENG 5640)

Retrieved Images of Paintings

Page 51: Advanced Topics – I  (EENG 4010) Computer Vision & Image Analysis (EENG 5640)

Granlund’s Fourier Descriptors

Let be the points on the boundary of the query shape.

The k-th discrete Fourier coefficient is given by

Leaving out and , we can compute translation, rotation, starting-point and scale invariant shape descriptors ) as follows:

Page 52: Advanced Topics – I  (EENG 4010) Computer Vision & Image Analysis (EENG 5640)

Invariant Properties of Fourier Descriptors

Translation

Rotation |.| takes care of the problem because

Scale By using for scaling, we are eliminating c.

Starting point- Once again |.| operation helps!
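The invariance properties can be checked numerically. The sketch below assumes the boundary points are treated as complex numbers x + jy, that the coefficients a_k are their discrete Fourier transform, and that the descriptors are the magnitudes |a_k| for k ≥ 2 divided by |a_1|. This particular normalization is an assumption chosen to be consistent with the properties listed above; other variants of Granlund's descriptors exist.

```python
import numpy as np

def fourier_descriptors(points, n_desc=8):
    """Magnitude-based Fourier descriptors of a closed boundary.

    points: (N, 2) array of boundary coordinates in tracing order.
    Dropping a_0 removes translation, |.| removes rotation and starting
    point, and dividing by |a_1| removes scale.
    """
    z = points[:, 0] + 1j * points[:, 1]
    a = np.fft.fft(z) / len(z)
    return np.abs(a[2:2 + n_desc]) / np.abs(a[1])

# A three-lobed closed curve traced counterclockwise (illustrative shape)
t = np.linspace(0, 2 * np.pi, 64, endpoint=False)
r = 1 + 0.3 * np.cos(3 * t)
shape = np.column_stack([r * np.cos(t), r * np.sin(t)])

# Rotate by 0.7 rad, scale by 2.5, translate, and start tracing 10 points later
c, s = np.cos(0.7), np.sin(0.7)
R = np.array([[c, -s], [s, c]])
moved = np.roll(2.5 * shape @ R.T + np.array([3.0, -1.0]), 10, axis=0)

d_original = fourier_descriptors(shape)
d_moved = fourier_descriptors(moved)
```

The two descriptor vectors agree up to floating-point error even though the second boundary is rotated, scaled, shifted, and traced from a different starting point.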

Page 53: Advanced Topics – I  (EENG 4010) Computer Vision & Image Analysis (EENG 5640)

Relational Similarity

Spatial Relationship- relational graphs indicating inter-object relationships can be constructed. Once objects are identified, relationships can be matched, by graph matching techniques.

Abstract Relationship- Happy face; it involves separation and identification of a face region first, and then checking whether it is a happy or sad face.

Page 54: Advanced Topics – I  (EENG 4010) Computer Vision & Image Analysis (EENG 5640)

Matching in 2D

Transformation: mapping from one coordinate space to another.
– Linear or nonlinear (nonlinear ones are called warps)
– One-to-one correspondence between points if linear
– Invertible or non-invertible

Image Registration- Process of establishing point by point correspondence between two images of a scene.

Page 55: Advanced Topics – I  (EENG 4010) Computer Vision & Image Analysis (EENG 5640)

Affine Transformation (Mapping)

Wikipedia definition: an affine (Latin affinis, "connected with") mapping between two vector (affine) spaces is a linear transformation followed by a translation.

It preserves:
– Collinearity of points
– Ratios of distances along a line

Page 56: Advanced Topics – I  (EENG 4010) Computer Vision & Image Analysis (EENG 5640)

Image Operations Represented by Affine Transformation

Can we write the affine transformation Y = A.X + t as a single matrix multiplication (i.e. can we absorb t into A)?
For scaling and rotation, yes.
For translation, it does not seem to be possible. What is the way out?

Page 57: Advanced Topics – I  (EENG 4010) Computer Vision & Image Analysis (EENG 5640)

Homogeneous Coordinates

Introduced by August Ferdinand Möbius (1790-1868), a German mathematician and theoretical astronomer.

(x, y) → (w.x, w.y, w); e.g. (2.x, 2.y, 2) = (3.x, 3.y, 3)
(x, y, z) → (w.x, w.y, w.z, w); the same concept can be extended to any n-dimensional space.
You can represent a point at infinity (how?)

Page 58: Advanced Topics – I  (EENG 4010) Computer Vision & Image Analysis (EENG 5640)

Usage of Homogeneous Coordinates

The rotation, scaling, and translation can be modeled as matrix multiplication operations as follows:

If control (easily distinguishable) points of two images are identified, and registered with an affine mapping, it is easy to identify the presence of the same object(s) in both- Recognition by Alignment.
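In homogeneous coordinates, rotation, scaling, and translation in 2D all become 3 x 3 matrices, so a sequence of operations is a single matrix product. The sketch below uses the standard forms of these matrices; the function names are illustrative.

```python
import numpy as np

def translation(tx, ty):
    return np.array([[1.0, 0.0, tx],
                     [0.0, 1.0, ty],
                     [0.0, 0.0, 1.0]])

def rotation(theta):
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0.0],
                     [s, c, 0.0],
                     [0.0, 0.0, 1.0]])

def scaling(sx, sy):
    return np.array([[sx, 0.0, 0.0],
                     [0.0, sy, 0.0],
                     [0.0, 0.0, 1.0]])

# Scale by 2, then rotate by 90 degrees, then translate by (2, 3)
T = translation(2, 3) @ rotation(np.pi / 2) @ scaling(2, 2)

point = np.array([1.0, 0.0, 1.0])   # the point (1, 0) with w = 1
mapped = T @ point
```

The rightmost matrix is applied first: (1, 0) is scaled to (2, 0), rotated to (0, 2), and translated to (2, 5).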

Page 59: Advanced Topics – I  (EENG 4010) Computer Vision & Image Analysis (EENG 5640)

Shears and Reflections

[Figure: the unit square with corners (0, 0), (1, 0), (0, 1), (1, 1) under two shears]
Horizontal shear: (0, 1) → (ey, 1) and (1, 1) → (1+ey, 1)
Vertical shear: (1, 0) → (1, ex) and (1, 1) → (1, 1+ex)

Reflection about the x-axis: (x, y) → (x, -y)
Reflection about the y-axis: (x, y) → (-x, y)
You can express these also as affine transformations.

Page 60: Advanced Topics – I  (EENG 4010) Computer Vision & Image Analysis (EENG 5640)

General Affine Transformation

You may consolidate all the previous affine transformations (translation, rotation, scale, shear, and reflection) into the general affine transformation as follows:

Page 61: Advanced Topics – I  (EENG 4010) Computer Vision & Image Analysis (EENG 5640)

Best 2D Transformation with Least-Squares Fitting

Let (xj, yj), j = 1, …, n, be the control points in an image, and (x’j, y’j) the corresponding points in the transformed image. Then the least-squares fit method seeks to minimize the error:

Setting the 6 partial derivatives of the form corresponding to the 6 transformation parameters to zero, we get 6 equations for 6 unknowns.
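A minimal sketch of the least-squares fit, assuming the general affine map is written as x' = a11 x + a12 y + a13 and y' = a21 x + a22 y + a23 (the parameter names are an assumption). `np.linalg.lstsq` solves the same minimization that setting the six partial derivatives to zero leads to.

```python
import numpy as np

def fit_affine(src, dst):
    """Least-squares affine map from control points src to dst.

    src, dst: (n, 2) arrays of corresponding points, n >= 3.
    Returns the 2x3 matrix [[a11, a12, a13], [a21, a22, a23]].
    """
    n = len(src)
    M = np.hstack([src, np.ones((n, 1))])          # rows [x_j, y_j, 1]
    params, *_ = np.linalg.lstsq(M, dst, rcond=None)
    return params.T

# Control points and their images under a known map (illustrative data)
true_map = np.array([[1.5, -0.5, 2.0],
                     [0.3, 0.8, -1.0]])
src = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [2.0, 3.0]])
dst = src @ true_map[:, :2].T + true_map[:, 2]

estimated = fit_affine(src, dst)
```

With noise-free correspondences the known map is recovered exactly; with noisy control points the result is the map with the smallest summed squared residual.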

Page 62: Advanced Topics – I  (EENG 4010) Computer Vision & Image Analysis (EENG 5640)

2D-Object Recognition via Affine Mapping- Local Feature Focus Method

Local-Feature-Focus-Method
{
  For each pair of model and image features
  {
    Find the maximal subset of matching neighboring features;
    Find Best T;
    If enough features align, confirm the presence of the model object;
  }
}

[Figure: Model F with features F1 to F4, Model E with features E1 to E4, and an image with features G1 to G8]

Page 63: Advanced Topics – I  (EENG 4010) Computer Vision & Image Analysis (EENG 5640)

2D-Object Recognition via Affine Mapping- Pose Clustering Method

Pose-Clustering-Method (P, L)
{
  // P is the set of image features
  // L is the set of stored model features
  For each pair (Pi, Pj) of image features
    For each pair (Lm, Ln) of model features of the same type
    {
      Compute the affine (RST) parameters
      Each will be a point in the parameter space
    }
  Examine the parameter space for large cluster modes
  return all k ‘s corresponding to dominant modes
}

‘L’ Junction ‘Y’ Junction ‘T’ Junction

Arrow Junction

‘X’ Junction

Some Typical Model Features

Page 64: Advanced Topics – I  (EENG 4010) Computer Vision & Image Analysis (EENG 5640)

2D-Object Recognition via Affine Mapping- Geometric Hashing

• Useful when the model database (DB) is very large and an object in the image is known to be an affine transform of one of the DB models.

[Figure: model basis points e00, e01, e10 and a point x, with their images Te00, Te01, Te10, and Tx under the affine transform T]

• If e00, e01, and e10 are any three non-collinear feature points from the model feature point set M, any point x ∈ M can be represented as follows using the affine basis set (coordinate set) constructed from these 3 points: x = ξ(e10 – e00) + η(e01 – e00) + e00

• We use the property that under an affine transform, the same relation holds: Tx = ξ(Te10 – Te00) + η(Te01 – Te00) + Te00
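The invariance property can be verified numerically: the two coefficients of x with respect to the basis triplet (called xi and eta here, an assumed naming) are unchanged when every point is pushed through the same affine transform. The function name `affine_coords` and the test data are illustrative.

```python
import numpy as np

def affine_coords(x, e00, e10, e01):
    """Solve x = xi*(e10 - e00) + eta*(e01 - e00) + e00 for (xi, eta)."""
    basis = np.column_stack([e10 - e00, e01 - e00])
    return np.linalg.solve(basis, x - e00)

# Model basis triplet and one further model point
e00 = np.array([0.0, 0.0])
e10 = np.array([2.0, 0.0])
e01 = np.array([0.0, 1.0])
x = np.array([1.0, 3.0])

# An arbitrary invertible affine transform T(v) = A v + t
A = np.array([[1.2, 0.4], [-0.3, 0.9]])
t = np.array([5.0, -2.0])

def T(v):
    return A @ v + t

coords_model = affine_coords(x, e00, e10, e01)
coords_image = affine_coords(T(x), T(e00), T(e10), T(e01))
```

Because the coordinates agree, they can serve as a hash-table index that is the same for the model and for its transformed appearance in the image.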

Page 65: Advanced Topics – I  (EENG 4010) Computer Vision & Image Analysis (EENG 5640)

Geometric Hashing Method (Contd.)

GH-Offline-Preprocessing (D, H)
{
  // D- Database Model Set
  // H- Initially empty hash table
  for each model M in D
  {
    Extract feature set FM;
    for each non-collinear triplet E of FM
      for each other point x of FM
      {
        Calculate (ξ, η) for x with respect to E
        Store (M, E) in the table H at index (ξ, η)
      }
  }
}

Page 66: Advanced Topics – I  (EENG 4010) Computer Vision & Image Analysis (EENG 5640)

Geometric Hashing Method (Contd.)

GH-Online-Recognition (D, H)
{
  // D- Database Model Set; H- Hash table constructed in the offline processing
  for each possible (M, E) tuple in the database
  {
    set Bin-Count (M, E) = 0;
  }
  Extract image feature set FI;
  for each non-collinear triplet E of FI
    for each other point x of FI
    {
      Calculate (ξ, η) for x with respect to E;
      Retrieve (M, E) pairs in the table H at index (ξ, η);
      Increment the Bin-Count of those (M, E) pairs;
    }
  Return (M, E) values with highest Bin-Count values.
}

Page 67: Advanced Topics – I  (EENG 4010) Computer Vision & Image Analysis (EENG 5640)

Practical Problems in Geometric Hashing

Errors in feature point coordinates
Missing and extra feature points
Occlusions and multiple objects
Unstable bases
Weird affine transforms on subsets of points and consequent hypotheses for the presence of hallucinated objects (this problem is present in the pose clustering and local-feature-focus methods also).

[Figure: (a) Image Points, (b) Hallucinated Object]

Page 68: Advanced Topics – I  (EENG 4010) Computer Vision & Image Analysis (EENG 5640)

General Framework for 2D-Object Recognition via Relational Matching

Consistent Labeling Problem is a 5-tuple (P, L, RP, RL, f)
P – Object Parts (found in the image)
L – Object Labels (names of stored model features)
RP – Set of Relationships between Parts
RL – Set of Constraint Relationships between Labels
f is a mapping such that if (pi, pj) ∈ RP, then (f(pi), f(pj)) ∈ RL

Page 69: Advanced Topics – I  (EENG 4010) Computer Vision & Image Analysis (EENG 5640)

Brute Force Method for Consistent Labeling - Interpretation Tree Search

Bool Interpretation-Tree-Search(P, L, RP, RL, f)
{
  p = first(P);
  for each l in L
  {
    f’ = f U {(p, l)}; // add part-label to interpretation
    OK = true;
    for each N-tuple (p1, …, pN) in RP containing p
    {
      if (f’(p1), …, f’(pN)) ∉ RL then { OK = false; break; }
    }
    if OK then
    {
      P’ = P – {p};
      if is-empty(P’) then output(f’);
      else Interpretation-Tree-Search(P’, L, RP, RL, f’)
    }
  }
}

[Figure: (a) Labels C1 to C8 and H1 to H4; (b) Image Parts P1 to P5; (c) Interpretation Tree, branching on P1 = C1, P1 = H1, and so on]

Page 70: Advanced Topics – I  (EENG 4010) Computer Vision & Image Analysis (EENG 5640)

Discrete Relaxation Labeling

Discrete-Relaxation-Labeling(P, S, R)
{
  // Pi, i = 1, …, D, is the set P of the detected image features
  // S is the set of sets S(Pi), i = 1, …, D, of initially compatible labels for the Pi
  // R is the set of relationships over which compatibility is determined
  repeat
  {
    for each (Pi, S(Pi))
    {
      for each label Lk ∈ S(Pi)
        for each relation R(Pi, Pj) over image parts
          if there exists Lm ∈ S(Pj) with R(Lk, Lm) in the model
          then keep Lk in S(Pi)
          else delete Lk from S(Pi)
    }
  } until no change in any S(Pi)
  return (S)
}

Page 71: Advanced Topics – I  (EENG 4010) Computer Vision & Image Analysis (EENG 5640)

Continuous (Probabilistic) Relaxation

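Initialization (iter = 0):

$$pr_i^{0}(l) = pr_i(l)$$

Support for label l at part i, where the $r_{ij}(l, l')$ are the compatibility values:

$$q_i^{iter}(l) = \sum_{\{j \mid (i, j) \in R\}} C_{ij} \sum_{l' \in L_j} r_{ij}(l, l')\, pr_j^{iter}(l')$$

Update:

$$pr_i^{iter+1}(l) = \frac{pr_i^{iter}(l)\,(1 + q_i^{iter}(l))}{\sum_{l' \in L_i} pr_i^{iter}(l')\,(1 + q_i^{iter}(l'))}$$

Loop on i and iter until the labels of all parts stabilize and become unique.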

Page 72: Advanced Topics – I  (EENG 4010) Computer Vision & Image Analysis (EENG 5640)

Extraction of 3D Information from 2D Images- Shape from “X” Techniques

Direct 3D perception: Range Imaging (costly)
How do humans perceive depth/3D shape?
– Stereo
– Shading
– Monocular cues, e.g. relative depth information from occlusions of background objects by foreground objects, and perspective, with farther objects appearing smaller
Shape from X (X = binocular stereo / photometric stereo / shading / texture / boundary / motion)

Page 73: Advanced Topics – I  (EENG 4010) Computer Vision & Image Analysis (EENG 5640)

Binocular Stereo

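[Figure: binocular stereo geometry with focal length f, camera separation d, scene point P at depth z, and image coordinates x' and x'']

x'/f = OP/(f + z)
x''/f = (OP + d)/(f + z)
(x'' - x')/f = d/(f + z)
z = f.d/(x'' - x') - f

Depth can be inferred from the disparity (x'' - x'). The only problem that remains to be solved is the correspondence problem.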

Page 74: Advanced Topics – I  (EENG 4010) Computer Vision & Image Analysis (EENG 5640)

Surface Orientation from Reflectance Models

[Figure: surface patch in x, y, z coordinates showing the angles i, e, and g]

i = angle of incidence
e = angle of emittance
g = phase angle
n, ng, and ns are unit vectors along the surface normal, view direction, and source direction, respectively.

For specular (smooth, mirror-like) surfaces, the maximum amount of light is reflected in the direction of what is called the specular angle, and the amount falls off in directions away from this one. An estimate of the cosine of the difference between the specular and viewing angles is given by: C = 2cos(i)cos(e) – cos(g).

For dusty/matte (Lambertian) surfaces, the reflected intensity in any direction is proportional to the cosine of the angle of incidence.

A general formula that includes both effects is: $L(i, e, g) = s\,C^n + (1 - s)\cos(i)$, $0 \le s \le 1$. The larger n is, the sharper the peaking in the specular direction.

Page 75: Advanced Topics – I  (EENG 4010) Computer Vision & Image Analysis (EENG 5640)

Reflectance Map for Lambertian Surfaces

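$$R = r_0\, \mathbf{n} \cdot \mathbf{n}_s \quad \ldots (1)$$

The spatially varying reflectance factor $r_0$ is called the albedo. For a surface z = f(x, y), let $p = \partial z/\partial x$ and $q = \partial z/\partial y$.

A small step $\delta x$ along x changes z by $\delta z = (\partial z/\partial x)\,\delta x = p\,\delta x$, so

$(\delta x, 0, p\,\delta x)^T$ is parallel to $(1, 0, p)^T = \mathbf{r}_x$, say, and
$(0, \delta y, q\,\delta y)^T$ is parallel to $(0, 1, q)^T = \mathbf{r}_y$, say.

$$\mathbf{n} \propto \mathbf{r}_x \times \mathbf{r}_y = (-p, -q, 1)^T, \qquad \mathbf{n}_s \propto (-p_s, -q_s, 1)^T$$

Putting these (after normalizing to unit length) in (1), we get

$$R(p, q) = r_0\,\frac{1 + p\,p_s + q\,q_s}{\sqrt{1 + p^2 + q^2}\,\sqrt{1 + p_s^2 + q_s^2}}$$

[Figure: contours of constant R(p, q) in the p-q plane, e.g. 0.9, 0.8, 0.7]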

Page 76: Advanced Topics – I  (EENG 4010) Computer Vision & Image Analysis (EENG 5640)

Photometric Stereo

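With 3 sources of light:

$$R_k(x, y) = I_k(x, y) = r_0\,(\mathbf{n}_k \cdot \mathbf{n}), \quad k = 1, \ldots, 3$$

$$\mathbf{I} = r_0\, N \mathbf{n}$$

where $\mathbf{I} = [I_1(x, y), I_2(x, y), I_3(x, y)]^T$ and N is the 3 x 3 matrix whose rows are the source directions $\mathbf{n}_k^T$. Then

$$r_0 = |N^{-1}\mathbf{I}|, \qquad \mathbf{n} = \frac{1}{r_0} N^{-1}\mathbf{I}$$

[Figure: reflectance-map contours R1(p, q) = 0.9, R2(p, q) = 0.75, and R3(p, q) = 0.5 in the p-q plane, intersecting at the solution point]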
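For a single pixel, the three-source computation is a 3 x 3 linear solve. The sketch below assumes N stacks the three unit source directions as rows and that the pixel is lit by all three sources; the function name and the numbers are illustrative.

```python
import numpy as np

def photometric_stereo(N, I):
    """Recover albedo r0 and unit normal n from I = r0 * N @ n."""
    g = np.linalg.solve(N, I)   # g = r0 * n
    r0 = np.linalg.norm(g)      # n has unit length, so |g| = r0
    return r0, g / r0

# Three non-coplanar unit source directions (rows of N)
N = np.array([[0.0, 0.0, 1.0],
              [0.6, 0.0, 0.8],
              [0.0, 0.6, 0.8]])

# Synthesize intensities for a known surface normal and albedo
n_true = np.array([0.2, 0.1, 1.0])
n_true = n_true / np.linalg.norm(n_true)
r0_true = 0.7
I = r0_true * N @ n_true

r0_est, n_est = photometric_stereo(N, I)
p_est, q_est = -n_est[0] / n_est[2], -n_est[1] / n_est[2]   # from n ∝ (-p, -q, 1)
```

Using `np.linalg.solve` avoids forming the inverse of N explicitly; the sources must not be coplanar, otherwise N is singular.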

Page 77: Advanced Topics – I  (EENG 4010) Computer Vision & Image Analysis (EENG 5640)

Shape from Shading

We know that (-p, -q, 1) is the vector in the direction of the surface normal and (-ps, -qs, 1) is the corresponding vector for the source direction. The reflectance (i.e. image intensity) for a Lambertian surface is then given by:

At each pixel site (k, l) we need to find the best (pkl, qkl) pair that gives an Rkl matching with the image intensity Ekl. In other words, we minimize

This may not have a unique solution. In photometric stereo, we used 3 images to resolve the ambiguity. Here we use surface continuity for the same purpose:

Overall we minimize We get the final solution by setting and

Page 78: Advanced Topics – I  (EENG 4010) Computer Vision & Image Analysis (EENG 5640)

Shape from Shading- Ikeuchi’s Relaxation Approach

Continuing from the previous slide by setting and , we get

and

Now, the equations for obtaining p and q iteratively by Ikeuchi’s relaxation approach are:

Advantage: The relaxation method can enforce the boundary conditions and get good solutions. Limitation: In the current formulation, an occluding boundary, where p and q are infinite, causes a problem. How to solve this problem?

Mask to compute average p and q

Page 79: Advanced Topics – I  (EENG 4010) Computer Vision & Image Analysis (EENG 5640)

Photometric Stereo by Relaxation Approach

The same relaxation equations can be extended to the photometric stereo problem involving n (= 2 or more) images:

Here and represent the intensity at the pixel site in the image captured with the th light source, and the corresponding reflectance map, respectively.

Advantage: The relaxation method can enforce the boundary conditions and get good solutions. Limitation: In the current formulation, an occluding boundary, where p and q are infinite, causes a problem. How to solve this problem? See the next slide!

Page 80: Advanced Topics – I  (EENG 4010) Computer Vision & Image Analysis (EENG 5640)

Solution to the Problem of Infinite Gradients at Occluding Boundary

Ikeuchi suggested formulating the problem in polar coordinates and solving it there.

If P is a point on a surface patch, and OP is a unit normal, the x, y, and z components of this vector are given by:

$$(\sin\theta\cos\phi,\ \sin\theta\sin\phi,\ \cos\theta) \quad \ldots (1)$$

We can similarly represent the x, y, and z components of the unit vector in the direction of the light source, with polar coordinates $(\theta_s, \phi_s)$, as:

$$(\sin\theta_s\cos\phi_s,\ \sin\theta_s\sin\phi_s,\ \cos\theta_s)$$

Since, for Lambertian surfaces, $R = r_0\cos i$, where i is the angle between the two directions, we can rewrite R as the following function of θ and φ:

$$R(\theta, \phi) = r_0\,(\sin\theta\sin\theta_s\cos\phi\cos\phi_s + \sin\theta\sin\theta_s\sin\phi\sin\phi_s + \cos\theta\cos\theta_s) = r_0\,(\sin\theta\sin\theta_s\cos(\phi - \phi_s) + \cos\theta\cos\theta_s)$$

At the occluding boundary, $\theta = \pi/2$ and $\phi = \tan^{-1}(y/x)$. The same equations as before hold with θ and φ replacing p and q. The conversion from θ and φ to p and q may be done in the end using (1).

[Figure: unit normal OP in the x, y, z coordinate frame with origin O]

Page 81: Advanced Topics – I  (EENG 4010) Computer Vision & Image Analysis (EENG 5640)

Needle Diagram (Display of Unit Normal Vectors in the Image Space)

(a) Image of a resin Droplet on a flower of a plant

(b) Needle diagram for the image in (a)

Page 82: Advanced Topics – I  (EENG 4010) Computer Vision & Image Analysis (EENG 5640)

Depth Construction from the (p, q) tuples (Needle Diagram)

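Since $p = \partial z/\partial x$ and $q = \partial z/\partial y$,

$$z(x, y) = z(x_0, y_0) + \int_{(x_0, y_0)}^{(x, y)} (p\,dx + q\,dy)$$

Because of imperfect (p, q) values, this integral may not yield correct values. The values may be sensitive to the path chosen for integration, and the integral around a closed loop of pixels may not vanish. Hence, a better approach is to choose the z that minimizes

$$\iint \left((z_x - p)^2 + (z_y - q)^2\right) dx\,dy$$

From the calculus of variations, an integral of the form $\iint F(z, z_x, z_y)\,dx\,dy$ can be minimized by solving the Euler equation:

$$F_z - \frac{\partial}{\partial x}F_{z_x} - \frac{\partial}{\partial y}F_{z_y} = 0$$

For our problem, the Euler equation would yield:

$$\nabla^2 z = p_x + q_y$$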

Page 83: Advanced Topics – I  (EENG 4010) Computer Vision & Image Analysis (EENG 5640)

Z-map Construction Algorithm

1. Convolve the p and q maps with horizontal and vertical Prewitt/Sobel masks, respectively, to get the $p_x$ and $q_y$ maps, and add them up to get the $s = p_x + q_y$ map.

2. Start with a random configuration of z-values on the image grid. Or, preferably, obtain a crude z-map by starting with z(0, 0) = 0 and applying the integral for computing z from the p and q values, visiting the cell sites in raster scan fashion.

3. Convolve the z-image with the horizontal Prewitt/Sobel mask to get the $\partial z/\partial x$ image. Convolve the resultant image again with the same mask to get $\partial^2 z/\partial x^2$. Convolve the z-image with the vertical Prewitt/Sobel mask to get the $\partial z/\partial y$ image. Convolve the resultant image again with the same mask to get $\partial^2 z/\partial y^2$. Add the two resultant images to get the $\nabla^2 z$ image.

4. For all (i, j), update $z_{ij}$ by $z_{ij} + \epsilon\,(\nabla^2 z_{ij} - s_{ij})$, where $\epsilon$ is a small constant.

5. If none of the updated $z_{ij}$ values changes significantly, stop. Otherwise, go to step 3.

Page 84: Advanced Topics – I  (EENG 4010) Computer Vision & Image Analysis (EENG 5640)

Moving Objects- Optical Flow

Whether the observer or the scene objects are moving, the relative motion gives a lot of information about depth because the points closer to the observer seem to be moving faster. Motion stereo is the name of phenomenon (or body of techniques) for depth perception based on motion information.

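Brightness patterns in the image move as the objects that give rise to them move. Optical flow is the apparent motion of the brightness pattern.

Basic optical flow equation:

$$I(x + \delta x, y + \delta y, t + \delta t) = I(x, y, t)$$

Expanding the left hand side by Taylor series and equating terms, we get

$$I_x u + I_y v + I_t = 0$$

where $I_x = \partial I/\partial x$, $I_y = \partial I/\partial y$, $I_t = \partial I/\partial t$, and (u, v) is the flow velocity.

The velocity in the direction of the brightness gradient is given by

$$-\frac{I_t}{\sqrt{I_x^2 + I_y^2}}$$

We can’t determine the flow in the iso-brightness direction (at right angles to the gradient). This is called the aperture problem.

[Figure: the constraint line $I_x u + I_y v + I_t = 0$ in the u-v plane, perpendicular to the gradient direction $(I_x, I_y)$]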
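The part of the flow that the constraint line Ixu + Iyv + It = 0 does determine is the component along the brightness gradient (the normal flow). A minimal sketch, assuming the derivatives Ix, Iy, and It at a pixel are already available and the gradient is nonzero; the function name is illustrative.

```python
import numpy as np

def normal_flow(Ix, Iy, It):
    """Flow component along the brightness gradient.

    Returns the signed speed along the gradient and the corresponding
    (u, v), the point on the constraint line closest to the origin.
    """
    mag2 = Ix ** 2 + Iy ** 2
    speed = -It / np.sqrt(mag2)
    u = -It * Ix / mag2
    v = -It * Iy / mag2
    return speed, u, v

# Example derivative values at one pixel (illustrative numbers)
speed, u, v = normal_flow(3.0, 4.0, -10.0)
residual = 3.0 * u + 4.0 * v + (-10.0)
```

Any flow obtained by adding a vector perpendicular to (Ix, Iy) satisfies the constraint equally well, which is the aperture problem in numerical form.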

Page 85: Advanced Topics – I  (EENG 4010) Computer Vision & Image Analysis (EENG 5640)

Optical Flow Estimation- Horn and Schunck’s Relaxation Method

As before, we need to minimize conjointly two constraints:

[Figure: 2 x 2 pixel neighborhoods from (i, j) to (i+1, j+1) in frame t and frame t+1, used for the Ix, Iy, and It computations]

Page 86: Advanced Topics – I  (EENG 4010) Computer Vision & Image Analysis (EENG 5640)

Shape/Structure from Motion

x

y

zP = (X, Y, Z)T

xyzT

O

Now, equating each component of , we get:

If are the image point corresponding to the 3-D object point P, we have from perspective projection: and (assuming f = 1, and Z >> f).

Now, the optical flow components and are: and

Now, using (1), we can get expressions forand in terms of Z and 6 motion parameters.

Page 87: Advanced Topics – I  (EENG 4010) Computer Vision & Image Analysis (EENG 5640)

Shape from Motion- Pure Translation Case

From the imaging geometry, the (x, y) coordinates of the image points corresponding to the 3D-Object point (X, Y, Z) are given by:

Now, if we ignore terms related to rotation from the equations of u and v on the previous slide, we get

In order to match the estimated (u, v) values with computed values, we need to minimize the function:

Page 88: Advanced Topics – I  (EENG 4010) Computer Vision & Image Analysis (EENG 5640)

Shape/Object Representation and Recognition

Objects/Shapes

2-D (Planar) 3-D

Representation of closed boundaries (Recognition Sensitive to Noise and Occlusions) e.g., Fourier Descriptors

Noise and Occlusion-tolerant Representation of parts and their interrelationships (e.g., Connectivity, adjacency). Parts could be regular shaped objects such as circles, squares, triangles recognizable by Hough transform, or curve segments represented using polynomial forms (B-Splines)

Volumetric/binary voxel representation (e.g. for 3-D medical image constructed from 2D-slices)

Surface based representations (e.g. Coons surface patches represented by 2-D polynomials, generalized cylinders)

Page 89: Advanced Topics – I  (EENG 4010) Computer Vision & Image Analysis (EENG 5640)

Reconstruction Imaging- Computer Aided Tomography (CAT) Scans

g(x, y)

X-ray source