View
220
Download
1
Tags:
Embed Size (px)
Citation preview
ComputerVision
Segmentation
Marc PollefeysCOMP 256
Some slides and illustrations from D. Forsyth, T. Darrel,...
ComputerVision
Aug 26/28 - Introduction
Sep 2/4 Cameras Radiometry
Sep 9/11 Sources & Shadows Color
Sep 16/18 Linear filters & edges
(hurricane Isabel)
Sep 23/25 Pyramids & Texture Multi-View Geometry
Sep30/Oct2 Stereo Project proposals
Oct 7/9 Tracking (Welch) Optical flow
Oct 14/16 - -
Oct 21/23 Silhouettes/carving (Fall break)
Oct 28/30 - Structure from motion
Nov 4/6 Project update Proj. SfM
Nov 11/13 Camera calibration Segmentation
Nov 18/20 Fitting Prob. segm.&fit.
Nov 25/27 Matching templates (Thanksgiving)
Dec 2/4 Matching relations Range data
Dec ? Final project
Tentative class schedule
ComputerVision Segmentation and Grouping
• Motivation: not information is evidence
• Obtain a compact representation from an image/motion sequence/set of tokens
• Should support application
• Broad theory is absent at present
• Grouping (or clustering)– collect together
tokens that “belong together”
• Fitting– associate a model
with tokens– issues
• which model?• which token goes
to which element?• how many
elements in the model?
ComputerVision General ideas
• tokens– whatever we need to
group (pixels, points, surface elements, etc., etc.)
• top down segmentation– tokens belong
together because they lie on the same object
• bottom up segmentation– tokens belong
together because they are locally coherent
• These two are not mutually exclusive
ComputerVision
Why do these tokens belong together?
ComputerVision
ComputerVision
Basic ideas of grouping in humans
• Figure-ground discrimination– grouping can be
seen in terms of allocating some elements to a figure, some to ground
– impoverished theory
• Gestalt properties– elements in a
collection of elements can have properties that result from relationships (Muller-Lyer effect)
• gestaltqualitat
– A series of factors affect whether elements should be grouped together
• Gestalt factors
ComputerVision
ComputerVision
ComputerVision
ComputerVision
ComputerVision
ComputerVision
ComputerVision
ComputerVision
Technique: Shot Boundary Detection
• Find the shots in a sequence of video– shot boundaries
usually result in big differences between succeeding frames
• Strategy:– compute interframe
distances– declare a boundary
where these are big
• Possible distances– frame differences– histogram differences– block comparisons– edge differences
• Applications:– representation for
movies, or video sequences
• find shot boundaries• obtain “most
representative” frame
– supports search
ComputerVision
Technique: Background Subtraction
• If we know what the background looks like, it is easy to identify “interesting bits”
• Applications– Person in an office– Tracking cars on a
road– surveillance
• Approach:– use a moving
average to estimate background image
– subtract from current frame
– large absolute values are interesting pixels
• trick: use morphological operations to clean up pixels
ComputerVision
ComputerVision
ComputerVision
ComputerVision
ComputerVision
ComputerVision
ComputerVision
ComputerVision
ComputerVision
ComputerVision
ComputerVision Segmentation as clustering
• Cluster together (pixels, tokens, etc.) that belong together
• Agglomerative clustering– attach closest to
cluster it is closest to– repeat
• Divisive clustering– split cluster along best
boundary– repeat
• Point-Cluster distance– single-link clustering– complete-link
clustering– group-average
clustering
• Dendrograms– yield a picture of
output as clustering process continues
ComputerVision Simple clustering algorithms
ComputerVision
ComputerVision K-Means
• Choose a fixed number of clusters• Choose cluster centers and point-cluster allocations
to minimize error
• can’t do this by search, because there are too many possible allocations.
x j i2
jelements of i'th cluster
iclusters
ComputerVision
K-means clustering using intensity alone and color alone
Image Clusters on intensity Clusters on color
ComputerVision
K-means using color alone, 11 segments
Image Clusters on color
ComputerVision
K-means usingcolor alone,11 segments.
ComputerVision
K-means using colour andposition, 20 segments
ComputerVision Graph theoretic clustering
• Represent tokens using a weighted graph.– affinity matrix
• Cut up this graph to get subgraphs with strong interior links
ComputerVision
ComputerVision
ComputerVision
ComputerVision
ComputerVision
ComputerVision Measuring Affinity
Intensity
Texture
Distance
aff x, y exp 12 i
2
I x I y 2
aff x, y exp 12 d
2
x y
2
aff x, y exp 12 t
2
c x c y 2
ComputerVision Scale affects affinity
ComputerVision Eigenvectors and cuts
• Simplest idea: we want a vector a giving the association between each element and a cluster
• We want elements within this cluster to, on the whole, have strong affinity with one another
• We could maximize
• But need the constraint
• This is an eigenvalue problem - choose the eigenvector of A with largest eigenvalue
aTAa
aTa1
ComputerVision Example eigenvector
points
matrix
eigenvector
ComputerVision
ComputerVision More than two segments
• Two options– Recursively split each side to get a tree,
continuing till the eigenvalues are too small
– Use the other eigenvectors
ComputerVision More than two segments
ComputerVision Normalized cuts
• Current criterion evaluates within cluster similarity, but not across cluster difference
• Instead, we’d like to maximize the within cluster similarity compared to the across cluster difference
• Write graph as V, one cluster as A and the other as B
• Maximize
• i.e. construct A, B such that their within cluster similarity is high compared to their association with the rest of the graph
assoc(A,A)assoc(A,V )
assoc(B,B)assoc(B,V )
ComputerVision
ComputerVision
ComputerVision Normalized cuts
• Write a vector y whose elements are 1 if item is in A, -b if it’s in B
• Write the matrix of the graph as W, and the matrix which has the row sums of W on its diagonal as D, 1 is the vector with all ones.
• Criterion becomes
• and we have a constraint
• This is hard to do, because y’s values are quantized
miny
yT D W yyTDy
yTD1 0
ComputerVision
ComputerVision Normalized cuts
• Instead, solve the generalized eigenvalue problem
• which gives
• Now look for a quantization threshold that maximises the criterion --- i.e all components of y above that threshold go to one, all below go to -b
maxy yT D W y subject to yTDy 1
D W yDy
ComputerVision
Figure from “Image and video segmentation: the normalised cut framework”, by Shi and Malik, copyright IEEE, 1998
ComputerVision
Figure from “Normalized cuts and image segmentation,” Shi and Malik, copyright IEEE, 2000
ComputerVision Image parsing:
Unifying Segmentation, Detection and RecognitionTu, Chen, Yuille & Zhu ICCV’03 (Marr prize!)
ComputerVision Image parsing
• Generative models for each classsome random samples for 5 and H
some random face samples (PCA model)
ComputerVision DDMCMC
Data Driven Markov Chain Monte Carlo
ComputerVision Image parsing
ComputerVision Image parsing
ComputerVision Next class:
Fitting
Reading: Chapter 15