1
How do ideas from perceptual
organization relate to natural
scenes?
2
Brunswik & Kamiya 1953
• Thesis: Gestalt rules reflect the structure of the natural world
• Attempted to validate the grouping rule of proximity of similars
• Brunswik was ahead of his time… we now have the tools.
[Photo: Egon Brunswik (1903-1955)]
3
Ecological Statistics of Perceptual Organization
• Can we define these cues for real images?
• Are these cues “ecologically valid”?
• How informative are different cues?
[Figure panels: Grouping, Figure/Ground]
4
• Task: detect generic pattern or group
• Signal: class of patterns, known null hypothesis
• Cues: optimal test is usually obvious
• Result: mathematically precise characterization of when detection is possible
• Task: capture “useful” information about the scene
• Signal: natural image statistics, clutter
• Cues: something computable from real pixels
• Result: empirical statistics about relative power of different cues
5
Berkeley Segmentation DataSet [BSDS]
6
What image measurements allow us to gauge the probability that pixels i and j belong to the same group?
Cues:
a) distance [proximity]
b) region cues [similarity]
c) boundary cues [connectedness, closure, convexity]
7
Learning Pairwise Affinities
Sij – indicator variable for whether pixels i and j were marked as belonging to the same group by human subjects.
Wij – our estimate of the likelihood that pixels i and j belong to the same group, conditioned on the image measurements.
• Use the ground truth given by human segmentations to calibrate cues.
• Learn “statistically optimal” cue combination in a supervised learning framework.
• Ecological Statistics: measure the relative power of different cues for natural scenes.
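The supervised cue combination can be illustrated with a small example. This is a minimal sketch, assuming a logistic-regression combiner trained on synthetic cue values; the slides do not name the classifier, and `fit_affinity_model`, `predict_affinity`, and the three-cue feature layout are hypothetical names chosen for illustration.

```python
import numpy as np

def fit_affinity_model(features, same_segment, lr=0.5, n_iter=2000):
    """Fit logistic-regression weights by plain gradient descent.

    features:     (n_pairs, n_cues) cue differences for pixel pairs (i, j)
    same_segment: (n_pairs,) ground-truth indicator S_ij in {0, 1}
    """
    X = np.hstack([features, np.ones((features.shape[0], 1))])  # add bias column
    w = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ w))                # current estimate of W_ij
        w -= lr * X.T @ (p - same_segment) / len(same_segment)
    return w

def predict_affinity(features, w):
    """Return W_ij, the estimated probability that a pair shares a segment."""
    X = np.hstack([features, np.ones((features.shape[0], 1))])
    return 1.0 / (1.0 + np.exp(-X @ w))

# Synthetic stand-in for human-labeled pairs (assumption: not BSDS data).
rng = np.random.default_rng(0)
n = 400
S = rng.integers(0, 2, size=n)
# Cue differences are larger, on average, for pairs in different segments.
features = rng.normal(loc=(1 - S)[:, None] * 1.5, scale=1.0, size=(n, 3))

w = fit_affinity_model(features, S)
W = predict_affinity(features, w)
```

The fitted weights on the cue differences come out negative here, so a larger difference in any cue lowers the estimated affinity; the size of each weight is one rough indication of how much that cue contributes.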
8
[Figure: cue-processing diagram. Labels: Original Image; Brightness (L*); Color (a*b*); Texture (Textons); Region Processing; Boundary Processing; Distance; output Wij]
9
Evaluation Measures
1. Precision-Recall of same-segment pairs
– Precision is P(Sij = 1 | Wij > t)
– Recall is P(Wij > t | Sij = 1)
2. Mutual Information between W and S
I(S; W) = ∫ p(s,w) log [ p(s,w) / (p(s) p(w)) ]
[Figure panels: Groundtruth Sij, Estimate Wij]
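Both evaluation measures can be computed directly from paired samples of S and W. This is a minimal sketch, assuming W takes values in [0, 1] and that mutual information is estimated from a histogram over W; the function names and the bin count are illustrative assumptions, not part of the original evaluation code.

```python
import numpy as np

def precision_recall(S, W, t):
    """Precision P(S=1 | W>t) and recall P(W>t | S=1) at threshold t."""
    pred = W > t
    tp = np.sum(pred & (S == 1))
    precision = tp / max(np.sum(pred), 1)
    recall = tp / max(np.sum(S == 1), 1)
    return precision, recall

def mutual_information(S, W, n_bins=10):
    """Histogram estimate of I(S; W) in bits; S in {0, 1}, W in [0, 1]."""
    w_bin = np.minimum((W * n_bins).astype(int), n_bins - 1)
    joint = np.zeros((2, n_bins))
    np.add.at(joint, (S, w_bin), 1.0)       # count (s, w-bin) co-occurrences
    joint /= joint.sum()
    ps = joint.sum(axis=1, keepdims=True)   # marginal p(s)
    pw = joint.sum(axis=0, keepdims=True)   # marginal p(w)
    nz = joint > 0                          # skip empty cells (0 log 0 = 0)
    return float(np.sum(joint[nz] * np.log2(joint[nz] / (ps @ pw)[nz])))

S = np.array([1, 1, 1, 0, 0, 0])
W = np.array([0.9, 0.8, 0.4, 0.6, 0.2, 0.1])
precision, recall = precision_recall(S, W, t=0.5)

# Two extreme cases: W determines S exactly, and W carries no information.
mi_perfect = mutual_information(np.array([0, 1, 0, 1]), np.array([0.0, 1.0, 0.0, 1.0]))
mi_none = mutual_information(np.array([0, 0, 1, 1]), np.array([0.1, 0.9, 0.1, 0.9]))
```

Sweeping t from 0 to 1 traces out the precision-recall curve, while the mutual information summarizes the cue in a single threshold-free number.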
10
Individual Features
[Figure panels: Patches, Gradients]
11
Affinity Model vs. Human Segmentation
12
Findings
• Both Edges and Patches provide useful “independent” information.
• Texture gradients can be quite powerful
• Color patches better than gradients
• Brightness gradients better than patches.
• Proximity is a result, not a cause of grouping
13
Figure-Ground Labeling
- start with 200 segmented images of natural scenes
- boundaries labeled by at least 2 different human subjects
- subjects agree on 88% of contours labeled
14
Local Cues for Figure/Ground
• Assume we have a perfect segmentation
• Can we predict which region a contour belongs to based on its local shape?
– Size/Surroundedness
– Convexity
– Lower Region
15
Size and Surroundedness [Rubin 1921]
Size(p) = log(AreaF / AreaG)
[Figure labels: regions G and F, boundary point p]
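The size cue can be evaluated from two region masks. This is a minimal sketch, assuming the areas are measured inside a circular analysis window centered at the boundary point p; the window shape, the function name `size_cue`, and the toy masks are assumptions made for illustration.

```python
import numpy as np

def size_cue(region_f, region_g, p, radius):
    """Size(p) = log(Area_F / Area_G), with areas counted in a disc around p.

    region_f, region_g: boolean masks of the two regions meeting at p
    p: (row, col) of the boundary point; radius: window radius in pixels
    Assumes both regions have at least one pixel inside the window.
    """
    rows, cols = np.indices(region_f.shape)
    window = (rows - p[0]) ** 2 + (cols - p[1]) ** 2 <= radius ** 2
    area_f = np.sum(region_f & window)
    area_g = np.sum(region_g & window)
    return float(np.log(area_f / area_g))

# Toy scene: a 4x4 square (16 pixels) surrounded by the rest of a 20x20 image.
square = np.zeros((20, 20), dtype=bool)
square[8:12, 8:12] = True
surround = ~square

# The radius is large enough here that the window covers the whole image.
size_value = size_cue(square, surround, p=(8, 10), radius=50)
size_swapped = size_cue(surround, square, p=(8, 10), radius=50)
```

The cue is antisymmetric: swapping which region is treated as F flips the sign, so a negative value means the region in the F slot is the smaller one within the window.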
16
Convexity [Metzger 1953, Kanizsa and Gerbino 1976]
Convexity(p) = log(ConvF / ConvG)
ConvG = percentage of straight lines that lie completely within region G
[Figure labels: regions G and F, boundary point p]
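The convexity measure can be approximated by sampling. This is a minimal sketch, assuming the "straight lines" are segments between randomly chosen pairs of pixels in the region and that a segment is tested at a fixed number of points along its length; the sampling scheme and the name `convexity_fraction` are illustrative assumptions.

```python
import numpy as np

def convexity_fraction(mask, n_pairs=500, n_samples=20, seed=0):
    """Estimate the fraction of segments between region pixels that stay inside.

    mask: boolean region mask. Segments are checked at n_samples points,
    rounded to the nearest pixel, so very thin gaps can be missed.
    """
    rng = np.random.default_rng(seed)
    pts = np.argwhere(mask)
    a = pts[rng.integers(0, len(pts), n_pairs)]
    b = pts[rng.integers(0, len(pts), n_pairs)]
    t = np.linspace(0.0, 1.0, n_samples)[None, :, None]
    seg = np.rint(a[:, None, :] + t * (b - a)[:, None, :]).astype(int)
    inside = mask[seg[..., 0], seg[..., 1]]   # (n_pairs, n_samples) booleans
    return float(inside.all(axis=1).mean())

# Toy regions: a solid square (convex) and a ring with a hole (not convex).
solid = np.zeros((20, 20), dtype=bool)
solid[5:15, 5:15] = True
ring = np.ones((20, 20), dtype=bool)
ring[5:15, 5:15] = False

conv_solid = convexity_fraction(solid)
conv_ring = convexity_fraction(ring)
convexity = float(np.log(conv_solid / conv_ring))  # positive favors the solid region
```

In practice the regions would be clipped to an analysis window around p before sampling, and a region with no fully contained segment would need special handling to avoid dividing by zero.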
17
Lower Region [Vecera, Vogel & Woodman 2002]
LowerRegion(p) = θG
[Figure labels: θ, p, center of mass]
18
Figural regions tend to be convex
19
Figural regions tend to lie below ground regions
20
[Figure panels: Size, LowerRegion, Convexity]
21
Power of cue depends on support of the analysis window.
22
Power of cue depends on support of the analysis window.
23
“Upper Bounding” Local Performance
• Present human subjects with local shapes, seen through an aperture.
24
Human Performance on Local Figure-Ground
25
Extension to Real Images
• Build up library of prototypical contour configurations by clustering local shape descriptors
– Geometric Blur [Berg & Malik 01]
• Train a classifier which uses similarities to these prototype shapes to predict figure/ground label
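The prototype-and-similarity idea can be sketched in a few lines. This is a minimal sketch, assuming k-means clustering and a Gaussian similarity to each prototype; random 8-dimensional vectors stand in for Geometric Blur descriptors, which are not implemented here, and the function names, the initialization, and `sigma` are all illustrative assumptions. The classifier stage is omitted.

```python
import numpy as np

def kmeans_prototypes(X, k, n_iter=20):
    """Cluster descriptors into k prototypes (farthest-point init + Lloyd steps)."""
    centers = [X[0]]
    for _ in range(k - 1):
        d = np.min([np.sum((X - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(X[np.argmax(d)])        # farthest point from chosen centers
    centers = np.array(centers, dtype=float)
    for _ in range(n_iter):
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        assign = d2.argmin(axis=1)
        for c in range(k):
            if np.any(assign == c):            # keep the old center if a cluster empties
                centers[c] = X[assign == c].mean(axis=0)
    return centers

def prototype_similarities(X, centers, sigma=1.0):
    """Feature vector per descriptor: Gaussian similarity to each prototype."""
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / (2 * sigma ** 2))

# Synthetic descriptors drawn around three well-separated configurations.
rng = np.random.default_rng(0)
true_shapes = 6.0 * np.eye(3, 8)
labels = np.repeat([0, 1, 2], 30)
descriptors = true_shapes[labels] + rng.normal(scale=0.3, size=(90, 8))

prototypes = kmeans_prototypes(descriptors, k=3)
features = prototype_similarities(descriptors, prototypes)
nearest = features.argmax(axis=1)
```

Each row of `features` plays the role of the similarity vector that a figure/ground classifier would take as input.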
26
Shapemes
Classifier using 64 shapeme features: 61%
27
Globalization of Figure/Ground Measurements
• Averaging local shapeme cue over human-marked boundaries: 71%
• Prior over junction types and label continuity: 79%
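The first globalization step, averaging the local cue along a boundary, can be illustrated as follows. This is a minimal sketch, assuming each boundary point carries a signed local score (positive meaning one side is figure) and that the average is an unweighted mean over all points on the same contour; the slides do not specify the weighting, and `average_over_contours` is a hypothetical name.

```python
import numpy as np

def average_over_contours(local_scores, contour_ids):
    """Replace each point's local score by the mean score of its contour.

    local_scores: (n_points,) signed figure/ground evidence per boundary point
    contour_ids:  (n_points,) non-negative integer id of the contour per point
    """
    sums = np.bincount(contour_ids, weights=local_scores)
    counts = np.bincount(contour_ids)
    means = sums / np.maximum(counts, 1)
    return means[contour_ids]

# Two contours of three points each; the local sign is wrong at two points.
local_scores = np.array([0.4, -0.1, 0.3, -0.2, -0.5, 0.1])
contour_ids = np.array([0, 0, 0, 1, 1, 1])

pooled = average_over_contours(local_scores, contour_ids)
local_labels = local_scores > 0
pooled_labels = pooled > 0
```

Pooling makes every point on a contour take the same label, which corrects isolated local errors as long as most of the evidence along the contour points the right way.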