Computational Theories & Low-level
Pixels To Percepts
A. Efros, CMU, Spring 2009
Four Stages of Visual Perception
© Stephen E. Palmer, 2002
Image-Based Processing
Surface-Based Processing
Object-Based Processing
Category-Based Processing
Light
Vision
Audition
STM
LTM
Motor
Sound
Light
Movement
Odor (etc.)
Ceramic cup on a table
David Marr, 1982
The Retinal Image
An Image (blowup) Receptor Output
Retinal Image: An Image
Image-based processes: Edges, Lines, Blobs, etc.
Image-based Representation: Primal Sketch (Marr), Line Drawing
We likely throw away a lot of information
Line drawings are universal
Image-based Representation: Primal Sketch
Surface-based processes: Stereo, Shading, Motion, etc.
Surface-based Representation: 2.5-D Sketch
Single Surface (Koenderink’s trick)
Figure/Ground Organization
A contour belongs to one of the two (but not both) abutting regions.
Figure (face), Ground (shapeless)
Figure (goblet), Ground (shapeless)
Important for the perception of shape
Figure-Ground Organization
Properties of figures vs. grounds
Figure          Ground
Thing-like      Not thing-like
Closer          Farther
Shaped          Extends behind
Principles of figure-ground organization:
• Surroundedness: Surrounded region --> Figure; Surrounding region --> Ground
• Size: Smaller region --> Figure; Larger region --> Ground
• Orientation: Horizontal/vertical region --> Figure; Oblique region --> Ground
• Contrast: Higher contrast region --> Figure; Lower contrast region --> Ground
• Symmetry: Symmetrical region --> Figure; Asymmetrical region --> Ground
• Convexity: More convex region --> Figure; Less convex region --> Ground
• Parallelism: More parallel region --> Figure; Less parallel region --> Ground
• Lower region: Lower region --> Figure; Upper region --> Ground
• Meaningfulness: More meaningful region --> Figure; Less meaningful region --> Ground
Relation to Depth Factors
Figure-ground organization as edge assignment: To which side does the edge belong?
To the closer side. This fact connects figure-ground organization with depth perception.
Depth cues can also be figure-ground factors, and figure-ground factors can be depth cues.
Principles of figure-ground organization:
• Occlusion: Occluding region --> Figure; Occluded region --> Ground
• Cast Shadows: Shadowing region --> Figure; Shadowed region --> Ground
• Shading: Shaded region --> Figure; Nonshaded region --> Ground
Line Labeling
> : contour direction
+ : convex edge
- : concave edge
possible junctions (constraints)
Constraint Propagation
[Clowes 1971, Huffman 1971; Waltz 1972; Malik 1986]
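The constraint-propagation idea can be sketched in a few lines. This is only a minimal illustration under stated assumptions: the junction names, edge ids, and candidate labelings below are a made-up toy catalog (not the Huffman-Clowes junction catalog), and contour direction is ignored so that only + and - labels appear. The function name `waltz_filter` is hypothetical.

```python
def waltz_filter(junction_edges, candidates):
    """Discard junction labelings that no neighboring junction can agree with,
    repeating until nothing changes.

    junction_edges: junction name -> tuple of edge ids meeting at that junction
    candidates:     junction name -> set of label tuples (one label per edge)
    """
    cands = {j: set(c) for j, c in candidates.items()}
    changed = True
    while changed:
        changed = False
        for j, edges in junction_edges.items():
            for k, other_edges in junction_edges.items():
                if j == k:
                    continue
                shared = [e for e in edges if e in other_edges]
                if not shared:
                    continue

                def agrees(lab, other):
                    # A shared edge must carry the same label at both ends.
                    return all(lab[edges.index(e)] == other[other_edges.index(e)]
                               for e in shared)

                kept = {lab for lab in cands[j]
                        if any(agrees(lab, o) for o in cands[k])}
                if kept != cands[j]:
                    cands[j] = kept
                    changed = True
    return cands


# Toy drawing (hypothetical): three junctions joined pairwise by three edges.
junction_edges = {"A": ("e1", "e3"), "B": ("e1", "e2"), "C": ("e2", "e3")}
# Toy catalog (hypothetical): locally allowed labelings per junction.
candidates = {
    "A": {("+", "+"), ("-", "-")},
    "B": {("+", "+"), ("+", "-")},
    "C": {("+", "+"), ("-", "-")},
}
result = waltz_filter(junction_edges, candidates)
```

Each junction starts out locally ambiguous; pruning at A removes a candidate at C, which in turn removes one at B. Propagation of purely local constraints can therefore settle a labeling that no single junction determines on its own.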
Surface-based Representation: 2.5-D Sketch
Object-based processes: Grouping, Parsing, Completion, etc.
Object-based Representation: Volumetric Sketch
Geons (Biederman '87)
Object-based Representation: Volumetric Sketch
Category-based processes: Pattern-Recognition, Spatial-description
Category-based Representation: Basic-level Category
Category: cup
Color: light-gray
Size: 6”
Location: table
We likely throw away a lot of information
Line drawings are universal
However, things are not so simple…
● Problems with the feed-forward model of processing…
Junctions in Real Images
Are Junctions local evidence?
J McDermott, 2004
Early vs. Late Grouping
Is grouping an early or late process?
Image-Based Processing
Surface-Based Processing
Object-Based Processing
Category-Based Processing
Light ? ? ? ?
Before or after stereoscopic depth?
(Rock & Brosgole, 1964)
Before or after lightness constancy?
(Rock, Nijhawan, Palmer & Tudor, 1992)
Reflectance Matched
Luminance Matched
Translucent Plastic Strip
Reflectance Matched
Luminance-Ratio Matched
Opaque Paper Strip
Before or after visual completion?
(Palmer, Neff & Beck, 1996)
Before or after illusory contours?
(Palmer & Nelson, 2000)
?
Conclusion: Grouping can occur “late”
Question: Can grouping also occur “early”?
(Palmer & Brooks, in preparation)
Grouping affects shape constancy
(Palmer & Brooks, in preparation)
Ambiguous
Flat oval
Circle in depth
Proximity effects
Biased toward oval
Biased toward circle
Color similarity effects
Biased toward oval
Biased toward circle
Common fate effects
Biased toward oval
Biased toward circle
Conclusion: Grouping occurs both “early” and “late” -- possibly everywhere!
Image-Based Processing
Surface-Based Processing
Object-Based Processing
Category-Based Processing
Light
Grouping Grouping Grouping Grouping
two-tone images
hair (not shadow!)
inferred external contours
“attached shadow” contour
“cast shadow” contour
Finding 3D structure in two-tone images requires distinguishing cast shadows, attached shadows, and areas of low reflectivity
The images do not contain this information a priori (at low level)
Cavanagh's argument
A Classical View of Vision
Grouping / Segmentation
Figure/Ground Organization
Object and Scene Recognition
pixels, features, edges, etc.
Low-level
Mid-level
High-level
A Contemporary View of Vision
Figure/Ground Organization
Grouping / Segmentation
Object and Scene Recognition
pixels, features, edges, etc.
Low-level
Mid-level
High-level
But where do we draw this line?
Question #1: What (if anything) should be done at the “Low-Level”?
N.B. I have already told you everything that is known. From now on, there aren’t any answers. Only questions…
Who cares? Why not just use pixels?
Pixel differences vs. Perceptual differences
Eye is not a photometer!
"Every light is a shade, compared to the higher lights, till you come to the sun; and every shade is a light, compared to the deeper shades, till you come to the night."
— John Ruskin, 1879
Cornsweet Illusion
Campbell-Robson contrast sensitivity curve
Sine wave
Metamers
Question #1: What (if anything) should be done at the “Low-Level”?
i.e. What input stimulus should we be invariant to?
Invariant to:
• Brightness / Color changes?
small brightness / color changes
low-frequency changes
But one can be too invariant
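One common way to get brightness invariance is to normalize each patch to zero mean and unit variance. The sketch below is an illustration only (the function name `normalize` and the sample values are assumptions, not from the slides). It also shows one possible reading of "too invariant": a nearly flat patch, once normalized, looks like a strong feature.

```python
from statistics import mean, pstdev


def normalize(patch, eps=1e-8):
    """Zero-mean, unit-variance version of a patch (a flat list of intensities).

    This discards any additive brightness offset and multiplicative contrast gain.
    """
    m = mean(patch)
    s = pstdev(patch)
    return [(p - m) / (s + eps) for p in patch]


patch = [10.0, 12.0, 30.0, 32.0]
brighter = [2.0 * p + 40.0 for p in patch]   # same pattern, gain 2 and offset 40

same_pattern = normalize(patch)
same_pattern_brighter = normalize(brighter)  # numerically almost identical

# A nearly uniform patch: the tiny 0.1 bump is blown up to unit scale.
flat = [100.0, 100.0, 100.1, 100.0]
blown_up = normalize(flat)
```

After normalization the two versions of `patch` are indistinguishable, which is the desired invariance. The `flat` patch, however, ends up with values of the same order as a real edge, so full invariance to contrast also removes the cue that nothing much is there.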
Invariant to:
• Edge contrast / reversal?
I shouldn’t care what background I am on!
but be careful of exaggerating noise
Representation choices
Raw Pixels
Gradients:
Gradient Magnitude:
Thresholded gradients (edge + sign):
Thresholded gradient mag. (edges):
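The four gradient-based choices can be compared on a tiny example. This is a minimal sketch, assuming forward differences for the gradient and an arbitrary threshold of 50; the function names are illustrative and not taken from the slides.

```python
def gradients(img):
    """Forward differences along x and y; the last row and column are dropped."""
    h, w = len(img), len(img[0])
    gx = [[img[y][x + 1] - img[y][x] for x in range(w - 1)] for y in range(h - 1)]
    gy = [[img[y + 1][x] - img[y][x] for x in range(w - 1)] for y in range(h - 1)]
    return gx, gy


def magnitude(gx, gy):
    """Per-pixel gradient magnitude sqrt(gx^2 + gy^2)."""
    return [[(a * a + b * b) ** 0.5 for a, b in zip(rx, ry)]
            for rx, ry in zip(gx, gy)]


def threshold_signed(g, t):
    """Edge + sign: +1, -1, or 0 per pixel."""
    return [[(v > t) - (v < -t) for v in row] for row in g]


def threshold_magnitude(mag, t):
    """Edges only: 1 where the magnitude exceeds t, else 0."""
    return [[int(v > t) for v in row] for row in mag]


# A vertical edge and its contrast-reversed version.
dark_to_light = [[10, 10, 200, 200]] * 3
light_to_dark = [[200, 200, 10, 10]] * 3

gx1, gy1 = gradients(dark_to_light)
gx2, gy2 = gradients(light_to_dark)

signed1 = threshold_signed(gx1, 50)   # keeps the polarity of the edge
signed2 = threshold_signed(gx2, 50)   # opposite sign for the reversed edge
edges1 = threshold_magnitude(magnitude(gx1, gy1), 50)
edges2 = threshold_magnitude(magnitude(gx2, gy2), 50)
```

The signed representation distinguishes the two images, while the thresholded magnitude is identical for both. Each step down the list trades information for invariance.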
Spatial invariance
• Rotation, Translation, Scale
• Yes, but not too much…
• In brain: complex cells – partial invariance
• In Comp. Vision: histogram-binning methods (SIFT, GIST, Shape Context, etc.) or, equivalently, blurring (e.g. Geometric Blur) -- will discuss later
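The partial invariance that histogram binning gives can be shown with a toy spatial histogram. This is a sketch of the binning idea only, not SIFT or any of the named descriptors; the cell size, grid size, and feature locations are made-up values.

```python
def cell_histogram(points, cell=4, grid=2):
    """Count feature locations in each cell of a grid x grid layout.

    Each cell covers cell x cell pixels; points are (x, y) pixel coordinates.
    """
    hist = [[0] * grid for _ in range(grid)]
    for x, y in points:
        hist[y // cell][x // cell] += 1
    return hist


features = [(1, 1), (4, 2), (2, 6)]
small_shift = [(x + 1, y) for x, y in features]   # 1-pixel translation
large_shift = [(x + 3, y) for x, y in features]   # 3-pixel translation

h_original = cell_histogram(features)
h_small = cell_histogram(small_shift)   # same as h_original
h_large = cell_histogram(large_shift)   # differs from h_original
```

A small shift leaves every feature in its original cell, so the descriptor does not change. A larger shift moves features across cell boundaries and the descriptor changes, which is the "yes, but not too much" behavior.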
Many lives of a boundary
Often, context-dependent…
input | Canny | human
Maybe low-level is never enough?
1/f amplitude spectra for natural images
(Field 1987)
There are statistical regularities in the natural world, and image statistics reflect that. (Burton & Moorhead 1987; Field 1987; Tolhurst et al. 1992)
Why 1/f?
Scale invariance
Edges have 1/f structure
Object distribution in real world (Ruderman 1997; Lee & Mumford 1999)
(Image source: smokiesguidebook.com; Slide content: Simoncelli & Olshausen 2001)
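The claim that edges have 1/f structure can be checked numerically in one dimension. The sketch below is an assumption-laden simplification: it uses a 1-D periodic step edge and a direct DFT rather than a 2-D natural image, and the signal length of 256 is arbitrary.

```python
import cmath


def amplitude_spectrum(signal):
    """Amplitude of the discrete Fourier transform for frequencies 0 .. n/2 - 1."""
    n = len(signal)
    return [abs(sum(s * cmath.exp(-2j * cmath.pi * k * t / n)
                    for t, s in enumerate(signal)))
            for k in range(n // 2)]


n = 256
edge = [0.0] * (n // 2) + [1.0] * (n // 2)   # periodic step edge
amp = amplitude_spectrum(edge)

# If amplitude falls off as 1/f, then amplitude * f stays roughly constant.
# (Even frequencies vanish for this symmetric step, so only odd ones are shown.)
for k in (1, 3, 5, 9):
    print(k, amp[k] * k)
```

For the odd frequencies the product of amplitude and frequency is nearly constant, i.e. the amplitude of a step edge falls off approximately as 1/f at low frequencies.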
A closer look at amplitude spectra
(Torralba & Oliva 2003)
Do natural image statistics matter?
Sensory coding might exploit statistical regularities of our world according to various criteria:
• Representational efficiency: Decorrelate input responses, make them independent, sparse, information theoretic metrics etc.
• Metabolic efficiency: Spike efficiency, minimal wiring.
• Learning efficiency: Sparseness, invariance, overcompleteness etc.
Lots and lots of work; see reviews Graham & Field (2007), Simoncelli & Olshausen (2001)