My Group’s Current Research on Image Understanding
An image-understanding task
Color, Shape, Texture
Simple Segmentation
Low-level vision
Object recognition
High-level perception
Pattern recognition
“Meaning”
Analogy-making
Active Symbol Architecture for high-level perception (Hofstadter et al.)
HMAX model of visual cortex (Riesenhuber, Poggio, et al.)
The “SEMANTIC GAP”
The HMAX model for object recognition (Riesenhuber, Poggio, Serre, et al.)
1. Densely tile the image with windows of different sizes.
2. HMAX features are computed in each window.
3. The features in each window are given as input to the trained support vector machine.
4. If the SVM returns a score above a learned threshold, then the object is said to be “detected”.
Recognition Phase
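The four recognition steps can be sketched as a sliding-window loop. This is only a minimal illustration: `tile_windows`, `detect`, the stride rule, and the `features` and `svm_score` callables are hypothetical stand-ins, not the actual HMAX feature computation or the trained SVM.

```python
from typing import Callable, List, Sequence, Tuple

Window = Tuple[int, int, int]  # (top row, left column, side length)


def tile_windows(height: int, width: int, sizes: Sequence[int],
                 stride_fraction: float = 0.5) -> List[Window]:
    """Step 1: densely tile the image with square windows of several sizes.

    The stride (a fraction of the window size) is an assumption made for
    this sketch.
    """
    windows = []
    for size in sizes:
        stride = max(1, int(size * stride_fraction))
        for row in range(0, height - size + 1, stride):
            for col in range(0, width - size + 1, stride):
                windows.append((row, col, size))
    return windows


def detect(image: List[List[float]], sizes: Sequence[int],
           features: Callable, svm_score: Callable,
           threshold: float) -> List[Tuple[Window, float]]:
    """Steps 2-4: compute features per window, score them, keep detections."""
    height, width = len(image), len(image[0])
    detections = []
    for row, col, size in tile_windows(height, width, sizes):
        patch = [r[col:col + size] for r in image[row:row + size]]
        score = svm_score(features(patch))      # stand-ins for HMAX + SVM
        if score > threshold:                   # learned threshold in the real system
            detections.append(((row, col, size), score))
    return detections


def mean_intensity(patch):
    """Toy stand-in for a feature vector: the mean pixel value."""
    return sum(map(sum, patch)) / (len(patch) * len(patch[0]))


# Toy image: a bright 2x2 block in the bottom-right corner.
toy_image = [[0, 0, 0, 0],
             [0, 0, 0, 0],
             [0, 0, 1, 1],
             [0, 0, 1, 1]]
toy_detections = detect(toy_image, sizes=[2], features=mean_intensity,
                        svm_score=lambda f: f, threshold=0.9)
```

Every window at every size is scored, which is the exhaustive search the following slides discuss.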
Streetscenes “scene understanding” system (Bileschi, 2006)
Object detection (here, “car”) with HMAX model (Bileschi, 2006)
Some limitations of the Streetscenes approach to scene understanding
• Requires exhaustive search for object identification and localization
– Exhaustive search over window size and location in the image
– Exhaustive search over object categories (e.g., car, pedestrian, tree)
– Exhaustive use of HMAX features in each window
• Does not recognize spatial and abstract relationships among objects for whole scene understanding
• Has no prior knowledge about object categories and their place in “conceptual space”
• HMAX model is completely feed-forward; no feedback to allow context to aid in scene understanding.
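To get a rough feel for the cost of exhaustive search, the number of classifier evaluations can be counted as windows times categories. The image size, window sizes, stride, and category list below are illustrative assumptions, not figures from the Streetscenes system.

```python
def count_windows(height, width, sizes, stride):
    """Count square windows of the given sizes placed on a regular grid."""
    total = 0
    for size in sizes:
        if size > height or size > width:
            continue  # this window size does not fit in the image
        rows = (height - size) // stride + 1
        cols = (width - size) // stride + 1
        total += rows * cols
    return total


# Hypothetical setup: a 640x480 image, two window sizes, a 16-pixel stride.
windows = count_windows(height=480, width=640, sizes=[64, 128], stride=16)
categories = ["car", "pedestrian", "tree"]
evaluations = windows * len(categories)  # one SVM call per window per category
print(windows, evaluations)  # prints: 1758 5274
```

Even this small configuration needs thousands of feature computations and SVM calls for a single image, and the count grows with every added window size or category.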
Goal of our project
• Perform whole-scene interpretation without exhaustive search.
– Incorporate conceptual knowledge
– Allow feedforward and feedback modes to interact
A Simple Semantic Network (or “Ontology”)
[Figure: semantic network for “Dog walking”. Nodes: Person, Dog, leash, walking. Link labels: holds, attached to, action, action.]
But...
http://www.dogasaur.com/blog/wp-content/uploads/2011/04/dogwalker.jpg
But...
http://www.vet.k-state.edu/depts/development/lifelines/images/dog_jog_1435.jpg
Allowing “conceptual slippage”
[Figure: semantic network for “Dog walking”. Nodes: Person, Dog, Dog Group, leash, walking, running. Link labels: holds, attached to, action, action.]
But...
http://3.bp.blogspot.com/_1YuoCTv4oKQ/S71jUDm7kOI/AAAAAAAAAak/jz4Pg7zzzQ8/s1600/23743577.JPG
http://lh3.ggpht.com/-ZZrYWeBFTjo/SFQH_0ijwaI/AAAAAAAABjA/8nwryW2BmEw/IMG_0356.JPG
[Figure: semantic network for “Dog walking”. Nodes: Person, Dog, Dog Group, Cat, Iguana, Tail, leash, walking, running. Link labels: holds, attached to, action, action.]
But...
http://www.mileanhour.com/post/Dog-walking-bike.aspx
http://cl.jroo.me/z3/Z/e/C/d/a.aaa-Thus-walking-dog.png
http://thedaemon.com/images/DARPA_Segue_Dog.jpg
http://www.bikeforest.com/product45422.jpg
http://www.k9ring.com/blog/image.axd?picture=2010%2F3%2Fwalking_dog_from_car.jpg
http://www.guy-sports.com/fun_pictures/dog_walking_helicopter.jpg
http://static.themetapicture.com/media/funny-dog-walking-horse-leash.jpg
http://macwetblog.files.wordpress.com/2012/05/dog-walking.jpg
[Figure: semantic network for “Dog walking”. Nodes: Person, Dog, Dog Group, Cat, Iguana, Horse, Tail, leash, Car, Helicopter, walking, running, Biking, Driving, Segue-ing, Treadmill-ing. Link labels: holds, attached to, action, action.]
Active Symbol Architecture (Hofstadter et al., 1995)
• Basis for
– Copycat (analogy-making), Hofstadter & Mitchell
– Tabletop (analogy-making), Hofstadter & French
– Metacat (analogy-making and self-awareness), Hofstadter & Marshall
– and many others…
[Figure: architecture diagram with Semantic network, Temperature, and Workspace.]
• Perceptual agents (codelets) are “active symbols”
Petacat:
(Descendant of Copycat, part of the PetaVision project)
Integration of Active Symbol Architecture and HMAX
Initial task:
Decide if image is an instance of “taking a dog for a walk”, and if so, how good an instance it is.
Semantic Network
[Figure: semantic network centered on the situation “taking a dog for a walk”, shown alongside the Workspace.]
• Concept categories: Object, Action, Spatial Relation
• Nodes: taking a dog for a walk, person, dog, cat, horse, leash, rope, belt, string, sidewalk, road, beach, trail, grass, outdoors, indoors, walks, runs, flies, swims, drives, stands, sits
• Spatial-relation nodes: is on, is touching, is in front of, is behind, is next to
• Link labels: has location, has action, has component
• Link types: Property links, Slip links
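One way to picture the two link types is a small graph structure in which property links carry a label and slip links mark concepts that may stand in for one another. The class below is a hypothetical sketch; the particular slip links added (dog/cat, leash/rope, walks/runs) are illustrative guesses, since the figure's exact links are not recoverable from the slide text.

```python
from collections import defaultdict


class SemanticNetwork:
    """Toy semantic network with labeled property links and slip links."""

    def __init__(self):
        self.property_links = defaultdict(list)  # node -> [(label, target)]
        self.slip_links = defaultdict(set)       # node -> nodes it may slip to

    def add_property(self, source, label, target):
        self.property_links[source].append((label, target))

    def add_slip(self, a, b):
        # Slip links are treated as symmetric in this sketch.
        self.slip_links[a].add(b)
        self.slip_links[b].add(a)

    def match(self, expected, observed):
        """Return 'exact', 'slippage', or None if the concepts do not match."""
        if expected == observed:
            return "exact"
        if observed in self.slip_links.get(expected, set()):
            return "slippage"
        return None


net = SemanticNetwork()
situation = "taking a dog for a walk"
for component in ("person", "dog", "leash"):
    net.add_property(situation, "has component", component)
net.add_property(situation, "has location", "outdoors")
net.add_property("person", "has action", "walks")
net.add_property("dog", "has action", "walks")

# Assumed slip links, for illustration only.
net.add_slip("dog", "cat")
net.add_slip("leash", "rope")
net.add_slip("walks", "runs")
```

With this structure, a rope found where a leash was expected still counts as a match, but it is flagged as a slippage rather than an exact fit.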
Temperature
• Measures how well organized the program’s “understanding” is as processing proceeds
– Little organization → high temperature
– Lots of organization → low temperature
• Temperature feeds back to affect perceptual agents:
– High temperature → low confidence in decisions → decisions are made more randomly
– Low temperature → high confidence in decisions → decisions are made more deterministically
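The feedback from temperature to decision-making can be illustrated with a temperature-scaled choice rule. The Boltzmann-style formula and the 0 to 100 temperature range below are assumptions made for this sketch; the slides do not specify how the program actually maps temperature to randomness.

```python
import math


def choice_probabilities(scores, temperature):
    """Probability of picking each option, given its score and the temperature.

    Assumes temperature lies in (0, 100]. High temperature flattens the
    distribution (more random); low temperature concentrates it on the
    best-scoring option (more deterministic).
    """
    tau = max(temperature, 1e-6) / 100.0
    best = max(scores)
    # Subtract the best score before exponentiating for numerical stability.
    weights = [math.exp((s - best) / tau) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]


scores = [0.8, 0.4]                       # e.g. strengths of two competing options
hot = choice_probabilities(scores, 100)   # little organization
cold = choice_probabilities(scores, 1)    # lots of organization
```

At temperature 100 the stronger option wins only about 60% of the time, while at temperature 1 it wins essentially always.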
Input image
Weak segmentation
Location “heat map” (probability distribution over pixel locations)
Scale “heat map” (probability distribution over scales at each pixel location)
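Treating the heat maps as probability distributions means a window can be chosen by sampling instead of by exhaustive scanning. The sketch below assumes the maps are plain nested lists; the function names and the toy values are hypothetical.

```python
import random


def normalize(heat):
    """Turn a grid of non-negative weights into a probability distribution."""
    total = sum(sum(row) for row in heat)
    if total <= 0:
        raise ValueError("heat map has no mass")
    return [[v / total for v in row] for row in heat]


def sample_location(heat, rng):
    """Sample a (row, col) pixel location in proportion to its heat."""
    probs = normalize(heat)
    cells = [(r, c) for r in range(len(probs)) for c in range(len(probs[0]))]
    weights = [probs[r][c] for r, c in cells]
    return rng.choices(cells, weights=weights, k=1)[0]


def sample_scale(scale_map, location, scales, rng):
    """Sample a window scale from the per-pixel distribution over scales."""
    row, col = location
    return rng.choices(scales, weights=scale_map[row][col], k=1)[0]


rng = random.Random(0)
location_heat = [[0.0, 0.0],
                 [0.0, 5.0]]              # all mass on the bottom-right pixel
scales = [32, 64]
scale_heat = [[[0.5, 0.5], [0.5, 0.5]],
              [[0.5, 0.5], [0.0, 1.0]]]   # bottom-right pixel prefers scale 64
location = sample_location(location_heat, rng)
scale = sample_scale(scale_heat, location, scales, rng)
```

Locations and scales with more heat are proposed more often, so attention concentrates where objects are expected without ruling other places out.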
Scout codelets: Send C1 features in window to corresponding SVM. If positive result, post builder codelet with urgency equal to SVM’s confidence.
[Figure: candidate windows labeled Dog?, Dog?, Person?, Sidewalk?, Outdoors?, Dog?]
Scout results:
– Dog? negative
– Dog? negative
– Sidewalk? positive: 0.4
– Person? negative
– Outdoors? positive: 0.7
– Dog? positive: 0.8
Builder codelets: Ask HMAX to compute C2 features using prototype shapes specific to the object class, and send them to corresponding SVM. If positive, decide to build structure with probability equal to SVM confidence. Break competing structures if necessary.
[Figure: workspace showing the structures Outdoors and Dog.]
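The scout and builder steps can be sketched as two small functions sharing a queue of pending codelets. Everything named here (`BuilderCodelet`, `run_scout`, `run_builder`, the stub confidence tables) is hypothetical; the stubs stand in for the C1 and C2 SVMs, and breaking competing structures is left out.

```python
import random
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class BuilderCodelet:
    label: str
    window: Tuple[int, int, int]
    urgency: float


def run_scout(label, window, c1_confidence: Callable,
              coderack: List[BuilderCodelet]) -> Optional[BuilderCodelet]:
    """Query the (stubbed) C1 SVM; on a positive result post a builder."""
    confidence = c1_confidence(label, window)   # None means a negative result
    if confidence is None:
        return None
    builder = BuilderCodelet(label, window, urgency=confidence)
    coderack.append(builder)
    return builder


def run_builder(builder, c2_confidence: Callable, workspace: list,
                rng: random.Random) -> bool:
    """Query the (stubbed) C2 SVM; build with probability equal to confidence."""
    confidence = c2_confidence(builder.label, builder.window)
    if confidence is None:
        return False
    if rng.random() < confidence:
        workspace.append((builder.label, builder.window))
        return True
    return False


# Stub confidences echoing the slide's scout results (window coordinates are made up).
c1_table = {("dog", (0, 0, 32)): None,
            ("sidewalk", (40, 0, 64)): 0.4,
            ("outdoors", (0, 0, 128)): 0.7,
            ("dog", (60, 20, 32)): 0.8}
coderack: List[BuilderCodelet] = []
for (label, window) in c1_table:
    run_scout(label, window, lambda l, w: c1_table[(l, w)], coderack)

workspace: list = []
rng = random.Random(0)
built = run_builder(coderack[-1], lambda l, w: 1.0, workspace, rng)
```

Only positive scouts leave anything behind, and a builder's urgency is just the scout's confidence, so stronger evidence gets followed up more readily.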
Object-specific heat maps are updated.
As codelets build structure, heat maps are continually updated to reflect prior (learned) expectations about location and scale as a function of location and scale of “built” objects.
[Figure: Person heat map updated relative to the built Dog; new candidate windows labeled Person?, Person?]
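A heat-map update of this kind can be sketched as reweighting the current map by an expectation centered at an offset from the built object. The Gaussian form, the offset, and the width below are assumptions made for illustration; in the described system these expectations are learned, and scale would be updated as well.

```python
import math


def update_heat_map(heat, built_location, expected_offset, sigma):
    """Reweight a location heat map given where another object was built.

    Each cell is multiplied by a Gaussian centered at
    built_location + expected_offset, then the map is renormalized.
    """
    center_r = built_location[0] + expected_offset[0]
    center_c = built_location[1] + expected_offset[1]
    updated = []
    for r, row in enumerate(heat):
        new_row = []
        for c, value in enumerate(row):
            dist2 = (r - center_r) ** 2 + (c - center_c) ** 2
            new_row.append(value * math.exp(-dist2 / (2 * sigma ** 2)))
        updated.append(new_row)
    total = sum(map(sum, updated))
    if total <= 0:
        raise ValueError("updated heat map has no mass")
    return [[v / total for v in row] for row in updated]


# Uniform 5x5 person map; a dog was built at (3, 1). The assumed prior says a
# person tends to appear one row up and one column to the right of the dog.
person_heat = [[1.0] * 5 for _ in range(5)]
person_heat = update_heat_map(person_heat, built_location=(3, 1),
                              expected_offset=(-1, 1), sigma=1.0)
```

After the update, person scouts sampled from this map will mostly look near the dog rather than uniformly across the image.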
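[Figure: workspace with Dog and Outdoors; candidate windows labeled Dog?, Leash?, Leash?, Sidewalk?, Person?, Person?]
[Figure: workspace with Dog, Outdoors, and Sidewalk; two Person candidates with Strength: 0.75 and Strength: 0.6.]
[Figure: workspace with Dog, Person, Outdoors, and Sidewalk.]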
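[Figure: workspace with Dog, Person, Outdoors, and Sidewalk; candidate windows labeled Leash?, Leash?, Dog?, Sidewalk?, Dog?, Rope?]
[Figure: workspace with Dog, Person, Outdoors, Sidewalk, and Leash; second-dog candidates labeled Dog (weak) and Dog (strong).]
[Figure: workspace with Dog, Dog, Person, Outdoors, Sidewalk, and Leash.]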
Once objects begin to be built, relation and grouping codelets can run on them.
[Figure: workspace with Dog, Dog, Person, Outdoors, Sidewalk, and Leash; relations labeled “is next to” and a grouping labeled “Dog group”.]
How Petacat makes a final decision
• “Situation” codelet is more likely to run when temperature is low.
• Situation codelet tries to match prototypical situation with existing workspace structures, possibly allowing slippages.
[Figure: prototypical situation “taking a dog for a walk” (nodes: person, dog, leash, outdoors; link labels: has component, has component, has component, has location, is next to, is in front of) matched against the workspace (Dog, Dog, Dog group, Person, Leash, Sidewalk, Outdoors, “is next to” relations), with Temperature shown.]
• If resulting temperature is low enough, classify scene as positive.
• If situation codelet fails enough times or does not run for a long time, program has increasing chance of ending with negative classification.
• Temperature at the end of the run gives a measure of how good an instance the picture is (e.g., of the “dog walking” situation).
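The decision rule described above can be sketched as a loop. The 0 to 100 temperature range, the linear run probability, the threshold, and the give-up rates are all assumptions chosen for illustration; `temperature_at` and `try_match` are hypothetical callables standing in for the rest of the program.

```python
import random
from typing import Callable, Optional, Tuple


def decide(temperature_at: Callable[[int], float],
           try_match: Callable[[int], Optional[float]],
           positive_threshold: float = 30.0,
           max_steps: int = 200,
           rng: Optional[random.Random] = None) -> Tuple[bool, float]:
    """Return (is_positive, final_temperature) for one run.

    try_match(step) returns the temperature after a successful match of the
    prototypical situation, or None if the match fails.
    """
    rng = rng or random.Random(0)
    failures = 0
    idle_steps = 0
    temperature = temperature_at(0)
    for step in range(max_steps):
        temperature = temperature_at(step)
        # The situation codelet is more likely to run when temperature is low.
        if rng.random() < 1.0 - temperature / 100.0:
            idle_steps = 0
            result = try_match(step)
            if result is not None and result <= positive_threshold:
                return True, result          # low enough: classify as positive
            failures += 1                    # failed match or still too "hot"
        else:
            idle_steps += 1
        # Chance of ending negatively grows with failures and with idle time.
        p_quit = min(1.0, 0.02 * failures + 0.005 * idle_steps)
        if rng.random() < p_quit:
            return False, temperature
    return False, temperature


good = decide(lambda step: 0.0, lambda step: 10.0)
never_runs = decide(lambda step: 100.0, lambda step: 10.0)
always_fails = decide(lambda step: 0.0, lambda step: None)
```

The returned temperature doubles as the graded answer: a positive result with a final temperature of 10 would be a better instance than one that barely passes the threshold.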