Linguistic Spatial Reasoning
Jim Keller
Electrical and Computer Engineering Department
University of Missouri-Columbia
I get by with a lot of help from my friends….
A large group of faculty and students at the University of Missouri, the University of Florida, the University of Notre Dame, and the University of Guelph
Our Work
• Language Generation
– Modeling of spatial relationships
– Natural scene understanding
• Communication: human and machine
• Text 2 Sketch
• Human Geography
• Summarization – Most of the night, Mary had medium restlessness and low motion
• Common Theme
– Generation of “un-natural” natural language
– Modeling and matching based on linguistic statements
– Users of natural language understanding systems
Outline
• Linguistic Scene Description (External)
• Linguistic Scene Description (Ego Centered)
• Human/Robot Dialog
• Sketched Route Understanding
• Text-to-Sketch
– Inverse of above
Scene Description
“B is to the right of A”?
[Figure: two objects, A and B]
• Natural scene understanding is an important aspect of computer vision
• Spatial relations among image objects play a vital role in the description of a scene
Scene Description
• Human intuition varies considerably
– Vague concepts of what spatial relationships should mean
– Uncertainty of how they can model differing human perceptions
• Using fuzzy set-based definitions supports David Marr's “Principle of Least Commitment”
• There has been considerable argument about the “best” method
– Much of the debate centered on human intuition
– Interesting results in Pattern Recognition applications
• Need flexible mechanisms that can be tailored
– Perhaps to individual human experts
[Figure: objects A and B, with the angle between them ranging from −180° to +180°]
Histograms of Forces (Matsakis)
• Elegant theory
• Much more flexible than angles
• Incorporates metric information
Truth Membership of “A is RIGHT of B”
F^AB is called the F-histogram associated with (A,B). It is divided into 4 quadrants, distinguishing contradictory, compensatory, and effective forces.
• The average direction b_r(RIGHT) of the effective forces is computed
• The degree of truth a_r(RIGHT) of the proposition “A is to the right of B” is computed by applying a membership function to that average direction: a_r(RIGHT) = μ(b_r(RIGHT))
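As a concrete (and much simplified) illustration of the histogram idea, the sketch below builds a plain angle histogram between two point sets and aggregates it through a cos²-shaped membership function for RIGHT. The function names, the membership shape, and the absence of force weighting are all simplifying assumptions; this is not the actual F-histogram computation.

```python
import math

def angle_histogram(a_pts, b_pts, n_bins=72):
    """Histogram of directions from points of B to points of A.
    A crude stand-in for the F-histogram: every point pair counts
    equally, with no force (distance) weighting."""
    hist = [0.0] * n_bins
    for ax, ay in a_pts:
        for bx, by in b_pts:
            theta = math.atan2(ay - by, ax - bx)  # direction from B to A
            k = int((theta + math.pi) / (2 * math.pi) * n_bins) % n_bins
            hist[k] += 1.0
    return hist

def mu_right(theta):
    """Fuzzy membership of a direction in RIGHT (peak at theta = 0)."""
    return max(0.0, math.cos(theta)) ** 2

def degree_right(a_pts, b_pts, n_bins=72):
    """Histogram-weighted average membership: the degree to which
    'A is to the right of B' holds."""
    hist = angle_histogram(a_pts, b_pts, n_bins)
    total = sum(hist)
    if total == 0.0:
        return 0.0
    deg = 0.0
    for k, w in enumerate(hist):
        theta = -math.pi + (k + 0.5) * 2.0 * math.pi / n_bins
        deg += w * mu_right(theta)
    return deg / total

# A 3x3 blob strictly to the right of B: degree close to 1
A = [(10 + i, j) for i in range(3) for j in range(3)]
B = [(i, j) for i in range(3) for j in range(3)]
print(round(degree_right(A, B), 2))
```

Swapping the point masks and re-evaluating yields a degree near 0, as expected for the reversed relation.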
Features for Linguistic Description
• Compute:
1. The primary direction [d1 = a(1)], which produces the basic directional term (e.g., RIGHT of)
2. The secondary direction [d2 = a(2)], which supplements the description (e.g., “but a little above”)
• m1 and m2 specify how good the description is for each direction – a combination of F0 and F2
[Figure: force histograms, including one computed with a gravitational constant]
1. Each histogram gives its opinion about the relative position between the objects that are considered.
2. The two opinions are combined. Four numeric and two symbolic features result from this combination.
3. A system of 27 fuzzy rules and meta-rules allows meaningful linguistic descriptions to be produced.
Linguistic Scene Description
Fuzzy rules for the primary direction (hedge chosen from d1 and m1; numbered superscripts refer to meta-rules):

d1 \ m1      | high       | medium-high | medium-low  | low
high         | perfectly  | --(1)       | nearly (1)  | no primary direction
medium-high  | --(2)      | nearly (2)  | loosely (1) | no primary direction
medium-low   | mostly     | loosely (2) | loosely (3) | no primary direction
low          | no primary direction
Examples of Fuzzy Rules and Meta-Rules
Each linguistic output uses hedges from a dictionary of about thirty adverbs and other terms that can be tailored to individual users
Fuzzy Rule-based System
Fuzzy rules for the secondary direction (hedge chosen from d2 and m2):

d2 \ m2 | high     | medium   | low
high    | somewhat | strongly | no secondary direction
medium  | a little | slightly | no secondary direction
low     | no secondary direction
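A minimal sketch of how such rule tables turn the four features into a hedged sentence. The cutpoints, hedge assignments, and function names below are illustrative stand-ins for the actual 27-rule fuzzy rule base, not a reproduction of it.

```python
def describe(d1, m1, d2, m2, dir1="RIGHT", dir2="ABOVE"):
    """Toy lookup over primary/secondary hedge tables.
    d1, m1, d2, m2 are scores in [0, 1]."""
    def level(x):
        # Illustrative crisp cutpoints; the real system is fuzzy.
        if x < 0.25:
            return "low"
        if x < 0.75:
            return "medium"
        return "high"

    primary = {("high", "high"): "perfectly to the {d}",
               ("high", "medium"): "nearly to the {d}",
               ("medium", "high"): "mostly to the {d}",
               ("medium", "medium"): "loosely to the {d}"}
    secondary = {("high", "high"): "but somewhat {d}",
                 ("high", "medium"): "but strongly {d}",
                 ("medium", "high"): "but a little {d}",
                 ("medium", "medium"): "but slightly {d}"}

    rule = primary.get((level(d1), level(m1)))
    if rule is None:
        return "no clear primary direction"
    out = rule.format(d=dir1)
    hedge = secondary.get((level(d2), level(m2)))
    if hedge is not None:
        out += " " + hedge.format(d=dir2)
    return out

print(describe(0.6, 0.9, 0.8, 0.8))  # -> mostly to the RIGHT but somewhat ABOVE
```

When the secondary scores fall in the “low” band, the hedge is simply dropped, mirroring the “no secondary direction” cells of the table.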
We used LADAR range images of the power plant at China Lake, CA. They were processed by applying a median filter and a pseudo-intensity filter. The filtered images were segmented and labeled manually.
Linguistic Scene Description Example
To the LEFT, but a little ABOVE.
The system handles a rich language to describe the spatial organization of scene regions. It produces good intuitive results.
The system here describes the relative position of the red object with respect to the group of buildings (in blue).
Linguistic Scene Description Example
Linguistic Spatial Scene Description
There are 5 missile launchers (1, 2, 3, 6, 8). They surround a center vehicle (4). The image includes a SAM site. A convoy of vehicles (5, 7, 9, 10) is below-right of the SAM site.
Pseudo-intensity image of surface-to-air missile (SAM) site with convoy
Objects detected and automatically labeled by extended ATR system
Output of Scene Description Fuzzy Rule Base (from initial system)
Horizontal descriptions:
“The fish can is to the left of the truck box”
“The truck box is to the right of the fish can but extends to the front”
“The candy box is to the left of the truck box”
“The truck box is to the right of the candy box but extends to the front”
“The candy box is surrounded” and “The fish can is surrounded.”

Vertical descriptions:
“The fish can is to the left of the truck box”
“The truck box is to the right of the fish can”
“The truck box is to the right of the candy box”
“The candy box is on top of the fish can”
“The fish can is below the candy box.”

Near descriptions:
“The fish can and the candy box are near”
“The fish can and the truck box are somewhat near”
“The candy box and the truck box are somewhat near.”
An extension to the third Dimension
Outline
• Linguistic Scene Description (External)
• Linguistic Scene Description (Ego Centered)
• Human/Robot Dialog
• Sketched Route Understanding
• Text-to-Sketch
– Inverse of above
Human/Robot Dialog
• Spatial reasoning incorporated into NRL's Natural Language Understanding System for mobile robots
• Sensed data results in a “grid map” that displays occupancy of cells (which doesn't need to be binary)
Grid map after component labeling – robot heading towards Object 5
Generating Spatial Descriptions from Robot Sensors (here, NOMAD range sensors)
[Figure: processing pipeline from range data through grouping and directional relationships (force histograms, including a gravitational constant) to features and fuzzy rules]
DETAILED SPATIAL DESCRIPTIONS for 6 OBJECTS:
• Object 1 is mostly behind me but somewhat to the right (the description is satisfactory). The object is very close.
• Object 2 is behind me (the description is satisfactory). The object is very close.
• Object 3 is to the left of me but extends to the rear relative to me (the description is satisfactory). The object is very close.
• Object 4 is mostly to the right of me but somewhat forward (the description is satisfactory). The object is very close.
• Object 5 is in front of me (the description is satisfactory). The object is very close.
• Object 6 is to the left-front of me (the description is satisfactory). The object is close.
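The egocentric part of these descriptions can be sketched as follows: compute the object's bearing relative to the robot's heading, bin it into a direction phrase, and add a distance qualifier. The angular sectors, distance thresholds, and function name are illustrative assumptions; the actual system uses fuzzy memberships over the grid map rather than crisp cutoffs.

```python
import math

def ego_description(robot_xy, heading_deg, obj_xy,
                    close=3.0, very_close=1.5):
    """Crisp sketch of an egocentric spatial description.
    heading_deg: robot heading in degrees (counterclockwise from +x)."""
    dx = obj_xy[0] - robot_xy[0]
    dy = obj_xy[1] - robot_xy[1]
    dist = math.hypot(dx, dy)
    # Bearing relative to heading, wrapped into (-180, 180]
    rel = (math.degrees(math.atan2(dy, dx)) - heading_deg + 180) % 360 - 180
    if -45 <= rel <= 45:
        where = "in front of me"
    elif 45 < rel <= 135:
        where = "to the left of me"
    elif -135 <= rel < -45:
        where = "to the right of me"
    else:
        where = "behind me"
    if dist <= very_close:
        how_far = "very close"
    elif dist <= close:
        how_far = "close"
    else:
        how_far = "not close"
    return f"The object is {where}. The object is {how_far}."

# Robot at origin facing "north" (90 deg), object two cells ahead
print(ego_description((0, 0), 90, (0, 2)))
# -> The object is in front of me. The object is close.
```

Replacing the crisp sectors with overlapping fuzzy memberships is what lets the real system say “mostly behind me but somewhat to the right.”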
Scene 1
Human-Driven Spatial Language for Human-Robot Interaction
• Investigating spatial language for an eldercare scenario in the home
– An elderly resident has lost an object; the robot will help the resident to find it.
– EXAMPLE: “The eyeglasses are behind the lamp on the table to the left of the bed in the bedroom”
• Start with human-subject experiments; Develop spatial language algorithms to match the results
• NSF project # IIS-1017097
– U of Missouri (Marge Skubic)
– U of Notre Dame (Laura Carlson)
The virtual scene used for the human subject experiments
• Robot addressee vs. Human addressee
• Vary the speaker-addressee alignment
• Where to find it vs. how to find it
• Vary candidate reference objects
[Figure: virtual scene with a hallway, living room, and bedroom]
Example “where” Statements
• the key is on the white table in the room to his left
• the book is on the wooden table in the back of the room to his right
• the wallet's in the right room behind the bed on the table next to the lamp and the plant
• the glasses case is down the hall in the right room -on the right side of the room as you enter, it's between the two chairs on a table next to a statue
• the mug's down the hall in the left room on a table on the light brown table in front of the couch next to a purse and a hat
Example “how” Statements
• move forward into the intersection, look left, move to the table at the far end of the room directly across from where he is and then look down and he'll find the car keys
• if you turn around and then walk left or walk forward about two steps then walk left into the bedroom, walk forward until you have passed the bed on your left, turn left around the foot of the bed, proceed forward, turn left again, walk forward towards the wall near the head of the bed, look down on the bedside table and there’s the object.
Outline
• Linguistic Scene Description (External)
– Results in Scene Matching
• Linguistic Scene Description (Ego Centered)
• Human/Robot Dialog
• Sketched Route Understanding
• Text-to-Sketch
– Inverse of above
Extracting a Navigation Path from a Sketch
Start: Move forward
When Object #3 is loosely to the left-front Then Turn right
Then Turn right
When Object #3 is to the left Then Move forward
When Object #4 is mostly in front Then Turn left
When Object #4 is to the right Then Move forward
When Object #4 is to the right and Object #5 is in front Then Stop
Robot path
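The extracted path above is naturally a list of (condition, action) rules that a controller steps through as perceived relations change. The encoding below is a hypothetical sketch of that idea; the dictionary keys, the relation strings, and the polling interface are assumptions, not the actual system's representation.

```python
# Route extracted from the sketch, as (condition, action) rules.
# A controller polls perceived relations (object id -> relation string)
# and advances to the next rule when the current condition fires.
route = [
    ({"object": 3, "relation": "loosely left-front"}, "turn right"),
    ({"object": 3, "relation": "left"},               "move forward"),
    ({"object": 4, "relation": "mostly in front"},    "turn left"),
    ({"object": 4, "relation": "right"},              "move forward"),
    ({"object": 4, "relation": "right",
      "and_object": 5, "and_relation": "in front"},   "stop"),
]

def next_action(perceived, route, step):
    """Return (action, new_step) if the current rule's condition matches
    the perceived relations, else (None, step) to keep the current behavior."""
    cond, action = route[step]
    if perceived.get(cond["object"]) != cond["relation"]:
        return None, step
    if "and_object" in cond and \
       perceived.get(cond["and_object"]) != cond["and_relation"]:
        return None, step
    return action, step + 1

action, step = next_action({3: "loosely left-front"}, route, 0)
print(action)  # -> turn right
```

In the real system the conditions would be fuzzy degrees from the force-histogram rule base rather than exact string matches.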
Outline
• Linguistic Scene Description (External)
– Results in Scene Matching
• Linguistic Scene Description (Ego Centered)
• Human/Robot Dialog
• Sketched Route Understanding
• Text-to-Sketch
– Inverse of above
University of Missouri
• Consider a frantic (and fictitious) call like this:
– I saw terrorist X slip into a building up ahead. I don't know where I am in the city and I can't read the signs. I'm walking towards the location.
– There is a somewhat long, thin rectangular shaped parking lot that extends forward.
– To the immediate right of that parking lot is a parking garage.
– I see a moderately small rectangular building close to me that is mostly to my left but partially forward.
– Across a 4-way intersection, there is a small rectangular building close to me on my left that extends behind me.
– There is another small rectangular building across the street that is mostly to the front of me, but somewhat to the left.
– A short distance to the right of that building is a small L-shaped office.
– I’ve reached another 4-way intersection, there is a large L-shaped building that extends to the rear. That’s where he entered.
• Can we automatically pinpoint the location?
Motivation: Funded by the National Geospatial-Intelligence Agency
ACADEMIC RESEARCH GRANT
Histograms of forces for the red structure (upper left)

Pipeline:
– Structures go into a database; histograms of forces + fuzzy rules; convert to graph
– Natural language text descriptions → IHMC parser → build a graphical sketch to match; convert to graph
– MATCHING

Best approach to date: Evolutionary Computation algorithm
New approach: Hybrid algorithm with the EC basic structure, plus subgraph isomorphism applied locally in the database in addition to mutation
MU Text-to-Sketch System
Syntactic parse tree for: “There is a large rectangular building to my left.”
Corresponding logical form graph of the deep semantic parse
J. Allen, M. Swift, and W. de Beaumont, “Deep Semantic Analysis of Text,” Proc. Symp. on Semantics in Systems for Text Processing, 2008.
J. Allen, Natural Language Understanding, 2nd ed., Redwood City, CA: Benjamin-Cummings, 1995.
Automatically extract objects and relationships from IHMC’s deep semantic parse
Right now, language variation is restricted
Some Details: Parser
Example of “directly to the front,” which produces a constrained search region, combined with the distance descriptor “somewhat close.”
A compound linguistic description: “perfectly to the left of the blue building” and “mostly to the right, but somewhat above the red building.”
An illustration of “somewhat to the right,” which produces a rather large search region.
Building blocks: Intelligent object placement scheme
Too expensive to use guided random search only – Directional Fuzzy Templates
I. Sledge and J. Keller, “Mapping natural language to imagery: Placing objects intelligently,” Proc. IEEE Int. Conf. Fuzzy Systems, Jeju Island, Korea, August 2009, pp. 518–524 (best student paper award).
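The placement idea can be sketched as scoring candidate positions against fuzzy templates for each constraint and taking the best-scoring cell. The membership shapes, parameter values, and function names below are toy assumptions for illustration; they are not the Sledge–Keller templates themselves.

```python
import math

def mu_right_of(ref, cand):
    """Toy directional template: 1 directly right of ref,
    falling off with angular deviation."""
    theta = math.atan2(cand[1] - ref[1], cand[0] - ref[0])
    return max(0.0, math.cos(theta)) ** 2

def mu_somewhat_close(ref, cand, ideal=5.0, spread=3.0):
    """Toy triangular distance membership peaking at `ideal`."""
    d = math.dist(ref, cand)
    return max(0.0, 1.0 - abs(d - ideal) / spread)

def place(ref, grid=20):
    """Place an object at the grid cell maximizing the min-combined
    membership of both constraints (a conjunctive fuzzy combination)."""
    best, best_score = None, -1.0
    for x in range(-grid, grid + 1):
        for y in range(-grid, grid + 1):
            s = min(mu_right_of(ref, (x, y)),
                    mu_somewhat_close(ref, (x, y)))
            if s > best_score:
                best, best_score = (x, y), s
    return best

print(place((0, 0)))  # -> (5, 0): directly right, at the ideal distance
```

Using the template scores to rank candidates first is what avoids the cost of a purely random guided search.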
“To the immediate right of that parking lot is a large parking garage that is the same length as the parking lot.”
Fuzzy Initial Placement template
Initial Placement → Optimized Placement
To the right but a little bit forward
Perfectly to the right
Linguistic DescriptionsFrom fuzzy rule base
Example of Sketch Creation After Parsing
(a) “There is a somewhat long, thin rectangular shaped parking lot that extends forward relative to me on my right.”
(b) “To the immediate right of that parking lot is a large parking garage that is the same length as the parking lot.”
(c) “I see a moderately small rectangular building close to me that is mostly to my left but partially forward.”
(d) “Travelling to a 4-way intersection, there is a small rectangular building close to me on my left that extends behind me.”
(e) “There is another small rectangular building across the street that is mostly to the front of me, but somewhat to the left.”
(f) Left image: final sketch (after a few more descriptions). Right image: ground truth recovered from the matching algorithm.
Sketch Building Example
Evolutionary Computation Approach
[Figure: target histogram set F^AB(θ), θ from −180° to +180°]
Sketch converted to chromosome structure: building = gene
Each gene is attributed with its histograms of forces
Initial generation & fitness evaluation of chromosomes
[Figure: search histogram set F^AB(θ), rotated to the optimal angle; blue curves are the query HoFs, red are those of the current chromosome]
Add n random chromosomes to the population over the geospatial database
Each gene is attributed with its histograms of forces
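A minimal sketch of the EC machinery described above: a chromosome assigns database buildings to sketch genes, fitness compares the query's pairwise force histograms with those of the selected buildings, and mutation reassigns genes. The data layout, similarity measure, and function names are simplifying assumptions; in particular, the real system also rotates each candidate to its optimal angle before comparing.

```python
import random

def hist_similarity(h1, h2):
    """Normalized intersection of two aligned force histograms."""
    inter = sum(min(a, b) for a, b in zip(h1, h2))
    total = sum(h1) or 1.0
    return inter / total

def fitness(chromosome, query_hists, db_hists):
    """Average pairwise-histogram match between the query sketch and
    the database buildings selected by the chromosome.
    query_hists: {(gene_i, gene_j): histogram} from the sketch.
    db_hists: {(building_a, building_b): histogram} from the database."""
    scores = []
    for (i, j), qh in query_hists.items():
        ch = db_hists.get((chromosome[i], chromosome[j]))
        scores.append(hist_similarity(qh, ch) if ch else 0.0)
    return sum(scores) / len(scores)

def mutate(chromosome, building_ids, rate=0.2):
    """Point mutation: reassign each gene to a random building
    with probability `rate`."""
    out = list(chromosome)
    for k in range(len(out)):
        if random.random() < rate:
            out[k] = random.choice(building_ids)
    return out

# Tiny worked example: a 2-gene sketch against a 2-building database
query = {(0, 1): [1.0, 0.0, 0.0]}
db = {("A", "B"): [1.0, 0.0, 0.0], ("B", "A"): [0.0, 1.0, 0.0]}
print(fitness(["A", "B"], query, db))  # -> 1.0
```

The hybrid algorithm mentioned above would, in addition to this mutation step, run a local subgraph-isomorphism search in the database to propose better gene assignments.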
Human Geography ProjectUniversity of Missouri and University of Florida
HuGeo-based knowledge
Pattern Recognition Techniques
Simulation and Modeling
GIS: geospatial representation and reasoning, conflation, confidence assessment, Text2Sketch
Simulation and Modeling: math models, Repast, sampling methods, swarm intelligence
Pattern Recognition: uncertainty modeling (probabilistic, fuzzy, belief), classifiers, clustering, ontologies, multi- and hyperspectral analysis, fusion
HuGeo Project Overview
• What's needed/missing in the HuGeo-based Knowledge Discovery component?
– Extraction of concepts and events from blogs, Twitter, news sources, etc.
– Interpretation of sentences
– Placing them geographically
• i.e., we need NLP
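To make the extraction step concrete, here is a toy pattern-based extractor that pulls a (figure, relation, ground) triple from a simple spatial sentence. It is a hypothetical stand-in for a real parser such as the IHMC deep semantic parser mentioned earlier; the pattern, relation list, and function name are all illustrative assumptions, and it only handles single-word grounds.

```python
import re

# Toy extraction of (figure, relation, ground) triples from
# simple spatial sentences.
PATTERN = re.compile(
    r"the (?P<figure>[\w ]+?) (?:is|are|was) "
    r"(?P<relation>behind|in front of|to the left of|to the right of|on|near) "
    r"the (?P<ground>[\w ]+?)(?: |$|\.)")

def extract(sentence):
    """Return (figure, relation, ground) for the first spatial
    relation found, or None if the pattern does not match."""
    m = PATTERN.search(sentence.lower())
    if m is None:
        return None
    return (m.group("figure"), m.group("relation"), m.group("ground"))

print(extract("The eyeglasses are behind the lamp"))
# -> ('eyeglasses', 'behind', 'lamp')
```

Real blog and news text would of course need full parsing plus geocoding of the extracted grounds, which is exactly the NLP gap the slide identifies.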