Linguistic Spatial Reasoning

Jim Keller

Electrical and Computer Engineering Department

University of Missouri-Columbia

I get by with a lot of help from my friends….

Big group of faculty and students at the University of Missouri, the University of Florida, Notre Dame University, and the University of Guelph

Our Work

• Language Generation

– Modeling of spatial relationships

– Natural scene understanding

• Communication: human and machine

• Text 2 Sketch

• Human Geography

• Summarization

– Most of the night, Mary had medium restlessness and low motion

• Common Theme

– Generation of “un-natural” natural language

– Modeling and matching based on linguistic statements

– Users of natural language understanding systems

Outline

• Linguistic Scene Description (External)

• Linguistic Scene Description (Ego Centered)

• Human/Robot Dialog

• Sketched Route Understanding

• Text-to-Sketch

– Inverse of above

Scene Description

“B is to the right of A”?

[Figure: two image objects labeled A and B]

• Natural scene understanding is an important aspect of computer vision

• Spatial relations among image objects play a vital role in the description of a scene

Scene Description

• Human intuition varies considerably

– Vague concepts of what spatial relationships should mean

– Uncertainty of how they can model differing human perceptions

• Using fuzzy set-based definitions supports the “Principle of Least Commitment” of David Marr

• Considerable amount of argument about the "best" method

– Much of the debate centered on human intuition

– Interesting results in Pattern Recognition applications

• Need flexible mechanisms that can be tailored

– Perhaps to individual human experts

Relative position between 2-D objects

The histogram of angles

[Figure: objects A and B with a direction vector v, and their histogram of angles; horizontal axis: angle, from -180° to +180°]
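To make the histogram of angles concrete, here is a minimal sketch under stated assumptions: objects are given as lists of pixel coordinates, each angle is measured from a pixel of B to a pixel of A against the positive x-axis, and the bin width divides 360. The function name `angle_histogram` and the 10° bins are illustrative choices, not taken from the slides.

```python
import math

def angle_histogram(a_pixels, b_pixels, bin_width=10):
    """Normalized histogram of directions from each pixel of B to each pixel of A.

    Bin i covers [-180 + i*bin_width, -180 + (i+1)*bin_width) degrees.
    Assumes bin_width divides 360 (an assumption of this sketch).
    """
    n_bins = 360 // bin_width
    hist = [0.0] * n_bins
    for ax, ay in a_pixels:
        for bx, by in b_pixels:
            if (ax, ay) == (bx, by):
                continue  # coincident pixels have no direction
            theta = math.degrees(math.atan2(ay - by, ax - bx))  # in [-180, 180]
            index = int((theta + 180) // bin_width) % n_bins    # +180 folds onto -180
            hist[index] += 1.0
    total = sum(hist)
    return [count / total for count in hist] if total else hist

# A lies directly along the positive x-axis from B, so every pair falls
# in the bin that starts at 0 degrees (index 18 with 10-degree bins).
h = angle_histogram([(5, 0), (6, 0)], [(0, 0), (1, 0)])
print(h[18])
```

Every pixel pair counts equally here, so distance between the objects has no influence on the result.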

Histograms of Forces (Matsakis)

• Elegant Theory

• Much more flexible than angles

• Incorporates Metric information

Choice of the function

[Figure: force as a function of distance d; gravitational forces between objects A and B]

Force-Histograms

[Figure: force histogram; horizontal axis: angle, from -180° to +180°]
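The role of the force function can be illustrated with a rough pixel-pair approximation. This is a sketch only: it weights each pixel pair by 1/d^r, so r = 0 gives constant forces and r = 2 gives gravitational forces, and it does not reproduce the way the F-histogram is actually computed in Matsakis's theory. The function name `force_histogram` and the binning are assumptions made for illustration.

```python
import math

def force_histogram(a_pixels, b_pixels, r=2.0, bin_width=10):
    """Pixel-pair approximation of a force histogram.

    Each pair (a, b) adds a force 1 / d**r to the bin of the direction
    from b to a. r = 0: constant forces; r = 2: gravitational forces.
    Assumes bin_width divides 360.
    """
    n_bins = 360 // bin_width
    hist = [0.0] * n_bins
    for ax, ay in a_pixels:
        for bx, by in b_pixels:
            dx, dy = ax - bx, ay - by
            d = math.hypot(dx, dy)
            if d == 0:
                continue  # skip coincident pixels (force would be undefined)
            theta = math.degrees(math.atan2(dy, dx))
            index = int((theta + 180) // bin_width) % n_bins
            hist[index] += 1.0 / d ** r
    return hist

near = force_histogram([(2, 0)], [(0, 0)], r=2.0)
far = force_histogram([(4, 0)], [(0, 0)], r=2.0)
print(near[18], far[18])  # the closer pair contributes the larger force
```

With r = 0 both calls would give the same value, which is why only the distance-dependent choices carry metric information.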

Truth Membership of “A is RIGHT of B”

$F^{AB}$ is called the F-histogram associated with (A, B)

Divided into 4 quadrants

[Figure: F-histogram split into quadrants 1 to 4, marking contradictory forces, compensatory forces, and effective forces, with a formula for $b_r(\mathrm{RIGHT})$]

A is RIGHT of B

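• The average direction $\alpha_r(\mathrm{RIGHT})$ of the effective forces is computed

• The degree of truth $a_r(\mathrm{RIGHT})$ of the proposition “A is to the right of B” is computed as:

$$a_r(\mathrm{RIGHT}) = \mu(\alpha_r(\mathrm{RIGHT})) \times b_r(\mathrm{RIGHT})$$

[Figure: plot of the membership function $\mu(\theta)$, with maximum value 1]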
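The degree-of-truth computation can be sketched as a membership value of the average direction of the effective forces, multiplied by the share of forces that are effective. The version below is heavily simplified and rests on assumptions that are not in the slides: it skips the handling of contradictory and compensatory forces and simply treats every force within 90° of RIGHT as effective, and it uses a triangular membership function that is 1 at 0° and 0 at ±90°.

```python
def degree_of_truth_right(hist):
    """Simplified degree of truth for "A is to the right of B".

    hist maps an angle in degrees (0 = RIGHT) to a force magnitude.
    Assumptions of this sketch: forces strictly within (-90, 90) degrees
    are the effective ones, and mu is triangular.
    """
    total = sum(hist.values())
    effective = {t: f for t, f in hist.items() if -90 < t < 90 and f > 0}
    effective_sum = sum(effective.values())
    if total == 0 or effective_sum == 0:
        return 0.0
    # Average direction of the effective forces, weighted by force.
    alpha = sum(t * f for t, f in effective.items()) / effective_sum
    mu = max(0.0, 1.0 - abs(alpha) / 90.0)   # triangular membership (assumed)
    b = effective_sum / total                # share of effective forces
    return mu * b

print(degree_of_truth_right({0: 1.0}))            # all forces point right
print(degree_of_truth_right({0: 1.0, 180: 1.0}))  # half the forces point left
```

The second call returns a lower value than the first because half of the force mass points away from RIGHT.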

Features for Linguistic Description

• Compute

1, the Primary Direction [d1 = a(1)]

Produces Basic Directional Term (e.g., RIGHT of)

2, the Secondary Direction [d2 = a(2)]

Supplements the description (e.g., “but a little above”)

m1 and m2 specify how good the description is for each direction – combination of F0 and F2

[Figure: system diagram in three numbered stages, with constant-force and gravitational-force histograms as inputs]

1. Each histogram gives its opinion about the relative position between the objects that are considered.

2. The two opinions are combined. Four numeric and two symbolic features result from this combination.

3. A system of 27 fuzzy rules and meta-rules allows meaningful linguistic descriptions to be produced.

Linguistic Scene Description

d1 \ m1        high         medium-high   medium-low    low
high           perfectly    -- 1          nearly 1      no primary direction
medium-high    -- 2         nearly 2      loosely 1     no primary direction
medium-low     mostly       loosely 2     loosely 3     no primary direction
low            no primary direction

Examples of Fuzzy Rules and Meta-Rules

Each linguistic output uses hedges from a dictionary of about thirty adverbs and other terms that can be tailored to individual users

Fuzzy Rule-based System

d2 \ m2    high        medium      low
high       somewhat    strongly    no secondary direction
medium     a little    slightly    no secondary direction
low        no secondary direction

Fuzzy rules for secondary direction

Secondary directions
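How hedges turn feature levels into a phrase can be sketched with a plain lookup. The hedge words below appear on the slides, but their assignment to levels in `PRIMARY_HEDGES` and `SECONDARY_HEDGES` is a hypothetical simplification: the real system uses 27 fuzzy rules and meta-rules over two features per direction, not a crisp one-key dictionary.

```python
# Hypothetical, crisp stand-ins for the fuzzy rule tables.
PRIMARY_HEDGES = {
    "high": "perfectly",
    "medium-high": "nearly",
    "medium-low": "loosely",
}
SECONDARY_HEDGES = {
    "high": "somewhat",
    "medium": "a little",
}

def describe(primary, primary_level, secondary=None, secondary_level="low"):
    """Build a hedged phrase such as 'nearly to the left, but a little above'."""
    if primary_level not in PRIMARY_HEDGES:
        return "no primary direction"
    phrase = f"{PRIMARY_HEDGES[primary_level]} {primary}"
    # A secondary direction is only mentioned when its level earns a hedge.
    if secondary and secondary_level in SECONDARY_HEDGES:
        phrase += f", but {SECONDARY_HEDGES[secondary_level]} {secondary}"
    return phrase

print(describe("to the left", "medium-high", "above", "medium"))
```

Because the hedges live in dictionaries, swapping in a different vocabulary for a particular user only means replacing the dictionary values.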

Simple Examples

Green = satisfactory; Orange = rather satisfactory; Red = unsatisfactory

We used LADAR range images of the power plant at China Lake, CA. They were processed by applying a median filter and a pseudo-intensity filter. The filtered images were segmented and labeled manually.

Linguistic Scene Description Example

To the LEFT, but a little ABOVE.

The system handles a rich language to describe the spatial organization of scene regions. It produces good intuitive results.

The system here describes the relative position of the red object with respect to the group of buildings (in blue).

Linguistic Scene Description Example

Linguistic Spatial Scene Description

There are 5 missile launchers (1, 2, 3, 6, 8)

They surround a center vehicle (4)

The image includes a SAM site

A convoy of vehicles (5, 7, 9, 10) is Below-Right of the SAM site

Pseudo-intensity image of surface-to-air missile (SAM) site with convoy

Objects detected and automatically labeled by extended ATR system

Output of Scene Description Fuzzy Rule Base (from initial system)

Horizontal descriptions

“The fish can is to the left of the truck box”

“The truck box is to the right of the fish can but extends to the front”

“The candy box is to the left of the truck box”

“The truck box is to the right of the candy box but extends to the front”

“The candy box is surrounded” and “The fish can is surrounded.”

Vertical descriptions

“The fish can is to the left of the truck box”

“The truck box is to the right of the fish can”

“The truck box is to the right of the candy box”

“The candy box is on top of the fish can”

“The fish can is below the candy box.”

Near descriptions

“The fish can and the candy box are near”

“The fish can and the truck box are somewhat near”

“The candy box and the truck box are somewhat near.”

An extension to the third Dimension

Outline

• Linguistic Scene Description (External)

• Linguistic Scene Description (Ego Centered)

• Human/Robot Dialog

• Sketched Route Understanding

• Text-to-Sketch

– Inverse of above

Human/Robot Dialog

• Spatial Reasoning incorporated into NRL’s Natural Language Understanding System for mobile robots

• Sensed data results in a “grid map” that displays occupancy of cells (doesn’t need to be binary)

Grid map after component labeling – robot heading towards Object 5
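Component labeling of an occupancy grid can be sketched as follows. The slides do not say how the labeling was done, so the breadth-first flood fill, the 8-connectivity, and the 0.5 occupancy threshold are all assumptions of this example, as is the function name `label_components`.

```python
from collections import deque

def label_components(grid, threshold=0.5):
    """Label connected groups of occupied cells in a non-empty grid map.

    grid holds occupancy values (not necessarily binary); cells at or above
    threshold count as occupied. Returns (labels, count), where labels has
    0 for free cells and 1..count for the objects found.
    """
    rows, cols = len(grid), len(grid[0])
    labels = [[0] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] < threshold or labels[r][c]:
                continue
            count += 1
            labels[r][c] = count
            queue = deque([(r, c)])
            while queue:  # breadth-first flood fill over 8 neighbours
                y, x = queue.popleft()
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and grid[ny][nx] >= threshold
                                and not labels[ny][nx]):
                            labels[ny][nx] = count
                            queue.append((ny, nx))
    return labels, count

grid = [[0.9, 0.0, 0.0],
        [0.0, 0.0, 0.7],
        [0.0, 0.0, 0.8]]
labels, count = label_components(grid)
print(count, labels)
```

Each labeled component can then be treated as one object when spatial relations to the robot are computed.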

Generating Spatial Descriptions from Robot Sensors (here, NOMAD range sensors)

[Figure: processing pipeline in panels (a) to (g), with labels: grouping, gravitational and constant force histograms, directional relationships, features, fuzzy rules]

DETAILED SPATIAL DESCRIPTIONS for 6 OBJECTS:

• Object 1 is mostly behind me but somewhat to the right (the description is satisfactory). The object is very close.

• Object 2 is behind me (the description is satisfactory). The object is very close.

• Object 3 is to the left of me but extends to the rear relative to me (the description is satisfactory). The object is very close.

• Object 4 is mostly to the right of me but somewhat forward (the description is satisfactory). The object is very close.

• Object 5 is in front of me (the description is satisfactory). The object is very close.

• Object 6 is to the left-front of me (the description is satisfactory). The object is close.

Scene 1

Human-Driven Spatial Language for Human-Robot Interaction

• Investigating spatial language for eldercare scenario in the home

– An elderly resident has lost an object; the robot will help the resident to find it.

– EXAMPLE: The eyeglasses are behind the lamp on the table to the left of the bed in the bedroom

• Start with human-subject experiments; Develop spatial language algorithms to match the results

• NSF project # IIS-1017097

– U of Missouri (Marge Skubic)

– U of Notre Dame (Laura Carlson)

The virtual scene used for the human subject experiments

• Robot addressee vs. Human addressee

• Vary the speaker-addressee alignment

• Where to find it vs. how to find it

• Vary candidate reference objects

[Figure: virtual scene with a hallway, a living room, and a bedroom]

Example “where” Statements

• the key is on the white table in the room to his left

• the book is on the wooden table in the back of the room to his right

• the wallet's in the right room behind the bed on the table next to the lamp and the plant

• the glasses case is down the hall in the right room -on the right side of the room as you enter, it's between the two chairs on a table next to a statue

• the mug's down the hall in the left room on a table on the light brown table in front of the couch next to a purse and a hat

Example “how” Statements

• move forward into the intersection, look left, move to the table at the far end of the room directly across from where he is and then look down and he'll find the car keys

• if you turn around and then walk left or walk forward about two steps then walk left into the bedroom, walk forward until you have passed the bed on your left, turn left around the foot of the bed, proceed forward, turn left again, walk forward towards the wall near the head of the bed, look down on the bedside table and there’s the object.

Outline

• Linguistic Scene Description (External)

– Results in Scene Matching

• Linguistic Scene Description (Ego Centered)

• Human/Robot Dialog

• Sketched Route Understanding

• Text-to-Sketch

– Inverse of above

Extracting a Navigation Path from a Sketch

Start: Move forward

When Object #3 is loosely to the left-front Then Turn right

When Object #3 is to the left Then Move forward

When Object #4 is mostly in front Then Turn left

When Object #4 is to the right Then Move forward

When Object #4 is to the right and Object #5 is in front Then Stop

[Figure: sketched map showing the robot path]
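The When/Then steps extracted from the sketch can be read as an ordered rule list that the robot walks through as its perception changes. The sketch below encodes them that way. It assumes each observation is a dictionary from object number to a linguistic relation string and that a step fires on exact string equality; a real system would presumably match fuzzy degrees of truth instead. The names `ROUTE` and `follow` are illustrative.

```python
# Ordered (conditions, action) steps; conditions map object number -> relation.
ROUTE = [
    ({}, "move forward"),                                   # Start
    ({3: "loosely to the left-front"}, "turn right"),
    ({3: "to the left"}, "move forward"),
    ({4: "mostly in front"}, "turn left"),
    ({4: "to the right"}, "move forward"),
    ({4: "to the right", 5: "in front"}, "stop"),
]

def follow(route, observations):
    """Return the actions taken while stepping through the route.

    The first step is unconditional. After that, at most one step fires
    per observation, and only when all of its conditions hold.
    """
    actions = [route[0][1]]
    step = 1
    for observed in observations:
        if step >= len(route):
            break
        conditions, action = route[step]
        if all(observed.get(obj) == rel for obj, rel in conditions.items()):
            actions.append(action)
            step += 1
    return actions

observations = [
    {3: "in front"},                      # no rule fires yet
    {3: "loosely to the left-front"},
    {3: "to the left"},
    {4: "mostly in front"},
    {4: "to the right"},
    {4: "to the right", 5: "in front"},
]
print(follow(ROUTE, observations))
```

Keeping the steps strictly ordered matters: the last two rules share a condition on Object #4, so only their position in the list tells them apart.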

Outline

• Linguistic Scene Description (External)

– Results in Scene Matching

• Linguistic Scene Description (Ego Centered)

• Human/Robot Dialog

• Sketched Route Understanding

• Text-to-Sketch

– Inverse of above


• Consider a frantic (and fictitious) call like this:

– I saw terrorist X slip into a building up ahead. I don’t know where I am in the city and I can’t read the signs. I’m walking towards the location.

– There is a somewhat long, thin rectangular shaped parking lot that extends forward.

– To the immediate right of that parking lot is a parking garage.

– I see a moderately small rectangular building close to me that is mostly to my left but partially forward.

– Across a 4-way intersection, there is a small rectangular building close to me on my left that extends behind me.

– There is another small rectangular building across the street that is mostly to the front of me, but somewhat to the left.

– A short distance to the right of that building is a small L-shaped office.

– I’ve reached another 4-way intersection, there is a large L-shaped building that extends to the rear. That’s where he entered.

• Can we automatically pinpoint the location?

Motivation: Funded by the National Geospatial-Intelligence Agency

ACADEMIC RESEARCH GRANT

[Figure: system diagram; its labels are listed below]

Histograms of forces for red structure (upper left)

Structures into database

Histograms of forces + Fuzzy rules

Convert to graph

Convert to graph

Natural Language

Build graphics

sketch to match

Text descriptions

IHMC Parser

MATCHING

Best approach to date: Evolutionary Computation Algorithm

New approach: Hybrid Algorithm

– EC basic structure

– Subgraph Isomorphism locally in database in addition to mutation

MU Text-to-Sketch System


Syntactic parse tree for: “There is a large rectangular building is to my left.”

Corresponding logical form graph of the deep semantic parse

J. Allen, M. Swift and W. de Beaumont, “Deep Semantic Analysis of Text”, Proc., Symp. On Semantics in Systems for Text Processing, 2008.

J. Allen, Natural Language Understanding, 2nd Ed. Redwood City, CA, USA: Benjamin-Cummings, 1995.

Automatically extract objects and relationships from IHMC’s deep semantic parse

Right now, language variation is restricted

Some Details: Parser


Example of “directly to the front,” which produces a constrained search region, combined with the distance descriptor “somewhat close.”

A compound linguistic description: “perfectly to the left of the blue building” and “mostly to the right, but somewhat above the red building”

An illustration of “somewhat to the right” which produces a rather large search region

Building blocks: Intelligent object placement scheme

Too expensive to use guided random search only – Directional Fuzzy Templates

I. Sledge and J. Keller, "Mapping natural language to imagery: Placing objects intelligently", Proc. IEEE Int. Conf. Fuzzy Syst., Jeju Island, Korea, August, 2009, pp. 518-524 (Best student paper award).
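A directional fuzzy template can be pictured as a grid of membership values around a reference point, where high values mark cells that satisfy a phrase such as “to the right”. The sketch below is only an illustration of that idea, not the method of the cited paper: the triangular fall-off over 90°, the `sharpness` exponent used to narrow the region for stronger hedges, and the function name are all assumptions. Grid rows are treated as the y coordinate, as in image coordinates.

```python
import math

def directional_template(width, height, ref, direction_deg=0.0, sharpness=1.0):
    """Membership grid for 'in direction direction_deg from ref'.

    Membership is 1 along the direction and falls linearly to 0 at 90
    degrees off it; raising it to `sharpness` narrows the search region.
    """
    rx, ry = ref
    template = []
    for y in range(height):
        row = []
        for x in range(width):
            dx, dy = x - rx, y - ry
            if dx == 0 and dy == 0:
                row.append(0.0)  # the reference cell itself is excluded
                continue
            theta = math.degrees(math.atan2(dy, dx)) - direction_deg
            theta = (theta + 180) % 360 - 180  # wrap into [-180, 180)
            row.append(max(0.0, 1.0 - abs(theta) / 90.0) ** sharpness)
        template.append(row)
    return template

loose = directional_template(5, 5, (2, 2), sharpness=1.0)  # e.g. "somewhat to the right"
tight = directional_template(5, 5, (2, 2), sharpness=4.0)  # e.g. "directly to the right"
print(loose[1][3], tight[1][3])  # a diagonal cell scores lower in the tight template
```

Candidate placements can then be drawn only from high-membership cells, which is far cheaper than testing random positions over the whole map.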


“To the immediate right of that parking lot is a large parking garage that is the same length as the parking lot.”

[Figure: fuzzy initial placement template, initial placement, and optimized placement]

Linguistic descriptions from fuzzy rule base:

To the right but a little bit forward

Perfectly to the right

Example of Sketch Creation After Parsing


(a) “There is a somewhat long, thin rectangular shaped parking lot that extends forward relative to me on my right.”

(b) “To the immediate right of that parking lot is a large parking garage that is the same length as the parking lot.”

(c) “I see a moderately small rectangular building close to me that is mostly to my left but partially forward.”

(d) “Travelling to a 4-way intersection, there is a small rectangular building close to me on my left that extends behind me.”

(e) “There is another small rectangular building across the street that is mostly to the front of me, but somewhat to the left.”

(f) Left image: Final sketch (after a few more descriptions). Right image: Ground truth recovered from matching algorithm

Sketch Building Example


Evolutionary Computation Approach

[Figure: Target Histogram Set; $F^{AB}(\theta)$ on a 0 to 1 scale against angle from -180 to 180]

Sketch converted to chromosome structure: building = gene

Each gene attributed with its Histograms of Forces


Initial Generation & fitness evaluation of chromosome

[Figure: Search Histogram Set; $F^{AB}(\theta)$ on a 0 to 1 scale against angle from -180 to 180]

Rotate to optimal angle

Add n random chromosomes to population over the Geospatial database

Each gene attributed with its Histograms of Forces

Blue are query HoFs

Red are those of current Chromosome
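The fitness evaluation with rotation to an optimal angle can be sketched as a search over circular shifts of the candidate histograms. The slides do not name the similarity measure, so the min/max ratio used here is an assumption, as are the function names. Histograms are lists of equal length covering 360°, so a shift of one bin corresponds to a rotation of 360/n degrees, and one shared shift is applied to every histogram of the chromosome.

```python
def similarity(h1, h2):
    """Min/max ratio between two histograms (an assumed measure); 1.0 if identical."""
    numerator = sum(min(a, b) for a, b in zip(h1, h2))
    denominator = sum(max(a, b) for a, b in zip(h1, h2))
    return numerator / denominator if denominator else 1.0

def fitness(query_hists, chromosome_hists):
    """Best mean similarity over all circular shifts, plus the winning shift.

    query_hists and chromosome_hists are parallel lists of histograms,
    one per pair of buildings (genes), all with the same number of bins.
    """
    n_bins = len(query_hists[0])

    def score(shift):
        rotated = (c[shift:] + c[:shift] for c in chromosome_hists)
        sims = [similarity(q, c) for q, c in zip(query_hists, rotated)]
        return sum(sims) / len(sims)

    best_shift = max(range(n_bins), key=score)
    return score(best_shift), best_shift

# The candidate is the query rotated by one bin, so shift 1 realigns it.
query = [[0, 1, 0, 0], [1, 0, 0, 0]]
candidate = [[0, 0, 1, 0], [0, 1, 0, 0]]
print(fitness(query, candidate))
```

Searching over rotations makes sense for this task because a sketch built from an egocentric description has no known compass orientation relative to the database.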


A walk by Shakespeare’s Pizza in Columbia


A walk by Shakespeare’s Pizza in Columbia: Up close and personal

Human Geography Project

University of Missouri and University of Florida

[Figure: project diagram with three components (HuGeo based knowledge, Pattern Recognition Techniques, Simulation and Modeling) and a map image with a color scale]

• GIS; Geospatial representation and reasoning; Conflation; Confidence assessment; Text2Sketch

• Math models; Repast; Sampling methods; Swarm Intelligence

• Uncertainty Modeling (Probabilistic, fuzzy, belief); Classifiers; Clustering; Ontologies; Multi and Hyper Spectral analysis; Fusion

HuGeo Project Overview

• What’s needed/missing in the HuGeo-based Knowledge Discovery component?

– Extraction of concepts and events from blogs, Twitter, news sources, etc.

– Interpretation of sentences

– Placing them geographically

• i.e., we need NLP

Conclusions

• We model and utilize spatial language

– We’re great at modeling, pattern recognition, fusion

• Our languages are not very flexible

• To take the next steps, we need better and deeper language models

• Help!