Page 1

Spatial Representations of Classifier Predicates for Machine Translation into American Sign Language

Matt Huenerfauth

Workshop on the Representation and Processing of Signed Languages
4th International Conference on Language Resources and Evaluation

May 30, 2004, Lisbon, Portugal

Computer and Information Science, University of Pennsylvania

Research Advisors: Mitch Marcus & Martha Palmer

Page 2

Motivations and Applications

• Only half of Deaf high school graduates (age 18+) can read English at a fourth-grade (age 10) level, despite ASL fluency.

• Many Deaf accessibility tools forget that English is a second language for these students (and is different from ASL).

• Applications for a Machine Translation System:
– TV captioning, teletype telephones.
– Computer user-interfaces in ASL.
– Educational tools, access to information/media.
– Transcription, storage, and transmission of ASL.

Page 3

Input / Output

What’s our input? English Text.

What’s our output? Less clear…

Imagine a 3D virtual reality human being…

One that can perform sign language…


But this character needs a set of instructions telling it how to move!

Our job: English → these instructions. (Image: VCom3d)

Page 4

Off-the-Shelf Virtual Humans

Photos: Seamless Solutions, Inc.; Simon the Signer (Bangham et al. 2000); Vcom3D Corporation

Page 5

Classifier Predicates

"The car drove down the bumpy road past a cat."
CAT ClassPred-bentV-{location of cat}
CAR ClassPred-3-{drive on bumpy road}

Where's the cat, the road, and the car? How close are they? Where does the path start/stop? How do we show that the path is bumpy, winding, or hilly?

• Pushing the boundaries of 'language.'
– Hard to handle with traditional computational linguistic representations (lexicons, grammars).

Page 6

Previous ASL MT Systems

• Word-for-Sign direct transliteration.
– Produces Signed English, not ASL.

• Syntactic analysis, transfer, generation.
– Handles much of the non-spatial phenomena.

• All ignore classifier predicates.
– Need ASL classifiers to fluently translate many English input texts.
– Signers use classifier predicates once per minute in most genres (17 times per minute in some). (Morford and MacFarlane, 2003)

Page 7

Focus and Assumptions

• Other systems: non-spatial ASL sentences.

• This project: spatially complex ASL.

• This means classifier predicates!
– Predicates of movement and location.
– Generating a single classifier predicate (multi-predicate issues also being studied).

Page 8

Motivating a Design for a Classifier Predicate Generator

Four progressively better designs…

Page 9

Four Designs: Keep Improving

• Design 1: Fully Lexicalized. Associate a movement path with English multi-word phrases. Problem: can't list all of them.
• Design 2: Compositional Rules. Combine a set of morphemes using compositional rules that we must write. Problem: too many morphemes. (Linguistic analogy: Supalla's polymorphemic model.)
• Design 3: Directly Pictorial. Invisible world; place the hand on top of the moving object. Problem: overgenerates. (Linguistic analogy: DeMatteo's visual/gestural model. Uses the virtual reality spatial model.)
• Design 4: Template Lexicon. Invisible world to calculate 3D points, then fill a template for a classifier predicate. (Linguistic analogy: Liddell's templates. Uses the virtual reality spatial model.)

Page 10

Design 1: List them all…

• Multi-word English lexical entries.

• Associate a classifier predicate with each.

• Exhaustively list them all… Problem?
– Anticipate all of them?
– ClassPreds are very productive.
– Many ways to modulate performance.

• "…drive up the hill…"

– This approach is impractical.

Page 11

(The four-designs overview table is repeated here; see Page 9.)

Page 12

Design 2: Composition Rules

• Identify minimal components of meaning.

• Corresponding element of movement/shape:
– path contour, hand elevation, palm orientation…
– e.g. "…which way is the person facing…"

• Compositional rules to combine these 'morphemes' into a full classifier predicate.

Page 13

Design 2: Linguistic Analogy

• Analogous to Supalla's polymorphemic model of classifier predicate generation (1978, 1982, 1986).
– Every piece of information = a morpheme.
– Building the predicate = combining lots of them.
– E.g. "…two people meet…" (Liddell, 2003)

Morpheme count explosion! Not practical.

Page 14

So, what's the problem?

• Every 3D location/path = a new morpheme.
– No model of how objects are arranged in space…

• 3D model = more intuitive.
– Easier to select the motion path of our hand.
– Need many fewer morphemes.

• Analyze English text → make a 3D model.

• 3D coordinates → how to move our hand.

Page 15

(The four-designs overview table is repeated here, now highlighting the Virtual Reality Spatial Model shared by Designs 3 and 4; see Page 9.)

Page 16

A Useful Technology…

Controlling a virtual reality with English input commands…

Page 17

Controlling a Virtual Reality with NL

• AnimNL System
– Virtual reality model of characters/objects in 3D.
– Input: English sentences (directions for characters/objects to follow).
– Produces an animation: characters/objects obey the English commands.
– Updates the 3D scene to show changes.

Badler, Bindiganavale, Allbeck, Schuler, Zhao, Lee, Shin, and Palmer. 2000.
Schuler. 2003.

Page 18

English-Controlled 3D Scene

http://hms.upenn.edu/software/PAR/images.html

Page 19

How it Works

English Text → Syntactic Analysis → Select a PAR Template → Fill the PAR Template → "Planning Process" → Animation Output

PAR = "Parameterized Action Representation" (on next slide)

Page 20

Parameterized Action Representation

participants: [ agent: AGENT; objects: OBJECT list ]
semantics: [ motion: {Object, Translate?, Rotate?}; path: {Direction, Start, End, Distance}; termination: CONDITION; duration: TIME-LENGTH; manner: MANNER ]
start: TIME
prep conditions: CONDITION boolean-exp
sub-actions: sub-PARs
parent action: PAR24
previous action: PAR35
next action: PAR64

(This is a subset of PAR info. http://hms.upenn.edu/software/PAR)

Example: "Bob tripped on the ball."
– "…tripped…" selects the planning operator (an artificial intelligence formalism for deciding how to act in a complex situation).
– participants: agent = Bob; objects = { ball_1 }
– semantics: motion = {Bob, translate…, rotate…}; path = specifics of the path taken…
– termination: "…until dawn." (end at 6am); duration: "…for 3 hours." (3 hours).
– manner: accidentally; with "…rapidly.": accidentally, rapidly.
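To make the slot structure concrete, here is a minimal sketch of a PAR-like record in Python, filled in with the "Bob tripped on the ball" example from this slide. The class name, field names, and values are illustrative stand-ins that mirror the slide, not the actual PAR implementation from the AnimNL project.

```python
from dataclasses import dataclass, field
from typing import Optional

# Illustrative sketch of a PAR-like record (not the real AnimNL data structure).
@dataclass
class PAR:
    agent: str                                   # participants: agent
    objects: list                                # participants: object list
    motion: dict                                 # semantics: translate / rotate info
    path: dict                                   # semantics: direction, start, end, distance
    termination: Optional[str] = None            # semantics: terminating condition
    duration: Optional[str] = None               # semantics: time length
    manner: Optional[list] = None                # semantics: manner
    start: Optional[str] = None                  # start time
    prep_conditions: Optional[str] = None        # preparatory conditions (boolean expression)
    sub_actions: list = field(default_factory=list)  # sub-PARs
    parent_action: Optional["PAR"] = None
    previous_action: Optional["PAR"] = None
    next_action: Optional["PAR"] = None

# "Bob tripped on the ball." -- the verb selects the planning operator,
# and the sentence fills the slots roughly as the slide suggests.
tripped = PAR(
    agent="Bob",
    objects=["ball_1"],
    motion={"object": "Bob", "translate": True, "rotate": True},
    path={"direction": "...", "start": "...", "end": "...", "distance": "..."},
    termination="until dawn (end at 6am)",
    duration="for 3 hours",
    manner=["accidentally", "rapidly"],
)
print(tripped.agent, tripped.manner)
```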

What is a “planning” algorithm good for?

Page 21

Adding Detail, Making Animation

• PAR is missing details needed to create animation.
– "…turn the handle…"

• Use an artificial intelligence "planning" algorithm.
– Calculate preconditions, physical constraints, sub-actions, effects, etc. of each animation movement.

• Works out the details needed to build animation.
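As a rough illustration of what such a planner contributes, the toy sketch below expands an action like "turn the handle" into primitive steps, inserting extra actions whenever a precondition is not yet satisfied. The operators, conditions, and helper names are invented for this example; they are not AnimNL's actual planning formalism.

```python
# Toy illustration of precondition-driven expansion (invented operators,
# not AnimNL's actual planning formalism).
OPERATORS = {
    "turn_handle": {"preconditions": ["hand_on_handle"], "sub_actions": ["rotate_wrist"]},
    "grasp_handle": {"preconditions": [], "sub_actions": ["reach_to_handle", "close_hand"]},
}
# Which action achieves which condition (again, invented for the sketch).
ACHIEVES = {"hand_on_handle": "grasp_handle"}

def expand(action, state):
    """Return a flat list of primitive steps, adding steps for unmet preconditions."""
    op = OPERATORS.get(action)
    if op is None:
        return [action]                      # primitive movement: emit as-is
    steps = []
    for cond in op["preconditions"]:
        if cond not in state:                # precondition unmet: plan an action for it
            steps += expand(ACHIEVES[cond], state)
            state.add(cond)
    for sub in op["sub_actions"]:
        steps += expand(sub, state)
    return steps

print(expand("turn_handle", set()))
# ['reach_to_handle', 'close_hand', 'rotate_wrist']
```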

Page 22

Diagram of AnimNL

Page 23

A 3D Spatial Model for American Sign Language

Using the virtual reality English-command technology

Page 24

English-Controlled 3D Scene

http://hms.upenn.edu/software/PAR/images.html

Page 25

Using this technology…

An NL-Controlled 3D Scene

http://hms.upenn.edu/software/PAR/images.html

Page 26

Using this technology…

An NL-Controlled 3D Scene

Page 27

Using this technology…

An NL-Controlled 3D Scene
Original image from: Simon the Signer (Bangham et al. 2000)

Signing Character

Page 28

Using this technology…

An NL-Controlled 3D Scene
Original image from: Simon the Signer (Bangham et al. 2000)

Signing Character

Page 29

“Invisible World” Approach

• Invisible objects floating in front of the signer.

• English sentences → commands for the virtual reality.
• Positions, moves, and orients objects in this world.

• So, we’ve got all these floating invisible objects…

What do we do with them?
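As a concrete (and purely hypothetical) picture of the idea: the "invisible world" can be thought of as each discourse entity mapped to a 3D position and orientation in the space in front of the signer, updated as the English input is analyzed. The entity names and coordinate values below are invented for illustration.

```python
# Hypothetical "invisible world": entities mapped to 3D placements in the
# signing space (entity names and coordinates invented for illustration).
invisible_world = {
    "cat":  {"position": (0.3, 0.0, 0.4), "orientation_deg": 90.0},
    "road": {"position": (0.0, 0.0, 0.3), "orientation_deg": 0.0},
}

def place_entity(world, name, position, orientation_deg=0.0):
    """Position (or move) an invisible object in front of the signer."""
    world[name] = {"position": position, "orientation_deg": orientation_deg}

# "The car drove down the bumpy road past a cat." -> the car is added here and
# later moved along a path computed by the scene-updating step.
place_entity(invisible_world, "car", (-0.4, 0.0, 0.3))
print(invisible_world["car"])
```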

Page 30

Using the 3D Virtual Reality

Design 3 and Design 4

Page 31

(The four-designs overview table is repeated here; see Page 9.)

Page 32

Design 3: Directly Pictorial

• Invisible 3D Objects → Classifier Predicate
– Put the hand in the proper handshape.
– Place the hand directly on top of (inside of) the object in the 3D scene.
– Follow the paths the objects trace through space.

We go along for the ride!

Page 33

Diagram of Design 3

Page 34

Diagram of Design 3

The AnimNL Technology

Page 35

Linguistic Analogy / Problems

• DeMatteo's gestural model of classifier predicates (1977).
– Mental model of the scene.
– Move hands in a topologically analogous manner.
– Merely iconic gestural movements.

• Problem? Overgenerative.
– Doesn't explain conventions/restrictions (Liddell, 2003):
• legal combinations of handshape/movement.
• some movements not visually representative.
• discourse factors / multi-predicate concerns.

Design 3 has the same problem!

Page 36

Diagram of Design 3
This process is harder than it seems.

Page 37

(The four-designs overview table is repeated here; see Page 9.)

Page 38

The Solution? More Templates!

• Can't just 'go along for the ride.'
– Making a ClassPred is more complicated.

• Our last complicated animation task?
– Move 3D objects based on English text.
– We used templates and 'planning'.

• Can we do something like this again?
– This time: how to move the arm to do a ClassPred.

Page 39

Diagram of Design 3
Insert a template library here…
Insert a planning process here…

Page 40

Diagram of Design 4

Page 41

A Second PAR Template Library

• First library of templates: possible movements of invisible objects in the virtual reality (Library 1).

• Second library: possible movements of the signer's hands while performing classifier predicates to describe these objects (Library 2).

Original image from: Simon the Signer (Bangham et al. 2000)

Page 42

Selecting/Filling a Template

• Big list of prototypical classifier predicates stored as templates.

• Select a template based upon:
– English lexical items.
– Linguistic features in the English sentence.
– 3D coordinates and motion paths of objects.

• Let the planning process build the animated output.

• How is this better than Design 3?

Page 43

…leisurely walking along…

• AnimNL: English Text → Virtual Reality.

• Parse of Sentence → Select a Template: Leisurely-Walking-Upright-Figure.
– Specifies handshape, palm orientation, "bouncing" path contour, and speed/timing.
– Still needs 3D starting/stopping coordinates.

• Get coordinates from the "invisible world," fill the template, let the animation software make the output.

How's it better? Invisible world motion path ≠ hand motion path.
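The walking example can be pictured as a two-step lookup: choose a prototypical classifier-predicate template from the parse, then fill its open slots with start/stop coordinates taken from the invisible world. Everything in the sketch below (template names, features, coordinate values) is hypothetical and only illustrates the shape of the process, not the system's real template library.

```python
# Hypothetical classifier-predicate template library and filling step.
# Template names, features, and coordinates are invented for illustration.
CP_TEMPLATES = {
    ("walk", "leisurely"): {
        "handshape": "upright-figure handshape",
        "palm_orientation": "toward signer's midline",
        "path_contour": "slight bouncing",
        "speed": "slow",
    },
}

def select_template(verb, manner):
    """Pick a prototypical classifier predicate from parse-level features."""
    return dict(CP_TEMPLATES[(verb, manner)])

def fill_template(template, start_xyz, end_xyz):
    """Add the 3D start/stop coordinates obtained from the invisible world."""
    template["start"] = start_xyz
    template["end"] = end_xyz
    return template

# The coordinates would come from the invisible-world model of the scene.
cp = fill_template(select_template("walk", "leisurely"),
                   start_xyz=(-0.3, 0.0, 0.35), end_xyz=(0.3, 0.0, 0.35))
print(cp["path_contour"], cp["start"], cp["end"])
```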

Page 44

Linguistic Motivations

• "Blended Spaces" Lexicalized Classifier Predicate Model of Scott Liddell (2003).
– Signers imagine objects occupying space.
– Classifier predicates stored as a lexicon of templates that are parameterized on the locations/orientations of these spatial entities.

• Both engineering & linguistic motivations.

Page 45

The Four Designs: Wrap Up

(The overview table from Page 9 is shown again; Design 4, the Template Lexicon, is now marked "Exciting Possibilities.")

Page 46

Generating Multiple Classifier Predicates

Page 47

Generating Multiple ClassPreds

• One English sentence → one classifier predicate?
– Sometimes one-to-many or many-to-one.
– Change in ordering or organization.
– Interact or constrain one another.
– Emergent meaning from multiple predicates.

• Need to think about generation decisions at a multi-predicate level for the entire scene being described. Need a representation of how several predicates work together for one 3D scene.

Page 48

Generating Multiple ClassPreds

• We're using PARs as classifier predicate templates.
– But PARs can store sub-PARs inside of them to represent sub-parts of a movement.

• Instead of using a PAR to store only one classifier predicate template, let's store several templates together as a group.

• We'll call this group of CPs a "motif."

Page 49

Generating Multiple ClassPreds

• Several English sentences can trigger a large structure containing rules for how to compose several classifier predicates.
– Allows us to plan out a whole scene.
– Allows us to rearrange or introduce additional classifier predicates to satisfy ASL grammatical constraints.
– Planning process expands a motif → several CPs (sketched below; more details in the paper).
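One way to picture a motif is as a PAR-like group whose sub-parts are themselves classifier-predicate templates, which the planner expands into an ordered series of predicates for the scene. The sketch below illustrates that idea with invented names and a single invented ordering rule (stationary objects located before moving objects, as in the car-and-cat example); it is not the paper's actual representation.

```python
from dataclasses import dataclass, field

# Sketch of a motif: a group of classifier-predicate templates stored
# together, expanded by the planner into an ordered sequence of predicates.
# (Names and the ordering rule are invented for illustration.)
@dataclass
class Motif:
    name: str
    cp_templates: list = field(default_factory=list)   # sub-PAR-like CP templates
    ordering_rule: str = "locate stationary objects before moving objects"

def expand_motif(motif):
    """Expand a motif into individual classifier predicates, applying its rule."""
    stationary = [t for t in motif.cp_templates if not t.get("moves")]
    moving = [t for t in motif.cp_templates if t.get("moves")]
    return stationary + moving      # e.g. locate the cat before the car drives past

scene = Motif(
    name="vehicle-passes-landmarks",
    cp_templates=[
        {"entity": "car", "template": "vehicle-3-handshape", "moves": True},
        {"entity": "cat", "template": "bentV-location", "moves": False},
    ],
)
print([t["entity"] for t in expand_motif(scene)])   # ['cat', 'car']
```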

Page 50

Benefits of Virtual Reality

Page 51

Advantages of Virtual Reality

• VR is useful for more than just classifier predicates; it can facilitate the layout of discourse entities for pronominal reference.

• Also, the human characters from AnimNL have skills that are useful for ASL.
– One of them will be our signer.
– Will make it easier to build our system.

Page 52

Advantages of Virtual Reality

• This approach suggests new ways to annotate classifier predicate performance.
– Important to note the layout of the invisible world objects in the space around the signer.
– Allows us to study how classifier predicates are generated to communicate the information in this scene.

Page 53

Wrap Up

• Applications/motivations for ASL MT

• Classifier predicates are hard to generate.

• Need a 3D spatial model and generation process: Virtual reality “invisible worlds.”

• Engineering and Linguistic Motivations.

• Ways to handle multi-predicate expressions.

• Advantages of using virtual reality.

Page 54

References

Badler, Bindiganavale, Allbeck, Schuler, Zhao, Lee, Shin, and Palmer. 2000. Parameterized action representations and natural language instructions for dynamic behavior modification of embodied agents. AAAI Spring Symposium.

Bangham, Cox, Lincoln, Marshall. 2000. Signing for the deaf using virtual humans. IEE2000.

DeMatteo, A. 1977. Visual Analogy and the Visual Analogues in American Sign Language. In Lynn Friedman (ed.), On the Other Hand: New Perspectives on American Sign Language (pp. 109-136). New York: Academic Press.

Holt, J. 1991. Demographic, Stanford Achievement Test - 8th Edition for Deaf and Hard of Hearing Students: Reading Comprehension Subgroup Results.

Liddell. 2003. Sources of Meaning in ASL Classifier Predicates. In Karen Emmorey (ed.), Perspectives on Classifier Constructions in Sign Languages. Workshop on Classifier Constructions, La Jolla, San Diego, California.

Liddell. 2003. Grammar, Gesture, and Meaning in American Sign Language. UK: Cambridge U. Press.

Morford and MacFarlane. 2003. "Frequency Characteristics of ASL." Sign Language Studies, 3:2.

Schuler. 2003. Using model-theoretic semantic interpretation to guide statistical parsing and word recognition in a spoken language interface. Proceedings of the 41st Annual Meeting of the Association for Computational Linguistics (ACL'03), Sapporo, Japan.

Supalla, T. 1978. Morphology of Verbs of Motion and Location. In F. Caccamise and D. Hicks (eds.), Proceedings of the Second National Symposium on Sign Language Research and Teaching (pp. 27-45). Silver Spring, MD: National Association for the Deaf.

Supalla, T. 1982. Structure and Acquisition of Verbs of Motion and Location in American Sign Language. Ph.D. Dissertation, University of California, San Diego.

Supalla, T. 1986. The Classifier System in American Sign Language. In C. Craig (ed.), Noun Phrases and Categorization, Typological Studies in Language, 7 (pp. 181-214). Philadelphia: John Benjamins.

Page 55

Photo Credits

Some images from:

• Seamless Solutions, Inc. Website

• Vcom3d Company Website

• J.A. Bangham, S J Cox, M Lincoln, I Marshall. 2000. Signing for the deaf using virtual humans. IEE2000