
Page 1

Referring to Objects with Spoken and Haptic Modalities

Frédéric LANDRAGIN

Nadia BELLALEM

& Laurent ROMARY

LORIA Laboratory

Nancy, FRANCE

Page 2

Overview

• Context: Conception phase of the EU IST/MIAMM project (Multidimensional Information Access using Multiple Modalities, with DFKI, TNO, Sony and Canon)
  – Study of the design factors for a future haptic PDA-like device

• Background: Interpretation of natural language and spontaneous gestures
  – A model of contextual interpretation of multimodal referring expressions in visual contexts

• Objective: To show the possible extension of the model to an interaction mode including tactile and kinesthetic feedback (henceforth "haptic")

Page 3

MIAMM - wheel mode

Page 4

MIAMM - axis mode


Page 5

MIAMM - galaxy mode

Page 6

Reference domains and visual contexts

• The use of perceptual grouping

  "these three objects" → { triangle, circle, circle }
  [the objects are drawn as geometrical shapes in the slide figure]

Page 7

Reference domains and visual contexts

• The use of perceptual grouping

  "these three objects" → { triangle, circle, circle }
  "the triangle" → { triangle }
  "the two circles" → { circle, circle }

• The use of salience (both notions are illustrated in the sketch after this slide)
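The reference domains above can be pictured with a small data structure. The following is an illustrative sketch only (the class and function names are invented, not MIAMM code): a domain gathers a perceptual group plus its salient elements, and a referring expression is resolved by filtering the group with its category and number constraints.

# Illustrative sketch (not the MIAMM code): a reference domain built from a
# perceptual group, and resolution of a referring expression inside it.
from dataclasses import dataclass, field

@dataclass
class Obj:
    ident: str
    category: str          # e.g. "triangle", "circle"

@dataclass
class ReferenceDomain:
    objects: list                                  # the perceptual group
    salient: list = field(default_factory=list)    # objects made salient (gesture, focus)

def resolve(domain, category=None, number=None):
    """Keep the domain members matching the expression's category/number constraints."""
    candidates = [o for o in domain.objects
                  if category is None or o.category == category]
    if number is not None and len(candidates) != number:
        return None        # under-specified or mismatched: another clue is needed
    return candidates

scene = ReferenceDomain([Obj("o1", "triangle"), Obj("o2", "circle"), Obj("o3", "circle")])
print(resolve(scene, number=3))                      # "these three objects"
print(resolve(scene, category="triangle"))           # "the triangle"
print(resolve(scene, category="circle", number=2))   # "the two circles"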

Page 8

Multimodal fusion architecture

[Architecture diagram. The dialogue manager exchanges request / reference-domain messages with three contextual modules, each keeping its own history: a language module (fed by the referential expression, with under-specification handling and a user model), a visual module (fed by the referential gesture), and a task module. The dialogue manager delivers the resolved referent.]
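A minimal sketch of the request / domains exchange suggested by the diagram; the class and method names (Module, request_domains, DialogueManager) are invented for illustration and are not the actual MIAMM interfaces. The dialogue manager queries each contextual module for candidate reference domains and keeps the referent from the first domain that satisfies the expression.

# Sketch of the request/domains loop between the dialogue manager and the modules.
class Module:
    """A contextual module (language, visual, task) keeping its own history."""
    def __init__(self, name):
        self.name = name
        self.history = []              # previously built reference domains

    def request_domains(self, expression):
        """Return candidate reference domains for an under-specified expression."""
        return [d for d in self.history
                if d.get("category") == expression.get("category")]

class DialogueManager:
    def __init__(self, modules):
        self.modules = modules

    def resolve(self, expression):
        for module in self.modules:            # query language, visual and task contexts
            for domain in module.request_domains(expression):
                referent = domain.get("objects")
                if referent:                   # first satisfying domain wins (simplified)
                    return referent
        return None

visual = Module("visual")
visual.history.append({"category": "circle", "objects": ["o2", "o3"]})
dm = DialogueManager([Module("language"), visual, Module("task")])
print(dm.resolve({"category": "circle"}))      # ['o2', 'o3']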

Page 9

Haptics and deixis

• Haptic gestures can take on the three classical functions of gesture in man-machine interaction:
  – semiotic function: 'select this object'
  – ergotic function: 'reduce the size of this object'
  – epistemic function: 'save the compliance of this object'

• How can the system identify the function(s)? (see the sketch after this slide)
  – linguistic clues (referential expression, predicate)
  – task indications (possibilities linked to a type of object: "affordances")

• Deictic role: to make the object salient, whatever the function, in order to focus the addressee's attention on it.
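As an illustration of the previous bullet, a hedged sketch of how linguistic clues and task affordances might be intersected to guess the gesture's function(s); the predicate lists and the affordance table are invented for the example.

# Illustrative sketch: guessing the function(s) of a haptic gesture from
# linguistic clues (the predicate) and the affordances of the object type.
SEMIOTIC_PREDICATES = {"select", "show", "play"}
ERGOTIC_PREDICATES = {"reduce", "move", "resize"}
EPISTEMIC_PREDICATES = {"save", "feel", "compare"}

AFFORDANCES = {                     # what the task allows per object type (invented)
    "track": {"semiotic", "ergotic"},
    "texture_sample": {"semiotic", "epistemic"},
}

def gesture_functions(predicate, object_type):
    """Intersect linguistic evidence with what the touched object type affords."""
    linguistic = set()
    if predicate in SEMIOTIC_PREDICATES:
        linguistic.add("semiotic")
    if predicate in ERGOTIC_PREDICATES:
        linguistic.add("ergotic")
    if predicate in EPISTEMIC_PREDICATES:
        linguistic.add("epistemic")
    allowed = AFFORDANCES.get(object_type, {"semiotic", "ergotic", "epistemic"})
    return linguistic & allowed or allowed       # fall back to affordances alone

print(gesture_functions("select", "track"))          # {'semiotic'}
print(gesture_functions("save", "texture_sample"))   # {'epistemic'}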

Page 10

Haptics and perceptual grouping

• Interest: a formalism for focusing on a subset of objects

• Grouping factors (the first two are sketched after this list):
  – objects that have similar haptic (in particular tactile) properties (shape, consistency, texture)
  – objects that have been browsed by the user (the elements of such a group are ordered)
  – objects that are stuck together, parts of the same object, ...
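A short illustrative sketch of the first two grouping factors; the attribute names (texture, browsed_at) are invented for the example.

# Sketch of the haptic grouping factors listed above (attribute names invented).
from itertools import groupby

objects = [
    {"id": "o1", "texture": "rough",  "browsed_at": 2},
    {"id": "o2", "texture": "smooth", "browsed_at": 1},
    {"id": "o3", "texture": "rough",  "browsed_at": None},   # never touched
]

# Factor 1: objects sharing similar tactile properties form a group.
by_texture = {t: [o["id"] for o in g]
              for t, g in groupby(sorted(objects, key=lambda o: o["texture"]),
                                  key=lambda o: o["texture"])}

# Factor 2: objects the user has browsed form an *ordered* group
# (the manipulation order is kept, unlike in purely visual grouping).
browsed = sorted((o for o in objects if o["browsed_at"] is not None),
                 key=lambda o: o["browsed_at"])

print(by_texture)                      # {'rough': ['o1', 'o3'], 'smooth': ['o2']}
print([o["id"] for o in browsed])      # ['o2', 'o1']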

Page 11

Haptics and perceptual domains

• Can visual and tactile perception work together?
  – simultaneous visual and tactile perception implies the same world of objects (and synchronized feedback)
  – a referring expression can be interpreted in a visual context or in a tactile context

• How can the system identify the nature of the perception?
  – for immediate references, the visual context gives the reference domain and haptics gives the starting point within it
  – for reiterating references, each type of context can provide the reference domain (so both hypotheses must be tested; see the sketch after this slide)
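A sketch of the double-hypothesis test mentioned above; resolve_in() stands in for the constraint matching of the previous slides and is a placeholder, not MIAMM code.

# Sketch: for a reiterating reference, both context types may provide the
# reference domain, so the two hypotheses are built and tested in parallel.
def resolve_in(domain, expression):
    """Placeholder: objects of the domain satisfying the expression's constraints."""
    return [o for o in domain if o["category"] == expression["category"]]

def interpret(expression, visual_domain, haptic_domain):
    hypotheses = {
        "visual": resolve_in(visual_domain, expression),
        "haptic": resolve_in(haptic_domain, expression),
    }
    # Keep only hypotheses with a non-empty result; both may survive,
    # in which case another criterion (e.g. relevance) must decide.
    return {ctx: objs for ctx, objs in hypotheses.items() if objs}

visual_domain = [{"id": "v1", "category": "track"}, {"id": "v2", "category": "album"}]
haptic_domain = [{"id": "h1", "category": "track"}]
print(interpret({"category": "track"}, visual_domain, haptic_domain))
# both hypotheses survive here: {'visual': [...v1...], 'haptic': [...h1...]}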

Page 12

Haptics and dialogue history

• Interpretations that require an ordering within the reference domain: 'the first one', 'the next one', 'the last one'
  – in visual perception, guiding lines can help (if there are none, an order can always be built from the reading direction)
  – in haptic perception, the only available criterion may be the manipulation order (sketched after this slide)

• Some referring expressions that do not need an order may still be interpreted in the haptic manipulation history
  – 'the big one' (in the domain of browsed objects)
  – 'them' (the most strongly pressed objects)
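An illustrative sketch of the ordinal interpretations against the haptic manipulation history; the function name and the history format are invented for the example.

# Sketch: interpreting order-dependent expressions against the haptic
# manipulation history (an ordered list of browsed object ids).
def interpret_ordinal(expression, manipulation_history, current=None):
    """'the first one' / 'the next one' / 'the last one' over the browse order."""
    if not manipulation_history:
        return None
    if expression == "the first one":
        return manipulation_history[0]
    if expression == "the last one":
        return manipulation_history[-1]
    if expression == "the next one" and current in manipulation_history:
        i = manipulation_history.index(current)
        return manipulation_history[i + 1] if i + 1 < len(manipulation_history) else None
    return None

history = ["track_07", "track_02", "track_11"]      # in manipulation order
print(interpret_ordinal("the first one", history))              # track_07
print(interpret_ordinal("the next one", history, "track_02"))   # track_11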

Page 13

Architecture for speech-haptic referring

[Architecture diagram, identical to the multimodal fusion architecture of page 8 except that the visual module becomes a visuo-tactile module and the referential gesture becomes a haptic gesture: the dialogue manager exchanges request / reference-domain messages with the language, visuo-tactile and task modules (each keeping its own history), takes the referential expression and the haptic gesture as input, and delivers the referent; under-specification handling and the user model remain attached to the language module.]

Page 14

Summary

• What does not change from deictic to haptic:
  – the status of speech and gesture in the architecture
  – the distribution of information between speech and gesture
  – the need for reference domains
  – the use of salience and of orderings within domains
  – the algorithms that exploit all these notions

• What does change:
  – some otherwise unchanged notions can now have more than one cause (for instance, salience may be visual or haptic)
  – objects must have been browsed to be grouped in a haptic domain
  – one aspect of the architecture: the visual perception module becomes the visuo-tactile perception module

Page 15

Application - MMIL design

• MMIL - Multimodal Interface Language
  – Unique communication language within the MIAMM architecture
  – XML-based representation embedded in a web service protocol (SOAP); a minimal embedding sketch follows this slide
  – Oriented towards multimodal content representation (cf. Joint ACL/SIGSEM and TC37/SC4 initiative)

• Example: VisHapTac notification to Dialogue Manager
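The slides only state that MMIL is XML embedded in SOAP. As a hedged illustration, the sketch below wraps an MMIL component in a generic SOAP 1.1 envelope; the namespace and body layout are assumptions for the example, not the project's actual schema.

# Minimal sketch of embedding an MMIL component in a generic SOAP 1.1 envelope.
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"

def wrap_in_soap(mmil_xml: str) -> str:
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    body.append(ET.fromstring(mmil_xml))       # the MMIL component goes in the body
    return ET.tostring(envelope, encoding="unicode")

mmil = "<mmilcomponent><event id='e0'><evtType>HGState</evtType></event></mmilcomponent>"
print(wrap_in_soap(mmil))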

Page 16

Overall structure

[Tree diagram. The event e0 (the visual-haptic state) is linked by a "description" relation to the participant set1 (the participant setting: s1, s2, ..., s25). Within set1, set2 is a sub-division grouping s2-1, s2-2 and s2-3; the attention statuses inFocus and inSelection mark, respectively, the sub-set in focus and the selected element (cf. the MMIL fragments on the next pages).]

Page 17

MMIL format for VisHapTac - 1

<mmilcomponent>
  <event id="e0">
    <evtType>HGState</evtType>
    <visMode>galaxy</visMode>
    <tempSpan startPoint="2000-01-20T14:12:06"
              endPoint="2002-01-20T14:12:13"/>
  </event>
  <participant id="set1">
  </participant>
  <relation type="description" source="set1" target="e0"/>
</mmilcomponent>

Page 18

MMIL format for VisHapTac - 2

<participant id="set1">
  …
  <participant id="s1">
    <Name>Let it be</Name>
  </participant>
  <participant id="set2">
    <individuation>set</individuation>
    <attentionStatus>inFocus</attentionStatus>
    <participant id="s2-1">
      <Name>Lady Madonna</Name>
    </participant>
    …
    <participant id="s2-3">
      <attentionStatus>inSelection</attentionStatus>
      <Name>Revolution 9</Name>
    </participant>
  </participant>
  …
</participant>
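As an illustration of how a consumer such as the dialogue manager might read the fragment above, the following sketch extracts the participant whose attentionStatus is inSelection. The element names follow the slides; the parsing code itself is illustrative and not part of MIAMM.

# Illustrative parsing of the MMIL fragment above: find the participant(s)
# whose attentionStatus is "inSelection".
import xml.etree.ElementTree as ET

fragment = """
<participant id="set2">
  <individuation>set</individuation>
  <attentionStatus>inFocus</attentionStatus>
  <participant id="s2-3">
    <attentionStatus>inSelection</attentionStatus>
    <Name>Revolution 9</Name>
  </participant>
</participant>
"""

root = ET.fromstring(fragment)
selected = [p.findtext("Name")
            for p in root.iter("participant")
            if p.findtext("attentionStatus") == "inSelection"]
print(selected)        # ['Revolution 9']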

Page 19

Future work

• Within the dialogue manager module, candidate domains may be compared using a relevance criterion

  The way the linguistic constraints of the referring expression apply in the different domains may serve as such a criterion (see the sketch below).

• Validation in the MIAMM architecture

  The transition from deictic to haptic need not come at an additional cost for the development of a dialogue system, both from the architecture point of view and from the dialogue management point of view.
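One possible reading of this relevance criterion, as an illustrative sketch only (the scoring function is invented): each candidate domain is scored by the proportion of the expression's constraints that some object of the domain satisfies, and the best-scoring domain is kept.

# Sketch of the relevance criterion idea: score each candidate domain by how
# well the expression's linguistic constraints apply in it (scoring invented).
def constraint_score(domain, expression):
    """Fraction of constraints satisfied by at least one object of the domain."""
    satisfied = sum(any(o.get(k) == v for o in domain) for k, v in expression.items())
    return satisfied / len(expression) if expression else 0.0

def most_relevant(domains, expression):
    return max(domains, key=lambda d: constraint_score(d, expression))

visual = [{"category": "track", "genre": "rock"}]
haptic = [{"category": "track", "genre": "jazz"}]
print(most_relevant([visual, haptic], {"category": "track", "genre": "jazz"}))
# -> the haptic domain (both constraints are satisfied there)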