Cooperative Multimodal Augmented Reality Labyrinth

Didier Perroud

Raynald Seydoux

Frédéric Barras

Abstract
Objectives
Modalities
◦ Project modalities
◦ CASE/CARE
Implementation
◦ VICI, iPhone, Voice recognition, Network
Demonstration
Conclusion

Summary

Two people coordinate to move a ball through a labyrinth
The board can be rotated about the x and y axes
Gates can be opened with voice and gesture commands

Abstract

Coordinate the following technologies:
◦ Augmented reality with tags
◦ Gesture detection (with iPhone accelerometers)
◦ Voice recognition (words)
◦ Collaborative environments
◦ Physics engine

Objectives

Inputs
◦ Hand rotation about the x and y axes (one axis per player): direct manipulation of the labyrinth board
◦ Hand pumping to open gates
◦ Voice recognition (words) to select the gate to open and to start the game
Outputs
◦ Image on the projector
◦ iPhone vibrations

Modalities – Project Modalities
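
As a rough illustration of how the two orientation inputs could drive the ball, the sketch below turns one tilt angle per player into an in-plane acceleration. The function name, the sign convention, the assignment of axes to players, and the frictionless model that treats the two axes independently are all assumptions made for illustration; the project's actual physics engine is not described in these slides.

```python
import math

G = 9.81  # gravitational acceleration in m/s^2


def ball_acceleration(tilt_x_deg, tilt_y_deg):
    """Return (ax, ay) for a frictionless ball on a tilted board.

    tilt_x_deg: board rotation about the x axis (assumed: player 1).
    tilt_y_deg: board rotation about the y axis (assumed: player 2).
    Rotating about x makes the ball slide along y, and vice versa.
    The sign convention is an arbitrary choice for this sketch.
    """
    ax = G * math.sin(math.radians(tilt_y_deg))
    ay = -G * math.sin(math.radians(tilt_x_deg))
    return ax, ay
```

With both angles at zero the ball does not accelerate, so neither player alone can steer the ball along both directions, which is what forces the two players to coordinate.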

CASE
◦ Semantic level of abstraction
CARE
◦ Gesture orientation: assignment
◦ Gesture pumping / voice selection: complementary, to open a gate
◦ Voice commands: assignment
Decision-level fusion
Fission: image, vibration

Modalities – CASE/CARE
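
The complementary use of voice selection and the pumping gesture can be sketched as a small decision-level fusion step: each recogniser delivers an already interpreted event, and the two are combined only if they fall within a time window. The class name, the two-second window, and the rule that the voice selection comes first are assumptions for illustration, not the project's documented behaviour.

```python
from dataclasses import dataclass
from typing import Optional

FUSION_WINDOW_S = 2.0  # assumed maximum delay between voice and gesture


@dataclass
class GateFusion:
    """Decision-level fusion of a voice selection and a pump gesture."""
    selected_gate: Optional[int] = None
    selected_at: float = 0.0

    def on_voice(self, gate: int, now: float) -> None:
        # Voice selects which gate the next pump gesture applies to.
        self.selected_gate = gate
        self.selected_at = now

    def on_pump(self, now: float) -> Optional[int]:
        # Return the gate to open, or None if no recent selection exists.
        if self.selected_gate is None:
            return None
        if now - self.selected_at > FUSION_WINDOW_S:
            self.selected_gate = None  # selection expired
            return None
        gate, self.selected_gate = self.selected_gate, None
        return gate
```

A pump gesture without a recent voice selection returns None, so neither modality can open a gate on its own.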

Blocks
◦ Webcam, tag detection
◦ OpenGL, physics engine
Multimodality management
◦ State machine
Augmented reality application
◦ Event based
Messages from the gateway
◦ Voice events
◦ Gesture events (orientation X and Y, shake)
Messages to the gateway
◦ Vibration events

Implementation VICI
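
An event-based state machine for the multimodality management could look like the minimal sketch below. The state names, the transition table, and the choice to let "pause" toggle between playing and paused are assumptions for illustration; the slides only state that a state machine is used and that voice events arrive from the gateway.

```python
from enum import Enum, auto


class State(Enum):
    IDLE = auto()
    PLAYING = auto()
    PAUSED = auto()
    FINISHED = auto()


# (current state, event) -> next state; pairs not listed are ignored.
TRANSITIONS = {
    (State.IDLE, "new game"): State.PLAYING,
    (State.PLAYING, "pause"): State.PAUSED,
    (State.PAUSED, "pause"): State.PLAYING,  # assumed: "pause" toggles
    (State.PAUSED, "new game"): State.PLAYING,
    (State.IDLE, "exit"): State.FINISHED,
    (State.PLAYING, "exit"): State.FINISHED,
    (State.PAUSED, "exit"): State.FINISHED,
}


def step(state, event):
    """Return the next state for an incoming gateway event."""
    return TRANSITIONS.get((state, event), state)
```

Keeping the transitions in a table means an unexpected event, such as "pause" before a game has started, simply leaves the state unchanged.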

Handle the UIAccelerometer interface
Generate a motionEvent when shaking
Messages to the gateway
◦ Orientations (X or Y)
◦ Shake
Messages from the gateway
◦ Vibrate

Implementation iPhone
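
One simple way to decide that the phone is being shaken is to compare consecutive accelerometer samples and report a shake when the change exceeds a threshold. The sketch below shows that idea in Python on plain (x, y, z) tuples expressed in g; the function name and the threshold value are assumptions, and the actual application runs on the iPhone against the UIAccelerometer interface rather than this code.

```python
import math

SHAKE_THRESHOLD_G = 1.0  # assumed threshold, in units of g


def detect_shake(samples, threshold=SHAKE_THRESHOLD_G):
    """Return True if any two consecutive samples differ by more than threshold.

    samples: sequence of (x, y, z) accelerometer readings in g.
    """
    for prev, cur in zip(samples, samples[1:]):
        # Euclidean distance between consecutive acceleration vectors.
        if math.dist(prev, cur) > threshold:
            return True
    return False
```

Slow tilting for the orientation input changes the readings gradually and stays under the threshold, so the two gestures can share the same sensor.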

Windows Speech API SDK
Features:
◦ API definition files
◦ Runtime component
◦ Control Panel applet
◦ Text-to-speech engines in multiple languages
◦ Speech recognition engines in multiple languages
◦ Redistributable components
◦ Sample application code
◦ Sample engines
◦ Documentation

Implementation Voice 1/3

Our System
A speech recognition engine
A grammar

<grammar xmlns="http://www.w3.org/2001/06/grammar"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://www.w3.org/2001/06/grammar http://www.w3.org/TR/speech-grammar/grammar.xsd"
         xml:lang="en-EN" version="1.0">
  <rule id="Labyrinth" scope="public">
    <one-of>
      <item>New game</item>
      <item>Pause</item>
      <item>Exit</item>
      <item>Open gate one</item>
      <item>Open gate two</item>
      <item>Close gate one</item>
      <item>Close gate two</item>
    </one-of>
  </rule>
</grammar>
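
Once the engine returns one of the phrases listed in the grammar, the application still has to turn the text into a command. The sketch below shows one possible mapping from a recognised phrase to an (action, gate) pair; the function name and the returned tuple format are assumptions for illustration and are not taken from the project.

```python
NUMBER_WORDS = {"one": 1, "two": 2}
GLOBAL_COMMANDS = ("new game", "pause", "exit")


def parse_command(phrase):
    """Map a recognised phrase to (action, gate), or None if it is unknown.

    gate is None for commands that do not target a gate.
    """
    words = phrase.lower().split()
    # Gate commands have the form "<open|close> gate <number word>".
    if len(words) == 3 and words[0] in ("open", "close") and words[1] == "gate":
        gate = NUMBER_WORDS.get(words[2])
        return (words[0], gate) if gate is not None else None
    text = " ".join(words)
    if text in GLOBAL_COMMANDS:
        return (text, None)
    return None
```

For example, `parse_command("Open gate one")` yields `("open", 1)`, while a phrase outside the grammar yields `None`.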


Recognition comparison before training / after training


Implementation Network

Live Videos

Demonstration

Problems with the physics engine
◦ Coordination between user moves and physics moves
Voice recognition OK
High-level programming
Heterogeneity not a problem
Functional prototype

Conclusion

Thank you

Questions?
