
RIPS, Torneo Mexicano de Robótica 2012


The robot Golem-II+ is the latest service robot developed by the Golem Group. We design and construct domain-independent service robots based on a theory of Human-Robot Communication centered on the interaction context and on the specification of speech act protocols, which are named Dialogue Models (DMs).

HARDWARE

- PeopleBot robotic base
- Dell Precision M4600
- QuickCam Pro 9000 webcam
- Microsoft Kinect camera x2
- Hokuyo UTM-30LX laser
- Shure Base omnidirectional microphone x3
- RODE VideoMic directional microphone
- M-Audio Fast Track Ultra external sound interface
- Infinity 3.5-inch two-way loudspeaker x2
- LAIDETEC-IIMAS robotic arm x2

IOCA

DMs are embedded in an Interaction-Oriented Cognitive Architecture (IOCA), which coordinates perceptual and motor actions within the main communication cycle. IOCA also includes a set of reactive behaviors that allow the robot to respond directly, and a coordinator that lets reactive behaviors and inferential processes proceed smoothly together. DMs and IOCA make it possible to model structured, multimodal conversations between people and service robots in diverse application domains. Moreover, DMs are defined and manipulated independently of IOCA's modules, which remain fixed across application domains and tasks.
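As an illustration of this communication cycle, here is a minimal Python sketch of a recognition-interpretation-specification loop built around a dialogue manager; all class and method names are hypothetical and do not correspond to the actual Golem-II+ implementation.

```python
# Sketch of an IOCA-style interaction cycle (hypothetical names): recognition
# devices produce modal codifications, the interpretation module turns them
# into interpretations in the light of the current expectations, the dialogue
# manager advances the DM, and the specification module renders the selected
# action protocol as concrete actions.

class InteractionCycle:
    def __init__(self, recognizers, interpreter, dialogue_manager, renderer):
        self.recognizers = recognizers            # e.g. speech, vision, gestures
        self.interpreter = interpreter            # modal codifications -> interpretations
        self.dialogue_manager = dialogue_manager  # holds the current DM situation
        self.renderer = renderer                  # action protocols -> concrete actions

    def step(self):
        # 1. Perceive: collect modal codifications from every recognition device.
        codifications = [r.recognize() for r in self.recognizers]
        # 2. Interpret them against the expectations of the current situation.
        expectations = self.dialogue_manager.current_expectations()
        interpretation = self.interpreter.interpret(codifications, expectations)
        # 3. Select the action of the best-met expectation and move to the next situation.
        action_protocol = self.dialogue_manager.advance(interpretation)
        # 4. Render the action protocol as concrete motor and speech actions.
        self.renderer.render(action_protocol)

    def run(self):
        while not self.dialogue_manager.finished():
            self.step()
```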

GOLEM-II+
The Golem Group
Computer Science Department (Departamento de Ciencias de la Computación)
Instituto de Investigaciones en Matemáticas Aplicadas y en Sistemas
Universidad Nacional Autónoma de México
http://golem.iimas.unam.mx
[email protected]

[IOCA architecture diagram, with components: Recognition Devices, Interpretation Module, Modal Codifications, Interpretations, Dialogue Manager, Semantic Memory, Expectations, Specification Module, Action Protocols, Concrete Actions, Rendering Devices, Coordinator, Autonomous Reactive Systems, Perceptual Memory, and the external representations or the world.]

A full DM can be embedded within a situation. Situations, expectations and actions can be defined parametrically as functions of the interaction context. The next situation is determined by the current interpretation and by the task history.

During task execution the robot performs the action whose expectation is best met in the situation. If no expectation is satisfied by the world, the robot is "out of context" and performs a recovery protocol to place itself back in context and re-establish the common ground.
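The following Python sketch illustrates this selection step under the assumption that expectations can be scored against the current interpretation; the function names and the trivial matching rule are illustrative, not the Golem-II+ implementation.

```python
# Sketch of advancing a situation, with out-of-context recovery (illustrative).
# A situation is a list of (expectation, action, next_situation) triples; the
# expectation best met by the current interpretation determines the action.

def advance(situation, interpretation, history):
    scored = [(match(exp, interpretation, history), act, nxt)
              for (exp, act, nxt) in situation]
    best_score, action, next_situation = max(scored, key=lambda t: t[0])
    if best_score <= 0:
        # No expectation is satisfied by the world: the robot is "out of
        # context" and runs a recovery protocol to re-establish common ground.
        return recovery_protocol(situation, interpretation)
    return action, next_situation

def match(expectation, interpretation, history):
    # Expectations may be parametric on the interaction context; this
    # placeholder only checks literal equality.
    return 1 if expectation == interpretation else 0

def recovery_protocol(situation, interpretation):
    # Placeholder recovery: ask for clarification and stay in the same situation.
    return "ask_for_clarification", situation
```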

TEAM MEMBERS

Luis Pineda (team leader), Ivan Meza, Caleb Rascón, Gibran Fuentes, Carlos Gershenson, Mario Peña, Iván Sánchez, Mauricio Reyes, Hernando Ortega, Arturo Rodríguez, Lisset Salinas, Joel Durán, Esther Venegas, Varinia Estrada

DIALOGUE MODELS

DMs are built out of information states named "situations", which are defined in terms of expectation and action pairs (E:A). A different set of expectations determines a different situation.

[Diagram: Main Dialogue Model for the Follow Me test, with an embedded Subordinate Dialogue Model for temporal occlusion. Situations (si, ri, v1, v2, fs, ...) are linked by expectation:action pairs such as start:hello, ε:start_button, continue:ε, see(N):see_you, no_detect:dont_see_you, stop_gesture(N):dont_move, occlusion(N):wait, see(N):see_you_again, occlusion(N):something_between, ε:ask_for_help and finish:goodbye.]
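Read as a transition table, such a DM can be written down as data. The sketch below encodes a fragment of it in Python; the expectation and action labels come from the diagram, but the connectivity shown here is only indicative, and the encoding is not the Prolog representation used on the robot.

```python
# Illustrative encoding of a fragment of the Follow Me DM as a transition table:
# situation -> list of (expectation, action, next_situation). Labels follow the
# diagram; the exact connectivity is indicative only.

FOLLOW_ME_DM = {
    "si": [("start", "hello", "ri"),
           ("epsilon", "start_button", "ri")],
    "ri": [("continue", "epsilon", "v1"),
           ("finish", "goodbye", "fs")],
    "v1": [("see(N)", "see_you", "ri"),
           ("no_detect", "dont_see_you", "v2"),
           ("stop_gesture(N)", "dont_move", "ri"),
           ("occlusion(N)", "wait", "occlusion")],   # embedded subordinate DM
    "occlusion": [("see(N)", "see_you_again", "ri"),
                  ("occlusion(N)", "something_between", "occlusion"),
                  ("epsilon", "ask_for_help", "fs")],
    "fs": [],   # final situation
}
```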

SOFTWARE

MODULE : SOFTWARE LIBRARIES
Navigation : Player/Stage and Gearbox
Dialogue Manager : SICStus Prolog
Vision : OpenCV, OpenNI, MOPED
Voice Recognition : JACK, PocketSphinx
Voice Synthesizer : Festival TTS
Object Manipulation : Roboplus
Language Interpretation : WordSpotting, GF grammar

ABILITIES

PERCEPTION AND ACTION FUNCTIONALITY

DISTINCTIVE FUNCTIONALITY: ORIENTATION BY SOURCE OF SOUND

The robot locates the third party by the source of the sound with a perceptual scheme, then rotates to face them and attend the interruption.

[Figure: example interaction. "Hi, Golem!", "Bring me the juice", "Sorry! You said juice, right?", "Yes, please."; a third party interrupts ("¿?") and Golem replies "I'm attending someone else. Please, wait. I'll be with you in a minute."; later: "Please, wait", "Ok. Thanks!"]

The robot and a user are engaged in a task-oriented conversation. A third party interrupts, taking the conversational initiative. The robot records the parameters of the interruption, accomplishes the original task, and then returns to the third party.
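A compact sketch of that interruption-handling protocol, assuming the robot keeps a small queue of pending parties; the helper functions are placeholders, not the actual Golem-II+ modules.

```python
# Illustrative sketch of the interruption protocol: when a third party takes
# the initiative, the robot records the interruption, finishes the current
# task, and only then turns to the pending party.

from collections import deque

pending_interruptions = deque()

def on_interruption(angle_of_arrival, utterance):
    # Record the parameters of the interruption (direction the voice came from
    # and what was said) and ask the third party to wait.
    pending_interruptions.append({"angle": angle_of_arrival, "utterance": utterance})
    say("I'm attending someone else. Please, wait. I'll be with you in a minute.")

def on_task_finished():
    # Once the original task is accomplished, return to the third party.
    while pending_interruptions:
        interruption = pending_interruptions.popleft()
        rotate_towards(interruption["angle"])
        attend(interruption["utterance"])

def say(text): print("Golem:", text)     # placeholder for the speech synthesizer
def rotate_towards(angle): pass          # placeholder for the base controller
def attend(utterance): pass              # placeholder for the dialogue manager
```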

NAVIGATION

VISION

AUDIO LOCALIZATION

Orientation of the robot towards the speaker using 3 microphones

[Figure: triangle microphone array and redundancy]
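The direction of the speaker can be estimated from the time differences of arrival between microphone pairs; the triangle array gives three such pairs, whose redundancy resolves ambiguities. Below is a minimal, generic sketch of that computation (textbook far-field formulation, not the exact algorithm running on Golem-II+; the microphone spacing is an assumed value).

```python
# Minimal sketch of direction-of-arrival estimation from the time difference of
# arrival (TDOA) between two microphones. A triangle array yields three pairs,
# and combining their estimates resolves front/back ambiguity.

import numpy as np

SPEED_OF_SOUND = 343.0   # m/s at room temperature
MIC_DISTANCE = 0.21      # m, assumed spacing between a microphone pair

def tdoa_by_cross_correlation(signal_a, signal_b, sample_rate):
    """Estimate the relative delay (in seconds) between the two signals."""
    corr = np.correlate(signal_a, signal_b, mode="full")
    lag = np.argmax(corr) - (len(signal_b) - 1)
    return lag / sample_rate

def angle_from_tdoa(tdoa):
    """Angle of the source relative to the pair's broadside, in degrees
    (far-field assumption: the wavefront arrives as a plane wave)."""
    s = np.clip(SPEED_OF_SOUND * tdoa / MIC_DISTANCE, -1.0, 1.0)
    return float(np.degrees(np.arcsin(s)))
```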

OBJECT RECOGNITION

[Figure: object model and recognized object]

FACE RECOGNITION

Examples of eigenfaces
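Eigenfaces are the principal components of a set of face images; the sketch below shows the standard way of computing them with a singular value decomposition and of projecting a new face onto them, as a generic illustration of the technique rather than the team's pipeline.

```python
# Minimal eigenfaces sketch: faces are flattened into vectors, the mean face is
# subtracted, and the principal components of the resulting matrix are the
# eigenfaces. Recognition can then be nearest-neighbour in eigenface space.

import numpy as np

def compute_eigenfaces(face_images, num_components=10):
    """face_images: array of shape (n_faces, height, width), grayscale."""
    n, h, w = face_images.shape
    data = face_images.reshape(n, h * w).astype(float)
    mean_face = data.mean(axis=0)
    centered = data - mean_face
    # Rows of vt are the principal directions, i.e. the eigenfaces.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    eigenfaces = vt[:num_components].reshape(-1, h, w)
    return mean_face.reshape(h, w), eigenfaces

def project(face, mean_face, eigenfaces):
    """Coordinates of a face in eigenface space."""
    diff = (face - mean_face).ravel()
    return np.array([float(np.dot(diff, ef.ravel())) for ef in eigenfaces])
```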

PEOPLE TRACKING AND GESTURE RECOGNITION

OBJECT MANIPULATION

Robotic arm (LAIDETEC-IIMAS):
+ 4 degrees of freedom
+ Reach: 10-90 cm
+ Hands with sensors
+ Opening to 30 cm
+ Load: 500 g