ALIAS WP3 Results

  • WP 3 Presentation: Dialogue Manager. Jürgen Geiger
  • Overview (04.06.2013, WP 3 Presentation, slide 2)
      • Goals
      • Achievements
      • Open Questions
      • List of Publications
  • Goals
      • Dialogue Manager: back-end for the HMI; controls all other modules
      • Applications: games, reading service, physiological monitoring
  • Tasks
      • T3.1 User identification via speech or face recognition
      • T3.2 Knowledge representation
      • T3.3 Development of a dialogue system
      • T3.4 Development and integration of a game collection
      • T3.5 Web 2.0 wrapper for web services
      • T3.6 Integration of further software modules
      • T3.7 Adaptable behaviour of the robot platform
      • T3.8 Integration of natural language understanding
      • T3.9 Physiological monitoring
      • T3.10 Integration of the physiological monitoring into the dialogue manager
  • Deliverables
      Name                                                Due
      D3.1  Report on the dialogue manager concept        09/2010
      D3.2  Knowledge databases                           04/2011
      D3.3  Identification System (face & voice)          01/2011
      D3.4  Prototype of dialogue manager                 04/2011
      D3.5  Physiological Monitoring (PM)                 02/2011
      D3.6  Dialogue system with integrated PM            06/2011
      D3.7  Dialogue system updated to users' needs       05/2012
      D3.8  Final dialogue system with integrated PM      02/2013
  • Achievements
      • Dialogue Manager
      • Control of all other modules
      • Natural language understanding
      • Software modules
      • Physiological Monitoring
      • User Identification
      • Adaptable behaviour
      • Emotions
      • Physiological Monitoring
  • Dialogue Manager: Overview (T3.3, D3.1, D3.4, D3.7, D3.8)
      • Central component of the ALIAS robot (its "brain")
      • Reproduces the basic mechanisms of human thinking
      • Decides on the behavior of the robot
      • Communicates with all other modules
      • [Figure: architecture diagram. Labels: "Hello Robot!", TTS, Face Detect, ASR, Robot Control, GUI, Touch Screen, DM Core, Situation Model, Action, Input, CES, Understanding, Physio Monitor]
  • Dialogue Manager: Overview (components)
      • DM-Core (brain)
          • NLU-Engine: understands human verbal messages
          • Decision-Engine: decides on the behavior
          • Based on conceptual event representations (human thinking)
      • DM-Communicator
          • Communicates with sensing and acting modules
          • Translates between modules and DM-Core
  • Natural Language Understanding
      • NLU-Engine (T3.8, D3.2)
          • Based on Cognesys CES technology
          • Extracts and processes the conceptual meaning of verbal messages
          • Robust to syntactically or grammatically degraded input
          • Uses knowledge and the current situation to identify statements and check whether they are practicable
      • NLU-Knowledge Database (T3.2, D3.2)
          • World knowledge: understands the world in general, simulates human memory
          • Expert knowledge: understands the world of elderly people and depends on the robot's functionality
  • Acting and Behavior (T3.8, D3.2)
      • Decision-Engine
          • Based on Cognesys CES technology
          • Processes conceptual event representations like humans do
          • Uses a situation model like human memory
      • Situation model
          • Represents the currently relevant objects and their states and modalities
          • Represents the history of events that constitutes the current situation
      • Proactive behavior
          • Example: informs the user about new mails, invites the user to stay in contact with their relatives
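The situation model and the proactive mail example above can be illustrated with a small data structure. This is only a sketch under stated assumptions: the class `SituationModel`, the function `proactive_action`, and the `mailbox`/`unread` names are hypothetical and are not taken from the ALIAS code or from Cognesys CES.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class SituationModel:
    """Hypothetical store of current object states plus the event history."""
    objects: Dict[str, Dict[str, object]] = field(default_factory=dict)
    history: List[Tuple[str, str, object]] = field(default_factory=list)

    def update(self, obj: str, attribute: str, value: object) -> None:
        # Record the new state and remember the event that produced it.
        self.objects.setdefault(obj, {})[attribute] = value
        self.history.append((obj, attribute, value))


def proactive_action(model: SituationModel) -> Optional[str]:
    """Return a prompt for the user if the situation calls for one."""
    unread = model.objects.get("mailbox", {}).get("unread", 0)
    if isinstance(unread, int) and unread > 0:
        return f"You have {unread} new mail(s)."
    return None


model = SituationModel()
model.update("mailbox", "unread", 2)
message = proactive_action(model)
```

Keeping the history separate from the current states mirrors the two roles listed on the slide: what is relevant now, and which events led there.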
  • Dialogue Management (T3.6)
      • ASR Adapter
          • Receives spoken user input as text
          • The NLU-Engine processes the text
      • GUI Adapter
          • Controls the GUI, processes user input
          • Menus: games, TV, audio books, email
          • Skype call and alarm call control flow
          • Synchronizes the GUI menus with BCI masks
      • BCI Adapter
          • Controls the Brain Computer Interface masks
          • Processes user inputs
  • Dialogue Management (continued)
      • TTS Adapter
          • Sends text to be spoken to the Text-To-Speech module
      • RD Adapter
          • Interface to the robot's low-level controllers
          • Controls navigation and movement behavior
          • Controls the robot's head emotions
          • Receives speaker identification information
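The adapter idea on the Dialogue Management slides (module-specific input is translated for the core, and the core's decisions are sent back out through other adapters) might look roughly like the following. It is a minimal sketch, assuming a hypothetical `Message` type, a `DMCommunicator` class, and a `toy_core` stand-in; none of these names come from the ALIAS implementation.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Message:
    """Module-agnostic message; the field names are hypothetical."""
    channel: str  # adapter name, e.g. "asr" for input or "tts" for output
    payload: str  # recognised text, text to speak, or a command string


@dataclass
class DMCommunicator:
    """Routes messages between module adapters and a core callback."""
    core: Callable[[Message], List[Message]]
    outputs: Dict[str, Callable[[str], None]] = field(default_factory=dict)

    def register_output(self, channel: str, send: Callable[[str], None]) -> None:
        self.outputs[channel] = send

    def on_input(self, channel: str, payload: str) -> None:
        # Wrap the module-specific input, let the core decide, then forward
        # each reply to the output adapter named in the reply's channel.
        for reply in self.core(Message(channel, payload)):
            self.outputs[reply.channel](reply.payload)


def toy_core(msg: Message) -> List[Message]:
    """Stand-in for the DM-Core: answers a spoken greeting via TTS."""
    if msg.channel == "asr" and "hello" in msg.payload.lower():
        return [Message("tts", "Hello! How can I help?")]
    return []


spoken: List[str] = []           # collects what the TTS adapter would say
dm = DMCommunicator(core=toy_core)
dm.register_output("tts", spoken.append)
dm.on_input("asr", "Hello robot")
```

With this shape, adding a further module only requires registering another output callback or calling `on_input` with a new channel name; the core itself stays unchanged.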
  • User identification: speech (T3.1, D3.3)
      • Research aspects
          • Speaker diarization
          • Overlap detection
          • Speech activity detection
      • Implementation for the robot
  • Research aspects
      • Speaker diarization
          • Who speaks when?
          • Utilise the output of a speech transcription system to suppress linguistic variation
      • Overlap detection
          • Overlapping speech degrades performance
          • Detect & handle overlap
      • Voice activity detection
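The slides do not say which voice activity detection method was used. As a generic illustration only, the simplest textbook approach thresholds the short-time energy of each frame; the function names, frame length, and threshold below are assumptions chosen for the example.

```python
from typing import List


def frame_energies(samples: List[float], frame_len: int) -> List[float]:
    """Mean squared amplitude of each complete, non-overlapping frame."""
    return [
        sum(x * x for x in samples[i:i + frame_len]) / frame_len
        for i in range(0, len(samples) - frame_len + 1, frame_len)
    ]


def detect_voice(samples: List[float], frame_len: int = 4,
                 threshold: float = 0.01) -> List[bool]:
    """Flag a frame as active when its energy exceeds the threshold."""
    return [e > threshold for e in frame_energies(samples, frame_len)]


# Silence, a loud burst, then low-level noise (three frames of four samples).
signal = [0.0] * 4 + [0.5, -0.5, 0.5, -0.5] + [0.01] * 4
flags = detect_voice(signal)
```

A fixed energy threshold is fragile in noisy rooms, which is one reason voice activity detection is listed as a research aspect rather than a solved step.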
  • Speaker Recognition: Implementation
      • Integrated with DM
      • Running permanently
      • DM receives name of speaker
      • Used during TTS output to call the user by name
  • User Identification: Face (T3.1, D3.3)
      • Omnidirectional camera
      • Viola & Jones algorithm for face detection
      • Fusion with laser-based leg pair detection
      • Face identification using Eigenfaces
      • Keep eye contact with user
  • Gaming with Speech Control (T3.4, D3.8)
      • Control game via ASR
      • Noughts and crosses
      • AI to control computer player
      • Touchscreen control also possible
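The slide does not describe how the computer player for noughts and crosses works. One common choice for such a small game is exhaustive minimax search, sketched below as an assumption rather than as the ALIAS implementation; the board encoding (a list of nine cells holding "X", "O", or " ") and the function names are hypothetical.

```python
from typing import List, Optional

# All eight winning lines on a 3x3 board indexed 0..8, row by row.
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8), (0, 3, 6),
         (1, 4, 7), (2, 5, 8), (0, 4, 8), (2, 4, 6)]


def winner(board: List[str]) -> Optional[str]:
    for a, b, c in LINES:
        if board[a] != " " and board[a] == board[b] == board[c]:
            return board[a]
    return None


def minimax(board: List[str], player: str, me: str) -> int:
    """Value of the position for `me` with `player` to move: +1, 0, or -1."""
    won = winner(board)
    if won is not None:
        return 1 if won == me else -1
    moves = [i for i, cell in enumerate(board) if cell == " "]
    if not moves:
        return 0  # board full without a winner: draw
    other = "O" if player == "X" else "X"
    scores = []
    for m in moves:
        board[m] = player
        scores.append(minimax(board, other, me))
        board[m] = " "  # undo the trial move
    return max(scores) if player == me else min(scores)


def best_move(board: List[str], me: str) -> int:
    """Pick the first empty cell with the highest minimax value for `me`."""
    other = "O" if me == "X" else "X"
    best, best_score = -1, -2
    for m in (i for i, cell in enumerate(board) if cell == " "):
        board[m] = me
        score = minimax(board, other, me)
        board[m] = " "
        if score > best_score:
            best, best_score = m, score
    return best


board = list("XX OO    ")   # X holds cells 0 and 1, O holds cells 3 and 4
move = best_move(board, "X")
```

Because the full game tree is tiny, no pruning or depth limit is needed; a speech or touchscreen front end would only have to map the user's input to a cell index.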
  • Reading Service (T3.5, D3.8)
      • Customised GUI
      • Based on open-source software
      • Functionality
          • Read out e-books
          • Recognition from camera
  • Display of Emotions (T3.7, D3.8)
      • Can ALIAS display emotions?
      • 5 basic emotions (Disgust, Fear, Joy, Sadness, Surprise)
      • Integrated into Dialogue System
      • [Figure panels: Disgust, Neutral, Sadness]
  • Physiological Monitoring (T3.9, T3.10, D3.5, D3.6)
      • Vital function monitoring system
      • Recording, saving, display of vital function data
      • Manual data input
      • Data input directly by sensors
      • Alarm function for suspicious data values
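An alarm function for suspicious values can be as simple as a range check per measurement type. The sketch below is illustrative only: the `Reading` type, the measurement names, and the limits in `LIMITS` are hypothetical placeholders, not values from the ALIAS system and not clinical guidance.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

# Hypothetical plausibility ranges, for illustration only.
LIMITS: Dict[str, Tuple[float, float]] = {
    "heart_rate_bpm": (40.0, 130.0),
    "spo2_percent": (90.0, 100.0),
}


@dataclass
class Reading:
    kind: str      # measurement type, a key of LIMITS
    value: float
    source: str    # "manual" or "sensor", matching the two input paths


def check_alarms(readings: List[Reading]) -> List[str]:
    """Return one alarm text per reading outside its configured range."""
    alarms = []
    for r in readings:
        if r.kind not in LIMITS:
            continue  # no range configured for this measurement type
        low, high = LIMITS[r.kind]
        if not low <= r.value <= high:
            alarms.append(
                f"{r.kind}={r.value} outside [{low}, {high}] ({r.source})"
            )
    return alarms


readings = [
    Reading("heart_rate_bpm", 72.0, "sensor"),
    Reading("spo2_percent", 85.0, "manual"),
]
alarms = check_alarms(readings)
```

Recording the source with each reading keeps manually entered and sensor-delivered values distinguishable when an alarm is reviewed.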
  • Open questions
      • Personal data: storage and usage
          • Person ID, physiological monitoring
          • Who gets access?
      • Learning how to use the robot
          • Self-explanatory system
          • System adapts to the user
          • Tablet PC?
  • Selected Publications
      • J. Geiger, M. Hofmann, B. Schuller, and G. Rigoll: "Gait-based Person Identification by Spectral, Cepstral and Energy-related Audio Features," ICASSP 2013
      • J. Geiger, T. Leykauf, T. Rehrl, F. Wallhoff, and G. Rigoll: "The Robot ALIAS as a Gaming Platform for Elderly Persons," AAL-Kongress 2013
      • J. Geiger, I. Yenin, T. Rehrl, F. Wallhoff, and G. Rigoll: "Display of Emotions with the Robotic Platform ALIAS," AAL-Kongress 2013
      • T. Rehrl, J. Geiger, M. Golcar, S. Gentsch, J. Knobloch, and G. Rigoll: "The Robot ALIAS as a Database for Health Monitoring for Elderly People," AAL-Kongress 2013
      • T. Rehrl, R. Troncy, A. Bley, S. Ihsen, K. Scheibl, W. Schneider, S. Glende, S. Goetze, J. Kessler, C. Hintermueller, and F. Wallhoff: "The Ambient Adaptable Living Assistant is Meeting its Users," AAL-Forum 2012
      • T. Rehrl, J. Blume, A. Bannat, G. Rigoll, and F. Wallhoff: "On-line Learning of Dynamic Gestures for Human-Robot Interaction," KI 2012
      • J. Geiger, R. Vipperla, S. Bozonnet, N. Evans, B. Schuller, and G. Rigoll: "Convolutive Non-Negative Sparse Coding and New Features for Speech Overlap Handling in Speaker Diarization," INTERSPEECH 2012
      • R. Vipperla, J. Geiger, S. Bozonnet, D. Wang, N. Evans, B. Schuller, and G. Rigoll: "Speech Overlap Detection and Attribution Using Convolutive Non-Negative Sparse Coding," ICASSP 2012
      • J. Geiger, M. Lakhal, B. Schuller, and G. Rigoll: "Learning new acoustic events in an HMM-based system using MAP adaptation," INTERSPEECH 2011
      • T. Rehrl, J. Blume, J. Geiger, A. Bannat, F. Wallhoff, S. Ihsen, Y. Jeanrenaud, M. Merten, B. Schnebeck, S. Glende, and C. Nedopil: "ALIAS: Der anpassungsfähige Ambient Living Assistent," AAL-Kongress 2011