Statistical and Empirical Approaches to Spoken Dialog Systems

Statistical and Empirical approaches for spoken dialog systems
Workshop proposal for AAAI-06 (Boston)
Organizers: Jason D. Williams, Steve Young, Pascal Poupart, Stephanie Seneff

1) Workshop topic

A description of the workshop topic. Identify the specific issues on which the workshop will focus.

Spoken dialog systems are machines which interact with people using spoken language. This workshop seeks to draw new work on statistical and empirical approaches for spoken dialog systems. We welcome both theoretical and applied work, addressing issues such as:

- Representations and data structures for dialog models suitable for machine learning
- Methods for automatic generation and improvement of dialog managers incorporating machine learning
- Ontology representations and integration methods suitable for machine learning
- Techniques to accurately simulate human-computer dialog
- Creation, use, and evaluation of user models
- Methods for automatic evaluation of dialogue systems
- Investigations into appropriate optimization criteria for spoken dialog systems
- Applications and real-world examples of spoken dialog systems incorporating statistical or empirical techniques
- Use of statistical or empirical techniques within multi-modal dialog systems
- Application of statistical or empirical techniques to multi-lingual spoken dialog systems
- The use and application of techniques and methods from related areas, such as cognitive science, operations research, emergence models, etc.

2) Motivation

A brief discussion of why the topic is of particular interest at this time.

Although the low-level speech recognition component of spoken dialog systems has long been framed as a statistical pattern classifier trained on data, most approaches to the higher-level dialog management components have been handcrafted. Recently a number of researchers have begun exploring how dialog management can be approached as a machine learning problem. This interest has been driven by several factors:

- Growing availability of dialog data corpora
- Emergence of new optimization techniques and computing power able to scale to dialog management problems, for example in reinforcement learning
- Realization that the design and testing of spoken dialog systems is time-consuming and expensive
- Failure of hand-crafted approaches to dialog management to demonstrate robust behavior in the face of inaccurate speech recognition, and to move reliably beyond simple types of systems

3) Format

A brief description of the proposed workshop format, regarding the mix of events such as paper presentations, invited talks, panels, and general discussion.

We envisage approximately 3 paper presentation sessions (each with approximately 4 papers) mixed with approximately 2 invited speakers.

For the invited speakers, we envisage distinguished members of the dialog/speech community and the machine learning community. We have identified several candidates but have not yet approached them. Our aims for invited speakers are to: provide views on issues such as how dialog management/dialog modeling can be represented as a machine learning problem, explain methods for machine learning of interest to the dialog management community, suggest how to scale machine learning to problems in this domain, and propose interesting research questions.

For the paper sessions, we would like to foster interaction and discussion. After each paper is presented, time will be left for questions and discussion. At the end of each session, additional time will be reserved for general discussion about that session as a whole.

4) Length

An indication as to whether the workshop should be considered for a half-day, one or two-day meeting.

We envisage a one-day meeting.

5) Organizing committee

The names and full contact information (email and postal addresses, fax and telephone numbers) of the organizing committee (three or four people knowledgeable in the field) and short descriptions of their relevant expertise. Strong proposals include organizers who bring differing perspectives to the workshop topic and who are actively connected to the communities of potential participants.

Jason D. Williams
University of Cambridge
53A Marlow Road
London SE20 7YG
United Kingdom
+44 7786 683 013
[email protected]

Jason Williams has been working full-time on spoken dialog systems for the past 8 years, dividing his time evenly between research and commercial deployments. In industry, he has built telephone-based spoken dialog systems for a host of companies such as Sony, BMW, Lowe's, Travelocity, and the Home Shopping Network. In research, he has focused on applying Partially Observable Markov Decision Processes (POMDPs) to dialog management problems. In this pursuit, he has explored data collection methods, dialog model representations, and optimization techniques for POMDPs.

Steve Young
University of Cambridge
Engineering Department
Trumpington Street
Cambridge CB2 1PZ
+44 1223 332 654
[email protected]

Steve Young is Head of the Information Engineering Division at Cambridge University. Previously he was Chief Scientist at Entropic Inc and an Architect in the Speech Products group at Microsoft. He has experience of using statistical methods in all aspects of speech and language processing, including recognition, understanding and dialogue management. His most recent work, conducted as part of the European EC Talk Project, has focused on applying Partially Observable MDPs to practical dialogue information systems.

Pascal Poupart
School of Computer Science
University of Waterloo
200 University Avenue West
Waterloo, Ontario
Canada N2L 3G1
+1 519 888 4567 x 6239
[email protected]

Pascal Poupart is an assistant professor in the School of Computer Science at the University of Waterloo in Canada. His research focuses on the development of decision-theoretic planning and statistical machine learning techniques, which he has applied to a range of applications, including spoken dialog systems, assistive technologies for dementia patients and ontology learning. In particular, some of his recent work includes the development of robust dialogue management algorithms based on partially observable Markov decision processes.

Stephanie Seneff
Spoken Language Systems Group
MIT Computer Science and Artificial Intelligence Laboratory
MIT Stata Center
32 Vassar Street
Cambridge, MA 02139
USA
+1 617 253 0451
[email protected]

Stephanie Seneff is a Principal Research Scientist in the Spoken Language Systems group at the Computer Science and Artificial Intelligence Laboratory at MIT. She has been conducting research on all aspects of spoken dialogue system development for the past 15 years, and has played a significant role in the development of mixed-initiative telephone-access dialogue systems in many different domains (weather, flights, restaurants, etc.). Her recent interests include generic spoken language understanding, generic dialogue modeling, portability and robustness in dialogue systems, user simulation, and multimodal and multilingual dialogue systems.

6) Potential attendees

A list of potential attendees.

Note: the attendees listed below have not been contacted; this is an illustrative list of people who are either active in this area, or who have attended similar workshops in the recent past:

Ingrid Zukerman, Monash University, Australia
Jan Alexandersson, DFKI GmbH, Germany
Arne Jönsson, Linköping University, Sweden
Genevieve Gorrell, Linköping University, Sweden
Dan Bohus, Carnegie Mellon University, USA
Tim Paek, Microsoft Research, USA
Alex Rudnicky, Carnegie Mellon University, USA
Jim Glass, MIT, USA
Victor Zue, MIT, USA
Grace Chung, MIT, USA

Alex Gruenstein, MIT, USA
Ed Filisko, MIT, USA
Matthias Denecke, NTT Computer Science Laboratories, Japan
Ian Lane, ATR Spoken Language Communication Research Laboratories, Japan
Mihai Rotaru, University of Pittsburgh, USA
Nils Dahlbäck, Linköping University, Sweden
Diane Litman, University of Pittsburgh, USA
Marilyn Walker, University of Sheffield, UK
Joe Polifroni, University of Sheffield, UK
Nate Blaylock, Saarland University, Germany
Antoine Raux, Carnegie Mellon University, USA
Verena Rieser, Saarland University, Germany
Jost Schatzmann, Cambridge University, UK
Gabriel Skantze, KTH - Royal Institute of Technology, Sweden
Matt Stuttle, University of Cambridge, UK
Stefanie Tomko, Carnegie Mellon University, USA
Oliver Lemon, University of Edinburgh, UK
Jamie Henderson, University of Edinburgh, UK
Roi Georgila, University of Edinburgh, UK
Ryuichiro Higashinaka, University of Sheffield, UK
Stephen Choularton, Macquarie University, Australia
Stephen Cox, University of East Anglia, UK
Gokhan Tur, AT&T Research, USA
Dilek Hakkani-Tur, AT&T Research, USA
Giuseppe Di Fabbrizio, AT&T Research, USA
Dan Jurafsky, Stanford University, USA
Manny Rayner, NASA, USA
Elizabeth Shriberg, SRI, USA
Johan Boye, Telia Research, Sweden
Sandra Carberry, University of Delaware, USA
Peter Heeman, Oregon Graduate Institute, USA
Eric Horvitz, Microsoft Research, USA
Kazunori Komatani, Kyoto University, Japan
Staffan Larsson, Göteborgs Universitet, Sweden
Michael McTear, University of Ulster, UK
Norbert Reithinger, DFKI, Germany
Candy Sidner, MERL, USA
David Traum, USC Institute for Creative Technologies, USA
Joelle Pineau, McGill University, Canada
Nick Roy, MIT, USA
Satinder Singh, University of Michigan, USA
