POSTECH Dialog-Based Computer Assisted Language Learning System
Intelligent Software Lab. POSTECH Prof. Gary Geunbae Lee
Slide 2
Contents
- Introduction
- Methods
  - DB-CALL System: Example-based Dialog Modeling, Feedback Generation, Translation Assistance, Comprehension Assistance
  - Language Learner Simulation: User Simulation, Grammar Error Simulation
- Discussion
Slide 3
RESEARCH BACKGROUND
Globalization makes English ever more important as a world language, but native-speaker tutors are extremely costly, and most language learning software is dedicated to pronunciation practice. Dialog-based Computer-Assisted Language Learning (DB-CALL) can be an excellent solution.
ISSUES
A DB-CALL system should be able to understand students' poor and non-native expressions. It should have high domain scalability to support various practical scenarios, and it should provide educational functionality that helps students improve their linguistic ability.
Slide 4
PREVIOUS WORK ON DB-CALL
Let's Go (CMU, 2002-2004): provides bus schedule information for CMU's non-native students. Adapts the acoustic model and language model to non-native speakers. Edit-distance-based corrective feedback.
Slide 5
PREVIOUS WORK ON DB-CALL
SPELL (Edinburgh, 2005): restaurant domain. Scenario-based virtual space. Incorporates mal-rules into the ASR grammar.
Slide 6
PREVIOUS WORK ON DB-CALL
DEAL (KTH, 2007): trade domain. Finite-state-network-based, limited dialog management. When learners get stuck, the system provides hints.
Slide 7
POSTECH DB-CALL System
[Architecture figure: a web crawler feeds a Description Extractor and a Parallel Sentence Extractor, which build expression-description pairs and Korean-English expression pairs used for "Try this expression" suggestions.]
Slide 8
DB-CALL System
Slide 9
1. Example-based Dialog Modeling
Slide 10
INTRODUCTION Spoken Dialog System Applications Human-Robot
Interface, Telematics, Tutoring,...
Slide 11
PROBLEM & GOAL
PROBLEM: How to determine the next system action.
- Knowledge-based approaches: plan recipes / ISU rules / agendas
- Data-driven approaches:
  - Statistical approaches: supervised learning based on state approximation; reinforcement learning based on MDP/POMDP
  - Example-based approach
GOAL: To develop a simple and practical approach to dialog modeling for multi-domain dialog systems.
Slide 12
IDEA
Dialog examples are indexed by semantic and discourse features, and the example with the most similar dialog state is retrieved.
Dialog state space (Turn #1, Domain = Building_Guidance): Dialog Act = WH-QUESTION, Main Goal = SEARCH-LOC, ROOM-TYPE = 1 (filled), ROOM-NAME = 0 (unfilled), LOC-FLOOR = 0, PER-NAME = 0, PER-TITLE = 0, Previous Dialog Act and Previous Main Goal empty (first turn), Discourse History Vector = [1,0,0,0,0], Lexico-semantic Pattern = ROOM_TYPE, System Action = inform(Floor).
Dialog corpus: USER utterance annotated with [Dialog Act = WH-QUESTION, Main Goal = SEARCH-LOC, ROOM-TYPE filled]; SYSTEM response annotated with [System Action = inform(Floor)].
Lee et al. (2006), A Situation-based Dialogue Management using Dialogue Examples, IEEE ICASSP.
Slide 13
ALGORITHM
Query Generation: build an SQL query from the discourse history and the SLU results.
Example Search: retrieve semantically close dialog examples from the example DB given the current dialog state.
Example Selection: select the best example by maximizing an utterance similarity measure based on lexical and discourse information; a relaxation strategy loosens the query when nothing matches.
[Flow: noisy input (from ASR/SLU) -> Query Generation -> Example Search over the Example DB, Content DB, and Discourse History -> Example Selection -> NLG with system templates.]
A minimal sketch of this pipeline follows.
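Below is a minimal Python sketch of this query-search-select loop. All names (DialogExample, search_examples, the word-overlap similarity, the relaxation rule) are illustrative assumptions, not the actual EBDM implementation or its SQL schema.

```python
# Minimal sketch of example-based dialog modeling (illustrative structures only).
from dataclasses import dataclass

@dataclass
class DialogExample:
    domain: str
    dialog_act: str
    main_goal: str
    filled_slots: frozenset      # e.g. {"ROOM_TYPE"}
    user_utterance: str
    system_action: str           # e.g. "inform(Floor)"

def generate_query(slu, discourse):
    """Build a query key from SLU results and the discourse history (an SQL query in the real system)."""
    return {"domain": discourse["domain"], "dialog_act": slu["dialog_act"],
            "main_goal": slu["main_goal"], "filled_slots": frozenset(slu["slots"])}

def search_examples(example_db, query):
    """Retrieve examples whose semantic & discourse features match the query."""
    return [e for e in example_db
            if (e.domain, e.dialog_act, e.main_goal, e.filled_slots)
            == (query["domain"], query["dialog_act"], query["main_goal"], query["filled_slots"])]

def relax(query):
    """Relaxation strategy: drop the most specific constraint when nothing matches."""
    return {**query, "filled_slots": frozenset()}

def select_example(candidates, user_utterance):
    """Pick the example maximizing a simple lexical overlap with the user utterance."""
    words = set(user_utterance.lower().split())
    return max(candidates, key=lambda e: len(words & set(e.user_utterance.lower().split())))

def next_system_action(example_db, slu, discourse, user_utterance):
    query = generate_query(slu, discourse)
    candidates = search_examples(example_db, query) or search_examples(example_db, relax(query))
    return select_example(candidates, user_utterance).system_action if candidates else "ask_repeat()"

# Toy usage
db = [DialogExample("Building_Guidance", "WH-QUESTION", "SEARCH-LOC",
                    frozenset({"ROOM_TYPE"}), "where is the seminar room", "inform(Floor)")]
slu = {"dialog_act": "WH-QUESTION", "main_goal": "SEARCH-LOC", "slots": ["ROOM_TYPE"]}
print(next_system_action(db, slu, {"domain": "Building_Guidance"}, "where is the seminar room"))
```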
Slide 14
EXPERIMENTAL RESULTS
Real user evaluation with 10 undergraduates.
Evaluation metrics: STR (Success Turn Rate) = # of successful turns / # of total turns; TCR (Task Completion Rate) = # of successful dialogs / # of total dialogs; AvgUserTurn = average number of user turns per dialog.

System              | #Dialogs | AvgUserTurn | STR (%) | TCR (%)
Car Navigation      | 50       | 4.54        | 86.25   | 92.00
Weather Information | 50       | 4.46        | 89.01   | 94.00
EPG                 | 50       | 4.50        | 83.99   | 90.00
Chatbot             | 50       | 5.60        | 64.31   | -
Multi-domain        | 15       | 6.08        | 78.77   | 86.67

Lee et al. (2009), Example-based Dialog Modeling for Practical Multi-domain Dialog Systems, SPECOM.
Slide 15
EXPERIMENTAL RESULTS
Example match rate (%) of each dialog system:

System              | Exact match | Partial match | No example
Car Navigation      | 50.22       | 44.49         | 5.29
Weather Information | 69.49       | 25.00         | 5.51
EPG                 | 58.33       | 37.22         | 4.45
Chatbot             | 50.71       | 14.29         | 35.00
Multi-domain        | 69.23       | 24.62         | 6.15

Lee et al. (2009), Example-based Dialog Modeling for Practical Multi-domain Dialog Systems, SPECOM.
Slide 16
ROBUST DIALOG MANAGEMENT
PROBLEM: How to overcome errors in the real world.
Error handling: recovering from ASR/SLU errors by interacting with the user at the conversational level.
N-best support: estimating the current state under uncertainty.
[Pipeline figure: ASR (noise reduction, adaptation, n-best / lattice / confusion network) -> SLU (robust parsing, data-driven approaches) -> DM (error handling, n-best support), with errors propagating at each stage.]
Lee et al. (2008), Robust Dialog Management with N-best Hypotheses Using Dialog Examples and Agenda, ACL.
Slide 17
GOAL & IDEA
To increase the robustness of EBDM with prior knowledge.
1) Error handling: if the system knows what the user will do next, it can generate dynamic help.
[Agenda graph figure: the focus node with next subtasks LOCATION, OFFICE PHONE NUMBER, ROOM ROLE, GUIDE.]
AgendaHelp - S: "Next, you can do the subtask 1) asking the room's role, or 2) asking the office phone number, or 3) selecting the desired room for navigation."
UtterHelp - S: "Next, you can say 1) 'What is it?', or 2) 'What's the phone number of [ROOM_NAME]?', or 3) 'Let's go there.'"
Slide 18
GOAL & IDEA
To increase the robustness of EBDM with prior knowledge.
2) N-best support: if the system knows which subtask is more probable next, it can rescore the n-best hypotheses (h_1 ... h_n).
Current subtask: LOCATION. System utterance: "The director's room is Room No. 201." System action: Inform(RoomNumber). Candidate next subtasks: OFFICE PHONE NUMBER, FLOOR, ROOM NAME, LOCATION.

N-best   | User utterance                          | Subtask             | P(h_i | S)
U1 (h_1) | What are office rooms in this building? | ROOM NAME           | 0.2
U2 (h_2) | What is the floor?                      | FLOOR               | 0.4
U3 (h_3) | Where is it?                            | LOCATION            | 0.3
U4 (h_4) | What is the phone number?               | OFFICE PHONE NUMBER | 0.5 (more probable)
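A small illustrative sketch of rescoring n-best hypotheses with such an agenda-based prior; the interpolation weight, field names, and scores below are assumptions, not the system's actual scoring function.

```python
# Rescore n-best SLU hypotheses with an agenda-based prior P(subtask | current focus node).
def rescore_nbest(hypotheses, subtask_prior, alpha=0.5):
    """hypotheses: dicts like {"utterance", "subtask", "conf"}; returns them sorted by interpolated score."""
    def score(h):
        return alpha * h["conf"] + (1 - alpha) * subtask_prior.get(h["subtask"], 0.0)
    return sorted(hypotheses, key=score, reverse=True)

nbest = [
    {"utterance": "What are office rooms in this building?", "subtask": "ROOM NAME", "conf": 0.30},
    {"utterance": "What is the floor?", "subtask": "FLOOR", "conf": 0.35},
    {"utterance": "Where is it?", "subtask": "LOCATION", "conf": 0.20},
    {"utterance": "What is the phone number?", "subtask": "OFFICE PHONE NUMBER", "conf": 0.15},
]
prior = {"ROOM NAME": 0.2, "FLOOR": 0.4, "LOCATION": 0.3, "OFFICE PHONE NUMBER": 0.5}
best = rescore_nbest(nbest, prior)[0]   # the phone-number hypothesis wins under this prior
print(best["utterance"])
```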
EXPERIMENT SET-UP
Simulated user evaluation. Test set: 1000 simulated dialogs.
Slide 21
EXPERIMENTAL RESULTS
The average score of the different methods.
Legend:
- P-E: using only examples
- P-ER: examples + recovery
- P-EA: examples + agenda graph
- P-EAR: examples + agenda graph + recovery
Lee et al. (2009), Hybrid Approach to Robust Dialog Management using Agenda and Dialog Examples, CSL (submitted).
Slide 22
EXPERIMENTAL RESULTS Lee et al., (2009), Hybrid Approach to
Robust Dialog Management using Agenda and Dialog Examples, CSL,
(Submitted) The average score of the P-EAR system according to
n-best size
Slide 23
DEMO VIDEO PC demo
Slide 24
DEMO VIDEO Robot demo
Slide 25
2. Feedback Generation
Slide 26
INTRODUCTION
Recast feedback in the tutoring process:
Tutor: What is the purpose of your trip?
User: My purpose business
Tutor: Sorry, I don't understand. What did you say? (clarification request)
System: Try this expression: "I am here on business" (recast feedback)
User: I am here on business (learner uptake)
Slide 27
INTRODUCTION
Expression suggestion in the tutoring process:
Tutor: What is the purpose of your trip?
(TIMEOUT: the learner does not respond)
Tutor: Sorry, I can't hear you.
System: Try this expression: "I am here on business" (expression suggestion)
User: I am here on business (learner uptake)
Slide 28
PROBLEMS
How to recognize user intentions despite numerous errors in their utterances: the mal-rule-based technique used in previous studies does not work for low-level learners because of multiple errors, and some utterances even seem to have a meaning that differs from what the learner intended to say (intended meaning: "When does the bus leave?"; learner's utterance: "Which time I have to leave?").
How to choose appropriate user intentions to suggest when a timeout expires: the system should take the dialog context into consideration, as human tutors do.
Approach: perform intention-based soft pattern matching to generate correct feedback.
Slide 29
METHODS
Context-aware and level-specific intention recognition, followed by intention-based pattern matching.
[Architecture: the learner's utterance is scored by level-specific utterance models (Level 1 ... Level N, each trained on level-specific data) combined with a dialog-state-based model; the intention recognizer passes the learner's intention and the dialog state to the dialog manager, which updates the dialog state, searches the example expression DB, and pattern-matches the retrieved example expressions to produce feedback.]
A hedged sketch of this matching step follows.
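The sketch below only illustrates the idea of combining a level-specific utterance model with a dialog-state model and then soft-matching an example expression; the score dictionaries, the toy DB, and the use of SequenceMatcher are assumptions, not the actual models.

```python
# Hedged sketch of intention recognition + intention-based soft pattern matching.
from difflib import SequenceMatcher

def recognize_intention(utt_scores, ctx_scores):
    """utt_scores: P(intention | utterance) from the level-specific utterance model;
    ctx_scores: P(intention | dialog state) from the dialog-state-based model."""
    return max(utt_scores, key=lambda i: utt_scores[i] * ctx_scores.get(i, 1e-3))

def recast_feedback(utterance, intention, example_db):
    """Return the example expression for this intention that is closest to the learner's utterance."""
    candidates = example_db.get(intention, [])
    if not candidates:
        return None
    best = max(candidates, key=lambda c: SequenceMatcher(None, utterance.lower(), c.lower()).ratio())
    return f"Try this expression: {best}"

# Toy usage
utt_scores = {"state_purpose": 0.6, "greet": 0.1}
ctx_scores = {"state_purpose": 0.8, "greet": 0.3}
db = {"state_purpose": ["I am here on business", "I am here on vacation"]}
intention = recognize_intention(utt_scores, ctx_scores)
print(recast_feedback("My purpose business", intention, db))
```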
Slide 30
EXPERIMENT SET-UP
Primitive data set: immigration domain; 192 dialogs, 3,517 utterances (18.32 utterances/dialog).
Annotation: each utterance was manually annotated with the speaker's intention and component slot-values, and automatically annotated with discourse information.
Demo: POSTECH DB-CALL initial version (2008)
Slide 35
3. Translation Assistance
Slide 36
Architecture
[Figure: the Web is crawled for parallel sentence examples, which are stored in a fixed example format; the ESL dialog system and other applications query the expression search engine through an interface (function call) and receive the analysis results.]
Slide 37
Building Bilingual Examples
Word alignment is widely used in statistical machine translation (IBM Models 1-5, symmetrization heuristics). A word alignment represents the correspondence between the words/phrases of a bilingual example pair.
Example word alignment (GIZA++).
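For illustration only, here is a compact EM sketch in the spirit of IBM Model 1; the actual system uses GIZA++ with higher IBM models and symmetrization, and the tiny parallel pairs below are made-up examples.

```python
# Compact EM sketch of IBM Model 1 word alignment (illustration, not GIZA++).
from collections import defaultdict

def ibm_model1(parallel_pairs, iterations=10):
    """parallel_pairs: list of (source_tokens, target_tokens). Returns lexical table t(target | source)."""
    t = defaultdict(lambda: 1.0)                      # rough uniform initialization
    for _ in range(iterations):
        count, total = defaultdict(float), defaultdict(float)
        for src, tgt in parallel_pairs:
            for f in tgt:
                z = sum(t[(f, e)] for e in src)       # normalize over source words
                for e in src:
                    c = t[(f, e)] / z
                    count[(f, e)] += c
                    total[e] += c
        for (f, e), c in count.items():
            t[(f, e)] = c / total[e]
    return t

def align(src, tgt, t):
    """Link each target word to its most probable source word."""
    return [(f, max(src, key=lambda e: t[(f, e)])) for f in tgt]

pairs = [("나는 출장 왔어요".split(), "I am here on business".split()),
         ("나는 학생 이에요".split(), "I am a student".split())]
t = ibm_model1(pairs)
print(align(*pairs[0], t))
```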
Slide 38
4. Comprehension Assistance
Slide 39
INTRODUCTION
English Expression-Description Example Suggestion System: combines an ESL podcast website, an expression-description DB, and the dialog system. When the user asks about an unfamiliar English expression, the system detects the expression and presents its description to aid comprehension (expression detection, recommended sentence description).
Slide 40
INTRODUCTION
Expression-Description Pair Extraction System: to present an expression example and its description, the system extracts expression-description pairs from the ESL podcast site.

Phrase       | Description
routine test | "we mean it's a normal, regular test that the doctor runs many, many different times with different patients, not a special test"
Treatment    | "Treatment is another word for what the doctor gives you or does to you to help you."
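As a hedged sketch only: one simple way such pairs could be pulled from a podcast-style transcript is a pattern over the "X, we mean ..." wording seen above. The regex and the script format are assumptions, not the system's actual extraction rules.

```python
# Toy extraction of (phrase, description) pairs from an ESL-podcast-style transcript.
import re

PATTERN = re.compile(
    r'"(?P<phrase>[^"]+)"\s*,?\s*(?:we mean|means)\s+(?P<desc>[^.]+\.)',
    re.IGNORECASE)

def extract_pairs(script_text):
    return [(m.group("phrase"), m.group("desc").strip())
            for m in PATTERN.finditer(script_text)]

script = ('When we say "routine test", we mean it\'s a normal, regular test '
          'that the doctor runs many, many different times with different patients, '
          'not a special test.')
print(extract_pairs(script))
```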
Slide 41
EXAMPLE [script] [description]
Slide 42
EXAMPLE [script] [description]
Slide 43
Language Learner Simulation
Slide 44
1. User Simulation
Slide 45
INTRODUCTION
User simulation for spoken dialog systems: developing a "simulated user" who can replace real users.
Applications: automated evaluation of spoken dialog systems, detecting potential flaws, predicting overall system behavior, and learning dialog strategies in a reinforcement learning framework.
Slide 46
PROBLEM & GOAL
PROBLEM: How to model real users: user intention simulation, user surface (utterance) simulation, and ASR channel simulation.
GOAL: Natural, diverse, and controllable simulation.
Slide 47
Slide 48
IDEA
User intention simulation: discourse factors + knowledge + events. Dialog is a sequence of behaviors, user intentions in particular, so user intention simulation should take various discourse information into account.
Jung et al. (2009), Data-driven user simulation for automated evaluation of spoken dialog systems, Computer Speech and Language.
Slide 49
User Intention Simulation - Linear-Chain Conditional Random Field Model
Assumption: a user utterance has only one intention per turn.
UI: user intention state, State = [dialog_act, main_goal, named_entities].
DI: previous discourse information (system response + discourse history).
[Figure: linear-chain CRF with the DI observation connected to the sequence of UI states.]
Jung et al. (2009), Data-driven user simulation for automated evaluation of spoken dialog systems, Computer Speech and Language.
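To make the data shape concrete, here is a small sketch of training a linear-chain CRF over intention states with discourse features; sklearn-crfsuite is used only as a stand-in toolkit, and the feature names and toy labels are assumptions, not the paper's actual feature set.

```python
# Illustrative linear-chain CRF over user-intention states with discourse features.
import sklearn_crfsuite

def turn_features(prev_system_action, discourse_history, turn_index):
    return {
        "prev_system_action": prev_system_action,
        "filled_slots": ",".join(sorted(discourse_history)),
        "turn_index": str(turn_index),
    }

# One training dialog: a sequence of per-turn feature dicts with intention labels
# (label = dialog_act + main_goal + named entities in the real model).
X_train = [[
    turn_features("greet()", [], 0),
    turn_features("inform(Floor)", ["ROOM_TYPE"], 1),
]]
y_train = [["wh_question+search_loc", "thank+none"]]

crf = sklearn_crfsuite.CRF(algorithm="lbfgs", max_iterations=50)
crf.fit(X_train, y_train)
print(crf.predict(X_train))
```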
Slide 50
ALGORITHM Jung et al., 2009, Data-driven user simulation for
automated evaluation of spoken dialog systems, Computer Speech and
Language.
Slide 51
User Surface Simulation
PROBLEM: How to generate a user surface utterance that expresses a given user intention.
APPROACH: two-phase user utterance generation. Phase 1: candidate generation with a user utterance model; Phase 2: rescoring and selecting the best utterance.
Slide 52
Phase 1 - Generation
Given the intention [Dialog_Act _X_ Main_Goal], generate a structure-tag sequence S1 ... S5 using transition probabilities, and emit a word W1 ... W5 from each tag using emission probabilities.
Structure tags: component slot names + part-of-speech tags. S: a member of the structure-tag space; W: a member of the vocabulary space.
A toy sketch of this generation step follows.
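The sketch below samples a structure-tag sequence and emits a word per tag; the tag inventory and all probabilities are toy values under one assumed intention, not the trained model's parameters.

```python
# Toy phase-1 candidate generation: structure-tag transitions + per-tag word emissions.
import random

TRANSITIONS = {   # P(next_tag | prev_tag) for an assumed intention "wh_question_X_search_loc"
    "<s>":       {"WHERE_WP": 1.0},
    "WHERE_WP":  {"VBZ": 1.0},
    "VBZ":       {"ROOM_NAME": 1.0},
    "ROOM_NAME": {"</s>": 1.0},
}
EMISSIONS = {     # P(word | tag)
    "WHERE_WP": {"where": 1.0},
    "VBZ": {"is": 0.8, "was": 0.2},
    "ROOM_NAME": {"the seminar room": 0.5, "room 201": 0.5},
}

def sample(dist):
    r, acc = random.random(), 0.0
    for item, p in dist.items():
        acc += p
        if r <= acc:
            return item
    return item

def generate_candidate():
    tag, words = "<s>", []
    while True:
        tag = sample(TRANSITIONS[tag])
        if tag == "</s>":
            return " ".join(words)
        words.append(sample(EMISSIONS[tag]))

print(generate_candidate())   # e.g. "where is room 201"
```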
Slide 53
Phase 2 - Rescoring
PROBLEM: rescoring and selecting the good utterances. Criteria: human-like utterances and natural word transitions.
APPROACH: the structure- and word-interpolated BLEU score (SWB score). Note that evaluating system-generated utterances in utterance simulation and in machine translation is essentially the same task.
SWB = beta * Structure_Sequence_BLEU + (1 - beta) * Word_Sequence_BLEU, where 0 <= beta <= 1.
We set beta to 0.2: Korean is an agglutinative language and relatively free with respect to structural grammar, so the structure sequence is weighted less.
Jung et al. (2009), Data-driven user simulation for automated evaluation of spoken dialog systems, Computer Speech and Language.
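A small sketch of computing the SWB score, using NLTK's sentence_bleu as a stand-in BLEU implementation; the bigram weights, smoothing choice, and toy references are assumptions, not the paper's exact configuration.

```python
# Structure- and word-interpolated BLEU (SWB) rescoring score, sketched with NLTK BLEU.
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction

def swb_score(struct_hyp, struct_refs, word_hyp, word_refs, beta=0.2):
    smooth = SmoothingFunction().method1
    struct_bleu = sentence_bleu(struct_refs, struct_hyp, weights=(0.5, 0.5),
                                smoothing_function=smooth)
    word_bleu = sentence_bleu(word_refs, word_hyp, weights=(0.5, 0.5),
                              smoothing_function=smooth)
    return beta * struct_bleu + (1 - beta) * word_bleu

word_hyp = "where is the seminar room".split()
word_refs = ["where is the seminar room".split(), "where is room 201".split()]
struct_hyp = ["WHERE_WP", "VBZ", "DT", "ROOM_NAME"]
struct_refs = [["WHERE_WP", "VBZ", "DT", "ROOM_NAME"]]
print(swb_score(struct_hyp, struct_refs, word_hyp, word_refs))
```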
Slide 54
ALGORITHM Jung et al., 2009, Data-driven user simulation for
automated evaluation of spoken dialog systems, Computer Speech and
Language.
Slide 55
ASR Channel Simulation
PROBLEM: How to simulate the ASR channel. A purely statistical approach is hard because it is difficult to collect speech data for the target domain, and the simulation should be WER-controllable.
APPROACH: linguistic-knowledge-based simulation.
Step 1: Determine the error positions.
Step 2: Generate error types for the error-marked words.
Step 3: Generate the ASR errors (substitution, deletion, insertion).
Step 4: Rescore and select an erroneous utterance.
A toy sketch of steps 1-3 follows.
Jung et al. (2009), Data-driven user simulation for automated evaluation of spoken dialog systems, Computer Speech and Language.
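The sketch below marks error positions at a target WER, draws an error type per marked word, and realizes the errors; the type distribution, vocabulary, and uniform substitution (instead of phonetic confusion) are toy assumptions, and the rescoring step is omitted.

```python
# Toy ASR channel simulation: error positions -> error types -> realized errors.
import random

ERROR_TYPES = {"substitution": 0.55, "deletion": 0.30, "insertion": 0.15}

def choose(dist):
    r, acc = random.random(), 0.0
    for k, p in dist.items():
        acc += p
        if r <= acc:
            return k
    return k

def simulate_asr(words, wer=0.2, vocab=("bus", "seminar", "floor", "room")):
    out = []
    for w in words:
        if random.random() >= wer:            # step 1: mark error positions at the target WER
            out.append(w)
            continue
        err = choose(ERROR_TYPES)             # step 2: draw an error type
        if err == "deletion":                 # step 3: realize the error
            continue
        if err == "insertion":
            out.extend([random.choice(vocab), w])
        else:                                 # substitution (phonetic confusion in the real system)
            out.append(random.choice(vocab))
    return " ".join(out)

print(simulate_asr("where is the seminar room".split(), wer=0.3))
```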
Slide 56
Error Type Distribution
Error types are determined based on the error distribution reported for English speech recognition (Greenberg et al., 2000); we assume that Korean speech recognition generally has a similar error distribution.
Slide 57
Error Generation
Insertion error: insert a random word before the insertion error mark.
Deletion error: simply delete the word.
Substitution error: based on a sequence alignment algorithm. Syllable- and phoneme-based alignment selects candidate words from a dictionary using a dynamic-programming alignment algorithm (Needleman and Wunsch, 1970) and a vowel confusion matrix (example shown), and combines the two alignment scores:
Similarity = alpha * Syllable_Alignment_Score + (1 - alpha) * Phoneme_Alignment_Score, where 0 <= alpha <= 1.
A small sketch of this similarity computation follows.
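The sketch below interpolates two Needleman-Wunsch alignment scores; the scoring values and the character-level stand-ins for Korean syllable/phoneme decomposition are assumptions, not the system's actual dictionaries or confusion matrices.

```python
# Interpolated substitution-candidate similarity from syllable and phoneme alignments.
def nw_score(a, b, match=1, mismatch=-1, gap=-1):
    """Global alignment score (Needleman & Wunsch, 1970) between two symbol sequences."""
    m, n = len(a), len(b)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        dp[i][0] = i * gap
    for j in range(1, n + 1):
        dp[0][j] = j * gap
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            dp[i][j] = max(dp[i - 1][j - 1] + s, dp[i - 1][j] + gap, dp[i][j - 1] + gap)
    return dp[m][n]

def similarity(word, candidate, to_syllables, to_phonemes, alpha=0.5):
    return (alpha * nw_score(to_syllables(word), to_syllables(candidate))
            + (1 - alpha) * nw_score(to_phonemes(word), to_phonemes(candidate)))

# Toy decompositions standing in for Korean syllable/phoneme analysis.
syl = list
pho = lambda w: list(w.lower())
print(similarity("seminar", "similar", syl, pho))
```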
Slide 58
Slide 59
EXPERIMENT SET-UP
Korean car navigation dialog system. SLU: Jeong and Lee (2006); DM: Lee et al. (2009). Word error rate: 0.0-0.4, with 5,000 simulated dialogs at each WER setting.
Slide 60
Intention Jung et al., 2009, Data-driven user simulation for
automated evaluation of spoken dialog systems, Computer Speech and
Language.
Slide 61
Intention: D-BLEU (Discourse BLEU) is a metric for measuring the naturalness of simulated dialogs in terms of n-gram precision, based on the BLEU metric.
Jung et al. (2009), Data-driven user simulation for automated evaluation of spoken dialog systems, Computer Speech and Language.
Slide 62
Utterance Jung et al., 2009, Data-driven user simulation for
automated evaluation of spoken dialog systems, Computer Speech and
Language.
Slide 63
ASR channel Jung et al., 2009, Data-driven user simulation for
automated evaluation of spoken dialog systems, Computer Speech and
Language.
Slide 64
ASR channel Jung et al., 2009, Data-driven user simulation for
automated evaluation of spoken dialog systems, Computer Speech and
Language.
Slide 65
Overall prediction Jung et al., 2009, Data-driven user
simulation for automated evaluation of spoken dialog systems,
Computer Speech and Language.
Slide 66
2. Grammar Error Simulation
Slide 67
INTRODUCTION
Language learner simulation requires grammar error simulation on top of general user simulation.
[Figure: the language learner simulator (user intention simulator -> user utterance simulator -> grammar errors simulator -> ASR errors simulator) interacts with the dialog system (non-native ASR -> SLU -> dialog manager -> system utterance generator -> TTS).]
Slide 68
REALISTIC ERRORS
"He wants to go to a movie theater":
"He wants to to a movie theater" vs. "He want go to movie theater" - only the latter reflects the kinds of errors real learners actually make.
Slide 69
PROBLEMS
How to incorporate expert knowledge about the error characteristics of Korean language learners into the statistical model:
- Subject-verb agreement errors
- Omission of the preposition of prepositional verbs
- Omission of articles
- Etc.
Slide 70
MARKOV LOGIC NETWORK
Sungjin Lee and Gary Geunbae Lee (2009), Realistic Grammar Error Simulation Using Markov Logic, ACL 2009.
Slide 71
METHOD
The generation procedure involves three steps:
1. Inference: generate a probability distribution over error types for each word through MLN inference.
2. Sampling: determine an error type for each word by sampling from the generated distribution.
3. Realization: create an ill-formed output sentence by realizing the chosen error types.
Example: for "He wants to go to a movie theater", the MLN assigns per-word probabilities to error types such as v_agr_sub (subject-verb agreement), prp_lex_del (preposition deletion), at_del (article deletion), and none; sampling and realization yield "He want go to movie theater".
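The sketch below illustrates only steps 2 and 3 (sampling error types from per-word posteriors and realizing them); the posterior values and the realization rules are simplified toy assumptions, not the MLN or the corpus-derived error grammar.

```python
# Toy sampling + realization of grammar error types (steps 2-3 of the procedure above).
import random

def sample_error_types(posteriors):
    """posteriors: list of (word, {error_type: prob}) pairs from MLN inference (step 1)."""
    chosen = []
    for word, dist in posteriors:
        r, acc = random.random(), 0.0
        for etype, p in dist.items():
            acc += p
            if r <= acc:
                chosen.append((word, etype))
                break
        else:
            chosen.append((word, "none"))
    return chosen

def realize(tagged):
    """Realize the chosen error types into an ill-formed sentence."""
    out = []
    for word, etype in tagged:
        if etype == "v_agr_sub":                    # agreement error: drop 3rd-person-singular -s
            out.append(word[:-1] if word.endswith("s") else word)
        elif etype in ("prp_lex_del", "at_del"):    # drop preposition / article
            continue
        else:
            out.append(word)
    return " ".join(out)

posteriors = [("He", {"none": 1.0}), ("wants", {"v_agr_sub": 0.921, "none": 0.079}),
              ("to", {"prp_lex_del": 0.6, "none": 0.4}), ("go", {"none": 1.0}),
              ("to", {"none": 1.0}), ("a", {"at_del": 0.781, "none": 0.219}),
              ("movie", {"none": 1.0}), ("theater", {"none": 1.0})]
print(realize(sample_error_types(posteriors)))   # e.g. "He want go to movie theater"
```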
Slide 72
EXPERIMENT SET-UP
Data set: NICT JLE Corpus. The 167 error-annotated files were divided into 3 proficiency groups: Beginner (levels 1-4): 2,905; Intermediate (levels 5-6): 3,296; Advanced (levels 7-9): 2,752.
Evaluation: 10-fold cross-validation was performed for each group, and the validation results were added together across the rounds.
Slide 73
EXPERIMENTAL RESULTS
Advanced: D_KL(Real || Proposed) = 0.068 vs. D_KL(Real || Baseline) = 0.122
Slide 74
EXPERIMENTAL RESULTS
Intermediate: D_KL(Real || Proposed) = 0.075 vs. D_KL(Real || Baseline) = 0.142
Slide 75
EXPERIMENTAL RESULTS
Beginner: D_KL(Real || Proposed) = 0.075 vs. D_KL(Real || Baseline) = 0.092
Slide 76
EXPERIMENTAL RESULTS
Human judgment: evaluated 100 randomly chosen sentences, 50 each from the real and simulated data. The test sentences were shuffled so that the human judges did not know whether the source of each sentence was real or simulated. Two-level scale (0: unrealistic, 1: realistic).
Sungjin Lee and Gary Geunbae Lee (2009), Realistic Grammar Error Simulation Using Markov Logic, ACL 2009.