POSTECH Dialog-Based Computer-Assisted Language Learning System. Intelligent Software Lab., POSTECH. Prof. Gary Geunbae Lee


  • Slide 1
  • POSTECH Dialog-Based Computer-Assisted Language Learning System. Intelligent Software Lab., POSTECH. Prof. Gary Geunbae Lee
  • Slide 2
  • Contents: Introduction; Methods: DB-CALL System (Example-based Dialog Modeling, Feedback Generation, Translation Assistance, Comprehension Assistance); Language Learner Simulation (User Simulation, Grammar Error Simulation); Discussion
  • Slide 3
  • RESEARCH BACKGROUND: Globalization makes English ever more important as a world language; native-speaker tutors are extremely costly; and most language learning software is dedicated to pronunciation practice. Dialog-based Computer-Assisted Language Learning (DB-CALL) can be an excellent solution. ISSUES: A DB-CALL system should (1) understand students' poor and non-native expressions, (2) offer high domain scalability to support various practical scenarios, and (3) provide educational functionality that helps students improve their linguistic ability.
  • Slide 4
  • PREVIOUS WORKS ON DB-CALL: Let's Go (CMU, 2002-2004). Provides bus schedule information for CMU non-native students; adapts the acoustic and language models to non-native speakers; edit-distance based corrective feedback.
  • Slide 5
  • PREVIOUS WORKS ON DB-CALL: SPELL (Edinburgh, 2005). Restaurant domain; scenario-based virtual space; incorporates mal-rules into the ASR grammar.
  • Slide 6
  • PREVIOUS WORKS ON DB-CALL: DEAL (KTH, 2007). Trade domain; finite-state-network based, limited dialog management; when learners get stuck, the system provides hints.
  • Slide 7
  • POSTECH DB-CALL System (architecture diagram): a Crawler feeds a Description Extractor and a Parallel Sentence Extractor, which populate an expression/description DB of Korean-English expression pairs; the system draws on it to suggest "Try this expression".
  • Slide 8
  • DB-CALL System
  • Slide 9
  • 1. Example-based Dialog Modeling
  • Slide 10
  • INTRODUCTION. Spoken dialog system applications: human-robot interfaces, telematics, tutoring, ...
  • Slide 11
  • PROBLEM & GOAL PROBLEM How to determine the next system action Knowledge-based approach Plan recipe / ISU rule / Agenda Data-driven approach Statistical approach Supervised Learning based on state approximation Reinforcement Learning based on MDP/POMDP Example-based approach GOAL To develop a simple and practical approach to dialog modeling for multi-domain dialog systems
  • Slide 12
  • IDEA. Dialog state space (Domain = Building_Guidance): Dialog Act = WH-QUESTION; Main Goal = SEARCH-LOC; ROOM-TYPE = 1 (filled), ROOM-NAME = 0 (unfilled); LOC-FLOOR = 0, PER-NAME = 0, PER-TITLE = 0; Previous Dialog Act / Previous Main Goal; Discourse History Vector = [1,0,0,0,0]; Lexico-semantic Pattern = ROOM_TYPE; System Action = inform(Floor). Dialog examples from the corpus (e.g., Turn #1, a user WH-QUESTION with Main Goal SEARCH-LOC answered by inform(Floor)) are indexed by semantic and discourse features, and the example with the most similar state is retrieved. Lee et al. (2006), A Situation-based Dialogue Management using Dialogue Examples, IEEE ICASSP.
  • Slide 13
  • ALGORITHM. Query Generation: build an SQL statement from the discourse history and SLU results. Example Search: retrieve semantically close dialog examples from the example DB given the current dialog state. Example Selection: select the best example by maximizing an utterance similarity measure based on lexical and discourse information. Pipeline: noisy input (from ASR/SLU) → query generation → example search (example DB, content DB, discourse history; relaxation strategy on misses) → example selection → NLG with system templates. A minimal sketch of this loop follows.
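To make the query-search-select loop concrete, here is a minimal, self-contained Python sketch. All names (DialogState, the relaxation order, the similarity score) are illustrative assumptions, not the actual POSTECH implementation, which builds SQL queries over an example database.

```python
# Minimal sketch of the EBDM loop: query generation -> example search ->
# example selection, with a relaxation strategy on search misses.
# All names and fields are illustrative, not the actual POSTECH code.
from dataclasses import dataclass

@dataclass
class DialogState:
    domain: str
    dialog_act: str
    main_goal: str
    filled_slots: frozenset       # e.g. frozenset({"ROOM-TYPE"})
    discourse_history: tuple      # e.g. (1, 0, 0, 0, 0)

def make_query(state):
    """Query generation: turn SLU results + discourse history into
    key-value constraints (conceptually, a SQL WHERE clause)."""
    return {"domain": state.domain, "dialog_act": state.dialog_act,
            "main_goal": state.main_goal, "filled_slots": state.filled_slots}

def search_examples(example_db, query, relax_order=("filled_slots", "dialog_act")):
    """Example search: exact match first, then relax constraints until
    some examples are found (last resort: the whole DB)."""
    keys = list(query)
    while True:
        hits = [ex for ex in example_db
                if all(ex[k] == query[k] for k in keys)]
        if hits or not keys:
            return hits
        for r in relax_order:      # relaxation strategy: drop one constraint
            if r in keys:
                keys.remove(r)
                break
        else:
            keys.pop()

def select_example(hits, state):
    """Example selection: maximize utterance similarity; here only
    discourse-history overlap is scored."""
    def score(ex):
        return sum(a == b for a, b in
                   zip(ex["discourse_history"], state.discourse_history))
    return max(hits, key=score) if hits else None
```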
  • Slide 14
  • EXPERIMENTAL RESULTS. Real user evaluation with 10 undergraduates. Evaluation metrics: STR (Success Turn Rate) = # of successful turns / # of total turns; TCR (Task Completion Rate) = # of successful dialogs / # of total dialogs; AvgUserTurn = average number of user turns per dialog. Lee et al. (2009), Example-based Dialog Modeling for Practical Multi-domain Dialog Systems, SPECOM.

    | System | #Dialogs | AvgUserTurn | STR (%) | TCR (%) |
    | Car Navigation | 50 | 4.54 | 86.25 | 92.00 |
    | Weather Information | 50 | 4.46 | 89.01 | 94.00 |
    | EPG | 50 | 4.50 | 83.99 | 90.00 |
    | Chatbot | 50 | 5.60 | 64.31 | - |
    | Multi-domain | 15 | 6.08 | 78.77 | 86.67 |
  • Slide 15
  • EXPERIMENTAL RESULTS. Example match rate of each dialog system (%). Lee et al. (2009), Example-based Dialog Modeling for Practical Multi-domain Dialog Systems, SPECOM.

    | System | Exact match | Partial match | No example |
    | Car Navigation | 50.22 | 44.49 | 5.29 |
    | Weather Information | 69.49 | 25.00 | 5.51 |
    | EPG | 58.33 | 37.22 | 4.45 |
    | Chatbot | 50.71 | 14.29 | 35.00 |
    | Multi-domain | 69.23 | 24.62 | 6.15 |
  • Slide 16
  • ROBUST DIALOG MANAGEMENT. Problem: how to overcome errors in the real world. Approach: error handling (recovering ASR/SLU errors by interacting with the user at the conversational level) and n-best support (estimating the current state under uncertainty). Across the pipeline: ASR (noise reduction, adaptation, n-best/lattice/confusion networks) → SLU (robust parsing, data-driven approaches) → DM (error handling, n-best support). Lee et al. (2008), Robust Dialog Management with N-best Hypotheses Using Dialog Examples and Agenda, ACL.
  • Slide 17
  • GOAL & IDEA To increase the robustness of EBDM with prior knowledge 1) Error Handling If the system knows what the user will do next Dynamic Help Generation LOCATION OFFICE PHONE NUMBER ROOM ROLE GUIDE FOCUS NODE NEXT_TASK AgendaHelp S: Next, you can do the subtask 1) Asking the room's role, or 2)Asking the office phone number, or 3) Selecting the desired room for navigation. UtterHelp S: Next, you can say 1) What is it?, or 2) Whats the phone number of [ROOM_NAME]?, or 3) Lets go there.
  • Slide 18
  • GOAL & IDEA To increase the robustness of EBDM with prior knowledge 2) N-best support If the system knows which subtask will be more probable next Rescoring N-best hypotheses ( h 1 ~h n ) LOCATION OFFICE PHONE NUMBER FLOOR ROOM NAME h2h2 h1h1 h3h3 h4h4 SubtaskSystem UtteranceSystem Action LOCATION The directors room is Room No. 201. Inform(RoomNumber) N-bestUser Utterances Subtas k P(h i |S) U1 (h 1 ) What are office rooms in this building? ROOM NAME 0.2 U2 (h 2 )What is the floor?FLOOR0.4 U3 (h 3 )Where is it?LOCATION0.3 U4 (h 4 ) What is the phone number? OFFICE PHONE NUMBER 0.5 (More probable)
  • Slide 19
  • ALGORITHM (diagram): ASR word hypotheses (w1 ... wn) feed SLU utterance hypotheses (u1 ... un) with semantic frames (s1 ... sn); EBDM discourse interpretation maintains a focus stack over agenda-graph nodes (V1 ... V9), then takes the argmax node and the argmax example (e1 ... ek → e_j*) to produce the best system action a_m*.
  • Slide 20
  • EXPERIMENT SET-UP. Simulated user evaluation; test set: 1,000 simulated dialogs (...)
  • Slide 21
  • EXPERIMENTAL RESULTS. The average score of different methods. Lee et al. (2009), Hybrid Approach to Robust Dialog Management Using Agenda and Dialog Examples, CSL (submitted). Legend:

    | Method | Description |
    | P-E | Examples only |
    | P-ER | Examples + Recovery |
    | P-EA | Examples + Agenda Graph |
    | P-EAR | Examples + Agenda Graph + Recovery |
  • Slide 22
  • EXPERIMENTAL RESULTS. The average score of the P-EAR system according to n-best size. Lee et al. (2009), Hybrid Approach to Robust Dialog Management Using Agenda and Dialog Examples, CSL (submitted).
  • Slide 23
  • DEMO VIDEO PC demo
  • Slide 24
  • DEMO VIDEO Robot demo
  • Slide 25
  • 2. Feedback Generation
  • Slide 26
  • INTRODUCTION: Recast Feedback. Tutoring process:
    Tutor: What is the purpose of your trip?
    User: My purpose business
    Tutor: Sorry, I don't understand. What did you say? (clarification request)
    System: Try this expression: "I am here on business" (recast feedback)
    User: I am here on business (learner uptake)
  • Slide 27
  • INTRODUCTION: Expression Suggestion. Tutoring process:
    Tutor: What is the purpose of your trip?
    (TIMEOUT)
    Tutor: Sorry, I can't hear you. Try this expression: "I am here on business" (expression suggestion)
    User: I am here on business (learner uptake)
  • Slide 28
  • PROBLEMS. 1) How to recognize user intentions despite numerous errors in their utterances: the mal-rule based technique used in previous studies does not work for low-level learners because of multiple errors, and some utterances even seem to have a meaning that differs from what the learner intended (intended meaning: "When does the bus leave?"; learner's utterance: "Which time I have to leave?"). 2) How to choose appropriate user intentions to suggest when a timeout expires: the system should take the dialog context into consideration, as human tutors do. Approach: intention-based soft pattern matching to generate correct feedback.
  • Slide 29
  • METHODS. Context-aware, level-specific intention recognition with intention-based pattern matching: the learner's utterance is scored by level-specific utterance models (Level 1 ... Level N, each trained on level-specific data) together with a dialog-state based model; the recognized intention updates the dialog state in the dialog manager and drives an example search over the example expression DB, whose expressions are pattern-matched to generate feedback. A simplified stand-in follows.
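As a simplified stand-in for this recognizer, the sketch below interpolates a level-specific utterance model with a dialog-state model. Both models are assumed trained elsewhere, and the interpolation weight is an illustrative choice rather than the paper's.

```python
# Simplified stand-in for the context-aware, level-specific intention
# recognizer: interpolate a level-specific utterance model with a
# dialog-state model.
def recognize_intention(utterance, level, state,
                        utterance_models, state_model, weight=0.7):
    """utterance_models[level](utterance) and state_model(state) both
    return dicts mapping intention -> probability."""
    p_utt = utterance_models[level](utterance)
    p_state = state_model(state)
    scores = {i: weight * p_utt.get(i, 0.0) + (1 - weight) * p_state.get(i, 0.0)
              for i in set(p_utt) | set(p_state)}
    return max(scores, key=scores.get)
```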
  • Slide 30
  • EXPERIMENT SET-UP. Primitive data set: immigration domain; 192 dialogs, 3,517 utterances (18.32 utterances/dialog). Annotation: each utterance was manually annotated with the speaker's intention and component slot values, and automatically annotated with discourse information.
  • Slide 31
  • EXPERIMENTAL RESULTS (figure comparing the Utterance Model and the Hybrid Model).
  • Slide 32
  • EXPERIMENTAL RESULTS (figure comparing level-specific vs. level-ignoring variants of the Hybrid and Utterance models).
  • Slide 33
  • EXPERIMENTAL RESULTS
  • Slide 34
  • Demo: POSTECH DB-CALL initial version (2008)
  • Slide 35
  • 3. Translation Assistance
  • Slide 36
  • Architecture: parallel sentence examples are extracted from the Web into an example DB; an expression search engine interface (function call) serves queries, after analysis, to the ESL dialog system and other applications. Example format: Korean-English sentence pairs.
  • Slide 37
  • Building bilingual examples via word alignment. Word alignment, widely used in statistical machine translation (IBM Models 1-5 plus symmetrization heuristics), gives a correspondence between the words/phrases of a bilingual example pair. Example word alignments are produced with GIZA++. A toy Model 1 sketch follows.
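For illustration only, here is a toy IBM Model 1 trainer (EM over word-translation probabilities); a real system would run GIZA++ over a large parallel corpus and symmetrize the two alignment directions. The sample sentence pair is an invented romanized toy, not data from the system.

```python
# Toy IBM Model 1: EM estimation of t(f|e), the probability that source
# word e translates to target word f, with a NULL source word.
from collections import defaultdict

def ibm_model1(pairs, iterations=10):
    """pairs: list of (e_tokens, f_tokens). Returns dict t[(f, e)] ~ P(f|e)."""
    t = defaultdict(lambda: 1.0)          # uniform start; normalized in E-step
    for _ in range(iterations):
        count = defaultdict(float)        # expected counts c(f, e)
        total = defaultdict(float)        # expected counts c(e)
        for e_sent, f_sent in pairs:
            e_sent = ["NULL"] + list(e_sent)
            for f in f_sent:
                z = sum(t[(f, e)] for e in e_sent)
                for e in e_sent:
                    c = t[(f, e)] / z
                    count[(f, e)] += c
                    total[e] += c
        t = defaultdict(float, {fe: count[fe] / total[fe[1]] for fe in count})
    return t

pairs = [(["i", "am", "here", "on", "business"],
          ["jeo", "neun", "chuljang", "wasseoyo"])]   # invented toy pair
t = ibm_model1(pairs)
for f in pairs[0][1]:   # most likely source word for each target word
    print(f, "->", max(["NULL"] + pairs[0][0], key=lambda e: t[(f, e)]))
```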
  • Slide 38
  • 4. Comprehension Assistance
  • Slide 39
  • INTRODUCTION. English Expression-Description Suggestion System: when the user asks about an unfamiliar English expression, the system presents its description to help comprehension (expression detection, then sentence/description recommendation). It draws on an expression-description DB built from an ESL podcast website and is integrated with the dialog system.
  • Slide 40
  • INTRODUCTION. Expression-Description Pair Extraction System: to present an expression example and its description, the system extracts expression-description pairs from the ESL Podcast site. Examples:

    | Phrase | Description |
    | routine test | we mean it's a normal, regular test that the doctor runs many, many different times with different patients, not a special test |
    | treatment | Treatment is another word for what the doctor gives you or does to you to help you |
  • Slide 41
  • EXAMPLE [script] [description]
  • Slide 42
  • EXAMPLE [script] [description]
  • Slide 43
  • Language Learner Simulation
  • Slide 44
  • 1. User Simulation
  • Slide 45
  • INTRODUCTION. User simulation for spoken dialog systems: developing a "simulated user" who can replace real users. Applications: automated evaluation of spoken dialog systems, detecting potential flaws, predicting overall system behavior, and learning dialog strategies in a reinforcement learning framework.
  • Slide 46
  • PROBLEM & GOAL PROBLEM How to model real user User Intention simulation User Surface simulation ASR channel simulation GOAL Natural Simulation Diverse Simulation Controllable Simulation
  • Slide 47
  • Slide 48
  • IDEA: user intention simulation. Dialog is sequential behavior, and user intentions in particular depend on the dialog so far, so user intention simulation should take account of various discourse information: discourse factors, knowledge, and events. Jung et al. (2009), Data-driven user simulation for automated evaluation of spoken dialog systems, Computer Speech and Language.
  • Slide 49
  • User intention simulation: a linear-chain Conditional Random Field model over turns. Assumption: a user utterance has only one intention. UI: user intention state, state = [dialog_act, main_goal, named_entities]; DI: previous discourse information (system response + discourse history). Jung et al. (2009), Data-driven user simulation for automated evaluation of spoken dialog systems, Computer Speech and Language. A lightweight stand-in is sketched below.
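The paper trains a linear-chain CRF; as a lightweight stand-in, the sketch below estimates P(next intention | previous intention, system action) by counting and samples from it. The feature set is deliberately minimal and illustrative.

```python
# Count-based stand-in for the CRF intention model: condition the next
# user intention on (previous intention, system action), then sample.
import random
from collections import Counter, defaultdict

class IntentionSimulator:
    def __init__(self):
        self.counts = defaultdict(Counter)

    def train(self, dialogs):
        """dialogs: list of [(system_action, user_intention), ...]."""
        for dialog in dialogs:
            prev = "<s>"
            for sys_act, intention in dialog:
                self.counts[(prev, sys_act)][intention] += 1
                prev = intention

    def sample(self, prev_intention, system_action):
        dist = self.counts[(prev_intention, system_action)]
        if not dist:   # unseen context: back off to the marginal
            dist = Counter(i for c in self.counts.values()
                             for i in c.elements())
        intentions, weights = zip(*dist.items())
        return random.choices(intentions, weights=weights)[0]
```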
  • Slide 50
  • ALGORITHM (figure). Jung et al. (2009), Data-driven user simulation for automated evaluation of spoken dialog systems, Computer Speech and Language.
  • Slide 51
  • User surface simulation. Problem: how to generate a user surface utterance that expresses a given user intention. Approach: two-phase utterance generation; phase 1 generates candidate utterances, phase 2 rescores them and selects the best one.
  • Slide 52
  • Phase 1: generation. Within the space keyed by (Dialog_Act x Main_Goal), a structure-tag sequence (S1 ... S5) is generated from tag-transition probabilities, and a word (W1 ... W5) is emitted per tag from emission probabilities. Structure tags are component slot names plus part-of-speech tags; S is a structure tag of the given space and W a vocabulary member of the given space. A sketch follows.
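A minimal sketch of this generator, with the toy transition/emission tables assumed as input rather than taken from the system:

```python
# HMM-style surface generation: sample structure tags from transition
# probabilities, then emit one word per tag from emission probabilities.
import random

def sample_from(dist):
    keys, weights = zip(*dist.items())
    return random.choices(keys, weights=weights)[0]

def generate(space, transitions, emissions, max_len=10):
    """transitions[space][tag]: dict next_tag -> prob (with <s>, </s>);
    emissions[space][tag]: dict word -> prob;
    space: a (dialog_act, main_goal) key."""
    tag, words = "<s>", []
    for _ in range(max_len):
        tag = sample_from(transitions[space][tag])
        if tag == "</s>":
            break
        words.append(sample_from(emissions[space][tag]))
    return " ".join(words)
```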
  • Slide 53
  • Phase 2: rescoring. Problem: rescore the candidates and select good utterances; criteria are human-like utterances and natural word transitions. Approach: a structure- and word-interpolated BLEU score (SWB): SWB = β × Structure_Sequence_BLEU + (1 − β) × Word_Sequence_BLEU, where 0 ≤ β ≤ 1. (Evaluating system-generated utterances in utterance simulation and in machine translation is essentially the same task.) We set β = 0.2 because Korean is an agglutinative language and thus relatively free in its structural grammar. Jung et al. (2009), Data-driven user simulation for automated evaluation of spoken dialog systems, Computer Speech and Language.
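A direct rendering of the SWB score using NLTK's sentence-level BLEU (smoothing added for short utterances). A single reference is used here for brevity; the actual rescoring would compare against reference utterances sharing the candidate's intention.

```python
# SWB = beta * Structure_Sequence_BLEU + (1 - beta) * Word_Sequence_BLEU
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction

def swb(ref_tags, hyp_tags, ref_words, hyp_words, beta=0.2):
    """ref_tags/hyp_tags: structure-tag sequences; ref_words/hyp_words:
    word sequences. beta = 0.2 as in the slide."""
    smooth = SmoothingFunction().method1
    structure = sentence_bleu([ref_tags], hyp_tags, smoothing_function=smooth)
    word = sentence_bleu([ref_words], hyp_words, smoothing_function=smooth)
    return beta * structure + (1 - beta) * word
```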
  • Slide 54
  • ALGORITHM Jung et al., 2009, Data-driven user simulation for automated evaluation of spoken dialog systems, Computer Speech and Language.
  • Slide 55
  • ASR channel simulation. Problem: how to simulate the ASR channel. Knowledge-based vs. statistical approaches: it is difficult to collect speech data for a target domain, and the simulation should be WER-controllable. Approach: linguistic-knowledge based simulation. Step 1: determine error positions. Step 2: assign an error type to each error-marked word. Step 3: generate the ASR errors (substitutions, deletions, insertions). Step 4: rescore and select an erroneous utterance. Jung et al. (2009), Data-driven user simulation for automated evaluation of spoken dialog systems, Computer Speech and Language. A skeleton of steps 1-3 follows.
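A skeleton of steps 1-3 (step 4, rescoring, is omitted). The error-type probabilities are placeholders, since the paper derives them from published English ASR error statistics, and the substitution function is delegated to an alignment-based candidate search like the one sketched further below.

```python
# WER-controllable ASR channel simulator, steps 1-3 only.
import random

ERROR_TYPE_P = {"substitution": 0.6, "deletion": 0.25, "insertion": 0.15}

def simulate_asr_channel(words, target_wer, substitute, lexicon):
    """substitute(word, lexicon) -> acoustically similar word."""
    out = []
    for w in words:
        # Step 1: mark error positions at roughly the target WER.
        if random.random() >= target_wer:
            out.append(w)
            continue
        # Step 2: choose an error type for the marked word.
        err = random.choices(list(ERROR_TYPE_P),
                             weights=list(ERROR_TYPE_P.values()))[0]
        # Step 3: realize the error.
        if err == "deletion":
            continue                              # just delete it
        if err == "insertion":
            out.append(random.choice(lexicon))    # random word before the mark
        out.append(substitute(w, lexicon) if err == "substitution" else w)
    return out
```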
  • Slide 56
  • Error type distribution. Error types are determined based on English speech recognition results (Greenberg et al., 2000); we assume that Korean speech recognition has a broadly similar error distribution.
  • Slide 57
  • Error generation. Insertion error: insert a random word before the insertion mark. Deletion error: simply delete the word. Substitution error: based on a sequence alignment algorithm with syllable- and phoneme-based alignment, candidates are selected from a dictionary using the dynamic programming alignment algorithm of Needleman and Wunsch (1970), with a vowel confusion matrix informing substitution costs. The similarity score is Similarity = α × Syllable_Alignment_Score + (1 − α) × Phoneme_Alignment_Score, where 0 ≤ α ≤ 1. An alignment sketch follows.
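A minimal Needleman-Wunsch scorer for the substitution step. The scoring constants and the syllable/phoneme decomposition functions are illustrative assumptions; the paper additionally uses a confusion matrix for substitution costs.

```python
# Needleman-Wunsch global alignment score, used to rank dictionary
# candidates by acoustic similarity to the original word.
def needleman_wunsch(a, b, match=1.0, mismatch=-1.0, gap=-1.0):
    n, m = len(a), len(b)
    dp = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dp[i][0] = i * gap
    for j in range(1, m + 1):
        dp[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            diag = dp[i-1][j-1] + (match if a[i-1] == b[j-1] else mismatch)
            dp[i][j] = max(diag, dp[i-1][j] + gap, dp[i][j-1] + gap)
    return dp[n][m]

def similarity(word, cand, syllables, phonemes, alpha=0.5):
    """Similarity = alpha * syllable score + (1 - alpha) * phoneme score;
    syllables()/phonemes() decompose a word into unit sequences."""
    return (alpha * needleman_wunsch(syllables(word), syllables(cand))
            + (1 - alpha) * needleman_wunsch(phonemes(word), phonemes(cand)))
```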
  • Slide 58
  • Slide 59
  • EXPERIMENT SET-UP. Korean car navigation dialog system; SLU: Jeong and Lee (2006); DM: Lee et al. (2009). Word error rate: 0.0-0.4; 5,000 dialog samples at each WER setting.
  • Slide 60
  • Intention simulation results (figure). Jung et al. (2009), Data-driven user simulation for automated evaluation of spoken dialog systems, Computer Speech and Language.
  • Slide 61
  • Intention simulation results. D-BLEU (Discourse BLEU) is a metric for measuring the naturalness of simulated dialogs as n-gram precision over intention sequences, based on the BLEU calculation. Jung et al. (2009), Data-driven user simulation for automated evaluation of spoken dialog systems, Computer Speech and Language. A hedged sketch follows.
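A hedged sketch of such a metric: corpus-level BLEU over dialog-level intention sequences, simulated against real. The n-gram order and reference pooling here are assumptions, not the paper's exact definition of D-BLEU.

```python
# D-BLEU-style naturalness score: treat each dialog's intention sequence
# as a "sentence" and compute corpus-level BLEU against real dialogs.
from nltk.translate.bleu_score import corpus_bleu, SmoothingFunction

def d_bleu(real_dialogs, simulated_dialogs):
    """Each dialog is a list of intention labels,
    e.g. ["request(route)", "inform(dest)", ...]."""
    refs = [real_dialogs] * len(simulated_dialogs)  # pool real dialogs as references
    return corpus_bleu(refs, simulated_dialogs,
                       smoothing_function=SmoothingFunction().method1)
```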
  • Slide 62
  • Utterance simulation results (figure). Jung et al. (2009), Data-driven user simulation for automated evaluation of spoken dialog systems, Computer Speech and Language.
  • Slide 63
  • ASR channel simulation results (figure). Jung et al. (2009), Data-driven user simulation for automated evaluation of spoken dialog systems, Computer Speech and Language.
  • Slide 64
  • ASR channel simulation results (figure). Jung et al. (2009), Data-driven user simulation for automated evaluation of spoken dialog systems, Computer Speech and Language.
  • Slide 65
  • Overall prediction results (figure). Jung et al. (2009), Data-driven user simulation for automated evaluation of spoken dialog systems, Computer Speech and Language.
  • Slide 66
  • 2. Grammar Error Simulation
  • Slide 67
  • INTRODUCTION. Language learner simulation requires grammar error simulation on top of general user simulation. The language learner simulator (user intention simulator → user utterance simulator → grammar error simulator → ASR error simulator) interacts with the dialog system (non-native ASR → SLU → dialog manager → system utterance generator → TTS).
  • Slide 68
  • REALISTIC ERROR. Correct: "He wants to go to a movie theater." Compare "He wants to to a movie theater" vs. "He want go to movie theater": only the latter reflects the kind of errors real learners make.
  • Slide 69
  • PROBLEMS. How to incorporate expert knowledge about the error characteristics of Korean language learners into the statistical model: subject-verb agreement errors, omission of the preposition of prepositional verbs, omission of articles, etc.
  • Slide 70
  • MARKOV LOGIC NETWORK. Sungjin Lee, Gary Geunbae Lee (2009), Realistic Grammar Error Simulation Using Markov Logic, ACL.
  • Slide 71
  • METHOD. The generation procedure involves three steps: (1) inference: generate a probability distribution over error types for each word through MLN inference; (2) sampling: determine an error type per word by sampling that distribution; (3) realization: create the ill-formed output sentence by realizing the chosen error types. Example: for "He wants to go to a movie theater", inference assigns each word probabilities over {v_agr_sub, prp_lex_del, at_del, none}; sampling picks v_agr_sub for "wants", prp_lex_del for a "to", and at_del for "a"; realization yields "He want go to movie theater". A sketch of steps 2-3 follows.
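A sketch of steps 2 and 3, taking the per-word error-type distributions from step 1 (MLN inference) as given input; the realization rules are simplified illustrations of the error types named on the slide.

```python
# Steps 2-3 of grammar error simulation: sample an error type per word
# from given distributions, then realize the ill-formed sentence.
import random

def sample_error_types(distributions):
    """distributions[i]: dict error_type -> probability for word i."""
    types = []
    for dist in distributions:
        labels, weights = zip(*dist.items())
        types.append(random.choices(labels, weights=weights)[0])
    return types

def realize(words, types):
    out = []
    for word, err in zip(words, types):
        if err == "v_agr_sub":             # agreement error: drop -s (simplified)
            out.append(word[:-1] if word.endswith("s") else word)
        elif err in ("prp_lex_del", "at_del"):
            continue                       # omit preposition / article
        else:                              # "none" and unknown types
            out.append(word)
    return " ".join(out)

words = "He wants to go to a movie theater".split()
dists = [{"none": 1.0}, {"v_agr_sub": 0.9, "none": 0.1},
         {"prp_lex_del": 0.8, "none": 0.2}, {"none": 1.0}, {"none": 1.0},
         {"at_del": 0.8, "none": 0.2}, {"none": 1.0}, {"none": 1.0}]
print(realize(words, sample_error_types(dists)))
# e.g. "He want go to movie theater"
```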
  • Slide 72
  • EXPERIMENT SET-UP. Data set: the NICT JLE Corpus, with the 167 error-annotated files divided into 3 level groups: Beginner (levels 1-4): 2,905; Intermediate (5-6): 3,296; Advanced (7-9): 2,752. Evaluation: 10-fold cross-validation performed for each group, with validation results summed across rounds.
  • Slide 73
  • EXPERIMENTAL RESULTS (Advanced): D_KL(Real || Proposed) = 0.068 vs. D_KL(Real || Baseline) = 0.122.
  • Slide 74
  • EXPERIMENTAL RESULTS (Intermediate): D_KL(Real || Proposed) = 0.075 vs. D_KL(Real || Baseline) = 0.142.
  • Slide 75
  • EXPERIMENTAL RESULTS (Beginner): D_KL(Real || Proposed) = 0.075 vs. D_KL(Real || Baseline) = 0.092.
  • Slide 76
  • EXPERIMENTAL RESULTS: human judgment. Evaluated 100 randomly chosen sentences, 50 each from the real and the simulated data. The test sentences were shuffled so that the human judges did not know whether a sentence's source was real or simulated. Two-level scale (0: unrealistic, 1: realistic). Sungjin Lee, Gary Geunbae Lee (2009), Realistic Grammar Error Simulation Using Markov Logic, ACL.
  • Slide 77
  • Q & A