53
Neural Dialog Shrimai Prabhumoye Alan W Black Speech Processing 11-[468]92

neural dialog - Carnegie Mellon School of Computer …sprabhum/docs/neural_dialog.pdfReview •Task Oriented Systems •Intents, slots, actions and response •Non-Task Oriented Systems

  • Upload
    others

  • View
    2

  • Download
    0

Embed Size (px)

Citation preview

Page 1: neural dialog - Carnegie Mellon School of Computer …sprabhum/docs/neural_dialog.pdfReview •Task Oriented Systems •Intents, slots, actions and response •Non-Task Oriented Systems

NeuralDialogShrimai Prabhumoye

AlanWBlackSpeechProcessing11-[468]92

Page 2: neural dialog - Carnegie Mellon School of Computer …sprabhum/docs/neural_dialog.pdfReview •Task Oriented Systems •Intents, slots, actions and response •Non-Task Oriented Systems

Review

• TaskOrientedSystems• Intents,slots,actionsandresponse

• Non-TaskOrientedSystems• Noagenda,forfun

• Buildingdialogsystems• RuleBasedSystems• Eliza

• RetrievalTechniques• Representations:TF-IDF,N-grams,wordsthemselves• SimilarityMeasures:Jaccard,cosine,euclidean distance• Limitations– fixedsetofresponses,novariationinresponse

Page 3: neural dialog - Carnegie Mellon School of Computer …sprabhum/docs/neural_dialog.pdfReview •Task Oriented Systems •Intents, slots, actions and response •Non-Task Oriented Systems

Review

• TaskOrientedSystems• Non-TaskOrientedSystems• Buildingdialogsystems• RetrievalTechniques

• Representation• WordVectors

• SimilarityMeasures• Limitations– fixedsetofresponses,novariationinresponse

• GenerativeModels

Page 4: neural dialog - Carnegie Mellon School of Computer …sprabhum/docs/neural_dialog.pdfReview •Task Oriented Systems •Intents, slots, actions and response •Non-Task Oriented Systems

Overview

• WordEmbeddings

• LanguageModelling

• RecurrentNeuralNetworks• SequencetoSequenceModels

• HowtoBuildDialogSystem

• IssuesandExamples

• Alexa-Prize

Page 5: neural dialog - Carnegie Mellon School of Computer …sprabhum/docs/neural_dialog.pdfReview •Task Oriented Systems •Intents, slots, actions and response •Non-Task Oriented Systems

NeuralDialog

• Wewanttomodel:

• Howtowerepresentsentence(𝑃 𝑟𝑒𝑠𝑝𝑜𝑛𝑠𝑒 , 𝑃 𝑖𝑛𝑝𝑢𝑡 ?)• Howtobuildalanguagemodel.• Howtorepresentswords(wordembeddings?)

𝑷 𝒓𝒆𝒔𝒑𝒐𝒏𝒔𝒆 𝒊𝒏𝒑𝒖𝒕)

Page 6: neural dialog - Carnegie Mellon School of Computer …sprabhum/docs/neural_dialog.pdfReview •Task Oriented Systems •Intents, slots, actions and response •Non-Task Oriented Systems

NaturalLanguageProcessing

• Typicalpreprocessingstepso FormvocabularyofwordsthatmapswordstoauniqueIDo Differentcriteriacanbeusedtoselectwhichwordsarepartof

thevocabulary(eg:thresholdfrequency)o Allwordsnotinthevocabularywillbemappedtoaspecial‘out-

of-vocabulary’• Typicalvocabularysizeswillvarybetween10,000and250,000

(Salakhutdinov,2017)

Page 7: neural dialog - Carnegie Mellon School of Computer …sprabhum/docs/neural_dialog.pdfReview •Task Oriented Systems •Intents, slots, actions and response •Non-Task Oriented Systems

PreprocessingTechniques

• Tokenization• “Iamagirl.”tokenizedto“I”,“am”,“a”,“girl”,“.”

• Lowercaseallwords• RemovingStopWords• Ex:“the”,“a”,“and”,etc

• FrequencyofWords• SetathresholdandmakeallwordsbelowthisfrequencyasUNK

• Add<START>and<EOS>tagatthebeginningandendofsentence.

(Salakhutdinov,2017)

Page 8: neural dialog - Carnegie Mellon School of Computer …sprabhum/docs/neural_dialog.pdfReview •Task Oriented Systems •Intents, slots, actions and response •Non-Task Oriented Systems

Vocabulary

Page 9: neural dialog - Carnegie Mellon School of Computer …sprabhum/docs/neural_dialog.pdfReview •Task Oriented Systems •Intents, slots, actions and response •Non-Task Oriented Systems

One-HotEncoding

• FromitswordID,wegetabasicrepresentationofawordthroughthe

one-hotencodingoftheID

• theone-hotvectorofanIDisavectorfilledwith0s,exceptfora1at

thepositionassociatedwiththeID

• ForvocabularysizeD=10,theone-hotvectorofwordIDw=4is:

𝑒 𝑤 = [0001000000]

(Salakhutdinov,2017)

Page 10: neural dialog - Carnegie Mellon School of Computer …sprabhum/docs/neural_dialog.pdfReview •Task Oriented Systems •Intents, slots, actions and response •Non-Task Oriented Systems

LimitationsofOne-HotEncoding

Page 11: neural dialog - Carnegie Mellon School of Computer …sprabhum/docs/neural_dialog.pdfReview •Task Oriented Systems •Intents, slots, actions and response •Non-Task Oriented Systems

LimitationsofOne-HotEncoding

• Aone-hotencodingmakesnoassumptionaboutwordsimilarity.o [“working”,“on”,“Friday”,“is”,“tiring”]doesnotappearinourtrainingset.

o [“working”,“on”,“Monday”,“is”,“tiring”]isinthetrainset.oWewanttomodel𝑃 “𝑡𝑖𝑟𝑖𝑛𝑔” “𝑤𝑜𝑟𝑘𝑖𝑛𝑔”, “𝑜𝑛”, “𝐹𝑟𝑖𝑑𝑎𝑦”, “𝑖𝑠”)oWordrepresentationof“Monday”and“Friday” aresimilarthengeneralize

(Salakhutdinov,2017)

Page 12: neural dialog - Carnegie Mellon School of Computer …sprabhum/docs/neural_dialog.pdfReview •Task Oriented Systems •Intents, slots, actions and response •Non-Task Oriented Systems

LimitationsofOne-HotEncoding

• Themajorproblemwiththeone-hotrepresentationisthatitisvery

high-dimensional

othedimensionalityofe(w)isthesizeofthevocabulary

oatypicalvocabularysizeis≈100,000

oawindowof10wordswouldcorrespondtoaninputvectorofat

least1,000,000units!

(Salakhutdinov,2017)

Page 13: neural dialog - Carnegie Mellon School of Computer …sprabhum/docs/neural_dialog.pdfReview •Task Oriented Systems •Intents, slots, actions and response •Non-Task Oriented Systems

ContinuousRepresentationofWords

• Eachwordwisassociatedwithareal-valuedvectorC(w)• Typicalsizeofword– embeddingis300ormore.

(Salakhutdinov,2017)

Page 14: neural dialog - Carnegie Mellon School of Computer …sprabhum/docs/neural_dialog.pdfReview •Task Oriented Systems •Intents, slots, actions and response •Non-Task Oriented Systems

ContinuousRepresentationofWords

• Wewouldlikethedistance||C(w)-C(w’)||toreflectmeaningfulsimilarities betweenwords

(Salakhutdinov,2017)

Page 15: neural dialog - Carnegie Mellon School of Computer …sprabhum/docs/neural_dialog.pdfReview •Task Oriented Systems •Intents, slots, actions and response •Non-Task Oriented Systems

LanguageModeling

• Alanguagemodelallowsustopredicttheprobabilityofobserving

the sentence(inagivendataset)as:

𝑃 𝑥G, … , 𝑥I = J𝑃 𝑥K 𝑥G, … , 𝑥KLG)I

KMG

• Herelengthofsentenceisn.• Builda languagemodel usingaRecurrentNeuralNetwork.

Page 16: neural dialog - Carnegie Mellon School of Computer …sprabhum/docs/neural_dialog.pdfReview •Task Oriented Systems •Intents, slots, actions and response •Non-Task Oriented Systems

WordEmbeddings fromLanguageModels

(Neubig,2017)

Page 17: neural dialog - Carnegie Mellon School of Computer …sprabhum/docs/neural_dialog.pdfReview •Task Oriented Systems •Intents, slots, actions and response •Non-Task Oriented Systems

ContinuousBagofWords(CBOW)

• Predictwordbasedonsumofsurroundingembeddings

(Neubig,2017)

Page 18: neural dialog - Carnegie Mellon School of Computer …sprabhum/docs/neural_dialog.pdfReview •Task Oriented Systems •Intents, slots, actions and response •Non-Task Oriented Systems

Skip-gram

• usethecurrentwordtopredictthesurroundingwindowofcontext

words

(Neubig,2017)

Page 19: neural dialog - Carnegie Mellon School of Computer …sprabhum/docs/neural_dialog.pdfReview •Task Oriented Systems •Intents, slots, actions and response •Non-Task Oriented Systems

BERT(BidirectionalEncoderRepresentationsfromTransformers)

• BERTisamethodofpretraining languagerepresentations

• Data:Wikipedia(2.5Bwords)+BookCorpus (800Mwords)

• Maskoutk%oftheinputwords,andthenpredictthemaskedwords

• WordEmbeddingSize:768

Page 20: neural dialog - Carnegie Mellon School of Computer …sprabhum/docs/neural_dialog.pdfReview •Task Oriented Systems •Intents, slots, actions and response •Non-Task Oriented Systems

UseofWordEmbeddings

• torepresentasentence

• asinputtoaneuralnetwork

• tounderstandpropertiesofwords

• Partofspeech

• Dotwowordsmeanthesamething?

• semanticrelation(is-a,part-of,went-to-school-at)?

Page 21: neural dialog - Carnegie Mellon School of Computer …sprabhum/docs/neural_dialog.pdfReview •Task Oriented Systems •Intents, slots, actions and response •Non-Task Oriented Systems

NLPandSequentialData

• NLPisfullofsequentialdata

• Charactersinwords

• Wordsinsentences

• Sentencesindiscourse

• …

(Neubig,2017)

Page 22: neural dialog - Carnegie Mellon School of Computer …sprabhum/docs/neural_dialog.pdfReview •Task Oriented Systems •Intents, slots, actions and response •Non-Task Oriented Systems

Long-distanceDependenciesinLanguage

• Agreementinnumber,gender,etc.

• He doesnothaveverymuchconfidenceinhimself.

• She doesnothaveverymuchconfidenceinherself.

• Selectional preference

• Thereign haslastedaslongasthelifeofthequeen.

• Therain haslastedaslongasthelifeoftheclouds.

(Neubig,2017)

Page 23: neural dialog - Carnegie Mellon School of Computer …sprabhum/docs/neural_dialog.pdfReview •Task Oriented Systems •Intents, slots, actions and response •Non-Task Oriented Systems

RecurrentNeuralNetworks

• Toolstorememberinformation

(Neubig,2017)

FeedForwardNN RecurrentNN

Page 24: neural dialog - Carnegie Mellon School of Computer …sprabhum/docs/neural_dialog.pdfReview •Task Oriented Systems •Intents, slots, actions and response •Non-Task Oriented Systems

UnrollinginTime

• Whatdoesprocessingasequencelooklike?

I hate this movie

RNN

predict

label

RNN

predict

label

RNN

predict

label

predict

label

RNN

(Neubig,2017)

Page 25: neural dialog - Carnegie Mellon School of Computer …sprabhum/docs/neural_dialog.pdfReview •Task Oriented Systems •Intents, slots, actions and response •Non-Task Oriented Systems

TrainingRNNsI hate this movie

RNN

predict

Prediction1

RNN

predict

Prediction2

RNN

predict

Prediction3

predict

Prediction4

RNN

Label1 Label2 Label3 Label4

Loss1 Loss2 Loss3 Loss4

sum totalloss (Neubig,2017)

Page 26: neural dialog - Carnegie Mellon School of Computer …sprabhum/docs/neural_dialog.pdfReview •Task Oriented Systems •Intents, slots, actions and response •Non-Task Oriented Systems

WhatcanRNNsdo

• Representasentence

• Readwholesentence,makeaprediction

• Representacontextwithinasentence

• Readcontextupuntilthatpoint

(Neubig,2017)

Page 27: neural dialog - Carnegie Mellon School of Computer …sprabhum/docs/neural_dialog.pdfReview •Task Oriented Systems •Intents, slots, actions and response •Non-Task Oriented Systems

Representingasentence

• ℎO istherepresentationofthesentence

• ℎO istherepresentationoftheprobabilityofobserving“Ihatethismovie”

I hate this movie

RNN RNN RNN RNN

ℎP ℎG ℎQ ℎR ℎO

(Neubig,2017)

Page 28: neural dialog - Carnegie Mellon School of Computer …sprabhum/docs/neural_dialog.pdfReview •Task Oriented Systems •Intents, slots, actions and response •Non-Task Oriented Systems

LanguageModelingusingRNN

(Neubig,2017)

I hate this movie<start>

RNN RNN RNN RNN RNN

predict

hate

predict

I

predict

this

predict

movie

predict

<end>

Page 29: neural dialog - Carnegie Mellon School of Computer …sprabhum/docs/neural_dialog.pdfReview •Task Oriented Systems •Intents, slots, actions and response •Non-Task Oriented Systems

Bidirectional-RNNs

• Asimpleextension,runtheRNNinbothdirections

(Neubig,2017)

Page 30: neural dialog - Carnegie Mellon School of Computer …sprabhum/docs/neural_dialog.pdfReview •Task Oriented Systems •Intents, slots, actions and response •Non-Task Oriented Systems

Bidirectional-RNNs

• Asimpleextension,runtheRNNinbothdirections

(Neubig,2017)Prediction1

Page 31: neural dialog - Carnegie Mellon School of Computer …sprabhum/docs/neural_dialog.pdfReview •Task Oriented Systems •Intents, slots, actions and response •Non-Task Oriented Systems

Bidirectional-RNNs

• Asimpleextension,runtheRNNinbothdirections

(Neubig,2017)Prediction1 Prediction2

Page 32: neural dialog - Carnegie Mellon School of Computer …sprabhum/docs/neural_dialog.pdfReview •Task Oriented Systems •Intents, slots, actions and response •Non-Task Oriented Systems

Bidirectional-RNNs

• Asimpleextension,runtheRNNinbothdirections

(Neubig,2017)Prediction1 Prediction2 Prediction3 Prediction4

Page 33: neural dialog - Carnegie Mellon School of Computer …sprabhum/docs/neural_dialog.pdfReview •Task Oriented Systems •Intents, slots, actions and response •Non-Task Oriented Systems

RecurrentNeuralNetworks

• TheideabehindRNNsistomakeuseofsequentialinformation.

Page 34: neural dialog - Carnegie Mellon School of Computer …sprabhum/docs/neural_dialog.pdfReview •Task Oriented Systems •Intents, slots, actions and response •Non-Task Oriented Systems

RecurrentNeuralNetworks

• 𝑥S istheinputattimestept• 𝑥S isthewordembedding• 𝑠S isthehiddenrepresentationattimestept

𝑠S = 𝑓 𝑈𝑥S +𝑊𝑠SLG𝑜S = 𝑠𝑜𝑓𝑡𝑚𝑎𝑥(𝑉𝑠S)

• Note: U,V,Wareshared acrossalltimesteps

Page 35: neural dialog - Carnegie Mellon School of Computer …sprabhum/docs/neural_dialog.pdfReview •Task Oriented Systems •Intents, slots, actions and response •Non-Task Oriented Systems

RNNProblemsandAlternatives

• Vanishinggradients

• Gradientsdecreaseastheygetpushedback

• Sol:LongShort-termMemory(Hochreiter andSchmidhuber 1997)(Neubig,2017)

Page 36: neural dialog - Carnegie Mellon School of Computer …sprabhum/docs/neural_dialog.pdfReview •Task Oriented Systems •Intents, slots, actions and response •Non-Task Oriented Systems

RNNStrengthsandWeaknesses

• RNNs,particularlydeepRNNs/LSTMs,arequitepowerfulandflexible

• Buttheyrequirealotofdata

• Alsohavetroublewithweakerrorsignalspassedbackfromtheend

ofthesentence

Page 37: neural dialog - Carnegie Mellon School of Computer …sprabhum/docs/neural_dialog.pdfReview •Task Oriented Systems •Intents, slots, actions and response •Non-Task Oriented Systems

BuildChatbots

• Wewanttomodel𝑃 𝑟𝑒𝑠𝑝𝑜𝑛𝑠𝑒 𝑖𝑛𝑝𝑢𝑡_𝑠𝑒𝑛𝑡𝑒𝑛𝑐𝑒)

• Welearnthowtobuildwordembeddings

• Welearnthowtobuildalanguagemodel

• Welearnthowtorepresentasentence.

• Wewanttogetarepresentationoftheinput_sentence andthen

generatetheresponseconditionedontheinput.

Page 38: neural dialog - Carnegie Mellon School of Computer …sprabhum/docs/neural_dialog.pdfReview •Task Oriented Systems •Intents, slots, actions and response •Non-Task Oriented Systems

ConditionalLanguageModels

• LanguageModel

𝑃 𝑋 = J𝑃 𝑥K|𝑥G, … , 𝑥KLG

I

KMG

• ConditionalLanguageModel

𝑃 𝑌 𝑋 = J𝑃 𝑦 |𝑋, 𝑦G, … , 𝑦 LG

a

`MG

(Neubig,2017)

contextnextword

context

Addedcontext

Page 39: neural dialog - Carnegie Mellon School of Computer …sprabhum/docs/neural_dialog.pdfReview •Task Oriented Systems •Intents, slots, actions and response •Non-Task Oriented Systems

ConditionalLanguageModel(Sutskever etal.2014)

(Neubig,2017)

Page 40: neural dialog - Carnegie Mellon School of Computer …sprabhum/docs/neural_dialog.pdfReview •Task Oriented Systems •Intents, slots, actions and response •Non-Task Oriented Systems

Howtopasshiddenstate?

• Initializedecoderw/encoder(Sutskever etal.2014)

• Transform(canbedifferentdimensions)

• Inputateverytimestep(Kalchbrenner &Blunsom 2013)

(Neubig,2017)

Page 41: neural dialog - Carnegie Mellon School of Computer …sprabhum/docs/neural_dialog.pdfReview •Task Oriented Systems •Intents, slots, actions and response •Non-Task Oriented Systems

SequencetoSequenceModels

Page 42: neural dialog - Carnegie Mellon School of Computer …sprabhum/docs/neural_dialog.pdfReview •Task Oriented Systems •Intents, slots, actions and response •Non-Task Oriented Systems

ConstraintsofNeuralModels

Backchanneling

Long-termconversationplanning

Context

Engagement

Page 43: neural dialog - Carnegie Mellon School of Computer …sprabhum/docs/neural_dialog.pdfReview •Task Oriented Systems •Intents, slots, actions and response •Non-Task Oriented Systems

ConstraintsofNeuralModels

Constraints

Gesture

GazeLaughter

Backchanneling

Long-termconversationplanning

Context

Engagement

Page 44: neural dialog - Carnegie Mellon School of Computer …sprabhum/docs/neural_dialog.pdfReview •Task Oriented Systems •Intents, slots, actions and response •Non-Task Oriented Systems

ExamplesofNeuralChatbots

Page 45: neural dialog - Carnegie Mellon School of Computer …sprabhum/docs/neural_dialog.pdfReview •Task Oriented Systems •Intents, slots, actions and response •Non-Task Oriented Systems

Tay

Page 46: neural dialog - Carnegie Mellon School of Computer …sprabhum/docs/neural_dialog.pdfReview •Task Oriented Systems •Intents, slots, actions and response •Non-Task Oriented Systems

Zo

Page 47: neural dialog - Carnegie Mellon School of Computer …sprabhum/docs/neural_dialog.pdfReview •Task Oriented Systems •Intents, slots, actions and response •Non-Task Oriented Systems

Xiaoice

• https://www.youtube.com/watch?v=dg-x1WuGhuI

Page 48: neural dialog - Carnegie Mellon School of Computer …sprabhum/docs/neural_dialog.pdfReview •Task Oriented Systems •Intents, slots, actions and response •Non-Task Oriented Systems

AlexaPrizeChallenge

• Challenge:Buildachatbotthatengages theusersfor20mins.• Sponsored12UniversityTeamswith$100k.• CMUMagnusandCMURuby.• Systemsaremulticomponent

oCombinationsoftask/non-taskoHand-writtenandstatistical/neuralmodels

• ItsaboutengagingresearchersoHavingmorePhDstudentsdodialogoGivingaccessfordeveloperstousersoCollectingdata:whatdouserssay

Page 49: neural dialog - Carnegie Mellon School of Computer …sprabhum/docs/neural_dialog.pdfReview •Task Oriented Systems •Intents, slots, actions and response •Non-Task Oriented Systems

CMUMagnus

• Highaveragenumberofturns

• AverageRating

• Topics:Movies,Sports,Travel,GoT

• Usershadlongerconversationsbutdidnotenjoytheconversation.oIdentifywhenuserisfrustrated orwantstochangetopic.

oIdentifywhattheuserwouldliketotalkabout(intent).

• Detecting“Abusive”remarksandrespondingappropriately

Page 50: neural dialog - Carnegie Mellon School of Computer …sprabhum/docs/neural_dialog.pdfReview •Task Oriented Systems •Intents, slots, actions and response •Non-Task Oriented Systems

Summary

• Howtorepresentwordsincontinuousspace.• WhatareRNNsandhowtousethemtorepresentasentence.• Sequencetosequencemodelsfor𝑃 𝑟𝑒𝑠𝑝𝑜𝑛𝑠𝑒 𝑖𝑛𝑝𝑢𝑡_𝑠𝑒𝑛𝑡𝑒𝑛𝑐𝑒)• Issuesinneuralmodel• IssueswithLivesystem!

Page 51: neural dialog - Carnegie Mellon School of Computer …sprabhum/docs/neural_dialog.pdfReview •Task Oriented Systems •Intents, slots, actions and response •Non-Task Oriented Systems

References

• http://www.phontron.com/class/nn4nlp2017/assets/slides/nn4nlp-03-wordemb.pdf• http://www.phontron.com/class/nn4nlp2017/assets/slides/nn4nlp-06-rnn.pdf• http://www.phontron.com/class/nn4nlp2017/assets/slides/nn4nlp-08-condlm.pdf• https://www.cs.cmu.edu/~rsalakhu/10707/Lectures/Lecture_Language_2019.pdf• http://www.phontron.com/class/mtandseq2seq2017/mt-spring2017.chapter6.pdf

Page 52: neural dialog - Carnegie Mellon School of Computer …sprabhum/docs/neural_dialog.pdfReview •Task Oriented Systems •Intents, slots, actions and response •Non-Task Oriented Systems

References

• http://www.wildml.com/2016/04/deep-learning-for-chatbots-part-1-introduction/• http://www.wildml.com/2015/09/recurrent-neural-networks-tutorial-part-1-introduction-to-rnns/• http://www.wildml.com/2015/09/recurrent-neural-networks-tutorial-part-2-implementing-a-language-model-rnn-with-python-numpy-and-theano/• http://www.wildml.com/2016/07/deep-learning-for-chatbots-2-retrieval-based-model-tensorflow/• https://nlp.stanford.edu/seminar/details/jdevlin.pdf

Page 53: neural dialog - Carnegie Mellon School of Computer …sprabhum/docs/neural_dialog.pdfReview •Task Oriented Systems •Intents, slots, actions and response •Non-Task Oriented Systems

RNNtorepresentasentence

RNN

Embedding

how

𝒔𝟎 RNN

Embedding

are

𝒔𝟏RNN

Embedding

you

𝒔𝟐RNN

Embedding

?

𝒔𝟑𝒔𝟒

• 𝑠O istherepresentationoftheentiresentence• 𝑠O istherepresentationofprobabilityofobserving“howareyou?”