NLP Applications using Deep Learning - GitHub Pages · 2018. 4. 29. · Neural Machine Translation...

Preview:

Citation preview

NLP Applications using Deep Learning

Giri Iyengar

Cornell University

gi43@cornell.edu

Feb 26, 2018

Giri Iyengar (Cornell Tech) NLP Applications Feb 26, 2018 1 / 27

Agenda for the day

Sentiment AnalysisNeural Machine TranslationEntailment

Giri Iyengar (Cornell Tech) NLP Applications Feb 26, 2018 2 / 27

Overview

1 Sentiment Analysis

2 Neural Machine Translation

3 Textual Entailment

Giri Iyengar (Cornell Tech) NLP Applications Feb 26, 2018 3 / 27

Sentiment Analysis

Defining Sentiment AnalysisSentiment Analysis is the process of determining whether a piece ofwriting is positive, negative or neutral. It is also known as opinion mining,deriving the opinion or attitude of a speaker.

Common Use case: How people feel about a particular {Brand, Topic,News, Company, Product}?

Giri Iyengar (Cornell Tech) NLP Applications Feb 26, 2018 4 / 27

Sentiment Analysis: Naive Bayes

Bag of wordsUse words without caring for their order as features. For each word, useits vector representation (e.g. word2vec, GloVe)

Giri Iyengar (Cornell Tech) NLP Applications Feb 26, 2018 5 / 27

Sentiment Analysis: Logistic Regression

Giri Iyengar (Cornell Tech) NLP Applications Feb 26, 2018 6 / 27

Sentiment Analysis using Bag of Words

Context is very importantWord order conveys informationRedundancy in human language

Giri Iyengar (Cornell Tech) NLP Applications Feb 26, 2018 7 / 27

Recurrent Neural Networks for Sentiment Analysis

Word order is naturally captured in this model

Entire sentence/phrase is consideredVariable lengths of inputs are naturally handled without artificialpadding / chopping

Giri Iyengar (Cornell Tech) NLP Applications Feb 26, 2018 8 / 27

Recurrent Neural Networks for Sentiment Analysis

Word order is naturally captured in this modelEntire sentence/phrase is considered

Variable lengths of inputs are naturally handled without artificialpadding / chopping

Giri Iyengar (Cornell Tech) NLP Applications Feb 26, 2018 8 / 27

Recurrent Neural Networks for Sentiment Analysis

Word order is naturally captured in this modelEntire sentence/phrase is consideredVariable lengths of inputs are naturally handled without artificialpadding / chopping

Giri Iyengar (Cornell Tech) NLP Applications Feb 26, 2018 8 / 27

Sentiment Analysis: RNN

Figure: Source: A. Karpathy

Many to One RNNFeed words of a sentence in and then run a classifier on the finalrepresentation

Giri Iyengar (Cornell Tech) NLP Applications Feb 26, 2018 9 / 27

Sentiment Analysis: RNN

Figure: Source: deeplearning.net

Average pooling over timeFeed words of a sentence in and then average pooling over time to classify

Giri Iyengar (Cornell Tech) NLP Applications Feb 26, 2018 10 / 27

Sentiment Analysis: CNN

Figure: Source: Y. Kim, NYU

Sentiment Analysis using CNNStart with sentence matrix. Consider k words at a time, where k varies.Pool and classify.

Giri Iyengar (Cornell Tech) NLP Applications Feb 26, 2018 11 / 27

Sentiment Analysis using CNNs and RNNs

Both CNN and RNN can use static word representations

Or, they could learn/fine-tune these representationsCan make use of multiple representations. Not just oneWinner has been switching places time and again

Giri Iyengar (Cornell Tech) NLP Applications Feb 26, 2018 12 / 27

Sentiment Analysis using CNNs and RNNs

Both CNN and RNN can use static word representationsOr, they could learn/fine-tune these representations

Can make use of multiple representations. Not just oneWinner has been switching places time and again

Giri Iyengar (Cornell Tech) NLP Applications Feb 26, 2018 12 / 27

Sentiment Analysis using CNNs and RNNs

Both CNN and RNN can use static word representationsOr, they could learn/fine-tune these representationsCan make use of multiple representations. Not just one

Winner has been switching places time and again

Giri Iyengar (Cornell Tech) NLP Applications Feb 26, 2018 12 / 27

Sentiment Analysis using CNNs and RNNs

Both CNN and RNN can use static word representationsOr, they could learn/fine-tune these representationsCan make use of multiple representations. Not just oneWinner has been switching places time and again

Giri Iyengar (Cornell Tech) NLP Applications Feb 26, 2018 12 / 27

Sentiment Analysis using Recursive Networks

Both CNNs and RNNs seem to not capture the compositionalstructure in language

Historically, sentences have been represented by ComputationalLinguists using Parse TreesThese trees seem to be a more natural representation of a sentencestructureCan we build models that explore/exploit this structure?

Giri Iyengar (Cornell Tech) NLP Applications Feb 26, 2018 13 / 27

Sentiment Analysis using Recursive Networks

Both CNNs and RNNs seem to not capture the compositionalstructure in languageHistorically, sentences have been represented by ComputationalLinguists using Parse Trees

These trees seem to be a more natural representation of a sentencestructureCan we build models that explore/exploit this structure?

Giri Iyengar (Cornell Tech) NLP Applications Feb 26, 2018 13 / 27

Sentiment Analysis using Recursive Networks

Both CNNs and RNNs seem to not capture the compositionalstructure in languageHistorically, sentences have been represented by ComputationalLinguists using Parse TreesThese trees seem to be a more natural representation of a sentencestructure

Can we build models that explore/exploit this structure?

Giri Iyengar (Cornell Tech) NLP Applications Feb 26, 2018 13 / 27

Sentiment Analysis using Recursive Networks

Both CNNs and RNNs seem to not capture the compositionalstructure in languageHistorically, sentences have been represented by ComputationalLinguists using Parse TreesThese trees seem to be a more natural representation of a sentencestructureCan we build models that explore/exploit this structure?

Giri Iyengar (Cornell Tech) NLP Applications Feb 26, 2018 13 / 27

Parse Trees

Figure: Parse Tree examples: Courtesy Wikipedia

Giri Iyengar (Cornell Tech) NLP Applications Feb 26, 2018 14 / 27

Parse Trees

Giri Iyengar (Cornell Tech) NLP Applications Feb 26, 2018 15 / 27

Recursive Networks for Sentiment Analysis

Figure: Source: Richard Socher/EMNLP2013

Giri Iyengar (Cornell Tech) NLP Applications Feb 26, 2018 16 / 27

Overview

1 Sentiment Analysis

2 Neural Machine Translation

3 Textual Entailment

Giri Iyengar (Cornell Tech) NLP Applications Feb 26, 2018 17 / 27

Neural Machine Translation using RNN

Figure: Image Source: http://cs224d.stanford.edu/lectures/CS224d-Lecture8.pdf

Giri Iyengar (Cornell Tech) NLP Applications Feb 26, 2018 18 / 27

Neural Machine Translation using RNN

Feed entire sentence in source language

Extract sentence in target languageTrain using parallel corporaE.g. European Parliamentary proceedings. Canadian HansardEncoder-Decoder paradigm

Giri Iyengar (Cornell Tech) NLP Applications Feb 26, 2018 19 / 27

Neural Machine Translation using RNN

Feed entire sentence in source languageExtract sentence in target language

Train using parallel corporaE.g. European Parliamentary proceedings. Canadian HansardEncoder-Decoder paradigm

Giri Iyengar (Cornell Tech) NLP Applications Feb 26, 2018 19 / 27

Neural Machine Translation using RNN

Feed entire sentence in source languageExtract sentence in target languageTrain using parallel corpora

E.g. European Parliamentary proceedings. Canadian HansardEncoder-Decoder paradigm

Giri Iyengar (Cornell Tech) NLP Applications Feb 26, 2018 19 / 27

Neural Machine Translation using RNN

Feed entire sentence in source languageExtract sentence in target languageTrain using parallel corporaE.g. European Parliamentary proceedings. Canadian Hansard

Encoder-Decoder paradigm

Giri Iyengar (Cornell Tech) NLP Applications Feb 26, 2018 19 / 27

Neural Machine Translation using RNN

Feed entire sentence in source languageExtract sentence in target languageTrain using parallel corporaE.g. European Parliamentary proceedings. Canadian HansardEncoder-Decoder paradigm

Giri Iyengar (Cornell Tech) NLP Applications Feb 26, 2018 19 / 27

Beam Search to generate target sentences

Figure: Image Source:https://deepage.net/img/beam-search/beam-search-opt.jpg

Giri Iyengar (Cornell Tech) NLP Applications Feb 26, 2018 20 / 27

Neural Machine Translation using RNN

The entire sentence is squished into one vectorIncreasing the hidden state size is not a solution

Giri Iyengar (Cornell Tech) NLP Applications Feb 26, 2018 21 / 27

Attention based NMT models

While translating, have access to the entire hidden stateEach word translation looks up the words emitted so far and parts ofthe source sentenceModel identifies parts of the source sentence that the decoder paysattention to while translating

Giri Iyengar (Cornell Tech) NLP Applications Feb 26, 2018 22 / 27

Attention based NMT model

Figure: Image Source: Bahdanau, ICLR 2015

Giri Iyengar (Cornell Tech) NLP Applications Feb 26, 2018 23 / 27

Overview

1 Sentiment Analysis

2 Neural Machine Translation

3 Textual Entailment

Giri Iyengar (Cornell Tech) NLP Applications Feb 26, 2018 24 / 27

Textual Entailment

Textual Entailment is directional relation between text fragmentsEntailing and Entailed sentences are referred as text t and hypothesish respectivelyMeasures Natural Language Understanding because it requires asemantic interpretation of the text

Giri Iyengar (Cornell Tech) NLP Applications Feb 26, 2018 25 / 27

Textual Entailment

An example of a positive TE (text entails hypothesis) is:text: If you help the needy, God will reward you.hypothesis: Giving money to a poor man has good consequences.

An example of a negative TE (text contradicts hypothesis) is:text: If you help the needy, God will reward you.hypothesis: Giving money to a poor man has no consequences.

An example of a non-TE (text does not entail nor contradict) is:text: If you help the needy, God will reward you.hypothesis: Giving money to a poor man will make you a better person.

Giri Iyengar (Cornell Tech) NLP Applications Feb 26, 2018 26 / 27

Giri Iyengar (Cornell Tech) NLP Applications Feb 26, 2018 27 / 27

Recommended