Recurrent Neural Network
Applied Deep Learning, March 17th, 2020
http://adl.miulab.tw


Page 1: Recurrent Neural Network

Recurrent Neural Network

Applied Deep Learning

March 17th, 2020 http://adl.miulab.tw

Page 2: Outline

◉ Language Modeling
  ○ N-gram Language Model
  ○ Feed-Forward Neural Language Model
  ○ Recurrent Neural Network Language Model (RNNLM)

◉ Recurrent Neural Network
  ○ Definition
  ○ Training via Backpropagation through Time (BPTT)
  ○ Training Issue
  ○ Extension

◉ RNN Applications
  ○ Sequential Input
  ○ Sequential Output
    ■ Aligned Sequential Pairs (Tagging)
    ■ Unaligned Sequential Pairs (Seq2Seq/Encoder-Decoder)

Page 3: Language Modeling

Page 4: Outline

Page 5: Language Modeling

◉ Goal: estimate the probability of a word sequence

◉ Example task: determine whether a sequence is grammatical or makes more sense

Example: between the acoustically similar candidates “recognize speech” and “wreck a nice beach”, output “recognize speech” if

P(recognize speech) > P(wreck a nice beach)
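Formally, the sequence probability factorizes by the chain rule; this identity underlies every model in this lecture:

$$
P(w_1, w_2, \dots, w_n) = \prod_{t=1}^{n} P(w_t \mid w_1, \dots, w_{t-1})
$$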

Page 6: Outline

Page 7: N-Gram Language Modeling

◉ Goal: estimate the probability of a word sequence

◉ N-gram language model
  ○ Probability is conditioned on a window of (n-1) previous words
  ○ Estimate the probability based on the training data

$$
P(\text{beach} \mid \text{nice}) = \frac{C(\text{nice beach})}{C(\text{nice})}
$$

where C(nice beach) is the count of “nice beach” in the training data and C(nice) is the count of “nice”.

Issue: some sequences may not appear in the training data
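As a concrete illustration (not from the slides), a bigram model can be estimated with two counters; the helper name and toy corpus are ours:

```python
from collections import Counter

def train_bigram_lm(corpus):
    """Estimate P(w2 | w1) = C(w1 w2) / C(w1) by counting a tokenized corpus."""
    unigrams, bigrams = Counter(), Counter()
    for sentence in corpus:
        tokens = sentence.split()
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
    return lambda w1, w2: bigrams[(w1, w2)] / unigrams[w1] if unigrams[w1] else 0.0

# P(beach | nice) on a toy corpus: "nice beach" occurs once out of two "nice"
prob = train_bigram_lm(["a nice beach", "a nice view"])
print(prob("nice", "beach"))  # 0.5
```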

Page 8: N-Gram Language Modeling

◉ Training data:
  ○ The dog ran ……
  ○ The cat jumped ……

P( jumped | dog ) = 0
P( ran | cat ) = 0

Give such unseen n-grams some small probability, e.g. 0.0001 → smoothing

➢ The zero probability is not accurate.
➢ This happens because we cannot collect all the possible text in the world as training data.
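The slides do not name a particular smoothing method; add-alpha (Laplace) smoothing is one of the simplest, sketched here with illustrative names:

```python
def smoothed_bigram_prob(bigrams, unigrams, vocab_size, w1, w2, alpha=1.0):
    """Add-alpha smoothing: every bigram, seen or not, gets a nonzero probability."""
    return (bigrams.get((w1, w2), 0) + alpha) / (unigrams.get(w1, 0) + alpha * vocab_size)
```

With alpha = 1 and a 10,000-word vocabulary, an unseen pair such as P(jumped | dog) becomes a small value on the order of 0.0001 instead of exactly zero.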

Page 9: Outline

Page 10: Neural Language Modeling

◉ Idea: estimate the probability not from counts, but from a neural network’s prediction

(Figure: a chain of identical neural networks; each takes the vector of the current word and outputs the probability of the next word: “START” → P(next word is “wreck”), “wreck” → P(next word is “a”), “a” → P(next word is “nice”), “nice” → P(next word is “beach”).)

P(“wreck a nice beach”) = P(wreck | START) P(a | wreck) P(nice | a) P(beach | nice)

Page 11: Neural Language Modeling

Bengio et al., “A Neural Probabilistic Language Model,” in JMLR, 2003.

(Figure: a feed-forward network with input, hidden, and output layers; the hidden layer forms a context vector, and the output layer gives the probability distribution of the next word.)
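A minimal NumPy sketch of one forward pass through such a feed-forward language model, assuming a fixed window of previous words (all parameter names are illustrative, not from the paper):

```python
import numpy as np

def ffnn_lm_step(context_ids, E, W, b, V, c):
    """Feed-forward neural LM: context words -> hidden "context vector" -> next-word distribution.

    context_ids: indices of the previous (n-1) words
    E: embedding matrix (vocab_size x emb_dim); W, b: hidden layer; V, c: output layer
    """
    x = E[context_ids].reshape(-1)        # concatenate the context embeddings
    h = np.tanh(W @ x + b)                # hidden layer: the context vector
    logits = V @ h + c
    exp = np.exp(logits - logits.max())   # numerically stable softmax
    return exp / exp.sum()                # probability distribution over the next word
```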

Page 12: Neural Language Modeling

◉ The input-layer (or hidden-layer) representations of related words are close
  ○ If P(jump | dog) is large, P(jump | cat) increases accordingly (even if “… cat jump …” never appears in the data)

(Figure: dog, cat, and rabbit lie close together in the h1-h2 representation space.)

Smoothing is automatically done.

Issue: fixed context window for conditioning

Page 13: Outline

Page 14: Recurrent Neural Network

◉ Idea: condition the neural network on all previous words and tie the weights at each time step

◉ Assumption: temporal information matters

Page 15: RNN Language Modeling

Idea: pass the information from the previous hidden layer to leverage all contexts.

(Figure: the input-hidden-output network unrolled over “START” → “wreck” → “a” → “nice”, with each hidden state feeding the next; at every step the input is the vector of the current word and the output is the probability distribution of the next word.)

Page 16: Recurrent Neural Network

A detailed look at the famous RNN

Page 17: Outline

Page 18: RNNLM Formulation

◉ At each time step, the network reads the vector of the current word $x_t$ and outputs the probability distribution of the next word $o_t$:

$$
s_t = f(U x_t + W s_{t-1}), \qquad o_t = \mathrm{softmax}(V s_t)
$$

Page 19: Outline

Page 20: Recurrent Neural Network Definition

http://www.wildml.com/2015/09/recurrent-neural-networks-tutorial-part-1-introduction-to-rnns/

$$
s_t = f(U x_t + W s_{t-1}), \qquad o_t = \mathrm{softmax}(V s_t)
$$

where the activation $f$ is typically tanh or ReLU.
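A minimal NumPy sketch of this recurrence, one forward pass over a whole sequence (names follow the slide’s notation; the function itself is ours):

```python
import numpy as np

def rnn_forward(xs, U, W, V, s0):
    """Vanilla RNN: s_t = tanh(U x_t + W s_{t-1}); o_t = softmax(V s_t)."""
    s, states, outputs = s0, [], []
    for x in xs:                         # one step per input word vector
        s = np.tanh(U @ x + W @ s)       # new hidden state from input and memory
        logits = V @ s
        exp = np.exp(logits - logits.max())
        states.append(s)
        outputs.append(exp / exp.sum())  # distribution over the next word
    return states, outputs
```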

Page 21: Model Training

◉ All model parameters can be updated by backpropagating, at every time step, the error between the predicted output $o_t$ and the target word $y_t$.

(Figure: the unrolled network with targets y_{t-1}, y_t, y_{t+1} above the predicted outputs.)

http://www.wildml.com/2015/09/recurrent-neural-networks-tutorial-part-1-introduction-to-rnns/
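A standard choice for this error is the cross-entropy loss summed over time steps, minimized by gradient descent (the exact formula shown on the slide is inferred, not preserved in the transcript):

$$
C = \sum_t C_t = -\sum_t y_t^{\top} \log o_t, \qquad \theta \leftarrow \theta - \eta \, \frac{\partial C}{\partial \theta}
$$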

Page 22: Outline

Page 23: Backpropagation

(Figure: layers l-1 and l of a feed-forward network, with weight $w_{ij}^{l}$ connecting neuron j in layer l-1 to neuron i in layer l.)

Forward pass: compute the activations layer by layer, $a^{l} = \sigma(z^{l})$ with $z^{l} = W^{l} a^{l-1} + b^{l}$.

Backward pass: propagate the error signal $\delta^{l}$ backward; the gradient of the cost with respect to each weight combines the two passes:

$$
\frac{\partial C}{\partial w_{ij}^{l}} = a_{j}^{l-1} \, \delta_{i}^{l}
$$

Page 24: Backpropagation

(Figure: the error signal flowing backward from output layer L through layer L-1 down to layer l.)

At the output layer: $\delta^{L} = \sigma'(z^{L}) \odot \nabla_{y} C$.

At each earlier layer: $\delta^{l} = \sigma'(z^{l}) \odot (W^{l+1})^{\top} \delta^{l+1}$.

Page 25: Backpropagation through Time (BPTT)

◉ Unfold
  ○ Input: init, x1, x2, …, xt
  ○ Output: ot
  ○ Target: yt

(Figure: the RNN unrolled from the initial state through x1 … xt into states s1 … st, producing output ot, which is compared against target yt to give the cost C(y) with per-output errors ∂C/∂o1 … ∂C/∂on.)

Page 26: Backpropagation through Time (BPTT)

◉ Unfold
  ○ Input: init, x1, x2, …, xt
  ○ Output: ot
  ○ Target: yt

(Figure: the same unrolled network; the error of C(y) is distributed over the output neurons 1 … n.)

Page 27: Backpropagation through Time (BPTT)

◉ Unfold
  ○ Input: init, x1, x2, …, xt
  ○ Output: ot
  ○ Target: yt

(Figure: the same unrolled network, with the error signal flowing back through the hidden states st, st-1, …, s1.)

Page 28: Backpropagation through Time (BPTT)

◉ Unfold
  ○ Input: init, x1, x2, …, xt
  ○ Output: ot
  ○ Target: yt

◉ Weights are tied together: every unrolled copy of a connection (j → i) points to the same memory, i.e., one shared parameter.

(Figure: the matching (i, j) connections highlighted at each time step, with pointers to the shared weight.)

Page 29: Backpropagation through Time (BPTT)

◉ Unfold
  ○ Input: init, x1, x2, …, xt
  ○ Output: ot
  ○ Target: yt

◉ Weights are tied together, so the gradient of a shared weight accumulates the contributions from every time step.

(Figure: the (i, j, k) connections highlighted across all unrolled steps.)

Page 30: BPTT

Forward pass: compute s1, s2, s3, s4, …

Backward pass: compute the gradient of each cost C(1), C(2), C(3), C(4) in turn, each flowing back through all earlier time steps.

(Figure: four unrolled steps with inputs x1 … x4, states s1 … s4, outputs o1 … o4, targets y1 … y4, and per-step costs C(1) … C(4).)
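Putting the two passes together, a compact NumPy sketch of BPTT for the vanilla RNN above (cross-entropy per step; note how the gradients of the tied U, W, V sum over all time steps; the function is illustrative):

```python
import numpy as np

def bptt(xs, ys, U, W, V, s0):
    """Full BPTT for s_t = tanh(U x_t + W s_{t-1}), o_t = softmax(V s_t).

    xs: list of input vectors; ys: list of target word indices.
    Returns the total cross-entropy loss and gradients of the tied weights.
    """
    # Forward pass: store every state and output distribution
    states, probs, s = [s0], [], s0
    for x in xs:
        s = np.tanh(U @ x + W @ s)
        states.append(s)
        logits = V @ s
        e = np.exp(logits - logits.max())
        probs.append(e / e.sum())
    loss = -sum(np.log(p[y]) for p, y in zip(probs, ys))

    # Backward pass: gradients of tied weights accumulate over all time steps
    dU, dW, dV = np.zeros_like(U), np.zeros_like(W), np.zeros_like(V)
    ds_next = np.zeros_like(s0)
    for t in reversed(range(len(xs))):
        dlogits = probs[t].copy()
        dlogits[ys[t]] -= 1.0                 # softmax + cross-entropy gradient
        dV += np.outer(dlogits, states[t + 1])
        ds = V.T @ dlogits + ds_next          # error from output and from the future
        dz = (1.0 - states[t + 1] ** 2) * ds  # back through tanh
        dU += np.outer(dz, xs[t])
        dW += np.outer(dz, states[t])
        ds_next = W.T @ dz                    # pass error to the previous step
    return loss, dU, dW, dV
```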

Page 31: Outline

Page 32: RNN Training Issue

◉ The gradient is a product of Jacobian matrices, each associated with a step in the forward computation

◉ The same matrix is multiplied in at every time step during backprop, so the gradient becomes very small or very large quickly → vanishing or exploding gradient
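In the notation of the earlier slides, the sensitivity of the cost at step t to an earlier state $s_k$ contains exactly such a product of Jacobians:

$$
\frac{\partial C_t}{\partial s_k} = \frac{\partial C_t}{\partial s_t} \prod_{i=k+1}^{t} \frac{\partial s_i}{\partial s_{i-1}},
\qquad
\frac{\partial s_i}{\partial s_{i-1}} = \mathrm{diag}\!\left(f'(z_i)\right) W
$$

The same $W$ appears $t - k$ times, so its singular values drive the product toward zero (vanishing) or infinity (exploding).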

Bengio et al., “Learning long-term dependencies with gradient descent is difficult,” in IEEE Trans. on Neural Networks, 1994.

Pascanu et al., “On the difficulty of training recurrent neural networks,” in ICML, 2013.

Page 33: Rough Error Surface

The error surface is either very flat or very steep.

(Figure: the cost surface over weights w1 and w2, with sheer cliffs next to wide plateaus.)

Bengio et al., “Learning long-term dependencies with gradient descent is difficult,” in IEEE Trans. on Neural Networks, 1994.

Pascanu et al., “On the difficulty of training recurrent neural networks,” in ICML, 2013.

Page 34: Vanishing/Exploding Gradient Example

(Figure: six surface plots after 1, 2, 5, 10, 20, and 50 steps; as the number of steps grows, the values rapidly vanish toward zero or explode.)

Page 35: Solution for Exploding Gradient: Clipping

◉ Idea: cap the gradient value to avoid exploding; a clipped gradient keeps its direction but has a bounded magnitude

◉ Parameter setting: threshold values from half to ten times the average gradient norm can still yield convergence

(Figure: the cost surface over w1 and w2; the clipped gradient steps safely along the cliff instead of being thrown off it.)

Pascanu et al., “On the difficulty of training recurrent neural networks,” in ICML, 2013.
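A minimal sketch of clip-by-norm, the variant analyzed in Pascanu et al. (the function and names are illustrative):

```python
import numpy as np

def clip_by_norm(grad, threshold):
    """Rescale the gradient if its L2 norm exceeds the threshold.

    The direction is preserved; only the magnitude is capped.
    """
    norm = np.linalg.norm(grad)
    if norm > threshold:
        grad = grad * (threshold / norm)
    return grad
```

Frameworks ship the same idea ready-made, e.g. torch.nn.utils.clip_grad_norm_ in PyTorch.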

Page 36: Solution for Vanishing Gradient: Gating

◉ RNN models temporal sequence information
  ○ can handle “long-term dependencies” in theory

Issue: in practice, RNN cannot handle “long-term dependencies” due to vanishing gradient → gating directly encodes long-distance information

Example: “I grew up in France… I speak fluent French.” Predicting “French” requires information from many steps back.

http://colah.github.io/posts/2015-08-Understanding-LSTMs/
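The slide motivates gating without spelling it out; for reference, the standard LSTM cell (the design covered in the linked post) computes:

$$
\begin{aligned}
f_t &= \sigma(W_f [h_{t-1}, x_t] + b_f) && \text{(forget gate)} \\
i_t &= \sigma(W_i [h_{t-1}, x_t] + b_i) && \text{(input gate)} \\
\tilde{c}_t &= \tanh(W_c [h_{t-1}, x_t] + b_c) \\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t \\
o_t &= \sigma(W_o [h_{t-1}, x_t] + b_o) && \text{(output gate)} \\
h_t &= o_t \odot \tanh(c_t)
\end{aligned}
$$

The additive update of the cell state $c_t$ lets gradients flow across many time steps without repeated multiplication by the same recurrent matrix.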

Page 37: Extension: Bidirectional RNN

$h_t = [\overrightarrow{h}_t ; \overleftarrow{h}_t]$ represents (summarizes) the past and future around a single token.

Page 38: Extension: Deep Bidirectional RNN

Each memory layer passes an intermediate representation to the next.

Page 39: RNN Applications

Various application scenarios for RNNs

Page 40: Outline

Page 41: How to Frame the Learning Problem?

◉ The learning algorithm f maps the input domain X into the output domain Y:

$$ f : X \rightarrow Y $$

◉ Input domain: word, word sequence, audio signal, click logs

◉ Output domain: single label, sequence tags, tree structure, probability distribution

Network design should leverage input and output domain properties.

Page 42: Outline

Page 43: Input Domain – Sequence Modeling

◉ Idea: aggregate the meaning from all words into a vector

◉ Method (a sketch contrasting the two families follows below):
  ○ Basic combination: average, sum
  ○ Neural combination:
    ✓ Recursive neural network (RvNN)
    ✓ Recurrent neural network (RNN)
    ✓ Convolutional neural network (CNN)

(Figure: how to compute an N-dim vector for the example sentence “這 規格 有 誠意”: 這 (this), 規格 (specification), 有 (have), 誠意 (sincerity).)
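A minimal sketch contrasting a basic combination with an RNN encoder (reusing the vanilla-RNN step from Page 20; names are illustrative):

```python
import numpy as np

def encode_average(word_vectors):
    """Basic combination: an order-insensitive average of the word vectors."""
    return np.mean(word_vectors, axis=0)

def encode_rnn(word_vectors, U, W, s0):
    """Neural combination: the final hidden state summarizes the sequence in order."""
    s = s0
    for x in word_vectors:
        s = np.tanh(U @ x + W @ s)
    return s
```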

Page 44: Sentiment Analysis

◉ Encode the sequential input into a vector using an RNN

(Figure: the words 這 / 規格 / 有 / 誠意 feed into the RNN step by step up to x4; the final hidden state h4 is the sentence vector passed to a classifier mapping inputs x to outputs y.)

RNN considers temporal information to learn sentence vectors as the classifier’s input.

Page 45: Outline

Page 46: Output Domain – Sequence Prediction

◉ POS Tagging: “推薦我台大後門的餐廳” (“recommend me a restaurant by NTU’s back gate”) → 推薦/VV 我/PN 台大/NR 後門/NN 的/DEG 餐廳/NN

◉ Speech Recognition: “大家好” (“hello, everyone”)

◉ Machine Translation: “How are you doing today?” → “你好嗎?” (“how are you?”)

The output can be viewed as a sequence of classification decisions.

Page 47: Outline

Page 48: POS Tagging

◉ Tag a word at each time step
  ○ Input: word sequence
  ○ Output: the corresponding POS tag sequence

Example: 四樓 好 專業 (“the fourth floor is so professional”)
Tags: N VA AD

Page 49: Natural Language Understanding (NLU)

◉ Tag a word at each time step
  ○ Input: word sequence
  ○ Output: IOB-format slot tags and an intent tag

<START> just/O sent/O email/O to/O bob/B-contact_name about/O fishing/B-subject this/I-subject weekend/I-subject <END>

Intent: send_email

→ send_email(contact_name=“bob”, subject=“fishing this weekend”)

Temporal orders for input and output are the same.
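A small sketch (ours, not from the slides) of turning such an IOB tag sequence into the slot-value arguments shown above:

```python
def decode_iob(words, tags):
    """Collect IOB slot tags into {slot_name: value} spans."""
    slots, name, span = {}, None, []
    for word, tag in zip(words, tags):
        if tag.startswith("B-"):               # a new slot span begins
            if name:
                slots[name] = " ".join(span)
            name, span = tag[2:], [word]
        elif tag.startswith("I-") and name == tag[2:]:
            span.append(word)                  # continue the current span
        else:                                  # "O" (or a stray tag) closes any open span
            if name:
                slots[name] = " ".join(span)
            name, span = None, []
    if name:
        slots[name] = " ".join(span)
    return slots

words = "just sent email to bob about fishing this weekend".split()
tags = ["O", "O", "O", "O", "B-contact_name", "O", "B-subject", "I-subject", "I-subject"]
print(decode_iob(words, tags))  # {'contact_name': 'bob', 'subject': 'fishing this weekend'}
```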

Page 50: Outline

Page 51: Machine Translation

◉ Cascade two RNNs, one for encoding and one for decoding
  ○ Input: a word sequence in the source language
  ○ Output: a word sequence in the target language

(Figure: the encoder reads the source sentence; the decoder then emits “超棒 的 醬汁” (“awesome sauce”).)
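A minimal sketch of the cascade with greedy decoding, reusing the vanilla-RNN step (the parameter layout, embedding matrix, and start/end token indices are illustrative):

```python
import numpy as np

def seq2seq_greedy(src_vectors, enc, dec, E_tgt, V, start_id, end_id, max_len=20):
    """Encoder RNN summarizes the source; decoder RNN emits target words greedily.

    enc/dec: dicts holding each RNN's U and W matrices; V: decoder output weights.
    """
    # Encoder: fold the source sequence into a single state
    s = np.zeros(enc["W"].shape[0])
    for x in src_vectors:
        s = np.tanh(enc["U"] @ x + enc["W"] @ s)

    # Decoder: start from the encoder state, feed back the previous output word
    out, word = [], start_id
    for _ in range(max_len):
        s = np.tanh(dec["U"] @ E_tgt[word] + dec["W"] @ s)
        word = int(np.argmax(V @ s))   # greedy choice of the next word
        if word == end_id:
            break
        out.append(word)
    return out
```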

Page 52: Chit-Chat Dialogue Modeling

◉ Cascade two RNNs, one for encoding and one for decoding
  ○ Input: a word sequence in the question
  ○ Output: a word sequence in the response

Temporal ordering for input and output may be different.

Page 53: Sci-Fi Short Film - SUNSPRING

https://www.youtube.com/watch?v=LY7x2Ihqj

Page 54: Concluding Remarks

◉ Language Modeling
  ○ RNNLM

◉ Recurrent Neural Networks
  ○ Definition
  ○ Backpropagation through Time (BPTT)
  ○ Vanishing/Exploding Gradient

◉ RNN Applications
  ○ Sequential Input: Sequence-Level Embedding
  ○ Sequential Output: Tagging / Seq2Seq (Encoder-Decoder)