Introduction to Sequence to Sequence Model
2017.03.16 Seminar, Presenter: Hyemin Ahn
05/02/2023 CPSLAB (EECS) 2
Recurrent Neural Networks : For what?
Humans remember and use the pattern of a sequence. Try "a b c d e f g …": easy. But how about "z y x w v u t s …"?
The idea behind RNNs is to make use of sequential information: let's learn the pattern of a sequence, and utilize it (estimate, generate, etc.)!
But HOW?
Recurrent Neural Networks : Typical RNNs
[Figure: a typical RNN. The input x_t enters through weights U, the hidden state h_t feeds back through W with a one-step delay, and the output y_t is read out through V.]
RNNs are called "recurrent" because they perform the same task for every element of a sequence, with each output depending on the previous computations.
RNNs have a "memory" which captures information about what has been calculated so far: the hidden state captures some information about the sequence.
With this simple recurrence, the vanishing/exploding gradient problem occurs over long sequences. To overcome this, we use LSTM/GRU.
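As a sketch, one step of this recurrence with the weights U, W, V from the figure can be written as follows. A tanh activation is assumed, and all names and dimensions are illustrative, not from the slides:

```python
import numpy as np

def rnn_step(x_t, h_prev, U, W, V):
    """One step of a vanilla RNN: the same weights U, W, V are reused at
    every time step, and h_t carries the memory of the sequence forward."""
    h_t = np.tanh(U @ x_t + W @ h_prev)  # new hidden state
    y_t = V @ h_t                        # output at this step
    return h_t, y_t

# Toy dimensions: 3-d input, 4-d hidden state, 2-d output.
rng = np.random.default_rng(0)
U, W, V = rng.normal(size=(4, 3)), rng.normal(size=(4, 4)), rng.normal(size=(2, 4))

h = np.zeros(4)
for x_t in rng.normal(size=(5, 3)):  # a length-5 input sequence
    h, y = rnn_step(x_t, h, U, W, V)
```

Because the same W is multiplied in at every step, gradients flowing back through many steps shrink or blow up, which is exactly the vanishing/exploding problem mentioned above.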
Recurrent Neural Networks : LSTM
Let's think about a machine which guesses the dinner menu from the things in a shopping bag.
"Umm… Carbonara!"
Recurrent Neural Networks : LSTM
C_t : the cell state, an internal memory unit. Like a conveyor belt!
[Figure: an LSTM cell with input x_t and hidden state h_t; the belt carries: pancake, oil spaghetti, fried food.]
Recurrent Neural Networks : LSTM
[Figure: given h_{t-1} and x_t, the forget gate drops some memories from the cell state C_t ("Forget some memories!"). Belt contents: pancake, oil spaghetti, fried food.]
Recurrent Neural Networks : LSTM
[Figure: the forget gate acting on the cell state C_t. Belt contents: pancake, oil spaghetti, fried food.]
LSTM learns (1) how to forget part of its memory when the previous cell state and the new input are given, and (2) then how to add new memory given what remains and the input.
Recurrent Neural Networks : LSTM
[Figure: the input gate inserts some memories ("Insert some memories!"): oil / cream / tomato spaghetti.]
Recurrent Neural Networks : LSTM
[Figure: the belt now carries: oil / cream / tomato spaghetti, carbonara.]
Recurrent Neural Networks : LSTM
[Figure: after forgetting, the belt carries: oil / cream spaghetti, carbonara.]
Recurrent Neural Networks : LSTM
[Figure: the output gate reads the cell state and emits the guess y_t: "Carbonara".]
Recurrent Neural Networks : LSTM
[Figure: the full LSTM cell, gate by gate.]
Figures from http://colah.github.io/posts/2015-08-Understanding-LSTMs/
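The gate structure in these figures can be sketched in a few lines. This is a minimal NumPy version of the standard LSTM step; the weight names Wf, Wi, Wc, Wo and all sizes here are illustrative:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, p):
    """One LSTM step: the forget gate f_t erases part of the cell state,
    the input gate i_t writes a candidate memory onto it, and the output
    gate o_t decides what part of the memory to expose as h_t."""
    z = np.concatenate([h_prev, x_t])            # [h_{t-1}; x_t]
    f_t = sigmoid(p["Wf"] @ z + p["bf"])         # forget: what to erase
    i_t = sigmoid(p["Wi"] @ z + p["bi"])         # input: what to write
    c_hat = np.tanh(p["Wc"] @ z + p["bc"])       # candidate new memory
    c_t = f_t * c_prev + i_t * c_hat             # conveyor-belt update
    o_t = sigmoid(p["Wo"] @ z + p["bo"])         # output: what to reveal
    h_t = o_t * np.tanh(c_t)
    return h_t, c_t

# Toy sizes: 3-d input, 4-d hidden/cell state (so each W is 4 x 7).
rng = np.random.default_rng(0)
p = {k: rng.normal(size=(4, 7)) for k in ("Wf", "Wi", "Wc", "Wo")}
p.update({k: np.zeros(4) for k in ("bf", "bi", "bc", "bo")})
h, c = lstm_step(rng.normal(size=3), np.zeros(4), np.zeros(4), p)
```

The additive update of c_t (old memory times f_t, plus new memory times i_t) is what lets gradients flow along the belt without vanishing.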
Recurrent Neural Networks : GRU
Maybe we can simplify this structure and keep it efficient: the GRU!
[Figure: the GRU cell.]
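A minimal sketch of the GRU update, assuming the standard update/reset gate equations; it keeps one state vector instead of the LSTM's two. Names and dimensions are illustrative:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gru_step(x_t, h_prev, p):
    """One GRU step: the update gate z_t interpolates between the old state
    and a candidate state, and the reset gate r_t controls how much of the
    old state feeds into that candidate."""
    zx = np.concatenate([h_prev, x_t])
    z_t = sigmoid(p["Wz"] @ zx + p["bz"])   # update gate
    r_t = sigmoid(p["Wr"] @ zx + p["br"])   # reset gate
    h_hat = np.tanh(p["Wh"] @ np.concatenate([r_t * h_prev, x_t]) + p["bh"])
    return (1.0 - z_t) * h_prev + z_t * h_hat  # interpolated new state

# Toy sizes: 3-d input, 4-d hidden state (so each W is 4 x 7).
rng = np.random.default_rng(0)
p = {k: rng.normal(size=(4, 7)) for k in ("Wz", "Wr", "Wh")}
p.update({k: np.zeros(4) for k in ("bz", "br", "bh")})
h = gru_step(rng.normal(size=3), np.zeros(4), p)
```

Compared with the LSTM sketch above, the forget and input roles are merged into the single update gate z_t, which is why the GRU is a popular lighter-weight choice.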
Sequence to Sequence Model: What is it?
[Figure: an LSTM/GRU encoder reads the input into hidden states h_e(1) … h_e(5); its last state is handed to an LSTM/GRU decoder with hidden states h_d(1) … h_d(T_e). Example: a "Western food" to "Korean food" transition.]
Sequence to Sequence Model: Implementation
The simplest way to implement a sequence-to-sequence model is to just pass the last hidden state of the encoder to the first GRU cell of the decoder!
However, this method gets weaker when the decoder needs to generate a longer sequence.
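A minimal sketch of this wiring, using plain tanh-RNN cells in place of GRUs to keep it short; all names and dimensions are illustrative:

```python
import numpy as np

def cell(x_t, h_prev, Wx, Wh):
    """A plain tanh-RNN cell stands in for the GRU for brevity."""
    return np.tanh(Wx @ x_t + Wh @ h_prev)

def encode(xs, Wx, Wh):
    """Run the encoder over the whole input; only the LAST state is kept,
    so it must summarize the entire sequence in one fixed-size vector."""
    h = np.zeros(Wh.shape[0])
    for x_t in xs:
        h = cell(x_t, h, Wx, Wh)
    return h

def decode(h0, n_steps, Wx, Wh, Wy):
    """Initialize the decoder with the encoder's last state and unroll it,
    feeding each output back in as the next input."""
    h, y, outputs = h0, np.zeros(Wx.shape[1]), []
    for _ in range(n_steps):
        h = cell(y, h, Wx, Wh)
        y = Wy @ h
        outputs.append(y)
    return outputs

# Toy run: encode a length-5 sequence of 3-d vectors, decode 6 output steps.
rng = np.random.default_rng(0)
We, Ue = rng.normal(size=(4, 3)), rng.normal(size=(4, 4))
Wd, Ud, Wy = rng.normal(size=(4, 3)), rng.normal(size=(4, 4)), rng.normal(size=(3, 4))
ys = decode(encode(rng.normal(size=(5, 3)), We, Ue), 6, Wd, Ud, Wy)
```

The weakness mentioned above is visible here: everything the decoder knows about the input is squeezed into the single vector h0, which is exactly what attention relaxes.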
Sequence to Sequence Model: Attention Decoder
[Figure: a bidirectional GRU encoder feeding an attention GRU decoder through context vectors c_t.]
For each GRU cell of the decoder, let's pass the encoder's information differently!
Each encoder state concatenates the forward and backward GRU states, h_i = [h⃗_i ; h⃖_i], and the context vector for decoder step i is c_i = Σ_{j=1}^{T_x} α_{ij} h_j.
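A sketch of computing one context vector c_i from the encoder states. A simple bilinear score is assumed here for brevity; the original attention decoder scores with a small feed-forward network, but the weighted sum is the same:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def context_vector(s_prev, H, Wa):
    """Context c_i = sum_j alpha_ij * h_j over all encoder states.
    H: (T_x, 2*hidden) matrix whose rows are the concatenated
    forward/backward encoder states h_j; s_prev is the previous
    decoder state; Wa is an illustrative score matrix."""
    scores = H @ (Wa.T @ s_prev)  # one score e_ij per encoder position j
    alpha = softmax(scores)       # attention weights alpha_ij, sum to 1
    return alpha @ H, alpha       # context c_i and the weights

# Toy run: 5 encoder positions, 6-d bidirectional states, 4-d decoder state.
rng = np.random.default_rng(0)
H = rng.normal(size=(5, 6))
c, alpha = context_vector(rng.normal(size=4), H, rng.normal(size=(4, 6)))
```

Because each decoder step recomputes its own alpha over all encoder positions, long inputs no longer have to be squeezed into a single fixed-size vector.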
Sequence to Sequence Model: Example codes
Code here @ GitHub