Document Context Neural Machine Translation with Memory Networks

Sameen Maruf, Gholamreza Haffari
Faculty of Information Technology
Monash University
July 17, 2017
Overview

1 Introduction
2 Document MT as Structured Prediction
3 Document NMT with MemNets
4 Experiments and Analysis
5 Conclusion
6 References
Introduction

Why document-level machine translation?

Most MT models translate sentences independently.

Discourse phenomena are therefore ignored, e.g. pronominal anaphora and lexical consistency, which may involve long-range dependencies.
Why document-level machine translation?

Statistical MT approaches to document MT do not yield significant empirical improvements [Hardmeier and Federico, 2010, Gong et al., 2011, Garcia et al., 2014].

Previous context-NMT models use only local context, and report deteriorated performance when using the target-side context [Jean et al., 2017, Wang et al., 2017, Bawden et al., 2018].

We incorporate global source and target document contexts.
Document MT as Structured Prediction

[figure: graphical model over the document's source and target sentences]

Two types of factors: $f_\theta(y_t \,;\, x_t, x_{-t})$ and $g_\theta(y_t \,;\, y_{-t})$
Document MT as Structured Prediction

Training objective: maximise $P(y_1, \ldots, y_{|d|} \mid x_1, \ldots, x_{|d|})$

$\Longrightarrow$ maximise the pseudo-likelihood:

$$\arg\max_\theta \prod_{t=1}^{|d|} P_\theta(y_t \mid x_t, y_{-t}, x_{-t}) \qquad (1)$$

where $f_\theta$ and $g_\theta$ are subsumed in $P_\theta(y_t \mid x_t, y_{-t}, x_{-t})$.
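The pseudo-likelihood in Eq. (1) simply sums per-sentence conditional log-probabilities over one document. A minimal sketch, where `log_prob` is a hypothetical stand-in for the model's $\log P_\theta(y_t \mid x_t, y_{-t}, x_{-t})$:

```python
def pseudo_log_likelihood(doc_src, doc_trg, log_prob):
    """Sum log P(y_t | x_t, y_-t, x_-t) over all sentences t of one document.

    `log_prob` is a hypothetical stand-in for the trained model's
    per-sentence conditional log-probability.
    """
    total = 0.0
    for t in range(len(doc_src)):
        x_rest = doc_src[:t] + doc_src[t + 1:]  # x_{-t}: the other source sentences
        y_rest = doc_trg[:t] + doc_trg[t + 1:]  # y_{-t}: the other target sentences
        total += log_prob(doc_src[t], doc_trg[t], x_rest, y_rest)
    return total
```

During training all of $y_{-t}$ is available from the reference document, so each sentence's term can be computed independently.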
Document MT as Structured Prediction

Challenge: at test time, the target document is not given.

Solution: coordinate ascent (i.e., iterative decoding).
Iterative Decoding

[figure: iterative decoding sweeps over the document]
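The coordinate-ascent idea can be sketched as follows; `translate` is a hypothetical single-sentence decoder (the real model would run beam search conditioned on the memory contexts):

```python
def iterative_decode(doc_src, translate, sweeps=2):
    """Coordinate-ascent (iterative) decoding sketch.

    Pass 0 drafts every target sentence with an empty target context;
    each later sweep re-translates one sentence at a time, conditioning
    on the current draft of all *other* target sentences.
    `translate(x_t, x_rest, y_rest)` is a hypothetical sentence-level decoder.
    """
    n = len(doc_src)
    # Initial draft: source context only, no target context yet.
    doc_trg = [translate(doc_src[t], doc_src[:t] + doc_src[t + 1:], [])
               for t in range(n)]
    for _ in range(sweeps):
        for t in range(n):
            x_rest = doc_src[:t] + doc_src[t + 1:]   # x_{-t}
            y_rest = doc_trg[:t] + doc_trg[t + 1:]   # y_{-t}, current draft
            doc_trg[t] = translate(doc_src[t], x_rest, y_rest)
    return doc_trg
```

Each update holds every other coordinate (sentence) fixed, so the pseudo-likelihood cannot decrease under exact per-sentence decoding.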
Document NMT with MemNets

[figure: model architecture] $\Longrightarrow P_\theta(y_t \mid x_t, y_{-t}, x_{-t})$
Document NMT with MemNets

Memory-to-Context:

$$s_{t,j} = \mathrm{GRU}\big(s_{t,j-1},\, E_T[y_{t,j-1}],\, c_{t,j},\, c_t^{\mathrm{src}},\, c_t^{\mathrm{trg}}\big)$$
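A toy numpy sketch of the Memory-to-Context update: the previous target embedding, the sentence-level attention context, and the two document-memory contexts are concatenated into the input of a standard GRU cell. All sizes and weights here are illustrative stand-ins, not the paper's parameters (biases are omitted for brevity):

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def gru_step(h_prev, x, W, U):
    """Plain GRU update with the three gate weights stacked in W and U."""
    Wz, Wr, Wh = np.split(W, 3)
    Uz, Ur, Uh = np.split(U, 3)
    z = sigmoid(Wz @ x + Uz @ h_prev)              # update gate
    r = sigmoid(Wr @ x + Ur @ h_prev)              # reset gate
    h_tilde = np.tanh(Wh @ x + Uh @ (r * h_prev))  # candidate state
    return (1 - z) * h_prev + z * h_tilde

rng = np.random.default_rng(0)
H, E, C = 4, 3, 4                   # toy hidden, embedding, context sizes

emb_prev  = rng.normal(size=E)      # E_T[y_{t,j-1}]
c_attn    = rng.normal(size=C)      # c_{t,j}: sentence-level attention context
c_src_mem = rng.normal(size=C)      # c_t^src: source document memory context
c_trg_mem = rng.normal(size=C)      # c_t^trg: target document memory context
x = np.concatenate([emb_prev, c_attn, c_src_mem, c_trg_mem])

W = rng.normal(size=(3 * H, x.size))
U = rng.normal(size=(3 * H, H))
s = gru_step(np.zeros(H), x, W, U)  # s_{t,j}: next decoder state
```

The design choice here is that the memory contexts influence every decoding step through the recurrent state, rather than only the output layer.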
Document NMT with MemNets

Memory-to-Output:

$$y_{t,j} \sim \mathrm{softmax}\big(W_y \cdot r_{t,j} + W_{ym} \cdot c_t^{\mathrm{src}} + W_{yt} \cdot c_t^{\mathrm{trg}} + b_y\big)$$
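In the Memory-to-Output variant the two document-memory contexts instead feed the output layer directly, each through its own projection. A toy numpy sketch with illustrative (randomly initialised) sizes and weights:

```python
import numpy as np

def softmax(a):
    e = np.exp(a - a.max())  # stabilised
    return e / e.sum()

rng = np.random.default_rng(1)
V, H, C = 6, 4, 4                      # toy vocab, hidden, context sizes
W_y  = rng.normal(size=(V, H))         # projects r_{t,j}
W_ym = rng.normal(size=(V, C))         # projects c_t^src
W_yt = rng.normal(size=(V, C))         # projects c_t^trg
b_y  = np.zeros(V)

r_tj  = rng.normal(size=H)             # pre-output decoder representation
c_src = rng.normal(size=C)             # source document memory context
c_trg = rng.normal(size=C)             # target document memory context

# Distribution over the vocabulary for target word y_{t,j}.
p = softmax(W_y @ r_tj + W_ym @ c_src + W_yt @ c_trg + b_y)
```

Here the document context biases only the word prediction at each position, leaving the recurrent state untouched.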
Document NMT with MemNets

Use only the source, only the target, or both external memories.

Use the Memory-to-Context / Memory-to-Output architectures to incorporate the different contexts.
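At a high level, either memory is read the same way: attend over representations of the document's other sentences and take a weighted sum. A generic soft-attention read (the dot-product scoring below is an assumption for illustration; the model's exact scoring function may differ):

```python
import numpy as np

def softmax(a):
    e = np.exp(a - a.max())
    return e / e.sum()

def read_memory(query, memory):
    """Soft-attention read over a memory of sentence representations:
    score every cell against the query, then return the
    attention-weighted sum as the document context vector."""
    alpha = softmax(memory @ query)   # one attention weight per memory cell
    return alpha @ memory             # weighted sum of memory cells

rng = np.random.default_rng(2)
memory = rng.normal(size=(5, 4))      # 5 other sentences, 4-dim representations (toy)
query  = rng.normal(size=4)           # representation of the current sentence
c_doc  = read_memory(query, memory)   # plays the role of c_t^src or c_t^trg
```

For the source memory the cells come from encoding the other source sentences; for the target memory, from the current draft translations of the other target sentences.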
Experiments and Analysis

Experimental Setup

Training/dev/test corpora statistics:

        corpus            #docs (H)    #sents (K)   avg doc len
Fr→En   Ted-Talks         10/1.2/1.5   123/15/19    123/128/124
Et→En   Europarl v7       150/10/18    209/14/25    14/14/14
De→En   News-Commentary   49/.9/1.6    191/2/3      39/23/19

Evaluation metrics: BLEU, METEOR

Baselines:
Context-free baseline (S-NMT)
Local source context baselines: [Jean et al., 2017] & [Wang et al., 2017]
Memory-to-Context Results (BLEU)

        S-NMT   S-NMT+src   S-NMT+trg   S-NMT+both
Fr→En   20.85   21.91       21.74       22
De→En    9.18   10.2         9.97       10.54
Et→En   20.42   22.1        21.94       22.32
Memory-to-Output Results (BLEU)

        S-NMT   S-NMT+src   S-NMT+trg   S-NMT+both
Fr→En   20.85   21.8        21.76       21.77
De→En    9.18    9.98       10.04       10.23
Et→En   20.42   21.5        21.82       22.2
Main Results (BLEU)

        [Jean et al., 2017]   [Wang et al., 2017]   S-NMT+src   S-NMT+both
Fr→En   21.95                 21.87                 21.91       22
De→En   10.26                 10.14                 10.2        10.54
Et→En   21.67                 22.06                 22.1        22.32
Example translation

Source:              qimonda taidab lissaboni strateegia eesmarke.
Target:              qimonda meets the objectives of the lisbon strategy.
S-NMT:               <UNK> is the objectives of the lisbon strategy.
+Src Mem:            the millennium development goals are fulfilling the millennium goals of the lisbon strategy.
+Trg Mem:            in writing. - (ro) the lisbon strategy is fulfilling the objectives of the lisbon strategy.
+Both Mems:          qimonda fulfils the aims of the lisbon strategy.
[Wang et al., 2017]: <UNK> fulfils the objectives of the lisbon strategy.
Example translation (contd.)

Source:              ... et riigis kehtib endiselt lukasenka diktatuur, mis rikub inim- ning etnilise vahemuse oigusi.
Target:              ... this country is still under the dictatorship of lukashenko, breaching human rights and the rights of ethnic minorities.
S-NMT:               ... the country still remains in a position of lukashenko to violate human rights and ethnic minorities.
+Src Mem:            ... the country still applies to the brutal dictatorship of human and ethnic minority rights.
+Trg Mem:            ... the country still keeps the <UNK> dictatorship that violates human rights and ethnic rights.
+Both Mems:          ... the country still persists in lukashenko's dictatorship that violate human rights and ethnic minority rights.
[Wang et al., 2017]: ... there is still a regime in the country that is violating the rights of human and ethnic minority in the country.
Conclusion

Proposed a model that incorporates global source and target document contexts.

Proposed effective training and decoding methodologies for our model.

Future work: investigate document-context NMT models that incorporate specific discourse-level phenomena.
References

Hardmeier, C. and Federico, M. (2010). Modelling pronominal anaphora in statistical machine translation. International Workshop on Spoken Language Translation.

Gong, Z., Zhang, M. and Zhou, G. (2011). Cache-based document-level statistical machine translation. Proceedings of the Conference on Empirical Methods in Natural Language Processing.

Garcia, E. M., Espana-Bonet, C. and Marquez, L. (2014). Document-level machine translation as a re-translation process. Procesamiento del Lenguaje Natural, 53:103–110.

Jean, S., Lauly, S., Firat, O. and Cho, K. (2017). Does Neural Machine Translation Benefit from Larger Context? arXiv:1704.05135.

Wang, L., Tu, Z., Way, A. and Liu, Q. (2017). Exploiting Cross-Sentence Context for Neural Machine Translation. Proceedings of the Conference on Empirical Methods in Natural Language Processing.

Bawden, R., Sennrich, R., Birch, A. and Haddow, B. (2018). Evaluating Discourse Phenomena in Neural Machine Translation. Proceedings of NAACL-HLT 2018.