
Document Context Neural Machine Translation with Memory Networks


Sameen Maruf, Gholamreza Haffari

Faculty of Information Technology

Monash University

July 17, 2017


Overview

1 Introduction

2 Document MT as Structured Prediction

3 Document NMT with MemNets

4 Experiments and Analysis

5 Conclusion

6 References

Introduction

Why document-level machine translation?

Most MT models translate sentences independently

Discourse phenomena are ignored, e.g. pronominal anaphora and lexical consistency, which may involve long-range dependencies

Statistical MT approaches to document MT do not yield significant empirical improvements [Hardmeier and Federico, 2010; Gong et al., 2011; Garcia et al., 2014]

Previous context-NMT models only use local context, and report deteriorated performance when using the target-side context [Jean et al., 2017; Wang et al., 2017; Bawden et al., 2018]

We incorporate global source and target document contexts


Document MT as Structured Prediction

Two types of factors: f_θ(y_t ; x_t, x_{−t}) and g_θ(y_t ; y_{−t})


Training objective: maximise P(y_1, ..., y_{|d|} | x_1, ..., x_{|d|})

⟹ maximise the pseudo-likelihood:

    argmax_θ ∏_{t=1}^{|d|} P_θ(y_t | x_t, y_{−t}, x_{−t})        (1)

where f_θ and g_θ are subsumed in P_θ(y_t | x_t, y_{−t}, x_{−t})
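As a sketch, the pseudo-likelihood objective decomposes into one term per sentence, each conditioned on its source sentence plus all other source and target sentences. The helper below assumes a hypothetical log_prob(y_t, x_t, y_rest, x_rest) scoring function standing in for log P_θ; it is illustrative, not the paper's implementation.

```python
def doc_pseudo_log_likelihood(xs, ys, log_prob):
    """Log pseudo-likelihood of a document: sum over sentences t of
    log P(y_t | x_t, y_{-t}, x_{-t}), where -t means "all but t"."""
    total = 0.0
    for t in range(len(xs)):
        x_rest = xs[:t] + xs[t + 1:]   # x_{-t}: all other source sentences
        y_rest = ys[:t] + ys[t + 1:]   # y_{-t}: all other target sentences
        total += log_prob(ys[t], xs[t], y_rest, x_rest)
    return total
```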


Challenge: at test time, the target document is not given

Coordinate Ascent (i.e., Iterative Decoding)
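A minimal sketch of the coordinate-ascent decoding loop: initialise with context-free translations, then repeatedly re-translate each sentence given the current draft of the rest. The translate(x_t, y_rest, x_rest) function is a hypothetical stand-in for sentence-level decoding with document context.

```python
def iterative_decode(xs, translate, rounds=2):
    """Coordinate ascent over target sentences of one document."""
    # First pass: translate each sentence with no document context.
    ys = [translate(x, [], []) for x in xs]
    # Subsequent passes: condition each sentence on the other drafts.
    for _ in range(rounds):
        for t in range(len(xs)):
            x_rest = xs[:t] + xs[t + 1:]
            y_rest = ys[:t] + ys[t + 1:]
            ys[t] = translate(xs[t], y_rest, x_rest)
    return ys
```

Each re-translation can only increase (or keep) the pseudo-likelihood term it optimises, so a few rounds typically suffice before the drafts stop changing.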

Document NMT with MemNets

The NMT model parameterises P_θ(y_t | x_t, y_{−t}, x_{−t})


Memory-to-Context:

s_{t,j} = GRU(s_{t,j−1}, E_T[y_{t,j−1}], c_{t,j}, c_t^src, c_t^trg)
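An illustrative pure-Python sketch of this decoder update: the GRU input is the previous target embedding concatenated with the attentional context and the two memory context vectors. Sizes, weights, and vector values are toy assumptions, not the paper's configuration.

```python
import math
import random

def matvec(W, v):
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))

def gru_step(s, x, U, W):
    """One GRU update; U maps the hidden state, W maps the (extended) input."""
    z = [sigmoid(a + b) for a, b in zip(matvec(U["z"], s), matvec(W["z"], x))]
    r = [sigmoid(a + b) for a, b in zip(matvec(U["r"], s), matvec(W["r"], x))]
    rs = [ri * si for ri, si in zip(r, s)]
    h = [math.tanh(a + b) for a, b in zip(matvec(U["h"], rs), matvec(W["h"], x))]
    return [(1 - zi) * si + zi * hi for zi, si, hi in zip(z, s, h)]

random.seed(0)
hid, dim = 3, 2            # hidden size; size of each input part (toy values)
inp = 4 * dim              # embedding + attention ctx + c_src + c_trg
U = {k: [[random.uniform(-0.1, 0.1) for _ in range(hid)] for _ in range(hid)] for k in "zrh"}
W = {k: [[random.uniform(-0.1, 0.1) for _ in range(inp)] for _ in range(hid)] for k in "zrh"}

emb_prev = [0.1, -0.2]     # E_T[y_{t,j-1}]
attn_ctx = [0.3, 0.0]      # c_{t,j}, sentence-level attention context
c_src    = [0.2, 0.1]      # source document memory context
c_trg    = [-0.1, 0.4]     # target document memory context

s_prev = [0.0] * hid
# Memory-to-Context: the memories enter through the GRU's input.
s_new = gru_step(s_prev, emb_prev + attn_ctx + c_src + c_trg, U, W)
```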


Memory-to-Output:

y_{t,j} ∼ softmax(W_y · r_{t,j} + W_{ym} · c_t^src + W_{yt} · c_t^trg + b_y)
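A toy sketch of the Memory-to-Output combination: the two memory context vectors contribute additive terms to the pre-softmax logits, one score per vocabulary word. All matrices and vectors here are illustrative stand-ins.

```python
import math

def matvec(W, v):
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def softmax(z):
    m = max(z)                              # subtract max for numerical stability
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]

def output_distribution(r, c_src, c_trg, Wy, Wym, Wyt, by):
    """logits = Wy·r + Wym·c_src + Wyt·c_trg + by, then softmax."""
    logits = [a + b + c + d for a, b, c, d in
              zip(matvec(Wy, r), matvec(Wym, c_src), matvec(Wyt, c_trg), by)]
    return softmax(logits)

# Toy sizes: vocabulary of 4 words, hidden/context size 2.
Wy  = [[0.5, -0.2], [0.1, 0.3], [-0.4, 0.2], [0.0, 0.1]]
Wym = [[0.2, 0.0], [-0.1, 0.1], [0.3, -0.2], [0.1, 0.1]]
Wyt = [[0.0, 0.4], [0.2, -0.3], [0.1, 0.1], [-0.2, 0.0]]
by  = [0.0, 0.1, -0.1, 0.0]

probs = output_distribution([0.5, -0.5], [0.2, 0.1], [-0.1, 0.3], Wy, Wym, Wyt, by)
```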


Use only source, target, or both external memories

Use the Memory-to-Context or Memory-to-Output architecture to incorporate the different contexts


Experiments and Analysis

Experimental Setup

Training/dev/test corpora statistics:

corpus                 | #docs (H)  | #sents (K) | avg doc len
Fr→En Ted-Talks        | 10/1.2/1.5 | 123/15/19  | 123/128/124
Et→En Europarl v7      | 150/10/18  | 209/14/25  | 14/14/14
De→En News-Commentary  | 49/.9/1.6  | 191/2/3    | 39/23/19

Evaluation Metrics: BLEU, METEOR

Baselines:

Context-free baseline (S-NMT)

Local source context baselines: [Jean et al., 2017] and [Wang et al., 2017]


Memory-to-Context Results


BLEU scores (Memory-to-Context):

             Fr→En   De→En   Et→En
S-NMT        20.85    9.18   20.42
S-NMT+src    21.91   10.20   22.10
S-NMT+trg    21.74    9.97   21.94
S-NMT+both   22.00   10.54   22.32


Memory-to-Output Results


BLEU scores (Memory-to-Output):

             Fr→En   De→En   Et→En
S-NMT        20.85    9.18   20.42
S-NMT+src    21.80    9.98   21.50
S-NMT+trg    21.76   10.04   21.82
S-NMT+both   21.77   10.23   22.20


Main Results


BLEU scores (comparison with context-NMT baselines):

                      Fr→En   De→En   Et→En
[Jean et al., 2017]   21.95   10.26   21.67
[Wang et al., 2017]   21.87   10.14   22.06
S-NMT+src             21.91   10.20   22.10
S-NMT+both            22.00   10.54   22.32


Example translation

Source:              qimonda taidab lissaboni strateegia eesmarke.
Target:              qimonda meets the objectives of the lisbon strategy.
S-NMT:               <UNK> is the objectives of the lisbon strategy.
+Src Mem:            the millennium development goals are fulfilling the millennium goals of the lisbon strategy.
+Trg Mem:            in writing. - (ro) the lisbon strategy is fulfilling the objectives of the lisbon strategy.
+Both Mems:          qimonda fulfils the aims of the lisbon strategy.
[Wang et al., 2017]: <UNK> fulfils the objectives of the lisbon strategy.


Example translation (contd.)

Source:              ... et riigis kehtib endiselt lukasenka diktatuur, mis rikub inim- ning etnilise vahemuse oigusi.
Target:              ... this country is still under the dictatorship of lukashenko, breaching human rights and the rights of ethnic minorities.
S-NMT:               ... the country still remains in a position of lukashenko to violate human rights and ethnic minorities.
+Src Mem:            ... the country still applies to the brutal dictatorship of human and ethnic minority rights.
+Trg Mem:            ... the country still keeps the <UNK> dictatorship that violates human rights and ethnic rights.
+Both Mems:          ... the country still persists in lukashenko's dictatorship that violate human rights and ethnic minority rights.
[Wang et al., 2017]: ... there is still a regime in the country that is violating the rights of human and ethnic minority in the country.

Conclusion

Proposed a model which incorporates the global source and target document contexts

Proposed effective training and decoding methodologies for our model

Future Work: investigate document-context NMT models which incorporate specific discourse-level phenomena


References I

Hardmeier, C. and Federico, M. (2010). Modelling pronominal anaphora in statistical machine translation. International Workshop on Spoken Language Translation.

Gong, Z., Zhang, M. and Zhou, G. (2011). Cache-based document-level statistical machine translation. Proceedings of the Conference on Empirical Methods in Natural Language Processing.

Garcia, E. M., Espana-Bonet, C. and Marquez, L. (2014). Document-level machine translation as a re-translation process. Procesamiento del Lenguaje Natural, 53:103–110.

Jean, S., Lauly, S., Firat, O. and Cho, K. (2017). Does Neural Machine Translation Benefit from Larger Context? arXiv:1704.05135.

Wang, L., Tu, Z., Way, A. and Liu, Q. (2017). Exploiting Cross-Sentence Context for Neural Machine Translation. Proceedings of the Conference on Empirical Methods in Natural Language Processing.

Bawden, R., Sennrich, R., Birch, A. and Haddow, B. (2018). Evaluating Discourse Phenomena in Neural Machine Translation. Proceedings of NAACL-HLT 2018.
