
Page 1

From Paraphrase Database to Compositional Paraphrase Model and Back

John WietingUniversity of Illinois

Joint work with Mohit Bansal, Kevin Gimpel, Karen Livescu, and Dan Roth

Page 2

Motivation

The PPDB (Ganitkevitch et al., 2013) is a vast collection of paraphrase pairs:

that allow the ↔ which enable the
be given the opportunity to ↔ have the possibility of
i can hardly hear you . ↔ you 're breaking up .
and the establishment ↔ as well as the development
laying the foundations ↔ pave the way
making every effort ↔ to do its utmost
…

Page 3

Motivation

• Improve coverage

• Have a parametric model

• Improve phrase pair scores

Page 4

Contributions

• Powerful word embeddings that have human-level performance on SimLex999 and WordSim353

• Phrase embeddings
• Model can re-rank phrases in PPDB 1.0 (improves human correlation from 25 to 52 ρ)
• Parameterization of PPDB that can be used downstream

• New datasets

Page 5

Datasets

We wanted a clean way to evaluate paraphrase composition.

Two new datasets: one for bigram paraphrases and one for short-phrase paraphrases from PPDB.

Page 6

Words: WordSim353 (topical), SimLex-999 (paraphrastic)
Bigrams: MLSim (Mitchell and Lapata, 2010)

Example bigram pairs (MLSim score | paraphrase score):
television programme ↔ tv set: 5.8 | 1.0
training programme ↔ education course: 5.7 | 5.0
bedroom window ↔ education officer: 1.3 | 1.0

Page 7

Words: WordSim353 (topical), SimLex-999 (paraphrastic)
Bigrams: MLSim (Mitchell and Lapata, 2010), MLPara (this talk)

Page 8

MLPara: Spearman's ρ / Cohen's κ by bigram type:
adjective noun: 0.87 / 0.79
noun noun: 0.64 / 0.58
verb noun: 0.73 / 0.73

Page 9

Words: WordSim353 (topical), SimLex-999 (paraphrastic)
Bigrams: MLSim (Mitchell and Lapata, 2010), MLPara (this talk)
Phrases: AnnoPPDB (this talk)

Page 10

AnnoPPDB example pairs (gold similarity score):
can not be separated from ↔ is inseparable from: 5.0
hoped to be able to ↔ looked forward to: 3.4
come on , think about it ↔ people , please: 2.2
how do you mean that ↔ what worst feelings: 1.6

Page 11

Mean deviation: 0.60

Page 12

Dev and test sets were designed to have:
1) Variety of lengths
2) Variety of quality
3) Low word overlap

Page 13

See Pavlick et al., 2015 for a similar but larger dataset.

Page 14

Learning Embeddings

We now have datasets to test paraphrase similarity. Next, we learn to embed words and phrases.

All similarities are computed using cosine similarity.
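As a quick reference, the similarity function is just the normalized dot product; a minimal NumPy sketch (the function name is mine):

```python
import numpy as np

def cosine_similarity(u, v):
    """Cosine similarity: the dot product of two embedding
    vectors divided by the product of their norms."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))
```

Parallel vectors score 1.0 and orthogonal vectors 0.0, so higher means more similar.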

Page 15

Learning Embeddings

Related work on using PPDB to improve word embeddings: Yu and Dredze, 2014; Faruqui et al., 2015

Page 16

Training examples (word pairs from PPDB):

contamination pollution

converged convergence

captioned subtitled

outwit thwart

bad villain

broad general

permanent permanently

bed sack

carefree reckless

absolutely urgently

… …

Page 17

Page 18

Page 19

Loss Function for Learning

(Equation shown on slide.) A margin loss that sums over word pairs in PPDB: each positive example (a paraphrase pair) must score higher than its negative examples.
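The objective itself was an image on the slide and did not survive the transcript. As a rough sketch of the kind of margin loss described here (not the authors' exact formulation; `delta`, `w_t1`, and `w_t2` are my names for the margin and the two negative-example vectors):

```python
import numpy as np

def pair_hinge_loss(w_x1, w_x2, w_t1, w_t2, delta=1.0):
    """Margin loss for one PPDB word pair <x1, x2>: the positive
    example's score should beat each word's score against its
    negative example (t1 for x1, t2 for x2) by at least delta."""
    pos = np.dot(w_x1, w_x2)
    viol1 = max(0.0, delta - pos + np.dot(w_x1, w_t1))
    viol2 = max(0.0, delta - pos + np.dot(w_x2, w_t2))
    return float(viol1 + viol2)
```

Summing this over all word pairs in PPDB (plus regularization) gives the training objective.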

Page 20

Page 21

Page 22

Choosing Negative Examples?

Only do the argmax over the current mini-batch (for efficiency).

We regularize by penalizing the squared L2 distance to the initial embeddings.
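In code, the batch-local argmax might look like the following (my sketch; it scores candidates by dot product and skips the word itself and its paraphrase partner):

```python
import numpy as np

def choose_negative(i, partner, batch):
    """Return the index of the hardest in-batch negative for
    batch[i]: the most similar vector in the current mini-batch
    that is neither the word itself nor its paraphrase partner."""
    sims = batch @ batch[i]            # similarity to every batch vector
    sims[i] = sims[partner] = -np.inf  # exclude the positive pair
    return int(np.argmax(sims))
```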

Page 23

Training: 113k word pairs from PPDB (XL)
Tuning: WordSim353
Test: SimLex-999

Notes:
1. trained with AdaGrad; tuned step size, mini-batch size, and regularization
2. initialized with 25-dim skip-gram vectors trained on Wikipedia
3. statistical significance computed using the one-tailed method of Steiger (1980)
4. output of training: "paragram" embeddings

Example training pairs: contamination ↔ pollution, converged ↔ convergence, captioned ↔ subtitled, …
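Note 1 mentions AdaGrad; as a reminder of the update it performs (a generic sketch, not the talk's tuned hyperparameters):

```python
import numpy as np

def adagrad_step(w, grad, hist, lr=0.05, eps=1e-8):
    """AdaGrad update: each parameter's effective step size
    shrinks with its accumulated squared gradients, so frequently
    updated dimensions take increasingly cautious steps."""
    hist = hist + grad ** 2
    w = w - lr * grad / (np.sqrt(hist) + eps)
    return w, hist
```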

Page 24

Results: SimLex-999

Spearman's ρ × 100:
skip-gram (25-dim): 21
skip-gram (1000-dim): 38
Hill et al. (2014): 52
Average Human: 65.1

Page 25

Results: SimLex-999

Spearman's ρ × 100:
skip-gram (25-dim): 21
skip-gram (1000-dim): 38
Hill et al. (2014): 52
paragram (25-dim): 56
Average Human: 65.1

Page 26

Scaling up to 300 dimensions

Training: 170k word pairs from PPDB (XL)
Tuning: WordSim353
Test: SimLex-999

Notes:
1. replaced the dot product in the objective with cosine similarity
2. trained with AdaGrad; tuned step size, mini-batch size, margin, and regularization
3. initialized with 300-dim GloVe common-crawl embeddings
4. output of training: "paragram-ws353" embeddings ("paragram-sl999" if tuned on SimLex-999)

Example training pairs: contamination ↔ pollution, converged ↔ convergence, captioned ↔ subtitled, …

Page 27

Page 28

Results: SimLex-999

Spearman's ρ × 100:
GloVe: 37.6
Schwartz et al. (2015): 56.3
Faruqui and Dyer (2015): 57.8
Average Human: 65.1

Page 29

Results: SimLex-999

Spearman's ρ × 100:
GloVe: 37.6
Schwartz et al. (2015): 56.3
Faruqui and Dyer (2015): 57.8
paragram-ws353: 66.7
Average Human: 65.1

Page 30

Results: SimLex-999

Spearman's ρ × 100:
GloVe: 37.6
Schwartz et al. (2015): 56.3
Faruqui and Dyer (2015): 57.8
paragram-ws353: 66.7
paragram-sl999: 68.5
Average Human: 65.1

Page 31

Results: WordSim-353

Tune on SimLex-999, test on WordSim-353.

Spearman's ρ × 100:
GloVe: 57.9
Faruqui et al. (2015): 68.1
Huang et al. (2012): 71.3
Average Human: 75.6

Page 32

Results: WordSim-353

Tune on SimLex-999, test on WordSim-353.

Spearman's ρ × 100:
GloVe: 57.9
Faruqui et al. (2015): 68.1
Huang et al. (2012): 71.3
paragram-sl999: 72
Average Human: 75.6

Page 33

Results: WordSim-353

Tune on SimLex-999, test on WordSim-353.

Spearman's ρ × 100:
GloVe: 57.9
Faruqui et al. (2015): 68.1
Huang et al. (2012): 71.3
paragram-sl999: 72
paragram-ws353: 76.9
Average Human: 75.6

Page 34

Extrinsic Evaluation: Sentiment Analysis (25-dimension case)

word vectors / dimensionality / accuracy:
skip-gram / 25 / 77.0
skip-gram / 50 / 79.6
paragram / 25 / 80.9

Stanford Sentiment Treebank, binary classification.
Convolutional neural network (Kim, 2014) with 200 unigram filters; static: no fine-tuning of word vectors.
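With window size 1, each of the 200 filters in this kind of CNN just scores individual word vectors, and max-over-time pooling keeps the strongest response per filter. A stripped-down sketch (mine; the full model of Kim, 2014 also has a bias, a nonlinearity, and a softmax classifier on top):

```python
import numpy as np

def unigram_cnn_features(word_vecs, filters):
    """word_vecs: (n_words, dim); filters: (n_filters, dim).
    Each filter scores every word vector; max-over-time pooling
    yields one feature per filter, independent of sentence length."""
    responses = word_vecs @ filters.T   # (n_words, n_filters)
    return responses.max(axis=0)        # (n_filters,)
```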

Page 35

Page 36

Extrinsic Evaluation: Sentiment Analysis (300-dimension case)

word vectors / dimensionality / accuracy:
GloVe / 300 / 81.4
paragram-ws353 / 300 / 83.9
paragram-sl999 / 300 / 84.0

Stanford Sentiment Treebank, binary classification.
Convolutional neural network (Kim, 2014) with 200 unigram filters; static: no fine-tuning of word vectors.

Page 37

Page 38

Embedding Phrases?

We compare standard approaches:
• vector addition
• recursive neural network (RvNN; Socher et al., 2011), which requires a binarized parse (we use the Stanford parser)
• recurrent neural network (RtNN)
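As a sketch of two of these composition functions (my own minimal versions; the RvNN additionally composes along a binarized parse tree, which is omitted here):

```python
import numpy as np

def embed_addition(word_vecs):
    """Vector addition: the phrase embedding is the sum of its
    word embeddings; there are no composition parameters."""
    return np.sum(word_vecs, axis=0)

def embed_rtnn(word_vecs, Wh, Wx, b):
    """Recurrent network (RtNN): fold the words in left to right
    through a tanh cell; the final hidden state embeds the phrase."""
    h = np.zeros_like(b)
    for x in word_vecs:
        h = np.tanh(Wh @ h + Wx @ x + b)
    return h
```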

Page 39

Loss Functions for Phrases

Same form as the word loss, but word vectors are replaced by phrase vectors (computed by RvNN, RtNN, etc.), and we sum over phrase pairs in PPDB.

We regularize by penalizing the squared L2 distance to the initial (skip-gram) embeddings, and apply L2 regularization to the composition parameters.

Page 40

Training: bigram pairs extracted from PPDB
Tuning: MLSim (Mitchell & Lapata, 2010)
Test: MLPara

adjective noun (134k): easy job ↔ simple task
noun noun (36k): town meeting ↔ town council
verb noun (63k): achieve goal ↔ achieve aim

Notes: we extract bigram pairs of each type from PPDB using a part-of-speech tagger; when tuning/testing on one subset, we train only on bigram pairs of that subset.

Page 41

Results: MLPara

Averages over three data splits (adj noun, noun noun, verb noun); Spearman's ρ × 100:
skip-gram (25), +: 36
skip-gram (1000), +: 45
Hashimoto et al. (2014): 41
Average Human: 75

Page 42

Results: MLPara

Averages over three data splits (adj noun, noun noun, verb noun); Spearman's ρ × 100:
skip-gram (25), +: 36
skip-gram (1000), +: 45
Hashimoto et al. (2014): 41
paragram (25), +: 46
Average Human: 75

Page 43

Results: MLPara

Averages over three data splits (adj noun, noun noun, verb noun); Spearman's ρ × 100:
skip-gram (25), +: 36
skip-gram (1000), +: 45
Hashimoto et al. (2014): 41
paragram (25), +: 46
paragram (25), RNN: 52
Average Human: 75

Page 44

Results: MLPara (300-dimension case)

Averages over three data splits (adj noun, noun noun, verb noun); Spearman's ρ × 100:
GloVe: 40
paragram (25), RNN: 52
Average Human: 75

Page 45

Page 46

Results: MLPara

Averages over three data splits (adj noun, noun noun, verb noun); Spearman's ρ × 100:
GloVe: 40
paragram-ws353, +: 51
paragram-sl999, +: 52
paragram (25), RNN: 52
Average Human: 75

Page 47

Training: 60k phrase pairs from PPDB
Tuning: 260 annotated phrase pairs
Test: 1000 annotated phrase pairs

Example phrase pairs:
that allow the ↔ which enable the
be given the opportunity to ↔ have the possibility of
i can hardly hear you . ↔ you 're breaking up .
and the establishment ↔ as well as the development
laying the foundations ↔ pave the way
making every effort ↔ to do its utmost
…

Page 48

Results: AnnoPPDB

Support vector regression to predict gold similarities; 5-fold cross-validation on the 260-example dev set.

Spearman's ρ × 100:
skip-gram (25): 20
PPDB: 25
PPDB (tuned): 33
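A support-vector-regression baseline of this kind can be set up as below (illustrative scikit-learn sketch; the feature values and hyperparameters are made up, standing in for the actual features used in the talk):

```python
import numpy as np
from sklearn.svm import SVR

# toy stand-in features for each phrase pair (hypothetical, e.g. a
# PPDB score and a word-overlap score), with gold similarity targets
X_train = np.array([[0.1, 0.0], [0.4, 0.2], [0.7, 0.5], [0.9, 0.8]])
y_train = np.array([1.5, 2.5, 4.0, 4.8])

model = SVR(kernel="linear", C=10.0)  # support vector regression
model.fit(X_train, y_train)
pred = model.predict(np.array([[0.2, 0.1], [0.8, 0.7]]))
```

Predictions are then compared to the gold similarities with Spearman's ρ.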

Page 49

Results: AnnoPPDB

Spearman's ρ × 100:
skip-gram (25): 20
PPDB: 25
PPDB (tuned): 33
paragram (25), +: 32

Page 50

Results: AnnoPPDB

Spearman's ρ × 100:
skip-gram (25): 20
PPDB: 25
PPDB (tuned): 33
paragram (25), +: 32
paragram (25), RtNN: 39
paragram (25), RvNN: 40

Page 51

Results: AnnoPPDB (300-dimension case)

Spearman's ρ × 100:
PPDB: 25
paragram (25), RtNN: 40

Page 52

Page 53

Results: AnnoPPDB

Spearman's ρ × 100:
PPDB: 25
paragram (25), RtNN: 40
paragram-ws353: 43
paragram-sl999: 41

Page 54

Results: AnnoPPDB

Spearman's ρ × 100:
PPDB: 25
paragram (25), RtNN: 40
paragram-ws353: 43
paragram-sl999: 41
RtNN (300): 49
LSTM (300): 52

Page 55

Qualitative Analysis: for positive examples, the addition model outperforms the RvNN when the phrases 1) have similar length and 2) have more "synonyms" in common.

RvNN is better (gold / RvNN / +):
does not exceed ↔ is no more than: 5.0 / 4.8 / 3.5
could have an impact on ↔ may influence: 4.6 / 4.2 / 3.2
earliest opportunity ↔ early as possible: 4.4 / 4.3 / 2.9

Addition is better (gold / RvNN / +):
scheduled to be held in ↔ that will take place in: 4.6 / 2.9 / 4.4
according to the paper , ↔ the newspaper reported that: 4.6 / 2.8 / 4.1
's surname ↔ family name of: 4.4 / 2.8 / 4.1

Page 56

Page 57

Conclusion

Our work shows how to use PPDB to:
1) Create word embeddings that reach human-level performance on SimLex-999 and WordSim-353
2) Create compositional paraphrase models that improve the human correlation of PPDB 1.0 from 25 to 52 ρ

We have also released two new datasets for evaluating short-phrase paraphrase models.

Ongoing work: phrase model improvements, off-the-shelf testing on downstream tasks.

Page 58

Thanks!