86
Learning Effective and Interpretable Semantic Models using Non-Negative Sparse Embedding (NNSE) Brian Murphy, Partha Talukdar, Tom Mitchell Machine Learning Department Carnegie Mellon University 1

Learning Effective and Interpretable Semantic Models using ... · Are NNSE Dimensions Coherent? •Word Intrusion Task (Boyd-Graber et al., NIPS 2009) • for each dimension, select

  • Upload
    others

  • View
    2

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Learning Effective and Interpretable Semantic Models using ... · Are NNSE Dimensions Coherent? •Word Intrusion Task (Boyd-Graber et al., NIPS 2009) • for each dimension, select

Learning Effective and Interpretable Semantic Models using

Non-Negative Sparse Embedding (NNSE)

Brian Murphy, Partha Talukdar, Tom MitchellMachine Learning Department

Carnegie Mellon University

1

Page 2: Learning Effective and Interpretable Semantic Models using ... · Are NNSE Dimensions Coherent? •Word Intrusion Task (Boyd-Graber et al., NIPS 2009) • for each dimension, select

Distributional Semantic Modeling2

Page 3: Learning Effective and Interpretable Semantic Models using ... · Are NNSE Dimensions Coherent? •Word Intrusion Task (Boyd-Graber et al., NIPS 2009) • for each dimension, select

Distributional Semantic Modeling

• Words are represented in a high dimensional vector space

2

Page 4: Learning Effective and Interpretable Semantic Models using ... · Are NNSE Dimensions Coherent? •Word Intrusion Task (Boyd-Graber et al., NIPS 2009) • for each dimension, select

Distributional Semantic Modeling

• Words are represented in a high dimensional vector space

• Long history:• (Deerwester et al., 1990), (Lin, 1998), (Turney, 2006), ...

2

Page 5: Learning Effective and Interpretable Semantic Models using ... · Are NNSE Dimensions Coherent? •Word Intrusion Task (Boyd-Graber et al., NIPS 2009) • for each dimension, select

Distributional Semantic Modeling

• Words are represented in a high dimensional vector space

• Long history:• (Deerwester et al., 1990), (Lin, 1998), (Turney, 2006), ...

• Although effective, these models are often not interpretable

2

Page 6: Learning Effective and Interpretable Semantic Models using ... · Are NNSE Dimensions Coherent? •Word Intrusion Task (Boyd-Graber et al., NIPS 2009) • for each dimension, select

Distributional Semantic Modeling

• Words are represented in a high dimensional vector space

• Long history:• (Deerwester et al., 1990), (Lin, 1998), (Turney, 2006), ...

• Although effective, these models are often not interpretable

2

Examples of top 5 words from 5 randomly chosen dimensions from SVD300

Page 7: Learning Effective and Interpretable Semantic Models using ... · Are NNSE Dimensions Coherent? •Word Intrusion Task (Boyd-Graber et al., NIPS 2009) • for each dimension, select

Why interpretable dimensions?

Semantic Decoding: (Mitchell et al., Science 2008)

3

Page 8: Learning Effective and Interpretable Semantic Models using ... · Are NNSE Dimensions Coherent? •Word Intrusion Task (Boyd-Graber et al., NIPS 2009) • for each dimension, select

Why interpretable dimensions?

Semantic Decoding: (Mitchell et al., Science 2008)

“motorbike”

Inputstimulus

word

3

Page 9: Learning Effective and Interpretable Semantic Models using ... · Are NNSE Dimensions Coherent? •Word Intrusion Task (Boyd-Graber et al., NIPS 2009) • for each dimension, select

Why interpretable dimensions?

Semantic Decoding: (Mitchell et al., Science 2008)

“motorbike”

Semanticrepresentation

(0.87, ride)

(0.29, see)

(0.00, rub)

(0.00, taste)

.

.

.

Inputstimulus

word

3

Page 10: Learning Effective and Interpretable Semantic Models using ... · Are NNSE Dimensions Coherent? •Word Intrusion Task (Boyd-Graber et al., NIPS 2009) • for each dimension, select

Why interpretable dimensions?

Semantic Decoding: (Mitchell et al., Science 2008)

“motorbike”

Semanticrepresentation

Mapping learned from fMRI data

(0.87, ride)

(0.29, see)

(0.00, rub)

(0.00, taste)

.

.

.

Inputstimulus

word

3

Page 11: Learning Effective and Interpretable Semantic Models using ... · Are NNSE Dimensions Coherent? •Word Intrusion Task (Boyd-Graber et al., NIPS 2009) • for each dimension, select

Why interpretable dimensions?

Semantic Decoding: (Mitchell et al., Science 2008)

“motorbike”predictedactivity

for “motorbike”

Semanticrepresentation

Mapping learned from fMRI data

(0.87, ride)

(0.29, see)

(0.00, rub)

(0.00, taste)

.

.

.

Inputstimulus

word

3

Page 12: Learning Effective and Interpretable Semantic Models using ... · Are NNSE Dimensions Coherent? •Word Intrusion Task (Boyd-Graber et al., NIPS 2009) • for each dimension, select

Why interpretable dimensions?(Mitchell et al., Science 2008)

4

Page 13: Learning Effective and Interpretable Semantic Models using ... · Are NNSE Dimensions Coherent? •Word Intrusion Task (Boyd-Graber et al., NIPS 2009) • for each dimension, select

Why interpretable dimensions?(Mitchell et al., Science 2008)

4

Page 14: Learning Effective and Interpretable Semantic Models using ... · Are NNSE Dimensions Coherent? •Word Intrusion Task (Boyd-Graber et al., NIPS 2009) • for each dimension, select

Why interpretable dimensions?(Mitchell et al., Science 2008)

• Interpretable dimension reveals insightful brain activation patterns!

4

Page 15: Learning Effective and Interpretable Semantic Models using ... · Are NNSE Dimensions Coherent? •Word Intrusion Task (Boyd-Graber et al., NIPS 2009) • for each dimension, select

Why interpretable dimensions?(Mitchell et al., Science 2008)

• Interpretable dimension reveals insightful brain activation patterns!

•But, features in the semantic representation were based on 25 hand-selected verbs

4

Page 16: Learning Effective and Interpretable Semantic Models using ... · Are NNSE Dimensions Coherent? •Word Intrusion Task (Boyd-Graber et al., NIPS 2009) • for each dimension, select

Why interpretable dimensions?(Mitchell et al., Science 2008)

• Interpretable dimension reveals insightful brain activation patterns!

•But, features in the semantic representation were based on 25 hand-selected verbs‣ can’t represent arbitrary concepts

4

Page 17: Learning Effective and Interpretable Semantic Models using ... · Are NNSE Dimensions Coherent? •Word Intrusion Task (Boyd-Graber et al., NIPS 2009) • for each dimension, select

Why interpretable dimensions?(Mitchell et al., Science 2008)

• Interpretable dimension reveals insightful brain activation patterns!

•But, features in the semantic representation were based on 25 hand-selected verbs‣ can’t represent arbitrary concepts‣ need data-driven, broad coverage semantic representations

4

Page 18: Learning Effective and Interpretable Semantic Models using ... · Are NNSE Dimensions Coherent? •Word Intrusion Task (Boyd-Graber et al., NIPS 2009) • for each dimension, select

What is an Interpretable, Cognitively-plausible Representation?

5

Page 19: Learning Effective and Interpretable Semantic Models using ... · Are NNSE Dimensions Coherent? •Word Intrusion Task (Boyd-Graber et al., NIPS 2009) • for each dimension, select

What is an Interpretable, Cognitively-plausible Representation?

wor

ds

features

5

Page 20: Learning Effective and Interpretable Semantic Models using ... · Are NNSE Dimensions Coherent? •Word Intrusion Task (Boyd-Graber et al., NIPS 2009) • for each dimension, select

What is an Interpretable, Cognitively-plausible Representation?

wor

ds

features

... 0 0 0 ... 1.2 ...w

5

Page 21: Learning Effective and Interpretable Semantic Models using ... · Are NNSE Dimensions Coherent? •Word Intrusion Task (Boyd-Graber et al., NIPS 2009) • for each dimension, select

What is an Interpretable, Cognitively-plausible Representation?

wor

ds

features

... 0 0 0 ... 1.2 ...w

1. Compact representation:Sparse, many zeros

5

Page 22: Learning Effective and Interpretable Semantic Models using ... · Are NNSE Dimensions Coherent? •Word Intrusion Task (Boyd-Graber et al., NIPS 2009) • for each dimension, select

What is an Interpretable, Cognitively-plausible Representation?

wor

ds

features

... 0 0 0 ... 1.2 ...w

2. Uneconomical to store negative (or inferable) characteristics:

Non-Negative

1. Compact representation:Sparse, many zeros

5

Page 23: Learning Effective and Interpretable Semantic Models using ... · Are NNSE Dimensions Coherent? •Word Intrusion Task (Boyd-Graber et al., NIPS 2009) • for each dimension, select

What is an Interpretable, Cognitively-plausible Representation?

wor

ds

features

3. Meaningful Dimensions:Coherent

... 0 0 0 ... 1.2 ...w

2. Uneconomical to store negative (or inferable) characteristics:

Non-Negative

1. Compact representation:Sparse, many zeros

5

Page 24: Learning Effective and Interpretable Semantic Models using ... · Are NNSE Dimensions Coherent? •Word Intrusion Task (Boyd-Graber et al., NIPS 2009) • for each dimension, select

Properties of Different Semantic

Representations

BroadCoverage Effective Sparse

Non-Negative Coherent

6

Page 25: Learning Effective and Interpretable Semantic Models using ... · Are NNSE Dimensions Coherent? •Word Intrusion Task (Boyd-Graber et al., NIPS 2009) • for each dimension, select

Properties of Different Semantic

Representations

BroadCoverage Effective Sparse

Non-Negative Coherent

Hand Tuned

6

Page 26: Learning Effective and Interpretable Semantic Models using ... · Are NNSE Dimensions Coherent? •Word Intrusion Task (Boyd-Graber et al., NIPS 2009) • for each dimension, select

Properties of Different Semantic

Representations

BroadCoverage Effective Sparse

Non-Negative Coherent

Corpus-derived(existing)

Hand Tuned

6

Page 27: Learning Effective and Interpretable Semantic Models using ... · Are NNSE Dimensions Coherent? •Word Intrusion Task (Boyd-Graber et al., NIPS 2009) • for each dimension, select

Properties of Different Semantic

Representations

BroadCoverage Effective Sparse

Non-Negative Coherent

Corpus-derived(existing)

Hand Tuned

6

Hand Coded

Corpus-derived 83.1

83.5

(Murphy, Talukdar, Mitchell, StarSem 2012)

Prediction accuracy (on Neurosemantic Decoding)

Page 28: Learning Effective and Interpretable Semantic Models using ... · Are NNSE Dimensions Coherent? •Word Intrusion Task (Boyd-Graber et al., NIPS 2009) • for each dimension, select

Properties of Different Semantic

Representations

BroadCoverage Effective Sparse

Non-Negative Coherent

Corpus-derived(existing)

Hand Tuned

6

Page 29: Learning Effective and Interpretable Semantic Models using ... · Are NNSE Dimensions Coherent? •Word Intrusion Task (Boyd-Graber et al., NIPS 2009) • for each dimension, select

Properties of Different Semantic

Representations

BroadCoverage Effective Sparse

Non-Negative Coherent

Corpus-derived(existing)

NNSE

Our proposal

Hand Tuned

6

Page 30: Learning Effective and Interpretable Semantic Models using ... · Are NNSE Dimensions Coherent? •Word Intrusion Task (Boyd-Graber et al., NIPS 2009) • for each dimension, select

Properties of Different Semantic

Representations

BroadCoverage Effective Sparse

Non-Negative Coherent

Corpus-derived(existing)

NNSE

Our proposal

Hand Tuned

6

Page 31: Learning Effective and Interpretable Semantic Models using ... · Are NNSE Dimensions Coherent? •Word Intrusion Task (Boyd-Graber et al., NIPS 2009) • for each dimension, select

Non-Negative Sparse Embedding

(NNSE)

7

Page 32: Learning Effective and Interpretable Semantic Models using ... · Are NNSE Dimensions Coherent? •Word Intrusion Task (Boyd-Graber et al., NIPS 2009) • for each dimension, select

X

Non-Negative Sparse Embedding

(NNSE)Input Matrix

(Corpus Cooc + SVD)

7

Page 33: Learning Effective and Interpretable Semantic Models using ... · Are NNSE Dimensions Coherent? •Word Intrusion Task (Boyd-Graber et al., NIPS 2009) • for each dimension, select

X

Non-Negative Sparse Embedding

(NNSE)

wi

input representationfor word wi

Input Matrix(Corpus Cooc + SVD)

7

Page 34: Learning Effective and Interpretable Semantic Models using ... · Are NNSE Dimensions Coherent? •Word Intrusion Task (Boyd-Graber et al., NIPS 2009) • for each dimension, select

X

Non-Negative Sparse Embedding

(NNSE)

wi

input representationfor word wi

A= x D

Input Matrix(Corpus Cooc + SVD)

7

Page 35: Learning Effective and Interpretable Semantic Models using ... · Are NNSE Dimensions Coherent? •Word Intrusion Task (Boyd-Graber et al., NIPS 2009) • for each dimension, select

X

Non-Negative Sparse Embedding

(NNSE)

wi

input representationfor word wi

A= x D

BasisInput Matrix

(Corpus Cooc + SVD)

NNSE representation for word wi

7

Page 36: Learning Effective and Interpretable Semantic Models using ... · Are NNSE Dimensions Coherent? •Word Intrusion Task (Boyd-Graber et al., NIPS 2009) • for each dimension, select

X

Non-Negative Sparse Embedding

(NNSE)

wi

input representationfor word wi

A= x D

Basis

• matrix A is non-negative

Input Matrix(Corpus Cooc + SVD)

NNSE representation for word wi

7

Page 37: Learning Effective and Interpretable Semantic Models using ... · Are NNSE Dimensions Coherent? •Word Intrusion Task (Boyd-Graber et al., NIPS 2009) • for each dimension, select

X

Non-Negative Sparse Embedding

(NNSE)

wi

input representationfor word wi

A= x D

Basis

• matrix A is non-negative• sparsity penalty on the rows of A

Input Matrix(Corpus Cooc + SVD)

NNSE representation for word wi

7

Page 38: Learning Effective and Interpretable Semantic Models using ... · Are NNSE Dimensions Coherent? •Word Intrusion Task (Boyd-Graber et al., NIPS 2009) • for each dimension, select

X

Non-Negative Sparse Embedding

(NNSE)

wi

input representationfor word wi

A= x D

Basis

• matrix A is non-negative• sparsity penalty on the rows of A• alternating minimization between A and D,

using SPAMS package

Input Matrix(Corpus Cooc + SVD)

NNSE representation for word wi

7

Page 39: Learning Effective and Interpretable Semantic Models using ... · Are NNSE Dimensions Coherent? •Word Intrusion Task (Boyd-Graber et al., NIPS 2009) • for each dimension, select

NNSE Optimization

X A= x D

8

Page 40: Learning Effective and Interpretable Semantic Models using ... · Are NNSE Dimensions Coherent? •Word Intrusion Task (Boyd-Graber et al., NIPS 2009) • for each dimension, select

Experiments9

Page 41: Learning Effective and Interpretable Semantic Models using ... · Are NNSE Dimensions Coherent? •Word Intrusion Task (Boyd-Graber et al., NIPS 2009) • for each dimension, select

Experiments

• Three main questions

9

Page 42: Learning Effective and Interpretable Semantic Models using ... · Are NNSE Dimensions Coherent? •Word Intrusion Task (Boyd-Graber et al., NIPS 2009) • for each dimension, select

Experiments

• Three main questions1. Are NNSE representations effective in practice?

9

Page 43: Learning Effective and Interpretable Semantic Models using ... · Are NNSE Dimensions Coherent? •Word Intrusion Task (Boyd-Graber et al., NIPS 2009) • for each dimension, select

Experiments

• Three main questions1. Are NNSE representations effective in practice?

2. What is the degree of sparsity of NNSE?

9

Page 44: Learning Effective and Interpretable Semantic Models using ... · Are NNSE Dimensions Coherent? •Word Intrusion Task (Boyd-Graber et al., NIPS 2009) • for each dimension, select

Experiments

• Three main questions1. Are NNSE representations effective in practice?

2. What is the degree of sparsity of NNSE?

3. Are NNSE dimensions coherent?

9

Page 45: Learning Effective and Interpretable Semantic Models using ... · Are NNSE Dimensions Coherent? •Word Intrusion Task (Boyd-Graber et al., NIPS 2009) • for each dimension, select

Experiments

• Three main questions1. Are NNSE representations effective in practice?

2. What is the degree of sparsity of NNSE?

3. Are NNSE dimensions coherent?

• Setup

9

Page 46: Learning Effective and Interpretable Semantic Models using ... · Are NNSE Dimensions Coherent? •Word Intrusion Task (Boyd-Graber et al., NIPS 2009) • for each dimension, select

Experiments

• Three main questions1. Are NNSE representations effective in practice?

2. What is the degree of sparsity of NNSE?

3. Are NNSE dimensions coherent?

• Setup• partial ClueWeb09, 16bn tokens, 540m sentences,

50m documents

9

Page 47: Learning Effective and Interpretable Semantic Models using ... · Are NNSE Dimensions Coherent? •Word Intrusion Task (Boyd-Graber et al., NIPS 2009) • for each dimension, select

Experiments

• Three main questions1. Are NNSE representations effective in practice?

2. What is the degree of sparsity of NNSE?

3. Are NNSE dimensions coherent?

• Setup• partial ClueWeb09, 16bn tokens, 540m sentences,

50m documents• dependency parsed using Malt parser

9

Page 48: Learning Effective and Interpretable Semantic Models using ... · Are NNSE Dimensions Coherent? •Word Intrusion Task (Boyd-Graber et al., NIPS 2009) • for each dimension, select

Baseline Representation: SVD10

Page 49: Learning Effective and Interpretable Semantic Models using ... · Are NNSE Dimensions Coherent? •Word Intrusion Task (Boyd-Graber et al., NIPS 2009) • for each dimension, select

Baseline Representation: SVD

• For about 35k words (~adult vocabulary), extract

10

Page 50: Learning Effective and Interpretable Semantic Models using ... · Are NNSE Dimensions Coherent? •Word Intrusion Task (Boyd-Graber et al., NIPS 2009) • for each dimension, select

Baseline Representation: SVD

• For about 35k words (~adult vocabulary), extract • document co-occurrence

10

Page 51: Learning Effective and Interpretable Semantic Models using ... · Are NNSE Dimensions Coherent? •Word Intrusion Task (Boyd-Graber et al., NIPS 2009) • for each dimension, select

Baseline Representation: SVD

• For about 35k words (~adult vocabulary), extract • document co-occurrence• dependency features from the parsed corpus

10

Page 52: Learning Effective and Interpretable Semantic Models using ... · Are NNSE Dimensions Coherent? •Word Intrusion Task (Boyd-Graber et al., NIPS 2009) • for each dimension, select

Baseline Representation: SVD

• For about 35k words (~adult vocabulary), extract • document co-occurrence• dependency features from the parsed corpus

• Reduce dimensionality using SVD. Subsets of this reduced dimensional space is the baseline

10

Page 53: Learning Effective and Interpretable Semantic Models using ... · Are NNSE Dimensions Coherent? •Word Intrusion Task (Boyd-Graber et al., NIPS 2009) • for each dimension, select

Baseline Representation: SVD

• For about 35k words (~adult vocabulary), extract • document co-occurrence• dependency features from the parsed corpus

• Reduce dimensionality using SVD. Subsets of this reduced dimensional space is the baseline

• This is also the input (X) to NNSE

10

Page 54: Learning Effective and Interpretable Semantic Models using ... · Are NNSE Dimensions Coherent? •Word Intrusion Task (Boyd-Graber et al., NIPS 2009) • for each dimension, select

Baseline Representation: SVD

• For about 35k words (~adult vocabulary), extract • document co-occurrence• dependency features from the parsed corpus

• Reduce dimensionality using SVD. Subsets of this reduced dimensional space is the baseline

• This is also the input (X) to NNSE

• Other representations were also compared (e.g., LDA, Collobert and Weston, etc.), details in the paper

10

Page 55: Learning Effective and Interpretable Semantic Models using ... · Are NNSE Dimensions Coherent? •Word Intrusion Task (Boyd-Graber et al., NIPS 2009) • for each dimension, select

Is NNSE effective in

Neurosemantic Decoding?

11

Page 56: Learning Effective and Interpretable Semantic Models using ... · Are NNSE Dimensions Coherent? •Word Intrusion Task (Boyd-Graber et al., NIPS 2009) • for each dimension, select

Is NNSE effective in

Neurosemantic Decoding?

11

Page 57: Learning Effective and Interpretable Semantic Models using ... · Are NNSE Dimensions Coherent? •Word Intrusion Task (Boyd-Graber et al., NIPS 2009) • for each dimension, select

Is NNSE effective in

Neurosemantic Decoding?

NNSE has similar peak performance as SVD

11

Page 58: Learning Effective and Interpretable Semantic Models using ... · Are NNSE Dimensions Coherent? •Word Intrusion Task (Boyd-Graber et al., NIPS 2009) • for each dimension, select

Does NNSE result in sparse

semantic representation?

12

Page 59: Learning Effective and Interpretable Semantic Models using ... · Are NNSE Dimensions Coherent? •Word Intrusion Task (Boyd-Graber et al., NIPS 2009) • for each dimension, select

Does NNSE result in sparse

semantic representation?

12

Page 60: Learning Effective and Interpretable Semantic Models using ... · Are NNSE Dimensions Coherent? •Word Intrusion Task (Boyd-Graber et al., NIPS 2009) • for each dimension, select

Does NNSE result in sparse

semantic representation?

• NNSE is significantly sparser than SVD

12

Page 61: Learning Effective and Interpretable Semantic Models using ... · Are NNSE Dimensions Coherent? •Word Intrusion Task (Boyd-Graber et al., NIPS 2009) • for each dimension, select

Does NNSE result in sparse

semantic representation?

• NNSE is significantly sparser than SVD• Words per dimension is significantly lower in NNSE

12

Page 62: Learning Effective and Interpretable Semantic Models using ... · Are NNSE Dimensions Coherent? •Word Intrusion Task (Boyd-Graber et al., NIPS 2009) • for each dimension, select

Does NNSE result in sparse

semantic representation?

• NNSE is significantly sparser than SVD• Words per dimension is significantly lower in NNSE• Growth in active dimensions per word is sub-linear in NNSE

12

Page 63: Learning Effective and Interpretable Semantic Models using ... · Are NNSE Dimensions Coherent? •Word Intrusion Task (Boyd-Graber et al., NIPS 2009) • for each dimension, select

Are NNSE Dimensions Coherent?13

Page 64: Learning Effective and Interpretable Semantic Models using ... · Are NNSE Dimensions Coherent? •Word Intrusion Task (Boyd-Graber et al., NIPS 2009) • for each dimension, select

Are NNSE Dimensions Coherent?

• Word Intrusion Task (Boyd-Graber et al., NIPS 2009)

13

Page 65: Learning Effective and Interpretable Semantic Models using ... · Are NNSE Dimensions Coherent? •Word Intrusion Task (Boyd-Graber et al., NIPS 2009) • for each dimension, select

Are NNSE Dimensions Coherent?

• Word Intrusion Task (Boyd-Graber et al., NIPS 2009)• for each dimension, select top ranked N (N=5) words

13

Page 66: Learning Effective and Interpretable Semantic Models using ... · Are NNSE Dimensions Coherent? •Word Intrusion Task (Boyd-Graber et al., NIPS 2009) • for each dimension, select

Are NNSE Dimensions Coherent?

• Word Intrusion Task (Boyd-Graber et al., NIPS 2009)• for each dimension, select top ranked N (N=5) words• intrude it with a low ranking word from this dimension

13

Page 67: Learning Effective and Interpretable Semantic Models using ... · Are NNSE Dimensions Coherent? •Word Intrusion Task (Boyd-Graber et al., NIPS 2009) • for each dimension, select

Are NNSE Dimensions Coherent?

• Word Intrusion Task (Boyd-Graber et al., NIPS 2009)• for each dimension, select top ranked N (N=5) words• intrude it with a low ranking word from this dimension• intruding word should be high ranking in another dimension

13

Page 68: Learning Effective and Interpretable Semantic Models using ... · Are NNSE Dimensions Coherent? •Word Intrusion Task (Boyd-Graber et al., NIPS 2009) • for each dimension, select

Are NNSE Dimensions Coherent?

• Word Intrusion Task (Boyd-Graber et al., NIPS 2009)• for each dimension, select top ranked N (N=5) words• intrude it with a low ranking word from this dimension• intruding word should be high ranking in another dimension• ask a human to identify the intruding word

13

Page 69: Learning Effective and Interpretable Semantic Models using ... · Are NNSE Dimensions Coherent? •Word Intrusion Task (Boyd-Graber et al., NIPS 2009) • for each dimension, select

Are NNSE Dimensions Coherent?

• Word Intrusion Task (Boyd-Graber et al., NIPS 2009)• for each dimension, select top ranked N (N=5) words• intrude it with a low ranking word from this dimension• intruding word should be high ranking in another dimension• ask a human to identify the intruding word• repeat multiple times for each dimension, calculate precision

13

Page 70: Learning Effective and Interpretable Semantic Models using ... · Are NNSE Dimensions Coherent? •Word Intrusion Task (Boyd-Graber et al., NIPS 2009) • for each dimension, select

Are NNSE Dimensions Coherent?

• Word Intrusion Task (Boyd-Graber et al., NIPS 2009)• for each dimension, select top ranked N (N=5) words• intrude it with a low ranking word from this dimension• intruding word should be high ranking in another dimension• ask a human to identify the intruding word• repeat multiple times for each dimension, calculate precision

• An intruded set from an NNSE1000 dimension

13

Page 71: Learning Effective and Interpretable Semantic Models using ... · Are NNSE Dimensions Coherent? •Word Intrusion Task (Boyd-Graber et al., NIPS 2009) • for each dimension, select

Are NNSE Dimensions Coherent?

• Word Intrusion Task (Boyd-Graber et al., NIPS 2009)• for each dimension, select top ranked N (N=5) words• intrude it with a low ranking word from this dimension• intruding word should be high ranking in another dimension• ask a human to identify the intruding word• repeat multiple times for each dimension, calculate precision

• An intruded set from an NNSE1000 dimension

intruder

13

Page 72: Learning Effective and Interpretable Semantic Models using ... · Are NNSE Dimensions Coherent? •Word Intrusion Task (Boyd-Graber et al., NIPS 2009) • for each dimension, select

Are NNSE Dimensions Coherent?14

Page 73: Learning Effective and Interpretable Semantic Models using ... · Are NNSE Dimensions Coherent? •Word Intrusion Task (Boyd-Graber et al., NIPS 2009) • for each dimension, select

Are NNSE Dimensions Coherent?14

Page 74: Learning Effective and Interpretable Semantic Models using ... · Are NNSE Dimensions Coherent? •Word Intrusion Task (Boyd-Graber et al., NIPS 2009) • for each dimension, select

Are NNSE Dimensions Coherent?

NNSE dimensions are significantly more coherent than SVD-based dimensions

14

Page 75: Learning Effective and Interpretable Semantic Models using ... · Are NNSE Dimensions Coherent? •Word Intrusion Task (Boyd-Graber et al., NIPS 2009) • for each dimension, select

SVD and NNSE Dimensions: Examples

Examples of top 5 words from 5 randomly chosen dimensions from SVD300 and NNSE1000

15

Page 76: Learning Effective and Interpretable Semantic Models using ... · Are NNSE Dimensions Coherent? •Word Intrusion Task (Boyd-Graber et al., NIPS 2009) • for each dimension, select

SVD and NNSE Dimensions: Examples

Examples of top 5 words from 5 randomly chosen dimensions from SVD300 and NNSE1000

NNSE dimensions are significantly more coherent than SVD-based dimensions

15

Page 77: Learning Effective and Interpretable Semantic Models using ... · Are NNSE Dimensions Coherent? •Word Intrusion Task (Boyd-Graber et al., NIPS 2009) • for each dimension, select

Top 5 NNSE1000 Dimensions for “apple”16

Page 78: Learning Effective and Interpretable Semantic Models using ... · Are NNSE Dimensions Coherent? •Word Intrusion Task (Boyd-Graber et al., NIPS 2009) • for each dimension, select

Top 5 NNSE1000 Dimensions for “apple”

0.40 raspberry, peach, pear, mango, melon

0.26 ripper, aac, converter, vcd, rm

0.14 cpu, intel, mips, pentium, risc

0.13 motorola, lg, samsung, vodafone, alcatel

0.11 peaches, apricots, pears, cherries, blueberries

16

Page 79: Learning Effective and Interpretable Semantic Models using ... · Are NNSE Dimensions Coherent? •Word Intrusion Task (Boyd-Graber et al., NIPS 2009) • for each dimension, select

Top 5 NNSE1000 Dimensions for “apple”

0.40 raspberry, peach, pear, mango, melon

0.26 ripper, aac, converter, vcd, rm

0.14 cpu, intel, mips, pentium, risc

0.13 motorola, lg, samsung, vodafone, alcatel

0.11 peaches, apricots, pears, cherries, blueberries

Fruit

Music

Processor

Company

Fruit

16

Page 80: Learning Effective and Interpretable Semantic Models using ... · Are NNSE Dimensions Coherent? •Word Intrusion Task (Boyd-Graber et al., NIPS 2009) • for each dimension, select

Top 5 NNSE1000 Dimensions for “apple”

0.40 raspberry, peach, pear, mango, melon

0.26 ripper, aac, converter, vcd, rm

0.14 cpu, intel, mips, pentium, risc

0.13 motorola, lg, samsung, vodafone, alcatel

0.11 peaches, apricots, pears, cherries, blueberries

Fruit

Music

Processor

Company

Fruit

Different senses of the word are not mixed, each dimension corresponds to only

one sense of “apple”!

16

Page 81: Learning Effective and Interpretable Semantic Models using ... · Are NNSE Dimensions Coherent? •Word Intrusion Task (Boyd-Graber et al., NIPS 2009) • for each dimension, select

Conclusion17

Page 82: Learning Effective and Interpretable Semantic Models using ... · Are NNSE Dimensions Coherent? •Word Intrusion Task (Boyd-Graber et al., NIPS 2009) • for each dimension, select

Conclusion

•Non-Negative Sparse Embedding (NNSE)•broad coverage, sparse, non-negative• interpretable, effective in practice•probably first semantic model with all these

desirable traits

17

Page 83: Learning Effective and Interpretable Semantic Models using ... · Are NNSE Dimensions Coherent? •Word Intrusion Task (Boyd-Graber et al., NIPS 2009) • for each dimension, select

Conclusion

•Non-Negative Sparse Embedding (NNSE)•broad coverage, sparse, non-negative• interpretable, effective in practice•probably first semantic model with all these

desirable traits

•Exploited large text corpus, including deep linguistic features (e.g., dependency parses)

17

Page 84: Learning Effective and Interpretable Semantic Models using ... · Are NNSE Dimensions Coherent? •Word Intrusion Task (Boyd-Graber et al., NIPS 2009) • for each dimension, select

Conclusion

•Non-Negative Sparse Embedding (NNSE)•broad coverage, sparse, non-negative• interpretable, effective in practice•probably first semantic model with all these

desirable traits

•Exploited large text corpus, including deep linguistic features (e.g., dependency parses)

•Future work• multi-word extension; using NNSE representations

in non-neurosemantic domains (e.g., NELL)

17

Page 85: Learning Effective and Interpretable Semantic Models using ... · Are NNSE Dimensions Coherent? •Word Intrusion Task (Boyd-Graber et al., NIPS 2009) • for each dimension, select

Acknowledgment

• Khaled El-Arini (CMU) and Min Xu (CMU)

• Sheshadri Sridharan (CMU), Marco Baroni (Trento)

• Justin Betteridge (CMU), Jamie Callan (CMU)

• DARPA, Google, NIH

18

Page 86: Learning Effective and Interpretable Semantic Models using ... · Are NNSE Dimensions Coherent? •Word Intrusion Task (Boyd-Graber et al., NIPS 2009) • for each dimension, select

Thank You!

NNSE embeddings available at:http://www.cs.cmu.edu/~bmurphy/NNSE/

19