LEARNING VALID ADVERB-ADJECTIVE PAIRS CAROLINE SUEN CS224U WINTER 2013

Source: web.stanford.edu/~cysuen/projects/cs224u_presentation.pdf



LEARNING VALID ADVERB-ADJECTIVE PAIRS CAROLINE SUEN

CS224U WINTER 2013


THE CHALLENGE

We can say:

•  “The glass is half full.”
•  or “Wow, Bob is really tall.”

But can we say:

•  “Wow, Bob is half tall.”
•  or “The glass is really full.”?

Goal: develop a model that learns whether an adverb and an adjective can be used together and make grammatical sense.


PRIOR WORK

Syrett and Lidz (2010)

•  Use linguistics to develop patterns

Sentiment analysis

•  Benamara et al. (2007), Liu et al. (2009)

Adjective-noun pairs

•  Hatzivassiloglou et al. (1993)


EXTRACTING DATA

Co-occurrence counts (rows: adjectives, columns: adverbs):

           half  completely  extremely  nearly
full          5           3          3       1
tall          0           0          4       0
smart         0           1          4       0
daylong       0           0          0       1

•  New York Times dataset, ~18000 articles
•  Stanford POS tagger to find valid adverb-adjective pairs
•  1019 adverbs, 4876 adjectives, 19337 pairs
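
A rough sketch of how such a count table could be built from POS-tagged text. The slides do not say how pairs were identified from the tagger output, so this assumes an adverb tag (Penn Treebank `RB*`) immediately followed by an adjective tag (`JJ*`). The function name and the hand-written tagged sentences are hypothetical; they are not actual Stanford POS tagger output.

```python
from collections import Counter

def count_adverb_adjective_pairs(tagged_sentences):
    """Count (adverb, adjective) bigrams in sentences of (word, tag) tuples.

    Assumption: a pair is an RB* token directly followed by a JJ* token.
    """
    counts = Counter()
    for sentence in tagged_sentences:
        # Walk over adjacent token pairs in the sentence.
        for (w1, t1), (w2, t2) in zip(sentence, sentence[1:]):
            if t1.startswith("RB") and t2.startswith("JJ"):
                counts[(w1.lower(), w2.lower())] += 1
    return counts

# Hand-written illustrative input, not real tagger output.
tagged = [
    [("The", "DT"), ("glass", "NN"), ("is", "VBZ"), ("half", "RB"), ("full", "JJ"), (".", ".")],
    [("Bob", "NNP"), ("is", "VBZ"), ("really", "RB"), ("tall", "JJ"), (".", ".")],
    [("The", "DT"), ("room", "NN"), ("was", "VBD"), ("half", "RB"), ("full", "JJ"), (".", ".")],
]
pair_counts = count_adverb_adjective_pairs(tagged)
```

Each key of `pair_counts` corresponds to one cell of the table, and each key with a nonzero count is one adverb-adjective pair.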


BUILDING A GRAPH

[Figure: bipartite graph linking the adverbs (half, completely, extremely, nearly) to the adjectives (full, tall, smart, daylong)]

Relatively sparse bipartite graph
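
To put a number on "relatively sparse", the short calculation below uses the counts reported on the data-extraction slide (1019 adverbs, 4876 adjectives, 19337 pairs). Treating density as observed pairs over all possible adverb-adjective pairs is an assumption about what sparsity means here.

```python
# Counts reported on the data-extraction slide.
n_adverbs = 1019
n_adjectives = 4876
n_pairs = 19337

# In a bipartite graph every adverb could in principle link to every adjective.
possible_pairs = n_adverbs * n_adjectives
density = n_pairs / possible_pairs

print(f"{possible_pairs} possible pairs, density = {density:.4%}")
```

Under this definition, well under 1% of the possible edges are observed (roughly 0.39%).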


PARTITIONING

[Figure: the bipartite graph of adverbs (half, completely, extremely, nearly) and adjectives (full, tall, smart, daylong), partitioned into groups]


BUILDING A GRAPH: TECHNICAL DETAILS

•  Used Stanford Network Analysis Platform (SNAP)
•  Experimented:
    •  Find dense bipartite subgraphs using the frequent itemset algorithm
    •  Build adverb graphs and adjective graphs, with edges based on common neighbors, and run community detection algorithms on these graphs
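
The "common neighbors" idea can be sketched as a projection of the bipartite graph: two adjectives are linked if they share at least one adverb, and likewise for adverbs. This is a minimal illustration using the toy table from the data-extraction slide. The `project` function and its `min_common` threshold are hypothetical; the slides do not state the exact rule or threshold used.

```python
from itertools import combinations

def project(neighbors, min_common=1):
    """Link two nodes when they share at least `min_common` bipartite neighbors.

    neighbors: dict mapping a node to the set of nodes on the other side.
    Returns an undirected graph as a dict of adjacency sets.
    """
    graph = {node: set() for node in neighbors}
    for a, b in combinations(sorted(neighbors), 2):
        if len(neighbors[a] & neighbors[b]) >= min_common:
            graph[a].add(b)
            graph[b].add(a)
    return graph

# Adverbs seen with each adjective (nonzero cells of the toy table).
adverbs_of = {
    "full": {"half", "completely", "extremely", "nearly"},
    "tall": {"extremely"},
    "smart": {"completely", "extremely"},
    "daylong": {"nearly"},
}
adjective_graph = project(adverbs_of)
```

Here `tall` and `smart` become neighbors because both occur with `extremely`, while `tall` and `daylong` share no adverb and stay unconnected. Swapping the roles of adverbs and adjectives in the input gives the adverb graph.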


[Figure: the bipartite graph projected into two graphs, an adjective graph over full, tall, smart, daylong and an adverb graph over half, completely, extremely, nearly]


CLIQUE PERCOLATION

[Figure: illustration of clique percolation, from Wikipedia]
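
For readers unfamiliar with the method, here is a minimal brute-force sketch of clique percolation: enumerate all k-cliques, treat two k-cliques as adjacent when they share k-1 nodes, and take each connected group of cliques as one community. It is only suitable for toy graphs and is not the implementation used in the project. The function name and the example graph are hypothetical.

```python
from itertools import combinations

def k_clique_communities(adj, k):
    """Return communities (sets of nodes) found by clique percolation.

    adj: dict mapping each node to the set of its neighbors (undirected).
    """
    nodes = sorted(adj)
    # Brute-force enumeration of k-cliques: every pair inside must be linked.
    cliques = [frozenset(c) for c in combinations(nodes, k)
               if all(v in adj[u] for u, v in combinations(c, 2))]

    # Union-find over cliques: merge cliques that share k-1 nodes.
    parent = list(range(len(cliques)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(cliques)), 2):
        if len(cliques[i] & cliques[j]) == k - 1:
            parent[find(i)] = find(j)

    # A community is the union of all cliques in one merged group.
    groups = {}
    for i, clique in enumerate(cliques):
        groups.setdefault(find(i), set()).update(clique)
    return [frozenset(g) for g in groups.values()]

# Toy graph: triangles a-b-c and b-c-d share an edge; triangle d-e-f touches
# them only at node d.
toy = {
    "a": {"b", "c"},
    "b": {"a", "c", "d"},
    "c": {"a", "b", "d"},
    "d": {"b", "c", "e", "f"},
    "e": {"d", "f"},
    "f": {"d", "e"},
}
communities = k_clique_communities(toy, 3)
```

With k = 3 the toy graph yields two communities, {a, b, c, d} and {d, e, f}. Node `d` belongs to both, which illustrates that clique percolation allows overlapping communities.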


CLASSIFY: DOES AN EDGE BELONG?

Use the communities that adverb u and adjective v are in. If the edge density of the combined communities is sufficiently high, we claim that u and v can be paired up.

Harder case:

•  An adverb is in communities C1 and C2. How likely is it to be connected to an adjective in communities D1, D2, and D3?
•  Thankfully, this is rare!
•  Larger and more densely connected communities are given higher weight
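
A sketch of the density test described above. The slides give neither the threshold nor the exact weighting, so both are assumptions here: the threshold defaults to a hypothetical 0.5, and when a word belongs to several communities each adverb-community/adjective-community combination is weighted by its size (number of possible pairs), which is only one possible reading of "larger communities are given higher weight". All names are hypothetical.

```python
def edge_density(adverbs, adjectives, edges):
    """Fraction of possible adverb-adjective pairs that are observed edges."""
    possible = len(adverbs) * len(adjectives)
    if possible == 0:
        return 0.0
    observed = sum((u, v) in edges for u in adverbs for v in adjectives)
    return observed / possible

def can_pair(adverb_communities, adjective_communities, edges, threshold=0.5):
    """Size-weighted mean density over all community combinations.

    adverb_communities: communities containing the adverb u.
    adjective_communities: communities containing the adjective v.
    """
    total_weight = 0
    weighted_sum = 0.0
    for a_comm in adverb_communities:
        for d_comm in adjective_communities:
            weight = len(a_comm) * len(d_comm)  # assumed weighting
            weighted_sum += weight * edge_density(a_comm, d_comm, edges)
            total_weight += weight
    return total_weight > 0 and weighted_sum / total_weight >= threshold

# Observed (adverb, adjective) edges from the toy count table.
edges = {
    ("half", "full"), ("completely", "full"), ("extremely", "full"),
    ("nearly", "full"), ("extremely", "tall"), ("completely", "smart"),
    ("extremely", "smart"), ("nearly", "daylong"),
}
dense = edge_density({"completely", "extremely"}, {"full", "tall", "smart"}, edges)
```

In this toy case five of the six possible pairs between {completely, extremely} and {full, tall, smart} are observed, so the unseen pair (completely, tall) would be accepted at the assumed threshold.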


EVALUATION: RECALL

•  Set aside “test data” (1100 edges); the remaining edges are “training data”
•  Find communities based on training data
•  Observe fraction of test data edges recovered
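
A minimal sketch of this hold-out protocol, assuming edges are (adverb, adjective) tuples. The slides do not say how the 1100 test edges were chosen, so the random sampling below is an assumption. `split_edges`, `recall`, and the `predict` callback are hypothetical names.

```python
import random

def split_edges(edges, n_test, seed=0):
    """Randomly hold out n_test edges; return (train_edges, test_edges)."""
    rng = random.Random(seed)
    shuffled = sorted(edges)  # sort first so the split is reproducible
    rng.shuffle(shuffled)
    return set(shuffled[n_test:]), set(shuffled[:n_test])

def recall(test_edges, predict):
    """Fraction of held-out edges that predict(adverb, adjective) accepts."""
    if not test_edges:
        return 0.0
    hits = sum(1 for u, v in test_edges if predict(u, v))
    return hits / len(test_edges)

all_edges = {
    ("half", "full"), ("completely", "full"), ("extremely", "full"),
    ("nearly", "full"), ("extremely", "tall"), ("completely", "smart"),
    ("extremely", "smart"), ("nearly", "daylong"),
}
train_edges, test_edges = split_edges(all_edges, n_test=2)
```

In a full pipeline, communities would be found on `train_edges` only, and `predict` would be the community-based classifier. Because every held-out edge is a real pair, this setup measures recall but says nothing about precision.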


EVALUATION: RECALL

Not enough connections: 260 (23.6%)

Not discovered by community detection algorithm: 129 (11.7%)

Correctly discovered by community detection algorithm: 711 (64.6%)
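
The counts above can be turned into two recall figures: overall recall over all 1100 test edges, and recall restricted to test edges that had enough connections to be classified at all. The variable names are hypothetical; the counts are taken from the slide.

```python
# Counts of the 1100 held-out test edges, by outcome.
not_enough_connections = 260
not_discovered = 129
correctly_discovered = 711

total = not_enough_connections + not_discovered + correctly_discovered

# Recall over every test edge.
overall_recall = correctly_discovered / total

# Recall over test edges that had enough connections to be classified.
covered = correctly_discovered + not_discovered
covered_recall = correctly_discovered / covered

print(f"overall: {overall_recall:.1%}, with enough connections: {covered_recall:.1%}")
```

The three counts sum to 1100, giving an overall recall of about 64.6% and a recall of about 84.6% on the 840 edges with enough connections.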


CHALLENGES + NEXT STEPS

•  Not enough pairings
    •  (recall for test data with enough connections: 84.6%)
•  Clique percolation is slow
    •  priority was building evaluation framework first
    •  next steps: experimenting with clustering
•  Adjective edge connections are much more important than adverb connections
•  Current framework does not test precision
    •  MTurk for crowd-sourced, hand-labeled data
•  Potential next step:
    •  Check Syrett and Lidz's linguistic results


THE END

THANKS FOR LISTENING! :)