Natural Language Processing Conclusion and Review

Source: demo.clab.cs.cmu.edu/NLP/F20/files/slides/30-conclusion.pdf





Levels of Linguistic Knowledge

• spoken: phonetics, phonology

• written: orthography

• morphology

• syntax

• semantics

• pragmatics

• discourse

The levels run from “shallower” (top) to “deeper” (bottom).


uygarlaştıramadıklarımızdanmışsınızcasına “(behaving) as if you are among those whom we could not civilize”


uygar “civilized”

+laş “become”

+tır “cause to”

+ama “not able”

+dık past participle

+lar plural

+ımız first person plural possessive (“our”)

+dan ablative case (“from/among”)

+mış past

+sınız second person plural (“y’all”)

+casına finite verb → adverb (“as if”)
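As a quick sanity check on the segmentation, the morphemes can be concatenated back into the surface form. The sketch below is only illustrative: the names `MORPHEMES`, `WORD`, and `surface_form` are assumptions, and the spellings use the standard Turkish letter ş.

```python
# Morpheme-by-morpheme gloss of the Turkish example (illustrative data structure).
MORPHEMES = [
    ("uygar", "civilized"),
    ("laş", "become"),
    ("tır", "cause to"),
    ("ama", "not able"),
    ("dık", "past participle"),
    ("lar", "plural"),
    ("ımız", "first person plural possessive"),
    ("dan", "ablative case"),
    ("mış", "past"),
    ("sınız", "second person plural"),
    ("casına", "finite verb -> adverb"),
]

WORD = "uygarlaştıramadıklarımızdanmışsınızcasına"


def surface_form(morphemes):
    """Concatenate the morphemes in order to rebuild the word."""
    return "".join(form for form, _gloss in morphemes)


# Print one morpheme per line, root first, suffixes marked with "+".
for index, (form, gloss) in enumerate(MORPHEMES):
    prefix = "" if index == 0 else "+"
    print(f"{prefix}{form}\t{gloss}")
```

One root followed by ten suffixes yields a single orthographic word, which is why purely word-based models struggle with agglutinative languages.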


Finite-State Automaton

• Q: a finite set of states

• q0 ∈ Q: a special start state

• F ⊆ Q: a set of final states

• Σ: a finite alphabet

• Transitions: qi → qj, labeled with a string s ∈ Σ*

• Encodes a set of strings that can be recognized by following paths from q0 to some state in F.
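The definition can be turned into a small recognizer. This is a minimal sketch under two assumptions that are not in the slides: every transition is labeled with a single symbol (the slide allows any string in Σ*), and the example machine `SHEEP`, which accepts "baa!", "baaa!", and so on, is made up for illustration.

```python
def accepts(transitions, start, finals, string):
    """Return True if some path from `start` to a state in `finals` spells `string`.

    `transitions` maps (state, symbol) to the set of possible next states,
    so the automaton may be nondeterministic.
    """
    current = {start}
    for symbol in string:
        current = {
            nxt
            for state in current
            for nxt in transitions.get((state, symbol), set())
        }
        if not current:  # no path can consume this symbol
            return False
    return bool(current & finals)


# Hypothetical example: b, then at least two a's, then "!".
SHEEP = {
    ("q0", "b"): {"q1"},
    ("q1", "a"): {"q2"},
    ("q2", "a"): {"q3"},
    ("q3", "a"): {"q3"},
    ("q3", "!"): {"q4"},
}

print(accepts(SHEEP, "q0", {"q4"}, "baaa!"))
print(accepts(SHEEP, "q0", {"q4"}, "ba!"))
```

Tracking a set of current states keeps the code correct even when two transitions from the same state share a label.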


Levels of Linguistic Knowledge

• spoken: phonetics, phonology

• written: orthography

• morphology

• syntax

• semantics

• pragmatics

• discourse

The levels run from “shallower” to “deeper”; this version of the diagram adds the label “ambiguity”.


Noisy Channel

source → y → channel → x → decode (recover y from x)

• y: what you want

• x: what you see
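Decoding in a noisy channel model picks the y that maximizes p(y) · p(x | y): the source model times the channel model. The sketch below assumes a spelling-correction setting with made-up probabilities; the tables `SOURCE` and `CHANNEL` and the function `decode` are illustrative and do not come from the slides.

```python
def decode(x, source, channel):
    """Return the y that maximizes p(y) * p(x | y).

    `source` maps y to p(y); `channel` maps (x, y) to p(x | y).
    Pairs missing from `channel` are treated as probability zero.
    """
    return max(source, key=lambda y: source[y] * channel.get((x, y), 0.0))


# Made-up source model p(y): how likely each intended word is.
SOURCE = {"the": 0.7, "they": 0.2, "thaw": 0.1}

# Made-up channel model p(x | y): how likely the typo "teh" is for each word.
CHANNEL = {
    ("teh", "the"): 0.05,
    ("teh", "they"): 0.01,
    ("teh", "thaw"): 0.001,
}

print(decode("teh", SOURCE, CHANNEL))
```

The same argmax underlies tagging, translation, and speech recognition; only the definitions of x, y, and the two models change.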


Noisy Channel

source → y → channel → x → decode

Cats meow often

NN VB RB


Noisy Channel

source → y → channel → x → decode

你好吗?

How are you?


Noisy Channel

source → y → channel → x → decode

Okay, Google


Starting and Stopping

Unigram model:

...

Bigram model:

...

Trigram model:

...
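The formulas on this slide did not survive extraction, so the following is only a sketch of the usual convention rather than the slide's own notation: pad each sentence with a start symbol (one for a bigram model, two for a trigram model) and a stop symbol, so that the model assigns a proper probability to whole sentences. The symbols `<s>` and `</s>`, the toy corpus, and the function names are assumptions.

```python
from collections import Counter

START, STOP = "<s>", "</s>"


def train_bigram(sentences):
    """Maximum-likelihood bigram model with start and stop padding."""
    context_counts = Counter()
    bigram_counts = Counter()
    for words in sentences:
        padded = [START] + list(words) + [STOP]
        for prev, cur in zip(padded, padded[1:]):
            context_counts[prev] += 1
            bigram_counts[(prev, cur)] += 1
    return {
        (prev, cur): count / context_counts[prev]
        for (prev, cur), count in bigram_counts.items()
    }


def sentence_prob(model, words):
    """p(sentence) = product of p(cur | prev), including the final stop symbol."""
    padded = [START] + list(words) + [STOP]
    prob = 1.0
    for prev, cur in zip(padded, padded[1:]):
        prob *= model.get((prev, cur), 0.0)  # unseen bigrams get zero
    return prob


CORPUS = [["cats", "meow"], ["cats", "purr"], ["dogs", "bark"]]
MODEL = train_bigram(CORPUS)
print(sentence_prob(MODEL, ["cats", "meow"]))
print(sentence_prob(MODEL, ["dogs", "meow"]))
```

Because every sentence must end in the stop symbol, the probabilities of all sentences this toy model can generate sum to one; the zero for the unseen bigram in the second call is the problem smoothing addresses.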


Language Modeling Questions

• Why do we use context?

• What does smoothing do, and why is it necessary?

• What do we use to evaluate language models?
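For the second and third questions, a minimal sketch may help: add-one (Laplace) smoothing gives unseen bigrams a nonzero probability, and perplexity on held-out text is a standard evaluation measure. Add-one smoothing is chosen here only because it is the simplest option, not because the slides prescribe it; the toy corpus and all names are assumptions.

```python
import math
from collections import Counter

START, STOP = "<s>", "</s>"


def pad(words):
    return [START] + list(words) + [STOP]


def train(sentences):
    """Collect context counts, bigram counts, and the vocabulary of predictable tokens."""
    context, bigrams, vocab = Counter(), Counter(), {STOP}
    for words in sentences:
        vocab.update(words)
        padded = pad(words)
        for prev, cur in zip(padded, padded[1:]):
            context[prev] += 1
            bigrams[(prev, cur)] += 1
    return context, bigrams, vocab


def add_one_prob(model, prev, cur):
    """p(cur | prev) = (count(prev, cur) + 1) / (count(prev) + |V|)."""
    context, bigrams, vocab = model
    return (bigrams[(prev, cur)] + 1) / (context[prev] + len(vocab))


def perplexity(model, sentences):
    """exp of the average negative log-probability per predicted token."""
    log_prob, n_tokens = 0.0, 0
    for words in sentences:
        padded = pad(words)
        for prev, cur in zip(padded, padded[1:]):
            log_prob += math.log(add_one_prob(model, prev, cur))
            n_tokens += 1
    return math.exp(-log_prob / n_tokens)


MODEL = train([["cats", "meow"], ["cats", "purr"], ["dogs", "bark"]])
print(add_one_prob(MODEL, "dogs", "meow"))     # unseen bigram, still nonzero
print(perplexity(MODEL, [["dogs", "meow"]]))   # lower is better
```

Without smoothing, the held-out sentence would have probability zero and its perplexity would be undefined. This sketch assumes every held-out word is in the training vocabulary.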


Tagging


Broad POS categories

open classes

• nouns

• verbs

• adjectives

• adverbs

closed classes

• prepositions

• determiners

• pronouns

• conjunctions

• auxiliary verbs

• particles

• numerals


Syntax


Parsing

• CKY vs. Earley’s Algorithm

– Both use dynamic programming

– CKY requires a grammar in Chomsky normal form (CNF); Earley’s algorithm handles general context-free grammars


CKY Algorithm: Chart

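Each row is the word where a span starts and each column is the word where it ends; “-” marks an empty cell.

          book       this  flight  through  Houston
book      Noun,Verb  -     VP,S    -        S
this                 Det   NP      -        NP
flight                     Noun    -        -
through                            Prep     PP
Houston                                     PNoun,NP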


CKY Equations
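The equations themselves are not recoverable from this extraction, so here is a hedged sketch of the standard CKY recognizer: a cell for the span from i to j collects every nonterminal A such that A → B C, with B covering i to k and C covering k to j. The grammar below is a hypothetical toy grammar chosen to resemble the chart for “book this flight through Houston”; it is not the grammar from the lecture and may not reproduce every cell of that chart.

```python
from itertools import product

# Hypothetical lexical entries: word -> possible categories.
LEXICON = {
    "book": {"Noun", "Verb"},
    "this": {"Det"},
    "flight": {"Noun"},
    "through": {"Prep"},
    "Houston": {"PNoun", "NP"},
}

# Hypothetical binary rules: (left child, right child) -> possible parents.
BINARY = {
    ("Verb", "NP"): {"VP", "S"},
    ("Det", "Noun"): {"NP"},
    ("NP", "PP"): {"NP"},
    ("Prep", "NP"): {"PP"},
}


def cky(words):
    """Fill chart[i][j] with the categories that can span words[i:j]."""
    n = len(words)
    chart = [[set() for _ in range(n + 1)] for _ in range(n + 1)]
    for i, word in enumerate(words):
        chart[i][i + 1] = set(LEXICON.get(word, ()))
    for width in range(2, n + 1):          # span length
        for i in range(n - width + 1):     # span start
            j = i + width                  # span end
            for k in range(i + 1, j):      # split point
                for left, right in product(chart[i][k], chart[k][j]):
                    chart[i][j] |= BINARY.get((left, right), set())
    return chart


CHART = cky("book this flight through Houston".split())
print("S" in CHART[0][5])  # is the whole sentence recognized as an S?
```

Filling shorter spans before longer ones guarantees that both children of a rule are already in the chart when the parent cell is computed.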


Semantics


Where’s the beef?

Sentences from the Brown Corpus, extracted with the concordancer in The Compleat Lexical Tutor, http://www.lextutor.ca/


chicken


Synsets for dog (n)

• S: (n) dog, domestic dog, Canis familiaris (a member of the genus Canis (probably descended from the common wolf) that has been domesticated by man since prehistoric times; occurs in many breeds) "the dog barked all night"

• S: (n) frump, dog (a dull unattractive unpleasant girl or woman) "she got a reputation as a frump"; "she's a real dog"

• S: (n) dog (informal term for a man) "you lucky dog"

• S: (n) cad, bounder, blackguard, dog, hound, heel (someone who is morally reprehensible) "you dirty dog"

• S: (n) frank, frankfurter, hotdog, hot dog, dog, wiener, wienerwurst, weenie (a smooth-textured sausage of minced beef or pork usually smoked; often served on a bread roll)

• S: (n) pawl, detent, click, dog (a hinged catch that fits into a notch of a ratchet to move a wheel forward or prevent it from moving backward)

• S: (n) andiron, firedog, dog, dog-iron (metal supports for logs in a fireplace) "the andirons were too hot to touch"


Entity Linking

Mary picked up the ball. She threw it to me.


Semantic Roles

PropBank is a set of verb-sense-specific “frames” with informal descriptions for their arguments.

Consider the word “Agree”

•ARG0: agreer

•ARG1: proposition

•ARG2: other entity agreeing

[The group]ARG0 agreed [it wouldn’t make an offer]ARG1.

Usually [John]ARG0 agrees [with Mary]ARG2 [on everything]ARG1.


“Fall (move downward)” in PropBank

• arg1: logical subject, patient, thing falling

• arg2: extent, amount fallen

• arg3: starting point

• arg4: ending point

• argM-loc: medium

Sales fell to $251.2 million from $278.8 million.

The average junk bond fell by 4.2%.

The meteor fell through the atmosphere, crashing into Cambridge.


MRL #1: First-Order Logic

Function:

DressCode(ThePorch)

Predicates:

Serves(UnionGrill, AmericanFood)

Restaurant(UnionGrill)

Have(Speaker, FiveDollars) ∧ ¬Have(Speaker, LotOfTime)

∀x Person(x) ⇒ Have(x, FiveDollars)

∃x,y Person(x) ∧ Restaurant(y) ∧ ¬HasVisited(x,y)
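To make the truth conditions concrete, the formulas can be evaluated against a small hand-built model. Every fact in this sketch (who has what, who has visited where, the dress code value) is invented for illustration, and representing predicates as Python sets of tuples is just one convenient encoding.

```python
# A tiny model; all facts below are made up for illustration.
DOMAIN = {
    "Speaker", "Alice", "UnionGrill", "ThePorch",
    "FiveDollars", "LotOfTime", "AmericanFood",
}
PERSON = {"Speaker", "Alice"}
RESTAURANT = {"UnionGrill", "ThePorch"}
HAVE = {("Speaker", "FiveDollars"), ("Alice", "FiveDollars")}
SERVES = {("UnionGrill", "AmericanFood")}
HAS_VISITED = {
    ("Speaker", "UnionGrill"),
    ("Alice", "UnionGrill"),
    ("Alice", "ThePorch"),
}
# A function maps a term to another term instead of to a truth value.
DRESS_CODE = {"ThePorch": "Casual"}

# Have(Speaker, FiveDollars) ∧ ¬Have(Speaker, LotOfTime)
conjunction = (
    ("Speaker", "FiveDollars") in HAVE
    and ("Speaker", "LotOfTime") not in HAVE
)

# ∀x Person(x) ⇒ Have(x, FiveDollars); "A ⇒ B" is evaluated as "not A or B".
universal = all(
    x not in PERSON or (x, "FiveDollars") in HAVE for x in DOMAIN
)

# ∃x,y Person(x) ∧ Restaurant(y) ∧ ¬HasVisited(x,y)
existential = any(
    x in PERSON and y in RESTAURANT and (x, y) not in HAS_VISITED
    for x in DOMAIN
    for y in DOMAIN
)

print(conjunction, universal, existential)
```

In this model the existential formula holds because Speaker has not visited ThePorch; adding that pair to `HAS_VISITED` would make it false.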


First Order Logic: Advantages

• Flexible

• Well-understood

• Widely used


EM

• We often have unlabeled or incomplete data

• EM is an algorithm for learning without labels, e.g., “classification” without classes

The Solution

• Pick random centroids

• Iterate the following:

– E-step: use centroids to label the data

– M-step: compute centroids using the labeled data

• Keep doing this until labels don’t change
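The loop described on this slide is k-means clustering, which can be read as a hard-assignment variant of EM. Below is a minimal one-dimensional sketch; the function name `kmeans_1d`, the toy data, and the choice to pass the initial centroids in explicitly (rather than picking them at random, as the slide suggests) are assumptions made so that the example is reproducible.

```python
def kmeans_1d(points, initial_centroids, max_iters=100):
    """Alternate labeling and centroid updates until the labels stop changing."""
    centroids = list(initial_centroids)
    labels = None
    for _ in range(max_iters):
        # E-step: label each point with the index of its nearest centroid.
        new_labels = [
            min(range(len(centroids)), key=lambda k: abs(p - centroids[k]))
            for p in points
        ]
        if new_labels == labels:  # converged: labels did not change
            break
        labels = new_labels
        # M-step: move each centroid to the mean of the points assigned to it.
        for k in range(len(centroids)):
            members = [p for p, label in zip(points, labels) if label == k]
            if members:  # leave a centroid in place if no point chose it
                centroids[k] = sum(members) / len(members)
    return centroids, labels


POINTS = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
CENTROIDS, LABELS = kmeans_1d(POINTS, initial_centroids=[1.0, 2.0])
print(CENTROIDS, LABELS)
```

For random initialization, something like `random.sample(points, k)` could supply `initial_centroids`; different starting points can lead to different final clusterings on less well-separated data.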


NLP Uses

• Answer questions using the Web

• Translate documents from one language to another

• Do library research; summarize

• Manage messages intelligently

• Help make informed decisions

• Follow directions given by any user

• Fix your spelling or grammar

• Grade exams

• Write poems or novels

• Listen and give advice

• Estimate public opinion

• Read everything and make predictions

• Interactively help people learn

• Help disabled people

• Help refugees/disaster victims

• Document or reinvigorate indigenous languages


More NLP ...

• Language Technologies Minor

– 4 LT courses plus LT project

• 5th year Masters in Language Technologies


More NLP Courses

• 11-492/692 Speech Processing

– Fall: Alan W Black

– Practical Systems for Speech

• 11-711 Algorithms and NLP

– Fall: Emma Strubell, Robert Frederking

– Research oriented

• 11-727 Computational Semantics

– Spring: Ed Hovy, Teruko Mitamura


More NLP Courses

• 11-747 Neural Networks for NLP

– Spring: Graham Neubig

• 11-830 Computational Ethics for NLP

– Spring: Yulia Tsvetkov, Alan W Black

• 11-777 Advanced Multimodal ML

– Fall: Louis-Philippe Morency

– Visual, Gesture, Speech

• Most Neural Net Classes

– Always involve NLP