Outline
Introduction
Machine Learning
Tasks: Language Models, POS Tagging, Chunking/Parsing, Named Entity Recognition, Coreference Resolution, Sentiment Analysis, Topic Modeling, Wikifier, Machine Translation, Trustworthiness
Cognitive Computation Group
Stephen Mayhew (UIUC) 2 / 30
What is NLP?
It is NOT:
• Neuro Linguistic Programming
• Speech processing, although they are similar.
It is:
• Subfield of AI
• Synthesis of: statistics, math, linguistics, computer science, probability theory, cognitive science.
Vague goal: Natural Language Understanding.
History of NLP
• 1940s-1950s: Birth of the computer
• 1957-1983: Two camps: grammatical and statistical
• 1983-2000: FSMs, “Empiricism Strikes Back”
• 2000-present: Rise of machine learning
Machine Learning
Simple definition: machine learning is essentially finding a separating hyperplane among a set of points in some high-dimensional space.
Complicated definition: take a class.
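A minimal sketch of that one-line definition: the classic perceptron update nudging a hyperplane until it separates two classes of 2-D points. The data and names here are illustrative, not from the slides.

```python
# Perceptron: find a separating hyperplane w·x + b = 0 for a toy
# linearly separable dataset (hypothetical points, two classes ±1).
def perceptron(points, labels, epochs=20):
    w = [0.0, 0.0]
    b = 0.0
    for _ in range(epochs):
        for (x1, x2), y in zip(points, labels):
            pred = 1 if w[0] * x1 + w[1] * x2 + b > 0 else -1
            if pred != y:  # misclassified: rotate/shift the hyperplane toward y
                w[0] += y * x1
                w[1] += y * x2
                b += y
    return w, b

points = [(2, 3), (3, 3), (-2, -1), (-3, -2)]
labels = [1, 1, -1, -1]
w, b = perceptron(points, labels)
# Every training point now lies on the correct side of the hyperplane
assert all((1 if w[0] * x1 + w[1] * x2 + b > 0 else -1) == y
           for (x1, x2), y in zip(points, labels))
```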
Language Models
n-grams: “The cat sat on the mat” → (The, cat), (cat, sat), (sat, on), (on, the), (the, mat).
A language model is a probability distribution over word sequences, estimated from data.
Pr(w_i | w_{i−2}, w_{i−1})
What about unseen words? Smoothing.
What size n-grams make sense?
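The two questions on the slide can be made concrete with a bigram model over a toy corpus. This is a sketch under simplifying assumptions: a bigram model instead of the trigram Pr(w_i | w_{i−2}, w_{i−1}), and add-one (Laplace) smoothing as one of many smoothing choices.

```python
from collections import defaultdict

# Bigram language model with add-one (Laplace) smoothing,
# estimated from a toy corpus.
corpus = "the cat sat on the mat".split()

unigram_counts = defaultdict(int)
bigram_counts = defaultdict(int)
for w in corpus:
    unigram_counts[w] += 1
for pair in zip(corpus, corpus[1:]):
    bigram_counts[pair] += 1

V = len(unigram_counts)  # vocabulary size

def prob(w, prev):
    # Pr(w | prev): add-one smoothing gives unseen bigrams a small
    # nonzero probability instead of zero.
    return (bigram_counts[(prev, w)] + 1) / (unigram_counts[prev] + V)

p_seen = prob("cat", "the")    # "the cat" occurs in the corpus
p_unseen = prob("dog", "the")  # never seen, but > 0 thanks to smoothing
assert p_seen > p_unseen > 0
```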
Generating Shakespeare
Generating sentences with random unigrams:
• Every enter now severally so, let
• Hill he late speaks; or! a more to leg less first you enter

With bigrams:
• What means, sir. I confess she? then all sorts, he is trim, captain.
• Why dost stand forth thy canopy, forsooth; he is this palpable hit the King Henry.

With trigrams:
• Sweet prince, Falstaff shall die.
• This shall forbid it should be branded, if renown made it empty.

With quadrigrams:
• What! I will go seek the traitor Gloucester.
• Will you not tell me who I am?
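The generation procedure behind those examples is just a random walk over n-gram successors. A bigram sketch on a toy corpus (the slide's examples used the actual plays, and real output quality needs far more data):

```python
import random

# Generate text by repeatedly sampling a successor of the current
# word from bigram counts (toy corpus, fixed seed for reproducibility).
random.seed(0)
corpus = "the cat sat on the mat and the cat ran on the grass".split()

successors = {}
for a, b in zip(corpus, corpus[1:]):
    successors.setdefault(a, []).append(b)

word, out = "the", ["the"]
for _ in range(6):
    # Back off to a uniform draw over the corpus if the word has no successor
    word = random.choice(successors.get(word, corpus))
    out.append(word)
print(" ".join(out))
```

Larger n (trigrams, quadrigrams) conditions on more context, which is why the quadrigram examples above read like memorized Shakespeare.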
Google n-grams
https://books.google.com/ngrams/
Interesting examples: “Ford Model T”, “steam engine”, “artificial intelligence”
Part-of-Speech Tagging
Example:
Fruit flies like a banana, time flies like an arrow.
NNP/Fruit VBZ/flies IN/like DT/a NN/banana ,/, NN/time VBZ/flies IN/like DT/an NN/arrow ./.
• Sequence tagging task
• Choose from a fixed set of tags (in English, about 44)
• Solved using Viterbi algorithm (HMM)
• In English, state-of-the-art is about 97%. (Solved).
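A toy version of the Viterbi decoding step: dynamic programming over tag sequences in an HMM. The probabilities here are hand-set for illustration; a real tagger estimates transition and emission probabilities from a treebank, and uses the full ~44-tag set.

```python
# Toy Viterbi decoder for an HMM POS tagger (two tags, hand-set
# probabilities; unseen emissions get a tiny floor probability).
def viterbi(words, tags, start_p, trans_p, emit_p):
    # chart[t] = (score of best path ending in tag t, that path)
    chart = {t: (start_p[t] * emit_p[t].get(words[0], 1e-6), [t]) for t in tags}
    for w in words[1:]:
        nxt = {}
        for t in tags:
            best = max(tags, key=lambda p: chart[p][0] * trans_p[p][t])
            score = chart[best][0] * trans_p[best][t] * emit_p[t].get(w, 1e-6)
            nxt[t] = (score, chart[best][1] + [t])
        chart = nxt
    return max(chart.values())[1]  # path with the highest final score

tags = ["NN", "VBZ"]
start_p = {"NN": 0.8, "VBZ": 0.2}
trans_p = {"NN": {"NN": 0.3, "VBZ": 0.7}, "VBZ": {"NN": 0.8, "VBZ": 0.2}}
emit_p = {"NN": {"time": 0.6, "flies": 0.1}, "VBZ": {"flies": 0.6}}
print(viterbi(["time", "flies"], tags, start_p, trans_p, emit_p))  # ['NN', 'VBZ']
```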
Chunking
Example:
Chunking is not far from POS tagging.
[NP Chunking] [VP is] not [ADVP far] [PP from] [NP POS tagging].
• Also a sequence tagging task
• Smaller set of fixed tags (NP, VP, ADVP, etc.)
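As a sequence tagging task, chunking is usually encoded with BIO labels (B-X begins a chunk of type X, I-X continues it, O is outside any chunk). A sketch of decoding a BIO sequence back into the bracketed chunks above (the label sequence is hand-written for this example):

```python
# Decode a BIO-labeled sentence into (chunk type, chunk text) spans.
words = ["Chunking", "is", "not", "far", "from", "POS", "tagging"]
bio   = ["B-NP", "B-VP", "O", "B-ADVP", "B-PP", "B-NP", "I-NP"]

def bio_to_chunks(words, bio):
    chunks, cur = [], None
    for w, tag in zip(words, bio):
        if tag.startswith("B-"):           # start a new chunk
            cur = (tag[2:], [w])
            chunks.append(cur)
        elif tag.startswith("I-") and cur and cur[0] == tag[2:]:
            cur[1].append(w)               # continue the current chunk
        else:                              # O, or an ill-formed I- tag
            cur = None
    return [(label, " ".join(ws)) for label, ws in chunks]

print(bio_to_chunks(words, bio))
# [('NP', 'Chunking'), ('VP', 'is'), ('ADVP', 'far'), ('PP', 'from'), ('NP', 'POS tagging')]
```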
Parsing
• More complicated than chunking
• No longer a sequence tagging problem
• A difficult problem
• Used as input to other problems
Named Entity Recognition
Example:
I’ve got a feeling we’re not in [LOC Kansas] anymore.
I’m sorry, [PER Dave], I’m afraid I can’t do that.
I’m going to [LOC Lohmann Park] with [PER Abcde Redbottom] next week.
• Also a sequence labeling task
• Labels: BIO label for each word
• Note: if you have the training data, this can recognize any type of label.
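To make the BIO output format concrete, here is a deliberately crude tagger that only consults a toy gazetteer (single-token entities only, so it would miss "Lohmann Park"). Real NER is a learned sequence model; this sketch just shows what the per-word labels look like.

```python
# Toy gazetteer-based "NER": each known token gets B-<type>, everything
# else gets O. Purely illustrative; real taggers learn from annotated data.
GAZETTEER = {"Kansas": "LOC", "Dave": "PER"}

def tag(tokens):
    return [("B-" + GAZETTEER[t]) if t in GAZETTEER else "O" for t in tokens]

print(tag(["I'm", "sorry", ",", "Dave"]))  # ['O', 'O', 'O', 'B-PER']
```

Multi-token entities are why the I- (inside) label exists: "Lohmann Park" would be tagged B-LOC I-LOC by a proper system.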
Coreference Resolution
Example:
The ball crashed through the table because [it] was made of styrofoam.
vs.
The ball crashed through the table because [it] was made of steel.
• Very difficult task, even for humans
Sentiment Analysis
Positive: “Having never been to a Brazilian steakhouse, this place sets the bar high. Food was awesome! Service was the best I’ve ever had. Always around and promptly responding if anything was needed, and checking on us, but not being annoying. Will definitely be back!”

Negative: “Overall this place could be good but is just a disappointment. They have a great selection of vegetables, meats, sauces, and other ingredients, but even when following their “recipes” the food isn’t that great. It was extremely salty and just not very impressive. I think that the grill maybe got my food mixed up with someone else’s food maybe, it just wasn’t good. Overall it was edible but I would never go back for the price I paid for salty, mediocre stir fry.”
Like Mozart: Too easy for beginners, too hard for experts.
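The "too easy for beginners" half of that joke: a bag-of-words lexicon count gets reviews like these right. The lexicons below are tiny and hypothetical; real systems learn weighted lexicons, and fail exactly on the hard cases (negation, sarcasm, "could be good but...").

```python
# Naive lexicon-based sentiment: positive word hits minus negative word
# hits. Toy lexicons chosen for this example; not a real resource.
POS_WORDS = {"awesome", "best", "great", "definitely"}
NEG_WORDS = {"disappointment", "salty", "mediocre", "annoying"}

def score(text):
    words = {w.strip(".,!?\"'").lower() for w in text.split()}
    return len(words & POS_WORDS) - len(words & NEG_WORDS)

assert score("Food was awesome! Service was the best") > 0
assert score("salty, mediocre stir fry, a disappointment") < 0
```

Note that the negative review above actually contains "great" twice, which is precisely why counting words alone hits a ceiling.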
Topic Modeling
Topic 1: fire, los, angeles, homes, firefighters, miles, area, officials, people, park, san, ...
Topic 2: health, smoking, medical, children, doctors, cigarettes, percent, public, group, ...
Topic 3: farmers, farm, trade, agriculture, agricultural, yeutter, tons, grain, products, ...
Machine Translation
Huge task, very difficult.
E* = argmax_E Pr(E | F)
   = argmax_E Pr(F | E) Pr(E)

by Bayes’ rule, since Pr(F) does not depend on E.

Note the need for a language model: the Pr(E) term.
• Parallel Corpora
• Alignment
• Phrase-based translation
• More data gets better results
Google Translate Fail
“Tesco found that 40% of apples are wasted, as are just under half of bakery items.”
−→ To Spanish −→
“Tesco found that 40% of the blocks are wasted because they are slightly less than half of the bakery products.”
Trustworthiness
[Figure: bipartite graph linking sources (1-4) to claims (1-6), with source side S and claim side C.]

Given a bipartite source-claim graph, what sort of guarantees or interesting conclusions can we get?
Things I didn’t talk about
• Grammar induction
• Bayesian methods
• Text generation
• Event extraction
• Information retrieval
• Query expansion
• Word sense disambiguation
• Textual entailment
• Similarity measures
• Context-sensitive spelling correction
• ESL correction
• Relation extraction
• Transliteration
• Concept extraction
• Question answering
• ...
CCG Tools
• Learning Based Java (LBJ)
• JLIS (structured learning)
• Named Entity Recognition
• Wikifier
• Coreference resolution
• Demos
• Much more...
References
Concept graph: Chen-Tse Tsai, UIUC
Machine learning graph: http://scikit-learn.org/
n-grams slide: http://www.cs.columbia.edu/~kathy/NLP/ClassSlides/Class3-ngrams09/ngrams.pdf
Wikipedia diagram: Xiao Cheng, UIUC
Parse tree: http://geniferology.blogspot.com/
Topic models: http://www.cs.princeton.edu/~blei/lda-c/index.html
LDA graphic: http://www.cs.cornell.edu/courses/cs6784/2010sp/lecture/30-BleiEtAl03.pdf
Various examples: http://cogcomp.cs.illinois.edu
Vauquois triangle: Julia Hockenmaier’s slides, http://courses.engr.illinois.edu/cs498jh/fa2012/Slides/Lecture21HO.pdf