Language Models for Information Retrieval
Andy Luong and Nikita Sudan
Outline
Language Model
Types of Language Models
Query Likelihood Model
Smoothing
Evaluation
Comparison with other approaches
Language Model
A language model is a function that puts a probability measure over strings drawn from some vocabulary.
Language Models
Rank by P(q | Md), the probability of the query under the document's language model, instead of directly modeling P(R=1 | q, d).
Example
Doc1: “frog said that toad likes frog”
Doc2: “toad likes frog”

     frog  said  that  toad  likes  STOP
M1   1/3   1/6   1/6   1/6   1/6    .2
M2   1/3   0     0     1/3   1/3    .2
Example Continued
q = “frog likes toad”
P(q | M1) = (1/3)*(1/6)*(1/6)*0.8*0.8*0.2 ≈ .0012
P(q | M2) = (1/3)*(1/3)*(1/3)*0.8*0.8*0.2 ≈ .0047
P(q | M1) < P(q | M2)
     frog  said  that  toad  likes  STOP
M1   1/3   1/6   1/6   1/6   1/6    .2
M2   1/3   0     0     1/3   1/3    .2
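The example above can be reproduced with a few lines of Python. This is a toy sketch of the computation, not code from the slides; the function names `unigram_lm` and `query_likelihood` and the `stop_prob` parameter are my own, and STOP is modeled as in the example (continue with probability 0.8 between terms, stop with 0.2 at the end):

```python
def unigram_lm(doc):
    """MLE unigram model: P(t | Md) = count of t in d / length of d."""
    tokens = doc.split()
    return {t: tokens.count(t) / len(tokens) for t in set(tokens)}

def query_likelihood(query, model, stop_prob=0.2):
    """P(q | Md): product of term probabilities, times the STOP factors."""
    terms = query.split()
    p = stop_prob * (1 - stop_prob) ** (len(terms) - 1)
    for t in terms:
        p *= model.get(t, 0.0)  # a term unseen in the document zeroes the product
    return p

m1 = unigram_lm("frog said that toad likes frog")
m2 = unigram_lm("toad likes frog")
p1 = query_likelihood("frog likes toad", m1)  # (1/3)*(1/6)*(1/6)*0.8*0.8*0.2
p2 = query_likelihood("frog likes toad", m2)  # (1/3)*(1/3)*(1/3)*0.8*0.8*0.2
```

The shorter document wins here because every query term has high relative frequency in it, matching the comparison on the slide.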
Types of Language Models
Chain rule: P(t1 t2 t3 t4) = P(t1) P(t2 | t1) P(t3 | t1 t2) P(t4 | t1 t2 t3)
Unigram LM: P(t1 t2 t3 t4) = P(t1) P(t2) P(t3) P(t4)
Bigram LM: P(t1 t2 t3 t4) = P(t1) P(t2 | t1) P(t3 | t2) P(t4 | t3)
The unigram model is a multinomial distribution over terms, where M is the size of the term vocabulary.
Higher-order models capture word order, but their additional constraints require far more data to estimate term frequencies reliably.
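The unigram and bigram factorizations can be contrasted in a toy sketch (the helper functions and the example probability tables below are my own illustration, not from the slides; log probabilities are used to avoid underflow):

```python
import math

def unigram_logprob(seq, p1):
    """Unigram LM: terms are independent, so log P = sum of per-term logs."""
    return sum(math.log(p1[t]) for t in seq)

def bigram_logprob(seq, p1, p2):
    """Bigram LM: first term unconditional, each later term conditioned on its predecessor."""
    return math.log(p1[seq[0]]) + sum(math.log(p2[(a, b)]) for a, b in zip(seq, seq[1:]))

seq = ["toad", "likes", "frog"]
p1 = {"toad": 1/3, "likes": 1/3, "frog": 1/3}             # hypothetical unigram table
p2 = {("toad", "likes"): 1.0, ("likes", "frog"): 1.0}     # hypothetical bigram table
uni = unigram_logprob(seq, p1)       # 3 * log(1/3)
bi = bigram_logprob(seq, p1, p2)     # log(1/3), since both bigrams are certain
```

The bigram model assigns this sequence a much higher probability because it exploits the order information the unigram model throws away.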
Query Likelihood Model
By Bayes' rule, P(d | q) ∝ P(q | d) P(d); with a uniform document prior, ranking by P(d | q) ≈ ranking by P(q | Md).
Infer an LM for each document, estimate P(q | Md(i)), and rank documents based on these probabilities.
MLE: P̂(t | Md) = tf(t,d) / Ld, the frequency of t in d divided by the number of tokens in d.
Smoothing: Basic Intuition
A query term that is new or unseen in the document gets P(t | Md) = 0 under MLE, and a single zero probability makes P(q | Md) = 0.
Why else should we smooth? MLE overestimates terms that happen to occur in the document and gives no weight to everything else.
Smoothing Continued
A non-occurring term should receive a probability close to, but not exceeding, its relative frequency in the whole collection; the collection frequency acts as an upper probability bound.
Linear Interpolation Language Model
P(t | d) = λ P̂(t | Md) + (1 − λ) P̂(t | Mc), mixing the document model with a model Mc built from the entire collection.
Example
Doc1: “frog said that toad likes frog”
Doc2: “toad likes frog”

     frog  said  that  toad  likes
M1   1/3   1/6   1/6   1/6   1/6
M2   1/3   0     0     1/3   1/3
C    1/3   1/9   1/9   2/9   2/9
Example Continued
q = “frog said”, λ = ½
P(q | M1) = [(1/3 + 1/3)*(1/2)] * [(1/6 + 1/9)*(1/2)] ≈ .046
P(q | M2) = [(1/3 + 1/3)*(1/2)] * [(0 + 1/9)*(1/2)] ≈ .018
P(q | M1) > P(q | M2)
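The interpolated example can also be checked in code. This is a sketch under the same setup as the slide (the collection model is built by pooling the two documents; the function names `mle` and `interpolated_likelihood` are my own):

```python
def mle(doc):
    """MLE unigram model over a whitespace-tokenized document."""
    toks = doc.split()
    return {t: toks.count(t) / len(toks) for t in set(toks)}

def interpolated_likelihood(query, doc_model, coll_model, lam=0.5):
    """P(q | d) = product over query terms of lam*P(t|Md) + (1-lam)*P(t|Mc)."""
    p = 1.0
    for t in query.split():
        p *= lam * doc_model.get(t, 0.0) + (1 - lam) * coll_model.get(t, 0.0)
    return p

doc1 = "frog said that toad likes frog"
doc2 = "toad likes frog"
m1, m2 = mle(doc1), mle(doc2)
mc = mle(doc1 + " " + doc2)  # collection model from the pooled documents
p1 = interpolated_likelihood("frog said", m1, mc)  # 5/108 ≈ .046
p2 = interpolated_likelihood("frog said", m2, mc)  # 1/54  ≈ .018
```

Note that Doc2, which lacks “said”, still gets a nonzero score: the collection model supplies the missing mass, which is exactly what smoothing is for.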
Evaluation
Precision = |relevant documents ∩ retrieved documents| / |retrieved documents|
Recall = |relevant documents ∩ retrieved documents| / |relevant documents|
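The two measures above are set ratios, which makes them a one-liner each. A minimal sketch (the document IDs are made up for illustration):

```python
def precision_recall(relevant, retrieved):
    """Precision and recall from sets of relevant and retrieved document IDs."""
    relevant, retrieved = set(relevant), set(retrieved)
    hits = len(relevant & retrieved)
    return hits / len(retrieved), hits / len(relevant)

# 2 of the 4 retrieved docs are relevant; 2 of the 3 relevant docs were retrieved
p, r = precision_recall({"d1", "d2", "d3"}, {"d2", "d3", "d4", "d5"})
```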
Tf-Idf
A term's importance increases proportionally with the number of times it appears in the document, but is offset by the frequency of the term in the corpus.
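That intuition can be sketched with one common tf-idf variant (raw term frequency times log inverse document frequency; there are several weighting schemes, and this choice is mine, not the slides'):

```python
import math

def tf_idf(term, doc, corpus):
    """Raw term frequency times log inverse document frequency."""
    tf = doc.split().count(term)
    df = sum(1 for d in corpus if term in d.split())  # documents containing the term
    idf = math.log(len(corpus) / df) if df else 0.0
    return tf * idf

corpus = ["frog said that toad likes frog", "toad likes frog"]
w_said = tf_idf("said", corpus[0], corpus)  # appears in 1 of 2 docs: tf=1, idf=log(2)
w_frog = tf_idf("frog", corpus[0], corpus)  # appears in every doc, so idf = log(1) = 0
```

“frog” is frequent in the document but occurs in every document, so its weight collapses to zero, which is exactly the offsetting effect described above.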
Ponte and Croft’s Experiments
Pros and Cons
“Mathematically precise, conceptually simple, computationally tractable and intuitively appealing.”
Relevance is not explicitly captured by the model.
Query vs. Document Model
(a) Query likelihood: rank by P(q | Md)
(b) Document likelihood: rank by P(d | Mq)
(c) Model comparison: build both a query model and a document model and rank by the KL divergence between them, KL(Mq ‖ Md)
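The model-comparison score can be sketched directly from the definition of KL divergence, KL(p ‖ q) = Σ p(t) log(p(t)/q(t)). The toy query and document models below are my own illustration (in practice the document model would be smoothed so that q(t) > 0 wherever p(t) > 0):

```python
import math

def kl_divergence(p, q):
    """KL(p || q) over a shared vocabulary; terms with p(t) = 0 contribute nothing."""
    return sum(pt * math.log(pt / q[t]) for t, pt in p.items() if pt > 0)

mq = {"frog": 0.5, "toad": 0.5, "likes": 0.0}        # hypothetical query model
md = {"frog": 1/3, "toad": 1/3, "likes": 1/3}        # hypothetical document model
d = kl_divergence(mq, md)  # log(1.5): the divergence between the two models
```

A smaller divergence means the document model is closer to the query model, so documents are ranked by increasing KL(Mq ‖ Md); a document whose model equals the query model scores a perfect 0.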
Thank you.
Questions?