Language Models for Information Retrieval
References:
1. W. B. Croft and J. Lafferty (Editors). Language Modeling for Information Retrieval. July 2003
2. X. Liu and W. B. Croft. Statistical Language Modeling for Information Retrieval. Annual Review of Information Science and Technology, vol. 39, 2005
3. Christopher D. Manning, Prabhakar Raghavan and Hinrich Schütze. Introduction to Information Retrieval. Cambridge University Press, 2008 (Chapter 12)
4. D. A. Grossman and O. Frieder. Information Retrieval: Algorithms and Heuristics. Springer, 2004 (Chapter 2)
5. C. X. Zhai. Statistical Language Models for Information Retrieval (Synthesis Lectures on Human Language Technologies). Morgan & Claypool Publishers, 2008
Berlin Chen
Department of Computer Science & Information Engineering
National Taiwan Normal University
IR – Berlin Chen 2
Taxonomy of Classic IR Models
• Document property: Text, Links, Multimedia; Proximal Nodes and others; XML-based; Semi-structured Text
• Classic Models: Boolean, Vector, Probabilistic
• Set Theoretic: Fuzzy, Extended Boolean, Set-based
• Probabilistic: BM25, Language Models, Divergence from Randomness, Bayesian Networks
• Algebraic: Generalized Vector, Latent Semantic Indexing, Neural Networks, Support Vector Machines
• Web: PageRank, Hubs & Authorities
• Multimedia Retrieval: Image Retrieval, Audio and Music Retrieval, Video Retrieval
Statistical Language Models (1/2)
• A probabilistic mechanism for "generating" a piece of text
  – Defines a distribution over all possible word sequences W = w_1 w_2 … w_L
  – The LM is used to quantify the acceptability of a given word sequence: P(W) = ?
• What is an LM used for?
  – Speech recognition
  – Spelling correction
  – Handwriting recognition
  – Optical character recognition
  – Machine translation
  – Document classification and routing
  – Information retrieval, …
Statistical Language Models (2/2)
• (Statistical) language models (LMs) have been widely used for speech recognition and language (machine) translation for more than thirty years
• However, their use for information retrieval started only in 1998 [Ponte and Croft, SIGIR 1998]
  – Basically, a query is considered to be generated from an "ideal" document that satisfies the information need
  – The system's job is then to estimate the likelihood of each document in the collection being the ideal document, and to rank them accordingly (in decreasing order)
Ponte and Croft. A language modeling approach to information retrieval. SIGIR 1998
Three Ways of Developing LM Approaches for IR
(a) Query likelihood
(b) Document likelihood
(c) Model comparison
literal term matching or concept matching
Query-Likelihood Language Models
• Criterion: documents are ranked based on Bayes (decision) rule

  P(D|Q) = P(Q|D) P(D) / P(Q)

  – P(Q) is the same for all documents and can be ignored
  – P(D) might have to do with authority, length, genre, etc.
    • There is no general way to estimate it
    • It can be treated as uniform across all documents
• Documents can therefore be ranked based on P(Q|D), also denoted P(Q|M_D), where M_D is the document model
  – The user has a prototype (ideal) document in mind and generates a query based on words that appear in this document
  – A document is treated as a model to predict (generate) the query
Another Criterion: Maximum Mutual Information
• Documents can be ranked based on their mutual information with the query (in decreasing order)

  MI(Q,D) = log [ P(Q,D) / ( P(Q) P(D) ) ] = log P(Q|D) − log P(Q)

  – log P(Q) is the same for all documents and hence can be ignored
• Document ranking by mutual information (MI) is therefore equivalent to ranking by likelihood:

  argmax_D MI(Q,D) = argmax_D P(Q|D)   (rank equivalence)
Yet Another Criterion: Minimum KL Divergence
• Documents are ranked by Kullback-Leibler (KL) divergence (in increasing order)

  KL(Q‖D) = Σ_w P(w|Q) log [ P(w|Q) / P(w|D) ]
          = Σ_w P(w|Q) log P(w|Q) − Σ_w P(w|Q) log P(w|D)

  – P(w|Q): query model; P(w|D): document model
  – The first term is the same for all documents and can be disregarded
  – The second term, − Σ_w P(w|Q) log P(w|D), is the cross entropy between the language models of the query and the document: relevant documents are deemed to have lower cross entropies
  – Equivalent to ranking in decreasing order of Σ_w c(w,Q) log P(w|D)
Schematic Depiction for Query-Likelihood Approach
(Schematic: each document D_1, D_2, D_3, … in the document collection is represented by its own document model M_D1, M_D2, M_D3, …, against which the query Q is scored.)
Building Document Models: n-grams
• Multiplication (chain) rule
  – Decomposes the probability of a sequence of events into the probabilities of successive events conditioned on earlier events

  P(w_1 w_2 … w_L) = P(w_1) P(w_2|w_1) P(w_3|w_1 w_2) … P(w_L|w_1 … w_{L−1})

• n-gram assumption
  – Unigram
    • Each word occurs independently of the other words
    • The so-called "bag-of-words" model (e.g., it cannot distinguish "street market" from "market street")

    P(w_1 w_2 … w_L) = P(w_1) P(w_2) P(w_3) … P(w_L)

  – Bigram

    P(w_1 w_2 … w_L) = P(w_1) P(w_2|w_1) P(w_3|w_2) … P(w_L|w_{L−1})

  – Most language-modeling work in IR has used unigram models
    • IR does not directly depend on the structure of sentences
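The bag-of-words limitation mentioned above ("street market" vs. "market street") can be illustrated directly. A minimal sketch with hypothetical toy probabilities:

```python
def unigram_prob(seq, p1):
    """Unigram: P(w1...wL) = prod_i P(w_i) -- order is ignored."""
    prob = 1.0
    for w in seq:
        prob *= p1[w]
    return prob

def bigram_prob(seq, p1, p2):
    """Bigram: P(w1) * prod_i P(w_i | w_{i-1}) -- order matters."""
    prob = p1[seq[0]]
    for prev, cur in zip(seq, seq[1:]):
        prob *= p2[(prev, cur)]
    return prob

# Hypothetical probabilities, chosen only to make the point.
p1 = {"street": 0.5, "market": 0.5}
p2 = {("street", "market"): 0.7, ("market", "street"): 0.1}

# The unigram model assigns both orders the same probability...
assert unigram_prob(["street", "market"], p1) == unigram_prob(["market", "street"], p1)
# ...while the bigram model distinguishes them.
assert bigram_prob(["street", "market"], p1, p2) != bigram_prob(["market", "street"], p1, p2)
```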
Unigram Model (1/4)
• The likelihood of a query Q = w_1 w_2 … w_L given a document D

  P(Q|M_D) = P(w_1|M_D) P(w_2|M_D) … P(w_L|M_D) = ∏_{i=1}^{L} P(w_i|M_D)

  – Words are conditionally independent of each other given the document
  – How do we estimate the probability of a (query) word given the document?
• Assume that words follow a multinomial distribution given the document

  P(c(w_1), c(w_2), …, c(w_|V|) | M_D) = [ (Σ_i c(w_i))! / ∏_i c(w_i)! ] · ∏_i λ_{w_i}^{c(w_i)}

  where c(w_i) is the number of times word w_i occurs, λ_{w_i} = P(w_i|M_D), and permutation is considered here
Unigram Model (2/4)
• Use each document itself as a sample for estimating its corresponding unigram (multinomial) model
  – If Maximum Likelihood Estimation (MLE) is adopted:

  P(w_i|M_D) = c(w_i,D) / |D|

  where c(w_i,D) is the number of times w_i occurs in D, and |D| = Σ_j c(w_j,D) is the length of D

  – Example: a document D of length 10 containing w_a four times, w_b three times, w_c twice and w_d once gives
    P(w_a|M_D) = 0.4, P(w_b|M_D) = 0.3, P(w_c|M_D) = 0.2, P(w_d|M_D) = 0.1, P(w_e|M_D) = P(w_f|M_D) = 0.0

• The zero-probability problem
  – If w_e and w_f do not occur in D, then P(w_e|M_D) = P(w_f|M_D) = 0
  – This will cause a problem in predicting the query likelihood (see the equation for the query likelihood in the preceding slide)
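The MLE estimate and the resulting zero-probability problem can be sketched as follows (the toy document mirrors the counts in the example above):

```python
from collections import Counter

def mle_unigram(doc_tokens):
    """P(w | M_D) = c(w, D) / |D| (maximum likelihood estimate)."""
    counts = Counter(doc_tokens)
    total = len(doc_tokens)
    return {w: c / total for w, c in counts.items()}

# Toy document: wa x4, wb x3, wc x2, wd x1 (length 10).
doc = ["wa"] * 4 + ["wb"] * 3 + ["wc"] * 2 + ["wd"]
model = mle_unigram(doc)
assert model["wa"] == 0.4 and model["wb"] == 0.3
assert model["wc"] == 0.2 and model["wd"] == 0.1

# Zero-probability problem: one unseen query word zeroes the whole likelihood.
query = ["wa", "we"]                 # "we" does not occur in D
likelihood = 1.0
for w in query:
    likelihood *= model.get(w, 0.0)  # unseen words get probability 0
assert likelihood == 0.0
```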
Unigram Model (3/4)
• Smooth the document-specific unigram model with a collection model (two states, or a mixture of two multinomials), for a query Q = w_1 w_2 … w_L:

  P(Q|M_D) = ∏_{i=1}^{L} [ λ P(w_i|M_D) + (1 − λ) P(w_i|M_C) ]

• The role of the collection unigram model P(w_i|M_C)
  – Helps to solve the zero-probability problem
  – Helps to differentiate the contributions of different missing terms in a document (global information like IDF?)
• The collection unigram model can be estimated in a similar way to the document-specific unigram model

  P(w_i|M_C) = c(w_i,Collection) / Σ_j c(w_j,Collection)   or   P(w_i|M_C) = n_{w_i} / N   (normalized doc frequency)

  where n_{w_i} is the number of docs in the collection containing w_i, and N is the number of docs in the collection
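A minimal sketch of the smoothed query likelihood (the λ value and toy models are illustrative, not from the slides):

```python
def query_likelihood(query, doc_model, coll_model, lam=0.7):
    """P(Q|M_D) = prod_i [ lam * P(w_i|M_D) + (1 - lam) * P(w_i|M_C) ]."""
    prob = 1.0
    for w in query:
        prob *= lam * doc_model.get(w, 0.0) + (1 - lam) * coll_model[w]
    return prob

# Toy models (hypothetical values; coll_model must cover the vocabulary).
doc_model = {"wa": 0.4, "wb": 0.3, "wc": 0.2, "wd": 0.1}
coll_model = {"wa": 0.2, "wb": 0.2, "wc": 0.2, "wd": 0.2, "we": 0.2}

# The unseen word "we" no longer zeroes the likelihood.
assert query_likelihood(["wa", "we"], doc_model, coll_model) > 0.0
```

Jelinek-Mercer interpolation like this keeps every query word's factor strictly positive as long as the collection model covers the vocabulary.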
Unigram Model (4/4)
• An evaluation on the Topic Detection and Tracking (TDT) corpora

  – Language Model:

    mAP           Unigram   Unigram+Bigram
    TDT2  TQ/TD   0.6327    0.5427
    TDT2  TQ/SD   0.5658    0.4803
    TDT3  TQ/TD   0.6569    0.6141
    TDT3  TQ/SD   0.6308    0.5808

  – Vector Space Model:

    mAP           Unigram   Unigram+Bigram
    TDT2  TQ/TD   0.5548    0.5623
    TDT2  TQ/SD   0.5122    0.5225
    TDT3  TQ/TD   0.6505    0.6531
    TDT3  TQ/SD   0.6216    0.6233

  – Model equations:

    P_Unigram(Q|M_D) = ∏_{i=1}^{L} [ λ_1 P(w_i|M_D) + λ_2 P(w_i|M_C) ],   λ_1 + λ_2 = 1

    P_Unigram+Bigram(Q|M_D) = ∏_{i=1}^{L} [ λ_1 P(w_i|M_D) + λ_2 P(w_i|M_C) + λ_3 P(w_i|w_{i−1},M_D) ],   λ_1 + λ_2 + λ_3 = 1
    (for i = 1 only the unigram terms apply)

• Consideration of contextual information (higher-order language models, e.g., bigrams) will not always lead to improved performance
Training Mixture Weights of LMs
• Expectation-Maximization (EM) training
  – The weights are tied among the documents
  – E.g., m_1 of the Type I HMM can be re-estimated using the following equation:

    m̂_1 = [ Σ_{Q∈TrainSet} Σ_{D∈Doc_Q} Σ_{q_n∈Q} m_1 P(q_n|D) / ( m_1 P(q_n|D) + m_2 P(q_n|Corpus) ) ] / [ Σ_{Q∈TrainSet} |Doc_Q| · |Q| ]

    where TrainSet is the set of training query exemplars, Doc_Q is the set of docs that are relevant to a specific training query exemplar Q, |Q| is the length of the query Q, and |Doc_Q| is the total number of docs relevant to the query
  – m_1 on the right-hand side is the old weight; m̂_1 is the new weight (reported setup: 819 queries, ≤ 2265 docs)
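One tied EM re-estimation step can be sketched as follows (the training set, document and corpus probability functions below are hypothetical placeholders):

```python
def em_update_m1(train_set, m1, p_doc, p_corpus):
    """One EM re-estimate of the tied document-model weight m1 (with m2 = 1 - m1).

    train_set: list of (query_terms, relevant_doc_ids) pairs.
    p_doc(q, d): P(q|D);  p_corpus(q): P(q|Corpus).
    """
    m2 = 1.0 - m1
    num, denom = 0.0, 0
    for query, rel_docs in train_set:
        for d in rel_docs:
            for q in query:
                p_d = p_doc(q, d)
                # posterior probability that q was emitted by the document state
                num += m1 * p_d / (m1 * p_d + m2 * p_corpus(q))
                denom += 1  # accumulates sum over queries of |Doc_Q| * |Q|
    return num / denom

# Hypothetical toy setup: two query terms, one relevant doc.
p_doc = lambda q, d: {"a": 0.5, "b": 0.1}[q]
p_corpus = lambda q: 0.2
train_set = [(["a", "b"], ["D1"])]
m1_new = em_update_m1(train_set, 0.5, p_doc, p_corpus)
assert 0.0 < m1_new < 1.0  # a valid mixture weight after one iteration
```

Because the weight is tied across all documents, a single scalar is re-estimated from the pooled posterior counts rather than one weight per document.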
Discriminative Training of LMs (1/5)
• Minimum Classification Error (MCE) training
  – Given a query Q and a desired relevant doc D*, define the classification error function as:

    E(Q,D*) = −(1/|Q|) log P(Q|D* is relevant) + (1/|Q|) log max_{D'≠D*} P(Q|D' is not relevant)

    "E(Q,D*) > 0" means a misclassification; "E(Q,D*) ≤ 0" means a correct decision
  – Transform the error function into the loss function:

    L(Q,D*) = 1 / ( 1 + exp(−α E(Q,D*)) )

    • Its value lies in the range between 0 and 1
B. Chen et al., “A discriminative HMM/N-gram-based retrieval approach for Mandarin spoken documents,” ACM Transactions on Asian Language Information Processing, 3(2), pp. 128-145, June 2004
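The error and loss functions above can be sketched directly (α and the log-probabilities below are illustrative values only):

```python
import math

def classification_error(logp_rel, logp_best_nonrel, q_len):
    """E(Q, D*) = -(1/|Q|) log P(Q|D*) + (1/|Q|) log max_{D'} P(Q|D')."""
    return (-logp_rel + logp_best_nonrel) / q_len

def mce_loss(error, alpha=1.0):
    """Sigmoid loss L = 1 / (1 + exp(-alpha * E)); value lies in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-alpha * error))

# Correct decision (relevant doc scores higher) -> E < 0 -> loss below 0.5.
e_good = classification_error(logp_rel=-10.0, logp_best_nonrel=-12.0, q_len=4)
assert e_good < 0 and mce_loss(e_good) < 0.5

# Misclassification (a non-relevant doc scores higher) -> E > 0 -> loss above 0.5.
e_bad = classification_error(logp_rel=-12.0, logp_best_nonrel=-10.0, q_len=4)
assert e_bad > 0 and mce_loss(e_bad) > 0.5
```

The sigmoid makes the 0/1 classification error differentiable, which is what allows the gradient-descent updates on the following slides.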
Discriminative Training of LMs (2/5)
• Minimum Classification Error (MCE) training
  – Apply the loss function to the MCE procedure for iteratively updating the weighting parameters
  – Constraints: Σ_k m_k = 1, 0 ≤ m_k ≤ 1
  – Parameter transformation (e.g., Type I HMM):

    m_1 = e^{m̃_1} / ( e^{m̃_1} + e^{m̃_2} )   and   m_2 = e^{m̃_2} / ( e^{m̃_1} + e^{m̃_2} )

  – Iteratively update m̃_1 by gradient descent (e.g., Type I HMM):

    m̃_1^{(i+1)} = m̃_1^{(i)} − ε ∂L(Q,D*)/∂m̃_1 |_{m̃_1 = m̃_1^{(i)}}

    where

    ∂L(Q,D*)/∂m̃_1 = α L(Q,D*) ( 1 − L(Q,D*) ) ∂E(Q,D*)/∂m̃_1
Discriminative Training of LMs (3/5)
• Minimum Classification Error (MCE) training
  – Iteratively update m̃_1 (e.g., Type I HMM). Differentiating the error function with respect to m̃_1, using d log f(x)/dx = f′(x)/f(x) and the softmax derivatives ∂m_1/∂m̃_1 = m_1 m_2 and ∂m_2/∂m̃_1 = −m_1 m_2:

    ∂E(Q,D*)/∂m̃_1 = −(1/|Q|) Σ_{q_n∈Q} m_1 m_2 ( P(q_n|D*) − P(q_n|Corpus) ) / ( m_1 P(q_n|D*) + m_2 P(q_n|Corpus) )
                  = −(1/|Q|) Σ_{q_n∈Q} [ m_1 P(q_n|D*) / ( m_1 P(q_n|D*) + m_2 P(q_n|Corpus) ) − m_1 ]
Discriminative Training of LMs (4/5)
• Minimum Classification Error (MCE) training
  – Iteratively update m̃_1 (e.g., Type I HMM). Substituting the parameter transformation m_1^{(i)} = e^{m̃_1^{(i)}} / ( e^{m̃_1^{(i)}} + e^{m̃_2^{(i)}} ) into the gradient gives

    m̃_1^{(i+1)} = m̃_1^{(i)} − ε(i) α L(Q,D*) ( 1 − L(Q,D*) ) ∂E(Q,D*)/∂m̃_1 |_{m̃_1 = m̃_1^{(i)}}

    with

    ∂E(Q,D*)/∂m̃_1 |_{m̃_1 = m̃_1^{(i)}} = −(1/|Q|) Σ_{q_n∈Q} [ m_1^{(i)} P(q_n|D*) / ( m_1^{(i)} P(q_n|D*) + m_2^{(i)} P(q_n|Corpus) ) − m_1^{(i)} ]

  – m̃_1^{(i)} is the old weight; m̃_1^{(i+1)} is the new weight
Discriminative Training of LMs (5/5)
• Minimum Classification Error (MCE) training
  – Final equations: iteratively update m_1 via

    m̃_1^{(i+1)} = m̃_1^{(i)} + ε(i) α L(Q,D*) ( 1 − L(Q,D*) ) (1/|Q|) Σ_{q_n∈Q} [ m_1^{(i)} P(q_n|D*) / ( m_1^{(i)} P(q_n|D*) + m_2^{(i)} P(q_n|Corpus) ) − m_1^{(i)} ]

    and then m_1^{(i+1)} = e^{m̃_1^{(i+1)}} / ( e^{m̃_1^{(i+1)}} + e^{m̃_2^{(i+1)}} )
  – m_2 can be updated in a similar way
Statistical Translation Model (1/2)
• A query is viewed as a translation or distillation from a document
  – That is, the similarity measure is computed by estimating the probability that the query would have been generated as a translation of that document
  – It assumes context independence (so its ability to handle the ambiguity of word senses is limited)
  – However, it can handle the issues of synonymy (multiple terms having similar meanings) and polysemy (the same term having multiple meanings)
  sim(Q,D) = P(Q|D) = ∏_{q∈Q} P(q|D)^{c(q,Q)},   where P(q|D) = Σ_{w∈D} P_Trans(q|w) P(w|D)   (word-to-word translation)

  With smoothing by a collection model:

  P̂(Q|D) = ∏_{q∈Q} [ λ P(q|D) + (1 − λ) P(q|C) ]^{c(q,Q)}

  Berger & Lafferty (1999)
  A. Berger and J. Lafferty. Information retrieval as statistical translation. SIGIR 1999
Statistical Translation Model (2/2)
• Weakness of the statistical translation model
  – It needs a large collection of training data for estimating the translation probabilities, and it is inefficient for ranking documents
• Jin et al. (2002) proposed a "Title Language Model" approach to capture the intrinsic document-to-query translation patterns
  – Queries are more like titles than documents (queries and titles both tend to be very short, concise descriptions of information, created through a similar generation process)
  – Train the statistical translation model on the document-title pairs in the whole collection
R. Jin et al. Title language model for information retrieval. SIGIR 2002
  M* = argmax_M ∏_{j=1}^{N} P(T_j|D_j, M)
     = argmax_M ∏_{j=1}^{N} ∏_{t∈T_j} P(t|D_j, M)^{c(t,T_j)}
     = argmax_M ∏_{j=1}^{N} ∏_{t∈T_j} [ Σ_{w∈D_j} P(t|w, M) P(w|D_j) ]^{c(t,T_j)}

  where (D_j, T_j), j = 1 … N, are the document-title pairs of the collection
Probabilistic Latent Semantic Analysis (PLSA)
• Also called the Aspect Model or Probabilistic Latent Semantic Indexing (PLSI)
  – Graphical model representation (a kind of Bayesian network) — Hofmann (1999)

  T. Hofmann. Unsupervised learning by probabilistic latent semantic analysis. Machine Learning 2001

• Retrieval with a language (unigram) model:

  sim(Q,D) = P(Q|D) = ∏_{w∈Q} [ λ P(w|M_D) + (1 − λ) P(w|M_C) ]^{c(w,Q)}

  (and P(D|Q) = P(Q|D) P(D) / P(Q) ∝ P(Q|D) P(D))

• Retrieval with PLSA:

  sim(Q,D) = P(Q|D) = ∏_{w∈Q} P(w|D)^{c(w,Q)},   P(w|D) = Σ_{k=1}^{K} P(w|T_k) P(T_k|D)

  – The latent variables T_k are the unobservable class variables (topics or domains)
PLSA: Formulation
• Definitions
  – P(D): the probability of selecting a document D
  – P(T_k|D): the probability of picking a latent class T_k for the document D
  – P(w|T_k): the probability of generating a word w from the class T_k
PLSA: Assumptions
• Bag-of-words: treat docs as a memoryless source; words are generated independently
• Conditional independence: the doc D and word w are independent conditioned on the state of the associated latent variable T_k:

  P(w,D|T_k) = P(w|T_k) P(D|T_k)

  Hence

  P(w,D) = Σ_{k=1}^{K} P(w,D,T_k) = Σ_{k=1}^{K} P(w,D|T_k) P(T_k)
         = Σ_{k=1}^{K} P(w|T_k) P(D|T_k) P(T_k)
         = P(D) Σ_{k=1}^{K} P(w|T_k) P(T_k|D)

  and sim(Q,D) = ∏_{w∈Q} P(w|D)^{c(w,Q)} with P(w|D) = Σ_{k=1}^{K} P(w|T_k) P(T_k|D)
PLSA: Training (1/2)
• Probabilities are estimated by maximizing the collection likelihood, using the Expectation-Maximization (EM) algorithm:

  L_C = Σ_D Σ_w c(w,D) log P(w|D) = Σ_D Σ_w c(w,D) log [ Σ_k P(w|T_k) P(T_k|D) ]

  EM tutorial:
  – Jeff A. Bilmes, "A Gentle Tutorial of the EM Algorithm and its Application to Parameter Estimation for Gaussian Mixture and Hidden Markov Models," U.C. Berkeley TR-97-021
PLSA: Training (2/2)
• E (expectation) step:

  P(T_k|w,D) = P(w|T_k) P(T_k|D) / Σ_{k'} P(w|T_{k'}) P(T_{k'}|D)

• M (maximization) step:

  P̂(w|T_k) = Σ_D c(w,D) P(T_k|w,D) / Σ_{w'} Σ_D c(w',D) P(T_k|w',D)

  P̂(T_k|D) = Σ_w c(w,D) P(T_k|w,D) / Σ_w c(w,D)
PLSA: Latent Probability Space (1/2)
  P(w_j,D_i) = Σ_k P(w_j|T_k) P(T_k) P(D_i|T_k)

  In matrix notation: P = U Σ V^T, where
  – U = [ P(w_j|T_k) ]_{j,k}
  – Σ = diag( P(T_k) )
  – V = [ P(D_i|T_k) ]_{i,k}

  (Figure: example with dimensionality K = 128 latent classes; sample classes include "image sequence analysis", "medical imaging", "context of contour / boundary detection", and "phonetic segmentation".)
PLSA: Latent Probability Space (2/2)
  (Schematic: the m×n matrix P with entries P(w_j,D_i) — rows indexed by words w_1 … w_m, columns by documents D_1 … D_n — factorizes as P = U_{m×K} Σ_{K×K} V^T_{K×n}, where U holds P(w_j|T_k), Σ = diag(P(T_k)), and V^T holds P(D_i|T_k) over topics T_1 … T_K.)
PLSA: One more example on TDT1 dataset
(Example topics discovered on the TDT1 dataset: aviation, space missions, family love, Hollywood love.)
PLSA: Experiment Results (1/4)
• Experimental results
  – Two ways to smooth the empirical distribution with PLSA:
    • PLSA-U*: combine the cosine score with that of the vector space model, as is done for LSA (see the next slide)
    • PLSA-Q*: combine the multinomials individually:

      P*_{PLSA-Q}(w|D) = λ P_Empirical(w|D) + (1 − λ) P_PLSA(w|D)

      where P_Empirical(w|D) = c(w,D)/|D| and P_PLSA(w|D) = Σ_{k=1}^{K} P(w|T_k) P(T_k|D), giving

      P_{PLSA-Q*}(Q|D) = ∏_{w∈Q} [ λ P_Empirical(w|D) + (1 − λ) P_PLSA(w|D) ]^{c(w,Q)}

  – Both provide almost identical performance
  – The performance of PLSA ( P_PLSA(w|D) ) is not promising when used alone
PLSA: Experiment Results (2/4)
• PLSA-U*
  – Use the low-dimensional representations P(T_k|Q) and P(T_k|D) (viewed in a K-dimensional latent space) to evaluate relevance by means of the cosine measure:

    R_{PLSA-U*}(Q,D) = Σ_k P(T_k|Q) P(T_k|D) / [ sqrt(Σ_k P(T_k|Q)^2) · sqrt(Σ_k P(T_k|D)^2) ]

  – The query representation is folded in online:

    P(T_k|Q) = Σ_{w∈Q} c(w,Q) P(T_k|Q,w) / Σ_{w∈Q} c(w,Q)

  – Combine the cosine score with that of the vector space model, using an ad hoc weight λ to re-weight the different model components (dimensions):

    R̃_{PLSA-U*}(Q,D) = λ R_{PLSA-U*}(Q,D) + (1 − λ) R_VSM(Q,D)
PLSA: Experiment Results (3/4)
• Why the cosine measure in the latent space?
  – Recall that in LSA, the relations between any two docs can be formulated through

    A'^T A' = (U'Σ'V'^T)^T (U'Σ'V'^T) = V'Σ'^T U'^T U'Σ'V'^T = (V'Σ')(V'Σ')^T

    so sim(D_i,D_s) = cosine(D̂_i,D̂_s), where D̂_i and D̂_s are row vectors of V'Σ'
  – PLSA mimics LSA in its similarity measure: using P(T_k|D_i) = P(D_i|T_k) P(T_k) / P(D_i),

    R*(D_i,D_s) = Σ_k P(T_k|D_i) P(T_k|D_s) / [ sqrt(Σ_k P(T_k|D_i)^2) · sqrt(Σ_k P(T_k|D_s)^2) ]
                = Σ_k P(D_i|T_k) P(T_k) P(D_s|T_k) P(T_k) / [ sqrt(Σ_k (P(D_i|T_k) P(T_k))^2) · sqrt(Σ_k (P(D_s|T_k) P(T_k))^2) ]

    (the document priors P(D_i) and P(D_s) cancel in the cosine normalization)
PLSA: Experiment Results (4/4)
PLSA vs. LSA
• Decomposition/approximation
  – LSA: least-squares criterion measured on the L2 (Frobenius) norm of the word-doc matrix
  – PLSA: maximization of the likelihood function, based on the cross entropy (Kullback-Leibler divergence) between the empirical distribution and the model
• Computational complexity
  – LSA: SVD decomposition
  – PLSA: EM training, which can be time-consuming over many iterations
  – The model complexity of both LSA and PLSA grows linearly with the number of training documents
• There is no general way to estimate or predict the vector representation (for LSA) or the model parameters (for PLSA) of a newly observed document
Latent Dirichlet Allocation (LDA) (1/2)
• The basic generative process of LDA closely resembles PLSA; however,
  – In PLSA, the topic mixture P(T_k|D) is conditioned on each document (P(T_k|D) is fixed but unknown)
  – In LDA, the topic mixture P(T_k|D) is drawn from a Dirichlet distribution, the so-called conjugate prior (P(T_k|D) is unknown and follows a probability distribution)
• Process of generating a corpus with LDA — Blei et al. (2003):
  1) Pick a multinomial distribution φ_T for each topic T from a Dirichlet distribution with parameter β
  2) Pick a multinomial distribution θ_D for each document D from a Dirichlet distribution with parameter α
  3) Pick a topic T from the multinomial distribution θ_D
  4) Pick a word w from the multinomial distribution φ_T
Blei et al. Latent Dirichlet allocation. Journal of Machine Learning Research, 2003
Latent Dirichlet Allocation (2/2)
  L_LDA = ∏_D ∫ P(θ_D|α) [ ∏_i Σ_{k=1}^{K} P(w_i|T_k) P(T_k|θ_D) ] dθ_D

  (Figure: word distributions live on the probability simplex — for a three-word vocabulary, the points (P(w_1), P(w_2), P(w_3)) with P(w_1) + P(w_2) + P(w_3) = 1.)
Word Topic Models (WTM)
• Each word of the language is treated as a word topical mixture model for predicting the occurrences of other words:

  P_WTM(w_i|M_{w_j}) = Σ_{k=1}^{K} P(w_i|T_k) P(T_k|M_{w_j})

• WTM can also be viewed as a nonnegative factorization of a "word-word" matrix consisting of probability entries
  – Each column encodes the vicinity information of all occurrences of a distinct word in the document collection
B. Chen, “Word topic models for spoken document retrieval and transcription,” ACM Transactions on Asian Language Information Processing, 8(1), pp. 2:1-2:27, March 2009
Comparison of WTM and PLSA/LDA
• A schematic comparison of the matrix factorizations of PLSA/LDA and WTM:
  – PLSA/LDA factorize the normalized "word-document" co-occurrence matrix A (words × documents) into mixture components (words × topics) and mixture weights (topics × documents):

    P_PLSA(w_i|M_D) = Σ_{k=1}^{K} P(w_i|T_k) P(T_k|M_D)

  – WTM factorizes the normalized "word-word" co-occurrence matrix B (words × vicinities of words) into the same kind of mixture components (words × topics) and mixture weights (topics × vicinities of words):

    P_WTM(w_i|M_{w_j}) = Σ_{k=1}^{K} P(w_i|T_k) P(T_k|M_{w_j})
WTM: Information Retrieval (1/4)
• The relevance measure between a query Q and a document D can be expressed by

  P_WTM(Q|D) = ∏_{w_i∈Q} [ Σ_{w_j∈D} P(w_j|D) Σ_{k=1}^{K} P(w_i|T_k) P(T_k|M_{w_j}) ]^{c(w_i,Q)}

• Unsupervised training
  – The WTM of each word w_j can be trained by concatenating the words occurring within a context window of size S around each occurrence of w_j, which are postulated to be relevant to w_j, giving observations O_{w_j} = O_{w_j,1}, O_{w_j,2}, …, O_{w_j,N}; the log-likelihood to be maximized is

    log L = Σ_{w_j} log P_WTM(O_{w_j}|M_{w_j}) = Σ_{w_j} Σ_{w_i∈O_{w_j}} c(w_i,O_{w_j}) log P_WTM(w_i|M_{w_j})
WTM: Information Retrieval (2/4)
• Supervised training: the model parameters are trained using a training set of query exemplars and the associated query-document relevance information
  – Maximize the log-likelihood of the training queries being generated by their relevant documents:

    log L = Σ_{Q∈TrainSet} Σ_{D∈R_Q} log P_WTM(Q|D)
WTM: Information Retrieval (3/4)
• Example topic distributions of WTM
  Topic 13: Vena (靜脈) 1.202; Resection (切除) 0.674; Myoma (肌瘤) 0.668; Cephalitis (腦炎) 0.618; Uterus (子宮) 0.501; Bronchus (支氣管) 0.500
  Topic 23: Cholera (霍亂) 0.752; Colorectal cancer (大腸直腸癌) 0.681; Salmonella enterica (沙門氏菌) 0.471; Aphtae epizooticae (口蹄疫) 0.337; Thyroid (甲狀腺) 0.303; Gastric cancer (胃癌) 0.298
  Topic 14: Land tax (土地稅) 0.704; Tobacco and alcohol tax law (菸酒稅法) 0.489; Tax (財稅) 0.457; Amend drafts (修正草案) 0.446; Acquisition (購併) 0.396; Insurance law (保險法) 0.373
WTM: Information Retrieval (4/4)
• Pairing of PLSA and WTM — sharing the same set of latent topics
  – The normalized "word-document" and "word-word" co-occurrence matrices are factorized jointly, with shared mixture components P(w|T_k) and separate mixture weights P(T_k|D) (for documents) and P(T_k|M_w) (for word vicinities)
Applying Relevance Feedback to LM Framework (1/2)
• There is still no formal mechanism to incorporate relevance feedback (judgments) into the language modeling framework (especially the query-likelihood approach)
  – The query is a fixed sample, while the focus is on accurate estimation of the document language models P(w|M_D)
• Ponte (1998) proposed a limited way to incorporate blind relevance feedback into the LM framework
  – Think of example relevant documents as examples of what the query might have been, and re-sample (or expand) the query by adding the k most descriptive words from these documents, e.g. the words w maximizing

    Σ_{D∈R̃} log [ P(w|M_D) / P(w|C) ]

    where R̃ is the set of (blindly) assumed-relevant documents and C is the collection
J. M. Ponte, A language modeling approach to information retrieval, Ph.D. dissertation, UMass, 1998
Applying Relevance Feedback to LM Framework (2/2)
• Miller et al. (1999) proposed two relevance feedback approaches
  – Query expansion: add to the initial query those words that appear in two or more of the top m retrieved documents
  – Document model re-estimation: use a set of outside training query exemplars to train the transition probabilities of the document models

    P(Q|M_D) = ∏_{i=1}^{L} [ λ P(w_i|M_D) + (1 − λ) P(w_i|M_C) ]

    λ̂ = [ Σ_{Q∈TrainSet} Σ_{D∈Doc_Q} Σ_{q_n∈Q} λ P(q_n|M_D) / ( λ P(q_n|M_D) + (1 − λ) P(q_n|M_C) ) ] / [ Σ_{Q∈TrainSet} |Doc_Q| · |Q| ]

    where TrainSet is the set of training query exemplars, Doc_Q is the set of documents relevant to a specific training query exemplar Q, |Q| is the length of the query Q, and |Doc_Q| is the total number of documents relevant to the query; λ is the old weight and λ̂ is the new weight (reported setup: 819 queries, ≤ 2265 docs)
Miller et al. , A hidden Markov model information retrieval system, SIGIR 1999
Relevance Modeling (1/3)
• A schematic illustration of the information retrieval system with relevance feedback for improved query modeling
(Schematic: a query and the initial query model are matched against the document models in an initial round of retrieval; the top-ranked documents serve as representative documents for various query modeling techniques; the re-estimated query model then drives a second round of retrieval over the document collection to produce the final retrieval result.)
K.-Y. Chen et al., "Exploring the use of unsupervised query modeling techniques for speech recognition and summarization,“Speech Communication, 80, pp. 49-59, June 2016.
Relevance Modeling (2/3)
• Use the top-ranked pseudo-relevant documents to approximate the relevance class of each query
  – The joint probability of a query Q = q_1 q_2 … q_L and any word w being generated by the relevance class is computed on the basis of the list of M top-ranked pseudo-relevant documents obtained from the initial round of retrieval:

    P_RM(w,Q) = Σ_{m=1}^{M} P(D_m) P(w, q_1, q_2, …, q_L | D_m)

  – If we further assume that words are conditionally independent given D_m and that their order is of no importance:

    P_RM(w,Q) = Σ_{m=1}^{M} P(D_m) P(w|D_m) ∏_{l=1}^{L} P(q_l|D_m)
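A sketch of estimating the relevance model from pseudo-relevant documents (the document models and the uniform prior P(D_m) are illustrative choices; unsmoothed zeros are kept for brevity):

```python
import math

def relevance_model(query, docs, doc_models, vocab):
    """P_RM(w,Q) = sum_m P(D_m) * P(w|D_m) * prod_l P(q_l|D_m), uniform P(D_m);
    then normalized to P_RM(w|Q) = P_RM(w,Q) / P_RM(Q)."""
    p_d = 1.0 / len(docs)
    p_wq = {}
    for w in vocab:
        p_wq[w] = sum(
            p_d * doc_models[d].get(w, 0.0)
            * math.prod(doc_models[d].get(q, 0.0) for q in query)
            for d in docs
        )
    z = sum(p_wq.values())  # equals P_RM(Q) when vocab covers the doc models
    return {w: p / z for w, p in p_wq.items()}

# Toy pseudo-relevant document models (hypothetical values).
doc_models = {
    "D1": {"space": 0.5, "mission": 0.3, "launch": 0.2},
    "D2": {"space": 0.2, "movie": 0.6, "launch": 0.2},
}
rm = relevance_model(["space"], ["D1", "D2"], doc_models,
                     ["space", "mission", "movie", "launch"])
assert abs(sum(rm.values()) - 1.0) < 1e-9
assert rm["mission"] > 0  # query expansion: related words gain probability mass
```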
Relevance Modeling (3/3)
• The enhanced query model P_RM(w|Q), therefore, can be expressed by

  P_RM(w|Q) = P_RM(w,Q) / P_RM(Q)
            = [ Σ_{m=1}^{M} P(D_m) P(w|D_m) ∏_{l=1}^{L} P(q_l|D_m) ] / [ Σ_{m=1}^{M} P(D_m) ∏_{l=1}^{L} P(q_l|D_m) ]

• Incorporation of latent topic information:

  P̃(w|D_m) = Σ_{k=1}^{K} P(w|T_k) P(T_k|D_m)

  P_TRM(w,Q) = Σ_{m=1}^{M} P(D_m) Σ_{k=1}^{K} P(T_k|D_m) P(w|T_k) ∏_{l=1}^{L} P(q_l|T_k)
Leveraging Non-relevant Information
• Hypothesis: the low-ranked (or pseudo non-relevant) documents can also provide useful cues to boost the retrieval effectiveness for a given query
  – For this idea to work, we may estimate a non-relevance model P(w|NR_Q) for each test query based on the selected pseudo non-relevant documents
• The similarity measure between a query and a document can then be computed as

  SIM(Q,D) = −KL(Q‖D) + KL(NR_Q‖D)
Incorporating Prior Knowledge into LM Framework
• Several efforts have been made to use prior knowledge in the LM framework, especially for modeling the document prior P(D) in

  P(D|Q) = P(Q|D) P(D) / P(Q)

  – Document length
  – Document source
  – Average word length
  – Aging (time information/period)
  – URL
  – Page links
Implementation Notes: Probability Manipulation
• For language modeling approaches to IR, many conditional probabilities are usually multiplied; this can result in floating-point underflow
• It is better to perform the computation by adding logarithms of probabilities instead
  – The logarithm function is monotonic (order-preserving)

  P(Q|M_D) = ∏_{i=1}^{L} [ λ P(w_i|M_D) + (1 − λ) P(w_i|M_C) ]

  log P(Q|M_D) = Σ_{i=1}^{L} log [ λ P(w_i|M_D) + (1 − λ) P(w_i|M_C) ]

• We should also avoid the problem of zero probabilities (or estimates) owing to sparse data, by using appropriate probability smoothing techniques
Implementation Notes: Converting to tf-idf-like Weighting
• The query likelihood retrieval model:

  log P(Q|M_D) = Σ_{i=1}^{L} log [ λ P(w_i|M_D) + (1 − λ) P(w_i|M_C) ]
               = Σ_{i=1}^{L} log [ λ c(w_i,D)/|D| + (1 − λ) c(w_i,C)/|C| ]
               = Σ_{i: c(w_i,D)>0} log [ λ c(w_i,D)/|D| + (1 − λ) c(w_i,C)/|C| ] + Σ_{i: c(w_i,D)=0} log [ (1 − λ) c(w_i,C)/|C| ]
               = Σ_{i: c(w_i,D)>0} log [ 1 + ( λ c(w_i,D)/|D| ) / ( (1 − λ) c(w_i,C)/|C| ) ] + Σ_{i=1}^{L} log [ (1 − λ) c(w_i,C)/|C| ]
               ≡ Σ_{i: c(w_i,D)>0} log [ 1 + ( λ/(1 − λ) ) · ( c(w_i,D)/|D| ) · ( |C|/c(w_i,C) ) ]   (rank equivalence)

  – The last sum over all query words is the same for all documents, so it can be discarded
  – The logarithm is a monotonic (rank-preserving) transformation
• Therefore, the similarity score is directly proportional to a word's frequency in the document and inversely proportional to its frequency in the collection — a tf-idf-like weighting
  => Can be efficiently implemented with inverted files (to be discussed later on!)
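The rank equivalence derived on this slide can be verified numerically (toy counts are illustrative):

```python
import math

def full_log_likelihood(query, doc, coll, lam=0.5):
    """log P(Q|M_D) with Jelinek-Mercer smoothing, summed over all query words."""
    dl = sum(doc.values())
    cl = sum(coll.values())
    return sum(
        math.log(lam * doc.get(w, 0) / dl + (1 - lam) * coll[w] / cl) for w in query
    )

def rank_equivalent_score(query, doc, coll, lam=0.5):
    """Sum over matched words only: log(1 + lam/(1-lam) * (c(w,D)/|D|) * (|C|/c(w,C)))."""
    dl = sum(doc.values())
    cl = sum(coll.values())
    return sum(
        math.log(1 + lam / (1 - lam) * (doc[w] / dl) * (cl / coll[w]))
        for w in query if w in doc
    )

# Toy collection counts and three toy documents.
coll = {"a": 10, "b": 5, "c": 20}
docs = [{"a": 3, "c": 1}, {"b": 2, "c": 2}, {"a": 1, "b": 1, "c": 1}]
query = ["a", "b"]

full = sorted(range(3), key=lambda i: full_log_likelihood(query, docs[i], coll), reverse=True)
fast = sorted(range(3), key=lambda i: rank_equivalent_score(query, docs[i], coll), reverse=True)
assert full == fast  # identical ranking: the dropped term is document-independent
```

The rank-equivalent form touches only the query words that actually occur in the document, which is what makes the inverted-file implementation efficient.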