
The Maximum Entropy Method for Analyzing Retrieval Measures

Javed Aslam, Emine Yilmaz, Virgil Pavlu

Northeastern University

Evaluation Measures: Top Document Relevance

Evaluation Measures: First Page Relevance

Natural Question...

• Many measures of retrieval performance

• 23 standard measures used in TREC

• Are some measures “better” than others?

• system-oriented vs. user-oriented measures

• Q: What can be learned from a measure?

• A: Good overall measures:

• reduce uncertainty about underlying phenomenon

• allow one to infer underlying phenomenon

• E.g., Health:

• BMI vs. blood pressure vs. cholesterol vs. shoe size

Research Goals

• What can be reasonably inferred from a measure?

• maximum entropy method...

• How good are those inferences?

• compare inferences to reality (e.g., TREC)

• Assess quality of measures

• error, prediction, reduction in uncertainty

Outline

• Introduction

• Standard measures for query retrieval

• The maximum entropy method

• dice example

• measures as constraints

• MEM for query retrieval measures

• Experimental results

Evaluation Measures Setup

• Ranked list of retrieved documents

• Binary relevance judgments

• Good performance:

• “many” relevant docs “high” in list

Traditional IR Measures

• Precision of top k documents, for k:

• 5, 10, 15, 20, 30, 100, 200, 500, 1000

• R-precision:

• precision of top R documents, where R = # relevant docs

• Average precision:

• average of precisions at all R relevant documents...

Traditional IR Measures: Average Precision

List: R N R N N R N N N R (relevant documents at ranks 1, 3, 6, 10)

Precisions at the relevant documents: 1/1, 2/3, 3/6, 4/10

AP = (1/1 + 2/3 + 3/6 + 4/10) / 4 ≈ 0.6417
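A minimal Python sketch (not from the slides) of this computation, reproducing the 0.6417 above:

```python
# Average precision of a ranked list with binary relevance judgments
# (1 = relevant, 0 = nonrelevant): the mean of the precisions measured
# at each relevant document.
def average_precision(rels):
    hits, precisions = 0, []
    for rank, rel in enumerate(rels, start=1):
        if rel:
            hits += 1
            precisions.append(hits / rank)  # precision at this relevant doc
    return sum(precisions) / len(precisions) if precisions else 0.0

# The list R N R N N R N N N R from the slide:
print(average_precision([1, 0, 1, 0, 0, 1, 0, 0, 0, 1]))  # ≈ 0.6417
```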

Most Commonly Used IR Measures

• Average precision

• Precision at 10 (30, etc.) documents

• R-precision

• Precision-recall curves...

Visualizing Retrieval Performance: Precision-Recall Curves

[Figure: precision-recall curve for the list R N R N N R N N N R; x-axis recall (0.25, 0.5, 0.75, 1.0), y-axis precision (0 to 1.0)]
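Each relevant document contributes one (recall, precision) point to the curve; a small sketch (same conventions as the AP sketch above) computing those points:

```python
# (recall, precision) points of a ranked list, one per relevant document,
# as plotted on the precision-recall curve above.
def pr_points(rels):
    total_rel = sum(rels)
    hits, points = 0, []
    for rank, rel in enumerate(rels, start=1):
        if rel:
            hits += 1
            points.append((hits / total_rel, hits / rank))  # (recall, prec)
    return points

print(pr_points([1, 0, 1, 0, 0, 1, 0, 0, 0, 1]))
# [(0.25, 1.0), (0.5, 0.667), (0.75, 0.5), (1.0, 0.4)]  (rounded)
```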

Analyzing Retrieval Measures: Setup

• A list or its P-R curve defines performance

• How much does a measure reduce one’s uncertainty in the underlying list or its P-R curve?

• Good measures: large reduction in uncertainty

• Poor measures: little or no reduction in uncertainty

• How to measure reduction in uncertainty?

• Maximum entropy method...

Outline

• Introduction

• Standard measures for query retrieval

• The maximum entropy method

• dice example

• measures as constraints

• MEM for query retrieval measures

• Experimental results

Maximum Entropy Method:Dice Example

• Given an unknown six-sided die, what is the probability of each face (1, 2, 3, 4, 5, 6)?

• Under-constrained problem

• most “reasonable” answer is uniform (1/6, 1/6, ..., 1/6)

• What if the average die roll is 4.5?

• problem still under-constrained, but what is the most “reasonable” answer?

• maximum entropy method to the rescue...

Maximum Entropy Method

• Goal: infer probability distribution (belief) from statistics (measures or constraints) over that distribution

• Uses: prediction, coding, gambling, etc.

• MEM dictates the most “reasonable” solution

Back to Dice Example

• Average die roll is 4.5; what is the distribution?

• One solution:

• Principle of “maximal ignorance”: pick the distribution that is least predictable (most random) subject to the constraints

• How to measure randomness? Entropy

• Thus, max entropy distribution subject to constraints

$$H(\vec{p}) = \sum_{i=1}^{6} p_i \lg(1/p_i)$$
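For the mean-4.5 die, the max-entropy distribution has the exponential form $p_i \propto e^{\lambda i}$; a minimal numerical sketch (not from the slides) that solves for $\lambda$ by bisection:

```python
# Max-entropy distribution over die faces 1..6 subject to a mean-roll
# constraint. The solution is exponential, p_i ∝ exp(lam * i); we find
# lam by bisection, since the mean roll is monotone increasing in lam.
import numpy as np

faces = np.arange(1, 7)

def mean_roll(lam):
    w = np.exp(lam * faces)
    return (faces * w).sum() / w.sum()

def maxent_dice(target_mean, lo=-10.0, hi=10.0, tol=1e-12):
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if mean_roll(mid) < target_mean else (lo, mid)
    w = np.exp(lo * faces)
    return w / w.sum()

print(maxent_dice(3.5))  # uniform: all ~1/6
print(maxent_dice(4.5))  # skewed toward high faces
```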

Mathematical Justification

• Entropy Concentration Theorems

• “weak” and “strong”

• Nature favors maximum entropy solutions

• e.g., temperature and particle speed

Max Entropy Distributions: Dice Examples

[Figure: three bar charts of the max-entropy distributions over faces 1–6 for E[X] = 3.5 (uniform), E[X] = 4.5, and E[X] = 5.5 (increasingly skewed toward the high faces); y-axis probability from 0 to 1.0]

Outline

• Introduction

• Standard measures for query retrieval

• The maximum entropy method

• dice example

• measures as constraints

• properties of the maximum entropy distribution

• MEM for query retrieval measures

• Experimental results

MEM for IR Measures: An Analogy

Problem | Events    | Distribution   | Constraint
--------|-----------|----------------|----------------------
Dice    | die faces | over die faces | expected die roll
IR      | lists     | over lists     | expected AP, RP, P@10

IR Setup: Distribution over Lists

• possible relevance list: $(x_1, x_2, \dots, x_m)$

• distribution over lists: $p(x_1, x_2, \dots, x_m)$

• independence assumption: $p(x_1, x_2, \dots, x_m) = p_1(x_1) \cdot p_2(x_2) \cdots p_m(x_m)$

• probability-at-rank: $p_i = \Pr[x_i = 1]$, the probability that the document at rank $i$ is relevant

How to Find Max Ent Dist?

• Assumption: $p(x_1, x_2, \dots, x_m) = p_1(x_1) \cdot p_2(x_2) \cdots p_m(x_m)$

• Entropy: $H(p(x_1, \dots, x_m)) = \sum_{i=1}^{m} H(p_i)$

• Constraints:

• measure (AP, RP, P@10)

• total number of relevant documents R
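The entropy decomposition is a direct consequence of the independence assumption; a one-line derivation (not spelled out on the slide):

$$H(p) = -\sum_{x_1, \dots, x_m} \Big( \prod_i p_i(x_i) \Big) \lg \prod_i p_i(x_i) = -\sum_{i=1}^{m} \sum_{x_i} p_i(x_i) \lg p_i(x_i) = \sum_{i=1}^{m} H(p_i)$$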

Setup for PC@10 Constraint

• Maximize: $\sum_{i=1}^{m} H(p_i)$

• Subject to: $\sum_{i=1}^{m} p_i = R$ and $\sum_{i=1}^{10} p_i = 10 \cdot PC(10)$
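Both constraints equate a model expectation with an observed statistic; e.g., for PC@10 (using $p_i = \Pr[x_i = 1]$ from above):

$$E[\#\text{relevant in top } 10] = \sum_{i=1}^{10} E[x_i] = \sum_{i=1}^{10} p_i = 10 \cdot PC(10)$$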

Setup for RP Constraint

• Maximize: $\sum_{i=1}^{m} H(p_i)$

• Subject to: $\sum_{i=1}^{m} p_i = R$ and $\sum_{i=1}^{R} p_i = R \cdot RP$

Setup for AP Constraint

• Maximize: $\sum_{i=1}^{m} H(p_i)$

• Subject to: $\sum_{i=1}^{m} \frac{p_i}{i} \Big( 1 + \sum_{j=1}^{i-1} p_j \Big) = R \cdot AP$ and $\sum_{i=1}^{m} p_i = R$
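The left-hand side is the expected sum of precisions at the relevant documents: if the document at rank $i$ is relevant, the precision there is $(1 + \#\text{relevant above } i)/i$, and taking expectations under independence gives (a reconstruction of the step the slide leaves implicit):

$$E\Big[\sum_{i : x_i = 1} \text{prec}@i\Big] = \sum_{i=1}^{m} \frac{\Pr[x_i = 1]}{i} \Big( 1 + \sum_{j=1}^{i-1} E[x_j] \Big) = \sum_{i=1}^{m} \frac{p_i}{i} \Big( 1 + \sum_{j=1}^{i-1} p_j \Big) = R \cdot AP$$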

Solutions

• Analytical: Lagrange multipliers

• Numerical: MATLAB, Mathematica, etc.
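As a concrete stand-in for those tools, a minimal Python/scipy sketch (not the authors' code; the function name and sample values are mine) solving the PC@10 setup numerically:

```python
# Max-entropy probability-at-rank under the PC@10 setup: maximize
# sum_i H(p_i) subject to sum_i p_i = R and sum_{i<=10} p_i = 10*PC(10).
import numpy as np
from scipy.optimize import minimize

def maxent_p_at_rank(m, R, pc10):
    def neg_entropy(p):
        # negated total entropy of m independent Bernoulli ranks
        p = np.clip(p, 1e-12, 1 - 1e-12)
        return np.sum(p * np.log(p) + (1 - p) * np.log(1 - p))

    constraints = [
        {"type": "eq", "fun": lambda p: p.sum() - R},               # R relevant
        {"type": "eq", "fun": lambda p: p[:10].sum() - 10 * pc10},  # PC@10
    ]
    res = minimize(neg_entropy, x0=np.full(m, R / m),
                   bounds=[(0.0, 1.0)] * m, constraints=constraints)
    return res.x  # inferred probability-at-rank curve

p_hat = maxent_p_at_rank(m=100, R=20, pc10=0.7)  # hypothetical values
```

The RP and AP setups drop in as alternative equality constraints over the same objective.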

Outline

• Introduction

• Standard measures for query retrieval

• The maximum entropy method

• dice example

• measures as constraints

• properties of the maximum entropy distribution

• MEM for query retrieval measures

• Experimental results

Experimental Results

• Take actual ranked lists from TREC

• Compute AP, RP, P@10

• Find max ent dist for these constraints

• Compare to actual list
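A toy version of this pipeline, reusing the hypothetical maxent_p_at_rank sketch above (the list is made up, not TREC data, and for brevity it compares probability-at-rank vectors where the slides compare P-R curves):

```python
# Toy experiment: measure a list, infer the max-entropy
# probability-at-rank, compare to the actual 0/1 relevance vector.
import numpy as np

rels = np.array([1, 0, 1, 0, 0, 1, 0, 0, 0, 1] + [0] * 90)  # made-up list
R, pc10 = int(rels.sum()), float(rels[:10].mean())
p_hat = maxent_p_at_rank(m=len(rels), R=R, pc10=pc10)
print("RMS error:", np.sqrt(np.mean((p_hat - rels) ** 2)))
```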

Inferred Probability at Rank

[Figure: inferred probability-at-rank curves for system INQ601, query 445, over ranks 1–1000, under the AP, RP, and PC max-entropy distributions]

Actual and Inferred P-R Curves

[Figure: precision-at-recall curves for system INQ601, query 445: actual vs. AP, RP, and PC max-entropy distributions]

P-R Curves

[Figure: precision-at-recall curves for system fub99a, query 435: actual vs. AP, RP, and PC max-entropy distributions]

P-R Curves

[Figure: precision-at-recall curves for system MITSLStd, query 404: actual vs. AP, RP, and PC max-entropy distributions]

P-R Curves

[Figure: precision-at-recall curves for system plt8ah1, query 446: actual vs. AP, RP, and PC max-entropy distributions]

Error

[Figure: RMS error between inferred and actual precision-recall curves on TREC 3, 5, 6, 7, 8, and 9, for the AP, RP, PC@10, and PC@100 max-entropy distributions; y-axis RMS error from 0.1 to 0.25]

Future Work & Questions?
