Query Operators Shown Beneficial for Improving Search Results Gilles Hubert, Guillaume Cabanac, Christian Sallaberry, Damien Palacio TPDL’11: International Conference on Theory and Practice of Digital Libraries September 25-29, Berlin, Germany




Outline

1. Context: Operators in Search Queries

2. Methodology: Assessing the effects of query operators

3. Experiments and Results: Potential of effectiveness yielded by operators

4. Conclusion and Future Work


1. Context: Operators in Search Queries

Search Engines Offer Query Operators

Various operators: quotation marks, must appear (+), boosting operator (^), Boolean operators, proximity operators…

Information need: “I’m looking for research projects funded in the DL domain”

[Figure: two panels of search results, “Regular query” and “Query with operators”]


Case 1: What designers of search engines may expect


Case 2: What users of search engines may believe


Case 3: What designers of search engines may fear


Usage of Query Operators

Quantitative Studies

[Figure: share of queries with operators (y-axis, 0% to 25%) by year (x-axis, 1999 to 2007), as reported for Altavista [Silverstein et al., 1999], Excite [Jansen et al., 2000], Excite [Spink et al., 2001], and Google+MSN Search+Yahoo! [White and Morris, 2007]]

Possible explanations: Unknown features? No improvement observed?


Qualitative Studies

Users

Average users are not comfortable with “advanced means of searching” [Jansen et al., 2000]

Expert users resort to query operators more frequently [Hölscher and Strube, 2000; Lucas and Topi, 2002; White and Morris, 2007]

Information Needs

Operators are used more in dedicated search [Jansen and Pooch, 2001]

Operators are used more when information is difficult to find (e.g., complex information needs) [Aula et al., 2010]

Appropriateness

Operators are used in a “semantically appropriate manner” [Eastman and Jansen, 2004]


Effects of Query Operators on Effectiveness


[Eastman and Jansen, 2003]

Eastman and Jansen studied queries with operators

Real users: AOL, Google and MSN Search

Operators: AND, OR, MUST APPEAR and PHRASE

No statistically significant improvement in P@10


[Eastman and Jansen, 2003]

Study on 20% of all queries

Expert users

Complex needs (Queries with operators)


[Eastman and Jansen, 2001]

What about the other 80% of all queries?!

Average users

Regular queries (no operators)


2. Methodology: Assessing the effects of query operators

Our Research Questions

Q = Do query operators lead to improved search results?

Q1 = Maximum gain in effectiveness when enriching a query with operators?

Q2 = Do users succeed in formulating better queries involving operators?


Our Methodology in a Nutshell

Regular query: $AP = \frac{1}{10}\left(\frac{1}{3} + \frac{2}{5}\right) \approx 0.0733$

V1 (query variant with operators): $AP = \frac{1}{10}\left(\frac{1}{1} + \frac{2}{2} + \frac{3}{3} + \frac{4}{5} + \frac{5}{6}\right) \approx 0.4633$

Further variants: V2, V3, V4, …, VN
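The two AP values on this slide (0.0733 for the regular query, 0.4633 for variant V1) are consistent with the standard definition of average precision. The sketch below recomputes them under assumptions read off the slide's numbers rather than stated by the authors: 10 relevant documents for the topic, retrieved at ranks 3 and 5 for the regular query and at ranks 1, 2, 3, 5 and 6 for V1. The function name and its parameters are illustrative only.

```python
from typing import Iterable


def average_precision(relevant_ranks: Iterable[int], total_relevant: int) -> float:
    """Average precision from the 1-based ranks of the relevant documents retrieved.

    total_relevant is the number of relevant documents for the topic,
    whether retrieved or not; it is the denominator of AP.
    """
    ranks = sorted(relevant_ranks)
    # Precision at the rank of the k-th relevant document is k / rank.
    precision_sum = sum(k / rank for k, rank in enumerate(ranks, start=1))
    return precision_sum / total_relevant


# Assumed ranks (inferred from the slide, not given explicitly).
regular_ap = average_precision([3, 5], total_relevant=10)
variant_ap = average_precision([1, 2, 3, 5, 6], total_relevant=10)
print(f"{regular_ap:.4f} {variant_ap:.4f}")  # 0.0733 0.4633
```

Comparing the AP of each variant with the AP of the regular query is what lets the methodology quantify how much a given use of operators helps or hurts on one topic.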


Overview of the Methodology

[Figure: evaluation pipeline]

Query Variant Generator: takes the query, preOps and postOps; produces the variants {v1, …, vi, …, vn}

Search Engine: takes each variant vi, the corpus and the IR model; produces l(vi)

Evaluation Procedure: takes l(vi), the qrels and the metrics; produces measures of effectiveness

Legend: usual evaluation framework in IR; components introduced for this study


3. Experiments and Results: Potential of effectiveness yielded by operators

Experiment Settings

Standard test collections: TREC-7, TREC-8

Query operators: must appear (+), term boosting (^N)

Variant generation: must appear ‘+’ only; boost ‘^’ only, with weights ^10, ^20, ^30, ^40, and ^50; both ‘+’ and ‘^’

Search engine: Terrier with various models: BM25, DFR_BM25, InL2, PL2, TF_IDF

Query variants generated with preOps and postOps

Variant #   Query variant
1           encryption equipment export
2           encryption +equipment +export
…           …
124         encryption +equipment export^10
…           …
338         encryption^30 equipment^40 export^50
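The variant generation described above can be sketched as a Cartesian product over per-term decorations. This is only an illustration, not the authors' generator: the names `PRE_OPS`, `POST_OPS` and `generate_variants` are hypothetical, and the sketch assumes every term may receive any prefix and any boost independently, including both on the same term. The slides do not confirm that enumeration, so the number and order of variants produced here need not match the variant numbers in the table.

```python
from itertools import product
from typing import List, Sequence

# Assumed operator sets, taken from the experiment settings on the slide.
PRE_OPS = ["", "+"]  # no prefix, or must appear
POST_OPS = ["", "^10", "^20", "^30", "^40", "^50"]  # no boost, or a boost weight


def generate_variants(
    query: str,
    pre_ops: Sequence[str] = PRE_OPS,
    post_ops: Sequence[str] = POST_OPS,
) -> List[str]:
    """Return every combination of prefix and suffix operators over the query terms."""
    terms = query.split()
    # All decorated forms of each term, e.g. "export", "+export", "export^10", ...
    per_term = [
        [f"{pre}{term}{post}" for pre in pre_ops for post in post_ops]
        for term in terms
    ]
    # One variant per choice of decorated form for each term.
    return [" ".join(combo) for combo in product(*per_term)]


variants = generate_variants("encryption equipment export")
print(variants[0])  # encryption equipment export
print("encryption +equipment export^10" in variants)  # True
```

The first variant is the undecorated regular query, which serves as the baseline against which the other variants are compared.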


Results

TREC-7 per Topic Analysis: Boxplots ‘+’ and ‘^’


Per Topic Analysis: Boxplot

[Figure: how to read a per-topic boxplot. x-axis: topics (label “32 Topics”); y-axis: AP (Average Precision), ticks from 0.1 to 0.4. Each boxplot marks the query variant with the highest AP, the query variant with the lowest AP, and the AP of TREC’s regular query.]


TREC-7 Per Topic Analysis ‘+’ and ‘^’


MAP = 0.1554

MAP ┬ = 0.2099 +35.1%


TREC-8 per Topic Analysis ‘+’ and ‘^’


MAP = 0.1840

MAP ┬ = 0.2288 +24.3%
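The percentages next to “MAP ┬” follow from the two MAP values by a relative-gain computation. The sketch below checks that arithmetic and also shows one plausible reading of “MAP ┬”: the mean, over topics, of the highest AP reached by any variant of the topic. That reading is an assumption based on the per-topic boxplots, and the function names and toy AP values are hypothetical.

```python
from statistics import mean
from typing import Dict, List


def upper_bound_map(ap_by_topic: Dict[str, List[float]]) -> float:
    """Mean over topics of the highest AP reached by any variant of that topic."""
    return mean(max(aps) for aps in ap_by_topic.values())


def relative_gain(baseline: float, improved: float) -> float:
    """Relative improvement over the baseline, in percent."""
    return 100.0 * (improved - baseline) / baseline


# Toy per-topic AP values, invented for illustration (not from the study).
toy_aps = {"topic A": [0.10, 0.25, 0.05], "topic B": [0.30, 0.20, 0.40]}
toy_upper_bound = upper_bound_map(toy_aps)  # mean of 0.25 and 0.40

# MAP values reported on the slides.
gain_trec7 = round(relative_gain(0.1554, 0.2099), 1)
gain_trec8 = round(relative_gain(0.1840, 0.2288), 1)
print(gain_trec7, gain_trec8)  # 35.1 24.3
```

Because the best variant is picked per topic with knowledge of the relevance judgments, this figure is a ceiling on what operators could yield, not what a user would obtain in practice.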


Global Analysis: MAP ‘+’ only

            TREC-7 (MAP)                  TREC-8 (MAP)
Model       Baseline  VOP     Gain (%)    Baseline  VOP     Gain (%)
BM25        0.1677    0.1836  9.5**       0.1957    0.2154  10.2*
DFR_BM25    0.1683    0.1843  9.5**       0.1965    0.2162  10.0*
InL2        0.1710    0.1852  8.3**       0.1996    0.2172  8.8*
PL2         0.1554    0.1826  17.5**      0.1840    0.2106  14.5**
TF_IDF      0.1674    0.1833  9.5**       0.1964    0.2158  9.9**

Statistical significance is denoted by ‘*’ for p < 0.05 and by ‘**’ for p < 0.01.


Global Analysis: MAP ‘^’ only

            TREC-7 (MAP)                  TREC-8 (MAP)
Model       Baseline  VOP     Gain (%)    Baseline  VOP     Gain (%)
BM25        0.1677    0.2027  20.9**      0.1957    0.2312  18.1**
DFR_BM25    0.1683    0.2034  20.9**      0.1965    0.2316  17.9**
InL2        0.1710    0.2059  20.4**      0.1996    0.2352  17.8**
PL2         0.1554    0.1926  23.9**      0.1840    0.2173  18.1**
TF_IDF      0.1674    0.2026  21.0**      0.1964    0.2312  17.7**

Statistical significance is denoted by ‘*’ for p < 0.05 and by ‘**’ for p < 0.01.


Global Analysis: MAP ‘+’ and ‘^’

            TREC-7 (MAP)                  TREC-8 (MAP)
Model       Baseline  VOP     Gain (%)    Baseline  VOP     Gain (%)
BM25        0.1677    0.2132  27.1**      0.1957    0.2381  21.7**
DFR_BM25    0.1683    0.2133  26.7**      0.1965    0.2387  21.5**
InL2        0.1710    0.2144  25.4**      0.1996    0.2407  20.6**
PL2         0.1554    0.2099  35.1**      0.1840    0.2288  24.3**
TF_IDF      0.1674    0.2131  27.3**      0.1964    0.2383  21.3**

Statistical significance is denoted by ‘*’ for p < 0.05 and by ‘**’ for p < 0.01.


4. Conclusion and Future Work

Conclusions

H: the proper use of query operators improves search results

Methodology to validate H: standard IR test collections (TREC-7 and TREC-8); must appear (+) and boosting (^) operators

Findings: observed gain up to 35.1%; statistically significant; for all tested IR models and collections

Users should use query operators more often


Future Work

Short term: experimenting with our methodology in various contexts (additional IR collections, additional IR models, additional query operators)

Medium term: address Q2 (do users succeed in formulating queries with operators, so that these lead to a significant gain in effectiveness?); study other factors (number of terms, selection of terms)

Long term: additional dimensions of information (geographic IR)


Thank you

TPDL’11: International Conference on Theory and Practice of Digital Libraries September 25-29, Berlin, Germany