November 10, 2004 Dmitriy Fradkin, CIKM'04 1
A Design Space Approach to Analysis of Information Retrieval
Adaptive Filtering Systems
Dmitriy Fradkin, Paul Kantor
DIMACS,
Rutgers University
What Is This Work About?
• Small-scale view: We analyze differences between two implementations of the Rocchio method and discuss the choice of parameters.
• Large-scale view: The problem of constructing an IR/AF system can be seen as an optimization problem in a large design space. (Well-known methods are simply points in this space.)
Large-Scale View
• Use optimization methods to find optimal choices of parameters. These optimal choices do not have to correspond to well-known methods or standard practices.
• Design space optimization methods have been suggested for designing VLSI chips [Bahuman et al. 2002], airplanes [Schwabacher and Gelsey, 1996; Zha et al. 1996] and HVAC systems [Szykman 1997].
What’s in a name?
• We find that even a single “name” involves an enormous number of design choices.
• TREC 2002 Adaptive Filtering:
– DIMACS: Rocchio method
– Chinese Academy of Sciences (CAS): Rocchio method
• One method performs almost twice as well as the other.
For any system:
• Choose Data Representation
• Construct Initial Classifier
• Training Phase:
– Incorporate labeled examples
– Supplement with “pseudo positives” and “pseudo negatives”
– Set the threshold
• Filtering Phase: as new documents arrive
– Evaluate performance
– Update the classifier model
– Update threshold
All of these are usually:
• Characterized informally, as a choice, and the exclusion of alternatives.
• Seen as points on a map – but to understand the significance of these choices we need to explore the real territory.
• So: we must interpolate between the choices made in one method and those made in another.
Interpolation
• Identify the corresponding design decisions
• Develop a “path” between them, sometimes called a “homotopy” after the topological concept of smoothly distorting one shape (say, a coffee cup) into another (say, a doughnut).
• Study the effectiveness along various paths among design options.
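The idea of a path between two design points can be sketched as a straight-line interpolation. This is only an illustration: the function name `interpolate`, the dictionary keys, and the choice of a linear path are assumptions, and the two endpoint dictionaries merely stand in for two systems.

```python
def interpolate(start, end, lam):
    """Linear path between two design points: start at lam=0, end at lam=1."""
    return {key: (1.0 - lam) * start[key] + lam * end[key] for key in start}


# Hypothetical endpoints; each key is one numeric design decision.
system_a = {"neg_weight": 0.125, "quit_rate": 0.02}
system_b = {"neg_weight": 1.8, "quit_rate": 0.0}

# Three points along the path, including both endpoints.
path = [interpolate(system_a, system_b, lam) for lam in (0.0, 0.5, 1.0)]
```

Each intermediate point is a system in its own right that can be run and evaluated.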
Interpolation Aspects for IR/AF
• Term Representation
• Term Weighting
• Computing Scores
• Setting Classifier Threshold
• Document Set Representation
• Pseudolabeled Documents in Training
Interpolation Aspects (cont.)
• Query Initialization
• Unjudged documents in test
• Query Update
• Quitting Strategy
Example: Term Representation
$$f(t,d) = \begin{cases} 1 + \log f'(t,d), & \text{if } f'(t,d) > 0 \\ 0, & \text{otherwise} \end{cases}$$
where $f'(t,d)$ is the number of times term $t$ occurs in document $d$.
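A minimal sketch of this term representation, assuming it reads f(t,d) = 1 + log f'(t,d) for a positive raw count and 0 otherwise. The function name `term_frequency` and the use of the natural logarithm are assumptions; the slide does not state the base.

```python
import math


def term_frequency(raw_count):
    """Dampened term frequency: 1 + log(count) for count > 0, else 0."""
    if raw_count > 0:
        return 1.0 + math.log(raw_count)  # natural log assumed
    return 0.0
```

A term that occurs once gets weight 1, and further occurrences add only logarithmically.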
Example: Term Weighting
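• DIMACS: $i_D(t)$, a log-based inverse document frequency computed from $T$ and $i'(t)$ [exact formula not recoverable from the source]
• CAS: $i_C(t)$, a piecewise, log-based inverse document frequency conditioned on $i'(t)$ [exact formula not recoverable from the source]
• Homotopy: $i(t,\lambda_i) = (1-\lambda_i)\, i_D(t) + \lambda_i\, i_C(t)$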
i’(t) is the number of documents, in training set T, containing term t.
Example: Score Computation
• DIMACS: $s_D(d,q) = (d, W_D q)$, with $w_D(t) = i(t,\lambda_i)$
• CAS: $s_C(d,q) = \dfrac{(d, W_C q)}{\|d\|\,\|W_C q\|}$, with $w_C(t) = i(t,\lambda_i)^2$
• Homotopy: $w(t,\lambda_w) = i(t,\lambda_i)^{1+\lambda_w}$ on the diagonal, 0 elsewhere
$i'(t)$ is the number of documents, in training set $T$, containing term $t$. $W$ is a diagonal matrix of weights.
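The scoring choices can be sketched as follows. This assumes the diagonal weight is the term weight raised to the power 1 + λ_w (so λ_w = 0 gives the weight itself and λ_w = 1 gives its square), that one score is a weighted inner product, and that the other is the same inner product normalized by the norms of d and Wq. The normalization and all function names are assumptions made for illustration.

```python
import math


def term_weight(idf, lam_w):
    """Diagonal entry of W: idf at lam_w=0, idf squared at lam_w=1."""
    return idf ** (1.0 + lam_w)


def score_dot(d, q, w):
    """Weighted inner product (d, W q) for a diagonal W given as a list."""
    return sum(di * wi * qi for di, wi, qi in zip(d, w, q))


def score_cosine(d, q, w):
    """Weighted inner product normalized by ||d|| * ||W q|| (assumed form)."""
    wq = [wi * qi for wi, qi in zip(w, q)]
    norm = math.sqrt(sum(x * x for x in d)) * math.sqrt(sum(x * x for x in wq))
    return score_dot(d, q, w) / norm if norm > 0 else 0.0
```

Vectors are plain lists indexed by term, and `w` holds the diagonal of W.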
Example: Score Interpolation
Homotopy:
$s(s_C, s_D, \lambda_S) = \lambda_S\, \varphi(s_C) + (1-\lambda_S)\, s_D$
The same mapping is used for scores and for thresholds, from the CAS scale to the DIMACS scale:
$\varphi(s_C(d)) = \dfrac{m_D}{m_C}\, s_C(d)$
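A small sketch of the score interpolation, assuming the rescaling from the CAS scale to the DIMACS scale is multiplication by m_D / m_C and the interpolated score is λ_S φ(s_C) + (1 − λ_S) s_D. The slide does not define m_D and m_C, so they are treated here as given scale constants; the function names are illustrative.

```python
def rescale(s_c, m_d, m_c):
    """Map a CAS-scale value onto the DIMACS scale (assumed ratio m_d / m_c)."""
    return m_d * s_c / m_c


def interpolate_score(s_c, s_d, lam_s, m_d, m_c):
    """DIMACS score at lam_s=0, rescaled CAS score at lam_s=1."""
    return lam_s * rescale(s_c, m_d, m_c) + (1.0 - lam_s) * s_d
```

Because the same mapping is applied to thresholds, the same function could be called with threshold values in place of scores.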
Example: Setting Thresholds
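Threshold for query $q$ after seeing document $i$:
• DIMACS: $\tau_D(q,i)$ is chosen to optimize utility
• CAS: $\tau_C(q,i)$ is obtained from the previous threshold by a three-case rule (utility is negative; a condition involving $z_i$ and 6000; otherwise) with a constant of 0.005, where $z_i$ is the number of documents seen since the last submission [exact update expressions not recoverable from the source]
• Homotopy: $\tau(q,i,\lambda_S) = \lambda_S\, \varphi(\tau_C(q,i)) + (1-\lambda_S)\, \tau_D(q,i)$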
Example: Set Representation
• DIMACS: $v(S) = \dfrac{1}{|S|} \sum_{x \in S} x$
• CAS: $v(S) = \sum_{x \in S} x$
• Homotopy: $v(S,\lambda_r) = \dfrac{1}{1 + (1-\lambda_r)(|S|-1)} \sum_{x \in S} x$
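A sketch of the set representation, assuming one endpoint is the mean of the document vectors, the other is their plain sum, and the path between them divides the sum by 1 + (1 − λ_r)(|S| − 1). The function name `set_vector` and the list-of-lists input are assumptions.

```python
def set_vector(docs, lam_r):
    """Represent a document set: mean at lam_r=0, plain sum at lam_r=1."""
    if not docs:
        raise ValueError("document set must not be empty")
    n = len(docs)
    scale = 1.0 / (1.0 + (1.0 - lam_r) * (n - 1))
    dims = len(docs[0])
    return [scale * sum(doc[j] for doc in docs) for j in range(dims)]
```

For a single-document set the scale is 1 at every value of λ_r, so both endpoints agree.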
Example: Pseudo-labeled Documents
• The CAS method does not make use of pseudo-labeled documents in the training stage
• DIMACS method: Given “density” parameters (d+ and d-) and “proportion” (p+ and p-), score unlabeled training documents and choose top and bottom sets according to “proportion”. Then pick documents out of these sets according to corresponding “density”.
• Interpolate between density and proportion parameters (DIMACS) and 0 (CAS).
Example: Query Initialization
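General formula: $q_{init} = \alpha'\, q_{terms} + \beta'\, v(D_i^+) - \gamma'\, v(D_i^-) + x'\, v(D_{ip}^+) - y'\, v(D_{ip}^-)$
DIMACS: $\alpha' = 1,\ \beta' = 1,\ x' = 2/|D_{ip}^+|,\ y' = 5/|D_{ip}^-|,\ \gamma' = 0$
CAS: $\alpha' = 3,\ \beta' = 1,\ x' = 0,\ y' = 0,\ \gamma' = 0$
Homotopy: $q_{init}(\lambda_\alpha, \lambda_p) = (3\lambda_\alpha + (1-\lambda_\alpha))\, q_{terms} + v(D_i^+) + (1-\lambda_p)\, x'\, v(D_{ip}^+) - (1-\lambda_p)\, y'\, v(D_{ip}^-)$
[The Greek coefficient symbols and the signs were lost in the source; the names $\alpha'$, $\beta'$, $\gamma'$ and the signs follow the usual Rocchio convention.]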
Example: Unjudged Documents
• A submitted document for which there is no label is “unjudged”. DIMACS ignores such documents. CAS treats such a document as pseudo-negative if its score is less than 0.6.
• This can be viewed as a threshold:
$u(\lambda_u) = 0.6\,\lambda_u + 0\,(1-\lambda_u) = 0.6\,\lambda_u$
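A sketch of the unjudged-document rule as a threshold, assuming the threshold is 0.6 λ_u, so that λ_u = 1 gives the 0.6 cutoff and λ_u = 0 gives a cutoff of 0. The function names are illustrative, and the assumption that scores are non-negative (so that a cutoff of 0 marks nothing as pseudo-negative) is not stated on the slide.

```python
def unjudged_threshold(lam_u):
    """Score cutoff below which an unjudged document is pseudo-negative."""
    return 0.6 * lam_u


def is_pseudo_negative(score, lam_u):
    """True if an unjudged document should be treated as pseudo-negative."""
    return score < unjudged_threshold(lam_u)
```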
Example: Query Update
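General formula: $q = \alpha\, q_{init} + \beta\, v(D^+) - \gamma\, v(D^-) + x\, v(D_p^+) - y\, v(D_p^-)$
DIMACS: $\alpha = 1,\ \beta = 1,\ \gamma = 0.125,\ x = 0,\ y = 0.0125$
CAS: $\alpha = 1,\ \beta = 1,\ \gamma = 1.8,\ x = 0,\ y = 1.3$
Homotopy: $q(\lambda_\gamma, \lambda_y) = q_{init} + v(D^+) - (1.8\,\lambda_\gamma + 0.125\,(1-\lambda_\gamma))\, v(D^-) - (1.3\,\lambda_y + 0.0125\,(1-\lambda_y))\, v(D_p^-)$
[The Greek coefficient symbols and the signs were lost in the source; the names $\alpha$, $\beta$, $\gamma$ and the signs follow the usual Rocchio convention.]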
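A sketch of the query update along the path, using the endpoint values that survive on the slide: a negative-document weight of 0.125 for DIMACS and 1.8 for CAS, and a pseudo-negative weight of 0.0125 for DIMACS and 1.3 for CAS. Subtracting the negative terms follows the usual Rocchio convention and is an assumption here, as are the function and argument names.

```python
def update_coefficients(lam_gamma, lam_y):
    """Weights for negative and pseudo-negative sets along the path."""
    gamma = 1.8 * lam_gamma + 0.125 * (1.0 - lam_gamma)
    y = 1.3 * lam_y + 0.0125 * (1.0 - lam_y)
    return gamma, y


def update_query(q_init, v_pos, v_neg, v_pseudo_neg, lam_gamma, lam_y):
    """Rocchio-style update; negative terms are subtracted (assumed signs)."""
    gamma, y = update_coefficients(lam_gamma, lam_y)
    return [
        qi + p - gamma * n - y * u
        for qi, p, n, u in zip(q_init, v_pos, v_neg, v_pseudo_neg)
    ]
```

All vectors are lists over the same term order; the initial query and positive set both carry weight 1 at both endpoints.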
Example: Quitting Strategy
• DIMACS: if after 50 submissions the utility is negative, stop submitting for this topic
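• CAS: no quitting strategy
Alternatively: quit if utility is negative after submitting $1/\theta$ documents, where $\theta$ stands in for a symbol lost in the source.
• DIMACS: $\theta = 0.02$
• CAS: $\theta = 0$
• Homotopy: $\theta(\lambda_q) = 0.02\,(1-\lambda_q)$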
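A sketch of the quitting rule in its rate form, assuming a rate of 0.02 (1 − λ_q): at λ_q = 0 the topic is abandoned if utility is negative after 1/0.02 = 50 submissions, and at λ_q = 1 the rate is 0 and the system never quits. The name `should_quit` and its arguments are illustrative.

```python
def should_quit(num_submitted, utility, lam_q):
    """Quit if utility is negative after 1/rate submissions; never if rate is 0."""
    rate = 0.02 * (1.0 - lam_q)
    if rate <= 0.0:
        return False  # no quitting strategy at this end of the path
    # num_submitted * rate >= 1 is the same test as num_submitted >= 1 / rate
    return num_submitted * rate >= 1.0 and utility < 0
```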
Experimental Evaluation
• TREC11 Data - Reuters Corpus v1
• 23,000 training; 800,000 test
• 100 topics (50 assessor, 50 intersection)
• 3 positive and 0 negative examples per topic
$T11SU = \dfrac{\max(T11NU, -0.5) + 0.5}{1.5}$
$T11NU = \dfrac{2|D^+| - (|D^-| + |D_u|)}{2|T^+|}$
$T^+$: all positive documents; $D^+$: submitted positive; $D^-$: submitted negative; $D_u$: submitted unlabelled
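A sketch of the evaluation measure, assuming T11NU = (2|D+| − (|D−| + |Du|)) / (2|T+|) and T11SU = (max(T11NU, −0.5) + 0.5) / 1.5, with the set sizes passed in as counts. The function and argument names are illustrative.

```python
def t11nu(submitted_pos, submitted_neg, submitted_unlabelled, total_pos):
    """Normalized utility: unlabelled submissions count like negatives."""
    gain = 2 * submitted_pos - (submitted_neg + submitted_unlabelled)
    return gain / (2 * total_pos)


def t11su(submitted_pos, submitted_neg, submitted_unlabelled, total_pos):
    """Scaled utility in [0, 1], with normalized utility floored at -0.5."""
    nu = t11nu(submitted_pos, submitted_neg, submitted_unlabelled, total_pos)
    return (max(nu, -0.5) + 0.5) / 1.5
```

Under these formulas a system that submits nothing scores 1/3, so values below that indicate negative utility.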
Diagonal Interpolation
[Figure: average T11SU versus lambda, with and without quitting]
Lambda                         0      0.2    0.4    0.6    0.8    1      CAS
Average T11SU, no quitting     0.033  0.103  0.26   0.364  0.404  0.394  0.405
Average T11SU, with quitting   0.113  0.139  0.263  0.364  0.404  0.394  0.405
Documents Retrieved
Parameter Analysis
• It is possible to analyze the effect of individual parameters at each point in the space by taking “small steps” along each parameter axis.
• Requires a lot of computational effort
• Results may not be easy to interpret
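Taking small steps along one axis can be sketched as below. The evaluation function is a toy stand-in, since a real run would train and test the filtering system at each probe point, which is where the computational cost comes from. The names `axis_steps`, `toy_evaluate`, and the parameter keys are all hypothetical.

```python
def axis_steps(evaluate, point, name, step=0.1):
    """Evaluate at point[name] - step, point[name], and point[name] + step."""
    results = {}
    for offset in (-step, 0.0, step):
        probe = dict(point)  # copy so other parameters stay fixed
        probe[name] = point[name] + offset
        results[round(probe[name], 10)] = evaluate(probe)
    return results


def toy_evaluate(params):
    """Stand-in objective; a real one would run the whole system."""
    return 2.0 * params["lam_w"] + params["lam_q"]


point = {"lam_w": 0.8, "lam_q": 0.8}
steps = axis_steps(toy_evaluate, point, "lam_w")
```

Repeating this for each parameter name gives one row per parameter around the chosen point.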
Example of Parameter Analysis
                  \lambda = 0.7          \lambda = 0.8          \lambda = 0.9
Parameter         relevant  nonrelevant  relevant  nonrelevant  relevant  nonrelevant
\lambda_\alpha    2086      975          ...       ...          2089      1010
\lambda_\gamma    2273      1233         ...       ...          1923      830
\lambda_p         2043      1129         ...       ...          2014      939
\lambda_y         2106      1005         ...       ...          2029      948
\lambda_u         2065      1037         ...       ...          2071      977
\lambda_i         2062      977          2062      977          2062      977
\lambda_w         2055      1000         ...       ...          2119      1007
\lambda_S         2153      1021         ...       ...          2123      1031
\lambda_r         2037      993          ...       ...          2149      1044
\lambda_q         2062      977          ...       ...          2062      977
Effect of individual parameters on the number of relevant and nonrelevant documents retrieved around the 0.8 point
Results based on topic type
                      assessor                          intersection
                      # topics  avg. T11SU difference   # topics  avg. T11SU difference
CAS better than 0.8   18        -0.047                  25        -0.037
0.8 better than CAS   25        0.062                   13        0.011
CAS and 0.8 equal     7         0                       12        0
Total                 50        0.014                   50        -0.015
Comparison of CAS results and the 0.8 diagonal homotopy point
Additional Experiments
• Reordered TREC documents
• Experimented with 77 topics on OHSUMED dataset (1987-1988 as training data, 1989-1991 as test)
The results are similar to those on the original TREC task.
Result of Experiments with Reordering
Average results on 5 re-orderings of the TREC test set:
Lambda               0.0    0.8    1.0
Average T11SU        0.108  0.406  0.391
Standard Deviation   0.002  0.002  0.004
OHSUMED Results
Lambda                      0      0.2    0.4    0.6    0.7    0.8    0.9    1
Mean T11SU, no quitting     0.005  0.051  0.361  0.467  0.474  0.463  0.464  0.482
Mean T11SU, with quitting   0.138  0.132  0.319  0.467  0.474  0.463  0.464  0.482
[Figure: mean T11SU versus lambda, with and without quitting]
Documents Retrieved: OHSUMED
[Figure: number of relevant and not relevant documents retrieved versus lambda]
Discussion
• We demonstrate the design complexity hidden under the name “Rocchio method”
• We provide specific models for interpolating between design choices
• These interpolation options can also work for methods that differ more substantially (for example, Rocchio and SVM).
Discussion (cont.)
• These models should help researchers explore their systems, and regions “between systems”
• Suggests a new approach to designing IR systems: finding a set of (interpolation) parameters optimizing performance
• This can be done with existing optimization methods.
A Note on Interpolation Limits
The need for two endpoint systems is not very restrictive:
• Some interpolation parameters can be moved beyond the [0,1] interval.
• The endpoints themselves can be moved.
Abstract Interpolation
• More abstractly: do not interpolate every single parameter; work at higher abstraction levels
• Ex: representation block, scoring block, thresholding block, etc.
• Can use this with several systems
• This is at a lower level than ensembles of classifiers.
Caveat
In moving to a large design space we still face two major problems:
• The range of parameters cannot be explored exhaustively, and non-smooth optimization is needed
• Optimization requires a lot of labeled data, which is usually produced manually and is in short supply.
Acknowledgments
• KD-D group via NSF grant EIA-0087022
• Andrei Anghelescu, Vladimir Menkov
• Jamie Callan
• Members of DIMACS MMS project
• CAS researchers
• Ian Soboroff
• Anonymous reviewers