Quo Vadis Information Retrieval Research
Kal Järvelin
University of Tampere
Agenda
1. Introduction
2. Some Concepts
3. The Laboratory Approach to IR
4. Mounting Critical Evidence
5. Cognitive Framework for Research on IIR
6. Concluding Discussion
1. Introduction
IR theory?
  SIGIR theory = Formal Retrieval Models
  Not really empirical theories to be confirmed or refuted
Are there other types of theories?
What theories are we trying to construct?
Motivation
The ultimate goal of information retrieval is to support humans in accessing information better, so that they can carry out their tasks.
2. Concepts: Frameworks, Models
Frameworks in research
  Essential objects to study
  The relationships of objects
  The changes in the objects / relationships that affect the functioning of the system
  Promising goals and methods of research
The concept model
  A precise (often formal) representation of objects and relationships (or processes) within a framework
  Modeling may also in principle encompass human actors and organizations
Hypotheses, Laws, Theories
Variables
  represent objects etc.
  are used in hypotheses, laws ...
Hypotheses
  state verifiable facts / relationships whose truth is unknown.
Scientific laws
  empirical laws express verified relationships between objects, properties or events
Theories
  systematic collections of theoretical and empirical laws
Variables
Types of variables in study designs:
  dependent variables – the variation of which is explained
  independent variables – the ones systematically varied in order to see the responses in the dependent ones
  controlled variables – the ones fixed to prevent uncontrolled variation in the results
  hidden variables – all other variables
Agenda
1. Introduction
2. Some Concepts
3. The Laboratory Approach to IR, ltd.
  The Framework and Model(s)
  Variables, Hypotheses, Laws and Theories
4. Mounting Critical Evidence
5. Cognitive Framework for Research on IIR
6. Conclusions
3. The Laboratory Approach to IR
[Figure: Documents, Representation, Database; Search request, Representation, Query; Matching; Query Result]
This is information retrieval, isn’t it? But where is the lab?
The Lab Included – and User Dropped
[Figure: Documents, Representation, Database; Search request, Representation, Query; Matching; Query Result; plus Evaluation, Evaluation Result, Relevance assessment, Recall base]
The Lab IR Cave
[Figure: Documents, Representation, Database; Search request, Representation, Query; Matching; Query Result; Evaluation, Evaluation Result, Relevance assessment, Recall base; plus Context]
The Lab IR Cave, with a Visitor
[Figure: Documents, Representation, Database; Search request, Representation, Query; Matching; Query Result; Evaluation, Evaluation Result, Relevance assessment, Recall base; Context]
Lab IR: The Model(s)
Models in IR are retrieval models which specify
  document and request representations, and
  the matching algorithm for comparing these representations
[Figure: Documents, Representation, Database; Search request, Representation, Query; Matching; Result]
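To make the three ingredients of a retrieval model concrete, here is a minimal sketch assuming a bag-of-words representation and cosine matching. These are illustrative choices, not ones named on the slide, and the function names `represent`, `match` and `retrieve` are hypothetical.

```python
import math
from collections import Counter


def represent(text):
    """Representation (assumed): bag of lowercase terms with frequencies."""
    return Counter(text.lower().split())


def match(query_rep, doc_rep):
    """Matching (assumed): cosine similarity of two term-frequency vectors."""
    dot = sum(weight * doc_rep[term] for term, weight in query_rep.items())
    norm_q = math.sqrt(sum(w * w for w in query_rep.values()))
    norm_d = math.sqrt(sum(w * w for w in doc_rep.values()))
    if norm_q == 0 or norm_d == 0:
        return 0.0
    return dot / (norm_q * norm_d)


def retrieve(request, documents):
    """Rank document ids by descending match score; ties broken by id."""
    query_rep = represent(request)                      # request -> query
    database = {doc_id: represent(text)                 # documents -> database
                for doc_id, text in documents.items()}
    scores = {doc_id: match(query_rep, rep) for doc_id, rep in database.items()}
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))
```

Swapping `represent` or `match` for another technique yields a different retrieval model while the surrounding pipeline stays the same.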
Lab IR: Models
[Figure: taxonomy of retrieval models. Labels: Formal Models; Non-Formal Models; Matching Models; Exact-Match Models; Best-Match Models; Document-Based Models; Structure-Based Models; Feature-Based Models; Network-Based Models; Browsing Models; Spreading Activation Models; Cluster-Based Models; Probabilistic Models; Fuzzy Set Models; Logic-based Models; Language Models; Vector-Based Models; Structured Document Models; Hypertext Navigation]
Theory?
The Laboratory Setting: Variables
[Figure: the laboratory model annotated with variables]
  Documents
  Rep Method R_i
  Rep Set D_i
  Topic
  Rep Method T_j
  Query Vers Q_j
  Matching Meth M_k
  Query Result A_ijk
  Relevance Assessment
  Recall Base
  Evaluation Procedure P_e
  Evaluation Result ER
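The variables above suggest a factorial study design: each combination of a document representation method R_i, a request representation method T_j and a matching method M_k yields a query result A_ijk that is then evaluated. The following is a rough sketch under that reading. The function `run_lab_experiment` and the callables passed to it are hypothetical, and averaging over topics is an assumed evaluation procedure.

```python
from itertools import product


def run_lab_experiment(documents, topics, rep_methods, query_methods,
                       matching_methods, evaluate):
    """Score every (R_i, T_j, M_k) combination on a fixed collection.

    rep_methods, query_methods, matching_methods: dicts mapping a label
    to a callable. evaluate(topic, result) returns a number per topic.
    Returns {(i, j, k): mean score over topics}.
    """
    results = {}
    for (i, R), (j, T), (k, M) in product(rep_methods.items(),
                                          query_methods.items(),
                                          matching_methods.items()):
        D = R(documents)            # representation set D_i
        scores = []
        for topic in topics:
            Q = T(topic)            # query version Q_j
            A = M(Q, D)             # query result A_ijk
            scores.append(evaluate(topic, A))
        results[(i, j, k)] = sum(scores) / len(scores)
    return results
```

The collection, topics and evaluation procedure are held fixed (controlled), the method choices are varied (independent), and the returned scores play the role of the dependent variable.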
So What is Lab IR All About?
Variables, Hypotheses, Laws, and Theories are about the explanation of IR effectiveness.
The dependent variables typically are recall and precision or derived from them (e.g., MAP, nDCG).
The explaining factors, the independent variables, are the use / non-use of selected techniques implementing retrieval models.
The controlled variables are test collections, topics, assessments.
There are hardly any critical hypotheses (cf. physics)
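As a reminder of what the dependent variables measure, here is a small sketch of precision, recall and (mean) average precision using the standard textbook definitions with binary relevance. The function names are illustrative.

```python
def precision_recall(ranking, relevant):
    """Set-based precision and recall of a retrieved list."""
    retrieved_relevant = sum(1 for doc in ranking if doc in relevant)
    precision = retrieved_relevant / len(ranking) if ranking else 0.0
    recall = retrieved_relevant / len(relevant) if relevant else 0.0
    return precision, recall


def average_precision(ranking, relevant):
    """Mean of precision values at the ranks of relevant documents,
    divided by the total number of relevant documents (the recall base)."""
    hits, total = 0, 0.0
    for rank, doc in enumerate(ranking, start=1):
        if doc in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant) if relevant else 0.0


def mean_average_precision(runs):
    """runs: non-empty list of (ranking, relevant_set) pairs, one per topic."""
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)
```

Note that every quantity here depends only on the ranked list and the recall base; nothing about the searcher or the task enters the computation.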
... All About?
Strong standardization of the research designs facilitates comparison of results and has, admittedly, led to much progress in IR practice
However, the strength and success of the approach may be a straitjacket.
There is mounting evidence that the Lab IR approach may be ...
… Plato's Cave
M A P
LabIR: Framework Issues
W Y D S I W Y D U
  Tasks
  Searchers
  Relevance assessments
  Interface functionalities
  Search processes
C N C L ?
Top Fuel - Which one is better for real life?
Agenda
1. Introduction
2. Some Concepts
3. The Laboratory Approach to IR
4. Mounting Critical Evidence on ...
  information needs, relevance
  system vs. user performance
  sessions, systems integration
5. Cognitive Framework for Research on IIR
6. Conclusions
Information needs, good queries?
Articulated needs assumed in IR
  Kuhlthau (1994) and Byström & Järvelin (1995) have shown that sometimes there are no articulated information needs preceding information access.
  Embarrassment and confusion often present - Belkin's ASK (1982)
Kuhlthau 1994
Stages: Initiation | Selection | Exploration | Formulation | Collection | Presentation
Feelings: Uncertainty | Optimism | Confusion, frustration, doubt | Clarity | Sense of direction, confidence | Relief, satisfaction or disappointment
Thoughts (across the stages): Vague; Clearer; Increased interest; Focused
Actions (across the stages): Seeking background information; Seeking relevant information; Seeking pertinent information
Appropriate tasks: Recognize | Identify, investigate | Identify, investigate | Formulate | Gather | Complete
Information needs, ..., relevance?
Relevance
  multiple degrees, multi-dimensional, individual
  ... depends on problem stage, difficulty, and information construction.
  Clicks perhaps not reliable predictors of relevance.
What about human performance?
Allan, Carterette & Lewis (2005) searcher productivity in a passage-based QA task
Turpin & Scholer (2006) studied user performance on simple web search tasks
Smith & Kantor (2008) explored the relation between system performance and searcher behavior
... all insiders we can rely on
Exploring the outside...
[Figure: Documents, Representation, Database; Search request, Representation, Query; Matching; Query Result; Evaluation, Evaluation Result, Relevance assessment, Recall base; plus Human task and Task outcome]
... and all interactions ...
[Figure: Documents, Representation, Database; Search request, Representation, Query; Matching; Query Result; Evaluation, Evaluation Result, Relevance assessment, Recall base; Human task, Task outcome]
Single shot
A hunter having one cartridge in his shotgun ...
Single shot vs. process
Information retrieval processes have not been sufficiently described; therefore
  they cannot be understood
  they cannot be properly supported by IR techniques
  they cannot be properly evaluated
TREC IIR evaluation
Session-based evaluation (Järvelin & al, 2008)
  session-DCG, individual queries
[Figure: Individual sessions, Top-10, Systems A and B (b=2; bq=4; 0-1-10-100)]
[Figure: s(D)CG Feedback Simulation; 2, 2, 0-1-5-10]
[Figure: s(D)CG Feedback Simulation]
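A session-based DCG can be sketched as follows. The slide gives only the parameter names b and bq and the gain scale, so the formulas here are an assumption: each gain is divided by 1 + log_b(rank), and the DCG of the q-th query in a session is further divided by 1 + log_bq(q), so later reformulations count for less. The function names `dcg` and `sdcg` are illustrative.

```python
import math


def dcg(gains, b=2):
    """Discounted cumulated gain of one ranked result list.

    Assumed discount: gain at rank r is divided by 1 + log_b(r).
    """
    return sum(g / (1 + math.log(rank, b))
               for rank, g in enumerate(gains, start=1))


def sdcg(session, b=2, bq=4):
    """Session DCG for a sequence of queries.

    session: list of gain lists, one per query, in the order issued.
    Assumed query discount: the q-th query's DCG is divided by
    1 + log_bq(q).
    """
    total = 0.0
    for q, gains in enumerate(session, start=1):
        total += dcg(gains, b) / (1 + math.log(q, bq))
    return total
```

For example, with gains on a 0-1-10-100 scale, `sdcg([[0, 10, 1], [100, 0, 0]])` credits the highly relevant document found by the second query, but less than if the first query had found it.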
IR in isolation?
Information needs and seeking research in LIS in 1960's - 1980's - ARIST reviews
Models vs. practice of research
[Figure: Allen's IS Model. Labels: The work group; Invisible college; The profession; The individual within the organization; Formal information systems; The individual as an information processor]
IR in isolation?
IR rarely is the user's main task
  maybe just a pain in the neck
The information environment
  IR is not performed in isolation in practice
  often multi-source, multi-tool information environment
  often IR integrated into other tools
5. Cognitive Framework for Research on IIR
A Cognitive Framework for IS&R
[Figure: Information objects; IT: Engines, Logics, Algorithms; Interface; Cognitive Actor(s) (team); Org., Cultural, Social Context]
A Cognitive Framework for IS&R
[Figure: Information objects; IT: Engines, Logics, Algorithms; Interface; Cognitive Actor(s) (team); Org., Cultural, Social Context; annotated "The Lab IR Model"]
The Applications of the Model
  Illustrating the roles of actors in a variety of cases of information behavior;
  Pointing to core components and information processes depending on (or influencing) such cases – i.e.,
  Pointing to kinds of context;
  Pointing out central variables involved in a variety of research designs – with a number of independent variables
  Pointing out new research questions and study designs
Cognitive Framework and Evaluation Criteria
[Figure: the laboratory model (Docs, Repr, DB; Request, Repr, Query; Match; Result) together with Seeking Task, Seeking Process, Seeking Result, Work Task, Work Process, Task Result]
Contexts: IR context; Seeking context; Work task context; Socio-organizational & cultural context
Evaluation Criteria:
  A: Recall, precision, efficiency, quality of information/process
  B: Usability, quality of information/process, learning
  C: Quality of info & work process/result, learning
  D: Socio-cognitive relevance; quality of work task result
Lawn Mower?
Developing understanding
6. Conclusions
The Cognitive Research Framework Suggests
  Analyses of retrieval and access in different types of collections
  Analyses of various actor types
  Analyses of various simulated task types for experimental control or real task types for understanding real situations
  Analyses of actor support in search processes
http://www.springeronline.com/1-4020-3850-X
Thank you!
Literature
Borlund, P. (2003). The IIR evaluation model: A framework for evaluation of interactive information retrieval systems. Information Research, 8(3), paper no. 152. (Available: http://informationr.net/ir/8-3/paper152.html) May 13, 2003.
Ellis, D. (1996). Progress and problems in information retrieval. London, UK: Library Association Publishing.
Ingwersen, P. (1996). Cognitive perspectives of information retrieval interaction: Elements of a cognitive IR theory. Journal of Documentation, 52(1), 3-50. (Available: http://www.db.dk/pi)
Ingwersen, P. & Järvelin, K. (2005). The Turn: Integration of Information Seeking and Retrieval in Context. Dordrecht, NL: Springer. ISBN 1-4020-3850-X
Järvelin, K. (2007). An Analysis of Two Approaches in Information Retrieval: From Frameworks to Study Designs. Journal of the American Society for Information Science and Technology, 58(7), 971-986. (Available: http://www.info.uta.fi/tutkimus/fire/archive/2007/KJ-IRtheory’06.pdf)
Järvelin, K. & Ingwersen, P. (2004). Information seeking research needs extension towards tasks and technology. Information Research, 10(1), paper 212. (Available: http://informationr.net/ir/10-1/paper212.html)
Robertson, S. E. (2000). On theoretical argument in information retrieval. SIGIR Forum, 34(1), 1-10. (Available: http://www.acm.org/sigs/sigir/forum/S2000/salton_lecture.pdf)
Saracevic, T. (1997). Users lost: Reflections of the past, future and limits of information science. ACM SIGIR Forum, 31(2), 16-27.