
Adaptive information retrieval: Introduction to the special topic issue of information processing and management


Information Processing and Management 44 (2008) 1819–1821

Contents lists available at ScienceDirect

Information Processing and Management

journal homepage: www.elsevier.com/locate/infoproman

Editorial

Adaptive information retrieval: Introduction to the special topic issue of information processing and management

The role of adaptive information retrieval systems is becoming ever more important given the problems of information overload and the difficulties involved in the information seeking process. Data is accumulating everywhere, such as on the World Wide Web, specialised domains like digital libraries, private organisations, and personal collections (desktop, home archives), among others. This accumulated data, often generated as a side effect of our daily activities, has turned out to be a valuable source of information. Though powerful techniques have been developed for retrieving relevant documents, the retrieval process is still often deemed to be unsatisfactory. The information seeking process involves a number of uncertainties: the uncertain nature of an information need and the associated query formulation process; the ambiguity in representing a document; the difficulties involved in matching an uncertain query to inaccurate document representations; and the difficulties involved in presenting the retrieval results to the users. It is proposed, however, that adaptive retrieval techniques can alleviate many of the above difficulties (Joho, Urban, Villa, Jose, & van Rijsbergen, 2008). One of the goals of adaptive information retrieval (AIR) research is to develop retrieval technology that can predict what information a searcher will need to complete their tasks and decide how and when to present that information to the user.

Information needs arise due to many factors, for example, a gap in the knowledge of a user, an external situational stimulus that forces a user to search for information, or a long-standing need to complete a task (Belkin, 2000). If we know more about a user, we can subsequently adapt the retrieval process to the needs of the user, by adapting the query to the user’s needs or context, for example, or adapting the retrieval process to the class of information needs tackled by the user in question. During a retrieval process, users often learn while interacting with retrieved results, which can influence the subsequent queries carried out by the user, or their underlying information need. Novel results presentation schemes have been proposed and investigated with the aim of understanding how result presentation can enhance the effectiveness of the information seeking process (Hearst & Pederson, 1996; White, Jose, & Ruthven, 2003). The objective is to inform the user about the kinds of information available in the retrieved set and guide the user to the right document or documents. In a complex searching task, it is important from a user’s perspective to understand why the documents are retrieved and what kind of information relevant to their need is available in the collection. This can help the users in understanding their underlying need and facilitate the relevance feedback process.

Relevance feedback is an early technique to adapt a user’s queries to relevant documents, based on relevance information explicitly provided by a user on the documents retrieved against an initial query. While this is a very valuable adaptation approach, its real-life usage has been limited due to issues with the feedback process. Many evaluative studies have found that the indication of relevance is burdensome: it is either cognitively demanding for searchers, or searchers may be unable to identify what information is relevant. Such difficulties led to the development of implicit feedback systems where user interaction data is mined to infer users’ developing information needs. User studies of experimental systems have shown that implicit feedback based systems were effective in inferring user needs and thus helping the information retrieval process. These types of techniques try to adapt the query to the user needs based on the user interaction and are one of the most prevalent forms of adaptation techniques. Such adaptation can modify user queries by expanding them, deleting some of the terms and/or suggesting completely new queries.
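To make the idea of explicit relevance feedback concrete, the sketch below adapts a query from documents a user has marked as relevant or non-relevant. It uses a Rocchio-style term reweighting, which is a classic technique chosen here purely for illustration; the editorial does not name a specific method. The function name `rocchio_expand`, the coefficients `alpha`, `beta` and `gamma`, and the toy documents are all assumptions.

```python
from collections import Counter


def rocchio_expand(query_terms, relevant_docs, non_relevant_docs,
                   alpha=1.0, beta=0.75, gamma=0.15, top_k=5):
    """Return the top_k terms of a Rocchio-style adapted query.

    Documents are lists of terms. The coefficient values are
    illustrative assumptions, not values taken from the editorial.
    """
    # Start from the original query, weighted by alpha.
    weights = Counter({t: alpha * c for t, c in Counter(query_terms).items()})

    # Add the mean term frequency of relevant documents (scaled by beta)
    # and subtract the mean of non-relevant documents (scaled by gamma).
    for docs, coeff in ((relevant_docs, beta), (non_relevant_docs, -gamma)):
        if not docs:
            continue
        for doc in docs:
            for term, count in Counter(doc).items():
                weights[term] += coeff * count / len(docs)

    # Keep only positively weighted terms, highest weight first
    # (ties broken alphabetically so the result is deterministic).
    positive = {t: w for t, w in weights.items() if w > 0}
    return sorted(positive, key=lambda t: (-positive[t], t))[:top_k]


adapted = rocchio_expand(
    ["jaguar"],
    relevant_docs=[["jaguar", "cat", "habitat"], ["jaguar", "cat"]],
    non_relevant_docs=[["jaguar", "car", "dealer"]],
)
print(adapted)
```

An implicit feedback system could, under the same assumptions, reuse this function by treating clicked or long-viewed documents as the relevant set instead of asking the searcher for explicit judgements.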

However, feedback based adaptation is limited since it deals only with user needs within a single search session. Information needs can span more than a session, and often users have long-term needs such as looking for materials on their research topic or following a developing news story. Such interest may evolve over a period of time and is more dynamic in nature. Adapting to such long-term needs is a real challenge.

Such challenges require new and robust solutions. Recently, contextual methods were proposed as a way of meeting many of the difficulties described above (Ingwersen & Belkin, 2004; Ingwersen & Järvelin, 2005a; Ingwersen & Järvelin, 2005b). Given a context, it is possible to adapt to the contextual factors. However, this area of research is in its nascent form and is still undefined to an extent. Another issue that requires urgent attention is the development of adaptive retrieval models. Most research so far addresses the issue by developing solutions that react to user interaction or feedback. However, developing a retrieval model which can provide seamless retrieval solutions is a challenge. Moreover, current evaluation methodologies are limited in evaluating adaptive solutions. Newer forms of evaluation that can evaluate and benchmark adaptive retrieval techniques are required. Simulated evaluation schemes are an alternative, but they are incomplete from a methodological perspective.

0306-4573/$ - see front matter © 2008 Elsevier Ltd. All rights reserved. doi:10.1016/j.ipm.2008.08.002

Some of these issues were discussed at the first international workshop on adaptive information retrieval (Joho et al., 2008), organised by the guest editors. The research problems and new ideas presented in the workshop motivated this special issue. In this issue we present papers that tackle adaptation from various perspectives. All papers address interesting dimensions covering evaluation issues, user issues, adaptive models etc. In other words, the papers presented in this issue try to answer some of the fundamental questions in adaptive IR such as what to adapt, how to adapt, and how to evaluate. An overview of the accepted papers follows.

Li and Belkin review existing models of tasks in different domains and propose a new faceted model that aims to facilitate research in both the information science and information retrieval communities. Based on a comprehensive literature review, the authors point out that existing models tended to look at partial aspects of tasks, and that a faceted approach can provide a more holistic model. The proposed model offers seven main facets such as source of task, task doer, time, product, process, goal, task characteristics, and user’s perception of task.

Kumaran and Allan describe work on adapting user queries by modifying them based on user interaction. They explore a number of techniques for analysing queries and suggesting whether it makes sense to modify a query or not. Specifically, the authors motivate the utility of adapting user queries and develop automatic techniques for adapting queries. They have developed techniques to automatically analyse and infer the utility of involving users in the adaptation process and infer situations where such adaptation is fruitless. The experimental results show that there are gains to be had from adapting user queries.

In the article entitled “Using genetic algorithms to evolve a population of topical queries”, Cecchini et al. look into the problem of query adaptation. The objective of this work is to design intelligent techniques to automatically refine search queries and to accommodate resources relevant to thematic context as a whole. This approach is based on a genetic-algorithm-based framework which tries to address the quality of information in formulating new queries.
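As a rough illustration of how a population of queries might be evolved, the sketch below runs a toy genetic algorithm with selection, one-point crossover and mutation. It is not the framework of the article summarised above: the function `evolve_queries`, its parameters, the vocabulary and the fitness function (term overlap with a hand-picked topic) are all hypothetical stand-ins. A real system would presumably score a query by the quality of the results it retrieves.

```python
import random


def evolve_queries(vocabulary, fitness, pop_size=20, query_len=3,
                   generations=30, mutation_rate=0.2, seed=0):
    """Evolve fixed-length queries (tuples of terms) and return the fittest.

    Assumes len(vocabulary) >= query_len >= 2 and pop_size >= 4.
    """
    rng = random.Random(seed)  # seeded so that runs are repeatable
    population = [tuple(rng.sample(vocabulary, query_len))
                  for _ in range(pop_size)]

    for _ in range(generations):
        # Selection: the fitter half survives unchanged (elitism).
        ranked = sorted(population, key=fitness, reverse=True)
        parents = ranked[: pop_size // 2]

        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            # One-point crossover between two parent queries.
            cut = rng.randrange(1, query_len)
            child = list(a[:cut] + b[cut:])
            # Mutation: occasionally swap one term for a random one.
            if rng.random() < mutation_rate:
                child[rng.randrange(query_len)] = rng.choice(vocabulary)
            children.append(tuple(child))

        population = parents + children

    return max(population, key=fitness)


# Hypothetical topic and vocabulary, used only to exercise the sketch.
topic = {"adaptive", "retrieval", "query", "context"}
vocabulary = ["adaptive", "retrieval", "query", "context",
              "coffee", "football", "weather", "music"]
best = evolve_queries(vocabulary, lambda q: len(set(q) & topic))
print(best)
```

Because the fitter half of each generation is carried over unchanged, the best fitness in the population never decreases from one generation to the next.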

Voorhees, in the paper entitled “On Test Collections for Adaptive Information Retrieval”, analyses the Cranfield paradigm of test collections and reflects on the features needed for an adaptive test collection. The author argues that the current Cranfield evaluation methodology provides little support for AIR research. The traditional test collection approach abstracts the retrieval task into what is known as the core competency of retrieval, which is necessary but not sufficient for user retrieval tasks. However, such a core competency is not identified in adaptive retrieval. The author reflects on the Cranfield methodology and argues that building effective AIR test collections will critically depend on identifying those factors that represent the essence of adaptive retrieval behaviour.

Yi Zhang, in the article entitled “Complex Adaptive Filtering User profile using Graphical Models”, explores how to develop complex data-driven user models. A graphical modelling framework is employed to learn rich user models which satisfy complex user criteria in an information filtering situation. User data, collected over a period of a month, is gathered from a web based personal news filtering system and employed in the study. The probabilistic graphical model is used to integrate multiple forms of evidence regarding a user and is shown to be effective in describing the causal relationship between various forms of evidence. The experimental results demonstrate the suitability of graphical models in learning complex data domains and also their effectiveness in adaptive filtering systems.

In the paper entitled “Adapting Information retrieval to Query Contexts”, Bai and Nie describe a language modelling approach to integrate a number of contextual factors: the topic domain of the query; the characteristics of the document collection; the context words of the query. A new language model is generated for each contextual factor and integrated with the original query model. Experiments on TREC data demonstrated the positive effect of the contextual features in providing effective retrieval. In this approach the document ranking function integrates more contextual factors and hence adapts to the user context.
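One simple way to picture the integration of several context models with a query model is a linear mixture of unigram language models, sketched below. The choice of linear interpolation, the function name `interpolate_models`, the mixing weights and the toy term distributions are assumptions made for illustration. The editorial states only that the models are integrated, not how.

```python
def interpolate_models(query_model, context_models, weights):
    """Linearly mix a query language model with context language models.

    Each model maps terms to probabilities. weights[0] applies to the
    query model and the remaining weights to the context models, in
    order. The weights must sum to 1 so the result is a distribution.
    """
    models = [query_model] + list(context_models)
    if len(weights) != len(models):
        raise ValueError("need exactly one weight per model")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")

    # Union of all terms seen in any model.
    vocabulary = set().union(*models)
    return {
        term: sum(w * m.get(term, 0.0) for w, m in zip(weights, models))
        for term in vocabulary
    }


# Hypothetical models for the ambiguous query "java".
query_model = {"java": 1.0}
domain_model = {"java": 0.5, "programming": 0.5}   # topic domain of the query
collection_model = {"java": 0.2, "coffee": 0.8}    # document collection
mixed = interpolate_models(query_model,
                           [domain_model, collection_model],
                           weights=[0.6, 0.3, 0.1])
print(sorted(mixed.items(), key=lambda kv: -kv[1]))
```

In this toy setting the domain model pulls probability mass toward "programming", which is the kind of shift a context-aware ranking function could exploit.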

We hope that the papers selected for this special issue enlighten the research on adaptive information retrieval and further strengthen research in this field. We would like to thank the former editor-in-chief of Information Processing and Management, Tefko Saracevic, and the current editor-in-chief, Fabio Crestani, for their support in producing this special issue, and the reviewers who gave their time and expertise in selecting these articles.

References

Belkin, N. J. (2000). Helping people find what they don’t know. Communications of the ACM, 43(8), 58–61.

Hearst, M. A., & Pederson, J. O. (1996). Re-examining the cluster hypothesis: Scatter/gather on retrieval results. In Proceedings of the 19th annual international ACM SIGIR conference on research and development in information retrieval. Zurich, Switzerland: ACM.

Ingwersen, P., & Belkin, N. (2004). Information retrieval in context – IRiX: Workshop at SIGIR 2004. SIGIR Forum, 38(2), 50–52.

Ingwersen, P., & Järvelin, K. (2005a). Information retrieval in context: IRiX. SIGIR Forum, 39(2), 31–39.

Ingwersen, P., & Järvelin, K. (2005b). The turn - integration of information seeking and retrieval in context. Springer.

Joho, H., Urban, J., Villa, R., Jose, J. M., & van Rijsbergen, C. J. (2008). AIR 2006: First international workshop on adaptive information retrieval. SIGIR Forum, 42(1), 63–66.

White, R. W., Jose, J. M., & Ruthven, I. G. (2003). A granular approach to web search result presentation. In Proceedings of the 9th IFIP TC13 international conference on human-computer interaction (INTERACT 2003). Zürich, Switzerland: IOS Press.

Joemon M. Jose
Hideo Joho
C.J. van Rijsbergen
Department of Computing Science,
University of Glasgow,
Glasgow,
United Kingdom
Tel.: +44 141 330 1636; fax: +44 141 330 4913 (J.M. Jose)
E-mail address: jj@dcs.gla.ac.uk (J.M. Jose)

Available online 2 October 2008