
Bandits and Browsing: Effective Collection Size as Way of Quantifying Search Efficiency


DESCRIPTION

This poster presents our preliminary research on how information can be extracted from user browsing behavior to identify understudied works: materials that are relevant but have too few viewers. We investigate how to apply two types of analysis to the extracted user data, a formula called Effective Collection Size and 'multi-armed bandit' analysis, in order to develop alternative methods of retrieving materials from a collection that are collated by richer factors of relevance. We anticipate that these analyses will enable the development of an information retrieval system that presents a broad range of content in a user's search results.


ANALYSIS AND INITIAL RESULTS
• Ran statistical analysis on the English collection.
• Found books and topics with unusually high use and quantified that use statistically.
• Identified improbably understudied items.
• Found topics of interest for digital collection development.
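The poster does not say which statistical test was used to find unusually high use or improbably understudied items. As a rough sketch only, one way to flag such outliers is to compare each topic's checkouts with what a simple null model would predict. Everything below is an assumption made for illustration: the Poisson null model, the `alpha` cutoff, the function names, and the toy numbers.

```python
import math

def poisson_cdf(k: int, lam: float) -> float:
    """P(X <= k) for X ~ Poisson(lam), summed term by term.

    Fine for toy sizes; exp(-lam) underflows for very large lam.
    """
    if k < 0:
        return 0.0
    term = math.exp(-lam)  # P(X = 0)
    total = term
    for i in range(1, k + 1):
        term *= lam / i    # P(X = i) derived from P(X = i - 1)
        total += term
    return min(total, 1.0)

def flag_topics(checkouts, holdings, alpha=0.01):
    """Label each topic 'high', 'low', or 'typical'.

    Null model (an assumption, not the poster's method): every held item
    circulates at the collection-wide average rate, so a topic's expected
    checkouts are rate * number of items held in that topic.
    """
    rate = sum(checkouts.values()) / sum(holdings.values())
    labels = {}
    for topic, observed in checkouts.items():
        expected = rate * holdings[topic]
        p_low = poisson_cdf(observed, expected)             # P(X <= observed)
        p_high = 1.0 - poisson_cdf(observed - 1, expected)  # P(X >= observed)
        if p_high < alpha:
            labels[topic] = "high"
        elif p_low < alpha:
            labels[topic] = "low"
        else:
            labels[topic] = "typical"
    return labels

# Hypothetical toy data: checkouts and items held per topic.
checkouts = {"poetry": 40, "drama": 12, "criticism": 2}
holdings = {"poetry": 10, "drama": 10, "criticism": 10}
labels = flag_topics(checkouts, holdings)
print(labels)
```

A topic labeled "low" here is one whose circulation is improbably small given how much of it the library holds, which is one possible reading of "improbably understudied".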

OUR PROJECT
• Prototype data: University of Illinois Library catalog circulation statistics
• Use our physical catalog to learn about collection use
• Apply this to improve search and recommendations in digital collections

WHY? Biases in traditional search algorithms send most users to the same high-ranking materials. Digital libraries can adapt to user behavior, identify useful material and send users to relevant but understudied sources.
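To make the "bandit" idea concrete, here is a minimal sketch of UCB1, a standard non-contextual multi-armed bandit rule, applied to choosing which item to show a user. It is not the poster's implementation: the click model, the `true_rates` values, and all function names are hypothetical, and a production system would more likely use a contextual variant.

```python
import math
import random

def ucb1_choose(shows, clicks):
    """Return the index of the item to show next under UCB1."""
    # Show every item once before comparing scores.
    for i, n in enumerate(shows):
        if n == 0:
            return i
    total = sum(shows)
    # Score = observed click rate + a bonus that grows for rarely shown items.
    scores = [
        clicks[i] / shows[i] + math.sqrt(2.0 * math.log(total) / shows[i])
        for i in range(len(shows))
    ]
    return max(range(len(scores)), key=scores.__getitem__)

def simulate(true_rates, rounds=5000, seed=0):
    """Simulate users who click item i with probability true_rates[i]."""
    rng = random.Random(seed)
    shows = [0] * len(true_rates)
    clicks = [0] * len(true_rates)
    for _ in range(rounds):
        i = ucb1_choose(shows, clicks)
        shows[i] += 1
        if rng.random() < true_rates[i]:
            clicks[i] += 1
    return shows, clicks

# Hypothetical click-through rates for three items.
shows, clicks = simulate([0.1, 0.3, 0.6])
print(shows, clicks)
```

The exploration bonus is what counteracts the bias described above: an item that has been shown only a few times keeps a high score until enough users have seen it to judge whether it is actually useful.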

Harriett E. Green, Kirk Hess, and Richard D. Hislop
University of Illinois at Urbana-Champaign
[email protected] [email protected] [email protected] [email protected]
Twitter: @greenharr

EFFECTIVE COLLECTION SIZE
Effective Collection Size quantifies how efficiently a library uses its collection. It focuses on highlighting understudied works and aims to prevent the omission of useful materials in a collection.
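The poster does not state the formula for Effective Collection Size. As a purely illustrative stand-in, the sketch below uses a common "effective number" measure, the inverse participation ratio (Σc)² / Σc² over per-item checkout counts c. It equals the number of items when use is spread evenly and falls toward 1 when a single item absorbs all use. The function name and the sample counts are assumptions.

```python
def effective_size(checkouts):
    """Inverse participation ratio of a list of per-item checkout counts.

    An assumed stand-in for Effective Collection Size, not the poster's
    own formula. Returns 0.0 for a collection with no checkouts at all.
    """
    total = sum(checkouts)
    if total == 0:
        return 0.0
    return total * total / sum(c * c for c in checkouts)

# Hypothetical collections of four items with 20 checkouts each.
even_use = effective_size([5, 5, 5, 5])     # use spread evenly
skewed_use = effective_size([17, 1, 1, 1])  # one title dominates
single_use = effective_size([20, 0, 0, 0])  # only one title is ever used
print(even_use, skewed_use, single_use)
```

Under this measure, a search or recommendation change that sends users to relevant but rarely used items would show up as an increase in the value for the same collection.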

NEXT STEPS
• Analyze the broader University of Illinois catalog.
• Incorporate analysis into Illinois Harvest digital library search results.
• Produce a set of tools to help highlight understudied materials during reference and digitization projects.
• Use results to quantify increases in efficiency of collection use.


2011 DLF Forum October 31-November 1, 2011

Figure: Circulation of all titles with threshold of 100 checkouts
Figure: Circulation of all titles with more than 100 checkouts