Towards Exploratory Search of Scientific Information
Ksenia Konyushkova
Ksenia Konyushkova Towards Exploratory Search of Scientific Information
Table of contents
1 Introduction
2 System overview
3 SciNet backend
  Retrieval
  Keyword Exploration
  Document Exploration
4 Results
Helsinki Institute for Information Technology and University of Helsinki, Department of Computer Science
Directing Exploratory Search: Reinforcement Learning from User Interactions with Keywords
Dorota Głowacka, Tuukka Ruotsalo, Ksenia Konyushkova, Kumaripaba Athukorala, Samuel Kaski, Giulio Jacucci
Introduction
Goal of the system:
Support the exploratory information-seeking behavior of researchers by offering tools to assist in navigating through complex information spaces
Techniques:
Reinforcement Learning
Optimized Visualization
SciNet: System Interface
Dataflow
Figure: Overview of data flow in the exploratory search system
Retrieving and Ranking Documents
Probabilistic multinomial unigram language model
MLE:
\[ P(k \mid M_{d_j}) = \prod_{k_i \in k} w_i \, P_{mle}(k_i \mid M_{d_j}), \]
Bayesian Dirichlet smoothing:
\[ P_\mu(k \mid d_j) = \frac{c(k; d_j) + \mu \, p(k \mid C)}{\sum_{k} c(k; d_j) + \mu}, \]
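The ranking step above can be illustrated with a minimal sketch. It assumes that documents are plain token lists, that p(k | C) is the relative frequency of k in the whole collection, and that the keyword weights w_i default to 1. The function names and the value of mu are illustrative choices, not taken from the SciNet implementation.

```python
import math
from collections import Counter


def collection_model(docs):
    """Estimate p(k | C) as the relative frequency of k in the whole collection."""
    counts = Counter(term for doc in docs for term in doc)
    total = sum(counts.values())
    return {term: c / total for term, c in counts.items()}


def smoothed_prob(term, doc_counts, doc_len, p_coll, mu):
    """Dirichlet-smoothed P_mu(k | d) = (c(k; d) + mu * p(k | C)) / (|d| + mu)."""
    return (doc_counts[term] + mu * p_coll[term]) / (doc_len + mu)


def rank_documents(query, docs, mu=10.0, weights=None):
    """Return document indices sorted by the weighted query log-likelihood."""
    p_coll = collection_model(docs)
    weights = weights or {}  # hypothetical per-keyword weights w_i, default 1
    scored = []
    for j, doc in enumerate(docs):
        counts = Counter(doc)
        log_score = 0.0
        for term in query:
            if term not in p_coll:
                continue  # term never seen in the collection: skipped (assumption)
            w = weights.get(term, 1.0)
            p = smoothed_prob(term, counts, len(doc), p_coll, mu)
            log_score += math.log(w * p)
        scored.append((log_score, j))
    return [j for _, j in sorted(scored, key=lambda s: -s[0])]


docs = [
    ["robot", "arm", "control"],
    ["search", "engine", "ranking"],
    ["robot", "search", "planning"],
]
print(rank_documents(["robot", "planning"], docs))
```

Working in log space avoids underflow when the product runs over many keywords. Smoothing keeps a document from scoring zero just because one query keyword is missing from it.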
Reinforcement Learning
Machine Learning:
Supervised Learning
Unsupervised Learning
Reinforcement Learning
Reinforcement Learning:
agents take actions in an environment to maximize the reward
exploration-exploitation paradigm
multi-armed bandit problem: greedy, epsilon-greedy, UCB-1, UCB-tuned and so on
Figure: Multi-armed bandits (Microsoft Research)
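Two of the bandit strategies named above, epsilon-greedy and UCB-1, can be sketched on simulated Bernoulli arms. The arm probabilities, seeds, and helper names below are assumptions made for illustration only.

```python
import math
import random


def make_bernoulli_arms(probs, seed=0):
    """Return a pull(arm) function that pays 1.0 with probability probs[arm]."""
    rng = random.Random(seed)

    def pull(arm):
        return 1.0 if rng.random() < probs[arm] else 0.0

    return pull


def epsilon_greedy(pull, n_arms, rounds, epsilon=0.1, seed=1):
    """Explore a random arm with probability epsilon, otherwise exploit the best mean."""
    rng = random.Random(seed)
    counts, means = [0] * n_arms, [0.0] * n_arms
    for _ in range(rounds):
        if rng.random() < epsilon:
            arm = rng.randrange(n_arms)
        else:
            arm = max(range(n_arms), key=lambda a: means[a])
        reward = pull(arm)
        counts[arm] += 1
        means[arm] += (reward - means[arm]) / counts[arm]  # running average
    return counts, means


def ucb1(pull, n_arms, rounds):
    """Play each arm once, then pick the arm maximizing mean + sqrt(2 ln t / n)."""
    counts, means = [0] * n_arms, [0.0] * n_arms
    for t in range(1, rounds + 1):
        if t <= n_arms:
            arm = t - 1
        else:
            arm = max(
                range(n_arms),
                key=lambda a: means[a] + math.sqrt(2 * math.log(t) / counts[a]),
            )
        reward = pull(arm)
        counts[arm] += 1
        means[arm] += (reward - means[arm]) / counts[arm]
    return counts, means


counts, means = ucb1(make_bernoulli_arms([0.1, 0.9]), n_arms=2, rounds=2000)
print(counts, means)
```

Epsilon-greedy explores at a fixed rate. UCB-1 shrinks its exploration bonus for an arm as that arm accumulates pulls, so a clearly worse arm is sampled less and less often.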
Keyword Exploration (query drift)
Learning to rank: the initial document retrieval returns 300 documents
Receive feedback from the user
Keyword representation: tf-idf
Exploration: LinRel (Auer, 2002)
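The slide does not spell out how the tf-idf keyword representation is built. One plausible reading, sketched below as an assumption, describes each keyword by its tf-idf weight in every retrieved document, so that a keyword becomes a vector with one entry per document. The function name and the particular tf and idf variants are illustrative.

```python
import math
from collections import Counter


def keyword_tfidf_vectors(docs):
    """Map each keyword to a list of tf-idf weights, one entry per document."""
    n_docs = len(docs)
    term_counts = [Counter(doc) for doc in docs]
    # Document frequency: in how many documents each term occurs.
    doc_freq = Counter(term for counts in term_counts for term in counts)
    vectors = {}
    for term, df in doc_freq.items():
        idf = math.log(n_docs / df)  # plain idf, one of several common variants
        vectors[term] = [
            counts[term] / len(doc) * idf if doc else 0.0
            for counts, doc in zip(term_counts, docs)
        ]
    return vectors


docs = [["robot", "arm", "robot"], ["robot", "search"], ["search", "engine"]]
for keyword, vec in keyword_tfidf_vectors(docs).items():
    print(keyword, [round(v, 3) for v in vec])
```

Under this reading, keywords that occur in similar sets of documents get similar vectors, which lets feedback on one keyword inform the estimated relevance of related ones.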
LinRel
LinRel algorithm (Auer, 2002):
estimate the weight vector w by solving the linear regression
\[ r = X \cdot w \]
calculate the estimated relevance score \( r_i = x_i \cdot w \)
calculate the upper confidence bound:
\[ r_i + \gamma \sigma_i \]
choose the keywords with the highest upper confidence bound
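The LinRel steps above can be sketched with NumPy. This version assumes a ridge-regularized regression (the regularizer lam is an assumption, added to keep the matrix invertible) and takes σ_i as the norm of the regression row a_i, following the usual presentation of Auer's algorithm. The function name and toy data are hypothetical.

```python
import numpy as np


def linrel_scores(X_seen, r_seen, X_all, gamma=1.0, lam=1.0):
    """Return (ucb, r_hat, sigma) for every keyword row in X_all.

    X_seen: feature rows of keywords that already received feedback
    r_seen: the feedback (relevance) values for those keywords
    X_all:  feature rows of all candidate keywords
    """
    d = X_seen.shape[1]
    A = X_seen.T @ X_seen + lam * np.eye(d)
    # Row i of `a` is a_i = x_i (X^T X + lam I)^{-1} X^T.
    a = X_all @ np.linalg.solve(A, X_seen.T)
    r_hat = a @ r_seen                 # estimated relevance r_i = x_i . w
    sigma = np.linalg.norm(a, axis=1)  # confidence width sigma_i
    return r_hat + gamma * sigma, r_hat, sigma


X_seen = np.array([[1.0, 0.0], [0.0, 1.0]])
r_seen = np.array([1.0, 0.0])
X_all = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])

ucb, r_hat, sigma = linrel_scores(X_seen, r_seen, X_all, gamma=1.0)
print(int(np.argmax(ucb)), r_hat, sigma)
```

In this toy example the third keyword has the same estimated relevance as the first but a larger confidence width, so with gamma = 1 it is selected. With gamma = 0 the choice falls back to pure exploitation.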
GP UCB
Gaussian Process Bandits
Present to the user the object that maximizes
\[ \arg\max_i \{ \mu_i + \sqrt{\beta} \cdot \sigma_i \}, \]
where
\[ \mu = K_* K^{-1} r, \]
\[ \sigma = K_{**} - K_* K^{-1} K_*^T. \]
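A small NumPy sketch of the selection rule above follows. The RBF kernel, the noise term added to K, and the use of the square root of the diagonal of the posterior covariance as σ_i are assumptions made here for a runnable example. The slides do not state which kernel or noise model the system uses.

```python
import numpy as np


def rbf_kernel(A, B, length_scale=1.0):
    """Squared-exponential kernel between the rows of A and the rows of B."""
    sq = np.sum(A**2, axis=1)[:, None] + np.sum(B**2, axis=1)[None, :] - 2 * A @ B.T
    return np.exp(-0.5 * np.maximum(sq, 0.0) / length_scale**2)


def gp_ucb_select(X_seen, r_seen, X_cand, beta=4.0, noise=1e-3):
    """Return (index of the chosen candidate, posterior mean, posterior std)."""
    K = rbf_kernel(X_seen, X_seen) + noise * np.eye(len(X_seen))
    K_s = rbf_kernel(X_cand, X_seen)    # K_*
    K_ss = rbf_kernel(X_cand, X_cand)   # K_**
    mu = K_s @ np.linalg.solve(K, r_seen)
    cov = K_ss - K_s @ np.linalg.solve(K, K_s.T)
    sigma = np.sqrt(np.clip(np.diag(cov), 0.0, None))
    ucb = mu + np.sqrt(beta) * sigma
    return int(np.argmax(ucb)), mu, sigma


X_seen = np.array([[0.0]])
r_seen = np.array([1.0])
X_cand = np.array([[0.0], [5.0]])

choice, mu, sigma = gp_ucb_select(X_seen, r_seen, X_cand, beta=4.0)
print(choice, mu, sigma)
```

The candidate far from any observed point has a near-zero mean but a large σ, so a positive β favors it. Setting β to 0 reduces the rule to picking the highest posterior mean.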
GP SOM
Hierarchical Gaussian Process Bandits with Self-Organizing Maps
Figure: ImSe interface
Document Exploration (diversity)
Assumption: the relevance of a keyword carries over to all the documents containing this keyword
α: success measure, β: failure measure
Thompson sampling for a Bernoulli bandit with a Beta distribution (Thompson, 1933; Chapelle and Li, 2011): each document is a bandit arm with a Beta distribution, Beta(α, β)
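Thompson sampling over document arms can be sketched with the standard library alone. The sketch assumes each document keeps an (α, β) pair, that positive feedback increments α and negative feedback increments β, and that documents are ranked by one sample drawn from each Beta distribution. The update rule and all names are illustrative, since the slide only states the Beta(α, β) model.

```python
import random


def thompson_rank(arms, rng, top_n=None):
    """Rank document ids by one Beta(alpha, beta) sample per document."""
    samples = {doc: rng.betavariate(a, b) for doc, (a, b) in arms.items()}
    ranked = sorted(samples, key=samples.get, reverse=True)
    return ranked[:top_n]


def update(arms, doc_id, relevant):
    """Count a success (alpha + 1) or a failure (beta + 1) for one document."""
    a, b = arms[doc_id]
    arms[doc_id] = (a + 1, b) if relevant else (a, b + 1)


rng = random.Random(0)
arms = {"doc_a": (1, 1), "doc_b": (1, 1), "doc_c": (1, 1)}  # uniform priors
update(arms, "doc_a", relevant=True)
update(arms, "doc_b", relevant=False)
print(thompson_rank(arms, rng, top_n=2))
```

Because the ranking depends on random samples rather than on the posterior means, documents with little evidence still surface occasionally, which is what adds diversity to the result list.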
Intent Modeling
Figure: Illustration of intent modeling
User studies
"You are writing an essay describing the field of 'robotics'. This essay should include at least three subfields of 'robotics', three application areas of 'robotics' and three algorithms commonly used in the field of 'robotics'."
Precision results
Figure: Illustration of the precision measure of Baseline and SciNet in terms of relevance, novelty and obviousness
Recall results
Figure: Illustration of the recall measure of Baseline and SciNet in terms of relevance, novelty and obviousness
F-measure results
Figure: Illustration of the F-measure of Baseline and SciNet in terms of relevance, novelty and obviousness
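For reference, the three measures reported in these figures can be computed from a set of retrieved items and a set of items judged positive. The sketch below uses the standard definitions. Applying it separately to the relevant, novel, and obvious judgments is an assumption about how the per-category results were obtained.

```python
def precision_recall_f(retrieved, positive):
    """Return (precision, recall, F-measure) for two collections of item ids."""
    retrieved, positive = set(retrieved), set(positive)
    hits = len(retrieved & positive)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(positive) if positive else 0.0
    # F-measure is the harmonic mean of precision and recall.
    f_measure = 2 * precision * recall / (precision + recall) if hits else 0.0
    return precision, recall, f_measure


# Hypothetical ids: four documents shown, three judged relevant overall.
p, r, f = precision_recall_f([1, 2, 3, 4], [2, 4, 6])
print(p, r, f)
```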
Keywords results
Figure: Cumulative number of shown and manipulated keywords in the SciNet system
Conclusions
Interactive information retrieval system
Reinforcement Learning
Radar Layout
Performance in Precision, Recall and F-measure in terms of Relevance, Novelty and Obviousness
Acknowledgments
The data used in the experiments is derived from the Web of Science prepared by THOMSON REUTERS, Inc., Philadelphia, Pennsylvania, USA: Copyright THOMSON REUTERS, 2011. All rights reserved; the Digital Library of the Association for Computing Machinery (ACM); the Digital Library of the Institute of Electrical and Electronics Engineers (IEEE), and the Digital Library of Springer.
The work has been partly supported by the Academy of Finland under the Finnish Center of Excellence in Computational Inference Research (COIN), by the Finnish Funding Agency for Technology and Innovation under project D2I, and by the IST Programme of the European Community under the PASCAL Network of Excellence.
Thanks for your attention!