Towards Exploratory Search of Scientific Information

Ksenia Konyushkova

Table of contents

1 Introduction

2 System overview

3 SciNet backend
    Retrieval
    Keyword Exploration
    Document Exploration

4 Results

Helsinki Institute for Information Technology and University of Helsinki, Department of Computer Science

Directing Exploratory Search: Reinforcement Learning from User Interactions with Keywords

Dorota Głowacka, Tuukka Ruotsalo

Ksenia Konyushkova, Kumaripaba Athukorala

Samuel Kaski, Giulio Jacucci

Introduction

Goal of the system:

Support exploratory information seeking behavior of researchers by offering tools to assist in navigating through complex information spaces

Techniques:

Reinforcement Learning

Optimized Visualization

SciNet: System Interface

Dataflow

Figure: Overview of data flow in the exploratory search system

Retrieving and Ranking Documents

Probabilistic multinomial unigram language model

MLE:

\[ P(k \mid M_{d_j}) = \prod_{k_i \in k} w_i \, P_{mle}(k_i \mid M_{d_j}), \]

Bayesian Dirichlet smoothing:

\[ P_{\mu}(k \mid d_j) = \frac{c(k; d_j) + \mu \, p(k \mid C)}{\sum_{k} c(k; d_j) + \mu}, \]
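The ranking step above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the SciNet implementation: the function names, the dictionary-based inputs and the value of `mu` are hypothetical, and the product over query keywords is computed in log space to avoid underflow.

```python
import math
from collections import Counter


def dirichlet_prob(keyword, doc_counts, doc_len, collection_prob, mu=2000.0):
    """Dirichlet-smoothed P_mu(k | d_j) = (c(k; d_j) + mu * p(k | C)) / (|d_j| + mu)."""
    return (doc_counts.get(keyword, 0) + mu * collection_prob) / (doc_len + mu)


def log_score(query_weights, doc_tokens, collection_probs, mu=2000.0):
    """Log of the product over query keywords of w_i * P_mu(k_i | d_j).

    query_weights: {keyword: weight w_i > 0} (hypothetical input format)
    collection_probs: {keyword: p(k | C) > 0} estimated from the whole collection
    """
    counts = Counter(doc_tokens)
    doc_len = len(doc_tokens)
    total = 0.0
    for keyword, weight in query_weights.items():
        p = dirichlet_prob(keyword, counts, doc_len, collection_probs[keyword], mu)
        total += math.log(weight * p)
    return total


def rank(query_weights, docs, collection_probs, mu=2000.0):
    """Return document ids sorted by decreasing score."""
    scores = {d: log_score(query_weights, toks, collection_probs, mu)
              for d, toks in docs.items()}
    return sorted(scores, key=scores.get, reverse=True)
```

Smoothing keeps the probability of an unseen keyword above zero, so a document that lacks one query keyword is penalized rather than excluded.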

Reinforcement Learning

Machine Learning:

Supervised Learning

Unsupervised Learning

Reinforcement Learning

Reinforcement Learning:

agents take actions in the environment to maximize the reward

Exploration-Exploitation paradigm

Multi-armed bandit problem: greedy, epsilon-greedy, UCB-1, UCB-tuned and so on

Figure: Multi-armed bandits (Microsoft research)
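Two of the bandit strategies named above, epsilon-greedy and UCB-1, can be sketched as follows. This is a generic textbook-style illustration on simulated Bernoulli arms, not code from SciNet; all function names and the simulation setup are assumptions made for the example.

```python
import math
import random


def epsilon_greedy(counts, values, epsilon, rng):
    """Explore a random arm with probability epsilon, otherwise exploit the best mean."""
    if rng.random() < epsilon:
        return rng.randrange(len(counts))
    return max(range(len(counts)), key=lambda a: values[a])


def ucb1(counts, values):
    """Play every arm once, then pick the arm with the highest UCB-1 index."""
    for arm, n in enumerate(counts):
        if n == 0:
            return arm
    total = sum(counts)
    return max(
        range(len(counts)),
        key=lambda a: values[a] + math.sqrt(2.0 * math.log(total) / counts[a]),
    )


def update(counts, values, arm, reward):
    """Incrementally update the mean reward of the chosen arm."""
    counts[arm] += 1
    values[arm] += (reward - values[arm]) / counts[arm]


def run(policy, probs, steps, seed=0):
    """Simulate Bernoulli arms with success probabilities `probs`; return pull counts."""
    rng = random.Random(seed)
    counts = [0] * len(probs)
    values = [0.0] * len(probs)
    for _ in range(steps):
        arm = policy(counts, values, rng)
        reward = 1.0 if rng.random() < probs[arm] else 0.0
        update(counts, values, arm, reward)
    return counts


if __name__ == "__main__":
    print(run(lambda c, v, rng: ucb1(c, v), [0.2, 0.5, 0.8], 1000))
    print(run(lambda c, v, rng: epsilon_greedy(c, v, 0.1, rng), [0.2, 0.5, 0.8], 1000))
```

The greedy strategy corresponds to `epsilon_greedy` with `epsilon = 0`: it never explores, which is why the other strategies add either random exploration or a confidence bonus.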

Keyword Exploration (query drift)

Learning to rank: initial document retrieval returns 300 documents

Receive feedback from the user

Keyword representation: tf-idf

Exploration - LinRel (Auer, 2002)
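One plausible reading of the tf-idf keyword representation is that each keyword becomes a vector with one tf-idf value per retrieved document. The sketch below assumes that reading and the plain weighting tf × log(N / df); the slides do not state which tf-idf variant is used, and the function name is hypothetical.

```python
import math
from collections import Counter


def tfidf_keyword_vectors(docs):
    """docs: list of keyword lists, one per retrieved document.

    Returns {keyword: [tf-idf value in doc 0, doc 1, ...]} using
    raw term frequency times log(N / document frequency).
    """
    n_docs = len(docs)
    counts = [Counter(doc) for doc in docs]
    doc_freq = Counter(keyword for c in counts for keyword in c)
    vectors = {}
    for keyword, df in doc_freq.items():
        idf = math.log(n_docs / df)
        vectors[keyword] = [c[keyword] * idf for c in counts]
    return vectors


if __name__ == "__main__":
    docs = [["bandit", "robot"], ["robot"], ["robot", "slam", "slam"]]
    for keyword, vec in tfidf_keyword_vectors(docs).items():
        print(keyword, vec)
```

A keyword that occurs in every document gets an all-zero vector under this weighting, which reflects that it does not help to tell the documents apart.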

LinRel

LinRel algorithm (Auer, 2002):

estimate weight vector w by solving a linear regression

\[ r = X \cdot w \]

calculate estimated relevance score \( r_i = x_i \cdot w \)

calculate upper confidence bound:

\[ r_i + \gamma \sigma_i \]

choose keywords with highest upper confidence bound
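The four steps above can be sketched with NumPy. This is a simplified illustration, not Auer's original formulation or the SciNet code: it assumes a ridge-regularized least-squares fit (regularizer `mu`) and takes sigma_i to be the norm of the regression coefficients for keyword i. The names `linrel_scores` and `select_keywords` are hypothetical.

```python
import numpy as np


def linrel_scores(X_seen, r_seen, X_all, gamma=1.0, mu=1.0):
    """Return (ucb, r_hat, sigma) for every keyword row in X_all.

    X_seen: feature rows of keywords that already received feedback
    r_seen: the observed relevance feedback for those rows
    """
    d = X_seen.shape[1]
    A = X_seen.T @ X_seen + mu * np.eye(d)
    # Row i of `a` is a_i = x_i A^{-1} X_seen^T, so that r_hat_i = a_i . r_seen = x_i . w
    a = X_all @ np.linalg.solve(A, X_seen.T)
    r_hat = a @ r_seen
    sigma = np.linalg.norm(a, axis=1)
    return r_hat + gamma * sigma, r_hat, sigma


def select_keywords(ucb, k):
    """Indices of the k keywords with the highest upper confidence bound."""
    return np.argsort(-ucb)[:k]


if __name__ == "__main__":
    X_seen = np.eye(2)
    r_seen = np.array([1.0, 0.0])
    X_all = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
    ucb, r_hat, sigma = linrel_scores(X_seen, r_seen, X_all)
    print(select_keywords(ucb, 2))
```

A larger `gamma` favors keywords whose relevance estimate is uncertain, which is the exploration side of the trade-off.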

GP UCB

Gaussian Process Bandits

Present to the user the object that maximizes

\[ \arg\max_i \{ \mu_i + \sqrt{\beta} \cdot \sigma_i \}, \]

where

\[ \mu = K_{*} K^{-1} r, \]

\[ \sigma = K_{**} - K_{*} K^{-1} K_{*}^{T}. \]
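The selection rule above can be sketched with NumPy as follows. The RBF kernel, the small noise term added to K for numerical stability, and the use of the square root of the posterior variance (the diagonal of K** − K* K⁻¹ K*ᵀ) as sigma_i are assumptions of this sketch; the slides do not specify them.

```python
import numpy as np


def rbf_kernel(A, B, length_scale=1.0):
    """Squared-exponential kernel between the rows of A and the rows of B."""
    sq = np.sum(A**2, axis=1)[:, None] + np.sum(B**2, axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-0.5 * np.maximum(sq, 0.0) / length_scale**2)


def gp_ucb(X_seen, r_seen, X_cand, beta=4.0, noise=1e-6):
    """Return (index of the candidate maximizing mu_i + sqrt(beta) * sigma_i, mu, sigma)."""
    K = rbf_kernel(X_seen, X_seen) + noise * np.eye(len(X_seen))
    K_star = rbf_kernel(X_cand, X_seen)
    K_star_star = rbf_kernel(X_cand, X_cand)
    mu = K_star @ np.linalg.solve(K, r_seen)                   # K* K^-1 r
    cov = K_star_star - K_star @ np.linalg.solve(K, K_star.T)  # K** - K* K^-1 K*^T
    sigma = np.sqrt(np.clip(np.diag(cov), 0.0, None))
    ucb = mu + np.sqrt(beta) * sigma
    return int(np.argmax(ucb)), mu, sigma


if __name__ == "__main__":
    X_seen = np.array([[0.0]])
    r_seen = np.array([1.0])
    X_cand = np.array([[0.0], [10.0]])
    print(gp_ucb(X_seen, r_seen, X_cand))
```

With `beta = 0` the rule is purely exploitative and picks the candidate with the highest predicted relevance; a larger `beta` shifts the choice towards candidates far from anything the user has rated.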

GP SOM

Hierarchical Gaussian Process Bandits with Self-Organizing Maps

Figure: ImSe interface

Document Exploration (diversity)

Assumption: the relevance of a keyword is taken as the relevance of all the documents containing this keyword

α - success measure, β - failure measure

Thompson sampling for a Bernoulli bandit with a Beta distribution (Thompson, 1933; Chapelle, Li, 2011): each document is a bandit arm with a Beta distribution, Beta(α, β)
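A minimal sketch of Thompson sampling with Beta arms is given below. It assumes a uniform Beta(1, 1) prior and a binary success/failure signal per document; how SciNet derives α and β from keyword feedback is not detailed on the slide, so the class and function names here are purely illustrative.

```python
import random


class BetaArm:
    """One document treated as a Bernoulli bandit arm with a Beta(alpha, beta) belief."""

    def __init__(self, alpha=1.0, beta=1.0):
        self.alpha = alpha  # success measure
        self.beta = beta    # failure measure

    def sample(self, rng):
        """Draw a plausible relevance value from the current belief."""
        return rng.betavariate(self.alpha, self.beta)

    def update(self, success):
        """Count one more success or one more failure."""
        if success:
            self.alpha += 1.0
        else:
            self.beta += 1.0


def rank_documents(arms, rng):
    """Sample once from every arm and order documents by the sampled value."""
    draws = {doc_id: arm.sample(rng) for doc_id, arm in arms.items()}
    return sorted(draws, key=draws.get, reverse=True)


if __name__ == "__main__":
    rng = random.Random(0)
    arms = {"doc_a": BetaArm(), "doc_b": BetaArm(), "doc_c": BetaArm()}
    arms["doc_a"].update(True)
    arms["doc_c"].update(False)
    print(rank_documents(arms, rng))
```

Because the ranking comes from random draws rather than from the posterior means, documents with little evidence still reach the top occasionally, which is what produces the diversity this slide refers to.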

Intent Modeling

Figure: Illustration of intent modeling

User studies

"You are writing an essay describing the field of 'robotics'. This essay should include at least three subfields of 'robotics', three application areas of 'robotics' and three algorithms commonly used in the field of 'robotics'."

Precision results

Figure: Illustration of precision measure of Baseline and SciNet in terms of relevance, novelty and obviousness

Recall results

Figure: Illustration of recall measure of Baseline and SciNet in terms of relevance, novelty and obviousness

F-measure results

Figure: Illustration of F-measure of Baseline and SciNet in terms of relevance, novelty and obviousness
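For reference, the three measures reported on the results slides can be computed as in the sketch below. It uses the standard set-based definitions with the balanced F-measure (F1); the exact evaluation protocol of the user study is not given on the slides, and the function name and document ids are made up for the example.

```python
def precision_recall_f(retrieved, relevant):
    """Set-based precision, recall and balanced F-measure (F1)."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    f_measure = 2.0 * precision * recall / (precision + recall)
    return precision, recall, f_measure


if __name__ == "__main__":
    # Hypothetical ids: 4 documents shown, 3 judged relevant, 2 in common.
    print(precision_recall_f(["a", "b", "c", "d"], ["a", "b", "e"]))
```

The same function applies to each judgment type (relevance, novelty, obviousness) by changing which documents count as `relevant`.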

Keywords results

Figure: Cumulative number of shown and manipulated keywords in the SciNet system

Conclusions

Interactive information retrieval system

Reinforcement Learning

Radar Layout

Performance in Precision, Recall and F-measure in terms of Relevance, Novelty and Obviousness

Acknowledgments

The data used in the experiments is derived from the Web of Science prepared by THOMSON REUTERS, Inc., Philadelphia, Pennsylvania, USA: Copyright THOMSON REUTERS, 2011. All rights reserved; the Digital Library of the Association of Computing Machinery (ACM); the Digital Library of Institute of Electrical and Electronics Engineers (IEEE), and the Digital Library of Springer.

The work has been partly supported by the Academy of Finland under the Finnish Center of Excellence in Computational Inference Research (COIN), by the Finnish Funding Agency for Technology and Innovation under project D2I, and by the IST Programme of the European Community under the PASCAL Network of Excellence.

Thanks for your attention!
