Upload
others
View
9
Download
0
Embed Size (px)
Citation preview
Generation & Evaluation of PersonalisedPush-NotificationsKieran Fraser, Bilal Yousuf, Owen ConlanADAPT Centre, Trinity College Dublin
The ADAPT Centre is funded under the SFI Research Centres Programme (Grant 13/RC/2106) and is co-funded under the European Regional Development Fund.
EvalUMAP Workshop
www.adaptcentre.ieContent
❖ Proposed Challenge
❖ Gym-Push
❖ Evaluation Metrics
❖ Challenge Entry
❖ Results & Discussion
❖ Limitations
❖ Final Thoughts
www.adaptcentre.ieProposed Challenge
Currently no established or standardized means for repeatable and comparative evaluation of algorithms and systems in the UMAP space.
1. Proposal for a Shared Challenge in the UMAP Space, EvalUMAP Whitepaper 2019
Goal
Shared Task
“focuses on user model generation using logged mobile phone data, with an assumed purpose of supporting mobile phone notification suggestion.” 1
www.adaptcentre.ieSocial Influencer Problems
www.adaptcentre.ieProposed Challenge
“the challenge is to create an approach to generate personalized notifications on individuals’ mobile phones, whereby such personalization would consist of deciding what events (SMS received, etc.) to show to the individual and when
to show them.”1
1. Proposal for a Shared Challenge in the UMAP Space, EvalUMAP Whitepaper 2019
Challenge 1
• Given 3 months historical notification data (for training)
• Develop a user model which generates a personalized notification given a context
• Using Gym-Push, user model is evaluated using test data and evaluation metrics
Challenge 2
• Given small sample of notification data (no training)
• Develop an adaptive user model which generates a personalized notification given a context
• Using Gym-Push, user model is evaluated, in simulated “real-time”, using test data and evaluation metrics
www.adaptcentre.ieGym-Push
OpenAI Gym
Open source toolkit for “developing and comparing reinforcement learning algorithms” 1
1. https://gym.openai.com/
Gym-Push
Custom OpenAI Gym environment simulating push-notification overload on mobile device users
www.adaptcentre.ieGym-Push
Gym-Push
• Ease of installation – pip, docker, hosted
• Multiple communities – RL, UMAP, HCI
• End-user interface
• Established Online Leaderboard
www.adaptcentre.ieGym-Push
Challenge 1
www.adaptcentre.ieGym-Push
Challenge 2
www.adaptcentre.ieGym-Push
www.adaptcentre.ieGym-Push
Train on Real, Test on Synthetic 1 RMSE F1 scores differ in range 0.02 – 0.07 indicating synthetic data imitates real world data.
1. Esteban, C., Hyland, S.L., R¨atsch, G.: Real-valued (medical) time series generation
www.adaptcentre.ieGym-Push
www.adaptcentre.ieEvaluation Metrics
Performance Diversity
Response Time
Learning Rate
www.adaptcentre.ieEvaluation Metrics
Accuracy Precision Recall F1
Simulated User• AdaBoost Classifier chosen• Trained on 3 months of historical user data• Acc avg = 83.8%, F1 avg = 72.8%
www.adaptcentre.ieChallenge Entry
• MLP used for Generator & Discriminator• Notifications OHE vector length 28• Trained using RMSProp in 128 mini-batch chunks
over 2000 epochs
Challenge 1
www.adaptcentre.ieResults & Discussion
www.adaptcentre.ieResults & Discussion
www.adaptcentre.ieLimitations
Simulated Data
Simulated User
Evaluation metrics domain specific
Proposed Challenge
Diversity
Online-learning
Challenge
Entry
www.adaptcentre.ieFinal Thoughts
Simulated game
environments
More challenge domains
Cross domain metrics
Domain specific metrics
Create a group of
evaluators per domain
Integrate with existing
evaluation services e.g.
TIRA