20
Generation & Evaluation of Personalised Push-Notifications Kieran Fraser, Bilal Yousuf, Owen Conlan ADAPT Centre, Trinity College Dublin The ADAPT Centre is funded under the SFI Research Centres Programme (Grant 13/RC/2106) and is co-funded under the European Regional Development Fund. EvalUMAP Workshop

Generation & Evaluation of Personalised Push-Notifications · Generation & Evaluation of Personalised Push-Notifications Kieran Fraser, Bilal Yousuf, Owen Conlan ADAPT Centre, Trinity

  • Upload
    others

  • View
    9

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Generation & Evaluation of Personalised Push-Notifications · Generation & Evaluation of Personalised Push-Notifications Kieran Fraser, Bilal Yousuf, Owen Conlan ADAPT Centre, Trinity

Generation & Evaluation of PersonalisedPush-NotificationsKieran Fraser, Bilal Yousuf, Owen ConlanADAPT Centre, Trinity College Dublin

The ADAPT Centre is funded under the SFI Research Centres Programme (Grant 13/RC/2106) and is co-funded under the European Regional Development Fund.

EvalUMAP Workshop

Page 2: Generation & Evaluation of Personalised Push-Notifications · Generation & Evaluation of Personalised Push-Notifications Kieran Fraser, Bilal Yousuf, Owen Conlan ADAPT Centre, Trinity

www.adaptcentre.ieContent

❖ Proposed Challenge

❖ Gym-Push

❖ Evaluation Metrics

❖ Challenge Entry

❖ Results & Discussion

❖ Limitations

❖ Final Thoughts

Page 3: Generation & Evaluation of Personalised Push-Notifications · Generation & Evaluation of Personalised Push-Notifications Kieran Fraser, Bilal Yousuf, Owen Conlan ADAPT Centre, Trinity

www.adaptcentre.ieProposed Challenge

Currently no established or standardized means for repeatable and comparative evaluation of algorithms and systems in the UMAP space.

1. Proposal for a Shared Challenge in the UMAP Space, EvalUMAP Whitepaper 2019

Goal

Shared Task

“focuses on user model generation using logged mobile phone data, with an assumed purpose of supporting mobile phone notification suggestion.” 1

Page 4: Generation & Evaluation of Personalised Push-Notifications · Generation & Evaluation of Personalised Push-Notifications Kieran Fraser, Bilal Yousuf, Owen Conlan ADAPT Centre, Trinity

www.adaptcentre.ieSocial Influencer Problems

Page 5: Generation & Evaluation of Personalised Push-Notifications · Generation & Evaluation of Personalised Push-Notifications Kieran Fraser, Bilal Yousuf, Owen Conlan ADAPT Centre, Trinity

www.adaptcentre.ieProposed Challenge

“the challenge is to create an approach to generate personalized notifications on individuals’ mobile phones, whereby such personalization would consist of deciding what events (SMS received, etc.) to show to the individual and when

to show them.”1

1. Proposal for a Shared Challenge in the UMAP Space, EvalUMAP Whitepaper 2019

Challenge 1

• Given 3 months historical notification data (for training)

• Develop a user model which generates a personalized notification given a context

• Using Gym-Push, user model is evaluated using test data and evaluation metrics

Challenge 2

• Given small sample of notification data (no training)

• Develop an adaptive user model which generates a personalized notification given a context

• Using Gym-Push, user model is evaluated, in simulated “real-time”, using test data and evaluation metrics

Page 6: Generation & Evaluation of Personalised Push-Notifications · Generation & Evaluation of Personalised Push-Notifications Kieran Fraser, Bilal Yousuf, Owen Conlan ADAPT Centre, Trinity

www.adaptcentre.ieGym-Push

OpenAI Gym

Open source toolkit for “developing and comparing reinforcement learning algorithms” 1

1. https://gym.openai.com/

Gym-Push

Custom OpenAI Gym environment simulating push-notification overload on mobile device users

Page 7: Generation & Evaluation of Personalised Push-Notifications · Generation & Evaluation of Personalised Push-Notifications Kieran Fraser, Bilal Yousuf, Owen Conlan ADAPT Centre, Trinity

www.adaptcentre.ieGym-Push

Gym-Push

• Ease of installation – pip, docker, hosted

• Multiple communities – RL, UMAP, HCI

• End-user interface

• Established Online Leaderboard

Page 8: Generation & Evaluation of Personalised Push-Notifications · Generation & Evaluation of Personalised Push-Notifications Kieran Fraser, Bilal Yousuf, Owen Conlan ADAPT Centre, Trinity

www.adaptcentre.ieGym-Push

Challenge 1

Page 9: Generation & Evaluation of Personalised Push-Notifications · Generation & Evaluation of Personalised Push-Notifications Kieran Fraser, Bilal Yousuf, Owen Conlan ADAPT Centre, Trinity

www.adaptcentre.ieGym-Push

Challenge 2

Page 10: Generation & Evaluation of Personalised Push-Notifications · Generation & Evaluation of Personalised Push-Notifications Kieran Fraser, Bilal Yousuf, Owen Conlan ADAPT Centre, Trinity

www.adaptcentre.ieGym-Push

Page 11: Generation & Evaluation of Personalised Push-Notifications · Generation & Evaluation of Personalised Push-Notifications Kieran Fraser, Bilal Yousuf, Owen Conlan ADAPT Centre, Trinity

www.adaptcentre.ieGym-Push

Train on Real, Test on Synthetic 1 RMSE F1 scores differ in range 0.02 – 0.07 indicating synthetic data imitates real world data.

1. Esteban, C., Hyland, S.L., R¨atsch, G.: Real-valued (medical) time series generation

Page 12: Generation & Evaluation of Personalised Push-Notifications · Generation & Evaluation of Personalised Push-Notifications Kieran Fraser, Bilal Yousuf, Owen Conlan ADAPT Centre, Trinity

www.adaptcentre.ieGym-Push

Page 13: Generation & Evaluation of Personalised Push-Notifications · Generation & Evaluation of Personalised Push-Notifications Kieran Fraser, Bilal Yousuf, Owen Conlan ADAPT Centre, Trinity

www.adaptcentre.ieEvaluation Metrics

Performance Diversity

Response Time

Learning Rate

Page 14: Generation & Evaluation of Personalised Push-Notifications · Generation & Evaluation of Personalised Push-Notifications Kieran Fraser, Bilal Yousuf, Owen Conlan ADAPT Centre, Trinity

www.adaptcentre.ieEvaluation Metrics

Accuracy Precision Recall F1

Simulated User• AdaBoost Classifier chosen• Trained on 3 months of historical user data• Acc avg = 83.8%, F1 avg = 72.8%

Page 15: Generation & Evaluation of Personalised Push-Notifications · Generation & Evaluation of Personalised Push-Notifications Kieran Fraser, Bilal Yousuf, Owen Conlan ADAPT Centre, Trinity

www.adaptcentre.ieChallenge Entry

• MLP used for Generator & Discriminator• Notifications OHE vector length 28• Trained using RMSProp in 128 mini-batch chunks

over 2000 epochs

Challenge 1

Page 16: Generation & Evaluation of Personalised Push-Notifications · Generation & Evaluation of Personalised Push-Notifications Kieran Fraser, Bilal Yousuf, Owen Conlan ADAPT Centre, Trinity

www.adaptcentre.ieResults & Discussion

Page 17: Generation & Evaluation of Personalised Push-Notifications · Generation & Evaluation of Personalised Push-Notifications Kieran Fraser, Bilal Yousuf, Owen Conlan ADAPT Centre, Trinity

www.adaptcentre.ieResults & Discussion

Page 18: Generation & Evaluation of Personalised Push-Notifications · Generation & Evaluation of Personalised Push-Notifications Kieran Fraser, Bilal Yousuf, Owen Conlan ADAPT Centre, Trinity

www.adaptcentre.ieLimitations

Simulated Data

Simulated User

Evaluation metrics domain specific

Proposed Challenge

Diversity

Online-learning

Challenge

Entry

Page 19: Generation & Evaluation of Personalised Push-Notifications · Generation & Evaluation of Personalised Push-Notifications Kieran Fraser, Bilal Yousuf, Owen Conlan ADAPT Centre, Trinity

www.adaptcentre.ieFinal Thoughts

Simulated game

environments

More challenge domains

Cross domain metrics

Domain specific metrics

Create a group of

evaluators per domain

Integrate with existing

evaluation services e.g.

TIRA

Page 20: Generation & Evaluation of Personalised Push-Notifications · Generation & Evaluation of Personalised Push-Notifications Kieran Fraser, Bilal Yousuf, Owen Conlan ADAPT Centre, Trinity

www.adaptcentre.ie

Thank you.

Questions?

Email:[email protected]