3/12/2016 1 Massive Online Experiments: Practical Advice Joseph A. Konstan moderator Massive Online Experiments?

Massive Online Experiments: Practical Advice · SXSW Panel on Massive Online Experiments . 3/12/2016 13 ... “mind control” ... 3/12/2016 24 Key takeaway Design the experiment



3/12/2016


Massive Online Experiments: Practical Advice

Joseph A. Konstan

moderator

Massive Online Experiments?


Massive Online Experiments?

In September 2000, Amazon.com outraged some customers when its own price discrimination was revealed. One buyer reportedly deleted the cookies on his computer that identified him as a regular Amazon customer. The result? He watched the price of a DVD offered to him for sale drop from $26.24 to $22.74. The company said the difference was the result of a random price test and offered to refund customers who paid the higher prices.


Is Offline Different?

Introducing Your Panelists

Duncan Watts, Microsoft Research

Jeff Hancock, Stanford University

Elizabeth Churchill, Google


Tweet Your Questions

#ACMatSXSW16


Joe


The A/B Illusion

Duncan Watts Microsoft Research

SXSW Panel on Massive Online Experiments


Individual Decision Making


Traditionally, Policy Making Has Looked the Same

“Which Policy is Best: A or B?”


After some argument… POLICY A

But also possibly… POLICY B

We only ever see A or B, so we never know which would have been better.

Increasingly Technology Allows us to “A/B Test” Rather than Argue

A/B Testing is pervasive in online settings


A/B Testing Can be Applied To Many (But Not All) Policy Decisions

POLICY A: 50%

POLICY B: 50%

The A/B Illusion

• When it is unclear ex ante which of A and B is better:
  – It seems ethical to apply policy A to 100% of people
  – It seems ethical to apply policy B to 100% of people

• So why does it seem unethical to randomly assign policy A to 50% of people and policy B to the other 50%?

• Facebook “Emotional Contagion” experiment

• OK Cupid matching experiment

• It is not because 50% are getting “worse” treatment
  – That would be unethical, but it would also be unethical to give it to 100% of people

• Call this the “A/B Illusion” (Meyer and Chabris, 2015)
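A minimal sketch of what "randomly assign policy A to 50% of people and policy B to the other 50%" can look like in practice. It is an illustration only, not something shown on the panel: the function name `assign_policy`, the experiment label, and the choice to hash user IDs (rather than flip a coin per visit) are all assumptions made for this example.

```python
import hashlib


def assign_policy(user_id: str, experiment: str = "policy-test") -> str:
    """Assign a user to policy "A" or "B" with roughly equal probability.

    Hypothetical helper: hashing (experiment, user_id) gives a stable,
    pseudo-random split, so the same user always sees the same policy.
    """
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % 100  # bucket in 0..99
    return "A" if bucket < 50 else "B"


if __name__ == "__main__":
    # Count how many of 10,000 made-up user IDs land in each arm.
    counts = {"A": 0, "B": 0}
    for i in range(10_000):
        counts[assign_policy(f"user-{i}")] += 1
    print(counts)
```

Changing the threshold `50` changes the share of people who receive policy A; the split stays random with respect to anything the user did or chose.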


Why The Illusion?

• Is it just that “experiment” and “manipulation” have negative connotations?
  – Would “policy testing” or “learning what works” be better?

• Is it that randomization itself is bad?
  – Are arbitrary human decisions better than algorithms?

• Is it that experimentation concedes ignorance?
  – Would we prefer to keep making mistakes than to acknowledge the limits of our knowledge?

• Is it that changing how we make decisions shifts power from traditional decision makers?
  – The “Moneyballization of policy”

• Something else?

You Manipulated What? Lessons from the FB Emotional Contagion Experiment

Jeff Hancock Stanford University

SXSW Panel on Massive Online Experiments


Reaction from Users

1. Newsfeed is manipulated? “How dare you manipulate my Newsfeed!”

2. Newsfeed is important: “My friend’s dad just passed away, and if I was in this experiment I’d never have known”

3. Emotions are distinct

4. Big data is personal: “I want to know if I was in your experiment”

“mood control”

“mind control”


Reaction from Media

“unwitting guinea pigs”

“treating people like laboratory rats”

“tweaking the newsfeed algorithm”

“tinkering with people's emotions”

“manipulated users' emotions”

[Chart: % articles; terms: tinker, guinea pig, manipulation]

[Chart: % word count for 1st Person Singular, Anger, and Anxiety; labels: FB coverage, emotion, non-emotion]

Reaction from Academia


1. Understanding users’ folk theories

2. Consent

3. Assessing risk besides privacy


1. Evidence-based design

2. What is the control group?

3. Autonomous experiments

4. The 2% vs. 50% problem

EXPERIMENT(AL) DESIGN

Elizabeth F. Churchill


Key takeaway

Design the experiment as you design the experience

Design programmatically for the bigger picture


An engagement story, a cautionary tale

[Figure: duration of chat session, # of play/pause, # of chats, # of scrubs]

Work with David Ayman Shamma


What to collect to measure engagement?

• Type of event (e.g., player command or a normal chat message)

• Anonymous hash (e.g., uniquely identifies the sender and the receiver, without exposing personal account data)

• Timestamp for the event

• The player time (with respect to the specific video) at the point the event occurred

• The number of characters and the number of words typed (for chat messages)

• Emoticons used in the chat message

• URL to the shared video
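The fields listed above could be captured in a single event record along these lines. This is a rough sketch under stated assumptions, not the schema used in the work described: the names `EngagementEvent`, `anonymize`, and `chat_event`, the salted SHA-256 hash, and the small emoticon pattern are all hypothetical choices made for illustration.

```python
import hashlib
import re
import time
from dataclasses import dataclass, field
from typing import List, Optional

# Hypothetical pattern covering a few ASCII emoticons such as :) ;-) :D :P
EMOTICON_RE = re.compile(r"[:;]-?[()DPp]")


def anonymize(account_id: str, salt: str) -> str:
    """Return a stable hash that identifies a user without exposing the account ID."""
    return hashlib.sha256((salt + account_id).encode("utf-8")).hexdigest()[:16]


@dataclass
class EngagementEvent:
    event_type: str      # e.g. "chat", or a player command such as "play" or "pause"
    sender_hash: str     # anonymous hash of the sender
    receiver_hash: str   # anonymous hash of the receiver
    timestamp: float     # wall-clock time of the event (seconds since the epoch)
    player_time: float   # position in the video when the event occurred (seconds)
    video_url: str       # URL to the shared video
    char_count: Optional[int] = None   # chat messages only
    word_count: Optional[int] = None   # chat messages only
    emoticons: List[str] = field(default_factory=list)


def chat_event(sender: str, receiver: str, text: str, player_time: float,
               video_url: str, salt: str = "demo-salt",
               timestamp: Optional[float] = None) -> EngagementEvent:
    """Build the record for one chat message; the message text itself is not stored."""
    return EngagementEvent(
        event_type="chat",
        sender_hash=anonymize(sender, salt),
        receiver_hash=anonymize(receiver, salt),
        timestamp=time.time() if timestamp is None else timestamp,
        player_time=player_time,
        video_url=video_url,
        char_count=len(text),
        word_count=len(text.split()),
        emoticons=EMOTICON_RE.findall(text),
    )
```

Player commands would use the same record with `event_type` set to the command name and the chat-only fields left at their defaults.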


Volume of actions over time – an arbitrary session


Volume of actions over time – a human session


Chat follows the video

CHAT


www.90percentofeverything.com


Discover Define Develop Deliver

EXPLORE EVALUATE, EXPLORE EVALUATE

Double Diamond model, Design Council: http://www.designcouncil.org.uk

A framework

EXPLORE, GLOBAL: Open questions, ethnographic studies, surveys, observations, market analysis

EXPLORE, LOCAL: Based on prior and related data, A/B tests explore potential for large gains from small changes

EVALUATE, GLOBAL: Rough prototypes, A/B tested; triangulate approaches & studies

EVALUATE, LOCAL: A/B tests for small changes, strong hypotheses, clarifying questions

Work with Rochelle King and Caitlin Tan


Key takeaway

Design the experiment as you design the experience

Design programmatically for the bigger picture

Questions?

[email protected]

Also: More information in our upcoming book.


Q&A

Remember: Tweet to #ACMatSXSW16