The AB Testing Hype Cycle Escaping the Trough of Useless Testing @OptimiseOrDie

eMetrics London - The AB Testing Hype Cycle

Page 1: eMetrics London - The AB Testing Hype Cycle

The AB Testing Hype Cycle: Escaping the Trough of Useless Testing

@OptimiseOrDie

Page 2: eMetrics London - The AB Testing Hype Cycle

@OptimiseOrDie

The Gartner Hype Cycle ™

Page 3: eMetrics London - The AB Testing Hype Cycle

1 Tool Installed
2 Stupid testing
3 Scaled up Stupidity
4 Peak of Stupidity
5 ROI questioned
6 Statistics debunked
7 Faith crisis
8 The Trough of Testing
9 Where, How, Why
10 Data science
11 Testing to learn
12 Innovation Testing

Curve labels:
Slope of Stupidity ------>
Slide of moodiness ------>
Hacking Business Futures ------>

@OptimiseOrDie

Page 4: eMetrics London - The AB Testing Hype Cycle

#fail

@OptimiseOrDie

Page 5: eMetrics London - The AB Testing Hype Cycle

@OptimiseOrDie

108M

Page 6: eMetrics London - The AB Testing Hype Cycle

@OptimiseOrDie

44.9M

Page 7: eMetrics London - The AB Testing Hype Cycle

Oppan Gangnam Style!

@OptimiseOrDie

24.1M

Page 8: eMetrics London - The AB Testing Hype Cycle

You been naughty again?

Page 9: eMetrics London - The AB Testing Hype Cycle

10 Shortcuts to Testing Success

1. Get Analytics Health Checked
2. Test in the right place
3. Understand Cross Device
4. Do your Research
5. Prioritise your testing
6. Perform Pre Flight Checks
7. Know how long to test
8. Have a good reason to test
9. Learn from your tests
10. Burn down the silos

Page 10: eMetrics London - The AB Testing Hype Cycle

@OptimiseOrDie

1. Your Analytics Setup is Broken

Page 11: eMetrics London - The AB Testing Hype Cycle

1. What about MY clients?

• Nearly 100 Sites in 3 years
• 95% were broken, often badly
• Trust in data was missing
• Management made bad calls
• Nobody checked the tills
• Calibrate from the basics up!

• What sales do we capture?
• What categories?
• What about refunds, lunch money, gift certificates?
• How do we monitor fraud?
• Do we check it adds up?
• Where does this data go?

Page 12: eMetrics London - The AB Testing Hype Cycle

@OptimiseOrDie

1. Bulls***t flows upwards!

[Diagram: a reporting hierarchy of TILLS, DEPT, STORE and DIVISION. At each level, BS Collection feeds BS metrics; these roll up into BS reports and, at the top, a Cool BS Dashboard.]

Page 13: eMetrics London - The AB Testing Hype Cycle

@OptimiseOrDie

1 Get an Analytics Health Check

• Review takes 1-3 days
• Prioritise the issues
• Fix directly with developers
• Integrate with the Testing Tool

Page 14: eMetrics London - The AB Testing Hype Cycle

@OptimiseOrDie

2. You Test in the Wrong Place

Page 15: eMetrics London - The AB Testing Hype Cycle

2. Let’s do Random Testing

Let’s try the homepage
I’ve got targets to hit!
I hate this job
Let’s test button colours!

• Has lots of opinions but no data
• Spends too much time on Twitter
• Driven by Ego and Competitors
• Wishes he cared about testing

Page 16: eMetrics London - The AB Testing Hype Cycle

@OptimiseOrDie

1. Let’s do Random Testing

“STOP copying your competitors. They may not know what the f*** they are doing either”
Peep Laja, ConversionXL

Page 17: eMetrics London - The AB Testing Hype Cycle

Best Practice Testing?

• Your customers are not the same
• Your site is not the same
• Your advertising and traffic are not the same
• Your UX is not the same
• Your X-Device Mix is not the same
• You have no idea of the data

• Use them to inform or suggest approaches
• Use them for ideas
• Do not use them as a playbook
• It will make you very unhappy

Page 18: eMetrics London - The AB Testing Hype Cycle

@OptimiseOrDie

2. Modelling - Intent

[Diagram: All traffic is split by intent (Hearing, Sight, Store, Other). The Hearing segment is followed page by page through Step 1, Step 2 and Step 3 to the Goal.]

Page 19: eMetrics London - The AB Testing Hype Cycle

@OptimiseOrDie

2. Modelling – Multiple Endings

[Diagram: All traffic arrives on an Entry Page and passes through Influence and Intent pages, then Step 1, Step 2 and Step 3 to the Goal. Three paths are shown, each numbered 1 to 4.]

Page 20: eMetrics London - The AB Testing Hype Cycle

2. Modelling – Horizontal Funnels

Page 21: eMetrics London - The AB Testing Hype Cycle

@OptimiseOrDie

2 Test in the Right Places

• Do some Analytics modelling
• Understand the shedding of layers
• Narrow your focus and scope
• Bank better gains earlier in time

Page 22: eMetrics London - The AB Testing Hype Cycle

@OptimiseOrDie

3. Responsive solves everything, right?

Page 23: eMetrics London - The AB Testing Hype Cycle

@OptimiseOrDie

Page 24: eMetrics London - The AB Testing Hype Cycle

@OptimiseOrDie

Vs.

Page 25: eMetrics London - The AB Testing Hype Cycle

@OptimiseOrDie

Page 26: eMetrics London - The AB Testing Hype Cycle

@OptimiseOrDie

Page 27: eMetrics London - The AB Testing Hype Cycle

@OptimiseOrDie

Page 28: eMetrics London - The AB Testing Hype Cycle

@OptimiseOrDie

Page 29: eMetrics London - The AB Testing Hype Cycle

@OptimiseOrDie

1. Motorola Hardware Menu Button
2. MS Word Bullet Button
3. Android Holo Composition Icon
4. Android Context Action Bar Overflow (top right on Android devices)

Page 30: eMetrics London - The AB Testing Hype Cycle
Page 31: eMetrics London - The AB Testing Hype Cycle

@OptimiseOrDie

Increase in revenue of > $200,000 per annum!

bit.ly/hamburgertest

Page 32: eMetrics London - The AB Testing Hype Cycle

Mystery Meats of Mobile

BURGER SHISH DONER

Page 33: eMetrics London - The AB Testing Hype Cycle

@OptimiseOrDie

3. Our customers use iPhones, right?

• Do you really know your mix?
• Most people undercount Android!
• What iPhone models visit?
• How big is tablet traffic?
• What screen sizes do they have?
• Find out BEFORE you design tests
• Check BEFORE you launch tests
• Use Google Analytics to find out
• 3 reports to rule them all

https://www.google.com/analytics/web/template?uid=lpVf8LveSqyd3mdsHjdfzQ
https://www.google.com/analytics/web/template?uid=fmUzp_gzRIy7LnvZJjCDOQ
https://www.google.com/analytics/web/template?uid=y7sYIXDhQrmswHAiNo8iLA

Page 34: eMetrics London - The AB Testing Hype Cycle

@OptimiseOrDie

3. What iPhone Models do we see?

Screen Resolution
320 x 480 = iPhone 4/4S
320 x 568 = iPhone 5/5S
375 x 667 = iPhone 6
414 x 736 = iPhone 6+

https://www.google.com/analytics/web/template?uid=lpVf8LveSqyd3mdsHjdfzQ
https://www.google.com/analytics/web/template?uid=fmUzp_gzRIy7LnvZJjCDOQ
https://www.google.com/analytics/web/template?uid=y7sYIXDhQrmswHAiNo8iLA
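The resolution lookup on this slide can be applied to an exported screen-resolution report to estimate the iPhone model mix. The sketch below is only an illustration: the function name `iphone_mix` and the sample session counts are hypothetical, and it assumes the export has already been filtered to iPhone traffic and reduced to (resolution, sessions) pairs.

```python
from collections import Counter

# Lookup taken from the slide: screen resolution -> iPhone model.
IPHONE_BY_RESOLUTION = {
    "320x480": "iPhone 4/4S",
    "320x568": "iPhone 5/5S",
    "375x667": "iPhone 6",
    "414x736": "iPhone 6+",
}


def iphone_mix(rows):
    """Return each model's share of sessions from (resolution, sessions) pairs."""
    totals = Counter()
    for resolution, sessions in rows:
        # Normalise "320 x 480" and "320x480" to the same key.
        key = resolution.replace(" ", "").lower()
        totals[IPHONE_BY_RESOLUTION.get(key, "Unknown")] += sessions
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {model: count / grand_total for model, count in totals.items()}


# Hypothetical export rows, for illustration only.
sample_rows = [
    ("320x568", 500),
    ("375x667", 300),
    ("320 x 480", 150),
    ("414x736", 50),
]
mix = iphone_mix(sample_rows)
for model, share in sorted(mix.items(), key=lambda item: -item[1]):
    print(f"{model}: {share:.0%}")
```

Resolutions that are not in the lookup are counted as "Unknown" rather than dropped, so a large unknown share is itself a signal to check the device mix before designing tests.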

Page 35: eMetrics London - The AB Testing Hype Cycle

@OptimiseOrDie

3 Figure Out the Device Mix for Testing

• Desktop Browsers & versions
• Tablet Models
• Mobile Device Models
• Screen Resolutions

Page 36: eMetrics London - The AB Testing Hype Cycle

4. You don’t do any Research before testing?

Is there anything holding you back from doing conversion research?

1. Time
2. Client/Company Buy-In
3. Budget
4. Don’t know where to start

@ContentVerve

Page 37: eMetrics London - The AB Testing Hype Cycle

Page 38: eMetrics London - The AB Testing Hype Cycle

@OptimiseOrDie

4. If you have 4 hours PLUS

• Snap interviews (Sales, Customer Services, Tech Support)

• Run a quick poll or survey (See my tools slides)

Less Bullshit!

Page 39: eMetrics London - The AB Testing Hype Cycle
Page 40: eMetrics London - The AB Testing Hype Cycle

@OptimiseOrDie

4. 1 Hour Page Analytics

[Diagram: Marketing -> Site flow around a Page or Process.
Before the page: Influence Pages, Entry Points, Landing Pages; Device Mix, Customer Mix, Traffic Mix; Flow, Intent.
After the page: Next Steps, Abandonment, Exits, Mix of abandonment, Flow.]

Page 41: eMetrics London - The AB Testing Hype Cycle

@OptimiseOrDie

4. 1 Hour Landing Page Analytics

• How old are the visitors?
  https://www.google.com/analytics/web/template?uid=hab8Ta93SCCffUpjefjtNQ
• What are the key metrics like (e.g. bounce rate, conversion)?
  https://www.google.com/analytics/web/template?uid=hab8Ta93SCCffUpjefjtNQ
• What is the goal or ecommerce conversion through this page?
  https://www.google.com/analytics/web/template?uid=hab8Ta93SCCffUpjefjtNQ
• What channel traffic comes to the page?
  https://www.google.com/analytics/web/template?uid=Kjb9q8M4QN-fsPe8dOGaig
• What is the mix of tablet / mobile / desktop to the page?
  https://www.google.com/analytics/web/template?uid=wLMUWs8eTIa3_mmQHOtPkw
• What are the resolutions of devices?
  https://www.google.com/analytics/web/template?uid=wLMUWs8eTIa3_mmQHOtPkw
• How slow are the landing pages?
  https://www.google.com/analytics/web/template?uid=AavFsgMoRkucYYKnxlB76Q
• What are the pages right after the landing page? (Use a landing page report and choose the ‘Entrance Paths’ to show next pages.)
• What is the flow like from this page? (Use the Behaviour Flow Report)
• What does it look like on the top devices? (Use real devices + Appthwack.com, Crossbrowsertesting.com or Deviceanywhere.com)

Page 42: eMetrics London - The AB Testing Hype Cycle

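[Diagram: sources (Ad, SERP, Banner, Email Campaign, Aff, Referrer) lead to the Landing Page Template, then an Interaction Step or Layer, then Goal Reached. Segmented by channel (PPC, Organic, Display, Email, Social) and by device (Desktop - Mobile - Tablet).]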

Page 43: eMetrics London - The AB Testing Hype Cycle

@OptimiseOrDie

4. If you have 2 hours

• Form Analytics data
• Scroll or Click Maps
• Session Recording Videos (Hotjar, Decibel Insight, Yandex)
• Make a horizontal funnel from the landing page
• Check the:
  – Marketing Creatives / SERP fully
  – Look at Landing page ZERO!

Page 44: eMetrics London - The AB Testing Hype Cycle

@OptimiseOrDie

4. If you have 4 hours

• Set up a poll or survey (See my tools slides)
• Set up an exit (bail) survey
• Friends, Family, New Employee user testing
• Guerrilla user testing
• Snap Interviews – 5-10 minutes:
  – Customer Services, Sales team (if applicable) then Customers
• 5 Second Test
• Article is here: bit.ly/conversiondeadline

Page 45: eMetrics London - The AB Testing Hype Cycle

@OptimiseOrDie

4 No Excuses – Do Your Research

• Lean Analytics
• UX Research
• Interviewing
• Surveys and Polls

Page 46: eMetrics London - The AB Testing Hype Cycle

@OptimiseOrDie

5. You don’t Prioritise your Tests

Scoring can be cost, time to market, resource, risk, political complexity

• Cost 1-10: Higher is cheaper
• Time 1-10: Higher is shorter
• Opportunity 1-10: Higher is greater

Score = Cost * Time * Opportunity

• For financial institutions, risk should be a factor
• Want to build your own? – ask me!
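The scoring rule on this slide (Score = Cost * Time * Opportunity, each rated 1-10) is easy to put in a small script. This is a minimal sketch, not the author's own tool: the `Candidate` class, the `prioritise` helper and the three example ratings are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """A test idea rated on the slide's three 1-10 scales."""
    name: str
    cost: int         # 1-10, higher is cheaper
    time: int         # 1-10, higher is shorter
    opportunity: int  # 1-10, higher is greater

    def score(self) -> int:
        for value in (self.cost, self.time, self.opportunity):
            if not 1 <= value <= 10:
                raise ValueError("each factor must be between 1 and 10")
        # Score = Cost * Time * Opportunity, as on the slide.
        return self.cost * self.time * self.opportunity


def prioritise(candidates):
    """Return candidates ordered from highest to lowest score."""
    return sorted(candidates, key=lambda c: c.score(), reverse=True)


# Hypothetical ratings, for illustration only.
ideas = [
    Candidate("Homepage button colour", cost=9, time=9, opportunity=2),
    Candidate("Payment page card handling", cost=4, time=5, opportunity=9),
    Candidate("Category list page layout", cost=6, time=6, opportunity=8),
]
ranked = prioritise(ideas)
for idea in ranked:
    print(f"{idea.score():4d}  {idea.name}")
```

Because the factors are multiplied, a cheap and quick test with little opportunity still ranks low. A risk factor, as suggested for financial institutions, could be added as a fourth multiplier on the same 1-10 scale.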

Page 47: eMetrics London - The AB Testing Hype Cycle

@OptimiseOrDie

5. Opportunity vs. Cost

[Chart: test targets plotted on two 0-10 axes, Cheap (high is better) and Opportunity, with a region labelled MONEY!]

Page 48: eMetrics London - The AB Testing Hype Cycle

@OptimiseOrDie

5. Make a Money Model

Test               | Description          | Metric                    | 2% lift | 5% lift   | 10% lift  | Estimate
Product page       | Simplification       | Basket adds               | 200,000 | 500,000   | 1,000,000 | 500,000
Register new       | Improve onboarding   | New register funnel ratio | 25,000  | 62,500    | 125,000   | 250,000
IE8 bugs in cart   | Fix key broken stuff | IE8 Conversion            | 80,000  | 200,000   | 400,000   | 200,000
Category list page | Get product higher   | User Category -> Product  | 500,000 | 1,250,000 | 2,500,000 | 1,250,000
Payment Page       | New card handling    | User Payment -> Thank you | 60,000  | 150,000   | 300,000   | 300,000
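Each row of the money model is a baseline value multiplied by a candidate lift. A rough sketch of that arithmetic follows. The `money_model` function is hypothetical, and the baseline of 10,000,000 is an assumption inferred from the Product page row (200,000 at a 2% lift); the slide does not state the baselines or how the Estimate column was chosen.

```python
def money_model(baseline_value, lifts=(0.02, 0.05, 0.10)):
    """Return the extra value gained at each candidate lift."""
    return {lift: round(baseline_value * lift) for lift in lifts}


# Assumed baseline implied by the Product page row (2% lift = 200,000).
product_page = money_model(10_000_000)
for lift, value in product_page.items():
    print(f"{lift:.0%} lift -> {value:,}")
```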

Page 49: eMetrics London - The AB Testing Hype Cycle

@OptimiseOrDie

5 Prioritise your Testing Targets

• Score all Test Targets
• Use Cost vs. Opportunity minimum
• Check it works!
• Make a Money Model

Page 50: eMetrics London - The AB Testing Hype Cycle

@OptimiseOrDie

6. You Don’t Test Before Launch

• Dirty secret of AB testing?
• People break their tests all the time!
• Most people don’t notice

Why?

• Because developers can break them very easily
• What if your AB test was broken on iPhones?
• If you didn’t know, would your results be valid?
• About 40% of my tests fail basic QA

Page 51: eMetrics London - The AB Testing Hype Cycle

6. Here is my £80M testing rig!

Browser Checks
www.crossbrowsertesting.com
www.browserstack.com
www.spoon.net
www.saucelabs.com

Mobile & Tablet
www.appthwack.com
www.deviceanywhere.com
www.opendevicelab.com

Article & Info
bit.ly/devicetesting

Page 52: eMetrics London - The AB Testing Hype Cycle

@OptimiseOrDie

6 Perform Pre Flight Checks

• Check Every Test Works
• Browsers and Devices
• Check Analytics records correctly

Page 53: eMetrics London - The AB Testing Hype Cycle

@OptimiseOrDie

7. You Stop When you Hit 95% Confidence

Page 54: eMetrics London - The AB Testing Hype Cycle

The 95% Stopping Problem

@OptimiseOrDie

• Many people use 95, 99% ‘confidence’ to stop
• This value is unreliable and moves around
• Nearly all my tests reach significance before they are actually ready
• You can hit 95% early in a test (18 minutes!)
• If you stop, it could be a false result
• Read this Nature article: bit.ly/1dwk0if
• Optimizely and VWO have updated their tools
• This 95% thingy – must be LAST on your stop list

Page 55: eMetrics London - The AB Testing Hype Cycle

The 95% Stopping Problem

                       | Scenario 1    | Scenario 2    | Scenario 3    | Scenario 4
After 200 observations | Insignificant | Insignificant | Significant!  | Significant!
After 500 observations | Insignificant | Significant!  | Insignificant | Significant!
End of experiment      | Insignificant | Significant!  | Insignificant | Significant!

“You should know that stopping a test once it’s significant is deadly sin number 1 in A/B testing land. 77% of A/A tests (testing the same thing as A and B) will reach significance at a certain point.”
Ton Wesseling, Online Dialogue
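The stopping problem can be seen in a small A/A simulation: both arms share the same true conversion rate, so every "significant" result is a false positive. The sketch below compares stopping at the first significant peek with analysing once at the planned end. All parameters (5% conversion rate, 2,000 visitors per arm, a peek every 50 visitors, 300 runs) and the use of a pooled two-proportion z-test at |z| > 1.96 are assumptions chosen for illustration; the sketch does not attempt to reproduce the 77% figure quoted above.

```python
import math
import random


def z_score(conv_a, n_a, conv_b, n_b):
    """Pooled two-proportion z statistic; 0.0 when there is no variance."""
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        return 0.0
    return (conv_a / n_a - conv_b / n_b) / se


def simulate_aa(rate, visitors_per_arm, peek_every, rng):
    """Run one A/A test; report significance at any peek and at the end."""
    conv_a = conv_b = 0
    ever_significant = False
    for n in range(1, visitors_per_arm + 1):
        conv_a += rng.random() < rate
        conv_b += rng.random() < rate
        if n % peek_every == 0 and abs(z_score(conv_a, n, conv_b, n)) > 1.96:
            ever_significant = True
    final = abs(z_score(conv_a, visitors_per_arm, conv_b, visitors_per_arm)) > 1.96
    return ever_significant, final


def false_positive_rates(runs=300, rate=0.05, visitors_per_arm=2000,
                         peek_every=50, seed=1):
    """Share of A/A runs called significant when peeking vs. at the end only."""
    rng = random.Random(seed)
    results = [simulate_aa(rate, visitors_per_arm, peek_every, rng)
               for _ in range(runs)]
    peeking = sum(ever for ever, _ in results) / runs
    fixed = sum(final for _, final in results) / runs
    return peeking, fixed


peeking_rate, fixed_rate = false_positive_rates()
print(f"significant at some peek: {peeking_rate:.0%}")
print(f"significant at the end:   {fixed_rate:.0%}")
```

The more often the test is checked, the more chances random noise has to cross the threshold once, which is why the confidence figure belongs last on the stop list.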

Page 56: eMetrics London - The AB Testing Hype Cycle

7. Know How Long to Test for…

• TWO BUSINESS CYCLES minimum (week/month)
• 1 PURCHASE CYCLE minimum (or most of one)
• 250 CONVERSIONS minimum per creative
• 350, 500, more if creative response is similar
• FULL WEEKS/CYCLES never part of one
• KNOW what marketing, competitors and cycles are doing
• RUN a test length calculator - bit.ly/XqCxuu
• SET your test run time, RUN IT, STOP IT, ANALYSE IT
• ONLY RUN LONGER if sample is smaller than expected
• DON’T RUN LONGER just because the test isn’t giving the result you want!
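The rules of thumb above (two full business cycles, 250 conversions per creative, whole cycles only) can be combined into a quick run-time estimate to set before launch. This is a hedged sketch rather than a replacement for a proper test length calculator: the function name `planned_test_days` is hypothetical, and the default cycle length of 7 days assumes a weekly business cycle.

```python
import math


def planned_test_days(daily_conversions_per_creative, cycle_days=7,
                      min_cycles=2, min_conversions=250):
    """Days to run: enough conversions, at least two cycles, whole cycles only."""
    if daily_conversions_per_creative <= 0:
        raise ValueError("daily conversions per creative must be positive")
    # Days needed to collect the minimum conversions for each creative.
    days_for_sample = math.ceil(min_conversions / daily_conversions_per_creative)
    # Round up to whole cycles and never go below the minimum cycle count.
    cycles = max(min_cycles, math.ceil(days_for_sample / cycle_days))
    return cycles * cycle_days


for daily in (100, 30, 10):
    print(f"{daily} conversions/day per creative -> {planned_test_days(daily)} days")
```

Raising `min_conversions` to 350 or 500 covers the case where the creatives respond similarly. The purchase cycle and whatever marketing is running at the time still need a manual check.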

Page 57: eMetrics London - The AB Testing Hype Cycle

@OptimiseOrDie

7 Know How Long to Test for

• Most critical mistake
• Use a test calculator
• Full business cycles, 2 minimum
• Don’t waste time hoping

Page 58: eMetrics London - The AB Testing Hype Cycle

8. So you think you have a Hypothesis?

Insight - Inputs
#FAIL

• Competitor copying
• Guessing
• Dice rolling
• An article the CEO read
• Competitor change
• Panic
• Ego
• Opinion
• Cherished notions
• Marketing whims
• Cosmic rays
• Not ‘on brand’ enough
• IT inflexibility
• Internal company needs
• Some dumbass consultant
• Shiny feature blindness
• Knee jerk reactions

Page 59: eMetrics London - The AB Testing Hype Cycle

8. So you think you have a Hypothesis?

Insight - Inputs
Insight

• Segmentation
• Surveys
• Sales and Call Centre
• Session Replay
• Social analytics
• Customer contact
• Eye tracking
• Usability testing
• Forms analytics
• Search analytics
• Voice of Customer
• Market research
• A/B and MVT testing
• Big & unstructured data
• Web analytics
• Competitor evals
• Customer services

Page 60: eMetrics London - The AB Testing Hype Cycle

@OptimiseOrDie

8. Use this to deflect stupid testing!

1. Because we saw (data/feedback)
2. We expect that (change) will cause (impact)
3. We’ll measure this using (data metric)

bit.ly/hyp_kit

Page 61: eMetrics London - The AB Testing Hype Cycle

@OptimiseOrDie

8. Let’s try a real one

1. Because we saw (an angry email from the CEO)
2. We expect that (changing button colours) will cause (the office to cool down for a day)
3. We’ll measure this using (some metric we pluck out of the air – whatever, man)

bit.ly/hyp_kit

Page 62: eMetrics London - The AB Testing Hype Cycle

@OptimiseOrDie

8 Get a Proper Hypothesis Going

• Don’t do Ego driven testing
• Use the Hypothesis Kit!

bit.ly/hyp_kit

Page 63: eMetrics London - The AB Testing Hype Cycle

@OptimiseOrDie

9. Our Testing teaches us Nothing!

• Either your research or hypothesis is weak
• Work back from the outcome!
  – What if A won – what would that tell us?
  – What if A failed – what would that tell us?
• What is the value to the business in finding out the answer?
• Is the finding actionable widely and deeply?
• Testing isn’t about lifts – it’s about learning

Page 64: eMetrics London - The AB Testing Hype Cycle

@OptimiseOrDie

9. Our Testing teaches us Nothing!

“You are trying to run a bundle of tests, whose expected additional information will give you the highest return.”
Matt Gershoff, CEO, Conductrics.com

Page 65: eMetrics London - The AB Testing Hype Cycle

@OptimiseOrDie

9 Design Tests for Maximum Learning

• Do your research
• Form a solid hypothesis
• Work back from the outcomes
• Learning useful stuff = huge lifts

Page 66: eMetrics London - The AB Testing Hype Cycle

@OptimiseOrDie

10. Burn Down the Silos

• Non agile, non iterative design
• Silos work on product separately
• No ‘One Team’ per product/theme
• Large teams, unwieldy coordination
• Pass the product around
• More PMs and BAs than a conference
• Endless sucking signoff
• AB testing done the same way!

Page 67: eMetrics London - The AB Testing Hype Cycle

@OptimiseOrDie

10. FT Example

• Small teams (6-15) with direct access to publish
• Ability to set and get metrics data directly
• Tools, Autonomy, Lack of interference
• No Project Managers or Business Analysts
• Business defines ‘outcomes’ – teams deliver
• No long signoff chain
• No pesky meddling fools
• 18 Month projects over budget?

Page 68: eMetrics London - The AB Testing Hype Cycle

@OptimiseOrDie

10. FT Example

• 100s of releases a day!
• MVP approach
• Launch as alpha, beta, pilot, phased rollout
• Like getting in a shower
• Read more at labs.ft.com

Page 69: eMetrics London - The AB Testing Hype Cycle

@OptimiseOrDie

10. Positive Attributes

• Rapid, Iterative, User Centred & Agile Design. No Silos.
• Small empowered autonomous teams
• Polymaths and Overlap
• Toolkit & Analytics investment
• Persuasive copywriting & Psychology
• Great Testing & Optimisation Tools

Page 70: eMetrics London - The AB Testing Hype Cycle

@OptimiseOrDie

10 Burn Down the Silos!

• Agile, Lean, Iterative x silo teams
• Ability to get and set metrics
• Autonomy, Control, Velocity
• Iterative MVP approach
• Work on outcomes, not features

Page 71: eMetrics London - The AB Testing Hype Cycle

“If you think of technology as something that’s spreading like a sort of fractal stain, almost every point on the edge represents an interesting problem.”
Paul Graham

Page 72: eMetrics London - The AB Testing Hype Cycle

[Chart: ROI plotted against Time]

Page 73: eMetrics London - The AB Testing Hype Cycle

Rumsfeldian Space

• What if we changed our prices?
• What if we gave away less for free?
• What if we took this away?
• What about 3 packages, not 5?
• What are these potential futures I can take?
• How can I know before I spend money?
• McDonalds Hipster Test Store

bit.ly/1TiURi7

@OptimiseOrDie

Page 74: eMetrics London - The AB Testing Hype Cycle

Congratulations!

Today you’re the lucky winner of our random awards programme.

You get all these extra features for free, on

us.

Enjoy!

Innovation Testing

@OptimiseOrDie

Page 75: eMetrics London - The AB Testing Hype Cycle

@OptimiseOrDie

Page 76: eMetrics London - The AB Testing Hype Cycle

2004 Headspace

What I thought I knew in 2004

Reality

Page 77: eMetrics London - The AB Testing Hype Cycle

2015 Headspace

What I KNOW I know

Me, on a good day

Page 78: eMetrics London - The AB Testing Hype Cycle

WE’RE ALL WINGING IT

Page 79: eMetrics London - The AB Testing Hype Cycle

Guessaholics Anonymous

Page 80: eMetrics London - The AB Testing Hype Cycle

1 Tool Installed
2 Stupid testing
3 Scaled up Stupidity
4 Peak of Stupidity
5 ROI questioned
6 Statistics debunked
7 Faith crisis
8 The Trough of Testing
9 Where, How, Why
10 Data science
11 Testing to learn
12 Innovation Testing

Curve labels:
Slope of Stupidity ------>
Slide of moodiness ------>
Hacking Business Futures ------>

@OptimiseOrDie

Page 81: eMetrics London - The AB Testing Hype Cycle

Thank You!

Email me [email protected]
http://bit.ly/em2015
Linkedin linkd.in/pvrg14