Advanced A/B Testing at Wix - Aviran Mordo and Sagy Rozman, Wix.com

Experimenting on Humans

Aviran Mordo Head of Back-end Engineering

@aviranm

www.linkedin.com/in/aviran

www.aviransplace.com

Sagy Rozman Back-end Guild master

www.linkedin.com/in/sagyrozman

@sagyrozman

Wix In Numbers

•  Over 55M users + 1M new users/month
•  Static storage is >1.5 PB of data
•  3 data centers + 3 clouds (Google, Amazon, Azure)
•  1.5B HTTP requests/day
•  900 people work at Wix, of which ~300 are in R&D

1542 (A/B Tests in 3 months)

•  Basic A/B testing

•  Experiment driven development

•  PETRI – Wix’s 3rd generation open source experiment system

•  Challenges and best practices

•  How to (code samples)

Agenda

A/B Test

To B or NOT to B?

A

B

Home page results (How many registered)

Experiment Driven Development

This is the Wix editor

Our gallery manager What can we improve?

Is this better?

Don’t be a loser

Product Experiments Toggles & Reporting

Infrastructure

How do you know what is running?

If I “know” it is better, do I really need to test it?

Why so many?

Sign-up Choose Template Edit site Publish Premium

The theory

Result = Fail

Intent matters

•  EVERY new feature is A/B tested

•  We open the new feature to a % of users
   ○  Measure success
   ○  If it is better, we keep it
   ○  If worse, we check why and improve

•  If flawed, the impact is just for % of our users

Conclusion

Start with 50% / 50% ?

• New code can have bugs

• Conversion can drop

• Usage can drop

• Unexpected cross test dependencies

Sh*t happens (Test could fail)

•  Language

•  GEO

•  Browser

•  User-agent

•  OS

Minimize affected users (in case of failure) Gradual exposure (percentage of…)

•  Company employees

•  User roles

•  Any other criteria you have (extendable)

•  All users

• First time visitors = Never visited wix.com

• New registered users = Untainted users

Not all users are equal
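
The filters listed above can be thought of as predicates that must all accept a visitor before the experiment is conducted for them. The following is a minimal sketch of that idea, not PETRI's actual filter API: the `Visitor` class, its fields, and the `EligibilityFilters` helper are hypothetical names introduced for illustration only.

```java
import java.util.List;
import java.util.Set;
import java.util.function.Predicate;

// Hypothetical request context; the fields are illustrative, not PETRI's model.
final class Visitor {
    final String language;
    final String country;
    final boolean companyEmployee;
    final boolean newlyRegistered;

    Visitor(String language, String country, boolean companyEmployee, boolean newlyRegistered) {
        this.language = language;
        this.country = country;
        this.companyEmployee = companyEmployee;
        this.newlyRegistered = newlyRegistered;
    }
}

// Hypothetical helper: each filter is a predicate over the visitor.
final class EligibilityFilters {
    private EligibilityFilters() {}

    static Predicate<Visitor> language(Set<String> allowed) {
        return v -> allowed.contains(v.language);
    }

    static Predicate<Visitor> geo(Set<String> countries) {
        return v -> countries.contains(v.country);
    }

    static Predicate<Visitor> companyEmployeesOnly() {
        return v -> v.companyEmployee;
    }

    static Predicate<Visitor> newRegisteredUsersOnly() {
        return v -> v.newlyRegistered;
    }

    // A visitor is eligible only if every configured filter accepts them.
    // An empty filter list corresponds to "all users".
    static boolean isEligible(Visitor visitor, List<Predicate<Visitor>> filters) {
        return filters.stream().allMatch(f -> f.test(visitor));
    }
}
```

Visitors who fail any filter would simply get the old experience, which keeps the affected population small if the test fails.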

We need that feature

…and failure is not an option

Defensive Testing

Adding a mobile view

First trial failed

Performance had to be improved

Halting the test results in loss of data. What can we do about it?

Solution – Pause the experiment!
•  Maintain NEW experience for already exposed users
•  No additional users will be exposed to the NEW feature

PETRI’s pause implementation

• Use cookies to persist assignment
   ○  If the user changes browser, the assignment is unknown

• Server-side persistence solves this
   ○  You pay in performance & scalability
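
The pause rule can be sketched as follows. This is an illustration of the behaviour described above, not PETRI's implementation: the `PausableExperiment` class, the group names "old" and "new", and the `bucket` parameter are assumptions made for the example. The persisted group stands in for whatever was read from the cookie or from server-side storage.

```java
import java.util.Optional;

// Hypothetical model of a pausable experiment (not PETRI's actual classes).
final class PausableExperiment {
    private final int exposurePercent; // share of new users sent to "new", 0..100
    private volatile boolean paused;

    PausableExperiment(int exposurePercent) {
        this.exposurePercent = exposurePercent;
    }

    void pause() { paused = true; }

    void resume() { paused = false; }

    /**
     * @param persistedGroup group stored for this user earlier (cookie or server side), if any
     * @param bucket         a value in [0, 100) drawn for this user, random or hash based
     */
    String assign(Optional<String> persistedGroup, int bucket) {
        // Already exposed users keep the experience they were given, paused or not.
        if (persistedGroup.isPresent()) {
            return persistedGroup.get();
        }
        // While paused, no additional users enter the test group.
        // Assumption: the caller does not persist this result, so the user
        // can still be tossed once the experiment resumes.
        if (paused) {
            return "old";
        }
        return bucket < exposurePercent ? "new" : "old";
    }
}
```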

Decision

Keep feature Drop feature

Improve code & resume experiment

Keep backwards compatibility for exposed users forever?

Migrate users to another equivalent feature

Drop it altogether (users lose data/work)

The road to success

•  Numbers look good but sample size is small

•  We need more data!

•  Expand

Reaching statistical significance

Test Group (B)       25%   50%   75%   100%
Control Group (A)    75%   50%   25%     0%

Keep user experience consistent

Control Group (A)

Test Group (B)

•  Signed-in user (Editor)
   ○  Test group assignment is determined by the user ID
   ○  Guarantees toss persistency across browsers

•  Anonymous user (Home page)
   ○  Test group assignment is randomly determined
   ○  Cannot guarantee a persistent experience if the user changes browser

•  11% of Wix users use more than one desktop browser

Keeping persistent UX
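
One common way to get an assignment that depends only on the user ID, and that stays consistent while the test group is expanded, is to hash the ID into a fixed bucket and compare it with the current exposure percentage. The sketch below illustrates that technique under stated assumptions. It is not taken from PETRI: the `StableAssignment` class, the use of CRC32, and the group names "A" and "B" are choices made for this example.

```java
import java.nio.charset.StandardCharsets;
import java.util.UUID;
import java.util.zip.CRC32;

// Hypothetical helper (not PETRI's code): deterministic, user-ID-based toss.
final class StableAssignment {
    private StableAssignment() {}

    // Maps (experiment, user) to a bucket in [0, 100). The same inputs always
    // give the same bucket, regardless of browser or device.
    static int bucket(String experimentKey, UUID userId) {
        CRC32 crc = new CRC32();
        crc.update((experimentKey + ":" + userId).getBytes(StandardCharsets.UTF_8));
        return (int) (crc.getValue() % 100);
    }

    // Users whose bucket falls below the exposure percentage get group B.
    // Raising the percentage only adds users to B; nobody already in B is
    // moved back to A, so the experience stays consistent during expansion.
    static String group(String experimentKey, UUID userId, int testPercent) {
        return bucket(experimentKey, userId) < testPercent ? "B" : "A";
    }
}
```

Including the experiment key in the hash is a design choice that keeps tosses for different experiments independent of each other. For anonymous users there is no stable ID, so a scheme like this cannot help across browsers.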

There is MORE than one

# of active experiments    Possible # of states
10                         1,024
20                         1,048,576
30                         1,073,741,824

Possible states >= 2^(# experiments)

Wix has ~200 active experiments: 2^200 ≈ 1.606938e+60 possible states
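
The figures above follow from treating every experiment as having at least two groups, so n independent experiments give at least 2^n combinations. A small worked sketch of that arithmetic, with the hypothetical helper class `ExperimentStates`:

```java
import java.math.BigDecimal;
import java.math.BigInteger;
import java.math.MathContext;

// Hypothetical helper used only to reproduce the arithmetic on the slide.
final class ExperimentStates {
    private ExperimentStates() {}

    // Lower bound on the number of states: two groups per experiment.
    static BigInteger minStates(int experiments) {
        return BigInteger.valueOf(2).pow(experiments);
    }

    // Rounds to 7 significant digits for display in scientific notation.
    static String scientific(BigInteger n) {
        return new BigDecimal(n).round(new MathContext(7)).toString();
    }
}
```

Here `ExperimentStates.minStates(10)` is 1024, and `ExperimentStates.scientific(ExperimentStates.minStates(200))` yields `1.606938E+60`. Experiments with more than two test groups push the real number higher, which is why the bound is written with >=.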

A/B testing introduces complexity

•  Override options (URL parameters, cookies, headers…)
•  Near real time user BI tools

•  Integrated developer tools in the product

Support tools
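
An override lets a developer or QA engineer force a specific test group regardless of the toss. The sketch below shows one possible resolution order (URL parameter first, then header, then cookie). The order, the `ExperimentOverride` class, and the key name `experiment-override` are all assumptions for illustration. PETRI's real parameter names and precedence are not given in these slides.

```java
import java.util.Map;
import java.util.Objects;
import java.util.stream.Stream;

// Hypothetical override resolver (not PETRI's API).
final class ExperimentOverride {
    // Illustrative key name, assumed to be the same in all three sources.
    static final String KEY = "experiment-override";

    private ExperimentOverride() {}

    // Returns the first override found, or the normally assigned group if
    // no source carries an override.
    static String effectiveGroup(String assignedGroup,
                                 Map<String, String> urlParams,
                                 Map<String, String> headers,
                                 Map<String, String> cookies) {
        return Stream.of(urlParams, headers, cookies)
                .map(source -> source.get(KEY))
                .filter(Objects::nonNull)
                .findFirst()
                .orElse(assignedGroup);
    }
}
```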

Define

Code

Experiment Expand

Merge code

Close

•  Spec = Experiment template (in the code)
   ○  Define test groups
   ○  Mandatory limitations (filters, user types)
   ○  Scope = Group of related experiments (usually by product)

•  Why is it needed
   ○  Type safety
   ○  Preventing human errors (typos, user types)
   ○  Controlled by the developer (developer knows about the context)
   ○  Conducting experiments in batch

Define spec

public class ExampleSpecDefinition extends SpecDefinition {

    @Override
    protected ExperimentSpecBuilder customize(ExperimentSpecBuilder builder) {
        return builder
                .withOwner("OWNERS_EMAIL_ADDRESS")
                .withScopes(aScopeDefinitionForAllUserTypes("SOME_SCOPE"))
                .withTestGroups(asList("Group A", "Group B"));
    }
}

Spec code snippet

•  Experiment = “If” statement in the code

Conducting experiment

final String result = laboratory.conductExperiment(key, fallback, new StringConverter());

if (result.equals("group a")) {
    // execute group a's logic
} else if (result.equals("group b")) {
    // execute group b's logic
}
// in case conducting the experiment failed - the fallback value is returned
// in this case you would usually execute the 'old' logic

•  Upload the specs to the Petri server
   ○  Enables you to define an experiment instance

Upload spec

{
  "creationDate" : "2014-01-09T13:11:26.846Z",
  "updateDate" : "2014-01-09T13:11:26.846Z",
  "scopes" : [
    { "name" : "html-editor", "onlyForLoggedInUsers" : true },
    { "name" : "html-viewer", "onlyForLoggedInUsers" : false }
  ],
  "testGroups" : [ "old", "new" ],
  "persistent" : true,
  "key" : "clientExperimentFullFlow1",
  "owner" : ""
}

Start new experiment (limited population)

Manage experiment states

1.  Convert A/B Test to Feature Toggle (100% ON)

2.  Merge the code

3.  Close the experiment

4.  Remove experiment instance

Ending successful experiment

• Define spec

• Use Petri client to conduct experiment in the code (defaults to old)

• Sync spec

• Open experiment

• Manage experiment state

• End experiment

Experiment lifecycle

Petri is more than just an A/B test framework

Feature toggle

A/B Test

Personalization

Internal testing

Continuous deployment

Jira integration

Experiments

Dynamic configuration

QA

Automated testing

•  Expose features internally to company employees
•  Enable continuous deployment with feature toggles
•  Select assignment by sites (not only by users)
•  Automatic selection of winning group*
•  Exposing feature to #n of users*
•  Integration with Jira

* Planned feature

Other things we (will) do with Petri

Petri is now an open source project https://github.com/wix/petri

Q&A

https://github.com/wix/petri http://goo.gl/L7pHnd

Credits
http://upload.wikimedia.org/wikipedia/commons/b/b2/Fiber_optics_testing.jpg
http://goo.gl/nEiepT
https://www.flickr.com/photos/ilo_oli/2421536836
https://www.flickr.com/photos/dexxus/5791228117
http://goo.gl/SdeJ0o
https://www.flickr.com/photos/112923805@N05/15005456062
https://www.flickr.com/photos/wiertz/8537791164
https://www.flickr.com/photos/laenulfean/5943132296
https://www.flickr.com/photos/torek/3470257377
https://www.flickr.com/photos/i5design/5393934753
https://www.flickr.com/photos/argonavigo/5320119828

•  Modeled experiment lifecycle

•  Open source (developed using TDD from day 1)

•  Running at scale in production

•  No deployment necessary

•  Both back-end and front-end experiments

•  Flexible architecture

Why Petri

[Architecture diagram. Components: PETRI Server, Your app, Laboratory, DB, Logs]