Page 1: Expedia 3x3 presentation

3x3: Speeding Up Mobile Releases

Drew Hannay

Keqiu Hu

Jingjing Sun

Page 2: Expedia 3x3 presentation

Project Voyager

● New version of flagship LinkedIn app

● 250+ committers across Android & iOS

● ~1 year of development

● Investment in mobile infrastructure at LinkedIn

Page 3: Expedia 3x3 presentation

Before Voyager

● 12 releases per year

● RC build + manual regression suite

● Mad dash to commit code before the RC cutoff

○ Missing the cutoff meant a long wait for the next release

● Product & marketing plans were made around the monthly releases

● Hard to iterate on member feedback

Page 4: Expedia 3x3 presentation

3x3

Release three times per day, with no more than three hours from code commit to member availability

Page 5: Expedia 3x3 presentation

Why three hours?

● Not enough time for manual testing steps

● Not enough time to test everything

○ The goal isn’t 100% automation, it’s faster iterations

○ We don’t want engineers spending most of their time maintaining tests that break whenever a design changes

● UI tests are chosen based on production-critical business flows

● Faster iteration helps emphasize craftsmanship

○ Devs can take the extra time to write quality code since the next release is soon

Page 6: Expedia 3x3 presentation

Commit Pipeline

Feature Development

Code Review

Static Analysis

Unit Tests

Build Release Artifacts

UI Tests

Alpha Release

Beta Release

Production Release

Page 8: Expedia 3x3 presentation

Static analysis

● Compile-time contract with API server using Rest.li

○ Rest.li data templates are shared between API server & clients

○ Provides static analysis checks that guarantee backwards compatibility

○ Client models are code generated for further safety

● Java Checkstyle

● Android Lint

○ Over 200 checks provided by Google

○ Several custom checks written for LinkedIn-specific patterns

● Swift Lint

○ Forked version of Realm’s SwiftLint

○ Added custom checks for LinkedIn patterns

Page 9: Expedia 3x3 presentation

Building the code

● Over 500k lines of code between Android & iOS

● Building production binaries for a large codebase is slow

● iOS & Swift

○ At one point early on, Swift compilation took over two hours

○ Refactoring into sub-projects and modules led to a more than 50% speedup

● Android Split APKs

○ Separate binary for each combination of screen density and CPU architecture

● Distributed builds

○ Build the release binaries on separate machines while tests are running

○ Same strategy could be used for running automated tests

Page 10: Expedia 3x3 presentation

What do we test?

● Unit tests

● Layout tests

○ Unit tests for views

○ Stress test views with long strings, short strings

○ Make sure views don’t overlap, and render properly in right-to-left mode

● Scenario tests

○ Validate that key business metric flows are working properly

○ Usually flows that span multiple screens in the app

○ App gets mock data from a local fixture server

○ Not an exhaustive suite

● Live tests (experimental)
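
The "views don't overlap" check in the layout tests comes down to a rectangle-intersection test on the frames a view hierarchy produces after layout. The sketch below is a minimal illustration of that idea, not the layout testing library from the talk; `Frame`, `frames_overlap`, and `overlapping_views` are hypothetical names, and the frames are assumed to come from laying out a view with long or short strings.

```python
from dataclasses import dataclass
from itertools import combinations
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Frame:
    """Axis-aligned rectangle for a laid-out view (hypothetical type)."""
    left: float
    top: float
    width: float
    height: float

    @property
    def right(self) -> float:
        return self.left + self.width

    @property
    def bottom(self) -> float:
        return self.top + self.height


def frames_overlap(a: Frame, b: Frame) -> bool:
    """True if the rectangles share interior area; touching edges do not count."""
    return (a.left < b.right and b.left < a.right
            and a.top < b.bottom and b.top < a.bottom)


def overlapping_views(frames: Dict[str, Frame]) -> List[Tuple[str, str]]:
    """Return every pair of view names whose frames overlap, in name order."""
    ordered = sorted(frames.items(), key=lambda item: item[0])
    return [(name_a, name_b)
            for (name_a, frame_a), (name_b, frame_b) in combinations(ordered, 2)
            if frames_overlap(frame_a, frame_b)]
```

A layout test could run this once per input (long string, short string, right-to-left) and fail if the returned list is not empty.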

Page 11: Expedia 3x3 presentation

Test stability

● Testing infrastructure stability

● Tooling stability

Page 12: Expedia 3x3 presentation

Testing infrastructure stability

● Stabilize testing environment – Hermetic Testing

Page 13: Expedia 3x3 presentation

Testing infrastructure stability

● Stabilize testing environment

● Stabilize testing framework

○ Added reliable wait

○ Fixed unreliable APIs

Page 14: Expedia 3x3 presentation

Testing infrastructure stability

● Stabilize testing environment

● Stabilize testing framework

● Sanitize test suite

Page 15: Expedia 3x3 presentation

Do some Math

(99.9%)^1000 = ?

99% 95% 90% 80% 50%

Page 16: Expedia 3x3 presentation

Do some Math

(99.9%)^1000 = ?

99% 95% 90% 80% 50%

36.7%!
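
The arithmetic on this slide can be checked directly. As a minimal sketch, assume each of the 1000 tests passes independently with probability 99.9% (the independence assumption is ours; the slide does not state it). The chance that the whole suite passes is then the per-test probability raised to the number of tests. The function name `suite_pass_probability` is illustrative.

```python
def suite_pass_probability(per_test_reliability: float, num_tests: int) -> float:
    """Probability that every test passes, assuming independent failures."""
    if not 0.0 <= per_test_reliability <= 1.0:
        raise ValueError("per_test_reliability must be between 0 and 1")
    if num_tests < 0:
        raise ValueError("num_tests must not be negative")
    return per_test_reliability ** num_tests


if __name__ == "__main__":
    # 1000 tests that are each 99.9% reliable
    print(f"{suite_pass_probability(0.999, 1000):.2%}")
```

The result is a little under 37%, in line with the slide: even very reliable individual tests produce a suite that fails most of the time once there are enough of them.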

Page 17: Expedia 3x3 presentation

36.7% Reliability

● Test Quality == Production Quality

● Lost confidence in tests

● Developer unhappiness

● Tests should be used to prevent regressions, not to block development

Page 18: Expedia 3x3 presentation

36.7% Reliability

● Test Quality == Production Quality

● Lost confidence in tests

● Developer unhappiness

● Tests should be used to prevent regressions, not to block development

Flaky Tests Are Worse Than No Tests

Page 19: Expedia 3x3 presentation

Trunk Guardian

● Detect & disable flaky tests
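
The slides do not describe how Trunk Guardian decides that a test is flaky. One common approach, sketched here purely as an assumption, is to rerun a test several times against unchanged code and flag it when the outcomes disagree. The names `classify_test` and `find_flaky_tests` are hypothetical, and each test is modeled as a callable that returns `True` on a pass.

```python
from typing import Callable, Dict, List


def classify_test(run_test: Callable[[], bool], reruns: int = 10) -> str:
    """Run one test repeatedly on the same code and classify the outcome.

    Returns "stable-pass", "stable-fail", or "flaky" (mixed results).
    """
    if reruns < 2:
        raise ValueError("need at least two runs to detect disagreement")
    results = [bool(run_test()) for _ in range(reruns)]
    if all(results):
        return "stable-pass"
    if not any(results):
        return "stable-fail"
    return "flaky"


def find_flaky_tests(tests: Dict[str, Callable[[], bool]], reruns: int = 10) -> List[str]:
    """Names of tests whose repeated runs disagree; candidates for disabling."""
    return sorted(name for name, run in tests.items()
                  if classify_test(run, reruns) == "flaky")
```

A test that fails every time is reported as a real failure rather than as flaky, which keeps genuine regressions from being disabled along with the noise.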

Page 20: Expedia 3x3 presentation

Tooling Stability

● Hardware Stability

● Build Environment Stability

● Parallelized Testing Stability

Page 21: Expedia 3x3 presentation
Page 22: Expedia 3x3 presentation
Page 23: Expedia 3x3 presentation

Pool Guardian

Page 24: Expedia 3x3 presentation

Partner teams

● Historically, several partner teams validated the build before a release

● For example, we needed sign off from the localization team

● Lint checks catch hardcoded or improperly formatted strings

● Layout tests catch strings that are too long and RTL layout bugs

● Semantic correctness of translations is still validated by translators manually
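
To make the "lint checks catch hardcoded strings" point concrete, here is a minimal sketch of such a check for Android layout XML. It is not Android Lint's implementation and not one of LinkedIn's custom checks, which the slides do not show; it only illustrates the idea that user-visible text should be a resource reference such as `@string/...` rather than a literal. The function name and the list of attributes checked are assumptions.

```python
import xml.etree.ElementTree as ET
from typing import List, Tuple

ANDROID_NS = "http://schemas.android.com/apk/res/android"
# Attributes assumed to hold user-visible text (illustrative, not exhaustive).
TEXT_ATTRIBUTES = ("text", "hint", "contentDescription")


def find_hardcoded_strings(layout_xml: str) -> List[Tuple[str, str, str]]:
    """Return (tag, attribute, value) for each literal user-visible string."""
    findings = []
    root = ET.fromstring(layout_xml)
    for element in root.iter():
        for attr in TEXT_ATTRIBUTES:
            value = element.get(f"{{{ANDROID_NS}}}{attr}")
            if not value:
                continue
            # Values starting with "@" or "?" are resource or theme references,
            # so they go through the normal localization path.
            if value.startswith(("@", "?")):
                continue
            findings.append((element.tag, attr, value))
    return findings
```

A check like this can run on every commit, which is what lets a pipeline drop the manual localization sign-off for this class of bug.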

Page 25: Expedia 3x3 presentation

Getting to members

● Every three hours, internal alpha testers get a new build

○ Mainly members of the Flagship team

○ Product managers, devs, and execs who want to see the latest code ASAP

● Every week, the rest of the company gets a new beta build

○ iOS build is submitted to Apple for review

● After a week of beta, the build is promoted to production

○ Assuming Apple’s review is complete, iOS is released

○ Take advantage of Google Play staged rollout for Android

Page 26: Expedia 3x3 presentation

Dogfooding

● Android: Google Play alpha/beta channel

○ Easy upgrades for employees, even while off the corporate network

○ Somewhat difficult to get set up, but easy once registered

● iOS: TestFlight

○ Nice, but limited number of users

● iOS: Custom enterprise distribution

○ Scales to our number of users, but employees must be on corporate wifi to upgrade

● Splash screen in the production app encourages employees to use beta builds

Page 27: Expedia 3x3 presentation

Minimizing risk & enabling experiments

● Take advantage of LinkedIn’s existing A/B testing infrastructure

● New features are developed behind feature flags

○ Code can be ramped dynamically to different groups of members

○ Performance of new features or changes can be monitored

● Dynamic configuration

● Server-controlled kill switch

○ Crashing or buggy code can often be disabled without a new build
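
A feature flag with a ramp percentage and a kill switch can be sketched in a few lines. This is a minimal illustration under our own assumptions, not LinkedIn's A/B testing infrastructure: `FeatureFlag`, `ramp_percent`, and `killed` are hypothetical names, and the values are assumed to arrive from a server-side configuration.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class FeatureFlag:
    name: str
    ramp_percent: int = 0   # share of members who see the feature, 0-100
    killed: bool = False    # server-controlled kill switch

    def is_enabled(self, member_id: str) -> bool:
        """Decide whether this member sees the feature."""
        if self.killed:
            return False  # the kill switch overrides any ramp
        # Hash flag name + member ID so each member lands in a stable bucket
        # and different flags ramp to different subsets of members.
        digest = hashlib.sha256(f"{self.name}:{member_id}".encode("utf-8")).digest()
        bucket = int.from_bytes(digest[:4], "big") % 100
        return bucket < self.ramp_percent
```

Because the bucket is derived from a hash rather than a random draw, a member keeps the same experience across sessions, and raising `ramp_percent` only ever adds members to the enabled group.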

Page 28: Expedia 3x3 presentation

Android 3x3

Page 29: Expedia 3x3 presentation

Consistent Environments: Android Devices

● Android emulators, Genymotion, physical devices

● Practically infinite number of screen sizes

● Different configs: RAM, heap size, hardware features…

● Tests passed locally, but failed on the build server or other dev machines

● Developers were unhappy

Page 30: Expedia 3x3 presentation

Consistent Environments: Enter Gradle

● Script to start emulators was deployed to each build machine

○ Led to bugs where incorrectly provisioned machines caused random build failures

● Creating a new emulator from scratch every time = SLOW

● Only capable of running one emulator at a time

● Already using Gradle to get consistent builds across machines

● Why should tests be different?

Page 31: Expedia 3x3 presentation

Consistent Environments: Gradle Solution

● Create a standalone bundle

○ Download fresh system image

○ Create sdcard image

○ Run the emulator once to create all user files

○ Bundle system + user files into a tar that can be extracted and run without dependencies

● Custom Gradle plugin

○ Extracts & starts emulators

○ Manages running up to 16 emulators in parallel on one build machine

Page 32: Expedia 3x3 presentation

Test Stability

● Layout & Scenario tests use Google’s Espresso test utility

○ Optimally fast using IdlingResources

● Android testing lifecycle

○ Start the app

○ Run all tests

○ Stop the app

● Tests were unstable due to implicit dependencies

○ Application level objects (like memory cache)

○ Data saved to disk (SharedPreferences, disk cache)

● Tests didn’t always clean up after themselves, and trying to fix it was a losing battle

Page 33: Expedia 3x3 presentation

Test Stability

● What if we changed the lifecycle?

○ Start the app

○ Run one test

○ Stop the app & clear package data

○ Repeat (x3000)

● Super stable! And super slow :(

Page 34: Expedia 3x3 presentation

Test Stability: Custom Test Harness

● Custom annotation processor that computes the list of all tests to run

● Construct a queue of “Test” objects -> (test method, device, locale, …)

● Start up to 16 device threads which poll the queue for a test to run

○ Much faster than static sharding, since all devices are always busy running tests

● Output a custom html + junit test report

○ Includes logcat data for each test

○ Includes screenshots for failing tests

● Runs 4500+ tests in < 14 minutes on one build machine
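
The queue-polling design above can be sketched as a shared work queue drained by one worker thread per device. This is a simplified illustration, not the actual harness: tests are modeled as plain callables that return `True` on a pass, and `run_on_device_pool` is a hypothetical name. The point is the scheduling: a device that finishes early pulls the next test instead of sitting idle, which is why this beats splitting the list into fixed shards up front.

```python
import queue
import threading
from typing import Callable, Dict, List, Tuple

TestCase = Tuple[str, Callable[[], bool]]  # (test name, callable returning pass/fail)


def run_on_device_pool(tests: List[TestCase], num_devices: int = 16) -> Dict[str, Tuple[int, bool]]:
    """Run tests on worker threads that poll one shared queue.

    Returns {test name: (index of the device that ran it, passed)}.
    """
    pending = queue.Queue()
    for test in tests:
        pending.put(test)

    results: Dict[str, Tuple[int, bool]] = {}
    lock = threading.Lock()

    def device_worker(device_index: int) -> None:
        while True:
            try:
                name, run = pending.get_nowait()
            except queue.Empty:
                return  # queue drained: this device is done
            try:
                passed = bool(run())
            except Exception:
                passed = False  # a crashing test counts as a failure
            with lock:
                results[name] = (device_index, passed)

    workers = [threading.Thread(target=device_worker, args=(i,))
               for i in range(num_devices)]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()
    return results
```

In a real harness each worker would own one emulator and the queue entries would carry the device and locale as well; those details are left out here.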

Page 35: Expedia 3x3 presentation

Android multi-emulator test run

Page 36: Expedia 3x3 presentation

iOS 3x3

Page 38: Expedia 3x3 presentation

Reliability

Scenario Tests

iOS – KIF

Page 39: Expedia 3x3 presentation

Speed

Speed up compile time!

- Compiler

Page 41: Expedia 3x3 presentation

Speed

Speed up compile time!

- Compiler

- Buy hardware .. (Mac Pro)

Page 42: Expedia 3x3 presentation

Speed

Speed up tests!

- Speed up KIF

Page 43: Expedia 3x3 presentation

Speed

Speed up tests!

- Distributed build/testing

Page 44: Expedia 3x3 presentation

Parallelized Testing Stability

Running on 10 machines

Each node has a reliability of 95% -> (95%)^10 ≈ 60%

Page 45: Expedia 3x3 presentation

Parallelized Testing Stability

Improve node stability

Each node has a reliability of 98% -> (98%)^10 ≈ 82%
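
The same relationship can be turned around to ask how reliable each node must be for a distributed run to hit a given target. The sketch below assumes nodes fail independently, which the slides imply but do not state; both function names are illustrative.

```python
def run_success_probability(node_reliability: float, num_nodes: int) -> float:
    """Chance that every node finishes cleanly, assuming independent failures."""
    return node_reliability ** num_nodes


def required_node_reliability(target: float, num_nodes: int) -> float:
    """Per-node reliability needed for the whole run to succeed with probability `target`."""
    if not 0.0 < target <= 1.0 or num_nodes < 1:
        raise ValueError("target must be in (0, 1] and num_nodes at least 1")
    return target ** (1.0 / num_nodes)


if __name__ == "__main__":
    for reliability in (0.95, 0.98):
        run = run_success_probability(reliability, 10)
        print(f"{reliability:.0%} per node -> {run:.0%} per run")
    needed = required_node_reliability(0.95, 10)
    print(f"needed per node for a 95% run: {needed:.2%}")
```

Going from 95% to 98% per node lifts a 10-node run from about 60% to about 82%, and a 95% run would need each node to be better than 99% reliable.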

Page 46: Expedia 3x3 presentation

Multi-sim in iOS

Page 47: Expedia 3x3 presentation

API server 3x3

Page 48: Expedia 3x3 presentation

API server: develop

● Monitor

○ Build a monitoring system to ensure the API server is well covered

○ Monitor JVM stats, user request stats, etc.

● Log

○ Logging context setup

■ Test-specific logs

■ UUID to link request logs

○ Tools support for production

■ ELK (Elasticsearch, Logstash, and Kibana)
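
The "UUID to link request logs" idea can be sketched with the standard `logging` and `contextvars` modules. This is an illustration in Python, not the API server's actual logging setup: every log line written while a request is being handled carries that request's ID, so the lines can later be grouped in a log search tool. The names `RequestIdFilter`, `start_request`, and `build_logger` are hypothetical; passing an explicit ID models the "test-specific logs" case, where a test tags its own requests.

```python
import contextvars
import logging
import uuid
from typing import Optional

_request_id = contextvars.ContextVar("request_id", default="-")


class RequestIdFilter(logging.Filter):
    """Attach the current request ID to every log record."""

    def filter(self, record: logging.LogRecord) -> bool:
        record.request_id = _request_id.get()
        return True  # never drop the record, only annotate it


def start_request(request_id: Optional[str] = None) -> str:
    """Begin a request context; generate a UUID unless the caller supplies an ID."""
    rid = request_id or str(uuid.uuid4())
    _request_id.set(rid)
    return rid


def build_logger(name: str = "api-server") -> logging.Logger:
    """Create a logger whose output lines include the request ID."""
    logger = logging.getLogger(name)
    if not logger.handlers:
        handler = logging.StreamHandler()
        handler.setFormatter(logging.Formatter(
            "%(asctime)s [%(request_id)s] %(levelname)s %(message)s"))
        handler.addFilter(RequestIdFilter())
        logger.addHandler(handler)
        logger.setLevel(logging.INFO)
    return logger
```

Calling `start_request()` at the top of a request handler and then logging through `build_logger()` is enough for every line of that request to share one ID.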

Page 49: Expedia 3x3 presentation

API server: build

● Static analysis

○ Rest.li snapshot compatibility checker to ensure API changes are backwards compatible

● Test

○ Unit test

○ Smoke test

Page 50: Expedia 3x3 presentation

API server: canary & deploy

● Canary release candidate

○ Run live tests against canary instances

○ Compare metrics between the canary version and the current released version

○ Error log analysis

● Promote a healthy canary or roll back a bad one
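
The "compare metrics, then promote or roll back" step can be sketched as a simple decision function. The slides do not say which metrics or thresholds are used, so everything here is an assumption for illustration: the `Metrics` fields, the function name `canary_decision`, and the default limits.

```python
from dataclasses import dataclass


@dataclass
class Metrics:
    requests: int
    errors: int
    p95_latency_ms: float

    @property
    def error_rate(self) -> float:
        return self.errors / self.requests if self.requests else 0.0


def canary_decision(baseline: Metrics, canary: Metrics,
                    max_error_rate_increase: float = 0.005,
                    max_latency_ratio: float = 1.2,
                    min_requests: int = 1000) -> str:
    """Return "promote", "rollback", or "wait" (not enough canary traffic yet)."""
    if canary.requests < min_requests:
        return "wait"
    # Roll back if the canary's error rate exceeds the baseline's by too much.
    if canary.error_rate - baseline.error_rate > max_error_rate_increase:
        return "rollback"
    # Roll back if the canary is markedly slower than the released version.
    if (baseline.p95_latency_ms > 0
            and canary.p95_latency_ms / baseline.p95_latency_ms > max_latency_ratio):
        return "rollback"
    return "promote"
```

Comparing against the currently released version, rather than against fixed absolute limits, keeps the check meaningful as normal traffic patterns shift over the day.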

Page 51: Expedia 3x3 presentation

3x3 after 5 months: areas to improve

● Release automation

○ Production uploads to the app stores are still a manual process

○ Getting release notes & translations is painful

● Automated performance testing

○ We can sample performance of the app in production, but don’t have a great way of catching issues before release

● Android Monkey testing

○ Enables wide range of API level & device coverage with very low overhead cost

● iOS speed improvements

○ Keep up with Swift evolution

● Bring 3x3 framework to other LinkedIn apps

Page 52: Expedia 3x3 presentation

Questions

Page 53: Expedia 3x3 presentation

3x3 blogs & videos

● 3x3: Speeding up mobile releases

● 3x3: iOS Build Speed and Stability

● Test Stability - How We Make UI Tests Stable

● UI Automation: Keep it Functional - and Stable!

● Consistent Android Testing Environments with Gradle (slides)

● Effective Layout Testing Library for iOS

● Managing iOS Continuous Integration at Enterprise Scale

