Daniel Liem & Chirayu Krishnappa
Ship Fast & Stable @ Uber Scale
11.1.2016
99.99%
● Have a dedicated Release team
● Aggressive weekly release cadence
● Build cuts (CI) from Master
● Nightly (alpha) vs. Beta / Production builds
● Internal beta dogfooding vs. External Beta testing
● FF (Feature Flagging) wherever possible
● Avoid alphafixes, betafixes, hotfixes & rollbacks
● Soft Upgrades vs. Force Upgrades
Our Process
Staged Rollouts
iOS
● Sanity Tests / Patches
● Upload to iTunes, enable TestFlight (weekend dogfooding)
● Wait for Approval
● Launch 100% (all users at once!)
Android
● Sanity Tests / Patches
● β channel (weekend dogfooding)
● Staged Rollouts (1% → 10% → 50% → 100%)
(Partner app only) Soft Upgrade + Force Upgrade
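One common way to get percentage stages such as 1% → 10% → 50% → 100% is to hash each user into a stable bucket, so that anyone admitted at an early stage stays in the rollout as the percentage grows. The sketch below is illustrative only: `bucket`, `in_rollout`, and the salt string are hypothetical names, not taken from the talk or from any app store API.

```python
import hashlib

STAGES = [1, 10, 50, 100]  # rollout percentages from the slide


def bucket(user_id: str, salt: str) -> int:
    """Map a user to a stable bucket in the range [0, 100)."""
    digest = hashlib.sha256(f"{salt}:{user_id}".encode("utf-8")).hexdigest()
    return int(digest, 16) % 100


def in_rollout(user_id: str, percent: int, salt: str = "hypothetical-release") -> bool:
    """True if the user falls inside the current rollout percentage.

    A user with bucket b is included whenever b < percent, so raising the
    percentage only ever adds users; it never removes anyone.
    """
    return bucket(user_id, salt) < percent
```

Changing the salt per release reshuffles the buckets, so the same users are not always the first to receive every new build.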
How many apps?
How many teams?
Developers!
How long does it take to “deploy to production”?
Build, sign, and more.
Mostly deterministic.
Submit to App Store for approval
WAIT … !
Approved! Now what?
WAIT … !
Your users will upgrade…eventually.
http://www.publicdomainpictures.net/view-image.php?image=139317&picture=snail-man
SPEED
● Not the speed of your mobile app
● Speed of deployment
● Speed of reactions when you discover issues
● Speed of rollbacks
What are rollbacks?
How many versions of your app are out there?
Adoption across versions
Need for a trusted system
Goal: Develop at full speed!
High quality bar!
Because slow to deploy/rollback
Signals
Soon to be 100s of signals
How can you stay on top?
How do you react to these signals?
Ticketing system at the core
If there’s a failure that should block the train, there’s a ticket for it.
Block specific versions
Verify patched versions
Track mitigation tasks
Weekly reports
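The rule that a blocking failure always has a ticket can be sketched as a release gate over open tickets. This is a minimal illustration, assuming a hypothetical `Ticket` record and `can_ship` check; the talk does not describe the actual ticketing schema.

```python
from dataclasses import dataclass


@dataclass
class Ticket:
    id: str
    version: str       # build the ticket was filed against
    blocking: bool      # should this failure block the release train?
    resolved: bool = False


def blocked_versions(tickets):
    """Versions that still have at least one open blocking ticket."""
    return {t.version for t in tickets if t.blocking and not t.resolved}


def can_ship(version, tickets):
    """A version may ship only when none of its blocking tickets are open."""
    return version not in blocked_versions(tickets)
```

Verifying a patched version then amounts to resolving its tickets and running `can_ship` again.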
Intelligent Subsystems
Crash detection at alpha stages vs. production
Every alpha crash gets a ticket
Feature Flagging
Automate to track features per version and turn off based on classified crashes
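A rough sketch of that automation: given crashes that have already been classified by feature, pick the flags to turn off for a version. The function name, the input shapes, and the threshold of 3 are assumptions made for illustration, not details from the talk.

```python
from collections import Counter


def flags_to_disable(crashes, features_by_version, version, threshold=3):
    """Return the feature flags to turn off for one app version.

    crashes: list of (version, feature) pairs, where feature is None when
        the crash could not be attributed to a feature.
    features_by_version: dict mapping a version to the set of features in it.
    threshold: hypothetical minimum number of crashes before a flag is cut.
    """
    counts = Counter(f for v, f in crashes if v == version and f is not None)
    shipped = features_by_version.get(version, set())
    return sorted(f for f, n in counts.items() if n >= threshold and f in shipped)
```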
Reminders
E.g. When a new build is ready
(also in-app “upgrade” notifications)
Notifications/Alerts
We’re pushing at 4pm
It’s 2pm and we have new / unresolved tickets.
Production alerts are separate
e.g. Spike in crashes, E2E, App Store ratings dropped.
https://pixabay.com/en/go-button-3d-icon-sign-symbol-1067074/
Stage Everything
● Rollout to employees: TestFlight, alpha, beta channels
● Alpha channel rollout? Stage it: 1%, 10%, 50%, 100%.
● In-app upgrade prompt? Stage that too! 1%, 10%, 100%.
In a nutshell…
● Deploys are slow
● Collect all signals
● Ticketing at core
● Automated reminders
● Ad hoc notifications / alerts
Thank you
Building At Uber Scale
Like a ‘BAUS’
Robbert van Ginkel & Gautam Korlam
11.1.2016
Overview
Challenges with mobile development at scale
Team Size
Build Time
Infrastructure
Improving Developer Experience while tackling scale
Architecture
Workflow
Tooling
Mobile Scale @ Uber
Several Hundred Mobile Developers
Hundreds of commits daily
50% of code changes every month
Shared modular codebase with hundreds of modules
Team Size
Commits
Architecting for Scale
Code architecture
● Features are built as Plugins and shared between apps
Code infrastructure and tooling
● Monorepo helps with modularization and sharing
Regressions block the whole team
● Always keep master green
Guard as much as possible at compile time
● Fail fast
Workflow at Scale
Asynchronous change merging
● Submit queue
● Stacked Diffs
Run expensive code quality checks pre merge
● UI Tests
● Deep static analysis - Infer etc.
● Performance regressions on real devices - cold start, battery, network etc.
Build Time
Waiting for builds...
With more modules, come more problems
Build Tools Scaling Issues
CocoaPods
● Does not scale well with more targets (15 min pod install time)
Xcode
● Incorrect incremental builds (non deterministic and hard to debug)
● Xcode project file merge conflicts
Gradle
● Does not scale well with large Android projects (15 min for a single line change)
● Android Studio performance degrades
Building at Scale
Both iOS and Android use Buck to build at Uber
● Incremental everywhere
● Scale non-exponentially as more code is added
● Cache immutable state - avoid rebuilding
● Transparent Dependency Management
● Works well for monorepo
● iOS - Clean ~4x faster, Incremental ~20x faster
● Android - Clean ~6x faster, Incremental ~30x faster
● Remote Build Cache
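The idea behind caching immutable state can be sketched with a content-addressed key: hash a target's inputs together with the keys of its dependencies, and rebuild only when that key has not been seen before. This is a simplified illustration, not Buck's actual implementation; `rule_key` and `BuildCache` are hypothetical names.

```python
import hashlib


def rule_key(name, sources, dep_keys):
    """Hash a target's name, source contents, and dependency keys.

    sources: dict mapping a file path to its contents as bytes.
    dep_keys: keys of the targets this one depends on.
    """
    h = hashlib.sha256()
    h.update(name.encode("utf-8") + b"\0")
    for path in sorted(sources):
        h.update(path.encode("utf-8") + b"\0")
        h.update(sources[path] + b"\0")
    for key in sorted(dep_keys):
        h.update(key.encode("utf-8") + b"\0")
    return h.hexdigest()


class BuildCache:
    """In-memory stand-in for a local or remote artifact cache."""

    def __init__(self):
        self.artifacts = {}
        self.builds = 0  # how many real builds were performed

    def build(self, name, sources, dep_keys=()):
        key = rule_key(name, sources, list(dep_keys))
        if key not in self.artifacts:
            # Cache miss: this is where the real compiler would run.
            self.builds += 1
            self.artifacts[key] = f"artifact for {name}"
        return key
```

Because a target's key includes its dependencies' keys, a one-line change invalidates only the targets that depend on the changed file; everything else is served from the cache.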
Infrastructure
Uber’s CI Infrastructure
CI capacity needs increased exponentially
● 400+ Busy Executors on CI hourly
● 50k+ CI Jobs run per day
[Chart: CI Executors with Time]
Optimizing the CI Pipeline
Perform relevant checks at the right stage
● Code Formatting - pre diff
● Build, Unit Tests - diff
● UI, Static Analysis - pre merge
Use CI resources effectively
● Remote build artifact caching
● Build in elasticity to meet peak demand
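The stage assignment above can be written down as data, with a runner that stops at the first failure so that expensive checks never run on a change that already failed a cheap one. This is a sketch under stated assumptions: `PIPELINE` mirrors the slide, while `run_pipeline` and the `run_check` callback are hypothetical.

```python
# Stage -> checks, in the order given on the slide.
PIPELINE = [
    ("pre diff", ["code formatting"]),
    ("diff", ["build", "unit tests"]),
    ("pre merge", ["ui tests", "static analysis"]),
]


def run_pipeline(run_check):
    """Run checks stage by stage and stop at the first failure.

    run_check: callable taking a check name and returning True on success.
    Returns (passed, executed), where executed lists the checks that ran.
    """
    executed = []
    for _stage, checks in PIPELINE:
        for check in checks:
            executed.append(check)
            if not run_check(check):
                return False, executed
    return True, executed
```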
Open Source
Projects to Watch
● OkBuck - Gradle plugin that lets you use Gradle projects with Buck: https://github.com/uber/okbuck
● Buck Http Cache - A distributed build artifact cache service: https://github.com/uber/buck-http-cache
● Coming soon: Swift Support in Buck
Takeaway
● Invest in the right build tools early on
● Scaling hardware only works till a certain point
● Having shared workflow/tools across platforms helps a lot in the long run
● Fail earlier and keep master always green
Scaling the Build Process at Uber
Robbert van Ginkel & Gautam Korlam
Thank You
In-product features for release engineers
Christian [email protected]
App
↑ ↑ ↑ http://www.pocketship.com
In-product
Release Engineers
Product Engineers
App
Empathy
Agency
● Fix Release Engineering pain
● Gain more code context and confidence
● Improve Release Engineering processes
● Influence company direction
Reciprocal Empathy
● Build times and CI turnaround
● Tests, test infrastructure, and tooling
● Release process overhead
● Shipping to the world
“One of us”
Specific features
● Company
● Product
● Release
Crash, OOM, hang reporting
Telemetry / analytics
Feature flags (A/B testing)
if (whatever) {
  doA();
} else {
  doB();
}
Bug reporter
Promotion framework
Test the next version of Facebook
Get access to new features and bugfixes before everyone else.
Automatic employee updater
React Native
JS Bundle ↑
WebViews
Version number scheme
v20: #5001 #5002 #5003 #5004 #5005 #5006 …
14 APKs
v21: #5046 #5047 #5048 #5049 #5050 #5051 …
● Ship production weekly
● Slow rollout
● Alpha and beta
● Internal employee dogfooding
100.0.0.20.70
Major version, Hotfix, Beta, Alpha
Release Engineers
Product Engineers
App
THE DARK SIDE OF ENTERPRISE SWIFT
Jacek Suliga
Mobilize @ LinkedIn 🍺
Builds Perf & App Size
Launch Performance
Maintenance Cost
8-23 MB
10-30%
20% slower
WWDC 2016, 406, “Optimizing app startup time”
[Timeline: Swift versions over time]
Versions: 0.5, 1.0, 1.1, 1.2, 2.0, 2.1, Open Source, 2.2, 2.3, 3.0
Dates: 6/14, 9/14, 10/14, 4/15, 6/15, 9/15, 12/15, 3/16, 6/16, 9/16
Swift 3 migration party
3.0
0.3
Any Questions?!
Automating Mobile Releases
Rachel Brindle
TERMS
• User: Someone who uses your app.
• Environment: Place where a user can download your app
• Staging: HockeyApp, TestFlight, etc.
• Production: Enterprise MDM, App Store, Play Store
• Automated Deployment: Using CI to push to environment
Process
WHY
• Frees up the deployer to do other things
• Consistent deploys
• Shorter release cycle
• Documents
$ git push
pushed g4edeff4
$ # wait
$ rake check_if_deployed
Latest is g4edeff4
• Make sure tests pass
• Build for release
• Gather metadata
• Screenshots
• Release Notes
• Other?
• Upload to environment
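The steps above can be sketched as a single deploy function that a CI job would call. Everything here is an assumption for illustration: the four callables stand in for whatever test runner, build tool, and upload client a project actually uses, and `REQUIRED_METADATA` is a hypothetical choice of which metadata must be present.

```python
REQUIRED_METADATA = ("screenshots", "release_notes")  # hypothetical requirement


def deploy(environment, run_tests, build, gather_metadata, upload):
    """Run the release steps in order and stop before uploading on any problem.

    run_tests() -> bool, build() -> artifact, gather_metadata() -> dict, and
    upload(environment, artifact, metadata) are supplied by the caller.
    """
    # 1. Make sure tests pass.
    if not run_tests():
        raise RuntimeError("tests failed; nothing was uploaded")
    # 2. Build for release.
    artifact = build()
    # 3. Gather metadata (screenshots, release notes, ...).
    metadata = gather_metadata()
    missing = [key for key in REQUIRED_METADATA if not metadata.get(key)]
    if missing:
        raise ValueError("missing metadata: " + ", ".join(missing))
    # 4. Upload to the environment (staging or production).
    return upload(environment, artifact, metadata)
```

Passing the steps in as callables keeps the same sequence usable for a staging upload and a production upload.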