Crash Fast & Furious

Pierre-Yves Ricau / @Piwai

Run Keeper, morning run. End of the run, tap save: crash.

Source: https://t.co/uH1EqxqAow

* Study that’s available on hp.com
* Stole slide from Doug Sillars

Not your fault: fragmentation, manufacturer bugs, the lifecycle, Fragments.

* “Why do they waste their time on this?”
* “Why don’t they fix the crashes first?”
* Crash: your fault, even when the code at fault isn’t yours.

What’s a crash?

Reminder: Android = Linux, 1 app = 1 VM = 1 process

Crash: something bad happens, and the process has to be killed and restarted.

* Threads: one exception handler per thread
* Exceptions bubble up and are delegated to exception handlers
* If a thread has no handler, the exception goes to the static default handler
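The handler lookup described above can be sketched in plain Java. This is a minimal illustration, not code from the talk: `HandlerDemo` and `crashAndReport` are hypothetical names, and the handlers only record which one was called instead of logging or killing the process.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical demo class; the names are illustrative only.
class HandlerDemo {

  /** Crashes a worker thread and returns the name of the handler that saw the exception. */
  static String crashAndReport(boolean installPerThreadHandler) {
    AtomicReference<String> seenBy = new AtomicReference<>("nobody");
    Thread.UncaughtExceptionHandler previousDefault =
        Thread.getDefaultUncaughtExceptionHandler();

    // Static default handler: used only when the thread has no handler of its own.
    Thread.setDefaultUncaughtExceptionHandler((thread, error) -> seenBy.set("default"));
    try {
      Thread worker = new Thread(() -> {
        throw new IllegalStateException("boom");
      });
      if (installPerThreadHandler) {
        // Per-thread handler: takes precedence over the static default.
        worker.setUncaughtExceptionHandler((thread, error) -> seenBy.set("per-thread"));
      }
      worker.start();
      // The handler runs on the dying thread, so it has finished once join() returns.
      worker.join();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException("interrupted while waiting for the worker", e);
    } finally {
      // Restore whatever default handler was installed before the demo.
      Thread.setDefaultUncaughtExceptionHandler(previousDefault);
    }
    return seenBy.get();
  }
}
```

Calling `HandlerDemo.crashAndReport(true)` reports the per-thread handler, while `HandlerDemo.crashAndReport(false)` falls through to the static default.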

* Focus on crashes in Java land
* Uncaught exception delegated to default handler

* main: when the program starts
* Log, dialog, kill.

* How many people click “Report”?
* What do most people do?
* Can’t use Play Store crash reports

* You can create your own.
* Don’t do that. Client is easy, backend is hard.
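To show why the client is the easy half, here is a minimal sketch of a home-grown reporter. It assumes nothing beyond the JDK: `TinyCrashReporter` is a hypothetical class, and the in-memory `reports` list stands in for the hard part (persisting, uploading, grouping, and displaying reports on a backend).

```java
import java.io.PrintWriter;
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.List;

// Hypothetical minimal crash reporter client; not the API of any real library.
class TinyCrashReporter implements Thread.UncaughtExceptionHandler {
  // Stand-in for "write to disk and upload on next launch".
  final List<String> reports = new ArrayList<>();
  private final Thread.UncaughtExceptionHandler next;

  TinyCrashReporter(Thread.UncaughtExceptionHandler next) {
    this.next = next;
  }

  @Override public void uncaughtException(Thread thread, Throwable error) {
    try {
      StringWriter trace = new StringWriter();
      error.printStackTrace(new PrintWriter(trace, true));
      reports.add("thread=" + thread.getName() + "\n" + trace);
    } finally {
      // Always delegate to the previous handler so the platform can still
      // log, show its dialog, and kill the process.
      if (next != null) next.uncaughtException(thread, error);
    }
  }

  /** Installs the reporter in front of whatever default handler is already set. */
  static TinyCrashReporter install() {
    TinyCrashReporter reporter =
        new TinyCrashReporter(Thread.getDefaultUncaughtExceptionHandler());
    Thread.setDefaultUncaughtExceptionHandler(reporter);
    return reporter;
  }
}
```

The key design choice is keeping a reference to the previous default handler and calling it in a `finally` block, so a bug in the reporter never swallows the crash.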

* Crashlytics: closed source, UI for noobs
* ACRA: OSS client, free, host your backend
* Bugsnag: OSS client, small teams, small API, scaling issues

Native Crashes

* Signal sent to the process, need a signal handler
* Uses Google Breakpad
* Fake exception

First thing to look at?

=> stack trace

* Line numbers => check out the correct version of the sources

* Smart stacktrace => stupid

* Stacktrace: quick fix of a simple error
* Who started the animation, and why?
* Stacktrace on server: each frame is a layer
* Callback => loss of info. Stacktrace not enough.

Reproducing

* How can we reproduce the crash?

* Associate customer ID to crash
* Best is to ask the customer what they did.
* Great for alpha / internal testing.

* Startup is the worst. Crash after work is second.
* Asking for feedback channels the frustration and avoids 1-star reviews.

* Custom crash dialog that asks for feedback.
* Good idea: offer a link to contact support
* Emotional connection to customer.

* Can’t ask for feedback while crashing: display popup on restart.
* Risky: what if it crashes on restart? Don’t double restart. Maybe a dedicated crash activity + different process.
* Twitter & Fb seem to do that.

Static info

* Different UI, different code path => isTablet helps identify the problem
* isTablet: sw600dp
* App version number + SHA version numbers for dev builds

* Picture of what the screen looked like at the time of the crash.
* Bitmap? Too big. Upload a description of the view hierarchy.

Current screen

* What the user is looking at
* Current screen

Find all windows: Espresso RootsOracle

* Black box.

History: high level log

* Steps of the user + internal state changes
* Look at the log, reproduce the steps
* Navigation + HTTP calls

* OkHttp interceptor
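A high-level history like this can be kept in a small in-memory ring buffer and attached to the crash report. The sketch below is an assumption about how such a log might look, not the talk's implementation: `FlightRecorder` is a hypothetical class, and the OkHttp interceptor or navigation hooks that would call `log` are left out.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;

// Hypothetical "flight recorder": keeps only the last N high-level events.
class FlightRecorder {
  private final int capacity;
  private final ArrayDeque<String> events = new ArrayDeque<>();

  FlightRecorder(int capacity) {
    if (capacity <= 0) throw new IllegalArgumentException("capacity must be > 0");
    this.capacity = capacity;
  }

  /** Records one event, dropping the oldest one when the buffer is full. */
  synchronized void log(String event) {
    if (events.size() == capacity) events.removeFirst();
    events.addLast(event);
  }

  /** Returns a copy of the events, oldest first, to attach to a crash report. */
  synchronized List<String> snapshot() {
    return new ArrayList<>(events);
  }
}
```

With a capacity of 3, logging four events such as `"open cart"`, `"GET /items"`, `"tap pay"`, `"POST /charge"` leaves only the last three in the snapshot, which bounds memory use while keeping the steps closest to the crash.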

OOM: Stack trace is useless.

squ.re/leakcanary

Detect memory leaks

* UI has validation rules
* Bug: somehow not enforced
* Crash time: too late to do something about it.

Exception =

something unexpected happened

What do you do when something unexpected happens and the app crashes?

* We somehow got a blank email

Defensive programming

* Can’t figure it out.
* Fatal condition, shouldn’t happen. Problem ignored.
* Payments.

Offensive programming: Crash Fast

* Detect problems early
* Complain as loudly as possible
* Quality of code increases
* If you can’t understand, make the problem happen earlier, and ship it.
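Applied to the blank email case, crashing fast could look like the sketch below. The `Preconditions` helper and the `Receipt` class are hypothetical names chosen for illustration; the point is that the invalid value is rejected where it enters the object, not much later when it is used.

```java
// Hypothetical helper, similar in spirit to common Preconditions utilities.
class Preconditions {
  /** Returns the value unchanged, or throws if it is null or only whitespace. */
  static String checkNotBlank(String value, String name) {
    if (value == null || value.trim().isEmpty()) {
      throw new IllegalArgumentException(name + " must not be blank");
    }
    return value;
  }
}

// Hypothetical domain object: a receipt cannot exist without an email.
class Receipt {
  final String email;

  Receipt(String email) {
    // Crash here, at construction, where the stack trace still shows who
    // passed the bad value, instead of later when the receipt is sent.
    this.email = Preconditions.checkNotBlank(email, "email");
  }
}
```

The resulting stack trace points at the caller that produced the blank email, which is the information a crash at send time would have lost.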

Exception Grouping

* Exceptions thrown by a common Preconditions class might be grouped together in the crash reporting tool.
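One possible workaround, offered here as an assumption and not necessarily what the talk showed, is to drop the helper's own frames from the stack trace so that the top frame is the caller. Whether this changes grouping depends on how the crash reporting tool groups reports. `Checks` and `dropOwnFrames` are hypothetical names.

```java
import java.util.Arrays;

// Hypothetical assertion helper that hides itself from the stack trace.
class Checks {
  static void checkState(boolean ok, String message) {
    if (!ok) throw dropOwnFrames(new IllegalStateException(message));
  }

  /** Removes leading frames that belong to this class, so the caller ends up on top. */
  static <T extends Throwable> T dropOwnFrames(T error) {
    StackTraceElement[] frames = error.getStackTrace();
    int first = 0;
    while (first < frames.length
        && frames[first].getClassName().equals(Checks.class.getName())) {
      first++;
    }
    error.setStackTrace(Arrays.copyOfRange(frames, first, frames.length));
    return error;
  }
}
```

A tool that groups by the top stack frame would then see one group per call site instead of a single group for every failed check.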

More assertions = more crashes.

How to keep low impact on customers?

Integration tests

* Writing a feature = writing UI tests
* Espresso
* Run on VMs, no real devices. Parallelized
* 20 min total build

Smoke testing

* Manual QA
* Testing parties
* Internal and paid external testers

Dogfood / Beta

* Internal releases: hard, need use case => lunch.
* Dogfood at sellers
* Betas work better.

Staged Rollout

* Test the waters

* Ship to 5%, 10%, evaluate crash rate and do minor dot releases

* Raw crash numbers: not useful
* Most important feature: take payments
* Crash per transaction
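A normalized metric makes releases with different usage comparable. The sketch below assumes a "crashes per 1,000 transactions" scale, which is a choice made here for readability; `CrashRate` and its method are hypothetical names, and the numbers in the usage note are invented.

```java
// Hypothetical metric helper; the per-1,000 scale is an assumption.
class CrashRate {
  /** Crashes per 1,000 transactions; returns 0 when there were no transactions. */
  static double crashesPerThousandTransactions(long crashes, long transactions) {
    if (crashes < 0 || transactions < 0) {
      throw new IllegalArgumentException("counts must not be negative");
    }
    if (transactions == 0) return 0.0;
    return crashes * 1000.0 / transactions;
  }
}
```

For example, with made-up numbers, a release with 50 crashes over 100,000 transactions scores 0.5, while a staged rollout with 20 crashes over 10,000 transactions scores 2.0: fewer raw crashes, but a worse release.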

• Reproducing
• Static info
• Flight Recorder
• View hierarchy / state
• Crash Fast
• Staged rollout

Questions?

@Piwai

* We are hiring. SF, NYC, Canada.
