Performance Quality Metrics for Mobile Web and Mobile Native - Agile Testing Days 2014


DESCRIPTION

5 real-life examples of why Mobile Web and Mobile Native apps failed, and which metrics would have shown the problem early on. Using these metrics along your delivery chain lets you get closer to a fully automated deployment pipeline while also making sure performance criteria are met.


Performance Quality Metrics for Mobile Web and Mobile Native

http://bit.ly/atd2014challenge

@grabnerandi

Why?

What?

It allows us to Evolve from …

Some Examples from Real Life

300 Deployments / Year

50-60 Deployments / Day

10+ Deployments / Day

Every 11.6 seconds

More details on Amazon

75% fewer outages since 2006

90% fewer outage minutes

~0.001% of deployments cause a problem

Instantaneous automatic rollback

Deploying every 11.6s

The “War Room” – back then

“Houston, we have a problem” – NASA Mission Control Center, Apollo 13, 1970

The “War Room” – NOW

Facebook – December 2012

5 Situations on

WHY this happens,

HOW to avoid it,

Metrics to look at

Don’t assume you know the environment

1.3 Million iOS Apps

$10 billion iOS purchases in 2013

1.4 million Android Apps

$ 5 billion Google Payout within 1 year

Distance Calculation Issues

480km biking in 1 hour!

Solution: Unit Test in Live App reports Geo Calc Problems

Finding: Only happens on certain Android versions
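A minimal sketch of the kind of in-app geo sanity check described above. The Haversine distance, the plausibility threshold and the reporting hook are assumptions for illustration; the app's real calculation is not shown on the slides.

```java
import static java.lang.Math.*;

// Illustrative in-app sanity check for distance calculations.
// The Haversine formula and the 80 km/h cycling limit are assumptions,
// not the original app's code.
public class GeoSanityCheck {

    // Great-circle distance in km between two lat/lon points (Haversine).
    static double distanceKm(double lat1, double lon1, double lat2, double lon2) {
        double r = 6371.0; // mean Earth radius in km
        double dLat = toRadians(lat2 - lat1);
        double dLon = toRadians(lon2 - lon1);
        double a = sin(dLat / 2) * sin(dLat / 2)
                 + cos(toRadians(lat1)) * cos(toRadians(lat2))
                 * sin(dLon / 2) * sin(dLon / 2);
        return 2 * r * asin(sqrt(a));
    }

    // Report implausible results instead of silently trusting them; the finding
    // above was that only certain Android versions produced such outliers.
    static void checkBikeSegment(double km, double hours) {
        double kmPerHour = km / hours;
        if (kmPerHour > 80) { // hypothetical plausibility limit for cycling
            reportGeoCalcProblem("Implausible speed: " + kmPerHour + " km/h");
        }
    }

    static void reportGeoCalcProblem(String message) {
        // In the live app this would go to the crash/metric reporting backend.
        System.err.println("GEO-CALC-PROBLEM: " + message);
    }

    public static void main(String[] args) {
        checkBikeSegment(480, 1); // the "480 km biking in 1 hour" case from the slide
    }
}
```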

Bluetooth 4.0 Issue on Android

OS Issue crashing App

3rd Party Issues

Impact of bad 3rd party calls

Slow Network – Bad Latency

Impact of Latency and Bandwidth

Metrics: Crashes, Exceptions, # and Status of 3rd Party Calls, Payload of Web Service Calls (see the sketch below)

Dev: Build for Mobile

Test: Test on Mobile and Diff. Carriers

Ops: Monitor Mobile
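A rough sketch of how the metrics above (# and status of 3rd-party calls, payload and duration of web service calls) could be captured around an outgoing call. The wrapper class and the stdout reporting are assumptions, not a specific tool's API.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;

// Hypothetical wrapper that records status, payload size and duration for every
// 3rd-party / web-service call the app makes.
public class ThirdPartyCallMonitor {

    private final HttpClient client = HttpClient.newBuilder()
            .connectTimeout(Duration.ofSeconds(5))
            .build();

    public String call(String url) throws Exception {
        long start = System.nanoTime();
        HttpRequest request = HttpRequest.newBuilder(URI.create(url)).GET().build();
        HttpResponse<String> response =
                client.send(request, HttpResponse.BodyHandlers.ofString());
        long durationMs = (System.nanoTime() - start) / 1_000_000;

        // In a real app this line would feed the monitoring backend instead of stdout,
        // which is where "# and Status of 3rd Party Calls" comes from.
        System.out.printf("3rd-party call %s: status=%d, payload=%d bytes, %d ms%n",
                url, response.statusCode(), response.body().getBytes().length, durationMs);
        return response.body();
    }
}
```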

#Push without a Plan

434 Resources in total on that page: 230 JPEGs, 75 PNGs, 50 GIFs, …

Total size of ~ 20MB

Mobile Landing Page of Super Bowl Ad

m.store.com redirects to www.store.com

ALL CSS and JS files are redirected to the www domain

This is a lot of time “wasted”, especially on high-latency mobile connections

Metrics: Load Time, # Resources (Images, …), # HTTP 3xx, 4xx, 5xx (see the sketch below)

Dev: Build for Mobile

Test: Test on Mobile

Ops: Monitor Mobile
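One way to make the redirect problem above measurable: request a page's resources without following redirects and count the 3xx/4xx/5xx responses. The URLs below are placeholders; a realistic check would take the resource list from a HAR file or a synthetic browser test rather than hard-coding it.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.List;

// Rough sketch of the "# Resources / # HTTP 3xx, 4xx, 5xx" metrics.
public class PageResourceCheck {

    public static void main(String[] args) throws Exception {
        List<String> resources = List.of(
                "http://m.store.com/",            // placeholder mobile landing page
                "http://m.store.com/css/app.css", // placeholder CSS reference
                "http://m.store.com/js/app.js");  // placeholder JS reference

        HttpClient client = HttpClient.newBuilder()
                .followRedirects(HttpClient.Redirect.NEVER) // count redirects, don't hide them
                .build();

        int redirects = 0, clientErrors = 0, serverErrors = 0;
        for (String url : resources) {
            HttpRequest head = HttpRequest.newBuilder(URI.create(url))
                    .method("HEAD", HttpRequest.BodyPublishers.noBody())
                    .build();
            int status = client.send(head, HttpResponse.BodyHandlers.discarding()).statusCode();
            if (status >= 300 && status < 400) redirects++;
            else if (status >= 400 && status < 500) clientErrors++;
            else if (status >= 500) serverErrors++;
        }
        System.out.printf("%d resources: %d redirects (3xx), %d 4xx, %d 5xx%n",
                resources.size(), redirects, clientErrors, serverErrors);
    }
}
```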

Architectural Decisions gone Bad

We wanted Web 2.0 and Mobile Ready!

Metrics: # Visitors, # Requests / User

Dev: Follow Best Practices

Test: Find realistic scenarios

“Blindly” (Re)use Existing Components

Requirement: We need a report

Using Hibernate results in 4k+ SQL Statements to display 3 items!

Hibernate Executes 4k+ Statements

Individual Execution VERY FAST

But Total SUM takes 6s
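The "4k+ statements to display 3 items" pattern is the classic N+1 query problem: each statement is fast on its own, but the count multiplies with the data. A minimal sketch, assuming hypothetical Report/ReportLine entities; the original application's mapping is not shown on the slides.

```java
import java.util.List;
import javax.persistence.Entity;
import javax.persistence.FetchType;
import javax.persistence.GeneratedValue;
import javax.persistence.Id;
import javax.persistence.OneToMany;
import org.hibernate.Session;

// Illustrative N+1 sketch with hypothetical Report/ReportLine entities.
public class ReportQueries {

    @Entity
    public static class Report {
        @Id @GeneratedValue Long id;
        @OneToMany(fetch = FetchType.LAZY) // lines are loaded on first access
        List<ReportLine> lines;
    }

    @Entity
    public static class ReportLine {
        @Id @GeneratedValue Long id;
        String text;
    }

    // Anti-pattern: one SELECT for the reports, then one additional SELECT per
    // report (and per nested association) as soon as report.lines is touched.
    // Each statement is very fast, but the total sum adds up to seconds.
    @SuppressWarnings("unchecked")
    public static List<Report> loadReportsNPlusOne(Session session) {
        return session.createQuery("from Report").list();
    }

    // Better: fetch the association in the same statement so the SQL count
    // no longer grows with the number of items.
    @SuppressWarnings("unchecked")
    public static List<Report> loadReportsWithLines(Session session) {
        return session.createQuery(
                "select distinct r from Report r join fetch r.lines").list();
    }
}
```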

Requirement: We need a fancy UI

Using Telerik Controls Results in 9s for Data-Binding of UI Controls

#1: Slow Stored Procedure

Depending on the Request, execution time of this SP varies between 1 and 7.5s

#2: 240! Similar SQL Statements

Most of these 240 Statements are not prepared and just differ in things like Column Names (see the sketch below)

Metrics: # Total SQLs, # SQLs / Web Request, # Same SQLs / Request, Transferred Rows

Test: With realistic Data

Dev: “Learn” Frameworks
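A hedged sketch of the "240 similar, unprepared statements" pattern above and one way out of it. The table and column names are made up; the point is the shape of the SQL, not the schema.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class SimilarStatements {

    // Anti-pattern: one ad-hoc statement per column. The database sees hundreds
    // of different, unprepared SQL strings per request that differ only in the
    // column name and cannot reuse an execution plan.
    static Map<String, Object> loadFieldsOneByOne(Connection con, List<String> columns,
                                                  long id) throws SQLException {
        Map<String, Object> row = new HashMap<>();
        for (String column : columns) {
            String sql = "SELECT " + column + " FROM orders WHERE id = " + id;
            try (Statement stmt = con.createStatement();
                 ResultSet rs = stmt.executeQuery(sql)) {
                if (rs.next()) row.put(column, rs.getObject(1));
            }
        }
        return row;
    }

    // Better: a single prepared, parameterized statement per request, so
    // "# SQLs / Request" drops to one and the statement can be reused.
    static Map<String, Object> loadRowOnce(Connection con, List<String> columns,
                                           long id) throws SQLException {
        String sql = "SELECT " + String.join(", ", columns) + " FROM orders WHERE id = ?";
        Map<String, Object> row = new HashMap<>();
        try (PreparedStatement ps = con.prepareStatement(sql)) {
            ps.setLong(1, id);
            try (ResultSet rs = ps.executeQuery()) {
                if (rs.next()) {
                    for (String column : columns) row.put(column, rs.getObject(column));
                }
            }
        }
        return row;
    }
}
```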

#No “Agile” Deployment

Load Spike resulted in Unavailability

Ad on air

Alternative: “GoDaddy goes DevOps”

1h before SuperBowl KickOff

1h after Game ended

Behind the Scenes

Metrics: Availability, Page Size, # Objects, # Hosts, # Connections

DevOps: “Feature” Switches
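A minimal sketch of what a runtime feature switch could look like. The switch names and the in-process map are assumptions; production setups usually back this with a config service so Ops can flip features without a redeploy.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical feature-switch registry: turn heavy features off at runtime
// instead of redeploying while the ad is on air.
public class FeatureSwitches {

    private static final Map<String, Boolean> SWITCHES = new ConcurrentHashMap<>();

    static {
        SWITCHES.put("fancy-landing-page", Boolean.TRUE);      // hypothetical feature
        SWITCHES.put("product-recommendations", Boolean.TRUE); // hypothetical feature
    }

    public static boolean isEnabled(String feature) {
        return SWITCHES.getOrDefault(feature, Boolean.FALSE);
    }

    // Ops flips the switch (e.g. right before kickoff) and the app serves a
    // lightweight fallback instead of failing under the load spike.
    public static void set(String feature, boolean enabled) {
        SWITCHES.put(feature, enabled);
    }
}
```

Callers would then guard heavy code paths with isEnabled("product-recommendations") and render a lightweight fallback when the switch is off.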

# of Requests / User

# of Log Messages

# of Crashes

# Objects Allocated

# Objects In Cache

Cache Hit Ratio

# of Images

# of SQLs

# SQLs per Request

Availability

# HTTP 3xx, 4xx

Page Size

A final thought …

How about this idea?

Build #    Test Case      Status   # SQL   # Excep   CPU
Build 17   testPurchase   OK          12         0   120ms
Build 17   testSearch     OK           3         1    68ms
Build 18   testPurchase   FAILED      12         5    60ms
Build 18   testSearch     OK           3         1    68ms
Build 19   testPurchase   OK          75         0   230ms
Build 19   testSearch     OK           3         1    68ms
Build 20   testPurchase   OK          12         0   120ms
Build 20   testSearch     OK           3         1    68ms

Test Framework Results: Build #, Test Case, Status
Architectural Data: # SQL, # Excep, CPU

We identified a regression

Problem solved

Let’s look behind the scenes

Exceptions probably the reason for the failed tests

Problem fixed, but now we have an architectural regression

Now we have the functional and architectural confidence

How? Performance Focus in Test Automation

Analyzing All Unit / Performance Tests

Analyze PerfMetrics

Identify Regressions
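One way to read the table above in an automated build: compare each test's architectural metrics (# SQL, # exceptions) against the previous build's values and flag a regression even when the functional result is OK. A rough sketch; the metric values are passed in as plain maps, not taken from a specific tool's API, and the 20% tolerance is an assumption.

```java
import java.util.Map;

// Rough sketch of the architectural regression check implied by the build table.
public class ArchitecturalRegressionCheck {

    static void checkAgainstBaseline(String testCase,
                                     Map<String, Long> current,
                                     Map<String, Long> previousBuild) {
        for (Map.Entry<String, Long> metric : current.entrySet()) {
            long now = metric.getValue();
            long before = previousBuild.getOrDefault(metric.getKey(), now);
            if (now > before * 1.2) { // hypothetical 20% tolerance
                throw new AssertionError(testCase + ": architectural regression in "
                        + metric.getKey() + " (" + before + " -> " + now + ")");
            }
        }
    }

    public static void main(String[] args) {
        // Build 19 from the table: testPurchase is functionally OK, but # SQL
        // jumped from 12 to 75 -- exactly the regression this check should flag.
        checkAgainstBaseline("testPurchase",
                Map.of("# SQL", 75L, "# Exceptions", 0L),
                Map.of("# SQL", 12L, "# Exceptions", 0L));
    }
}
```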

If we do all that

Want MORE of these and more details?

http://blog.dynatrace.com

FREE Products & More Info

• Dynatrace Free Trial – http://bit.ly/dtrial
  – Full End-to-End Visibility in your Java, .NET, PHP Apps
  – Sign up for a 30 Days (option for Lifetime) Free Trial on http://bit.ly/atd2014challenge

• Our Blog: http://blog.dynatrace.com

• My contact: @grabnerandi, agrabner@dynatrace.com

Bonus Bite(s)

Don’t ruin your reputation

Outdated Library causing Issues

Implementation Flaws

Business Impact requires Action!

Solution: Cache to the RESCUE!!

Implementation and Rollout

Implemented InMemory Cache

Worked well in Load Testing

Result: Out of Memory Crashes!!

Still crashes

Fixed Version Deployed

Problem fixed!

Metrics: Heap Size, # Objects Allocated, # Objects in Cache, Cache Hit Ratio

Test: With realistic Data
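The Out-of-Memory crash above is the typical failure mode of an unbounded in-memory cache that only ever saw test-sized data. A sketch of a size-bounded LRU cache with hit-ratio tracking, built on a plain LinkedHashMap; the capacity and class name are assumptions, the slides do not show the real implementation.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch of a size-bounded LRU cache exposing the metrics from the slide:
// # Objects in Cache and Cache Hit Ratio. The bound is what an unbounded
// HashMap-style cache was missing when it exhausted the heap in production.
public class BoundedCache<K, V> {

    private final Map<K, V> map;
    private long hits, lookups;

    public BoundedCache(final int maxEntries) {
        // accessOrder=true makes this an LRU map; removeEldestEntry caps its size.
        this.map = new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > maxEntries;
            }
        };
    }

    public synchronized V get(K key) {
        lookups++;
        V value = map.get(key);
        if (value != null) hits++;
        return value;
    }

    public synchronized void put(K key, V value) {
        map.put(key, value);
    }

    public synchronized int size() {          // # Objects in Cache
        return map.size();
    }

    public synchronized double hitRatio() {   // Cache Hit Ratio
        return lookups == 0 ? 0.0 : (double) hits / lookups;
    }
}
```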