Performance metrics driven CI/CD

Michael Villiger – Developer Advocate
[email protected]
@mikevilliger

Introduction to CIO (Continuous Innovation & Optimization)


So how do we change?

Confidential, Dynatrace, LLC

“24 Features in a Box”

“Very late Feedback”

“Ship the whole box”

“Frustration”

“1 Feature at a Time”

“Optimize before Deploy”

“Immediate Customer Feedback”

Continuous Innovation and Optimization

Waterfall -> Agile

Continuous Integration / Delivery

Test Automation & Automated Deployment

Monolith -> Microservices

DevOps & Continuous Deployment on New Stack

A/B Testing, Feature Flags and Metrics-Driven Feedback

Source: https://www.threadless.com/@alanis

700 deployments / YEAR

10 + deployments / DAY

50 – 60 deployments / DAY

Every 11.6 SECONDS

Unicorns: Deliver value at the speed of business

DevOps @ Target

presented at Velocity, DOES and more …

http://apmblog.dynatrace.com/2016/07/07/measure-frequent-successful-software-releases/

“We increased from monthly to 80 deployments per week

… only 10 incidents per month …

… over 96% successful! ….”

“We Deliver High Quality Software, Faster and Automated using New Stack”

“Shift-Left Performance to Reduce Lead Time”, Adam Auerbach, Sr. Dir DevOps

https://github.com/capitalone/Hygieia & https://www.spreaker.com/user/pureperformance

“… deploy some of our most critical production workloads on the AWS platform …”, Rob Alexander, CIO

2011: 2 major releases/year; customers deploy & operate on-prem

2016: 26 major releases/year; 170 prod deployments/day; self-service online sales; SaaS & Managed

Not every Sprint ends without bruises!

Richard Dominguez
Developer in Operations
Prep Sportswear

“In 2013 business demanded to go from monthly to daily deployments”

“80% failed!”

Understanding Code Complexity
• 4 Million Lines of Monolith Code
• Partially coded and commented in Russian

From Monolith to Microservice
• Initial devs no longer with company
• What to extract without breaking it?

Shift Left Quality & Performance
• No automated testing in the pipeline
• Bad builds just made it into production

Cross Application Impacts
• Shared Infrastructure between Apps
• No consolidated monitoring strategy

Scaling an Online Sports Club Search Service

[Chart: Users and Response Time over time (20xx, 2014, 2015, 2016+)]

1) 2-Man Project

2) Limited Success

3) Start Expansion

4) Performance Slows Growth

5) Potential Decline?

Early 2015: Monolith Under Pressure

Can't scale vertically endlessly!

April: 0.52s

May: 2.68s, 94.09% CPU Bound

From Monolith to Services in a Hybrid-Cloud

Move Front End to Cloud

Scale Backend in Containers!

Go live – 7:00 a.m.

Go live – 12:00 p.m.

What Went Wrong?

26.7s Load Time, 5kB Payload

33! Service Calls

99kB - 3kB for each call!

171! Total SQL Count

Architecture Violation: Direct access to DB from frontend service

Single search query end-to-end

It's not about blindly automating the push of more bad code onto new stacks through a pipeline

Understanding Code Complexity
• Existing 10-year-old code & 3rd party
• Skills: Not everyone is a perf expert or born architect

From Monolith to Microservice
• Service usage in the End-to-End Scenarios?
• Will it scale? Or is it just a new monolith?

Understand Deployment Complexity
• When moving to Cloud/Virtual: Costs, Latency …
• Old & new patterns, e.g. N+1 Query, Data

Understand Your End Users
• What they like and what they DON'T like!
• It's a priority list & input for other teams, e.g. testing

The fixed end-to-end use case

“Re-architect” vs. “Migrate” to “New Stack”

2.5s (vs 26.7s), 5kB Payload

1! (vs 33!) Service Call

5kB (vs 99kB) Payload!

3! (vs 171!) Total SQL Count
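The drop in SQL statements for a single search is what fixing an N+1 query pattern typically looks like. The following is a minimal sketch only, assuming a hypothetical two-table schema (`clubs`, `members`) in an in-memory SQLite database; the schema and the function names are illustrative and are not taken from the application shown in the slides.

```python
import sqlite3

def make_db():
    """Create a small in-memory database with 10 clubs and 3 members each."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE clubs (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE members (id INTEGER PRIMARY KEY, club_id INTEGER, name TEXT);
    """)
    conn.executemany("INSERT INTO clubs VALUES (?, ?)",
                     [(i, f"club-{i}") for i in range(1, 11)])
    conn.executemany("INSERT INTO members (club_id, name) VALUES (?, ?)",
                     [(c, f"member-{c}-{m}") for c in range(1, 11) for m in range(3)])
    return conn

def search_n_plus_one(conn):
    """One query for the clubs, then one more query per club (N+1)."""
    statements = 1
    clubs = conn.execute("SELECT id, name FROM clubs ORDER BY id").fetchall()
    result = {}
    for club_id, name in clubs:
        rows = conn.execute(
            "SELECT name FROM members WHERE club_id = ? ORDER BY id",
            (club_id,)).fetchall()
        statements += 1
        result[name] = [row[0] for row in rows]
    return result, statements

def search_joined(conn):
    """Same result, fetched with a single JOIN."""
    rows = conn.execute("""
        SELECT c.name, m.name
        FROM clubs c JOIN members m ON m.club_id = c.id
        ORDER BY c.id, m.id
    """).fetchall()
    result = {}
    for club, member in rows:
        result.setdefault(club, []).append(member)
    return result, 1

conn = make_db()
slow_result, slow_statements = search_n_plus_one(conn)
fast_result, fast_statements = search_joined(conn)
print(slow_statements, fast_statements)
```

Both functions return the same data; only the number of round trips to the database differs, which is exactly the kind of metric a pipeline can count per test.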

MEASURE

All the things!

Dev&Test: Check-In Better Code

Performance: Production Ready Checks! Validate Monitoring

Ops/Biz: Provide Usage and Resource Feedback for next Sprints

Test / CI: Stop Bad Builds Early

Build & Deliver Apps like the Unicorns!

With a Metrics-Driven Pipeline!

Use Case Tests and Monitors | Service & App Metrics | Ops

Build #  Use Case       Stat  #API Calls  #SQL  Payload  CPU    | #ServInst  Usage  RT
17       testNewsAlert  OK    1           5     2kb      70ms   | 1          0.5%   7.2s
17       testSearch     OK    1           35    5kb      120ms  | 1          63%    5.2s
25       testNewsAlert  OK    1           4     1kb      60ms   |
25       testSearch     OK    34          171   104kb    550ms  |
26       testNewsAlert  OK    1           4     1kb      60ms   | 1          0.6%   3.2s
26       testSearch     OK    2           3     10kb     150ms  | 5          75%    2.5s
35       testNewsAlert  -     -           -     -        -      | -          -      -
35       testSearch     OK    2           3     10kb     150ms  | 8          80%    2.0s

Continuous Innovation and Optimization

Re-architecture into “Services” + Performance Fixes

Scenario: Monolithic App with 2 Key Features

Baseline Key Metrics

#SQL, #Threads, Bytes Sent, # Connections, WPO Metrics, Objects Allocated, ...

https://github.com/Dynatrace/Dynatrace-Test-Automation-Samples

https://dynatrace.github.io/ufo/

Regression Detected

Fail the build!
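A minimal sketch of such a gate, using the testSearch metrics reported in this deck for Build 17 (baseline) and Build 25 (candidate). The tolerance values, the metric key names and the functions `find_regressions` and `gate` are assumptions made for illustration; they are not part of any Dynatrace or build-server API.

```python
# Allowed relative growth per metric before the build fails (assumed values).
TOLERANCE = {"api_calls": 0.0, "sql": 0.10, "payload_kb": 0.20, "cpu_ms": 0.20}

# testSearch metrics as shown in the slides.
BUILD_17_SEARCH = {"api_calls": 1, "sql": 35, "payload_kb": 5, "cpu_ms": 120}
BUILD_25_SEARCH = {"api_calls": 34, "sql": 171, "payload_kb": 104, "cpu_ms": 550}

def find_regressions(baseline, candidate, tolerance=TOLERANCE):
    """Return (metric, baseline value, candidate value) for every metric
    that grew beyond its allowed tolerance."""
    regressions = []
    for metric, allowed in tolerance.items():
        limit = baseline[metric] * (1 + allowed)
        if candidate[metric] > limit:
            regressions.append((metric, baseline[metric], candidate[metric]))
    return regressions

def gate(baseline, candidate):
    """Print each regression and return a process exit code (0 = pass)."""
    regressions = find_regressions(baseline, candidate)
    for metric, old, new in regressions:
        print(f"REGRESSION {metric}: {old} -> {new}")
    return 1 if regressions else 0

exit_code = gate(BUILD_17_SEARCH, BUILD_25_SEARCH)
print("exit code:", exit_code)
```

In a build step, passing the return value of `gate` to `sys.exit` would let the build server mark the build as failed, since most build servers treat a non-zero exit code as a failure.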

Integrate into your Build Server

Shift Left Performance

Stop Bad Builds Earlier

1. Performance validation:
   1. response time corridor
   2. # of SQL calls (N+1!)
   3. # of service calls (N+1!)
   4. # of exceptions (Frameworks)
2. Architectural validation
   1. Making sure services are calling the right thing and not skipping layers
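Both kinds of validation can be sketched as plain functions over data captured during a test run. This is an illustration under stated assumptions: the corridor bounds, the layer names in `ALLOWED_CALLS` and the trace format (a list of caller/callee pairs) are hypothetical and do not represent a Dynatrace API.

```python
# Assumed architecture: each layer may only call the layer directly beneath it.
ALLOWED_CALLS = {
    "frontend": {"service"},
    "service": {"database"},
    "database": set(),
}

def within_corridor(response_times_s, lower_s, upper_s):
    """True if every measured response time lies inside [lower_s, upper_s]."""
    return all(lower_s <= t <= upper_s for t in response_times_s)

def layer_violations(calls, allowed=ALLOWED_CALLS):
    """Return the (caller, callee) pairs that are not permitted,
    for example a frontend talking to the database directly."""
    return [(caller, callee) for caller, callee in calls
            if callee not in allowed.get(caller, set())]

# Example trace of one request (hypothetical).
trace = [
    ("frontend", "service"),
    ("service", "database"),
    ("frontend", "database"),  # skips the service layer
]
violations = layer_violations(trace)
print(violations)
print(within_corridor([0.48, 0.52, 0.61], lower_s=0.0, upper_s=1.0))
```

A test stage could fail the build whenever `layer_violations` returns a non-empty list or `within_corridor` returns `False`.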

Performance metrics in your pipeline

Continuous Innovation and Optimization

Dev: Only Check in Quality Code

Test: Stop Bad Features/Builds early

Arch: Optimize scalability, architecture and performance

DevOps: Validate production readiness and monitoring

Biz/Ops: Ensure Happy End Users and Healthy Environment

Biz/App: Innovate through User Analytics and Ops Feedback

Ops: Monitor ALL of your apps

powered by Dynatrace

SLA Monitoring: Are we up or down?

Time of Deployment

Availability dropped to 0%

UEM: Conversion to User Experience

New Deployment + Mkt Push

Increase # of unhappy users!

Decline in Conversion Rate

Overall increase of Users!

Spikes in FRUSTRATED Users!

Resource Budgets & Cost per Feature

4x $$$ to IaaS

#4: Performance -> Behavior

1. Not just for back-end developers!
   1. # of domains
   2. # of images
   3. Resource size
   4. Page size
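The four front-end metrics above can be derived from the list of resources a page loads. A minimal sketch, assuming each resource is described by a dict with hypothetical `url`, `mime_type` and `size_bytes` keys (for example, values extracted from a browser's network log); the sample data is invented for illustration.

```python
from urllib.parse import urlsplit

def page_metrics(resources):
    """Compute simple front-end metrics for one page load."""
    domains = {urlsplit(r["url"]).hostname for r in resources}
    images = [r for r in resources if r["mime_type"].startswith("image/")]
    sizes = [r["size_bytes"] for r in resources]
    return {
        "domains": len(domains),
        "images": len(images),
        "largest_resource_bytes": max(sizes, default=0),
        "page_size_bytes": sum(sizes),
    }

# Invented sample data for one page load.
sample = [
    {"url": "https://shop.example.com/", "mime_type": "text/html", "size_bytes": 5000},
    {"url": "https://cdn.example.com/logo.png", "mime_type": "image/png", "size_bytes": 20000},
    {"url": "https://cdn.example.com/hero.jpg", "mime_type": "image/jpeg", "size_bytes": 150000},
    {"url": "https://static.example.net/app.js", "mime_type": "application/javascript", "size_bytes": 80000},
]
metrics = page_metrics(sample)
print(metrics)
```

Tracking these numbers per build makes front-end regressions, such as a newly added oversized image, visible in the same pipeline as the back-end metrics.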

Performance metrics in your pipeline

Performance management for the digital customer age