JavaOne 2016 "Java, Microservices, Cloud and Containers"

01/05/2023 @danielbryantuk | @spoole167 1

Java, Microservices, Cloud and Containers: Migrating without the Tiers (or Tears)

Daniel Bryant @danielbryantuk
Steve Poole @spoole167

The pitch

• Moving to the cloud requires a fundamental change in mindset
  • Technology
  • Skills (architectural, operational, QA)
  • Organisational design

• DevOps, container technology and microservices are complementary

• Migrating is non-trivial

• Learn from some of our successes (and mistakes)…

Who are we?

Steve Poole

IBM Developer

@spoole167

Daniel Bryant

Chief Scientist, OpenCredo

CTO SpectoLabs

@danielbryantuk

Making Java Real Since Version 0.9

Open Source Advocate

DevOps Practitioner (whatever that means!)

Driving Change

“Biz-dev-QA-ops”

Leading change in organisations

Experience of Docker, k8s, Go, Java

InfoQ, DZone, Voxxed contributor

Introduction

What ‘Cloud’ promises

A virtual, dynamic environment which maximizes use, is infinitely scalable, always available and needs minimal upfront investment or commitment

Take your code, host it on someone else's machine, pay only for the resources you use for the time you use them, AND be able to do that very quickly and repeatedly in parallel

https://www.flickr.com/photos/skohlmann/

The ability to have ‘cloud burst’ capacity is changing the way software is being designed, developed and supported

We’re moving to a more industrial scale:

Why buy one computer for a year when you can hire 365 computers for a day?

It’s a new development world

https://www.flickr.com/photos/vuhung/

“Compute on demand” – it’s what we always wanted

Cloud computing: compute == money

Money changes everything

With a measurable and direct relationship between $£€¥ and CPU, RAM, disk etc., the financial success or failure of a project is even easier to see

And that means…

Even more focus on value for money.

American Society of Civil Engineers

Someone will be looking at your leaky app

Losing unnecessary baggage (you have loads)

Java applications have to get lighter.

Java 9 modularity will help but you have to consider footprint across the board.

Choose your dependencies wisely

Your choice of OS & distribution is important.

The aim is ‘carry on only’

Your application isn’t going on a long trip

https://www.flickr.com/photos/armydre2008/

Startup times

How long do you want to wait?

How long do you have to wait?

Do you need to preemptively start instances ‘just in case’ due to startup time? Too bad: that costs

If the unit of deployment and scaling is an instance of a service it needs to start FAST
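One way to find out how long you actually wait is to have the service report its own time-to-ready. The following is a minimal sketch, assuming a hypothetical `StartupTimer` class and that "ready" is the point where `main` has finished wiring the application; it uses the standard `RuntimeMXBean`, which reports milliseconds since the JVM started.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.RuntimeMXBean;

// Hypothetical helper: reports how long the JVM has been up when the app is ready.
final class StartupTimer {

    /** Milliseconds elapsed since the JVM started, as reported by the runtime MXBean. */
    static long millisSinceJvmStart() {
        RuntimeMXBean runtime = ManagementFactory.getRuntimeMXBean();
        return runtime.getUptime();
    }

    public static void main(String[] args) {
        // ... application wiring (framework startup, connection pools, etc.) would happen here ...
        System.out.println("Ready after " + millisSinceJvmStart() + " ms");
    }
}
```

Logging this figure on every start makes it possible to track startup time as dependencies are added or removed.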

https://www.flickr.com/photos/91295117@N08/

https://www.flickr.com/photos/isherwoodchris/

• Q: How much RAM does your application use?

• A: Too much

Runtime costs

Most cloud providers will charge you for your RAM usage over time, priced per GB-hour ($/GB/hr). (Sometimes the charge is $0)

Increasing -Xmx directly affects cost, which is something businesses can understand

Net effect: you’ll be tuning your application to fit into specific RAM sizes, smaller than you use today.

You need to measure where the memory goes. You’ll be picking some components based on memory usage

Note that increasing the amount of memory for one service multiplies the extra cost by the number of concurrent instances
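The multiplication is easy to underestimate, so here is a small worked sketch. The class name `MemoryBill`, the price of 0.05 per GB-hour and the 720-hour month are all illustrative assumptions, not figures from any provider.

```java
// Hypothetical calculator: the price and instance counts below are made up for illustration.
final class MemoryBill {

    /** Cost for one service: GB per instance x instances x hours x price per GB-hour. */
    static double cost(double gbPerInstance, int instances, int hours, double pricePerGbHour) {
        return gbPerInstance * instances * hours * pricePerGbHour;
    }

    public static void main(String[] args) {
        int hoursPerMonth = 720;       // assumed 30-day month
        double pricePerGbHour = 0.05;  // assumed price, not a real tariff

        double before = cost(1.0, 10, hoursPerMonth, pricePerGbHour);
        double after = cost(2.0, 10, hoursPerMonth, pricePerGbHour);

        // Doubling the memory of one service doubles the bill for all ten instances.
        System.out.printf("1 GB x 10 instances: %.2f per month%n", before);
        System.out.printf("2 GB x 10 instances: %.2f per month%n", after);
    }
}
```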

https://www.flickr.com/photos/erix/

Simply

Java applications are going to be running in a remote, constrained and metered environment

There will be precise limits on how much disk, CPU, RAM, Bandwidth an application can use and for how long

Whether your application is large or small, granular or monolithic, someone will be paying for each unit used

That person will want to get the most out of that investment

https://www.flickr.com/photos/rvoegtli/

Where your code runs day-to-day and moment-to-moment will be driven by economics, legal requirements and how much risk your business wants to take.

Your code has to scale better, be more efficient, resilient, secure and work in constrained environments

You will have to design, code, deliver, support and debug code in new ways

It’s going to be scary

How scary?

design, coding, deployment, startup, execution, scaling, debugging, security, resilience …

Almost everything about your application is affected

https://www.flickr.com/photos/mjtmail/

Resilient applications

Design for short term failure: something fails all the time. Expect data and service outages regularly

Fail and recover: don’t diagnose problems in running systems. Kill it and move on

Every IO operation you perform may fail – do as few as possible

Every IO operation may stall, costing you GB/hrs and resources: time out everything quickly

Every piece of data you receive may be badly formed: check everything

Retry, compensation and backout strategies: these are your new friends

“Everything in the cloud fails all the time”: Werner Vogels
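The "time out everything" and "retry" advice can be sketched with nothing more than the JDK's `ExecutorService`. This is a minimal illustration under stated assumptions, not a production library: the class name `Resilient`, the linear backoff and the parameter names are all hypothetical, and a real system would usually add jitter and a limit on total elapsed time.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper: bounded retries with a per-attempt timeout.
final class Resilient {

    /** Runs the call with a timeout per attempt, retrying up to maxAttempts times. */
    static <T> T callWithRetry(Callable<T> call, int maxAttempts,
                               long timeoutMillis, long backoffMillis) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        ExecutorService executor = Executors.newCachedThreadPool();
        try {
            Exception last = null;
            for (int attempt = 1; attempt <= maxAttempts; attempt++) {
                Future<T> future = executor.submit(call);
                try {
                    return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
                } catch (TimeoutException e) {
                    future.cancel(true); // interrupt the stalled call rather than keep paying for it
                    last = e;
                } catch (ExecutionException e) {
                    last = e;            // the call itself failed; remember why and retry
                }
                if (attempt < maxAttempts) {
                    Thread.sleep(backoffMillis * attempt); // simple linear backoff
                }
            }
            throw last; // every attempt failed or timed out
        } finally {
            executor.shutdownNow();
        }
    }
}
```

A caller would wrap each remote operation, for example `Resilient.callWithRetry(() -> fetchPrice(id), 3, 500, 100)`, where `fetchPrice` stands in for whatever IO the service performs.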

Debugging

Remote support for your family? Fancy having to do that for your own apps?

You have to assume:

You will never be able to log into a remote server.

You will never be able to attach a remote debugger to a failing app. Ever.

All problems must be resolved by local reproduction or logs and dumps (discuss)

https://www.flickr.com/photos/carbonnyc/

Debugging

It gets more challenging.

Failures during deployment or initial startup can be difficult or impossible to diagnose.

If your service instance didn’t start there is little chance of logs being kept!

Learn to love logs, dumps and traces.

Remote log stores and tools are going to be your best friend. BTW: they’ll cost too

https://www.flickr.com/photos/hinkelstone/

Security

When you deploy to public cloud your system will be attacked in minutes. Certainly in < 1hr

Your systems will always be under threat

https://www.flickr.com/photos/ahmadhammoudphotography/

It’s all change

How you design, code, deploy, debug, support etc. will be affected by the metrics and limits imposed on you.

Financial metrics and limits always change behavior. They also create opportunity

Java applications have to get leaner and meaner

You have to learn new techniques and tools

https://www.flickr.com/photos/beigephotos/

Case studies

“Just make it do what the old one does (but better)”

• Case studies
  • ‘Teflon shouldered’ product owner
  • Rebuilding a service three times

• Problem
  • Performing migration without a clear definition of ‘done’
  • Accepting feature creep

“Just make it do what the old one does (but better)”

• Attempt to retrofit BDD/regression tests around application
  • Serenity BDD, Cucumber, JBehave

• Work incrementally with QA team
  • Manually test everything
  • Create tests for new functionality

• Compare input/output
  • Traffic: Twitter’s Diffy
  • Datastores: Reconciliator pattern

Twitter’s Diffy and mysqldbcompare

blog.twitter.com/2015/diffy-testing-services-without-writing-tests

dev.mysql.com/doc/mysql-utilities/1.5/en/mysqldbcompare.html
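The idea behind comparing input/output can be shown in a few lines: feed the same inputs to the old and the new implementation and report where they disagree. The sketch below is an in-process illustration only and is not Diffy's API (Diffy works as a proxy in front of running services); the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;
import java.util.function.Function;

// Hypothetical helper: finds inputs where a rewritten component disagrees with the legacy one.
final class OutputComparator {

    /** Returns the inputs for which the candidate's output differs from the legacy output. */
    static <I, O> List<I> findMismatches(List<I> inputs,
                                         Function<I, O> legacy,
                                         Function<I, O> candidate) {
        List<I> mismatches = new ArrayList<>();
        for (I input : inputs) {
            O expected = legacy.apply(input);
            O actual = candidate.apply(input);
            if (!Objects.equals(expected, actual)) {
                mismatches.add(input);
            }
        }
        return mismatches;
    }
}
```

Recorded production requests make good inputs, since they define "what the old one does" more reliably than a specification written after the fact.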

www.infoq.com/news/2015/04/raffi-krikorian-rearchitecting

My ‘re-architecting’ bible…

“Bounding the context”

• Case studies
  • Large business software provider thought they knew their domain
  • Small CRM company had let domain model entropy set in

• Problem
  • Development team lost sight of the application big picture
  • Lack of architectural awareness and ‘broken windows’

Context mapping (static) & event storming (dynamic)

www.infoq.com/articles/ddd-contextmapping

ziobrando.blogspot.co.uk/2013/11/introducing-event-storming.html

“Bounding the context”

• Create ‘seams’ within codebase
  • Natural domain boundaries
  • Single responsibility principle
  • Look for points of ‘friction’

• Extreme ownership
  • Seize (identify)
  • Clear (refactor logic / data)
  • Hold (metrics and ratchets)
  • Build (move code to service)

“How small is micro?”

• Case studies
  • UK retailer looking to migrate to cloud and microservices
  • Keen to minimise risk

• Problem
  • Previous attempts at gradual migration had failed
  • Integration issues: services either too big or too small
  • Spent a long time building a ‘microservice platform’

“How small is micro?”

• Understand microservice principles and Self-Contained Systems (SCS)

• Utilise the strangler pattern

• ‘Service Virtualisation’ is valuable for testing

• Don’t underestimate the value of PaaS

zeroturnaround.com/rebellabs/microservices-for-the-enterprise/

Self-contained systems (SCS)

http://scs-architecture.org/

UI / Biz / Repo

Monolith

Domains

Modules, components, frameworks, libraries

Self-contained systems (SCS)

SCS

Microservices

Strangling your software (not your manager!)

paulhammant.com/2013/07/14/legacy-application-strangulation-case-studies/

www.nginx.com/blog/refactoring-a-monolith-into-microservices/
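At its core the strangler pattern is a routing decision placed in front of the monolith: requests for functionality that has already been migrated go to the new service, and everything else still goes to the legacy application. The sketch below shows only that decision, assuming a hypothetical `StranglerRouter` class and made-up internal host names; in practice this logic usually lives in a reverse proxy or API gateway.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical router: decides which backend owns a request path during a migration.
final class StranglerRouter {
    private final String monolithBaseUrl;
    // Path prefix -> base URL of the new service that now owns it.
    private final Map<String, String> migrated = new LinkedHashMap<>();

    StranglerRouter(String monolithBaseUrl) {
        this.monolithBaseUrl = monolithBaseUrl;
    }

    /** Registers a path prefix as migrated to a new service. */
    StranglerRouter migrate(String pathPrefix, String serviceBaseUrl) {
        migrated.put(pathPrefix, serviceBaseUrl);
        return this;
    }

    /** Returns the backend URL that should handle the given request path. */
    String route(String path) {
        for (Map.Entry<String, String> entry : migrated.entrySet()) {
            if (path.startsWith(entry.getKey())) {
                return entry.getValue() + path;
            }
        }
        return monolithBaseUrl + path; // not yet migrated, so the monolith still serves it
    }

    public static void main(String[] args) {
        StranglerRouter router = new StranglerRouter("http://monolith.internal")
                .migrate("/orders", "http://orders.internal");
        System.out.println(router.route("/orders/42"));
        System.out.println(router.route("/customers/7"));
    }
}
```

Each newly extracted service adds one `migrate` entry, so the monolith shrinks one route at a time and a bad migration can be rolled back by removing the entry.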

Service Virtualisation (for Dev and Test)

• Existing tooling
  • Hoverfly
  • Wiremock
  • VCR/Betamax
  • Mountebank
  • mirage

Hoverfly

• Lightweight service virtualisation
  • Open source (Apache 2.0)
  • Go-based / single binary
  • Written by @Spectolabs

• Flexible API simulation
  • HTTP / HTTPS
  • More protocols to follow?

• Middleware
  • Remove PII
  • Rate limit
  • Add headers

• Middleware
  • Fault injection
  • Chaos monkey
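Fault injection can also be tried in-process before reaching for a service virtualisation tool. The following sketch is not Hoverfly middleware and does not use any Hoverfly API; it is a hypothetical decorator (`FaultInjector`) that makes a configurable fraction of calls fail, so that tests can check how the caller copes.

```java
import java.util.Random;
import java.util.function.Supplier;

// Hypothetical decorator: randomly fails a fraction of calls to exercise error handling.
final class FaultInjector {

    /** Wraps a call so that roughly failureRate of invocations throw instead of delegating. */
    static <T> Supplier<T> withFaults(Supplier<T> delegate, double failureRate, Random random) {
        return () -> {
            if (random.nextDouble() < failureRate) {
                throw new IllegalStateException("injected fault");
            }
            return delegate.get();
        };
    }
}
```

Passing a seeded `Random` keeps a failing test run reproducible, and a `failureRate` of 0.0 turns the injector off without changing the calling code.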

The value of PaaS…

“Cloud native or ‘lift and shift’”

• Case studies
  • Price comparison website performance dipped upon a migration to the cloud

• Problems
  • Not coding for distributed or ephemeral nature of cloud
  • No reliable creation of cloud environment
  • Not testing in the cloud

“Cloud native or ‘lift and shift’”

• Push apps through to production as early as possible (CI/CD)
  • Build POCs appropriately
  • Include building infrastructure in the pipeline

• Include NFR testing in the build pipeline

• Dev, QA and Ops must cultivate ‘mechanical sympathy’
  • Everything in the cloud is networked

• Configure local development environments as appropriate

NFR testing in the (cloud) pipeline

NFR testing in the (container) pipeline

NFR testing resources

• Performance
  • JMeter
  • Gatling

• Fault-tolerance
  • Hoverfly
  • Wiremock/Saboteur

• Security
  • bdd-security (OWASP ZAP)
  • OWASP Dependency-Check
  • Docker Bench for Security

Security is vital (but often ignored)

www.youtube.com/watch?v=c9uvV4ChIXw

www.infoq.com/news/2016/08/secure-docker-microservices

Shameless plugs…

www.youtube.com/watch?v=A1982GdXXSA

“Containerise all the things”

• Problem
  • JVM respecting resource limits
  • OOM: Unable to create thread
  • Random application stalling

• Case studies
  • www.notonthehighstreet.com

“Containerise all the things”

• Set container memory appropriately
  • docker run --memory="Xg"
  • JVM requirements = Heap size (Xmx) + Metaspace + JVM overhead
  • Account for native thread requirements e.g. thread stack size (Xss)
  • Watch out for ulimits

• Entropy
  • Host entropy can soon be exhausted by crypto operations
  • -Djava.security.egd=file:/dev/urandom
  • Be aware of security ramifications
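The memory formula above can be turned into a rough budget. This is a sketch under stated assumptions: the class name `ContainerMemoryBudget` and every figure passed to it are illustrative, and the "overhead" term lumps together things such as the code cache, GC structures and direct buffers that should really be measured for your own application.

```java
// Hypothetical calculator: all sizes below are illustrative, not recommendations.
final class ContainerMemoryBudget {

    /** Heap + Metaspace + thread stacks + other JVM overhead, in megabytes. */
    static long requiredMegabytes(long heapMb, long metaspaceMb,
                                  int threads, long stackKb, long overheadMb) {
        long threadStacksMb = (threads * stackKb + 1023) / 1024; // KB to MB, rounded up
        return heapMb + metaspaceMb + threadStacksMb + overheadMb;
    }

    public static void main(String[] args) {
        // Assumed: -Xmx512m, 128 MB Metaspace, 200 threads at -Xss1m, 100 MB other overhead.
        long budgetMb = requiredMegabytes(512, 128, 200, 1024, 100);
        System.out.println("Suggested container limit: at least " + budgetMb + " MB");

        // What this JVM believes it has. Depending on the JVM version, these values
        // may reflect the host rather than the container limits.
        Runtime rt = Runtime.getRuntime();
        System.out.println("Max heap the JVM will use: " + rt.maxMemory() / (1024 * 1024) + " MB");
        System.out.println("Processors the JVM sees: " + rt.availableProcessors());
    }
}
```

Comparing the budget with the value given to `docker run --memory` shows quickly whether the heap setting alone is already too close to the container limit.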

Containerising our knowledge

Key lessons learned

Lessons learned from the trenches

• Specify goals and targets of migration (and retrospect)
• Undertake just enough up front design (contexts, APIs, integration)
• Understand distributed systems (12 factors etc)
• Start pushing to production ASAP
• There is nothing wrong with PaaS
• Programmable infrastructure is a key enabler
• Don’t forget the NFRs
• Containers and microservices are complementary to cloud

Recommended reading

Thanks for listening

• Any questions?

• Daniel Bryant (@danielbryantuk)
• Steve Poole (@spoole167)
