Reactive Microservices Roadshow | 28.09.2016 | Christian Deger
Highway to heaven
Building microservices in the cloud
27.10.16
Christian Deger
Chief Architect
[email protected]
@cdeger
AutoScout24 in Munich, 6 Years
Software Engineer, Team Lead, Architect, Chief Architect
2.4 Million Vehicles
Who does not know AS24?
Listings business, completely digital.
Our focus is the search for the vehicle and everything around it.
Microservices in the cloud adoption?
Who is doing CD?
Who is in the cloud?
Who plans on moving to the cloud?
Who knows about microservices?
Who is doing microservices?
2000 Servers
2 Data Centers
MTBF optimized
Proven delivery engine.
Highly optimized, but from the last decade.
IT platform supported growth for >6 years.
Availability = MTBF / (MTBF + MTTR)
Proven agile and lean principles.
Dev and Ops Silos
Development: Change
Operations: Stability
The split at CTO level.
Reflexes:
Protection: Ticket system.
Blaming: Devs ship bad software. Ops is too slow.
Introduced proxies between silos: DevOps as a role is an anti-pattern.
QA handover, product handover.
In AutoScout24 there was resistance against change:
"We are not like Netflix."
"That is too expensive."
"We have built a private cloud." (sunk cost fallacy)
Absurd lift-and-shift cost calculation.
New CEO
Scout24 was sold at the end of 2013.
New CEO Greg Ellis at the beginning of 2014.
Are you ready for the future?
Do you attract talent?
We got good, agile .NET developers, but from banks and insurance companies.
We were not getting the talent for an internet business.
We started to think about our ecosystem and the flywheel that drives it.
.NET: small, slow
Linux/JVM: larger, faster
What does a 21st century tech company look like?
We want to learn from and participate in the advances the unicorns in our industry make.
Visited Netflix, LinkedIn, Airbnb, etc.
Innovation for us no longer comes from our previous enterprise suppliers: Oracle, IBM, Microsoft.
Great Design
Universally Connected
Mobile First
Instant Business Value
Massive Data Insight
Highly Available
Hmm, we are good, but not great
We are not bad, but we are not great.
The new questions triggered something...
Reboot everything
Escape the gravity of the status quo.
Perhaps it's a pattern: Just rebuild it.
Project Tatsu
Fly at the speed of fear: disruptive.
Japanese dragon: flying beast.
Started Nov. 2014 with one team, now at 4 teams.
.NET / Windows to JVM / Linux
Monolith to Microservices
Data center to AWS
Devs + Ops to Collaboration culture
Involve product people
Windows is not used in 21st century companies.
Linux is easier to automate.
Cloud native architecture, use AWS services.
Closing the feedback loop.
Product: Not a pure technical transformation.
Leaner version of AS24.
Shiny new cut.
Major JVM Languages
We settled on the JVM very early.
Alternatives were too young and risky.
Many companies we try to learn from use the JVM.
Go was favored by some.
No traction in major internet companies
Not accepted by C# developers
Attracts talent
Is a starting point
Twitter, Gilt, SoundCloud, Zalando.
Polyglot: Starting point only.
No endless language war.
Would we make the same choice again?
+ It does attract talent
+ Ecosystem: Play, Akka, Spark
- Long learning curve
- No idiomatic Scala / many concepts
Why Microservices?
Speed
Independently deployable
Fast local decisions
Autonomous teams
Strong boundaries
Loosely coupled
Technology diversity
Scale the organization
Scale the organization + Speed.
Speed: Not waiting for others.
Technology diversity: I don't want to end up with another migration.
Strong boundaries: A modularized monolith degrades. The process boundary acts as a strong architectural force.
Bounded Context.
Strong module boundaries: http://martinfowler.com/articles/microservice-trade-offs.html#boundaries
Independent deployment: http://martinfowler.com/articles/microservice-trade-offs.html#deployment
Technology diversity: http://martinfowler.com/articles/microservice-trade-offs.html#diversity
Same direction
The first team was tightly aligned.
They wanted to do the right thing, and they were prepared.
When ramping up to four teams, we realized that we need the same direction.
STRATEGIC GOALS: Goals of the business side
ARCHITECTURAL PRINCIPLES: High-level principles
DESIGN AND DELIVERY PRINCIPLES: Tactical measures
REDUCE TIME TO MARKET
Establish fast feedback loops to learn, validate and improve. Remove friction, hand-offs and undifferentiated work.
MOBILE FIRST
Start small and use device capabilities.
SUPPORT DATA-DRIVEN DECISIONS
Provide relevant metrics and data for user and market insights. Validate hypotheses for problems worth solving.
YOU BUILD IT, YOU RUN IT
The team is responsible for shaping, building, running and maintaining its products. Fast feedback from live and customers helps us to continuously improve.
ORGANIZED AROUND BUSINESS CAPABILITIES
Build teams around products, not projects. Follow the domain and respect bounded contexts. Make boundaries explicit. Inverse Conway Maneuver.
LOOSELY COUPLED
By default avoid sharing and tight coupling. No integration database. Don't create the next monolith.
MACRO AND MICRO ARCHITECTURE
Clear separation. Autonomous microservices within the rules and constraints of the macro architecture.
AWS FIRST
Favor AWS platform services over managed services, over self-hosted OSS, over self-built solutions.
DATA-DRIVEN / METRIC-DRIVEN
Collect business and operational metrics. Analyze, alert and act on them.
ELIMINATE ACCIDENTAL COMPLEXITY
Strive to keep it simple. Don't over-engineer. Focus on necessary domain complexity.
AUTONOMOUS TEAMS
Make fast local decisions. Be responsible. Know your boundaries. Share findings.
INFRASTRUCTURE AS CODE
Automate everything: Reproducible, traceable, auditable and tested. Immutable servers.
CROSS-FUNCTIONAL TEAMS
Engineers from all backgrounds work together in collaborative teams as engineers and share responsibilities. No silos.
BE BOLD
Go into production early. Value monitoring over tests. Fail fast, recover and learn. Optimize for MTTR, not MTBF.
SECURITY, COMPLIANCE AND DATA PRIVACY
Build with least privilege and data privacy in mind. Know your threat model. Limit blast radius.
COST EFFICIENCY
Run your segment in the right balance of cost and value.
ONE SCOUT IT
Foster collaboration. Harmonize and standardize tools. Pull common capabilities into decoupled platform services.
BEST TALENT
Autonomy, Purpose and Mastery: We know why we do things, we decide how to approach them and deliberately practice our skills.
Version 2.0
Icons made by Freepik from www.flaticon.com are licensed under CC BY 3.0
Disclaimers
Strategic Goals include business and IT specific goals, because the target audience for the principles is IT.
"Don't be stupid!" is relevant for all principles. Principles, not dogmas.
Obvious things that are already part of our core beliefs are not mentioned, for example Lean, Agile and Continuous Integration.
Mobile First
Sync state between platforms.
Tech Culture
For a better understanding of those terms, see Dan Pink: https://www.youtube.com/watch?v=u6XAPnuFjJc
Zalando and their Radical Agility: https://tech.zalando.de/working-at-z/
Cost efficiency
Includes infrastructure.
One Scout IT
Appliance model included as a concept for a platform service.
Organized around Business Capabilities
Includes internal products in the business or infrastructure platform.
Business capabilities typically map to bounded contexts known from DDD.
Inverse Conway Maneuver: Loosely coupled teams responsible for business capabilities lead to corresponding services aligned with those business capabilities.
Expert exchange is not in violation: For example, platform experts can help co-create a service in a product team.
Eliminate Accidental Complexity
Essential complexity is the core of the problem we have to solve. It is based on true functional and cross-functional requirements.
Accidental complexity is all the other stuff that doesn't directly relate to the solution derived from the requirements.
Boy scout rule.
Loosely coupled
"Avoid sharing" includes shared infrastructure.
Using a multi-tenant capable platform API is not sharing.
No premature generalization. First have the problem and solve it. With the next incarnation of the same problem, start generalizing it.
AWS First
"Favor ... over" as in the Agile Manifesto: There are use cases for things on the right, but we start with and prefer things on the left.
Document your decisions (ADR, architecture decision record).
Be aware of NIH (not invented here).
You build it, you run it
Owning the product and service lifecycles also includes retiring them.
Be bold
MTBF: Mean time between failures
MTTR: Mean time to recovery
Availability = MTBF / (MTBF + MTTR)
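To make the "optimize for MTTR, not MTBF" argument concrete, here is a minimal sketch of the availability formula. The MTBF and MTTR figures are hypothetical and chosen only for illustration; they are not AutoScout24 measurements.

```python
def availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Steady-state availability: MTBF / (MTBF + MTTR)."""
    if mtbf_hours <= 0 or mttr_hours < 0:
        raise ValueError("MTBF must be positive and MTTR non-negative")
    return mtbf_hours / (mtbf_hours + mttr_hours)


# Hypothetical system A: fails rarely, but recovery takes 10 hours.
rare_but_slow = availability(mtbf_hours=1000, mttr_hours=10)

# Hypothetical system B: fails ten times as often, but recovers in 30 minutes.
frequent_but_fast = availability(mtbf_hours=100, mttr_hours=0.5)

print(f"{rare_but_slow:.4f}")      # 0.9901
print(f"{frequent_but_fast:.4f}")  # 0.9950
```

Under these assumed numbers, the system that fails more often is still more available, because recovery is twenty times faster.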
Autonomous Teams
Know your boundaries: Be thoughtful when you are stepping into the realms of the macro architecture or outside your business capability.
Macro and Micro Architecture
For example, Scala is our default product engineering language. With good reasons you are free to deviate from that.
"Your freedom is not my responsibility" (quote from a Netflix engineer): When a team makes a decision like that, it needs to be able to support it in the long run.
Infrastructure as code
Phoenix servers and immutable containers.
Report and alert on security and conformity violations.
Build, Measure, Learn
Time to market: the core goal, fast feedback.
Why fast feedback: Build, Measure, Learn.
Autonomous teams organized around business capabilities
Have empowered, self-organizing cells to get teams that build decoupled services.
Collective code ownership does not scale.
We want to scale the organization and still be fast.
Products, not projects.
http://martinfowler.com/bliki/BusinessCapabilityCentric.html
Follow the domain and respect bounded contexts.
Make fast local decisions.
Be responsible.
Collaboration culture
DevOps for us is not a team, a role or a tool: it is collaboration.
No silos, no handovers, no finger-pointing.
No software that is hard to operate.
No infrastructure nobody needs.
But it also means using tools like Slack and GitHub.
You build it, you run it.
Freedom and Responsibility.
The team is responsible for shaping, building, running and maintaining its products.
Fast feedback from live and customers helps us to continuously improve.
The one who feels the pain of being woken up at night is the one who is able to make the changes.
-> real resilient services
-> Customer first
Monitoring is the new testing
Tests only run on delivery. Monitoring runs all the time.
Performance.
CD Pipelines.
Open OpsGenie Alerts.
Costs per day.
Page Speed.
Next step: Monitor business KPIs.
How (not) to share
shared nothing as default
loosely coupled
fast local decisions
voluntary adoption
exception: macro concerns
Sharing implies dependencies.
Share only with good reason.
Availability over shared nothing.
No side effects: Story of shared state in Dashing.
Use over re-use.
Copy and paste, OSS, library.
Fast local decisions over committee.
Re-use only after hardening.
Follow the trail
Convenience offerings.
People will stick with decisions when they are good enough for them.
They will create different solutions when it is important for them.
CD tool, language decision, service template.
Continuous Delivery
Fast feedback from real users: Build, Measure, Learn.
Fast feedback on quality of service.
Creating value for the user is daily business.
We started with one release per month...
...and moved to many releases per day.
But only code changes were delivered.
Application code in one repository per service.
CI: Deployment package as artifact.
CD: Deliver package to servers.
Delivery Pipeline Data Center
Commit stage: Unit tests etc.
Additional database migration scripts.
Blue/green delivery on the instance.
Application code and infrastructure specification in one repository per service.
CI: Deployment package and infrastructure declaration as artifact.
CD:
1. Create or update service infrastructure.
2. New instances pull down package and start application.
Delivery Pipeline AWS
Every change goes through the delivery pipeline.
High traceability.
Delivery: CloudFormation + ASG (Auto Scaling group).
Dependencies: Global stack and base AMI.
Traceable
Repeatable
Reliable
(Faster)
Cattle, not pets
Phoenix servers.
No configuration drift.
Security: Alerting on instances that are too old.
Tradeoff: cycle time. Commit to production currently takes 20-30 minutes: too slow!
Separate code deployment from feature release
Who is using feature branches and does CI?
No merging of branches anymore.
You are not doing CI when you are branching.
Short-lived branches (< 1 day) are OK.
Dynamic feature toggles:
Canary releases.
Product can switch on a feature for acceptance.
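A dynamic feature toggle that supports canary releases could look roughly like the sketch below. The deck does not say how the toggles were implemented, so the percentage rollout by hashing a user id, the function names and the toggle names are all assumptions made for illustration.

```python
import hashlib


def bucket(user_id: str, feature: str) -> int:
    """Map a user deterministically to a bucket in 0..99 for a given feature."""
    digest = hashlib.sha256(f"{feature}:{user_id}".encode("utf-8")).hexdigest()
    return int(digest, 16) % 100


def is_enabled(feature: str, user_id: str, toggles: dict) -> bool:
    """A feature is on for a user if the user's bucket is below the rollout percentage."""
    rollout_percent = toggles.get(feature, 0)  # unknown features default to off
    return bucket(user_id, feature) < rollout_percent


# Hypothetical toggle state; in practice it would be loaded at runtime,
# so a feature can be released without a new deployment.
toggles = {
    "new-search": 10,    # canary: roughly 10% of users
    "dealer-chat": 100,  # fully released
}

print(is_enabled("dealer-chat", "user-42", toggles))    # True
print(is_enabled("not-deployed", "user-42", toggles))   # False
```

Because the bucket is derived from a hash, the same user always gets the same answer for a feature, which keeps the canary group stable while the percentage is raised.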
No staging environment
We only maintain one environment where all services integrate: Production.
Be bold!
MTTR over MTBF.
Evolution
Fast evolution: Working code and infrastructure for all options within 2 weeks.
SQS + S3: Inspired by ImmobilienScout24.
Kinesis + S3: Feedback from AWS.
Kinesis + DynamoDB: Queryable, storage costs not prohibitive (7x).
SQS + DynamoDB: Queue better than LATEST or TRIM_HORIZON.
Proxy + DynamoDB: Changes are collected via queue table in Oracle.
DynamoDB: DynamoDB is a global service.
Unlimited Infrastructure with APIs
Automated conformity and security:
Alarms for unexpected production changes, that is, changes that did not go through the pipeline.
Alarms for old base AMIs, guarding against missing security patches.
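The check behind an "old base AMI" alarm can be sketched as a simple age comparison. This is only an illustration: the `Instance` type, the 30-day threshold and the fleet data are hypothetical, and no AWS API is called here. A real check would read the instance and AMI metadata from AWS first.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class Instance:
    instance_id: str
    ami_created_at: datetime  # creation time of the base AMI the instance runs on


def instances_with_stale_ami(instances, max_age, now=None):
    """Return the instances whose base AMI is older than max_age."""
    now = now or datetime.now(timezone.utc)
    return [i for i in instances if now - i.ami_created_at > max_age]


# Hypothetical fleet, evaluated at a fixed point in time.
now = datetime(2016, 6, 30, tzinfo=timezone.utc)
fleet = [
    Instance("i-fresh", now - timedelta(days=3)),
    Instance("i-stale", now - timedelta(days=45)),
]

stale = instances_with_stale_ami(fleet, max_age=timedelta(days=30), now=now)
print([i.instance_id for i in stale])  # ['i-stale']
```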
Migration strategy: Vertical slices
Allows a hybrid approach: routing traffic back to the DC.
PageSpeed Module
[Diagram: ngx_pagespeed takes css (page), js (page), css (fragment) and js (fragment) and delivers css (page+fragment) and js (page+fragment).]
No shared asset pipeline.
Pages:
are accessible via (localised) URL
are owned by one team
could be cacheable
Fragments:
are parts of a page
don't know the original request
should send cache headers
Assets:
should be combined and minified
Caching:
CloudFront Caching: Caching on edge locations. Respects cache headers from Jigsaw.
PageSpeed Caching: Caches combined assets.
Backend Caching: Respects cache headers from microservices.
Event Streaming
Microservices challenge: Collect data for BI from all services.
Application events are written directly to Kinesis as JSON.
Ingesting 1.4 TB per day.
Intended to be used for real-time processing later.
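As a rough sketch of what "written directly to Kinesis as JSON" can mean in a service, the function below builds one record. The event fields, stream name and ids are hypothetical. The keys `StreamName`, `Data` and `PartitionKey` follow the parameters of the Kinesis PutRecord call (for example `put_record` on the boto3 Kinesis client); no AWS call is made in this sketch.

```python
import json
from datetime import datetime, timezone


def to_kinesis_record(stream_name, event_type, entity_id, payload, occurred_at=None):
    """Serialize an application event as JSON and wrap it as PutRecord arguments."""
    occurred_at = occurred_at or datetime.now(timezone.utc)
    event = {
        "type": event_type,
        "entityId": entity_id,
        "occurredAt": occurred_at.isoformat(),
        "payload": payload,
    }
    return {
        "StreamName": stream_name,
        "Data": json.dumps(event, sort_keys=True).encode("utf-8"),
        # Same entity id -> same shard, which keeps the events of one entity in order.
        "PartitionKey": entity_id,
    }


record = to_kinesis_record(
    stream_name="application-events",  # hypothetical stream name
    event_type="ClassifiedViewed",     # hypothetical event type
    entity_id="classified-123",
    payload={"source": "search-result"},
    occurred_at=datetime(2016, 6, 30, 12, 0, tzinfo=timezone.utc),
)
print(record["PartitionKey"])            # classified-123
print(json.loads(record["Data"])["type"])  # ClassifiedViewed
```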
Event Sourcing
One-way data highway and data pumps
One-way data highway.
Event Sourcing: History of all changes.
Design decision: No queries against the DC. Data needs to be pushed into AWS.
Ability to replay all events from the beginning of time.
Events = all changes to a classified.
Database change propagation = poor man's event sourcing, no intent captured.
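The ability to replay all events can be illustrated with a small sketch that rebuilds the current state of classifieds from their change history. The event types and field names are hypothetical; the deck does not show the real event schema.

```python
def apply_event(state, event):
    """Apply one change event to the in-memory view of all classifieds."""
    classified_id = event["classifiedId"]
    kind = event["type"]
    if kind == "ClassifiedCreated":
        state[classified_id] = dict(event["data"])
    elif kind == "PriceChanged" and classified_id in state:
        state[classified_id]["price"] = event["data"]["price"]
    elif kind == "ClassifiedDeleted":
        state.pop(classified_id, None)
    return state


def replay(events):
    """Rebuild the current state by replaying every event in order."""
    state = {}
    for event in events:
        apply_event(state, event)
    return state


# Hypothetical history of changes, oldest first.
events = [
    {"type": "ClassifiedCreated", "classifiedId": "c1", "data": {"make": "BMW", "price": 20000}},
    {"type": "ClassifiedCreated", "classifiedId": "c2", "data": {"make": "Audi", "price": 15000}},
    {"type": "PriceChanged", "classifiedId": "c1", "data": {"price": 18500}},
    {"type": "ClassifiedDeleted", "classifiedId": "c2", "data": {}},
]

current = replay(events)
print(current)  # {'c1': {'make': 'BMW', 'price': 18500}}
```

Each event names what happened ("PriceChanged"), which is the intent that plain database change propagation does not capture.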
Kafka Connect, LinkedIn DataBus
Commit to Production: 20 Minutes Cycle Time
New Service:
1 Day Service Bootstrapping
3 Days Frontend
4 Days Backend
OEM Testdrive: https://www.autoscout24.de/testdriver
Service Bootstrapping: CD pipeline to production, monitoring, alerts.
Frontend: Move the pixel. Reuse a lot of other services via SSI and UI composition.
Backend: Persistence, validation, integrations.
Cycle Time: From commit to production
Status Quo
15 Teams
25 Lambda Functions
200 Repositories
40 Microservices
9 Systems
Status in June 2016.
Systems in terms of Self-Contained Systems.
?
Picture Credits
Tatsu Sign by Martin Lewison from The Hague, Zuid-Holland, The Netherlands under CC BY-SA 2.0
Martin Fowler by Webysther Nunes under CC BY-SA 4.0
Werner Vogels by Guido van Nispen under CC BY 2.0
"HotWheels - '69 Ford Torino Talladega" by Leap Kye, licensed under CC BY-ND 2.0
Differences between Traditional vs Next Generation by Simon Wardley under CC BY-SA 3.0
Enterprise IT Adoption Cycle by Simon Wardley under CC BY-SA 3.0
And the future is private by Simon Wardley under CC BY-SA 3.0
Leosvel et Diosmani by Ludovic Péron under CC BY-SA 3.0
Spare wheel by Brian Snelson under CC BY 2.0
Wandergeselle by Sigismund von Dobschütz under CC BY-SA 3.0
Wheel clamps Texas by Richard Anderson from Denton, United States (Boots.) under CC BY-SA 2.0
Sharing Sucks (4536747557) by eyeliam from Portland, United States under CC BY 2.0
Traffic Jam by Doo Ho Kim under CC BY-SA 2.0
Puzzling by Bernd Gessler (Own work) under CC BY-SA 3.0
Amazon16 by Neil Palmer/CIAT under CC BY-SA 2.0
Pizza by Jakob Dettner, Rainer Zenz under CC BY-SA 2.0 DE
Bezos Iconic Laugh by Steve Jurvetson under CC BY 2.0