Software Architecture for Cloud Infrastructure

Embed Size (px)

Text of Software Architecture for Cloud Infrastructure

  • Software ArchitectureSoftware Architecturefor Cloud Infrastructurefor Cloud Infrastructure

  • Tapio RautonenTapio Rautonen

    software architect

    Enabling cloud superpowers in software development.

  • Cloud computing characteristicsCloud computing characteristics

    On-demand self-serviceConsumer can provision computing capabilities without requiring human interaction

    Broad network accessCapabilities are available over the network and accessible by heterogeneous clients

    Resource poolingProvider's computing resources are pooled to serve multiple consumers dynamically

    Rapid elasticityCapabilities can be elastically provisioned and appear unlimited for the consumer

    Measured serviceAutomatically controlled and optimized resources by metering capabilities

  • Software architecture principlesSoftware architecture principles

    Intentional architecture with emergent design High modularity

    high cohesion, loose coupling low algorithmic complexity

    Well described elements expressive and meaningful names and APIs clean code

    Passes all defined tests or acceptance criteria Lightweight documentation

  • Software architecture Software architecture cloud computing cloud computing


  • Distributed computing fallaciesDistributed computing fallacies

    1. The network is reliable2. Latency is zero3. Bandwidth is infinite4. The network is secure5. Topology doesn't change6. There is one administrator7. Transport cost is zero8. The network is homogeneous

    Peter Deutsch / Sun Microsystems

  • Design for failureDesign for failure

  • New era of design patternsNew era of design patterns

    Cache-Aside Circuit Breaker Compensating Transaction Command and Query Responsibility Segregation (CQRS) Event Sourcing Queue-Based Load Leveling Sharding Throttling

  • Cache-Aside patternCache-Aside pattern

    Aggregated search combining multiple services requires additional search cache (Solr, ElasticSearch, ...)

    Improve performance of frequentlyread data

    Local caching resultsinconsistent state betweeninstances

    Consistency of data storesand cache is really hard tomaintain

    There are only

    two hard thin


    There are only

    two hard thin


    in Computer S

    cience: cache

    in Computer S

    cience: cache

    invalidation a

    nd naming thi


    invalidation a

    nd naming thi


    Phil KarltonPhil Ka


  • Circuit Breaker patternCircuit Breaker pattern

    Cloud infrastructure and distributed systems allow remote services to fail in ways beyond imagination prevent failures to cascade allow system to operate in degraded mode

    Suitable for big microservices architecture creates routing complexity and overhead potential single point of failure must be highly available

    Enables central logging and metrics dashboards and central state awareness

  • Circuit Breaker patternCircuit Breaker pattern

    During normal operation, breaker is in Closed state failures will eventually trip breaker

    While in Open state calls will fail fast and after some period attempts to reset

    In Half-Open state on successful call resets breaker, otherwise trips breaker

  • Netflix Hystrix dashboardNetflix Hystrix dashboard

  • Compensating Transaction patternCompensating Transaction pattern

    Irrecoverable failures in distributed systems are hard eventual consistency, rollbacks are impossible

    Distributed transactions (XA) difficult and complex to implement still not bulletproof not usable for generic REST services

    Undo the effects of the original operation defines an eventually consistent steps for a reverse operation compensation logic may be difficult to generate operations should be idemponent to prevent further catastrophe

  • CQRS patternCQRS pattern

    Command and Query Responsibility Segregation segregates read and write operations with separate interfaces allows to maximize performance, scalability and security

    Introduces flexibility at the cost of complexity traditionally same DTO is used for read and write operations different data model for read (query) and write (command) supports different read and write data stores not suitable for simple business rules where CRUD is sufficient

    Often used together with event sourcing pattern

  • Event Sourcing patternEvent Sourcing pattern

    Append only store of events that describe actions for data simplifies tasks in complex domains by avoiding synchronization improves performance, scalability and consistency for

    transactional data can serve multiple different materialized views

    Maintains full audit trail and history enables compensation actions supports play back at any point in time

    Events are simple the operation logic they describe might not be updates and deletes must be implemented with compensation at least once publication requires idemponent consumers

  • Queue-Based Load Leveling patternQueue-Based Load Leveling pattern

    Buffer between task and service minimizes the impact of peaks of work load task flood may result unresponsive or failure of the service

    Task provider and service runs asynchronously queue decouples tasks from the service service can handle tasks at its own optimal pace requires a mechanism for responses if the task expects a reply



    Message queue

  • Sharding patternSharding pattern

    Divide data store into multiple horizontal partitions improves scalability when handling large volumes of data

    Overcomes limitations of single server data store finite storage space computing resources for large number of concurrent users network bandwidth governed performance geographically limited storage for legal or performance reasons

    Strategy defines the sharding key and data distribution wrong sharding strategy results bad performance balancing shards is not trivial, rebalancing is expensive referential integrity and consistency is hard to maintain

    Configuring and managing big set of shards is a challenge

  • Throttling patternThrottling pattern

    Controls the consumption of resource used by a service allows the system to function and meet SLA on extreme load

    Throttle after soft limit of resource usage is exceeded reject requests for user that exceed the soft limits disable or degrade functionality of nonessential services queue-based load leveling with priority queues

    Throttling is an architectural decision impacts the entire design of the system must be detected and performed very quickly services should return specific error code for clients can be used as an interim measure while autoscaling

  • SaaS architecture methodologySaaS architecture methodology

    Declarative formats for setup and runtime automation Clean contract with infrastructure for maximum portability Cloud platform deployments, obviating the need for ops Tooling, architecture and dev practices support scaling

    Modern software is delivered from the cloud to heterogeneous clients on-demand

  • The Twelve-Factor AppThe Twelve-Factor App

    I. Codebaseone codebase tracked in revision control, many deploys

    II. Dependenciesexplicitly declare and isolate dependencies

    III. Configstore config in the environment

    IV. Backing Servicestreat backing services as attached resources

    V. Build, release, runstrictly separate build and run stages

    VI. Processesexecute the app as one or more stateless processes

  • The Twelve-Factor AppThe Twelve-Factor App

    VII. Port bindingexport services via port binding

    VIII.Concurrencyscale out via the process model

    IX. Disposabilitymaximize robustness with fast startup and graceful shutdown

    X. Dev/prod paritykeep development, staging, and production as similar as possible

    XI. Logstreat logs as event streams

    XII. Admin processesrun admin/management tasks as one-off processes

  • Break the monolith in piecesBreak the monolith in pieces

    Monoliths come with a burden cognitive overload for developers scaling and continuous deployment becomes difficult long-term commitment to technology stack allows taking shortcuts for architectural design

    If you can't build a monolith, what makes you think microservices are the answer?

  • AWS reference architectureAWS reference architecture

  • Service discoveryService discovery

    Services need to know about each other inexistence of centralized service bus smart endpoints and client side load balancing

    Service registry is the new single point of failure? value availability over consistency

    Provides a limited set of well defined features services notify each other of their availability and status cleaning of stale services easy integration with standard protocols like HTTP or DNS notifications on services starting and stopping

  • Ephemeral runtime environmentsEphemeral runtime environments

    Short lifetime of an application runtime environment scaling, testing, materializing ideas requires highly automatized infrastructure

    Nothing can be stored in the runtime environment logs, file uploads, database storage files, configuration

    Results stateless services optimal for horizontal scaling integrates to State as a Service

    Must be repeatable and automatically provisioned

  • Metrics and loggingMetrics