designing distributed scalable and reliable systems


Page 1: designing distributed scalable and reliable systems

designing distributed scalable and reliable systems

Page 2: designing distributed scalable and reliable systems

Mauro Servienti
CTO @ Mastreeno, LTD

[email protected]
@mauroservienti

//milestone.topics.it
//github.com/mauroservienti

NServiceBus trainer/support
RavenDB trainer

Microsoft MVP – Visual C#

Page 3: designing distributed scalable and reliable systems

Side note: so little time…and I’m Italian :-)

• We have tons of things to say;
• We have tons of slides (ok, my fault :-P);
• My English is the worst thing you’ll ever hear :-)
• If you guys can hold your questions until the break, it will be much easier :-)
  – Especially for me :-P

Page 4: designing distributed scalable and reliable systems

Resources

• Slides on SlideShare: //slideshare.net/mauroservienti
• Samples on GitHub: //github.com/mauroservienti
  – Repository: Conferences
  – Branches:
    • Full-Stack-Sample
    • 2014/SkillsMatter-InTheBrain-Distributed

Page 5: designing distributed scalable and reliable systems

TENETS
Articles of faith…

Page 6: designing distributed scalable and reliable systems

None of the following is true

• The network is reliable;
• Latency is close to zero or irrelevant;
• Bandwidth is unlimited;
• The network is secure;
• Topology doesn’t change;
• Transport cost is irrelevant;
• The network is homogeneous;

Page 7: designing distributed scalable and reliable systems

DEFINITIONS
Let’s get in touch…

Page 8: designing distributed scalable and reliable systems

Consistency

The rate of agreement of observers looking at a system at a given point in time.

The more the observers agree on what they see, the more consistent the system is.

Page 9: designing distributed scalable and reliable systems

Coupling

The rate of dependency among parts of a system.

The more a change to one portion of the system impacts other portions, the more coupled the system is.

Page 10: designing distributed scalable and reliable systems

Temporal Coupling

It’s a special form of coupling.

The more the unavailability of one portion of the system impacts other parts, the more temporally coupled the system is.

Page 11: designing distributed scalable and reliable systems

Scalability

The ability of a system to handle a growing amount of work in a capable manner.

Scalability is generally difficult to define, and in any particular case it is necessary to define the specific requirements for scalability.

Page 12: designing distributed scalable and reliable systems

The more we scale, the less we can rely on consistency

Page 13: designing distributed scalable and reliable systems

“ACD/C”

Scaling can be achieved once we understand that we need to choose, and to accept the consequences of our decisions. Our pillars should be:
- Asynchronous;
- Cached;
- Distributed;
- And not Consistent;

Page 14: designing distributed scalable and reliable systems

CONSISTENCY?
Are you kidding me :-)

Page 15: designing distributed scalable and reliable systems

A strange world :-)

• A new order comes in;
• The whole company is informed that a new order will be processed and we need to:
  – Understand if items are in stock;
  – Understand if we need to produce/buy something;
• At the same time, production is trying to work out how to schedule the new order, but is waiting for the warehouse, which is currently being used by the sales department to find out whether the order can be shipped within the next week;

Page 16: designing distributed scalable and reliable systems

DEADLOCK

Page 17: designing distributed scalable and reliable systems

The real world…

• The obvious and only consequence of trying to scale a monolith is the collapse of the entire system;

• The real world:
  – Does not know at all what transactions are (especially distributed ones);
  – Has really low, if any, coupling among parts;
  – Has no temporal coupling at all;

Page 18: designing distributed scalable and reliable systems

Transaction boundaries

• We can no longer rely on transactions to guarantee consistency, e.g.:
  1. Update the shopping cart;
  2. Checkout;
  3. Create the order;
  4. Create the shipment request at FedEx;
• “Simply”, 1, 2, 3 and 4 can live in different systems on different machines with different databases;
  – And given our tenets we now have a problem
  – And a solution... :-)

Page 19: designing distributed scalable and reliable systems

EVENTUAL CONSISTENCY

Page 20: designing distributed scalable and reliable systems

The rate of the agreement

• Will be low or really low;
• Every communication must carry its own version (or timestamp) so that the receiver is able to sort things;
• Parts of the system are now free to move independently:
  – They can evolve, thanks to the low coupling;
  – They can be available or not, depending on their needs, because there is no temporal coupling;
• Parts…parts…parts…and parts
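A minimal sketch of why each communication carries its version: the receiver can then ignore anything older than what it already knows, even when messages arrive out of order. The `PriceChanged` message and the `PriceView` class are hypothetical names invented for this illustration, not part of any library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PriceChanged:
    """Hypothetical message: every message carries its own version."""
    item_id: str
    price: float
    version: int


class PriceView:
    """Keeps the latest known price per item, ignoring stale messages."""

    def __init__(self):
        self._latest = {}

    def apply(self, message):
        current = self._latest.get(message.item_id)
        if current is not None and current.version >= message.version:
            return False  # stale or duplicate: an equal or newer version was already applied
        self._latest[message.item_id] = message
        return True

    def price_of(self, item_id):
        return self._latest[item_id].price


view = PriceView()
# Messages arrive out of order: version 2 before version 1.
view.apply(PriceChanged("book-42", 12.0, version=2))
view.apply(PriceChanged("book-42", 10.0, version=1))
print(view.price_of("book-42"))  # the price carried by version 2 survives
```

A timestamp could play the same role as the integer version, provided the clocks that produce it can be compared meaningfully.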

Page 21: designing distributed scalable and reliable systems

BOUNDED CONTEXT

Page 22: designing distributed scalable and reliable systems

Bounded Context

• Bounded Contexts (BCs), easily identified via the Ubiquitous Language, map perfectly onto the concept of a part, or that of a transaction boundary;
• Within a BC the level of consistency is expected to be much higher than across BCs;
• BCs can be isolated and should be able to live on their own;
• BCs generally are a unit of deployment;

Page 23: designing distributed scalable and reliable systems

MESSAGES
We have async and distributed parts…but…how do they talk to each other?

Page 24: designing distributed scalable and reliable systems

asynchronous

We cannot rely on RPC calls: the other part is not guaranteed to be there when we need it.

Page 25: designing distributed scalable and reliable systems

asynchronous

[Diagram: Sender → Queue → Receiver. The sender sends now; the receiver receives some time in the future.]
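The sender/queue/receiver picture can be sketched in a few lines. This is only an illustration under simplifying assumptions: an in-memory `deque` stands in for a durable queue, and `send` and `receive_all` are hypothetical helper names.

```python
from collections import deque

queue = deque()  # stands in for a durable queue; a real one survives restarts


def send(message):
    """Sender side: enqueue and return immediately, without waiting for the receiver."""
    queue.append(message)


def receive_all():
    """Receiver side: drain whatever accumulated while the receiver was away."""
    handled = []
    while queue:
        handled.append(queue.popleft())
    return handled


# Now: the receiver is offline, yet sending still succeeds.
send("order-1 placed")
send("order-2 placed")

# Some time in the future: the receiver comes online and catches up.
handled = receive_all()
print(handled)
```

The point is that `send` never needs the receiver to be available, which is exactly what an RPC call cannot offer.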

Page 26: designing distributed scalable and reliable systems

distributed

We need an atomic piece of information:
• we cannot rely on ordering;
• we cannot rely on receiving the information at the same place (it is distributed);

Page 27: designing distributed scalable and reliable systems

distributed

[Diagram: a Sender, a Queue and three Receivers handling messages A, B and C; captioned “Broken!”]

Page 28: designing distributed scalable and reliable systems

non-coupled

We need messages to be as small as possible

The larger the exchanged vocabulary, the more coupling we have

Changing the vocabulary is hard, think twice about it

Page 29: designing distributed scalable and reliable systems

IT’S A LONG WAY TO THE TOP IF YOU WANNA ROCK & ROLL…

Do we really need to go down that route?

Page 30: designing distributed scalable and reliable systems

Without context, problems are empty talk

Page 31: designing distributed scalable and reliable systems

Context & Requirements

• Requirements set the boundary of the problem;
  – A problem not identified by a requirement does not exist;
• Much more: requirements set the boundary of the solution;
  – A solution is valid when it stays within the requirements, otherwise it is over-engineering;

Page 32: designing distributed scalable and reliable systems

The question should be…
DO WE NEED TO SCALE-OUT?

Page 33: designing distributed scalable and reliable systems

“keep it simple”

A “small” e-commerce site based on the traditional 3-layer architecture: page response time is slow.

• Redesign the system from scratch:
  – Move everything to RavenDB;
  – Introduce Elastic Search;
  – Add a Redis cache on Linux;
  – 6 months, £££££;
  – Can fail;

• Replace the disks on the SQL machine with 2 fast SSDs:
  – 1hr, £;
  – Observe the next 6 months :-)

Page 34: designing distributed scalable and reliable systems

“re-design”

A delivery system (e.g. your favorite retailer) wants to increase the number of orders handled per unit of time.

• Single batch per request;
• Trying to scale this fails;

• Multi-batch per request;
• Re-designing the system guarantees that it scales;

Page 35: designing distributed scalable and reliable systems

A new hope <cit.>

Moving away from a traditional architecture brings a lot of challenges to the table:
• Everything will be async;
• Non-availability of a system must be managed;
• Handling async failures;
• Handling of synchronized access to shared resources;
• How to correlate messages;
• How to handle versioning and upgrades;
• And more…

Page 36: designing distributed scalable and reliable systems

Please welcome:
NServiceBus

Page 37: designing distributed scalable and reliable systems

Concepts

• Message
  – An atomic piece of information that has a semantic meaning in the business;
• Component
  – Something that can handle a message;
• Service
  – A set of components grouped by context;
• Endpoint
  – A set of services grouped by:
    • SLA(s);
    • Infrastructure concerns;
    • Etc.;

Page 38: designing distributed scalable and reliable systems

Concepts #2

• Command: a message that semantically identifies something to be done (imperative):
  – "CreateNewUser";
• Event: a message that semantically identifies something that has happened and is immutable (past):
  – "NewUserCreated";
• Subscription: the notion that an endpoint is interested in an event;
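The three concepts above can be illustrated with a toy, synchronous, in-memory bus. This is a language-neutral sketch and not the NServiceBus API: `InMemoryBus` and its methods are hypothetical, and a real bus would deliver messages through a queue rather than a direct call. Only the message names `CreateNewUser` and `NewUserCreated` come from the slide.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class CreateNewUser:      # command: imperative, something to be done
    username: str


@dataclass(frozen=True)
class NewUserCreated:     # event: past tense, something that already happened
    username: str


class InMemoryBus:
    """Toy bus: a command has exactly one handler, an event has any number of subscribers."""

    def __init__(self):
        self._command_handlers = {}
        self._subscribers = defaultdict(list)

    def register_handler(self, command_type, handler):
        if command_type in self._command_handlers:
            raise ValueError("a command must have a single handler")
        self._command_handlers[command_type] = handler

    def subscribe(self, event_type, subscriber):
        self._subscribers[event_type].append(subscriber)

    def send(self, command):
        self._command_handlers[type(command)](command)

    def publish(self, event):
        for subscriber in self._subscribers[type(event)]:
            subscriber(event)


bus = InMemoryBus()
users, welcome_mails = [], []


def handle_create_new_user(command):
    users.append(command.username)
    bus.publish(NewUserCreated(command.username))  # tell the world it happened


bus.register_handler(CreateNewUser, handle_create_new_user)
bus.subscribe(NewUserCreated, lambda event: welcome_mails.append(event.username))

bus.send(CreateNewUser("mauro"))
print(users, welcome_mails)
```

The publisher of `NewUserCreated` knows nothing about who subscribed, which is what keeps coupling low.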

Page 39: designing distributed scalable and reliable systems

DEMO

Page 40: designing distributed scalable and reliable systems

Transport

• Transport(s): the technology used to connect systems and transport the message:
  – MSMQ
  – RabbitMQ
  – SQL Server
  – Azure ServiceBus & Azure Queues
• Serialization: the way messages are "serialized" in order to be transported on the chosen transport;
  – it is transparent to the transport;

Page 41: designing distributed scalable and reliable systems

Advanced Concepts

• Saga: An orchestrator for a long-running workflow, able to store the saga state across requests and to handle concurrency;

• Timeout: The way a Saga can take autonomous decisions;

• Retries: First level and Second level retry engine to handle transient failures;

• Error & Audit: error and auditing management
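The idea of two retry levels followed by an error queue can be sketched as below. This is an assumption-laden illustration, not the NServiceBus retry engine: `process_with_retries`, its parameters and the delay policy are all hypothetical.

```python
import time


def process_with_retries(handler, message, error_queue,
                         attempts_per_round=3, delayed_rounds=2,
                         delay_seconds=0.01, sleep=time.sleep):
    """Try the handler; retry immediately, then after a growing delay, then give up."""
    for round_number in range(delayed_rounds + 1):
        if round_number > 0:
            sleep(delay_seconds * round_number)   # second level: back off before a new round
        for _ in range(attempts_per_round):       # first level: retry straight away
            try:
                handler(message)
                return True
            except Exception:
                continue
    error_queue.append(message)  # retries exhausted: park the message for inspection
    return False


attempts = {"count": 0}


def flaky_handler(message):
    """Fails the first four times, simulating a transient outage."""
    attempts["count"] += 1
    if attempts["count"] <= 4:
        raise ConnectionError("database temporarily unavailable")


def always_failing_handler(message):
    raise ValueError("poison message")


errors = []
recovered = process_with_retries(flaky_handler, "ship order 1", errors)
poisoned = process_with_retries(always_failing_handler, "ship order 2", errors)
print(recovered, poisoned, errors)
```

Transient failures disappear within the retries, while a message that can never succeed ends up in the error queue instead of blocking the others.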

Page 42: designing distributed scalable and reliable systems

SCALE OUT & HIGH AVAILABILITY
In an eventually consistent world

Page 43: designing distributed scalable and reliable systems

Mail & Mail Servers

When we send an email message:
• Is our relation with the mail server consistent? Yes;
• Is the cross-server relation consistent? Yes;
• Is the relationship between the last server and the recipient consistent? Yes;

Page 44: designing distributed scalable and reliable systems

Is the entire system consistent? No

But we have some guarantees:
• Every single hop/node/BC is consistent;
• If something fails along the way we will get back, by the same logic, a notification that our request has failed or succeeded;
• Do we need distributed transactions? No :-)
• The message alone is enough to guarantee consistency, in the long run.

Page 45: designing distributed scalable and reliable systems

DEMO

Page 46: designing distributed scalable and reliable systems

Publish/Subscribe

• Request/Response is generally considered an anti-pattern;
• Events are the easiest way to drive the world:
  – SomethingHappened;
    • DoSomethingToMoveOn;
• Lots of possible listeners and lots of possible publishers:
  – CorrelationID
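With many publishers and many listeners, a correlation ID is what lets a listener tell which events belong to the same conversation. A minimal sketch, assuming hypothetical `OrderPlaced` and `PaymentReceived` events and a hypothetical `OrderTracker` listener:

```python
import uuid
from dataclasses import dataclass


@dataclass(frozen=True)
class OrderPlaced:
    correlation_id: str
    order_number: str


@dataclass(frozen=True)
class PaymentReceived:
    correlation_id: str


class OrderTracker:
    """Listener that groups events by correlation ID, whatever order they arrive in."""

    def __init__(self):
        self._seen = {}

    def handle(self, event):
        self._seen.setdefault(event.correlation_id, set()).add(type(event).__name__)

    def is_complete(self, correlation_id):
        return self._seen.get(correlation_id, set()) == {"OrderPlaced", "PaymentReceived"}


first, second = str(uuid.uuid4()), str(uuid.uuid4())
tracker = OrderTracker()
# Events from two unrelated conversations arrive interleaved and out of order.
tracker.handle(PaymentReceived(first))
tracker.handle(OrderPlaced(second, "SO-2"))
tracker.handle(OrderPlaced(first, "SO-1"))
print(tracker.is_complete(first), tracker.is_complete(second))
```

The tracker never relies on arrival order or on who published the event, only on the ID the messages carry.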

Page 47: designing distributed scalable and reliable systems

DEMO

Page 48: designing distributed scalable and reliable systems

QUESTIONS?
We are all set :-)