109
Building Scalable, Highly Concurrent & Fault-Tolerant Systems: Lessons Learned Jonas Bonér CTO Typesafe Twitter: @jboner

Building Scalable, Highly Concurrent & Fault Tolerant Systems - Lessons Learned

Embed Size (px)

DESCRIPTION

Lessons learned through agony and pain, lots of pain.

Citation preview

Page 1: Building Scalable, Highly Concurrent & Fault Tolerant Systems -  Lessons Learned

Building

Scalable, Highly Concurrent &

Fault-Tolerant Systems:

Lessons Learned

Jonas Bonér CTO Typesafe

Twitter : @jboner

Page 2: Building Scalable, Highly Concurrent & Fault Tolerant Systems -  Lessons Learned

I will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions again

I will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions again

Page 3: Building Scalable, Highly Concurrent & Fault Tolerant Systems -  Lessons Learned

I will never use distributed transactions again

Lessons Learned

through...

I will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions again

I will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions again

Page 4: Building Scalable, Highly Concurrent & Fault Tolerant Systems -  Lessons Learned

I will never use distributed transactions again

Lessons Learned

through...

I will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions again

I will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions again

Agony

Page 5: Building Scalable, Highly Concurrent & Fault Tolerant Systems -  Lessons Learned

I will never use distributed transactions again

Lessons Learned

through...

I will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions again

I will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions againI will never use distributed transactions again

Agonyand Painlots of

Pain

Page 6: Building Scalable, Highly Concurrent & Fault Tolerant Systems -  Lessons Learned

Agenda

• It’s All Trade-offs• Go Concurrent• Go Reactive• Go Fault-Tolerant• Go Distributed• Go Big

Page 7: Building Scalable, Highly Concurrent & Fault Tolerant Systems -  Lessons Learned
Page 8: Building Scalable, Highly Concurrent & Fault Tolerant Systems -  Lessons Learned

It’s all Trade-offs

Page 9: Building Scalable, Highly Concurrent & Fault Tolerant Systems -  Lessons Learned

Performance vs

Scalability

Page 10: Building Scalable, Highly Concurrent & Fault Tolerant Systems -  Lessons Learned

Latency vs

Throughput

Page 11: Building Scalable, Highly Concurrent & Fault Tolerant Systems -  Lessons Learned

Availability vs

Consistency

Page 12: Building Scalable, Highly Concurrent & Fault Tolerant Systems -  Lessons Learned

Go Concurrent

Page 13: Building Scalable, Highly Concurrent & Fault Tolerant Systems -  Lessons Learned

Shared mutable state

Page 14: Building Scalable, Highly Concurrent & Fault Tolerant Systems -  Lessons Learned

Shared mutable stateTogether with threads...

Page 15: Building Scalable, Highly Concurrent & Fault Tolerant Systems -  Lessons Learned

Shared mutable state

...leads to

Together with threads...

Page 16: Building Scalable, Highly Concurrent & Fault Tolerant Systems -  Lessons Learned

Shared mutable state

...code that is totally INDETERMINISTIC...leads to

Together with threads...

Page 17: Building Scalable, Highly Concurrent & Fault Tolerant Systems -  Lessons Learned

Shared mutable state

...code that is totally INDETERMINISTIC

...and the root of all EVIL...leads to

Together with threads...

Page 18: Building Scalable, Highly Concurrent & Fault Tolerant Systems -  Lessons Learned

Shared mutable state

...code that is totally INDETERMINISTIC

...and the root of all EVIL...leads to

Together with threads...

Please, avoid it at all cost

Page 19: Building Scalable, Highly Concurrent & Fault Tolerant Systems -  Lessons Learned

Shared mutable state

...code that is totally INDETERMINISTIC

...and the root of all EVIL...leads to

Together with threads...

Please, avoid it at all costUse IMMUTABLE

state!!!

Page 20: Building Scalable, Highly Concurrent & Fault Tolerant Systems -  Lessons Learned

The problem with locks

• Locks do not compose• Locks break encapsulation• Taking too few locks• Taking too many locks• Taking the wrong locks• Taking locks in the wrong order• Error recovery is hard

Page 21: Building Scalable, Highly Concurrent & Fault Tolerant Systems -  Lessons Learned

You deserve better tools

• Dataflow Concurrency• Actors• Software Transactional Memory (STM)• Agents

Page 22: Building Scalable, Highly Concurrent & Fault Tolerant Systems -  Lessons Learned

Dataflow Concurrency• Deterministic• Declarative • Data-driven• Threads are suspended until data is available• Lazy & On-demand

• No difference between:• Concurrent code • Sequential code• Examples: Akka & GPars

Page 23: Building Scalable, Highly Concurrent & Fault Tolerant Systems -  Lessons Learned

Actors

•Share NOTHING•Isolated lightweight event-based processes•Each actor has a mailbox (message queue) •Communicates through asynchronous and non-blocking message passing

•Location transparent (distributable)•Examples: Akka & Erlang

Page 24: Building Scalable, Highly Concurrent & Fault Tolerant Systems -  Lessons Learned

• See the memory as a transactional dataset• Similar to a DB: begin, commit, rollback (ACI)• Transactions are retried upon collision• Rolls back the memory on abort• Transactions can nest and compose• Use STM instead of abusing your database

with temporary storage of “scratch” data• Examples: Haskell, Clojure & Scala

STM

Page 25: Building Scalable, Highly Concurrent & Fault Tolerant Systems -  Lessons Learned

• Reactive memory cells (STM Ref)• Send a update function to the Agent, which

1. adds it to an (ordered) queue, to be2. applied to the Agent asynchronously

• Reads are “free”, just dereferences the Ref• Cooperates with STM• Examples: Clojure & Akka

Agents

Page 26: Building Scalable, Highly Concurrent & Fault Tolerant Systems -  Lessons Learned

If we could start all over...

Page 27: Building Scalable, Highly Concurrent & Fault Tolerant Systems -  Lessons Learned

If we could start all over...1. Start with a Deterministic, Declarative & Immutable core

Page 28: Building Scalable, Highly Concurrent & Fault Tolerant Systems -  Lessons Learned

If we could start all over...1. Start with a Deterministic, Declarative & Immutable core

• Logic & Functional Programming

Page 29: Building Scalable, Highly Concurrent & Fault Tolerant Systems -  Lessons Learned

If we could start all over...1. Start with a Deterministic, Declarative & Immutable core

• Logic & Functional Programming

• Dataflow

Page 30: Building Scalable, Highly Concurrent & Fault Tolerant Systems -  Lessons Learned

If we could start all over...1. Start with a Deterministic, Declarative & Immutable core

• Logic & Functional Programming

• Dataflow

2. Add Indeterminism selectively - only where needed

Page 31: Building Scalable, Highly Concurrent & Fault Tolerant Systems -  Lessons Learned

If we could start all over...1. Start with a Deterministic, Declarative & Immutable core

• Logic & Functional Programming

• Dataflow

2. Add Indeterminism selectively - only where needed

• Actor/Agent-based Programming

Page 32: Building Scalable, Highly Concurrent & Fault Tolerant Systems -  Lessons Learned

If we could start all over...1. Start with a Deterministic, Declarative & Immutable core

• Logic & Functional Programming

• Dataflow

2. Add Indeterminism selectively - only where needed

• Actor/Agent-based Programming

3. Add Mutability selectively - only where needed

Page 33: Building Scalable, Highly Concurrent & Fault Tolerant Systems -  Lessons Learned

If we could start all over...1. Start with a Deterministic, Declarative & Immutable core

• Logic & Functional Programming

• Dataflow

2. Add Indeterminism selectively - only where needed

• Actor/Agent-based Programming

3. Add Mutability selectively - only where needed

• Protected by Transactions (STM)

Page 34: Building Scalable, Highly Concurrent & Fault Tolerant Systems -  Lessons Learned

If we could start all over...1. Start with a Deterministic, Declarative & Immutable core

• Logic & Functional Programming

• Dataflow

2. Add Indeterminism selectively - only where needed

• Actor/Agent-based Programming

3. Add Mutability selectively - only where needed

• Protected by Transactions (STM)

4. Finally - only if really needed

Page 35: Building Scalable, Highly Concurrent & Fault Tolerant Systems -  Lessons Learned

If we could start all over...1. Start with a Deterministic, Declarative & Immutable core

• Logic & Functional Programming

• Dataflow

2. Add Indeterminism selectively - only where needed

• Actor/Agent-based Programming

3. Add Mutability selectively - only where needed

• Protected by Transactions (STM)

4. Finally - only if really needed

• Add Monitors (Locks) and explicit Threads

Page 36: Building Scalable, Highly Concurrent & Fault Tolerant Systems -  Lessons Learned

Go Reactive

Page 37: Building Scalable, Highly Concurrent & Fault Tolerant Systems -  Lessons Learned

Never block

• ...unless you really have to• Blocking kills scalability (and performance)• Never sit on resources you don’t use• Use non-blocking IO• Be reactive• How?

Page 38: Building Scalable, Highly Concurrent & Fault Tolerant Systems -  Lessons Learned

Go AsyncDesign for reactive event-driven systems

1. Use asynchronous message passing2. Use Iteratee-based IO3. Use push not pull (or poll)• Examples:

• Akka or Erlang actors

• Play’s reactive Iteratee IO

• Node.js or JavaScript Promises

• Server-Sent Events or WebSockets

• Scala’s Futures library

Page 39: Building Scalable, Highly Concurrent & Fault Tolerant Systems -  Lessons Learned

Go Fault-Tolerant

Page 40: Building Scalable, Highly Concurrent & Fault Tolerant Systems -  Lessons Learned

Failure Recovery in Java/C/C# etc.

Page 41: Building Scalable, Highly Concurrent & Fault Tolerant Systems -  Lessons Learned

• You are given a SINGLE thread of controlFailure Recovery in Java/C/C# etc.

Page 42: Building Scalable, Highly Concurrent & Fault Tolerant Systems -  Lessons Learned

• You are given a SINGLE thread of control• If this thread blows up you are screwed

Failure Recovery in Java/C/C# etc.

Page 43: Building Scalable, Highly Concurrent & Fault Tolerant Systems -  Lessons Learned

• You are given a SINGLE thread of control• If this thread blows up you are screwed • So you need to do all explicit error handling

WITHIN this single thread

Failure Recovery in Java/C/C# etc.

Page 44: Building Scalable, Highly Concurrent & Fault Tolerant Systems -  Lessons Learned

• You are given a SINGLE thread of control• If this thread blows up you are screwed • So you need to do all explicit error handling

WITHIN this single thread• To make things worse - errors do not

propagate between threads so there is NO WAY OF EVEN FINDING OUT that something have failed

Failure Recovery in Java/C/C# etc.

Page 45: Building Scalable, Highly Concurrent & Fault Tolerant Systems -  Lessons Learned

• You are given a SINGLE thread of control• If this thread blows up you are screwed • So you need to do all explicit error handling

WITHIN this single thread• To make things worse - errors do not

propagate between threads so there is NO WAY OF EVEN FINDING OUT that something have failed

• This leads to DEFENSIVE programming with:

Failure Recovery in Java/C/C# etc.

Page 46: Building Scalable, Highly Concurrent & Fault Tolerant Systems -  Lessons Learned

• You are given a SINGLE thread of control• If this thread blows up you are screwed • So you need to do all explicit error handling

WITHIN this single thread• To make things worse - errors do not

propagate between threads so there is NO WAY OF EVEN FINDING OUT that something have failed

• This leads to DEFENSIVE programming with:• Error handling TANGLED with business logic

Failure Recovery in Java/C/C# etc.

Page 47: Building Scalable, Highly Concurrent & Fault Tolerant Systems -  Lessons Learned

• You are given a SINGLE thread of control• If this thread blows up you are screwed • So you need to do all explicit error handling

WITHIN this single thread• To make things worse - errors do not

propagate between threads so there is NO WAY OF EVEN FINDING OUT that something have failed

• This leads to DEFENSIVE programming with:• Error handling TANGLED with business logic • SCATTERED all over the code base

Failure Recovery in Java/C/C# etc.

Page 48: Building Scalable, Highly Concurrent & Fault Tolerant Systems -  Lessons Learned

• You are given a SINGLE thread of control• If this thread blows up you are screwed • So you need to do all explicit error handling

WITHIN this single thread• To make things worse - errors do not

propagate between threads so there is NO WAY OF EVEN FINDING OUT that something have failed

• This leads to DEFENSIVE programming with:• Error handling TANGLED with business logic • SCATTERED all over the code base

Failure Recovery in Java/C/C# etc.

We can do

better!!!

Page 49: Building Scalable, Highly Concurrent & Fault Tolerant Systems -  Lessons Learned

Just

Let It Crash

Page 50: Building Scalable, Highly Concurrent & Fault Tolerant Systems -  Lessons Learned
Page 51: Building Scalable, Highly Concurrent & Fault Tolerant Systems -  Lessons Learned

The right way1. Isolated lightweight processes2. Supervised processes

• Each running process has a supervising process• Errors are sent to the supervisor (asynchronously)• Supervisor manages the failure

• Same semantics local as remote

• For example the Actor Model solves it nicely

Page 52: Building Scalable, Highly Concurrent & Fault Tolerant Systems -  Lessons Learned

Go Distributed

Page 53: Building Scalable, Highly Concurrent & Fault Tolerant Systems -  Lessons Learned

Performance vs

Scalability

Page 54: Building Scalable, Highly Concurrent & Fault Tolerant Systems -  Lessons Learned

How do I know if I have a performance problem?

Page 55: Building Scalable, Highly Concurrent & Fault Tolerant Systems -  Lessons Learned

How do I know if I have a performance problem?

If your system is slow for a single user

Page 56: Building Scalable, Highly Concurrent & Fault Tolerant Systems -  Lessons Learned

How do I know if I have a scalability problem?

Page 57: Building Scalable, Highly Concurrent & Fault Tolerant Systems -  Lessons Learned

How do I know if I have a scalability problem?

If your system isfast for a single user

but slow under heavy load

Page 58: Building Scalable, Highly Concurrent & Fault Tolerant Systems -  Lessons Learned

(Three) Misconceptions about Reliable Distributed Computing

- Werner Vogels

1. Transparency is the ultimate goal2. Automatic object replication is desirable3. All replicas are equal and deterministic

Classic paper: A Note On Distributed Computing - Waldo et. al.

Page 59: Building Scalable, Highly Concurrent & Fault Tolerant Systems -  Lessons Learned

Transparent Distributed Computing• Emulating Consistency and Shared

Memory in a distributed environment• Distributed Objects

• “Sucks like an inverted hurricane” - Martin Fowler

• Distributed Transactions• ...don’t get me started...

Fallacy 1

Page 60: Building Scalable, Highly Concurrent & Fault Tolerant Systems -  Lessons Learned

Fallacy 2RPC

• Emulating synchronous blocking method dispatch - across the network

• Ignores:• Latency• Partial failures• General scalability concerns, caching etc.

• “Convenience over Correctness” - Steve Vinoski

Page 61: Building Scalable, Highly Concurrent & Fault Tolerant Systems -  Lessons Learned

Instead

Page 62: Building Scalable, Highly Concurrent & Fault Tolerant Systems -  Lessons Learned

Embrace the NetworkInstead

and be

done

with itUse

AsynchronousMessagePassing

Page 63: Building Scalable, Highly Concurrent & Fault Tolerant Systems -  Lessons Learned

Delivery Semantics

• No guarantees• At most once• At least once• Once and only once

Guaranteed Delivery

Page 64: Building Scalable, Highly Concurrent & Fault Tolerant Systems -  Lessons Learned

It’s all lies.

Page 65: Building Scalable, Highly Concurrent & Fault Tolerant Systems -  Lessons Learned

It’s all lies.

Page 66: Building Scalable, Highly Concurrent & Fault Tolerant Systems -  Lessons Learned

The network is inherently unreliable and there is no such thing as 100%

guaranteed delivery

It’s all lies.

Page 67: Building Scalable, Highly Concurrent & Fault Tolerant Systems -  Lessons Learned

Guaranteed Delivery

Page 68: Building Scalable, Highly Concurrent & Fault Tolerant Systems -  Lessons Learned

Guaranteed DeliveryThe question is what to guarantee

Page 69: Building Scalable, Highly Concurrent & Fault Tolerant Systems -  Lessons Learned

Guaranteed DeliveryThe question is what to guarantee

1. The message is - sent out on the network?

Page 70: Building Scalable, Highly Concurrent & Fault Tolerant Systems -  Lessons Learned

Guaranteed DeliveryThe question is what to guarantee

1. The message is - sent out on the network?

2. The message is - received by the receiver host’s NIC?

Page 71: Building Scalable, Highly Concurrent & Fault Tolerant Systems -  Lessons Learned

Guaranteed DeliveryThe question is what to guarantee

1. The message is - sent out on the network?

2. The message is - received by the receiver host’s NIC?

3. The message is - put on the receiver’s queue?

Page 72: Building Scalable, Highly Concurrent & Fault Tolerant Systems -  Lessons Learned

Guaranteed DeliveryThe question is what to guarantee

1. The message is - sent out on the network?

2. The message is - received by the receiver host’s NIC?

3. The message is - put on the receiver’s queue?

4. The message is - applied to the receiver?

Page 73: Building Scalable, Highly Concurrent & Fault Tolerant Systems -  Lessons Learned

Guaranteed DeliveryThe question is what to guarantee

1. The message is - sent out on the network?

2. The message is - received by the receiver host’s NIC?

3. The message is - put on the receiver’s queue?

4. The message is - applied to the receiver?

5. The message is - starting to be processed by the receiver?

Page 74: Building Scalable, Highly Concurrent & Fault Tolerant Systems -  Lessons Learned

Guaranteed DeliveryThe question is what to guarantee

1. The message is - sent out on the network?

2. The message is - received by the receiver host’s NIC?

3. The message is - put on the receiver’s queue?

4. The message is - applied to the receiver?

5. The message is - starting to be processed by the receiver?

6. The message is - has completed processing by the receiver?

Page 75: Building Scalable, Highly Concurrent & Fault Tolerant Systems -  Lessons Learned

Ok, then what to do? 1. Start with 0 guarantees (0 additional cost)2. Add the guarantees you need - one by one

Page 76: Building Scalable, Highly Concurrent & Fault Tolerant Systems -  Lessons Learned

Ok, then what to do? 1. Start with 0 guarantees (0 additional cost)2. Add the guarantees you need - one by one

Different USE-CASES Different GUARANTEES

Different COSTS

Page 77: Building Scalable, Highly Concurrent & Fault Tolerant Systems -  Lessons Learned

Ok, then what to do? 1. Start with 0 guarantees (0 additional cost)2. Add the guarantees you need - one by one

Different USE-CASES Different GUARANTEES

Different COSTSFor each additional guarantee you add you will either :

• decrease performance, throughput or scalability

• increase latency

Page 78: Building Scalable, Highly Concurrent & Fault Tolerant Systems -  Lessons Learned

Just

Page 79: Building Scalable, Highly Concurrent & Fault Tolerant Systems -  Lessons Learned

Just

Use ACKing

Page 80: Building Scalable, Highly Concurrent & Fault Tolerant Systems -  Lessons Learned

Just

Use ACKingand be done with it

Page 81: Building Scalable, Highly Concurrent & Fault Tolerant Systems -  Lessons Learned

Latency vs

Throughput

Page 82: Building Scalable, Highly Concurrent & Fault Tolerant Systems -  Lessons Learned

You should strive for maximal throughput

with acceptable latency

Page 83: Building Scalable, Highly Concurrent & Fault Tolerant Systems -  Lessons Learned

Go Big

Page 84: Building Scalable, Highly Concurrent & Fault Tolerant Systems -  Lessons Learned

Go BigData

Page 85: Building Scalable, Highly Concurrent & Fault Tolerant Systems -  Lessons Learned

Big DataImperative OO programming doesn't cut it

• Object-Mathematics Impedance Mismatch• We need functional processing, transformations etc.• Examples: Spark, Crunch/Scrunch, Cascading, Cascalog,

Scalding, Scala Parallel Collections• Hadoop have been called the:

• “Assembly language of MapReduce programming”

• “EJB of our time”

Page 86: Building Scalable, Highly Concurrent & Fault Tolerant Systems -  Lessons Learned

Batch processing doesn't cut it

• Ala Hadoop• We need real-time data processing• Examples: Spark, Storm, S4 etc.• Watch“Why Big Data Needs To Be Functional”

by Dean Wampler

Big Data

Page 87: Building Scalable, Highly Concurrent & Fault Tolerant Systems -  Lessons Learned

Go BigDB

Page 88: Building Scalable, Highly Concurrent & Fault Tolerant Systems -  Lessons Learned

When isa RDBMS

not good enough?

Page 89: Building Scalable, Highly Concurrent & Fault Tolerant Systems -  Lessons Learned

Scaling reads to a RDBMS

is hard

Page 90: Building Scalable, Highly Concurrent & Fault Tolerant Systems -  Lessons Learned

Scaling writes to a RDBMS

is impossible

Page 91: Building Scalable, Highly Concurrent & Fault Tolerant Systems -  Lessons Learned

Do we really need a RDBMS?

Page 92: Building Scalable, Highly Concurrent & Fault Tolerant Systems -  Lessons Learned

Do we really need a RDBMS?Sometimes...

Page 93: Building Scalable, Highly Concurrent & Fault Tolerant Systems -  Lessons Learned

Do we really need a RDBMS?

Page 94: Building Scalable, Highly Concurrent & Fault Tolerant Systems -  Lessons Learned

Do we really need a RDBMS?

But many times we don’t

Page 95: Building Scalable, Highly Concurrent & Fault Tolerant Systems -  Lessons Learned

Atomic

Consistent

Isolated

Durable

Page 96: Building Scalable, Highly Concurrent & Fault Tolerant Systems -  Lessons Learned

Availability vs

Consistency

Page 97: Building Scalable, Highly Concurrent & Fault Tolerant Systems -  Lessons Learned

Brewer’s

CAPtheorem

Page 98: Building Scalable, Highly Concurrent & Fault Tolerant Systems -  Lessons Learned

You can only pick 2

Consistency

Availability

Partition toleranceAt a given point in time

Page 99: Building Scalable, Highly Concurrent & Fault Tolerant Systems -  Lessons Learned

Centralized system• In a centralized system (RDBMS etc.)

we don’t have network partitions, e.g. P in CAP

• So you get both:

Consistency

Availability

Page 100: Building Scalable, Highly Concurrent & Fault Tolerant Systems -  Lessons Learned

Distributed system• In a distributed (scalable) system

we will have network partitions, e.g. P in CAP

• So you get to only pick one:

Consistency

Availability

Page 101: Building Scalable, Highly Concurrent & Fault Tolerant Systems -  Lessons Learned

Basically Available

Soft state

Eventually consistent

Page 102: Building Scalable, Highly Concurrent & Fault Tolerant Systems -  Lessons Learned

Think about your data

• When do you need ACID?• When is Eventual Consistency a better fit?• Different kinds of data has different needs• You need full consistency less than you think

Then think again

Page 103: Building Scalable, Highly Concurrent & Fault Tolerant Systems -  Lessons Learned

How fast is fast enough?• Never guess: Measure, measure and measure• Start by defining a baseline

• Where are we now?

• Define what is “good enough” - i.e. SLAs• Where do we want to go?• When are we done?

• Beware of micro-benchmarks

Page 104: Building Scalable, Highly Concurrent & Fault Tolerant Systems -  Lessons Learned

• Never guess: Measure, measure and measure• Start by defining a baseline

• Where are we now?

• Define what is “good enough” - i.e. SLAs• Where do we want to go?• When are we done?

• Beware of micro-benchmarks

...or, when can we go for a beer?

Page 105: Building Scalable, Highly Concurrent & Fault Tolerant Systems -  Lessons Learned

To sum things up...1. Maximizing a specific metric impacts others

• Every strategic decision involves a trade-off• There's no "silver bullet"

2. Applying yesterday's best practices to the problems faced today will lead to:• Waste of resources• Performance and scalability bottlenecks• Unreliable systems

Page 106: Building Scalable, Highly Concurrent & Fault Tolerant Systems -  Lessons Learned

SO

Page 107: Building Scalable, Highly Concurrent & Fault Tolerant Systems -  Lessons Learned

GO

Page 108: Building Scalable, Highly Concurrent & Fault Tolerant Systems -  Lessons Learned

...now home and build yourself Scalable,

Highly Concurrent & Fault-Tolerant

Systems

Page 109: Building Scalable, Highly Concurrent & Fault Tolerant Systems -  Lessons Learned

Thank YouEmail: [email protected]: typesafe.comTwitter : @jboner