Akka: Distributed by Design - Björn Antonsson (Typesafe)

DESCRIPTION

Presented at JAX London 2013. Akka is a toolkit and runtime for building highly scalable, distributed, and fault-tolerant reactive applications on the JVM using actors. Akka supports scaling both UP (utilizing multi-core processors) and OUT (utilizing the grid/cloud); hence it is "Distributed by Design". This gives the Akka runtime the freedom to do adaptive automatic load-balancing and actor migration, cluster rebalancing, replication and partitioning. All this is made possible through Akka's new decentralized P2P, gossip-based cluster module.

Akka - Distributed by Design

Björn Antonsson

@bantonsson

Wednesday, 30 October 13

Overview

• Akka
• Actors
• Distributed by Design
• Diving Into The Cluster
• What The Future Brings

Akka

• Toolkit and runtime for reactive applications
• Write applications that are
  – Concurrent
  – Distributed
  – Fault tolerant
  – Event-driven

Akka

• Has multiple tools
  – Actors
  – Futures
  – Dataflow
  – Remoting
  – Clustering

Actors

• Isolated lightweight event-based processes
• Share nothing
• Communicate through async messages
• Each actor has a mailbox (message queue)
• Each actor has a parent handling its failures
• Location transparent (distributable)

Actors

• An island of sanity in a sea of concurrency
• Everything inside the actor is sequential
  – Processes one message at a time
• Very lightweight
  – Create millions
  – Create short lived
• Inherently concurrent
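The "one message at a time" property can be illustrated without Akka. The following is a plain-Java sketch, not Akka's implementation: the class name `CountingMailbox` and its methods are hypothetical, and a single-threaded executor stands in for the actor's mailbox. Many threads may send concurrently, yet the state is only ever touched by the one thread that drains the queue.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Plain-Java illustration only; this is not how Akka implements actors.
final class CountingMailbox {
    // A single worker thread drains the queue, so messages are handled one at a time.
    private final ExecutorService mailbox = Executors.newSingleThreadExecutor();
    private int counter = 0; // only touched by the mailbox thread while it runs

    // Asynchronous send: enqueue the message and return immediately.
    void tell(Object message) {
        mailbox.execute(() -> counter++);
    }

    // Stop accepting messages, wait for the queue to drain, and report the count.
    int stopAndCount() {
        mailbox.shutdown();
        try {
            if (!mailbox.awaitTermination(10, TimeUnit.SECONDS)) {
                throw new IllegalStateException("mailbox did not drain in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return counter;
    }

    // Hypothetical demo: several threads send concurrently to one mailbox.
    static int demo(int senders, int perSender) {
        CountingMailbox actor = new CountingMailbox();
        Thread[] threads = new Thread[senders];
        for (int i = 0; i < senders; i++) {
            threads[i] = new Thread(() -> {
                for (int j = 0; j < perSender; j++) {
                    actor.tell("msg");
                }
            });
            threads[i].start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return actor.stopAndCount();
    }
}
```

No lock or atomic is needed around `counter` because the increments are serialized by the queue; that is the sense in which the inside of an actor is sequential.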

Actor code sample

    // Each public class goes in its own file.
    import java.io.Serializable;
    import akka.actor.UntypedActor;
    import akka.event.Logging;
    import akka.event.LoggingAdapter;

    // Define the message(s) the Actor should be able to respond to
    public class Greeting implements Serializable {
        public final String who;
        public Greeting(String who) { this.who = who; }
    }

    // Define the Actor class
    public class GreetingActor extends UntypedActor {
        LoggingAdapter log = Logging.getLogger(getContext().system(), this);
        int counter = 0;

        // Define the Actor's behavior
        public void onReceive(Object message) {
            if (message instanceof Greeting) {
                counter++;
                log.info("Hello #" + counter + " " + ((Greeting) message).who);
            } else {
                unhandled(message);
            }
        }
    }

Creating and using Actors

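    // Needs akka.actor.ActorRef, akka.actor.ActorSystem and akka.actor.Props
    ActorSystem system = ActorSystem.create("MySystem");
    ActorRef greeter = system.actorOf(
        Props.create(GreetingActor.class), "greeter");

    // Asynchronous send; null means there is no sender to reply to
    greeter.tell(new Greeting("Charlie Parker"), null);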

Actors compared to Objects

• Think of an Actor as an Object
• You can't peek inside it
• You don't call methods
  – You send messages (asynchronously)
• You don't get return values
  – You receive messages (asynchronously)
• The internal state is thread safe

Why should I care?

The world is multicore!

Amdahl’s Law
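Amdahl's law bounds the speedup from adding processors: if a fraction p of the work can run in parallel on n processors, the speedup is 1 / ((1 - p) + p / n). A small sketch of that formula follows; the class and method names are made up for illustration.

```java
final class Amdahl {
    // Speedup on n processors when a fraction p (0..1) of the work is parallelizable.
    static double speedup(double p, int n) {
        if (p < 0.0 || p > 1.0 || n < 1) {
            throw new IllegalArgumentException("need 0 <= p <= 1 and n >= 1");
        }
        return 1.0 / ((1.0 - p) + p / n);
    }

    public static void main(String[] args) {
        // Even with 95% parallel work, the serial 5% caps the speedup below 20x.
        for (int n : new int[] {1, 2, 8, 64, 1024}) {
            System.out.printf("n=%d speedup=%.2f%n", n, speedup(0.95, n));
        }
    }
}
```

The serial fraction dominates quickly, which is the argument for designs that keep shared, sequential sections small.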

So what's the catch?

• Really no catch
• A different programming paradigm
• All about tradeoffs
  – Some things are easier, some harder
• Think different

Distributed by Design

Remote Actors

• Sending messages decouples actors
• Local or remote doesn't matter

[Diagram: NODE 1 and NODE 2]

Remote Actors

• Zero code change deployment decision
• Add configuration to the Actor System

    akka {
      actor {
        # Configure a Remote Provider
        provider = "akka.remote.RemoteActorRefProvider"
        deployment {
          # The "greeter" actor
          /greeter {
            # Define Remote Path: Protocol, Actor System, Hostname, Port
            remote = "akka.tcp://MySystem@machine1:2552"
          }
        }
      }
    }

Looking up Actors

    ActorSelection greeter = system.actorSelection(
        "akka.tcp://MySystem@machine1:2552/user/greeter");

Can you see the problem?

Fixed addresses

    akka {
      actor {
        provider = "akka.remote.RemoteActorRefProvider"
        deployment {
          /greeter {
            remote = "akka.tcp://MySystem@machine1:2552"
          }
        }
      }
    }

    ActorSelection greeter = system.actorSelection(
        "akka.tcp://MySystem@machine1:2552/user/greeter");

Diving Into The Cluster

Akka Cluster 2.2

• Gossip-Based Cluster Membership
• Failure Detector
• Cluster DeathWatch
• Cluster-Aware Routers

Cluster Membership

• Node ring à la Riak / Dynamo
• Gossip-protocol for state dissemination
• Vector Clocks to resolve conflicts
• Peer based failure detector

Node ring with gossiping Members

[Diagram: ring of ten Member Nodes, labeled "Gossip"]

Gossip Protocol

• Cluster Membership
• Leader Determination
• Targets Random Node
  – Partly biased towards nodes with older state
• Push-Pull based
  – Sender only sends its version number
  – Receiver asks for newer information
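The push-pull exchange can be sketched in a few lines. This is a deliberately simplified illustration, not Akka's gossip code: `GossipNode` is a hypothetical class, the state is a plain string, and a single `long` counter stands in for the version (the real cluster uses vector clocks).

```java
// Simplified push-pull gossip round; names and representation are assumptions.
final class GossipNode {
    private long version;   // stand-in for the real version (a vector clock)
    private String state;   // stand-in for the cluster membership state

    GossipNode(long version, String state) {
        this.version = version;
        this.state = state;
    }

    // Push: the sender advertises only its version number, not the full state.
    long advertise() {
        return version;
    }

    // Pull: the receiver fetches the full state only if the sender is newer.
    // Returns true when the receiver updated itself.
    boolean receiveGossipFrom(GossipNode sender) {
        long remoteVersion = sender.advertise();
        if (remoteVersion > version) {
            this.version = remoteVersion;
            this.state = sender.state;
            return true;
        }
        return false; // already up to date; nothing was transferred
    }

    long version() { return version; }
    String state() { return state; }
}
```

Sending the version first keeps the common case cheap: when both sides already agree, no state crosses the wire.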

Vector Clocks

• Partial ordering in a distributed system
• Detects causality violations
• Used to reconcile and merge cluster state
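A minimal vector clock makes the "partial ordering" concrete: two clocks are either ordered (one happened before the other), identical, or concurrent, and concurrent clocks are reconciled by taking the per-node maximum. The class below is an illustrative sketch with made-up names, not Akka's `VectorClock` class.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Illustrative vector clock: one counter per node name.
final class VectorClock {
    enum Order { BEFORE, AFTER, SAME, CONCURRENT }

    private final Map<String, Long> counters = new HashMap<>();

    // Record a local event on the given node.
    VectorClock tick(String node) {
        counters.merge(node, 1L, Long::sum);
        return this;
    }

    // Compare entry by entry; missing entries count as zero.
    Order compareTo(VectorClock other) {
        boolean less = false;
        boolean greater = false;
        Set<String> nodes = new HashSet<>(counters.keySet());
        nodes.addAll(other.counters.keySet());
        for (String node : nodes) {
            long mine = counters.getOrDefault(node, 0L);
            long theirs = other.counters.getOrDefault(node, 0L);
            if (mine < theirs) less = true;
            if (mine > theirs) greater = true;
        }
        if (less && greater) return Order.CONCURRENT; // conflict: neither saw the other
        if (less) return Order.BEFORE;
        if (greater) return Order.AFTER;
        return Order.SAME;
    }

    // Merge: per-node maximum, returned as a new clock.
    VectorClock merge(VectorClock other) {
        VectorClock merged = new VectorClock();
        merged.counters.putAll(counters);
        other.counters.forEach((node, count) -> merged.counters.merge(node, count, Long::max));
        return merged;
    }
}
```

`CONCURRENT` is the case a single version counter cannot express, and it is exactly the case where state has to be merged rather than overwritten.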

Failure Detection

• Uses The Phi Accrual Failure Detector
• Peer Based with limited targets
  – B monitors A
  – A sends heartbeats to B
  – B samples inter-arrival time to expect next beat
  – B measures continuum of deadness of A
  – B marks A as unreachable if A is dead enough
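The "continuum of deadness" is the phi value: B keeps a window of heartbeat inter-arrival times from A and computes phi = -log10(probability that a heartbeat would still arrive this late). The sketch below assumes normally distributed intervals and uses a logistic approximation of the normal CDF; the class name, window size, minimum standard deviation and threshold are all illustrative assumptions, not Akka's configuration.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Sketch of a phi accrual failure detector; parameters are assumptions.
final class PhiAccrualSketch {
    private final Deque<Long> intervals = new ArrayDeque<>();
    private final int maxSamples;
    private final double minStdDevMillis; // guards against a zero standard deviation
    private long lastHeartbeat = -1;

    PhiAccrualSketch(int maxSamples, double minStdDevMillis) {
        this.maxSamples = maxSamples;
        this.minStdDevMillis = minStdDevMillis;
    }

    // B records a heartbeat from A and samples the inter-arrival time.
    void heartbeat(long nowMillis) {
        if (lastHeartbeat >= 0) {
            intervals.addLast(nowMillis - lastHeartbeat);
            if (intervals.size() > maxSamples) {
                intervals.removeFirst();
            }
        }
        lastHeartbeat = nowMillis;
    }

    // Higher phi means A is "more dead"; 0 until there is at least one sample.
    double phi(long nowMillis) {
        if (intervals.isEmpty()) {
            return 0.0;
        }
        double mean = intervals.stream().mapToLong(Long::longValue).average().orElse(0.0);
        double variance = intervals.stream()
                .mapToDouble(i -> (i - mean) * (i - mean)).average().orElse(0.0);
        double stdDev = Math.max(Math.sqrt(variance), minStdDevMillis);
        double y = ((nowMillis - lastHeartbeat) - mean) / stdDev;
        // Logistic approximation: normal CDF(y) is roughly 1 / (1 + e).
        double e = Math.exp(-y * (1.5976 + 0.070566 * y * y));
        return (y > 0)
                ? -Math.log10(e / (1.0 + e))
                : -Math.log10(1.0 - 1.0 / (1.0 + e));
    }

    // "Dead enough" is a threshold on phi.
    boolean isUnreachable(long nowMillis, double threshold) {
        return phi(nowMillis) > threshold;
    }
}
```

Because phi grows continuously with silence, the threshold trades detection speed against false positives instead of relying on one fixed timeout.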

[Diagram, two frames: ring of ten Member Nodes, labeled "Heartbeat"]

Member States

• Joining
• Up
• Leaving
• Exiting
• Down
• Removed
• Unreachable*

[Diagram: member state transitions among joining, up, leaving, exiting, down, removed and unreachable*, with edges labeled join, leave, (leader action) and (fd*)]
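One way to read the state diagram is as a transition table. The sketch below is an interpretation, not a copy of Akka's code: it assumes joining → up → leaving → exiting → removed as the normal path, that any of those four states can end up down after the failure detector (fd) flags the node, and that down leads to removed. Unreachable is left out of the enum on the assumption that the asterisk marks it as a flag rather than a regular state.

```java
import java.util.EnumSet;
import java.util.Set;

// Assumed reading of the member state diagram; names are illustrative.
enum MemberState {
    JOINING, UP, LEAVING, EXITING, DOWN, REMOVED;

    // States this state may move to next.
    Set<MemberState> next() {
        switch (this) {
            case JOINING: return EnumSet.of(UP, DOWN);        // leader action, or downed
            case UP:      return EnumSet.of(LEAVING, DOWN);   // leave, or downed
            case LEAVING: return EnumSet.of(EXITING, DOWN);   // leader action, or downed
            case EXITING: return EnumSet.of(REMOVED, DOWN);   // leader action, or downed
            case DOWN:    return EnumSet.of(REMOVED);         // leader action
            default:      return EnumSet.noneOf(MemberState.class); // REMOVED is terminal
        }
    }

    boolean canMoveTo(MemberState target) {
        return next().contains(target);
    }
}
```

A table like this is useful mainly for rejecting impossible transitions, for example a removed node reappearing as up without joining again.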

Leader

• No SPOF
• Can be any node
• No handover involved
• Deterministically recognized by all nodes
  – always the first member in the sorted membership ring
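Because the leader is just the first member of the sorted ring, every node can work it out locally from the membership it already has, with no election and no handover. A sketch follows, assuming members are identified by address strings and sorted in natural string order (the actual ordering Akka uses may differ).

```java
import java.util.Collection;
import java.util.Optional;

// Sketch: leader = first member in sorted order. The ordering is an assumption.
final class LeaderSketch {
    static Optional<String> leaderOf(Collection<String> members) {
        // Every node sorts the same set the same way, so all agree on the result.
        return members.stream().sorted().findFirst();
    }
}
```

Two nodes holding the same membership in different iteration order still pick the same leader, which is what "deterministically recognized" requires.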

Leader Duties

• Shift members from
  – Joining to Up
  – Exiting to Removed
  – Up to Down (auto-downing) to Removed
• Can only be performed on convergence

Cluster Convergence

• A Node sees that all other Nodes have seen this version of the gossip
• Is always local to that Node
• Unreachable Nodes block this
• Mark Unreachable Nodes as Down to proceed
  – Manual Ops intervention
  – Automatic action

Cluster Metrics

• Gossip based
• Metrics about the Nodes in the Cluster
  – Load
  – CPU Usage
  – Processors
  – Heap Memory
    • Used, Committed, Max

Cluster Roles

• Assign roles to Nodes (named tags)
• Nodes can have multiple roles
• Restrict work to certain roles
• Deterministically recognized role leader

Cluster Events

• Subscribe to be notified
• Membership Changes
  – Up, Exited, Removed
• Leader Changed
• Metrics Changed
• Role Leader Changed
• Member Unreachable

Cluster DeathWatch

• Triggered by marking node «A» Down
  – Tell parents of their lost children on «A»
  – Kill all children of actors on «A»
  – Send Terminated for actors on «A»

Building on The Cluster

Load Balancing

• Cluster aware routers
  – Round Robin Router
  – Consistent Hashing Router
  – Adaptive Load Balancing Router
    • Use Cluster Metrics to select target
    • CPU/Memory/Load
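Of the three routing strategies, consistent hashing is the least obvious, so here is a generic sketch of the technique. It is not Akka's router: `ConsistentHashRing` is a hypothetical class, routees are plain strings, and the hash is a simple bit-mix of `String.hashCode()` chosen for brevity. Each routee is placed on a ring at several virtual points, and a key goes to the first point at or after its own hash.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Generic consistent-hash ring; hash function and names are assumptions.
final class ConsistentHashRing {
    private final TreeMap<Integer, String> ring = new TreeMap<>();
    private final int virtualNodes;

    ConsistentHashRing(List<String> routees, int virtualNodes) {
        this.virtualNodes = virtualNodes;
        for (String routee : routees) {
            add(routee);
        }
    }

    // Place the routee at several points so load spreads more evenly.
    void add(String routee) {
        for (int i = 0; i < virtualNodes; i++) {
            ring.put(hash(routee + "#" + i), routee);
        }
    }

    // Remove only the points that still belong to this routee.
    void remove(String routee) {
        for (int i = 0; i < virtualNodes; i++) {
            ring.remove(hash(routee + "#" + i), routee);
        }
    }

    // First point clockwise from the key's hash, wrapping around at the end.
    String routeeFor(String key) {
        if (ring.isEmpty()) {
            throw new IllegalStateException("no routees");
        }
        Map.Entry<Integer, String> entry = ring.ceilingEntry(hash(key));
        return (entry != null ? entry : ring.firstEntry()).getValue();
    }

    // Cheap bit-mixing of String.hashCode(); a real router would use a stronger hash.
    private static int hash(String s) {
        int h = s.hashCode();
        h ^= h >>> 16;
        h *= 0x85ebca6b;
        h ^= h >>> 13;
        return h;
    }
}
```

The useful property is stability: removing a routee only remaps the keys that routee owned, so messages for the same key keep landing on the same node while the cluster changes around it.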

Cluster Contributions/Patterns

• Distributed Pub/Sub Mediator
  – Publish and Subscribe to message flows
• Cluster Singleton
  – HA singleton actor instance within the cluster
• Cluster Client
  – Let other systems connect to the cluster

DistributedPubSubMediator

[Diagram, four frames: a Frontend and a Master, each with a Mediator; the later frames are labeled Put, Send, Send]

ClusterSingleton

[Diagram, three frames: two ClusterSingletonManagers with a Master and a Master (Standby); in the last frame both are labeled Master]

ClusterClient & ClusterSingleton

[Diagram: Master, Master (Standby), two Mediators, two Receptionists, two ClusterClients and two Workers]

Typesafe Activator

Distributed Workers Cluster Template

• http://typesafe.com/platform/getstarted

What The Future Brings

Gossip Optimizations

• Several times faster Vector Clock comparison
• Fewer Vector Clock comparisons
• Gossip message size cut in half
• Gossip message scrubbing
• Lazy deserialization of Gossip messages

Return from Unreachable

• Unreachable to Reachable
• Cluster is more resilient to fluctuations

[Diagram: member state transitions among joining, up, leaving, exiting, down, removed and unreachable*, with edges labeled join, leave, (leader action) and (fd*)]

Rebuilt Routers

• Rebuilt from the ground
• Routing logic usable in Actors
• Actor Selection as routees
• Improved Cluster behavior

Persistence

• New module akka-persistence
• Command sourcing & event sourcing
• Based on the proven Eventsourced library
• Migrate actors by persisting their state
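The idea behind event sourcing, and why it allows an actor to be moved, fits in a small sketch: state is never stored directly, only the events that produced it, so any node with access to the journal can rebuild the state by replay. The class below is plain Java with hypothetical names and an in-memory list standing in for durable storage; it does not use the akka-persistence API.

```java
import java.util.List;

// Event-sourcing sketch: an in-memory list stands in for a durable journal.
final class EventSourcedCounter {
    private final List<Integer> journal;
    private int total = 0;

    // Recovery: rebuild state by replaying every journaled event in order.
    EventSourcedCounter(List<Integer> journal) {
        this.journal = journal;
        for (int delta : journal) {
            apply(delta);
        }
    }

    // Command handler: persist the event first, then update in-memory state.
    void add(int delta) {
        journal.add(delta);
        apply(delta);
    }

    // The only place state changes, shared by live handling and replay.
    private void apply(int delta) {
        total += delta;
    }

    int total() {
        return total;
    }
}
```

Creating a second `EventSourcedCounter` over the same journal models migration: the new instance reaches the same total without any state being copied between instances.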

Resources

Survey and Resources

• Help Akka get better. Fill out the survey!
  – http://tinyurl.com/akka-survey
• Akka Cluster Documentation
  – http://tinyurl.com/akka-cluster
• Akka Cluster in Production Blog Post – Ryan Tanner
  – http://tinyurl.com/akka-at-conspire

Coursera Course

• Principles of Reactive Programming by Martin Odersky, Erik Meijer and Roland Kuhn
  – Starts 4th of November 2013
  – 7 weeks
  – Workload: 5-7 hours a week
  – Free as in free beer
• https://www.coursera.org/course/reactive

Björn Antonsson

@bantonsson @akkateam

bjorn.antonsson@typesafe.com
