R2D2 slides from Velocity Conference London 2013

R2D2

LinkedIn’s Request/Response Infrastructure

Oby Sumampouw (pronounced o-bee soo-mum-pow), [email protected]

Why R2D2?

[Diagram: three server clusters (Server Cluster, Cluster 2, Cluster 3), each paired with its own load balancer (LoadBalancer, LoadBalancer 2, LoadBalancer 3)]

R2D2 in a nutshell

[Diagram: a Client and three services (ProfileService, InboxService, AdsService), each backed by two servers labelled Server for Resource “foo”]

Send request to get profile?id=123

Zookeeper

• Listen to the profile ZooKeeper node
• Get the list of URIs of the servers where profile is hosted
• Get notified if a server leaves or joins a cluster
• Choose one server to send the request to

???

Request

Servers
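The client-side flow in the bullets above can be sketched as follows. This is only an illustration: `FakeRegistry` is a hypothetical in-memory stand-in for the ZooKeeper node, the host names are made up, and the random choice stands in for whatever strategy the real client uses.

```python
import random
from typing import Callable, Dict, List


class FakeRegistry:
    """Hypothetical in-memory stand-in for the ZooKeeper node listing a service's servers."""

    def __init__(self) -> None:
        self._servers: Dict[str, List[str]] = {}
        self._listeners: Dict[str, List[Callable[[List[str]], None]]] = {}

    def watch(self, service: str, listener: Callable[[List[str]], None]) -> None:
        # Register the listener and immediately hand it the current server list.
        self._listeners.setdefault(service, []).append(listener)
        listener(list(self._servers.get(service, [])))

    def join(self, service: str, uri: str) -> None:
        self._servers.setdefault(service, []).append(uri)
        self._notify(service)

    def leave(self, service: str, uri: str) -> None:
        self._servers[service].remove(uri)
        self._notify(service)

    def _notify(self, service: str) -> None:
        for listener in self._listeners.get(service, []):
            listener(list(self._servers[service]))


class Client:
    """Keeps a local copy of the server list and picks one server per request."""

    def __init__(self, registry: FakeRegistry, service: str) -> None:
        self.servers: List[str] = []
        registry.watch(service, self._on_change)

    def _on_change(self, servers: List[str]) -> None:
        # Called whenever a server joins or leaves.
        self.servers = servers

    def choose_server(self, rng: random.Random) -> str:
        if not self.servers:
            raise RuntimeError("no servers available")
        return rng.choice(self.servers)


registry = FakeRegistry()
registry.join("profile", "http://host1:1234/profile")
registry.join("profile", "http://host2:1234/profile")
client = Client(registry, "profile")
registry.leave("profile", "http://host1:1234/profile")
print(client.choose_server(random.Random(0)))  # http://host2:1234/profile
```

Because the client holds the list locally, choosing a server needs no extra network hop per request.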

Agenda

R2D2 Architecture

How information is stored and organized in ZooKeeper

How R2D2 does load balancing and graceful degradation

Partitioning and sticky routing

Miscellaneous D2 use cases at LinkedIn:

- Redlining

- Cluster variants

Q&A

StarWars™?

Note: This R2D2 is not related to StarWars™. LucasArts/Disney, please don’t sue us.

What is rest.li?

Open source Java REST framework. Go to http://rest.li

What is D2?

Primarily a name server and traffic router

The global “address book” is stored in ZooKeeper

A backup copy is stored in the local filesystem

Definitions:

D2 Cluster represents a collection of identical servers that host one or many D2 services

D2 Service represents a service

D2 Uri represents a server’s address and weight

How is D2 information organized and stored?

/ Root

/d2

/d2/clusters /d2/services /d2/uris

/d2/clusters/clusterA

/d2/clusters/clusterB

/d2/services/serviceA1


/d2/services/serviceB

ServiceProperties:

- Cluster = clusterA
- Load-balancer configuration
- Degrader configuration
- Strategy configuration
- Etc.

ClusterProperties:

- Partition configuration
- Etc.

/d2/uris/clusterA

/d2/uris/clusterB

/d2/uris/clusterB/ephemeralNode1

/d2/uris/clusterB/ephemeralNode2

UriProperties:

- Machine URI
- Weight
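To make the layout concrete, here is a small sketch of how a client could resolve a service name to server addresses by walking these paths. The dictionary is a hypothetical snapshot of the tree; the JSON payloads, field names, and host names are assumptions made for illustration, not the real serialization.

```python
import json
from typing import Dict, List, Tuple

# Hypothetical snapshot of the znode tree: path -> JSON payload (format assumed).
znodes: Dict[str, str] = {
    "/d2/services/serviceA1": json.dumps({"cluster": "clusterA"}),
    "/d2/services/serviceB": json.dumps({"cluster": "clusterB"}),
    "/d2/clusters/clusterA": json.dumps({}),
    "/d2/clusters/clusterB": json.dumps({}),
    "/d2/uris/clusterB/ephemeralNode1": json.dumps(
        {"uri": "http://b1.example.com:8080", "weight": 1.0}
    ),
    "/d2/uris/clusterB/ephemeralNode2": json.dumps(
        {"uri": "http://b2.example.com:8080", "weight": 2.0}
    ),
}


def children(tree: Dict[str, str], parent: str) -> List[str]:
    """Return the direct child paths of `parent`, sorted by name."""
    prefix = parent.rstrip("/") + "/"
    return sorted(
        path for path in tree
        if path.startswith(prefix) and "/" not in path[len(prefix):]
    )


def resolve(tree: Dict[str, str], service: str) -> List[Tuple[str, float]]:
    """Service -> cluster (ServiceProperties) -> live servers (UriProperties)."""
    cluster = json.loads(tree[f"/d2/services/{service}"])["cluster"]
    servers = []
    for path in children(tree, f"/d2/uris/{cluster}"):
        node = json.loads(tree[path])
        servers.append((node["uri"], node["weight"]))
    return servers


print(resolve(znodes, "serviceB"))
# [('http://b1.example.com:8080', 1.0), ('http://b2.example.com:8080', 2.0)]
```

A cluster with no ephemeral nodes, such as clusterA in this snapshot, resolves to an empty list, which is how a client would see a cluster with no live servers.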

How is ZooKeeper initialized?

ZookeeperConfig file

/ Root

/d2



/d2/clusters/clusterB

/d2/clusters/clusterC




ServiceA1Client

ClusterA Server

/d2/uris/clusterA

/d2/uris/clusterA/ephemeralNode1

D2Config.java

D2 Load Balancer

Client-side load balancer

Client keeps track of the state

2 Strategies to use:

- Random

- Degrader

LOAD_BALANCE

Individual Server stats:

Cluster total call count: 0

Cluster average latency: 0 ms

Cluster drop rate: 0.0

LOAD_BALANCE



How does the degrader load balancer work?

Server 1

Server 2

Client

Total Call Count: 0

Latency: 0 ms

Total Call Count: 0

Latency: 0 ms

100 points

100 points

Period 1

Period 2

Total Call Count: 100

Latency: 4900 ms


Latency: 100 ms

61 points

CALL_DROPPING


Period 3

CALL_DROPPING


Cluster average latency: 3636.5 ms


Latency: 4900 ms


Latency: 3000 ms

LOAD_BALANCE



Notice: The number of points doesn’t change because we are in CALL_DROPPING mode

LB Configuration:
- Latency Low Water Mark: 500 ms
- Latency High Water Mark: 2000 ms
- Min Call Count: 10
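The points mechanism in this walkthrough can be approximated with the sketch below. The water marks and min call count are the configuration values from the slide; the 20% step, the function names, and the call count assumed for Server 2 are hypothetical, so the numbers will not match the 61 points shown in the slides.

```python
from dataclasses import dataclass
from typing import Dict

LOW_WATER_MARK_MS = 500    # from the LB configuration
HIGH_WATER_MARK_MS = 2000  # from the LB configuration
MIN_CALL_COUNT = 10        # from the LB configuration
MAX_POINTS = 100
STEP_PERCENT = 20          # assumption: the real step size is not given in the slides


@dataclass
class ServerStats:
    call_count: int
    latency_ms: float


def next_points(points: int, stats: ServerStats) -> int:
    """Return a server's points for the next period (LOAD_BALANCE step)."""
    if stats.call_count < MIN_CALL_COUNT:
        return points  # too few calls to judge this server
    if stats.latency_ms > HIGH_WATER_MARK_MS:
        return max(1, points * (100 - STEP_PERCENT) // 100)  # degrade
    if stats.latency_ms < LOW_WATER_MARK_MS:
        grown = points * (100 + STEP_PERCENT) // 100
        return min(MAX_POINTS, max(points + 1, grown))  # recover
    return points  # between the water marks: leave unchanged


def traffic_share(points_by_server: Dict[str, int]) -> Dict[str, float]:
    """A server's share of traffic is proportional to its points."""
    total = sum(points_by_server.values())
    return {server: pts / total for server, pts in points_by_server.items()}


points = {"server1": 100, "server2": 100}
stats = {
    "server1": ServerStats(call_count=100, latency_ms=4900),
    "server2": ServerStats(call_count=100, latency_ms=100),  # call count assumed
}
points = {server: next_points(points[server], stats[server]) for server in points}
print(points)  # {'server1': 80, 'server2': 100}
print(traffic_share(points))
```

The slow server loses points and therefore receives a smaller share of the next period's traffic, while the healthy server stays at the maximum.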

How does the degrader recover from a bad state?

Server 1

Server 2

Client

Period N

LOAD_BALANCE





1 point

1 point

Total Call Count: 0

Latency: 0 ms

Total Call Count: 0

Latency: 0 ms

CALL_DROPPING


2 points

2 points

Notice: We’re in recovery mode because we choked all traffic, so we will try recovering regardless of call stats

Period N+1

CALL_DROPPING


LOAD_BALANCE



Period N+2


Latency: 150 ms


Latency: 200 ms

LOAD_BALANCE



Cluster average latency: 178.6 ms


37 points

37 points

CALL_DROPPING


Period N+3


Latency: 200 ms


Latency: 200 ms

CALL_DROPPING





LOAD_BALANCE



LB Configuration:
- Latency Low Water Mark: 500 ms
- Latency High Water Mark: 2000 ms
- Min Call Count: 10
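The recovery idea, growing a server's points even when there are too few calls to measure it, might look like the sketch below. Doubling is an assumption chosen for illustration; the slides show 1 point, then 2 points, then 37 points, and the exact rule behind that progression is not stated, so this sketch does not reproduce it.

```python
from typing import List

MIN_CALL_COUNT = 10  # from the LB configuration
MAX_POINTS = 100


def next_points_in_recovery(points: int, call_count: int) -> int:
    """Grow points blindly while a server is starved of traffic.

    With enough calls, the normal latency-based rule applies instead,
    so this function leaves the points unchanged in that case.
    """
    if call_count >= MIN_CALL_COUNT:
        return points
    # Assumption: double the points each period until traffic flows again.
    return min(MAX_POINTS, max(1, points) * 2)


history: List[int] = [1]
for _ in range(3):
    history.append(next_points_in_recovery(history[-1], call_count=0))
print(history)  # [1, 2, 4, 8]
```

Without a rule like this, a server that receives no traffic would never produce the call stats needed to earn its points back.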

A few more details

Min call count is reduced depending on how degraded the state is

It’s not just latency; we also consider error rate and the number of outstanding calls

We can use many types of latency:

- AVERAGE

- 90%

- 95%

- 99%

We can set different low/high water marks for the cluster vs. for an individual node
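The choice of latency type matters because an average can hide a slow tail. The sketch below compares the average with the 90th, 95th, and 99th percentiles on made-up sample data; the nearest-rank percentile method is an assumption, since the slides do not say how percentiles are computed.

```python
from typing import List


def average(latencies_ms: List[int]) -> float:
    return sum(latencies_ms) / len(latencies_ms)


def percentile(latencies_ms: List[int], pct: int) -> int:
    """Nearest-rank percentile for an integer pct in 1..100 (method assumed)."""
    ordered = sorted(latencies_ms)
    rank = (pct * len(ordered) + 99) // 100  # integer ceiling of pct * n / 100
    return ordered[max(rank, 1) - 1]


# Made-up sample: 95 fast calls and 5 very slow ones.
sample = [100] * 95 + [4900] * 5

print(average(sample))         # 340.0
print(percentile(sample, 90))  # 100
print(percentile(sample, 95))  # 100
print(percentile(sample, 99))  # 4900
```

With a 500 ms low water mark, this sample looks healthy by average, 90%, and 95% latency, but badly degraded by 99% latency.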

Call Dropping vs Load Balancing

Call Dropping Mode
- Affects the entire cluster
- Purpose: graceful degradation
- Adjusts: drop rate
- Hints: latency

Load Balancing Mode
- Affects only individual machines in the cluster
- Purpose: load balancing traffic
- Adjusts: points
- Hints: individual node latency, error rate, number of outstanding calls
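Call dropping can be sketched as a cluster-wide drop rate driven by cluster latency, in contrast to the per-server points used for load balancing. The water marks and min call count come from the LB configuration shown earlier; the 0.2 step and the function names are assumptions.

```python
import random

LOW_WATER_MARK_MS = 500    # from the LB configuration
HIGH_WATER_MARK_MS = 2000  # from the LB configuration
MIN_CALL_COUNT = 10        # from the LB configuration
DROP_STEP = 0.2            # assumption: the real step size is not given


def next_drop_rate(drop_rate: float, cluster_calls: int, cluster_latency_ms: float) -> float:
    """Return the cluster drop rate for the next period (CALL_DROPPING step)."""
    if cluster_calls < MIN_CALL_COUNT:
        return drop_rate  # too few calls to judge the cluster
    if cluster_latency_ms > HIGH_WATER_MARK_MS:
        return min(1.0, drop_rate + DROP_STEP)  # shed more load
    if cluster_latency_ms < LOW_WATER_MARK_MS:
        return max(0.0, drop_rate - DROP_STEP)  # let traffic back in
    return drop_rate


def should_drop(drop_rate: float, rng: random.Random) -> bool:
    """Decide on the client whether to drop this call instead of sending it."""
    return rng.random() < drop_rate


rate = next_drop_rate(0.0, cluster_calls=200, cluster_latency_ms=3636.5)
print(rate)  # 0.2
rng = random.Random(42)
dropped = sum(should_drop(rate, rng) for _ in range(1000))
print(dropped)  # roughly 200 of 1000 calls
```

Dropping on the client side means an overloaded cluster never sees the shed requests at all, which is what makes the degradation graceful.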

Partitioning and Sticky Routing

D2 supports partitioning of clusters

- Range partitioning

- Hash partitioning (MD5 or Modulo)

- Use regex to extract key from URI to determine where a request should go

Sticky routing within partition is also supported

- Use regex to extract key from URI (same as above)

- Use consistent hash ring
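The partitioning options above can be illustrated with a few small functions. The regex, the partition size, and the partition count are hypothetical values chosen for the example; the real D2 partition configuration lives in ClusterProperties and is not shown in the slides.

```python
import hashlib
import re
from typing import Optional

# Hypothetical regex: pull the numeric id out of a URI such as /profile?id=123.
KEY_PATTERN = re.compile(r"[?&]id=(\d+)")


def extract_key(uri: str) -> Optional[str]:
    match = KEY_PATTERN.search(uri)
    return match.group(1) if match else None


def range_partition(key: int, partition_size: int, partition_count: int) -> int:
    """Range partitioning: keys 0..size-1 go to partition 0, and so on."""
    partition = key // partition_size
    if not 0 <= partition < partition_count:
        raise ValueError(f"key {key} is outside the configured ranges")
    return partition


def modulo_partition(key: int, partition_count: int) -> int:
    """Hash partitioning with a plain modulo."""
    return key % partition_count


def md5_partition(key: str, partition_count: int) -> int:
    """Hash partitioning with MD5, which also works for non-numeric keys."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % partition_count


key = extract_key("/profile?id=123")
print(key)                                # 123
print(range_partition(int(key), 100, 4))  # 1
print(modulo_partition(int(key), 4))      # 3
print(md5_partition(key, 4))              # some partition in 0..3
```

Range partitioning keeps neighbouring keys together, while the hash variants spread them evenly across partitions.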

Consistent Hash Ring

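[Diagram: a hash ring spanning Integer.MIN_VALUE to Integer.MAX_VALUE, with positions marked at -100, 0, and 100; app1.foo.com, app2.foo.com, and app3.foo.com are placed on the ring, and a request for “foo” is hashed onto it]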
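A minimal consistent hash ring might look like the sketch below. It assumes MD5 truncated to a signed 32-bit integer as the hash, to mirror the integer range in the diagram, and 100 points per server; neither choice is confirmed by the slides.

```python
import bisect
import hashlib
from typing import Dict, List, Tuple


def ring_hash(value: str) -> int:
    """Map a string onto the signed 32-bit range (hash function assumed)."""
    digest = hashlib.md5(value.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big", signed=True)


class HashRing:
    def __init__(self, points_by_server: Dict[str, int]) -> None:
        ring: List[Tuple[int, str]] = []
        for server, points in points_by_server.items():
            # Each point is one position on the ring owned by this server.
            for i in range(points):
                ring.append((ring_hash(f"{server}#{i}"), server))
        ring.sort()
        self._ring = ring
        self._positions = [position for position, _ in ring]

    def lookup(self, key: str) -> str:
        """Return the server owning the first point at or after the key's hash."""
        if not self._ring:
            raise RuntimeError("ring is empty")
        index = bisect.bisect_left(self._positions, ring_hash(key))
        if index == len(self._ring):
            index = 0  # wrap around the end of the ring
        return self._ring[index][1]


ring = HashRing({"app1.foo.com": 100, "app2.foo.com": 100, "app3.foo.com": 100})
print(ring.lookup("foo"))                        # one of the three servers
print(ring.lookup("foo") == ring.lookup("foo"))  # True: the same key sticks
```

Sticky routing follows from the lookup being deterministic, and when a server leaves, only the keys it owned move to another server.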

Miscellaneous D2 use cases

Redlining: Measure max capacity of server

Use real traffic

Don’t have to worry about mutable operations

[Diagram: the hash ring from Integer.MIN_VALUE to Integer.MAX_VALUE, with positions marked at -100, 0, and 100, showing app1.foo.com, app2.foo.com, and app3.foo.com]

Miscellaneous D2 use cases

What if there are different requirements from different clients?

Let’s say we have a service called profile.

- Clients who can only view a profile should go to the read-only cluster.

- Clients who can edit a profile should go to the read-write cluster.

Use the cluster variant technique.

A cluster variant changes the D2 service’s namespace to get around the restriction that a ZooKeeper node’s name must be unique.

Miscellaneous D2 use cases

/ Root

/d2


/d2/clusters/readonly

/d2/clusters/readwrite

/d2/services/profile

ServiceProperties:

-Cluster = readonly

/d2/uris/readonly

/d2/uris/readwrite

/d2/profileClusterVariant

/d2/profileClusterVariant/profile

ServiceProperties:

-Cluster = readwrite

/d2/uris/readonly/ephemeralNode1

/d2/uris/readwrite/ephemeralNode1

readonlyServer

readwriteServer

View Client Edit Client

Request for profile

Request for profile
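The lookup difference between the two clients can be sketched like this. The paths are the ones in the tree above; the dictionary payloads and the `cluster_for` helper are hypothetical.

```python
from typing import Dict

# Snapshot of the relevant znodes; the payload format is assumed.
znodes: Dict[str, Dict[str, str]] = {
    "/d2/services/profile": {"cluster": "readonly"},
    "/d2/profileClusterVariant/profile": {"cluster": "readwrite"},
}


def cluster_for(tree: Dict[str, Dict[str, str]], service: str,
                namespace: str = "services") -> str:
    """Resolve a service name to its cluster within the given namespace."""
    return tree[f"/d2/{namespace}/{service}"]["cluster"]


# The view client uses the default namespace.
print(cluster_for(znodes, "profile"))                           # readonly
# The edit client is configured with the cluster-variant namespace.
print(cluster_for(znodes, "profile", "profileClusterVariant"))  # readwrite
```

Both clients ask for the same service name, profile, and only the namespace they read from decides which cluster serves them.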

Q&A

Questions?

Email me at: [email protected]

Check out http://rest.li and https://github.com/linkedin/rest.li for more info

We’re hiring!




Cross data center routing

©2013 LinkedIn Corporation. All Rights Reserved.

ZooKeeper Data Center 1

ZooKeeper Data Center 2

Server Cluster for Data Center 1

Server Cluster for Data Center 2

/ Root

/d2



/d2/clusters/clusterA-1


/d2/services/serviceA

/d2/services/serviceA-1

/d2/services/serviceA-2


/d2/uris/clusterA-2/ephemeralNode1


/d2/uris/clusterA/ephemeralNode1

ServiceProperties:

- Cluster = clusterA-2

View of ZooKeeper in Data Center 1

Client in Data Center 1

Client in Data Center 2
