Building a Scalable Architecture 1211657706511562 8

7/29/2019 Building a Scalable Architecture 1211657706511562 8

1/46

Intelligent People. Uncommon Ideas.

11

(Lessons Learned @ Directi)

By Bhavin Turakhia

CEO, Directi

(http://www.directi.com | http://wiki.directi.com | http://careers.directi.com)

Licensed under Creative Commons Attribution Sharealike Noncommercial
http://www.directi.com/http://wiki.directi.com/http://careers.directi.com/http://careers.directi.com/http://careers.directi.com/http://careers.directi.com/http://wiki.directi.com/http://www.directi.com/


2/46

22

Why is Scalability important

Introduction to the Variables and Factors

Building our own Scalable Architecture (in incrementalsteps) Vertical Scaling

Vertical Partitioning

Horizontal Scaling

Horizontal Partitioning

etc

Platform Selection Considerations Tips


3/46

33

Viral marketing can result in instant successes

RSS / Ajax / SOA pull based / polling type

XML protocols - Meta-data > data

Number of Requests exponentially grows with user base

RoR / Grails Dynamic language landscape gainingpopularity

In the end you want to build a Web 2.0 app that can servemillions of users with ZERO downtime


4/46

44

Scalability - Number of users / sessions / transactions /operations the entire system can perform

Performance Optimal utilization of resources

Responsiveness Time taken per operation

Availability - Probability of the application or a portion of theapplication being available at any given point in time

Downtime Impact - The impact of a downtime of aserver/service/resource - number of users, type of impact etc

Cost

Maintenance Effort

High: scalability, availability, performance & responsiveness

Low: downtime impact, cost & maintenance effort


5/46

55

Platform selection

Hardware

Application Design

Database/Datastore Structure and Architecture

Deployment Architecture Storage Architecture

Abuse prevention

Monitoring mechanisms

and more


6/46

66

We will now build an example architecture for an exampleapp using the following iterative incremental steps Inspect current Architecture

Identify Scalability Bottlenecks

Identify SPOFs and Availability Issues

Identify Downtime Impact Risk Zones

Apply one of -

Vertical Scaling

Vertical Partitioning

Horizontal Scaling

Horizontal Partitioning

Repeat process


7/4677

Appserver &

DBServer


8/4688

Appserver,

DBServer CPU

CPURAM RAM


9/4699

Introduction Increasing the hardware resources

without changing the number of nodes

Referred to as Scaling up the Server

Advantages Simple to implement

Disadvantages Finite limit

Hardware does not scale linearly(diminishing returns for eachincremental unit)

Requires downtime

Increases Downtime Impact

Incremental costs increaseexponentially

Appserver,

DBServer

CPU

CPURAM RAM

CPU

CPU

RAM RAM


10/46

1010

AppServer

DBServer

Introduction Deploying each service on a separate node

Positives Increases per application Availability

Task-based specialization, optimization and

tuning possible Reduces context switching

Simple to implement for out of band

processes

No changes to App required

Flexibility increases

Negatives Sub-optimal resource utilization

May not increase overall availability

Finite Scalability


11/46

1111

The term Vertical Partitioning denotes Increase in the number of nodes by distributing the

tasks/functions

Each node (or cluster) performs separate Tasks

Each node (or cluster) is different from the other

Vertical Partitioning can be performed at various layers(App / Server / Data / Hardware etc)


12/46

1212

AppServer AppServer AppServer

Load Balancer

DBServer

Introduction Increasing the number of nodes of

the App Server through LoadBalancing

Referred to as Scaling out theApp Server


13/46

1313

The term Horizontal Scaling denotes Increase in the number of nodes by replicating the nodes

Each node performs the same Tasks

Each node is identical

Typically the collection of nodes maybe known as a cluster

(though the term cluster is often misused) Also referred to as Scaling Out

Horizontal Scaling can be performed for any particulartype of node (AppServer / DBServer etc)


14/46

1414

Hardware Load balancers are faster

Software Load balancers are more customizable With HTTP Servers load balancing is typically combined

with http accelerators


15/46

1515

Sticky Sessions Requests for a given user are

sent to a fixed App Server

Observations

Asymmetrical load distribution(especially during downtimes)

Downtime Impact Loss ofsession data


Load Balancer

Sticky Sessions

User 1 User 2


16/46

1616

Central Session Store Introduces SPOF An additional variable

Session reads and writesgenerate Disk + Network I/O

Also known as a Shared

Session Store Cluster


Load Balancer

Session Store

Central Session Storage


17/46

1717

Clustered Session

Management Easier to setup

No SPOF

Session reads are instantaneous

Session writes generate Network

I/O Network I/O increases

exponentially with increase innumber of nodes

In very rare circumstances arequest may get stale sessiondata

User request reaches subsequentnode faster than intra-nodemessage

Intra-node communication fails

AKA Shared-nothing Cluster


Load Balancer

Clustered Session Management


18/46

1818

Sticky Sessions with Central

Session Store Downtime does not cause loss

of data

Session reads need notgenerate network I/O

Sticky Sessions withClustered SessionManagement No specific advantages

Sticky Sessions


Load Balancer

User 1 User 2


19/46

1919

Recommendation Use Clustered Session Management if you have

Smaller Number of App Servers

Fewer Session writes

Use a Central Session Store elsewhere

Use sticky sessions only if you have to


20/46

2020

In a Load Balanced App

Server Cluster the LB is anSPOF

Setup LB in Active-Active orActive-Passive mode Note: Active-Active nevertheless

assumes that each LB isindependently able to take upthe load of the other

If one wants ZERO downtime,then Active-Active becomes truly

cost beneficial only if multipleLBs (more than 3 to 4) are daisychained as Active-Active formingan LB Cluster


Load Balancer

Active-Passive LB

Load Balancer


Load Balancer

Active-Active LB

Load Balancer

Users

Users


21/46

2121

DBServer

Our deployment at the end ofStep 4

Positives Increases Availability and

Scalability

No changes to App required Easy setup

Negatives Finite Scalability

Load Balanced

App Servers


22/46

2222

DBServer

Introduction Partitioning out the Storage

function using a SAN

SAN config options Refer to Demystifying Storage at

http://wiki.directi.com -> DevUniversity -> Presentations

Positives Allows Scaling Up the DB Server

Boosts Performance of DB Server

Negatives Increases Cost

Load Balanced

App Servers

SAN
http://wiki.directi.com/http://wiki.directi.com/


23/46

2323

DBServer

Introduction Increasing the number of DB nodes Referred to as Scaling out the DB

Server

Options Shared nothing Cluster

Real Application Cluster (or Shared

Storage Cluster)DBServer DBServer

Load Balanced

App Servers

SAN


24/46

2424

Each DB Server node hasits own complete copy of thedatabase

Nothing is shared betweenthe DB Server Nodes

This is achieved through DBReplication at DB / Driver /

App level or through a proxy

Supported by most RDBMs

natively or through 3rd partysoftware

DBServer

Database

DBServer

Database

DBServer

Database

Note: Actual DB files maybe

stored on a central SAN


25/46

2525

Master-Slave Writes are sent to a single master which replicates the data to

multiple slave nodes

Replication maybe cascaded

Simple setup

No conflict management required Multi-Master

Writes can be sent to any of the multiple masters which replicate

them to other masters and slaves

Conflict Management required

Deadlocks possible if same data is simultaneously modified at

multiple places


26/46

2626

Asynchronous Guaranteed, but out-of-band replication from Master to Slave

Master updates its own db and returns a response to client

Replication from Master to Slave takes place asynchronously

Faster response to a client

Slave data is marginally behind the Master

Requires modification to App to send critical reads and writes tomaster, and load balance all other reads

Synchronous Guaranteed, in-band replication from Master to Slave

Master updates its own db, and confirms all slaves have updatedtheir db before returning a response to client

Slower response to a client

Slaves have the same data as the Master at all times

Requires modification to App to send writes to master and load

balance all reads


27/46

2727

Replication at RDBMS level Support may exists in RDBMS or through 3rd party tool

Faster and more reliable

App must send writes to Master, reads to any db and critical reads

to Master

Replication at Driver / DAO level Driver / DAO layer ensures

writes are performed on all connected DBs

Reads are load balanced

Critical reads are sent to a Master

In most cases RDBMS agnostic

Slower and in some cases less reliable


28/46

2828

All DB Servers in the cluster

share a common storagearea on a SAN

All DB servers mount thesame block device

The filesystem must be aclustered file system (eg

GFS / OFS)

Currently only supported by

Oracle Real ApplicationCluster

Can be very expensive(licensing fees)

DBServer

SAN

DBServer DBServer

Database


29/46

2929

Try and choose a DB which

natively supports Master-Slave replication

Use Master-Slave Asyncreplication

Write your DAO layer toensure writes are sent to a single DB

reads are load balanced

Critical reads are sent to amaster

DBServer

DBServer

DBServer

Load Balanced

App Servers

Writes & Critical Reads

Other Reads


30/46

3030

Our architecture now looks like

this Positives

As Web servers grow, Databasenodes can be added

DB Server is no longer SPOF

Negatives Finite limit

Load Balanced

App Servers

DB Cluster

DB DB DB

SAN


31/46

3131

Shared nothing clusters have a

finite scaling limit Reads to Writes 2:1

So 8 Reads => 4 writes

2 DBs

Per db 4 reads and 4 writes

4 DBs Per db 2 reads and 4 writes

8 DBs

Per db 1 read and 4 writes

At some point adding anothernode brings in negligibleincremental benefit

ReadsWrites

DB1 DB2


32/46

3232

Introduction

Increasing the number of DBClusters by dividing the data

Options Vertical Partitioning - Dividing

tables / columns

Horizontal Partitioning - Dividing by

rows (value)

Load Balanced

App Servers

DB Cluster

DB DB DB

SAN


33/46

3333

Take a set of tables and move

them onto another DB Eg in a social network - the users

table and the friends table can be onseparate DB clusters

Each DB Cluster has different

tables Application code or DAO / Driver

code or a proxy knows where agiven table is and directs queriesto the appropriate DB

Can also be done at a columnlevel by moving a set of columnsinto a separate table

App Cluster

DB Cluster 1

Table 1

Table 2

DB Cluster 2

Table 3

Table 4


34/46

3434

Negatives

One cannot perform SQL joins ormaintain referential integrity(referential integrity is as such over-rated)

Finite Limit

App Cluster

DB Cluster 1

Table 1

Table 2

DB Cluster 2

Table 3

Table 4


35/46

3535

Take a set of rows and move

them onto another DB Eg in a social network each DB

Cluster can contain all data for 1million users

Each DB Cluster has identical

tables Application code or DAO / Driver

code or a proxy knows where agiven row is and directs queriesto the appropriate DB

Negatives SQL unions for search type queries

must be performed within code

App Cluster

DB Cluster 1

Table 1

Table 2

Table 3

Table 4

DB Cluster 2

Table 1

Table 2

Table 3

Table 4

1 million users 1 million users


36/46

3636

Techniques

FCFS 1st million users are stored on cluster 1 and the next on cluster 2

Round Robin

Least Used (Balanced)

Each time a new user is added, a DB cluster with the least users is

chosen Hash based

A hashing function is used to determine the DB Cluster in which theuser data should be inserted

Value Based

User ids 1 to 1 million stored in cluster 1 OR all users with names starting from A-M on cluster 1

Except for Hash and Value based all other techniques alsorequire an independent lookup map mapping user to DatabaseCluster

This map itself will be stored on a separate DB (which mayfurther need to be replicated)


37/46

3737

Load BalancedApp Servers

DB Cluster

DB DB DB

DB Cluster

DB DB DB

Lookup

Map

SAN

Our architecture now looks

like this Positives

As App servers grow, DatabaseClusters can be added

Note: This is not the same astable partitioning provided bythe db (eg MSSQL)

We may actually want tofurther segregate these into

Sets, each serving acollection of users (refer nextslide


38/46

3838

Load Balanced

App Servers

DB Cluster

DB DB DB

DB Cluster

DB DB DB

Lookup

Map

SAN

Load Balanced

App Servers

DB Cluster

DB DB DB

DB Cluster

DB DB DB

Lookup

Map

SAN

Global Redirector

Global

Lookup

Map

SET 1 10 million users SET 2 10 million users

Now we consider each deployment as a single Set serving

a collection of users


39/46

3939

The goal behind creating sets is easier manageability

Each Set is independent and handles transactions for aset of users

Each Set is architecturally identical to the other Each Set contains the entire application with all its data

structures Sets can even be deployed in separate datacenters Users may even be added to a Set that is closer to them

in terms of network latency


40/46

4040

App Servers

Cluster

DB Cluster

SAN

Global Redirector

SET 1

DB Cluster

App Servers

Cluster

DB Cluster

SAN

SET 2

DB Cluster

Our architecture now looks

like this Positives

Infinite Scalability

Negatives Aggregation of data across sets

is complex

Users may need to be movedacross Sets if sizing is improper

Global App settings andpreferences need to be

replicated across Sets


41/46

4141

Add caches within App Server

Object Cache Session Cache (especially if you are using a Central Session

Store)

API cache

Page cache

Software Memcached

Teracotta (Java only)

Coherence (commercial expensive data grid by Oracle)


42/46

4242

If your app is a web app you should add an HTTP

Accelerator or a Reverse Proxy A good HTTP Accelerator / Reverse proxy performs the

following Redirect static content requests to a lighter HTTP server (lighttpd)

Cache content based on rules (with granular invalidation support)

Use Async NIO on the user side

Maintain a limited pool of Keep-alive connections to the App Server

Intelligent load balancing

Solutions

Nginx (HTTP / IMAP) Perlbal

Hardware accelerators plus Load Balancers


43/46

4343

CDNs

IP Anycasting Async Nonblocking IO (for all Network Servers) If possible - Async Nonblocking IO for disk Incorporate multi-layer caching strategy where required

L1 cache in-process with App Server

L2 cache across network boundary

L3 cache on disk

Grid computing Java GridGain

Erlang natively built in


44/46

4444

Programming Languages and Frameworks

Dynamic languages are slower than static languages Compiled code runs faster than interpreted code -> use

accelerators or pre-compilers

Frameworks that provide Dependency Injections, Reflection,Annotations have a marginal performance impact

ORMs hide DB querying which can in some cases result in poorquery performance due to non-optimized querying

RDBMS MySQL, MSSQL and Oracle support native replication

Postgres supports replication through 3rd party software (Slony)

Oracle supports Real Application Clustering MySQL uses locking and arbitration, while Postgres/Oracle use

MVCC (MSSQL just recently introduced MVCC)

Cache Teracotta vs memcached vs Coherence


45/46

4545

All the techniques we learnt today can be applied in any

order Try and incorporate Horizontal DB partitioning by value

from the beginning into your design

Loosely couple all modules

Implement a REST-ful framework for easier caching Perform application sizing ongoingly to ensure optimalutilization of hardware


46/46

Intelligent People. Uncommon Ideas.

[email protected]

http://directi.comhttp://careers.directi.com

http://wiki.directi.com

Documents

Building a Scalable Architecture 1211657706511562 8