Cluster-Based Scalable Network Service. Authors: Armando, Steven D. Gribble, Yatin Chawathe, Eric A. Brewer, Paul Gauthier. Presenter: Kang Cao


Page 1

Cluster-Based Scalable Network Service

Authors: Armando, Steven D. Gribble, Yatin Chawathe, Eric A. Brewer, Paul Gauthier

Presenter: Kang Cao

Page 2

Overview

• Introduction
• Cluster-Based Scalable Service Architecture
• Service Implementation
• Measurements
• Discussion
• Conclusion

Page 3

Introduction

• Goal
• Advantages of Clusters
• Challenges of Cluster Computing
• BASE Semantics

Page 4

Goal

• Scalability
  – Keep the same per-user cost as load increases.
• Availability
  – Run 24 hours a day, 7 days a week.
• Cost effectiveness

Page 5

Advantages

• Scalability
  – Clusters are well suited to Internet service workloads
  – Incremental scalability
• High availability
• Commodity building blocks
  – Cheap commodity PCs
  – Get service quickly and cheaply

Page 6

Challenges

• Administration
• Component vs. system replication
• Partial failures
• Shared state

Page 7

BASE Semantics

In contrast to ACID (atomicity, consistency, isolation, durability):

• Stale data
• Soft state
• Approximate answers
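As a rough illustration of the "stale data" idea, the sketch below returns a cached copy when the authoritative source cannot be reached, and flags the result as stale. The function name `read_with_stale_fallback` and the use of `ConnectionError` to signal an unreachable source are assumptions made for this example; they do not come from the slides.

```python
from typing import Callable, Dict, Tuple


def read_with_stale_fallback(
    key: str,
    cache: Dict[str, str],
    fetch_fresh: Callable[[str], str],
) -> Tuple[str, bool]:
    """Return (value, is_stale). Prefer fresh data, fall back to the cache."""
    try:
        value = fetch_fresh(key)
    except ConnectionError:
        if key in cache:
            return cache[key], True   # stale but still available
        raise                         # nothing cached: the failure is visible
    cache[key] = value                # remember the fresh value for later
    return value, False
```

The trade is availability for freshness: the caller gets an answer during a partial failure and can decide whether a stale one is acceptable.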

Page 8

Cluster-Based Scalable Service Architecture

• Layered architecture
• Separate network services from their implementation
• Stateless workers

Page 9

Cluster-Based Scalable Service Architecture

• SNS
• TACC
• Service

Page 10

Scalable Network Service (SNS)

• Incremental and absolute scalability
• Worker load balancing and overflow management
• Front-end availability, fault-tolerance mechanisms
• System monitoring and logging

Page 11

SNS

[Figure: SNS architecture diagram. Labels: SNS Manager, Internal Network, Front End (with MS), Worker Driver, Worker, $, Internet]

Page 12

Load Balancing

• Centralized load balancing
• Easy to implement

Page 13

How to Handle Bursts

• Has an overflow pool
• The manager can spawn workers on overflow machines on demand
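The overflow mechanism can be sketched roughly as follows. This is a toy model, not the system's actual code: the class name `OverflowManager`, the `spawn_threshold` parameter, and the use of average queue length as the trigger are all assumptions chosen for illustration.

```python
from typing import List, Optional


class OverflowManager:
    """Toy manager that recruits overflow machines when queues grow."""

    def __init__(self, dedicated: List[str], overflow_pool: List[str],
                 spawn_threshold: float = 10.0) -> None:
        self.active = list(dedicated)            # machines currently running workers
        self.overflow_pool = list(overflow_pool)  # idle machines available for bursts
        self.spawn_threshold = spawn_threshold    # hypothetical queue-length trigger

    def on_load_report(self, avg_queue_len: float) -> Optional[str]:
        """Recruit one overflow machine if load is above the threshold."""
        if avg_queue_len > self.spawn_threshold and self.overflow_pool:
            node = self.overflow_pool.pop(0)
            self.active.append(node)  # a real manager would spawn a worker here
            return node
        return None
```

For example, `OverflowManager(["w1"], ["o1"]).on_load_report(20.0)` recruits `"o1"`, while a report below the threshold leaves the pool untouched.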

Page 14

Scalability

• Components are replicated
• The amount of additional resources required is a linear function of the increase in offered load
• Partition functionality between front end and worker
• Keep workers as simple as possible
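The linear-scaling claim can be made concrete with a little arithmetic. The sketch assumes every added node contributes the same fixed capacity; the function name and the numbers in the usage note are hypothetical, not measurements from the talk.

```python
import math


def extra_nodes(load_increase: float, per_node_capacity: float) -> int:
    """Nodes to add for a given increase in offered load (requests/second).

    Assumes each node handles a fixed `per_node_capacity`, so the answer
    grows linearly with `load_increase`.
    """
    return math.ceil(load_increase / per_node_capacity)
```

With a hypothetical capacity of 25 requests/second per node, an extra 50 requests/second needs 2 more nodes and an extra 100 needs 4: doubling the increase doubles the hardware.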

Page 15

Fault Tolerance and Availability

• Process-peer fault tolerance
• Using soft state
• Timeouts as an additional fault-tolerance mechanism
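One way to picture soft state combined with timeouts is a table whose entries disappear unless they are refreshed. The sketch below is an assumption-laden illustration: the class `SoftStateTable`, the `beacon` method, and the `ttl` parameter are invented names, and time is passed in explicitly to keep the example deterministic.

```python
from typing import Dict, List


class SoftStateTable:
    """Entries expire unless refreshed, so no explicit recovery step is needed."""

    def __init__(self, ttl: float) -> None:
        self.ttl = ttl                          # seconds an entry stays valid
        self._last_seen: Dict[str, float] = {}

    def beacon(self, name: str, now: float) -> None:
        """Record that component `name` announced itself at time `now`."""
        self._last_seen[name] = now

    def alive(self, now: float) -> List[str]:
        """Drop entries older than ttl and return the surviving names."""
        self._last_seen = {
            n: t for n, t in self._last_seen.items() if now - t <= self.ttl
        }
        return sorted(self._last_seen)
```

A crashed component simply stops sending beacons and times out of the table; a restarted one rebuilds its entry with its next beacon.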

Page 16

TACC

TACC: Transformation, Aggregation, Caching, Customization

• API for composition of stateless data transformation and content aggregation modules
• Uniform caching of original, post-aggregation, and post-transformation data
• Transparent access to the customization database
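A minimal sketch of composing stateless modules and caching both original and transformed data might look like this. It is not the TACC API: `compose`, `CachingService`, and the cache-key prefixes are hypothetical names, and customization (per-user parameters) is left out for brevity.

```python
from typing import Callable, Dict

Worker = Callable[[bytes], bytes]  # a stateless transformation module


def compose(*workers: Worker) -> Worker:
    """Chain stateless workers into one pipeline, applied left to right."""
    def pipeline(data: bytes) -> bytes:
        for worker in workers:
            data = worker(data)
        return data
    return pipeline


class CachingService:
    """Caches the original and the post-transformation data under separate keys."""

    def __init__(self, fetch: Callable[[str], bytes], transform: Worker) -> None:
        self.fetch = fetch
        self.transform = transform
        self.cache: Dict[str, bytes] = {}

    def get(self, url: str) -> bytes:
        out_key, orig_key = "out:" + url, "orig:" + url
        if out_key in self.cache:           # transformed copy already cached
            return self.cache[out_key]
        if orig_key not in self.cache:      # fetch the original at most once
            self.cache[orig_key] = self.fetch(url)
        self.cache[out_key] = self.transform(self.cache[orig_key])
        return self.cache[out_key]
```

Because the workers hold no state, any replica can run any stage, and a cached original can be re-transformed without going back to the source.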

Page 17

TACC

A programming model for Internet services

• Transformation
• Aggregation
• Caching
• Customization

Page 18

Service Implementation

• Workers that present a human interface to what the TACC modules do, including device-specific presentation
• User interface to control the service
• Most service functionality can be implemented at the Service and TACC layers

Page 19

Example: TranSend

[Figure: TranSend deployment diagram. Labels: Modem pool, switch, workstations, Internet]

Page 20

TranSend

• Front ends
• Load balancing manager
• User profile database
• Cache nodes
• Datatype-specific distillers
• Graphical monitor

Page 21

Load Balancing Manager

• Client-side JavaScript support balances load across multiple front ends
• Centralized manager for internal load balancing

Page 22

Load Balancing

• Components register with the manager
• A front end asks the manager for a worker when it has a task
• The manager locates a worker for the front end
• The manager may create a new distiller
• Workers report their load to the manager
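The register / report / locate cycle above can be sketched as a small in-memory model. The class and method names are hypothetical, and picking the least-loaded worker is an assumed policy for illustration; the slides do not say how the manager chooses.

```python
from typing import Dict


class LoadManager:
    """Toy centralized manager: workers register and report queue lengths."""

    def __init__(self) -> None:
        self.load: Dict[str, int] = {}

    def register(self, worker: str) -> None:
        self.load[worker] = 0              # a new worker starts with an empty queue

    def report_load(self, worker: str, queue_len: int) -> None:
        self.load[worker] = queue_len      # latest report overwrites the old one

    def locate_worker(self) -> str:
        """Return the worker with the shortest reported queue (assumed policy)."""
        if not self.load:
            raise LookupError("no workers registered")
        return min(self.load, key=self.load.get)
```

A front end would call `locate_worker()` when it has a task; if that call raised `LookupError`, a fuller model would have the manager create a new distiller instead.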

Page 23

Load Balancing

• The manager periodically broadcasts load information
• Front ends cache this information
• Front ends use the cached information to dispatch requests to workers
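The front-end side of this scheme can be sketched as a cache of load hints that is replaced on each broadcast and consulted locally on each request. The names `FrontEnd`, `on_broadcast`, and `dispatch` are hypothetical, and the shortest-queue choice with a local increment is an assumption for illustration rather than the policy the system uses.

```python
from typing import Dict


class FrontEnd:
    """Toy front end that dispatches using cached, possibly stale, load hints."""

    def __init__(self) -> None:
        self.hints: Dict[str, int] = {}

    def on_broadcast(self, loads: Dict[str, int]) -> None:
        """Replace the cached hints with the manager's latest broadcast."""
        self.hints = dict(loads)

    def dispatch(self, request: object) -> str:
        """Pick a worker from the cache without contacting the manager."""
        if not self.hints:
            raise LookupError("no load information cached yet")
        worker = min(self.hints, key=self.hints.get)
        self.hints[worker] += 1   # local estimate until the next broadcast arrives
        return worker
```

Between broadcasts the hints drift from the true load, which is acceptable under BASE semantics: the next broadcast overwrites the local estimates.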

Page 24

Fault Tolerance and Crash Recovery

• Using BASE semantics simplifies crash recovery
• The manager reports worker failures to the front ends
• The manager detects and restarts a crashed front end
• The front end detects and restarts a crashed manager
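The mutual watching between manager and front end can be illustrated with a very small model. Everything here is a simplification under stated assumptions: `Peer` and `watch` are invented names, and liveness is a plain flag rather than something detected over the network.

```python
class Peer:
    """A process that can crash and be restarted by whoever is watching it."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.alive = True
        self.restarts = 0

    def restart(self) -> None:
        self.alive = True
        self.restarts += 1


def watch(watcher: Peer, watched: Peer) -> bool:
    """Restart `watched` if it is down; return True if a restart happened."""
    if watcher.alive and not watched.alive:
        watched.restart()
        return True
    return False
```

Calling `watch(manager, front_end)` and `watch(front_end, manager)` in turn gives each process a peer that restarts it, so neither needs a separate supervisor as long as both do not fail at once.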

Page 25

Performance: Load Balancing

Page 26

Performance: Load Balancing

Page 27

Conclusions

• Layered architecture for cluster-based scalable network services
• The architecture is reusable
• Cluster-based value-added network services will become an important Internet-service paradigm

Page 28

Performance: Scalability

Page 29

Question 1

• Why are cluster-based network services well suited to Internet services?

Page 30

Answer:

• The workload is highly parallel (many independent simultaneous users)
• The grain size typically corresponds to at most a few CPU seconds on a commodity PC

Page 31

Question 2

• Why does the cluster-based network service use BASE semantics?

Page 32

Answer:

• BASE semantics allow us to handle partial failures in clusters with less complexity and cost.

Page 33

Question 3

• When overflow machines are being recruited unusually often, what should be done?

Page 34

Answer:

• It is time to add new machines.

Page 35

Question 4

• Does a front-end crash lose any information? If so, what kind of information is lost?

Page 36

Answer:

• User requests will be lost, and the user needs to handle the timeout and resend the request.
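On the client side, handling the timeout and resending can be sketched as a simple retry loop. The helper name `send_with_retry`, the `attempts` parameter, and the use of Python's built-in `TimeoutError` to represent a lost request are assumptions for this example.

```python
from typing import Callable


def send_with_retry(send: Callable[[str], str], request: str,
                    attempts: int = 3) -> str:
    """Resend `request` when a send times out, up to `attempts` tries."""
    for attempt in range(1, attempts + 1):
        try:
            return send(request)
        except TimeoutError:
            if attempt == attempts:
                raise          # give up and surface the failure to the caller
    raise ValueError("attempts must be at least 1")
```

Because the front end keeps no state that must survive a crash, a blind resend is enough: the restarted front end treats the retried request like any new one.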