Supporting Strong Cache Coherency for Active Caches in Multi-Tier Data-Centers over InfiniBand S. Narravula, P. Balaji, K. Vaidyanathan, S. Krishnamoorthy, J. Wu and D. K. Panda The Ohio State University


Supporting Strong Cache Coherency for Active Caches in Multi-Tier Data-Centers over InfiniBand

S. Narravula, P. Balaji, K. Vaidyanathan, S. Krishnamoorthy, J. Wu and D. K. Panda

The Ohio State University

Presentation Outline

Introduction/Motivation Design and Implementation Experimental Results Conclusions

Introduction

Fast Internet Growth: number of users, amount of data, types of services

Several uses: E-Commerce, Online Banking, Online Auctions, etc.

Types of Content

Static Content – images, documents, audio clips, video clips, etc.

Dynamic (Active) Content – stock quotes, online stores (Amazon), online banking, etc.

Presentation Outline

Introduction/Motivation Multi-Tier Data-Centers Active Caches InfiniBand

Design and Implementation Experimental Results Conclusions

Multi-Tier Data-Centers

Single powerful computers giving way to clusters

Low ‘Cost to Performance’ Ratio; Increasingly Popular

Multi-Tier Data-Centers: Scalability – an important issue

A Typical Multi-Tier Data-Center

[Figure: clients connect over a WAN to Proxy Nodes (Tier 0), which front the Web Servers (Apache) and Application Servers (PHP) of Tier 1, backed by Database Servers in Tier 2]

Tiers of a Typical Multi-Tier Data-Center

Proxy Nodes – handle caching, load balancing, security, etc.

Web Servers – handle the HTML content

Application Servers – handle dynamic content, provide services

Database Servers – handle persistent storage

Data-Center Characteristics

[Figure: computation per request grows from the front-end tiers to the back-end tiers]

• The amount of computation required for processing each request increases as we go to the inner tiers of the data-center
• Caching at the front tiers is an important factor for scalability

Presentation Outline

Introduction/Motivation Multi-Tier Data-Centers Active Caches InfiniBand

Design and Implementation Experimental Results Conclusions

Caching

Can avoid re-fetching of content

Beneficial if requests repeat

Static content caching: well studied in the past and widely used

[Figure: the number of requests decreases from the front-end tiers to the back-end tiers]

Active Caching

Dynamic Data: Stock Quotes, Scores, Personalized Content, etc.

Simple caching methods not suited

Issues: Consistency, Coherency

[Figure: user requests served from the proxy node cache while updates are applied to the back-end data]

Cache Consistency

Non-decreasing views of system state; updates seen by all or none

[Figure: user requests arrive at the proxy nodes while updates are applied at the back-end nodes]

Cache Coherency

Refers to the average staleness of the document served from cache

Two models of coherence: Bounded Staleness (Weak Coherency) and Strong or Immediate (Strong Coherency)
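The two coherence models can be sketched in a few lines of Python (a minimal illustration; all class and variable names are assumptions, not from the paper): a TTL cache may serve a document that is stale for up to its TTL, while a strong-coherency cache checks the back-end version on every read.

```python
class Backend:
    """Stand-in for the back-end tier: the authoritative value plus a version."""
    def __init__(self, value):
        self.value = value
        self.version = 0

    def update(self, value):
        self.value = value
        self.version += 1

class TTLCache:
    """Weak coherency: serve the cached copy until its TTL expires,
    so a read may be stale for up to `ttl` seconds (bounded staleness)."""
    def __init__(self, backend, ttl):
        self.backend, self.ttl = backend, ttl
        self.value, self.expires = None, 0.0

    def read(self, now):
        if self.value is None or now >= self.expires:
            self.value = self.backend.value        # re-fetch only on expiry
            self.expires = now + self.ttl
        return self.value

class StrongCache:
    """Strong coherency: check the back-end version before every read,
    so a stale copy is never served."""
    def __init__(self, backend):
        self.backend = backend
        self.value, self.version = None, -1

    def read(self):
        if self.version != self.backend.version:   # version check
            self.value = self.backend.value
            self.version = self.backend.version
        return self.value

backend = Backend("balance=100")
ttl = TTLCache(backend, ttl=60.0)
strong = StrongCache(backend)

assert ttl.read(now=0.0) == "balance=100"
assert strong.read() == "balance=100"
backend.update("balance=50")               # back-end changes
assert ttl.read(now=1.0) == "balance=100"  # TTL cache still serves stale data
assert strong.read() == "balance=50"       # strong coherency sees the update
```

The online-banking example on the next slide is exactly the case where the TTL cache's stale window is unacceptable.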

Strong Cache Coherency

An absolute necessity for certain kinds of data: online shopping, travel ticket availability, stock quotes, online auctions

Example: Online banking – cannot afford to show different values to different concurrent requests

Caching policies (with their consistency/coherency guarantees): No Caching, Client Polling, Invalidation*, TTL/Adaptive TTL

*D. Li, P. Cao, and M. Dahlin. WCIP: Web Cache Invalidation Protocol. IETF Internet Draft, November 2000.

Presentation Outline

Introduction/Motivation Multi-Tier Data-Centers Active Caches InfiniBand

Design and Implementation Experimental Results Conclusions

InfiniBand

High Performance: low latency, high bandwidth

Open Industry Standard

Provides rich features: RDMA, Remote Atomic operations, etc.

Targeted for Data-Centers

Transport Layers: VAPI, IPoIB, SDP

Performance

[Figure: latency (us) vs. message size (2 B–16 KB) and throughput (MB/s) vs. message size (4 B–64 KB) for IPoIB, SDP and VAPI]

• Low latencies of less than 5 us achieved
• Bandwidth of over 840 MB/s

* SDP and IPoIB from Voltaire’s Software Stack

Performance

• Receiver-side CPU utilization is very low
• Leverages the benefits of one-sided communication

[Figure: RDMA Read throughput (Mbps) vs. message size (1 B–256 KB) for polling- and event-based completion, with sender and receiver CPU utilization (%)]

Caching policies (with their consistency/coherency guarantees): No Caching, Client Polling, Invalidation, TTL/Adaptive TTL

Objective

To design an architecture that very efficiently supports strong cache coherency on InfiniBand

Presentation Outline

Introduction/Motivation Design and Implementation Experimental Results Conclusions

Basic Architecture

External modules are used; module communication can use any transport

Versioning: application servers version dynamic data

Version value of the data is passed to the front-end with every request to the back-end

Version maintained by the front-end along with the cached value of the response

Mechanism

Cache Hit: back-end version check – if the version is current, use the cache; invalidate the data on a failed version check

Cache Miss: get the data into the cache and initialize the local versions
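The hit/miss mechanism above can be sketched as follows (a simplified, illustrative Python model: the version dictionary stands in for the versions the application servers maintain, and `fetch` stands in for a full request to the back-end; all names are assumptions):

```python
class ProxyCache:
    """Front-end cache keeping each response together with the version it saw.
    A hit triggers a back-end version check (an RDMA Read in the VAPI design,
    simulated here as a dictionary lookup)."""
    def __init__(self, backend_versions, fetch):
        self.backend_versions = backend_versions  # stand-in for the back-end version table
        self.fetch = fetch                        # stand-in for a full back-end request
        self.store = {}                           # key -> (response, version)

    def get(self, key):
        current = self.backend_versions[key]
        if key in self.store:                     # cache hit: back-end version check
            response, version = self.store[key]
            if version == current:
                return response                   # version current -> use cache
            del self.store[key]                   # failed check -> invalidate
        response = self.fetch(key)                # cache miss: get data to cache
        self.store[key] = (response, current)     # initialize local version
        return response

versions = {"quote:IBM": 7}
fetch_log = []

def fetch(key):
    fetch_log.append(key)
    return f"page-for-{key}-v{versions[key]}"

proxy = ProxyCache(versions, fetch)
assert proxy.get("quote:IBM") == "page-for-quote:IBM-v7"  # miss: fetched
assert proxy.get("quote:IBM") == "page-for-quote:IBM-v7"  # hit, version matches
assert len(fetch_log) == 1                                # served from cache
versions["quote:IBM"] = 8                                 # back-end update
assert proxy.get("quote:IBM") == "page-for-quote:IBM-v8"  # invalidated, re-fetched
assert len(fetch_log) == 2
```

The point of the design is that the version check is far cheaper than the fetch, so most requests pay only the check.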

Architecture

[Figure: request flow between the front-end and back-end – a cache hit is served from the cache after a version check; a cache miss fetches the response from the back-end]

Design

Every server has an associated module that uses IPoIB, SDP or VAPI to communicate

VAPI: when a request arrives at the proxy, the VAPI module is contacted

The module reads the latest version of the data from the back-end using a one-sided RDMA Read operation

If the versions do not match, the cached value is invalidated

VAPI Architecture

[Figure: request flow between the front-end and back-end – on a cache hit, the version is verified with an RDMA Read; a cache miss fetches the response from the back-end]

Implementation

Socket-based Implementation: IPoIB and SDP are used

Back-end version check is done using two-sided communication from the module

Requests to read and update are mutually excluded at the back-end module to avoid simultaneous readers and writers accessing the same data

Minimal changes to existing software
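The read/update mutual exclusion can be illustrated with a plain lock (an assumption for this sketch – the paper only states that readers and writers are mutually excluded at the back-end module). The lock guarantees a reader never sees a value paired with the wrong version:

```python
import threading

class VersionedEntry:
    """Back-end entry holding a value and its version. A single lock
    (illustrative choice) keeps reads and updates mutually exclusive,
    so a torn read of (value, version) is impossible."""
    def __init__(self, value):
        self._lock = threading.Lock()
        self._value, self._version = value, 0

    def update(self, value):
        with self._lock:                # writer excludes readers
            self._value = value
            self._version += 1

    def read(self):
        with self._lock:                # reader excludes writers
            return self._value, self._version

entry = VersionedEntry("v0")

def writer():
    # Concurrent updater: value "vN" is always installed with version N.
    for i in range(1, 1001):
        entry.update(f"v{i}")

t = threading.Thread(target=writer)
t.start()
seen = [entry.read() for _ in range(1000)]  # read concurrently with the writer
t.join()

# Every observed pair is consistent: value "vN" is always paired with version N.
assert all(value == f"v{version}" for value, version in seen)
assert entry.read() == ("v1000", 1000)
```

A reader–writer lock would allow concurrent readers; the simple lock above is just the smallest construction that shows the invariant the module needs.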

Presentation Outline

Introduction/Motivation Design and Implementation Experimental Results Conclusions

Data-Center: Performance

• The VAPI module can sustain performance even with heavy load on the back-end servers

[Figure: data-center throughput in transactions per second (TPS) vs. number of compute threads (0–200) for No Cache, IPoIB, VAPI and SDP]

Data-Center: Performance

• The VAPI module responds faster even with heavy load on the back-end servers

[Figure: data-center response time (ms) vs. number of compute threads (0–200) for No Cache, IPoIB, VAPI and SDP]

Response Time Breakup

[Figure: response time split (ms) into client communication, proxy processing, module processing and back-end version check for IPoIB, SDP and VAPI, at 0 and 200 compute threads]

• Worst-case module overhead less than 10% of the response time
• Minimal overhead for the VAPI-based version check even for 200 compute threads

Data-Center: Throughput

[Figure: throughput in transactions per second (TPS) vs. number of compute threads (0–200) for No Cache, IPoIB, VAPI and SDP, under a Zipf distribution and the World Cup trace]

• The drop in the throughput of VAPI in the World Cup trace is due to the higher penalty for cache misses under increased load
• The VAPI implementation does better for the real trace too

Conclusions

An architecture for supporting Strong Cache Coherence

External module based design: freedom in choice of transport, minimal changes to existing software

Sockets API has an inherent limitation – two-sided communication; high-performance sockets (SDP) are not the solution

Main benefit: the one-sided nature of RDMA calls

Web Pointers

http://nowlab.cis.ohio-state.edu/

E-mail: {narravul, balaji, vaidyana, savitha, wuj, panda}@cis.ohio-state.edu

NBC (Network-Based Computing Laboratory) home page