
2010 Storage Developer Conference. ©VMware, Inc. All Rights Reserved.

Storage I/O Control: Proportional Allocation of Shared Storage Resources

Chethan Kumar
Sr. Member of Technical Staff, R&D
VMware, Inc.


Outline

The Problem

Storage IO Control (SIOC) overview

Technical Details

SIOC in Action

• Case study 1: Benefit of Disk Shares

• Case study 2: Dynamic IO Prioritization

Conclusions


The Problem

What you see

[Figure: database server farms sharing one storage array LUN]

• Online Store: Product Catalog
• Online Store: Order Processing
• Online Store: Data Mining (low priority)

What you want to see

[Figure: the same workloads sharing the storage array LUN]

• Online Store: Product Catalog
• Online Store: Order Processing
• Online Store: Data Mining (low priority)


The Solution: Resource Controls

[Figure: three workloads on shared storage]

• Online Store: Product Catalog (High Disk Shares)
• Online Store: Order Processing (High Disk Shares)
• Online Store: Data Mining (Low Disk Shares)

Shares: Relative priority of a virtual machine (VM)


Typical vSphere Datacenter Architecture

[Figure: VMs with different disk share values (10 to 100) running on several hosts that share LUNs]

VMs running across multiple hosts

Hosts share LUNs using Virtual Machine File System (VMFS, a cluster file system)

Issue: VMs interfere with each other

Desired solution: performance isolation while maximizing array utilization


Disk Shares

[Figure: three workloads on shared storage]

• Online Store: Product Catalog (High Disk Shares)
• Online Store: Order Processing (High Disk Shares)
• Online Store: Data Mining (Low Disk Shares)

Shares: Relative priority of a virtual machine (VM)

E.g., 2x shares ⇒ 2x resource allocation during contention

High, Normal, Low shares (4:2:1 ratio)

Custom shares (numerical value)

• Proportional weight

Relative priority changes when:

• VMs are powered on / off

• VMs don’t utilize all the resources
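The proportional-weight rule above can be sketched in a few lines of Python. This is only an illustration: the absolute preset values (2000/1000/500) and the pool of 90 resource units are assumptions chosen to match the 4:2:1 ratio on the slide, and `allocate` is a hypothetical helper, not part of any VMware API.

```python
# Preset share levels in the slide's 4:2:1 ratio. The absolute numbers
# are an assumption; only the ratio comes from the slide.
PRESET_SHARES = {"high": 2000, "normal": 1000, "low": 500}


def allocate(capacity, shares_by_vm):
    """Split a contended resource among VMs in proportion to their shares."""
    total = sum(shares_by_vm.values())
    return {vm: capacity * s / total for vm, s in shares_by_vm.items()}


vms = {
    "product_catalog": PRESET_SHARES["high"],
    "order_processing": PRESET_SHARES["high"],
    "data_mining": PRESET_SHARES["low"],
}

# Hypothetical pool of 90 resource units (for example, queue slots).
allocation = allocate(90.0, vms)

# Relative priority changes when a VM is powered off: the remaining
# VMs split the same pool between them.
after_power_off = allocate(
    90.0, {vm: s for vm, s in vms.items() if vm != "data_mining"}
)

for vm, units in allocation.items():
    print(f"{vm}: {units:.0f} units")
```

With these numbers a High VM is entitled to four times what the Low VM gets, and powering off the Low VM returns its portion to the other two.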


Without a Global Storage Resource Control

Hosts get equal IOPS ⇒ IOPS dependent on VM placement!

VM shares respected only within a host

Local scheduling helps, but is not sufficient

[Figure: throughput (IOPS) per host; VM shares 20, 10, 10, 10, 20, 10 (four OLTP VMs, two Iomtr VMs); host shares 30, 20, 20, 10]


Optimal Storage Resource Control

Storage IO Control (SIOC)

Shares should be respected across hosts
• Independent of VM placement

[Figure: throughput (IOPS) per host with SIOC; VM shares 20, 10, 10, 10, 20, 10 (four OLTP VMs, two Iomtr VMs); host shares 30, 20, 20, 10]
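To make the placement problem concrete, the sketch below compares host-local scheduling with a global view of shares. The VM-to-host placement is an assumption inferred from the share values on the slides (VM shares 20, 10, 10, 10, 20, 10 adding up to host shares 30, 20, 20, 10); the VM and host names and all function names are hypothetical.

```python
# Assumed placement, inferred from the slide's numbers: six VMs on four
# hosts whose aggregate shares come out as 30, 20, 20, 10.
placement = {
    "host1": {"vm1": 20, "vm2": 10},
    "host2": {"vm3": 10, "vm4": 10},
    "host3": {"vm5": 20},
    "host4": {"vm6": 10},
}


def host_shares(placement):
    """Aggregate VM shares per host."""
    return {host: sum(vms.values()) for host, vms in placement.items()}


def vm_fraction_local_only(placement):
    """Every host gets an equal slice; shares only matter inside a host."""
    per_host = 1.0 / len(placement)
    return {
        vm: per_host * s / sum(vms.values())
        for vms in placement.values()
        for vm, s in vms.items()
    }


def vm_fraction_global(placement):
    """Shares are respected across hosts, independent of placement."""
    total = sum(sum(vms.values()) for vms in placement.values())
    return {
        vm: s / total for vms in placement.values() for vm, s in vms.items()
    }


local = vm_fraction_local_only(placement)
fair = vm_fraction_global(placement)
print(host_shares(placement))
print("vm1 vs vm5, local only:", local["vm1"], local["vm5"])
print("vm1 vs vm5, global:", fair["vm1"], fair["vm5"])
```

In this sketch vm1 and vm5 hold the same 20 shares, yet with local scheduling only vm1 receives about 17% of the array while vm5 receives 25%, purely because vm1 shares its host. With a global view both are entitled to 25%.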


SIOC: How To Use

Just two steps:

1. Enable Storage IO Control (SIOC) on a shared storage device (called a datastore) in ESX

2. (Optional) Set disk shares (and limit values) for virtual disks


Our Approach

Detect congestion
• SIOC monitors average IO latency for a datastore
• Latency above a threshold indicates congestion (triggers SIOC)
• If it ain’t broke, don’t fix it

Control IOs issued per host
• Based on VMs and their shares on each host
• Adjust dynamically to workload
  o Idleness
  o Bursty behavior
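The congestion check described above could look roughly like the following sketch. Weighting each host's latency by its completed IO count is an assumption made here for illustration (the slides only say that SIOC monitors the average IO latency of a datastore), and the 20 ms default mirrors the congestion threshold used in the case studies later in the deck. The function names are hypothetical.

```python
def datastore_avg_latency_ms(host_stats):
    """Average latency over all hosts, weighted by IOs completed.

    host_stats is a list of (ios_completed, avg_latency_ms) tuples,
    one per host. The weighting scheme is an assumption.
    """
    total_ios = sum(ios for ios, _ in host_stats)
    if total_ios == 0:
        return 0.0
    return sum(ios * lat for ios, lat in host_stats) / total_ios


def is_congested(host_stats, threshold_ms=20.0):
    """Throttle only when the datastore-wide latency exceeds the threshold."""
    return datastore_avg_latency_ms(host_stats) > threshold_ms


quiet = [(1000, 30.0), (3000, 10.0)]   # datastore-wide average: 15 ms
busy = [(1000, 30.0), (1000, 20.0)]    # datastore-wide average: 25 ms
print(is_congested(quiet), is_congested(busy))
```

Note that in the first sample one host sees 30 ms on its own, but the datastore-wide average stays under the threshold, so nothing is throttled.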


Congestion Detection: Setting the Threshold

Performance suffers if datastore is overloaded

SIOC uses a reasonable default setting

Default threshold good for most cases
• If latency is very critical, lower the threshold (IOPS may suffer)

Congestion threshold value (ms):
• Higher is better for overall throughput
• Lower is better for stronger isolation

[Figure: latency vs. load and throughput (IOPS or MB/s) vs. load, where load is the number of IOs in flight; annotations mark the congestion threshold (L), the throughput (T), and the point where storage is saturated]


Per-Host Control Algorithm

$$ w(t+1) = (1-\gamma)\,w(t) + \gamma\left(\frac{L}{L(t)}\,w(t) + \beta\right) $$

Annotations on the equation: w(t) is the current window size; the term weighted by γ is the new delta based on the current latency.

L : latency threshold, operating point for IO latency

β : proportional to aggregate VM shares for host

γ : smoothing parameter between 0 and 1

Control Algorithm

Adjusts window (queue) size w(t) of each host using datastore-wide average latency L(t)

Runs every 4 seconds

Motivated by FAST TCP mechanism
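A single step of this update can be written directly from the equation. The sketch below is a minimal illustration, not the ESX implementation: the default γ and the clamping bounds `w_min` and `w_max` are assumptions added so that the window stays in a usable range.

```python
def update_window(w, avg_latency_ms, threshold_ms, beta,
                  gamma=0.5, w_min=4.0, w_max=64.0):
    """One control interval for one host.

    w              current window (queue) size w(t)
    avg_latency_ms datastore-wide average latency L(t)
    threshold_ms   latency threshold L
    beta           value proportional to the host's aggregate VM shares
    gamma          smoothing parameter between 0 and 1 (assumed default)
    w_min, w_max   clamping bounds (assumptions, not from the slides)
    """
    target = (threshold_ms / avg_latency_ms) * w + beta
    w_next = (1.0 - gamma) * w + gamma * target
    return max(w_min, min(w_max, w_next))


# Latency at twice the threshold: the window shrinks.
shrunk = update_window(32.0, avg_latency_ms=40.0, threshold_ms=20.0, beta=2.0)
# Latency at half the threshold: the window grows.
grown = update_window(32.0, avg_latency_ms=10.0, threshold_ms=20.0, beta=2.0)
print(shrunk, grown)
```

When L(t) is above L the ratio L / L(t) is below 1 and the window contracts; when L(t) is below L the window expands until it reaches the upper bound. In a deployment the function would be evaluated once per control interval (every 4 seconds according to the slide).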


Control Algorithm Features

Maintain high utilization at the array
• Overall array queue proportional to Throughput × L

Ability to allocate queue size in proportion to hosts’ shares
• At equilibrium, host window sizes are proportional to β

Ability to control overall latency of a cluster
• Cluster operates close to L or below

$$ w(t+1) = (1-\gamma)\,w(t) + \gamma\left(\frac{L}{L(t)}\,w(t) + \beta\right) $$
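The equilibrium claims can be checked numerically with a toy simulation. Everything about the array model here is an assumption: the array is treated as saturated with a fixed throughput of 2 IOs per millisecond, so by Little's law its latency is the total number of IOs in flight divided by that throughput. The β values are simply the host shares from the earlier slide (30, 20, 20, 10) scaled down by ten, and γ = 0.2 is arbitrary.

```python
def step(windows, betas, latency_ms, threshold_ms, gamma):
    """Apply the window update to every host using the shared latency."""
    return [
        (1.0 - gamma) * w + gamma * ((threshold_ms / latency_ms) * w + beta)
        for w, beta in zip(windows, betas)
    ]


def array_latency_ms(windows, ios_per_ms):
    """Toy array model (assumption): latency = IOs in flight / throughput."""
    return sum(windows) / ios_per_ms


def simulate(betas, threshold_ms=20.0, ios_per_ms=2.0, gamma=0.2, steps=1000):
    windows = [16.0] * len(betas)  # arbitrary starting point
    for _ in range(steps):
        latency = array_latency_ms(windows, ios_per_ms)
        windows = step(windows, betas, latency, threshold_ms, gamma)
    return windows, array_latency_ms(windows, ios_per_ms)


betas = [3.0, 2.0, 2.0, 1.0]
windows, latency = simulate(betas)
print([round(w, 2) for w in windows], round(latency, 2))
```

Setting w(t+1) = w(t) in the update gives w = β / (1 − L / L(t)), so every host's window is the same multiple of its β. In this toy model the total queue settles at Throughput × L plus the sum of the β values, and the latency settles somewhat above the threshold because the β values are not small relative to the queue.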


What does the Priority Setting Mean?

Two main units exist in industry

• Bandwidth (MB/s)

• Throughput (IOPS)

Both have problems

• Using bandwidth may hurt workloads with large IO sizes

• Using IOPS may hurt VMs with sequential IOs

SIOC: carves out array queue among VMs

• VMs reuse queue slots faster or slower (depending on array latency)

o E.g., sequential streams get higher IOPS even if shares are identical; similarly for workloads with high read cache hit rates

o This is a good thing!

• Maintains high overall throughput
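Why equal queue slots do not mean equal IOPS follows from Little's law: throughput is the number of IOs in flight divided by the time each IO takes. The sketch below uses hypothetical numbers (16 slots per VM, 2 ms per sequential IO, 10 ms per random IO) purely for illustration.

```python
def iops(queue_slots, latency_ms):
    """Little's law: throughput = IOs in flight / time each IO takes."""
    return queue_slots * 1000.0 / latency_ms


slots_per_vm = 16  # identical shares, so identical queue slots (hypothetical)

sequential_vm = iops(slots_per_vm, latency_ms=2.0)   # assumed fast sequential IOs
random_vm = iops(slots_per_vm, latency_ms=10.0)      # assumed slower random IOs

print(f"sequential: {sequential_vm:.0f} IOPS, random: {random_vm:.0f} IOPS")
```

Both VMs hold the same number of slots, but the one whose IOs complete faster turns its slots over more often and therefore achieves more IOPS, without taking anything away from the other VM.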


Control IOs Issued per Host (Based on Shares)

Without SIOC: VM C gets as many queue slots as VMs A + B combined

With SIOC: All VMs get equal queue slots

VM   Disk Shares
A    1000
B    1000
C    1000


Control IOs Issued per Host (Based on Shares)

With SIOC: VMs get queue slots proportional to shares

VM   Disk Shares
A    1500
B    500
C    500
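Both share tables can be worked through with a short sketch. Two things are assumed here: the placement (VMs A and B on one host, VM C alone on another, as implied by the statement that C gets as many slots as A + B without SIOC) and an array queue of 60 slots, which is a hypothetical number chosen to give round results.

```python
def slots_with_sioc(total_slots, vm_shares):
    """Array queue divided among all VMs in proportion to disk shares."""
    total = sum(vm_shares.values())
    return {vm: total_slots * s / total for vm, s in vm_shares.items()}


def slots_without_sioc(total_slots, hosts):
    """Each host gets an equal part; shares apply only within a host."""
    per_host = total_slots / len(hosts)
    result = {}
    for vm_shares in hosts:
        result.update(slots_with_sioc(per_host, vm_shares))
    return result


TOTAL_SLOTS = 60  # hypothetical array queue depth

# First table: A, B, C all at 1000 shares.
equal_off = slots_without_sioc(TOTAL_SLOTS, [{"A": 1000, "B": 1000}, {"C": 1000}])
equal_on = slots_with_sioc(TOTAL_SLOTS, {"A": 1000, "B": 1000, "C": 1000})

# Second table: A at 1500 shares, B and C at 500.
skewed_on = slots_with_sioc(TOTAL_SLOTS, {"A": 1500, "B": 500, "C": 500})

print(equal_off, equal_on, skewed_on)
```

With equal shares, SIOC moves the split from 15/15/30 to 20/20/20; with the 1500/500/500 table, the slots follow the 3:1:1 ratio of the shares regardless of which host each VM runs on.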


Workload

DVD Store Version 2 (DS2)

Open Source, online E-commerce workload

Leverages commonly used database features

Supports SQL Server, Oracle and MySQL for backend database

Easy to set up and run

Supports multiple database sizes

To download the workload:

http://www.delltechcenter.com/page/DVD+Store


Benefit of Disk Shares

Experimental Setup

VM ID   Number of DS2 Users
1       36
2       50
3       36
4       36


Benefit of Disk Shares

Performance of DS2 workload in critical VM
• Standalone
• When sharing datastore with other VMs: without and with SIOC enabled

VM ID   Disk Shares
1       200
2       variable
3       200
4       200

[Figure: DS2 performance with SIOC off and with SIOC on; with SIOC on, results are shown for VM2 at 500, 1200, and 4000 shares]

Higher shares ⇒ better performance

Congestion threshold: 20 ms


Dynamic IO Prioritization

Experimental Setup

VM ID   Number of DS2 Users
1       24
2       24
3       24
4       24
5       50


Dynamic IO Prioritization

Effect of dynamic I/O prioritization on performance of DVD Store workloads

Phase 1: All VMs active
Phase 2: VM 5 goes idle
Phase 3: VM 5 active again

VM ID   Disk Shares
1       500
2       500
3       750
4       750
5       4000

Congestion threshold: 20 ms

[Figure: throughput in each of the three phases]


Dynamic IO Prioritization

Effect of dynamic I/O prioritization on performance of DVD Store workloads

Phase 1: All VMs active
Phase 2: VM 5 goes idle
Phase 3: VM 5 active again

VM ID   Disk Shares
1       500
2       500
3       750
4       750
5       4000

Congestion threshold: 20 ms

[Figure: throughput and shares in each of the three phases]


Under the Hood …

Phase 1: All VMs active
Phase 2: VM 5 goes idle
Phase 3: VM 5 active again

Host ID   Host Disk Shares
1         1000
2         1500
3         4000

Congestion threshold: 20 ms
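The host share values on this slide are consistent with summing the disk shares of the VMs on each host, and the phases can be imitated by recomputing entitlements over the active VMs only. The sketch below assumes a placement of VMs 1 and 2 on host 1, VMs 3 and 4 on host 2, and VM 5 on host 3, inferred from the totals 1000, 1500, and 4000. Dropping an idle VM from the calculation entirely is a simplification; the slides only state that relative priority changes when a VM does not use its allocation.

```python
vm_shares = {"vm1": 500, "vm2": 500, "vm3": 750, "vm4": 750, "vm5": 4000}

# Placement assumed from the host totals on the slide (1000, 1500, 4000).
vm_host = {
    "vm1": "host1", "vm2": "host1",
    "vm3": "host2", "vm4": "host2",
    "vm5": "host3",
}


def host_shares(active_vms):
    """Aggregate disk shares of the active VMs on each host."""
    totals = {}
    for vm in active_vms:
        host = vm_host[vm]
        totals[host] = totals.get(host, 0) + vm_shares[vm]
    return totals


def entitlement(active_vms):
    """Fraction of the datastore each active VM is entitled to."""
    total = sum(vm_shares[vm] for vm in active_vms)
    return {vm: vm_shares[vm] / total for vm in active_vms}


all_vms = list(vm_shares)
without_vm5 = [vm for vm in all_vms if vm != "vm5"]

phase1 = entitlement(all_vms)        # all VMs active
phase2 = entitlement(without_vm5)    # VM 5 idle (simplified)
phase3 = entitlement(all_vms)        # VM 5 active again

print(host_shares(all_vms))
print(round(phase1["vm5"], 3), phase2["vm1"], phase2["vm3"])
```

In phase 1 VM 5 is entitled to roughly 62% of the datastore; once it goes idle the other four VMs divide the whole datastore 20/20/30/30, and the original split returns in phase 3.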


Dynamic IO Prioritization

I/O prioritization based on
• Disk shares (set by user)
• Usage of allocated resources (monitored by SIOC)

[Figure: two plots of DS2 throughput (orders per second) against time (10-second intervals) across the three phases: aggregate DS2 throughput (axis 0 to 1000) and per-VM throughput for VM 1 to VM 5 (axis 0 to 500)]

Phase 1: All VMs active
Phase 2: VM 5 goes idle
Phase 3: VM 5 active again

SIOC maintains high utilization of storage devices


Conclusions

There is a need for resource control for shared storage in virtual environments

VMware’s solution: Storage IO Control

• Control VMs’ access to shared storage using “Disk Shares”

• Easy to use – just two steps

o Enable Storage IO Control on a Datastore

o Set Disk shares (and limit values) for virtual disks

SIOC is smart

• Automatic detection of I/O congestion

• Dynamic decisions


Related Resources

USENIX Conference on File and Storage Technologies (FAST) 2009 paper “PARDA: Proportional Allocation of Resources for Distributed Storage Access”
• Paper (http://www.usenix.org/events/fast09/tech/full_papers/gulati/gulati.pdf)
• Slides (http://www.usenix.org/events/fast09/tech/slides/gulati.pdf)

Managing Performance Variance of Applications using Storage I/O Control
http://www.vmware.com/files/pdf/techpaper/vsp_41_perf_SIOC.pdf

vSphere Resource Management Guide for ESX / ESXi / vCenter Server 4.1
http://www.vmware.com/pdf/vsphere4/r41/vsp_41_resource_mgmt.pdf


THANK YOU