Design and Implementation of a Generic Resource-Sharing Virtual-Time Dispatcher

Tal Ben-Nun, Scl. Eng & CS, Hebrew University

Yoav Etsion, CS Dept, Barcelona SC Ctr

Dror Feitelson, Scl. Eng & CS, Hebrew University

Supported by the Israel Science Foundation, grant no. 28/09

Goal is to control share of resources, not to optimize performance – important in virtualization

Same module used for diverse resources

Mechanism used: dispatch the most deserving client at each instant

Selection of deserving client using virtual time formalism

Implemented and measured in Linux

Motivation

• Context: VMM for server consolidation

– Multiple legacy servers share a physical platform

– Improved utilization and easier maintenance

– Flexibility in allocating resources to virtual machines

– Virtual machines typically run a single application (“appliances”)

Motivation

• Assumed goal: enforce a predefined allocation of resources to different virtual machines (“fair share” scheduling)

– Based on importance / SLA

– Can change with time or due to external events

Problem: what is “30% of the resources” when there are many different resources, and diverse requirements?

Global Scheduling

“Fair share” is usually applied to a single resource

But what if this resource is not a bottleneck?

Global scheduling idea:

1) Identify the system bottleneck resource

2) Apply fair share scheduling on this resource

3) This induces appropriate allocations on other resources

This paper: how to apply fair-share scheduling on any resource in the system

Previous Work I: Virtual Time

• Accounting is inversely proportional to allocation

• Schedule the client that is farthest behind
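As a rough illustration of the virtual-time idea, the Python sketch below charges each client virtual time in inverse proportion to its weight and always runs the client whose virtual time is smallest. The function name, the fixed quantum, and the tie-breaking rule are assumptions made for this example and are not taken from the slides.

```python
def run_virtual_time(weights, quantum, steps):
    """Simulate weighted virtual-time dispatching for `steps` quanta.

    weights: dict mapping client name -> relative allocation (assumed > 0).
    Returns the real resource time each client received.
    """
    vtime = {c: 0.0 for c in weights}   # virtual time charged so far
    usage = {c: 0.0 for c in weights}   # real resource time received
    for _ in range(steps):
        # "Farthest behind" = smallest virtual time; ties go to the lowest name.
        chosen = min(vtime, key=lambda c: (vtime[c], c))
        usage[chosen] += quantum
        # Accounting is inversely proportional to the allocation (weight).
        vtime[chosen] += quantum / weights[chosen]
    return usage


if __name__ == "__main__":
    print(run_virtual_time({"a": 2, "b": 1}, quantum=1.0, steps=300))
```

With weights 2 and 1, client `a` is charged half as much virtual time per quantum as client `b`, so it is selected about twice as often.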

Previous Work II: Traffic Shaping

• Leaky bucket

– Variable requests

– Constant rate transmission

– Bucket represents a buffer

• Token bucket

– Variable requests

– Constant allocations

– Bucket represents stored capacity
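A token bucket can be sketched in a few lines of Python. This is only an illustration of the general technique: the class name, method names, and units are hypothetical, and time is advanced explicitly rather than read from a clock so that the behaviour is easy to follow.

```python
class TokenBucket:
    """Minimal token bucket: tokens arrive at a constant rate, up to a cap."""

    def __init__(self, rate, capacity):
        self.rate = rate            # tokens added per unit of time
        self.capacity = capacity    # maximum stored capacity
        self.tokens = capacity      # start with a full bucket (an assumption)

    def advance(self, dt):
        """Account for `dt` units of elapsed time (constant allocations)."""
        self.tokens = min(self.capacity, self.tokens + self.rate * dt)

    def try_send(self, size):
        """Serve a variable-size request if enough capacity is stored."""
        if size <= self.tokens:
            self.tokens -= size
            return True
        return False


if __name__ == "__main__":
    bucket = TokenBucket(rate=10, capacity=20)
    print(bucket.try_send(15), bucket.try_send(10))  # second request must wait
    bucket.advance(1.0)
    print(bucket.try_send(10))
```

Unused tokens accumulate up to `capacity`, which is what lets a client that was briefly idle send a short burst later.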

Putting them Together: RSVT

• “Resource sharing”: all clients make progress continuously

– Generalization of processor sharing

• Each job has its ideal resource sharing progress

– This is considered to be the allocation $a_i$

– Grows at constant rate

• Each job has its actual consumption $c_i$

– Grows only when job runs

• Scheduling priority is the difference:

$$p_i = a_i - c_i$$
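The priority rule can be tried out on a few numbers. In the sketch below the client names and the allocation and consumption values are made up for illustration, and breaking ties in favour of the lowest name is an assumption.

```python
def priorities(alloc, cons):
    """Return p_i = a_i - c_i for every client."""
    return {j: alloc[j] - cons[j] for j in alloc}


def most_deserving(alloc, cons):
    """Pick the client with the largest priority (ties: lowest name)."""
    p = priorities(alloc, cons)
    return min(p, key=lambda j: (-p[j], j))


if __name__ == "__main__":
    alloc = {"A": 50.0, "B": 30.0, "C": 20.0}   # hypothetical a_i values
    cons = {"A": 45.0, "B": 20.0, "C": 20.0}    # hypothetical c_i values
    print(priorities(alloc, cons), most_deserving(alloc, cons))
```

Here client B has consumed the least relative to its allocation, so it is the one dispatched next even though A has the larger allocation.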

Example

Three clients

Allocations roughly 50%, 30%, 20%

Consumption always occurs in resource time

[Figure: consumed resource time (y-axis) versus wallclock time (x-axis)]

Bookkeeping

• The set of active jobs is A

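• The relative allocation of job $i$ is $r_i$

• During an interval $T$, job $k$ has run

• Update allocations:

$$a_i \leftarrow a_i + \frac{r_i}{\sum_{j \in A} r_j}\, T \qquad \text{for } i \in A$$

• Update consumptions:

$$c_i \leftarrow c_i + \begin{cases} T & i = k \\ 0 & \text{otherwise} \end{cases}$$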
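The bookkeeping on this slide can be sketched in Python as follows. The function names, the dictionary layout, and the fixed interval length are assumptions made for the example; the dispatcher here simply picks the active job with the largest difference between allocation and consumption.

```python
def account(alloc, cons, share, active, ran, T):
    """Apply one bookkeeping step for an interval of length T."""
    total = sum(share[j] for j in active)
    for i in active:
        # Every active job earns its relative part of the interval.
        alloc[i] += share[i] / total * T
    # Only the job that actually ran is charged for the interval.
    cons[ran] += T


def simulate(share, T, steps):
    """Dispatch for `steps` intervals with all jobs permanently active."""
    active = set(share)
    alloc = dict.fromkeys(share, 0.0)
    cons = dict.fromkeys(share, 0.0)
    for _ in range(steps):
        # sorted() makes ties deterministic: the lowest name wins.
        k = max(sorted(active), key=lambda j: alloc[j] - cons[j])
        account(alloc, cons, share, active, k, T)
    return alloc, cons


if __name__ == "__main__":
    alloc, cons = simulate({"A": 5, "B": 3, "C": 2}, T=1.0, steps=1000)
    print(cons)
```

Because a job's consumption rises only while it runs and its allocation rises all the time, consumption stays close to allocation, so over a long run each job's consumed time approaches its relative share.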

The Active Set

• Active jobs (the set A) are those that can use the resource now

• Allocations are relative to the active set

• The active set may change

– New job arrives

– Job terminates

– Job stops using resource temporarily

– Job resumes use of resource

Grace Period

• Intermittent activity: process data / send packet

• Jobs should retain their allocations even when briefly inactive

• Thus $a_i$ continues to grow during a grace period after job $i$ becomes inactive

• Grace period reflects notion of continuity

• Sub-second time scale

Rebirth

• Resumption after very long inactive periods should be treated as new arrivals

• Due to grace period, job that becomes inactive accrues extra allocation

• Forget this extra allocation after the rebirth period (set $a_i = c_i$)

• Rebirth period is two orders of magnitude longer than the grace period
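One way to picture how the grace period and the rebirth period interact is the following Python sketch. The concrete values of `GRACE` and `REBIRTH`, the `Client` fields, and the `tick` function are all assumptions for illustration; the slides state only that the grace period is sub-second and that the rebirth period is about two orders of magnitude longer.

```python
from dataclasses import dataclass
from typing import Optional

GRACE = 0.1      # seconds; assumed value (the slides say only "sub-second")
REBIRTH = 10.0   # seconds; assumed to be 100 x GRACE


@dataclass
class Client:
    share: float
    alloc: float = 0.0
    cons: float = 0.0
    idle_since: Optional[float] = None   # None while the client wants the resource


def earns_allocation(c, now):
    """True while the client is active or still inside its grace period."""
    return c.idle_since is None or now - c.idle_since <= GRACE


def tick(clients, now, T):
    """Distribute an interval of length T, then apply the rebirth rule."""
    earners = [c for c in clients if earns_allocation(c, now)]
    total = sum(c.share for c in earners)
    for c in earners:
        c.alloc += c.share / total * T
    for c in clients:
        if c.idle_since is not None and now - c.idle_since >= REBIRTH:
            c.alloc = c.cons   # rebirth: forget the credit accrued while idle
```

A client that pauses for less than `GRACE` keeps earning allocation and can catch up when it resumes, while a client that has been idle for `REBIRTH` or longer restarts with zero priority, like a new arrival.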

Implementation

• Kernel module with generic functionality

– Create / destroy module

– Create / destroy client

– Make request / set active / set inactive

– Make allocations

– Dispatch

– Check-in (note resource usage)

• Glue code for specific subsystems

– Currently networking and CPU

– Plan to add disk I/O

Networking Glue Code

Use the Linux QoS framework: create RSVT queueing discipline

[Diagram: network stack App, TCP, IP, QoS (queueing discipline), NIC]

Networking Glue Code

Non-RSVT traffic has priority (e.g. NFS traffic) and is counted as dead time

[Diagram: packets pass from App through TCP and IP to an “RSVT?” test. If no: send immediately to the NIC. If yes: enqueue, then select and send to the NIC.]

CPU Scheduling Glue Code

• Use Linux modular scheduling core

• Add an RSVT scheduling policy

– RSVT module essentially replaces the policy runqueue

– Initial implementation only for uniprocessors

• CFS and possibly other policies also exist and have higher priority

– When they run, this is considered dead time

Timer Interrupts

• Linux employs timer interrupts (250 Hz)

• Allocations are done at these times

– Translate time into microseconds

– Subtract known dead time (unavailable to us)

– Divide among active clients according to relative allocations

– Bound divergence of allocation from consumption

• Also handling of grace period (mark as inactive)

• Also handling of rebirth (set $a_i = c_i$)
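The allocation steps performed at each timer interrupt might look roughly like the Python sketch below. Only the 250 Hz rate comes from the slide; the function name, the dictionary layout, and the way the divergence bound is applied (capping each allocation at consumption plus a maximum lag) are assumptions.

```python
HZ = 250                     # timer frequency given on the slide
TICK_US = 1_000_000 // HZ    # one tick translated into microseconds (4000)


def allocate_on_tick(clients, dead_us, max_lag_us):
    """Distribute one timer tick among the active clients.

    clients: list of dicts with keys 'share', 'alloc', 'cons', 'active'.
    dead_us: time in this tick that was unavailable to RSVT clients.
    max_lag_us: assumed cap on how far alloc may run ahead of cons.
    """
    usable = max(0, TICK_US - dead_us)          # subtract known dead time
    active = [c for c in clients if c["active"]]
    total = sum(c["share"] for c in active)
    for c in active:
        # Divide according to relative allocations.
        c["alloc"] += usable * c["share"] / total
        # Bound the divergence of allocation from consumption.
        c["alloc"] = min(c["alloc"], c["cons"] + max_lag_us)
```

Capping the allocation means that a client which does not use its full share cannot build up unbounded credit, so its allocation ends up tracking its consumption.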

Multi-Queue

• At dispatch, need to find client with highest priority

• But priorities change at different rates

• Solution: allow only a limited discrete set of relative priorities

• Each priority has a separate queue

• Maintain all clients in each queue in priority order

• Only need to check the first in each queue to find the maximum
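A possible realization of the multi-queue idea is sketched below in Python. It assumes that "relative priority" refers to a client's relative allocation level, that all queued clients stay active, and that a binary heap is an acceptable stand-in for a queue kept in priority order; the class and method names are hypothetical. Clients on the same level earn allocation at the same rate, so their order changes only when one of them consumes the resource.

```python
import heapq


class MultiQueue:
    """One priority-ordered queue per allowed relative allocation level."""

    def __init__(self, levels):
        # growth[r]: allocation earned so far by any client on level r.
        self.growth = {r: 0.0 for r in levels}
        # heaps[r]: min-heap of (debt, name); priority = growth[r] - debt.
        self.heaps = {r: [] for r in levels}

    def add(self, name, level):
        # A new client starts with priority zero on its level.
        heapq.heappush(self.heaps[level], (self.growth[level], name))

    def allocate(self, T):
        """Advance every level's shared allocation counter for an interval T."""
        total = sum(r * len(h) for r, h in self.heaps.items())
        if total == 0:
            return
        for r in self.growth:
            self.growth[r] += r / total * T

    def run_next(self, used):
        """Dispatch the highest-priority client and charge it `used`."""
        busy = [r for r, h in self.heaps.items() if h]
        if not busy:
            return None
        # Only the head of each queue has to be inspected.
        level = max(busy, key=lambda r: self.growth[r] - self.heaps[r][0][0])
        debt, name = heapq.heappop(self.heaps[level])
        heapq.heappush(self.heaps[level], (debt + used, name))
        return name
```

Allocation then costs one counter update per level rather than one per client, and a dispatch inspects one queue head per level before re-inserting the chosen client.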

Experiment – Basic Allocations

rate    bandwidth

1       30.89 ± 0.05

2       61.41 ± 0.02

Experiment – Basic Allocations

rate    bandwidth

1       15.69 ± 0.11

2       30.81 ± 0.03

3       46.10 ± 0.03

Experiment – Active Set

Experiment – Grace Period

Experiment – Rebirth

Experiment – Throttling

• Two competing MPlayers

• The one with higher allocation does not need all of it

– Allocation tracks consumption

Conclusions

• Demonstrated generic virtual-time based resource sharing dispatcher

• Need to complete implementation

– Support for I/O scheduling

– More details, e.g. SMP support

• Building block of global scheduling vision