Mor Harchol-Balter Carnegie Mellon University School of Computer Science

Mor Harchol-BalterCarnegie Mellon UniversitySchool of Computer Science

“size” = service requirement

load < 1

jobs SRPT

jobs PS

Q: Which minimizes mean response time?

“size” = service requirement

jobs SRPT

load < 1

jobs PS

Q: Which best represents scheduling in web servers ?

IDEAHow about using SRPT instead of PS in web servers?

Linux 0.S.

WEBSERVER(Apache)

client 1

client 2

client 3

“Get File 1”

“Get File 2”

“Get File 3”

Internet

Many servers receive mostly static web requests.

“GET FILE”

For static web requests, know file size

Approx. know service requirement of request.

Immediate Objections 1) Can’t assume known job size

2) But the big jobs will starve ...

Outline of Talk

[Sigmetrics 01] “Analysis of SRPT: Investigating Unfairness”[Performance 02] “Asymptotic Convergence of Scheduling Policies…”[Sigmetrics 03*] “Classifying Scheduling Policies wrt Unfairness …”

THEORY

IMPLEMENT

www.cs.cmu.edu/~harchol/

[TOCS 03] “Size-based Scheduling to Improve Web Performance”[ITC 03*, TOIT 06] “Web servers under overload: How scheduling helps”[ICDE 04,05,06] “Priority Mechanisms for OLTP and Web Apps”

(M/G/1)

Schroeder

Wierman

IBM/CMU Patent

THEORY SRPT has a long history ...

1966 Schrage & Miller derive M/G/1/SRPT response time:

1968 Schrage proves optimality

1979 Pechinkin & Solovyev & Yashkov generalize

1990 Schassberger derives distribution on queue length

BUT WHAT DOES IT ALL MEAN?

THEORYSRPT has a long history (cont.)1990 - 97 7-year long study at Univ. of Aachen under Schreiber SRPT WINS BIG ON MEAN!

1998, 1999 Slowdown for SRPT under adversary: Rajmohan, Gehrke, Muthukrishnan, Rajaraman, Shaheen, Bender, Chakrabarti, etc. SRPT STARVES BIG JOBS!

Various o.s. books: Silberschatz, Stallings, Tannenbaum: Warn about starvation of big jobs ...

Kleinrock’s Conservation Law: “Preferential treatment given to one class of customers is afforded at the expense of other customers.”

Unfairness Question

Let =0.9. Let G: Bounded Pareto(= 1.1, max=1010)

Question: Which queue does biggest job prefer?

THEORY

Results on UnfairnessLet =0.9. Let G: Bounded Pareto(= 1.1, max=1010)

PSI SRPT

Unfairness – General Distribution

All-can-win-theorem:

For all distributions, if ½,

E[T(x)]SRPT E[T(x)]PS for all x.

All-can-win-theorem:

For all distributions, if ½,

E[T(x)]SRPT E[T(x)]PS for all x.

Proof idea:

0 )1 1x

dttft 2 )(

Waiting time (SRPT) Residence (SRPT) Total (PS)

Classification of Scheduling Policies

AlwaysUnfair

Sometimes Unfair

AlwaysFair

Age-BasedPolicies

Preemptive Size-basedPolicies

Remaining Size-basedPolicies

Non-preemptive

PS PLCFS

LRPT FCFS LJF

FSP[Sigmetrics 01, 03]

[Sigmetrics 04]• Henderson FSP (Cornell) (both FAIR & efficient)• Levy’s RAQFM (Tel Aviv) (size + temporal fairness)• Biersack’s, Bonald’s flow fairness (France)• Nunez, Borst TCP/DPS fairness (EURANDOM)

What does SRPT mean within a Web server?

• Many devices: Where to do the scheduling?

• No longer one job at a time.

IMPLEMENT From theory to practice:

Server’s Performance BottleneckIMPLEMENT

Linux 0.S.

WEBSERVER(Apache)

client 1

client 2

client 3

“Get File 1”

“Get File 2”

“Get File 3”

Rest ofInternet ISP

Site buyslimited fractionof ISP’s bandwidth

We model bottleneck by limiting bandwidth on server’s uplink.

Network/O.S. insides of traditional Web server

Sockets take turnsdraining --- FAIR = PS.

WebServer

Socket 1

Socket 3

Socket 2Network Card

Client1

Client3

Client2BOTTLENECK

IMPLEMENT

Network/O.S. insides of our improved Web server

Socket corresponding to filewith smallest remaining datagets to feed first.

WebServer

Socket 1

Socket 3

Socket 2Network Card

Client1

Client3

Client2

priorityqueues.

2nd3rd

BOTTLENECK

IMPLEMENT

Experimental Setup

Implementation SRPT-based scheduling: 1) Modifications to Linux O.S.: 6 priority Levels 2) Modifications to Apache Web server 3) Priority algorithm design.

Linux 0.S.

APACHEWEB

SERVER

switch

Experimental Setup

APACHEWEB

SERVER

Linux 0.S.

switch

Trace-based workload: Number requests made: 1,000,000Size of file requested: 41B -- 2 MBDistribution of file sizes requested has HT property.

FlashApache

WAN EMUGeographically-dispersed clients

10Mbps uplink100Mbps uplinkSurgeTrace-basedOpen systemPartly-open

Load < 1Transient overload

+ Other effects: initial RTO; user abort/reload; persistent connections, etc.

Preliminary Comments

• Job throughput, byte throughput, and bandwidth utilization were same under SRPT and FAIR scheduling.

• Same set of requests complete.

• No additional CPU overhead under SRPT scheduling. Network was bottleneck in all experiments.

APACHEWEB

SERVER

Linux 0.S.

switch

SRPTMea

Results: Mean Response Time (LAN)

Percentile of Request Size

Load =0.8

Mean Response Time vs. Size Percentile (LAN)

Transient Overload

Transient Overload - Baseline

Mean response time SRPTFAIR

Transient overloadResponse time as function of job

small jobswin big!

big jobsaren’t hurt!

Baseline CaseWAN propagation delays

WAN loss

Persistent ConnectionsInitial RTO valueSYN CookiesUser Abort/ReloadPacket LengthRealistic Scenario

WAN loss + delay

RTT: 0 – 150 ms

Loss: 0 – 15%

Loss: 0 – 15%RTT: 0 – 150 ms,

0 – 10 requests/conn.

RTO = 0.5 sec – 3 secON/OFF

Abort after 3 – 15 sec, with 2,4,6,8 retries.

Packet length = 536 – 1500 Bytes

RTT = 100 ms; Loss = 5%; 5 requests/conn.,RTO = 3 sec; pkt len = 1500B; User abortsAfter 7 sec and retries up to 3 times.

FACTORS

Transient Overload - Realistic

Mean response timeFAIR SRPT

Mor Harchol-Balter Carnegie Mellon University School of Computer Science

Documents

Analysis of Scheduling Policies under Correlated Job Sizesharchol/Papers/Performance10b.pdf · 2010. 8. 8. · Varun Gupta , Michelle Burroughs, Mor Harchol-Balter Computer Science

Mor Harchol-Balter Carnegie Mellon University Computer Science

Math for Smart Kids Book 1: Multiplication & Division By Mor Harchol-Balter, Ph.D. Copyright March 2003

Balter adam slideshowPCP

1 Scheduling in Server Farms Mor Harchol-Balter Computer Science Dept Carnegie Mellon University harchol@cs.cmu.edu

Anshul Gandhi (Carnegie Mellon University) Varun Gupta (CMU), Mor Harchol-Balter (CMU) Michael Kozuch (Intel, Pittsburgh)

An Adaptive Threshold-Based Policy for Sharing Servers with ...An Adaptive Threshold-Based Policy for Sharing Servers with Afﬁnities Takayuki Osogami Mor Harchol-Balter Alan Scheller-Wolf

1 Scheduling in Server Farms Mor Harchol-Balter Associate Department Head Computer Science Dept Carnegie Mellon University harchol@cs.cmu.edu

Analysis of SRPT Scheduling: Investigating Unfairnessharchol/Papers/Sigmetrics01.pdf · 2007. 6. 26. · Analysis of SRPT Scheduling: Investigating Unfairness Nikhil Bansal Mor Harchol-Balter

ISAAC GROSOF, MOR HARCHOL-BALTER, arXiv:1905.03439v1 [cs.PF] 9 … · MOR HARCHOL-BALTER, Carnegie Mellon University, USA Load balancing systems, comprising a central dispatcher and

VARUN GUPTA Carnegie Mellon University 1 With: Mor Harchol-Balter (CMU)

An Adaptive Threshold-Based Policy for Sharing Servers with ...harchol/CMU-CS-04-112.pdfAn Adaptive Threshold-Based Policy for Sharing Servers with Afﬁnities Takayuki Osogami Mor

Applying for Grad School in CS* Adam Wierman > Caltech *much of this was blatantly poached from Mor Harchol-Balter’s advice harchol/gradschooltalk.pdf

Multi-server queueing systems with multiple priority classesharchol/Papers/questa.pdf · Multi-server queueing systems with multiple priority classes Mor Harchol-Balter∗ Takayuki

Applying to Ph.D. Programs in Computer Scienceharchol/gradschooltalk.pdf · Applying to Ph.D. Programs in Computer Science Mor Harchol-Balter Computer Science Department Carnegie

Mor Harchol-Balter Carnegie Mellon University School of Computer Science

Léo Balter: JavaScript Idiomático

Carnegie Mellon University Computer Science Department 1 OPEN VERSUS CLOSED: A CAUTIONARY TALE Bianca Schroeder Adam Wierman Mor Harchol-Balter Computer

Thesis Oral VARUN GUPTA - University of Chicagohome.uchicago.edu/~guptav/talks/thesis_slides.pdf · 38 [Performance’] V. Gupta, M. Harchol-Balter, K. Sigman, and W. Whitt. Analysis

Mor Harchol-Balterharchol/cv.pdf · [20] Ziv Scully, Mor Harchol-Balter, Alan Scheller-Wolf. “Simple Near-Optimal Scheduling for the M/G/1.” Proceedings of the ACM Measurement