Transcript
Page 1: The Battle of the Schedulers: FreeBSD ULE vs. Linux CFS · Linux CFS FreeBSD ULE Load = complex metric w/ heuristics Load = number of tasks Hierarchical (NUMA) Non-hierarchical (SMP)

The Battle of the Schedulers: FreeBSD ULE vs. Linux CFS

Justinien Bouron, Baptiste Lepers, Sébastien Chevalley, Willy Zwaenepoel (EPFL)

Redha Gouicem, Julia Lawall, Gilles Muller, Julien Sopena (Sorbonne University, Inria, LIP6)

Page 2

What is a scheduler?

It runs all the tasks of a system, solving the following challenges:

● Assign a set of tasks to a (smaller) set of cores
● High utilization of hardware resources (i.e., CPU utilization)
● Fast response time and low overhead!
● React to workload changes (load balancing, …)

Page 3

Linux CFS and FreeBSD ULE

● Linux CFS is supposed to be completely fair
● FreeBSD ULE is supposed to have good interactive performance

Both schedule a large number of threads on a large number of cores ...

… but their designs differ greatly

Page 4

Our goal: Compare Linux CFS and FreeBSD ULE

● How do they differ in terms of design?
● What is the impact of each design on performance?
● Apples-to-apples comparison only, not declaring a winner

Page 5

How to compare their impact on performance?

● We want to compare the impact of both schedulers on performance
● But they come from different kernels/OSes ...
● … so naively running the applications on both OSes would be highly biased

How to single out the performance differences coming from the schedulers only?

Page 6

Our approach: Transplanting

● Transplant one scheduler into the other kernel alongside the original
● Choose which one to use!
● Everything else remains the same => no bias from other components.

Page 7

Which scheduler to transplant?

CFS: 18k LoC

ULE: 3k LoC

Answer: ULE into Linux

Page 8

Challenges

Page 9

Challenge: Interface mismatch

● Linux provides an “API” (user-defined functions) to add new schedulers …
● … FreeBSD does not
● But functions inside ULE could easily be mapped to their Linux counterparts

Page 10

Challenge: Different low-level assumptions

Both schedulers have their very own low-level assumptions:

● Locking policy: multiple locks
● Runqueue management: data structures, locks, indexes, ...
● Priority range: CFS handles only the nice range, ULE handles all tasks

ULE’s code had to be slightly modified to comply with Linux’s assumptions

Page 11

Evaluation

Page 12

A broader performance comparison

● Mix of synthetic benchmarks and realistic applications
● Evaluation performed on a 32-core NUMA machine (AMD Opteron) with 32 GB of RAM

Dual purpose:

● Test our implementation
● Give us clues on where to look for differences

Page 13

A broader performance comparison

● Most of the time, performance is the same: 2.75% in favor of ULE on average.

[Figure: Performance of ULE (% difference w.r.t. CFS)]

Page 14

A broader performance comparison

● Big gaps when changing the scheduler: the scheduler can have a big impact on performance!

[Figure: Performance of ULE (% difference w.r.t. CFS)]

Page 15

Design differences

Page 16

Difference #1/4: Dealing with interactive tasks

Page 17

Dealing with interactive tasks: The difference

● A task can be either interactive or batch
● Interactive tasks = sleep most of the time (inputs, yields, …)
● Batch tasks = CPU-bound tasks with very little sleep

How do CFS and ULE handle them?

Page 18

CFS

Pages 19-25

Dealing with interactive tasks: CFS

● CFS is fair, thus it makes no distinction between interactive and batch tasks.
● Tasks are ordered by runtime; CFS picks the one at the top of the runqueue

[Animation: a single runqueue holds Task 0, Task 1 and Task 2, with interactive and batch tasks mixed. The core takes the task at the top of the runqueue (Task 0), runs it, and returns it to the runqueue behind Task 1 and Task 2. Task 1 is taken next.]
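
As a rough illustration of the two bullets above, the following Python sketch keeps a single runqueue ordered by runtime and always runs the task with the lowest runtime. It is a simplification under stated assumptions: the `Runqueue` class, its `run_for` method and the fixed time slice are names invented for this example, and real CFS orders tasks by a weighted virtual runtime in a red-black tree rather than a binary heap.

```python
import heapq


class Runqueue:
    """Single runqueue ordered by accumulated runtime (hypothetical helper)."""

    def __init__(self, names):
        # Heap entries are (runtime, insertion order, name).
        # The insertion counter breaks ties between equal runtimes.
        self._heap = [(0, i, name) for i, name in enumerate(names)]
        heapq.heapify(self._heap)
        self._counter = len(names)

    def run_for(self, time_slice):
        """Run the task with the lowest runtime for one slice, then requeue it."""
        runtime, _, name = heapq.heappop(self._heap)
        heapq.heappush(self._heap, (runtime + time_slice, self._counter, name))
        self._counter += 1
        return name

    def runtimes(self):
        return {name: runtime for runtime, _, name in self._heap}


rq = Runqueue(["Task 0", "Task 1", "Task 2"])
order = [rq.run_for(time_slice=1) for _ in range(6)]
print(order)          # the tasks take turns, none is favoured
print(rq.runtimes())  # every task has accumulated the same runtime
```

Because the only ordering key is runtime, a task that has run less than the others is picked first regardless of whether it is interactive or batch.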

Page 26

ULE

Pages 27-39

Dealing with interactive tasks: ULE

● ULE keeps interactive tasks and batch tasks in separate runqueues
● Tasks are ordered by runtime in each runqueue
● Interactive tasks have absolute priority over batch tasks

[Animation: Task 1 sits in the batch runqueue; Task 0 and Task 2 sit in the interactive runqueue. The core alternates between Task 0 and Task 2. Task 1 runs only once the interactive runqueue is empty.]
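
The absolute-priority rule above can be sketched in a few lines of Python. This is an illustration only, not ULE's code: `pick_next` is a hypothetical helper, each runqueue is modelled as a heap of `(runtime, name)` pairs, and the demo assumes the interactive tasks never sleep so that they are requeued immediately.

```python
import heapq


def pick_next(interactive, batch):
    """Pop the next task to run, or return None if both runqueues are empty.

    The batch runqueue is consulted only when the interactive one is empty.
    """
    for queue in (interactive, batch):
        if queue:
            return heapq.heappop(queue)
    return None


interactive = [(0, "Task 0"), (0, "Task 2")]
batch = [(0, "Task 1")]
heapq.heapify(interactive)
heapq.heapify(batch)

picked = []
for _ in range(4):
    runtime, name = pick_next(interactive, batch)
    picked.append(name)
    # Demo assumption: the task is runnable again right after its slice.
    target = batch if name == "Task 1" else interactive
    heapq.heappush(target, (runtime + 1, name))

print(picked)                # only interactive tasks were chosen
print(pick_next([], batch))  # the batch task runs once no interactive task is runnable
```

As long as at least one interactive task is runnable, the batch runqueue is never consulted, which is the behaviour shown in the animation.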

Page 40

Dealing with interactive tasks: Rationales

CFS’s rationale:

● Let’s be fair
● No distinction between tasks

ULE’s rationale:

● Interactive tasks are latency-critical, so give them absolute priority
● This should not cause problems, as they sleep most of the time

Page 41

Dealing with interactive tasks: Similarities

Both schedulers operate the same way when dealing with only one class of tasks

● They both pick the task with the lowest runtime from one runqueue

Thus the only interesting case to study is when both classes of tasks are mixed

Page 42

Dealing with interactive tasks: The experiment

Run two applications in parallel on a single-core machine ...

● One interactive application with 80 interactive threads
● One single-threaded batch application

Goal: compare the evolution of their runtime

Pages 43-44

Dealing with interactive tasks: The impact

[Figure: runtime of the Interactive and Batch applications, one panel for CFS and one for ULE]

Page 45

Dealing with interactive tasks: The impact

● Linux CFS is fair (both applications get ~50% of the CPU)
● No starvation

[Figure, CFS panel: runtime of Interactive and Batch. Annotation: roughly the same slope for Interactive and Batch => fairness]

Page 46

Dealing with interactive tasks: The impact

● On FreeBSD ULE, interactive tasks saturate the CPU and starve batch tasks!

[Figure, ULE panel: runtime of Interactive and Batch. Annotations: Batch is starving! Interactive gets the full core]

Page 47

Dealing with interactive tasks: The impact

● On ULE, interactive applications may perform better ...
● … but may also starve other tasks in the system.

[Figure: CFS and ULE panels, runtime of Interactive and Batch]

Page 48

Dealing with interactive tasks: “Auto-starvation”

● The starvation problem in ULE can occur between threads of a single application
● This can be good for performance, as it avoids over-subscription of the CPU!
● More details in the paper

Page 49

Dealing with interactive tasks: Summary

In ULE:

● Interactive tasks have absolute priority
● They can also starve batch tasks, even from the same application

In CFS:

● All tasks are treated the same
● Fairness
● No starvation

Page 50

Difference #2/4: Preemption

Page 51

Full preemption: The difference

Should a waking task preempt the running task?

● Linux CFS: full preemption is enabled, so yes, sometimes.
● FreeBSD ULE: no full preemption by default. Only kernel threads can preempt others.
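
The two answers above can be sketched as two decision functions. This is a deliberately simplified Python model, not code from either kernel: the `Task` fields, the function names and the `granularity` margin are assumptions made for the illustration, and the slide only states that CFS preempts "sometimes" without giving the exact rule.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    runtime: float                 # CPU time consumed so far
    is_kernel_thread: bool = False


def cfs_like_should_preempt(waking, running, granularity=1.0):
    """Assumed rule: preempt when the waking task lags behind by more than a margin."""
    return running.runtime - waking.runtime > granularity


def ule_like_should_preempt(waking, running):
    """Without full preemption, only kernel threads preempt the running task."""
    return waking.is_kernel_thread


injector = Task("injector", runtime=10.0)   # currently running
worker = Task("worker", runtime=2.0)        # just woken up by a request

print(cfs_like_should_preempt(worker, injector))  # True: the worker takes the core
print(ule_like_should_preempt(worker, injector))  # False: the injector keeps running
```

Under this model a userland thread that wakes up with little accumulated runtime can take the core in the CFS-like case, while in the ULE-like case it has to wait.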

What is the impact?

Page 52

Full preemption: The experiment

● Run a communication-intensive workload (Apache) on a single core
● The workload consists of a load injector and workers that handle requests
● Compare performance and look at low-level events with perf

Page 53

Full preemption: The impact

● The Apache workload performs better on ULE on a single core.
● On CFS, the request injector is preempted by the workers at every request
● Thus further requests are delayed, and performance goes down!

                                            Linux CFS   FreeBSD ULE
Preemption of injector by userland thread   > 2M        0
Total time (seconds)                        257         185
Requests / second                           3891        5405
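
The gap in the Apache numbers above can be quantified with a small calculation. The sketch below assumes that a percentage difference with respect to CFS is computed as (ULE - CFS) / CFS; `percent_diff` is a helper introduced for this example, and only the four measurements come from the slide.

```python
def percent_diff(value, baseline):
    """Relative difference of value with respect to baseline, in percent."""
    return (value - baseline) / baseline * 100.0


# Measurements from the slide (Apache on a single core).
cfs = {"total_time_s": 257, "requests_per_s": 3891}
ule = {"total_time_s": 185, "requests_per_s": 5405}

throughput_gain = percent_diff(ule["requests_per_s"], cfs["requests_per_s"])
time_change = percent_diff(ule["total_time_s"], cfs["total_time_s"])

print(f"throughput: {throughput_gain:+.1f}%")  # throughput: +38.9%
print(f"total time: {time_change:+.1f}%")      # total time: -28.0%
```

Both figures describe the same effect: ULE serves about 39% more requests per second, which corresponds to finishing the run in about 28% less time.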

Page 54

Full preemption: Summary

ULE:

● No full preemption by default

CFS:

● Full preemption is enabled by default
● Can worsen performance in some surprising ways

Page 55

Difference #3/4: Load balancing

Page 56

Load balancer: The difference

A scheduler must balance the load on all cores

● Both schedulers have their own load balancing algorithm ...
● … which differ in three main points

Linux CFS                                            FreeBSD ULE
Load = complex metric w/ heuristics                  Load = number of tasks
Hierarchical (NUMA)                                  Non-hierarchical (SMP)
Every 4ms                                            Every 0.5-1.5s (random)
Migrates multiple tasks from a loaded core at once   Migrates at most one task from a loaded core

Page 57

Load balancer: The experiment

● Use a lot of threads to put a lot of stress on the load balancer
● What we want to compare: the speed and the efficiency

Pages 58-62

Load balancer: The experiment

1) Spawn a lot of threads, all pinned on core 0 of a 32-core machine
2) Unpin the threads
3) Let the load balancer do its work and save runqueue sizes at every migration
4) … until some stable state is reached
5) Record the time it took to reach the stable state

[Diagram: Core 0 to Core 31. All threads start on Core 0 and end up spread across the cores.]

Page 63

Load balancer: The experiment

● We used C-ray, a massively parallel ray tracer, with 512 threads that are all identical
● As all threads are identical, we should expect 16 threads per core at the end...

Pages 64-65

Load balancer: The impact

● We end up with the following graphs
● Each line is a core, the color is the size of its runqueue

[Figure: two panels, ULE and CFS]

Page 66

Load balancer: The impact, perfect balancing

ULE achieves perfect balancing!

CFS has some trouble due to NUMA heuristics

Page 67

Load balancer: The impact, Speed

● Four minutes to spread the load on ULE???

[Figure: two panels, ULE and CFS]

Pages 68-70

Load balancer: The impact, Speed (ULE)

● ULE can migrate at most one task from a loaded core!
● Idle cores also steal only one task at a time
● After the stealing, threads will be migrated one at a time ...
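
To see why one migration at a time is slow, the bullets above can be turned into a small counting model. The Python sketch below is an assumption-laden simplification, not ULE's algorithm: `rounds_to_balance` is a hypothetical helper, every idle core first steals a single thread, and each later balancing round moves exactly one thread from the most loaded core to the least loaded one. The 512 threads and 32 cores are the values used in the experiment.

```python
def rounds_to_balance(n_threads=512, n_cores=32):
    """Count balancing rounds under a one-migration-per-round model."""
    loads = [0] * n_cores
    loads[0] = n_threads  # all threads start on core 0

    # Step 1: each idle core steals a single thread from the loaded core.
    for core in range(1, n_cores):
        if loads[core] == 0 and loads[0] > 1:
            loads[0] -= 1
            loads[core] += 1

    # Step 2: each round moves one thread from the most loaded core to the
    # least loaded core, until the loads differ by at most one.
    rounds = 0
    while max(loads) - min(loads) > 1:
        src = loads.index(max(loads))
        dst = loads.index(min(loads))
        loads[src] -= 1
        loads[dst] += 1
        rounds += 1
    return rounds, loads


rounds, loads = rounds_to_balance()
print(rounds, set(loads))  # 465 {16}
```

Under these assumptions the model needs 465 rounds to reach 16 threads per core. With one round every 0.5 to 1.5 s, that is roughly 4 to 12 minutes, the same order of magnitude as the four minutes observed.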

Pages 71-73

Load balancer: The impact, Speed (CFS)

● CFS has no limit on the number of migrations
● CFS balances the load much faster: around 400 migrations in less than 0.2s!
● But heuristics are still a problem

Page 74

Load balancer: Summary

ULE:

● Very simple load metric
● Achieves perfect balancing
● But slow

CFS:

● Complex load metric, lots of heuristics
● Can be stuck in an imbalanced state
● But fast at spreading the load

Page 75

Difference #4/4: Thread placement

Page 76

Thread placement: The difference

How to choose the core that runs a new/waking thread?

● CFS: use heuristics to restrict the list of suitable cores, then take the least loaded one
● ULE: choose, among all cores, the one with the minimum number of tasks
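
The two placement policies above can be contrasted with a short Python sketch. It is illustrative only: the function names are invented for this example, load is modelled as a plain task count for both policies, and the CFS-like candidate list (the first four cores, standing in for one NUMA node) is a hypothetical heuristic, since the slide does not say how CFS restricts its list.

```python
def place_ule_like(loads):
    """Pick, among all cores, the one with the fewest tasks."""
    return min(range(len(loads)), key=lambda core: loads[core])


def place_cfs_like(loads, candidates):
    """Pick the least loaded core, but only within a restricted candidate list."""
    return min(candidates, key=lambda core: loads[core])


def spawn(n_threads, n_cores, place):
    """Place n_threads one by one and return the resulting per-core loads."""
    loads = [0] * n_cores
    for _ in range(n_threads):
        loads[place(loads)] += 1
    return loads


ule_loads = spawn(64, 8, place_ule_like)
# Hypothetical heuristic: only cores 0-3 are considered suitable.
cfs_loads = spawn(64, 8, lambda loads: place_cfs_like(loads, candidates=range(4)))

print(ule_loads)  # [8, 8, 8, 8, 8, 8, 8, 8]
print(cfs_loads)  # [16, 16, 16, 16, 0, 0, 0, 0]
```

When every core is a candidate, placement alone keeps the load uniform. When the candidate list is restricted, the imbalance has to be repaired later by the load balancer.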

Page 77

Thread placement: The experiment

● Spawn a lot of threads on all available cores
● Record the size of the runqueues over time

Again, C-ray was a good choice for this

Page 78

Thread placement: The impact

[Figure: runqueue sizes over time, two panels, ULE and CFS]

Page 79

Thread placement: The impact: ULE

● The load is always more or less uniform (nice fading on the load graph)
● Auto-starvation slows down the creation of threads!

Page 80

Thread placement: The impact: CFS

● Bad load balance at the beginning due to NUMA heuristics
● The load balancer tries to fix this but still struggles as before

Page 81

Thread placement: Summary

ULE:

● The thread placement policy considers all cores ...
● … and thus relieves the pressure on the load balancer
● Load is always uniform

CFS:

● The policy considers only a subset of cores, chosen using heuristics
● Might worsen the balancing when the spawn rate is high

Pages 82-83

Conclusion

● Scheduling is hard … and even harder on multicore machines
● Design and implementation choices can have a great influence on performance
● Neither scheduler performs better than the other on all workloads

Page 84

Questions?

Code available on GitHub: https://github.com/JBouron/linux/tree/loadbalancing

