Lecture 14 Multicore Scheduling.ppt [Compatibility Mode]
rtcc.hanyang.ac.kr/.../Lecture_14_Multicore_Scheduling.pdf

Real-Time Computing and Communications Lab., Hanyang University

Multicore Scheduling

Minsoo Ryu

Department of Computer Science and Engineering, Hanyang University



1. Global vs. Partitioned Scheduling

2. Other Scheduling Approaches

3. Q & A



Uniprocessor vs. Multiprocessor Scheduling

Uniprocessor scheduling
  Decides which job runs and when

Multiprocessor scheduling
  Decides not only when but also where a job runs
  Has almost the same goals as uniprocessor scheduling
  But it raises new issues
    • How to assign applications to multiple processors?
    • How to balance workload among processors?
    • How to define and exploit affinity?
    • How to manage processor heterogeneity?



Multiprocessor Scheduling Policies

The same policies as in uniprocessor scheduling
  Priority-based scheduling: FCFS, SJF, SRTF, RM, EDF
  Proportional share scheduling: PGPS, SFQ, WF2Q, Lottery and Stride, BVT, VTRR

Two approaches
  Global scheduling
    • The system has a single global process queue
    • Processes are dispatched to any available processor
  Partitioned scheduling
    • Each processor has a separate process queue
    • Each queue is scheduled by an independent scheduler
    • Process migration may or may not be allowed



Global vs. Partitioned Scheduling

[Figure: Global Scheduling and Partitioned Scheduling]



Global vs. Partitioned Scheduling

Global scheduling
  It is generally believed that global scheduling can achieve better performance
  However, it can be inefficient due to contention at the single queue and increased cache misses

Partitioned scheduling
  Performance can vary depending on the initial distribution of processes, i.e., a bin-packing problem
  Different scheduling policies can be employed across processors
  We can use the rich and extensive results from uniprocessor scheduling theory
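
The bin-packing view of the initial distribution can be sketched with a first-fit-decreasing heuristic. This is only an illustration, not a method prescribed by the lecture: the function name `first_fit_decreasing`, the task names T1 to T4, and their utilizations are hypothetical, and a capacity of 100% per processor is assumed.

```python
from fractions import Fraction

def first_fit_decreasing(utilizations, num_cpus):
    """Assign tasks to per-CPU queues, heaviest task first.

    Returns one list of task names per CPU, or None if some task
    does not fit on any CPU (assumed capacity: 1, i.e. 100%).
    """
    queues = [[] for _ in range(num_cpus)]
    load = [Fraction(0)] * num_cpus
    for name, u in sorted(utilizations.items(), key=lambda item: -item[1]):
        for cpu in range(num_cpus):
            if load[cpu] + u <= 1:
                queues[cpu].append(name)
                load[cpu] += u
                break
        else:
            return None  # no CPU can take this task
    return queues

# Hypothetical task set, utilizations given as exact fractions.
tasks = {
    "T1": Fraction(6, 10),
    "T2": Fraction(5, 10),
    "T3": Fraction(4, 10),
    "T4": Fraction(3, 10),
}
print(first_fit_decreasing(tasks, 2))
```

A different heuristic (best-fit, worst-fit) may produce a different partition, which is one reason performance depends on the initial distribution.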



Global EDF

Consider the following tasks
  Process X : period = 20, WCET = 15, deadline = 20
  Process Y : period = 30, WCET = 15, deadline = 30
  Process Z : period = 40, WCET = 10, deadline = 40

[Figure: Global EDF schedule on CPU #1 and CPU #2, time axis 0 to 60]
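
The global EDF schedule for this task set can be reproduced with a small simulation. The sketch below is a simplified model under stated assumptions: time advances in unit slots, all tasks release their first job at time 0, every job runs for its full WCET, and ties between equal deadlines are broken by task name. The function name `simulate_global_edf` is hypothetical.

```python
def simulate_global_edf(tasks, num_cpus, horizon):
    """Simulate global EDF in unit time slots.

    tasks maps a name to (period, wcet); deadline = period is assumed.
    Returns (timeline, misses), where timeline[t] lists the tasks running
    in slot [t, t+1) and misses lists (name, deadline) of late jobs.
    """
    ready = []  # pending jobs as [absolute_deadline, name, remaining]
    timeline, misses = [], []
    for t in range(horizon):
        # Release a new job of every task whose period starts now.
        for name, (period, wcet) in tasks.items():
            if t % period == 0:
                ready.append([t + period, name, wcet])
        # A job still pending at its deadline has missed it; drop it.
        misses += [(job[1], job[0]) for job in ready if job[0] <= t]
        ready = [job for job in ready if job[0] > t]
        # Single global queue: the earliest deadlines get the CPUs.
        ready.sort()
        running = ready[:num_cpus]
        for job in running:
            job[2] -= 1
        timeline.append([job[1] for job in running])
        ready = [job for job in ready if job[2] > 0]
    # Jobs left unfinished at a deadline that coincides with the horizon.
    misses += [(job[1], job[0]) for job in ready if job[0] <= horizon]
    return timeline, misses

tasks = {"X": (20, 15), "Y": (30, 15), "Z": (40, 10)}
# 120 is the hyperperiod (least common multiple of 20, 30, 40).
timeline, misses = simulate_global_edf(tasks, num_cpus=2, horizon=120)
print(misses)
print(timeline[:16])
```

The timeline records only which tasks run in each slot, not which CPU they occupy, so it does not show migrations between CPU #1 and CPU #2.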



Partitioned EDF

Consider the following tasks
  Process X : period = 20, WCET = 15, deadline = 20
  Process Y : period = 30, WCET = 15, deadline = 30
  Process Z : period = 40, WCET = 10, deadline = 40

[Figure: Partitioned EDF schedule, CPU #1 (X) and CPU #2 (Y, Z), time axis 0 to 60]



Schedulability Analysis

Global EDF
  There is no single efficient test
  Most tests are very complex

Partitioned EDF
  It is sufficient to check that the CPU utilization does not exceed 100% on each processor

……
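
The per-processor utilization check for partitioned EDF can be sketched as follows, using the partition from the Partitioned EDF slide (X on CPU #1; Y and Z on CPU #2). The sketch assumes periodic tasks with deadline equal to period, as in the example; the function name `edf_partition_check` is hypothetical.

```python
from fractions import Fraction

def edf_partition_check(partition):
    """partition maps a CPU name to a list of (period, wcet) tasks.

    Returns (utilization per CPU, True if every CPU is at or below 100%).
    """
    util = {
        cpu: sum((Fraction(wcet, period) for period, wcet in tasks), Fraction(0))
        for cpu, tasks in partition.items()
    }
    return util, all(u <= 1 for u in util.values())

partition = {
    "CPU #1": [(20, 15)],            # X
    "CPU #2": [(30, 15), (40, 10)],  # Y, Z
}
util, ok = edf_partition_check(partition)
print(util, ok)
```

Each queue is tested independently, which is what lets the uniprocessor results carry over to the partitioned case.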



Global WFQ

Consider the following tasks
  Process A : weight = 1
  Process B : weight = 2
  Process C : weight = 2
  Process D : weight = 4

[Figure: Global WFQ schedule on CPU #1 and CPU #2]
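
As a reference point for the schedule, the ideal proportional share of each process can be computed from the weights. The sketch below computes the fluid allocation that a weighted fair scheduler tries to approximate, not the WFQ algorithm itself; the function name `ideal_global_shares` is hypothetical, and all processes are assumed to be runnable at all times.

```python
from fractions import Fraction

def ideal_global_shares(weights, num_cpus):
    """Return each process's ideal share, in units of one CPU."""
    total = sum(weights.values())
    shares = {name: Fraction(w * num_cpus, total) for name, w in weights.items()}
    # One process cannot use more than one CPU at a time; this sketch
    # does not redistribute the excess, so it rejects such weight sets.
    if any(s > 1 for s in shares.values()):
        raise ValueError("a share exceeds one CPU; weights need readjusting")
    return shares

weights = {"A": 1, "B": 2, "C": 2, "D": 4}
shares = ideal_global_shares(weights, num_cpus=2)
for name, share in shares.items():
    print(name, share)
```

With a total weight of 9 on two CPUs, the heaviest process D is entitled to 8/9 of a CPU, which still fits on a single processor.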



Partitioned WFQ with Load Balancing

Consider the following tasks
  Process A : weight = 1
  Process B : weight = 2
  Process C : weight = 2
  Process D : weight = 4

[Figure: Partitioned WFQ schedule, CPU #1 (A, D) and CPU #2 (B, C), with processes migrating between the CPUs]
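
Why migration is needed can be seen by comparing the share each process gets inside its own queue with its ideal machine-wide share. The sketch below assumes that every process is always runnable and that each CPU divides its time among its own queue in proportion to weight; the function names are hypothetical.

```python
from fractions import Fraction

def partitioned_shares(partition, weights):
    """Share of one CPU each process gets within its own queue."""
    shares = {}
    for names in partition.values():
        queue_weight = sum(weights[n] for n in names)
        for n in names:
            shares[n] = Fraction(weights[n], queue_weight)
    return shares

def ideal_shares(weights, num_cpus):
    """Machine-wide proportional share, in units of one CPU."""
    total = sum(weights.values())
    return {n: Fraction(w * num_cpus, total) for n, w in weights.items()}

weights = {"A": 1, "B": 2, "C": 2, "D": 4}
partition = {"CPU #1": ["A", "D"], "CPU #2": ["B", "C"]}

local = partitioned_shares(partition, weights)
ideal = ideal_shares(weights, num_cpus=2)
for name in weights:
    print(name, "partitioned:", local[name], "ideal:", ideal[name])
```

Without migration, A and D (queue weight 5) receive less than their ideal share and B and C (queue weight 4) receive more; load balancing corrects this imbalance over time.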



Other Scheduling Approaches

Coscheduling
  Gang Scheduling and Symbiotic Scheduling

Processor sets
Locality and cache issues
Processor Heterogeneity



Gang Scheduling

First proposed by Ousterhout in 1982
  Also known as coscheduling

Basic idea
  All processes of an application are scheduled to run simultaneously
  When a time slice ends, all running processes are preempted simultaneously, and all processes from a second application are scheduled for the next time slice
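
The basic idea can be sketched as a round-robin over applications, one application per time slice. This is a deliberately simplified model: it never packs two applications into the same slice, and the function name `gang_schedule` and the application names are hypothetical.

```python
from itertools import cycle, islice

def gang_schedule(apps, num_cpus, num_slices):
    """apps maps an application name to its number of processes.

    Returns one row per time slice; row[i] is the process on CPU i,
    or None if that CPU is idle during the slice.
    """
    for name, count in apps.items():
        if count > num_cpus:
            raise ValueError(f"{name} needs more CPUs than the machine has")
    schedule = []
    # All processes of one application run together, then all are
    # preempted together and the next application takes over.
    for app in islice(cycle(apps), num_slices):
        count = apps[app]
        row = [f"{app}.{i}" for i in range(count)]
        schedule.append(row + [None] * (num_cpus - count))
    return schedule

for row in gang_schedule({"app1": 4, "app2": 3}, num_cpus=4, num_slices=3):
    print(row)
```

In the second slice one CPU stays idle because app2 has only three processes, a small instance of processor fragmentation.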



Some Properties of Gang Scheduling

Advantages
  Gang scheduling solves the problems associated with synchronization and communication among related processes

Disadvantages
  Gang scheduling is a centralized scheduling strategy, which can become a bottleneck on large machines
  It can result in poor cache performance
  It can lead to fragmentation of processors when applications do not need all of the processors in the system but do not leave enough free processors to fit all of the processes of another application



Symbiotic Scheduling

Proposed by Snavely et al. in 1999
  The term symbiosis was introduced to refer to an increase in throughput that can occur when particular jobs are coscheduled on multithreaded machines
  Throughput may go up or down depending on how well the jobs in the running set symbiose, or 'get along'

Symbiosis measurement
  Throughput rate TR

••



Processor Sets

Proposed by Black in 1990
  The machine is partitioned into sets of processors, each of which executes a single parallel application
  Using processor sets can also ensure that different applications get an equal portion of the machine (equi-partitioning)
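
Equi-partitioning can be sketched as an even split of CPU indices among the applications. The function name `equipartition` and the application names are hypothetical; when the CPUs do not divide evenly, this sketch gives the leftover CPUs to the first applications, which is one possible choice among several.

```python
def equipartition(num_cpus, apps):
    """Split CPUs 0..num_cpus-1 into one processor set per application."""
    base, extra = divmod(num_cpus, len(apps))
    sets, next_cpu = {}, 0
    for i, app in enumerate(apps):
        # The first `extra` applications each receive one additional CPU.
        size = base + (1 if i < extra else 0)
        sets[app] = list(range(next_cpu, next_cpu + size))
        next_cpu += size
    return sets

print(equipartition(8, ["app1", "app2", "app3"]))
```

Each application then schedules its own processes only within its processor set.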



Thread Clustering

Proposed by D. Tam, R. Azimi, and M. Stumm in 2007

Motivation
  Cost of cross-chip sharing is high

Key ideas
  Detect thread sharing patterns using performance counters
  Place threads that heavily share data on the same chip



Processor Heterogeneity

ASISA (Asymmetric Single ISA) processors
  Processors have the same ISA, but different performance characteristics (clock speed, power, cache size, …)

Heterogeneity, scheduling, and performance


