34
OS Scheduling and The Anatomy of a context switch Daniel Ben-Zvi

OS scheduling and The anatomy of a context switch

Embed Size (px)

Citation preview

Page 1: OS scheduling and The anatomy of a context switch

OS Scheduling and The Anatomy of a context

switchDaniel Ben-Zvi

Page 2: OS scheduling and The anatomy of a context switch

Multi-tasking

Page 3: OS scheduling and The anatomy of a context switch

Scheduling in the modern world● Pause a running process in the middle of its execution

● Resume a previously paused process that is ready to

execute

● Do this (tens) of thousands of times per second

Page 4: OS scheduling and The anatomy of a context switch

But it wasn’t always like this

Page 5: OS scheduling and The anatomy of a context switch

No Scheduler

Only one process can operate at a time.

A single process can hog the whole system forever.

Page 6: OS scheduling and The anatomy of a context switch

Digger sleep function, 1982sleep(time)

int time;

{

int a,b;

for(a=0; a<time; a++) {

for(b=0; b<100; b++);

}

}

So this is why pressing Turbo would speed it up! :)

3(source: http://www.digger.org/digsrc_orig.zip)

Page 7: OS scheduling and The anatomy of a context switch

And then they said.. lets cooperate!

Page 8: OS scheduling and The anatomy of a context switch

Cooperative scheduler

Each process must “play nice” and yield control to other processes.

… but a single misbehaving process can still hog the whole system forever.

(the sad reality of a kibbutz)

Page 9: OS scheduling and The anatomy of a context switch

Sleep method designed for MacOS 8 (in C)void wxThread::Sleep(unsigned long milliseconds)

{

UnsignedWide start, now;

Microseconds(&start);

double mssleep = milliseconds * 1000 ;

double msstart, msnow ;

msstart = (start.hi * 4294967296.0 + start.lo) ;

do

{

YieldToAnyThread();

Microseconds(&now);

msnow = (now.hi * 4294967296.0 + now.lo) ;

} while( msnow - msstart < mssleep );

}

// Start a busy loop till time has passed

// Pass control to other threads (in the active process context)

(source: https://github.com/LuaDist/wxwidgets/blob/master/src/mac/classic/thread.cpp#L539)

Page 10: OS scheduling and The anatomy of a context switch

Non preemptiveA process will eventually misbehave...

Page 11: OS scheduling and The anatomy of a context switch
Page 12: OS scheduling and The anatomy of a context switch

Round Robin (Time sharing)

(source: http://www.qnx.com/developers/docs/qnx_4.25_docs/qnx4/sysarch/microkernel.html)

Page 13: OS scheduling and The anatomy of a context switch

but.. tasks are different in nature!● Some tasks require CPU time to complete

● others require I/O (waiting for user input, network data, disk)

● Some just sleep most of the time

Page 14: OS scheduling and The anatomy of a context switch

Multilevel feedback queue

Page 15: OS scheduling and The anatomy of a context switch

And many more..

Page 16: OS scheduling and The anatomy of a context switch

So, how do you interrupt a running process in the middle of its execution, and then resume the work later on?

Page 17: OS scheduling and The anatomy of a context switch

Use a timer interrupt!● Ask the processor to interrupt the active process and wake up

the kernel every X interval (in the goold old Linux days, this was defined as CONFIG_HZ):

void do_timer(struct pt_regs *regs) // Linux kernel (some version) interrupt handler{ jiffies_64++;

update_process_times(user_mode(regs)); update_times();}

Page 18: OS scheduling and The anatomy of a context switch

The “context” in this programbits 16 ; 16 bits real mode

start:

cli ; disable interrupts

mov si, msg ; SI points to RAM address of msg

mov ah, 0x0e ; print char service, with int 0x10

.loop lodsb ; load RAM: AL <- [DS:SI] && SI++

or al, al ; end of string?

jz halt

int 0x10 ; call BIOS print char

jmp .loop ; next char

halt: hlt ; halt

msg: db "Hello, World!", 0

Page 19: OS scheduling and The anatomy of a context switch

# void swtch(struct context **old, struct context *new);

#

# Save current register context in old

# and then load register context from new.

.globl swtch

swtch:

movl 4(%esp), %eax

movl 8(%esp), %edx

# Save old callee-save registers

pushl %ebp ; Save old process ebp (base pointer) register

pushl %ebx ; Save old process ebx (general) register

pushl %esi ; Save old process esi (source index) register

pushl %edi ; Save old process edi (destination index) register

# Switch stacks

movl %esp, (%eax)

movl %edx, %esp

# Load new callee-save registers

popl %edi ; Load new process edi

popl %esi ; Load new process esi

popl %ebx ; Load new process ebx

popl %ebp ; Load new process ebp

ret

(source: http://samwho.co.uk/blog/2013/06/01/context-switching-on-x86/)

Page 20: OS scheduling and The anatomy of a context switch

Finding out the context switch rate[us-east-1c] i-91xxxxxx [root:~]$ vmstat -w 10

procs ---------------memory-------------- ---swap-- -----io---- -system-- ------cpu-----

r b swpd free buff cache si so bi bo in cs us sy id wa st

0 0 0 3275668 48308 370264 0 0 0 0 12504 8884 1 5 94 0 0

0 0 0 3275492 48308 370264 0 0 0 0 12933 8834 2 5 93 0 0

1 0 0 3275588 48308 370264 0 0 0 0 12490 8825 2 4 94 0 0

Page 21: OS scheduling and The anatomy of a context switch

Using /usr/bin/time[us-east-1c] i-xxxxxxxx [root:~]$ /usr/bin/time -v find / > /dev/null

Command being timed: "find /"

User time (seconds): 0.12

System time (seconds): 0.30

Percent of CPU this job got: 6%

Elapsed (wall clock) time (h:mm:ss or m:ss): 0:06.79

...

Voluntary context switches: 7362

Involuntary context switches: 60

...

File system inputs: 58888

Page 22: OS scheduling and The anatomy of a context switch

Voluntary vs Involuntary● Non voluntary context switches occur when the

scheduler interrupts the process (i.e. timeslice expired)

● Voluntary context switches occur when the process yields control to the CPU (i.e. waiting for I/O, sleeping)

Page 23: OS scheduling and The anatomy of a context switch

Using pidstat[us-east-1c] i-xxxxxxxx [root:~]$ pidstat -w -p 1504

Linux 3.13.0-55-generic (ip-10-xx-xx-x8) 06/21/2015 _x86_64_ (2 CPU)

09:24:23 PM UID PID cswch/s nvcswch/s Command

09:24:23 PM 0 1504 531.40 3.45 haproxy

Page 24: OS scheduling and The anatomy of a context switch

Some benchmarks● No CPU affinity

○ Intel 5150: ~4300ns/context switch○ Intel E5440: ~3600ns/context switch○ Intel E5520: ~4500ns/context switch○ Intel X5550: ~3000ns/context switch○ Intel L5630: ~3000ns/context switch○ Intel E5-2620: ~3000ns/context switch

● With CPU affinity○ Intel 5150: ~1900ns/process context switch, ~1700ns/thread context switch○ Intel E5440: ~1300ns/process context switch, ~1100ns/thread context switch○ Intel E5520: ~1400ns/process context switch, ~1300ns/thread context switch○ Intel X5550: ~1300ns/process context switch, ~1100ns/thread context switch○ Intel L5630: ~1600ns/process context switch, ~1400ns/thread context switch○ Intel E5-2620: ~1600ns/process context switch, ~1300ns/thread context switch

Performance boost: 5150: 66%, E5440: 65-70%, E5520: 50-54%, X5550: 55%, L5630: 45%, E5-2620: 45%.

(source: http://blog.tsunanet.net/2010/11/how-long-does-it-take-to-make-context.html)

Page 25: OS scheduling and The anatomy of a context switch

The hidden cost

● TLB flush

● Potentially screw up L1,L2,L3 CPU cache

● Kernel tries very hard to schedule threads on the same Core/CPU/Numa node, but sometimes it can't.

Page 26: OS scheduling and The anatomy of a context switch

Without CPU affinity

(source: http://blog.tsunanet.net/2010/11/how-long-does-it-take-to-make-context.html)

Page 27: OS scheduling and The anatomy of a context switch

With CPU affinity

(source: http://blog.tsunanet.net/2010/11/how-long-does-it-take-to-make-context.html)

Page 28: OS scheduling and The anatomy of a context switch

Without CPU affinity With CPU affinity

(source: http://blog.tsunanet.net/2010/11/how-long-does-it-take-to-make-context.html)

Page 29: OS scheduling and The anatomy of a context switch

Full context switch

● Save current process context ● TLB flush ● Potentially screw up L1,L2,L3 CPU cache● Load next process context● Can be done in hardware - but is usually

done in software.

Page 30: OS scheduling and The anatomy of a context switch

So how much is too much?

10k c.s. @ 5 micro-seconds per c.s. = 50ms CPU time spent on context switching.

If this was 100k...

Page 31: OS scheduling and The anatomy of a context switch

Some formulas● IO bound tasks - threads = number of cores * (1 + wait time / service time)

● Balanced - N = U * Ncores * (1+ I/O Wait time / CPU time) ^^ same as above without the U factor

● CPU bound tasks - threads = number of CPUs + 1 ^^ same as above but W == 0

see y i like this version of the formula? yes.I think this is a pretty good presentation.. :Pyep, but it looks like we’ll keep adding slides until tomorrow night…. lol Most probably you’re right - but this covers great thingswe have to stop somewhere. also, we need to get drunk ;)

Page 32: OS scheduling and The anatomy of a context switch

Conclusions● Scheduling is complicated

● Context switches can be very expensive

● Find the right balance when deciding how many workers to use

● Use CPU affinity to reduce cost where applicable

● And most important lesson, don't overload the scheduler with runnable threads/processes

Page 33: OS scheduling and The anatomy of a context switch

Thank youDaniel Ben-Zvi | [email protected]