Scheduling and Dispatch
Instructor: Hengming Zou, Ph.D.
In Pursuit of Absolute Simplicity 求于至简,归于永恒
Content
4.1. The Concept of Processes and Threads
4.2. Windows Processes and Threads
4.3. Windows Process and Thread Internals
4.4. Windows OS Thread Scheduling
4.5. Advanced Windows Scheduling
Process Concept
An operating system executes programs:
– Batch system – jobs
– Time-shared systems – user programs or tasks
Process – a program in execution
– Process execution must progress sequentially
A process includes:
– CPU state (one or multiple threads)
– Text & data section
– Resources such as open files, handles, sockets
Traditionally, the process was the unit of scheduling (i.e., no threads)
However, like most modern operating systems, Windows schedules threads
Our discussion assumes thread scheduling
Thread States
Five-state diagram for thread scheduling:
– init: The thread is being created
– ready: The thread is waiting to be assigned to a CPU
– running: The thread’s instructions are being executed
– waiting: The thread is waiting for some event to occur
– terminated: The thread has finished execution
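[Figure: Thread States. Transitions: init → ready (admitted); ready → running (scheduler dispatch); running → ready (interrupt, quantum expired); running → waiting (waiting for I/O or event); waiting → ready (I/O or event completion); running → terminated (exit)]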
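The five-state diagram can be read as a small transition table. The sketch below is one way to encode it in portable C; the names `thread_state` and `transition_allowed` are invented for illustration and do not correspond to any Windows data structure.

```c
#include <assert.h>
#include <stdbool.h>

/* The five states of the diagram (names are illustrative only). */
typedef enum {
    T_INIT, T_READY, T_RUNNING, T_WAITING, T_TERMINATED, T_NUM_STATES
} thread_state;

/* Returns true if the five-state diagram has an edge from `from` to `to`. */
bool transition_allowed(thread_state from, thread_state to)
{
    assert(from < T_NUM_STATES && to < T_NUM_STATES);
    switch (from) {
    case T_INIT:
        return to == T_READY;            /* admitted */
    case T_READY:
        return to == T_RUNNING;          /* scheduler dispatch */
    case T_RUNNING:
        return to == T_READY             /* interrupt, quantum expired */
            || to == T_WAITING           /* waiting for I/O or event */
            || to == T_TERMINATED;       /* exit */
    case T_WAITING:
        return to == T_READY;            /* I/O or event completion */
    default:
        return false;                    /* terminated is a final state */
    }
}
```

Note that a waiting thread never moves directly to running: it first becomes ready and must be dispatched again by the scheduler.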
Process and Thread Control Blocks
Information associated with process: Process Control Block (PCB)
– Memory management information
– Accounting information
– Process-global vs. thread-specific
Information associated with thread: Thread Control Block (TCB)
– Program counter
– CPU registers
– CPU scheduling information
– Pending I/O information
Process Control Block (PCB)
Windows implementation of PCB is split into multiple data structures
[Figure: layout of a PCB and its list of TCBs]
PCB: Process ID (PID), Parent PID, …, Next Process Block, Handle Table, Image File Name, List of open files, List of Thread Control Blocks, …
Thread Control Block (TCB): Program Counter, Registers, …, Next TCB
CPU Switch from Thread to Thread
[Figure: CPU switch between threads T1 and T2. T1 is executing while T2 is ready or waiting; on an interrupt or system call the kernel saves state into TCB1 and reloads state from TCB2; T2 executes while T1 is ready or waiting; on the next interrupt or system call the kernel saves state into TCB2 and reloads state from TCB1; T1 executes again]
Context Switch
Save the state of the old thread and load the saved state for the new thread
Context-switch time is overhead
Thread context-switch can be implemented in kernel or user mode
Interaction with MMU is required when switching between threads in different processes
Thread Scheduling Queues
Ready queue
– Maintains set of all threads ready and waiting to execute
– There might be multiple ready queues, sorted by priorities
Device queue
– Maintains set of threads waiting for an I/O device
– There might be multiple queues for different devices
Threads migrate between the various queues
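A minimal sketch of "multiple ready queues, sorted by priorities": one FIFO queue per priority level, with the scheduler always taking the oldest thread from the highest non-empty level. The constants `NUM_PRIORITIES` and `QUEUE_CAPACITY` and all function names are assumptions made for this example, not values or APIs from any real kernel.

```c
#include <assert.h>
#include <string.h>

#define NUM_PRIORITIES 4   /* hypothetical number of priority levels */
#define QUEUE_CAPACITY 8   /* hypothetical per-level capacity */

typedef struct {
    int ids[QUEUE_CAPACITY];
    int head;    /* index of the oldest entry */
    int count;   /* number of queued thread ids */
} fifo_queue;

typedef struct {
    fifo_queue level[NUM_PRIORITIES];   /* index = priority, higher is more urgent */
} ready_queues;

void rq_init(ready_queues *rq)
{
    memset(rq, 0, sizeof *rq);
}

/* Appends a ready thread to the queue of its priority; returns -1 if that queue is full. */
int rq_enqueue(ready_queues *rq, int priority, int thread_id)
{
    assert(priority >= 0 && priority < NUM_PRIORITIES);
    fifo_queue *q = &rq->level[priority];
    if (q->count == QUEUE_CAPACITY)
        return -1;
    q->ids[(q->head + q->count) % QUEUE_CAPACITY] = thread_id;
    q->count++;
    return 0;
}

/* Removes and returns the oldest thread of the highest non-empty priority, or -1 if none is ready. */
int rq_pick_next(ready_queues *rq)
{
    for (int p = NUM_PRIORITIES - 1; p >= 0; p--) {
        fifo_queue *q = &rq->level[p];
        if (q->count > 0) {
            int id = q->ids[q->head];
            q->head = (q->head + 1) % QUEUE_CAPACITY;
            q->count--;
            return id;
        }
    }
    return -1;
}
```

With threads 10 and 11 queued at priority 1 and threads 30 and 31 at priority 3, successive calls to `rq_pick_next` return 30, 31, 10, 11.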
Ready Queue and I/O Device Queues
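[Figure: queueing diagram. Threads are dispatched from the ready queue to the CPU and leave it on release; on time-out they return to the ready queue; when I/O occurs they enter one of the I/O queues (I/O 1 … I/O n), wait for that device, and then rejoin the ready queue]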
Optimization Criteria
CPU scheduling uses heuristics to manage the tradeoffs among contradicting optimization criteria.
Schedulers are optimized for certain workloads
– Interactive vs. batch processing
– I/O-intense vs. compute-intense
Common Optimization Criteria
Maximize CPU utilization
Maximize throughput
Minimize turnaround time
Minimize waiting time
Minimize response time
Basic Scheduling Considerations
What invokes the scheduler?
Which assumptions should a scheduler rely on?
What are its optimization goals?
Rationale:
– Multiprogramming maximizes CPU utilization
– Thread execution experiences cycles of compute- and I/O-bursts
– Scheduler should consider CPU burst distribution
Alternating Sequence of CPU and I/O Bursts
[Figure: a thread alternating between CPU bursts and I/O bursts]
…
CPU burst: load val; inc val; read file
I/O burst: wait for I/O
CPU burst: inc count; add data, val; write file
I/O burst: wait for I/O
CPU burst: load val; inc val; read from file
I/O burst: wait for I/O
…
Threads can be described as:
I/O-bound – spends more time doing I/O than computations
– many short CPU bursts
CPU-bound – spends more time doing computations
– few very long CPU bursts
Histogram of CPU-burst Times
[Figure: histogram of the CPU-burst distribution; x-axis: burst duration (msec), 0 to 30]
Many short CPU bursts are typical
Exact figures vary greatly by process and computer
Schedulers
Long-term scheduler (or job scheduler)
– Selects which processes with their threads should be brought into the ready queue
– Takes memory management (MM) into consideration (swapped-out processes)
– Controls degree of multiprogramming
– Invoked infrequently, may be slow
Short-term scheduler (or CPU scheduler)
– Selects which thread should be executed next and allocates the CPU
– Invoked frequently, must be fast
Windows has no dedicated long-term scheduler
CPU Scheduler
Select from among the threads in memory that are ready to execute, and allocate the CPU to one of them
CPU scheduling decisions may take place when a thread
– 1. Switches from running to waiting state
– 2. Switches from running to ready state
– 3. Switches from waiting to ready state
– 4. Terminates
Scheduling under 1 and 4 is nonpreemptive
All other scheduling is preemptive
Dispatcher
Dispatcher module gives control of CPU to the thread selected by the short-term scheduler; this involves:
– switch context
– switch to user mode
– jump to proper location in user program to restart that program
Dispatch latency – time it takes for the dispatcher to stop one thread and start another running
Windows scheduling is event-driven
– No central dispatcher module in the kernel
Scheduling Algorithms: FIFO
First-In, First-Out
Also known as First-Come, First-Served (FCFS)
Thread Burst Time
T1 20
T2 5
T3 4
Suppose threads arrive in the order: T1, T2, T3
– The Gantt chart for the schedule is:
T1 (0–20) | T2 (20–25) | T3 (25–29)
Waiting time for T1 = 0; T2 = 20; T3 = 25
Average waiting time: (0 + 20 + 25)/3 = 15
Convoy effect:
– short threads behind long threads experience long waiting times
FIFO Scheduling (Cont.)
Suppose that the threads arrive in the order T2, T3, T1.
The Gantt chart for the schedule is:
T2 (0–5) | T3 (5–9) | T1 (9–29)
Waiting time for T1 = 9; T2 = 0; T3 = 5
Average waiting time: (9 + 0 + 5)/3 ≈ 4.67
Much better than previous case
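The waiting times in both FIFO examples follow from a running sum of burst times. A minimal sketch in portable C, assuming all threads are ready at time 0 and are served in array order (the function name `fifo_waiting_times` is made up for this example):

```c
#include <assert.h>
#include <stddef.h>

/*
 * FIFO (FCFS) scheduling with all threads ready at time 0.
 * bursts[i] is the CPU burst of the i-th thread in arrival order.
 * Stores each thread's waiting time in waits[i] and returns the average.
 */
double fifo_waiting_times(const int *bursts, size_t n, int *waits)
{
    assert(n > 0);
    int clock = 0;   /* time at which the next thread starts running */
    int total = 0;   /* sum of all waiting times */
    for (size_t i = 0; i < n; i++) {
        waits[i] = clock;    /* a thread waits until all earlier ones finish */
        total += clock;
        clock += bursts[i];
    }
    return (double)total / (double)n;
}
```

Called with bursts {20, 5, 4} (order T1, T2, T3) it yields waits 0, 20, 25 and an average of 15; with {5, 4, 20} (order T2, T3, T1) it yields waits 0, 5, 9 and an average of about 4.67.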
Scheduling Algorithms: Round Robin (RR)
Preemptive version of FIFO scheduling algorithm
Each thread gets a small unit of CPU time (quantum),
– usually 10-100 milliseconds
After this time has elapsed, the thread is preempted and added to the end of the ready queue
With n ready threads, each thread gets 1/n of the CPU time, in chunks of at most q time units (the quantum)
With n threads, no thread waits more than (n-1)q time units
Performance
– q large ⇒ RR behaves like FIFO
– q small ⇒ q must still be large with respect to context-switch time, otherwise overhead is too high
Example of RR with Quantum = 10
Assume we have:
– Thread Burst Time
– T1 23
– T2 7
– T3 38
– T4 14
Assuming all threads have the same priority, the Gantt chart is:
T1 (0–10) | T2 (10–17) | T3 (17–27) | T4 (27–37) | T1 (37–47) | T3 (47–57) | T4 (57–61) | T1 (61–64) | T3 (64–74) | T3 (74–82)
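The Gantt chart can be reproduced with a few lines of simulation. The sketch below assumes all threads are ready at time 0, ignores context-switch cost, and records one slice per quantum; `slice`, `round_robin`, and `MAX_THREADS` are names invented for this example.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_THREADS 16   /* hypothetical limit for this sketch */

typedef struct {
    int thread;   /* index into bursts[] (0 = T1, 1 = T2, ...) */
    int start;    /* time the slice begins */
    int end;      /* time the slice ends */
} slice;

/*
 * Round-robin with all threads ready at time 0 and zero switch cost.
 * Because a preempted thread goes to the end of the ready queue, cycling
 * through the array and skipping finished threads gives the same order.
 * Writes the schedule to out[] and returns the number of slices.
 */
size_t round_robin(const int *bursts, size_t n, int quantum,
                   slice *out, size_t max_out)
{
    assert(n > 0 && n <= MAX_THREADS && quantum > 0);
    int remaining[MAX_THREADS];
    size_t unfinished = 0;
    for (size_t i = 0; i < n; i++) {
        assert(bursts[i] >= 0);
        remaining[i] = bursts[i];
        if (bursts[i] > 0)
            unfinished++;
    }

    int clock = 0;
    size_t used = 0;
    while (unfinished > 0) {
        for (size_t i = 0; i < n; i++) {
            if (remaining[i] == 0)
                continue;                       /* thread already finished */
            int run = remaining[i] < quantum ? remaining[i] : quantum;
            assert(used < max_out);
            out[used].thread = (int)i;
            out[used].start = clock;
            out[used].end = clock + run;
            used++;
            clock += run;
            remaining[i] -= run;
            if (remaining[i] == 0)
                unfinished--;
        }
    }
    return used;
}
```

For bursts {23, 7, 38, 14} and a quantum of 10 this produces ten slices ending at 10, 17, 27, 37, 47, 57, 61, 64, 74, 82, in the thread order T1 T2 T3 T4 T1 T3 T4 T1 T3 T3.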
Round-Robin favors CPU-intense over I/O-intense threads
Priority-elevation after I/O completion can provide a compensation
Windows uses Round-Robin with a priority-elevation scheme
Round Robin Performance
Shorter quantum yields more context switches
Longer quantum yields shorter average turnaround time
Thread execution time: 15
quantum | context switches
20 | 0
10 | 1
1 | 14
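The context-switch counts for an execution time of 15 follow a simple rule: the thread needs ceil(15/q) time slices and is switched out between consecutive slices. A small sketch, assuming a single burst and counting only switches caused by quantum expiry (the function name is illustrative):

```c
#include <assert.h>

/* Number of quantum expirations a thread with the given CPU burst experiences. */
int quantum_expirations(int burst, int quantum)
{
    assert(burst > 0 && quantum > 0);
    /* ceil(burst / quantum) slices are needed; switches happen between slices. */
    return (burst + quantum - 1) / quantum - 1;
}
```

For a burst of 15, quanta of 20, 10, and 1 give 0, 1, and 14 switches respectively.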
Scheduling Algorithms: Priority Scheduling
A priority number (integer) is associated with each thread
CPU is allocated to the thread with the highest priority
– Preemptive
– Non-preemptive
Priority Scheduling - Starvation
Starvation is a problem:
– low priority threads may never execute
Solutions:
– 1) Decreasing priority & aging: the Unix approach
  Decrease priority of CPU-intense threads
  Exponential averaging of CPU usage to slowly increase priority of blocked threads
– 2) Priority Elevation: the Windows/VMS approach
  Increase priority of a thread on I/O completion
  System gives starved threads an extra burst
Multilevel Queue
Ready queue is partitioned into separate queues:
– Real-time (system, multimedia)
– Interactive
Each queue may have its own scheduling algorithm
– Real-Time – RR
– Interactive – RR + priority-elevation + quantum stretching
Scheduling must be done between the queues
Fixed priority scheduling (i.e., serve all from real-time threads then from interactive)
– Possibility of starvation
Time slice – each queue gets a certain amount of CPU time which it can schedule amongst its threads
– CPU reserves
Multilevel Queue Scheduling
Windows uses strict Round-Robin for real-time threads
Priority-elevation can be disabled for non-RT threads
[Figure: ready queues ordered from high priority to low priority]
High priority
– Real-time system threads
– Real-time user threads
– System threads
– Interactive user threads
– Background threads
Low priority
Process Creation
Parent process creates child processes, which create other processes, forming a tree of processes
– Processes start with one initial thread
Resource sharing models
– Parent and children share all resources
– Children share subset of parent’s resources
– Parent and child share no resources
Execution
– Parent’s and children’s threads execute concurrently
– Parent waits until children terminate
Process Creation (Cont.)
How to set up an address space
– Child can be duplicate of parent
– Child may have a program loaded into it
UNIX example
– fork() system call creates new process
– exec() system call used after a fork to replace the process’ memory space with a new program
Windows example
– CreateProcess() API call creates a new process and loads a program for execution
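The UNIX half of this slide can be illustrated with a short POSIX sketch: the parent calls fork(), the child replaces its memory image with exec, and the parent collects the return code via wait. The helper name `run_in_child` and the use of `/bin/sh -c` are choices made for this example; the code is POSIX-only and does not apply to the Windows API.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Runs `command` through /bin/sh in a child process.
 * Returns the child's exit status, or -1 if fork/wait failed or the
 * child did not exit normally.
 */
int run_in_child(const char *command)
{
    assert(command != NULL);

    pid_t pid = fork();              /* child starts as a duplicate of the parent */
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: replace the duplicated memory image with a new program. */
        execl("/bin/sh", "sh", "-c", command, (char *)NULL);
        _exit(127);                  /* reached only if exec failed */
    }

    /* Parent: wait until the child terminates and fetch its return code. */
    int status = 0;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

For example, `run_in_child("exit 7")` returns 7 in the parent. On Windows, CreateProcess() combines the creation and program-loading steps into a single call.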
[Figure: Process Tree on a UNIX System]
Process Termination
Last thread inside a process executes last statement and returns control to operating system (exit)
– Parent may receive return code (via wait)
– Process’ resources are deallocated by OS
Parent may terminate child processes (kill)
– Child has exceeded allocated resources
– Task assigned to child is no longer required
– Parent is exiting
OS typically does not allow child to continue if its parent terminates (depending on creation flags)
– Cascading termination inside process groups
Single and Multithreaded Processes
[Figure: a single-threaded process holds code, data, and files plus one thread with its own registers and stack; a multi-threaded process shares code, data, and files among several threads, each with its own registers and stack]
Benefits of Multithreading
Higher Responsiveness
– Dedicated threads for handling user events
Simpler Resource Sharing
– All threads in a process share same address space
Economy - fewer context switches
– If threading implemented in user-space
Utilization of Multiprocessor Architectures
– Multiple threads may run in parallel
User Threads
Thread management within a user-level threads library
– Process is unit of CPU scheduling from kernel perspective
Examples
– POSIX Pthreads
– Mach C-threads
– Solaris threads
– Fibers on Windows
Kernel Threads
Supported by the Kernel
– Thread is unit of CPU scheduling
Examples
– Windows
– Solaris
– OSF/1
– Linux
  Tasks can act like threads by sharing kernel data structures
Multithreading Models
How are user-level threads mapped on kernel threads?
Many-to-One
– Many user-mode threads mapped on a single kernel thread
One-to-One
– Each user-mode thread mapped on a separate kernel thread
Many-to-Many
– Set of user-mode threads mapped on set of kernel threads
Many-to-One Model
Used on systems that do not support kernel threads
Example:
– POSIX Pthreads
– Mach C-Threads
[Figure: three user threads mapped onto a single kernel thread]
One-to-One Model
Each user-level thread maps to kernel thread
Examples
– Windows
– OS/2
[Figure: three user threads, each mapped onto its own kernel thread]
Many-to-Many Model
Allows many user level threads to be mapped to many kernel threads.
Allows OS to create a sufficient number of kernel threads.
Example
– Solaris 2
[Figure: three user threads multiplexed onto two kernel threads]
Problems with Multithreading
Semantics of fork()/exec() or CreateProcess() system calls
Coordinated termination
Signal handling
Global data, errno, error handling
Thread specific data
Reentrant vs. non-reentrant system calls
Pthreads
A POSIX standard (IEEE 1003.1c) API for thread creation and synchronization
API specifies behavior of the thread library, not an implementation
Implemented on many UNIX operating systems
Services for Unix (SFU) implement PThreads on Windows
4.2. Windows Processes and Threads
Windows Processes
What is a process?
– Represents an instance of a running program
  You create a process to run a program
  Starting an application creates a process
– Process defined by:
  Address space
  Resources (e.g. open handles)
  Security profile (token)
Every process starts with one thread
– First thread executes the program’s “main” function
  Can create other threads in the same process
  Can create additional processes
Windows Threads
What is a thread?
– An execution context within a process
– Unit of scheduling (threads run, processes don’t run)
All threads in a process share same process address space
– Services provided so threads can synchronize access to shared resources (critical sections, mutexes, events, semaphores)
All threads in the system are scheduled as peers to all others, without regard to their “parent” process
Per-Process Data
Virtual address space
– program code, global storage, heap storage, threads’ stacks
Working set
– physical memory “owned” by the process
Access token
– includes security identifiers
Handle table for Windows kernel objects
Environment strings
Command line
These are common to all threads in the process, but separate and protected between processes
Per-Thread Data
User-mode stack
– arguments passed to thread, automatic storage, call frames
Kernel-mode stack (for system calls)
Thread Local Storage (TLS)
– array of pointers to allocate unique data
Scheduling state (Wait, Ready, Running, etc.) and priority
Hardware context
– Program counter, stack pointer, register values
– Current access mode (user mode or kernel mode)
– (saved in CONTEXT structure if not running)
Access token (optional -- overrides process’s if present)
Process and Thread Identifiers
Every process and every thread has an identifier
Generically: “client ID” (debugger shows as “CID”)
– A.K.A. “process ID” and “thread ID”, respectively
– Process IDs and thread IDs are in the same “number space”
ID identifies the requesting process or thread to its subsystem server process, in API calls that need the server’s help
Visible in:
– PerfMon, Task Manager (for processes),
– Process Viewer (for processes), kernel debugger, etc.
IDs are unique among all existing processes and threads
– might be reused as soon as a process or thread is deleted
Process-Related Performance Counters
Object: Counter | Function
Process: %PrivilegedTime | Percentage of time that the threads in the process have run in kernel mode
Process: %ProcessorTime | Percentage of CPU time that threads have used during specified interval (%PrivilegedTime + %UserTime)
Process: %UserTime | Percentage of time that the threads in the process have run in user mode
Process: ElapsedTime | Total lifetime of process in seconds
Process: ID Process | PID – process IDs are re-used
Process: ThreadCount | Number of threads in a process
Thread-Related Performance Counters
Object: Counter | Function
Process: Priority Base | Base priority of process: starting priority for threads within process
Thread: %PrivilegedTime | Percentage of time that the thread has run in kernel mode
Thread: %ProcessorTime | Percentage of CPU time that the thread has used during specified interval (%PrivilegedTime + %UserTime)
Thread: %UserTime | Percentage of time that the thread has run in user mode
Thread: ElapsedTime | Total lifetime of thread in seconds
Thread: ID Process | PID – process IDs are re-used
Thread: ID Thread | Thread ID – re-used
Thread-Related Performance Counters (contd.)
Object: Counter | Function
Thread: Priority Base | Base priority of thread: may differ from the thread’s starting priority
Thread: Priority Current | The thread’s current dynamic priority
Thread: Start Address | The thread’s starting virtual address (the same for most threads)
Thread: Thread State | Value from 0 through 7 – current state of thread
Thread: Thread Wait Reason | Value from 0 through 19 – reason why the thread is in wait state
Tools for Obtaining Process & Thread Information
Many overlapping tools
– most show one item the others do not
Built-in tools in Windows 2000/XP:
– Task Manager, Performance Tool
– Tasklist (new in XP)
Support Tools
– pviewer - process and thread details (GUI)
– pmon - process list (character cell)
– tlist - shows process tree, thread details (character cell)
Resource Kit tools:
– apimon - system call and page fault monitoring (GUI)
– oh - display open handles (character cell)
– pviewer - processes & threads and security details (GUI)
– ptree - display process tree & kill remote processes (GUI)
– pulist - lists processes and usernames (character cell)
– pstat - process/threads & driver addresses (character cell)
– qslice - can show process-relative thread activity (GUI)
Tools from www.sysinternals.com:
Process Explorer: super Task Manager
– shows open files, loaded DLLs, security info, etc.
Pslist
– list processes on local or remote systems
Ntpmon
– shows process/thread create/deletes
– and context switches on MP systems only
Listdlls
– displays full path of EXE & DLLs loaded in each process
What Are Task Manager’s “Applications”?
A meaningless term at the OS level
– Not a list of processes
– Not a list of “tasks” (another meaningless term)
– It’s a list of top level visible windows in your session that meet certain criteria
What does the status column mean?
Running:
– Windows don’t run—threads do
– “Running” is displayed only when the owning thread is waiting for a window message (i.e., not running!)
Not Responding: not waiting for window messages
To map a window to a process
– right-click on a window and select “Go to process”
Process Explorer (Sysinternals)
Super Task Manager
Shows:
– full image path, command line,
– environment variables, parent process,
– security access token, open handles,
– loaded DLLs & mapped files
Lab: The Process List
Run Process Explorer & maximize window
Run Task Manager – click on Processes tab
Arrange windows so you can see both
Notice process tree vs flat list in Task Manager
If parent has exited, process is left justified
1. Sort on first column (“Process”) and note tree view disappears
2. Click on View->Show Process Tree (or CTRL+T) to bring it back
3. Notice description and company name columns
4. Hover mouse over image to see full path of image
5. Right click on a process and choose “Google”
Lab: Refresh Highlighting
1. Change update speed to paused by pressing space bar
2. Run Notepad
3. In ProcExp, hit F5 and notice new process
4. Exit Notepad
5. In ProcExp, hit F5 and notice Notepad in red
Uses
– Understanding process startup sequences
– Detecting appearance of processes coming and going
Process Performance
Click on Performance Tab of process properties
– Note: all these numbers can be configured as columns
Thread Details
Process Explorer “Threads” tab shows which thread(s) are running
– Start address represents where the thread began running (not where it is now)
– Click Module to get details on module containing thread start address
Thread Start Functions
Process Explorer can map the addresses within a module to the names of functions
– This can help identify which component within a process is responsible for CPU usage
Requires access to:
– Symbol file for that module
– Proper version of Dbghelp.dll
By default, Process Explorer looks for:
– Dbghelp.dll: in the default Windows Debugging Tools install directory
– Symbols: _NT_SYMBOL_PATH environment variable
Can also specify with Options->Configure Symbols
Call Stacks
[Figure: call stack showing Function 1, Function 2, and Function 3]
Process Explorer can also show the thread call stack
– Represents sequence of functions called
Important if start address doesn’t indicate what the thread is doing
– E.g. if it’s a generic library start routine
Click Stack to view call stack
– Lists functions in reverse chronological order
Note that start address on Threads tab is different than first function shown in stack
– This is because all user threads start in a Windows library function which calls the programmed start address
Example: Viewing Stacks
Problem: PowerPoint was hanging for 1 minute on startup
Thread stack shows waiting on a printer driver
Suspending Processes
Process Explorer can suspend a process
Why would you want to do this?
– You’ve started a long running job but want to pause it to do something else
  Lowering the priority still leaves it running…
– You’ve started a long download but want your network bandwidth back temporarily
– Some multi-service system process activity is due to other processes calling upon their services
  Suspend a process that is consuming CPU time to see what that does to the system process in question
Lab: Suspend
Start Notepad
From a command prompt:
1. Suspend Notepad process with Process Explorer
2. Try to switch back to Notepad (should not respond)
3. Open Task Manager and look at Notepad’s status in the applications tab
4. Resume Notepad
Jobs
Jobs are collections of processes
Can be used to specify limits on CPU, memory, and security
Enables control over some unique process & thread settings not available through any process or thread system call
– E.g. length of thread time slice
[Figure: a job object containing several processes]
How do processes become part of a job?
Job object has to be created (CreateJobObject)
Then processes are explicitly added (AssignProcessToJobObject)
– Processes created by processes in a job automatically are part of the job
– Unless restricted, processes can “break away” from a job
Then quotas and limits are defined (SetInformationJobObject)
– Examples on next slide…
Process Lifetime
Created as an empty shell
Address space created with only ntdll and the main image unless created by POSIX fork()
Handle table created empty or populated via duplication from parent
Process is partially destroyed on last thread exit
Process totally destroyed on last dereference
Thread Lifetime
Created within a process with a CONTEXT record
– Starts running in the kernel but has a trap frame to return to user mode
Threads run until:
– The thread returns to the OS
– ExitThread is called by the thread
– TerminateThread is called on the thread
– ExitProcess is called on the process
Why Do Processes Exit? (or Terminate?)
Normal: Application decides to exit (ExitProcess)
– Usually due to a request from the UI
– or: the C run-time library (CRTL) does ExitProcess when primary thread function (main, WinMain, etc.) returns to caller
  This forces TerminateThread on the process’s remaining threads
– or, any thread in the process can do an explicit ExitProcess
Orderly exit requested from the desktop (ExitProcess)
– e.g. “End Task” from Task Manager “Tasks” tab
– Task Manager sends a WM_CLOSE message to the window’s message loop…
– …which should do an ExitProcess (or equivalent) on itself
Forced termination (TerminateProcess)
– if no response to “End Task” in five seconds, Task Manager presents End Program dialog (which does a TerminateProcess)
– or: “End Process” from Task Manager Processes tab
Unhandled exception
– Covered in Unit 4.3 (Process and Thread Internals)
Job Settings
Quotas and restrictions:
– Quotas: total CPU time, # active processes, per-process CPU time, memory usage
– Run-time restrictions: priority of all the processes in job; processors threads in job can run on
– Security restrictions: limits what processes can do
  Not acquire administrative privileges
  Not accessing windows outside the job, no reading/writing the clipboard
– Scheduling class: number from 0-9 (5 is default) - affects length of thread timeslice (or quantum)
  E.g. can be used to achieve “class scheduling” (partition CPU)
Jobs
Examples where Windows OS uses jobs:
– Add/Remove Programs (“ARP Job”)
– WMI provider
– RUNAS service (SecLogon) uses jobs to terminate processes at log out
  SU from NT4 ResKit didn’t do this
Process Explorer highlights processes that are members of jobs
– Color can be configured with Options->Configure Highlighting
– For processes in a job, click on Job tab in process properties to see details
Lab: WMI Job
Jobs are used by WMI
– Example: run Psinfo (Sysinternals) and pause output
Lab: RUNAS Job
1. In a command prompt: RUNAS /USER:xxx CMD (where xxx is some other local account)
2. In ProcExp, find newly created cmd.exe process
– Which process is its parent?
3. Run Notepad from new CMD window
4. Double click on newly highlighted process & click on Job tab
Programming Slides
NOTE: The remaining slides are for use in a class that covers the programming aspects of the OS (vs a class aimed at system administrators who are not doing programming)
Process Windows APIs
CreateProcess
OpenProcess
GetCurrentProcessId - returns a global ID
GetCurrentProcess - returns a handle
ExitProcess
TerminateProcess - no DLL notification
Get/SetProcessShutdownParameters
GetExitCodeProcess
GetProcessTimes
GetStartupInfo
Windows Thread APIs
CreateThread
CreateRemoteThread
GetCurrentThreadId - returns global ID
GetCurrentThread - returns handle
SuspendThread/ResumeThread
ExitThread
TerminateThread - no DLL notification
GetExitCodeThread
GetThreadTimes
Windows 2000 adds:
– OpenThread
– new thread pooling APIs
Fibers
Implemented completely in user mode
– no “internals” ramifications
– Fibers are still scheduled as threads
– Fiber APIs allow different execution contexts within a thread
  stack
  fiber-local storage
  some registers (essentially those saved and restored for a procedure call)
  cooperatively “scheduled” within the thread
– Analogous to threading libraries under many Unix systems
– Analogous to co-routines in assembly language
– Allow easy porting of apps that “did their own threads” under other systems
Process Creation
BOOL CreateProcess(
    LPCSTR lpApplicationName,
    LPSTR lpCommandLine,
    LPSECURITY_ATTRIBUTES lpProcessAttributes,
    LPSECURITY_ATTRIBUTES lpThreadAttributes,
    BOOL bInheritHandles,
    DWORD dwCreationFlags,
    LPVOID lpEnvironment,
    LPCSTR lpCurrentDirectory,
    LPSTARTUPINFO lpStartupInfo,
    LPPROCESS_INFORMATION lpProcessInformation)
No parent/child relation in Win32
CreateProcess() – new process with primary thread
Parameters
typedef struct _PROCESS_INFORMATION {
    HANDLE hProcess;
    HANDLE hThread;
    DWORD dwProcessId;
    DWORD dwThreadId;
} PROCESS_INFORMATION;
fdwCreate (the dwCreationFlags parameter):
– CREATE_SUSPENDED, DETACHED_PROCESS, CREATE_NEW_CONSOLE, CREATE_NEW_PROCESS_GROUP
lpStartupInfo:
– Main window appearance
– Parent’s info: GetStartupInfo
– hStdInput, hStdOutput, hStdError fields for I/O redirection
lpProcessInformation:
– Ptr to handle & ID of new proc/thread
UNIX & Win32 comparison
Windows API has no equivalent to fork()
CreateProcess() similar to fork()/exec()
UNIX $PATH vs. lpCommandLine argument
– Win32 searches in the directory of the current process image; in the current directory; in the Windows system directory (GetSystemDirectory); in the Windows directory (GetWindowsDirectory); in the directories given in PATH
Windows API has no parent/child relations for processes
No UNIX process groups in Windows API
– Limited form: group = processes to receive a console event
Windows API Thread Creation
HANDLE CreateThread (
    LPSECURITY_ATTRIBUTES lpsa,
    DWORD cbStack,
    LPTHREAD_START_ROUTINE lpStartAddr,
    LPVOID lpvThreadParm,
    DWORD fdwCreate,
    LPDWORD lpIDThread)
cbStack == 0: thread’s stack size defaults to primary thread’s size
lpStartAddr points to function declared as
    DWORD WINAPI ThreadFunc(LPVOID)
lpvThreadParm is a pointer-sized argument (32 bits on 32-bit Windows)
lpIDThread points to DWORD that receives thread ID (non-NULL pointer!)
Exiting and Terminating a Process
VOID ExitProcess(UINT uExitCode);
BOOL TerminateProcess(HANDLE hProcess, UINT uExitCode);
BOOL GetExitCodeProcess(HANDLE hProcess, LPDWORD lpExitCode);
Shared resources must be freed before exiting
– Mutexes, semaphores, events
– Use structured exception handling
But: __finally and __except handlers are not executed on ExitProcess; no SEH on TerminateProcess
Windows API Thread Termination
VOID ExitThread(DWORD dwExitCode)
When the last thread in a process terminates, the process itself terminates (TerminateThread() does not execute final SEH)
Thread continues to exist until last handle is closed (CloseHandle())
BOOL GetExitCodeThread(HANDLE hThread, LPDWORD lpdwExitCode)
Returns exit code or STILL_ACTIVE
Suspending and Resuming Threads
Each thread has a suspend count
Can only execute if suspend count == 0
Thread can be created in suspended state
DWORD ResumeThread(HANDLE hThread)
DWORD SuspendThread(HANDLE hThread)
Both functions return the previous suspend count, or 0xFFFFFFFF on failure
Synchronization & Remote Threads
WaitForSingleObject() and WaitForMultipleObjects() with thread handles as arguments perform thread synchronization
– Waits for thread to become signaled
– ExitThread(), TerminateThread(), ExitProcess() set thread objects to signaled state
CreateRemoteThread() allows creation of thread in another process
– Not implemented in Windows 9x
The single-threaded C library (libc.lib) is not thread-safe; use libcmt.lib instead
– #define _MT before any include
– Use _beginthreadex/_endthreadex instead of CreateThread/ExitThread
Windows Process and Thread Internals
Data Structures for each process/thread:
Executive process block (EPROCESS)
Executive thread block (ETHREAD)
Win32 process block
Process environment block
Thread environment block
[Figure: the process block (EPROCESS), thread blocks (ETHREAD), Win32 process block, and handle table reside in system address space; the process environment block and thread environment blocks reside in process address space]
Process
Container for an address space and threads
Associated User-mode Process Environment Block (PEB)
Primary Access Token
Quota, Debug port, Handle Table etc
Unique process ID
Queued to the Job, global process list and Session list
MM structures like the WorkingSet, VAD tree, AWE etc
Thread
Fundamental schedulable entity in the system
Represented by ETHREAD that includes a KTHREAD
Queued to the process (both E and K thread)
IRP list, Impersonation Access Token
Unique thread ID
Associated User-mode Thread Environment Block (TEB)
User-mode stack, Kernel-mode stack
Processor Control Block (in KTHREAD) for CPU state when not running
Processes & Threads Internal Data Structures
[Figure: a process object references its access token, its virtual address space descriptors (VADs), its handle table (whose entries point to objects), and its threads; a thread may carry its own access token]
See kernel debugger commands: dt (see next slide), !process, !thread, !token, !handle, !object
Process/Thread Kernel Debugger Commands
!process [/s Session] [Address/Pid [Flags]]
– !process – display current process (not full details)
– !process 342 – display full details of process 342
– !process 829fa030 – display process identified by EPROCESS address
– !process 0 0 – summary display of all processes
– !process 0 7 – full details of all processes
!thread [Address [Flags]]
– !thread – current thread
– !thread 826e8898 – display thread identified by ETHREAD address
To view user stack, must set process context:
– .process <address of EPROCESS>
– .context <address of page directory (Dirbase)>
!peb [Address]
!teb [Address]
113113
PROCESS ff704020  Cid: 0075  Peb: 7ffdf000  ParentCid: 005d
    DirBase: 0063c000  ObjectTable: ff7063c8  TableSize: 70.
    Image: Explorer.exe
    VadRoot ff70d6e8 Clone 0 Private 229. Modified 236. Locked 0.
    FF7041DC MutantState Signalled OwningThread 0
    Token                             e1462030
    ElapsedTime                       0:01:19.0874
    UserTime                          0:00:00.0991
    KernelTime                        0:00:02.0613
    QuotaPoolUsage[PagedPool]         18317
    QuotaPoolUsage[NonPagedPool]      3824
    Working Set Sizes (now,min,max)   (727, 20, 45) (2908KB, 80KB, 180KB)
    PeakWorkingSetSize                757
    VirtualSize                       29 Mb
    PeakVirtualSize                   31 Mb
    PageFaultCount                    1396
    MemoryPriority                    FOREGROUND
    BasePriority                      8
    CommitCharge                      250
Field annotations:
– PROCESS ff704020: EPROCESS address
– Cid: process ID
– Peb: address of process environment block
– ParentCid: process ID of parent process
– DirBase: physical address of page directory
– VadRoot: root of the process’s Virtual Address Descriptor tree
– ElapsedTime, UserTime, KernelTime: time the process has been running, divided into user and kernel time
Process Block (!process)
114114
Process Block Layout
Quota Block
Exit Status
Primary Access Token
Process ID
Parent Process ID
Exception Port
Debugger Port
Handle Table
Process Environment Block
Create and Exit Time
Next Process Block
Image File Name
Process Priority Class
Memory Management Information
EPROCESS
Kernel Process Block (or PCB)
Image Base Address
Win32 Process Block
Dispatcher Header
Processor Affinity
Kernel Time
User Time
Inswap/Outswap List Entry
Process Spin Lock
Resident Kernel Stack Count
Process Base Priority
Default Thread Quantum
Process State
Thread Seed
Disable Boost Flag
Process Page Directory
KTHREAD . . .
115115
Process Block Layout
lkd> dt nt!_EPROCESS
   +0x000 Pcb                : _KPROCESS
   +0x06c ProcessLock        : _EX_PUSH_LOCK
   +0x070 CreateTime         : _LARGE_INTEGER
   +0x078 ExitTime           : _LARGE_INTEGER
   +0x080 RundownProtect     : _EX_RUNDOWN_REF
   +0x084 UniqueProcessId    : Ptr32 Void
   +0x088 ActiveProcessLinks : _LIST_ENTRY
   +0x090 QuotaUsage         : [3] Uint4B
   +0x09c QuotaPeak          : [3] Uint4B
   +0x0a8 CommitCharge       : Uint4B
   +0x0ac PeakVirtualSize    : Uint4B
   +0x0b0 VirtualSize        : Uint4B
   .
   .
NOTE: Add “-r” to recurse through substructures
116116
THREAD 83160f60  Cid 9f.3d  Teb: 7ffdc000  Win32Thread: e153d2c8 WAIT: (WrUserRequest) UserMode Non-Alertable
    808e9d60  SynchronizationEvent
Not impersonating
Owning Process 81b44880
WaitTime (seconds)      953945
Context Switch Count    2697      LargeStack
UserTime                0:00:00.0289
KernelTime              0:00:04.0664
Start Address kernel32!BaseProcessStart (0x77e8f268)
Win32 Start Address 0x020d9d98
Stack Init f7818000 Current f7817bb0 Base f7818000 Limit f7812000 Call 0
Priority 14 BasePriority 8 PriorityDecrement 6 DecrementCount 13
Kernel stack not resident.
ChildEBP RetAddr  Args to Child
f7817bb0 8008f430 00000001 00000000 00000000 ntoskrnl!KiSwapThreadExit
f7817c50 de0119ec 00000001 00000000 00000000 ntoskrnl!KeWaitForSingleObject+0x2a0
f7817cc0 de0123f4 00000001 00000000 00000000 win32k!xxxSleepThread+0x23c
f7817d10 de01f2f0 00000001 00000000 00000000 win32k!xxxInternalGetMessage+0x504
f7817d80 800bab58 00000001 00000000 00000000 win32k!NtUserGetMessage+0x58
f7817df0 77d887d0 00000001 00000000 00000000 ntoskrnl!KiSystemServiceEndAddress+0x4
0012fef0 00000000 00000001 00000000 00000000 user32!GetMessageW+0x30
Field annotations:
– THREAD 83160f60: address of ETHREAD
– Cid 9f.3d: process ID and thread ID
– Teb: address of thread environment block
– WAIT: thread state
– 808e9d60 SynchronizationEvent: objects being waited on
– Priority, BasePriority, PriorityDecrement: priority information
– Start Address: actual thread start address
– Win32 Start Address: address of user thread function
– ChildEBP/RetAddr lines: stack trace
– (also labeled on the slide) address of system service dispatch table
Thread Block (!thread)
117117
Thread Block
ETHREAD
Create and Exit Time
Process ID
Thread Start Address
Impersonation Information
LPC Message Information
EPROCESS
Access Token
KTHREAD
Timer Information
Pending I/O Requests
Total User Time
Total Kernel Time
Thread Scheduling Information
Synchronization Information
List of Pending APCs
Timer Block and Wait Blocks
List of Objects Thread is Waiting On
System Service Table
TEB
KTHREAD
Thread Local Storage Array
Kernel Stack Information
Dispatcher Header
Trap Frame
118118
Thread Block (!strct ethread)
lkd> dt nt!_ETHREAD
   +0x000 Tcb              : _KTHREAD
   +0x1c0 CreateTime       : _LARGE_INTEGER
   +0x1c0 NestedFaultCount : Pos 0, 2 Bits
   +0x1c0 ApcNeeded        : Pos 2, 1 Bit
   +0x1c8 ExitTime         : _LARGE_INTEGER
   +0x1c8 LpcReplyChain    : _LIST_ENTRY
   +0x1c8 KeyedWaitChain   : _LIST_ENTRY
   +0x1d0 ExitStatus       : Int4B
   +0x1d0 OfsChain         : Ptr32 Void
   +0x1d4 PostBlockList    : _LIST_ENTRY
   +0x1dc TerminationPort  : Ptr32 _TERMINATION_PORT
   +0x1dc ReaperLink       : Ptr32 _ETHREAD
119119
Process Environment Block
Mapped in user space
Image loader, heap manager, Windows system DLLs use this info
View with !peb or dt nt!_peb
PEB contents:
– Image base address
– Module list
– Thread-local storage data
– Code page data
– Critical section time-out
– Number of heaps
– Heap size info
– Process heap
– GDI shared handle table
– OS version number info
– Image version info
– Image process affinity mask
120120
Thread Environment Block
User mode data structure
Context for image loader and various Windows DLLs
View with !teb or dt nt!_teb
TEB contents:
– Exception list
– Stack base
– Stack limit
– Subsystem TIB
– Fiber info
– Thread ID
– Active RPC handle
– PEB
– LastError value
– Count of owned critical sections
– Current locale
– User32 client info
– GDI32 info
– OpenGL info
– TLS array
– Winsock data
121121
Flow of CreateProcess()
1. Open image file (.EXE) to be executed inside the process
2. Create Windows NT executive process object
3. Create initial thread (stack, context, Windows NT executive thread object)
4. Notify Windows subsystem of new process so that it can set up for the new process and thread
5. Start execution of initial thread (unless CREATE_SUSPENDED was specified)
6. In context of new process/thread: complete initialization of address space (load DLLs) and begin execution of the program
122122
Stages Windows follows to create a process
[Figure: flow across three participants.]
Creating process:
– Open EXE and create section object
– Create NT process object
– Create NT thread object
– Notify Windows subsystem
– Start execution of the initial thread
– Return to caller
Windows subsystem:
– Set up for new process and thread
New process:
– Final process/image initialization
– Start execution at entry point to image
123123
CreateProcess: some notes
CreationFlags: independent bits for priority class; if several are set, NT assigns the lowest-priority class set
Default priority class is Normal, unless the creator has priority class Idle
If the Real-time priority class is specified and the creator has insufficient privileges, priority class High is used
Caller’s current desktop is used if no desktop is specified
Priority classes: Real-time, High, Normal, Idle
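The priority-class rules above can be sketched as a small decision function. This is only an illustration: the flag bits, the enum names and resolve_priority_class are hypothetical (they are not the real CreationFlags values), and the sketch assumes that a child of an Idle-class creator also gets Idle, which the slide implies but does not state.

```c
#include <assert.h>

/* Hypothetical flag bits for illustration; NOT the real CreationFlags values. */
enum {
    FLAG_IDLE     = 1 << 0,
    FLAG_NORMAL   = 1 << 1,
    FLAG_HIGH     = 1 << 2,
    FLAG_REALTIME = 1 << 3
};

typedef enum { CLASS_IDLE, CLASS_NORMAL, CLASS_HIGH, CLASS_REALTIME } prio_class;

/* Resolve the priority class of a new process from the requested flag bits,
   the creator's class, and whether the creator may use the real-time class. */
prio_class resolve_priority_class(unsigned flags, prio_class creator,
                                  int may_use_realtime)
{
    /* Several bits set: the lowest-priority class that is set wins. */
    if (flags & FLAG_IDLE)   return CLASS_IDLE;
    if (flags & FLAG_NORMAL) return CLASS_NORMAL;
    if (flags & FLAG_HIGH)   return CLASS_HIGH;
    if (flags & FLAG_REALTIME)
        return may_use_realtime ? CLASS_REALTIME : CLASS_HIGH;

    /* No class requested: Normal, unless the creator is Idle (assumed: Idle). */
    return (creator == CLASS_IDLE) ? CLASS_IDLE : CLASS_NORMAL;
}
```

For example, requesting both High and Idle yields Idle, and requesting Real-time without the privilege yields High.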
124124
Opening the image to be executed
What kind of application is it?
– Windows: use .EXE directly
– Win16: run NTVDM.EXE
– MS-DOS .EXE, .COM, or .PIF: run NTVDM.EXE
– MS-DOS .BAT or .CMD: run CMD.EXE
– POSIX: run POSIX.EXE
– OS/2 1.x: run OS2.EXE
125125
If executable has no Windows format...
CreateProcess uses Windows “support image”
No way to create non-Windows processes directly
– OS2.EXE runs only on Intel systems
– Multiple MS-DOS apps may share virtual DOS machine
– .BAT or .CMD files are interpreted by CMD.EXE
– Win16 apps may share a virtual DOS machine (VDM)
Flags: CREATE_SEPARATE_WOW_VDM, CREATE_SHARED_WOW_VDM
Default: HKLM\System...\Control\WOW\DefaultSeparateVDM
– Sharing of VDM only if apps run on same desktop under same security
126126
If executable has no Windows format...
A debugger may be specified under the following key (it is then run instead of the app!)
– \Software\Microsoft\Windows NT\CurrentVersion\Image File Execution Options
127127
Process Creation - next Steps...
CreateProcess has opened the Windows executable and created a section object to map into the process’s address space
Now: create executive process object via NtCreateProcess
– Set up EPROCESS block
– Create initial process address space (page directory, hyperspace page, working set list)
– Create kernel process block (set initial quantum)
– Conclude setup of process address space (VM), map NTDLL.DLL, map language support tables, register process on the PsActiveProcessHead list
– Set up Process Environment Block
– Complete setup of executive process object
128128
Further Steps...(contd.)
Create Initial Thread and Its Stack and Context
– NtCreateThread;
– new thread is suspended until CreateProcess returns
Notify Windows Subsystem about new process
KERNEL32.DLL sends message to Windows subsystem including:
– Process and thread handles
– Entries in creation flags
– ID of process‘s creator
– Flag describing Windows app (CSRSS may show startup cursor)
129129
Further Steps...(contd.)
Windows subsystem: duplicates handles (incrementing usage count), sets priority class, does bookkeeping
– Allocates CSRSS process/thread blocks, initializes exception port and debug port
– Shows cursor (arrow & hourglass), waits 2 sec for a GUI call, then waits 5 sec for a window
130130
CreateProcess: final steps
Process Initialization in context of new process:
Lower IRQL from dispatch level to APC (asynchronous procedure call) level
Enable working set expansion
Queue APC to execute LdrInitializeThunk in NTDLL.DLL
Lower IRQL to 0; the APC fires:
– Initialize loader, heap manager, NLS tables, TLS array, critical section structures
– Load DLLs, call their DLL_PROCESS_ATTACH functions
131131
CreateProcess: Final Steps
If the process is a debuggee: all threads are suspended
– Send message to the process’s debug port (Windows creates a CREATE_PROCESS_DEBUG_INFO event)
Image begins execution in user-mode (return from trap)
132132
1. DLL notification - unless TerminateProcess used
2. All handles to executive and kernel objects are closed
3. Terminate any active threads
4. exit code changes from STILL_ACTIVE to the specified exit code:
BOOL GetExitCodeProcess(HANDLE hProcess,LPDWORD lpdwExitCode);
5. Process object & thread objects become signaled
6. When handle and reference counts to process object == 0, process object is deleted
Process Rundown Sequence
Thread Startup (in-context thread initialization)
[Figure: flowchart. Boxes: lower IRQL to APC; enable working set expansion; queue user-mode APC to run LdrInitializeThunk and lower IRQL to 0; (APC fires) perform in-process context initialization (init loader, load DLLs); decision “Process has debugger?” (yes/no); suspend all threads; notify debugger process of new process and wait for reply (LPC send/receive); send new thread message to subsystem; resume all threads; restore trap frame and dismiss exception; begin execution in user mode. Region labels: kernel mode, user mode, inside CSRSS.]
134134
1. DLL notification- unless TerminateThread was used
2. All handles to Windows User and GDI objects are closed
3. Outstanding I/Os are cancelled
4. Thread stack is deallocated
5. exit code changes from STILL_ACTIVE to the specified exit code
BOOL GetExitCodeThread(HANDLE hThread,LPDWORD lpdwExitCode);
6. Thread kernel object becomes signaled
7. When handle and reference counts == 0, thread object deleted
8. If last thread in process, process exits
Thread Rundown Sequence
135135
Start of Thread Wrapper
All threads in all processes appear to have one of just two different start addresses, regardless of the .EXE running
– One for thread 0 (start of process wrapper)
– the other for all other threads (start of thread wrapper)
These “wrapper” functions are what Process Viewer shows as Thread Start Address for Windows apps
136136
Start of Thread Wrapper
Start of process and start of thread wrappers have same behavior
– Provides default exception handling, access to debugger, etc.
– Forces thread exit when thread function returns
To find “real” Windows start address, use TLIST <processname> (or Kernel Debugger !thread command)
137137
/* BaseProcessStart (BaseThreadStart is basically the same) */
void BaseProcessStart(LPTHREAD_START_ROUTINE lpStartAddr, LPVOID lpvThreadParm)
{
    __try {
        /* Run the thread function; force thread exit when it returns. */
        DWORD dwThreadExitCode = lpStartAddr(lpvThreadParm);
        ExitThread(dwThreadExitCode);
    }
    __except (UnhandledExceptionFilter(GetExceptionInformation())) {
        /* Unhandled exception: the whole process exits. */
        ExitProcess(GetExceptionCode());
    }
}
Start of Process/Thread Function (conceptual model)
138138
if process has a debugger attached
    return EXCEPTION_CONTINUE_SEARCH
if AUTO=0 {                    // run debugger automatically?
    Display message box;       // no - ask user what to do
    if (clicked OK)
        ExitProcess();
}
// either AUTO=1, or (AUTO=0 and user clicked CANCEL),
// so run debugger
GetProfileString("AEdebug", "debugger", ...);
hEvent = CreateEvent( ... );
hProcess = CreateProcess(...); // Create debugger - pass process id, event to signal
WaitForMultipleObjects( [hEvent, hProcess] );
return EXCEPTION_CONTINUE_SEARCH;
Windows Unhandled Exception Filter
139139
Windows Unhandled Exception Filter
Implication: you can connect a debugger (VC++ or WinDbg) to a running process
– C:\> msdev -p pid
140140
Process Crashes (Windows 2000)
Registry defines behavior for unhandled exceptions
– HKLM\Software\Microsoft\Windows NT\CurrentVersion\AeDebug
– Debugger = filespec of debugger to run on app crash
– Auto: 1 = run debugger immediately, 0 = ask user first
141141
Process Crashes (Windows 2000)
Default on retail system is Auto=1; Debugger=DRWTSN32.EXE
Default with VC++ is Auto=0, Debugger=MSDEV.EXE
142142
On XP & Server 2003, when an unhandled exception occurs:
– System first runs DWWIN.EXE
DWWIN creates a process microdump and XML file and offers the option to send the error report
– Then runs debugger (default is Drwtsn32.exe)
Process Crashes (XP & Server 2003)
143143
Windows Error Reporting
Configurable with System Properties->Advanced->Error Reporting
– HKLM\SOFTWARE\Microsoft\PCHealth\ErrorReporting
Configurable with group policies
– HKLM\SOFTWARE\Policies\Microsoft\PCHealth
144144
Scheduling Criteria
CPU utilization – keep the CPU as busy as possible
Throughput – # of processes/threads that complete their execution per time unit
Turnaround time – amount of time to execute a particular process/thread
Waiting time – amount of time a process/thread has been waiting in the ready queue
Response time – amount of time from when a request is submitted until the first response is produced, not until output is complete (e.g., the hourglass appears)
145145
Windows Scheduler
Priority-driven, preemptive scheduling system
Highest-priority runnable thread always runs
A thread runs for an amount of time called a quantum
No single scheduler – event-based scheduling code spread across the kernel
146146
Windows Scheduler
Dispatcher routines triggered by the following events:
– Thread becomes ready for execution
– Thread leaves running state (quantum expires, wait state)
– Thread‘s priority changes (system call/NT activity)
– Processor affinity of a running thread changes
147147
Windows Scheduling Principles
32 priority levels
Threads within same priority are scheduled Round-Robin
Non-real-time priorities are adjusted dynamically
– Priority elevation as response to certain I/O and dispatch
– Quantum stretching to optimize responsiveness
Real-time priorities are assigned statically to threads
148148
Scheduling
Multiple threads may be ready to run
“Who gets to use the CPU?”
From Windows API point of view:
Processes are given a priority class upon creation
– Idle, Normal, High, Realtime
– Windows 2000 added “Above normal” and “Below normal”
Threads have a relative priority within the class
– Idle, Lowest, Below_Normal, Normal,
– Above_Normal, Highest, and Time_Critical
149149
Windows Scheduling-related APIs:
– Get/SetPriorityClass
– Get/SetThreadPriority
– Get/SetProcessAffinityMask
– SetThreadAffinityMask
– SetThreadIdealProcessor
– Suspend/ResumeThread
Scheduling
From the kernel’s view:
– Threads have priorities 0 through 31
– Threads are scheduled, not processes
– Priority class is not used to make schedule decisions
150150
Kernel: Thread Priority Levels
– 16 to 31: 16 “real-time” levels
– 1 to 15: 15 variable levels
– 0: used by zero page thread
– i: used by idle thread(s)
151151
Windows vs. NT Kernel Priorities
Columns: Win32 process classes; rows: Win32 thread priorities; cells: kernel base priority

Thread priority   Realtime   High   Above Normal   Normal   Below Normal   Idle
Time-critical        31       15         15          15          15         15
Highest              26       15         12          10           8          6
Above-normal         25       14         11           9           7          5
Normal               24       13         10           8           6          4
Below-normal         23       12          9           7           5          3
Lowest               22       11          8           6           4          2
Idle                 16        1          1           1           1          1
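As a worked illustration, the mapping from (process class, thread priority) to kernel base priority can be written as a plain lookup table. The enum and function names here are hypothetical; only the numbers come from the table above.

```c
#include <assert.h>

/* Column index: Win32 process priority class (hypothetical names). */
typedef enum {
    PC_REALTIME, PC_HIGH, PC_ABOVE_NORMAL, PC_NORMAL, PC_BELOW_NORMAL, PC_IDLE,
    PC_COUNT
} proc_class;

/* Row index: Win32 relative thread priority (hypothetical names). */
typedef enum {
    TP_TIME_CRITICAL, TP_HIGHEST, TP_ABOVE_NORMAL, TP_NORMAL, TP_BELOW_NORMAL,
    TP_LOWEST, TP_IDLE,
    TP_COUNT
} thread_prio;

/* Kernel base priorities, copied from the slide's table. */
static const int base_prio[TP_COUNT][PC_COUNT] = {
    /* Time-critical */ {31, 15, 15, 15, 15, 15},
    /* Highest       */ {26, 15, 12, 10,  8,  6},
    /* Above-normal  */ {25, 14, 11,  9,  7,  5},
    /* Normal        */ {24, 13, 10,  8,  6,  4},
    /* Below-normal  */ {23, 12,  9,  7,  5,  3},
    /* Lowest        */ {22, 11,  8,  6,  4,  2},
    /* Idle          */ {16,  1,  1,  1,  1,  1}
};

/* Look up the kernel base priority (0..31) for a class/priority pair. */
int kernel_base_priority(proc_class pc, thread_prio tp)
{
    assert(pc >= 0 && pc < PC_COUNT && tp >= 0 && tp < TP_COUNT);
    return base_prio[tp][pc];
}
```

A Normal thread in a Normal process maps to base priority 8, which matches the BasePriority 8 seen in the debugger output earlier.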
152152
Windows vs. NT Kernel Priorities
Table shows base priorities
– current or dynamic thread priority may be higher if base <15
Many utilities (such as Process Viewer) show the “dynamic priority” of threads rather than the base
– Performance Monitor can show both
Drivers can set to any value with KeSetPriorityThread
153153
Special Thread Priorities
Idle threads -- one per CPU
When no threads want to run, Idle thread “runs”
– Not a real priority level - appears to have priority zero, but actually runs “below” priority 0
– Provides CPU idle time accounting (unused clock ticks are charged to the idle thread)
Loop:
– Calls HAL to allow for power management
– Processes DPC list; Dispatches to a thread if selected
Server 2003:
– in certain cases, scans per-CPU ready queues for next thread
154154
Special Thread Priorities
Zero page thread -- one per NT system
– Zeroes pages of memory in anticipation of “demand zero” page faults
– Runs at priority zero (lower than any reachable from Windows)
– Part of the “System” process (not a complete process)
155155
Thread Scheduling Priorities vs. Interrupt Request Levels (IRQLs)
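IRQLs, highest to lowest:
– 31: High
– 30: Power fail
– 29: Interprocessor Interrupt
– 28: Clock
– Device n ... Device 1 (hardware interrupts)
– 2: Dispatch/DPC (software interrupt)
– 1: APC (software interrupt)
– 0: Passive_Level (thread priorities 0-31 are scheduled at this level)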
156156
Priority driven, preemptive
– 32 queues (FIFO lists) of “ready” threads
– UP: highest priority thread always runs
– MP: One of the highest priority runnable threads will be running somewhere
– No attempt to share processor(s) “fairly” among processes, only among threads
Time-sliced, round-robin within a priority level
Single Processor Thread Scheduling
157157
Event-driven:
– no guaranteed execution period before preemption
– When a thread becomes Ready, it either runs immediately or is inserted at the tail of the Ready queue for its current (dynamic) priority
Single Processor Thread Scheduling
158158
Thread Scheduling
No central scheduler!
– there is no always-instantiated routine called “scheduler”
The “code that does scheduling” is not a thread
Scheduling routines are simply called whenever events occur that change the Ready state of a thread
159159
Thread Scheduling
Things that cause scheduling events include:
– interval timer interrupts (for quantum end)
– interval timer interrupts (for timed wait completion)
– other hardware interrupts (for I/O wait completion)
– one thread changes the state of a waitable object upon which other thread(s) are waiting
– a thread waits on one or more dispatcher objects
– a thread priority is changed
160160
Thread Scheduling
Based on doubly-linked lists (queues) of Ready threads
– Nothing that takes “order-n time” for n threads
161161
Scheduling Data Structures
[Figure: dispatcher database.]
Per process: default base priority, default processor affinity, default quantum
Per thread: base priority, current priority, processor affinity, quantum
Ready queues: one per priority level, 31 down to 0
Ready summary (bits 31..0): bitmask for non-empty ready queues
Idle summary (bits 31..0): bitmask for idle CPUs
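The ready summary idea can be sketched as follows: a 32-bit mask mirrors which priority queues are non-empty, so finding the next priority to run never depends on the number of threads. This is a minimal model, not the kernel's code; dispatcher_db and the function names are hypothetical, and per-queue thread lists are reduced to counters.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_PRIORITIES 32

/* Simplified dispatcher database: counters stand in for the real
   doubly-linked ready queues. */
typedef struct {
    uint32_t ready_summary;          /* bit p set => queue p is non-empty */
    int      count[NUM_PRIORITIES];  /* threads queued at each priority   */
} dispatcher_db;

/* A thread becomes Ready at priority prio. */
void mark_ready(dispatcher_db *db, int prio)
{
    assert(prio >= 0 && prio < NUM_PRIORITIES);
    db->count[prio]++;
    db->ready_summary |= (uint32_t)1 << prio;
}

/* A thread leaves the ready queue at priority prio. */
void remove_ready(dispatcher_db *db, int prio)
{
    assert(prio >= 0 && prio < NUM_PRIORITIES && db->count[prio] > 0);
    if (--db->count[prio] == 0)
        db->ready_summary &= ~((uint32_t)1 << prio);
}

/* Highest priority that has a ready thread, or -1 if none.
   The loop is bounded by 32, independent of the number of threads. */
int highest_ready_priority(const dispatcher_db *db)
{
    for (int p = NUM_PRIORITIES - 1; p >= 0; p--)
        if (db->ready_summary & ((uint32_t)1 << p))
            return p;
    return -1;
}
```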
162162
Scheduling Scenarios
Preemption
– A thread becomes Ready at a higher priority than the running thread
– Lower-priority Running thread is preempted
– Preempted thread goes back to head of its Ready queue
– Action: pick lowest priority thread to preempt
Voluntary switch
– Waiting on a dispatcher object
– Termination
– Explicit lowering of priority
– Action: scan for next Ready thread (starting at your priority and down)
163163
Scheduling Scenarios
Running thread experiences quantum end
– Priority is decremented unless already at base priority
– Thread goes to tail of ready queue for its new priority
– May continue running if no equal or higher-priority threads are Ready
– Action: pick next thread at same priority level
164164
Scheduling Scenarios: Preemption
[Figure: ready queues for priorities 18 down to 13 with Running and Ready threads; a higher-priority thread arrives from the Wait state.]
Preemption is strictly event-driven
– does not wait for the next clock tick
– no guaranteed execution period before preemption
– threads in kernel mode may be preempted (unless they raise IRQL to >= 2)
165165
Scheduling Scenarios: Ready after Wait
[Figure: ready queues for priorities 18 down to 13 with Running and Ready threads; a thread arrives from the Wait state.]
If newly-ready thread is no higher than running thread…
– it is put at tail of ready queue for its current priority
– If priority >=14 quantum is reset (t.b.d.)
– If priority <14 and you’re about to be boosted and didn’t already have a boost, quantum is set to process quantum - 1
166166
Scheduling Scenarios: Voluntary Switch
[Figure: ready queues for priorities 18 down to 13 with Running and Ready threads; the Running thread moves to the Waiting state.]
When the running thread gives up the CPU…
– Schedule the thread at head of next non-empty “ready” queue
167167
Scheduling Scenarios: Quantum End (“time-slicing”)
When the running thread exhausts its CPU quantum, it goes to the end of its ready queue
Applies to both real-time and dynamic priority threads, user and kernel mode
– Quantums can be disabled for a thread by a kernel function
Default quantum on Professional is 2 clock ticks, 12 on Server
– Standard clock tick is 10 msec
– Might be 15 msec on some MP Pentium systems
If no other ready threads at that priority, same thread continues running (just gets new quantum)
If running at boosted priority, priority decays by one at quantum end (described later)
168168
Scheduling Scenarios: Quantum End (“time-slicing”)
[Figure: ready queues for priorities 18 down to 13 with Running and Ready threads.]
169169
Basic Thread Scheduling States
[Figure: state diagram with states Ready (1), Running (2), Waiting (5).]
– Running to Waiting: voluntary switch
– Running to Ready: preemption, quantum end
170170
Watching Scheduling
CPUSTRES.EXE - Creating a Test Case
Run: cpustres.exe (Resource Kit)
171171
Watching the Scheduler: Performance Monitor - Threads Object
Screen snapshot from: Programs | Admin. Tools | Performance Monitor
– Select “Add to Chart”, and Object: Thread
– Use Ctrl-leftClick to select multiple items in a selection box
172172
Watching the Scheduler: Performance Monitor - Options | Chart
Screen snapshot from: Performance Monitor, Options menu | Chart command
Set chart maximum vertical scale to 16
Set update interval to 0.1 seconds or less
173173
Watching the Scheduler: Performance Monitor
Screen snapshot from: PerfMon main window, setup from previous slide
Thread states are indicated by numbers (see thread state transition diagram on previous slide, or Perfmon Explain display for Thread State counter)
– 1 = ready
– 2 = running
– 5 = waiting
174174
Priority Adjustments
Dynamic priority adjustments (boost and decay) are applied to threads in “dynamic” classes
– Threads with base priorities 1-15 (technically, 1 through 14)
– Disable if desired with SetThreadPriorityBoost or SetProcessPriorityBoost
Five types:
– I/O completion
– Wait completion on events or semaphores
– When threads in the foreground process complete a wait
– When GUI threads wake up for windows input
– For CPU starvation avoidance
175175
Priority Adjustments
No automatic adjustments in real-time class (16 or above)
Real time here really means “system won’t change the relative priorities of your real-time threads”
Hence, scheduling is predictable with respect to other “real-time” threads (but not for absolute latency)
176176
To favor I/O intense threads:
After an I/O: specified by device driver
– IoCompleteRequest( Irp, PriorityBoost )
Common boost values (see NTDDK.H)
– 1: disk, CD-ROM, parallel, video
– 2: serial, network, named pipe, mailslot
– 6: keyboard or mouse
– 8: sound
Priority Boosting
177177
Other cases discussed in WIN Scheduling Internals Section
– After a wait on executive event or semaphore
– After any wait on a dispatcher object by a thread in the foreground process
– GUI threads that wake up to process windowing input (e.g. windows messages) get a boost of 2
Priority Boosting
178178
Thread Priority Boost and Decay
Behavior of these boosts:
– Applied to thread’s base priority
– Will not take you above priority 15
– After a boost, you get one quantum
– Then decays 1 level, runs another quantum
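The boost-and-decay behavior can be modeled in a few lines. This is a sketch under the rules stated on the slide (boost relative to base, cap at 15, one level of decay per quantum end, nothing for real-time threads); the struct and function names are hypothetical.

```c
#include <assert.h>

typedef struct {
    int base;     /* base priority    */
    int current;  /* dynamic priority */
} thread_prio_state;

/* Boost on wait completion: applied to the base priority, capped at 15.
   Threads in the real-time range (16 and above) are never adjusted. */
void apply_boost(thread_prio_state *t, int boost)
{
    if (t->base >= 16)
        return;
    int p = t->base + boost;
    t->current = (p > 15) ? 15 : p;
}

/* Quantum end: decay by one level, never below the base priority. */
void quantum_end(thread_prio_state *t)
{
    if (t->current > t->base)
        t->current--;
}
```

A thread at base 8 that receives a keyboard boost of 6 runs one quantum at 14, then at 13, 12, and so on until it is back at 8.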
179179
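[Figure: thread priority (vertical axis, starting at base priority) versus time (horizontal axis, in quantums). Sequence shown: run, wait, boost upon wait complete, run, preempt (before quantum end), run, priority decay at quantum end, round-robin at base priority.]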
Thread Priority Boost and Decay
180180
Thread Scheduling States (2000, XP)
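[Figure: state diagram with states Init (0), Ready (1), Running (2), Standby (3), Terminate (4), Waiting (5), Transition (6).]
Ready = thread eligible to be scheduled to run
Standby = thread is selected to run on CPU
Labeled transitions:
– Running to Waiting: voluntary switch
– Running to Ready: preemption, quantum end
– Standby to Ready: preempt
– Waiting to Transition: wait resolved after kernel stack made pageable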
181181
Other Thread States
Transition
– Thread was in a wait entered from user mode for 12 seconds or more
– System was short on physical memory
– Balance set manager (t.b.d.) marked the thread’s kernel stack as pageable (preparatory to “outswapping” the thread’s process)
– Later, the thread’s wait was satisfied, but...
– ...Thread can’t become Ready until the system allocates a nonpageable kernel stack; it is in the “transition” state until then
Initiate
– Thread is “under construction” and can’t run yet
Standby
– One processor has selected a thread for execution on another processor
Terminate
– Thread has executed its last code, but can’t be deleted until all handles and references to it are closed (object manager)
182182
Scheduling Scenarios: Quantum Details
Quantum internally stored as “3 * number of clock ticks”
– Default quantum is 6 on Professional, 36 on Server
Thread->Quantum field is decremented by 3 on every clock tick
Process and thread objects have a Quantum field
– Process quantum is simply used to initialize thread quantum for all threads in the process
Quantum decremented by 1 when you come out of a wait
– So that threads that get boosted after I/O completion won't keep running and never experience quantum end
– Prevents I/O bound threads from getting unfair preference over CPU bound threads
183183
Scheduling Scenarios: Quantum Details
When Thread->Quantum reaches zero (or less than zero):
– you’ve experienced quantum end
– Thread->Quantum = Process->Quantum; // restore quantum
– for dynamic-priority threads, this is the only thing that restores the quantum
– for real-time threads, quantum is also restored upon preemption
Interval timer interrupts when previous IRQL >= 2:
– are not charged to the current thread’s “privileged” time
– but do cause the thread “remaining quantum” counter to be decremented
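The quantum arithmetic above can be sketched as a small accounting model: 3 units per clock tick, 1 unit per completed wait, and a reload from the process quantum at quantum end. The names are hypothetical, and the sketch assumes the quantum-end check is made after both kinds of decrement.

```c
#include <assert.h>

#define QUANTUM_UNITS_PER_TICK 3

typedef struct {
    int process_quantum;  /* reload value, e.g. 6 (Professional) or 36 (Server) */
    int quantum;          /* remaining quantum units for the thread */
} thread_quantum;

/* Charge units; on quantum end (zero or below) reload and return 1. */
static int charge(thread_quantum *t, int units)
{
    t->quantum -= units;
    if (t->quantum <= 0) {
        t->quantum = t->process_quantum;  /* restore quantum */
        return 1;
    }
    return 0;
}

/* Interval timer tick while the thread is running. */
int on_clock_tick(thread_quantum *t)
{
    return charge(t, QUANTUM_UNITS_PER_TICK);
}

/* Thread comes out of a wait: costs 1 unit (assumed to share the same check). */
int on_wait_complete(thread_quantum *t)
{
    return charge(t, 1);
}
```

With the Professional default of 6 units, quantum end arrives after 2 clock ticks, or after 6 completed waits for a thread that never runs through a full tick.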
184184
Quantum Stretching
Favoring foreground applications
If normal-priority process owns the foreground window, its threads may be given longer quantum
– Set by Control Panel / System applet / Performance tab
– Stored in …\System\CurrentControlSet\Control\PriorityControl, value Win32PrioritySeparation = 0, 1, or 2
– New behavior with NT 4.0; formerly implemented via priority shift
185185
Quantum Stretching
Screen snapshot from: Control Panel | System | Performance tab
186186
Quantum Stretching
Resulting quantum:
– “Maximum” = 6 ticks
– (middle) = 4 ticks
– “None” = 2 ticks
Quantum stretching does not happen on Server
– Quantum on Server is always 12 ticks
[Figure: ready queue at priority 8 with Running and Ready threads.]
187187
As of Windows 2000, can choose short or long quantums (e.g. for Terminal Services)
– NT Server 4.0 was always the same, regardless of slider bar
Screen snapshots (Windows 2000 and XP) from: Control Panel | System | Advanced tab | Performance
Quantum Selection
188188
Finer grained quantum control can be achieved by modifying
– HKLM\System\CurrentControlSet\Control\PriorityControl\Win32PrioritySeparation
– 6 bit value
Bit layout (starting bit positions 4, 2, 0):
– Bits 4-5: Short vs. Long
– Bits 2-3: Variable vs. Fixed
– Bits 0-1: Quantum Boost
Quantum Control
189189
Short vs. Long
– 0, 3: default (short for Pro, long for Server)
– 1: long
– 2: short
Variable vs. Fixed
– 0, 3: default (yes for Pro, no for Server)
– 1: yes
– 2: no
Quantum Boost
– 0: fixed (overrides above setting)
– 1: double quantum of foreground threads
– 2, 3: triple quantum of foreground threads
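Decoding the three 2-bit fields can be illustrated as below. The sketch assumes the layout read off the previous slide's diagram (Quantum Boost in bits 0-1, Variable vs. Fixed in bits 2-3, Short vs. Long in bits 4-5); the struct and function names are hypothetical.

```c
#include <assert.h>

/* The three 2-bit fields of the 6-bit Win32PrioritySeparation value. */
typedef struct {
    int short_vs_long;      /* bits 4-5: 0,3 default; 1 long; 2 short      */
    int variable_vs_fixed;  /* bits 2-3: 0,3 default; 1 variable; 2 fixed  */
    int quantum_boost;      /* bits 0-1: 0 fixed; 1 double; 2,3 triple     */
} priority_separation;

/* Split the registry value into its fields (assumed bit layout). */
priority_separation decode_priority_separation(unsigned value)
{
    priority_separation f;
    f.short_vs_long     = (int)((value >> 4) & 3u);
    f.variable_vs_fixed = (int)((value >> 2) & 3u);
    f.quantum_boost     = (int)(value & 3u);
    return f;
}

/* Multiplier applied to the quantum of foreground threads. */
int foreground_quantum_multiplier(int quantum_boost)
{
    switch (quantum_boost) {
    case 0:  return 1;  /* fixed: no stretching */
    case 1:  return 2;  /* double */
    default: return 3;  /* 2 or 3: triple */
    }
}
```

For instance, the value 0x26 (binary 10 01 10) decodes to short quantums, variable length, and a tripled foreground quantum.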
Quantum Control
190190
Controlling Quantum with Jobs
Scheduling class   Quantum units
0 6
1 12
2 18
3 24
4 30
5 36
6 42
7 48
8 54
9 60
If a process is a member of a job, quantum can be adjusted by setting the “Scheduling Class”
– Only applies if process is higher than Idle priority class
– Only applies if system running with fixed quantums (the default on Servers)
Values are 0-9
– 5 is default
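The table follows a simple pattern: each scheduling class adds 6 quantum units (2 clock ticks at 3 units per tick). A one-line helper, with a hypothetical name, reproduces it:

```c
#include <assert.h>

/* Quantum units for a job scheduling class 0..9, per the table:
   class 0 -> 6, class 5 -> 36, class 9 -> 60. Returns -1 if out of range. */
int job_quantum_units(int scheduling_class)
{
    if (scheduling_class < 0 || scheduling_class > 9)
        return -1;
    return 6 * (scheduling_class + 1);
}
```

The default class 5 gives 36 units, i.e. 12 clock ticks, the same as the default Server quantum.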
191191
After an I/O: specified by device driver
– IoCompleteRequest( Irp, PriorityBoost )
Common boost values (see NTDDK.H)
– 1: disk, CD-ROM, parallel, video
– 2: serial, network, named pipe, mailslot
– 6: keyboard or mouse
– 8: sound
After a wait on executive event or semaphore
– Boost value of 1 is used for these objects
– Server 2003: for critical sections and pushlocks:
Waiting thread is boosted to 1 more than setting thread’s priority (max boost is to 13)
Setting thread loses boost (lock convoy issue)
Priority Boosting
192192
After any wait on a dispatcher object by a thread in the foreground process:
– Boost value of 2
– XP/2003: boost is lost after one full quantum
– Goal: improve responsiveness of interactive apps
GUI threads that wake up to process windowing input (e.g. windows messages) get a boost of 2
– This is added to the current, not base priority
– Goal: improve responsiveness of interactive apps
Priority Boosting
193193
Lab: Foreground Priority Boosts
See Book “EXPERIMENT: Watching Foreground Priority Boosts and Decays”, p.351
See Book “EXPERIMENT: Watching Priority Boosts on GUI Threads”, p.353
194194
CPU Starvation Avoidance
Balance Set Manager (system thread) looks for starved threads
– This is a thread, running at priority 16
– Wakes up once per second and examines Ready queues
– Looks for threads that have been Ready for 300 clock ticks (about 3 seconds on a 10 ms clock, 4.5 seconds on a 15 ms clock)
– Attempts to resolve “priority inversions”: a high-priority thread (12 in diagram) waits on something locked by a lower-priority thread (4), which can’t run because of a middle-priority CPU-bound thread (7). Resolution is not deterministic (no priority inheritance)
[Figure: priority inversion. Thread at priority 12 in Wait, thread at priority 7 in Run, thread at priority 4 in Ready.]
Priority is boosted to 15 (14 prior to NT 4 SP3)
– Quantum is doubled on Win2000/XP and set to 4 on 2003
– At quantum end, returns to previous priority (no gradual decay) and normal quantum
Scans up to 16 Ready threads per priority level each pass
Boosts up to 10 Ready threads per pass
Like all priority boosts, does not apply in the real-time range (priority 16 and above)
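One pass of the starvation scan could look roughly like the sketch below. It encodes only the limits given on the slides (300 ticks, 16 threads scanned per level, 10 boosts per pass, boost to 15, dynamic range only). The data layout, the low-to-high scan order, and all names are assumptions, and the sketch does not move boosted threads between queues or adjust their quantum.

```c
#include <assert.h>

#define STARVATION_TICKS     300
#define MAX_SCAN_PER_LEVEL   16
#define MAX_BOOSTS_PER_PASS  10
#define STARVATION_PRIORITY  15

typedef struct {
    int  priority;          /* current priority of the ready thread  */
    long ready_since_tick;  /* clock tick at which it became Ready   */
} ready_thread;

/* One balance-set-manager pass over the dynamic-priority ready queues.
   queues[p] points at lens[p] ready threads of priority p.
   Returns the number of threads boosted. */
int starvation_pass(ready_thread *queues[16], const int lens[16], long now)
{
    int boosted = 0;
    for (int p = 1; p < STARVATION_PRIORITY; p++) {
        int limit = lens[p] < MAX_SCAN_PER_LEVEL ? lens[p] : MAX_SCAN_PER_LEVEL;
        for (int i = 0; i < limit; i++) {
            if (boosted == MAX_BOOSTS_PER_PASS)
                return boosted;  /* per-pass boost budget exhausted */
            ready_thread *t = &queues[p][i];
            if (now - t->ready_since_tick >= STARVATION_TICKS) {
                t->priority = STARVATION_PRIORITY;
                boosted++;
            }
        }
    }
    return boosted;
}
```

Because the budget is 10 boosts per pass, a long queue of starved threads is worked off over several one-second passes rather than all at once.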
CPU Starvation Avoidance
196196
Lab: CPU Starvation Resolution
See Book “EXPERIMENT: Watching Priority Boosts for CPU Starvation”, p.355
– CpuStres with two compute-bound threads (“maximum” activity level)
– One is at lower priority than the other
See Book “EXPERIMENT: Listening to Priority Boosting”, p.357
197197
Multiprocessor Scheduling
Threads can run on any CPU, unless specified otherwise
– Tries to keep threads on same CPU (“soft affinity”)
– Setting of which CPUs a thread will run on is called “hard affinity”
Fully distributed (no “master processor”)
– Any processor can interrupt another processor to schedule a thread
Scheduling database:
– Pre-Windows Server 2003: single system-wide list of ready queues
– Windows Server 2003: per-CPU ready queues
198198
Hard Affinity
Affinity is a bit mask where each bit corresponds to a CPU number
– Hard Affinity specifies where a thread is permitted to run
– Defaults to all CPUs
– Thread affinity mask must be subset of process affinity mask, which in turn must be a subset of the active processor mask
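The subset rule is a plain bitmask check. A minimal sketch follows, using 32-bit masks as in the slides; the function names are hypothetical, and rejecting an empty thread mask is an added assumption.

```c
#include <assert.h>
#include <stdint.h>

/* True if every CPU bit in inner is also set in outer. */
int is_subset(uint32_t inner, uint32_t outer)
{
    return (inner & ~outer) == 0;
}

/* Thread mask within process mask, process mask within active processors.
   An empty thread mask is rejected (assumption: a thread needs some CPU). */
int affinity_is_valid(uint32_t thread_mask, uint32_t process_mask,
                      uint32_t active_mask)
{
    return thread_mask != 0
        && is_subset(thread_mask, process_mask)
        && is_subset(process_mask, active_mask);
}

/* True if the mask permits running on the given CPU number (0..31). */
int may_run_on(uint32_t thread_mask, int cpu)
{
    assert(cpu >= 0 && cpu < 32);
    return (int)((thread_mask >> cpu) & 1u);
}
```

A mask value of 2 has only bit 1 set, so it permits CPU 1 and nothing else.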
199199
Hard Affinity
Functions to change:
– SetThreadAffinityMask, SetProcessAffinityMask, SetInformationJobObject
Tools to change:
– Task Manager or Process Explorer: right click on process and choose “Set Affinity”
– Psexec -a
200200
Hard Affinity
Can also set an image affinity mask
– See “Imagecfg” tool in Windows 2000 Server Resource Kit Supplement 1
– E.g. Imagecfg –a 2 xyz.exe will run xyz on CPU 1
Can also set “uniprocessor only”: sets affinity mask to one processor
– Imagecfg –u xyz.exe
– System chooses 1 CPU for the process; rotates round robin at each process creation
– Useful as temporary workaround for multithreaded synchronization bugs that appear on MP systems
201201
Hard Affinity
NOTE: Setting hard affinity can lead to threads getting less CPU time than they normally would
– More applicable to large MP systems running dedicated server apps
– Also, OS may in some cases run your thread on CPUs other than your hard affinity setting (flushing DPCs, setting system time)
– Thread “system affinity” vs “user affinity”
202202
Soft Processor Affinity
Every thread has an “ideal processor”
– System selects ideal processor for first thread in process (round robin across CPUs)
– Next thread gets next CPU relative to the process seed
– Can override with:
SetThreadIdealProcessor(
    HANDLE hThread,           // handle to the thread to be changed
    DWORD dwIdealProcessor);  // processor number
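The round-robin assignment of ideal processors can be modeled as follows. This is a sketch under stated assumptions, not the kernel's implementation: `ideal_system_t`, `ideal_process_t`, `process_init`, and `assign_ideal_processor` are hypothetical names, and the simple modulo rotation is an assumption about what "next CPU relative to the process seed" means.

```c
/* Hypothetical system-wide state: the seed handed to the next new process. */
typedef struct {
    unsigned next_seed;  /* rotates round robin across CPUs */
    unsigned cpu_count;  /* number of CPUs in the system, at least 1 */
} ideal_system_t;

/* Hypothetical per-process state. */
typedef struct {
    unsigned seed;             /* ideal processor of the first thread */
    unsigned threads_created;  /* threads created so far in this process */
} ideal_process_t;

/* Give a new process the current system seed, then advance the seed. */
void process_init(ideal_system_t *sys, ideal_process_t *proc)
{
    proc->seed = sys->next_seed;
    sys->next_seed = (sys->next_seed + 1) % sys->cpu_count;
    proc->threads_created = 0;
}

/* First thread gets the process seed; each later thread gets the next CPU. */
unsigned assign_ideal_processor(const ideal_system_t *sys, ideal_process_t *proc)
{
    unsigned ideal = (proc->seed + proc->threads_created) % sys->cpu_count;
    proc->threads_created++;
    return ideal;
}
```

With four CPUs, two processes created back to back would start their threads on CPUs 0, 1, ... and 1, 2, ... respectively, which spreads the first threads of different processes across the machine.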
Soft Processor Affinity
Hard affinity changes update ideal processor settings
Used in selecting where a thread runs next
For Hyperthreaded systems, new Windows API in Server 2003 to allow apps to optimize
– GetLogicalProcessorInformation
For NUMA systems, new APIs to allow apps to optimize:
– Use GetProcessAffinityMask to get list of processors
  – Then GetNumaProcessorNode to get node # for each CPU
– Or call GetNumaHighestNodeNumber and then GetNumaNodeProcessorMask to get processor #s for each node
Windows 2000/XP Dispatcher Database
[Figure: dispatcher database. Labels: Process; Thread 1 to Thread 4; Ready Queues (31 … 0); Ready Summary (bits 31 … 0); Idle Summary Mask (bits 31 … 0); Active Processor Mask (bits 31 … 0); “MP Systems Only”]
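The Ready Summary is a bit mask with one bit per priority level, set when the ready queue for that level is non-empty, so the scheduler can find the highest non-empty queue without walking all 32 lists. A minimal sketch of that idea, assuming a hypothetical `dispatcher_db_t` in which a simple counter stands in for each real thread list:

```c
#include <stdint.h>

#define PRIORITY_LEVELS 32

/* Hypothetical, simplified dispatcher database. */
typedef struct {
    uint32_t ready_summary;               /* bit p set => queue p is non-empty */
    int      queue_len[PRIORITY_LEVELS];  /* stand-in for the real thread lists */
} dispatcher_db_t;

/* Add a ready thread at `priority` (0..31) and mark the queue non-empty. */
void enqueue_ready(dispatcher_db_t *db, int priority)
{
    db->queue_len[priority]++;
    db->ready_summary |= UINT32_C(1) << priority;
}

/* Remove a ready thread at `priority`; clear the bit when the queue drains. */
void dequeue_ready(dispatcher_db_t *db, int priority)
{
    if (db->queue_len[priority] > 0 && --db->queue_len[priority] == 0)
        db->ready_summary &= ~(UINT32_C(1) << priority);
}

/* Highest priority with a ready thread, or -1 when nothing is ready. */
int highest_ready_priority(const dispatcher_db_t *db)
{
    for (int p = PRIORITY_LEVELS - 1; p >= 0; p--)
        if (db->ready_summary & (UINT32_C(1) << p))
            return p;
    return -1;
}
```

A real implementation would likely replace the loop with a single find-highest-set-bit instruction; the loop is used here only to keep the sketch portable.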
Choosing a CPU for a Ready Thread (Windows 2000)
When a thread becomes ready to run (e.g. its wait completes, or it is just beginning execution), the system needs to choose a processor for it to run on
First, it sees if any processors are idle that are in the thread’s hard affinity mask:
– If its “ideal processor” is idle, it runs there
– Else, if the previous processor it ran on is idle, it runs there
– Else if the current processor is idle, it runs there
– Else it picks the highest numbered idle processor in the thread’s affinity mask
Choosing a CPU for a Ready Thread (Windows 2000)
If no processors are idle:
– If the ideal processor is in the thread’s affinity mask, it selects that
– Else if the last processor is in the thread’s affinity mask, it selects that
– Else it picks the highest numbered processor in the thread’s affinity mask
Finally, it compares the priority of the new thread with the priority of the thread running on the processor it selected (if any) to determine whether or not to perform a preemption
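The two decision lists above (idle processors available, then no idle processors) can be written as one function. This is a sketch of the rules as stated on the slides, not kernel source: `thread_info_t`, `choose_cpu`, and `should_preempt` are hypothetical names, and treating "strictly higher priority" as the preemption test is an assumption.

```c
#include <stdint.h>

/* Hypothetical per-thread scheduling data. */
typedef struct {
    uint32_t affinity;   /* hard affinity mask, assumed non-empty */
    int      ideal_cpu;  /* ideal processor number */
    int      last_cpu;   /* processor the thread last ran on, -1 if none */
} thread_info_t;

/* Nonzero when `cpu` is a valid CPU number whose bit is set in `mask`. */
int in_mask(uint32_t mask, int cpu)
{
    return cpu >= 0 && cpu < 32 && (mask & (UINT32_C(1) << cpu)) != 0;
}

/* Highest-numbered CPU in `mask`, or -1 for an empty mask. */
int highest_cpu_in_mask(uint32_t mask)
{
    for (int cpu = 31; cpu >= 0; cpu--)
        if (mask & (UINT32_C(1) << cpu))
            return cpu;
    return -1;
}

/* Pick the CPU a newly ready thread should target. */
int choose_cpu(const thread_info_t *t, uint32_t idle_mask, int current_cpu)
{
    uint32_t idle_allowed = idle_mask & t->affinity;

    if (idle_allowed != 0) {
        /* Some permitted CPU is idle: ideal, then previous, then current. */
        if (in_mask(idle_allowed, t->ideal_cpu)) return t->ideal_cpu;
        if (in_mask(idle_allowed, t->last_cpu))  return t->last_cpu;
        if (in_mask(idle_allowed, current_cpu))  return current_cpu;
        return highest_cpu_in_mask(idle_allowed);
    }

    /* No permitted CPU is idle: ideal, then previous, then highest-numbered. */
    if (in_mask(t->affinity, t->ideal_cpu)) return t->ideal_cpu;
    if (in_mask(t->affinity, t->last_cpu))  return t->last_cpu;
    return highest_cpu_in_mask(t->affinity);
}

/* Assumption: the new thread preempts only a strictly lower-priority thread. */
int should_preempt(int new_priority, int running_priority)
{
    return new_priority > running_priority;
}
```

In the no-idle case the caller would follow `choose_cpu` with `should_preempt` against the thread currently running on the selected processor.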
Selecting a Thread to Run on a CPU (Windows 2000)
System needs to choose a thread to run on a specific CPU:
– At quantum end
– When a thread enters a wait state
– When a thread removes its current processor from its hard affinity mask
– When a thread exits
Starting with the first thread in the highest-priority non-empty ready queue, it scans the queue for the first thread that has the current processor in its hard affinity mask and:
– Ran last on the current processor, or
– Has its ideal processor equal to the current processor, or
– Has been in its Ready queue for 3 or more clock ticks, or
– Has a priority >=24
Selecting a Thread to Run on a CPU (Windows 2000)
If it cannot find such a candidate, it selects the highest priority thread that can run on the current CPU (whose hard affinity includes the current CPU)
– Note: this may mean going to a lower priority ready queue to find a candidate
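The scan described on the last two slides can be sketched as a two-pass search. The layout is an assumption made for brevity: all ready threads sit in one array ordered from highest to lowest priority, in queue order within a priority, and `ready_thread_t`, `is_preferred`, `can_run_on`, and `select_thread` are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical view of one ready thread. */
typedef struct {
    uint32_t affinity;     /* hard affinity mask */
    int      ideal_cpu;    /* ideal processor */
    int      last_cpu;     /* processor it last ran on */
    int      priority;     /* 0..31 */
    unsigned ticks_ready;  /* clock ticks spent in its ready queue */
} ready_thread_t;

/* Nonzero when the thread's hard affinity includes `cpu` (0..31). */
int can_run_on(const ready_thread_t *t, int cpu)
{
    return (int)((t->affinity >> cpu) & 1u);
}

/* Any one of the four conditions listed on the slide. */
int is_preferred(const ready_thread_t *t, int cpu)
{
    return t->last_cpu == cpu || t->ideal_cpu == cpu ||
           t->ticks_ready >= 3 || t->priority >= 24;
}

/* Returns the index of the thread to run on `cpu`, or -1 if none can. */
int select_thread(const ready_thread_t *ready, size_t n, int cpu)
{
    if (n == 0)
        return -1;

    /* Pass 1: scan only the highest-priority non-empty queue. */
    int top = ready[0].priority;
    for (size_t i = 0; i < n && ready[i].priority == top; i++)
        if (can_run_on(&ready[i], cpu) && is_preferred(&ready[i], cpu))
            return (int)i;

    /* Pass 2: highest-priority thread whose affinity allows this CPU,
       which may come from a lower-priority queue. */
    for (size_t i = 0; i < n; i++)
        if (can_run_on(&ready[i], cpu))
            return (int)i;

    return -1;
}
```

The first pass favors threads with a cache-warm or ideal-processor relationship to the CPU; the aging (3 ticks) and priority (>= 24) conditions keep that preference from delaying a thread for long.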
Windows Server 2003 Dispatcher Database
[Figure: per-CPU dispatcher database. Labels: Process; Thread 1 to Thread 4; CPU 0 Ready Queues (31 … 0), Ready Summary (bits 31 … 0), Deferred Ready Queue; CPU 1 Ready Queues (31 … 0), Ready Summary (bits 31 … 0), Deferred Ready Queue]
Server 2003 Enhancements
Threads always go into the ready queue of their ideal processor
Instead of locking the dispatcher database to look for a candidate to run, per-CPU ready queue is checked first (first grabs PRCB spinlock)
– If a thread has been selected to run on the CPU, does the context swap
– Else begins scan of other CPUs’ ready queues looking for a thread to run
  – This scan is done OUTSIDE the dispatcher lock
  – Just acquires CPU PRCB lock
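The "local queue first, then scan the others" order can be illustrated with a small single-threaded model. It is heavily simplified and partly assumed: `prcb_t` and `find_source_cpu` are hypothetical names, the scan order over other CPUs is an assumption, hard affinity is ignored, and the per-CPU PRCB locks appear only as comments.

```c
#include <stdint.h>

/* Hypothetical per-processor control block holding only a ready summary. */
typedef struct {
    uint32_t ready_summary;  /* bit p set => this CPU has a ready thread at priority p */
} prcb_t;

/* Returns the CPU whose ready queue should supply the next thread for `self`,
   or -1 when every queue is empty and `self` should go idle. */
int find_source_cpu(const prcb_t *prcbs, int ncpus, int self)
{
    /* A real system would hold only self's PRCB lock for this check. */
    if (prcbs[self].ready_summary != 0)
        return self;

    /* Otherwise visit the other CPUs one at a time; a real system would take
       each visited CPU's PRCB lock, never the global dispatcher lock. */
    for (int i = 1; i < ncpus; i++) {
        int other = (self + i) % ncpus;
        if (prcbs[other].ready_summary != 0)
            return other;
    }
    return -1;
}
```

Because each step touches one processor's data at a time, CPUs that find work locally never contend with each other, which is the point of the per-CPU design.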
Server 2003 Enhancements
Dispatcher lock still acquired to wait or unwait a thread and/or change state of a dispatcher object
Bottom line: dispatcher lock is now held for a MUCH shorter time
Thread Scheduling States (Server 2003)
[Figure: thread state diagram. States: Init (0), Ready (1), Running (2), Standby (3), Terminate (4), Waiting (5), Transition (6), DeferredReady (7). Transition labels: voluntary switch; preemption, quantum end; preempt]
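The state numbers in the diagram map naturally onto an enumeration. The identifiers below are hypothetical names chosen for this sketch; only the numeric values and state names come from the slide.

```c
/* Thread scheduling states with the numeric values shown in the diagram. */
typedef enum {
    THREAD_STATE_INIT           = 0,
    THREAD_STATE_READY          = 1,
    THREAD_STATE_RUNNING        = 2,
    THREAD_STATE_STANDBY        = 3,
    THREAD_STATE_TERMINATE      = 4,
    THREAD_STATE_WAITING        = 5,
    THREAD_STATE_TRANSITION     = 6,
    THREAD_STATE_DEFERRED_READY = 7
} thread_state_t;

/* Human-readable name for a state value, e.g. for debug output. */
const char *thread_state_name(thread_state_t state)
{
    switch (state) {
    case THREAD_STATE_INIT:           return "Init";
    case THREAD_STATE_READY:          return "Ready";
    case THREAD_STATE_RUNNING:        return "Running";
    case THREAD_STATE_STANDBY:        return "Standby";
    case THREAD_STATE_TERMINATE:      return "Terminate";
    case THREAD_STATE_WAITING:        return "Waiting";
    case THREAD_STATE_TRANSITION:     return "Transition";
    case THREAD_STATE_DEFERRED_READY: return "DeferredReady";
    }
    return "Unknown";
}
```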
Server 2003 Enhancements
Idle processor selection further refined to:
– NUMA system: if there are idle CPUs in the node containing the thread’s ideal processor, reduce to that set
– Hyperthreaded system: if one of the idle processors is a physical processor with all logical processors idle, reduce to that set
Then try to eliminate idle CPUs that are sleeping
If thread ran last on a member of the set, pick that CPU
– Else pick lowest numbered CPU in remaining set
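The refinement steps above amount to successively narrowing a candidate set of idle CPUs. The following sketch assumes a hypothetical `topology_t` that the caller fills in; none of these names or layouts come from Windows, and each narrowing step is skipped when it would leave the set empty.

```c
#include <stdint.h>

/* Hypothetical topology description supplied by the caller. */
typedef struct {
    uint32_t ideal_node_mask;  /* CPUs in the NUMA node of the thread's ideal processor */
    uint32_t core_mask[32];    /* core_mask[c]: logical CPUs on the same physical processor as c */
    uint32_t sleeping_mask;    /* idle CPUs that are currently sleeping */
} topology_t;

/* Returns the chosen idle CPU, or -1 if no permitted CPU is idle. */
int pick_idle_cpu(uint32_t idle_mask, uint32_t affinity,
                  const topology_t *topo, int last_cpu)
{
    uint32_t set = idle_mask & affinity;
    if (set == 0)
        return -1;

    /* NUMA: prefer idle CPUs in the ideal processor's node. */
    if (set & topo->ideal_node_mask)
        set &= topo->ideal_node_mask;

    /* Hyperthreading: prefer CPUs whose whole physical processor is idle. */
    uint32_t fully_idle = 0;
    for (int c = 0; c < 32; c++) {
        uint32_t bit = UINT32_C(1) << c;
        if ((set & bit) && (topo->core_mask[c] & ~idle_mask) == 0)
            fully_idle |= bit;
    }
    if (fully_idle != 0)
        set = fully_idle;

    /* Drop sleeping CPUs when an awake candidate remains. */
    if (set & ~topo->sleeping_mask)
        set &= ~topo->sleeping_mask;

    /* Prefer the CPU the thread last ran on, else the lowest-numbered one. */
    if (last_cpu >= 0 && last_cpu < 32 && (set & (UINT32_C(1) << last_cpu)))
        return last_cpu;
    for (int c = 0; c < 32; c++)
        if (set & (UINT32_C(1) << c))
            return c;
    return -1;
}
```

On a machine with two physical processors of two logical CPUs each (CPUs 0,1 and 2,3), if only CPU 1 is busy, a thread that last ran on CPU 0 is steered to CPU 2, since the second physical processor is entirely idle.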
Affinity Collisions
Affinity mask bit order: CPU 1, CPU 0
– Thread A: current priority 4, affinity mask 10
– Thread B: current priority 8, affinity mask 11
– Thread C: current priority 6, affinity mask 01
Highest-priority n threads may not be running if thread affinity interferes
NT guarantees the highest-priority thread will be Running
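– But the next n-1 highest-priority Ready threads may not be…
– because scheduler will not move running threads among CPUs
Example: Threads became Ready in order A, B, C
– A starts on CPU 1 (its only permitted CPU); B starts on the idle CPU 0
– C (priority 6) may run only on CPU 0 and cannot preempt B (priority 8), so C stays Ready while lower-priority A keeps running on CPU 1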
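The example can be replayed with a toy two-CPU dispatcher. This is a simulation under simplifying assumptions, not the Windows algorithm: `sim_thread_t`, `sim_system_t`, and `make_ready` are hypothetical names, ideal and previous processors are ignored, and a preempted thread is simply dropped rather than requeued.

```c
#include <stddef.h>
#include <stdint.h>

#define NCPUS 2

/* Hypothetical thread description for the simulation. */
typedef struct {
    char     name;
    int      priority;
    uint32_t affinity;  /* bit 1 = CPU 1, bit 0 = CPU 0 */
} sim_thread_t;

/* running[cpu] is the thread on that CPU, or NULL when the CPU is idle. */
typedef struct {
    const sim_thread_t *running[NCPUS];
} sim_system_t;

/* Returns the CPU the thread starts running on, or -1 if it stays Ready. */
int make_ready(sim_system_t *sys, const sim_thread_t *t)
{
    int candidate = -1;

    /* Prefer an idle permitted CPU, scanning from the highest number down. */
    for (int cpu = NCPUS - 1; cpu >= 0; cpu--) {
        if (!((t->affinity >> cpu) & 1u))
            continue;
        if (sys->running[cpu] == NULL) {
            sys->running[cpu] = t;
            return cpu;
        }
        if (candidate < 0)
            candidate = cpu;  /* highest-numbered busy permitted CPU */
    }

    /* No idle permitted CPU: preempt only a strictly lower-priority thread.
       Running threads are never migrated to make room. */
    if (candidate >= 0 && t->priority > sys->running[candidate]->priority) {
        sys->running[candidate] = t;
        return candidate;
    }
    return -1;
}
```

Feeding in A, B, C in that order leaves A on CPU 1, B on CPU 0, and C in the Ready state, even though C outranks A.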
Thoughts Change Life