Carnegie Mellon
Exceptions and Processes
Slides adapted from: Gregory Kesden and Markus Püschel of Carnegie Mellon University

Control Flow
  Processors do only one thing:
    From startup to shutdown, a CPU simply reads and executes (interprets) a sequence of instructions, one at a time
    This sequence is the CPU’s control flow (or flow of control)

  Physical control flow (in time order):
    <startup>
    inst1
    inst2
    inst3
    …
    instn
    <shutdown>

Altering the Control Flow
  Up to now: two mechanisms for changing control flow:
    Jumps and branches
    Call and return
    Both react to changes in program state
  Insufficient for a useful system: difficult to react to changes in system state
    Data arrives from a disk or a network adapter
    Instruction divides by zero
    User hits Ctrl-C at the keyboard
    System timer expires
  System needs mechanisms for “exceptional control flow”

Exceptional Control Flow
  Exists at all levels of a computer system
  Low-level mechanisms
    Exceptions: change in control flow in response to a system event (i.e., change in system state)
    Implemented by a combination of hardware and OS software
  Higher-level mechanisms
    Process context switch
    Signals
    Nonlocal jumps: setjmp()/longjmp()
    Implemented by either:
      OS software (context switch and signals)
      C language runtime library (nonlocal jumps)
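As an illustration of the nonlocal jumps listed above, here is a minimal sketch in C. The names recovery_point, jumped_code, deep_helper, and run_with_recovery are hypothetical and chosen only for this example; the sketch shows control returning to the setjmp() site without passing through the normal call/return path.

```c
#include <setjmp.h>

/* Hypothetical names, for illustration only. */
static jmp_buf recovery_point;   /* saved execution context */
static int jumped_code;          /* value handed back across the jump */

/* Simulates a deeply nested function that bails out with a nonlocal jump. */
static void deep_helper(int code) {
    jumped_code = code;
    longjmp(recovery_point, 1);  /* never returns to its caller */
}

/* Returns the code passed through the nonlocal jump. */
int run_with_recovery(int code) {
    if (setjmp(recovery_point) == 0) {
        /* First return from setjmp: run the normal path. */
        deep_helper(code);
        return -1;               /* not reached: deep_helper always jumps */
    }
    /* Second return from setjmp, caused by longjmp. */
    return jumped_code;
}
```

Like fork later in these slides, setjmp() is called once but can return more than once.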

Exceptions
  An exception is a transfer of control to the OS in response to some event (i.e., change in processor state)
  Examples: div by 0, arithmetic overflow, page fault, I/O request completes, Ctrl-C

  [Figure: in the user process, an event occurs at I_current (followed by I_next); the exception transfers control to the OS, where the exception handler processes it and then does one of:]
    • return to I_current
    • return to I_next
    • abort

Interrupt Vectors
  Each type of event has a unique exception number k
  k = index into exception table (a.k.a. interrupt vector)
  Handler k is called each time exception k occurs

  [Figure: exception table indexed by exception numbers 0, 1, 2, ..., n-1; entry k points to the code for exception handler k]
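The table lookup described above can be sketched in user-level C as an array of function pointers. This is only a toy model: real exception tables are set up by the OS and indexed by hardware, and the table size, handler names, and return values below are assumptions made for illustration (the indices are not the IA32 exception numbers).

```c
#include <stddef.h>

#define NUM_EXCEPTIONS 4             /* hypothetical table size */

typedef int (*handler_fn)(void);     /* a handler returns a status code */

/* Hypothetical handlers; the return values are arbitrary markers. */
static int handle_divide_error(void) { return 100; }
static int handle_page_fault(void)   { return 101; }
static int handle_unknown(void)      { return -1; }

/* Entry k holds the address of the handler for exception number k. */
static handler_fn exception_table[NUM_EXCEPTIONS] = {
    handle_divide_error,   /* k = 0 */
    handle_unknown,        /* k = 1 */
    handle_page_fault,     /* k = 2 */
    handle_unknown         /* k = 3 */
};

/* Looks up handler k in the table and calls it. */
int dispatch_exception(size_t k) {
    if (k >= NUM_EXCEPTIONS)
        return -1;                   /* no such exception number */
    return exception_table[k]();
}
```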

Asynchronous Exceptions (Interrupts)
  Caused by events external to the processor
    Indicated by setting the processor’s interrupt pin
    Handler returns to “next” instruction
  Examples:
    I/O interrupts
      Hitting Ctrl-C at the keyboard
      Arrival of a packet from a network
      Arrival of data from a disk
    Hard reset interrupt
      Hitting the reset button
    Soft reset interrupt
      Hitting Ctrl-Alt-Delete on a PC

Synchronous Exceptions
  Caused by events that occur as a result of executing an instruction:
    Traps
      Intentional
      Examples: system calls, breakpoint traps, special instructions
      Returns control to “next” instruction
    Faults
      Unintentional but possibly recoverable
      Examples: page faults (recoverable), protection faults (unrecoverable), floating point exceptions
      Either re-executes faulting (“current”) instruction or aborts
    Aborts
      Unintentional and unrecoverable
      Examples: parity error, machine check
      Aborts current program

Trap Example: Opening File
  User calls: open(filename, options)
  Function open executes system call instruction int
  OS must find or create file, get it ready for reading or writing
  Returns integer file descriptor

    0804d070 <__libc_open>:
      . . .
      804d082:  cd 80    int   $0x80
      804d084:  5b       pop   %ebx
      . . .

  [Figure: the user process executes int, raising an exception; the OS opens the file and returns to the next instruction, pop]

Fault Example: Page Fault
  User writes to memory location
  That portion (page) of user’s memory is currently on disk
  Page handler must load page into physical memory
  Returns to faulting instruction
  Successful on second try

    int a[1000];
    int main(void)
    {
        a[500] = 13;
    }

    80483b7:  c7 05 10 9d 04 08 0d    movl   $0xd,0x8049d10

  [Figure: the user process executes movl, raising a page fault exception; the OS creates the page, loads it into memory, and returns so that movl is re-executed]

Fault Example: Invalid Memory Reference
  Page handler detects invalid address
  Sends SIGSEGV signal to user process
  User process exits with “segmentation fault”

    int a[1000];
    int main(void)
    {
        a[5000] = 13;
    }

    80483b7:  c7 05 60 e3 04 08 0d    movl   $0xd,0x804e360

  [Figure: the user process executes movl, raising a page fault exception; the OS detects the invalid address and signals the process]
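The effect of an unrecoverable fault can be observed from a parent process. The sketch below assumes Linux and uses a hypothetical function name, signal_from_bad_store. Instead of indexing past an array (whose outcome depends on what happens to be mapped there), the child stores to a page it has deliberately mapped with no access permissions, so strictly speaking it triggers a protection fault rather than an invalid-address fault; on Linux the kernel delivers SIGSEGV in both cases.

```c
#define _DEFAULT_SOURCE              /* for MAP_ANONYMOUS on glibc */
#include <signal.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: returns the number of the signal that killed the
 * child, 0 if the child exited normally, or -1 on error. */
int signal_from_bad_store(void) {
    pid_t pid = fork();
    if (pid == -1)
        return -1;

    if (pid == 0) {
        /* Child: map one page that may be neither read nor written. */
        volatile int *p = mmap(NULL, 4096, PROT_NONE,
                               MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (p == MAP_FAILED)
            _exit(2);
        p[0] = 13;                   /* faulting store */
        _exit(0);                    /* reached only if the store succeeded */
    }

    /* Parent: wait for the child and decode how it ended. */
    int status;
    if (waitpid(pid, &status, 0) == -1)
        return -1;
    return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}
```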

User mode vs. Kernel mode
  Privileged instructions

Exception handlers
  Return address
    Depends on the class of exception
    Current instruction (page fault)
    Next instruction
  Push some additional processor state
    Necessary to restart the interrupted program when the handler returns
  If the control is being transferred from user to kernel
    All these items are pushed on the kernel stack
  Exception handlers run in kernel mode

Exception Table IA32 (Excerpt)

  Exception Number   Description                Exception Class
  0                  Divide error               Fault
  13                 General protection fault   Fault
  14                 Page fault                 Fault
  18                 Machine check              Abort
  32-127             OS-defined                 Interrupt or trap
  128 (0x80)         System call                Trap
  129-255            OS-defined                 Interrupt or trap

  Check p. 183: http://download.intel.com/design/processor/manuals/253665.pdf

    # write our string to stdout
    movl  $len,%edx    # third argument: message length
    movl  $msg,%ecx    # second argument: message to write
    movl  $1,%ebx      # first argument: file handle (stdout)
    movl  $4,%eax      # system call number (sys_write)
    int   $0x80        # call kernel

    # and exit
    movl  $0,%ebx      # first argument: exit code
    movl  $1,%eax      # system call number (sys_exit)
    int   $0x80        # call kernel

    .data                            # section declaration
    msg: .ascii "Hello, world!\n"    # our dear string
    len = . - msg                    # length of our dear string

  In C:

    #include <stdlib.h>
    #include <unistd.h>

    int main()
    {
        write(1, "hello, world\n", 13);
        exit(0);
    }
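Between the assembly and the C library call sits the system call number. As a rough sketch, assuming Linux with glibc, the generic syscall() wrapper makes that number explicit from C; raw_write is a hypothetical name used only here.

```c
#define _GNU_SOURCE                  /* for syscall() on glibc */
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical helper: invokes the write system call by number, much as
 * the assembly loads the call number into %eax before trapping. */
long raw_write(int fd, const char *buf, unsigned long len) {
    return syscall(SYS_write, fd, buf, len);
}
```

For example, raw_write(1, "hello, world\n", 13) asks the kernel to write 13 bytes to stdout and returns the number of bytes written, or -1 on error.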

Today
  Exceptional Control Flow
  Processes

Processes
  Definition: A process is an instance of a running program.
    One of the most profound ideas in computer science
    Not the same as “program” or “processor”
  Process provides each program with two key abstractions:
    Logical control flow
      Each program seems to have exclusive use of the CPU
    Private virtual address space
      Each program seems to have exclusive use of main memory
  How are these illusions maintained?
    Process executions interleaved (multitasking)
    Address spaces managed by virtual memory system
      We’ll talk about this in a couple of weeks

What is a process?
  A process is the OS's abstraction for execution
    A process represents a single running application on the system
  Process has three main components:
    1) Address space
      The memory that the process can access
      Consists of various pieces: the program code, static variables, heap, stack, etc.
    2) Processor state
      The CPU registers associated with the running process
      Includes general purpose registers, program counter, stack pointer, etc.
    3) OS resources
      Various OS state associated with the process
      Examples: open files, network sockets, etc.

Process Address Space
  The range of virtual memory addresses that the process can access
  Includes:
    The code of the running program
    The data of the running program (static variables and heap)
    An execution stack
      Local variables and saved registers for each procedure call

  [Figure: address space spanning 0x00000000 to 0xFFFFFFFF, divided into Stack, Heap, Uninitialized vars (BSS segment), Initialized vars (data segment), Code (text segment), and a region reserved for the OS; the stack pointer points into the stack and the program counter into the code]

Process Address Space
  Note!!! This is the process's own view of the address space; physical memory may not be laid out this way at all.
    In fact, on systems that support multiple running processes, it's pretty much guaranteed to look different.
  The virtual memory system provides this illusion to each process.

  [Figure: the same address space layout as on the previous slide]

Execution State of a Process
  Each process has an execution state
    Indicates what the process is currently doing
  Running:
    Process is currently using the CPU
  Ready:
    Currently waiting to be assigned to a CPU
    That is, the process could be running, but another process is using the CPU
  Waiting (or sleeping):
    Process is waiting for an event
    Such as completion of an I/O, a timer to go off, etc.
    Why is this different than “ready”?
  As the process executes, it moves between these states
    What state is the process in most of the time?
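
Process State Transitions
  What causes schedule and unschedule transitions?

  [Figure: state diagram with states New, Ready, Running, Waiting, Terminated]
    New -> Ready: create
    Ready -> Running: schedule
    Running -> Ready: unschedule
    Running -> Waiting: I/O, page fault, etc.
    Waiting -> Ready: I/O done
    Running -> Terminated: kill or exit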
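The state diagram can be written down as a small transition function. This is a sketch of the diagram only, not of any real kernel; the enum and function names (proc_state, proc_event, next_state) are hypothetical, and any state/event pair not drawn in the diagram simply leaves the state unchanged.

```c
/* States from the diagram. */
typedef enum { P_NEW, P_READY, P_RUNNING, P_WAITING, P_TERMINATED } proc_state;

/* Events labelling the arrows (EV_BLOCK stands for "I/O, page fault, etc."). */
typedef enum {
    EV_CREATE, EV_SCHEDULE, EV_UNSCHEDULE, EV_BLOCK, EV_IO_DONE, EV_EXIT
} proc_event;

/* Returns the state reached from s on event e. */
proc_state next_state(proc_state s, proc_event e) {
    switch (s) {
    case P_NEW:
        if (e == EV_CREATE) return P_READY;
        break;
    case P_READY:
        if (e == EV_SCHEDULE) return P_RUNNING;
        break;
    case P_RUNNING:
        if (e == EV_UNSCHEDULE) return P_READY;
        if (e == EV_BLOCK)      return P_WAITING;
        if (e == EV_EXIT)       return P_TERMINATED;
        break;
    case P_WAITING:
        if (e == EV_IO_DONE) return P_READY;
        break;
    case P_TERMINATED:
        break;
    }
    return s;   /* transition not in the diagram: stay put */
}
```

Note that a waiting process never goes straight to Running: it must first become Ready and then be scheduled.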

Process Control Block
  OS maintains a Process Control Block (PCB) for each process
  The PCB is a big data structure with many fields:
    Process ID
    User ID
    Execution state
      Ready, running, or waiting
    Saved CPU state
      CPU registers saved the last time the process was suspended
    OS resources
      Open files, network sockets, etc.
    Memory management info
    Scheduling priority
      Give some processes higher priority than others
    Accounting information
      Total CPU time, memory usage, etc.
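To make the field list concrete, here is a heavily simplified sketch of what a PCB might look like as a C struct. Every name, type, and array size below is an assumption made for illustration; real kernels use far larger structures (Linux's task_struct, for instance) with different layouts.

```c
#include <stdint.h>
#include <string.h>

#define MAX_OPEN_FILES 16            /* hypothetical limit */

enum pcb_state { PCB_READY, PCB_RUNNING, PCB_WAITING };

/* Saved CPU state: filled in when the process is suspended. */
struct cpu_state {
    uintptr_t pc;                    /* program counter */
    uintptr_t sp;                    /* stack pointer */
    uintptr_t regs[8];               /* general purpose registers */
};

struct pcb {
    int pid;                         /* process ID */
    int uid;                         /* user ID */
    enum pcb_state state;            /* execution state */
    struct cpu_state saved;          /* saved CPU state */
    int open_files[MAX_OPEN_FILES];  /* OS resources: descriptors, -1 = unused */
    int priority;                    /* scheduling priority */
    unsigned long cpu_time_us;       /* accounting: total CPU time */
    struct pcb *next;                /* link for queueing PCBs */
};

/* Initializes a PCB for a newly created process in the Ready state. */
void pcb_init(struct pcb *p, int pid, int uid) {
    memset(p, 0, sizeof *p);
    p->pid = pid;
    p->uid = uid;
    p->state = PCB_READY;
    for (int i = 0; i < MAX_OPEN_FILES; i++)
        p->open_files[i] = -1;
}
```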

Context Switching
  Processes are managed by a shared chunk of OS code called the kernel
    Important: the kernel is not a separate process, but rather runs as part of some user process
  Control flow passes from one process to another via a context switch

  [Figure: timeline: Process A runs user code; kernel code performs a context switch to Process B, which runs user code; kernel code then performs a context switch back to Process A's user code]

Context Switching
  The act of swapping a process state on or off the CPU is a context switch

  [Figure, in three steps; each PCB holds a PID, a state, and a saved PC and registers]
    1) Save current CPU state: the PC and registers of the currently running process (PID 1342, Running) are saved into its PCB; PID 4277 and PID 8109 are Ready
    2) Suspend process: PID 1342 becomes Ready
    3) Pick next process and restore its CPU state: PID 4277 becomes Running and its saved PC and registers are loaded onto the CPU

Context Switch Overhead
  Context switches are not cheap
    Generally have a lot of CPU state to save and restore
    Also must update various flags in the PCB
    Picking the next process to run (scheduling) is also expensive
  Context switch overhead in Linux 2.4.21
    About 5.4 usec on a 2.4 GHz Pentium 4
    This is equivalent to about 13,000 CPU cycles!
      Not quite that many instructions since CPI > 1
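The cycle count is just elapsed time multiplied by clock rate: a 1 GHz clock ticks 1000 times per microsecond. A tiny sketch of that arithmetic, with a hypothetical function name:

```c
/* Converts a duration in microseconds to clock cycles at the given clock
 * rate in GHz (1 GHz = 1000 cycles per microsecond). */
double cycles_for(double microseconds, double clock_ghz) {
    return microseconds * clock_ghz * 1000.0;
}
```

With the figures on the slide, cycles_for(5.4, 2.4) gives 12,960 cycles, i.e. roughly 13,000.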

Context Switching in Linux
  Process A is happily running along...
  1) Timer interrupt fires
  2) PC saved on stack
  3) Rest of CPU state saved in PCB
  4) Call schedule() routine
  5) Decide next process to run
  6) Resume Process B (suspended within timer interrupt handler!)
  7) Return from interrupt handler: process CPU state restored

  [Figure: timeline: Process A runs in user mode; the timer interrupt handler and then the scheduler run in kernel mode; Process B's timer interrupt handler resumes in kernel mode and returns to Process B in user mode]

State Queues
  The OS maintains a set of state queues that group processes by state
    Separate queues for ready and waiting states
    Generally separate queues for each kind of waiting process
      e.g., one queue for processes waiting for disk I/O, another queue for processes waiting for network I/O, etc.

  [Figure: the ready queue holds the PCBs of PID 4277 and PID 4391 (Ready); the disk I/O queue holds the PCBs of PID 4110, PID 4002, and PID 4923 (Waiting); each PCB stores the PID, state, PC, and registers]

State Queue Transitions
  PCBs move between these queues as their state changes
    When scheduling a process, pop the head off of the ready queue
    When I/O has completed, move PCB from waiting queue to ready queue

  [Figure: when disk I/O completes for PID 4923, its PCB leaves the disk I/O queue (which keeps PID 4110 and PID 4002, Waiting), its state changes to Ready, and it joins the ready queue after PID 4277 and PID 4391]
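The queue movements above can be sketched with singly linked lists. This is an illustrative model only: the struct and function names (task, queue, enqueue, dequeue, remove_pid, io_complete) are hypothetical, and the task struct keeps just the fields needed for queueing rather than a full PCB.

```c
#include <stddef.h>

typedef enum { Q_READY, Q_WAITING } q_state;

/* A stripped-down PCB: just enough to sit on a queue. */
struct task {
    int pid;
    q_state state;
    struct task *next;
};

struct queue {
    struct task *head, *tail;
};

/* Appends t to the end of q. */
void enqueue(struct queue *q, struct task *t) {
    t->next = NULL;
    if (q->tail) q->tail->next = t; else q->head = t;
    q->tail = t;
}

/* Pops the head of q (what the scheduler does with the ready queue). */
struct task *dequeue(struct queue *q) {
    struct task *t = q->head;
    if (!t) return NULL;
    q->head = t->next;
    if (!q->head) q->tail = NULL;
    t->next = NULL;
    return t;
}

/* Unlinks the task with the given pid from q, wherever it sits. */
struct task *remove_pid(struct queue *q, int pid) {
    struct task *prev = NULL;
    for (struct task *t = q->head; t; prev = t, t = t->next) {
        if (t->pid == pid) {
            if (prev) prev->next = t->next; else q->head = t->next;
            if (q->tail == t) q->tail = prev;
            t->next = NULL;
            return t;
        }
    }
    return NULL;
}

/* I/O finished for pid: move it from the wait queue to the ready queue. */
struct task *io_complete(struct queue *waitq, struct queue *readyq, int pid) {
    struct task *t = remove_pid(waitq, pid);
    if (t) {
        t->state = Q_READY;
        enqueue(readyq, t);
    }
    return t;
}
```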

Process Creation
  One process can create, or fork, another process
    The original process is the parent
    The new process is the child
  What creates the first process in the system, and when?
  Parent process defines resources and access rights of children
    Just like real life ...
    e.g., child process inherits parent's user ID

    % pstree -p
    init(1)-+-apmd(687)
            |-atd(847)
            |-crond(793)
            |-rxvt(2700)---bash(2702)---ooffice(2853)
            `-rxvt(2752)---bash(2754)

UNIX fork mechanism
  In UNIX, use the fork() system call to create a new process
    This creates an exact duplicate of the parent process!!
    Creates and initializes a new PCB
    Creates a new address space
    Copies entire contents of parent's address space into the child
    Initializes CPU and OS resources to a copy of the parent's
    Places new PCB on ready queue

  [Figure: the running process (PID 4109) calls fork(); its state is copied into a new PCB (PID 4110, Ready), which is added to the end of the ready queue behind PID 4277 and PID 4391]

fork: Creating New Processes
  pid_t fork(void)
    Creates a new process (child process) that is identical to the calling process (parent process)
    Returns 0 to the child process
    Returns child’s pid to the parent process
  Fork is interesting (and often confusing) because it is called once but returns twice

    pid_t pid = fork();
    if (pid == 0) {
        printf("hello from child\n");
    } else {
        printf("hello from parent\n");
    }

Understanding fork
  After fork() returns, process n (the parent) and child process m both continue executing the same code:

    pid_t pid = fork();
    if (pid == 0) {
        printf("hello from child\n");
    } else {
        printf("hello from parent\n");
    }

  In the parent, pid = m, so it prints "hello from parent"
  In the child, pid = 0, so it prints "hello from child"
  Which one is first?

Fork Example #1

    void fork1()
    {
        int x = 1;
        pid_t pid = fork();
        if (pid == 0) {
            printf("Child has x = %d\n", ++x);
        } else {
            printf("Parent has x = %d\n", --x);
        }
        printf("Bye from process %d with x = %d\n", getpid(), x);
    }

  Parent and child both run same code
    Distinguish parent from child by return value from fork
  Start with same state, but each has private copy
    Including shared output file descriptor
    Relative ordering of their print statements undefined
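That each process has a private copy of x can be checked from the parent. In the sketch below (the function name fork_private_copy is hypothetical) the child reports its final x through its exit status instead of printing it, so the result does not depend on output ordering.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: runs the fork1 pattern and stores the final value
 * of x seen by the parent and by the child. Returns 0 on success. */
int fork_private_copy(int *parent_x, int *child_x) {
    int x = 1;
    pid_t pid = fork();
    if (pid == -1)
        return -1;

    if (pid == 0) {
        ++x;                  /* modifies only the child's copy */
        _exit(x);             /* report the child's x via the exit status */
    }

    --x;                      /* modifies only the parent's copy */

    int status;
    if (waitpid(pid, &status, 0) == -1 || !WIFEXITED(status))
        return -1;

    *parent_x = x;
    *child_x = WEXITSTATUS(status);
    return 0;
}
```

Both processes start from x = 1; the child ends with 2 and the parent with 0, because neither update is visible to the other.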

Fork Example #2
  Both parent and child can continue forking

    void fork2()
    {
        printf("L0\n");
        fork();
        printf("L1\n");
        fork();
        printf("Bye\n");
    }

  [Figure: process graph: L0 is printed once, L1 twice (after the first fork), and Bye four times (after the second fork)]
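The number of processes that reach the final print can be counted rather than read off a diagram. The following sketch assumes a POSIX system and uses a hypothetical function name; each process writes one byte into a pipe where fork2 would print "Bye", and the caller counts the bytes. A pipe is used instead of printf to avoid duplicated stdio buffers across fork.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: runs the two-fork pattern in a subtree of processes
 * and returns how many of them reached the "Bye" point, or -1 on error. */
int count_bye_after_two_forks(void) {
    int fds[2];
    if (pipe(fds) == -1)
        return -1;

    pid_t top = fork();
    if (top == -1)
        return -1;

    if (top == 0) {
        close(fds[0]);
        pid_t a = fork();                 /* first fork: 2 processes */
        pid_t b = fork();                 /* second fork: 4 processes */
        if (write(fds[1], "B", 1) != 1)   /* stands in for printf("Bye\n") */
            _exit(1);
        if (b > 0) {                      /* this process created children */
            waitpid(b, NULL, 0);
            if (a > 0)
                waitpid(a, NULL, 0);
        }
        _exit(0);
    }

    /* Caller: read one byte per "Bye" until every writer has exited. */
    close(fds[1]);
    int count = 0;
    char ch;
    while (read(fds[0], &ch, 1) == 1)
        count++;
    close(fds[0]);
    waitpid(top, NULL, 0);
    return count;
}
```

Each fork doubles the number of processes, so two forks give four processes at the "Bye" point.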

Fork Example #3
  Both parent and child can continue forking

    void fork3()
    {
        printf("L0\n");
        fork();
        printf("L1\n");
        fork();
        printf("L2\n");
        fork();
        printf("Bye\n");
    }

  [Figure: process graph: L0 is printed once, L1 twice, L2 four times, and Bye eight times]

Fork Example #4
  Both parent and child can continue forking

    void fork4()
    {
        printf("L0\n");
        if (fork() != 0) {
            printf("L1\n");
            if (fork() != 0) {
                printf("L2\n");
                fork();
            }
        }
        printf("Bye\n");
    }

  [Figure: process graph: only the original parent keeps forking; L0, L1, and L2 are each printed once and Bye four times]

Fork Example #5
  Both parent and child can continue forking

    void fork5()
    {
        printf("L0\n");
        if (fork() == 0) {
            printf("L1\n");
            if (fork() == 0) {
                printf("L2\n");
                fork();
            }
        }
        printf("Bye\n");
    }

  [Figure: process graph: only the newest child keeps forking; L0, L1, and L2 are each printed once and Bye four times]

Why have fork() at all?
  Why make a copy of the parent process?
  Don't you usually want to start a new program instead?
  Where might “cloning” the parent be useful?
    Web server: make a copy for each incoming connection
    Parallel processing: set up initial state, fork off multiple copies to do work
  UNIX philosophy: System calls should be minimal.
    Don't overload system calls with extra functionality if it is not always needed.
    Better to provide a flexible set of simple primitives and let programmers combine them in useful ways.

Memory concerns
  So fork makes a copy of a process. What about memory usage?

  [Figure: a parent and ten children (Child #1 through Child #10), each drawn with its own full copy of the address space: Stack, Heap, Uninitialized vars (BSS segment), Initialized vars (data segment), Code (text segment), and a region reserved for the OS]

  Problematic?

Memory concerns
  OS aggressively tries to share memory between processes.
    Especially processes that are fork()'d copies of each other
  Copies of a parent process do not actually get a private copy of the address space...
    ... Though that is the illusion that each process gets.
    Instead, they share the same physical memory, until one of them makes a change.
  The virtual memory system is behind these shenanigans.
    We will discuss this in much detail later in the course

exit: Ending a process
  void exit(int status)
    Exits a process
    Normally return with status 0
    atexit() registers functions to be executed upon exit

    void cleanup(void) {
        printf("cleaning up\n");
    }

    void fork6() {
        atexit(cleanup);
        fork();
        exit(0);
    }

Zombies
  Idea
    When process terminates, still consumes system resources
      Various tables maintained by OS
    Called a “zombie”
      Living corpse, half alive and half dead
  Reaping
    Performed by parent on terminated child
    Parent is given exit status information
    Kernel discards process
  What if parent doesn’t reap?
    If any parent terminates without reaping a child, then child will be reaped by init process
    So, only need explicit reaping in long-running processes
      e.g., shells and servers

Zombie Example

    void fork7()
    {
        if (fork() == 0) {
            /* Child */
            printf("Terminating Child, PID = %d\n", getpid());
            exit(0);
        } else {
            printf("Running Parent, PID = %d\n", getpid());
            while (1)
                ; /* Infinite loop */
        }
    }

    linux> ./forks 7 &
    [1] 6639
    Running Parent, PID = 6639
    Terminating Child, PID = 6640
    linux> ps
      PID TTY          TIME CMD
     6585 ttyp9    00:00:00 tcsh
     6639 ttyp9    00:00:03 forks
     6640 ttyp9    00:00:00 forks <defunct>
     6641 ttyp9    00:00:00 ps
    linux> kill 6639
    [1]    Terminated
    linux> ps
      PID TTY          TIME CMD
     6585 ttyp9    00:00:00 tcsh
     6642 ttyp9    00:00:00 ps

  ps shows child process as “defunct”
  Killing parent allows child to be reaped by init

Nonterminating Child Example

    void fork8()
    {
        if (fork() == 0) {
            /* Child */
            printf("Running Child, PID = %d\n", getpid());
            while (1)
                ; /* Infinite loop */
        } else {
            printf("Terminating Parent, PID = %d\n", getpid());
            exit(0);
        }
    }

    linux> ./forks 8
    Terminating Parent, PID = 6675
    Running Child, PID = 6676
    linux> ps
      PID TTY          TIME CMD
     6585 ttyp9    00:00:00 tcsh
     6676 ttyp9    00:00:06 forks
     6677 ttyp9    00:00:00 ps
    linux> kill 6676
    linux> ps
      PID TTY          TIME CMD
     6585 ttyp9    00:00:00 tcsh
     6678 ttyp9    00:00:00 ps

  Child process still active even though parent has terminated
  Must kill explicitly, or else will keep running indefinitely

wait: Synchronizing with Children
  pid_t wait(int *child_status)
    Suspends current process until one of its children terminates
    Return value is the pid of the child process that terminated
    If child_status != NULL, then the object it points to will be set to a status indicating why the child process terminated

wait: Synchronizing with Children

    void fork9() {
        int child_status;

        if (fork() == 0) {
            printf("HC: hello from child\n");
        } else {
            printf("HP: hello from parent\n");
            wait(&child_status);
            printf("CT: child has terminated\n");
        }
        printf("Bye\n");
        exit(0);
    }

  [Figure: process graph: the child prints HC and then Bye; the parent prints HP, waits for the child to terminate, and then prints CT and Bye]

wait() Example
  If multiple children completed, will take in arbitrary order
  Can use macros WIFEXITED and WEXITSTATUS to get information about exit status

    void fork10()
    {
        pid_t pid[N];
        int i;
        int child_status;
        for (i = 0; i < N; i++)
            if ((pid[i] = fork()) == 0)
                exit(100+i); /* Child */
        for (i = 0; i < N; i++) {
            pid_t wpid = wait(&child_status);
            if (WIFEXITED(child_status))
                printf("Child %d terminated with exit status %d\n",
                       wpid, WEXITSTATUS(child_status));
            else
                printf("Child %d terminated abnormally\n", wpid);
        }
    }

waitpid(): Waiting for a Specific Process
  waitpid(pid, &status, options)
    Suspends current process until specific process terminates
    Various options (that we won’t talk about)

    void fork11()
    {
        pid_t pid[N];
        int i;
        int child_status;
        for (i = 0; i < N; i++)
            if ((pid[i] = fork()) == 0)
                exit(100+i); /* Child */
        for (i = 0; i < N; i++) {
            pid_t wpid = waitpid(pid[i], &child_status, 0);
            if (WIFEXITED(child_status))
                printf("Child %d terminated with exit status %d\n",
                       wpid, WEXITSTATUS(child_status));
            else
                printf("Child %d terminated abnormally\n", wpid);
        }
    }

fork() and execve()
  How do we start a new program, instead of just a copy of the old program?
    Use the UNIX execve() system call

    int execve(const char *filename, char *const argv [], char *const envp[]);

    filename: name of executable file to run
    argv: command line arguments
    envp: environment variable settings (e.g., $PATH, $HOME, etc.)

fork() and execve()
  execve() does not fork a new process!
    Rather, it replaces the address space and CPU state of the current process
    Loads the new address space from the executable file and starts it from main()
  So, to start a new program, use fork() followed by execve()
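The fork-then-execve pattern can be sketched as a small helper that also reaps the child. The function name run_and_wait is hypothetical, and the sketch passes the caller's own environment (environ) to the new program, which is one choice among several.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

extern char **environ;        /* the calling process's environment */

/* Hypothetical helper: starts the program at path in a child process,
 * waits for it, and returns its exit status (or -1 on error or if the
 * child was terminated by a signal). */
int run_and_wait(const char *path, char *const argv[]) {
    pid_t pid = fork();
    if (pid == -1)
        return -1;

    if (pid == 0) {
        /* Child: replace this process image with the new program. */
        execve(path, argv, environ);
        _exit(127);           /* reached only if execve failed */
    }

    /* Parent: execve happened in the child, so we are still running here. */
    int status;
    if (waitpid(pid, &status, 0) == -1)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

For example, with argv set to {"sh", "-c", "exit 7", NULL}, run_and_wait("/bin/sh", argv) runs a shell that exits with status 7 and returns 7 to the caller, assuming /bin/sh exists on the system.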

execl and exec Family
  int execl(char *path, char *arg0, char *arg1, …, 0)
  Loads and runs executable at path with args arg0, arg1, …
    path is the complete path of an executable object file
    By convention, arg0 is the name of the executable object file
    “Real” arguments to the program start with arg1, etc.
    List of args is terminated by a (char *)0 argument
    Environment taken from char **environ, which points to an array of “name=value” strings:
      USER=ganger
      LOGNAME=ganger
      HOME=/afs/cs.cmu.edu/user/ganger
  Returns -1 if error, otherwise doesn’t return!
  Family of functions includes execv, execve (base function), execvp, execl, execle, and execlp
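
exec: Loading and Running Programs

    int main(void) {
        if (fork() == 0) {
            /* Child: replace this process with cp; the argument list ends with (char *)0 */
            execl("/usr/bin/cp", "cp", "foo", "bar", (char *)0);
        }
        wait(NULL);   /* Parent: wait for the copy to finish */
        printf("copy completed\n");
        exit(0);
    }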

Summary
  Exceptions
    Events that require nonstandard control flow
    Generated externally (interrupts) or internally (traps and faults)
  Processes
    At any given time, system has multiple active processes
    Only one can execute at a time, though
    Each process appears to have total control of processor + private memory space

Summary (cont.)
  Spawning processes
    Call to fork
    One call, two returns
  Process completion
    Call exit
    One call, no return
  Reaping and waiting for processes
    Call wait or waitpid
  Loading and running programs
    Call execl (or variant)
    One call, (normally) no return