barnaby-henderson
This lecture…
– Thread Creation
– Why do we need to handle cooperating threads?
– Atomic operations
Thread Creation
Thread “fork”: create a new thread, with three arguments:
– Pointer to application routine to execute (fcnPtr)
– Pointer to argument record (fcnArgPtr)
– Size of stack to allocate
Fork for threads is not the same as fork for processes.
[Figure: Fork(fcnPtr, fcnArgPtr, StackSize), with labels “Appl. routine to run” and “args”]
Thread Creation
Thread fork implementation:
– Sanity-check the arguments: check that the stack size is not infinite and that fcnPtr points to something well formed, i.e., valid code.
– Enter kernel mode (and sanity-check again).
– Allocate a stack and a new TCB, and initialize the TCB's register fields.
» In particular, the stack pointer is made to point at the stack, the PC return address is made to point at an OS (assembler) routine ThreadRoot, and two of the registers are initialized to fcnPtr and fcnArgPtr.
– Put the newly allocated TCB on the ready list (Runnable). This will cause it to eventually be dispatched by run_new_thread and start running the routine ThreadRoot.
Thread Creation
ThreadRoot:
– Do start-up housekeeping (e.g., record start time, accounting information).
– Return to user mode.
– Call fcnPtr(fcnArgPtr).
– Do thread finish-up: call ThreadFinish.
[Figure: ThreadRoot, fcnPtr, A, B]
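The ThreadRoot steps can be sketched in Python. This is only an illustration: the standard `threading` module stands in for the kernel's thread machinery, there is no real kernel/user mode switch, and `thread_root`, `thread_finish`, and the `log` list are hypothetical names chosen for the example.

```python
import threading
import time

log = []  # hypothetical accounting record shared by all threads


def thread_finish(name):
    # Finish-up: a real kernel would wake joiners and mark the thread
    # as waitingToBeDestroyed; this sketch only records the event.
    log.append((name, "finished"))


def thread_root(fcn_ptr, fcn_arg_ptr):
    name = threading.current_thread().name
    # Start-up housekeeping: record the start time for accounting.
    log.append((name, "started", time.monotonic()))
    # (A kernel would return to user mode at this point.)
    fcn_ptr(fcn_arg_ptr)   # call the application routine
    thread_finish(name)    # thread finish-up


results = []
t = threading.Thread(target=thread_root, args=(results.append, 42), name="worker")
t.start()
t.join()
```

The application routine never calls the finish-up code itself; wrapping it in a root routine guarantees that finish-up runs when the routine returns.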
Thread Finish
ThreadFinish:
– Put any threads waiting on the termination of this thread on the ready list.
– Can’t deallocate the thread yet, since we’re still running on its stack. Record the thread as “waitingToBeDestroyed”.
– Call run_new_thread to run another thread. ThreadHouseKeeping will examine waitingToBeDestroyed and deallocate the finished thread’s TCB and stack.
    run_new_thread() {
        newThread = PickNewThread();
        switch(curThread, newThread);
        ThreadHouseKeeping();
    }
[Figure: ThreadRoot, ThreadFinish]
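The deferred deallocation in ThreadFinish can be illustrated with a toy scheduler. This is a sketch under loud assumptions: Python generators stand in for thread stacks and registers, each `next()` call is one time slice, and the names `TCB`, `ready_list`, `destroyed`, and `worker` are hypothetical.

```python
from collections import deque


class TCB:
    """Toy thread control block; the generator stands in for stack + registers."""

    def __init__(self, tid, routine):
        self.tid = tid
        self.context = routine   # generator: each next() runs one time slice
        self.waiters = []        # threads waiting for this one to terminate


ready_list = deque()
waiting_to_be_destroyed = []
destroyed = []                   # tids whose TCB and "stack" were released


def thread_house_keeping():
    # Runs on the next thread's stack, so freeing the old one is safe.
    while waiting_to_be_destroyed:
        dead = waiting_to_be_destroyed.pop()
        dead.context = None      # "deallocate" the stack
        destroyed.append(dead.tid)


def thread_finish(tcb):
    ready_list.extend(tcb.waiters)        # wake threads waiting on termination
    tcb.waiters.clear()
    waiting_to_be_destroyed.append(tcb)   # cannot free yet: still on its stack


def run_new_thread():
    new_thread = ready_list.popleft()     # PickNewThread()
    # switch(curThread, newThread): from here on we are "on" new_thread.
    thread_house_keeping()
    try:
        next(new_thread.context)          # run one time slice
        ready_list.append(new_thread)     # it yielded, so it is still runnable
    except StopIteration:
        thread_finish(new_thread)         # the routine returned


def worker(slices):
    for _ in range(slices):
        yield                             # give up the CPU


ready_list.append(TCB(1, worker(1)))
ready_list.append(TCB(2, worker(2)))
while ready_list:
    run_new_thread()
thread_house_keeping()   # in a real system an idle thread would do this
```

Thread 1 is only released once thread 2 has been switched in, which mirrors the rule that a thread cannot free the stack it is still running on.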
Additional Details
Thread fork is not the same thing as UNIX “fork”.
UNIX fork creates a new process, so it has to create a new address space, in addition to a new thread.
For now, don’t worry about how switching between different processes’ address spaces is done.
Thread fork is very much like an asynchronous procedure call: it means “go do this work”, and the calling thread does not wait for the callee to complete.
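A minimal sketch of fork as an asynchronous procedure call, assuming Python's `threading.Thread` as the fork primitive. The `caller_moved_on` event and the `order` list are hypothetical helpers that only make the ordering observable and deterministic.

```python
import threading

caller_moved_on = threading.Event()
order = []


def callee(arg):
    caller_moved_on.wait()          # hold back until the caller has continued
    order.append(("callee", arg))


t = threading.Thread(target=callee, args=("work",))
t.start()                           # "fork": returns immediately
order.append(("caller", "continued"))
caller_moved_on.set()
t.join()                            # only so the script exits cleanly
```

The caller records its progress before the callee has done anything, which a synchronous call could never allow.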
Parent-Child relationship
Every thread (and/or process) has a parentage:
– A “parent” is a thread that creates another thread
– A child of a parent was created by that parent
[Figure: typical process tree for a Solaris system]
Thread Join
What if a thread wants to exit early?
– ThreadFinish() and exit() are essentially the same procedure entered at user level
What if the calling thread needs to wait?
– Thread Join: wait for a forked thread to finish.
– The calling thread will be taken off the run queue and placed on a waiting queue
ThreadJoin() system call
Where is a logical place to store this wait queue?
– On a queue inside the TCB
Similar to the wait() system call in UNIX
– Lets parents wait for child processes
[Figure: the TCB for thread tid holds the Head and Tail of a termination wait queue linking TCB9, TCB6, and TCB16; each TCB has Link, Registers, and Other State fields]
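Keeping the termination wait queue inside the TCB can be sketched with a toy scheduler. The assumptions are significant: generators stand in for threads, a thread requests a join by yielding a `("join", tcb)` tuple, and every name here (`TCB`, `termination_wait`, `schedule`, `trace`) is hypothetical.

```python
from collections import deque


class TCB:
    def __init__(self, tid, routine):
        self.tid = tid
        self.context = routine            # generator standing in for the thread
        self.finished = False
        self.termination_wait = deque()   # TCBs blocked in join on this thread


ready = deque()
trace = []


def child():
    trace.append("child runs")
    yield                                 # one time slice, then finish
    trace.append("child finishes")


def parent(child_tcb):
    trace.append("parent joins")
    yield ("join", child_tcb)             # ThreadJoin(child)
    trace.append("parent resumes")


def schedule():
    while ready:
        tcb = ready.popleft()
        try:
            request = next(tcb.context)
        except StopIteration:
            tcb.finished = True
            ready.extend(tcb.termination_wait)   # wake every joiner
            tcb.termination_wait.clear()
            continue
        if request and request[0] == "join" and not request[1].finished:
            request[1].termination_wait.append(tcb)   # taken off the run queue
        else:
            ready.append(tcb)


c = TCB(6, child())
p = TCB(9, parent(c))
ready.extend([p, c])
schedule()
```

Because the queue lives in the child's TCB, the finishing thread finds its joiners without searching any global structure.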
Use of Join for Traditional Procedure Call
Thus, a traditional procedure call is logically equivalent to doing a fork and then immediately doing a join.
This is a normal procedure call (synchronous):
    A() { B(); }
    B() { }
The procedure A can also be implemented as:
    A’() {
        Thread t = new Thread;
        t->Fork(B);
        t->Join();
    }
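The same equivalence can be sketched in Python, assuming `threading.Thread.start()` and `join()` as stand-ins for Fork and Join. The function names `a_sync` and `a_fork_join` and the trace lists are hypothetical.

```python
import threading


def b(out):
    out.append("B ran")


def a_sync(out):
    b(out)                       # ordinary synchronous call
    out.append("A continues")


def a_fork_join(out):
    t = threading.Thread(target=b, args=(out,))
    t.start()                    # t->Fork(B)
    t.join()                     # t->Join(): wait for B to finish
    out.append("A continues")


sync_trace, forked_trace = [], []
a_sync(sync_trace)
a_fork_join(forked_trace)
```

Both versions produce the same trace; the forked version simply pays for a thread creation and a context switch to get there.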
Synchronous/Asynchronous Procedure calls
Why not replace synchronous with asynchronous procedure calls everywhere?
– Overhead: allocate TCB, allocate stack, startup overhead, context switch
– Crashing of a thread kills the process
Multithreading
Multiple activities within the same process
– Web server, database server
No protection!
Need coordination between threads
– Web crawler: multiple threads follow different links; make sure that different threads do not retrieve the same page, as that wastes bandwidth and resources
– File system: different threads must not grant the same lock to two different users
Multiprocessing vs. Multiprogramming
Multiprocessing = multiple CPUs
Multiprogramming = multiple jobs or processes
Definition of “run concurrently”: the scheduler is free to run threads in any order (e.g., FIFO, random, etc.). For example:
Multiprocessing vs. Multiprogramming
The dispatcher can choose to run each thread to completion, or time-slice in big chunks, or time-slice so that each thread executes only one instruction at a time (simulating a multiprocessor, where each CPU operates in lockstep).
If the dispatcher can do any of the above, programs must work under all cases, for all interleavings.
So how can you know if your concurrent program works? Whether all interleavings will work?
Definitions
Independent threads:
– No state shared with other threads
– Deterministic: input state determines result
– Reproducible: input state (I/O, memory, …) can be recreated
– Scheduling order doesn’t matter
Cooperating threads:
– Shared state
– Non-deterministic
– Non-reproducible
Non-reproducibility and non-determinism mean that bugs can be intermittent.
This makes debugging really hard!
Interactions Complicate Debugging
Is any program truly independent?
– Every process shares the file system, OS resources, network, etc.
– Extreme example: buggy device driver causes thread A to crash “independent thread” B
You probably don’t realize how much you depend on reproducibility:
– Example: Evil C compiler
» Modifies files behind your back by inserting errors into C program unless you insert debugging code
– Example: Debugging statements can overrun stack
Non-deterministic errors are really difficult to find
– Example: Memory layout of kernel+user programs
» depends on scheduling, which depends on timer/other things
» Original UNIX had a bunch of non-deterministic errors
– Example: Something which does interesting I/O
» User typing of letters used to help generate secure keys
Why allow cooperating threads?
People cooperate, and computers model people’s behavior, so computers at some level have to cooperate!
1. Share resources/information
– One computer, many users
– One bank balance, many ATMs
– Embedded systems (e.g., robot control)
2. Speedup
– Overlap I/O and computation
– UNIX file system does read-ahead
– Multiprocessors: chop up program into smaller pieces
3. Modularity
– Chop large problem up into simpler pieces. For example, to compile, gcc runs cpp | cc1 | cc2 | as | ld
– This makes the system easier to extend; you can replace the assembler without changing the loader.
Some simple concurrent programs
Most of the time, threads are working on separate data, so scheduling order doesn’t matter:
    Thread A        Thread B
    x = 1           y = 2
What about? Initially, y = 12:
    Thread A        Thread B
    x = 1           y = 2
    x = y + 1       y = y * 2
What are the possible values for x after the above? What are the possible values of x below?
    Thread A        Thread B
    x = 1           x = 2
Can’t say anything useful about a concurrent program without knowing what the underlying indivisible operations are!
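One way to explore the two-statement example with y initially 12 is to enumerate every interleaving by brute force. The sketch below assumes that each whole statement is an indivisible operation, which is exactly the assumption the slide warns about; `run`, `all_schedules`, and `outcomes` are hypothetical names.

```python
from itertools import combinations


def run(schedule):
    """Run one interleaving; schedule is a string such as 'ABAB'."""
    state = {"x": None, "y": 12}
    thread_a = iter([
        lambda s: s.update(x=1),
        lambda s: s.update(x=s["y"] + 1),
    ])
    thread_b = iter([
        lambda s: s.update(y=2),
        lambda s: s.update(y=s["y"] * 2),
    ])
    for who in schedule:
        step = next(thread_a if who == "A" else thread_b)
        step(state)              # execute one statement atomically
    return state["x"]


def all_schedules(n_a=2, n_b=2):
    # Choose which time slots belong to thread A; thread B gets the rest.
    total = n_a + n_b
    for a_slots in combinations(range(total), n_a):
        yield "".join("A" if i in a_slots else "B" for i in range(total))


outcomes = {}
for schedule in all_schedules():
    outcomes.setdefault(run(schedule), []).append(schedule)

for x, schedules in sorted(outcomes.items()):
    print(f"x = {x}: {schedules}")
```

Under that assumption the six interleavings leave x equal to 3, 5, or 13, depending on how many of Thread B's statements ran before Thread A read y. If the statements themselves were divisible, more outcomes could appear.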
Atomic operations
What we want is some way of allowing a thread to perform a task without having other threads interfere with the task.
Atomic operation: an operation that always runs to completion, or not at all.
– It is indivisible: it can’t be stopped in the middle, and its state can’t be modified by someone else during the operation.
On most machines, memory reference and assignment (i.e., load and store) of words are atomic.
Many instructions are not atomic.
– For example, on most 32-bit architectures, a double-precision floating-point store is not atomic; it involves two separate memory operations.
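The two-store problem can be simulated without real threads. This sketch assumes a little-endian layout and models memory as a list of two 32-bit words; the interleaving is written out by hand rather than produced by a scheduler, and the names `split_double`, `join_words`, and `memory` are hypothetical.

```python
import struct


def split_double(value):
    """Return the (low, high) 32-bit words of an IEEE 754 double."""
    return struct.unpack("<2I", struct.pack("<d", value))


def join_words(low, high):
    """Reassemble a double from its two 32-bit words."""
    return struct.unpack("<d", struct.pack("<2I", low, high))[0]


old, new = 1.0, 0.1
memory = list(split_double(old))     # two memory words currently holding 1.0

new_low, new_high = split_double(new)
memory[0] = new_low                  # writer thread: first 32-bit store
torn = join_words(*memory)           # a reader scheduled here sees a mix
memory[1] = new_high                 # writer thread: second 32-bit store
final = join_words(*memory)
```

The value read between the two stores is neither the old nor the new number, which is why a non-atomic store needs some form of synchronization when the variable is shared.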
High-level Example: Web Server
Server must handle many requests
Non-cooperating version:
    serverLoop() {
        con = AcceptCon();
        ProcessFork(ServiceWebPage(), con);
    }
What are some disadvantages of this technique?
Threaded Web Server
Now, use a single process
Multithreaded (cooperating) version:
    serverLoop() {
        connection = AcceptCon();
        ThreadFork(ServiceWebPage(), connection);
    }
Looks almost the same, but has many advantages:
– Can share file caches kept in memory, results of CGI scripts, other things
– Threads are much cheaper to create than processes, so this has a lower per-request overhead
Question: would a user-level (say many-to-one) thread package make sense here?
– When one request blocks on disk, all block…
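A rough sketch of the thread-per-request server with a shared in-memory cache. It assumes a list of URL strings in place of real network connections, and `read_from_disk`, `page_cache`, and `disk_reads` are hypothetical stand-ins. Holding the lock across the simulated disk read is a simplification that a real server would want to avoid.

```python
import threading

page_cache = {}                      # shared by every thread in the process
cache_lock = threading.Lock()
disk_reads = []                      # which URLs actually went to "disk"


def read_from_disk(url):             # hypothetical stand-in for file I/O
    disk_reads.append(url)
    return f"<html>{url}</html>"


def service_web_page(connection, responses):
    url = connection
    with cache_lock:                 # coordinate access to the shared state
        if url not in page_cache:
            page_cache[url] = read_from_disk(url)
        responses.append((url, page_cache[url]))


def server_loop(incoming, responses):
    threads = []
    for connection in incoming:      # stands in for AcceptCon()
        t = threading.Thread(target=service_web_page,
                             args=(connection, responses))
        t.start()                    # ThreadFork(ServiceWebPage, connection)
        threads.append(t)
    for t in threads:                # only so the demo can inspect results
        t.join()


responses = []
server_loop(["/a", "/b", "/a", "/a"], responses)
```

Four requests are served but only two distinct pages are read, which is the cache sharing that separate processes would not get for free.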
Thread Pools
Problem with previous version: unbounded threads
– When the web site becomes too popular, throughput sinks
Instead, allocate a bounded “pool” of threads, representing the maximum level of multiprogramming
    master() {
        allocThreads(slave, queue);
        while (TRUE) {
            con = AcceptCon();
            Enqueue(queue, con);
            wakeUp(queue);
        }
    }
    slave(queue) {
        while (TRUE) {
            con = Dequeue(queue);
            if (con == null)
                sleepOn(queue);
            else
                ServiceWebPage(con);
        }
    }
[Figure: a master thread placing connections on a queue served by the thread pool]
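The bounded pool can be sketched with Python's `queue.Queue`, whose blocking `get()` plays the role of Dequeue plus sleepOn and whose `put()` plays the role of Enqueue plus wakeUp. `POOL_SIZE`, the `STOP` sentinel, and the integer "connections" are assumptions made for the demo; the slide's server never shuts down.

```python
import queue
import threading

POOL_SIZE = 3                        # assumed bound on multiprogramming
STOP = object()                      # sentinel telling a worker to exit


def service_web_page(con):           # hypothetical request handler
    return f"served {con}"


def worker(requests, served, lock):
    while True:
        con = requests.get()         # blocks while the queue is empty
        if con is STOP:
            break
        page = service_web_page(con)
        with lock:
            served.append((threading.current_thread().name, page))


def master(connections):
    requests = queue.Queue()
    served, lock = [], threading.Lock()
    pool = [threading.Thread(target=worker, args=(requests, served, lock),
                             name=f"worker-{i}") for i in range(POOL_SIZE)]
    for t in pool:
        t.start()
    for con in connections:          # stands in for the AcceptCon() loop
        requests.put(con)
    for _ in pool:
        requests.put(STOP)           # one sentinel per worker
    for t in pool:
        t.join()
    return served


served = master(range(10))
```

Because `get()` checks for work and goes to sleep as one step, there is no window in which a wake-up can slip between the emptiness check and the sleep, which the separate Dequeue and sleepOn calls would have to guard against.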