Upload
josue-hosler
View
219
Download
0
Tags:
Embed Size (px)
Citation preview
Multiple Processor Systems I
CS 423Klara Nahrstedt/Sam King
04/11/23 1
04/11/23 2
Content
• Administrative• Introduction• Multiprocessors
– Hardware– OS Types– Synchronization– Scheduling
Administrative
• MP4 is out• If you are changing groups for MP4, let staff
(profs, TAs) and your partners know – We have to change multiple state machines to get
to a consistent state about your group status– If you take the extra days, you don’t have to
ask/inform TAs/Profs about taking them since we see the status in the compass
04/11/23 3
04/11/23 4
Introduction• Goal of computer architects, engineers and scientists to get more and more
computing power• Problem with the goal:
– Previous approaches: make clock run faster to speed up– We are beginning to hit some fundamental limits on clock speed– Reason: Einstein’s special theory of relativity
• No electrical signal can propagate faster than the speed of light: 30 cm/ns in a vacuum and 20 cm/ns over copper or optical fiber
• This means:– with 10 GHz clock, signals cannot travel more than 2 cm total, – with 100 GHz clock, total signal path length at most 2 mm, and– at 1000 GHz (1 THz) computer would have to be smaller than 100 µm
– Reason: heat dissipation • making computers this small may be possible, but the faster the computer runs, the
more heat it generates, and the smaller the computer is, the harder it is to get rid of the heat.
Computer Speedup
Moore’s Law: “The density of transistors on a chip doubles every 18 months, for the same cost” (1965)
Image: Tom’s Hardware
04/11/23 6
Multiprocessor Systems
• Continuous need for faster computers– shared memory model– message passing multiprocessor– wide area distributed system
Loosely coupled Tightly coupled
04/11/23 7
Multiprocessors• Shared-memory multiprocessor:
A computer system in which two or more CPUs share full access to a common RAM
• Inter-processor communication: one CPU writes some data into memory and another one reads the data out
• Multiprocessor OS performs:– Regular OS functions: handling system calls, memory management, file
system access, management of I/O devices– Unique to multiprocessor OS: process synchronization, resource
management, scheduling
• Uniform vs. non-uniform memory access (UMA, NUMA) multiprocessors
04/11/23 8
Multiprocessor Hardware (1)UMA Bus-based Multiprocessors
Problem: If the bus is busy when a CPU wants to read or write memory, (a)the CPU waits until the bus becomes idle. As CPU speeds scale up, this imposes an unacceptable performance penalty.
What is the solution?
04/11/23 9
Multiprocessor Hardware (2)UMA Multiprocessors Using Crossbar Switch
Problem: with best caching we can only support 16 or 32 CPUsSolution: crossbar switch with non-blocking network
04/11/23 10
• (a) 256-node directory based multiprocessor• (b) Fields of 32-bit memory address• (c) Directory at node 36
Multiprocessor Hardware (3)
04/11/23 11
Multiprocessor OS Types (1)Each CPU has its own OS
busCharacteristics: • share single copy of OS code• partition multiprocessor memory among four CPUs• sharing of set of disks and other I/O devices
Design aspects: • when a process makes a system call, system call is handled by its own CPU• no sharing of processes – consequence: no load balancing on CPUs• no sharing of pages – consequence: no load balancing on memory• independent buffer caching of disk blocks• problem: no load balancing on CPU, memory and inconsistencies
04/11/23 12
Multiprocessor OS Types (2)Master-Slave Multiprocessors
Characteristics: • one copy of OS and its tables are present on one CPU and not the others• all system calls are redirected to the one CPU for processing• CPU with OS is called master, all other CPUs running applications are called slavesDesign aspects: • there is only data structure that keeps track of ready processes – good for loadbalancing• pages can be allocated among all processes dynamically • there is only one buffer cache so inconsistencies never occurWhat are the problems with this OS type?
bus
04/11/23 13
Multiprocessor OS Types (3)Symmetric Multiprocessors
Characteristics: • eliminates the master/slave asymmetry• there is one copy of OS, but any CPU can run it• when a system call is made, the CPU traps to the kernel, which processes the system callDesign aspects: • dynamic balance of processes and memory, since there is only one set of OS tables• eliminates master as a bottleneckWhat are the problems with this OS Type?
bus
Parallel Programming Parallelization Idea
• Parallelization is “easy” if processing can be cleanly split into n units:
work
w1 w2 w3
Partition problem
Parallelization Idea (2)
w1 w2 w3
thread thread thread
Spawn worker threads:
In a parallel computation, we would like to have as many threads as we have processors. e.g., a four-processor computer would be able to run four threads at the same time.
Parallelization Idea (3)
thread thread thread
Workers process data:
w1 w2 w3
Parallelization Idea (4)
results
Report results
thread thread threadw1 w2 w3
Parallelization Pitfalls
But this model is too simple!
• How do we assign work units to worker threads?• What if we have more work units than threads?• How do we aggregate the results at the end?• How do we know all the workers have finished?• What if the work cannot be divided into completely
separate tasks?
Parallelization Pitfalls (2)
• Each of these problems represents a point at which multiple threads must communicate with one another, or access a shared resource.
• Golden rule: Any memory that can be used by multiple threads must have an associated synchronization system!
The Moral: Be Careful!• Synchronization is hard
– Need to consider all possible shared state– Must keep locks organized and use them
consistently and correctly
• Knowing there are bugs may be tricky; fixing them can be even worse!
• Keeping shared state to a minimum reduces total system complexity
04/11/23 21
MultiprocessorSynchronization (1)
• To ensure atomic execution of TSL,• we have to be able to lock the bus
04/11/23 22
MultiprocessorSynchronization (2)
• Multiple locks used to avoid cache thrashing
04/11/23 23
MultiprocessorSynchronization (3)
• Spinning versus Switching• In some cases CPU must wait
– waits to acquire ready list
• In other cases a choice exists– spinning wastes CPU cycles– switching uses up CPU cycles also– possible to make separate decision each time
locked mutex encountered
04/11/23 24
• Scheduling of independent processes– use of single data structure for scheduling
• Priorities and quantum expiration need to be carefully considered – What are the differences between uniprocessor and multiprocessor when quantum expires?
• Smart scheduling, affinity scheduling/two-level scheduling
Multiprocessor Scheduling (1)Time Sharing
04/11/23 25
Multiprocessor Scheduling (2)Space Sharing
• Processes are related to each other, or • Process consisting of multiple threads • Simple algorithm:
– all threads are created at the same time, – each thread is given a dedicated CPU and start at same time across multiple CPUs– if there are not enough CPUs to start all threads, wait until CPUs become available
• Consider a central server that keeps track of available CPUs• Advantage: no multi-programming• Disadvantage: Waste of CPU if it blocks
04/11/23 26
Multiprocessor Scheduling (3)Gang Scheduling
• Problem: communication between two threads– both belong to process A– both running out of phase
04/11/23 27
Multiprocessor Scheduling (4)Gang Scheduling
• Solution: Gang Scheduling1.Groups of related threads scheduled as a unit, or gang
2.All members of gang run simultaneously on different timeshared CPUs
3.All gang members start and end their time slices together
04/11/23 28
Multiprocessor Scheduling (5)Gang Scheduling
04/11/23 29
Conclusion • Multiprocessors
– Popular and attractive– Offer simple communication model—all CPUs share a
common memory– Processes write messages to memory which can be
read by others– Synchronization using mutexes, semaphores, monitors
and other well-established schemes
• Problem:– Large multiprocessors are difficult to build and are
expensive