10/13: Lecture Topics
• Data Hazards
• Control Hazards
Grading Disputes
• Bring grading disputes to me up to one week after an assignment is handed back
• We try to be very fair when grading
• Please don’t beg for one point here or one point there
– each hw counts 5% of your grade
– hw’s are out of ~70 points
– each hw point is only 0.07% of your final grade (or 0.0028 grade points)
– I will add 0.028 to everyone’s final grade if you don’t dispute 1 or 2 point grading issues
• Exams are worth more and I will be more tolerant of begging
Pipelined Xput and Latency
• What’s the throughput of this implementation?
• What’s the latency of this implementation?
          1   2   3   4   5   6   7   8   9
inst 1    IF  ID  EX  MEM WB
inst 2        IF  ID  EX  MEM WB
inst 3            IF  ID  EX  MEM WB
inst 4                IF  ID  EX  MEM WB
inst 5                    IF  ID  EX  MEM WB
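The two questions above can be worked out with a little arithmetic. The sketch below assumes an ideal pipeline: every stage takes exactly one cycle and nothing stalls. The function name pipeline_timing is made up for this illustration.

```python
STAGES = ["IF", "ID", "EX", "MEM", "WB"]

def pipeline_timing(num_instructions, num_stages=len(STAGES)):
    """Return (latency, total_cycles, throughput) for an ideal pipeline.

    Assumes one cycle per stage and no stalls (an idealization).
    """
    latency = num_stages                                # cycles one instruction spends in the pipe
    total_cycles = num_stages + (num_instructions - 1)  # fill the pipe once, then one finishes per cycle
    throughput = num_instructions / total_cycles        # instructions completed per cycle
    return latency, total_cycles, throughput

latency, total, xput = pipeline_timing(5)
print(latency, total, round(xput, 3))  # 5 9 0.556
```

Under these assumptions the latency stays at 5 cycles per instruction, while the throughput approaches 1 instruction per cycle as the number of instructions grows.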
Data Hazards
• What happens in the following code?
add $s0, $s1, $s2     IF  ID  EX  MEM WB
add $s4, $s3, $s0         IF  ID  EX  MEM WB
$s0 is read in the second add’s ID stage (cycle 3)
$s0 is written in the first add’s WB stage (cycle 5)
• This is called a data dependency
• When it causes a pipeline stall it is called a data hazard
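A minimal sketch of how such a dependency could be spotted mechanically. It only handles three-register instructions written like the ones above, and the helper names parse and has_raw_dependency are invented for this example.

```python
def parse(instr):
    """Split 'add $s0, $s1, $s2' into (op, dest, [sources]).

    Only three-register ALU instructions are handled in this sketch.
    """
    op, rest = instr.split(None, 1)
    regs = [r.strip() for r in rest.split(",")]
    return op, regs[0], regs[1:]

def has_raw_dependency(first, second):
    """True if `second` reads the register that `first` writes."""
    _, dest, _ = parse(first)
    _, _, sources = parse(second)
    return dest in sources

print(has_raw_dependency("add $s0, $s1, $s2", "add $s4, $s3, $s0"))  # True
```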
Solution: Forwarding
• The value of $s0 is known after cycle 3 (after the first instruction’s EX stage)
• The value of $s0 isn’t needed until cycle 4 (before the second instruction’s EX stage)
• If we forward the result there isn’t a stall
add s0,s1,s2     IF  ID  EX  MEM WB
add s4,s3,s0         IF  ID  EX  MEM WB
Another data hazard
• What if the first instruction is lw?
lw  s0,0(s2)     IF  ID  EX  MEM WB
add s4,s3,s0         IF  ID  EX  MEM WB
• s0 isn’t known until after the MEM stage
• We can’t forward back into the past
• Either stall or reorder instructions
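The difference between the add case and the lw case can be put into numbers. This sketch assumes the five-stage pipeline from these slides with forwarding, where an ALU result is ready after the producer's EX stage (cycle 3) and a loaded value after its MEM stage (cycle 4). The table READY_AFTER_CYCLE and the function name are illustrative.

```python
# Cycle after which the producer's result can be forwarded,
# counting the producer's IF stage as cycle 1 (assumed 5-stage pipeline).
READY_AFTER_CYCLE = {"add": 3, "sub": 3, "lw": 4}

def stalls_with_forwarding(producer_op, distance=1):
    """Stall cycles for a consumer issued `distance` instructions after the producer."""
    ready = READY_AFTER_CYCLE[producer_op]
    consumer_ex_cycle = 3 + distance          # consumer's EX stage if nothing stalls
    # The value must be ready by the end of the cycle before the consumer's EX.
    return max(0, ready - (consumer_ex_cycle - 1))

print(stalls_with_forwarding("add"))              # 0
print(stalls_with_forwarding("lw"))               # 1
print(stalls_with_forwarding("lw", distance=2))   # 0
```

The last line corresponds to placing one unrelated instruction between the lw and the instruction that uses its result.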
Solutions to the lw hazard
• We can stall for one cycle, but we hate to stall
lw  s0,0(s2)     IF  ID  EX  MEM WB
(stall)
add s4,s3,s0             IF  ID  EX  MEM WB
• Try to execute an unrelated instruction between the two instructions
lw  s0,0(s2)     IF  ID  EX  MEM WB
sub t4,t2,t3         IF  ID  EX  MEM WB
add s4,s3,s0             IF  ID  EX  MEM WB
Reordering Instructions
• Reordering instructions is a common technique for avoiding pipeline stalls
• Sometimes the compiler does the reordering statically
• Almost all modern processors do this reordering dynamically
– they can see several instructions and they execute any one that has no dependency
– this is known as out-of-order execution and is very complicated to implement
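A rough sketch of the static (compiler-style) version of this idea, applied to the lw example from the previous slide. Instructions are modeled as (op, dest, [sources]) tuples; the sketch only looks at register dependencies in straight-line code (no branches, no memory aliasing), and it only tries the single instruction that follows the consumer. The names independent and fill_load_delay are invented for this example.

```python
def independent(a, b):
    """True if a and b can be swapped: no RAW, WAR, or WAW between them."""
    _, a_dest, a_srcs = a
    _, b_dest, b_srcs = b
    return a_dest not in b_srcs and b_dest not in a_srcs and a_dest != b_dest

def fill_load_delay(program):
    """If a lw is immediately followed by a user of its result, try to
    move the next instruction in between the two."""
    result = list(program)
    i = 0
    while i + 2 < len(result):
        load, consumer, candidate = result[i], result[i + 1], result[i + 2]
        if (load[0] == "lw"
                and load[1] in consumer[2]           # consumer uses the loaded value
                and load[1] not in candidate[2]      # candidate must not need it too
                and independent(consumer, candidate)):
            result[i + 1], result[i + 2] = candidate, consumer
        i += 1
    return result

program = [
    ("lw", "s0", ["s2"]),
    ("add", "s4", ["s3", "s0"]),
    ("sub", "t4", ["t2", "t3"]),
]
for instr in fill_load_delay(program):
    print(instr)
```

With this input the sub moves up between the lw and the add, which matches the reordered sequence shown on the previous slide.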
Structural Hazards
• Instructions in different stages want to use the same resource
– Suppose a lw instruction is in stage four (memory access)
– Meanwhile, an add instruction is in stage one (instruction fetch)
– Both of these actions require access to memory; they could collide
• Add more hardware to eliminate the problem
• Or stall (cheaper & easier), not usually done
Control Hazards
• Branch instructions cause control hazards (aka branch hazards) because we don’t know which instruction to execute next
      bne $s0, $s1, next
      add $s4, $s3, $s0
next: sub $s4, $s3, $s0
      ...
bne $s0, $s1, next     IF  ID  EX  MEM WB
(add or sub?)              IF  ID  EX  MEM WB
do we fetch add or sub after the bne? we don’t know until the branch has been resolved, later in the pipeline
Solution: Stall
• We can stall to see which instruction to execute next
bne $s0, $s1, next     IF  ID  EX  MEM WB
sub $s4, $s3, $s0      stall, then IF  ID  EX  MEM WB
• But we hate to stall
Solution: Move Branch to ID
• Move the branch hardware to ID stage
– Hardware to compare two registers is simpler than hardware to add them (i.e. EX stage hardware)
bne $s0, $s1, next     IF  ID  EX  MEM WB
sub $s4, $s3, $s0      stall, then IF  ID  EX  MEM WB
• We still have to stall for one cycle
• But we can’t move the branch up any more
Branch Delay Slot
• A branch now causes a stall of one cycle
• Try to execute an instruction instead of stalling
• The compiler must find an instruction to fill the branch delay slot
– 50% of the delay-slot instructions are useful
– 50% are nop’s (no-ops), which don’t do anything
• Might have been a good idea originally but not any more
Branch Delay Slot Example
• “addi $t0,$t0,1” will always execute
      move $t0,$zero
      bne  $s0,$zero,Done
      addi $t0,$t0,1
      addi $t0,$t0,3
Done: move $t1,$t0

branch not taken:
      move $t0,$zero
      bne  $s0,$zero,Done
      addi $t0,$t0,1
      addi $t0,$t0,3
      move $t1,$t0

branch taken:
      move $t0,$zero
      bne  $s0,$zero,Done
      addi $t0,$t0,1
      move $t1,$t0
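To check the two traces above, here is a toy interpreter that models a one-instruction branch delay slot: the instruction after a branch always executes, and only then does a taken branch redirect the program counter. The tuple encoding, the run function, and the labels dictionary are made up for this sketch, which handles only move, addi, and bne and does not model a branch sitting in a delay slot.

```python
PROGRAM = [
    ("move", "$t0", "$zero", None),
    ("bne",  "$s0", "$zero", "Done"),
    ("addi", "$t0", "$t0", 1),      # branch delay slot: always executes
    ("addi", "$t0", "$t0", 3),
    ("move", "$t1", "$t0", None),   # Done:
]
LABELS = {"Done": 4}

def run(program, labels, s0):
    """Execute the program with delay-slot semantics and return the registers."""
    regs = {"$zero": 0, "$s0": s0, "$t0": 0, "$t1": 0}
    pc = 0
    pending_target = None           # taken-branch target, applied after the delay slot
    while pc < len(program):
        op, a, b, c = program[pc]
        next_pc = pc + 1
        if pending_target is not None:   # we are in a delay slot of a taken branch
            next_pc = pending_target
            pending_target = None
        if op == "move":
            regs[a] = regs[b]
        elif op == "addi":
            regs[a] = regs[b] + c
        elif op == "bne" and regs[a] != regs[b]:
            pending_target = labels[c]
        pc = next_pc
    return regs

print(run(PROGRAM, LABELS, s0=0)["$t1"])  # branch not taken: 4
print(run(PROGRAM, LABELS, s0=1)["$t1"])  # branch taken: 1
```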
Solution: Speculate
• Execute the following instructions assuming the branch is taken (or not taken)
• If we guessed right, then let the instructions proceed
• If we guessed wrong, then squash the partially completed instructions
– This is called flushing the pipeline
– These instructions were wasted, but we would have stalled otherwise
• Never let a speculating instruction write to memory or a register until we’re sure it should execute
• This is known as speculative execution
Speculate Never Taken
• Assume the branch isn’t taken and fetch the next instruction
      bne  $s0,$zero,Done
      addi $t0,$t0,1
      addi $t0,$t0,3
Done: move $t1,$t0

Branch not taken
bne      IF  ID  EX  MEM WB
addi         IF  ID  EX  MEM WB
addi             IF  ID  EX  MEM WB

Branch taken
bne      IF  ID  EX  MEM WB
addi         IF  SQUASH
move             IF  ID  EX  MEM WB
• Predicting taken is actually better, but still not good enough
Static Branch Prediction
• Most backward branches are taken (80%)
– they are part of loops
• Forward branches are taken about half the time
– if statements
• A common static branch prediction scheme is to predict
– backward branches are taken
– forward branches are not taken
• Some architectures allow the compiler to specify in the branch instruction whether to predict taken or not taken
• This does okay (70-80% accurate), but still not good enough
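The backward-taken, forward-not-taken rule is simple enough to write down directly. The sketch below is only an illustration: the addresses are hypothetical, and the 10-iteration loop is a made-up workload, not a measurement.

```python
def predict_taken(branch_address, target_address):
    """Static rule: backward branches are predicted taken, forward ones not taken."""
    return target_address < branch_address

def accuracy(predictions, outcomes):
    """Fraction of predictions that matched what the branch actually did."""
    correct = sum(p == o for p, o in zip(predictions, outcomes))
    return correct / len(outcomes)

# Hypothetical loop-closing branch at 0x40 that jumps back to 0x20.
# Over 10 iterations it is taken 9 times and falls through once.
outcomes = [True] * 9 + [False]
predictions = [predict_taken(0x40, 0x20)] * len(outcomes)
print(accuracy(predictions, outcomes))  # 0.9
```

The one miss is the final loop exit, which a static rule can never get right.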
Dynamic Branch Prediction
• In most programs you execute the same instructions over and over
• You encounter the same branch instructions over and over
• The same branch instruction is usually
– taken if it was taken last time
– not taken if it was not taken last time
• If we keep a history of each branch instruction, then we can predict much better
Dynamic Branch Prediction
• A history table is kept on the CPU
• There is not room to store each instruction
– the last few bits of the instruction address index this table
– some instructions collide, as in a hash table
– usually store 2 bits per entry
• Dynamic branch prediction is 92-98% accurate

Instruction    Taken last time?    Predict
...            ...                 ...
0x10001234     no                  not taken
0x102F0268     yes                 taken
0x13D0122C     no                  not taken
...            ...                 ...
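The slide says each entry usually holds 2 bits but does not say how they are used. One common scheme, assumed here, is a 2-bit saturating counter: values 0 and 1 predict not taken, 2 and 3 predict taken, and each outcome moves the counter one step. The class name, the table size, the starting value, and the test loop are all choices made for this sketch.

```python
class TwoBitPredictor:
    """Branch history table of 2-bit saturating counters (assumed scheme)."""

    def __init__(self, index_bits=4):
        self.mask = (1 << index_bits) - 1
        # Start every entry at 1 ("weakly not taken"); an arbitrary choice.
        self.counters = [1] * (1 << index_bits)

    def _index(self, address):
        # Drop the two byte-offset bits of a 4-byte instruction, keep the low bits.
        # Different branches can map to the same entry, like a hash collision.
        return (address >> 2) & self.mask

    def predict(self, address):
        return self.counters[self._index(address)] >= 2

    def update(self, address, taken):
        i = self._index(address)
        if taken:
            self.counters[i] = min(3, self.counters[i] + 1)
        else:
            self.counters[i] = max(0, self.counters[i] - 1)

# A loop branch taken 9 times and then not taken, with the loop run 3 times.
predictor = TwoBitPredictor()
branch = 0x102F0268
correct = 0
outcomes = ([True] * 9 + [False]) * 3
for taken in outcomes:
    correct += predictor.predict(branch) == taken
    predictor.update(branch, taken)
print(correct, "of", len(outcomes))  # 26 of 30
```

After the first pass warms up the counter, the only miss per pass is the loop exit, because a single not-taken outcome is not enough to flip a 2-bit counter away from predicting taken.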
Importance of Branch Prediction
• Branches occur every five instructions
• Today’s processors execute up to 4 instructions per cycle
• A branch occurs every 2 cycles
• Pipelines are longer than MIPS (8, 9, 11, 13 cycles)
– branch mispredict penalty is 3-5 cycles instead of 1 cycle
• Must predict accurately or you execute < 0.5 instructions per cycle instead of 4 instructions
Exceptions and Interrupts
• So far, we’ve assumed that the assembled code can always be executed
• Lots of ways for unexpected things to happen:
– Undefined instruction
– Arithmetic overflow
– System call
– I/O device request
Exceptions
• An exception is an internal event
– The unexpected condition was caused by something the program did
– Undefined instructions and arithmetic overflows are examples
– If you ran the program again, the exception would (probably) happen again at the same point in the program’s execution
Interrupts
• An interrupt is an external event
– The unexpected condition was not caused by the program
– An I/O device request is an example
– If you ran the program again, the interrupt would probably not happen at the same point
What should happen?
• These events result in an unnatural change in the flow of control
• Normally, the next instruction executed is ________
• When one of these events takes place, something else happens
– The system must respond to the event
– The response depends on the type of event
Exception Handling
• Loosely, the following steps are taken:
1. Save the address of the offending instruction in a register
2. Make the reason for the exception known
- Set the value of the status register, or
- Use vectored interrupts to do step 3
3. Transfer control to the operating system
4. Operating system decides what to do:
- May report the error to the user
- May terminate the program
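The four steps can be sketched as a tiny software model. Everything here is illustrative: the Machine class, the field names epc and cause, the handler address, and the policy inside os_handler are assumptions made for this example, not a description of any particular processor or operating system.

```python
HANDLER_ADDRESS = 0x80000180  # illustrative address for the operating system's handler

class Machine:
    def __init__(self, pc):
        self.pc = pc
        self.epc = None     # step 1: address of the offending instruction
        self.cause = None   # step 2: reason for the exception

def raise_exception(machine, cause):
    machine.epc = machine.pc        # 1. save the address of the offending instruction
    machine.cause = cause           # 2. make the reason known (status-register style)
    machine.pc = HANDLER_ADDRESS    # 3. transfer control to the operating system

def os_handler(machine):
    # 4. the operating system decides what to do (made-up policy)
    if machine.cause == "arithmetic overflow":
        return "report error to user at 0x%08X" % machine.epc
    return "terminate program"

m = Machine(pc=0x00400024)
raise_exception(m, "arithmetic overflow")
print(os_handler(m))  # report error to user at 0x00400024
```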
Exception/Pipelining Interface
• Suppose an add instruction overflows, causing an exception
• Instructions after the add are already in the pipeline
– The partially computed instructions must be flushed
• Exception must be caught before register contents have changed
– Pipeline designers must be wary of exception handling