
Lecture 17: Basic Pipelining - School of Computingrajeev/cs3810/slides/3810-17.pdf · 1 Lecture 17: Basic Pipelining • Today’s topics: 5-stage pipeline Hazards and instruction


Lecture 17: Basic Pipelining

• Today’s topics:
   ▪ 5-stage pipeline
   ▪ Hazards and instruction scheduling

• Mid-term exam stats:
   ▪ Highest: 90, Mean: 58


Multi-Cycle Processor

• Single memory unit shared by instructions and data
• Single ALU also used for PC updates
• Registers (latches) to store the result of every block


The Assembly Line

• Unpipelined: start and finish a job before moving to the next
• Pipelined: break the job into smaller stages (A, B, C)

[Figure: jobs plotted against time, unpipelined vs. pipelined; each pipelined job passes through stages A, B, C]


Performance Improvements?

• Does it take longer to finish each individual job?

• Does it take less time to finish a series of jobs?

• What assumptions were made while answering these questions?

• Is a 10-stage pipeline better than a 5-stage pipeline?


Quantitative Effects

• As a result of pipelining:
   ▪ Time in ns per instruction goes up
   ▪ Each instruction takes more cycles to execute
   ▪ But… average CPI remains roughly the same
   ▪ Clock speed goes up
   ▪ Total execution time goes down, resulting in lower average time per instruction
   ▪ Under ideal conditions, speedup
        = ratio of elapsed times between successive instruction completions
        = number of pipeline stages = increase in clock speed
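The effects listed above can be checked with a small calculation. The following is a minimal sketch, not part of the original slides: it assumes the work of one instruction divides evenly across the stages, and it models the inter-stage latches with a hypothetical `latch_overhead_ns` parameter. All function and parameter names are illustrative.

```python
def unpipelined_time_ns(num_instructions, instr_latency_ns):
    """Each instruction starts only after the previous one has finished."""
    return num_instructions * instr_latency_ns


def pipelined_time_ns(num_instructions, instr_latency_ns, num_stages,
                      latch_overhead_ns=0.0):
    """Assumes the work splits evenly across stages (an idealization)."""
    cycle_ns = instr_latency_ns / num_stages + latch_overhead_ns
    # The first instruction needs num_stages cycles to drain through the
    # pipeline; every later instruction completes one cycle after the last.
    total_cycles = num_stages + (num_instructions - 1)
    return total_cycles * cycle_ns


def speedup(num_instructions, instr_latency_ns, num_stages,
            latch_overhead_ns=0.0):
    """Ratio of unpipelined to pipelined execution time."""
    return (unpipelined_time_ns(num_instructions, instr_latency_ns)
            / pipelined_time_ns(num_instructions, instr_latency_ns,
                                num_stages, latch_overhead_ns))


if __name__ == "__main__":
    n = 1_000_000
    print(speedup(n, 10.0, 5))         # close to 5 with zero latch overhead
    print(speedup(n, 10.0, 5, 0.5))    # lower once latch overhead is counted
    print(speedup(n, 10.0, 10, 0.5))   # 10 stages: better, but not 2x better
```

With zero latch overhead and a long instruction stream, the speedup approaches the number of stages. With a nonzero overhead, doubling the stage count yields less than double the speedup, which is one way to approach the earlier question about 10-stage versus 5-stage pipelines.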


A 5-Stage Pipeline


A 5-Stage Pipeline

Use the PC to access the I-cache and increment PC by 4


A 5-Stage Pipeline

Read registers, compare registers, compute branch target; for now, assume branches take 2 cycles (there is enough work that branches can easily take more)


A 5-Stage Pipeline

ALU computation, effective address computation for load/store


A 5-Stage Pipeline

Memory access to/from data cache, stores finish in 4 cycles


A 5-Stage Pipeline

Write result of ALU computation or load into register file
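The five stages described on the preceding slides can be laid out as a cycle-by-cycle diagram. This is a rough sketch under stated assumptions: the slides name only IF and MEM, so the labels ID, EX, and WB are the conventional names and are assumed here, and the pipeline is taken to be hazard-free. The third instruction in the usage example is hypothetical.

```python
# Stage labels: IF and MEM appear in the slides; ID, EX and WB are the
# conventional names and are an assumption here.
STAGES = ["IF", "ID", "EX", "MEM", "WB"]


def pipeline_diagram(instructions):
    """Return one row per instruction; row[c] is the stage occupied in
    cycle c (0-based), or '' when the instruction is not in the pipeline.
    Assumes no hazards, so instruction i enters IF in cycle i."""
    total_cycles = len(instructions) + len(STAGES) - 1
    rows = []
    for i in range(len(instructions)):
        row = [""] * total_cycles
        for s, stage in enumerate(STAGES):
            row[i + s] = stage
        rows.append(row)
    return rows


def format_diagram(instructions):
    """Render the diagram as text, one instruction per line."""
    lines = []
    for instr, row in zip(instructions, pipeline_diagram(instructions)):
        cells = " ".join(f"{cell:<4}" for cell in row)
        lines.append(f"{instr:<18}{cells}".rstrip())
    return "\n".join(lines)


if __name__ == "__main__":
    program = ["add $1, $2, $3", "lw $4, 8($1)", "sw $4, 0($5)"]
    print(format_diagram(program))
```

Each row is shifted one cycle to the right of the row above it, so in steady state all five stages are busy with different instructions in the same cycle.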


Conflicts/Problems

• I-cache and D-cache are accessed in the same cycle – it helps to implement them separately

• Registers are read and written in the same cycle – easy to deal with if register read/write time equals cycle time/2 (else, use bypassing)

• Branch target changes only at the end of the second stage – what do you do in the meantime?

• Data between stages get latched into registers (overhead that increases latency per instruction)


Hazards

• Structural hazards: different instructions in different stages (or the same stage) competing for the same resource

• Data hazards: an instruction cannot continue because it needs a value that has not yet been generated by an earlier instruction

• Control hazards: fetch cannot continue because it does not know the outcome of an earlier branch – a special case of a data hazard, kept as a separate category because it is handled in different ways


Structural Hazards

• Example: a unified instruction and data cache → stage 4 (MEM) and stage 1 (IF) can never coincide

• The later instruction and all its successors are delayed until a cycle is found when the resource is free → these are pipeline bubbles

• Structural hazards are easy to eliminate – increase the number of resources (for example, implement a separate instruction and data cache)
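The unified-cache example can be sketched as a tiny scheduling model. This is an illustrative simplification, not taken from the slides: it assumes a single-ported unified cache, that only loads and stores use the cache in MEM, that MEM happens three cycles after IF (stage 4 versus stage 1), and that no other hazards occur. The function name and its arguments are hypothetical.

```python
def fetch_cycles(uses_memory, unified_cache=True):
    """Return the cycle (1-based) in which each instruction is fetched.

    uses_memory[i] is True when instruction i is a load or store, i.e. it
    accesses the cache in its MEM stage, three cycles after its IF stage.
    With a unified single-ported cache, a fetch cannot share a cycle with
    an earlier instruction's MEM access, so the fetch slips by one cycle
    per conflict (a pipeline bubble).
    """
    mem_busy = set()   # cycles in which the cache is taken by a MEM stage
    fetch = []
    cycle = 0
    for accesses_mem in uses_memory:
        cycle += 1                      # in-order: at most one fetch per cycle
        if unified_cache:
            while cycle in mem_busy:    # structural hazard: insert a bubble
                cycle += 1
        fetch.append(cycle)
        if accesses_mem:
            mem_busy.add(cycle + 3)     # this instruction's MEM cycle
    return fetch


if __name__ == "__main__":
    program = [True, False, False, False, False]   # one load, then ALU ops
    print(fetch_cycles(program, unified_cache=True))
    print(fetch_cycles(program, unified_cache=False))
```

In this model the fourth instruction would normally be fetched in the same cycle as the load's MEM access, so it and its successor slip by one cycle with a unified cache. With separate caches every instruction is fetched in consecutive cycles.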


Data Hazards


Bypassing

• Some data hazard stalls can be eliminated: bypassing


Data Hazard Stalls


Data Hazard Stalls


Example

add $1, $2, $3

lw $4, 8($1)


Example

lw $1, 8($2)

lw $4, 8($1)


Example

lw $1, 8($2)

sw $1, 8($3)
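The three instruction pairs above can be worked through with a small stall calculator. The pipeline diagrams for these slides were lost in extraction, so this is a sketch under explicit assumptions rather than the lecture's own answer: the consumer directly follows the producer; the register file is written in the first half of a cycle and read in the second half; ALU results exist at the end of the third stage and loaded values at the end of MEM; and with bypassing, values can be forwarded to the inputs of both the third stage and MEM. All names are illustrative.

```python
# Cycles are counted from the producer's IF = cycle 1, so an un-stalled
# consumer that directly follows has IF = 2, ID = 3, EX = 4, MEM = 5.

# Cycle at the END of which the producer's result exists (assumption:
# ALU results after the third stage, loaded values after MEM).
RESULT_READY = {"alu": 3, "load": 4}

# Cycle at the START of which the un-stalled consumer needs the value:
# "ex" for ALU operands and load/store addresses, "mem" for store data.
NEEDED_AT = {"ex": 4, "mem": 5}

PRODUCER_WB_CYCLE = 5   # register file written in the first half-cycle
CONSUMER_ID_CYCLE = 3   # register file read in the second half-cycle


def stall_cycles(producer_kind, consumer_need, bypassing):
    """Stall cycles between a producer and the instruction right after it."""
    if bypassing:
        # The value can be forwarded in the cycle after it is produced.
        earliest_use = RESULT_READY[producer_kind] + 1
        return max(0, earliest_use - NEEDED_AT[consumer_need])
    # Without bypassing, the consumer's register read must wait until the
    # producer's write-back cycle (same cycle is fine: write, then read).
    return max(0, PRODUCER_WB_CYCLE - CONSUMER_ID_CYCLE)


if __name__ == "__main__":
    pairs = {
        "add $1,$2,$3 ; lw $4,8($1)": ("alu", "ex"),
        "lw $1,8($2)  ; lw $4,8($1)": ("load", "ex"),
        "lw $1,8($2)  ; sw $1,8($3)": ("load", "mem"),
    }
    for text, (kind, need) in pairs.items():
        print(text,
              "| no bypassing:", stall_cycles(kind, need, False),
              "| bypassing:", stall_cycles(kind, need, True))
```

Under these assumptions every pair stalls for two cycles without bypassing. With bypassing, only the load followed by a dependent address computation still stalls for one cycle, because the loaded value arrives one stage too late.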


Control Hazards

• Simple techniques to handle control hazard stalls:
   ▪ for every branch, introduce a stall cycle (note: every 6th instruction is a branch!)
   ▪ assume the branch is not taken and start fetching the next instruction – if the branch is taken, need hardware to cancel the effect of the wrong-path instruction
   ▪ fetch the next instruction (branch delay slot) and execute it anyway – if the instruction turns out to be on the correct path, useful work was done – if the instruction turns out to be on the wrong path, hopefully program state is not lost
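The cost of the three techniques can be compared with a back-of-the-envelope CPI estimate. This is a hedged sketch: the branch frequency of one in six comes from the slide, and the one-cycle penalty follows from resolving branches in the second stage, but `taken_fraction` and `useful_fill_fraction` are hypothetical inputs chosen only for illustration, and all other stalls are ignored.

```python
def cpi_stall_on_every_branch(branch_fraction, penalty_cycles=1):
    """Every branch costs penalty_cycles of stall."""
    return 1.0 + branch_fraction * penalty_cycles


def cpi_predict_not_taken(branch_fraction, taken_fraction, penalty_cycles=1):
    """Only taken branches pay: the wrong-path fetch is cancelled."""
    return 1.0 + branch_fraction * taken_fraction * penalty_cycles


def cpi_delay_slot(branch_fraction, useful_fill_fraction, penalty_cycles=1):
    """The slot is wasted only when no useful instruction fills it."""
    wasted = 1.0 - useful_fill_fraction
    return 1.0 + branch_fraction * wasted * penalty_cycles


if __name__ == "__main__":
    branch_fraction = 1 / 6   # "every 6th instruction is a branch"
    print(cpi_stall_on_every_branch(branch_fraction))
    print(cpi_predict_not_taken(branch_fraction, taken_fraction=0.6))
    print(cpi_delay_slot(branch_fraction, useful_fill_fraction=0.5))
```

The first function is the upper bound for this pipeline. The other two show how much of that penalty is recovered, which depends entirely on how often branches are taken or how often the slot can be filled with useful work.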


Branch Delay Slots


Slowdowns from Stalls

• Perfect pipelining with no hazards → an instruction completes every cycle (total cycles ~ num instructions) → speedup = increase in clock speed = num pipeline stages

• With hazards and stalls, some cycles (= stall time) go by during which no instruction completes, and then the stalled instruction completes

• Total cycles = number of instructions + stall cycles
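The cycle-count formula above translates directly into an estimate of the speedup that remains after stalls. This is a minimal sketch: like the slide, it ignores the few cycles needed to fill the pipeline, and it assumes the ideal speedup equals the number of stages. The function names and the numbers in the usage example are illustrative.

```python
def total_cycles(num_instructions, stall_cycles):
    """Total cycles = number of instructions + stall cycles."""
    return num_instructions + stall_cycles


def effective_speedup(num_instructions, stall_cycles, num_stages):
    """Ideal speedup (num_stages) scaled by ideal cycles / actual cycles."""
    return (num_stages * num_instructions
            / total_cycles(num_instructions, stall_cycles))


if __name__ == "__main__":
    # Hypothetical workload: 600 instructions, one stall per branch,
    # one branch in every six instructions -> 100 stall cycles.
    print(total_cycles(600, 100))
    print(effective_speedup(600, 100, num_stages=5))
```

In this hypothetical workload the stalls add one sixth to the cycle count, so the 5-stage pipeline delivers a speedup of 30/7 (about 4.3) instead of the ideal 5.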
