CS 152 Computer Architecture and Engineering
Lecture 9: Pipelining III

2005-9-27
John Lazzaro (www.cs.berkeley.edu/~lazzaro)
www-inst.eecs.berkeley.edu/~cs152/

TAs: David Marquardt and Udam Saini

UC Regents Fall 2005 © UCB

Congrats on Lab 2!

Last time: A Hazard Taxonomy

Structural Hazards

Data Hazards (RAW, WAR, WAW)

Control Hazards (taken branches and jumps)

On each clock cycle, we must detect the presence of all of these hazards, and resolve them before they break the “contract with the programmer”.

Last Time: Hazard Resolution Toolkit

Stall earlier instructions in pipeline.

Kill earlier instructions in pipeline.

Forward results computed in later pipeline stages to earlier stages.

Add new hardware or rearrange hardware design to eliminate hazard.

Make hardware handle concurrent requests to eliminate hazard.

Change ISA to eliminate hazard.

Today: Putting it All Together

Specifications for Lab 3

Preferred hazard resolution tools.

At-risk hazards for Lab 3

Tips for control design

Lab 3: ISA Specifications

No load “delay slot”

Single branch “delay slot”

Also: RESET signal, BREAK release signal, etc ...

Remember: Online MIPS documentation

The level of detail needed for a pipelined design can only be found in this document.

Excerpt: MIPS32™ Architecture For Programmers Volume II, Revision 2.00
Copyright © 2001-2003 MIPS Technologies Inc. All rights reserved.

AND (And)

Bits:    31-26    25-21  20-16  15-11  10-6   5-0
Field:   SPECIAL  rs     rt     rd     0      AND
Value:   000000                        00000  100100
Width:   6        5      5      5      5      6

Format: AND rd, rs, rt (MIPS32)

Purpose: To do a bitwise logical AND

Description: rd ← rs AND rt

The contents of GPR rs are combined with the contents of GPR rt in a bitwise logical AND operation. The result is placed into GPR rd.

Restrictions: None

Operation: GPR[rd] ← GPR[rs] and GPR[rt]

Exceptions: None
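The field layout of the AND instruction can be exercised with a short sketch. Python is used here only for illustration; the helper names `encode_and`, `decode_rtype` and `execute_and` are hypothetical and do not come from the MIPS documentation. The guard on register 0 reflects the MIPS convention that register 0 always reads as zero.

```python
# Field layout for SPECIAL (R-type) instructions, as on the AND manual page:
# opcode[31:26] rs[25:21] rt[20:16] rd[15:11] zero[10:6] funct[5:0]
SPECIAL_OPCODE = 0b000000
AND_FUNCT = 0b100100


def encode_and(rd, rs, rt):
    """Pack AND rd, rs, rt into a 32-bit instruction word."""
    for reg in (rd, rs, rt):
        if not 0 <= reg < 32:
            raise ValueError("register numbers are 5 bits wide")
    return (SPECIAL_OPCODE << 26) | (rs << 21) | (rt << 16) | (rd << 11) | AND_FUNCT


def decode_rtype(word):
    """Split a 32-bit word into the six R-type fields."""
    return {
        "opcode": (word >> 26) & 0x3F,
        "rs": (word >> 21) & 0x1F,
        "rt": (word >> 16) & 0x1F,
        "rd": (word >> 11) & 0x1F,
        "shamt": (word >> 6) & 0x1F,
        "funct": word & 0x3F,
    }


def execute_and(gpr, word):
    """GPR[rd] <- GPR[rs] and GPR[rt]; writes to register 0 are dropped."""
    fields = decode_rtype(word)
    if fields["rd"] != 0:
        gpr[fields["rd"]] = gpr[fields["rs"]] & gpr[fields["rt"]]
```

A pipelined control unit needs exactly these fields: rs and rt to detect hazards in decode, rd to know which register a later stage will write.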

Hazard Diagnosis

Data Hazards: Read After Write

Read After Write (RAW) hazards. Instruction I2 expects to read a data value written by an earlier instruction, but I2 executes “too early” and reads the wrong copy of the data.

Lab 3 solution: use forwarding heavily, fall back on stalling when forwarding won’t work or slows down the critical path too much.
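Before choosing between forwarding and stalling, the control logic has to detect the RAW dependence. A minimal sketch of that check follows; the `Instr` record and the `raw_hazard` function are hypothetical names introduced here for illustration, not part of the Lab 3 specification.

```python
from typing import NamedTuple, Optional, Tuple


class Instr(NamedTuple):
    """Hypothetical decoded instruction: what it writes and what it reads."""
    name: str
    dest: Optional[int]      # register written, or None (e.g. a store or branch)
    srcs: Tuple[int, ...]    # registers read


def raw_hazard(earlier, later):
    """True if `later` reads a register that `earlier` writes."""
    if earlier.dest is None or earlier.dest == 0:
        # MIPS register 0 always reads as zero, so writing it creates
        # no dependence. Forgetting this case is an easy bug to make.
        return False
    return earlier.dest in later.srcs
```

For example, `ADD R4,R3,R2` followed by `OR R5,R4,R2` is a RAW pair on R4, while the reverse order is not.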

Full bypass network ...

[Figure: datapath for the ID (Decode), EX, MEM and WB stages: RegFile, Ext, ALU, Data Memory, pipeline registers (IR, A, B, M, Y, R), and a “Mux, Logic” block with an input labeled “From WB”.]

Common bug: Multiple forwards ...

[Figure: datapath for the ID (Decode), EX, MEM and WB stages with the “Mux, Logic” bypass block and the “From WB” input. Instructions shown on the figure:]

ADD R4,R3,R2
OR R2,R3,R1
AND R2,R2,R1

Which do we forward from?
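One way to reason about the question: when several in-flight instructions write the same register, the reader must see the value from the most recent writer in program order, which is the writer in the stage closest to it. The sketch below assumes the figure places ADD in ID, OR in EX and AND in MEM (an assumption read off the flattened slide), and `select_forward` is a hypothetical helper name.

```python
def select_forward(src_reg, ex_dest, mem_dest, wb_dest):
    """Pick the bypass source for one operand read in ID.

    Each *_dest is the register the instruction in that stage will write,
    or None if it writes nothing. Stages are checked youngest first
    (EX, then MEM, then WB) so that the most recent writer wins.
    """
    if src_reg == 0:
        return "regfile"      # MIPS register 0 is never forwarded
    for stage, dest in (("EX", ex_dest), ("MEM", mem_dest), ("WB", wb_dest)):
        if dest == src_reg:
            return stage
    return "regfile"
```

Under the stated stage assumption, ADD reads R2 while both OR (EX) and AND (MEM) are about to write R2, and the priority order picks the EX result, so ADD sees the value from OR rather than the older AND. Checking the stages in the opposite order is the bug the slide title warns about.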

Data Hazards: WAR and WAW ...

Write After Read (WAR) hazards. Instruction I2 expects to write over a data value after an earlier instruction I1 reads it. But instead, I2 writes too early, and I1 sees the new value.

Write After Write (WAW) hazards. Instruction I2 writes over data an earlier instruction I1 also writes. But instead, I1 writes after I2, and the final data value is incorrect.

WAR and WAW hazards are not possible in our 5-stage pipeline. However, TA test code checks for these, and every semester a few WAR/WAWs are found. Why?

LW and Hazards

Questions about LW and forwarding

[Figure: datapath for the ID (Decode), EX, MEM and WB stages with the “Mux, Logic” bypass block and the “From WB” input. Instructions shown on the figure:]

ADDIU R1,R1,24
LW R1,128(R29)
OR R3,R3,R2

Will this work as shown?

Questions about LW and forwarding

[Figure: datapath for the ID (Decode), EX, MEM and WB stages with the “Mux, Logic” bypass block and the “From WB” input. Instructions shown on the figure:]

ADDIU R1,R1,24
LW R1,128(R29)
OR R1,R3,R1

Will this work as shown?

Resolving a RAW hazard by stalling

[Figure: first three pipeline stages (Stage #1 Instr Fetch, Stage #2 Decode & Reg Fetch, Stage #3) with PC, +0x4 incrementer, Instr Mem, IR pipeline registers, RegFile, Ext, and registers A, B, M.]

Sample program

ADD R4,R3,R2
OR R5,R4,R2

Let ADD proceed to WB stage, so that R4 is written to regfile.

Keep executing OR instruction until R4 is ready. Until then, send NOPs to IR 2/3.

Freeze PC and IR until stall is over.

New datapath hardware

(1) Mux into IR 2/3 to feed in NOP.

(2) Write enable on PC and IR 1/2
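The two pieces of new hardware can be modeled as one clock edge of the pipeline front end. This is a behavioral sketch only: the function name `clock_front_end`, the string-valued instructions and the `NOP` constant are illustrative assumptions, not the Lab 3 design.

```python
NOP = "NOP"


def clock_front_end(pc, ir12, ir23, fetched, stall):
    """Advance PC, IR 1/2 and IR 2/3 by one clock edge.

    `fetched` is the instruction read from instruction memory at `pc`.
    Returns the new (pc, ir12, ir23).
    """
    if stall:
        # Write enables on PC and IR 1/2 are off, so both hold their values.
        # The mux in front of IR 2/3 selects a NOP instead of IR 1/2.
        return pc, ir12, NOP
    # Normal operation: every instruction advances by one stage.
    return pc + 4, fetched, ir12
```

While `stall` is asserted, the OR instruction stays in IR 1/2 and is decoded again on the next cycle, and the NOP bubbles travel down the pipeline in its place.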

Branches and Hazards

Recall: Control hazard and hardware

[Figure: first three pipeline stages (Stage #1 Instr Fetch, Stage #2 Decode & Reg Fetch, Stage #3) with PC, +0x4 incrementer, Instr Mem, IR pipeline registers, RegFile, Ext, and registers A, B, M. An equality comparator (==) sends its result “To branch control logic”.]

Recall: After more hardware, change ISA

[Figure: five-stage pipeline (IF (Fetch), ID (Decode), EX (ALU), MEM, WB) with PC, +0x4 incrementer, Instr Mem and IR pipeline registers, above a timing chart for instructions I1 to I6 over times t1 to t8.]

Sample Program (ISA w/o branch delay slot)

I1: BEQ R4,R3,25
I2: SUB R1,R9,R8
I3: AND R6,R5,R4

I1:  IF  ID  EX  MEM  WB
I2:      IF

ID stage computes if branch is taken

If branch is taken, this instruction (I2) MUST NOT complete!

If we change ISA, can we always let I2 complete (“branch delay slot”) and eliminate the control hazard?
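The effect of a branch delay slot on program behavior can be shown with a tiny model of instruction completion order. Everything here is illustrative: `completion_order` is a hypothetical helper, instructions are plain strings, and the branch target label `TARGET` is invented for the example (the slide only gives the offset 25).

```python
def completion_order(program, branches, taken, max_steps=20):
    """Return the instructions that complete, in order, with one delay slot.

    `program` is a list of instruction names, `branches` maps the index of
    a branch to the index of its target, and `taken` says whether branches
    are taken. A branch sitting in a delay slot is not modeled.
    """
    order = []
    pc = 0
    pending_target = None    # target to jump to once the delay slot is done
    while pc < len(program) and len(order) < max_steps:
        order.append(program[pc])
        next_pc = pc + 1
        if pending_target is not None:
            # The instruction that just completed was the delay slot.
            next_pc = pending_target
            pending_target = None
        elif pc in branches and taken:
            # The branch resolves, but the next sequential instruction
            # still completes before control transfers.
            pending_target = branches[pc]
        pc = next_pc
    return order
```

With `program = ["BEQ", "SUB", "AND", "TARGET"]` and `branches = {0: 3}`, SUB completes whether or not the branch is taken, so the hardware never has to kill it.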

Questions about branch and forwards

[Figure: datapath for the ID (Decode), EX, MEM and WB stages with the “Mux, Logic” bypass block, plus an equality comparator (==) that sends its result “To branch control logic”. Instructions shown on the figure:]

BEQ R1,R3,label
OR R3,R3,R1

Will this work as shown?

Lessons learned

Pipelining is hard

Write test code in advance

Study every instruction

Think about interactions ...

Control Implementation

Recall: What is single cycle control?

[Figure: single-cycle datapath (Instr Mem, RegFile, Ext, ALU, Data Memory) with its control block. Signals labeled on the figure: Equal, RegDest, RegWr, ExtOp, ALUsrc, ALUctr, MemWr, MemToReg, PCSrc.]

Combinational Logic (Only Gates, No Flip Flops)

Just specify logic functions!

In pipelines, all IR registers are used

[Figure: the IR registers of the ID (Decode), EX, MEM and WB stages, with the control block. Signals labeled on the figure: Equal, RegDest, RegWr, ExtOp, MemToReg, PCSrc.]

Combinational Logic (Only Gates, No Flip Flops)

(add extra state outside!)

A “conceptual” design -- for shortest critical path, IR registers may hold decoded info, not the complete 32-bit instruction

Two goals when specifying control logic

Bug-free: One “0” that should be a “1” in the control logic function breaks contract with the programmer.

Efficient: Logic function specification should map to hardware with good performance properties: fast, small, low power, etc.

Should be easy for humans to read and understand: sensible signal names, symbolic constants ...
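A sketch of what "sensible signal names, symbolic constants" can look like when control is written as a pure function of the instruction. The opcode values are the MIPS32 encodings; the signal polarities (for example `RegDest=1` selecting rd) and the table layout are assumptions made for illustration, not the Lab 3 definitions.

```python
# Symbolic constants instead of bare bit patterns (MIPS32 opcode field values).
OP_SPECIAL = 0x00    # R-type instructions such as AND
OP_ORI = 0x0D
OP_LW = 0x23
OP_SW = 0x2B

ZERO_EXT, SIGN_EXT = 0, 1

# One row per opcode; None marks a "don't care" output.
# Assumed polarities: RegDest=1 selects rd, ALUsrc=1 selects the immediate.
CONTROL_TABLE = {
    OP_SPECIAL: dict(RegDest=1, RegWr=1, ExtOp=None, ALUsrc=0, MemWr=0, MemToReg=0),
    OP_ORI: dict(RegDest=0, RegWr=1, ExtOp=ZERO_EXT, ALUsrc=1, MemWr=0, MemToReg=0),
    OP_LW: dict(RegDest=0, RegWr=1, ExtOp=SIGN_EXT, ALUsrc=1, MemWr=0, MemToReg=1),
    OP_SW: dict(RegDest=None, RegWr=0, ExtOp=SIGN_EXT, ALUsrc=1, MemWr=1, MemToReg=None),
}


def control_signals(instruction_word):
    """Combinational control: outputs depend only on the opcode field."""
    opcode = (instruction_word >> 26) & 0x3F
    return CONTROL_TABLE[opcode]
```

Because the function has no state, it matches the "only gates, no flip flops" picture, and a single wrong entry in the table is easy to spot in review.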

Midterm week begins on Thursday ...

HW graded on effort

Midterm (6-9, 310 Soda), no class that day.

Thursday review session. Will cover format, material, and ground rules for test.

And concurrently, Lab 3 deadlines ...

Lab 2 team evals

Lab 3 design document

Weekend: start design work

Admin: Team Evaluations due Thursday

Admin: Design Document Deadlines

Lab 3 deadlines after the mid-term ...

Lab 3 design doc + Xilinx checkoff later in week ...

Are pipelined CPUs a “solved problem”?

Embedded CPU -- runs one program its entire life.

Embedded CPU -- “embedded into products”. Unlike desktop/laptop/server, where customer knows he is buying a “computer”.

Theme: don’t do hardware in a vacuum

Optimize the architecture, the compilers for code that runs on the hardware, and the CAD tools that create the hardware, all together.

Example: Customize “ALU” for the app

Designing a new CPU just for one product is hard. Why?

Seconds/Program = (Instructions/Program) × (Cycles/Instruction) × (Seconds/Cycle)
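A worked example of the performance equation, with made-up numbers. The instruction counts, CPI and cycle times below are hypothetical and chosen only to show how a customized ALU can trade one factor against another.

```python
def seconds_per_program(instructions, cycles_per_instruction, seconds_per_cycle):
    """Seconds/Program = Instructions/Program * Cycles/Instruction * Seconds/Cycle."""
    return instructions * cycles_per_instruction * seconds_per_cycle


# Hypothetical baseline CPU: 1,000,000 instructions, CPI of 1.5, 10 ns cycle.
baseline = seconds_per_program(1_000_000, 1.5, 10e-9)

# Hypothetical custom ALU: 20% fewer instructions, but a 12 ns cycle because
# the new operation lengthens the critical path.
custom = seconds_per_program(800_000, 1.5, 12e-9)

speedup = baseline / custom
```

With these numbers the custom design comes out only slightly ahead, since most of the instruction-count savings are given back in cycle time. All three factors have to be considered together.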

Example: Specialize memory for the app

Teaching compilers and operating systems about unusual memory systems is hard.

Example: Energy & operating systems

Teaching compilers and operating systems about E = 1/2 C V^2 is hard.
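The quadratic dependence on voltage is the reason software-visible voltage control matters. A small numeric sketch, using hypothetical capacitance and voltage values:

```python
def switching_energy(capacitance_farads, voltage_volts):
    """E = 1/2 C V^2, the energy to charge capacitance C to voltage V."""
    return 0.5 * capacitance_farads * voltage_volts ** 2


# Hypothetical values: 1 pF of switched capacitance at 1.2 V versus 0.6 V.
full = switching_energy(1e-12, 1.2)
half = switching_energy(1e-12, 0.6)
ratio = half / full
```

Halving the voltage cuts the switching energy to one quarter, which is why an operating system that can lower the supply voltage when full speed is not needed can save substantial energy.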