
A. Pipelining: Basic Concepts

What is pipelining?

How is pipelining implemented?

What makes pipelining hard to implement?


Feb.2008_jxh_Introduction 1.2ComputerArchitecture_pipeline 2/90

What is Pipelining?

Pipelining:

“A technique designed into some computers to increase speed by starting the execution of one instruction before completing the previous one.”

----Modern English-Chinese Dictionary

An implementation technique whereby multiple instructions are overlapped in execution.
An implementation technique for making CPUs fast.


It is like an auto assembly line

An arrangement of workers, machines, and equipment in which the product being assembled passes consecutively from operation to operation until completed.

Ford installed the first moving assembly line in 1913. The picture on the right shows the moving assembly line at Ford Motor Company's Michigan plant.

(84 distinct steps)


Trucking gas from depot to gas station

The steps:
Get the barrels
Load them into the truck
Drive to the gas station
Unload the gas
Return for more oil

Let’s do the math
Each truck can carry 5 barrels
Can load a truck with 5 barrels in 1 hour
It takes each truck 1 day to drive to and from the gas station
How many barrels per week are delivered?


Looks a Lot Like a Multi-cycle Processor

What are the steps?
Fetch an instruction (Get the barrels)
Decode the instruction (Load them into the truck)
ALU OP (Drive to the gas station)
Memory Access (Unload the gas)
Write-back (Return for more oil)


A better way, but dangerous

Roll the barrels down the road
Big fire hazard


Big idea: Build a pipeline

Now let’s do the math
Pipeline can accept 1 barrel every hour
How many barrels get delivered to the gas station per day?


Trucking vs. Pipelines

Trucks
A truck with 5 barrels takes 1 day to drive to and from the gas station, plus 2 hours for loading and unloading
LOTS of TIME when the loading area, gas station, and stretches of the road are unused

Pipelines
The pipeline can accept 1 barrel every hour
Resources (loading area, gas station, pipeline) are always in use
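
The comparison can be worked through numerically. The sketch below is only an illustration under stated assumptions: a single truck, whole trips only, 24 hours of driving plus 2 hours of loading and unloading per trip, and a pipeline that is already full. The function names are hypothetical.

```python
HOURS_PER_WEEK = 7 * 24

def truck_barrels_per_week(barrels_per_trip=5, round_trip_hours=24, handling_hours=2):
    """Barrels delivered in one week by a single truck (assumption: whole trips only)."""
    hours_per_trip = round_trip_hours + handling_hours
    full_trips = HOURS_PER_WEEK // hours_per_trip
    return full_trips * barrels_per_trip

def pipeline_barrels_per_week(barrels_per_hour=1):
    """Barrels delivered in one week by a pipeline that is already full (assumption)."""
    return HOURS_PER_WEEK * barrels_per_hour

print(truck_barrels_per_week())     # 30
print(pipeline_barrels_per_week())  # 168
```

Under these assumptions the pipeline delivers 24 barrels per day, several times what one truck manages, because no resource sits idle.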


Why Pipelining: It's Natural

Laundry

Ann, Brian, Cathy, Dave each have one load of clothes to wash, dry, and fold

Washer takes 30 minutes
Dryer takes 40 minutes
“Folder” takes 20 minutes

A B C D


Sequential Laundry

Sequential laundry takes 6 hours for 4 loads
If they learned pipelining, how long would laundry take?

[Figure: task order (loads A, B, C, D) against time from 6 PM to midnight; the loads run one after another, each taking 30 + 40 + 20 minutes.]


Pipelined Laundry--Start work ASAP

Pipelined laundry takes 3.5 hours for 4 loads

[Figure: task order (loads A, B, C, D) against time from 6 PM; the loads overlap, and the timeline is divided into segments of 30, 40, 40, 40, 40, and 20 minutes.]
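
The 6-hour and 3.5-hour figures can be reproduced with a short simulation. This is a minimal sketch, assuming a finished load can wait between machines until the next one is free; the function names are hypothetical.

```python
def sequential_minutes(stage_minutes, loads):
    """Each load finishes all stages before the next load starts."""
    return loads * sum(stage_minutes)

def pipelined_minutes(stage_minutes, loads):
    """Each load starts a stage as soon as both the load and the machine are ready."""
    free_at = [0] * len(stage_minutes)  # time at which each machine becomes free
    for _ in range(loads):
        ready = 0  # time at which this load is ready for its next stage
        for s, minutes in enumerate(stage_minutes):
            start = max(ready, free_at[s])
            free_at[s] = start + minutes
            ready = free_at[s]
    return free_at[-1]

stages = [30, 40, 20]  # washer, dryer, folder
print(sequential_minutes(stages, 4) / 60)  # 6.0
print(pipelined_minutes(stages, 4) / 60)   # 3.5
```

The pipelined total is 30 + 4 × 40 + 20 minutes: the 40-minute dryer, the slowest stage, sets the pace.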


Why pipelining: overlapped

Latches, called “pipeline registers”, break up the computation into 5 stages
Deals with 5 tasks at the same time.

Deals with only one task at a time.
This task takes “such a long time”


Why pipelining: faster

Can “launch” a new computation every 100 ns in this structure
Can finish 10^7 computations per second

Can launch a new computation every 20 ns in the pipelined structure
Can finish 5×10^7 computations per second
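
The two throughput figures follow directly from the launch interval. A minimal sketch (the function name is hypothetical):

```python
NS_PER_SECOND = 1_000_000_000

def computations_per_second(launch_interval_ns):
    """Steady-state throughput when one computation is launched per interval."""
    return NS_PER_SECOND // launch_interval_ns

unpipelined = computations_per_second(100)  # 10**7
pipelined = computations_per_second(20)     # 5 * 10**7
print(unpipelined, pipelined, pipelined // unpipelined)
```

Each individual computation still takes about 100 ns from start to finish; only the rate at which results emerge improves.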


Why pipelining : conclusion

The key implementation technique used to make fast CPUs: it decreases CPU time.

Improves throughput (rather than individual instruction execution time)

Improves the utilization of resources (functional units)


What is a pipeline?

A pipeline is like an auto assembly line
A pipeline has many stages
Each stage carries out a different part of an instruction or operation
The stages, which operate on a synchronized clock, are connected to form a pipe
An instruction or operation enters through one end, progresses through the stages, and exits through the other end
Pipelining is an implementation technique that exploits parallelism among the instructions in a sequential instruction stream


Pipeline Characteristics: Latency vs. Throughput

Latency
Each instruction takes a certain time to complete. This is the latency for that operation. It's the amount of time between when the instruction is issued and when it completes.

Throughput
Number of items (cars, instructions) exiting the pipeline per unit time.
The throughput of the assembly line is the number of products completed per hour. The throughput of a CPU pipeline is the number of instructions completed per second.


Pipeline Characteristics: Clock Cycle vs. Machine Cycle

Clock cycle
Everything in a CPU moves in lockstep, synchronized by the clock (the heartbeat of the CPU)

Machine cycle (processor cycle, stage time)
Time required to complete a single pipeline stage.
The pipeline designer’s goal is to balance the length of each pipeline stage.
In many instances, machine cycle = max(times for all stages).
A machine cycle is usually one, sometimes two, clock cycles long, but rarely more.


Performance for Pipelining


Ideal Performance for Pipelining

If the stages are perfectly balanced, the time per instruction on the pipelined processor equals:

(Time per instruction on unpipelined machine) / (Number of pipe stages)

So the ideal speedup equals the number of pipe stages.
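
To see why balance matters, the sketch below applies the rule machine cycle = max(times for all stages) to two hypothetical sets of stage times (the numbers are assumptions chosen to sum to 100 ns). It ignores latch overhead and hazards.

```python
def pipeline_speedup(stage_times_ns):
    """Speedup over the unpipelined machine, ignoring latch overhead and hazards."""
    unpipelined_time = sum(stage_times_ns)  # one instruction, all stages in a row
    machine_cycle = max(stage_times_ns)     # the slowest stage sets the cycle
    return unpipelined_time / machine_cycle

balanced = [20, 20, 20, 20, 20]    # hypothetical, perfectly balanced
unbalanced = [10, 20, 30, 20, 20]  # hypothetical, same total time

print(pipeline_speedup(balanced))    # 5.0
print(pipeline_speedup(unbalanced))  # about 3.33
```

Only the balanced case reaches the ideal speedup equal to the number of stages.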


Why not just make a 50-stage pipeline?

Some computations just won’t divide into any finer (shorter in time) logical implementation.

5 stages OK

50 stages NO. Sorry!


Why not just make a 50-stage pipeline ?

Those latches are NOT free: they take up area, and there is a real delay to go through the latch itself.

Machine cycle > latch latency + clock skew
In modern, deep pipelines (10-20 stages), this is a real effect
Typically see logic “depths” in one pipe stage of 10-20 “gates”.

At these speeds, and with so few levels of logic, latch delay is important
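
A rough model shows how this overhead caps the benefit of deeper pipelines. It is a sketch under assumptions: the logic divides evenly across stages, and each stage pays a fixed overhead for latch delay plus clock skew. The 100 ns and 2 ns values are hypothetical.

```python
def speedup_with_overhead(total_logic_ns, stages, overhead_ns):
    """Speedup over an unpipelined machine whose cycle is total_logic_ns."""
    machine_cycle = total_logic_ns / stages + overhead_ns
    return total_logic_ns / machine_cycle

for stages in (5, 10, 20, 50):
    print(stages, round(speedup_with_overhead(100, stages, 2), 2))
```

In this model the speedup can never exceed total_logic_ns / overhead_ns, however many stages are added; with these numbers, 50 stages yield a speedup of 25 rather than 50.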


How Many Pipeline Stages?

E.g., Intel
Pentium III: 10+ stages; Pentium 4: 20+ stages
More than 20 instructions in flight
High clock frequency (>1 GHz)
High IPC

Too many stages:
Lots of complications
Must take care of possible dependencies among in-flight instructions
Control logic is huge


Simple implementation of a RISC (MIPS)

Start with implementation without pipelining
single-cycle implementation
multi-cycle implementation

Pipelining the RISC instruction set
Pipelining performance issues
How can we do it efficiently?
Examples


MIPS ISA


MIPS instruction format


9 Instructions


R-instr. Add / Sub


R-Instruction: And/Or


I-Instruction: LW/SW


Branch / Jump


SLT: set when less than


Single-cycle implementation

Data Path for R-instruction


R-instruction Data Path


What’s the data path for I-instruction (R-I, LW/SW/BEQ)?


Data Path for LW


Data Path for SW


Data Path for BEQ


What’s the data path for J-instruction (jump)?


Data Path for Jump


Single-cycle implementation

seldom used !


Let’s have a break


Single cycle Multiple cycle


Finite State Diagram


Multiple Cycle MIPS CPU


Multiple Cycle MIPS CPU


5-cycle MIPS CPU


Multi-cycle implementation


About Multi-cycle implementation

The temporary storage locations were added to the datapath of the unpipelined machine to make it easy to pipeline.
Note that branch and store instructions take 4 clock cycles; all other instructions take 5.

Assuming a branch frequency of 12% and a store frequency of 10%, CPI is 4 × 22% + 5 × 78% = 4.78.

This implementation is not optimal.


How to improve the performance?

For a possible branch, do the equality test and compute the possible branch target (by adding the sign-extended offset to the incremented PC) earlier, in ID.
Complete ALU instructions during the MEM cycle.
So branch instructions take only 2 cycles, store and ALU instructions take 4 cycles, and load instructions take the longest, 5 cycles.
CPI drops to 4.07, assuming an ALU operation frequency of 47% (which leaves 31% for loads).

2 × 12% + 4 × (10% + 47%) + 5 × 31% = 4.07
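
Both CPI figures are weighted averages over the instruction mix, which the following sketch recomputes. The 31% load frequency is inferred from the slide's own formula; the dictionary layout and function name are hypothetical.

```python
def average_cpi(mix):
    """mix maps an instruction class to (frequency, cycles per instruction)."""
    total_frequency = sum(freq for freq, _ in mix.values())
    if abs(total_frequency - 1.0) > 1e-9:
        raise ValueError("frequencies must sum to 1")
    return sum(freq * cycles for freq, cycles in mix.values())

# Original multi-cycle design: branches and stores take 4 cycles, the rest 5.
baseline = {"branch": (0.12, 4), "store": (0.10, 4), "alu": (0.47, 5), "load": (0.31, 5)}
# Optimized design: branches finish in ID (2 cycles), ALU operations in MEM (4 cycles).
optimized = {"branch": (0.12, 2), "store": (0.10, 4), "alu": (0.47, 4), "load": (0.31, 5)}

print(round(average_cpi(baseline), 2))   # 4.78
print(round(average_cpi(optimized), 2))  # 4.07
```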


Optimized Multi-cycle implementation

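[Figure: datapath divided into IF, ID, EX, MEM, and WB stages. Labeled blocks: PC, Instr. MEM, NPC, IR, Reg. File, Sign Ex, A, B, IM, three MUXes, the constant 4, ALU, Zero?, ALU output, Data MEM, LMD. Callout: temporary storage locations.]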


Reducing hardware redundancy

The ALU can be shared.

Data and instruction memory can be combined, since the accesses occur on different clock cycles.


Pipelining the MIPS instruction set

Since there are five separate stages, we can have a pipeline in which one instruction is in each stage.
CPI is decreased to 1, since one instruction is issued (or finished) each cycle.
During any cycle, one instruction is present in each stage.

Ideally, performance is increased fivefold!


5-stage Version of MIPS Datapath

[Figure: five-stage datapath with pipeline registers (latches) between the stages; the load and store paths are marked.]

Why latches?


Single-cycle implementation vs. pipelining

Single-cycle implementation: CPI = 1, long clock cycle

[Figure: clock waveform over Cycle 1 and Cycle 2; Load occupies one cycle, and Store occupies the next, with part of that cycle marked as waste.]

Pipeline implementation: CPI = 1, clock cycle ≈ long clock cycle / 5

[Figure: clock waveform over Cycles 1 to 10; Load, Store, and R-type each pass through IF, ID, EX, MEM, WB, overlapped and offset by one cycle.]


Multi-cycle implementation vs. pipelining

Multi-cycle implementation: CPI = 5

[Figure: clock waveform over Cycles 1 to 10; Load, Store, and R-type run back to back, each passing through IF, ID, EX, MEM, WB before the next begins.]

Pipeline implementation: CPI = 1

[Figure: clock waveform over Cycles 1 to 10; Load, Store, and R-type each pass through IF, ID, EX, MEM, WB, overlapped and offset by one cycle.]
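
The overlapped timing in the pipelined diagram can be regenerated as text. This is an illustrative sketch, assuming one instruction enters IF per cycle and nothing stalls; the function name and column widths are arbitrary choices.

```python
STAGES = ["IF", "ID", "EX", "MEM", "WB"]

def pipeline_chart(instructions):
    """Return one text row per instruction; instruction i enters IF in cycle i + 1."""
    total_cycles = len(instructions) + len(STAGES) - 1
    rows = []
    for i, name in enumerate(instructions):
        cells = [""] * total_cycles
        for s, stage in enumerate(STAGES):
            cells[i + s] = stage  # stage s of instruction i runs in cycle i + s + 1
        rows.append(f"{name:<7}" + "".join(f"{c:<4}" for c in cells).rstrip())
    return rows

for row in pipeline_chart(["Load", "Store", "R-type"]):
    print(row)
```

Each row is shifted one column to the right of the row above it, so three instructions finish in 7 cycles instead of the 15 needed when they run back to back.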


How does pipelining decrease execution time?

If your starting point is
a single-clock-cycle-per-instruction machine, then pipelining decreases the cycle time;
a multiple-clock-cycle-per-instruction machine, then pipelining decreases the CPI.
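
Both cases follow from CPU time = instruction count × CPI × clock cycle time. The sketch below plugs in hypothetical values (a 100 ns single-cycle clock, a 20 ns stage time, one million instructions) and assumes an ideal pipeline with no stalls.

```python
def cpu_time_ns(instruction_count, cpi, cycle_ns):
    """CPU time = instruction count x CPI x clock cycle time."""
    return instruction_count * cpi * cycle_ns

N = 1_000_000  # hypothetical instruction count

single_cycle = cpu_time_ns(N, cpi=1, cycle_ns=100)  # long cycle, CPI = 1
multi_cycle = cpu_time_ns(N, cpi=5, cycle_ns=20)    # short cycle, CPI = 5
pipelined = cpu_time_ns(N, cpi=1, cycle_ns=20)      # short cycle and CPI = 1

print(single_cycle / pipelined)  # gain comes from the shorter cycle
print(multi_cycle / pipelined)   # gain comes from the lower CPI
```

With these numbers both ratios are 5: the pipelined machine combines the short cycle of the multi-cycle design with the CPI of the single-cycle design.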


Performance issues in pipelining

Latency: pipelining does not decrease the execution time of an individual instruction; in fact it is slightly longer than on the unpipelined machine.
Imbalance among the stages reduces performance.
Overhead from pipeline register delay and clock skew also sets a lower limit on the machine cycle.
Pipeline hazards are the major hurdle of pipelining; they prevent the machine from reaching the ideal performance.
The time to “fill” the pipeline and the time to “drain” it reduce speedup.
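
The fill and drain effect is easy to quantify for an idealized case. The sketch assumes a hazard-free pipeline compared against a multi-cycle machine with the same cycle time, where every instruction takes one cycle per stage; the function names are hypothetical.

```python
def pipelined_cycles(n_instructions, stages=5):
    """The first instruction needs `stages` cycles; each later one adds a single cycle."""
    return stages + n_instructions - 1

def speedup_over_multicycle(n_instructions, stages=5):
    """Compare with a machine that spends `stages` cycles on every instruction."""
    return (stages * n_instructions) / pipelined_cycles(n_instructions, stages)

for n in (1, 3, 10, 1000):
    print(n, pipelined_cycles(n), round(speedup_over_multicycle(n), 2))
```

For short instruction sequences the fill and drain cycles dominate; the speedup approaches the number of stages only as the sequence grows long.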


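Is it as simple as this? Really?

[Figure: the five-stage datapath with pipeline registers (latches); the load and store paths are marked, and one added line is highlighted.]

Why does this line need to be added?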


Problems that pipelining introduces

Focus: no two different operations may use the same datapath resource in the same clock cycle (structural hazard).
There is a conflict over the memory!

[Figure: instruction order (Ld/St, Instr 1, Instr 2, Instr 3) against time in clock cycles; each instruction passes through Mem, Reg, ALU, Mem, Reg, so the data access of the Ld/St falls in the same cycle as a later instruction's fetch from Mem.]
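
The memory conflict can be found mechanically. This is a sketch under assumptions: a single shared memory, one instruction fetched per cycle, and data accesses happening in the fourth stage (MEM). The "mem"/"other" encoding and the function name are hypothetical.

```python
def memory_conflicts(instructions):
    """Find cycles in which a data access and an instruction fetch need the one memory.

    instructions: list of "mem" (load/store) or "other"; instruction i is fetched
    in cycle i + 1. Returns (cycle, accessing instruction, fetched instruction).
    """
    fetch_cycle = {i + 1: i for i in range(len(instructions))}
    conflicts = []
    for i, kind in enumerate(instructions):
        mem_cycle = i + 4  # IF = i + 1, ID = i + 2, EX = i + 3, MEM = i + 4
        if kind == "mem" and mem_cycle in fetch_cycle:
            conflicts.append((mem_cycle, i, fetch_cycle[mem_cycle]))
    return conflicts

# Ld/St followed by Instr 1, Instr 2, Instr 3, as in the figure.
print(memory_conflicts(["mem", "other", "other", "other"]))  # [(4, 0, 3)]
```

In cycle 4 the load or store reaches MEM just as Instr 3 is being fetched, which is the structural hazard on the memory.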


The conflict over the registers!


Conflict occurs when the PC is updated

Must increment and store the PC every clock.
What happens when we meet a branch?

Branches change the value of the PC, but the condition is not evaluated until ID! If the branch is taken, the instructions fetched behind the branch are invalid!

This is clearly a serious problem (control hazard) that needs to be addressed. We will deal with it later.


Are the latches necessary? Yes!

They ensure that the instructions in different stages do not interfere with one another. Only through the latches can the stages be joined one after another to form a pipeline.

The latches are the pipeline registers, and there are many more of them than in the multi-cycle version:

IR: IF/ID.IR; ID/EX.IR; EX/DM.IR; DM/WB.IR

B: ID/EX.B; EX/DM.B

ALUoutput: EX/DM.ALUoutput; DM/WB.ALUoutput

Any value needed in a later stage must be placed in a pipeline register and copied from one register to the next until it is no longer needed.
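The copying from one pipeline register to the next can be sketched in a few lines. This is a simplified model, assuming only the IR field is tracked and every register is updated on each clock; the register names follow the slide, while `clock` and `run` are hypothetical helpers.

```python
PIPELINE_REGISTERS = ["IF/ID", "ID/EX", "EX/DM", "DM/WB"]


def clock(regs, fetched_ir):
    """Advance one clock: each pipeline register latches the IR held by the register before it."""
    new_regs = {"IF/ID": fetched_ir}
    for prev, nxt in zip(PIPELINE_REGISTERS, PIPELINE_REGISTERS[1:]):
        new_regs[nxt] = regs[prev]
    return new_regs


def run(program):
    """Feed one instruction per clock into an initially empty pipeline and return the final register contents."""
    regs = dict.fromkeys(PIPELINE_REGISTERS)  # all registers start out empty (None)
    for ir in program:
        regs = clock(regs, ir)
    return regs


if __name__ == "__main__":
    print(run(["lw", "add", "sub"]))
```

After three clocks each instruction sits in its own register, so every stage works on a different instruction without overwriting its neighbours' state.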


Separate instruction and data memories

Use split instruction and data caches.

The memory system must deliver 5 times the bandwidth of the unpipelined version.

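[Figure: pipeline timing diagram. Vertical axis: instruction order (Ld/St, Instr 1, Instr 2, Instr 3); horizontal axis: time in clock cycles. Each instruction passes through IM, Reg, ALU, DM, Reg, with instruction memory (IM) and data memory (DM) now separate.]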


Pipeline hazards: the major hurdle

A hazard is a condition that prevents an instruction in the pipeline from executing in its next scheduled pipe stage.

Taxonomy of hazards

Structural hazards: conflicts over hardware resources.

Data hazards: an instruction depends on the result of a prior computation that is not yet ready (computed or stored).

Control hazards: the branch condition and the branch PC are not available in time to fetch an instruction on the next clock.
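A data hazard of the read-after-write kind can be detected by comparing each instruction's source registers with the destinations of the instructions just ahead of it. The sketch below is an illustration only: the `Instr` type, the register names, and the default `window` of 2 are assumptions, and the right window depends on the pipeline and on whether forwarding is used.

```python
from typing import NamedTuple, Optional, Tuple


class Instr(NamedTuple):
    op: str
    dest: Optional[str]      # register written, or None
    srcs: Tuple[str, ...]    # registers read


def raw_hazards(program, window=2):
    """Return (producer_index, consumer_index, register) for each source register
    whose most recent writer lies within `window` instructions of its reader."""
    hazards = []
    for j, consumer in enumerate(program):
        for src in consumer.srcs:
            # Scan backwards so the nearest earlier writer is the one reported.
            for i in range(j - 1, max(j - window, 0) - 1, -1):
                if program[i].dest == src:
                    hazards.append((i, j, src))
                    break
    return hazards


if __name__ == "__main__":
    program = [
        Instr("add", "r1", ("r2", "r3")),
        Instr("sub", "r4", ("r1", "r5")),
        Instr("and", "r6", ("r1", "r7")),
        Instr("or", "r8", ("r1", "r9")),
    ]
    print(raw_hazards(program))
```

Here the `sub` and `and` instructions read `r1` too soon after `add` writes it, while `or` falls outside the assumed window.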


Hazards can always be resolved by stalling

The simplest way to "fix" a hazard is to stall the pipeline. A stall suspends some instructions in the pipeline for one or more clock cycles.

The stall delays the stalled instruction and all instructions issued after it, while instructions issued earlier continue to proceed.

A pipeline stall is also called a pipeline bubble, or simply a bubble.

No new instructions are fetched during a stall.
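The effect of a stall on later instructions can be shown with a short timing calculation. This is a minimal sketch, assuming a 5-stage pipeline that issues one instruction per cycle and a hypothetical list giving the number of bubble cycles inserted for each instruction.

```python
def completion_cycles(stalls, depth=5):
    """Return the 1-based cycle in which each instruction leaves its last stage.

    stalls[i] is the number of bubble cycles inserted for instruction i.
    A stall delays that instruction and every later one, but not earlier ones.
    """
    done = []
    accumulated_delay = 0
    for i, bubbles in enumerate(stalls):
        accumulated_delay += bubbles
        done.append(depth + i + accumulated_delay)
    return done


if __name__ == "__main__":
    print(completion_cycles([0, 0, 0, 0]))  # no stalls
    print(completion_cycles([0, 1, 0, 0]))  # one bubble before the second instruction
```

In the second call the first instruction still finishes in cycle 5, while the stalled instruction and everything behind it finish one cycle later than they would have.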


Performance of a pipeline with stalls

Pipeline stalls decrease performance from the ideal. Recall the speedup formula:
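A standard statement of the speedup from pipelining, written with the quantities these slides use (the subscript notation is a choice made for this sketch):

$$
\text{Speedup}
= \frac{\text{Average instruction time}_{\text{unpipelined}}}{\text{Average instruction time}_{\text{pipelined}}}
= \frac{\text{CPI}_{\text{unpipelined}} \times \text{Clock cycle}_{\text{unpipelined}}}{\text{CPI}_{\text{pipelined}} \times \text{Clock cycle}_{\text{pipelined}}}
$$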


Assumptions for the calculation

The ideal CPI on a pipelined processor is almost always 1 (it may be less than or greater than 1). So the pipelined CPI is 1 plus the pipeline stall cycles per instruction.

Ignore the pipelining overhead on the clock cycle.

Pipe stages are ideally balanced.

Two different implementations:

single-cycle implementation

multi-cycle implementation


Case of single-cycle implementation

CPI unpipelined = 1

Clock cycle pipelined = Clock cycle unpipelined / pipeline depth
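Substituting these two values into the ratio of unpipelined to pipelined instruction time gives the speedup for this case. The step below is a sketch under the stated assumptions (ideal pipelined CPI of 1, no clock overhead, balanced stages); "stall cycles per instruction" denotes the average number of stall cycles each instruction incurs.

$$
\text{Speedup}
= \frac{1 \times \text{Clock cycle}_{\text{unpipelined}}}{(1 + \text{stall cycles per instruction}) \times \dfrac{\text{Clock cycle}_{\text{unpipelined}}}{\text{pipeline depth}}}
= \frac{\text{pipeline depth}}{1 + \text{stall cycles per instruction}}
$$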


Case of multi-cycle implementation

Clock cycle unpipelined = Clock cycle pipelined

CPI unpipelined = pipeline depth
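Both cases can be checked numerically. The sketch below assumes an ideal pipelined CPI of 1 plus stalls, no clock overhead, and balanced stages; the function and parameter names, the depth of 5, and the stall rate of 0.25 are illustrative values, not figures from the slides.

```python
def speedup(cpi_unpipelined, clock_unpipelined, cpi_pipelined, clock_pipelined):
    """Ratio of average instruction time without pipelining to that with pipelining."""
    return (cpi_unpipelined * clock_unpipelined) / (cpi_pipelined * clock_pipelined)


def single_cycle_case(depth, stalls_per_instruction):
    """Unpipelined CPI is 1; the pipelined clock is the unpipelined clock divided by the depth."""
    clock_unpipelined = float(depth)  # arbitrary time units
    return speedup(1, clock_unpipelined, 1 + stalls_per_instruction, clock_unpipelined / depth)


def multi_cycle_case(depth, stalls_per_instruction):
    """Both clocks are equal; the unpipelined CPI equals the pipeline depth."""
    return speedup(depth, 1.0, 1 + stalls_per_instruction, 1.0)


if __name__ == "__main__":
    print(single_cycle_case(5, 0.25))
    print(multi_cycle_case(5, 0.25))
```

Under these assumptions both cases reduce to the pipeline depth divided by one plus the stall cycles per instruction, so the two calls return the same value.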


That’s all for today!