2/6/02 CSE 141 - Single Cycle Datapath
The Single Cycle Datapath
[Figure: datapath overview with PC, instruction memory (Address, Instruction), registers (three Register # inputs, Data), ALU, and data memory (Address, Data)]
Note: Some of the material in this lecture is COPYRIGHT 1998 MORGAN KAUFMANN PUBLISHERS, INC. ALL RIGHTS RESERVED.
Figures may be reproduced only for classroom or personal education use in conjunction with our text and only when the above line is included.
The Performance Big Picture
• Execution Time = Insts * CPI * Cycle Time
• Processor design (datapath and control) will determine:
  – Clock cycle time
  – Clock cycles per instruction
• Starting today:
  – Single cycle processor: execute an entire instruction in a single clock cycle
    • Advantage: CPI = 1
    • Disadvantage: long cycle time
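The execution time formula above can be tried out with a few numbers. This is a minimal sketch: the instruction count, CPI values, and cycle times are hypothetical and chosen only to show how a CPI of 1 can be offset by a long cycle time.

```python
def execution_time_ns(instructions: int, cpi: float, cycle_time_ns: float) -> float:
    """Execution Time = Insts * CPI * Cycle Time, in nanoseconds."""
    return instructions * cpi * cycle_time_ns


# Hypothetical workload and clock periods (not taken from the lecture).
INSTRUCTIONS = 1_000_000

# Single cycle design: CPI = 1, but the cycle must fit the slowest instruction.
single_cycle_ns = execution_time_ns(INSTRUCTIONS, cpi=1.0, cycle_time_ns=8.0)

# A hypothetical alternative: shorter cycle, more cycles per instruction.
shorter_cycle_ns = execution_time_ns(INSTRUCTIONS, cpi=4.0, cycle_time_ns=2.5)

print(f"single cycle:  {single_cycle_ns / 1e6:.1f} ms")
print(f"shorter cycle: {shorter_cycle_ns / 1e6:.1f} ms")
```

Which design wins depends entirely on the numbers plugged in; the sketch only shows that all three factors matter.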
Processor Design
• We're ready to implement the MIPS “core”
  – load-store instructions: lw, sw
  – reg-reg instructions: add, sub, and, or, slt
  – control flow instructions: beq
• First, we need to fetch an instruction into the processor
  – program counter (PC) supplies instruction address
  – get the instruction from memory
[Figure: the PC drives the Address input of a memory with Clk, 32-bit Data In, Write Enable (set to 0), and 32-bit Data Out; the instruction appears at Data Out]
That was too easy
• A problem: how will we do a load or store?
  – remember that memory has only 1 port
  – and we want to do everything in 1 cycle
[Figure: the same single-port memory, with the PC on Address, Write Enable set to 0, and the instruction appearing at Data Out]
Instruction & Data in same cycle?
Solution: separate data and instruction memory
• There will be only one DRAM memory
  – We want a stored program architecture
  – How else can you compile and then run a program?
• But we can have separate SRAM caches (we’ll study caches later)
[Figure: the PC supplies the address to an instruction cache, where the instruction appears; a separate data cache has Clk, Address, 32-bit Data In, Write Enable, and 32-bit Data Out]
Instruction Fetch Unit
Updating the PC for next instruction
  – Sequential Code: PC <- PC + 4
  – Branch and Jump: PC <- “something else”
    • we’ll worry about these later
[Figure: the PC feeds the Read address input of the instruction memory, which outputs Instruction; an Add unit computes PC + 4]
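The fetch unit for sequential code can be sketched in a few lines. This is only an illustration: the instruction memory is modelled as a Python dict keyed by byte address, and the names `fetch` and `program` are made up for the example.

```python
def fetch(instruction_memory: dict, pc: int) -> tuple:
    """Read the instruction at address pc and compute PC + 4."""
    instruction = instruction_memory[pc]   # Read address -> Instruction
    next_pc = (pc + 4) & 0xFFFFFFFF        # the Add unit; the PC is 32 bits wide
    return instruction, next_pc


# Two example instruction words (standard MIPS encodings):
#   0x012A4020 = add $t0, $t1, $t2
#   0x8D280004 = lw  $t0, 4($t1)
program = {0: 0x012A4020, 4: 0x8D280004}

pc = 0
instruction, pc = fetch(program, pc)
print(hex(instruction), pc)
```

Branches and jumps would replace the `pc + 4` value with "something else", which this sketch deliberately leaves out.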
The MIPS core subset
• R-type
  – add rd, rs, rt
  – sub, and, or, slt
  Format: op [31–26] (6 bits) | rs [25–21] (5 bits) | rt [20–16] (5 bits) | rd [15–11] (5 bits) | shamt [10–6] (5 bits) | funct [5–0] (6 bits)
  1. Read registers rs and rt
  2. Feed them to ALU
  3. Update register file
• LOAD and STORE
  – lw rt, rs, imm
  – sw rt, rs, imm
  Format: op [31–26] (6 bits) | rs [25–21] (5 bits) | rt [20–16] (5 bits) | immediate [15–0] (16 bits)
  1. Read register rs (and rt for store)
  2. Feed rs and immed to ALU
  3. Move data between mem and reg
• BRANCH:
  – beq rs, rt, imm
  Format: op [31–26] (6 bits) | rs [25–21] (5 bits) | rt [20–16] (5 bits) | displacement [15–0] (16 bits)
  1. Read registers rs and rt
  2. Feed to ALU to compare
  3. Add PC to disp; update PC
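The three instruction formats above share the same field positions, so one routine can pull every field out of a 32-bit instruction word. The sketch below assumes the bit ranges listed above; the function name `decode` and the dictionary layout are illustrative only.

```python
def decode(instruction: int) -> dict:
    """Split a 32-bit MIPS instruction word into its fields."""
    return {
        "op":    (instruction >> 26) & 0x3F,   # bits 31-26
        "rs":    (instruction >> 21) & 0x1F,   # bits 25-21
        "rt":    (instruction >> 16) & 0x1F,   # bits 20-16
        "rd":    (instruction >> 11) & 0x1F,   # bits 15-11 (R-type only)
        "shamt": (instruction >> 6) & 0x1F,    # bits 10-6  (R-type only)
        "funct": instruction & 0x3F,           # bits 5-0   (R-type only)
        "imm16": instruction & 0xFFFF,         # bits 15-0  (lw, sw, beq)
    }


# 0x012A4020 encodes add $t0, $t1, $t2 (rs = 9, rt = 10, rd = 8).
fields = decode(0x012A4020)
print(fields)
```

All fields are extracted unconditionally, much as the hardware wires every bit range out of the instruction; which ones are meaningful depends on the opcode.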
Processor Design
• Generic Implementation:
  – all instructions read some registers
  – all instructions use the ALU after reading registers
  – memory accessed & registers updated after ALU
• Suggests basic design:
[Figure: datapath overview with PC, instruction memory (Address, Instruction), registers (three Register # inputs, Data), ALU, and data memory (Address, Data)]
Datapath for Reg-Reg Operations
• R[rd] <- R[rs] op R[rt]    Example: add rd, rs, rt
  – Ra, Rb, and Rw come from rs, rt, and rd fields
  – ALU operation signal depends on op and funct
Format: op [31–26] (6 bits) | rs [25–21] (5 bits) | rt [20–16] (5 bits) | rd [15–11] (5 bits) | shamt [10–6] (5 bits) | funct [5–0] (6 bits)
[Figure: the Instruction supplies Read register 1, Read register 2, and Write register to the register file (controlled by RegWrite); Read data 1 and Read data 2 feed the ALU (3-bit ALU operation input, Zero output); the ALU result goes to Write data]
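The register transfer R[rd] <- R[rs] op R[rt] can be sketched as follows. This is a simplified model under stated assumptions: the ALU operation is passed as a name rather than as the 3-bit control code, registers hold unsigned 32-bit values, and the helper names are hypothetical.

```python
MASK32 = 0xFFFFFFFF


def to_signed32(value: int) -> int:
    """Interpret a 32-bit pattern as a two's complement number."""
    return value - (1 << 32) if value & 0x80000000 else value


def alu(operation: str, a: int, b: int) -> int:
    """The five operations needed by the reg-reg subset."""
    if operation == "add":
        return (a + b) & MASK32
    if operation == "sub":
        return (a - b) & MASK32
    if operation == "and":
        return a & b
    if operation == "or":
        return a | b
    if operation == "slt":
        return 1 if to_signed32(a) < to_signed32(b) else 0
    raise ValueError(f"unsupported ALU operation: {operation}")


def execute_r_type(regs: list, operation: str, rs: int, rt: int, rd: int) -> None:
    """R[rd] <- R[rs] op R[rt], with RegWrite asserted."""
    regs[rd] = alu(operation, regs[rs], regs[rt])


regs = [0] * 32
regs[9], regs[10] = 5, 7
execute_r_type(regs, "add", rs=9, rt=10, rd=8)
print(regs[8])
```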
Datapath for Load Operations
R[rt] <- Mem[R[rs] + SignExt[imm16]]    Example: lw rt, rs, imm16
Format: op [31–26] (6 bits) | rs [25–21] (5 bits) | rt [20–16] (5 bits) | immediate [15–0] (16 bits)
[Figure: load datapath. Components: register file (Read register 1, Read register 2, Write register, Write data, Read data 1, Read data 2), sign extend (16 to 32 bits), ALU (ALU result, Zero), and data memory (Address, Write data, Read data). Control signals: RegWrite, ALU operation (3 bits), MemRead, MemWrite.]
Datapath for Store Operations
Mem[R[rs] + SignExt[imm16]] <- R[rt]    Example: sw rt, rs, imm16
Format: op [31–26] (6 bits) | rs [25–21] (5 bits) | rt [20–16] (5 bits) | immediate [15–0] (16 bits)
[Figure: store datapath. Components: register file (Read register 1, Read register 2, Write register, Write data, Read data 1, Read data 2), sign extend (16 to 32 bits), ALU (ALU result, Zero), and data memory (Address, Write data, Read data). Control signals: RegWrite, ALU operation (3 bits), MemRead, MemWrite.]
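The load and store transfers differ only in the direction data moves once the address R[rs] + SignExt[imm16] has been computed. The sketch below assumes a data memory modelled as a dict of words keyed by byte address; all function names are illustrative.

```python
MASK32 = 0xFFFFFFFF


def sign_extend16(imm16: int) -> int:
    """Sign-extend a 16-bit immediate to a Python integer."""
    imm16 &= 0xFFFF
    return imm16 - 0x10000 if imm16 & 0x8000 else imm16


def effective_address(regs: list, rs: int, imm16: int) -> int:
    """The ALU adds the base register and the sign-extended immediate."""
    return (regs[rs] + sign_extend16(imm16)) & MASK32


def load_word(regs: list, data_memory: dict, rt: int, rs: int, imm16: int) -> None:
    """R[rt] <- Mem[R[rs] + SignExt[imm16]]  (MemRead and RegWrite asserted)."""
    regs[rt] = data_memory[effective_address(regs, rs, imm16)]


def store_word(regs: list, data_memory: dict, rt: int, rs: int, imm16: int) -> None:
    """Mem[R[rs] + SignExt[imm16]] <- R[rt]  (MemWrite asserted)."""
    data_memory[effective_address(regs, rs, imm16)] = regs[rt]


regs = [0] * 32
memory = {}
regs[9] = 100                                          # base address
regs[8] = 42                                           # value to store
store_word(regs, memory, rt=8, rs=9, imm16=0xFFFC)     # offset -4 -> address 96
load_word(regs, memory, rt=10, rs=9, imm16=0xFFFC)     # read the same word back
print(memory, regs[10])
```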
Combining datapaths
• How do we allow different datapaths for different instructions?
• Use a multiplexor!
[Figure: the R-type datapath and the store datapath shown side by side, then merged into a single datapath in which a multiplexor controlled by ALUSrc selects the second ALU input]
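A multiplexor simply forwards one of its inputs depending on a select signal. The sketch below shows the ALUSrc case, assuming the textbook numbering in which 0 selects Read data 2 and 1 selects the sign-extended immediate; the function names are made up for the example.

```python
def mux2(select: int, input0: int, input1: int) -> int:
    """Two-input multiplexor: select = 0 picks input0, select = 1 picks input1."""
    return input1 if select else input0


def second_alu_operand(alu_src: int, read_data_2: int, sign_extended_imm: int) -> int:
    """ALUSrc chooses between the register operand and the immediate.

    Assumed numbering: 0 -> Read data 2 (R-type), 1 -> immediate (lw, sw).
    """
    return mux2(alu_src, read_data_2, sign_extended_imm)


print(second_alu_operand(0, read_data_2=7, sign_extended_imm=16))   # R-type path
print(second_alu_operand(1, read_data_2=7, sign_extended_imm=16))   # load/store path
```

The same two-input element is reused wherever two instruction classes need different sources for the same input.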
Datapath for Branch Operations
beq rs, rt, imm16    We need to compare Rs and Rt
Format: op [31–26] (6 bits) | rs [25–21] (5 bits) | rt [20–16] (5 bits) | immediate [15–0] (16 bits)
[Figure: branch datapath. The register file supplies Read data 1 and Read data 2 to the ALU (3-bit ALU operation), whose Zero output goes to the branch control logic. The 16-bit immediate is sign-extended to 32 bits, shifted left 2, and added to PC + 4 (from the instruction datapath); the Sum is the branch target.]
Computing the Next Address
• PC is a 32-bit byte address into the instruction memory:
  – Sequential operation: PC<31:0> = PC<31:0> + 4
  – Branch: PC<31:0> = PC<31:0> + 4 + SignExt[Imm16] * 4
• We don’t need the 2 least-significant bits because:
  – The 32-bit PC is a byte address
  – And all our instructions are 4 bytes (32 bits) long
  – The 2 LSBs of the 32-bit PC are always zeros
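The two next-address cases above can be written as one small function. This is a sketch under assumptions: the comparison for beq is modelled as an ALU subtraction whose Zero output gates the branch, and the names `next_pc` and `is_beq` are illustrative.

```python
MASK32 = 0xFFFFFFFF


def sign_extend16(imm16: int) -> int:
    """Sign-extend a 16-bit immediate to a Python integer."""
    imm16 &= 0xFFFF
    return imm16 - 0x10000 if imm16 & 0x8000 else imm16


def next_pc(pc: int, imm16: int, rs_value: int, rt_value: int, is_beq: bool) -> int:
    """Sequential: PC + 4.  Taken branch: PC + 4 + SignExt[Imm16] * 4."""
    pc_plus_4 = (pc + 4) & MASK32
    zero = ((rs_value - rt_value) & MASK32) == 0   # ALU subtracts; Zero output
    if is_beq and zero:
        return (pc_plus_4 + sign_extend16(imm16) * 4) & MASK32
    return pc_plus_4


print(hex(next_pc(0x1000, 3, 5, 5, is_beq=True)))    # taken branch
print(hex(next_pc(0x1000, 3, 5, 6, is_beq=True)))    # not taken
```

Because every result is a multiple of 4 when the starting PC is, the two least-significant bits never change, which is the point made in the last bullet.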
All together: the single cycle datapath
[Figure: complete single cycle datapath. Components: PC, instruction memory (Read address, Instruction [31–0]), Add unit for PC + 4, Add unit with Shift left 2 for the branch target, register file (Read register 1 from Instruction [25–21], Read register 2 from Instruction [20–16], Write register, Write data, Read data 1, Read data 2), sign extend (Instruction [15–0], 16 to 32 bits), ALU (ALU result, Zero) with ALU control (Instruction [5–0]), data memory (Address, Write data, Read data), and four muxes. Control signals: RegDst, RegWrite, ALUSrc, ALUOp, MemRead, MemWrite, MemtoReg, PCSrc.]
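To see the pieces of the complete datapath working together, the following sketch executes one instruction per call, which mirrors the single cycle idea. It is a behavioural model only, not the lecture's design: memories are dicts keyed by byte address, the opcode and funct values are the standard MIPS encodings, and details outside the slides (such as register $zero being hard-wired to 0) are not modelled.

```python
MASK32 = 0xFFFFFFFF
OP_RTYPE, OP_LW, OP_SW, OP_BEQ = 0x00, 0x23, 0x2B, 0x04
FUNCT = {0x20: "add", 0x22: "sub", 0x24: "and", 0x25: "or", 0x2A: "slt"}


def sign_extend16(imm16):
    imm16 &= 0xFFFF
    return imm16 - 0x10000 if imm16 & 0x8000 else imm16


def to_signed32(value):
    return value - (1 << 32) if value & 0x80000000 else value


def alu(operation, a, b):
    if operation == "add":
        return (a + b) & MASK32
    if operation == "sub":
        return (a - b) & MASK32
    if operation == "and":
        return a & b
    if operation == "or":
        return a | b
    if operation == "slt":
        return 1 if to_signed32(a) < to_signed32(b) else 0
    raise ValueError(f"unsupported ALU operation: {operation}")


def step(pc, regs, imem, dmem):
    """Execute the instruction at pc and return the next PC."""
    inst = imem[pc]                                    # instruction fetch
    op = (inst >> 26) & 0x3F
    rs, rt, rd = (inst >> 21) & 0x1F, (inst >> 16) & 0x1F, (inst >> 11) & 0x1F
    imm = sign_extend16(inst & 0xFFFF)
    pc_plus_4 = (pc + 4) & MASK32
    if op == OP_RTYPE:                                 # R[rd] <- R[rs] op R[rt]
        regs[rd] = alu(FUNCT[inst & 0x3F], regs[rs], regs[rt])
    elif op == OP_LW:                                  # R[rt] <- Mem[R[rs] + imm]
        regs[rt] = dmem[alu("add", regs[rs], imm & MASK32)]
    elif op == OP_SW:                                  # Mem[R[rs] + imm] <- R[rt]
        dmem[alu("add", regs[rs], imm & MASK32)] = regs[rt]
    elif op == OP_BEQ:                                 # branch if R[rs] == R[rt]
        if alu("sub", regs[rs], regs[rt]) == 0:
            return (pc_plus_4 + imm * 4) & MASK32
    else:
        raise ValueError(f"opcode {op:#x} is outside the core subset")
    return pc_plus_4


# Example program (hand-assembled for this sketch):
imem = {
    0:  0x8C080000,   # lw  $t0, 0($zero)
    4:  0x01084820,   # add $t1, $t0, $t0
    8:  0xAC090004,   # sw  $t1, 4($zero)
    12: 0x1108FFFC,   # beq $t0, $t0, -4  (target = 16 - 16 = 0)
}
regs, dmem, pc = [0] * 32, {0: 21}, 0
for _ in range(4):
    pc = step(pc, regs, imem, dmem)
print(pc, regs[8], regs[9], dmem)
```

The `if`/`elif` chain plays the role that the control signals and muxes play in the hardware: it decides which paths through the shared components are used for each instruction class.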
The R-Format (e.g. add) Datapath
[Figure: the complete single cycle datapath, with the same components and control signals as on the “All together” slide]
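Need ALUSrc=0, ALUOp=“add”, MemWrite=0, MemtoReg=0, RegDst=1, RegWrite=1, and PCSrc=0 (with the textbook figure's mux numbering: ALUSrc=0 selects Read data 2, RegDst=1 selects Instruction [15–11], MemtoReg=0 selects the ALU result, and PCSrc=0 selects PC + 4).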
The Load Datapath
[Figure: the complete single cycle datapath, with the same components and control signals as on the “All together” slide]
What control signals do we need for load?
The Store Datapath
[Figure: the complete single cycle datapath, with the same components and control signals as on the “All together” slide]
The beq Datapath
[Figure: the complete single cycle datapath, with the same components and control signals as on the “All together” slide]
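The four per-instruction datapaths use the same hardware and differ only in control signal settings, which can be summarised as a lookup table. The values below assume the textbook figure's mux numbering (RegDst: 0 = Instruction [20–16], 1 = Instruction [15–11]; ALUSrc: 0 = Read data 2, 1 = sign-extended immediate; MemtoReg: 0 = ALU result, 1 = memory Read data; PCSrc: 0 = PC + 4, 1 = branch target). `Branch` is a name assumed here for the signal that is combined with the ALU's Zero output to form PCSrc, and `None` marks a don't-care.

```python
CONTROL = {
    "r-type": dict(RegDst=1,    ALUSrc=0, MemtoReg=0,    RegWrite=1,
                   MemRead=0, MemWrite=0, Branch=0, ALUOperation="from funct"),
    "lw":     dict(RegDst=0,    ALUSrc=1, MemtoReg=1,    RegWrite=1,
                   MemRead=1, MemWrite=0, Branch=0, ALUOperation="add"),
    "sw":     dict(RegDst=None, ALUSrc=1, MemtoReg=None, RegWrite=0,
                   MemRead=0, MemWrite=1, Branch=0, ALUOperation="add"),
    "beq":    dict(RegDst=None, ALUSrc=0, MemtoReg=None, RegWrite=0,
                   MemRead=0, MemWrite=0, Branch=1, ALUOperation="sub"),
}


def pc_src(kind: str, zero: bool) -> int:
    """PCSrc = 1 (take the branch target) only for a beq whose operands are equal."""
    return 1 if CONTROL[kind]["Branch"] == 1 and zero else 0


for kind, signals in CONTROL.items():
    print(f"{kind:7s}", signals)
```

If the figure in use numbers its mux inputs differently, the 0/1 values for RegDst, ALUSrc, MemtoReg, and PCSrc flip accordingly; what each instruction must select stays the same.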
Key Points
• CPU is just a collection of state and combinational logic
• We just designed a very rich processor, at least in terms of functionality
• Execution time = Insts * CPI * Cycle Time
  – where does the single-cycle machine fit in?
Computer of the Day
• The IBM 1620 (1959)
  – A 2nd generation computer: transistors & core storage (first generation ones used tubes and delay-based memory)
  – Example of creative architecture
  – ~2000 built. Relatively inexpensive (< $1620/month rental)
• A decimal computer
  – 6 bits per digit or character
  – 4 bits, flag (for +/- and end-of-word), ECC
  – Variable-length data: fields terminated by flag
• Arithmetic by table lookup!
• Codenamed CADET
  – “Can’t Add, Doesn’t Even Try”