
The MIPS Instruction Set - egr.msu.edu


The University of Adelaide, School of Computer Science 17 January 2011

Chapter 2 — Instructions: Language of the Computer 2

CSE 420 Chapter 2 — Instructions: Language of the Computer — 3

The MIPS Instruction Set
• Used as the example throughout the book
• Large share of embedded core market, but dwarfed by ARM
• Typical of many modern ISAs
• See MIPS Reference Data tear-out card, and Appendixes B and E


Arithmetic Operations
• Add and subtract have three operands
    • Two sources and one destination
        add a, b, c    # a gets b + c
• All MIPS arithmetic ops have this form
• Design Principle 1: Simplicity favors regularity.
    • Regularity makes implementation simpler
    • Regularity enables higher performance at lower cost

§2.2 Operations of the Computer Hardware


Register Operands
• Arithmetic instructions use register operands
• MIPS has a 32 × 32-bit register file
    • Used for frequently accessed data
    • Numbered 0 to 31
• Assembler names
    • $t0, $t1, …, $t9 for temporary values
    • $s0, $s1, …, $s7 for saved variables
• Design Principle 2: Smaller is faster.

§2.3 Operands of the Computer Hardware


Register Operand Example
• C code:
    f = (g + h) - (i + j);
• Register assignment?
    • f in $s0, g in $s1, h in $s2, i in $s3, j in $s4
    • Temps?
• Compiled MIPS code:
    add $t0, $s1, $s2    # $t0 = g + h
    add $t1, $s3, $s4    # $t1 = i + j
    sub $s0, $t0, $t1    # f = $t0 - $t1


Load-Store Architecture
• Data is in memory
    • Load values from memory into registers
    • Store result from register to memory
• Details:
    • Memory is byte addressed
        • Each address identifies an 8-bit byte
    • Words are aligned in memory
        • Word addresses must be a multiple of 4
    • MIPS is Big Endian
        • Most-significant byte at the lowest address of a word
        • cf. Little Endian: least-significant byte at the lowest address
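The two byte orders can be compared with a short Python sketch. The word value 0x12345678, the base address 0, and the helper name byte_layout are illustrative assumptions, not taken from the slides.

```python
def byte_layout(word, byteorder, base=0):
    """Map each byte address to the byte stored there for one 32-bit word.

    `base` is a hypothetical word-aligned address (a multiple of 4).
    """
    if base % 4 != 0:
        raise ValueError("word addresses must be a multiple of 4")
    data = word.to_bytes(4, byteorder=byteorder)
    return {base + i: byte for i, byte in enumerate(data)}


big = byte_layout(0x12345678, "big")        # most-significant byte at the lowest address
little = byte_layout(0x12345678, "little")  # least-significant byte at the lowest address

for addr in range(4):
    print(f"address {addr}: big={big[addr]:#04x} little={little[addr]:#04x}")
```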


Memory Operand Example 1
• C code:
    g = h + A[8];
    • g in $s1, h in $s2, base address of A in $s3
• Compiled MIPS code:
    • Index 8 requires offset of 32. Why?
        lw  $t0, 32($s3)    # load word (32 is the offset, $s3 the base register)
        add $s1, $s2, $t0


Memory Operand Example 2
• C code:
    A[12] = h + A[8];
    • h in $s2, base address of A in $s3
• Compiled MIPS code:
    • Index 8 requires offset of 32
    • Index 12 requires offset of ?
        lw  $t0, 32($s3)    # load word
        add $t0, $s2, $t0
        sw  $t0, 48($s3)    # store word
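The offsets in the two memory operand examples follow from byte addressing: each word occupies 4 bytes, so an array index is scaled by 4. A minimal Python sketch (the name word_offset is a hypothetical helper):

```python
WORD_BYTES = 4  # one MIPS word is 32 bits, i.e. 4 bytes


def word_offset(index):
    """Byte offset of element `index` from the base address of a word array."""
    return index * WORD_BYTES


print(word_offset(8))   # offset used by the lw for A[8]
print(word_offset(12))  # offset used by the sw for A[12]
```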


Registers vs. Memory
• Registers are faster than memory
• Load-store ⇒ "more" instructions to be executed
• Register optimization is important!


Immediate Operands
• Constant data specified in an instruction
    addi $s3, $s3, 4
• There is no "subtract immediate" instruction
    • Subtract is "add a negative"
        addi $s2, $s1, -4
• Design Principle 3: Make the common case fast.
    • Small constants are common
    • Immediate operand avoids a load instruction
    • Why?


The Constant Zero
• MIPS register 0 ($zero) is the constant 0
    • Cannot be overwritten
• Useful for common operations
    • e.g., move between registers
        add $t2, $s1, $zero
• Why a fixed register zero?


Sign Extension
• How do you represent a number using more bits?
    • e.g. move an 8-bit number into a 16-bit word
• Replicate the sign bit to the left
    • cf. unsigned values: extend with 0s
• Examples: 8-bit to 16-bit
    • +2: 0000 0010 => 0000 0000 0000 0010
    • -2: 1111 1110 => 1111 1111 1111 1110
• In MIPS instruction set
    • addi: extend immediate value
    • lb, lh: extend loaded byte/halfword
    • beq, bne: extend the displacement
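Sign extension can be sketched in a few lines of Python operating on raw bit patterns. The helper names sign_extend and zero_extend are illustrative, not part of the slides; the two printed cases reproduce the 8-bit to 16-bit examples.

```python
def sign_extend(value, from_bits, to_bits):
    """Widen a two's-complement bit pattern by replicating its sign bit."""
    low_mask = (1 << from_bits) - 1
    value &= low_mask
    if value >> (from_bits - 1):  # sign bit set: fill the new upper bits with 1s
        value |= ((1 << to_bits) - 1) ^ low_mask
    return value


def zero_extend(value, from_bits):
    """Unsigned widening: the new upper bits are simply 0."""
    return value & ((1 << from_bits) - 1)


print(f"{sign_extend(0b00000010, 8, 16):016b}")  # +2
print(f"{sign_extend(0b11111110, 8, 16):016b}")  # -2
```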


Representing Instructions
• Instructions are encoded in binary
    • Called "machine code"
• MIPS instructions
    • Encoded as 32-bit instruction words (Regularity!)
    • Small number of formats encode opcode, register numbers, …
• Register numbers
    • $t0 – $t7 are registers 8 – 15
    • $s0 – $s7 are registers 16 – 23
    • $t8 – $t9 are registers 24 – 25
    • (0 – 7 and 26 – 31 described later, with procedure calls)
§2.5 Representing Instructions in the Computer


MIPS R-format Instructions

    op      rs      rt      rd      shamt   funct
    6 bits  5 bits  5 bits  5 bits  5 bits  6 bits

• Instruction fields
    • op: operation code (opcode)
    • rs: first source register number
    • rt: second source register number
    • rd: destination register number
    • shamt: shift amount (00000 for now)
    • funct: function code (extends opcode) (Why?)


R-format Example

    add $t0, $s1, $s2

    op       rs      rt      rd      shamt   funct
    6 bits   5 bits  5 bits  5 bits  5 bits  6 bits
    special  $s1     $s2     $t0     0       add
    0        17      18      8       0       32
    000000   10001   10010   01000   00000   100000

    00000010001100100100000000100000₂ = 02324020₁₆
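The packing of the R-format fields can be checked with a small Python sketch. The function name encode_r is a hypothetical helper; the field values are the ones from the example for add $t0, $s1, $s2.

```python
def encode_r(op, rs, rt, rd, shamt, funct):
    """Pack the six R-format fields (6, 5, 5, 5, 5, 6 bits) into one 32-bit word."""
    return (op << 26) | (rs << 21) | (rt << 16) | (rd << 11) | (shamt << 6) | funct


# add $t0, $s1, $s2: op=0 (special), rs=$s1=17, rt=$s2=18, rd=$t0=8, funct=32
word = encode_r(0, 17, 18, 8, 0, 32)
print(f"{word:032b}")
print(f"{word:08x}")
```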


Hexadecimal
• Base 16
    • Compact representation of bit strings
    • 4 bits per hex digit

    0  0000    4  0100    8  1000    c  1100
    1  0001    5  0101    9  1001    d  1101
    2  0010    6  0110    a  1010    e  1110
    3  0011    7  0111    b  1011    f  1111

• Example: eca8 6420
    • 1110 1100 1010 1000 0110 0100 0010 0000
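The digit-by-digit conversion can be reproduced with a short Python sketch (hex_to_bits is an illustrative helper name):

```python
def hex_to_bits(hex_string):
    """Expand each hex digit into its 4-bit group, separated by spaces."""
    digits = hex_string.replace(" ", "")
    return " ".join(f"{int(digit, 16):04b}" for digit in digits)


print(hex_to_bits("eca8 6420"))
```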


MIPS I-format Instructions

    op      rs      rt      constant or address
    6 bits  5 bits  5 bits  16 bits

• Immediate arithmetic and load/store instructions
    • rt: destination or source register number
    • Constant: -2^15 to +2^15 - 1
    • Address: offset added to base address in rs
• Design Principle 4: Good design demands good compromises.
    • Different formats complicate decoding, but allow 32-bit instruction uniformity
    • Keep formats as similar as possible
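As a sketch of the I-format layout, the following Python helper packs the four fields and enforces the 16-bit signed range. The name encode_i is hypothetical; opcode 35 for lw appears later in these slides, and opcode 8 for addi is taken from the standard MIPS opcode table rather than from the slides.

```python
def encode_i(op, rs, rt, imm):
    """Pack the I-format fields (6, 5, 5, 16 bits) into one 32-bit word."""
    if not -(1 << 15) <= imm < (1 << 15):
        raise ValueError("constant must be in the range -2^15 to +2^15 - 1")
    return (op << 26) | (rs << 21) | (rt << 16) | (imm & 0xFFFF)


lw_word = encode_i(35, 19, 8, 32)    # lw $t0, 32($s3): rs=$s3=19, rt=$t0=8
addi_word = encode_i(8, 17, 18, -4)  # addi $s2, $s1, -4: rs=$s1=17, rt=$s2=18
print(f"{lw_word:08x} {addi_word:08x}")
```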


The BIG Picture
Stored Program Computers
• Instructions represented in binary, just like data
• Instructions and data stored in memory
• Programs can operate on programs
    • e.g., compilers, linkers, …
• Binary compatibility allows compiled programs to work on other computers with the same ISA.


Logical Operations
• Instructions for bitwise manipulation

    Operation     C     Java    MIPS
    Shift left    <<    <<      sll
    Shift right   >>    >>>     srl
    Bitwise AND   &     &       and, andi
    Bitwise OR    |     |       or, ori
    Bitwise NOT   ~     ~       nor

• Useful for extracting and inserting groups of bits in a word
§2.6 Logical Operations
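Extracting and inserting bit fields combines a shift with a mask. The Python sketch below applies this to the instruction word 0x02324020 (add $t0, $s1, $s2); the helper names are illustrative, and funct code 34 (100010) is the subtract code from the ALU control table in these slides.

```python
def extract_field(word, low_bit, width):
    """Shift right (as srl would), then AND with a mask (as andi would)."""
    return (word >> low_bit) & ((1 << width) - 1)


def insert_field(word, low_bit, width, value):
    """Clear the field with AND, then OR in the new value shifted into place."""
    mask = ((1 << width) - 1) << low_bit
    return (word & ~mask & 0xFFFFFFFF) | ((value << low_bit) & mask)


word = 0x02324020                         # add $t0, $s1, $s2
rs = extract_field(word, 21, 5)           # bits 25:21
funct = extract_field(word, 0, 6)         # bits 5:0
sub_word = insert_field(word, 0, 6, 34)   # swap funct from add to subtract
print(rs, funct, f"{sub_word:08x}")
```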


Branch Addressing
• Branch instructions specify
    • Opcode, two registers, target address
• Most branch targets are near branch
    • Forward or backward

    op      rs      rt      constant or address
    6 bits  5 bits  5 bits  16 bits

• PC-relative addressing
    • Target address = PC + offset × 4
    • PC already incremented by 4 by this time
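PC-relative addressing can be sketched as follows. This assumes pc is the address of the branch instruction itself, so the increment by 4 is applied explicitly; the function name and the example addresses are hypothetical.

```python
def branch_target(pc, offset16):
    """Target of a branch at address `pc` with the raw 16-bit offset field."""
    offset = offset16 & 0xFFFF
    if offset & 0x8000:            # sign-extend the 16-bit displacement
        offset -= 0x10000
    return (pc + 4 + offset * 4) & 0xFFFFFFFF


print(hex(branch_target(0x00400000, 3)))       # forward branch
print(hex(branch_target(0x00400000, 0xFFFF)))  # offset -1: back to the branch itself
```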


Jump Addressing
• Jump (j and jal) targets could be anywhere in text segment
    • Encode full address in instruction

    op      address
    6 bits  26 bits

• (Pseudo)Direct jump addressing
    • Target address = PC[31:28] : (address × 4)
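The concatenation in pseudo-direct addressing can be written out in Python. The sketch assumes the upper four bits come from the already-incremented PC (PC + 4), and uses made-up example addresses.

```python
def jump_target(pc, address26):
    """Combine the top 4 bits of PC + 4 with the 26-bit field shifted left by 2."""
    next_pc = (pc + 4) & 0xFFFFFFFF
    return (next_pc & 0xF0000000) | ((address26 & 0x03FFFFFF) << 2)


print(hex(jump_target(0x00400018, 0x0100000)))
```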


Morgan Kaufmann Publishers 26 January 2011

Chapter 4 — The Processor 1

Chapter 4

The Processor

CSE 420 Chapter 4 — The Processor — 2

Introduction
• CPU performance factors
    • Instruction count
        • Determined by ISA and compiler
    • CPI and Cycle time
        • Determined by CPU hardware
• We will examine two MIPS implementations
    • A simplified version
    • A more realistic pipelined version
• Simple subset, shows most aspects
    • Memory reference: lw, sw
    • Arithmetic/logical: add, sub, and, or, slt
    • Control transfer: beq, j
§4.1 Introduction


Instruction Execution
• PC → instruction memory, fetch instruction
• Register numbers → register file, read registers
• Depending on instruction class
    • Use ALU to calculate
        • Arithmetic result
        • Memory address for load/store
        • Branch target address
    • Access data memory for load/store
    • PC ← target address or PC + 4


CPU Overview


Multiplexers
• Can't just join wires together
    • Use multiplexers


Control


Clocking Methodology
• Combinational logic transforms data during clock cycles
    • Between clock edges
    • Input from state elements, output to state element
    • Longest delay determines clock period


Building a Datapath
• Datapath
    • Elements that process data and addresses in the CPU
        • Registers, ALUs, mux's, memories, …
• We will build a MIPS datapath incrementally
    • Refining the overview design
§4.3 Building a Datapath


Instruction Fetch
[Figure: instruction fetch datapath. Annotations: 32-bit register (the PC); increment by 4 for next instruction.]


R-Format Instructions
• Read two register operands
• Perform arithmetic/logical operation
• Write register result


Load/Store Instructions
• Read register operands
• Calculate address using 16-bit offset
    • Use ALU, but sign-extend offset
• Load: Read memory and update register
• Store: Write register value to memory


Branch Instructions
• Read register operands
• Compare operands
    • Use ALU, subtract and check Zero output
• Calculate target address
    • Sign-extend displacement
    • Shift left 2 places (word displacement)
    • Add to PC + 4
        • Already calculated by instruction fetch


Branch Instructions
[Figure: branch datapath. Annotations: shift left 2 just re-routes wires; sign extension replicates the sign-bit wire.]


Composing the Elements
• First-cut data path does an instruction in one clock cycle
    • Each data-path element can only do one function at a time
    • Hence, we need separate instruction and data memories
• Use multiplexers where alternate data sources are used for different instructions


R-Type/Load/Store Datapath

Full Data Path


ALU Control
• ALU used for
    • Load/Store: F = add
    • Branch: F = subtract
    • R-type: F depends on funct field

    ALU control   Function
    0000          AND
    0001          OR
    0010          add
    0110          subtract
    0111          set-on-less-than
    1100          NOR

§4.4 A Simple Implementation Scheme


ALU Control
• Assume 2-bit ALUOp derived from opcode
    • Combinational logic derives ALU control

    opcode   ALUOp   Operation          funct    ALU function       ALU control
    lw       00      load word          XXXXXX   add                0010
    sw       00      store word         XXXXXX   add                0010
    beq      01      branch equal       XXXXXX   subtract           0110
    R-type   10      add                100000   add                0010
                     subtract           100010   subtract           0110
                     AND                100100   AND                0000
                     OR                 100101   OR                 0001
                     set-on-less-than   101010   set-on-less-than   0111
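The table above amounts to a small lookup. A minimal Python sketch of that combinational logic, using hypothetical names and modelling only the rows shown:

```python
# funct field -> 4-bit ALU control, for the R-type rows of the table
FUNCT_TO_ALU_CONTROL = {
    0b100000: 0b0010,  # add
    0b100010: 0b0110,  # subtract
    0b100100: 0b0000,  # AND
    0b100101: 0b0001,  # OR
    0b101010: 0b0111,  # set-on-less-than
}


def alu_control(alu_op, funct=0):
    """Derive the ALU control bits from the 2-bit ALUOp and the funct field."""
    if alu_op == 0b00:                 # lw / sw: add to form the address
        return 0b0010
    if alu_op == 0b01:                 # beq: subtract and check Zero
        return 0b0110
    if alu_op == 0b10:                 # R-type: funct selects the operation
        return FUNCT_TO_ALU_CONTROL[funct]
    raise ValueError("ALUOp value not covered by the table")


print(f"{alu_control(0b10, 0b101010):04b}")
```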


The Main Control Unit
• Control signals derived from instruction

    R-type       0          rs      rt      rd      shamt   funct
                 31:26      25:21   20:16   15:11   10:6    5:0

    Load/Store   35 or 43   rs      rt      address
                 31:26      25:21   20:16   15:0

    Branch       4          rs      rt      address
                 31:26      25:21   20:16   15:0

    • Bits 31:26: opcode
    • rs (25:21): always read
    • rt (20:16): read, except for load
    • rd (15:11) for R-type, rt (20:16) for load: write
    • address (15:0): sign-extend and add

Data Path With Control


Pipelining Analogy
• Pipelined laundry: overlapping execution
    • Parallelism improves performance
    • Four stages: Wash - Dry - Fold - Hang
• Four loads:
    • Speedup = 8/3.5 = 2.3
• Non-stop:
    • Speedup = 2n/(0.5n + 1.5) ≈ 4 = number of stages
§4.5 An Overview of Pipelining
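The two speedup figures can be reproduced numerically. The sketch below assumes each of the four stages takes half an hour, which is inferred from the 2n and 0.5n + 1.5 terms rather than stated on the slide; laundry_speedup is a hypothetical name.

```python
def laundry_speedup(n, stages=4, stage_hours=0.5):
    """Speedup of pipelined over sequential laundry for n loads."""
    sequential = n * stages * stage_hours                    # 2n hours
    pipelined = (stages - 1) * stage_hours + n * stage_hours # 1.5 + 0.5n hours
    return sequential / pipelined


print(round(laundry_speedup(4), 1))       # four loads
print(round(laundry_speedup(10_000), 3))  # approaches the number of stages
```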


MIPS Pipeline
• Five stages, one step per stage
    1. IF: Instruction fetch from memory
    2. ID: Instruction decode & register read
    3. EX: Execute operation or calculate address
    4. MEM: Access memory operand
    5. WB: Write result back to register


Hazards
• Situations that prevent starting the next instruction in the next cycle
• Structural hazard
    • A required resource is busy
• Data hazard
    • Need to wait for previous instruction to complete its data read/write
• Control hazard
    • Deciding on control action depends on previous instruction


Structure Hazards
• Conflict for use of a resource
• In MIPS pipeline with a single memory
    • Load/store requires data access
    • Instruction fetch would have to stall for that cycle
        • Would cause a pipeline "bubble"
• Hence, pipelined data paths require separate instruction & data memories
    • Or separate instruction & data caches


MIPS Pipelined Datapath
[Figure: pipelined datapath. Annotations: WB, MEM; right-to-left flow leads to hazards.]
§4.6 Pipelined Datapath and Control


Pipeline registers
• Need registers between stages
    • To hold information produced in previous cycle


Pipeline Operation
• Cycle-by-cycle flow of instructions through the pipelined data path
    • "Single-clock-cycle" pipeline diagram
        • Shows pipeline usage in a single cycle
        • Highlight resources used
    • cf. "multi-clock-cycle" diagram
        • Graph of operation over time
• We'll look at "single-clock-cycle" diagrams for load & store


IF for Load, Store, …


ID for Load, Store, …


EX for Load


MEM for Load


WB for Load

Wrong register number


Corrected Datapath for Load


EX for Store


Pipelined Control
• Control signals derived from instruction
    • As in single-cycle implementation


Pipelined Control


Datapath with Forwarding


Load-Use Data Hazard

Need to stall for one cycle
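A load-use hazard arises when the instruction right after a load reads the register being loaded. The slide's figure is not reproduced here, so the following Python sketch uses the commonly taught detection condition (a load in EX whose destination rt matches a source register of the instruction in ID). The signal and function names are assumptions.

```python
def load_use_stall(id_ex_mem_read, id_ex_rt, if_id_rs, if_id_rt):
    """Return True when the pipeline should stall for one cycle.

    id_ex_mem_read: the instruction in EX is a load
    id_ex_rt:       destination register number of that load
    if_id_rs/rt:    source register numbers of the instruction in ID
    """
    return bool(id_ex_mem_read and id_ex_rt in (if_id_rs, if_id_rt))


# lw $t0, 32($s3) followed by add $s1, $s2, $t0 ($t0 = 8, $s2 = 18)
print(load_use_stall(True, 8, 18, 8))
```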