ECE 15B Computer Organization, Spring 2010
Dmitri Strukov
Lecture 4: Arithmetic / Data Transfer Instructions
Partially adapted from Computer Organization and Design, 4th edition, Patterson and Hennessy, and classes taught by Ryan Kastner at UCSB
ECE 15B Spring 2010
Agenda
• Review of last lecture
• Load/store operations
• Multiply and divide instructions
Last Lecture
Assembly Language
• Basic job of a CPU: execute lots of instructions
• Instructions are the primitive operations that the CPU may execute
• Different CPUs implement different sets of instructions
• An Instruction Set Architecture (ISA) is the set of instructions a particular CPU implements
• Examples: Intel 80x86 (Pentium 4), IBM/Motorola PowerPC (Macintosh), MIPS, Intel IA64, ARM
Assembly Variables: Registers
• Unlike HLLs such as C or Java, assembly cannot use variables
– Why not? Keep hardware simple
• Assembly operands are registers
– Limited number of special locations built directly into the hardware
– Operations can only be performed on these
– Benefit: since the register file is small, it is very fast
Assembly Variables: Registers
• By convention, each register also has a name to make it easier to code
• For now:
– $16 - $23 are named $s0 - $s7 (correspond to C variables)
– $8 - $15 are named $t0 - $t7 (correspond to temporary variables)
– The other 16 register names will be explained later
• In general, use names to make your code more readable
MIPS Syntax
• Instruction syntax:
[Label:] Op-code [oper. 1], [oper. 2], [oper. 3] [#comment]
(0)      (1)     (2)        (3)        (4)       (5)
– Where
0) label field is optional, will discuss later
1) operation name
2, 3, 4) operands
5) comments
– For arithmetic and logic instructions
2) operand getting result ("destination")
3) 1st operand for operation ("source 1")
4) 2nd operand for operation ("source 2")
• Syntax is rigid
– 1 operator, 3 operands
– Why? Keep hardware simple via regularity
Addition and Subtraction of Integers
• Addition in assembly
– Example: add $s0, $s1, $s2 (in MIPS)
– Equivalent to: a = b + c (in C)
– Where MIPS registers $s0, $s1, $s2 are associated with C variables a, b, c
• Subtraction in assembly
– Example: sub $s3, $s4, $s5 (in MIPS)
– Equivalent to: d = e - f (in C)
– Where MIPS registers $s3, $s4, $s5 are associated with C variables d, e, f
Addition and Subtraction of Integers
• How do we do this? f = (g + h) - (i + j)
– Use intermediate temporary registers
add $t0, $s1, $s2 # temp0 = g + h
add $t1, $s3, $s4 # temp1 = i + j
sub $s0, $t0, $t1 # f = (g + h) - (i + j)
Immediates
• Immediates are numerical constants
• They appear often in code, so there are special instructions for them
• Add immediate:
addi $s0, $s1, 10 # f = g + 10 (in C)
– Where MIPS registers $s0 and $s1 are associated with C variables f and g
– Syntax similar to the add instruction, except that the last argument is a number instead of a register
Load and Store Instructions
CPU Overview
… with muxes
• Can't just join wires together: use multiplexers
[Figure: CPU overview with muxes]
Memory Operands
• Main memory used for composite data
– Arrays, structures, dynamic data
• To apply arithmetic operations
– Load values from memory into registers
– Store result from register to memory
• Memory is byte addressed
– Each address identifies an 8-bit byte
• Words are aligned in memory
– Address must be a multiple of 4
• MIPS is Big Endian
– Most-significant byte at the lowest address of a word
– c.f. Little Endian: least-significant byte at the lowest address
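The byte-ordering and alignment rules on this slide can be sketched in C. This is only an illustration, not part of the original lecture: the function names and the `mem` byte array standing in for main memory are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assemble a 32-bit word from four consecutive bytes, big-endian:
   the most-significant byte sits at the lowest address. */
uint32_t load_word_big_endian(const uint8_t *mem, uint32_t addr)
{
    assert(addr % 4 == 0);  /* word addresses must be a multiple of 4 */
    return ((uint32_t)mem[addr]     << 24) |
           ((uint32_t)mem[addr + 1] << 16) |
           ((uint32_t)mem[addr + 2] <<  8) |
            (uint32_t)mem[addr + 3];
}

/* The same four bytes read little-endian, for comparison:
   the least-significant byte sits at the lowest address. */
uint32_t load_word_little_endian(const uint8_t *mem, uint32_t addr)
{
    assert(addr % 4 == 0);
    return  (uint32_t)mem[addr]            |
           ((uint32_t)mem[addr + 1] <<  8) |
           ((uint32_t)mem[addr + 2] << 16) |
           ((uint32_t)mem[addr + 3] << 24);
}
```

With the bytes 0x12, 0x34, 0x56, 0x78 stored at addresses 0 to 3, the big-endian read yields 0x12345678 and the little-endian read yields 0x78563412.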
Data Transfer: Memory to Register
• MIPS load instruction syntax:
lw register#, offset(register#)
(1) (2)       (3)    (4)
– Where
1) operation name
2) register that will receive value
3) numerical offset in bytes
4) register containing pointer to memory
• lw means Load Word: 32 bits (one word) are loaded at a time
Data Transfer: Register to Memory
• MIPS store instruction syntax:
sw register#, offset(register#)
(1) (2)       (3)    (4)
– Where
1) operation name
2) register whose value will be written to memory
3) numerical offset in bytes
4) register containing pointer to memory
• sw means Store Word: 32 bits (one word) are stored at a time
Memory Operand Example 1
• C code: g = h + A[8];
– g in $s1, h in $s2, base address of A in $s3
• Compiled MIPS code:
– Index 8 requires offset of 32 (4 bytes per word)
lw $t0, 32($s3) # load word: offset 32, base register $s3
add $s1, $s2, $t0
Memory Operand Example 2
• C code: A[12] = h + A[8];
– h in $s2, base address of A in $s3
• Compiled MIPS code:
– Index 8 requires offset of 32; index 12 requires offset of 48
lw $t0, 32($s3) # load word
add $t0, $s2, $t0
sw $t0, 48($s3) # store word
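To make the index-to-offset arithmetic concrete, here is a rough C sketch that mimics the three compiled instructions. The helpers `lw`, `sw` and `store_sum` are hypothetical names chosen to echo the MIPS mnemonics. They copy words in the host machine's byte order, so the sketch models byte offsets only, not MIPS endianness.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: read one 32-bit word at base + byte offset. */
int32_t lw(const uint8_t *base, int32_t offset)
{
    int32_t value;
    assert(offset % 4 == 0);            /* word-aligned offset */
    memcpy(&value, base + offset, sizeof value);
    return value;
}

/* Hypothetical helper: write one 32-bit word at base + byte offset. */
void sw(uint8_t *base, int32_t offset, int32_t value)
{
    assert(offset % 4 == 0);
    memcpy(base + offset, &value, sizeof value);
}

/* A[12] = h + A[8], written one step per MIPS instruction. */
void store_sum(int32_t *A, int32_t h)
{
    uint8_t *s3 = (uint8_t *)A;         /* $s3: base address of A   */
    int32_t t0 = lw(s3, 8 * 4);         /* lw  $t0, 32($s3)         */
    t0 = h + t0;                        /* add $t0, $s2, $t0        */
    sw(s3, 12 * 4, t0);                 /* sw  $t0, 48($s3)         */
}
```

The offsets 32 and 48 are simply the array indices 8 and 12 multiplied by the 4-byte word size.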
Registers vs. Memory
• Registers are faster to access than memory
• Operating on memory data requires loads and stores
– More instructions to be executed
• Compiler must use registers for variables as much as possible
– Only spill to memory for less frequently used variables
– Register optimization is important!
Byte/Halfword Operations
• MIPS byte/halfword load/store
– String processing is a common case
• lb rt, offset(rs) / lh rt, offset(rs)
– Sign extend to 32 bits in rt
• lbu rt, offset(rs) / lhu rt, offset(rs)
– Zero extend to 32 bits in rt
• sb rt, offset(rs) / sh rt, offset(rs)
– Store just rightmost byte/halfword
• Why do we need them? Characters and multimedia data are expressed in fewer than 32 bits; dedicated 8-bit and 16-bit load and store instructions make operating on them faster
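The difference between the sign-extending and zero-extending byte loads can be sketched in C. The function names below are hypothetical. They model only what happens to the byte once it has been fetched, not the memory access itself.

```c
#include <assert.h>
#include <stdint.h>

/* lb: the loaded byte is sign-extended to 32 bits. */
int32_t lb_extend(uint8_t byte)
{
    /* If bit 7 is set the byte represents a negative value. */
    int32_t rt = (byte & 0x80u) ? (int32_t)byte - 256 : (int32_t)byte;
    assert(rt >= -128 && rt <= 127);
    return rt;
}

/* lbu: the loaded byte is zero-extended to 32 bits. */
int32_t lbu_extend(uint8_t byte)
{
    int32_t rt = (int32_t)byte;
    assert(rt >= 0 && rt <= 255);
    return rt;
}

/* sb: only the rightmost (least-significant) byte of rt is stored. */
uint8_t sb_truncate(int32_t rt)
{
    return (uint8_t)((uint32_t)rt & 0xFFu);
}
```

For the byte 0xFE, `lb_extend` gives -2 while `lbu_extend` gives 254. That is why character data is usually loaded with lbu and small signed values with lb.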
Two's Complement Representation, Multiply and Divide
Unsigned Binary Integers
• Given an n-bit number
x = x_{n-1} 2^{n-1} + x_{n-2} 2^{n-2} + … + x_1 2^1 + x_0 2^0
• Range: 0 to +2^n – 1
• Example
0000 0000 0000 0000 0000 0000 0000 1011₂
= 0 + … + 1×2^3 + 0×2^2 + 1×2^1 + 1×2^0
= 0 + … + 8 + 0 + 2 + 1 = 11₁₀
• Using 32 bits: 0 to +4,294,967,295
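The positional sum for unsigned integers can be checked with a short C sketch. The function name `unsigned_value` is made up for illustration; it adds x_i × 2^i for each of the n low-order bits.

```c
#include <assert.h>
#include <stdint.h>

/* Value of the n low-order bits of `bits`, read as an unsigned integer. */
uint64_t unsigned_value(uint32_t bits, int n)
{
    assert(n >= 1 && n <= 32);
    uint64_t value = 0;
    for (int i = 0; i < n; i++) {
        uint64_t x_i = (bits >> i) & 1u;  /* bit i of the number */
        value += x_i << i;                /* x_i * 2^i           */
    }
    return value;
}
```

Calling `unsigned_value(0xB, 32)` reproduces the slide's example: 8 + 0 + 2 + 1 = 11.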
2s-Complement Signed Integers
• Given an n-bit number
x = –x_{n-1} 2^{n-1} + x_{n-2} 2^{n-2} + … + x_1 2^1 + x_0 2^0
• Range: –2^{n-1} to +2^{n-1} – 1
• Example
1111 1111 1111 1111 1111 1111 1111 1100₂
= –1×2^31 + 1×2^30 + … + 1×2^2 + 0×2^1 + 0×2^0
= –2,147,483,648 + 2,147,483,644 = –4₁₀
• Using 32 bits: –2,147,483,648 to +2,147,483,647
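The only change from the unsigned case is that the top bit carries a negative weight. A minimal C sketch, with the hypothetical name `twos_complement_value`, makes this explicit.

```c
#include <assert.h>
#include <stdint.h>

/* Value of the n low-order bits of `bits`, read as 2s-complement. */
int64_t twos_complement_value(uint32_t bits, int n)
{
    assert(n >= 1 && n <= 32);
    int64_t value = 0;
    for (int i = 0; i < n - 1; i++) {
        int64_t x_i = (bits >> i) & 1u;
        value += x_i << i;                    /* +x_i * 2^i         */
    }
    int64_t sign = (bits >> (n - 1)) & 1u;
    value -= sign << (n - 1);                 /* -x_{n-1} * 2^(n-1) */
    return value;
}
```

For the pattern 0xFFFFFFFC with n = 32 this evaluates to –2,147,483,648 + 2,147,483,644 = –4, matching the slide.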
2s-Complement Signed Integers
• Bit 31 is sign bit
– 1 for negative numbers
– 0 for non-negative numbers
• –(–2^{n-1}) can't be represented
• Non-negative numbers have the same unsigned and 2s-complement representation
• Some specific numbers
– 0: 0000 0000 … 0000
– –1: 1111 1111 … 1111
– Most-negative: 1000 0000 … 0000
– Most-positive: 0111 1111 … 1111
Signed Negation
• Complement and add 1
– Complement means 1 → 0, 0 → 1
x + \bar{x} = 1111…111₂ = –1
\bar{x} + 1 = –x
• Example: negate +2
+2 = 0000 0000 … 0010₂
–2 = 1111 1111 … 1101₂ + 1 = 1111 1111 … 1110₂
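The complement-and-add-1 rule translates directly into C on 32-bit patterns. The function name `negate` is illustrative only. The sketch works on `uint32_t` so that the bit manipulation is fully defined.

```c
#include <assert.h>
#include <stdint.h>

/* Negate a 32-bit 2s-complement pattern: complement, then add 1. */
uint32_t negate(uint32_t x)
{
    uint32_t complemented = ~x;               /* 1 -> 0, 0 -> 1          */
    assert(x + complemented == 0xFFFFFFFFu);  /* x + ~x = 111...1 = -1   */
    return complemented + 1u;                 /* ~x + 1 = -x             */
}
```

Negating the pattern for +2 gives 0xFFFFFFFE, the pattern for –2. The most-negative pattern 0x80000000 negates to itself, which shows why its negation cannot be represented.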
Sign Extension
• Representing a number using more bits
– Preserve the numeric value
• In MIPS instruction set
– addi: extend immediate value
– lb, lh: extend loaded byte/halfword
– beq, bne: extend the displacement
• Replicate the sign bit to the left
– c.f. unsigned values: extend with 0s
• Examples: 8-bit to 16-bit
– +2: 0000 0010 => 0000 0000 0000 0010
– –2: 1111 1110 => 1111 1111 1111 1110
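The 8-bit to 16-bit examples can be reproduced with a small C sketch. The function names are hypothetical. The `assert` checks the slide's requirement that sign extension preserves the numeric value.

```c
#include <assert.h>
#include <stdint.h>

/* Sign-extend an 8-bit pattern to 16 bits by replicating bit 7. */
uint16_t sign_extend_8_to_16(uint8_t value)
{
    uint16_t result = value;          /* upper 8 bits start as 0      */
    if (value & 0x80u)                /* sign bit set?                */
        result |= 0xFF00u;            /* copy it into bits 15..8      */

    /* The signed value must be the same before and after. */
    int before = (value & 0x80u) ? (int)value - 256 : (int)value;
    int after = (result & 0x8000u) ? (int)result - 65536 : (int)result;
    assert(before == after);
    return result;
}

/* Zero-extend (the unsigned case): upper bits are filled with 0s. */
uint16_t zero_extend_8_to_16(uint8_t value)
{
    return value;
}
```

The pattern 0xFE sign-extends to 0xFFFE, which is still –2. Zero-extending the same pattern yields 0x00FE, which is 254.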
Integer Addition
• Example: 7 + 6
Integer Subtraction
• Add negation of second operand
• Example: 7 – 6 = 7 + (–6)
+7: 0000 0000 … 0000 0111
–6: 1111 1111 … 1111 1010
+1: 0000 0000 … 0000 0001
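Subtraction as "add the negation" can be sketched in C on 32-bit patterns. The name `subtract` is illustrative. The carry out of bit 31 is simply discarded, which unsigned arithmetic in C does automatically.

```c
#include <assert.h>
#include <stdint.h>

/* Compute a - b as a + (-b), where -b is formed by complement-and-add-1. */
uint32_t subtract(uint32_t a, uint32_t b)
{
    uint32_t neg_b = ~b + 1u;     /* 2s-complement negation of b         */
    uint32_t diff = a + neg_b;    /* carry out of bit 31 is dropped      */
    assert(diff == a - b);        /* agrees with the built-in subtraction */
    return diff;
}
```

For 7 – 6, the negation of 6 is the pattern ending in 1010 and the sum is the pattern ending in 0001, as on the slide.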
Multiplication
• Start with long-multiplication approach
      1000   multiplicand
    × 1001   multiplier
      ----
      1000
     0000
    0000
   1000
   -------
   1001000   product
• Length of product is the sum of operand lengths
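The long-multiplication approach corresponds to a shift-and-add loop. The following C sketch is one way to write it for unsigned 32-bit operands; it is not the lecture's hardware design. The function name is hypothetical, and the 64-bit product reflects the rule that the product length is the sum of the operand lengths.

```c
#include <assert.h>
#include <stdint.h>

/* Shift-and-add multiplication of two unsigned 32-bit operands. */
uint64_t shift_add_multiply(uint32_t multiplicand, uint32_t multiplier)
{
    uint64_t product = 0;                          /* initially 0 */
    uint64_t shifted_multiplicand = multiplicand;  /* shifted left each step */
    uint32_t remaining = multiplier;               /* shifted right each step */

    for (int step = 0; step < 32; step++) {
        if (remaining & 1u)                        /* current multiplier bit is 1 */
            product += shifted_multiplicand;       /* add this partial product */
        shifted_multiplicand <<= 1;
        remaining >>= 1;
    }
    assert(product == (uint64_t)multiplicand * multiplier);
    return product;
}
```

Multiplying 1000₂ (8) by 1001₂ (9) gives 1001000₂ (72), the product shown on the slide.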
Multiplication Hardware
Initially 0
Stopped here… will start next lecture from here
Optimized Multiplier
• Perform steps in parallel: add/shift
• One cycle per partial-product addition
– That's ok, if frequency of multiplications is low
Faster Multiplier
• Uses multiple adders
– Cost/performance tradeoff
• Can be pipelined
– Several multiplications performed in parallel
MIPS Multiplication
• Two 32-bit registers for product
– HI: most-significant 32 bits
– LO: least-significant 32 bits
• Instructions
– mult rs, rt / multu rs, rt
  • 64-bit product in HI/LO
– mfhi rd / mflo rd
  • Move from HI/LO to rd
  • Can test HI value to see if product overflows 32 bits
– mul rd, rs, rt
  • Least-significant 32 bits of product –> rd
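The HI/LO split and the overflow test can be sketched in C. This is an illustration under stated assumptions: `hilo_t`, `mult` and `product_overflows_32` are hypothetical names, and the overflow test shown is one reasonable reading of "test HI" for signed operands (HI must be the sign extension of LO).

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    uint32_t hi;   /* most-significant 32 bits of the product  */
    uint32_t lo;   /* least-significant 32 bits of the product */
} hilo_t;

/* Signed 32 x 32 -> 64-bit multiply, result split across HI and LO. */
hilo_t mult(int32_t rs, int32_t rt)
{
    int64_t product = (int64_t)rs * (int64_t)rt;
    hilo_t r;
    r.hi = (uint32_t)((uint64_t)product >> 32);
    r.lo = (uint32_t)((uint64_t)product & 0xFFFFFFFFu);
    /* HI:LO together hold the full 64-bit product. */
    assert((((uint64_t)r.hi << 32) | r.lo) == (uint64_t)product);
    return r;
}

/* The signed product fits in 32 bits only if HI is all copies of LO's sign bit. */
int product_overflows_32(hilo_t r)
{
    uint32_t expected_hi = (r.lo & 0x80000000u) ? 0xFFFFFFFFu : 0u;
    return r.hi != expected_hi;
}
```

For example, 65536 × 65536 leaves HI = 1 and LO = 0, so the product does not fit in 32 bits, while –2 × 3 leaves HI as all 1s and is reported as fitting.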
Division
• Check for 0 divisor
• Long division approach
– If divisor ≤ dividend bits: 1 bit in quotient, subtract
– Otherwise: 0 bit in quotient, bring down next dividend bit
• Restoring division
– Do the subtract, and if remainder goes < 0, add divisor back
• Signed division
– Divide using absolute values
– Adjust sign of quotient and remainder as required
          1001     quotient
1000 ) 1001010     dividend (divisor 1000 at left)
      -1000
          10
          101
          1010
         -1000
            10     remainder
• n-bit operands yield n-bit quotient and remainder
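Restoring division as described above can be sketched in C for unsigned 32-bit operands. The type `divmod_t` and the function name are hypothetical, and the loop is a software illustration rather than the lecture's hardware design.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    uint32_t quotient;
    uint32_t remainder;
} divmod_t;

/* Restoring division: subtract first, add the divisor back if that went negative. */
divmod_t restoring_divide(uint32_t dividend, uint32_t divisor)
{
    assert(divisor != 0);                 /* check for 0 divisor */
    int64_t remainder = 0;
    uint32_t quotient = 0;

    for (int bit = 31; bit >= 0; bit--) {
        /* Bring down the next dividend bit. */
        remainder = (remainder << 1) | (int64_t)((dividend >> bit) & 1u);
        remainder -= divisor;             /* do the subtract */
        if (remainder < 0) {
            remainder += divisor;         /* restore: add divisor back */
            quotient = quotient << 1;     /* 0 bit in quotient */
        } else {
            quotient = (quotient << 1) | 1u;  /* 1 bit in quotient */
        }
    }
    divmod_t result = { quotient, (uint32_t)remainder };
    return result;
}
```

Dividing 1001010₂ (74) by 1000₂ (8) yields quotient 1001₂ (9) and remainder 10₂ (2), as in the worked example.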
Division Hardware
Initially dividend
Initially divisor in left half
Optimized Divider
• One cycle per partial-remainder subtraction
• Looks a lot like a multiplier!
– Same hardware can be used for both
Faster Division
• Can't use parallel hardware as in multiplier
– Subtraction is conditional on sign of remainder
• Faster dividers (e.g. SRT division) generate multiple quotient bits per step
– Still require multiple steps
MIPS Division
• Use HI/LO registers for result
– HI: 32-bit remainder
– LO: 32-bit quotient
• Instructions
– div rs, rt / divu rs, rt
– No overflow or divide-by-0 checking
  • Software must perform checks if required
– Use mfhi, mflo to access result
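Because the hardware does no checking, software has to guard the division itself. The C sketch below shows one possible form of those checks. The names `div_result_t` and `checked_div` are hypothetical. It relies on C's truncating integer division, where the remainder takes the sign of the dividend.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    int32_t hi;   /* HI: 32-bit remainder */
    int32_t lo;   /* LO: 32-bit quotient  */
} div_result_t;

/* Signed division with the checks that div itself does not perform.
   Returns 1 and fills *out on success, 0 if the division is invalid. */
int checked_div(int32_t rs, int32_t rt, div_result_t *out)
{
    if (rt == 0)
        return 0;                          /* divide by 0 */
    if (rs == INT32_MIN && rt == -1)
        return 0;                          /* quotient 2^31 does not fit */
    out->lo = rs / rt;                     /* what mflo would return */
    out->hi = rs % rt;                     /* what mfhi would return */
    /* dividend = quotient * divisor + remainder */
    assert((int64_t)out->lo * rt + out->hi == rs);
    return 1;
}
```

A caller would test the return value before reading `lo` and `hi`, in the same way an assembly routine would branch around the div when the divisor is zero.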
Conclusions
• In MIPS assembly language
– Registers replace C variables
– One instruction (simple operation) per line
– Simpler is faster
• Memory is byte-addressable, but lw and sw access one word at a time
• A pointer (used by lw and sw) is just a memory address, so we can add to it or subtract from it (using offset)
Review
• Instructions so far: add, addi, sub, mult, div, mfhi, mflo, lw, sw, lb, lbu, lh, lhu
• Registers so far
– C variables: $s0 - $s7
– Temporary variables: $t0 - $t9
– Zero: $zero