CSCI-641/EENG-641 Computer Architecture. Khurram Kazi. Major sources of the slides for this lecture: http://fourier.eng.hmc.edu/e85/lectures/instruction/node7.html http://www.ece.ucdavis.edu/~vojin/CLASSES/EEC180B/Spring99/lab6.pdf http://fourier.eng.hmc.edu/e85/lectures/r3000-isa.html
New York Institute of Technology
Engineering and Computer Sciences
CSCI 641 – EENG 641 1
CSCI-641/EENG-641
Computer Architecture
Khurram Kazi
Major sources of the slides for this lecture
http://fourier.eng.hmc.edu/e85/lectures/instruction/node7.html
http://www.ece.ucdavis.edu/~vojin/CLASSES/EEC180B/Spring99/lab6.pdf
http://fourier.eng.hmc.edu/e85/lectures/r3000-isa.html
Digital Design and Computer Architecture, by David Money Harris and Sarah L. Harris, Chapter 6: Architecture
http://www.hirstbrook.com/cod/Chapter2B.pdf
Assembly language
Assembly language is the human-readable representation of a computer's native language.
Each instruction specifies both the operation to perform and the operands on which to operate.
In the instruction add a, b, c, add is called the mnemonic and indicates what operation to perform.
The operation is performed on b and c, the source operands, and the result is written to a, the destination operand.
Design Principle 1: Simplicity favors regularity
Instructions with a consistent number of operands (in this case, two sources and one destination) are easier to encode and handle in hardware.
More complex high-level code translates into multiple MIPS instructions.
High-level code              MIPS assembly code
a = b + c;                   add a, b, c
a = b - c;                   sub a, b, c

High-level code (complex)    MIPS assembly code
a = b + c - d;               sub t, c, d    # t = c - d
                             add a, b, t    # a = b + t
Assembly language
Executing a complex operation takes multiple assembly language instructions.
Design Principle 2: Make the common case fast
The MIPS instruction set makes the common case fast by including only simple, commonly used instructions.
The number of instructions is kept small so that the hardware required to decode an instruction and its operands can be simple, small, and fast.
Less frequent, more elaborate operations are performed using a sequence of multiple simple instructions.
Assembly language: Operands: Registers, Memory, and Constants
An instruction operates on operands; the variables a, b, and c are all operands.
A computer operates on 0's and 1's, not variable names, so each operand needs a physical location.
Operands are stored in registers or memory, or they may be constants stored in the instruction itself.
Computers use various locations to hold operands, to optimize for speed and data capacity.
Registers are accessed quickly compared to memory.
Registers can hold only a very limited amount of data, whereas memories hold large amounts of data.
The MIPS architecture uses 32 registers, called the register set or register file.
Design Principle 3: Smaller is faster
Translating high-level code to assembly
High-level code:
a = b - c;
f = (g + h) - (i + j);
Assembly language:
# $s0 = a, $s1 = b, $s2 = c, $s3 = f, $s4 = g, $s5 = h, $s6 = i, $s7 = j
sub $s0, $s1, $s2    # a = b - c
add $t0, $s4, $s5    # $t0 = g + h
add $t1, $s6, $s7    # $t1 = i + j
sub $s3, $t0, $t1    # f = (g + h) - (i + j)
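The translation above can be checked by stepping through the four instructions by hand. The Python sketch below does that with a dictionary standing in for the register file. The initial register values are made up for illustration, the helper names add and sub are hypothetical, and 32-bit overflow is ignored.

```python
# Hypothetical starting values, chosen only for illustration:
# b = 7, c = 3, g = 10, h = 20, i = 4, j = 6.
regs = {"$s1": 7, "$s2": 3, "$s4": 10, "$s5": 20, "$s6": 4, "$s7": 6}

def add(rd, rs, rt):
    """Model of add rd, rs, rt (overflow ignored in this sketch)."""
    regs[rd] = regs[rs] + regs[rt]

def sub(rd, rs, rt):
    """Model of sub rd, rs, rt (overflow ignored in this sketch)."""
    regs[rd] = regs[rs] - regs[rt]

sub("$s0", "$s1", "$s2")   # a = b - c
add("$t0", "$s4", "$s5")   # $t0 = g + h
add("$t1", "$s6", "$s7")   # $t1 = i + j
sub("$s3", "$t0", "$t1")   # f = (g + h) - (i + j)

print(regs["$s0"], regs["$s3"])
```

With these values, $s0 ends up holding b - c and $s3 holds (g + h) - (i + j), using $t0 and $t1 only as scratch space.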
MIPS register set
Name Number Use
$0 0 The constant value 0
$at 1 Assembler temporary
$v0 – $v1 2 – 3 Procedure return values
$a0 – $a3 4 – 7 Procedure arguments
$t0 – $t7 8 – 15 Temporary variables
$s0 – $s7 16 – 23 Saved variables
$t8 – $t9 24 – 25 Temporary variables
$k0 – $k1 26 – 27 Operating system (OS) temporaries
$gp 28 Global pointer
$sp 29 Stack pointer
$fp 30 Frame pointer
$ra 31 Procedure return address
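The table above maps register names to numbers, and those numbers are what end up in the rs, rt, and rd fields of machine code. A minimal Python sketch of the same mapping follows. The names build_register_numbers and REGISTER_NUMBERS are hypothetical; the name-to-number assignments come from the table.

```python
def build_register_numbers():
    """Build the name -> number map from the MIPS register set table."""
    names = ["$0", "$at", "$v0", "$v1"]
    names += [f"$a{i}" for i in range(4)]   # $a0-$a3 -> 4-7
    names += [f"$t{i}" for i in range(8)]   # $t0-$t7 -> 8-15
    names += [f"$s{i}" for i in range(8)]   # $s0-$s7 -> 16-23
    names += ["$t8", "$t9", "$k0", "$k1", "$gp", "$sp", "$fp", "$ra"]
    return {name: number for number, name in enumerate(names)}

REGISTER_NUMBERS = build_register_numbers()

print(REGISTER_NUMBERS["$t3"], REGISTER_NUMBERS["$s1"], REGISTER_NUMBERS["$ra"])
```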
Registers within MIPS Processor
Register file (RF): 32 registers ($0 through $31), each holding one 32-bit word (4 bytes).
$0 always holds zero.
$sp (29) is the stack pointer (SP), which always points to the top item of a stack in memory.
$ra (31) holds the return address from a subroutine.
The table on the previous slide shows the conventional usage of all 32 registers.
Description of Register File
There are two read data buses, a_dout and b_dout; two read address buses, a_addr and b_addr; one write data bus, wr_dbus; and one write address bus, wr_addr.
Each of the address buses specifies one of the 32 registers for either reading or writing.
The write operation takes place on the rising edge of the clk signal when the wr_en signal is logic 1.
The read operation, however, is not clocked; it is combinational. Thus, the value on a_dout should always be the contents of the register specified by a_addr.
Similarly, the value on b_dout should always be the contents of the register specified by b_addr.
So, with this register file, you can write to one register and read two registers simultaneously. It is also possible to read a single register on both read buses simultaneously.
In essence, it is a 3-port memory element that allows two reads and one write simultaneously.
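The register file described above can be modeled behaviorally in a few lines. The Python sketch below is only a software model, not the hardware description. The signal names follow the slide, while the class name RegisterFile, the method names, and the choice to ignore writes to register 0 (so that $0 stays zero) are assumptions made here.

```python
class RegisterFile:
    """Behavioral model: two combinational read ports, one clocked write port."""

    def __init__(self):
        self.regs = [0] * 32  # 32 registers, each one 32-bit word

    def read(self, a_addr, b_addr):
        """Combinational read: returns (a_dout, b_dout) immediately."""
        return self.regs[a_addr], self.regs[b_addr]

    def rising_edge(self, wr_en, wr_addr, wr_dbus):
        """Write happens only on a rising clk edge with wr_en = 1."""
        # Assumption: register 0 is hardwired to zero, so writes to it are dropped.
        if wr_en and wr_addr != 0:
            self.regs[wr_addr] = wr_dbus & 0xFFFFFFFF


rf = RegisterFile()
rf.rising_edge(wr_en=1, wr_addr=17, wr_dbus=0x12345678)
rf.rising_edge(wr_en=0, wr_addr=18, wr_dbus=0xFFFFFFFF)  # ignored: wr_en is 0
a_dout, b_dout = rf.read(17, 17)  # the same register on both read buses
print(hex(a_dout), hex(b_dout))
```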
Memory

Word-addressable memory
Word address    Data        Word
00000000        ABCDEF78    Word 0
00000001        F2EFAB01    Word 1
00000002        ECEBAB01    Word 2
00000003        12EBAB2A    Word 3

Byte-addressable memory
Word address    Data (MSB to LSB)
00000000        AB CD EF 78
00000004        F2 EF AB 01
00000008        EC EB AB 01
0000000C        12 EB AB 2A

Byte addresses within each word (MSB on the left, LSB on the right)
Word address    Big-endian    Little-endian
0               0 1 2 3       3 2 1 0
4               4 5 6 7       7 6 5 4
8               8 9 A B       B A 9 8
C               C D E F       F E D C
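The difference between the two byte orders can be seen by laying out word 0 (ABCDEF78) as four bytes. The Python sketch below uses the built-in int.to_bytes and int.from_bytes; the variable names are hypothetical.

```python
word = 0xABCDEF78

# Big-endian: the most significant byte sits at the lowest byte address.
big = word.to_bytes(4, "big")
# Little-endian: the least significant byte sits at the lowest byte address.
little = word.to_bytes(4, "little")

for address in range(4):
    print(address, f"{big[address]:02X}", f"{little[address]:02X}")

# Reading the bytes back with the matching byte order recovers the word.
restored = int.from_bytes(little, "little")
```

At byte address 0 the big-endian layout holds AB and the little-endian layout holds 78; the word value itself is the same in both cases.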
Memory
Compared to the register file, memory is large and slow.
MIPS uses byte-addressable memory.
The MIPS architecture uses 32-bit memory addresses and 32-bit data words.
Each 32-bit data word has a unique 32-bit address; because memory is byte-addressable, consecutive word addresses differ by 4.
MIPS uses the load word instruction, lw, to read data from memory into a register.
The lw instruction specifies the effective address in memory as the sum of a base address and an offset, written as offset(base register), e.g.
lw $s0, 0($0)      # read data word 0 into $s0
lw $s1, 4($0)      # read data word 1 into $s1
lw $s2, 0xC($0)    # read data word 3 into $s2
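The effective-address rule for lw can be sketched in Python as follows. The memory contents are taken from the byte-addressable memory figure, on the assumption that the words are listed from address 0 upward; the dictionaries memory and registers and the function lw are hypothetical stand-ins for the hardware.

```python
# Data words keyed by byte address (assumed from the memory figure).
memory = {0x0: 0xABCDEF78, 0x4: 0xF2EFAB01, 0x8: 0xECEBAB01, 0xC: 0x12EBAB2A}
registers = {"$0": 0}

def lw(rt, offset, base):
    """Model of lw rt, offset(base): effective address = base register + offset."""
    address = registers[base] + offset
    if address % 4 != 0:
        raise ValueError("lw needs a word-aligned address")
    registers[rt] = memory[address]

lw("$s0", 0, "$0")     # read data word 0 into $s0
lw("$s1", 4, "$0")     # read data word 1 into $s1
lw("$s2", 0xC, "$0")   # read data word 3 into $s2
print({name: hex(value) for name, value in registers.items()})
```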
Memory
MIPS uses the store word instruction, sw, to write data from a register to memory.
The sw instruction specifies the effective address in memory as the sum of a base address and an offset, written as offset(base register), e.g.
sw $s0, 0($0)      # write $s0 to memory data word 0
sw $s1, 4($0)      # write $s1 to memory data word 1
sw $s2, 0xC($0)    # write $s2 to memory data word 3
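A store can be modeled the same way, this time at the byte level. The Python sketch below writes each register into a 16-byte memory. The register values, the big-endian byte order, and the names memory, registers, and sw are all assumptions for illustration.

```python
memory = bytearray(16)  # four words of byte-addressable memory, initially zero
registers = {"$0": 0, "$s0": 0xABCDEF78, "$s1": 0xF2EFAB01, "$s2": 0x12EBAB2A}

def sw(rt, offset, base):
    """Model of sw rt, offset(base): store the 4 bytes of rt at base + offset."""
    address = registers[base] + offset
    if address % 4 != 0:
        raise ValueError("sw needs a word-aligned address")
    # Assumption: big-endian byte order within the word.
    memory[address:address + 4] = registers[rt].to_bytes(4, "big")

sw("$s0", 0, "$0")     # write $s0 to memory data word 0
sw("$s1", 4, "$0")     # write $s1 to memory data word 1
sw("$s2", 0xC, "$0")   # write $s2 to memory data word 3
print(memory.hex())
```

Word 2 (byte addresses 8 through B) is never written, so it stays zero.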
Instruction Set of MIPS Processor
Instruction set: each instruction in the instruction set describes one particular CPU operation. Each instruction is represented both in assembly language, by its mnemonic, and in machine language (binary), by a 32-bit word subdivided into several fields.
op: the opcode.
rs: short for “register source.” rt comes after rs alphabetically and usually indicates the second register source.
rd: short for “register destination.”
shamt: the shift amount, used only in shift operations.
Instruction Set of MIPS Processor: R-type instruction
Arithmetic/Logical Instructions in MIPS
Logical operations are and, or, xor, and nor.
R-type logical instructions operate bit-by-bit on two source registers, and the result is written to the destination register.
and is used for masking bits (i.e., forcing unwanted bits to 0).
or is useful for combining bits from two registers.
MIPS does not provide a NOT instruction; NOR can be used instead, e.g., A NOR $0 = NOT A.
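The three uses above (masking with and, combining with or, and NOT via nor) can be tried out on 32-bit values in Python. The operand values and the helper name nor are hypothetical; Python integers are unbounded, so results are masked to 32 bits.

```python
MASK32 = 0xFFFFFFFF  # keep results to 32 bits, like a MIPS register

def nor(a, b):
    """Bitwise NOR of two 32-bit values."""
    return ~(a | b) & MASK32

a = 0x12345678

# and as a mask: keep only the low byte, force all other bits to 0.
low_byte = a & 0x000000FF

# or to combine bits from two registers: upper half from one, lower from another.
combined = 0x12340000 | 0x00005678

# NOT built from NOR with $0 (which always holds zero).
not_a = nor(a, 0)

print(hex(low_byte), hex(combined), hex(not_a))
```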
Instruction Set of MIPS Processor: Machine code for R-type instruction
Field    op        rs        rt        rd        shamt     funct
Size     6 bits    5 bits    5 bits    5 bits    5 bits    6 bits

Assembly code        Field values (op rs rt rd shamt funct)    Machine code (binary)                    Machine code (hex)
add $s0, $s1, $s2    0 17 18 16 0 32                           000000 10001 10010 10000 00000 100000    0x02328020
sub $t0, $t3, $t5    0 11 13 8 0 34                            000000 01011 01101 01000 00000 100010    0x016D4022
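The machine code words on this slide follow directly from packing the six fields into 32 bits. The Python sketch below does the packing with shifts, using the field values and widths from the slide; the function name encode_r_type is hypothetical.

```python
def encode_r_type(rs, rt, rd, shamt, funct, op=0):
    """Pack R-type fields: op(6) rs(5) rt(5) rd(5) shamt(5) funct(6)."""
    return (op << 26) | (rs << 21) | (rt << 16) | (rd << 11) | (shamt << 6) | funct

# Field values from the slide: rs, rt, rd, shamt, funct (op is 0 for both).
add_word = encode_r_type(rs=17, rt=18, rd=16, shamt=0, funct=32)
sub_word = encode_r_type(rs=11, rt=13, rd=8, shamt=0, funct=34)

print(f"0x{add_word:08X}", f"0x{sub_word:08X}")
print(f"{sub_word:032b}")
```

Note that in assembly the destination is written first, while in machine code rd is the third register field.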
Instruction Set of MIPS Processor: I-type instruction
Immediate-type, or I-type, instructions use two register operands and one immediate operand.
The op, rs, and rt fields are similar to those of R-type instructions; the 16-bit imm field holds the immediate.
The operation is determined solely by the opcode.
rs and imm are always used as source operands.
rt is used as a destination for some instructions (such as addi and lw) but as another source for others (such as sw).
Instruction Set of MIPS Processor: Machine code for I-type instructions
I-type (immediate-type) instructions
Field    op        rs        rt        imm
Size     6 bits    5 bits    5 bits    16 bits

Assembly code         Field values (op rs rt imm)    Machine code
addi $s0, $s1, 5      8 17 16 5                      0x22300005
addi $t0, $s3, -12    8 19 8 -12                     0x2268FFF4
lw $t2, 32($0)        35 0 10 32                     0x8C0A0020
sw $s1, 4($t1)        43 9 17 4                      0xAD310004
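The I-type words can be reproduced the same way as the R-type ones. The Python sketch below packs op, rs, rt, and a 16-bit immediate, using the field values from the slide. A negative immediate such as -12 is stored in 16-bit two's complement, which the mask takes care of. The function name encode_i_type is hypothetical.

```python
def encode_i_type(op, rs, rt, imm):
    """Pack I-type fields: op(6) rs(5) rt(5) imm(16, two's complement)."""
    return (op << 26) | (rs << 21) | (rt << 16) | (imm & 0xFFFF)

# (op, rs, rt, imm) field values as listed on the slide.
encodings = {
    "addi, imm 5": encode_i_type(8, 17, 16, 5),
    "addi, imm -12": encode_i_type(8, 19, 8, -12),
    "lw, imm 32": encode_i_type(35, 0, 10, 32),
    "sw, imm 4": encode_i_type(43, 9, 17, 4),
}

for label, word in encodings.items():
    print(f"{label}: 0x{word:08X}")
```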
Load word (lw) instruction
MIPS uses the load word instruction, lw, to read a data word from memory into a register.
lw $s3, 1($0)    # read memory word 1 into $s3
The lw instruction specifies the effective address in memory as the sum of a base address and an offset.
The base address (written in parentheses in the instruction) is a register.
The offset is a constant (written before the parentheses).
Here the base address is $0 and the offset is 1, so the instruction reads from memory address 1. This example treats memory as word-addressable for simplicity.
After the instruction executes, register $s3 = F2F1AC07.
Word address Data Word
00000003 40F30788 WORD 3
00000002 01EE2842 WORD 2
00000001 F2F1AC07 WORD 1
00000000 ABCDEF78 WORD 0
lw instruction
rt is used as a destination in this instruction.
Store word (sw) instruction
MIPS uses the store word instruction, sw, to write a data word from a register into memory.
sw $s7, 5($0)    # write $s7 to memory word 5
The base address (written in parentheses in the instruction) is the value stored in the register.
The offset is a constant (written before the parentheses).
Here the base address is $0 and the offset is 5, so the instruction writes the data in register $s7 to memory word 5.
Keep in mind that the MIPS memory model is byte-addressable, not word-addressable.
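Because the real memory model is byte-addressable, a word number has to be multiplied by 4 to get the offset used in an actual lw or sw. The Python sketch below shows that conversion. The names WORD_SIZE_BYTES, byte_address, and memory_operand are hypothetical.

```python
WORD_SIZE_BYTES = 4  # a MIPS word is 32 bits

def byte_address(word_index):
    """Byte address of the word with the given word number."""
    return word_index * WORD_SIZE_BYTES

def memory_operand(word_index, base="$0"):
    """Offset(base) operand that reaches the given word from base register $0."""
    return f"{byte_address(word_index)}({base})"

print("lw $s3,", memory_operand(1))   # memory word 1
print("sw $s7,", memory_operand(5))   # memory word 5
```

Under this conversion, memory word 1 is reached with offset 4 and memory word 5 with offset 20.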
Classical five-stage RISC pipeline
MIPS R3000 Instruction Set Summary
Name                 Example                                          Comments
32 registers         $0, $1, $2, ..., $31                             Fast locations for data. In MIPS, data must be in registers to perform arithmetic. MIPS register $0 always equals 0. Register $1 is reserved for the assembler to handle pseudo-instructions and large constants.
2^30 memory words    Memory[0], Memory[4], ..., Memory[4294967292]    Accessed only by data transfer instructions. MIPS uses byte addresses, so sequential words differ by 4. Memory holds data structures, such as arrays, and spilled registers, such as those saved on procedure calls.
MIPS R3000 Instruction Set Summary
Category    Instruction                     Example          Meaning         Comments
Arithmetic  add                             add $1,$2,$3     $1 = $2 + $3    3 operands; exception possible
            subtract                        sub $1,$2,$3     $1 = $2 - $3    3 operands; exception possible
            add immediate                   addi $1,$2,100   $1 = $2 + 100   + constant; exception possible
            add unsigned                    addu $1,$2,$3    $1 = $2 + $3    3 operands; no exceptions
            subtract unsigned               subu $1,$2,$3    $1 = $2 - $3    3 operands; no exceptions
            add immediate unsigned          addiu $1,$2,100  $1 = $2 + 100   + constant; no exceptions
            move from coprocessor register  mfc0 $1,$epc     $1 = $epc       Used to get a copy of the Exception PC
MIPS R3000 Instruction Set Summary
Category  Instruction          Example         Meaning         Comments
Logical   and                  and $1,$2,$3    $1 = $2 & $3    3 register operands; Logical AND
          or                   or $1,$2,$3     $1 = $2 | $3    3 register operands; Logical OR
          and immediate        andi $1,$2,100  $1 = $2 & 100   Logical AND register, constant
          or immediate         ori $1,$2,100   $1 = $2 | 100   Logical OR register, constant
          shift left logical   sll $1,$2,10    $1 = $2 << 10   Shift left by constant
          shift right logical  srl $1,$2,10    $1 = $2 >> 10   Shift right by constant
MIPS R3000 Instruction Set Summary
Category            Instruction                       Example          Meaning                             Comments
Data transfer       load word                         lw $1,100($2)    $1 = Memory[$2+100]                 Data from memory to register
                    store word                        sw $1,100($2)    Memory[$2+100] = $1                 Data from register to memory
                    load upper immediate              lui $1,100       $1 = 100 * 2^16                     Load constant in upper 16 bits
Conditional branch  branch on equal                   beq $1,$2,100    if ($1 == $2) go to PC+4+100        Equal test; PC-relative branch
                    branch on not equal               bne $1,$2,100    if ($1 != $2) go to PC+4+100        Not equal test; PC-relative
                    set on less than                  slt $1,$2,$3     if ($2 < $3) $1 = 1; else $1 = 0    Compare less than; 2's complement
                    set less than immediate           slti $1,$2,100   if ($2 < 100) $1 = 1; else $1 = 0   Compare < constant; 2's complement
                    set less than unsigned            sltu $1,$2,$3    if ($2 < $3) $1 = 1; else $1 = 0    Compare less than; natural number
                    set less than immediate unsigned  sltiu $1,$2,100  if ($2 < 100) $1 = 1; else $1 = 0   Compare < constant; natural number
Unconditional jump  jump                              j 10000          go to 10000                         Jump to target address
                    jump register                     jr $31           go to $31                           For switch, procedure return
                    jump and link                     jal 10000        $31 = PC + 4; go to 10000           For procedure call
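Two rows of the table are easy to misread: slt and sltu give different answers for the same bit pattern, and lui places its constant in the upper half of the register. The Python sketch below models both on 32-bit values. The helper names and test values are hypothetical.

```python
MASK32 = 0xFFFFFFFF

def to_signed32(value):
    """Interpret a 32-bit pattern as a 2's complement number."""
    value &= MASK32
    return value - (1 << 32) if value & 0x80000000 else value

def slt(a, b):
    """Set on less than: compares as 2's complement numbers."""
    return 1 if to_signed32(a) < to_signed32(b) else 0

def sltu(a, b):
    """Set less than unsigned: compares as natural numbers."""
    return 1 if (a & MASK32) < (b & MASK32) else 0

def lui(imm):
    """Load upper immediate: imm goes into the upper 16 bits, lower 16 are 0."""
    return ((imm & 0xFFFF) << 16) & MASK32

pattern = 0xFFFFFFFF  # -1 as 2's complement, 4294967295 as a natural number
print(slt(pattern, 1), sltu(pattern, 1), lui(100))
```

The same pattern 0xFFFFFFFF is less than 1 under slt but greater than 1 under sltu, and lui with 100 yields 100 * 2^16.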
Pipeline stages: 5 stages
Fetch: Reads the instruction from instruction memory.
Decode: Reads the source operands from the register file and decodes the instruction to produce the control signals.
Execute: Performs a computation with the ALU.
Memory: Processor reads or writes data memory.
Writeback: Processor writes the result to the register file, when applicable.
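How the five stages overlap can be seen by listing which instruction occupies which stage in each clock cycle. The Python sketch below does this for an idealized pipeline with no hazards or stalls, which is an assumption made here. The names STAGES, pipeline_schedule, and total_cycles are hypothetical.

```python
STAGES = ["Fetch", "Decode", "Execute", "Memory", "Writeback"]

def pipeline_schedule(num_instructions):
    """Map each cycle to the (instruction index, stage) pairs active in it.

    Assumes an ideal pipeline: one instruction enters per cycle, no stalls.
    """
    schedule = {}
    for instr in range(num_instructions):
        for stage_index, stage in enumerate(STAGES):
            cycle = instr + stage_index + 1  # cycles are numbered from 1
            schedule.setdefault(cycle, []).append((instr, stage))
    return schedule

def total_cycles(num_instructions):
    """Cycles needed to finish all instructions in the ideal pipeline."""
    return num_instructions + len(STAGES) - 1

for cycle, active in sorted(pipeline_schedule(3).items()):
    print(cycle, active)
```

In cycle 3, for example, instruction 0 is in Execute while instruction 1 is in Decode and instruction 2 is in Fetch.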
Pipelined logic of MIPS
Snippet of MIPS simulation
[Figure: simulation waveform; labeled signals: instruction, immediate, dest. reg, write enable, write addr, write data]
MIPS Microarchitecture
Single-cycle microarchitecture
  Executes an entire instruction in one cycle
Multi-cycle microarchitecture
  Executes instructions in a series of shorter cycles
  Simpler instructions execute in fewer cycles than complicated ones
  Reuses complicated hardware blocks such as adders
  Executes only one instruction at a time, over multiple cycles
Pipelined microarchitecture
  Applies pipelining to the single-cycle microarchitecture
  Can execute several instructions simultaneously
  Improves throughput significantly
  All high-performance commercial processors use pipelining
MIPS Microarchitecture
There are many ways to measure the performance of a computer system.
Intel and Advanced Micro Devices (AMD) both sell compatible processors conforming to the IA-32 architecture.
Intel offered higher clock frequencies than its competitors.
AMD's Athlon, Intel's main competitor, executed programs faster than Intel's chips at the same clock frequency.
The only gimmick-free way to measure performance is to measure the execution time of a program of interest to you.
The computer that executes your program fastest has the highest performance.
The next best thing is to measure the total execution time of a collection of programs.
Such collections of programs are called benchmarks, and their execution times are commonly published.
Execution time = (# instructions)(cycles/instruction)(seconds/cycle)
Cycles per instruction (CPI) is the number of clock cycles required to execute an average instruction.
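The execution time formula can be applied directly. The Python sketch below evaluates it for two made-up processors; all numbers are hypothetical and chosen only to show that a higher clock frequency does not guarantee a shorter execution time. The function name execution_time is also hypothetical.

```python
def execution_time(instruction_count, cpi, clock_hz):
    """Execution time = (# instructions)(cycles/instruction)(seconds/cycle)."""
    seconds_per_cycle = 1.0 / clock_hz
    return instruction_count * cpi * seconds_per_cycle

# Hypothetical program of one billion instructions on two hypothetical chips.
instructions = 1_000_000_000
time_a = execution_time(instructions, cpi=1.5, clock_hz=3.0e9)  # faster clock
time_b = execution_time(instructions, cpi=0.9, clock_hz=2.0e9)  # lower CPI

print(f"A: {time_a:.3f} s, B: {time_b:.3f} s")
```

Here chip B finishes first despite its lower clock frequency, because it needs fewer cycles per instruction.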