7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 1101
Instruction Set Principles and Examples
UNIT - 2
Classification of Instruction Set Architectures
Instruction Set Design
• Multiple implementations: 8086 to Pentium
• ISAs evolve: MIPS-I, MIPS-II, MIPS-III, MIPS-IV, MIPS MDMX, MIPS-32, MIPS-64
[Figure: the instruction set as the interface between software and hardware.]
MIPS (originally an acronym for Microprocessor without Interlocked Pipeline Stages)
MIPS is a reduced instruction set computer (RISC) instruction set architecture (ISA) developed by MIPS Technologies (formerly MIPS Computer Systems, Inc.). The early MIPS architectures were 32-bit, and later versions were 64-bit. Multiple revisions of the MIPS instruction set exist, including MIPS I, MIPS II, MIPS III, MIPS IV, MIPS V, MIPS32, and MIPS64. The current revisions are MIPS32 (for 32-bit implementations) and MIPS64 (for 64-bit implementations). MIPS32 and MIPS64 define a control register set as well as the instruction set.
Typical Processor Execution Cycle
• Instruction Fetch: Obtain instruction from program storage
• Instruction Decode: Determine required actions and instruction size
• Operand Fetch: Locate and obtain operand data
• Execute: Compute result value or status
• Result Store: Deposit results in register or storage for later use
• Next Instruction: Determine successor instruction
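The six steps of the cycle can be sketched as a loop over a toy machine. This is only an illustration: the four-field instruction tuple, the opcode names "add" and "sub", and the register names are assumptions made for the sketch, not part of any real instruction set.

```python
def run(program, regs):
    """Run a toy program; each instruction is (op, dst, src1, src2)."""
    pc = 0                                   # program counter
    while pc < len(program):
        instr = program[pc]                  # 1. instruction fetch
        op, dst, a, b = instr                # 2. instruction decode
        x, y = regs[a], regs[b]              # 3. operand fetch
        if op == "add":                      # 4. execute
            result = x + y
        elif op == "sub":
            result = x - y
        else:
            raise ValueError(f"unknown opcode: {op}")
        regs[dst] = result                   # 5. result store
        pc += 1                              # 6. next instruction
    return regs

# Hypothetical two-instruction program: r3 = r1 + r2, then r4 = r3 - r1.
regs = run([("add", "r3", "r1", "r2"), ("sub", "r4", "r3", "r1")],
           {"r1": 5, "r2": 7, "r3": 0, "r4": 0})
print(regs["r3"], regs["r4"])
```

A real processor determines the successor instruction in step 6 from branches and jumps as well; the sketch always falls through to the next instruction.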
Instruction and Data Memory: Unified or Separate
[Figure: the programmer's view (ADD, SUBTRACT, AND, OR, COMPARE, ...) versus the computer's view (binary codes); CPU, memory and I/O, with the computer program (instructions) held in memory.]
Princeton (Von Neumann) Architecture
--- Data and instructions mixed in the same unified memory
--- Program as data
--- Storage utilization
--- Single memory interface
Harvard Architecture
--- Data & instructions in separate memories
--- Has advantages in certain high performance implementations
--- Can optimize each memory
Classifying Instruction Set Architectures
There are four types of internal storage used by the processor to store operands, explicitly or implicitly, during execution of a program:
1. Stack
2. Accumulator
3. Set of registers (register-memory)
4. Set of registers (register-register/load-store)
The operands in a stack architecture are implicitly on the top of the stack, and in an accumulator architecture one operand is implicitly the accumulator. The general-purpose register architectures (register-memory and register-register) have only explicit operands, either in registers or in memory locations.
Basic Addressing Classes
Declining cost of registers
Operand locations for four instruction set architecture classes
The arrows indicate whether the operand is an input or the result of the ALU operation, or both an input and result. Lighter shades indicate inputs, and the dark shade indicates the result. In (a), a Top Of Stack register (TOS) points to the top input operand, which is combined with the operand below. The first operand is removed from the stack, the result takes the place of the second operand, and TOS is updated to point to the result. All operands are implicit. In (b), the Accumulator is both an implicit input operand and a result. In (c), one input operand is a register, one is in memory, and the result goes to a register. All operands are registers in (d) and, like the stack architecture, can be transferred to memory only via separate instructions: push or pop for (a) and load or store for (d).
Code Sequence for C = A + B
Stack | Accumulator | Register-memory | Register-register
Push A | Load A | Load R1, A | Load R1, A
Push B | Add B | Add R3, R1, B | Load R2, B
Add | Store C | Store R3, C | Add R3, R1, R2
Pop C | | | Store R3, C
The code sequence for C = A + B for four classes of instruction sets. Note that the Add instruction has implicit operands for stack and accumulator architectures, and explicit operands for register architectures. It is assumed that A, B, and C all belong in memory and that the values of A and B cannot be destroyed.
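To make the difference between implicit and explicit operands concrete, here is a small sketch that walks through the accumulator and register-register sequences for C = A + B. The dictionary standing in for memory and the function names are assumptions of the sketch, not part of the slides.

```python
def accumulator_machine(mem):
    """C = A + B with a single, implicit accumulator."""
    acc = mem["A"]        # Load  A   (implicit destination: accumulator)
    acc = acc + mem["B"]  # Add   B   (accumulator is source and destination)
    mem["C"] = acc        # Store C
    return mem

def load_store_machine(mem):
    """C = A + B where the Add may only name registers."""
    r = {}
    r["R1"] = mem["A"]            # Load  R1, A
    r["R2"] = mem["B"]            # Load  R2, B
    r["R3"] = r["R1"] + r["R2"]   # Add   R3, R1, R2
    mem["C"] = r["R3"]            # Store R3, C
    return mem

m1 = accumulator_machine({"A": 2, "B": 3, "C": 0})
m2 = load_store_machine({"A": 2, "B": 3, "C": 0})
print(m1["C"], m2["C"])
```

Both versions leave A and B untouched, as the caption requires; the load-store version simply spends one more instruction to do so.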
Stack Architectures
Accumulator Architectures
Register-Set Architectures
Register-to-Register: Load-Store Architectures
Register-to-Memory Architectures
Memory-to-Memory Architectures
Instruction Formats
Instruction Set Architecture (ISA)
• To command a computer's hardware, you must speak its language
• The words of a machine's language are called instructions, and its vocabulary is called the instruction set
• Once you learn one machine language, it is easy to pick up others: there are few fundamental operations that all computers must provide
• All designers have the same goal of finding a language that simplifies building the hardware and the compiler while maximizing performance and minimizing cost
• Learning how instructions are represented leads to discovering the secret of computing: the stored-program concept
• The MIPS instruction set is used as a case study
Interface Design
A good interface:
• Lasts through many implementations (portability, compatibility)
• Is used in many different ways (generality)
• Provides convenient functionality to higher levels
• Permits an efficient implementation at lower levels
Design decisions must take into account:
• Technology
• Machine organization
• Programming languages
• Compiler technology
• Operating systems
[Figure: one interface with several uses above it and several implementations (imp) below it, changing over time.]
Classifying Instruction Set Architectures
Accumulator Architecture
• Common in early stored-program computers, when hardware was very expensive
• Machine has only one register (accumulator) involved in all math & logical operations
• All operations assume the accumulator as a source operand and a destination for the operation, with the other operand stored in memory
Extended Accumulator Architecture
• Dedicated registers for specific operations, e.g. stack and array index registers, are added
• The 8086 microprocessor is an example of such a special-purpose register architecture
General-Purpose Register Architecture
• MIPS is an example of such an architecture, where registers are not tied to a single role
• This type of instruction set can be further divided into:
  • Register-memory: allows one operand to be in memory
  • Register-register (load-store): demands all operands to be in registers

Machine | # general-purpose registers | Architecture style | Year
Motorola 6800 | 2 | Accumulator | 1974
DEC VAX | 16 | Register-memory, memory-memory | 1977
Intel 8086 | 1 | Extended accumulator | 1978
Motorola 68000 | 16 | Register-memory | 1980
Intel 80386 | 8 | Register-memory | 1985
PowerPC | 32 | Load-store | 1992
DEC Alpha | 32 | Load-store | 1992
Compact Code and Stack Architectures
• When memory is scarce, machines like the Intel 80x86 had variable-length instructions to match varying operand specifications and minimize code size
• Stack machines abandoned registers altogether, arguing that it is hard for compilers to use them efficiently
• Operands are to be pushed on a stack from memory, and the results have to be popped from the stack to memory
• Operations take their operands by default from the top of the stack and insert the results back onto the stack
• Stack machines simplify compilers and lend themselves to a compact instruction encoding, but limit compiler optimization (e.g. in math expressions)
• Example: A = B + C
    Push AddressC    # Top=Top+4; Stack[Top]=Memory[AddressC]
    Push AddressB    # Top=Top+4; Stack[Top]=Memory[AddressB]
    add              # Stack[Top-4]=Stack[Top]+Stack[Top-4]; Top=Top-4
    Pop AddressA     # Memory[AddressA]=Stack[Top]; Top=Top-4
• Compact code is important for heralded network computers, where programs must be downloaded over the Internet (e.g. Java-based applications)
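The A = B + C example can be traced with a few lines of Python. This is a sketch under simplifying assumptions: a Python list stands in for the hardware stack (so the explicit Top pointer arithmetic disappears), and memory is a dictionary keyed by variable name rather than by address.

```python
class StackMachine:
    def __init__(self, memory):
        self.memory = memory   # maps a variable name to its value
        self.stack = []        # the end of the list is the top of the stack

    def push(self, addr):      # Stack[Top] = Memory[addr]
        self.stack.append(self.memory[addr])

    def add(self):             # pop two operands, push their sum
        top = self.stack.pop()
        below = self.stack.pop()
        self.stack.append(top + below)

    def pop(self, addr):       # Memory[addr] = Stack[Top]
        self.memory[addr] = self.stack.pop()

m = StackMachine({"A": 0, "B": 4, "C": 6})
m.push("C")
m.push("B")
m.add()
m.pop("A")
print(m.memory["A"], len(m.stack))
```

Note that the add instruction names no operands at all, which is where the compact encoding comes from.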
Other types of Architecture
High-Level-Language Architecture
• In the 1960s, systems software was rarely written in high-level languages, and virtually every commercial operating system before Unix was written in assembly
• Some people blamed the code density on the instruction set rather than the programming language
• A machine design philosophy was advocated with the goal of making the hardware more like high-level languages
• The effectiveness of high-level languages, memory size limitations and the lack of efficient compilers doomed this philosophy to a historical footnote
Reduced Instruction Set Architecture
• With the recent development in compiler technology and expanded memory sizes, fewer programmers are using assembly level coding
• Instruction set architectures came to be measured by how well compilers, rather than programmers, use them
• RISC architecture favors simplifying hardware design over enriching the offered set of instructions, relying on compilers to use them effectively to perform complex operations
• Virtually all new architectures since 1982 follow the RISC philosophy of fixed instruction lengths, load-store operations, and limited addressing modes
Evolution of Instruction Sets
• Single Accumulator (EDSAC 1950)
• Accumulator + Index Registers (Manchester Mark I, IBM 700 series 1953)
• Separation of Programming Model from Implementation
  • High-level Language Based (B5000 1963)
  • Concept of a Family (IBM 360 1964)
• General Purpose Register Machines
  • Complex Instruction Sets (Vax, Intel 432 1977-80)
  • Load/Store Architecture (CDC 6600, Cray 1 1963-76)
• RISC (Mips, Sparc, HP-PA, IBM RS6000, ... 1987)
Register-Memory Architectures
Effect of the number of memory operands

# memory addresses | Max. number of operands | Examples
0 | 3 | SPARC, MIPS, PowerPC, ALPHA
1 | 2 | Intel 80x86, Motorola 68000
2 | 2 | VAX (also has 3 operands format)
3 | 3 | VAX (also has 2 operands format)
Memory Address
Interpreting Memory Addressing
• The address of a word matches the byte address of one of its 4 bytes
• The addresses of sequential words differ by 4 (word size in bytes)
• Words' addresses are multiples of 4 (alignment restriction)
• Machines that use the address of the leftmost byte as the word address are called "Big Endian", and those that use the rightmost byte are called "Little Endian"
• Misalignment complicates memory access and causes programs to run slower (some machines do not allow misaligned memory access at all)
• Byte ordering can be a problem when exchanging data among different machines
• Byte addressing affects array index calculation, which must account for word addressing and the offset within the word

Object addressed | Aligned at byte offsets | Misaligned at byte offsets
Byte | 0, 1, 2, 3, 4, 5, 6, 7 | Never
Half word | 0, 2, 4, 6 | 1, 3, 5, 7
Word | 0, 4 | 1, 2, 3, 5, 6, 7
Double word | 0 | 1, 2, 3, 4, 5, 6, 7
Addressing Modes
• Addressing modes refer to how the location of an operand (its effective address) is specified
• Addressing modes have the ability to:
  • Significantly reduce instruction counts
  • Increase the average CPI
  • Increase the complexity of building a machine
• The VAX machine is used for benchmark data since it supports a wide range of memory addressing modes
• Well-known addressing modes can be classified based on:
  • the source of the data, into register, immediate or memory
  • the address calculation, into direct and indirect
• An indexed addressing mode is usually provided to allow efficient implementation of loops and array access
Example of Addressing Modes

Addressing mode | Example | Meaning | When used
Register | Add R4, R3 | Regs[R4] ← Regs[R4] + Regs[R3] | When a value is in a register
Immediate | Add R4, #3 | Regs[R4] ← Regs[R4] + 3 | For constants
Displacement | Add R4, 100(R1) | Regs[R4] ← Regs[R4] + Mem[100 + Regs[R1]] | Accessing local variables
Register indirect | Add R4, (R1) | Regs[R4] ← Regs[R4] + Mem[Regs[R1]] | Accessing using a pointer or a computed address
Indexed | Add R4, (R1 + R2) | Regs[R4] ← Regs[R4] + Mem[Regs[R1] + Regs[R2]] | Sometimes useful in array addressing: R1 = base of the array; R2 = index amount
Direct or absolute | Add R4, (1001) | Regs[R4] ← Regs[R4] + Mem[1001] | Sometimes useful for accessing static data; address constant may need to be large
Memory indirect or memory deferred | Add R4, @(R3) | Regs[R4] ← Regs[R4] + Mem[Mem[Regs[R3]]] | If R3 is the address of the pointer p, then mode yields *p
Autoincrement | Add R4, (R2)+ | Regs[R4] ← Regs[R4] + Mem[Regs[R2]]; Regs[R2] ← Regs[R2] + d | Useful for stepping through arrays within a loop. R2 points to start of the array; each reference increments R2 by d
Autodecrement | Add R4, -(R2) | Regs[R2] ← Regs[R2] - d; Regs[R4] ← Regs[R4] + Mem[Regs[R2]] | Same use as autoincrement. Autodecrement/increment can also act as push/pop to implement a stack
Scaled | Add R4, 100(R2)[R3] | Regs[R4] ← Regs[R4] + Mem[100 + Regs[R2] + Regs[R3] × d] | Used to index arrays
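The "Meaning" column can be read as a recipe for computing an effective address. The sketch below implements that recipe for the memory-operand modes that have no side effects; the function name, the keyword parameters and the register and memory contents are assumptions made for the example, and d defaults to a 4-byte element size.

```python
def effective_address(mode, regs, mem, base=None, index=None, disp=0, d=4):
    """Return the memory address an operand refers to for a few modes."""
    if mode == "displacement":       # Mem[disp + Regs[base]]
        return disp + regs[base]
    if mode == "register_indirect":  # Mem[Regs[base]]
        return regs[base]
    if mode == "indexed":            # Mem[Regs[base] + Regs[index]]
        return regs[base] + regs[index]
    if mode == "direct":             # Mem[disp]
        return disp
    if mode == "memory_indirect":    # Mem[Mem[Regs[base]]]
        return mem[regs[base]]
    if mode == "scaled":             # Mem[disp + Regs[base] + Regs[index] * d]
        return disp + regs[base] + regs[index] * d
    raise ValueError(f"unsupported mode: {mode}")

regs = {"R1": 1000, "R2": 2000, "R3": 3}   # hypothetical register contents
mem = {1000: 5000}                         # hypothetical memory: a pointer at 1000
print(effective_address("displacement", regs, mem, base="R1", disp=100))
print(effective_address("scaled", regs, mem, base="R2", index="R3", disp=100))
```

Autoincrement and autodecrement are left out because they also update the base register, which a pure address function cannot express.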
Addressing Mode for Signal Processing
Fast Fourier Transform (bit-reversed order for 8 points)
0 (000) → 0 (000)
1 (001) → 4 (100)
2 (010) → 2 (010)
3 (011) → 6 (110)
4 (100) → 1 (001)
5 (101) → 5 (101)
6 (110) → 3 (011)
7 (111) → 7 (111)
Modulo addressing
• Since DSP deals with continuous data streams, circular buffers are widely used
• Circular or modulo addressing allows automatic increment and decrement, and resets the pointer when reaching the end of the buffer
Reverse addressing
• The resulting address is the bit-reversed order of the current address
• Reverse addressing mode expedites the access, which otherwise requires a number of logical instructions or extra memory accesses
• DSP offers special addressing modes to better serve popular algorithms
• Special features require either hand coding or a compiler that uses such features (GNU would not be a good choice)
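Both DSP addressing modes are simple to state in software, which also shows the "number of logical instructions" a processor without them has to spend. A minimal sketch, with function names and the buffer placement (start address 100, length 4) assumed for illustration:

```python
def next_circular(pointer, start, length, step=1):
    """Advance a buffer pointer, wrapping around at the end of the buffer."""
    return start + (pointer - start + step) % length

def bit_reverse(index, bits):
    """Reverse the lowest `bits` bits of index (e.g. 001 -> 100)."""
    result = 0
    for _ in range(bits):
        result = (result << 1) | (index & 1)   # move the lowest bit to the other end
        index >>= 1
    return result

order = [bit_reverse(i, 3) for i in range(8)]
print(order)
print(next_circular(103, start=100, length=4))
```

The list printed for order matches the 8-point FFT mapping shown on the slide.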
Operations of the Computer Hardware
"There must certainly be instructions for performing the fundamental arithmetic operations."
Burks, Goldstine and von Neumann, 1947
• Assembly language is a symbolic representation of what the processor actually understands
• The MIPS assembler allows only one instruction per line and ignores comments following # until the end of the line
Example:
Translation of a segment of a C program to MIPS assembly instructions
C: f = (g + h) - (i + j)
MIPS:
    add t0, g, h     # temp variable t0 contains "g + h"
    add t1, i, j     # temp variable t1 contains "i + j"
    sub f, t0, t1    # f = t0 - t1 = (g + h) - (i + j)
Operations in the Instruction Set

Operator type | Examples
Arithmetic and logical | Integer arithmetic and logical operations: add, and, subtract, or
Data transfer | Loads-stores (move instructions on machines with memory addressing)
Control | Branch, jump, procedure call and return, trap
System | Operating system call, virtual memory management instructions
Floating point | Floating point instructions: add, multiply
Decimal | Decimal add, decimal multiply, decimal to character conversion
String | String move, string compare, string search
Graphics | Pixel operations, compression/decompression operations

• Arithmetic, logical, data transfer and control are almost standard categories for all machines
• System instructions are required for multi-programming environments, although support for system functions varies
• Decimal and string instructions can be primitives, e.g. on the IBM 360 and the VAX
• Support for floating point, decimal, string and graphics can optionally be provided, sometimes via a co-processor
• Some machines rely on the compiler to synthesize special operations, such as string handling, from simpler instructions
Operations for Media & Signal Processing
• Single instruction multiple data (SIMD) and vector instructions are often supported in DSPs, which are commonly used in multimedia and signal processing applications
• Partitioned Add (integer)
  • Perform multiple 16-bit additions on a 64-bit ALU, since most data are narrow
  • Increases ALU throughput for multimedia applications
• Paired single operations (float)
  • Allow the same register to act as two operands to the same operation
  • Handy in dealing with vertices and coordinates
• Multiply and accumulate
  • Very handy for calculating dot products of vectors (signal processing) and matrix multiplication
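A partitioned add treats one 64-bit value as four independent 16-bit lanes, so a carry out of one lane must not spill into the next. The sketch below models that in plain Python, together with a multiply-accumulate loop; the wrap-around behaviour on overflow is an assumption of this sketch (real media instruction sets often offer saturating variants as well).

```python
LANES, LANE_BITS = 4, 16
LANE_MASK = (1 << LANE_BITS) - 1

def partitioned_add(a, b):
    """Add four 16-bit lanes of two 64-bit values; carries do not cross lanes."""
    result = 0
    for lane in range(LANES):
        shift = lane * LANE_BITS
        lane_sum = ((a >> shift) & LANE_MASK) + ((b >> shift) & LANE_MASK)
        result |= (lane_sum & LANE_MASK) << shift   # wrap around within the lane
    return result

def multiply_accumulate(xs, ys):
    """Dot product built from repeated acc = acc + x * y steps."""
    acc = 0
    for x, y in zip(xs, ys):
        acc += x * y
    return acc

total = partitioned_add(0x0001_FFFF_0003_0004, 0x0001_0001_0001_0001)
print(hex(total))
print(multiply_accumulate([1, 2, 3], [4, 5, 6]))
```

In the example, the lane holding 0xFFFF wraps to 0 without disturbing its neighbour, which an ordinary 64-bit add would not guarantee.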
Frequency of Operations Usage
• Make the common case fast by focusing on these operations
• The most widely executed instructions are the simple operations of an instruction set
• The following is the average usage in SPECint92 on Intel 80x86:

Rank | 80x86 instruction | Integer average (% total executed)
1 | Load | 22%
2 | Conditional branch | 20%
3 | Compare | 16%
4 | Store | 12%
5 | Add | 8%
6 | And | 6%
7 | Sub | 5%
8 | Move register-register | 4%
9 | Call | 1%
10 | Return | 1%
Total | | 96%
Control Flow Instructions
• Jump for unconditional change in the control flow
• Branch for conditional change in the control flow
• Procedure calls and returns
[Figure: relative frequency of the control flow instruction types.] Data is based on SPEC on Alpha
Destination Address Definition
• Relative addressing with respect to the program counter proved to be the best choice for forward and backward branches or jumps (load address independent)
• To allow for dynamic loading of library routines, register indirect addressing allows addresses to be loaded in special registers (e.g. virtual functions in C++ and system calls in a case statement)
Data is based on SPEC on Alpha
Condition Evaluation
• Compare & branch can be efficient if the majority of conditions are comparisons with zero
• Remember to focus on the common case
Based on SPEC on MIPS
Frequency of Types of Comparison
Data is based on SPEC on Alpha
• A different benchmark and machine set new design priorities
• DSPs support a repeat instruction for "for" loops (vectors) using registers
Supporting Procedures
Execution of a procedure follows these steps:
• Store parameters in a place accessible to the procedure
• Transfer control to the procedure
• Acquire the storage resources needed for the procedure
• Perform the desired task
• Store the result value in a place accessible to the calling program
• Return control to the point of origin
The hardware provides a program counter to trace instruction flow and manage transfer of control
Parameter Passing
• Registers can be used for passing a small number of parameters
• A stack is used to spill registers of the current context and make room for the called procedure to run, and to allow large parameters to be passed
• Storage of machine state can be performed by the caller or the callee
• Handling of shared variables is important to ensure correct semantics and thus requires clear specifications in the library interface
• Global variables stored in registers need careful handling
Type and Size of Operands
• The type of an operand is designated by encoding it in the instruction's operation code
• The type of an operand, e.g. single precision float, effectively gives its size
• Common operand types include character, half word and word size integer, and single- and double-precision floating point
• Characters are almost always in ASCII, integers are in 2's complement, and floating point is in IEEE 754
• The 16-bit Unicode, used in Java, is gaining popularity due to its support for international character sets
• For business applications, some architectures support a decimal format in binary coded decimal (BCD)
• Depending on the size of the word, the complexity of handling different operand types differs
• DSP offers fixed point data types to support high precision floating point arithmetic and to allow sharing a single exponent for multiple numbers
• For graphics applications, vertex and pixel operands are added features
Size of Operands
[Figure: frequency of reference by size, based on SPEC2000 on Alpha.]
• The double-word data type is used for double-precision floating point operations and for address storage in machines with a 64-bit wide address bus
• Words are used for integer operations and for 32-bit address bus machines
• Because of the mix in SPEC, word and double-word data types dominate
Instruction Representation
• Humans are taught to think in base 10 (decimal), but numbers may be represented in any base (123 in base 10 = 1111011 in binary, or base 2)
• Numbers are stored in computers as a series of high and low electronic signals (binary numbers)
• Binary digits are called bits and are considered the atom of computing
• Each piece of an instruction is a number, and placing these numbers together forms the instruction
• The assembler translates the assembly symbolic instructions into machine language instructions (machine code)
• Example:
    Assembly: add $t0, $s1, $s2
    M/C language (decimal): 0 | 17 | 18 | 8 | 0 | 32
    M/C language (binary): 000000 | 10001 | 10010 | 01000 | 00000 | 100000
    Field widths: 6 bits | 5 bits | 5 bits | 5 bits | 5 bits | 6 bits
Note: the MIPS compiler by default maps $s0,...,$s7 to reg. 16-23 and $t0,...,$t7 to reg. 8-15
Encoding an Instruction Set
• Instruction encoding affects the size of the compiled program and the complexity of the CPU implementation
• The operation is typically specified in one field, called the opcode
• The addressing mode for the operand can be encoded with the operation, or specified through a separate identifier when a large number of modes is supported
• The architecture must balance several competing factors:
  • Desire to support as many registers and addressing modes as possible
  • Effect of operand specification on the size of the instruction (program)
  • Desire to simplify instruction fetching and decoding during execution
• Fixed size instruction encoding simplifies the CPU design while limiting the addressing modes supported
• An architect caring about code size can use variable size encoding
• A hybrid approach is to allow variability by supporting multiple instruction sizes
Encoding Examples
MIPS Instruction format
Register-format instructions:
    op (6 bits) | rs (5 bits) | rt (5 bits) | rd (5 bits) | shamt (5 bits) | funct (6 bits)
• op: Basic operation of the instruction, traditionally called the opcode
• rs: The first register source operand
• rt: The second register source operand
• rd: The register destination operand; it gets the result of the operation
• shamt: Shift amount
• funct: This field selects the specific variant of the operation in the op field
Immediate-type instructions:
    op (6 bits) | rs (5 bits) | rt (5 bits) | address (16 bits)
• Some instructions need longer fields than provided, for large value constants
• The 16-bit address means a load word instruction can load a word within a region of ±2^15 bytes of the address in the base register
• Example: lw $t0, 32($s3)    # Temporary register $t0 gets A[8]

Instruction | Format | op | rs | rt | rd | shamt | funct | address
add | R | 0 | reg | reg | reg | 0 | 32 | N/A
sub | R | 0 | reg | reg | reg | 0 | 34 | N/A
lw | I | 35 | reg | reg | N/A | N/A | N/A | address
sw | I | 43 | reg | reg | N/A | N/A | N/A | address
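Packing the fields into a 32-bit word is just shifting and OR-ing. The sketch below encodes one R-format and one I-format instruction using the field widths and the op/funct values from the table above; the helper names encode_r and encode_i are made up for this illustration, and the register numbers follow the default mapping ($t0 = 8, $s0 to $s7 = 16 to 23).

```python
def encode_r(op, rs, rt, rd, shamt, funct):
    """Pack the six R-format fields (6, 5, 5, 5, 5, 6 bits) into a 32-bit word."""
    return (op << 26) | (rs << 21) | (rt << 16) | (rd << 11) | (shamt << 6) | funct

def encode_i(op, rs, rt, address):
    """Pack the four I-format fields (6, 5, 5, 16 bits) into a 32-bit word."""
    return (op << 26) | (rs << 21) | (rt << 16) | (address & 0xFFFF)

T0, S1, S2, S3 = 8, 17, 18, 19   # register numbers for $t0, $s1, $s2, $s3

add_word = encode_r(0, S1, S2, T0, 0, 32)   # add $t0, $s1, $s2
lw_word = encode_i(35, S3, T0, 32)          # lw  $t0, 32($s3)
print(f"{add_word:032b}")
print(f"{lw_word:#010x}")
```

Note that the destination register sits in the rd field for add but in the rt field for lw, since the I format has no rd field.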
The Stored Program Concept
• Learning how instructions are represented leads to discovering the secret of computing: the stored-program concept
• Today's computers are built on two key principles:
  • Instructions are represented as numbers
  • Programs can be stored in memory to be read or written just like numbers
• The power of the concept: memory can contain
  • the source code for an editor
  • the compiled m/c code for the editor
  • the text that the compiled program is using
  • the compiler that generated the code
[Figure: a processor connected to a memory that holds an accounting program (machine code), an editor program (machine code), a C compiler (machine code), payroll data, book text, and the source code in C for the editor program.]
Compiling if-then-else in MIPS
Assuming the five variables f, g, h, i, and j correspond to the five registers $s0 through $s4, what is the compiled MIPS code for the following C if statement?
    if (i == j) f = g + h; else f = g - h;
[Figure: flowchart testing i == j; the i = j path computes f = g + h, the i ≠ j path (Else:) computes f = g - h, and both paths meet at Exit:.]
MIPS:
          bne $s3, $s4, Else    # go to Else if i ≠ j
          add $s0, $s1, $s2     # f = g + h (skipped if i ≠ j)
          j Exit
    Else: sub $s0, $s1, $s2     # f = g - h (skipped if i = j)
    Exit:
Typical Compilation
Major Types of Optimization

Optimization name | Explanation | Frequency
High-level | At or near source level; machine independent |
Procedure integration | Replace procedure call by procedure body | N.M.
Local | Within straight line code |
Common subexpression elimination | Replace two instances of the same computation by a single copy | 18%
Constant propagation | Replace all instances of a variable that is assigned a constant with the constant | 22%
Stack height reduction | Rearrange expression tree to minimize resources needed for expression evaluation | N.M.
Global | Across a branch |
Global common subexpression elimination | Same as local, but this version crosses branches | 13%
Copy propagation | Replace all instances of a variable A that has been assigned X (i.e. A = X) with X | 11%
Code motion | Remove code from a loop that computes the same value each iteration of the loop | 16%
Induction variable elimination | Simplify/eliminate array addressing calculations within loops | 2%
Machine-dependent | Depends on machine knowledge |
Strength reduction | Many examples, such as replacing a multiply by a constant with adds and shifts | N.M.
Pipeline scheduling | Reorder instructions to improve pipeline performance | N.M.
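As a concrete instance of strength reduction, a multiply by the constant 10 can be replaced by two shifts and an add, because 10x = 8x + 2x. The sketch below only checks that the two forms agree; the function names are made up for the illustration, and whether the rewritten form is actually faster depends on the target machine.

```python
def times_ten_multiply(x):
    """Original form: one multiply by a constant."""
    return x * 10

def times_ten_shift_add(x):
    """Strength-reduced form: 10*x = 8*x + 2*x = (x << 3) + (x << 1)."""
    return (x << 3) + (x << 1)

checks = [times_ten_multiply(x) == times_ten_shift_add(x) for x in range(-50, 51)]
print(all(checks))
```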
Effect of Compiler Optimization
[Figure: effect of compiler optimization, by program and compiler optimization level.] Measurements taken on MIPS
• Level 0: non-optimized code
• Level 1: local optimization
• Level 2: global optimization, s/w pipelining
• Level 3: adds procedure integration
Compiler Support for Multimedia Instr.
• Intel's MMX and PowerPC AltiVec have small vector processing capabilities targeting multimedia applications (to speed up graphics)
• Intel added a new set of instructions, called Streaming SIMD Extension
• A major advantage of vector computers is hiding the latency of memory access by loading multiple elements and then overlapping execution with data transfer
• Vector computers typically have strided and/or gather/scatter addressing to perform operations on distant memory locations
• Strided addressing allows memory access in increments larger than one
• Gather/scatter addressing is similar to register indirect mode, where the addresses are stored instead of the data
• Supporting vector operations without strided addressing, as in Intel's MMX, limits the potential speedup
• Such limited support for vector processing makes the use of vectorizing compiler optimization unpopular and restricts its scope to hand coded routines
SIMD instructions on MMX and AltiVec tend to be solutions, not primitives
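Strided and gather/scatter accesses can be modelled on a flat list standing in for memory. This is a sketch of the access patterns only, with made-up function names and a 16-element memory; it says nothing about how a particular vector unit implements them.

```python
def strided_load(memory, base, stride, count):
    """Load `count` elements starting at `base`, `stride` locations apart."""
    return [memory[base + i * stride] for i in range(count)]

def gather(memory, addresses):
    """Load the elements whose addresses are listed in an index vector."""
    return [memory[a] for a in addresses]

def scatter(memory, addresses, values):
    """Store each value at the corresponding address of the index vector."""
    for a, v in zip(addresses, values):
        memory[a] = v

memory = list(range(100, 116))          # 16 locations holding 100..115
column = strided_load(memory, base=1, stride=4, count=4)
picked = gather(memory, [0, 5, 15])
scatter(memory, [0, 5, 15], [-1, -2, -3])
print(column, picked, memory[5])
```

With a stride of 4, the strided load picks out one column of a 4x4 matrix stored row by row, which is the typical case that unit-stride-only hardware handles poorly.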
Starting a Program
C program -> Compiler -> Assembly language program -> Assembler -> Object: machine language module (plus Object: library routine, machine language) -> Linker -> Executable: machine language program -> Loader -> Memory
Linker
- Place code and data modules symbolically in memory
- Determine the addresses of data and instruction labels
- Patch both internal and external references
Object files for Unix typically contain:
- Header: size and position of components
- Text segment: machine code
- Data segment: static and dynamic variables
- Relocation info: identifies absolute memory references
- Symbol table: names and locations of labels, procedures, and variables
- Debugging info: mapping of source to object code, breakpoints, etc.
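The three linker steps listed above (place modules, determine label addresses, patch references) can be sketched as a toy model. Everything here is an assumption for illustration: the module dictionaries, the `("ref", label)` placeholder, and the function name `link` are hypothetical and do not mirror any real object-file format.

```python
def link(modules, base=0):
    """Toy linker: returns (memory image, symbol table)."""
    # Step 1: place modules one after another, starting at `base`.
    placement, address = {}, base
    for module in modules:
        placement[module["name"]] = address
        address += len(module["code"])

    # Step 2: determine the absolute address of every label.
    symbols = {}
    for module in modules:
        for label, offset in module["labels"].items():
            symbols[label] = placement[module["name"]] + offset

    # Step 3: patch internal and external references.
    image = []
    for module in modules:
        for word in module["code"]:
            if isinstance(word, tuple):      # ("ref", label) placeholder
                image.append(symbols[word[1]])
            else:
                image.append(word)
    return image, symbols


main = {"name": "main", "code": [10, ("ref", "helper"), ("ref", "data")],
        "labels": {"main": 0}}
lib = {"name": "lib", "code": [20, 30], "labels": {"helper": 0, "data": 1}}
image, symbols = link([main, lib], base=100)
```

In this model the relocation info is the set of `("ref", label)` placeholders, and the symbol table is the `labels` dictionary of each module.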
Loading Executable Program
Memory layout (addresses in hex):
$sp -> 7fff ffff   Stack
                   Dynamic data
$gp -> 1000 8000   Static data (starts at 1000 0000)
pc  -> 0040 0000   Text
       0           Reserved
To load an executable, the operating system follows these steps:
1. Reads the executable file header to determine the size of the text and data segments
2. Creates an address space large enough for the text and data
3. Copies the instructions and data from the executable file into memory
4. Copies the parameters (if any) to the main program onto the stack
5. Initializes the machine registers and sets the stack pointer to the first free location
6. Jumps to a start-up routine that copies the parameters into the argument registers and calls the main routine of the program
Instruction Set Design Issues
- Number of Addresses
- Flow of Control
- Operand Types
- Addressing Modes
- Instruction Types
- Instruction Formats
Number of Addresses
Four categories
- 3-address machines
  - 2 for the source operands and one for the result
- 2-address machines
  - One address doubles as source and result
- 1-address machines
  - Accumulator machines
  - Accumulator is used for one source and result
- 0-address machines
  - Stack machines
  - Operands are taken from the stack
  - Result goes onto the stack
Number of Addresses (cont.)
Three-address machines
- Two for the source operands, one for the result
- RISC processors use three addresses
Sample instructions
add dest,src1,src2 ; M(dest)=[src1]+[src2]
sub dest,src1,src2 ; M(dest)=[src1]-[src2]
mult dest,src1,src2 ; M(dest)=[src1]*[src2]
Three addresses: Operand 1, Operand 2, Result
Example: a = b + c
Three-address instruction formats are not common because they require a relatively long instruction format to hold the three address references.
Number of Addresses (cont.)
Example
C statement
A = B + C * D - E + F + A
Equivalent code:
mult T,C,D ; T = C*D
add T,T,B ; T = B+C*D
sub T,T,E ; T = B+C*D-E
add T,T,F ; T = B+C*D-E+F
add A,T,A ; A = B+C*D-E+F+A
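To check the three-address sequence above, here is a minimal interpreter sketch. The tuple encoding of instructions, the function name `run_three_address`, and the sample values for A to F are assumptions made for this example.

```python
import operator

OPS = {"add": operator.add, "sub": operator.sub, "mult": operator.mul}


def run_three_address(program, memory):
    """Each instruction names a destination and two sources in memory."""
    for op, dest, src1, src2 in program:
        memory[dest] = OPS[op](memory[src1], memory[src2])
    return memory


program = [
    ("mult", "T", "C", "D"),   # T = C*D
    ("add", "T", "T", "B"),    # T = B+C*D
    ("sub", "T", "T", "E"),    # T = B+C*D-E
    ("add", "T", "T", "F"),    # T = B+C*D-E+F
    ("add", "A", "T", "A"),    # A = B+C*D-E+F+A
]
memory = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6, "T": 0}
run_three_address(program, memory)
```

With these sample values the C statement evaluates to 2 + 3*4 - 5 + 6 + 1.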
Number of Addresses (cont.)
Two-address machines
- One address doubles (for source operand and result)
- Last example makes a case for it
  - Address T is used twice
Sample instructions
load dest,src ; M(dest)=[src]
add dest,src ; M(dest)=[dest]+[src]
sub dest,src ; M(dest)=[dest]-[src]
mult dest,src ; M(dest)=[dest]*[src]
Two addresses: one address doubles as operand and result
Example: a = a + b
The two-address format reduces the space requirement but also introduces some awkwardness. To avoid altering the value of an operand, a MOVE instruction is used to move one of the values to a result or temporary location before performing the operation.
Number of Addresses (cont.)
Example
C statement
A = B + C * D - E + F + A
Equivalent code:
load T,C ; T = C
mult T,D ; T = C*D
add T,B ; T = B+C*D
sub T,E ; T = B+C*D-E
add T,F ; T = B+C*D-E+F
add A,T ; A = B+C*D-E+F+A
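The two-address sequence above can be traced with a small sketch in which the destination doubles as the first source. The instruction encoding, the name `run_two_address`, and the sample values are assumptions for illustration.

```python
import operator

OPS = {"add": operator.add, "sub": operator.sub, "mult": operator.mul}


def run_two_address(program, memory):
    """dest is both the first source operand and the result location."""
    for op, dest, src in program:
        if op == "load":
            memory[dest] = memory[src]
        else:
            memory[dest] = OPS[op](memory[dest], memory[src])
    return memory


program = [
    ("load", "T", "C"),   # T = C
    ("mult", "T", "D"),   # T = C*D
    ("add", "T", "B"),    # T = B+C*D
    ("sub", "T", "E"),    # T = B+C*D-E
    ("add", "T", "F"),    # T = B+C*D-E+F
    ("add", "A", "T"),    # A = B+C*D-E+F+A
]
memory = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6, "T": 0}
run_two_address(program, memory)
```

Note the extra `load`: because every arithmetic instruction overwrites its destination, C must first be copied into the temporary T.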
Number of Addresses (cont.)
One-address machines
- Use special set of registers called accumulators
  - Specify one source operand and receive the result
- Called accumulator machines
Sample instructions
load addr ; accum = [addr]
store addr ; M[addr] = accum
add addr ; accum = accum + [addr]
sub addr ; accum = accum - [addr]
mult addr ; accum = accum * [addr]
Number of Addresses (cont.)
Example
C statement
A = B + C * D - E + F + A
Equivalent code:
load C ; load C into accum
mult D ; accum = C*D
add B ; accum = C*D+B
sub E ; accum = B+C*D-E
add F ; accum = B+C*D-E+F
add A ; accum = B+C*D-E+F+A
store A ; store accum contents in A
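A hedged sketch of an accumulator machine running the sequence above. The single implicit register is modeled by the local variable `accum`; the instruction encoding, the name `run_accumulator`, and the sample values are assumptions for this example.

```python
import operator

OPS = {"add": operator.add, "sub": operator.sub, "mult": operator.mul}


def run_accumulator(program, memory):
    """Every instruction names one address; the accumulator is implicit."""
    accum = 0
    for op, addr in program:
        if op == "load":
            accum = memory[addr]
        elif op == "store":
            memory[addr] = accum
        else:
            accum = OPS[op](accum, memory[addr])
    return accum


program = [("load", "C"), ("mult", "D"), ("add", "B"), ("sub", "E"),
           ("add", "F"), ("add", "A"), ("store", "A")]
memory = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
final_accum = run_accumulator(program, memory)
```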
Number of Addresses (cont.)
Zero-address machines
- Stack supplies operands and receives the result
  - Special instructions to load and store use an address
- Called stack machines (Ex: HP3000, Burroughs B5500)
Sample instructions
push addr ; push([addr])
pop addr ; pop([addr])
add ; push(pop + pop)
sub ; push(pop - pop)
mult ; push(pop * pop)
Number of Addresses (cont.)
Example
C statement
A = B + C * D - E + F + A
Equivalent code:
push E
push C
push D
mult
push B
add
sub
push F
add
push A
add
pop A
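The stack sequence above can be traced with the following sketch, which follows the slide's convention that `sub` computes (first pop) minus (second pop). The tuple encoding, the name `run_stack`, and the sample values are assumptions for illustration.

```python
import operator

OPS = {"add": operator.add, "sub": operator.sub, "mult": operator.mul}


def run_stack(program, memory):
    """Zero-address machine: only push and pop name a memory address."""
    stack = []
    for instr in program:
        op = instr[0]
        if op == "push":
            stack.append(memory[instr[1]])
        elif op == "pop":
            memory[instr[1]] = stack.pop()
        else:
            first = stack.pop()      # top of stack
            second = stack.pop()     # next element down
            stack.append(OPS[op](first, second))
    return stack


program = [("push", "E"), ("push", "C"), ("push", "D"), ("mult",),
           ("push", "B"), ("add",), ("sub",), ("push", "F"), ("add",),
           ("push", "A"), ("add",), ("pop", "A")]
memory = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
leftover = run_stack(program, memory)
```

E is pushed first so that it sits below B+C*D when `sub` executes, which yields B+C*D-E rather than its negation.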
Load/Store Architecture
- Instructions expect operands in internal processor registers
  - Special LOAD and STORE instructions move data between registers and memory
- RISC uses this architecture
- Reduces instruction length
Load/Store Architecture (cont.)
Sample instructions
load $d,addr ; $d = [addr]
store addr,$s ; (addr) = $s
add $d,$s1,$s2 ; $d = $s1 + $s2
sub $d,$s1,$s2 ; $d = $s1 - $s2
mult $d,$s1,$s2 ; $d = $s1 * $s2
Number of Addresses (cont.)
Example
C statement
A = B + C * D - E + F + A
Equivalent code:
load $1,B
load $2,C
load $3,D
load $4,E
load $5,F
load $6,A
mult $2,$2,$3
add $2,$2,$1
sub $2,$2,$4
add $2,$2,$5
add $2,$2,$6
store A,$2
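A hedged sketch of the load/store version: arithmetic touches only registers, and memory is reached only through `load` and `store`. The register numbering ($1 to $6), the instruction encoding, the name `run_load_store`, and the sample values are assumptions for this example.

```python
import operator

OPS = {"add": operator.add, "sub": operator.sub, "mult": operator.mul}


def run_load_store(program, memory):
    """Only load/store access memory; other instructions use registers."""
    regs = {}
    for op, *args in program:
        if op == "load":
            dest, addr = args
            regs[dest] = memory[addr]
        elif op == "store":
            addr, src = args
            memory[addr] = regs[src]
        else:
            dest, src1, src2 = args
            regs[dest] = OPS[op](regs[src1], regs[src2])
    return regs


program = [
    ("load", "$1", "B"), ("load", "$2", "C"), ("load", "$3", "D"),
    ("load", "$4", "E"), ("load", "$5", "F"), ("load", "$6", "A"),
    ("mult", "$2", "$2", "$3"),   # $2 = C*D
    ("add", "$2", "$2", "$1"),    # $2 = B+C*D
    ("sub", "$2", "$2", "$4"),    # $2 = B+C*D-E
    ("add", "$2", "$2", "$5"),    # $2 = B+C*D-E+F
    ("add", "$2", "$2", "$6"),    # $2 = B+C*D-E+F+A
    ("store", "A", "$2"),
]
memory = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
regs = run_load_store(program, memory)
```

The sequence is longer than the three-address memory version, but each arithmetic instruction only needs short register fields.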
Flow of Control
- Default is sequential flow
- Several instructions alter this default execution
  - Branches
    - Unconditional
    - Conditional
    - Delayed branches
  - Procedure calls
    - Delayed procedure calls
Flow of Control (cont.)
Branches
- Unconditional
  - Absolute address
  - PC-relative
    - Target address is specified relative to PC contents
    - Relocatable code
- Example: MIPS
  - Absolute address
    j target
  - PC-relative
    b target
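The difference between the two target calculations, and why PC-relative code is relocatable, can be sketched as follows. This is a simplified model: it ignores real MIPS encoding details such as offset scaling, and the function names are hypothetical.

```python
def absolute_target(address: int) -> int:
    """Absolute branch: the instruction carries the target address itself."""
    return address


def pc_relative_target(pc: int, offset: int) -> int:
    """PC-relative branch: the instruction carries only a signed offset."""
    return pc + offset


def distance_from_load_base(load_base: int) -> int:
    """Where a branch placed 8 bytes into the code lands, relative to the base."""
    branch_address = load_base + 8
    return pc_relative_target(branch_address, 16) - load_base


# The same offset reaches the same instruction wherever the code is loaded.
at_low = distance_from_load_base(0x1000)
at_high = distance_from_load_base(0x4000)
```

An absolute target, by contrast, would have to be patched if the code moved from 0x1000 to 0x4000.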
Flow of Control (cont.)
Branches
- Conditional
  - Jump is taken only if the condition is met
- Two types
  - Set-Then-Jump
    - Condition testing is separated from branching
    - Condition code registers are used to convey the condition test result
    - Condition code registers keep a record of the status of the last ALU operation, such as the overflow condition
  - Example: Pentium code
    cmp AX,BX ; compare AX and BX
    je target ; jump if equal
Flow of Control (cont.)
  - Test-and-Jump
    - Single instruction performs condition testing and branching
  - Example: MIPS instruction
    beq $src1,$src2,target
    Jumps to target if $src1 = $src2
Delayed branching
- Control is transferred after executing the instruction that follows the branch instruction
  - This instruction slot is called the delay slot
- Improves efficiency
- Highly pipelined RISC processors support it
Flow of Control (cont.)
Procedure calls
- Facilitate modular programming
- Require two pieces of information to return
  - End of procedure
    - Pentium uses the ret instruction
    - MIPS uses the jr instruction
  - Return address
    - In a (special) register: MIPS allows any general-purpose register
    - On the stack: Pentium
Flow of Control (cont.)
[Figure: delay slot]
Parameter Passing
Two basic techniques
- Register-based (e.g., PowerPC, MIPS)
  - Internal registers are used
    - Faster
    - Limit the number of parameters
    - Recursive procedure
- Stack-based (e.g., Pentium)
  - Stack is used
    - More general
2 Operand Types
Instructions support basic data types
- Characters
- Integers
- Floating-point
Instruction overload
- Same instruction for different data types
- Example: Pentium
  mov AL,address ; loads an 8-bit value
  mov AX,address ; loads a 16-bit value
  mov EAX,address ; loads a 32-bit value
Operand Types (cont.)
Separate instructions
- Instructions specify the operand size
- Example: MIPS
  lb $dest,address ; loads a byte
  lh $dest,address ; loads a halfword (16 bits)
  lw $dest,address ; loads a word (32 bits)
  ld $dest,address ; loads a doubleword (64 bits)
- Similar instructions for store
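The effect of size-specific loads can be sketched by reading 1, 2, 4, or 8 bytes from the same address. This is a model only: the helper name `load`, the big-endian byte order, and the sample memory contents are assumptions for this example (MIPS implementations can run in either byte order).

```python
def load(memory: bytes, address: int, size: int, signed: bool = True) -> int:
    """Read `size` bytes at `address` and interpret them as one integer."""
    chunk = memory[address:address + size]
    return int.from_bytes(chunk, byteorder="big", signed=signed)


memory = bytes([0x12, 0x34, 0x56, 0x78, 0x9A, 0xBC, 0xDE, 0xF0])

byte_value = load(memory, 0, 1)         # like lb: one byte
halfword_value = load(memory, 0, 2)     # like lh: 16 bits
word_value = load(memory, 0, 4)         # like lw: 32 bits
doubleword_value = load(memory, 0, 8)   # like ld: 64 bits

# A signed byte load sign-extends; an unsigned one does not.
negative = load(bytes([0xFF]), 0, 1)
unsigned = load(bytes([0xFF]), 0, 1, signed=False)
```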
3 Addressing Modes
How the operands are specified
- Operands can be in three places
  - Registers
    - Register addressing mode
  - Part of instruction
    - Constant
    - Immediate addressing mode
    - All processors support these two addressing modes
  - Memory
    - Difference between RISC and CISC
    - CISC supports a large variety of addressing modes
    - RISC follows load/store architecture
4 Instruction Types
Several types of instructions
- Data movement
  - Pentium: mov dest,src
  - Some do not provide direct data movement instructions
  - Indirect data movement
    add $dest,$src,0 ; $dest = $src+0
- Arithmetic and Logical
  - Arithmetic
    - Integer and floating-point, signed and unsigned
    - add, subtract, multiply, divide
  - Logical
    - and, or, not, xor
Instruction Types (cont.)
Condition code bits
- S: Sign bit (0 = +, 1 = -)
- Z: Zero bit (0 = nonzero, 1 = zero)
- O: Overflow bit (0 = no overflow, 1 = overflow)
- C: Carry bit (0 = no carry, 1 = carry)
Example: Pentium
cmp count,25 ; compare count to 25
             ; subtract 25 from count
je target ; jump if equal
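How a compare sets the four condition code bits can be sketched by performing the subtraction on fixed-width values and discarding the result. The function names `compare` and `je_taken`, the 32-bit default width, and the dictionary of flags are assumptions for illustration; the carry bit is modeled as the borrow out of the unsigned subtraction.

```python
def compare(a: int, b: int, bits: int = 32) -> dict:
    """Subtract b from a and return the condition code bits only."""
    mask = (1 << bits) - 1
    sign_bit = 1 << (bits - 1)
    a &= mask
    b &= mask
    result = (a - b) & mask
    return {
        "S": int(bool(result & sign_bit)),   # result is negative
        "Z": int(result == 0),               # result is zero
        "C": int(a < b),                     # unsigned borrow occurred
        # Signed overflow: operands differ in sign and the result's
        # sign differs from the first operand's sign.
        "O": int(bool((a ^ b) & (a ^ result) & sign_bit)),
    }


def je_taken(flags: dict) -> bool:
    """A jump-if-equal tests only the zero bit."""
    return flags["Z"] == 1


equal = compare(25, 25)
smaller = compare(3, 25)
overflow = compare(-2**31, 1)
```

This separation is what the slides call Set-Then-Jump: `compare` records the status, and a later instruction such as `je` reads it.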
Instruction Types (cont.)
Flow control and I/O instructions
- Branch
- Procedure call
- Interrupts
I/O instructions
- Memory-mapped I/O
  - Most processors support memory-mapped I/O
  - No separate instructions for I/O
- Isolated I/O
  - Pentium supports isolated I/O
  - Separate I/O instructions
    in AX,io_port ; read from an I/O port
    out io_port,AX ; write to an I/O port
5 Instruction Formats
Two types
- Fixed-length
  - Used by RISC processors
  - 32-bit RISC processors use 32-bit wide instructions
    - Examples: SPARC, MIPS, PowerPC
- Variable-length
  - Used by CISC processors
  - Memory operands need more bits to specify
Opcode
- Major and exact operation
Examples of Instruction Formats
[Figure: examples of instruction formats]
RISC (Reduced Instruction Set Computer) versus CISC (Complex Instruction Set Computer)
RISC vs. CISC
- The underlying philosophy of RISC machines is that a system is better able to manage program execution when the program consists of only a few different instructions that are the same length and require the same number of clock cycles to decode and execute.
- RISC systems access memory only with explicit load and store instructions.
- In CISC systems, many different kinds of instructions access memory, making instruction length variable and fetch-decode-execute time unpredictable.
RISC vs. CISC
- The difference between CISC and RISC becomes evident through the basic computer performance equation:
  time/program = (time/cycle) × (cycles/instruction) × (instructions/program)
- RISC systems shorten execution time by reducing the clock cycles per instruction.
- CISC systems improve performance by reducing the number of instructions per program.
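The performance equation referred to above is commonly written as time per program = time per cycle × cycles per instruction × instructions per program. The sketch below assumes that form; the function name and all numbers are hypothetical and chosen only to show the trade-off, not measured on any machine.

```python
def execution_time_ns(instructions: int, cycles_per_instruction: int,
                      cycle_time_ns: int) -> int:
    """time/program = instructions/program * cycles/instruction * time/cycle."""
    return instructions * cycles_per_instruction * cycle_time_ns


# Hypothetical CISC program: fewer instructions, more cycles each.
cisc_time = execution_time_ns(instructions=100, cycles_per_instruction=4,
                              cycle_time_ns=10)

# Hypothetical RISC program: more instructions, one cycle each.
risc_time = execution_time_ns(instructions=160, cycles_per_instruction=1,
                              cycle_time_ns=10)
```

With these made-up figures the RISC program runs more instructions yet finishes sooner, because the reduction in cycles per instruction outweighs the increase in instruction count.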
RISC vs. CISC
- The simple instruction set of RISC machines enables control units to be hardwired for maximum speed.
- The more complex, and variable, instruction set of CISC machines requires microcode-based control units that interpret instructions as they are fetched from memory. This translation takes time.
- With fixed-length instructions, RISC lends itself to pipelining and speculative execution.
RISC vs. CISC
Consider the program fragments:
CISC:
mov ax, 10
mov bx, 5
mul bx, ax
RISC:
mov ax, 0
mov bx, 10
mov cx, 5
Begin: add ax, bx
loop Begin
The total clock cycles for the CISC version might be:
(2 movs × 1 cycle) + (1 mul × 30 cycles) = 32 cycles
While the clock cycles for the RISC version is:
(3 movs × 1 cycle) + (5 adds × 1 cycle) + (5 loops × 1 cycle) = 13 cycles
With the RISC clock cycle being shorter, RISC gives us much faster execution speeds.
RISC vs. CISC
- Because of their load-store ISAs, RISC architectures require a large number of CPU registers.
- These registers provide fast access to data during sequential program execution.
- They can also be employed to reduce the overhead typically caused by passing parameters to subprograms.
- Instead of pulling parameters off of a stack, the subprogram is directed to use a subset of registers.
RISC vs. CISC
[Figure: overlapping register windows]
- This is how registers can be overlapped in a RISC system.
- The current window pointer (CWP) points to the active register window.
RISC vs. CISC
- It is becoming increasingly difficult to distinguish RISC architectures from CISC architectures.
- Some RISC systems provide more extravagant instruction sets than some CISC systems.
- Some systems combine both approaches.
- The following two slides summarize the characteristics that traditionally typify the differences between these two architectures.
RISC vs. CISC
RISC
- Multiple register sets.
- Three operands per instruction.
- Parameter passing through register windows.
- Single-cycle instructions.
- Hardwired control.
- Highly pipelined.
CISC
- Single register set.
- One or two register operands per instruction.
- Parameter passing through memory.
- Multiple cycle instructions.
- Microprogrammed control.
- Less pipelined.
Continued...
RISC vs. CISC
RISC
- Simple instructions, few in number.
- Fixed length instructions.
- Complexity in compiler.
- Only LOAD/STORE instructions access memory.
- Few addressing modes.
CISC
- Many complex instructions.
- Variable length instructions.
- Complexity in microcode.
- Many instructions can access memory.
- Many addressing modes.
Summary
Instruction Set Design Issues
Instruction set design issues include:
- Where are operands stored?
  - registers, memory, stack, accumulator
- How many explicit operands are there?
  - 0, 1, 2, or 3
- How is the operand location specified?
  - register, immediate, indirect, ...
- What type and size of operands are supported?
  - byte, int, float, double, string, vector, ...
- What operations are supported?
  - add, sub, mul, move, compare, ...
More About General Purpose Registers
Why do almost all new architectures use GPRs?
- Registers are much faster than memory (even cache)
  - Register values are available immediately
  - When memory isn't ready, the processor must wait ("stall")
- Registers are convenient for variable storage
  - Compiler assigns some variables just to registers
  - More compact code, since small fields specify registers (compared to memory addresses)
[Figure: Processor, Registers, Cache, Memory, Disk]
What Operations are Needed
- Arithmetic + Logical
  - Integer arithmetic: ADD, SUB, MULT, DIV, SHIFT
  - Logical operation: AND, OR, XOR, NOT
- Data Transfer: copy, load, store
- Control: branch, jump, call, return
- Floating Point: ADDF, MULF, DIVF
  - Same as arithmetic but usually take bigger operands
- Decimal: ADDD, CONVERT
- String: move, compare, search
- Graphics: pixel and vertex, compression/decompression operations
Stack Architecture Pros and Cons
Pros
- Good code density (implicit top of stack)
- Low hardware requirements
- Easy to write a simpler compiler for stack architectures
Cons
- Stack becomes the bottleneck
- Little ability for parallelism or pipelining
- Data is not always at the top of the stack when needed, so additional instructions like TOP and SWAP are needed
- Difficult to write an optimizing compiler for stack architectures
Accumulator Architecture Pros and Cons
Pros
- Very low hardware requirements
- Easy to design and understand
Cons
- Accumulator becomes the bottleneck
- Little ability for parallelism or pipelining
- High memory traffic
Memory-Memory Architecture Pros and Cons
Pros
- Requires fewer instructions (especially if 3 operands)
- Easy to write compilers for (especially if 3 operands)
Cons
- Very high memory traffic (especially if 3 operands)
- Variable number of clocks per instruction
- With two operands, more data movements are required
Memory-Register Architecture Pros and Cons
Pros
- Some data can be accessed without loading first
- Instruction format easy to encode
- Good code density
Cons
- Operands are not equivalent (poor orthogonality)
- Variable number of clocks per instruction
- May limit number of registers
Instruction and Data Memory: Unified or Separate
[Figure: programmer's view (computer program: instructions such as ADD, SUBTRACT, AND, OR, COMPARE, ...) and computer's view (CPU, memory, I/O)]
Princeton (Von Neumann) Architecture
- Data and instructions mixed in same unified memory
- Program as data
- Storage utilization
- Single memory interface
Harvard Architecture
- Data and instructions in separate memories
- Has advantages in certain high performance implementations
- Can optimize each memory
Classifying Instruction Set Architectures
There are four types of internal storage used by the processor to store operands explicitly and implicitly for execution of a program:
1. Stack
2. Accumulator
3. Set of registers (register-memory)
4. Set of registers (register-register/load-store)
The operands in a stack architecture are implicitly on the top of the stack, and in an accumulator architecture one operand is implicitly the accumulator. The general-purpose register architectures (register-memory and register-register) have only explicit operands, either in registers or memory locations.
Basic Addressing Classes
[Figure: basic addressing classes; declining cost of registers]
Operand locations for four instruction set architecture classes
The arrows indicate whether the operand is an input or the result of the ALU operation, or both an input and result. Lighter shades indicate inputs, and the dark shade indicates the result. In (a), a Top Of Stack register (TOS) points to the top input operand, which is combined with the operand below. The first operand is removed from the stack, the result takes the place of the second operand, and TOS is updated to point to the result. All operands are implicit. In (b), the Accumulator is both an implicit input operand and a result. In (c), one input operand is a register, one is in memory, and the result goes to a register. All operands are registers in (d) and, like the stack architecture, can be transferred to memory only via separate instructions: push or pop for (a) and load or store for (d).
Code Sequence for C = A + B
Stack        Accumulator   Register-memory   Register-register (load-store)
Push A       Load A        Load R1,A         Load R1,A
Push B       Add B         Add R3,R1,B       Load R2,B
Add          Store C       Store R3,C        Add R3,R1,R2
Pop C                                        Store R3,C
The code sequence for C = A + B for four classes of instruction sets. Note that the Add instruction has implicit operands for stack and accumulator architectures, and explicit operands for register architectures. It is assumed that A, B, and C all belong in memory and that the values of A and B cannot be destroyed.
Stack Architectures
Accumulator Architectures
Register-Set Architectures
Register-to-Register: Load-Store Architectures
Register-to-Memory Architectures
Memory-to-Memory Architectures
Instruction Formats
Instruction Set Architecture (ISA)
- To command a computer's hardware, you must speak its language
- The words of a machine's language are called instructions, and its vocabulary is called the instruction set
- Once you learn one machine language, it is easy to pick up others:
  - There are few fundamental operations that all computers must provide
  - All designers have the same goal of finding a language that simplifies building the hardware and the compiler while maximizing performance and minimizing cost
- Learning how instructions are represented leads to discovering the secret of computing: the stored-program concept
- The MIPS instruction set is used as a case study
Interface Design
A good interface:
- Lasts through many implementations (portability, compatibility)
- Is used in many different ways (generality)
- Provides convenient functionality to higher levels
- Permits an efficient implementation at lower levels
Design decisions must take into account:
- Technology
- Machine organization
- Programming languages
- Compiler technology
- Operating systems
[Figure: a single interface with several uses above it and several implementations below it, over time]
Classifying Instruction Set Architectures
Accumulator Architecture
- Common in early stored-program computers, when hardware was so expensive
- Machine has only one register (accumulator) involved in all math and logical operations
- All operations assume the accumulator as a source operand and a destination for the operation, with the other operand stored in memory
Extended Accumulator Architecture
- Dedicated registers for specific operations, e.g. stack and array index registers, added
- The 8086 microprocessor is an example of such a special-purpose register architecture
General-Purpose Register Architecture
- MIPS is an example of such an architecture, where registers are not tied to a single role
- This type of instruction set can be further divided into:
  - Register-memory: allows for one operand to be in memory
  - Register-register (load-store): demands all operands to be in registers
Machine          # general-purpose registers   Architecture style                Year
Motorola 6800    2                             Accumulator                       1974
DEC VAX          16                            Register-memory, memory-memory    1977
Intel 8086       1                             Extended accumulator              1978
Motorola 68000   16                            Register-memory                   1980
Intel 80386      8                             Register-memory                   1985
PowerPC          32                            Load-store                        1992
DEC Alpha        32                            Load-store                        1992
Compact Code and Stack Architectures
• When memory is scarce, machines like the Intel 80x86 had variable-length instructions to match varying operand specifications and minimize code size
• Stack machines abandoned registers altogether, arguing that it is hard for compilers to use them efficiently
• Operands are pushed on a stack from memory and the results have to be popped from the stack to memory
• Operations take their operands by default from the top of the stack and insert the results back onto the stack
• Stack machines simplify compilers and lend themselves to a compact instruction encoding, but limit compiler optimization (e.g. in math expressions)
Example: A = B + C
Push AddressC   # Top=Top+4; Stack[Top]=Memory[AddressC]
Push AddressB   # Top=Top+4; Stack[Top]=Memory[AddressB]
add             # Stack[Top-4]=Stack[Top]+Stack[Top-4]; Top=Top-4
Pop AddressA    # Memory[AddressA]=Stack[Top]; Top=Top-4
• Compact code is important for the heralded network computers, where programs must be downloaded over the Internet (e.g. Java-based applications)
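The push/add/pop trace in the example can be followed step by step with a small simulation. This is a minimal sketch, not any real machine's semantics: the 4-byte slot size follows the Top+4 in the example, while the memory addresses (0, 4, 8), the dictionary-based memory and the name run_stack_example are assumptions made for illustration.

```python
WORD = 4  # bytes per stack slot, matching Top = Top + 4 in the example


def run_stack_example(memory, addr_a, addr_b, addr_c):
    """Trace Push C, Push B, add, Pop A on a tiny stack machine."""
    stack = {}  # stack storage indexed by byte offset (illustrative)
    top = 0     # offset of the current top-of-stack slot

    # Push AddressC: Top = Top + 4; Stack[Top] = Memory[AddressC]
    top += WORD
    stack[top] = memory[addr_c]
    # Push AddressB: Top = Top + 4; Stack[Top] = Memory[AddressB]
    top += WORD
    stack[top] = memory[addr_b]
    # add: Stack[Top-4] = Stack[Top] + Stack[Top-4]; Top = Top - 4
    stack[top - WORD] = stack[top] + stack[top - WORD]
    top -= WORD
    # Pop AddressA: Memory[AddressA] = Stack[Top]; Top = Top - 4
    memory[addr_a] = stack[top]
    top -= WORD
    return top


# Hypothetical layout: A at address 0, B at 4, C at 8.
memory = {0: 0, 4: 7, 8: 5}
final_top = run_stack_example(memory, addr_a=0, addr_b=4, addr_c=8)
print(memory[0], final_top)
```

Note that none of the four instructions names a register, which is where the compact encoding comes from.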
Other types of Architecture
High-Level-Language Architecture
• In the 1960s, systems software was rarely written in high-level languages, and virtually every commercial operating system before Unix was written in assembly
• Some people blamed the code density on the instruction set rather than the programming language
• A machine design philosophy was advocated with the goal of making the hardware more like high-level languages
• The effectiveness of high-level languages, memory size limitations and the lack of efficient compilers doomed this philosophy to a historical footnote
Reduced Instruction Set Architecture
• With the recent development in compiler technology and expanded memory sizes, fewer programmers are using assembly-level coding
• Instruction set architectures came to be measured by how compilers, rather than programmers, use them
• RISC architecture favors simplifying hardware design over enriching the offered set of instructions, relying on compilers to use them effectively to perform complex operations
• Virtually all new architectures since the early 1980s follow the RISC philosophy of fixed instruction lengths, load-store operations and limited addressing modes
Evolution of Instruction Sets
[Figure: evolution of instruction sets, from top to bottom]
Single Accumulator (EDSAC)
Accumulator + Index Registers (Manchester Mark I, IBM 700 series)
Separation of Programming Model from Implementation
• High-level Language Based
• Concept of a Family
General Purpose Register Machines
• Complex Instruction Sets (VAX, Intel 432)
• Load/Store Architecture (CDC 6600, Cray 1)
RISC (MIPS, SPARC, IBM RS6000)
Register-Memory Architectures
Number of memory addresses | Max. number of operands | Examples
0 | 3 | SPARC, MIPS, PowerPC, ALPHA
1 | 2 | Intel 80x86, Motorola 68000
2 | 2 | VAX (also has 3 operands format)
3 | 3 | VAX (also has 2 operands format)
[Table: effect of the number of memory operands]
Interpreting Memory Addressing
• The address of a word matches the byte address of one of its 4 bytes
• The addresses of sequential words differ by 4 (word size in bytes)
• Word addresses are multiples of 4 (alignment restriction)
• Machines that use the address of the leftmost byte as the word address are called "Big Endian", and those that use the rightmost byte are called "Little Endian"
• Misalignment complicates memory access and causes programs to run slower (some machines do not allow misaligned memory access at all)
• Byte ordering can be a problem when exchanging data among different machines
• Byte addressing affects array index calculation, which must account for word addressing and the offset within the word
Object addressed | Aligned at byte offsets | Misaligned at byte offsets
Byte | 0, 1, 2, 3, 4, 5, 6, 7 | Never
Half word | 0, 2, 4, 6 | 1, 3, 5, 7
Word | 0, 4 | 1, 2, 3, 5, 6, 7
Double word | 0 | 1, 2, 3, 4, 5, 6, 7
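Byte ordering and the alignment rule can both be checked in a few lines. The following is a small sketch using Python's standard struct module; the sample value 0x0A0B0C0D and the helper name is_aligned are assumptions chosen for illustration.

```python
import struct


def is_aligned(address, size):
    """An object of `size` bytes is aligned when its address is a multiple of size."""
    return address % size == 0


value = 0x0A0B0C0D
big = struct.pack(">I", value)     # Big Endian: most significant byte at the lowest address
little = struct.pack("<I", value)  # Little Endian: least significant byte at the lowest address
print(big.hex(), little.hex())

# Aligned byte offsets within an 8-byte window, as in the table above.
half_word_offsets = [off for off in range(8) if is_aligned(off, 2)]
word_offsets = [off for off in range(8) if is_aligned(off, 4)]
print(half_word_offsets, word_offsets)
```

The two packed forms hold the same four bytes in opposite order, which is exactly the problem that arises when raw data is exchanged between machines of different endianness.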
Addressing Modes
• Addressing modes refer to how the location of an operand (the effective address) is specified
• Addressing modes have the ability to:
  - Significantly reduce instruction counts
  - Increase the average CPI
  - Increase the complexity of building a machine
• The VAX machine is used for benchmark data since it supports a wide range of memory addressing modes
• Common addressing modes can be classified based on:
  - the source of the data: register, immediate or memory
  - the address calculation: direct or indirect
• An indexed addressing mode is usually provided to allow efficient implementation of loops and array access
Example of Addressing Modes
Addressing mode | Example | Meaning | When used
Register | Add R4, R3 | Regs[R4] ← Regs[R4] + Regs[R3] | When a value is in a register
Immediate | Add R4, #3 | Regs[R4] ← Regs[R4] + 3 | For constants
Displacement | Add R4, 100(R1) | Regs[R4] ← Regs[R4] + Mem[100 + Regs[R1]] | Accessing local variables
Register indirect | Add R4, (R1) | Regs[R4] ← Regs[R4] + Mem[Regs[R1]] | Accessing using a pointer or a computed address
Indexed | Add R4, (R1 + R2) | Regs[R4] ← Regs[R4] + Mem[Regs[R1] + Regs[R2]] | Sometimes useful in array addressing: R1 = base of the array, R2 = index amount
Direct or absolute | Add R4, (1001) | Regs[R4] ← Regs[R4] + Mem[1001] | Sometimes useful for accessing static data; the address constant may need to be large
Memory indirect or memory deferred | Add R4, @(R3) | Regs[R4] ← Regs[R4] + Mem[Mem[Regs[R3]]] | If R3 is the address of the pointer p, then the mode yields *p
Autoincrement | Add R4, (R2)+ | Regs[R4] ← Regs[R4] + Mem[Regs[R2]]; Regs[R2] ← Regs[R2] + d | Useful for stepping through arrays within a loop. R2 points to the start of the array; each reference increments R2 by d
Autodecrement | Add R4, -(R2) | Regs[R2] ← Regs[R2] - d; Regs[R4] ← Regs[R4] + Mem[Regs[R2]] | Same use as autoincrement. Autodecrement/increment can also act as push/pop to implement a stack
Scaled | Add R4, 100(R2)[R3] | Regs[R4] ← Regs[R4] + Mem[100 + Regs[R2] + Regs[R3] * d] | Used to index arrays
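The "Meaning" column boils down to a different effective-address formula per mode. The sketch below computes that address for the memory modes in the table; the function name effective_address, its keyword parameters, the register contents and the element size d = 4 are all assumptions for illustration, not part of any real instruction set definition.

```python
def effective_address(mode, regs, mem, disp=0, base=None, index=None, d=4):
    """Return the memory address an operand refers to, for a few addressing modes."""
    if mode == "displacement":       # disp(Rbase)
        return disp + regs[base]
    if mode == "register_indirect":  # (Rbase)
        return regs[base]
    if mode == "indexed":            # (Rbase + Rindex)
        return regs[base] + regs[index]
    if mode == "direct":             # (disp)
        return disp
    if mode == "memory_indirect":    # @(Rbase): the address is itself read from memory
        return mem[regs[base]]
    if mode == "scaled":             # disp(Rbase)[Rindex], index scaled by element size d
        return disp + regs[base] + regs[index] * d
    raise ValueError(f"unknown mode: {mode}")


# Hypothetical machine state.
regs = {"R1": 200, "R2": 300, "R3": 2}
mem = {200: 1000}

ea_disp = effective_address("displacement", regs, mem, disp=100, base="R1")
ea_indexed = effective_address("indexed", regs, mem, base="R1", index="R2")
ea_indirect = effective_address("memory_indirect", regs, mem, base="R1")
ea_scaled = effective_address("scaled", regs, mem, disp=100, base="R2", index="R3")
print(ea_disp, ea_indexed, ea_indirect, ea_scaled)
```

Memory indirect is the only mode here that needs a memory read just to form the address, which hints at why richer modes can raise the average CPI.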
Addressing Mode for Signal Processing
• DSPs offer special addressing modes to better serve popular algorithms
• Special features require either hand coding or a compiler that uses such features (GNU would not be a good choice)
Modulo addressing
• Since DSPs deal with continuous data streams, circular buffers are widely used
• Circular or modulo addressing allows automatic increment and decrement, and resets the pointer when it reaches the end of the buffer
Reverse addressing
• The resulting address is the bit-reversed form of the current address
• Reverse addressing mode expedites the access, which otherwise requires a number of logical instructions or extra memory accesses
Fast Fourier Transform (bit-reversed address order)
0 (000) → 0 (000)
1 (001) → 4 (100)
2 (010) → 2 (010)
3 (011) → 6 (110)
4 (100) → 1 (001)
5 (101) → 5 (101)
6 (110) → 3 (011)
7 (111) → 7 (111)
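Both DSP addressing modes are simple to express in software, which also shows the work the hardware saves. A minimal sketch follows; the function names bit_reverse and modulo_next and the buffer placed at address 100 are assumptions for illustration.

```python
def bit_reverse(index, bits):
    """Reverse the low `bits` bits of index (for bits=3, 001 becomes 100)."""
    result = 0
    for _ in range(bits):
        result = (result << 1) | (index & 1)
        index >>= 1
    return result


def modulo_next(pointer, step, start, length):
    """Advance a circular-buffer pointer, wrapping inside [start, start + length)."""
    return start + (pointer - start + step) % length


fft_order = [bit_reverse(i, 3) for i in range(8)]
print(fft_order)

# Hypothetical 8-entry circular buffer starting at address 100.
wrapped_forward = modulo_next(107, 1, start=100, length=8)
wrapped_backward = modulo_next(100, -1, start=100, length=8)
print(wrapped_forward, wrapped_backward)
```

Without hardware support, each access in an FFT would pay for the loop in bit_reverse, which is the "number of logical instructions" the slide refers to.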
Operations of the Computer Hardware
"There must certainly be instructions for performing the fundamental arithmetic operations."
Burks, Goldstine and von Neumann, 1947
• Assembly language is a symbolic representation of what the processor actually understands
• The MIPS assembler allows only one instruction per line and ignores comments following # until the end of the line
Example:
Translation of a segment of a C program to MIPS assembly instructions
C: f = (g + h) - (i + j)
MIPS:
add t0, g, h   # temp variable t0 contains "g + h"
add t1, i, j   # temp variable t1 contains "i + j"
sub f, t0, t1  # f = t0 - t1 = (g + h) - (i + j)
Operations in the Instruction Set
Operator type | Examples
Arithmetic and logical | Integer arithmetic and logical operations: add, and, subtract, or
Data transfer | Loads-stores (move instructions on machines with memory addressing)
Control | Branch, jump, procedure call and return, trap
System | Operating system call, virtual memory management instructions
Floating point | Floating point instructions: add, multiply
Decimal | Decimal add, decimal multiply, decimal to character conversion
String | String move, string compare, string search
Graphics | Pixel operations, compression/decompression operations
• Arithmetic, logical, data transfer and control are almost standard categories for all machines
• System instructions are required for multi-programming environments, although support for system functions varies
• Decimal and string instructions can be primitives, e.g. on the IBM 360 and the VAX
• Support for floating point, decimal, string and graphics is optional and sometimes provided via a co-processor
• Some machines rely on the compiler to synthesize special operations, such as string handling, from simpler instructions
Operations for Media & Signal Processing
• Single instruction multiple data (SIMD) and vector instructions are often supported in DSPs, which are commonly used in multimedia and signal processing applications
• Partitioned Add (integer)
  - Performs multiple narrow additions on a 64-bit ALU, since most data are narrow
  - Increases ALU throughput for multimedia applications
• Paired single operations (float)
  - Allow the same register to act as two operands to the same operation
  - Handy in dealing with vertices and coordinates
• Multiply and accumulate
  - Very handy for calculating dot products of vectors (signal processing) and matrix multiplication
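Partitioned add and multiply-accumulate can both be modelled with plain integers. The sketch below assumes four 16-bit lanes in a 64-bit word (the lane width is an assumption, since the slide only says the data are narrow) and wrap-around within each lane; the names partitioned_add and dot_product are illustrative.

```python
LANE_BITS = 16                    # assumed lane width
LANES = 4                         # four 16-bit lanes fill a 64-bit word
LANE_MASK = (1 << LANE_BITS) - 1


def partitioned_add(x, y):
    """Add corresponding lanes of two 64-bit words; carries never cross lanes."""
    result = 0
    for lane in range(LANES):
        shift = lane * LANE_BITS
        a = (x >> shift) & LANE_MASK
        b = (y >> shift) & LANE_MASK
        result |= ((a + b) & LANE_MASK) << shift  # wrap around inside the lane
    return result


def dot_product(xs, ys):
    """Dot product expressed as a chain of multiply-and-accumulate steps."""
    acc = 0
    for x, y in zip(xs, ys):
        acc += x * y  # one multiply-accumulate per element pair
    return acc


packed_sum = partitioned_add(0x0001_FFFF_0003_0004, 0x0001_0001_0001_0001)
print(hex(packed_sum))
dot = dot_product([1, 2, 3], [4, 5, 6])
print(dot)
```

The lane holding 0xFFFF wraps to 0 without disturbing its neighbour, which is the difference between a partitioned add and an ordinary 64-bit add.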
Frequency of Operations Usage
• The most widely executed instructions are the simple operations of an instruction set
• The following is the average usage in SPECint92 on Intel 80x86
Rank | 80x86 instruction | Integer average (% total executed)
1 | Load | 22%
2 | Conditional branch | 20%
3 | Compare | 16%
4 | Store | 12%
5 | Add | 8%
6 | And | 6%
7 | Sub | 5%
8 | Move register-register | 4%
9 | Call | 1%
10 | Return | 1%
Total | | 96%
Make the common case fast by focusing on these operations
Control Flow Instructions
• Jump: for unconditional change in the control flow
• Branch: for conditional change in the control flow
• Procedure calls and returns
Data is based on SPEC on Alpha
Destination Address Definition
• Relative addressing w.r.t. the program counter proved to be the best choice for forward and backward branching or jumps (load address independent)
• To allow for dynamic loading of library routines, register indirect addressing allows addresses to be loaded in special registers (e.g. virtual functions in C++ and system calls in a case statement)
Data is based on SPEC on Alpha
Condition Evaluation
• Compare & branch can be efficient if the majority of conditions are comparisons with zero
Remember to focus on the common case
Based on SPEC on MIPS
Frequency of Types of Comparison
Data is based on SPEC on Alpha
Different benchmarks and machines set new design priorities
• DSPs support a repeat instruction for for-loops (vectors) using registers
Supporting Procedures
Execution of a procedure follows these steps:
• Store parameters in a place accessible to the procedure
• Transfer control to the procedure
• Acquire the storage resources needed for the procedure
• Perform the desired task
• Store the result value in a place accessible to the calling program
• Return control to the point of origin
The hardware provides a program counter to trace instruction flow and manage transfer of control
Parameter Passing
• Registers can be used for passing a small number of parameters
• A stack is used to spill registers of the current context, to make room for the called procedure to run, and to allow large parameters to be passed
• Storage of machine state can be performed by the caller or the callee
• Handling of shared variables is important to ensure correct semantics and thus requires clear specifications in the library interface
Global variables stored in registers need careful handling
Type and Size of Operands
• The type of an operand is designated by encoding it in the instruction's operation code
• The type of an operand, e.g. single precision float, effectively gives its size
• Common operand types include character, half word and word size integer, and single- and double-precision floating point
• Characters are almost always in ASCII, integers in 2's complement and floating point in IEEE 754
• The 16-bit Unicode used in Java is gaining popularity due to its support for international character sets
• For business applications, some architectures support a decimal format in binary coded decimal (BCD)
• Depending on the size of the word, the complexity of handling different operand types differs
• DSPs offer fixed point data types to support high precision floating point arithmetic and to allow sharing a single exponent among multiple numbers
• For graphics applications, vertex and pixel operands are added features
Size of Operands
Frequency of reference by size, based on SPEC2000 on Alpha
• The double-word data type is used for double-precision floating point operations and for address storage in machines with a 64-bit wide address bus
• Words are used for integer operations and for 32-bit address bus machines
• Because of the mix in SPEC, word and double-word data types dominate
Instruction Representation
• Humans are taught to think in base 10 (decimal), but numbers may be represented in any base (123 in base 10 = 1111011 in binary, or base 2)
• Numbers are stored in computers as a series of high and low electronic signals (binary numbers)
• Binary digits are called bits and are considered the atom of computing
• Each piece of an instruction is a number, and placing these numbers together forms the instruction
• The assembler translates the symbolic assembly instructions into machine language instructions (machine code)
Example:
Assembly: add $t0, $s1, $s2
M/C language (decimal): 0 | 17 | 18 | 8 | 0 | 32
M/C language (binary): 000000 | 10001 | 10010 | 01000 | 00000 | 100000
Field widths: 6 bits | 5 bits | 5 bits | 5 bits | 5 bits | 6 bits
Note: the MIPS compiler by default maps $s0..$s7 to reg. 16-23 and $t0..$t7 to reg. 8-15
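Placing the field numbers together is just shifting and OR-ing. The sketch below packs the six fields of the add example into one 32-bit word, using the 6/5/5/5/5/6-bit layout given above; the function name encode_r_format is an assumption for illustration.

```python
def encode_r_format(op, rs, rt, rd, shamt, funct):
    """Pack the six fields (6, 5, 5, 5, 5, 6 bits) into one 32-bit word."""
    return (op << 26) | (rs << 21) | (rt << 16) | (rd << 11) | (shamt << 6) | funct


# add $t0, $s1, $s2: $t0 is reg 8, $s1 is reg 17, $s2 is reg 18.
# The destination $t0 goes in the rd field, the two sources in rs and rt.
word = encode_r_format(op=0, rs=17, rt=18, rd=8, shamt=0, funct=32)
print(f"{word:032b}")
print(hex(word))
```

Reading the printed bit string in groups of 6, 5, 5, 5, 5 and 6 bits gives back the binary row of the example.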
Encoding an Instruction Set
• Instruction encoding affects the size of the compiled program and the complexity of the CPU implementation
• The operation is typically specified in one field, called the opcode
• The addressing mode for the operand can be encoded with the operation, or specified through a separate identifier when a large number of modes is supported
• The architecture must balance several competing factors:
  - Desire to support as many registers and addressing modes as possible
  - Effect of operand specification on the size of the instruction (program)
  - Desire to simplify instruction fetching and decoding during execution
• Fixed size instruction encoding simplifies the CPU design while limiting the addressing modes supported
• An architect caring about code size can use variable size encoding
• A hybrid approach is to allow variability by supporting multiple instruction sizes
Encoding Examples
MIPS Instruction format
Register-format instructions
op (6 bits) | rs (5 bits) | rt (5 bits) | rd (5 bits) | shamt (5 bits) | funct (6 bits)
op: Basic operation of the instruction, traditionally called the opcode
rs: The first register source operand
rt: The second register source operand
rd: The register destination operand; it gets the result of the operation
shamt: Shift amount
funct: This field selects the specific variant of the operation in the op field
Immediate-type instructions
op (6 bits) | rs (5 bits) | rt (5 bits) | address (16 bits)
• Some instructions need longer fields than provided, for large value constants
• The 16-bit address means a load word instruction can load a word within a region of ±2^15 bytes of the address in the base register
Example: lw $t0, 32($s3)   # Temporary register $t0 gets A[8]
Instruction | Format | op | rs | rt | rd | shamt | funct | address
add | R | 0 | reg | reg | reg | 0 | 32 | N/A
sub | R | 0 | reg | reg | reg | 0 | 34 | N/A
lw | I | 35 | reg | reg | N/A | N/A | N/A | address
sw | I | 43 | reg | reg | N/A | N/A | N/A | address
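The immediate format can be packed the same way, and the ±2^15 limit falls out of the 16-bit field. This is a minimal sketch; the function name encode_i_format is an assumption, and the register numbers follow the usual MIPS convention in which $s3 is register 19 and $t0 is register 8.

```python
def encode_i_format(op, rs, rt, immediate):
    """Pack op (6 bits), rs (5), rt (5) and a 16-bit two's-complement immediate."""
    if not -(1 << 15) <= immediate < (1 << 15):
        raise ValueError("offset does not fit in 16 bits")
    return (op << 26) | (rs << 21) | (rt << 16) | (immediate & 0xFFFF)


# lw $t0, 32($s3): op 35, base register $s3 (reg 19) in rs, target $t0 (reg 8) in rt.
word = encode_i_format(op=35, rs=19, rt=8, immediate=32)
print(hex(word))

# An offset of 40000 bytes is outside the reachable region and is rejected.
try:
    encode_i_format(op=35, rs=19, rt=8, immediate=40000)
    out_of_range_rejected = False
except ValueError:
    out_of_range_rejected = True
print(out_of_range_rejected)
```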
The Stored Program Concept
• Learning how instructions are represented leads to discovering the secret of computing: the stored-program concept
• Today's computers are built on two key principles:
  - Instructions are represented as numbers
  - Programs can be stored in memory to be read or written just like numbers
The power of the concept: memory can contain
• the source code for an editor
• the compiled m/c code for the editor
• the text that the compiled program is using
• the compiler that generated the code
[Figure: a processor connected to a memory that holds an accounting program (machine code), an editor program (machine code), a C compiler (machine code), payroll data, book text, and source code in C for the editor program]
Compiling if-then-else in MIPS
Assuming the five variables f, g, h, i, and j correspond to the five registers $s0 through $s4, what is the compiler's code for the following C if statement?
if (i == j) f = g + h; else f = g - h;
[Figure: flowchart. Test i == j; when i = j execute f = g + h, when i ≠ j go to Else: and execute f = g - h; both paths join at Exit:]
MIPS:
      bne $s3, $s4, Else   # go to Else if i ≠ j
      add $s0, $s1, $s2    # f = g + h (skipped if i ≠ j)
      j Exit
Else: sub $s0, $s1, $s2    # f = g - h (skipped if i = j)
Exit:
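The way bne and j steer execution around one of the two arithmetic instructions can be traced with a toy interpreter. The sketch below handles only the four instructions used here and ignores delay slots; the function name run_if_then_else, the tuple encoding of instructions and the label table are assumptions for illustration, not a real MIPS simulator.

```python
def run_if_then_else(i, j, g, h):
    """Execute the bne/add/j/sub sequence and return the final value of f ($s0)."""
    regs = {"s0": 0, "s1": g, "s2": h, "s3": i, "s4": j}
    program = [
        ("bne", "s3", "s4", "Else"),
        ("add", "s0", "s1", "s2"),
        ("j", "Exit"),
        ("sub", "s0", "s1", "s2"),   # instruction at label Else
    ]
    labels = {"Else": 3, "Exit": 4}  # Exit is just past the last instruction
    pc = 0
    while pc < len(program):
        op, *args = program[pc]
        pc += 1                      # default: sequential flow
        if op == "bne":
            if regs[args[0]] != regs[args[1]]:
                pc = labels[args[2]]
        elif op == "j":
            pc = labels[args[0]]
        elif op == "add":
            regs[args[0]] = regs[args[1]] + regs[args[2]]
        elif op == "sub":
            regs[args[0]] = regs[args[1]] - regs[args[2]]
    return regs["s0"]


f_equal = run_if_then_else(i=5, j=5, g=10, h=4)    # i == j, so f = g + h
f_differ = run_if_then_else(i=5, j=6, g=10, h=4)   # i != j, so f = g - h
print(f_equal, f_differ)
```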
Typical Compilation
Major Types of Optimization
Optimization name | Explanation | Frequency
High-level (at or near the source level; machine independent)
Procedure integration | Replace procedure call by procedure body | N.M.
Local (within straight-line code)
Common subexpression elimination | Replace two instances of the same computation by a single copy | 18%
Constant propagation | Replace all instances of a variable that is assigned a constant with the constant | 22%
Stack height reduction | Rearrange expression tree to minimize resources needed for expression evaluation | N.M.
Global (across a branch)
Global common subexpression elimination | Same as local, but this version crosses branches | 13%
Copy propagation | Replace all instances of a variable A that has been assigned X (i.e. A = X) with X | 11%
Code motion | Remove code from a loop that computes the same value in each iteration of the loop | 16%
Induction variable elimination | Simplify/eliminate array addressing calculations within loops | 2%
Machine-dependent (depends on machine knowledge)
Strength reduction | Many examples, such as replacing a multiply by a constant with adds and shifts | N.M.
Pipeline scheduling | Reorder instructions to improve pipeline performance | N.M.
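Strength reduction is the easiest of these transformations to show in isolation. The sketch below rewrites a multiply by the constant 10 as two shifts and an add; the constant and the function names are assumptions chosen for illustration, and a real compiler would apply the rewrite to machine instructions rather than to source code.

```python
def times_ten_multiply(x):
    """Original form: one multiply by a constant."""
    return x * 10


def times_ten_shift_add(x):
    """Strength-reduced form: 10x = 8x + 2x = (x << 3) + (x << 1)."""
    return (x << 3) + (x << 1)


# The two forms agree on every tested input, including negative values.
all_equal = all(times_ten_multiply(x) == times_ten_shift_add(x) for x in range(-100, 101))
print(all_equal)
```

Whether the rewrite is a win depends on the relative cost of multiply, shift and add on the target, which is why the slide files it under machine-dependent optimizations.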
Effect of Compiler Optimization
[Figure: results by program and compiler optimization level. Measurements taken on MIPS]
Level 0: non-optimized code
Level 1: local optimization
Level 2: global optimization, s/w pipelining
Level 3: adds procedure integration
Compiler Support for Multimedia Instr.
• Intel's MMX and PowerPC AltiVec have small vector processing capabilities targeting multimedia applications (to speed up graphics)
• Intel added a new set of instructions, called Streaming SIMD Extension
• A major advantage of vector computers is hiding the latency of memory access by loading multiple elements and then overlapping execution with data transfer
• Vector computers typically have strided and/or gather/scatter addressing to perform operations on distant memory locations
• Strided addressing allows memory access in increments larger than one
• Gather/scatter addressing is similar to register indirect mode, where the addresses are stored instead of the data
• Supporting vector operations without strided addressing, as in Intel's MMX, limits the potential speedup
• Such limited support for vector processing makes the use of vectorizing compiler optimization unpopular and restricts its scope to hand-coded routines
SIMD instructions on MMX and AltiVec tend to be solutions, not primitives
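The two vector addressing styles differ only in where the element addresses come from. The sketch below models memory as a Python list; the function names strided_load, gather and scatter and the sample addresses are assumptions for illustration.

```python
def strided_load(memory, base, stride, count):
    """Load `count` elements starting at `base`, `stride` locations apart."""
    return [memory[base + i * stride] for i in range(count)]


def gather(memory, addresses):
    """Load elements from an explicit list of addresses (an index vector)."""
    return [memory[a] for a in addresses]


def scatter(memory, addresses, values):
    """Store each value at the matching address from the index vector."""
    for a, v in zip(addresses, values):
        memory[a] = v


memory = list(range(100, 116))   # 16 locations: memory[i] == 100 + i

# Reading one column of a 4x4 row-major matrix needs a stride of 4.
column = strided_load(memory, base=1, stride=4, count=4)
picked = gather(memory, [3, 0, 9])
scatter(memory, [3, 0, 9], [-1, -2, -3])
print(column, picked, memory[:4])
```

Without strided access, the column read above would need four separate scalar loads, which is the speedup limitation the slide attributes to MMX.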
Starting a Program
[Figure: C program → Compiler → Assembly language program → Assembler → Object: Machine language module, together with Object: Library routine (machine language) → Linker → Executable: Machine language program → Loader → Memory]
Linker
• Place code & data modules symbolically in memory
• Determine the address of data & instruction labels
• Patch both internal & external references
Object files for Unix typically contain:
• Header: size & position of components
• Text segment: machine code
• Data segment: static and dynamic variables
• Relocation info: identifies absolute memory references
• Symbol table: name & location of labels, procedures and variables
• Debugging info: mapping of source to object code, break points, etc.
Loading Executable Program
[Figure: MIPS memory layout, from high to low addresses: Stack ($sp → 7fff ffff hex), Dynamic data, Static data ($gp → 1000 8000 hex, segment starting at 1000 0000 hex), Text (pc → 0040 0000 hex), Reserved (from 0)]
To load an executable, the operating system follows these steps:
• Reads the executable file header to determine the size of the text and data segments
• Creates an address space large enough for the text and data
• Copies the instructions and data from the executable file into memory
• Copies the parameters (if any) to the main program onto the stack
• Initializes the machine registers and sets the stack pointer to the first free location
• Jumps to a start-up routine that copies the parameters into the argument registers and calls the main routine of the program
Instruction Set Design Issues
• Number of Addresses
• Flow of Control
• Operand Types
• Addressing Modes
• Instruction Types
• Instruction Formats
Number of Addresses
Four categories:
• 3-address machines
  - 2 for the source operands and one for the result
• 2-address machines
  - One address doubles as source and result
• 1-address machines
  - Accumulator machines
  - Accumulator is used for one source and the result
• 0-address machines
  - Stack machines
  - Operands are taken from the stack
  - Result goes onto the stack
Number of Addresses (cont.)
• Three-address machines
  - Two for the source operands, one for the result
  - RISC processors use three addresses
  - Sample instructions
    add dest,src1,src2    ; M(dest)=[src1]+[src2]
    sub dest,src1,src2    ; M(dest)=[src1]-[src2]
    mult dest,src1,src2   ; M(dest)=[src1]*[src2]
Three addresses: Operand 1, Operand 2, Result
Example: a = b + c
Three-address instruction formats are not common, because they require a relatively long instruction format to hold the three address references
Number of Addresses (cont.)
• Example
  - C statement
    A = B + C * D - E + F + A
  - Equivalent code:
    mult T,C,D   ; T = C*D
    add T,T,B    ; T = B+C*D
    sub T,T,E    ; T = B+C*D-E
    add T,T,F    ; T = B+C*D-E+F
    add A,T,A    ; A = B+C*D-E+F+A
Number of Addresses (cont.)
• Two-address machines
  - One address doubles (for source operand & result)
  - Last example makes a case for it
    * Address T is used twice
  - Sample instructions
    load dest,src   ; M(dest)=[src]
    add dest,src    ; M(dest)=[dest]+[src]
    sub dest,src    ; M(dest)=[dest]-[src]
    mult dest,src   ; M(dest)=[dest]*[src]
Two addresses: one address doubles as operand and result
Example: a = a + b
The two-address format reduces the space requirement but also introduces some awkwardness. To avoid altering the value of an operand, a MOVE instruction is used to move one of the values to a result or temporary location before performing the operation
Number of Addresses (cont.)
• Example
  - C statement
    A = B + C * D - E + F + A
  - Equivalent code:
    load T,C   ; T = C
    mult T,D   ; T = C*D
    add T,B    ; T = B+C*D
    sub T,E    ; T = B+C*D-E
    add T,F    ; T = B+C*D-E+F
    add A,T    ; A = B+C*D-E+F+A
Number of Addresses (cont.)
• One-address machines
  - Use a special set of registers called accumulators
    * Specify one source operand & receive the result
  - Called accumulator machines
  - Sample instructions
    load addr    ; accum = [addr]
    store addr   ; M[addr] = accum
    add addr     ; accum = accum + [addr]
    sub addr     ; accum = accum - [addr]
    mult addr    ; accum = accum * [addr]
Number of Addresses (cont.)
• Example
  - C statement
    A = B + C * D - E + F + A
  - Equivalent code:
    load C    ; load C into accum
    mult D    ; accum = C*D
    add B     ; accum = C*D+B
    sub E     ; accum = B+C*D-E
    add F     ; accum = B+C*D-E+F
    add A     ; accum = B+C*D-E+F+A
    store A   ; store accum contents in A
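The seven-instruction accumulator program can be checked by running it on a toy model. This is a minimal sketch: the function name run_accumulator, the use of variable names as addresses and the sample values for A through F are assumptions for illustration.

```python
def run_accumulator(program, memory):
    """Run one-address instructions; every operation implicitly uses the accumulator."""
    accum = 0
    for opcode, addr in program:
        if opcode == "load":
            accum = memory[addr]
        elif opcode == "store":
            memory[addr] = accum
        elif opcode == "add":
            accum = accum + memory[addr]
        elif opcode == "sub":
            accum = accum - memory[addr]
        elif opcode == "mult":
            accum = accum * memory[addr]
        else:
            raise ValueError(f"unknown opcode: {opcode}")
    return memory


# Hypothetical values for the variables in A = B + C * D - E + F + A.
memory = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
program = [
    ("load", "C"), ("mult", "D"), ("add", "B"), ("sub", "E"),
    ("add", "F"), ("add", "A"), ("store", "A"),
]
run_accumulator(program, memory)
print(memory["A"])
```

Each instruction carries a single address, so the instructions are short, but every intermediate result has to pass through the one accumulator.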
Number of Addresses (cont.)
• Zero-address machines
  - Stack supplies operands and receives the result
    * Special instructions to load and store use an address
  - Called stack machines (Ex: HP3000, Burroughs B5500)
  - Sample instructions
    push addr   ; push([addr])
    pop addr    ; pop([addr])
    add         ; push(pop + pop)
    sub         ; push(pop - pop)
    mult        ; push(pop * pop)
Number of Addresses (cont.)
• Example
  - C statement
    A = B + C * D - E + F + A
  - Equivalent code:
    push E
    push C
    push D
    mult
    push B
    add
    sub
    push F
    add
    push A
    add
    pop A
Load/Store Architecture
• Instructions expect operands in internal processor registers
  - Special LOAD and STORE instructions move data between registers and memory
• RISC uses this architecture
• Reduces instruction length
Load/Store Architecture (cont.)
• Sample instructions
  load Rd,addr       ; Rd = [addr]
  store addr,Rs      ; (addr) = Rs
  add Rd,Rs1,Rs2     ; Rd = Rs1 + Rs2
  sub Rd,Rs1,Rs2     ; Rd = Rs1 - Rs2
  mult Rd,Rs1,Rs2    ; Rd = Rs1 * Rs2
Number of Addresses (cont.)
• Example
  - C statement
    A = B + C * D - E + F + A
  - Equivalent code:
    load R1,B
    load R2,C
    load R3,D
    load R4,E
    load R5,F
    load R6,A
    mult R2,R2,R3
    add R2,R2,R1
    sub R2,R2,R4
    add R2,R2,R5
    add R2,R2,R6
    store A,R2
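The load/store version of the same statement can be run on a toy register machine to confirm it computes the same result. This is a sketch under stated assumptions: the function name run_load_store, the ARITH table, the use of variable names as addresses and the sample values are all illustrative.

```python
ARITH = {
    "add": lambda a, b: a + b,
    "sub": lambda a, b: a - b,
    "mult": lambda a, b: a * b,
}


def run_load_store(program, memory):
    """Only load and store touch memory; arithmetic works on registers alone."""
    regs = {}
    for op, *args in program:
        if op == "load":        # load Rd,addr
            rd, addr = args
            regs[rd] = memory[addr]
        elif op == "store":     # store addr,Rs
            addr, rs = args
            memory[addr] = regs[rs]
        else:                   # op Rd,Rs1,Rs2
            rd, rs1, rs2 = args
            regs[rd] = ARITH[op](regs[rs1], regs[rs2])
    return regs


memory = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}   # hypothetical values
program = [
    ("load", "R1", "B"), ("load", "R2", "C"), ("load", "R3", "D"),
    ("load", "R4", "E"), ("load", "R5", "F"), ("load", "R6", "A"),
    ("mult", "R2", "R2", "R3"), ("add", "R2", "R2", "R1"),
    ("sub", "R2", "R2", "R4"), ("add", "R2", "R2", "R5"),
    ("add", "R2", "R2", "R6"), ("store", "A", "R2"),
]
run_load_store(program, memory)
instruction_count = len(program)
print(memory["A"], instruction_count)
```

The program is longer than the accumulator version in instruction count, but only the load and store instructions need to carry a memory address.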
Flow of Control
• Default is sequential flow
• Several instructions alter this default execution
  - Branches
    * Unconditional
    * Conditional
    * Delayed branches
  - Procedure calls
    * Delayed procedure calls
Flow of Control (cont.)
• Branches
  - Unconditional
    * Absolute address
    * PC-relative
      Target address is specified relative to PC contents
      Relocatable code
  - Example: MIPS
    * Absolute address
      j target
    * PC-relative
      b target
Flow of Control (cont.)
• Branches
  - Conditional
    * Jump is taken only if the condition is met
  - Two types
    * Set-Then-Jump
      Condition testing is separated from branching
      Condition code registers are used to convey the condition test result
      Condition code registers keep a record of the status of the last ALU operation, such as an overflow condition
    * Example: Pentium code
      cmp AX,BX    ; compare AX and BX
      je target    ; jump if equal
Flow of Control (cont.)
  - Test-and-Jump
    - Single instruction performs condition testing and branching
  - Example: MIPS instruction
      beq $src1,$src2,target
    Jumps to target if src1 = src2
- Delayed branching
  - Control is transferred after executing the instruction that follows the branch instruction
    - This instruction slot is called the delay slot
  - Improves efficiency
  - Highly pipelined RISC processors support it
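Delayed branching can be illustrated with a toy interpreter. This is a sketch under simplifying assumptions: every branch is taken, instructions are plain tuples, and run_branches is a hypothetical helper that only records the order of execution.

```python
def run_branches(program, delayed):
    """Return the indices of the instructions in the order they execute.

    Each instruction is ("op", label) or ("branch", target_index).
    Branches are always taken in this toy model.
    """
    trace, pc = [], 0
    while pc < len(program):
        kind, arg = program[pc]
        trace.append(pc)
        if kind == "branch":
            if delayed and pc + 1 < len(program):
                trace.append(pc + 1)   # delay slot runs before control transfers
            pc = arg
        else:
            pc += 1
    return trace

program = [
    ("op", "before"),    # 0
    ("branch", 4),       # 1: branch to index 4
    ("op", "slot"),      # 2: instruction in the delay slot
    ("op", "skipped"),   # 3
    ("op", "target"),    # 4
]

normal_order = run_branches(program, delayed=False)
delayed_order = run_branches(program, delayed=True)
```

With delayed branching, instruction 2 does useful work while the pipeline fetches the target, instead of being discarded.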
Flow of Control (cont.)
Procedure calls
- Facilitate modular programming
- Require two pieces of information to return
  - End of procedure
    - Pentium uses the ret instruction
    - MIPS uses the jr instruction
  - Return address
    - In a (special) register
      MIPS allows any general-purpose register
    - On the stack
      Pentium
Flow of Control (cont.)
[Figure: delay slot]
Parameter Passing
Two basic techniques
- Register-based (e.g., PowerPC, MIPS)
  - Internal registers are used
    - Faster
    - Limits the number of parameters
    - Recursive procedures
- Stack-based (e.g., Pentium)
  - Stack is used
    - More general
2. Operand Types
- Instructions support basic data types
  - Characters
  - Integers
  - Floating-point
- Instruction overload
  - Same instruction for different data types
  - Example: Pentium
      mov AL,address     ; loads an 8-bit value
      mov AX,address     ; loads a 16-bit value
      mov EAX,address    ; loads a 32-bit value
Operand Types
- Separate instructions
  - Instructions specify the operand size
  - Example: MIPS
      lb $dest,address    ; loads a byte
      lh $dest,address    ; loads a halfword (16 bits)
      lw $dest,address    ; loads a word (32 bits)
      ld $dest,address    ; loads a doubleword (64 bits)
  - Similar instructions for store
3. Addressing Modes
How the operands are specified
- Operands can be in three places
  - Registers
    - Register addressing mode
  - Part of instruction
    - Constant
    - Immediate addressing mode
    - All processors support these two addressing modes
  - Memory
    - Difference between RISC and CISC
    - CISC supports a large variety of addressing modes
    - RISC follows the load/store architecture
4. Instruction Types
Several types of instructions
- Data movement
  - Pentium: mov dest,src
  - Some do not provide direct data movement instructions
  - Indirect data movement
      add $dest,$src,0     ; $dest = $src+0
- Arithmetic and Logical
  - Arithmetic
    - Integer and floating-point, signed and unsigned
    - add, subtract, multiply, divide
  - Logical
    - and, or, not, xor
Instruction Types (cont.)
Condition code bits
- S: Sign bit (0 = +, 1 = -)
- Z: Zero bit (0 = nonzero, 1 = zero)
- O: Overflow bit (0 = no overflow, 1 = overflow)
- C: Carry bit (0 = no carry, 1 = carry)
Example: Pentium
    cmp count,25     ; compare count to 25
                     ; (subtract 25 from count)
    je target        ; jump if equal
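How a compare sets the four condition code bits can be sketched as below. The sketch assumes 32-bit operands and x86-style conventions (carry means a borrow on subtraction); compare_flags and je_taken are hypothetical helper names.

```python
MASK = 0xFFFFFFFF  # 32-bit operands assumed

def compare_flags(a, b):
    """Condition code bits after computing a - b, as a compare instruction would."""
    a &= MASK
    b &= MASK
    result = (a - b) & MASK
    sign = result >> 31
    zero = int(result == 0)
    carry = int(a < b)  # a borrow was needed in the unsigned subtraction
    # Signed overflow: a and b differ in sign, and the result's sign differs from a's.
    overflow = ((a ^ b) & (a ^ result)) >> 31
    return {"S": sign, "Z": zero, "O": overflow, "C": carry}

def je_taken(flags):
    """Jump-if-equal looks only at the zero bit."""
    return flags["Z"] == 1

equal = compare_flags(25, 25)
smaller = compare_flags(24, 25)
wrapped = compare_flags(0x80000000, 1)  # most negative value minus one
```

The branch instruction never sees the operands themselves; it only reads the bits left behind by the compare, which is what separates testing from branching in the set-then-jump style.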
Instruction Types (cont.)
- Flow control and I/O instructions
  - Branch
  - Procedure call
  - Interrupts
- I/O instructions
  - Memory-mapped I/O
    - Most processors support memory-mapped I/O
    - No separate instructions for I/O
  - Isolated I/O
    - Pentium supports isolated I/O
    - Separate I/O instructions
        in AX,port       ; read from an I/O port
        out port,AX      ; write to an I/O port
5. Instruction Formats
Two types
- Fixed-length
  - Used by RISC processors
  - 32-bit RISC processors use 32-bit wide instructions
    - Examples: SPARC, MIPS, PowerPC
- Variable-length
  - Used by CISC processors
  - Memory operands need more bits to specify
Opcode
- Major and exact operation
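A fixed-length format makes decoding a matter of shifting and masking at known bit positions. The sketch below assumes the standard MIPS R-type layout (6-bit opcode, three 5-bit register fields, 5-bit shift amount, 6-bit function code); decode_r_type is a hypothetical helper name.

```python
def decode_r_type(word):
    """Split a 32-bit MIPS R-type instruction word into its fields."""
    return {
        "opcode": (word >> 26) & 0x3F,  # major operation
        "rs": (word >> 21) & 0x1F,      # first source register
        "rt": (word >> 16) & 0x1F,      # second source register
        "rd": (word >> 11) & 0x1F,      # destination register
        "shamt": (word >> 6) & 0x1F,    # shift amount
        "funct": word & 0x3F,           # exact operation within the opcode
    }

# 0x012A4020 encodes: add $t0, $t1, $t2  ($t0 = reg 8, $t1 = reg 9, $t2 = reg 10)
fields = decode_r_type(0x012A4020)
```

The opcode selects the major operation and the function field selects the exact one, matching the "major and exact operation" split above.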
Examples of Instruction Formats
RISC (Reduced Instruction Set Computer)
versus
CISC (Complex Instruction Set Computer)
RISC vs. CISC
- The underlying philosophy of RISC machines is that a system is better able to manage program execution when the program consists of only a few different instructions that are the same length and require the same number of clock cycles to decode and execute.
- RISC systems access memory only with explicit load and store instructions.
- In CISC systems, many different kinds of instructions access memory, making instruction length variable and fetch-decode-execute time unpredictable.
RISC vs. CISC
- The difference between CISC and RISC becomes evident through the basic computer performance equation:
    time/program = (time/cycle) x (cycles/instruction) x (instructions/program)
- RISC systems shorten execution time by reducing the clock cycles per instruction.
- CISC systems improve performance by reducing the number of instructions per program.
RISC vs. CISC
- The simple instruction set of RISC machines enables control units to be hardwired for maximum speed.
- The more complex, and variable, instruction set of CISC machines requires microcode-based control units that interpret instructions as they are fetched from memory. This translation takes time.
- With fixed-length instructions, RISC lends itself to pipelining and speculative execution.
RISC vs. CISC
- Consider these program fragments:
    CISC                  RISC
    mov ax, 10            mov ax, 0
    mov bx, 5             mov bx, 10
    mul bx, ax            mov cx, 5
                          Begin: add ax, bx
                                 loop Begin
- The total clock cycles for the CISC version might be:
    (2 movs x 1 cycle) + (1 mul x 30 cycles) = 32 cycles
- While the clock cycles for the RISC version are:
    (3 movs x 1 cycle) + (5 adds x 1 cycle) + (5 loops x 1 cycle) = 13 cycles
- With the RISC clock cycle being shorter, RISC gives us much faster execution speeds.
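The cycle comparison is a weighted sum of instruction counts. The sketch below assumes one cycle for each mov, add and loop, 30 cycles for mul, and a loop body that runs five times; these costs are illustrative assumptions, and total_cycles is a hypothetical helper.

```python
def total_cycles(counts, cycles_per_instruction):
    """Sum of (times executed) x (cycles per execution) over all instruction kinds."""
    return sum(n * cycles_per_instruction[op] for op, n in counts.items())

# CISC fragment: two moves and one multi-cycle multiply (assumed 30 cycles).
cisc_cycles = total_cycles({"mov": 2, "mul": 1}, {"mov": 1, "mul": 30})

# RISC fragment: three moves, then add and loop each executed five times.
risc_cycles = total_cycles(
    {"mov": 3, "add": 5, "loop": 5},
    {"mov": 1, "add": 1, "loop": 1},
)
```

The RISC fragment executes more instructions (13 against 3) yet needs fewer cycles, which is the trade-off the performance equation expresses.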
RISC vs. CISC
- Because of their load-store ISAs, RISC architectures require a large number of CPU registers.
- These registers provide fast access to data during sequential program execution.
- They can also be employed to reduce the overhead typically caused by passing parameters to subprograms.
- Instead of pulling parameters off of a stack, the subprogram is directed to use a subset of registers.
RISC vs. CISC
- This is how registers can be overlapped in a RISC system.
- The current window pointer (CWP) points to the active register window.
[Figure: overlapping register windows]
RISC vs. CISC
- It is becoming increasingly difficult to distinguish RISC architectures from CISC architectures.
- Some RISC systems provide more extravagant instruction sets than some CISC systems.
- Some systems combine both approaches.
- The following two slides summarize the characteristics that traditionally typify the differences between these two architectures.
RISC vs. CISC
RISC
- Multiple register sets.
- Three operands per instruction.
- Parameter passing through register windows.
- Single-cycle instructions.
- Hardwired control.
- Highly pipelined.
CISC
- Single register set.
- One or two register operands per instruction.
- Parameter passing through memory.
- Multiple cycle instructions.
- Microprogrammed control.
- Less pipelined.
Continued...
RISC vs. CISC
RISC
- Simple instructions, few in number.
- Fixed length instructions.
- Complexity in compiler.
- Only LOAD/STORE instructions access memory.
- Few addressing modes.
CISC
- Many complex instructions.
- Variable length instructions.
- Complexity in microcode.
- Many instructions can access memory.
- Many addressing modes.
Summary
Instruction Set Design Issues
Instruction set design issues include:
- Where are operands stored?
  - registers, memory, stack, accumulator
- How many explicit operands are there?
  - 0, 1, 2, or 3
- How is the operand location specified?
  - register, immediate, indirect, ...
- What type and size of operands are supported?
  - byte, int, float, double, string, vector, ...
- What operations are supported?
  - add, sub, mul, move, compare, ...
More About General Purpose Registers
Why do almost all new architectures use GPRs?
- Registers are much faster than memory (even cache)
  - Register values are available immediately
  - When memory isn't ready, the processor must wait ("stall")
- Registers are convenient for variable storage
  - Compiler assigns some variables just to registers
  - More compact code, since small fields specify registers (compared to memory addresses)
[Figure: memory hierarchy: processor registers, cache, memory, disk]
What Operations are Needed
- Arithmetic and Logical
  - Integer arithmetic: ADD, SUB, MULT, DIV, SHIFT
  - Logical operations: AND, OR, XOR, NOT
- Data Transfer: copy, load, store
- Control: branch, jump, call, return
- Floating Point: ADDF, MULF, DIVF
  - Same as arithmetic but usually take bigger operands
- Decimal: ADDD, CONVERT
- String: move, compare, search
- Graphics: pixel and vertex, compression/decompression operations
Stack Architecture: Pros and Cons
Pros
- Good code density (implicit top of stack)
- Low hardware requirements
- Easy to write a simpler compiler for stack architectures
Cons
- Stack becomes the bottleneck
- Little ability for parallelism or pipelining
- Data is not always at the top of the stack when needed, so additional instructions like TOP and SWAP are needed
- Difficult to write an optimizing compiler for stack architectures
Accumulator Architecture: Pros and Cons
Pros
- Very low hardware requirements
- Easy to design and understand
Cons
- Accumulator becomes the bottleneck
- Little ability for parallelism or pipelining
- High memory traffic
Memory-Memory Architecture: Pros and Cons
Pros
- Requires fewer instructions (especially if 3 operands)
- Easy to write compilers for (especially if 3 operands)
Cons
- Very high memory traffic (especially if 3 operands)
- Variable number of clocks per instruction
- With two operands, more data movements are required
Memory-Register Architecture: Pros and Cons
Pros
- Some data can be accessed without loading first
- Instruction format easy to encode
- Good code density
Cons
- Operands are not equivalent (poor orthogonality)
- Variable number of clocks per instruction
- May limit number of registers
Instruction and Data Memory: Unified or Separate
[Figure: programmer's view (ADD, SUBTRACT, AND, OR, COMPARE, ...) versus computer's view (binary instruction encodings); CPU, memory and I/O; computer program (instructions)]
Princeton (Von Neumann) Architecture
- Data and instructions mixed in same unified memory
- Program as data
- Storage utilization
- Single memory interface
Harvard Architecture
- Data and instructions in separate memories
- Has advantages in certain high performance implementations
- Can optimize each memory
Classifying Instruction Set Architectures
There are four types of internal storage used by the processor to store operands, explicitly and implicitly, for execution of a program:
1. Stack
2. Accumulator
3. Set of Registers (Register-Memory)
4. Set of Registers (Register-Register/load-store)
The operands in a stack architecture are implicitly on the top of the stack, and in an accumulator architecture one operand is implicitly the accumulator. The general-purpose register architectures (Register-Memory and Register-Register) have only explicit operands, either in registers or memory locations.
Basic Addressing Classes
[Figure: basic addressing classes; label: declining cost of registers]
Operand locations for four instruction set architecture classes
The arrows indicate whether the operand is an input or the result of the ALU operation, or both an input and result. Lighter shades indicate inputs and the dark shade indicates the result. In (a), a Top Of Stack register (TOS) points to the top input operand, which is combined with the operand below. The first operand is removed from the stack, the result takes the place of the second operand, and TOS is updated to point to the result. All operands are implicit. In (b), the Accumulator is both an implicit input operand and a result. In (c), one input operand is a register, one is in memory, and the result goes to a register. All operands are registers in (d) and, like the stack architecture, can be transferred to memory only via separate instructions: push or pop for (a) and load or store for (d).
Code Sequence for C = A + B
Stack       Accumulator    Register-memory    Register-register (load-store)
Push A      Load A         Load R1,A          Load R1,A
Push B      Add B          Add R3,R1,B        Load R2,B
Add         Store C        Store R3,C         Add R3,R1,R2
Pop C                                         Store R3,C
The code sequence for C = A + B for four classes of instruction sets. Note that the Add instruction has implicit operands for stack and accumulator architectures, and explicit operands for register architectures. It is assumed that A, B, and C all belong in memory and that the values of A and B cannot be destroyed.
Stack Architectures
Accumulator Architectures
Register-Set Architectures
Register-to-Register: Load-Store Architectures
Register-to-Memory Architectures
Memory-to-Memory Architectures
Instruction Formats
Instruction Set Architecture (ISA)
- To command a computer's hardware, you must speak its language.
- The words of a machine's language are called instructions, and its vocabulary is called an instruction set.
- Once you learn one machine language, it is easy to pick up others:
  - There are few fundamental operations that all computers must provide.
  - All designers have the same goal of finding a language that simplifies building the hardware and the compiler while maximizing performance and minimizing cost.
- Learning how instructions are represented leads to discovering the secret of computing: the stored-program concept.
- The MIPS instruction set is used as a case study.
Interface Design
A good interface:
- Lasts through many implementations (portability, compatibility)
- Is used in many different ways (generality)
- Provides convenient functionality to higher levels
- Permits an efficient implementation at lower levels
Design decisions must take into account:
- Technology
- Machine organization
- Programming languages
- Compiler technology
- Operating systems
[Figure: a single interface with several uses above it and several implementations (imp) below it, over time]
Classifying Instruction Set Architectures
Accumulator Architecture
- Common in early stored-program computers, when hardware was so expensive
- Machine has only one register (accumulator) involved in all math and logical operations
- All operations assume the accumulator as a source operand and a destination for the operation, with the other operand stored in memory
Extended Accumulator Architecture
- Dedicated registers for specific operations, e.g., stack and array index registers, added
- The 8086 microprocessor is an example of such a special-purpose register architecture
General-Purpose Register Architecture
- MIPS is an example of such an architecture, where registers are not tied to a single role
- This type of instruction set can be further divided into:
  - Register-memory: allows for one operand to be in memory
  - Register-register (load-store): demands all operands to be in registers

Machine           # general-purpose registers   Architecture style                Year
Motorola 6800     2                             Accumulator                       1974
DEC VAX           16                            Register-memory, memory-memory    1977
Intel 8086        1                             Extended accumulator              1978
Motorola 68000    16                            Register-memory                   1980
Intel 80386       8                             Register-memory                   1985
PowerPC           32                            Load-store                        1992
DEC Alpha         32                            Load-store                        1992
Compact Code and Stack Architectures
- When memory was scarce, machines like the Intel 80x86 had variable-length instructions to match varying operand specifications and minimize code size.
- Stack machines abandoned registers altogether, arguing that it is hard for compilers to use them efficiently.
- Operands are pushed onto a stack from memory, and results are popped from the stack to memory.
- Operations take their operands by default from the top of the stack and insert the results back onto the stack.
- Stack machines simplify compilers and lend themselves to a compact instruction encoding, but limit compiler optimization (e.g., in math expressions).
Example: A = B + C
Push AddressC    # Top=Top+4; Stack[Top]=Memory[AddressC]
Push AddressB    # Top=Top+4; Stack[Top]=Memory[AddressB]
add              # Stack[Top-4]=Stack[Top]+Stack[Top-4]; Top=Top-4
Pop AddressA     # Memory[AddressA]=Stack[Top]; Top=Top-4
- Compact code is important for the heralded network computers, where programs must be downloaded over the Internet (e.g., Java-based applications).
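The stack example A = B + C can be traced with a few lines of code. This is a sketch: a Python list stands in for the hardware stack (so the Top pointer arithmetic is implicit), and run_stack plus the sample values are assumptions for illustration.

```python
def run_stack(program, memory):
    """Execute push/add/pop instructions; operands of add are implicit."""
    stack = []
    for op, *operand in program:
        if op == "push":            # copy a memory value onto the stack
            stack.append(memory[operand[0]])
        elif op == "pop":           # move the top of stack back to memory
            memory[operand[0]] = stack.pop()
        elif op == "add":           # replace the top two entries with their sum
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)
    return memory

memory = {"A": 0, "B": 7, "C": 5}   # sample values
run_stack([("push", "C"), ("push", "B"), ("add",), ("pop", "A")], memory)
```

Only push and pop name an address; add carries no operand fields at all, which is where the compact encoding comes from.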
Other Types of Architecture
High-Level-Language Architecture
- In the 1960s, systems software was rarely written in high-level languages, and virtually every commercial operating system before Unix was written in assembly.
- Some people blamed the code density on the instruction set rather than the programming language.
- A machine design philosophy was advocated with the goal of making the hardware more like high-level languages.
- The effectiveness of high-level languages, memory size limitations and the lack of efficient compilers doomed this philosophy to a historical footnote.
Reduced Instruction Set Architecture
- With the recent development in compiler technology and expanded memory sizes, fewer programmers are using assembly-level coding.
- Instruction set architectures came to be measured by how well compilers, rather than programmers, use them.
- RISC architecture favors simplifying hardware design over enriching the offered set of instructions, relying on compilers to use them effectively to perform complex operations.
- Virtually all new architectures since 1982 follow the RISC philosophy of fixed instruction lengths, load-store operations and limited addressing modes.
Evolution of Instruction Sets
[Figure: evolution of instruction sets]
Single Accumulator (EDSAC, 1950)
Accumulator + Index Registers (Manchester Mark I, IBM 700 series, 1953)
Separation of Programming Model from Implementation
- High-level Language Based (B5000, 1963)
- Concept of a Family (IBM 360, 1964)
General Purpose Register Machines
- Complex Instruction Sets (VAX, Intel 432, 1977-80)
- Load/Store Architecture (CDC 6600, Cray 1, 1963-76)
RISC (MIPS, SPARC, HP-PA, IBM RS6000, ..., 1987)
Register-Memory Architectures
# memory addresses   Max. number of operands   Examples
0                    3                         SPARC, MIPS, PowerPC, ALPHA
1                    2                         Intel 80x86, Motorola 68000
2                    2                         VAX (also has 3 operands format)
3                    3                         VAX (also has 2 operands format)
Effect of the number of memory operands
Interpreting Memory Addressing
- The address of a word matches the byte address of one of its 4 bytes.
- The addresses of sequential words differ by 4 (word size in bytes).
- Word addresses are multiples of 4 (alignment restriction).
- Machines that use the address of the leftmost byte as the word address are called "Big Endian", and those that use the rightmost byte are called "Little Endian".
- Misalignment complicates memory access and causes programs to run slower (some machines do not allow misaligned memory access at all).
- Byte ordering can be a problem when exchanging data among different machines.
- Byte addresses affect array index calculation, to account for word addressing and the offset within the word.

Object addressed   Aligned at byte offsets   Misaligned at byte offsets
Byte               0,1,2,3,4,5,6,7           Never
Half word          0,2,4,6                   1,3,5,7
Word               0,4                       1,2,3,5,6,7
Double word        0                         1,2,3,4,5,6,7
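Alignment and byte ordering can both be checked in a few lines. This sketch uses the standard struct module; is_aligned and the sample value 0x0A0B0C0D are illustrative choices, not taken from the slides.

```python
import struct

def is_aligned(address, size):
    """An object of `size` bytes is aligned when its address is a multiple of its size."""
    return address % size == 0

value = 0x0A0B0C0D
big_endian = struct.pack(">I", value)      # most significant byte at the lowest address
little_endian = struct.pack("<I", value)   # least significant byte at the lowest address

# Reading big-endian bytes as if they were little-endian gives a different number.
misread = struct.unpack("<I", big_endian)[0]
```

The misread value shows why byte ordering matters when two machines exchange binary data without agreeing on a format.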
Addressing Modes
- Addressing modes refer to how to specify the location of an operand (effective address).
- Addressing modes have the ability to:
  - Significantly reduce instruction counts
  - Increase the average CPI
  - Increase the complexity of building a machine
- The VAX machine is used for benchmark data since it supports a wide range of memory addressing modes.
- Well-known addressing modes can be classified based on:
  - the source of the data: register, immediate or memory
  - the address calculation: direct and indirect
- An indexed addressing mode is usually provided to allow efficient implementation of loops and array access.
Example of Addressing Modes
Register
  Example:   Add R4, R3
  Meaning:   Regs[R4] ← Regs[R4] + Regs[R3]
  When used: When a value is in a register
Immediate
  Example:   Add R4, #3
  Meaning:   Regs[R4] ← Regs[R4] + 3
  When used: For constants
Displacement
  Example:   Add R4, 100(R1)
  Meaning:   Regs[R4] ← Regs[R4] + Mem[100 + Regs[R1]]
  When used: Accessing local variables
Register indirect
  Example:   Add R4, (R1)
  Meaning:   Regs[R4] ← Regs[R4] + Mem[Regs[R1]]
  When used: Accessing using a pointer or a computed address
Indexed
  Example:   Add R4, (R1 + R2)
  Meaning:   Regs[R4] ← Regs[R4] + Mem[Regs[R1] + Regs[R2]]
  When used: Sometimes useful in array addressing: R1 = base of the array; R2 = index amount
Direct or absolute
  Example:   Add R4, (1001)
  Meaning:   Regs[R4] ← Regs[R4] + Mem[1001]
  When used: Sometimes useful for accessing static data; address constant may need to be large
Memory indirect or memory deferred
  Example:   Add R4, @(R3)
  Meaning:   Regs[R4] ← Regs[R4] + Mem[Mem[Regs[R3]]]
  When used: If R3 is the address of the pointer p, then the mode yields *p
Autoincrement
  Example:   Add R4, (R2)+
  Meaning:   Regs[R4] ← Regs[R4] + Mem[Regs[R2]]
             Regs[R2] ← Regs[R2] + d
  When used: Useful for stepping through arrays within a loop. R2 points to the start of the array; each reference increments R2 by d
Autodecrement
  Example:   Add R4, -(R2)
  Meaning:   Regs[R2] ← Regs[R2] - d
             Regs[R4] ← Regs[R4] + Mem[Regs[R2]]
  When used: Same use as autoincrement. Autodecrement/increment can also act as push/pop to implement a stack
Scaled
  Example:   Add R4, 100(R2)[R3]
  Meaning:   Regs[R4] ← Regs[R4] + Mem[100 + Regs[R2] + Regs[R3] × d]
  When used: Used to index arrays
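The effective address computed by several of these modes can be sketched directly. The register contents, the operand size d = 4 and the function names below are assumptions chosen for illustration.

```python
regs = {"R1": 200, "R2": 8, "R3": 3}   # sample register contents
d = 4                                  # operand size in bytes (assumed)

def displacement(offset, base):
    return offset + regs[base]

def register_indirect(base):
    return regs[base]

def indexed(base, index):
    return regs[base] + regs[index]

def scaled(offset, base, index):
    return offset + regs[base] + regs[index] * d

def autoincrement(reg):
    """Use the register as the address, then step it to the next element."""
    address = regs[reg]
    regs[reg] += d
    return address

ea_displacement = displacement(100, "R1")   # 100(R1)
ea_indirect = register_indirect("R1")       # (R1)
ea_indexed = indexed("R1", "R2")            # (R1 + R2)
ea_scaled = scaled(100, "R2", "R3")         # 100(R2)[R3]
ea_auto = autoincrement("R2")               # (R2)+
```

Each richer mode folds an address calculation into the instruction that would otherwise take one or more separate add instructions, which is how addressing modes reduce instruction count at the cost of hardware complexity.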
Addressing Mode for Signal Processing
- DSP offers special addressing modes to better serve popular algorithms.
- Special features require either hand coding or a compiler that uses such features (GNU would not be a good choice).
Modulo addressing
- Since DSP deals with continuous data streams, circular buffers are widely used.
- Circular or modulo addressing allows automatic increment and decrement, and resets the pointer when reaching the end of the buffer.
Reverse addressing
- The resulting address is the reverse order of the current address.
- Reverse addressing mode expedites the access, which otherwise requires a number of logical instructions or extra memory accesses.
Fast Fourier Transform
0 (000)    0 (000)
1 (001)    4 (100)
2 (010)    2 (010)
3 (011)    6 (110)
4 (100)    1 (001)
5 (101)    5 (101)
6 (110)    3 (011)
7 (111)    7 (111)
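Both DSP addressing modes are simple to express in software, which also shows the work the hardware saves on every access. This is a sketch; bit_reverse and modulo_next are hypothetical helper names.

```python
def bit_reverse(index, bits):
    """Reverse the lowest `bits` bits of index, as reverse addressing does for an FFT."""
    result = 0
    for _ in range(bits):
        result = (result << 1) | (index & 1)
        index >>= 1
    return result

def modulo_next(pointer, step, base, length):
    """Advance a pointer inside a circular buffer occupying [base, base + length)."""
    return base + (pointer - base + step) % length

fft_order = [bit_reverse(i, 3) for i in range(8)]
wrapped_forward = modulo_next(7, 1, base=0, length=8)    # past the end, back to the start
wrapped_backward = modulo_next(0, -1, base=0, length=8)  # before the start, to the end
```

In software each reversed index costs a loop of shifts and masks; a reverse addressing mode produces the same address as a side effect of the memory access.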
Operations of the Computer Hardware
"There must certainly be instructions for performing the fundamental arithmetic operations."
Burks, Goldstine and von Neumann, 1947
- Assembly language is a symbolic representation of what the processor actually understands.
- The MIPS assembler allows only one instruction per line and ignores comments following # until the end of the line.
Example:
Translation of a segment of a C program to MIPS assembly instructions
C: f = (g + h) - (i + j)
MIPS:
add t0, g, h     # temp variable t0 contains "g + h"
add t1, i, j     # temp variable t1 contains "i + j"
sub f, t0, t1    # f = t0 - t1 = (g + h) - (i + j)
Operations in the Instruction Set
Operator type            Examples
Arithmetic and logical   Integer arithmetic and logical operations: add, and, subtract, or
Data transfer            Loads-stores (move instructions on machines with memory addressing)
Control                  Branch, jump, procedure call and return, trap
System                   Operating system call, virtual memory management instructions
Floating point           Floating point instructions: add, multiply
Decimal                  Decimal add, decimal multiply, decimal to character conversion
String                   String move, string compare, string search
Graphics                 Pixel operations, compression/decompression operations
- Arithmetic, logical, data transfer and control are almost standard categories for all machines.
- System instructions are required for multi-programming environments, although support for system functions varies.
- Decimal and string instructions can be primitives, e.g., on the IBM 360 and the VAX.
- Support for floating point, decimal, string and graphics can be optional, sometimes provided via a co-processor.
- Some machines rely on the compiler to synthesize special operations, such as string handling, from simpler instructions.
Operations for Media and Signal Processing
Single instruction multiple data (SIMD) and vector instructions are often supported in DSPs, which are commonly used in multimedia and signal processing applications
Partitioned Add (integer)
  - Performs multiple 16-bit additions on a 64-bit ALU, since most data are narrow
  - Increases ALU throughput for multimedia applications
Paired single operations (float)
  - Allow the same register to act as two operands to the same operation
  - Handy in dealing with vertices and coordinates
Multiply and accumulate
  - Very handy for calculating dot products of vectors (signal processing) and matrix multiplication
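The partitioned add and multiply-accumulate operations above can be sketched in a few lines of Python. This is only an illustration: the lane width of 16 bits, the four-lane layout, and the helper names `pack`, `unpack`, `partitioned_add` and `multiply_accumulate` are assumptions made for the example, not part of any real instruction set.

```python
LANE_BITS = 16                      # assumed lane width
LANES = 4                           # 4 lanes x 16 bits = one 64-bit word
LANE_MASK = (1 << LANE_BITS) - 1


def pack(values):
    """Pack LANES small integers into one 64-bit word (lane 0 in the low bits)."""
    word = 0
    for i, v in enumerate(values):
        word |= (v & LANE_MASK) << (i * LANE_BITS)
    return word


def unpack(word):
    """Split a 64-bit word back into its lane values."""
    return [(word >> (i * LANE_BITS)) & LANE_MASK for i in range(LANES)]


def partitioned_add(a, b):
    """Add lane by lane; a carry out of one lane never spills into the next."""
    return pack([x + y for x, y in zip(unpack(a), unpack(b))])


def multiply_accumulate(xs, ys):
    """Dot product written as a chain of multiply-accumulate steps."""
    acc = 0
    for x, y in zip(xs, ys):
        acc += x * y                # one MAC step per element pair
    return acc


a = pack([1, 2, 3, 0xFFFF])
b = pack([10, 20, 30, 1])
print(unpack(partitioned_add(a, b)))           # last lane wraps around to 0
print(multiply_accumulate([1, 2, 3], [4, 5, 6]))
```

One 64-bit operation here stands in for four independent narrow additions, which is the throughput gain the slide refers to.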
Frequency of Operations Usage
Make the common case fast by focusing on these operations
The most widely executed instructions are the simple operations of an instruction set
The following is the average usage in SPECint92 on the Intel 80x86:
Rank  80x86 instruction        Integer average (% total executed)
1     Load                     22%
2     Conditional branch       20%
3     Compare                  16%
4     Store                    12%
5     Add                      8%
6     And                      6%
7     Sub                      5%
8     Move register-register   4%
9     Call                     1%
10    Return                   1%
      Total                    96%
Control Flow Instructions
Jump for unconditional change in the control flow
Branch for conditional change in the control flow
Procedure calls and returns
Data is based on SPEC on Alpha
Destination Address Definition
Addressing relative to the program counter proved to be the best choice for forward and backward branches or jumps (independent of the load address)
To allow for dynamic loading of library routines, register indirect addressing allows addresses to be loaded in special registers
(e.g. virtual functions in C++ and system calls in a case statement)
Data is based on SPEC on Alpha
Condition Evaluation
Compare-and-branch can be efficient if the majority of conditions are comparisons with zero
Remember to focus on the common case
Based on SPEC on MIPS
Frequency of Types of Comparison
Data is based on SPEC on Alpha
Different benchmarks and machines set new design priorities
DSPs support a repeat instruction for for-loops (vectors) using registers
Supporting Procedures
Execution of a procedure follows these steps:
  - Store parameters in a place accessible to the procedure
  - Transfer control to the procedure
  - Acquire the storage resources needed for the procedure
  - Perform the desired task
  - Store the result value in a place accessible to the calling program
  - Return control to the point of origin
The hardware provides a program counter to trace instruction flow and manage transfer of control
Parameter Passing
  - Registers can be used for passing a small number of parameters
  - A stack is used to spill registers of the current context and make room for the called procedure to run, and to allow large parameters to be passed
  - Storage of machine state can be performed by the caller or the callee
  - Handling of shared variables is important to ensure correct semantics and thus requires clear specifications in the library interface
Global variables stored in registers need careful handling
Type and Size of Operands
The type of an operand is designated by encoding it in the instruction's operation code
The type of an operand, e.g. single-precision float, effectively gives its size
Common operand types include character, half-word and word-size integer, and single- and double-precision floating point
Characters are almost always in ASCII, integers in 2's complement, and floating point in IEEE 754
The 16-bit Unicode used in Java is gaining popularity due to its support for international character sets
For business applications, some architectures support a decimal format in binary coded decimal (BCD)
Depending on the size of the word, the complexity of handling different operand types differs
DSPs offer fixed-point data types as a low-cost alternative to floating-point arithmetic, and allow a single exponent to be shared by multiple numbers
For graphics applications, vertex and pixel operands are added features
Size of Operands
The double-word data type is used for double-precision floating point operations and for address storage in machines with a 64-bit wide address bus
Words are used for integer operations and for 32-bit address bus machines
Because of the mix in SPEC, word and double-word data types dominate
Frequency of reference by size, based on SPEC2000 on Alpha
Instruction Representation
Humans are taught to think in base 10 (decimal), but numbers may be represented in any base (123 in base 10 = 1111011 in binary, or base 2)
Numbers are stored in computers as a series of high and low electronic signals (binary numbers)
Binary digits are called bits and are considered the atom of computing
Each piece of an instruction is a number, and placing these numbers together forms the instruction
The assembler translates the symbolic assembly instructions into machine language instructions (machine code)
Example:
Assembly: add $t0, $s1, $s2
M/C language (decimal):
  0       17      18      8       0       32
M/C language (binary):
  000000  10001   10010   01000   00000   100000
  6 bits  5 bits  5 bits  5 bits  5 bits  6 bits
Note: the MIPS compiler by default maps $s0..$s7 to registers 16-23 and $t0..$t7 to registers 8-15
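As a rough check of the example above, the six R-format fields can be packed into one 32-bit word with shifts and ORs. The function names `encode_r` and `split_fields` are made up for this sketch; the field widths (6, 5, 5, 5, 5, 6 bits) and the register numbers ($t0 = 8, $s1 = 17, $s2 = 18) follow the slide.

```python
def encode_r(op, rs, rt, rd, shamt, funct):
    """Pack the six R-format fields into a 32-bit machine word."""
    return (op << 26) | (rs << 21) | (rt << 16) | (rd << 11) | (shamt << 6) | funct


def split_fields(word):
    """Show a 32-bit word as its six R-format bit fields."""
    bits = f"{word:032b}"
    widths = [6, 5, 5, 5, 5, 6]
    fields, start = [], 0
    for width in widths:
        fields.append(bits[start:start + width])
        start += width
    return " ".join(fields)


# add $t0, $s1, $s2  ->  op=0, rs=17, rt=18, rd=8, shamt=0, funct=32
word = encode_r(0, 17, 18, 8, 0, 32)
print(split_fields(word))
print(hex(word))
```

Note that the destination register comes first in the assembly text but sits in the fourth field (rd) of the machine word.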
Encoding an Instruction Set
Instruction encoding affects the size of the compiled program and the complexity of the CPU implementation
The operation is typically specified in one field, called the opcode
The addressing mode for the operand can be encoded with the operation, or specified through a separate identifier when a large number of modes is supported
The architecture must balance several competing factors:
  - Desire to support as many registers and addressing modes as possible
  - Effect of operand specification on the size of the instruction (program)
  - Desire to simplify instruction fetching and decoding during execution
Fixed-size instruction encoding simplifies the CPU design while limiting the addressing modes supported
An architect who cares about code size can use variable-size encoding
A hybrid approach is to allow variability by supporting multiple-sized instructions
Encoding Examples
MIPS Instruction Format
Register-format instructions
  op      rs      rt      rd      shamt   funct
  6 bits  5 bits  5 bits  5 bits  5 bits  6 bits
op: Basic operation of the instruction, traditionally called the opcode
rs: The first register source operand
rt: The second register source operand
rd: The register destination operand; it gets the result of the operation
shamt: Shift amount
funct: This field selects the specific variant of the operation in the op field
Immediate-type instructions
  op      rs      rt      address
  6 bits  5 bits  5 bits  16 bits
Some instructions need longer fields than those provided, for large-value constants
The 16-bit address means a load word instruction can load a word within a region of ±2^15 bytes of the address in the base register
Example: lw $t0, 32($s3)   # Temporary register $t0 gets A[8]
Instruction  Format  op  rs   rt   rd   shamt  funct  address
add          R       0   reg  reg  reg  0      32     N/A
sub          R       0   reg  reg  reg  0      34     N/A
lw           I       35  reg  reg  N/A  N/A    N/A    address
sw           I       43  reg  reg  N/A  N/A    N/A    address
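The immediate format can be sketched the same way. The snippet below packs an I-type instruction and sign-extends the 16-bit address field when decoding, which is where the ±2^15 byte range comes from. The names `encode_i` and `decode_i` are hypothetical helpers for this illustration; opcode 35 for lw and register numbers $s3 = 19, $t0 = 8 follow the MIPS conventions quoted on these slides.

```python
def encode_i(op, rs, rt, imm):
    """Pack an I-format instruction; imm is kept as a 16-bit two's complement field."""
    return (op << 26) | (rs << 21) | (rt << 16) | (imm & 0xFFFF)


def decode_i(word):
    """Unpack an I-format word and sign-extend the 16-bit immediate."""
    imm = word & 0xFFFF
    if imm & 0x8000:                # top bit set: value is negative
        imm -= 0x10000
    return (word >> 26) & 0x3F, (word >> 21) & 0x1F, (word >> 16) & 0x1F, imm


# lw $t0, 32($s3)  ->  op=35, rs=19 (base register), rt=8 (destination), address=32
word = encode_i(35, 19, 8, 32)
print(hex(word))
print(decode_i(word))

# Offsets from -32768 to 32767 survive the round trip unchanged.
print(decode_i(encode_i(35, 19, 8, -4)))
```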
The Stored Program Concept
Learning how instructions are represented leads to discovering the secret of computing: the stored-program concept
Today's computers are built on two key principles:
  - Instructions are represented as numbers
  - Programs can be stored in memory to be read or written just like numbers
The power of the concept: memory can contain
  - the source code for an editor
  - the compiled m/c code for the editor
  - the text that the compiled program is using
  - the compiler that generated the code
[Figure: a processor connected to a memory holding an accounting program (machine code), an editor program (machine code), a C compiler (machine code), payroll data, book text, and source code in C for the editor program]
Compiling if-then-else in MIPS
Assuming the five variables f, g, h, i, and j correspond to the five registers $s0 through $s4, what is the compiled MIPS code for the following C if statement?
  if (i == j) f = g + h; else f = g - h;
[Figure: flowchart testing i == j; when i = j, f = g + h; when i ≠ j (Else:), f = g - h; both paths meet at Exit:]
MIPS:
        bne $s3, $s4, Else     # go to Else if i ≠ j
        add $s0, $s1, $s2      # f = g + h (skipped if i ≠ j)
        j Exit
  Else: sub $s0, $s1, $s2      # f = g - h (skipped if i = j)
  Exit:
Typical Compilation
Major Types of Optimization
Optimization name: Explanation (Frequency; N.M. = not measured)
High-level: at or near source level, machine independent
  - Procedure integration: Replace procedure call by procedure body (N.M.)
Local: within straight-line code
  - Common subexpression elimination: Replace two instances of the same computation by a single copy (18%)
  - Constant propagation: Replace all instances of a variable that is assigned a constant with the constant (22%)
  - Stack height reduction: Rearrange expression tree to minimize resources needed for expression evaluation (N.M.)
Global: across a branch
  - Global common subexpression elimination: Same as local, but this version crosses branches (13%)
  - Copy propagation: Replace all instances of a variable A that has been assigned X (i.e., A = X) with X (11%)
  - Code motion: Remove code from a loop that computes the same value in each iteration of the loop (16%)
  - Induction variable elimination: Simplify/eliminate array-addressing calculations within loops (2%)
Machine-dependent: depends on machine knowledge
  - Strength reduction: Many examples, such as replacing a multiply by a constant with adds and shifts (N.M.)
  - Pipeline scheduling: Reorder instructions to improve pipeline performance (N.M.)
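Strength reduction, the machine-dependent optimization listed above, is easy to illustrate by hand. The sketch below rewrites a multiply by the constant 10 as two shifts and an add; the choice of 10 and the function names are arbitrary examples, and a real compiler would apply the rewrite to machine code rather than Python source.

```python
def times_ten_multiply(x):
    """Original form: one multiply by a constant."""
    return x * 10


def times_ten_shift_add(x):
    """Strength-reduced form: 10*x = 8*x + 2*x, using shifts and one add."""
    return (x << 3) + (x << 1)


# Both forms agree on every value tried here, including negatives.
for value in (-7, 0, 1, 12, 1000):
    print(value, times_ten_multiply(value), times_ten_shift_add(value))
```

Whether the rewritten form is actually faster depends on the relative cost of multiply, shift and add on the target machine, which is why the table files it under machine-dependent optimizations.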
Effect of Compiler Optimization
Measurements taken on MIPS
[Figure: results by program and compiler optimization level]
Level 0: non-optimized code
Level 1: local optimization
Level 2: global optimization, s/w pipelining
Level 3: adds procedure integration
Compiler Support for Multimedia Instr.
Intel's MMX and PowerPC AltiVec have small vector processing capabilities targeting multimedia applications (to speed up graphics)
Intel added a new set of instructions called Streaming SIMD Extension
A major advantage of vector computers is hiding the latency of memory access by loading multiple elements and then overlapping execution with data transfer
Vector computers typically have strided and/or gather/scatter addressing to perform operations on distant memory locations
  - Strided addressing allows memory access in increments larger than one
  - Gather/scatter addressing is similar to register indirect mode, where the addresses are stored instead of the data
  - Supporting vector operations without strided addressing, as in Intel's MMX, limits the potential speedup
Such limited support for vector processing makes vectorizing compiler optimization unpopular and restricts its scope to hand-coded routines
SIMD instructions on MMX and AltiVec tend to be solutions, not primitives
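The difference between strided and gather/scatter addressing can be shown with a plain list standing in for memory. This is a simplified model: `strided_load`, `gather` and `scatter` are names invented for the example, and real vector hardware works on addresses and vector registers rather than Python lists.

```python
def strided_load(memory, base, stride, count):
    """Load `count` elements starting at `base`, stepping `stride` locations each time."""
    return [memory[base + i * stride] for i in range(count)]


def gather(memory, indices):
    """Load elements from arbitrary locations named by an index vector."""
    return [memory[i] for i in indices]


def scatter(memory, indices, values):
    """Store each value to the location named by the matching index."""
    for i, v in zip(indices, values):
        memory[i] = v


memory = list(range(100, 116))              # 16 locations holding 100..115

# A stride of 4 walks down one column of a 4x4 matrix stored row by row.
print(strided_load(memory, 1, 4, 4))

# Gather picks up scattered elements; the index vector holds addresses, not data.
print(gather(memory, [7, 0, 3]))

scatter(memory, [2, 5], [-1, -2])
print(memory[:6])
```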
Starting a Program
[Figure: C program -> Compiler -> Assembly language program -> Assembler -> Object: machine language module, plus Object: library routine (machine language) -> Linker -> Executable: machine language program -> Loader -> Memory]
Linker
  - Place code & data modules symbolically in memory
  - Determine the addresses of data & instruction labels
  - Patch both internal & external references
Object files for Unix typically contain:
  - Header: size & position of components
  - Text segment: machine code
  - Data segment: static and dynamic variables
  - Relocation info: identifies absolute memory references
  - Symbol table: name & location of labels, procedures and variables
  - Debugging info: mapping of source to object code, break points, etc.
Loading Executable Program
[Figure: memory layout, addresses in hex]
  $sp  7fff ffff   Stack
                   Dynamic data
  $gp  1000 8000
       1000 0000   Static data
  pc   0040 0000   Text
       0           Reserved
To load an executable, the operating system follows these steps:
  - Reads the executable file header to determine the size of the text and data segments
  - Creates an address space large enough for the text and data
  - Copies the instructions and data from the executable file into memory
  - Copies the parameters (if any) to the main program onto the stack
  - Initializes the machine registers and sets the stack pointer to the first free location
  - Jumps to a start-up routine that copies the parameters into the argument registers and calls the main routine of the program
Instruction Set Design Issues
  - Number of Addresses
  - Flow of Control
  - Operand Types
  - Addressing Modes
  - Instruction Types
  - Instruction Formats
Number of Addresses
Four categories
  3-address machines
    - 2 for the source operands and one for the result
  2-address machines
    - One address doubles as source and result
  1-address machines
    - Accumulator machines
    - Accumulator is used for one source and the result
  0-address machines
    - Stack machines
    - Operands are taken from the stack
    - Result goes onto the stack
Number of Addresses (cont.)
Three-address machines
  - Two for the source operands, one for the result
  - RISC processors use three addresses
  - Sample instructions
      add  dest,src1,src2    M(dest)=[src1]+[src2]
      sub  dest,src1,src2    M(dest)=[src1]-[src2]
      mult dest,src1,src2    M(dest)=[src1]*[src2]
Three addresses: Operand 1, Operand 2, Result
Example: a = b + c
Three-address instruction formats are not common, because they require a relatively long instruction format to hold the three address references
Number of Addresses (cont.)
Example
  C statement
      A = B + C * D - E + F + A
  Equivalent code:
      mult T,C,D    ; T = C*D
      add  T,T,B    ; T = B+C*D
      sub  T,T,E    ; T = B+C*D-E
      add  T,T,F    ; T = B+C*D-E+F
      add  A,T,A    ; A = B+C*D-E+F+A
Number of Addresses (cont.)
Two-address machines
  - One address doubles (for source operand & result)
  - The last example makes a case for it
    - Address T is used twice
  - Sample instructions
      load dest,src    M(dest)=[src]
      add  dest,src    M(dest)=[dest]+[src]
      sub  dest,src    M(dest)=[dest]-[src]
      mult dest,src    M(dest)=[dest]*[src]
Two addresses: one address doubles as operand and result
Example: a = a + b
The two-address format reduces the space requirement but also introduces some awkwardness. To avoid altering the value of an operand, a MOVE instruction is used to move one of the values to a result or temporary location before performing the operation
Number of Addresses (cont.)
Example
  C statement
      A = B + C * D - E + F + A
  Equivalent code:
      load T,C    ; T = C
      mult T,D    ; T = C*D
      add  T,B    ; T = B+C*D
      sub  T,E    ; T = B+C*D-E
      add  T,F    ; T = B+C*D-E+F
      add  A,T    ; A = B+C*D-E+F+A
Number of Addresses (cont.)
One-address machines
  - Use a special set of registers called accumulators
    - Specify one source operand & receive the result
  - Called accumulator machines
  - Sample instructions
      load  addr    accum = [addr]
      store addr    M[addr] = accum
      add   addr    accum = accum + [addr]
      sub   addr    accum = accum - [addr]
      mult  addr    accum = accum * [addr]
Number of Addresses (cont.)
Example
  C statement
      A = B + C * D - E + F + A
  Equivalent code:
      load  C    ; load C into accum
      mult  D    ; accum = C*D
      add   B    ; accum = C*D+B
      sub   E    ; accum = B+C*D-E
      add   F    ; accum = B+C*D-E+F
      add   A    ; accum = B+C*D-E+F+A
      store A    ; store accum contents in A
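The one-address example can be traced with a tiny interpreter. This is a sketch only: `run_accumulator`, the tuple-based program encoding and the sample values for A through F are assumptions made for the illustration, while the instruction semantics follow the sample instructions on the previous slide.

```python
def run_accumulator(program, memory):
    """Execute one-address instructions against a single accumulator."""
    accum = 0
    for op, addr in program:
        if op == "load":
            accum = memory[addr]
        elif op == "store":
            memory[addr] = accum
        elif op == "add":
            accum += memory[addr]
        elif op == "sub":
            accum -= memory[addr]
        elif op == "mult":
            accum *= memory[addr]
        else:
            raise ValueError(f"unknown instruction: {op}")
    return memory


# A = B + C * D - E + F + A, in the same order as the slide
PROGRAM = [
    ("load", "C"), ("mult", "D"), ("add", "B"), ("sub", "E"),
    ("add", "F"), ("add", "A"), ("store", "A"),
]

memory = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}   # arbitrary sample values
print(run_accumulator(PROGRAM, memory)["A"])
```

Each instruction names only one memory operand; the accumulator is the implied second source and the implied destination, which is why seven short instructions are needed here.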
Number of Addresses (cont.)
Zero-address machines
  - Stack supplies operands and receives the result
    - Special instructions to load and store use an address
  - Called stack machines (Ex: HP3000, Burroughs B5500)
  - Sample instructions
      push addr    push([addr])
      pop  addr    pop([addr])
      add          push(pop + pop)
      sub          push(pop - pop)
      mult         push(pop * pop)
Number of Addresses (cont.)
Example
  C statement
      A = B + C * D - E + F + A
  Equivalent code:
      push E
      push C
      push D
      mult
      push B
      add
      sub
      push F
      add
      push A
      add
      pop  A
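The zero-address code can be checked with a small stack-machine interpreter. As with any sketch, the details are assumptions: `run_stack` and the tuple encoding are invented here, the sample values are arbitrary, and `sub` is taken to mean push(pop - pop) with the first pop returning the top of the stack, as in the sample instructions above.

```python
def run_stack(program, memory):
    """Execute zero-address instructions; only push and pop carry an address."""
    stack = []
    for instr in program:
        op = instr[0]
        if op == "push":
            stack.append(memory[instr[1]])
        elif op == "pop":
            memory[instr[1]] = stack.pop()
        elif op in ("add", "sub", "mult"):
            first = stack.pop()             # top of stack
            second = stack.pop()            # element below it
            if op == "add":
                stack.append(first + second)
            elif op == "sub":
                stack.append(first - second)
            else:
                stack.append(first * second)
        else:
            raise ValueError(f"unknown instruction: {op}")
    return memory


# A = B + C * D - E + F + A, in the same order as the slide
PROGRAM = [
    ("push", "E"), ("push", "C"), ("push", "D"), ("mult",),
    ("push", "B"), ("add",), ("sub",),
    ("push", "F"), ("add",), ("push", "A"), ("add",), ("pop", "A"),
]

memory = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}   # arbitrary sample values
print(run_stack(PROGRAM, memory)["A"])
```

E is pushed first so that it ends up below B + C*D when `sub` executes, giving (B + C*D) - E rather than the reverse.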
Load/Store Architecture
Instructions expect operands in internal processor registers
  - Special LOAD and STORE instructions move data between registers and memory
  - RISC uses this architecture
  - Reduces instruction length
Load/Store Architecture (cont.)
Sample instructions
      load  Rd,addr       Rd = [addr]
      store addr,Rs       (addr) = Rs
      add   Rd,Rs1,Rs2    Rd = Rs1 + Rs2
      sub   Rd,Rs1,Rs2    Rd = Rs1 - Rs2
      mult  Rd,Rs1,Rs2    Rd = Rs1 * Rs2
Number of Addresses (cont.)
Example
  C statement
      A = B + C * D - E + F + A
  Equivalent code:
      load  R1,B
      load  R2,C
      load  R3,D
      load  R4,E
      load  R5,F
      load  R6,A
      mult  R2,R2,R3
      add   R2,R2,R1
      sub   R2,R2,R4
      add   R2,R2,R5
      add   R2,R2,R6
      store A,R2
Flow of Control
Default is sequential flow
Several instructions alter this default execution
  - Branches
    - Unconditional
    - Conditional
    - Delayed branches
  - Procedure calls
    - Delayed procedure calls
Flow of Control (cont.)
Branches
  - Unconditional
    - Absolute address
    - PC-relative
      - Target address is specified relative to PC contents
      - Relocatable code
  - Example: MIPS
    - Absolute address
        j target
    - PC-relative
        b target
Flow of Control (cont.)
Branches
  - Conditional
    - Jump is taken only if the condition is met
  - Two types
    - Set-Then-Jump
      - Condition testing is separated from branching
      - Condition code registers are used to convey the condition test result
      - Condition code registers keep a record of the status of the last ALU operation, such as the overflow condition
    - Example: Pentium code
        cmp AX,BX     ; compare AX and BX
        je  target    ; jump if equal
Flow of Control (cont.)
    - Test-and-Jump
      - A single instruction performs condition testing and branching
    - Example: MIPS instruction
        beq Rsrc1,Rsrc2,target
      Jumps to target if Rsrc1 = Rsrc2
Delayed branching
  - Control is transferred after executing the instruction that follows the branch instruction
    - This instruction slot is called the delay slot
  - Improves efficiency
  - Supported by highly pipelined RISC processors
Flow of Control (cont.)
Procedure calls
  - Facilitate modular programming
  - Require two pieces of information to return
    - End of procedure
      - Pentium uses the ret instruction
      - MIPS uses the jr instruction
    - Return address
      - In a (special) register: MIPS allows any general-purpose register
      - On the stack: Pentium
Flow of Control (cont.)
Delay slot
Parameter Passing
Two basic techniques
  - Register-based (e.g., PowerPC, MIPS)
    - Internal registers are used
      - Faster
      - Limit the number of parameters
      - Recursive procedure
  - Stack-based (e.g., Pentium)
    - Stack is used
      - More general
Operand Types
Instructions support basic data types
  - Characters
  - Integers
  - Floating-point
Instruction overload
  - Same instruction for different data types
  - Example: Pentium
      mov AL,address     ; loads an 8-bit value
      mov AX,address     ; loads a 16-bit value
      mov EAX,address    ; loads a 32-bit value
Operand Types (cont.)
Separate instructions
  - Instructions specify the operand size
  - Example: MIPS
      lb Rdest,address    ; loads a byte
      lh Rdest,address    ; loads a halfword (16 bits)
      lw Rdest,address    ; loads a word (32 bits)
      ld Rdest,address    ; loads a doubleword (64 bits)
    Similar instruction: store
Addressing Modes
How the operands are specified
  - Operands can be in three places
    - Registers
      - Register addressing mode
    - Part of instruction
      - Constant
      - Immediate addressing mode
      - All processors support these two addressing modes
    - Memory
      - Difference between RISC and CISC
      - CISC supports a large variety of addressing modes
      - RISC follows load/store architecture
Instruction Types
Several types of instructions
  - Data movement
    - Pentium: mov dest,src
    - Some do not provide direct data movement instructions
    - Indirect data movement
        add Rdest,Rsrc,0    ; Rdest = Rsrc+0
  - Arithmetic and Logical
    - Arithmetic
      - Integer and floating-point, signed and unsigned
      - add, subtract, multiply, divide
    - Logical
      - and, or, not, xor
Instruction Types (cont.)
Condition code bits
  - S: Sign bit (0 = +, 1 = -)
  - Z: Zero bit (0 = nonzero, 1 = zero)
  - O: Overflow bit (0 = no overflow, 1 = overflow)
  - C: Carry bit (0 = no carry, 1 = carry)
Example: Pentium
      cmp count,25    ; compare count to 25
                      ; (subtract 25 from count)
      je  target      ; jump if equal
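How a compare sets the four condition code bits can be sketched by performing the subtraction explicitly. The function below is a simplified model, not a description of any particular processor: it assumes 32-bit operands, treats the carry bit as a borrow (the x86-style convention for subtraction), and the names `cmp_flags` and `je_taken` are made up for the example.

```python
MASK32 = 0xFFFFFFFF                     # assumed 32-bit register width


def cmp_flags(a, b):
    """Compute S, Z, O, C for the subtraction a - b; the result itself is discarded."""
    a &= MASK32
    b &= MASK32
    result = (a - b) & MASK32
    sign = result >> 31
    zero = int(result == 0)
    carry = int(a < b)                  # assumption: carry means "borrow" on subtract
    # Signed overflow: the operands have different signs and the
    # result's sign differs from the sign of the first operand.
    overflow = (((a ^ b) & (a ^ result)) >> 31) & 1
    return {"S": sign, "Z": zero, "O": overflow, "C": carry}


def je_taken(flags):
    """A jump-if-equal is taken exactly when the zero bit is set."""
    return flags["Z"] == 1


print(cmp_flags(25, 25), je_taken(cmp_flags(25, 25)))
print(cmp_flags(3, 25), je_taken(cmp_flags(3, 25)))
print(cmp_flags(-2**31, 1))             # most negative value minus 1 overflows
```

This is the set-then-jump style: the compare only records status bits, and a later conditional jump reads them.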
Instruction Types (cont.)
Flow control and I/O instructions
  - Branch
  - Procedure call
  - Interrupts
I/O instructions
  - Memory-mapped I/O
    - Most processors support memory-mapped I/O
    - No separate instructions for I/O
  - Isolated I/O
    - Pentium supports isolated I/O
    - Separate I/O instructions
        in  AX,io_port    ; read from an I/O port
        out io_port,AX    ; write to an I/O port
Instruction Formats
Two types
  - Fixed-length
    - Used by RISC processors
    - 32-bit RISC processors use 32-bit wide instructions
      - Examples: SPARC, MIPS, PowerPC
  - Variable-length
    - Used by CISC processors
    - Memory operands need more bits to specify
Opcode
  - Major and exact operation
Examples of Instruction Formats
RISC (Reduced Instruction Set Computer)
versus
CISC (Complex Instruction Set Computer)
RISC vs. CISC
The underlying philosophy of RISC machines is that a system is better able to manage program execution when the program consists of only a few different instructions that are the same length and require the same number of clock cycles to decode and execute
RISC systems access memory only with explicit load and store instructions
In CISC systems, many different kinds of instructions access memory, making instruction length variable and fetch-decode-execute time unpredictable
RISC vs. CISC
The difference between CISC and RISC becomes evident through the basic computer performance equation:
  time/program = (time/cycle) × (cycles/instruction) × (instructions/program)
RISC systems shorten execution time by reducing the clock cycles per instruction
CISC systems improve performance by reducing the number of instructions per program
RISC vs. CISC
The simple instruction set of RISC machines enables control units to be hardwired for maximum speed
The more complex (and variable) instruction set of CISC machines requires microcode-based control units that interpret instructions as they are fetched from memory. This translation takes time
With fixed-length instructions, RISC lends itself to pipelining and speculative execution
RISC vs. CISC
Consider the following program fragments:
CISC:
      mov ax, 10
      mov bx, 5
      mul bx, ax
RISC:
      mov ax, 0
      mov bx, 10
      mov cx, 5
  Begin: add ax, bx
      loop Begin
The total clock cycles for the CISC version might be:
  (2 movs × 1 cycle) + (1 mul × 30 cycles) = 32 cycles
While the clock cycles for the RISC version are:
  (3 movs × 1 cycle) + (5 adds × 1 cycle) + (5 loops × 1 cycle) = 13 cycles
With the RISC clock cycle also being shorter, RISC gives us much faster execution speeds
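The cycle comparison between the two fragments is simple arithmetic, sketched here so the numbers can be varied. The per-instruction cycle counts (1 cycle for simple instructions, 30 for the multiply) are illustrative assumptions in the spirit of the slide's "might be", not measurements of any real processor, and `total_cycles` and `execution_time_ns` are names chosen for this example.

```python
def total_cycles(instruction_mix):
    """Sum count * cycles over (count, cycles_per_instruction) pairs."""
    return sum(count * cycles for count, cycles in instruction_mix)


def execution_time_ns(instruction_mix, cycle_time_ns):
    """Execution time = total clock cycles * time per cycle."""
    return total_cycles(instruction_mix) * cycle_time_ns


# CISC fragment: 2 movs at 1 cycle each, 1 mul at an assumed 30 cycles
cisc_mix = [(2, 1), (1, 30)]

# RISC fragment: 3 movs, then 5 adds and 5 loop instructions, 1 cycle each
risc_mix = [(3, 1), (5, 1), (5, 1)]

print(total_cycles(cisc_mix), total_cycles(risc_mix))

# With an equal (hypothetical) 1 ns cycle time, the ratio of the two times
# is just the ratio of the cycle counts.
print(execution_time_ns(cisc_mix, 1.0) / execution_time_ns(risc_mix, 1.0))
```

The RISC version executes 13 instructions against 3 for CISC, so its advantage comes entirely from the lower cycles per instruction, which is the trade-off expressed by the performance equation.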
RISC vs. CISC
Because of their load-store ISAs, RISC architectures require a large number of CPU registers
These registers provide fast access to data during sequential program execution
They can also be employed to reduce the overhead typically caused by passing parameters to subprograms
Instead of pulling parameters off a stack, the subprogram is directed to use a subset of registers
RISC vs. CISC
This is how registers can be overlapped in a RISC system
The current window pointer (CWP) points to the active register window
RISC vs. CISC
It is becoming increasingly difficult to distinguish RISC architectures from CISC architectures
Some RISC systems provide more extravagant instruction sets than some CISC systems
Some systems combine both approaches
The following two slides summarize the characteristics that traditionally typify the differences between these two architectures
RISC vs. CISC
RISC
  - Multiple register sets
  - Three operands per instruction
  - Parameter passing through register windows
  - Single-cycle instructions
  - Hardwired control
  - Highly pipelined
CISC
  - Single register set
  - One or two register operands per instruction
  - Parameter passing through memory
  - Multiple-cycle instructions
  - Microprogrammed control
  - Less pipelined
Continued...
RISC s CISCRISC s CISC
RISC vs. CISC
RISC | CISC
Simple instructions, few in number. | Many complex instructions.
Fixed length instructions. | Variable length instructions.
Complexity in compiler. | Complexity in microcode.
Only LOAD/STORE instructions access memory. | Many instructions can access memory.
Few addressing modes. | Many addressing modes.
Summary
Instruction Set Design Issues
Instruction set design issues include:
Where are operands stored?
- registers, memory, stack, accumulator
How many explicit operands are there?
- 0, 1, 2, or 3
How is the operand location specified?
- register, immediate, indirect, ...
What type and size of operands are supported?
- byte, int, float, double, string, vector, ...
What operations are supported?
- add, sub, mul, move, compare, ...
More About General Purpose Registers
Why do almost all new architectures use GPRs?
Registers are much faster than memory (even cache).
- Register values are available immediately
- When memory isn't ready, the processor must wait ("stall")
Registers are convenient for variable storage.
- Compiler assigns some variables just to registers
- More compact code, since small fields specify registers (compared to memory addresses)
[Figure: Processor (Registers), Cache, Memory, Disk]
What Operations are Needed?
Arithmetic & Logical
- Integer arithmetic: ADD, SUB, MULT, DIV, SHIFT
- Logical operation: AND, OR, XOR, NOT
Data Transfer - copy, load, store
Control - branch, jump, call, return
Floating Point: ADDF, MULF, DIVF. Same as arithmetic, but usually take bigger operands
Decimal - ADDD, CONVERT
String - move, compare, search
Graphics - pixel and vertex, compression/decompression operations
Stack Architecture: Pros and Cons
Pros
- Good code density (implicit top of stack)
- Low hardware requirements
- Easy to write a simpler compiler for stack architectures
Cons
- Stack becomes the bottleneck
- Little ability for parallelism or pipelining
- Data is not always at the top of the stack when needed, so additional instructions like TOP and SWAP are needed
- Difficult to write an optimizing compiler for stack architectures
Accumulator Architecture: Pros and Cons
Pros
- Very low hardware requirements
- Easy to design and understand
Cons
- Accumulator becomes the bottleneck
- Little ability for parallelism or pipelining
- High memory traffic
Memory-Memory Architecture: Pros and Cons
Pros
- Requires fewer instructions (especially if 3 operands)
- Easy to write compilers for (especially if 3 operands)
Cons
- Very high memory traffic (especially if 3 operands)
- Variable number of clocks per instruction
- With two operands, more data movements are required
Memory-Register Architecture: Pros and Cons
Pros
- Some data can be accessed without loading first
- Instruction format easy to encode
- Good code density
Cons
- Operands are not equivalent (poor orthogonality)
- Variable number of clocks per instruction
- May limit number of registers
MIPS (originally an acronym for Microprocessor without Interlocked Pipeline Stages)
MIPS is a reduced instruction set computer (RISC) instruction set architecture (ISA) developed by MIPS Technologies (formerly MIPS Computer Systems, Inc.). The early MIPS architectures were 32-bit, and later versions were 64-bit. Multiple revisions of the MIPS instruction set exist, including MIPS I, MIPS II, MIPS III, MIPS IV, MIPS V, MIPS32, and MIPS64. The current revisions are MIPS32 (for 32-bit implementations) and MIPS64 (for 64-bit implementations). MIPS32 and MIPS64 define a control register set as well as the instruction set.
Typical Processor Execution Cycle
Instruction Fetch: Obtain instruction from program storage
Instruction Decode: Determine required actions and instruction size
Operand Fetch: Locate and obtain operand data
Execute: Compute result value or status
Result Store: Deposit results in register or storage for later use
Next Instruction: Determine successor instruction
Instruction and Data Memory: Unified or Separate
[Figure: Programmer's View (ADD, SUBTRACT, AND, OR, COMPARE, ...) and Computer's View of a computer program (instructions), with CPU, Memory, and I/O blocks]
Princeton (Von Neumann) Architecture
- Data and instructions mixed in the same unified memory
- Program as data
- Storage utilization
- Single memory interface
Harvard Architecture
- Data & instructions in separate memories
- Has advantages in certain high performance implementations
- Can optimize each memory
Classifying Instruction Set Architectures
There are four types of internal storage used by the processor to store operands explicitly and implicitly for execution of a program:
1. Stack
2. Accumulator
3. Set of registers (register-memory)
4. Set of registers (register-register/load-store)
The operands in a stack architecture are implicitly on the top of the stack, and in an accumulator architecture one operand is implicitly the accumulator. The general-purpose register architectures (register-memory and register-register) have only explicit operands, either in registers or in memory locations.
Basic Addressing Classes
Declining cost of registers
Operand locations for four instruction set architecture classes
The arrows indicate whether the operand is an input or the result of the ALU operation, or both an input and result. Lighter shades indicate inputs, and the dark shade indicates the result. In (a), a Top Of Stack register (TOS) points to the top input operand, which is combined with the operand below. The first operand is removed from the stack, the result takes the place of the second operand, and TOS is updated to point to the result. All operands are implicit. In (b), the Accumulator is both an implicit input operand and a result. In (c), one input operand is a register, one is in memory, and the result goes to a register. All operands are registers in (d) and, like the stack architecture, can be transferred to memory only via separate instructions: push or pop for (a) and load or store for (d).
Code Sequence for C = A + B
Stack | Accumulator | Register-memory | Register-register
Push A | Load A | Load R1, A | Load R1, A
Push B | Add B | Add R3, R1, B | Load R2, B
Add | Store C | Store R3, C | Add R3, R1, R2
Pop C | | | Store R3, C
The code sequence for C = A + B for four classes of instruction sets. Note that the Add instruction has implicit operands for stack and accumulator architectures, and explicit operands for register architectures. It is assumed that A, B, and C all belong in memory and that the values of A and B cannot be destroyed.
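To make the difference between implicit and explicit operands concrete, here is a small Python sketch that runs C = A + B on toy models of a stack, an accumulator, and a load-store machine. The function names and the dictionary standing in for memory are illustrative assumptions, not part of any real simulator.

```python
def run_stack(memory):
    """Stack machine: operands are implicitly on top of the stack."""
    stack = []
    stack.append(memory["A"])                # Push A
    stack.append(memory["B"])                # Push B
    stack.append(stack.pop() + stack.pop())  # Add
    memory["C"] = stack.pop()                # Pop C
    return memory["C"]

def run_accumulator(memory):
    """Accumulator machine: one operand is implicitly the accumulator."""
    acc = memory["A"]      # Load A
    acc += memory["B"]     # Add B
    memory["C"] = acc      # Store C
    return memory["C"]

def run_load_store(memory):
    """Register-register machine: every operand is named explicitly."""
    regs = {}
    regs["R1"] = memory["A"]               # Load R1, A
    regs["R2"] = memory["B"]               # Load R2, B
    regs["R3"] = regs["R1"] + regs["R2"]   # Add R3, R1, R2
    memory["C"] = regs["R3"]               # Store R3, C
    return memory["C"]

machines = (run_stack, run_accumulator, run_load_store)
results = [run({"A": 3, "B": 4, "C": 0}) for run in machines]
print(results)
```

All three models compute the same value; they differ only in where the operands live and how many of them an instruction has to name.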
Stack Architectures
Accumulator Architectures
Register-Set Architectures
Register-to-Register: Load-Store Architectures
Register-to-Memory Architectures
Memory-to-Memory Architectures
Instruction Formats
Instruction Set Architecture (ISA)
To command a computer's hardware, you must speak its language.
The words of a machine's language are called instructions, and its vocabulary is called an instruction set.
Once you learn one machine language, it is easy to pick up others:
- There are few fundamental operations that all computers must provide
- All designers have the same goal of finding a language that simplifies building the hardware and the compiler while maximizing performance and minimizing cost
Learning how instructions are represented leads to discovering the secret of computing: the stored-program concept.
The MIPS instruction set is used as a case study.
Interface Design
A good interface:
- Lasts through many implementations (portability, compatibility)
- Is used in many different ways (generality)
- Provides convenient functionality to higher levels
- Permits an efficient implementation at lower levels
Design decisions must take into account:
- Technology
- Machine organization
- Programming languages
- Compiler technology
- Operating systems
[Figure: a single Interface with several uses (use) above it and several implementations (imp) below it, over time]
Classifying Instruction Set Architectures
Accumulator Architecture
• Common in early stored-program computers, when hardware was so expensive
• Machine has only one register (accumulator) involved in all math & logical operations
• All operations assume the accumulator as a source operand and a destination for the operation, with the other operand stored in memory
Extended Accumulator Architecture
• Dedicated registers for specific operations, e.g. stack and array index registers, are added
• The 8086 microprocessor is an example of such a special-purpose register architecture
General-Purpose Register Architecture
• MIPS is an example of such an architecture, where registers are not tied to a single role
• This type of instruction set can be further divided into:
  • Register-memory: allows for one operand to be in memory
  • Register-register (load-store): demands all operands to be in registers

Machine | Number of general-purpose registers | Architecture style | Year
Motorola 6800 | 2 | Accumulator | 1974
DEC VAX | 16 | Register-memory, memory-memory | 1977
Intel 8086 | 1 | Extended accumulator | 1978
Motorola 68000 | 16 | Register-memory | 1980
Intel 80386 | 8 | Register-memory | 1985
PowerPC | 32 | Load-store | 1992
DEC Alpha | 32 | Load-store | 1992
Compact Code and Stack Architectures
When memory is scarce, machines like the Intel 80x86 had variable-length instructions to match varying operand specifications and minimize code size.
Stack machines abandoned registers altogether, arguing that it is hard for compilers to use them efficiently.
Operands are to be pushed on a stack from memory, and the results have to be popped from the stack to memory.
Operations take their operands by default from the top of the stack and insert the results back onto the stack.
Stack machines simplify compilers and lend themselves to a compact instruction encoding, but limit compiler optimization (e.g. in math expressions).
Example: A = B + C
Push AddressC   # Top=Top+4; Stack[Top]=Memory[AddressC]
Push AddressB   # Top=Top+4; Stack[Top]=Memory[AddressB]
add             # Stack[Top-4]=Stack[Top]+Stack[Top-4]; Top=Top-4
Pop AddressA    # Memory[AddressA]=Stack[Top]; Top=Top-4
Compact code is important for heralded network computers, where programs must be downloaded over the Internet (e.g. Java-based applications).
Other Types of Architecture
High-Level-Language Architecture
• In the 1960s, systems software was rarely written in high-level languages, and virtually every commercial operating system before Unix was written in assembly
• Some people blamed the code density on the instruction set rather than the programming language
• A machine design philosophy was advocated with the goal of making the hardware more like high-level languages
• The effectiveness of high-level languages, memory size limitation, and lack of efficient compilers doomed this philosophy to a historical footnote
Reduced Instruction Set Architecture
• With the recent development in compiler technology and expanded memory sizes, fewer programmers are using assembly-level coding
• Instruction set architecture became measurable in the way compilers, rather than programmers, use them
• RISC architecture favors simplifying hardware design over enriching the offered set of instructions, relying on compilers to effectively use them to perform complex operations
• Virtually all new architectures since 1982 follow the RISC philosophy of fixed instruction lengths, load-store operations, and limited addressing modes
Evolution of Instruction Sets
Single Accumulator (EDSAC 1950)
Accumulator + Index Registers (Manchester Mark I, IBM 700 series 1953)
Separation of Programming Model from Implementation
- High-level Language Based (B5000 1963)
- Concept of a Family (IBM 360 1964)
General Purpose Register Machines
- Complex Instruction Sets (Vax, Intel 432 1977-80)
- Load/Store Architecture (CDC 6600, Cray 1 1963-76)
RISC (Mips, Sparc, HP-PA, IBM RS6000, ... 1987)
Register-Memory Architectures
Effect of the number of memory operands
Number of memory addresses | Max. number of operands | Examples
0 | 3 | SPARC, MIPS, PowerPC, ALPHA
1 | 2 | Intel 80x86, Motorola 68000
2 | 2 | VAX (also has 3 operands format)
3 | 3 | VAX (also has 2 operands format)
Memory Address
Interpreting Memory Addressing
- The address of a word matches the byte address of one of its 4 bytes
- The addresses of sequential words differ by 4 (word size in bytes)
- Words' addresses are multiples of 4 (alignment restriction)
Machines that use the address of the leftmost byte as the word address are called "Big Endian", and those that use the rightmost byte are called "Little Endian".
Misalignment complicates memory access and causes programs to run slower (some machines do not allow misaligned memory access at all).
Byte ordering can be a problem when exchanging data among different machines.
Byte addressing affects array index calculation, which must account for word addressing and the offset within the word.
Object addressed | Aligned at byte offsets | Misaligned at byte offsets
Byte | 0, 1, 2, 3, 4, 5, 6, 7 | Never
Half word | 0, 2, 4, 6 | 1, 3, 5, 7
Word | 0, 4 | 1, 2, 3, 5, 6, 7
Double word | 0 | 1, 2, 3, 4, 5, 6, 7
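The alignment rule and the two byte orders can be checked with a short Python sketch. It assumes the usual convention that an object of size s bytes is aligned when its address is a multiple of s; `is_aligned` is a hypothetical helper, while `struct.pack` is the standard-library call for laying a value out in a chosen byte order.

```python
import struct

def is_aligned(address, size):
    """An access of `size` bytes is aligned when address mod size == 0."""
    return address % size == 0

# Aligned byte offsets within an 8-byte block for each object size
# (1 = byte, 2 = half word, 4 = word, 8 = double word).
aligned_offsets = {
    size: [a for a in range(8) if is_aligned(a, size)]
    for size in (1, 2, 4, 8)
}

# The same 32-bit value laid out in memory under each byte order.
value = 0x0A0B0C0D
big = struct.pack(">I", value)      # most significant byte at lowest address
little = struct.pack("<I", value)   # least significant byte at lowest address

print(aligned_offsets)
print(big.hex(), little.hex())
```

Exchanging the four bytes in `big` with a machine that expects the order in `little` would silently yield a different integer, which is the byte-ordering problem noted above.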
Addressing Modes
Addressing modes refer to how to specify the location of an operand (effective address).
Addressing modes have the ability to:
- Significantly reduce instruction counts
- Increase the average CPI
- Increase the complexity of building a machine
The VAX machine is used for benchmark data, since it supports a wide range of memory addressing modes.
Famous addressing modes can be classified based on:
- the source of the data: into register, immediate, or memory
- the address calculation: into direct and indirect
An indexed addressing mode is usually provided to allow efficient implementation of loops and array access.
Example of Addressing Modes
Addressing mode | Example | Meaning | When used
Register | Add R4, R3 | Regs[R4] ← Regs[R4] + Regs[R3] | When a value is in a register
Immediate | Add R4, #3 | Regs[R4] ← Regs[R4] + 3 | For constants
Displacement | Add R4, 100(R1) | Regs[R4] ← Regs[R4] + Mem[100 + Regs[R1]] | Accessing local variables
Register indirect | Add R4, (R1) | Regs[R4] ← Regs[R4] + Mem[Regs[R1]] | Accessing using a pointer or a computed address
Indexed | Add R4, (R1 + R2) | Regs[R4] ← Regs[R4] + Mem[Regs[R1] + Regs[R2]] | Sometimes useful in array addressing: R1 = base of the array; R2 = index amount
Direct or absolute | Add R4, (1001) | Regs[R4] ← Regs[R4] + Mem[1001] | Sometimes useful for accessing static data; address constant may need to be large
Memory indirect or memory deferred | Add R4, @(R3) | Regs[R4] ← Regs[R4] + Mem[Mem[Regs[R3]]] | If R3 is the address of the pointer p, then mode yields *p
Autoincrement | Add R4, (R2)+ | Regs[R4] ← Regs[R4] + Mem[Regs[R2]]; Regs[R2] ← Regs[R2] + d | Useful for stepping through arrays within a loop. R2 points to start of the array; each reference increments R2 by d
Autodecrement | Add R4, -(R2) | Regs[R2] ← Regs[R2] - d; Regs[R4] ← Regs[R4] + Mem[Regs[R2]] | Same use as autoincrement. Autodecrement/increment can also act as push/pop to implement a stack
Scaled | Add R4, 100(R2)[R3] | Regs[R4] ← Regs[R4] + Mem[100 + Regs[R2] + Regs[R3] × d] | Used to index arrays
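The operand selection performed by each addressing mode can be sketched in Python. The function `operand_for_mode`, its keyword arguments, and the tiny register and memory contents are all hypothetical; the element size d = 4 is an assumption as well.

```python
def operand_for_mode(mode, regs, mem, *, reg=None, index=None, disp=0, d=4):
    """Return the operand value selected by `mode` in a toy machine."""
    if mode == "register":
        return regs[reg]
    if mode == "immediate":
        return disp
    if mode == "displacement":
        return mem[disp + regs[reg]]
    if mode == "register_indirect":
        return mem[regs[reg]]
    if mode == "indexed":
        return mem[regs[reg] + regs[index]]
    if mode == "direct":
        return mem[disp]
    if mode == "memory_indirect":
        return mem[mem[regs[reg]]]
    if mode == "autoincrement":
        value = mem[regs[reg]]
        regs[reg] += d               # step to the next element afterwards
        return value
    if mode == "autodecrement":
        regs[reg] -= d               # step back before the access
        return mem[regs[reg]]
    if mode == "scaled":
        return mem[disp + regs[reg] + regs[index] * d]
    raise ValueError(f"unknown addressing mode: {mode}")

regs = {"R1": 100, "R2": 8, "R3": 2}
mem = {100: 11, 104: 22, 108: 33, 116: 44, 11: 55}

results = [
    operand_for_mode("displacement", regs, mem, reg="R1", disp=4),
    operand_for_mode("indexed", regs, mem, reg="R1", index="R2"),
    operand_for_mode("scaled", regs, mem, reg="R1", index="R3", disp=8),
    operand_for_mode("memory_indirect", regs, mem, reg="R1"),
    operand_for_mode("autoincrement", regs, mem, reg="R1"),
]
print(results, regs["R1"])
```

Only the autoincrement and autodecrement modes have a side effect on a register, which is why they double as push and pop.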
Addressing Mode for Signal Processing
DSP offers special addressing modes to better serve popular algorithms.
Special features require either hand coding or a compiler that uses such features (GNU would not be a good choice).
Modulo addressing
- Since DSP deals with continuous data streams, circular buffers are widely used
- Circular or modulo addressing allows automatic increment and decrement and resets the pointer when reaching the end of the buffer
Reverse addressing
- The resulting address is the reverse order of the current address
- Reverse addressing mode expedites the access, which otherwise requires a number of logical instructions or extra memory accesses
Fast Fourier Transform
0 (000) → 0 (000)
1 (001) → 4 (100)
2 (010) → 2 (010)
3 (011) → 6 (110)
4 (100) → 1 (001)
5 (101) → 5 (101)
6 (110) → 3 (011)
7 (111) → 7 (111)
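Both DSP addressing modes are easy to model in software, which also shows the work the hardware saves. In this Python sketch, `bit_reverse` and `modulo_step` are hypothetical helper names, and the 3-bit index width and the buffer bounds are assumptions chosen for illustration.

```python
def bit_reverse(index, bits):
    """Reverse the low `bits` bits of `index` (bit-reversed addressing)."""
    result = 0
    for _ in range(bits):
        result = (result << 1) | (index & 1)
        index >>= 1
    return result

def modulo_step(pointer, step, start, length):
    """Advance a pointer inside the circular buffer [start, start + length)."""
    return start + (pointer - start + step) % length

# Access order for an 8-point FFT with 3-bit indices.
fft_order = [bit_reverse(i, 3) for i in range(8)]

# A 4-entry circular buffer starting at address 16.
wrapped_forward = modulo_step(19, 1, start=16, length=4)
wrapped_backward = modulo_step(16, -1, start=16, length=4)

print(fft_order, wrapped_forward, wrapped_backward)
```

Without hardware support, each access would need the shift-and-mask loop or the modulo computation shown here, which is the overhead the special addressing modes remove.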
Operations of the Computer Hardware
"There must certainly be instructions for performing the fundamental arithmetic operations."
Burks, Goldstine, and von Neumann, 1947
Assembly language is a symbolic representation of what the processor actually understands.
The MIPS assembler allows only one instruction per line and ignores comments following # until the end of the line.
Example:
Translation of a segment of a C program to MIPS assembly instructions
C: f = (g + h) - (i + j)
MIPS:
add t0, g, h    # temp variable t0 contains "g + h"
add t1, i, j    # temp variable t1 contains "i + j"
sub f, t0, t1   # f = t0 - t1 = (g + h) - (i + j)
Operations in the Instruction Set
Operator type | Examples
Arithmetic and logical | Integer arithmetic and logical operations: add, and, subtract, or
Data transfer | Loads-stores (move instructions on machines with memory addressing)
Control | Branch, jump, procedure call and return, trap
System | Operating system call, virtual memory management instructions
Floating point | Floating point instructions: add, multiply
Decimal | Decimal add, decimal multiply, decimal to character conversion
String | String move, string compare, string search
Graphics | Pixel operations, compression/decompression operations
- Arithmetic, logical, data transfer, and control are almost standard categories for all machines
- System instructions are required for multi-programming environments, although support for system functions varies
- Decimal and string instructions can be primitives, e.g. on the IBM 360 and the VAX
- Support for floating point, decimal, string, and graphics can be optionally (sometimes) provided via a co-processor
- Some machines rely on the compiler to synthesize special operations, such as string handling, from simpler instructions
Operations for Media & Signal Processing
Single instruction multiple data (SIMD) and vector instructions are often supported in DSPs, which are commonly used in multimedia and signal processing applications.
Partitioned Add (integer)
- Perform multiple 16-bit additions on a 64-bit ALU, since most data are narrow
- Increases ALU throughput for multimedia applications
Paired single operations (float)
- Allow the same register to act as two operands to the same operation
- Handy in dealing with vertices and coordinates
Multiply and accumulate
- Very handy for calculating dot products of vectors (signal processing) and matrix multiplication
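A partitioned add and a multiply-accumulate can be mimicked in Python as a rough sketch. The lane layout (four 16-bit lanes in one 64-bit word) is an assumption for illustration, and `pack`, `unpack`, `partitioned_add`, and `multiply_accumulate` are hypothetical helpers rather than real instructions.

```python
LANE_BITS = 16                      # assumed lane width
LANES = 4                           # four lanes fill a 64-bit word
LANE_MASK = (1 << LANE_BITS) - 1

def pack(lanes):
    """Pack four narrow values into one wide word."""
    word = 0
    for i, value in enumerate(lanes):
        word |= (value & LANE_MASK) << (i * LANE_BITS)
    return word

def unpack(word):
    """Split a wide word back into its four lanes."""
    return [(word >> (i * LANE_BITS)) & LANE_MASK for i in range(LANES)]

def partitioned_add(x, y):
    """Add lane by lane; a carry never crosses a lane boundary."""
    return pack([a + b for a, b in zip(unpack(x), unpack(y))])

def multiply_accumulate(xs, ys):
    """Dot product as a chain of multiply-accumulate steps."""
    acc = 0
    for a, b in zip(xs, ys):
        acc += a * b
    return acc

x = pack([1, 2, 0xFFFF, 4])
y = pack([10, 20, 1, 40])
lanes = unpack(partitioned_add(x, y))
dot = multiply_accumulate([1, 2, 3], [4, 5, 6])
print(lanes, dot)
```

The third lane wraps around to 0 instead of carrying into the fourth, which is exactly what breaking the carry chain in a partitioned ALU achieves.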
Frequency of Operations Usage
The most widely executed instructions are the simple operations of an instruction set.
The following is the average usage in SPECint92 on Intel 80x86:
Rank | 80x86 instruction | Integer average (% total executed)
1 | Load | 22%
2 | Conditional branch | 20%
3 | Compare | 16%
4 | Store | 12%
5 | Add | 8%
6 | And | 6%
7 | Sub | 5%
8 | Move register-register | 4%
9 | Call | 1%
10 | Return | 1%
Total | | 96%
Make the common case fast by focusing on these operations.
Control Flow Instructions
- Jump for unconditional change in the control flow
- Branch for conditional change in the control flow
- Procedure calls and returns
Data is based on SPEC on Alpha.
Destination Address Definition
Relative addressing with respect to the program counter proved to be the best choice for forward and backward branching or jumps (load address independent).
To allow for dynamic loading of library routines, register indirect addressing allows addresses to be loaded in special registers (e.g. virtual functions in C++ and system calls in a case statement).
Data is based on SPEC on Alpha.
Condition Evaluation
Compare & branch can be efficient if the majority of conditions are comparisons with zero.
Remember to focus on the common case.
Based on SPEC on MIPS.
Frequency of Types of Comparison
Data is based on SPEC on Alpha.
A different benchmark and machine set new design priorities.
DSPs support a repeat instruction for for-loops (vectors) using registers.
Supporting Procedures
Execution of a procedure follows these steps:
- Store parameters in a place accessible to the procedure
- Transfer control to the procedure
- Acquire the storage resources needed for the procedure
- Perform the desired task
- Store the result value in a place accessible to the calling program
- Return control to the point of origin
The hardware provides a program counter to trace instruction flow and manage transfer of control.
Parameter Passing
- Registers can be used for passing a small number of parameters
- A stack is used to spill registers of the current context and make room for the called procedure to run, and to allow for large parameters to be passed
- Storage of machine state can be performed by caller or callee
- Handling of shared variables is important to ensure correct semantics and thus requires clear specifications in the library interface
Global variables stored in registers need careful handling.
Type and Size of Operands
The type of an operand is designated by encoding it in the instruction's operation code.
The type of an operand, e.g. single precision float, effectively gives its size.
Common operand types include character, half word and word size integer, and single- and double-precision floating point.
- Characters are almost always in ASCII, integers are in 2's complement, and floating point is in IEEE 754
- The 16-bit Unicode, used in Java, is gaining popularity due to its support for international character sets
- For business applications, some architectures support a decimal format in binary coded decimal (BCD)
- Depending on the size of the word, the complexity of handling different operand types differs
- DSP offers fixed point data types to support high precision floating point arithmetic and to allow sharing a single exponent for multiple numbers
- For graphics applications, vertex and pixel operands are added features
Size of Operands
Frequency of reference by size, based on SPEC2000 on Alpha.
- Double-word data type is used for double-precision floating point operations and address storage in machines with a 64-bit wide address bus
- Words are used for integer operations and for 32-bit address bus machines
- Because of the mix in SPEC, word and double-word data types dominate
Instruction Representation
Humans are taught to think in base 10 (decimal), but numbers may be represented in any base (123 in base 10 = 1111011 in binary, or base 2).
Numbers are stored in computers as a series of high and low electronic signals (binary numbers).
Binary digits are called bits and are considered the atom of computing.
Each piece of an instruction is a number, and placing these numbers together forms the instruction.
The assembler translates the symbolic assembly instructions into machine language instructions (machine code).
Example:
Assembly: add $t0, $s1, $s2
M/C language (decimal): 0 | 17 | 18 | 8 | 0 | 32
M/C language (binary): 000000 | 10001 | 10010 | 01000 | 00000 | 100000
Field widths: 6 bits | 5 bits | 5 bits | 5 bits | 5 bits | 6 bits
Note: the MIPS compiler by default maps $s0,...,$s7 to reg. 16-23 and $t0,...,$t7 to reg. 8-15.
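The field packing for this example can be reproduced with a few shifts. The sketch below assumes the standard MIPS R-format layout (op 6 bits, rs 5, rt 5, rd 5, shamt 5, funct 6) and the default register numbering ($t0-$t7 = 8-15, $s0-$s7 = 16-23); `encode_r` and the `REGISTERS` table are hypothetical helpers, not an assembler API.

```python
# Default MIPS register numbers for the temporaries and saved registers.
REGISTERS = {f"$t{i}": 8 + i for i in range(8)}
REGISTERS.update({f"$s{i}": 16 + i for i in range(8)})

def encode_r(op, rs, rt, rd, shamt, funct):
    """Pack the six R-format fields into one 32-bit instruction word."""
    return (op << 26) | (rs << 21) | (rt << 16) | (rd << 11) | (shamt << 6) | funct

# add $t0, $s1, $s2 -> op=0, rs=$s1, rt=$s2, rd=$t0, shamt=0, funct=32
word = encode_r(0, REGISTERS["$s1"], REGISTERS["$s2"], REGISTERS["$t0"], 0, 32)
bits = f"{word:032b}"

print(hex(word))
print(bits[:6], bits[6:11], bits[11:16], bits[16:21], bits[21:26], bits[26:])
```

Note that the destination register comes first in the assembly text but sits in the fourth field (rd) of the machine word.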
Encoding an Instruction Set
Instruction encoding affects the size of the compiled program and the complexity of the CPU implementation.
The operation is typically specified in one field, called the opcode.
The addressing mode for the operand can be encoded with the operation, or specified through a separate identifier in case of a large number of supported modes.
The architecture must balance between several competing factors:
- Desire to support as many registers and addressing modes as possible
- Effect of operand specification on the size of the instruction (program)
- Desire to simplify instruction fetching and decoding during execution
Fixed-size instruction encoding simplifies the CPU design while limiting the addressing modes supported.
An architect caring about code size can use variable-size encoding.
A hybrid approach is to allow variability by supporting multiple-sized instructions.
Encoding Examples
MIPS Instruction Format
Register-format instructions
op | rs | rt | rd | shamt | funct
6 bits | 5 bits | 5 bits | 5 bits | 5 bits | 6 bits
- op: Basic operation of the instruction, traditionally called the opcode
- rs: The first register source operand
- rt: The second register source operand
- rd: The register destination operand; it gets the result of the operation
- shamt: Shift amount
- funct: This field selects the specific variant of the operation in the op field
Immediate-type instructions
op | rs | rt | address
6 bits | 5 bits | 5 bits | 16 bits
Some instructions need longer fields than provided, for large value constants.
The 16-bit address means a load word instruction can load a word within a region of ±2^15 bytes of the address in the base register.
Example: lw $t0, 32($s3)   # Temporary register $t0 gets A[8]
Instruction | Format | op | rs | rt | rd | shamt | funct | address
add | R | 0 | reg | reg | reg | 0 | 32 | N/A
sub | R | 0 | reg | reg | reg | 0 | 34 | N/A
lw | I | 35 | reg | reg | N/A | N/A | N/A | address
sw | I | 43 | reg | reg | N/A | N/A | N/A | address
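The immediate format and its offset limit can be sketched the same way. This assumes the standard MIPS I-format layout (op 6 bits, rs 5, rt 5, address 16) with a signed 16-bit offset; `encode_i` is a hypothetical helper, and the register numbers 19 ($s3) and 8 ($t0) follow the default MIPS numbering.

```python
def encode_i(op, rs, rt, imm):
    """Pack an I-format instruction; the offset must fit in 16 signed bits."""
    if not -(1 << 15) <= imm < (1 << 15):
        raise ValueError("offset does not fit in 16 signed bits")
    return (op << 26) | (rs << 21) | (rt << 16) | (imm & 0xFFFF)

# lw $t0, 32($s3): op=35, base register $s3 = 19, target register $t0 = 8
lw_word = encode_i(35, 19, 8, 32)

# A negative offset is stored in two's complement in the low 16 bits.
lw_negative = encode_i(35, 19, 8, -4)

# An offset of 2**15 is just outside the reachable region.
try:
    encode_i(35, 19, 8, 1 << 15)
    out_of_range_rejected = False
except ValueError:
    out_of_range_rejected = True

print(hex(lw_word), hex(lw_negative), out_of_range_rejected)
```

The range check makes the trade-off explicit: a fixed 32-bit word leaves only 16 bits for the offset, so anything farther from the base register needs extra instructions.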
he Stored Program Concepthe Stored Program Concept 3earning ho instructions are represented leads to discovering
the secret of computing6 the stored$program concept
TodayQs computers are build on to 0ey principles 6 Instructions are represented as numbers
Programs can be stored in memory to beread or ritten Oust li0e numbers
he power of the concept
memory can contain6
the source code for an editor
the compiled m2c code for the editor
the tet that the compiled program is using
the compiler that generated the code
P r o c e s s o r
A c c o u n t i n g p r o g r a m( m a c h i n e c o d e )
lt d i t o r p r o g r a m( m a c h i n e c o d e )
C c o m p i l e r ( m a c h i n e c o d e )
P a y r o l l d a t a
8 o o 0 t e t
S o u r c e c o d e i n Cf o r e d i t o r p r o g r a m
M e m o r y
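To illustrate "instructions are represented as numbers", here is a small Python sketch that stores two instruction words and one data value in the same list and decodes a word back into its fields. The function name decode and the memory contents are assumptions for illustration; the sketch also simplifies by treating every word with op = 0 as R-type and everything else as I-type.

```python
def decode(word):
    """Split a 32-bit MIPS word into its fields (simplified: op 0 means R-type)."""
    op = (word >> 26) & 0x3F
    rs = (word >> 21) & 0x1F
    rt = (word >> 16) & 0x1F
    if op == 0:
        return {
            "op": op, "rs": rs, "rt": rt,
            "rd": (word >> 11) & 0x1F,
            "shamt": (word >> 6) & 0x1F,
            "funct": word & 0x3F,
        }
    return {"op": op, "rs": rs, "rt": rt, "address": word & 0xFFFF}


# One memory holds both instructions and data; all of them are plain integers.
memory = [
    0x02324020,  # add $t0, $s1, $s2
    0x8E680020,  # lw  $t0, 32($s3)
    42,          # an ordinary data value
]

for word in memory[:2]:
    print(decode(word))
```

Nothing in the list marks an entry as code or data; only the way the processor uses a word decides which it is.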
Compiling if-then-else in MIPS
Assuming the five variables f, g, h, i, and j correspond to the five registers $s0 through $s4, what is the compiled code for the following C if statement?
if (i == j) f = g + h; else f = g - h;
[Figure: flowchart. The test i == j leads to f = g + h when i = j, or to the label Else: and f = g - h when i ≠ j; both paths meet at Exit:]
MIPS code:
bne $s3, $s4, Else         # go to Else if i ≠ j
add $s0, $s1, $s2          # f = g + h (skipped if i ≠ j)
j Exit                     # go to Exit
Else: sub $s0, $s1, $s2    # f = g - h (skipped if i = j)
Exit:
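The control flow of that four-instruction sequence can be traced with a tiny Python interpreter. This is only a sketch: the PROGRAM tuple layout and the run function are hypothetical, branch delay slots are ignored, and only the four opcodes used here are modelled.

```python
# (label, opcode, operands); a label-only entry marks a jump target.
PROGRAM = [
    (None,   "bne", ("$s3", "$s4", "Else")),
    (None,   "add", ("$s0", "$s1", "$s2")),
    (None,   "j",   ("Exit",)),
    ("Else", "sub", ("$s0", "$s1", "$s2")),
    ("Exit", None,  ()),
]


def run(regs):
    """Execute PROGRAM on a copy of regs and return the final register values."""
    regs = dict(regs)
    labels = {label: i for i, (label, _, _) in enumerate(PROGRAM) if label}
    pc = 0
    while pc < len(PROGRAM):
        _, op, args = PROGRAM[pc]
        pc += 1  # default: fall through to the next instruction
        if op == "bne" and regs[args[0]] != regs[args[1]]:
            pc = labels[args[2]]
        elif op == "j":
            pc = labels[args[0]]
        elif op == "add":
            regs[args[0]] = regs[args[1]] + regs[args[2]]
        elif op == "sub":
            regs[args[0]] = regs[args[1]] - regs[args[2]]
    return regs


# g = 5, h = 3; first with i == j, then with i != j.
print(run({"$s0": 0, "$s1": 5, "$s2": 3, "$s3": 1, "$s4": 1})["$s0"])
print(run({"$s0": 0, "$s1": 5, "$s2": 3, "$s3": 1, "$s4": 2})["$s0"])
```

Exactly one of add and sub executes on each run, which is the point of pairing the conditional bne with the unconditional j.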
Typical Compilation
Major Types of Optimization
(Columns: optimization name, explanation, frequency. N.M. = not measured; the other frequency values did not survive extraction.)

High-level (at or near source level; machine independent)
- Procedure integration: Replace procedure call by procedure body. Frequency: N.M.
Local (within straight-line code)
- Common subexpression elimination: Replace two instances of the same computation by a single copy.
- Constant propagation: Replace all instances of a variable that is assigned a constant with the constant.
- Stack height reduction: Rearrange expression tree to minimize resources needed for expression evaluation. Frequency: N.M.
Global (across a branch)
- Global common subexpression elimination: Same as local, but this version crosses branches.
- Copy propagation: Replace all instances of a variable A that has been assigned X (i.e., A = X) with X.
- Code motion: Remove code from a loop that computes the same value each iteration of the loop.
- Induction variable elimination: Simplify/eliminate array addressing calculations within loops.
Machine-dependent (depends on machine knowledge)
- Strength reduction: Many examples, such as replacing a multiply by a constant with adds and shifts. Frequency: N.M.
- Pipeline scheduling: Reorder instructions to improve pipeline performance. Frequency: N.M.
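Two of these transformations are easy to show at source level. The Python sketch below is only an illustration with hypothetical function names: it checks that a multiply by the constant 10 can be replaced with shifts and adds (strength reduction) and that a repeated subexpression can be computed once (common subexpression elimination). A real compiler applies these to its intermediate code, not to Python source.

```python
def times10_multiply(x):
    """Original form: one multiply by a constant."""
    return x * 10


def times10_shift_add(x):
    """Strength-reduced form: 10*x = 8*x + 2*x, using shifts and one add."""
    return (x << 3) + (x << 1)


def before_cse(a, b, c):
    """a + b is computed twice."""
    return (a + b) * c + (a + b)


def after_cse(a, b, c):
    """Common subexpression a + b is computed once and reused."""
    t = a + b
    return t * c + t


print(times10_multiply(7), times10_shift_add(7))
print(before_cse(2, 3, 4), after_cse(2, 3, 4))
```

Both rewrites keep the result identical and only change how much work is needed to produce it.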
Effect of Compiler Optimization
[Figure: measurements taken on S; axis: program and compiler optimization level]
Level 0: non-optimized code
Level 1: local optimization
Level 2: global optimization, s/w pipelining
Level 3: adds procedure integration
Compiler Support for Multimedia Instr.
Intel's MMX and PowerPC AltiVec have small vector processing capabilities targeting multimedia applications (to speed up graphics).
Intel added a new set of instructions called Streaming SIMD Extension.
A major advantage of vector computers is hiding the latency of memory access by loading multiple elements and then overlapping execution with data transfer.
Vector computers typically have strided and/or gather/scatter addressing to perform operations on distant memory locations:
- Strided addressing allows memory access in increments larger than one
- Gather/scatter addressing is similar to register indirect mode, where the addresses are stored instead of the data
Supporting vector operations without strided addressing, as in Intel's MMX, limits the potential speedup.
Such limited support for vector processing makes vectorizing compiler optimization unpopular and restricts its scope to hand-coded routines.
SIMD instructions on MMX and AltiVec tend to be solutions, not primitives.
Starting a Program
[Figure: translation hierarchy. C program → Compiler → Assembly language program → Assembler → Object: machine language module, together with Object: library routine (machine language) → Linker → Executable: machine language program → Loader → Memory]
Linker
- Place code and data modules symbolically in memory
- Determine the addresses of data and instruction labels
- Patch both internal and external references
Object files for Unix typically contain:
- Header: size and position of components
- Text segment: machine code
- Data segment: static and dynamic variables
- Relocation info: identifies absolute memory references
- Symbol table: names and locations of labels, procedures, and variables
- Debugging info: mapping from source to object code, breakpoints, etc.
Loading Executable Program
[Figure: memory layout from high to low addresses. Stack, with $sp at 7fff ffff hex; dynamic data; static data starting at 1000 0000 hex, with $gp at 1000 8000 hex; text, with pc at 0040 0000 hex; reserved region starting at address 0]
To load an executable, the operating system follows these steps:
- Reads the executable file header to determine the size of the text and data segments
- Creates an address space large enough for the text and data
- Copies the instructions and data from the executable file into memory
- Copies the parameters (if any) to the main program onto the stack
- Initializes the machine registers and sets the stack pointer to the first free location
- Jumps to a start-up routine that copies the parameters into the argument registers and calls the main routine of the program
Instruction Set Design Issues
Instruction set design issues:
- Number of Addresses
- Flow of Control
- Operand Types
- Addressing Modes
- Instruction Types
- Instruction Formats
Number of Addresses
Four categories
3-address machines
- 2 for the source operands and one for the result
2-address machines
- One address doubles as source and result
1-address machines
- Accumulator machines
- Accumulator is used for one source and result
0-address machines
- Stack machines
- Operands are taken from the stack
- Result goes onto the stack
Number of Addresses (cont.)
Three-address machines
Two for the source operands, one for the result
RISC processors use three addresses
Sample instructions
add dest,src1,src2     ; M(dest) = [src1] + [src2]
sub dest,src1,src2     ; M(dest) = [src1] - [src2]
mult dest,src1,src2    ; M(dest) = [src1] * [src2]
Three addresses: Operand 1, Operand 2, Result
Example: a = b + c
Three-address instruction formats are not common, because they require a relatively long instruction format to hold the three address references.
Number of Addresses (cont.)
Example
C statement
A = B + C * D - E + F + A
Equivalent code:
mult T,C,D    ; T = C*D
add T,T,B     ; T = B + C*D
sub T,T,E     ; T = B + C*D - E
add T,T,F     ; T = B + C*D - E + F
add A,T,A     ; A = B + C*D - E + F + A
Number of Addresses (cont.)
Two-address machines
One address doubles (for source operand and result)
Last example makes a case for it
- Address T is used twice
Sample instructions
load dest,src    ; M(dest) = [src]
add dest,src     ; M(dest) = [dest] + [src]
sub dest,src     ; M(dest) = [dest] - [src]
mult dest,src    ; M(dest) = [dest] * [src]
Two addresses: one address doubles as operand and result
Example: a = a + b
The two-address format reduces the space requirement but also introduces some awkwardness. To avoid altering the value of an operand, a MOVE instruction is used to move one of the values to a result or temporary location before performing the operation.
Number of Addresses (cont.)
Example
C statement
A = B + C * D - E + F + A
Equivalent code:
load T,C    ; T = C
mult T,D    ; T = C*D
add T,B     ; T = B + C*D
sub T,E     ; T = B + C*D - E
add T,F     ; T = B + C*D - E + F
add A,T     ; A = B + C*D - E + F + A
Number of Addresses (cont.)
One-address machines
Use a special set of registers called accumulators
- Specify one source operand and receive the result
Called accumulator machines
Sample instructions
load addr     ; accum = [addr]
store addr    ; M[addr] = accum
add addr      ; accum = accum + [addr]
sub addr      ; accum = accum - [addr]
mult addr     ; accum = accum * [addr]
Number of Addresses (cont.)
Example
C statement
A = B + C * D - E + F + A
Equivalent code:
load C     ; load C into accum
mult D     ; accum = C*D
add B      ; accum = C*D + B
sub E      ; accum = B + C*D - E
add F      ; accum = B + C*D - E + F
add A      ; accum = B + C*D - E + F + A
store A    ; store accum contents in A
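The one-address sequence can be checked with a minimal accumulator-machine simulator in Python. The function run_accumulator, the tuple encoding of instructions, and the sample memory values are all assumptions made for this sketch.

```python
def run_accumulator(program, memory):
    """Run (opcode, address) pairs on a one-accumulator machine; return final memory."""
    memory = dict(memory)
    accum = 0
    for op, addr in program:
        if op == "load":
            accum = memory[addr]
        elif op == "store":
            memory[addr] = accum
        elif op == "add":
            accum += memory[addr]
        elif op == "sub":
            accum -= memory[addr]
        elif op == "mult":
            accum *= memory[addr]
        else:
            raise ValueError(f"unknown opcode: {op}")
    return memory


# A = B + C * D - E + F + A, as in the one-address example.
PROGRAM = [
    ("load", "C"), ("mult", "D"), ("add", "B"), ("sub", "E"),
    ("add", "F"), ("add", "A"), ("store", "A"),
]

initial = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
final = run_accumulator(PROGRAM, initial)
print(final["A"])
```

Every instruction names a single memory address; the accumulator is the implicit second source and the implicit destination, which is why seven instructions are needed.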
Number of Addresses (cont.)
Zero-address machines
Stack supplies operands and receives the result
- Special instructions to load and store use an address
Called stack machines (e.g., HP3000, Burroughs B5500)
Sample instructions
push addr    ; push([addr])
pop addr     ; pop([addr])
add          ; push(pop + pop)
sub          ; push(pop - pop)
mult         ; push(pop * pop)
Number of Addresses (cont.)
Example
C statement
A = B + C * D - E + F + A
Equivalent code:
push E
push C
push D
mult
push B
add
sub
push F
add
push A
add
pop A
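A short Python sketch of a zero-address (stack) machine makes the operand order explicit. The function run_stack and the sample values are hypothetical; sub is modelled as push(pop - pop), so the first value popped (the top of the stack) is the minuend, which is why E is pushed first.

```python
def run_stack(program, memory):
    """Run a list of stack-machine instructions; return the final memory."""
    memory = dict(memory)
    stack = []
    for instruction in program:
        op, *operand = instruction.split()
        if op == "push":
            stack.append(memory[operand[0]])
        elif op == "pop":
            memory[operand[0]] = stack.pop()
        elif op == "add":
            stack.append(stack.pop() + stack.pop())
        elif op == "sub":
            top = stack.pop()        # first pop: top of stack
            below = stack.pop()      # second pop: element underneath
            stack.append(top - below)
        elif op == "mult":
            stack.append(stack.pop() * stack.pop())
        else:
            raise ValueError(f"unknown opcode: {op}")
    return memory


# A = B + C * D - E + F + A
PROGRAM = [
    "push E", "push C", "push D", "mult", "push B", "add",
    "sub", "push F", "add", "push A", "add", "pop A",
]

final = run_stack(PROGRAM, {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6})
print(final["A"])
```

Only push and pop carry an address; the arithmetic instructions take all of their operands implicitly from the stack.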
Load/Store Architecture
Instructions expect operands in internal processor registers
Special LOAD and STORE instructions move data between registers and memory
RISC uses this architecture
Reduces instruction length
Load/Store Architecture (cont.)
Sample instructions
load Rd,addr       ; Rd = [addr]
store addr,Rs      ; (addr) = Rs
add Rd,Rs1,Rs2     ; Rd = Rs1 + Rs2
sub Rd,Rs1,Rs2     ; Rd = Rs1 - Rs2
mult Rd,Rs1,Rs2    ; Rd = Rs1 * Rs2
Number of Addresses (cont.)
Example
C statement
A = B + C * D - E + F + A
Equivalent code:
load R1,B
load R2,C
load R3,D
load R4,E
load R5,F
load R6,A
mult R2,R2,R3
add R2,R2,R1
sub R2,R2,R4
add R2,R2,R5
add R2,R2,R6
store A,R2
Flow of Control
Default is sequential flow
Several instructions alter this default execution
Branches
- Unconditional
- Conditional
- Delayed branches
Procedure calls
- Delayed procedure calls
Flow of Control (cont.)
Branches
Unconditional
- Absolute address
- PC-relative
  Target address is specified relative to PC contents
  Relocatable code
Example: MIPS
- Absolute address
  j target
- PC-relative
  b target
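The difference between the two forms can be sketched in Python using the standard MIPS32 address arithmetic: a branch adds a sign-extended 16-bit word offset to PC + 4, while j replaces the low 28 bits of PC + 4 with a 26-bit word index. The function names and the sample addresses below are assumptions for illustration.

```python
def sign_extend16(value):
    """Interpret the low 16 bits of value as a signed two's-complement number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def branch_target(pc, offset16):
    """PC-relative: target = (PC + 4) + (sign-extended offset * 4)."""
    return (pc + 4 + (sign_extend16(offset16) << 2)) & 0xFFFFFFFF


def jump_target(pc, index26):
    """j: upper 4 bits of PC + 4 combined with the 26-bit word index * 4."""
    return ((pc + 4) & 0xF0000000) | ((index26 & 0x03FFFFFF) << 2)


# The same branch encoding works wherever the code is loaded (relocatable).
for load_address in (0x00400000, 0x10000000):
    distance = branch_target(load_address, 3) - load_address
    print(hex(load_address), distance)

# The jump encodes the destination itself, so it does not move with the code.
print(hex(jump_target(0x00400000, 0x00100004)))
```

Because the branch stores only a distance, the code can be placed at another address without patching; the jump encoding has to be fixed up by the linker when the target address changes.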
Flow of Control (cont.)
Branches
Conditional
- Jump is taken only if the condition is met
Two types
- Set-Then-Jump
  Condition testing is separated from branching
  Condition code registers are used to convey the condition test result
  Condition code registers keep a record of the status of the last ALU operation, such as the overflow condition
- Example: Pentium code
  cmp AX,BX    ; compare AX and BX
  je target    ; jump if equal
Flow of Control (cont.)
- Test-and-Jump
  Single instruction performs condition testing and branching
- Example: MIPS instruction
  beq Rsrc1,Rsrc2,target
  Jumps to target if Rsrc1 = Rsrc2
Delayed branching
Control is transferred after executing the instruction that follows the branch instruction
- This instruction slot is called the delay slot
Improves efficiency
Highly pipelined RISC processors support delayed branching
Flow of Control (cont.)
Procedure calls
Facilitate modular programming
Require two pieces of information to return
- End of procedure
  Pentium uses the ret instruction
  MIPS uses the jr instruction
- Return address
  In a (special) register
    MIPS allows any general-purpose register
  On the stack
    Pentium
Flow of Control (cont.)
[Figure: delay slot]
Parameter Passing
Two basic techniques
Register-based (e.g., PowerPC, MIPS)
- Internal registers are used
  Faster
  Limit the number of parameters
  Recursive procedure
Stack-based (e.g., Pentium)
- Stack is used
  More general
Operand Types
Instructions support basic data types
- Characters
- Integers
- Floating-point
Instruction overload
Same instruction for different data types
Example: Pentium
mov AL,address     ; loads an 8-bit value
mov AX,address     ; loads a 16-bit value
mov EAX,address    ; loads a 32-bit value
Operand Types (cont.)
Separate instructions
Instructions specify the operand size
Example: MIPS
lb Rdest,address    ; loads a byte
lh Rdest,address    ; loads a halfword (16 bits)
lw Rdest,address    ; loads a word (32 bits)
ld Rdest,address    ; loads a doubleword (64 bits)
Similar instructions for store
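What the size-specific loads do with the same memory bytes can be sketched in Python. The MEMORY contents and the load helper are hypothetical, and the sketch assumes big-endian byte order and sign extension (as lb, lh, and lw perform); a real MIPS system may be configured for either byte order.

```python
# Eight bytes of hypothetical memory, starting at address 0.
MEMORY = bytes([0xFF, 0xFE, 0x00, 0x10, 0x00, 0x00, 0x00, 0x2A])


def load(address, size):
    """Read size bytes at address as a signed big-endian integer."""
    return int.from_bytes(MEMORY[address:address + size], byteorder="big", signed=True)


lb_value = load(0, 1)   # byte
lh_value = load(0, 2)   # halfword, 16 bits
lw_value = load(4, 4)   # word, 32 bits
ld_value = load(0, 8)   # doubleword, 64 bits

print(lb_value, lh_value, lw_value, ld_value)
```

The address is the same for the first, second, and fourth loads; only the operand size named by the instruction changes how many bytes are read and how they are interpreted.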
Addressing Modes
How the operands are specified
Operands can be in three places
- Registers
  Register addressing mode
- Part of instruction
  Constant
  Immediate addressing mode
  All processors support these two addressing modes
- Memory
  Difference between RISC and CISC
  CISC supports a large variety of addressing modes
  RISC follows load/store architecture
Instruction Types
Several types of instructions
Data movement
- Pentium: mov dest,src
- Some do not provide direct data movement instructions
- Indirect data movement
  add Rdest,Rsrc,0    ; Rdest = Rsrc + 0
Arithmetic and Logical
- Arithmetic
  Integer and floating-point, signed and unsigned
  add, subtract, multiply, divide
- Logical
  and, or, not, xor
Instruction Types (cont.)
Condition code bits
S: Sign bit (0 = +, 1 = -)
Z: Zero bit (0 = nonzero, 1 = zero)
O: Overflow bit (0 = no overflow, 1 = overflow)
C: Carry bit (0 = no carry, 1 = carry)
Example: Pentium
cmp count,25    ; compare count to 25
                ; (subtract 25 from count)
je target       ; jump if equal
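A compare instruction sets these bits by performing a subtraction and discarding the result. The Python sketch below models that for small operand widths; the function name compare, the 8-bit default width, and the choice to set C on a borrow (the x86 convention for subtraction) are assumptions of this illustration.

```python
def compare(a, b, bits=8):
    """Return the S, Z, O, C flags produced by computing a - b at the given width."""
    mask = (1 << bits) - 1
    sign = 1 << (bits - 1)
    a &= mask
    b &= mask
    result = (a - b) & mask
    return {
        "S": int(bool(result & sign)),                   # sign bit of the result
        "Z": int(result == 0),                           # result is zero
        "O": int(bool((a ^ b) & (a ^ result) & sign)),   # signed overflow
        "C": int(a < b),                                 # borrow out (unsigned a < b)
    }


def jump_if_equal(flags):
    """je is taken exactly when the zero flag is set."""
    return flags["Z"] == 1


print(compare(25, 25), jump_if_equal(compare(25, 25)))
print(compare(10, 25), jump_if_equal(compare(10, 25)))
print(compare(0x80, 1))
```

The last call subtracts 1 from the most negative 8-bit value, which cannot be represented in 8 signed bits, so only the overflow flag is set.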
Instruction Types (cont.)
Flow control and I/O instructions
- Branch
- Procedure call
- Interrupts
I/O instructions
- Memory-mapped I/O
  Most processors support memory-mapped I/O
  No separate instructions for I/O
- Isolated I/O
  Pentium supports isolated I/O
  Separate I/O instructions
  in AX,io_port     ; read from an I/O port
  out io_port,AX    ; write to an I/O port
Instruction Formats
Two types
Fixed-length
- Used by RISC processors
- 32-bit RISC processors use 32-bit-wide instructions
  Examples: SPARC, MIPS, PowerPC
Variable-length
- Used by CISC processors
- Memory operands need more bits to specify
Opcode
Major and exact operation
Examples of Instruction Formats
RISC (Reduced Instruction Set Computer) versus CISC (Complex Instruction Set Computer)
RISC vs. CISC
The underlying philosophy of RISC machines is that a system is better able to manage program execution when the program consists of only a few different instructions that are the same length and require the same number of clock cycles to decode and execute.
RISC systems access memory only with explicit load and store instructions.
In CISC systems, many different kinds of instructions access memory, making instruction length variable and fetch-decode-execute time unpredictable.
The difference between CISC and RISC becomes evident through the basic computer performance equation:
time/program = (time/cycle) × (cycles/instruction) × (instructions/program)
RISC systems shorten execution time by reducing the clock cycles per instruction.
CISC systems improve performance by reducing the number of instructions per program.
The simple instruction set of RISC machines enables control units to be hardwired for maximum speed.
The more complex, and variable, instruction set of CISC machines requires microcode-based control units that interpret instructions as they are fetched from memory. This translation takes time.
With fixed-length instructions, RISC lends itself to pipelining and speculative execution.
Consider the following program fragments:
CISC:
mov ax, 10
mov bx, 5
mul bx, ax
RISC:
mov ax, 0
mov bx, 10
mov cx, 5
Begin: add ax, bx
loop Begin
The total clock cycles for the CISC version might be:
(2 movs × 1 cycle) + (1 mul × 30 cycles) = 32 cycles
While the clock cycles for the RISC version are:
(3 movs × 1 cycle) + (5 adds × 1 cycle) + (5 loops × 1 cycle) = 13 cycles
With the RISC clock cycle being shorter, RISC gives us much faster execution speeds.
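The cycle arithmetic for the two fragments can be reproduced with a few lines of Python. The instruction mixes and per-instruction cycle counts below (1 cycle for mov, add, and loop; 30 cycles for mul) are illustrative assumptions, and the names total_cycles and risc_multiply are hypothetical.

```python
def total_cycles(mix):
    """mix maps an instruction name to (times executed, cycles per execution)."""
    return sum(count * cycles for count, cycles in mix.values())


CISC_MIX = {"mov": (2, 1), "mul": (1, 30)}
RISC_MIX = {"mov": (3, 1), "add": (5, 1), "loop": (5, 1)}


def risc_multiply(multiplicand, multiplier):
    """Multiply by repeated addition, as the add/loop fragment does (multiplier >= 1)."""
    ax, bx, cx = 0, multiplicand, multiplier
    while True:
        ax += bx        # add ax, bx
        cx -= 1         # loop: decrement cx ...
        if cx == 0:     # ... and fall through once it reaches zero
            break
    return ax


print(total_cycles(CISC_MIX))
print(total_cycles(RISC_MIX))
print(risc_multiply(10, 5))
```

The RISC fragment executes more instructions (13 against 3) yet needs fewer cycles, because none of them is a long multi-cycle operation.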
Because of their load-store ISAs, RISC architectures require a large number of CPU registers.
These registers provide fast access to data during sequential program execution.
They can also be employed to reduce the overhead typically caused by passing parameters to subprograms.
Instead of pulling parameters off of a stack, the subprogram is directed to use a subset of registers.
This is how registers can be overlapped in a RISC system.
The current window pointer (CWP) points to the active register window.
It is becoming increasingly difficult to distinguish RISC architectures from CISC architectures.
Some RISC systems provide more extravagant instruction sets than some CISC systems.
Some systems combine both approaches.
The following two slides summarize the characteristics that traditionally typify the differences between these two architectures.
RISC                                          CISC
Multiple register sets.                       Single register set.
Three operands per instruction.               One or two register operands per instruction.
Parameter passing through register windows.   Parameter passing through memory.
Single-cycle instructions.                    Multiple cycle instructions.
Hardwired control.                            Microprogrammed control.
Highly pipelined.                             Less pipelined.
Continued...
RISC                                          CISC
Simple instructions, few in number.           Many complex instructions.
Fixed length instructions.                    Variable length instructions.
Complexity in compiler.                       Complexity in microcode.
Only LOAD/STORE instructions access memory.   Many instructions can access memory.
Few addressing modes.                         Many addressing modes.
Summary
Instruction Set Design Issues
Instruction set design issues include:
Where are operands stored?
- registers, memory, stack, accumulator
How many explicit operands are there?
- 0, 1, 2, or 3
How is the operand location specified?
- register, immediate, indirect, . . .
What type and size of operands are supported?
- byte, int, float, double, string, vector, . . .
What operations are supported?
- add, sub, mul, move, compare, . . .
More About General Purpose Registers
Why do almost all new architectures use GPRs?
Registers are much faster than memory (even cache)
- Register values are available immediately
- When memory isn't ready, the processor must wait ("stall")
Registers are convenient for variable storage
- Compiler assigns some variables just to registers
- More compact code, since small fields specify registers (compared to memory addresses)
[Figure: memory hierarchy. Processor, Registers, Cache, Memory, Disk]
What Operations are Needed
Arithmetic + Logical
- Integer arithmetic: ADD, SUB, MULT, DIV, SHIFT
- Logical operation: AND, OR, XOR, NOT
Data Transfer - copy, load, store
Control - branch, jump, call, return
Floating Point - ADDF, MULF, DIVF
- Same as arithmetic but usually take bigger operands
Decimal - ADDD, CONVERT
String - move, compare, search
Graphics - pixel and vertex, compression/decompression operations
Stack Architecture Pros and Cons
Pros
- Good code density (implicit top of stack)
- Low hardware requirements
- Easy to write a simpler compiler for stack architectures
Cons
- Stack becomes the bottleneck
- Little ability for parallelism or pipelining
- Data is not always at the top of the stack when needed, so additional instructions like TOP and SWAP are needed
- Difficult to write an optimizing compiler for stack architectures
Accumulator Architecture Pros and Cons
Pros
- Very low hardware requirements
- Easy to design and understand
Cons
- Accumulator becomes the bottleneck
- Little ability for parallelism or pipelining
- High memory traffic
Memory-Memory Architecture Pros and Cons
Pros
- Requires fewer instructions (especially if 3 operands)
- Easy to write compilers for (especially if 3 operands)
Cons
- Very high memory traffic (especially if 3 operands)
- Variable number of clocks per instruction
- With two operands, more data movements are required
Memory-Register Architecture Pros and Cons
Pros
- Some data can be accessed without loading first
- Instruction format easy to encode
- Good code density
Cons
- Operands are not equivalent (poor orthogonality)
- Variable number of clocks per instruction
- May limit number of registers
Typical Processor Execution Cycle
Instruction Fetch: Obtain instruction from program storage
Instruction Decode: Determine required actions and instruction size
Operand Fetch: Locate and obtain operand data
Execute: Compute result value or status
Result Store: Deposit results in register or storage for later use
Next Instruction: Determine successor instruction
Instruction and Data Memory: Unified or Separate
[Figure: the programmer's view is a computer program (instructions such as ADD, SUBTRACT, AND, OR, COMPARE, ...); the computer's view is a CPU connected to memory and I/O]
Princeton (Von Neumann) Architecture
- Data and instructions mixed in the same unified memory
- Program as data
- Storage utilization
- Single memory interface
Harvard Architecture
- Data and instructions in separate memories
- Has advantages in certain high performance implementations
- Can optimize each memory
Classifying Instruction Set Architectures
There are four types of internal storage used by the processor to store operands explicitly and implicitly for execution of a program:
1. Stack
2. Accumulator
3. Set of registers (register-memory)
4. Set of registers (register-register/load-store)
The operands in a stack architecture are implicitly on the top of the stack, and in an accumulator architecture one operand is implicitly the accumulator. The general-purpose register architectures (register-memory and register-register) have only explicit operands, either in registers or memory locations.
Basic Addressing Classes
Declining cost of registers
Operand locations for four instruction set architecture classes
The arrows indicate whether the operand is an input or the result of the ALU operation, or both an input and result. Lighter shades indicate inputs, and the dark shade indicates the result. In (a), a Top Of Stack register (TOS) points to the top input operand, which is combined with the operand below. The first operand is removed from the stack, the result takes the place of the second operand, and TOS is updated to point to the result. All operands are implicit. In (b), the Accumulator is both an implicit input operand and a result. In (c), one input operand is a register, one is in memory, and the result goes to a register. All operands are registers in (d) and, like the stack architecture, can be transferred to memory only via separate instructions: push or pop for (a) and load or store for (d).
Code Sequence for C = A + B

Stack     Accumulator   Register-memory   Register-register
Push A    Load A        Load R1, A        Load R1, A
Push B    Add B         Add R3, R1, B     Load R2, B
Add       Store C       Store R3, C       Add R3, R1, R2
Pop C                                     Store R3, C

The code sequence for C = A + B for four classes of instruction sets. Note that the Add instruction has implicit operands for stack and accumulator architectures, and explicit operands for register architectures. It is assumed that A, B, and C all belong in memory and that the values of A and B cannot be destroyed.
Stack Architectures
Accumulator Architectures
Register-Set Architectures
Register-to-Register: Load-Store Architectures
Register-to-Memory Architectures
Memory-to-Memory Architectures
Instruction Formats
Instruction Set Architecture (ISA)
• To command a computer's hardware, you must speak its language. The words of a machine's language are called instructions, and its vocabulary is called the instruction set.
• Once you learn one machine language, it is easy to pick up others: there are few fundamental operations that all computers must provide.
• All designers have the same goal of finding a language that simplifies building the hardware and the compiler while maximizing performance and minimizing cost.
• Learning how instructions are represented leads to discovering the secret of computing: the stored-program concept.
• The MIPS instruction set is used as a case study.
Interface Design
A good interface:
• Lasts through many implementations (portability, compatibility)
• Is used in many different ways (generality)
• Provides convenient functionality to higher levels
• Permits an efficient implementation at lower levels
Design decisions must take into account:
• Technology
• Machine organization
• Programming languages
• Compiler technology
• Operating systems
[Figure: a single interface with several uses (use) on one side and several implementations (imp) on the other, shown over time]
Classifying Instruction Set Architectures
Accumulator Architecture
• Common in early stored-program computers, when hardware was so expensive
• Machine has only one register (the accumulator) involved in all math and logical operations
• All operations assume the accumulator as a source operand and as the destination for the operation, with the other operand stored in memory
Extended Accumulator Architecture
• Dedicated registers for specific operations, e.g., stack and array index registers, are added
• The 8086 microprocessor is an example of such a special-purpose register architecture
General-Purpose Register Architecture
• MIPS is an example of such an architecture, where registers are not restricted to a single role
• This type of instruction set can be further divided into:
  - Register-memory: allows one operand to be in memory
  - Register-register (load-store): demands all operands to be in registers
Machine | # general-purpose registers | Architecture style | Year
Motorola 6800 | | Accumulator | 1974
DEC VAX | | Register-memory, memory-memory | 1977
Intel 8086 | | Extended accumulator | 1978
Motorola 68000 | | Register-memory | 1980
Intel 80386 | | Register-memory | 1985
PowerPC | | Load-store | 1992
DEC Alpha | | Load-store | 1992
Compact Code and Stack Architectures
• When memory is scarce, machines like the Intel 80x86 had variable-length instructions to match varying operand specifications and minimize code size
• Stack machines abandoned registers altogether, arguing that it is hard for compilers to use them efficiently
• Operands are pushed onto a stack from memory, and the results have to be popped from the stack to memory
• Operations take their operands by default from the top of the stack and insert the results back onto the stack
• Stack machines simplify compilers and lend themselves to a compact instruction encoding, but limit compiler optimization (e.g., in math expressions)
Example: A = B + C
    Push AddressC    # Top=Top+4; Stack[Top]=Memory[AddressC]
    Push AddressB    # Top=Top+4; Stack[Top]=Memory[AddressB]
    add              # Stack[Top-4]=Stack[Top]+Stack[Top-4]; Top=Top-4
    Pop AddressA     # Memory[AddressA]=Stack[Top]; Top=Top-4
• Compact code is important for heralded network computers, where programs must be downloaded over the Internet (e.g., Java-based applications)
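The push/add/pop example can be traced with a few lines of Python. This is only a sketch under stated assumptions: the function name `run_stack_program`, the dictionary used as memory, and the sample values are illustrative and not part of the slides.

```python
def run_stack_program(program, memory):
    """Run push/pop/add instructions on a tiny stack machine model."""
    stack = []
    for op, *args in program:
        if op == "push":            # copy a memory operand onto the stack
            stack.append(memory[args[0]])
        elif op == "pop":           # move the top of the stack back to memory
            memory[args[0]] = stack.pop()
        elif op == "add":           # operands are implicit: the two top entries
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)
        else:
            raise ValueError(f"unknown operation: {op}")
    return memory


# A = B + C with assumed sample values B = 5, C = 7
memory = {"A": 0, "B": 5, "C": 7}
program = [("push", "C"), ("push", "B"), ("add",), ("pop", "A")]
run_stack_program(program, memory)
print(memory["A"])
```

Note that `add` names no operands at all, which is what makes the encoding compact.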
Other Types of Architecture
High-Level-Language Architecture
• In the 1960s, systems software was rarely written in high-level languages, and virtually every commercial operating system before Unix was written in assembly
• Some people blamed the code density on the instruction set rather than the programming language
• A machine design philosophy was advocated with the goal of making the hardware more like high-level languages
• The effectiveness of high-level languages, memory size limitations, and the lack of efficient compilers doomed this philosophy to a historical footnote
Reduced Instruction Set Architecture
• With the recent development in compiler technology and expanded memory sizes, fewer programmers are using assembly-level coding
• Instruction set architecture became measurable in the way compilers, rather than programmers, use it
• RISC architecture favors simplifying hardware design over enriching the offered set of instructions, relying on compilers to use them effectively to perform complex operations
• Virtually all new architectures since the 1980s follow the RISC philosophy of fixed instruction lengths, load-store operations, and limited addressing modes
Evolution of Instruction Sets
[Figure: evolution of instruction sets. Single Accumulator (EDSAC) leads to Accumulator + Index Registers (Manchester Mark I), then to Separation of Programming Model from Implementation (High-Level Language Based; Concept of a Family), then to General Purpose Register Machines (Complex Instruction Sets, e.g., VAX, Intel; Load/Store Architecture, e.g., CDC, Cray 1), and finally to RISC (e.g., MIPS, SPARC)]
Register-Memory Architectures
Effect of the number of memory operands:
# memory addresses | Max. number of operands | Examples
0 | 3 | SPARC, MIPS, PowerPC, ALPHA
1 | 2 | Intel 80x86, Motorola 68000
2 | 2 | VAX (also has 3-operand format)
3 | 3 | VAX (also has 2-operand format)
Memory Addressing
Interpreting memory addresses:
• The address of a word matches the byte address of one of its 4 bytes
• The addresses of sequential words differ by 4 (word size in bytes)
• Words' addresses are multiples of 4 (alignment restriction)
• Machines that use the address of the leftmost byte as the word address are called "Big Endian", and those that use the rightmost byte are called "Little Endian"
• Misalignment complicates memory access and causes programs to run slower (some machines do not allow misaligned memory access at all)
• Byte ordering can be a problem when exchanging data among different machines
• Byte addresses affect array index calculation, which must account for word addressing and the offset within the word
Object addressed | Aligned at byte offsets | Misaligned at byte offsets
Byte | 0, 1, 2, 3, 4, 5, 6, 7 | Never
Half word | 0, 2, 4, 6 | 1, 3, 5, 7
Word | 0, 4 | 1, 2, 3, 5, 6, 7
Double word | 0 | 1, 2, 3, 4, 5, 6, 7
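The byte-ordering and alignment rules can be checked with Python's standard `struct` module. The helper names `is_aligned` and `split_byte_address`, the 4-byte word size, and the sample value are assumptions made for this sketch.

```python
import struct

WORD_SIZE = 4  # bytes per word, as assumed on the slide


def is_aligned(address, size):
    """An object of `size` bytes is aligned when its address is a multiple of its size."""
    return address % size == 0


def split_byte_address(address):
    """Split a byte address into (word index, byte offset within the word)."""
    return divmod(address, WORD_SIZE)


word = 0x0A0B0C0D
big = struct.pack(">I", word)      # most significant byte stored first
little = struct.pack("<I", word)   # least significant byte stored first
print(big.hex(), little.hex())

print(is_aligned(4, 4), is_aligned(6, 4), is_aligned(6, 2))
print(split_byte_address(14))
```

The two packed byte strings hold the same word in opposite byte order, which is exactly the problem that arises when such data is exchanged between machines of different endianness.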
Addressing Modes
• Addressing modes refer to how the location of an operand (the effective address) is specified
• Addressing modes have the ability to:
  - Significantly reduce instruction counts
  - Increase the average CPI
  - Increase the complexity of building a machine
• The VAX machine is used for benchmark data since it supports a wide range of memory addressing modes
• Famous addressing modes can be classified based on:
  - the source of the data, into register, immediate, or memory
  - the address calculation, into direct and indirect
• An indexed addressing mode is usually provided to allow efficient implementation of loops and array access
Example of Addressing Modes
Addressing mode | Example | Meaning | When used
Register | Add R4, R3 | Regs[R4] ← Regs[R4] + Regs[R3] | When a value is in a register
Immediate | Add R4, #3 | Regs[R4] ← Regs[R4] + 3 | For constants
Displacement | Add R4, 100(R1) | Regs[R4] ← Regs[R4] + Mem[100 + Regs[R1]] | Accessing local variables
Register indirect | Add R4, (R1) | Regs[R4] ← Regs[R4] + Mem[Regs[R1]] | Accessing using a pointer or a computed address
Indexed | Add R4, (R1 + R2) | Regs[R4] ← Regs[R4] + Mem[Regs[R1] + Regs[R2]] | Sometimes useful in array addressing: R1 = base of the array, R2 = index amount
Direct or absolute | Add R4, (1001) | Regs[R4] ← Regs[R4] + Mem[1001] | Sometimes useful for accessing static data; address constant may need to be large
Memory indirect or memory deferred | Add R4, @(R3) | Regs[R4] ← Regs[R4] + Mem[Mem[Regs[R3]]] | If R3 is the address of the pointer p, then the mode yields *p
Autoincrement | Add R4, (R2)+ | Regs[R4] ← Regs[R4] + Mem[Regs[R2]]; Regs[R2] ← Regs[R2] + d | Useful for stepping through arrays within a loop. R2 points to the start of the array; each reference increments R2 by d
Autodecrement | Add R4, -(R2) | Regs[R2] ← Regs[R2] - d; Regs[R4] ← Regs[R4] + Mem[Regs[R2]] | Same use as autoincrement. Autodecrement/autoincrement can also act as push/pop to implement a stack
Scaled | Add R4, 100(R2)[R3] | Regs[R4] ← Regs[R4] + Mem[100 + Regs[R2] + Regs[R3] * d] | Used to index arrays
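The "Meaning" column boils down to a small address computation per mode. The sketch below covers the memory-addressing modes; the function name `effective_address`, its keyword arguments, and the register and memory contents are hypothetical values chosen for illustration only.

```python
def effective_address(mode, regs, mem, *, base=None, index=None, disp=0, d=4):
    """Return the memory address that an operand in the given mode refers to."""
    if mode == "displacement":        # Mem[disp + Regs[base]]
        return disp + regs[base]
    if mode == "register_indirect":   # Mem[Regs[base]]
        return regs[base]
    if mode == "indexed":             # Mem[Regs[base] + Regs[index]]
        return regs[base] + regs[index]
    if mode == "direct":              # Mem[disp]
        return disp
    if mode == "memory_indirect":     # Mem[Mem[Regs[base]]]
        return mem[regs[base]]
    if mode == "scaled":              # Mem[disp + Regs[base] + Regs[index] * d]
        return disp + regs[base] + regs[index] * d
    raise ValueError(f"unsupported mode: {mode}")


regs = {"R1": 200, "R2": 8, "R3": 3}   # assumed register contents
mem = {200: 1000}                      # assumed memory: address 200 holds a pointer

print(effective_address("displacement", regs, mem, base="R1", disp=100))
print(effective_address("indexed", regs, mem, base="R1", index="R2"))
print(effective_address("memory_indirect", regs, mem, base="R1"))
print(effective_address("scaled", regs, mem, base="R2", index="R3", disp=100))
```

Memory indirect is the only mode here that needs a memory read just to form the address, which hints at why richer modes can raise the average CPI.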
Addressing Mode for Signal Processing
DSP offers special addressing modes to better serve popular algorithms.
Modulo addressing
• Since DSP deals with continuous data streams, circular buffers are widely used
• Circular or modulo addressing allows automatic increment and decrement and resets the pointer when reaching the end of the buffer
Reverse addressing
• The resulting address is the reverse order of the current address
• Reverse addressing mode expedites the access, which otherwise requires a number of logical instructions or extra memory accesses
Fast Fourier Transform
    0 (000) → 0 (000)
    1 (001) → 4 (100)
    2 (010) → 2 (010)
    3 (011) → 6 (110)
    4 (100) → 1 (001)
    5 (101) → 5 (101)
    6 (110) → 3 (011)
    7 (111) → 7 (111)
Special features require either hand coding or a compiler that uses such features (GNU would not be a good choice).
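Both DSP addressing modes are short computations when written out in software, which is roughly the work the hardware mode saves. The function names `next_circular` and `bit_reverse` and the buffer bounds below are assumptions for this sketch.

```python
def next_circular(pointer, start, length, step=1):
    """Advance a pointer inside a circular buffer, wrapping at either end."""
    return start + (pointer - start + step) % length


def bit_reverse(index, bits):
    """Reverse the lowest `bits` bits of index (the FFT reordering)."""
    result = 0
    for _ in range(bits):
        result = (result << 1) | (index & 1)  # move the lowest bit of index into result
        index >>= 1
    return result


# An 8-entry buffer assumed to start at address 100
print(next_circular(107, 100, 8))         # stepping past the end wraps to the start
print(next_circular(100, 100, 8, -1))     # stepping before the start wraps to the end

print([bit_reverse(i, 3) for i in range(8)])
```

The list printed last reproduces the right-hand column of the 3-bit Fast Fourier Transform table.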
Operations of the Computer Hardware
"There must certainly be instructions for performing the fundamental arithmetic operations."
Burks, Goldstine, and von Neumann, 1947
• Assembly language is a symbolic representation of what the processor actually understands
• The MIPS assembler allows only one instruction per line and ignores comments following # until the end of the line
Example:
Translation of a segment of a C program to MIPS assembly instructions
C: f = (g + h) - (i + j)
MIPS:
    add t0, g, h     # temp variable t0 contains "g + h"
    add t1, i, j     # temp variable t1 contains "i + j"
    sub f, t0, t1    # f = t0 - t1 = (g + h) - (i + j)
Operations in the Instruction Set
Operator type | Examples
Arithmetic and logical | Integer arithmetic and logical operations: add, and, subtract, or
Data transfer | Loads-stores (move instructions on machines with memory addressing)
Control | Branch, jump, procedure call and return, trap
System | Operating system call, virtual memory management instructions
Floating point | Floating-point instructions: add, multiply
Decimal | Decimal add, decimal multiply, decimal-to-character conversion
String | String move, string compare, string search
Graphics | Pixel operations, compression/decompression operations
• Arithmetic, logical, data transfer, and control are almost standard categories for all machines
• System instructions are required for multi-programming environments, although support for system functions varies
• Decimal and string instructions can be primitives, e.g., on the IBM 360 and the VAX
• Support for floating point, decimal, string, and graphics can be optional, sometimes provided via a co-processor
• Some machines rely on the compiler to synthesize special operations, such as string handling, from simpler instructions
Operations for Media & Signal Processing
• Single instruction multiple data (SIMD) and vector instructions are often supported in DSPs, which are commonly used in multimedia and signal processing applications
• Partitioned Add (integer)
  - Performs multiple narrower additions at once on a 64-bit ALU, since most data are narrow
  - Increases ALU throughput for multimedia applications
• Paired single operations (float)
  - Allow the same register to act as two operands to the same operation
  - Handy in dealing with vertices and coordinates
• Multiply and accumulate
  - Very handy for calculating dot products of vectors (signal processing) and matrix multiplication
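A partitioned add and a multiply-accumulate loop can both be modeled in a few lines. In this sketch the 16-bit lane width, the four lanes, and the names `partitioned_add` and `dot_product` are assumptions; the slides fix only the 64-bit ALU width.

```python
def partitioned_add(a, b, lane_bits=16, lanes=4):
    """Add two packed 64-bit values lane by lane; carries never cross a lane boundary."""
    mask = (1 << lane_bits) - 1
    result = 0
    for lane in range(lanes):
        shift = lane * lane_bits
        lane_sum = ((a >> shift) & mask) + ((b >> shift) & mask)
        result |= (lane_sum & mask) << shift   # wrap around inside the lane
    return result


def dot_product(xs, ys):
    """Each loop iteration is one multiply-accumulate step."""
    acc = 0
    for x, y in zip(xs, ys):
        acc += x * y
    return acc


packed = partitioned_add(0x0001_FFFF_0003_0004, 0x0001_0001_0001_0001)
print(f"{packed:016x}")
print(dot_product([1, 2, 3], [4, 5, 6]))
```

The second lane from the top overflows (0xFFFF + 1) and wraps to 0 without disturbing its neighbor, which is the difference between a partitioned add and an ordinary 64-bit add.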
Frequency of Operations Usage
Make the common case fast by focusing on these operations.
• The most widely executed instructions are the simple operations of an instruction set
• The following is the average usage in SPECint on Intel 80x86:
Rank | 80x86 instruction | Integer average (% total executed)
1 | Load | 22%
2 | Conditional branch | 20%
3 | Compare | 16%
4 | Store | 12%
5 | Add | 8%
6 | And | 6%
7 | Sub | 5%
8 | Move register-register | 4%
9 | Call | 1%
10 | Return | 1%
Total | | 96%
Control Flow Instructions
• Jump: for an unconditional change in the control flow
• Branch: for a conditional change in the control flow
• Procedure calls and returns
Data is based on SPEC on Alpha.
Destination Address Definition
• Addressing relative to the program counter proved to be the best choice for forward and backward branches or jumps (load-address independent)
• To allow for dynamic loading of library routines, register indirect addressing allows addresses to be loaded in special registers (e.g., virtual functions in C++ and system calls in a case statement)
Data is based on SPEC on Alpha.
Condition Evaluation
• Compare-and-branch can be efficient if the majority of conditions are comparisons with zero
Remember to focus on the common case.
Based on SPEC on MIPS.
Frequency of Types of Comparison
Data is based on SPEC on Alpha.
Different benchmarks and machines set new design priorities.
• DSPs support a repeat instruction for "for" loops (vectors) using registers
Supporting Procedures
Execution of a procedure follows these steps:
• Store parameters in a place accessible to the procedure
• Transfer control to the procedure
• Acquire the storage resources needed for the procedure
• Perform the desired task
• Store the result value in a place accessible to the calling program
• Return control to the point of origin
The hardware provides a program counter to trace instruction flow and manage transfer of control.
Parameter Passing
• Registers can be used for passing a small number of parameters
• A stack is used to spill registers of the current context and make room for the called procedure to run, and to allow large parameters to be passed
• Storage of machine state can be performed by the caller or the callee
• Handling of shared variables is important to ensure correct semantics and thus requires clear specifications in the library interface
Global variables stored in registers need careful handling.
Type and Size of Operands
• The type of an operand is designated by encoding it in the instruction's operation code
• The type of an operand, e.g., single-precision float, effectively gives its size
• Common operand types include character, half-word and word-size integer, and single- and double-precision floating point
• Characters are almost always in ASCII, integers in 2's complement, and floating point in IEEE 754
• The 16-bit Unicode used in Java is gaining popularity due to its support for international character sets
• For business applications, some architectures support a decimal format in binary-coded decimal (BCD)
• Depending on the size of the word, the complexity of handling different operand types differs
• DSP offers fixed-point data types to support high-precision floating-point arithmetic and to allow sharing a single exponent among multiple numbers
• For graphics applications, vertex and pixel operands are added features
Size of Operands
• The double-word data type is used for double-precision floating-point operations and for address storage in machines with a 64-bit-wide address bus
• Words are used for integer operations and on 32-bit address bus machines
• Because of the mix in SPEC, word and double-word data types dominate
Frequency of reference by size, based on SPEC2000 on Alpha.
Instruction Representation
• Humans are taught to think in base 10 (decimal), but numbers may be represented in any base (123 in base 10 = 1111011 in binary, or base 2)
• Numbers are stored in computers as a series of high and low electronic signals (binary numbers)
• Binary digits are called bits and are considered the atom of computing
• Each piece of an instruction is a number, and placing these numbers together forms the instruction
• The assembler translates symbolic assembly instructions into machine language instructions (machine code)
Example:
Assembly: add $t0, $s1, $s2
M/C language (decimal):
    0      | 17     | 18     | 8      | 0      | 32
M/C language (binary):
    000000 | 10001  | 10010  | 01000  | 00000  | 100000
    6 bits | 5 bits | 5 bits | 5 bits | 5 bits | 6 bits
Note: the MIPS compiler by default maps $s0, ..., $s7 to registers 16-23 and $t0, ..., $t7 to registers 8-15.
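Placing the six field values of the add example side by side is just shifting and OR-ing. The sketch below uses the standard MIPS R-format field widths (6, 5, 5, 5, 5, 6 bits); the helper name `encode_r_type` is an assumption for illustration.

```python
def encode_r_type(op, rs, rt, rd, shamt, funct):
    """Pack the six R-format fields into one 32-bit instruction word."""
    return (op << 26) | (rs << 21) | (rt << 16) | (rd << 11) | (shamt << 6) | funct


# add $t0, $s1, $s2  ->  op=0, rs=$s1=17, rt=$s2=18, rd=$t0=8, shamt=0, funct=32
word = encode_r_type(0, 17, 18, 8, 0, 32)
print(f"{word:032b}")
print(f"0x{word:08X}")
```

Reading the printed bit string in groups of 6, 5, 5, 5, 5, and 6 bits gives back the binary row of the example.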
Encoding an Instruction Set
• Instruction encoding affects the size of the compiled program and the complexity of the CPU implementation
• The operation is typically specified in one field, called the opcode
• The addressing mode for the operand can be encoded with the operation, or specified through a separate identifier when a large number of modes is supported
• The architect must balance several competing factors:
  - Desire to support as many registers and addressing modes as possible
  - Effect of operand specification on the size of the instruction (program)
  - Desire to simplify instruction fetching and decoding during execution
• Fixed-size instruction encoding simplifies the CPU design while limiting the addressing modes supported
• An architect caring about code size can use variable-size encoding
• A hybrid approach is to allow variability by supporting multiple-sized instructions
Encoding Examples
MIPS Instruction Format
Register-format instructions:
    op     | rs     | rt     | rd     | shamt  | funct
    6 bits | 5 bits | 5 bits | 5 bits | 5 bits | 6 bits
• op: basic operation of the instruction, traditionally called the opcode
• rs: the first register source operand
• rt: the second register source operand
• rd: the register destination operand; it gets the result of the operation
• shamt: shift amount
• funct: selects the specific variant of the operation in the op field
Immediate-type instructions:
    op     | rs     | rt     | address
    6 bits | 5 bits | 5 bits | 16 bits
• Some instructions need longer fields than provided, for large-value constants
• The 16-bit address means a load word instruction can load a word within a region of ±2^15 bytes of the address in the base register
Example: lw $t0, 32($s3)    # Temporary register $t0 gets A[8]
Instruction | Format | op | rs | rt | rd | shamt | funct | address
add | R | 0 | reg | reg | reg | 0 | 32 | N/A
sub | R | 0 | reg | reg | reg | 0 | 34 | N/A
lw | I | 35 | reg | reg | N/A | N/A | N/A | address
sw | I | 43 | reg | reg | N/A | N/A | N/A | address
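The ±2^15 limit follows directly from storing the address as a signed 16-bit field. The sketch below encodes and decodes an I-format word; the helper names are hypothetical, and the register numbers assume the usual MIPS convention ($t0 = 8, $s3 = 19).

```python
def encode_i_type(op, rs, rt, offset):
    """Pack an I-format instruction; the offset must fit in a signed 16-bit field."""
    if not -(1 << 15) <= offset < (1 << 15):
        raise ValueError("offset does not fit in 16 bits")
    return (op << 26) | (rs << 21) | (rt << 16) | (offset & 0xFFFF)


def decode_i_type(word):
    """Split an I-format word back into (op, rs, rt, signed offset)."""
    offset = word & 0xFFFF
    if offset >= 0x8000:          # sign-extend the 16-bit field
        offset -= 0x10000
    return word >> 26, (word >> 21) & 0x1F, (word >> 16) & 0x1F, offset


# lw $t0, 32($s3): op=35, rs=$s3=19 (base register), rt=$t0=8, offset=32
word = encode_i_type(35, 19, 8, 32)
print(f"0x{word:08X}")
print(decode_i_type(word))
```

An offset of 32768 is rejected because it cannot be represented in the 16-bit field, which is why larger constants need a longer instruction sequence.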
The Stored Program Concept
• Learning how instructions are represented leads to discovering the secret of computing: the stored-program concept
• Today's computers are built on two key principles:
  - Instructions are represented as numbers
  - Programs can be stored in memory to be read or written just like numbers
• The power of the concept: memory can contain
  - the source code for an editor
  - the compiled m/c code for the editor
  - the text that the compiled program is using
  - the compiler that generated the code
[Figure: a processor connected to a memory that holds an accounting program (machine code), an editor program (machine code), a C compiler (machine code), payroll data, book text, and source code in C for the editor program]
Compiling if-then-else in MIPS
Assuming the five variables f, g, h, i, and j correspond to the five registers $s0 through $s4, what is the compiled MIPS code for the following C if statement?
    if (i == j) f = g + h; else f = g - h;
[Figure: flowchart testing i == j; the i = j path computes f = g + h, the i ≠ j path (Else:) computes f = g - h, and both paths meet at Exit:]
MIPS:
          bne $s3, $s4, Else    # go to Else if i ≠ j
          add $s0, $s1, $s2     # f = g + h (skipped if i ≠ j)
          j Exit
    Else: sub $s0, $s1, $s2     # f = g - h (skipped if i = j)
    Exit:
Typical Compilation
Major Types of Optimization
Optimization name | Explanation | Frequency
High-level | At or near the source level; machine-independent |
  Procedure integration | Replace procedure call by procedure body | N.M.
Local | Within straight-line code |
  Common subexpression elimination | Replace two instances of the same computation by a single copy | 18%
  Constant propagation | Replace all instances of a variable that is assigned a constant with the constant |
  Stack height reduction | Rearrange expression tree to minimize resources needed for expression evaluation | N.M.
Global | Across a branch |
  Global common subexpression elimination | Same as local, but this version crosses branches |
  Copy propagation | Replace all instances of a variable A that has been assigned X (i.e., A = X) with X |
  Code motion | Remove code from a loop that computes the same value each iteration of the loop |
  Induction variable elimination | Simplify/eliminate array-addressing calculations within loops |
Machine-dependent | Depends on machine knowledge |
  Strength reduction | Many examples, such as replacing a multiply by a constant with adds and shifts | N.M.
  Pipeline scheduling | Reorder instructions to improve pipeline performance | N.M.
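Strength reduction, the "multiply by a constant with adds and shifts" entry, can be illustrated in a few lines. This is a sketch of the idea rather than of any particular compiler's implementation; the function name `multiply_by_constant` is an assumption, and the constant is taken to be non-negative.

```python
def multiply_by_constant(x, constant):
    """Compute x * constant (constant >= 0) using only shifts and adds."""
    result = 0
    shift = 0
    while constant:
        if constant & 1:            # this bit of the constant contributes x << shift
            result += x << shift
        constant >>= 1
        shift += 1
    return result


# 10 = 0b1010, so x * 10 becomes (x << 1) + (x << 3)
print(multiply_by_constant(7, 10))
```

A compiler would perform this decomposition once at compile time and emit only the resulting shift and add instructions, so the loop above never runs in the generated code.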
Effect of Compiler Optimization
Measurements taken on S
[Figure: chart with axis "Program and Compiler Optimization Level"]
Level 0: non-optimized code
Level 1: local optimization
Level 2: global optimization, s/w pipelining
Level 3: adds procedure integration
Compiler Support for Multimedia Instr.
• Intel's MMX and PowerPC AltiVec have small vector processing capabilities targeting multimedia applications (to speed up graphics)
• Intel added a new set of instructions called Streaming SIMD Extension
• A major advantage of vector computers is hiding the latency of memory access by loading multiple elements and then overlapping execution with data transfer
• Vector computers typically have strided and/or gather/scatter addressing to perform operations on distant memory locations
  - Strided addressing allows memory access in increments larger than one
  - Gather/scatter addressing is similar to register indirect mode, where the addresses are stored instead of the data
• Supporting vector operations without strided addressing, as in Intel's MMX, limits the potential speedup
• Such limited support for vector processing makes the use of vectorizing compiler optimization unpopular and restricts its scope to hand-coded routines
SIMD instructions on MMX and AltiVec tend to be solutions, not primitives.
Starting a Program
[Figure: C program → Compiler → Assembly language program → Assembler → Object: machine language module (plus Object: library routine, machine language) → Linker → Executable: machine language program → Loader → Memory]
Linker
- Places code and data modules symbolically in memory
- Determines the addresses of data and instruction labels
- Patches both internal and external references
Object files for Unix typically contain:
• Header: size and position of components
• Text segment: machine code
• Data segment: static and dynamic variables
• Relocation info: identifies absolute memory references
• Symbol table: names and locations of labels, procedures, and variables
• Debugging info: mapping of source to object code, break points, etc.
Loading Executable Program
[Figure: memory layout from high to low addresses: Stack ($sp at 7fff ffff hex), Dynamic data, Static data (starting at 1000 0000 hex, with $gp at 1000 8000 hex), Text (pc at 0040 0000 hex), Reserved (from 0)]
To load an executable, the operating system follows these steps:
• Reads the executable file header to determine the size of the text and data segments
• Creates an address space large enough for the text and data
• Copies the instructions and data from the executable file into memory
• Copies the parameters (if any) to the main program onto the stack
• Initializes the machine registers and sets the stack pointer to the first free location
• Jumps to a start-up routine that copies the parameters into the argument registers and calls the main routine of the program
Instruction Set Design Issues
Instruction set design issues include:
• Number of Addresses
• Flow of Control
• Operand Types
• Addressing Modes
• Instruction Types
• Instruction Formats
Number of Addresses
Four categories:
• 3-address machines
  - 2 for the source operands and one for the result
• 2-address machines
  - One address doubles as source and result
• 1-address machines
  - Accumulator machines
  - Accumulator is used for one source and the result
• 0-address machines
  - Stack machines
  - Operands are taken from the stack
  - Result goes onto the stack
Number of Addresses (cont.)
Three-address machines
• Two for the source operands, one for the result
• RISC processors use three addresses
• Sample instructions:
    add  dest,src1,src2    ; M(dest)=[src1]+[src2]
    sub  dest,src1,src2    ; M(dest)=[src1]-[src2]
    mult dest,src1,src2    ; M(dest)=[src1]*[src2]
Three addresses:
    Operand 1, Operand 2, Result
    Example: a = b + c
Three-address instruction formats are not common because they require a relatively long instruction format to hold the three address references.
Number of Addresses (cont.)
Example
C statement:
    A = B + C * D - E + F + A
Equivalent code:
    mult T,C,D    ; T = C*D
    add  T,T,B    ; T = B+C*D
    sub  T,T,E    ; T = B+C*D-E
    add  T,T,F    ; T = B+C*D-E+F
    add  A,T,A    ; A = B+C*D-E+F+A
Number of Addresses (cont.)
Two-address machines
• One address doubles (for source operand and result)
• The last example makes a case for it
  - Address T is used twice
• Sample instructions:
    load dest,src    ; M(dest)=[src]
    add  dest,src    ; M(dest)=[dest]+[src]
    sub  dest,src    ; M(dest)=[dest]-[src]
    mult dest,src    ; M(dest)=[dest]*[src]
Two addresses:
    One address doubles as operand and result
    Example: a = a + b
The two-address format reduces the space requirement but also introduces some awkwardness. To avoid altering the value of an operand, a MOVE instruction is used to move one of the values to a result or temporary location before performing the operation.
Number of Addresses (cont.)
Example
C statement:
    A = B + C * D - E + F + A
Equivalent code:
    load T,C    ; T = C
    mult T,D    ; T = C*D
    add  T,B    ; T = B+C*D
    sub  T,E    ; T = B+C*D-E
    add  T,F    ; T = B+C*D-E+F
    add  A,T    ; A = B+C*D-E+F+A
Number of Addresses (cont.)
One-address machines
• Use a special set of registers called accumulators
  - Specify one source operand and receive the result
• Called accumulator machines
• Sample instructions:
    load  addr    ; accum = [addr]
    store addr    ; M[addr] = accum
    add   addr    ; accum = accum + [addr]
    sub   addr    ; accum = accum - [addr]
    mult  addr    ; accum = accum * [addr]
Number of Addresses (cont.)
Example
C statement:
    A = B + C * D - E + F + A
Equivalent code:
    load  C    ; load C into accum
    mult  D    ; accum = C*D
    add   B    ; accum = C*D+B
    sub   E    ; accum = B+C*D-E
    add   F    ; accum = B+C*D-E+F
    add   A    ; accum = B+C*D-E+F+A
    store A    ; store accum contents in A
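The one-address sequence can be checked against the C statement with a small accumulator model. The function name `run_accumulator_program` and the sample values for A through F are assumptions chosen for this sketch.

```python
def run_accumulator_program(program, memory):
    """Execute one-address instructions; the accumulator is the implicit second operand."""
    accum = 0
    for op, addr in program:
        if op == "load":
            accum = memory[addr]
        elif op == "store":
            memory[addr] = accum
        elif op == "add":
            accum += memory[addr]
        elif op == "sub":
            accum -= memory[addr]
        elif op == "mult":
            accum *= memory[addr]
        else:
            raise ValueError(f"unknown operation: {op}")
    return memory


memory = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}   # assumed sample values
expected = memory["B"] + memory["C"] * memory["D"] - memory["E"] + memory["F"] + memory["A"]

program = [("load", "C"), ("mult", "D"), ("add", "B"), ("sub", "E"),
           ("add", "F"), ("add", "A"), ("store", "A")]
run_accumulator_program(program, memory)
print(memory["A"], expected)
```

Every instruction carries a single address, so the sequence needs seven instructions where the three-address version needs five, but each instruction is shorter.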
Number of Addresses (cont.)
Zero-address machines
• Stack supplies operands and receives the result
  - Special instructions to load and store use an address
• Called stack machines (Ex: HP3000, Burroughs B5500)
• Sample instructions:
    push addr    ; push([addr])
    pop  addr    ; pop([addr])
    add          ; push(pop + pop)
    sub          ; push(pop - pop)
    mult         ; push(pop * pop)
Number of Addresses (cont.)
Example
C statement:
    A = B + C * D - E + F + A
Equivalent code:
    push  E
    push  C
    push  D
    mult
    push  B
    add
    sub
    push  F
    add
    push  A
    add
    pop   A
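The stack sequence above can be checked with a small interpreter sketch. The name `run_stack` and the memory values are assumptions for illustration. Following the slide's definition `sub = push(pop - pop)`, the first value popped (the top of the stack) is the left operand, which is what makes `push E` come first in the sequence.

```python
def run_stack(program, memory):
    """Run a hypothetical zero-address (stack) machine program."""
    stack = []
    for line in program:
        op, *operand = line.split()
        if op == "push":
            stack.append(memory[operand[0]])
        elif op == "pop":
            memory[operand[0]] = stack.pop()
        elif op in ("add", "sub", "mult"):
            first = stack.pop()   # top of stack: left operand
            second = stack.pop()  # next entry: right operand
            if op == "add":
                stack.append(first + second)
            elif op == "sub":
                stack.append(first - second)
            else:
                stack.append(first * second)
        else:
            raise ValueError(f"unknown opcode: {op}")
    return stack


# Illustrative values only.
memory = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
program = [
    "push E", "push C", "push D", "mult", "push B", "add",
    "sub", "push F", "add", "push A", "add", "pop A",
]
leftover = run_stack(program, memory)
print(memory["A"], leftover)
```

Only `push` and `pop` carry an address; the arithmetic instructions carry none, which is why this style is called zero-address.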
Load/Store Architecture
* Instructions expect operands in internal processor registers
* Special LOAD and STORE instructions move data between registers and memory
* RISC uses this architecture
* Reduces instruction length
Load/Store Architecture (cont.)
Sample instructions:
    load   Rd,addr       ; Rd = [addr]
    store  addr,Rs       ; (addr) = Rs
    add    Rd,Rs1,Rs2    ; Rd = Rs1 + Rs2
    sub    Rd,Rs1,Rs2    ; Rd = Rs1 - Rs2
    mult   Rd,Rs1,Rs2    ; Rd = Rs1 * Rs2
Number of Addresses (cont.)
Example
C statement:
    A = B + C * D - E + F + A
Equivalent code:
    load   R1,B
    load   R2,C
    load   R3,D
    load   R4,E
    load   R5,F
    load   R6,A
    mult   R2,R2,R3
    add    R2,R2,R1
    sub    R2,R2,R4
    add    R2,R2,R5
    add    R2,R2,R6
    store  A,R2
Flow of Control
* Default is sequential flow
* Several instructions alter this default execution
  - Branches
    » Unconditional
    » Conditional
    » Delayed branches
  - Procedure calls
    » Delayed procedure calls
Flow of Control (cont.)
Branches
* Unconditional
  - Absolute address
  - PC-relative
    » Target address is specified relative to PC contents
    » Relocatable code
* Example: MIPS
  - Absolute address
        j  target
  - PC-relative
        b  target
Flow of Control (cont.)
Branches
* Conditional
  - Jump is taken only if the condition is met
* Two types
  - Set-Then-Jump
    » Condition testing is separated from branching
    » Condition code registers are used to convey the condition test result
    » Condition code registers keep a record of the status of the last ALU operation, such as an overflow condition
  - Example: Pentium code
        cmp  AX,BX     ; compare AX and BX
        je   target    ; jump if equal
Flow of Control (cont.)
  - Test-and-Jump
    » Single instruction performs condition testing and branching
  - Example: MIPS instruction
        beq  Rsrc1,Rsrc2,target
    Jumps to target if Rsrc1 = Rsrc2
Delayed branching
* Control is transferred after executing the instruction that follows the branch instruction
  - This instruction slot is called the delay slot
* Improves efficiency
* Highly pipelined RISC processors support it
Flow of Control (cont.)
Procedure calls
* Facilitate modular programming
* Require two pieces of information to return
  - End of procedure
    » Pentium uses ret instruction
    » MIPS uses jr instruction
  - Return address
    » In a (special) register: MIPS allows any general-purpose register
    » On the stack: Pentium
Flow of Control (cont.)
[Figure: delay slot]
Parameter Passing
Two basic techniques
* Register-based (e.g., PowerPC, MIPS)
  - Internal registers are used
    » Faster
    » Limit the number of parameters
    » Recursive procedure
* Stack-based (e.g., Pentium)
  - Stack is used
    » More general
2. Operand Types
* Instructions support basic data types
  - Characters
  - Integers
  - Floating-point
* Instruction overload
  - Same instruction for different data types
  - Example: Pentium
        mov  AL,address     ; loads an 8-bit value
        mov  AX,address     ; loads a 16-bit value
        mov  EAX,address    ; loads a 32-bit value
Operand Types
* Separate instructions
  - Instructions specify the operand size
  - Example: MIPS
        lb  Rdest,address    ; loads a byte
        lh  Rdest,address    ; loads a halfword (16 bits)
        lw  Rdest,address    ; loads a word (32 bits)
        ld  Rdest,address    ; loads a doubleword (64 bits)
    Similar instructions exist for store
3. Addressing Modes
* How the operands are specified
* Operands can be in three places
  - Registers
    » Register addressing mode
  - Part of instruction
    » Constant
    » Immediate addressing mode
    » All processors support these two addressing modes
  - Memory
    » Difference between RISC and CISC
    » CISC supports a large variety of addressing modes
    » RISC follows load/store architecture
4. Instruction Types
* Several types of instructions
  - Data movement
    » Pentium: mov dest,src
    » Some do not provide direct data movement instructions
    » Indirect data movement
          add  Rdest,Rsrc,0    ; Rdest = Rsrc+0
  - Arithmetic and logical
    » Arithmetic: integer and floating-point, signed and unsigned; add, subtract, multiply, divide
    » Logical: and, or, not, xor
Instruction Types (cont.)
Condition code bits
* S: Sign bit (0 = +, 1 = -)
* Z: Zero bit (0 = nonzero, 1 = zero)
* O: Overflow bit (0 = no overflow, 1 = overflow)
* C: Carry bit (0 = no carry, 1 = carry)
Example: Pentium
    cmp  count,25    ; compare count to 25
                     ; (subtract 25 from count)
    je   target      ; jump if equal
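A compare instruction sets the condition code bits by performing a subtraction and discarding the result. The sketch below assumes an 8-bit operand width and x86-style conventions (carry means an unsigned borrow); the function name `compare_flags` is hypothetical.

```python
def compare_flags(a, b, bits=8):
    """Return the S, Z, O, C bits produced by computing a - b (assumed 8-bit)."""
    mask = (1 << bits) - 1
    sign = 1 << (bits - 1)
    a &= mask
    b &= mask
    result = (a - b) & mask
    return {
        "S": int(bool(result & sign)),                     # result is negative
        "Z": int(result == 0),                             # operands are equal
        "O": int(bool((a ^ b) & (a ^ result) & sign)),     # signed overflow
        "C": int(a < b),                                   # unsigned borrow
    }


equal = compare_flags(25, 25)
smaller = compare_flags(10, 25)
overflow = compare_flags(0x80, 1)   # -128 - 1 does not fit in 8 signed bits
print(equal, smaller, overflow)
```

A "jump if equal" then only has to test Z, which is the separation of testing from branching that the Set-Then-Jump style relies on.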
Instruction Types (cont.)
* Flow control and I/O instructions
  - Branch
  - Procedure call
  - Interrupts
* I/O instructions
  - Memory-mapped I/O
    » Most processors support memory-mapped I/O
    » No separate instructions for I/O
  - Isolated I/O
    » Pentium supports isolated I/O
    » Separate I/O instructions
          in   AX,io_port    ; read from an I/O port
          out  io_port,AX    ; write to an I/O port
5. Instruction Formats
* Two types
  - Fixed-length
    » Used by RISC processors
    » 32-bit RISC processors use 32-bit-wide instructions
    » Examples: SPARC, MIPS, PowerPC
  - Variable-length
    » Used by CISC processors
    » Memory operands need more bits to specify
* Opcode
  - Major and exact operation
Examples of Instruction Formats
[Figure: examples of instruction formats]
RISC (Reduced Instruction Set Computer) versus CISC (Complex Instruction Set Computer)
RISC vs CISC
* The underlying philosophy of RISC machines is that a system is better able to manage program execution when the program consists of only a few different instructions that are the same length and require the same number of clock cycles to decode and execute.
* RISC systems access memory only with explicit load and store instructions.
* In CISC systems, many different kinds of instructions access memory, making instruction length variable and fetch-decode-execute time unpredictable.
RISC vs CISC
* The difference between CISC and RISC becomes evident through the basic computer performance equation:
      time/program = (time/cycle) x (cycles/instruction) x (instructions/program)
* RISC systems shorten execution time by reducing the clock cycles per instruction.
* CISC systems improve performance by reducing the number of instructions per program.
RISC vs CISC
* The simple instruction set of RISC machines enables control units to be hardwired for maximum speed.
* The more complex, and variable, instruction set of CISC machines requires microcode-based control units that interpret instructions as they are fetched from memory. This translation takes time.
* With fixed-length instructions, RISC lends itself to pipelining and speculative execution.
RISC vs CISC
* Consider the program fragments:
      CISC                   RISC
      mov ax, 10             mov ax, 0
      mov bx, 5              mov bx, 10
      mul bx, ax             mov cx, 5
                      Begin: add ax, bx
                             loop Begin
* The total clock cycles for the CISC version might be:
      (2 movs x 1 cycle) + (1 mul x 30 cycles) = 32 cycles
* While the clock cycles for the RISC version is:
      (3 movs x 1 cycle) + (5 adds x 1 cycle) + (5 loops x 1 cycle) = 13 cycles
* With the RISC clock cycle being shorter, RISC gives us much faster execution speeds.
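The cycle arithmetic for the two fragments can be sketched as a sum of count times cycles over each instruction type. The counts and per-instruction cycle costs below are assumptions for illustration (a multiply of 10 by 5, a 30-cycle multiply, single-cycle everything else), and `total_cycles` is a hypothetical helper.

```python
def total_cycles(instruction_mix):
    """Sum count * cycles over a mix given as {name: (count, cycles_each)}."""
    return sum(count * cycles for count, cycles in instruction_mix.values())


# Assumed mixes: one multi-cycle multiply (CISC) versus a
# five-iteration add/loop sequence (RISC).
cisc_mix = {"mov": (2, 1), "mul": (1, 30)}
risc_mix = {"mov": (3, 1), "add": (5, 1), "loop": (5, 1)}

cisc_cycles = total_cycles(cisc_mix)
risc_cycles = total_cycles(risc_mix)
print(cisc_cycles, risc_cycles)
```

The RISC fragment executes more instructions but fewer total cycles; whether that holds in general depends on the actual cycle costs of the machine being compared.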
RISC vs CISC
* Because of their load-store ISAs, RISC architectures require a large number of CPU registers.
* These registers provide fast access to data during sequential program execution.
* They can also be employed to reduce the overhead typically caused by passing parameters to subprograms.
* Instead of pulling parameters off of a stack, the subprogram is directed to use a subset of registers.
RISC vs CISC
[Figure: overlapping register windows]
* This is how registers can be overlapped in a RISC system.
* The current window pointer (CWP) points to the active register window.
RISC vs CISC
* It is becoming increasingly difficult to distinguish RISC architectures from CISC architectures.
* Some RISC systems provide more extravagant instruction sets than some CISC systems.
* Some systems combine both approaches.
* The following two slides summarize the characteristics that traditionally typify the differences between these two architectures.
RISC vs CISC
RISC
* Multiple register sets.
* Three operands per instruction.
* Parameter passing through register windows.
* Single-cycle instructions.
* Hardwired control.
* Highly pipelined.
CISC
* Single register set.
* One or two register operands per instruction.
* Parameter passing through memory.
* Multiple cycle instructions.
* Microprogrammed control.
* Less pipelined.
Continued...
RISC vs CISC
RISC
* Simple instructions, few in number.
* Fixed length instructions.
* Complexity in compiler.
* Only LOAD/STORE instructions access memory.
* Few addressing modes.
CISC
* Many complex instructions.
* Variable length instructions.
* Complexity in microcode.
* Many instructions can access memory.
* Many addressing modes.
Summary
Instruction Set Design Issues
Instruction set design issues include:
* Where are operands stored?
  - registers, memory, stack, accumulator
* How many explicit operands are there?
  - 0, 1, 2, or 3
* How is the operand location specified?
  - register, immediate, indirect, ...
* What type and size of operands are supported?
  - byte, int, float, double, string, vector, ...
* What operations are supported?
  - add, sub, mul, move, compare, ...
More About General Purpose Registers
Why do almost all new architectures use GPRs?
* Registers are much faster than memory (even cache)
  - Register values are available immediately
  - When memory isn't ready, the processor must wait ("stall")
* Registers are convenient for variable storage
  - Compiler assigns some variables just to registers
  - More compact code, since small fields specify registers (compared to memory addresses)
[Figure: storage hierarchy: processor registers, cache, memory, disk]
What Operations are Needed
* Arithmetic and Logical
  - Integer arithmetic: ADD, SUB, MULT, DIV, SHIFT
  - Logical operation: AND, OR, XOR, NOT
* Data Transfer - copy, load, store
* Control - branch, jump, call, return
* Floating Point - ADDF, MULF, DIVF
  - Same as arithmetic but usually take bigger operands
* Decimal - ADDD, CONVERT
* String - move, compare, search
* Graphics - pixel and vertex, compression/decompression operations
Stacks Architecture: Pros and Cons
Pros
* Good code density (implicit top of stack)
* Low hardware requirements
* Easy to write a simpler compiler for stack architectures
Cons
* Stack becomes the bottleneck
* Little ability for parallelism or pipelining
* Data is not always at the top of the stack when needed, so additional instructions like TOP and SWAP are needed
* Difficult to write an optimizing compiler for stack architectures
Accumulators Architecture: Pros and Cons
Pros
* Very low hardware requirements
* Easy to design and understand
Cons
* Accumulator becomes the bottleneck
* Little ability for parallelism or pipelining
* High memory traffic
Memory-Memory Architecture: Pros and Cons
Pros
* Requires fewer instructions (especially if 3 operands)
* Easy to write compilers for (especially if 3 operands)
Cons
* Very high memory traffic (especially if 3 operands)
* Variable number of clocks per instruction
* With two operands, more data movements are required
Memory-Register Architecture: Pros and Cons
Pros
* Some data can be accessed without loading first
* Instruction format easy to encode
* Good code density
Cons
* Operands are not equivalent (poor orthogonality)
* Variable number of clocks per instruction
* May limit number of registers
Instruction and Data Memory: Unified or Separate
[Figure: programmer's view (ADD, SUBTRACT, AND, OR, COMPARE, ...) and computer's view (CPU, memory, I/O) of a computer program (instructions)]
Princeton (Von Neumann) Architecture
--- Data and instructions mixed in the same unified memory
--- Program as data
--- Storage utilization
--- Single memory interface
Harvard Architecture
--- Data and instructions in separate memories
--- Has advantages in certain high-performance implementations
--- Can optimize each memory
Classifying Instruction Set Architectures
There are four types of internal storage used by the processor to store operands, explicitly or implicitly, for execution of a program:
1. Stack
2. Accumulator
3. Set of Registers (Register-Memory)
4. Set of Registers (Register-Register/load-store)
The operands in a stack architecture are implicitly on the top of the stack, and in an accumulator architecture one operand is implicitly the accumulator. The general-purpose register architectures (Register-Memory and Register-Register) have only explicit operands, either in registers or memory locations.
Basic Addressing Classes
[Figure: declining cost of registers]
Operand locations for four instruction set architecture classes
The arrows indicate whether the operand is an input or the result of the ALU operation, or both an input and result. Lighter shades indicate inputs, and the dark shade indicates the result. In (a), a Top Of Stack register (TOS) points to the top input operand, which is combined with the operand below. The first operand is removed from the stack, the result takes the place of the second operand, and TOS is updated to point to the result. All operands are implicit. In (b), the Accumulator is both an implicit input operand and a result. In (c), one input operand is a register, one is in memory, and the result goes to a register. All operands are registers in (d) and, like the stack architecture, can be transferred to memory only via separate instructions: push or pop for (a) and load or store for (d).
Code Sequence for C = A + B
Stack  | Accumulator | Register-memory | Register-register
Push A | Load A      | Load R1,A       | Load R1,A
Push B | Add B       | Add R3,R1,B     | Load R2,B
Add    | Store C     | Store R3,C      | Add R3,R1,R2
Pop C  |             |                 | Store R3,C
The code sequence for C = A + B for four classes of instruction sets. Note that the Add instruction has implicit operands for stack and accumulator architectures, and explicit operands for register architectures. It is assumed that A, B, and C all belong in memory and that the values of A and B cannot be destroyed.
Stack Architectures
Accumulator Architectures
Register-Set Architectures
Register-to-Register Load-Store Architectures
Register-to-Memory Architectures
Memory-to-Memory Architectures
Instruction Formats
[Figures for the slides listed above were not recovered]
Instruction Set Architecture (ISA)
* To command a computer's hardware, you must speak its language
* The words of a machine's language are called instructions, and its vocabulary is called the instruction set
* Once you learn one machine language, it is easy to pick up others:
  - There are few fundamental operations that all computers must provide
  - All designers have the same goal of finding a language that simplifies building the hardware and the compiler while maximizing performance and minimizing cost
* Learning how instructions are represented leads to discovering the secret of computing: the stored-program concept
* The MIPS instruction set is used as a case study
Interface Design
A good interface:
* Lasts through many implementations (portability, compatibility)
* Is used in many different ways (generality)
* Provides convenient functionality to higher levels
* Permits an efficient implementation at lower levels
Design decisions must take into account:
* Technology
* Machine organization
* Programming languages
* Compiler technology
* Operating systems
[Figure: one interface with several uses above it and several implementations below it, over time]
Classifying Instruction Set Architectures
Accumulator Architecture
• Common in early stored-program computers, when hardware was so expensive
• Machine has only one register (accumulator) involved in all math and logical operations
• All operations assume the accumulator as a source operand and a destination for the operation, with the other operand stored in memory
Extended Accumulator Architecture
• Dedicated registers for specific operations, e.g. stack and array index registers, are added
• The 8086 microprocessor is an example of such a special-purpose register architecture
General-Purpose Register Architecture
• MIPS is an example of such an architecture, where registers are not tied to a single role
• This type of instruction set can be further divided into:
  • Register-memory: allows for one operand to be in memory
  • Register-register (load-store): demands all operands to be in registers
Machine        | # general-purpose registers | Architecture style             | Year
Motorola 6800  | 2                           | Accumulator                    | 1974
DEC VAX        | 16                          | Register-memory, memory-memory | 1977
Intel 8086     | 1                           | Extended accumulator           | 1978
Motorola 68000 | 16                          | Register-memory                | 1980
Intel 80386    | 8                           | Register-memory                | 1985
PowerPC        | 32                          | Load-store                     | 1992
DEC Alpha      | 32                          | Load-store                     | 1992
Compact Code and Stack Architectures
* When memory was scarce, machines like the Intel 80x86 had variable-length instructions to match varying operand specifications and minimize code size
* Stack machines abandoned registers altogether, arguing that it is hard for compilers to use them efficiently
* Operands are pushed onto a stack from memory, and the results have to be popped from the stack to memory
* Operations take their operands by default from the top of the stack and insert the results back onto the stack
* Stack machines simplify compilers and lend themselves to a compact instruction encoding, but limit compiler optimization (e.g. in math expressions)
Example: A = B + C
    Push AddressC    # Top=Top+4; Stack[Top]=Memory[AddressC]
    Push AddressB    # Top=Top+4; Stack[Top]=Memory[AddressB]
    add              # Stack[Top-4]=Stack[Top]+Stack[Top-4]; Top=Top-4
    Pop  AddressA    # Memory[AddressA]=Stack[Top]; Top=Top-4
* Compact code is important for heralded network computers, where programs must be downloaded over the Internet (e.g. Java-based applications)
Other Types of Architecture
High-Level-Language Architecture
• In the 1960s, systems software was rarely written in high-level languages, and virtually every commercial operating system before Unix was written in assembly
• Some people blamed the code density on the instruction set rather than the programming language
• A machine design philosophy was advocated with the goal of making the hardware more like high-level languages
• The effectiveness of high-level languages, memory size limitations, and the lack of efficient compilers doomed this philosophy to a historical footnote
Reduced Instruction Set Architecture
• With the recent development in compiler technology and expanded memory sizes, fewer programmers are using assembly-level coding
• Instruction set architectures came to be measured by how well compilers, rather than programmers, use them
• RISC architecture favors simplifying hardware design over enriching the offered set of instructions, relying on compilers to effectively use them to perform complex operations
• Virtually all new architectures since 1982 follow the RISC philosophy of fixed instruction lengths, load-store operations, and limited addressing modes
Evolution of Instruction Sets
Single Accumulator (EDSAC, 1950)
Accumulator + Index Registers (Manchester Mark I, IBM 700 series, 1953)
Separation of Programming Model from Implementation
* High-level Language Based (B5000, 1963)
* Concept of a Family (IBM 360, 1964)
General Purpose Register Machines
* Complex Instruction Sets (VAX, Intel 432, 1977-80)
* Load/Store Architecture (CDC 6600, Cray 1, 1963-76)
RISC (MIPS, SPARC, HP-PA, IBM RS6000, ... 1987)
Register-Memory Architectures
Effect of the number of memory operands
# memory addresses | Max. number of operands | Examples
0                  | 3                       | SPARC, MIPS, PowerPC, ALPHA
1                  | 2                       | Intel 80x86, Motorola 68000
2                  | 2                       | VAX (also has 3 operands format)
3                  | 3                       | VAX (also has 2 operands format)
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 25101
Memory AddressInterpreting Memory Addressing
The address of a ord matches the byte address of one of its amp bytes
The addresses of seJuential ords differ by amp (ord si-e in byte)
ords9 addresses are multiple of amp (alignment restriction)
Machines that use the address of the leftmost byte as the ord address iscalled Kig EndianK and those that use rightmost bytes called Kittle EndianK
Misalignment complicates memory access and causes programs to run sloer (Some machines does not allo misaligned memory access at all)
8yte ordering can be a problem hen echanging data among different machines 8yte addresses affects array inde calculation to account for ord addressing and offset ithin the ord
$89ectaddressed
Aligned at8yte offsets
Misaligned at8yte offsets
8yte ampB 7ever
alf ord gtamp B
Dord gtamp B
ouble ord gt ampB
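The alignment rule and the two byte orders can be illustrated with a short sketch. It assumes natural alignment (an object of size s is aligned when its address is a multiple of s); the helper name `is_aligned` and the sample word value are illustrative, and Python's standard `struct` module is used to lay out the bytes.

```python
import struct


def is_aligned(address, size):
    """An object of `size` bytes is aligned when its address is a multiple of size."""
    return address % size == 0


# Aligned byte offsets within an 8-byte block, per object size.
aligned_offsets = {
    size: [offset for offset in range(8) if is_aligned(offset, size)]
    for size in (1, 2, 4, 8)
}

# The same 32-bit word laid out in memory under each byte order.
word = 0x0A0B0C0D
big_endian = struct.pack(">I", word)     # most significant byte at the lowest address
little_endian = struct.pack("<I", word)  # least significant byte at the lowest address

print(aligned_offsets)
print(big_endian.hex(), little_endian.hex())
```

Reading the four bytes back with the wrong byte order yields a different integer, which is the data-exchange problem noted above.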
Addressing Modes
* Addressing modes refer to how to specify the location of an operand (effective address)
* Addressing modes have the ability to:
  - Significantly reduce instruction counts
  - Increase the average CPI
  - Increase the complexity of building a machine
* The VAX machine is used for benchmark data, since it supports a wide range of memory addressing modes
* Famous addressing modes can be classified based on:
  - the source of the data, into register, immediate, or memory
  - the address calculation, into direct and indirect
* An indexed addressing mode is usually provided to allow efficient implementation of loops and array access
Example of Addressing Modes
Addressing mode | Example | Meaning | When used
Register | Add R4, R3 | Regs[R4] ← Regs[R4] + Regs[R3] | When a value is in a register
Immediate | Add R4, #3 | Regs[R4] ← Regs[R4] + 3 | For constants
Displacement | Add R4, 100(R1) | Regs[R4] ← Regs[R4] + Mem[100 + Regs[R1]] | Accessing local variables
Register indirect | Add R4, (R1) | Regs[R4] ← Regs[R4] + Mem[Regs[R1]] | Accessing using a pointer or a computed address
Indexed | Add R4, (R1 + R2) | Regs[R4] ← Regs[R4] + Mem[Regs[R1] + Regs[R2]] | Sometimes useful in array addressing: R1 = base of the array, R2 = index amount
Direct or absolute | Add R4, (1001) | Regs[R4] ← Regs[R4] + Mem[1001] | Sometimes useful for accessing static data; address constant may need to be large
Memory indirect or memory deferred | Add R4, @(R3) | Regs[R4] ← Regs[R4] + Mem[Mem[Regs[R3]]] | If R3 is the address of the pointer p, then the mode yields *p
Autoincrement | Add R4, (R2)+ | Regs[R4] ← Regs[R4] + Mem[Regs[R2]]; Regs[R2] ← Regs[R2] + d | Useful for stepping through arrays within a loop. R2 points to the start of the array; each reference increments R2 by d
Autodecrement | Add R4, -(R2) | Regs[R2] ← Regs[R2] - d; Regs[R4] ← Regs[R4] + Mem[Regs[R2]] | Same use as autoincrement. Autodecrement/increment can also act as push/pop to implement a stack
Scaled | Add R4, 100(R2)[R3] | Regs[R4] ← Regs[R4] + Mem[100 + Regs[R2] + Regs[R3] × d] | Used to index arrays
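The "Meaning" column above can be read as a recipe for fetching the memory or register operand. The sketch below is a hypothetical operand fetcher, not any real machine's decoder: `fetch_operand`, its parameter names, and the sample register and memory contents are all assumptions, with `d` standing for the operand size in bytes.

```python
def fetch_operand(mode, regs, mem, reg=None, reg2=None, const=0, d=4):
    """Return the source operand for one addressing mode (illustrative model)."""
    if mode == "register":
        return regs[reg]
    if mode == "immediate":
        return const
    if mode == "displacement":
        return mem[const + regs[reg]]
    if mode == "register_indirect":
        return mem[regs[reg]]
    if mode == "indexed":
        return mem[regs[reg] + regs[reg2]]
    if mode == "direct":
        return mem[const]
    if mode == "memory_indirect":
        return mem[mem[regs[reg]]]
    if mode == "autoincrement":
        value = mem[regs[reg]]
        regs[reg] += d          # pointer advances after the access
        return value
    if mode == "autodecrement":
        regs[reg] -= d          # pointer moves back before the access
        return mem[regs[reg]]
    if mode == "scaled":
        return mem[const + regs[reg] + regs[reg2] * d]
    raise ValueError(f"unknown addressing mode: {mode}")


# Illustrative machine state.
regs = {"R1": 8, "R2": 16, "R3": 2}
mem = {8: 24, 16: 50, 24: 99, 108: 7, 124: 11, 1001: 3}

displacement = fetch_operand("displacement", regs, mem, reg="R1", const=100)
deferred = fetch_operand("memory_indirect", regs, mem, reg="R1")
scaled = fetch_operand("scaled", regs, mem, reg="R2", reg2="R3", const=100)
print(displacement, deferred, scaled)
```

Note that only the autoincrement and autodecrement branches modify a register, which is what lets them double as pop and push on a stack.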
Addressing Mode for Signal Processing
* DSP offers special addressing modes to better serve popular algorithms
* Modulo addressing:
  - Since DSP deals with continuous data streams, circular buffers are widely used
  - Circular or modulo addressing allows automatic increment and decrement, and resets the pointer when reaching the end of the buffer
* Reverse addressing:
  - Resulting address is the reverse order of the current address
  - Reverse addressing mode expedites the access, which otherwise requires a number of logical instructions or extra memory accesses
Fast Fourier Transform
    0 (000) → 0 (000)
    1 (001) → 4 (100)
    2 (010) → 2 (010)
    3 (011) → 6 (110)
    4 (100) → 1 (001)
    5 (101) → 5 (101)
    6 (110) → 3 (011)
    7 (111) → 7 (111)
* Special features require either hand coding or a compiler that uses such features (GNU would not be a good choice)
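Both DSP addressing modes reduce to simple index arithmetic. The sketch below shows them in software; the function names `bit_reverse` and `modulo_step` are hypothetical, and a real DSP performs these computations in its address-generation hardware rather than with instructions.

```python
def bit_reverse(index, bits):
    """Reverse the low `bits` bits of index (bit-reversed FFT ordering)."""
    result = 0
    for _ in range(bits):
        result = (result << 1) | (index & 1)
        index >>= 1
    return result


def modulo_step(pointer, step, base, length):
    """Advance a circular-buffer pointer, wrapping inside [base, base + length)."""
    return base + (pointer - base + step) % length


fft_order = [bit_reverse(i, 3) for i in range(8)]
wrapped_forward = modulo_step(19, 1, base=16, length=4)    # last slot wraps to base
wrapped_backward = modulo_step(16, -1, base=16, length=4)  # base wraps to last slot
print(fft_order, wrapped_forward, wrapped_backward)
```

Without hardware support, each bit-reversed access costs a loop like the one in `bit_reverse`, which is the "number of logical instructions" the reverse addressing mode avoids.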
Operations of the Computer Hardware
"There must certainly be instructions for performing the fundamental arithmetic operations."
Burks, Goldstine and von Neumann, 1947
* Assembly language is a symbolic representation of what the processor actually understands
* The MIPS assembler allows only one instruction per line and ignores comments following # until the end of the line
Example:
Translation of a segment of a C program to MIPS assembly instructions
C:
    f = (g + h) - (i + j)
MIPS:
    add t0, g, h     # temp variable t0 contains "g + h"
    add t1, i, j     # temp variable t1 contains "i + j"
    sub f, t0, t1    # f = t0 - t1 = (g + h) - (i + j)
Operations in the Instruction Set
Operator type | Examples
Arithmetic and logical | Integer arithmetic and logical operations: add, and, subtract, or
Data transfer | Loads-stores (move instructions on machines with memory addressing)
Control | Branch, jump, procedure call and return, trap
System | Operating system call, virtual memory management instructions
Floating point | Floating point instructions: add, multiply
Decimal | Decimal add, decimal multiply, decimal to character conversion
String | String move, string compare, string search
Graphics | Pixel operations, compression/decompression operations
* Arithmetic, logical, data transfer, and control are almost standard categories for all machines
* System instructions are required for multi-programming environments, although support for system functions varies
* Decimal and string instructions can be primitives, e.g. on the IBM 360 and the VAX
* Support for floating point, decimal, string, and graphics can be optional, sometimes provided via a co-processor
* Some machines rely on the compiler to synthesize special operations, such as string handling, from simpler instructions
Operations for Media & Signal Processing
Single instruction multiple data (SIMD) and vector instructions are often supported in DSPs, which are commonly used in multimedia and signal processing applications
Partitioned add (integer)
- Performs multiple 16-bit additions on a 64-bit ALU, since most data are narrow
- Increases ALU throughput for multimedia applications
Paired single operations (float)
- Allow the same register to act as two operands to the same operation
- Handy in dealing with vertices and coordinates
Multiply and accumulate
- Very handy for calculating dot products of vectors (signal processing) and matrix multiplication
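A minimal Python sketch of two of these operations, assuming four 16-bit lanes packed into one 64-bit word. The lane width and the names `partitioned_add16` and `dot_product` are illustrative assumptions, not part of any particular instruction set.

```python
MASK16 = 0xFFFF  # one 16-bit lane


def partitioned_add16(a: int, b: int) -> int:
    """Add four independent 16-bit lanes packed in 64-bit words.

    Each lane wraps around on its own; a carry never crosses into the
    neighbouring lane, which is what distinguishes this from a plain add.
    """
    result = 0
    for lane in range(4):
        shift = 16 * lane
        lane_sum = ((a >> shift) & MASK16) + ((b >> shift) & MASK16)
        result |= (lane_sum & MASK16) << shift
    return result


def dot_product(xs, ys):
    """Dot product written as a chain of multiply-accumulate steps."""
    acc = 0
    for x, y in zip(xs, ys):
        acc += x * y  # one multiply-accumulate (MAC) operation
    return acc
```

In hardware the four lane additions happen in the same cycle by cutting the carry chain of the wide adder; the loop here only models the result.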
Frequency of Operations Usage
Make the common case fast by focusing on these operations
The most widely executed instructions are the simple operations of an instruction set
The following is the average usage in SPECint on the Intel 80x86
Rank   80x86 instruction         Integer average (% total executed)
1      Load                      22%
2      Conditional branch        20%
3      Compare                   16%
4      Store                     12%
5      Add                        8%
6      And                        6%
7      Sub                        5%
8      Move register-register     4%
9      Call                       1%
10     Return                     1%
       Total                     96%
Control Flow Instructions
Jump for unconditional change in the control flow
Branch for conditional change in the control flow
Procedure calls and returns
Data is based on SPEC on Alpha
Destination Address Definition
Addressing relative to the program counter proved to be the best choice for forward and backward branches or jumps (it is independent of the load address)
To allow for dynamic loading of library routines, register indirect addressing allows addresses to be loaded in special registers
(e.g. virtual functions in C++ and system calls in a case statement)
Data is based on SPEC on Alpha
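Why PC-relative targets are load-address independent can be shown with a small Python sketch. It assumes a MIPS-style convention in which the offset is counted in 4-byte words from the instruction after the branch; `WORD`, `branch_target` and `encode_offset` are hypothetical names used only for illustration.

```python
WORD = 4  # assumed instruction size in bytes


def branch_target(pc: int, offset_words: int) -> int:
    """Target = address of the next instruction + signed word offset."""
    return pc + WORD + offset_words * WORD


def encode_offset(pc: int, target: int) -> int:
    """Offset field an assembler would store for a branch located at pc."""
    return (target - (pc + WORD)) // WORD


# The same branch, assembled at two different load addresses.
offset_low = encode_offset(0x1000, 0x1010)
offset_high = encode_offset(0x1000 + 0x40000, 0x1010 + 0x40000)
```

Both offsets come out identical, so the code can be placed anywhere in memory without patching the branch.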
Condition Evaluation
Compare-and-branch can be efficient if the majority of conditions are comparisons with zero
Remember to focus on the common case
Based on SPEC on MIPS
Frequency of Types of Comparison
Data is based on SPEC on Alpha
A different benchmark and machine set new design priorities
DSPs support a repeat instruction for for-loops (vectors) using registers
Supporting Procedures
Execution of a procedure follows these steps:
1. Store parameters in a place accessible to the procedure
2. Transfer control to the procedure
3. Acquire the storage resources needed for the procedure
4. Perform the desired task
5. Store the result value in a place accessible to the calling program
6. Return control to the point of origin
The hardware provides a program counter to trace instruction flow and manage transfer of control
Parameter Passing
Registers can be used for passing a small number of parameters
A stack is used to spill registers of the current context and make room for the called procedure to run, and to allow large parameters to be passed
Storage of machine state can be performed by caller or callee
Handling of shared variables is important to ensure correct semantics and thus requires clear specifications in the library interface
Global variables stored in registers need careful handling
Type and Size of Operands
The type of an operand is designated by encoding it in the instruction's operation code
The type of an operand, e.g. single-precision float, effectively gives its size
Common operand types include character, half-word and word-size integer, and single- and double-precision floating point
Characters are almost always in ASCII, integers are in 2's complement, and floating point is in IEEE 754
The 16-bit Unicode used in Java is gaining popularity due to its support for international character sets
For business applications, some architectures support a decimal format in binary coded decimal (BCD)
Depending on the size of the word, the complexity of handling different operand types differs
DSP offers fixed-point data types to support high-precision floating point arithmetic and to allow sharing a single exponent among multiple numbers
For graphics applications, vertex and pixel operands are added features
Size of Operands
Frequency of reference by size, based on SPEC2000 on Alpha
The double-word data type is used for double-precision floating point operations and for address storage in machines with a 64-bit wide address bus
Words are used for integer operations and for 32-bit address bus machines
Because of the mix in SPEC, word and double-word data types dominate
Instruction Representation
Humans are taught to think in base 10 (decimal), but numbers may be represented in any base (123 in base 10 = 1111011 in binary or base 2)
Numbers are stored in computers as a series of high and low electronic signals (binary numbers)
Binary digits are called bits and are considered the atom of computing
Each piece of an instruction is a number, and placing these numbers together forms the instruction
The assembler translates symbolic assembly instructions into machine language instructions (machine code)
Example:
Assembly: add $t0, $s1, $s2
M/C language (decimal):
    0        17       18       8        0        32
M/C language (binary):
    000000   10001    10010    01000    00000    100000
    6 bits   5 bits   5 bits   5 bits   5 bits   6 bits
Note: the MIPS compiler by default maps $s0..$s7 to reg. 16-23 and $t0..$t7 to reg. 8-15
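The field packing in this example can be reproduced with a short Python sketch. The function names `encode_r` and `decode_r` are made up for illustration; the field widths (6, 5, 5, 5, 5, 6 bits) and the values for add $t0, $s1, $s2 (0, 17, 18, 8, 0, 32) are the standard MIPS R-format ones.

```python
def encode_r(op: int, rs: int, rt: int, rd: int, shamt: int, funct: int) -> int:
    """Pack the six R-format fields into one 32-bit instruction word."""
    return (op << 26) | (rs << 21) | (rt << 16) | (rd << 11) | (shamt << 6) | funct


def decode_r(word: int):
    """Split a 32-bit word back into (op, rs, rt, rd, shamt, funct)."""
    return (
        (word >> 26) & 0x3F,
        (word >> 21) & 0x1F,
        (word >> 16) & 0x1F,
        (word >> 11) & 0x1F,
        (word >> 6) & 0x1F,
        word & 0x3F,
    )


# add $t0, $s1, $s2  ->  rd = 8 ($t0), rs = 17 ($s1), rt = 18 ($s2)
add_word = encode_r(op=0, rs=17, rt=18, rd=8, shamt=0, funct=32)
add_bits = format(add_word, "032b")
```

Note that the destination register appears first in the assembly text but fourth in the encoded word.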
Encoding an Instruction Set
Instruction encoding affects the size of the compiled program and the complexity of the CPU implementation
The operation is typically specified in one field, called the opcode
The addressing mode for the operand can be encoded with the operation, or specified through a separate identifier when a large number of modes is supported
The architecture must balance several competing factors:
- Desire to support as many registers and addressing modes as possible
- Effect of operand specification on the size of the instruction (program)
- Desire to simplify instruction fetching and decoding during execution
Fixed-size instruction encoding simplifies the CPU design while limiting the addressing modes supported
An architect who cares about code size can use variable-size encoding
A hybrid approach is to allow variability by supporting multiple instruction sizes
Encoding Examples
MIPS Instruction Format
Register-format instructions
    op       rs       rt       rd       shamt    funct
    6 bits   5 bits   5 bits   5 bits   5 bits   6 bits
op: Basic operation of the instruction, traditionally called the opcode
rs: The first register source operand
rt: The second register source operand
rd: The register destination operand; it gets the result of the operation
shamt: Shift amount
funct: This field selects the specific variant of the operation in the op field
Immediate-type instructions
    op       rs       rt       address
    6 bits   5 bits   5 bits   16 bits
Some instructions need longer fields than the register format provides, for large-value constants
The 16-bit address means a load word instruction can load a word within a region of ±2^15 bytes of the address in the base register
Example: lw $t0, 32($s3)    # Temporary register $t0 gets A[8]
Instruction   Format   op   rs    rt    rd    shamt   funct   address
add           R        0    reg   reg   reg   0       32      N/A
sub           R        0    reg   reg   reg   0       34      N/A
lw            I        35   reg   reg   N/A   N/A     N/A     address
sw            I        43   reg   reg   N/A   N/A     N/A     address
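A hedged Python sketch of the immediate format, showing where the ±2^15 limit comes from. `encode_i` is an illustrative name; the register numbers assume the usual MIPS convention ($t0 = 8, $s3 = 19) and the opcode 35 for lw.

```python
def encode_i(op: int, rs: int, rt: int, offset: int) -> int:
    """Pack an I-format instruction; offset must fit in 16 signed bits."""
    if not -(1 << 15) <= offset < (1 << 15):
        raise ValueError("offset does not fit in the 16-bit address field")
    # The offset is stored in two's complement in the low 16 bits.
    return (op << 26) | (rs << 21) | (rt << 16) | (offset & 0xFFFF)


# lw $t0, 32($s3): base register rs = 19 ($s3), destination rt = 8 ($t0)
lw_word = encode_i(op=35, rs=19, rt=8, offset=32)
lw_negative = encode_i(op=35, rs=19, rt=8, offset=-4)
```

An offset outside the range cannot be encoded in one instruction, so an assembler or compiler would have to build the address in a register first.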
The Stored Program Concept
Learning how instructions are represented leads to discovering the secret of computing: the stored-program concept
Today's computers are built on two key principles:
- Instructions are represented as numbers
- Programs can be stored in memory to be read or written just like numbers
The power of the concept: memory can contain
- the source code for an editor
- the compiled m/c code for the editor
- the text that the compiled program is using
- the compiler that generated the code
[Figure: a processor connected to a memory holding an accounting program (machine code), an editor program (machine code), a C compiler (machine code), payroll data, book text, and the C source code for the editor program]
Compiling if-then-else in MIPS
Assuming the five variables f, g, h, i, and j correspond to the five registers $s0 through $s4, what is the compiler's code for the following C if statement?
    if (i == j) f = g + h; else f = g - h;
[Figure: flowchart testing i == j; the i = j path computes f = g + h, the i != j path (Else:) computes f = g - h, and both paths meet at Exit:]
MIPS:
          bne $s3, $s4, Else    # go to Else if i != j
          add $s0, $s1, $s2     # f = g + h (skipped if i != j)
          j   Exit              # go to Exit
    Else: sub $s0, $s1, $s2     # f = g - h (skipped if i = j)
    Exit:
Typical Compilation
Major Types of Optimization
Optimization name: Explanation (Frequency; N.M. = not measured)
High-level (at or near source level; machine independent)
- Procedure integration: Replace procedure call by procedure body (N.M.)
Local (within straight-line code)
- Common subexpression elimination: Replace two instances of the same computation by a single copy (18%)
- Constant propagation: Replace all instances of a variable that is assigned a constant with the constant (22%)
- Stack height reduction: Rearrange expression tree to minimize resources needed for expression evaluation (N.M.)
Global (across a branch)
- Global common subexpression elimination: Same as local, but this version crosses branches (13%)
- Copy propagation: Replace all instances of a variable A that has been assigned X (i.e. A = X) with X (11%)
- Code motion: Remove code from a loop that computes the same value in each iteration of the loop (16%)
- Induction variable elimination: Simplify/eliminate array addressing calculations within loops (2%)
Machine-dependent (depends on machine knowledge)
- Strength reduction: Many examples, such as replacing a multiply by a constant with adds and shifts (N.M.)
- Pipeline scheduling: Reorder instructions to improve pipeline performance (N.M.)
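Three of these transformations can be illustrated by hand on a toy loop. The Python sketch below is not compiler output; `scale_naive` and `scale_optimized` are hypothetical functions written to show what common subexpression elimination, code motion and strength reduction do to source-level code, assuming integer inputs.

```python
def scale_naive(values, a, b):
    """Straightforward version: recomputes (a + b) twice on every iteration."""
    out = []
    for v in values:
        out.append(v * 8 + (a + b) * (a + b))
    return out


def scale_optimized(values, a, b):
    """Same result after three hand-applied optimizations."""
    t = a + b    # common subexpression elimination: a + b computed once
    k = t * t    # code motion: loop-invariant value hoisted out of the loop
    # strength reduction: multiply by the constant 8 becomes a shift by 3
    return [(v << 3) + k for v in values]
```

The optimized version does one addition and one multiplication outside the loop and only a shift and an add per element.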
Effect of Compiler Optimization
Measurements taken on S
[Figure axis: program and compiler optimization level]
Level 0: non-optimized code
Level 1: local optimization
Level 2: global optimization, s/w pipelining
Level 3: adds procedure integration
Compiler Support for Multimedia Instr.
Intel's MMX and PowerPC AltiVec have small vector processing capabilities targeting multimedia applications (to speed up graphics)
Intel added a new set of instructions, called Streaming SIMD Extension
A major advantage of vector computers is hiding the latency of memory access by loading multiple elements and then overlapping execution with data transfer
Vector computers typically have strided and/or gather/scatter addressing to perform operations on distant memory locations
Strided addressing allows memory access in increments larger than one
Gather/scatter addressing is similar to register indirect mode, where the addresses are stored instead of the data
Supporting vector operations without strided addressing, as in Intel's MMX, limits the potential speedup
Such limited support for vector processing makes the use of vectorizing compiler optimization unpopular and restricts its scope to hand-coded routines
SIMD instructions on MMX and AltiVec tend to be solutions, not primitives
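The difference between strided and gather/scatter addressing can be sketched with a Python list standing in for memory. The function names and the sample memory contents are assumptions made for illustration only.

```python
def strided_load(memory, base, stride, count):
    """Load count elements starting at base, stepping by stride each time."""
    return [memory[base + i * stride] for i in range(count)]


def gather(memory, addresses):
    """Load from an explicit list of addresses (an index vector)."""
    return [memory[a] for a in addresses]


def scatter(memory, addresses, values):
    """Store values to an explicit list of addresses."""
    for a, v in zip(addresses, values):
        memory[a] = v


# A 4x4 row-major matrix: column 1 is a stride-4 access starting at index 1.
memory = list(range(100, 116))
column = strided_load(memory, base=1, stride=4, count=4)
picked = gather(memory, [7, 0, 3])
scatter(memory, [7, 0, 3], [-1, -2, -3])
```

Without strided addressing, the column access above would need one scalar load per element, which is the lost speedup the slide refers to.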
Starting a Program
[Figure: C program -> Compiler -> Assembly language program -> Assembler -> Object: machine language module, together with Object: library routine (machine language) -> Linker -> Executable: machine language program -> Loader -> Memory]
Linker
- Places code and data modules symbolically in memory
- Determines the addresses of data and instruction labels
- Patches both internal and external references
Object files for Unix typically contain:
- Header: size and position of components
- Text segment: machine code
- Data segment: static and dynamic variables
- Relocation info: identifies absolute memory references
- Symbol table: names and locations of labels, procedures and variables
- Debugging info: mapping of source to object code, break points, etc.
Loading Executable Program
[Figure: memory layout from high to low addresses: Stack ($sp at 7fff ffff hex), Dynamic data, Static data (from 1000 0000 hex, $gp at 1000 8000 hex), Text (pc at 0040 0000 hex), Reserved (from 0)]
To load an executable, the operating system follows these steps:
1. Reads the executable file header to determine the size of the text and data segments
2. Creates an address space large enough for the text and data
3. Copies the instructions and data from the executable file into memory
4. Copies the parameters (if any) to the main program onto the stack
5. Initializes the machine registers and sets the stack pointer to the first free location
6. Jumps to a start-up routine that copies the parameters into the argument registers and calls the main routine of the program
Instruction Set Design Issues
Instruction set design issues
- Number of Addresses
- Flow of Control
- Operand Types
- Addressing Modes
- Instruction Types
- Instruction Formats
Number of Addresses
Four categories
3-address machines
- 2 for the source operands and one for the result
2-address machines
- One address doubles as source and result
1-address machines
- Accumulator machines
- Accumulator is used for one source and result
0-address machines
- Stack machines
- Operands are taken from the stack
- Result goes onto the stack
Number of Addresses (cont.)
Three-address machines
Two for the source operands, one for the result
RISC processors use three addresses
Sample instructions
    add  dest,src1,src2    ; M(dest)=[src1]+[src2]
    sub  dest,src1,src2    ; M(dest)=[src1]-[src2]
    mult dest,src1,src2    ; M(dest)=[src1]*[src2]
Three addresses: Operand 1, Operand 2, Result
Example: a = b + c
Three-address instruction formats are not common because they require a relatively long instruction format to hold the three address references
Number of Addresses (cont.)
Example
C statement
    A = B + C * D - E + F + A
Equivalent code:
    mult T,C,D    ; T = C*D
    add  T,T,B    ; T = B+C*D
    sub  T,T,E    ; T = B+C*D-E
    add  T,T,F    ; T = B+C*D-E+F
    add  A,T,A    ; A = B+C*D-E+F+A
Number of Addresses (cont.)
Two-address machines
One address doubles (for source operand and result)
The last example makes a case for it
- Address T is used twice
Sample instructions
    load dest,src    ; M(dest)=[src]
    add  dest,src    ; M(dest)=[dest]+[src]
    sub  dest,src    ; M(dest)=[dest]-[src]
    mult dest,src    ; M(dest)=[dest]*[src]
Two addresses: one address doubles as operand and result
Example: a = a + b
The two-address format reduces the space requirement but also introduces some awkwardness. To avoid altering the value of an operand, a MOVE instruction is used to move one of the values to a result or temporary location before performing the operation
Number of Addresses (cont.)
Example
C statement
    A = B + C * D - E + F + A
Equivalent code:
    load T,C    ; T = C
    mult T,D    ; T = C*D
    add  T,B    ; T = B+C*D
    sub  T,E    ; T = B+C*D-E
    add  T,F    ; T = B+C*D-E+F
    add  A,T    ; A = B+C*D-E+F+A
Number of Addresses (cont.)
One-address machines
Use a special set of registers called accumulators
- Specify one source operand and receive the result
Called accumulator machines
Sample instructions
    load  addr    ; accum = [addr]
    store addr    ; M[addr] = accum
    add   addr    ; accum = accum + [addr]
    sub   addr    ; accum = accum - [addr]
    mult  addr    ; accum = accum * [addr]
Number of Addresses (cont.)
Example
C statement
    A = B + C * D - E + F + A
Equivalent code:
    load  C    ; load C into accum
    mult  D    ; accum = C*D
    add   B    ; accum = C*D+B
    sub   E    ; accum = B+C*D-E
    add   F    ; accum = B+C*D-E+F
    add   A    ; accum = B+C*D-E+F+A
    store A    ; store accum contents in A
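The one-address sequence can be checked with a tiny interpreter. This Python sketch assumes the instruction meanings listed on the previous slide; `run_accumulator` and the sample values for A to F are hypothetical.

```python
def run_accumulator(program, memory):
    """Execute one-address instructions against a single accumulator."""
    accum = 0
    for line in program:
        op, addr = line.split()
        if op == "load":
            accum = memory[addr]
        elif op == "store":
            memory[addr] = accum
        elif op == "add":
            accum += memory[addr]
        elif op == "sub":
            accum -= memory[addr]
        elif op == "mult":
            accum *= memory[addr]
        else:
            raise ValueError(f"unknown instruction: {op}")
    return memory


program = ["load C", "mult D", "add B", "sub E", "add F", "add A", "store A"]
memory = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
expected = memory["B"] + memory["C"] * memory["D"] - memory["E"] + memory["F"] + memory["A"]
result = run_accumulator(program, memory)["A"]
```

Seven short instructions replace the five three-address ones: each instruction is smaller, but more of them are needed.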
Number of Addresses (cont.)
Zero-address machines
Stack supplies operands and receives the result
- Special instructions to load and store use an address
Called stack machines (Ex: HP3000, Burroughs B5500)
Sample instructions
    push addr    ; push([addr])
    pop  addr    ; pop([addr])
    add          ; push(pop + pop)
    sub          ; push(pop - pop)
    mult         ; push(pop * pop)
Number of Addresses (cont.)
Example
C statement
    A = B + C * D - E + F + A
Equivalent code:
    push E
    push C
    push D
    mult
    push B
    add
    sub
    push F
    add
    push A
    add
    pop  A
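A small Python interpreter makes the operand order of the stack version explicit. It assumes, as in the sample instructions, that sub computes (first popped value) minus (second popped value); `run_stack` and the sample values are illustrative only.

```python
def run_stack(program, memory):
    """Execute zero-address instructions; only push/pop name an address."""
    stack = []
    for line in program:
        op, *operand = line.split()
        if op == "push":
            stack.append(memory[operand[0]])
        elif op == "pop":
            memory[operand[0]] = stack.pop()
        elif op in ("add", "sub", "mult"):
            top = stack.pop()      # first pop: top of stack
            below = stack.pop()    # second pop: entry underneath
            if op == "add":
                stack.append(top + below)
            elif op == "sub":
                stack.append(top - below)
            else:
                stack.append(top * below)
        else:
            raise ValueError(f"unknown instruction: {op}")
    return memory


program = ["push E", "push C", "push D", "mult", "push B", "add",
           "sub", "push F", "add", "push A", "add", "pop A"]
memory = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
result = run_stack(program, memory)["A"]
```

E is pushed first so that it ends up below B + C*D when sub executes, which yields B + C*D - E rather than its negation.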
Load/Store Architecture
Instructions expect operands in internal processor registers
Special LOAD and STORE instructions move data between registers and memory
RISC uses this architecture
Reduces instruction length
Load/Store Architecture (cont.)
Sample instructions
    load  Rd,addr       ; Rd = [addr]
    store addr,Rs       ; (addr) = Rs
    add   Rd,Rs1,Rs2    ; Rd = Rs1 + Rs2
    sub   Rd,Rs1,Rs2    ; Rd = Rs1 - Rs2
    mult  Rd,Rs1,Rs2    ; Rd = Rs1 * Rs2
Number of Addresses (cont.)
Example
C statement
    A = B + C * D - E + F + A
Equivalent code:
    load  R1,B
    load  R2,C
    load  R3,D
    load  R4,E
    load  R5,F
    load  R6,A
    mult  R2,R2,R3
    add   R2,R2,R1
    sub   R2,R2,R4
    add   R2,R2,R5
    add   R2,R2,R6
    store A,R2
Flow of Control
Default is sequential flow
Several instructions alter this default execution
Branches
- Unconditional
- Conditional
- Delayed branches
Procedure calls
- Delayed procedure calls
Flow of Control (cont.)
Branches
Unconditional
- Absolute address
- PC-relative
  - Target address is specified relative to PC contents
  - Relocatable code
Example: MIPS
- Absolute address
    j target
- PC-relative
    b target
Flow of Control (cont.)
Branches
Conditional
- Jump is taken only if the condition is met
Two types
- Set-Then-Jump
  - Condition testing is separated from branching
  - Condition code registers are used to convey the condition test result
  - Condition code registers keep a record of the status of the last ALU operation, such as the overflow condition
- Example: Pentium code
    cmp AX,BX     ; compare AX and BX
    je  target    ; jump if equal
Flow of Control (cont.)
- Test-and-Jump
  - A single instruction performs condition testing and branching
- Example: MIPS instruction
    beq Rsrc1,Rsrc2,target
  Jumps to target if Rsrc1 = Rsrc2
Delayed branching
Control is transferred after executing the instruction that follows the branch instruction
- This instruction slot is called the delay slot
Improves efficiency
Highly pipelined RISC processors support it
Flow of Control (cont.)
Procedure calls
Facilitate modular programming
Require two pieces of information to return
- End of procedure
  - Pentium uses the ret instruction
  - MIPS uses the jr instruction
- Return address
  - In a (special) register: MIPS allows any general-purpose register
  - On the stack: Pentium
Flow of Control (cont.)
[Figure: delay slot]
Parameter Passing
Two basic techniques
Register-based (e.g. PowerPC, MIPS)
- Internal registers are used
  - Faster
  - Limit the number of parameters
  - Recursive procedure
Stack-based (e.g. Pentium)
- Stack is used
  - More general
2. Operand Types
Instructions support basic data types
- Characters
- Integers
- Floating-point
Instruction overload
Same instruction for different data types
Example: Pentium
    mov AL,address     ; loads an 8-bit value
    mov AX,address     ; loads a 16-bit value
    mov EAX,address    ; loads a 32-bit value
Operand Types
Separate instructions
Instructions specify the operand size
Example: MIPS
    lb Rdest,address    ; loads a byte
    lh Rdest,address    ; loads a halfword (16 bits)
    lw Rdest,address    ; loads a word (32 bits)
    ld Rdest,address    ; loads a doubleword (64 bits)
Similar instructions exist for store
3. Addressing Modes
How the operands are specified
Operands can be in three places
- Registers
  - Register addressing mode
- Part of instruction
  - Constant
  - Immediate addressing mode
  - All processors support these two addressing modes
- Memory
  - Difference between RISC and CISC
  - CISC supports a large variety of addressing modes
  - RISC follows load/store architecture
4. Instruction Types
Several types of instructions
Data movement
- Pentium: mov dest,src
- Some do not provide direct data movement instructions
- Indirect data movement
    add Rdest,Rsrc,0    ; Rdest = Rsrc+0
Arithmetic and Logical
- Arithmetic
  - Integer and floating-point, signed and unsigned
  - add, subtract, multiply, divide
- Logical
  - and, or, not, xor
Instruction Types (cont.)
Condition code bits
S: Sign bit (0 = +, 1 = -)
Z: Zero bit (0 = nonzero, 1 = zero)
O: Overflow bit (0 = no overflow, 1 = overflow)
C: Carry bit (0 = no carry, 1 = carry)
Example: Pentium
    cmp count,25    ; compare count to 25
                    ; subtract 25 from count
    je  target      ; jump if equal
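How a compare sets the four condition code bits can be sketched in Python. The sketch assumes x86-style behaviour (the compare performs a subtraction, discards the difference, and the carry bit records an unsigned borrow) and a 32-bit word; `compare` and `jump_if_equal` are hypothetical names.

```python
def compare(a: int, b: int, bits: int = 32) -> dict:
    """Return the S, Z, O, C bits produced by computing a - b."""
    mask = (1 << bits) - 1
    sign = 1 << (bits - 1)
    ua, ub = a & mask, b & mask       # two's complement bit patterns
    diff = (ua - ub) & mask           # result is discarded, only flags kept
    return {
        "S": int(bool(diff & sign)),  # sign of the result
        "Z": int(diff == 0),          # result is zero
        # signed overflow: operands differ in sign and result sign != a's sign
        "O": int(bool((ua ^ ub) & (ua ^ diff) & sign)),
        "C": int(ua < ub),            # borrow out of the unsigned subtraction
    }


def jump_if_equal(flags: dict) -> bool:
    """je tests only the zero bit."""
    return flags["Z"] == 1


equal_flags = compare(25, 25)
less_flags = compare(3, 25)
overflow_flags = compare(-2**31, 1)
```

Separating the test from the jump is what lets one compare feed several different conditional jumps.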
Instruction Types (cont.)
Flow control and I/O instructions
- Branch
- Procedure call
- Interrupts
I/O instructions
- Memory-mapped I/O
  - Most processors support memory-mapped I/O
  - No separate instructions for I/O
- Isolated I/O
  - Pentium supports isolated I/O
  - Separate I/O instructions
    in  AX,io_port    ; read from an I/O port
    out io_port,AX    ; write to an I/O port
5. Instruction Formats
Two types
Fixed-length
- Used by RISC processors
- 32-bit RISC processors use 32-bit wide instructions
  - Examples: SPARC, MIPS, PowerPC
Variable-length
- Used by CISC processors
- Memory operands need more bits to specify
Opcode
Major and exact operation
Examples of Instruction Formats
RISC (Reduced Instruction Set Computer)
versus
CISC (Complex Instruction Set Computer)
RISC vs CISC
The underlying philosophy of RISC machines is that a system is better able to manage program execution when the program consists of only a few different instructions that are the same length and require the same number of clock cycles to decode and execute
RISC systems access memory only with explicit load and store instructions
In CISC systems, many different kinds of instructions access memory, making instruction length variable and fetch-decode-execute time unpredictable
RISC vs CISC
The difference between CISC and RISC becomes evident through the basic computer performance equation:
    time/program = (time/cycle) x (cycles/instruction) x (instructions/program)
RISC systems shorten execution time by reducing the clock cycles per instruction
CISC systems improve performance by reducing the number of instructions per program
RISC vs CISC
The simple instruction set of RISC machines enables control units to be hardwired for maximum speed
The more complex, and variable, instruction set of CISC machines requires microcode-based control units that interpret instructions as they are fetched from memory. This translation takes time
With fixed-length instructions, RISC lends itself to pipelining and speculative execution
RISC vs CISC
Consider the following program fragments:
CISC:
    mov ax, 10
    mov bx, 5
    mul bx, ax
RISC:
    mov ax, 0
    mov bx, 10
    mov cx, 5
    Begin: add ax, bx
    loop Begin
The total clock cycles for the CISC version might be:
    (2 movs x 1 cycle) + (1 mul x 30 cycles) = 32 cycles
While the clock cycles for the RISC version are:
    (3 movs x 1 cycle) + (5 adds x 1 cycle) + (5 loops x 1 cycle) = 13 cycles
With the RISC clock cycle also being shorter, RISC gives us much faster execution speeds
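The cycle counts can be recomputed with a few lines of Python. The per-instruction costs (1 cycle for mov, add and loop, 30 cycles for mul) are the slide's illustrative figures, not measurements; `total_cycles` and `risc_multiply` are hypothetical helpers, and `risc_multiply` assumes a multiplier of at least 1.

```python
def total_cycles(mix):
    """mix: iterable of (instruction_count, cycles_per_instruction) pairs."""
    return sum(count * cpi for count, cpi in mix)


def risc_multiply(multiplicand: int, multiplier: int):
    """Multiply by repeated addition, counting the instructions executed."""
    executed = {"mov": 3, "add": 0, "loop": 0}   # three initial movs
    ax, bx, cx = 0, multiplicand, multiplier
    while True:
        ax += bx                  # Begin: add ax, bx
        executed["add"] += 1
        cx -= 1                   # loop Begin: decrement cx, repeat if nonzero
        executed["loop"] += 1
        if cx == 0:
            break
    return ax, executed


cisc_cycles = total_cycles([(2, 1), (1, 30)])
product, counts = risc_multiply(10, 5)
risc_cycles = total_cycles([(n, 1) for n in counts.values()])
```

The RISC sequence executes more instructions (13 against 3) yet needs fewer cycles, which is the trade-off expressed by the performance equation.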
RISC vs CISC
Because of their load-store ISAs, RISC architectures require a large number of CPU registers
These registers provide fast access to data during sequential program execution
They can also be employed to reduce the overhead typically caused by passing parameters to subprograms
Instead of pulling parameters off of a stack, the subprogram is directed to use a subset of registers
RISC vs. CISC
This is how registers can be overlapped in a RISC system.
The current window pointer (CWP) points to the active register window.
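Overlapping register windows can be modeled in a few lines. The sketch below is a toy model, not any particular processor: the window sizes, the class name `RegisterWindows` and its methods are all assumptions made for illustration, and window overflow (spilling to memory) is not handled.

```python
class RegisterWindows:
    """Toy model of overlapping register windows (sizes are illustrative)."""

    def __init__(self, n_windows=4, n_in=8, n_local=8):
        self.n_windows = n_windows
        self.n_in = n_in                      # the "out" set has the same size as the "in" set
        self.stride = n_in + n_local          # physical registers contributed by each window
        self.regs = [0] * (self.stride * n_windows)  # circular physical register file
        self.cwp = 0                          # current window pointer

    def _physical(self, kind, index):
        # The caller's "out" registers are the same physical registers
        # as the callee's "in" registers: that is the overlap.
        offset = {"in": 0, "local": self.n_in, "out": self.stride}[kind]
        return (self.cwp * self.stride + offset + index) % len(self.regs)

    def read(self, kind, index):
        return self.regs[self._physical(kind, index)]

    def write(self, kind, index, value):
        self.regs[self._physical(kind, index)] = value

    def call(self):
        self.cwp = (self.cwp + 1) % self.n_windows   # advance to the callee's window

    def ret(self):
        self.cwp = (self.cwp - 1) % self.n_windows   # return to the caller's window


rw = RegisterWindows()
rw.write("out", 0, 42)      # caller places a parameter in its out register
rw.call()
param = rw.read("in", 0)    # callee sees the same value in its in register
rw.write("in", 1, 99)       # callee leaves a result for the caller
rw.ret()
result = rw.read("out", 1)
print(param, result)
```

No data is copied on call or return; only the window pointer moves, which is why parameter passing through register windows is cheap.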
RISC vs. CISC
It is becoming increasingly difficult to distinguish RISC architectures from CISC architectures.
Some RISC systems provide more extravagant instruction sets than some CISC systems.
Some systems combine both approaches.
The following two slides summarize the characteristics that traditionally typify the differences between these two architectures.
RISC vs. CISC
RISC
• Multiple register sets.
• Three operands per instruction.
• Parameter passing through register windows.
• Single-cycle instructions.
• Hardwired control.
• Highly pipelined.
CISC
• Single register set.
• One or two register operands per instruction.
• Parameter passing through memory.
• Multiple-cycle instructions.
• Microprogrammed control.
• Less pipelined.
Continued...
RISC vs. CISC
RISC
• Simple instructions, few in number.
• Fixed-length instructions.
• Complexity in compiler.
• Only LOAD/STORE instructions access memory.
• Few addressing modes.
CISC
• Many complex instructions.
• Variable-length instructions.
• Complexity in microcode.
• Many instructions can access memory.
• Many addressing modes.
Summary

Instruction Set Design Issues
Instruction set design issues include:
Where are operands stored?
- registers, memory, stack, accumulator
How many explicit operands are there?
- 0, 1, 2, or 3
How is the operand location specified?
- register, immediate, indirect, ...
What type and size of operands are supported?
- byte, int, float, double, string, vector, ...
What operations are supported?
- add, sub, mul, move, compare, ...
More About General Purpose Registers
Why do almost all new architectures use GPRs?
Registers are much faster than memory (even cache)
- Register values are available immediately
- When memory isn't ready, processor must wait ("stall")
Registers are convenient for variable storage
- Compiler assigns some variables just to registers
- More compact code since small fields specify registers (compared to memory addresses)
[Figure: storage hierarchy: Processor with Registers, Cache, Memory, Disk]
What Operations are Needed
Arithmetic and Logical
- Integer arithmetic: ADD, SUB, MULT, DIV, SHIFT
- Logical operation: AND, OR, XOR, NOT
Data Transfer - copy, load, store
Control - branch, jump, call, return
Floating Point - ADDF, MULF, DIVF. Same as arithmetic but usually take bigger operands
Decimal - ADDD, CONVERT
String - move, compare, search
Graphics - pixel and vertex, compression/decompression operations
Stack Architecture: Pros and Cons
Pros
• Good code density (implicit top of stack)
• Low hardware requirements
• Easy to write a simpler compiler for stack architectures
Cons
• Stack becomes the bottleneck
• Little ability for parallelism or pipelining
• Data is not always at the top of stack when needed, so additional instructions like TOP and SWAP are needed
• Difficult to write an optimizing compiler for stack architectures
Accumulator Architecture: Pros and Cons
Pros
• Very low hardware requirements
• Easy to design and understand
Cons
• Accumulator becomes the bottleneck
• Little ability for parallelism or pipelining
• High memory traffic
Memory-Memory Architecture: Pros and Cons
Pros
• Requires fewer instructions (especially if 3 operands)
• Easy to write compilers for (especially if 3 operands)
Cons
• Very high memory traffic (especially if 3 operands)
• Variable number of clocks per instruction
• With two operands, more data movements are required
Memory-Register Architecture: Pros and Cons
Pros
• Some data can be accessed without loading first
• Instruction format easy to encode
• Good code density
Cons
• Operands are not equivalent (poor orthogonality)
• Variable number of clocks per instruction
• May limit number of registers
Classifying Instruction Set Architectures
There are four types of internal storage used by the processor to store operands, explicitly and implicitly, for execution of a program:
1. Stack
2. Accumulator
3. Set of registers (register-memory)
4. Set of registers (register-register/load-store)
The operands in a stack architecture are implicitly on the top of the stack, and in an accumulator architecture one operand is implicitly the accumulator. The general-purpose register architectures (register-memory and register-register) have only explicit operands, either in registers or memory locations.
Basic Addressing Classes
Declining cost of registers

Operand locations for four instruction set architecture classes
The arrows indicate whether the operand is an input or the result of the ALU operation, or both an input and result. Lighter shades indicate inputs, and the dark shade indicates the result. In (a), a Top Of Stack register (TOS) points to the top input operand, which is combined with the operand below. The first operand is removed from the stack, the result takes the place of the second operand, and TOS is updated to point to the result. All operands are implicit. In (b), the accumulator is both an implicit input operand and a result. In (c), one input operand is a register, one is in memory, and the result goes to a register. All operands are registers in (d) and, like the stack architecture, can be transferred to memory only via separate instructions: push or pop for (a) and load or store for (d).
Code Sequence for C = A + B

Stack       Accumulator    Register-memory    Register-register
Push A      Load A         Load R1, A         Load R1, A
Push B      Add B          Add R3, R1, B      Load R2, B
Add         Store C        Store R3, C        Add R3, R1, R2
Pop C                                         Store R3, C

The code sequence for C = A + B for four classes of instruction sets. Note that the Add instruction has implicit operands for stack and accumulator architectures, and explicit operands for register architectures. It is assumed that A, B, and C all belong in memory and that the values of A and B cannot be destroyed.
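The four code sequences for C = A + B can be mimicked in Python to see where the operands live in each class. This is only a sketch: memory is modeled as a dictionary, registers as local variables or a small dictionary, and the function names are invented for the example.

```python
def run_stack(mem):
    stack = []
    stack.append(mem["A"])                    # Push A
    stack.append(mem["B"])                    # Push B
    stack.append(stack.pop() + stack.pop())   # Add (both operands implicit)
    mem["C"] = stack.pop()                    # Pop C

def run_accumulator(mem):
    acc = mem["A"]                            # Load A
    acc = acc + mem["B"]                      # Add B (accumulator is implicit)
    mem["C"] = acc                            # Store C

def run_register_memory(mem):
    regs = {}
    regs["R1"] = mem["A"]                     # Load R1, A
    regs["R3"] = regs["R1"] + mem["B"]        # Add R3, R1, B (one memory operand)
    mem["C"] = regs["R3"]                     # Store R3, C

def run_load_store(mem):
    regs = {}
    regs["R1"] = mem["A"]                     # Load R1, A
    regs["R2"] = mem["B"]                     # Load R2, B
    regs["R3"] = regs["R1"] + regs["R2"]      # Add R3, R1, R2 (registers only)
    mem["C"] = regs["R3"]                     # Store R3, C

results = {}
for run in (run_stack, run_accumulator, run_register_memory, run_load_store):
    memory = {"A": 2, "B": 3, "C": 0}
    run(memory)
    results[run.__name__] = memory

print(results)
```

All four leave the same value in C and leave A and B untouched; they differ only in how many instructions are needed and which of them touch memory.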
Stack Architectures
Accumulator Architectures
Register-Set Architectures
Register-to-Register: Load-Store Architectures
Register-to-Memory Architectures
Memory-to-Memory Architectures
Instruction Formats

Instruction Set Architecture (ISA)
To command a computer's hardware, you must speak its language.
The words of a machine's language are called instructions, and its vocabulary is called an instruction set.
Once you learn one machine language, it is easy to pick up others:
- There are few fundamental operations that all computers must provide
- All designers have the same goal of finding a language that simplifies building the hardware and the compiler while maximizing performance and minimizing cost
Learning how instructions are represented leads to discovering the secret of computing: the stored-program concept.
The MIPS instruction set is used as a case study.
Interface Design
A good interface:
- Lasts through many implementations (portability, compatibility)
- Is used in many different ways (generality)
- Provides convenient functionality to higher levels
- Permits an efficient implementation at lower levels
Design decisions must take into account:
- Technology
- Machine organization
- Programming languages
- Compiler technology
- Operating systems
[Figure: a single interface, with several uses above it and a succession of implementations (imp) below it over time]
Classifying Instruction Set Architectures
Accumulator Architecture
• Common in early stored-program computers, when hardware was so expensive
• Machine has only one register (accumulator) involved in all math and logical operations
• All operations assume the accumulator as a source operand and a destination for the operation, with the other operand stored in memory
Extended Accumulator Architecture
• Dedicated registers for specific operations, e.g. stack and array index registers, added
• The 8086 microprocessor is an example of such a special-purpose register architecture
General-Purpose Register Architecture
• MIPS is an example of such an architecture, where registers are not tied to a single role
• This type of instruction set can be further divided into:
  • Register-memory: allows for one operand to be in memory
  • Register-register (load-store): demands all operands to be in registers
Machine           # general-purpose registers   Architecture style               Year
Motorola 6800     2                             Accumulator                      1974
DEC VAX           16                            Register-memory, memory-memory   1977
Intel 8086        1                             Extended accumulator             1978
Motorola 68000    16                            Register-memory                  1980
Intel 80386       8                             Register-memory                  1985
PowerPC           32                            Load-store                       1992
DEC Alpha         32                            Load-store                       1992
Compact Code and Stack Architectures
When memory is scarce, machines like the Intel 80x86 had variable-length instructions to match varying operand specifications and minimize code size.
Stack machines abandoned registers altogether, arguing that it is hard for compilers to use them efficiently.
Operands are to be pushed on a stack from memory, and the results have to be popped from the stack to memory.
Operations take their operands by default from the top of the stack and insert the results back onto the stack.
Stack machines simplify compilers and lend themselves to a compact instruction encoding, but limit compiler optimization (e.g. in math expressions).
Example: A = B + C
Push AddressC    # Top=Top+4; Stack[Top]=Memory[AddressC]
Push AddressB    # Top=Top+4; Stack[Top]=Memory[AddressB]
add              # Stack[Top-4]=Stack[Top]+Stack[Top-4]; Top=Top-4
Pop AddressA     # Memory[AddressA]=Stack[Top]; Top=Top-4
Compact code is important for heralded network computers, where programs must be downloaded over the Internet (e.g. Java-based applications).
Other Types of Architecture
High-Level-Language Architecture
• In the 1960s, systems software was rarely written in high-level languages, and virtually every commercial operating system before Unix was written in assembly
• Some people blamed the code density on the instruction set rather than the programming language
• A machine design philosophy was advocated with the goal of making the hardware more like high-level languages
• The effectiveness of high-level languages, memory size limitation and lack of efficient compilers doomed this philosophy to a historical footnote
Reduced Instruction Set Architecture
• With the recent development in compiler technology and expanded memory sizes, fewer programmers are using assembly-level coding
• Instruction set architecture became measurable in the way compilers, rather than programmers, use them
• RISC architecture favors simplifying hardware design over enriching the offered set of instructions, relying on compilers to effectively use them to perform complex operations
• Virtually all new architectures since 1982 follow the RISC philosophy of fixed instruction lengths, load-store operations, and limited addressing modes
Evolution of Instruction Sets
Single Accumulator (EDSAC 1950)
Accumulator + Index Registers (Manchester Mark I, IBM 700 series 1953)
Separation of Programming Model from Implementation
High-level Language Based (B5000 1963)        Concept of a Family (IBM 360 1964)
General Purpose Register Machines
Complex Instruction Sets (VAX, Intel 432 1977-80)        Load/Store Architecture (CDC 6600, Cray 1 1963-76)
RISC (MIPS, SPARC, HP-PA, IBM RS6000, ... 1987)
Register-Memory Architectures
Effect of the number of memory operands

# memory addresses   Max. number of operands   Examples
0                    3                         SPARC, MIPS, PowerPC, ALPHA
1                    2                         Intel 80x86, Motorola 68000
2                    2                         VAX (also has 3 operands format)
3                    3                         VAX (also has 2 operands format)
Memory Address
Interpreting Memory Addressing
The address of a word matches the byte address of one of its 4 bytes.
The addresses of sequential words differ by 4 (word size in bytes).
Words' addresses are multiples of 4 (alignment restriction).
Machines that use the address of the leftmost byte as the word address are called "Big Endian" and those that use the rightmost byte are called "Little Endian".
Misalignment complicates memory access and causes programs to run slower (some machines do not allow misaligned memory access at all).
Byte ordering can be a problem when exchanging data among different machines.
Byte addresses affect array index calculation, to account for word addressing and offset within the word.

Object addressed   Aligned at byte offsets    Misaligned at byte offsets
Byte               0, 1, 2, 3, 4, 5, 6, 7     Never
Half word          0, 2, 4, 6                 1, 3, 5, 7
Word               0, 4                       1, 2, 3, 5, 6, 7
Double word        0                          1, 2, 3, 4, 5, 6, 7
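Byte ordering and the alignment rule can both be tried out in Python. This is a small sketch using the standard `struct` module; the sample word value and the helper names `is_aligned` and `misaligned_offsets` are made up for the example.

```python
import struct

word = 0x0A0B0C0D

# Big endian: most significant byte is stored at the lowest address.
big = struct.pack(">I", word)
# Little endian: least significant byte is stored at the lowest address.
little = struct.pack("<I", word)

def is_aligned(address, size):
    """An object of `size` bytes is aligned when its address is a multiple of its size."""
    return address % size == 0

def misaligned_offsets(size, span=8):
    """Byte offsets within an 8-byte span at which an object of `size` bytes is misaligned."""
    return [offset for offset in range(span) if not is_aligned(offset, size)]

print(big.hex(), little.hex())
for name, size in (("byte", 1), ("half word", 2), ("word", 4), ("double word", 8)):
    print(name, misaligned_offsets(size))
```

Exchanging the four packed bytes between a big-endian and a little-endian machine without conversion would yield a different integer, which is the data-exchange problem mentioned above.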
Addressing Modes
Addressing modes refer to how to specify the location of an operand (effective address).
Addressing modes have the ability to:
- Significantly reduce instruction counts
- Increase the average CPI
- Increase the complexity of building a machine
The VAX machine is used for benchmark data since it supports a wide range of memory addressing modes.
Famous addressing modes can be classified based on:
- the source of the data, into register, immediate or memory
- the address calculation, into direct and indirect
An indexed addressing mode is usually provided to allow efficient implementation of loops and array access.
Example of Addressing Modes

Register: Add R4, R3
  Meaning: Regs[R4] <- Regs[R4] + Regs[R3]
  When used: When a value is in a register
Immediate: Add R4, #3
  Meaning: Regs[R4] <- Regs[R4] + 3
  When used: For constants
Displacement: Add R4, 100(R1)
  Meaning: Regs[R4] <- Regs[R4] + Mem[100 + Regs[R1]]
  When used: Accessing local variables
Register indirect: Add R4, (R1)
  Meaning: Regs[R4] <- Regs[R4] + Mem[Regs[R1]]
  When used: Accessing using a pointer or a computed address
Indexed: Add R4, (R1 + R2)
  Meaning: Regs[R4] <- Regs[R4] + Mem[Regs[R1] + Regs[R2]]
  When used: Sometimes useful in array addressing: R1 = base of the array; R2 = index amount
Direct or absolute: Add R4, (1001)
  Meaning: Regs[R4] <- Regs[R4] + Mem[1001]
  When used: Sometimes useful for accessing static data; address constant may need to be large
Memory indirect or memory deferred: Add R4, @(R3)
  Meaning: Regs[R4] <- Regs[R4] + Mem[Mem[Regs[R3]]]
  When used: If R3 is the address of the pointer p, then mode yields *p
Autoincrement: Add R4, (R2)+
  Meaning: Regs[R4] <- Regs[R4] + Mem[Regs[R2]]; Regs[R2] <- Regs[R2] + d
  When used: Useful for stepping through arrays within a loop. R2 points to start of the array; each reference increments R2 by d
Autodecrement: Add R4, -(R2)
  Meaning: Regs[R2] <- Regs[R2] - d; Regs[R4] <- Regs[R4] + Mem[Regs[R2]]
  When used: Same use as autoincrement. Autodecrement/increment can also act as push/pop to implement a stack
Scaled: Add R4, 100(R2)[R3]
  Meaning: Regs[R4] <- Regs[R4] + Mem[100 + Regs[R2] + Regs[R3] * d]
  When used: Used to index arrays
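Several of these modes can be expressed as a single operand-fetch function. The sketch below is illustrative only: registers and memory are Python dictionaries, the operand size `D` of 4 bytes is an assumption, and the function `fetch_operand` with its keyword arguments is invented for the example.

```python
D = 4  # assumed operand size in bytes (the "d" in the meanings above)

def fetch_operand(mode, regs, mem, **f):
    """Return the source operand selected by an addressing mode."""
    if mode == "register":
        return regs[f["r"]]
    if mode == "immediate":
        return f["imm"]
    if mode == "displacement":
        return mem[f["disp"] + regs[f["base"]]]
    if mode == "register_indirect":
        return mem[regs[f["base"]]]
    if mode == "indexed":
        return mem[regs[f["base"]] + regs[f["index"]]]
    if mode == "memory_indirect":
        return mem[mem[regs[f["base"]]]]
    if mode == "scaled":
        return mem[f["disp"] + regs[f["base"]] + regs[f["index"]] * D]
    if mode == "autoincrement":
        value = mem[regs[f["base"]]]
        regs[f["base"]] += D          # side effect: step to the next element
        return value
    raise ValueError(f"unsupported mode: {mode}")

regs = {"R1": 200, "R2": 8, "R3": 300, "R4": 10}
mem = {200: 7, 208: 9, 300: 200, 340: 5}

print(fetch_operand("displacement", regs, mem, disp=100, base="R1"))
print(fetch_operand("memory_indirect", regs, mem, base="R3"))
```

The richer modes do more work per instruction (extra additions, extra memory reads), which is why they can lower instruction count while raising average CPI.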
Addressing Mode for Signal Processing
DSP offers special addressing modes to better serve popular algorithms.
Special features require either hand coding or a compiler that uses such features (GNU would not be a good choice).

Modulo addressing
- Since DSP deals with continuous data streams, circular buffers are widely used
- Circular or modulo addressing allows automatic increment and decrement, and resets the pointer when reaching the end of the buffer

Reverse addressing
- Resulting address is the reverse order of the current address
- Reverse addressing mode expedites the access, which otherwise requires a number of logical instructions or extra memory accesses

Fast Fourier Transform
0 (000)    0 (000)
1 (001)    4 (100)
2 (010)    2 (010)
3 (011)    6 (110)
4 (100)    1 (001)
5 (101)    5 (101)
6 (110)    3 (011)
7 (111)    7 (111)
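Both DSP addressing modes are easy to emulate in software, which also shows the extra arithmetic the hardware modes save. A minimal sketch follows; the function names and the buffer base address of 100 are assumptions of the example.

```python
def next_circular(pointer, step, base, length):
    """Advance `pointer` by `step` inside the buffer [base, base + length), wrapping at the ends."""
    return base + (pointer - base + step) % length

def bit_reverse(index, bits):
    """Reverse the lowest `bits` bits of `index` (the reordering used by the FFT)."""
    result = 0
    for _ in range(bits):
        result = (result << 1) | (index & 1)  # move the lowest bit of index into result
        index >>= 1
    return result

# Walking an 8-entry buffer that starts at address 100: the pointer wraps from 107 back to 100.
print(next_circular(107, 1, base=100, length=8))

# Bit-reversed order for an 8-point FFT.
print([bit_reverse(i, 3) for i in range(8)])
```

In software each access costs a modulo operation or a loop over the bits; a DSP with these modes computes the same address as part of the load or store.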
Operations of the Computer Hardware
"There must certainly be instructions for performing the fundamental arithmetic operations."
Burks, Goldstine and von Neumann, 1947
Assembly language is a symbolic representation of what the processor actually understands.
MIPS assembler allows only one instruction per line and ignores comments following # until end of line.
Example:
Translation of a segment of a C program to MIPS assembly instructions
C: f = (g + h) - (i + j)
MIPS:
add t0, g, h     # temp variable t0 contains "g + h"
add t1, i, j     # temp variable t1 contains "i + j"
sub f, t0, t1    # f = t0 - t1 = (g + h) - (i + j)
Operations in the Instruction Set

Operator type            Examples
Arithmetic and logical   Integer arithmetic and logical operations: add, and, subtract, or
Data transfer            Loads-stores (move instructions on machines with memory addressing)
Control                  Branch, jump, procedure call and return, trap
System                   Operating system call, virtual memory management instructions
Floating point           Floating point instructions: add, multiply
Decimal                  Decimal add, decimal multiply, decimal to character conversion
String                   String move, string compare, string search
Graphics                 Pixel operations, compression/decompression operations

Arithmetic, logical, data transfer and control are almost standard categories for all machines.
System instructions are required for multi-programming environments, although support for system functions varies.
Decimal and string instructions can be primitives, e.g. on the IBM 360 and the VAX.
Support for floating point, decimal, string and graphics is optional and is sometimes provided via a co-processor.
Some machines rely on the compiler to synthesize special operations, such as string handling, from simpler instructions.
Operations for Media and Signal Processing
Single instruction multiple data (SIMD) and vector instructions are often supported in DSPs, which are commonly used in multimedia and signal processing applications.
Partitioned Add (integer)
- Perform multiple 16-bit additions on a 64-bit ALU, since most data are narrow
- Increases ALU throughput for multimedia applications
Paired single operations (float)
- Allow the same register to act as two operands to the same operation
- Handy in dealing with vertices and coordinates
Multiply and accumulate
- Very handy for calculating dot products of vectors (signal processing) and matrix multiplication
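A partitioned add and a multiply-accumulate loop can be sketched in Python. The lane layout (four 16-bit lanes packed into one 64-bit word, wraparound on overflow) and all function names are assumptions of this example; real media instruction sets also offer saturating variants, which are not modeled here.

```python
LANES, WIDTH = 4, 16            # four 16-bit lanes in one 64-bit word (assumed layout)
MASK = (1 << WIDTH) - 1

def pack(values):
    """Pack LANES small integers into one wide word, lane 0 in the lowest bits."""
    word = 0
    for lane, value in enumerate(values):
        word |= (value & MASK) << (lane * WIDTH)
    return word

def unpack(word):
    return [(word >> (lane * WIDTH)) & MASK for lane in range(LANES)]

def partitioned_add(a, b):
    """Add lane by lane; a carry out of one lane never spills into the next."""
    return pack([(x + y) & MASK for x, y in zip(unpack(a), unpack(b))])

def dot(xs, ys):
    """Dot product as a chain of multiply-accumulate steps."""
    acc = 0
    for x, y in zip(xs, ys):
        acc += x * y            # one multiply-and-accumulate
    return acc

total = partitioned_add(pack([1, 2, 3, 0xFFFF]), pack([10, 20, 30, 1]))
print(unpack(total))
print(dot([1, 2, 3], [4, 5, 6]))
```

The last lane wraps to 0 instead of carrying into its neighbor, which is the one thing a partitioned adder must do differently from an ordinary 64-bit add.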
Frequency of Operations Usage
Make the common case fast by focusing on these operations.
The most widely executed instructions are the simple operations of an instruction set.
The following is the average usage in SPECint92 on Intel 80x86:

Rank    80x86 instruction         Integer average (% total executed)
1       Load                      22%
2       Conditional branch        20%
3       Compare                   16%
4       Store                     12%
5       Add                       8%
6       And                       6%
7       Sub                       5%
8       Move register-register    4%
9       Call                      1%
10      Return                    1%
Total                             96%
Control Flow Instructions
Jump: for unconditional change in the control flow
Branch: for conditional change in the control flow
Procedure calls and returns
Data is based on SPEC on Alpha
Destination Address Definition
Relative addressing w.r.t. the program counter proved to be the best choice for forward and backward branching or jumps (load address independent).
To allow for dynamic loading of library routines, register indirect addressing allows addresses to be loaded in special registers (e.g. virtual functions in C++ and system calls in a case statement).
Data is based on SPEC on Alpha
Condition Evaluation
Compare-and-branch can be efficient if the majority of conditions are comparisons with zero.
Remember to focus on the common case.
Based on SPEC on MIPS
Frequency of Types of Comparison
Data is based on SPEC on Alpha
Different benchmarks and machines set new design priorities.
DSPs support a repeat instruction for "for" loops (vectors) using registers.
Supporting Procedures
Execution of a procedure follows these steps:
- Store parameters in a place accessible to the procedure
- Transfer control to the procedure
- Acquire the storage resources needed for the procedure
- Perform the desired task
- Store the result value in a place accessible to the calling program
- Return control to the point of origin
The hardware provides a program counter to trace instruction flow and manage transfer of control.
Parameter Passing
- Registers can be used for passing a small number of parameters
- A stack is used to spill registers of the current context and make room for the called procedure to run, and to allow for large parameters to be passed
Storage of machine state can be performed by caller or callee.
Handling of shared variables is important to ensure correct semantics and thus requires clear specifications in the library interface.
Global variables stored in registers need careful handling.
Type and Size of Operands
The type of an operand is designated by encoding it in the instruction's operation code.
The type of an operand, e.g. single precision float, effectively gives its size.
Common operand types include character, half word and word size integer, single- and double-precision floating point.
Characters are almost always in ASCII, integers are in 2's complement, and floating point in IEEE 754.
The 16-bit Unicode used in Java is gaining popularity due to its support for international character sets.
For business applications, some architectures support a decimal format in binary coded decimal (BCD).
Depending on the size of the word, the complexity of handling different operand types differs.
DSP offers fixed point data types to support high precision floating point arithmetic and to allow sharing a single exponent for multiple numbers.
For graphics applications, vertex and pixel operands are added features.
Size of Operands
Double-word data type is used for double-precision floating point operations and address storage in machines with a 64-bit wide address bus.
Words are used for integer operations and for 32-bit address bus machines.
Because of the mix in SPEC, word and double-word data types dominate.
Frequency of reference by size based on SPEC2000 on Alpha
Instruction Representation
Humans are taught to think in base 10 (decimal), but numbers may be represented in any base (123 in base 10 = 1111011 in binary or base 2).
Numbers are stored in computers as a series of high and low electronic signals (binary numbers).
Binary digits are called bits and are considered the atom of computing.
Each piece of an instruction is a number, and placing these numbers together forms the instruction.
The assembler translates the assembly symbolic instructions into machine language instructions (machine code).
Example:
Assembly: add $t0, $s1, $s2
M/C language (decimal):
0         17        18        8         0         32
M/C language (binary):
000000    10001     10010     01000     00000     100000
6 bits    5 bits    5 bits    5 bits    5 bits    6 bits
Note: the MIPS compiler by default maps $s0,...,$s7 to reg. 16-23 and $t0,...,$t7 to reg. 8-15.
Encoding an Instruction Set
Instruction encoding affects the size of the compiled program and the complexity of the CPU implementation.
The operation is typically specified in one field, called the opcode.
The addressing mode for the operand can be encoded with the operation, or specified through a separate identifier in case of a large number of supported modes.
The architecture must balance between several competing factors:
- Desire to support as many registers and addressing modes as possible
- Effect of operand specification on the size of the instruction (program)
- Desire to simplify instruction fetching and decoding during execution
Fixed size instruction encoding simplifies the CPU design while limiting the addressing modes supported.
An architect caring about the code size can use variable size encoding.
A hybrid approach is to allow variability by supporting multiple-sized instructions.
Encoding Examples
MIPS Instruction Format
Register-format instructions

op        rs        rt        rd        shamt     funct
6 bits    5 bits    5 bits    5 bits    5 bits    6 bits

op: Basic operation of the instruction, traditionally called the opcode
rs: The first register source operand
rt: The second register source operand
rd: The register destination operand; it gets the result of the operation
shamt: Shift amount
funct: This field selects the specific variant of the operation of the op field

Immediate-type instructions

op        rs        rt        address
6 bits    5 bits    5 bits    16 bits

Some instructions need longer fields than provided, for large value constants.
The 16-bit address means a load word instruction can load a word within a region of ±2^15 bytes of the address in the base register.
Example: lw $t0, 32($s3)    # Temporary register $t0 gets A[8]

Instruction   Format   op    rs    rt    rd    shamt   funct   address
add           R        0     reg   reg   reg   0       32      N/A
sub           R        0     reg   reg   reg   0       34      N/A
lw            I        35    reg   reg   N/A   N/A     N/A     address
sw            I        43    reg   reg   N/A   N/A     N/A     address
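The field layouts translate directly into shift-and-or code. The following sketch packs the R-format and I-format fields into 32-bit words; the helper names `encode_r` and `encode_i` and the small register table are invented for the example, and only the $t and $s registers are mapped.

```python
# Register numbers: $t0-$t7 are registers 8-15, $s0-$s7 are registers 16-23.
REG = {f"$t{i}": 8 + i for i in range(8)}
REG.update({f"$s{i}": 16 + i for i in range(8)})

def encode_r(op, rs, rt, rd, shamt, funct):
    """Pack an R-format instruction: 6 + 5 + 5 + 5 + 5 + 6 bits."""
    return (op << 26) | (rs << 21) | (rt << 16) | (rd << 11) | (shamt << 6) | funct

def encode_i(op, rs, rt, immediate):
    """Pack an I-format instruction: 6 + 5 + 5 + 16 bits (immediate kept to 16 bits)."""
    return (op << 26) | (rs << 21) | (rt << 16) | (immediate & 0xFFFF)

# add $t0, $s1, $s2 : op = 0, funct = 32, destination in rd
add_word = encode_r(0, REG["$s1"], REG["$s2"], REG["$t0"], 0, 32)
# lw $t0, 32($s3)   : op = 35, base register in rs, target register in rt
lw_word = encode_i(35, REG["$s3"], REG["$t0"], 32)

print(f"{add_word:032b}")
print(f"{add_word:#010x} {lw_word:#010x}")
```

Note that the assembly operand order (destination first) differs from the field order in the machine word, where the two source registers come before the destination.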
The Stored Program Concept
Learning how instructions are represented leads to discovering the secret of computing: the stored-program concept.
Today's computers are built on two key principles:
- Instructions are represented as numbers
- Programs can be stored in memory to be read or written just like numbers
The power of the concept: memory can contain
- the source code for an editor
- the compiled m/c code for the editor
- the text that the compiled program is using
- the compiler that generated the code
[Figure: Processor connected to Memory, which holds: Accounting program (machine code), Editor program (machine code), C compiler (machine code), Payroll data, Book text, Source code in C for editor program]
Compiling if-then-else in MIPS
Assuming the five variables f, g, h, i, and j correspond to the five registers $s0 through $s4, what is the compiled MIPS code for the following C if statement:
if (i == j) f = g + h; else f = g - h;
[Figure: flowchart: test i == j; when i = j, f = g + h; when i ≠ j, branch to Else: f = g - h; both paths meet at Exit:]
MIPS:
      bne $s3, $s4, Else    # go to Else if i ≠ j
      add $s0, $s1, $s2     # f = g + h (skipped if i ≠ j)
      j Exit
Else: sub $s0, $s1, $s2     # f = g - h (skipped if i = j)
Exit:
Typical Compilation
Ma9or ypes of $ptimiation
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 47101
Optimization name: explanation (frequency, where given; N.M. = not measured)
High-level (at or near source level; machine-independent)
  - Procedure integration: replace procedure call by procedure body (N.M.)
Local (within straight-line code)
  - Common subexpression elimination: replace two instances of the same computation by a single copy
  - Constant propagation: replace all instances of a variable that is assigned a constant with the constant
  - Stack height reduction: rearrange expression tree to minimize resources needed for expression evaluation (N.M.)
Global (across a branch)
  - Global common subexpression elimination: same as local, but this version crosses branches
  - Copy propagation: replace all instances of a variable A that has been assigned X (i.e., A = X) with X
  - Code motion: remove code from a loop that computes the same value each iteration of the loop
  - Induction variable elimination: simplify/eliminate array addressing calculations within loops
Machine-dependent (depends on machine knowledge)
  - Strength reduction: many examples, such as replacing multiply by a constant with adds and shifts (N.M.)
  - Pipeline scheduling: reorder instructions to improve pipeline performance (N.M.)
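To make the strength-reduction entry concrete, here is a minimal Python sketch of replacing a multiply by a constant with shifts and adds. It is only an illustration of the idea (a real compiler would emit the shift/add sequence at compile time rather than loop at run time), and the function name is hypothetical.

```python
def multiply_by_constant(x, constant):
    """Compute x * constant for a non-negative constant using only shifts and adds."""
    if constant < 0:
        raise ValueError("this sketch handles non-negative constants only")
    result = 0
    shift = 0
    while constant:
        if constant & 1:            # this bit of the constant is set,
            result += x << shift    # so add x * 2**shift
        constant >>= 1
        shift += 1
    return result


# 10 = 0b1010, so x * 10 becomes (x << 1) + (x << 3)
print(multiply_by_constant(7, 10))
```

For the constant 10 the loop performs two shifted adds, which is the kind of sequence a compiler may substitute when multiplies are slower than shifts and adds on the target machine.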
Effect of Compiler Optimization
Measurements taken on S
[Figure: results by program and compiler optimization level]
Level 0: non-optimized code
Level 1: local optimization
Level 2: global optimization, s/w pipelining
Level 3: adds procedure integration
Compiler Support for Multimedia Instructions
• Intel's MMX and PowerPC AltiVec have small vector processing capabilities targeting multimedia applications (to speed up graphics).
• Intel added a new set of instructions called Streaming SIMD Extensions.
• A major advantage of vector computers is hiding the latency of memory access by loading multiple elements and then overlapping execution with data transfer.
• Vector computers typically have strided and/or gather/scatter addressing to perform operations on distant memory locations.
  - Strided addressing allows memory access in increments larger than one.
  - Gather/scatter addressing is similar to register indirect mode, where the addresses are stored instead of the data.
• Supporting vector operations without strided addressing, as in Intel's MMX, limits the potential speedup.
• Such limited support for vector processing makes vectorizing compiler optimization unpopular and restricts its scope to hand-coded routines.
• SIMD instructions on MMX and AltiVec tend to be solutions, not primitives.
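The difference between strided and gather/scatter addressing can be sketched in a few lines of Python. This is only a model of the access patterns, with memory as a flat list and hypothetical function names; it says nothing about how any particular vector unit implements them.

```python
def strided_load(memory, base, stride, count):
    """Load `count` elements starting at `base`, stepping by `stride` locations."""
    return [memory[base + i * stride] for i in range(count)]


def gather(memory, addresses):
    """Load one element per entry of an address vector (indirect access)."""
    return [memory[address] for address in addresses]


def scatter(memory, addresses, values):
    """Store values[i] to memory[addresses[i]]."""
    for address, value in zip(addresses, values):
        memory[address] = value


memory = list(range(100, 116))               # 16 locations holding 100..115
column = strided_load(memory, base=1, stride=4, count=3)
picked = gather(memory, [3, 0, 7])
scatter(memory, [3, 0, 7], [-1, -2, -3])
print(column, picked, memory[:8])
```

A stride of 4 here models walking down one column of a row-major array with 4 elements per row; a unit without strided access would need separate loads and shuffles to assemble the same vector.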
Starting a Program
[Figure: translation hierarchy. C program -> Compiler -> Assembly language program -> Assembler -> Object: machine language module, together with Object: library routine (machine language) -> Linker -> Executable: machine language program -> Loader -> Memory]
Object files for Unix typically contain:
• Header: size and position of components
• Text segment: machine code
• Data segment: static and dynamic variables
• Relocation info: identifies absolute memory references
• Symbol table: names and locations of labels, procedures, and variables
• Debugging info: mapping of source to object code, breakpoints, etc.
Linker
• Places code and data modules symbolically in memory
• Determines the addresses of data and instruction labels
• Patches both internal and external references
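The three linker tasks (place modules, determine label addresses, patch references) can be sketched as a toy Python function. The module dictionary format, the function name, and the base address are assumptions made for illustration; real object file formats and linkers are far more involved.

```python
def link(modules, base_address=0x00400000):
    """Toy linker. Each module is a dict with:
       'name'   - module name
       'size'   - size in bytes
       'labels' - {label: offset of its definition inside the module}
       'refs'   - {offset of a reference inside the module: label referred to}
    Returns (symbol table, {absolute address to patch: resolved address})."""
    # 1. Place the modules one after another in memory.
    placement = {}
    address = base_address
    for module in modules:
        placement[module["name"]] = address
        address += module["size"]

    # 2. Determine the absolute address of every label.
    symbols = {}
    for module in modules:
        for label, offset in module["labels"].items():
            if label in symbols:
                raise ValueError(f"duplicate definition of {label}")
            symbols[label] = placement[module["name"]] + offset

    # 3. Patch internal and external references.
    patches = {}
    for module in modules:
        for offset, label in module["refs"].items():
            if label not in symbols:
                raise KeyError(f"undefined reference to {label}")
            patches[placement[module["name"]] + offset] = symbols[label]
    return symbols, patches


modules = [
    {"name": "main.o", "size": 0x20, "labels": {"main": 0x0}, "refs": {0x8: "helper"}},
    {"name": "lib.o", "size": 0x10, "labels": {"helper": 0x4}, "refs": {}},
]
symbols, patches = link(modules)
print({name: hex(addr) for name, addr in symbols.items()})
```

The relocation info and symbol table listed for object files are what supply the `refs` and `labels` inputs in this sketch.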
Loading Executable Program
[Figure: MIPS memory layout, from high to low addresses]
  $sp -> 7fff ffff hex   Stack
                         Dynamic data
  $gp -> 1000 8000 hex
         1000 0000 hex   Static data
  pc  -> 0040 0000 hex   Text
         0               Reserved
To load an executable, the operating system follows these steps:
1. Reads the executable file header to determine the size of the text and data segments
2. Creates an address space large enough for the text and data
3. Copies the instructions and data from the executable file into memory
4. Copies the parameters (if any) to the main program onto the stack
5. Initializes the machine registers and sets the stack pointer to the first free location
6. Jumps to a start-up routine that copies the parameters into the argument registers and calls the main routine of the program
Instruction Set Design Issues
• Instruction set design issues
  - Number of addresses
  - Flow of control
  - Operand types
  - Addressing modes
  - Instruction types
  - Instruction formats
Number of Addresses
• Four categories
  - 3-address machines
    - 2 for the source operands and one for the result
  - 2-address machines
    - One address doubles as source and result
  - 1-address machines
    - Accumulator machines
    - Accumulator is used for one source and result
  - 0-address machines
    - Stack machines
    - Operands are taken from the stack
    - Result goes onto the stack
Three-address machines
• Two for the source operands, one for the result
• RISC processors use three addresses
• Sample instructions
    add dest,src1,src2      M(dest)=[src1]+[src2]
    sub dest,src1,src2      M(dest)=[src1]-[src2]
    mult dest,src1,src2     M(dest)=[src1]*[src2]
Three addresses: Operand 1, Operand 2, Result
Example: a = b + c
Three-address instruction formats are not common because they require a relatively long instruction format to hold the three address references.
Example
C statement:
    A = B + C * D - E + F + A
Equivalent code:
    mult T,C,D      T = C*D
    add T,T,B       T = B+C*D
    sub T,T,E       T = B+C*D-E
    add T,T,F       T = B+C*D-E+F
    add A,T,A       A = B+C*D-E+F+A
Two-address machines
• One address doubles (for source operand and result)
• Last example makes a case for it
  - Address T is used twice
• Sample instructions
    load dest,src       M(dest)=[src]
    add dest,src        M(dest)=[dest]+[src]
    sub dest,src        M(dest)=[dest]-[src]
    mult dest,src       M(dest)=[dest]*[src]
Two addresses: one address doubles as operand and result
Example: a = a + b
The two-address format reduces the space requirement but also introduces some awkwardness. To avoid altering the value of an operand, a MOVE instruction is used to move one of the values to a result or temporary location before performing the operation.
Example
C statement:
    A = B + C * D - E + F + A
Equivalent code:
    load T,C        T = C
    mult T,D        T = C*D
    add T,B         T = B+C*D
    sub T,E         T = B+C*D-E
    add T,F         T = B+C*D-E+F
    add A,T         A = B+C*D-E+F+A
One-address machines
• Use a special set of registers called accumulators
  - Specify one source operand and receive the result
• Called accumulator machines
• Sample instructions
    load addr       accum = [addr]
    store addr      M[addr] = accum
    add addr        accum = accum + [addr]
    sub addr        accum = accum - [addr]
    mult addr       accum = accum * [addr]
Example
C statement:
    A = B + C * D - E + F + A
Equivalent code:
    load C          load C into accum
    mult D          accum = C*D
    add B           accum = C*D+B
    sub E           accum = B+C*D-E
    add F           accum = B+C*D-E+F
    add A           accum = B+C*D-E+F+A
    store A         store accum contents in A
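The accumulator code can be checked with a very small interpreter. The Python sketch below is an illustrative model only: the function name, the tuple encoding of instructions, and the sample values for A through F are all made up for the demonstration.

```python
def run_accumulator(program, memory):
    """Execute one-address instructions against a single accumulator."""
    accum = 0
    for opcode, addr in program:
        if opcode == "load":
            accum = memory[addr]
        elif opcode == "store":
            memory[addr] = accum
        elif opcode == "add":
            accum += memory[addr]
        elif opcode == "sub":
            accum -= memory[addr]
        elif opcode == "mult":
            accum *= memory[addr]
        else:
            raise ValueError(f"unknown opcode {opcode}")
    return accum


memory = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}   # sample values
program = [("load", "C"), ("mult", "D"), ("add", "B"), ("sub", "E"),
           ("add", "F"), ("add", "A"), ("store", "A")]
run_accumulator(program, memory)
print(memory["A"])   # B + C*D - E + F + A with the sample values
```

Every instruction except the implicit accumulator operand touches memory, which is the source of the high memory traffic usually attributed to this style.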
Zero-address machines
• Stack supplies operands and receives the result
  - Special instructions to load and store use an address
• Called stack machines (Ex: HP3000, Burroughs B5500)
• Sample instructions
    push addr       push([addr])
    pop addr        pop([addr])
    add             push(pop + pop)
    sub             push(pop - pop)
    mult            push(pop * pop)
Example
C statement:
    A = B + C * D - E + F + A
Equivalent code:
    push E
    push C
    push D
    mult
    push B
    add
    sub
    push F
    add
    push A
    add
    pop A
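A stack machine is easy to model, and doing so makes the operand order of `sub` explicit. In this Python sketch (function name, instruction encoding, and sample values are assumptions), `sub` computes push(pop - pop), so the first value popped, the top of the stack, is the minuend; that is why E is pushed first in the example.

```python
def run_stack(program, memory):
    """Execute zero-address instructions; only push and pop carry an address."""
    stack = []
    for instruction in program:
        opcode = instruction[0]
        if opcode == "push":
            stack.append(memory[instruction[1]])
        elif opcode == "pop":
            memory[instruction[1]] = stack.pop()
        elif opcode in ("add", "sub", "mult"):
            first = stack.pop()      # top of stack
            second = stack.pop()     # element below it
            if opcode == "add":
                stack.append(first + second)
            elif opcode == "sub":
                stack.append(first - second)   # push(pop - pop)
            else:
                stack.append(first * second)
        else:
            raise ValueError(f"unknown opcode {opcode}")
    return stack


memory = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}   # sample values
program = [("push", "E"), ("push", "C"), ("push", "D"), ("mult",),
           ("push", "B"), ("add",), ("sub",), ("push", "F"), ("add",),
           ("push", "A"), ("add",), ("pop", "A")]
leftover = run_stack(program, memory)
print(memory["A"], leftover)
```

The arithmetic instructions carry no addresses at all, which is where the compact encoding of stack code comes from.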
Load/Store Architecture
• Instructions expect operands in internal processor registers
• Special LOAD and STORE instructions move data between registers and memory
• RISC uses this architecture
• Reduces instruction length
• Sample instructions
    load Rd,addr        Rd = [addr]
    store addr,Rs       (addr) = Rs
    add Rd,Rs1,Rs2      Rd = Rs1 + Rs2
    sub Rd,Rs1,Rs2      Rd = Rs1 - Rs2
    mult Rd,Rs1,Rs2     Rd = Rs1 * Rs2
Example
C statement:
    A = B + C * D - E + F + A
Equivalent code:
    load R1,B
    load R2,C
    load R3,D
    load R4,E
    load R5,F
    load R6,A
    mult R2,R2,R3
    add R2,R2,R1
    sub R2,R2,R4
    add R2,R2,R5
    add R2,R2,R6
    store A,R2
Flow of Control
• Default is sequential flow
• Several instructions alter this default execution
  - Branches
    - Unconditional
    - Conditional
    - Delayed branches
  - Procedure calls
    - Delayed procedure calls
Branches
• Unconditional
  - Absolute address
  - PC-relative
    - Target address is specified relative to PC contents
    - Relocatable code
• Example: MIPS
  - Absolute address
      j target
  - PC-relative
      b target
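The difference between the two target calculations can be sketched in Python. The formulas follow the usual MIPS32 conventions (a 16-bit signed word offset relative to the instruction after the branch, and a 26-bit word index combined with the upper four bits of PC + 4 for `j`); the function names are hypothetical.

```python
def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement number."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value & (1 << (bits - 1)) else value


def pc_relative_target(pc, offset16):
    """Branch target: address of the next instruction plus offset * 4."""
    return (pc + 4 + (sign_extend(offset16, 16) << 2)) & 0xFFFFFFFF


def absolute_jump_target(pc, index26):
    """Jump target: upper 4 bits of PC + 4 combined with the 26-bit index * 4."""
    return ((pc + 4) & 0xF0000000) | ((index26 & 0x03FFFFFF) << 2)


print(hex(pc_relative_target(0x00400000, 3)))
print(hex(pc_relative_target(0x00400010, 0xFFFF)))   # offset -1: branch to itself
print(hex(absolute_jump_target(0x00400000, 0x00100004)))
```

Because the PC-relative form encodes only a distance, the same instruction bits remain valid wherever the code is loaded, which is what makes such code relocatable.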
Branches
• Conditional
  - Jump is taken only if the condition is met
• Two types
  - Set-Then-Jump
    - Condition testing is separated from branching
    - Condition code registers are used to convey the condition test result
    - Condition code registers keep a record of the status of the last ALU operation, such as overflow condition
  - Example: Pentium code
      cmp AX,BX       ;compare AX and BX
      je target       ;jump if equal
  - Test-and-Jump
    - Single instruction performs condition testing and branching
  - Example: MIPS instruction
      beq Rsrc1,Rsrc2,target
    Jumps to target if Rsrc1 = Rsrc2
Delayed branching
• Control is transferred after executing the instruction that follows the branch instruction
  - This instruction slot is called the delay slot
• Improves efficiency
• Highly pipelined RISC processors support it
Procedure calls
• Facilitate modular programming
• Require two pieces of information to return
  - End of procedure
    - Pentium uses the ret instruction
    - MIPS uses the jr instruction
  - Return address
    - In a (special) register: MIPS allows any general-purpose register
    - On the stack: Pentium
[Figure: delay slot]
Parameter Passing
• Two basic techniques
  - Register-based (e.g., PowerPC, MIPS)
    - Internal registers are used
    - Faster
    - Limit the number of parameters
    - Recursive procedure
  - Stack-based (e.g., Pentium)
    - Stack is used
    - More general
2 Operand Types
• Instructions support basic data types
  - Characters
  - Integers
  - Floating-point
• Instruction overload
  - Same instruction for different data types
  - Example: Pentium
      mov AL,address      ;loads an 8-bit value
      mov AX,address      ;loads a 16-bit value
      mov EAX,address     ;loads a 32-bit value
• Separate instructions
  - Instructions specify the operand size
  - Example: MIPS
      lb Rdest,address    ;loads a byte
      lh Rdest,address    ;loads a halfword (16 bits)
      lw Rdest,address    ;loads a word (32 bits)
      ld Rdest,address    ;loads a doubleword (64 bits)
    Similar instructions for store
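What the size-specific loads do to the same memory bytes can be modeled in Python. This sketch assumes big-endian byte order, naturally aligned addresses, and sign-extending loads; the function name `load` is hypothetical and memory is just a `bytes` object.

```python
def load(memory, address, size, signed=True):
    """Read `size` bytes at `address` as one big-endian integer."""
    if address % size:
        raise ValueError(f"address {address} is not aligned to {size} bytes")
    return int.from_bytes(memory[address:address + size], "big", signed=signed)


memory = bytes([0xFF, 0xFE, 0x00, 0x10, 0x00, 0x00, 0x00, 0x01])

byte_value = load(memory, 0, 1)          # like lb: one byte
half_value = load(memory, 0, 2)          # like lh: 16 bits
word_value = load(memory, 0, 4)          # like lw: 32 bits
double_value = load(memory, 0, 8)        # like ld: 64 bits
print(byte_value, half_value, word_value, double_value)
```

The same starting address yields four different values, which is why an instruction set must convey the operand size either through the register name (Pentium) or through the opcode (MIPS).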
3 Addressing Modes
• How the operands are specified
• Operands can be in three places
  - Registers
    - Register addressing mode
  - Part of instruction
    - Constant
    - Immediate addressing mode
    - All processors support these two addressing modes
  - Memory
    - Difference between RISC and CISC
    - CISC supports a large variety of addressing modes
    - RISC follows load/store architecture
4 Instruction Types
• Several types of instructions
  - Data movement
    - Pentium: mov dest,src
    - Some do not provide direct data movement instructions
    - Indirect data movement
        add Rdest,Rsrc,0    ;Rdest = Rsrc+0
  - Arithmetic and Logical
    - Arithmetic: integer and floating-point, signed and unsigned; add, subtract, multiply, divide
    - Logical: and, or, not, xor
• Condition code bits
  - S: Sign bit (0 = +, 1 = -)
  - Z: Zero bit (0 = nonzero, 1 = zero)
  - O: Overflow bit (0 = no overflow, 1 = overflow)
  - C: Carry bit (0 = no carry, 1 = carry)
• Example: Pentium
    cmp count,25    ;compare count to 25
                    ;subtract 25 from count
    je target       ;jump if equal
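How a compare sets the four condition code bits can be sketched in Python. The sketch assumes 32-bit operands and the x86 convention that, for a subtraction, the carry bit records a borrow; the function name and the dictionary of flags are illustrative only.

```python
def compare(a, b, bits=32):
    """Compute a - b as a compare would and return the resulting S, Z, O, C bits."""
    mask = (1 << bits) - 1
    sign_bit = 1 << (bits - 1)
    a &= mask
    b &= mask
    result = (a - b) & mask
    return {
        "S": int(bool(result & sign_bit)),     # result is negative
        "Z": int(result == 0),                 # result is zero, so a == b
        # Signed overflow: operands have different signs and the
        # result's sign differs from the sign of a.
        "O": int(bool((a ^ b) & (a ^ result) & sign_bit)),
        "C": int(a < b),                       # unsigned borrow
    }


def jump_if_equal(flags):
    """A je-style branch looks only at the zero bit."""
    return flags["Z"] == 1


print(compare(25, 25), jump_if_equal(compare(25, 25)))
print(compare(24, 25))
```

This is the set-then-jump style: the compare only records status bits, and a later branch instruction decides based on them.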
• Flow control and I/O instructions
  - Branch
  - Procedure call
  - Interrupts
• I/O instructions
  - Memory-mapped I/O
    - Most processors support memory-mapped I/O
    - No separate instructions for I/O
  - Isolated I/O
    - Pentium supports isolated I/O
    - Separate I/O instructions
        in AX,io_port       ;read from an I/O port
        out io_port,AX      ;write to an I/O port
5 Instruction Formats
• Two types
  - Fixed-length
    - Used by RISC processors
    - 32-bit RISC processors use 32-bit-wide instructions
    - Examples: SPARC, MIPS, PowerPC
  - Variable-length
    - Used by CISC processors
    - Memory operands need more bits to specify
• Opcode
  - Major and exact operation
Examples of Instruction Formats
RISC (Reduced Instruction Set Computer) versus CISC (Complex Instruction Set Computer)
RISC vs. CISC
• The underlying philosophy of RISC machines is that a system is better able to manage program execution when the program consists of only a few different instructions that are the same length and require the same number of clock cycles to decode and execute.
• RISC systems access memory only with explicit load and store instructions.
• In CISC systems, many different kinds of instructions access memory, making instruction length variable and fetch-decode-execute time unpredictable.
• The difference between CISC and RISC becomes evident through the basic computer performance equation.
• RISC systems shorten execution time by reducing the clock cycles per instruction.
• CISC systems improve performance by reducing the number of instructions per program.
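The equation itself does not appear in the text here. A standard way to write the basic performance equation, given as an assumption about what is being referred to, is:

```latex
\[
\frac{\text{time}}{\text{program}}
  = \frac{\text{time}}{\text{cycle}}
  \times \frac{\text{cycles}}{\text{instruction}}
  \times \frac{\text{instructions}}{\text{program}}
\]
```

Read this way, RISC attacks the middle factor (cycles per instruction), while CISC attacks the last one (instructions per program).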
• The simple instruction set of RISC machines enables control units to be hardwired for maximum speed.
• The more complex, and variable, instruction set of CISC machines requires microcode-based control units that interpret instructions as they are fetched from memory. This translation takes time.
• With fixed-length instructions, RISC lends itself to pipelining and speculative execution.
Consider the following program fragments:
CISC:
        mov ax, 10
        mov bx, 5
        mul bx, ax
RISC:
        mov ax, 0
        mov bx, 10
        mov cx, 5
Begin:  add ax, bx
        loop Begin
The total clock cycles for the CISC version might be:
    (2 movs × 1 cycle) + (1 mul × 30 cycles) = 32 cycles
While the clock cycles for the RISC version are:
    (3 movs × 1 cycle) + (5 adds × 1 cycle) + (5 loops × 1 cycle) = 13 cycles
With the RISC clock cycle being shorter, RISC gives us much faster execution speeds.
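The cycle counts can be reproduced with a short Python sketch. It assumes the fragments multiply 10 by 5, that every simple instruction takes 1 cycle, and that the CISC multiply takes 30 cycles; the names below are hypothetical and the model ignores everything else about real pipelines.

```python
def total_cycles(mix):
    """mix maps an instruction name to (times executed, cycles per execution)."""
    return sum(count * cycles for count, cycles in mix.values())


def risc_multiply(multiplicand, multiplier):
    """Mimic the RISC fragment: repeated addition driven by a loop counter.
    Assumes multiplier >= 1. Returns (product, instructions executed)."""
    ax, bx, cx = 0, multiplicand, multiplier    # the three movs
    executed = 3
    while cx > 0:
        ax += bx        # add ax, bx
        cx -= 1         # loop Begin: decrement cx and branch while nonzero
        executed += 2
    return ax, executed


cisc_mix = {"mov": (2, 1), "mul": (1, 30)}
risc_mix = {"mov": (3, 1), "add": (5, 1), "loop": (5, 1)}
print(total_cycles(cisc_mix), total_cycles(risc_mix), risc_multiply(10, 5))
```

Note that the RISC count grows with the multiplier, so the comparison depends on the operand values as well as on the assumed cycle counts.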
• Because of their load-store ISAs, RISC architectures require a large number of CPU registers.
• These registers provide fast access to data during sequential program execution.
• They can also be employed to reduce the overhead typically caused by passing parameters to subprograms.
• Instead of pulling parameters off of a stack, the subprogram is directed to use a subset of registers.
[Figure: overlapping register windows]
• This is how registers can be overlapped in a RISC system.
• The current window pointer (CWP) points to the active register window.
• It is becoming increasingly difficult to distinguish RISC architectures from CISC architectures.
• Some RISC systems provide more extravagant instruction sets than some CISC systems.
• Some systems combine both approaches.
• The following two slides summarize the characteristics that traditionally typify the differences between these two architectures.
RISC
• Multiple register sets.
• Three operands per instruction.
• Parameter passing through register windows.
• Single-cycle instructions.
• Hardwired control.
• Highly pipelined.
CISC
• Single register set.
• One or two register operands per instruction.
• Parameter passing through memory.
• Multiple-cycle instructions.
• Microprogrammed control.
• Less pipelined.
Continued
RISC
• Simple instructions, few in number.
• Fixed-length instructions.
• Complexity in compiler.
• Only LOAD/STORE instructions access memory.
• Few addressing modes.
CISC
• Many complex instructions.
• Variable-length instructions.
• Complexity in microcode.
• Many instructions can access memory.
• Many addressing modes.
Summary
Instruction Set Design Issues
Instruction set design issues include:
• Where are operands stored?
  - registers, memory, stack, accumulator
• How many explicit operands are there?
  - 0, 1, 2, or 3
• How is the operand location specified?
  - register, immediate, indirect, ...
• What type and size of operands are supported?
  - byte, int, float, double, string, vector, ...
• What operations are supported?
  - add, sub, mul, move, compare, ...
More About General Purpose Registers
• Why do almost all new architectures use GPRs?
  - Registers are much faster than memory (even cache)
    - Register values are available immediately
    - When memory isn't ready, the processor must wait ("stall")
  - Registers are convenient for variable storage
    - Compiler assigns some variables just to registers
    - More compact code, since small fields specify registers (compared to memory addresses)
[Figure: storage hierarchy. Processor (Registers), Cache, Memory, Disk]
What Operations Are Needed
• Arithmetic and Logical
  - Integer arithmetic: ADD, SUB, MULT, DIV, SHIFT
  - Logical operations: AND, OR, XOR, NOT
• Data Transfer: copy, load, store
• Control: branch, jump, call, return
• Floating Point: ADDF, MULF, DIVF (same as arithmetic but usually take bigger operands)
• Decimal: ADDD, CONVERT
• String: move, compare, search
• Graphics: pixel and vertex, compression/decompression operations
Stack Architecture: Pros and Cons
• Pros
  - Good code density (implicit top of stack)
  - Low hardware requirements
  - Easy to write a simpler compiler for stack architectures
• Cons
  - Stack becomes the bottleneck
  - Little ability for parallelism or pipelining
  - Data is not always at the top of the stack when needed, so additional instructions like TOP and SWAP are needed
  - Difficult to write an optimizing compiler for stack architectures
Accumulator Architecture: Pros and Cons
• Pros
  - Very low hardware requirements
  - Easy to design and understand
• Cons
  - Accumulator becomes the bottleneck
  - Little ability for parallelism or pipelining
  - High memory traffic
Memory-Memory Architecture: Pros and Cons
• Pros
  - Requires fewer instructions (especially if 3 operands)
  - Easy to write compilers for (especially if 3 operands)
• Cons
  - Very high memory traffic (especially if 3 operands)
  - Variable number of clocks per instruction
  - With two operands, more data movements are required
Memory-Register Architecture: Pros and Cons
• Pros
  - Some data can be accessed without loading first
  - Instruction format easy to encode
  - Good code density
• Cons
  - Operands are not equivalent (poor orthogonality)
  - Variable number of clocks per instruction
  - May limit number of registers
Basic Addressing Classes
Declining cost of registers
Operand locations for four instruction set architecture classes
[Figure: operand locations for (a) stack, (b) accumulator, (c) register-memory, and (d) register-register/load-store architectures]
The arrows indicate whether the operand is an input or the result of the ALU operation, or both an input and result. Lighter shades indicate inputs and the dark shade indicates the result. In (a), a Top Of Stack register (TOS) points to the top input operand, which is combined with the operand below. The first operand is removed from the stack, the result takes the place of the second operand, and TOS is updated to point to the result. All operands are implicit. In (b), the Accumulator is both an implicit input operand and a result. In (c), one input operand is a register, one is in memory, and the result goes to a register. All operands are registers in (d) and, like the stack architecture, can be transferred to memory only via separate instructions: push or pop for (a) and load or store for (d).
Code Sequence for C = A + B
Stack       Accumulator     Register-memory     Register-register
Push A      Load A          Load R1,A           Load R1,A
Push B      Add B           Add R3,R1,B         Load R2,B
Add         Store C         Store R3,C          Add R3,R1,R2
Pop C                                           Store R3,C
The code sequence for C = A + B for four classes of instruction sets. Note that the Add instruction has implicit operands for stack and accumulator architectures, and explicit operands for register architectures. It is assumed that A, B, and C all belong in memory and that the values of A and B cannot be destroyed.
Stack Architectures
Accumulator Architectures
Register-Set Architectures
Register-to-Register: Load-Store Architectures
Register-to-Memory Architectures
Memory-to-Memory Architectures
Instruction Formats
Instruction Set Architecture (ISA)
• To command a computer's hardware, you must speak its language.
• The words of a machine's language are called instructions, and its vocabulary is called the instruction set.
• Once you learn one machine language, it is easy to pick up others:
  - There are few fundamental operations that all computers must provide.
  - All designers have the same goal of finding a language that simplifies building the hardware and the compiler while maximizing performance and minimizing cost.
• Learning how instructions are represented leads to discovering the secret of computing: the stored-program concept.
• The MIPS instruction set is used as a case study.
Interface Design
A good interface:
• Lasts through many implementations (portability, compatibility)
• Is used in many different ways (generality)
• Provides convenient functionality to higher levels
• Permits an efficient implementation at lower levels
Design decisions must take into account:
• Technology
• Machine organization
• Programming languages
• Compiler technology
• Operating systems
[Figure: one interface with several uses above it and several implementations (imp) below it, plotted against time]
Classifying Instruction Set Architectures
Accumulator Architecture
• Common in early stored-program computers, when hardware was so expensive
• Machine has only one register (accumulator) involved in all math and logical operations
• All operations assume the accumulator as a source operand and a destination for the operation, with the other operand stored in memory
Extended Accumulator Architecture
• Dedicated registers for specific operations, e.g., stack and array index registers, added
• The 8086 microprocessor is an example of such a special-purpose register architecture
General-Purpose Register Architecture
• MIPS is an example of such an architecture, where registers are not restricted to playing a single role
• This type of instruction set can be further divided into:
  - Register-memory: allows for one operand to be in memory
  - Register-register (load-store): demands all operands to be in registers
Machine           # general-purpose registers   Architecture style                  Year
Motorola 6800     2                             Accumulator                         1974
DEC VAX           16                            Register-memory, memory-memory      1977
Intel 8086        1                             Extended accumulator                1978
Motorola 68000    16                            Register-memory                     1980
Intel 80386       8                             Register-memory                     1985
PowerPC           32                            Load-store                          1992
DEC Alpha         32                            Load-store                          1992
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 21101
Compact Code and Stack Architectures
When memory is scarce, machines like the Intel 80x86 had variable-length instructions to match varying operand specifications and minimize code size
Stack machines abandoned registers altogether, arguing that it is hard for compilers to use them efficiently
Operands are pushed onto a stack from memory, and the results have to be popped from the stack to memory
Operations take their operands by default from the top of the stack and insert the results back onto the stack
Stack machines simplify compilers and lend themselves to a compact instruction encoding, but limit compiler optimization (e.g., in math expressions)
Example: A = B + C
Push AddressC # Top=Top+4; Stack[Top]=Memory[AddressC]
Push AddressB # Top=Top+4; Stack[Top]=Memory[AddressB]
add # Stack[Top-4]=Stack[Top]+Stack[Top-4]; Top=Top-4
Pop AddressA # Memory[AddressA]=Stack[Top]; Top=Top-4
Compact code is important for heralded network computers, where programs must be downloaded over the Internet (e.g., Java-based applications)
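The push/add/pop sequence for A = B + C can be traced with a small simulation. This is only a sketch: it assumes 4-byte words and a stack that grows upward, and the class name `StackMachine` and the addresses are hypothetical, chosen for illustration.

```python
# Sketch of the stack-machine example A = B + C.
# Assumptions: 4-byte words, upward-growing stack, illustrative names/addresses.
WORD = 4

class StackMachine:
    def __init__(self, memory):
        self.memory = memory   # address -> value
        self.stack = {}        # stack slot address -> value
        self.top = 0           # Top: address of the current top-of-stack slot

    def push(self, address):
        # Top = Top + 4; Stack[Top] = Memory[address]
        self.top += WORD
        self.stack[self.top] = self.memory[address]

    def add(self):
        # Stack[Top-4] = Stack[Top] + Stack[Top-4]; Top = Top - 4
        self.stack[self.top - WORD] = self.stack[self.top] + self.stack[self.top - WORD]
        self.top -= WORD

    def pop(self, address):
        # Memory[address] = Stack[Top]; Top = Top - 4
        self.memory[address] = self.stack[self.top]
        self.top -= WORD

ADDRESS_A, ADDRESS_B, ADDRESS_C = 100, 104, 108   # hypothetical addresses

def run_example(b, c):
    """Run Push C, Push B, add, Pop A and return (A, final Top)."""
    machine = StackMachine({ADDRESS_A: 0, ADDRESS_B: b, ADDRESS_C: c})
    machine.push(ADDRESS_C)
    machine.push(ADDRESS_B)
    machine.add()
    machine.pop(ADDRESS_A)
    return machine.memory[ADDRESS_A], machine.top
```

Note that none of the four instructions names a register; only Push and Pop carry an address, which is what keeps the encoding compact.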
Other Types of Architecture
High-Level-Language Architecture
• In the 1960s, systems software was rarely written in high-level languages, and virtually every commercial operating system before Unix was written in assembly
• Some people blamed the code density on the instruction set rather than the programming language
• A machine design philosophy was advocated with the goal of making the hardware more like high-level languages
• The effectiveness of high-level languages, memory size limitations, and the lack of efficient compilers doomed this philosophy to a historical footnote
Reduced Instruction Set Architecture
• With recent developments in compiler technology and expanded memory sizes, fewer programmers are using assembly-level coding
• Instruction set architecture became measurable in the way compilers, rather than programmers, use them
• RISC architecture favors simplifying hardware design over enriching the offered set of instructions, relying on compilers to use them effectively to perform complex operations
• Virtually all new architectures since 1982 follow the RISC philosophy of fixed instruction lengths, load-store operations, and limited addressing modes
Evolution of Instruction Sets
[Figure: evolution of instruction sets, from top to bottom]
Single Accumulator (EDSAC)
Accumulator + Index Registers (Manchester Mark I)
Separation of Programming Model from Implementation
High-Level Language Based | Concept of a Family
General Purpose Register Machines
Complex Instruction Sets (VAX, Intel) | Load/Store Architecture (CDC, Cray 1)
RISC (MIPS, SPARC, RS6000)
Register-Memory Architectures
Effect of the number of memory operands
Number of memory addresses | Max. number of operands | Examples
0 | 3 | SPARC, MIPS, PowerPC, ALPHA
1 | 2 | Intel 80x86, Motorola 68000
2 | 2 | VAX (also has 3-operand format)
3 | 3 | VAX (also has 2-operand format)
Interpreting Memory Addressing
The address of a word matches the byte address of one of its 4 bytes
The addresses of sequential words differ by 4 (word size in bytes)
Word addresses are multiples of 4 (alignment restriction)
Machines that use the address of the leftmost byte as the word address are called "Big Endian", and those that use the rightmost byte are called "Little Endian"
Misalignment complicates memory access and causes programs to run slower (some machines do not allow misaligned memory access at all)
Byte ordering can be a problem when exchanging data among different machines
Byte addressing affects array index calculation, which must account for word addressing and the offset within the word
Object addressed | Aligned at byte offsets | Misaligned at byte offsets
Byte | 0, 1, 2, 3, 4, 5, 6, 7 | Never
Half word | 0, 2, 4, 6 | 1, 3, 5, 7
Word | 0, 4 | 1, 2, 3, 5, 6, 7
Double word | 0 | 1, 2, 3, 4, 5, 6, 7
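Byte ordering and the alignment rule can be illustrated with a short sketch. It assumes 32-bit words and uses Python's standard `struct` module; the helper names are illustrative only.

```python
import struct

def word_bytes(value, byte_order):
    """Return the 4 bytes of a 32-bit word in memory order (lowest address first)."""
    fmt = ">I" if byte_order == "big" else "<I"   # ">" big endian, "<" little endian
    return struct.pack(fmt, value)

def is_aligned(address, size):
    """An object of `size` bytes is aligned when its address is a multiple of size."""
    return address % size == 0

def misaligned_offsets(size, span=8):
    """List the byte offsets in [0, span) at which an object of `size` bytes is misaligned."""
    return [offset for offset in range(span) if not is_aligned(offset, size)]
```

With the word 0x12345678, a big-endian machine places byte 0x12 at the word address, while a little-endian machine places 0x78 there; `misaligned_offsets` reproduces the last column of the alignment table.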
Addressing Modes
Addressing modes refer to how to specify the location of an operand (effective address)
Addressing modes have the ability to:
Significantly reduce instruction counts
Increase the average CPI
Increase the complexity of building a machine
The VAX machine is used for benchmark data since it supports a wide range of memory addressing modes
Famous addressing modes can be classified based on:
the source of the data, into register, immediate, or memory
the address calculation, into direct and indirect
An indexed addressing mode is usually provided to allow efficient implementation of loops and array access
Example of Addressing Modes
Addressing mode | Example | Meaning | When used
Register | Add R4, R3 | Regs[R4] ← Regs[R4] + Regs[R3] | When a value is in a register
Immediate | Add R4, #3 | Regs[R4] ← Regs[R4] + 3 | For constants
Displacement | Add R4, 100(R1) | Regs[R4] ← Regs[R4] + Mem[100 + Regs[R1]] | Accessing local variables
Register indirect | Add R4, (R1) | Regs[R4] ← Regs[R4] + Mem[Regs[R1]] | Accessing using a pointer or a computed address
Indexed | Add R4, (R1 + R2) | Regs[R4] ← Regs[R4] + Mem[Regs[R1] + Regs[R2]] | Sometimes useful in array addressing: R1 = base of the array; R2 = index amount
Direct or absolute | Add R4, (1001) | Regs[R4] ← Regs[R4] + Mem[1001] | Sometimes useful for accessing static data; address constant may need to be large
Memory indirect or memory deferred | Add R4, @(R3) | Regs[R4] ← Regs[R4] + Mem[Mem[Regs[R3]]] | If R3 is the address of the pointer p, then mode yields *p
Autoincrement | Add R4, (R2)+ | Regs[R4] ← Regs[R4] + Mem[Regs[R2]]; Regs[R2] ← Regs[R2] + d | Useful for stepping through arrays within a loop. R2 points to start of the array; each reference increments R2 by d
Autodecrement | Add R4, -(R2) | Regs[R2] ← Regs[R2] - d; Regs[R4] ← Regs[R4] + Mem[Regs[R2]] | Same use as autoincrement. Autodecrement/increment can also act as push/pop to implement a stack
Scaled | Add R4, 100(R2)[R3] | Regs[R4] ← Regs[R4] + Mem[100 + Regs[R2] + Regs[R3] * d] | Used to index arrays
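The "Meaning" column of the addressing-mode table can be expressed as a small operand-fetch routine. This is a sketch under simple assumptions: registers and memory are plain dictionaries, `d` is the element size in bytes, and the function and mode names are illustrative rather than taken from any real simulator.

```python
# Sketch: fetch the source operand selected by an addressing mode.
# Assumptions: regs and mem are dicts; d is the element size in bytes.
def operand(mode, regs, mem, base=None, index=None, const=0, d=4):
    """Return the operand value for `mode`; auto modes also update regs[base]."""
    if mode == "register":
        return regs[base]
    if mode == "immediate":
        return const
    if mode == "displacement":
        return mem[const + regs[base]]
    if mode == "register_indirect":
        return mem[regs[base]]
    if mode == "indexed":
        return mem[regs[base] + regs[index]]
    if mode == "direct":
        return mem[const]
    if mode == "memory_indirect":
        return mem[mem[regs[base]]]
    if mode == "autoincrement":
        value = mem[regs[base]]
        regs[base] += d            # post-increment by the element size
        return value
    if mode == "autodecrement":
        regs[base] -= d            # pre-decrement by the element size
        return mem[regs[base]]
    if mode == "scaled":
        return mem[const + regs[base] + regs[index] * d]
    raise ValueError(f"unknown addressing mode: {mode}")
```

The autoincrement and autodecrement branches are mirror images, which is why the pair can serve as pop and push on a stack.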
Addressing Mode for Signal Processing
DSP offers special addressing modes to better serve popular algorithms
Modulo addressing
Since DSP deals with continuous data streams, circular buffers are widely used
Circular or modulo addressing allows automatic increment and decrement and resets the pointer when reaching the end of the buffer
Reverse addressing
The resulting address is the current address with its bits in reverse order
Reverse addressing mode expedites the access, which otherwise requires a number of logical instructions or extra memory accesses
Fast Fourier Transform
0 (000) → 0 (000)
1 (001) → 4 (100)
2 (010) → 2 (010)
3 (011) → 6 (110)
4 (100) → 1 (001)
5 (101) → 5 (101)
6 (110) → 3 (011)
7 (111) → 7 (111)
Special features require either hand coding or a compiler that uses such features (GNU would not be a good choice)
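Both DSP addressing modes reduce to short address computations. The following sketch shows them in software, assuming a buffer described by a base address and a length; the function names are illustrative.

```python
def next_circular(pointer, base, length, step=1):
    """Advance a pointer inside the circular buffer [base, base + length).

    The pointer wraps back to the start (or end) of the buffer when it
    runs past either edge, which is what modulo addressing does in hardware.
    """
    return base + (pointer - base + step) % length

def bit_reverse(index, bits):
    """Return `index` with its lowest `bits` bits in reverse order."""
    result = 0
    for _ in range(bits):
        result = (result << 1) | (index & 1)   # move the lowest bit of index into result
        index >>= 1
    return result
```

For a 3-bit index, `bit_reverse` maps 0..7 to 0, 4, 2, 6, 1, 5, 3, 7, the access order of an 8-point FFT. A software loop like this costs several logical instructions per access, which is the overhead the hardware mode avoids.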
Operations of the Computer Hardware
"There must certainly be instructions for performing the fundamental arithmetic operations."
Burks, Goldstine, and von Neumann, 1947
Assembly language is a symbolic representation of what the processor actually understands
The MIPS assembler allows only one instruction per line and ignores comments following # until the end of the line
Example:
Translation of a segment of a C program to MIPS assembly instructions
C: f = (g + h) - (i + j)
MIPS:
add t0, g, h # temp variable t0 contains "g + h"
add t1, i, j # temp variable t1 contains "i + j"
sub f, t0, t1 # f = t0 - t1 = (g + h) - (i + j)
Operations in the Instruction Set
Operator type | Examples
Arithmetic and logical | Integer arithmetic and logical operations: add, and, subtract, or
Data transfer | Loads-stores (move instructions on machines with memory addressing)
Control | Branch, jump, procedure call and return, trap
System | Operating system call, virtual memory management instructions
Floating point | Floating point instructions: add, multiply
Decimal | Decimal add, decimal multiply, decimal to character conversion
String | String move, string compare, string search
Graphics | Pixel operations, compression/decompression operations
Arithmetic, logical, data transfer, and control are almost standard categories for all machines
System instructions are required for multi-programming environments, although support for system functions varies
Decimal and string instructions can be primitives, e.g., on the IBM 360 and the VAX
Support for floating point, decimal, string, and graphics can be optional, sometimes provided via a co-processor
Some machines rely on the compiler to synthesize special operations, such as string handling, from simpler instructions
Operations for Media & Signal Processing
Single-instruction multiple-data (SIMD) and vector instructions are often supported in DSPs, which are commonly used in multimedia and signal processing applications
Partitioned Add (integer)
Performs multiple 16-bit additions on a 64-bit ALU, since most data are narrow
Increases ALU throughput for multimedia applications
Paired single operations (float)
Allow the same register to act as two operands to the same operation
Handy in dealing with vertices and coordinates
Multiply and accumulate
Very handy for calculating dot products of vectors (signal processing) and matrix multiplication
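A partitioned add and a multiply-accumulate loop can be sketched in a few lines. The sketch assumes four 16-bit lanes packed into one 64-bit word with wrap-around inside each lane (real instruction sets may also offer saturating variants); the names are illustrative.

```python
LANE_BITS = 16                      # assumed lane width
LANES = 4                           # four lanes fill a 64-bit word
LANE_MASK = (1 << LANE_BITS) - 1

def partitioned_add(x, y):
    """Add four 16-bit lanes packed in 64-bit words; carries never cross lanes."""
    result = 0
    for lane in range(LANES):
        shift = lane * LANE_BITS
        a = (x >> shift) & LANE_MASK
        b = (y >> shift) & LANE_MASK
        result |= ((a + b) & LANE_MASK) << shift   # wrap around within the lane
    return result

def dot_product(xs, ys):
    """Dot product expressed as repeated multiply-accumulate (MAC) steps."""
    acc = 0
    for x, y in zip(xs, ys):
        acc += x * y                # one MAC step per element pair
    return acc
```

In hardware the four lane additions happen in one ALU pass; the loop here only models the result, not the timing.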
Frequency of Operations Usage
Make the common case fast by focusing on these operations
The most widely executed instructions are the simple operations of an instruction set
The following is the average usage in SPECint92 on Intel 80x86
Rank | 80x86 instruction | Integer average (% total executed)
1 | Load | 22%
2 | Conditional branch | 20%
3 | Compare | 16%
4 | Store | 12%
5 | Add | 8%
6 | And | 6%
7 | Sub | 5%
8 | Move register-register | 4%
9 | Call | 1%
10 | Return | 1%
Total | | 96%
Control Flow Instructions
Jump: for unconditional change in the control flow
Branch: for conditional change in the control flow
Procedure calls and returns
Data is based on SPEC on Alpha
Destination Address Definition
Relative addressing with respect to the program counter proved to be the best choice for forward and backward branches or jumps (load-address independent)
To allow for dynamic loading of library routines, register indirect addressing allows addresses to be loaded in special registers (e.g., virtual functions in C++ and system calls in a case statement)
Data is based on SPEC on Alpha
Condition Evaluation
Compare-and-branch can be efficient if the majority of conditions are comparisons with zero
Remember to focus on the common case
Based on SPEC on MIPS
Frequency of Types of Comparison
Data is based on SPEC on Alpha
Different benchmarks and machines set new design priorities
DSPs support a repeat instruction for "for" loops (vectors) using registers
Supporting Procedures
Execution of a procedure follows these steps:
Store parameters in a place accessible to the procedure
Transfer control to the procedure
Acquire the storage resources needed for the procedure
Perform the desired task
Store the result value in a place accessible to the calling program
Return control to the point of origin
The hardware provides a program counter to trace instruction flow and manage transfer of control
Parameter Passing
Registers can be used for passing a small number of parameters
A stack is used to spill registers of the current context, to make room for the called procedure to run, and to allow large parameters to be passed
Storage of machine state can be performed by the caller or the callee
Handling of shared variables is important to ensure correct semantics and thus requires clear specifications in the library interface
Global variables stored in registers need careful handling
Type and Size of Operands
The type of an operand is designated by encoding it in the instruction's operation code
The type of an operand, e.g., single-precision float, effectively gives its size
Common operand types include character, half-word and word-size integer, and single- and double-precision floating point
Characters are almost always in ASCII, integers in 2's complement, and floating point in IEEE 754
The 16-bit Unicode, used in Java, is gaining popularity due to its support for international character sets
For business applications, some architectures support a decimal format in binary coded decimal (BCD)
Depending on the size of the word, the complexity of handling different operand types differs
DSP offers fixed-point data types to support high-precision floating point arithmetic and to allow sharing a single exponent among multiple numbers
For graphics applications, vertex and pixel operands are added features
Size of Operands
The double-word data type is used for double-precision floating point operations and for address storage in machines with a 64-bit wide address bus
Words are used for integer operations and for 32-bit address bus machines
Because of the mix in SPEC, word and double-word data types dominate
Frequency of reference by size based on SPEC2000 on Alpha
Instruction Representation
Humans are taught to think in base 10 (decimal), but numbers may be represented in any base (123 in base 10 = 1111011 in binary, or base 2)
Numbers are stored in computers as a series of high and low electronic signals (binary numbers)
Binary digits are called bits and are considered the atom of computing
Each piece of an instruction is a number, and placing these numbers together forms the instruction
The assembler translates the symbolic assembly instructions into machine language instructions (machine code)
Example:
Assembly: add $t0, $s1, $s2
M/C language (decimal): 0 | 17 | 18 | 8 | 0 | 32
M/C language (binary): 000000 | 10001 | 10010 | 01000 | 00000 | 100000
Field widths: 6 bits | 5 bits | 5 bits | 5 bits | 5 bits | 6 bits
Note: the MIPS compiler by default maps $s0,...,$s7 to registers 16-23 and $t0,...,$t7 to registers 8-15
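Placing the field numbers together to form an instruction word can be shown with a short sketch. It assumes the standard MIPS R-format layout (6, 5, 5, 5, 5, 6 bits) and the register mapping $s0-$s7 to 16-23 and $t0-$t7 to 8-15; the helper names are illustrative.

```python
# Register name -> number, for the $s and $t registers only (assumed subset).
REGISTER_NUMBERS = {
    **{f"$s{i}": 16 + i for i in range(8)},
    **{f"$t{i}": 8 + i for i in range(8)},
}

def encode_r_format(op, rs, rt, rd, shamt, funct):
    """Pack the six R-format fields (6, 5, 5, 5, 5, 6 bits) into a 32-bit word."""
    return (op << 26) | (rs << 21) | (rt << 16) | (rd << 11) | (shamt << 6) | funct

def encode_add(rd, rs, rt):
    """Encode `add rd, rs, rt`; add uses op = 0 and funct = 32."""
    return encode_r_format(0, REGISTER_NUMBERS[rs], REGISTER_NUMBERS[rt],
                           REGISTER_NUMBERS[rd], 0, 32)
```

For `add $t0, $s1, $s2` the fields are 0, 17, 18, 8, 0, 32, so `format(encode_add("$t0", "$s1", "$s2"), "032b")` shows the six binary fields concatenated.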
Encoding an Instruction Set
Instruction encoding affects the size of the compiled program and the complexity of the CPU implementation
The operation is typically specified in one field, called the opcode
The addressing mode for the operand can be encoded with the operation, or specified through a separate identifier when a large number of modes is supported
The architecture must balance several competing factors:
Desire to support as many registers and addressing modes as possible
Effect of operand specification on the size of the instruction (program)
Desire to simplify instruction fetching and decoding during execution
Fixed-size instruction encoding simplifies the CPU design while limiting the addressing modes supported
An architect who cares about code size can use variable-size encoding
A hybrid approach is to allow variability by supporting multiple instruction sizes
Encoding Examples
MIPS Instruction Format
Register-format instructions
op | rs | rt | rd | shamt | funct
6 bits | 5 bits | 5 bits | 5 bits | 5 bits | 6 bits
op: Basic operation of the instruction, traditionally called the opcode
rs: The first register source operand
rt: The second register source operand
rd: The register destination operand; it gets the result of the operation
shamt: Shift amount
funct: This field selects the specific variant of the operation of the op field
Immediate-type instructions
op | rs | rt | address
6 bits | 5 bits | 5 bits | 16 bits
Some instructions need longer fields than provided, for large-value constants
The 16-bit address means a load word instruction can load a word within a region of ±2^15 bytes of the address in the base register
Example: lw $t0, 32($s3) # Temporary register $t0 gets A[8]
Instruction | Format | op | rs | rt | rd | shamt | funct | address
add | R | 0 | reg | reg | reg | 0 | 32 | N/A
sub | R | 0 | reg | reg | reg | 0 | 34 | N/A
lw | I | 35 | reg | reg | N/A | N/A | N/A | address
sw | I | 43 | reg | reg | N/A | N/A | N/A | address
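The immediate format and its ±2^15 byte reach can be checked with a small encoder/decoder pair. This is a sketch assuming the standard MIPS I-format layout (6, 5, 5, 16 bits) with a signed 16-bit offset; the function names are illustrative.

```python
def encode_i_format(op, rs, rt, address):
    """Pack op (6 bits), rs (5), rt (5) and a signed 16-bit address field."""
    if not -2**15 <= address < 2**15:
        raise ValueError("offset does not fit in 16 bits")
    return (op << 26) | (rs << 21) | (rt << 16) | (address & 0xFFFF)

def decode_i_format(word):
    """Split a 32-bit word into (op, rs, rt, address), sign-extending the address."""
    address = word & 0xFFFF
    if address >= 2**15:          # top bit set: the field holds a negative offset
        address -= 2**16
    return (word >> 26) & 0x3F, (word >> 21) & 0x1F, (word >> 16) & 0x1F, address
```

For `lw $t0, 32($s3)` the fields are op = 35, rs = 19 ($s3), rt = 8 ($t0), address = 32. An offset outside the 16-bit range is rejected, which is the case where a compiler must build the address in a register first.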
The Stored Program Concept
Learning how instructions are represented leads to discovering the secret of computing: the stored-program concept
Today's computers are built on two key principles:
Instructions are represented as numbers
Programs can be stored in memory to be read or written just like numbers
The power of the concept:
memory can contain:
the source code for an editor
the compiled m/c code for the editor
the text that the compiled program is using
the compiler that generated the code
[Figure: a processor connected to a memory that holds an accounting program (machine code), an editor program (machine code), a C compiler (machine code), payroll data, book text, and source code in C for the editor program]
Compiling if-then-else in MIPS
Assuming the five variables f, g, h, i, and j correspond to the five registers $s0 through $s4, what is the compiled MIPS code for the following C if statement?
if (i == j) f = g + h; else f = g - h;
[Figure: flowchart of the if statement. The test i == j leads to f = g + h when i = j and to f = g - h (label Else) when i != j; both paths join at Exit]
MIPS:
bne $s3, $s4, Else # go to Else if i != j
add $s0, $s1, $s2 # f = g + h (skipped if i != j)
j Exit
Else: sub $s0, $s1, $s2 # f = g - h (skipped if i = j)
Exit:
Typical Compilation
Major Types of Optimization
Optimization name | Explanation | Frequency
High-level | At or near source level, machine independent |
Procedure integration | Replace procedure call by procedure body | N.M.
Local | Within straight-line code |
Common subexpression elimination | Replace two instances of the same computation by a single copy | 18%
Constant propagation | Replace all instances of a variable that is assigned a constant with the constant | 22%
Stack height reduction | Rearrange expression tree to minimize resources needed for expression evaluation | N.M.
Global | Across a branch |
Global common subexpression elimination | Same as local, but this version crosses branches | 13%
Copy propagation | Replace all instances of a variable A that has been assigned X (i.e., A = X) with X | 11%
Code motion | Remove code from a loop that computes the same value each iteration of the loop | 16%
Induction variable elimination | Simplify/eliminate array-addressing calculations within loops | 2%
Machine-dependent | Depends on machine knowledge |
Strength reduction | Many examples, such as replacing a multiply by a constant with adds and shifts | N.M.
Pipeline scheduling | Reorder instructions to improve pipeline performance | N.M.
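Strength reduction, one of the machine-dependent optimizations, replaces a multiply by a constant with adds and shifts. The sketch below shows the idea for a non-negative constant; the function name is illustrative, and a real compiler would emit the shift/add sequence at compile time rather than loop at run time.

```python
def multiply_by_constant(x, constant):
    """Compute x * constant using only shifts and adds (constant >= 0)."""
    result = 0
    shift = 0
    while constant:
        if constant & 1:
            result += x << shift    # add x * 2**shift for each set bit of the constant
        constant >>= 1
        shift += 1
    return result
```

For example, 10 is binary 1010, so x * 10 becomes (x << 1) + (x << 3): two shifts and one add instead of a multiply.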
Effect of Compiler Optimization
[Figure: measurements taken on MIPS, by program and compiler optimization level]
Level 0: non-optimized code
Level 1: local optimization
Level 2: global optimization, s/w pipelining
Level 3: adds procedure integration
Compiler Support for Multimedia Instr.
Intel's MMX and PowerPC AltiVec have small vector processing capabilities targeting multimedia applications (to speed up graphics)
Intel added a new set of instructions, called Streaming SIMD Extension
A major advantage of vector computers is hiding the latency of memory access by loading multiple elements and then overlapping execution with data transfer
Vector computers typically have strided and/or gather/scatter addressing to perform operations on distant memory locations
Strided addressing allows memory access in increments larger than one
Gather/scatter addressing is similar to register indirect mode, where the addresses are stored instead of the data
Supporting vector operations without strided addressing, as in Intel's MMX, limits the potential speedup
Such limited support for vector processing makes the use of vectorizing compiler optimization unpopular and restricts its scope to hand-coded routines
SIMD instructions on MMX and AltiVec tend to be solutions, not primitives
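Strided and gather/scatter addressing differ only in where the element addresses come from. The sketch below models memory as a Python list indexed by element; the function names are illustrative and do not correspond to any real vector instruction set.

```python
def strided_load(memory, base, stride, count):
    """Load `count` elements starting at `base`, each `stride` locations apart."""
    return [memory[base + i * stride] for i in range(count)]

def gather(memory, addresses):
    """Load the elements whose addresses are held in an index vector."""
    return [memory[address] for address in addresses]

def scatter(memory, addresses, values):
    """Store each value at the matching address from the index vector."""
    for address, value in zip(addresses, values):
        memory[address] = value
```

A strided load walks a matrix column stored in row-major order (stride = row length), while gather and scatter handle sparse data whose positions are only known through an index vector.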
Starting a Program
[Figure: C program → Compiler → Assembly language program → Assembler → Object: machine language module, together with Object: library routine (machine language) → Linker → Executable: machine language program → Loader → Memory]
Linker
- Places code & data modules symbolically in memory
- Determines the addresses of data & instruction labels
- Patches both internal & external references
Object files for Unix typically contain:
Header: size & position of components
Text segment: machine code
Data segment: static and dynamic variables
Relocation info: identifies absolute memory references
Symbol table: names & locations of labels, procedures, and variables
Debugging info: mapping source to object code, break points, etc.
Loading Executable Program
[Figure: memory layout, from high to low addresses: Stack ($sp at 7fff ffff hex), Dynamic data, Static data (from 1000 0000 hex, with $gp at 1000 8000 hex), Text (pc at 0040 0000 hex), Reserved (from 0)]
To load an executable, the operating system follows these steps:
Reads the executable file header to determine the size of the text and data segments
Creates an address space large enough for the text and data
Copies the instructions and data from the executable file into memory
Copies the parameters (if any) to the main program onto the stack
Initializes the machine registers and sets the stack pointer to the first free location
Jumps to a start-up routine that copies the parameters into the argument registers and calls the main routine of the program
Instruction Set Design Issues
Instruction set design issues:
Number of Addresses
Flow of Control
Operand Types
Addressing Modes
Instruction Types
Instruction Formats
Number of Addresses
Four categories
3-address machines
- 2 for the source operands and one for the result
2-address machines
- One address doubles as source and result
1-address machines
- Accumulator machines
- Accumulator is used for one source and result
0-address machines
- Stack machines
- Operands are taken from the stack
- Result goes onto the stack
Number of Addresses (cont.)
Three-address machines
Two for the source operands, one for the result
RISC processors use three addresses
Sample instructions
add dest,src1,src2 ; M(dest)=[src1]+[src2]
sub dest,src1,src2 ; M(dest)=[src1]-[src2]
mult dest,src1,src2 ; M(dest)=[src1]*[src2]
Three addresses: Operand 1, Operand 2, Result
Example: a = b + c
Three-address instruction formats are not common, because they require a relatively long instruction format to hold the three address references
Number of Addresses (cont.)
Example
C statement
A = B + C * D - E + F + A
Equivalent code:
mult T,C,D ;T = C*D
add T,T,B ;T = B+C*D
sub T,T,E ;T = B+C*D-E
add T,T,F ;T = B+C*D-E+F
add A,T,A ;A = B+C*D-E+F+A
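The three-address sequence for A = B + C * D - E + F + A can be run on a tiny interpreter. This is a sketch: memory is a dictionary keyed by variable name, and `run_three_address` and `PROGRAM` are illustrative names.

```python
import operator

OPS = {"add": operator.add, "sub": operator.sub, "mult": operator.mul}

def run_three_address(program, memory):
    """Execute (op, dest, src1, src2) tuples: M(dest) = [src1] op [src2]."""
    for op, dest, src1, src2 in program:
        memory[dest] = OPS[op](memory[src1], memory[src2])
    return memory

# A = B + C * D - E + F + A, using T as a temporary location
PROGRAM = [
    ("mult", "T", "C", "D"),
    ("add", "T", "T", "B"),
    ("sub", "T", "T", "E"),
    ("add", "T", "T", "F"),
    ("add", "A", "T", "A"),
]
```

Five instructions suffice, but each one carries three address fields, which is the encoding cost the slides point out.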
Number of Addresses (cont.)
Two-address machines
One address doubles (for source operand & result)
Last example makes a case for it
- Address T is used twice
Sample instructions
load dest,src ; M(dest)=[src]
add dest,src ; M(dest)=[dest]+[src]
sub dest,src ; M(dest)=[dest]-[src]
mult dest,src ; M(dest)=[dest]*[src]
Two addresses: one address doubles as operand and result
Example: a = a + b
The two-address format reduces the space requirement but also introduces some awkwardness. To avoid altering the value of an operand, a MOVE instruction is used to move one of the values to a result or temporary location before performing the operation
Number of Addresses (cont.)
Example
C statement
A = B + C * D - E + F + A
Equivalent code:
load T,C ;T = C
mult T,D ;T = C*D
add T,B ;T = B+C*D
sub T,E ;T = B+C*D-E
add T,F ;T = B+C*D-E+F
add A,T ;A = B+C*D-E+F+A
Number of Addresses (cont.)
One-address machines
Use a special set of registers called accumulators
- Specify one source operand & receive the result
Called accumulator machines
Sample instructions
load addr ; accum = [addr]
store addr ; M[addr] = accum
add addr ; accum = accum + [addr]
sub addr ; accum = accum - [addr]
mult addr ; accum = accum * [addr]
Number of Addresses (cont.)
Example
C statement
A = B + C * D - E + F + A
Equivalent code:
load C ;load C into accum
mult D ;accum = C*D
add B ;accum = C*D+B
sub E ;accum = B+C*D-E
add F ;accum = B+C*D-E+F
add A ;accum = B+C*D-E+F+A
store A ;store accum contents in A
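The accumulator version can be traced the same way. The sketch below assumes a single accumulator and a dictionary for memory; `run_accumulator` and `PROGRAM` are illustrative names.

```python
def run_accumulator(program, memory):
    """Execute (op, addr) pairs on a one-address (accumulator) machine."""
    accum = 0
    for op, addr in program:
        if op == "load":
            accum = memory[addr]
        elif op == "store":
            memory[addr] = accum
        elif op == "add":
            accum += memory[addr]
        elif op == "sub":
            accum -= memory[addr]
        elif op == "mult":
            accum *= memory[addr]
        else:
            raise ValueError(f"unknown instruction: {op}")
    return memory

# A = B + C * D - E + F + A
PROGRAM = [
    ("load", "C"), ("mult", "D"), ("add", "B"), ("sub", "E"),
    ("add", "F"), ("add", "A"), ("store", "A"),
]
```

Each instruction names only one address, so instructions are shorter than in the three-address form, at the cost of seven instructions instead of five.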
Number of Addresses (cont.)
Zero-address machines
Stack supplies operands and receives the result
- Special instructions to load and store use an address
Called stack machines (Ex: HP3000, Burroughs B5500)
Sample instructions
push addr ; push([addr])
pop addr ; pop([addr])
add ; push(pop + pop)
sub ; push(pop - pop)
mult ; push(pop * pop)
Number of Addresses (cont.)
Example
C statement
A = B + C * D - E + F + A
Equivalent code:
push E
push C
push D
mult
push B
add
sub
push F
add
push A
add
pop A
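The zero-address sequence depends on the operand order of `sub`, so it is worth tracing. The sketch below assumes that `sub` computes push(pop - pop) with the first pop taking the top of the stack, which is why E is pushed first; `run_stack` and `PROGRAM` are illustrative names.

```python
def run_stack(program, memory):
    """Execute a zero-address program; only push and pop carry an address."""
    stack = []
    for instruction in program:
        op = instruction[0]
        if op == "push":
            stack.append(memory[instruction[1]])
        elif op == "pop":
            memory[instruction[1]] = stack.pop()
        else:
            first = stack.pop()      # top of the stack
            second = stack.pop()     # element just below the top
            if op == "add":
                stack.append(first + second)
            elif op == "sub":
                stack.append(first - second)   # assumed order: push(pop - pop)
            elif op == "mult":
                stack.append(first * second)
            else:
                raise ValueError(f"unknown instruction: {op}")
    return memory

# A = B + C * D - E + F + A
PROGRAM = [
    ("push", "E"), ("push", "C"), ("push", "D"), ("mult",),
    ("push", "B"), ("add",), ("sub",),
    ("push", "F"), ("add",), ("push", "A"), ("add",), ("pop", "A"),
]
```

Twelve instructions are needed, the longest of the sequences, but the arithmetic instructions carry no operand fields at all.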
Load/Store Architecture
Instructions expect operands in internal processor registers
Special LOAD and STORE instructions move data between registers and memory
RISC uses this architecture
Reduces instruction length
Load/Store Architecture (cont.)
Sample instructions
load Rd,addr ;Rd = [addr]
store addr,Rs ;(addr) = Rs
add Rd,Rs1,Rs2 ;Rd = Rs1 + Rs2
sub Rd,Rs1,Rs2 ;Rd = Rs1 - Rs2
mult Rd,Rs1,Rs2 ;Rd = Rs1 * Rs2
Number of Addresses (cont.)
Example
C statement
A = B + C * D - E + F + A
Equivalent code:
load R1,B
load R2,C
load R3,D
load R4,E
load R5,F
load R6,A
mult R2,R2,R3
add R2,R2,R1
sub R2,R2,R4
add R2,R2,R5
add R2,R2,R6
store A,R2
Flow of Control
Default is sequential flow
Several instructions alter this default execution
Branches
- Unconditional
- Conditional
- Delayed branches
Procedure calls
- Delayed procedure calls
Flow of Control (cont.)
Branches
Unconditional
- Absolute address
- PC-relative
* Target address is specified relative to PC contents
* Relocatable code
Example: MIPS
- Absolute address
j target
- PC-relative
b target
Flow of Control (cont.)
Branches
Conditional
- Jump is taken only if the condition is met
Two types
- Set-Then-Jump
* Condition testing is separated from branching
* Condition code registers are used to convey the condition test result
* Condition code registers keep a record of the status of the last ALU operation, such as the overflow condition
- Example: Pentium code
cmp AX,BX ;compare AX and BX
je target ;jump if equal
Flow of Control (cont.)
- Test-and-Jump
  - Single instruction performs condition testing and branching
- Example: MIPS instruction
  beq Rsrc1,Rsrc2,target
  Jumps to target if Rsrc1 = Rsrc2
Delayed branching
- Control is transferred after executing the instruction that follows the branch instruction
  - This instruction slot is called the delay slot
- Improves efficiency
- Highly pipelined RISC processors support delayed branching
Flow of Control (cont.)
Procedure calls
- Facilitate modular programming
- Require two pieces of information to return
  - End of procedure
    - Pentium uses the ret instruction
    - MIPS uses the jr instruction
  - Return address
    - In a (special) register: MIPS allows any general-purpose register
    - On the stack: Pentium
Flow of Control (cont.)
[Figure: delay slot]
Parameter Passing
Two basic techniques
Register-based (e.g., PowerPC, MIPS)
- Internal registers are used
  - Faster
  - Limits the number of parameters
  - Recursive procedures
Stack-based (e.g., Pentium)
- Stack is used
  - More general
2 Operand Types
Instructions support basic data types
- Characters
- Integers
- Floating-point
Instruction overload
- Same instruction for different data types
- Example: Pentium
  mov AL,address      ; loads an 8-bit value
  mov AX,address      ; loads a 16-bit value
  mov EAX,address     ; loads a 32-bit value
Operand Types (cont.)
Separate instructions
- Instructions specify the operand size
- Example: MIPS
  lb Rdest,address    # loads a byte
  lh Rdest,address    # loads a halfword (16 bits)
  lw Rdest,address    # loads a word (32 bits)
  ld Rdest,address    # loads a doubleword (64 bits)
Similar instructions exist for store
3 Addressing Modes
How the operands are specified
Operands can be in three places
- Registers
  - Register addressing mode
- Part of instruction
  - Constant
  - Immediate addressing mode
  - All processors support these two addressing modes
- Memory
  - Difference between RISC and CISC
  - CISC supports a large variety of addressing modes
  - RISC follows load/store architecture
4 Instruction Types
Several types of instructions
Data movement
- Pentium: mov dest,src
- Some do not provide direct data movement instructions
- Indirect data movement
  add Rdest,Rsrc,0    ; Rdest = Rsrc + 0
Arithmetic and Logical
- Arithmetic
  - Integer and floating-point, signed and unsigned
  - add, subtract, multiply, divide
- Logical
  - and, or, not, xor
Instruction Types (cont.)
Condition code bits
- S: Sign bit (0 = +, 1 = -)
- Z: Zero bit (0 = nonzero, 1 = zero)
- O: Overflow bit (0 = no overflow, 1 = overflow)
- C: Carry bit (0 = no carry, 1 = carry)
Example: Pentium
  cmp count,25    ; compare count to 25
                  ; (subtract 25 from count)
  je target       ; jump if equal
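How a compare instruction sets the S, Z, O, and C bits can be sketched by modelling the subtraction on fixed-width integers. The function name `compare`, the 16-bit default width, and the sample operands are assumptions for illustration; the carry bit follows the x86 convention of signalling a borrow on subtraction.

```python
# Sketch of the condition codes produced by "cmp a,b" (computes a - b,
# discards the result, keeps the flags). Names and widths are illustrative.

def compare(a, b, bits=16):
    mask = (1 << bits) - 1
    sign = 1 << (bits - 1)
    a &= mask
    b &= mask
    result = (a - b) & mask
    return {
        "S": int(bool(result & sign)),   # sign of the result
        "Z": int(result == 0),           # result is zero, so a == b
        "C": int(a < b),                 # unsigned borrow out of the top bit
        # Signed overflow: operands differ in sign and the result's sign
        # differs from the sign of the first operand.
        "O": int(bool((a ^ b) & (a ^ result) & sign)),
    }


def je_taken(flags):
    """je (jump if equal) is taken when the zero bit is set."""
    return flags["Z"] == 1


equal = compare(25, 25)
smaller = compare(24, 25)
overflow = compare(0x8000, 1)   # most negative 16-bit value minus 1

print(equal, je_taken(equal))
print(smaller, je_taken(smaller))
print(overflow)
```

This is the set-then-jump style: the comparison only records status bits, and a separate instruction such as je reads them to decide whether to branch.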
Instruction Types (cont.)
Flow control and I/O instructions
- Branch
- Procedure call
- Interrupts
I/O instructions
- Memory-mapped I/O
  - Most processors support memory-mapped I/O
  - No separate instructions for I/O
- Isolated I/O
  - Pentium supports isolated I/O
  - Separate I/O instructions
    in AX,io_port     ; read from an I/O port
    out io_port,AX    ; write to an I/O port
5 Instruction Formats
Two types
Fixed-length
- Used by RISC processors
- 32-bit RISC processors use 32-bit wide instructions
  - Examples: SPARC, MIPS, PowerPC
Variable-length
- Used by CISC processors
- Memory operands need more bits to specify
Opcode
- Major and exact operation
Examples of Instruction Formats
[Figure: examples of instruction formats]
RISC (Reduced Instruction Set Computer)
versus
CISC (Complex Instruction Set Computer)
RISC vs. CISC
The underlying philosophy of RISC machines is that a system is better able to manage program execution when the program consists of only a few different instructions that are the same length and require the same number of clock cycles to decode and execute.
RISC systems access memory only with explicit load and store instructions.
In CISC systems, many different kinds of instructions access memory, making instruction length variable and fetch-decode-execute time unpredictable.
The difference between CISC and RISC becomes evident through the basic computer performance equation:
time/program = (time/cycle) x (cycles/instruction) x (instructions/program)
RISC systems shorten execution time by reducing the clock cycles per instruction.
CISC systems improve performance by reducing the number of instructions per program.
The simple instruction set of RISC machines enables control units to be hardwired for maximum speed.
The more complex, and variable, instruction set of CISC machines requires microcode-based control units that interpret instructions as they are fetched from memory. This translation takes time.
With fixed-length instructions, RISC lends itself to pipelining and speculative execution.
Consider the two program fragments:
CISC
mov ax, 10
mov bx, 5
mul bx, ax
RISC
mov ax, 0
mov bx, 10
mov cx, 5
Begin: add ax, bx
loop Begin
The total clock cycles for the CISC version might be:
(2 movs x 1 cycle) + (1 mul x 30 cycles) = 32 cycles
While the clock cycles for the RISC version are:
(3 movs x 1 cycle) + (5 adds x 1 cycle) + (5 loops x 1 cycle) = 13 cycles
With the RISC clock cycle also being shorter, RISC gives us much faster execution speeds.
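The cycle counts for the two fragments, and their link to the performance equation (time = instructions x cycles per instruction x cycle time), can be worked through in a few lines. This is a sketch under stated assumptions: one cycle for each mov, add, and loop, 30 cycles for mul, and five loop iterations are the illustrative figures, not measurements of any real processor.

```python
from fractions import Fraction

# Each entry is (number of executed instructions, cycles per instruction).
# The per-instruction cycle costs are assumed illustration values.
CISC_MIX = [(2, 1), (1, 30)]            # 2 movs, 1 mul
RISC_MIX = [(3, 1), (5, 1), (5, 1)]     # 3 movs, 5 adds, 5 loops


def total_cycles(mix):
    return sum(count * cycles for count, cycles in mix)


def instruction_count(mix):
    return sum(count for count, _ in mix)


def average_cpi(mix):
    """Average clock cycles per executed instruction, kept exact."""
    return Fraction(total_cycles(mix), instruction_count(mix))


def execution_time(mix, cycle_time):
    """time = instructions x cycles/instruction x time/cycle."""
    return instruction_count(mix) * average_cpi(mix) * cycle_time


print(total_cycles(CISC_MIX), total_cycles(RISC_MIX))   # 32 13
print(average_cpi(CISC_MIX), average_cpi(RISC_MIX))     # 32/3 1
```

The CISC fragment executes fewer instructions (3 versus 13) but at a much higher average CPI, which is exactly the trade-off the performance equation exposes.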
Because of their load-store ISAs, RISC architectures require a large number of CPU registers.
These registers provide fast access to data during sequential program execution.
They can also be employed to reduce the overhead typically caused by passing parameters to subprograms.
Instead of pulling parameters off of a stack, the subprogram is directed to use a subset of registers.
This is how registers can be overlapped in a RISC system.
The current window pointer (CWP) points to the active register window.
[Figure: overlapping register windows]
It is becoming increasingly difficult to distinguish RISC architectures from CISC architectures.
Some RISC systems provide more extravagant instruction sets than some CISC systems.
Some systems combine both approaches.
The following two slides summarize the characteristics that traditionally typify the differences between these two architectures.
RISC
- Multiple register sets.
- Three operands per instruction.
- Parameter passing through register windows.
- Single-cycle instructions.
- Hardwired control.
- Highly pipelined.
CISC
- Single register set.
- One or two register operands per instruction.
- Parameter passing through memory.
- Multiple cycle instructions.
- Microprogrammed control.
- Less pipelined.
Continued...
RISC
- Simple instructions, few in number.
- Fixed length instructions.
- Complexity in compiler.
- Only LOAD/STORE instructions access memory.
- Few addressing modes.
CISC
- Many complex instructions.
- Variable length instructions.
- Complexity in microcode.
- Many instructions can access memory.
- Many addressing modes.
Summary
Instruction Set Design Issues
Instruction set design issues include:
Where are operands stored?
- registers, memory, stack, accumulator
How many explicit operands are there?
- 0, 1, 2, or 3
How is the operand location specified?
- register, immediate, indirect, ...
What type and size of operands are supported?
- byte, int, float, double, string, vector, ...
What operations are supported?
- add, sub, mul, move, compare, ...
More About General Purpose Registers
Why do almost all new architectures use GPRs?
- Registers are much faster than memory (even cache)
  - Register values are available immediately
  - When memory isn't ready, the processor must wait ("stall")
- Registers are convenient for variable storage
  - Compiler assigns some variables just to registers
  - More compact code, since small fields specify registers (compared to memory addresses)
[Figure: storage hierarchy: processor registers, cache, memory, disk]
What Operations are Needed
Arithmetic & Logical
- Integer arithmetic: ADD, SUB, MULT, DIV, SHIFT
- Logical operation: AND, OR, NOT
Data Transfer - copy, load, store
Control - branch, jump, call, return
Floating Point - ADDF, MULF, DIVF (same as arithmetic, but usually take bigger operands)
Decimal - ADDD, CONVERT
String - move, compare, search
Graphics - pixel and vertex, compression/decompression operations
Stack Architecture: Pros and Cons
Pros
- Good code density (implicit top of stack)
- Low hardware requirements
- Easy to write a simpler compiler for stack architectures
Cons
- Stack becomes the bottleneck
- Little ability for parallelism or pipelining
- Data is not always at the top of the stack when needed, so additional instructions like TOP and SWAP are needed
- Difficult to write an optimizing compiler for stack architectures
Accumulator Architecture: Pros and Cons
Pros
- Very low hardware requirements
- Easy to design and understand
Cons
- Accumulator becomes the bottleneck
- Little ability for parallelism or pipelining
- High memory traffic
Memory-Memory Architecture: Pros and Cons
Pros
- Requires fewer instructions (especially with 3 operands)
- Easy to write compilers for (especially with 3 operands)
Cons
- Very high memory traffic (especially with 3 operands)
- Variable number of clocks per instruction
- With two operands, more data movements are required
Memory-Register Architecture: Pros and Cons
Pros
- Some data can be accessed without loading first
- Instruction format easy to encode
- Good code density
Cons
- Operands are not equivalent (poor orthogonality)
- Variable number of clocks per instruction
- May limit number of registers
Operand locations for four instruction set architecture classes
The arrows indicate whether the operand is an input or the result of the ALU operation, or both an input and result. Lighter shades indicate inputs, and the dark shade indicates the result. In (a), a Top Of Stack register (TOS) points to the top input operand, which is combined with the operand below. The first operand is removed from the stack, the result takes the place of the second operand, and TOS is updated to point to the result. All operands are implicit. In (b), the Accumulator is both an implicit input operand and a result. In (c), one input operand is a register, one is in memory, and the result goes to a register. All operands are registers in (d) and, like the stack architecture, can be transferred to memory only via separate instructions: push or pop for (a) and load or store for (d).
Code Sequence for C = A + B
Stack       Accumulator   Register-memory   Register-register
Push A      Load A        Load R1,A         Load R1,A
Push B      Add B         Add R3,R1,B       Load R2,B
Add         Store C       Store R3,C        Add R3,R1,R2
Pop C                                       Store R3,C
The code sequence for C = A + B for four classes of instruction sets. Note that the Add instruction has implicit operands for stack and accumulator architectures, and explicit operands for register architectures. It is assumed that A, B, and C all belong in memory and that the values of A and B cannot be destroyed.
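The four code sequences for C = A + B can be mimicked in a few lines each to make the implicit and explicit operands visible. This is a sketch, not a simulator of any real machine: the function names and the dict-based memory are assumptions, and each function returns the number of instructions it models.

```python
# One function per instruction set class, each computing C = A + B on a
# dict that stands in for memory. All names here are illustrative.

def stack_machine(mem):
    stack = []
    stack.append(mem["A"])                      # Push A
    stack.append(mem["B"])                      # Push B
    stack.append(stack.pop() + stack.pop())     # Add (operands implicit)
    mem["C"] = stack.pop()                      # Pop C
    return 4


def accumulator_machine(mem):
    acc = mem["A"]                              # Load A
    acc = acc + mem["B"]                        # Add B (accumulator implicit)
    mem["C"] = acc                              # Store C
    return 3


def register_memory_machine(mem):
    regs = {}
    regs["R1"] = mem["A"]                       # Load R1,A
    regs["R3"] = regs["R1"] + mem["B"]          # Add R3,R1,B
    mem["C"] = regs["R3"]                       # Store R3,C
    return 3


def load_store_machine(mem):
    regs = {}
    regs["R1"] = mem["A"]                       # Load R1,A
    regs["R2"] = mem["B"]                       # Load R2,B
    regs["R3"] = regs["R1"] + regs["R2"]        # Add R3,R1,R2
    mem["C"] = regs["R3"]                       # Store R3,C
    return 4


MACHINES = [stack_machine, accumulator_machine,
            register_memory_machine, load_store_machine]

for machine in MACHINES:
    mem = {"A": 7, "B": 5, "C": 0}
    count = machine(mem)
    print(machine.__name__, count, mem)
```

In every case A and B are left untouched in memory, matching the assumption that their values cannot be destroyed.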
Stack Architectures (figure)
Accumulator Architectures (figure)
Register-Set Architectures (figure)
Register-to-Register: Load-Store Architectures (figure)
Register-to-Memory Architectures (figure)
Memory-to-Memory Architectures (figure)
Instruction Formats (figure)
Instruction Set Architecture (ISA)
- To command a computer's hardware, you must speak its language
- The words of a machine's language are called instructions, and its vocabulary is called the instruction set
- Once you learn one machine language, it is easy to pick up others:
  - There are few fundamental operations that all computers must provide
  - All designers have the same goal of finding a language that simplifies building the hardware and the compiler while maximizing performance and minimizing cost
- Learning how instructions are represented leads to discovering the secret of computing: the stored-program concept
- The MIPS instruction set is used as a case study
Interface Design
A good interface
- Lasts through many implementations (portability, compatibility)
- Is used in many different ways (generality)
- Provides convenient functionality to higher levels
- Permits an efficient implementation at lower levels
Design decisions must take into account
- Technology
- Machine organization
- Programming languages
- Compiler technology
- Operating systems
[Figure: one interface with several uses above it and several implementations (imp 1, imp 2, imp 3) below it, over time]
Classifying Instruction Set Architectures
Accumulator Architecture
- Common in early stored-program computers, when hardware was so expensive
- Machine has only one register (accumulator) involved in all math and logical operations
- All operations assume the accumulator as a source operand and a destination for the operation, with the other operand stored in memory
Extended Accumulator Architecture
- Dedicated registers for specific operations, e.g., stack and array index registers, added
- The 8086 microprocessor is an example of such a special-purpose register architecture
General-Purpose Register Architecture
- MIPS is an example of such an architecture, where registers are not restricted to playing a single role
- This type of instruction set can be further divided into:
  - Register-memory: allows for one operand to be in memory
  - Register-register (load-store): demands all operands to be in registers
Machine           # general-purpose registers   Architecture style               Year
Motorola 6800     2                             Accumulator                      1974
DEC VAX           16                            Register-memory, memory-memory   1977
Intel 8086        1                             Extended accumulator             1978
Motorola 68000    16                            Register-memory                  1980
Intel 80386       8                             Register-memory                  1985
PowerPC           32                            Load-store                       1992
DEC Alpha         32                            Load-store                       1992
Compact Code and Stack Architectures
- When memory is scarce, machines like the Intel 80x86 had variable-length instructions to match varying operand specifications and minimize code size
- Stack machines abandoned registers altogether, arguing that it is hard for compilers to use them efficiently
- Operands are pushed onto a stack from memory, and the results are popped from the stack to memory
- Operations take their operands by default from the top of the stack and insert the results back onto the stack
- Stack machines simplify compilers and lend themselves to a compact instruction encoding, but limit compiler optimization (e.g., in math expressions)
Example: A = B + C
Push AddressC    # Top=Top+4; Stack[Top]=Memory[AddressC]
Push AddressB    # Top=Top+4; Stack[Top]=Memory[AddressB]
add              # Stack[Top-4]=Stack[Top]+Stack[Top-4]; Top=Top-4
Pop AddressA     # Memory[AddressA]=Stack[Top]; Top=Top-4
- Compact code is important for heralded network computers, where programs must be downloaded over the Internet (e.g., Java-based applications)
Other Types of Architecture
High-Level-Language Architecture
- In the 1960s, systems software was rarely written in high-level languages, and virtually every commercial operating system before Unix was written in assembly
- Some people blamed the code density on the instruction set rather than the programming language
- A machine design philosophy was advocated with the goal of making the hardware more like high-level languages
- The effectiveness of high-level languages, memory size limitation, and lack of efficient compilers doomed this philosophy to a historical footnote
Reduced Instruction Set Architecture
- With the recent development in compiler technology and expanded memory sizes, fewer programmers are using assembly-level coding
- Instruction set architecture became measurable in the way compilers, rather than programmers, use them
- RISC architecture favors simplifying hardware design over enriching the offered set of instructions, relying on compilers to effectively use them to perform complex operations
- Virtually all new architectures since 1982 follow the RISC philosophy of fixed instruction lengths, load-store operations, and limited addressing modes
Evolution of Instruction Sets
[Figure: evolution of instruction sets, from top to bottom]
Single Accumulator (EDSAC)
Accumulator + Index Registers (Manchester Mark I, IBM 700 series)
Separation of Programming Model from Implementation
- High-level Language Based
- Concept of a Family
General Purpose Register Machines
- Complex Instruction Sets (VAX, Intel 432)
- Load/Store Architecture (CDC 6600, Cray 1)
RISC (MIPS, SPARC, IBM RS6000, ...)
Register-Memory Architectures
Effect of the number of memory operands

Number of memory addresses   Max. number of operands   Examples
0                            3                         SPARC, MIPS, PowerPC, Alpha
1                            2                         Intel 80x86, Motorola 68000
2                            2                         VAX (also has 3-operand formats)
3                            3                         VAX (also has 2-operand formats)
Memory Address
Interpreting memory addressing
- The address of a word matches the byte address of one of its 4 bytes
- The addresses of sequential words differ by 4 (the word size in bytes)
- Word addresses are multiples of 4 (alignment restriction)
- Machines that use the address of the leftmost byte as the word address are called "Big Endian", and those that use the rightmost byte are called "Little Endian"
- Misalignment complicates memory access and causes programs to run slower (some machines do not allow misaligned memory access at all)
- Byte ordering can be a problem when exchanging data among different machines
- Byte addresses affect array index calculation, which must account for word addressing and the offset within the word
Object addressed   Aligned at byte offsets   Misaligned at byte offsets
Byte               0, 1, 2, 3, 4, 5, 6, 7    Never
Half word          0, 2, 4, 6                1, 3, 5, 7
Word               0, 4                      1, 2, 3, 5, 6, 7
Double word        0                         1, 2, 3, 4, 5, 6, 7
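The alignment table and the two byte orders can be reproduced with a few lines. In this sketch the helper names and the sample value 0x0A0B0C0D are assumptions for illustration, and Python's standard `struct` module is used only to lay out a 32-bit word in each byte order.

```python
import struct

def is_aligned(address, size):
    """An object of `size` bytes is aligned when its address is a multiple of size."""
    return address % size == 0


def aligned_offsets(size, span=8):
    """Byte offsets within a `span`-byte block at which the object is aligned."""
    return [offset for offset in range(span) if is_aligned(offset, size)]


def element_address(base, index, size):
    """Byte address of element `index` in an array of `size`-byte objects."""
    return base + index * size


for name, size in [("byte", 1), ("half word", 2), ("word", 4), ("double word", 8)]:
    print(name, aligned_offsets(size))

# The same 32-bit word stored in both byte orders.
value = 0x0A0B0C0D
big = struct.pack(">I", value)       # most significant byte at lowest address
little = struct.pack("<I", value)    # least significant byte at lowest address
print(big.hex(), little.hex())       # 0a0b0c0d 0d0c0b0a

# Reading big-endian bytes as little-endian scrambles the value, which is
# the data-exchange problem between machines of different byte order.
misread = struct.unpack("<I", big)[0]
print(hex(misread))                  # 0xd0c0b0a
```

Consecutive word elements land 4 bytes apart, so `element_address(base, i, 4)` stays aligned whenever the array base itself is word aligned.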
Addressing Modes
- Addressing modes refer to how to specify the location of an operand (effective address)
- Addressing modes have the ability to:
  - Significantly reduce instruction counts
  - Increase the average CPI
  - Increase the complexity of building a machine
- The VAX machine is used for benchmark data since it supports a wide range of memory addressing modes
- Well-known addressing modes can be classified based on:
  - the source of the data, into register, immediate, or memory
  - the address calculation, into direct and indirect
- An indexed addressing mode is usually provided to allow efficient implementation of loops and array access
Example of Addressing Modes

Register
  Example: Add R4,R3
  Meaning: Regs[R4] <- Regs[R4] + Regs[R3]
  When used: When a value is in a register
Immediate
  Example: Add R4,#3
  Meaning: Regs[R4] <- Regs[R4] + 3
  When used: For constants
Displacement
  Example: Add R4,100(R1)
  Meaning: Regs[R4] <- Regs[R4] + Mem[100 + Regs[R1]]
  When used: Accessing local variables
Register indirect
  Example: Add R4,(R1)
  Meaning: Regs[R4] <- Regs[R4] + Mem[Regs[R1]]
  When used: Accessing using a pointer or a computed address
Indexed
  Example: Add R4,(R1 + R2)
  Meaning: Regs[R4] <- Regs[R4] + Mem[Regs[R1] + Regs[R2]]
  When used: Sometimes useful in array addressing: R1 = base of the array; R2 = index amount
Direct or absolute
  Example: Add R4,(1001)
  Meaning: Regs[R4] <- Regs[R4] + Mem[1001]
  When used: Sometimes useful for accessing static data; address constant may need to be large
Memory indirect or memory deferred
  Example: Add R4,@(R3)
  Meaning: Regs[R4] <- Regs[R4] + Mem[Mem[Regs[R3]]]
  When used: If R3 is the address of the pointer p, then the mode yields *p
Autoincrement
  Example: Add R4,(R2)+
  Meaning: Regs[R4] <- Regs[R4] + Mem[Regs[R2]]; Regs[R2] <- Regs[R2] + d
  When used: Useful for stepping through arrays within a loop. R2 points to the start of the array; each reference increments R2 by d
Autodecrement
  Example: Add R4,-(R2)
  Meaning: Regs[R2] <- Regs[R2] - d; Regs[R4] <- Regs[R4] + Mem[Regs[R2]]
  When used: Same use as autoincrement. Autodecrement/increment can also act as push/pop to implement a stack
Scaled
  Example: Add R4,100(R2)[R3]
  Meaning: Regs[R4] <- Regs[R4] + Mem[100 + Regs[R2] + Regs[R3] x d]
  When used: Used to index arrays
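The "Meaning" entries for the memory addressing modes all reduce to computing an effective address. The sketch below models that calculation; the function name, keyword parameters, and the sample register and memory contents are assumptions for illustration, and `d` stands for the operand size in bytes.

```python
# Effective-address calculation for the memory addressing modes.
# regs maps register names to values; mem maps addresses to stored words.

def effective_address(mode, regs, mem, base=None, index=None, disp=0, d=4):
    if mode == "displacement":           # disp(Rbase)
        return disp + regs[base]
    if mode == "register_indirect":      # (Rbase)
        return regs[base]
    if mode == "indexed":                # (Rbase + Rindex)
        return regs[base] + regs[index]
    if mode == "direct":                 # (disp)
        return disp
    if mode == "memory_indirect":        # @(Rbase): address is read from memory
        return mem[regs[base]]
    if mode == "autoincrement":          # (Rbase)+ : use address, then add d
        address = regs[base]
        regs[base] += d
        return address
    if mode == "autodecrement":          # -(Rbase) : subtract d, then use address
        regs[base] -= d
        return regs[base]
    if mode == "scaled":                 # disp(Rbase)[Rindex]
        return disp + regs[base] + regs[index] * d
    raise ValueError(f"unknown addressing mode: {mode}")


regs = {"R1": 200, "R2": 8, "R3": 3}
mem = {200: 1000}

print(effective_address("displacement", regs, mem, base="R1", disp=100))   # 300
print(effective_address("indexed", regs, mem, base="R1", index="R2"))      # 208
print(effective_address("memory_indirect", regs, mem, base="R1"))          # 1000
print(effective_address("scaled", regs, mem, base="R2", index="R3", disp=100))  # 120
```

Autoincrement followed by autodecrement on the same register returns the same address twice, which is why the pair can serve as pop and push for a stack.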
Addressing Mode for Signal Processing
- DSP offers special addressing modes to better serve popular algorithms
Modulo addressing
- Since DSP deals with continuous data streams, circular buffers are widely used
- Circular or modulo addressing allows automatic increment and decrement, and resets the pointer when reaching the end of the buffer
Reverse addressing
- The resulting address is the bit-reversed order of the current address
- Reverse addressing mode expedites the access, which otherwise requires a number of logical instructions or extra memory accesses
Fast Fourier Transform
0 (000) -> 0 (000)
1 (001) -> 4 (100)
2 (010) -> 2 (010)
3 (011) -> 6 (110)
4 (100) -> 1 (001)
5 (101) -> 5 (101)
6 (110) -> 3 (011)
7 (111) -> 7 (111)
- Special features require either hand coding or a compiler that uses such features (GNU would not be a good choice)
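Both DSP addressing modes are short address computations that hardware performs for free but software must spell out. The sketch below shows them in plain code; the function names, the buffer base 0x100, and the buffer length 8 are assumptions for illustration.

```python
# Software equivalents of two DSP addressing modes (illustrative names).

def modulo_step(pointer, step, base, length):
    """Advance a pointer inside a circular buffer, wrapping at either end."""
    return base + (pointer - base + step) % length


def bit_reverse(index, bits):
    """Reverse the lowest `bits` bits of index, as used to reorder FFT data."""
    reversed_index = 0
    for _ in range(bits):
        reversed_index = (reversed_index << 1) | (index & 1)
        index >>= 1
    return reversed_index


# Circular buffer of 8 entries starting at address 0x100.
print(hex(modulo_step(0x107, +1, 0x100, 8)))   # wraps forward to 0x100
print(hex(modulo_step(0x100, -1, 0x100, 8)))   # wraps backward to 0x107

# Access order for an 8-point FFT.
print([bit_reverse(i, 3) for i in range(8)])   # [0, 4, 2, 6, 1, 5, 3, 7]
```

The loop in `bit_reverse` is the "number of logical instructions" that a hardware reverse addressing mode avoids on every access.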
Operations of the Computer Hardware
"There must certainly be instructions for performing the fundamental arithmetic operations."
Burks, Goldstine, and von Neumann, 1947
- Assembly language is a symbolic representation of what the processor actually understands
- The MIPS assembler allows only one instruction per line and ignores comments following # until the end of the line
Example:
Translation of a segment of a C program to MIPS assembly instructions
C: f = (g + h) - (i + j)
MIPS:
add t0, g, h     # temp variable t0 contains "g + h"
add t1, i, j     # temp variable t1 contains "i + j"
sub f, t0, t1    # f = t0 - t1 = (g + h) - (i + j)
Operations in the Instruction Set

Operator type            Examples
Arithmetic and logical   Integer arithmetic and logical operations: add, and, subtract, or
Data transfer            Loads-stores (move instructions on machines with memory addressing)
Control                  Branch, jump, procedure call and return, trap
System                   Operating system call, virtual memory management instructions
Floating point           Floating point instructions: add, multiply
Decimal                  Decimal add, decimal multiply, decimal to character conversion
String                   String move, string compare, string search
Graphics                 Pixel operations, compression/decompression operations
- Arithmetic, logical, data transfer, and control are almost standard categories for all machines
- System instructions are required for multi-programming environments, although support for system functions varies
- Decimal and string instructions can be primitives, e.g., on the IBM 360 and the VAX
- Support for floating point, decimal, string, and graphics can be optional, sometimes provided via a co-processor
- Some machines rely on the compiler to synthesize special operations, such as string handling, from simpler instructions
Operations for Media & Signal Processing
- Single instruction multiple data (SIMD) and vector instructions are often supported in DSPs, which are commonly used in multimedia and signal processing applications
- Partitioned add (integer)
  - Performs multiple 16-bit additions on a 64-bit ALU, since most data are narrow
  - Increases ALU throughput for multimedia applications
- Paired single operations (float)
  - Allow the same register to act as two operands to the same operation
  - Handy in dealing with vertices and coordinates
- Multiply and accumulate
  - Very handy for calculating dot products of vectors (signal processing) and matrix multiplication
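A partitioned add and a multiply-accumulate loop can both be modelled with ordinary integers. This is a sketch under stated assumptions: four 16-bit lanes packed into one 64-bit word, wraparound on lane overflow (real media instruction sets often offer saturating variants too), and helper names invented for illustration.

```python
LANES = 4
LANE_BITS = 16
LANE_MASK = (1 << LANE_BITS) - 1


def pack(values):
    """Pack four 16-bit values into one 64-bit word, lane 0 in the low bits."""
    word = 0
    for lane, value in enumerate(values):
        word |= (value & LANE_MASK) << (lane * LANE_BITS)
    return word


def unpack(word):
    return [(word >> (lane * LANE_BITS)) & LANE_MASK for lane in range(LANES)]


def partitioned_add(x, y):
    """Add lane by lane; a carry out of one lane never reaches the next."""
    return pack([(a + b) & LANE_MASK for a, b in zip(unpack(x), unpack(y))])


def dot_product(xs, ys):
    """Each loop iteration is one multiply-accumulate step."""
    accumulator = 0
    for x, y in zip(xs, ys):
        accumulator += x * y
    return accumulator


total = partitioned_add(pack([1, 2, 3, 0xFFFF]), pack([10, 20, 30, 1]))
print(unpack(total))                        # [11, 22, 33, 0]
print(dot_product([1, 2, 3], [4, 5, 6]))    # 32
```

The last lane shows the point of partitioning: 0xFFFF + 1 wraps to 0 inside its own lane instead of carrying into its neighbour, as a plain 64-bit add would.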
Frequency of Operations Usage
- The most widely executed instructions are the simple operations of an instruction set
- The following is the average usage in SPECint92 on the Intel 80x86

Rank    80x86 instruction         Integer average (% total executed)
1       Load                      22%
2       Conditional branch        20%
3       Compare                   16%
4       Store                     12%
5       Add                       8%
6       And                       6%
7       Sub                       5%
8       Move register-register    4%
9       Call                      1%
10      Return                    1%
Total                             96%

- Make the common case fast by focusing on these operations
Control Flow Instructions
- Jump for unconditional change in the control flow
- Branch for conditional change in the control flow
- Procedure calls and returns
Data is based on SPEC on Alpha
Destination Address Definition
- Relative addressing with respect to the program counter proved to be the best choice for forward and backward branching or jumps (load address independent)
- To allow for dynamic loading of library routines, register indirect addressing allows addresses to be loaded in special registers (e.g., virtual functions in C++ and system calls in a case statement)
Data is based on SPEC on Alpha
Condition Evaluation
- Compare-and-branch can be efficient if the majority of conditions are comparisons with zero
- Remember to focus on the common case
Based on SPEC on MIPS
Frequency of Types of Comparison
Data is based on SPEC on Alpha
- Different benchmarks and machines set new design priorities
- DSPs support a repeat instruction for "for" loops (vectors) using registers
Supporting Procedures
- Execution of a procedure follows these steps:
  - Store parameters in a place accessible to the procedure
  - Transfer control to the procedure
  - Acquire the storage resources needed for the procedure
  - Perform the desired task
  - Store the result value in a place accessible to the calling program
  - Return control to the point of origin
- The hardware provides a program counter to trace instruction flow and manage transfer of control
- Parameter passing
  - Registers can be used for passing a small number of parameters
  - A stack is used to spill registers of the current context and make room for the called procedure to run, and to allow large parameters to be passed
- Storage of machine state can be performed by the caller or the callee
- Handling of shared variables is important to ensure correct semantics, and thus requires clear specifications in the library interface
- Global variables stored in registers need careful handling
Type and Size of Operands
The type of an operand is designated by encoding it in the instruction's operation code
The type of an operand, e.g. single-precision float, effectively gives its size
Common operand types include character, half-word and word-size integer, and single- and double-precision floating point
Characters are almost always in ASCII, integers are in 2's complement, and floating point is in IEEE 754
The 16-bit Unicode used in Java is gaining popularity due to its support for international character sets
For business applications, some architectures support a decimal format in binary coded decimal (BCD)
Depending on the size of the word, the complexity of handling different operand types differs
DSPs offer fixed-point data types to support high-precision floating-point arithmetic and to allow sharing a single exponent among multiple numbers
For graphics applications, vertex and pixel operands are added features
Size of Operands
The double-word data type is used for double-precision floating-point operations and for address storage in machines with a 64-bit-wide address bus
Words are used for integer operations and for machines with a 32-bit address bus
Because of the mix in SPEC, word and double-word data types dominate
Frequency of reference by size, based on SPEC2000 on Alpha
Instruction Representation
Humans are taught to think in base 10 (decimal), but numbers may be represented in any base (123 in base 10 = 1111011 in binary, or base 2)
Numbers are stored in computers as a series of high and low electronic signals (binary numbers)
Binary digits are called bits and are considered the atom of computing
Each piece of an instruction is a number, and placing these numbers together forms the instruction
The assembler translates symbolic assembly instructions into machine language instructions (machine code)
Example:
Assembly: add $t0, $s1, $s2
Machine language (decimal):
0 | 17 | 18 | 8 | 0 | 32
Machine language (binary):
000000 | 10001 | 10010 | 01000 | 00000 | 100000
6 bits | 5 bits | 5 bits | 5 bits | 5 bits | 6 bits
Note: by default the MIPS compiler maps $s0..$s7 to registers 16-23 and $t0..$t7 to registers 8-15
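The packing of instruction fields into one machine word can be sketched in a few lines. This is only an illustration, assuming the standard MIPS R-format field widths (6, 5, 5, 5, 5, 6 bits) and taking add $t0, $s1, $s2 with the default register numbering as the instruction; the function name encode_r_format is hypothetical.

```python
def encode_r_format(op, rs, rt, rd, shamt, funct):
    """Pack the six R-format fields into a single 32-bit word."""
    # (value, width in bits), listed from the most significant field down
    fields = [(op, 6), (rs, 5), (rt, 5), (rd, 5), (shamt, 5), (funct, 6)]
    word = 0
    for value, width in fields:
        if not 0 <= value < (1 << width):
            raise ValueError(f"value {value} does not fit in {width} bits")
        word = (word << width) | value
    return word


# add $t0, $s1, $s2: rs = $s1 (17), rt = $s2 (18), rd = $t0 (8), funct = 32
word = encode_r_format(op=0, rs=17, rt=18, rd=8, shamt=0, funct=32)
print(f"{word:032b}")
print(f"0x{word:08x}")
```

Placing the numbers side by side is all the encoding does; the assembler's remaining work is looking up the opcode, function code and register numbers.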
Encoding an Instruction Set
Instruction encoding affects the size of the compiled program and the complexity of the CPU implementation
The operation is typically specified in one field, called the opcode
The addressing mode for an operand can be encoded with the operation, or specified through a separate identifier when a large number of modes is supported
The architecture must balance several competing factors:
- Desire to support as many registers and addressing modes as possible
- Effect of operand specification on the size of the instruction (and of the program)
- Desire to simplify instruction fetching and decoding during execution
Fixed-size instruction encoding simplifies the CPU design while limiting the addressing modes supported
An architect who cares about code size can use variable-size encoding
A hybrid approach allows variability by supporting multiple instruction sizes
Encoding Examples
MIPS Instruction Format
Register-format instructions
op | rs | rt | rd | shamt | funct
6 bits | 5 bits | 5 bits | 5 bits | 5 bits | 6 bits
op: Basic operation of the instruction, traditionally called the opcode
rs: The first register source operand
rt: The second register source operand
rd: The register destination operand; it gets the result of the operation
shamt: Shift amount
funct: This field selects the specific variant of the operation in the op field
Immediate-type instructions
op | rs | rt | address
6 bits | 5 bits | 5 bits | 16 bits
Some instructions need longer fields than those provided above, for example to hold a large constant
The 16-bit address means a load word instruction can load a word within a region of ±2^15 bytes of the address in the base register
Example: lw $t0, 32($s3)   # Temporary register $t0 gets A[8]
Instruction | Format | op | rs | rt | rd | shamt | funct | address
add | R | 0 | reg | reg | reg | 0 | 32 | N/A
sub | R | 0 | reg | reg | reg | 0 | 34 | N/A
lw | I | 35 | reg | reg | N/A | N/A | N/A | address
sw | I | 43 | reg | reg | N/A | N/A | N/A | address
The Stored Program Concept
Learning how instructions are represented leads to discovering the secret of computing: the stored-program concept
Today's computers are built on two key principles:
Instructions are represented as numbers
Programs can be stored in memory to be read or written just like numbers
The power of the concept:
memory can contain:
the source code for an editor
the compiled machine code for the editor
the text that the compiled program is using
the compiler that generated the code
[Figure: a processor connected to a memory holding an accounting program (machine code), an editor program (machine code), a C compiler (machine code), payroll data, book text, and the source code in C for the editor program]
Compiling if-then-else in MIPS
Assuming the five variables f, g, h, i, and j correspond to the five registers $s0 through $s4, what is the compiled MIPS code for the following C if statement?
if (i == j) f = g + h; else f = g - h;
[Figure: flowchart testing i == j; when i = j the path computes f = g + h, when i != j the Else: path computes f = g - h, and both paths meet at Exit:]
MIPS:
bne $s3, $s4, Else        # go to Else if i != j
add $s0, $s1, $s2         # f = g + h (skipped if i != j)
j Exit
Else: sub $s0, $s1, $s2   # f = g - h (skipped if i = j)
Exit:
Typical Compilation
Major Types of Optimization
Optimization name | Explanation | Frequency
High-level | At or near source level; machine-independent |
Procedure integration | Replace procedure call by procedure body | N.M.
Local | Within straight-line code |
Common subexpression elimination | Replace two instances of the same computation by a single copy |
Constant propagation | Replace all instances of a variable that is assigned a constant with the constant |
Stack height reduction | Rearrange expression tree to minimize resources needed for expression evaluation | N.M.
Global | Across a branch |
Global common subexpression elimination | Same as local, but this version crosses branches |
Copy propagation | Replace all instances of a variable A that has been assigned X (i.e. A = X) with X |
Code motion | Remove code from a loop that computes the same value each iteration of the loop |
Induction variable elimination | Simplify/eliminate array addressing calculations within loops |
Machine-dependent | Depends on machine knowledge |
Strength reduction | Many examples, such as replacing a multiply by a constant with adds and shifts | N.M.
Pipeline scheduling | Reorder instructions to improve pipeline performance | N.M.
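Strength reduction, listed above as replacing a multiply by a constant with adds and shifts, can be illustrated with a small sketch. The constant 10 and the function name times_ten_shift_add are chosen here purely for illustration; a real compiler picks the decomposition from the constant and the target machine's costs.

```python
def times_ten_shift_add(x):
    """Compute 10 * x without a multiply: 10x = 8x + 2x."""
    return (x << 3) + (x << 1)


for x in (0, 1, 7, 123):
    # Each line shows the shift-and-add result next to the plain multiply
    print(x, times_ten_shift_add(x), 10 * x)
```

The rewrite pays off only on machines where two shifts and an add are cheaper than one multiply, which is why the table classes it as machine-dependent.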
Effect of Compiler Optimization
[Figure: effect of compiler optimization level, by program and compiler optimization level]
Level 0: non-optimized code
Level 1: local optimization
Level 2: global optimization, s/w pipelining
Level 3: adds procedure integration
Compiler Support for Multimedia Instr.
Intel's MMX and PowerPC AltiVec have small vector-processing capabilities targeting multimedia applications (to speed up graphics)
Intel added a new set of instructions called Streaming SIMD Extension
A major advantage of vector computers is hiding the latency of memory access by loading multiple elements and then overlapping execution with data transfer
Vector computers typically have strided and/or gather/scatter addressing to perform operations on distant memory locations
Strided addressing allows memory access in increments larger than one
Gather/scatter addressing is similar to register indirect mode, where the addresses are stored instead of the data
Supporting vector operations without strided addressing, as in Intel's MMX, limits the potential speedup
Such limited support for vector processing makes vectorizing compiler optimizations unpopular and restricts their scope to hand-coded routines
SIMD instructions on MMX and AltiVec tend to be solutions, not primitives
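The two vector addressing styles named above can be modeled on a plain list standing in for memory. This is a minimal sketch with hypothetical helper names (strided_load, gather, scatter); it shows only which elements are touched, not how vector hardware overlaps the accesses.

```python
def strided_load(memory, base, stride, count):
    """Read `count` elements starting at `base`, stepping by `stride`."""
    return [memory[base + i * stride] for i in range(count)]


def gather(memory, indices):
    """Read the elements whose addresses are listed in `indices`."""
    return [memory[i] for i in indices]


def scatter(memory, indices, values):
    """Write each value to the address listed at the same position in `indices`."""
    for i, v in zip(indices, values):
        memory[i] = v


mem = list(range(100, 116))          # 16 words of toy memory
print(strided_load(mem, 1, 4, 3))    # every fourth word, starting at address 1
print(gather(mem, [15, 0, 7]))       # words at arbitrary addresses
```

Gather and scatter take a vector of addresses rather than a vector of data, which is the sense in which they resemble register indirect mode.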
Starting a Program
[Figure: C program -> Compiler -> Assembly language program -> Assembler -> Object: machine language module, combined with Object: library routine (machine language) -> Linker -> Executable: machine language program -> Loader -> Memory]
Linker
- Places code and data modules symbolically in memory
- Determines the addresses of data and instruction labels
- Patches both internal and external references
Object files for Unix typically contain:
Header: size and position of components
Text segment: machine code
Data segment: static and dynamic variables
Relocation info: identifies absolute memory references
Symbol table: names and locations of labels, procedures and variables
Debugging info: mapping of source to object code, breakpoints, etc.
Loading Executable Program
[Figure: memory layout]
$sp -> 7fff ffff hex | Stack
 | Dynamic data
$gp -> 1000 8000 hex |
1000 0000 hex | Static data
pc -> 0040 0000 hex | Text
0 | Reserved
To load an executable, the operating system follows these steps:
Reads the executable file header to determine the size of the text and data segments
Creates an address space large enough for the text and data
Copies the instructions and data from the executable file into memory
Copies the parameters (if any) to the main program onto the stack
Initializes the machine registers and sets the stack pointer to the first free location
Jumps to a start-up routine that copies the parameters into the argument registers and calls the main routine of the program
Instruction Set Design Issues
Number of Addresses
Flow of Control
Operand Types
Addressing Modes
Instruction Types
Instruction Formats
Number of Addresses
Four categories
3-address machines
- 2 for the source operands and one for the result
2-address machines
- One address doubles as source and result
1-address machines
- Accumulator machines
- Accumulator is used for one source and result
0-address machines
- Stack machines
- Operands are taken from the stack
- Result goes onto the stack
Number of Addresses (cont.)
Three-address machines
Two for the source operands, one for the result
RISC processors use three addresses
Sample instructions
add dest,src1,src2     ; M(dest) = [src1] + [src2]
sub dest,src1,src2     ; M(dest) = [src1] - [src2]
mult dest,src1,src2    ; M(dest) = [src1] * [src2]
Three addresses: Operand 1, Operand 2, Result
Example: a = b + c
Three-address instruction formats are not common, because they require a relatively long instruction format to hold the three address references
Number of Addresses (cont.)
Example
C statement
A = B + C * D - E + F + A
Equivalent code:
mult T,C,D    ; T = C*D
add T,T,B     ; T = B + C*D
sub T,T,E     ; T = B + C*D - E
add T,T,F     ; T = B + C*D - E + F
add A,T,A     ; A = B + C*D - E + F + A
Number of Addresses (cont.)
Two-address machines
One address doubles (for source operand and result)
Last example makes a case for it
- Address T is used twice
Sample instructions
load dest,src    ; M(dest) = [src]
add dest,src     ; M(dest) = [dest] + [src]
sub dest,src     ; M(dest) = [dest] - [src]
mult dest,src    ; M(dest) = [dest] * [src]
Two addresses: one address doubles as operand and result
Example: a = a + b
The two-address format reduces the space requirement but also introduces some awkwardness. To avoid altering the value of an operand, a MOVE instruction is used to move one of the values to a result or temporary location before performing the operation
Number of Addresses (cont.)
Example
C statement
A = B + C * D - E + F + A
Equivalent code:
load T,C    ; T = C
mult T,D    ; T = C*D
add T,B     ; T = B + C*D
sub T,E     ; T = B + C*D - E
add T,F     ; T = B + C*D - E + F
add A,T     ; A = B + C*D - E + F + A
Number of Addresses (cont.)
One-address machines
Use a special set of registers called accumulators
- Specify one source operand and receive the result
Called accumulator machines
Sample instructions
load addr     ; accum = [addr]
store addr    ; M[addr] = accum
add addr      ; accum = accum + [addr]
sub addr      ; accum = accum - [addr]
mult addr     ; accum = accum * [addr]
Number of Addresses (cont.)
Example
C statement
A = B + C * D - E + F + A
Equivalent code:
load C     ; load C into accum
mult D     ; accum = C*D
add B      ; accum = C*D + B
sub E      ; accum = B + C*D - E
add F      ; accum = B + C*D - E + F
add A      ; accum = B + C*D - E + F + A
store A    ; store accum contents in A
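The one-address code above can be checked with a toy accumulator machine. This is a sketch under stated assumptions: memory is a dictionary keyed by variable name, the function run_accumulator is hypothetical, and the sample values for A through F are arbitrary.

```python
def run_accumulator(program, memory):
    """Execute (opcode, address) pairs on a single-accumulator machine."""
    accum = 0
    for op, addr in program:
        if op == "load":
            accum = memory[addr]
        elif op == "store":
            memory[addr] = accum
        elif op == "add":
            accum += memory[addr]
        elif op == "sub":
            accum -= memory[addr]
        elif op == "mult":
            accum *= memory[addr]
        else:
            raise ValueError(f"unknown opcode: {op}")
    return memory


# A = B + C * D - E + F + A, in the same order as the one-address listing
program = [("load", "C"), ("mult", "D"), ("add", "B"), ("sub", "E"),
           ("add", "F"), ("add", "A"), ("store", "A")]
memory = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
print(run_accumulator(program, memory)["A"])
```

Every instruction except the first touches memory for its second operand, which is the source of the high memory traffic usually attributed to accumulator designs.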
Number of Addresses (cont.)
Zero-address machines
Stack supplies operands and receives the result
- Special instructions to load and store use an address
Called stack machines (Ex: HP3000, Burroughs B5500)
Sample instructions
push addr    ; push([addr])
pop addr     ; pop([addr])
add          ; push(pop + pop)
sub          ; push(pop - pop)
mult         ; push(pop * pop)
Number of Addresses (cont.)
Example
C statement
A = B + C * D - E + F + A
Equivalent code:
push E
push C
push D
mult
push B
add
sub
push F
add
push A
add
pop A
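The zero-address code can be traced with a toy stack machine. This is a sketch, not a model of any real processor: the function run_stack_machine is hypothetical, memory is a dictionary keyed by variable name, and sub follows the slide's definition push(pop - pop), so the first value popped (the top of the stack) is the left operand.

```python
def run_stack_machine(program, memory):
    """Execute zero-address code; only push and pop carry an address."""
    stack = []
    for instr in program:
        op = instr[0]
        if op == "push":
            stack.append(memory[instr[1]])
        elif op == "pop":
            memory[instr[1]] = stack.pop()
        elif op in ("add", "sub", "mult"):
            first = stack.pop()    # top of stack
            second = stack.pop()   # element below it
            if op == "add":
                stack.append(first + second)
            elif op == "sub":
                stack.append(first - second)
            else:
                stack.append(first * second)
        else:
            raise ValueError(f"unknown opcode: {op}")
    return memory


# A = B + C * D - E + F + A; E is pushed first so that sub sees (B + C*D) on top
program = [("push", "E"), ("push", "C"), ("push", "D"), ("mult",),
           ("push", "B"), ("add",), ("sub",), ("push", "F"), ("add",),
           ("push", "A"), ("add",), ("pop", "A")]
memory = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
print(run_stack_machine(program, memory)["A"])
```

Pushing E before anything else is what makes the subtraction come out in the right order, which shows how operand ordering on the stack becomes the compiler's job.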
Load/Store Architecture
Instructions expect operands in internal processor registers
Special LOAD and STORE instructions move data between registers and memory
RISC uses this architecture
Reduces instruction length
Load/Store Architecture (cont.)
Sample instructions
load Rd,addr       ; Rd = [addr]
store addr,Rs      ; (addr) = Rs
add Rd,Rs1,Rs2     ; Rd = Rs1 + Rs2
sub Rd,Rs1,Rs2     ; Rd = Rs1 - Rs2
mult Rd,Rs1,Rs2    ; Rd = Rs1 * Rs2
Number of Addresses (cont.)
Example
C statement
A = B + C * D - E + F + A
Equivalent code:
load R1,B
load R2,C
load R3,D
load R4,E
load R5,F
load R6,A
mult R2,R2,R3
add R2,R2,R1
sub R2,R2,R4
add R2,R2,R5
add R2,R2,R6
store A,R2
Flow of Control
Default is sequential flow
Several instructions alter this default execution
Branches
- Unconditional
- Conditional
- Delayed branches
Procedure calls
- Delayed procedure calls
Flow of Control (cont.)
Branches
Unconditional
- Absolute address
- PC-relative
  - Target address is specified relative to PC contents
  - Relocatable code
Example: MIPS
- Absolute address
  j target
- PC-relative
  b target
Flow of Control (cont.)
Branches
Conditional
- Jump is taken only if the condition is met
Two types
- Set-Then-Jump
  - Condition testing is separated from branching
  - Condition code registers are used to convey the condition test result
  - Condition code registers keep a record of the status of the last ALU operation, such as an overflow condition
- Example: Pentium code
  cmp AX,BX     ; compare AX and BX
  je target     ; jump if equal
Flow of Control (cont.)
- Test-and-Jump
  - Single instruction performs condition testing and branching
- Example: MIPS instruction
  beq Rsrc1,Rsrc2,target
  Jumps to target if Rsrc1 = Rsrc2
Delayed branching
Control is transferred after executing the instruction that follows the branch instruction
- This instruction slot is called the delay slot
Improves efficiency
Highly pipelined RISC processors support it
Flow of Control (cont.)
Procedure calls
Facilitate modular programming
Require two pieces of information to return
- End of procedure
  - Pentium uses the ret instruction
  - MIPS uses the jr instruction
- Return address
  - In a (special) register
    MIPS allows any general-purpose register
  - On the stack
    Pentium
Flow of Control (cont.)
[Figure: delay slot]
Parameter Passing
Two basic techniques
Register-based (e.g. PowerPC, MIPS)
- Internal registers are used
  - Faster
  - Limit the number of parameters
  - Recursive procedure
Stack-based (e.g. Pentium)
- Stack is used
  - More general
2 Operand Types
Instructions support basic data types
Characters
Integers
Floating-point
Instruction overload
Same instruction for different data types
Example: Pentium
mov AL,address     ; loads an 8-bit value
mov AX,address     ; loads a 16-bit value
mov EAX,address    ; loads a 32-bit value
Operand Types
Separate instructions
Instructions specify the operand size
Example: MIPS
lb Rdest,address    ; loads a byte
lh Rdest,address    ; loads a halfword (16 bits)
lw Rdest,address    ; loads a word (32 bits)
ld Rdest,address    ; loads a doubleword (64 bits)
Similar instructions for store
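The effect of choosing the operand size in the opcode can be mimicked with Python's struct module. This is a hedged sketch: the helper name load is hypothetical, little-endian byte order is an assumption made for the example (MIPS can run in either byte order), and all loads are treated as signed.

```python
import struct

# struct format codes for signed integers of 1, 2, 4 and 8 bytes,
# standing in for lb, lh, lw and ld respectively
FORMATS = {1: "<b", 2: "<h", 4: "<i", 8: "<q"}


def load(memory, address, size):
    """Read a signed little-endian integer of `size` bytes at `address`."""
    (value,) = struct.unpack_from(FORMATS[size], memory, address)
    return value


memory = bytes(range(1, 17))            # bytes 0x01 .. 0x10
print(hex(load(memory, 0, 1)))          # byte
print(hex(load(memory, 0, 2)))          # halfword
print(hex(load(memory, 0, 4)))          # word
print(hex(load(memory, 0, 8)))          # doubleword
```

The same address yields four different values, so the instruction, not the data in memory, decides how many bytes make up the operand.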
3 Addressing Modes
How the operands are specified
Operands can be in three places
- Registers
  - Register addressing mode
- Part of instruction
  - Constant
  - Immediate addressing mode
  - All processors support these two addressing modes
- Memory
  - Difference between RISC and CISC
  - CISC supports a large variety of addressing modes
  - RISC follows the load/store architecture
4 Instruction Types
Several types of instructions
Data movement
- Pentium: mov dest,src
- Some do not provide direct data movement instructions
- Indirect data movement
  add Rdest,Rsrc,0    ; Rdest = Rsrc+0
Arithmetic and Logical
- Arithmetic
  - Integer and floating-point, signed and unsigned
  - add, subtract, multiply, divide
- Logical
  - and, or, not, xor
Instruction Types (cont.)
Condition code bits
S: Sign bit (0 = +, 1 = -)
Z: Zero bit (0 = nonzero, 1 = zero)
O: Overflow bit (0 = no overflow, 1 = overflow)
C: Carry bit (0 = no carry, 1 = carry)
Example: Pentium
cmp count,25    ; compare count to 25
                ; subtract 25 from count
je target       ; jump if equal
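How the four condition code bits might be derived can be sketched for an 8-bit addition. This is an illustration only: the function add8_flags is hypothetical, the 8-bit width is chosen to keep the numbers small, and the subtraction performed by cmp (where the carry bit acts as a borrow on the Pentium) is not modeled.

```python
def add8_flags(a, b):
    """Add two 8-bit values; return the 8-bit result and the S, Z, O, C bits."""
    total = a + b
    result = total & 0xFF
    flags = {
        "S": (result >> 7) & 1,          # sign: top bit of the result
        "Z": int(result == 0),           # zero: result is all zeros
        "C": int(total > 0xFF),          # carry out of bit 7
        # signed overflow: both operands differ in sign from the result
        "O": int(((a ^ result) & (b ^ result) & 0x80) != 0),
    }
    return result, flags


print(add8_flags(0x7F, 0x01))   # largest positive value plus one
print(add8_flags(0xFF, 0x01))   # wraps around to zero
```

A conditional jump such as je then needs to inspect only the Z bit, which is what separating the test from the branch means in the set-then-jump style.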
Instruction Types (cont.)
Flow control and I/O instructions
- Branch
- Procedure call
- Interrupts
I/O instructions
- Memory-mapped I/O
  - Most processors support memory-mapped I/O
  - No separate instructions for I/O
- Isolated I/O
  - Pentium supports isolated I/O
  - Separate I/O instructions
    in AX,io_port     ; read from an I/O port
    out io_port,AX    ; write to an I/O port
5 Instruction Formats
Two types
Fixed-length
- Used by RISC processors
- 32-bit RISC processors use 32-bit-wide instructions
  - Examples: SPARC, MIPS, PowerPC
Variable-length
- Used by CISC processors
- Memory operands need more bits to specify
Opcode
Major and exact operation
Examples of Instruction Formats
RISC (Reduced Instruction Set Computer) versus CISC (Complex Instruction Set Computer)
RISC vs CISC
The underlying philosophy of RISC machines is that a system is better able to manage program execution when the program consists of only a few different instructions that are the same length and require the same number of clock cycles to decode and execute
RISC systems access memory only with explicit load and store instructions
In CISC systems, many different kinds of instructions access memory, making instruction length variable and fetch-decode-execute time unpredictable
RISC vs CISC
The difference between CISC and RISC becomes evident through the basic computer performance equation:
time/program = (time/cycle) x (cycles/instruction) x (instructions/program)
RISC systems shorten execution time by reducing the clock cycles per instruction
CISC systems improve performance by reducing the number of instructions per program
RISC vs CISC
The simple instruction set of RISC machines enables control units to be hardwired for maximum speed
The more complex, and variable, instruction set of CISC machines requires microcode-based control units that interpret instructions as they are fetched from memory. This translation takes time
With fixed-length instructions, RISC lends itself to pipelining and speculative execution
RISC vs CISC
Consider these two program fragments:
CISC:
mov ax, 10
mov bx, 5
mul bx, ax
RISC:
mov ax, 0
mov bx, 10
mov cx, 5
Begin: add ax, bx
loop Begin
The total clock cycles for the CISC version might be:
(2 movs x 1 cycle) + (1 mul x 30 cycles) = 32 cycles
While the clock cycles for the RISC version are:
(3 movs x 1 cycle) + (5 adds x 1 cycle) + (5 loops x 1 cycle) = 13 cycles
With the RISC clock cycle also being shorter, RISC gives us much faster execution speeds
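The cycle comparison is a sum of instruction counts times cycles per instruction. The sketch below assumes, purely for illustration, a multiply that takes 30 cycles, single-cycle instructions otherwise, and a loop body executed five times; the function name total_cycles is hypothetical.

```python
def total_cycles(instruction_counts, cycles_per_instruction):
    """Sum count * cycles over every instruction type in the fragment."""
    return sum(count * cycles_per_instruction[name]
               for name, count in instruction_counts.items())


# CISC fragment: two moves and one multi-cycle multiply (30 cycles assumed)
cisc = total_cycles({"mov": 2, "mul": 1}, {"mov": 1, "mul": 30})

# RISC fragment: three moves, then add and loop executed five times each
risc = total_cycles({"mov": 3, "add": 5, "loop": 5},
                    {"mov": 1, "add": 1, "loop": 1})

print(cisc, risc)
```

The RISC fragment executes more instructions but fewer cycles, and the comparison would shift with a different multiply latency or loop count.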
RISC vs CISC
Because of their load-store ISAs, RISC architectures require a large number of CPU registers
These registers provide fast access to data during sequential program execution
They can also be employed to reduce the overhead typically caused by passing parameters to subprograms
Instead of pulling parameters off a stack, the subprogram is directed to use a subset of registers
RISC vs CISC
This is how registers can be overlapped in a RISC system
The current window pointer (CWP) points to the active register window
RISC vs CISC
It is becoming increasingly difficult to distinguish RISC architectures from CISC architectures
Some RISC systems provide more extravagant instruction sets than some CISC systems
Some systems combine both approaches
The following two slides summarize the characteristics that traditionally typify the differences between these two architectures
RISC vs CISC
RISC | CISC
Multiple register sets | Single register set
Three operands per instruction | One or two register operands per instruction
Parameter passing through register windows | Parameter passing through memory
Single-cycle instructions | Multiple-cycle instructions
Hardwired control | Microprogrammed control
Highly pipelined | Less pipelined
RISC vs CISC
RISC | CISC
Simple instructions, few in number | Many complex instructions
Fixed-length instructions | Variable-length instructions
Complexity in compiler | Complexity in microcode
Only LOAD/STORE instructions access memory | Many instructions can access memory
Few addressing modes | Many addressing modes
Summary
Instruction Set Design Issues
Instruction set design issues include:
Where are operands stored?
- registers, memory, stack, accumulator
How many explicit operands are there?
- 0, 1, 2, or 3
How is the operand location specified?
- register, immediate, indirect, ...
What type and size of operands are supported?
- byte, int, float, double, string, vector, ...
What operations are supported?
- add, sub, mul, move, compare, ...
More About General Purpose Registers
Why do almost all new architectures use GPRs?
Registers are much faster than memory (even cache)
- Register values are available immediately
- When memory isn't ready, the processor must wait ("stall")
Registers are convenient for variable storage
- Compiler assigns some variables just to registers
- More compact code, since small fields specify registers (compared to memory addresses)
[Figure: Processor (Registers, Cache), Memory, Disk]
What Operations are Needed
Arithmetic + Logical
- Integer arithmetic: ADD, SUB, MULT, DIV, SHIFT
- Logical operation: AND, NOT
Data Transfer - copy, load, store
Control - branch, jump, call, return
Floating Point - ADD, MUL, DIV
- Same as arithmetic but usually take bigger operands
Decimal - ADDD, CONVERT
String - move, compare, search
Graphics - pixel and vertex, compression/decompression operations
Stack Architecture: Pros and Cons
Pros
- Good code density (implicit top of stack)
- Low hardware requirements
- Easy to write a simpler compiler for stack architectures
Cons
- Stack becomes the bottleneck
- Little ability for parallelism or pipelining
- Data is not always at the top of the stack when needed, so additional instructions like TOP and SWAP are needed
- Difficult to write an optimizing compiler for stack architectures
Accumulator Architecture: Pros and Cons
Pros
- Very low hardware requirements
- Easy to design and understand
Cons
- Accumulator becomes the bottleneck
- Little ability for parallelism or pipelining
- High memory traffic
Memory-Memory Architecture: Pros and Cons
Pros
- Requires fewer instructions (especially if 3 operands)
- Easy to write compilers for (especially if 3 operands)
Cons
- Very high memory traffic (especially if 3 operands)
- Variable number of clocks per instruction
- With two operands, more data movements are required
Memory-Register Architecture: Pros and Cons
Pros
- Some data can be accessed without loading first
- Instruction format easy to encode
- Good code density
Cons
- Operands are not equivalent (poor orthogonality)
- Variable number of clocks per instruction
- May limit number of registers
Code Sequence for C = A + B
Stack | Accumulator | Register-memory | Register-register (load-store)
Push A | Load A | Load R1,A | Load R1,A
Push B | Add B | Add R3,R1,B | Load R2,B
Add | Store C | Store R3,C | Add R3,R1,R2
Pop C | | | Store R3,C
The code sequence for C = A + B for four classes of instruction sets. Note that the Add instruction has implicit operands for stack and accumulator architectures, and explicit operands for register architectures. It is assumed that A, B, and C all belong in memory and that the values of A and B cannot be destroyed.
Stack Architectures [figure]
Accumulator Architectures [figure]
Register-Set Architectures [figure]
Register-to-Register: Load-Store Architectures [figure]
Register-to-Memory Architectures [figure]
Memory-to-Memory Architectures [figure]
Instruction Formats [figure]
Instruction Set Architecture (ISA)
To command a computer's hardware, you must speak its language. The words of a machine's language are called instructions, and its vocabulary is called an instruction set
Once you learn one machine language, it is easy to pick up others: there are few fundamental operations that all computers must provide
All designers have the same goal of finding a language that simplifies building the hardware and the compiler while maximizing performance and minimizing cost
Learning how instructions are represented leads to discovering the secret of computing: the stored-program concept
The MIPS instruction set is used as a case study
Interface Design
A good interface:
- Lasts through many implementations (portability, compatibility)
- Is used in many different ways (generality)
- Provides convenient functionality to higher levels
- Permits an efficient implementation at lower levels
Design decisions must take into account:
- Technology
- Machine organization
- Programming languages
- Compiler technology
- Operating systems
[Figure: a single Interface with several uses above it and several implementations (imp 0, imp 1, ...) below it, over time]
Classifying Instruction Set Architectures
Accumulator Architecture
• Common in early stored-program computers, when hardware was so expensive
• Machine has only one register (accumulator) involved in all math and logical operations
• All operations assume the accumulator as a source operand and a destination for the operation, with the other operand stored in memory
Extended Accumulator Architecture
• Dedicated registers for specific operations, e.g., stack and array index registers, are added
• The 8086 microprocessor is an example of such a special-purpose register architecture
General-Purpose Register Architecture
• MIPS is an example of such an architecture, where registers are not tied to a single role
• This type of instruction set can be further divided into:
• Register-memory: allows one operand to be in memory
• Register-register (load-store): demands all operands to be in registers
Machine | # general-purpose registers | Architecture style | Year
Motorola 6800 | | Accumulator | 1974
DEC VAX | | Register-memory, memory-memory | 1977
Intel 8086 | | Extended accumulator | 1978
Motorola 68000 | | Register-memory | 1980
Intel 80386 | | Register-memory | 1985
PowerPC | | Load-store | 1992
DEC Alpha | | Load-store | 1992
Compact Code and Stack Architectures
When memory is scarce, machines like the Intel 80x86 had variable-length instructions to match varying operand specifications and minimize code size
Stack machines abandoned registers altogether, arguing that it is hard for compilers to use them efficiently
Operands are pushed onto a stack from memory, and the results have to be popped from the stack to memory
Operations take their operands by default from the top of the stack and insert the results back onto the stack
Stack machines simplify compilers and lend themselves to a compact instruction encoding, but limit compiler optimization (e.g., in math expressions)
Example: A = B + C
Push AddressC   # Top=Top+4; Stack[Top]=Memory[AddressC]
Push AddressB   # Top=Top+4; Stack[Top]=Memory[AddressB]
add             # Stack[Top-4]=Stack[Top]+Stack[Top-4]; Top=Top-4
Pop AddressA    # Memory[AddressA]=Stack[Top]; Top=Top-4
Compact code is important for heralded network computers where programs must be downloaded over the Internet (e.g., Java-based applications)
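The stack example for A = B + C can be traced with a small simulator. This is only a sketch: the function name run_stack_program, the tuple-based instruction format, and the dictionary standing in for memory are assumptions made for illustration, and a Python list replaces the explicit Top pointer arithmetic.

```python
# Minimal stack-machine sketch (hypothetical names, not a real ISA).
def run_stack_program(program, memory):
    """Execute push/add/pop instructions against a dict-based memory."""
    stack = []
    for op, *args in program:
        if op == "push":        # Stack[Top] = Memory[addr]
            stack.append(memory[args[0]])
        elif op == "add":       # both operands are implicit: the top two stack entries
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)
        elif op == "pop":       # Memory[addr] = Stack[Top]
            memory[args[0]] = stack.pop()
        else:
            raise ValueError(f"unknown opcode: {op}")
    return memory

memory = {"A": 0, "B": 7, "C": 5}
program = [("push", "C"), ("push", "B"), ("add",), ("pop", "A")]
run_stack_program(program, memory)
```

Note that the add instruction carries no operand fields at all, which is where the compact encoding comes from.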
Other Types of Architecture
High-Level-Language Architecture
• In the 1960s, systems software was rarely written in high-level languages, and virtually every commercial operating system before Unix was written in assembly
• Some people blamed the code density on the instruction set rather than the programming language
• A machine design philosophy was advocated with the goal of making the hardware more like high-level languages
• The effectiveness of high-level languages, memory size limitations, and the lack of efficient compilers doomed this philosophy to a historical footnote
Reduced Instruction Set Architecture
• With the recent development in compiler technology and expanded memory sizes, fewer programmers are using assembly-level coding
• Instruction set architecture became measurable by the way compilers, rather than programmers, use it
• RISC architecture favors simplifying hardware design over enriching the offered set of instructions, relying on compilers to effectively use them to perform complex operations
• Virtually all new architectures since 1982 follow the RISC philosophy of fixed instruction lengths, load-store operations, and limited addressing modes
Evolution of Instruction Sets
Single Accumulator (EDSAC 1950)
Accumulator + Index Registers (Manchester Mark I, IBM 700 series 1953)
Separation of Programming Model from Implementation
High-level Language Based (B5000 1963) | Concept of a Family (IBM 360 1964)
General Purpose Register Machines
Complex Instruction Sets (VAX, Intel 432 1977-80) | Load/Store Architecture (CDC 6600, Cray 1 1963-76)
RISC (MIPS, SPARC, IBM RS6000, ... 1987)
Register-Memory Architectures
Effect of the number of memory operands
# memory addresses | Max. number of operands | Examples
0 | 3 | SPARC, MIPS, PowerPC, ALPHA
1 | 2 | Intel 80x86, Motorola 68000
2 | 2 | VAX (also has 3-operand format)
3 | 3 | VAX (also has 2-operand format)
Interpreting Memory Addressing
The address of a word matches the byte address of one of its 4 bytes
The addresses of sequential words differ by 4 (word size in bytes)
Word addresses are multiples of 4 (alignment restriction)
Machines that use the address of the leftmost byte as the word address are called "Big Endian", and those that use the rightmost byte are called "Little Endian"
Misalignment complicates memory access and causes programs to run slower (some machines do not allow misaligned memory access at all)
Byte ordering can be a problem when exchanging data among different machines
Byte addresses affect array index calculation, to account for word addressing and the offset within the word
Object addressed | Aligned at byte offsets | Misaligned at byte offsets
Byte | 0,1,2,3,4,5,6,7 | Never
Half word | 0,2,4,6 | 1,3,5,7
Word | 0,4 | 1,2,3,5,6,7
Double word | 0 | 1,2,3,4,5,6,7
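A short sketch can make the alignment rule and the two byte orders concrete. It assumes 4-byte words as in the text; the helper name is_aligned is hypothetical, while struct.pack and its ">" / "<" byte-order prefixes are standard Python.

```python
import struct

def is_aligned(address, size):
    """An object of `size` bytes is aligned when its address is a multiple of its size."""
    return address % size == 0

# Byte offsets within a double word at which a 4-byte word would be misaligned.
misaligned_word_offsets = [off for off in range(8) if not is_aligned(off, 4)]

word = 0x0A0B0C0D
big_endian = struct.pack(">I", word)     # most significant byte stored first
little_endian = struct.pack("<I", word)  # least significant byte stored first
```

The same 32-bit value therefore produces two different byte sequences in memory, which is why byte ordering matters when machines exchange raw data.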
Addressing Modes
Addressing modes refer to how the location of an operand (the effective address) is specified
Addressing modes have the ability to:
- Significantly reduce instruction counts
- Increase the average CPI
- Increase the complexity of building a machine
The VAX machine is used for benchmark data since it supports a wide range of memory addressing modes
Famous addressing modes can be classified based on:
- the source of the data: register, immediate, or memory
- the address calculation: direct and indirect
An indexed addressing mode is usually provided to allow efficient implementation of loops and array access
Example of Addressing Modes
Addressing mode | Example | Meaning | When used
Register | Add R4,R3 | Regs[R4] <- Regs[R4] + Regs[R3] | When a value is in a register
Immediate | Add R4,#3 | Regs[R4] <- Regs[R4] + 3 | For constants
Displacement | Add R4,100(R1) | Regs[R4] <- Regs[R4] + Mem[100 + Regs[R1]] | Accessing local variables
Register indirect | Add R4,(R1) | Regs[R4] <- Regs[R4] + Mem[Regs[R1]] | Accessing using a pointer or a computed address
Indexed | Add R4,(R1 + R2) | Regs[R4] <- Regs[R4] + Mem[Regs[R1] + Regs[R2]] | Sometimes useful in array addressing: R1 = base of the array; R2 = index amount
Direct or absolute | Add R4,(1001) | Regs[R4] <- Regs[R4] + Mem[1001] | Sometimes useful for accessing static data; address constant may need to be large
Memory indirect or memory deferred | Add R4,@(R3) | Regs[R4] <- Regs[R4] + Mem[Mem[Regs[R3]]] | If R3 is the address of the pointer p, then the mode yields *p
Autoincrement | Add R4,(R2)+ | Regs[R4] <- Regs[R4] + Mem[Regs[R2]]; Regs[R2] <- Regs[R2] + d | Useful for stepping through arrays within a loop. R2 points to the start of the array; each reference increments R2 by d
Autodecrement | Add R4,-(R2) | Regs[R2] <- Regs[R2] - d; Regs[R4] <- Regs[R4] + Mem[Regs[R2]] | Same use as autoincrement. Autodecrement/increment can also act as push/pop to implement a stack
Scaled | Add R4,100(R2)[R3] | Regs[R4] <- Regs[R4] + Mem[100 + Regs[R2] + Regs[R3] * d] | Used to index arrays
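The effective-address calculation behind several of these modes can be sketched in a few lines. The function name effective_address, its keyword arguments, and the register and memory contents below are assumptions for illustration; d is the operand size in bytes, taken here as 4.

```python
def effective_address(mode, regs, mem, *, reg=None, index=None, disp=0, d=4):
    """Return the memory address an operand refers to under the given mode."""
    if mode == "displacement":        # Mem[disp + Regs[reg]]
        return disp + regs[reg]
    if mode == "register_indirect":   # Mem[Regs[reg]]
        return regs[reg]
    if mode == "indexed":             # Mem[Regs[reg] + Regs[index]]
        return regs[reg] + regs[index]
    if mode == "direct":              # Mem[disp]
        return disp
    if mode == "memory_indirect":     # Mem[Mem[Regs[reg]]]: one extra memory read
        return mem[regs[reg]]
    if mode == "scaled":              # Mem[disp + Regs[reg] + Regs[index] * d]
        return disp + regs[reg] + regs[index] * d
    raise ValueError(f"unsupported mode: {mode}")

regs = {"R1": 200, "R2": 300, "R3": 5}
mem = {200: 1000}   # address 200 holds a pointer to address 1000
```

Memory indirect is the only mode here that touches memory just to form the address, which is one reason richer modes can raise the average CPI.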
Addressing Mode for Signal Processing
DSP offers special addressing modes to better serve popular algorithms
Modulo addressing
- Since DSP deals with continuous data streams, circular buffers are widely used
- Circular or modulo addressing allows automatic increment and decrement, and resets the pointer when reaching the end of the buffer
Reverse addressing
- Resulting address is the reverse order of the current address
- Reverse addressing mode expedites the access, which otherwise requires a number of logical instructions or extra memory accesses
Fast Fourier Transform
0 (000) | 0 (000)
1 (001) | 4 (100)
2 (010) | 2 (010)
3 (011) | 6 (110)
4 (100) | 1 (001)
5 (101) | 5 (101)
6 (110) | 3 (011)
7 (111) | 7 (111)
Special features require either hand coding or a compiler that uses such features (GNU would not be a good choice)
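Both DSP modes are easy to express in software, which also shows the work the hardware saves. This is a sketch under assumptions: modulo_next and bit_reverse are hypothetical helper names, and the buffer base and length are arbitrary.

```python
def modulo_next(pointer, base, length, step=1):
    """Advance a circular-buffer pointer, wrapping at the end of the buffer."""
    return base + (pointer - base + step) % length

def bit_reverse(index, bits):
    """Reverse the low `bits` bits of index (the FFT reordering pattern)."""
    result = 0
    for _ in range(bits):
        result = (result << 1) | (index & 1)   # move the lowest bit of index into result
        index >>= 1
    return result

fft_order = [bit_reverse(i, 3) for i in range(8)]
```

Without hardware support, each bit-reversed access costs a loop like the one above (or a lookup table in memory), which is the overhead the reverse addressing mode removes.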
Operations of the Computer Hardware
"There must certainly be instructions for performing the fundamental arithmetic operations."
Burks, Goldstine, and von Neumann, 1947
Assembly language is a symbolic representation of what the processor actually understands
The MIPS assembler allows only one instruction per line and ignores comments following # until the end of the line
Example:
Translation of a segment of a C program to MIPS assembly instructions
C: f = (g + h) - (i + j)
MIPS:
add t0, g, h   # temp variable t0 contains "g + h"
add t1, i, j   # temp variable t1 contains "i + j"
sub f, t0, t1  # f = t0 - t1 = (g + h) - (i + j)
Operations in the Instruction Set
Operator type | Examples
Arithmetic and logical | Integer arithmetic and logical operations: add, and, subtract, or
Data Transfer | Loads-stores (move instructions on machines with memory addressing)
Control | Branch, jump, procedure call and return, trap
System | Operating system call, virtual memory management instructions
Floating point | Floating point instructions: add, multiply
Decimal | Decimal add, decimal multiply, decimal to character conversion
String | String move, string compare, string search
Graphics | Pixel operations, compression/decompression operations
Arithmetic, logical, data transfer and control are almost standard categories for all machines
System instructions are required for multi-programming environments, although support for system functions varies
Decimal and string instructions can be primitives, e.g., on the IBM 360 and the VAX
Support for floating point, decimal, string and graphics can be optional, sometimes provided via a co-processor
Some machines rely on the compiler to synthesize special operations, such as string handling, from simpler instructions
Operations for Media & Signal Processing
Single instruction multiple data (SIMD) and vector instructions are often supported in DSPs, which are commonly used in multimedia and signal processing applications
Partitioned Add (integer)
- Performs multiple 16-bit additions on a 64-bit ALU, since most data are narrow
- Increases ALU throughput for multimedia applications
Paired single operations (float)
- Allow the same register to act as two operands to the same operation
- Handy in dealing with vertices and coordinates
Multiply and accumulate
- Very handy for calculating dot products of vectors (signal processing) and matrix multiplication
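A partitioned add and a multiply-accumulate loop can be sketched as follows. The names partitioned_add and multiply_accumulate are hypothetical, and the sketch assumes wraparound (non-saturating) arithmetic in each 16-bit lane; real media extensions often offer saturating variants as well.

```python
LANE_BITS = 16
LANES = 4                         # four 16-bit lanes in one 64-bit word
LANE_MASK = (1 << LANE_BITS) - 1

def partitioned_add(x, y):
    """Add four 16-bit lanes independently; carries never cross a lane boundary."""
    result = 0
    for lane in range(LANES):
        shift = lane * LANE_BITS
        a = (x >> shift) & LANE_MASK
        b = (y >> shift) & LANE_MASK
        result |= ((a + b) & LANE_MASK) << shift   # drop the carry out of the lane
    return result

def multiply_accumulate(xs, ys):
    """Dot product as a chain of multiply-and-accumulate steps."""
    acc = 0
    for x, y in zip(xs, ys):
        acc += x * y              # one MAC operation per element pair
    return acc

packed_sum = partitioned_add(0x0001_FFFF_0003_0004, 0x0001_0001_0005_0006)
dot = multiply_accumulate([1, 2, 3], [4, 5, 6])
```

In hardware the four lane additions happen in the same cycle; the loop here only models the result, not the timing.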
Frequency of Operations Usage
Make the common case fast by focusing on these operations
The most widely executed instructions are the simple operations of an instruction set
The following is the average usage in SPECint92 on the Intel 80x86
Rank | 80x86 instruction | Integer average (% total executed)
1 | Load | 22%
2 | Conditional branch | 20%
3 | Compare | 16%
4 | Store | 12%
5 | Add | 8%
6 | And | 6%
7 | Sub | 5%
8 | Move register-register | 4%
9 | Call | 1%
10 | Return | 1%
Total | | 96%
Control Flow Instructions
Jump: for unconditional change in the control flow
Branch: for conditional change in the control flow
Procedure calls and returns
Data is based on SPEC on Alpha
Destination Address Definition
Addressing relative to the program counter proved to be the best choice for forward and backward branches or jumps (load-address independent)
To allow for dynamic loading of library routines, register indirect addressing allows addresses to be loaded in special registers (e.g., virtual functions in C++ and system calls in a case statement)
Data is based on SPEC on Alpha
Condition Evaluation
Compare-and-branch can be efficient if the majority of conditions are comparisons with zero
Remember to focus on the common case
Based on SPEC on MIPS
Frequency of Types of Comparison
Data is based on SPEC on Alpha
Different benchmarks and machines set new design priorities
DSPs support a repeat instruction for for-loops (vectors) using registers
Supporting Procedures
Execution of a procedure follows these steps:
- Store parameters in a place accessible to the procedure
- Transfer control to the procedure
- Acquire the storage resources needed for the procedure
- Perform the desired task
- Store the result value in a place accessible to the calling program
- Return control to the point of origin
The hardware provides a program counter to trace instruction flow and manage transfer of control
Parameter Passing
- Registers can be used for passing a small number of parameters
- A stack is used to spill registers of the current context and make room for the called procedure to run, and to allow large parameters to be passed
- Storage of machine state can be performed by the caller or the callee
- Handling of shared variables is important to ensure correct semantics and thus requires clear specifications in the library interface
Global variables stored in registers need careful handling
Type and Size of Operands
The type of an operand is designated by encoding it in the instruction's operation code
The type of an operand, e.g., single-precision float, effectively gives its size
Common operand types include character, half-word and word-size integer, and single- and double-precision floating point
Characters are almost always in ASCII, integers are in 2's complement, and floating point is in IEEE 754
The 16-bit Unicode used in Java is gaining popularity due to its support for international character sets
For business applications, some architectures support a decimal format in binary coded decimal (BCD)
Depending on the size of the word, the complexity of handling different operand types differs
DSP offers fixed-point data types to support high-precision floating point arithmetic and to allow sharing a single exponent among multiple numbers
For graphics applications, vertex and pixel operands are added features
Size of Operands
Frequency of reference by size, based on SPEC2000 on Alpha
The double-word data type is used for double-precision floating point operations and for address storage in machines with a 64-bit wide address bus
Words are used for integer operations and for 32-bit address bus machines
Because of the mix in SPEC, word and double-word data types dominate
Instruction Representation
Humans are taught to think in base 10 (decimal), but numbers may be represented in any base (123 in base 10 = 1111011 in binary, or base 2)
Numbers are stored in computers as a series of high and low electronic signals (binary numbers)
Binary digits are called bits and are considered the atom of computing
Each piece of an instruction is a number, and placing these numbers together forms the instruction
The assembler translates symbolic assembly instructions into machine language instructions (machine code)
Example:
Assembly: add $t0, $s1, $s2
Note: the MIPS compiler by default maps $s0,...,$s7 to reg. 16-23 and $t0,...,$t7 to reg. 8-15
M/C language (decimal): 0 | 17 | 18 | 8 | 0 | 32
M/C language (binary): 000000 | 10001 | 10010 | 01000 | 00000 | 100000
Field widths: 6 bits | 5 bits | 5 bits | 5 bits | 5 bits | 6 bits
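Packing the six fields into one 32-bit word can be sketched with shifts and ORs. The helper encode_r_type and the REGISTER_NUMBERS table are hypothetical names; the register numbering follows the default mapping noted above ($s0-$s7 to 16-23, $t0-$t7 to 8-15), and the field values 0 and 32 are the op and funct codes for add.

```python
# Register numbers under the default mapping: $s0..$s7 -> 16..23, $t0..$t7 -> 8..15.
REGISTER_NUMBERS = {f"$s{i}": 16 + i for i in range(8)}
REGISTER_NUMBERS.update({f"$t{i}": 8 + i for i in range(8)})

def encode_r_type(op, rs, rt, rd, shamt, funct):
    """Pack the R-format fields (6, 5, 5, 5, 5, 6 bits) into one 32-bit word."""
    return (op << 26) | (rs << 21) | (rt << 16) | (rd << 11) | (shamt << 6) | funct

# add $t0, $s1, $s2: rs and rt are the sources, rd is the destination.
word = encode_r_type(0, REGISTER_NUMBERS["$s1"], REGISTER_NUMBERS["$s2"],
                     REGISTER_NUMBERS["$t0"], 0, 32)
bits = format(word, "032b")
```

Printing hex(word) gives the machine word as it would appear in a memory dump.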
Encoding an Instruction Set
Instruction encoding affects the size of the compiled program and the complexity of the CPU implementation
The operation is typically specified in one field, called the opcode
The addressing mode for the operand can be encoded with the operation, or specified through a separate identifier when a large number of modes is supported
The architecture must balance several competing factors:
- Desire to support as many registers and addressing modes as possible
- Effect of operand specification on the size of the instruction (program)
- Desire to simplify instruction fetching and decoding during execution
Fixed-size instruction encoding simplifies the CPU design while limiting the addressing modes supported
An architect caring about the code size can use variable-size encoding
A hybrid approach is to allow variability by supporting multiple-sized instructions
Encoding Examples [figure]
MIPS Instruction Format
Register-format instructions:
op | rs | rt | rd | shamt | funct
6 bits | 5 bits | 5 bits | 5 bits | 5 bits | 6 bits
op: Basic operation of the instruction, traditionally called the opcode
rs: The first register source operand
rt: The second register source operand
rd: The register destination operand; it gets the result of the operation
shamt: Shift amount
funct: This field selects the specific variant of the operation in the op field
Immediate-type instructions:
op | rs | rt | address
6 bits | 5 bits | 5 bits | 16 bits
Some instructions need longer fields than provided, for large-value constants
The 16-bit address means a load word instruction can load a word within a region of ±2^15 bytes of the address in the base register
Example: lw $t0, 32($s3)   # Temporary register $t0 gets A[8]
Instruction | Format | op | rs | rt | rd | shamt | funct | address
add | R | 0 | reg | reg | reg | 0 | 32 | N/A
sub | R | 0 | reg | reg | reg | 0 | 34 | N/A
lw | I | 35 | reg | reg | N/A | N/A | N/A | address
sw | I | 43 | reg | reg | N/A | N/A | N/A | address
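The immediate format and its ±2^15 reach can be sketched the same way. The helper encode_i_type is a hypothetical name; the example encodes a load word with opcode 35, base register 19, target register 8 and offset 32, which in MIPS notation is lw $t0, 32($s3).

```python
def encode_i_type(op, rs, rt, offset):
    """Pack the I-format fields (6, 5, 5, 16 bits); offset is a signed 16-bit value."""
    if not -(1 << 15) <= offset < (1 << 15):
        raise ValueError("offset does not fit in a signed 16-bit field")
    # Negative offsets are stored in two's complement within the low 16 bits.
    return (op << 26) | (rs << 21) | (rt << 16) | (offset & 0xFFFF)

LW_OPCODE = 35
lw_word = encode_i_type(LW_OPCODE, 19, 8, 32)   # rs = base register, rt = target register
```

An offset of 2^15 or more is rejected, mirroring the point that larger constants need a longer field or an extra instruction.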
The Stored Program Concept
Learning how instructions are represented leads to discovering the secret of computing: the stored-program concept
Today's computers are built on two key principles:
- Instructions are represented as numbers
- Programs can be stored in memory to be read or written just like numbers
The power of the concept: memory can contain
- the source code for an editor
- the compiled m/c code for the editor
- the text that the compiled program is using
- the compiler that generated the code
[Figure: a Processor connected to a Memory holding an accounting program (machine code), an editor program (machine code), a C compiler (machine code), payroll data, book text, and the source code in C for the editor program]
Compiling if-then-else in MIPS
Assuming the five variables f, g, h, i, and j correspond to the five registers $s0 through $s4, what is the compiled MIPS code for the following C if statement?
if (i == j) f = g + h; else f = g - h;
[Flowchart: test i == j; when i = j, f = g + h; when i ≠ j (Else:), f = g - h; both paths join at Exit:]
MIPS:
bne $s3, $s4, Else       # go to Else if i ≠ j
add $s0, $s1, $s2        # f = g + h (skipped if i ≠ j)
j Exit
Else: sub $s0, $s1, $s2  # f = g - h (skipped if i = j)
Exit:
Typical Compilation [figure]
Major Types of Optimization
Optimization name | Explanation | Frequency
High-level (at or near source level; machine independent)
Procedure integration | Replace procedure call by procedure body | N.M.
Local (within straight-line code)
Common subexpression elimination | Replace two instances of the same computation by a single copy | 18%
Constant propagation | Replace all instances of a variable that is assigned a constant with the constant | 22%
Stack height reduction | Rearrange expression tree to minimize resources needed for expression evaluation | N.M.
Global (across a branch)
Global common subexpression elimination | Same as local, but this version crosses branches | 13%
Copy propagation | Replace all instances of a variable A that has been assigned X (i.e., A = X) with X | 11%
Code motion | Remove code from a loop that computes the same value each iteration of the loop | 16%
Induction variable elimination | Simplify/eliminate array-addressing calculations within loops | 2%
Machine-dependent (depends on machine knowledge)
Strength reduction | Many examples, such as replacing a multiply by a constant with adds and shifts | N.M.
Pipeline scheduling | Reorder instructions to improve pipeline performance | N.M.
Effect of Compiler Optimization
[Chart; axis label: Program and Compiler Optimization Level. Measurements taken on MIPS]
Level 0: non-optimized code
Level 1: local optimization
Level 2: global optimization, s/w pipelining
Level 3: adds procedure integration
Compiler Support for Multimedia Instructions
Intel's MMX and PowerPC AltiVec have small vector processing capabilities targeting multimedia applications (to speed up graphics)
Intel added a new set of instructions, called Streaming SIMD Extension
A major advantage of vector computers is hiding the latency of memory access by loading multiple elements and then overlapping execution with data transfer
Vector computers typically have strided and/or gather/scatter addressing to perform operations on distant memory locations
- Strided addressing allows memory access in increments larger than one
- Gather/scatter addressing is similar to register indirect mode, where the addresses are stored instead of the data
Supporting vector operations without strided addressing, as in Intel's MMX, limits the potential speedup
Such limited support for vector processing makes the use of vectorizing compiler optimization unpopular and restricts its scope to hand-coded routines
SIMD instructions on MMX and AltiVec tend to be solutions, not primitives
Starting a Program
[Figure: C program -> Compiler -> Assembly language program -> Assembler -> Object: machine language module (plus Object: library routine (machine language)) -> Linker -> Executable: machine language program -> Loader -> Memory]
Linker
- Places code & data modules symbolically in memory
- Determines the addresses of data & instruction labels
- Patches both internal & external references
Object files for Unix typically contain:
- Header: size & position of components
- Text segment: machine code
- Data segment: static and dynamic variables
- Relocation info: identifies absolute memory references
- Symbol table: name & location of labels, procedures and variables
- Debugging info: mapping of source to object code, break points, etc.
Loading Executable Program
[Figure: memory layout, from high to low addresses]
$sp -> 7fff ffff hex | Stack
| Dynamic data
$gp -> 1000 8000 hex |
1000 0000 hex | Static data
pc -> 0040 0000 hex | Text
0 | Reserved
To load an executable, the operating system follows these steps:
- Reads the executable file header to determine the size of the text and data segments
- Creates an address space large enough for the text and data
- Copies the instructions and data from the executable file into memory
- Copies the parameters (if any) to the main program onto the stack
- Initializes the machine registers and sets the stack pointer to the first free location
- Jumps to a start-up routine that copies the parameters into the argument registers and calls the main routine of the program
Instruction Set Design Issues
Instruction set design issues include:
- Number of Addresses
- Flow of Control
- Operand Types
- Addressing Modes
- Instruction Types
- Instruction Formats
Number of Addresses
Four categories
3-address machines
- 2 for the source operands and one for the result
2-address machines
- One address doubles as source and result
1-address machines
- Accumulator machines
- Accumulator is used for one source and result
0-address machines
- Stack machines
- Operands are taken from the stack
- Result goes onto the stack
Number of Addresses (cont.)
Three-address machines
- Two for the source operands, one for the result
- RISC processors use three addresses
Sample instructions
add dest,src1,src2    M(dest)=[src1]+[src2]
sub dest,src1,src2    M(dest)=[src1]-[src2]
mult dest,src1,src2   M(dest)=[src1]*[src2]
Three addresses: Operand 1, Operand 2, Result
Example: a = b + c
Three-address instruction formats are not common, because they require a relatively long instruction format to hold the three address references
um+er of Addresses cont -um+er of Addresses
cont -
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 56101
um+er of Addresses cont-um+er of Addresses cont-
Example
C statement
A = B + C * D - E + F + A
Equivalent code:
mult T,C,D    ; T = C*D
add T,T,B     ; T = B + C*D
sub T,T,E     ; T = B + C*D - E
add T,T,F     ; T = B + C*D - E + F
add A,T,A     ; A = B + C*D - E + F + A
Number of Addresses (cont.)
Two-address machines
One address doubles (for source operand and result)
Last example makes a case for it
- Address T is used twice
Sample instructions
load dest,src    ; M(dest) = [src]
add dest,src     ; M(dest) = [dest] + [src]
sub dest,src     ; M(dest) = [dest] - [src]
mult dest,src    ; M(dest) = [dest] * [src]
Two addresses
One address doubles as operand and result
Example: a = a + b
The two-address format reduces the space requirement but also introduces some awkwardness. To avoid altering the value of an operand, a MOVE instruction is used to move one of the values to a result or temporary location before performing the operation
Number of Addresses (cont.)
Example
C statement
A = B + C * D - E + F + A
Equivalent code:
load T,C    ; T = C
mult T,D    ; T = C*D
add T,B     ; T = B + C*D
sub T,E     ; T = B + C*D - E
add T,F     ; T = B + C*D - E + F
add A,T     ; A = B + C*D - E + F + A
Number of Addresses (cont.)
One-address machines
Use a special set of registers called accumulators
- Specify one source operand and receive the result
Called accumulator machines
Sample instructions
load addr     ; accum = [addr]
store addr    ; M[addr] = accum
add addr      ; accum = accum + [addr]
sub addr      ; accum = accum - [addr]
mult addr     ; accum = accum * [addr]
Number of Addresses (cont.)
Example
C statement
A = B + C * D - E + F + A
Equivalent code:
load C     ; load C into accum
mult D     ; accum = C*D
add B      ; accum = C*D + B
sub E      ; accum = B + C*D - E
add F      ; accum = B + C*D - E + F
add A      ; accum = B + C*D - E + F + A
store A    ; store accum contents in A
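The one-address sequence can be checked with a small simulator. The following is a minimal sketch in Python, not part of the original slides: the function name `run_accumulator`, the tuple encoding of instructions, and the sample memory values are all assumptions made for illustration.

```python
def run_accumulator(program, memory):
    """Run a one-address program against a dict-based memory.

    program: list of (opcode, address) pairs; memory: dict name -> value.
    Every instruction implicitly uses the single accumulator register.
    """
    accum = 0
    for op, addr in program:
        if op == "load":
            accum = memory[addr]
        elif op == "store":
            memory[addr] = accum
        elif op == "add":
            accum += memory[addr]
        elif op == "sub":
            accum -= memory[addr]
        elif op == "mult":
            accum *= memory[addr]
        else:
            raise ValueError(f"unknown opcode: {op}")
    return memory


# A = B + C * D - E + F + A, using the instruction order from the slide
program = [("load", "C"), ("mult", "D"), ("add", "B"), ("sub", "E"),
           ("add", "F"), ("add", "A"), ("store", "A")]
memory = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}  # hypothetical values
run_accumulator(program, memory)
print(memory["A"])
```

Every instruction except `load` and `store` reads memory once and updates the accumulator, which is why this style produces short instructions but heavy memory traffic.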
Number of Addresses (cont.)
Zero-address machines
Stack supplies operands and receives the result
- Special instructions to load and store use an address
Called stack machines (Ex: HP3000, Burroughs B5500)
Sample instructions
push addr    ; push([addr])
pop addr     ; pop([addr])
add          ; push(pop + pop)
sub          ; push(pop - pop)
mult         ; push(pop * pop)
Number of Addresses (cont.)
Example
C statement
A = B + C * D - E + F + A
Equivalent code:
push E
push C
push D
mult
push B
add
sub
push F
add
push A
add
pop A
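To see why E is pushed first, it helps to trace the stack. The sketch below is an illustrative Python model, not the author's code: `run_stack`, the string instruction format, and the memory values are assumptions. It follows the slide's convention that `sub` computes `push(pop - pop)`, so the first value popped (the top of the stack) is the left operand.

```python
def run_stack(program, memory):
    """Run a zero-address (stack) program against a dict-based memory."""
    stack = []
    for instr in program:
        parts = instr.split()
        op = parts[0]
        if op == "push":
            stack.append(memory[parts[1]])
        elif op == "pop":
            memory[parts[1]] = stack.pop()
        elif op in ("add", "sub", "mult"):
            first = stack.pop()    # top of stack
            second = stack.pop()   # next element
            if op == "add":
                stack.append(first + second)
            elif op == "sub":
                stack.append(first - second)
            else:
                stack.append(first * second)
        else:
            raise ValueError(f"unknown opcode: {op}")
    return memory


# A = B + C * D - E + F + A
program = ["push E", "push C", "push D", "mult", "push B", "add",
           "sub", "push F", "add", "push A", "add", "pop A"]
memory = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}  # hypothetical values
run_stack(program, memory)
print(memory["A"])
```

Only `push` and `pop` carry an address; the arithmetic instructions need no operand fields at all, which is the source of the compact encoding.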
Load/Store Architecture
Instructions expect operands in internal processor registers
Special LOAD and STORE instructions move data between registers and memory
RISC uses this architecture
Reduces instruction length
Load/Store Architecture (cont.)
Sample instructions
load Rd,addr       ; Rd = [addr]
store addr,Rs      ; (addr) = Rs
add Rd,Rs1,Rs2     ; Rd = Rs1 + Rs2
sub Rd,Rs1,Rs2     ; Rd = Rs1 - Rs2
mult Rd,Rs1,Rs2    ; Rd = Rs1 * Rs2
Number of Addresses (cont.)
Example
C statement
A = B + C * D - E + F + A
Equivalent code:
load R1,B
load R2,C
load R3,D
load R4,E
load R5,F
load R6,A
mult R2,R2,R3
add R2,R2,R1
sub R2,R2,R4
add R2,R2,R5
add R2,R2,R6
store A,R2
Flow of Control
Default is sequential flow
Several instructions alter this default execution
Branches
- Unconditional
- Conditional
- Delayed branches
Procedure calls
- Delayed procedure calls
Flow of Control (cont.)
Branches
Unconditional
- Absolute address
- PC-relative
  - Target address is specified relative to PC contents
  - Relocatable code
Example: MIPS
- Absolute address
    j target
- PC-relative
    b target
Flow of Control (cont.)
Branches
Conditional
- Jump is taken only if the condition is met
Two types
- Set-Then-Jump
  - Condition testing is separated from branching
  - Condition code registers are used to convey the condition test result
  - Condition code registers keep a record of the status of the last ALU operation, such as overflow condition
- Example: Pentium code
    cmp AX,BX    ; compare AX and BX
    je target    ; jump if equal
Flow of Control (cont.)
- Test-and-Jump
  - Single instruction performs condition testing and branching
- Example: MIPS instruction
    beq Rsrc1,Rsrc2,target
  Jumps to target if Rsrc1 = Rsrc2
Delayed branching
Control is transferred after executing the instruction that follows the branch instruction
- This instruction slot is called the delay slot
Improves efficiency
Highly pipelined RISC processors support it
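The effect of a delay slot is easiest to see on an execution trace. The following Python sketch is a toy model under stated assumptions: the program encoding, the function name `run_with_delay_slot`, and the rule that the delay slot always holds an ordinary (non-branch) instruction are all illustrative choices, not something taken from the slides.

```python
def run_with_delay_slot(program, delayed=True):
    """Return the trace of a toy program containing unconditional branches.

    program: list of ("op", label) or ("branch", target_index) tuples.
    With delayed=True, the instruction right after a branch (the delay
    slot) executes before control is transferred to the target.
    """
    trace = []
    pc = 0
    while pc < len(program):
        kind, arg = program[pc]
        if kind == "branch":
            trace.append(f"branch->{arg}")
            if delayed and pc + 1 < len(program):
                # Assumption: the delay slot holds a plain "op" instruction.
                trace.append(program[pc + 1][1])
            pc = arg
        else:
            trace.append(arg)
            pc += 1
    return trace


program = [("op", "a"), ("branch", 4), ("op", "slot"),
           ("op", "skipped"), ("op", "target")]
print(run_with_delay_slot(program, delayed=True))
print(run_with_delay_slot(program, delayed=False))
```

In the delayed case the instruction labeled `slot` runs even though it sits after the branch, so a compiler can fill that slot with useful work instead of leaving the pipeline idle.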
Flow of Control (cont.)
Procedure calls
Facilitate modular programming
Require two pieces of information to return
- End of procedure
  - Pentium uses the ret instruction
  - MIPS uses the jr instruction
- Return address
  - In a (special) register
    MIPS allows any general-purpose register
  - On the stack
    Pentium
Flow of Control (cont.)
[Figure: delay slot]
Parameter Passing
Two basic techniques
Register-based (e.g., PowerPC, MIPS)
- Internal registers are used
  - Faster
  - Limit the number of parameters
  - Recursive procedure
Stack-based (e.g., Pentium)
- Stack is used
  - More general
2 Operand Types
Instructions support basic data types
- Characters
- Integers
- Floating-point
Instruction overload
Same instruction for different data types
Example: Pentium
mov AL,address     ; loads an 8-bit value
mov AX,address     ; loads a 16-bit value
mov EAX,address    ; loads a 32-bit value
Operand Types
Separate instructions
Instructions specify the operand size
Example: MIPS
lb Rdest,address    ; loads a byte
lh Rdest,address    ; loads a halfword (16 bits)
lw Rdest,address    ; loads a word (32 bits)
ld Rdest,address    ; loads a doubleword (64 bits)
Similar instruction: store
3 Addressing Modes
How the operands are specified
Operands can be in three places
- Registers
  - Register addressing mode
- Part of instruction
  - Constant
  - Immediate addressing mode
  - All processors support these two addressing modes
- Memory
  - Difference between RISC and CISC
  - CISC supports a large variety of addressing modes
  - RISC follows load/store architecture
4 Instruction Types
Several types of instructions
Data movement
- Pentium: mov dest,src
- Some do not provide direct data movement instructions
- Indirect data movement
    add Rdest,Rsrc,0    ; Rdest = Rsrc + 0
Arithmetic and Logical
- Arithmetic
  - Integer and floating-point, signed and unsigned
  - add, subtract, multiply, divide
- Logical
  - and, or, not, xor
Instruction Types (cont.)
Condition code bits
S: Sign bit (0 = +, 1 = -)
Z: Zero bit (0 = nonzero, 1 = zero)
O: Overflow bit (0 = no overflow, 1 = overflow)
C: Carry bit (0 = no carry, 1 = carry)
Example: Pentium
cmp count,25    ; compare count to 25
                ; (subtract 25 from count)
je target       ; jump if equal
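A compare instruction sets the condition code bits by performing a subtraction and discarding the result. The Python sketch below models that for fixed-width integers; the function name `compare_flags`, the 16-bit default width, and the choice to set the carry bit on a borrow (as x86-style processors do for subtraction) are assumptions for illustration.

```python
def compare_flags(a, b, bits=16):
    """Return the S, Z, O, C bits produced by computing a - b in `bits` bits."""
    mask = (1 << bits) - 1
    sign_bit = 1 << (bits - 1)
    a &= mask
    b &= mask
    diff = (a - b) & mask
    return {
        "S": 1 if diff & sign_bit else 0,   # result is negative
        "Z": 1 if diff == 0 else 0,         # result is zero (operands equal)
        # Signed overflow: operands differ in sign and the result's sign
        # differs from the sign of a.
        "O": 1 if ((a ^ b) & (a ^ diff) & sign_bit) else 0,
        "C": 1 if a < b else 0,             # unsigned borrow
    }


count = 25  # hypothetical value of the variable being compared
flags = compare_flags(count, 25)
if flags["Z"]:
    print("je target would be taken")
```

A following conditional jump such as `je` only inspects these bits (here the Z bit); it never looks at the operands again, which is what separates condition testing from branching in the set-then-jump style.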
Instruction Types (cont.)
Flow control and I/O instructions
- Branch
- Procedure call
- Interrupts
I/O instructions
- Memory-mapped I/O
  - Most processors support memory-mapped I/O
  - No separate instructions for I/O
- Isolated I/O
  - Pentium supports isolated I/O
  - Separate I/O instructions
      in AX,io_port     ; read from an I/O port
      out io_port,AX    ; write to an I/O port
5 Instruction Formats
Two types
Fixed-length
- Used by RISC processors
- 32-bit RISC processors use 32-bit wide instructions
  - Examples: SPARC, MIPS, PowerPC
Variable-length
- Used by CISC processors
- Memory operands need more bits to specify
Opcode
Major and exact operation
Examples of Instruction Formats
RISC (Reduced Instruction Set Computer)
versus
CISC (Complex Instruction Set Computer)
RISC vs. CISC
The underlying philosophy of RISC machines is that a system is better able to manage program execution when the program consists of only a few different instructions that are the same length and require the same number of clock cycles to decode and execute
RISC systems access memory only with explicit load and store instructions
In CISC systems, many different kinds of instructions access memory, making instruction length variable and fetch-decode-execute time unpredictable
RISC vs. CISC
The difference between CISC and RISC becomes evident through the basic computer performance equation:
time/program = time/cycle × cycles/instruction × instructions/program
RISC systems shorten execution time by reducing the clock cycles per instruction
CISC systems improve performance by reducing the number of instructions per program
RISC vs. CISC
The simple instruction set of RISC machines enables control units to be hardwired for maximum speed
The more complex, and variable, instruction set of CISC machines requires microcode-based control units that interpret instructions as they are fetched from memory. This translation takes time
With fixed-length instructions, RISC lends itself to pipelining and speculative execution
RISC vs. CISC
Consider the following program fragments:
CISC:
mov ax, 10
mov bx, 5
mul bx, ax
RISC:
mov ax, 0
mov bx, 10
mov cx, 5
Begin: add ax, bx
loop Begin
The total clock cycles for the CISC version might be:
(2 movs × 1 cycle) + (1 mul × 30 cycles) = 32 cycles
While the clock cycles for the RISC version are:
(3 movs × 1 cycle) + (5 adds × 1 cycle) + (5 loops × 1 cycle) = 13 cycles
With the RISC clock cycle also being shorter, RISC gives us much faster execution speeds
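The cycle arithmetic for the two fragments can be reproduced with a few lines of Python. This is a sketch under stated assumptions: simple instructions are taken to cost 1 cycle and the multiply 30 cycles, the loop is assumed to run 5 times, and the function names `total_cycles` and `execution_time_ns` as well as any clock period passed in are illustrative, not measured values.

```python
def total_cycles(mix):
    """Sum cycles over an instruction mix.

    mix: list of (instruction_count, cycles_per_instruction) pairs.
    """
    return sum(count * cpi for count, cpi in mix)


def execution_time_ns(mix, cycle_time_ns):
    """time/program = cycles/program * time/cycle."""
    return total_cycles(mix) * cycle_time_ns


# CISC fragment: 2 movs at 1 cycle each, 1 mul at 30 cycles (assumed costs)
cisc_mix = [(2, 1), (1, 30)]
# RISC fragment: 3 movs, 5 adds, 5 loop instructions at 1 cycle each
risc_mix = [(3, 1), (5, 1), (5, 1)]

print(total_cycles(cisc_mix), total_cycles(risc_mix))
```

The RISC version executes more instructions (13 versus 3) but fewer cycles, which is the trade-off the performance equation expresses: fewer cycles per instruction can outweigh a larger instruction count.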
RISC vs. CISC
Because of their load-store ISAs, RISC architectures require a large number of CPU registers
These registers provide fast access to data during sequential program execution
They can also be employed to reduce the overhead typically caused by passing parameters to subprograms
Instead of pulling parameters off of a stack, the subprogram is directed to use a subset of registers
RISC vs. CISC
This is how registers can be overlapped in a RISC system
The current window pointer (CWP) points to the active register window
RISC vs. CISC
It is becoming increasingly difficult to distinguish RISC architectures from CISC architectures
Some RISC systems provide more extravagant instruction sets than some CISC systems
Some systems combine both approaches
The following two slides summarize the characteristics that traditionally typify the differences between these two architectures
RISC vs. CISC
RISC
- Multiple register sets
- Three operands per instruction
- Parameter passing through register windows
- Single-cycle instructions
- Hardwired control
- Highly pipelined
CISC
- Single register set
- One or two register operands per instruction
- Parameter passing through memory
- Multiple cycle instructions
- Microprogrammed control
- Less pipelined
Continued...
RISC vs. CISC
RISC
- Simple instructions, few in number
- Fixed length instructions
- Complexity in compiler
- Only LOAD/STORE instructions access memory
- Few addressing modes
CISC
- Many complex instructions
- Variable length instructions
- Complexity in microcode
- Many instructions can access memory
- Many addressing modes
Summary
Instruction Set Design Issues
Instruction set design issues include:
Where are operands stored?
- registers, memory, stack, accumulator
How many explicit operands are there?
- 0, 1, 2, or 3
How is the operand location specified?
- register, immediate, indirect, . . .
What type & size of operands are supported?
- byte, int, float, double, string, vector . . .
What operations are supported?
- add, sub, mul, move, compare . . .
More About General Purpose Registers
Why do almost all new architectures use GPRs?
Registers are much faster than memory (even cache)
- Register values are available immediately
- When memory isn't ready, processor must wait ("stall")
Registers are convenient for variable storage
- Compiler assigns some variables just to registers
- More compact code since small fields specify registers (compared to memory addresses)
[Figure: Processor, Registers, Cache, Memory, Disk]
What Operations Are Needed
Arithmetic & Logical
- Integer arithmetic: ADD, SUB, MULT, DIV, SHIFT
- Logical operation: AND, NOT
Data Transfer - copy, load, store
Control - branch, jump, call, return
Floating Point - ADD, MUL, DIV
- Same as arithmetic but usually take bigger operands
Decimal - ADDD, CONVERT
String - move, compare, search
Graphics - pixel and vertex, compression/decompression operations
Stack Architecture: Pros and Cons
Pros
- Good code density (implicit top of stack)
- Low hardware requirements
- Easy to write a simpler compiler for stack architectures
Cons
- Stack becomes the bottleneck
- Little ability for parallelism or pipelining
- Data is not always at the top of stack when needed, so additional instructions like TOP and SWAP are needed
- Difficult to write an optimizing compiler for stack architectures
Accumulator Architecture: Pros and Cons
Pros
- Very low hardware requirements
- Easy to design and understand
Cons
- Accumulator becomes the bottleneck
- Little ability for parallelism or pipelining
- High memory traffic
Memory-Memory Architecture: Pros and Cons
Pros
- Requires fewer instructions (especially if 3 operands)
- Easy to write compilers for (especially if 3 operands)
Cons
- Very high memory traffic (especially if 3 operands)
- Variable number of clocks per instruction
- With two operands, more data movements are required
Memory-Register Architecture: Pros and Cons
Pros
- Some data can be accessed without loading first
- Instruction format easy to encode
- Good code density
Cons
- Operands are not equivalent (poor orthogonality)
- Variable number of clocks per instruction
- May limit number of registers
Stack Architectures
Accumulator Architectures
Register-Set Architectures
Register-to-Register: Load-Store Architectures
Register-to-Memory Architectures
Memory-to-Memory Architectures
Instruction Formats
Instruction Set Architecture (ISA)
To command a computer's hardware, you must speak its language
The words of a machine's language are called instructions, and its vocabulary is called the instruction set
Once you learn one machine language, it is easy to pick up others:
- There are few fundamental operations that all computers must provide
- All designers have the same goal of finding a language that simplifies building the hardware and the compiler while maximizing performance and minimizing cost
Learning how instructions are represented leads to discovering the secret of computing: the stored-program concept
The MIPS instruction set is used as a case study
Interface Design
A good interface:
- Lasts through many implementations (portability, compatibility)
- Is used in many different ways (generality)
- Provides convenient functionality to higher levels
- Permits an efficient implementation at lower levels
Design decisions must take into account:
- Technology
- Machine organization
- Programming languages
- Compiler technology
- Operating systems
[Figure: one interface with several uses above it and several implementations (imp 0, imp 1, ...) below it, over time]
Classifying Instruction Set Architectures
Accumulator Architecture
• Common in early stored-program computers when hardware was so expensive
• Machine has only one register (accumulator) involved in all math & logical operations
• All operations assume the accumulator as a source operand and a destination for the operation, with the other operand stored in memory
Extended Accumulator Architecture
• Dedicated registers for specific operations, e.g., stack and array index registers, added
• The 8086 microprocessor is an example of such a special-purpose register architecture
General-Purpose Register Architecture
• MIPS is an example of such an architecture, where registers are not restricted to a single role
• This type of instruction set can be further divided into:
  • Register-memory: allows for one operand to be in memory
  • Register-register (load-store): demands all operands to be in registers
Machine | # general-purpose registers | Architecture style | Year
Motorola 6800 | 2 | Accumulator | 1974
DEC VAX | 16 | Register-memory, memory-memory | 1977
Intel 8086 | 1 | Extended accumulator | 1978
Motorola 68000 | 16 | Register-memory | 1980
Intel 80386 | 8 | Register-memory | 1985
PowerPC | 32 | Load-store | 1992
DEC Alpha | 32 | Load-store | 1992
Compact Code and Stack Architectures
When memory is scarce, machines like the Intel 80x86 had variable-length instructions to match varying operand specifications and minimize code size
Stack machines abandoned registers altogether, arguing that it is hard for compilers to use them efficiently
Operands are to be pushed on a stack from memory and the results have to be popped from the stack to memory
Operations take their operands by default from the top of the stack and insert the results back onto the stack
Stack machines simplify compilers and lend themselves to a compact instruction encoding but limit compiler optimization (e.g., in math expressions)
Example: A = B + C
Push AddressC    # Top = Top + 4; Stack[Top] = Memory[AddressC]
Push AddressB    # Top = Top + 4; Stack[Top] = Memory[AddressB]
add              # Stack[Top-4] = Stack[Top] + Stack[Top-4]; Top = Top - 4
Pop AddressA     # Memory[AddressA] = Stack[Top]; Top = Top - 4
Compact code is important for heralded network computers where programs must be downloaded over the Internet (e.g., Java-based applications)
Other Types of Architecture
High-Level-Language Architecture
• In the 1960s, systems software was rarely written in high-level languages, and virtually every commercial operating system before Unix was written in assembly
• Some people blamed the code density on the instruction set rather than the programming language
• A machine design philosophy was advocated with the goal of making the hardware more like high-level languages
• The effectiveness of high-level languages, memory size limitation and lack of efficient compilers doomed this philosophy to a historical footnote
Reduced Instruction Set Architecture
• With the recent development in compiler technology and expanded memory sizes, fewer programmers are using assembly-level coding
• Instruction set architecture became measurable in the way compilers, rather than programmers, use them
• RISC architecture favors simplifying hardware design over enriching the offered set of instructions, relying on compilers to effectively use them to perform complex operations
• Virtually all new architectures since 1982 follow the RISC philosophy of fixed instruction lengths, load-store operations, and limited addressing modes
Evolution of Instruction Sets
Single Accumulator (EDSAC)
Accumulator + Index Registers (Manchester Mark I series)
Separation of Programming Model from Implementation
High-level Language Based; Concept of a Family
General Purpose Register Machines
Complex Instruction Sets (VAX, Intel 432); Load/Store Architecture (CDC 6600, Cray 1)
RISC (MIPS, SPARC, RS6000, . . .)
Register-Memory Architectures
# memory addresses | Max. number of operands | Examples
0 | 3 | SPARC, MIPS, PowerPC, ALPHA
1 | 2 | Intel 80x86, Motorola 68000
2 | 2 | VAX (also has 3-operand formats)
3 | 3 | VAX (also has 2-operand formats)
Effect of the number of memory operands
Memory Address
Interpreting Memory Addressing
The address of a word matches the byte address of one of its 4 bytes
The addresses of sequential words differ by 4 (word size in bytes)
Words' addresses are multiples of 4 (alignment restriction)
Machines that use the address of the leftmost byte as the word address are called "Big Endian", and those that use the rightmost byte are called "Little Endian"
Misalignment complicates memory access and causes programs to run slower (some machines do not allow misaligned memory access at all)
Byte ordering can be a problem when exchanging data among different machines
Byte addresses affect array index calculation to account for word addressing and the offset within the word
Object addressed | Aligned at byte offsets | Misaligned at byte offsets
Byte | 0,1,2,3,4,5,6,7 | Never
Half word | 0,2,4,6 | 1,3,5,7
Word | 0,4 | 1,2,3,5,6,7
Double word | 0 | 1,2,3,4,5,6,7
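Byte ordering and alignment can both be demonstrated with Python's standard `struct` module. The sketch below is illustrative only: the sample value `0x0A0B0C0D` and the helper name `is_aligned` are assumptions, and the alignment rule modeled is the simple one from the table (an object of size s is aligned when its byte address is a multiple of s).

```python
import struct

value = 0x0A0B0C0D  # hypothetical 32-bit word

# ">I" packs an unsigned 32-bit integer big-endian (most significant byte
# first); "<I" packs it little-endian (least significant byte first).
big_endian = struct.pack(">I", value)
little_endian = struct.pack("<I", value)

print(big_endian.hex())     # byte at the lowest address comes first
print(little_endian.hex())


def is_aligned(address, size):
    """True if an object of `size` bytes is aligned at byte `address`."""
    return address % size == 0


# Reading bytes with the wrong byte order silently yields a different value,
# which is the data-exchange problem mentioned above.
misread = struct.unpack("<I", big_endian)[0]
print(hex(misread))
```

Network protocols and file formats avoid the problem by fixing one byte order in their specification, so each machine converts at the boundary.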
Addressing Modes
Addressing modes refer to how to specify the location of an operand (effective address)
Addressing modes have the ability to:
- Significantly reduce instruction counts
- Increase the average CPI
- Increase the complexity of building a machine
The VAX machine is used for benchmark data since it supports a wide range of memory addressing modes
Famous addressing modes can be classified based on:
- the source of the data: register, immediate or memory
- the address calculation: direct and indirect
An indexed addressing mode is usually provided to allow efficient implementation of loops and array access
Example of Addressing Modes
Addressing mode | Example | Meaning | When used
Register | Add R4, R3 | Regs[R4] ← Regs[R4] + Regs[R3] | When a value is in a register
Immediate | Add R4, #3 | Regs[R4] ← Regs[R4] + 3 | For constants
Displacement | Add R4, 100(R1) | Regs[R4] ← Regs[R4] + Mem[100 + Regs[R1]] | Accessing local variables
Register indirect | Add R4, (R1) | Regs[R4] ← Regs[R4] + Mem[Regs[R1]] | Accessing using a pointer or a computed address
Indexed | Add R4, (R1 + R2) | Regs[R4] ← Regs[R4] + Mem[Regs[R1] + Regs[R2]] | Sometimes useful in array addressing: R1 = base of the array; R2 = index amount
Direct or absolute | Add R4, (1001) | Regs[R4] ← Regs[R4] + Mem[1001] | Sometimes useful for accessing static data; address constant may need to be large
Memory indirect or memory deferred | Add R4, @(R3) | Regs[R4] ← Regs[R4] + Mem[Mem[Regs[R3]]] | If R3 is the address of the pointer p, then mode yields *p
Autoincrement | Add R4, (R2)+ | Regs[R4] ← Regs[R4] + Mem[Regs[R2]]; Regs[R2] ← Regs[R2] + d | Useful for stepping through arrays within a loop. R2 points to start of the array; each reference increments R2 by d
Autodecrement | Add R4, -(R2) | Regs[R2] ← Regs[R2] - d; Regs[R4] ← Regs[R4] + Mem[Regs[R2]] | Same use as autoincrement. Autodecrement/increment can also act as push/pop to implement a stack
Scaled | Add R4, 100(R2)[R3] | Regs[R4] ← Regs[R4] + Mem[100 + Regs[R2] + Regs[R3] × d] | Used to index arrays
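The "Meaning" column above is, in effect, a recipe for computing an effective address. The following Python sketch spells out a few of those recipes; the function name `effective_address`, the keyword parameters, and the register and memory contents are hypothetical choices made for illustration.

```python
def effective_address(mode, regs, mem, base=None, index=None, disp=0, scale=1):
    """Compute the memory address an operand refers to for a given mode.

    regs: dict register name -> value; mem: dict address -> value.
    """
    if mode == "displacement":        # Mem[disp + Regs[base]]
        return disp + regs[base]
    if mode == "register_indirect":   # Mem[Regs[base]]
        return regs[base]
    if mode == "indexed":             # Mem[Regs[base] + Regs[index]]
        return regs[base] + regs[index]
    if mode == "direct":              # Mem[disp]
        return disp
    if mode == "memory_indirect":     # Mem[Mem[Regs[base]]]
        return mem[regs[base]]
    if mode == "scaled":              # Mem[disp + Regs[base] + Regs[index] * d]
        return disp + regs[base] + regs[index] * scale
    raise ValueError(f"unsupported mode: {mode}")


regs = {"R1": 8, "R2": 16, "R3": 2}   # hypothetical register contents
mem = {8: 40}                          # hypothetical memory contents

print(effective_address("displacement", regs, mem, base="R1", disp=100))
print(effective_address("scaled", regs, mem, base="R2", index="R3",
                        disp=100, scale=4))
```

Register and immediate modes are omitted because they do not reference memory, and the autoincrement and autodecrement modes would additionally update the base register as a side effect.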
Addressing Mode for Signal Processing
Fast Fourier Transform
0 (000)    0 (000)
1 (001)    4 (100)
2 (010)    2 (010)
3 (011)    6 (110)
4 (100)    1 (001)
5 (101)    5 (101)
6 (110)    3 (011)
7 (111)    7 (111)
Modulo addressing
- Since DSP deals with continuous data streams, circular buffers are widely used
- Circular or modulo addressing allows automatic increment and decrement, and resets the pointer when reaching the end of the buffer
Reverse addressing
- Resulting address is the reverse order of the current address
- Reverse addressing mode expedites the access, which otherwise requires a number of logical instructions or extra memory accesses
DSP offers special addressing modes to better serve popular algorithms
Special features require either hand coding or a compiler that uses such features (GNU would not be a good choice)
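Both DSP addressing modes are simple to express in software, which also shows what the hardware saves. The Python sketch below is illustrative: the function names `bit_reverse` and `modulo_advance` and the buffer parameters are assumptions, and a real DSP performs these steps inside its address-generation unit rather than with instructions.

```python
def bit_reverse(index, bits):
    """Reverse the low `bits` bits of `index` (FFT reverse addressing)."""
    result = 0
    for _ in range(bits):
        result = (result << 1) | (index & 1)  # move the lowest bit to the result
        index >>= 1
    return result


def modulo_advance(pointer, step, base, length):
    """Advance a circular-buffer pointer, wrapping inside [base, base + length)."""
    return base + (pointer - base + step) % length


# Access order for an 8-point FFT (3 address bits)
print([bit_reverse(i, 3) for i in range(8)])

# Stepping past the end of an 8-entry buffer that starts at address 100
print(modulo_advance(107, 1, base=100, length=8))
```

The loop in `bit_reverse` is the "number of logical instructions" that a software implementation would need per access.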
Operations of the Computer Hardware
"There must certainly be instructions for performing the fundamental arithmetic operations."
Burks, Goldstine and von Neumann, 1947
Assembly language is a symbolic representation of what the processor actually understands
MIPS assembler allows only one instruction per line and ignores comments following # until end of line
Example:
Translation of a segment of a C program to MIPS assembly instructions
C: f = (g + h) - (i + j)
MIPS:
add t0, g, h     # temp variable t0 contains "g + h"
add t1, i, j     # temp variable t1 contains "i + j"
sub f, t0, t1    # f = t0 - t1 = (g + h) - (i + j)
Operations in the Instruction Set
Operator type            Examples
Arithmetic and logical   Integer arithmetic and logical operations: add, and, subtract, or
Data transfer            Loads-stores (move instructions on machines with memory addressing)
Control                  Branch, jump, procedure call and return, trap
System                   Operating system call, virtual memory management instructions
Floating point           Floating point instructions: add, multiply
Decimal                  Decimal add, decimal multiply, decimal to character conversion
String                   String move, string compare, string search
Graphics                 Pixel operations, compression/decompression operations
Arithmetic, logical, data transfer and control are almost standard categories for all machines
System instructions are required for multi-programming environments, although support for system functions varies
Decimal and string instructions can be primitives, e.g. IBM 360 and the VAX
Support for floating point, decimal, string and graphics is sometimes optional, provided via a co-processor
Some machines rely on the compiler to synthesize special operations such as string handling from simpler instructions
Operations for Media and Signal Processing
Single instruction multiple data (SIMD) and vector instructions are often supported in DSPs, which are commonly used in multimedia and signal processing applications
Partitioned add (integer)
  Performs multiple narrow (e.g. 16-bit) additions on a 64-bit ALU, since most data are narrow
  Increases ALU throughput for multimedia applications
Paired single operations (float)
  Allow one register to act as two operands of the same operation
  Handy in dealing with vertices and coordinates
Multiply and accumulate
  Very handy for calculating dot products of vectors (signal processing) and matrix multiplication
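A rough Python model of two of these operations follows. The lane width (four 16-bit lanes in a 64-bit word) is an assumption made for illustration, and `pack`, `unpack`, `partitioned_add` and `multiply_accumulate` are hypothetical helper names.

```python
LANE_BITS = 16                      # assumed lane width
LANES = 4                           # 4 x 16 bits = one 64-bit word
LANE_MASK = (1 << LANE_BITS) - 1


def pack(values):
    """Pack four lane values into one 64-bit integer, lane 0 in the low bits."""
    word = 0
    for lane, value in enumerate(values):
        word |= (value & LANE_MASK) << (lane * LANE_BITS)
    return word


def unpack(word):
    """Split a 64-bit integer back into its four 16-bit lanes."""
    return [(word >> (lane * LANE_BITS)) & LANE_MASK for lane in range(LANES)]


def partitioned_add(a, b):
    """Add lane by lane; a carry out of one lane never reaches the next."""
    return pack([(x + y) & LANE_MASK for x, y in zip(unpack(a), unpack(b))])


def multiply_accumulate(xs, ys):
    """Dot product built from repeated multiply-accumulate steps."""
    acc = 0
    for x, y in zip(xs, ys):
        acc += x * y                # one MAC per element pair
    return acc


lanes_sum = unpack(partitioned_add(pack([1, 2, 3, 0xFFFF]), pack([10, 20, 30, 1])))
dot = multiply_accumulate([1, 2, 3], [4, 5, 6])
```

The last lane wraps to zero instead of carrying into a neighbour, which is the property that distinguishes a partitioned add from an ordinary 64-bit add.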
Frequency of Operations Usage
Make the common case fast by focusing on these operations
The most widely executed instructions are the simple operations of an instruction set
The following is the average usage in SPECint92 on Intel 80x86
Rank   80x86 instruction         Integer average (% total executed)
1      Load                      22%
2      Conditional branch        20%
3      Compare                   16%
4      Store                     12%
5      Add                       8%
6      And                       6%
7      Sub                       5%
8      Move register-register    4%
9      Call                      1%
10     Return                    1%
       Total                     96%
Control Flow Instructions
Jump for unconditional change in the control flow
Branch for conditional change in the control flow
Procedure calls and returns
Data is based on SPEC on Alpha
Destination Address Definition
Relative addressing with respect to the program counter proved to be the best choice for forward and backward branching or jumps (load address independent)
To allow for dynamic loading of library routines, register indirect addressing allows addresses to be loaded in special registers (e.g. virtual functions in C++ and system calls in a case statement)
Data is based on SPEC on Alpha
Condition Evaluation
Compare-and-branch can be efficient if the majority of conditions are comparisons with zero
Remember to focus on the common case
Based on SPEC on MIPS
Frequency of Types of Comparison
Data is based on SPEC on Alpha
Different benchmarks and machines set new design priorities
DSPs support a repeat instruction for for-loops (vectors) using registers
Supporting Procedures
Execution of a procedure follows these steps:
Store parameters in a place accessible to the procedure
Transfer control to the procedure
Acquire the storage resources needed for the procedure
Perform the desired task
Store the result value in a place accessible to the calling program
Return control to the point of origin
The hardware provides a program counter to trace instruction flow and manage transfer of control
Parameter Passing
Registers can be used for passing a small number of parameters
A stack is used to spill registers of the current context, to make room for the called procedure to run, and to allow large parameters to be passed
Storage of machine state can be performed by caller or callee
Handling of shared variables is important to ensure correct semantics and thus requires clear specifications in the library interface
Global variables stored in registers need careful handling
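The idea of spilling registers to a stack around a call can be sketched as follows. This is a simplified, hypothetical model of a callee-saves convention: the register names (`a0`, `a1`, `s0`, `s1`, `v0`) are borrowed from MIPS for readability, and `call_procedure` is an illustrative helper, not a real calling convention.

```python
def call_procedure(regs, stack, body, saved=("s0", "s1")):
    """Run `body` as a callee that preserves the registers named in `saved`."""
    for name in saved:                  # spill the registers the callee will overwrite
        stack.append(regs[name])
    regs["v0"] = body(regs)             # perform the task; result goes in a result register
    for name in reversed(saved):        # restore in last-in, first-out order
        regs[name] = stack.pop()
    return regs["v0"]


def sum_of_squares(regs):
    """Example callee: parameters arrive in a0/a1, s0/s1 are used as scratch."""
    regs["s0"] = regs["a0"] * regs["a0"]
    regs["s1"] = regs["a1"] * regs["a1"]
    return regs["s0"] + regs["s1"]


regs = {"a0": 3, "a1": 4, "s0": 111, "s1": 222, "v0": 0}
stack = []
result = call_procedure(regs, stack, sum_of_squares)
```

After the call the caller's values in `s0` and `s1` are intact and the stack is back to its original height, which is the contract the steps above depend on.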
Type and Size of Operands
The type of an operand is designated by encoding it in the instruction's operation code
The type of an operand, e.g. single precision float, effectively gives its size
Common operand types include character, half word and word size integer, and single- and double-precision floating point
Characters are almost always in ASCII, integers are in 2's complement, and floating point is in IEEE 754
The 16-bit Unicode used in Java is gaining popularity due to its support for international character sets
For business applications some architectures support a decimal format in binary coded decimal (BCD)
Depending on the size of the word, the complexity of handling different operand types differs
DSPs offer fixed point data types to support high precision floating point arithmetic and to allow sharing a single exponent for multiple numbers
For graphics applications, vertex and pixel operands are added features
Size of Operands
Double-word data type is used for double-precision floating point operations and address storage in machines with a 64-bit wide address bus
Words are used for integer operations and for 32-bit address bus machines
Because of the mix in SPEC, word and double-word data types dominate
Frequency of reference by size, based on SPEC2000 on Alpha
Instruction Representation
Humans are taught to think in base 10 (decimal), but numbers may be represented in any base (123 in base 10 = 1111011 in binary, or base 2)
Numbers are stored in computers as a series of high and low electronic signals (binary numbers)
Binary digits are called bits and are considered the atom of computing
Each piece of an instruction is a number, and placing these numbers together forms the instruction
The assembler translates symbolic assembly instructions into machine language instructions (machine code)
Example:
Assembly: add $t0, $s1, $s2
M/C language (decimal):
0        17       18       8        0        32
6 bits   5 bits   5 bits   5 bits   5 bits   6 bits
M/C language (binary):
000000   10001    10010    01000    00000    100000
Note: the MIPS compiler by default maps $s0,...,$s7 to registers 16-23 and $t0,...,$t7 to registers 8-15
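As a sketch of how the pieces are "placed together", the following Python function packs six fields of widths 6, 5, 5, 5, 5 and 6 bits into one 32-bit word. The function name `encode_r_format` is hypothetical; the field values are those of `add $t0, $s1, $s2` under the standard MIPS register numbering ($t0 = 8, $s1 = 17, $s2 = 18).

```python
def encode_r_format(op, rs, rt, rd, shamt, funct):
    """Pack the six R-format fields (6, 5, 5, 5, 5, 6 bits) into one 32-bit word."""
    fields = ((op, 6), (rs, 5), (rt, 5), (rd, 5), (shamt, 5), (funct, 6))
    word = 0
    for value, width in fields:
        if not 0 <= value < (1 << width):
            raise ValueError("field does not fit in its width")
        word = (word << width) | value    # append the field to the right
    return word


# add $t0, $s1, $s2: rs = 17, rt = 18, rd = 8, funct = 32 selects "add"
word = encode_r_format(op=0, rs=17, rt=18, rd=8, shamt=0, funct=32)
binary = format(word, "032b")
```

Printing `binary` in groups of 6, 5, 5, 5, 5 and 6 bits gives back the individual fields, which is how a disassembler would recover them.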
Encoding an Instruction Set
Instruction encoding affects the size of the compiled program and the complexity of the CPU implementation
The operation is typically specified in one field called the opcode
The addressing mode for the operand can be encoded with the operation, or specified through a separate identifier when a large number of modes is supported
The architecture must balance several competing factors:
Desire to support as many registers and addressing modes as possible
Effect of operand specification on the size of the instruction (program)
Desire to simplify instruction fetching and decoding during execution
Fixed size instruction encoding simplifies the CPU design while limiting the addressing modes supported
An architect caring about the code size can use variable size encoding
A hybrid approach is to allow variability by supporting multiple-sized instructions
Encoding Examples
MIPS Instruction Format
Register-format instructions
op       rs       rt       rd       shamt    funct
6 bits   5 bits   5 bits   5 bits   5 bits   6 bits
op: Basic operation of the instruction, traditionally called the opcode
rs: The first register source operand
rt: The second register source operand
rd: The register destination operand; it gets the result of the operation
shamt: Shift amount
funct: This field selects the specific variant of the operation of the op field
Immediate-type instructions
op       rs       rt       address
6 bits   5 bits   5 bits   16 bits
Some instructions need longer fields than provided, for large value constants
The 16-bit address means a load word instruction can load a word within a region of ±2^15 bytes of the address in the base register
Example: lw $t0, 32($s3)   # Temporary register $t0 gets A[8]
Instruction   Format   op   rs    rt    rd    shamt   funct   address
add           R        0    reg   reg   reg   0       32      N/A
sub           R        0    reg   reg   reg   0       34      N/A
lw            I        35   reg   reg   N/A   N/A     N/A     address
sw            I        43   reg   reg   N/A   N/A     N/A     address
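The immediate format can be sketched the same way. The helpers below are hypothetical (`encode_i_format`, `decode_offset`); they assume the 16-bit address field is a signed two's complement offset, and use the standard MIPS numbers for `lw` (opcode 35), $s3 (register 19) and $t0 (register 8).

```python
def encode_i_format(op, rs, rt, offset):
    """Pack op (6 bits), rs (5), rt (5) and a signed 16-bit offset into 32 bits."""
    if not -(1 << 15) <= offset < (1 << 15):
        raise ValueError("offset outside the signed 16-bit range")
    return (op << 26) | (rs << 21) | (rt << 16) | (offset & 0xFFFF)


def decode_offset(word):
    """Recover the sign-extended 16-bit offset from an I-format word."""
    raw = word & 0xFFFF
    return raw - (1 << 16) if raw & 0x8000 else raw


# lw $t0, 32($s3): base register rs = 19, target register rt = 8
lw_word = encode_i_format(op=35, rs=19, rt=8, offset=32)
```

The range check is where the limit on the reachable region comes from: an offset that does not fit in 16 bits simply cannot be encoded in this format.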
The Stored Program Concept
Learning how instructions are represented leads to discovering the secret of computing: the stored-program concept
Today's computers are built on two key principles:
Instructions are represented as numbers
Programs can be stored in memory to be read or written just like numbers
The power of the concept: memory can contain
the source code for an editor
the compiled m/c code for the editor
the text that the compiled program is using
the compiler that generated the code
Figure: a processor connected to a memory holding an accounting program (machine code), an editor program (machine code), a C compiler (machine code), payroll data, book text, and the source code in C for the editor program
Compiling if-then-else in MIPS
Assuming the five variables f, g, h, i, and j correspond to the five registers $s0 through $s4, what is the compiled MIPS code for the following C if statement?
if (i == j) f = g + h; else f = g - h;
Flowchart: test i == j; if i == j then f = g + h; if i ≠ j then (Else:) f = g - h; both paths join at Exit:
MIPS:
bne $s3, $s4, Else        # go to Else if i ≠ j
add $s0, $s1, $s2         # f = g + h (skipped if i ≠ j)
j Exit                    # go to Exit
Else: sub $s0, $s1, $s2   # f = g - h (skipped if i = j)
Exit:
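To see how the branch and the jump steer execution, here is a toy interpreter in Python. It is a sketch only: the tuple-based program representation, the `nop` placeholder at the exit label and the function names are all assumptions made for illustration, and only the four instructions used above are modeled.

```python
# Each entry is (label or None, opcode, operands); registers follow the slide.
PROGRAM = [
    (None,   "bne", ("s3", "s4", "Else")),   # go to Else if i != j
    (None,   "add", ("s0", "s1", "s2")),     # f = g + h
    (None,   "j",   ("Exit",)),              # skip the else part
    ("Else", "sub", ("s0", "s1", "s2")),     # f = g - h
    ("Exit", "nop", ()),                     # placeholder for the code after the if
]


def run(program, regs):
    labels = {label: index for index, (label, _, _) in enumerate(program) if label}
    pc = 0
    while pc < len(program):
        _, op, args = program[pc]
        pc += 1                               # default: fall through to the next instruction
        if op == "add":
            regs[args[0]] = regs[args[1]] + regs[args[2]]
        elif op == "sub":
            regs[args[0]] = regs[args[1]] - regs[args[2]]
        elif op == "bne" and regs[args[0]] != regs[args[1]]:
            pc = labels[args[2]]
        elif op == "j":
            pc = labels[args[0]]
    return regs


def compiled_if(g, h, i, j):
    regs = {"s0": 0, "s1": g, "s2": h, "s3": i, "s4": j}
    return run(PROGRAM, regs)["s0"]
```

Note that the compiler tests the opposite condition (`bne` rather than a branch on equality) so that the then-part can follow the test directly.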
Typical Compilation
Major Types of Optimization
Each entry gives the optimization name, an explanation, and its frequency (N.M. = not measured)
High-level: at or near source level; machine independent
  Procedure integration: Replace procedure call by procedure body (N.M.)
Local: within straight line code
  Common subexpression elimination: Replace two instances of the same computation by a single copy (18%)
  Constant propagation: Replace all instances of a variable that is assigned a constant with the constant (22%)
  Stack height reduction: Rearrange expression tree to minimize resources needed for expression evaluation (N.M.)
Global: across a branch
  Global common subexpression elimination: Same as local, but this version crosses branches (13%)
  Copy propagation: Replace all instances of a variable A that has been assigned X (i.e., A = X) with X (11%)
  Code motion: Remove code from a loop that computes the same value each iteration of the loop (16%)
  Induction variable elimination: Simplify/eliminate array addressing calculations within loops (2%)
Machine-dependent: depends on machine knowledge
  Strength reduction: Many examples, such as replacing a multiply by a constant with adds and shifts (N.M.)
  Pipeline scheduling: Reorder instructions to improve pipeline performance (N.M.)
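Two of the transformations listed above are easy to show as before/after pairs. The functions below are hand-written illustrations (hypothetical names), not output from any particular compiler.

```python
def times_ten(x):
    """Original form: one multiply by a constant."""
    return x * 10


def times_ten_reduced(x):
    """Strength reduction: 10*x = 8*x + 2*x, built from two shifts and an add."""
    return (x << 3) + (x << 1)


def before_cse(a, b, c):
    """The subexpression a + b is computed twice."""
    return (a + b) * c + (a + b)


def after_cse(a, b, c):
    """Common subexpression elimination: compute a + b once and reuse it."""
    t = a + b
    return t * c + t
```

Both rewrites keep the result unchanged for every input, which is the requirement any optimization pass has to meet before it can be applied automatically.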
Effect of Compiler Optimization
Measurements taken on S
Figure axis: program and compiler optimization level
Level 0: non-optimized code
Level 1: local optimization
Level 2: global optimization, s/w pipelining
Level 3: adds procedure integration
Compiler Support for Multimedia Instr.
Intel's MMX and PowerPC AltiVec have small vector processing capabilities targeting multimedia applications (to speed up graphics)
Intel added a new set of instructions called Streaming SIMD Extension
A major advantage of vector computers is hiding the latency of memory access by loading multiple elements and then overlapping execution with data transfer
Vector computers typically have strided and/or gather/scatter addressing to perform operations on distant memory locations
Strided addressing allows memory access in increments larger than one
Gather/scatter addressing is similar to register indirect mode, where the addresses are stored instead of the data
Supporting vector operations without strided addressing, as in Intel's MMX, limits the potential speedup
Such limited support for vector processing makes the use of vectorizing compiler optimization unpopular and restricts its scope to hand coded routines
SIMD instructions on MMX and AltiVec tend to be solutions, not primitives
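The difference between strided and gather/scatter access can be sketched with a flat Python list standing in for memory. The helper names (`strided_load`, `gather`, `scatter`) and the sample data are hypothetical.

```python
def strided_load(memory, base, stride, count):
    """Load `count` elements starting at `base`, stepping by `stride` each time."""
    return [memory[base + k * stride] for k in range(count)]


def gather(memory, addresses):
    """Load from an explicit list of addresses (the index vector)."""
    return [memory[address] for address in addresses]


def scatter(memory, addresses, values):
    """Store each value at the matching address from the index vector."""
    for address, value in zip(addresses, values):
        memory[address] = value


memory = list(range(100, 116))                 # 16 words holding 100..115
column = strided_load(memory, base=1, stride=4, count=4)
picked = gather(memory, [0, 7, 3])
scatter(memory, [0, 7, 3], [-1, -2, -3])
```

A stride of 4 over a 4-wide row-major matrix walks down one column, the kind of access that needs strided addressing to vectorize well.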
Starting a Program
Figure: C program -> Compiler -> Assembly language program -> Assembler -> Object: machine language module (together with Object: library routine (machine language)) -> Linker -> Executable: machine language program -> Loader -> Memory
Linker
- Places code and data modules symbolically in memory
- Determines the addresses of data and instruction labels
- Patches both internal and external references
Object files for Unix typically contain:
Header: size and position of components
Text segment: machine code
Data segment: static and dynamic variables
Relocation info: identifies absolute memory references
Symbol table: names and locations of labels, procedures and variables
Debugging info: mapping of source to object code, break points, etc.
Loading Executable Program
Figure: memory layout, from low to high addresses: reserved region starting at address 0; text segment starting at pc = 0040 0000 hex; static data starting at 1000 0000 hex ($gp = 1000 8000 hex); dynamic data above the static data; stack growing down from $sp = 7fff ffff hex
To load an executable, the operating system follows these steps:
Reads the executable file header to determine the size of the text and data segments
Creates an address space large enough for the text and data
Copies the instructions and data from the executable file into memory
Copies the parameters (if any) to the main program onto the stack
Initializes the machine registers and sets the stack pointer to the first free location
Jumps to a start-up routine that copies the parameters into the argument registers and calls the main routine of the program
Instruction Set Design Issues
Instruction set design issues:
Number of addresses
Flow of control
Operand types
Addressing modes
Instruction types
Instruction formats
Number of Addresses
Four categories
3-address machines
- 2 for the source operands and one for the result
2-address machines
- One address doubles as source and result
1-address machines
- Accumulator machines
- Accumulator is used for one source and result
0-address machines
- Stack machines
- Operands are taken from the stack
- Result goes onto the stack
Number of Addresses (cont.)
Three-address machines
Two for the source operands, one for the result
RISC processors use three addresses
Sample instructions
add dest,src1,src2     M(dest) = [src1] + [src2]
sub dest,src1,src2     M(dest) = [src1] - [src2]
mult dest,src1,src2    M(dest) = [src1] * [src2]
Three addresses: Operand 1, Operand 2, Result
Example: a = b + c
Three-address instruction formats are not common, because they require a relatively long instruction format to hold the three address references
Number of Addresses (cont.)
Example
C statement
A = B + C * D - E + F + A
Equivalent code:
mult T,C,D    T = C*D
add T,T,B     T = B + C*D
sub T,T,E     T = B + C*D - E
add T,T,F     T = B + C*D - E + F
add A,T,A     A = B + C*D - E + F + A
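The five-instruction sequence above can be checked with a small Python model of a three-address machine. The representation (tuples of opcode, destination and two sources, with memory as a dictionary) is an assumption made for illustration.

```python
OPERATIONS = {
    "add": lambda x, y: x + y,
    "sub": lambda x, y: x - y,
    "mult": lambda x, y: x * y,
}


def run_three_address(program, memory):
    """Execute (op, dest, src1, src2) tuples: M(dest) = [src1] op [src2]."""
    for op, dest, src1, src2 in program:
        memory[dest] = OPERATIONS[op](memory[src1], memory[src2])
    return memory


PROGRAM = [
    ("mult", "T", "C", "D"),   # T = C*D
    ("add", "T", "T", "B"),    # T = B + C*D
    ("sub", "T", "T", "E"),    # T = B + C*D - E
    ("add", "T", "T", "F"),    # T = B + C*D - E + F
    ("add", "A", "T", "A"),    # A = B + C*D - E + F + A
]

memory = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6, "T": 0}
run_three_address(PROGRAM, memory)
```

Five instructions suffice, but each one carries three addresses, which is the trade-off the following slides explore.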
Number of Addresses (cont.)
Two-address machines
One address doubles (for source operand and result)
The last example makes a case for it
- Address T is used twice
Sample instructions
load dest,src    M(dest) = [src]
add dest,src     M(dest) = [dest] + [src]
sub dest,src     M(dest) = [dest] - [src]
mult dest,src    M(dest) = [dest] * [src]
Two addresses: one address doubles as operand and result
Example: a = a + b
The two-address format reduces the space requirement but also introduces some awkwardness. To avoid altering the value of an operand, a MOVE instruction is used to move one of the values to a result or temporary location before performing the operation
Number of Addresses (cont.)
Example
C statement
A = B + C * D - E + F + A
Equivalent code:
load T,C    T = C
mult T,D    T = C*D
add T,B     T = B + C*D
sub T,E     T = B + C*D - E
add T,F     T = B + C*D - E + F
add A,T     A = B + C*D - E + F + A
Number of Addresses (cont.)
One-address machines
Use a special set of registers called accumulators
- Specify one source operand and receive the result
Called accumulator machines
Sample instructions
load addr     accum = [addr]
store addr    M[addr] = accum
add addr      accum = accum + [addr]
sub addr      accum = accum - [addr]
mult addr     accum = accum * [addr]
Number of Addresses (cont.)
Example
C statement
A = B + C * D - E + F + A
Equivalent code:
load C     load C into accum
mult D     accum = C*D
add B      accum = C*D + B
sub E      accum = B + C*D - E
add F      accum = B + C*D - E + F
add A      accum = B + C*D - E + F + A
store A    store accum contents in A
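The same statement on an accumulator machine can be modeled in Python as follows. The tuple representation and the function name `run_accumulator` are illustrative assumptions.

```python
def run_accumulator(program, memory):
    """Execute (op, addr) pairs; the accumulator is the implicit second operand."""
    accum = 0
    for op, addr in program:
        if op == "load":
            accum = memory[addr]
        elif op == "store":
            memory[addr] = accum
        elif op == "add":
            accum += memory[addr]
        elif op == "sub":
            accum -= memory[addr]
        elif op == "mult":
            accum *= memory[addr]
        else:
            raise ValueError("unknown instruction: " + op)
    return memory


PROGRAM = [
    ("load", "C"), ("mult", "D"), ("add", "B"), ("sub", "E"),
    ("add", "F"), ("add", "A"), ("store", "A"),
]

memory = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
run_accumulator(PROGRAM, memory)
```

The program grows to seven instructions, but each instruction now names only a single address.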
Number of Addresses (cont.)
Zero-address machines
Stack supplies operands and receives the result
- Special instructions to load and store use an address
Called stack machines (Ex: HP3000, Burroughs B5500)
Sample instructions
push addr    push([addr])
pop addr     pop([addr])
add          push(pop + pop)
sub          push(pop - pop)
mult         push(pop * pop)
Number of Addresses (cont.)
Example
C statement
A = B + C * D - E + F + A
Equivalent code:
push E
push C
push D
mult
push B
add
sub
push F
add
push A
add
pop A
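A small Python model makes the operand order of the stack code easier to follow. It assumes, as the sample instructions suggest, that `sub` computes (first value popped) minus (second value popped); the tuple representation and the name `run_stack` are hypothetical.

```python
def run_stack(program, memory):
    """Execute zero-address code; only push and pop carry an address."""
    stack = []
    for instruction in program:
        op = instruction[0]
        if op == "push":
            stack.append(memory[instruction[1]])
        elif op == "pop":
            memory[instruction[1]] = stack.pop()
        else:
            first = stack.pop()          # top of stack
            second = stack.pop()         # element below it
            if op == "add":
                stack.append(first + second)
            elif op == "sub":
                stack.append(first - second)
            elif op == "mult":
                stack.append(first * second)
            else:
                raise ValueError("unknown instruction: " + op)
    return memory


PROGRAM = [
    ("push", "E"), ("push", "C"), ("push", "D"), ("mult",),
    ("push", "B"), ("add",), ("sub",),
    ("push", "F"), ("add",), ("push", "A"), ("add",), ("pop", "A"),
]

memory = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
run_stack(PROGRAM, memory)
```

E is pushed first so that it sits below B + C*D when `sub` executes; with the assumed operand order the subtraction then yields B + C*D - E.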
Load/Store Architecture
Instructions expect operands in internal processor registers
Special LOAD and STORE instructions move data between registers and memory
RISC uses this architecture
Reduces instruction length
Load/Store Architecture (cont.)
Sample instructions
load Rd,addr       Rd = [addr]
store addr,Rs      (addr) = Rs
add Rd,Rs1,Rs2     Rd = Rs1 + Rs2
sub Rd,Rs1,Rs2     Rd = Rs1 - Rs2
mult Rd,Rs1,Rs2    Rd = Rs1 * Rs2
Number of Addresses (cont.)
Example
C statement
A = B + C * D - E + F + A
Equivalent code:
load R1,B
load R2,C
load R3,D
load R4,E
load R5,F
load R6,A
mult R2,R2,R3
add R2,R2,R1
sub R2,R2,R4
add R2,R2,R5
add R2,R2,R6
store A,R2
Flow of Control
Default is sequential flow
Several instructions alter this default execution
Branches
- Unconditional
- Conditional
- Delayed branches
Procedure calls
- Delayed procedure calls
Flow of Control (cont.)
Branches
Unconditional
- Absolute address
- PC-relative
  Target address is specified relative to PC contents
  Relocatable code
Example: MIPS
- Absolute address
  j target
- PC-relative
  b target
Flow of Control (cont.)
Branches
Conditional
- Jump is taken only if the condition is met
Two types
- Set-Then-Jump
  Condition testing is separated from branching
  Condition code registers are used to convey the condition test result
  Condition code registers keep a record of the status of the last ALU operation, such as an overflow condition
- Example: Pentium code
  cmp AX,BX     compare AX and BX
  je target     jump if equal
Flow of Control (cont.)
- Test-and-Jump
  Single instruction performs condition testing and branching
- Example: MIPS instruction
  beq Rsrc1,Rsrc2,target
  Jumps to target if Rsrc1 = Rsrc2
Delayed branching
Control is transferred after executing the instruction that follows the branch instruction
- This instruction slot is called the delay slot
Improves efficiency
Highly pipelined RISC processors support it
Flow of Control (cont.)
Procedure calls
Facilitate modular programming
Require two pieces of information to return
- End of procedure
  Pentium uses the ret instruction
  MIPS uses the jr instruction
- Return address
  In a (special) register: MIPS allows any general-purpose register
  On the stack: Pentium
Flow of Control (cont.)
Figure: delay slot
Parameter Passing
Two basic techniques
Register-based (e.g., PowerPC, MIPS)
- Internal registers are used
  Faster
  Limit the number of parameters
  Recursive procedures
Stack-based (e.g., Pentium)
- Stack is used
  More general
Operand Types
Instructions support basic data types
Characters
Integers
Floating-point
Instruction overload
Same instruction for different data types
Example: Pentium
mov AL,address     loads an 8-bit value
mov AX,address     loads a 16-bit value
mov EAX,address    loads a 32-bit value
Operand Types (cont.)
Separate instructions
Instructions specify the operand size
Example: MIPS
lb Rdest,address    loads a byte
lh Rdest,address    loads a halfword (16 bits)
lw Rdest,address    loads a word (32 bits)
ld Rdest,address    loads a doubleword (64 bits)
Similar instructions exist for store
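How the mnemonic alone fixes the operand size can be sketched in Python. The sketch assumes big-endian byte order and sign extension on every load (both are simplifications chosen for the example), and the `load` helper is hypothetical.

```python
SIZES = {"lb": 1, "lh": 2, "lw": 4, "ld": 8}    # bytes read by each mnemonic


def load(memory, mnemonic, address):
    """Read a signed value whose width is fixed by the mnemonic alone."""
    size = SIZES[mnemonic]
    return int.from_bytes(memory[address:address + size], byteorder="big", signed=True)


memory = bytes([0xFF, 0xFE, 0x00, 0x01, 0x00, 0x00, 0x00, 0x00])
byte_value = load(memory, "lb", 0)
half_value = load(memory, "lh", 0)
word_value = load(memory, "lw", 0)
double_value = load(memory, "ld", 0)
```

The same address yields four different values because each mnemonic reads a different number of bytes; with the overloaded style of the previous slide the destination register name carries that information instead.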
Addressing Modes
How the operands are specified
Operands can be in three places
- Registers
  Register addressing mode
- Part of instruction
  Constant
  Immediate addressing mode
  All processors support these two addressing modes
- Memory
  Difference between RISC and CISC
  CISC supports a large variety of addressing modes
  RISC follows load/store architecture
Instruction Types
Several types of instructions
Data movement
- Pentium: mov dest,src
- Some do not provide direct data movement instructions
- Indirect data movement
  add Rdest,Rsrc,0    Rdest = Rsrc+0
Arithmetic and Logical
- Arithmetic
  Integer and floating-point, signed and unsigned
  add, subtract, multiply, divide
- Logical
  and, or, not, xor
Instruction Types (cont.)
Condition code bits
S: Sign bit (0 = +, 1 = -)
Z: Zero bit (0 = nonzero, 1 = zero)
O: Overflow bit (0 = no overflow, 1 = overflow)
C: Carry bit (0 = no carry, 1 = carry)
Example: Pentium
cmp count,25    compare count to 25 (subtract 25 from count)
je target       jump if equal
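The four condition code bits can be derived from a subtraction as in the sketch below. It assumes a 16-bit operand size and the x86 convention that the carry bit records a borrow on subtraction; the function name `compare` is hypothetical.

```python
def compare(a, b, bits=16):
    """Subtract b from a as a compare does; return the flags, discard the result."""
    mask = (1 << bits) - 1
    sign = 1 << (bits - 1)
    a &= mask
    b &= mask
    result = (a - b) & mask
    return {
        "S": int(bool(result & sign)),                   # sign of the result
        "Z": int(result == 0),                           # result is zero
        "O": int(bool((a ^ b) & (a ^ result) & sign)),   # signed overflow
        "C": int(a < b),                                 # unsigned borrow
    }


equal = compare(25, 25)
smaller = compare(10, 25)
overflow = compare(0x8000, 1)     # most negative 16-bit value minus 1
```

A conditional jump on equality only needs to inspect `Z`; signed and unsigned ordering tests combine the other bits.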
Instruction Types (cont.)
Flow control and I/O instructions
- Branch
- Procedure call
- Interrupts
I/O instructions
- Memory-mapped I/O
  Most processors support memory-mapped I/O
  No separate instructions for I/O
- Isolated I/O
  Pentium supports isolated I/O
  Separate I/O instructions
  in AX,io_port     read from an I/O port
  out io_port,AX    write to an I/O port
Instruction Formats
Two types
Fixed-length
- Used by RISC processors
- 32-bit RISC processors use 32-bit wide instructions
  Examples: SPARC, MIPS, PowerPC
Variable-length
- Used by CISC processors
- Memory operands need more bits to specify
Opcode
Major and exact operation
Examples of Instruction Formats
RISC (Reduced Instruction Set Computer) versus CISC (Complex Instruction Set Computer)
RISC vs CISC
The underlying philosophy of RISC machines is that a system is better able to manage program execution when the program consists of only a few different instructions that are the same length and require the same number of clock cycles to decode and execute
RISC systems access memory only with explicit load and store instructions
In CISC systems, many different kinds of instructions access memory, making instruction length variable and fetch-decode-execute time unpredictable
RISC vs CISC
The difference between CISC and RISC becomes evident through the basic computer performance equation:
time/program = time/cycle × cycles/instruction × instructions/program
RISC systems shorten execution time by reducing the clock cycles per instruction
CISC systems improve performance by reducing the number of instructions per program
RISC vs CISC
The simple instruction set of RISC machines enables control units to be hardwired for maximum speed
The more complex, and variable, instruction set of CISC machines requires microcode-based control units that interpret instructions as they are fetched from memory. This translation takes time
With fixed-length instructions, RISC lends itself to pipelining and speculative execution
RISC vs CISC
Consider the program fragments:
CISC              RISC
mov ax, 10        mov ax, 0
mov bx, 5         mov bx, 10
mul bx, ax        mov cx, 5
                  Begin: add ax, bx
                  loop Begin
The total clock cycles for the CISC version might be:
(2 movs × 1 cycle) + (1 mul × 30 cycles) = 32 cycles
While the clock cycles for the RISC version are:
(3 movs × 1 cycle) + (5 adds × 1 cycle) + (5 loops × 1 cycle) = 13 cycles
With the RISC clock cycle being shorter, RISC gives us much faster execution speeds
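The cycle arithmetic for the two fragments can be reproduced with a short Python sketch. The figures used here are assumptions taken from this example only (30 cycles for `mul`, 1 cycle for every other instruction, and a loop count of 5); real processors differ, and the helper names are hypothetical.

```python
def total_cycles(mix):
    """Sum count * cycles-per-instruction over (count, cycles) pairs."""
    return sum(count * cycles for count, cycles in mix)


cisc_cycles = total_cycles([(2, 1), (1, 30)])          # 2 movs, 1 mul
risc_cycles = total_cycles([(3, 1), (5, 1), (5, 1)])   # 3 movs, 5 adds, 5 loops


def risc_multiply(multiplicand, multiplier):
    """Repeated addition as in the RISC fragment; assumes multiplier >= 1."""
    ax, bx, cx = 0, multiplicand, multiplier
    while True:
        ax += bx          # add ax, bx
        cx -= 1           # loop: decrement the counter ...
        if cx == 0:       # ... and fall through once it reaches zero
            break
    return ax
```

The RISC fragment executes more instructions yet needs fewer cycles under these assumptions, which is the cycles-per-instruction term of the performance equation at work.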
RISC vs CISC
Because of their load-store ISAs, RISC architectures require a large number of CPU registers
These registers provide fast access to data during sequential program execution
They can also be employed to reduce the overhead typically caused by passing parameters to subprograms
Instead of pulling parameters off of a stack, the subprogram is directed to use a subset of registers
RISC vs CISC
This is how registers can be overlapped in a RISC system
The current window pointer (CWP) points to the active register window
RISC vs CISC
It is becoming increasingly difficult to distinguish RISC architectures from CISC architectures
Some RISC systems provide more extravagant instruction sets than some CISC systems
Some systems combine both approaches
The following two slides summarize the characteristics that traditionally typify the differences between these two architectures
RISC vs. CISC
RISC
- Multiple register sets.
- Three operands per instruction.
- Parameter passing through register windows.
- Single-cycle instructions.
- Hardwired control.
- Highly pipelined.
CISC
- Single register set.
- One or two register operands per instruction.
- Parameter passing through memory.
- Multiple-cycle instructions.
- Microprogrammed control.
- Less pipelined.
(Continued)
RISC vs. CISC
RISC
- Simple instructions, few in number.
- Fixed-length instructions.
- Complexity in compiler.
- Only LOAD/STORE instructions access memory.
- Few addressing modes.
CISC
- Many complex instructions.
- Variable-length instructions.
- Complexity in microcode.
- Many instructions can access memory.
- Many addressing modes.
Summary
Instruction Set Design Issues
Instruction set design issues include:
- Where are operands stored?
  - registers, memory, stack, accumulator
- How many explicit operands are there?
  - 0, 1, 2, or 3
- How is the operand location specified?
  - register, immediate, indirect, ...
- What type and size of operands are supported?
  - byte, int, float, double, string, vector, ...
- What operations are supported?
  - add, sub, mul, move, compare, ...
More About General Purpose Registers
Why do almost all new architectures use GPRs?
- Registers are much faster than memory (even cache).
  - Register values are available immediately.
  - When memory isn't ready, the processor must wait ("stall").
- Registers are convenient for variable storage.
  - The compiler assigns some variables just to registers.
  - More compact code, since small fields specify registers (compared to memory addresses).
[Figure: memory hierarchy: Processor registers, Cache, Memory, Disk]
What Operations are Needed
- Arithmetic and logical
  - Integer arithmetic: ADD, SUB, MULT, DIV, SHIFT
  - Logical operations: AND, OR, XOR, NOT
- Data transfer: copy, load, store
- Control: branch, jump, call, return
- Floating point: ADDF, MULF, DIVF (same as arithmetic, but usually take bigger operands)
- Decimal: ADDD, CONVERT
- String: move, compare, search
- Graphics: pixel and vertex, compression/decompression operations
Stack Architecture: Pros and Cons
Pros
- Good code density (implicit top of stack).
- Low hardware requirements.
- Easy to write a simpler compiler for stack architectures.
Cons
- Stack becomes the bottleneck.
- Little ability for parallelism or pipelining.
- Data is not always at the top of the stack when needed, so additional instructions like TOP and SWAP are needed.
- Difficult to write an optimizing compiler for stack architectures.
Accumulator Architecture: Pros and Cons
Pros
- Very low hardware requirements.
- Easy to design and understand.
Cons
- Accumulator becomes the bottleneck.
- Little ability for parallelism or pipelining.
- High memory traffic.
Memory-Memory Architecture: Pros and Cons
Pros
- Requires fewer instructions (especially if 3 operands).
- Easy to write compilers for (especially if 3 operands).
Cons
- Very high memory traffic (especially if 3 operands).
- Variable number of clocks per instruction.
- With two operands, more data movements are required.
Memory-Register Architecture: Pros and Cons
Pros
- Some data can be accessed without loading first.
- Instruction format easy to encode.
- Good code density.
Cons
- Operands are not equivalent (poor orthogonality).
- Variable number of clocks per instruction.
- May limit number of registers.
Accumulator Architectures
Register-Set Architectures
Register-to-Register (Load-Store) Architectures
Register-to-Memory Architectures
Memory-to-Memory Architectures
Instruction Formats
Instruction Set Architecture (ISA)
- To command a computer's hardware, you must speak its language.
- The words of a machine's language are called instructions, and its vocabulary is called an instruction set.
- Once you learn one machine language, it is easy to pick up others:
  - There are few fundamental operations that all computers must provide.
  - All designers have the same goal of finding a language that simplifies building the hardware and the compiler while maximizing performance and minimizing cost.
- Learning how instructions are represented leads to discovering the secret of computing: the stored-program concept.
- The MIPS instruction set is used as a case study.
Interface Design
A good interface:
- Lasts through many implementations (portability, compatibility).
- Is used in many different ways (generality).
- Provides convenient functionality to higher levels.
- Permits an efficient implementation at lower levels.
Design decisions must take into account:
- Technology
- Machine organization
- Programming languages
- Compiler technology
- Operating systems
[Figure: one interface over time, with several uses above it and several implementations (imp 0, imp 1, ...) below it]
Classifying Instruction Set Architectures
Accumulator Architecture
- Common in early stored-program computers, when hardware was so expensive.
- The machine has only one register (the accumulator) involved in all math and logical operations.
- All operations assume the accumulator as a source operand and as the destination of the operation, with the other operand stored in memory.
Extended Accumulator Architecture
- Dedicated registers for specific operations, e.g. stack and array index registers, are added.
- The 8086 microprocessor is an example of such a special-purpose register architecture.
General-Purpose Register Architecture
- MIPS is an example of such an architecture, where registers are not tied to a single role.
- This type of instruction set can be further divided into:
  - Register-memory: allows one operand to be in memory.
  - Register-register (load-store): demands all operands to be in registers.

Machine | # general-purpose registers | Architecture style | Year
Motorola 6800 | 2 | Accumulator | 1974
DEC VAX | 16 | Register-memory, memory-memory | 1977
Intel 8086 | 1 | Extended accumulator | 1978
Motorola 68000 | 16 | Register-memory | 1980
Intel 80386 | 8 | Register-memory | 1985
PowerPC | 32 | Load-store | 1992
DEC Alpha | 32 | Load-store | 1992
Compact Code and Stack Architectures
- When memory is scarce, machines like the Intel 80x86 had variable-length instructions to match varying operand specifications and minimize code size.
- Stack machines abandoned registers altogether, arguing that it is hard for compilers to use them efficiently.
- Operands are pushed onto a stack from memory, and the results have to be popped from the stack to memory.
- Operations take their operands by default from the top of the stack and insert the results back onto the stack.
- Stack machines simplify compilers and lend themselves to a compact instruction encoding, but limit compiler optimization (e.g. in math expressions).
Example: A = B + C
Push AddressC   # Top=Top+4; Stack[Top]=Memory[AddressC]
Push AddressB   # Top=Top+4; Stack[Top]=Memory[AddressB]
add             # Stack[Top-4]=Stack[Top]+Stack[Top-4]; Top=Top-4
Pop AddressA    # Memory[AddressA]=Stack[Top]; Top=Top-4
- Compact code is important for heralded network computers, where programs must be downloaded over the Internet (e.g. Java-based applications).
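The push/add/pop sequence for A = B + C can be traced with a minimal Python sketch of a stack machine. The tuple-based instruction format and the `run` function are hypothetical, chosen only for illustration; memory is modeled as a dictionary keyed by symbolic address.

```python
def run(program, memory):
    """Execute a tiny stack-machine program against a memory dictionary."""
    stack = []
    for op, *args in program:
        if op == "push":            # memory -> top of stack
            stack.append(memory[args[0]])
        elif op == "pop":           # top of stack -> memory
            memory[args[0]] = stack.pop()
        elif op == "add":           # operands are implicit: the top two entries
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)
        else:
            raise ValueError(f"unknown operation: {op}")
    return memory


memory = {"A": 0, "B": 2, "C": 3}
program = [("push", "C"), ("push", "B"), ("add",), ("pop", "A")]
run(program, memory)
print(memory["A"])
```

Note that `add` names no operands at all, which is where the compact encoding of stack machines comes from.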
Other Types of Architecture
High-Level-Language Architecture
- In the 1960s, systems software was rarely written in high-level languages, and virtually every commercial operating system before Unix was written in assembly.
- Some people blamed the code density on the instruction set rather than the programming language.
- A machine design philosophy was advocated with the goal of making the hardware more like high-level languages.
- The effectiveness of high-level languages, memory size limitations, and the lack of efficient compilers doomed this philosophy to a historical footnote.
Reduced Instruction Set Architecture
- With the recent development in compiler technology and expanded memory sizes, fewer programmers are using assembly-level coding.
- Instruction set architecture came to be measured by how well compilers, rather than programmers, use it.
- RISC architecture favors simplifying hardware design over enriching the offered set of instructions, relying on compilers to use them effectively to perform complex operations.
- Virtually all new architectures since 1982 follow the RISC philosophy of fixed instruction lengths, load-store operations, and limited addressing modes.
Evolution of Instruction Sets
[Figure, flattened; top to bottom:]
Single Accumulator (EDSAC, 1950)
Accumulator + Index Registers (Manchester Mark I, IBM 700 series, 1953)
Separation of Programming Model from Implementation
- High-level Language Based (B5000, 1963)
- Concept of a Family (IBM 360, 1964)
General Purpose Register Machines
- Complex Instruction Sets (VAX, Intel 432, 1977-80)
- Load/Store Architecture (CDC 6600, Cray 1, 1963-76)
RISC (MIPS, SPARC, HP-PA, IBM RS6000, 1987)
Register-Memory Architectures
Effect of the number of memory operands:

# memory addresses | Max. number of operands | Examples
0 | 3 | SPARC, MIPS, PowerPC, ALPHA
1 | 2 | Intel 80x86, Motorola 68000
2 | 2 | VAX (also has 3-operand format)
3 | 3 | VAX (also has 2-operand format)
Interpreting Memory Addressing
- The address of a word matches the byte address of one of its 4 bytes.
- The addresses of sequential words differ by 4 (the word size in bytes).
- Word addresses are multiples of 4 (alignment restriction).
- Machines that use the address of the leftmost byte as the word address are called "Big Endian"; those that use the rightmost byte are called "Little Endian".
- Misalignment complicates memory access and causes programs to run slower. (Some machines do not allow misaligned memory access at all.)
- Byte ordering can be a problem when exchanging data among different machines.
- Byte addresses affect array index calculation, which must account for word addressing and the offset within the word.

Object addressed | Aligned at byte offsets | Misaligned at byte offsets
Byte | 0, 1, 2, 3, 4, 5, 6, 7 | Never
Half word | 0, 2, 4, 6 | 1, 3, 5, 7
Word | 0, 4 | 1, 2, 3, 5, 6, 7
Double word | 0 | 1, 2, 3, 4, 5, 6, 7
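Byte ordering and the alignment rule can both be illustrated with a short Python sketch. The helper names `word_bytes` and `is_aligned` are made up for this example; a 4-byte word is assumed, as on the slide.

```python
def word_bytes(value, byteorder):
    """Bytes of a 4-byte word in increasing address order."""
    return list(value.to_bytes(4, byteorder))


def is_aligned(address, size):
    """An object of `size` bytes is aligned when its address is a multiple of size."""
    return address % size == 0


word = 0x0A0B0C0D
print("big endian:   ", [hex(b) for b in word_bytes(word, "big")])
print("little endian:", [hex(b) for b in word_bytes(word, "little")])
print("word at 8 aligned?", is_aligned(8, 4))
print("word at 6 aligned?", is_aligned(6, 4))
```

The same four bytes appear in opposite order in the two conventions, which is why byte ordering matters when data moves between machines.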
Addressing Modes
- Addressing modes refer to how to specify the location of an operand (effective address).
- Addressing modes have the ability to:
  - Significantly reduce instruction counts
  - Increase the average CPI
  - Increase the complexity of building a machine
- The VAX machine is used for benchmark data since it supports a wide range of memory addressing modes.
- Famous addressing modes can be classified based on:
  - the source of the data, into register, immediate, or memory
  - the address calculation, into direct and indirect
- An indexed addressing mode is usually provided to allow efficient implementation of loops and array access.
Example of Addressing Modes

Addressing mode | Example | Meaning | When used
Register | Add R4, R3 | Regs[R4] ← Regs[R4] + Regs[R3] | When a value is in a register
Immediate | Add R4, #3 | Regs[R4] ← Regs[R4] + 3 | For constants
Displacement | Add R4, 100(R1) | Regs[R4] ← Regs[R4] + Mem[100 + Regs[R1]] | Accessing local variables
Register indirect | Add R4, (R1) | Regs[R4] ← Regs[R4] + Mem[Regs[R1]] | Accessing using a pointer or a computed address
Indexed | Add R4, (R1 + R2) | Regs[R4] ← Regs[R4] + Mem[Regs[R1] + Regs[R2]] | Sometimes useful in array addressing: R1 = base of the array, R2 = index amount
Direct or absolute | Add R4, (1001) | Regs[R4] ← Regs[R4] + Mem[1001] | Sometimes useful for accessing static data; the address constant may need to be large
Memory indirect or memory deferred | Add R4, @(R3) | Regs[R4] ← Regs[R4] + Mem[Mem[Regs[R3]]] | If R3 is the address of the pointer p, then the mode yields *p
Autoincrement | Add R4, (R2)+ | Regs[R4] ← Regs[R4] + Mem[Regs[R2]]; Regs[R2] ← Regs[R2] + d | Useful for stepping through arrays within a loop. R2 points to the start of the array; each reference increments R2 by d
Autodecrement | Add R4, -(R2) | Regs[R2] ← Regs[R2] - d; Regs[R4] ← Regs[R4] + Mem[Regs[R2]] | Same use as autoincrement. Autodecrement/increment can also act as push/pop to implement a stack
Scaled | Add R4, 100(R2)[R3] | Regs[R4] ← Regs[R4] + Mem[100 + Regs[R2] + Regs[R3] × d] | Used to index arrays
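How each mode locates its operand can be sketched in Python. This is a simplified model, not any real ISA: registers and memory are dictionaries, the function name `fetch_operand` and its parameters are hypothetical, and the autoincrement and autodecrement modes are left out because they also modify a register.

```python
def fetch_operand(mode, regs, mem, r=None, x=None, const=0, d=4):
    """Return the source operand selected by an addressing mode."""
    if mode == "register":
        return regs[r]
    if mode == "immediate":
        return const
    if mode == "displacement":
        return mem[const + regs[r]]
    if mode == "register_indirect":
        return mem[regs[r]]
    if mode == "indexed":
        return mem[regs[r] + regs[x]]
    if mode == "direct":
        return mem[const]
    if mode == "memory_indirect":
        return mem[mem[regs[r]]]
    if mode == "scaled":
        return mem[const + regs[r] + regs[x] * d]
    raise ValueError(f"unknown addressing mode: {mode}")


# Made-up register and memory contents, for illustration only.
regs = {"R1": 100, "R2": 8, "R3": 2}
mem = {100: 200, 108: 11, 116: 22, 200: 33, 1001: 44}

print(fetch_operand("displacement", regs, mem, r="R1", const=8))
print(fetch_operand("memory_indirect", regs, mem, r="R1"))
```

Memory indirect needs two memory reads for one operand, which is one reason richer modes raise the average CPI.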
Addressing Modes for Signal Processing
DSPs offer special addressing modes to better serve popular algorithms.
- Modulo addressing
  - Since DSPs deal with continuous data streams, circular buffers are widely used.
  - Circular or modulo addressing allows automatic increment and decrement, and resets the pointer when reaching the end of the buffer.
- Reverse addressing
  - The resulting address is the reverse order of the current address.
  - Reverse addressing mode expedites the access, which otherwise requires a number of logical instructions or extra memory accesses.
  - Fast Fourier Transform (bit-reversed order for 8 elements):
    0 (000) → 0 (000)
    1 (001) → 4 (100)
    2 (010) → 2 (010)
    3 (011) → 6 (110)
    4 (100) → 1 (001)
    5 (101) → 5 (101)
    6 (110) → 3 (011)
    7 (111) → 7 (111)
- Special features require either hand coding or a compiler that uses such features (GNU would not be a good choice).
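Both DSP addressing modes reduce to a small address computation, sketched below in Python. The function names are illustrative; real DSPs do this in the address-generation hardware rather than in software.

```python
def modulo_next(pointer, base, length, step=1):
    """Advance a circular-buffer pointer, wrapping at the end of the buffer."""
    return base + (pointer - base + step) % length


def bit_reverse(index, bits):
    """Reverse the low `bits` bits of index (FFT bit-reversed addressing)."""
    result = 0
    for _ in range(bits):
        result = (result << 1) | (index & 1)
        index >>= 1
    return result


print([bit_reverse(i, 3) for i in range(8)])
print(modulo_next(7, base=0, length=8))
```

The first line printed reproduces the right-hand column of the 8-element FFT table on this slide.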
Operations of the Computer Hardware
"There must certainly be instructions for performing the fundamental arithmetic operations."
  Burks, Goldstine, and von Neumann, 1947
- Assembly language is a symbolic representation of what the processor actually understands.
- The MIPS assembler allows only one instruction per line and ignores comments following # until the end of the line.
Example: translation of a segment of a C program to MIPS assembly instructions
C:    f = (g + h) - (i + j)
MIPS:
add t0, g, h    # temp variable t0 contains "g + h"
add t1, i, j    # temp variable t1 contains "i + j"
sub f, t0, t1   # f = t0 - t1 = (g + h) - (i + j)
Operations in the Instruction Set

Operator type | Examples
Arithmetic and logical | Integer arithmetic and logical operations: add, and, subtract, or
Data transfer | Loads-stores (move instructions on machines with memory addressing)
Control | Branch, jump, procedure call and return, trap
System | Operating system call, virtual memory management instructions
Floating point | Floating-point instructions: add, multiply
Decimal | Decimal add, decimal multiply, decimal-to-character conversion
String | String move, string compare, string search
Graphics | Pixel operations, compression/decompression operations

- Arithmetic, logical, data transfer, and control are almost standard categories for all machines.
- System instructions are required for multiprogramming environments, although support for system functions varies.
- Decimal and string instructions can be primitives, e.g. on the IBM 360 and the VAX.
- Support for floating point, decimal, string, and graphics can be optional, sometimes provided via a co-processor.
- Some machines rely on the compiler to synthesize special operations, such as string handling, from simpler instructions.
Operations for Media and Signal Processing
- Single instruction multiple data (SIMD) and vector instructions are often supported in DSPs, which are commonly used in multimedia and signal processing applications.
- Partitioned add (integer)
  - Performs multiple 16-bit additions on a 64-bit ALU, since most data are narrow.
  - Increases ALU throughput for multimedia applications.
- Paired single operations (float)
  - Allow the same register to act as two operands to the same operation.
  - Handy in dealing with vertices and coordinates.
- Multiply and accumulate
  - Very handy for calculating dot products of vectors (signal processing) and matrix multiplication.
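A partitioned add and a multiply-accumulate loop can be sketched in Python as follows. The lane layout (four 16-bit lanes in a 64-bit word, with wrap-around inside each lane) is an assumption made for illustration; actual media extensions also offer saturating variants.

```python
LANE_BITS = 16
LANES = 4
LANE_MASK = (1 << LANE_BITS) - 1


def partitioned_add(a, b):
    """Add four independent 16-bit lanes packed in two 64-bit words."""
    result = 0
    for lane in range(LANES):
        shift = lane * LANE_BITS
        lane_sum = ((a >> shift) & LANE_MASK) + ((b >> shift) & LANE_MASK)
        # Masking discards the carry, so it never spills into the next lane.
        result |= (lane_sum & LANE_MASK) << shift
    return result


def dot_product(xs, ys):
    """Dot product as a chain of multiply-accumulate steps."""
    acc = 0
    for x, y in zip(xs, ys):
        acc += x * y        # one multiply-accumulate per element pair
    return acc


print(hex(partitioned_add(0x0001_FFFF_0003_0004, 0x0001_0001_0001_0001)))
print(dot_product([1, 2, 3], [4, 5, 6]))
```

In hardware the four lane additions happen in the same cycle; the loop here only models the result.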
Frequency of Operations Usage
Make the common case fast by focusing on these operations.
- The most widely executed instructions are the simple operations of an instruction set.
- The following is the average usage in SPECint92 on the Intel 80x86:

Rank | 80x86 instruction | Integer average (% total executed)
1 | Load | 22%
2 | Conditional branch | 20%
3 | Compare | 16%
4 | Store | 12%
5 | Add | 8%
6 | And | 6%
7 | Sub | 5%
8 | Move register-register | 4%
9 | Call | 1%
10 | Return | 1%
Total | | 96%
Control Flow Instructions
- Jump: for unconditional change in the control flow
- Branch: for conditional change in the control flow
- Procedure calls and returns
[Figure: relative frequency of control flow instructions. Data is based on SPEC on Alpha.]
Destination Address Definition
- Addressing relative to the program counter proved to be the best choice for forward and backward branches or jumps (load-address independent).
- To allow for dynamic loading of library routines, register indirect addressing allows addresses to be loaded into special registers (e.g. virtual functions in C++ and system calls in a case statement).
[Figure: branch distance distribution. Data is based on SPEC on Alpha.]
Condition Evaluation
- Compare-and-branch can be efficient if the majority of conditions are comparisons with zero.
- Remember to focus on the common case.
[Figure not recovered. Based on SPEC on MIPS.]
Frequency of Types of Comparison
[Figure: frequency of comparison types in branches. Data is based on SPEC on Alpha.]
- Different benchmarks and machines set new design priorities.
- DSPs support a repeat instruction for "for" loops (vectors) using registers.
Supporting Procedures
Execution of a procedure follows these steps:
1. Store parameters in a place accessible to the procedure.
2. Transfer control to the procedure.
3. Acquire the storage resources needed for the procedure.
4. Perform the desired task.
5. Store the result value in a place accessible to the calling program.
6. Return control to the point of origin.
The hardware provides a program counter to trace instruction flow and manage transfer of control.
Parameter Passing
- Registers can be used for passing a small number of parameters.
- A stack is used to spill registers of the current context and make room for the called procedure to run, and to allow large parameters to be passed.
- Storage of machine state can be performed by the caller or the callee.
- Handling of shared variables is important to ensure correct semantics and thus requires clear specifications in the library interface.
- Global variables stored in registers need careful handling.
Type and Size of Operands
- The type of an operand is designated by encoding it in the instruction's operation code.
- The type of an operand, e.g. single-precision float, effectively gives its size.
- Common operand types include character, half-word and word-size integers, and single- and double-precision floating point.
- Characters are almost always in ASCII, integers are in 2's complement, and floating point is in IEEE 754.
- The 16-bit Unicode used in Java is gaining popularity due to its support for international character sets.
- For business applications, some architectures support a decimal format in binary coded decimal (BCD).
- Depending on the size of the word, the complexity of handling different operand types differs.
- DSPs offer fixed-point data types to support high-precision floating-point arithmetic and to allow sharing a single exponent among multiple numbers.
- For graphics applications, vertex and pixel operands are added features.
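Three of the encodings named on this slide (2's complement, BCD, and IEEE 754 single precision) can be inspected with a short Python sketch. The helper names are invented for this example; `struct.pack` is the standard-library call used to obtain the IEEE 754 bit pattern.

```python
import struct


def twos_complement(value, bits):
    """Bit pattern of a signed integer in `bits`-bit 2's complement."""
    return value & ((1 << bits) - 1)


def to_bcd(n):
    """Pack a non-negative decimal number into BCD, one digit per 4 bits."""
    result = 0
    for position, digit in enumerate(reversed(str(n))):
        result |= int(digit) << (4 * position)
    return result


def float_bits(x):
    """IEEE 754 single-precision bit pattern of x, as a hex string."""
    return struct.pack(">f", x).hex()


print(hex(twos_complement(-1, 8)))
print(hex(to_bcd(1947)))
print(float_bits(1.0))
```

In BCD the hex digits of the packed value read the same as the decimal digits, which is what makes decimal arithmetic on it convenient.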
Size of Operands
- The double-word data type is used for double-precision floating-point operations and for address storage in machines with a 64-bit-wide address bus.
- Words are used for integer operations and for 32-bit address bus machines.
- Because of this mix, word and double-word data types dominate in SPEC.
[Figure: frequency of reference by size, based on SPEC2000 on Alpha.]
Instruction Representation
- Humans are taught to think in base 10 (decimal), but numbers may be represented in any base (123 in base 10 = 1111011 in binary, or base 2).
- Numbers are stored in computers as a series of high and low electronic signals (binary numbers).
- Binary digits are called bits and are considered the atoms of computing.
- Each piece of an instruction is a number, and placing these numbers together forms the instruction.
- The assembler translates symbolic assembly instructions into machine language instructions (machine code).
Example:
Assembly:                add $t0, $s1, $s2
M/C language (decimal):  0 | 17 | 18 | 8 | 0 | 32
M/C language (binary):   000000 | 10001 | 10010 | 01000 | 00000 | 100000
Field widths:            6 bits | 5 bits | 5 bits | 5 bits | 5 bits | 6 bits
Note: the MIPS compiler by default maps $s0, ..., $s7 to registers 16-23 and $t0, ..., $t7 to registers 8-15.
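Placing the field numbers together to form the instruction word can be sketched in Python. The function name `encode_r` is illustrative; the field widths (6, 5, 5, 5, 5, 6 bits) and the values for add $t0, $s1, $s2 are those of the standard MIPS R-format.

```python
def encode_r(op, rs, rt, rd, shamt, funct):
    """Pack the six R-format fields into one 32-bit instruction word."""
    return (op << 26) | (rs << 21) | (rt << 16) | (rd << 11) | (shamt << 6) | funct


# add $t0, $s1, $s2: $s1 is register 17, $s2 is 18, $t0 is 8, funct 32 = add.
word = encode_r(op=0, rs=17, rt=18, rd=8, shamt=0, funct=32)

print(format(word, "032b"))
print(hex(word))
```

Reading the printed bit string in groups of 6, 5, 5, 5, 5 and 6 bits gives back the six decimal fields.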
Encoding an Instruction Set
- Instruction encoding affects the size of the compiled program and the complexity of the CPU implementation.
- The operation is typically specified in one field, called the opcode.
- The addressing mode for the operand can be encoded with the operation, or specified through a separate identifier in case of a large number of supported modes.
- The architect must balance several competing factors:
  - Desire to support as many registers and addressing modes as possible
  - Effect of operand specification on the size of the instruction (program)
  - Desire to simplify instruction fetching and decoding during execution
- Fixed-size instruction encoding simplifies the CPU design while limiting the addressing modes supported.
- An architect caring about code size can use variable-size encoding.
- A hybrid approach is to allow variability by supporting multiple-sized instructions.
Encoding Examples
[Figure not recovered]
MIPS Instruction Format
Register-format instructions:
op | rs | rt | rd | shamt | funct
6 bits | 5 bits | 5 bits | 5 bits | 5 bits | 6 bits
- op: basic operation of the instruction, traditionally called the opcode
- rs: the first register source operand
- rt: the second register source operand
- rd: the register destination operand; it gets the result of the operation
- shamt: shift amount
- funct: this field selects the specific variant of the operation in the op field
Immediate-type instructions:
op | rs | rt | address
6 bits | 5 bits | 5 bits | 16 bits
- Some instructions need longer fields than provided, e.g. for large-value constants.
- The 16-bit address means a load word instruction can load a word within a region of ±2^15 bytes of the address in the base register.
Example: lw $t0, 32($s3)   # temporary register $t0 gets A[8]

Instruction | Format | op | rs | rt | rd | shamt | funct | address
add | R | 0 | reg | reg | reg | 0 | 32 | N/A
sub | R | 0 | reg | reg | reg | 0 | 34 | N/A
lw | I | 35 | reg | reg | N/A | N/A | N/A | address
sw | I | 43 | reg | reg | N/A | N/A | N/A | address
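The immediate format and its ±2^15 reach can be sketched in Python. The function names are illustrative; the field layout (6, 5, 5, 16 bits) and the lw opcode 35 follow the table above, and $s3 and $t0 are assumed to be registers 19 and 8 under the usual MIPS numbering.

```python
def encode_i(op, rs, rt, imm):
    """Pack an I-format instruction; imm must fit in 16 signed bits."""
    if not -(1 << 15) <= imm < (1 << 15):
        raise ValueError("immediate does not fit in 16 bits")
    return (op << 26) | (rs << 21) | (rt << 16) | (imm & 0xFFFF)


def decode_imm(word):
    """Recover the signed 16-bit immediate (sign extension)."""
    imm = word & 0xFFFF
    return imm - (1 << 16) if imm & 0x8000 else imm


# lw $t0, 32($s3): op 35, base register $s3 = 19, target $t0 = 8, offset 32.
lw = encode_i(op=35, rs=19, rt=8, imm=32)
print(hex(lw), decode_imm(lw))
```

An offset of 2^15 or more is rejected, which is the encoding-level reason a load can only reach a limited window around the base register.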
The Stored Program Concept
- Learning how instructions are represented leads to discovering the secret of computing: the stored-program concept.
- Today's computers are built on two key principles:
  - Instructions are represented as numbers.
  - Programs can be stored in memory to be read or written just like numbers.
- The power of the concept: memory can contain
  - the source code for an editor
  - the compiled machine code for the editor
  - the text that the compiled program is using
  - the compiler that generated the code
[Figure: a processor connected to a memory holding an accounting program (machine code), an editor program (machine code), a C compiler (machine code), payroll data, book text, and source code in C for the editor program]
Compiling if-then-else in MIPS
Assuming the five variables f, g, h, i, and j correspond to the five registers $s0 through $s4, what is the compiled MIPS code for the following C if statement?
if (i == j) f = g + h; else f = g - h;
[Figure: flowchart testing i == j; the i == j path computes f = g + h, the i != j path (Else:) computes f = g - h, and both join at Exit:]
MIPS:
      bne $s3, $s4, Else   # go to Else if i != j
      add $s0, $s1, $s2    # f = g + h (skipped if i != j)
      j   Exit
Else: sub $s0, $s1, $s2    # f = g - h (skipped if i == j)
Exit:
Typical Compilation
[Figure not recovered]
Major Types of Optimization

Optimization name | Explanation | Frequency
High-level | At or near source level; machine independent |
Procedure integration | Replace procedure call by procedure body | N.M.
Local | Within straight-line code |
Common subexpression elimination | Replace two instances of the same computation by a single copy | 18%
Constant propagation | Replace all instances of a variable that is assigned a constant with the constant | 22%
Stack height reduction | Rearrange expression tree to minimize resources needed for expression evaluation | N.M.
Global | Across a branch |
Global common subexpression elimination | Same as local, but this version crosses branches | 13%
Copy propagation | Replace all instances of a variable A that has been assigned X (i.e. A = X) with X | 11%
Code motion | Remove code from a loop that computes the same value on each iteration of the loop | 16%
Induction variable elimination | Simplify/eliminate array-addressing calculations within loops | 2%
Machine-dependent | Depends on machine knowledge |
Strength reduction | Many examples, such as replacing a multiply by a constant with adds and shifts | N.M.
Pipeline scheduling | Reorder instructions to improve pipeline performance | N.M.

(N.M. = not measured)
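Strength reduction, the machine-dependent optimization named on this slide, can be illustrated with a minimal Python sketch that replaces a multiply by a constant with shifts and adds. The function name is hypothetical and the constant is assumed to be a non-negative integer; a compiler would apply the same idea at the instruction level.

```python
def multiply_by_constant(x, c):
    """Compute x * c using only shifts and adds (c a non-negative integer)."""
    result = 0
    shift = 0
    while c:
        if c & 1:                 # this bit of the constant contributes x << shift
            result += x << shift
        c >>= 1
        shift += 1
    return result


# 10 = 0b1010, so x * 10 becomes (x << 3) + (x << 1).
print(multiply_by_constant(7, 10))
```

Each set bit of the constant costs one shift and one add, so the rewrite pays off when the constant has few set bits and the multiplier is slow.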
Effect of Compiler Optimization
[Figure: results by program and compiler optimization level. Measurements taken on MIPS.]
- Level 0: non-optimized code
- Level 1: local optimization
- Level 2: global optimization, s/w pipelining
- Level 3: adds procedure integration
Compiler Support for Multimedia Instr.
- Intel's MMX and PowerPC AltiVec have small vector processing capabilities targeting multimedia applications (to speed up graphics).
- Intel added a new set of instructions, called Streaming SIMD Extensions.
- A major advantage of vector computers is hiding the latency of memory access by loading multiple elements and then overlapping execution with data transfer.
- Vector computers typically have strided and/or gather/scatter addressing to perform operations on distant memory locations.
  - Strided addressing allows memory access in increments larger than one.
  - Gather/scatter addressing is similar to register indirect mode, where the addresses are stored instead of the data.
- Supporting vector operations without strided addressing, as in Intel's MMX, limits the potential speedup.
- Such limited support for vector processing makes the use of vectorizing compiler optimization unpopular and restricts its scope to hand-coded routines.
- SIMD instructions on MMX and AltiVec tend to be solutions, not primitives.
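Strided and gather/scatter addressing differ only in where the element addresses come from, as this minimal Python sketch shows. Memory is modeled as a plain list and the function names are illustrative; a vector machine performs these accesses in hardware.

```python
def strided_load(memory, base, stride, count):
    """Load `count` elements whose addresses are `stride` apart."""
    return [memory[base + i * stride] for i in range(count)]


def gather(memory, addresses):
    """Load elements from an explicit list of addresses (an index vector)."""
    return [memory[a] for a in addresses]


def scatter(memory, addresses, values):
    """Store each value at the matching address from the index vector."""
    for address, value in zip(addresses, values):
        memory[address] = value


memory = list(range(100, 116))
column = strided_load(memory, base=1, stride=4, count=3)   # e.g. one matrix column
picked = gather(memory, [7, 0, 3])
scatter(memory, [7, 0, 3], [-1, -2, -3])
print(column, picked, memory[:8])
```

With a stride of 4 the load walks one column of a 4-wide row-major matrix, a pattern that unit-stride-only hardware cannot fetch in one vector access.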
Starting a Program
[Figure: translation hierarchy. C program → Compiler → Assembly language program → Assembler → Object: machine language module (plus Object: library routine, machine language) → Linker → Executable: machine language program → Loader → Memory]
Linker
- Places code and data modules symbolically in memory
- Determines the addresses of data and instruction labels
- Patches both internal and external references
Object files for Unix typically contain:
Header: size and position of components
Text segment: machine code
Data segment: static and dynamic variables
Relocation info: identifies absolute memory references
Symbol table: names and locations of labels, procedures and variables
Debugging info: mapping of source to object code, break points, etc.
Loading Executable Program
[Figure: memory layout showing the reserved region, text, static data, dynamic data and stack segments, with the $sp, $gp and pc registers marked]
To load an executable, the operating system follows these steps:
Reads the executable file header to determine the size of the text and data segments
Creates an address space large enough for the text and data
Copies the instructions and data from the executable file into memory
Copies the parameters (if any) to the main program onto the stack
Initializes the machine registers and sets the stack pointer to the first free location
Jumps to a start-up routine that copies the parameters into the argument registers and calls the main routine of the program
Instruction Set Design Issues
Number of Addresses
Flow of Control
Operand Types
Addressing Modes
Instruction Types
Instruction Formats
Number of Addresses
Four categories
3-address machines
- Two for the source operands and one for the result
2-address machines
- One address doubles as source and result
1-address machines
- Accumulator machines
- Accumulator is used for one source and result
0-address machines
- Stack machines
- Operands are taken from the stack
- Result goes onto the stack
Number of Addresses (cont.)
Three-address machines
Two for the source operands, one for the result
RISC processors use three addresses
Sample instructions
add dest,src1,src2      ; M(dest)=[src1]+[src2]
sub dest,src1,src2      ; M(dest)=[src1]-[src2]
mult dest,src1,src2     ; M(dest)=[src1]*[src2]
Three addresses: Operand 1, Operand 2, Result
Example: a = b + c
Three-address instruction formats are not common because they require a relatively long instruction format to hold the three address references.
Number of Addresses (cont.)
Example
C statement
A = B + C * D - E + F + A
Equivalent code:
mult T,C,D      ; T = C*D
add T,T,B       ; T = B+C*D
sub T,T,E       ; T = B+C*D-E
add T,T,F       ; T = B+C*D-E+F
add A,T,A       ; A = B+C*D-E+F+A
Number of Addresses (cont.)
Two-address machines
One address doubles (for source operand & result)
Last example makes a case for it
- Address T is used twice
Sample instructions
load dest,src      ; M(dest)=[src]
add dest,src       ; M(dest)=[dest]+[src]
sub dest,src       ; M(dest)=[dest]-[src]
mult dest,src      ; M(dest)=[dest]*[src]
Two addresses
One address doubles as operand and result
Example: a = a + b
The two-address format reduces the space requirement but also introduces some awkwardness. To avoid altering the value of an operand, a MOVE instruction is used to move one of the values to a result or temporary location before performing the operation.
Number of Addresses (cont.)
Example
C statement
A = B + C * D - E + F + A
Equivalent code:
load T,C      ; T = C
mult T,D      ; T = C*D
add T,B       ; T = B+C*D
sub T,E       ; T = B+C*D-E
add T,F       ; T = B+C*D-E+F
add A,T       ; A = B+C*D-E+F+A
Number of Addresses (cont.)
One-address machines
Use a special set of registers called accumulators
- Specify one source operand & receive the result
Called accumulator machines
Sample instructions
load addr       ; accum = [addr]
store addr      ; M[addr] = accum
add addr        ; accum = accum + [addr]
sub addr        ; accum = accum - [addr]
mult addr       ; accum = accum * [addr]
Number of Addresses (cont.)
Example
C statement
A = B + C * D - E + F + A
Equivalent code:
load C        ; load C into accum
mult D        ; accum = C*D
add B         ; accum = C*D+B
sub E         ; accum = B+C*D-E
add F         ; accum = B+C*D-E+F
add A         ; accum = B+C*D-E+F+A
store A       ; store accum contents in A
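The one-address sequence above can be checked with a small interpreter. The sketch below is an assumption-laden toy: memory is a Python dictionary keyed by variable name, the helper `run_accumulator` is hypothetical, and the initial values of A through F are arbitrary test data.

```python
def run_accumulator(program, memory):
    """Execute one-address instructions; every operation goes through `accum`."""
    accum = 0
    for op, addr in program:
        if op == "load":
            accum = memory[addr]
        elif op == "store":
            memory[addr] = accum
        elif op == "add":
            accum += memory[addr]
        elif op == "sub":
            accum -= memory[addr]
        elif op == "mult":
            accum *= memory[addr]
        else:
            raise ValueError(f"unknown opcode: {op}")
    return memory


# A = B + C * D - E + F + A, as coded on the slide.
program = [
    ("load", "C"),
    ("mult", "D"),
    ("add", "B"),
    ("sub", "E"),
    ("add", "F"),
    ("add", "A"),
    ("store", "A"),
]

# Arbitrary sample values (assumed for the example).
memory = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
expected = memory["B"] + memory["C"] * memory["D"] - memory["E"] + memory["F"] + memory["A"]
run_accumulator(program, memory)
```

Each of the seven instructions touches memory exactly once, which is why accumulator machines are associated with high memory traffic.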
Number of Addresses (cont.)
Zero-address machines
Stack supplies operands and receives the result
- Special instructions to load and store use an address
Called stack machines (Ex: HP3000, Burroughs B5500)
Sample instructions
push addr     ; push([addr])
pop addr      ; pop([addr])
add           ; push(pop + pop)
sub           ; push(pop - pop)
mult          ; push(pop * pop)
Number of Addresses (cont.)
Example
C statement
A = B + C * D - E + F + A
Equivalent code:
push E
push C
push D
mult
push B
add
sub
push F
add
push A
add
pop A
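The order of the pushes matters because `sub` computes push(pop - pop): the first value popped is the left operand. A minimal sketch of a zero-address machine, assuming a dictionary-based memory, a hypothetical helper `run_stack_machine`, and arbitrary sample values, makes this visible:

```python
def run_stack_machine(program, memory):
    """Execute zero-address code; only push and pop carry an address."""
    stack = []
    for instr in program:
        op = instr[0]
        if op == "push":
            stack.append(memory[instr[1]])
        elif op == "pop":
            memory[instr[1]] = stack.pop()
        else:
            first = stack.pop()    # top of stack
            second = stack.pop()   # element below it
            if op == "add":
                stack.append(first + second)
            elif op == "sub":
                stack.append(first - second)   # push(pop - pop)
            elif op == "mult":
                stack.append(first * second)
            else:
                raise ValueError(f"unknown opcode: {op}")
    return memory


# A = B + C * D - E + F + A; E is pushed first so it ends up below B + C*D.
program = [
    ("push", "E"), ("push", "C"), ("push", "D"), ("mult",),
    ("push", "B"), ("add",), ("sub",),
    ("push", "F"), ("add",),
    ("push", "A"), ("add",),
    ("pop", "A"),
]

# Arbitrary sample values (assumed for the example).
memory = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
run_stack_machine(program, memory)
```

With these values the program leaves 2 + 3*4 - 5 + 6 + 1 in A, using twelve instructions where the three-address version needs five.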
Load/Store Architecture
Instructions expect operands in internal processor registers
Special LOAD and STORE instructions move data between registers and memory
RISC uses this architecture
Reduces instruction length
Load/Store Architecture (cont.)
Sample instructions
load $d,addr        ; $d = [addr]
store addr,$s       ; (addr) = $s
add $d,$s1,$s2      ; $d = $s1 + $s2
sub $d,$s1,$s2      ; $d = $s1 - $s2
mult $d,$s1,$s2     ; $d = $s1 * $s2
Number of Addresses (cont.)
Example
C statement
A = B + C * D - E + F + A
Equivalent code:
load $1,B
load $2,C
load $3,D
load $4,E
load $5,F
load $6,A
mult $2,$2,$3
add $2,$2,$1
sub $2,$2,$4
add $2,$2,$5
add $2,$2,$6
store A,$2
Flow of Control
Default is sequential flow
Several instructions alter this default execution
Branches
- Unconditional
- Conditional
- Delayed branches
Procedure calls
- Delayed procedure calls
Flow of Control (cont.)
Branches
Unconditional
- Absolute address
- PC-relative
  Target address is specified relative to PC contents
  Relocatable code
Example: MIPS
- Absolute address
  j target
- PC-relative
  b target
Flow of Control (cont.)
Branches
Conditional
- Jump is taken only if the condition is met
Two types
- Set-Then-Jump
  Condition testing is separated from branching
  Condition code registers are used to convey the condition test result
  Condition code registers keep a record of the status of the last ALU operation, such as the overflow condition
- Example: Pentium code
  cmp AX,BX      ; compare AX and BX
  je target      ; jump if equal
Flow of Control (cont.)
- Test-and-Jump
  Single instruction performs condition testing and branching
- Example: MIPS instruction
  beq $src1,$src2,target
  Jumps to target if Rsrc1 = Rsrc2
Delayed branching
Control is transferred after executing the instruction that follows the branch instruction
- This instruction slot is called the delay slot
Improves efficiency
Highly pipelined RISC processors support delayed branching
Flow of Control (cont.)
Procedure calls
Facilitate modular programming
Require two pieces of information to return
- End of procedure
  Pentium uses the ret instruction
  MIPS uses the jr instruction
- Return address
  In a (special) register: MIPS allows any general-purpose register
  On the stack: Pentium
Flow of Control (cont.)
Delay slot
Parameter Passing
Two basic techniques
Register-based (e.g., PowerPC, MIPS)
- Internal registers are used
  Faster
  Limit the number of parameters
  Recursive procedure
Stack-based (e.g., Pentium)
- Stack is used
  More general
2 Operand Types
Instructions support basic data types
- Characters
- Integers
- Floating-point
Instruction overload
Same instruction for different data types
Example: Pentium
mov AL,address       ; loads an 8-bit value
mov AX,address       ; loads a 16-bit value
mov EAX,address      ; loads a 32-bit value
Operand Types
Separate instructions
Instructions specify the operand size
Example: MIPS
lb $dest,address     ; loads a byte
lh $dest,address     ; loads a halfword (16 bits)
lw $dest,address     ; loads a word (32 bits)
ld $dest,address     ; loads a doubleword (64 bits)
Similar instructions for store
3 Addressing Modes
How the operands are specified
Operands can be in three places
- Registers
  Register addressing mode
- Part of instruction
  Constant
  Immediate addressing mode
  All processors support these two addressing modes
- Memory
  Difference between RISC and CISC
  CISC supports a large variety of addressing modes
  RISC follows load/store architecture
4 Instruction Types
Several types of instructions
Data movement
- Pentium: mov dest,src
- Some do not provide direct data movement instructions
- Indirect data movement
  add $dest,$src,0     ; $dest = $src+0
Arithmetic and Logical
- Arithmetic
  Integer and floating-point, signed and unsigned
  add, subtract, multiply, divide
- Logical
  and, or, not, xor
Instruction Types (cont.)
Condition code bits
S: Sign bit (0 = +, 1 = -)
Z: Zero bit (0 = nonzero, 1 = zero)
O: Overflow bit (0 = no overflow, 1 = overflow)
C: Carry bit (0 = no carry, 1 = carry)
Example: Pentium
cmp count,2      ; compare count to 2
                 ; subtract 2 from count
je target        ; jump if equal
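How a compare sets the condition code bits can be sketched as follows. This is a simplified model, not the Pentium's actual implementation: `compare_flags` is a hypothetical helper that performs the subtraction on fixed-width integers, discards the result, and returns the S, Z, O and C bits; the 32-bit width is an assumption.

```python
def compare_flags(a, b, bits=32):
    """Model `cmp a, b`: compute a - b, keep only the condition code bits."""
    mask = (1 << bits) - 1
    sign_bit = 1 << (bits - 1)
    a &= mask
    b &= mask
    result = (a - b) & mask
    return {
        "S": int(bool(result & sign_bit)),   # sign of the result
        "Z": int(result == 0),               # result is zero
        "C": int(a < b),                     # borrow out of an unsigned subtract
        # Signed overflow: operands differ in sign and the result's sign
        # differs from the sign of the first operand.
        "O": int(bool((a ^ b) & (a ^ result) & sign_bit)),
    }


def jump_if_equal(flags):
    """Model `je`: the branch is taken when the zero bit is set."""
    return flags["Z"] == 1


equal = compare_flags(25, 25)
smaller = compare_flags(3, 25)
overflow = compare_flags(-2**31, 1)   # most negative value minus one
```

Separating the test (`compare_flags`) from the branch decision (`jump_if_equal`) mirrors the Set-Then-Jump style: the branch instruction only reads the flags left behind by the earlier compare.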
Instruction Types (cont.)
Flow control and I/O instructions
- Branch
- Procedure call
- Interrupts
I/O instructions
- Memory-mapped I/O
  Most processors support memory-mapped I/O
  No separate instructions for I/O
- Isolated I/O
  Pentium supports isolated I/O
  Separate I/O instructions
  in AX,io_port      ; read from an I/O port
  out io_port,AX     ; write to an I/O port
5 Instruction Formats
Two types
Fixed-length
- Used by RISC processors
- 32-bit RISC processors use 32-bit-wide instructions
  Examples: SPARC, MIPS, PowerPC
Variable-length
- Used by CISC processors
- Memory operands need more bits to specify
Opcode
Major and exact operation
Examples of Instruction Formats
RISC (Reduced Instruction Set Computer)
versus
CISC (Complex Instruction Set Computer)
RISC vs CISC
The underlying philosophy of RISC machines is that a system is better able to manage program execution when the program consists of only a few different instructions that are the same length and require the same number of clock cycles to decode and execute.
RISC systems access memory only with explicit load and store instructions.
In CISC systems, many different kinds of instructions access memory, making instruction length variable and fetch-decode-execute time unpredictable.
RISC vs CISC
The difference between CISC and RISC becomes evident through the basic computer performance equation:
time/program = (time/cycle) × (cycles/instruction) × (instructions/program)
RISC systems shorten execution time by reducing the clock cycles per instruction.
CISC systems improve performance by reducing the number of instructions per program.
RISC vs CISC
The simple instruction set of RISC machines enables control units to be hardwired for maximum speed.
The more complex, and variable, instruction set of CISC machines requires microcode-based control units that interpret instructions as they are fetched from memory. This translation takes time.
With fixed-length instructions, RISC lends itself to pipelining and speculative execution.
RISC vs CISC
Consider the following program fragments:
CISC:
mov ax, 10
mov bx, 5
mul bx, ax
RISC:
mov ax, 0
mov bx, 10
mov cx, 5
Begin: add ax, bx
loop Begin
The total clock cycles for the CISC version might be:
(2 movs × 1 cycle) + (1 mul × 30 cycles) = 32 cycles
While the clock cycles for the RISC version is:
(3 movs × 1 cycle) + (5 adds × 1 cycle) + (5 loops × 1 cycle) = 13 cycles
With the RISC clock cycle being shorter, RISC gives us much faster execution speeds.
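The cycle comparison can be reproduced as a short calculation. The sketch below is illustrative only: the instruction counts and per-instruction cycle costs (a 30-cycle multiply, single-cycle moves, adds and loops, five loop iterations) and the two clock periods are assumed values chosen for the example, and the helper names are hypothetical.

```python
def total_cycles(instruction_mix):
    """Sum count * cycles-per-instruction over (count, cycles) pairs."""
    return sum(count * cycles for count, cycles in instruction_mix)


def execution_time_ns(instruction_mix, cycle_time_ns):
    """time/program = cycles/program * time/cycle."""
    return total_cycles(instruction_mix) * cycle_time_ns


# Assumed mix for a CISC multiply: two 1-cycle moves and one 30-cycle mul.
cisc_mix = [(2, 1), (1, 30)]

# Assumed mix for the RISC loop: three moves, five adds, five loop branches,
# each taking a single cycle.
risc_mix = [(3, 1), (5, 1), (5, 1)]

cisc_cycles = total_cycles(cisc_mix)
risc_cycles = total_cycles(risc_mix)

# Hypothetical clock periods: the RISC clock is assumed to be the shorter one.
cisc_time = execution_time_ns(cisc_mix, cycle_time_ns=10)
risc_time = execution_time_ns(risc_mix, cycle_time_ns=8)
```

The RISC fragment executes more instructions but fewer total cycles, which is the trade-off between instructions per program and cycles per instruction that the performance equation captures.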
RISC vs CISC
Because of their load-store ISAs, RISC architectures require a large number of CPU registers.
These registers provide fast access to data during sequential program execution.
They can also be employed to reduce the overhead typically caused by passing parameters to subprograms.
Instead of pulling parameters off of a stack, the subprogram is directed to use a subset of registers.
RISC vs CISC
This is how registers can be overlapped in a RISC system.
The current window pointer (CWP) points to the active register window.
RISC vs CISC
It is becoming increasingly difficult to distinguish RISC architectures from CISC architectures.
Some RISC systems provide more extravagant instruction sets than some CISC systems.
Some systems combine both approaches.
The following two slides summarize the characteristics that traditionally typify the differences between these two architectures.
RISC vs CISC
RISC | CISC
Multiple register sets. | Single register set.
Three operands per instruction. | One or two register operands per instruction.
Parameter passing through register windows. | Parameter passing through memory.
Single-cycle instructions. | Multiple cycle instructions.
Hardwired control. | Microprogrammed control.
Highly pipelined. | Less pipelined.
Simple instructions, few in number. | Many complex instructions.
Fixed length instructions. | Variable length instructions.
Complexity in compiler. | Complexity in microcode.
Only LOAD/STORE instructions access memory. | Many instructions can access memory.
Few addressing modes. | Many addressing modes.
Summary
Instruction Set Design Issues
Instruction set design issues include:
Where are operands stored?
- registers, memory, stack, accumulator
How many explicit operands are there?
- 0, 1, 2, or 3
How is the operand location specified?
- register, immediate, indirect, ...
What type & size of operands are supported?
- byte, int, float, double, string, vector, ...
What operations are supported?
- add, sub, mul, move, compare, ...
More About General Purpose Registers
Why do almost all new architectures use GPRs?
Registers are much faster than memory (even cache)
- Register values are available immediately
- When memory isn't ready, the processor must wait ("stall")
Registers are convenient for variable storage
- Compiler assigns some variables just to registers
- More compact code since small fields specify registers (compared to memory addresses)
[Figure: memory hierarchy: Processor (Registers), Cache, Memory, Disk]
What Operations are Needed
Arithmetic & Logical
- Integer arithmetic: ADD, SUB, MULT, DIV, SHIFT
- Logical operation: AND, NOT
Data Transfer - copy, load, store
Control - branch, jump, call, return
Floating Point - ADD, MUL, DIV (same as arithmetic but usually take bigger operands)
Decimal - ADDD, CONVERT
String - move, compare, search
Graphics - pixel and vertex, compression/decompression operations
Stack Architecture: Pros and Cons
Pros
- Good code density (implicit top of stack)
- Low hardware requirements
- Easy to write a simpler compiler for stack architectures
Cons
- Stack becomes the bottleneck
- Little ability for parallelism or pipelining
- Data is not always at the top of stack when needed, so additional instructions like TOP and SWAP are needed
- Difficult to write an optimizing compiler for stack architectures
Accumulator Architecture: Pros and Cons
Pros
- Very low hardware requirements
- Easy to design and understand
Cons
- Accumulator becomes the bottleneck
- Little ability for parallelism or pipelining
- High memory traffic
Memory-Memory Architecture: Pros and Cons
Pros
- Requires fewer instructions (especially if 3 operands)
- Easy to write compilers for (especially if 3 operands)
Cons
- Very high memory traffic (especially if 3 operands)
- Variable number of clocks per instruction
- With two operands, more data movements are required
Memory-Register Architecture: Pros and Cons
Pros
- Some data can be accessed without loading first
- Instruction format easy to encode
- Good code density
Cons
- Operands are not equivalent (poor orthogonality)
- Variable number of clocks per instruction
- May limit number of registers
Register-Set Architectures
Register-to-Register: Load-Store Architectures
Register-to-Memory Architectures
Memory-to-Memory Architectures
Instruction Formats
Instruction Set Architecture (ISA)
To command a computer's hardware, you must speak its language.
The words of a machine's language are called instructions, and its vocabulary is called the instruction set.
Once you learn one machine language, it is easy to pick up others:
- There are few fundamental operations that all computers must provide
- All designers have the same goal of finding a language that simplifies building the hardware and the compiler while maximizing performance and minimizing cost
Learning how instructions are represented leads to discovering the secret of computing: the stored-program concept.
The MIPS instruction set is used as a case study.
Interface Design
A good interface
- Lasts through many implementations (portability, compatibility)
- Is used in many different ways (generality)
- Provides convenient functionality to higher levels
- Permits an efficient implementation at lower levels
Design decisions must take into account
- Technology
- Machine organization
- Programming languages
- Compiler technology
- Operating systems
[Figure: one interface over time, with several uses above it and several implementations (imp 0, imp 1) below it]
Classifying Instruction Set Architectures
Accumulator Architecture
• Common in early stored-program computers when hardware was so expensive
• Machine has only one register (accumulator) involved in all math & logical operations
• All operations assume the accumulator as a source operand and a destination for the operation, with the other operand stored in memory
Extended Accumulator Architecture
• Dedicated registers for specific operations, e.g. stack and array index registers, added
• The 8086 microprocessor is an example of such a special-purpose register arch.
General-Purpose Register Architecture
• MIPS is an example of such an arch. where registers are not sticking to play a single role
• This type of instruction set can be further divided into:
  • Register-memory: allows for one operand to be in memory
  • Register-register (load-store): demands all operands to be in registers
Machine | # general-purpose registers | Architecture style | Year
Motorola 6800 | | Accumulator | 1974
DEC VAX | | Register-memory, memory-memory | 1977
Intel 8086 | | Extended accumulator | 1978
Motorola 68000 | | Register-memory | 1980
Intel 80386 | | Register-memory | 1985
PowerPC | | Load-store |
DEC Alpha | | Load-store |
Compact Code and Stack Architectures
When memory is scarce, machines like the Intel 80x86 had variable-length instructions to match varying operand specifications and minimize code size.
Stack machines abandoned registers altogether, arguing that it is hard for compilers to use them efficiently.
Operands are pushed onto a stack from memory, and the results have to be popped from the stack to memory.
Operations take their operands by default from the top of the stack and insert the results back onto the stack.
Stack machines simplify compilers and lend themselves to a compact instruction encoding, but limit compiler optimization (e.g. in math expressions).
Example: A = B + C
Push AddressC   # Top=Top+4; Stack[Top]=Memory[AddressC]
Push AddressB   # Top=Top+4; Stack[Top]=Memory[AddressB]
add             # Stack[Top-4]=Stack[Top]+Stack[Top-4]; Top=Top-4
Pop AddressA    # Memory[AddressA]=Stack[Top]; Top=Top-4
Compact code is important for heralded network computers, where programs must be downloaded over the Internet (e.g. Java-based applications).
Other Types of Architecture
High-Level-Language Architecture
• In the 1960s, systems software was rarely written in high-level languages, and virtually every commercial operating system before Unix was written in assembly
• Some people blamed the code density on the instruction set rather than the programming language
• A machine design philosophy was advocated with the goal of making the hardware more like high-level languages
• The effectiveness of high-level languages, memory size limitation and lack of efficient compilers doomed this philosophy to a historical footnote
Reduced Instruction Set Architecture
• With the recent development in compiler technology and expanded memory sizes, fewer programmers are using assembly-level coding
• Instruction set architecture became measurable in the way compilers, rather than programmers, use them
• RISC architecture favors simplifying hardware design over enriching the offered set of instructions, relying on compilers to effectively use them to perform complex operations
• Virtually all new architectures since 1982 follow the RISC philosophy of fixed instruction lengths, load-store operations, and limited addressing modes
Evolution of Instruction Sets
[Figure: evolution of instruction sets. Single Accumulator (EDSAC) → Accumulator + Index Registers (Manchester Mark series) → Separation of Programming Model from Implementation: High-level Language Based; Concept of a Family → General Purpose Register Machines → Complex Instruction Sets (VAX, Intel); Load/Store Architecture (CDC, Cray 1) → RISC (SPARC and others)]
Register-Memory Architectures
# memory addresses | Max. number of operands | Examples
0 | 3 | SPARC, MIPS, PowerPC, ALPHA
1 | 2 | Intel 80x86, Motorola 68000
2 | 2 | VAX (also has 3 operands format)
3 | 3 | VAX (also has 2 operands format)
Effect of the number of memory operands
Memory Address
Interpreting Memory Addressing
The address of a word matches the byte address of one of its 4 bytes
The addresses of sequential words differ by 4 (word size in bytes)
Words' addresses are multiples of 4 (alignment restriction)
Machines that use the address of the leftmost byte as the word address are called "Big Endian", and those that use the rightmost byte are called "Little Endian"
Misalignment complicates memory access and causes programs to run slower (some machines do not allow misaligned memory access at all)
Byte ordering can be a problem when exchanging data among different machines
Byte addresses affect array index calculation to account for word addressing and the offset within the word
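Byte ordering and alignment can be demonstrated with Python's standard `struct` module. The sketch below assumes 4-byte words; the helper names `is_aligned` and `word_and_offset` and the sample value are made up for the example.

```python
import struct

value = 0x0A0B0C0D

# Big Endian: the most significant byte sits at the lowest address.
big = struct.pack(">I", value)

# Little Endian: the least significant byte sits at the lowest address.
little = struct.pack("<I", value)

# Reading big-endian bytes on a little-endian machine scrambles the word,
# which is the data-exchange problem mentioned above.
misread = struct.unpack("<I", big)[0]


def is_aligned(address, size):
    """An object of `size` bytes is aligned when its address is a multiple of size."""
    return address % size == 0


def word_and_offset(byte_address, word_size=4):
    """Split a byte address into (word index, byte offset within the word)."""
    return divmod(byte_address, word_size)
```

For instance, `word_and_offset(13)` places byte 13 in word 3 at offset 1, and `is_aligned(6, 4)` reports that a word starting at byte 6 would be misaligned.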
Object addressed | Aligned at byte offsets | Misaligned at byte offsets
Byte | 0,1,2,3,4,5,6,7 | Never
Half word | 0,2,4,6 | 1,3,5,7
Word | 0,4 | 1,2,3,5,6,7
Double word | 0 | 1,2,3,4,5,6,7
Addressing Modes
Addressing modes refer to how to specify the location of an operand (effective address)
Addressing modes have the ability to:
- Significantly reduce instruction counts
- Increase the average CPI
- Increase the complexity of building a machine
The VAX machine is used for benchmark data since it supports a wide range of memory addressing modes
Famous addressing modes can be classified based on:
- the source of the data, into register, immediate or memory
- the address calculation, into direct and indirect
An indexed addressing mode is usually provided to allow efficient implementation of loops and array access
Example of Addressing Modes
Addressing mode | Example | Meaning | When used
Register | Add R4, R3 | Regs[R4] ← Regs[R4] + Regs[R3] | When a value is in a register
Immediate | Add R4, #3 | Regs[R4] ← Regs[R4] + 3 | For constants
Displacement | Add R4, 100(R1) | Regs[R4] ← Regs[R4] + Mem[100 + Regs[R1]] | Accessing local variables
Register indirect | Add R4, (R1) | Regs[R4] ← Regs[R4] + Mem[Regs[R1]] | Accessing using a pointer or a computed address
Indexed | Add R4, (R1 + R2) | Regs[R4] ← Regs[R4] + Mem[Regs[R1] + Regs[R2]] | Sometimes useful in array addressing: R1 = base of the array; R2 = index amount
Direct or absolute | Add R4, (1001) | Regs[R4] ← Regs[R4] + Mem[1001] | Sometimes useful for accessing static data; address constant may need to be large
Memory indirect or memory deferred | Add R4, @(R3) | Regs[R4] ← Regs[R4] + Mem[Mem[Regs[R3]]] | If R3 is the address of the pointer p, then the mode yields *p
Autoincrement | Add R4, (R2)+ | Regs[R4] ← Regs[R4] + Mem[Regs[R2]]; Regs[R2] ← Regs[R2] + d | Useful for stepping through arrays within a loop. R2 points to the start of the array; each reference increments R2 by d
Autodecrement | Add R4, -(R2) | Regs[R2] ← Regs[R2] - d; Regs[R4] ← Regs[R4] + Mem[Regs[R2]] | Same use as autoincrement. Autodecrement/increment can also act as push/pop to implement a stack
Scaled | Add R4, 100(R2)[R3] | Regs[R4] ← Regs[R4] + Mem[100 + Regs[R2] + Regs[R3] * d] | Used to index arrays
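The effective-address calculations in the table can be sketched in C as follows. This is an illustrative model only: the `struct operand` layout, the `enum mode` names and the function `effective_address` are assumptions made for the example, not part of any real instruction set.

```c
#include <stdint.h>

enum mode { DISPLACEMENT, REGISTER_INDIRECT, INDEXED, DIRECT, SCALED };

/* Operand fields of a hypothetical instruction; a mode ignores the
   fields it does not need. */
struct operand {
    enum mode mode;
    int base;      /* base register number */
    int index;     /* index register number */
    uint32_t disp; /* displacement or absolute address */
    uint32_t d;    /* element size in bytes (scaled mode) */
};

/* Compute the memory address named by `op`, given the register file. */
static uint32_t effective_address(const uint32_t regs[], struct operand op)
{
    switch (op.mode) {
    case DISPLACEMENT:      return op.disp + regs[op.base];
    case REGISTER_INDIRECT: return regs[op.base];
    case INDEXED:           return regs[op.base] + regs[op.index];
    case DIRECT:            return op.disp;
    case SCALED:            return op.disp + regs[op.base] + regs[op.index] * op.d;
    }
    return 0; /* not reached for valid modes */
}
```

With R1 = 1000 and R2 = 3, the displacement operand 100(R1) resolves to 1100, and a scaled operand with d = 4 resolves to 100 + 1000 + 3 * 4 = 1112.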
Addressing Mode for Signal Processing
DSP offers special addressing modes to better serve popular algorithms
Special features require either hand coding or a compiler that uses such features (GNU would not be a good choice)
Modulo addressing
- Since DSP deals with continuous data streams, circular buffers are widely used
- Circular or modulo addressing allows automatic increment and decrement and resets the pointer when reaching the end of the buffer
Reverse addressing
- Resulting address is the reverse order of the current address
- Reverse addressing mode expedites the access, which otherwise requires a number of logical instructions or extra memory accesses
Fast Fourier Transform
0 (000) → 0 (000)
1 (001) → 4 (100)
2 (010) → 2 (010)
3 (011) → 6 (110)
4 (100) → 1 (001)
5 (101) → 5 (101)
6 (110) → 3 (011)
7 (111) → 7 (111)
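Both DSP addressing modes are easy to emulate in software, which also shows what the hardware saves. The sketch below is an illustration under stated assumptions: the function names `modulo_next` and `bit_reverse` are hypothetical, and a real DSP performs these updates as a side effect of the memory access instead of with separate instructions.

```c
#include <stdint.h>

/* Modulo (circular) addressing: advance `index` by `step`, which may be
   negative, and wrap around a buffer of `length` elements. */
static uint32_t modulo_next(uint32_t index, int32_t step, uint32_t length)
{
    int64_t next = ((int64_t)index + step) % (int64_t)length;
    if (next < 0)
        next += length; /* C remainder can be negative; fold it back */
    return (uint32_t)next;
}

/* Bit-reversed addressing: reverse the low `bits` bits of `value`,
   as needed to reorder the data of an FFT. */
static uint32_t bit_reverse(uint32_t value, unsigned bits)
{
    uint32_t result = 0;
    for (unsigned i = 0; i < bits; i++) {
        result = (result << 1) | (value & 1u);
        value >>= 1;
    }
    return result;
}
```

For an 8-point FFT, `bit_reverse(1, 3)` gives 4 and `bit_reverse(3, 3)` gives 6; the software loop costs several logical instructions per address, which is the overhead the dedicated mode removes.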
Operations of the Computer Hardware
"There must certainly be instructions for performing the fundamental arithmetic operations."
Burks, Goldstine and von Neumann, 1947
Assembly language is a symbolic representation of what the processor actually understands
The MIPS assembler allows only one instruction per line and ignores comments following # until the end of the line
Example:
Translation of a segment of a C program to MIPS assembly instructions
C: f = (g + h) - (i + j)
MIPS:
add t0, g, h # temp variable t0 contains "g + h"
add t1, i, j # temp variable t1 contains "i + j"
sub f, t0, t1 # f = t0 - t1 = (g + h) - (i + j)
Operations in the Instruction Set
Operator type | Examples
Arithmetic and logical | Integer arithmetic and logical operations: add, and, subtract, or
Data transfer | Loads-stores (move instructions on machines with memory addressing)
Control | Branch, jump, procedure call and return, trap
System | Operating system call, virtual memory management instructions
Floating point | Floating point instructions: add, multiply
Decimal | Decimal add, decimal multiply, decimal to character conversion
String | String move, string compare, string search
Graphics | Pixel operations, compression/decompression operations
Arithmetic, logical, data transfer and control are almost standard categories for all machines
System instructions are required for multi-programming environments, although support for system functions varies
Decimal and string instructions can be primitives, e.g. on the IBM 360 and the VAX
Support for floating point, decimal, string and graphics can be optional, sometimes provided via a co-processor
Some machines rely on the compiler to synthesize special operations, such as string handling, from simpler instructions
Operations for Media & Signal Processing
Single instruction multiple data (SIMD) and vector instructions are often supported in DSPs, which are commonly used in multimedia and signal processing applications
Partitioned Add (integer)
- Perform multiple 16-bit additions on a 64-bit ALU, since most data are narrow
- Increases ALU throughput for multimedia applications
Paired single operations (float)
- Allow the same register to act as two operands to the same operation
- Handy in dealing with vertices and coordinates
Multiply and accumulate
- Very handy for calculating dot products of vectors (signal processing) and matrix multiplication
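A partitioned add and a multiply-accumulate loop can both be imitated in portable C. The sketch below is illustrative: the names `partitioned_add16` and `dot_product` are hypothetical, the lane width of 16 bits is an assumption, and real media extensions do this work in a single instruction.

```c
#include <stdint.h>

/* Partitioned add: treat each 64-bit word as four independent 16-bit
   lanes and add lane by lane, with wraparound inside each lane and no
   carry from one lane into the next. */
static uint64_t partitioned_add16(uint64_t a, uint64_t b)
{
    const uint64_t high = 0x8000800080008000ULL;  /* top bit of each lane */
    uint64_t low_sum = (a & ~high) + (b & ~high); /* add low 15 bits per lane */
    return low_sum ^ ((a ^ b) & high);            /* add top bits without carry-out */
}

/* Multiply and accumulate: the inner step of a dot product. */
static int32_t dot_product(const int16_t *x, const int16_t *y, int n)
{
    int32_t acc = 0;
    for (int i = 0; i < n; i++)
        acc += (int32_t)x[i] * y[i]; /* one multiply-accumulate per element */
    return acc;
}
```

Masking off the top bit of every lane before the add is what keeps a carry from leaking across lanes; the exclusive-or then restores the correct top bit of each lane.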
Frequency of Operations Usage
Make the common case fast by focusing on these operations
The most widely executed instructions are the simple operations of an instruction set
The following is the average usage in SPECint on Intel 80x86
Rank | 80x86 instruction | Integer average (% total executed)
1 | Load | 22%
2 | Conditional branch | 20%
3 | Compare | 16%
4 | Store | 12%
5 | Add | 8%
6 | And | 6%
7 | Sub | 5%
8 | Move register-register | 4%
9 | Call | 1%
10 | Return | 1%
Total | | 96%
Control Flow Instructions
Jump for unconditional change in the control flow
Branch for conditional change in the control flow
Procedure calls and returns
Data is based on SPEC on Alpha
Destination Address Definition
Relative addressing with respect to the program counter proved to be the best choice for forward and backward branching or jumps (load address independent)
To allow for dynamic loading of library routines, register indirect addressing allows addresses to be loaded in special registers (e.g. virtual functions in C++ and system calls in a case statement)
Data is based on SPEC on Alpha
Condition Evaluation
Compare-and-branch can be efficient if the majority of conditions are comparisons with zero
Remember to focus on the common case
Based on SPEC on MIPS
Frequency of Types of Comparison
Data is based on SPEC on Alpha
Different benchmarks and machines set new design priorities
DSPs support a repeat instruction for "for" loops (vectors) using registers
Supporting Procedures
Execution of a procedure follows these steps:
- Store parameters in a place accessible to the procedure
- Transfer control to the procedure
- Acquire the storage resources needed for the procedure
- Perform the desired task
- Store the result value in a place accessible to the calling program
- Return control to the point of origin
The hardware provides a program counter to trace instruction flow and manage transfer of control
Parameter Passing
- Registers can be used for passing a small number of parameters
- A stack is used to spill registers of the current context and make room for the called procedure to run, and to allow large parameters to be passed
- Storage of machine state can be performed by the caller or the callee
- Handling of shared variables is important to ensure correct semantics and thus requires clear specifications in the library interface
Global variables stored in registers need careful handling
Type and Size of Operands
The type of an operand is designated by encoding it in the instruction's operation code
The type of an operand, e.g. single precision float, effectively gives its size
Common operand types include character, half-word and word size integer, and single- and double-precision floating point
Characters are almost always in ASCII, integers in 2's complement, and floating point in IEEE 754
The 16-bit Unicode used in Java is gaining popularity due to its support for international character sets
For business applications, some architectures support a decimal format in binary coded decimal (BCD)
Depending on the size of the word, the complexity of handling different operand types differs
DSP offers fixed point data types to support high precision floating point arithmetic and to allow sharing a single exponent for multiple numbers
For graphics applications, vertex and pixel operands are added features
Size of Operands
The double-word data type is used for double-precision floating point operations and for address storage in machines with a 64-bit wide address bus
Words are used for integer operations and for 32-bit address bus machines
Because of the mix in SPEC, word and double-word data types dominate
Frequency of reference by size, based on SPEC2000 on Alpha
Instruction Representation
Humans are taught to think in base 10 (decimal), but numbers may be represented in any base (123 in base 10 = 1111011 in binary, or base 2)
Numbers are stored in computers as a series of high and low electronic signals (binary numbers)
Binary digits are called bits and are considered the atom of computing
Each piece of an instruction is a number, and placing these numbers together forms the instruction
The assembler translates the symbolic assembly instructions into machine language instructions (machine code)
Example:
Assembly: add $t0, $s1, $s2
M/C language (decimal): 0 | 17 | 18 | 8 | 0 | 32
M/C language (binary): 000000 | 10001 | 10010 | 01000 | 00000 | 100000
Field widths: 6 bits | 5 bits | 5 bits | 5 bits | 5 bits | 6 bits
Note: the MIPS compiler by default maps $s0,...,$s7 to registers 16-23 and $t0,...,$t7 to registers 8-15
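Placing the field numbers together to form the instruction word can be sketched in C. This is a minimal illustration, assuming the standard MIPS R-format field widths of 6, 5, 5, 5, 5 and 6 bits; the function name `encode_r` is hypothetical.

```c
#include <stdint.h>

/* Pack the six R-format fields into one 32-bit instruction word.
   Each field is masked to its width: op and funct are 6 bits,
   rs, rt, rd and shamt are 5 bits. */
static uint32_t encode_r(uint32_t op, uint32_t rs, uint32_t rt,
                         uint32_t rd, uint32_t shamt, uint32_t funct)
{
    return ((op & 0x3Fu) << 26) | ((rs & 0x1Fu) << 21) |
           ((rt & 0x1Fu) << 16) | ((rd & 0x1Fu) << 11) |
           ((shamt & 0x1Fu) << 6) | (funct & 0x3Fu);
}
```

Encoding the decimal fields 0, 17, 18, 8, 0, 32 with `encode_r` yields the word 0x02324020, which is the machine code for adding registers 17 and 18 into register 8.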
Encoding an Instruction Set
Instruction encoding affects the size of the compiled program and the complexity of the CPU implementation
The operation is typically specified in one field, called the opcode
The addressing mode for the operand can be encoded with the operation, or specified through a separate identifier when a large number of modes is supported
The architecture must balance several competing factors:
- Desire to support as many registers and addressing modes as possible
- Effect of operand specification on the size of the instruction (program)
- Desire to simplify instruction fetching and decoding during execution
Fixed size instruction encoding simplifies the CPU design while limiting the addressing modes supported
An architect who cares about code size can use variable size encoding
A hybrid approach is to allow variability by supporting multiple-sized instructions
Encoding Examples
MIPS Instruction format
Register-format instructions
op | rs | rt | rd | shamt | funct
6 bits | 5 bits | 5 bits | 5 bits | 5 bits | 6 bits
op: Basic operation of the instruction, traditionally called the opcode
rs: The first register source operand
rt: The second register source operand
rd: The register destination operand; it gets the result of the operation
shamt: Shift amount
funct: This field selects the specific variant of the operation in the op field
Immediate-type instructions
op | rs | rt | address
6 bits | 5 bits | 5 bits | 16 bits
Some instructions need longer fields than provided, for large value constants
The 16-bit address means a load word instruction can load a word within a region of ±2^15 bytes of the address in the base register
Example: lw $t0, 32($s3) # Temporary register $t0 gets A[8]
Instruction | Format | op | rs | rt | rd | shamt | funct | address
add | R | 0 | reg | reg | reg | 0 | 32 | N/A
sub | R | 0 | reg | reg | reg | 0 | 34 | N/A
lw | I | 35 | reg | reg | N/A | N/A | N/A | address
sw | I | 43 | reg | reg | N/A | N/A | N/A | address
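The immediate format and its ±2^15 reach follow directly from the 16-bit signed address field. The sketch below assumes the standard MIPS I-format layout (6-bit op, two 5-bit registers, 16-bit immediate); the names `encode_i` and `decode_offset` are hypothetical, and the sample instruction is the common textbook load of a word at offset 32 from register 19 into register 8.

```c
#include <stdint.h>

/* Pack an I-format instruction: op (6 bits), rs (5), rt (5) and a
   16-bit signed offset stored in two's complement. */
static uint32_t encode_i(uint32_t op, uint32_t rs, uint32_t rt, int32_t offset)
{
    return ((op & 0x3Fu) << 26) | ((rs & 0x1Fu) << 21) |
           ((rt & 0x1Fu) << 16) | ((uint32_t)offset & 0xFFFFu);
}

/* Recover the signed offset from an instruction word (sign extension). */
static int32_t decode_offset(uint32_t word)
{
    int32_t value = (int32_t)(word & 0xFFFFu);
    if (value & 0x8000)
        value -= 0x10000; /* top bit set: the field holds a negative number */
    return value;
}
```

The representable offsets run from -32768 to +32767, which is the region of ±2^15 bytes around the base register mentioned above.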
The Stored Program Concept
Learning how instructions are represented leads to discovering the secret of computing: the stored-program concept
Today's computers are built on two key principles:
- Instructions are represented as numbers
- Programs can be stored in memory to be read or written just like numbers
The power of the concept: memory can contain
- the source code for an editor
- the compiled m/c code for the editor
- the text that the compiled program is using
- the compiler that generated the code
[Figure: a processor connected to a memory that holds an accounting program (machine code), an editor program (machine code), a C compiler (machine code), payroll data, book text, and the source code in C for the editor program]
Compiling if-then-else in MIPS
Assuming the five variables f, g, h, i, and j correspond to the five registers $s0 through $s4, what is the compiled code for the following C if statement?
if (i == j) f = g + h; else f = g - h;
[Figure: flowchart testing i == j; when i = j control falls through to f = g + h, when i ≠ j it goes to Else: f = g - h; both paths join at Exit:]
MIPS:
bne $s3, $s4, Else # go to Else if i ≠ j
add $s0, $s1, $s2 # f = g + h (skipped if i ≠ j)
j Exit
Else: sub $s0, $s1, $s2 # f = g - h (skipped if i = j)
Exit:
Typical Compilation
Major Types of Optimization
Optimization name | Explanation | Frequency
High-level (at or near source level; machine independent)
Procedure integration | Replace procedure call by procedure body | N.M.
Local (within straight line code)
Common subexpression elimination | Replace two instances of the same computation by a single copy | 18%
Constant propagation | Replace all instances of a variable that is assigned a constant with the constant | 22%
Stack height reduction | Rearrange expression tree to minimize resources needed for expression evaluation | N.M.
Global (across a branch)
Global common subexpression elimination | Same as local, but this version crosses branches | 13%
Copy propagation | Replace all instances of a variable A that has been assigned X (i.e. A = X) with X | 11%
Code motion | Remove code from a loop that computes the same value each iteration of the loop | 16%
Induction variable elimination | Simplify/eliminate array addressing calculations within loops | 2%
Machine-dependent (depends on machine knowledge)
Strength reduction | Many examples, such as replacing a multiply by a constant with adds and shifts | N.M.
Pipeline scheduling | Reorder instructions to improve pipeline performance | N.M.
Effect of Compiler Optimization
Measurements taken on S
[Figure: results plotted by program and compiler optimization level]
Level 0: non-optimized code
Level 1: local optimization
Level 2: global optimization, s/w pipelining
Level 3: adds procedure integration
Compiler Support for Multimedia Instr.
Intel's MMX and PowerPC AltiVec have small vector processing capabilities targeting multimedia applications (to speed up graphics)
Intel added a new set of instructions called Streaming SIMD Extension
A major advantage of vector computers is hiding the latency of memory access by loading multiple elements and then overlapping execution with data transfer
Vector computers typically have strided and/or gather/scatter addressing to perform operations on distant memory locations
- Strided addressing allows memory access in increments larger than one
- Gather/scatter addressing is similar to register indirect mode, where the addresses are stored instead of the data
Supporting vector operations without strided addressing, as in Intel's MMX, limits the potential speedup
Such limited support for vector processing makes the use of vectorizing compiler optimization unpopular and restricts its scope to hand coded routines
SIMD instructions on MMX and AltiVec tend to be solutions, not primitives
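Strided and gather/scatter addressing can be pictured as the access patterns of the loops below. This is a scalar sketch for illustration only: the function names `load_strided`, `gather` and `scatter` are hypothetical, and a vector machine would issue each loop as one vector memory instruction.

```c
#include <stddef.h>

/* Strided load: fetch n elements that are `stride` elements apart,
   e.g. one column of a row-major matrix. */
static void load_strided(const double *base, size_t stride, size_t n, double *out)
{
    for (size_t i = 0; i < n; i++)
        out[i] = base[i * stride];
}

/* Gather: fetch the elements whose positions are listed in `index`. */
static void gather(const double *base, const size_t *index, size_t n, double *out)
{
    for (size_t i = 0; i < n; i++)
        out[i] = base[index[i]];
}

/* Scatter: store n values to the positions listed in `index`. */
static void scatter(double *base, const size_t *index, size_t n, const double *in)
{
    for (size_t i = 0; i < n; i++)
        base[index[i]] = in[i];
}
```

Reading column 1 of a 3x3 row-major matrix is `load_strided(m + 1, 3, 3, col)`; without strided support the same data must first be repacked into contiguous storage, which is the lost speedup the slide refers to.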
Starting a Program
[Figure: C program → Compiler → Assembly language program → Assembler → Object: machine language module, plus Object: library routine (machine language) → Linker → Executable: machine language program → Loader → Memory]
Object files for Unix typically contain:
- Header: size and position of components
- Text segment: machine code
- Data segment: static and dynamic variables
- Relocation info: identifies absolute memory references
- Symbol table: name and location of labels, procedures and variables
- Debugging info: mapping of source to object code, break points, etc.
Linker
- Places code and data modules symbolically in memory
- Determines the addresses of data and instruction labels
- Patches both internal and external references
Loading Executable Program
[Figure: memory layout, from high to low addresses: $sp → 7fff ffff hex, Stack; Dynamic data; $gp → 1000 8000 hex, Static data, starting at 1000 0000 hex; Text; pc → 0040 0000 hex; Reserved, down to 0]
To load an executable, the operating system follows these steps:
- Reads the executable file header to determine the size of the text and data segments
- Creates an address space large enough for the text and data
- Copies the instructions and data from the executable file into memory
- Copies the parameters (if any) to the main program onto the stack
- Initializes the machine registers and sets the stack pointer to the first free location
- Jumps to a start-up routine that copies the parameters into the argument registers and calls the main routine of the program
Instruction Set Design Issues
Instruction set design issues:
- Number of Addresses
- Flow of Control
- Operand Types
- Addressing Modes
- Instruction Types
- Instruction Formats
Number of Addresses
Four categories
3-address machines
- 2 for the source operands and one for the result
2-address machines
- One address doubles as source and result
1-address machines
- Accumulator machines
- Accumulator is used for one source and result
0-address machines
- Stack machines
- Operands are taken from the stack
- Result goes onto the stack
Number of Addresses (cont.)
Three-address machines
Two for the source operands, one for the result
RISC processors use three addresses
Sample instructions
add dest,src1,src2 ; M(dest)=[src1]+[src2]
sub dest,src1,src2 ; M(dest)=[src1]-[src2]
mult dest,src1,src2 ; M(dest)=[src1]*[src2]
Three addresses: Operand 1, Operand 2, Result
Example: a = b + c
Three-address instruction formats are not common because they require a relatively long instruction format to hold the three address references
Number of Addresses (cont.)
Example
C statement
A = B + C * D - E + F + A
Equivalent code:
mult T,C,D ;T = C*D
add T,T,B ;T = B+C*D
sub T,T,E ;T = B+C*D-E
add T,T,F ;T = B+C*D-E+F
add A,T,A ;A = B+C*D-E+F+A
Number of Addresses (cont.)
Two-address machines
One address doubles (for source operand and result)
Last example makes a case for it
- Address T is used twice
Sample instructions
load dest,src ; M(dest)=[src]
add dest,src ; M(dest)=[dest]+[src]
sub dest,src ; M(dest)=[dest]-[src]
mult dest,src ; M(dest)=[dest]*[src]
Two addresses: one address doubles as operand and result
Example: a = a + b
The two-address format reduces the space requirement but also introduces some awkwardness. To avoid altering the value of an operand, a MOVE instruction is used to move one of the values to a result or temporary location before performing the operation
Number of Addresses (cont.)
Example
C statement
A = B + C * D - E + F + A
Equivalent code:
load T,C ;T = C
mult T,D ;T = C*D
add T,B ;T = B+C*D
sub T,E ;T = B+C*D-E
add T,F ;T = B+C*D-E+F
add A,T ;A = B+C*D-E+F+A
Number of Addresses (cont.)
One-address machines
Use a special set of registers called accumulators
- Specify one source operand and receive the result
Called accumulator machines
Sample instructions
load addr ; accum = [addr]
store addr ; M[addr] = accum
add addr ; accum = accum + [addr]
sub addr ; accum = accum - [addr]
mult addr ; accum = accum * [addr]
Number of Addresses (cont.)
Example
C statement
A = B + C * D - E + F + A
Equivalent code:
load C ;load C into accum
mult D ;accum = C*D
add B ;accum = C*D+B
sub E ;accum = B+C*D-E
add F ;accum = B+C*D-E+F
add A ;accum = B+C*D-E+F+A
store A ;store accum contents in A
Number of Addresses (cont.)
Zero-address machines
Stack supplies operands and receives the result
- Special instructions to load and store use an address
Called stack machines (Ex: HP3000, Burroughs B5500)
Sample instructions
push addr ; push([addr])
pop addr ; pop([addr])
add ; push(pop + pop)
sub ; push(pop - pop)
mult ; push(pop * pop)
Number of Addresses (cont.)
Example
C statement
A = B + C * D - E + F + A
Equivalent code:
push E
push C
push D
mult
push B
add
sub
push F
add
push A
add
pop A
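The zero-address sequence can be traced with a small stack-machine model in C. This is a sketch under assumptions: the type `struct stack_machine` and the helper names are hypothetical, and `sub` is modeled as "first value popped minus second value popped", which is the reading of push(pop - pop) that makes the sequence above compute the C statement.

```c
#include <assert.h>

#define STACK_MAX 16

struct stack_machine {
    int stack[STACK_MAX];
    int top; /* number of values currently on the stack */
};

static void push(struct stack_machine *m, int value)
{
    assert(m->top < STACK_MAX);
    m->stack[m->top++] = value;
}

static int pop(struct stack_machine *m)
{
    assert(m->top > 0);
    return m->stack[--m->top];
}

static void op_add(struct stack_machine *m)  { int x = pop(m); int y = pop(m); push(m, x + y); }
static void op_sub(struct stack_machine *m)  { int x = pop(m); int y = pop(m); push(m, x - y); }
static void op_mult(struct stack_machine *m) { int x = pop(m); int y = pop(m); push(m, x * y); }

/* Runs the instruction sequence for A = B + C * D - E + F + A and
   returns the value that "pop A" would store. */
static int evaluate(int A, int B, int C, int D, int E, int F)
{
    struct stack_machine m = { {0}, 0 };
    push(&m, E);
    push(&m, C);
    push(&m, D);
    op_mult(&m);  /* C*D */
    push(&m, B);
    op_add(&m);   /* B + C*D */
    op_sub(&m);   /* (B + C*D) - E */
    push(&m, F);
    op_add(&m);
    push(&m, A);
    op_add(&m);
    return pop(&m);
}
```

Pushing E first is what lets `sub` find E underneath B + C*D when it executes; with A..F set to 1..6 the sequence yields 2 + 3*4 - 5 + 6 + 1 = 16.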
Load/Store Architecture
Instructions expect operands in internal processor registers
Special LOAD and STORE instructions move data between registers and memory
RISC uses this architecture
Reduces instruction length
Load/Store Architecture (cont.)
Sample instructions
load Rd,addr ; Rd = [addr]
store addr,Rs ; (addr) = Rs
add Rd,Rs1,Rs2 ; Rd = Rs1 + Rs2
sub Rd,Rs1,Rs2 ; Rd = Rs1 - Rs2
mult Rd,Rs1,Rs2 ; Rd = Rs1 * Rs2
Number of Addresses (cont.)
Example
C statement
A = B + C * D - E + F + A
Equivalent code:
load R1,B
load R2,C
load R3,D
load R4,E
load R5,F
load R6,A
mult R2,R2,R3
add R2,R2,R1
sub R2,R2,R4
add R2,R2,R5
add R2,R2,R6
store A,R2
Flow of Control
Default is sequential flow
Several instructions alter this default execution
Branches
- Unconditional
- Conditional
- Delayed branches
Procedure calls
- Delayed procedure calls
Flow of Control (cont.)
Branches
Unconditional
- Absolute address
- PC-relative
  - Target address is specified relative to PC contents
  - Relocatable code
Example: MIPS
- Absolute address
  j target
- PC-relative
  b target
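How the two kinds of target address are formed can be sketched in C. The sketch assumes the usual MIPS32 conventions, which the slides do not spell out: a branch adds a sign-extended 16-bit word offset to the address of the following instruction, and a jump combines a 26-bit word index with the upper four bits of that address. The function names are hypothetical.

```c
#include <stdint.h>

/* PC-relative target: offset is counted in words from the instruction
   after the branch (pc + 4). */
static uint32_t branch_target(uint32_t pc, int16_t offset_words)
{
    return pc + 4 + (uint32_t)((int32_t)offset_words * 4);
}

/* Absolute (pseudo-direct) target: the 26-bit word index supplies
   address bits 27..2; the top 4 bits come from pc + 4. */
static uint32_t jump_target(uint32_t pc, uint32_t index26)
{
    return ((pc + 4) & 0xF0000000u) | ((index26 & 0x03FFFFFFu) << 2);
}
```

Moving the code by 0x1000 bytes moves every branch target by the same amount, so the encoded offset stays valid; that is the sense in which PC-relative code is relocatable, while a jump's encoded index names a fixed location.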
Flow of Control (cont.)
Branches
Conditional
- Jump is taken only if the condition is met
Two types
- Set-Then-Jump
  - Condition testing is separated from branching
  - Condition code registers are used to convey the condition test result
  - Condition code registers keep a record of the status of the last ALU operation, such as an overflow condition
- Example: Pentium code
  cmp AX,BX ;compare AX and BX
  je target ;jump if equal
Flow of Control (cont.)
- Test-and-Jump
  - Single instruction performs condition testing and branching
- Example: MIPS instruction
  beq Rsrc1,Rsrc2,target
  Jumps to target if Rsrc1 = Rsrc2
Delayed branching
Control is transferred after executing the instruction that follows the branch instruction
- This instruction slot is called the delay slot
Improves efficiency
Highly pipelined RISC processors support it
Flow of Control (cont.)
Procedure calls
Facilitate modular programming
Require two pieces of information to return
- End of procedure
  - Pentium uses the ret instruction
  - MIPS uses the jr instruction
- Return address
  - In a (special) register: MIPS allows any general-purpose register
  - On the stack: Pentium
Flow of Control (cont.)
[Figure: delay slot]
Parameter Passing
Two basic techniques
Register-based (e.g. PowerPC, MIPS)
- Internal registers are used
  - Faster
  - Limit the number of parameters
  - Recursive procedure
Stack-based (e.g. Pentium)
- Stack is used
  - More general
2. Operand Types
Instructions support basic data types
- Characters
- Integers
- Floating-point
Instruction overload
Same instruction for different data types
Example: Pentium
mov AL,address ;loads an 8-bit value
mov AX,address ;loads a 16-bit value
mov EAX,address ;loads a 32-bit value
Operand Types (cont.)
Separate instructions
Instructions specify the operand size
Example: MIPS
lb Rdest,address ;loads a byte
lh Rdest,address ;loads a halfword (16 bits)
lw Rdest,address ;loads a word (32 bits)
ld Rdest,address ;loads a doubleword (64 bits)
Similar instructions exist for store
3. Addressing Modes
How the operands are specified
Operands can be in three places
- Registers
  - Register addressing mode
- Part of instruction
  - Constant
  - Immediate addressing mode
  - All processors support these two addressing modes
- Memory
  - Difference between RISC and CISC
  - CISC supports a large variety of addressing modes
  - RISC follows load/store architecture
4. Instruction Types
Several types of instructions
Data movement
- Pentium: mov dest,src
- Some do not provide direct data movement instructions
- Indirect data movement
  add Rdest,Rsrc,0 ;Rdest = Rsrc+0
Arithmetic and Logical
- Arithmetic
  - Integer and floating-point, signed and unsigned
  - add, subtract, multiply, divide
- Logical
  - and, or, not, xor
Instruction Types (cont.)
Condition code bits
S: Sign bit (0 = +, 1 = -)
Z: Zero bit (0 = nonzero, 1 = zero)
O: Overflow bit (0 = no overflow, 1 = overflow)
C: Carry bit (0 = no carry, 1 = carry)
Example: Pentium
cmp count,25 ;compare count to 25
             ;subtract 25 from count
je target    ;jump if equal
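How a compare instruction sets the four condition code bits can be sketched in C. This is an illustrative model assuming 32-bit operands and x86-style conventions, where compare performs a subtraction, discards the difference, and reports a borrow in the carry bit; the `struct flags` type and the function name `compare` are hypothetical.

```c
#include <stdint.h>

struct flags {
    int sign;     /* S: 1 if the difference is negative */
    int zero;     /* Z: 1 if the difference is zero */
    int overflow; /* O: 1 if the signed subtraction overflowed */
    int carry;    /* C: 1 if the unsigned subtraction needed a borrow */
};

/* Compute the flags of a - b without keeping the difference. */
static struct flags compare(int32_t a, int32_t b)
{
    uint32_t ua = (uint32_t)a, ub = (uint32_t)b;
    uint32_t diff = ua - ub; /* wraps modulo 2^32 */
    struct flags f;
    f.sign = (int)((diff >> 31) & 1u);
    f.zero = diff == 0;
    f.carry = ua < ub;
    /* Signed overflow: operands differ in sign and the result's sign
       differs from the sign of a. */
    f.overflow = (int)((((ua ^ ub) & (ua ^ diff)) >> 31) & 1u);
    return f;
}
```

A "jump if equal" then only needs to inspect the zero bit, which is why the test and the branch can be separate instructions in the set-then-jump style.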
Instruction Types (cont.)
Flow control and I/O instructions
- Branch
- Procedure call
- Interrupts
I/O instructions
- Memory-mapped I/O
  - Most processors support memory-mapped I/O
  - No separate instructions for I/O
- Isolated I/O
  - Pentium supports isolated I/O
  - Separate I/O instructions
    in AX,port ;read from an I/O port
    out port,AX ;write to an I/O port
5. Instruction Formats
Two types
Fixed-length
- Used by RISC processors
- 32-bit RISC processors use 32-bit wide instructions
  - Examples: SPARC, MIPS, PowerPC
Variable-length
- Used by CISC processors
- Memory operands need more bits to specify
Opcode
Major and exact operation
Examples of Instruction Formats
RISC (Reduced Instruction Set Computer) versus CISC (Complex Instruction Set Computer)
RISC vs. CISC
• The underlying philosophy of RISC machines is that a system is better able to manage program execution when the program consists of only a few different instructions that are the same length and require the same number of clock cycles to decode and execute.
• RISC systems access memory only with explicit load and store instructions.
• In CISC systems, many different kinds of instructions access memory, making instruction length variable and fetch-decode-execute time unpredictable.
• The difference between CISC and RISC becomes evident through the basic computer performance equation:
    time/program = (time/cycle) × (cycles/instruction) × (instructions/program)
• RISC systems shorten execution time by reducing the clock cycles per instruction.
• CISC systems improve performance by reducing the number of instructions per program.
• The simple instruction set of RISC machines enables control units to be hardwired for maximum speed.
• The more complex, and variable, instruction set of CISC machines requires microcode-based control units that interpret instructions as they are fetched from memory. This translation takes time.
• With fixed-length instructions, RISC lends itself to pipelining and speculative execution.
Consider the program fragments:

CISC:
        mov ax, 10
        mov bx, 5
        mul bx, ax

RISC:
        mov ax, 0
        mov bx, 10
        mov cx, 5
Begin:  add ax, bx
        loop Begin

• The total clock cycles for the CISC version might be: (2 movs × 1 cycle) + (1 mul × 30 cycles) = 32 cycles
• While the clock cycles for the RISC version are: (3 movs × 1 cycle) + (5 adds × 1 cycle) + (5 loops × 1 cycle) = 13 cycles
• With the RISC clock cycle also being shorter, RISC gives us much faster execution speeds.
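The cycle arithmetic for the two fragments can be checked with a short sketch. The instruction counts and per-instruction cycle counts below are the ones quoted for the fragments (a 30-cycle multiply, single-cycle moves, adds and loops); the function names and the clock periods used in the time estimate are hypothetical values chosen only to illustrate the performance equation.

```python
def total_cycles(instruction_mix):
    """Sum count * cycles over a mix given as {name: (count, cycles_each)}."""
    return sum(count * cycles for count, cycles in instruction_mix.values())


def execution_time_ns(cycles, cycle_time_ns):
    """time/program = cycles/program * time/cycle."""
    return cycles * cycle_time_ns


# CISC fragment: two moves and one multi-cycle multiply
cisc_mix = {"mov": (2, 1), "mul": (1, 30)}
# RISC fragment: three moves, then five add/loop iterations
risc_mix = {"mov": (3, 1), "add": (5, 1), "loop": (5, 1)}

cisc_cycles = total_cycles(cisc_mix)
risc_cycles = total_cycles(risc_mix)

# Hypothetical clock periods, assumed only for illustration
cisc_time = execution_time_ns(cisc_cycles, cycle_time_ns=2.0)
risc_time = execution_time_ns(risc_cycles, cycle_time_ns=1.0)
```

The RISC version executes more instructions (13 against 3) but needs fewer cycles, which is the trade-off the performance equation describes.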
• Because of their load-store ISAs, RISC architectures require a large number of CPU registers.
• These registers provide fast access to data during sequential program execution.
• They can also be employed to reduce the overhead typically caused by passing parameters to subprograms.
• Instead of pulling parameters off of a stack, the subprogram is directed to use a subset of registers.
• This is how registers can be overlapped in a RISC system.
• The current window pointer (CWP) points to the active register window.
• It is becoming increasingly difficult to distinguish RISC architectures from CISC architectures.
• Some RISC systems provide more extravagant instruction sets than some CISC systems.
• Some systems combine both approaches.
• The following two slides summarize the characteristics that traditionally typify the differences between these two architectures.
RISC
• Multiple register sets.
• Three operands per instruction.
• Parameter passing through register windows.
• Single-cycle instructions.
• Hardwired control.
• Highly pipelined.
CISC
• Single register set.
• One or two register operands per instruction.
• Parameter passing through memory.
• Multiple-cycle instructions.
• Microprogrammed control.
• Less pipelined.
Continued....
RISC
• Simple instructions, few in number.
• Fixed-length instructions.
• Complexity in compiler.
• Only LOAD/STORE instructions access memory.
• Few addressing modes.
CISC
• Many complex instructions.
• Variable-length instructions.
• Complexity in microcode.
• Many instructions can access memory.
• Many addressing modes.
Summary
Instruction Set Design Issues
Instruction set design issues include:
• Where are operands stored?
  - registers, memory, stack, accumulator
• How many explicit operands are there?
  - 0, 1, 2, or 3
• How is the operand location specified?
  - register, immediate, indirect, ...
• What type & size of operands are supported?
  - byte, int, float, double, string, vector, ...
• What operations are supported?
  - add, sub, mul, move, compare, ...
More About General Purpose Registers
Why do almost all new architectures use GPRs?
• Registers are much faster than memory (even cache)
  - Register values are available immediately
  - When memory isn't ready, the processor must wait ("stall")
• Registers are convenient for variable storage
  - Compiler assigns some variables just to registers
  - More compact code, since small fields specify registers (compared to memory addresses)
[Figure: memory hierarchy with processor registers, cache, memory, and disk]
What Operations are Needed
• Arithmetic & Logical
  - Integer arithmetic: ADD, SUB, MULT, DIV, SHIFT
  - Logical operation: AND, OR, XOR, NOT
• Data Transfer - copy, load, store
• Control - branch, jump, call, return
• Floating Point - ADDF, MULF, DIVF (same as arithmetic but usually take bigger operands)
• Decimal - ADDD, CONVERT
• String - move, compare, search
• Graphics - pixel and vertex, compression/decompression operations
Stack Architecture: Pros and Cons
Pros
• Good code density (implicit top of stack)
• Low hardware requirements
• Easy to write a simpler compiler for stack architectures
Cons
• Stack becomes the bottleneck
• Little ability for parallelism or pipelining
• Data is not always at the top of the stack when needed, so additional instructions like TOP and SWAP are needed
• Difficult to write an optimizing compiler for stack architectures
Accumulator Architecture: Pros and Cons
Pros
• Very low hardware requirements
• Easy to design and understand
Cons
• Accumulator becomes the bottleneck
• Little ability for parallelism or pipelining
• High memory traffic
Memory-Memory Architecture: Pros and Cons
Pros
• Requires fewer instructions (especially if 3 operands)
• Easy to write compilers for (especially if 3 operands)
Cons
• Very high memory traffic (especially if 3 operands)
• Variable number of clocks per instruction
• With two operands, more data movements are required
Memory-Register Architecture: Pros and Cons
Pros
• Some data can be accessed without loading first
• Instruction format easy to encode
• Good code density
Cons
• Operands are not equivalent (poor orthogonality)
• Variable number of clocks per instruction
• May limit number of registers
Register-to-Register Load-Store Architectures
Register-to-Memory Architectures
Memory-to-Memory Architectures
Instruction Formats
Instruction Set Architecture (ISA)
• To command a computer's hardware you must speak its language
• The words of a machine's language are called instructions and its vocabulary is called the instruction set
• Once you learn one machine language, it is easy to pick up others:
  - There are few fundamental operations that all computers must provide
  - All designers have the same goal of finding a language that simplifies building the hardware and the compiler while maximizing performance and minimizing cost
• Learning how instructions are represented leads to discovering the secret of computing: the stored-program concept
• The MIPS instruction set is used as a case study
Interface Design
A good interface:
• Lasts through many implementations (portability, compatibility)
• Is used in many different ways (generality)
• Provides convenient functionality to higher levels
• Permits an efficient implementation at lower levels
Design decisions must take into account:
• Technology
• Machine organization
• Programming languages
• Compiler technology
• Operating systems
[Figure: a single interface connecting several uses to several implementations over time]
Classifying Instruction Set Architectures
Accumulator Architecture
• Common in early stored-program computers, when hardware was so expensive
• Machine has only one register (accumulator) involved in all math & logical operations
• All operations assume the accumulator as a source operand and a destination for the operation, with the other operand stored in memory
Extended Accumulator Architecture
• Dedicated registers for specific operations, e.g. stack and array index registers, added
• The 8086 microprocessor is an example of such a special-purpose register architecture
General-Purpose Register Architecture
• MIPS is an example of such an architecture, where registers are not restricted to a single role
• This type of instruction set can be further divided into:
  - Register-memory: allows one operand to be in memory
  - Register-register (load-store): demands all operands to be in registers
Machine          # general-purpose registers   Architecture style               Year
Motorola 6800    2                             Accumulator                      1974
DEC VAX          16                            Register-memory, memory-memory   1977
Intel 8086       1                             Extended accumulator             1978
Motorola 68000   16                            Register-memory                  1980
Intel 80386      8                             Register-memory                  1985
PowerPC          32                            Load-store                       1992
DEC Alpha        32                            Load-store                       1992
Compact Code and Stack Architectures
• When memory is scarce, machines like the Intel 80x86 had variable-length instructions to match varying operand specifications and minimize code size
• Stack machines abandoned registers altogether, arguing that it is hard for compilers to use them efficiently
• Operands are pushed on a stack from memory and the results have to be popped from the stack to memory
• Operations take their operands by default from the top of the stack and insert the results back onto the stack
• Stack machines simplify compilers and lend themselves to a compact instruction encoding, but limit compiler optimization (e.g. in math expressions)
Example: A = B + C
  Push AddressC   # Top=Top+4; Stack[Top]=Memory[AddressC]
  Push AddressB   # Top=Top+4; Stack[Top]=Memory[AddressB]
  add             # Stack[Top-4]=Stack[Top]+Stack[Top-4]; Top=Top-4
  Pop AddressA    # Memory[AddressA]=Stack[Top]; Top=Top-4
• Compact code is important for heralded network computers, where programs must be downloaded over the Internet (e.g. Java-based applications)
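The push/add/pop sequence for A = B + C can be traced with a very small interpreter. This is only a sketch of the idea: `run_stack_program`, the tuple-based program encoding, and the use of variable names as memory addresses are assumptions made for illustration, and a Python list stands in for the hardware stack and its Top pointer.

```python
def run_stack_program(program, memory):
    """Execute push/pop/add operations against a memory dict; returns the memory."""
    stack = []
    for op, *args in program:
        if op == "push":
            stack.append(memory[args[0]])   # Stack[Top] = Memory[address]
        elif op == "pop":
            memory[args[0]] = stack.pop()   # Memory[address] = Stack[Top]
        elif op == "add":
            right = stack.pop()             # both operands come from the top of the stack
            left = stack.pop()
            stack.append(left + right)      # the result goes back onto the stack
        else:
            raise ValueError(f"unknown operation: {op}")
    return memory


# A = B + C, with hypothetical initial values for B and C
memory = {"A": 0, "B": 7, "C": 5}
program = [("push", "C"), ("push", "B"), ("add",), ("pop", "A")]
run_stack_program(program, memory)
```

Note that the add instruction names no operands at all, which is where the compact encoding of stack machines comes from.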
Other types of Architecture
High-Level-Language Architecture
• In the 1960s, systems software was rarely written in high-level languages and virtually every commercial operating system before Unix was written in assembly
• Some people blamed the code density on the instruction set rather than the programming language
• A machine design philosophy was advocated with the goal of making the hardware more like high-level languages
• The effectiveness of high-level languages, memory size limitation and lack of efficient compilers doomed this philosophy to a historical footnote
Reduced Instruction Set Architecture
• With the recent development in compiler technology and expanded memory sizes, fewer programmers are using assembly-level coding
• Instruction set architecture became measurable in the way compilers, rather than programmers, use them
• RISC architecture favors simplifying hardware design over enriching the offered set of instructions, relying on compilers to effectively use them to perform complex operations
• Virtually all new architectures since 1982 follow the RISC philosophy of fixed instruction lengths, load-store operations, and limited addressing modes
Evolution of Instruction Sets
[Figure: evolution of instruction sets]
Single Accumulator (EDSAC, 1950)
  Accumulator + Index Registers (Manchester Mark I, IBM 700 series, 1953)
    Separation of Programming Model from Implementation
      - High-level Language Based (B5000, 1963)
      - Concept of a Family (IBM 360, 1964)
    General Purpose Register Machines
      - Complex Instruction Sets (VAX, Intel 432, 1977-80)
      - Load/Store Architecture (CDC 6600, Cray 1, 1963-76)
        RISC (MIPS, SPARC, HP-PA, IBM RS6000, ..., 1987)
Register-Memory Architectures
Effect of the number of memory operands
# memory addresses   Max. number of operands   Examples
0                    3                         SPARC, MIPS, PowerPC, ALPHA
1                    2                         Intel 80x86, Motorola 68000
2                    2                         VAX (also has 3 operands format)
3                    3                         VAX (also has 2 operands format)
Memory Address
Interpreting Memory Addressing
• The address of a word matches the byte address of one of its 4 bytes
• The addresses of sequential words differ by 4 (word size in bytes)
• Words' addresses are multiples of 4 (alignment restriction)
• Machines that use the address of the leftmost byte as the word address are called "Big Endian" and those that use the rightmost byte are called "Little Endian"
• Misalignment complicates memory access and causes programs to run slower (some machines do not allow misaligned memory access at all)
• Byte ordering can be a problem when exchanging data among different machines
• Byte addresses affect array index calculation, to account for word addressing and offset within the word

Object addressed   Aligned at byte offsets   Misaligned at byte offsets
Byte               0,1,2,3,4,5,6,7           Never
Half word          0,2,4,6                   1,3,5,7
Word               0,4                       1,2,3,5,6,7
Double word        0                         1,2,3,4,5,6,7
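The alignment rule and the two byte orders can be illustrated in a few lines. This is a sketch under simple assumptions: an object of `size` bytes counts as aligned when its byte address is a multiple of `size`, words are 4 bytes, and the helper names `is_aligned` and `word_bytes` are hypothetical. `int.to_bytes` is the standard Python call used to lay a value out in memory order.

```python
def is_aligned(address, size):
    """An object of `size` bytes is aligned when its address is a multiple of size."""
    return address % size == 0


def word_bytes(value, byteorder):
    """Bytes of a 4-byte word in increasing address order ('big' or 'little')."""
    return list(value.to_bytes(4, byteorder))


# Byte offsets 0..7 at which each object size is aligned
aligned_offsets = {
    size: [offset for offset in range(8) if is_aligned(offset, size)]
    for size in (1, 2, 4, 8)
}

big = word_bytes(0x0A0B0C0D, "big")        # most significant byte at the lowest address
little = word_bytes(0x0A0B0C0D, "little")  # least significant byte at the lowest address
```

The same four bytes appear in opposite orders, which is why byte ordering matters when two machines exchange binary data.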
Addressing Modes
• Addressing modes refer to how the location of an operand (effective address) is specified
• Addressing modes have the ability to:
  - Significantly reduce instruction counts
  - Increase the average CPI
  - Increase the complexity of building a machine
• The VAX machine is used for benchmark data since it supports a wide range of memory addressing modes
• Famous addressing modes can be classified based on:
  - the source of the data, into register, immediate or memory
  - the address calculation, into direct and indirect
• An indexed addressing mode is usually provided to allow efficient implementation of loops and array access
Example of Addressing Modes
Addressing mode | Example instruction | Meaning | When used
Register | Add R4, R3 | Regs[R4] ← Regs[R4] + Regs[R3] | When a value is in a register
Immediate | Add R4, #3 | Regs[R4] ← Regs[R4] + 3 | For constants
Displacement | Add R4, 100(R1) | Regs[R4] ← Regs[R4] + Mem[100 + Regs[R1]] | Accessing local variables
Register indirect | Add R4, (R1) | Regs[R4] ← Regs[R4] + Mem[Regs[R1]] | Accessing using a pointer or a computed address
Indexed | Add R4, (R1 + R2) | Regs[R4] ← Regs[R4] + Mem[Regs[R1] + Regs[R2]] | Sometimes useful in array addressing: R1 = base of the array; R2 = index amount
Direct or absolute | Add R4, (1001) | Regs[R4] ← Regs[R4] + Mem[1001] | Sometimes useful for accessing static data; address constant may need to be large
Memory indirect or memory deferred | Add R4, @(R3) | Regs[R4] ← Regs[R4] + Mem[Mem[Regs[R3]]] | If R3 is the address of the pointer p, then the mode yields *p
Autoincrement | Add R4, (R2)+ | Regs[R4] ← Regs[R4] + Mem[Regs[R2]]; Regs[R2] ← Regs[R2] + d | Useful for stepping through arrays within a loop. R2 points to the start of the array; each reference increments R2 by d
Autodecrement | Add R4, -(R2) | Regs[R2] ← Regs[R2] - d; Regs[R4] ← Regs[R4] + Mem[Regs[R2]] | Same use as autoincrement. Autodecrement/increment can also act as push/pop to implement a stack
Scaled | Add R4, 100(R2)[R3] | Regs[R4] ← Regs[R4] + Mem[100 + Regs[R2] + Regs[R3] × d] | Used to index arrays
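The "Meaning" column can be read as a recipe for fetching the source operand in each mode. The sketch below follows those register-transfer descriptions; `fetch_operand`, the dictionary-based register file and memory, and the default operand size d = 4 are assumptions for illustration, not a model of any particular machine.

```python
def fetch_operand(mode, regs, mem, reg=None, index=None, const=0, d=4):
    """Return the source operand for one addressing mode; may update regs."""
    if mode == "register":
        return regs[reg]
    if mode == "immediate":
        return const
    if mode == "displacement":
        return mem[const + regs[reg]]
    if mode == "register_indirect":
        return mem[regs[reg]]
    if mode == "indexed":
        return mem[regs[reg] + regs[index]]
    if mode == "direct":
        return mem[const]
    if mode == "memory_indirect":
        return mem[mem[regs[reg]]]
    if mode == "autoincrement":
        value = mem[regs[reg]]   # use the address first, then step forward by d
        regs[reg] += d
        return value
    if mode == "autodecrement":
        regs[reg] -= d           # step back by d first, then use the address
        return mem[regs[reg]]
    if mode == "scaled":
        return mem[const + regs[reg] + regs[index] * d]
    raise ValueError(f"unknown addressing mode: {mode}")


# Hypothetical register and memory contents for a few sample fetches
regs = {"R1": 100, "R2": 200, "R3": 2}
mem = {11: 99, 100: 11, 102: 77, 200: 22, 204: 33, 208: 44, 1001: 55}

displacement = fetch_operand("displacement", regs, mem, reg="R1", const=100)
memory_indirect = fetch_operand("memory_indirect", regs, mem, reg="R1")
scaled = fetch_operand("scaled", regs, mem, reg="R1", index="R3", const=100)
first = fetch_operand("autoincrement", regs, mem, reg="R2")
second = fetch_operand("autoincrement", regs, mem, reg="R2")
```

Two consecutive autoincrement fetches walk through adjacent array elements, which is the loop use case mentioned for that mode.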
Addressing Mode for Signal Processing
• DSP offers special addressing modes to better serve popular algorithms
• Special features require either hand coding or a compiler that uses such features (GNU would not be a good choice)
Modulo addressing
• Since DSP deals with continuous data streams, circular buffers are widely used
• Circular or modulo addressing allows automatic increment and decrement and resets the pointer when reaching the end of the buffer
Reverse addressing
• The resulting address is the reverse order of the current address
• Reverse addressing mode expedites the access, which otherwise requires a number of logical instructions or extra memory accesses
Fast Fourier Transform (bit-reversed addressing)
  0 (000) → 0 (000)
  1 (001) → 4 (100)
  2 (010) → 2 (010)
  3 (011) → 6 (110)
  4 (100) → 1 (001)
  5 (101) → 5 (101)
  6 (110) → 3 (011)
  7 (111) → 7 (111)
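Both DSP addressing modes reduce to small address calculations, sketched here in software. The function names, the buffer base address and the buffer length are hypothetical; on a DSP the same computation is done by the address generation hardware rather than by extra instructions.

```python
def next_circular(pointer, step, base, length):
    """Modulo addressing: advance pointer by step, wrapping inside [base, base + length)."""
    return base + (pointer - base + step) % length


def bit_reverse(index, bits):
    """Reverse addressing: return index with its low `bits` bits in reverse order."""
    result = 0
    for _ in range(bits):
        result = (result << 1) | (index & 1)  # move the lowest bit of index onto result
        index >>= 1
    return result


# 8-point FFT ordering: 3-bit indices mapped to their bit-reversed counterparts
fft_order = [bit_reverse(i, 3) for i in range(8)]

# Hypothetical 8-entry circular buffer starting at address 0x100
wrapped_forward = next_circular(0x107, 1, base=0x100, length=8)
wrapped_backward = next_circular(0x100, -1, base=0x100, length=8)
```

The computed `fft_order` reproduces the index mapping listed for the Fast Fourier Transform.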
Operations of the Computer Hardware
"There must certainly be instructions for performing the fundamental arithmetic operations."
Burks, Goldstine and von Neumann, 1947
• Assembly language is a symbolic representation of what the processor actually understands
• The MIPS assembler allows only one instruction per line and ignores comments following # until the end of the line
Example:
Translation of a segment of a C program to MIPS assembly instructions
C: f = (g + h) - (i + j)
MIPS:
  add t0, g, h    # temp variable t0 contains "g + h"
  add t1, i, j    # temp variable t1 contains "i + j"
  sub f, t0, t1   # f = t0 - t1 = (g + h) - (i + j)
Operations in the Instruction Set
Operator type            Examples
Arithmetic and logical   Integer arithmetic and logical operations: add, and, subtract, or
Data Transfer            Loads-stores (move instructions on machines with memory addressing)
Control                  Branch, jump, procedure call and return, trap
System                   Operating system call, virtual memory management instructions
Floating point           Floating point instructions: add, multiply
Decimal                  Decimal add, decimal multiply, decimal to character conversion
String                   String move, string compare, string search
Graphics                 Pixel operations, compression/decompression operations

• Arithmetic, logical, data transfer and control are almost standard categories for all machines
• System instructions are required for multi-programming environments, although support for system functions varies
• Decimal and string instructions can be primitives, e.g. on the IBM 360 and the VAX
• Support for floating point, decimal, string and graphics is sometimes provided optionally via a co-processor
• Some machines rely on the compiler to synthesize special operations, such as string handling, from simpler instructions
Operations for Media & Signal Processing
• Single instruction multiple data (SIMD) and vector instructions are often supported in DSPs, which are commonly used in multimedia and signal processing applications
• Partitioned Add (integer)
  - Perform multiple 16-bit additions on a 64-bit ALU, since most data are narrow
  - Increases ALU throughput for multimedia applications
• Paired single operations (float)
  - Allow the same register to act as two operands to the same operation
  - Handy in dealing with vertices and coordinates
• Multiply and accumulate
  - Very handy for calculating dot products of vectors (signal processing) and matrix multiplication
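A partitioned add and a multiply-accumulate loop can both be emulated in a few lines. This is a sketch, assuming four independent 16-bit lanes packed into a 64-bit word with wraparound in each lane (saturating variants also exist on real hardware); the function names are hypothetical.

```python
def partitioned_add16(x, y):
    """Add four 16-bit lanes packed in 64-bit words; carries never cross lanes."""
    result = 0
    for lane in range(4):
        shift = 16 * lane
        a = (x >> shift) & 0xFFFF
        b = (y >> shift) & 0xFFFF
        result |= ((a + b) & 0xFFFF) << shift  # wrap within the lane
    return result


def dot_product(xs, ys):
    """Dot product expressed as one multiply-accumulate step per element pair."""
    acc = 0
    for x, y in zip(xs, ys):
        acc += x * y  # multiply and accumulate
    return acc


packed_sum = partitioned_add16(0x0001_0002_0003_FFFF, 0x0001_0001_0001_0001)
dot = dot_product([1, 2, 3], [4, 5, 6])
```

In the packed example the lowest lane overflows from 0xFFFF to 0x0000 without disturbing its neighbour, which is what distinguishes a partitioned add from an ordinary 64-bit add.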
Frequency of Operations Usage
• The most widely executed instructions are the simple operations of an instruction set
• The following is the average usage in SPECint on Intel 80x86

Rank   80x86 instruction        Integer average (% total executed)
1      Load                     22%
2      Conditional branch       20%
3      Compare                  16%
4      Store                    12%
5      Add                      8%
6      And                      6%
7      Sub                      5%
8      Move register-register   4%
9      Call                     1%
10     Return                   1%
       Total                    96%

Make the common case fast by focusing on these operations
Control Flow Instructions
• Jump for unconditional change in the control flow
• Branch for conditional change in the control flow
• Procedure calls and returns
Data is based on SPEC on Alpha
Destination Address Definition
• Relative addressing with respect to the program counter proved to be the best choice for forward and backward branches or jumps (load address independent)
• To allow for dynamic loading of library routines, register indirect addressing allows addresses to be loaded in special registers (e.g. virtual functions in C++ and system calls in a case statement)
Data is based on SPEC on Alpha
Condition Evaluation
• Compare-and-branch can be efficient if the majority of conditions are comparisons with zero
Based on SPEC on MIPS
Remember to focus on the common case
Frequency of Types of Comparison
Data is based on SPEC on Alpha
Different benchmarks and machines set new design priorities
• DSPs support a repeat instruction for for-loops (vectors) using registers
Supporting Procedures
• Execution of a procedure follows these steps:
  - Store parameters in a place accessible to the procedure
  - Transfer control to the procedure
  - Acquire the storage resources needed for the procedure
  - Perform the desired task
  - Store the result value in a place accessible to the calling program
  - Return control to the point of origin
• The hardware provides a program counter to trace instruction flow and manage transfer of control
• Parameter Passing
  - Registers can be used for passing a small number of parameters
  - A stack is used to spill registers of the current context and make room for the called procedure to run, and to allow large parameters to be passed
• Storage of machine state can be performed by caller or callee
• Handling of shared variables is important to ensure correct semantics and thus requires clear specifications in the library interface
Global variables stored in registers need careful handling
Type and Size of Operands
• The type of an operand is designated by encoding it in the instruction's operation code
• The type of an operand, e.g. single-precision float, effectively gives its size
• Common operand types include character, half-word and word-size integer, and single- and double-precision floating point
• Characters are almost always in ASCII, integers are in 2's complement, and floating point is in IEEE 754
• The 16-bit Unicode, used in Java, is gaining popularity due to its support for international character sets
• For business applications, some architectures support a decimal format in binary coded decimal (BCD)
• Depending on the size of the word, the complexity of handling different operand types differs
• DSP offers fixed-point data types to support high-precision floating point arithmetic and to allow sharing a single exponent for multiple numbers
• For graphics applications, vertex and pixel operands are added features
Size of Operands
Frequency of reference by size, based on SPEC2000 on Alpha
• The double-word data type is used for double-precision floating point operations and for address storage in machines with a 64-bit-wide address bus
• Words are used for integer operations and for 32-bit address bus machines
• Because of the mix in SPEC, word and double-word data types dominate
Instruction Representation
• Humans are taught to think in base 10 (decimal), but numbers may be represented in any base (123 in base 10 = 1111011 in binary, or base 2)
• Numbers are stored in computers as a series of high and low electronic signals (binary numbers)
• Binary digits are called bits and are considered the atom of computing
• Each piece of an instruction is a number, and placing these numbers together forms the instruction
• The assembler translates the symbolic assembly instructions into machine language instructions (machine code)
Example:
Assembly: add $t0, $s1, $s2
M/C language (decimal):   0        17       18       8        0        32
M/C language (binary):    000000   10001    10010    01000    00000    100000
                          6 bits   5 bits   5 bits   5 bits   5 bits   6 bits
Note: the MIPS compiler by default maps $s0,...,$s7 to reg. 16-23 and $t0,...,$t7 to reg. 8-15
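Placing the field numbers side by side is just shifting and OR-ing. The sketch below packs the six register-format fields (6, 5, 5, 5, 5 and 6 bits) of the add example into one 32-bit word, using the default MIPS register numbering noted above; `encode_r_format` and `REGISTER_NUMBERS` are hypothetical names introduced for illustration.

```python
# Default MIPS numbering: $t0..$t7 are registers 8..15, $s0..$s7 are 16..23
REGISTER_NUMBERS = {f"$t{i}": 8 + i for i in range(8)}
REGISTER_NUMBERS.update({f"$s{i}": 16 + i for i in range(8)})


def encode_r_format(op, rs, rt, rd, shamt, funct):
    """Pack the six register-format fields into one 32-bit machine word."""
    return (op << 26) | (rs << 21) | (rt << 16) | (rd << 11) | (shamt << 6) | funct


# add $t0, $s1, $s2: op = 0, rs = $s1, rt = $s2, rd = $t0, shamt = 0, funct = 32
add_word = encode_r_format(
    op=0,
    rs=REGISTER_NUMBERS["$s1"],
    rt=REGISTER_NUMBERS["$s2"],
    rd=REGISTER_NUMBERS["$t0"],
    shamt=0,
    funct=32,
)
add_bits = format(add_word, "032b")
```

Splitting `add_bits` into groups of 6, 5, 5, 5, 5 and 6 characters gives back the binary row of the example.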
Encoding an Instruction Set
• Instruction encoding affects the size of the compiled program and the complexity of the CPU implementation
• The operation is typically specified in one field, called the opcode
• The addressing mode for the operand can be encoded with the operation, or specified through a separate identifier in case of a large number of supported modes
• The architecture must balance several competing factors:
  - Desire to support as many registers and addressing modes as possible
  - Effect of operand specification on the size of the instruction (program)
  - Desire to simplify instruction fetching and decoding during execution
• Fixed-size instruction encoding simplifies the CPU design while limiting the addressing modes supported
• An architect caring about code size can use variable-size encoding
• A hybrid approach is to allow variability by supporting multiple-sized instructions
Encoding Examples
MIPS Instruction format
Register-format instructions
  op       rs       rt       rd       shamt    funct
  6 bits   5 bits   5 bits   5 bits   5 bits   6 bits
• op: Basic operation of the instruction, traditionally called the opcode
• rs: The first register source operand
• rt: The second register source operand
• rd: The register destination operand; it gets the result of the operation
• shamt: Shift amount
• funct: This field selects the specific variant of the operation of the op field
Immediate-type instructions
  op       rs       rt       address
  6 bits   5 bits   5 bits   16 bits
• Some instructions need longer fields than provided, for large-value constants
• The 16-bit address means a load word instruction can load a word within a region of ±2^15 bytes of the address in the base register
Example: lw $t0, 32($s3)   # Temporary register $t0 gets A[8]

Instruction   Format   op   rs    rt    rd    shamt   funct   address
add           R        0    reg   reg   reg   0       32      N/A
sub           R        0    reg   reg   reg   0       34      N/A
lw            I        35   reg   reg   N/A   N/A     N/A     address
sw            I        43   reg   reg   N/A   N/A     N/A     address
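The immediate format can be sketched the same way as the register format, with the 16-bit address field treated as a signed offset. The helper names below are hypothetical; the field values for the load-word example are op = 35, rs = 19 (base register $s3), rt = 8 ($t0) and offset 32, following the standard MIPS register numbering.

```python
def encode_i_format(op, rs, rt, offset):
    """Pack an immediate-format instruction; the offset must fit in 16 signed bits."""
    if not -(1 << 15) <= offset < (1 << 15):
        raise ValueError("offset does not fit in a signed 16-bit field")
    return (op << 26) | (rs << 21) | (rt << 16) | (offset & 0xFFFF)


def decode_offset(word):
    """Recover the signed 16-bit offset from the low half of an I-format word."""
    raw = word & 0xFFFF
    return raw - (1 << 16) if raw & 0x8000 else raw


# lw $t0, 32($s3)
lw_word = encode_i_format(op=35, rs=19, rt=8, offset=32)
```

The range check in `encode_i_format` is the software counterpart of the ±2^15-byte reach around the base register.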
The Stored Program Concept
• Learning how instructions are represented leads to discovering the secret of computing: the stored-program concept
• Today's computers are built on two key principles:
  - Instructions are represented as numbers
  - Programs can be stored in memory to be read or written just like numbers
• The power of the concept: memory can contain
  - the source code for an editor
  - the compiled m/c code for the editor
  - the text that the compiled program is using
  - the compiler that generated the code
[Figure: a processor connected to a memory holding an accounting program (machine code), an editor program (machine code), a C compiler (machine code), payroll data, book text, and source code in C for the editor program]
Compiling if-then-else in MIPS
Assuming the five variables f, g, h, i, and j correspond to the five registers $s0 through $s4, what is the compiled MIPS code for the following C if statement?
  if (i == j) f = g + h; else f = g - h;
[Figure: flowchart testing i == j; when i = j, f = g + h is executed; when i ≠ j, control goes to Else: and f = g - h is executed; both paths join at Exit:]
MIPS:
        bne $s3, $s4, Else    # go to Else if i ≠ j
        add $s0, $s1, $s2     # f = g + h (skipped if i ≠ j)
        j Exit
Else:   sub $s0, $s1, $s2     # f = g - h (skipped if i = j)
Exit:
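The control flow of the compiled sequence can be traced with a toy interpreter that understands only the four instructions used (bne, add, j, sub) plus labels. This is a sketch: the tuple encoding of the program and the `run` function are assumptions for illustration and do not model real MIPS behaviour such as branch delay slots.

```python
def run(program, regs):
    """Interpret a list of (op, operands...) tuples; labels are ('label', name)."""
    labels = {ins[1]: i for i, ins in enumerate(program) if ins[0] == "label"}
    pc = 0
    while pc < len(program):
        op, *args = program[pc]
        pc += 1
        if op == "bne":
            if regs[args[0]] != regs[args[1]]:
                pc = labels[args[2]]          # branch taken: continue at the label
        elif op == "j":
            pc = labels[args[0]]              # unconditional jump
        elif op == "add":
            regs[args[0]] = regs[args[1]] + regs[args[2]]
        elif op == "sub":
            regs[args[0]] = regs[args[1]] - regs[args[2]]
        elif op != "label":
            raise ValueError(f"unsupported instruction: {op}")
    return regs


IF_ELSE = [
    ("bne", "$s3", "$s4", "Else"),
    ("add", "$s0", "$s1", "$s2"),
    ("j", "Exit"),
    ("label", "Else"),
    ("sub", "$s0", "$s1", "$s2"),
    ("label", "Exit"),
]

# f, g, h, i, j live in $s0..$s4; the values are hypothetical
equal_case = run(IF_ELSE, {"$s0": 0, "$s1": 10, "$s2": 3, "$s3": 5, "$s4": 5})
unequal_case = run(IF_ELSE, {"$s0": 0, "$s1": 10, "$s2": 3, "$s3": 5, "$s4": 6})
```

Exactly one of the add and sub instructions executes on each run, matching the two arms of the C if statement.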
Typical Compilation
Major Types of Optimization
Optimization name: explanation (frequency, where given)
High-level (at or near source level; machine-independent)
- Procedure integration: replace procedure call by procedure body (N.M.)
Local (within straight-line code)
- Common subexpression elimination: replace two instances of the same computation by a single copy
- Constant propagation: replace all instances of a variable that is assigned a constant with the constant
- Stack height reduction: rearrange expression tree to minimize resources needed for expression evaluation (N.M.)
Global (across a branch)
- Global common subexpression elimination: same as local, but this version crosses branches
- Copy propagation: replace all instances of a variable A that has been assigned X (i.e., A = X) with X
- Code motion: remove code from a loop that computes the same value each iteration of the loop
- Induction variable elimination: simplify/eliminate array-addressing calculations within loops
Machine-dependent (depends on machine knowledge)
- Strength reduction: many examples, such as replacing a multiply by a constant with adds and shifts (N.M.)
- Pipeline scheduling: reorder instructions to improve pipeline performance (N.M.)
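As a small illustration of strength reduction, the Python sketch below replaces a multiply by the constant 10 with two shifts and an add. The function names are hypothetical, and the example only shows the algebraic identity a compiler would rely on, not how any particular compiler implements the transformation.

```python
# Strength reduction sketch: multiply by a constant rewritten as shifts and adds.

def times_ten_multiply(x):
    """Original form: one multiply."""
    return x * 10

def times_ten_shift_add(x):
    """Reduced form: 10 = 8 + 2, so x * 10 == (x << 3) + (x << 1)."""
    return (x << 3) + (x << 1)

print(times_ten_multiply(7), times_ten_shift_add(7))  # 70 70
```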
Effect of Compiler Optimization
[Figure: measurements for each program and compiler optimization level]
Level 0: non-optimized code
Level 1: local optimization
Level 2: global optimization, s/w pipelining
Level 3: adds procedure integration
Compiler Support for Multimedia Instr.
- Intel's MMX and PowerPC AltiVec have small vector processing capabilities targeting multimedia applications (to speed up graphics)
- Intel added a new set of instructions called Streaming SIMD Extension
- A major advantage of vector computers is hiding the latency of memory access by loading multiple elements and then overlapping execution with data transfer
- Vector computers typically have strided and/or gather/scatter addressing to perform operations on distant memory locations
  - Strided addressing allows memory access in increments larger than one
  - Gather/scatter addressing is similar to register indirect mode, where the addresses are stored instead of the data
- Supporting vector operations without strided addressing, as in Intel's MMX, limits the potential speedup
- Such limited support for vector processing makes the use of vectorizing compiler optimization unpopular and restricts its scope to hand-coded routines
- SIMD instructions on MMX and AltiVec tend to be solutions, not primitives
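The two addressing styles named above can be sketched with plain Python lists standing in for memory. The function names strided_load, gather, and scatter are illustrative assumptions, and the 4x4 row-major matrix is made-up sample data.

```python
# Sketch of strided and gather/scatter addressing on a flat "memory" list.

def strided_load(memory, base, stride, count):
    """Load count elements starting at base, stepping by stride."""
    return [memory[base + i * stride] for i in range(count)]

def gather(memory, addresses):
    """Load the elements whose addresses are listed in an index vector."""
    return [memory[address] for address in addresses]

def scatter(memory, addresses, values):
    """Store values to the addresses listed in an index vector."""
    for address, value in zip(addresses, values):
        memory[address] = value

# A 4x4 matrix stored row by row: element (r, c) lives at index 4*r + c.
matrix = list(range(100, 116))

column_1 = strided_load(matrix, base=1, stride=4, count=4)
diagonal = gather(matrix, [0, 5, 10, 15])
scatter(matrix, [0, 5, 10, 15], [0, 0, 0, 0])

print(column_1)  # [101, 105, 109, 113]
print(diagonal)  # [100, 105, 110, 115]
```

Without strided access, reading a matrix column requires one scalar load per element, which is one reason the slides describe the speedup as limited.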
Starting a Program
[Figure: C program -> Compiler -> Assembly language program -> Assembler -> Object: machine language module, combined with Object: library routine (machine language) -> Linker -> Executable: machine language program -> Loader -> Memory]
Linker:
- Places code and data modules symbolically in memory
- Determines the addresses of data and instruction labels
- Patches both internal and external references
Object files for Unix typically contain:
- Header: size and position of components
- Text segment: machine code
- Data segment: static and dynamic variables
- Relocation info: identifies absolute memory references
- Symbol table: names and locations of labels, procedures, and variables
- Debugging info: mapping from source to object code, breakpoints, etc.
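The three linker tasks (place modules, determine label addresses, patch references) can be sketched as a toy in Python. Everything here is a simplifying assumption made for illustration: memory is word-addressed, a module is a dictionary with hypothetical keys code, symbols, and relocations, and a reference is patched by overwriting a whole word with the resolved address.

```python
# Toy linker sketch (not a real object-file format).

def link(modules, base=0):
    """Return (memory image, symbol table) for modules placed from base."""
    # Step 1: place the modules one after another in memory.
    starts, next_free = [], base
    for module in modules:
        starts.append(next_free)
        next_free += len(module["code"])

    # Step 2: determine the absolute address of every label.
    symbols = {}
    for start, module in zip(starts, modules):
        for name, offset in module["symbols"].items():
            symbols[name] = start + offset

    # Step 3: patch internal and external references.
    image = [word for module in modules for word in module["code"]]
    for start, module in zip(starts, modules):
        for offset, name in module["relocations"]:
            image[start - base + offset] = symbols[name]
    return image, symbols

# Word 1 of MAIN is a placeholder that must point at "helper" in LIBRARY.
MAIN = {"code": [1, 0, 2], "symbols": {"main": 0}, "relocations": [(1, "helper")]}
LIBRARY = {"code": [7, 8], "symbols": {"helper": 1}, "relocations": []}

IMAGE, SYMBOLS = link([MAIN, LIBRARY], base=100)
print(IMAGE)    # [1, 104, 2, 7, 8]
print(SYMBOLS)  # {'main': 100, 'helper': 104}
```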
Loading Executable Program
[Figure: memory layout showing, from high to low addresses, the stack ($sp), dynamic data, static data ($gp), text (pc), and a reserved region]
To load an executable, the operating system follows these steps:
- Reads the executable file header to determine the size of the text and data segments
- Creates an address space large enough for the text and data
- Copies the instructions and data from the executable file into memory
- Copies the parameters (if any) to the main program onto the stack
- Initializes the machine registers and sets the stack pointer to the first free location
- Jumps to a start-up routine that copies the parameters into the argument registers and calls the main routine of the program
Instruction Set Design Issues
Instruction set design issues:
- Number of addresses
- Flow of control
- Operand types
- Addressing modes
- Instruction types
- Instruction formats
Number of Addresses
Four categories
- 3-address machines
  - Two for the source operands and one for the result
- 2-address machines
  - One address doubles as source and result
- 1-address machines
  - Accumulator machines
  - Accumulator is used for one source and the result
- 0-address machines
  - Stack machines
  - Operands are taken from the stack
  - Result goes onto the stack
Three-address machines
- Two for the source operands, one for the result
- RISC processors use three addresses
- Sample instructions
    add dest,src1,src2     ; M(dest) = [src1] + [src2]
    sub dest,src1,src2     ; M(dest) = [src1] - [src2]
    mult dest,src1,src2    ; M(dest) = [src1] * [src2]
[Figure: three-address format with fields Operand 1, Operand 2, Result. Example: a = b + c]
Three-address instruction formats are not common because they require a relatively long instruction format to hold the three address references
Example
C statement
    A = B + C * D - E + F + A
Equivalent code:
    mult T,C,D    ; T = C*D
    add T,T,B     ; T = B + C*D
    sub T,T,E     ; T = B + C*D - E
    add T,T,F     ; T = B + C*D - E + F
    add A,T,A     ; A = B + C*D - E + F + A
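A minimal Python sketch of a three-address machine can confirm that the five-instruction sequence computes A = B + C*D - E + F + A. Memory is modeled as a dictionary keyed by variable name, and the initial values are made up for the check; run_three_address is a hypothetical name.

```python
# Sketch: each instruction names a destination and two sources.

OPERATIONS = {
    "add": lambda a, b: a + b,
    "sub": lambda a, b: a - b,
    "mult": lambda a, b: a * b,
}

def run_three_address(program, memory):
    """M(dest) = [src1] op [src2] for every instruction in order."""
    for op, dest, src1, src2 in program:
        memory[dest] = OPERATIONS[op](memory[src1], memory[src2])
    return memory

THREE_ADDRESS_CODE = [
    ("mult", "T", "C", "D"),
    ("add", "T", "T", "B"),
    ("sub", "T", "T", "E"),
    ("add", "T", "T", "F"),
    ("add", "A", "T", "A"),
]

result = run_three_address(
    THREE_ADDRESS_CODE, {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6, "T": 0}
)
print(result["A"])  # 16
```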
Two-address machines
- One address doubles (for source operand and result)
- Last example makes a case for it
  - Address T is used twice
- Sample instructions
    load dest,src    ; M(dest) = [src]
    add dest,src     ; M(dest) = [dest] + [src]
    sub dest,src     ; M(dest) = [dest] - [src]
    mult dest,src    ; M(dest) = [dest] * [src]
[Figure: two-address format. One address doubles as operand and result. Example: a = a + b]
The two-address format reduces the space requirement but also introduces some awkwardness. To avoid altering the value of an operand, a MOVE instruction is used to move one of the values to a result or temporary location before performing the operation
Example
C statement
    A = B + C * D - E + F + A
Equivalent code:
    load T,C    ; T = C
    mult T,D    ; T = C*D
    add T,B     ; T = B + C*D
    sub T,E     ; T = B + C*D - E
    add T,F     ; T = B + C*D - E + F
    add A,T     ; A = B + C*D - E + F + A
One-address machines
- Use a special set of registers called accumulators
  - Specify one source operand and receive the result
- Called accumulator machines
- Sample instructions
    load addr     ; accum = [addr]
    store addr    ; M[addr] = accum
    add addr      ; accum = accum + [addr]
    sub addr      ; accum = accum - [addr]
    mult addr     ; accum = accum * [addr]
Example
C statement
    A = B + C * D - E + F + A
Equivalent code:
    load C     ; load C into accum
    mult D     ; accum = C*D
    add B      ; accum = C*D + B
    sub E      ; accum = B + C*D - E
    add F      ; accum = B + C*D - E + F
    add A      ; accum = B + C*D - E + F + A
    store A    ; store accum contents in A
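The accumulator version can be traced the same way. In this sketch a single variable plays the role of the accumulator, memory is a dictionary keyed by variable name, and the initial values are invented for the check; run_accumulator is a hypothetical name.

```python
# Sketch: every instruction names one memory operand; the accumulator is implicit.

def run_accumulator(program, memory):
    """Execute one-address instructions against a single accumulator."""
    accum = 0
    for op, addr in program:
        if op == "load":
            accum = memory[addr]
        elif op == "store":
            memory[addr] = accum
        elif op == "add":
            accum = accum + memory[addr]
        elif op == "sub":
            accum = accum - memory[addr]
        elif op == "mult":
            accum = accum * memory[addr]
        else:
            raise ValueError("unsupported instruction: " + op)
    return memory

ONE_ADDRESS_CODE = [
    ("load", "C"), ("mult", "D"), ("add", "B"), ("sub", "E"),
    ("add", "F"), ("add", "A"), ("store", "A"),
]

accumulator_result = run_accumulator(
    ONE_ADDRESS_CODE, {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
)
print(accumulator_result["A"])  # 16
```

Every arithmetic instruction here touches memory once, which is the source of the high memory traffic usually attributed to accumulator machines.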
Zero-address machines
- Stack supplies operands and receives the result
  - Special instructions to load and store use an address
- Called stack machines (Ex: HP3000, Burroughs B5500)
- Sample instructions
    push addr    ; push([addr])
    pop addr     ; pop([addr])
    add          ; push(pop + pop)
    sub          ; push(pop - pop)
    mult         ; push(pop * pop)
Example
C statement
    A = B + C * D - E + F + A
Equivalent code:
    push E
    push C
    push D
    mult
    push B
    add
    sub
    push F
    add
    push A
    add
    pop A
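A short Python sketch of a stack machine shows why E is pushed first in the sequence above. It assumes the semantics given in the sample instructions, where sub computes push(pop - pop), so the first value popped is the left operand. The name run_stack and the initial values are illustrative.

```python
# Sketch: arithmetic instructions carry no addresses; operands come from the stack.

def run_stack(program, memory):
    """Execute zero-address code; only push and pop name a memory location."""
    stack = []
    for instruction in program:
        op = instruction[0]
        if op == "push":
            stack.append(memory[instruction[1]])
        elif op == "pop":
            memory[instruction[1]] = stack.pop()
        else:
            first = stack.pop()    # value on top of the stack
            second = stack.pop()   # value just below it
            if op == "add":
                stack.append(first + second)
            elif op == "sub":
                stack.append(first - second)
            elif op == "mult":
                stack.append(first * second)
            else:
                raise ValueError("unsupported instruction: " + op)
    return memory

ZERO_ADDRESS_CODE = [
    ("push", "E"), ("push", "C"), ("push", "D"), ("mult",),
    ("push", "B"), ("add",), ("sub",),
    ("push", "F"), ("add",), ("push", "A"), ("add",), ("pop", "A"),
]

stack_result = run_stack(
    ZERO_ADDRESS_CODE, {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
)
print(stack_result["A"])  # 16
```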
Load/Store Architecture
- Instructions expect operands in internal processor registers
- Special LOAD and STORE instructions move data between registers and memory
- RISC uses this architecture
- Reduces instruction length
- Sample instructions
    load Rd,addr        ; Rd = [addr]
    store addr,Rs       ; (addr) = Rs
    add Rd,Rs1,Rs2      ; Rd = Rs1 + Rs2
    sub Rd,Rs1,Rs2      ; Rd = Rs1 - Rs2
    mult Rd,Rs1,Rs2     ; Rd = Rs1 * Rs2
Example
C statement
    A = B + C * D - E + F + A
Equivalent code:
    load R1,B
    load R2,C
    load R3,D
    load R4,E
    load R5,F
    load R6,A
    mult R2,R2,R3
    add R2,R2,R1
    sub R2,R2,R4
    add R2,R2,R5
    add R2,R2,R6
    store A,R2
Flow of Control
- Default is sequential flow
- Several instructions alter this default execution
  - Branches
    - Unconditional
    - Conditional
    - Delayed branches
  - Procedure calls
    - Delayed procedure calls
Branches
- Unconditional
  - Absolute address
  - PC-relative
    - Target address is specified relative to PC contents
    - Relocatable code
- Example: MIPS
  - Absolute address
      j target
  - PC-relative
      b target
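How the two kinds of target address are formed can be sketched as follows. The sketch assumes the classic 32-bit MIPS rules: a branch adds a sign-extended 16-bit word offset to the address of the following instruction, and a jump replaces the low 28 bits of that address with a 26-bit word index. The function names are hypothetical.

```python
# Sketch of MIPS target-address arithmetic under the assumptions stated above.

def branch_target(pc, offset16):
    """PC-relative: (pc + 4) + sign_extend(offset16) * 4."""
    if offset16 & 0x8000:              # negative 16-bit offset
        offset16 -= 0x10000
    return (pc + 4 + (offset16 << 2)) & 0xFFFFFFFF

def jump_target(pc, index26):
    """Pseudo-absolute: top 4 bits of (pc + 4), then index26 * 4."""
    return ((pc + 4) & 0xF0000000) | ((index26 & 0x03FFFFFF) << 2)

print(hex(branch_target(0x00400000, 3)))          # 0x400010
print(hex(branch_target(0x00400010, 0xFFFF)))     # 0x400010
print(hex(jump_target(0x00400000, 0x00100004)))   # 0x400010
```

A PC-relative target does not change when the whole program is moved, which is why the slides call such code relocatable.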
Branches
- Conditional
  - Jump is taken only if the condition is met
- Two types
  - Set-Then-Jump
    - Condition testing is separated from branching
    - Condition code registers are used to convey the condition test result
    - Condition code registers keep a record of the status of the last ALU operation, such as an overflow condition
    - Example: Pentium code
        cmp AX,BX    ; compare AX and BX
        je target    ; jump if equal
  - Test-and-Jump
    - Single instruction performs condition testing and branching
    - Example: MIPS instruction
        beq Rsrc1,Rsrc2,target
      Jumps to target if Rsrc1 = Rsrc2
Delayed branching
- Control is transferred after executing the instruction that follows the branch instruction
  - This instruction slot is called the delay slot
- Improves efficiency
- Highly pipelined RISC processors support it
Procedure calls
- Facilitate modular programming
- Require two pieces of information to return
  - End of procedure
    - Pentium uses the ret instruction
    - MIPS uses the jr instruction
  - Return address
    - In a (special) register: MIPS allows any general-purpose register
    - On the stack: Pentium
[Figure: delay slot]
Parameter Passing
Two basic techniques
- Register-based (e.g., PowerPC, MIPS)
  - Internal registers are used
    - Faster
    - Limit the number of parameters
    - Recursive procedure
- Stack-based (e.g., Pentium)
  - Stack is used
    - More general
2 Operand Types
- Instructions support basic data types
  - Characters
  - Integers
  - Floating-point
- Instruction overload
  - Same instruction for different data types
  - Example: Pentium
      mov AL,address     ; loads an 8-bit value
      mov AX,address     ; loads a 16-bit value
      mov EAX,address    ; loads a 32-bit value
- Separate instructions
  - Instructions specify the operand size
  - Example: MIPS
      lb Rdest,address    ; loads a byte
      lh Rdest,address    ; loads a halfword (16 bits)
      lw Rdest,address    ; loads a word (32 bits)
      ld Rdest,address    ; loads a doubleword (64 bits)
    Similar instructions exist for store
3 Addressing Modes
How the operands are specified
- Operands can be in three places
  - Registers
    - Register addressing mode
  - Part of instruction
    - Constant
    - Immediate addressing mode
    - All processors support these two addressing modes
  - Memory
    - Difference between RISC and CISC
    - CISC supports a large variety of addressing modes
    - RISC follows load/store architecture
4 Instruction Types
Several types of instructions
- Data movement
  - Pentium: mov dest,src
  - Some do not provide direct data movement instructions
  - Indirect data movement
      add Rdest,Rsrc,0    ; Rdest = Rsrc+0
- Arithmetic and Logical
  - Arithmetic
    - Integer and floating-point, signed and unsigned
    - add, subtract, multiply, divide
  - Logical
    - and, or, not, xor
Condition code bits
- S: Sign bit (0 = +, 1 = -)
- Z: Zero bit (0 = nonzero, 1 = zero)
- O: Overflow bit (0 = no overflow, 1 = overflow)
- C: Carry bit (0 = no carry, 1 = carry)
- Example: Pentium
    cmp count,25    ; compare count to 25
                    ; subtract 25 from count
    je target       ; jump if equal
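How a compare sets the four condition code bits can be sketched in Python. The sketch assumes x86-style definitions for a subtraction a - b on fixed-width operands (carry means an unsigned borrow, overflow means the signed result does not fit); the function name compare is hypothetical.

```python
# Sketch: condition codes produced by subtracting b from a without storing the result.

def compare(a, b, bits=32):
    """Return the S, Z, O, C flags for a - b on bits-wide operands."""
    mask = (1 << bits) - 1
    sign = 1 << (bits - 1)
    a &= mask
    b &= mask
    result = (a - b) & mask
    return {
        "S": int(bool(result & sign)),                  # sign of the result
        "Z": int(result == 0),                          # result is zero
        "O": int(bool((a ^ b) & (a ^ result) & sign)),  # signed overflow
        "C": int(a < b),                                # unsigned borrow
    }

equal_flags = compare(25, 25)
less_flags = compare(3, 5)
overflow_flags = compare(0x80000000, 1)

print(equal_flags)  # {'S': 0, 'Z': 1, 'O': 0, 'C': 0}
```

A following "jump if equal" only needs to inspect the Z bit, which is what separates condition testing from branching in the set-then-jump style.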
- Flow control and I/O instructions
  - Branch
  - Procedure call
  - Interrupts
- I/O instructions
  - Memory-mapped I/O
    - Most processors support memory-mapped I/O
    - No separate instructions for I/O
  - Isolated I/O
    - Pentium supports isolated I/O
    - Separate I/O instructions
        in AX,io_port     ; read from an I/O port
        out io_port,AX    ; write to an I/O port
5 Instruction Formats
Two types
- Fixed-length
  - Used by RISC processors
  - 32-bit RISC processors use 32-bit-wide instructions
    - Examples: SPARC, MIPS, PowerPC
- Variable-length
  - Used by CISC processors
  - Memory operands need more bits to specify
Opcode
- Major and exact operation
Examples of Instruction Formats
[Figure: examples of instruction formats]
RISC (Reduced Instruction Set Computer)
versus
CISC (Complex Instruction Set Computer)
RISC vs CISC
- The underlying philosophy of RISC machines is that a system is better able to manage program execution when the program consists of only a few different instructions that are the same length and require the same number of clock cycles to decode and execute
- RISC systems access memory only with explicit load and store instructions
- In CISC systems, many different kinds of instructions access memory, making instruction length variable and fetch-decode-execute time unpredictable
- The difference between CISC and RISC becomes evident through the basic computer performance equation:
    time/program = time/cycle × cycles/instruction × instructions/program
- RISC systems shorten execution time by reducing the clock cycles per instruction
- CISC systems improve performance by reducing the number of instructions per program
- The simple instruction set of RISC machines enables control units to be hardwired for maximum speed
- The more complex, and variable, instruction set of CISC machines requires microcode-based control units that interpret instructions as they are fetched from memory. This translation takes time
- With fixed-length instructions, RISC lends itself to pipelining and speculative execution
- Consider the program fragments:
    CISC:
        mov ax, 10
        mov bx, 5
        mul bx, ax
    RISC:
        mov ax, 0
        mov bx, 10
        mov cx, 5
        Begin: add ax, bx
               loop Begin
- The total clock cycles for the CISC version might be:
    (2 movs × 1 cycle) + (1 mul × 30 cycles) = 32 cycles
- While the clock cycles for the RISC version are:
    (3 movs × 1 cycle) + (5 adds × 1 cycle) + (5 loops × 1 cycle) = 13 cycles
- With the RISC clock cycle also being shorter, RISC gives us much faster execution speeds
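The cycle comparison can be reproduced with a few lines of Python. The instruction counts and per-instruction costs below are assumptions chosen for illustration (a 30-cycle multiply on the CISC side, and a five-iteration add/loop pair on the RISC side); the clock periods in the last lines are made-up values that only show how the performance equation combines the terms.

```python
# Sketch: time/program = time/cycle * cycles/instruction * instructions/program.

def total_cycles(instruction_mix):
    """Sum count * cycles over (count, cycles_per_instruction) pairs."""
    return sum(count * cycles for count, cycles in instruction_mix)

def execution_time(instruction_mix, cycle_time_ns):
    """Total cycles multiplied by the length of one clock cycle."""
    return total_cycles(instruction_mix) * cycle_time_ns

CISC_MIX = [(2, 1), (1, 30)]          # 2 movs, 1 multiply
RISC_MIX = [(3, 1), (5, 1), (5, 1)]   # 3 movs, 5 adds, 5 loop instructions

print(total_cycles(CISC_MIX), total_cycles(RISC_MIX))  # 32 13

# Hypothetical clock periods, in nanoseconds, purely for illustration.
print(execution_time(CISC_MIX, 10), execution_time(RISC_MIX, 5))  # 320 65
```

The RISC fragment executes more instructions, yet needs fewer cycles because each instruction costs a single cycle.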
- Because of their load-store ISAs, RISC architectures require a large number of CPU registers
- These registers provide fast access to data during sequential program execution
- They can also be employed to reduce the overhead typically caused by passing parameters to subprograms
- Instead of pulling parameters off of a stack, the subprogram is directed to use a subset of registers
[Figure: overlapping register windows]
- This is how registers can be overlapped in a RISC system
- The current window pointer (CWP) points to the active register window
- It is becoming increasingly difficult to distinguish RISC architectures from CISC architectures
- Some RISC systems provide more extravagant instruction sets than some CISC systems
- Some systems combine both approaches
- The following two slides summarize the characteristics that traditionally typify the differences between these two architectures
RISC
- Multiple register sets
- Three operands per instruction
- Parameter passing through register windows
- Single-cycle instructions
- Hardwired control
- Highly pipelined
CISC
- Single register set
- One or two register operands per instruction
- Parameter passing through memory
- Multiple-cycle instructions
- Microprogrammed control
- Less pipelined
RISC
- Simple instructions, few in number
- Fixed-length instructions
- Complexity in compiler
- Only LOAD/STORE instructions access memory
- Few addressing modes
CISC
- Many complex instructions
- Variable-length instructions
- Complexity in microcode
- Many instructions can access memory
- Many addressing modes
Summary
Instruction Set Design Issues
Instruction set design issues include:
- Where are operands stored?
  - registers, memory, stack, accumulator
- How many explicit operands are there?
  - 0, 1, 2, or 3
- How is the operand location specified?
  - register, immediate, indirect, ...
- What type and size of operands are supported?
  - byte, int, float, double, string, vector, ...
- What operations are supported?
  - add, sub, mul, move, compare, ...
More About General Purpose Registers
Why do almost all new architectures use GPRs?
- Registers are much faster than memory (even cache)
  - Register values are available immediately
  - When memory isn't ready, the processor must wait ("stall")
- Registers are convenient for variable storage
  - Compiler assigns some variables just to registers
  - More compact code, since small fields specify registers (compared to memory addresses)
[Figure: storage hierarchy with Processor (Registers), Cache, Memory, Disk]
What Operations are Needed
- Arithmetic and Logical
  - Integer arithmetic: ADD, SUB, MULT, DIV, SHIFT
  - Logical operation: AND, OR, NOT
- Data Transfer: copy, load, store
- Control: branch, jump, call, return
- Floating Point: ADDF, MULF, DIVF (same as arithmetic, but usually take bigger operands)
- Decimal: ADDD, CONVERT
- String: move, compare, search
- Graphics: pixel and vertex, compression/decompression operations
Stack Architecture: Pros and Cons
Pros
- Good code density (implicit top of stack)
- Low hardware requirements
- Easy to write a simpler compiler for stack architectures
Cons
- Stack becomes the bottleneck
- Little ability for parallelism or pipelining
- Data is not always at the top of the stack when needed, so additional instructions like TOP and SWAP are needed
- Difficult to write an optimizing compiler for stack architectures
Accumulator Architecture: Pros and Cons
Pros
- Very low hardware requirements
- Easy to design and understand
Cons
- Accumulator becomes the bottleneck
- Little ability for parallelism or pipelining
- High memory traffic
Memory-Memory Architecture: Pros and Cons
Pros
- Requires fewer instructions (especially if 3 operands)
- Easy to write compilers for (especially if 3 operands)
Cons
- Very high memory traffic (especially if 3 operands)
- Variable number of clocks per instruction
- With two operands, more data movements are required
Memory-Register Architecture: Pros and Cons
Pros
- Some data can be accessed without loading first
- Instruction format easy to encode
- Good code density
Cons
- Operands are not equivalent (poor orthogonality)
- Variable number of clocks per instruction
- May limit number of registers
Register-to-Memory Architectures
Memory-to-Memory Architectures
Instruction Formats
Instruction Set Architecture (ISA)
• To command a computer's hardware, you must speak its language
• The words of a machine's language are called instructions, and its vocabulary is called the instruction set
• Once you learn one machine language, it is easy to pick up others:
  • There are few fundamental operations that all computers must provide
  • All designers have the same goal of finding a language that simplifies building the hardware and the compiler while maximizing performance and minimizing cost
• Learning how instructions are represented leads to discovering the secret of computing: the stored-program concept
• The MIPS instruction set is used as a case study
Interface Design
A good interface
• Lasts through many implementations (portability, compatibility)
• Is used in many different ways (generality)
• Provides convenient functionality to higher levels
• Permits an efficient implementation at lower levels
Design decisions must take into account
• Technology
• Machine organization
• Programming languages
• Compiler technology
• Operating systems
[Figure: a single interface with several uses above it and several implementations below it, over time]
Classifying Instruction Set Architectures
Accumulator Architecture
• Common in early stored-program computers, when hardware was so expensive
• Machine has only one register (the accumulator) involved in all math and logical operations
• All operations assume the accumulator as a source operand and a destination for the operation, with the other operand stored in memory
Extended Accumulator Architecture
• Dedicated registers for specific operations, e.g., stack and array index registers, were added
• The 8086 microprocessor is an example of such a special-purpose register architecture
General-Purpose Register Architecture
• MIPS is an example of such an architecture, where registers are not tied to a single role
• This type of instruction set can be further divided into:
  • Register-memory: allows for one operand to be in memory
  • Register-register (load-store): demands all operands to be in registers
Machine | Architecture style | Year
Motorola 6800 | Accumulator | 1974
DEC VAX | Register-memory, memory-memory | 1977
Intel 8086 | Extended accumulator | 1978
Motorola 68000 | Register-memory | 1980
Intel 80386 | Register-memory | 1985
PowerPC | Load-store | 1992
DEC Alpha | Load-store | 1992
Compact Code and Stack Architectures
• When memory is scarce, machines like the Intel 80x86 had variable-length instructions to match varying operand specifications and minimize code size
• Stack machines abandoned registers altogether, arguing that it is hard for compilers to use them efficiently
• Operands are to be pushed on a stack from memory, and the results have to be popped from the stack to memory
• Operations take their operands by default from the top of the stack and insert the results back onto the stack
• Stack machines simplify compilers and lend themselves to a compact instruction encoding, but limit compiler optimization (e.g., in math expressions)
Example: A = B + C
    Push AddressC    # Top=Top+4; Stack[Top]=Memory[AddressC]
    Push AddressB    # Top=Top+4; Stack[Top]=Memory[AddressB]
    add              # Stack[Top-4]=Stack[Top]+Stack[Top-4]; Top=Top-4
    Pop AddressA     # Memory[AddressA]=Stack[Top]; Top=Top-4
• Compact code is important for heralded network computers, where programs must be downloaded over the Internet (e.g., Java-based applications)
Other Types of Architecture
High-Level-Language Architecture
• In the 1960s, systems software was rarely written in high-level languages, and virtually every commercial operating system before Unix was written in assembly
• Some people blamed the code density on the instruction set rather than the programming language
• A machine design philosophy was advocated with the goal of making the hardware more like high-level languages
• The effectiveness of high-level languages, memory size limitations, and the lack of efficient compilers doomed this philosophy to a historical footnote
Reduced Instruction Set Architecture
• With the recent development in compiler technology and expanded memory sizes, fewer programmers are using assembly-level coding
• Instruction set architecture became measurable in the way compilers, rather than programmers, use them
• RISC architecture favors simplifying hardware design over enriching the offered set of instructions, relying on compilers to effectively use them to perform complex operations
• Virtually all new architectures since the 1980s follow the RISC philosophy of fixed instruction lengths, load-store operations, and limited addressing modes
Evolution of Instruction Sets
[Figure: evolution of instruction sets, from top to bottom]
Single Accumulator (EDSAC)
Accumulator + Index Registers (Manchester Mark I, IBM 700 series)
Separation of Programming Model from Implementation
- High-level Language Based
- Concept of a Family
General Purpose Register Machines
- Complex Instruction Sets (VAX, Intel 432)
- Load/Store Architecture (CDC 6600, Cray 1)
RISC (MIPS, SPARC, ...)
2 memoryaddresses
Ma7amp num8erof operands
7amples
gt SPA1C MIPS PoerPC A3PA
Intel gt= Motorola =gtgtgt
A (also has operands format)
A (also has operands format)
Register3Memory Architectures
Eect o the numer o memor operands
M Add
Memory Address
Interpreting Memory Addressing
• The address of a word matches the byte address of one of its 4 bytes.
• The addresses of sequential words differ by 4 (the word size in bytes).
• Word addresses are multiples of 4 (alignment restriction).
• Machines that use the address of the leftmost byte as the word address are called "Big Endian", and those that use the rightmost byte are called "Little Endian".
• Misalignment complicates memory access and causes programs to run slower. (Some machines do not allow misaligned memory access at all.)
• Byte ordering can be a problem when exchanging data among different machines.
• Byte addressing affects array index calculation, which must account for word addressing and the offset within the word.

Object addressed   Aligned at byte offsets   Misaligned at byte offsets
Byte               0, 1, 2, 3, 4, 5, 6, 7    Never
Half word          0, 2, 4, 6                1, 3, 5, 7
Word               0, 4                      1, 2, 3, 5, 6, 7
Double word        0                         1, 2, 3, 4, 5, 6, 7
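The byte-ordering and alignment rules above can be checked with a small sketch. It uses Python's standard `struct` module; the helper names `word_bytes` and `is_aligned` and the sample value are illustrative assumptions, not part of the slides.

```python
import struct

def word_bytes(value, byteorder):
    """Return the 4 bytes of a 32-bit word in memory order (lowest address first)."""
    fmt = ">I" if byteorder == "big" else "<I"
    return struct.pack(fmt, value)

def is_aligned(address, size):
    """An access of `size` bytes is aligned when the address is a multiple of size."""
    return address % size == 0

word = 0x0A0B0C0D  # hypothetical sample value
print(word_bytes(word, "big").hex())               # 0a0b0c0d
print(word_bytes(word, "little").hex())            # 0d0c0b0a
print([a for a in range(8) if is_aligned(a, 4)])   # [0, 4]
```

The last line reproduces the "Word" row of the alignment table: within an 8-byte window, only offsets 0 and 4 are aligned for a 4-byte access.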
Addressing Modes
• Addressing modes refer to how the location of an operand (its effective address) is specified.
• Addressing modes have the ability to:
  - Significantly reduce instruction counts
  - Increase the average CPI
  - Increase the complexity of building a machine
• The VAX machine is used for benchmark data since it supports a wide range of memory addressing modes.
• Well-known addressing modes can be classified based on:
  - the source of the data, into register, immediate, or memory
  - the address calculation, into direct and indirect
• An indexed addressing mode is usually provided to allow efficient implementation of loops and array access.
Example of Addressing Modes

Register
  Example:   Add R4, R3
  Meaning:   Regs[R4] <- Regs[R4] + Regs[R3]
  When used: When a value is in a register
Immediate
  Example:   Add R4, #3
  Meaning:   Regs[R4] <- Regs[R4] + 3
  When used: For constants
Displacement
  Example:   Add R4, 100(R1)
  Meaning:   Regs[R4] <- Regs[R4] + Mem[100 + Regs[R1]]
  When used: Accessing local variables
Register indirect
  Example:   Add R4, (R1)
  Meaning:   Regs[R4] <- Regs[R4] + Mem[Regs[R1]]
  When used: Accessing using a pointer or a computed address
Indexed
  Example:   Add R4, (R1 + R2)
  Meaning:   Regs[R4] <- Regs[R4] + Mem[Regs[R1] + Regs[R2]]
  When used: Sometimes useful in array addressing: R1 = base of the array, R2 = index amount
Direct or absolute
  Example:   Add R4, (1001)
  Meaning:   Regs[R4] <- Regs[R4] + Mem[1001]
  When used: Sometimes useful for accessing static data; the address constant may need to be large
Memory indirect or memory deferred
  Example:   Add R4, @(R3)
  Meaning:   Regs[R4] <- Regs[R4] + Mem[Mem[Regs[R3]]]
  When used: If R3 is the address of the pointer p, then the mode yields *p
Autoincrement
  Example:   Add R4, (R2)+
  Meaning:   Regs[R4] <- Regs[R4] + Mem[Regs[R2]]
             Regs[R2] <- Regs[R2] + d
  When used: Useful for stepping through arrays within a loop. R2 points to the start of the array; each reference increments R2 by d
Autodecrement
  Example:   Add R4, -(R2)
  Meaning:   Regs[R2] <- Regs[R2] - d
             Regs[R4] <- Regs[R4] + Mem[Regs[R2]]
  When used: Same use as autoincrement. Autodecrement/increment can also act as push/pop to implement a stack
Scaled
  Example:   Add R4, 100(R2)[R3]
  Meaning:   Regs[R4] <- Regs[R4] + Mem[100 + Regs[R2] + Regs[R3] * d]
  When used: Used to index arrays
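To make the "Meaning" column concrete, here is a minimal sketch that fetches the source operand for several of the modes listed. The function name `fetch_operand`, the dictionary-based register file and memory, and all sample values are assumptions made for illustration.

```python
def fetch_operand(mode, regs, mem, r=None, r2=None, disp=0, imm=0, d=4):
    """Return the source operand selected by an addressing mode.

    regs maps register names to values, mem maps addresses to values,
    and d is the operand size in bytes used by the scaled mode.
    """
    if mode == "register":
        return regs[r]
    if mode == "immediate":
        return imm
    if mode == "displacement":
        return mem[disp + regs[r]]
    if mode == "register_indirect":
        return mem[regs[r]]
    if mode == "indexed":
        return mem[regs[r] + regs[r2]]
    if mode == "direct":
        return mem[disp]
    if mode == "memory_indirect":
        return mem[mem[regs[r]]]
    if mode == "scaled":
        return mem[disp + regs[r] + regs[r2] * d]
    raise ValueError(f"unknown addressing mode: {mode}")

# Hypothetical machine state
regs = {"R1": 100, "R2": 8, "R3": 2}
mem = {7: 99, 100: 7, 108: 11, 116: 21, 200: 13, 1001: 5}

print(fetch_operand("displacement", regs, mem, r="R1", disp=100))        # 13
print(fetch_operand("memory_indirect", regs, mem, r="R1"))               # 99
print(fetch_operand("scaled", regs, mem, r="R2", r2="R3", disp=100))     # 21
```

Autoincrement and autodecrement are left out because they also modify the register, which would need a return value for the updated state.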
Addressing Mode for Signal Processing
• DSP offers special addressing modes to better serve popular algorithms.
• Special features require either hand coding or a compiler that uses such features (GNU would not be a good choice).
Modulo addressing
• Since DSP deals with continuous data streams, circular buffers are widely used.
• Circular or modulo addressing allows automatic increment and decrement, and resets the pointer when it reaches the end of the buffer.
Reverse addressing
• The resulting address is the current address with its bits in reverse order.
• Reverse addressing mode expedites the access, which otherwise requires a number of logical instructions or extra memory accesses.
Fast Fourier Transform (bit-reversed access order)
  0 (000)   0 (000)
  1 (001)   4 (100)
  2 (010)   2 (010)
  3 (011)   6 (110)
  4 (100)   1 (001)
  5 (101)   5 (101)
  6 (110)   3 (011)
  7 (111)   7 (111)
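Both DSP addressing modes can be sketched in a few lines of software, which also shows the work the hardware mode saves. The function names `next_circular` and `bit_reverse` are illustrative assumptions.

```python
def next_circular(index, step, length):
    """Modulo addressing: advance a buffer pointer and wrap at the buffer end."""
    return (index + step) % length

def bit_reverse(index, bits):
    """Reverse addressing: return index with its low `bits` bits in reverse order."""
    result = 0
    for _ in range(bits):
        result = (result << 1) | (index & 1)
        index >>= 1
    return result

print(next_circular(7, 1, 8))                    # 0
print([bit_reverse(i, 3) for i in range(8)])     # [0, 4, 2, 6, 1, 5, 3, 7]
```

The second line of output matches the FFT access order in the table above. Without hardware support, each access costs a loop of shifts and masks like the one in `bit_reverse`.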
Operations of the Computer Hardware
"There must certainly be instructions for performing the fundamental arithmetic operations."
Burks, Goldstine and von Neumann, 1947
• Assembly language is a symbolic representation of what the processor actually understands.
• The MIPS assembler allows only one instruction per line and ignores comments following # until the end of the line.
Example:
Translation of a segment of a C program to MIPS assembly instructions
C: f = (g + h) - (i + j)
MIPS:
    add t0, g, h    # temp variable t0 contains "g + h"
    add t1, i, j    # temp variable t1 contains "i + j"
    sub f, t0, t1   # f = t0 - t1 = (g + h) - (i + j)
Operations in the Instruction Set

Operator type            Examples
Arithmetic and logical   Integer arithmetic and logical operations: add, and, subtract, or
Data transfer            Loads-stores (move instructions on machines with memory addressing)
Control                  Branch, jump, procedure call and return, trap
System                   Operating system call, virtual memory management instructions
Floating point           Floating point instructions: add, multiply
Decimal                  Decimal add, decimal multiply, decimal-to-character conversion
String                   String move, string compare, string search
Graphics                 Pixel operations, compression/decompression operations

• Arithmetic, logical, data transfer, and control are almost standard categories for all machines.
• System instructions are required for multiprogramming environments, although support for system functions varies.
• Decimal and string instructions can be primitives, e.g. on the IBM 360 and the VAX.
• Support for floating point, decimal, string, and graphics can be optional, and is sometimes provided via a co-processor.
• Some machines rely on the compiler to synthesize special operations, such as string handling, from simpler instructions.
Operations for Media & Signal Processing
• Single instruction multiple data (SIMD) and vector instructions are often supported in DSPs, which are commonly used in multimedia and signal processing applications.
• Partitioned add (integer)
  - Performs multiple narrow additions on a 64-bit ALU, since most data are narrow
  - Increases ALU throughput for multimedia applications
• Paired single operations (float)
  - Allow the same register to act as two operands to the same operation
  - Handy in dealing with vertices and coordinates
• Multiply and accumulate
  - Very handy for calculating dot products of vectors (signal processing) and matrix multiplication
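A partitioned add and a multiply-accumulate loop can be sketched as follows. The 16-bit lane width is an assumption chosen for illustration (the slide's lane width did not survive extraction), and the names `partitioned_add16` and `dot_product` are hypothetical.

```python
LANE_BITS = 16          # assumed lane width
LANE_MASK = (1 << LANE_BITS) - 1

def partitioned_add16(a, b):
    """Add four independent 16-bit lanes packed into the 64-bit integers a and b."""
    result = 0
    for lane in range(4):
        shift = LANE_BITS * lane
        lane_sum = ((a >> shift) & LANE_MASK) + ((b >> shift) & LANE_MASK)
        # The carry out of a lane is discarded so it cannot spill into the next lane.
        result |= (lane_sum & LANE_MASK) << shift
    return result

def dot_product(xs, ys):
    """Dot product as a chain of multiply-accumulate steps."""
    acc = 0
    for x, y in zip(xs, ys):
        acc += x * y    # one multiply-accumulate
    return acc

print(hex(partitioned_add16(0x0001_FFFF_0003_0004, 0x0001_0001_0001_0001)))  # 0x2000000040005
print(dot_product([1, 2, 3], [4, 5, 6]))                                     # 32
```

Note how the lane holding 0xFFFF wraps to 0 instead of carrying into its neighbor; that isolation is what distinguishes a partitioned add from an ordinary 64-bit add.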
Frequency of Operations Usage
Make the common case fast by focusing on these operations.
• The most widely executed instructions are the simple operations of an instruction set.
• The following is the average usage in SPECint92 on the Intel 80x86.

Rank    80x86 instruction         Integer average (% of total executed)
1       Load                      22%
2       Conditional branch        20%
3       Compare                   16%
4       Store                     12%
5       Add                       8%
6       And                       6%
7       Sub                       5%
8       Move register-register    4%
9       Call                      1%
10      Return                    1%
Total                             96%
Control Flow Instructions
• Jump: for unconditional change in the control flow
• Branch: for conditional change in the control flow
• Procedure calls and returns
Data is based on SPEC on Alpha.
Destination Address Definition
• Addressing relative to the program counter proved to be the best choice for forward and backward branches or jumps (it is independent of the load address).
• To allow for dynamic loading of library routines, register indirect addressing allows addresses to be loaded in special registers (e.g. virtual functions in C++ and system calls in a case statement).
Data is based on SPEC on Alpha.
Condition Evaluation
• Compare-and-branch can be efficient if the majority of conditions are comparisons with zero.
Remember to focus on the common case.
Based on SPEC on MIPS.
Frequency of Types of Comparison
Data is based on SPEC on Alpha.
Different benchmarks and machines set new design priorities.
• DSPs support a repeat instruction for "for" loops (vectors) using registers.
Supporting Procedures
• Execution of a procedure follows these steps:
  - Store parameters in a place accessible to the procedure
  - Transfer control to the procedure
  - Acquire the storage resources needed for the procedure
  - Perform the desired task
  - Store the result value in a place accessible to the calling program
  - Return control to the point of origin
• The hardware provides a program counter to trace instruction flow and manage transfer of control.
Parameter Passing
• Registers can be used for passing a small number of parameters.
• A stack is used to spill registers of the current context and make room for the called procedure to run, and to allow large parameters to be passed.
• Storage of machine state can be performed by the caller or the callee.
• Handling of shared variables is important to ensure correct semantics and thus requires clear specifications in the library interface.
Global variables stored in registers need careful handling.
Type and Size of Operands
• The type of an operand is designated by encoding it in the instruction's operation code.
• The type of an operand, e.g. single precision float, effectively gives its size.
• Common operand types include character, half word and word size integer, and single- and double-precision floating point.
• Characters are almost always in ASCII, integers are in 2's complement, and floating point is in IEEE 754.
• The 16-bit Unicode used in Java is gaining popularity due to its support for international character sets.
• For business applications, some architectures support a decimal format in binary coded decimal (BCD).
• Depending on the size of the word, the complexity of handling different operand types differs.
• DSP offers fixed point data types to support high precision floating point arithmetic and to allow sharing a single exponent for multiple numbers.
• For graphics applications, vertex and pixel operands are added features.
Size of Operands
• The double-word data type is used for double-precision floating point operations and for address storage in machines with a 64-bit wide address bus.
• Words are used for integer operations and for 32-bit address bus machines.
• Because of the mix in SPEC, word and double-word data types dominate.
Frequency of reference by size, based on SPEC2000 on Alpha.
Instruction Representation
• Humans are taught to think in base 10 (decimal), but numbers may be represented in any base (123 in base 10 = 1111011 in binary, or base 2).
• Numbers are stored in computers as a series of high and low electronic signals (binary numbers).
• Binary digits are called bits and are considered the atom of computing.
• Each piece of an instruction is a number, and placing these numbers together forms the instruction.
• The assembler translates symbolic assembly instructions into machine language instructions (machine code).
• Example:
    Assembly:                 add $t0, $s1, $s2
    M/C language (decimal):   0       17     18     8      0      32
    M/C language (binary):    000000  10001  10010  01000  00000  100000
    Field widths:             6 bits  5 bits 5 bits 5 bits 5 bits 6 bits
Note: the MIPS compiler by default maps $s0, ..., $s7 to registers 16-23 and $t0, ..., $t7 to registers 8-15.
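The packing of the six fields into one 32-bit word can be reproduced with shifts and ORs. This is a sketch: the helper names `encode_r_type` and `encode_add` and the `REGISTERS` table are illustrative, while the field widths and register numbers are the standard MIPS ones stated above.

```python
# $t0..$t7 map to registers 8..15 and $s0..$s7 to registers 16..23.
REGISTERS = {f"$t{i}": 8 + i for i in range(8)}
REGISTERS.update({f"$s{i}": 16 + i for i in range(8)})

def encode_r_type(op, rs, rt, rd, shamt, funct):
    """Pack the fields op(6) rs(5) rt(5) rd(5) shamt(5) funct(6) into 32 bits."""
    return (op << 26) | (rs << 21) | (rt << 16) | (rd << 11) | (shamt << 6) | funct

def encode_add(rd, rs, rt):
    """Encode `add rd, rs, rt` (op = 0, funct = 32)."""
    return encode_r_type(0, REGISTERS[rs], REGISTERS[rt], REGISTERS[rd], 0, 32)

word = encode_add("$t0", "$s1", "$s2")
print(f"{word:032b}")   # 00000010001100100100000000100000
print(hex(word))        # 0x2324020
```

Splitting the printed bit string into groups of 6, 5, 5, 5, 5, and 6 bits gives back the decimal fields 0, 17, 18, 8, 0, 32.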
Encoding an Instruction Set
• Instruction encoding affects the size of the compiled program and the complexity of the CPU implementation.
• The operation is typically specified in one field, called the opcode.
• The addressing mode for the operand can be encoded with the operation, or specified through a separate identifier when a large number of modes is supported.
• The architecture must balance several competing factors:
  - Desire to support as many registers and addressing modes as possible
  - Effect of operand specification on the size of the instruction (program)
  - Desire to simplify instruction fetching and decoding during execution
• Fixed size instruction encoding simplifies the CPU design while limiting the addressing modes supported.
• An architect who cares about code size can use variable size encoding.
• A hybrid approach allows variability by supporting multiple instruction sizes.
Encoding Examples
MIPS Instruction Format
Register-format instructions
    op (6 bits) | rs (5 bits) | rt (5 bits) | rd (5 bits) | shamt (5 bits) | funct (6 bits)
  op:    Basic operation of the instruction, traditionally called the opcode
  rs:    The first register source operand
  rt:    The second register source operand
  rd:    The register destination operand; it gets the result of the operation
  shamt: Shift amount
  funct: This field selects the specific variant of the operation in the op field
Immediate-type instructions
    op (6 bits) | rs (5 bits) | rt (5 bits) | address (16 bits)
• Some instructions need longer fields than those provided, for large constant values.
• The 16-bit address means a load word instruction can load a word within a region of ±2^15 bytes of the address in the base register.
• Example: lw $t0, 32($s3)   # Temporary register $t0 gets A[8]

Instruction   Format   op   rs    rt    rd    shamt   funct   address
add           R        0    reg   reg   reg   0       32      N/A
sub           R        0    reg   reg   reg   0       34      N/A
lw            I        35   reg   reg   N/A   N/A     N/A     address
sw            I        43   reg   reg   N/A   N/A     N/A     address
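The ±2^15 limit follows directly from storing the offset as a 16-bit two's complement field. A minimal sketch, with illustrative helper names `encode_i_type` and `decode_offset`:

```python
LW = 35  # opcode of load word, from the table above

def encode_i_type(op, rs, rt, offset):
    """Pack op(6) rs(5) rt(5) and a signed 16-bit offset into 32 bits."""
    if not -(1 << 15) <= offset < (1 << 15):
        raise ValueError("offset does not fit in 16 signed bits")
    return (op << 26) | (rs << 21) | (rt << 16) | (offset & 0xFFFF)

def decode_offset(word):
    """Recover the signed offset from the low 16 bits (sign extension)."""
    raw = word & 0xFFFF
    return raw - (1 << 16) if raw & 0x8000 else raw

# lw $t0, 32($s3): base register $s3 is register 19, target $t0 is register 8
word = encode_i_type(LW, rs=19, rt=8, offset=32)
print(hex(word))                                        # 0x8e680020
print(decode_offset(encode_i_type(LW, 19, 8, -4)))      # -4
```

An offset of 32768 is rejected, which is why constants or displacements outside this range need extra instructions to build the address.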
The Stored Program Concept
• Learning how instructions are represented leads to discovering the secret of computing: the stored-program concept.
• Today's computers are built on two key principles:
  - Instructions are represented as numbers
  - Programs can be stored in memory to be read or written just like numbers
• The power of the concept: memory can contain
  - the source code for an editor
  - the compiled m/c code for the editor
  - the text that the compiled program is using
  - the compiler that generated the code
[Figure: a processor connected to a memory holding an accounting program (machine code), an editor program (machine code), a C compiler (machine code), payroll data, book text, and the source code in C for the editor program.]
Compiling if-then-else in MIPS
Assuming the five variables f, g, h, i, and j correspond to the five registers $s0 through $s4, what is the compiled MIPS code for the following C if statement?
    if (i == j) f = g + h; else f = g - h;
[Figure: flowchart. Test i == j; if true, f = g + h; if false (i ≠ j), go to Else: f = g - h; both paths join at Exit.]
MIPS:
          bne $s3, $s4, Else    # go to Else if i ≠ j
          add $s0, $s1, $s2     # f = g + h (skipped if i ≠ j)
          j   Exit              # go to Exit
    Else: sub $s0, $s1, $s2     # f = g - h (skipped if i = j)
    Exit:
Typical Compilation
Major Types of Optimization
(Frequency is given where the slide reports it; N.M. = not measured.)

High-level: at or near the source level, machine independent
  • Procedure integration: replace a procedure call by the procedure body. Frequency: N.M.
Local: within straight-line code
  • Common subexpression elimination: replace two instances of the same computation by a single copy. Frequency: 18%
  • Constant propagation: replace all instances of a variable that is assigned a constant with the constant.
  • Stack height reduction: rearrange the expression tree to minimize the resources needed for expression evaluation. Frequency: N.M.
Global: across a branch
  • Global common subexpression elimination: same as local, but this version crosses branches.
  • Copy propagation: replace all instances of a variable A that has been assigned X (i.e. A = X) with X.
  • Code motion: remove code from a loop that computes the same value in each iteration of the loop.
  • Induction variable elimination: simplify or eliminate array addressing calculations within loops.
Machine-dependent: depends on machine knowledge
  • Strength reduction: many examples, such as replacing a multiply by a constant with adds and shifts. Frequency: N.M.
  • Pipeline scheduling: reorder instructions to improve pipeline performance. Frequency: N.M.
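As an illustration of strength reduction, the sketch below replaces a multiply by a constant with shifts and adds, one add per set bit of the constant. The function name `multiply_by_constant` is hypothetical, and a real compiler would emit the shift/add sequence at compile time rather than loop at run time.

```python
def multiply_by_constant(x, constant):
    """Compute x * constant for a non-negative constant using only shifts and adds."""
    result = 0
    shift = 0
    while constant:
        if constant & 1:
            result += x << shift   # add x * 2**shift for each set bit
        constant >>= 1
        shift += 1
    return result

# 10 = 0b1010, so x * 10 becomes (x << 1) + (x << 3)
print(multiply_by_constant(7, 10))   # 70
```

Whether this is a win depends on the machine, which is why the slide classes it as machine-dependent: it pays off only when shifts and adds are cheaper than the multiplier.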
Effect of Compiler Optimization
Measurements taken on MIPS.
[Figure: results by program and compiler optimization level.]
Level 0: non-optimized code
Level 1: local optimization
Level 2: global optimization, s/w pipelining
Level 3: adds procedure integration
Compiler Support for Multimedia Instr.
• Intel's MMX and PowerPC AltiVec have small vector processing capabilities targeting multimedia applications (to speed up graphics).
• Intel added a new set of instructions called Streaming SIMD Extension.
• A major advantage of vector computers is hiding the latency of memory access by loading multiple elements and then overlapping execution with data transfer.
• Vector computers typically have strided and/or gather/scatter addressing to perform operations on distant memory locations.
  - Strided addressing allows memory access in increments larger than one.
  - Gather/scatter addressing is similar to register indirect mode, where the addresses are stored instead of the data.
• Supporting vector operations without strided addressing, as in Intel's MMX, limits the potential speedup.
• Such limited support for vector processing makes vectorizing compiler optimization unpopular and restricts its scope to hand-coded routines.
SIMD instructions on MMX and AltiVec tend to be solutions, not primitives.
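The two vector addressing styles can be sketched over a flat Python list standing in for memory. The function names `strided_load`, `gather`, and `scatter`, and the sample memory contents, are assumptions for illustration.

```python
def strided_load(memory, base, stride, count):
    """Load `count` elements starting at `base`, stepping `stride` locations each time."""
    return [memory[base + i * stride] for i in range(count)]

def gather(memory, addresses):
    """Load the elements found at an explicit vector of addresses."""
    return [memory[a] for a in addresses]

def scatter(memory, addresses, values):
    """Store each value at the matching address from an address vector."""
    for a, v in zip(addresses, values):
        memory[a] = v

memory = list(range(100, 116))               # hypothetical memory contents
print(strided_load(memory, 1, 4, 3))         # [101, 105, 109]
print(gather(memory, [3, 0, 7]))             # [103, 100, 107]
```

A stride of 4 is what reading one column of a 4-column row-major matrix looks like, which is the kind of access MMX-style contiguous loads cannot express directly.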
Starting a Program
[Figure: C program, then Compiler, then Assembly language program, then Assembler, then Object: machine language module (together with Object: library routine in machine language), then Linker, then Executable: machine language program, then Loader, then Memory.]
Linker
  - Places code and data modules symbolically in memory
  - Determines the addresses of data and instruction labels
  - Patches both internal and external references
Object files for Unix typically contain:
  Header: size and position of components
  Text segment: machine code
  Data segment: static and dynamic variables
  Relocation info: identifies absolute memory references
  Symbol table: names and locations of labels, procedures, and variables
  Debugging info: mapping of source to object code, break points, etc.
Loading Executable Program
[Figure: MIPS memory layout, from high to low addresses]
    $sp -> 7fff ffff hex    Stack
                            Dynamic data
    $gp -> 1000 8000 hex    Static data (starting at 1000 0000 hex)
    pc  -> 0040 0000 hex    Text
           0                Reserved
To load an executable, the operating system follows these steps:
  - Reads the executable file header to determine the size of the text and data segments
  - Creates an address space large enough for the text and data
  - Copies the instructions and data from the executable file into memory
  - Copies the parameters (if any) to the main program onto the stack
  - Initializes the machine registers and sets the stack pointer to the first free location
  - Jumps to a start-up routine that copies the parameters into the argument registers and calls the main routine of the program
Instruction Set Design Issues
• Number of Addresses
• Flow of Control
• Operand Types
• Addressing Modes
• Instruction Types
• Instruction Formats
Number of Addresses
• Four categories
  - 3-address machines
    * 2 for the source operands and one for the result
  - 2-address machines
    * One address doubles as source and result
  - 1-address machines
    * Accumulator machines
    * Accumulator is used for one source and the result
  - 0-address machines
    * Stack machines
    * Operands are taken from the stack
    * Result goes onto the stack
Number of Addresses (cont.)
• Three-address machines
  - Two for the source operands, one for the result
  - RISC processors use three addresses
  - Sample instructions
      add  dest,src1,src2    ; M(dest)=[src1]+[src2]
      sub  dest,src1,src2    ; M(dest)=[src1]-[src2]
      mult dest,src1,src2    ; M(dest)=[src1]*[src2]
Three addresses: Operand 1, Operand 2, Result
Example: a = b + c
Three-address instruction formats are not common because they require a relatively long instruction format to hold the three address references.
Number of Addresses (cont.)
• Example
  - C statement
      A = B + C * D - E + F + A
  - Equivalent code:
      mult T,C,D    ; T = C*D
      add  T,T,B    ; T = B+C*D
      sub  T,T,E    ; T = B+C*D-E
      add  T,T,F    ; T = B+C*D-E+F
      add  A,T,A    ; A = B+C*D-E+F+A
Number of Addresses (cont.)
• Two-address machines
  - One address doubles (for source operand and result)
  - The last example makes a case for it
    * Address T is used twice
  - Sample instructions
      load dest,src    ; M(dest)=[src]
      add  dest,src    ; M(dest)=[dest]+[src]
      sub  dest,src    ; M(dest)=[dest]-[src]
      mult dest,src    ; M(dest)=[dest]*[src]
Two addresses: one address doubles as operand and result
Example: a = a + b
The two-address format reduces the space requirement but also introduces some awkwardness. To avoid altering the value of an operand, a MOVE instruction is used to move one of the values to a result or temporary location before performing the operation.
Number of Addresses (cont.)
• Example
  - C statement
      A = B + C * D - E + F + A
  - Equivalent code:
      load T,C    ; T = C
      mult T,D    ; T = C*D
      add  T,B    ; T = B+C*D
      sub  T,E    ; T = B+C*D-E
      add  T,F    ; T = B+C*D-E+F
      add  A,T    ; A = B+C*D-E+F+A
Number of Addresses (cont.)
• One-address machines
  - Use a special set of registers called accumulators
    * Specify one source operand and receive the result
  - Called accumulator machines
  - Sample instructions
      load  addr    ; accum = [addr]
      store addr    ; M[addr] = accum
      add   addr    ; accum = accum + [addr]
      sub   addr    ; accum = accum - [addr]
      mult  addr    ; accum = accum * [addr]
Number of Addresses (cont.)
• Example
  - C statement
      A = B + C * D - E + F + A
  - Equivalent code:
      load  C    ; load C into accum
      mult  D    ; accum = C*D
      add   B    ; accum = C*D+B
      sub   E    ; accum = B+C*D-E
      add   F    ; accum = B+C*D-E+F
      add   A    ; accum = B+C*D-E+F+A
      store A    ; store accum contents in A
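The one-address sequence can be traced with a tiny interpreter. This is a sketch: `run_accumulator`, the tuple-based program encoding, and the sample memory values are all illustrative assumptions.

```python
def run_accumulator(program, memory):
    """Execute (opcode, address) pairs on a single-accumulator machine."""
    accum = 0
    for op, addr in program:
        if op == "load":
            accum = memory[addr]
        elif op == "store":
            memory[addr] = accum
        elif op == "add":
            accum += memory[addr]
        elif op == "sub":
            accum -= memory[addr]
        elif op == "mult":
            accum *= memory[addr]
        else:
            raise ValueError(f"unknown opcode: {op}")
    return memory

program = [("load", "C"), ("mult", "D"), ("add", "B"), ("sub", "E"),
           ("add", "F"), ("add", "A"), ("store", "A")]
memory = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}   # hypothetical values
print(run_accumulator(program, memory)["A"])   # 16
```

With these values B + C*D - E + F + A = 2 + 12 - 5 + 6 + 1 = 16, matching the stored result. Each instruction names only one address because the accumulator is the implicit second operand and destination.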
Number of Addresses (cont.)
• Zero-address machines
  - Stack supplies operands and receives the result
    * Special instructions to load and store use an address
  - Called stack machines (Ex: HP3000, Burroughs B5500)
  - Sample instructions
      push addr    ; push([addr])
      pop  addr    ; pop([addr])
      add          ; push(pop + pop)
      sub          ; push(pop - pop)
      mult         ; push(pop * pop)
Number of Addresses (cont.)
• Example
  - C statement
      A = B + C * D - E + F + A
  - Equivalent code:
      push E        sub
      push C        push F
      push D        add
      mult          push A
      push B        add
      add           pop  A
    (read the left column first, then the right column)
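The stack sequence is easier to follow when executed. Below is a minimal sketch of a zero-address machine; `run_stack`, the tuple-based program encoding, and the sample values are illustrative assumptions. It follows the slide's definition that `sub` computes push(pop - pop), so the first value popped (the top of the stack) is the left operand.

```python
def run_stack(program, memory):
    """Execute a zero-address program; only push and pop carry an address."""
    binary_ops = {
        "add": lambda first, second: first + second,
        "sub": lambda first, second: first - second,
        "mult": lambda first, second: first * second,
    }
    stack = []
    for instr in program:
        op = instr[0]
        if op == "push":
            stack.append(memory[instr[1]])
        elif op == "pop":
            memory[instr[1]] = stack.pop()
        elif op in binary_ops:
            first = stack.pop()     # top of stack
            second = stack.pop()    # element below it
            stack.append(binary_ops[op](first, second))
        else:
            raise ValueError(f"unknown opcode: {op}")
    return memory

program = [("push", "E"), ("push", "C"), ("push", "D"), ("mult",),
           ("push", "B"), ("add",), ("sub",), ("push", "F"), ("add",),
           ("push", "A"), ("add",), ("pop", "A")]
memory = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}   # hypothetical values
print(run_stack(program, memory)["A"])   # 16
```

This also explains why E is pushed first: when `sub` runs, B + C*D is on top and E is below it, so push(pop - pop) yields (B + C*D) - E.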
Load/Store Architecture
• Instructions expect operands in internal processor registers
  - Special LOAD and STORE instructions move data between registers and memory
• RISC uses this architecture
• Reduces instruction length
Load/Store Architecture (cont.)
• Sample instructions
      load  Rd,addr      ; Rd = [addr]
      store addr,Rs      ; (addr) = Rs
      add   Rd,Rs1,Rs2   ; Rd = Rs1 + Rs2
      sub   Rd,Rs1,Rs2   ; Rd = Rs1 - Rs2
      mult  Rd,Rs1,Rs2   ; Rd = Rs1 * Rs2
Number of Addresses (cont.)
• Example
  - C statement
      A = B + C * D - E + F + A
  - Equivalent code:
      load R1,B        mult  R2,R2,R3
      load R2,C        add   R2,R2,R1
      load R3,D        sub   R2,R2,R4
      load R4,E        add   R2,R2,R5
      load R5,F        add   R2,R2,R6
      load R6,A        store A,R2
    (read the left column first, then the right column)
Flow of Control
• Default is sequential flow
• Several instructions alter this default execution
  - Branches
    * Unconditional
    * Conditional
    * Delayed branches
  - Procedure calls
    * Delayed procedure calls
Flow of Control (cont.)
• Branches
  - Unconditional
    * Absolute address
    * PC-relative
      Target address is specified relative to PC contents
      Relocatable code
  - Example: MIPS
    * Absolute address
        j target
    * PC-relative
        b target
Flow of Control (cont.)
• Branches
  - Conditional
    * Jump is taken only if the condition is met
  - Two types
    * Set-Then-Jump
      Condition testing is separated from branching
      Condition code registers are used to convey the condition test result
      Condition code registers keep a record of the status of the last ALU operation, such as the overflow condition
    * Example: Pentium code
        cmp AX,BX    ; compare AX and BX
        je  target   ; jump if equal
Flow of Control (cont.)
    * Test-and-Jump
      Single instruction performs condition testing and branching
    * Example: MIPS instruction
        beq Rsrc1,Rsrc2,target
      Jumps to target if Rsrc1 = Rsrc2
• Delayed branching
  - Control is transferred after executing the instruction that follows the branch instruction
    * This instruction slot is called the delay slot
  - Improves efficiency
  - Highly pipelined RISC processors support it
Flow of Control (cont.)
• Procedure calls
  - Facilitate modular programming
  - Require two pieces of information to return
    * End of procedure
      Pentium uses the ret instruction
      MIPS uses the jr instruction
    * Return address
      In a (special) register: MIPS allows any general-purpose register
      On the stack: Pentium
Flow of Control (cont.)
[Figure: delayed execution, showing the delay slot.]
Parameter Passing
• Two basic techniques
  - Register-based (e.g., PowerPC, MIPS)
    * Internal registers are used
      Faster
      Limit the number of parameters
      Recursive procedure
  - Stack-based (e.g., Pentium)
    * Stack is used
      More general
2. Operand Types
• Instructions support basic data types
  - Characters
  - Integers
  - Floating-point
• Instruction overload
  - Same instruction for different data types
  - Example: Pentium
      mov AL,address     ; loads an 8-bit value
      mov AX,address     ; loads a 16-bit value
      mov EAX,address    ; loads a 32-bit value
Operand Types
• Separate instructions
  - Instructions specify the operand size
  - Example: MIPS
      lb Rdest,address    ; loads a byte
      lh Rdest,address    ; loads a halfword (16 bits)
      lw Rdest,address    ; loads a word (32 bits)
      ld Rdest,address    ; loads a doubleword (64 bits)
    Similar instructions exist for store
3 Addressing Modes3 Addressin
g Modes
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 77101
How the operands are specified
Operands can be in three places
- Registers
  - Register addressing mode
- Part of instruction
  - Constant
  - Immediate addressing mode
  - All processors support these two addressing modes
- Memory
  - Difference between RISC and CISC
  - CISC supports a large variety of addressing modes
  - RISC follows load/store architecture
4 Instruction Types
Several types of instructions
Data movement
- Pentium: mov dest,src
- Some do not provide direct data movement instructions
- Indirect data movement
  add Rdest,Rsrc,0    ; Rdest = Rsrc+0
Arithmetic and Logical
- Arithmetic
  - Integer and floating-point, signed and unsigned
  - add, subtract, multiply, divide
- Logical
  - and, or, not, xor
Instruction Types (cont.)
Condition code bits
- S: Sign bit (0 = +, 1 = -)
- Z: Zero bit (0 = nonzero, 1 = zero)
- O: Overflow bit (0 = no overflow, 1 = overflow)
- C: Carry bit (0 = no carry, 1 = carry)
Example: Pentium
  cmp count,25    ; compare count to 25
                  ; (subtract 25 from count)
  je target       ; jump if equal
Instruction Types (cont.)
Flow control and I/O instructions
- Branch
- Procedure call
- Interrupts
I/O instructions
- Memory-mapped I/O
  - Most processors support memory-mapped I/O
  - No separate instructions for I/O
- Isolated I/O
  - Pentium supports isolated I/O
  - Separate I/O instructions
    in AX,io_port     ; read from an I/O port
    out io_port,AX    ; write to an I/O port
5 Instruction Formats
Two types
Fixed-length
- Used by RISC processors
- 32-bit RISC processors use 32-bit wide instructions
  - Examples: SPARC, MIPS, PowerPC
Variable-length
- Used by CISC processors
- Memory operands need more bits to specify
Opcode
- Major and exact operation
Examples of Instruction Formats
RISC (Reduced Instruction Set Computer)
versus
CISC (Complex Instruction Set Computer)
RISC vs. CISC
The underlying philosophy of RISC machines is that a system is better able to manage program execution when the program consists of only a few different instructions that are the same length and require the same number of clock cycles to decode and execute.
RISC systems access memory only with explicit load and store instructions.
In CISC systems, many different kinds of instructions access memory, making instruction length variable and fetch-decode-execute time unpredictable.
RISC vs. CISC
The difference between CISC and RISC becomes evident through the basic computer performance equation:
  time/program = (time/cycle) x (cycles/instruction) x (instructions/program)
RISC systems shorten execution time by reducing the clock cycles per instruction.
CISC systems improve performance by reducing the number of instructions per program.
RISC vs. CISC
The simple instruction set of RISC machines enables control units to be hardwired for maximum speed.
The more complex, and variable, instruction set of CISC machines requires microcode-based control units that interpret instructions as they are fetched from memory. This translation takes time.
With fixed-length instructions, RISC lends itself to pipelining and speculative execution.
RISC vs. CISC
Consider the two program fragments:
CISC:
  mov ax, 10
  mov bx, 5
  mul bx, ax
RISC:
  mov ax, 0
  mov bx, 10
  mov cx, 5
  Begin: add ax, bx
  loop Begin
The total clock cycles for the CISC version might be:
  (2 movs x 1 cycle) + (1 mul x 30 cycles) = 32 cycles
While the clock cycles for the RISC version are:
  (3 movs x 1 cycle) + (5 adds x 1 cycle) + (5 loops x 1 cycle) = 13 cycles
With the RISC clock cycle being shorter, RISC gives us much faster execution speeds.
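The cycle totals come from multiplying each instruction's execution count by its cost in clock cycles and summing. A minimal Python sketch of that arithmetic follows, assuming the counts of this example (2 movs and one 30-cycle mul for CISC; 3 movs, 5 adds and 5 loops at one cycle each for RISC). The function name and data layout are illustrative only.

```python
def total_cycles(instruction_mix):
    """Sum count * cycles over a list of (times executed, cycles each) pairs."""
    return sum(count * cycles for count, cycles in instruction_mix)

# (times executed, clock cycles each), as assumed in the lead-in above.
cisc_mix = [(2, 1), (1, 30)]           # 2 movs, 1 mul
risc_mix = [(3, 1), (5, 1), (5, 1)]    # 3 movs, 5 adds, 5 loops

cisc_cycles = total_cycles(cisc_mix)
risc_cycles = total_cycles(risc_mix)
```

The RISC fragment executes more instructions but fewer cycles in total, which is the trade-off the performance equation describes.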
RISC vs. CISC
Because of their load-store ISAs, RISC architectures require a large number of CPU registers.
These registers provide fast access to data during sequential program execution.
They can also be employed to reduce the overhead typically caused by passing parameters to subprograms.
Instead of pulling parameters off of a stack, the subprogram is directed to use a subset of registers.
RISC vs. CISC
This is how registers can be overlapped in a RISC system.
The current window pointer (CWP) points to the active register window.
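Overlapped register windows can be illustrated with a toy model in which the caller's "out" registers are physically the same cells as the callee's "in" registers. The sketch below is an assumption-laden illustration: the window count, the group size of 8, and the class and method names are all hypothetical, and window overflow (spilling to memory) is not modelled.

```python
class RegisterWindows:
    """Toy overlapped register file: each window has ins, locals and outs."""

    def __init__(self, n_windows=4, group=8):
        self.group = group
        self.n_windows = n_windows
        # Each window adds 2 * group new registers; its ins are shared
        # with the previous window's outs.
        self.regs = [0] * (n_windows * 2 * group)
        self.cwp = 0  # current window pointer

    def _index(self, kind, i):
        if not 0 <= i < self.group:
            raise IndexError("register number out of range")
        offset = {"in": 0, "local": self.group, "out": 2 * self.group}[kind]
        return (self.cwp * 2 * self.group + offset + i) % len(self.regs)

    def read(self, kind, i):
        return self.regs[self._index(kind, i)]

    def write(self, kind, i, value):
        self.regs[self._index(kind, i)] = value

    def call(self):
        # Advancing the CWP makes the caller's outs the callee's ins.
        self.cwp = (self.cwp + 1) % self.n_windows

    def ret(self):
        self.cwp = (self.cwp - 1) % self.n_windows


windows = RegisterWindows()
windows.write("out", 0, 42)      # caller places an argument
windows.call()
argument_seen_by_callee = windows.read("in", 0)
windows.write("in", 1, 7)        # callee leaves a result
windows.ret()
result_seen_by_caller = windows.read("out", 1)
```

No data is copied on call or return; only the CWP moves, which is why parameter passing through register windows is cheap.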
RISC vs. CISC
It is becoming increasingly difficult to distinguish RISC architectures from CISC architectures.
Some RISC systems provide more extravagant instruction sets than some CISC systems.
Some systems combine both approaches.
The following two slides summarize the characteristics that traditionally typify the differences between these two architectures.
RISC vs. CISC
RISC
- Multiple register sets.
- Three operands per instruction.
- Parameter passing through register windows.
- Single-cycle instructions.
- Hardwired control.
- Highly pipelined.
CISC
- Single register set.
- One or two register operands per instruction.
- Parameter passing through memory.
- Multiple cycle instructions.
- Microprogrammed control.
- Less pipelined.
Continued....
RISC vs. CISC
RISC
- Simple instructions, few in number.
- Fixed length instructions.
- Complexity in compiler.
- Only LOAD/STORE instructions access memory.
- Few addressing modes.
CISC
- Many complex instructions.
- Variable length instructions.
- Complexity in microcode.
- Many instructions can access memory.
- Many addressing modes.
RISC vs. CISC
Summary
Instruction Set Design Issues
Instruction set design issues include:
Where are operands stored?
- registers, memory, stack, accumulator
How many explicit operands are there?
- 0, 1, 2, or 3
How is the operand location specified?
- register, immediate, indirect, ...
What type & size of operands are supported?
- byte, int, float, double, string, vector, ...
What operations are supported?
- add, sub, mul, move, compare, ...
More About General Purpose Registers
Why do almost all new architectures use GPRs?
Registers are much faster than memory (even cache)
- Register values are available immediately
- When memory isn't ready, processor must wait ("stall")
Registers are convenient for variable storage
- Compiler assigns some variables just to registers
- More compact code, since small fields specify registers (compared to memory addresses)
[Figure: memory hierarchy: Processor (Registers), Cache, Memory, Disk]
What Operations are Needed
Arithmetic + Logical
- Integer arithmetic: ADD, SUB, MULT, DIV, SHIFT
- Logical operation: AND, OR, XOR, NOT
Data Transfer - copy, load, store
Control - branch, jump, call, return
Floating Point - ADDF, MULF, DIVF
- Same as arithmetic but usually take bigger operands
Decimal - ADDD, CONVERT
String - move, compare, search
Graphics - pixel and vertex, compression/decompression operations
Stack Architecture Pros and Cons
Pros
- Good code density (implicit top of stack)
- Low hardware requirements
- Easy to write a simpler compiler for stack architectures
Cons
- Stack becomes the bottleneck
- Little ability for parallelism or pipelining
- Data is not always at the top of stack when needed, so additional instructions like TOP and SWAP are needed
- Difficult to write an optimizing compiler for stack architectures
Accumulator Architecture Pros and Cons
Pros
- Very low hardware requirements
- Easy to design and understand
Cons
- Accumulator becomes the bottleneck
- Little ability for parallelism or pipelining
- High memory traffic
Memory-Memory Architecture Pros and Cons
Pros
- Requires fewer instructions (especially if 3 operands)
- Easy to write compilers for (especially if 3 operands)
Cons
- Very high memory traffic (especially if 3 operands)
- Variable number of clocks per instruction
- With two operands, more data movements are required
Memory-Register Architecture Pros and Cons
Pros
- Some data can be accessed without loading first
- Instruction format easy to encode
- Good code density
Cons
- Operands are not equivalent (poor orthogonality)
- Variable number of clocks per instruction
- May limit number of registers
Memory-to-Memory Architectures
Instruction Formats
Instruction Set Architecture (ISA)
To command a computer's hardware, you must speak its language.
The words of a machine's language are called instructions, and its vocabulary is called an instruction set.
Once you learn one machine language, it is easy to pick up others:
- There are few fundamental operations that all computers must provide
- All designers have the same goal of finding a language that simplifies building the hardware and the compiler while maximizing performance and minimizing cost
Learning how instructions are represented leads to discovering the secret of computing: the stored-program concept.
The MIPS instruction set is used as a case study.
Interface Design
A good interface
- Lasts through many implementations (portability, compatibility)
- Is used in many different ways (generality)
- Provides convenient functionality to higher levels
- Permits an efficient implementation at lower levels
Design decisions must take into account:
- Technology
- Machine organization
- Programming languages
- Compiler technology
- Operating systems
[Figure: one interface, with several uses above it and several implementations (imp) below it, over time]
Classifying Instruction Set Architectures
Accumulator Architecture
• Common in early stored-program computers, when hardware was so expensive
• Machine has only one register (accumulator) involved in all math & logical operations
• All operations assume the accumulator as a source operand and a destination for the operation, with the other operand stored in memory
Extended Accumulator Architecture
• Dedicated registers for specific operations, e.g., stack and array index registers, added
• The 8086 microprocessor is an example of such a special-purpose register arch.
General-Purpose Register Architecture
• MIPS is an example of such an arch., where registers are not bound to play a single role
• This type of instruction set can be further divided into:
  • Register-memory: allows for one operand to be in memory
  • Register-register (load-store): demands all operands to be in registers
Machine | # general-purpose registers | Architecture style | Year
Motorola 6800 | 2 | Accumulator | 1974
DEC VAX | 16 | Register-memory, memory-memory | 1977
Intel 8086 | 1 | Extended accumulator | 1978
Motorola 68000 | 16 | Register-memory | 1980
Intel 80386 | 8 | Register-memory | 1985
PowerPC | 32 | Load-store | 1992
DEC Alpha | 32 | Load-store | 1992
Compact Code and Stack Architectures
When memory is scarce, machines like the Intel 80x86 had variable-length instructions to match varying operand specifications and minimize code size.
Stack machines abandoned registers altogether, arguing that it is hard for compilers to use them efficiently.
Operands are to be pushed on a stack from memory, and the results have to be popped from the stack to memory.
Operations take their operands by default from the top of the stack and insert the results back onto the stack.
Stack machines simplify compilers and lend themselves to a compact instruction encoding, but limit compiler optimization (e.g., in math expressions).
Example: A = B + C
  Push AddressC   # Top=Top+4; Stack[Top]=Memory[AddressC]
  Push AddressB   # Top=Top+4; Stack[Top]=Memory[AddressB]
  add             # Stack[Top-4]=Stack[Top]+Stack[Top-4]; Top=Top-4
  Pop AddressA    # Memory[AddressA]=Stack[Top]; Top=Top-4
Compact code is important for heralded network computers, where programs must be downloaded over the Internet (e.g., Java-based applications).
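The push/add/pop sequence for A = B + C can be traced with a very small interpreter. This is a sketch under stated assumptions: memory is modelled as a Python dict keyed by symbolic addresses, the stack as a list, and the function name and instruction tuples are hypothetical.

```python
def run_stack_program(program, memory):
    """Execute a list of (opcode, operand...) tuples on a toy stack machine."""
    stack = []
    for opcode, *operands in program:
        if opcode == "push":          # Stack[Top] = Memory[address]
            stack.append(memory[operands[0]])
        elif opcode == "pop":         # Memory[address] = Stack[Top]
            memory[operands[0]] = stack.pop()
        elif opcode == "add":         # replace the top two entries by their sum
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)
        else:
            raise ValueError(f"unknown opcode: {opcode}")
    return memory

# A = B + C, with assumed example values for B and C.
memory = {"A": 0, "B": 2, "C": 3}
program = [("push", "C"), ("push", "B"), ("add",), ("pop", "A")]
run_stack_program(program, memory)
```

Only push and pop name a memory address; add has no operand fields at all, which is where the compact encoding comes from.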
Other types of Architecture
High-Level-Language Architecture
• In the 1960s, systems software was rarely written in high-level languages, and virtually every commercial operating system before Unix was written in assembly
• Some people blamed the code density on the instruction set rather than the programming language
• A machine design philosophy was advocated with the goal of making the hardware more like high-level languages
• The effectiveness of high-level languages, memory size limitation and lack of efficient compilers doomed this philosophy to a historical footnote
Reduced Instruction Set Architecture
• With the recent development in compiler technology and expanded memory sizes, fewer programmers are using assembly-level coding
• Instruction set architecture became measurable in the way compilers, rather than programmers, use them
• RISC architecture favors simplifying hardware design over enriching the offered set of instructions, relying on compilers to effectively use them to perform complex operations
• Virtually all new architectures since 1982 follow the RISC philosophy of fixed instruction lengths, load-store operations, and limited addressing modes
Evolution of Instruction Sets
Single Accumulator (EDSAC 1950)
Accumulator + Index Registers (Manchester Mark I, IBM 700 series 1953)
Separation of Programming Model from Implementation
- High-level Language Based (B5000 1963)
- Concept of a Family (IBM 360 1964)
General Purpose Register Machines
- Complex Instruction Sets (Vax, Intel 432 1977-80)
- Load/Store Architecture (CDC 6600, Cray 1 1963-76)
RISC (Mips, SPARC, IBM RS6000, ... 1987)
Register-Memory Architectures
Effect of the number of memory operands
# memory addresses | Max. number of operands | Examples
0 | 3 | SPARC, MIPS, PowerPC, ALPHA
1 | 2 | Intel 80x86, Motorola 68000
2 | 2 | VAX (also has 3 operands format)
3 | 3 | VAX (also has 2 operands format)
Memory Address
Interpreting Memory Addressing
The address of a word matches the byte address of one of its 4 bytes.
The addresses of sequential words differ by 4 (word size in bytes).
Words' addresses are multiples of 4 (alignment restriction).
Machines that use the address of the leftmost byte as the word address are called "Big Endian", and those that use the rightmost byte are called "Little Endian".
Misalignment complicates memory access and causes programs to run slower (some machines do not allow misaligned memory access at all).
Byte ordering can be a problem when exchanging data among different machines.
Byte addresses affect array index calculation, to account for word addressing and the offset within the word.
Object addressed | Aligned at byte offsets | Misaligned at byte offsets
Byte | 0,1,2,3,4,5,6,7 | Never
Half word | 0,2,4,6 | 1,3,5,7
Word | 0,4 | 1,2,3,5,6,7
Double word | 0 | 1,2,3,4,5,6,7
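Byte ordering and the alignment rule can both be checked in a few lines. The sketch below uses Python's standard `struct` module; the example word value and the helper name `is_aligned` are assumptions chosen for illustration.

```python
import struct

def is_aligned(address, size):
    """An object of `size` bytes is aligned when its address is a multiple of size."""
    return address % size == 0

word = 0x0A0B0C0D  # arbitrary example value

# The same 32-bit word laid out in memory under each byte order.
big_endian_bytes = struct.pack(">I", word)      # most significant byte first
little_endian_bytes = struct.pack("<I", word)   # least significant byte first

word_at_4 = is_aligned(4, 4)        # word at byte offset 4
word_at_6 = is_aligned(6, 4)        # word at byte offset 6
half_word_at_6 = is_aligned(6, 2)   # half word at byte offset 6
```

Two machines that exchange the raw four bytes without agreeing on an order would each reconstruct a different integer, which is the data-exchange problem mentioned above.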
Addressing Modes
Addressing modes refer to how to specify the location of an operand (effective address).
Addressing modes have the ability to:
- Significantly reduce instruction counts
- Increase the average CPI
- Increase the complexity of building a machine
The VAX machine is used for benchmark data since it supports a wide range of memory addressing modes.
Famous addressing modes can be classified based on:
- the source of the data, into register, immediate or memory
- the address calculation, into direct and indirect
An indexed addressing mode is usually provided to allow efficient implementation of loops and array access.
Example of Addressing Modes
Addressing mode | Example | Meaning | When used
Register | Add R4, R3 | Regs[R4] ← Regs[R4] + Regs[R3] | When a value is in a register
Immediate | Add R4, #3 | Regs[R4] ← Regs[R4] + 3 | For constants
Displacement | Add R4, 100(R1) | Regs[R4] ← Regs[R4] + Mem[100 + Regs[R1]] | Accessing local variables
Register indirect | Add R4, (R1) | Regs[R4] ← Regs[R4] + Mem[Regs[R1]] | Accessing using a pointer or a computed address
Indexed | Add R4, (R1 + R2) | Regs[R4] ← Regs[R4] + Mem[Regs[R1] + Regs[R2]] | Sometimes useful in array addressing: R1 = base of the array, R2 = index amount
Direct or absolute | Add R4, (1001) | Regs[R4] ← Regs[R4] + Mem[1001] | Sometimes useful for accessing static data; address constant may need to be large
Memory indirect or memory deferred | Add R4, @(R3) | Regs[R4] ← Regs[R4] + Mem[Mem[Regs[R3]]] | If R3 is the address of the pointer p, then mode yields *p
Autoincrement | Add R4, (R2)+ | Regs[R4] ← Regs[R4] + Mem[Regs[R2]]; Regs[R2] ← Regs[R2] + d | Useful for stepping through arrays within a loop. R2 points to start of the array; each reference increments R2 by d
Autodecrement | Add R4, -(R2) | Regs[R2] ← Regs[R2] - d; Regs[R4] ← Regs[R4] + Mem[Regs[R2]] | Same use as autoincrement. Autodecrement/increment can also act as push/pop to implement a stack
Scaled | Add R4, 100(R2)[R3] | Regs[R4] ← Regs[R4] + Mem[100 + Regs[R2] + Regs[R3] × d] | Used to index arrays
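The "Meaning" column amounts to a small effective-address calculation per mode. A minimal Python sketch of a few memory modes follows; the function name, the keyword arguments, and the register and memory contents are all assumptions made for this illustration.

```python
def effective_address(mode, regs, mem, **fields):
    """Return the memory address an operand refers to under the given mode."""
    if mode == "displacement":       # Mem[disp + Regs[base]]
        return fields["disp"] + regs[fields["base"]]
    if mode == "register_indirect":  # Mem[Regs[base]]
        return regs[fields["base"]]
    if mode == "indexed":            # Mem[Regs[base] + Regs[index]]
        return regs[fields["base"]] + regs[fields["index"]]
    if mode == "direct":             # Mem[addr]
        return fields["addr"]
    if mode == "memory_indirect":    # Mem[Mem[Regs[base]]]
        return mem[regs[fields["base"]]]
    if mode == "scaled":             # Mem[disp + Regs[base] + Regs[index] * d]
        return (fields["disp"] + regs[fields["base"]]
                + regs[fields["index"]] * fields["d"])
    raise ValueError(f"unsupported mode: {mode}")

# Assumed register and memory contents.
regs = {"R1": 200, "R2": 8, "R3": 3}
mem = {200: 1000}

displacement = effective_address("displacement", regs, mem, disp=100, base="R1")
indexed = effective_address("indexed", regs, mem, base="R1", index="R2")
indirect = effective_address("memory_indirect", regs, mem, base="R1")
scaled = effective_address("scaled", regs, mem, disp=100, base="R2", index="R3", d=4)
```

Register and immediate modes are left out because they do not produce a memory address; autoincrement and autodecrement would additionally update the base register.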
Addressing Mode for Signal Processing
DSP offers special addressing modes to better serve popular algorithms.
Modulo addressing
- Since DSP deals with continuous data streams, circular buffers are widely used
- Circular or modulo addressing allows automatic increment and decrement, and resets the pointer when reaching the end of the buffer
Reverse addressing
- Resulting address is the reverse order of the current address
- Reverse addressing mode expedites the access, which otherwise requires a number of logical instructions or extra memory accesses
Fast Fourier Transform
  0 (000) → 0 (000)
  1 (001) → 4 (100)
  2 (010) → 2 (010)
  3 (011) → 6 (110)
  4 (100) → 1 (001)
  5 (101) → 5 (101)
  6 (110) → 3 (011)
  7 (111) → 7 (111)
Special features require either hand coding or a compiler that uses such features (GNU would not be a good choice).
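Both DSP addressing modes reduce to short index computations, which is what the hardware performs for free. The Python sketch below shows them in software; the function names and the buffer parameters are assumptions for illustration.

```python
def bit_reverse(index, bits):
    """Reverse the lowest `bits` bits of index (reverse addressing for the FFT)."""
    result = 0
    for _ in range(bits):
        result = (result << 1) | (index & 1)
        index >>= 1
    return result

def modulo_next(pointer, step, start, length):
    """Advance a pointer inside a circular buffer [start, start + length)."""
    return start + (pointer - start + step) % length

# Access order for an 8-point FFT (3 address bits).
fft_order = [bit_reverse(i, 3) for i in range(8)]

# A hypothetical 8-entry circular buffer starting at address 100.
wrapped_forward = modulo_next(107, 1, 100, 8)    # past the end, back to start
wrapped_backward = modulo_next(100, -1, 100, 8)  # before the start, to the end
```

Without hardware support, each access would cost the loop in `bit_reverse` or the remainder in `modulo_next`, which is the overhead these modes remove.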
Operations of the Computer Hardware
"There must certainly be instructions for performing the fundamental arithmetic operations."
Burks, Goldstine and von Neumann, 1947
Assembly language is a symbolic representation of what the processor actually understands.
The MIPS assembler allows only one instruction per line and ignores comments following # until end of line.
Example:
Translation of a segment of a C program to MIPS assembly instructions
C: f = (g + h) - (i + j)
MIPS:
  add t0, g, h     # temp variable t0 contains "g + h"
  add t1, i, j     # temp variable t1 contains "i + j"
  sub f, t0, t1    # f = t0 - t1 = (g + h) - (i + j)
Operations in the Instruction Set
Operator type | Examples
Arithmetic and logical | Integer arithmetic and logical operations: add, and, subtract, or
Data Transfer | Loads-stores (move instructions on machines with memory addressing)
Control | Branch, jump, procedure call and return, trap
System | Operating system call, virtual memory management instructions
Floating point | Floating point instructions: add, multiply
Decimal | Decimal add, decimal multiply, decimal to character conversion
String | String move, string compare, string search
Graphics | Pixel operations, compression/decompression operations
Arithmetic, logical, data transfer and control are almost standard categories for all machines.
System instructions are required for multi-programming environments, although support for system functions varies.
Decimal and string instructions can be primitives, e.g., IBM 360 and the VAX.
Support for floating point, decimal, string and graphics is sometimes provided optionally via a co-processor.
Some machines rely on the compiler to synthesize special operations, such as string handling, from simpler instructions.
Operations for Media & Signal Processing
Single instruction multiple data (SIMD) and vector instructions are often supported in DSPs, which are commonly used in multimedia and signal processing applications.
Partitioned Add (integer)
- Performs multiple 16-bit additions on a 64-bit ALU, since most data are narrow
- Increases ALU throughput for multimedia applications
Paired single operations (float)
- Allow the same register to act as two operands to the same operation
- Handy in dealing with vertices and coordinates
Multiply and accumulate
- Very handy for calculating dot products of vectors (signal processing) and matrix multiplication
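A partitioned add and a multiply-accumulate loop can both be emulated in software to show what the hardware does in one step. The sketch below assumes four 16-bit lanes in a 64-bit word with wrap-around on overflow; real media extensions often also offer saturating variants. The function names and test values are illustrative only.

```python
def partitioned_add(a, b, lane_bits=16, lanes=4):
    """Add two packed words lane by lane; carries never cross a lane boundary."""
    mask = (1 << lane_bits) - 1
    result = 0
    for lane in range(lanes):
        shift = lane * lane_bits
        lane_sum = (((a >> shift) & mask) + ((b >> shift) & mask)) & mask
        result |= lane_sum << shift
    return result

def multiply_accumulate(xs, ys):
    """Dot product computed as a chain of multiply-accumulate steps."""
    accumulator = 0
    for x, y in zip(xs, ys):
        accumulator += x * y   # one MAC operation per element pair
    return accumulator

packed_sum = partitioned_add(0x0001_FFFF_0003_0004, 0x0001_0001_0001_0001)
dot_product = multiply_accumulate([1, 2, 3], [4, 5, 6])
```

The second lane (0xFFFF + 0x0001) wraps to zero without disturbing its neighbour, which is the property that lets one wide ALU serve several narrow additions.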
Frequency of Operations Usage
The most widely executed instructions are the simple operations of an instruction set.
The following is the average usage in SPECint92 on Intel 80x86:
Rank | 80x86 Instruction | Integer Average (% total executed)
1 | Load | 22%
2 | Conditional branch | 20%
3 | Compare | 16%
4 | Store | 12%
5 | Add | 8%
6 | And | 6%
7 | Sub | 5%
8 | Move register-register | 4%
9 | Call | 1%
10 | Return | 1%
Total | | 96%
Make the common case fast by focusing on these operations.
Control Flow Instructions
Jump for unconditional change in the control flow
Branch for conditional change in the control flow
Procedure calls and returns
Data is based on SPEC on Alpha
Destination Address Definition
Relative addressing w.r.t. the program counter proved to be the best choice for forward and backward branching or jumps (load address independent).
To allow for dynamic loading of library routines, register indirect addressing allows addresses to be loaded in special registers (e.g., virtual functions in C++ and system calls in a case statement).
Data is based on SPEC on Alpha
Condition Evaluation
Compare & branch can be efficient if the majority of conditions are comparisons with zero.
Remember to focus on the common case.
Based on SPEC on MIPS
Frequency of Types of Comparison
Data is based on SPEC on Alpha
Different benchmarks and machines set new design priorities.
DSPs support a repeat instruction for for-loops (vectors) using registers.
Supporting Procedures
Execution of a procedure follows these steps:
- Store parameters in a place accessible to the procedure
- Transfer control to the procedure
- Acquire the storage resources needed for the procedure
- Perform the desired task
- Store the result value in a place accessible to the calling program
- Return control to the point of origin
The hardware provides a program counter to trace instruction flow and manage transfer of control.
Parameter Passing
- Registers can be used for passing a small number of parameters
- A stack is used to spill registers of the current context and make room for the called procedure to run, and to allow for large parameters to be passed
Storage of machine state can be performed by caller or callee.
Handling of shared variables is important to ensure correct semantics, and thus requires clear specifications in the library interface.
Global variables stored in registers need careful handling.
Type and Size of Operands
The type of an operand is designated by encoding it in the instruction's operation code.
The type of an operand, e.g., single precision float, effectively gives its size.
Common operand types include character, half word and word size integer, and single- and double-precision floating point.
Characters are almost always in ASCII, integers are in 2's complement, and floating point is in IEEE 754.
The 16-bit Unicode used in Java is gaining popularity due to its support for the international character sets.
For business applications, some architectures support a decimal format in binary coded decimal (BCD).
Depending on the size of the word, the complexity of handling different operand types differs.
DSP offers fixed point data types to support high precision floating point arithmetic and to allow sharing a single exponent for multiple numbers.
For graphics applications, vertex and pixel operands are added features.
Size of Operands
Frequency of reference by size, based on SPEC2000 on Alpha
Double-word data type is used for double-precision floating point operations and address storage in machines with a 64-bit wide address bus.
Words are used for integer operations and for 32-bit address bus machines.
Because of the mix in SPEC, word and double-word data types dominate.
Instruction Representation
Humans are taught to think in base 10 (decimal), but numbers may be represented in any base (123 in base 10 = 1111011 in binary or base 2).
Numbers are stored in computers as a series of high and low electronic signals (binary numbers).
Binary digits are called bits and are considered the atom of computing.
Each piece of an instruction is a number, and placing these numbers together forms the instruction.
The assembler translates the assembly symbolic instructions into machine language instructions (machine code).
Example:
Assembly: add $t0, $s1, $s2
M/C language (decimal): 0 | 17 | 18 | 8 | 0 | 32
M/C language (binary): 000000 | 10001 | 10010 | 01000 | 00000 | 100000
Field widths: 6 bits | 5 bits | 5 bits | 5 bits | 5 bits | 6 bits
Note: the MIPS compiler by default maps $s0,...,$s7 to reg. 16-23 and $t0,...,$t7 to reg. 8-15.
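Placing the field numbers together is a matter of shifting each field to its position and OR-ing the results. A minimal Python sketch for the R-format example `add $t0, $s1, $s2` follows, assuming the standard MIPS field widths of 6, 5, 5, 5, 5 and 6 bits; the function name is hypothetical.

```python
def encode_r_format(op, rs, rt, rd, shamt, funct):
    """Pack the six R-format fields into one 32-bit instruction word."""
    return (op << 26) | (rs << 21) | (rt << 16) | (rd << 11) | (shamt << 6) | funct

# add $t0, $s1, $s2: rs = $s1 (17), rt = $s2 (18), rd = $t0 (8), funct = 32
add_word = encode_r_format(0, 17, 18, 8, 0, 32)
add_bits = format(add_word, "032b")   # the word as a 32-character bit string
```

Reading `add_bits` in groups of 6, 5, 5, 5, 5 and 6 characters gives back the binary fields of the example.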
Encoding an Instruction Set
Instruction encoding affects the size of the compiled program and the complexity of the CPU implementation.
The operation is typically specified in one field, called the opcode.
The addressing mode for the operand can be encoded with the operation, or specified through a separate identifier in case of a large number of supported modes.
The architecture must balance between several competing factors:
- Desire to support as many registers and addressing modes as possible
- Effect of operand specification on the size of the instruction (program)
- Desire to simplify instruction fetching and decoding during execution
Fixed size instruction encoding simplifies the CPU design while limiting the addressing modes supported.
An architect caring about the code size can use variable size encoding.
A hybrid approach is to allow variability by supporting multiple-sized instructions.
Encoding Examples
MIPS Instruction format
Register-format instructions
  op | rs | rt | rd | shamt | funct
  6 bits | 5 bits | 5 bits | 5 bits | 5 bits | 6 bits
- op: Basic operation of the instruction, traditionally called opcode
- rs: The first register source operand
- rt: The second register source operand
- rd: The register destination operand; it gets the result of the operation
- shamt: Shift amount
- funct: This field selects the specific variant of the operation of the op field
Immediate-type instructions
  op | rs | rt | address
  6 bits | 5 bits | 5 bits | 16 bits
- Some instructions need longer fields than provided for large value constants
- The 16-bit address means a load word instruction can load a word within a region of ±2^15 bytes of the address in the base register
- Example: lw $t0, 32($s3)    # Temporary register $t0 gets A[8]
Instruction | Format | op | rs | rt | rd | shamt | funct | address
add | R | 0 | reg | reg | reg | 0 | 32 | N/A
sub | R | 0 | reg | reg | reg | 0 | 34 | N/A
lw | I | 35 | reg | reg | N/A | N/A | N/A | address
sw | I | 43 | reg | reg | N/A | N/A | N/A | address
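The immediate format can be packed and unpacked the same way as the register format, with the extra step of treating the 16-bit address field as a signed value. The Python sketch below encodes `lw $t0, 32($s3)` (op 35, base register $s3 = 19, target register $t0 = 8); the function names are hypothetical.

```python
def encode_i_format(op, rs, rt, address):
    """Pack an I-format instruction; the address must fit in 16 signed bits."""
    if not -2**15 <= address < 2**15:
        raise ValueError("address field must fit in 16 signed bits")
    return (op << 26) | (rs << 21) | (rt << 16) | (address & 0xFFFF)

def decode_i_format(word):
    """Split a 32-bit word into (op, rs, rt, address), sign-extending the address."""
    address = word & 0xFFFF
    if address & 0x8000:
        address -= 0x10000
    return (word >> 26) & 0x3F, (word >> 21) & 0x1F, (word >> 16) & 0x1F, address

lw_word = encode_i_format(35, 19, 8, 32)
lw_fields = decode_i_format(lw_word)
negative_offset_fields = decode_i_format(encode_i_format(35, 19, 8, -4))
```

The range check mirrors the ±2^15 byte reach of a load relative to its base register.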
The Stored Program Concept
Learning how instructions are represented leads to discovering the secret of computing: the stored-program concept
Today's computers are built on two key principles:
Instructions are represented as numbers
Programs can be stored in memory to be read or written, just like numbers
The power of the concept: memory can contain
the source code for an editor
the compiled machine code for the editor
the text that the compiled program is using
the compiler that generated the code
[Figure: a processor connected to a memory holding an accounting program (machine code), an editor program (machine code), a C compiler (machine code), payroll data, book text, and the C source code for the editor program]
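The two principles can be illustrated with a toy machine in which one list holds both the program and its data. This is only a sketch under invented assumptions: the encoding (opcode * 100 + address) and the four opcodes are hypothetical and do not correspond to MIPS or any real instruction set.

```python
# Toy stored-program machine: instructions and data share one memory.
# The encoding opcode * 100 + address is a made-up convention for this sketch.

HALT, LOAD, ADD, STORE = 0, 1, 2, 3


def run(memory):
    """Fetch, decode, and execute numbers from memory until HALT."""
    acc, pc = 0, 0
    while True:
        opcode, addr = divmod(memory[pc], 100)  # decode: an instruction is just a number
        pc += 1
        if opcode == HALT:
            return memory
        if opcode == LOAD:
            acc = memory[addr]
        elif opcode == ADD:
            acc += memory[addr]
        elif opcode == STORE:
            memory[addr] = acc


# Cells 0-3 hold the program, cells 4-6 hold the data.
memory = [104, 205, 306, 0, 7, 35, 0]
final = run(list(memory))
print(final[6])
```

Because the program is ordinary data, another program (a compiler, a loader) could have written cells 0 to 3 before execution started.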
Compiling if-then-else in MIPS
Assuming the five variables f, g, h, i, and j correspond to the five registers $s0 through $s4, what is the compiled MIPS code for the following C if statement?
if (i == j) f = g + h; else f = g - h;
[Flowchart: test i == j; if i = j, execute f = g + h; if i ≠ j, go to Else and execute f = g - h; both paths meet at Exit]
MIPS:
bne $s3, $s4, Else       # go to Else if i ≠ j
add $s0, $s1, $s2        # f = g + h (skipped if i ≠ j)
j Exit
Else: sub $s0, $s1, $s2  # f = g - h (skipped if i = j)
Exit:
Typical Compilation
Major Types of Optimization
Optimization name: Explanation (Frequency)
High-level (at or near source level; machine-independent)
- Procedure integration: Replace procedure call by procedure body (N.M.)
Local (within straight-line code)
- Common subexpression elimination: Replace two instances of the same computation by a single copy
- Constant propagation: Replace all instances of a variable that is assigned a constant with the constant
- Stack height reduction: Rearrange expression tree to minimize resources needed for expression evaluation (N.M.)
Global (across a branch)
- Global common subexpression elimination: Same as local, but this version crosses branches
- Copy propagation: Replace all instances of a variable A that has been assigned X (i.e., A = X) with X
- Code motion: Remove code from a loop that computes the same value each iteration of the loop
- Induction variable elimination: Simplify/eliminate array addressing calculations within loops
Machine-dependent (depends on machine knowledge)
- Strength reduction: Many examples, such as replacing a multiply by a constant with adds and shifts (N.M.)
- Pipeline scheduling: Reorder instructions to improve pipeline performance (N.M.)
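Strength reduction and common subexpression elimination are easy to show at source level. The sketch below is hand-written to mimic what a compiler might produce; the function names are hypothetical, and a real compiler applies these rewrites to its intermediate code, not to Python.

```python
# Hand-applied illustrations of two optimizations from the list above.

def times_ten(x):
    """Strength reduction: 10*x = 8*x + 2*x, so two shifts and an add replace the multiply."""
    return (x << 3) + (x << 1)


def before_cse(a, b, c):
    """The subexpression a * b is computed twice."""
    return (a * b) + c, (a * b) - c


def after_cse(a, b, c):
    """Common subexpression elimination: compute a * b once and reuse the copy."""
    t = a * b
    return t + c, t - c


print(times_ten(7), after_cse(2, 3, 4))
```

Both rewrites keep the result unchanged; only the number or cost of the operations differs.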
Effect of Compiler Optimization
[Figure: measurements for each program and compiler optimization level]
Level 0: non-optimized code
Level 1: local optimization
Level 2: global optimization, s/w pipelining
Level 3: adds procedure integration
Compiler Support for Multimedia Instr.
Intel's MMX and PowerPC AltiVec have small vector processing capabilities targeting multimedia applications (to speed up graphics)
Intel added a new set of instructions called Streaming SIMD Extension
A major advantage of vector computers is hiding the latency of memory access by loading multiple elements and then overlapping execution with data transfer
Vector computers typically have strided and/or gather/scatter addressing to perform operations on distant memory locations
- Strided addressing allows memory access in increments larger than one
- Gather/scatter addressing is similar to register indirect mode, where the addresses are stored instead of the data
Supporting vector operations without strided addressing, as Intel's MMX does, limits the potential speedup
Such limited support for vector processing makes vectorizing compiler optimizations unpopular and restricts their scope to hand-coded routines
SIMD instructions on MMX and AltiVec tend to be solutions, not primitives
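The two vector addressing styles can be sketched on a flat list standing in for memory. The function names strided_load and gather_load are hypothetical; real vector hardware performs these accesses in a single instruction rather than a loop.

```python
# Sketch of strided and gather addressing over a flat "memory" list.

def strided_load(memory, base, stride, count):
    """Load count elements starting at base, stepping by stride each time."""
    return [memory[base + i * stride] for i in range(count)]


def gather_load(memory, index_vector):
    """Load the elements whose addresses are listed in index_vector."""
    return [memory[addr] for addr in index_vector]


memory = list(range(100, 120))            # address i holds the value 100 + i

column = strided_load(memory, 1, 4, 3)    # e.g. one column of a 4-wide row-major matrix
picked = gather_load(memory, [7, 2, 15])  # arbitrary, non-regular addresses

print(column, picked)
```

Without strided access, walking a matrix column would need one scalar load per element, which is the speedup limitation noted for MMX.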
Starting a Program
[Figure: C program → Compiler → Assembly language program → Assembler → Object: machine language module, plus Object: library routine (machine language) → Linker → Executable: machine language program → Loader → Memory]
Linker
- Place code and data modules symbolically in memory
- Determine the addresses of data and instruction labels
- Patch both internal and external references
Object files for Unix typically contain:
Header: size and position of components
Text segment: machine code
Data segment: static and dynamic variables
Relocation info: identifies absolute memory references
Symbol table: names and locations of labels, procedures, and variables
Debugging info: mapping of source to object code, breakpoints, etc.
Loading Executable Program
[Figure: MIPS memory layout from address 0 upward: Reserved; Text starting at 0040 0000 hex (pc); Static data starting at 1000 0000 hex ($gp at 1000 8000 hex); Dynamic data; Stack growing down from 7fff ffff hex ($sp)]
To load an executable, the operating system follows these steps:
Reads the executable file header to determine the size of the text and data segments
Creates an address space large enough for the text and data
Copies the instructions and data from the executable file into memory
Copies the parameters (if any) to the main program onto the stack
Initializes the machine registers and sets the stack pointer to the first free location
Jumps to a start-up routine that copies the parameters into the argument registers and calls the main routine of the program
Instruction Set Design Issues
Number of Addresses
Flow of Control
Operand Types
Addressing Modes
Instruction Types
Instruction Formats
Number of Addresses
Four categories
3-address machines
- 2 for the source operands and one for the result
2-address machines
- One address doubles as source and result
1-address machines
- Accumulator machines
- Accumulator is used for one source and result
0-address machines
- Stack machines
- Operands are taken from the stack
- Result goes onto the stack
Number of Addresses (cont.)
Three-address machines
Two for the source operands, one for the result
RISC processors use three addresses
Sample instructions
add dest,src1,src2    ; M(dest) = [src1] + [src2]
sub dest,src1,src2    ; M(dest) = [src1] - [src2]
mult dest,src1,src2   ; M(dest) = [src1] * [src2]
Three addresses: Operand 1, Operand 2, Result
Example: a = b + c
Three-address instruction formats are not common because they require a relatively long instruction format to hold the three address references
Number of Addresses (cont.)
Example
C statement
A = B + C * D - E + F + A
Equivalent code:
mult T,C,D   ; T = C*D
add T,T,B    ; T = B + C*D
sub T,T,E    ; T = B + C*D - E
add T,T,F    ; T = B + C*D - E + F
add A,T,A    ; A = B + C*D - E + F + A
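The three-address sequence can be checked with a small interpreter. This is a sketch under stated assumptions: memory is modeled as a dictionary of named locations, the sample values are arbitrary, and run_three_address is a hypothetical helper, not part of any real toolchain.

```python
# Sketch: interpret three-address instructions of the form (op, dest, src1, src2).

OPS = {
    "add": lambda a, b: a + b,
    "sub": lambda a, b: a - b,
    "mult": lambda a, b: a * b,
}


def run_three_address(program, memory):
    """M(dest) = [src1] op [src2] for each instruction, in order."""
    for op, dest, src1, src2 in program:
        memory[dest] = OPS[op](memory[src1], memory[src2])
    return memory


program = [
    ("mult", "T", "C", "D"),  # T = C*D
    ("add", "T", "T", "B"),   # T = B + C*D
    ("sub", "T", "T", "E"),   # T = B + C*D - E
    ("add", "T", "T", "F"),   # T = B + C*D - E + F
    ("add", "A", "T", "A"),   # A = B + C*D - E + F + A
]
values = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6, "T": 0}
result = run_three_address(program, dict(values))
print(result["A"])
```

Five instructions suffice here, but each one carries three addresses, which is the instruction-length cost the slide refers to.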
Number of Addresses (cont.)
Two-address machines
One address doubles (as a source operand and the result)
The last example makes a case for it
- Address T is used twice
Sample instructions
load dest,src   ; M(dest) = [src]
add dest,src    ; M(dest) = [dest] + [src]
sub dest,src    ; M(dest) = [dest] - [src]
mult dest,src   ; M(dest) = [dest] * [src]
Two addresses: one address doubles as operand and result
Example: a = a + b
The two-address format reduces the space requirement but also introduces some awkwardness. To avoid altering the value of an operand, a MOVE instruction is used to move one of the values to a result or temporary location before performing the operation
Number of Addresses (cont.)
Example
C statement
A = B + C * D - E + F + A
Equivalent code:
load T,C   ; T = C
mult T,D   ; T = C*D
add T,B    ; T = B + C*D
sub T,E    ; T = B + C*D - E
add T,F    ; T = B + C*D - E + F
add A,T    ; A = B + C*D - E + F + A
Number of Addresses (cont.)
One-address machines
Use a special set of registers called accumulators
- Specify one source operand and receive the result
Called accumulator machines
Sample instructions
load addr    ; accum = [addr]
store addr   ; M[addr] = accum
add addr     ; accum = accum + [addr]
sub addr     ; accum = accum - [addr]
mult addr    ; accum = accum * [addr]
Number of Addresses (cont.)
Example
C statement
A = B + C * D - E + F + A
Equivalent code:
load C    ; load C into accum
mult D    ; accum = C*D
add B     ; accum = C*D + B
sub E     ; accum = B + C*D - E
add F     ; accum = B + C*D - E + F
add A     ; accum = B + C*D - E + F + A
store A   ; store accum contents in A
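A one-address machine can be sketched the same way, with the accumulator as the implicit second operand and destination. The helper run_accumulator, the dictionary memory model, and the sample values are assumptions made for illustration.

```python
# Sketch: interpret one-address (accumulator) instructions of the form (op, addr).

def run_accumulator(program, memory):
    """Every arithmetic instruction reads and writes the single accumulator."""
    accum = 0
    for op, addr in program:
        if op == "load":
            accum = memory[addr]
        elif op == "store":
            memory[addr] = accum
        elif op == "add":
            accum += memory[addr]
        elif op == "sub":
            accum -= memory[addr]
        elif op == "mult":
            accum *= memory[addr]
    return memory


acc_program = [
    ("load", "C"), ("mult", "D"), ("add", "B"), ("sub", "E"),
    ("add", "F"), ("add", "A"), ("store", "A"),
]
acc_values = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
acc_result = run_accumulator(acc_program, dict(acc_values))
print(acc_result["A"])
```

The program grows to seven instructions, but each carries a single address, so the instructions themselves are shorter.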
Number of Addresses (cont.)
Zero-address machines
Stack supplies operands and receives the result
- Special instructions to load and store use an address
Called stack machines (Ex: HP3000, Burroughs B5500)
Sample instructions
push addr   ; push([addr])
pop addr    ; pop([addr])
add         ; push(pop + pop)
sub         ; push(pop - pop)
mult        ; push(pop * pop)
Number of Addresses (cont.)
Example
C statement
A = B + C * D - E + F + A
Equivalent code:
push E
push C
push D
mult
push B
add
sub
push F
add
push A
add
pop A
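The stack sequence depends on the operand order of sub, so it is worth tracing. The sketch below assumes, as the sample instructions suggest, that sub computes push(pop - pop) with the first pop taken from the top of the stack; run_stack and the sample values are hypothetical.

```python
# Sketch: interpret zero-address (stack) instructions given as text lines.

def run_stack(program, memory):
    """push/pop carry an address; arithmetic takes both operands from the stack."""
    stack = []
    for line in program:
        op, *operand = line.split()
        if op == "push":
            stack.append(memory[operand[0]])
        elif op == "pop":
            memory[operand[0]] = stack.pop()
        else:
            first, second = stack.pop(), stack.pop()  # first = top of stack
            if op == "add":
                stack.append(first + second)
            elif op == "sub":
                stack.append(first - second)           # push(pop - pop)
            elif op == "mult":
                stack.append(first * second)
    return memory


stack_program = [
    "push E", "push C", "push D", "mult", "push B", "add",
    "sub", "push F", "add", "push A", "add", "pop A",
]
stack_values = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
stack_result = run_stack(stack_program, dict(stack_values))
print(stack_result["A"])
```

Pushing E first is what makes sub yield (B + C*D) - E: by the time sub executes, E sits below the partial sum on the stack.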
Load/Store Architecture
Instructions expect operands in internal processor registers
Special LOAD and STORE instructions move data between registers and memory
RISC uses this architecture
Reduces instruction length
Load/Store Architecture (cont.)
Sample instructions
load Rd,addr      ; Rd = [addr]
store addr,Rs     ; (addr) = Rs
add Rd,Rs1,Rs2    ; Rd = Rs1 + Rs2
sub Rd,Rs1,Rs2    ; Rd = Rs1 - Rs2
mult Rd,Rs1,Rs2   ; Rd = Rs1 * Rs2
Number of Addresses (cont.)
Example
C statement
A = B + C * D - E + F + A
Equivalent code:
load R1,B
load R2,C
load R3,D
load R4,E
load R5,F
load R6,A
mult R2,R2,R3
add R2,R2,R1
sub R2,R2,R4
add R2,R2,R5
add R2,R2,R6
store A,R2
Flow of Control
Default is sequential flow
Several instructions alter this default execution
Branches
- Unconditional
- Conditional
- Delayed branches
Procedure calls
- Delayed procedure calls
Flow of Control (cont.)
Branches
Unconditional
- Absolute address
- PC-relative
  - Target address is specified relative to PC contents
  - Relocatable code
Example: MIPS
- Absolute address
j target
- PC-relative
b target
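Why PC-relative branches give relocatable code can be shown with the MIPS target calculation, in which the 16-bit offset field counts words relative to the instruction after the branch. The helper names below are hypothetical, and the sketch ignores the delay slot.

```python
# Sketch: MIPS-style PC-relative branch target computation.

def sign_extend_16(value):
    """Interpret the low 16 bits of value as a two's-complement number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def branch_target(pc, offset_field):
    """Target = (PC + 4) + sign-extended offset * 4, wrapped to 32 bits."""
    return (pc + 4 + (sign_extend_16(offset_field) << 2)) & 0xFFFFFFFF


forward = branch_target(0x00400000, 3)        # three instructions past PC + 4
backward = branch_target(0x00400010, 0xFFFF)  # offset -1: branch to itself

# The same offset field still works if the code is loaded 0x1000 bytes higher.
moved = branch_target(0x00400000 + 0x1000, 3)

print(hex(forward), hex(backward), hex(moved))
```

An absolute jump, by contrast, encodes the target address itself, so moving the code requires the linker or loader to patch the instruction.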
Flow of Control (cont.)
Branches
Conditional
- Jump is taken only if the condition is met
Two types
- Set-Then-Jump
  - Condition testing is separated from branching
  - Condition code registers are used to convey the condition test result
  - Condition code registers keep a record of the status of the last ALU operation, such as the overflow condition
- Example: Pentium code
cmp AX,BX    ; compare AX and BX
je target    ; jump if equal
Flow of Control (cont.)
- Test-and-Jump
  - Single instruction performs condition testing and branching
- Example: MIPS instruction
beq Rsrc1,Rsrc2,target
Jumps to target if Rsrc1 = Rsrc2
Delayed branching
Control is transferred after executing the instruction that follows the branch instruction
- This instruction slot is called the delay slot
Improves efficiency
Highly pipelined RISC processors support delayed branching
Flow of Control (cont.)
Procedure calls
Facilitate modular programming
Require two pieces of information to return
- End of procedure
  - Pentium uses the ret instruction
  - MIPS uses the jr instruction
- Return address
  - In a (special) register: MIPS allows any general-purpose register
  - On the stack: Pentium
Flow of Control (cont.)
[Figure: delay slot]
Parameter Passing
Two basic techniques
Register-based (e.g., PowerPC, MIPS)
- Internal registers are used
  - Faster
  - Limit the number of parameters
  - Recursive procedure
Stack-based (e.g., Pentium)
- Stack is used
  - More general
2 Operand Types
Instructions support basic data types
Characters
Integers
Floating-point
Instruction overload
Same instruction for different data types
Example: Pentium
mov AL,address    ; loads an 8-bit value
mov AX,address    ; loads a 16-bit value
mov EAX,address   ; loads a 32-bit value
Operand Types
Separate instructions
Instructions specify the operand size
Example: MIPS
lb Rdest,address   ; loads a byte
lh Rdest,address   ; loads a halfword (16 bits)
lw Rdest,address   ; loads a word (32 bits)
ld Rdest,address   ; loads a doubleword (64 bits)
Similar instructions for store
3 Addressing Modes
How the operands are specified
Operands can be in three places
- Registers
  - Register addressing mode
- Part of instruction
  - Constant
  - Immediate addressing mode
  - All processors support these two addressing modes
- Memory
  - Difference between RISC and CISC
  - CISC supports a large variety of addressing modes
  - RISC follows load/store architecture
4 Instruction Types
Several types of instructions
Data movement
- Pentium: mov dest,src
- Some do not provide direct data movement instructions
- Indirect data movement
add Rdest,Rsrc,0   ; Rdest = Rsrc+0
Arithmetic and Logical
- Arithmetic
  - Integer and floating-point, signed and unsigned
  - add, subtract, multiply, divide
- Logical
  - and, or, not, xor
Instruction Types (cont.)
Condition code bits
S: Sign bit (0 = +, 1 = -)
Z: Zero bit (0 = nonzero, 1 = zero)
O: Overflow bit (0 = no overflow, 1 = overflow)
C: Carry bit (0 = no carry, 1 = carry)
Example: Pentium
cmp count,25   ; compare count to 25
               ; subtract 25 from count
je target      ; jump if equal
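How a compare sets the four condition code bits can be sketched by performing the subtraction on fixed-width values. This is an illustration under assumptions: the function compare and its 32-bit default are hypothetical, and C is modeled as the borrow out of the subtraction, which is how the Pentium defines the carry flag for cmp.

```python
# Sketch: condition code bits produced by a compare (a - b) on bits-wide values.

def compare(a, b, bits=32):
    """Return the S, Z, O, C bits that subtracting b from a would set."""
    mask = (1 << bits) - 1
    a &= mask
    b &= mask
    result = (a - b) & mask

    def sign(v):
        return v >> (bits - 1)

    return {
        "S": sign(result),                                         # 1 = negative result
        "Z": int(result == 0),                                     # 1 = operands equal
        "O": int(sign(a) != sign(b) and sign(result) != sign(a)),  # signed overflow
        "C": int(a < b),                                           # unsigned borrow
    }


equal = compare(25, 25)          # je would be taken: Z = 1
smaller = compare(3, 5)          # negative result with a borrow
overflow = compare(0x80000000, 1)  # most negative value minus 1 overflows

print(equal, smaller, overflow)
```

A conditional jump such as je then only has to inspect the Z bit; it never sees the operands themselves, which is the separation the set-then-jump style relies on.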
Instruction Types (cont.)
Flow control and I/O instructions
- Branch
- Procedure call
- Interrupts
I/O instructions
- Memory-mapped I/O
  - Most processors support memory-mapped I/O
  - No separate instructions for I/O
- Isolated I/O
  - Pentium supports isolated I/O
  - Separate I/O instructions
in AX,io_port    ; read from an I/O port
out io_port,AX   ; write to an I/O port
5 Instruction Formats
Two types
Fixed-length
- Used by RISC processors
- 32-bit RISC processors use 32-bit-wide instructions
  - Examples: SPARC, MIPS, PowerPC
Variable-length
- Used by CISC processors
- Memory operands need more bits to specify
Opcode
Major and exact operation
Examples of Instruction Formats
RISC (Reduced Instruction Set Computer)
versus
CISC (Complex Instruction Set Computer)
RISC vs. CISC
The underlying philosophy of RISC machines is that a system is better able to manage program execution when the program consists of only a few different instructions that are the same length and require the same number of clock cycles to decode and execute
RISC systems access memory only with explicit load and store instructions
In CISC systems, many different kinds of instructions access memory, making instruction length variable and fetch-decode-execute time unpredictable
RISC vs. CISC
The difference between CISC and RISC becomes evident through the basic computer performance equation:
time/program = (time/cycle) × (cycles/instruction) × (instructions/program)
RISC systems shorten execution time by reducing the clock cycles per instruction
CISC systems improve performance by reducing the number of instructions per program
RISC vs. CISC
The simple instruction set of RISC machines enables control units to be hardwired for maximum speed
The more complex, and variable, instruction set of CISC machines requires microcode-based control units that interpret instructions as they are fetched from memory. This translation takes time
With fixed-length instructions, RISC lends itself to pipelining and speculative execution
RISC vs. CISC
Consider these program fragments:
CISC:
mov ax, 10
mov bx, 5
mul bx, ax
RISC:
mov ax, 0
mov bx, 10
mov cx, 5
Begin: add ax, bx
loop Begin
The total clock cycles for the CISC version might be:
(2 movs × 1 cycle) + (1 mul × 30 cycles) = 32 cycles
While the clock cycles for the RISC version are:
(3 movs × 1 cycle) + (5 adds × 1 cycle) + (5 loops × 1 cycle) = 13 cycles
With the RISC clock cycle also being shorter, RISC gives us much faster execution speeds
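The cycle counts follow from the performance equation once each instruction class is given a cost. The sketch below assumes one cycle for mov, add, and loop, 30 cycles for mul, and a loop body that runs five times; these figures and the helper names are assumptions for illustration, not measurements, and the cycle times passed to execution_time are invented.

```python
# Sketch: total cycles and execution time from (instruction count, cycles each) pairs.

def total_cycles(instruction_mix):
    """Sum count * cycles-per-instruction over every instruction class."""
    return sum(count * cycles for count, cycles in instruction_mix)


def execution_time(instruction_mix, cycle_time_ns):
    """time/program = total cycles * time/cycle."""
    return total_cycles(instruction_mix) * cycle_time_ns


cisc_mix = [(2, 1), (1, 30)]         # 2 movs, 1 multi-cycle mul
risc_mix = [(3, 1), (5, 1), (5, 1)]  # 3 movs, 5 adds, 5 loops

cisc_cycles = total_cycles(cisc_mix)
risc_cycles = total_cycles(risc_mix)

# Hypothetical cycle times: the RISC clock is assumed to be the shorter one.
print(cisc_cycles, risc_cycles, execution_time(cisc_mix, 4), execution_time(risc_mix, 2))
```

The RISC fragment executes more instructions (13 against 3) yet needs fewer cycles, because the equation multiplies instruction count by cycles per instruction.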
RISC vs. CISC
Because of their load-store ISAs, RISC architectures require a large number of CPU registers
These registers provide fast access to data during sequential program execution
They can also be employed to reduce the overhead typically caused by passing parameters to subprograms
Instead of pulling parameters off of a stack, the subprogram is directed to use a subset of registers
RISC vs. CISC
This is how registers can be overlapped in a RISC system
The current window pointer (CWP) points to the active register window
RISC vs. CISC
It is becoming increasingly difficult to distinguish RISC architectures from CISC architectures
Some RISC systems provide more extravagant instruction sets than some CISC systems
Some systems combine both approaches
The following two slides summarize the characteristics that traditionally typify the differences between these two architectures
RISC vs. CISC
RISC
- Multiple register sets
- Three operands per instruction
- Parameter passing through register windows
- Single-cycle instructions
- Hardwired control
- Highly pipelined
CISC
- Single register set
- One or two register operands per instruction
- Parameter passing through memory
- Multiple-cycle instructions
- Microprogrammed control
- Less pipelined
Continued...
RISC vs. CISC
RISC
- Simple instructions, few in number
- Fixed-length instructions
- Complexity in compiler
- Only LOAD/STORE instructions access memory
- Few addressing modes
CISC
- Many complex instructions
- Variable-length instructions
- Complexity in microcode
- Many instructions can access memory
- Many addressing modes
Summary
Instruction Set Design Issues
Instruction set design issues include:
Where are operands stored?
- registers, memory, stack, accumulator
How many explicit operands are there?
- 0, 1, 2, or 3
How is the operand location specified?
- register, immediate, indirect, ...
What type and size of operands are supported?
- byte, int, float, double, string, vector, ...
What operations are supported?
- add, sub, mul, move, compare, ...
More About General Purpose Registers
Why do almost all new architectures use GPRs?
Registers are much faster than memory (even cache)
- Register values are available immediately
- When memory isn't ready, the processor must wait ("stall")
Registers are convenient for variable storage
- Compiler assigns some variables just to registers
- More compact code, since small fields specify registers (compared to memory addresses)
[Figure: memory hierarchy: Processor (Registers), Cache, Memory, Disk]
What Operations are Needed
Arithmetic and Logical
- Integer arithmetic: ADD, SUB, MULT, DIV, SHIFT
- Logical operation: AND, NOT
Data Transfer: copy, load, store
Control: branch, jump, call, return
Floating Point: ADD, MUL, DIV (same as arithmetic but usually take bigger operands)
Decimal: ADDD, CONVERT
String: move, compare, search
Graphics: pixel and vertex, compression/decompression operations
Stacks Architecture: Pros and Cons
Pros
- Good code density (implicit top of stack)
- Low hardware requirements
- Easy to write a simpler compiler for stack architectures
Cons
- Stack becomes the bottleneck
- Little ability for parallelism or pipelining
- Data is not always at the top of the stack when needed, so additional instructions like TOP and SWAP are needed
- Difficult to write an optimizing compiler for stack architectures
Accumulators Architecture: Pros and Cons
Pros
- Very low hardware requirements
- Easy to design and understand
Cons
- Accumulator becomes the bottleneck
- Little ability for parallelism or pipelining
- High memory traffic
Memory-Memory Architecture: Pros and Cons
Pros
- Requires fewer instructions (especially if 3 operands)
- Easy to write compilers for (especially if 3 operands)
Cons
- Very high memory traffic (especially if 3 operands)
- Variable number of clocks per instruction
- With two operands, more data movements are required
Memory-Register Architecture: Pros and Cons
Pros
- Some data can be accessed without loading first
- Instruction format easy to encode
- Good code density
Cons
- Operands are not equivalent (poor orthogonality)
- Variable number of clocks per instruction
- May limit number of registers
Instruction Formats
Instruction Set Architecture (ISA)
To command a computer's hardware, you must speak its language
The words of a machine's language are called instructions, and its vocabulary is called the instruction set
Once you learn one machine language, it is easy to pick up others:
- There are few fundamental operations that all computers must provide
- All designers have the same goal of finding a language that simplifies building the hardware and the compiler while maximizing performance and minimizing cost
Learning how instructions are represented leads to discovering the secret of computing: the stored-program concept
The MIPS instruction set is used as a case study
Interface Design
A good interface
- Lasts through many implementations (portability, compatibility)
- Is used in many different ways (generality)
- Provides convenient functionality to higher levels
- Permits an efficient implementation at lower levels
Design decisions must take into account
- Technology
- Machine organization
- Programming languages
- Compiler technology
- Operating systems
[Figure: one interface with several uses above it and successive implementations (imp 0, imp 1, ...) below it over time]
Classifying Instruction Set Architectures
Accumulator Architecture
- Common in early stored-program computers, when hardware was so expensive
- Machine has only one register (accumulator) involved in all math and logical operations
- All operations assume the accumulator as a source operand and a destination for the operation, with the other operand stored in memory
Extended Accumulator Architecture
- Dedicated registers for specific operations, e.g., stack and array index registers, added
- The 8086 microprocessor is an example of such a special-purpose register architecture
General-Purpose Register Architecture
- MIPS is an example of such an architecture, where registers are not tied to a single role
- This type of instruction set can be further divided into:
  - Register-memory: allows one operand to be in memory
  - Register-register (load-store): demands all operands to be in registers
Machine         # general-purpose registers  Architecture style               Year
Motorola 6800   2                            Accumulator                      1974
DEC VAX         16                           Register-memory, memory-memory   1977
Intel 8086      1                            Extended accumulator             1978
Motorola 68000  16                           Register-memory                  1980
Intel 80386     8                            Register-memory                  1985
PowerPC         32                           Load-store                       1992
DEC Alpha       32                           Load-store                       1992
Compact Code and Stack Architectures
When memory is scarce, machines like the Intel 80x86 had variable-length instructions to match varying operand specifications and minimize code size
Stack machines abandoned registers altogether, arguing that it is hard for compilers to use them efficiently
Operands are to be pushed on a stack from memory, and the results have to be popped from the stack to memory
Operations take their operands by default from the top of the stack and insert the results back onto the stack
Stack machines simplify compilers and lend themselves to a compact instruction encoding, but limit compiler optimization (e.g., in math expressions)
Example: A = B + C
Push AddressC   # Top=Top+4; Stack[Top]=Memory[AddressC]
Push AddressB   # Top=Top+4; Stack[Top]=Memory[AddressB]
add             # Stack[Top-4]=Stack[Top]+Stack[Top-4]; Top=Top-4
Pop AddressA    # Memory[AddressA]=Stack[Top]; Top=Top-4
Compact code is important for heralded network computers, where programs must be downloaded over the Internet (e.g., Java-based applications)
Other Types of Architecture
High-Level-Language Architecture
- In the 1960s, systems software was rarely written in high-level languages, and virtually every commercial operating system before Unix was written in assembly
- Some people blamed the code density on the instruction set rather than the programming language
- A machine design philosophy was advocated with the goal of making the hardware more like high-level languages
- The effectiveness of high-level languages, memory size limitation, and lack of efficient compilers doomed this philosophy to a historical footnote
Reduced Instruction Set Architecture
- With the recent development in compiler technology and expanded memory sizes, fewer programmers are using assembly-level coding
- Instruction set architecture became measurable by the way compilers, rather than programmers, use it
- RISC architecture favors simplifying hardware design over enriching the offered set of instructions, relying on compilers to effectively use them to perform complex operations
- Virtually all new architectures since 1982 follow the RISC philosophy of fixed instruction lengths, load-store operations, and limited addressing modes
Evolution of Instruction Sets
Single Accumulator (EDSAC, 1950)
→ Accumulator + Index Registers (Manchester Mark I, IBM 700 series, 1953)
→ Separation of Programming Model from Implementation
   - High-level Language Based (B5000, 1963)
   - Concept of a Family (IBM 360, 1964)
→ General Purpose Register Machines
   - Complex Instruction Sets (VAX, Intel 432, 1977-80)
   - Load/Store Architecture (CDC 6600, Cray 1, 1963-76)
→ RISC (MIPS, SPARC, HP-PA, IBM RS6000, ... 1987)
Register-Memory Architectures
Effect of the number of memory operands
# memory addresses | Max. number of operands | Examples
0 | 3 | SPARC, MIPS, PowerPC, ALPHA
1 | 2 | Intel 80x86, Motorola 68000
2 | 2 | VAX (also has 3-operand format)
3 | 3 | VAX (also has 2-operand format)
Memory Addressing
Interpreting Memory Addressing
• The address of a word matches the byte address of one of its 4 bytes
• The addresses of sequential words differ by 4 (word size in bytes)
• Word addresses are multiples of 4 (alignment restriction)
• Machines that use the address of the leftmost byte as the word address are called "Big Endian", and those that use the rightmost byte are called "Little Endian"
• Misalignment complicates memory access and causes programs to run slower (some machines do not allow misaligned memory access at all)
• Byte ordering can be a problem when exchanging data among different machines
• Byte addressing affects array index calculation, which must account for word addressing and the offset within the word
Object addressed | Aligned at byte offsets | Misaligned at byte offsets
Byte | 0, 1, 2, 3, 4, 5, 6, 7 | Never
Half word | 0, 2, 4, 6 | 1, 3, 5, 7
Word | 0, 4 | 1, 2, 3, 5, 6, 7
Double word | 0 | 1, 2, 3, 4, 5, 6, 7
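The byte-ordering and alignment rules above can be sketched in a few lines of Python. This is only an illustration: the helper names `word_bytes` and `is_aligned` and the sample value are assumptions, not part of the slides.

```python
import struct


def word_bytes(value, big_endian):
    """Return the 4 bytes of a 32-bit word in memory order (lowest address first)."""
    fmt = ">I" if big_endian else "<I"
    return struct.pack(fmt, value)


def is_aligned(address, size):
    """An object of `size` bytes is aligned when its address is a multiple of `size`."""
    return address % size == 0


word = 0x0A0B0C0D
big = word_bytes(word, big_endian=True)       # most significant byte at the lowest address
little = word_bytes(word, big_endian=False)   # least significant byte at the lowest address

print([hex(b) for b in big])
print([hex(b) for b in little])

# Offsets within an 8-byte block at which a 4-byte word is aligned
print([offset for offset in range(8) if is_aligned(offset, 4)])
```

Two machines with different byte orders that exchange the raw bytes of `word` would each read back a different value, which is why data formats usually fix one byte order.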
Addressing Modes
• Addressing modes refer to how the location of an operand (effective address) is specified
• Addressing modes have the ability to:
  - Significantly reduce instruction counts
  - Increase the average CPI
  - Increase the complexity of building a machine
• The VAX machine is used for benchmark data since it supports a wide range of memory addressing modes
• Common addressing modes can be classified based on:
  - the source of the data, into register, immediate, or memory
  - the address calculation, into direct and indirect
• An indexed addressing mode is usually provided to allow efficient implementation of loops and array access
Example of Addressing Modes
Addressing mode | Example | Meaning | When used
Register | Add R4, R3 | Regs[R4] ← Regs[R4] + Regs[R3] | When a value is in a register
Immediate | Add R4, #3 | Regs[R4] ← Regs[R4] + 3 | For constants
Displacement | Add R4, 100(R1) | Regs[R4] ← Regs[R4] + Mem[100 + Regs[R1]] | Accessing local variables
Register indirect | Add R4, (R1) | Regs[R4] ← Regs[R4] + Mem[Regs[R1]] | Accessing using a pointer or a computed address
Indexed | Add R4, (R1 + R2) | Regs[R4] ← Regs[R4] + Mem[Regs[R1] + Regs[R2]] | Sometimes useful in array addressing: R1 = base of the array, R2 = index amount
Direct or absolute | Add R4, (1001) | Regs[R4] ← Regs[R4] + Mem[1001] | Sometimes useful for accessing static data; the address constant may need to be large
Memory indirect or memory deferred | Add R4, @(R3) | Regs[R4] ← Regs[R4] + Mem[Mem[Regs[R3]]] | If R3 is the address of the pointer p, then the mode yields *p
Autoincrement | Add R4, (R2)+ | Regs[R4] ← Regs[R4] + Mem[Regs[R2]]; Regs[R2] ← Regs[R2] + d | Useful for stepping through arrays within a loop. R2 points to the start of the array; each reference increments R2 by d
Autodecrement | Add R4, -(R2) | Regs[R2] ← Regs[R2] - d; Regs[R4] ← Regs[R4] + Mem[Regs[R2]] | Same use as autoincrement. Autodecrement/increment can also act as push/pop to implement a stack
Scaled | Add R4, 100(R2)[R3] | Regs[R4] ← Regs[R4] + Mem[100 + Regs[R2] + Regs[R3] × d] | Used to index arrays
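As a rough illustration of the "Meaning" column, the sketch below computes the effective address for several of the modes. The function names, the dictionary-based register file and memory, and the sample values are all assumptions made for this example; d is the operand size in bytes.

```python
def effective_address(mode, regs, mem, base=None, index=None, disp=0, d=4):
    """Return the memory address an operand refers to under the given mode."""
    if mode == "displacement":        # disp(base)
        return disp + regs[base]
    if mode == "register_indirect":   # (base)
        return regs[base]
    if mode == "indexed":             # (base + index)
        return regs[base] + regs[index]
    if mode == "direct":              # (disp)
        return disp
    if mode == "memory_indirect":     # @(base): the address is itself read from memory
        return mem[regs[base]]
    if mode == "scaled":              # disp(base)[index]
        return disp + regs[base] + regs[index] * d
    raise ValueError(f"unknown addressing mode: {mode}")


def autoincrement(regs, base, d=4):
    """Return the current address held in `base`, then step the register by d bytes."""
    address = regs[base]
    regs[base] += d
    return address


regs = {"R1": 200, "R2": 300, "R3": 5}
mem = {200: 1000}   # the word at address 200 holds a pointer to address 1000

print(effective_address("displacement", regs, mem, base="R1", disp=100))
print(effective_address("scaled", regs, mem, base="R2", index="R3", disp=100))
print(effective_address("memory_indirect", regs, mem, base="R1"))
```

Register and immediate modes are left out because they never form a memory address.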
Addressing Mode for Signal Processing
• DSP offers special addressing modes to better serve popular algorithms
• Special features require either hand coding or a compiler that uses such features (GNU would not be a good choice)
Modulo addressing
• Since DSP deals with continuous data streams, circular buffers are widely used
• Circular or modulo addressing allows automatic increment and decrement, and resets the pointer when it reaches the end of the buffer
Reverse addressing
• The resulting address is the bit-reversed form of the current address
• Reverse addressing mode expedites an access that would otherwise require a number of logical instructions or extra memory accesses
Fast Fourier Transform (bit-reversed order for 8 points)
0 (000) → 0 (000)
1 (001) → 4 (100)
2 (010) → 2 (010)
3 (011) → 6 (110)
4 (100) → 1 (001)
5 (101) → 5 (101)
6 (110) → 3 (011)
7 (111) → 7 (111)
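Both DSP modes can be mimicked in software, which also shows the extra work the hardware modes save. The sketch below is an assumption-laden illustration: `next_circular` and `bit_reverse` are hypothetical helper names, not DSP instructions.

```python
def next_circular(pointer, start, length, step=1):
    """Advance a pointer inside a circular buffer of `length` entries starting at
    `start`, wrapping around at either end (modulo addressing)."""
    return start + (pointer - start + step) % length


def bit_reverse(index, bits):
    """Reverse the lowest `bits` bits of `index` (bit-reversed addressing)."""
    result = 0
    for _ in range(bits):
        result = (result << 1) | (index & 1)   # move the lowest bit of index into result
        index >>= 1
    return result


# Walking past the end of an 8-entry buffer wraps back to the start
print(next_circular(7, start=0, length=8))

# Access order for an 8-point FFT
print([bit_reverse(i, 3) for i in range(8)])
```

Without hardware support, each access pays for the modulo or the bit-reversal loop, which is the overhead the special addressing modes are meant to remove.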
Operations of the Computer Hardware
"There must certainly be instructions for performing the fundamental arithmetic operations."
Burks, Goldstine and von Neumann, 1947
• Assembly language is a symbolic representation of what the processor actually understands
• The MIPS assembler allows only one instruction per line and ignores comments following # until the end of the line
Example:
Translation of a segment of a C program to MIPS assembly instructions
C: f = (g + h) - (i + j)
MIPS:
add t0, g, h    # temp variable t0 contains "g + h"
add t1, i, j    # temp variable t1 contains "i + j"
sub f, t0, t1   # f = t0 - t1 = (g + h) - (i + j)
Operations in the Instruction Set
Operator type | Examples
Arithmetic and logical | Integer arithmetic and logical operations: add, and, subtract, or
Data transfer | Loads-stores (move instructions on machines with memory addressing)
Control | Branch, jump, procedure call and return, trap
System | Operating system call, virtual memory management instructions
Floating point | Floating point instructions: add, multiply
Decimal | Decimal add, decimal multiply, decimal to character conversion
String | String move, string compare, string search
Graphics | Pixel operations, compression/decompression operations
• Arithmetic, logical, data transfer, and control are almost standard categories for all machines
• System instructions are required for multi-programming environments, although support for system functions varies
• Decimal and string instructions can be primitives, e.g., on the IBM 360 and the VAX
• Support for floating point, decimal, string, and graphics is sometimes provided optionally via a co-processor
• Some machines rely on the compiler to synthesize special operations, such as string handling, from simpler instructions
Operations for Media and Signal Processing
• Single instruction multiple data (SIMD) and vector instructions are often supported in DSPs, which are commonly used in multimedia and signal processing applications
• Partitioned Add (integer)
  - Performs multiple 16-bit additions on a 64-bit ALU, since most data are narrow
  - Increases ALU throughput for multimedia applications
• Paired single operations (float)
  - Allow the same register to act as two operands to the same operation
  - Handy in dealing with vertices and coordinates
• Multiply and accumulate
  - Very handy for calculating dot products of vectors (signal processing) and matrix multiplication
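A minimal software model of the partitioned add and of multiply-accumulate is sketched below. It assumes four 16-bit lanes in a 64-bit word and wrap-around on lane overflow (real media instruction sets often also offer saturating variants); all names are hypothetical.

```python
LANES = 4
LANE_BITS = 16
LANE_MASK = (1 << LANE_BITS) - 1


def pack(values):
    """Pack four 16-bit values into one 64-bit word, lane 0 in the lowest bits."""
    word = 0
    for lane, value in enumerate(values):
        word |= (value & LANE_MASK) << (lane * LANE_BITS)
    return word


def unpack(word):
    """Split a 64-bit word back into its four 16-bit lanes."""
    return [(word >> (lane * LANE_BITS)) & LANE_MASK for lane in range(LANES)]


def partitioned_add(a, b):
    """Add lane by lane; the carry out of each lane is dropped, not propagated."""
    result = 0
    for lane in range(LANES):
        shift = lane * LANE_BITS
        lane_sum = ((a >> shift) & LANE_MASK) + ((b >> shift) & LANE_MASK)
        result |= (lane_sum & LANE_MASK) << shift
    return result


def multiply_accumulate(xs, ys):
    """Dot product written as a chain of multiply-accumulate steps."""
    acc = 0
    for x, y in zip(xs, ys):
        acc += x * y   # one MAC operation per element pair
    return acc


print(unpack(partitioned_add(pack([1, 2, 3, 0xFFFF]), pack([10, 20, 30, 1]))))
print(multiply_accumulate([1, 2, 3], [4, 5, 6]))
```

The key point is that the carry chain is cut at each lane boundary, so one wide ALU behaves like four narrow ones.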
Frequency of Operations Usage
Make the common case fast by focusing on these operations
• The most widely executed instructions are the simple operations of an instruction set
• The following is the average usage in SPECint92 on the Intel 80x86
Rank | 80x86 instruction | Integer average (% total executed)
1 | Load | 22%
2 | Conditional branch | 20%
3 | Compare | 16%
4 | Store | 12%
5 | Add | 8%
6 | And | 6%
7 | Sub | 5%
8 | Move register-register | 4%
9 | Call | 1%
10 | Return | 1%
Total | | 96%
Control Flow Instructions
• Jump: unconditional change in the control flow
• Branch: conditional change in the control flow
• Procedure calls and returns
Data is based on SPEC on Alpha
Destination Address Definition
• Relative addressing with respect to the program counter proved to be the best choice for forward and backward branches or jumps (load address independent)
• To allow for dynamic loading of library routines, register indirect addressing allows addresses to be loaded in special registers (e.g., virtual functions in C++ and system calls in a case statement)
Data is based on SPEC on Alpha
Condition Evaluation
• Compare-and-branch can be efficient if the majority of conditions are comparisons with zero
Remember to focus on the common case
Based on SPEC on MIPS
Frequency of Types of Comparison
Data is based on SPEC on Alpha
Different benchmarks and machines set new design priorities
• DSPs support a repeat instruction for "for" loops (vectors) using registers
Supporting Procedures
• Execution of a procedure follows these steps:
  - Store parameters in a place accessible to the procedure
  - Transfer control to the procedure
  - Acquire the storage resources needed for the procedure
  - Perform the desired task
  - Store the result value in a place accessible to the calling program
  - Return control to the point of origin
• The hardware provides a program counter to trace instruction flow and manage transfer of control
Parameter Passing
• Registers can be used for passing a small number of parameters
• A stack is used to spill registers of the current context and make room for the called procedure to run, and to allow for large parameters to be passed
• Storage of machine state can be performed by the caller or the callee
• Handling of shared variables is important to ensure correct semantics, and thus requires clear specifications in the library interface
Global variables stored in registers need careful handling
Type and Size of Operands
• The type of an operand is designated by encoding it in the instruction's operation code
• The type of an operand, e.g., single-precision float, effectively gives its size
• Common operand types include character, half-word and word-size integer, and single- and double-precision floating point
• Characters are almost always in ASCII, integers are in 2's complement, and floating point is in IEEE 754
• The 16-bit Unicode used in Java is gaining popularity due to its support for international character sets
• For business applications, some architectures support a decimal format in binary coded decimal (BCD)
• Depending on the size of the word, the complexity of handling different operand types differs
• DSP offers fixed-point data types to support high-precision arithmetic at lower cost than floating point, and to allow sharing a single exponent among multiple numbers
• For graphics applications, vertex and pixel operands are added features
Size of Operands
• The double-word data type is used for double-precision floating point operations and for address storage in machines with a 64-bit wide address bus
• Words are used for integer operations and for 32-bit address bus machines
• Because of the mix in SPEC, word and double-word data types dominate
Frequency of reference by size, based on SPEC2000 on Alpha
Instruction Representation
• Humans are taught to think in base 10 (decimal), but numbers may be represented in any base (123 in base 10 = 1111011 in binary, or base 2)
• Numbers are stored in computers as a series of high and low electronic signals (binary numbers)
• Binary digits are called bits and are considered the atom of computing
• Each piece of an instruction is a number, and placing these numbers together forms the instruction
• The assembler translates the symbolic assembly instructions into machine language instructions (machine code)
Example:
Assembly: add $t0, $s1, $s2
M/C language (decimal): 0 | 17 | 18 | 8 | 0 | 32
M/C language (binary): 000000 | 10001 | 10010 | 01000 | 00000 | 100000
Field widths: 6 bits | 5 bits | 5 bits | 5 bits | 5 bits | 6 bits
Note: the MIPS compiler by default maps $s0, ..., $s7 to registers 16-23 and $t0, ..., $t7 to registers 8-15
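The packing of the six fields into one 32-bit word can be sketched as follows. The helper name `encode_r_type` and the small register table are assumptions for illustration; the field widths (6, 5, 5, 5, 5, 6 bits) are those of the MIPS register format.

```python
# Register numbers for the three registers used in the example
REGISTERS = {"$t0": 8, "$s1": 17, "$s2": 18}


def encode_r_type(op, rs, rt, rd, shamt, funct):
    """Pack the six R-format fields into a single 32-bit instruction word."""
    return (op << 26) | (rs << 21) | (rt << 16) | (rd << 11) | (shamt << 6) | funct


# add $t0, $s1, $s2  ->  op=0, rs=$s1, rt=$s2, rd=$t0, shamt=0, funct=32
word = encode_r_type(0, REGISTERS["$s1"], REGISTERS["$s2"], REGISTERS["$t0"], 0, 32)

print(format(word, "032b"))
print(hex(word))
```

Note that the destination register comes first in the assembly text but sits in the fourth field (rd) of the encoded word.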
Encoding an Instruction Set
• Instruction encoding affects the size of the compiled program and the complexity of the CPU implementation
• The operation is typically specified in one field, called the opcode
• The addressing mode for an operand can be encoded with the operation, or specified through a separate identifier when a large number of modes is supported
• The architect must balance several competing factors:
  - Desire to support as many registers and addressing modes as possible
  - Effect of operand specification on the size of the instruction (program)
  - Desire to simplify instruction fetching and decoding during execution
• Fixed-size instruction encoding simplifies the CPU design while limiting the addressing modes supported
• An architect who cares about code size can use variable-size encoding
• A hybrid approach is to allow variability by supporting multiple instruction sizes
Encoding Examples
MIPS Instruction Format
Register-format instructions
op | rs | rt | rd | shamt | funct
6 bits | 5 bits | 5 bits | 5 bits | 5 bits | 6 bits
• op: Basic operation of the instruction, traditionally called the opcode
• rs: The first register source operand
• rt: The second register source operand
• rd: The register destination operand; it gets the result of the operation
• shamt: Shift amount
• funct: Selects the specific variant of the operation in the op field
Immediate-type instructions
op | rs | rt | address
6 bits | 5 bits | 5 bits | 16 bits
• Some instructions need longer fields than the register format provides, e.g., for large constants
• The 16-bit address means a load word instruction can load a word within a region of ±2^15 bytes of the address in the base register
Example: lw $t0, 32($s3)   # Temporary register $t0 gets A[8]
Instruction | Format | op | rs | rt | rd | shamt | funct | address
add | R | 0 | reg | reg | reg | 0 | 32 | N/A
sub | R | 0 | reg | reg | reg | 0 | 34 | N/A
lw | I | 35 | reg | reg | N/A | N/A | N/A | address
sw | I | 43 | reg | reg | N/A | N/A | N/A | address
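The immediate format and its ±2^15 range can be illustrated with a short sketch. The helper names `encode_i_type` and `decode_immediate` are hypothetical; the opcode 35 for lw and the 6/5/5/16-bit layout are those of the MIPS immediate format.

```python
def encode_i_type(op, rs, rt, immediate):
    """Pack an I-format instruction; the immediate must fit in 16 signed bits."""
    if not -(1 << 15) <= immediate < (1 << 15):
        raise ValueError("immediate does not fit in 16 signed bits")
    return (op << 26) | (rs << 21) | (rt << 16) | (immediate & 0xFFFF)


def decode_immediate(word):
    """Recover the sign-extended 16-bit immediate from an instruction word."""
    imm = word & 0xFFFF
    return imm - (1 << 16) if imm & 0x8000 else imm


LW = 35
# lw $t0, 32($s3): base register $s3 is register 19, target $t0 is register 8
word = encode_i_type(LW, rs=19, rt=8, immediate=32)

print(hex(word))
print(decode_immediate(encode_i_type(LW, rs=19, rt=8, immediate=-4)))
```

Offsets outside the 16-bit range cannot be encoded in a single lw, which is why the sketch rejects them instead of silently truncating.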
The Stored Program Concept
• Learning how instructions are represented leads to discovering the secret of computing: the stored-program concept
• Today's computers are built on two key principles:
  - Instructions are represented as numbers
  - Programs can be stored in memory to be read or written just like numbers
The power of the concept:
• Memory can contain:
  - the source code for an editor
  - the compiled m/c code for the editor
  - the text that the compiled program is using
  - the compiler that generated the code
[Figure: a processor connected to a memory holding an accounting program (machine code), an editor program (machine code), a C compiler (machine code), payroll data, book text, and source code in C for the editor program]
Compiling if-then-else in MIPS
Assuming the five variables f, g, h, i, and j correspond to the five registers $s0 through $s4, what is the compiled MIPS code for the following C if statement?
if (i == j) f = g + h; else f = g - h;
[Flowchart: test i == j; if i = j, f = g + h; if i ≠ j (Else:), f = g - h; both paths meet at Exit:]
MIPS:
bne $s3, $s4, Else        # go to Else if i ≠ j
add $s0, $s1, $s2         # f = g + h (skipped if i ≠ j)
j Exit
Else: sub $s0, $s1, $s2   # f = g - h (skipped if i = j)
Exit:
Typical Compilation
Major Types of Optimization
Optimization name | Explanation | Frequency
High-level (at or near source level; machine-independent)
Procedure integration | Replace procedure call by procedure body | N.M.
Local (within straight-line code)
Common subexpression elimination | Replace two instances of the same computation by a single copy | 18%
Constant propagation | Replace all instances of a variable that is assigned a constant with the constant | 22%
Stack height reduction | Rearrange expression tree to minimize resources needed for expression evaluation | N.M.
Global (across a branch)
Global common subexpression elimination | Same as local, but this version crosses branches | 13%
Copy propagation | Replace all instances of a variable A that has been assigned X (i.e., A = X) with X | 11%
Code motion | Remove code from a loop that computes the same value each iteration of the loop | 16%
Induction variable elimination | Simplify/eliminate array-addressing calculations within loops | 2%
Machine-dependent (depends on machine knowledge)
Strength reduction | Many examples, such as replacing multiply by a constant with adds and shifts | N.M.
Pipeline scheduling | Reorder instructions to improve pipeline performance | N.M.
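Two of the optimizations in the table can be shown as before/after pairs. This is a hand-written sketch of what a compiler might do, with hypothetical function names; it is not the output of any particular compiler.

```python
def times_ten_multiply(x):
    """Original form: one multiply by a constant."""
    return x * 10


def times_ten_shift_add(x):
    """After strength reduction: 10*x = 8*x + 2*x, using shifts and one add."""
    return (x << 3) + (x << 1)


def distance_terms_naive(a, b):
    """Original form: the subexpression (a - b) is computed twice."""
    return (a - b) * (a - b) + (a - b)


def distance_terms_cse(a, b):
    """After common subexpression elimination: (a - b) is computed once."""
    diff = a - b
    return diff * diff + diff


print(times_ten_multiply(7), times_ten_shift_add(7))
print(distance_terms_naive(9, 4), distance_terms_cse(9, 4))
```

In both pairs the results are identical; only the number and cost of the operations change.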
Effect of Compiler Optimization
[Figure: measurements taken on MIPS; axis: program and compiler optimization level]
Level 0: non-optimized code
Level 1: local optimization
Level 2: global optimization, s/w pipelining
Level 3: adds procedure integration
Compiler Support for Multimedia Instr.
• Intel's MMX and PowerPC AltiVec have small vector processing capabilities targeting multimedia applications (to speed up graphics)
• Intel added a new set of instructions called Streaming SIMD Extension
• A major advantage of vector computers is hiding the latency of memory access by loading multiple elements and then overlapping execution with data transfer
• Vector computers typically have strided and/or gather/scatter addressing to perform operations on distant memory locations
• Strided addressing allows memory access in increments larger than one
• Gather/scatter addressing is similar to register indirect mode, where the addresses are stored instead of the data
• Supporting vector operations without strided addressing, as in Intel's MMX, limits the potential speedup
• Such limited support for vector processing makes vectorizing compiler optimization unpopular and restricts its scope to hand-coded routines
SIMD instructions on MMX and AltiVec tend to be solutions, not primitives
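The difference between strided and gather/scatter access can be sketched with a flat list standing in for memory. The function names and the sample memory contents are assumptions for illustration only.

```python
def strided_load(memory, base, stride, count):
    """Load `count` elements starting at `base`, stepping `stride` locations each time."""
    return [memory[base + i * stride] for i in range(count)]


def gather(memory, addresses):
    """Load the elements whose addresses are listed in an index vector."""
    return [memory[address] for address in addresses]


def scatter(memory, addresses, values):
    """Store each value at the address given by the matching index-vector entry."""
    for address, value in zip(addresses, values):
        memory[address] = value


memory = list(range(100, 116))   # 16 locations holding 100..115

# One column of a 4x4 row-major matrix: stride equals the row length
print(strided_load(memory, base=1, stride=4, count=4))

# Arbitrary locations collected through an index vector
print(gather(memory, [7, 0, 3]))
```

Without a strided mode, the column access in the example needs one scalar load per element, which is the lost speedup the slide refers to.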
Starting a Program
[Figure: C program → Compiler → Assembly language program → Assembler → Object: machine language module, together with Object: library routine (machine language) → Linker → Executable: machine language program → Loader → Memory]
Linker
• Places code and data modules symbolically in memory
• Determines the addresses of data and instruction labels
• Patches both internal and external references
Object files for Unix typically contain:
• Header: size and position of components
• Text segment: machine code
• Data segment: static and dynamic variables
• Relocation info: identifies absolute memory references
• Symbol table: names and locations of labels, procedures and variables
• Debugging info: mapping of source to object code, break points, etc.
Loading Executable Program
[Figure: MIPS memory layout, from high to low addresses]
$sp → 7fff ffff hex | Stack
                    | Dynamic data
$gp → 1000 8000 hex | Static data (starts at 1000 0000 hex)
pc  → 0040 0000 hex | Text
      0             | Reserved
To load an executable, the operating system follows these steps:
• Reads the executable file header to determine the size of the text and data segments
• Creates an address space large enough for the text and data
• Copies the instructions and data from the executable file into memory
• Copies the parameters (if any) to the main program onto the stack
• Initializes the machine registers and sets the stack pointer to the first free location
• Jumps to a start-up routine that copies the parameters into the argument registers and calls the main routine of the program
Instruction Set Design Issues
Instruction set design issues include:
• Number of Addresses
• Flow of Control
• Operand Types
• Addressing Modes
• Instruction Types
• Instruction Formats
Number of Addresses
Four categories:
• 3-address machines
  - 2 for the source operands and one for the result
• 2-address machines
  - One address doubles as source and result
• 1-address machines
  - Accumulator machines
  - Accumulator is used for one source and the result
• 0-address machines
  - Stack machines
  - Operands are taken from the stack
  - Result goes onto the stack
Number of Addresses (cont.)
• Three-address machines
  - Two for the source operands, one for the result
  - RISC processors use three addresses
• Sample instructions
add dest,src1,src2    ; M(dest)=[src1]+[src2]
sub dest,src1,src2    ; M(dest)=[src1]-[src2]
mult dest,src1,src2   ; M(dest)=[src1]*[src2]
Three addresses: Operand 1, Operand 2, Result
Example: a = b + c
Three-address instruction formats with full memory addresses are not common, because they require a relatively long instruction format to hold the three address references.
Number of Addresses (cont.)
Example
C statement:
A = B + C * D - E + F + A
Equivalent code:
mult T,C,D   ; T = C*D
add T,T,B    ; T = B+C*D
sub T,T,E    ; T = B+C*D-E
add T,T,F    ; T = B+C*D-E+F
add A,T,A    ; A = B+C*D-E+F+A
Number of Addresses (cont.)
• Two-address machines
  - One address doubles (as source operand and result)
  - The last example makes a case for it
    · Address T is used twice
• Sample instructions
load dest,src   ; M(dest)=[src]
add dest,src    ; M(dest)=[dest]+[src]
sub dest,src    ; M(dest)=[dest]-[src]
mult dest,src   ; M(dest)=[dest]*[src]
Two addresses: one address doubles as operand and result
Example: a = a + b
The two-address format reduces the space requirement but also introduces some awkwardness. To avoid altering the value of an operand, a MOVE instruction is used to move one of the values to a result or temporary location before performing the operation.
Number of Addresses (cont.)
Example
C statement:
A = B + C * D - E + F + A
Equivalent code:
load T,C   ; T = C
mult T,D   ; T = C*D
add T,B    ; T = B+C*D
sub T,E    ; T = B+C*D-E
add T,F    ; T = B+C*D-E+F
add A,T    ; A = B+C*D-E+F+A
Number of Addresses (cont.)
• One-address machines
  - Use a special set of registers called accumulators
    · Specify one source operand and receive the result
  - Called accumulator machines
• Sample instructions
load addr    ; accum = [addr]
store addr   ; M[addr] = accum
add addr     ; accum = accum + [addr]
sub addr     ; accum = accum - [addr]
mult addr    ; accum = accum * [addr]
Number of Addresses (cont.)
Example
C statement:
A = B + C * D - E + F + A
Equivalent code:
load C    ; load C into accum
mult D    ; accum = C*D
add B     ; accum = C*D+B
sub E     ; accum = B+C*D-E
add F     ; accum = B+C*D-E+F
add A     ; accum = B+C*D-E+F+A
store A   ; store accum contents in A
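The one-address sequence above can be checked with a tiny interpreter. This is a sketch under stated assumptions: the function `run_accumulator`, the tuple-based program encoding, and the sample values for A through F are all made up for illustration.

```python
def run_accumulator(program, memory):
    """Execute a list of (opcode, address) pairs on a one-accumulator machine."""
    accum = 0
    for opcode, address in program:
        if opcode == "load":
            accum = memory[address]
        elif opcode == "store":
            memory[address] = accum
        elif opcode == "add":
            accum += memory[address]
        elif opcode == "sub":
            accum -= memory[address]
        elif opcode == "mult":
            accum *= memory[address]
        else:
            raise ValueError(f"unknown opcode: {opcode}")
    return memory


# A = B + C * D - E + F + A, using the instruction order from the example
program = [
    ("load", "C"), ("mult", "D"), ("add", "B"), ("sub", "E"),
    ("add", "F"), ("add", "A"), ("store", "A"),
]
memory = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}

print(run_accumulator(program, memory)["A"])
```

Every instruction names exactly one memory address; the second operand and the destination are always the accumulator.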
Number of Addresses (cont.)
• Zero-address machines
  - Stack supplies operands and receives the result
    · Special instructions to load and store use an address
  - Called stack machines (Ex: HP3000, Burroughs B5500)
• Sample instructions
push addr   ; push([addr])
pop addr    ; pop([addr])
add         ; push(pop + pop)
sub         ; push(pop - pop)
mult        ; push(pop * pop)
Number of Addresses (cont.)
Example
C statement:
A = B + C * D - E + F + A
Equivalent code:
push E
push C
push D
mult
push B
add
sub
push F
add
push A
add
pop A
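The zero-address sequence can be traced with a small stack interpreter. As before, this is an illustrative sketch: `run_stack`, the tuple encoding, and the sample values are assumptions. Following the sample instruction definitions, the first value popped is the left operand of sub.

```python
def run_stack(program, memory):
    """Execute zero-address instructions; only push and pop carry an address."""
    stack = []
    for instruction in program:
        opcode = instruction[0]
        if opcode == "push":
            stack.append(memory[instruction[1]])
        elif opcode == "pop":
            memory[instruction[1]] = stack.pop()
        else:
            first = stack.pop()    # top of stack
            second = stack.pop()   # element below the top
            if opcode == "add":
                stack.append(first + second)
            elif opcode == "sub":
                stack.append(first - second)   # push(pop - pop)
            elif opcode == "mult":
                stack.append(first * second)
            else:
                raise ValueError(f"unknown opcode: {opcode}")
    return memory


# A = B + C * D - E + F + A, using the instruction order from the example
program = [
    ("push", "E"), ("push", "C"), ("push", "D"), ("mult",),
    ("push", "B"), ("add",), ("sub",),
    ("push", "F"), ("add",), ("push", "A"), ("add",), ("pop", "A"),
]
memory = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}

print(run_stack(program, memory)["A"])
```

E is pushed first so that it sits below B + C*D when sub executes, which is why the operand order in the program differs from the order in the C statement.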
Load/Store Architecture
• Instructions expect operands in internal processor registers
• Special LOAD and STORE instructions move data between registers and memory
• RISC uses this architecture
• Reduces instruction length
Load/Store Architecture (cont.)
Sample instructions
load Rd,addr      ; Rd = [addr]
store addr,Rs     ; (addr) = Rs
add Rd,Rs1,Rs2    ; Rd = Rs1 + Rs2
sub Rd,Rs1,Rs2    ; Rd = Rs1 - Rs2
mult Rd,Rs1,Rs2   ; Rd = Rs1 * Rs2
Number of Addresses (cont.)
Example
C statement:
A = B + C * D - E + F + A
Equivalent code:
load R1,B
load R2,C
load R3,D
load R4,E
load R5,F
load R6,A
mult R2,R2,R3
add R2,R2,R1
sub R2,R2,R4
add R2,R2,R5
add R2,R2,R6
store A,R2
Flow of Control
• Default is sequential flow
• Several instructions alter this default execution
  - Branches
    · Unconditional
    · Conditional
    · Delayed branches
  - Procedure calls
    · Delayed procedure calls
Flow of Control (cont.)
• Branches
  - Unconditional
    · Absolute address
    · PC-relative: target address is specified relative to PC contents; gives relocatable code
  - Example: MIPS
    · Absolute address
      j target
    · PC-relative
      b target
Flow of Control (cont.)
[Figure: branch examples, including the Pentium; remaining labels not recoverable]
Flow of Control (cont.)
• Branches
  - Conditional
    · Jump is taken only if the condition is met
  - Two types
    · Set-Then-Jump
      Condition testing is separated from branching
      Condition code registers are used to convey the condition test result
      Condition code registers keep a record of the status of the last ALU operation, such as an overflow condition
    · Example: Pentium code
      cmp AX,BX   ; compare AX and BX
      je target   ; jump if equal
Flow of Control (cont.)
    · Test-and-Jump
      Single instruction performs condition testing and branching
    · Example: MIPS instruction
      beq Rsrc1,Rsrc2,target
      Jumps to target if Rsrc1 = Rsrc2
• Delayed branching
  - Control is transferred after executing the instruction that follows the branch instruction
    · This instruction slot is called the delay slot
  - Improves efficiency
  - Highly pipelined RISC processors support it
Flow of Control (cont.)
• Procedure calls
  - Facilitate modular programming
  - Require two pieces of information to return
    · End of procedure
      Pentium uses the ret instruction
      MIPS uses the jr instruction
    · Return address
      In a (special) register: MIPS allows any general-purpose register
      On the stack: Pentium
[Figure: delay slot]
Parameter Passing
Two basic techniques
Register-based (e.g., PowerPC, MIPS)
- Internal registers are used
  • Faster
  • Limit the number of parameters
  • Recursive procedure
Stack-based (e.g., Pentium)
- Stack is used
  • More general
2 Operand Types
Instructions support basic data types
- Characters
- Integers
- Floating-point
Instruction overload
Same instruction for different data types
Example: Pentium
    mov AL,address     ; loads an 8-bit value
    mov AX,address     ; loads a 16-bit value
    mov EAX,address    ; loads a 32-bit value
Operand Types
Separate instructions
Instructions specify the operand size
Example: MIPS
    lb Rdest,address    ; loads a byte
    lh Rdest,address    ; loads a halfword (16 bits)
    lw Rdest,address    ; loads a word (32 bits)
    ld Rdest,address    ; loads a doubleword (64 bits)
Similar instructions for store
3 Addressing Modes
How the operands are specified
Operands can be in three places
- Registers
  • Register addressing mode
- Part of instruction
  • Constant
  • Immediate addressing mode
  • All processors support these two addressing modes
- Memory
  • Difference between RISC and CISC
  • CISC supports a large variety of addressing modes
  • RISC follows a load/store architecture
4 Instruction Types
Several types of instructions
Data movement
- Pentium: mov dest,src
- Some do not provide direct data movement instructions
- Indirect data movement
    add Rdest,Rsrc,0    ; Rdest = Rsrc+0
Arithmetic and Logical
- Arithmetic
  • Integer and floating-point, signed and unsigned
  • add, subtract, multiply, divide
- Logical
  • and, or, not, xor
Instruction Types (cont.)
Condition code bits
  S: Sign bit (0 = +, 1 = -)
  Z: Zero bit (0 = nonzero, 1 = zero)
  O: Overflow bit (0 = no overflow, 1 = overflow)
  C: Carry bit (0 = no carry, 1 = carry)
Example: Pentium
    cmp count,25    ; compare count to 25
                    ; subtract 25 from count
    je target       ; jump if equal
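How a compare sets the four condition code bits can be sketched as follows. This is a simplified model, not the Pentium's exact flag logic: the function names and the 32-bit width are assumptions, and the carry bit is modelled as the borrow out of an unsigned subtraction.

```python
def compare_flags(a, b, bits=32):
    """Return the S, Z, O, C bits produced by computing a - b (the result is discarded)."""
    mask = (1 << bits) - 1
    sign_bit = 1 << (bits - 1)
    a &= mask
    b &= mask
    result = (a - b) & mask
    return {
        "S": int(bool(result & sign_bit)),                  # result is negative
        "Z": int(result == 0),                              # operands were equal
        "O": int(bool((a ^ b) & (a ^ result) & sign_bit)),  # signed overflow
        "C": int(a < b),                                    # unsigned borrow
    }


def jump_if_equal(flags):
    """Set-then-jump: the branch only inspects the previously set Z bit."""
    return flags["Z"] == 1


print(compare_flags(25, 25))                 # equal operands set Z
print(jump_if_equal(compare_flags(3, 25)))   # branch not taken
```

The compare and the branch communicate only through the flags, which is the separation of condition testing from branching described for the set-then-jump style.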
Instruction Types (cont.)
Flow control and I/O instructions
- Branch
- Procedure call
- Interrupts
I/O instructions
- Memory-mapped I/O
  • Most processors support memory-mapped I/O
  • No separate instructions for I/O
- Isolated I/O
  • Pentium supports isolated I/O
  • Separate I/O instructions
      in AX,io_port     ; read from an I/O port
      out io_port,AX    ; write to an I/O port
5 Instruction Formats
Two types
Fixed-length
- Used by RISC processors
- 32-bit RISC processors use 32-bit-wide instructions
  • Examples: SPARC, MIPS, PowerPC
Variable-length
- Used by CISC processors
- Memory operands need more bits to specify
Opcode
Major and exact operation
Examples of Instruction Formats
RISC (Reduced Instruction Set Computer)
versus
CISC (Complex Instruction Set Computer)
RISC vs CISC
The underlying philosophy of RISC machines is that a system is better able to manage program execution when the program consists of only a few different instructions that are the same length and require the same number of clock cycles to decode and execute.
RISC systems access memory only with explicit load and store instructions.
In CISC systems, many different kinds of instructions access memory, making instruction length variable and fetch-decode-execute time unpredictable.
The difference between CISC and RISC becomes evident through the basic computer performance equation:
    time/program = (time/cycle) × (cycles/instruction) × (instructions/program)
RISC systems shorten execution time by reducing the clock cycles per instruction.
CISC systems improve performance by reducing the number of instructions per program.
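The trade-off between the two strategies can be tried out numerically. The sketch below multiplies the three standard factors (instruction count, cycles per instruction, clock cycle time); the function name and all the sample figures are invented for illustration and do not describe any real processor.

```python
def cpu_time_ns(instructions, cycles_per_instruction, cycle_time_ns):
    """Execution time = instructions/program x cycles/instruction x time/cycle."""
    return instructions * cycles_per_instruction * cycle_time_ns


# Hypothetical figures: the CISC-style program is shorter but needs more
# cycles per instruction; the RISC-style program is longer but has a low CPI.
cisc_like = cpu_time_ns(instructions=1_000, cycles_per_instruction=6, cycle_time_ns=2)
risc_like = cpu_time_ns(instructions=3_000, cycles_per_instruction=1, cycle_time_ns=2)
print(cisc_like, risc_like)
```

Neither factor decides the outcome alone: a design wins only if the product of all three terms is smaller.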
The simple instruction set of RISC machines enables control units to be hardwired for maximum speed.
The more complex, and variable, instruction set of CISC machines requires microcode-based control units that interpret instructions as they are fetched from memory. This translation takes time.
With fixed-length instructions, RISC lends itself to pipelining and speculative execution.
Consider the two program fragments:
CISC
    mov ax, 10
    mov bx, 5
    mul bx, ax
RISC
    mov ax, 0
    mov bx, 10
    mov cx, 5
    Begin: add ax, bx
    loop Begin
The total clock cycles for the CISC version might be:
    (2 movs × 1 cycle) + (1 mul × 30 cycles) = 32 cycles
While the clock cycles for the RISC version are:
    (3 movs × 1 cycle) + (5 adds × 1 cycle) + (5 loops × 1 cycle) = 13 cycles
With the RISC clock cycle also being shorter, RISC gives us much faster execution speeds.
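The cycle arithmetic for the two fragments can be checked with a short sketch. It assumes, as the comparison does, one cycle for each mov, add and loop and 30 cycles for the multiply, with the RISC loop body executing five times; the function and variable names are hypothetical.

```python
def total_cycles(instruction_mix):
    """Sum count x cycles over (count, cycles_each) pairs."""
    return sum(count * cycles for count, cycles in instruction_mix)


# (how many times executed, cycles each)
cisc_fragment = [(2, 1), (1, 30)]          # 2 movs, 1 mul
risc_fragment = [(3, 1), (5, 1), (5, 1)]   # 3 movs, 5 adds, 5 loops

print(total_cycles(cisc_fragment), total_cycles(risc_fragment))
```

The RISC fragment executes more instructions, yet fewer cycles in total, because the multiply has been replaced by repeated single-cycle additions.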
Because of their load-store ISAs, RISC architectures require a large number of CPU registers.
These registers provide fast access to data during sequential program execution.
They can also be employed to reduce the overhead typically caused by passing parameters to subprograms.
Instead of pulling parameters off of a stack, the subprogram is directed to use a subset of registers.
This is how registers can be overlapped in a RISC system.
The current window pointer (CWP) points to the active register window.
It is becoming increasingly difficult to distinguish RISC architectures from CISC architectures.
Some RISC systems provide more extravagant instruction sets than some CISC systems.
Some systems combine both approaches.
The following two slides summarize the characteristics that traditionally typify the differences between these two architectures.
RISC
- Multiple register sets.
- Three operands per instruction.
- Parameter passing through register windows.
- Single-cycle instructions.
- Hardwired control.
- Highly pipelined.
CISC
- Single register set.
- One or two register operands per instruction.
- Parameter passing through memory.
- Multiple cycle instructions.
- Microprogrammed control.
- Less pipelined.
Continued...
RISC
- Simple instructions, few in number.
- Fixed length instructions.
- Complexity in compiler.
- Only LOAD/STORE instructions access memory.
- Few addressing modes.
CISC
- Many complex instructions.
- Variable length instructions.
- Complexity in microcode.
- Many instructions can access memory.
- Many addressing modes.
Summary
Instruction Set Design Issues
Instruction set design issues include:
Where are operands stored?
- registers, memory, stack, accumulator
How many explicit operands are there?
- 0, 1, 2, or 3
How is the operand location specified?
- register, immediate, indirect, ...
What type & size of operands are supported?
- byte, int, float, double, string, vector, ...
What operations are supported?
- add, sub, mul, move, compare, ...
More About General Purpose Registers
Why do almost all new architectures use GPRs?
Registers are much faster than memory (even cache)
- Register values are available immediately
- When memory isn't ready, the processor must wait ("stall")
Registers are convenient for variable storage
- Compiler assigns some variables just to registers
- More compact code, since small fields specify registers (compared to memory addresses)
[Figure: Processor (Registers), Cache, Memory, Disk]
What Operations are Needed
Arithmetic + Logical
- Integer arithmetic: ADD, SUB, MULT, DIV, SHIFT
- Logical operation: AND, OR, XOR, NOT
Data Transfer - copy, load, store
Control - branch, jump, call, return
Floating Point - ADDF, MULF, DIVF (same as arithmetic but usually take bigger operands)
Decimal - ADDD, CONVERT
String - move, compare, search
Graphics - pixel and vertex, compression/decompression operations
Stacks Architecture Pros and Cons
Pros
- Good code density (implicit top of stack)
- Low hardware requirements
- Easy to write a simpler compiler for stack architectures
Cons
- Stack becomes the bottleneck
- Little ability for parallelism or pipelining
- Data is not always at the top of the stack when needed, so additional instructions like TOP and SWAP are needed
- Difficult to write an optimizing compiler for stack architectures
Accumulators Architecture Pros and Cons
Pros
• Very low hardware requirements
• Easy to design and understand
Cons
• Accumulator becomes the bottleneck
• Little ability for parallelism or pipelining
• High memory traffic
Memory-Memory Architecture Pros and Cons
Pros
• Requires fewer instructions (especially if 3 operands)
• Easy to write compilers for (especially if 3 operands)
Cons
• Very high memory traffic (especially if 3 operands)
• Variable number of clocks per instruction
• With two operands, more data movements are required
Memory-Register Architecture Pros and Cons
Pros
• Some data can be accessed without loading first
• Instruction format easy to encode
• Good code density
Cons
• Operands are not equivalent (poor orthogonality)
• Variable number of clocks per instruction
• May limit number of registers
Instruction Set Architecture (ISA)
To command a computer's hardware, you must speak its language.
The words of a machine's language are called instructions, and its vocabulary is called the instruction set.
Once you learn one machine language, it is easy to pick up others:
- There are few fundamental operations that all computers must provide
- All designers have the same goal of finding a language that simplifies building the hardware and the compiler while maximizing performance and minimizing cost
Learning how instructions are represented leads to discovering the secret of computing: the stored-program concept.
The MIPS instruction set is used as a case study.
Interface Design
A good interface
- Lasts through many implementations (portability, compatibility)
- Is used in many different ways (generality)
- Provides convenient functionality to higher levels
- Permits an efficient implementation at lower levels
Design decisions must take into account
- Technology
- Machine organization
- Programming languages
- Compiler technology
- Operating systems
[Figure: an interface between its uses and its implementations over time]
Classifying Instruction Set Architectures
Accumulator Architecture
• Common in early stored-program computers when hardware was so expensive
• Machine has only one register (accumulator) involved in all math & logical operations
• All operations assume the accumulator as a source operand and a destination for the operation, with the other operand stored in memory
Extended Accumulator Architecture
• Dedicated registers for specific operations, e.g. stack and array index registers, added
• The 8086 microprocessor is an example of such a special-purpose register architecture
General-Purpose Register Architecture
• MIPS is an example of such an architecture, where registers are not tied to a single role
• This type of instruction set can be further divided into:
  • Register-memory: allows for one operand to be in memory
  • Register-register (load-store): demands all operands to be in registers
Machine | # general-purpose registers | Architecture style | Year
Motorola 6800 | 2 | Accumulator | 1974
DEC VAX | 16 | Register-memory, memory-memory | 1977
Intel 8086 | 1 | Extended accumulator | 1978
Motorola 68000 | 16 | Register-memory | 1980
Intel 80386 | 8 | Register-memory | 1985
PowerPC | 32 | Load-store | 1992
DEC Alpha | 32 | Load-store | 1992
Compact Code and Stack Architectures
When memory is scarce, machines like the Intel 80x86 had variable-length instructions to match varying operand specifications and minimize code size.
Stack machines abandoned registers altogether, arguing that it is hard for compilers to use them efficiently.
Operands are to be pushed on a stack from memory, and the results have to be popped from the stack to memory.
Operations take their operands by default from the top of the stack and insert the results back onto the stack.
Stack machines simplify compilers and lend themselves to a compact instruction encoding, but limit compiler optimization (e.g. in math expressions).
Example: A = B + C
    Push AddressC    # Top=Top+4; Stack[Top]=Memory[AddressC]
    Push AddressB    # Top=Top+4; Stack[Top]=Memory[AddressB]
    add              # Stack[Top-4]=Stack[Top]+Stack[Top-4]; Top=Top-4
    Pop AddressA     # Memory[AddressA]=Stack[Top]; Top=Top-4
Compact code is important for heralded network computers where programs must be downloaded over the Internet (e.g. Java-based applications).
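The A = B + C sequence can be traced with a tiny interpreter. This is a minimal sketch: the tuple-based program format, the function name, and the use of a Python list in place of an explicit Top pointer are all assumptions made for brevity.

```python
def run_stack_program(program, memory):
    """Execute push/add/pop operations against a dict acting as memory."""
    stack = []
    for op, *args in program:
        if op == "push":            # memory -> top of stack
            stack.append(memory[args[0]])
        elif op == "add":           # consume two operands, push their sum
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)
        elif op == "pop":           # top of stack -> memory
            memory[args[0]] = stack.pop()
        else:
            raise ValueError(f"unknown operation: {op}")
    return memory


memory = {"A": 0, "B": 7, "C": 5}
program = [("push", "C"), ("push", "B"), ("add",), ("pop", "A")]
print(run_stack_program(program, memory)["A"])
```

Note that `add` names no operands at all, which is where the compact encoding comes from; the price is that every operand must first be moved to the top of the stack.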
Other types of Architecture
High-Level-Language Architecture
• In the 1960s, systems software was rarely written in high-level languages, and virtually every commercial operating system before Unix was written in assembly
• Some people blamed the code density on the instruction set rather than the programming language
• A machine design philosophy was advocated with the goal of making the hardware more like high-level languages
• The effectiveness of high-level languages, memory size limitation and lack of efficient compilers doomed this philosophy to a historical footnote
Reduced Instruction Set Architecture
• With the recent development in compiler technology and expanded memory sizes, fewer programmers are using assembly-level coding
• Instruction set architecture became measurable in the way compilers, rather than programmers, use them
• RISC architecture favors simplifying hardware design over enriching the offered set of instructions, relying on compilers to effectively use them to perform complex operations
• Virtually all new architectures since 1982 follow the RISC philosophy of fixed instruction lengths, load-store operations and limited addressing modes
Evolution of Instruction Sets
[Figure: evolution of instruction sets]
Single Accumulator (EDSAC)
Accumulator + Index Registers (Manchester Mark I, IBM 700 series)
Separation of Programming Model from Implementation
- High-level Language Based
- Concept of a Family
General Purpose Register Machines
- Complex Instruction Sets (VAX, Intel)
- Load/Store Architecture (CDC, Cray 1)
RISC (MIPS, SPARC, ...)
Register-Memory Architectures
Effect of the number of memory operands
Number of memory addresses | Max. number of operands | Examples
0 | 3 | SPARC, MIPS, PowerPC, ALPHA
1 | 2 | Intel 80x86, Motorola 68000
2 | 2 | VAX (also has 3 operands format)
3 | 3 | VAX (also has 2 operands format)
Interpreting Memory Addressing
The address of a word matches the byte address of one of its 4 bytes
The addresses of sequential words differ by 4 (word size in bytes)
Words' addresses are multiples of 4 (alignment restriction)
Machines that use the address of the leftmost byte as the word address are called "Big Endian", and those that use the rightmost byte are called "Little Endian"
Misalignment complicates memory access and causes programs to run slower (some machines do not allow misaligned memory access at all)
Byte ordering can be a problem when exchanging data among different machines
Byte addresses affect array index calculation, to account for word addressing and the offset within the word
Object addressed | Aligned at byte offsets | Misaligned at byte offsets
Byte | 0, 1, 2, 3, 4, 5, 6, 7 | Never
Half word | 0, 2, 4, 6 | 1, 3, 5, 7
Word | 0, 4 | 1, 2, 3, 5, 6, 7
Double word | 0 | 1, 2, 3, 4, 5, 6, 7
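Both ideas on this slide, alignment and byte order, can be checked with a few lines of Python. The helper names are hypothetical; `struct.pack` is the standard-library call used to lay a 32-bit word out in a chosen byte order.

```python
import struct


def is_aligned(address, size):
    """An object of `size` bytes is aligned when its address is a multiple of size."""
    return address % size == 0


def misaligned_offsets(size, span=8):
    """Byte offsets within an 8-byte span at which an object of `size` bytes is misaligned."""
    return [offset for offset in range(span) if not is_aligned(offset, size)]


for name, size in (("byte", 1), ("half word", 2), ("word", 4), ("double word", 8)):
    print(name, misaligned_offsets(size))

# The same 32-bit word stored in the two byte orders.
value = 0x0A0B0C0D
big_endian = struct.pack(">I", value)      # most significant byte at the lowest address
little_endian = struct.pack("<I", value)   # least significant byte at the lowest address
print(big_endian.hex(), little_endian.hex())
```

The byte sequences are mirror images of each other, which is why raw binary data exchanged between machines of different endianness must be converted.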
Addressing Modes
Addressing modes refer to how to specify the location of an operand (effective address)
Addressing modes have the ability to:
- Significantly reduce instruction counts
- Increase the average CPI
- Increase the complexity of building a machine
The VAX machine is used for benchmark data since it supports a wide range of memory addressing modes
Famous addressing modes can be classified based on:
- the source of the data, into register, immediate or memory
- the address calculation, into direct and indirect
An indexed addressing mode is usually provided to allow efficient implementation of loops and array access
Example of Addressing Modes
Register
  Example: Add R4, R3
  Meaning: Regs[R4] ← Regs[R4] + Regs[R3]
  When used: When a value is in a register
Immediate
  Example: Add R4, #3
  Meaning: Regs[R4] ← Regs[R4] + 3
  When used: For constants
Displacement
  Example: Add R4, 100(R1)
  Meaning: Regs[R4] ← Regs[R4] + Mem[100 + Regs[R1]]
  When used: Accessing local variables
Register indirect
  Example: Add R4, (R1)
  Meaning: Regs[R4] ← Regs[R4] + Mem[Regs[R1]]
  When used: Accessing using a pointer or a computed address
Indexed
  Example: Add R4, (R1 + R2)
  Meaning: Regs[R4] ← Regs[R4] + Mem[Regs[R1] + Regs[R2]]
  When used: Sometimes useful in array addressing: R1 = base of the array; R2 = index amount
Direct or absolute
  Example: Add R4, (1001)
  Meaning: Regs[R4] ← Regs[R4] + Mem[1001]
  When used: Sometimes useful for accessing static data; address constant may need to be large
Memory indirect or memory deferred
  Example: Add R4, @(R3)
  Meaning: Regs[R4] ← Regs[R4] + Mem[Mem[Regs[R3]]]
  When used: If R3 is the address of the pointer p, then the mode yields *p
Autoincrement
  Example: Add R4, (R2)+
  Meaning: Regs[R4] ← Regs[R4] + Mem[Regs[R2]]; Regs[R2] ← Regs[R2] + d
  When used: Useful for stepping through arrays within a loop. R2 points to the start of the array; each reference increments R2 by d
Autodecrement
  Example: Add R4, -(R2)
  Meaning: Regs[R2] ← Regs[R2] - d; Regs[R4] ← Regs[R4] + Mem[Regs[R2]]
  When used: Same use as autoincrement. Autodecrement/increment can also act as push/pop to implement a stack
Scaled
  Example: Add R4, 100(R2)[R3]
  Meaning: Regs[R4] ← Regs[R4] + Mem[100 + Regs[R2] + Regs[R3] × d]
  When used: Used to index arrays
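The effective-address calculations for the memory modes can be written out directly. This is an illustrative sketch, not a model of any real machine: the function signature, the mode names as strings, and the dictionary-based register file and memory are all assumptions.

```python
def effective_address(mode, regs, mem, base=None, index=None, disp=0, d=4):
    """Return the memory address an operand refers to under the given mode."""
    if mode == "displacement":       # disp(base)
        return disp + regs[base]
    if mode == "register_indirect":  # (base)
        return regs[base]
    if mode == "indexed":            # (base + index)
        return regs[base] + regs[index]
    if mode == "direct":             # (disp)
        return disp
    if mode == "memory_indirect":    # @(base): the address is itself read from memory
        return mem[regs[base]]
    if mode == "scaled":             # disp(base)[index], index scaled by element size d
        return disp + regs[base] + regs[index] * d
    raise ValueError(f"unknown mode: {mode}")


regs = {"R1": 200, "R2": 16, "R3": 5}
mem = {200: 1000}

print(effective_address("displacement", regs, mem, base="R1", disp=100))
print(effective_address("indexed", regs, mem, base="R1", index="R2"))
print(effective_address("memory_indirect", regs, mem, base="R1"))
print(effective_address("scaled", regs, mem, base="R2", index="R3", disp=100))
```

Each richer mode folds one more addition, multiplication or memory read into a single instruction, which is how addressing modes reduce instruction counts while raising CPI and hardware complexity.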
Addressing Mode for Signal Processing
Fast Fourier Transform
  0 (000) → 0 (000)
  1 (001) → 4 (100)
  2 (010) → 2 (010)
  3 (011) → 6 (110)
  4 (100) → 1 (001)
  5 (101) → 5 (101)
  6 (110) → 3 (011)
  7 (111) → 7 (111)
Modulo addressing
- Since DSP deals with continuous data streams, circular buffers are widely used
- Circular or modulo addressing allows automatic increment and decrement, and resets the pointer when reaching the end of the buffer
Reverse addressing
- Resulting address is the reverse order of the current address
- Reverse addressing mode expedites the access, which otherwise requires a number of logical instructions or extra memory accesses
DSP offers special addressing modes to better serve popular algorithms
Special features require either hand coding or a compiler that uses such features (GNU would not be a good choice)
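Both DSP addressing modes are easy to emulate in software, which also shows the work the hardware saves. The function names and parameters below are hypothetical; the code is a sketch of the address arithmetic only.

```python
def bit_reverse(index, bits):
    """Reverse the lowest `bits` bits of index (FFT reverse addressing)."""
    result = 0
    for _ in range(bits):
        result = (result << 1) | (index & 1)
        index >>= 1
    return result


def next_circular(pointer, step, start, length):
    """Advance a buffer pointer by step, wrapping inside [start, start + length)."""
    return start + (pointer - start + step) % length


print([bit_reverse(i, 3) for i in range(8)])

# Walking a hypothetical 8-entry circular buffer located at address 0x100.
pointer = 0x106
for _ in range(3):
    pointer = next_circular(pointer, 1, 0x100, 8)
    print(hex(pointer))
```

Without hardware support, each access costs a loop of shifts (reverse addressing) or an extra modulo computation (circular addressing), which is the overhead these modes remove.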
Operations of the Computer Hardware
"There must certainly be instructions for performing the fundamental arithmetic operations."
  Burks, Goldstine and von Neumann, 1947
Assembly language is a symbolic representation of what the processor actually understands
The MIPS assembler allows only one instruction per line and ignores comments following # until the end of the line
Example:
Translation of a segment of a C program to MIPS assembly instructions
C:
    f = (g + h) - (i + j)
MIPS:
    add t0, g, h     # temp variable t0 contains "g + h"
    add t1, i, j     # temp variable t1 contains "i + j"
    sub f, t0, t1    # f = t0 - t1 = (g + h) - (i + j)
Operator type | Examples
Arithmetic and logical | Integer arithmetic and logical operations: add, and, subtract, or
Data Transfer | Loads-stores (move instructions on machines with memory addressing)
Control | Branch, jump, procedure call and return, trap
System | Operating system call, virtual memory management instructions
Floating point | Floating point instructions: add, multiply
Decimal | Decimal add, decimal multiply, decimal to character conversion
String | String move, string compare, string search
Graphics | Pixel operations, compression/decompression operations
Operations in the Instruction Set
Arithmetic, logical, data transfer and control are almost standard categories for all machines
System instructions are required for multi-programming environments, although support for system functions varies
Decimal and string instructions can be primitives, e.g. in the IBM 360 and the VAX
Support for floating point, decimal, string and graphics can be optional, sometimes provided via a co-processor
Some machines rely on the compiler to synthesize special operations, such as string handling, from simpler instructions
Operations for Media & Signal Processing
Single instruction multiple data (SIMD) and vector instructions are often supported in DSPs, which are commonly used in multimedia and signal processing applications
Partitioned Add (integer)
- Perform multiple 16-bit additions on a 64-bit ALU, since most data are narrow
- Increases ALU throughput for multimedia applications
Paired single operations (float)
- Allow the same register to act as two operands to the same operation
- Handy in dealing with vertices and coordinates
Multiply and accumulate
- Very handy for calculating dot products of vectors (signal processing) and matrix multiplication
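A partitioned add and a multiply-accumulate loop can both be sketched in plain Python. The lane width of 16 bits, the four-lane 64-bit word, and the function names are assumptions for illustration; real media instructions perform the lane additions in parallel in hardware rather than in a loop.

```python
def partitioned_add(a, b, lane_bits=16, lanes=4):
    """Add corresponding lanes of two packed words; carries never cross lanes."""
    mask = (1 << lane_bits) - 1
    result = 0
    for lane in range(lanes):
        shift = lane * lane_bits
        lane_sum = ((a >> shift) & mask) + ((b >> shift) & mask)
        result |= (lane_sum & mask) << shift   # wrap around inside the lane
    return result


def dot_product(xs, ys):
    """Repeated multiply-accumulate: acc = acc + x * y."""
    acc = 0
    for x, y in zip(xs, ys):
        acc += x * y
    return acc


packed = partitioned_add(0x0001_FFFF_0003_0004, 0x0001_0001_0001_0001)
print(hex(packed))
print(dot_product([1, 2, 3], [4, 5, 6]))
```

In the packed result, the lane holding 0xFFFF wraps to zero without disturbing its neighbour, which is the property that distinguishes a partitioned add from an ordinary 64-bit add.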
Frequency of Operations Usage
Make the common case fast by focusing on these operations
The most widely executed instructions are the simple operations of an instruction set
The following is the average usage in SPECint92 on Intel 80x86
Rank | 80x86 instruction | Integer average (% total executed)
1 | Load | 22%
2 | Conditional branch | 20%
3 | Compare | 16%
4 | Store | 12%
5 | Add | 8%
6 | And | 6%
7 | Sub | 5%
8 | Move register-register | 4%
9 | Call | 1%
10 | Return | 1%
Total | | 96%
Control Flow Instructions
Jump for unconditional change in the control flow
Branch for conditional change in the control flow
Procedure calls and returns
Data is based on SPEC on Alpha
Destination Address Definition
Relative addressing w.r.t. the program counter proved to be the best choice for forward and backward branching or jumps (load address independent)
To allow for dynamic loading of library routines, register indirect addressing allows addresses to be loaded in special registers (e.g. virtual functions in C++ and system calls in a case statement)
Data is based on SPEC on Alpha
Condition Evaluation
Compare-and-branch can be efficient if the majority of conditions are comparisons with zero
Remember to focus on the common case
Based on SPEC on MIPS
Frequency of Types of Comparison
Data is based on SPEC on Alpha
Different benchmark and machine set new design priority
DSPs support a repeat instruction for for-loops (vectors) using registers
Supporting Procedures
Execution of a procedure follows these steps:
- Store parameters in a place accessible to the procedure
- Transfer control to the procedure
- Acquire the storage resources needed for the procedure
- Perform the desired task
- Store the result value in a place accessible to the calling program
- Return control to the point of origin
The hardware provides a program counter to trace instruction flow and manage transfer of control
Parameter Passing
- Registers can be used for passing a small number of parameters
- A stack is used to spill registers of the current context and make room for the called procedure to run, and to allow for large parameters to be passed
- Storage of machine state can be performed by caller or callee
- Handling of shared variables is important to ensure correct semantics and thus requires clear specifications in the library interface
Global variables stored in registers need careful handling
Type and Size of Operands
The type of an operand is designated by encoding it in the instruction's operation code
The type of an operand, e.g. single precision float, effectively gives its size
Common operand types include character, half word and word size integer, single- and double-precision floating point
Characters are almost always in ASCII, integers are in 2's complement, and floating point is in IEEE 754
The 16-bit Unicode used in Java is gaining popularity due to its support for the international character sets
For business applications, some architectures support a decimal format in binary coded decimal (BCD)
Depending on the size of the word, the complexity of handling different operand types differs
DSP offers fixed point data types to support high precision floating point arithmetic and to allow sharing a single exponent for multiple numbers
For graphics applications, vertex and pixel operands are added features
Size of Operands
Double-word data type is used for double-precision floating point operations and address storage in machines with a 64-bit wide address bus
Words are used for integer operations and for 32-bit address bus machines
Because of the mix in SPEC, word and double-word data types dominate
Frequency of reference by size based on SPEC2000 on Alpha
Instruction Representation
Humans are taught to think in base 10 (decimal), but numbers may be represented in any base (123 in base 10 = 1111011 in binary, or base 2)
Numbers are stored in computers as a series of high and low electronic signals (binary numbers)
Binary digits are called bits and considered the atom of computing
Each piece of an instruction is a number, and placing these numbers together forms the instruction
The assembler translates the assembly symbolic instructions into machine language instructions (machine code)
Example:
Assembly: add $t0, $s1, $s2
M/C language (decimal): 0 | 17 | 18 | 8 | 0 | 32
M/C language (binary): 000000 | 10001 | 10010 | 01000 | 00000 | 100000
Field widths: 6 bits | 5 bits | 5 bits | 5 bits | 5 bits | 6 bits
Note: the MIPS compiler by default maps $s0..$s7 to reg. 16-23 and $t0..$t7 to reg. 8-15
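Packing the fields of a register-format instruction into one 32-bit word can be sketched as below. The field layout (6, 5, 5, 5, 5, 6 bits) and the register numbering ($t0-$t7 as 8-15, $s0-$s7 as 16-23) follow the standard MIPS convention; the function name and the register table are hypothetical helpers written for this illustration.

```python
# Register names mapped to numbers, following the usual MIPS convention.
REGISTERS = {f"$t{i}": 8 + i for i in range(8)}
REGISTERS.update({f"$s{i}": 16 + i for i in range(8)})


def encode_r_format(rs, rt, rd, funct, shamt=0, op=0):
    """Pack op | rs | rt | rd | shamt | funct into a single 32-bit word."""
    return (
        (op << 26)
        | (REGISTERS[rs] << 21)
        | (REGISTERS[rt] << 16)
        | (REGISTERS[rd] << 11)
        | (shamt << 6)
        | funct
    )


# add $t0, $s1, $s2: the destination is $t0, the sources are $s1 and $s2.
word = encode_r_format(rs="$s1", rt="$s2", rd="$t0", funct=32)
print(f"{word:032b}")
print(hex(word))
```

Note that the destination register comes first in the assembly text but third among the register fields of the machine word.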
Encoding an Instruction Set
Instruction encoding affects the size of the compiled program and the complexity of the CPU implementation
The operation is typically specified in one field called the opcode
The addressing mode for the operand can be encoded with the operation, or specified through a separate identifier in case of a large number of supported modes
The architecture must balance between several competing factors:
- Desire to support as many registers and addressing modes as possible
- Effect of operand specification on the size of the instruction (program)
- Desire to simplify instruction fetching and decoding during execution
Fixed size instruction encoding simplifies the CPU design while limiting the addressing modes supported
An architect caring about the code size can use variable size encoding
A hybrid approach is to allow variability by supporting multiple-sized instructions
Encoding Examples
MIPS Instruction Format
Register-format instructions
    op (6 bits) | rs (5 bits) | rt (5 bits) | rd (5 bits) | shamt (5 bits) | funct (6 bits)
  - op: basic operation of the instruction, traditionally called the opcode
  - rs: the first register source operand
  - rt: the second register source operand
  - rd: the register destination operand; it gets the result of the operation
  - shamt: shift amount
  - funct: function; this field selects the specific variant of the operation in the op field
Immediate-type instructions
    op (6 bits) | rs (5 bits) | rt (5 bits) | address (16 bits)
  - Some instructions need longer fields than those provided above, e.g. for large constants
  - The 16-bit address means a load word instruction can load a word within a region of ±2^15 bytes of the address in the base register
  - Example: lw $t0, 32($s3)   # Temporary register $t0 gets A[8]

    Instruction  Format  op   rs   rt   rd   shamt  funct  address
    add          R       0    reg  reg  reg  0      32     N/A
    sub          R       0    reg  reg  reg  0      34     N/A
    lw           I       35   reg  reg  N/A  N/A    N/A    address
    sw           I       43   reg  reg  N/A  N/A    N/A    address
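The ±2^15 byte reach of the 16-bit address field can be made concrete with a small Python sketch. The function name `encode_i_type` is hypothetical; the layout (6-bit op, 5-bit rs, 5-bit rt, 16-bit signed address) and the lw opcode 35 are taken from the format above, and the register numbers assume the usual convention ($s3 = 19, $t0 = 8).

```python
def encode_i_type(op, rs, rt, imm):
    """Pack an immediate-format instruction; imm is a signed 16-bit offset."""
    if not -(1 << 15) <= imm < (1 << 15):
        raise ValueError("offset does not fit in a signed 16-bit field")
    # Mask the offset so that negative values are stored in two's complement.
    return (op << 26) | (rs << 21) | (rt << 16) | (imm & 0xFFFF)


LW_OP = 35
# lw $t0, 32($s3): base register $s3 is register 19, target $t0 is register 8
word = encode_i_type(LW_OP, rs=19, rt=8, imm=32)
print(hex(word))  # 0x8e680020
```

An offset of 32768 is rejected because it would need a 17th bit, which is the reason a single load can only reach addresses within 2^15 bytes of the base register.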
The Stored Program Concept
• Learning how instructions are represented leads to discovering the secret of computing: the stored-program concept
• Today's computers are built on two key principles:
  - Instructions are represented as numbers
  - Programs can be stored in memory to be read or written, just like numbers
• The power of the concept: memory can contain
  - the source code for an editor
  - the compiled machine code for the editor
  - the text that the compiled program is using
  - the compiler that generated the code
[Figure: a processor connected to a memory that holds an accounting program (machine code), an editor program (machine code), a C compiler (machine code), payroll data, book text, and the C source code for the editor program]
Compiling if-then-else in MIPS
Assuming the five variables f, g, h, i, and j correspond to the five registers $s0 through $s4, what is the compiled MIPS code for the following C if statement?
    if (i == j) f = g + h; else f = g - h;
[Figure: flowchart testing i == j; the i == j path executes f = g + h, the i ≠ j path goes to Else and executes f = g - h, and both paths meet at Exit]
MIPS:
          bne $s3, $s4, Else    # go to Else if i ≠ j
          add $s0, $s1, $s2     # f = g + h (skipped if i ≠ j)
          j   Exit
    Else: sub $s0, $s1, $s2     # f = g - h (skipped if i = j)
    Exit:
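To check the control flow of the compiled sequence, here is a minimal Python sketch that steps through the four instructions with an explicit program counter. It is a toy interpreter written for this example only (the tuple encoding and the `run_if_then_else` name are assumptions); it ignores branch delay slots and 32-bit overflow.

```python
def run_if_then_else(g, h, i, j):
    """Interpret the bne/add/j/sub sequence and return the final value of f ($s0)."""
    regs = {"$s0": 0, "$s1": g, "$s2": h, "$s3": i, "$s4": j}
    program = [
        ("bne", "$s3", "$s4", "Else"),  # index 0
        ("add", "$s0", "$s1", "$s2"),   # index 1
        ("j", "Exit"),                  # index 2
        ("sub", "$s0", "$s1", "$s2"),   # index 3, label Else
    ]
    labels = {"Else": 3, "Exit": 4}     # Exit is just past the last instruction
    pc = 0
    while pc < len(program):
        op, *args = program[pc]
        pc += 1                         # default: sequential flow
        if op == "bne":
            if regs[args[0]] != regs[args[1]]:
                pc = labels[args[2]]
        elif op == "j":
            pc = labels[args[0]]
        elif op == "add":
            regs[args[0]] = regs[args[1]] + regs[args[2]]
        elif op == "sub":
            regs[args[0]] = regs[args[1]] - regs[args[2]]
    return regs["$s0"]


print(run_if_then_else(g=5, h=3, i=1, j=1))  # i == j, so f = g + h = 8
print(run_if_then_else(g=5, h=3, i=1, j=2))  # i != j, so f = g - h = 2
```

The compiler tests the opposite condition (bne rather than beq) so that the then-part can follow the branch directly.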
Typical Compilation
Major Types of Optimization
Optimization name: explanation (frequency)

High-level: at or near source level; machine-independent
  - Procedure integration: replace procedure call by procedure body (N.M.)
Local: within straight-line code
  - Common subexpression elimination: replace two instances of the same computation by a single copy
  - Constant propagation: replace all instances of a variable that is assigned a constant with the constant
  - Stack height reduction: rearrange expression tree to minimize resources needed for expression evaluation (N.M.)
Global: across a branch
  - Global common subexpression elimination: same as local, but this version crosses branches
  - Copy propagation: replace all instances of a variable A that has been assigned X (i.e., A = X) with X
  - Code motion: remove code from a loop that computes the same value each iteration of the loop
  - Induction variable elimination: simplify/eliminate array addressing calculations within loops
Machine-dependent: depends on machine knowledge
  - Strength reduction: many examples, such as replacing a multiply by a constant with adds and shifts (N.M.)
  - Pipeline scheduling: reorder instructions to improve pipeline performance (N.M.)
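Strength reduction, the machine-dependent entry above, can be illustrated with a short Python sketch. The function below is an assumption made for illustration, not how any particular compiler implements the transformation: it computes x * c for a non-negative constant c using only shifts and adds, one add per set bit of c.

```python
def multiply_by_constant(x, c):
    """Compute x * c (c >= 0) with shifts and adds instead of a multiply."""
    if c < 0:
        raise ValueError("this sketch only handles non-negative constants")
    result = 0
    shift = 0
    while c:
        if c & 1:                 # this bit of the constant is set
            result += x << shift  # add x * 2**shift
        c >>= 1
        shift += 1
    return result


# 10 = 0b1010, so x * 10 becomes (x << 3) + (x << 1)
print(multiply_by_constant(7, 10))  # 70
```

A compiler would emit the shift/add sequence directly at compile time, since the constant is known; the loop here only shows which shifts are needed.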
Effect of Compiler Optimization
[Figure: chart by program and compiler optimization level; measurements taken on MIPS]
  - Level 0: non-optimized code
  - Level 1: local optimization
  - Level 2: global optimization, s/w pipelining
  - Level 3: adds procedure integration
Compiler Support for Multimedia Instr.
• Intel's MMX and PowerPC AltiVec have small vector processing capabilities targeting multimedia applications (to speed up graphics)
• Intel added a new set of instructions called the Streaming SIMD Extension
• A major advantage of vector computers is hiding the latency of memory access by loading multiple elements and then overlapping execution with data transfer
• Vector computers typically have strided and/or gather/scatter addressing to perform operations on distant memory locations
  - Strided addressing allows memory access in increments larger than one
  - Gather/scatter addressing is similar to register indirect mode, where the addresses are stored instead of the data
• Supporting vector operations without strided addressing, as in Intel's MMX, limits the potential speedup
• Such limited support for vector processing makes vectorizing compiler optimizations unpopular and restricts their scope to hand-coded routines
• SIMD instructions on MMX and AltiVec tend to be solutions, not primitives
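The two vector addressing styles mentioned above can be sketched in Python as follows. Memory is modeled as a plain list, and the function names `strided_load` and `gather_load` are hypothetical; real vector hardware performs these accesses in a single instruction rather than in a software loop.

```python
def strided_load(memory, base, stride, count):
    """Load `count` elements starting at `base`, stepping `stride` elements each time."""
    return [memory[base + k * stride] for k in range(count)]


def gather_load(memory, index_vector):
    """Load the elements whose addresses are held in `index_vector`."""
    return [memory[address] for address in index_vector]


memory = list(range(100, 120))          # memory[i] holds 100 + i
print(strided_load(memory, 2, 4, 3))    # [102, 106, 110]
print(gather_load(memory, [7, 0, 3]))   # [107, 100, 103]
```

Strided access covers regular patterns such as walking down a matrix column, while gather covers irregular patterns such as sparse data, at the cost of storing an index vector.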
Starting a Program
[Figure: C program → Compiler → Assembly language program → Assembler → Object: machine language module (together with Object: library routine, machine language) → Linker → Executable: machine language program → Loader → Memory]
Linker:
  - Places code and data modules symbolically in memory
  - Determines the addresses of data and instruction labels
  - Patches both internal and external references
Object files for Unix typically contain:
  - Header: size and position of components
  - Text segment: machine code
  - Data segment: static and dynamic variables
  - Relocation info: identifies absolute memory references
  - Symbol table: names and locations of labels, procedures and variables
  - Debugging info: mapping of source to object code, break points, etc.
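The three linker steps listed above can be sketched in Python. Everything about the module format here (dictionaries with `code`, `labels` and `refs` entries, one address unit per list element) is an assumption invented for this toy example; real object files and linkers work on bytes, segments and relocation records.

```python
def link(modules, base=0):
    """Place modules, resolve labels to absolute addresses, and patch references."""
    # Step 1: place each module at consecutive addresses.
    placement, address = {}, base
    for module in modules:
        placement[module["name"]] = address
        address += len(module["code"])

    # Step 2: determine the absolute address of every label.
    symbols = {}
    for module in modules:
        for label, offset in module["labels"].items():
            symbols[label] = placement[module["name"]] + offset

    # Step 3: patch internal and external references with those addresses.
    image = []
    for module in modules:
        code = list(module["code"])
        for offset, label in module["refs"].items():
            code[offset] = symbols[label]
        image.extend(code)
    return image, symbols


main = {"name": "main", "code": ["call", None, "halt"],
        "labels": {"main": 0}, "refs": {1: "helper"}}
lib = {"name": "lib", "code": ["ret"], "labels": {"helper": 0}, "refs": {}}
image, symbols = link([main, lib])
print(image)  # ['call', 3, 'halt', 'ret']
```

The unresolved operand of `call` (the `None` placeholder) is filled in only after every module has been placed, which is why label resolution must be a separate pass before patching.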
Loading Executable Program
[Figure: MIPS memory layout, from high to low addresses]
    $sp → 7fff ffff hex   Stack
                          Dynamic data
    $gp → 1000 8000 hex   Static data (from 1000 0000 hex)
    pc  → 0040 0000 hex   Text
          0               Reserved
To load an executable, the operating system follows these steps:
  - Reads the executable file header to determine the size of the text and data segments
  - Creates an address space large enough for the text and data
  - Copies the instructions and data from the executable file into memory
  - Copies the parameters (if any) to the main program onto the stack
  - Initializes the machine registers and sets the stack pointer to the first free location
  - Jumps to a start-up routine that copies the parameters into the argument registers and calls the main routine of the program
Instruction Set Design Issues
• Number of addresses
• Flow of control
• Operand types
• Addressing modes
• Instruction types
• Instruction formats
Number of Addresses
Four categories
• 3-address machines
  - 2 for the source operands and one for the result
• 2-address machines
  - One address doubles as source and result
• 1-address machines
  - Accumulator machines
  - Accumulator is used for one source and the result
• 0-address machines
  - Stack machines
  - Operands are taken from the stack
  - Result goes onto the stack
Number of Addresses (cont.)
Three-address machines
• Two for the source operands, one for the result
• RISC processors use three addresses
• Sample instructions
    add  dest,src1,src2   ; M(dest) = [src1] + [src2]
    sub  dest,src1,src2   ; M(dest) = [src1] - [src2]
    mult dest,src1,src2   ; M(dest) = [src1] * [src2]
• Three addresses: Operand 1 | Operand 2 | Result
  Example: a = b + c
• Three-address instruction formats are not common, because they require a relatively long instruction format to hold the three address references
Number of Addresses (cont.)
Example
• C statement
    A = B + C * D - E + F + A
• Equivalent code:
    mult T,C,D   ; T = C*D
    add  T,T,B   ; T = B + C*D
    sub  T,T,E   ; T = B + C*D - E
    add  T,T,F   ; T = B + C*D - E + F
    add  A,T,A   ; A = B + C*D - E + F + A
Number of Addresses (cont.)
Two-address machines
• One address doubles (for source operand and result)
• Last example makes a case for it
  - Address T is used twice
• Sample instructions
    load dest,src   ; M(dest) = [src]
    add  dest,src   ; M(dest) = [dest] + [src]
    sub  dest,src   ; M(dest) = [dest] - [src]
    mult dest,src   ; M(dest) = [dest] * [src]
• Two addresses: one address doubles as operand and result
  Example: a = a + b
• The two-address format reduces the space requirement but also introduces some awkwardness. To avoid altering the value of an operand, a MOVE instruction is used to move one of the values to a result or temporary location before performing the operation
Number of Addresses (cont.)
Example
• C statement
    A = B + C * D - E + F + A
• Equivalent code:
    load T,C   ; T = C
    mult T,D   ; T = C*D
    add  T,B   ; T = B + C*D
    sub  T,E   ; T = B + C*D - E
    add  T,F   ; T = B + C*D - E + F
    add  A,T   ; A = B + C*D - E + F + A
Number of Addresses (cont.)
One-address machines
• Use a special set of registers called accumulators
  - Specify one source operand and receive the result
• Called accumulator machines
• Sample instructions
    load  addr   ; accum = [addr]
    store addr   ; M[addr] = accum
    add   addr   ; accum = accum + [addr]
    sub   addr   ; accum = accum - [addr]
    mult  addr   ; accum = accum * [addr]
Number of Addresses (cont.)
Example
• C statement
    A = B + C * D - E + F + A
• Equivalent code:
    load  C   ; load C into accum
    mult  D   ; accum = C*D
    add   B   ; accum = C*D + B
    sub   E   ; accum = B + C*D - E
    add   F   ; accum = B + C*D - E + F
    add   A   ; accum = B + C*D - E + F + A
    store A   ; store accum contents in A
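The one-address sequence can be checked with a minimal accumulator-machine interpreter in Python. This is a sketch under simple assumptions: memory is a dictionary keyed by variable name, instructions are plain strings, and the function name `run_accumulator_machine` is made up for this example.

```python
def run_accumulator_machine(program, memory):
    """Execute one-address instructions; every operation goes through accum."""
    accum = 0
    for instruction in program:
        op, addr = instruction.split()
        if op == "load":
            accum = memory[addr]
        elif op == "store":
            memory[addr] = accum
        elif op == "add":
            accum += memory[addr]
        elif op == "sub":
            accum -= memory[addr]
        elif op == "mult":
            accum *= memory[addr]
        else:
            raise ValueError(f"unknown instruction: {op}")
    return memory


memory = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
program = ["load C", "mult D", "add B", "sub E", "add F", "add A", "store A"]
run_accumulator_machine(program, memory)
print(memory["A"])  # 2 + 3*4 - 5 + 6 + 1 = 16
```

Every instruction except `load` reads or writes the accumulator implicitly, which is why only one address needs to be encoded.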
Number of Addresses (cont.)
Zero-address machines
• Stack supplies operands and receives the result
  - Special instructions to load and store use an address
• Called stack machines (Ex: HP3000, Burroughs B5500)
• Sample instructions
    push addr   ; push([addr])
    pop  addr   ; pop([addr])
    add         ; push(pop + pop)
    sub         ; push(pop - pop)
    mult        ; push(pop * pop)
Number of Addresses (cont.)
Example
• C statement
    A = B + C * D - E + F + A
• Equivalent code:
    push E
    push C
    push D
    mult
    push B
    add
    sub
    push F
    add
    push A
    add
    pop  A
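The zero-address sequence can be traced with a small stack-machine interpreter. This Python sketch assumes the semantics given in the sample instructions, where the first value popped is the left operand (so `sub` computes top minus next-to-top); the function name and the dictionary-based memory are assumptions of the example.

```python
def run_stack_machine(program, memory):
    """Execute zero-address instructions; only push and pop name a memory address."""
    stack = []
    for instruction in program:
        op, *operand = instruction.split()
        if op == "push":
            stack.append(memory[operand[0]])
        elif op == "pop":
            memory[operand[0]] = stack.pop()
        else:
            first = stack.pop()    # top of stack
            second = stack.pop()   # element below it
            if op == "add":
                stack.append(first + second)
            elif op == "sub":
                stack.append(first - second)
            elif op == "mult":
                stack.append(first * second)
            else:
                raise ValueError(f"unknown instruction: {op}")
    return memory


memory = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
program = ["push E", "push C", "push D", "mult", "push B", "add",
           "sub", "push F", "add", "push A", "add", "pop A"]
run_stack_machine(program, memory)
print(memory["A"])  # 2 + 3*4 - 5 + 6 + 1 = 16
```

E is pushed first so that it sits below B + C*D when `sub` executes; with the opposite operand order the push sequence would have to change.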
Load/Store Architecture
• Instructions expect operands in internal processor registers
• Special LOAD and STORE instructions move data between registers and memory
• RISC uses this architecture
• Reduces instruction length
Load/Store Architecture (cont.)
• Sample instructions
    load  Rd,addr      ; Rd = [addr]
    store addr,Rs      ; (addr) = Rs
    add   Rd,Rs1,Rs2   ; Rd = Rs1 + Rs2
    sub   Rd,Rs1,Rs2   ; Rd = Rs1 - Rs2
    mult  Rd,Rs1,Rs2   ; Rd = Rs1 * Rs2
Number of Addresses (cont.)
Example
• C statement
    A = B + C * D - E + F + A
• Equivalent code:
    load  R1,B
    load  R2,C
    load  R3,D
    load  R4,E
    load  R5,F
    load  R6,A
    mult  R2,R2,R3
    add   R2,R2,R1
    sub   R2,R2,R4
    add   R2,R2,R5
    add   R2,R2,R6
    store A,R2
Flow of Control
• Default is sequential flow
• Several instructions alter this default execution
  - Branches
    * Unconditional
    * Conditional
    * Delayed branches
  - Procedure calls
    * Delayed procedure calls
Flow of Control (cont.)
Branches
• Unconditional
  - Absolute address
  - PC-relative
    * Target address is specified relative to PC contents
    * Relocatable code
• Example: MIPS
  - Absolute address
      j target
  - PC-relative
      b target
Flow of Control (cont.)
Branches
• Conditional
  - Jump is taken only if the condition is met
• Two types
  - Set-Then-Jump
    * Condition testing is separated from branching
    * Condition code registers are used to convey the condition test result
    * Condition code registers keep a record of the status of the last ALU operation, such as the overflow condition
  - Example: Pentium code
      cmp AX,BX    ; compare AX and BX
      je  target   ; jump if equal
Flow of Control (cont.)
  - Test-and-Jump
    * Single instruction performs condition testing and branching
  - Example: MIPS instruction
      beq Rsrc1,Rsrc2,target
    Jumps to target if Rsrc1 = Rsrc2
• Delayed branching
  - Control is transferred after executing the instruction that follows the branch instruction
    * This instruction slot is called the delay slot
  - Improves efficiency
  - Highly pipelined RISC processors support it
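The effect of a delay slot on execution order can be shown with a short Python trace. This is a deliberately simplified model invented for illustration: instructions are `(op, argument)` pairs, only unconditional jumps exist, and a jump placed in a delay slot is treated as ordinary work.

```python
def trace_execution(program, delayed):
    """Return the list of instruction indices executed, with or without delay slots."""
    trace = []
    pc = 0
    pending_target = None
    while pc < len(program):
        op, arg = program[pc]
        trace.append(pc)
        next_pc = pc + 1
        if pending_target is not None:
            # The delay-slot instruction has just run; the earlier jump now takes effect.
            next_pc, pending_target = pending_target, None
        elif op == "jump":
            if delayed:
                pending_target = arg   # transfer control after one more instruction
            else:
                next_pc = arg          # transfer control immediately
        pc = next_pc
    return trace


program = [("jump", 3), ("work", None), ("work", None), ("work", None)]
print(trace_execution(program, delayed=False))  # [0, 3]
print(trace_execution(program, delayed=True))   # [0, 1, 3]
```

With delayed branching, instruction 1 executes even though the jump at 0 is taken, so a compiler tries to move a useful instruction into that slot instead of a no-op.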
Flow of Control (cont.)
Procedure calls
• Facilitate modular programming
• Require two pieces of information to return
  - End of procedure
    * Pentium uses the ret instruction
    * MIPS uses the jr instruction
  - Return address
    * In a (special) register: MIPS allows any general-purpose register
    * On the stack: Pentium
Flow of Control (cont.)
[Figure: delay slot]
Parameter Passing
Two basic techniques
• Register-based (e.g., PowerPC, MIPS)
  - Internal registers are used
    * Faster
    * Limits the number of parameters
    * Complicates recursive procedures
• Stack-based (e.g., Pentium)
  - Stack is used
    * More general
2. Operand Types
• Instructions support basic data types
  - Characters
  - Integers
  - Floating-point
• Instruction overload
  - Same instruction for different data types
  - Example: Pentium
      mov AL,address    ; loads an 8-bit value
      mov AX,address    ; loads a 16-bit value
      mov EAX,address   ; loads a 32-bit value
Operand Types (cont.)
• Separate instructions
  - Instructions specify the operand size
  - Example: MIPS
      lb Rdest,address   ; loads a byte
      lh Rdest,address   ; loads a halfword (16 bits)
      lw Rdest,address   ; loads a word (32 bits)
      ld Rdest,address   ; loads a doubleword (64 bits)
    Similar instructions exist for store
3. Addressing Modes
• How the operands are specified
• Operands can be in three places
  - Registers
    * Register addressing mode
  - Part of the instruction
    * Constant
    * Immediate addressing mode
    * All processors support these two addressing modes
  - Memory
    * Difference between RISC and CISC
    * CISC supports a large variety of addressing modes
    * RISC follows the load/store architecture
4. Instruction Types
Several types of instructions
• Data movement
  - Pentium: mov dest,src
  - Some do not provide direct data movement instructions
  - Indirect data movement
      add Rdest,Rsrc,0   ; Rdest = Rsrc + 0
• Arithmetic and logical
  - Arithmetic
    * Integer and floating-point, signed and unsigned
    * add, subtract, multiply, divide
  - Logical
    * and, or, not, xor
Instruction Types (cont.)
• Condition code bits
  - S: Sign bit (0 = +, 1 = -)
  - Z: Zero bit (0 = nonzero, 1 = zero)
  - O: Overflow bit (0 = no overflow, 1 = overflow)
  - C: Carry bit (0 = no carry, 1 = carry)
• Example: Pentium
    cmp count,25   ; compare count to 25
                   ; (subtract 25 from count)
    je  target     ; jump if equal
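How a compare sets the four condition code bits can be sketched in Python. The function below models a subtraction on `bits`-wide two's-complement values; its name and the dictionary result are assumptions of this sketch, and the carry bit follows the x86 convention of signalling a borrow when the first operand is smaller as an unsigned number.

```python
def compare_flags(a, b, bits=8):
    """Return the S, Z, O, C condition bits produced by computing a - b."""
    mask = (1 << bits) - 1
    sign = 1 << (bits - 1)
    a &= mask
    b &= mask
    result = (a - b) & mask
    return {
        "S": int(bool(result & sign)),                   # result is negative
        "Z": int(result == 0),                           # operands were equal
        "O": int(bool((a ^ b) & (a ^ result) & sign)),   # signed overflow
        "C": int(a < b),                                 # unsigned borrow
    }


print(compare_flags(25, 25))   # Z is 1, so a following "jump if equal" is taken
print(compare_flags(10, 25))   # S and C are 1, Z is 0
```

The compare only updates the flags and discards the difference; the conditional jump that follows reads the flags, which is what separates testing from branching in the set-then-jump style.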
Instruction Types (cont.)
• Flow control and I/O instructions
  - Branch
  - Procedure call
  - Interrupts
• I/O instructions
  - Memory-mapped I/O
    * Most processors support memory-mapped I/O
    * No separate instructions for I/O
  - Isolated I/O
    * Pentium supports isolated I/O
    * Separate I/O instructions
        in  AX,io_port    ; read from an I/O port
        out io_port,AX    ; write to an I/O port
5. Instruction Formats
Two types
• Fixed-length
  - Used by RISC processors
  - 32-bit RISC processors use 32-bit-wide instructions
    * Examples: SPARC, MIPS, PowerPC
• Variable-length
  - Used by CISC processors
  - Memory operands need more bits to specify
• Opcode
  - Major and exact operation
Examples of Instruction Formats
RISC (Reduced Instruction Set Computer) versus CISC (Complex Instruction Set Computer)
RISC vs CISC
• The underlying philosophy of RISC machines is that a system is better able to manage program execution when the program consists of only a few different instructions that are the same length and require the same number of clock cycles to decode and execute
• RISC systems access memory only with explicit load and store instructions
• In CISC systems, many different kinds of instructions access memory, making instruction length variable and fetch-decode-execute time unpredictable
RISC vs CISC
• The difference between CISC and RISC becomes evident through the basic computer performance equation:
    time/program = (time/cycle) × (cycles/instruction) × (instructions/program)
• RISC systems shorten execution time by reducing the clock cycles per instruction
• CISC systems improve performance by reducing the number of instructions per program
RISC vs CISC
• The simple instruction set of RISC machines enables control units to be hardwired for maximum speed
• The more complex (and variable) instruction set of CISC machines requires microcode-based control units that interpret instructions as they are fetched from memory. This translation takes time
• With fixed-length instructions, RISC lends itself to pipelining and speculative execution
RISC vs CISC
Consider the following program fragments:
    CISC                  RISC
    mov ax, 10                   mov ax, 0
    mov bx, 5                    mov bx, 10
    mul bx, ax                   mov cx, 5
                          Begin: add ax, bx
                                 loop Begin
The total clock cycles for the CISC version might be:
    (2 movs × 1 cycle) + (1 mul × 30 cycles) = 32 cycles
While the clock cycles for the RISC version are:
    (3 movs × 1 cycle) + (5 adds × 1 cycle) + (5 loops × 1 cycle) = 13 cycles
With the RISC clock cycle also being shorter, RISC gives us much faster execution speeds
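The cycle counts for the two fragments can be reproduced with a short Python sketch. It executes the RISC loop (multiplying 10 by 5 through repeated addition) to count how many instructions of each kind run, then applies the per-instruction costs assumed in the slide's example: 1 cycle for mov, add and loop, and 30 cycles for mul. The function names are made up for this illustration.

```python
def run_risc_fragment():
    """Run the RISC fragment and count the executed instructions by kind."""
    ax, bx, cx = 0, 10, 5                 # mov ax, 0 / mov bx, 10 / mov cx, 5
    counts = {"mov": 3, "add": 0, "loop": 0}
    while True:
        ax += bx                          # Begin: add ax, bx
        counts["add"] += 1
        cx -= 1                           # loop Begin: decrement cx, repeat while cx != 0
        counts["loop"] += 1
        if cx == 0:
            break
    return ax, counts


def total_cycles(counts, cycles_per_instruction):
    """Sum (instruction count x cycles per instruction) over all instruction kinds."""
    return sum(n * cycles_per_instruction[op] for op, n in counts.items())


product, risc_counts = run_risc_fragment()
risc_cycles = total_cycles(risc_counts, {"mov": 1, "add": 1, "loop": 1})
cisc_cycles = total_cycles({"mov": 2, "mul": 1}, {"mov": 1, "mul": 30})
print(product, risc_cycles, cisc_cycles)  # 50 13 32
```

The RISC version executes 13 instructions against 3 for CISC, yet needs fewer cycles, which matches the performance equation: more instructions per program can be offset by fewer cycles per instruction.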
RISC vs CISC
• Because of their load-store ISAs, RISC architectures require a large number of CPU registers
• These registers provide fast access to data during sequential program execution
• They can also be employed to reduce the overhead typically caused by passing parameters to subprograms
• Instead of pulling parameters off of a stack, the subprogram is directed to use a subset of registers
RISC vs CISC
[Figure: overlapping register windows]
• This is how registers can be overlapped in a RISC system
• The current window pointer (CWP) points to the active register window
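Overlapping register windows can be modeled with a small Python class. The sizes used here (4 windows, each with 8 input, 8 local and 8 output registers, similar in spirit to SPARC) and all names are assumptions chosen for the sketch; window overflow and spilling to memory are not handled.

```python
class RegisterWindows:
    """Circular register file in which a caller's outputs are the callee's inputs."""

    def __init__(self, n_windows=4, n_in=8, n_local=8):
        self.n_windows = n_windows
        self.stride = n_in + n_local            # physical registers skipped per call
        self.out_base = n_in + n_local          # window-relative index of first output
        self.regs = [0] * (n_windows * self.stride)
        self.cwp = 0                            # current window pointer

    def _physical(self, r):
        # Window-relative register r maps into the circular physical file.
        return (self.cwp * self.stride + r) % len(self.regs)

    def read(self, r):
        return self.regs[self._physical(r)]

    def write(self, r, value):
        self.regs[self._physical(r)] = value

    def call(self):
        self.cwp = (self.cwp + 1) % self.n_windows

    def ret(self):
        self.cwp = (self.cwp - 1) % self.n_windows


windows = RegisterWindows()
windows.write(windows.out_base, 42)   # caller places an argument in its first output
windows.call()                        # moving the CWP exposes it to the callee
print(windows.read(0))                # callee sees 42 in its first input register
```

No data is copied on the call: only the CWP changes, which is how register windows reduce parameter-passing overhead.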
RISC vs CISC
• It is becoming increasingly difficult to distinguish RISC architectures from CISC architectures
• Some RISC systems provide more extravagant instruction sets than some CISC systems
• Some systems combine both approaches
• The following two slides summarize the characteristics that traditionally typify the differences between these two architectures
RISC vs CISC
    RISC                                          CISC
    Multiple register sets                        Single register set
    Three operands per instruction                One or two register operands per instruction
    Parameter passing through register windows    Parameter passing through memory
    Single-cycle instructions                     Multiple-cycle instructions
    Hardwired control                             Microprogrammed control
    Highly pipelined                              Less pipelined
    Simple instructions, few in number            Many complex instructions
    Fixed-length instructions                     Variable-length instructions
    Complexity in compiler                        Complexity in microcode
    Only LOAD/STORE instructions access memory    Many instructions can access memory
    Few addressing modes                          Many addressing modes
Summary
Instruction Set Design Issues
Instruction set design issues include:
• Where are operands stored?
  - registers, memory, stack, accumulator
• How many explicit operands are there?
  - 0, 1, 2, or 3
• How is the operand location specified?
  - register, immediate, indirect, ...
• What type and size of operands are supported?
  - byte, int, float, double, string, vector, ...
• What operations are supported?
  - add, sub, mul, move, compare, ...
More About General Purpose Registers
Why do almost all new architectures use GPRs?
• Registers are much faster than memory (even cache)
  - Register values are available immediately
  - When memory isn't ready, the processor must wait ("stall")
• Registers are convenient for variable storage
  - Compiler assigns some variables just to registers
  - More compact code, since small fields specify registers (compared to memory addresses)
[Figure: storage hierarchy: Processor with Registers, Cache, Memory, Disk]
What Operations are Needed
• Arithmetic and Logical
  - Integer arithmetic: ADD, SUB, MULT, DIV, SHIFT
  - Logical operations: AND, OR, XOR, NOT
• Data Transfer: copy, load, store
• Control: branch, jump, call, return
• Floating Point: ADDF, MULF, DIVF
  - Same as arithmetic, but usually take bigger operands
• Decimal: ADDD, CONVERT
• String: move, compare, search
• Graphics: pixel and vertex, compression/decompression operations
Stack Architecture: Pros and Cons
• Pros
  - Good code density (implicit top of stack)
  - Low hardware requirements
  - Easy to write a simpler compiler for stack architectures
• Cons
  - Stack becomes the bottleneck
  - Little ability for parallelism or pipelining
  - Data is not always at the top of the stack when needed, so additional instructions like TOP and SWAP are needed
  - Difficult to write an optimizing compiler for stack architectures
Accumulator Architecture: Pros and Cons
• Pros
  - Very low hardware requirements
  - Easy to design and understand
• Cons
  - Accumulator becomes the bottleneck
  - Little ability for parallelism or pipelining
  - High memory traffic
Memory-Memory Architecture: Pros and Cons
• Pros
  - Requires fewer instructions (especially if 3 operands)
  - Easy to write compilers for (especially if 3 operands)
• Cons
  - Very high memory traffic (especially if 3 operands)
  - Variable number of clocks per instruction
  - With two operands, more data movements are required
Memory-Register Architecture: Pros and Cons
• Pros
  - Some data can be accessed without loading first
  - Instruction format easy to encode
  - Good code density
• Cons
  - Operands are not equivalent (poor orthogonality)
  - Variable number of clocks per instruction
  - May limit number of registers
Interface Design
A good interface:
• Lasts through many implementations (portability, compatibility)
• Is used in many different ways (generality)
• Provides convenient functionality to higher levels
• Permits an efficient implementation at lower levels
Design decisions must take into account:
• Technology
• Machine organization
• Programming languages
• Compiler technology
• Operating systems
[Figure: one interface with several uses above it and several implementations (imp) below it, changing over time]
Classifying Instruction Set Architectures
Accumulator Architecture
• Common in early stored-program computers, when hardware was so expensive
• Machine has only one register (accumulator) involved in all math and logical operations
• All operations assume the accumulator as a source operand and a destination for the operation, with the other operand stored in memory
Extended Accumulator Architecture
• Dedicated registers for specific operations, e.g. stack and array index registers, are added
• The 8086 microprocessor is an example of such a special-purpose register architecture
General-Purpose Register Architecture
• MIPS is an example of such an architecture, where registers are not tied to a single role
• This type of instruction set can be further divided into:
  - Register-memory: allows one operand to be in memory
  - Register-register (load-store): demands all operands to be in registers

    Machine          # general-purpose registers   Architecture style                Year
    Motorola 6800    2                             Accumulator                       1974
    DEC VAX          16                            Register-memory, memory-memory    1977
    Intel 8086       1                             Extended accumulator              1978
    Motorola 68000   16                            Register-memory                   1980
    Intel 80386      8                             Register-memory                   1985
    PowerPC          32                            Load-store                        1992
    DEC Alpha        32                            Load-store                        1992
Compact Code and Stack Architectures
• When memory was scarce, machines like the Intel 80x86 had variable-length instructions to match varying operand specifications and minimize code size
• Stack machines abandoned registers altogether, arguing that it is hard for compilers to use them efficiently
• Operands are pushed onto a stack from memory, and the results have to be popped from the stack to memory
• Operations take their operands by default from the top of the stack and insert the results back onto the stack
• Stack machines simplify compilers and lend themselves to a compact instruction encoding, but limit compiler optimization (e.g. in math expressions)
• Example: A = B + C
    Push AddressC   # Top=Top+4; Stack[Top]=Memory[AddressC]
    Push AddressB   # Top=Top+4; Stack[Top]=Memory[AddressB]
    add             # Stack[Top-4]=Stack[Top]+Stack[Top-4]; Top=Top-4
    Pop  AddressA   # Memory[AddressA]=Stack[Top]; Top=Top-4
• Compact code is important for the heralded network computers, where programs must be downloaded over the Internet (e.g. Java-based applications)
Other Types of Architecture
High-Level-Language Architecture
• In the 1960s, systems software was rarely written in high-level languages, and virtually every commercial operating system before Unix was written in assembly
• Some people blamed the code density on the instruction set rather than on the programming language
• A machine design philosophy was advocated with the goal of making the hardware more like high-level languages
• The effectiveness of high-level languages, memory size limitations, and the lack of efficient compilers doomed this philosophy to a historical footnote
Reduced Instruction Set Architecture
• With recent developments in compiler technology and expanded memory sizes, fewer programmers are using assembly-level coding
• Instruction set architectures came to be measured by how well compilers, rather than programmers, use them
• RISC architecture favors simplifying hardware design over enriching the offered set of instructions, relying on compilers to use the simple instructions effectively to perform complex operations
• Virtually all new architectures since 1982 follow the RISC philosophy of fixed instruction lengths, load-store operations, and limited addressing modes
Evolution of Instruction Sets
Single Accumulator (EDSAC, 1950)
  Accumulator + Index Registers (Manchester Mark I, IBM 700 series, 1953)
    Separation of Programming Model from Implementation
      - High-level Language Based (B5000, 1963)
      - Concept of a Family (IBM 360, 1964)
        General Purpose Register Machines
          - Complex Instruction Sets (VAX, Intel 432, 1977-80)
          - Load/Store Architecture (CDC 6600, Cray 1, 1963-76)
            RISC (MIPS, SPARC, HP-PA, IBM RS6000, ... 1987)
Register-Memory Architectures
Effect of the number of memory operands:
Number of memory addresses | Max. number of operands | Examples
0 | 3 | SPARC, MIPS, PowerPC, ALPHA
1 | 2 | Intel 80x86, Motorola 68000
2 | 2 | VAX (also has 3-operand format)
3 | 3 | VAX (also has 2-operand format)
Interpreting Memory Addressing
• The address of a word matches the byte address of one of its 4 bytes
• The addresses of sequential words differ by 4 (the word size in bytes)
• Word addresses are multiples of 4 (alignment restriction)
• Machines that use the address of the leftmost byte as the word address are called "Big Endian"; those that use the rightmost byte are called "Little Endian"
• Misalignment complicates memory access and causes programs to run slower (some machines do not allow misaligned memory access at all)
• Byte ordering can be a problem when exchanging data among different machines
• Byte addressing affects array index calculation, which must account for word addressing and the offset within the word
Object addressed | Aligned at byte offsets | Misaligned at byte offsets
Byte | 0, 1, 2, 3, 4, 5, 6, 7 | Never
Half word | 0, 2, 4, 6 | 1, 3, 5, 7
Word | 0, 4 | 1, 2, 3, 5, 6, 7
Double word | 0 | 1, 2, 3, 4, 5, 6, 7
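Byte ordering and the alignment rule can be checked with a few lines of code. The sketch below uses Python's standard `struct` module; the helper names `word_bytes` and `is_aligned` and the sample value 0x0A0B0C0D are illustrative assumptions, not part of the slides.

```python
import struct


def word_bytes(value, big_endian):
    """Return the 4 bytes of a 32-bit word in memory order (lowest address first)."""
    fmt = ">I" if big_endian else "<I"
    return struct.pack(fmt, value)


def is_aligned(address, size):
    """An object of `size` bytes is aligned when its address is a multiple of size."""
    return address % size == 0


big = word_bytes(0x0A0B0C0D, big_endian=True)      # most significant byte first
little = word_bytes(0x0A0B0C0D, big_endian=False)  # least significant byte first
```

The same word yields reversed byte sequences on the two kinds of machine, which is why data exchanged between them needs an agreed byte order.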
Addressing Modes
• Addressing modes refer to how the location of an operand (its effective address) is specified
• Addressing modes have the ability to:
  - Significantly reduce instruction counts
  - Increase the average CPI
  - Increase the complexity of building a machine
• The VAX machine is used for benchmark data since it supports a wide range of memory addressing modes
• Common addressing modes can be classified based on:
  - the source of the data: register, immediate, or memory
  - the address calculation: direct or indirect
• An indexed addressing mode is usually provided to allow efficient implementation of loops and array access
Example of Addressing Modes
Addressing mode | Example | Meaning | When used
Register | Add R4, R3 | Regs[R4] ← Regs[R4] + Regs[R3] | When a value is in a register
Immediate | Add R4, #3 | Regs[R4] ← Regs[R4] + 3 | For constants
Displacement | Add R4, 100(R1) | Regs[R4] ← Regs[R4] + Mem[100 + Regs[R1]] | Accessing local variables
Register indirect | Add R4, (R1) | Regs[R4] ← Regs[R4] + Mem[Regs[R1]] | Accessing using a pointer or a computed address
Indexed | Add R4, (R1 + R2) | Regs[R4] ← Regs[R4] + Mem[Regs[R1] + Regs[R2]] | Sometimes useful in array addressing: R1 = base of the array; R2 = index amount
Direct or absolute | Add R4, (1001) | Regs[R4] ← Regs[R4] + Mem[1001] | Sometimes useful for accessing static data; address constant may need to be large
Memory indirect or memory deferred | Add R4, @(R3) | Regs[R4] ← Regs[R4] + Mem[Mem[Regs[R3]]] | If R3 is the address of the pointer p, then the mode yields *p
Autoincrement | Add R4, (R2)+ | Regs[R4] ← Regs[R4] + Mem[Regs[R2]]; Regs[R2] ← Regs[R2] + d | Useful for stepping through arrays within a loop. R2 points to the start of the array; each reference increments R2 by d
Autodecrement | Add R4, -(R2) | Regs[R2] ← Regs[R2] - d; Regs[R4] ← Regs[R4] + Mem[Regs[R2]] | Same use as autoincrement. Autodecrement/-increment can also act as push/pop to implement a stack
Scaled | Add R4, 100(R2)[R3] | Regs[R4] ← Regs[R4] + Mem[100 + Regs[R2] + Regs[R3] × d] | Used to index arrays
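The "Meaning" column of the addressing-mode table reduces to one effective-address formula per mode. The following sketch computes that address for the memory-referencing modes; the function name `effective_address`, its keyword parameters, and the register and memory contents are hypothetical, chosen only to make the formulas concrete.

```python
def effective_address(mode, regs, mem, *, base=None, index=None, disp=0, d=1):
    """Return the memory address an operand refers to under the given mode."""
    if mode == "displacement":        # Mem[disp + Regs[base]]
        return disp + regs[base]
    if mode == "register_indirect":   # Mem[Regs[base]]
        return regs[base]
    if mode == "indexed":             # Mem[Regs[base] + Regs[index]]
        return regs[base] + regs[index]
    if mode == "direct":              # Mem[disp]
        return disp
    if mode == "memory_indirect":     # Mem[Mem[Regs[base]]]
        return mem[regs[base]]
    if mode == "scaled":              # Mem[disp + Regs[base] + Regs[index] * d]
        return disp + regs[base] + regs[index] * d
    raise ValueError(f"unsupported mode: {mode}")


# Assumed machine state for illustration.
regs = {"R1": 200, "R2": 300, "R3": 4}
mem = {4: 640}

ea_disp = effective_address("displacement", regs, mem, base="R1", disp=100)
ea_scaled = effective_address("scaled", regs, mem, base="R2", index="R3", disp=100, d=4)
```

Register and immediate modes are left out because they do not touch memory, and the autoincrement/autodecrement modes would additionally update the base register by d.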
Addressing Mode for Signal Processing
• DSPs offer special addressing modes to better serve popular algorithms
• Special features require either hand coding or a compiler that uses such features (GNU would not be a good choice)
Modulo addressing
• Since DSPs deal with continuous data streams, circular buffers are widely used
• Circular or modulo addressing allows automatic increment and decrement, and resets the pointer when it reaches the end of the buffer
Reverse addressing
• The resulting address is the bit-reversed order of the current address
• Reverse addressing mode expedites the access, which otherwise requires a number of logical instructions or extra memory accesses
Fast Fourier Transform (index → bit-reversed index):
0 (000) → 0 (000)
1 (001) → 4 (100)
2 (010) → 2 (010)
3 (011) → 6 (110)
4 (100) → 1 (001)
5 (101) → 5 (101)
6 (110) → 3 (011)
7 (111) → 7 (111)
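Both DSP addressing modes are easy to state in software, which also shows what the hardware saves: without them each access costs several logical instructions. This is a minimal sketch; `bit_reverse` and `modulo_next` are names chosen here, and the buffer base and length in the usage lines are made-up values.

```python
def bit_reverse(index, bits):
    """Reverse the low `bits` bits of index (the FFT reverse-addressing order)."""
    result = 0
    for _ in range(bits):
        result = (result << 1) | (index & 1)  # shift the lowest bit of index into result
        index >>= 1
    return result


def modulo_next(pointer, step, base, length):
    """Advance a circular-buffer pointer, wrapping inside [base, base + length)."""
    return base + (pointer - base + step) % length


# Access order for an 8-point FFT (3 address bits).
order = [bit_reverse(i, 3) for i in range(8)]

# Stepping past the end of a 4-entry buffer that starts at address 100 wraps to its start.
wrapped = modulo_next(103, 1, base=100, length=4)
```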
Operations of the Computer Hardware
"There must certainly be instructions for performing the fundamental arithmetic operations."
Burks, Goldstine and von Neumann, 1947
• Assembly language is a symbolic representation of what the processor actually understands
• The MIPS assembler allows only one instruction per line and ignores comments, which follow # until the end of the line
Example: translation of a segment of a C program to MIPS assembly instructions
C:    f = (g + h) - (i + j)
MIPS:
  add t0, g, h    # temp variable t0 contains "g + h"
  add t1, i, j    # temp variable t1 contains "i + j"
  sub f, t0, t1   # f = t0 - t1 = (g + h) - (i + j)
Operations in the Instruction Set
Operator type | Examples
Arithmetic and logical | Integer arithmetic and logical operations: add, and, subtract, or
Data transfer | Loads/stores (move instructions on machines with memory addressing)
Control | Branch, jump, procedure call and return, trap
System | Operating system call, virtual memory management instructions
Floating point | Floating-point instructions: add, multiply
Decimal | Decimal add, decimal multiply, decimal-to-character conversion
String | String move, string compare, string search
Graphics | Pixel operations, compression/decompression operations
• Arithmetic, logical, data transfer, and control are almost standard categories for all machines
• System instructions are required for multi-programming environments, although support for system functions varies
• Decimal and string instructions can be primitives, e.g. on the IBM 360 and the VAX
• Support for floating point, decimal, string, and graphics is optional and sometimes provided via a co-processor
• Some machines rely on the compiler to synthesize special operations, such as string handling, from simpler instructions
Operations for Media & Signal Processing
• Single instruction multiple data (SIMD) and vector instructions are often supported in DSPs, which are commonly used in multimedia and signal processing applications
• Partitioned add (integer)
  - Performs multiple 16-bit additions on a 64-bit ALU, since most data are narrow
  - Increases ALU throughput for multimedia applications
• Paired single operations (float)
  - Allow the same register to act as two operands to the same operation
  - Handy in dealing with vertices and coordinates
• Multiply and accumulate
  - Very handy for calculating dot products of vectors (signal processing) and matrix multiplication
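A partitioned add and a multiply-accumulate can both be modeled in a few lines. The sketch below assumes four 16-bit lanes packed into one 64-bit word and wrap-around (not saturating) lane arithmetic; those choices, and the names `partitioned_add` and `multiply_accumulate`, are assumptions for illustration rather than the behavior of any particular DSP.

```python
LANES = 4
LANE_BITS = 16
LANE_MASK = (1 << LANE_BITS) - 1


def partitioned_add(a, b):
    """Add four independent 16-bit lanes packed into 64-bit words a and b."""
    result = 0
    for lane in range(LANES):
        shift = lane * LANE_BITS
        lane_sum = ((a >> shift) & LANE_MASK) + ((b >> shift) & LANE_MASK)
        # The carry out of a lane is dropped so it cannot spill into its neighbor.
        result |= (lane_sum & LANE_MASK) << shift
    return result


def multiply_accumulate(xs, ys):
    """Dot product built from repeated multiply-and-accumulate steps."""
    acc = 0
    for x, y in zip(xs, ys):
        acc += x * y
    return acc


packed = partitioned_add(0x0001_FFFF_0003_0004, 0x0001_0001_0001_0001)
dot = multiply_accumulate([1, 2, 3], [4, 5, 6])
```

In hardware the lane isolation comes from cutting the carry chain of the wide ALU, so all four additions complete in one operation.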
Frequency of Operations Usage
• Make the common case fast by focusing on these operations
• The most widely executed instructions are the simple operations of an instruction set
• The following is the average usage in SPECint92 on the Intel 80x86
Rank | 80x86 instruction | Integer average (% total executed)
1 | Load | 22%
2 | Conditional branch | 20%
3 | Compare | 16%
4 | Store | 12%
5 | Add | 8%
6 | And | 6%
7 | Sub | 5%
8 | Move register-register | 4%
9 | Call | 1%
10 | Return | 1%
Total | | 96%
Control Flow Instructions
• Jump for unconditional change in the control flow
• Branch for conditional change in the control flow
• Procedure calls and returns
Data is based on SPEC on Alpha
Destination Address Definition
• Addressing relative to the program counter proved to be the best choice for forward and backward branches or jumps (it is independent of the load address)
• To allow for dynamic loading of library routines, register indirect addressing allows target addresses to be loaded into special registers (e.g. virtual functions in C++ and system calls in a case statement)
Data is based on SPEC on Alpha
Condition Evaluation
• Compare-and-branch can be efficient if the majority of conditions are comparisons with zero
• Remember to focus on the common case
Based on SPEC on MIPS
Frequency of Types of Comparison
Data is based on SPEC on Alpha
• Different benchmarks and machines set new design priorities
• DSPs support a repeat instruction for "for" loops (vectors) using registers
Supporting Procedures
• Execution of a procedure follows these steps:
  - Store parameters in a place accessible to the procedure
  - Transfer control to the procedure
  - Acquire the storage resources needed for the procedure
  - Perform the desired task
  - Store the result value in a place accessible to the calling program
  - Return control to the point of origin
• The hardware provides a program counter to trace instruction flow and manage transfer of control
• Parameter passing
  - Registers can be used for passing a small number of parameters
  - A stack is used to spill registers of the current context, to make room for the called procedure to run, and to allow large parameters to be passed
• Storage of machine state can be performed by the caller or the callee
• Handling of shared variables is important to ensure correct semantics and thus requires clear specifications in the library interface
• Global variables stored in registers need careful handling
Type and Size of Operands
• The type of an operand is designated by encoding it in the instruction's operation code
• The type of an operand, e.g. single-precision float, effectively gives its size
• Common operand types include character, half-word and word-size integer, and single- and double-precision floating point
• Characters are almost always in ASCII, integers in 2's complement, and floating point in IEEE 754
• The 16-bit Unicode used in Java is gaining popularity due to its support for international character sets
• For business applications, some architectures support a decimal format in binary coded decimal (BCD)
• Depending on the size of the word, the complexity of handling different operand types differs
• DSPs offer fixed-point data types to support high-precision floating-point arithmetic and to allow sharing a single exponent among multiple numbers
• For graphics applications, vertex and pixel operands are added features
Size of Operands
Frequency of reference by size, based on SPEC2000 on Alpha
• The double-word data type is used for double-precision floating-point operations and for address storage in machines with a 64-bit wide address bus
• Words are used for integer operations and on machines with a 32-bit address bus
• Because of the mix in SPEC, word and double-word data types dominate
Instruction Representation
• Humans are taught to think in base 10 (decimal), but numbers may be represented in any base (123 in base 10 = 1111011 in binary, or base 2)
• Numbers are stored in computers as a series of high and low electronic signals (binary numbers)
• Binary digits are called bits and are considered the atom of computing
• Each piece of an instruction is a number, and placing these numbers together forms the instruction
• The assembler translates the symbolic assembly instructions into machine language instructions (machine code)
Example:
Assembly: add $t0, $s1, $s2
M/C language (decimal):  0      | 17     | 18     | 8      | 0      | 32
M/C language (binary):   000000 | 10001  | 10010  | 01000  | 00000  | 100000
                         6 bits | 5 bits | 5 bits | 5 bits | 5 bits | 6 bits
Note: the MIPS compiler by default maps $s0, ..., $s7 to registers 16-23 and $t0, ..., $t7 to registers 8-15
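Placing the field numbers side by side is just shifting and OR-ing. The sketch below packs the six R-format fields (6, 5, 5, 5, 5, and 6 bits wide) for add $t0, $s1, $s2 into a 32-bit word; the function name `encode_r_format` is an illustrative choice.

```python
def encode_r_format(op, rs, rt, rd, shamt, funct):
    """Pack the six R-format fields into a single 32-bit machine word."""
    return (op << 26) | (rs << 21) | (rt << 16) | (rd << 11) | (shamt << 6) | funct


# add $t0, $s1, $s2: $s1 = register 17, $s2 = register 18, $t0 = register 8
word = encode_r_format(op=0, rs=17, rt=18, rd=8, shamt=0, funct=32)
bits = format(word, "032b")  # the same word as a 32-character binary string
```

Splitting `bits` at positions 6, 11, 16, 21, and 26 recovers the binary row of the example.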
Encoding an Instruction Set
• Instruction encoding affects the size of the compiled program and the complexity of the CPU implementation
• The operation is typically specified in one field, called the opcode
• The addressing mode for an operand can be encoded with the operation, or specified through a separate identifier when a large number of modes is supported
• The architecture must balance several competing factors:
  - Desire to support as many registers and addressing modes as possible
  - Effect of operand specification on the size of the instruction (and program)
  - Desire to simplify instruction fetching and decoding during execution
• Fixed-size instruction encoding simplifies the CPU design while limiting the addressing modes supported
• An architect who cares about code size can use variable-size encoding
• A hybrid approach is to allow variability by supporting multiple instruction sizes
Encoding Examples
MIPS Instruction Format
Register-format instructions
  Fields: op (6 bits) | rs (5 bits) | rt (5 bits) | rd (5 bits) | shamt (5 bits) | funct (6 bits)
  op: Basic operation of the instruction, traditionally called the opcode
  rs: The first register source operand
  rt: The second register source operand
  rd: The register destination operand; it gets the result of the operation
  shamt: Shift amount
  funct: Selects the specific variant of the operation in the op field
Immediate-type instructions
  Fields: op (6 bits) | rs (5 bits) | rt (5 bits) | address (16 bits)
  • Some instructions need longer fields than the register format provides, e.g. for large constants
  • The 16-bit address means a load word instruction can load a word within a region of ±2^15 (32,768) bytes of the address in the base register
  Example: lw $t0, 32($s3)   # Temporary register $t0 gets A[8]
Instruction | Format | op | rs | rt | rd | shamt | funct | address
add | R | 0 | reg | reg | reg | 0 | 32 | N/A
sub | R | 0 | reg | reg | reg | 0 | 34 | N/A
lw | I | 35 | reg | reg | N/A | N/A | N/A | address
sw | I | 43 | reg | reg | N/A | N/A | N/A | address
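The immediate format can be sketched the same way as the register format, with the 16-bit address stored in two's complement. The code below assumes the load in the example is lw $t0, 32($s3), i.e. op = 35, base register $s3 = 19, target register $t0 = 8; the helper names `encode_i_format` and `decode_i_format` are hypothetical.

```python
def encode_i_format(op, rs, rt, address):
    """Pack op, rs, rt and a signed 16-bit address into a 32-bit word."""
    if not -(1 << 15) <= address < (1 << 15):
        raise ValueError("address offset does not fit in 16 bits")
    return (op << 26) | (rs << 21) | (rt << 16) | (address & 0xFFFF)


def decode_i_format(word):
    """Split a 32-bit I-format word back into (op, rs, rt, signed address)."""
    address = word & 0xFFFF
    if address & 0x8000:          # sign bit set: value is negative
        address -= 1 << 16
    return (word >> 26) & 0x3F, (word >> 21) & 0x1F, (word >> 16) & 0x1F, address


lw_word = encode_i_format(op=35, rs=19, rt=8, address=32)
fields = decode_i_format(lw_word)
```

The range check in `encode_i_format` is the ±2^15-byte limit on how far a load can reach from its base register.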
The Stored Program Concept
• Learning how instructions are represented leads to discovering the secret of computing: the stored-program concept
• Today's computers are built on two key principles:
  - Instructions are represented as numbers
  - Programs can be stored in memory to be read or written just like numbers
• The power of the concept: memory can contain
  - the source code for an editor
  - the compiled machine code for the editor
  - the text that the compiled program is using
  - the compiler that generated the code
[Figure: a processor connected to a memory holding an accounting program (machine code), an editor program (machine code), a C compiler (machine code), payroll data, book text, and source code in C for the editor program]
Compiling if-then-else in MIPS
Assuming the five variables f, g, h, i, and j correspond to the five registers $s0 through $s4, what is the compiled MIPS code for the following C if statement?
  if (i == j) f = g + h; else f = g - h;
[Flowchart: test i == j; if i = j, execute f = g + h; if i ≠ j, go to Else and execute f = g - h; both paths meet at Exit]
MIPS:
        bne $s3, $s4, Else     # go to Else if i ≠ j
        add $s0, $s1, $s2      # f = g + h (skipped if i ≠ j)
        j   Exit
  Else: sub $s0, $s1, $s2      # f = g - h (skipped if i = j)
  Exit:
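The control flow of the compiled sequence can be followed with a toy interpreter that keeps an explicit program counter. Everything here is an illustrative assumption: the tuple encoding of instructions, the pseudo-instruction "label", and the function name `run_if_else`; only the five-instruction program mirrors the bne/add/j/sub sequence of the slide.

```python
def run_if_else(regs):
    """Run the if-then-else sequence and return the final value of $s0 (f)."""
    program = [
        ("bne", "s3", "s4", "Else"),   # go to Else if i != j
        ("add", "s0", "s1", "s2"),     # f = g + h
        ("j", "Exit"),
        ("label", "Else"),
        ("sub", "s0", "s1", "s2"),     # f = g - h
        ("label", "Exit"),
    ]
    labels = {ins[1]: pc for pc, ins in enumerate(program) if ins[0] == "label"}
    pc = 0
    while pc < len(program):
        ins = program[pc]
        op = ins[0]
        if op == "bne" and regs[ins[1]] != regs[ins[2]]:
            pc = labels[ins[3]]        # branch taken
            continue
        if op == "j":
            pc = labels[ins[1]]        # unconditional jump
            continue
        if op == "add":
            regs[ins[1]] = regs[ins[2]] + regs[ins[3]]
        elif op == "sub":
            regs[ins[1]] = regs[ins[2]] - regs[ins[3]]
        pc += 1                        # fall through to the next instruction
    return regs["s0"]


equal_case = run_if_else({"s0": 0, "s1": 10, "s2": 3, "s3": 1, "s4": 1})
unequal_case = run_if_else({"s0": 0, "s1": 10, "s2": 3, "s3": 1, "s4": 2})
```

The compiler tests the opposite condition (bne rather than beq) so that the "then" part can fall through without an extra jump.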
Typical Compilation
Major Types of Optimization
Optimization name | Explanation | Frequency
High-level (at or near source level; machine-independent)
Procedure integration | Replace procedure call by procedure body | N.M.
Local (within straight-line code)
Common subexpression elimination | Replace two instances of the same computation by a single copy | 18%
Constant propagation | Replace all instances of a variable that is assigned a constant with the constant | 22%
Stack height reduction | Rearrange expression tree to minimize resources needed for expression evaluation | N.M.
Global (across a branch)
Global common subexpression elimination | Same as local, but this version crosses branches | 13%
Copy propagation | Replace all instances of a variable A that has been assigned X (i.e. A = X) with X | 11%
Code motion | Remove code from a loop that computes the same value in each iteration of the loop | 16%
Induction variable elimination | Simplify/eliminate array-addressing calculations within loops | 2%
Machine-dependent (depends on machine knowledge)
Strength reduction | Many examples, such as replacing multiply by a constant with adds and shifts | N.M.
Pipeline scheduling | Reorder instructions to improve pipeline performance | N.M.
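Strength reduction is the easiest of these optimizations to show in isolation. The sketch below rewrites a multiplication by the constant 10 as two shifts and an add, using 10x = 8x + 2x; the function names are illustrative, and a real compiler would apply the rewrite only when the target machine makes shifts and adds cheaper than a multiply.

```python
def times_ten_multiply(x):
    """Original form: one multiply by a constant."""
    return x * 10


def times_ten_shift_add(x):
    """Strength-reduced form: 10*x = (x << 3) + (x << 1) = 8*x + 2*x."""
    return (x << 3) + (x << 1)


same = all(times_ten_multiply(x) == times_ten_shift_add(x) for x in range(-50, 51))
```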
Effect of Compiler Optimization
Measurements taken on MIPS
[Chart: results by program and compiler optimization level]
Level 0: non-optimized code
Level 1: local optimization
Level 2: global optimization, s/w pipelining
Level 3: adds procedure integration
Compiler Support for Multimedia Instructions
• Intel's MMX and PowerPC AltiVec have small vector processing capabilities targeting multimedia applications (to speed up graphics)
• Intel added a new set of instructions called Streaming SIMD Extension
• A major advantage of vector computers is hiding the latency of memory access by loading multiple elements and then overlapping execution with data transfer
• Vector computers typically have strided and/or gather/scatter addressing to perform operations on distant memory locations
  - Strided addressing allows memory access in increments larger than one
  - Gather/scatter addressing is similar to register indirect mode, where the addresses are stored instead of the data
• Supporting vector operations without strided addressing, as in Intel's MMX, limits the potential speedup
• Such limited support for vector processing makes vectorizing compiler optimization unpopular and restricts its scope to hand-coded routines
SIMD instructions on MMX and AltiVec tend to be solutions, not primitives
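The two vector addressing schemes differ only in where the element addresses come from: a fixed stride or a stored list of addresses. The sketch below models memory as a Python list; `strided_load`, `gather`, and `scatter` are names chosen for illustration, and the memory contents are made up.

```python
def strided_load(memory, base, stride, count):
    """Load `count` elements starting at `base`, `stride` locations apart."""
    return [memory[base + i * stride] for i in range(count)]


def gather(memory, indices):
    """Load the elements whose addresses are listed in `indices`."""
    return [memory[i] for i in indices]


def scatter(memory, indices, values):
    """Store each value at the address listed at the same position in `indices`."""
    for i, value in zip(indices, values):
        memory[i] = value


memory = list(range(100, 116))              # 16 locations holding 100..115
column = strided_load(memory, base=1, stride=4, count=4)
picked = gather(memory, [3, 0, 9])
scatter(memory, [3, 0, 9], [-1, -2, -3])
```

With a 4x4 matrix stored row by row, a stride of 4 walks down one column, which is the kind of access a unit-stride-only extension cannot do in one instruction.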
Starting a Program
[Figure: C program → Compiler → Assembly language program → Assembler → Object: machine language module (together with Object: library routine, machine language) → Linker → Executable: machine language program → Loader → Memory]
Linker
  - Places code and data modules symbolically in memory
  - Determines the addresses of data and instruction labels
  - Patches both internal and external references
Object files for Unix typically contain:
  Header: size and position of components
  Text segment: machine code
  Data segment: static and dynamic variables
  Relocation info: identifies absolute memory references
  Symbol table: names and locations of labels, procedures, and variables
  Debugging info: mapping of source to object code, break points, etc.
Loading Executable Program
[Figure: memory layout, addresses in hex]
  $sp → 7fff ffff   Stack
                    Dynamic data
  $gp → 1000 8000   Static data (from 1000 0000)
   pc → 0040 0000   Text
        0           Reserved
To load an executable, the operating system follows these steps:
  - Reads the executable file header to determine the size of the text and data segments
  - Creates an address space large enough for the text and data
  - Copies the instructions and data from the executable file into memory
  - Copies the parameters (if any) to the main program onto the stack
  - Initializes the machine registers and sets the stack pointer to the first free location
  - Jumps to a start-up routine that copies the parameters into the argument registers and calls the main routine of the program
Instruction Set Design Issues
• Number of Addresses
• Flow of Control
• Operand Types
• Addressing Modes
• Instruction Types
• Instruction Formats
Number of Addresses
Four categories
• 3-address machines
  - 2 for the source operands and one for the result
• 2-address machines
  - One address doubles as source and result
• 1-address machines
  - Accumulator machines
  - Accumulator is used for one source and the result
• 0-address machines
  - Stack machines
  - Operands are taken from the stack
  - Result goes onto the stack
Number of Addresses (cont.)
Three-address machines
• Two for the source operands, one for the result
• RISC processors use three addresses
• Sample instructions
    add  dest,src1,src2   ; M(dest)=[src1]+[src2]
    sub  dest,src1,src2   ; M(dest)=[src1]-[src2]
    mult dest,src1,src2   ; M(dest)=[src1]*[src2]
Three addresses: Operand 1, Operand 2, Result
Example: a = b + c
Three-address instruction formats are not common, because they require a relatively long instruction format to hold the three address references
Number of Addresses (cont.)
Example
C statement:
    A = B + C * D - E + F + A
Equivalent code:
    mult T,C,D   ; T = C*D
    add  T,T,B   ; T = B+C*D
    sub  T,T,E   ; T = B+C*D-E
    add  T,T,F   ; T = B+C*D-E+F
    add  A,T,A   ; A = B+C*D-E+F+A
Number of Addresses (cont.)
Two-address machines
• One address doubles (as source operand and result)
• The last example makes a case for it
  - Address T is used twice
• Sample instructions
    load dest,src   ; M(dest)=[src]
    add  dest,src   ; M(dest)=[dest]+[src]
    sub  dest,src   ; M(dest)=[dest]-[src]
    mult dest,src   ; M(dest)=[dest]*[src]
Two addresses: one address doubles as operand and result
Example: a = a + b
The two-address format reduces the space requirement but also introduces some awkwardness. To avoid altering the value of an operand, a MOVE instruction is used to move one of the values to a result or temporary location before performing the operation.
Number of Addresses (cont.)
Example
C statement:
    A = B + C * D - E + F + A
Equivalent code:
    load T,C   ; T = C
    mult T,D   ; T = C*D
    add  T,B   ; T = B+C*D
    sub  T,E   ; T = B+C*D-E
    add  T,F   ; T = B+C*D-E+F
    add  A,T   ; A = B+C*D-E+F+A
Number of Addresses (cont.)
One-address machines
• Use a special set of registers called accumulators
  - Specify one source operand and receive the result
• Called accumulator machines
• Sample instructions
    load  addr   ; accum = [addr]
    store addr   ; M[addr] = accum
    add   addr   ; accum = accum + [addr]
    sub   addr   ; accum = accum - [addr]
    mult  addr   ; accum = accum * [addr]
Number of Addresses (cont.)
Example
C statement:
    A = B + C * D - E + F + A
Equivalent code:
    load  C   ; load C into accum
    mult  D   ; accum = C*D
    add   B   ; accum = C*D+B
    sub   E   ; accum = B+C*D-E
    add   F   ; accum = B+C*D-E+F
    add   A   ; accum = B+C*D-E+F+A
    store A   ; store accum contents in A
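The one-address program can be checked by running it on a simulated accumulator machine. In this sketch the function name `run_accumulator`, the tuple encoding of instructions, and the initial values of A through F are assumptions for illustration; the seven-instruction program is the one listed in the example.

```python
def run_accumulator(program, memory):
    """Execute (opcode, address) pairs on a machine with a single accumulator."""
    accum = 0
    for opcode, addr in program:
        if opcode == "load":
            accum = memory[addr]
        elif opcode == "store":
            memory[addr] = accum
        elif opcode == "add":
            accum = accum + memory[addr]
        elif opcode == "sub":
            accum = accum - memory[addr]
        elif opcode == "mult":
            accum = accum * memory[addr]
        else:
            raise ValueError(f"unknown opcode: {opcode}")
    return memory


# Hypothetical initial values for the six variables.
memory = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
program = [("load", "C"), ("mult", "D"), ("add", "B"), ("sub", "E"),
           ("add", "F"), ("add", "A"), ("store", "A")]
final = run_accumulator(program, memory)
expected = 2 + 3 * 4 - 5 + 6 + 1    # B + C*D - E + F + A with the values above
```

Seven short instructions are needed here, against five long ones on the three-address machine: fewer addresses per instruction trade instruction length for instruction count.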
Number of Addresses (cont.)
Zero-address machines
• Stack supplies operands and receives the result
  - Special instructions to load and store use an address
• Called stack machines (Ex: HP3000, Burroughs B5500)
• Sample instructions
    push addr   ; push([addr])
    pop  addr   ; pop([addr])
    add         ; push(pop + pop)
    sub         ; push(pop - pop)
    mult        ; push(pop * pop)
Number of Addresses (cont.)
Example
C statement:
    A = B + C * D - E + F + A
Equivalent code:
    push E
    push C
    push D
    mult
    push B
    add
    sub
    push F
    add
    push A
    add
    pop  A
Load/Store Architecture
• Instructions expect operands in internal processor registers
• Special LOAD and STORE instructions move data between registers and memory
• RISC uses this architecture
• Reduces instruction length
Load/Store Architecture (cont.)
Sample instructions
    load  Rd,addr      ; Rd = [addr]
    store addr,Rs      ; (addr) = Rs
    add   Rd,Rs1,Rs2   ; Rd = Rs1 + Rs2
    sub   Rd,Rs1,Rs2   ; Rd = Rs1 - Rs2
    mult  Rd,Rs1,Rs2   ; Rd = Rs1 * Rs2
Number of Addresses (cont.)
Example
C statement:
    A = B + C * D - E + F + A
Equivalent code:
    load  R1,B
    load  R2,C
    load  R3,D
    load  R4,E
    load  R5,F
    load  R6,A
    mult  R2,R2,R3
    add   R2,R2,R1
    sub   R2,R2,R4
    add   R2,R2,R5
    add   R2,R2,R6
    store A,R2
Flow of Control
• Default is sequential flow
• Several instructions alter this default execution
• Branches
  - Unconditional
  - Conditional
  - Delayed branches
• Procedure calls
  - Delayed procedure calls
Flow of Control (cont.)
Branches
• Unconditional
  - Absolute address
  - PC-relative
    » Target address is specified relative to PC contents
    » Relocatable code
• Example: MIPS
  - Absolute address
      j target
  - PC-relative
      b target
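The difference between the two target forms can be made concrete with the address arithmetic. The sketch below assumes the classic MIPS32 rules: a branch stores a signed word offset that is added to the address of the instruction after the branch, and a jump stores a 26-bit word address that replaces the low 28 bits of that address. The function names and sample addresses are illustrative.

```python
def pc_relative_target(pc, offset_words):
    """Branch target: address of the next instruction plus a signed word offset."""
    return (pc + 4 + (offset_words << 2)) & 0xFFFFFFFF


def jump_target(pc, address_field):
    """Jump target: top 4 bits of PC+4 combined with the 26-bit field times 4."""
    return ((pc + 4) & 0xF0000000) | ((address_field & 0x03FFFFFF) << 2)


forward = pc_relative_target(0x00400000, 3)      # skip ahead three instructions
backward = pc_relative_target(0x00400010, -5)    # loop back
moved = pc_relative_target(0x10000000, 3)        # same offset, code loaded elsewhere
absolute = jump_target(0x00400000, 0x00100004)
```

The same offset yields a correctly shifted target wherever the code is loaded, which is why PC-relative code is relocatable, while the absolute field would have to be patched.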
Flow of Control (cont.)
Branches
• Conditional
  - Jump is taken only if the condition is met
• Two types
  - Set-Then-Jump
    » Condition testing is separated from branching
    » Condition code registers are used to convey the condition test result
    » Condition code registers keep a record of the status of the last ALU operation, such as an overflow condition
  - Example: Pentium code
      cmp AX,BX    ; compare AX and BX
      je  target   ; jump if equal
Flow of Control (cont.)
  - Test-and-Jump
    » Single instruction performs condition testing and branching
  - Example: MIPS instruction
      beq Rsrc1,Rsrc2,target
    Jumps to target if Rsrc1 = Rsrc2
• Delayed branching
  - Control is transferred after executing the instruction that follows the branch instruction
    » This instruction slot is called the delay slot
  - Improves efficiency
  - Highly pipelined RISC processors support it
- -
Procedure calls Lacilitate modular programming
1eJuire to pieces of information to return
$ ltnd of procedure U Pentium
uses ret instruction
U MIPS
uses 9r instruction
$ 1eturn address U In a (special) register
MIPS allos any general$purpose register
U 5n the stac0
Pentium

Flow of Control (cont.)
[Figure: delay slot]

Parameter Passing
• Two basic techniques
  - Register-based (e.g., PowerPC, MIPS)
      * Internal registers are used
          Faster
          Limit the number of parameters
          Recursive procedure
  - Stack-based (e.g., Pentium)
      * Stack is used
          More general

2 Operand Types
• Instructions support basic data types
  - Characters
  - Integers
  - Floating-point
• Instruction overload
  - Same instruction for different data types
  - Example: Pentium
        mov  AL,address     ; loads an 8-bit value
        mov  AX,address     ; loads a 16-bit value
        mov  EAX,address    ; loads a 32-bit value

Operand Types (cont.)
• Separate instructions
  - Instructions specify the operand size
  - Example: MIPS
        lb   Rdest,address  ; loads a byte
        lh   Rdest,address  ; loads a halfword (16 bits)
        lw   Rdest,address  ; loads a word (32 bits)
        ld   Rdest,address  ; loads a doubleword (64 bits)
  - Similar instructions exist for store

3 Addressing Modes
• How the operands are specified
  - Operands can be in three places
      * Registers
          Register addressing mode
      * Part of instruction
          Constant
          Immediate addressing mode
          All processors support these two addressing modes
      * Memory
          Difference between RISC and CISC
          CISC supports a large variety of addressing modes
          RISC follows load/store architecture

4 Instruction Types
• Several types of instructions
  - Data movement
      * Pentium: mov dest,src
      * Some do not provide direct data movement instructions
      * Indirect data movement
            add  Rdest,Rsrc,0   ; Rdest = Rsrc+0
  - Arithmetic and Logical
      * Arithmetic
          Integer and floating-point, signed and unsigned
          add, subtract, multiply, divide
      * Logical
          and, or, not, xor

Instruction Types (cont.)
• Condition code bits
  - S: Sign bit (0 = +, 1 = -)
  - Z: Zero bit (0 = nonzero, 1 = zero)
  - O: Overflow bit (0 = no overflow, 1 = overflow)
  - C: Carry bit (0 = no carry, 1 = carry)
• Example: Pentium
        cmp  count,25   ; compare count to 25
                        ; (subtract 25 from count)
        je   target     ; jump if equal
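
The way a compare sets the condition code bits can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names compare_flags and je_taken are hypothetical, the word size is assumed to be 32 bits, and the real Pentium flags register carries more state than the four bits modeled here.

```python
def compare_flags(a, b, bits=32):
    """Return the S, Z, O, C bits produced by computing a - b, as a compare does.

    Hypothetical helper for illustration; assumes two's complement words.
    """
    mask = (1 << bits) - 1
    sign_bit = 1 << (bits - 1)
    a_u, b_u = a & mask, b & mask          # operands as unsigned words
    diff = (a_u - b_u) & mask              # result truncated to the word size
    sign_a, sign_b, sign_d = a_u & sign_bit, b_u & sign_bit, diff & sign_bit
    return {
        "S": int(sign_d != 0),                              # result is negative
        "Z": int(diff == 0),                                # result is zero
        "O": int(sign_a != sign_b and sign_d != sign_a),    # signed overflow
        "C": int(a_u < b_u),                                # borrow out of the subtraction
    }


def je_taken(flags):
    """A jump-if-equal is taken exactly when the zero bit is set."""
    return flags["Z"] == 1


equal_case = compare_flags(25, 25)
less_case = compare_flags(3, 25)
```

Here the branch decision reads only the recorded bits, which is the point of the set-then-jump style: the test and the jump are separate instructions linked by the condition code register.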

Instruction Types (cont.)
• Flow control and I/O instructions
  - Branch
  - Procedure call
  - Interrupts
• I/O instructions
  - Memory-mapped I/O
      * Most processors support memory-mapped I/O
      * No separate instructions for I/O
  - Isolated I/O
      * Pentium supports isolated I/O
      * Separate I/O instructions
            in   AX,port    ; read from an I/O port
            out  port,AX    ; write to an I/O port

5 Instruction Formats
• Two types
  - Fixed-length
      * Used by RISC processors
      * 32-bit RISC processors use 32-bit-wide instructions
          Examples: SPARC, MIPS, PowerPC
  - Variable-length
      * Used by CISC processors
      * Memory operands need more bits to specify
• Opcode
  - Major and exact operation

Examples of Instruction Formats

RISC (Reduced Instruction Set Computer)
versus
CISC (Complex Instruction Set Computer)

RISC vs CISC
• The underlying philosophy of RISC machines is that a system is better able to manage program execution when the program consists of only a few different instructions that are the same length and require the same number of clock cycles to decode and execute.
• RISC systems access memory only with explicit load and store instructions.
• In CISC systems, many different kinds of instructions access memory, making instruction length variable and fetch-decode-execute time unpredictable.

RISC vs CISC
• The difference between CISC and RISC becomes evident through the basic computer performance equation:
      time/program = (time/cycle) × (cycles/instruction) × (instructions/program)
• RISC systems shorten execution time by reducing the clock cycles per instruction.
• CISC systems improve performance by reducing the number of instructions per program.
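
The trade-off can be made concrete with the standard performance equation, execution time = instructions × cycles per instruction × cycle time. The sketch below is illustrative only: the function name and all machine figures are made-up assumptions, not measurements of any real processor.

```python
def execution_time(instruction_count, cycles_per_instruction, cycle_time_seconds):
    """Execution time = instructions/program × cycles/instruction × time/cycle."""
    return instruction_count * cycles_per_instruction * cycle_time_seconds


# Hypothetical figures: the CISC-style machine runs fewer instructions,
# the RISC-style machine runs more instructions at a lower CPI.
cisc_time = execution_time(instruction_count=1_000,
                           cycles_per_instruction=6,
                           cycle_time_seconds=2e-9)
risc_time = execution_time(instruction_count=2_500,
                           cycles_per_instruction=1,
                           cycle_time_seconds=1e-9)
```

With these assumed numbers the RISC-style machine executes 2.5 times as many instructions yet finishes sooner, because the other two factors shrink by more.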

RISC vs CISC
• The simple instruction set of RISC machines enables control units to be hardwired for maximum speed.
• The more complex (and variable) instruction set of CISC machines requires microcode-based control units that interpret instructions as they are fetched from memory. This translation takes time.
• With fixed-length instructions, RISC lends itself to pipelining and speculative execution.

RISC vs CISC
• Consider the following program fragments:
      CISC                      RISC
      mov ax, 10                mov ax, 0
      mov bx, 5                 mov bx, 10
      mul bx, ax                mov cx, 5
                         Begin: add ax, bx
                                loop Begin
• The total clock cycles for the CISC version might be:
      (2 movs × 1 cycle) + (1 mul × 30 cycles) = 32 cycles
• While the clock cycles for the RISC version are:
      (3 movs × 1 cycle) + (5 adds × 1 cycle) + (5 loops × 1 cycle) = 13 cycles
• With the RISC clock cycle being shorter, RISC gives us much faster execution speeds.
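
The cycle counts can be checked with a small Python sketch. It assumes, as labeled in the code, that the operands are 10 and 5, that a mul costs 30 cycles, and that every other instruction costs 1 cycle; the function names are hypothetical.

```python
def cisc_cycles(mov_count=2, mul_count=1, mov_cost=1, mul_cost=30):
    """Cycles for the CISC fragment: two movs and one multi-cycle mul (assumed costs)."""
    return mov_count * mov_cost + mul_count * mul_cost


def risc_cycles(iterations=5, mov_count=3, instruction_cost=1):
    """Cycles for the RISC fragment: three movs, then one add and one loop per iteration."""
    adds = iterations
    loops = iterations
    return (mov_count + adds + loops) * instruction_cost


def multiply_by_repeated_addition(multiplicand, multiplier):
    """What the RISC loop computes: add the multiplicand `multiplier` times."""
    ax, bx, cx = 0, multiplicand, multiplier
    while cx > 0:
        ax += bx      # add ax, bx
        cx -= 1       # loop Begin decrements the counter and repeats while nonzero
    return ax
```

Note that the RISC count grows with the multiplier, so the comparison depends on the operand values as well as on the assumed per-instruction costs.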

RISC vs CISC
• Because of their load-store ISAs, RISC architectures require a large number of CPU registers.
• These registers provide fast access to data during sequential program execution.
• They can also be employed to reduce the overhead typically caused by passing parameters to subprograms.
• Instead of pulling parameters off of a stack, the subprogram is directed to use a subset of registers.

RISC vs CISC
• This is how registers can be overlapped in a RISC system.
• The current window pointer (CWP) points to the active register window.
[Figure: overlapping register windows]

RISC vs CISC
• It is becoming increasingly difficult to distinguish RISC architectures from CISC architectures.
• Some RISC systems provide more extravagant instruction sets than some CISC systems.
• Some systems combine both approaches.
• The following two slides summarize the characteristics that traditionally typify the differences between these two architectures.

RISC vs CISC
RISC                                          CISC
Multiple register sets.                       Single register set.
Three operands per instruction.               One or two register operands per instruction.
Parameter passing through register windows.   Parameter passing through memory.
Single-cycle instructions.                    Multiple-cycle instructions.
Hardwired control.                            Microprogrammed control.
Highly pipelined.                             Less pipelined.

RISC vs CISC (continued)
RISC                                          CISC
Simple instructions, few in number.           Many complex instructions.
Fixed-length instructions.                    Variable-length instructions.
Complexity in compiler.                       Complexity in microcode.
Only LOAD/STORE instructions access memory.   Many instructions can access memory.
Few addressing modes.                         Many addressing modes.

RISC vs CISC
Summary

Instruction Set Design Issues
• Instruction set design issues include:
  - Where are operands stored?
      * registers, memory, stack, accumulator
  - How many explicit operands are there?
      * 0, 1, 2, or 3
  - How is the operand location specified?
      * register, immediate, indirect, ...
  - What type and size of operands are supported?
      * byte, int, float, double, string, vector, ...
  - What operations are supported?
      * add, sub, mul, move, compare, ...

More About General Purpose Registers
• Why do almost all new architectures use GPRs?
  - Registers are much faster than memory (even cache)
      * Register values are available immediately
      * When memory isn't ready, the processor must wait ("stall")
  - Registers are convenient for variable storage
      * Compiler assigns some variables just to registers
      * More compact code, since small fields specify registers (compared to memory addresses)
[Figure: storage hierarchy: Processor (Registers), Cache, Memory, Disk]

What Operations are Needed
• Arithmetic + Logical
  - Integer arithmetic: ADD, SUB, MULT, DIV, SHIFT
  - Logical operation: AND, OR, XOR, NOT
• Data Transfer - copy, load, store
• Control - branch, jump, call, return
• Floating Point - ADDF, MULF, DIVF
  - Same as arithmetic but usually take bigger operands
• Decimal - ADDD, CONVERT
• String - move, compare, search
• Graphics - pixel and vertex, compression/decompression operations

Stack Architecture: Pros and Cons
• Pros
  - Good code density (implicit top of stack)
  - Low hardware requirements
  - Easy to write a simpler compiler for stack architectures
• Cons
  - Stack becomes the bottleneck
  - Little ability for parallelism or pipelining
  - Data is not always at the top of the stack when needed, so additional instructions like TOP and SWAP are needed
  - Difficult to write an optimizing compiler for stack architectures

Accumulator Architecture: Pros and Cons
• Pros
  - Very low hardware requirements
  - Easy to design and understand
• Cons
  - Accumulator becomes the bottleneck
  - Little ability for parallelism or pipelining
  - High memory traffic

Memory-Memory Architecture: Pros and Cons
• Pros
  - Requires fewer instructions (especially if 3 operands)
  - Easy to write compilers for (especially if 3 operands)
• Cons
  - Very high memory traffic (especially if 3 operands)
  - Variable number of clocks per instruction
  - With two operands, more data movements are required

Memory-Register Architecture: Pros and Cons
• Pros
  - Some data can be accessed without loading first
  - Instruction format easy to encode
  - Good code density
• Cons
  - Operands are not equivalent (poor orthogonality)
  - Variable number of clocks per instruction
  - May limit number of registers

Classifying Instruction Set Architectures
Accumulator Architecture
• Common in early stored-program computers, when hardware was so expensive
• Machine has only one register (accumulator) involved in all math and logical operations
• All operations assume the accumulator as a source operand and a destination for the operation, with the other operand stored in memory
Extended Accumulator Architecture
• Dedicated registers for specific operations, e.g. stack and array index registers, added
• The 8086 microprocessor is an example of such a special-purpose register architecture
General-Purpose Register Architecture
• MIPS is an example of such an architecture, where registers are not tied to a single role
• This type of instruction set can be further divided into:
  - Register-memory: allows for one operand to be in memory
  - Register-register (load-store): demands all operands to be in registers

Machine           # general-purpose registers   Architecture style               Year
Motorola 6800                                   Accumulator                      1974
DEC VAX                                         Register-memory, memory-memory   1977
Intel 8086                                      Extended accumulator             1978
Motorola 68000                                  Register-memory                  1980
Intel 80386                                     Register-memory                  1985
PowerPC                                         Load-store
DEC Alpha                                       Load-store

Compact Code and Stack Architectures
• When memory is scarce, machines like the Intel 80x86 had variable-length instructions to match varying operand specifications and minimize code size
• Stack machines abandoned registers altogether, arguing that it is hard for compilers to use them efficiently
• Operands are to be pushed on a stack from memory, and the results have to be popped from the stack to memory
• Operations take their operands by default from the top of the stack and insert the results back onto the stack
• Stack machines simplify compilers and lend themselves to a compact instruction encoding, but limit compiler optimization (e.g. in math expressions)
• Example: A = B + C
      Push AddressC   # Top=Top+4; Stack[Top]=Memory[AddressC]
      Push AddressB   # Top=Top+4; Stack[Top]=Memory[AddressB]
      add             # Stack[Top-4]=Stack[Top]+Stack[Top-4]; Top=Top-4
      Pop  AddressA   # Memory[AddressA]=Stack[Top]; Top=Top-4
• Compact code is important for heralded network computers, where programs must be downloaded over the Internet (e.g. Java-based applications)
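
The push/add/pop sequence for A = B + C can be traced with a tiny stack-machine sketch in Python. It is a simplification under stated assumptions: memory is modeled as a dictionary keyed by variable name instead of byte addresses, a Python list stands in for the hardware stack, and run_stack_program is a hypothetical name.

```python
def run_stack_program(program, memory):
    """Execute push/add/pop instructions against a dictionary acting as memory."""
    stack = []
    for opcode, *operands in program:
        if opcode == "push":                 # copy a memory operand onto the stack
            stack.append(memory[operands[0]])
        elif opcode == "add":                # operands are implicit: the top two entries
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)
        elif opcode == "pop":                # store the top of stack back to memory
            memory[operands[0]] = stack.pop()
        else:
            raise ValueError(f"unknown opcode: {opcode}")
    return memory


program = [("push", "C"), ("push", "B"), ("add",), ("pop", "A")]
result = run_stack_program(program, {"A": 0, "B": 7, "C": 5})
```

Only push and pop name an address; add names nothing at all, which is where the compact encoding of stack machines comes from.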

Other Types of Architecture
High-Level-Language Architecture
• In the 1960s, systems software was rarely written in high-level languages, and virtually every commercial operating system before Unix was written in assembly
• Some people blamed the code density on the instruction set rather than the programming language
• A machine design philosophy was advocated with the goal of making the hardware more like high-level languages
• The effectiveness of high-level languages, memory size limitations and the lack of efficient compilers doomed this philosophy to a historical footnote
Reduced Instruction Set Architecture
• With the recent development in compiler technology and expanded memory sizes, fewer programmers are using assembly-level coding
• Instruction set architecture came to be measured by how compilers, rather than programmers, use it
• RISC architecture favors simplifying hardware design over enriching the offered set of instructions, relying on compilers to use them effectively to perform complex operations
• Virtually all new architectures since the 1980s follow the RISC philosophy of fixed instruction lengths, load-store operations, and limited addressing modes

Evolution of Instruction Sets
[Figure: evolution of instruction sets]
Single Accumulator (EDSAC, 1950)
  -> Accumulator + Index Registers (Manchester Mark I, IBM 700 series, 1953)
  -> Separation of Programming Model from Implementation
       High-level Language Based (B5000, 1963)
       Concept of a Family (IBM 360, 1964)
  -> General Purpose Register Machines
       Complex Instruction Sets (Vax, Intel 432, 1977-80)
       Load/Store Architecture (CDC 6600, Cray 1, 1963-76)
  -> RISC (Mips, Sparc, HP-PA, IBM RS6000, ... 1987)

Register-Memory Architectures
Number of memory addresses   Max. number of operands   Examples
0                            3                         SPARC, MIPS, PowerPC, ALPHA
1                            2                         Intel 80x86, Motorola 68000
2                            2                         VAX (also has 3 operands format)
3                            3                         VAX (also has 2 operands format)
Effect of the number of memory operands

Interpreting Memory Addressing
• The address of a word matches the byte address of one of its 4 bytes
• The addresses of sequential words differ by 4 (word size in bytes)
• Word addresses are multiples of 4 (alignment restriction)
• Machines that use the address of the leftmost byte as the word address are called "Big Endian", and those that use the rightmost byte are called "Little Endian"
• Misalignment complicates memory access and causes programs to run slower (some machines do not allow misaligned memory access at all)
• Byte ordering can be a problem when exchanging data among different machines
• Byte addresses affect array index calculation, to account for word addressing and the offset within the word

Object addressed   Aligned at byte offsets   Misaligned at byte offsets
Byte               0,1,2,3,4,5,6,7           Never
Half word          0,2,4,6                   1,3,5,7
Word               0,4                       1,2,3,5,6,7
Double word        0                         1,2,3,4,5,6,7
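
The alignment rule and the two byte orders can be checked with a short Python sketch using the standard struct module. The helper names is_aligned and misaligned_offsets are hypothetical, and the 8-byte span simply mirrors the offsets 0 to 7 used in the table.

```python
import struct


def is_aligned(address, size):
    """An object of `size` bytes is aligned when its address is a multiple of its size."""
    return address % size == 0


def misaligned_offsets(size, span=8):
    """Byte offsets within `span` bytes at which an object of `size` bytes is misaligned."""
    return [offset for offset in range(span) if not is_aligned(offset, size)]


# The same 32-bit value laid out in the two byte orders.
value = 0x0A0B0C0D
big_endian_bytes = struct.pack(">I", value)      # most significant byte at the lowest address
little_endian_bytes = struct.pack("<I", value)   # least significant byte at the lowest address
```

Exchanging the four bytes between machines of different endianness without conversion would turn 0x0A0B0C0D into 0x0D0C0B0A, which is the data-exchange problem noted above.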

Addressing Modes
• Addressing modes refer to how to specify the location of an operand (effective address)
• Addressing modes have the ability to:
  - Significantly reduce instruction counts
  - Increase the average CPI
  - Increase the complexity of building a machine
• The VAX machine is used for benchmark data since it supports a wide range of memory addressing modes
• Famous addressing modes can be classified based on:
  - the source of the data, into register, immediate or memory
  - the address calculation, into direct and indirect
• An indexed addressing mode is usually provided to allow efficient implementation of loops and array access

Example of Addressing Modes
Each entry gives the addressing mode, an example instruction, its meaning, and when it is used.
• Register
    Example:   Add R4, R3
    Meaning:   Regs[R4] ← Regs[R4] + Regs[R3]
    When used: When a value is in a register
• Immediate
    Example:   Add R4, #3
    Meaning:   Regs[R4] ← Regs[R4] + 3
    When used: For constants
• Displacement
    Example:   Add R4, 100(R1)
    Meaning:   Regs[R4] ← Regs[R4] + Mem[100 + Regs[R1]]
    When used: Accessing local variables
• Register indirect
    Example:   Add R4, (R1)
    Meaning:   Regs[R4] ← Regs[R4] + Mem[Regs[R1]]
    When used: Accessing using a pointer or a computed address
• Indexed
    Example:   Add R4, (R1 + R2)
    Meaning:   Regs[R4] ← Regs[R4] + Mem[Regs[R1] + Regs[R2]]
    When used: Sometimes useful in array addressing: R1 = base of the array, R2 = index amount
• Direct or absolute
    Example:   Add R4, (1001)
    Meaning:   Regs[R4] ← Regs[R4] + Mem[1001]
    When used: Sometimes useful for accessing static data; address constant may need to be large
• Memory indirect or memory deferred
    Example:   Add R4, @(R3)
    Meaning:   Regs[R4] ← Regs[R4] + Mem[Mem[Regs[R3]]]
    When used: If R3 is the address of the pointer p, then the mode yields *p
• Autoincrement
    Example:   Add R4, (R2)+
    Meaning:   Regs[R4] ← Regs[R4] + Mem[Regs[R2]]
               Regs[R2] ← Regs[R2] + d
    When used: Useful for stepping through arrays within a loop. R2 points to the start of the array; each reference increments R2 by d
• Autodecrement
    Example:   Add R4, -(R2)
    Meaning:   Regs[R2] ← Regs[R2] - d
               Regs[R4] ← Regs[R4] + Mem[Regs[R2]]
    When used: Same use as autoincrement. Autodecrement/increment can also act as push/pop to implement a stack
• Scaled
    Example:   Add R4, 100(R2)[R3]
    Meaning:   Regs[R4] ← Regs[R4] + Mem[100 + Regs[R2] + Regs[R3] × d]
    When used: Used to index arrays
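
How each mode locates its operand can be sketched as a small Python function. This is an illustration only: fetch_operand is a hypothetical name, registers and memory are plain dictionaries, the register and memory contents are made up, and the autoincrement and autodecrement modes are left out because they also modify a register.

```python
def fetch_operand(mode, regs, mem, *, reg=None, base=None, index=None, const=0, d=4):
    """Return the source operand selected by an addressing mode (sketch)."""
    if mode == "register":
        return regs[reg]
    if mode == "immediate":
        return const
    if mode == "displacement":
        return mem[const + regs[base]]
    if mode == "register_indirect":
        return mem[regs[base]]
    if mode == "indexed":
        return mem[regs[base] + regs[index]]
    if mode == "direct":
        return mem[const]
    if mode == "memory_indirect":
        return mem[mem[regs[base]]]
    if mode == "scaled":
        return mem[const + regs[base] + regs[index] * d]
    raise ValueError(f"unsupported mode: {mode}")


# Made-up machine state: register contents and a sparse memory keyed by address.
regs = {"R1": 8, "R2": 4, "R3": 2, "R4": 10}
mem = {8: 100, 12: 7, 16: 99, 100: 55}
```

For example, fetch_operand("memory_indirect", regs, mem, base="R1") reads memory twice: once to get the pointer stored at address 8, and once to follow it.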

Addressing Mode for Signal Processing
• DSP offers special addressing modes to better serve popular algorithms
• Modulo addressing
  - Since DSP deals with continuous data streams, circular buffers are widely used
  - Circular or modulo addressing allows automatic increment and decrement, and resets the pointer when reaching the end of the buffer
• Reverse addressing
  - The resulting address is the reverse bit order of the current address
  - Reverse addressing mode expedites the access, which otherwise requires a number of logical instructions or extra memory accesses
• Special features require either hand coding or a compiler that uses such features (GNU would not be a good choice)

Fast Fourier Transform (bit-reversed addressing)
0 (000) -> 0 (000)
1 (001) -> 4 (100)
2 (010) -> 2 (010)
3 (011) -> 6 (110)
4 (100) -> 1 (001)
5 (101) -> 5 (101)
6 (110) -> 3 (011)
7 (111) -> 7 (111)
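
Both DSP addressing modes are simple to express in software, which also shows the work the hardware saves. The Python sketch below uses hypothetical helper names (bit_reverse, next_circular) and assumes a 3-bit index for the 8-point FFT case.

```python
def bit_reverse(index, bits):
    """Reverse the lowest `bits` bits of `index`, as bit-reversed addressing does."""
    reversed_index = 0
    for _ in range(bits):
        reversed_index = (reversed_index << 1) | (index & 1)
        index >>= 1
    return reversed_index


def next_circular(pointer, step, start, length):
    """Advance `pointer` by `step` inside the buffer [start, start + length), wrapping around."""
    return start + (pointer - start + step) % length


fft_order = [bit_reverse(i, 3) for i in range(8)]
```

In software each access costs a loop of shifts or a modulo operation; a DSP folds the same computation into the address generation of a single load or store.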

Operations of the Computer Hardware
"There must certainly be instructions for performing the fundamental arithmetic operations."
      Burks, Goldstine and von Neumann, 1947
• Assembly language is a symbolic representation of what the processor actually understands
• The MIPS assembler allows only one instruction per line and ignores comments following # until the end of the line
• Example: translation of a segment of a C program to MIPS assembly instructions
  C:
      f = (g + h) - (i + j)
  MIPS:
      add t0, g, h     # temp variable t0 contains "g + h"
      add t1, i, j     # temp variable t1 contains "i + j"
      sub f, t0, t1    # f = t0 - t1 = (g + h) - (i + j)

Operations in the Instruction Set
Operator type           Examples
Arithmetic and logical  Integer arithmetic and logical operations: add, and, subtract, or
Data transfer           Loads-stores (move instructions on machines with memory addressing)
Control                 Branch, jump, procedure call and return, trap
System                  Operating system call, virtual memory management instructions
Floating point          Floating point instructions: add, multiply
Decimal                 Decimal add, decimal multiply, decimal to character conversion
String                  String move, string compare, string search
Graphics                Pixel operations, compression/decompression operations

• Arithmetic, logical, data transfer and control are almost standard categories for all machines
• System instructions are required for multi-programming environments, although support for system functions varies
• Decimal and string instructions can be primitives, e.g. on the IBM 360 and the VAX
• Support for floating point, decimal, string and graphics is optional and is sometimes provided via a co-processor
• Some machines rely on the compiler to synthesize special operations, such as string handling, from simpler instructions

Operations for Media & Signal Processing
• Single instruction multiple data (SIMD) and vector instructions are often supported in DSPs, which are commonly used in multimedia and signal processing applications
• Partitioned Add (integer)
  - Performs multiple 16-bit additions on a 64-bit ALU, since most data are narrow
  - Increases ALU throughput for multimedia applications
• Paired single operations (float)
  - Allow the same register to act as two operands to the same operation
  - Handy in dealing with vertices and coordinates
• Multiply and accumulate
  - Very handy for calculating dot products of vectors (signal processing) and matrix multiplication
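
A partitioned add can be sketched in Python by treating a 64-bit integer as four independent 16-bit lanes, and multiply-accumulate as a running sum of products. The lane width, the function names, and the wrap-around (non-saturating) behavior are assumptions for illustration; real media instruction sets often offer saturating variants as well.

```python
LANE_BITS = 16
LANE_MASK = (1 << LANE_BITS) - 1


def partitioned_add16(x, y):
    """Add four 16-bit lanes packed in 64-bit words; carries never cross lane boundaries."""
    result = 0
    for lane in range(4):
        shift = lane * LANE_BITS
        a = (x >> shift) & LANE_MASK
        b = (y >> shift) & LANE_MASK
        result |= ((a + b) & LANE_MASK) << shift    # wrap around within the lane
    return result


def multiply_accumulate(xs, ys):
    """Dot product computed as repeated acc = acc + x * y steps."""
    accumulator = 0
    for x, y in zip(xs, ys):
        accumulator += x * y
    return accumulator


packed_sum = partitioned_add16(0x0001_FFFF_0003_0004, 0x0001_0001_0001_0001)
```

The second lane from the right wraps from 0xFFFF to 0x0000 without disturbing its neighbor, which is exactly what distinguishes a partitioned add from an ordinary 64-bit add.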

Frequency of Operations Usage
• Make the common case fast by focusing on these operations
• The most widely executed instructions are the simple operations of an instruction set
• The following is the average usage in SPECint92 on the Intel 80x86

Rank    80x86 instruction         Integer average (% total executed)
1       Load                      22%
2       Conditional branch        20%
3       Compare                   16%
4       Store                     12%
5       Add                       8%
6       And                       6%
7       Sub                       5%
8       Move register-register    4%
9       Call                      1%
10      Return                    1%
        Total                     96%

Control Flow Instructions
• Jump: for unconditional change in the control flow
• Branch: for conditional change in the control flow
• Procedure calls and returns
[Figure: frequency of control flow instruction types. Data is based on SPEC on Alpha]

Destination Address Definition
• Relative addressing with respect to the program counter proved to be the best choice for forward and backward branching or jumps (load address independent)
• To allow for dynamic loading of library routines, register indirect addressing allows addresses to be loaded in special registers (e.g. virtual functions in C++ and system calls in a case statement)
[Figure: branch distances. Data is based on SPEC on Alpha]

Condition Evaluation
• Compare-and-branch can be efficient if the majority of conditions are comparisons with zero
• Remember to focus on the common case
[Figure: based on SPEC on MIPS]

Frequency of Types of Comparison
[Figure: frequency of comparison types. Data is based on SPEC on Alpha]
• Different benchmarks and machines set new design priorities
• DSPs support a repeat instruction for for-loops (vectors) using registers

Supporting Procedures
• Execution of a procedure follows these steps:
  - Store parameters in a place accessible to the procedure
  - Transfer control to the procedure
  - Acquire the storage resources needed for the procedure
  - Perform the desired task
  - Store the result value in a place accessible to the calling program
  - Return control to the point of origin
• The hardware provides a program counter to trace instruction flow and manage transfer of control
• Parameter passing
  - Registers can be used for passing a small number of parameters
  - A stack is used to spill registers of the current context and make room for the called procedure to run, and to allow for large parameters to be passed
• Storage of machine state can be performed by the caller or the callee
• Handling of shared variables is important to ensure correct semantics and thus requires clear specifications in the library interface
• Global variables stored in registers need careful handling

Type and Size of Operands
• The type of an operand is designated by encoding it in the instruction's operation code
• The type of an operand, e.g. single-precision float, effectively gives its size
• Common operand types include character, half-word and word-size integer, and single- and double-precision floating point
• Characters are almost always in ASCII, integers are in 2's complement, and floating point is in IEEE 754
• The 16-bit Unicode, used in Java, is gaining popularity due to its support for international character sets
• For business applications, some architectures support a decimal format in binary coded decimal (BCD)
• Depending on the size of the word, the complexity of handling different operand types differs
• DSP offers fixed-point data types to support high-precision floating point arithmetic and to allow sharing a single exponent among multiple numbers
• For graphics applications, vertex and pixel operands are added features

Size of Operands
• The double-word data type is used for double-precision floating point operations and for address storage in machines with a 64-bit-wide address bus
• Words are used for integer operations and for 32-bit address bus machines
• Because of the mix in SPEC, word and double-word data types dominate
[Figure: frequency of reference by size, based on SPEC2000 on Alpha]

Instruction Representation
• Humans are taught to think in base 10 (decimal), but numbers may be represented in any base (123 in base 10 = 1111011 in binary, or base 2)
• Numbers are stored in computers as a series of high and low electronic signals (binary numbers)
• Binary digits are called bits and are considered the atom of computing
• Each piece of an instruction is a number, and placing these numbers together forms the instruction
• The assembler translates the symbolic assembly instructions into machine language instructions (machine code)
• Example:
  Assembly:                 add $t0, $s1, $s2
  M/C language (decimal):   0        17       18       8        0        32
  M/C language (binary):    000000   10001    10010    01000    00000    100000
                            6 bits   5 bits   5 bits   5 bits   5 bits   6 bits
• Note: the MIPS compiler by default maps $s0,...,$s7 to registers 16-23 and $t0,...,$t7 to registers 8-15
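
Placing the field numbers together is a matter of shifting each field into position. The Python sketch below packs the six fields of a register-format MIPS instruction into one 32-bit word; encode_r_format is a hypothetical name, and the field values are those of add $t0, $s1, $s2 (op 0, rs 17, rt 18, rd 8, shamt 0, funct 32).

```python
def encode_r_format(op, rs, rt, rd, shamt, funct):
    """Pack the 6/5/5/5/5/6-bit fields of a register-format instruction into 32 bits."""
    return (op << 26) | (rs << 21) | (rt << 16) | (rd << 11) | (shamt << 6) | funct


# add $t0, $s1, $s2 with $s1 = register 17, $s2 = register 18, $t0 = register 8
add_word = encode_r_format(op=0, rs=17, rt=18, rd=8, shamt=0, funct=32)
add_bits = format(add_word, "032b")
```

Splitting add_bits into groups of 6, 5, 5, 5, 5 and 6 bits gives back the six binary fields of the instruction.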

Encoding an Instruction Set
• Instruction encoding affects the size of the compiled program and the complexity of the CPU implementation
• The operation is typically specified in one field, called the opcode
• The addressing mode for the operand can be encoded with the operation, or specified through a separate identifier in case of a large number of supported modes
• The architecture must balance between several competing factors:
  - Desire to support as many registers and addressing modes as possible
  - Effect of operand specification on the size of the instruction (program)
  - Desire to simplify instruction fetching and decoding during execution
• Fixed-size instruction encoding simplifies the CPU design while limiting the addressing modes supported
• An architect caring about code size can use variable-size encoding
• A hybrid approach is to allow variability by supporting multiple-sized instructions
Encoding Examples
MIPS Instruction format
Register-format instructions
op: Basic operation of the instruction, traditionally called the opcode
rs: The first register source operand
rt: The second register source operand
rd: The register destination operand; it gets the result of the operation
shamt: Shift amount
funct: This field selects the specific variant of the operation in the op field
R-format fields: op (6 bits) | rs (5 bits) | rt (5 bits) | rd (5 bits) | shamt (5 bits) | funct (6 bits)
Immediate-type instructions
Some instructions need longer fields than those provided above, for example to hold a large constant
The 16-bit address means a load word instruction can load a word within a region of ±2^15 bytes of the address in the base register
Example: lw $t0, 32($s3)    # Temporary register $t0 gets A[8]
I-format fields: op (6 bits) | rs (5 bits) | rt (5 bits) | address (16 bits)
Instruction  Format  op  rs   rt   rd   shamt  funct  address
add          R       0   reg  reg  reg  0      32     n/a
sub          R       0   reg  reg  reg  0      34     n/a
lw           I       35  reg  reg  n/a  n/a    n/a    address
sw           I       43  reg  reg  n/a  n/a    n/a    address
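A minimal sketch of how the R-format and I-format fields pack into one 32-bit word. The helper names encode_r and encode_i are hypothetical, and the two sample instructions and register numbers ($t0 = 8, $s1 = 17, $s2 = 18, $s3 = 19, the usual MIPS numbering) are chosen only for illustration.

```python
# Sketch: pack MIPS instruction fields into 32-bit words.
# encode_r / encode_i are illustrative helper names, not part of any toolchain.

def encode_r(op, rs, rt, rd, shamt, funct):
    """R-format: op(6) rs(5) rt(5) rd(5) shamt(5) funct(6)."""
    return (op << 26) | (rs << 21) | (rt << 16) | (rd << 11) | (shamt << 6) | funct

def encode_i(op, rs, rt, address):
    """I-format: op(6) rs(5) rt(5) address(16). The address is kept as a
    16-bit two's-complement value, so negative offsets are masked."""
    return (op << 26) | (rs << 21) | (rt << 16) | (address & 0xFFFF)

# Conventional MIPS register numbers (assumed for this example).
T0, S1, S2, S3 = 8, 17, 18, 19

add_word = encode_r(0, S1, S2, T0, 0, 32)  # add $t0, $s1, $s2
lw_word = encode_i(35, S3, T0, 32)         # lw  $t0, 32($s3)

print(f"add: 0x{add_word:08x}")
print(f"lw:  0x{lw_word:08x}")
```

Because every field sits at a fixed bit position, the decoder can extract rs and rt before it knows which format the instruction uses; that is the practical payoff of fixed-size encoding.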
The Stored Program Concept
Learning how instructions are represented leads to discovering the secret of computing: the stored-program concept
Today's computers are built on two key principles:
- Instructions are represented as numbers
- Programs can be stored in memory to be read or written just like numbers
The power of the concept: memory can contain
- the source code for an editor
- the compiled machine code for the editor
- the text that the compiled program is using
- the compiler that generated the code
[Figure: a processor connected to a memory that holds an accounting program (machine code), an editor program (machine code), a C compiler (machine code), payroll data, book text, and the C source code for the editor program]
Compiling if-then-else in MIPS
Assuming the five variables f, g, h, i, and j correspond to the five registers $s0 through $s4, what is the compiled MIPS code for the following C if statement?
    if (i == j) f = g + h; else f = g - h;
[Flowchart: test i == j; if i == j, execute f = g + h; if i ≠ j, branch to Else: and execute f = g - h; both paths meet at Exit:]
MIPS code:
          bne $s3, $s4, Else    # go to Else if i ≠ j
          add $s0, $s1, $s2     # f = g + h (skipped if i ≠ j)
          j   Exit              # skip over the else part
    Else: sub $s0, $s1, $s2     # f = g - h (skipped if i = j)
    Exit:
Typical Compilation
Major Types of Optimization
Columns: optimization name | explanation | frequency (N.M. = not measured)
High-level: at or near source level; machine independent
  Procedure integration | Replace procedure call by procedure body | N.M.
Local: within straight-line code
  Common subexpression elimination | Replace two instances of the same computation by a single copy | 18%
  Constant propagation | Replace all instances of a variable that is assigned a constant with the constant
  Stack height reduction | Rearrange expression tree to minimize resources needed for expression evaluation | N.M.
Global: across a branch
  Global common subexpression elimination | Same as local, but this version crosses branches
  Copy propagation | Replace all instances of a variable A that has been assigned X (i.e., A = X) with X
  Code motion | Remove code from a loop that computes the same value each iteration of the loop
  Induction variable elimination | Simplify/eliminate array addressing calculations within loops
Machine-dependent: depends on machine knowledge
  Strength reduction | Many examples, such as replacing a multiply by a constant with adds and shifts | N.M.
  Pipeline scheduling | Reorder instructions to improve pipeline performance | N.M.
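To make a few of these transformations concrete, here is a hand-applied before/after sketch. The functions original and optimized are hypothetical; a real compiler performs these steps on its intermediate representation, not on source text.

```python
# Sketch: the same computation before and after four hand-applied optimizations.

def original(a, b, n):
    total = 0
    for i in range(n):
        k = 8                   # variable assigned a constant
        x = (a + b) * k         # a + b computed here ...
        y = (a + b) + i * 4     # ... and computed again here
        total += x + y
    return total

def optimized(a, b, n):
    t = a + b                   # common subexpression elimination: one copy of a + b
    x = t << 3                  # constant propagation (k = 8), then strength
                                # reduction (multiply by 8 becomes a shift);
                                # code motion hoists it out of the loop
    total = 0
    i4 = 0                      # strength reduction on i * 4: a running sum
    for _ in range(n):
        total += x + t + i4
        i4 += 4
    return total

# Both versions agree on a range of integer inputs.
for a, b, n in [(3, 4, 5), (0, 0, 0), (-2, 7, 10)]:
    assert original(a, b, n) == optimized(a, b, n)
```

The optimized loop body contains only additions, which is the kind of result the local and global passes listed above aim for.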
Effect of Compiler Optimization
[Chart: measurements by program and compiler optimization level]
Level 0: non-optimized code
Level 1: local optimization
Level 2: global optimization, s/w pipelining
Level 3: adds procedure integration
Compiler Support for Multimedia Instr.
Intel's MMX and PowerPC AltiVec have small vector processing capabilities targeting multimedia applications (to speed up graphics)
Intel added a new set of instructions called Streaming SIMD Extension
A major advantage of vector computers is hiding the latency of memory access by loading multiple elements and then overlapping execution with data transfer
Vector computers typically have strided and/or gather/scatter addressing to perform operations on distant memory locations
- Strided addressing allows memory access in increments larger than one
- Gather/scatter addressing is similar to register indirect mode, where the addresses are stored instead of the data
Supporting vector operations without strided addressing, as in Intel's MMX, limits the potential speedup
Such limited support for vector processing makes vectorizing compiler optimizations unpopular and restricts their scope to hand-coded routines
SIMD instructions on MMX and AltiVec tend to be solutions, not primitives
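The two addressing styles can be sketched in a few lines, with memory modeled as a plain list. The function names strided_load, gather, and scatter are illustrative only and do not correspond to any particular instruction set.

```python
# Sketch: strided and gather/scatter access over a flat "memory" list.

def strided_load(memory, base, stride, count):
    """Load `count` elements starting at `base`, stepping `stride` each time."""
    return [memory[base + i * stride] for i in range(count)]

def gather(memory, index_vector):
    """Load the elements whose addresses are stored in `index_vector`."""
    return [memory[i] for i in index_vector]

def scatter(memory, index_vector, values):
    """Store each value at the address held in the matching index slot."""
    for i, v in zip(index_vector, values):
        memory[i] = v

# A 3x4 matrix stored row by row: one column is a stride-4 access.
memory = list(range(12))
column1 = strided_load(memory, base=1, stride=4, count=3)
picked = gather(memory, [10, 2, 7])
scatter(memory, [0, 11], [-1, -2])
print(column1, picked, memory[0], memory[11])
```

Without strided loads, fetching a matrix column takes one scalar load per element, which is why the lack of this mode limits the speedup available to a vectorizing compiler.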
Starting a Program
[Figure: C program → Compiler → Assembly language program → Assembler → Object: machine language module, combined with Object: library routine (machine language) → Linker → Executable: machine language program → Loader → Memory]
Linker
- Place code and data modules symbolically in memory
- Determine the address of data and instruction labels
- Patch both internal and external references
Object files for Unix typically contain:
- Header: size and position of components
- Text segment: machine code
- Data segment: static and dynamic variables
- Relocation info: identify absolute memory references
- Symbol table: name and location of labels, procedures and variables
- Debugging info: mapping source to object code, break points, etc.
Loading Executable Program
[Figure: MIPS memory layout, from high to low addresses: Stack ($sp at 7fff ffff hex), Dynamic data, Static data (starting at 1000 0000 hex, with $gp at 1000 8000 hex), Text (pc at 0040 0000 hex), Reserved (from address 0)]
To load an executable, the operating system follows these steps:
- Reads the executable file header to determine the size of the text and data segments
- Creates an address space large enough for the text and data
- Copies the instructions and data from the executable file into memory
- Copies the parameters (if any) to the main program onto the stack
- Initializes the machine registers and sets the stack pointer to the first free location
- Jumps to a start-up routine that copies the parameters into the argument registers and calls the main routine of the program
Instruction Set Design Issues
Instruction set design issues:
- Number of Addresses
- Flow of Control
- Operand Types
- Addressing Modes
- Instruction Types
- Instruction Formats
Number of Addresses
Four categories
3-address machines
- 2 for the source operands and one for the result
2-address machines
- One address doubles as source and result
1-address machines
- Accumulator machines
- Accumulator is used for one source and result
0-address machines
- Stack machines
- Operands are taken from the stack
- Result goes onto the stack
Number of Addresses (cont.)
Three-address machines
- Two for the source operands, one for the result
- RISC processors use three addresses
Sample instructions
    add  dest,src1,src2    ; M(dest) = [src1] + [src2]
    sub  dest,src1,src2    ; M(dest) = [src1] - [src2]
    mult dest,src1,src2    ; M(dest) = [src1] * [src2]
Three addresses: Operand 1, Operand 2, Result
Example: a = b + c
Three-address instruction formats are not common because they require a relatively long instruction format to hold the three address references
Number of Addresses (cont.)
Example
C statement
    A = B + C * D - E + F + A
Equivalent code:
    mult T,C,D    ; T = C*D
    add  T,T,B    ; T = B + C*D
    sub  T,T,E    ; T = B + C*D - E
    add  T,T,F    ; T = B + C*D - E + F
    add  A,T,A    ; A = B + C*D - E + F + A
Number of Addresses (cont.)
Two-address machines
- One address doubles (for source operand and result)
- Last example makes a case for it
  • Address T is used twice
Sample instructions
    load dest,src    ; M(dest) = [src]
    add  dest,src    ; M(dest) = [dest] + [src]
    sub  dest,src    ; M(dest) = [dest] - [src]
    mult dest,src    ; M(dest) = [dest] * [src]
Two addresses: one address doubles as operand and result
Example: a = a + b
The two-address format reduces the space requirement but also introduces some awkwardness. To avoid altering the value of an operand, a MOVE instruction is used to move one of the values to a result or temporary location before performing the operation
Number of Addresses (cont.)
Example
C statement
    A = B + C * D - E + F + A
Equivalent code:
    load T,C    ; T = C
    mult T,D    ; T = C*D
    add  T,B    ; T = B + C*D
    sub  T,E    ; T = B + C*D - E
    add  T,F    ; T = B + C*D - E + F
    add  A,T    ; A = B + C*D - E + F + A
Number of Addresses (cont.)
One-address machines
- Use a special set of registers called accumulators
  • Specify one source operand and receive the result
- Called accumulator machines
Sample instructions
    load  addr    ; accum = [addr]
    store addr    ; M[addr] = accum
    add   addr    ; accum = accum + [addr]
    sub   addr    ; accum = accum - [addr]
    mult  addr    ; accum = accum * [addr]
Number of Addresses (cont.)
Example
C statement
    A = B + C * D - E + F + A
Equivalent code:
    load  C    ; load C into accum
    mult  D    ; accum = C*D
    add   B    ; accum = C*D + B
    sub   E    ; accum = B + C*D - E
    add   F    ; accum = B + C*D - E + F
    add   A    ; accum = B + C*D - E + F + A
    store A    ; store accum contents in A
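A small sketch of how a one-address machine executes such a sequence. The interpreter run_accumulator and the initial memory values are assumptions made for illustration; memory is modeled as a dictionary keyed by variable name.

```python
# Sketch: a one-address (accumulator) machine. Every instruction names at
# most one memory operand; the accumulator is the implicit second operand
# and the implicit destination.

def run_accumulator(program, memory):
    accum = 0
    for op, addr in program:
        if op == "load":
            accum = memory[addr]
        elif op == "store":
            memory[addr] = accum
        elif op == "add":
            accum += memory[addr]
        elif op == "sub":
            accum -= memory[addr]
        elif op == "mult":
            accum *= memory[addr]
        else:
            raise ValueError(f"unknown instruction: {op}")
    return memory

# A = B + C*D - E + F + A
program = [("load", "C"), ("mult", "D"), ("add", "B"), ("sub", "E"),
           ("add", "F"), ("add", "A"), ("store", "A")]
memory = {"A": 7, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}  # assumed sample values
run_accumulator(program, memory)
print(memory["A"])  # 2 + 3*4 - 5 + 6 + 7
```

Every arithmetic instruction touches memory once, which is the source of the high memory traffic usually cited against accumulator designs.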
Number of Addresses (cont.)
Zero-address machines
- Stack supplies operands and receives the result
  • Special instructions to load and store use an address
- Called stack machines (Ex: HP3000, Burroughs B5500)
Sample instructions
    push addr    ; push([addr])
    pop  addr    ; pop([addr])
    add          ; push(pop + pop)
    sub          ; push(pop - pop)
    mult         ; push(pop * pop)
Number of Addresses (cont.)
Example
C statement
    A = B + C * D - E + F + A
Equivalent code:
    push E
    push C
    push D
    mult
    push B
    add
    sub
    push F
    add
    push A
    add
    pop  A
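The stack discipline can be traced with a short interpreter. This is a sketch under stated assumptions: run_stack is a hypothetical helper, the operand values are invented, and sub is taken to compute (first value popped) minus (second value popped), matching the push(pop - pop) notation.

```python
# Sketch: a zero-address (stack) machine. Only push and pop carry an
# address; arithmetic instructions take both operands from the stack.

def run_stack(program, memory):
    stack = []
    for instr in program:
        op = instr[0]
        if op == "push":
            stack.append(memory[instr[1]])
        elif op == "pop":
            memory[instr[1]] = stack.pop()
        else:
            first = stack.pop()    # top of stack
            second = stack.pop()   # element below it
            if op == "add":
                stack.append(first + second)
            elif op == "sub":
                stack.append(first - second)
            elif op == "mult":
                stack.append(first * second)
            else:
                raise ValueError(f"unknown instruction: {op}")
    return memory

# A = B + C*D - E + F + A; E is pushed first so that it ends up below
# B + C*D when sub executes.
program = [("push", "E"), ("push", "C"), ("push", "D"), ("mult",),
           ("push", "B"), ("add",), ("sub",), ("push", "F"), ("add",),
           ("push", "A"), ("add",), ("pop", "A")]
memory = {"A": 7, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}  # assumed sample values
run_stack(program, memory)
print(memory["A"])
```

The need to push E before anything else shows the ordering constraint a stack imposes: operands must arrive in the order the operations will consume them.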
Load/Store Architecture
Instructions expect operands in internal processor registers
Special LOAD and STORE instructions move data between registers and memory
RISC uses this architecture
Reduces instruction length
Load/Store Architecture (cont.)
Sample instructions
    load  Rd,addr       ; Rd = [addr]
    store addr,Rs       ; (addr) = Rs
    add   Rd,Rs1,Rs2    ; Rd = Rs1 + Rs2
    sub   Rd,Rs1,Rs2    ; Rd = Rs1 - Rs2
    mult  Rd,Rs1,Rs2    ; Rd = Rs1 * Rs2
Number of Addresses (cont.)
Example
C statement
    A = B + C * D - E + F + A
Equivalent code:
    load  R1,B
    load  R2,C
    load  R3,D
    load  R4,E
    load  R5,F
    load  R6,A
    mult  R2,R2,R3
    add   R2,R2,R1
    sub   R2,R2,R4
    add   R2,R2,R5
    add   R2,R2,R6
    store A,R2
Flow of Control
Default is sequential flow
Several instructions alter this default execution
Branches
- Unconditional
- Conditional
- Delayed branches
Procedure calls
- Delayed procedure calls
Flow of Control (cont.)
Branches
Unconditional
- Absolute address
- PC-relative
  • Target address is specified relative to PC contents
  • Relocatable code
Example: MIPS
- Absolute address
    j target
- PC-relative
    b target
Flow of Control (cont.)
Branches
Conditional
- Jump is taken only if the condition is met
Two types
- Set-Then-Jump
  • Condition testing is separated from branching
  • Condition code registers are used to convey the condition test result
  • Condition code registers keep a record of the status of the last ALU operation, such as an overflow condition
- Example: Pentium code
    cmp AX,BX     ; compare AX and BX
    je  target    ; jump if equal
Flow of Control (cont.)
- Test-and-Jump
  • Single instruction performs condition testing and branching
- Example: MIPS instruction
    beq Rsrc1,Rsrc2,target
  Jumps to target if Rsrc1 = Rsrc2
Delayed branching
Control is transferred after executing the instruction that follows the branch instruction
- This instruction slot is called the delay slot
Improves efficiency
Highly pipelined RISC processors support delayed branching
Flow of Control (cont.)
Procedure calls
Facilitate modular programming
Require two pieces of information to return
- End of procedure
  • Pentium uses the ret instruction
  • MIPS uses the jr instruction
- Return address
  • In a (special) register: MIPS allows any general-purpose register
  • On the stack: Pentium
Flow of Control (cont.)
[Figure: delay slot]
Parameter Passing
Two basic techniques
Register-based (e.g., PowerPC, MIPS)
- Internal registers are used
  • Faster
  • Limit the number of parameters
  • Recursive procedure
Stack-based (e.g., Pentium)
- Stack is used
  • More general
2 Operand Types
Instructions support basic data types
- Characters
- Integers
- Floating-point
Instruction overload
- Same instruction for different data types
- Example: Pentium
    mov AL,address     ; loads an 8-bit value
    mov AX,address     ; loads a 16-bit value
    mov EAX,address    ; loads a 32-bit value
Operand Types (cont.)
Separate instructions
- Instructions specify the operand size
- Example: MIPS
    lb Rdest,address    ; loads a byte
    lh Rdest,address    ; loads a halfword (16 bits)
    lw Rdest,address    ; loads a word (32 bits)
    ld Rdest,address    ; loads a doubleword (64 bits)
Similar instruction: store
3 Addressing Modes
How the operands are specified
Operands can be in three places
- Registers
  • Register addressing mode
- Part of instruction
  • Constant
  • Immediate addressing mode
  • All processors support these two addressing modes
- Memory
  • Difference between RISC and CISC
  • CISC supports a large variety of addressing modes
  • RISC follows load/store architecture
4 Instruction Types
Several types of instructions
Data movement
- Pentium: mov dest,src
- Some do not provide direct data movement instructions
- Indirect data movement
    add Rdest,Rsrc,0    ; Rdest = Rsrc+0
Arithmetic and Logical
- Arithmetic
  • Integer and floating-point, signed and unsigned
  • add, subtract, multiply, divide
- Logical
  • and, or, not, xor
Instruction Types (cont.)
Condition code bits
- S: Sign bit (0 = +, 1 = -)
- Z: Zero bit (0 = nonzero, 1 = zero)
- O: Overflow bit (0 = no overflow, 1 = overflow)
- C: Carry bit (0 = no carry, 1 = carry)
Example: Pentium
    cmp count,25    ; compare count to 25
                    ; subtract 25 from count
    je  target      ; jump if equal
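A sketch of how a compare instruction could derive the four condition code bits from the subtraction a - b. The function compare is hypothetical; it assumes an 8-bit ALU by default and the x86-style convention that the carry bit signals a borrow (a < b as unsigned values).

```python
# Sketch: compute S, Z, O, C for a compare (a - b) on a `bits`-wide ALU.

def compare(a, b, bits=8):
    mask = (1 << bits) - 1
    sign = 1 << (bits - 1)
    a &= mask
    b &= mask
    result = (a - b) & mask

    s = 1 if result & sign else 0   # sign bit of the result
    z = 1 if result == 0 else 0     # result is zero, i.e. a == b
    # Signed overflow on subtraction: the operands have different signs
    # and the result's sign differs from the sign of a.
    o = 1 if (a ^ b) & (a ^ result) & sign else 0
    # Borrow convention (assumed, as on x86): carry set when a < b unsigned.
    c = 1 if a < b else 0
    return {"S": s, "Z": z, "O": o, "C": c}

equal = compare(25, 25)       # a conditional "jump if equal" tests Z == 1
smaller = compare(10, 25)
overflow = compare(0x80, 1)   # -128 - 1 does not fit in 8 signed bits
print(equal, smaller, overflow)
```

A set-then-jump branch such as "jump if equal" then only needs to test the Z bit, which is why the comparison and the branch can be separate instructions.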
Instruction Types (cont.)
Flow control and I/O instructions
- Branch
- Procedure call
- Interrupts
I/O instructions
- Memory-mapped I/O
  • Most processors support memory-mapped I/O
  • No separate instructions for I/O
- Isolated I/O
  • Pentium supports isolated I/O
  • Separate I/O instructions
    in  AX,io_port     ; read from an I/O port
    out io_port,AX     ; write to an I/O port
5 Instruction Formats
Two types
Fixed-length
- Used by RISC processors
- 32-bit RISC processors use 32-bit-wide instructions
  • Examples: SPARC, MIPS, PowerPC
Variable-length
- Used by CISC processors
- Memory operands need more bits to specify
Opcode
- Major and exact operation
Examples of Instruction Formats
RISC (Reduced Instruction Set Computer) versus CISC (Complex Instruction Set Computer)
RISC vs CISC
The underlying philosophy of RISC machines is that a system is better able to manage program execution when the program consists of only a few different instructions that are the same length and require the same number of clock cycles to decode and execute
RISC systems access memory only with explicit load and store instructions
In CISC systems, many different kinds of instructions access memory, making instruction length variable and fetch-decode-execute time unpredictable
The difference between CISC and RISC becomes evident through the basic computer performance equation:
    time/program = (time/cycle) × (cycles/instruction) × (instructions/program)
RISC systems shorten execution time by reducing the clock cycles per instruction
CISC systems improve performance by reducing the number of instructions per program
The simple instruction set of RISC machines enables control units to be hardwired for maximum speed
The more complex, and variable, instruction set of CISC machines requires microcode-based control units that interpret instructions as they are fetched from memory. This translation takes time
With fixed-length instructions, RISC lends itself to pipelining and speculative execution
Consider the following program fragments:
    CISC                  RISC
    mov ax, 10            mov ax, 0
    mov bx, 5             mov bx, 10
    mul bx, ax            mov cx, 5
                          Begin: add ax, bx
                                 loop Begin
The total clock cycles for the CISC version might be:
    (2 movs × 1 cycle) + (1 mul × 30 cycles) = 32 cycles
While the clock cycles for the RISC version are:
    (3 movs × 1 cycle) + (5 adds × 1 cycle) + (5 loops × 1 cycle) = 13 cycles
With the RISC clock cycle also being shorter, RISC gives us much faster execution speeds
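The cycle comparison is a direct application of the performance equation: total cycles are the sum of (instruction count × cycles per instruction) over the instruction mix. The sketch below assumes, for illustration only, 1 cycle for each mov, add, and loop and 30 cycles for mul, with the multiplication carried out as five repeated additions in the RISC fragment; total_cycles and execution_time are hypothetical helper names.

```python
# Sketch: total cycles and execution time for a given instruction mix.
# Each entry is (mnemonic, how many times it executes, cycles per execution).

def total_cycles(instruction_mix):
    return sum(count * cycles for _, count, cycles in instruction_mix)

def execution_time(instruction_mix, cycle_time_ns):
    """time/program = cycles/program * time/cycle."""
    return total_cycles(instruction_mix) * cycle_time_ns

# Assumed cycle costs, not measurements.
cisc = [("mov", 2, 1), ("mul", 1, 30)]
risc = [("mov", 3, 1), ("add", 5, 1), ("loop", 5, 1)]

cisc_cycles = total_cycles(cisc)
risc_cycles = total_cycles(risc)
print(cisc_cycles, risc_cycles)

# Even with an identical clock the RISC fragment finishes sooner under
# these assumptions; a shorter RISC cycle time widens the gap.
print(execution_time(cisc, 1.0), execution_time(risc, 1.0))
```

The RISC fragment executes more instructions (13 versus 3) yet fewer cycles, which is exactly the trade the performance equation describes.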
Because of their load-store ISAs, RISC architectures require a large number of CPU registers
These registers provide fast access to data during sequential program execution
They can also be employed to reduce the overhead typically caused by passing parameters to subprograms
Instead of pulling parameters off of a stack, the subprogram is directed to use a subset of registers
[Figure: overlapping register windows] This is how registers can be overlapped in a RISC system. The current window pointer (CWP) points to the active register window
It is becoming increasingly difficult to distinguish RISC architectures from CISC architectures
Some RISC systems provide more extravagant instruction sets than some CISC systems
Some systems combine both approaches
The following two slides summarize the characteristics that traditionally typify the differences between these two architectures
RISC                                        | CISC
Multiple register sets                      | Single register set
Three operands per instruction              | One or two register operands per instruction
Parameter passing through register windows  | Parameter passing through memory
Single-cycle instructions                   | Multiple-cycle instructions
Hardwired control                           | Microprogrammed control
Highly pipelined                            | Less pipelined
Simple instructions, few in number          | Many complex instructions
Fixed-length instructions                   | Variable-length instructions
Complexity in compiler                      | Complexity in microcode
Only LOAD/STORE instructions access memory  | Many instructions can access memory
Few addressing modes                        | Many addressing modes
Summary
Instruction Set Design Issues
Instruction set design issues include:
Where are operands stored?
- registers, memory, stack, accumulator
How many explicit operands are there?
- 0, 1, 2, or 3
How is the operand location specified?
- register, immediate, indirect, ...
What type and size of operands are supported?
- byte, int, float, double, string, vector, ...
What operations are supported?
- add, sub, mul, move, compare, ...
More About General Purpose Registers
Why do almost all new architectures use GPRs?
Registers are much faster than memory (even cache)
- Register values are available immediately
- When memory isn't ready, the processor must wait ("stall")
Registers are convenient for variable storage
- Compiler assigns some variables just to registers
- More compact code, since small fields specify registers (compared to memory addresses)
[Figure: storage hierarchy: Processor (Registers), Cache, Memory, Disk]
What Operations are Needed
Arithmetic & Logical
- Integer arithmetic: ADD, SUB, MULT, DIV, SHIFT
- Logical operations: AND, OR, XOR, NOT
Data Transfer: copy, load, store
Control: branch, jump, call, return
Floating Point: ADD, MUL, DIV (same as arithmetic, but usually take bigger operands)
Decimal: ADDD, CONVERT
String: move, compare, search
Graphics: pixel and vertex, compression/decompression operations
Stack Architecture Pros and Cons
Pros
- Good code density (implicit top of stack)
- Low hardware requirements
- Easy to write a simpler compiler for stack architectures
Cons
- Stack becomes the bottleneck
- Little ability for parallelism or pipelining
- Data is not always at the top of the stack when needed, so additional instructions like TOP and SWAP are needed
- Difficult to write an optimizing compiler for stack architectures
Accumulator Architecture Pros and Cons
Pros
- Very low hardware requirements
- Easy to design and understand
Cons
- Accumulator becomes the bottleneck
- Little ability for parallelism or pipelining
- High memory traffic
Memory-Memory Architecture Pros and Cons
Pros
- Requires fewer instructions (especially if 3 operands)
- Easy to write compilers for (especially if 3 operands)
Cons
- Very high memory traffic (especially if 3 operands)
- Variable number of clocks per instruction
- With two operands, more data movements are required
Memory-Register Architecture Pros and Cons
Pros
- Some data can be accessed without loading first
- Instruction format easy to encode
- Good code density
Cons
- Operands are not equivalent (poor orthogonality)
- Variable number of clocks per instruction
- May limit number of registers
Compact Code and Stack Architectures
When memory was scarce, machines like the Intel 80x86 had variable-length instructions to match varying operand specifications and minimize code size
Stack machines abandoned registers altogether, arguing that it is hard for compilers to use them efficiently
Operands are pushed onto a stack from memory, and the results have to be popped from the stack to memory
Operations take their operands by default from the top of the stack and insert the results back onto the stack
Stack machines simplify compilers and lend themselves to a compact instruction encoding, but limit compiler optimization (e.g., in math expressions)
Example: A = B + C
    Push AddressC    ; Top=Top+4; Stack[Top]=Memory[AddressC]
    Push AddressB    ; Top=Top+4; Stack[Top]=Memory[AddressB]
    add              ; Stack[Top-4]=Stack[Top]+Stack[Top-4]; Top=Top-4
    Pop  AddressA    ; Memory[AddressA]=Stack[Top]; Top=Top-4
Compact code is important for heralded network computers, where programs must be downloaded over the Internet (e.g., Java-based applications)
Other types of Architecture
High-Level-Language Architecture
• In the 1960s, systems software was rarely written in high-level languages, and virtually every commercial operating system before Unix was written in assembly
• Some people blamed the code density on the instruction set rather than the programming language
• A machine design philosophy was advocated with the goal of making the hardware more like high-level languages
• The effectiveness of high-level languages, memory size limitations and the lack of efficient compilers doomed this philosophy to a historical footnote
Reduced Instruction Set Architecture
• With the recent development in compiler technology and expanded memory sizes, fewer programmers are using assembly-level coding
• Instruction set architecture became measurable by the way compilers, rather than programmers, use them
• RISC architecture favors simplifying hardware design over enriching the offered set of instructions, relying on compilers to effectively use them to perform complex operations
• Virtually all new architectures since 1982 follow the RISC philosophy of fixed instruction lengths, load-store operations and limited addressing modes
Evolution of Instruction Sets
[Figure: evolution of instruction sets, from oldest to newest]
Single Accumulator (EDSAC)
Accumulator + Index Registers (Manchester Mark I, IBM 700 series)
Separation of Programming Model from Implementation
- High-level Language Based
- Concept of a Family
General Purpose Register Machines
- Complex Instruction Sets (Vax, Intel)
- Load/Store Architecture (CDC, Cray 1)
RISC (MIPS, SPARC, RS6000, ...)
Register-Memory Architectures
Effect of the number of memory operands
Number of memory addresses | Max. number of operands | Examples
0 | 3 | SPARC, MIPS, PowerPC, ALPHA
1 | 2 | Intel 80x86, Motorola 68000
2 | 2 | VAX (also has 3-operand format)
3 | 3 | VAX (also has 2-operand format)
Interpreting Memory Addressing
The address of a word matches the byte address of one of its 4 bytes
The addresses of sequential words differ by 4 (word size in bytes)
Word addresses are multiples of 4 (alignment restriction)
Machines that use the address of the leftmost byte as the word address are called "Big Endian" and those that use the rightmost byte are called "Little Endian"
Misalignment complicates memory access and causes programs to run slower (some machines do not allow misaligned memory access at all)
Byte ordering can be a problem when exchanging data among different machines
Byte addresses affect array index calculation, which must account for word addressing and the offset within the word

Object addressed | Aligned at byte offsets | Misaligned at byte offsets
Byte | 0, 1, 2, 3, 4, 5, 6, 7 | Never
Half word | 0, 2, 4, 6 | 1, 3, 5, 7
Word | 0, 4 | 1, 2, 3, 5, 6, 7
Double word | 0 | 1, 2, 3, 4, 5, 6, 7
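The byte-ordering and alignment rules above can be tried out with a small sketch. The helper names `word_bytes` and `is_aligned` are made up for illustration, and a 4-byte word is assumed.

```python
def word_bytes(value, byteorder):
    """Return the 4 bytes of a 32-bit word in increasing address order."""
    return list(value.to_bytes(4, byteorder))


def is_aligned(address, size):
    """An object of `size` bytes is aligned when its address is a multiple of its size."""
    return address % size == 0


word = 0x0A0B0C0D
big = word_bytes(word, "big")        # most significant byte at the lowest address
little = word_bytes(word, "little")  # least significant byte at the lowest address
aligned_word_offsets = [off for off in range(8) if is_aligned(off, 4)]
```

Exchanging `word` between a Big Endian and a Little Endian machine without conversion would reverse the byte order seen by the receiver.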
Addressing Modes
Addressing modes refer to how the location of an operand (its effective address) is specified
Addressing modes have the ability to:
- Significantly reduce instruction counts
- Increase the average CPI
- Increase the complexity of building a machine
The VAX machine is used for benchmark data since it supports a wide range of memory addressing modes
Famous addressing modes can be classified based on:
- the source of the data: register, immediate, or memory
- the address calculation: direct and indirect
An indexed addressing mode is usually provided to allow efficient implementation of loops and array access
Example of Addressing Modes

Addressing mode | Example | Meaning | When used
Register | Add R4, R3 | Regs[R4] ← Regs[R4] + Regs[R3] | When a value is in a register
Immediate | Add R4, #3 | Regs[R4] ← Regs[R4] + 3 | For constants
Displacement | Add R4, 100(R1) | Regs[R4] ← Regs[R4] + Mem[100 + Regs[R1]] | Accessing local variables
Register indirect | Add R4, (R1) | Regs[R4] ← Regs[R4] + Mem[Regs[R1]] | Accessing using a pointer or a computed address
Indexed | Add R4, (R1 + R2) | Regs[R4] ← Regs[R4] + Mem[Regs[R1] + Regs[R2]] | Sometimes useful in array addressing: R1 = base of the array; R2 = index amount
Direct or absolute | Add R4, (1001) | Regs[R4] ← Regs[R4] + Mem[1001] | Sometimes useful for accessing static data; address constant may need to be large
Memory indirect or memory deferred | Add R4, @(R3) | Regs[R4] ← Regs[R4] + Mem[Mem[Regs[R3]]] | If R3 is the address of the pointer p, then mode yields *p
Autoincrement | Add R4, (R2)+ | Regs[R4] ← Regs[R4] + Mem[Regs[R2]]; Regs[R2] ← Regs[R2] + d | Useful for stepping through arrays within a loop. R2 points to start of the array; each reference increments R2 by d
Autodecrement | Add R4, -(R2) | Regs[R2] ← Regs[R2] - d; Regs[R4] ← Regs[R4] + Mem[Regs[R2]] | Same use as autoincrement. Autodecrement/increment can also act as push/pop to implement a stack
Scaled | Add R4, 100(R2)[R3] | Regs[R4] ← Regs[R4] + Mem[100 + Regs[R2] + Regs[R3] * d] | Used to index arrays
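As a rough illustration of the "Meaning" column, the sketch below computes the memory address that several of the modes refer to. The function `effective_address`, its mode names, and the register contents are hypothetical, chosen only to mirror the table.

```python
def effective_address(mode, regs, base=None, index=None, disp=0, d=4):
    """Return the memory address an operand refers to for a few addressing modes."""
    if mode == "displacement":       # Mem[disp + Regs[base]]
        return disp + regs[base]
    if mode == "register_indirect":  # Mem[Regs[base]]
        return regs[base]
    if mode == "indexed":            # Mem[Regs[base] + Regs[index]]
        return regs[base] + regs[index]
    if mode == "direct":             # Mem[disp]
        return disp
    if mode == "scaled":             # Mem[disp + Regs[base] + Regs[index] * d]
        return disp + regs[base] + regs[index] * d
    raise ValueError("unsupported mode: " + mode)


# Assumed register contents, for illustration only.
regs = {"R1": 1000, "R2": 2000, "R3": 3}
local_var = effective_address("displacement", regs, base="R1", disp=100)
element = effective_address("scaled", regs, base="R2", index="R3", disp=100, d=4)
```

Register and immediate modes are left out because they do not touch memory, so there is no address to compute.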
Addressing Mode for Signal Processing
DSP offers special addressing modes to better serve popular algorithms
Modulo addressing
- Since DSP deals with continuous data streams, circular buffers are widely used
- Circular or modulo addressing allows automatic increment and decrement and resets the pointer when reaching the end of the buffer
Reverse addressing
- The resulting address is the reverse bit order of the current address
- Reverse addressing mode expedites the access, which otherwise requires a number of logical instructions or extra memory accesses

Fast Fourier Transform (bit-reversed order)
0 (000) | 0 (000)
1 (001) | 4 (100)
2 (010) | 2 (010)
3 (011) | 6 (110)
4 (100) | 1 (001)
5 (101) | 5 (101)
6 (110) | 3 (011)
7 (111) | 7 (111)

Special features require either hand coding or a compiler that uses such features (GNU would not be a good choice)
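Both DSP addressing modes are easy to mimic in software, which also shows the extra work the hardware saves. This is a minimal sketch; `modulo_next` and `bit_reverse` are illustrative names, not instructions of any particular DSP.

```python
def modulo_next(pointer, start, length, step=1):
    """Advance a circular-buffer pointer, wrapping at either end of the buffer."""
    return start + (pointer - start + step) % length


def bit_reverse(index, bits):
    """Reverse the low `bits` bits of `index`, as used to reorder FFT data."""
    result = 0
    for _ in range(bits):
        result = (result << 1) | (index & 1)
        index >>= 1
    return result


fft_order = [bit_reverse(i, 3) for i in range(8)]
wrapped = modulo_next(107, start=100, length=8)  # last slot wraps back to the start
```

Without hardware support, each access pays for the loop in `bit_reverse` or the modulo in `modulo_next`, which is the cost the slide refers to.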
Operations of the Computer Hardware
"There must certainly be instructions for performing the fundamental arithmetic operations."
Burks, Goldstine, and von Neumann, 1947
Assembly language is a symbolic representation of what the processor actually understands
The MIPS assembler allows only one instruction per line and ignores comments following # until the end of the line
Example:
Translation of a segment of a C program to MIPS assembly instructions
C: f = (g + h) - (i + j)
MIPS:
add t0, g, h     # temp variable t0 contains "g + h"
add t1, i, j     # temp variable t1 contains "i + j"
sub f, t0, t1    # f = t0 - t1 = (g + h) - (i + j)
Operations in the Instruction Set

Operator type | Examples
Arithmetic and logical | Integer arithmetic and logical operations: add, and, subtract, or
Data transfer | Loads-stores (move instructions on machines with memory addressing)
Control | Branch, jump, procedure call and return, trap
System | Operating system call, virtual memory management instructions
Floating point | Floating point instructions: add, multiply
Decimal | Decimal add, decimal multiply, decimal to character conversion
String | String move, string compare, string search
Graphics | Pixel operations, compression/decompression operations

Arithmetic, logical, data transfer and control are almost standard categories for all machines
System instructions are required for multi-programming environments, although support for system functions varies
Decimal and string instructions can be primitives, e.g. IBM 360 and the VAX
Support for floating point, decimal, string and graphics is optional and sometimes provided via a co-processor
Some machines rely on the compiler to synthesize special operations, such as string handling, from simpler instructions
Operations for Media & Signal Processing
Single instruction multiple data (SIMD) and vector instructions are often supported in DSPs, which are commonly used in multimedia and signal processing applications
Partitioned Add (integer)
- Perform multiple 16-bit additions on a 64-bit ALU since most data are narrow
- Increases ALU throughput for multimedia applications
Paired single operations (float)
- Allow the same register to act as two operands to the same operation
- Handy in dealing with vertices and coordinates
Multiply and accumulate
- Very handy for calculating dot products of vectors (signal processing) and matrix multiplication
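A partitioned add can be emulated in software to see what the hardware does in one step. The sketch below assumes four 16-bit lanes packed into a 64-bit value with wraparound arithmetic; the lane layout and the names `partitioned_add` and `dot_product` are illustrative choices.

```python
LANE_BITS = 16
LANES = 4
LANE_MASK = (1 << LANE_BITS) - 1


def partitioned_add(a, b):
    """Add four independent 16-bit lanes; no carry crosses a lane boundary."""
    result = 0
    for lane in range(LANES):
        shift = lane * LANE_BITS
        lane_sum = ((a >> shift) & LANE_MASK) + ((b >> shift) & LANE_MASK)
        result |= (lane_sum & LANE_MASK) << shift
    return result


def dot_product(xs, ys):
    """Dot product written as a chain of multiply-accumulate steps."""
    acc = 0
    for x, y in zip(xs, ys):
        acc += x * y  # one multiply-and-accumulate per element pair
    return acc


packed = partitioned_add(0x0001FFFF00030004, 0x0001000100010001)
```

The lane holding 0xFFFF wraps to 0 without disturbing its neighbor, which is the property that distinguishes a partitioned add from an ordinary 64-bit add.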
Frequency of Operations Usage
Make the common case fast by focusing on these operations
The most widely executed instructions are the simple operations of an instruction set
The following ranks 80x86 instructions by integer average (% of total executed) in SPECint on Intel 80x86

Rank | 80x86 instruction
1 | Load
2 | Conditional branch
3 | Compare
4 | Store
5 | Add
6 | And
7 | Sub
8 | Move register-register
9 | Call
10 | Return
Control Flow Instructions
Jump: for unconditional change in the control flow
Branch: for conditional change in the control flow
Procedure calls and returns
Data is based on SPEC on Alpha
Destination Address Definition
Relative addressing with respect to the program counter proved to be the best choice for forward and backward branches or jumps (load address independent)
To allow for dynamic loading of library routines, register indirect addressing allows addresses to be loaded in special registers (e.g. virtual functions in C++ and system calls in a case statement)
Data is based on SPEC on Alpha
Condition Evaluation
Compare and branch can be efficient if the majority of conditions are comparisons with zero
Remember to focus on the common case
Based on SPEC on MIPS
Frequency of Types of Comparison
Data is based on SPEC on Alpha
Different benchmarks and machines set new design priorities
DSPs support a repeat instruction for "for" loops (vectors) using registers
Supporting Procedures
Execution of a procedure follows these steps:
- Store parameters in a place accessible to the procedure
- Transfer control to the procedure
- Acquire the storage resources needed for the procedure
- Perform the desired task
- Store the result value in a place accessible to the calling program
- Return control to the point of origin
The hardware provides a program counter to trace instruction flow and manage transfer of control
Parameter Passing
- Registers can be used for passing a small number of parameters
- A stack is used to spill registers of the current context and make room for the called procedure to run, and to allow large parameters to be passed
- Storage of machine state can be performed by caller or callee
- Handling of shared variables is important to ensure correct semantics and thus requires clear specifications in the library interface
Global variables stored in registers need careful handling
Type and Size of Operands
The type of an operand is designated by encoding it in the instruction's operation code
The type of an operand, e.g. single precision float, effectively gives its size
Common operand types include character, half word and word size integer, and single- and double-precision floating point
Characters are almost always in ASCII, integers are in 2's complement, and floating point is in IEEE 754
The 16-bit Unicode used in Java is gaining popularity due to its support for international character sets
For business applications, some architectures support a decimal format in binary coded decimal (BCD)
Depending on the size of the word, the complexity of handling different operand types differs
DSP offers fixed point data types to support high precision floating point arithmetic and to allow sharing a single exponent for multiple numbers
For graphics applications, vertex and pixel operands are added features
Size of Operands
Double-word data type is used for double-precision floating point operations and address storage in machines with a 64-bit wide address bus
Words are used for integer operations and for 32-bit address bus machines
Because of the mix in SPEC, word and double-word data types dominate
Frequency of reference by size based on SPEC2000 on Alpha
Instruction Representation
Humans are taught to think in base 10 (decimal), but numbers may be represented in any base (123 in base 10 = 1111011 in binary or base 2)
Numbers are stored in computers as a series of high and low electronic signals (binary numbers)
Binary digits are called bits and are considered the atoms of computing
Each piece of an instruction is a number, and placing these numbers together forms the instruction
The assembler translates the symbolic assembly instructions into machine language instructions (machine code)
Example:
Assembly: add $t0, $s1, $s2
M/C language (decimal): 0 | 17 | 18 | 8 | 0 | 32
M/C language (binary): 000000 | 10001 | 10010 | 01000 | 00000 | 100000
Field widths: 6 bits | 5 bits | 5 bits | 5 bits | 5 bits | 6 bits
Note: the MIPS compiler by default maps $s0, ..., $s7 to reg. 16-23 and $t0, ..., $t7 to reg. 8-15
Encoding an Instruction Set
Instruction encoding affects the size of the compiled program and the complexity of the CPU implementation
The operation is typically specified in one field called the opcode
The addressing mode for the operand can be encoded with the operation or specified through a separate identifier in case of a large number of supported modes
The architecture must balance between several competing factors:
- Desire to support as many registers and addressing modes as possible
- Effect of operand specification on the size of the instruction (program)
- Desire to simplify instruction fetching and decoding during execution
Fixed size instruction encoding simplifies the CPU design while limiting the addressing modes supported
An architect caring about code size can use variable size encoding
A hybrid approach is to allow variability by supporting multiple-sized instructions
Encoding Examples
MIPS Instruction Format
Register-format instructions
op | rs | rt | rd | shamt | funct
6 bits | 5 bits | 5 bits | 5 bits | 5 bits | 6 bits
op: Basic operation of the instruction, traditionally called opcode
rs: The first register source operand
rt: The second register source operand
rd: The register destination operand; it gets the result of the operation
shamt: Shift amount
funct: This field selects the specific variant of the operation of the op field
Immediate-type instructions
op | rs | rt | address
6 bits | 5 bits | 5 bits | 16 bits
Some instructions need longer fields than provided for large value constants
The 16-bit address means a load word instruction can load a word within a region of ±2^15 bytes of the address in the base register
Example: lw $t0, 32($s3)    # Temporary register $t0 gets A[8]

Instruction | Format | op | rs | rt | rd | shamt | funct | address
add | R | 0 | reg | reg | reg | 0 | 32 | N/A
sub | R | 0 | reg | reg | reg | 0 | 34 | N/A
lw | I | 35 | reg | reg | N/A | N/A | N/A | address
sw | I | 43 | reg | reg | N/A | N/A | N/A | address
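Packing the fields into a 32-bit word takes only shifts and ORs. The sketch below follows the field widths and op/funct values listed in the table; the helper names `encode_r` and `encode_i` and the `REGISTERS` dictionary are illustrative.

```python
# Register numbers per the default mapping: $t0-$t7 -> 8-15, $s0-$s7 -> 16-23.
REGISTERS = {"$t%d" % i: 8 + i for i in range(8)}
REGISTERS.update({"$s%d" % i: 16 + i for i in range(8)})


def encode_r(op, rs, rt, rd, shamt, funct):
    """Pack the six R-format fields (6, 5, 5, 5, 5, 6 bits) into one 32-bit word."""
    return (op << 26) | (rs << 21) | (rt << 16) | (rd << 11) | (shamt << 6) | funct


def encode_i(op, rs, rt, address):
    """Pack the I-format fields (6, 5, 5, 16 bits); address is stored as 16-bit two's complement."""
    return (op << 26) | (rs << 21) | (rt << 16) | (address & 0xFFFF)


# add $t0, $s1, $s2: op=0, rs=$s1, rt=$s2, rd=$t0, shamt=0, funct=32
add_word = encode_r(0, REGISTERS["$s1"], REGISTERS["$s2"], REGISTERS["$t0"], 0, 32)
# lw $t0, 32($s3): op=35, rs=$s3 (base register), rt=$t0 (destination)
lw_word = encode_i(35, REGISTERS["$s3"], REGISTERS["$t0"], 32)
add_bits = format(add_word, "032b")
```

Note that the destination register sits in `rd` for R-format but in `rt` for a load, which is why the two encoders take their arguments in field order rather than assembly order.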
The Stored Program Concept
Learning how instructions are represented leads to discovering the secret of computing: the stored-program concept
Today's computers are built on two key principles:
- Instructions are represented as numbers
- Programs can be stored in memory to be read or written just like numbers
The power of the concept: memory can contain
- the source code for an editor
- the compiled m/c code for the editor
- the text that the compiled program is using
- the compiler that generated the code
[Figure: a processor connected to a memory holding an accounting program (machine code), an editor program (machine code), a C compiler (machine code), payroll data, book text, and the source code in C for the editor program]
Compiling if-then-else in MIPS
Assuming the five variables f, g, h, i, and j correspond to the five registers $s0 through $s4, what is the compiled MIPS code for the following C if statement:
if (i == j) f = g + h; else f = g - h;
[Figure: flowchart testing i == j; when i = j, f = g + h is executed; when i ≠ j, control goes to Else: f = g - h; both paths meet at Exit]
MIPS:
      bne $s3, $s4, Else    # go to Else if i ≠ j
      add $s0, $s1, $s2     # f = g + h (skipped if i ≠ j)
      j Exit
Else: sub $s0, $s1, $s2     # f = g - h (skipped if i = j)
Exit:
Typical Compilation
Major Types of Optimization

Optimization name | Explanation | Frequency
High-level (at or near source level; machine independent)
  Procedure integration | Replace procedure call by procedure body | N.M.
Local (within straight-line code)
  Common subexpression elimination | Replace two instances of the same computation by a single copy |
  Constant propagation | Replace all instances of a variable that is assigned a constant with the constant |
  Stack height reduction | Rearrange expression tree to minimize resources needed for expression evaluation | N.M.
Global (across a branch)
  Global common subexpression elimination | Same as local, but this version crosses branches |
  Copy propagation | Replace all instances of a variable A that has been assigned X (i.e. A = X) with X |
  Code motion | Remove code from a loop that computes the same value each iteration of the loop |
  Induction variable elimination | Simplify/eliminate array-addressing calculations within loops |
Machine-dependent (depends on machine knowledge)
  Strength reduction | Many examples, such as replacing multiply by a constant with adds and shifts | N.M.
  Pipeline scheduling | Reorder instructions to improve pipeline performance | N.M.
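A few of the optimizations listed above can be shown by hand on a tiny function. This is only a sketch of what a compiler might do; the functions `unoptimized` and `optimized` are invented for illustration and work on integers.

```python
def unoptimized(a, b):
    scale = 8                # a variable assigned a constant
    x = (a + b) * scale      # (a + b) is computed here ...
    y = (a + b) + x          # ... and computed again here
    return x, y


def optimized(a, b):
    t = a + b                # common subexpression elimination: compute a + b once
    x = t << 3               # constant propagation, then strength reduction: * 8 becomes a shift
    y = t + x
    return x, y


same = all(unoptimized(a, b) == optimized(a, b)
           for a in range(-5, 6) for b in range(-5, 6))
```

The check over a small range of inputs is a sanity test, not a proof that the transformations preserve behavior in general.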
Effect of Compiler Optimization
[Figure: measurements by program and compiler optimization level]
Level 0: non-optimized code
Level 1: local optimization
Level 2: global optimization, s/w pipelining
Level 3: adds procedure integration
Compiler Support for Multimedia Instr.
Intel's MMX and PowerPC AltiVec have small vector processing capabilities targeting multimedia applications (to speed up graphics)
Intel added a new set of instructions called Streaming SIMD Extension
A major advantage of vector computers is hiding the latency of memory access by loading multiple elements and then overlapping execution with data transfer
Vector computers typically have strided and/or gather/scatter addressing to perform operations on distant memory locations
- Strided addressing allows memory access in increments larger than one
- Gather/scatter addressing is similar to register indirect mode, where the addresses are stored instead of the data
Supporting vector operations without strided addressing, such as Intel's MMX, limits the potential speedup
Such limited support for vector processing makes the use of vectorizing compiler optimization unpopular and restricts its scope to hand-coded routines
SIMD instructions on MMX and AltiVec tend to be solutions, not primitives
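The difference between strided and gather/scatter addressing can be sketched with a list standing in for memory. The function names and the sample contents below are hypothetical.

```python
def strided_load(memory, base, stride, count):
    """Load `count` elements starting at `base`, stepping `stride` locations each time."""
    return [memory[base + i * stride] for i in range(count)]


def gather(memory, addresses):
    """Load elements from an explicit list of addresses (an index vector)."""
    return [memory[a] for a in addresses]


def scatter(memory, addresses, values):
    """Store each value at the matching address from the index vector."""
    for a, v in zip(addresses, values):
        memory[a] = v


memory = list(range(100, 116))           # assumed contents: memory[i] == 100 + i
column = strided_load(memory, 1, 4, 3)   # e.g. one column of a 4-wide row-major matrix
picked = gather(memory, [7, 0, 3])
scatter(memory, [7, 0, 3], [-1, -2, -3])
```

With only unit-stride loads available, the `column` access would need one scalar load per element, which is the kind of limit on speedup the slide describes.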
Starting a Program
[Figure: C program → Compiler → Assembly language program → Assembler → Object: machine language module, together with Object: library routine (machine language) → Linker → Executable: machine language program → Loader → Memory]
Linker
- Places code and data modules symbolically in memory
- Determines the addresses of data and instruction labels
- Patches both internal and external references
Object files for Unix typically contain:
- Header: size and position of components
- Text segment: machine code
- Data segment: static and dynamic variables
- Relocation info: identifies absolute memory references
- Symbol table: names and locations of labels, procedures and variables
- Debugging info: mapping of source to object code, break points, etc.
Loading Executable Program
[Figure: memory layout, from high to low addresses: stack ($sp at 7fff ffff hex), dynamic data, static data (from 1000 0000 hex, $gp at 1000 8000 hex), text (pc at 0040 0000 hex), reserved (from 0)]
To load an executable, the operating system follows these steps:
- Reads the executable file header to determine the size of the text and data segments
- Creates an address space large enough for the text and data
- Copies the instructions and data from the executable file into memory
- Copies the parameters (if any) to the main program onto the stack
- Initializes the machine registers and sets the stack pointer to the first free location
- Jumps to a start-up routine that copies the parameters into the argument registers and calls the main routine of the program
Instruction Set Design Issues
Instruction set design issues:
- Number of Addresses
- Flow of Control
- Operand Types
- Addressing Modes
- Instruction Types
- Instruction Formats
Number of Addresses
Four categories
3-address machines
- Two for the source operands and one for the result
2-address machines
- One address doubles as source and result
1-address machines
- Accumulator machines
- Accumulator is used for one source and result
0-address machines
- Stack machines
- Operands are taken from the stack
- Result goes onto the stack
Number of Addresses (cont.)
Three-address machines
- Two for the source operands, one for the result
- RISC processors use three addresses
Sample instructions
add dest,src1,src2      M(dest) = [src1] + [src2]
sub dest,src1,src2      M(dest) = [src1] - [src2]
mult dest,src1,src2     M(dest) = [src1] * [src2]
Three addresses: Operand 1, Operand 2, Result
Example: a = b + c
Three-address instruction formats are not common because they require a relatively long instruction format to hold the three address references
Number of Addresses (cont.)
Example
C statement
A = B + C * D - E + F + A
Equivalent code:
mult T,C,D      T = C*D
add T,T,B       T = B+C*D
sub T,T,E       T = B+C*D-E
add T,T,F       T = B+C*D-E+F
add A,T,A       A = B+C*D-E+F+A
Number of Addresses (cont.)
Two-address machines
- One address doubles (for source operand and result)
- The last example makes a case for it: address T is used twice
Sample instructions
load dest,src     M(dest) = [src]
add dest,src      M(dest) = [dest] + [src]
sub dest,src      M(dest) = [dest] - [src]
mult dest,src     M(dest) = [dest] * [src]
Two addresses: one address doubles as operand and result
Example: a = a + b
The two-address format reduces the space requirement but also introduces some awkwardness. To avoid altering the value of an operand, a MOVE instruction is used to move one of the values to a result or temporary location before performing the operation
Number of Addresses (cont.)
Example
C statement
A = B + C * D - E + F + A
Equivalent code:
load T,C      T = C
mult T,D      T = C*D
add T,B       T = B+C*D
sub T,E       T = B+C*D-E
add T,F       T = B+C*D-E+F
add A,T       A = B+C*D-E+F+A
Number of Addresses (cont.)
One-address machines
- Use a special set of registers called accumulators
- Specify one source operand and receive the result
- Called accumulator machines
Sample instructions
load addr      accum = [addr]
store addr     M[addr] = accum
add addr       accum = accum + [addr]
sub addr       accum = accum - [addr]
mult addr      accum = accum * [addr]
Number of Addresses (cont.)
Example
C statement
A = B + C * D - E + F + A
Equivalent code:
load C       load C into accum
mult D       accum = C*D
add B        accum = C*D+B
sub E        accum = B+C*D-E
add F        accum = B+C*D-E+F
add A        accum = B+C*D-E+F+A
store A      store accum contents in A
Number of Addresses (cont.)
Zero-address machines
- Stack supplies operands and receives the result
- Special instructions to load and store use an address
- Called stack machines (Ex: HP3000, Burroughs B5500)
Sample instructions
push addr     push([addr])
pop addr      pop([addr])
add           push(pop + pop)
sub           push(pop - pop)
mult          push(pop * pop)
Number of Addresses (cont.)
Example
C statement
A = B + C * D - E + F + A
Equivalent code:
push E
push C
push D
mult
push B
add
sub
push F
add
push A
add
pop A
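The zero-address program above can be traced with a few lines of simulation. The interpreter below is a minimal sketch: `run_stack_machine` is a made-up name, memory is a dictionary of named locations, and the initial values are arbitrary. Following the sample instruction definitions, `sub` computes (top of stack) minus (next element).

```python
def run_stack_machine(program, memory):
    """Execute zero-address instructions against a dictionary of named memory locations."""
    stack = []
    for instruction in program:
        parts = instruction.split()
        op = parts[0]
        if op == "push":
            stack.append(memory[parts[1]])
        elif op == "pop":
            memory[parts[1]] = stack.pop()
        else:
            first = stack.pop()    # top of stack
            second = stack.pop()   # element below it
            if op == "add":
                stack.append(first + second)
            elif op == "sub":
                stack.append(first - second)
            elif op == "mult":
                stack.append(first * second)
            else:
                raise ValueError("unknown instruction: " + instruction)
    return memory


program = ["push E", "push C", "push D", "mult", "push B", "add", "sub",
           "push F", "add", "push A", "add", "pop A"]
memory = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
run_stack_machine(program, memory)
```

Pushing E first is what makes the later `sub` produce (B + C*D) - E rather than the reverse.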
Load/Store Architecture
Instructions expect operands in internal processor registers
- Special LOAD and STORE instructions move data between registers and memory
- RISC uses this architecture
- Reduces instruction length
Load/Store Architecture (cont.)
Sample instructions
load Rd,addr        Rd = [addr]
store addr,Rs       (addr) = Rs
add Rd,Rs1,Rs2      Rd = Rs1 + Rs2
sub Rd,Rs1,Rs2      Rd = Rs1 - Rs2
mult Rd,Rs1,Rs2     Rd = Rs1 * Rs2
Number of Addresses (cont.)
Example
C statement
A = B + C * D - E + F + A
Equivalent code
load R1,B
load R2,C
load R3,D
load R4,E
load R5,F
load R6,A
mult R2,R2,R3
add R2,R2,R1
sub R2,R2,R4
add R2,R2,R5
add R2,R2,R6
store A,R2
Flow of Control
Default is sequential flow
Several instructions alter this default execution
Branches
- Unconditional
- Conditional
- Delayed branches
Procedure calls
- Delayed procedure calls
Flow of Control (cont.)
Branches
Unconditional
- Absolute address
- PC-relative
  - Target address is specified relative to PC contents
  - Relocatable code
Example: MIPS
- Absolute address
  j target
- PC-relative
  b target
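How the two kinds of target are formed can be sketched as follows. The formulas assume the usual MIPS32 conventions, which the slide does not spell out: a 16-bit signed word offset relative to the instruction after the branch, and a 26-bit word address for `j` combined with the upper 4 bits of PC + 4. The function names are illustrative.

```python
def sign_extend_16(value):
    """Interpret the low 16 bits of `value` as a two's complement number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def branch_target(pc, offset16):
    """PC-relative: word offset counted from the instruction after the branch."""
    return (pc + 4 + (sign_extend_16(offset16) << 2)) & 0xFFFFFFFF


def jump_target(pc, address26):
    """Absolute within a region: 26-bit word address joined with the top 4 bits of PC + 4."""
    return ((pc + 4) & 0xF0000000) | ((address26 & 0x03FFFFFF) << 2)


forward = branch_target(0x00400000, 3)
backward = branch_target(0x00400010, 0xFFFF)  # offset field holds -1
absolute = jump_target(0x00400000, 0x00100008)
```

Because `branch_target` depends only on the distance from the branch, the same encoded instruction works wherever the code is loaded, which is what makes PC-relative code relocatable.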
Flow of Control (cont.)
Branches
Conditional
- Jump is taken only if the condition is met
Two types
- Set-Then-Jump
  - Condition testing is separated from branching
  - Condition code registers are used to convey the condition test result
  - Condition code registers keep a record of the status of the last ALU operation, such as overflow condition
- Example: Pentium code
  cmp AX,BX      compare AX and BX
  je target      jump if equal
Flow of Control (cont.)
- Test-and-Jump
  - Single instruction performs condition testing and branching
- Example: MIPS instruction
  beq Rsrc1,Rsrc2,target
  Jumps to target if Rsrc1 = Rsrc2
Delayed branching
- Control is transferred after executing the instruction that follows the branch instruction
  - This instruction slot is called the delay slot
- Improves efficiency
- Highly pipelined RISC processors support it
Flow of Control (cont.)
Procedure calls
- Facilitate modular programming
- Require two pieces of information to return
  - End of procedure
    - Pentium uses the ret instruction
    - MIPS uses the jr instruction
  - Return address
    - In a (special) register: MIPS allows any general-purpose register
    - On the stack: Pentium
Flow of Control (cont.)
[Figure: delay slot]
Parameter Passing
Two basic techniques
Register-based (e.g. PowerPC, MIPS)
- Internal registers are used
  - Faster
  - Limit the number of parameters
  - Recursive procedure
Stack-based (e.g. Pentium)
- Stack is used
  - More general
Operand Types
Instructions support basic data types
- Characters
- Integers
- Floating-point
Instruction overload
- Same instruction for different data types
- Example: Pentium
  mov AL,address      loads an 8-bit value
  mov AX,address      loads a 16-bit value
  mov EAX,address     loads a 32-bit value
Operand Types (cont.)
Separate instructions
- Instructions specify the operand size
- Example: MIPS
  lb Rdest,address     loads a byte
  lh Rdest,address     loads a halfword (16 bits)
  lw Rdest,address     loads a word (32 bits)
  ld Rdest,address     loads a doubleword (64 bits)
Similar instructions for store
Addressing Modes
How the operands are specified
Operands can be in three places
- Registers
  - Register addressing mode
- Part of instruction
  - Constant
  - Immediate addressing mode
  - All processors support these two addressing modes
- Memory
  - Difference between RISC and CISC
  - CISC supports a large variety of addressing modes
  - RISC follows load/store architecture
Instruction Types
Several types of instructions
Data movement
- Pentium: mov dest,src
- Some do not provide direct data movement instructions
- Indirect data movement
  add Rdest,Rsrc,0      Rdest = Rsrc+0
Arithmetic and Logical
- Arithmetic
  - Integer and floating-point, signed and unsigned
  - add, subtract, multiply, divide
- Logical
  - and, or, not, xor
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 79101
Condition code bits
S6 Sign bit (gt E F E $)
6 Vero bit (gt E non-ero E -ero)
$6 5verflo bit (gt E no overflo E overflo)
C6 Carry bit (gt E no carry E carry)
ltample6 Pentium
cm coutamp comare cout to amp
subtract amp rom cout
e taret um e0ual
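The way a compare sets the four condition code bits can be sketched in Python. This is a minimal model, assuming 32-bit operands and the x86 convention that the carry bit reports a borrow on subtraction; the function name `compare_flags` and the operand values are illustrative, not taken from the slides.

```python
def compare_flags(a: int, b: int, bits: int = 32) -> dict:
    """Flags produced by computing a - b on `bits`-wide operands.

    The difference itself is discarded, as a compare instruction does.
    """
    mask = (1 << bits) - 1
    sign = 1 << (bits - 1)
    a &= mask
    b &= mask
    result = (a - b) & mask
    return {
        "S": int(bool(result & sign)),   # 1 = result is negative
        "Z": int(result == 0),           # 1 = result is zero
        # Signed overflow: operands differ in sign and the result's sign
        # differs from the sign of the first operand.
        "O": int(bool((a ^ b) & (a ^ result) & sign)),
        "C": int(a < b),                 # 1 = borrow in the unsigned subtract
    }


# A "jump on equal" is taken exactly when the zero bit is set.
flags = compare_flags(25, 25)
jump_taken = flags["Z"] == 1
```

The conditional jump never looks at the operands again; it only inspects the bits that the compare left behind.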
Instruction Types (cont.)
Flow control and I/O instructions
- Branch
- Procedure call
- Interrupts
I/O instructions
- Memory-mapped I/O
  - Most processors support memory-mapped I/O
  - No separate instructions for I/O
- Isolated I/O
  - Pentium supports isolated I/O
  - Separate I/O instructions
      in AX,port    ; read from an I/O port
      out port,AX   ; write to an I/O port
5. Instruction Formats
Two types
Fixed-length
- Used by RISC processors
- 32-bit RISC processors use 32-bit-wide instructions
  - Examples: SPARC, MIPS, PowerPC
Variable-length
- Used by CISC processors
- Memory operands need more bits to specify
Opcode
- Major and exact operation
Examples of Instruction Formats
RISC (Reduced Instruction Set Computer)
versus
CISC (Complex Instruction Set Computer)
RISC vs CISC
The underlying philosophy of RISC machines is that a system is better able to manage program execution when the program consists of only a few different instructions that are the same length and require the same number of clock cycles to decode and execute
RISC systems access memory only with explicit load and store instructions
In CISC systems, many different kinds of instructions access memory, making instruction length variable and fetch-decode-execute time unpredictable
The difference between CISC and RISC becomes evident through the basic computer performance equation:
time/program = (time/cycle) × (cycles/instruction) × (instructions/program)
RISC systems shorten execution time by reducing the clock cycles per instruction
CISC systems improve performance by reducing the number of instructions per program
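The performance equation is a product of three factors, so either design style can only win by lowering one factor more than it raises the others. The sketch below makes that trade-off concrete; the function name and every number in it are hypothetical values chosen for illustration, not measurements.

```python
def execution_time_ns(instructions: int, cycles_per_instruction: float,
                      cycle_time_ns: float) -> float:
    """time/program = instructions/program * cycles/instruction * time/cycle."""
    return instructions * cycles_per_instruction * cycle_time_ns


# Hypothetical CISC-style run: fewer instructions, more cycles for each one.
cisc_time = execution_time_ns(instructions=1_000,
                              cycles_per_instruction=6.0,
                              cycle_time_ns=2.0)

# Hypothetical RISC-style run: more instructions, about one cycle each,
# and a shorter clock cycle.
risc_time = execution_time_ns(instructions=3_000,
                              cycles_per_instruction=1.2,
                              cycle_time_ns=1.0)
```

With these made-up inputs the RISC-style run finishes sooner even though it executes three times as many instructions, because the other two factors shrink by more than a factor of three combined.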
The simple instruction set of RISC machines enables control units to be hardwired for maximum speed
The more complex, and variable, instruction set of CISC machines requires microcode-based control units that interpret instructions as they are fetched from memory. This translation takes time
With fixed-length instructions, RISC lends itself to pipelining and speculative execution
Consider the following program fragments:
CISC:
    mov ax, 10
    mov bx, 5
    mul bx, ax
RISC:
    mov ax, 0
    mov bx, 10
    mov cx, 5
Begin:
    add ax, bx
    loop Begin
The total clock cycles for the CISC version might be:
(2 movs × 1 cycle) + (1 mul × 30 cycles) = 32 cycles
While the clock cycles for the RISC version are:
(3 movs × 1 cycle) + (5 adds × 1 cycle) + (5 loops × 1 cycle) = 13 cycles
With the RISC clock cycle also being shorter, RISC gives us much faster execution speeds
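The cycle counts for two such fragments can be checked with a few lines of Python. The sketch assumes the usual form of this textbook example: a multiply of 10 by 5 done either with one 30-cycle `mul` or with a five-iteration add loop of single-cycle instructions. The helper names are illustrative.

```python
def total_cycles(instruction_mix):
    """Sum count * cycles over (count, cycles_per_instruction) pairs."""
    return sum(count * cycles for count, cycles in instruction_mix)


cisc_mix = [(2, 1), (1, 30)]          # 2 movs, 1 mul
risc_mix = [(3, 1), (5, 1), (5, 1)]   # 3 movs, 5 adds, 5 loops

cisc_cycles = total_cycles(cisc_mix)
risc_cycles = total_cycles(risc_mix)


def multiply_by_repeated_add(multiplicand: int, multiplier: int) -> int:
    """What the RISC loop computes: add the multiplicand `multiplier` times."""
    accumulator = 0
    counter = multiplier
    while counter > 0:        # the loop instruction decrements and tests the counter
        accumulator += multiplicand
        counter -= 1
    return accumulator
```

Both fragments produce the same product; only the instruction count and the cycles per instruction differ.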
Because of their load-store ISAs, RISC architectures require a large number of CPU registers
These registers provide fast access to data during sequential program execution
They can also be employed to reduce the overhead typically caused by passing parameters to subprograms
Instead of pulling parameters off of a stack, the subprogram is directed to use a subset of registers
This is how registers can be overlapped in a RISC system
The current window pointer (CWP) points to the active register window
[Figure: overlapping register windows]
It is becoming increasingly difficult to distinguish RISC architectures from CISC architectures
Some RISC systems provide more extravagant instruction sets than some CISC systems
Some systems combine both approaches
The following two slides summarize the characteristics that traditionally typify the differences between these two architectures
RISC
- Multiple register sets
- Three operands per instruction
- Parameter passing through register windows
- Single-cycle instructions
- Hardwired control
- Highly pipelined
CISC
- Single register set
- One or two register operands per instruction
- Parameter passing through memory
- Multiple-cycle instructions
- Microprogrammed control
- Less pipelined
Continued...
RISC
- Simple instructions, few in number
- Fixed-length instructions
- Complexity in compiler
- Only LOAD/STORE instructions access memory
- Few addressing modes
CISC
- Many complex instructions
- Variable-length instructions
- Complexity in microcode
- Many instructions can access memory
- Many addressing modes
Summary
Instruction Set Design Issues
Instruction set design issues include:
Where are operands stored?
- registers, memory, stack, accumulator
How many explicit operands are there?
- 0, 1, 2, or 3
How is the operand location specified?
- register, immediate, indirect, . . .
What type & size of operands are supported?
- byte, int, float, double, string, vector . . .
What operations are supported?
- add, sub, mul, move, compare . . .
More About General Purpose Registers
Why do almost all new architectures use GPRs?
Registers are much faster than memory (even cache)
- Register values are available immediately
- When memory isn't ready, the processor must wait ("stall")
Registers are convenient for variable storage
- Compiler assigns some variables just to registers
- More compact code, since small fields specify registers (compared to memory addresses)
[Figure: memory hierarchy: Processor (Registers), Cache, Memory, Disk]
What Operations are Needed
Arithmetic & Logical
- Integer arithmetic: ADD, SUB, MULT, DIV, SHIFT
- Logical operation: AND, OR, XOR, NOT
Data Transfer - copy, load, store
Control - branch, jump, call, return
Floating Point - ADDF, MULF, DIVF (same as arithmetic but usually take bigger operands)
Decimal - ADDD, CONVERT
String - move, compare, search
Graphics - pixel and vertex, compression/decompression operations
Stack Architecture: Pros and Cons
Pros
- Good code density (implicit top of stack)
- Low hardware requirements
- Easy to write a simpler compiler for stack architectures
Cons
- Stack becomes the bottleneck
- Little ability for parallelism or pipelining
- Data is not always at the top of the stack when needed, so additional instructions like TOP and SWAP are needed
- Difficult to write an optimizing compiler for stack architectures
Accumulator Architecture: Pros and Cons
Pros
- Very low hardware requirements
- Easy to design and understand
Cons
- Accumulator becomes the bottleneck
- Little ability for parallelism or pipelining
- High memory traffic
Memory-Memory Architecture: Pros and Cons
Pros
- Requires fewer instructions (especially if 3 operands)
- Easy to write compilers for (especially if 3 operands)
Cons
- Very high memory traffic (especially if 3 operands)
- Variable number of clocks per instruction
- With two operands, more data movements are required
Memory-Register Architecture: Pros and Cons
Pros
- Some data can be accessed without loading first
- Instruction format easy to encode
- Good code density
Cons
- Operands are not equivalent (poor orthogonality)
- Variable number of clocks per instruction
- May limit number of registers
Other types of Architecture
High-Level-Language Architecture
• In the 1960s, systems software was rarely written in high-level languages, and virtually every commercial operating system before Unix was written in assembly
• Some people blamed the code density on the instruction set rather than the programming language
• A machine design philosophy was advocated with the goal of making the hardware more like high-level languages
• The effectiveness of high-level languages, memory size limitations and the lack of efficient compilers doomed this philosophy to a historical footnote
Reduced Instruction Set Architecture
• With the recent development in compiler technology and expanded memory sizes, fewer programmers are using assembly-level coding
• Instruction set architecture became measurable in the way compilers, rather than programmers, use them
• RISC architecture favors simplifying hardware design over enriching the offered set of instructions, relying on compilers to effectively use them to perform complex operations
• Virtually all new architectures since the 1980s follow the RISC philosophy of fixed instruction lengths, load-store operations, and limited addressing modes
Evolution of Instruction Sets
Single Accumulator (EDSAC 1950)
Accumulator + Index Registers (Manchester Mark I, IBM 700 series 1953)
Separation of Programming Model from Implementation
- High-level Language Based (B5000 1963)
- Concept of a Family (IBM 360 1964)
General Purpose Register Machines
- Complex Instruction Sets (Vax, Intel 432 1977-80)
- Load/Store Architecture (CDC 6600, Cray 1 1963-76)
RISC (Mips, Sparc, HP-PA, IBM RS6000, . . . 1987)
Register-Memory Architectures
Effect of the number of memory operands
Number of memory addresses | Max. number of operands | Examples
0 | 3 | SPARC, MIPS, PowerPC, ALPHA
1 | 2 | Intel 80x86, Motorola 68000
2 | 2 | VAX (also has 3 operands format)
3 | 3 | VAX (also has 2 operands format)
Memory Addressing
Interpreting Memory Addressing
The address of a word matches the byte address of one of its 4 bytes
The addresses of sequential words differ by 4 (word size in bytes)
Word addresses are multiples of 4 (alignment restriction)
Machines that use the address of the leftmost byte as the word address are called "Big Endian", and those that use the rightmost byte are called "Little Endian"
Misalignment complicates memory access and causes programs to run slower (some machines do not allow misaligned memory access at all)
Byte ordering can be a problem when exchanging data among different machines
Byte addresses affect array index calculation, to account for word addressing and the offset within the word
Object addressed | Aligned at byte offsets | Misaligned at byte offsets
Byte | 0, 1, 2, 3, 4, 5, 6, 7 | Never
Half word | 0, 2, 4, 6 | 1, 3, 5, 7
Word | 0, 4 | 1, 2, 3, 5, 6, 7
Double word | 0 | 1, 2, 3, 4, 5, 6, 7
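The alignment rule and the two byte orders are easy to check in Python. This sketch assumes the usual rule that an object of size s bytes is aligned when its address is a multiple of s, and it uses the standard `struct` module to show both byte orders; the helper names and the sample value are illustrative.

```python
import struct


def is_aligned(address: int, size: int) -> bool:
    """An object of `size` bytes is aligned when address % size == 0."""
    return address % size == 0


def misaligned_offsets(size: int, window: int = 8) -> list:
    """Byte offsets within a `window`-byte block where the object is misaligned."""
    return [offset for offset in range(window) if not is_aligned(offset, size)]


# The same 32-bit word laid out in memory under both byte orders.
word = 0x0A0B0C0D
big_endian_bytes = struct.pack(">I", word)      # most significant byte first
little_endian_bytes = struct.pack("<I", word)   # least significant byte first

# Reading bytes written by one kind of machine with the other byte order
# yields a different number, which is the data-exchange problem noted above.
misread = struct.unpack("<I", big_endian_bytes)[0]
```

Calling `misaligned_offsets` with sizes 1, 2, 4 and 8 gives the misaligned offsets for a byte, half word, word and double word respectively.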
Addressing Modes
Addressing modes refer to how to specify the location of an operand (effective address)
Addressing modes have the ability to:
- Significantly reduce instruction counts
- Increase the average CPI
- Increase the complexity of building a machine
The VAX machine is used for benchmark data since it supports a wide range of memory addressing modes
Famous addressing modes can be classified based on:
- the source of the data, into register, immediate or memory
- the address calculation, into direct and indirect
An indexed addressing mode is usually provided to allow efficient implementation of loops and array access
Example of Addressing Modes
Addressing mode | Example | Meaning | When used
Register | Add R4, R3 | Regs[R4] ← Regs[R4] + Regs[R3] | When a value is in a register
Immediate | Add R4, #3 | Regs[R4] ← Regs[R4] + 3 | For constants
Displacement | Add R4, 100(R1) | Regs[R4] ← Regs[R4] + Mem[100 + Regs[R1]] | Accessing local variables
Register indirect | Add R4, (R1) | Regs[R4] ← Regs[R4] + Mem[Regs[R1]] | Accessing using a pointer or a computed address
Indexed | Add R4, (R1 + R2) | Regs[R4] ← Regs[R4] + Mem[Regs[R1] + Regs[R2]] | Sometimes useful in array addressing: R1 = base of the array; R2 = index amount
Direct or absolute | Add R4, (1001) | Regs[R4] ← Regs[R4] + Mem[1001] | Sometimes useful for accessing static data; address constant may need to be large
Memory indirect or memory deferred | Add R4, @(R3) | Regs[R4] ← Regs[R4] + Mem[Mem[Regs[R3]]] | If R3 is the address of the pointer p, then the mode yields *p
Autoincrement | Add R4, (R2)+ | Regs[R4] ← Regs[R4] + Mem[Regs[R2]]; Regs[R2] ← Regs[R2] + d | Useful for stepping through arrays within a loop. R2 points to the start of the array; each reference increments R2 by d
Autodecrement | Add R4, -(R2) | Regs[R2] ← Regs[R2] - d; Regs[R4] ← Regs[R4] + Mem[Regs[R2]] | Same use as autoincrement. Autodecrement/increment can also act as push/pop to implement a stack
Scaled | Add R4, 100(R2)[R3] | Regs[R4] ← Regs[R4] + Mem[100 + Regs[R2] + Regs[R3] × d] | Used to index arrays
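The effective-address rules for these modes can be written out as one small Python function. It is only a model: registers and memory are plain dictionaries, `d` is the operand size in bytes, and the function name, mode strings and keyword arguments are invented for this sketch.

```python
def fetch_operand(mode, regs, mem, reg=None, index=None, const=0, d=4):
    """Return the source operand selected by one addressing mode.

    `regs` maps register names to values, `mem` maps addresses to values.
    Autoincrement and autodecrement also update `regs[reg]` as a side effect.
    """
    if mode == "register":
        return regs[reg]
    if mode == "immediate":
        return const
    if mode == "displacement":
        return mem[const + regs[reg]]
    if mode == "register_indirect":
        return mem[regs[reg]]
    if mode == "indexed":
        return mem[regs[reg] + regs[index]]
    if mode == "direct":
        return mem[const]
    if mode == "memory_indirect":
        return mem[mem[regs[reg]]]
    if mode == "autoincrement":
        value = mem[regs[reg]]      # use the address first...
        regs[reg] += d              # ...then step to the next element
        return value
    if mode == "autodecrement":
        regs[reg] -= d              # step back first...
        return mem[regs[reg]]       # ...then use the new address
    if mode == "scaled":
        return mem[const + regs[reg] + regs[index] * d]
    raise ValueError(f"unknown addressing mode: {mode}")


# Hypothetical machine state for trying the modes out.
demo_regs = {"R1": 8, "R2": 16, "R3": 2}
demo_mem = {address: 1000 + address for address in range(0, 256, 4)}
demo_mem[8] = 16   # the word at address 8 holds a pointer to address 16
```

Every mode reduces to "compute an address, then read it"; the modes differ only in how many register reads, additions and memory reads are needed to get there, which is why richer modes tend to cut instruction count while raising CPI.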
Addressing Mode for Signal Processing
DSP offers special addressing modes to better serve popular algorithms
Modulo addressing
- Since DSP deals with continuous data streams, circular buffers are widely used
- Circular or modulo addressing allows automatic increment and decrement, and resets the pointer when reaching the end of the buffer
Reverse addressing
- The resulting address is the reverse order of the current address
- Reverse addressing mode expedites the access, which otherwise requires a number of logical instructions or extra memory accesses
Fast Fourier Transform
0 (000) → 0 (000)
1 (001) → 4 (100)
2 (010) → 2 (010)
3 (011) → 6 (110)
4 (100) → 1 (001)
5 (101) → 5 (101)
6 (110) → 3 (011)
7 (111) → 7 (111)
Special features require either hand coding or a compiler that uses such features (GNU would not be a good choice)
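Both DSP addressing modes are simple to state in software, which shows what the hardware saves. The sketch below computes a bit-reversed index and a wrapping buffer pointer; the function names are illustrative, and a real DSP performs these steps inside its address generation unit rather than with instructions.

```python
def bit_reverse(index: int, bits: int) -> int:
    """Reverse the low `bits` bits of `index` (FFT input/output reordering)."""
    result = 0
    for _ in range(bits):
        result = (result << 1) | (index & 1)   # move the lowest bit to the other end
        index >>= 1
    return result


def circular_next(pointer: int, base: int, length: int, step: int = 1) -> int:
    """Advance `pointer` by `step` inside the buffer [base, base + length), wrapping."""
    return base + (pointer - base + step) % length


# Access order for an 8-point FFT using 3-bit reversed addressing.
fft_order = [bit_reverse(i, 3) for i in range(8)]
```

Without hardware support, each reversed address costs a loop like the one in `bit_reverse`, and each buffer access costs an explicit compare and reset.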
Operations of the Computer Hardware
"There must certainly be instructions for performing the fundamental arithmetic operations."
Burks, Goldstine and von Neumann, 1947
Assembly language is a symbolic representation of what the processor actually understands
The MIPS assembler allows only one instruction per line and ignores comments following # until the end of the line
Example:
Translation of a segment of a C program to MIPS assembly instructions
C:
    f = (g + h) - (i + j)
MIPS:
    add t0, g, h    # temp variable t0 contains "g + h"
    add t1, i, j    # temp variable t1 contains "i + j"
    sub f, t0, t1   # f = t0 - t1 = (g + h) - (i + j)
Operations in the Instruction Set
Operator type | Examples
Arithmetic and logical | Integer arithmetic and logical operations: add, and, subtract, or
Data Transfer | Loads-stores (move instructions on machines with memory addressing)
Control | Branch, jump, procedure call and return, trap
System | Operating system call, virtual memory management instructions
Floating point | Floating point instructions: add, multiply
Decimal | Decimal add, decimal multiply, decimal to character conversion
String | String move, string compare, string search
Graphics | Pixel operations, compression/decompression operations
Arithmetic, logical, data transfer and control are almost standard categories for all machines
System instructions are required for multi-programming environments, although support for system functions varies
Decimal and string instructions can be primitives, e.g., on the IBM 360 and the VAX
Support for floating point, decimal, string and graphics can be optional, sometimes provided via a co-processor
Some machines rely on the compiler to synthesize special operations, such as string handling, from simpler instructions
Operations for Media & Signal Processing
Single instruction multiple data (SIMD) and vector instructions are often supported in DSPs, which are commonly used in multimedia and signal processing applications
Partitioned Add (integer)
- Perform multiple 16-bit additions on a 64-bit ALU, since most data are narrow
- Increases ALU throughput for multimedia applications
Paired single operations (float)
- Allow the same register to act as two operands to the same operation
- Handy in dealing with vertices and coordinates
Multiply and accumulate
- Very handy for calculating dot products of vectors (signal processing) and matrix multiplication
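A partitioned add and a multiply-accumulate loop can both be imitated with Python integers. This sketch assumes four 16-bit lanes in a 64-bit word with wrap-around inside each lane; saturating variants also exist on real hardware and are not modeled here. The function names are illustrative.

```python
def partitioned_add(a: int, b: int, lane_bits: int = 16, total_bits: int = 64) -> int:
    """Add corresponding lanes of two packed words; carries never cross lanes."""
    lane_mask = (1 << lane_bits) - 1
    result = 0
    for shift in range(0, total_bits, lane_bits):
        lane_sum = ((a >> shift) & lane_mask) + ((b >> shift) & lane_mask)
        result |= (lane_sum & lane_mask) << shift   # drop the carry out of the lane
    return result


def dot_product(xs, ys):
    """One multiply-accumulate step per pair of elements."""
    accumulator = 0
    for x, y in zip(xs, ys):
        accumulator += x * y
    return accumulator


packed_sum = partitioned_add(0x0001_FFFF_0003_0004, 0x0001_0001_0001_0001)
```

The lane holding 0xFFFF wraps to 0 without disturbing its neighbor, which is the only difference from an ordinary 64-bit add: the carry chain is cut at each lane boundary.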
Frequency of Operations Usage
The most widely executed instructions are the simple operations of an instruction set
The following is the average usage in SPECint92 on Intel 80x86
Rank | 80x86 instruction | Integer average (% total executed)
1 | Load | 22%
2 | Conditional branch | 20%
3 | Compare | 16%
4 | Store | 12%
5 | Add | 8%
6 | And | 6%
7 | Sub | 5%
8 | Move register-register | 4%
9 | Call | 1%
10 | Return | 1%
Total | | 96%
Make the common case fast by focusing on these operations
Control Flow Instructions
Jump for unconditional change in the control flow
Branch for conditional change in the control flow
Procedure calls and returns
Data is based on SPEC on Alpha
Destination Address Definition
Relative addressing with respect to the program counter proved to be the best choice for forward and backward branching or jumps (load address independent)
To allow for dynamic loading of library routines, register indirect addressing allows addresses to be loaded in special registers (e.g., virtual functions in C++ and system calls in a case statement)
Data is based on SPEC on Alpha
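Why PC-relative branches are independent of the load address can be shown with a short calculation. The sketch assumes a MIPS-like convention, where the offset counts 4-byte instructions and is applied to the address of the instruction after the branch; other ISAs define the base differently. The function name and addresses are hypothetical.

```python
def branch_target(pc: int, offset_words: int, instruction_bytes: int = 4) -> int:
    """Target = address of the next instruction + offset scaled to bytes."""
    return pc + instruction_bytes + offset_words * instruction_bytes


def distance_to_target(load_address: int) -> int:
    """Branch sits 0x20 bytes into the code, wherever the code is loaded."""
    branch_pc = load_address + 0x20
    return branch_target(branch_pc, offset_words=3) - branch_pc


# The same encoded offset works at two different (hypothetical) load addresses.
distance_a = distance_to_target(0x0040_0000)
distance_b = distance_to_target(0x1000_0000)
```

Because the encoded offset is a distance rather than an absolute address, the loader does not have to patch the branch when the program is placed somewhere else.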
Condition Evaluation
Compare-and-branch can be efficient if the majority of conditions are comparisons with zero
Remember to focus on the common case
Based on SPEC on MIPS
Frequency of Types of Comparison
Data is based on SPEC on Alpha
Different benchmarks and machines set new design priorities
DSPs support a repeat instruction for for-loops (vectors) using registers
Supporting Procedures
Execution of a procedure follows these steps:
- Store parameters in a place accessible to the procedure
- Transfer control to the procedure
- Acquire the storage resources needed for the procedure
- Perform the desired task
- Store the result value in a place accessible to the calling program
- Return control to the point of origin
The hardware provides a program counter to trace instruction flow and manage transfer of control
Parameter Passing
Registers can be used for passing a small number of parameters
A stack is used to spill registers of the current context and make room for the called procedure to run, and to allow large parameters to be passed
Storage of machine state can be performed by the caller or the callee
Handling of shared variables is important to ensure correct semantics, and thus requires clear specifications in the library interface
Global variables stored in registers need careful handling
Type and Size of Operands
The type of an operand is designated by encoding it in the instruction's operation code
The type of an operand, e.g., single precision float, effectively gives its size
Common operand types include character, half word and word size integer, and single- and double-precision floating point
Characters are almost always in ASCII, integers are in 2's complement, and floating point is in IEEE 754
The 16-bit Unicode used in Java is gaining popularity due to its support for international character sets
For business applications, some architectures support a decimal format in binary coded decimal (BCD)
Depending on the size of the word, the complexity of handling different operand types differs
DSP offers fixed point data types to support high precision floating point arithmetic and to allow sharing a single exponent for multiple numbers
For graphics applications, vertex and pixel operands are added features
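The three number formats named above can be inspected from Python. The sketch shows the bit pattern of an integer in two's complement, of a single-precision value in IEEE 754 (via the standard `struct` module), and of a decimal number in packed BCD; the helper names are illustrative.

```python
import struct


def twos_complement(value: int, bits: int) -> int:
    """Bit pattern of `value` as a `bits`-wide two's complement integer."""
    return value & ((1 << bits) - 1)


def float_bits(x: float) -> int:
    """Bit pattern of `x` as an IEEE 754 single-precision number."""
    return struct.unpack(">I", struct.pack(">f", x))[0]


def to_bcd(n: int) -> int:
    """Pack the decimal digits of a non-negative integer, 4 bits per digit."""
    result = 0
    shift = 0
    while True:
        n, digit = divmod(n, 10)
        result |= digit << shift
        shift += 4
        if n == 0:
            return result


minus_one_byte = twos_complement(-1, 8)
one_as_float = float_bits(1.0)
bcd_1234 = to_bcd(1234)
```

Printing the results with `hex()` makes the encodings visible; in BCD the hexadecimal digits read the same as the decimal digits of the original number.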
Size of Operands
Double-word data type is used for double-precision floating point operations and for address storage in machines with a 64-bit wide address bus
Words are used for integer operations and for 32-bit address bus machines
Because of the mix in SPEC, word and double-word data types dominate
Frequency of reference by size based on SPEC2000 on Alpha
Instruction Representation
Humans are taught to think in base 10 (decimal), but numbers may be represented in any base (123 in base 10 = 1111011 in binary, or base 2)
Numbers are stored in computers as a series of high and low electronic signals (binary numbers)
Binary digits are called bits and are considered the atom of computing
Each piece of an instruction is a number, and placing these numbers together forms the instruction
The assembler translates the symbolic assembly instructions into machine language instructions (machine code)
Example:
Assembly: add $t0, $s1, $s2
M/C language (decimal):
    0 | 17 | 18 | 8 | 0 | 32
M/C language (binary):
    000000 | 10001 | 10010 | 01000 | 00000 | 100000
    6 bits | 5 bits | 5 bits | 5 bits | 5 bits | 6 bits
Note: the MIPS compiler by default maps $s0, ..., $s7 to registers 16-23 and $t0, ..., $t7 to registers 8-15
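Packing the six fields of a register-format instruction into one 32-bit word is a matter of shifts and ORs. The sketch below assumes the standard MIPS field widths (6, 5, 5, 5, 5, 6 bits) and the register numbering from the note above; the function and table names are illustrative.

```python
# Default MIPS register numbers for the temporaries and saved registers.
REGISTER_NUMBERS = {f"$t{i}": 8 + i for i in range(8)}
REGISTER_NUMBERS.update({f"$s{i}": 16 + i for i in range(8)})


def encode_r_type(op: int, rs: int, rt: int, rd: int, shamt: int, funct: int) -> int:
    """Pack the fields op|rs|rt|rd|shamt|funct into a 32-bit word."""
    return (op << 26) | (rs << 21) | (rt << 16) | (rd << 11) | (shamt << 6) | funct


# add $t0, $s1, $s2: destination is rd, sources are rs and rt.
add_word = encode_r_type(
    op=0,
    rs=REGISTER_NUMBERS["$s1"],
    rt=REGISTER_NUMBERS["$s2"],
    rd=REGISTER_NUMBERS["$t0"],
    shamt=0,
    funct=32,
)
add_bits = f"{add_word:032b}"
```

Splitting `add_bits` at positions 6, 11, 16, 21 and 26 recovers the six binary fields of the instruction.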
Encoding an Instruction Set
Instruction encoding affects the size of the compiled program and the complexity of the CPU implementation
The operation is typically specified in one field, called the opcode
The addressing mode for the operand can be encoded with the operation, or specified through a separate identifier when a large number of modes is supported
The architecture must balance several competing factors:
- Desire to support as many registers and addressing modes as possible
- Effect of operand specification on the size of the instruction (program)
- Desire to simplify instruction fetching and decoding during execution
Fixed-size instruction encoding simplifies the CPU design while limiting the addressing modes supported
An architect caring about code size can use variable-size encoding
A hybrid approach is to allow variability by supporting multiple instruction sizes
Encoding Examples
MIPS Instruction format
Register-format instructions
    op | rs | rt | rd | shamt | funct
    6 bits | 5 bits | 5 bits | 5 bits | 5 bits | 6 bits
op: Basic operation of the instruction, traditionally called the opcode
rs: The first register source operand
rt: The second register source operand
rd: The register destination operand; it gets the result of the operation
shamt: Shift amount
funct: This field selects the specific variant of the operation of the op field
Immediate-type instructions
    op | rs | rt | address
    6 bits | 5 bits | 5 bits | 16 bits
Some instructions need longer fields than provided, for large value constants
The 16-bit address means a load word instruction can load a word within a region of ±2^15 bytes of the address in the base register
Example: lw $t0, 32($s3)   # Temporary register $t0 gets A[8]
Instruction | Format | op | rs | rt | rd | shamt | funct | address
add | R | 0 | reg | reg | reg | 0 | 32 | N/A
sub | R | 0 | reg | reg | reg | 0 | 34 | N/A
lw | I | 35 | reg | reg | N/A | N/A | N/A | address
sw | I | 43 | reg | reg | N/A | N/A | N/A | address
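The immediate format can be encoded the same way, and the range check makes the ±2^15 limit on the address field explicit. This is a sketch using the standard MIPS field widths (6, 5, 5, 16 bits) and register numbers 19 for $s3 and 8 for $t0; the function name is illustrative.

```python
def encode_i_type(op: int, rs: int, rt: int, immediate: int) -> int:
    """Pack op|rs|rt|16-bit signed immediate into a 32-bit word."""
    if not -(1 << 15) <= immediate < (1 << 15):
        raise ValueError("immediate does not fit in a signed 16-bit field")
    # Masking with 0xFFFF stores negative offsets in two's complement.
    return (op << 26) | (rs << 21) | (rt << 16) | (immediate & 0xFFFF)


# lw $t0, 32($s3): base register in rs, destination in rt, byte offset 32.
# With 4-byte words, offset 32 selects element 8 of an array based at $s3.
lw_word = encode_i_type(op=35, rs=19, rt=8, immediate=32)
```

An offset outside the 16-bit range raises an error here; a real assembler would instead have to build the address in a register with extra instructions.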
The Stored Program Concept
Learning how instructions are represented leads to discovering the secret of computing: the stored-program concept
Today's computers are built on two key principles:
- Instructions are represented as numbers
- Programs can be stored in memory to be read or written just like numbers
The power of the concept: memory can contain
- the source code for an editor
- the compiled m/c code for the editor
- the text that the compiled program is using
- the compiler that generated the code
[Figure: a processor connected to a memory that holds an accounting program (machine code), an editor program (machine code), a C compiler (machine code), payroll data, book text, and the source code in C for the editor program]
Compiling if-then-else in MIPS
Assuming the five variables f, g, h, i, and j correspond to the five registers $s0 through $s4, what is the compiled MIPS code for the following C if statement?
    if (i == j) f = g + h; else f = g - h;
[Figure: flowchart that tests i == j; if i = j it computes f = g + h, if i ≠ j it goes to Else: and computes f = g - h; both paths meet at Exit:]
MIPS:
          bne $s3, $s4, Else    # go to Else if i ≠ j
          add $s0, $s1, $s2     # f = g + h (skipped if i ≠ j)
          j Exit
    Else: sub $s0, $s1, $s2     # f = g - h (skipped if i = j)
    Exit:
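The control flow of the compiled branch sequence can be traced with a very small interpreter. This is a toy that understands only `bne`, `add`, `sub`, `j` and labels, written for this illustration; it is not a MIPS simulator, and the `run` function and `PROGRAM` list are invented names.

```python
def run(program, regs):
    """Execute a list of assembly lines using only bne / add / sub / j."""
    labels = {}
    instructions = []
    for line in program:
        if line.endswith(":"):
            labels[line[:-1]] = len(instructions)   # label marks the next instruction
        else:
            instructions.append(line.replace(",", " ").split())

    pc = 0
    while pc < len(instructions):
        op, *args = instructions[pc]
        pc += 1
        if op == "add":
            regs[args[0]] = regs[args[1]] + regs[args[2]]
        elif op == "sub":
            regs[args[0]] = regs[args[1]] - regs[args[2]]
        elif op == "bne":
            if regs[args[0]] != regs[args[1]]:
                pc = labels[args[2]]
        elif op == "j":
            pc = labels[args[0]]
        else:
            raise ValueError(f"unsupported instruction: {op}")
    return regs


PROGRAM = [
    "bne $s3, $s4, Else",
    "add $s0, $s1, $s2",
    "j Exit",
    "Else:",
    "sub $s0, $s1, $s2",
    "Exit:",
]

# g = 5, h = 3; first with i == j, then with i != j.
equal_case = run(PROGRAM, {"$s0": 0, "$s1": 5, "$s2": 3, "$s3": 1, "$s4": 1})
unequal_case = run(PROGRAM, {"$s0": 0, "$s1": 5, "$s2": 3, "$s3": 1, "$s4": 2})
```

Note that the compiler tests the opposite condition (`bne`) so that the then-part can fall through directly after the branch.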
Typical Compilation
Major Types of Optimization
Optimization name | Explanation | Frequency
High-level | At or near source level; machine independent |
Procedure integration | Replace procedure call by procedure body | N.M.
Local | Within straight-line code |
Common subexpression elimination | Replace two instances of the same computation by a single copy | 18%
Constant propagation | Replace all instances of a variable that is assigned a constant with the constant | 22%
Stack height reduction | Rearrange expression tree to minimize resources needed for expression evaluation | N.M.
Global | Across a branch |
Global common subexpression elimination | Same as local, but this version crosses branches | 13%
Copy propagation | Replace all instances of a variable A that has been assigned X (i.e., A = X) with X | 11%
Code motion | Remove code from a loop that computes the same value each iteration of the loop | 16%
Induction variable elimination | Simplify/eliminate array-addressing calculations within loops | 2%
Machine-dependent | Depends on machine knowledge |
Strength reduction | Many examples, such as replacing multiply by a constant with adds and shifts | N.M.
Pipeline scheduling | Reorder instructions to improve pipeline performance | N.M.
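Three of these transformations are easy to show as before-and-after pairs. The Python functions below are hand-written illustrations of what a compiler would do to its intermediate code; the function names are invented, and each pair is meant to compute the same result.

```python
# Strength reduction: multiply by the constant 10 becomes shifts and an add.
def times_ten_multiply(x: int) -> int:
    return x * 10


def times_ten_shift_add(x: int) -> int:
    return (x << 3) + (x << 1)     # 8x + 2x


# Common subexpression elimination: (a + b) is computed once and reused.
def before_cse(a, b, c):
    return (a + b) * c, (a + b) - c


def after_cse(a, b, c):
    t = a + b
    return t * c, t - c


# Code motion: the loop-invariant product is hoisted out of the loop.
def before_code_motion(values, scale, offset):
    return [v + scale * offset for v in values]


def after_code_motion(values, scale, offset):
    invariant = scale * offset
    return [v + invariant for v in values]
```

In every pair the second version does less work per call or per iteration, which is the whole point of the optimization: the meaning stays fixed and only the cost changes.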
Effect of Compiler Optimization
[Figure: program and compiler optimization level]
Measurements taken on MIPS
Level 0: non-optimized code
Level 1: local optimization
Level 2: global optimization, s/w pipelining
Level 3: adds procedure integration
Compiler Support for Multimedia Instr.
Intel's MMX and PowerPC AltiVec have small vector processing capabilities targeting multimedia applications (to speed up graphics)
Intel added a new set of instructions, called Streaming SIMD Extension
A major advantage of vector computers is hiding the latency of memory access by loading multiple elements and then overlapping execution with data transfer
Vector computers typically have strided and/or gather/scatter addressing to perform operations on distant memory locations
Strided addressing allows memory access in increments larger than one
Gather/scatter addressing is similar to register indirect mode, where the addresses are stored instead of the data
Supporting vector operations without strided addressing, as in Intel's MMX, limits the potential speedup
Such limited support for vector processing makes the use of vectorizing compiler optimization unpopular and restricts its scope to hand-coded routines
SIMD instructions on MMX and AltiVec tend to be solutions, not primitives
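The difference between strided and gather/scatter addressing can be sketched with a list standing in for memory. The function names and the sample matrix are illustrative; a vector machine would perform each call as one instruction.

```python
def strided_load(memory, base, stride, count):
    """Load `count` elements starting at `base`, `stride` locations apart."""
    return [memory[base + i * stride] for i in range(count)]


def gather_load(memory, addresses):
    """Load from an explicit list of addresses (an index vector)."""
    return [memory[address] for address in addresses]


def scatter_store(memory, addresses, values):
    """Store each value at the matching address from the index vector."""
    for address, value in zip(addresses, values):
        memory[address] = value


# A 3 x 4 matrix stored row by row: element (r, c) lives at index 4 * r + c.
matrix_memory = list(range(12))
column_one = strided_load(matrix_memory, base=1, stride=4, count=3)
picked = gather_load(matrix_memory, [7, 0, 3])
```

Reading a column of a row-major matrix needs stride 4 here; without strided loads the elements would have to be fetched one at a time and packed by hand, which is the limitation attributed to MMX above.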
Starting a Program
[Figure: C program → Compiler → Assembly language program → Assembler → Object: machine language module, together with Object: library routine (machine language) → Linker → Executable: machine language program → Loader → Memory]
Linker
- Place code & data modules symbolically in memory
- Determine the address of data & instruction labels
- Patch both internal & external references
Object files for Unix typically contain:
- Header: size & position of components
- Text segment: machine code
- Data segment: static and dynamic variables
- Relocation info: identifies absolute memory references
- Symbol table: name & location of labels, procedures and variables
- Debugging info: mapping of source to object code, break points, etc.
Loading Executable Program
[Figure: memory layout from low to high addresses: Reserved, Text (pc), Static data ($gp), Dynamic data, Stack ($sp, top at 7fff ffff hex)]
To load an executable, the operating system follows these steps:
- Reads the executable file header to determine the size of the text and data segments
- Creates an address space large enough for the text and data
- Copies the instructions and data from the executable file into memory
- Copies the parameters (if any) to the main program onto the stack
- Initializes the machine registers and sets the stack pointer to the first free location
- Jumps to a start-up routine that copies the parameters into the argument registers and calls the main routine of the program
Instruction Set Design Issues
- Number of Addresses
- Flow of Control
- Operand Types
- Addressing Modes
- Instruction Types
- Instruction Formats
Number of Addresses
Four categories
- 3-address machines
  - Two for the source operands and one for the result
- 2-address machines
  - One address doubles as source and result
- 1-address machines
  - Accumulator machines
  - Accumulator is used for one source and the result
- 0-address machines
  - Stack machines
  - Operands are taken from the stack
  - Result goes onto the stack
Number of Addresses (cont.)
Three-address machines
- Two for the source operands, one for the result
- RISC processors use three addresses
- Sample instructions
    add  dest,src1,src2    ; M(dest) = [src1] + [src2]
    sub  dest,src1,src2    ; M(dest) = [src1] - [src2]
    mult dest,src1,src2    ; M(dest) = [src1] * [src2]
Three addresses: Operand 1, Operand 2, Result
Example: a = b + c
Three-address instruction formats are not common because they require a relatively long instruction format to hold the three address references.
Number of Addresses (cont.)
Example
C statement:
    A = B + C * D - E + F + A
Equivalent code:
    mult T,C,D    ; T = C*D
    add  T,T,B    ; T = B + C*D
    sub  T,T,E    ; T = B + C*D - E
    add  T,T,F    ; T = B + C*D - E + F
    add  A,T,A    ; A = B + C*D - E + F + A
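The three-address sequence for A = B + C * D - E + F + A can be checked with a tiny interpreter. This is a sketch under stated assumptions: memory is a Python dictionary keyed by variable name, the tuple encoding of instructions is invented for illustration, and the initial values are arbitrary.

```python
def run_three_address(program, memory):
    """Execute (op, dest, src1, src2) tuples: M(dest) = [src1] op [src2]."""
    operations = {
        "add": lambda x, y: x + y,
        "sub": lambda x, y: x - y,
        "mult": lambda x, y: x * y,
    }
    for op, dest, src1, src2 in program:
        memory[dest] = operations[op](memory[src1], memory[src2])
    return memory


program = [
    ("mult", "T", "C", "D"),  # T = C*D
    ("add", "T", "T", "B"),   # T = B + C*D
    ("sub", "T", "T", "E"),   # T = B + C*D - E
    ("add", "T", "T", "F"),   # T = B + C*D - E + F
    ("add", "A", "T", "A"),   # A = B + C*D - E + F + A
]
memory = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6, "T": 0}
run_three_address(program, memory)
```

Five instructions suffice here, but each one carries three address fields.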
Number of Addresses (cont.)
Two-address machines
- One address doubles as source operand and result
- The last example makes a case for it
  - Address T is used twice
- Sample instructions
    load dest,src    ; M(dest) = [src]
    add  dest,src    ; M(dest) = [dest] + [src]
    sub  dest,src    ; M(dest) = [dest] - [src]
    mult dest,src    ; M(dest) = [dest] * [src]
Two addresses: one address doubles as operand and result
Example: a = a + b
The two-address format reduces the space requirement but also introduces some awkwardness. To avoid altering the value of an operand, a MOVE instruction is used to move one of the values to a result or temporary location before performing the operation.
Number of Addresses (cont.)
Example
C statement:
    A = B + C * D - E + F + A
Equivalent code:
    load T,C    ; T = C
    mult T,D    ; T = C*D
    add  T,B    ; T = B + C*D
    sub  T,E    ; T = B + C*D - E
    add  T,F    ; T = B + C*D - E + F
    add  A,T    ; A = B + C*D - E + F + A
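The two-address version of the same statement can be sketched the same way. As before, the dictionary memory, the tuple encoding and the initial values are assumptions made for illustration only.

```python
def run_two_address(program, memory):
    """Execute (op, dest, src) tuples; dest doubles as source and result."""
    for op, dest, src in program:
        if op == "load":
            memory[dest] = memory[src]
        elif op == "add":
            memory[dest] = memory[dest] + memory[src]
        elif op == "sub":
            memory[dest] = memory[dest] - memory[src]
        elif op == "mult":
            memory[dest] = memory[dest] * memory[src]
        else:
            raise ValueError(f"unknown opcode: {op}")
    return memory


program = [
    ("load", "T", "C"),  # T = C
    ("mult", "T", "D"),  # T = C*D
    ("add", "T", "B"),   # T = B + C*D
    ("sub", "T", "E"),   # T = B + C*D - E
    ("add", "T", "F"),   # T = B + C*D - E + F
    ("add", "A", "T"),   # A = B + C*D - E + F + A
]
memory = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6, "T": 0}
run_two_address(program, memory)
```

The extra `load` is the price of the shorter format: one more instruction, but each instruction names only two addresses.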
Number of Addresses (cont.)
One-address machines
- Use a special set of registers called accumulators
  - Specify one source operand and receive the result
- Called accumulator machines
- Sample instructions
    load  addr    ; accum = [addr]
    store addr    ; M[addr] = accum
    add   addr    ; accum = accum + [addr]
    sub   addr    ; accum = accum - [addr]
    mult  addr    ; accum = accum * [addr]
Number of Addresses (cont.)
Example
C statement:
    A = B + C * D - E + F + A
Equivalent code:
    load  C    ; load C into accum
    mult  D    ; accum = C*D
    add   B    ; accum = C*D + B
    sub   E    ; accum = B + C*D - E
    add   F    ; accum = B + C*D - E + F
    add   A    ; accum = B + C*D - E + F + A
    store A    ; store accum contents in A
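A minimal accumulator-machine sketch for the same statement follows. The single implicit register is the local variable `accum`; the instruction encoding and the initial values are hypothetical.

```python
def run_accumulator(program, memory):
    """Execute (op, addr) pairs against one implicit accumulator."""
    accum = 0
    for op, addr in program:
        if op == "load":
            accum = memory[addr]
        elif op == "store":
            memory[addr] = accum
        elif op == "add":
            accum = accum + memory[addr]
        elif op == "sub":
            accum = accum - memory[addr]
        elif op == "mult":
            accum = accum * memory[addr]
        else:
            raise ValueError(f"unknown opcode: {op}")
    return memory


program = [
    ("load", "C"), ("mult", "D"), ("add", "B"), ("sub", "E"),
    ("add", "F"), ("add", "A"), ("store", "A"),
]
memory = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
run_accumulator(program, memory)
```

Every instruction touches memory, which is why accumulator designs are associated with high memory traffic.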
Number of Addresses (cont.)
Zero-address machines
- Stack supplies operands and receives the result
  - Special instructions to load and store use an address
- Called stack machines (Ex: HP3000, Burroughs B5500)
- Sample instructions
    push addr    ; push([addr])
    pop  addr    ; pop([addr])
    add          ; push(pop + pop)
    sub          ; push(pop - pop)
    mult         ; push(pop * pop)
Number of Addresses (cont.)
Example
C statement:
    A = B + C * D - E + F + A
Equivalent code:
    push E
    push C
    push D
    mult
    push B
    add
    sub
    push F
    add
    push A
    add
    pop  A
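The stack-machine sequence depends on operand order, so it is worth tracing. The sketch below assumes that `sub` computes push(pop - pop) with the first pop taking the top of the stack, as in the sample instructions; the encoding and values are again illustrative assumptions.

```python
def run_stack_machine(program, memory):
    """Execute zero-address code; only push and pop carry an address."""
    stack = []
    for instruction in program:
        op = instruction[0]
        if op == "push":
            stack.append(memory[instruction[1]])
        elif op == "pop":
            memory[instruction[1]] = stack.pop()
        else:
            first = stack.pop()   # top of stack
            second = stack.pop()  # element below it
            if op == "add":
                stack.append(first + second)
            elif op == "sub":
                stack.append(first - second)
            elif op == "mult":
                stack.append(first * second)
            else:
                raise ValueError(f"unknown opcode: {op}")
    return memory


program = [
    ("push", "E"), ("push", "C"), ("push", "D"), ("mult",),
    ("push", "B"), ("add",), ("sub",),
    ("push", "F"), ("add",), ("push", "A"), ("add",), ("pop", "A"),
]
memory = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
run_stack_machine(program, memory)
```

E is pushed first so that, when `sub` executes, the top of the stack holds B + C*D and the element below it holds E.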
Load/Store Architecture
- Instructions expect operands in internal processor registers
  - Special LOAD and STORE instructions move data between registers and memory
- RISC uses this architecture
- Reduces instruction length
Load/Store Architecture (cont.)
Sample instructions
    load  Rd,addr       ; Rd = [addr]
    store addr,Rs       ; (addr) = Rs
    add   Rd,Rs1,Rs2    ; Rd = Rs1 + Rs2
    sub   Rd,Rs1,Rs2    ; Rd = Rs1 - Rs2
    mult  Rd,Rs1,Rs2    ; Rd = Rs1 * Rs2
Number of Addresses (cont.)
Example
C statement:
    A = B + C * D - E + F + A
Equivalent code:
    load  R1,B
    load  R2,C
    load  R3,D
    load  R4,E
    load  R5,F
    load  R6,A
    mult  R2,R2,R3
    add   R2,R2,R1
    sub   R2,R2,R4
    add   R2,R2,R5
    add   R2,R2,R6
    store A,R2
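A load/store version of the running example can be sketched as follows. The register names R1 to R6, the tuple encoding and the initial values are assumptions for illustration; the point is that only `load` and `store` touch memory.

```python
def run_load_store(program, memory):
    """Arithmetic works on registers only; load/store move data to and from memory."""
    regs = {}
    memory_accesses = 0
    for instruction in program:
        op = instruction[0]
        if op == "load":
            _, rd, addr = instruction
            regs[rd] = memory[addr]
            memory_accesses += 1
        elif op == "store":
            _, addr, rs = instruction
            memory[addr] = regs[rs]
            memory_accesses += 1
        else:
            _, rd, rs1, rs2 = instruction
            if op == "add":
                regs[rd] = regs[rs1] + regs[rs2]
            elif op == "sub":
                regs[rd] = regs[rs1] - regs[rs2]
            elif op == "mult":
                regs[rd] = regs[rs1] * regs[rs2]
            else:
                raise ValueError(f"unknown opcode: {op}")
    return memory_accesses


program = [
    ("load", "R1", "B"), ("load", "R2", "C"), ("load", "R3", "D"),
    ("load", "R4", "E"), ("load", "R5", "F"), ("load", "R6", "A"),
    ("mult", "R2", "R2", "R3"), ("add", "R2", "R2", "R1"),
    ("sub", "R2", "R2", "R4"), ("add", "R2", "R2", "R5"),
    ("add", "R2", "R2", "R6"), ("store", "A", "R2"),
]
memory = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
accesses = run_load_store(program, memory)
```

The instruction count is the highest of the variants shown, but each variable is read from memory exactly once and the arithmetic instructions need only short register fields.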
Flow of Control
- Default is sequential flow
- Several instructions alter this default execution
  - Branches
    - Unconditional
    - Conditional
    - Delayed branches
  - Procedure calls
    - Delayed procedure calls
Flow of Control (cont.)
Branches
- Unconditional
  - Absolute address
  - PC-relative
    - Target address is specified relative to PC contents
    - Relocatable code
- Example: MIPS
  - Absolute address
        j target
  - PC-relative
        b target
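How the two target encodings differ can be sketched in Python. The formulas follow the classic 32-bit MIPS encoding as commonly documented (16-bit signed word offset relative to PC + 4 for branches, 26-bit word index combined with the upper PC bits for `j`); the function names are invented for this sketch.

```python
def sign_extend16(value):
    """Interpret the low 16 bits of `value` as a signed number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def branch_target(pc, offset16):
    """PC-relative: target = (PC + 4) + sign-extended offset * 4."""
    return (pc + 4 + (sign_extend16(offset16) << 2)) & 0xFFFFFFFF


def jump_target(pc, index26):
    """Pseudo-absolute: upper 4 bits of PC + 4, then the 26-bit word index * 4."""
    return ((pc + 4) & 0xF0000000) | ((index26 & 0x03FFFFFF) << 2)


forward = branch_target(0x00400000, 3)        # three words past the next instruction
backward = branch_target(0x00400010, 0xFFFF)  # offset -1 branches to itself
absolute = jump_target(0x00400000, 0x0100004)

# Relocating the code by 0x1000 leaves the branch offset valid:
relocated = branch_target(0x00401000, 3)
```

The last line illustrates why PC-relative code is relocatable: the same offset field reaches the same relative position wherever the code is loaded.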
Flow of Control (cont.)
Branches
- Conditional
  - Jump is taken only if the condition is met
- Two types
  - Set-Then-Jump
    - Condition testing is separated from branching
    - Condition code registers are used to convey the condition test result
    - Condition code registers keep a record of the status of the last ALU operation, such as an overflow condition
  - Example: Pentium code
        cmp AX,BX     ; compare AX and BX
        je  target    ; jump if equal
Flow of Control (cont.)
  - Test-and-Jump
    - Single instruction performs condition testing and branching
  - Example: MIPS instruction
        beq Rsrc1,Rsrc2,target
    - Jumps to target if Rsrc1 = Rsrc2
- Delayed branching
  - Control is transferred after executing the instruction that follows the branch instruction
    - This instruction slot is called the delay slot
  - Improves efficiency
  - Highly pipelined RISC processors support it
Flow of Control (cont.)
Procedure calls
- Facilitate modular programming
- Require two pieces of information to return
  - End of procedure
    - Pentium uses the ret instruction
    - MIPS uses the jr instruction
  - Return address
    - In a (special) register
      - MIPS allows any general-purpose register
    - On the stack
      - Pentium
Flow of Control (cont.)
[Figure: delay slot]
Parameter Passing
Two basic techniques
- Register-based (e.g., PowerPC, MIPS)
  - Internal registers are used
    - Faster
    - Limits the number of parameters
    - Recursive procedures
- Stack-based (e.g., Pentium)
  - Stack is used
    - More general
Operand Types
- Instructions support basic data types
  - Characters
  - Integers
  - Floating-point
- Instruction overload
  - Same instruction for different data types
  - Example: Pentium
        mov AL,address     ; loads an 8-bit value
        mov AX,address     ; loads a 16-bit value
        mov EAX,address    ; loads a 32-bit value
Operand Types (cont.)
- Separate instructions
  - Instructions specify the operand size
  - Example: MIPS
        lb Rdest,address    ; loads a byte
        lh Rdest,address    ; loads a halfword (16 bits)
        lw Rdest,address    ; loads a word (32 bits)
        ld Rdest,address    ; loads a doubleword (64 bits)
  - Similar instructions for store
Addressing Modes
How the operands are specified
- Operands can be in three places
  - Registers
    - Register addressing mode
  - Part of instruction
    - Constant
    - Immediate addressing mode
    - All processors support these two addressing modes
  - Memory
    - Difference between RISC and CISC
    - CISC supports a large variety of addressing modes
    - RISC follows load/store architecture
Instruction Types
Several types of instructions
- Data movement
  - Pentium: mov dest,src
  - Some do not provide direct data movement instructions
  - Indirect data movement
        add Rdest,Rsrc,0    ; Rdest = Rsrc + 0
- Arithmetic and Logical
  - Arithmetic
    - Integer and floating-point, signed and unsigned
    - add, subtract, multiply, divide
  - Logical
    - and, or, not, xor
Instruction Types (cont.)
Condition code bits
- S: Sign bit (0 = +, 1 = -)
- Z: Zero bit (0 = nonzero, 1 = zero)
- O: Overflow bit (0 = no overflow, 1 = overflow)
- C: Carry bit (0 = no carry, 1 = carry)
- Example: Pentium
        cmp count,25    ; compare count to 25
                        ; subtract 25 from count
        je  target      ; jump if equal
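How a compare instruction could set the four condition code bits can be sketched as a subtraction whose result is discarded. This is an illustrative model, not a description of any specific processor: the 16-bit width is an assumption, and the carry bit follows the borrow convention (set when the unsigned subtraction needs a borrow).

```python
def compare_flags(a, b, bits=16):
    """Return the S, Z, O, C bits produced by computing a - b."""
    mask = (1 << bits) - 1
    sign = 1 << (bits - 1)
    a &= mask
    b &= mask
    result = (a - b) & mask
    return {
        "S": int(bool(result & sign)),                   # sign of the result
        "Z": int(result == 0),                           # result is zero
        "O": int(bool((a ^ b) & (a ^ result) & sign)),   # signed overflow
        "C": int(a < b),                                 # unsigned borrow
    }


equal = compare_flags(25, 25)        # a "jump if equal" would test Z == 1
smaller = compare_flags(10, 25)
overflow = compare_flags(0x8000, 1)  # most negative 16-bit value minus 1
```

A set-then-jump branch then only needs to inspect these bits; it never sees the original operands.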
Instruction Types (cont.)
Flow control and I/O instructions
- Branch
- Procedure call
- Interrupts
I/O instructions
- Memory-mapped I/O
  - Most processors support memory-mapped I/O
  - No separate instructions for I/O
- Isolated I/O
  - Pentium supports isolated I/O
  - Separate I/O instructions
        in  AX,io_port    ; read from an I/O port
        out io_port,AX    ; write to an I/O port
Instruction Formats
Two types
- Fixed-length
  - Used by RISC processors
  - 32-bit RISC processors use 32-bit-wide instructions
    - Examples: SPARC, MIPS, PowerPC
- Variable-length
  - Used by CISC processors
  - Memory operands need more bits to specify
Opcode
- Major and exact operation
Examples of Instruction Formats
RISC (Reduced Instruction Set Computer) versus CISC (Complex Instruction Set Computer)
RISC vs. CISC
- The underlying philosophy of RISC machines is that a system is better able to manage program execution when the program consists of only a few different instructions that are the same length and require the same number of clock cycles to decode and execute.
- RISC systems access memory only with explicit load and store instructions.
- In CISC systems, many different kinds of instructions access memory, making instruction length variable and fetch-decode-execute time unpredictable.
RISC vs. CISC
- The difference between CISC and RISC becomes evident through the basic computer performance equation:
    time/program = (time/cycle) x (cycles/instruction) x (instructions/program)
- RISC systems shorten execution time by reducing the clock cycles per instruction.
- CISC systems improve performance by reducing the number of instructions per program.
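The trade-off expressed by the performance equation (execution time as the product of instruction count, cycles per instruction and cycle time) can be made concrete with a short calculation. All figures below are hypothetical and chosen only to show the direction of each effect; they are not measurements of any machine.

```python
def execution_time(instructions, cycles_per_instruction, cycle_time_ps):
    """time/program = instructions/program * cycles/instruction * time/cycle."""
    return instructions * cycles_per_instruction * cycle_time_ps


# Hypothetical CISC-style profile: fewer instructions, more cycles each,
# and a longer clock cycle (times in picoseconds).
cisc_time = execution_time(instructions=100, cycles_per_instruction=6, cycle_time_ps=2000)

# Hypothetical RISC-style profile: more instructions, about one cycle each,
# and a shorter clock cycle.
risc_time = execution_time(instructions=250, cycles_per_instruction=1, cycle_time_ps=1000)

speedup = cisc_time / risc_time
```

With these made-up numbers the RISC profile wins despite executing 2.5 times as many instructions; different assumptions can reverse the outcome, which is why all three factors have to be considered together.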
RISC vs. CISC
- The simple instruction set of RISC machines enables control units to be hardwired for maximum speed.
- The more complex, and variable, instruction set of CISC machines requires microcode-based control units that interpret instructions as they are fetched from memory. This translation takes time.
- With fixed-length instructions, RISC lends itself to pipelining and speculative execution.
RISC vs. CISC
Consider the two program fragments:
CISC:
    mov ax, 10
    mov bx, 5
    mul bx, ax
RISC:
    mov ax, 0
    mov bx, 10
    mov cx, 5
    Begin: add ax, bx
           loop Begin
The total clock cycles for the CISC version might be:
    (2 movs x 1 cycle) + (1 mul x 30 cycles) = 32 cycles
While the clock cycles for the RISC version are:
    (3 movs x 1 cycle) + (5 adds x 1 cycle) + (5 loops x 1 cycle) = 13 cycles
With the RISC clock cycle being shorter, RISC gives us much faster execution speeds.
RISC vs. CISC
- Because of their load-store ISAs, RISC architectures require a large number of CPU registers.
- These registers provide fast access to data during sequential program execution.
- They can also be employed to reduce the overhead typically caused by passing parameters to subprograms.
- Instead of pulling parameters off of a stack, the subprogram is directed to use a subset of registers.
RISC vs. CISC
- This is how registers can be overlapped in a RISC system.
- The current window pointer (CWP) points to the active register window.
[Figure: overlapping register windows]
RISC vs. CISC
- It is becoming increasingly difficult to distinguish RISC architectures from CISC architectures.
- Some RISC systems provide more extravagant instruction sets than some CISC systems.
- Some systems combine both approaches.
- The following two slides summarize the characteristics that traditionally typify the differences between these two architectures.
RISC vs. CISC
RISC
- Multiple register sets.
- Three operands per instruction.
- Parameter passing through register windows.
- Single-cycle instructions.
- Hardwired control.
- Highly pipelined.
CISC
- Single register set.
- One or two register operands per instruction.
- Parameter passing through memory.
- Multiple-cycle instructions.
- Microprogrammed control.
- Less pipelined.
Continued...
RISC vs. CISC
RISC
- Simple instructions, few in number.
- Fixed-length instructions.
- Complexity in compiler.
- Only LOAD/STORE instructions access memory.
- Few addressing modes.
CISC
- Many complex instructions.
- Variable-length instructions.
- Complexity in microcode.
- Many instructions can access memory.
- Many addressing modes.
Summary
Instruction Set Design Issues
Instruction set design issues include:
- Where are operands stored?
  - registers, memory, stack, accumulator
- How many explicit operands are there?
  - 0, 1, 2, or 3
- How is the operand location specified?
  - register, immediate, indirect, ...
- What type and size of operands are supported?
  - byte, int, float, double, string, vector, ...
- What operations are supported?
  - add, sub, mul, move, compare, ...
More About General Purpose Registers
Why do almost all new architectures use GPRs?
- Registers are much faster than memory (even cache)
  - Register values are available immediately
  - When memory isn't ready, the processor must wait ("stall")
- Registers are convenient for variable storage
  - Compiler assigns some variables just to registers
  - More compact code, since small fields specify registers (compared to memory addresses)
[Figure: memory hierarchy: Processor (Registers), Cache, Memory, Disk]
What Operations are Needed
- Arithmetic and Logical
  - Integer arithmetic: ADD, SUB, MULT, DIV, SHIFT
  - Logical operations: AND, OR, XOR, NOT
- Data Transfer: copy, load, store
- Control: branch, jump, call, return
- Floating Point: ADDF, MULF, DIVF
  - Same as arithmetic but usually take bigger operands
- Decimal: ADDD, CONVERT
- String: move, compare, search
- Graphics: pixel and vertex compression/decompression operations
Stack Architecture: Pros and Cons
Pros
- Good code density (implicit top of stack)
- Low hardware requirements
- Easy to write a simpler compiler for stack architectures
Cons
- Stack becomes the bottleneck
- Little ability for parallelism or pipelining
- Data is not always at the top of the stack when needed, so additional instructions like TOP and SWAP are needed
- Difficult to write an optimizing compiler for stack architectures
Accumulator Architecture: Pros and Cons
Pros
- Very low hardware requirements
- Easy to design and understand
Cons
- Accumulator becomes the bottleneck
- Little ability for parallelism or pipelining
- High memory traffic
Memory-Memory Architecture: Pros and Cons
Pros
- Requires fewer instructions (especially if 3 operands)
- Easy to write compilers for (especially if 3 operands)
Cons
- Very high memory traffic (especially if 3 operands)
- Variable number of clocks per instruction
- With two operands, more data movements are required
Memory-Register Architecture: Pros and Cons
Pros
- Some data can be accessed without loading first
- Instruction format easy to encode
- Good code density
Cons
- Operands are not equivalent (poor orthogonality)
- Variable number of clocks per instruction
- May limit number of registers
Evolution of Instruction Sets
[Figure: evolution of instruction sets]
- Single Accumulator (EDSAC, 1950)
- Accumulator + Index Registers (Manchester Mark I, IBM 700 series, 1953)
- Separation of Programming Model from Implementation
  - High-level Language Based (B5000, 1963)
  - Concept of a Family (IBM 360, 1964)
- General Purpose Register Machines
  - Complex Instruction Sets (VAX, Intel 432, 1977-80)
  - Load/Store Architecture (CDC 6600, Cray 1, 1963-76)
- RISC (MIPS, SPARC, HP-PA, IBM RS6000, ... 1987)
Register-Memory Architectures
Effect of the number of memory operands:
Number of memory addresses | Max. number of operands | Examples
0 | 3 | SPARC, MIPS, PowerPC, ALPHA
1 | 2 | Intel 80x86, Motorola 68000
2 | 2 | VAX (also has 3-operand format)
3 | 3 | VAX (also has 2-operand format)
Memory Address
Interpreting Memory Addressing
- The address of a word matches the byte address of one of its 4 bytes
- The addresses of sequential words differ by 4 (word size in bytes)
- Word addresses are multiples of 4 (alignment restriction)
- Machines that use the address of the leftmost byte as the word address are called "Big Endian", and those that use the rightmost byte are called "Little Endian"
- Misalignment complicates memory access and causes programs to run slower (some machines do not allow misaligned memory access at all)
- Byte ordering can be a problem when exchanging data among different machines
- Byte addressing affects array index calculation, which must account for word addressing and the offset within the word
Object addressed | Aligned at byte offsets | Misaligned at byte offsets
Byte | 0, 1, 2, 3, 4, 5, 6, 7 | Never
Half word | 0, 2, 4, 6 | 1, 3, 5, 7
Word | 0, 4 | 1, 2, 3, 5, 6, 7
Double word | 0 | 1, 2, 3, 4, 5, 6, 7
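Byte ordering and the alignment rule can both be demonstrated with Python's standard `struct` module. The sketch below is illustrative: the value `0x0A0B0C0D` and the helper `is_aligned` are chosen for this example only.

```python
import struct


def is_aligned(address, size):
    """An object of `size` bytes is aligned when its address is a multiple of size."""
    return address % size == 0


value = 0x0A0B0C0D

# Big Endian stores the most significant byte at the lowest address,
# Little Endian stores the least significant byte there.
big = struct.pack(">I", value)
little = struct.pack("<I", value)

# Reading big-endian bytes on a little-endian machine scrambles the value,
# which is the data-exchange problem mentioned above.
misread = struct.unpack("<I", big)[0]

# Aligned byte offsets within an 8-byte window for each object size.
aligned_offsets = {
    size: [offset for offset in range(8) if is_aligned(offset, size)]
    for size in (1, 2, 4, 8)
}
```

The `aligned_offsets` dictionary reproduces the "aligned" column of the table: every offset for bytes, even offsets for half words, 0 and 4 for words, and only 0 for double words.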
Addressing Modes
- Addressing modes refer to how to specify the location of an operand (effective address)
- Addressing modes have the ability to:
  - Significantly reduce instruction counts
  - Increase the average CPI
  - Increase the complexity of building a machine
- The VAX machine is used for benchmark data since it supports a wide range of memory addressing modes
- Famous addressing modes can be classified based on:
  - the source of the data, into register, immediate or memory
  - the address calculation, into direct and indirect
- An indexed addressing mode is usually provided to allow efficient implementation of loops and array access
Example of Addressing Modes
Addressing mode | Example instruction | Meaning | When used
Register | Add R4,R3 | Regs[R4] ← Regs[R4] + Regs[R3] | When a value is in a register
Immediate | Add R4,#3 | Regs[R4] ← Regs[R4] + 3 | For constants
Displacement | Add R4,100(R1) | Regs[R4] ← Regs[R4] + Mem[100 + Regs[R1]] | Accessing local variables
Register indirect | Add R4,(R1) | Regs[R4] ← Regs[R4] + Mem[Regs[R1]] | Accessing using a pointer or a computed address
Indexed | Add R4,(R1 + R2) | Regs[R4] ← Regs[R4] + Mem[Regs[R1] + Regs[R2]] | Sometimes useful in array addressing: R1 = base of the array; R2 = index amount
Direct or absolute | Add R4,(1001) | Regs[R4] ← Regs[R4] + Mem[1001] | Sometimes useful for accessing static data; address constant may need to be large
Memory indirect or memory deferred | Add R4,@(R3) | Regs[R4] ← Regs[R4] + Mem[Mem[Regs[R3]]] | If R3 is the address of the pointer p, then the mode yields *p
Autoincrement | Add R4,(R2)+ | Regs[R4] ← Regs[R4] + Mem[Regs[R2]]; Regs[R2] ← Regs[R2] + d | Useful for stepping through arrays within a loop. R2 points to the start of the array; each reference increments R2 by d
Autodecrement | Add R4,-(R2) | Regs[R2] ← Regs[R2] - d; Regs[R4] ← Regs[R4] + Mem[Regs[R2]] | Same use as autoincrement. Autodecrement/increment can also act as push/pop to implement a stack
Scaled | Add R4,100(R2)[R3] | Regs[R4] ← Regs[R4] + Mem[100 + Regs[R2] + Regs[R3] x d] | Used to index arrays
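The memory addressing modes in the table differ only in how the effective address is formed. The following sketch computes that address for several modes; the function name, the keyword parameters and the register and memory contents are all assumptions made for illustration.

```python
def effective_address(mode, regs, mem, *, base=None, index=None, disp=0, scale=1):
    """Return the memory address an operand refers to under the given mode."""
    if mode == "displacement":
        return disp + regs[base]
    if mode == "register_indirect":
        return regs[base]
    if mode == "indexed":
        return regs[base] + regs[index]
    if mode == "direct":
        return disp
    if mode == "memory_indirect":
        return mem[regs[base]]  # one extra memory read to fetch the pointer
    if mode == "scaled":
        return disp + regs[base] + regs[index] * scale
    raise ValueError(f"unknown addressing mode: {mode}")


regs = {"R1": 200, "R2": 8, "R3": 3}
mem = {200: 500}  # location 200 holds a pointer to address 500

ea_displacement = effective_address("displacement", regs, mem, base="R1", disp=100)
ea_indexed = effective_address("indexed", regs, mem, base="R1", index="R2")
ea_indirect = effective_address("memory_indirect", regs, mem, base="R1")
ea_scaled = effective_address("scaled", regs, mem, base="R2", index="R3", disp=100, scale=4)
```

Register and immediate modes are left out because their operands are not in memory, so no effective address is formed.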
Addressing Mode for Signal Processing
- DSPs offer special addressing modes to better serve popular algorithms
- Modulo addressing
  - Since DSPs deal with continuous data streams, circular buffers are widely used
  - Circular or modulo addressing allows automatic increment and decrement, and resets the pointer when it reaches the end of the buffer
- Reverse addressing
  - The resulting address is the bit-reversed order of the current address
  - Reverse addressing mode expedites the access, which otherwise requires a number of logical instructions or extra memory accesses
- Special features require either hand coding or a compiler that uses such features (GNU would not be a good choice)
Fast Fourier Transform (bit-reversed addressing with 3-bit addresses):
    0 (000) -> 0 (000)
    1 (001) -> 4 (100)
    2 (010) -> 2 (010)
    3 (011) -> 6 (110)
    4 (100) -> 1 (001)
    5 (101) -> 5 (101)
    6 (110) -> 3 (011)
    7 (111) -> 7 (111)
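Both DSP addressing modes are easy to express in software, which also shows the work the hardware saves. The sketch below is a plain Python model; the function names and the buffer base address are assumptions for illustration.

```python
def bit_reverse(index, bits):
    """Reverse the lowest `bits` bits of `index` (FFT reordering)."""
    result = 0
    for _ in range(bits):
        result = (result << 1) | (index & 1)
        index >>= 1
    return result


def modulo_increment(pointer, step, base, length):
    """Advance `pointer` by `step`, wrapping inside [base, base + length)."""
    return base + (pointer - base + step) % length


# Access order for an 8-point FFT (3-bit addresses).
fft_order = [bit_reverse(i, 3) for i in range(8)]

# Hypothetical 8-entry circular buffer starting at address 0x100.
wrapped_forward = modulo_increment(0x107, 1, base=0x100, length=8)
wrapped_backward = modulo_increment(0x100, -1, base=0x100, length=8)
```

In software each access costs a loop or a modulo operation; a DSP with these modes performs the same address update as a side effect of the load or store.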
Operations of the Computer Hardware
"There must certainly be instructions for performing the fundamental arithmetic operations."
    Burks, Goldstine and von Neumann, 1947
- Assembly language is a symbolic representation of what the processor actually understands
- The MIPS assembler allows only one instruction per line and ignores comments following # until the end of the line
- Example: translation of a segment of a C program to MIPS assembly instructions
    C:    f = (g + h) - (i + j)
    MIPS:
        add t0, g, h     # temp variable t0 contains "g + h"
        add t1, i, j     # temp variable t1 contains "i + j"
        sub f, t0, t1    # f = t0 - t1 = (g + h) - (i + j)
Operations in the Instruction Set
Operator type | Examples
Arithmetic and logical | Integer arithmetic and logical operations: add, and, subtract, or
Data transfer | Loads-stores (move instructions on machines with memory addressing)
Control | Branch, jump, procedure call and return, trap
System | Operating system call, virtual memory management instructions
Floating point | Floating-point instructions: add, multiply
Decimal | Decimal add, decimal multiply, decimal-to-character conversion
String | String move, string compare, string search
Graphics | Pixel operations, compression/decompression operations
- Arithmetic, logical, data transfer and control are almost standard categories for all machines
- System instructions are required for multi-programming environments, although support for system functions varies
- Decimal and string instructions can be primitives, e.g. on the IBM 360 and the VAX
- Support for floating point, decimal, string and graphics is sometimes provided optionally via a co-processor
- Some machines rely on the compiler to synthesize special operations, such as string handling, from simpler instructions
Operations for Media and Signal Processing
- Single instruction multiple data (SIMD) and vector instructions are often supported in DSPs, which are commonly used in multimedia and signal processing applications
- Partitioned add (integer)
  - Performs multiple 16-bit additions on a 64-bit ALU, since most data are narrow
  - Increases ALU throughput for multimedia applications
- Paired single operations (float)
  - Allow the same register to act as two operands to the same operation
  - Handy in dealing with vertices and coordinates
- Multiply and accumulate
  - Very handy for calculating dot products of vectors (signal processing) and matrix multiplication
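A partitioned add and a multiply-accumulate loop can be modelled in a few lines. This is a sketch under assumptions: four 16-bit lanes packed into one 64-bit word, wrap-around (non-saturating) lane arithmetic, and function names invented for this example.

```python
def partitioned_add(x, y, lane_bits=16, lanes=4):
    """Add corresponding lanes of two packed words; carries never cross lanes."""
    mask = (1 << lane_bits) - 1
    result = 0
    for lane in range(lanes):
        shift = lane * lane_bits
        total = ((x >> shift) & mask) + ((y >> shift) & mask)
        result |= (total & mask) << shift  # carry out of the lane is dropped
    return result


def multiply_accumulate(xs, ys):
    """Dot product built from repeated acc = acc + x * y steps."""
    acc = 0
    for x, y in zip(xs, ys):
        acc += x * y
    return acc


packed_sum = partitioned_add(0x0001_FFFF_0003_0004, 0x0001_0001_0005_0006)
dot = multiply_accumulate([1, 2, 3], [4, 5, 6])
```

The lane holding 0xFFFF + 0x0001 wraps to zero without disturbing its neighbour, which is the property that distinguishes a partitioned add from an ordinary 64-bit add.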
Frequency of Operations Usage
- Make the common case fast by focusing on these operations
- The most widely executed instructions are the simple operations of an instruction set
- The following is the average usage in SPECint92 on Intel 80x86
Rank | 80x86 instruction | Integer average (% total executed)
1 | Load | 22%
2 | Conditional branch | 20%
3 | Compare | 16%
4 | Store | 12%
5 | Add | 8%
6 | And | 6%
7 | Sub | 5%
8 | Move register-register | 4%
9 | Call | 1%
10 | Return | 1%
Total | | 96%
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 33101
Control 6low Instructions
ltump for unconditional change in the control flo
ranc$ for conditional change in the control flo
Procedure calls and returns
Data is ased on SEC on Alp$a
Destination Address Definition
- Relative addressing with respect to the program counter proved to be the best choice for forward and backward branches or jumps (load address independent)
- To allow for dynamic loading of library routines, register indirect addressing allows addresses to be loaded in special registers (e.g. virtual functions in C++ and system calls in a case statement)
[Figure: data is based on SPEC on Alpha]
Condition Evaluation
- Compare-and-branch can be efficient if the majority of conditions are comparisons with zero
- Remember to focus on the common case
[Figure: based on SPEC on MIPS]
6re-uency of ypes of Comparison
Data is based on SPEC on Alpha
Different benchmarks and machines set new design priorities
DSPs support a repeat instruction for for-loops (vectors) using registers
Supporting Procedures
Execution of a procedure follows these steps:
- Store parameters in a place accessible to the procedure
- Transfer control to the procedure
- Acquire the storage resources needed for the procedure
- Perform the desired task
- Store the result value in a place accessible to the calling program
- Return control to the point of origin
The hardware provides a program counter to trace instruction flow and manage transfer of control
Parameter Passing
- Registers can be used for passing a small number of parameters
- A stack is used to spill the registers of the current context, making room for the called procedure to run, and to allow large parameters to be passed
- Storage of machine state can be performed by the caller or the callee
- Handling of shared variables is important to ensure correct semantics and thus requires clear specifications in the library interface
Global variables stored in registers need careful handling
Type and Size of Operands
- The type of an operand is designated by encoding it in the instruction's operation code
- The type of an operand, e.g. single-precision float, effectively gives its size
- Common operand types include character, half-word and word-size integer, and single- and double-precision floating point
- Characters are almost always in ASCII, integers in two's complement, and floating point in IEEE 754
- The 16-bit Unicode used in Java is gaining popularity due to its support for international character sets
- For business applications, some architectures support a decimal format in binary-coded decimal (BCD)
- Depending on the size of the word, the complexity of handling different operand types differs
- DSPs offer fixed-point data types to support high-precision floating-point arithmetic and to allow sharing a single exponent among multiple numbers
- For graphics applications, vertex and pixel operands are added features
Size of Operands
- The double-word data type is used for double-precision floating-point operations and for address storage in machines with a 64-bit wide address bus
- Words are used for integer operations and for 32-bit address bus machines
- Because of the mix in SPEC, word and double-word data types dominate
Frequency of reference by size, based on SPEC2000 on Alpha
Instruction Representation
- Humans are taught to think in base 10 (decimal), but numbers may be represented in any base (123 in base 10 = 1111011 in binary, or base 2)
- Numbers are stored in computers as a series of high and low electronic signals (binary numbers)
- Binary digits are called bits and are considered the atom of computing
- Each piece of an instruction is a number, and placing these numbers together forms the instruction
- The assembler translates symbolic assembly instructions into machine language instructions (machine code)
Example:
  Assembly:                    add $t0, $s1, $s2
  Machine language (decimal):  0       17      18      8       0       32
  Machine language (binary):   000000  10001   10010   01000   00000   100000
  Field widths:                6 bits  5 bits  5 bits  5 bits  5 bits  6 bits
Note: the MIPS compiler by default maps $s0, ..., $s7 to registers 16-23 and $t0, ..., $t7 to registers 8-15
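As a rough illustration of how the numeric pieces of an instruction are placed together, the sketch below packs the six register-format fields of add $t0, $s1, $s2 into one 32-bit word. The names R_FIELDS and encode_r are hypothetical helpers made up for this example, not part of any MIPS toolchain; the register numbers assume the default mapping ($s0-$s7 to 16-23, $t0-$t7 to 8-15).

```python
# Field names and widths of a register-format instruction, most significant first.
# (Illustrative table; the widths are the 6/5/5/5/5/6 split of a 32-bit word.)
R_FIELDS = (("op", 6), ("rs", 5), ("rt", 5), ("rd", 5), ("shamt", 5), ("funct", 6))


def encode_r(op, rs, rt, rd, shamt, funct):
    """Pack the six fields into a single 32-bit instruction word."""
    values = {"op": op, "rs": rs, "rt": rt, "rd": rd, "shamt": shamt, "funct": funct}
    word = 0
    for name, width in R_FIELDS:
        value = values[name]
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name}={value} does not fit in {width} bits")
        # Shift what we have so far to the left and append the next field.
        word = (word << width) | value
    return word


# add $t0, $s1, $s2: rs = $s1 (17), rt = $s2 (18), rd = $t0 (8), funct = 32
ADD_T0_S1_S2 = encode_r(op=0, rs=17, rt=18, rd=8, shamt=0, funct=32)
ADD_BITS = format(ADD_T0_S1_S2, "032b")
```

Printing ADD_BITS in groups of 6, 5, 5, 5, 5 and 6 bits gives back the binary fields of the example.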
Encoding an Instruction Set
- Instruction encoding affects the size of the compiled program and the complexity of the CPU implementation
- The operation is typically specified in one field, called the opcode
- The addressing mode for an operand can be encoded with the operation, or specified through a separate identifier when a large number of modes is supported
- The architecture must balance several competing factors:
  - Desire to support as many registers and addressing modes as possible
  - Effect of operand specification on the size of the instruction (program)
  - Desire to simplify instruction fetching and decoding during execution
- Fixed-size instruction encoding simplifies the CPU design while limiting the addressing modes supported
- An architect who cares about code size can use variable-size encoding
- A hybrid approach is to allow variability by supporting multiple instruction sizes
Encoding Examples
MIPS Instruction format
Register-format instructions
- op: Basic operation of the instruction, traditionally called the opcode
- rs: The first register source operand
- rt: The second register source operand
- rd: The register destination operand; it gets the result of the operation
- shamt: Shift amount
- funct: This field selects the specific variant of the operation in the op field
Immediate-type instructions
- Some instructions need longer fields than provided, for large-value constants
- The 16-bit address means a load word instruction can load a word within a region of ±2^15 bytes of the address in the base register
- Example: lw $t0, 32($s3)   # Temporary register $t0 gets A[8]

Instruction  Format  op  rs   rt   rd   shamt  funct  address
add          R       0   reg  reg  reg  0      32     n/a
sub          R       0   reg  reg  reg  0      34     n/a
lw           I       35  reg  reg  n/a  n/a    n/a    address
sw           I       43  reg  reg  n/a  n/a    n/a    address

R-format:  op (6 bits) | rs (5 bits) | rt (5 bits) | rd (5 bits) | shamt (5 bits) | funct (6 bits)
I-format:  op (6 bits) | rs (5 bits) | rt (5 bits) | address (16 bits)
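The 16-bit address field and the resulting ±2^15 byte reach can be sketched as follows. This is a minimal illustration, assuming the immediate format is laid out as op (6 bits), rs (5), rt (5), address (16) and that the offset is stored as a signed two's complement value; encode_i and decode_offset are hypothetical names, and $s3 is taken to be register 19 under the default register mapping.

```python
def encode_i(op, rs, rt, offset):
    """Pack an immediate-format instruction: op(6) rs(5) rt(5) address(16)."""
    # A signed 16-bit field covers -2**15 .. 2**15 - 1 bytes around the base register.
    if not -(1 << 15) <= offset < (1 << 15):
        raise ValueError("offset does not fit in a signed 16-bit field")
    return (op << 26) | (rs << 21) | (rt << 16) | (offset & 0xFFFF)


def decode_offset(word):
    """Recover the signed 16-bit offset from an immediate-format word."""
    raw = word & 0xFFFF
    # If bit 15 is set the stored value is negative in two's complement.
    return raw - (1 << 16) if raw & 0x8000 else raw


# lw $t0, 32($s3): op = 35, base register $s3 = 19, target $t0 = 8, offset = 32
LW_T0_32_S3 = encode_i(op=35, rs=19, rt=8, offset=32)
```

An offset of 32768 is rejected, which is the practical meaning of the ±2^15 limit: anything further from the base register needs a different base address.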
The Stored Program Concept
- Learning how instructions are represented leads to discovering the secret of computing: the stored-program concept
- Today's computers are built on two key principles:
  - Instructions are represented as numbers
  - Programs can be stored in memory to be read or written just like numbers
The power of the concept: memory can contain
- the source code for an editor
- the compiled machine code for the editor
- the text that the compiled program is using
- the compiler that generated the code
[Figure: a processor connected to a memory that holds an accounting program (machine code), an editor program (machine code), a C compiler (machine code), payroll data, book text, and the source code in C for the editor program]
Compiling if-then-else in MIPS
Assuming the five variables f, g, h, i, and j correspond to the five registers $s0 through $s4, what is the compiler's code for the following C if statement?
    if (i == j) f = g + h; else f = g - h;
[Figure: flowchart. Test i == j; if i = j then f = g + h; otherwise (i ≠ j, label Else) f = g - h; both paths join at Exit]
MIPS:
          bne $s3, $s4, Else    # go to Else if i ≠ j
          add $s0, $s1, $s2     # f = g + h (skipped if i ≠ j)
          j   Exit              # go to Exit
    Else: sub $s0, $s1, $s2     # f = g - h (skipped if i = j)
    Exit:
Typical Compilation
Major Types of Optimization
Optimization name | Explanation | Frequency (N.M. = not measured)
High-level (at or near source level; machine independent)
  Procedure integration | Replace procedure call by procedure body | N.M.
Local (within straight-line code)
  Common subexpression elimination | Replace two instances of the same computation by a single copy | 18%
  Constant propagation | Replace all instances of a variable that is assigned a constant with the constant | 22%
  Stack height reduction | Rearrange expression tree to minimize resources needed for expression evaluation | N.M.
Global (across a branch)
  Global common subexpression elimination | Same as local, but this version crosses branches | 13%
  Copy propagation | Replace all instances of a variable A that has been assigned X (i.e., A = X) with X | 11%
  Code motion | Remove code from a loop that computes the same value each iteration of the loop | 16%
  Induction variable elimination | Simplify/eliminate array addressing calculations within loops | 2%
Machine-dependent (depends on machine knowledge)
  Strength reduction | Many examples, such as replacing a multiply by a constant with adds and shifts | N.M.
  Pipeline scheduling | Reorder instructions to improve pipeline performance | N.M.
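Two of the transformations listed above, strength reduction and code motion, can be illustrated with a small before-and-after sketch. The functions are hypothetical examples written for this note; they show the effect of the transformation on source code, not how any particular compiler implements it.

```python
def times_ten_multiply(x):
    """Original form: one multiply by a constant."""
    return x * 10


def times_ten_shift_add(x):
    """Strength-reduced form: 10*x = 8*x + 2*x, using only shifts and an add."""
    return (x << 3) + (x << 1)


def scale_all_naive(values, a, b):
    """Original loop: a * b is recomputed on every iteration."""
    result = []
    for v in values:
        result.append(v * (a * b))
    return result


def scale_all_hoisted(values, a, b):
    """After code motion: the loop-invariant product is computed once, outside the loop."""
    factor = a * b
    result = []
    for v in values:
        result.append(v * factor)
    return result
```

Both pairs compute the same results; the optimized versions simply do less work per call or per iteration.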
Effect of Compiler Optimization
Measurements taken on MIPS
[Figure: program and compiler optimization level]
Level 0: non-optimized code
Level 1: local optimization
Level 2: global optimization, s/w pipelining
Level 3: adds procedure integration
Compiler Support for Multimedia Instr.
- Intel's MMX and PowerPC AltiVec have small vector processing capabilities targeting multimedia applications (to speed up graphics)
- Intel added a new set of instructions called Streaming SIMD Extension
- A major advantage of vector computers is hiding the latency of memory access by loading multiple elements and then overlapping execution with data transfer
- Vector computers typically have strided and/or gather/scatter addressing to perform operations on distant memory locations
  - Strided addressing allows memory access in increments larger than one
  - Gather/scatter addressing is similar to register indirect mode, where the addresses are stored instead of the data
- Supporting vector operations without strided addressing, as in Intel's MMX, limits the potential speedup
- Such limited support for vector processing makes vectorizing compiler optimizations unpopular and restricts their scope to hand-coded routines
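The difference between strided and gather/scatter addressing can be sketched with a flat list standing in for memory. The helper names below are hypothetical and model only the address pattern, not the hardware or any real vector instruction set.

```python
def strided_load(memory, base, stride, count):
    """Load `count` elements starting at `base`, stepping by `stride` each time."""
    return [memory[base + i * stride] for i in range(count)]


def gather_load(memory, index_vector):
    """Load the elements whose addresses are listed in `index_vector`."""
    return [memory[address] for address in index_vector]


def scatter_store(memory, index_vector, values):
    """Store each value at the address given by the matching index entry."""
    for address, value in zip(index_vector, values):
        memory[address] = value


# A 3 x 4 matrix stored row by row: element (r, c) lives at address 4 * r + c.
MATRIX = list(range(12))
COLUMN_1 = strided_load(MATRIX, base=1, stride=4, count=3)
PICKED = gather_load(MATRIX, [7, 0, 10])
```

Reading one column of a row-major matrix is the typical strided case: consecutive column elements are a whole row apart in memory, so unit-stride loads alone cannot fetch them in one vector operation.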
SIMD instructions on MMX and AltiVec tend to be solutions, not primitives
Starting a Program
[Figure: C program -> Compiler -> Assembly language program -> Assembler -> Object: machine language module (plus Object: library routine, in machine language) -> Linker -> Executable: machine language program -> Loader -> Memory]
Linker
- Places code and data modules symbolically in memory
- Determines the addresses of data and instruction labels
- Patches both internal and external references
Object files for Unix typically contain:
- Header: size and position of components
- Text segment: machine code
- Data segment: static and dynamic variables
- Relocation info: identifies absolute memory references
- Symbol table: names and locations of labels, procedures and variables
- Debugging info: mapping of source to object code, break points, etc.
Loading Executable Program
[Figure: memory layout, from high to low addresses]
  $sp -> 7fff ffff hex   Stack
                         Dynamic data
  $gp -> 1000 8000 hex   Static data (from 1000 0000 hex)
  pc  -> 0040 0000 hex   Text
         0               Reserved
To load an executable, the operating system follows these steps:
- Reads the executable file header to determine the size of the text and data segments
- Creates an address space large enough for the text and data
- Copies the instructions and data from the executable file into memory
- Copies the parameters (if any) to the main program onto the stack
- Initializes the machine registers and sets the stack pointer to the first free location
- Jumps to a start-up routine that copies the parameters into the argument registers and calls the main routine of the program
Instruction Set Design Issues
Instruction set design issues
- Number of Addresses
- Flow of Control
- Operand Types
- Addressing Modes
- Instruction Types
- Instruction Formats
Number of Addresses
Four categories
- 3-address machines
  - 2 for the source operands and one for the result
- 2-address machines
  - One address doubles as source and result
- 1-address machines
  - Accumulator machines
  - Accumulator is used for one source and result
- 0-address machines
  - Stack machines
  - Operands are taken from the stack
  - Result goes onto the stack
Number of Addresses (cont.)
Three-address machines
- Two for the source operands, one for the result
- RISC processors use three addresses
- Sample instructions
    add  dest,src1,src2    ; M(dest) = [src1] + [src2]
    sub  dest,src1,src2    ; M(dest) = [src1] - [src2]
    mult dest,src1,src2    ; M(dest) = [src1] * [src2]
Three addresses: Operand 1, Operand 2, Result
Example: a = b + c
Three-address instruction formats are not common because they require a relatively long instruction format to hold the three address references
Number of Addresses (cont.)
Example
C statement
    A = B + C * D - E + F + A
Equivalent code:
    mult T,C,D    ; T = C*D
    add  T,T,B    ; T = B + C*D
    sub  T,T,E    ; T = B + C*D - E
    add  T,T,F    ; T = B + C*D - E + F
    add  A,T,A    ; A = B + C*D - E + F + A
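To check that the five three-address instructions above really compute the C statement, they can be run on a toy interpreter. This is only a sketch under simple assumptions: memory is a dictionary keyed by variable name, and run_three_address is a hypothetical helper, not a model of any real processor.

```python
def run_three_address(program, memory):
    """Execute (op, dest, src1, src2) tuples: M(dest) = [src1] op [src2]."""
    operations = {
        "add": lambda x, y: x + y,
        "sub": lambda x, y: x - y,
        "mult": lambda x, y: x * y,
    }
    for op, dest, src1, src2 in program:
        memory[dest] = operations[op](memory[src1], memory[src2])
    return memory


THREE_ADDRESS_PROGRAM = [
    ("mult", "T", "C", "D"),  # T = C*D
    ("add", "T", "T", "B"),   # T = B + C*D
    ("sub", "T", "T", "E"),   # T = B + C*D - E
    ("add", "T", "T", "F"),   # T = B + C*D - E + F
    ("add", "A", "T", "A"),   # A = B + C*D - E + F + A
]

# Arbitrary sample values chosen for the illustration.
three_address_memory = run_three_address(
    THREE_ADDRESS_PROGRAM, {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
)
```

Note how the temporary T appears both as a source and as the destination in three of the five instructions, which is the observation that motivates two-address machines.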
Number of Addresses (cont.)
Two-address machines
- One address doubles (for source operand and result)
- The last example makes a case for it
  - Address T is used twice
- Sample instructions
    load dest,src    ; M(dest) = [src]
    add  dest,src    ; M(dest) = [dest] + [src]
    sub  dest,src    ; M(dest) = [dest] - [src]
    mult dest,src    ; M(dest) = [dest] * [src]
Two addresses: one address doubles as operand and result
Example: a = a + b
The two-address format reduces the space requirement but also introduces some awkwardness. To avoid altering the value of an operand, a MOVE instruction is used to move one of the values to a result or temporary location before performing the operation
Number of Addresses (cont.)
Example
C statement
    A = B + C * D - E + F + A
Equivalent code:
    load T,C    ; T = C
    mult T,D    ; T = C*D
    add  T,B    ; T = B + C*D
    sub  T,E    ; T = B + C*D - E
    add  T,F    ; T = B + C*D - E + F
    add  A,T    ; A = B + C*D - E + F + A
Number of Addresses (cont.)
One-address machines
- Use a special set of registers called accumulators
  - Specify one source operand and receive the result
- Called accumulator machines
- Sample instructions
    load  addr    ; accum = [addr]
    store addr    ; M[addr] = accum
    add   addr    ; accum = accum + [addr]
    sub   addr    ; accum = accum - [addr]
    mult  addr    ; accum = accum * [addr]
Number of Addresses (cont.)
Example
C statement
    A = B + C * D - E + F + A
Equivalent code:
    load  C    ; load C into accum
    mult  D    ; accum = C*D
    add   B    ; accum = C*D + B
    sub   E    ; accum = B + C*D - E
    add   F    ; accum = B + C*D - E + F
    add   A    ; accum = B + C*D - E + F + A
    store A    ; store accum contents in A
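The accumulator version of the example can be traced with a very small interpreter. This is a sketch under simple assumptions: a single accumulator, memory modelled as a dictionary keyed by variable name, and a hypothetical helper called run_accumulator.

```python
def run_accumulator(program, memory):
    """Execute (op, addr) pairs on a one-address machine with a single accumulator."""
    accum = 0
    for op, addr in program:
        if op == "load":
            accum = memory[addr]
        elif op == "store":
            memory[addr] = accum
        elif op == "add":
            accum = accum + memory[addr]
        elif op == "sub":
            accum = accum - memory[addr]
        elif op == "mult":
            accum = accum * memory[addr]
        else:
            raise ValueError(f"unknown instruction: {op}")
    return memory


ACCUMULATOR_PROGRAM = [
    ("load", "C"), ("mult", "D"), ("add", "B"), ("sub", "E"),
    ("add", "F"), ("add", "A"), ("store", "A"),
]

# Arbitrary sample values chosen for the illustration.
accumulator_memory = run_accumulator(
    ACCUMULATOR_PROGRAM, {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
)
```

Every one of the seven instructions names a memory address, so each one touches memory once; this is the high memory traffic usually held against accumulator designs.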
Number of Addresses (cont.)
Zero-address machines
- Stack supplies operands and receives the result
  - Special instructions to load and store use an address
- Called stack machines (Ex: HP3000, Burroughs B5500)
- Sample instructions
    push addr    ; push([addr])
    pop  addr    ; pop([addr])
    add          ; push(pop + pop)
    sub          ; push(pop - pop)
    mult         ; push(pop * pop)
Number of Addresses (cont.)
Example
C statement
    A = B + C * D - E + F + A
Equivalent code:
    push E
    push C
    push D
    mult
    push B
    add
    sub
    push F
    add
    push A
    add
    pop  A
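The stack code is easier to follow when traced. The sketch below assumes the semantics given in the sample instructions, where sub means push(pop - pop), so the first value popped (the top of the stack) is the left operand. That assumption explains why E is pushed first. run_stack is a hypothetical helper, with memory modelled as a dictionary.

```python
def run_stack(program, memory):
    """Execute zero-address code; only push and pop carry an address."""
    stack = []
    for instruction in program:
        op = instruction[0]
        if op == "push":
            stack.append(memory[instruction[1]])
        elif op == "pop":
            memory[instruction[1]] = stack.pop()
        else:
            first = stack.pop()    # top of stack, used as the left operand
            second = stack.pop()   # element below it, used as the right operand
            if op == "add":
                stack.append(first + second)
            elif op == "sub":
                stack.append(first - second)
            elif op == "mult":
                stack.append(first * second)
            else:
                raise ValueError(f"unknown instruction: {op}")
    return memory


STACK_PROGRAM = [
    ("push", "E"), ("push", "C"), ("push", "D"), ("mult",),
    ("push", "B"), ("add",), ("sub",),
    ("push", "F"), ("add",), ("push", "A"), ("add",), ("pop", "A"),
]

# Arbitrary sample values chosen for the illustration.
stack_memory = run_stack(
    STACK_PROGRAM, {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
)
```

With these values the stack holds [5, 14] just before sub, so the subtraction yields 14 - 5 = 9, that is B + C*D - E.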
Load/Store Architecture
- Instructions expect operands in internal processor registers
  - Special LOAD and STORE instructions move data between registers and memory
- RISC uses this architecture
- Reduces instruction length
Load/Store Architecture (cont.)
Sample instructions
    load  Rd,addr       ; Rd = [addr]
    store addr,Rs       ; (addr) = Rs
    add   Rd,Rs1,Rs2    ; Rd = Rs1 + Rs2
    sub   Rd,Rs1,Rs2    ; Rd = Rs1 - Rs2
    mult  Rd,Rs1,Rs2    ; Rd = Rs1 * Rs2
Number of Addresses (cont.)
Example
C statement
    A = B + C * D - E + F + A
Equivalent code:
    load  R1,B
    load  R2,C
    load  R3,D
    load  R4,E
    load  R5,F
    load  R6,A
    mult  R2,R2,R3
    add   R2,R2,R1
    sub   R2,R2,R4
    add   R2,R2,R5
    add   R2,R2,R6
    store A,R2
Flow of Control
- Default is sequential flow
- Several instructions alter this default execution
  - Branches
    - Unconditional
    - Conditional
    - Delayed branches
  - Procedure calls
    - Delayed procedure calls
Flow of Control (cont.)
Branches
- Unconditional
  - Absolute address
  - PC-relative
    - Target address is specified relative to PC contents
    - Relocatable code
- Example: MIPS
  - Absolute address
      j target
  - PC-relative
      b target
Flow of Control (cont.)
Branches
- Conditional
  - Jump is taken only if the condition is met
- Two types
  - Set-Then-Jump
    - Condition testing is separated from branching
    - Condition code registers are used to convey the condition test result
    - Condition code registers keep a record of the status of the last ALU operation, such as an overflow condition
  - Example: Pentium code
      cmp AX,BX     ; compare AX and BX
      je  target    ; jump if equal
Flow of Control (cont.)
  - Test-and-Jump
    - Single instruction performs condition testing and branching
  - Example: MIPS instruction
      beq Rsrc1,Rsrc2,target
    Jumps to target if Rsrc1 = Rsrc2
Delayed branching
- Control is transferred after executing the instruction that follows the branch instruction
  - This instruction slot is called the delay slot
- Improves efficiency
- Highly pipelined RISC processors support it
Flow of Control (cont.)
Procedure calls
- Facilitate modular programming
- Require two pieces of information to return
  - End of procedure
    - Pentium uses the ret instruction
    - MIPS uses the jr instruction
  - Return address
    - In a (special) register
      - MIPS allows any general-purpose register
    - On the stack
      - Pentium
Flow of Control (cont.)
[Figure: delay slot]
Parameter Passing
Two basic techniques
- Register-based (e.g., PowerPC, MIPS)
  - Internal registers are used
    - Faster
    - Limit the number of parameters
    - Recursive procedure
- Stack-based (e.g., Pentium)
  - Stack is used
    - More general
2. Operand Types
Instructions support basic data types
- Characters
- Integers
- Floating-point
Instruction overload
- Same instruction for different data types
- Example: Pentium
    mov AL,address     ; loads an 8-bit value
    mov AX,address     ; loads a 16-bit value
    mov EAX,address    ; loads a 32-bit value
Operand Types
Separate instructions
- Instructions specify the operand size
- Example: MIPS
    lb Rdest,address    ; loads a byte
    lh Rdest,address    ; loads a halfword (16 bits)
    lw Rdest,address    ; loads a word (32 bits)
    ld Rdest,address    ; loads a doubleword (64 bits)
- Similar instructions exist for store
3. Addressing Modes
How the operands are specified
- Operands can be in three places
  - Registers
    - Register addressing mode
  - Part of instruction
    - Constant
    - Immediate addressing mode
    - All processors support these two addressing modes
  - Memory
    - Difference between RISC and CISC
    - CISC supports a large variety of addressing modes
    - RISC follows load/store architecture
4. Instruction Types
Several types of instructions
- Data movement
  - Pentium: mov dest,src
  - Some do not provide direct data movement instructions
  - Indirect data movement
      add Rdest,Rsrc,0    ; Rdest = Rsrc + 0
- Arithmetic and logical
  - Arithmetic
    - Integer and floating-point, signed and unsigned
    - add, subtract, multiply, divide
  - Logical
    - and, or, not, xor
Instruction Types (cont.)
Condition code bits
- S: Sign bit (0 = +, 1 = -)
- Z: Zero bit (0 = nonzero, 1 = zero)
- O: Overflow bit (0 = no overflow, 1 = overflow)
- C: Carry bit (0 = no carry, 1 = carry)
Example: Pentium
    cmp count,25    ; compare count to 25
                    ; (subtract 25 from count)
    je  target      ; jump if equal
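How a compare sets the four condition code bits can be sketched by doing the subtraction on fixed-width values. The function below is a hypothetical model, assuming 32-bit two's complement operands and the x86-style convention that the carry bit signals a borrow on subtraction; it is not taken from any processor manual.

```python
def compare_flags(a, b, bits=32):
    """Return the S, Z, O, C bits produced by computing a - b on `bits`-wide values."""
    mask = (1 << bits) - 1
    sign = 1 << (bits - 1)
    a &= mask
    b &= mask
    result = (a - b) & mask
    return {
        "S": int(bool(result & sign)),   # sign bit of the result
        "Z": int(result == 0),           # result is zero, i.e. a == b
        # Signed overflow: operands differ in sign and the result's sign differs from a's.
        "O": int(bool((a ^ b) & (a ^ result) & sign)),
        "C": int(a < b),                 # unsigned borrow (assumed x86-style convention)
    }


def jump_if_equal_taken(flags):
    """je is taken exactly when the zero bit is set."""
    return flags["Z"] == 1


EQUAL = compare_flags(25, 25)
LESS = compare_flags(24, 25)
```

In the Pentium example, je target is taken only when count equals 25, because only then does the subtraction leave a zero result and set Z.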
Instruction Types (cont.)
Flow control and I/O instructions
- Branch
- Procedure call
- Interrupts
I/O instructions
- Memory-mapped I/O
  - Most processors support memory-mapped I/O
  - No separate instructions for I/O
- Isolated I/O
  - Pentium supports isolated I/O
  - Separate I/O instructions
      in  AX,io_port    ; read from an I/O port
      out io_port,AX    ; write to an I/O port
5. Instruction Formats
Two types
- Fixed-length
  - Used by RISC processors
  - 32-bit RISC processors use 32-bit wide instructions
    - Examples: SPARC, MIPS, PowerPC
- Variable-length
  - Used by CISC processors
  - Memory operands need more bits to specify
Opcode
- Major and exact operation
Examples of Instruction Formats
RISC (Reduced Instruction Set Computer)
versus
CISC (Complex Instruction Set Computer)
RISC vs CISC
- The underlying philosophy of RISC machines is that a system is better able to manage program execution when the program consists of only a few different instructions that are the same length and require the same number of clock cycles to decode and execute
- RISC systems access memory only with explicit load and store instructions
- In CISC systems, many different kinds of instructions access memory, making instruction length variable and fetch-decode-execute time unpredictable
RISC vs CISC
- The difference between CISC and RISC becomes evident through the basic computer performance equation:
    time/program = time/cycle x cycles/instruction x instructions/program
- RISC systems shorten execution time by reducing the clock cycles per instruction
- CISC systems improve performance by reducing the number of instructions per program
RISC vs CISC
- The simple instruction set of RISC machines enables control units to be hardwired for maximum speed
- The more complex, and variable, instruction set of CISC machines requires microcode-based control units that interpret instructions as they are fetched from memory. This translation takes time
- With fixed-length instructions, RISC lends itself to pipelining and speculative execution
RISC vs CISC
Consider the following program fragments:
CISC:
    mov ax, 10
    mov bx, 5
    mul bx, ax
RISC:
    mov ax, 0
    mov bx, 10
    mov cx, 5
    Begin: add ax, bx
           loop Begin
The total clock cycles for the CISC version might be:
    (2 movs x 1 cycle) + (1 mul x 30 cycles) = 32 cycles
While the clock cycles for the RISC version are:
    (3 movs x 1 cycle) + (5 adds x 1 cycle) + (5 loops x 1 cycle) = 13 cycles
With the RISC clock cycle also being shorter, RISC gives us much faster execution speeds
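The cycle counts for the two fragments can be reproduced with a few lines of arithmetic. The per-instruction cycle costs below are the illustrative figures used on this slide (1 cycle for mov, add and loop, 30 cycles for mul), not measurements of any real processor, and the helper names are hypothetical.

```python
def total_cycles(instruction_mix):
    """Sum count * cycles-per-instruction over (count, cycles) pairs."""
    return sum(count * cycles for count, cycles in instruction_mix)


def multiply_by_repeated_addition(multiplicand, times):
    """What the RISC loop computes: add `multiplicand` to ax, `times` times."""
    ax = 0
    for _ in range(times):
        ax += multiplicand
    return ax


# (count, cycles per instruction)
CISC_MIX = [(2, 1), (1, 30)]          # 2 movs, 1 mul
RISC_MIX = [(3, 1), (5, 1), (5, 1)]   # 3 movs, 5 adds, 5 loops

CISC_CYCLES = total_cycles(CISC_MIX)
RISC_CYCLES = total_cycles(RISC_MIX)
PRODUCT = multiply_by_repeated_addition(10, 5)
```

The RISC fragment executes 13 instructions against 3 for CISC, yet needs fewer cycles. This is the trade-off in the performance equation: more instructions per program, but fewer cycles per instruction.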
RISC vs CISC
- Because of their load-store ISAs, RISC architectures require a large number of CPU registers
- These registers provide fast access to data during sequential program execution
- They can also be employed to reduce the overhead typically caused by passing parameters to subprograms
- Instead of pulling parameters off of a stack, the subprogram is directed to use a subset of registers
RISC vs CISC
- This is how registers can be overlapped in a RISC system
- The current window pointer (CWP) points to the active register window
RISC vs CISC
- It is becoming increasingly difficult to distinguish RISC architectures from CISC architectures
- Some RISC systems provide more extravagant instruction sets than some CISC systems
- Some systems combine both approaches
- The following two slides summarize the characteristics that traditionally typify the differences between these two architectures
RISC vs CISC
RISC                                          CISC
Multiple register sets.                       Single register set.
Three operands per instruction.               One or two register operands per instruction.
Parameter passing through register windows.   Parameter passing through memory.
Single-cycle instructions.                    Multiple-cycle instructions.
Hardwired control.                            Microprogrammed control.
Highly pipelined.                             Less pipelined.
Continued...
RISC vs CISC
RISC                                          CISC
Simple instructions, few in number.           Many complex instructions.
Fixed-length instructions.                    Variable-length instructions.
Complexity in compiler.                       Complexity in microcode.
Only LOAD/STORE instructions access memory.   Many instructions can access memory.
Few addressing modes.                         Many addressing modes.
Summary
Instruction Set Design Issues
Instruction set design issues include:
- Where are operands stored?
  - registers, memory, stack, accumulator
- How many explicit operands are there?
  - 0, 1, 2, or 3
- How is the operand location specified?
  - register, immediate, indirect, . . .
- What type and size of operands are supported?
  - byte, int, float, double, string, vector . . .
- What operations are supported?
  - add, sub, mul, move, compare . . .
More About General Purpose Registers
Why do almost all new architectures use GPRs?
- Registers are much faster than memory (even cache)
  - Register values are available immediately
  - When memory isn't ready, the processor must wait ("stall")
- Registers are convenient for variable storage
  - Compiler assigns some variables just to registers
  - More compact code, since small fields specify registers (compared to memory addresses)
[Figure: Processor (Registers) - Cache - Memory - Disk]
What Operations are Needed
Arithmetic E (oical
Inteer arithmetic A$$ SU MU(T $I S7IT
(oical operation AN$ NT
$ata Transfer - cop loa) store
Control - 9ranch Dump call return
loatin Point A$$ MU( $I 3 Same as arithmetic 9ut usuall ta=e 9ier operan)s
$ecimal - A$$$ CNT
Strin - moe compare search
raphics F piel an) erte compressionG)ecompression operations
Stack Architecture: Pros and Cons
Pros:
- Good code density (implicit top of stack)
- Low hardware requirements
- Easy to write a simpler compiler for stack architectures
Cons:
- Stack becomes the bottleneck
- Little ability for parallelism or pipelining
- Data is not always at the top of the stack when needed, so additional instructions like TOP and SWAP are needed
- Difficult to write an optimizing compiler for stack architectures
Accumulator Architecture: Pros and Cons
Pros:
- Very low hardware requirements
- Easy to design and understand
Cons:
- Accumulator becomes the bottleneck
- Little ability for parallelism or pipelining
- High memory traffic
Memory-Memory Architecture: Pros and Cons
Pros:
- Requires fewer instructions (especially if 3 operands)
- Easy to write compilers for (especially if 3 operands)
Cons:
- Very high memory traffic (especially if 3 operands)
- Variable number of clocks per instruction
- With two operands, more data movements are required
Memory-Register Architecture: Pros and Cons
Pros:
- Some data can be accessed without loading first
- Instruction format easy to encode
- Good code density
Cons:
- Operands are not equivalent (poor orthogonality)
- Variable number of clocks per instruction
- May limit number of registers
Register-Memory Architectures
Effect of the number of memory operands
Number of memory addresses   Max number of operands   Examples
0                            3                        SPARC, MIPS, PowerPC, ALPHA
1                            2                        Intel 80x86, Motorola 68000
2                            2                        VAX (also has 3 operands format)
3                            3                        VAX (also has 2 operands format)
Interpreting Memory Addressing
The address of a word matches the byte address of one of its 4 bytes
The addresses of sequential words differ by 4 (word size in bytes)
Word addresses are multiples of 4 (alignment restriction)
Machines that use the address of the leftmost byte as the word address are called "Big Endian", and those that use the rightmost byte are called "Little Endian"
Misalignment complicates memory access and causes programs to run slower (some machines do not allow misaligned memory access at all)
Byte ordering can be a problem when exchanging data among different machines
Byte addresses affect array index calculation, which must account for word addressing and the offset within the word
Object addressed   Aligned at byte offsets   Misaligned at byte offsets
Byte               0, 1, 2, 3, 4, 5, 6, 7    Never
Half word          0, 2, 4, 6                1, 3, 5, 7
Word               0, 4                      1, 2, 3, 5, 6, 7
Double word        0                         1, 2, 3, 4, 5, 6, 7
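The alignment rule and the two byte orderings can be illustrated with a minimal Python sketch. The helper names `is_aligned` and `word_bytes` are made up for this illustration and are not part of the slides; the sketch assumes an object of size s bytes is aligned when its byte address is a multiple of s.

```python
import struct

def is_aligned(address: int, size: int) -> bool:
    """Return True when an access of `size` bytes at `address` is aligned."""
    return address % size == 0

def word_bytes(value: int, big_endian: bool) -> bytes:
    """Return the memory layout (lowest address first) of a 32-bit word."""
    # ">I" packs most significant byte first, "<I" least significant byte first.
    return struct.pack(">I" if big_endian else "<I", value)

if __name__ == "__main__":
    for name, size in [("byte", 1), ("half word", 2), ("word", 4), ("double word", 8)]:
        aligned = [a for a in range(8) if is_aligned(a, size)]
        print(f"{name:12s} aligned at byte offsets {aligned}")
    print("big endian   :", word_bytes(0x0A0B0C0D, True).hex())
    print("little endian:", word_bytes(0x0A0B0C0D, False).hex())
```

Running the loop reproduces the "aligned at byte offsets" column of the table for the first eight byte offsets.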
Addressing Modes
Addressing modes refer to how to specify the location of an operand (effective address)
Addressing modes have the ability to:
- Significantly reduce instruction counts
- Increase the average CPI
- Increase the complexity of building a machine
The VAX machine is used for benchmark data since it supports a wide range of memory addressing modes
Famous addressing modes can be classified based on:
- the source of the data, into register, immediate, or memory
- the address calculation, into direct and indirect
An indexed addressing mode is usually provided to allow efficient implementation of loops and array access
Example of Addressing Modes
Register
- Example: Add R4, R3
- Meaning: Regs[R4] <- Regs[R4] + Regs[R3]
- When used: When a value is in a register
Immediate
- Example: Add R4, #3
- Meaning: Regs[R4] <- Regs[R4] + 3
- When used: For constants
Displacement
- Example: Add R4, 100(R1)
- Meaning: Regs[R4] <- Regs[R4] + Mem[100 + Regs[R1]]
- When used: Accessing local variables
Register indirect
- Example: Add R4, (R1)
- Meaning: Regs[R4] <- Regs[R4] + Mem[Regs[R1]]
- When used: Accessing using a pointer or a computed address
Indexed
- Example: Add R4, (R1 + R2)
- Meaning: Regs[R4] <- Regs[R4] + Mem[Regs[R1] + Regs[R2]]
- When used: Sometimes useful in array addressing: R1 = base of the array; R2 = index amount
Direct or absolute
- Example: Add R4, (1001)
- Meaning: Regs[R4] <- Regs[R4] + Mem[1001]
- When used: Sometimes useful for accessing static data; address constant may need to be large
Memory indirect or memory deferred
- Example: Add R4, @(R3)
- Meaning: Regs[R4] <- Regs[R4] + Mem[Mem[Regs[R3]]]
- When used: If R3 is the address of the pointer p, then the mode yields *p
Autoincrement
- Example: Add R4, (R2)+
- Meaning: Regs[R4] <- Regs[R4] + Mem[Regs[R2]]; Regs[R2] <- Regs[R2] + d
- When used: Useful for stepping through arrays within a loop. R2 points to the start of the array; each reference increments R2 by d
Autodecrement
- Example: Add R4, -(R2)
- Meaning: Regs[R2] <- Regs[R2] - d; Regs[R4] <- Regs[R4] + Mem[Regs[R2]]
- When used: Same use as autoincrement. Autodecrement/increment can also act as push/pop to implement a stack
Scaled
- Example: Add R4, 100(R2)[R3]
- Meaning: Regs[R4] <- Regs[R4] + Mem[100 + Regs[R2] + Regs[R3] * d]
- When used: Used to index arrays
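The address calculations in the table can be sketched in Python. This is an illustrative model only: the function names, the dictionary-based register file and memory, and the default element size d = 4 are assumptions made for the sketch, not something the slides define.

```python
def effective_address(mode, regs, mem, base=None, index=None, disp=0, d=4):
    """Compute the memory address an operand refers to for a given mode."""
    if mode == "displacement":        # Mem[disp + Regs[base]]
        return disp + regs[base]
    if mode == "register_indirect":   # Mem[Regs[base]]
        return regs[base]
    if mode == "indexed":             # Mem[Regs[base] + Regs[index]]
        return regs[base] + regs[index]
    if mode == "direct":              # Mem[disp]
        return disp
    if mode == "memory_indirect":     # Mem[Mem[Regs[base]]]
        return mem[regs[base]]
    if mode == "scaled":              # Mem[disp + Regs[base] + Regs[index] * d]
        return disp + regs[base] + regs[index] * d
    raise ValueError(f"unknown addressing mode: {mode}")

def autoincrement(regs, base, d=4):
    """Return the address in Regs[base], then advance the register by d."""
    address = regs[base]
    regs[base] += d
    return address

def autodecrement(regs, base, d=4):
    """Decrease Regs[base] by d first, then return the new address."""
    regs[base] -= d
    return regs[base]

if __name__ == "__main__":
    regs = {"R1": 1000, "R2": 8, "R3": 2000}
    mem = {2000: 3000}
    print(effective_address("displacement", regs, mem, base="R1", disp=100))
    print(effective_address("scaled", regs, mem, base="R1", index="R2", disp=100))
```

Autoincrement and autodecrement are kept as separate functions because they also modify a register, unlike the other modes.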
Addressing Mode for Signal Processing
DSP offers special addressing modes to better serve popular algorithms
Modulo addressing:
- Since DSP deals with continuous data streams, circular buffers are widely used
- Circular or modulo addressing allows automatic increment and decrement and resets the pointer when reaching the end of the buffer
Reverse addressing:
- The resulting address is the reverse order of the current address
- Reverse addressing mode expedites the access, which otherwise requires a number of logical instructions or extra memory accesses
Fast Fourier Transform (bit-reversed index order):
0 (000) -> 0 (000)
1 (001) -> 4 (100)
2 (010) -> 2 (010)
3 (011) -> 6 (110)
4 (100) -> 1 (001)
5 (101) -> 5 (101)
6 (110) -> 3 (011)
7 (111) -> 7 (111)
Special features require either hand coding or a compiler that uses such features (GNU would not be a good choice)
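Both DSP addressing modes can be mimicked in software, which also shows why dedicated hardware support saves instructions. The following Python sketch is an illustration under assumed names (`bit_reverse`, `modulo_next`); it is not taken from any DSP instruction set.

```python
def bit_reverse(index: int, bits: int) -> int:
    """Reverse the lowest `bits` bits of index (reverse addressing)."""
    result = 0
    for _ in range(bits):
        result = (result << 1) | (index & 1)  # move the lowest bit of index into result
        index >>= 1
    return result

def modulo_next(pointer: int, step: int, base: int, length: int) -> int:
    """Advance a circular-buffer pointer, wrapping inside [base, base + length)."""
    return base + (pointer - base + step) % length

if __name__ == "__main__":
    for i in range(8):
        r = bit_reverse(i, 3)
        print(f"{i} ({i:03b}) -> {r} ({r:03b})")
```

The loop in `bit_reverse` is the "number of logical instructions" that a hardware reverse addressing mode avoids.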
Operations of the Computer Hardware
"There must certainly be instructions for performing the fundamental arithmetic operations."
Burks, Goldstine and von Neumann, 1947
Assembly language is a symbolic representation of what the processor actually understands
The MIPS assembler allows only one instruction per line and ignores comments following # until the end of the line
Example:
Translation of a segment of a C program to MIPS assembly instructions
C: f = (g + h) - (i + j)
MIPS:
add t0, g, h    # temp variable t0 contains "g + h"
add t1, i, j    # temp variable t1 contains "i + j"
sub f, t0, t1   # f = t0 - t1 = (g + h) - (i + j)
Operations in the Instruction Set
Operator type: Examples
- Arithmetic and logical: Integer arithmetic and logical operations: add, and, subtract, or
- Data transfer: Loads-stores (move instructions on machines with memory addressing)
- Control: Branch, jump, procedure call and return, trap
- System: Operating system call, virtual memory management instructions
- Floating point: Floating point instructions: add, multiply
- Decimal: Decimal add, decimal multiply, decimal to character conversion
- String: String move, string compare, string search
- Graphics: Pixel operations, compression/decompression operations
Arithmetic, logical, data transfer and control are almost standard categories for all machines
System instructions are required for multi-programming environments, although support for system functions varies
Decimal and string instructions can be primitives, e.g. IBM 360 and the VAX
Support for floating point, decimal, string and graphics can optionally be provided, sometimes via a co-processor
Some machines rely on the compiler to synthesize special operations, such as string handling, from simpler instructions
Operations for Media & Signal Processing
Single instruction multiple data (SIMD) and vector instructions are often supported in DSPs, which are commonly used in multimedia and signal processing applications
Partitioned Add (integer)
- Perform multiple 16-bit additions on a 64-bit ALU, since most data are narrow
- Increases ALU throughput for multimedia applications
Paired single operations (float)
- Allow the same register to act as two operands to the same operation
- Handy in dealing with vertices and coordinates
Multiply and accumulate
- Very handy for calculating dot products of vectors (signal processing) and matrix multiplication
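A partitioned add and a multiply-accumulate can be modelled with ordinary integers. The sketch below is a software illustration only, assuming four independent 16-bit lanes in a 64-bit word with wraparound on overflow; the function names are invented for the example, and real SIMD extensions may saturate instead of wrapping.

```python
LANES = 4          # assumed: four lanes per 64-bit word
LANE_BITS = 16     # assumed: 16 bits per lane
LANE_MASK = (1 << LANE_BITS) - 1

def partitioned_add(a: int, b: int) -> int:
    """Add four 16-bit lanes independently; carries do not cross lanes."""
    result = 0
    for lane in range(LANES):
        shift = lane * LANE_BITS
        lane_sum = ((a >> shift) & LANE_MASK) + ((b >> shift) & LANE_MASK)
        result |= (lane_sum & LANE_MASK) << shift  # drop the carry out of the lane
    return result

def multiply_accumulate(xs, ys) -> int:
    """Dot product built from repeated multiply-and-accumulate steps."""
    acc = 0
    for x, y in zip(xs, ys):
        acc += x * y  # one MAC operation per element pair
    return acc

if __name__ == "__main__":
    print(hex(partitioned_add(0x0001_FFFF_0003_0004, 0x0001_0001_0001_0001)))
    print(multiply_accumulate([1, 2, 3], [4, 5, 6]))
```

The lane holding 0xFFFF wraps to 0 without disturbing its neighbour, which is the property that distinguishes a partitioned add from a plain 64-bit add.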
Frequency of Operations Usage
The most widely executed instructions are the simple operations of an instruction set
The following is the average usage in SPECint92 on Intel 80x86
Rank   80x86 instruction        Integer average (% total executed)
1      Load                     22%
2      Conditional branch       20%
3      Compare                  16%
4      Store                    12%
5      Add                      8%
6      And                      6%
7      Sub                      5%
8      Move register-register   4%
9      Call                     1%
10     Return                   1%
       Total                    96%
Make the common case fast by focusing on these operations
Control Flow Instructions
Jump: for unconditional change in the control flow
Branch: for conditional change in the control flow
Procedure calls and returns
Data is based on SPEC on Alpha
Destination Address Definition
Relative addressing with respect to the program counter proved to be the best choice for forward and backward branching or jumps (load address independent)
To allow for dynamic loading of library routines, register indirect addressing allows addresses to be loaded in special registers (e.g. virtual functions in C++ and system calls in a case statement)
Data is based on SPEC on Alpha
Condition Evaluation
Compare-and-branch can be efficient if the majority of conditions are comparisons with zero
Remember to focus on the common case
Based on SPEC on MIPS
Frequency of Types of Comparison
Data is based on SPEC on Alpha
Different benchmarks and machines set new design priorities
DSPs support a repeat instruction for for-loops (vectors) using registers
Supporting Procedures
Execution of a procedure follows these steps:
- Store parameters in a place accessible to the procedure
- Transfer control to the procedure
- Acquire the storage resources needed for the procedure
- Perform the desired task
- Store the result value in a place accessible to the calling program
- Return control to the point of origin
The hardware provides a program counter to trace instruction flow and manage transfer of control
Parameter Passing
- Registers can be used for passing a small number of parameters
- A stack is used to spill registers of the current context and make room for the called procedure to run, and to allow large parameters to be passed
- Storage of machine state can be performed by the caller or the callee
- Handling of shared variables is important to ensure correct semantics and thus requires clear specifications in the library interface
Global variables stored in registers need careful handling
Type and Size of Operands
The type of an operand is designated by encoding it in the instruction's operation code
The type of an operand, e.g. single precision float, effectively gives its size
Common operand types include character, half word and word size integer, and single- and double-precision floating point
Characters are almost always in ASCII, integers are in 2's complement, and floating point is in IEEE 754
The 16-bit Unicode used in Java is gaining popularity due to its support for international character sets
For business applications, some architectures support a decimal format in binary coded decimal (BCD)
Depending on the size of the word, the complexity of handling different operand types differs
DSP offers fixed point data types to support high precision floating point arithmetic and to allow sharing a single exponent for multiple numbers
For graphics applications, vertex and pixel operands are added features
Size of Operands
The double-word data type is used for double-precision floating point operations and for address storage in machines with a 64-bit wide address bus
Words are used for integer operations and for 32-bit address bus machines
Because of the mix in SPEC, word and double-word data types dominate
Frequency of reference by size is based on SPEC2000 on Alpha
Instruction Representation
Humans are taught to think in base 10 (decimal), but numbers may be represented in any base (123 in base 10 = 1111011 in binary, or base 2)
Numbers are stored in computers as a series of high and low electronic signals (binary numbers)
Binary digits are called bits and are considered the atom of computing
Each piece of an instruction is a number, and placing these numbers together forms the instruction
The assembler translates the symbolic assembly instructions into machine language instructions (machine code)
Example:
Assembly: add $t0, $s1, $s2
M/C language (decimal): 0  17  18  8  0  32
M/C language (binary):  000000  10001  10010  01000  00000  100000
Field widths:           6 bits  5 bits 5 bits 5 bits 5 bits 6 bits
Note: the MIPS compiler by default maps $s0..$s7 to registers 16-23 and $t0..$t7 to registers 8-15
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 41101
ncoding an Instruction Set Instruction encoding affects the si-e of the compiled program and the
compleity of the CP4 implementation
The operation is typically specified in one field called opcode The addressing mode for the operand can be encoded ith the operation
or specified through a separate identifier in case of large number ofsupported modes
The architecture must balance beteen several competing factors6
esire to support as many registers and addressing modes as possible
ltffect of operand specification on the si-e of the instruction (program)
esire to simplify instruction fetching and decoding during eecution
Lied si-e instruction encoding simplify the CP4 design hile limiting theaddressing modes supported
An architect caring about the code si-e can use variable si-e encoding
A hybrid approach is to allo variability by supporting multiple$si-edinstruction
Encoding Examples
MIPS Instruction format
Register-format instructions
op (6 bits)  rs (5 bits)  rt (5 bits)  rd (5 bits)  shamt (5 bits)  funct (6 bits)
- op: Basic operation of the instruction, traditionally called the opcode
- rs: The first register source operand
- rt: The second register source operand
- rd: The register destination operand; it gets the result of the operation
- shamt: Shift amount
- funct: This field selects the specific variant of the operation in the op field
Immediate-type instructions
op (6 bits)  rs (5 bits)  rt (5 bits)  address (16 bits)
- Some instructions need longer fields than provided, for large value constants
- The 16-bit address means a load word instruction can load a word within a region of ±2^15 bytes of the address in the base register
- Example: lw $t0, 32($s3)   # Temporary register $t0 gets A[8]
Instruction  Format  op  rs   rt   rd    shamt  funct  address
add          R       0   reg  reg  reg   0      32     N/A
sub          R       0   reg  reg  reg   0      34     N/A
lw           I       35  reg  reg  N/A   N/A    N/A    address
sw           I       43  reg  reg  N/A   N/A    N/A    address
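The immediate format and its ±2^15 byte range can be sketched the same way. The function name `encode_i_type` is an assumption for this illustration; the sketch takes the offset as a signed value and stores it as a 16-bit two's complement field.

```python
def encode_i_type(op: int, rs: int, rt: int, offset: int) -> int:
    """Pack an I-format instruction: op (6), rs (5), rt (5), address (16)."""
    if not -(1 << 15) <= offset < (1 << 15):
        raise ValueError("offset must fit in a signed 16-bit field")
    for value, width in [(op, 6), (rs, 5), (rt, 5)]:
        if not 0 <= value < (1 << width):
            raise ValueError(f"field value {value} does not fit in {width} bits")
    # Mask the offset so negative values become 16-bit two's complement.
    return (op << 26) | (rs << 21) | (rt << 16) | (offset & 0xFFFF)

if __name__ == "__main__":
    # lw $t0, 32($s3): op = 35, base register $s3 = 19, target $t0 = 8
    print(f"0x{encode_i_type(35, 19, 8, 32):08X}")
```

An offset outside the signed 16-bit range is rejected, which mirrors the limit on how far from the base register a single load word can reach.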
The Stored Program Concept
Learning how instructions are represented leads to discovering the secret of computing: the stored-program concept
Today's computers are built on two key principles:
- Instructions are represented as numbers
- Programs can be stored in memory to be read or written just like numbers
The power of the concept: memory can contain
- the source code for an editor
- the compiled m/c code for the editor
- the text that the compiled program is using
- the compiler that generated the code
Figure: Processor connected to Memory, which holds an accounting program (machine code), an editor program (machine code), a C compiler (machine code), payroll data, book text, and the source code in C for the editor program
Compiling if-then-else in MIPS
Assuming the five variables f, g, h, i, and j correspond to the five registers $s0 through $s4, what is the compiled MIPS code for the following C if statement?
if (i == j) f = g + h; else f = g - h;
Figure: flowchart testing i == j; if i = j then f = g + h, if i ≠ j then (Else:) f = g - h; both paths join at Exit:
MIPS:
bne $s3, $s4, Else       # go to Else if i ≠ j
add $s0, $s1, $s2        # f = g + h (skipped if i ≠ j)
j Exit
Else: sub $s0, $s1, $s2  # f = g - h (skipped if i = j)
Exit:
Typical Compilation
Major Types of Optimization
Optimization name: explanation (frequency)
High-level (at or near source level; machine independent)
- Procedure integration: Replace procedure call by procedure body (N.M.)
Local (within straight-line code)
- Common subexpression elimination: Replace two instances of the same computation by a single copy (18%)
- Constant propagation: Replace all instances of a variable that is assigned a constant with the constant
- Stack height reduction: Rearrange the expression tree to minimize resources needed for expression evaluation (N.M.)
Global (across a branch)
- Global common subexpression elimination: Same as local, but this version crosses branches
- Copy propagation: Replace all instances of a variable A that has been assigned X (i.e. A = X) with X
- Code motion: Remove code from a loop that computes the same value in each iteration of the loop
- Induction variable elimination: Simplify/eliminate array addressing calculations within loops
Machine-dependent (depends on machine knowledge)
- Strength reduction: Many examples, such as replacing a multiply by a constant with adds and shifts (N.M.)
- Pipeline scheduling: Reorder instructions to improve pipeline performance (N.M.)
Effect of Compiler Optimization
Figure: program and compiler optimization level
Measurements taken on S
Level 0: non-optimized code
Level 1: local optimization
Level 2: global optimization, s/w pipelining
Level 3: adds procedure integration
Compiler Support for Multimedia Instructions
Intel's MMX and PowerPC AltiVec have small vector processing capabilities targeting multimedia applications (to speed up graphics)
Intel added a new set of instructions, called Streaming SIMD Extension
A major advantage of vector computers is hiding the latency of memory access by loading multiple elements and then overlapping execution with data transfer
Vector computers typically have strided and/or gather/scatter addressing to perform operations on distant memory locations
- Strided addressing allows memory access in increments larger than one
- Gather/scatter addressing is similar to register indirect mode, where the addresses are stored instead of the data
Supporting vector operations without strided addressing, as in Intel's MMX, limits the potential speedup
Such limited support for vector processing makes the use of vectorizing compiler optimization unpopular and restricts its scope to hand coded routines
SIMD instructions on MMX and AltiVec tend to be solutions, not primitives
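Strided and gather/scatter accesses can be illustrated with plain Python lists standing in for memory. The function names below are hypothetical and the sketch models only the address pattern, not the latency hiding of a real vector unit.

```python
def strided_load(memory, base, stride, count):
    """Load `count` elements starting at `base`, stepping by `stride` locations."""
    return [memory[base + i * stride] for i in range(count)]

def gather(memory, addresses):
    """Load the elements found at an explicit list of addresses."""
    return [memory[a] for a in addresses]

def scatter(memory, addresses, values):
    """Store each value at the matching address from the address list."""
    for a, v in zip(addresses, values):
        memory[a] = v

if __name__ == "__main__":
    memory = list(range(100, 120))
    # A stride of 4 walks down one column of a row-major matrix with 4 columns.
    print(strided_load(memory, base=1, stride=4, count=3))
    print(gather(memory, [7, 0, 3]))
```

In `gather` and `scatter` the address list plays the role of the stored addresses mentioned above, in the same spirit as register indirect addressing.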
Starting a Program
Figure: C program -> Compiler -> Assembly language program -> Assembler -> Object: machine language module (together with Object: library routine, machine language) -> Linker -> Executable: machine language program -> Loader -> Memory
Linker:
- Place code & data modules symbolically in memory
- Determine the address of data & instruction labels
- Patch both internal & external references
Object files for Unix typically contain:
- Header: size & position of components
- Text segment: machine code
- Data segment: static and dynamic variables
- Relocation info: identify absolute memory references
- Symbol table: name & location of labels, procedures and variables
- Debugging info: mapping source to object code, break points, etc.
Loading Executable Program
Figure: memory layout
$sp -> 7fff ffff hex   Stack
                       Dynamic data
$gp -> 1000 8000 hex   Static data
       1000 0000 hex
pc  -> 0040 0000 hex   Text
       0               Reserved
To load an executable, the operating system follows these steps:
- Reads the executable file header to determine the size of the text and data segments
- Creates an address space large enough for the text and data
- Copies the instructions and data from the executable file into memory
- Copies the parameters (if any) to the main program onto the stack
- Initializes the machine registers and sets the stack pointer to the first free location
- Jumps to a start-up routine that copies the parameters into the argument registers and calls the main routine of the program
Instruction Set Design Issues
- Number of Addresses
- Flow of Control
- Operand Types
- Addressing Modes
- Instruction Types
- Instruction Formats
Number of Addresses
Four categories
3-address machines
- 2 for the source operands and one for the result
2-address machines
- One address doubles as source and result
1-address machines
- Accumulator machines
- Accumulator is used for one source and the result
0-address machines
- Stack machines
- Operands are taken from the stack
- Result goes onto the stack
Number of Addresses (cont.)
Three-address machines
- Two for the source operands, one for the result
- RISC processors use three addresses
- Sample instructions
add dest,src1,src2    ; M(dest)=[src1]+[src2]
sub dest,src1,src2    ; M(dest)=[src1]-[src2]
mult dest,src1,src2   ; M(dest)=[src1]*[src2]
Three addresses: Operand 1, Operand 2, Result
Example: a = b + c
Three-address instruction formats are not common because they require a relatively long instruction format to hold the three address references
Number of Addresses (cont.)
Example
C statement
A = B + C * D - E + F + A
Equivalent code:
mult T,C,D   ; T = C*D
add T,T,B    ; T = B+C*D
sub T,T,E    ; T = B+C*D-E
add T,T,F    ; T = B+C*D-E+F
add A,T,A    ; A = B+C*D-E+F+A
Number of Addresses (cont.)
Two-address machines
- One address doubles (for source operand & result)
- The last example makes a case for it
  - Address T is used twice
- Sample instructions
load dest,src   ; M(dest)=[src]
add dest,src    ; M(dest)=[dest]+[src]
sub dest,src    ; M(dest)=[dest]-[src]
mult dest,src   ; M(dest)=[dest]*[src]
Two addresses: one address doubles as operand and result
Example: a = a + b
The two-address format reduces the space requirement but also introduces some awkwardness. To avoid altering the value of an operand, a MOVE instruction is used to move one of the values to a result or temporary location before performing the operation
Number of Addresses (cont.)
Example
C statement
A = B + C * D - E + F + A
Equivalent code:
load T,C   ; T = C
mult T,D   ; T = C*D
add T,B    ; T = B+C*D
sub T,E    ; T = B+C*D-E
add T,F    ; T = B+C*D-E+F
add A,T    ; A = B+C*D-E+F+A
Number of Addresses (cont.)
One-address machines
- Use a special set of registers called accumulators
  - Specify one source operand & receive the result
- Called accumulator machines
- Sample instructions
load addr    ; accum = [addr]
store addr   ; M[addr] = accum
add addr     ; accum = accum + [addr]
sub addr     ; accum = accum - [addr]
mult addr    ; accum = accum * [addr]
Number of Addresses (cont.)
Example
C statement
A = B + C * D - E + F + A
Equivalent code:
load C    ; load C into accum
mult D    ; accum = C*D
add B     ; accum = C*D+B
sub E     ; accum = B+C*D-E
add F     ; accum = B+C*D-E+F
add A     ; accum = B+C*D-E+F+A
store A   ; store accum contents in A
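The one-address sequence can be traced with a tiny interpreter. The Python sketch below is an illustration under assumptions: the function name `run_accumulator`, the tuple program format, and the sample values for A through F are all made up for the example.

```python
def run_accumulator(program, memory):
    """Execute (opcode, address) pairs on a single-accumulator machine."""
    accum = 0
    for op, addr in program:
        if op == "load":
            accum = memory[addr]
        elif op == "store":
            memory[addr] = accum
        elif op == "add":
            accum += memory[addr]
        elif op == "sub":
            accum -= memory[addr]
        elif op == "mult":
            accum *= memory[addr]
        else:
            raise ValueError(f"unknown opcode: {op}")
    return memory

if __name__ == "__main__":
    # Sample values chosen only for illustration.
    memory = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
    program = [("load", "C"), ("mult", "D"), ("add", "B"), ("sub", "E"),
               ("add", "F"), ("add", "A"), ("store", "A")]
    run_accumulator(program, memory)
    print(memory["A"])  # value of B + C*D - E + F + A using the old A
```

Every arithmetic instruction reads memory once, which illustrates the high memory traffic of accumulator machines.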
Number of Addresses (cont.)
Zero-address machines
- Stack supplies operands and receives the result
  - Special instructions to load and store use an address
- Called stack machines (Ex: HP3000, Burroughs B5500)
- Sample instructions
push addr   ; push([addr])
pop addr    ; pop([addr])
add         ; push(pop + pop)
sub         ; push(pop - pop)
mult        ; push(pop * pop)
Number of Addresses (cont.)
Example
C statement
A = B + C * D - E + F + A
Equivalent code:
push E
push C
push D
mult
push B
add
sub
push F
add
push A
add
pop A
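The zero-address sequence can be traced with a small stack interpreter. This Python sketch is illustrative only: the name `run_stack_machine` and the sample values are assumptions, and `sub` follows the "push(pop - pop)" definition, so the first value popped (the top of the stack) is the left operand.

```python
def run_stack_machine(program, memory):
    """Execute a list of stack-machine instructions against `memory`."""
    stack = []
    for instruction in program:
        op, *operand = instruction.split()
        if op == "push":
            stack.append(memory[operand[0]])
        elif op == "pop":
            memory[operand[0]] = stack.pop()
        elif op in ("add", "sub", "mult"):
            first = stack.pop()    # top of stack
            second = stack.pop()   # element below the top
            if op == "add":
                stack.append(first + second)
            elif op == "sub":
                stack.append(first - second)
            else:
                stack.append(first * second)
        else:
            raise ValueError(f"unknown opcode: {op}")
    return memory

if __name__ == "__main__":
    # Sample values chosen only for illustration.
    memory = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
    program = ["push E", "push C", "push D", "mult", "push B", "add",
               "sub", "push F", "add", "push A", "add", "pop A"]
    run_stack_machine(program, memory)
    print(memory["A"])
```

Pushing E first is what makes the later `sub` compute (B + C*D) - E, since E sits below the partial sum when `sub` executes.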
Load/Store Architecture
Instructions expect operands in internal processor registers
- Special LOAD and STORE instructions move data between registers and memory
- RISC uses this architecture
- Reduces instruction length
Load/Store Architecture (cont.)
Sample instructions
load Rd,addr      ; Rd = [addr]
store addr,Rs     ; (addr) = Rs
add Rd,Rs1,Rs2    ; Rd = Rs1 + Rs2
sub Rd,Rs1,Rs2    ; Rd = Rs1 - Rs2
mult Rd,Rs1,Rs2   ; Rd = Rs1 * Rs2
Number of Addresses (cont.)
Example
C statement
A = B + C * D - E + F + A
Equivalent code:
load R1,B
load R2,C
load R3,D
load R4,E
load R5,F
load R6,A
mult R2,R2,R3
add R2,R2,R1
sub R2,R2,R4
add R2,R2,R5
add R2,R2,R6
store A,R2
Flow of Control
Default is sequential flow
Several instructions alter this default execution
Branches
- Unconditional
- Conditional
- Delayed branches
Procedure calls
- Delayed procedure calls
Flow of Control (cont.)
Branches
Unconditional
- Absolute address
- PC-relative
  - Target address is specified relative to PC contents
  - Relocatable code
Example: MIPS
- Absolute address
j target
- PC-relative
b target
cont-
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 69101
lo1 o Co t ol co t- -
8ranches
Conditional
$ ump is ta0en only if the condition is met
To types
$ Set$Then$ump
U Condition testing is separated from branching U Condition code registers are used to convey the condition test
result
U Condition code registers 0eep a record of the status of the last A34 operation such as overflo condition
$ ltample6 Pentium codecm AB comare A ad B
e taret um e0ual
Flow of Control (cont.)
- Test-and-Jump
  - A single instruction performs condition testing and branching
- Example: MIPS instruction
beq Rsrc1,Rsrc2,target
Jumps to target if Rsrc1 = Rsrc2
Delayed branching
- Control is transferred after executing the instruction that follows the branch instruction
  - This instruction slot is called the delay slot
- Improves efficiency
- Highly pipelined RISC processors support it
Flow of Control (cont.)
Procedure calls
- Facilitate modular programming
- Require two pieces of information to return
  - End of procedure
    - Pentium uses the ret instruction
    - MIPS uses the jr instruction
  - Return address
    - In a (special) register: MIPS allows any general-purpose register
    - On the stack: Pentium
Flow of Control (cont.)
[Figure: delay slot]
Parameter Passing
Two basic techniques
- Register-based (e.g., PowerPC, MIPS)
  - Internal registers are used
    - Faster
    - Limits the number of parameters
    - Recursive procedure
- Stack-based (e.g., Pentium)
  - Stack is used
    - More general
2 Operand Types
Instructions support basic data types
- Characters
- Integers
- Floating-point
Instruction overload
- Same instruction for different data types
- Example: Pentium
    mov AL,address     ; loads an 8-bit value
    mov AX,address     ; loads a 16-bit value
    mov EAX,address    ; loads a 32-bit value
Operand Types (cont.)
Separate instructions
- Instructions specify the operand size
- Example: MIPS
    lb Rdest,address   ; loads a byte
    lh Rdest,address   ; loads a halfword (16 bits)
    lw Rdest,address   ; loads a word (32 bits)
    ld Rdest,address   ; loads a doubleword (64 bits)
- Similar instructions for store
3 Addressing Modes
How the operands are specified
Operands can be in three places
- Registers
  - Register addressing mode
- Part of instruction
  - Constant
  - Immediate addressing mode
  - All processors support these two addressing modes
- Memory
  - Difference between RISC and CISC
  - CISC supports a large variety of addressing modes
  - RISC follows load/store architecture
4 Instruction Types
Several types of instructions
Data movement
- Pentium: mov dest,src
- Some processors do not provide direct data movement instructions
- Indirect data movement
    add Rdest,Rsrc,0   ; Rdest = Rsrc+0
Arithmetic and Logical
- Arithmetic
  - Integer and floating-point, signed and unsigned
  - add, subtract, multiply, divide
- Logical
  - and, or, not, xor
Instruction Types (cont.)
Condition code bits
- S: Sign bit (0 = +, 1 = -)
- Z: Zero bit (0 = nonzero, 1 = zero)
- O: Overflow bit (0 = no overflow, 1 = overflow)
- C: Carry bit (0 = no carry, 1 = carry)
Example: Pentium
    cmp count,25   ; compare count to 25
                   ; subtract 25 from count
    je  target     ; jump if equal
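The compare-then-jump pattern above can be sketched in a few lines. This is a minimal illustration, not Pentium-accurate in every detail: the function names, the 32-bit width, and the choice to report carry as the borrow out of an unsigned subtraction are assumptions made for the example.

```python
def compare_flags(a, b, bits=32):
    """Return the condition code bits that a compare (a - b) would set."""
    mask = (1 << bits) - 1
    sign_bit = 1 << (bits - 1)
    a &= mask
    b &= mask
    result = (a - b) & mask          # the difference itself is discarded by cmp
    return {
        "Z": int(result == 0),                                # zero
        "S": int(bool(result & sign_bit)),                    # sign of the result
        "C": int(a < b),                                      # borrow (unsigned)
        "O": int(bool((a ^ b) & (a ^ result) & sign_bit)),    # signed overflow
    }


def jump_if_equal(flags):
    """je: the jump is taken only when the zero bit is set."""
    return flags["Z"] == 1


count = 25
flags = compare_flags(count, 25)
print(flags, "jump taken" if jump_if_equal(flags) else "fall through")
```

Because the test and the branch are separate steps, any instruction that does not touch the flags can be scheduled between them.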
Instruction Types (cont.)
Flow control and I/O instructions
- Branch
- Procedure call
- Interrupts
I/O instructions
- Memory-mapped I/O
  - Most processors support memory-mapped I/O
  - No separate instructions for I/O
- Isolated I/O
  - Pentium supports isolated I/O
  - Separate I/O instructions
      in  AX,io_port    ; read from an I/O port
      out io_port,AX    ; write to an I/O port
5 Instruction Formats
Two types
Fixed-length
- Used by RISC processors
- 32-bit RISC processors use 32-bit-wide instructions
  - Examples: SPARC, MIPS, PowerPC
Variable-length
- Used by CISC processors
- Memory operands need more bits to specify
Opcode
- Major and exact operation
Examples of Instruction Formats
[Figure: examples of instruction formats]
RISC (Reduced Instruction Set Computer) versus CISC (Complex Instruction Set Computer)
RISC vs. CISC
The underlying philosophy of RISC machines is that a system is better able to manage program execution when the program consists of only a few different instructions that are the same length and require the same number of clock cycles to decode and execute.
RISC systems access memory only with explicit load and store instructions.
In CISC systems, many different kinds of instructions access memory, making instruction length variable and fetch-decode-execute time unpredictable.
RISC vs. CISC
The difference between CISC and RISC becomes evident through the basic computer performance equation:
    time/program = (time/cycle) × (cycles/instruction) × (instructions/program)
RISC systems shorten execution time by reducing the clock cycles per instruction.
CISC systems improve performance by reducing the number of instructions per program.
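The performance equation referred to above is the usual product of instruction count, cycles per instruction, and cycle time. A small sketch makes the trade-off concrete; every number below is hypothetical and chosen only to show how the three factors interact.

```python
def execution_time_ns(instructions, cycles_per_instruction, cycle_time_ns):
    """time/program = instructions/program * cycles/instruction * time/cycle."""
    return instructions * cycles_per_instruction * cycle_time_ns


# Hypothetical machines: the CISC-style one runs fewer, slower instructions;
# the RISC-style one runs more instructions at one cycle each.
cisc_like = execution_time_ns(instructions=1000, cycles_per_instruction=4, cycle_time_ns=2)
risc_like = execution_time_ns(instructions=1500, cycles_per_instruction=1, cycle_time_ns=1)

print("CISC-like:", cisc_like, "ns")
print("RISC-like:", risc_like, "ns")
```

Neither factor wins on its own: a design that halves the cycles per instruction but doubles the instruction count gains nothing.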
RISC vs. CISC
The simple instruction set of RISC machines enables control units to be hardwired for maximum speed.
The more complex, and variable, instruction set of CISC machines requires microcode-based control units that interpret instructions as they are fetched from memory. This translation takes time.
With fixed-length instructions, RISC lends itself to pipelining and speculative execution.
RISC vs. CISC
Consider the following program fragments:

CISC                  RISC
mov ax, 10            mov ax, 0
mov bx, 5             mov bx, 10
mul bx, ax            mov cx, 5
                      Begin: add ax, bx
                             loop Begin

The total clock cycles for the CISC version might be:
    (2 movs × 1 cycle) + (1 mul × 30 cycles) = 32 cycles
While the clock cycles for the RISC version are:
    (3 movs × 1 cycle) + (5 adds × 1 cycle) + (5 loops × 1 cycle) = 13 cycles
With the RISC clock cycle being shorter, RISC gives us much faster execution speeds.
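The cycle arithmetic for the two fragments can be checked mechanically. The sketch below assumes the fragment multiplies 10 by 5, that every simple instruction costs one cycle, and that the multiply costs 30 cycles; these are illustrative figures, not measurements of any real processor.

```python
# Each entry: (mnemonic, number of times executed, cycles per execution).
CISC = [("mov", 2, 1), ("mul", 1, 30)]
RISC = [("mov", 3, 1), ("add", 5, 1), ("loop", 5, 1)]


def total_cycles(program):
    """Sum executions * cycles over every instruction class in the fragment."""
    return sum(count * cycles for _, count, cycles in program)


def multiply_by_repeated_addition(multiplicand, multiplier):
    """What the RISC loop computes: add the multiplicand, multiplier times."""
    accumulator = 0
    for _ in range(multiplier):
        accumulator += multiplicand
    return accumulator


print("CISC cycles:", total_cycles(CISC))
print("RISC cycles:", total_cycles(RISC))
print("product:", multiply_by_repeated_addition(10, 5))
```

The RISC count grows with the multiplier, so the comparison depends on the operand values as well as on the cycle costs.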
RISC vs. CISC
Because of their load-store ISAs, RISC architectures require a large number of CPU registers.
These registers provide fast access to data during sequential program execution.
They can also be employed to reduce the overhead typically caused by passing parameters to subprograms.
Instead of pulling parameters off of a stack, the subprogram is directed to use a subset of registers.
RISC vs. CISC
This is how registers can be overlapped in a RISC system.
The current window pointer (CWP) points to the active register window.
[Figure: overlapping register windows]
RISC vs. CISC
It is becoming increasingly difficult to distinguish RISC architectures from CISC architectures.
Some RISC systems provide more extravagant instruction sets than some CISC systems.
Some systems combine both approaches.
The following two slides summarize the characteristics that traditionally typify the differences between these two architectures.
RISC vs. CISC

RISC                                           CISC
Multiple register sets.                        Single register set.
Three operands per instruction.                One or two register operands per instruction.
Parameter passing through register windows.    Parameter passing through memory.
Single-cycle instructions.                     Multiple-cycle instructions.
Hardwired control.                             Microprogrammed control.
Highly pipelined.                              Less pipelined.
Simple instructions, few in number.            Many complex instructions.
Fixed-length instructions.                     Variable-length instructions.
Complexity in compiler.                        Complexity in microcode.
Only LOAD/STORE instructions access memory.    Many instructions can access memory.
Few addressing modes.                          Many addressing modes.
Summary
Instruction Set Design Issues
Instruction set design issues include:
Where are operands stored?
- registers, memory, stack, accumulator
How many explicit operands are there?
- 0, 1, 2, or 3
How is the operand location specified?
- register, immediate, indirect, ...
What type and size of operands are supported?
- byte, int, float, double, string, vector, ...
What operations are supported?
- add, sub, mul, move, compare, ...
More About General Purpose Registers
Why do almost all new architectures use GPRs?
Registers are much faster than memory (even cache)
- Register values are available immediately
- When memory isn't ready, the processor must wait ("stall")
Registers are convenient for variable storage
- Compiler assigns some variables just to registers
- More compact code, since small fields specify registers (compared to memory addresses)
[Figure: storage hierarchy: Processor, Registers, Cache, Memory, Disk]
What Operations are Needed
Arithmetic and Logical
- Integer arithmetic: ADD, SUB, MULT, DIV, SHIFT
- Logical operation: AND, OR, XOR, NOT
Data Transfer - copy, load, store
Control - branch, jump, call, return
Floating Point: ADDF, MULF, DIVF
- Same as arithmetic, but usually take bigger operands
Decimal - ADDD, CONVERT
String - move, compare, search
Graphics - pixel and vertex, compression/decompression operations
Stack Architecture: Pros and Cons
Pros
- Good code density (implicit top of stack)
- Low hardware requirements
- Easy to write a simple compiler for stack architectures
Cons
- Stack becomes the bottleneck
- Little ability for parallelism or pipelining
- Data is not always at the top of the stack when needed, so additional instructions like TOP and SWAP are needed
- Difficult to write an optimizing compiler for stack architectures
Accumulator Architecture: Pros and Cons
Pros
- Very low hardware requirements
- Easy to design and understand
Cons
- Accumulator becomes the bottleneck
- Little ability for parallelism or pipelining
- High memory traffic
Memory-Memory Architecture: Pros and Cons
Pros
- Requires fewer instructions (especially if 3 operands)
- Easy to write compilers for (especially if 3 operands)
Cons
- Very high memory traffic (especially if 3 operands)
- Variable number of clocks per instruction
- With two operands, more data movements are required
Memory-Register Architecture: Pros and Cons
Pros
- Some data can be accessed without loading first
- Instruction format easy to encode
- Good code density
Cons
- Operands are not equivalent (poor orthogonality)
- Variable number of clocks per instruction
- May limit number of registers
Interpreting Memory Addressing
The address of a word matches the byte address of one of its 4 bytes.
The addresses of sequential words differ by 4 (word size in bytes).
Word addresses are multiples of 4 (alignment restriction).
Machines that use the address of the leftmost byte as the word address are called "Big Endian", and those that use the rightmost byte are called "Little Endian".
Misalignment complicates memory access and causes programs to run slower. (Some machines do not allow misaligned memory access at all.)
Byte ordering can be a problem when exchanging data among different machines.
Byte addressing affects array index calculation, which must account for word addressing and the offset within the word.

Object addressed   Aligned at byte offsets   Misaligned at byte offsets
Byte               0, 1, 2, 3, 4, 5, 6, 7    Never
Half word          0, 2, 4, 6                1, 3, 5, 7
Word               0, 4                      1, 2, 3, 5, 6, 7
Double word        0                         1, 2, 3, 4, 5, 6, 7
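Both ideas on this slide, alignment and byte ordering, are easy to check by hand. The sketch below assumes 4-byte words and an 8-byte window of offsets; the helper names are made up for the example, and only Python's built-in `int.to_bytes` is used for the byte order.

```python
def is_aligned(address, size):
    """An object of `size` bytes is aligned when its address is a multiple of size."""
    return address % size == 0


def aligned_offsets(size, span=8):
    """Byte offsets within a `span`-byte window at which the object is aligned."""
    return [offset for offset in range(span) if is_aligned(offset, size)]


def bytes_in_memory(word, byteorder):
    """Bytes of a 32-bit word, listed from the lowest address upward."""
    return [f"{b:02x}" for b in word.to_bytes(4, byteorder)]


for name, size in [("byte", 1), ("half word", 2), ("word", 4), ("double word", 8)]:
    print(f"{name:12s} aligned at", aligned_offsets(size))

print("big endian:   ", bytes_in_memory(0x0A0B0C0D, "big"))
print("little endian:", bytes_in_memory(0x0A0B0C0D, "little"))
```

The same four bytes appear in opposite order on the two kinds of machine, which is why data exchanged between them needs an agreed byte order.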
Addressing Modes
Addressing modes refer to how the location of an operand (effective address) is specified.
Addressing modes have the ability to:
- Significantly reduce instruction counts
- Increase the average CPI
- Increase the complexity of building a machine
The VAX machine is used for benchmark data since it supports a wide range of memory addressing modes.
Common addressing modes can be classified based on:
- the source of the data, into register, immediate, or memory
- the address calculation, into direct and indirect
An indexed addressing mode is usually provided to allow efficient implementation of loops and array access.
Example of Addressing Modes

Addressing mode: Register
  Example:   Add R4, R3
  Meaning:   Regs[R4] ← Regs[R4] + Regs[R3]
  When used: When a value is in a register
Addressing mode: Immediate
  Example:   Add R4, #3
  Meaning:   Regs[R4] ← Regs[R4] + 3
  When used: For constants
Addressing mode: Displacement
  Example:   Add R4, 100(R1)
  Meaning:   Regs[R4] ← Regs[R4] + Mem[100 + Regs[R1]]
  When used: Accessing local variables
Addressing mode: Register indirect
  Example:   Add R4, (R1)
  Meaning:   Regs[R4] ← Regs[R4] + Mem[Regs[R1]]
  When used: Accessing using a pointer or a computed address
Addressing mode: Indexed
  Example:   Add R4, (R1 + R2)
  Meaning:   Regs[R4] ← Regs[R4] + Mem[Regs[R1] + Regs[R2]]
  When used: Sometimes useful in array addressing: R1 = base of the array; R2 = index amount
Addressing mode: Direct or absolute
  Example:   Add R4, (1001)
  Meaning:   Regs[R4] ← Regs[R4] + Mem[1001]
  When used: Sometimes useful for accessing static data; address constant may need to be large
Addressing mode: Memory indirect or memory deferred
  Example:   Add R4, @(R3)
  Meaning:   Regs[R4] ← Regs[R4] + Mem[Mem[Regs[R3]]]
  When used: If R3 is the address of the pointer p, then the mode yields *p
Addressing mode: Autoincrement
  Example:   Add R4, (R2)+
  Meaning:   Regs[R4] ← Regs[R4] + Mem[Regs[R2]]
             Regs[R2] ← Regs[R2] + d
  When used: Useful for stepping through arrays within a loop. R2 points to the start of the array; each reference increments R2 by d
Addressing mode: Autodecrement
  Example:   Add R4, -(R2)
  Meaning:   Regs[R2] ← Regs[R2] - d
             Regs[R4] ← Regs[R4] + Mem[Regs[R2]]
  When used: Same use as autoincrement. Autodecrement/increment can also act as push/pop to implement a stack
Addressing mode: Scaled
  Example:   Add R4, 100(R2)[R3]
  Meaning:   Regs[R4] ← Regs[R4] + Mem[100 + Regs[R2] + Regs[R3] × d]
  When used: Used to index arrays
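The address arithmetic behind the memory addressing modes in this table can be written out directly. This is a sketch under stated assumptions: the register contents, the displacement of 100, and the operand size d = 4 are hypothetical values, and the mode names are labels chosen for the example rather than any assembler's syntax.

```python
# Hypothetical register contents and operand size, for illustration only.
regs = {"R1": 200, "R2": 300, "R3": 5}
D = 4  # operand size in bytes, the "d" of the scaled and autoincrement modes


def effective_address(mode, regs, base=None, index=None, disp=0, d=D):
    """Return the memory address an operand refers to under the given mode."""
    if mode == "displacement":         # disp(base)
        return disp + regs[base]
    if mode == "register_indirect":    # (base)
        return regs[base]
    if mode == "indexed":              # (base + index)
        return regs[base] + regs[index]
    if mode == "direct":               # (disp)
        return disp
    if mode == "scaled":               # disp(base)[index]
        return disp + regs[base] + regs[index] * d
    raise ValueError(f"unknown addressing mode: {mode}")


def autoincrement(regs, base, d=D):
    """Use the address held in `base`, then advance `base` by d."""
    address = regs[base]
    regs[base] += d
    return address


print(effective_address("displacement", regs, base="R1", disp=100))
print(effective_address("indexed", regs, base="R1", index="R2"))
print(effective_address("scaled", regs, base="R2", index="R3", disp=100))
print(autoincrement(regs, "R2"), regs["R2"])
```

Register and immediate modes are left out because they involve no memory address at all.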
Addressing Mode for Signal Processing
DSPs offer special addressing modes to better serve popular algorithms.
Special features require either hand coding or a compiler that uses such features (GNU would not be a good choice).
Modulo addressing
- Since DSPs deal with continuous data streams, circular buffers are widely used
- Circular or modulo addressing allows automatic increment and decrement, and resets the pointer when reaching the end of the buffer
Reverse addressing
- The resulting address is the reverse order of the current address
- Reverse addressing mode expedites the access, which otherwise requires a number of logical instructions or extra memory accesses

Fast Fourier Transform
0 (000)   0 (000)
1 (001)   4 (100)
2 (010)   2 (010)
3 (011)   6 (110)
4 (100)   1 (001)
5 (101)   5 (101)
6 (110)   3 (011)
7 (111)   7 (111)
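What the two DSP addressing modes compute can be shown in software. This is a minimal sketch of the address calculations only; the function names and the buffer placement (four entries starting at address 4) are assumptions made for the example, and a real DSP does this in its address generation hardware rather than in instructions.

```python
def bit_reverse(index, bits):
    """Reverse the low `bits` bits of index, as used to reorder FFT data."""
    result = 0
    for _ in range(bits):
        result = (result << 1) | (index & 1)
        index >>= 1
    return result


def next_circular(pointer, step, start, length):
    """Advance a pointer inside a circular buffer, wrapping at either end."""
    return start + (pointer - start + step) % length


# The 8-point FFT reordering.
print([bit_reverse(i, 3) for i in range(8)])

# A hypothetical 4-entry circular buffer occupying addresses 4..7.
print(next_circular(7, 1, start=4, length=4))    # stepping past the end wraps
print(next_circular(4, -1, start=4, length=4))   # stepping before the start wraps
```

Done in software, each access costs several shift, mask, or modulo operations, which is the overhead these modes remove.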
Operations of the Computer Hardware
"There must certainly be instructions for performing the fundamental arithmetic operations."
Burks, Goldstine, and von Neumann, 1947
Assembly language is a symbolic representation of what the processor actually understands.
The MIPS assembler allows only one instruction per line and ignores comments following # until the end of the line.
Example:
Translation of a segment of a C program to MIPS assembly instructions
C:  f = (g + h) - (i + j)
MIPS:
    add t0, g, h     # temp variable t0 contains "g + h"
    add t1, i, j     # temp variable t1 contains "i + j"
    sub f, t0, t1    # f = t0 - t1 = (g + h) - (i + j)
Operations in the Instruction Set

Operator type            Examples
Arithmetic and logical   Integer arithmetic and logical operations: add, and, subtract, or
Data transfer            Loads-stores (move instructions on machines with memory addressing)
Control                  Branch, jump, procedure call and return, trap
System                   Operating system call, virtual memory management instructions
Floating point           Floating point instructions: add, multiply
Decimal                  Decimal add, decimal multiply, decimal to character conversion
String                   String move, string compare, string search
Graphics                 Pixel operations, compression/decompression operations

Arithmetic, logical, data transfer, and control are almost standard categories for all machines.
System instructions are required for multi-programming environments, although support for system functions varies.
Decimal and string instructions can be primitives, e.g. on the IBM 360 and the VAX.
Support for floating point, decimal, string, and graphics is sometimes provided optionally via a co-processor.
Some machines rely on the compiler to synthesize special operations, such as string handling, from simpler instructions.
Operations for Media and Signal Processing
Single instruction multiple data (SIMD) and vector instructions are often supported in DSPs, which are commonly used in multimedia and signal processing applications.
Partitioned Add (integer)
- Performs multiple 16-bit additions on a 64-bit ALU, since most data are narrow
- Increases ALU throughput for multimedia applications
Paired single operations (float)
- Allow the same register to act as two operands to the same operation
- Handy in dealing with vertices and coordinates
Multiply and accumulate
- Very handy for calculating dot products of vectors (signal processing) and matrix multiplication
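A partitioned add and a multiply-accumulate can both be emulated in a few lines. The sketch assumes four 16-bit lanes packed into one 64-bit word with wraparound on overflow; the lane layout and function names are choices made for the example, not a description of any particular DSP or media extension.

```python
LANE_BITS = 16
LANES = 4
LANE_MASK = (1 << LANE_BITS) - 1


def pack(values):
    """Pack four 16-bit values into one 64-bit word, lane 0 in the low bits."""
    word = 0
    for lane, value in enumerate(values):
        word |= (value & LANE_MASK) << (lane * LANE_BITS)
    return word


def unpack(word):
    """Split a 64-bit word back into its four 16-bit lanes."""
    return [(word >> (lane * LANE_BITS)) & LANE_MASK for lane in range(LANES)]


def partitioned_add(a, b):
    """Add lane by lane; a carry out of one lane never reaches the next."""
    return pack([x + y for x, y in zip(unpack(a), unpack(b))])


def multiply_accumulate(xs, ys):
    """Dot product as a chain of multiply-and-accumulate steps."""
    accumulator = 0
    for x, y in zip(xs, ys):
        accumulator += x * y
    return accumulator


total = partitioned_add(pack([1, 2, 3, 0xFFFF]), pack([10, 20, 30, 1]))
print(unpack(total))
print(multiply_accumulate([1, 2, 3], [4, 5, 6]))
```

The last lane wraps to 0 instead of carrying into its neighbour, which is the property that distinguishes a partitioned add from an ordinary 64-bit add.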
Frequency of Operations Usage
Make the common case fast by focusing on these operations.
The most widely executed instructions are the simple operations of an instruction set.
The following is the average usage in SPECint92 on Intel 80x86:

Rank   80x86 instruction        Integer average (% total executed)
1      Load                     22%
2      Conditional branch       20%
3      Compare                  16%
4      Store                    12%
5      Add                      8%
6      And                      6%
7      Sub                      5%
8      Move register-register   4%
9      Call                     1%
10     Return                   1%
       Total                    96%
Control Flow Instructions
Jump for unconditional change in the control flow
Branch for conditional change in the control flow
Procedure calls and returns
[Figure: frequency of control flow instructions. Data is based on SPEC on Alpha]
Destination Address Definition
Relative addressing with respect to the program counter proved to be the best choice for forward and backward branches or jumps (load address independent).
To allow for dynamic loading of library routines, register indirect addressing allows addresses to be loaded in special registers (e.g. virtual functions in C++ and system calls in a case statement).
[Figure: branch distance data. Data is based on SPEC on Alpha]
Condition Evaluation
Compare and branch can be efficient if the majority of conditions are comparisons with zero.
Remember to focus on the common case.
Based on SPEC on MIPS
Frequency of Types of Comparison
[Figure: frequency of types of comparison. Data is based on SPEC on Alpha]
A different benchmark and machine set new design priorities.
DSPs support a repeat instruction for for-loops (vectors) using registers.
Supporting Procedures
Execution of a procedure follows these steps:
- Store parameters in a place accessible to the procedure
- Transfer control to the procedure
- Acquire the storage resources needed for the procedure
- Perform the desired task
- Store the result value in a place accessible to the calling program
- Return control to the point of origin
The hardware provides a program counter to trace instruction flow and manage transfer of control.
Parameter Passing
- Registers can be used for passing a small number of parameters
- A stack is used to spill registers of the current context, making room for the called procedure to run, and to allow large parameters to be passed
- Storage of machine state can be performed by the caller or the callee
- Handling of shared variables is important to ensure correct semantics and thus requires clear specifications in the library interface
Global variables stored in registers need careful handling.
Type and Size of Operands
The type of an operand is designated by encoding it in the instruction's operation code.
The type of an operand, e.g. single-precision float, effectively gives its size.
Common operand types include character, half-word and word-size integer, and single- and double-precision floating point.
Characters are almost always in ASCII, integers are in 2's complement, and floating point is in IEEE 754.
The 16-bit Unicode (used in Java) is gaining popularity due to its support for international character sets.
For business applications, some architectures support a decimal format in binary coded decimal (BCD).
Depending on the size of the word, the complexity of handling different operand types differs.
DSPs offer fixed-point data types to support high-precision floating point arithmetic and to allow sharing a single exponent among multiple numbers.
For graphics applications, vertex and pixel operands are added features.
Size of Operands
The double-word data type is used for double-precision floating point operations and for address storage in machines with a 64-bit-wide address bus.
Words are used for integer operations and for machines with a 32-bit address bus.
Because of the mix in SPEC, word and double-word data types dominate.
[Figure: frequency of reference by size, based on SPEC2000 on Alpha]
Instruction Representation
Humans are taught to think in base 10 (decimal), but numbers may be represented in any base (123 in base 10 = 1111011 in binary, or base 2).
Numbers are stored in computers as a series of high and low electronic signals (binary numbers).
Binary digits are called bits and are considered the atom of computing.
Each piece of an instruction is a number, and placing these numbers together forms the instruction.
The assembler translates the symbolic assembly instructions into machine language instructions (machine code).
Example:
Assembly:                 add $t0, $s1, $s2
M/C language (decimal):   0        17       18       8        0        32
M/C language (binary):    000000   10001    10010    01000    00000    100000
                          6 bits   5 bits   5 bits   5 bits   5 bits   6 bits
Note: the MIPS compiler by default maps $s0,...,$s7 to reg. 16-23 and $t0,...,$t7 to reg. 8-15.
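Placing the field numbers side by side is just shifting and OR-ing. The sketch below assumes the standard MIPS R-format field widths (6, 5, 5, 5, 5, 6 bits) and the default register numbering noted above; the helper name `encode_r_type` and the register dictionary are made up for the example.

```python
# Default MIPS register numbers for the temporaries and saved registers.
REGISTERS = {f"$t{i}": 8 + i for i in range(8)}
REGISTERS.update({f"$s{i}": 16 + i for i in range(8)})


def encode_r_type(op, rs, rt, rd, shamt, funct):
    """Pack the six R-format fields (6, 5, 5, 5, 5, 6 bits) into one 32-bit word."""
    return (op << 26) | (rs << 21) | (rt << 16) | (rd << 11) | (shamt << 6) | funct


# add $t0, $s1, $s2: rd = $t0, rs = $s1, rt = $s2, funct 32 selects "add".
word = encode_r_type(
    op=0,
    rs=REGISTERS["$s1"],
    rt=REGISTERS["$s2"],
    rd=REGISTERS["$t0"],
    shamt=0,
    funct=32,
)

print(f"{word:032b}")
print(hex(word))
```

Note that the destination register comes first in the assembly text but third among the register fields of the encoded word.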
Encoding an Instruction Set
Instruction encoding affects the size of the compiled program and the complexity of the CPU implementation.
The operation is typically specified in one field, called the opcode.
The addressing mode for the operand can be encoded with the operation or specified through a separate identifier when there is a large number of supported modes.
The architecture must balance several competing factors:
- Desire to support as many registers and addressing modes as possible
- Effect of operand specification on the size of the instruction (program)
- Desire to simplify instruction fetching and decoding during execution
Fixed-size instruction encoding simplifies the CPU design while limiting the addressing modes supported.
An architect caring about the code size can use variable-size encoding.
A hybrid approach is to allow variability by supporting multiple instruction sizes.
Encoding Examples
[Figure: examples of instruction encodings]
MIPS Instruction Format
Register-format instructions

op       rs       rt       rd       shamt    funct
6 bits   5 bits   5 bits   5 bits   5 bits   6 bits

op:    Basic operation of the instruction, traditionally called the opcode
rs:    The first register source operand
rt:    The second register source operand
rd:    The register destination operand; it gets the result of the operation
shamt: Shift amount
funct: This field selects the specific variant of the operation in the op field

Immediate-type instructions

op       rs       rt       address
6 bits   5 bits   5 bits   16 bits

Some instructions need longer fields than those provided, for large-value constants.
The 16-bit address means a load word instruction can load a word within a region of ±2^15 bytes of the address in the base register.
Example: lw $t0, 32($s3)    # Temporary register $t0 gets A[8]

Instruction   Format   op   rs    rt    rd    shamt   funct   address
add           R        0    reg   reg   reg   0       32      N/A
sub           R        0    reg   reg   reg   0       34      N/A
lw            I        35   reg   reg   N/A   N/A     N/A     address
sw            I        43   reg   reg   N/A   N/A     N/A     address
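The immediate format can be encoded the same way as the register format, with the 16-bit address field taking the place of rd, shamt, and funct. The sketch assumes the load word example uses offset 32 from base register $s3 (register 19) with destination $t0 (register 8); the helper names are hypothetical. It also shows the precise signed range of the field, which runs from -2^15 up to 2^15 - 1.

```python
def encode_i_type(op, rs, rt, immediate):
    """Pack the I-format fields (6, 5, 5, 16 bits) into one 32-bit word."""
    if not -(1 << 15) <= immediate < (1 << 15):
        raise ValueError("immediate does not fit in a signed 16-bit field")
    return (op << 26) | (rs << 21) | (rt << 16) | (immediate & 0xFFFF)


def decode_i_type(word):
    """Split a 32-bit word back into (op, rs, rt, signed immediate)."""
    immediate = word & 0xFFFF
    if immediate & 0x8000:            # sign-extend the 16-bit field
        immediate -= 1 << 16
    return (word >> 26) & 0x3F, (word >> 21) & 0x1F, (word >> 16) & 0x1F, immediate


# lw $t0, 32($s3): op = 35, base register rs = 19 ($s3), destination rt = 8 ($t0).
word = encode_i_type(op=35, rs=19, rt=8, immediate=32)
print(hex(word))
print(decode_i_type(word))
```

An offset outside the 16-bit range cannot be encoded in a single instruction, so the encoder rejects it rather than truncating silently.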
The Stored Program Concept
Learning how instructions are represented leads to discovering the secret of computing: the stored-program concept.
Today's computers are built on two key principles:
- Instructions are represented as numbers
- Programs can be stored in memory to be read or written just like numbers
The power of the concept: memory can contain
- the source code for an editor
- the compiled m/c code for the editor
- the text that the compiled program is using
- the compiler that generated the code
[Figure: a processor connected to a memory holding an accounting program (machine code), an editor program (machine code), a C compiler (machine code), payroll data, book text, and source code in C for the editor program]
Compiling if-then-else in MIPS
Assuming the five variables f, g, h, i, and j correspond to the five registers $s0 through $s4, what is the compiler's code for the following C if statement?
    if (i == j) f = g + h; else f = g - h;
[Figure: flowchart. Test i == j; if i = j then f = g + h; if i ≠ j then (Else:) f = g - h; both paths join at Exit:]
MIPS:
          bne $s3, $s4, Else    # go to Else if i ≠ j
          add $s0, $s1, $s2     # f = g + h (skipped if i ≠ j)
          j   Exit
    Else: sub $s0, $s1, $s2     # f = g - h (skipped if i = j)
    Exit:
Typical Compilation
[Figure: typical compilation]
Major Types of Optimization
(N.M. = not measured)

High level: at or near source level; machine independent
- Procedure integration
    Explanation: Replace procedure call by procedure body
    Frequency:   N.M.
Local: within straight-line code
- Common subexpression elimination
    Explanation: Replace two instances of the same computation by a single copy
    Frequency:   18%
- Constant propagation
    Explanation: Replace all instances of a variable that is assigned a constant with the constant
    Frequency:   22%
- Stack height reduction
    Explanation: Rearrange expression tree to minimize resources needed for expression evaluation
    Frequency:   N.M.
Global: across a branch
- Global common subexpression elimination
    Explanation: Same as local, but this version crosses branches
    Frequency:   13%
- Copy propagation
    Explanation: Replace all instances of a variable A that has been assigned X (i.e., A = X) with X
    Frequency:   11%
- Code motion
    Explanation: Remove code from a loop that computes the same value in each iteration of the loop
    Frequency:   16%
- Induction variable elimination
    Explanation: Simplify/eliminate array addressing calculations within loops
    Frequency:   2%
Machine-dependent: depends on machine knowledge
- Strength reduction
    Explanation: Many examples, such as replacing a multiply by a constant with adds and shifts
    Frequency:   N.M.
- Pipeline scheduling
    Explanation: Reorder instructions to improve pipeline performance
    Frequency:   N.M.
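Two of the listed transformations are small enough to show as before-and-after pairs. The functions below are hand-written illustrations of what a compiler might produce, not the output of any real compiler; the choice of the constant 10 and the expression shapes are assumptions made for the example.

```python
def times_ten_multiply(x):
    """Before strength reduction: one multiply by a constant."""
    return x * 10


def times_ten_shift_add(x):
    """After strength reduction: 10*x rewritten as 8*x + 2*x using shifts and an add."""
    return (x << 3) + (x << 1)


def area_terms_naive(a, b, c):
    """Before common subexpression elimination: (a + b) is computed twice."""
    return (a + b) * c + (a + b)


def area_terms_cse(a, b, c):
    """After common subexpression elimination: (a + b) is computed once and reused."""
    shared = a + b
    return shared * c + shared


print(times_ten_multiply(7), times_ten_shift_add(7))
print(area_terms_naive(2, 3, 4), area_terms_cse(2, 3, 4))
```

Both rewrites preserve the result exactly; only the number and kind of operations change, which is why strength reduction is listed as machine-dependent.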
Effect of Compiler Optimization
Measurements taken on MIPS
[Figure: program and compiler optimization level]
Level 0: non-optimized code
Level 1: local optimization
Level 2: global optimization, s/w pipelining
Level 3: adds procedure integration
Compiler Support for Multimedia Instr.
Intel's MMX and PowerPC AltiVec have small vector processing capabilities targeting multimedia applications (to speed up graphics)
Intel added a new set of instructions called Streaming SIMD Extension
A major advantage of vector computers is hiding the latency of memory access by loading multiple elements and then overlapping execution with data transfer
Vector computers typically have strided and/or gather/scatter addressing to perform operations on distant memory locations
- Strided addressing allows memory access in increments larger than one
- Gather/scatter addressing is similar to register indirect mode, where the addresses are stored instead of the data
Supporting vector operations without strided addressing, such as Intel's MMX, limits the potential speedup
Such limited support for vector processing makes vectorizing compiler optimizations unpopular and restricts their scope to hand-coded routines
SIMD instructions on MMX and AltiVec tend to be solutions, not primitives
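The difference between strided and gather/scatter addressing can be sketched in Python as follows. This is only an illustration: the flat `memory` list and the helper names `strided_load`, `gather` and `scatter` are hypothetical and do not model any particular vector instruction set.

```python
memory = list(range(100, 132))  # hypothetical flat memory: memory[i] == 100 + i


def strided_load(mem, base, stride, count):
    """Load `count` elements starting at `base`, `stride` locations apart."""
    return [mem[base + i * stride] for i in range(count)]


def gather(mem, addresses):
    """Load from an explicit list of addresses (the addresses are the operand)."""
    return [mem[a] for a in addresses]


def scatter(mem, addresses, values):
    """Store each value at the matching address from the address list."""
    for a, v in zip(addresses, values):
        mem[a] = v


column = strided_load(memory, base=2, stride=8, count=4)  # e.g. one matrix column
picked = gather(memory, [5, 0, 17])
scatter(memory, [5, 0, 17], [-1, -2, -3])
```

Without a strided load, the column access above would need one scalar load per element, which is the speedup limitation the slide attributes to MMX.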
Starting a Program
[Figure: C program -> Compiler -> Assembly language program -> Assembler -> Object: machine language module, together with Object: library routine (machine language) -> Linker -> Executable: machine language program -> Loader -> Memory]
Object files for Unix typically contain:
- Header: size and position of components
- Text segment: machine code
- Data segment: static and dynamic variables
- Relocation info: identify absolute memory references
- Symbol table: name and location of labels, procedures and variables
- Debugging info: mapping source to object code, break points, etc.
Linker
- Place code and data modules symbolically in memory
- Determine the addresses of data and instruction labels
- Patch both internal and external references
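The three linker tasks (place modules, determine label addresses, patch references) can be sketched as a two-pass procedure. The object-module layout below (a dict with `code`, `symbols` and `relocs`) is a made-up format for illustration only and is not the Unix object file format; addresses count one unit per code entry.

```python
def link(modules, base):
    """Place modules back to back from `base` and patch symbolic references."""
    symbols = {}
    address = base
    # Pass 1: place each module and record the absolute address of its labels.
    for module in modules:
        for label, offset in module["symbols"].items():
            if label in symbols:
                raise ValueError(f"duplicate symbol: {label}")
            symbols[label] = address + offset
        address += len(module["code"])
    # Pass 2: patch every internal or external reference with its address.
    image = []
    for module in modules:
        code = list(module["code"])
        for offset, label in module["relocs"]:
            if label not in symbols:
                raise KeyError(f"unresolved reference: {label}")
            code[offset] = symbols[label]
        image.extend(code)
    return image, symbols


# Hypothetical modules: main calls `helper`, which lives in a library module.
main_obj = {"code": ["call", None, "halt"], "symbols": {"main": 0}, "relocs": [(1, "helper")]}
lib_obj = {"code": ["work", "ret"], "symbols": {"helper": 0}, "relocs": []}
image, symbols = link([main_obj, lib_obj], base=0x1000)
```

The relocation list plays the role of the "relocation info" section, and the `symbols` dict plays the role of the symbol table.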
Loading Executable Program
[Figure: memory layout. Reserved from address 0; Text from 0040 0000 hex (pc); Static data from 1000 0000 hex ($gp at 1000 8000 hex); Dynamic data above the static data; Stack growing down from 7fff ffff hex ($sp)]
To load an executable, the operating system follows these steps:
- Reads the executable file header to determine the size of the text and data segments
- Creates an address space large enough for the text and data
- Copies the instructions and data from the executable file into memory
- Copies the parameters (if any) to the main program onto the stack
- Initializes the machine registers and sets the stack pointer to the first free location
- Jumps to a start-up routine that copies the parameters into the argument registers and calls the main routine of the program
Instruction Set Design Issues
- Number of Addresses
- Flow of Control
- Operand Types
- Addressing Modes
- Instruction Types
- Instruction Formats
Number of Addresses
Four categories
- 3-address machines
  - 2 for the source operands and one for the result
- 2-address machines
  - One address doubles as source and result
- 1-address machines
  - Accumulator machines
  - Accumulator is used for one source and result
- 0-address machines
  - Stack machines
  - Operands are taken from the stack
  - Result goes onto the stack
Number of Addresses (cont.)
Three-address machines
- Two for the source operands, one for the result
- RISC processors use three addresses
- Sample instructions
    add  dest,src1,src2    M(dest) = [src1] + [src2]
    sub  dest,src1,src2    M(dest) = [src1] - [src2]
    mult dest,src1,src2    M(dest) = [src1] * [src2]
Three addresses: Operand 1, Operand 2, Result
Example: a = b + c
Three-address instruction formats are not common, because they require a relatively long instruction format to hold the three address references
Number of Addresses (cont.)
Example
- C statement
    A = B + C * D - E + F + A
- Equivalent code:
    mult T,C,D    T = C*D
    add  T,T,B    T = B + C*D
    sub  T,T,E    T = B + C*D - E
    add  T,T,F    T = B + C*D - E + F
    add  A,T,A    A = B + C*D - E + F + A
Number of Addresses (cont.)
Two-address machines
- One address doubles (for source operand and result)
- Last example makes a case for it
  - Address T is used twice
- Sample instructions
    load dest,src    M(dest) = [src]
    add  dest,src    M(dest) = [dest] + [src]
    sub  dest,src    M(dest) = [dest] - [src]
    mult dest,src    M(dest) = [dest] * [src]
Two addresses: one address doubles as operand and result
Example: a = a + b
The two-address format reduces the space requirement but also introduces some awkwardness. To avoid altering the value of an operand, a MOVE instruction is used to move one of the values to a result or temporary location before performing the operation
Number of Addresses (cont.)
Example
- C statement
    A = B + C * D - E + F + A
- Equivalent code:
    load T,C    T = C
    mult T,D    T = C*D
    add  T,B    T = B + C*D
    sub  T,E    T = B + C*D - E
    add  T,F    T = B + C*D - E + F
    add  A,T    A = B + C*D - E + F + A
Number of Addresses (cont.)
One-address machines
- Use special set of registers called accumulators
  - Specify one source operand and receive the result
- Called accumulator machines
- Sample instructions
    load  addr    accum = [addr]
    store addr    M[addr] = accum
    add   addr    accum = accum + [addr]
    sub   addr    accum = accum - [addr]
    mult  addr    accum = accum * [addr]
Number of Addresses (cont.)
Example
- C statement
    A = B + C * D - E + F + A
- Equivalent code:
    load  C    load C into accum
    mult  D    accum = C*D
    add   B    accum = C*D + B
    sub   E    accum = B + C*D - E
    add   F    accum = B + C*D - E + F
    add   A    accum = B + C*D - E + F + A
    store A    store accum contents in A
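The accumulator code above can be checked with a small interpreter. This is a minimal Python sketch, assuming the five sample instructions behave exactly as listed; the function name `run_accumulator` and the sample values in `memory` are made up for illustration.

```python
def run_accumulator(program, memory):
    """Execute one-address instructions against a single accumulator."""
    accum = 0
    for op, addr in program:
        if op == "load":
            accum = memory[addr]
        elif op == "store":
            memory[addr] = accum
        elif op == "add":
            accum += memory[addr]
        elif op == "sub":
            accum -= memory[addr]
        elif op == "mult":
            accum *= memory[addr]
        else:
            raise ValueError(f"unknown instruction: {op}")
    return memory


memory = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}  # hypothetical values
program = [("load", "C"), ("mult", "D"), ("add", "B"), ("sub", "E"),
           ("add", "F"), ("add", "A"), ("store", "A")]
run_accumulator(program, memory)
```

Every instruction except `load` and `store` touches memory for its second operand, which is the source of the high memory traffic attributed to accumulator machines.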
Number of Addresses (cont.)
Zero-address machines
- Stack supplies operands and receives the result
  - Special instructions to load and store use an address
- Called stack machines (Ex: HP3000, Burroughs B5500)
- Sample instructions
    push addr    push([addr])
    pop  addr    pop([addr])
    add          push(pop + pop)
    sub          push(pop - pop)
    mult         push(pop * pop)
Number of Addresses (cont.)
Example
- C statement
    A = B + C * D - E + F + A
- Equivalent code:
    push E
    push C
    push D
    mult
    push B
    add
    sub
    push F
    add
    push A
    add
    pop A
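The stack code above can be traced with a small interpreter. This is a minimal Python sketch, assuming `sub` means push(pop - pop) with the first pop taking the top of the stack, as in the sample instructions; the function name `run_stack` and the sample values are made up for illustration.

```python
def run_stack(program, memory):
    """Execute zero-address instructions; only push and pop name an address."""
    stack = []
    for instr in program:
        op = instr[0]
        if op == "push":
            stack.append(memory[instr[1]])
        elif op == "pop":
            memory[instr[1]] = stack.pop()
        elif op in ("add", "sub", "mult"):
            first = stack.pop()    # top of stack
            second = stack.pop()   # element below it
            if op == "add":
                stack.append(first + second)
            elif op == "sub":
                stack.append(first - second)
            else:
                stack.append(first * second)
        else:
            raise ValueError(f"unknown instruction: {op}")
    return memory


memory = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}  # hypothetical values
program = [("push", "E"), ("push", "C"), ("push", "D"), ("mult",),
           ("push", "B"), ("add",), ("sub",), ("push", "F"), ("add",),
           ("push", "A"), ("add",), ("pop", "A")]
run_stack(program, memory)
```

Pushing E first is what makes the later `sub` compute (B + C*D) - E rather than the reverse.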
Load/Store Architecture
- Instructions expect operands in internal processor registers
- Special LOAD and STORE instructions move data between registers and memory
- RISC uses this architecture
- Reduces instruction length
Load/Store Architecture (cont.)
Sample instructions
    load  Rd,addr       Rd = [addr]
    store addr,Rs       (addr) = Rs
    add   Rd,Rs1,Rs2    Rd = Rs1 + Rs2
    sub   Rd,Rs1,Rs2    Rd = Rs1 - Rs2
    mult  Rd,Rs1,Rs2    Rd = Rs1 * Rs2
Number of Addresses (cont.)
Example
- C statement
    A = B + C * D - E + F + A
- Equivalent code
    load  R1,B
    load  R2,C
    load  R3,D
    load  R4,E
    load  R5,F
    load  R6,A
    mult  R2,R2,R3
    add   R2,R2,R1
    sub   R2,R2,R4
    add   R2,R2,R5
    add   R2,R2,R6
    store A,R2
Flow of Control
Default is sequential flow
Several instructions alter this default execution
- Branches
  - Unconditional
  - Conditional
  - Delayed branches
- Procedure calls
  - Delayed procedure calls
Flow of Control (cont.)
Branches
- Unconditional
  - Absolute address
  - PC-relative
    - Target address is specified relative to PC contents
    - Relocatable code
- Example: MIPS
  - Absolute address
      j target
  - PC-relative
      b target
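The two target-address calculations can be sketched in Python. The sketch assumes the usual 32-bit MIPS encoding (a 16-bit signed word offset for branches, added to the address of the following instruction, and a 26-bit word index for `j`); the function names are hypothetical.

```python
def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's complement number."""
    sign_bit = 1 << (bits - 1)
    return (value & (sign_bit - 1)) - (value & sign_bit)


def branch_target(pc, offset16):
    """PC-relative: target = (PC + 4) + sign-extended offset * 4."""
    return (pc + 4 + (sign_extend(offset16, 16) << 2)) & 0xFFFFFFFF


def jump_target(pc, index26):
    """Absolute (within a 256 MB region): upper 4 bits of PC + 4, then index * 4."""
    return ((pc + 4) & 0xF0000000) | ((index26 & 0x03FFFFFF) << 2)


forward = branch_target(0x00400010, 3)        # three instructions past PC + 4
backward = branch_target(0x00400010, 0xFFFF)  # offset -1: back to the branch itself
absolute = jump_target(0x00400010, 0x0100008)
# Relocating the code by 0x1000 moves the branch target by the same amount,
# with no change to the encoded offset; this is why PC-relative code is relocatable.
moved = branch_target(0x00400010 + 0x1000, 3)
```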
Flow of Control (cont.)
Branches
- Conditional
  - Jump is taken only if the condition is met
- Two types
  - Set-Then-Jump
    - Condition testing is separated from branching
    - Condition code registers are used to convey the condition test result
    - Condition code registers keep a record of the status of the last ALU operation, such as overflow condition
  - Example: Pentium code
      cmp AX,BX     compare AX and BX
      je  target    jump if equal
Flow of Control (cont.)
  - Test-and-Jump
    - Single instruction performs condition testing and branching
  - Example: MIPS instruction
      beq Rsrc1,Rsrc2,target
    Jumps to target if Rsrc1 = Rsrc2
Delayed branching
- Control is transferred after executing the instruction that follows the branch instruction
  - This instruction slot is called the delay slot
- Improves efficiency
- Highly pipelined RISC processors support it
Flow of Control (cont.)
Procedure calls
- Facilitate modular programming
- Require two pieces of information to return
  - End of procedure
    - Pentium uses the ret instruction
    - MIPS uses the jr instruction
  - Return address
    - In a (special) register: MIPS allows any general-purpose register
    - On the stack: Pentium
Flow of Control (cont.)
[Figure: delay slot]
Parameter Passing
Two basic techniques
- Register-based (e.g., PowerPC, MIPS)
  - Internal registers are used
    - Faster
    - Limit the number of parameters
    - Recursive procedure
- Stack-based (e.g., Pentium)
  - Stack is used
    - More general
2. Operand Types
Instructions support basic data types
- Characters
- Integers
- Floating-point
Instruction overload
- Same instruction for different data types
- Example: Pentium
    mov AL,address     loads an 8-bit value
    mov AX,address     loads a 16-bit value
    mov EAX,address    loads a 32-bit value
2. Operand Types (cont.)
Separate instructions
- Instructions specify the operand size
- Example: MIPS
    lb Rdest,address    loads a byte
    lh Rdest,address    loads a halfword (16 bits)
    lw Rdest,address    loads a word (32 bits)
    ld Rdest,address    loads a doubleword (64 bits)
- Similar instructions for store
3. Addressing Modes
How the operands are specified
- Operands can be in three places
  - Registers
    - Register addressing mode
  - Part of instruction
    - Constant
    - Immediate addressing mode
    - All processors support these two addressing modes
  - Memory
    - Difference between RISC and CISC
    - CISC supports a large variety of addressing modes
    - RISC follows load/store architecture
4. Instruction Types
Several types of instructions
- Data movement
  - Pentium: mov dest,src
  - Some do not provide direct data movement instructions
  - Indirect data movement
      add Rdest,Rsrc,0    Rdest = Rsrc + 0
- Arithmetic and Logical
  - Arithmetic
    - Integer and floating-point, signed and unsigned
    - add, subtract, multiply, divide
  - Logical
    - and, or, not, xor
4. Instruction Types (cont.)
Condition code bits
- S: Sign bit (0 = +, 1 = -)
- Z: Zero bit (0 = nonzero, 1 = zero)
- O: Overflow bit (0 = no overflow, 1 = overflow)
- C: Carry bit (0 = no carry, 1 = carry)
Example: Pentium
    cmp count,25    compare count to 25 (subtract 25 from count)
    je  target      jump if equal
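How a compare sets the four condition code bits can be sketched in Python. The sketch assumes the compare is a subtraction whose result is discarded and that, as on x86, the carry bit reports a borrow; the function name `compare` and the dictionary layout are hypothetical.

```python
def compare(a, b, bits=32):
    """Return the S, Z, O, C bits produced by computing a - b in `bits` bits."""
    mask = (1 << bits) - 1
    sign = 1 << (bits - 1)
    a &= mask
    b &= mask
    result = (a - b) & mask
    return {
        "S": int(bool(result & sign)),   # result is negative
        "Z": int(result == 0),           # operands are equal
        # signed overflow: operands differ in sign and the result's sign differs from a's
        "O": int(bool((a ^ b) & (a ^ result) & sign)),
        "C": int(a < b),                 # unsigned borrow
    }


equal = compare(25, 25)
smaller = compare(10, 25)
overflow = compare(0x80000000, 1)   # most negative 32-bit value minus one
jump_taken = equal["Z"] == 1        # "jump if equal" tests the Z bit
```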
4. Instruction Types (cont.)
Flow control and I/O instructions
- Branch
- Procedure call
- Interrupts
I/O instructions
- Memory-mapped I/O
  - Most processors support memory-mapped I/O
  - No separate instructions for I/O
- Isolated I/O
  - Pentium supports isolated I/O
  - Separate I/O instructions
      in  AX,port    read from an I/O port
      out port,AX    write to an I/O port
5. Instruction Formats
Two types
- Fixed-length
  - Used by RISC processors
  - 32-bit RISC processors use 32-bit wide instructions
    - Examples: SPARC, MIPS, PowerPC
- Variable-length
  - Used by CISC processors
  - Memory operands need more bits to specify
Opcode
- Major and exact operation
Examples of Instruction Formats
[Figure: examples of instruction formats]
RISC (Reduced Instruction Set Computer) versus CISC (Complex Instruction Set Computer)
RISC vs. CISC
The underlying philosophy of RISC machines is that a system is better able to manage program execution when the program consists of only a few different instructions that are the same length and require the same number of clock cycles to decode and execute
RISC systems access memory only with explicit load and store instructions
In CISC systems, many different kinds of instructions access memory, making instruction length variable and fetch-decode-execute time unpredictable
RISC vs. CISC
The difference between CISC and RISC becomes evident through the basic computer performance equation:
time/program = time/cycle x cycles/instruction x instructions/program
RISC systems shorten execution time by reducing the clock cycles per instruction
CISC systems improve performance by reducing the number of instructions per program
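The trade-off can be made concrete with the standard performance equation, execution time = instruction count x cycles per instruction x clock cycle time. The Python sketch below uses made-up numbers purely for illustration; they are not measurements of any real machine.

```python
def execution_time_ns(instructions, cycles_per_instruction, ns_per_cycle):
    """time/program = instructions/program * cycles/instruction * time/cycle."""
    return instructions * cycles_per_instruction * ns_per_cycle


# Hypothetical program: the CISC version needs fewer instructions,
# the RISC version needs fewer cycles per instruction.
cisc_time = execution_time_ns(instructions=1_000_000, cycles_per_instruction=4, ns_per_cycle=1)
risc_time = execution_time_ns(instructions=1_500_000, cycles_per_instruction=2, ns_per_cycle=1)
```

With these assumed figures the RISC version wins despite executing 50% more instructions; different assumptions can reverse the outcome, which is why all three factors must be considered together.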
RISC vs. CISC
The simple instruction set of RISC machines enables control units to be hardwired for maximum speed
The more complex, and variable, instruction set of CISC machines requires microcode-based control units that interpret instructions as they are fetched from memory. This translation takes time
With fixed-length instructions, RISC lends itself to pipelining and speculative execution
RISC vs. CISC
Consider the two program fragments:
CISC
    mov ax, 10
    mov bx, 5
    mul bx, ax
RISC
    mov ax, 0
    mov bx, 10
    mov cx, 5
    Begin: add ax, bx
    loop Begin
The total clock cycles for the CISC version might be:
(2 movs x 1 cycle) + (1 mul x 30 cycles) = 32 cycles
While the clock cycles for the RISC version is:
(3 movs x 1 cycle) + (5 adds x 1 cycle) + (5 loops x 1 cycle) = 13 cycles
With the RISC clock cycle being shorter, RISC gives us much faster execution speeds
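The cycle comparison can be reproduced with a short Python sketch. It assumes a multiplication of 10 by 5, a cost of 1 cycle each for mov, add and loop, and 30 cycles for mul; these costs and the function names are assumptions for illustration, not properties of a specific processor.

```python
CYCLES = {"mov": 1, "add": 1, "loop": 1, "mul": 30}  # assumed instruction costs


def cisc_multiply(a, b):
    """Two movs to load the operands, then one multi-cycle mul."""
    cycles = 2 * CYCLES["mov"] + CYCLES["mul"]
    return a * b, cycles


def risc_multiply(a, b):
    """Three movs, then repeated addition: one add and one loop per iteration."""
    total, counter = 0, b
    cycles = 3 * CYCLES["mov"]
    while counter > 0:
        total += a
        cycles += CYCLES["add"]
        counter -= 1
        cycles += CYCLES["loop"]
    return total, cycles


cisc_result, cisc_cycles = cisc_multiply(10, 5)
risc_result, risc_cycles = risc_multiply(10, 5)
```

The RISC cycle count grows with the loop count, so the advantage shown here depends on the small multiplier chosen for the example.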
RISC vs. CISC
Because of their load-store ISAs, RISC architectures require a large number of CPU registers
These registers provide fast access to data during sequential program execution
They can also be employed to reduce the overhead typically caused by passing parameters to subprograms
Instead of pulling parameters off of a stack, the subprogram is directed to use a subset of registers
RISC vs. CISC
This is how registers can be overlapped in a RISC system
The current window pointer (CWP) points to the active register window
[Figure: overlapped register windows]
RISC vs. CISC
It is becoming increasingly difficult to distinguish RISC architectures from CISC architectures
Some RISC systems provide more extravagant instruction sets than some CISC systems
Some systems combine both approaches
The following two slides summarize the characteristics that traditionally typify the differences between these two architectures
RISC vs. CISC
RISC
- Multiple register sets
- Three operands per instruction
- Parameter passing through register windows
- Single-cycle instructions
- Hardwired control
- Highly pipelined
CISC
- Single register set
- One or two register operands per instruction
- Parameter passing through memory
- Multiple cycle instructions
- Microprogrammed control
- Less pipelined
Continued
RISC vs. CISC
RISC
- Simple instructions, few in number
- Fixed length instructions
- Complexity in compiler
- Only LOAD/STORE instructions access memory
- Few addressing modes
CISC
- Many complex instructions
- Variable length instructions
- Complexity in microcode
- Many instructions can access memory
- Many addressing modes
Summary
Instruction Set Design Issues
Instruction set design issues include:
- Where are operands stored?
  - registers, memory, stack, accumulator
- How many explicit operands are there?
  - 0, 1, 2, or 3
- How is the operand location specified?
  - register, immediate, indirect, ...
- What type and size of operands are supported?
  - byte, int, float, double, string, vector, ...
- What operations are supported?
  - add, sub, mul, move, compare, ...
More About General Purpose Registers
Why do almost all new architectures use GPRs?
- Registers are much faster than memory (even cache)
  - Register values are available immediately
  - When memory isn't ready, processor must wait ("stall")
- Registers are convenient for variable storage
  - Compiler assigns some variables just to registers
  - More compact code since small fields specify registers (compared to memory addresses)
[Figure: Processor (Registers), Cache, Memory, Disk]
What Operations are Needed
Arithmetic & Logical
- Integer arithmetic: ADD, SUB, MULT, DIV, SHIFT
- Logical operation: AND, OR, XOR, NOT
Data Transfer - copy, load, store
Control - branch, jump, call, return
Floating Point - ADDF, MULF, DIVF (same as arithmetic but usually take bigger operands)
Decimal - ADDD, CONVERT
String - move, compare, search
Graphics - pixel and vertex, compression/decompression operations
Stack Architecture Pros and Cons
Pros
- Good code density (implicit top of stack)
- Low hardware requirements
- Easy to write a simpler compiler for stack architectures
Cons
- Stack becomes the bottleneck
- Little ability for parallelism or pipelining
- Data is not always at the top of the stack when needed, so additional instructions like TOP and SWAP are needed
- Difficult to write an optimizing compiler for stack architectures
Accumulator Architecture Pros and Cons
Pros
- Very low hardware requirements
- Easy to design and understand
Cons
- Accumulator becomes the bottleneck
- Little ability for parallelism or pipelining
- High memory traffic
Memory-Memory Architecture Pros and Cons
Pros
- Requires fewer instructions (especially if 3 operands)
- Easy to write compilers for (especially if 3 operands)
Cons
- Very high memory traffic (especially if 3 operands)
- Variable number of clocks per instruction
- With two operands, more data movements are required
Memory-Register Architecture Pros and Cons
Pros
- Some data can be accessed without loading first
- Instruction format easy to encode
- Good code density
Cons
- Operands are not equivalent (poor orthogonality)
- Variable number of clocks per instruction
- May limit number of registers
Addressing Modes
Addressing modes refer to how to specify the location of an operand (effective address)
Addressing modes have the ability to:
- Significantly reduce instruction counts
- Increase the average CPI
- Increase the complexity of building a machine
The VAX machine is used for benchmark data since it supports a wide range of memory addressing modes
Famous addressing modes can be classified based on:
- the source of the data, into register, immediate or memory
- the address calculation, into direct and indirect
An indexed addressing mode is usually provided to allow efficient implementation of loops and array access
Example of Addressing Modes
Each entry lists the addressing mode, an example instruction, its meaning, and when it is used.
Register
  Example: Add R4, R3
  Meaning: Regs[R4] <- Regs[R4] + Regs[R3]
  When used: when a value is in a register
Immediate
  Example: Add R4, #3
  Meaning: Regs[R4] <- Regs[R4] + 3
  When used: for constants
Displacement
  Example: Add R4, 100(R1)
  Meaning: Regs[R4] <- Regs[R4] + Mem[100 + Regs[R1]]
  When used: accessing local variables
Register indirect
  Example: Add R4, (R1)
  Meaning: Regs[R4] <- Regs[R4] + Mem[Regs[R1]]
  When used: accessing using a pointer or a computed address
Indexed
  Example: Add R4, (R1 + R2)
  Meaning: Regs[R4] <- Regs[R4] + Mem[Regs[R1] + Regs[R2]]
  When used: sometimes useful in array addressing: R1 = base of the array; R2 = index amount
Direct or absolute
  Example: Add R4, (1001)
  Meaning: Regs[R4] <- Regs[R4] + Mem[1001]
  When used: sometimes useful for accessing static data; address constant may need to be large
Memory indirect or memory deferred
  Example: Add R4, @(R3)
  Meaning: Regs[R4] <- Regs[R4] + Mem[Mem[Regs[R3]]]
  When used: if R3 is the address of the pointer p, then the mode yields *p
Autoincrement
  Example: Add R4, (R2)+
  Meaning: Regs[R4] <- Regs[R4] + Mem[Regs[R2]]; Regs[R2] <- Regs[R2] + d
  When used: useful for stepping through arrays within a loop. R2 points to the start of the array; each reference increments R2 by d
Autodecrement
  Example: Add R4, -(R2)
  Meaning: Regs[R2] <- Regs[R2] - d; Regs[R4] <- Regs[R4] + Mem[Regs[R2]]
  When used: same use as autoincrement. Autodecrement/increment can also act as push/pop to implement a stack
Scaled
  Example: Add R4, 100(R2)[R3]
  Meaning: Regs[R4] <- Regs[R4] + Mem[100 + Regs[R2] + Regs[R3] x d]
  When used: used to index arrays
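The effective-address calculations for the memory addressing modes can be sketched in Python. This is a simplified model: the function name `effective_address`, its keyword parameters, and the register and memory contents are hypothetical, and the register, immediate and auto-increment/decrement modes are left out because they do not compute a plain memory address.

```python
def effective_address(mode, regs, mem, *, reg=None, index=None, disp=0, d=1):
    """Return the memory address an operand refers to under the given mode."""
    if mode == "displacement":        # disp(reg)
        return disp + regs[reg]
    if mode == "register_indirect":   # (reg)
        return regs[reg]
    if mode == "indexed":             # (reg + index)
        return regs[reg] + regs[index]
    if mode == "direct":              # (disp)
        return disp
    if mode == "memory_indirect":     # @(reg): the address is itself read from memory
        return mem[regs[reg]]
    if mode == "scaled":              # disp(reg)[index], scaled by operand size d
        return disp + regs[reg] + regs[index] * d
    raise ValueError(f"unsupported mode: {mode}")


regs = {"R1": 8, "R2": 2, "R3": 4}   # hypothetical register contents
mem = {4: 12}                        # hypothetical memory: Mem[4] holds a pointer

addresses = {
    "displacement": effective_address("displacement", regs, mem, reg="R1", disp=100),
    "register_indirect": effective_address("register_indirect", regs, mem, reg="R1"),
    "indexed": effective_address("indexed", regs, mem, reg="R1", index="R2"),
    "direct": effective_address("direct", regs, mem, disp=1001),
    "memory_indirect": effective_address("memory_indirect", regs, mem, reg="R3"),
    "scaled": effective_address("scaled", regs, mem, reg="R2", index="R3", disp=100, d=4),
}
```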
Addressing Mode for Signal Processing
DSP offers special addressing modes to better serve popular algorithms
Modulo addressing
- Since DSP deals with continuous data streams, circular buffers are widely used
- Circular or modulo addressing allows automatic increment and decrement and resets the pointer when reaching the end of the buffer
Reverse addressing
- Resulting address is the bit-reversed order of the current address
- Reverse addressing mode expedites the access, which otherwise requires a number of logical instructions or extra memory accesses
Fast Fourier Transform
    0 (000) -> 0 (000)
    1 (001) -> 4 (100)
    2 (010) -> 2 (010)
    3 (011) -> 6 (110)
    4 (100) -> 1 (001)
    5 (101) -> 5 (101)
    6 (110) -> 3 (011)
    7 (111) -> 7 (111)
Special features require either hand coding or a compiler that uses such features (GNU would not be a good choice)
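Both DSP addressing modes are easy to express in software, which shows what the hardware does in a single step. The following is a minimal Python sketch; the function names `modulo_increment` and `bit_reverse` and the buffer placement are hypothetical.

```python
def modulo_increment(pointer, step, start, length):
    """Advance a circular-buffer pointer, wrapping inside [start, start + length)."""
    return start + (pointer - start + step) % length


def bit_reverse(index, bits):
    """Reverse the low `bits` bits of index (the FFT reordering pattern)."""
    result = 0
    for _ in range(bits):
        result = (result << 1) | (index & 1)
        index >>= 1
    return result


# An 8-entry circular buffer assumed to start at address 0x100.
wrapped_forward = modulo_increment(0x107, 1, start=0x100, length=8)
wrapped_backward = modulo_increment(0x100, -1, start=0x100, length=8)
# The access order for an 8-point FFT.
fft_order = [bit_reverse(i, 3) for i in range(8)]
```

The loop in `bit_reverse` is the "number of logical instructions" that a dedicated reverse addressing mode avoids.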
Operations of the Computer Hardware
"There must certainly be instructions for performing the fundamental arithmetic operations."
Burks, Goldstine and von Neumann, 1947
Assembly language is a symbolic representation of what the processor actually understands
MIPS assembler allows only one instruction per line and ignores comments following # until end of line
Example:
Translation of a segment of a C program to MIPS assembly instructions
C:
    f = (g + h) - (i + j)
MIPS:
    add t0, g, h     # temp variable t0 contains "g + h"
    add t1, i, j     # temp variable t1 contains "i + j"
    sub f, t0, t1    # f = t0 - t1 = (g + h) - (i + j)
Operations in the Instruction Set
Operator type: Examples
- Arithmetic and logical: integer arithmetic and logical operations: add, and, subtract, or
- Data transfer: loads-stores (move instructions on machines with memory addressing)
- Control: branch, jump, procedure call and return, trap
- System: operating system call, virtual memory management instructions
- Floating point: floating point instructions: add, multiply
- Decimal: decimal add, decimal multiply, decimal to character conversion
- String: string move, string compare, string search
- Graphics: pixel operations, compression/decompression operations
Arithmetic, logical, data transfer and control are almost standard categories for all machines
System instructions are required for multi-programming environments, although support for system functions varies
Decimal and string instructions can be primitives, e.g. IBM 360 and the VAX
Support for floating point, decimal, string and graphics can be optional, sometimes provided via a co-processor
Some machines rely on the compiler to synthesize special operations, such as string handling, from simpler instructions
Operations for Media & Signal Process.
Single instruction multiple data (SIMD) and vector instructions are often supported in DSPs, which are commonly used in multimedia and signal processing applications
Partitioned Add (integer)
- Perform multiple 16-bit additions on a 64-bit ALU, since most data are narrow
- Increases ALU throughput for multimedia applications
Paired single operations (float)
- Allow the same register to act as two operands to the same operation
- Handy in dealing with vertices and coordinates
Multiply and accumulate
- Very handy for calculating dot products of vectors (signal processing) and matrix multiplication
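A partitioned add and a multiply-accumulate can be sketched in Python. The sketch assumes four 16-bit lanes packed into a 64-bit word with wraparound in each lane (real media instruction sets often also offer saturating variants); the function names are hypothetical.

```python
LANES = 4
LANE_BITS = 16
LANE_MASK = (1 << LANE_BITS) - 1


def partitioned_add(a, b):
    """Add four independent 16-bit lanes packed into two 64-bit words."""
    result = 0
    for lane in range(LANES):
        shift = lane * LANE_BITS
        lane_sum = ((a >> shift) & LANE_MASK) + ((b >> shift) & LANE_MASK)
        # The carry out of a lane is discarded so it cannot spill into the next lane.
        result |= (lane_sum & LANE_MASK) << shift
    return result


def multiply_accumulate(xs, ys):
    """Dot product as a chain of multiply-accumulate steps: acc = acc + x * y."""
    acc = 0
    for x, y in zip(xs, ys):
        acc += x * y
    return acc


packed = partitioned_add(0x0001_FFFF_0003_0004, 0x0001_0001_0001_0001)
dot = multiply_accumulate([1, 2, 3], [4, 5, 6])
```

In hardware the four lane additions happen in the same cycle by breaking the carry chain of the 64-bit adder at lane boundaries; the loop here only models the result.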
Frequency of Operations Usage
Make the common case fast by focusing on these operations
The most widely executed instructions are the simple operations of an instruction set
The following is the average usage in SPECint on Intel 80x86
Rank  80x86 instruction         Integer average (% total executed)
1     Load                      22%
2     Conditional branch        20%
3     Compare                   16%
4     Store                     12%
5     Add                       8%
6     And                       6%
7     Sub                       5%
8     Move register-register    4%
9     Call                      1%
10    Return                    1%
      Total                     96%
Control Flow Instructions
Jump: for unconditional change in the control flow
Branch: for conditional change in the control flow
Procedure calls and returns
Data is based on SPEC on Alpha
Destination Address Definition
Relative addressing w.r.t. the program counter proved to be the best choice for forward and backward branching or jumps (load address independent)
To allow for dynamic loading of library routines, register indirect addressing allows addresses to be loaded in special registers (e.g. virtual functions in C++ and system calls in a case statement)
Data is based on SPEC on Alpha
Condition Evaluation
Compare-and-branch can be efficient if the majority of conditions are comparisons with zero
Remember to focus on the common case
Based on SPEC on MIPS
Frequency of Types of Comparison
Data is based on SPEC on Alpha
Different benchmark and machine set new design priority
DSPs support a repeat instruction for "for" loops (vectors) using registers
Supporting Procedures
Execution of a procedure follows these steps:
- Store parameters in a place accessible to the procedure
- Transfer control to the procedure
- Acquire the storage resources needed for the procedure
- Perform the desired task
- Store the result value in a place accessible to the calling program
- Return control to the point of origin
The hardware provides a program counter to trace instruction flow and manage transfer of control
Parameter Passing
- Registers can be used for passing a small number of parameters
- A stack is used to spill registers of the current context and make room for the called procedure to run, and to allow for large parameters to be passed
- Storage of machine state can be performed by caller or callee
- Handling of shared variables is important to ensure correct semantics and thus requires clear specifications in the library interface
Global variables stored in registers need careful handling
Type and Size of Operands
- The type of an operand is designated by encoding it in the instruction's operation code
- The type of an operand, e.g. single-precision float, effectively gives its size
- Common operand types include character, half-word and word-size integer, and single- and double-precision floating point
- Characters are almost always in ASCII, integers in 2's complement, and floating point in IEEE 754
- The 16-bit Unicode used in Java is gaining popularity due to its support for international character sets
- For business applications, some architectures support a decimal format in binary coded decimal (BCD)
- Depending on the size of the word, the complexity of handling different operand types differs
- DSPs offer fixed-point data types to support high-precision floating-point arithmetic and to allow sharing a single exponent among multiple numbers
- For graphics applications, vertex and pixel operands are added features
Size of Operands
- The double-word data type is used for double-precision floating-point operations and for address storage in machines with a 64-bit wide address bus
- Words are used for integer operations and for 32-bit address bus machines
- Because of the mix in SPEC, word and double-word data types dominate
[Figure: frequency of reference by size, based on SPEC2000 on Alpha]
Instruction Representation
- Humans are taught to think in base 10 (decimal), but numbers may be represented in any base (the same value can be written in base 10 or in binary, base 2)
- Numbers are stored in computers as a series of high and low electronic signals (binary numbers)
- Binary digits are called bits and are considered the atom of computing
- Each piece of an instruction is a number, and placing these numbers together forms the instruction
- Assemblers translate the symbolic assembly instructions into machine language instructions (machine code)
Example:
Assembly: add $t0, $s1, $s2
M/C language (decimal):
    op      rs      rt      rd      shamt   funct
    0       17      18      8       0       32
M/C language (binary):
    000000  10001   10010   01000   00000   100000
    6 bits  5 bits  5 bits  5 bits  5 bits  6 bits
Note: the MIPS compiler by default maps $s0-$s7 to registers 16-23 and $t0-$t7 to registers 8-15
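The field packing described above can be sketched in a few lines of Python. This is only an illustrative sketch: the helper name `encode_r_format` is hypothetical, and the field values assume the standard MIPS R-format layout (6/5/5/5/5/6 bits) and register numbering ($s1 = 17, $s2 = 18, $t0 = 8, funct 32 for add).

```python
def encode_r_format(op, rs, rt, rd, shamt, funct):
    """Pack the six R-format fields (6, 5, 5, 5, 5, 6 bits) into one 32-bit word."""
    fields = [(op, 6), (rs, 5), (rt, 5), (rd, 5), (shamt, 5), (funct, 6)]
    word = 0
    for value, width in fields:
        if not 0 <= value < (1 << width):
            raise ValueError(f"field value {value} does not fit in {width} bits")
        # Shift what we have so far to the left and append the next field.
        word = (word << width) | value
    return word


# add $t0, $s1, $s2: op=0, rs=$s1 (17), rt=$s2 (18), rd=$t0 (8), shamt=0, funct=32
word = encode_r_format(0, 17, 18, 8, 0, 32)
print(f"{word:032b}")  # the instruction as a 32-bit binary string
print(hex(word))       # the same word in hexadecimal
```

Each field is just a small number; concatenating them in order yields the machine instruction.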
Encoding an Instruction Set
- Instruction encoding affects the size of the compiled program and the complexity of the CPU implementation
- The operation is typically specified in one field, called the opcode
- The addressing mode for the operand can be encoded with the operation, or specified through a separate identifier when a large number of modes is supported
- The architecture must balance several competing factors:
  - Desire to support as many registers and addressing modes as possible
  - Effect of operand specification on the size of the instruction (program)
  - Desire to simplify instruction fetching and decoding during execution
- Fixed-size instruction encoding simplifies the CPU design while limiting the addressing modes supported
- An architect caring about code size can use variable-size encoding
- A hybrid approach is to allow variability by supporting multiple instruction sizes
Encoding Examples
MIPS Instruction Format
Register-format (R-type) instructions
    op      rs      rt      rd      shamt   funct
    6 bits  5 bits  5 bits  5 bits  5 bits  6 bits
- op: Basic operation of the instruction, traditionally called the opcode
- rs: The first register source operand
- rt: The second register source operand
- rd: The register destination operand; it gets the result of the operation
- shamt: Shift amount
- funct: Function; this field selects the specific variant of the operation in the op field
Immediate-type (I-type) instructions
    op      rs      rt      address
    6 bits  5 bits  5 bits  16 bits
- Some instructions need longer fields than those provided above, for large-value constants
- The 16-bit address means a load word instruction can load a word within a region of ±2^15 bytes of the address in the base register
- Example: lw $t0, 32($s3)    # Temporary register $t0 gets A[8]

Instruction  Format  op   rs   rt   rd    shamt  funct  address
add          R       0    reg  reg  reg   0      32     n.a.
sub          R       0    reg  reg  reg   0      34     n.a.
lw           I       35   reg  reg  n.a.  n.a.   n.a.   address
sw           I       43   reg  reg  n.a.  n.a.   n.a.   address
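To make the 16-bit address field and its ±2^15 reach concrete, here is a minimal Python sketch of I-format encoding. The helper names are hypothetical; the values assume standard MIPS conventions (opcode 35 for load word, $s3 = register 19, $t0 = register 8) and a load of the form lw $t0, 32($s3).

```python
def encode_i_format(op, rs, rt, offset):
    """Pack an I-format instruction: op (6 bits), rs (5), rt (5), signed 16-bit offset."""
    if not -(1 << 15) <= offset < (1 << 15):
        raise ValueError("offset must fit in a signed 16-bit field")
    # Negative offsets are stored in two's complement in the low 16 bits.
    return (op << 26) | (rs << 21) | (rt << 16) | (offset & 0xFFFF)


def decode_offset(word):
    """Recover the signed offset by sign-extending the low 16 bits."""
    raw = word & 0xFFFF
    return raw - (1 << 16) if raw & 0x8000 else raw


lw_word = encode_i_format(35, 19, 8, 32)  # lw $t0, 32($s3)
print(hex(lw_word))
print(decode_offset(encode_i_format(35, 19, 8, -4)))  # a negative displacement survives the round trip
```

An offset outside the signed 16-bit range is rejected, which is exactly the limit that forces larger constants to be built with extra instructions.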
The Stored Program Concept
- Learning how instructions are represented leads to discovering the secret of computing: the stored-program concept
- Today's computers are built on two key principles:
  - Instructions are represented as numbers
  - Programs can be stored in memory to be read or written just like numbers
- The power of the concept: memory can contain
  - the source code for an editor
  - the compiled machine code for the editor
  - the text that the compiled program is using
  - the compiler that generated the code
[Figure: a processor connected to memory holding an accounting program (machine code), an editor program (machine code), a C compiler (machine code), payroll data, book text, and source code in C for the editor program]
Compiling if-then-else in MIPS
Assuming the five variables f, g, h, i, and j correspond to the five registers $s0 through $s4, what is the compiled MIPS code for the following C if statement?
    if (i == j) f = g + h; else f = g - h;
[Figure: flowchart. Test i == j; if i = j execute f = g + h; if i ≠ j go to Else: f = g - h; both paths join at Exit]
MIPS:
          bne $s3, $s4, Else    # go to Else if i ≠ j
          add $s0, $s1, $s2     # f = g + h (skipped if i ≠ j)
          j   Exit              # go to Exit
    Else: sub $s0, $s1, $s2     # f = g - h (skipped if i = j)
    Exit:
Typical Compilation
Major Types of Optimization
(Optimization name: explanation; frequency where given)
High-level (at or near source level; machine-independent)
- Procedure integration: Replace procedure call by procedure body (N.M.)
Local (within straight-line code)
- Common subexpression elimination: Replace two instances of the same computation by a single copy
- Constant propagation: Replace all instances of a variable that is assigned a constant with the constant
- Stack height reduction: Rearrange expression tree to minimize resources needed for expression evaluation (N.M.)
Global (across a branch)
- Global common subexpression elimination: Same as local, but this version crosses branches
- Copy propagation: Replace all instances of a variable A that has been assigned X (i.e., A = X) with X
- Code motion: Remove code from a loop that computes the same value each iteration of the loop
- Induction variable elimination: Simplify/eliminate array addressing calculations within loops
Machine-dependent (depends on machine knowledge)
- Strength reduction: Many examples, such as replacing multiply by a constant with adds and shifts (N.M.)
- Pipeline scheduling: Reorder instructions to improve pipeline performance (N.M.)
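Two of the optimizations listed above are easy to show by hand. The following Python sketch is illustrative only (the function names are hypothetical, and a real compiler applies these transformations to its intermediate code, not to source): it shows strength reduction of a multiply by the constant 10 into shifts and adds, and local common subexpression elimination.

```python
def times_ten(x):
    """Strength-reduced x * 10: (x << 3) + (x << 1), i.e. 8x + 2x."""
    return (x << 3) + (x << 1)


def sum_and_difference(b, c, d):
    """Unoptimized form would compute b * c twice: (b * c + d, b * c - d)."""
    t = b * c  # common subexpression computed once and reused
    return (t + d, t - d)


print(times_ten(7))
print(sum_and_difference(2, 3, 4))
```

Both rewrites preserve the result while replacing a costlier operation (a multiply, or a repeated computation) with cheaper ones.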
Effect of Compiler Optimization
Measurements taken on MIPS
[Figure: effect of compiler optimization, by program and compiler optimization level]
- Level 0: non-optimized code
- Level 1: local optimization
- Level 2: global optimization, s/w pipelining
- Level 3: adds procedure integration
Compiler Support for Multimedia Instr.
- Intel's MMX and PowerPC AltiVec have small vector processing capabilities targeting multimedia applications (to speed up graphics)
- Intel added a new set of instructions called Streaming SIMD Extension
- A major advantage of vector computers is hiding the latency of memory access by loading multiple elements and then overlapping execution with data transfer
- Vector computers typically have strided and/or gather/scatter addressing to perform operations on distant memory locations
  - Strided addressing allows memory access in increments larger than one
  - Gather/scatter addressing is similar to register indirect mode, where the addresses are stored instead of the data
- Supporting vector operations without strided addressing, as in Intel's MMX, limits the potential speedup
- Such limited support for vector processing makes the use of vectorizing compiler optimization unpopular and restricts its scope to hand-coded routines
- SIMD instructions on MMX and AltiVec tend to be solutions, not primitives
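The difference between strided and gather/scatter addressing can be sketched with a Python list standing in for memory. This is a conceptual model only; the function names are hypothetical and no real vector instruction set is being emulated.

```python
def strided_load(memory, base, stride, count):
    """Load `count` elements starting at `base`, stepping `stride` locations each time."""
    return [memory[base + i * stride] for i in range(count)]


def gather(memory, addresses):
    """Load from an explicit list of addresses (the addresses are stored, not the data)."""
    return [memory[a] for a in addresses]


def scatter(memory, addresses, values):
    """Store each value to its corresponding address."""
    for a, v in zip(addresses, values):
        memory[a] = v


memory = list(range(100, 120))            # memory[k] holds 100 + k
print(strided_load(memory, 1, 4, 3))      # every 4th element starting at index 1
print(gather(memory, [7, 2, 15]))         # arbitrary, non-contiguous locations
scatter(memory, [0, 19], [-1, -2])
print(memory[0], memory[19])
```

A stride of 4 is what a column walk through a row-major matrix with 4 columns would need; gather/scatter covers access patterns with no fixed stride at all.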
Starting a Program
[Figure: C program -> Compiler -> Assembly language program -> Assembler -> Object: machine language module (plus Object: library routine, machine language) -> Linker -> Executable: machine language program -> Loader -> Memory]
Linker:
- Place code & data modules symbolically in memory
- Determine the address of data & instruction labels
- Patch both internal & external references
Object files for Unix typically contain:
- Header: size & position of components
- Text segment: machine code
- Data segment: static and dynamic variables
- Relocation info: identify absolute memory references
- Symbol table: name & location of labels, procedures and variables
- Debugging info: mapping source to object code, break points, etc.
Loading Executable Program
[Figure: MIPS memory layout, from high to low addresses: Stack ($sp = 7fff ffff hex), Dynamic data, Static data (from 1000 0000 hex; $gp = 1000 8000 hex), Text (from pc = 0040 0000 hex), Reserved (from 0)]
To load an executable, the operating system follows these steps:
- Reads the executable file header to determine the size of the text and data segments
- Creates an address space large enough for the text and data
- Copies the instructions and data from the executable file into memory
- Copies the parameters (if any) to the main program onto the stack
- Initializes the machine registers and sets the stack pointer to the first free location
- Jumps to a start-up routine that copies the parameters into the argument registers and calls the main routine of the program
Instruction Set Design Issues
- Number of Addresses
- Flow of Control
- Operand Types
- Addressing Modes
- Instruction Types
- Instruction Formats
Number of Addresses
Four categories
- 3-address machines
  - 2 for the source operands and one for the result
- 2-address machines
  - One address doubles as source and result
- 1-address machines
  - Accumulator machines
  - Accumulator is used for one source and the result
- 0-address machines
  - Stack machines
  - Operands are taken from the stack
  - Result goes onto the stack
Number of Addresses (cont.)
Three-address machines
- Two for the source operands, one for the result
- RISC processors use three addresses
- Sample instructions
    add  dest,src1,src2    ; M(dest) = [src1] + [src2]
    sub  dest,src1,src2    ; M(dest) = [src1] - [src2]
    mult dest,src1,src2    ; M(dest) = [src1] * [src2]
Three addresses: Operand 1, Operand 2, Result
Example: a = b + c
Three-address instruction formats are not common, because they require a relatively long instruction format to hold the three address references
Number of Addresses (cont.)
Example
- C statement
    A = B + C * D - E + F + A
- Equivalent code:
    mult T,C,D    ; T = C*D
    add  T,T,B    ; T = B+C*D
    sub  T,T,E    ; T = B+C*D-E
    add  T,T,F    ; T = B+C*D-E+F
    add  A,T,A    ; A = B+C*D-E+F+A
Number of Addresses (cont.)
Two-address machines
- One address doubles (for source operand & result)
- Last example makes a case for it
  - Address T is used twice
- Sample instructions
    load dest,src    ; M(dest) = [src]
    add  dest,src    ; M(dest) = [dest] + [src]
    sub  dest,src    ; M(dest) = [dest] - [src]
    mult dest,src    ; M(dest) = [dest] * [src]
Two addresses: one address doubles as operand and result
Example: a = a + b
The two-address format reduces the space requirement but also introduces some awkwardness. To avoid altering the value of an operand, a MOVE instruction is used to move one of the values to a result or temporary location before performing the operation
Number of Addresses (cont.)
Example
- C statement
    A = B + C * D - E + F + A
- Equivalent code:
    load T,C    ; T = C
    mult T,D    ; T = C*D
    add  T,B    ; T = B+C*D
    sub  T,E    ; T = B+C*D-E
    add  T,F    ; T = B+C*D-E+F
    add  A,T    ; A = B+C*D-E+F+A
Number of Addresses (cont.)
One-address machines
- Use a special set of registers called accumulators
  - Specify one source operand & receive the result
- Called accumulator machines
- Sample instructions
    load  addr    ; accum = [addr]
    store addr    ; M[addr] = accum
    add   addr    ; accum = accum + [addr]
    sub   addr    ; accum = accum - [addr]
    mult  addr    ; accum = accum * [addr]
Number of Addresses (cont.)
Example
- C statement
    A = B + C * D - E + F + A
- Equivalent code:
    load  C    ; load C into accum
    mult  D    ; accum = C*D
    add   B    ; accum = C*D+B
    sub   E    ; accum = B+C*D-E
    add   F    ; accum = B+C*D-E+F
    add   A    ; accum = B+C*D-E+F+A
    store A    ; store accum contents in A
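The one-address sequence for A = B + C * D - E + F + A can be checked with a tiny accumulator-machine interpreter. This is a minimal Python sketch under stated assumptions: memory is modelled as a dictionary keyed by variable name, the instruction names follow the sample instructions for accumulator machines (load, store, add, sub, mult), and the initial values are arbitrary.

```python
def run_accumulator(program, memory):
    """Execute one-address instructions; every operation implicitly uses the accumulator."""
    accum = 0
    for op, addr in program:
        if op == "load":
            accum = memory[addr]
        elif op == "store":
            memory[addr] = accum
        elif op == "add":
            accum += memory[addr]
        elif op == "sub":
            accum -= memory[addr]
        elif op == "mult":
            accum *= memory[addr]
        else:
            raise ValueError(f"unknown instruction: {op}")
    return memory


program = [("load", "C"), ("mult", "D"), ("add", "B"), ("sub", "E"),
           ("add", "F"), ("add", "A"), ("store", "A")]
memory = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}  # arbitrary sample values
run_accumulator(program, memory)
print(memory["A"])  # compare with 2 + 3 * 4 - 5 + 6 + 1
```

Seven instructions are needed, each naming a single memory address, which illustrates the trade of short instructions for a longer instruction sequence.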
Number of Addresses (cont.)
Zero-address machines
- Stack supplies operands and receives the result
  - Special instructions to load and store use an address
- Called stack machines (Ex: HP3000, Burroughs B5500)
- Sample instructions
    push addr    ; push([addr])
    pop  addr    ; pop([addr])
    add          ; push(pop + pop)
    sub          ; push(pop - pop)
    mult         ; push(pop * pop)
Number of Addresses (cont.)
Example
- C statement
    A = B + C * D - E + F + A
- Equivalent code:
    push E
    push C
    push D
    mult
    push B
    add
    sub
    push F
    add
    push A
    add
    pop  A
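The zero-address sequence can be traced with a small stack-machine interpreter. This Python sketch is illustrative: memory is a dictionary keyed by variable name, the initial values are arbitrary, and binary operations follow the convention push(pop op pop), so the first value popped is the left operand. That ordering is an assumption, but it is the one under which pushing E first and subtracting later yields (B + C*D) - E.

```python
def run_stack_machine(program, memory):
    """Execute zero-address instructions; only push and pop name a memory address."""
    stack = []
    for instr in program:
        op = instr[0]
        if op == "push":
            stack.append(memory[instr[1]])
        elif op == "pop":
            memory[instr[1]] = stack.pop()
        else:
            first = stack.pop()   # top of stack: left operand
            second = stack.pop()  # next entry: right operand
            if op == "add":
                stack.append(first + second)
            elif op == "sub":
                stack.append(first - second)
            elif op == "mult":
                stack.append(first * second)
            else:
                raise ValueError(f"unknown instruction: {op}")
    return memory


program = [("push", "E"), ("push", "C"), ("push", "D"), ("mult",),
           ("push", "B"), ("add",), ("sub",), ("push", "F"), ("add",),
           ("push", "A"), ("add",), ("pop", "A")]
memory = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}  # arbitrary sample values
run_stack_machine(program, memory)
print(memory["A"])  # compare with 2 + 3 * 4 - 5 + 6 + 1
```

The arithmetic instructions carry no addresses at all, which is where the good code density of stack machines comes from.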
Load/Store Architecture
- Instructions expect operands in internal processor registers
- Special LOAD and STORE instructions move data between registers and memory
- RISC uses this architecture
- Reduces instruction length
Load/Store Architecture (cont.)
- Sample instructions
    load  Rd,addr       ; Rd = [addr]
    store addr,Rs       ; (addr) = Rs
    add   Rd,Rs1,Rs2    ; Rd = Rs1 + Rs2
    sub   Rd,Rs1,Rs2    ; Rd = Rs1 - Rs2
    mult  Rd,Rs1,Rs2    ; Rd = Rs1 * Rs2
Number of Addresses (cont.)
Example
- C statement
    A = B + C * D - E + F + A
- Equivalent code:
    load  R1,B
    load  R2,C
    load  R3,D
    load  R4,E
    load  R5,F
    load  R6,A
    mult  R2,R2,R3
    add   R2,R2,R1
    sub   R2,R2,R4
    add   R2,R2,R5
    add   R2,R2,R6
    store A,R2
Flow of Control
- Default is sequential flow
- Several instructions alter this default execution
  - Branches
    - Unconditional
    - Conditional
    - Delayed branches
  - Procedure calls
    - Delayed procedure calls
Flow of Control (cont.)
Branches
- Unconditional
  - Absolute address
  - PC-relative
    - Target address is specified relative to PC contents
    - Relocatable code
- Example: MIPS
  - Absolute address
      j target
  - PC-relative
      b target
Flow of Control (cont.)
Branches
- Conditional
  - Jump is taken only if the condition is met
- Two types
  - Set-Then-Jump
    - Condition testing is separated from branching
    - Condition code registers are used to convey the condition test result
    - Condition code registers keep a record of the status of the last ALU operation, such as an overflow condition
    - Example: Pentium code
        cmp AX,BX     ; compare AX and BX
        je  target    ; jump if equal
Flow of Control (cont.)
  - Test-and-Jump
    - Single instruction performs condition testing and branching
    - Example: MIPS instruction
        beq Rsrc1,Rsrc2,target
      Jumps to target if Rsrc1 = Rsrc2
Delayed branching
- Control is transferred after executing the instruction that follows the branch instruction
  - This instruction slot is called the delay slot
- Improves efficiency
- Highly pipelined RISC processors support it
Flow of Control (cont.)
Procedure calls
- Facilitate modular programming
- Require two pieces of information to return
  - End of procedure
    - Pentium uses the ret instruction
    - MIPS uses the jr instruction
  - Return address
    - In a (special) register: MIPS allows any general-purpose register
    - On the stack: Pentium
Flow of Control (cont.)
[Figure: delay slot]
Parameter Passing
Two basic techniques
- Register-based (e.g., PowerPC, MIPS)
  - Internal registers are used
    - Faster
    - Limits the number of parameters
    - Recursive procedures
- Stack-based (e.g., Pentium)
  - Stack is used
    - More general
2. Operand Types
- Instructions support basic data types
  - Characters
  - Integers
  - Floating-point
- Instruction overload
  - Same instruction for different data types
  - Example: Pentium
      mov AL,address     ; loads an 8-bit value
      mov AX,address     ; loads a 16-bit value
      mov EAX,address    ; loads a 32-bit value
Operand Types (cont.)
- Separate instructions
  - Instructions specify the operand size
  - Example: MIPS
      lb Rdest,address    ; loads a byte
      lh Rdest,address    ; loads a halfword (16 bits)
      lw Rdest,address    ; loads a word (32 bits)
      ld Rdest,address    ; loads a doubleword (64 bits)
    Similar instruction: store
3. Addressing Modes
- How the operands are specified
- Operands can be in three places
  - Registers
    - Register addressing mode
  - Part of instruction
    - Constant
    - Immediate addressing mode
    - All processors support these two addressing modes
  - Memory
    - Difference between RISC and CISC
    - CISC supports a large variety of addressing modes
    - RISC follows load/store architecture
4. Instruction Types
Several types of instructions
- Data movement
  - Pentium: mov dest,src
  - Some do not provide direct data movement instructions
  - Indirect data movement
      add Rdest,Rsrc,0    ; Rdest = Rsrc+0
- Arithmetic and Logical
  - Arithmetic
    - Integer and floating-point, signed and unsigned
    - add, subtract, multiply, divide
  - Logical
    - and, or, not, xor
Instruction Types (cont.)
- Condition code bits
  - S: Sign bit (0 = +, 1 = -)
  - Z: Zero bit (0 = nonzero, 1 = zero)
  - O: Overflow bit (0 = no overflow, 1 = overflow)
  - C: Carry bit (0 = no carry, 1 = carry)
- Example: Pentium
    cmp count,25    ; compare count to 25
                    ; subtract 25 from count
    je  target      ; jump if equal
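How a compare sets the four condition code bits can be sketched in Python. This is a simplified model, not a description of any specific processor: the function name is hypothetical, the operand width defaults to 32 bits, and the carry bit is taken to mean "borrow" on subtraction (the x86-style convention), which is an assumption here.

```python
def compare_flags(a, b, bits=32):
    """Return the S, Z, O, C bits produced by computing a - b on `bits`-wide operands."""
    mask = (1 << bits) - 1
    sign_bit = 1 << (bits - 1)
    a &= mask
    b &= mask
    result = (a - b) & mask
    s = int(bool(result & sign_bit))  # sign of the result
    z = int(result == 0)              # result is zero, i.e. operands are equal
    c = int(a < b)                    # unsigned borrow (assumed carry convention)
    # Signed overflow: operands differ in sign and the result's sign differs from a's.
    o = int(bool((a ^ b) & (a ^ result) & sign_bit))
    return {"S": s, "Z": z, "O": o, "C": c}


flags = compare_flags(25, 25)
print(flags)                      # Z is set, so a "jump if equal" would be taken
print(compare_flags(3, 5))        # negative result: S set, borrow: C set
print(compare_flags(-2**31, 1))   # most negative value minus one overflows
```

A conditional jump such as "jump if equal" then only needs to inspect the Z bit left behind by the compare.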
Instruction Types (cont.)
- Flow control and I/O instructions
  - Branch
  - Procedure call
  - Interrupts
- I/O instructions
  - Memory-mapped I/O
    - Most processors support memory-mapped I/O
    - No separate instructions for I/O
  - Isolated I/O
    - Pentium supports isolated I/O
    - Separate I/O instructions
        in  AX,io_port    ; read from an I/O port
        out io_port,AX    ; write to an I/O port
5. Instruction Formats
Two types
- Fixed-length
  - Used by RISC processors
  - 32-bit RISC processors use 32-bit wide instructions
    - Examples: SPARC, MIPS, PowerPC
- Variable-length
  - Used by CISC processors
  - Memory operands need more bits to specify
Opcode
- Major and exact operation
Examples of Instruction Formats
RISC (Reduced Instruction Set Computer) versus CISC (Complex Instruction Set Computer)
RISC vs. CISC
- The underlying philosophy of RISC machines is that a system is better able to manage program execution when the program consists of only a few different instructions that are the same length and require the same number of clock cycles to decode and execute
- RISC systems access memory only with explicit load and store instructions
- In CISC systems, many different kinds of instructions access memory, making instruction length variable and fetch-decode-execute time unpredictable
RISC vs. CISC (cont.)
- The difference between CISC and RISC becomes evident through the basic computer performance equation:
    time/program = time/cycle × cycles/instruction × instructions/program
- RISC systems shorten execution time by reducing the clock cycles per instruction
- CISC systems improve performance by reducing the number of instructions per program
RISC vs. CISC (cont.)
- The simple instruction set of RISC machines enables control units to be hardwired for maximum speed
- The more complex, and variable, instruction set of CISC machines requires microcode-based control units that interpret instructions as they are fetched from memory. This translation takes time
- With fixed-length instructions, RISC lends itself to pipelining and speculative execution
RISC vs. CISC (cont.)
Consider the two program fragments:
CISC:
    mov ax, 10
    mov bx, 5
    mul bx, ax
RISC:
    mov ax, 0
    mov bx, 10
    mov cx, 5
    Begin: add ax, bx
           loop Begin
- The total clock cycles for the CISC version might be: (2 movs × 1 cycle) + (1 mul × 30 cycles) = 32 cycles
- While the clock cycles for the RISC version are: (3 movs × 1 cycle) + (5 adds × 1 cycle) + (5 loops × 1 cycle) = 13 cycles
- With the RISC clock cycle being shorter, RISC gives us much faster execution speeds
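The cycle comparison can be reproduced with a short Python calculation. All numbers here are illustrative assumptions rather than measurements: 1 cycle for each mov, add and loop instruction, 30 cycles for a multiply, and a loop body that executes 5 times. The helper name is hypothetical.

```python
def total_cycles(instruction_counts, cycles_per_instruction):
    """Sum count x cycles over every instruction type executed."""
    return sum(count * cycles_per_instruction[name]
               for name, count in instruction_counts.items())


# CISC fragment: two moves and one multi-cycle multiply (assumed 30 cycles).
cisc = total_cycles({"mov": 2, "mul": 1}, {"mov": 1, "mul": 30})

# RISC fragment: three moves, then an add and a loop instruction executed 5 times each.
risc = total_cycles({"mov": 3, "add": 5, "loop": 5},
                    {"mov": 1, "add": 1, "loop": 1})

print(cisc, risc)
```

The RISC version executes more instructions but fewer total cycles under these assumptions, which is the point the performance equation makes: instruction count and cycles per instruction trade off against each other.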
RISC vs. CISC (cont.)
- Because of their load-store ISAs, RISC architectures require a large number of CPU registers
- These registers provide fast access to data during sequential program execution
- They can also be employed to reduce the overhead typically caused by passing parameters to subprograms
- Instead of pulling parameters off of a stack, the subprogram is directed to use a subset of registers
RISC vs. CISC (cont.)
- This is how registers can be overlapped in a RISC system
- The current window pointer (CWP) points to the active register window
RISC vs. CISC (cont.)
- It is becoming increasingly difficult to distinguish RISC architectures from CISC architectures
- Some RISC systems provide more extravagant instruction sets than some CISC systems
- Some systems combine both approaches
- The following two slides summarize the characteristics that traditionally typify the differences between these two architectures
RISC vs. CISC (cont.)
RISC
- Multiple register sets.
- Three operands per instruction.
- Parameter passing through register windows.
- Single-cycle instructions.
- Hardwired control.
- Highly pipelined.
CISC
- Single register set.
- One or two register operands per instruction.
- Parameter passing through memory.
- Multiple-cycle instructions.
- Microprogrammed control.
- Less pipelined.
Continued ...
RISC vs. CISC (cont.)
RISC
- Simple instructions, few in number.
- Fixed-length instructions.
- Complexity in compiler.
- Only LOAD/STORE instructions access memory.
- Few addressing modes.
CISC
- Many complex instructions.
- Variable-length instructions.
- Complexity in microcode.
- Many instructions can access memory.
- Many addressing modes.
Summary
Instruction Set Design Issues
Instruction set design issues include:
- Where are operands stored?
  - registers, memory, stack, accumulator
- How many explicit operands are there?
  - 0, 1, 2, or 3
- How is the operand location specified?
  - register, immediate, indirect, ...
- What type & size of operands are supported?
  - byte, int, float, double, string, vector, ...
- What operations are supported?
  - add, sub, mul, move, compare, ...
More About General Purpose Registers
Why do almost all new architectures use GPRs?
- Registers are much faster than memory (even cache)
  - Register values are available immediately
  - When memory isn't ready, the processor must wait ("stall")
- Registers are convenient for variable storage
  - Compiler assigns some variables just to registers
  - More compact code, since small fields specify registers (compared to memory addresses)
[Figure: memory hierarchy: Processor (Registers), Cache, Memory, Disk]
What Operations are Needed
- Arithmetic & Logical
  - Integer arithmetic: ADD, SUB, MULT, DIV, SHIFT
  - Logical operations: AND, OR, XOR, NOT
- Data Transfer: copy, load, store
- Control: branch, jump, call, return
- Floating Point: ADDF, MULF, DIVF
  - Same as arithmetic, but usually take bigger operands
- Decimal: ADDD, CONVERT
- String: move, compare, search
- Graphics: pixel and vertex, compression/decompression operations
Stack Architecture: Pros and Cons
Pros
- Good code density (implicit top of stack)
- Low hardware requirements
- Easy to write a simpler compiler for stack architectures
Cons
- Stack becomes the bottleneck
- Little ability for parallelism or pipelining
- Data is not always at the top of the stack when needed, so additional instructions like TOP and SWAP are needed
- Difficult to write an optimizing compiler for stack architectures
Accumulator Architecture: Pros and Cons
Pros
- Very low hardware requirements
- Easy to design and understand
Cons
- Accumulator becomes the bottleneck
- Little ability for parallelism or pipelining
- High memory traffic
Memory-Memory Architecture: Pros and Cons
Pros
- Requires fewer instructions (especially if 3 operands)
- Easy to write compilers for (especially if 3 operands)
Cons
- Very high memory traffic (especially if 3 operands)
- Variable number of clocks per instruction
- With two operands, more data movements are required
Memory-Register Architecture: Pros and Cons
Pros
- Some data can be accessed without loading first
- Instruction format easy to encode
- Good code density
Cons
- Operands are not equivalent (poor orthogonality)
- Variable number of clocks per instruction
- May limit number of registers
Example of Addressing Modes
Addressing mode | Example | Meaning | When used
Register | Add R4, R3 | Regs[R4] <- Regs[R4] + Regs[R3] | When a value is in a register
Immediate | Add R4, #3 | Regs[R4] <- Regs[R4] + 3 | For constants
Displacement | Add R4, 100(R1) | Regs[R4] <- Regs[R4] + Mem[100 + Regs[R1]] | Accessing local variables
Register indirect | Add R4, (R1) | Regs[R4] <- Regs[R4] + Mem[Regs[R1]] | Accessing using a pointer or a computed address
Indexed | Add R4, (R1 + R2) | Regs[R4] <- Regs[R4] + Mem[Regs[R1] + Regs[R2]] | Sometimes useful in array addressing: R1 = base of the array, R2 = index amount
Direct or absolute | Add R4, (1001) | Regs[R4] <- Regs[R4] + Mem[1001] | Sometimes useful for accessing static data; address constant may need to be large
Memory indirect or memory deferred | Add R4, @(R3) | Regs[R4] <- Regs[R4] + Mem[Mem[Regs[R3]]] | If R3 is the address of the pointer p, then the mode yields *p
Autoincrement | Add R4, (R2)+ | Regs[R4] <- Regs[R4] + Mem[Regs[R2]]; Regs[R2] <- Regs[R2] + d | Useful for stepping through arrays within a loop. R2 points to the start of the array; each reference increments R2 by d
Autodecrement | Add R4, -(R2) | Regs[R2] <- Regs[R2] - d; Regs[R4] <- Regs[R4] + Mem[Regs[R2]] | Same use as autoincrement. Autodecrement/increment can also act as push/pop to implement a stack
Scaled | Add R4, 100(R2)[R3] | Regs[R4] <- Regs[R4] + Mem[100 + Regs[R2] + Regs[R3] * d] | Used to index arrays
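The "Meaning" column of the addressing-mode table can be read as a recipe for locating an operand. The following is a minimal sketch of that recipe, not a model of any real processor: the function name `operand`, the mode names, and the dictionary-based register file and memory are all assumptions made for illustration. Autoincrement and autodecrement are left out because they also modify a register.

```python
# Illustrative model only: `operand`, the mode strings, and the dict-based
# register file / memory are hypothetical names chosen for this sketch.
def operand(mode, regs, mem, *, reg=None, index=None, const=0, d=4):
    """Return the operand value selected by an addressing mode."""
    if mode == "register":
        return regs[reg]                      # Regs[R]
    if mode == "immediate":
        return const                          # the constant itself
    if mode == "displacement":
        return mem[const + regs[reg]]         # Mem[const + Regs[R]]
    if mode == "register_indirect":
        return mem[regs[reg]]                 # Mem[Regs[R]]
    if mode == "indexed":
        return mem[regs[reg] + regs[index]]   # Mem[Regs[R] + Regs[Rindex]]
    if mode == "direct":
        return mem[const]                     # Mem[const]
    if mode == "memory_indirect":
        return mem[mem[regs[reg]]]            # Mem[Mem[Regs[R]]]
    if mode == "scaled":
        # Mem[const + Regs[R] + Regs[Rindex] * d], d = element size
        return mem[const + regs[reg] + regs[index] * d]
    raise ValueError(f"unknown addressing mode: {mode}")


# Small made-up machine state used to exercise each mode.
regs = {"R1": 8, "R2": 4, "R3": 2}
mem = {8: 100, 12: 7, 16: 55, 100: 9, 108: 21, 1001: 3}
```

With this state, `operand("displacement", regs, mem, reg="R1", const=100)` reads `mem[108]`, and `operand("memory_indirect", regs, mem, reg="R1")` follows the pointer stored at address 8.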
Addressing Mode for Signal Processing
Fast Fourier Transform (bit-reversed index order)
0 (000) -> 0 (000)
1 (001) -> 4 (100)
2 (010) -> 2 (010)
3 (011) -> 6 (110)
4 (100) -> 1 (001)
5 (101) -> 5 (101)
6 (110) -> 3 (011)
7 (111) -> 7 (111)
Modulo addressing
- Since DSP deals with continuous data streams, circular buffers are widely used
- Circular or modulo addressing allows automatic increment and decrement, and resets the pointer when reaching the end of the buffer
Reverse addressing
- Resulting address is the reverse order of the current address
- Reverse addressing mode expedites the access, which otherwise requires a number of logical instructions or extra memory accesses
DSP offers special addressing modes to better serve popular algorithms
Special features require either hand coding or a compiler that uses such features (GNU would not be a good choice)
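Both DSP addressing modes described above reduce to small index calculations. The sketch below shows one way to express them in software; the function names `next_circular` and `bit_reverse` are assumptions for illustration, and a real DSP would perform the same computation in its address-generation hardware rather than in instructions.

```python
def next_circular(index, step, size):
    """Modulo addressing: advance by `step` and wrap inside a buffer of `size` slots."""
    return (index + step) % size


def bit_reverse(index, bits):
    """Reverse addressing: mirror the lowest `bits` bits of `index`."""
    result = 0
    for _ in range(bits):
        result = (result << 1) | (index & 1)  # move the lowest bit of index into result
        index >>= 1
    return result


# Access order for an 8-point FFT (3 address bits).
fft_order = [bit_reverse(i, 3) for i in range(8)]
```

For a 3-bit index the list `fft_order` is the same mapping as the bit-reversal table for the 8-point FFT, and `next_circular(7, 1, 8)` wraps back to slot 0.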
Operations of the Computer Hardware
"There must certainly be instructions for performing the fundamental arithmetic operations."
Burks, Goldstine and von Neumann, 1947
Assembly language is a symbolic representation of what the processor actually understands
MIPS assembler allows only one instruction per line and ignores comments following # until the end of the line
Example:
Translation of a segment of a C program to MIPS assembly instructions
C: f = (g + h) - (i + j)
MIPS:
add t0, g, h    # temp variable t0 contains "g + h"
add t1, i, j    # temp variable t1 contains "i + j"
sub f, t0, t1   # f = t0 - t1 = (g + h) - (i + j)
Operations in the Instruction Set
Operator type | Examples
Arithmetic and logical | Integer arithmetic and logical operations: add, and, subtract, or
Data transfer | Loads-stores (move instructions on machines with memory addressing)
Control | Branch, jump, procedure call and return, trap
System | Operating system call, virtual memory management instructions
Floating point | Floating point instructions: add, multiply
Decimal | Decimal add, decimal multiply, decimal to character conversion
String | String move, string compare, string search
Graphics | Pixel operations, compression/decompression operations
- Arithmetic, logical, data transfer and control are almost standard categories for all machines
- System instructions are required for multi-programming environments, although support for system functions varies
- Decimal and string instructions can be primitives, e.g. IBM 360 and the VAX
- Support for floating point, decimal, string and graphics is optional and sometimes provided via a co-processor
- Some machines rely on the compiler to synthesize special operations such as string handling from simpler instructions
Operations for Media & Signal Processing
Single instruction multiple data (SIMD) and vector instructions are often supported in DSPs, which are commonly used in multimedia and signal processing applications
Partitioned Add (integer)
- Perform multiple narrow additions (for example 16-bit) on a 64-bit ALU, since most data are narrow
- Increases ALU throughput for multimedia applications
Paired single operations (float)
- Allow the same register to act as two operands to the same operation
- Handy in dealing with vertices and coordinates
Multiply and accumulate
- Very handy for calculating dot products of vectors (signal processing) and matrix multiplication
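A partitioned add and a multiply-accumulate can both be sketched in a few lines. The code below is an illustration under stated assumptions, not a description of any particular DSP: it assumes four 16-bit lanes packed into one 64-bit word, and the names `partitioned_add16` and `multiply_accumulate` are hypothetical.

```python
MASK16 = 0xFFFF  # one 16-bit lane


def partitioned_add16(a, b):
    """Add four 16-bit lanes packed in 64-bit words; carries never cross lanes."""
    result = 0
    for lane in range(4):
        shift = 16 * lane
        lane_sum = ((a >> shift) & MASK16) + ((b >> shift) & MASK16)
        result |= (lane_sum & MASK16) << shift  # drop the carry out of the lane
    return result


def multiply_accumulate(xs, ys):
    """Dot product built from repeated multiply-and-accumulate steps."""
    acc = 0
    for x, y in zip(xs, ys):
        acc += x * y  # one MAC operation per element pair
    return acc
```

The lane holding 0xFFFF wraps to 0 when 1 is added, while its neighbours are unaffected; that isolation is what separates a partitioned add from an ordinary 64-bit add.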
Frequency of Operations Usage
Make the common case fast by focusing on these operations
The most widely executed instructions are the simple operations of an instruction set
The following is the average usage in SPECint on Intel 80x86
Rank | 80x86 instruction | Integer average (% total executed)
1 | Load |
2 | Conditional branch |
3 | Compare |
4 | Store |
5 | Add | 8
6 | And |
7 | Sub |
8 | Move register-register | 4
9 | Call |
10 | Return |
Total |
Control Flow Instructions
- Jump: for unconditional change in the control flow
- Branch: for conditional change in the control flow
- Procedure calls and returns
Data is based on SPEC on Alpha
Destination Address Definition
- Relative addressing with respect to the program counter proved to be the best choice for forward and backward branching or jumps (load-address independent)
- To allow for dynamic loading of library routines, register indirect addressing allows addresses to be loaded in special registers (e.g. virtual functions in C++ and system calls in a case statement)
Data is based on SPEC on Alpha
Condition Evaluation
Compare-and-branch can be efficient if the majority of conditions are comparisons with zero
Remember to focus on the common case
Based on SPEC on MIPS
Frequency of Types of Comparison
Data is based on SPEC on Alpha
Different benchmarks and machines set new design priorities
DSPs support a repeat instruction for for-loops (vectors) using registers
Supporting Procedures
Execution of a procedure follows these steps:
- Store parameters in a place accessible to the procedure
- Transfer control to the procedure
- Acquire the storage resources needed for the procedure
- Perform the desired task
- Store the result value in a place accessible to the calling program
- Return control to the point of origin
The hardware provides a program counter to trace instruction flow and manage transfer of control
Parameter Passing
- Registers can be used for passing a small number of parameters
- A stack is used to spill registers of the current context and make room for the called procedure to run, and to allow for large parameters to be passed
- Storage of machine state can be performed by caller or callee
- Handling of shared variables is important to ensure correct semantics and thus requires clear specifications in the library interface
Global variables stored in registers need careful handling
Type and Size of Operands
- The type of an operand is designated by encoding it in the instruction's operation code
- The type of an operand, e.g. single precision float, effectively gives its size
- Common operand types include character, half-word and word size integer, and single- and double-precision floating point
- Characters are almost always in ASCII, integers are in 2's complement, and floating point is in IEEE 754
- The 16-bit Unicode used in Java is gaining popularity due to its support for the international character sets
- For business applications some architectures support a decimal format in binary coded decimal (BCD)
- Depending on the size of the word, the complexity of handling different operand types differs
- DSP offers fixed point data types to support high precision floating point arithmetic and to allow sharing a single exponent for multiple numbers
- For graphics applications, vertex and pixel operands are added features
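Two of the operand formats named above, 2's complement integers and binary coded decimal, are easy to show as bit patterns. This is a small sketch for illustration only; the helper names `twos_complement` and `to_bcd` are assumptions, and `to_bcd` handles non-negative integers only.

```python
def twos_complement(value, bits):
    """Bit pattern of a signed integer in `bits`-bit 2's complement."""
    return value & ((1 << bits) - 1)


def to_bcd(n):
    """Pack a non-negative integer as BCD: one decimal digit per 4-bit nibble."""
    packed = 0
    for ch in str(n):
        packed = (packed << 4) | int(ch)
    return packed
```

In BCD the hexadecimal rendering of the packed value reads like the decimal number itself, which is why `to_bcd(1947)` equals `0x1947`.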
Size of Operands
- Double-word data type is used for double-precision floating point operations and address storage in machines with a 64-bit wide address bus
- Words are used for integer operations and for 32-bit address bus machines
- Because of the mix in SPEC, word and double-word data types dominate
Frequency of reference by size based on SPEC2000 on Alpha
Instruction Representation
- Humans are taught to think in base 10 (decimal), but numbers may be represented in any base (123 in base 10 = 1111011 in binary, or base 2)
- Numbers are stored in computers as a series of high and low electronic signals (binary numbers)
- Binary digits are called bits and are considered the atom of computing
- Each piece of an instruction is a number, and placing these numbers together forms the instruction
- The assembler translates the symbolic assembly instructions into machine language instructions (machine code)
Example:
Assembly: add $t0, $s1, $s2
M/C language (decimal): 0 | 17 | 18 | 8 | 0 | 32
M/C language (binary): 000000 | 10001 | 10010 | 01000 | 00000 | 100000
Field widths: 6 bits | 5 bits | 5 bits | 5 bits | 5 bits | 6 bits
Note: MIPS compiler by default maps $s0..$s7 to reg. 16-23 and $t0..$t7 to reg. 8-15
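Placing the field numbers side by side, as the example does, amounts to shifting each field into position and OR-ing the results. The sketch below packs the six R-format fields into one 32-bit word; the function name `encode_r_format` is an assumption for illustration, and the field widths (6, 5, 5, 5, 5, 6 bits) are the standard MIPS R-format layout.

```python
def encode_r_format(op, rs, rt, rd, shamt, funct):
    """Pack MIPS R-format fields (6/5/5/5/5/6 bits) into a 32-bit word."""
    return (op << 26) | (rs << 21) | (rt << 16) | (rd << 11) | (shamt << 6) | funct


# add $t0, $s1, $s2: $t0 is register 8, $s1 is 17, $s2 is 18.
word = encode_r_format(op=0, rs=17, rt=18, rd=8, shamt=0, funct=32)
binary = format(word, "032b")  # 32-character string of 0s and 1s
```

Splitting `binary` into groups of 6, 5, 5, 5, 5 and 6 characters gives back the six binary fields of the example.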
Encoding an Instruction Set
- Instruction encoding affects the size of the compiled program and the complexity of the CPU implementation
- The operation is typically specified in one field called the opcode
- The addressing mode for the operand can be encoded with the operation, or specified through a separate identifier when a large number of modes is supported
- The architecture must balance between several competing factors:
  - Desire to support as many registers and addressing modes as possible
  - Effect of operand specification on the size of the instruction (program)
  - Desire to simplify instruction fetching and decoding during execution
- Fixed size instruction encoding simplifies the CPU design while limiting the addressing modes supported
- An architect caring about the code size can use variable size encoding
- A hybrid approach is to allow variability by supporting multiple-sized instructions
Encoding Examples
MIPS Instruction format
Register-format instructions
- op: Basic operation of the instruction, traditionally called the opcode
- rs: The first register source operand
- rt: The second register source operand
- rd: The register destination operand; it gets the result of the operation
- shamt: Shift amount
- funct: This field selects the specific variant of the operation in the op field
Immediate-type instructions
- Some instructions need longer fields than provided, for large value constants
- The 16-bit address means a load word instruction can load a word within a region of ±2^15 bytes of the address in the base register
- Example: lw $t0, 32($s3)   # Temporary register $t0 gets A[8]
Instruction | Format | op | rs | rt | rd | shamt | funct | address
add | R | 0 | reg | reg | reg | 0 | 32 | N/A
sub | R | 0 | reg | reg | reg | 0 | 34 | N/A
lw | I | 35 | reg | reg | N/A | N/A | N/A | address
sw | I | 43 | reg | reg | N/A | N/A | N/A | address
R-format layout: op (6 bits) | rs (5 bits) | rt (5 bits) | rd (5 bits) | shamt (5 bits) | funct (6 bits)
I-format layout: op (6 bits) | rs (5 bits) | rt (5 bits) | address (16 bits)
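The reach of a load follows directly from the 16-bit signed address field of the I-format. As a rough sketch, the helpers below pack an I-format word and recover the signed offset; the names `encode_i_format` and `decode_offset` are assumptions for illustration, and the register numbers follow the usual MIPS convention ($t0 = 8, $s3 = 19).

```python
def encode_i_format(op, rs, rt, offset):
    """Pack MIPS I-format fields (6/5/5/16 bits); offset is a signed 16-bit value."""
    if not -(1 << 15) <= offset < (1 << 15):
        raise ValueError("offset does not fit in 16 signed bits")
    return (op << 26) | (rs << 21) | (rt << 16) | (offset & 0xFFFF)


def decode_offset(word):
    """Recover the signed 16-bit offset from an I-format word."""
    raw = word & 0xFFFF
    return raw - 0x10000 if raw & 0x8000 else raw


# lw $t0, 32($s3): op = 35, base register $s3 = 19, target register $t0 = 8.
lw_word = encode_i_format(op=35, rs=19, rt=8, offset=32)
```

An offset of 32768 is rejected because it needs a 17th bit, which is the encoding-level reason a single load reaches only about 2^15 bytes on either side of the base register.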
The Stored Program Concept
Learning how instructions are represented leads to discovering the secret of computing: the stored-program concept
Today's computers are built on two key principles:
- Instructions are represented as numbers
- Programs can be stored in memory to be read or written just like numbers
The power of the concept: memory can contain
- the source code for an editor
- the compiled m/c code for the editor
- the text that the compiled program is using
- the compiler that generated the code
[Figure: a processor connected to a memory holding an accounting program (machine code), an editor program (machine code), a C compiler (machine code), payroll data, book text, and source code in C for the editor program]
Compiling if-then-else in MIPS
Assuming the five variables f, g, h, i, and j correspond to the five registers $s0 through $s4, what is the compiled code for the following C if statement?
if (i == j) f = g + h; else f = g - h;
[Figure: flowchart testing i == j; the i = j path computes f = g + h, the i != j path (Else:) computes f = g - h, and both paths join at Exit:]
MIPS:
      bne $s3, $s4, Else   # go to Else if i != j
      add $s0, $s1, $s2    # f = g + h (skipped if i != j)
      j Exit
Else: sub $s0, $s1, $s2    # f = g - h (skipped if i = j)
Exit:
Typical Compilation
Major Types of Optimization
Optimization name | Explanation | Frequency
High level (at or near source level; machine independent)
  Procedure integration | Replace procedure call by procedure body | N.M.
Local (within straight-line code)
  Common subexpression elimination | Replace two instances of the same computation by a single copy |
  Constant propagation | Replace all instances of a variable that is assigned a constant with the constant |
  Stack height reduction | Rearrange expression tree to minimize resources needed for expression evaluation | N.M.
Global (across a branch)
  Global common subexpression elimination | Same as local, but this version crosses branches |
  Copy propagation | Replace all instances of a variable A that has been assigned X (i.e. A = X) with X |
  Code motion | Remove code from a loop that computes the same value each iteration of the loop |
  Induction variable elimination | Simplify/eliminate array addressing calculations within loops |
Machine-dependent (depends on machine knowledge)
  Strength reduction | Many examples, such as replacing multiply by a constant with adds and shifts | N.M.
  Pipeline scheduling | Reorder instructions to improve pipeline performance | N.M.
Effect of Compiler Optimization
Measurements taken on MIPS
[Figure: results by program and compiler optimization level]
Level 0: non-optimized code
Level 1: local optimization
Level 2: global optimization, s/w pipelining
Level 3: adds procedure integration
Compiler Support for Multimedia Instr.
- Intel's MMX and PowerPC AltiVec have small vector processing capabilities targeting multimedia applications (to speed up graphics)
- Intel added a new set of instructions called Streaming SIMD Extension
- A major advantage of vector computers is hiding the latency of memory access by loading multiple elements and then overlapping execution with data transfer
- Vector computers typically have strided and/or gather/scatter addressing to perform operations on distant memory locations
  - Strided addressing allows memory access in increments larger than one
  - Gather/scatter addressing is similar to register indirect mode, where the addresses are stored instead of the data
- Supporting vector operations without strided addressing, as in Intel's MMX, limits the potential speedup
- Such limited support for vector processing makes the use of vectorizing compiler optimization unpopular and restricts its scope to hand coded routines
SIMD instructions on MMX and AltiVec tend to be solutions, not primitives
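Strided and gather/scatter addressing differ only in where the element addresses come from. The following sketch models memory as a Python list; the function names `strided_load`, `gather`, and `scatter` are assumptions for illustration and do not correspond to any vendor's instruction names.

```python
def strided_load(memory, base, stride, count):
    """Strided addressing: read `count` elements spaced `stride` apart from `base`."""
    return [memory[base + i * stride] for i in range(count)]


def gather(memory, addresses):
    """Gather: read the elements whose addresses are held in an index vector."""
    return [memory[a] for a in addresses]


def scatter(memory, addresses, values):
    """Scatter: write each value to the address held in the index vector."""
    for a, v in zip(addresses, values):
        memory[a] = v


# Made-up memory image: address i holds the value 10 + i.
memory = list(range(10, 30))
column = strided_load(memory, base=1, stride=4, count=3)
picked = gather(memory, [0, 7, 3])
```

A stride of 4 is what walking down one column of a row-major matrix with four elements per row would look like; gather covers the case where the wanted addresses follow no regular pattern.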
Starting a Program
[Figure: C program -> Compiler -> Assembly language program -> Assembler -> Object: machine language module (together with Object: library routine (machine language)) -> Linker -> Executable: machine language program -> Loader -> Memory]
Linker
- Place code & data modules symbolically in memory
- Determine the address of data & instruction labels
- Patch both internal & external references
Object files for Unix typically contain:
- Header: size & position of components
- Text segment: machine code
- Data segment: static and dynamic variables
- Relocation info: identify absolute memory references
- Symbol table: name & location of labels, procedures and variables
- Debugging info: mapping source to object code, break points, etc.
Loading Executable Program
[Figure: MIPS memory layout, from high to low addresses]
$sp -> 7fff ffff hex : Stack
                       Dynamic data
$gp -> 1000 8000 hex : Static data
       1000 0000 hex
pc  -> 0040 0000 hex : Text
       0             : Reserved
To load an executable, the operating system follows these steps:
- Reads the executable file header to determine the size of the text and data segments
- Creates an address space large enough for the text and data
- Copies the instructions and data from the executable file into memory
- Copies the parameters (if any) to the main program onto the stack
- Initializes the machine registers and sets the stack pointer to the first free location
- Jumps to a start-up routine that copies the parameters into the argument registers and calls the main routine of the program
Instruction Set Design Issues
- Number of Addresses
- Flow of Control
- Operand Types
- Addressing Modes
- Instruction Types
- Instruction Formats
Number of Addresses
Four categories
- 3-address machines
  - 2 for the source operands and one for the result
- 2-address machines
  - One address doubles as source and result
- 1-address machines
  - Accumulator machines
  - Accumulator is used for one source and result
- 0-address machines
  - Stack machines
  - Operands are taken from the stack
  - Result goes onto the stack
Number of Addresses (cont.)
Three-address machines
- Two for the source operands, one for the result
- RISC processors use three addresses
- Sample instructions
  add dest,src1,src2    ; M(dest)=[src1]+[src2]
  sub dest,src1,src2    ; M(dest)=[src1]-[src2]
  mult dest,src1,src2   ; M(dest)=[src1]*[src2]
Three addresses: Operand 1, Operand 2, Result
Example: a = b + c
Three-address instruction formats are not common because they require a relatively long instruction format to hold the three address references
Number of Addresses (cont.)
Example
C statement
A = B + C * D - E + F + A
Equivalent code:
mult T,C,D   ; T = C*D
add T,T,B    ; T = B+C*D
sub T,T,E    ; T = B+C*D-E
add T,T,F    ; T = B+C*D-E+F
add A,T,A    ; A = B+C*D-E+F+A
Number of Addresses (cont.)
Two-address machines
- One address doubles (for source operand & result)
- Last example makes a case for it
  - Address T is used twice
- Sample instructions
  load dest,src   ; M(dest)=[src]
  add dest,src    ; M(dest)=[dest]+[src]
  sub dest,src    ; M(dest)=[dest]-[src]
  mult dest,src   ; M(dest)=[dest]*[src]
Two addresses: one address doubles as operand and result
Example: a = a + b
The two-address format reduces the space requirement but also introduces some awkwardness. To avoid altering the value of an operand, a MOVE instruction is used to move one of the values to a result or temporary location before performing the operation
Number of Addresses (cont.)
Example
C statement
A = B + C * D - E + F + A
Equivalent code:
load T,C   ; T = C
mult T,D   ; T = C*D
add T,B    ; T = B+C*D
sub T,E    ; T = B+C*D-E
add T,F    ; T = B+C*D-E+F
add A,T    ; A = B+C*D-E+F+A
Number of Addresses (cont.)
One-address machines
- Use a special set of registers called accumulators
  - Specify one source operand & receive the result
- Called accumulator machines
- Sample instructions
  load addr    ; accum = [addr]
  store addr   ; M[addr] = accum
  add addr     ; accum = accum + [addr]
  sub addr     ; accum = accum - [addr]
  mult addr    ; accum = accum * [addr]
Number of Addresses (cont.)
Example
C statement
A = B + C * D - E + F + A
Equivalent code:
load C    ; load C into accum
mult D    ; accum = C*D
add B     ; accum = C*D+B
sub E     ; accum = B+C*D-E
add F     ; accum = B+C*D-E+F
add A     ; accum = B+C*D-E+F+A
store A   ; store accum contents in A
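The accumulator sequence can be checked by running it on a tiny interpreter. The sketch below is illustrative only: `run_accumulator`, the tuple-based program format, and the sample variable values are assumptions, while the five operations follow the sample instructions for one-address machines.

```python
def run_accumulator(program, memory):
    """Execute (op, addr) pairs on a one-address machine with a single accumulator."""
    accum = 0
    for op, addr in program:
        if op == "load":
            accum = memory[addr]
        elif op == "store":
            memory[addr] = accum
        elif op == "add":
            accum += memory[addr]
        elif op == "sub":
            accum -= memory[addr]
        elif op == "mult":
            accum *= memory[addr]
        else:
            raise ValueError(f"unknown operation: {op}")
    return memory


# Made-up values for the variables in A = B + C * D - E + F + A.
memory = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
program = [("load", "C"), ("mult", "D"), ("add", "B"), ("sub", "E"),
           ("add", "F"), ("add", "A"), ("store", "A")]
run_accumulator(program, memory)
```

Every instruction except the final store reads memory once, which illustrates the "high memory traffic" listed among the drawbacks of accumulator architectures.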
Number of Addresses (cont.)
Zero-address machines
- Stack supplies operands and receives the result
  - Special instructions to load and store use an address
- Called stack machines (Ex: HP3000, Burroughs B5500)
- Sample instructions
  push addr   ; push([addr])
  pop addr    ; pop([addr])
  add         ; push(pop + pop)
  sub         ; push(pop - pop)
  mult        ; push(pop * pop)
Number of Addresses (cont.)
Example
C statement
A = B + C * D - E + F + A
Equivalent code:
push E
push C
push D
mult
push B
add
sub
push F
add
push A
add
pop A
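The zero-address sequence depends on operand order: `sub` is defined as push(pop - pop), so the value popped first (the top of the stack) is the left operand. A minimal stack-machine sketch makes that visible; `run_stack`, the string-based program format, and the sample values are assumptions for illustration.

```python
def run_stack(program, memory):
    """Execute a zero-address program; only push and pop carry an address."""
    stack = []
    for instr in program:
        op, *args = instr.split()
        if op == "push":
            stack.append(memory[args[0]])
        elif op == "pop":
            memory[args[0]] = stack.pop()
        else:
            top = stack.pop()      # first pop: left operand
            below = stack.pop()    # second pop: right operand
            if op == "add":
                stack.append(top + below)
            elif op == "sub":
                stack.append(top - below)
            elif op == "mult":
                stack.append(top * below)
            else:
                raise ValueError(f"unknown operation: {op}")
    return memory


# Made-up values for the variables in A = B + C * D - E + F + A.
memory = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
program = ["push E", "push C", "push D", "mult", "push B", "add", "sub",
           "push F", "add", "push A", "add", "pop A"]
run_stack(program, memory)
```

Pushing E first is what lets the later `sub` compute (B + C*D) - E rather than the reverse.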
Load/Store Architecture
- Instructions expect operands in internal processor registers
  - Special LOAD and STORE instructions move data between registers and memory
- RISC uses this architecture
- Reduces instruction length
Load/Store Architecture (cont.)
Sample instructions
load Rd,addr      ; Rd = [addr]
store addr,Rs     ; (addr) = Rs
add Rd,Rs1,Rs2    ; Rd = Rs1 + Rs2
sub Rd,Rs1,Rs2    ; Rd = Rs1 - Rs2
mult Rd,Rs1,Rs2   ; Rd = Rs1 * Rs2
Number of Addresses (cont.)
Example
C statement
A = B + C * D - E + F + A
Equivalent code:
load R1,B
load R2,C
load R3,D
load R4,E
load R5,F
load R6,A
mult R2,R2,R3
add R2,R2,R1
sub R2,R2,R4
add R2,R2,R5
add R2,R2,R6
store A,R2
Flow of Control
- Default is sequential flow
- Several instructions alter this default execution
  - Branches
    - Unconditional
    - Conditional
    - Delayed branches
  - Procedure calls
    - Delayed procedure calls
Flow of Control (cont.)
Branches
- Unconditional
  - Absolute address
  - PC-relative
    - Target address is specified relative to PC contents
    - Relocatable code
- Example: MIPS
  - Absolute address
    j target
  - PC-relative
    b target
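PC-relative code is relocatable because the distance to the target stays the same wherever the code is loaded. The sketch below computes a target the way MIPS branches do, with the offset counted in instructions from the instruction after the branch; the function name `pc_relative_target` and the sample addresses are assumptions for illustration.

```python
def pc_relative_target(pc, word_offset, instr_bytes=4):
    """Branch target = address of the next instruction + offset in instructions."""
    return pc + instr_bytes + word_offset * instr_bytes


# The same offset reaches the same relative location at two load addresses.
target_a = pc_relative_target(0x00400000, 3)
target_b = pc_relative_target(0x00500000, 3)
```

Both targets lie 16 bytes past their branch, so the encoded offset needs no patching when the code moves; an absolute jump would need its address field rewritten.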
Flow of Control (cont.)
Branches
- Conditional
  - Jump is taken only if the condition is met
- Two types
  - Set-Then-Jump
    - Condition testing is separated from branching
    - Condition code registers are used to convey the condition test result
    - Condition code registers keep a record of the status of the last ALU operation, such as overflow condition
  - Example: Pentium code
    cmp AX,BX    ; compare AX and BX
    je target    ; jump if equal
Flow of Control (cont.)
  - Test-and-Jump
    - Single instruction performs condition testing and branching
  - Example: MIPS instruction
    beq Rsrc1,Rsrc2,target
    Jumps to target if Rsrc1 = Rsrc2
Delayed branching
- Control is transferred after executing the instruction that follows the branch instruction
  - This instruction slot is called the delay slot
- Improves efficiency
- Highly pipelined RISC processors support it
Flow of Control (cont.)
Procedure calls
- Facilitate modular programming
- Require two pieces of information to return
  - End of procedure
    - Pentium uses the ret instruction
    - MIPS uses the jr instruction
  - Return address
    - In a (special) register: MIPS allows any general-purpose register
    - On the stack: Pentium
[Figure: delay slot]
Parameter Passing
Two basic techniques
- Register-based (e.g. PowerPC, MIPS)
  - Internal registers are used
    - Faster
    - Limit the number of parameters
    - Recursive procedure
- Stack-based (e.g. Pentium)
  - Stack is used
    - More general
2 Operand Types
Instructions support basic data types
- Characters
- Integers
- Floating-point
Instruction overload
- Same instruction for different data types
- Example: Pentium
  mov AL,address    ; loads an 8-bit value
  mov AX,address    ; loads a 16-bit value
  mov EAX,address   ; loads a 32-bit value
Operand Types (cont.)
Separate instructions
- Instructions specify the operand size
- Example: MIPS
  lb Rdest,address   ; loads a byte
  lh Rdest,address   ; loads a halfword (16 bits)
  lw Rdest,address   ; loads a word (32 bits)
  ld Rdest,address   ; loads a doubleword (64 bits)
Similar instruction: store
3 Addressing Modes
How the operands are specified
- Operands can be in three places
  - Registers
    - Register addressing mode
  - Part of instruction
    - Constant
    - Immediate addressing mode
    - All processors support these two addressing modes
  - Memory
    - Difference between RISC and CISC
    - CISC supports a large variety of addressing modes
    - RISC follows load/store architecture
4 Instruction Types4 Instruction T
ypes
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 78101
Several types of instructions
ata movement$ Pentium6 mo1 destsrc
$ Some do not provide direct data movement instructions
$ Indirect data movement
add $dest$src6 $dest = $src+6
Arithmetic and 3ogical
$ Arithmetic U Integer and floating$point signed and unsigned U add subtract multiply divide
$ 3ogical U andB orB notB 7or
Instruction Types (cont.)
Condition code bits
- S: Sign bit (0 = +, 1 = -)
- Z: Zero bit (0 = nonzero, 1 = zero)
- O: Overflow bit (0 = no overflow, 1 = overflow)
- C: Carry bit (0 = no carry, 1 = carry)
Example: Pentium
    cmp count,25    ; compare count to 25
                    ; (subtract 25 from count)
    je target       ; jump if equal
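A minimal sketch of how a compare instruction could set the four condition code bits: it subtracts the second operand from the first, discards the result, and keeps only the flags. The function name `compare_flags` and the 32-bit default width are assumptions for illustration.

```python
def compare_flags(a, b, bits=32):
    """Return the S, Z, O, C flags produced by computing a - b, as a compare does."""
    mask = (1 << bits) - 1
    sign = 1 << (bits - 1)
    a &= mask
    b &= mask
    result = (a - b) & mask
    return {
        "S": int(bool(result & sign)),   # sign bit of the result
        "Z": int(result == 0),           # result is zero, i.e. a == b
        # Signed overflow: the operands differ in sign and the result's sign differs from a's.
        "O": int(bool((a ^ b) & (a ^ result) & sign)),
        # For subtraction the carry acts as a borrow: set when a < b as unsigned values.
        "C": int(a < b),
    }

if __name__ == "__main__":
    print(compare_flags(25, 25))   # equal operands set Z, so "jump if equal" is taken
    print(compare_flags(10, 25))
```

A conditional jump such as "jump if equal" then only needs to test the Z bit.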
Instruction Types (cont.)
Flow control and I/O instructions
- Branch
- Procedure call
- Interrupts
I/O instructions
- Memory-mapped I/O
  - Most processors support memory-mapped I/O
  - No separate instructions for I/O
- Isolated I/O
  - Pentium supports isolated I/O
  - Separate I/O instructions
      in AX,port     ; read from an I/O port
      out port,AX    ; write to an I/O port
5. Instruction Formats
Two types
Fixed-length
- Used by RISC processors
- 32-bit RISC processors use 32-bit-wide instructions
  - Examples: SPARC, MIPS, PowerPC
Variable-length
- Used by CISC processors
- Memory operands need more bits to specify
Opcode
- Major and exact operation
Examples of Instruction Formats
RISC (Reduced Instruction Set Computer)
versus
CISC (Complex Instruction Set Computer)
RISC vs. CISC
The underlying philosophy of RISC machines is that a system is better able to manage program execution when the program consists of only a few different instructions that are the same length and require the same number of clock cycles to decode and execute.
RISC systems access memory only with explicit load and store instructions.
In CISC systems, many different kinds of instructions access memory, making instruction length variable and fetch-decode-execute time unpredictable.
The difference between CISC and RISC becomes evident through the basic computer performance equation:
time/program = (time/cycle) × (cycles/instruction) × (instructions/program)
RISC systems shorten execution time by reducing the clock cycles per instruction.
CISC systems improve performance by reducing the number of instructions per program.
The simple instruction set of RISC machines enables control units to be hardwired for maximum speed.
The more complex, and variable, instruction set of CISC machines requires microcode-based control units that interpret instructions as they are fetched from memory. This translation takes time.
With fixed-length instructions, RISC lends itself to pipelining and speculative execution.
Consider the following program fragments:
CISC:
    mov ax, 10
    mov bx, 5
    mul bx, ax
RISC:
    mov ax, 0
    mov bx, 10
    mov cx, 5
    Begin: add ax, bx
    loop Begin
The total clock cycles for the CISC version might be:
(2 movs × 1 cycle) + (1 mul × 30 cycles) = 32 cycles
While the clock cycles for the RISC version are:
(3 movs × 1 cycle) + (5 adds × 1 cycle) + (5 loops × 1 cycle) = 13 cycles
With the RISC clock cycle being shorter, RISC gives us much faster execution speeds.
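The cycle comparison can be checked with a few lines of arithmetic. The sketch below assumes the per-instruction costs usually quoted for this example (1 cycle each for mov, add and loop, 30 cycles for mul) and a loop that runs 5 times; the `CYCLES` table and `total_cycles` helper are illustrative names only.

```python
# Assumed costs: mov, add and loop take 1 cycle each; mul takes 30 cycles.
CYCLES = {"mov": 1, "add": 1, "loop": 1, "mul": 30}

def total_cycles(executed):
    """Sum the cycles for a mapping of mnemonic -> number of times executed."""
    return sum(CYCLES[op] * count for op, count in executed.items())

# CISC fragment: two movs and one multiply.
cisc = total_cycles({"mov": 2, "mul": 1})
# RISC fragment: three movs, then add and loop executed five times each.
risc = total_cycles({"mov": 3, "add": 5, "loop": 5})

if __name__ == "__main__":
    print("CISC cycles:", cisc)
    print("RISC cycles:", risc)
```

The counts depend entirely on the assumed costs; a faster multiplier or a longer loop would shift the balance.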
Because of their load-store ISAs, RISC architectures require a large number of CPU registers.
These registers provide fast access to data during sequential program execution.
They can also be employed to reduce the overhead typically caused by passing parameters to subprograms.
Instead of pulling parameters off of a stack, the subprogram is directed to use a subset of registers.
This is how registers can be overlapped in a RISC system.
The current window pointer (CWP) points to the active register window.
It is becoming increasingly difficult to distinguish RISC architectures from CISC architectures.
Some RISC systems provide more extravagant instruction sets than some CISC systems.
Some systems combine both approaches.
The following two slides summarize the characteristics that traditionally typify the differences between these two architectures.
RISC
- Multiple register sets.
- Three operands per instruction.
- Parameter passing through register windows.
- Single-cycle instructions.
- Hardwired control.
- Highly pipelined.
CISC
- Single register set.
- One or two register operands per instruction.
- Parameter passing through memory.
- Multiple cycle instructions.
- Microprogrammed control.
- Less pipelined.
Continued...
RISC
- Simple instructions, few in number.
- Fixed length instructions.
- Complexity in compiler.
- Only LOAD/STORE instructions access memory.
- Few addressing modes.
CISC
- Many complex instructions.
- Variable length instructions.
- Complexity in microcode.
- Many instructions can access memory.
- Many addressing modes.
Summary
Instruction Set Design Issues
Instruction set design issues include:
Where are operands stored?
- registers, memory, stack, accumulator
How many explicit operands are there?
- 0, 1, 2, or 3
How is the operand location specified?
- register, immediate, indirect, ...
What type & size of operands are supported?
- byte, int, float, double, string, vector, ...
What operations are supported?
- add, sub, mul, move, compare, ...
More About General Purpose Registers
Why do almost all new architectures use GPRs?
Registers are much faster than memory (even cache)
- Register values are available immediately
- When memory isn't ready, processor must wait ("stall")
Registers are convenient for variable storage
- Compiler assigns some variables just to registers
- More compact code since small fields specify registers (compared to memory addresses)
Figure: Processor (Registers, Cache), Memory, Disk
What Operations are Needed
Arithmetic & Logical
- Integer arithmetic: ADD, SUB, MULT, DIV, SHIFT
- Logical operation: AND, OR, XOR, NOT
Data Transfer - copy, load, store
Control - branch, jump, call, return
Floating Point - ADDF, MULF, DIVF
- Same as arithmetic but usually take bigger operands
Decimal - ADDD, CONVERT
String - move, compare, search
Graphics - pixel and vertex, compression/decompression operations
Stacks Architecture Pros and Cons
Pros
- Good code density (implicit top of stack)
- Low hardware requirements
- Easy to write a simpler compiler for stack architectures
Cons
- Stack becomes the bottleneck
- Little ability for parallelism or pipelining
- Data is not always at the top of stack when needed, so additional instructions like TOP and SWAP are needed
- Difficult to write an optimizing compiler for stack architectures
Accumulators Architecture Pros and Cons
Pros
- Very low hardware requirements
- Easy to design and understand
Cons
- Accumulator becomes the bottleneck
- Little ability for parallelism or pipelining
- High memory traffic
Memory-Memory Architecture Pros and Cons
Pros
- Requires fewer instructions (especially if 3 operands)
- Easy to write compilers for (especially if 3 operands)
Cons
- Very high memory traffic (especially if 3 operands)
- Variable number of clocks per instruction
- With two operands, more data movements are required
Memory-Register Architecture Pros and Cons
Pros
- Some data can be accessed without loading first
- Instruction format easy to encode
- Good code density
Cons
- Operands are not equivalent (poor orthogonality)
- Variable number of clocks per instruction
- May limit number of registers
Addressing Mode for Signal Processing
DSP offers special addressing modes to better serve popular algorithms
Special features require either hand coding or a compiler that uses such features (GNU would not be a good choice)
Modulo addressing
- Since DSP deals with continuous data streams, circular buffers are widely used
- Circular or modulo addressing allows automatic increment and decrement and resets the pointer when reaching the end of the buffer
Reverse addressing
- Resulting address is the reverse order of the current address
- Reverse addressing mode expedites the access, which otherwise requires a number of logical instructions or extra memory accesses
Fast Fourier Transform
    0 (000)  ->  0 (000)
    1 (001)  ->  4 (100)
    2 (010)  ->  2 (010)
    3 (011)  ->  6 (110)
    4 (100)  ->  1 (001)
    5 (101)  ->  5 (101)
    6 (110)  ->  3 (011)
    7 (111)  ->  7 (111)
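The two DSP addressing modes can be sketched in software to show what the hardware does in a single step. The helper names `bit_reverse` and `modulo_next` and the buffer bounds used below are assumptions for illustration; a DSP performs the equivalent work inside its address generation unit.

```python
def bit_reverse(index, bits):
    """Reverse the low `bits` bits of index, e.g. 0b001 -> 0b100 for bits=3."""
    result = 0
    for _ in range(bits):
        result = (result << 1) | (index & 1)
        index >>= 1
    return result

def modulo_next(pointer, start, length, step=1):
    """Advance a circular-buffer pointer, wrapping inside [start, start + length)."""
    return start + (pointer - start + step) % length

if __name__ == "__main__":
    # Access order used by an 8-point FFT with reverse addressing.
    print([bit_reverse(i, 3) for i in range(8)])
    # A pointer at the last slot of a 4-entry buffer starting at address 4 wraps to the start.
    print(modulo_next(7, start=4, length=4))
```

Without these modes, each access would need the shift, mask, and compare instructions that the loop and the modulo expression stand in for here.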
Operations of the Computer Hardware
"There must certainly be instructions for performing the fundamental arithmetic operations."
Burks, Goldstine and von Neumann, 1947
Assembly language is a symbolic representation of what the processor actually understands
MIPS assembler allows only one instruction per line and ignores comments following # until end of line
Example:
Translation of a segment of a C program to MIPS assembly instructions
C: f = (g + h) - (i + j)
MIPS:
    add t0, g, h     # temp variable t0 contains "g + h"
    add t1, i, j     # temp variable t1 contains "i + j"
    sub f, t0, t1    # f = t0 - t1 = (g + h) - (i + j)
Operations in the Instruction Set
Operator type            Examples
Arithmetic and logical   Integer arithmetic and logical operations: add, and, subtract, or
Data Transfer            Loads-stores (move instructions on machines with memory addressing)
Control                  Branch, jump, procedure call and return, trap
System                   Operating system call, virtual memory management instructions
Floating point           Floating point instructions: add, multiply
Decimal                  Decimal add, decimal multiply, decimal to character conversion
String                   String move, string compare, string search
Graphics                 Pixel operations, compression/decompression operations
Arithmetic, logical, data transfer and control are almost standard categories for all machines
System instructions are required for multi-programming environments, although support for system functions varies
Decimal and string instructions can be primitives, e.g., IBM 360 and the VAX
Support for floating point, decimal, string and graphics can be optional, sometimes provided via a co-processor
Some machines rely on the compiler to synthesize special operations, such as string handling, from simpler instructions
Operations for Media & Signal Processing
Single instruction multiple data (SIMD) and vector instructions are often supported in DSPs, which are commonly used in multimedia and signal processing applications
Partitioned Add (integer)
- Perform multiple 16-bit additions on a 64-bit ALU since most data are narrow
- Increases ALU throughput for multimedia applications
Paired single operations (float)
- Allow the same register to act as two operands to the same operation
- Handy in dealing with vertices and coordinates
Multiply and accumulate
- Very handy for calculating dot products of vectors (signal processing) and matrix multiplication
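A rough software model of two of these operations follows. It assumes four 16-bit lanes packed into one 64-bit word for the partitioned add; the lane layout and the helper names (`pack`, `unpack`, `partitioned_add`, `multiply_accumulate`) are illustrative choices, not a description of any particular DSP.

```python
LANES = 4                 # assumed number of lanes
LANE_BITS = 16            # assumed lane width; 4 x 16 bits fill a 64-bit word
LANE_MASK = (1 << LANE_BITS) - 1

def pack(values):
    """Pack a list of lane values into one integer, lane 0 in the low bits."""
    word = 0
    for i, v in enumerate(values):
        word |= (v & LANE_MASK) << (i * LANE_BITS)
    return word

def unpack(word):
    """Split a packed word back into its lane values."""
    return [(word >> (i * LANE_BITS)) & LANE_MASK for i in range(LANES)]

def partitioned_add(a, b):
    """Add lane by lane; a carry out of one lane never reaches the next lane."""
    return pack([x + y for x, y in zip(unpack(a), unpack(b))])

def multiply_accumulate(xs, ys):
    """Dot product built from repeated multiply-and-accumulate steps."""
    acc = 0
    for x, y in zip(xs, ys):
        acc += x * y
    return acc

if __name__ == "__main__":
    total = partitioned_add(pack([1, 2, 0xFFFF, 4]), pack([10, 20, 1, 40]))
    print(unpack(total))
    print(multiply_accumulate([1, 2, 3], [4, 5, 6]))
```

The third lane wraps around to zero instead of spilling into the fourth, which is what distinguishes a partitioned add from an ordinary 64-bit add.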
Frequency of Operations Usage
The most widely executed instructions are the simple operations of an instruction set
The following is the average usage in SPECint92 on Intel 80x86
Rank   80x86 Instruction          Integer Average (% total executed)
1      Load                       22%
2      Conditional branch         20%
3      Compare                    16%
4      Store                      12%
5      Add                        8%
6      And                        6%
7      Sub                        5%
8      Move register-register     4%
9      Call                       1%
10     Return                     1%
       Total                      96%
Make the common case fast by focusing on these operations
Control Flow Instructions
Jump for unconditional change in the control flow
Branch for conditional change in the control flow
Procedure calls and returns
Data is based on SPEC on Alpha
Destination Address Definition
Relative addressing w.r.t. the program counter proved to be the best choice for forward and backward branching or jumps (load address independent)
To allow for dynamic loading of library routines, register indirect addressing allows addresses to be loaded in special registers (e.g., virtual functions in C++ and system calls in a case statement)
Data is based on SPEC on Alpha
Condition Evaluation
Compare-and-branch can be efficient if the majority of conditions are comparisons with zero
Remember to focus on the common case
Based on SPEC on MIPS
Frequency of Types of Comparison
Data is based on SPEC on Alpha
Different benchmarks and machines set new design priorities
DSPs support a repeat instruction for "for" loops (vectors) using registers
Supporting Procedures
Execution of a procedure follows these steps:
- Store parameters in a place accessible to the procedure
- Transfer control to the procedure
- Acquire the storage resources needed for the procedure
- Perform the desired task
- Store the result value in a place accessible to the calling program
- Return control to the point of origin
The hardware provides a program counter to trace instruction flow and manage transfer of control
Parameter Passing
- Registers can be used for passing a small number of parameters
- A stack is used to spill registers of the current context and make room for the called procedure to run, and to allow for large parameters to be passed
- Storage of machine state can be performed by caller or callee
- Handling of shared variables is important to ensure correct semantics and thus requires clear specifications in the library interface
Global variables stored in registers need careful handling
Type and Size of Operands
The type of an operand is designated by encoding it in the instruction's operation code
The type of an operand, e.g., single precision float, effectively gives its size
Common operand types include character, half word and word size integer, and single- and double-precision floating point
Characters are almost always in ASCII, integers are in 2's complement, and floating point is in IEEE 754
The 16-bit Unicode used in Java is gaining popularity due to its support for the international character sets
For business applications, some architectures support a decimal format in binary coded decimal (BCD)
Depending on the size of the word, the complexity of handling different operand types differs
DSP offers fixed point data types to support high precision floating point arithmetic and to allow sharing a single exponent for multiple numbers
For graphics applications, vertex and pixel operands are added features
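The three number formats named above can be inspected directly. The sketch below is illustrative only; the helper names `twos_complement`, `ieee754_single_bits`, and `to_bcd` are made up for this example, and `struct` is Python's standard binary packing module.

```python
import struct

def twos_complement(value, bits):
    """Bit pattern of a signed integer in two's complement with the given width."""
    return value & ((1 << bits) - 1)

def ieee754_single_bits(x):
    """Bit pattern of x as an IEEE 754 single-precision float."""
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    return bits

def to_bcd(n):
    """Pack a non-negative integer as binary coded decimal, one digit per 4 bits."""
    result = 0
    for shift, digit in enumerate(reversed(str(n))):
        result |= int(digit) << (4 * shift)
    return result

if __name__ == "__main__":
    print(hex(twos_complement(-1, 8)))
    print(hex(ieee754_single_bits(1.0)))
    print(hex(to_bcd(1947)))
```

In BCD the hexadecimal view of the encoded value reads the same as the decimal number, which is why the format suits business arithmetic.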
Size of Operands
Double-word data type is used for double-precision floating point operations and address storage in machines with a 64-bit wide address bus
Words are used for integer operations and for 32-bit address bus machines
Because of the mix in SPEC, word and double-word data types dominate
Frequency of reference by size based on SPEC2000 on Alpha
Instruction Representation
Humans are taught to think in base 10 (decimal), but numbers may be represented in any base (123 in base 10 = 1111011 in binary or base 2)
Numbers are stored in computers as a series of high and low electronic signals (binary numbers)
Binary digits are called bits and considered the atom of computing
Each piece of an instruction is a number, and placing these numbers together forms the instruction
Assembler translates the assembly symbolic instructions into machine language instructions (machine code)
Example:
Assembly: add $t0, $s1, $s2
M/C language (decimal):   0        17       18       8        0        32
M/C language (binary):    000000   10001    10010    01000    00000    100000
                          6 bits   5 bits   5 bits   5 bits   5 bits   6 bits
Note: MIPS compiler by default maps $s0,...,$s7 to reg. 16-23 and $t0,...,$t7 to reg. 8-15
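As a sketch of how the six numeric fields are placed together into one 32-bit instruction word, the function below shifts each field into position. It assumes the standard MIPS register-format layout (6, 5, 5, 5, 5, 6 bits) and the usual register numbers ($t0 = 8, $s1 = 17, $s2 = 18); `encode_r_type` is an illustrative helper, not an assembler API.

```python
def encode_r_type(op, rs, rt, rd, shamt, funct):
    """Pack the six R-format fields (6, 5, 5, 5, 5, 6 bits) into one 32-bit word."""
    return (op << 26) | (rs << 21) | (rt << 16) | (rd << 11) | (shamt << 6) | funct

# add $t0, $s1, $s2: op = 0, rs = $s1 = 17, rt = $s2 = 18, rd = $t0 = 8, shamt = 0, funct = 32
word = encode_r_type(0, 17, 18, 8, 0, 32)

if __name__ == "__main__":
    print(format(word, "032b"))
    print(hex(word))
```

Reading the printed bit string in groups of 6, 5, 5, 5, 5 and 6 bits gives back the decimal fields of the instruction.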
Encoding an Instruction Set
Instruction encoding affects the size of the compiled program and the complexity of the CPU implementation
The operation is typically specified in one field called opcode
The addressing mode for the operand can be encoded with the operation or specified through a separate identifier in case of a large number of supported modes
The architecture must balance between several competing factors:
- Desire to support as many registers and addressing modes as possible
- Effect of operand specification on the size of the instruction (program)
- Desire to simplify instruction fetching and decoding during execution
Fixed size instruction encoding simplifies the CPU design while limiting the addressing modes supported
An architect caring about the code size can use variable size encoding
A hybrid approach is to allow variability by supporting multiple-sized instructions
Encoding Examples
MIPS Instruction format
Register-format instructions:
    op       rs       rt       rd       shamt    funct
    6 bits   5 bits   5 bits   5 bits   5 bits   6 bits
- op: Basic operation of the instruction, traditionally called opcode
- rs: The first register source operand
- rt: The second register source operand
- rd: The register destination operand; it gets the result of the operation
- shamt: Shift amount
- funct: This field selects the specific variant of the operation of the op field
Immediate-type instructions:
    op       rs       rt       address
    6 bits   5 bits   5 bits   16 bits
- Some instructions need longer fields than provided for large value constants
- The 16-bit address means a load word instruction can load a word within a region of ±2^15 bytes of the address in the base register
- Example: lw $t0, 32($s3)    # Temporary register $t0 gets A[8]
Instruction   Format   op   rs    rt    rd    shamt   funct   address
add           R        0    reg   reg   reg   0       32      N/A
sub           R        0    reg   reg   reg   0       34      N/A
lw            I        35   reg   reg   N/A   N/A     N/A     address
sw            I        43   reg   reg   N/A   N/A     N/A     address
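A small sketch of the immediate format shows why a 16-bit address field limits a load to offsets of roughly ±2^15 bytes from the base register. It assumes the standard MIPS layout (6, 5, 5, 16 bits), opcode 35 for load word, and register numbers $s3 = 19 and $t0 = 8; `encode_i_type` and `decode_offset` are hypothetical helper names.

```python
def encode_i_type(op, rs, rt, offset):
    """Pack the I-format fields (6, 5, 5, 16 bits); offset must fit in a signed 16-bit field."""
    if not -(1 << 15) <= offset < (1 << 15):
        raise ValueError("offset does not fit in 16 bits")
    return (op << 26) | (rs << 21) | (rt << 16) | (offset & 0xFFFF)

def decode_offset(word):
    """Recover the sign-extended 16-bit offset from an I-format word."""
    raw = word & 0xFFFF
    return raw - (1 << 16) if raw & 0x8000 else raw

# lw $t0, 32($s3): op = 35, rs = $s3 = 19, rt = $t0 = 8, offset = 32
lw_word = encode_i_type(35, 19, 8, 32)

if __name__ == "__main__":
    print(hex(lw_word))
    print(decode_offset(encode_i_type(35, 19, 8, -4)))
```

An offset outside the signed 16-bit range cannot be encoded at all, so the assembler or compiler would have to build the address with extra instructions.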
The Stored Program Concept
Learning how instructions are represented leads to discovering the secret of computing: the stored-program concept
Today's computers are built on two key principles:
- Instructions are represented as numbers
- Programs can be stored in memory to be read or written just like numbers
The power of the concept:
- memory can contain:
  - the source code for an editor
  - the compiled m/c code for the editor
  - the text that the compiled program is using
  - the compiler that generated the code
Figure: a Processor connected to Memory holding: Accounting program (machine code), Editor program (machine code), C compiler (machine code), Payroll data, Book text, Source code in C for editor program
Compiling if-then-else in MIPS
Assuming the five variables f, g, h, i, and j correspond to the five registers $s0 through $s4, what is the compiled MIPS code for the following C if statement:
if (i == j) f = g + h; else f = g - h;
Flowchart: test i == j; if i = j, f = g + h; if i ≠ j, go to Else: f = g - h; both paths meet at Exit:
MIPS:
          bne $s3, $s4, Else     # go to Else if i ≠ j
          add $s0, $s1, $s2      # f = g + h (skipped if i ≠ j)
          j Exit
    Else: sub $s0, $s1, $s2      # f = g - h (skipped if i = j)
    Exit:
Typical Compilation
Major Types of Optimization
Optimization name: explanation (frequency; N.M. = not measured)
High-Level (at or near source level; machine independent)
- Procedure integration: replace procedure call by procedure body (N.M.)
Local (within straight line code)
- Common sub-expression elimination: replace two instances of the same computation by a single copy (18%)
- Constant propagation: replace all instances of a variable that is assigned a constant with the constant (22%)
- Stack height reduction: rearrange expression tree to minimize resources needed for expression evaluation (N.M.)
Global (across a branch)
- Global common subexpression elimination: same as local, but this version crosses branches (13%)
- Copy propagation: replace all instances of a variable A that has been assigned X (i.e., A = X) with X (11%)
- Code motion: remove code from a loop that computes the same value each iteration of the loop (16%)
- Induction variable elimination: simplify/eliminate array addressing calculations within loops (2%)
Machine-dependent (depends on machine knowledge)
- Strength reduction: many examples, such as replacing a multiply by a constant with adds and shifts (N.M.)
- Pipeline scheduling: reorder instructions to improve pipeline performance (N.M.)
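Strength reduction is the easiest of these optimizations to show in a few lines. The sketch below replaces a multiply by the constant 10 with two shifts and an add; the constant and the function names are chosen only for illustration, and a real compiler would apply the rewrite to machine instructions rather than source code.

```python
def times_ten_multiply(x):
    """Original form: one multiply by a constant."""
    return x * 10

def times_ten_shift_add(x):
    """Strength-reduced form: 10*x = 8*x + 2*x = (x << 3) + (x << 1)."""
    return (x << 3) + (x << 1)

if __name__ == "__main__":
    for value in (0, 1, 7, -3):
        print(value, times_ten_multiply(value), times_ten_shift_add(value))
```

Whether the rewrite pays off depends on the relative cost of multiply, shift and add on the target machine, which is why it is classed as machine-dependent.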
Effect of Compiler Optimization
Measurements taken on MIPS
Figure axis: Program and Compiler Optimization Level
Level 0: non-optimized code
Level 1: local optimization
Level 2: global optimization, s/w pipelining
Level 3: adds procedure integration
Compiler Support for Multimedia Instr.
Intel's MMX and PowerPC AltiVec have small vector processing capabilities targeting multimedia applications (to speed up graphics)
Intel added a new set of instructions called Streaming SIMD Extension
A major advantage of vector computers is hiding the latency of memory access by loading multiple elements and then overlapping execution with data transfer
Vector computers typically have strided and/or gather/scatter addressing to perform operations on distant memory locations
- Strided addressing allows memory access in increments larger than one
- Gather/scatter addressing is similar to register indirect mode, where the addresses are stored instead of the data
- Supporting vector operation without strided addressing, such as Intel's MMX, limits the potential speedup
Such limited support for vector processing makes the use of vectorizing compiler optimization unpopular and restricts its scope to hand coded routines
SIMD instructions on MMX and AltiVec tend to be solutions, not primitives
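Strided and gather/scatter addressing can be modeled as simple index calculations over a flat memory. The sketch below is a software analogy only; `strided_load`, `gather`, and `scatter` are hypothetical names, and a vector machine would perform each of these as one vector memory instruction.

```python
def strided_load(memory, base, stride, count):
    """Read `count` elements starting at `base`, stepping by `stride` each time."""
    return [memory[base + i * stride] for i in range(count)]

def gather(memory, addresses):
    """Read the elements whose addresses are listed in an index vector."""
    return [memory[a] for a in addresses]

def scatter(memory, addresses, values):
    """Write each value to the address at the same position in the index vector."""
    for a, v in zip(addresses, values):
        memory[a] = v

if __name__ == "__main__":
    memory = list(range(100, 116))
    print(strided_load(memory, base=1, stride=4, count=3))   # e.g. one column of a 4-wide matrix
    print(gather(memory, [7, 0, 3]))
```

A stride equal to the row length walks down one column of a row-major matrix, which is the kind of access that unit-stride-only SIMD extensions handle poorly.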
Starting a Program
C program -> Compiler -> Assembly language program -> Assembler -> Object: Machine language module
Object: Machine language module + Object: Library routine (machine language) -> Linker -> Executable: Machine language program -> Loader -> Memory
Linker
- Places code & data modules symbolically in memory
- Determines the address of data & instruction labels
- Patches both internal & external references
Object files for Unix typically contain:
- Header: size & position of components
- Text segment: machine code
- Data segment: static and dynamic variables
- Relocation info: identifies absolute memory references
- Symbol table: name & location of labels, procedures and variables
- Debugging info: mapping source to object code, break points, etc.
Loading Executable Program
Memory layout:
    $sp -> 7fff ffff hex    Stack
                            Dynamic data
    $gp -> 1000 8000 hex    Static data
           1000 0000 hex
                            Text
    pc  -> 0040 0000 hex
                            Reserved
           0
To load an executable, the operating system follows these steps:
- Reads the executable file header to determine the size of the text and data segments
- Creates an address space large enough for the text and data
- Copies the instructions and data from the executable file into memory
- Copies the parameters (if any) to the main program onto the stack
- Initializes the machine registers and sets the stack pointer to the first free location
- Jumps to a start-up routine that copies the parameters into the argument registers and calls the main routine of the program
Instruction Set Design Issues
- Number of Addresses
- Flow of Control
- Operand Types
- Addressing Modes
- Instruction Types
- Instruction Formats
Number of Addresses
Four categories
- 3-address machines
  - 2 for the source operands and one for the result
- 2-address machines
  - One address doubles as source and result
- 1-address machines
  - Accumulator machines
  - Accumulator is used for one source and result
- 0-address machines
  - Stack machines
  - Operands are taken from the stack
  - Result goes onto the stack
Number of Addresses (cont.)
Three-address machines
- Two for the source operands, one for the result
- RISC processors use three addresses
- Sample instructions
    add dest,src1,src2     ; M(dest)=[src1]+[src2]
    sub dest,src1,src2     ; M(dest)=[src1]-[src2]
    mult dest,src1,src2    ; M(dest)=[src1]*[src2]
Three addresses: Operand 1, Operand 2, Result
Example: a = b + c
Three-address instruction formats are not common because they require a relatively long instruction format to hold the three address references
Number of Addresses (cont.)
Example
C statement
    A = B + C * D - E + F + A
Equivalent code:
    mult T,C,D    ; T = C*D
    add T,T,B     ; T = B+C*D
    sub T,T,E     ; T = B+C*D-E
    add T,T,F     ; T = B+C*D-E+F
    add A,T,A     ; A = B+C*D-E+F+A
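To check that the three-address sequence computes the C statement, the instructions can be run through a tiny interpreter. The tuple encoding, the `run_three_address` helper, and the sample values for A through F are assumptions made for this sketch.

```python
import operator

# Each mnemonic maps to the arithmetic operation it performs.
OPS = {"add": operator.add, "sub": operator.sub, "mult": operator.mul}

def run_three_address(program, memory):
    """Execute (op, dest, src1, src2) tuples against a dict of named locations."""
    for op, dest, src1, src2 in program:
        memory[dest] = OPS[op](memory[src1], memory[src2])
    return memory

program = [
    ("mult", "T", "C", "D"),   # T = C*D
    ("add", "T", "T", "B"),    # T = B+C*D
    ("sub", "T", "T", "E"),    # T = B+C*D-E
    ("add", "T", "T", "F"),    # T = B+C*D-E+F
    ("add", "A", "T", "A"),    # A = B+C*D-E+F+A
]
memory = run_three_address(program, {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6})

if __name__ == "__main__":
    print(memory["A"])
```

Five instructions suffice because every instruction names both of its sources and its destination.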
Number of Addresses (cont.)
Two-address machines
- One address doubles (for source operand & result)
- Last example makes a case for it
  - Address T is used twice
- Sample instructions
    load dest,src    ; M(dest)=[src]
    add dest,src     ; M(dest)=[dest]+[src]
    sub dest,src     ; M(dest)=[dest]-[src]
    mult dest,src    ; M(dest)=[dest]*[src]
Two addresses: one address doubles as operand and result
Example: a = a + b
The two-address format reduces the space requirement but also introduces some awkwardness. To avoid altering the value of an operand, a MOVE instruction is used to move one of the values to a result or temporary location before performing the operation
Number of Addresses (cont.)
Example
C statement
    A = B + C * D - E + F + A
Equivalent code:
    load T,C    ; T = C
    mult T,D    ; T = C*D
    add T,B     ; T = B+C*D
    sub T,E     ; T = B+C*D-E
    add T,F     ; T = B+C*D-E+F
    add A,T     ; A = B+C*D-E+F+A
Number of Addresses (cont.)
One-address machines
- Use special set of registers called accumulators
  - Specify one source operand & receive the result
- Called accumulator machines
- Sample instructions
    load addr     ; accum = [addr]
    store addr    ; M[addr] = accum
    add addr      ; accum = accum + [addr]
    sub addr      ; accum = accum - [addr]
    mult addr     ; accum = accum * [addr]
Number of Addresses (cont.)
Example
C statement
    A = B + C * D - E + F + A
Equivalent code:
    load  C    ; load C into accum
    mult  D    ; accum = C*D
    add   B    ; accum = C*D + B
    sub   E    ; accum = B + C*D - E
    add   F    ; accum = B + C*D - E + F
    add   A    ; accum = B + C*D - E + F + A
    store A    ; store accum contents in A
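The accumulator sequence above can be traced with a short simulation. This is a sketch under stated assumptions: `run_accumulator` and the operand values are hypothetical, and a single accumulator is modeled as a Python variable.

```python
def run_accumulator(program, memory):
    """Run (op, addr) pairs on a one-address machine with one accumulator."""
    accum = 0
    for op, addr in program:
        if op == "load":
            accum = memory[addr]
        elif op == "store":
            memory[addr] = accum
        elif op == "add":
            accum += memory[addr]
        elif op == "sub":
            accum -= memory[addr]
        elif op == "mult":
            accum *= memory[addr]
        else:
            raise ValueError(f"unknown instruction: {op}")
    return memory


acc_program = [
    ("load", "C"), ("mult", "D"), ("add", "B"), ("sub", "E"),
    ("add", "F"), ("add", "A"), ("store", "A"),
]
# Hypothetical operand values, used only to exercise the sequence.
acc_memory = run_accumulator(acc_program, {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6})
print(acc_memory["A"])
```

Every instruction except the implicit accumulator access touches memory, which is where the high memory traffic of this style comes from.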
Number of Addresses (cont.)
Zero-address machines
- Stack supplies operands and receives the result
  - Special instructions to load and store use an address
- Called stack machines (Ex: HP3000, Burroughs B5500)
Sample instructions
    push addr    ; push([addr])
    pop  addr    ; pop([addr])
    add          ; push(pop + pop)
    sub          ; push(pop - pop)
    mult         ; push(pop * pop)
Number of Addresses (cont.)
Example
C statement
    A = B + C * D - E + F + A
Equivalent code:
    push E
    push C
    push D
    mult
    push B
    add
    sub
    push F
    add
    push A
    add
    pop  A
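The stack sequence can be traced the same way. The sketch below assumes the slide's definition of the arithmetic instructions, push(pop op pop), where the first pop (the top of the stack) is the left operand; `run_stack` and the operand values are hypothetical.

```python
def run_stack(program, memory):
    """Run a zero-address program; only push and pop carry an address."""
    stack = []
    for instr in program:
        op = instr[0]
        if op == "push":
            stack.append(memory[instr[1]])
        elif op == "pop":
            memory[instr[1]] = stack.pop()
        else:
            first = stack.pop()    # top of stack, left operand
            second = stack.pop()   # next entry, right operand
            if op == "add":
                stack.append(first + second)
            elif op == "sub":
                stack.append(first - second)
            elif op == "mult":
                stack.append(first * second)
            else:
                raise ValueError(f"unknown instruction: {op}")
    return memory


stack_program = [
    ("push", "E"), ("push", "C"), ("push", "D"), ("mult",),
    ("push", "B"), ("add",), ("sub",), ("push", "F"), ("add",),
    ("push", "A"), ("add",), ("pop", "A"),
]
# Hypothetical operand values, used only to exercise the sequence.
stack_memory = run_stack(stack_program, {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6})
print(stack_memory["A"])
```

E is pushed first so that it sits below B + C*D when sub executes, which is why the operand order in the program differs from the order in the C statement.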
Load/Store Architecture
- Instructions expect operands in internal processor registers
- Special LOAD and STORE instructions move data between registers and memory
- RISC uses this architecture
- Reduces instruction length
Load/Store Architecture (cont.)
Sample instructions
    load  Rd,addr       ; Rd = [addr]
    store addr,Rs       ; (addr) = Rs
    add   Rd,Rs1,Rs2    ; Rd = Rs1 + Rs2
    sub   Rd,Rs1,Rs2    ; Rd = Rs1 - Rs2
    mult  Rd,Rs1,Rs2    ; Rd = Rs1 * Rs2
Number of Addresses (cont.)
Example
C statement
    A = B + C * D - E + F + A
Equivalent code:
    load  R1,B
    load  R2,C
    load  R3,D
    load  R4,E
    load  R5,F
    load  R6,A
    mult  R2,R2,R3
    add   R2,R2,R1
    sub   R2,R2,R4
    add   R2,R2,R5
    add   R2,R2,R6
    store A,R2
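A load/store machine keeps arithmetic entirely in registers. The following sketch assumes a register file modeled as a dictionary and the register numbering R1 to R6 for B, C, D, E, F and A; `run_load_store` and the operand values are hypothetical.

```python
import operator

ALU = {"add": operator.add, "sub": operator.sub, "mult": operator.mul}


def run_load_store(program, memory):
    """Only load and store touch memory; arithmetic uses registers."""
    regs = {}
    for op, *args in program:
        if op == "load":
            rd, addr = args
            regs[rd] = memory[addr]
        elif op == "store":
            addr, rs = args
            memory[addr] = regs[rs]
        else:
            rd, rs1, rs2 = args
            regs[rd] = ALU[op](regs[rs1], regs[rs2])
    return memory


ls_program = [
    ("load", "R1", "B"), ("load", "R2", "C"), ("load", "R3", "D"),
    ("load", "R4", "E"), ("load", "R5", "F"), ("load", "R6", "A"),
    ("mult", "R2", "R2", "R3"), ("add", "R2", "R2", "R1"),
    ("sub", "R2", "R2", "R4"), ("add", "R2", "R2", "R5"),
    ("add", "R2", "R2", "R6"), ("store", "A", "R2"),
]
# Hypothetical operand values, used only to exercise the sequence.
ls_memory = run_load_store(ls_program, {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6})
print(ls_memory["A"])
```

The program is longer than the accumulator version, but memory is accessed only seven times: six loads and one store.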
Flow of Control
- Default is sequential flow
- Several instructions alter this default execution
  - Branches
    - Unconditional
    - Conditional
    - Delayed branches
  - Procedure calls
    - Delayed procedure calls
Flow of Control (cont.)
Branches
- Unconditional
  - Absolute address
  - PC-relative
    - Target address is specified relative to PC contents
    - Relocatable code
- Example: MIPS
  - Absolute address
        j target
  - PC-relative
        b target
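The difference between the two target calculations can be sketched numerically. The functions below follow the MIPS32 convention (branch offsets are signed word counts relative to the instruction after the branch; the jump instruction supplies a 26-bit word index); the function names and the sample addresses are assumptions for illustration.

```python
def pc_relative_target(pc, offset_words):
    """Branch target = address of the next instruction + 4 * signed word offset."""
    return (pc + 4 + 4 * offset_words) & 0xFFFFFFFF


def absolute_jump_target(pc, index26):
    """Jump target = upper 4 bits of (pc + 4) joined with the 26-bit index << 2."""
    return ((pc + 4) & 0xF0000000) | ((index26 & 0x03FFFFFF) << 2)


# The same encoded offset works wherever the code is loaded (relocatable code).
for base in (0x00400000, 0x00401000):
    target = pc_relative_target(base, 3)
    print(hex(base), "->", hex(target), "distance", target - base)

print(hex(absolute_jump_target(0x00400000, 0x00100008)))
```

Moving the code changes the absolute target but not the distance between branch and target, so a PC-relative encoding needs no patching when the program is relocated.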
Flow of Control (cont.)
Branches
- Conditional
  - Jump is taken only if the condition is met
- Two types
  - Set-Then-Jump
    - Condition testing is separated from branching
    - Condition code registers are used to convey the condition test result
    - Condition code registers keep a record of the status of the last ALU operation, such as an overflow condition
  - Example: Pentium code
        cmp AX,BX     ; compare AX and BX
        je  target    ; jump if equal
Flow of Control (cont.)
  - Test-and-Jump
    - Single instruction performs condition testing and branching
  - Example: MIPS instruction
        beq Rsrc1,Rsrc2,target    ; jumps to target if Rsrc1 = Rsrc2
Delayed branching
- Control is transferred after executing the instruction that follows the branch instruction
  - This instruction slot is called the delay slot
- Improves efficiency
- Highly pipelined RISC processors support it
Flow of Control (cont.)
Procedure calls
- Facilitate modular programming
- Require two pieces of information to return
  - End of procedure
    - Pentium uses the ret instruction
    - MIPS uses the jr instruction
  - Return address
    - In a (special) register: MIPS allows any general-purpose register
    - On the stack: Pentium
Flow of Control (cont.)
[Figure: delay slot]
Parameter Passing
Two basic techniques
- Register-based (e.g., PowerPC, MIPS)
  - Internal registers are used
    - Faster
    - Limits the number of parameters
    - Recursive procedures
- Stack-based (e.g., Pentium)
  - Stack is used
    - More general
2 Operand Types
- Instructions support basic data types
  - Characters
  - Integers
  - Floating-point
- Instruction overload
  - Same instruction for different data types
  - Example: Pentium
        mov AL,address     ; loads an 8-bit value
        mov AX,address     ; loads a 16-bit value
        mov EAX,address    ; loads a 32-bit value
Operand Types
- Separate instructions
  - Instructions specify the operand size
  - Example: MIPS
        lb Rdest,address    ; loads a byte
        lh Rdest,address    ; loads a halfword (16 bits)
        lw Rdest,address    ; loads a word (32 bits)
        ld Rdest,address    ; loads a doubleword (64 bits)
  - Similar instructions exist for store
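The effect of size-specific loads can be illustrated with Python's `struct` module. This is a sketch under assumptions: the memory contents are invented, big-endian byte order is assumed (MIPS can be configured either way), and the helper `load` is hypothetical. The signed formats mirror the sign extension performed by lb, lh and lw.

```python
import struct

# Hypothetical memory contents; the first byte has its top bit set.
memory = bytes([0x80, 0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07])


def load(fmt, address):
    """Read a signed big-endian value of the size implied by fmt."""
    return struct.unpack_from(">" + fmt, memory, address)[0]


lb = load("b", 0)   # byte, 8 bits
lh = load("h", 0)   # halfword, 16 bits
lw = load("i", 0)   # word, 32 bits
ld = load("q", 0)   # doubleword, 64 bits

for name, value in (("lb", lb), ("lh", lh), ("lw", lw), ("ld", ld)):
    print(name, value)
```

The same address yields four different values because each instruction fixes how many bytes are read and how the result is extended.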
3 Addressing Modes
How the operands are specified
- Operands can be in three places
  - Registers
    - Register addressing mode
  - Part of instruction
    - Constant
    - Immediate addressing mode
    - All processors support these two addressing modes
  - Memory
    - Difference between RISC and CISC
    - CISC supports a large variety of addressing modes
    - RISC follows load/store architecture
4 Instruction Types
Several types of instructions
- Data movement
  - Pentium: mov dest,src
  - Some do not provide direct data movement instructions
  - Indirect data movement
        add Rdest,Rsrc,0    ; Rdest = Rsrc + 0
- Arithmetic and Logical
  - Arithmetic
    - Integer and floating-point, signed and unsigned
    - add, subtract, multiply, divide
  - Logical
    - and, or, not, xor
Instruction Types (cont.)
Condition code bits
- S: Sign bit (0 = +, 1 = -)
- Z: Zero bit (0 = nonzero, 1 = zero)
- O: Overflow bit (0 = no overflow, 1 = overflow)
- C: Carry bit (0 = no carry, 1 = carry)
Example: Pentium
    cmp count,25    ; compare count to 25 (subtract 25 from count)
    je  target      ; jump if equal
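How a compare sets the four condition code bits can be sketched as follows. The function models a subtraction at a fixed width and follows the x86 convention that the carry bit records a borrow; `compare_flags` and the sample operands are assumptions for illustration.

```python
def compare_flags(a, b, bits=32):
    """Condition codes after computing a - b at the given width."""
    mask = (1 << bits) - 1
    sign = 1 << (bits - 1)
    a &= mask
    b &= mask
    result = (a - b) & mask
    return {
        "S": int(bool(result & sign)),                   # sign of the result
        "Z": int(result == 0),                           # result is zero
        "O": int(bool((a ^ b) & (a ^ result) & sign)),   # signed overflow
        "C": int(a < b),                                 # unsigned borrow
    }


equal = compare_flags(25, 25)
smaller = compare_flags(10, 25)
overflow = compare_flags(0x80000000, 1)
print(equal, smaller, overflow)
print("je taken:", equal["Z"] == 1)
```

A conditional jump such as je then only inspects the flags: it is taken exactly when Z is 1, which is what separates condition testing from branching in the set-then-jump style.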
Instruction Types (cont.)
- Flow control and I/O instructions
  - Branch
  - Procedure call
  - Interrupts
- I/O instructions
  - Memory-mapped I/O
    - Most processors support memory-mapped I/O
    - No separate instructions for I/O
  - Isolated I/O
    - Pentium supports isolated I/O
    - Separate I/O instructions
          in  AX,io_port    ; read from an I/O port
          out io_port,AX    ; write to an I/O port
5 Instruction Formats
Two types
- Fixed-length
  - Used by RISC processors
  - 32-bit RISC processors use 32-bit-wide instructions
    - Examples: SPARC, MIPS, PowerPC
- Variable-length
  - Used by CISC processors
  - Memory operands need more bits to specify
Opcode
- Major and exact operation
Examples of Instruction Formats
[Figure: examples of instruction formats]
RISC (Reduced Instruction Set Computer) versus CISC (Complex Instruction Set Computer)
RISC vs CISC
- The underlying philosophy of RISC machines is that a system is better able to manage program execution when the program consists of only a few different instructions that are the same length and require the same number of clock cycles to decode and execute.
- RISC systems access memory only with explicit load and store instructions.
- In CISC systems, many different kinds of instructions access memory, making instruction length variable and fetch-decode-execute time unpredictable.
RISC vs CISC
- The difference between CISC and RISC becomes evident through the basic computer performance equation:
      time/program = (time/cycle) x (cycles/instruction) x (instructions/program)
- RISC systems shorten execution time by reducing the clock cycles per instruction.
- CISC systems improve performance by reducing the number of instructions per program.
RISC vs CISC
- The simple instruction set of RISC machines enables control units to be hardwired for maximum speed.
- The more complex, and variable, instruction set of CISC machines requires microcode-based control units that interpret instructions as they are fetched from memory. This translation takes time.
- With fixed-length instructions, RISC lends itself to pipelining and speculative execution.
RISC vs CISC
Consider the following program fragments:
    CISC                 RISC
    mov ax, 10           mov ax, 0
    mov bx, 5            mov bx, 10
    mul bx, ax           mov cx, 5
                  Begin: add ax, bx
                         loop Begin
The total clock cycles for the CISC version might be:
    (2 movs x 1 cycle) + (1 mul x 30 cycles) = 32 cycles
While the clock cycles for the RISC version are:
    (3 movs x 1 cycle) + (5 adds x 1 cycle) + (5 loops x 1 cycle) = 13 cycles
With the RISC clock cycle also being shorter, RISC gives us much faster execution speeds.
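The cycle totals follow directly from the executed instruction traces. The sketch below tallies them, assuming the per-instruction costs used on the slide (1 cycle for mov, add and loop, 30 cycles for mul) and a loop that runs five times; the variable names are hypothetical.

```python
# (instruction, cycles) for each instruction actually executed.
cisc_trace = [("mov", 1), ("mov", 1), ("mul", 30)]
risc_trace = [("mov", 1)] * 3 + [("add", 1), ("loop", 1)] * 5


def total_cycles(trace):
    """Sum the clock cycles over an executed instruction trace."""
    return sum(cycles for _, cycles in trace)


cisc_cycles = total_cycles(cisc_trace)
risc_cycles = total_cycles(risc_trace)
print("CISC:", len(cisc_trace), "instructions,", cisc_cycles, "cycles")
print("RISC:", len(risc_trace), "instructions,", risc_cycles, "cycles")
```

The RISC version executes more instructions but fewer cycles, which is the trade-off between instructions per program and cycles per instruction in the performance equation.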
RISC vs CISC
- Because of their load-store ISAs, RISC architectures require a large number of CPU registers.
- These registers provide fast access to data during sequential program execution.
- They can also be employed to reduce the overhead typically caused by passing parameters to subprograms.
- Instead of pulling parameters off of a stack, the subprogram is directed to use a subset of registers.
RISC vs CISC
- This is how registers can be overlapped in a RISC system.
- The current window pointer (CWP) points to the active register window.
[Figure: overlapping register windows]
RISC vs CISC
- It is becoming increasingly difficult to distinguish RISC architectures from CISC architectures.
- Some RISC systems provide more extravagant instruction sets than some CISC systems.
- Some systems combine both approaches.
- The following two slides summarize the characteristics that traditionally typify the differences between these two architectures.
RISC vs CISC
RISC                                           CISC
Multiple register sets                         Single register set
Three operands per instruction                 One or two register operands per instruction
Parameter passing through register windows     Parameter passing through memory
Single-cycle instructions                      Multiple-cycle instructions
Hardwired control                              Microprogrammed control
Highly pipelined                               Less pipelined
Simple instructions, few in number             Many complex instructions
Fixed-length instructions                      Variable-length instructions
Complexity in compiler                         Complexity in microcode
Only LOAD/STORE instructions access memory     Many instructions can access memory
Few addressing modes                           Many addressing modes
Summary
Instruction Set Design Issues
Instruction set design issues include:
- Where are operands stored?
  - registers, memory, stack, accumulator
- How many explicit operands are there?
  - 0, 1, 2, or 3
- How is the operand location specified?
  - register, immediate, indirect, ...
- What type and size of operands are supported?
  - byte, int, float, double, string, vector, ...
- What operations are supported?
  - add, sub, mul, move, compare, ...
More About General Purpose Registers
Why do almost all new architectures use GPRs?
- Registers are much faster than memory (even cache)
  - Register values are available immediately
  - When memory isn't ready, the processor must wait ("stall")
- Registers are convenient for variable storage
  - Compiler assigns some variables just to registers
  - More compact code, since small fields specify registers (compared to memory addresses)
[Figure: memory hierarchy: processor registers, cache, memory, disk]
What Operations are Needed
- Arithmetic and Logical
  - Integer arithmetic: ADD, SUB, MULT, DIV, SHIFT
  - Logical operations: AND, OR, XOR, NOT
- Data Transfer: copy, load, store
- Control: branch, jump, call, return
- Floating Point: ADDF, MULF, DIVF
  - Same as arithmetic, but usually take bigger operands
- Decimal: ADDD, CONVERT
- String: move, compare, search
- Graphics: pixel and vertex, compression/decompression operations
Stack Architecture: Pros and Cons
Pros
- Good code density (implicit top of stack)
- Low hardware requirements
- Easy to write a simpler compiler for stack architectures
Cons
- Stack becomes the bottleneck
- Little ability for parallelism or pipelining
- Data is not always at the top of the stack when needed, so additional instructions like TOP and SWAP are needed
- Difficult to write an optimizing compiler for stack architectures
Accumulator Architecture: Pros and Cons
Pros
- Very low hardware requirements
- Easy to design and understand
Cons
- Accumulator becomes the bottleneck
- Little ability for parallelism or pipelining
- High memory traffic
Memory-Memory Architecture: Pros and Cons
Pros
- Requires fewer instructions (especially if 3 operands)
- Easy to write compilers for (especially if 3 operands)
Cons
- Very high memory traffic (especially if 3 operands)
- Variable number of clocks per instruction
- With two operands, more data movements are required
Memory-Register Architecture: Pros and Cons
Pros
- Some data can be accessed without loading first
- Instruction format easy to encode
- Good code density
Cons
- Operands are not equivalent (poor orthogonality)
- Variable number of clocks per instruction
- May limit number of registers
Operations of the Computer Hardware
"There must certainly be instructions for performing the fundamental arithmetic operations."
    Burks, Goldstine and von Neumann, 1947
- Assembly language is a symbolic representation of what the processor actually understands
- The MIPS assembler allows only one instruction per line and ignores comments following # until the end of the line
Example: translation of a segment of a C program to MIPS assembly instructions
C:
    f = (g + h) - (i + j)
MIPS:
    add t0, g, h     # temp variable t0 contains "g + h"
    add t1, i, j     # temp variable t1 contains "i + j"
    sub f, t0, t1    # f = t0 - t1 = (g + h) - (i + j)
Operations in the Instruction Set
Operator type             Examples
Arithmetic and logical    Integer arithmetic and logical operations: add, and, subtract, or
Data transfer             Loads-stores (move instructions on machines with memory addressing)
Control                   Branch, jump, procedure call and return, trap
System                    Operating system call, virtual memory management instructions
Floating point            Floating point instructions: add, multiply
Decimal                   Decimal add, decimal multiply, decimal to character conversion
String                    String move, string compare, string search
Graphics                  Pixel operations, compression/decompression operations
- Arithmetic, logical, data transfer and control are almost standard categories for all machines
- System instructions are required for multi-programming environments, although support for system functions varies
- Decimal and string instructions can be primitives, e.g. on the IBM 360 and the VAX
- Support for floating point, decimal, string and graphics is sometimes optional and provided via a co-processor
- Some machines rely on the compiler to synthesize special operations, such as string handling, from simpler instructions
Operations for Media and Signal Processing
- Single instruction multiple data (SIMD) and vector instructions are often supported in DSPs, which are commonly used in multimedia and signal processing applications
- Partitioned add (integer)
  - Performs multiple 16-bit additions on a 64-bit ALU, since most data are narrow
  - Increases ALU throughput for multimedia applications
- Paired single operations (float)
  - Allow the same register to act as two operands to the same operation
  - Handy in dealing with vertices and coordinates
- Multiply and accumulate
  - Very handy for calculating dot products of vectors (signal processing) and matrix multiplication
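Partitioned add and multiply-accumulate can both be sketched in a few lines. The code below is an illustration, not a model of any specific DSP: it assumes four 16-bit lanes packed into a 64-bit word with wraparound (no saturation), and the function names and operand values are hypothetical.

```python
def partitioned_add16(x, y):
    """Add four independent 16-bit lanes packed into 64-bit words."""
    result = 0
    for lane in range(4):
        shift = 16 * lane
        a = (x >> shift) & 0xFFFF
        b = (y >> shift) & 0xFFFF
        # Masking the sum stops a carry from spilling into the next lane.
        result |= ((a + b) & 0xFFFF) << shift
    return result


def multiply_accumulate(xs, ys):
    """Dot product computed as repeated acc = acc + x * y steps."""
    acc = 0
    for x, y in zip(xs, ys):
        acc += x * y
    return acc


packed = partitioned_add16(0x0001_FFFF_0003_0004, 0x0001_0001_0001_0001)
print(f"{packed:016x}")
print(multiply_accumulate([1, 2, 3], [4, 5, 6]))
```

The lane holding 0xFFFF wraps to 0 without disturbing its neighbor, which is the property that lets one wide ALU serve several narrow operands at once.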
Frequency of Operations Usage
- The most widely executed instructions are the simple operations of an instruction set
- The following is the average usage in SPECint92 on Intel 80x86
Rank    80x86 instruction         Integer average (% total executed)
1       Load                      22%
2       Conditional branch        20%
3       Compare                   16%
4       Store                     12%
5       Add                       8%
6       And                       6%
7       Sub                       5%
8       Move register-register    4%
9       Call                      1%
10      Return                    1%
        Total                     96%
Make the common case fast by focusing on these operations
Control Flow Instructions
- Jump: for unconditional change in the control flow
- Branch: for conditional change in the control flow
- Procedure calls and returns
[Figure: frequency of control flow instructions. Data is based on SPEC2000 on Alpha]
Destination Address Definition
- Relative addressing with respect to the program counter proved to be the best choice for forward and backward branching or jumps (load address independent)
- To allow for dynamic loading of library routines, register indirect addressing allows addresses to be loaded in special registers (e.g. virtual functions in C++ and system calls in a case statement)
[Figure: Data is based on SPEC2000 on Alpha]
Condition Evaluation
- Compare-and-branch can be efficient if the majority of conditions are comparisons with zero
- Remember to focus on the common case
[Figure: based on SPEC on MIPS]
Frequency of Types of Comparison
[Figure: frequency of types of comparison. Data is based on SPEC2000 on Alpha]
- Different benchmarks and machines set new design priorities
- DSPs support a repeat instruction for for-loops (vectors) using registers
Supporting Procedures
Execution of a procedure follows these steps:
- Store parameters in a place accessible to the procedure
- Transfer control to the procedure
- Acquire the storage resources needed for the procedure
- Perform the desired task
- Store the result value in a place accessible to the calling program
- Return control to the point of origin
The hardware provides a program counter to trace instruction flow and manage transfer of control
Parameter Passing
- Registers can be used for passing a small number of parameters
- A stack is used to spill registers of the current context, to make room for the called procedure to run, and to allow large parameters to be passed
- Storage of machine state can be performed by caller or callee
- Handling of shared variables is important to ensure correct semantics and thus requires clear specifications in the library interface
- Global variables stored in registers need careful handling
Type and Size of Operands
- The type of an operand is designated by encoding it in the instruction's operation code
- The type of an operand, e.g. single precision float, effectively gives its size
- Common operand types include character, half word and word size integer, and single- and double-precision floating point
- Characters are almost always in ASCII, integers in 2's complement, and floating point in IEEE 754
- The 16-bit Unicode used in Java is gaining popularity due to its support for international character sets
- For business applications, some architectures support a decimal format in binary coded decimal (BCD)
- Depending on the size of the word, the complexity of handling different operand types differs
- DSPs offer fixed point data types to support high precision floating point arithmetic and to allow sharing a single exponent for multiple numbers
- For graphics applications, vertex and pixel operands are added features
Size of Operands
- The double-word data type is used for double-precision floating point operations and for address storage in machines with a 64-bit wide address bus
- Words are used for integer operations and for 32-bit address bus machines
- Because of the mix in SPEC, word and double-word data types dominate
[Figure: frequency of reference by size, based on SPEC2000 on Alpha]
Instruction Representation
- Humans are taught to think in base 10 (decimal), but numbers may be represented in any base (123 in base 10 = 1111011 in binary, or base 2)
- Numbers are stored in computers as a series of high and low electronic signals (binary numbers)
- Binary digits are called bits and are considered the atom of computing
- Each piece of an instruction is a number, and placing these numbers together forms the instruction
- The assembler translates the symbolic assembly instructions into machine language instructions (machine code)
Example:
    Assembly:                add $t0, $s1, $s2
    M/C language (decimal):  0       17      18      8       0       32
    M/C language (binary):   000000  10001   10010   01000   00000   100000
    Field width:             6 bits  5 bits  5 bits  5 bits  5 bits  6 bits
Note: the MIPS compiler by default maps $s0,...,$s7 to reg. 16-23 and $t0,...,$t7 to reg. 8-15
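Placing the field numbers together can be done with shifts. The sketch below packs the six fields of the example into one 32-bit word; the helper names `encode_r` and `fields_binary` are hypothetical, while the field widths and register numbers are the ones given above.

```python
REGISTERS = {"$t0": 8, "$s1": 17, "$s2": 18}


def encode_r(op, rs, rt, rd, shamt, funct):
    """Pack the six fields (6, 5, 5, 5, 5, 6 bits) into a 32-bit word."""
    return (op << 26) | (rs << 21) | (rt << 16) | (rd << 11) | (shamt << 6) | funct


def fields_binary(word):
    """Show a 32-bit word split into its six fields as binary strings."""
    bits = f"{word:032b}"
    widths = (6, 5, 5, 5, 5, 6)
    parts, start = [], 0
    for width in widths:
        parts.append(bits[start:start + width])
        start += width
    return " ".join(parts)


# add $t0, $s1, $s2  ->  rd = $t0, rs = $s1, rt = $s2, funct = 32
add_word = encode_r(0, REGISTERS["$s1"], REGISTERS["$s2"], REGISTERS["$t0"], 0, 32)
print(fields_binary(add_word))
print(hex(add_word))
```

Note that the destination register comes first in the assembly text but sits in the fourth field of the machine word.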
Encoding an Instruction Set
- Instruction encoding affects the size of the compiled program and the complexity of the CPU implementation
- The operation is typically specified in one field, called the opcode
- The addressing mode for the operand can be encoded with the operation, or specified through a separate identifier when a large number of modes is supported
- The architecture must balance several competing factors:
  - Desire to support as many registers and addressing modes as possible
  - Effect of operand specification on the size of the instruction (program)
  - Desire to simplify instruction fetching and decoding during execution
- Fixed size instruction encoding simplifies the CPU design while limiting the addressing modes supported
- An architect caring about code size can use variable size encoding
- A hybrid approach is to allow variability by supporting multiple-sized instructions
Encoding Examples
[Figure: encoding examples]
MIPS Instruction Format
Register-format instructions
    | op     | rs     | rt     | rd     | shamt  | funct  |
    | 6 bits | 5 bits | 5 bits | 5 bits | 5 bits | 6 bits |
- op: basic operation of the instruction, traditionally called the opcode
- rs: the first register source operand
- rt: the second register source operand
- rd: the register destination operand; it gets the result of the operation
- shamt: shift amount
- funct: this field selects the specific variant of the operation of the op field
Immediate-type instructions
    | op     | rs     | rt     | address |
    | 6 bits | 5 bits | 5 bits | 16 bits |
- Some instructions need longer fields than provided, for large value constants
- The 16-bit address means a load word instruction can load a word within a region of ±2^15 bytes of the address in the base register
- Example: lw $t0, 32($s3)    # temporary register $t0 gets A[8]
Instruction  Format  op   rs   rt   rd   shamt  funct  address
add          R       0    reg  reg  reg  0      32     N/A
sub          R       0    reg  reg  reg  0      34     N/A
lw           I       35   reg  reg  N/A  N/A    N/A    address
sw           I       43   reg  reg  N/A  N/A    N/A    address
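The immediate format can be packed and unpacked in the same way as the register format. This sketch assumes the standard MIPS register numbers ($t0 = 8, $s3 = 19); `encode_i` and `decode_i` are hypothetical helper names.

```python
def encode_i(op, rs, rt, address):
    """Pack op (6 bits), rs (5), rt (5) and a signed 16-bit address field."""
    if not -(1 << 15) <= address < (1 << 15):
        raise ValueError("address field must fit in 16 signed bits")
    return (op << 26) | (rs << 21) | (rt << 16) | (address & 0xFFFF)


def decode_i(word):
    """Recover (op, rs, rt, address), sign-extending the address field."""
    address = word & 0xFFFF
    if address & 0x8000:
        address -= 1 << 16
    return (word >> 26, (word >> 21) & 0x1F, (word >> 16) & 0x1F, address)


# lw $t0, 32($s3): op = 35, base register rs = 19, target register rt = 8
lw_word = encode_i(35, 19, 8, 32)
print(hex(lw_word), decode_i(lw_word))
```

The range check reflects the limit stated above: an offset outside the signed 16-bit range cannot be expressed in a single load or store.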
The Stored Program Concept
- Learning how instructions are represented leads to discovering the secret of computing: the stored-program concept
- Today's computers are built on two key principles:
  - Instructions are represented as numbers
  - Programs can be stored in memory to be read or written just like numbers
- The power of the concept: memory can contain
  - the source code for an editor
  - the compiled m/c code for the editor
  - the text that the compiled program is using
  - the compiler that generated the code
[Figure: a processor connected to a memory holding an accounting program (machine code), an editor program (machine code), a C compiler (machine code), payroll data, book text, and source code in C for the editor program]
Compiling if-then-else in MIPS
Assuming the five variables f, g, h, i, and j correspond to the five registers $s0 through $s4, what is the compiler's code for the following C if statement?
    if (i == j) f = g + h; else f = g - h;
[Figure: flowchart testing i == j; the i = j path computes f = g + h, the i != j path (Else:) computes f = g - h, and both paths meet at Exit:]
MIPS:
          bne $s3, $s4, Else    # go to Else if i != j
          add $s0, $s1, $s2     # f = g + h (skipped if i != j)
          j   Exit
    Else: sub $s0, $s1, $s2     # f = g - h (skipped if i = j)
    Exit:
Typical Compilation
[Figure: typical compilation]
Major Types of Optimization
Columns: optimization name, explanation, frequency (shown where legible; N.M. as in the source)
High-level (at or near source level; machine-independent)
- Procedure integration: replace procedure call by procedure body (N.M.)
Local (within straight-line code)
- Common subexpression elimination: replace two instances of the same computation by a single copy (18%)
- Constant propagation: replace all instances of a variable that is assigned a constant with the constant
- Stack height reduction: rearrange expression tree to minimize resources needed for expression evaluation (N.M.)
Global (across a branch)
- Global common subexpression elimination: same as local, but this version crosses branches
- Copy propagation: replace all instances of a variable A that has been assigned X (i.e., A = X) with X
- Code motion: remove code from a loop that computes the same value in each iteration of the loop
- Induction variable elimination: simplify/eliminate array addressing calculations within loops
Machine-dependent (depends on machine knowledge)
- Strength reduction: many examples, such as replacing a multiply by a constant with adds and shifts (N.M.)
- Pipeline scheduling: reorder instructions to improve pipeline performance (N.M.)
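Two of the listed transformations are easy to show as before/after pairs. The sketch below illustrates strength reduction (multiply by the constant 10 rewritten as shifts and an add) and code motion (a loop-invariant expression hoisted out of the loop). The functions are hypothetical examples written by hand, not compiler output.

```python
def times10_multiply(x):
    """Original form: multiply by a constant."""
    return x * 10


def times10_shift_add(x):
    """Strength-reduced form: 10x = 8x + 2x, using shifts and one add."""
    return (x << 3) + (x << 1)


def scale_naive(values, a, b):
    """a * b + 1 is recomputed on every iteration."""
    out = []
    for v in values:
        out.append(v * (a * b + 1))
    return out


def scale_hoisted(values, a, b):
    """Code motion: the loop-invariant factor is computed once."""
    factor = a * b + 1
    out = []
    for v in values:
        out.append(v * factor)
    return out


print(times10_multiply(7), times10_shift_add(7))
print(scale_naive([1, 2, 3], 4, 5), scale_hoisted([1, 2, 3], 4, 5))
```

Both rewrites preserve the result while replacing expensive or repeated work with cheaper operations, which is why they are safe for a compiler to apply automatically.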
Effect of Compiler Optimization
[Figure: effect of compiler optimization; axis: program and compiler optimization level]
Measurements taken on MIPS
- Level 0: non-optimized code
- Level 1: local optimization
- Level 2: global optimization, s/w pipelining
- Level 3: adds procedure integration
Compiler Support for Multimedia Instr.

- Intel's MMX and PowerPC AltiVec have small vector processing capabilities targeting multimedia applications (to speed up graphics)
- Intel added a new set of instructions called Streaming SIMD Extension
- A major advantage of vector computers is hiding the latency of memory access by loading multiple elements and then overlapping execution with data transfer
- Vector computers typically have strided and/or gather/scatter addressing to perform operations on distant memory locations
  - Strided addressing allows memory access in increments larger than one
  - Gather/scatter addressing is similar to register indirect mode, where the addresses are stored instead of the data
- Supporting vector operations without strided addressing, as in Intel's MMX, limits the potential speedup
- Such limited support for vector processing makes vectorizing compiler optimization unpopular and restricts its scope to hand-coded routines

SIMD instructions on MMX and AltiVec tend to be solutions, not primitives
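
The two addressing styles named above can be sketched in a few lines. This is a minimal model, assuming memory is a flat Python list of words; the helper names `strided_load`, `gather_load` and `scatter_store` are hypothetical and do not correspond to any real instruction mnemonics.

```python
def strided_load(memory, base, stride, count):
    """Read `count` elements starting at `base`, stepping `stride` each time."""
    return [memory[base + i * stride] for i in range(count)]


def gather_load(memory, indices):
    """Read the elements whose addresses are listed in an index vector."""
    return [memory[i] for i in indices]


def scatter_store(memory, indices, values):
    """Write each value to the address given by the matching index."""
    for i, v in zip(indices, values):
        memory[i] = v


memory = list(range(100, 116))           # 16 words holding 100, 101, ..., 115
column = strided_load(memory, 1, 4, 4)   # every 4th word, starting at address 1
picked = gather_load(memory, [7, 0, 12])
scatter_store(memory, [7, 0, 12], [-1, -2, -3])
```

A strided load is what a column walk through a row-major matrix needs; without it, each element has to be fetched by a separate scalar instruction.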
Starting a Program

[Figure: C program -> Compiler -> Assembly language program -> Assembler -> Object: machine language module (plus Object: library routine, machine language) -> Linker -> Executable: machine language program -> Loader -> Memory]

Linker
- Place code and data modules symbolically in memory
- Determine the address of data and instruction labels
- Patch both internal and external references

Object files for Unix typically contain:
- Header: size and position of components
- Text segment: machine code
- Data segment: static and dynamic variables
- Relocation info: identify absolute memory references
- Symbol table: name and location of labels, procedures and variables
- Debugging info: mapping source to object code, break points, etc.
Loading Executable Program

[Figure: memory layout from high to low addresses: stack ($sp at 7fff ffff hex), dynamic data, static data ($gp), text (pc), reserved]

To load an executable, the operating system follows these steps:
- Reads the executable file header to determine the size of the text and data segments
- Creates an address space large enough for the text and data
- Copies the instructions and data from the executable file into memory
- Copies the parameters (if any) to the main program onto the stack
- Initializes the machine registers and sets the stack pointer to the first free location
- Jumps to a start-up routine that copies the parameters into the argument registers and calls the main routine of the program
Instruction Set Design Issues

Instruction set design issues:
- Number of addresses
- Flow of control
- Operand types
- Addressing modes
- Instruction types
- Instruction formats
Number of Addresses

Four categories:
- 3-address machines
  - 2 for the source operands and one for the result
- 2-address machines
  - One address doubles as source and result
- 1-address machines
  - Accumulator machines
  - Accumulator is used for one source and result
- 0-address machines
  - Stack machines
  - Operands are taken from the stack
  - Result goes onto the stack
Number of Addresses (cont.)

Three-address machines
- Two for the source operands, one for the result
- RISC processors use three addresses
- Sample instructions
    add  dest,src1,src2   ; M(dest)=[src1]+[src2]
    sub  dest,src1,src2   ; M(dest)=[src1]-[src2]
    mult dest,src1,src2   ; M(dest)=[src1]*[src2]

Three addresses: Operand 1, Operand 2, Result
Example: a = b + c
Three-address instruction formats are not common because they require a relatively long instruction format to hold the three address references.
Number of Addresses (cont.)

Example
- C statement
    A = B + C * D - E + F + A
- Equivalent code:
    mult T,C,D   ; T = C*D
    add  T,T,B   ; T = B+C*D
    sub  T,T,E   ; T = B+C*D-E
    add  T,T,F   ; T = B+C*D-E+F
    add  A,T,A   ; A = B+C*D-E+F+A
Number of Addresses (cont.)

Two-address machines
- One address doubles (for source operand and result)
- Last example makes a case for it
  - Address T is used twice
- Sample instructions
    load dest,src   ; M(dest)=[src]
    add  dest,src   ; M(dest)=[dest]+[src]
    sub  dest,src   ; M(dest)=[dest]-[src]
    mult dest,src   ; M(dest)=[dest]*[src]

Two addresses: one address doubles as operand and result
Example: a = a + b
The two-address format reduces the space requirement but also introduces some awkwardness. To avoid altering the value of an operand, a MOVE instruction is used to move one of the values to a result or temporary location before performing the operation.
Number of Addresses (cont.)

Example
- C statement
    A = B + C * D - E + F + A
- Equivalent code:
    load T,C   ; T = C
    mult T,D   ; T = C*D
    add  T,B   ; T = B+C*D
    sub  T,E   ; T = B+C*D-E
    add  T,F   ; T = B+C*D-E+F
    add  A,T   ; A = B+C*D-E+F+A
Number of Addresses (cont.)

One-address machines
- Use special set of registers called accumulators
  - Specify one source operand and receive the result
- Called accumulator machines
- Sample instructions
    load  addr   ; accum = [addr]
    store addr   ; M[addr] = accum
    add   addr   ; accum = accum + [addr]
    sub   addr   ; accum = accum - [addr]
    mult  addr   ; accum = accum * [addr]
Number of Addresses (cont.)

Example
- C statement
    A = B + C * D - E + F + A
- Equivalent code:
    load  C   ; load C into accum
    mult  D   ; accum = C*D
    add   B   ; accum = C*D+B
    sub   E   ; accum = B+C*D-E
    add   F   ; accum = B+C*D-E+F
    add   A   ; accum = B+C*D-E+F+A
    store A   ; store accum contents in A
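
The one-address sequence above can be checked with a tiny interpreter. This is a sketch under stated assumptions: memory is modeled as a Python dict keyed by variable name, and `run_accumulator` is a hypothetical helper written for this example, not part of any real toolchain.

```python
def run_accumulator(program, memory):
    """Execute (opcode, address) pairs on a single-accumulator machine."""
    accum = 0
    for op, addr in program:
        if op == "load":
            accum = memory[addr]
        elif op == "store":
            memory[addr] = accum
        elif op == "add":
            accum += memory[addr]
        elif op == "sub":
            accum -= memory[addr]
        elif op == "mult":
            accum *= memory[addr]
        else:
            raise ValueError(f"unknown opcode: {op}")
    return memory


program = [("load", "C"), ("mult", "D"), ("add", "B"), ("sub", "E"),
           ("add", "F"), ("add", "A"), ("store", "A")]
memory = run_accumulator(program, {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6})
```

Every instruction touches memory once, which is the source of the high memory traffic usually attributed to accumulator machines.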
Number of Addresses (cont.)

Zero-address machines
- Stack supplies operands and receives the result
  - Special instructions to load and store use an address
- Called stack machines (Ex: HP3000, Burroughs B5500)
- Sample instructions
    push addr   ; push([addr])
    pop  addr   ; pop([addr])
    add         ; push(pop + pop)
    sub         ; push(pop - pop)
    mult        ; push(pop * pop)
Number of Addresses (cont.)

Example
- C statement
    A = B + C * D - E + F + A
- Equivalent code:
    push E
    push C
    push D
    mult
    push B
    add
    sub
    push F
    add
    push A
    add
    pop  A
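
A short simulation makes the operand order of the stack version explicit. This is a sketch, assuming `sub` computes (first value popped) minus (second value popped), as the sample instruction push(pop - pop) suggests; `run_stack` and the dict-based memory are hypothetical constructs for this example.

```python
def run_stack(program, memory):
    """Execute a zero-address program; only push and pop carry an address."""
    stack = []
    for instr in program:
        op = instr[0]
        if op == "push":
            stack.append(memory[instr[1]])
        elif op == "pop":
            memory[instr[1]] = stack.pop()
        elif op in ("add", "sub", "mult"):
            first, second = stack.pop(), stack.pop()  # first = top of stack
            if op == "add":
                stack.append(first + second)
            elif op == "sub":
                stack.append(first - second)
            else:
                stack.append(first * second)
        else:
            raise ValueError(f"unknown opcode: {op}")
    return memory


program = [("push", "E"), ("push", "C"), ("push", "D"), ("mult",),
           ("push", "B"), ("add",), ("sub",), ("push", "F"), ("add",),
           ("push", "A"), ("add",), ("pop", "A")]
memory = run_stack(program, {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6})
```

E is pushed first so that it sits below B+C*D when `sub` executes, giving B+C*D-E rather than the reverse.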
Load/Store Architecture

- Instructions expect operands in internal processor registers
  - Special LOAD and STORE instructions move data between registers and memory
- RISC uses this architecture
- Reduces instruction length
Load/Store Architecture (cont.)

- Sample instructions
    load  Rd,addr      ; Rd = [addr]
    store addr,Rs      ; (addr) = Rs
    add   Rd,Rs1,Rs2   ; Rd = Rs1 + Rs2
    sub   Rd,Rs1,Rs2   ; Rd = Rs1 - Rs2
    mult  Rd,Rs1,Rs2   ; Rd = Rs1 * Rs2
Number of Addresses (cont.)

Example
- C statement
    A = B + C * D - E + F + A
- Equivalent code:
    load  R1,B
    load  R2,C
    load  R3,D
    load  R4,E
    load  R5,F
    load  R6,A
    mult  R2,R2,R3
    add   R2,R2,R1
    sub   R2,R2,R4
    add   R2,R2,R5
    add   R2,R2,R6
    store A,R2
Flow of Control

- Default is sequential flow
- Several instructions alter this default execution
  - Branches
    - Unconditional
    - Conditional
    - Delayed branches
  - Procedure calls
    - Delayed procedure calls
Flow of Control (cont.)

Branches
- Unconditional
  - Absolute address
  - PC-relative
    - Target address is specified relative to PC contents
    - Relocatable code
- Example: MIPS
  - Absolute address
      j target
  - PC-relative
      b target
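
Why PC-relative targets give relocatable code can be shown with a small calculation. The sketch below assumes the usual MIPS convention that the offset field counts words from the instruction after the branch (PC + 4); the helper names are hypothetical.

```python
def branch_target(pc: int, offset_words: int) -> int:
    """MIPS-style PC-relative target: offset counts words from PC + 4."""
    return pc + 4 + (offset_words << 2)


def encode_offset(pc: int, target: int) -> int:
    """Offset field an assembler would store for a branch located at `pc`."""
    return (target - (pc + 4)) >> 2


# The same branch, loaded at two different addresses 0x1000 bytes apart.
offset_a = encode_offset(0x00400010, 0x00400030)
offset_b = encode_offset(0x00401010, 0x00401030)
```

Both placements yield the same offset, so the instruction bits do not change when the code is moved; an absolute jump would need patching.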
Flow of Control (cont.)

Branches
- Conditional
  - Jump is taken only if the condition is met
- Two types
  - Set-Then-Jump
    - Condition testing is separated from branching
    - Condition code registers are used to convey the condition test result
    - Condition code registers keep a record of the status of the last ALU operation, such as overflow condition
  - Example: Pentium code
      cmp AX,BX    ; compare AX and BX
      je  target   ; jump if equal
Flow of Control (cont.)

  - Test-and-Jump
    - Single instruction performs condition testing and branching
  - Example: MIPS instruction
      beq Rsrc1,Rsrc2,target
    Jumps to target if Rsrc1 = Rsrc2
- Delayed branching
  - Control is transferred after executing the instruction that follows the branch instruction
    - This instruction slot is called the delay slot
  - Improves efficiency
  - Highly pipelined RISC processors support it
Flow of Control (cont.)

Procedure calls
- Facilitate modular programming
- Require two pieces of information to return
  - End of procedure
    - Pentium uses ret instruction
    - MIPS uses jr instruction
  - Return address
    - In a (special) register
      - MIPS allows any general-purpose register
    - On the stack
      - Pentium
Flow of Control (cont.)

[Figure: delay slot]
Parameter Passing

Two basic techniques
- Register-based (e.g., PowerPC, MIPS)
  - Internal registers are used
    - Faster
    - Limit the number of parameters
    - Recursive procedure
- Stack-based (e.g., Pentium)
  - Stack is used
    - More general
2 Operand Types

- Instructions support basic data types
  - Characters
  - Integers
  - Floating-point
- Instruction overload
  - Same instruction for different data types
  - Example: Pentium
      mov AL,address    ; loads an 8-bit value
      mov AX,address    ; loads a 16-bit value
      mov EAX,address   ; loads a 32-bit value
Operand Types (cont.)

- Separate instructions
  - Instructions specify the operand size
  - Example: MIPS
      lb Rdest,address   ; loads a byte
      lh Rdest,address   ; loads a halfword (16 bits)
      lw Rdest,address   ; loads a word (32 bits)
      ld Rdest,address   ; loads a doubleword (64 bits)
    Similar instructions for store
3 Addressing Modes

How the operands are specified
- Operands can be in three places
  - Registers
    - Register addressing mode
  - Part of instruction
    - Constant
    - Immediate addressing mode
    - All processors support these two addressing modes
  - Memory
    - Difference between RISC and CISC
    - CISC supports a large variety of addressing modes
    - RISC follows load/store architecture
4 Instruction Types

Several types of instructions
- Data movement
  - Pentium: mov dest,src
  - Some do not provide direct data movement instructions
  - Indirect data movement
      add Rdest,Rsrc,0   ; Rdest = Rsrc+0
- Arithmetic and logical
  - Arithmetic
    - Integer and floating-point, signed and unsigned
    - add, subtract, multiply, divide
  - Logical
    - and, or, not, xor
Instruction Types (cont.)

- Condition code bits
  - S: Sign bit (0 = +, 1 = -)
  - Z: Zero bit (0 = nonzero, 1 = zero)
  - O: Overflow bit (0 = no overflow, 1 = overflow)
  - C: Carry bit (0 = no carry, 1 = carry)
- Example: Pentium
    cmp count,25   ; compare count to 25
                   ; subtract 25 from count
    je  target     ; jump if equal
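
How a compare sets the four condition code bits can be sketched as follows. This is a minimal model, not a description of any specific processor: it assumes 32-bit two's complement operands and the x86 convention that the carry bit records a borrow on subtraction. `compare_flags` and `je_taken` are hypothetical names.

```python
def compare_flags(a: int, b: int, bits: int = 32) -> dict:
    """Flags after computing a - b on `bits`-wide two's complement values."""
    mask = (1 << bits) - 1
    sign = 1 << (bits - 1)
    a &= mask
    b &= mask
    result = (a - b) & mask
    return {
        "S": int(bool(result & sign)),   # 1 = negative result
        "Z": int(result == 0),           # 1 = zero result
        # Signed overflow: operands differ in sign and the result's sign
        # differs from the sign of the first operand.
        "O": int(bool((a ^ b) & (a ^ result) & sign)),
        "C": int(a < b),                 # borrow, x86 style (assumption)
    }


def je_taken(flags: dict) -> bool:
    """A jump-if-equal is taken when the zero bit is set."""
    return flags["Z"] == 1
```

The subtraction result itself is discarded; only the flags survive, which is what separates condition testing from branching in the set-then-jump style.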
Instruction Types (cont.)

- Flow control and I/O instructions
  - Branch
  - Procedure call
  - Interrupts
- I/O instructions
  - Memory-mapped I/O
    - Most processors support memory-mapped I/O
    - No separate instructions for I/O
  - Isolated I/O
    - Pentium supports isolated I/O
    - Separate I/O instructions
        in  AX,io_port   ; read from an I/O port
        out io_port,AX   ; write to an I/O port
5 Instruction Formats

Two types
- Fixed-length
  - Used by RISC processors
  - 32-bit RISC processors use 32-bit-wide instructions
    - Examples: SPARC, MIPS, PowerPC
- Variable-length
  - Used by CISC processors
  - Memory operands need more bits to specify
- Opcode
  - Major and exact operation
Examples of Instruction Formats

[Figure: examples of instruction formats]

RISC (Reduced Instruction Set Computer) versus CISC (Complex Instruction Set Computer)

RISC vs CISC

- The underlying philosophy of RISC machines is that a system is better able to manage program execution when the program consists of only a few different instructions that are the same length and require the same number of clock cycles to decode and execute.
- RISC systems access memory only with explicit load and store instructions.
- In CISC systems, many different kinds of instructions access memory, making instruction length variable and fetch-decode-execute time unpredictable.
RISC vs CISC

- The difference between CISC and RISC becomes evident through the basic computer performance equation:

    time/program = (time/cycle) x (cycles/instruction) x (instructions/program)

- RISC systems shorten execution time by reducing the clock cycles per instruction.
- CISC systems improve performance by reducing the number of instructions per program.
RISC vs CISC

- The simple instruction set of RISC machines enables control units to be hardwired for maximum speed.
- The more complex, and variable, instruction set of CISC machines requires microcode-based control units that interpret instructions as they are fetched from memory. This translation takes time.
- With fixed-length instructions, RISC lends itself to pipelining and speculative execution.
RISC vs CISC

- Consider the program fragments:

    CISC              RISC
    mov ax, 10        mov ax, 0
    mov bx, 5         mov bx, 10
    mul bx, ax        mov cx, 5
                      Begin: add ax, bx
                             loop Begin

- The total clock cycles for the CISC version might be:
    (2 movs x 1 cycle) + (1 mul x 30 cycles) = 32 cycles
- While the clock cycles for the RISC version are:
    (3 movs x 1 cycle) + (5 adds x 1 cycle) + (5 loops x 1 cycle) = 13 cycles
- With the RISC clock cycle being shorter, RISC gives us much faster execution speeds.
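
The cycle comparison can be reproduced with a few lines of arithmetic. The per-instruction costs below are assumptions for illustration (1 cycle each for mov, add and loop, 30 cycles for mul), and the RISC loop is assumed to run 5 times; real processors differ, and `total_cycles` is a hypothetical helper.

```python
# Assumed per-instruction costs: 1 cycle for mov, add and loop, 30 for mul.
CYCLES = {"mov": 1, "add": 1, "loop": 1, "mul": 30}


def total_cycles(counts: dict) -> int:
    """Sum executed-instruction counts weighted by their cycle cost."""
    return sum(CYCLES[op] * n for op, n in counts.items())


cisc = total_cycles({"mov": 2, "mul": 1})              # mov, mov, mul
risc = total_cycles({"mov": 3, "add": 5, "loop": 5})   # loop body runs 5 times
```

The RISC fragment executes more instructions (13 against 3) yet needs fewer cycles under these assumptions, which is the trade-off the performance equation expresses.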
RISC vs CISC

- Because of their load-store ISAs, RISC architectures require a large number of CPU registers.
- These registers provide fast access to data during sequential program execution.
- They can also be employed to reduce the overhead typically caused by passing parameters to subprograms.
- Instead of pulling parameters off of a stack, the subprogram is directed to use a subset of registers.
RISC vs CISC

- This is how registers can be overlapped in a RISC system.
- The current window pointer (CWP) points to the active register window.

[Figure: overlapping register windows]
RISC vs CISC

- It is becoming increasingly difficult to distinguish RISC architectures from CISC architectures.
- Some RISC systems provide more extravagant instruction sets than some CISC systems.
- Some systems combine both approaches.
- The following two slides summarize the characteristics that traditionally typify the differences between these two architectures.
RISC vs CISC

RISC
- Multiple register sets.
- Three operands per instruction.
- Parameter passing through register windows.
- Single-cycle instructions.
- Hardwired control.
- Highly pipelined.

CISC
- Single register set.
- One or two register operands per instruction.
- Parameter passing through memory.
- Multiple cycle instructions.
- Microprogrammed control.
- Less pipelined.

Continued...
RISC vs CISC

RISC
- Simple instructions, few in number.
- Fixed length instructions.
- Complexity in compiler.
- Only LOAD/STORE instructions access memory.
- Few addressing modes.

CISC
- Many complex instructions.
- Variable length instructions.
- Complexity in microcode.
- Many instructions can access memory.
- Many addressing modes.
Summary

Instruction Set Design Issues

Instruction set design issues include:
- Where are operands stored?
  - registers, memory, stack, accumulator
- How many explicit operands are there?
  - 0, 1, 2, or 3
- How is the operand location specified?
  - register, immediate, indirect, . . .
- What type and size of operands are supported?
  - byte, int, float, double, string, vector, . . .
- What operations are supported?
  - add, sub, mul, move, compare, . . .
More About General Purpose Registers

Why do almost all new architectures use GPRs?
- Registers are much faster than memory (even cache)
  - Register values are available immediately
  - When memory isn't ready, processor must wait ("stall")
- Registers are convenient for variable storage
  - Compiler assigns some variables just to registers
  - More compact code, since small fields specify registers (compared to memory addresses)

[Figure: Processor (Registers, Cache), Memory, Disk]
What Operations are Needed

- Arithmetic + Logical
  - Integer arithmetic: ADD, SUB, MULT, DIV, SHIFT
  - Logical operation: AND, OR, XOR, NOT
- Data Transfer - copy, load, store
- Control - branch, jump, call, return
- Floating Point
  - ADDF, MULF, DIVF
  - Same as arithmetic but usually take bigger operands
- Decimal - ADDD, CONVERT
- String - move, compare, search
- Graphics - pixel and vertex, compression/decompression operations
Stacks Architecture: Pros and Cons

Pros
- Good code density (implicit top of stack)
- Low hardware requirements
- Easy to write a simpler compiler for stack architectures

Cons
- Stack becomes the bottleneck
- Little ability for parallelism or pipelining
- Data is not always at the top of stack when needed, so additional instructions like TOP and SWAP are needed
- Difficult to write an optimizing compiler for stack architectures
Accumulators Architecture: Pros and Cons

Pros
- Very low hardware requirements
- Easy to design and understand

Cons
- Accumulator becomes the bottleneck
- Little ability for parallelism or pipelining
- High memory traffic
Memory-Memory Architecture: Pros and Cons

Pros
- Requires fewer instructions (especially if 3 operands)
- Easy to write compilers for (especially if 3 operands)

Cons
- Very high memory traffic (especially if 3 operands)
- Variable number of clocks per instruction
- With two operands, more data movements are required
Memory-Register Architecture: Pros and Cons

Pros
- Some data can be accessed without loading first
- Instruction format easy to encode
- Good code density

Cons
- Operands are not equivalent (poor orthogonality)
- Variable number of clocks per instruction
- May limit number of registers
Operations in the Instruction Set

Operator type: examples
- Arithmetic and logical: integer arithmetic and logical operations: add, and, subtract, or
- Data transfer: loads-stores (move instructions on machines with memory addressing)
- Control: branch, jump, procedure call and return, trap
- System: operating system call, virtual memory management instructions
- Floating point: floating point instructions: add, multiply
- Decimal: decimal add, decimal multiply, decimal to character conversion
- String: string move, string compare, string search
- Graphics: pixel operations, compression/decompression operations

Notes
- Arithmetic, logical, data transfer and control are almost standard categories for all machines
- System instructions are required for multi-programming environments, although support for system functions varies
- Decimal and string instructions can be primitives, e.g. IBM 360 and the VAX
- Support for floating point, decimal, string and graphics is optional and sometimes provided via a co-processor
- Some machines rely on the compiler to synthesize special operations, such as string handling, from simpler instructions
Operations for Media & Signal Process.

- Single instruction multiple data (SIMD) and vector instructions are often supported in DSPs, which are commonly used in multimedia and signal processing applications
- Partitioned Add (integer)
  - Perform multiple 16-bit additions on a 64-bit ALU, since most data are narrow
  - Increases ALU throughput for multimedia applications
- Paired single operations (float)
  - Allow the same register to act as two operands to the same operation
  - Handy in dealing with vertices and coordinates
- Multiply and accumulate
  - Very handy for calculating dot products of vectors (signal processing) and matrix multiplication
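
A partitioned add can be modeled in software to show what "multiple narrow additions on one wide ALU" means. The sketch below assumes four 16-bit lanes packed into a 64-bit word with wraparound in each lane; the lane layout and the helper names `pack`, `unpack` and `partitioned_add` are assumptions for illustration.

```python
LANE_BITS = 16
LANES = 4
LANE_MASK = (1 << LANE_BITS) - 1


def pack(values):
    """Pack four 16-bit values into one 64-bit word, lane 0 in the low bits."""
    word = 0
    for i, v in enumerate(values):
        word |= (v & LANE_MASK) << (i * LANE_BITS)
    return word


def unpack(word):
    """Split a 64-bit word back into its four 16-bit lanes."""
    return [(word >> (i * LANE_BITS)) & LANE_MASK for i in range(LANES)]


def partitioned_add(x, y):
    """Add lane by lane; a carry out of one lane never reaches the next."""
    return pack([(a + b) & LANE_MASK for a, b in zip(unpack(x), unpack(y))])
```

In hardware the same effect comes from cutting the carry chain at lane boundaries, so one ALU pass produces four results.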
Frequency of Operations Usage

- The most widely executed instructions are the simple operations of an instruction set
- The following is the average usage in SPECint92 on Intel 80x86

Rank   80x86 instruction         Integer average (% total executed)
1      Load                      22%
2      Conditional branch        20%
3      Compare                   16%
4      Store                     12%
5      Add                       8%
6      And                       6%
7      Sub                       5%
8      Move register-register    4%
9      Call                      1%
10     Return                    1%
       Total                     96%

Make the common case fast by focusing on these operations
Control Flow Instructions

- Jump: for unconditional change in the control flow
- Branch: for conditional change in the control flow
- Procedure calls and returns

Data is based on SPEC on Alpha
Destination Address Definition

- Relative addressing w.r.t. the program counter proved to be the best choice for forward and backward branching or jumps (load address independent)
- To allow for dynamic loading of library routines, register indirect addressing allows addresses to be loaded in special registers (e.g. virtual functions in C++ and system calls in a case statement)

Data is based on SPEC on Alpha
Condition Evaluation

- Compare and branch can be efficient if the majority of conditions are comparisons with zero
- Remember to focus on the common case

Based on SPEC on MIPS
Frequency of Types of Comparison

Data is based on SPEC on Alpha
- Different benchmark and machine set new design priority
- DSPs support repeat instruction for for-loops (vectors) using registers
Supporting Procedures

- Execution of a procedure follows these steps:
  - Store parameters in a place accessible to the procedure
  - Transfer control to the procedure
  - Acquire the storage resources needed for the procedure
  - Perform the desired task
  - Store the result value in a place accessible to the calling program
  - Return control to the point of origin
- The hardware provides a program counter to trace instruction flow and manage transfer of control

Parameter Passing
- Registers can be used for passing a small number of parameters
- A stack is used to spill registers of the current context and make room for the called procedure to run, and to allow large parameters to be passed
- Storage of machine state can be performed by caller or callee
- Handling of shared variables is important to ensure correct semantics and thus requires clear specifications in the library interface
- Global variables stored in registers need careful handling
Type and Size of Operands

- The type of an operand is designated by encoding it in the instruction's operation code
- The type of an operand, e.g. single precision float, effectively gives its size
- Common operand types include character, half word and word size integer, single- and double-precision floating point
  - Characters are almost always in ASCII, integers are in 2's complement, and floating point is in IEEE 754
- The 16-bit Unicode, used in Java, is gaining popularity due to its support for international character sets
- For business applications, some architectures support a decimal format in binary coded decimal (BCD)
- Depending on the size of the word, the complexity of handling different operand types differs
- DSP offers fixed point data types to support high precision floating point arithmetic and to allow sharing a single exponent for multiple numbers
- For graphics applications, vertex and pixel operands are added features
Size of Operands

- Double-word data type is used for double-precision floating point operations and address storage in machines with a 64-bit wide address bus
- Words are used for integer operations and for 32-bit address bus machines
- Because of the mix in SPEC, word and double-word data types dominate

Frequency of reference by size, based on SPEC2000 on Alpha
Instruction Representation

- Humans are taught to think in base 10 (decimal), but numbers may be represented in any base (123 in base 10 = 1111011 in binary or base 2)
- Numbers are stored in computers as a series of high and low electronic signals (binary numbers)
- Binary digits are called bits and are considered the atom of computing
- Each piece of an instruction is a number, and placing these numbers together forms the instruction
- The assembler translates the assembly symbolic instructions into machine language instructions (machine code)
- Example:
    Assembly:                add $t0, $s1, $s2
    M/C language (decimal):  0       17      18      8       0       32
    M/C language (binary):   000000  10001   10010   01000   00000   100000
                             6 bits  5 bits  5 bits  5 bits  5 bits  6 bits

Note: MIPS compiler by default maps $s0,...,$s7 to reg. 16-23 and $t0,...,$t7 to reg. 8-15
Encoding an Instruction Set

- Instruction encoding affects the size of the compiled program and the complexity of the CPU implementation
- The operation is typically specified in one field, called the opcode
- The addressing mode for the operand can be encoded with the operation, or specified through a separate identifier in case of a large number of supported modes
- The architecture must balance between several competing factors:
  - Desire to support as many registers and addressing modes as possible
  - Effect of operand specification on the size of the instruction (program)
  - Desire to simplify instruction fetching and decoding during execution
- Fixed size instruction encoding simplifies the CPU design while limiting the addressing modes supported
- An architect caring about the code size can use variable size encoding
- A hybrid approach is to allow variability by supporting multiple-sized instructions
Encoding Examples
MIPS Instruction format
Register-format instructions
- op: Basic operation of the instruction, traditionally called the opcode
- rs: The first register source operand
- rt: The second register source operand
- rd: The register destination operand; it gets the result of the operation
- shamt: Shift amount
- funct: This field selects the specific variant of the operation in the op field
Immediate-type instructions
- Some instructions need longer fields than those provided above, for example for large-value constants
- The 16-bit address means a load word instruction can load a word within a region of ±2^15 bytes of the address in the base register
- Example: lw $t0, 32($s3)   # Temporary register $t0 gets A[8]
Instruction  Format  op  rs   rt   rd   shamt  funct  address
add          R       0   reg  reg  reg  0      32     N/A
sub          R       0   reg  reg  reg  0      34     N/A
lw           I       35  reg  reg  N/A  N/A    N/A    address
sw           I       43  reg  reg  N/A  N/A    N/A    address
R-format layout: op (6 bits) | rs (5 bits) | rt (5 bits) | rd (5 bits) | shamt (5 bits) | funct (6 bits)
I-format layout: op (6 bits) | rs (5 bits) | rt (5 bits) | address (16 bits)
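The immediate format and the reach of its 16-bit address field can be sketched as follows. The function names are hypothetical, and the register numbers assume the standard MIPS convention ($s3 = 19, $t0 = 8) with opcode 35 for lw; treat it as a minimal sketch rather than a full assembler.

```python
def encode_i_format(op, rs, rt, offset):
    """Pack op(6) | rs(5) | rt(5) | address(16) into a 32-bit word."""
    if not -(1 << 15) <= offset < (1 << 15):
        raise ValueError("offset does not fit in a signed 16-bit field")
    return (op << 26) | (rs << 21) | (rt << 16) | (offset & 0xFFFF)


def decode_i_format(word):
    """Split a 32-bit word back into (op, rs, rt, signed offset)."""
    offset = word & 0xFFFF
    if offset & 0x8000:  # sign-extend the 16-bit field
        offset -= 1 << 16
    return (word >> 26) & 0x3F, (word >> 21) & 0x1F, (word >> 16) & 0x1F, offset


# lw $t0, 32($s3): op=35, rs=19 ($s3, base register), rt=8 ($t0), offset=32
LW_WORD = encode_i_format(35, 19, 8, 32)
print(hex(LW_WORD))
print(decode_i_format(LW_WORD))
```

The range check is the code-level form of the ±2^15 byte limit: an offset outside that range cannot be expressed in a single load word instruction.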
The Stored Program Concept
Learning how instructions are represented leads to discovering the secret of computing: the stored-program concept
Today's computers are built on two key principles:
- Instructions are represented as numbers
- Programs can be stored in memory to be read or written just like numbers
The power of the concept: memory can contain
- the source code for an editor
- the compiled m/c code for the editor
- the text that the compiled program is using
- the compiler that generated the code
[Figure: a processor connected to a memory that holds an accounting program (machine code), an editor program (machine code), a C compiler (machine code), payroll data, book text, and the source code in C for the editor program]
Compiling if-then-else in MIPS
Assuming the five variables f, g, h, i, and j correspond to the five registers $s0 through $s4, what MIPS code does the compiler produce for the following C if statement?
  if (i == j) f = g + h; else f = g - h;
[Figure: flowchart testing i == j; when i = j, f = g + h is executed; when i ≠ j, control goes to Else: f = g - h; both paths meet at Exit:]
MIPS:
        bne $s3, $s4, Else    # go to Else if i ≠ j
        add $s0, $s1, $s2     # f = g + h (skipped if i ≠ j)
        j   Exit
  Else: sub $s0, $s1, $s2     # f = g - h (skipped if i = j)
  Exit:
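To see how the branch and the jump steer execution, the sequence can be traced with a tiny interpreter. This is a sketch under stated assumptions: `run` and `compiled_if` are hypothetical names, only the four instructions used here are supported, and branch delay slots are ignored.

```python
def run(program, regs):
    """Execute a list of assembly lines; a line ending in ':' is a label."""
    labels, instrs = {}, []
    for line in program:
        line = line.strip()
        if line.endswith(":"):
            labels[line[:-1]] = len(instrs)  # label -> index of the next instruction
        else:
            instrs.append(line.replace(",", " ").split())
    pc = 0
    while pc < len(instrs):
        op, *args = instrs[pc]
        pc += 1  # default: sequential flow
        if op == "add":
            regs[args[0]] = regs[args[1]] + regs[args[2]]
        elif op == "sub":
            regs[args[0]] = regs[args[1]] - regs[args[2]]
        elif op == "bne":  # branch if the two registers differ
            if regs[args[0]] != regs[args[1]]:
                pc = labels[args[2]]
        elif op == "j":    # unconditional jump
            pc = labels[args[0]]
        else:
            raise ValueError(f"unsupported instruction: {op}")
    return regs


IF_THEN_ELSE = [
    "bne $s3, $s4, Else",
    "add $s0, $s1, $s2",
    "j Exit",
    "Else:",
    "sub $s0, $s1, $s2",
    "Exit:",
]


def compiled_if(g, h, i, j):
    """Run the sequence with f in $s0, g in $s1, h in $s2, i in $s3, j in $s4."""
    regs = {"$s0": 0, "$s1": g, "$s2": h, "$s3": i, "$s4": j}
    return run(IF_THEN_ELSE, regs)["$s0"]


print(compiled_if(5, 3, 1, 1))  # i == j, so f = g + h
print(compiled_if(5, 3, 1, 2))  # i != j, so f = g - h
```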
Typical Compilation
Major Types of Optimization
Optimization name | Explanation | Frequency
High-level (at or near source level; machine-independent)
- Procedure integration | Replace procedure call by procedure body | N.M.
Local (within straight-line code)
- Common subexpression elimination | Replace two instances of the same computation by a single copy | 18%
- Constant propagation | Replace all instances of a variable that is assigned a constant with the constant | 22%
- Stack height reduction | Rearrange expression tree to minimize resources needed for expression evaluation | N.M.
Global (across a branch)
- Global common subexpression elimination | Same as local, but this version crosses branches | 13%
- Copy propagation | Replace all instances of a variable A that has been assigned X (i.e., A = X) with X | 11%
- Code motion | Remove code from a loop that computes the same value each iteration of the loop | 16%
- Induction variable elimination | Simplify/eliminate array-addressing calculations within loops | 2%
Machine-dependent (depends on machine knowledge)
- Strength reduction | Many examples, such as replacing a multiply by a constant with adds and shifts | N.M.
- Pipeline scheduling | Reorder instructions to improve pipeline performance | N.M.
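As a small illustration of strength reduction, a multiply by the constant 10 can be rewritten with two shifts and an add. The function names below are hypothetical, and whether the rewrite is actually faster depends on the target machine, which is why the technique counts as machine-dependent.

```python
def times_ten_multiply(x):
    """Straightforward version: one multiply instruction."""
    return x * 10


def times_ten_shift_add(x):
    """Strength-reduced version: 10 = 8 + 2, so 10*x = (x << 3) + (x << 1)."""
    return (x << 3) + (x << 1)


# Both versions agree on every tested input
print(all(times_ten_multiply(x) == times_ten_shift_add(x) for x in range(-1000, 1000)))
```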
Effect of Compiler Optimization
Measurements taken on S
[Figure: results by program and compiler optimization level]
Level 0: non-optimized code
Level 1: local optimization
Level 2: global optimization, s/w pipelining
Level 3: adds procedure integration
Compiler Support for Multimedia Instr.
Intel's MMX and PowerPC AltiVec have small vector processing capabilities targeting multimedia applications (to speed up graphics)
Intel added a new set of instructions called Streaming SIMD Extension
A major advantage of vector computers is hiding the latency of memory access by loading multiple elements and then overlapping execution with data transfer
Vector computers typically have strided and/or gather/scatter addressing to perform operations on distant memory locations
- Strided addressing allows memory access in increments larger than one
- Gather/scatter addressing is similar to register indirect mode, where the addresses are stored instead of the data
Supporting vector operations without strided addressing, as in Intel's MMX, limits the potential speedup
Such limited support for vector processing makes vectorizing compiler optimizations unpopular and restricts their scope to hand-coded routines
SIMD instructions on MMX and AltiVec tend to be solutions, not primitives
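The two addressing styles can be modelled on a Python list standing in for memory. This is a sketch only: the function names are hypothetical and real vector hardware performs these accesses in a single instruction rather than a loop.

```python
def strided_load(memory, base, stride, count):
    """Load `count` elements starting at `base`, stepping by `stride`."""
    return [memory[base + k * stride] for k in range(count)]


def gather(memory, addresses):
    """Load the elements whose addresses are listed in an index vector."""
    return [memory[a] for a in addresses]


def scatter(memory, addresses, values):
    """Store each value at the address given by the index vector."""
    for a, v in zip(addresses, values):
        memory[a] = v


# A 3x4 matrix stored row by row: walking down a column needs a stride of 4
matrix = list(range(12))
column1 = strided_load(matrix, base=1, stride=4, count=3)
print(column1)
print(gather(matrix, [11, 0, 6]))
```

Without strided access, a column walk like the one above has to be assembled element by element, which is the limitation the MMX remark points at.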
Starting a Program
[Figure: C program -> Compiler -> Assembly language program -> Assembler -> Object: machine language module, together with Object: library routine (machine language) -> Linker -> Executable: machine language program -> Loader -> Memory]
Linker:
- Places code and data modules symbolically in memory
- Determines the addresses of data and instruction labels
- Patches both internal and external references
Object files for Unix typically contain:
- Header: size and position of components
- Text segment: machine code
- Data segment: static and dynamic variables
- Relocation info: identifies absolute memory references
- Symbol table: name and location of labels, procedures and variables
- Debugging info: mapping of source to object code, break points, etc.
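The three linker steps can be sketched on toy object modules. Everything here is an assumption made for illustration: the dictionary layout (`code`, `symbols`, `relocs`), the word-addressed memory, and the names `link`, `main_obj` and `lib_obj` do not correspond to any real object file format.

```python
def link(modules, base=0):
    """Place modules, resolve labels, and patch references (toy model)."""
    # Step 1: place the modules one after another in the address space
    placement, address = {}, base
    for mod in modules:
        placement[mod["name"]] = address
        address += len(mod["code"])
    # Step 2: determine the absolute address of every label
    symbols = {}
    for mod in modules:
        for label, offset in mod["symbols"].items():
            symbols[label] = placement[mod["name"]] + offset
    # Step 3: patch internal and external references using the relocation info
    image = []
    for mod in modules:
        code = list(mod["code"])
        for offset, label in mod["relocs"]:
            code[offset] = symbols[label]  # a KeyError here means an unresolved symbol
        image.extend(code)
    return image, symbols


# main calls `square`, which lives in a separately compiled library module
main_obj = {"name": "main", "code": ["call", None, "halt"],
            "symbols": {"main": 0}, "relocs": [(1, "square")]}
lib_obj = {"name": "lib", "code": ["mul", "ret"],
           "symbols": {"square": 0}, "relocs": []}

image, symbols = link([main_obj, lib_obj], base=100)
print(symbols)
print(image)
```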
Loading Executable Program
[Figure: memory layout, from high to low addresses: stack ($sp at 7fff ffff hex), dynamic data, static data ($gp at 1000 8000 hex, segment starting at 1000 0000 hex), text (pc at 0040 0000 hex), reserved (down to address 0)]
To load an executable, the operating system follows these steps:
- Reads the executable file header to determine the size of the text and data segments
- Creates an address space large enough for the text and data
- Copies the instructions and data from the executable file into memory
- Copies the parameters (if any) to the main program onto the stack
- Initializes the machine registers and sets the stack pointer to the first free location
- Jumps to a start-up routine that copies the parameters into the argument registers and calls the main routine of the program
Instruction Set Design Issues
- Number of Addresses
- Flow of Control
- Operand Types
- Addressing Modes
- Instruction Types
- Instruction Formats
Number of Addresses
Four categories
- 3-address machines
  - 2 for the source operands and one for the result
- 2-address machines
  - One address doubles as source and result
- 1-address machines
  - Accumulator machines
  - Accumulator is used for one source and the result
- 0-address machines
  - Stack machines
  - Operands are taken from the stack
  - Result goes onto the stack
Number of Addresses (cont.)
Three-address machines
- Two for the source operands, one for the result
- RISC processors use three addresses
- Sample instructions
    add  dest,src1,src2    ; M(dest)=[src1]+[src2]
    sub  dest,src1,src2    ; M(dest)=[src1]-[src2]
    mult dest,src1,src2    ; M(dest)=[src1]*[src2]
Three addresses: Operand 1, Operand 2, Result
Example: a = b + c
Three-address instruction formats are not common because they require a relatively long instruction format to hold the three address references
Number of Addresses (cont.)
Example
- C statement
    A = B + C * D - E + F + A
- Equivalent code:
    mult T,C,D    ; T = C*D
    add  T,T,B    ; T = B+C*D
    sub  T,T,E    ; T = B+C*D-E
    add  T,T,F    ; T = B+C*D-E+F
    add  A,T,A    ; A = B+C*D-E+F+A
Number of Addresses (cont.)
Two-address machines
- One address doubles (for source operand and result)
- Last example makes a case for it
  - Address T is used twice
- Sample instructions
    load dest,src    ; M(dest)=[src]
    add  dest,src    ; M(dest)=[dest]+[src]
    sub  dest,src    ; M(dest)=[dest]-[src]
    mult dest,src    ; M(dest)=[dest]*[src]
Two addresses: one address doubles as operand and result
Example: a = a + b
The two-address format reduces the space requirement but also introduces some awkwardness. To avoid altering the value of an operand, a MOVE instruction is used to move one of the values to a result or temporary location before performing the operation
Number of Addresses (cont.)
Example
- C statement
    A = B + C * D - E + F + A
- Equivalent code:
    load T,C    ; T = C
    mult T,D    ; T = C*D
    add  T,B    ; T = B+C*D
    sub  T,E    ; T = B+C*D-E
    add  T,F    ; T = B+C*D-E+F
    add  A,T    ; A = B+C*D-E+F+A
Number of Addresses (cont.)
One-address machines
- Use a special set of registers called accumulators
  - Specify one source operand and receive the result
- Called accumulator machines
- Sample instructions
    load  addr    ; accum = [addr]
    store addr    ; M[addr] = accum
    add   addr    ; accum = accum + [addr]
    sub   addr    ; accum = accum - [addr]
    mult  addr    ; accum = accum * [addr]
Number of Addresses (cont.)
Example
- C statement
    A = B + C * D - E + F + A
- Equivalent code:
    load  C    ; load C into accum
    mult  D    ; accum = C*D
    add   B    ; accum = C*D+B
    sub   E    ; accum = B+C*D-E
    add   F    ; accum = B+C*D-E+F
    add   A    ; accum = B+C*D-E+F+A
    store A    ; store accum contents in A
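The one-address sequence can be checked with a minimal accumulator-machine model. This is a sketch: the function name, the dictionary used as memory and the sample values are assumptions chosen for the illustration.

```python
def run_accumulator(program, memory):
    """Execute (opcode, address) pairs on a single-accumulator machine."""
    accum = 0
    for op, addr in program:
        if op == "load":
            accum = memory[addr]
        elif op == "store":
            memory[addr] = accum
        elif op == "add":
            accum += memory[addr]
        elif op == "sub":
            accum -= memory[addr]
        elif op == "mult":
            accum *= memory[addr]
        else:
            raise ValueError(f"unsupported instruction: {op}")
    return memory


# A = B + C * D - E + F + A, using one explicit address per instruction
ACCUM_PROGRAM = [("load", "C"), ("mult", "D"), ("add", "B"), ("sub", "E"),
                 ("add", "F"), ("add", "A"), ("store", "A")]

values = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
result = run_accumulator(ACCUM_PROGRAM, dict(values))["A"]
print(result == 2 + 3 * 4 - 5 + 6 + 1)
```

Every arithmetic instruction touches memory once, which is the source of the high memory traffic usually attributed to accumulator machines.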
Number of Addresses (cont.)
Zero-address machines
- Stack supplies operands and receives the result
  - Special instructions to load and store use an address
- Called stack machines (Ex: HP3000, Burroughs B5500)
- Sample instructions
    push addr    ; push([addr])
    pop  addr    ; pop([addr])
    add          ; push(pop + pop)
    sub          ; push(pop - pop)
    mult         ; push(pop * pop)
Number of Addresses (cont.)
Example
- C statement
    A = B + C * D - E + F + A
- Equivalent code:
    push E
    push C
    push D
    mult
    push B
    add
    sub
    push F
    add
    push A
    add
    pop  A
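A minimal stack-machine model makes the operand order explicit. It is a sketch with hypothetical names; it follows the convention that `sub` pushes (first popped value) - (second popped value), which is why E has to be pushed before the other operands.

```python
def run_stack(program, memory):
    """Execute zero-address code; only push and pop carry an address."""
    stack = []
    for instr in program:
        op, *operand = instr.split()
        if op == "push":
            stack.append(memory[operand[0]])
        elif op == "pop":
            memory[operand[0]] = stack.pop()
        elif op == "add":
            stack.append(stack.pop() + stack.pop())
        elif op == "sub":
            stack.append(stack.pop() - stack.pop())  # top minus next-to-top
        elif op == "mult":
            stack.append(stack.pop() * stack.pop())
        else:
            raise ValueError(f"unsupported instruction: {op}")
    return memory


# A = B + C * D - E + F + A
STACK_PROGRAM = ["push E", "push C", "push D", "mult", "push B", "add",
                 "sub", "push F", "add", "push A", "add", "pop A"]

values = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
result = run_stack(STACK_PROGRAM, dict(values))["A"]
print(result == 2 + 3 * 4 - 5 + 6 + 1)
```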
Load/Store Architecture
Instructions expect operands in internal processor registers
- Special LOAD and STORE instructions move data between registers and memory
- RISC uses this architecture
- Reduces instruction length
Load/Store Architecture (cont.)
Sample instructions
    load  Rd,addr      ; Rd = [addr]
    store addr,Rs      ; (addr) = Rs
    add   Rd,Rs1,Rs2   ; Rd = Rs1 + Rs2
    sub   Rd,Rs1,Rs2   ; Rd = Rs1 - Rs2
    mult  Rd,Rs1,Rs2   ; Rd = Rs1 * Rs2
Number of Addresses (cont.)
Example
- C statement
    A = B + C * D - E + F + A
- Equivalent code:
    load  R1,B
    load  R2,C
    load  R3,D
    load  R4,E
    load  R5,F
    load  R6,A
    mult  R2,R2,R3
    add   R2,R2,R1
    sub   R2,R2,R4
    add   R2,R2,R5
    add   R2,R2,R6
    store A,R2
Flow of Control
Default is sequential flow
Several instructions alter this default execution
- Branches
  - Unconditional
  - Conditional
  - Delayed branches
- Procedure calls
  - Delayed procedure calls
Flow of Control (cont.)
Branches
- Unconditional
  - Absolute address
  - PC-relative
    - Target address is specified relative to PC contents
    - Relocatable code
- Example: MIPS
  - Absolute address
      j target
  - PC-relative
      b target
Flow of Control (cont.)
Branches
- Conditional
  - Jump is taken only if the condition is met
- Two types
  - Set-Then-Jump
    - Condition testing is separated from branching
    - Condition code registers are used to convey the condition test result
    - Condition code registers keep a record of the status of the last ALU operation, such as an overflow condition
    - Example: Pentium code
        cmp AX,BX     ; compare AX and BX
        je  target    ; jump if equal
Flow of Control (cont.)
  - Test-and-Jump
    - A single instruction performs condition testing and branching
    - Example: MIPS instruction
        beq Rsrc1,Rsrc2,target
      Jumps to target if Rsrc1 = Rsrc2
- Delayed branching
  - Control is transferred after executing the instruction that follows the branch instruction
    - This instruction slot is called the delay slot
  - Improves efficiency
  - Highly pipelined RISC processors support it
Flow of Control (cont.)
Procedure calls
- Facilitate modular programming
- Require two pieces of information to return
  - End of procedure
    - Pentium uses the ret instruction
    - MIPS uses the jr instruction
  - Return address
    - In a (special) register: MIPS allows any general-purpose register
    - On the stack: Pentium
Flow of Control (cont.)
[Figure: delay slot]
Parameter Passing
Two basic techniques
- Register-based (e.g., PowerPC, MIPS)
  - Internal registers are used
    - Faster
    - Limit the number of parameters
    - Recursive procedure
- Stack-based (e.g., Pentium)
  - Stack is used
    - More general
2 Operand Types
Instructions support basic data types
- Characters
- Integers
- Floating-point
Instruction overload
- Same instruction for different data types
- Example: Pentium
    mov AL,address     ; loads an 8-bit value
    mov AX,address     ; loads a 16-bit value
    mov EAX,address    ; loads a 32-bit value
Operand Types (cont.)
Separate instructions
- Instructions specify the operand size
- Example: MIPS
    lb Rdest,address    ; loads a byte
    lh Rdest,address    ; loads a halfword (16 bits)
    lw Rdest,address    ; loads a word (32 bits)
    ld Rdest,address    ; loads a doubleword (64 bits)
- Similar instructions exist for store
3 Addressing Modes
How the operands are specified
- Operands can be in three places
  - Registers
    - Register addressing mode
  - Part of the instruction
    - Constant
    - Immediate addressing mode
    - All processors support these two addressing modes
  - Memory
    - Difference between RISC and CISC
    - CISC supports a large variety of addressing modes
    - RISC follows the load/store architecture
4 Instruction Types
Several types of instructions
- Data movement
  - Pentium: mov dest,src
  - Some do not provide direct data movement instructions
  - Indirect data movement
      add Rdest,Rsrc,0    ; Rdest = Rsrc+0
- Arithmetic and Logical
  - Arithmetic
    - Integer and floating-point, signed and unsigned
    - add, subtract, multiply, divide
  - Logical
    - and, or, not, xor
Instruction Types (cont.)
Condition code bits
- S: Sign bit (0 = +, 1 = -)
- Z: Zero bit (0 = nonzero, 1 = zero)
- O: Overflow bit (0 = no overflow, 1 = overflow)
- C: Carry bit (0 = no carry, 1 = carry)
Example: Pentium
    cmp count,25    ; compare count to 25 (subtract 25 from count)
    je  target      ; jump if equal
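How a compare sets the four condition code bits can be sketched by modelling the subtraction on fixed-width integers. The function names are hypothetical, a 32-bit width is assumed, and the carry bit follows the x86 convention in which it signals a borrow after a subtraction.

```python
def compare_flags(a, b, bits=32):
    """Return the S, Z, O, C bits produced by computing a - b."""
    mask = (1 << bits) - 1
    sign = 1 << (bits - 1)
    ua, ub = a & mask, b & mask  # operands as unsigned bit patterns
    result = (ua - ub) & mask
    return {
        "S": int(bool(result & sign)),
        "Z": int(result == 0),
        # signed overflow: the operands differ in sign and the result's sign differs from a's
        "O": int(bool((ua ^ ub) & (ua ^ result) & sign)),
        # carry acts as the borrow flag for a subtraction
        "C": int(ua < ub),
    }


def je_taken(flags):
    """A jump-if-equal is taken exactly when the zero bit is set."""
    return flags["Z"] == 1


print(compare_flags(25, 25), je_taken(compare_flags(25, 25)))
print(compare_flags(3, 25), je_taken(compare_flags(3, 25)))
```

The branch instruction itself does no comparison; it only inspects the bits left behind by the preceding compare.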
Instruction Types (cont.)
Flow control and I/O instructions
- Flow control
  - Branch
  - Procedure call
  - Interrupts
- I/O instructions
  - Memory-mapped I/O
    - Most processors support memory-mapped I/O
    - No separate instructions for I/O
  - Isolated I/O
    - Pentium supports isolated I/O
    - Separate I/O instructions
        in  AX,io_port    ; read from an I/O port
        out io_port,AX    ; write to an I/O port
5 Instruction Formats
Two types
- Fixed-length
  - Used by RISC processors
  - 32-bit RISC processors use 32-bit wide instructions
    - Examples: SPARC, MIPS, PowerPC
- Variable-length
  - Used by CISC processors
  - Memory operands need more bits to specify
Opcode
- Major and exact operation
Examples of Instruction Formats
RISC (Reduced Instruction Set Computer) versus CISC (Complex Instruction Set Computer)
RISC vs CISC
The underlying philosophy of RISC machines is that a system is better able to manage program execution when the program consists of only a few different instructions that are the same length and require the same number of clock cycles to decode and execute
RISC systems access memory only with explicit load and store instructions
In CISC systems, many different kinds of instructions access memory, making instruction length variable and fetch-decode-execute time unpredictable
RISC vs CISC
The difference between CISC and RISC becomes evident through the basic computer performance equation
RISC systems shorten execution time by reducing the clock cycles per instruction
CISC systems improve performance by reducing the number of instructions per program
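The performance equation referred to here is commonly written as a product of three factors. The form below is the standard textbook formulation, given as an assumption about what the slide showed; its last two factors are the quantities named in the two statements above.

```latex
\[
\frac{\text{time}}{\text{program}}
  = \frac{\text{time}}{\text{cycle}}
  \times \frac{\text{cycles}}{\text{instruction}}
  \times \frac{\text{instructions}}{\text{program}}
\]
```

RISC designs attack the middle factor (cycles per instruction), while CISC designs attack the last one (instructions per program).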
RISC vs CISC
The simple instruction set of RISC machines enables control units to be hardwired for maximum speed
The more complex, and variable, instruction set of CISC machines requires microcode-based control units that interpret instructions as they are fetched from memory. This translation takes time
With fixed-length instructions, RISC lends itself to pipelining and speculative execution
RISC vs CISC
Consider the following program fragments:
CISC:
    mov ax, 10
    mov bx, 5
    mul bx, ax
RISC:
    mov ax, 0
    mov bx, 10
    mov cx, 5
    Begin: add ax, bx
    loop Begin
The total clock cycles for the CISC version might be:
    (2 movs x 1 cycle) + (1 mul x 30 cycles) = 32 cycles
While the clock cycles for the RISC version are:
    (3 movs x 1 cycle) + (5 adds x 1 cycle) + (5 loops x 1 cycle) = 13 cycles
With the RISC clock cycle also being shorter, RISC gives us much faster execution speeds
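The cycle comparison is a weighted sum of instruction counts, which can be sketched as below. The per-instruction costs are assumptions for the illustration (1 cycle for each mov, add and loop, 30 cycles for the multiply), as is the loop running five times; `total_cycles` is a hypothetical helper.

```python
def total_cycles(instruction_counts, cycles_per_instruction):
    """Sum count * cost over every instruction type that is executed."""
    return sum(count * cycles_per_instruction[name]
               for name, count in instruction_counts.items())


# CISC fragment: two movs and one multi-cycle multiply
cisc_cycles = total_cycles({"mov": 2, "mul": 1}, {"mov": 1, "mul": 30})

# RISC fragment: three movs, then add and loop each executed five times
risc_cycles = total_cycles({"mov": 3, "add": 5, "loop": 5},
                           {"mov": 1, "add": 1, "loop": 1})

print(cisc_cycles, risc_cycles)
```

The RISC fragment executes more instructions (13 against 3) yet needs fewer cycles, because each of its instructions is cheap.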
RISC vs CISC
Because of their load-store ISAs, RISC architectures require a large number of CPU registers
These registers provide fast access to data during sequential program execution
They can also be employed to reduce the overhead typically caused by passing parameters to subprograms
Instead of pulling parameters off of a stack, the subprogram is directed to use a subset of registers
RISC vs CISC
This is how registers can be overlapped in a RISC system
The current window pointer (CWP) points to the active register window
RISC vs CISC
It is becoming increasingly difficult to distinguish RISC architectures from CISC architectures
Some RISC systems provide more extravagant instruction sets than some CISC systems
Some systems combine both approaches
The following two slides summarize the characteristics that traditionally typify the differences between these two architectures
RISC vs CISC
RISC
- Multiple register sets
- Three operands per instruction
- Parameter passing through register windows
- Single-cycle instructions
- Hardwired control
- Highly pipelined
CISC
- Single register set
- One or two register operands per instruction
- Parameter passing through memory
- Multiple-cycle instructions
- Microprogrammed control
- Less pipelined
Continued...
RISC vs CISC
RISC
- Simple instructions, few in number
- Fixed-length instructions
- Complexity in compiler
- Only LOAD/STORE instructions access memory
- Few addressing modes
CISC
- Many complex instructions
- Variable-length instructions
- Complexity in microcode
- Many instructions can access memory
- Many addressing modes
Summary
Instruction Set Design Issues
Instruction set design issues include:
- Where are operands stored?
  - registers, memory, stack, accumulator
- How many explicit operands are there?
  - 0, 1, 2, or 3
- How is the operand location specified?
  - register, immediate, indirect, ...
- What type and size of operands are supported?
  - byte, int, float, double, string, vector, ...
- What operations are supported?
  - add, sub, mul, move, compare, ...
More About General Purpose Registers
Why do almost all new architectures use GPRs?
- Registers are much faster than memory (even cache)
  - Register values are available immediately
  - When memory isn't ready, the processor must wait ("stall")
- Registers are convenient for variable storage
  - The compiler assigns some variables just to registers
  - More compact code, since small fields specify registers (compared to memory addresses)
[Figure: storage hierarchy: processor registers, cache, memory, disk]
What Operations are Needed
- Arithmetic and Logical
  - Integer arithmetic: ADD, SUB, MULT, DIV, SHIFT
  - Logical operation: AND, OR, NOT
- Data Transfer: copy, load, store
- Control: branch, jump, call, return
- Floating Point: ADDF, MULF, DIVF
  - Same as arithmetic but usually take bigger operands
- Decimal: ADDD, CONVERT
- String: move, compare, search
- Graphics: pixel and vertex, compression/decompression operations
Stack Architecture: Pros and Cons
Pros
- Good code density (implicit top of stack)
- Low hardware requirements
- Easy to write a simpler compiler for stack architectures
Cons
- Stack becomes the bottleneck
- Little ability for parallelism or pipelining
- Data is not always at the top of the stack when needed, so additional instructions like TOP and SWAP are needed
- Difficult to write an optimizing compiler for stack architectures
Accumulator Architecture: Pros and Cons
Pros
- Very low hardware requirements
- Easy to design and understand
Cons
- Accumulator becomes the bottleneck
- Little ability for parallelism or pipelining
- High memory traffic
Memory-Memory Architecture: Pros and Cons
Pros
- Requires fewer instructions (especially if 3 operands)
- Easy to write compilers for (especially if 3 operands)
Cons
- Very high memory traffic (especially if 3 operands)
- Variable number of clocks per instruction
- With two operands, more data movements are required
Memory-Register Architecture: Pros and Cons
Pros
- Some data can be accessed without loading first
- Instruction format easy to encode
- Good code density
Cons
- Operands are not equivalent (poor orthogonality)
- Variable number of clocks per instruction
- May limit number of registers
Operations for Media and Signal Processing
Single instruction multiple data (SIMD) and vector instructions are often supported in DSPs, which are commonly used in multimedia and signal processing applications
- Partitioned Add (integer)
  - Performs multiple 16-bit additions on a 64-bit ALU, since most data are narrow
  - Increases ALU throughput for multimedia applications
- Paired single operations (float)
  - Allow the same register to act as two operands to the same operation
  - Handy in dealing with vertices and coordinates
- Multiply and accumulate
  - Very handy for calculating dot products of vectors (signal processing) and matrix multiplication
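A partitioned add can be modelled by treating one wide word as several independent lanes. The sketch below assumes four 16-bit lanes in a 64-bit word and wrap-around arithmetic inside each lane; the function names are hypothetical and real hardware does all lanes in one ALU operation.

```python
def pack(values, lane_bits=16):
    """Pack small integers into one wide word, lane 0 in the low bits."""
    word = 0
    for i, v in enumerate(values):
        word |= (v & ((1 << lane_bits) - 1)) << (i * lane_bits)
    return word


def unpack(word, lane_bits=16, lanes=4):
    """Split a wide word back into its lanes."""
    return [(word >> (i * lane_bits)) & ((1 << lane_bits) - 1) for i in range(lanes)]


def partitioned_add(a, b, lane_bits=16, lanes=4):
    """Add corresponding lanes; a carry out of one lane never enters the next."""
    lane_mask = (1 << lane_bits) - 1
    result = 0
    for lane in range(lanes):
        shift = lane * lane_bits
        lane_sum = ((a >> shift) & lane_mask) + ((b >> shift) & lane_mask)
        result |= (lane_sum & lane_mask) << shift  # wrap around within the lane
    return result


x = pack([1, 2, 3, 4])
y = pack([10, 20, 30, 40])
print(unpack(partitioned_add(x, y)))
```

The difference from an ordinary 64-bit add shows up on overflow: adding 1 to a lane holding 0xFFFF wraps that lane to 0 and leaves its neighbour untouched.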
Frequency of Operations Usage
The most widely executed instructions are the simple operations of an instruction set
The following is the average usage in SPECint on Intel 80x86
Rank  80x86 Instruction        Integer Average (% total executed)
1     Load                     22%
2     Conditional branch       20%
3     Compare                  16%
4     Store                    12%
5     Add                      8%
6     And                      6%
7     Sub                      5%
8     Move register-register   4%
9     Call                     1%
10    Return                   1%
      Total                    96%
Make the common case fast by focusing on these operations
Control Flow Instructions
- Jump for unconditional change in the control flow
- Branch for conditional change in the control flow
- Procedure calls and returns
Data is based on SPEC on Alpha
Destination Address Definition
Relative addressing with respect to the program counter proved to be the best choice for forward and backward branching or jumps (load address independent)
To allow for dynamic loading of library routines, register indirect addressing allows addresses to be loaded in special registers (e.g., virtual functions in C++ and system calls in a case statement)
Data is based on SPEC on Alpha
Condition Evaluation
Compare-and-branch can be efficient if the majority of conditions are comparisons with zero
Remember to focus on the common case
Based on SPEC on MIPS
Frequency of Types of Comparison
Data is based on SPEC on Alpha
Different benchmarks and machines set new design priorities
DSPs support a repeat instruction for for-loops (vectors) using registers
Supporting Procedures
Execution of a procedure follows these steps:
Store parameters in a place accessible to the procedure
Transfer control to the procedure
Acquire the storage resources needed for the procedure
Perform the desired task
Store the result value in a place accessible to the calling program
Return control to the point of origin
The hardware provides a program counter to trace instruction flow and manage transfer of control
Parameter Passing
Registers can be used for passing a small number of parameters
A stack is used to spill registers of the current context and make room for the called procedure to run, and to allow large parameters to be passed
Storage of machine state can be performed by the caller or the callee
Handling of shared variables is important to ensure correct semantics and thus requires clear specifications in the library interface
Global variables stored in registers need careful handling
Type and Size of Operands
The type of an operand is designated by encoding it in the instruction's operation code
The type of an operand, e.g. single-precision float, effectively gives its size
Common operand types include character, half-word and word-size integer, and single- and double-precision floating point
Characters are almost always in ASCII, integers in 2's complement, and floating point in IEEE 754
The 16-bit Unicode used in Java is gaining popularity due to its support for international character sets
For business applications some architectures support a decimal format in binary coded decimal (BCD)
Depending on the size of the word, the complexity of handling different operand types differs
DSPs offer fixed-point data types to support high-precision floating-point arithmetic and to allow sharing a single exponent among multiple numbers
For graphics applications, vertex and pixel operands are added features
Size of Operands
Double-word data type is used for double-precision floating point operations and for address storage in machines with a 64-bit wide address bus
Words are used for integer operations and for 32-bit address bus machines
Because of the mix in SPEC, word and double-word data types dominate
Frequency of reference by size, based on SPEC2000 on Alpha
Instruction Representation
Humans are taught to think in base 10 (decimal), but numbers may be represented in any base (123 in base 10 = 1111011 in binary, or base 2)
Numbers are stored in computers as a series of high and low electronic signals (binary numbers)
Binary digits are called bits and are considered the atoms of computing
Each piece of an instruction is a number, and placing these numbers together forms the instruction
The assembler translates the symbolic assembly instructions into machine language instructions (machine code)
Example:
Assembly: add $t0, $s1, $s2
Machine language (decimal):  0       17      18      8       0       32
Machine language (binary):   000000  10001   10010   01000   00000   100000
Field widths:                6 bits  5 bits  5 bits  5 bits  5 bits  6 bits
Note: the MIPS compiler by default maps $s0..$s7 to registers 16-23 and $t0..$t7 to registers 8-15
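The field packing above can be sketched in Python. The register numbers follow the note ($t0..$t7 are 8-15, $s0..$s7 are 16-23) and funct 32 selects add; the helper names encode_r, encode_add and fields are illustrative, not part of any toolchain.

```python
# Register name -> register number, per the note above.
REGS = {f"$t{i}": 8 + i for i in range(8)}
REGS.update({f"$s{i}": 16 + i for i in range(8)})


def encode_r(rs, rt, rd, shamt, funct, op=0):
    """Pack the six R-format fields (6/5/5/5/5/6 bits) into one 32-bit word."""
    return (op << 26) | (rs << 21) | (rt << 16) | (rd << 11) | (shamt << 6) | funct


def encode_add(rd, rs, rt):
    """Encode 'add rd, rs, rt' (op = 0, funct = 32)."""
    return encode_r(REGS[rs], REGS[rt], REGS[rd], 0, 32)


def fields(word):
    """Split a 32-bit word back into [op, rs, rt, rd, shamt, funct]."""
    return [
        (word >> 26) & 0x3F,
        (word >> 21) & 0x1F,
        (word >> 16) & 0x1F,
        (word >> 11) & 0x1F,
        (word >> 6) & 0x1F,
        word & 0x3F,
    ]


if __name__ == "__main__":
    word = encode_add("$t0", "$s1", "$s2")
    print(fields(word), format(word, "032b"))
```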
Encoding an Instruction Set
Instruction encoding affects the size of the compiled program and the complexity of the CPU implementation
The operation is typically specified in one field, called the opcode
The addressing mode for the operand can be encoded with the operation, or specified through a separate identifier when a large number of modes is supported
The architecture must balance several competing factors:
Desire to support as many registers and addressing modes as possible
Effect of operand specification on the size of the instruction (program)
Desire to simplify instruction fetching and decoding during execution
Fixed-size instruction encoding simplifies the CPU design while limiting the addressing modes supported
An architect caring about code size can use variable-size encoding
A hybrid approach is to allow variability by supporting multiple instruction sizes
Encoding Examples
MIPS Instruction format
Register-format instructions
op: basic operation of the instruction, traditionally called the opcode
rs: the first register source operand
rt: the second register source operand
rd: the register destination operand; it gets the result of the operation
shamt: shift amount
funct: selects the specific variant of the operation in the op field
R-format layout: op (6 bits) | rs (5 bits) | rt (5 bits) | rd (5 bits) | shamt (5 bits) | funct (6 bits)

Immediate-type instructions
Some instructions need longer fields than those provided above, for example for large-value constants
The 16-bit address means a load word instruction can load a word within a region of ±2^15 bytes of the address in the base register
Example: lw $t0, 32($s3)   # Temporary register $t0 gets A[8]
I-format layout: op (6 bits) | rs (5 bits) | rt (5 bits) | address (16 bits)

Instruction  Format  op  rs   rt   rd   shamt  funct  address
add          R       0   reg  reg  reg  0      32     N/A
sub          R       0   reg  reg  reg  0      34     N/A
lw           I       35  reg  reg  N/A  N/A    N/A    address
sw           I       43  reg  reg  N/A  N/A    N/A    address
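A hedged Python sketch of the I-format: it packs op, rs, rt and a signed 16-bit address field, and rejects offsets outside ±2^15. It assumes opcode 35 for lw and register numbers 19 for $s3 and 8 for $t0; encode_i and decode_imm are illustrative names.

```python
def encode_i(op, rs, rt, imm):
    """Pack an I-format word: op (6) | rs (5) | rt (5) | address (16, signed)."""
    if not -(1 << 15) <= imm < (1 << 15):
        raise ValueError("offset does not fit in a signed 16-bit field")
    return (op << 26) | (rs << 21) | (rt << 16) | (imm & 0xFFFF)


def decode_imm(word):
    """Recover the signed 16-bit address field from an I-format word."""
    imm = word & 0xFFFF
    return imm - 0x10000 if imm & 0x8000 else imm


if __name__ == "__main__":
    # lw $t0, 32($s3): base register $s3 = 19, target register $t0 = 8
    word = encode_i(35, 19, 8, 32)
    print(hex(word), decode_imm(word))
```

An array element farther than 2^15 bytes from the base register cannot be reached by a single load, which is the limit the slide describes.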
The Stored Program Concept
Learning how instructions are represented leads to discovering the secret of computing: the stored-program concept
Today's computers are built on two key principles:
Instructions are represented as numbers
Programs can be stored in memory to be read or written just like numbers
The power of the concept: memory can contain
the source code for an editor
the compiled machine code for the editor
the text that the compiled program is using
the compiler that generated the code
[Figure: a processor connected to a memory holding an accounting program (machine code), an editor program (machine code), a C compiler (machine code), payroll data, book text, and source code in C for the editor program]
Compiling if-then-else in MIPS
Assuming the five variables f, g, h, i, and j correspond to the five registers $s0 through $s4, what is the compiled MIPS code for the following C if statement?
if (i == j) f = g + h; else f = g - h;
[Figure: flowchart testing i == j; the i = j path computes f = g + h, the i ≠ j path goes to Else: and computes f = g - h; both paths meet at Exit:]
MIPS:
bne $s3, $s4, Else       # go to Else if i ≠ j
add $s0, $s1, $s2        # f = g + h (skipped if i ≠ j)
j Exit
Else: sub $s0, $s1, $s2  # f = g - h (skipped if i = j)
Exit:
Typical Compilation
Major Types of Optimization
Optimization name: explanation (frequency; N.M. = not measured)

High-level (at or near source level, machine independent)
- Procedure integration: replace procedure call by procedure body (N.M.)

Local (within straight-line code)
- Common subexpression elimination: replace two instances of the same computation by a single copy (18%)
- Constant propagation: replace all instances of a variable that is assigned a constant with the constant (22%)
- Stack height reduction: rearrange expression tree to minimize resources needed for expression evaluation (N.M.)

Global (across a branch)
- Global common subexpression elimination: same as local, but this version crosses branches (13%)
- Copy propagation: replace all instances of a variable A that has been assigned X (i.e. A = X) with X (11%)
- Code motion: remove code from a loop that computes the same value in each iteration of the loop (16%)
- Induction variable elimination: simplify/eliminate array addressing calculations within loops (2%)

Machine-dependent (depends on machine knowledge)
- Strength reduction: many examples, such as replacing multiply by a constant with adds and shifts (N.M.)
- Pipeline scheduling: reorder instructions to improve pipeline performance (N.M.)
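As an illustration of strength reduction, the Python sketch below replaces a multiply by the constant 10 with two shifts and an add (10x = 8x + 2x). The constant and the function names are chosen for the example only.

```python
def times_ten_multiply(x):
    """Baseline: one multiply by a constant."""
    return x * 10


def times_ten_shift_add(x):
    """Strength-reduced form: 10*x = 8*x + 2*x, using two shifts and one add."""
    return (x << 3) + (x << 1)


if __name__ == "__main__":
    for x in (0, 1, 7, -3, 1234):
        print(x, times_ten_multiply(x), times_ten_shift_add(x))
```

Whether the shifted form is actually faster depends on the machine, which is why the transformation is listed as machine-dependent.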
Effect of Compiler Optimization
[Figure: measurements taken on S; axis: program and compiler optimization level]
Level 0: non-optimized code
Level 1: local optimization
Level 2: global optimization, s/w pipelining
Level 3: adds procedure integration
Compiler Support for Multimedia Instr
Intel's MMX and PowerPC AltiVec have small vector processing capabilities targeting multimedia applications (to speed up graphics)
Intel added a new set of instructions called Streaming SIMD Extensions
A major advantage of vector computers is hiding the latency of memory access by loading multiple elements and then overlapping execution with data transfer
Vector computers typically have strided and/or gather/scatter addressing to perform operations on distant memory locations
Strided addressing allows memory access in increments larger than one
Gather/scatter addressing is similar to register indirect mode, where the addresses are stored instead of the data
Supporting vector operations without strided addressing, as in Intel's MMX, limits the potential speedup
Such limited support for vector processing makes vectorizing compiler optimization unpopular and restricts its scope to hand-coded routines
SIMD instructions on MMX and AltiVec tend to be solutions, not primitives
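A minimal Python sketch of the two addressing styles, modelling memory as a flat list. The function names and the 3x4 row-major matrix are assumptions made for the illustration.

```python
def load_strided(memory, base, stride, count):
    """Strided access: fetch `count` elements that sit `stride` slots apart."""
    return [memory[base + i * stride] for i in range(count)]


def gather(memory, index_vector):
    """Gather: the index vector holds addresses, not data."""
    return [memory[addr] for addr in index_vector]


def scatter(memory, index_vector, values):
    """Scatter: write each value to the address held in the index vector."""
    for addr, value in zip(index_vector, values):
        memory[addr] = value


if __name__ == "__main__":
    memory = list(range(12))                 # a 3x4 matrix stored row by row
    print(load_strided(memory, 1, 4, 3))     # column 1: stride = row length
    print(gather(memory, [7, 0, 10]))
    scatter(memory, [7, 0, 10], [-1, -2, -3])
    print(memory)
```

Without strided access, reading a matrix column takes one scalar load per element, which is the speedup limit mentioned above.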
Starting a Program
[Figure: C program → Compiler → Assembly language program → Assembler → Object: machine language module, together with Object: library routine (machine language) → Linker → Executable: machine language program → Loader → Memory]
Linker
- Places code and data modules symbolically in memory
- Determines the addresses of data and instruction labels
- Patches both internal and external references
Object files for Unix typically contain:
Header: size and position of components
Text segment: machine code
Data segment: static and dynamic variables
Relocation info: identifies absolute memory references
Symbol table: names and locations of labels, procedures and variables
Debugging info: mapping of source to object code, breakpoints, etc.
Loading Executable Program
[Figure: memory layout, top to bottom: Stack ($sp at 7fff ffff hex), Dynamic data, Static data (from 1000 0000 hex, $gp at 1000 8000 hex), Text (pc at 0040 0000 hex), Reserved (from 0)]
To load an executable, the operating system follows these steps:
Reads the executable file header to determine the size of the text and data segments
Creates an address space large enough for the text and data
Copies the instructions and data from the executable file into memory
Copies the parameters (if any) to the main program onto the stack
Initializes the machine registers and sets the stack pointer to the first free location
Jumps to a start-up routine that copies the parameters into the argument registers and calls the main routine of the program
Instruction Set Design Issues
Instruction set design issues:
Number of Addresses
Flow of Control
Operand Types and Addressing Modes
Instruction Types
Instruction Formats
Number of Addresses
Four categories
3-address machines
- 2 for the source operands and one for the result
2-address machines
- One address doubles as source and result
1-address machines
- Accumulator machines
- Accumulator is used for one source and result
0-address machines
- Stack machines
- Operands are taken from the stack
- Result goes onto the stack
Number of Addresses (cont.)
Three-address machines
Two for the source operands, one for the result
RISC processors use three addresses
Sample instructions
add dest,src1,src2    ; M(dest) = [src1] + [src2]
sub dest,src1,src2    ; M(dest) = [src1] - [src2]
mult dest,src1,src2   ; M(dest) = [src1] * [src2]
Three addresses: Operand 1, Operand 2, Result
Example: a = b + c
Three-address instruction formats are not common because they require a relatively long instruction format to hold the three address references
Number of Addresses (cont.)
Example
C statement
A = B + C * D - E + F + A
Equivalent code:
mult T,C,D   ; T = C*D
add T,T,B    ; T = B + C*D
sub T,T,E    ; T = B + C*D - E
add T,T,F    ; T = B + C*D - E + F
add A,T,A    ; A = B + C*D - E + F + A
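The three-address sequence can be checked with a tiny interpreter. This is a sketch in Python, with memory modelled as a dictionary and sample operand values chosen for the illustration; run_three_address is a hypothetical name.

```python
import operator

OPS = {"add": operator.add, "sub": operator.sub, "mult": operator.mul}


def run_three_address(program, memory):
    """Execute (op, dest, src1, src2) tuples: M(dest) = [src1] op [src2]."""
    for op, dest, src1, src2 in program:
        memory[dest] = OPS[op](memory[src1], memory[src2])
    return memory


PROGRAM = [
    ("mult", "T", "C", "D"),  # T = C*D
    ("add", "T", "T", "B"),   # T = B + C*D
    ("sub", "T", "T", "E"),   # T = B + C*D - E
    ("add", "T", "T", "F"),   # T = B + C*D - E + F
    ("add", "A", "T", "A"),   # A = B + C*D - E + F + A
]

if __name__ == "__main__":
    memory = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6, "T": 0}
    print(run_three_address(PROGRAM, memory)["A"])
```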
Number of Addresses (cont.)
Two-address machines
One address doubles (for source operand and result)
Last example makes a case for it
- Address T is used twice
Sample instructions
load dest,src   ; M(dest) = [src]
add dest,src    ; M(dest) = [dest] + [src]
sub dest,src    ; M(dest) = [dest] - [src]
mult dest,src   ; M(dest) = [dest] * [src]
Two addresses: one address doubles as operand and result
Example: a = a + b
The two-address format reduces the space requirement but also introduces some awkwardness. To avoid altering the value of an operand, a MOVE instruction is used to move one of the values to a result or temporary location before performing the operation
Number of Addresses (cont.)
Example
C statement
A = B + C * D - E + F + A
Equivalent code:
load T,C   ; T = C
mult T,D   ; T = C*D
add T,B    ; T = B + C*D
sub T,E    ; T = B + C*D - E
add T,F    ; T = B + C*D - E + F
add A,T    ; A = B + C*D - E + F + A
Number of Addresses (cont.)
One-address machines
Use a special set of registers called accumulators
- Specify one source operand and receive the result
Called accumulator machines
Sample instructions
load addr    ; accum = [addr]
store addr   ; M[addr] = accum
add addr     ; accum = accum + [addr]
sub addr     ; accum = accum - [addr]
mult addr    ; accum = accum * [addr]
Number of Addresses (cont.)
Example
C statement
A = B + C * D - E + F + A
Equivalent code:
load C    ; load C into accum
mult D    ; accum = C*D
add B     ; accum = C*D + B
sub E     ; accum = B + C*D - E
add F     ; accum = B + C*D - E + F
add A     ; accum = B + C*D - E + F + A
store A   ; store accum contents in A
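A sketch of a one-address (accumulator) machine in Python. Every instruction names a single memory address and the accumulator is the implicit second operand and destination; the interpreter and the sample values are illustrative.

```python
def run_accumulator(program, memory):
    """Execute (op, addr) pairs against a single implicit accumulator."""
    accum = 0
    for op, addr in program:
        if op == "load":
            accum = memory[addr]
        elif op == "store":
            memory[addr] = accum
        elif op == "add":
            accum = accum + memory[addr]
        elif op == "sub":
            accum = accum - memory[addr]
        elif op == "mult":
            accum = accum * memory[addr]
        else:
            raise ValueError(f"unknown instruction: {op}")
    return memory


PROGRAM = [
    ("load", "C"), ("mult", "D"), ("add", "B"), ("sub", "E"),
    ("add", "F"), ("add", "A"), ("store", "A"),
]

if __name__ == "__main__":
    memory = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
    print(run_accumulator(PROGRAM, memory)["A"])
```

Each instruction touches memory, which is the high memory traffic that accumulator designs are known for.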
Number of Addresses (cont.)
Zero-address machines
Stack supplies operands and receives the result
- Special instructions to load and store use an address
Called stack machines (Ex: HP3000, Burroughs B5500)
Sample instructions
push addr   ; push([addr])
pop addr    ; pop([addr])
add         ; push(pop + pop)
sub         ; push(pop - pop)
mult        ; push(pop * pop)
Number of Addresses (cont.)
Example
C statement
A = B + C * D - E + F + A
Equivalent code:
push E
push C
push D
mult
push B
add
sub
push F
add
push A
add
pop A
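A Python sketch of a zero-address (stack) machine that runs the sequence above. It assumes sub means push(first popped - second popped), which is the reading under which the example computes the C statement; the interpreter and sample values are illustrative.

```python
def run_stack(program, memory):
    """Execute a zero-address program; only push and pop carry an address."""
    stack = []
    for instr in program:
        op = instr[0]
        if op == "push":
            stack.append(memory[instr[1]])
        elif op == "pop":
            memory[instr[1]] = stack.pop()
        else:
            first = stack.pop()   # top of stack
            second = stack.pop()  # element below it
            if op == "add":
                stack.append(first + second)
            elif op == "sub":
                stack.append(first - second)
            elif op == "mult":
                stack.append(first * second)
            else:
                raise ValueError(f"unknown instruction: {op}")
    return memory


PROGRAM = [
    ("push", "E"), ("push", "C"), ("push", "D"), ("mult",),
    ("push", "B"), ("add",), ("sub",),
    ("push", "F"), ("add",), ("push", "A"), ("add",), ("pop", "A"),
]

if __name__ == "__main__":
    memory = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
    print(run_stack(PROGRAM, memory)["A"])
```

Note that E is pushed first so that it ends up below B + C*D when sub executes.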
Load/Store Architecture
Instructions expect operands in internal processor registers
Special LOAD and STORE instructions move data between registers and memory
RISC uses this architecture
Reduces instruction length
Load/Store Architecture (cont.)
Sample instructions
load Rd,addr      ; Rd = [addr]
store addr,Rs     ; (addr) = Rs
add Rd,Rs1,Rs2    ; Rd = Rs1 + Rs2
sub Rd,Rs1,Rs2    ; Rd = Rs1 - Rs2
mult Rd,Rs1,Rs2   ; Rd = Rs1 * Rs2
Number of Addresses (cont.)
Example
C statement
A = B + C * D - E + F + A
Equivalent code:
load R1,B
load R2,C
load R3,D
load R4,E
load R5,F
load R6,A
mult R2,R2,R3
add R2,R2,R1
sub R2,R2,R4
add R2,R2,R5
add R2,R2,R6
store A,R2
Flow of Control
Default is sequential flow
Several instructions alter this default execution
Branches
- Unconditional
- Conditional
- Delayed branches
Procedure calls
- Delayed procedure calls
Flow of Control (cont.)
Branches
Unconditional
- Absolute address
- PC-relative
  - Target address is specified relative to PC contents
  - Relocatable code
Example: MIPS
- Absolute address
  j target
- PC-relative
  b target
Flow of Control (cont.)
[Figure: Pentium branch example]
Branches
Conditional
- Jump is taken only if the condition is met
Two types
- Set-Then-Jump
  - Condition testing is separated from branching
  - Condition code registers are used to convey the condition test result
  - Condition code registers keep a record of the status of the last ALU operation, such as the overflow condition
- Example: Pentium code
  cmp AX,BX    ; compare AX and BX
  je target    ; jump if equal
Flow of Control (cont.)
- Test-and-Jump
  - Single instruction performs condition testing and branching
- Example: MIPS instruction
  beq Rsrc1,Rsrc2,target
  Jumps to target if Rsrc1 = Rsrc2
Delayed branching
Control is transferred after executing the instruction that follows the branch instruction
- This instruction slot is called the delay slot
Improves efficiency
Highly pipelined RISC processors support it
Flow of Control (cont.)
Procedure calls
Facilitate modular programming
Require two pieces of information to return
- End of procedure
  - Pentium uses the ret instruction
  - MIPS uses the jr instruction
- Return address
  - In a (special) register: MIPS allows any general-purpose register
  - On the stack: Pentium
Flow of Control (cont.)
[Figure: delay slot]
Parameter Passing
Two basic techniques
Register-based (e.g. PowerPC, MIPS)
- Internal registers are used
  - Faster
  - Limit the number of parameters
  - Recursive procedure
Stack-based (e.g. Pentium)
- Stack is used
  - More general
2 Operand Types
Instructions support basic data types
- Characters
- Integers
- Floating-point
Instruction overload
- Same instruction for different data types
- Example: Pentium
  mov AL,address    ; loads an 8-bit value
  mov AX,address    ; loads a 16-bit value
  mov EAX,address   ; loads a 32-bit value
Operand Types
Separate instructions
- Instructions specify the operand size
- Example: MIPS
  lb Rdest,address   ; loads a byte
  lh Rdest,address   ; loads a halfword (16 bits)
  lw Rdest,address   ; loads a word (32 bits)
  ld Rdest,address   ; loads a doubleword (64 bits)
Similar instructions for store
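A Python sketch of size-specific loads from a byte-addressed memory. It assumes big-endian byte order and sign extension for the signed variants; the load helper is illustrative and does not model alignment rules.

```python
def load(memory, address, size, signed=True):
    """Read `size` bytes starting at `address` and interpret them as an integer."""
    return int.from_bytes(memory[address:address + size], "big", signed=signed)


if __name__ == "__main__":
    memory = bytes([0x80, 0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07])
    print(load(memory, 0, 1))                     # like lb: one byte, sign-extended
    print(load(memory, 2, 2))                     # like lh: halfword (16 bits)
    print(hex(load(memory, 4, 4)))                # like lw: word (32 bits)
    print(hex(load(memory, 0, 8, signed=False)))  # like ld: doubleword (64 bits)
```

The same address yields different values depending on the size the instruction specifies, which is why the size must be part of the opcode.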
3 Addressing Modes
How the operands are specified
Operands can be in three places
- Registers
  - Register addressing mode
- Part of instruction
  - Constant
  - Immediate addressing mode
  - All processors support these two addressing modes
- Memory
  - Difference between RISC and CISC
  - CISC supports a large variety of addressing modes
  - RISC follows load/store architecture
4 Instruction Types
Several types of instructions
Data movement
- Pentium: mov dest,src
- Some do not provide direct data movement instructions
- Indirect data movement
  add Rdest,Rsrc,0   ; Rdest = Rsrc + 0
Arithmetic and Logical
- Arithmetic
  - Integer and floating-point, signed and unsigned
  - add, subtract, multiply, divide
- Logical
  - and, or, not, xor
Instruction Types (cont.)
Condition code bits
S: Sign bit (0 = +, 1 = -)
Z: Zero bit (0 = nonzero, 1 = zero)
O: Overflow bit (0 = no overflow, 1 = overflow)
C: Carry bit (0 = no carry, 1 = carry)
Example: Pentium
cmp count,25   ; compare count to 25
               ; subtract 25 from count
je target      ; jump if equal
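A hedged Python sketch of how a compare could set the four condition code bits. It models a 32-bit subtraction a - b and assumes the carry bit records a borrow, as on the Pentium; the compare function is illustrative.

```python
def compare(a, b, bits=32):
    """Return the S, Z, O, C bits produced by computing a - b in `bits` bits."""
    mask = (1 << bits) - 1
    sign = 1 << (bits - 1)
    ua, ub = a & mask, b & mask        # two's complement bit patterns
    result = (ua - ub) & mask
    return {
        "S": int(bool(result & sign)),                     # result is negative
        "Z": int(result == 0),                             # result is zero
        "O": int(bool((ua ^ ub) & (ua ^ result) & sign)),  # signed overflow
        "C": int(ua < ub),                                 # unsigned borrow
    }


if __name__ == "__main__":
    count = 25
    flags = compare(count, 25)
    print(flags, "jump taken" if flags["Z"] else "fall through")
```

A jump-if-equal then only has to test the Z bit; the branch never sees the operands themselves.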
Instruction Types (cont.)
Flow control and I/O instructions
- Branch
- Procedure call
- Interrupts
I/O instructions
- Memory-mapped I/O
  - Most processors support memory-mapped I/O
  - No separate instructions for I/O
- Isolated I/O
  - Pentium supports isolated I/O
  - Separate I/O instructions
    in AX,io_port    ; read from an I/O port
    out io_port,AX   ; write to an I/O port
5 Instruction Formats
Two types
Fixed-length
- Used by RISC processors
- 32-bit RISC processors use 32-bit wide instructions
  - Examples: SPARC, MIPS, PowerPC
Variable-length
- Used by CISC processors
- Memory operands need more bits to specify
Opcode
- Major and exact operation
Examples of Instruction Formats
RISC (Reduced Instruction Set Computer) versus CISC (Complex Instruction Set Computer)
RISC vs CISC
The underlying philosophy of RISC machines is that a system is better able to manage program execution when the program consists of only a few different instructions that are the same length and require the same number of clock cycles to decode and execute
RISC systems access memory only with explicit load and store instructions
In CISC systems, many different kinds of instructions access memory, making instruction length variable and fetch-decode-execute time unpredictable
The difference between CISC and RISC becomes evident through the basic computer performance equation:
time/program = (time/cycle) × (cycles/instruction) × (instructions/program)
RISC systems shorten execution time by reducing the clock cycles per instruction
CISC systems improve performance by reducing the number of instructions per program
The simple instruction set of RISC machines enables control units to be hardwired for maximum speed
The more complex, and variable, instruction set of CISC machines requires microcode-based control units that interpret instructions as they are fetched from memory. This translation takes time
With fixed-length instructions, RISC lends itself to pipelining and speculative execution
Consider the following program fragments:
CISC:
mov ax, 10
mov bx, 5
mul bx, ax
RISC:
mov ax, 0
mov bx, 10
mov cx, 5
Begin: add ax, bx
loop Begin
The total clock cycles for the CISC version might be:
(2 movs × 1 cycle) + (1 mul × 30 cycles) = 32 cycles
While the clock cycles for the RISC version are:
(3 movs × 1 cycle) + (5 adds × 1 cycle) + (5 loops × 1 cycle) = 13 cycles
With the RISC clock cycle also being shorter, RISC gives us much faster execution speeds
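The cycle arithmetic can be sketched in Python. The per-instruction costs are assumptions for the illustration (1 cycle for each mov, add and loop, 30 cycles for the mul), and total_cycles is a hypothetical helper.

```python
def total_cycles(instruction_mix):
    """Sum count × cycles over (count, cycles_per_instruction) pairs."""
    return sum(count * cycles for count, cycles in instruction_mix)


# (how many times the instruction executes, cycles per execution)
CISC_MIX = [(2, 1), (1, 30)]          # 2 movs, 1 mul
RISC_MIX = [(3, 1), (5, 1), (5, 1)]   # 3 movs, 5 adds, 5 loops

if __name__ == "__main__":
    print("CISC cycles:", total_cycles(CISC_MIX))
    print("RISC cycles:", total_cycles(RISC_MIX))
```

The RISC fragment executes more instructions but fewer cycles, which matches the two sides of the performance equation: CISC lowers instructions per program, RISC lowers cycles per instruction.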
Because of their load-store ISAs, RISC architectures require a large number of CPU registers
These registers provide fast access to data during sequential program execution
They can also be employed to reduce the overhead typically caused by passing parameters to subprograms
Instead of pulling parameters off of a stack, the subprogram is directed to use a subset of registers
This is how registers can be overlapped in a RISC system
The current window pointer (CWP) points to the active register window
It is becoming increasingly difficult to distinguish RISC architectures from CISC architectures
Some RISC systems provide more extravagant instruction sets than some CISC systems
Some systems combine both approaches
The following two slides summarize the characteristics that traditionally typify the differences between these two architectures
RISC
- Multiple register sets
- Three operands per instruction
- Parameter passing through register windows
- Single-cycle instructions
- Hardwired control
- Highly pipelined
CISC
- Single register set
- One or two register operands per instruction
- Parameter passing through memory
- Multiple-cycle instructions
- Microprogrammed control
- Less pipelined
Continued
RISC
- Simple instructions, few in number
- Fixed-length instructions
- Complexity in compiler
- Only LOAD/STORE instructions access memory
- Few addressing modes
CISC
- Many complex instructions
- Variable-length instructions
- Complexity in microcode
- Many instructions can access memory
- Many addressing modes
Summary
Instruction Set Design Issues
Instruction set design issues include:
Where are operands stored?
- registers, memory, stack, accumulator
How many explicit operands are there?
- 0, 1, 2, or 3
How is the operand location specified?
- register, immediate, indirect, . . .
What type and size of operands are supported?
- byte, int, float, double, string, vector . . .
What operations are supported?
- add, sub, mul, move, compare . . .
More About General Purpose Registers
Why do almost all new architectures use GPRs?
Registers are much faster than memory (even cache)
- Register values are available immediately
- When memory isn't ready, the processor must wait ("stall")
Registers are convenient for variable storage
- Compiler assigns some variables just to registers
- More compact code, since small fields specify registers (compared to memory addresses)
[Figure: storage hierarchy: Processor with Registers, Cache, Memory, Disk]
What Operations are Needed
Arithmetic & Logical
- Integer arithmetic: ADD, SUB, MULT, DIV, SHIFT
- Logical operations: AND, OR, XOR, NOT
Data Transfer - copy, load, store
Control - branch, jump, call, return
Floating Point - ADDF, MULF, DIVF (same as arithmetic but usually take bigger operands)
Decimal - ADDD, CONVERT
String - move, compare, search
Graphics - pixel and vertex, compression/decompression operations
Stack Architecture: Pros and Cons
Pros
- Good code density (implicit top of stack)
- Low hardware requirements
- Easy to write a simpler compiler for stack architectures
Cons
- Stack becomes the bottleneck
- Little ability for parallelism or pipelining
- Data is not always at the top of the stack when needed, so additional instructions like TOP and SWAP are needed
- Difficult to write an optimizing compiler for stack architectures
Accumulator Architecture: Pros and Cons
Pros
- Very low hardware requirements
- Easy to design and understand
Cons
- Accumulator becomes the bottleneck
- Little ability for parallelism or pipelining
- High memory traffic
Memory-Memory Architecture: Pros and Cons
Pros
- Requires fewer instructions (especially if 3 operands)
- Easy to write compilers for (especially if 3 operands)
Cons
- Very high memory traffic (especially if 3 operands)
- Variable number of clocks per instruction
- With two operands, more data movements are required
Memory-Register Architecture: Pros and Cons
Pros
- Some data can be accessed without loading first
- Instruction format easy to encode
- Good code density
Cons
- Operands are not equivalent (poor orthogonality)
- Variable number of clocks per instruction
- May limit number of registers
Frequency of Operations Usage
Make the common case fast by focusing on these operations.
The most widely executed instructions are the simple operations of an instruction set.
The following is the average usage in SPECint on Intel 80x86, ranked by integer average (% of total executed):
1. Load
2. Conditional branch
3. Compare
4. Store
5. Add
6. And
7. Sub
8. Move register-register
9. Call
10. Return
Control Flow Instructions
- Jump for unconditional change in the control flow
- Branch for conditional change in the control flow
- Procedure calls and returns
Data is based on SPEC on Alpha
Destination Address Definition
- Relative addressing with respect to the program counter proved to be the best choice for forward and backward branching or jumps (load address independent)
- To allow for dynamic loading of library routines, register indirect addressing allows addresses to be loaded in special registers (e.g. virtual functions in C++ and system calls in a case statement)
Data is based on SPEC on Alpha
Condition Evaluation
Compare-and-branch can be efficient if the majority of conditions are comparisons with zero
Remember to focus on the common case
Based on SPEC on MIPS
Frequency of Types of Comparison
Data is based on SPEC on Alpha
A different benchmark and machine set a new design priority
DSPs support a repeat instruction for for-loops (vectors) using registers
Supporting Procedures
Execution of a procedure follows these steps:
- Store parameters in a place accessible to the procedure
- Transfer control to the procedure
- Acquire the storage resources needed for the procedure
- Perform the desired task
- Store the result value in a place accessible to the calling program
- Return control to the point of origin
The hardware provides a program counter to trace instruction flow and manage transfer of control
Parameter Passing
- Registers can be used for passing a small number of parameters
- A stack is used to spill registers of the current context and make room for the called procedure to run, and to allow for large parameters to be passed
- Storage of machine state can be performed by caller or callee
- Handling of shared variables is important to ensure correct semantics and thus requires clear specifications in the library interface
Global variables stored in registers need careful handling
Type and Size of Operands
- The type of an operand is designated by encoding it in the instruction's operation code
- The type of an operand, e.g. single-precision float, effectively gives its size
- Common operand types include character, half-word and word-size integer, and single- and double-precision floating point
- Characters are almost always in ASCII, integers are in 2's complement, and floating point is in IEEE 754
- The 16-bit Unicode used in Java is gaining popularity due to its support for international character sets
- For business applications, some architectures support a decimal format in binary coded decimal (BCD)
- Depending on the size of the word, the complexity of handling different operand types differs
- DSPs offer fixed-point data types to support high-precision arithmetic and to allow sharing a single exponent among multiple numbers
- For graphics applications, vertex and pixel operands are added features
Size of Operands
- The double-word data type is used for double-precision floating point operations and for address storage in machines with a 64-bit wide address bus
- Words are used for integer operations and for 32-bit address bus machines
- Because of the mix in SPEC, word and double-word data types dominate
Figure: frequency of reference by size, based on SPEC2000 on Alpha
Instruction Representation
- Humans are taught to think in base 10 (decimal), but numbers may be represented in any base (123 in base 10 = 1111011 in binary, or base 2)
- Numbers are stored in computers as a series of high and low electronic signals (binary numbers)
- Binary digits are called bits and are considered the atom of computing
- Each piece of an instruction is a number, and placing these numbers together forms the instruction
- The assembler translates the symbolic assembly instructions into machine language instructions (machine code)
Example:
Assembly: add $t0, $s1, $s2
M/C language (decimal): 0 17 18 8 0 32
M/C language (binary): 000000 10001 10010 01000 00000 100000
Field widths: 6 bits, 5 bits, 5 bits, 5 bits, 5 bits, 6 bits
Note: the MIPS compiler by default maps $s0..$s7 to registers 16-23 and $t0..$t7 to registers 8-15
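The field-by-field packing described on this slide can be sketched in a few lines. This is only an illustration: the helper name `encode_r_format` is hypothetical, and the field values assume the standard MIPS register numbering ($t0 = 8, $s1 = 17, $s2 = 18) and funct = 32 for `add`.

```python
def encode_r_format(op, rs, rt, rd, shamt, funct):
    """Pack the six R-format fields into one 32-bit MIPS instruction word."""
    # Field widths, most significant first: op, rs, rt, rd, shamt, funct.
    fields = [(op, 6), (rs, 5), (rt, 5), (rd, 5), (shamt, 5), (funct, 6)]
    word = 0
    for value, width in fields:
        if not 0 <= value < (1 << width):
            raise ValueError(f"{value} does not fit in {width} bits")
        word = (word << width) | value
    return word


# add $t0, $s1, $s2: rd = $t0 (reg 8), rs = $s1 (reg 17), rt = $s2 (reg 18)
ADD_WORD = encode_r_format(op=0, rs=17, rt=18, rd=8, shamt=0, funct=32)
ADD_BITS = format(ADD_WORD, "032b")
print(ADD_BITS)
print(hex(ADD_WORD))
```

Placing the numbers side by side is all the assembler does for this format; the range check mirrors the fact that each field has a fixed width.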
Encoding an Instruction Set
- Instruction encoding affects the size of the compiled program and the complexity of the CPU implementation
- The operation is typically specified in one field, called the opcode
- The addressing mode for the operand can be encoded with the operation, or specified through a separate identifier in case of a large number of supported modes
- The architecture must balance between several competing factors:
  - Desire to support as many registers and addressing modes as possible
  - Effect of operand specification on the size of the instruction (program)
  - Desire to simplify instruction fetching and decoding during execution
- Fixed-size instruction encoding simplifies the CPU design while limiting the addressing modes supported
- An architect caring about code size can use variable-size encoding
- A hybrid approach is to allow variability by supporting multiple-sized instructions
Encoding Examples (figure)
MIPS Instruction Format
Register-format instructions: op (6 bits) | rs (5 bits) | rt (5 bits) | rd (5 bits) | shamt (5 bits) | funct (6 bits)
- op: basic operation of the instruction, traditionally called the opcode
- rs: the first register source operand
- rt: the second register source operand
- rd: the register destination operand; it gets the result of the operation
- shamt: shift amount
- funct: this field selects the specific variant of the operation in the op field
Immediate-type instructions: op (6 bits) | rs (5 bits) | rt (5 bits) | address (16 bits)
- Some instructions need longer fields than provided, for large-value constants
- The 16-bit address means a load word instruction can load a word within a region of ±2^15 bytes of the address in the base register
- Example: lw $t0, 32($s3) # Temporary register $t0 gets A[8]
Instruction  Format  op  rs   rt   rd   shamt  funct  address
add          R       0   reg  reg  reg  0      32     N/A
sub          R       0   reg  reg  reg  0      34     N/A
lw           I       35  reg  reg  N/A  N/A    N/A    address
sw           I       43  reg  reg  N/A  N/A    N/A    address
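The immediate format can be illustrated the same way. The sketch below assumes the standard MIPS values for `lw` (op = 35, $s3 = register 19, $t0 = register 8); the helper names are hypothetical. It also shows why the 16-bit field limits the offset to the signed range -2^15 .. 2^15 - 1.

```python
def encode_i_format(op, rs, rt, offset):
    """Pack op (6 bits), rs (5), rt (5) and a signed 16-bit offset into a word."""
    if not 0 <= op < 64 or not 0 <= rs < 32 or not 0 <= rt < 32:
        raise ValueError("op, rs or rt out of range")
    if not -(1 << 15) <= offset < (1 << 15):
        raise ValueError("offset must fit in a signed 16-bit field")
    # The offset is stored in two's complement in the low 16 bits.
    return (op << 26) | (rs << 21) | (rt << 16) | (offset & 0xFFFF)


def decode_offset(word):
    """Recover the signed offset from the low 16 bits (sign extension)."""
    low = word & 0xFFFF
    return low - (1 << 16) if low & 0x8000 else low


# lw $t0, 32($s3): op = 35, base rs = $s3 (reg 19), rt = $t0 (reg 8)
LW_WORD = encode_i_format(op=35, rs=19, rt=8, offset=32)
print(hex(LW_WORD))
```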
The Stored Program Concept
Learning how instructions are represented leads to discovering the secret of computing: the stored-program concept
Today's computers are built on two key principles:
- Instructions are represented as numbers
- Programs can be stored in memory to be read or written just like numbers
The power of the concept: memory can contain
- the source code for an editor
- the compiled machine code for the editor
- the text that the compiled program is using
- the compiler that generated the code
Figure: a Processor connected to a Memory holding an accounting program (machine code), an editor program (machine code), a C compiler (machine code), payroll data, book text, and the source code in C for the editor program
Compiling if-then-else in MIPS
Assuming the five variables f, g, h, i, and j correspond to the five registers $s0 through $s4, what is the compiled MIPS code for the following C if statement?
if (i == j) f = g + h; else f = g - h;
Figure: flowchart testing i == j; when i = j the path computes f = g + h, when i ≠ j the path labelled Else: computes f = g - h, and both paths join at Exit:
MIPS:
bne $s3, $s4, Else # go to Else if i ≠ j
add $s0, $s1, $s2 # f = g + h (skipped if i ≠ j)
j Exit
Else: sub $s0, $s1, $s2 # f = g - h (skipped if i = j)
Exit:
Typical Compilation (figure)
Major Types of Optimization
High level (at or near source level, machine independent)
- Procedure integration: replace procedure call by procedure body (frequency: N.M.)
Local (within straight-line code)
- Common subexpression elimination: replace two instances of the same computation by a single copy
- Constant propagation: replace all instances of a variable that is assigned a constant with the constant
- Stack height reduction: rearrange the expression tree to minimize resources needed for expression evaluation (frequency: N.M.)
Global (across a branch)
- Global common subexpression elimination: same as local, but this version crosses branches
- Copy propagation: replace all instances of a variable A that has been assigned X (i.e. A = X) with X
- Code motion: remove code from a loop that computes the same value in each iteration of the loop
- Induction variable elimination: simplify/eliminate array addressing calculations within loops
Machine-dependent (depends on machine knowledge)
- Strength reduction: many examples, such as replacing multiply by a constant with adds and shifts (frequency: N.M.)
- Pipeline scheduling: reorder instructions to improve pipeline performance (frequency: N.M.)
Effect of Compiler Optimization
Figure: measurements by program and compiler optimization level. Measurements taken on S
- Level 0: non-optimized code
- Level 1: local optimization
- Level 2: global optimization, software pipelining
- Level 3: adds procedure integration
Compiler Support for Multimedia Instructions
- Intel's MMX and PowerPC AltiVec have small vector processing capabilities targeting multimedia applications (to speed up graphics)
- Intel added a new set of instructions called Streaming SIMD Extension
- A major advantage of vector computers is hiding the latency of memory access by loading multiple elements and then overlapping execution with data transfer
- Vector computers typically have strided and/or gather/scatter addressing to perform operations on distant memory locations
  - Strided addressing allows memory access in increments larger than one
  - Gather/scatter addressing is similar to register indirect mode, where the addresses are stored instead of the data
- Supporting vector operations without strided addressing, as in Intel's MMX, limits the potential speedup
- Such limited support for vector processing makes the use of vectorizing compiler optimization unpopular and restricts its scope to hand-coded routines
SIMD instructions on MMX and AltiVec tend to be solutions, not primitives
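The two addressing styles named above can be sketched with plain lists standing in for memory. The function names and the sample memory contents are hypothetical; real vector hardware performs these accesses in a single instruction rather than a loop.

```python
def strided_load(memory, base, stride, count):
    """Load `count` elements starting at `base`, stepping by `stride` words."""
    return [memory[base + i * stride] for i in range(count)]


def gather_load(memory, index_vector):
    """Load the elements whose addresses are listed in `index_vector`."""
    return [memory[address] for address in index_vector]


def scatter_store(memory, index_vector, values):
    """Store each value at the address given by the matching index."""
    for address, value in zip(index_vector, values):
        memory[address] = value


MEMORY = list(range(100, 116))  # 16 words holding 100..115
# One column of a 4x4 row-major matrix: every 4th word starting at word 1.
COLUMN = strided_load(MEMORY, base=1, stride=4, count=4)
# Arbitrary, non-contiguous locations chosen by an index vector.
PICKED = gather_load(MEMORY, [7, 0, 12])
scatter_store(MEMORY, [7, 0, 12], [-1, -2, -3])
print(COLUMN, PICKED)
```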
Starting a Program
Figure: C program -> Compiler -> Assembly language program -> Assembler -> Object: machine language module (plus Object: library routine, machine language) -> Linker -> Executable: machine language program -> Loader -> Memory
Linker
- Places code and data modules symbolically in memory
- Determines the addresses of data and instruction labels
- Patches both internal and external references
Object files for Unix typically contain:
- Header: size and position of components
- Text segment: machine code
- Data segment: static and dynamic variables
- Relocation info: identifies absolute memory references
- Symbol table: names and locations of labels, procedures and variables
- Debugging info: mapping of source to object code, break points, etc.
Loading Executable Program
Figure: memory layout, from high to low addresses: Stack ($sp at 7fff ffff hex), Dynamic data, Static data ($gp at 1000 8000 hex, segment starting at 1000 0000 hex), Text (pc at 0040 0000 hex), Reserved (down to 0)
To load an executable, the operating system follows these steps:
- Reads the executable file header to determine the size of the text and data segments
- Creates an address space large enough for the text and data
- Copies the instructions and data from the executable file into memory
- Copies the parameters (if any) to the main program onto the stack
- Initializes the machine registers and sets the stack pointer to the first free location
- Jumps to a start-up routine that copies the parameters into the argument registers and calls the main routine of the program
Instruction Set Design Issues
- Number of Addresses
- Flow of Control
- Operand Types
- Addressing Modes
- Instruction Types
- Instruction Formats
Number of Addresses
Four categories
- 3-address machines
  - Two for the source operands and one for the result
- 2-address machines
  - One address doubles as source and result
- 1-address machines
  - Accumulator machines
  - Accumulator is used for one source and the result
- 0-address machines
  - Stack machines
  - Operands are taken from the stack
  - Result goes onto the stack
Number of Addresses (cont.)
Three-address machines
- Two for the source operands, one for the result
- RISC processors use three addresses
- Sample instructions
  add dest,src1,src2 ; M(dest)=[src1]+[src2]
  sub dest,src1,src2 ; M(dest)=[src1]-[src2]
  mult dest,src1,src2 ; M(dest)=[src1]*[src2]
Three addresses: Operand 1, Operand 2, Result
Example: a = b + c
Three-address instruction formats are not common, because they require a relatively long instruction format to hold the three address references
Number of Addresses (cont.)
Example
C statement:
A = B + C * D - E + F + A
Equivalent code:
mult T,C,D ; T = C*D
add T,T,B ; T = B+C*D
sub T,T,E ; T = B+C*D-E
add T,T,F ; T = B+C*D-E+F
add A,T,A ; A = B+C*D-E+F+A
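To check the three-address sequence for A = B + C * D - E + F + A, it can be run on a toy interpreter. This is a minimal sketch: the tuple encoding of instructions and the sample values are assumptions made for illustration, not part of the slides.

```python
import operator

OPS = {"add": operator.add, "sub": operator.sub, "mult": operator.mul}


def run_three_address(program, memory):
    """Execute (opcode, dest, src1, src2) tuples: M(dest) = [src1] op [src2]."""
    for opcode, dest, src1, src2 in program:
        memory[dest] = OPS[opcode](memory[src1], memory[src2])
    return memory


THREE_ADDRESS_PROGRAM = [
    ("mult", "T", "C", "D"),  # T = C*D
    ("add", "T", "T", "B"),   # T = B+C*D
    ("sub", "T", "T", "E"),   # T = B+C*D-E
    ("add", "T", "T", "F"),   # T = B+C*D-E+F
    ("add", "A", "T", "A"),   # A = B+C*D-E+F+A
]
VALUES = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6, "T": 0}
THREE_ADDRESS_RESULT = run_three_address(THREE_ADDRESS_PROGRAM, dict(VALUES))["A"]
print(THREE_ADDRESS_RESULT)
```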
Number of Addresses (cont.)
Two-address machines
- One address doubles (as source operand and result)
- The last example makes a case for it
  - Address T is used twice
- Sample instructions
  load dest,src ; M(dest)=[src]
  add dest,src ; M(dest)=[dest]+[src]
  sub dest,src ; M(dest)=[dest]-[src]
  mult dest,src ; M(dest)=[dest]*[src]
Two addresses: one address doubles as operand and result
Example: a = a + b
The two-address format reduces the space requirement but also introduces some awkwardness. To avoid altering the value of an operand, a MOVE instruction is used to move one of the values to a result or temporary location before performing the operation.
Number of Addresses (cont.)
Example
C statement:
A = B + C * D - E + F + A
Equivalent code:
load T,C ; T = C
mult T,D ; T = C*D
add T,B ; T = B+C*D
sub T,E ; T = B+C*D-E
add T,F ; T = B+C*D-E+F
add A,T ; A = B+C*D-E+F+A
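The two-address version can be checked the same way. In this sketch (tuple encoding and sample values are again assumptions), the destination doubles as the first source, which is why an extra load is needed to seed T.

```python
def run_two_address(program, memory):
    """Execute (opcode, dest, src) tuples; dest is also the first operand."""
    for opcode, dest, src in program:
        if opcode == "load":
            memory[dest] = memory[src]
        elif opcode == "add":
            memory[dest] = memory[dest] + memory[src]
        elif opcode == "sub":
            memory[dest] = memory[dest] - memory[src]
        elif opcode == "mult":
            memory[dest] = memory[dest] * memory[src]
        else:
            raise ValueError(f"unknown opcode {opcode}")
    return memory


TWO_ADDRESS_PROGRAM = [
    ("load", "T", "C"),  # T = C
    ("mult", "T", "D"),  # T = C*D
    ("add", "T", "B"),   # T = B+C*D
    ("sub", "T", "E"),   # T = B+C*D-E
    ("add", "T", "F"),   # T = B+C*D-E+F
    ("add", "A", "T"),   # A = B+C*D-E+F+A
]
START = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6, "T": 0}
TWO_ADDRESS_RESULT = run_two_address(TWO_ADDRESS_PROGRAM, dict(START))["A"]
print(TWO_ADDRESS_RESULT)
```

Compared with the three-address sequence, each instruction is shorter but the program needs one more instruction.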
Number of Addresses (cont.)
One-address machines
- Use a special set of registers called accumulators
  - Specify one source operand and receive the result
- Called accumulator machines
- Sample instructions
  load addr ; accum = [addr]
  store addr ; M[addr] = accum
  add addr ; accum = accum + [addr]
  sub addr ; accum = accum - [addr]
  mult addr ; accum = accum * [addr]
Number of Addresses (cont.)
Example
C statement:
A = B + C * D - E + F + A
Equivalent code:
load C ; load C into accum
mult D ; accum = C*D
add B ; accum = C*D+B
sub E ; accum = B+C*D-E
add F ; accum = B+C*D-E+F
add A ; accum = B+C*D-E+F+A
store A ; store accum contents in A
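A minimal accumulator-machine sketch for the same statement follows; the instruction encoding and the sample values are assumptions for illustration. Every instruction names exactly one memory address, and the accumulator is the implicit second operand and destination.

```python
def run_accumulator(program, memory):
    """Execute (opcode, addr) pairs against a single accumulator register."""
    accum = 0
    for opcode, addr in program:
        if opcode == "load":
            accum = memory[addr]
        elif opcode == "store":
            memory[addr] = accum
        elif opcode == "add":
            accum = accum + memory[addr]
        elif opcode == "sub":
            accum = accum - memory[addr]
        elif opcode == "mult":
            accum = accum * memory[addr]
        else:
            raise ValueError(f"unknown opcode {opcode}")
    return memory


ACCUMULATOR_PROGRAM = [
    ("load", "C"), ("mult", "D"), ("add", "B"), ("sub", "E"),
    ("add", "F"), ("add", "A"), ("store", "A"),
]
INITIAL = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
ACCUMULATOR_RESULT = run_accumulator(ACCUMULATOR_PROGRAM, dict(INITIAL))["A"]
print(ACCUMULATOR_RESULT)
```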
Number of Addresses (cont.)
Zero-address machines
- Stack supplies operands and receives the result
  - Special instructions to load and store use an address
- Called stack machines (Ex: HP3000, Burroughs B5500)
- Sample instructions
  push addr ; push([addr])
  pop addr ; pop([addr])
  add ; push(pop + pop)
  sub ; push(pop - pop)
  mult ; push(pop * pop)
Number of Addresses (cont.)
Example
C statement:
A = B + C * D - E + F + A
Equivalent code:
push E
push C
push D
mult
push B
add
sub
push F
add
push A
add
pop A
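The stack sequence is the hardest of the four to verify by eye, so a small simulation helps. This sketch assumes the `push(pop - pop)` reading of `sub` from the sample instructions, with the first pop (the top of the stack) as the left operand; that is why E is pushed first. The encoding and sample values are hypothetical.

```python
def run_stack_machine(program, memory):
    """Execute a zero-address program; only push and pop name an address."""
    stack = []
    for instruction in program:
        opcode = instruction[0]
        if opcode == "push":
            stack.append(memory[instruction[1]])
        elif opcode == "pop":
            memory[instruction[1]] = stack.pop()
        else:
            first = stack.pop()   # top of stack
            second = stack.pop()  # element below it
            if opcode == "add":
                stack.append(first + second)
            elif opcode == "sub":
                stack.append(first - second)
            elif opcode == "mult":
                stack.append(first * second)
            else:
                raise ValueError(f"unknown opcode {opcode}")
    return memory


STACK_PROGRAM = [
    ("push", "E"), ("push", "C"), ("push", "D"), ("mult",),
    ("push", "B"), ("add",), ("sub",), ("push", "F"), ("add",),
    ("push", "A"), ("add",), ("pop", "A"),
]
CELLS = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
STACK_RESULT = run_stack_machine(STACK_PROGRAM, dict(CELLS))["A"]
print(STACK_RESULT)
```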
Load/Store Architecture
- Instructions expect operands in internal processor registers
- Special LOAD and STORE instructions move data between registers and memory
- RISC uses this architecture
- Reduces instruction length
Load/Store Architecture (cont.)
Sample instructions
load Rd,addr ; Rd = [addr]
store addr,Rs ; (addr) = Rs
add Rd,Rs1,Rs2 ; Rd = Rs1 + Rs2
sub Rd,Rs1,Rs2 ; Rd = Rs1 - Rs2
mult Rd,Rs1,Rs2 ; Rd = Rs1 * Rs2
Number of Addresses (cont.)
Example
C statement:
A = B + C * D - E + F + A
Equivalent code:
load R1,B
load R2,C
load R3,D
load R4,E
load R5,F
load R6,A
mult R2,R2,R3
add R2,R2,R1
sub R2,R2,R4
add R2,R2,R5
add R2,R2,R6
store A,R2
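A load/store machine can be sketched by keeping registers and memory in separate dictionaries, so that only `load` and `store` ever touch memory. The register names R1..R6, the tuple encoding and the sample values are assumptions made for this illustration.

```python
def run_load_store(program, memory):
    """Run a load/store program and count how often memory is accessed."""
    registers = {}
    memory_accesses = 0
    for opcode, *operands in program:
        if opcode == "load":       # load Rd,addr  ; Rd = [addr]
            dest, addr = operands
            registers[dest] = memory[addr]
            memory_accesses += 1
        elif opcode == "store":    # store addr,Rs ; (addr) = Rs
            addr, src = operands
            memory[addr] = registers[src]
            memory_accesses += 1
        else:                      # register-to-register arithmetic
            dest, src1, src2 = operands
            a, b = registers[src1], registers[src2]
            if opcode == "add":
                registers[dest] = a + b
            elif opcode == "sub":
                registers[dest] = a - b
            elif opcode == "mult":
                registers[dest] = a * b
            else:
                raise ValueError(f"unknown opcode {opcode}")
    return memory, memory_accesses


LOAD_STORE_PROGRAM = [
    ("load", "R1", "B"), ("load", "R2", "C"), ("load", "R3", "D"),
    ("load", "R4", "E"), ("load", "R5", "F"), ("load", "R6", "A"),
    ("mult", "R2", "R2", "R3"), ("add", "R2", "R2", "R1"),
    ("sub", "R2", "R2", "R4"), ("add", "R2", "R2", "R5"),
    ("add", "R2", "R2", "R6"), ("store", "A", "R2"),
]
DATA = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
FINAL_MEMORY, ACCESS_COUNT = run_load_store(LOAD_STORE_PROGRAM, dict(DATA))
print(FINAL_MEMORY["A"], ACCESS_COUNT)
```

The program is longer than the three-address memory version, but each variable is read from memory once and all arithmetic runs on registers.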
Flow of Control
- Default is sequential flow
- Several instructions alter this default execution
  - Branches
    - Unconditional
    - Conditional
    - Delayed branches
  - Procedure calls
    - Delayed procedure calls
Flow of Control (cont.)
Branches
- Unconditional
  - Absolute address
  - PC-relative
    - Target address is specified relative to PC contents
    - Relocatable code
- Example: MIPS
  - Absolute address
    j target
  - PC-relative
    b target
Flow of Control (cont.) (figure)
Flow of Control (cont.)
Branches
- Conditional
  - Jump is taken only if the condition is met
- Two types
  - Set-Then-Jump
    - Condition testing is separated from branching
    - Condition code registers are used to convey the condition test result
    - Condition code registers keep a record of the status of the last ALU operation, such as an overflow condition
  - Example: Pentium code
    cmp AX,BX ; compare AX and BX
    je target ; jump if equal
Flow of Control (cont.)
  - Test-and-Jump
    - Single instruction performs condition testing and branching
  - Example: MIPS instruction
    beq Rsrc1,Rsrc2,target
    Jumps to target if Rsrc1 = Rsrc2
Delayed branching
- Control is transferred after executing the instruction that follows the branch instruction
  - This instruction slot is called the delay slot
- Improves efficiency
- Highly pipelined RISC processors support it
Flow of Control (cont.)
Procedure calls
- Facilitate modular programming
- Require two pieces of information to return
  - End of procedure
    - Pentium uses the ret instruction
    - MIPS uses the jr instruction
  - Return address
    - In a (special) register: MIPS allows any general-purpose register
    - On the stack: Pentium
Flow of Control (cont.) (figures, including one labelled "Delay slot")
Parameter Passing
Two basic techniques
- Register-based (e.g., PowerPC, MIPS)
  - Internal registers are used
    - Faster
    - Limit the number of parameters
    - Recursive procedure
- Stack-based (e.g., Pentium)
  - Stack is used
    - More general
2. Operand Types
Instructions support basic data types
- Characters
- Integers
- Floating-point
Instruction overload
- Same instruction for different data types
- Example: Pentium
  mov AL,address ; loads an 8-bit value
  mov AX,address ; loads a 16-bit value
  mov EAX,address ; loads a 32-bit value
Operand Types
Separate instructions
- Instructions specify the operand size
- Example: MIPS
  lb Rdest,address ; loads a byte
  lh Rdest,address ; loads a halfword (16 bits)
  lw Rdest,address ; loads a word (32 bits)
  ld Rdest,address ; loads a doubleword (64 bits)
- Similar instructions exist for store
3. Addressing Modes
How the operands are specified
- Operands can be in three places
  - Registers
    - Register addressing mode
  - Part of instruction
    - Constant
    - Immediate addressing mode
    - All processors support these two addressing modes
  - Memory
    - Difference between RISC and CISC
    - CISC supports a large variety of addressing modes
    - RISC follows load/store architecture
4. Instruction Types
Several types of instructions
- Data movement
  - Pentium: mov dest,src
  - Some do not provide direct data movement instructions
  - Indirect data movement
    add Rdest,Rsrc,0 ; Rdest = Rsrc+0
- Arithmetic and Logical
  - Arithmetic
    - Integer and floating-point, signed and unsigned
    - add, subtract, multiply, divide
  - Logical
    - and, or, not, xor
Instruction Types (cont.)
Condition code bits
- S: Sign bit (0 = +, 1 = -)
- Z: Zero bit (0 = nonzero, 1 = zero)
- O: Overflow bit (0 = no overflow, 1 = overflow)
- C: Carry bit (0 = no carry, 1 = carry)
Example: Pentium
cmp count,25 ; compare count to 25
             ; (subtract 25 from count)
je target ; jump if equal
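How a compare sets the four condition code bits can be sketched as follows. This is an illustrative model, not the Pentium's exact definition: it assumes 16-bit operands, computes a - b, and treats the carry bit as a borrow (set when the unsigned value of a is smaller than that of b). The function name is hypothetical.

```python
def compare_flags(a, b, bits=16):
    """Return the S, Z, O, C bits produced by the subtraction a - b."""
    mask = (1 << bits) - 1
    sign_bit = 1 << (bits - 1)
    a &= mask
    b &= mask
    result = (a - b) & mask
    return {
        "S": int(bool(result & sign_bit)),
        "Z": int(result == 0),
        # Signed overflow: operands differ in sign and the result's sign
        # differs from the sign of a.
        "O": int(bool((a ^ b) & (a ^ result) & sign_bit)),
        # Carry doubles as borrow for subtraction.
        "C": int(a < b),
    }


def jump_if_equal(flags):
    """A conditional jump such as je only inspects the Z bit."""
    return flags["Z"] == 1


EQUAL_FLAGS = compare_flags(25, 25)
LESS_FLAGS = compare_flags(10, 25)
OVERFLOW_FLAGS = compare_flags(0x8000, 1)  # most negative 16-bit value minus 1
print(EQUAL_FLAGS, LESS_FLAGS, OVERFLOW_FLAGS)
```

This is the Set-Then-Jump pattern: the compare only records status bits, and the following branch decides from them.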
Instruction Types (cont.)
Flow control and I/O instructions
- Branch
- Procedure call
- Interrupts
I/O instructions
- Memory-mapped I/O
  - Most processors support memory-mapped I/O
  - No separate instructions for I/O
- Isolated I/O
  - Pentium supports isolated I/O
  - Separate I/O instructions
    in AX,port ; read from an I/O port
    out port,AX ; write to an I/O port
5. Instruction Formats
Two types
- Fixed-length
  - Used by RISC processors
  - 32-bit RISC processors use 32-bit wide instructions
    - Examples: SPARC, MIPS, PowerPC
- Variable-length
  - Used by CISC processors
  - Memory operands need more bits to specify
Opcode
- Major and exact operation
Examples of Instruction Formats (figure)
RISC (Reduced Instruction Set Computer) versus CISC (Complex Instruction Set Computer)
RISC vs. CISC
- The underlying philosophy of RISC machines is that a system is better able to manage program execution when the program consists of only a few different instructions that are the same length and require the same number of clock cycles to decode and execute.
- RISC systems access memory only with explicit load and store instructions.
- In CISC systems, many different kinds of instructions access memory, making instruction length variable and fetch-decode-execute time unpredictable.
RISC vs. CISC
- The difference between CISC and RISC becomes evident through the basic computer performance equation:
  time/program = (time/cycle) × (cycles/instruction) × (instructions/program)
- RISC systems shorten execution time by reducing the clock cycles per instruction.
- CISC systems improve performance by reducing the number of instructions per program.
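As a sketch of how the three factors of the performance equation (instructions per program, cycles per instruction, time per cycle) trade off, the snippet below multiplies them for two machines. All numbers are hypothetical and chosen only to show the direction of each approach; they are not measurements.

```python
def execution_time_ns(instruction_count, cycles_per_instruction, cycle_time_ns):
    """time/program = instructions/program * cycles/instruction * time/cycle."""
    return instruction_count * cycles_per_instruction * cycle_time_ns


# Hypothetical CISC-style machine: fewer instructions, more cycles each.
CISC_TIME = execution_time_ns(instruction_count=1000,
                              cycles_per_instruction=5,
                              cycle_time_ns=2)
# Hypothetical RISC-style machine: more instructions, one cycle each.
RISC_TIME = execution_time_ns(instruction_count=2500,
                              cycles_per_instruction=1,
                              cycle_time_ns=2)
print(CISC_TIME, RISC_TIME)
```

With these assumed figures the RISC-style program runs 2.5 times as many instructions yet finishes in half the time, because the cycles-per-instruction factor shrinks by more than the instruction count grows.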
RISC vs. CISC
- The simple instruction set of RISC machines enables control units to be hardwired for maximum speed.
- The more complex, and variable, instruction set of CISC machines requires microcode-based control units that interpret instructions as they are fetched from memory. This translation takes time.
- With fixed-length instructions, RISC lends itself to pipelining and speculative execution.
RISC vs. CISC
Consider the following program fragments:
CISC:
mov ax, 10
mov bx, 5
mul bx, ax
RISC:
mov ax, 0
mov bx, 10
mov cx, 5
Begin: add ax, bx
loop Begin
The total clock cycles for the CISC version might be:
(2 movs × 1 cycle) + (1 mul × 30 cycles) = 32 cycles
While the clock cycles for the RISC version are:
(3 movs × 1 cycle) + (5 adds × 1 cycle) + (5 loops × 1 cycle) = 13 cycles
With the RISC clock cycle also being shorter, RISC gives us much faster execution speeds.
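The cycle comparison can be reproduced with a short calculation. The per-instruction costs below are assumptions for illustration (one cycle for mov, add and loop, and a 30-cycle multiply), and the RISC loop is assumed to run five times, replacing the multiply by repeated addition.

```python
# Assumed cost of each instruction in clock cycles (illustrative only).
CYCLES = {"mov": 1, "mul": 30, "add": 1, "loop": 1}


def total_cycles(executed):
    """Sum cycles for a dict mapping mnemonic -> number of executions."""
    return sum(CYCLES[mnemonic] * count for mnemonic, count in executed.items())


def multiply_by_repeated_addition(value, times):
    """What the RISC fragment computes: add `value` to a running sum `times` times."""
    total = 0
    for _ in range(times):
        total += value
    return total


CISC_CYCLES = total_cycles({"mov": 2, "mul": 1})
RISC_CYCLES = total_cycles({"mov": 3, "add": 5, "loop": 5})
PRODUCT = multiply_by_repeated_addition(10, 5)
print(CISC_CYCLES, RISC_CYCLES, PRODUCT)
```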
RISC vs. CISC
- Because of their load-store ISAs, RISC architectures require a large number of CPU registers.
- These registers provide fast access to data during sequential program execution.
- They can also be employed to reduce the overhead typically caused by passing parameters to subprograms.
- Instead of pulling parameters off of a stack, the subprogram is directed to use a subset of registers.
This is how registers can be overlapped in a RISC system.
The current window pointer (CWP) points to the active register window.
[Figure: overlapping register windows]
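Overlapping register windows can be sketched as a single physical register file in which the "out" registers of one window are the same physical registers as the "in" registers of the next. The class below is a simplified, hypothetical model (linear rather than circular, with 8 in, 8 local and 8 out registers per window, and no overflow trap handling); it is not the layout of any specific processor.

```python
class RegisterWindows:
    """Simplified overlapping register windows; CWP selects the active window."""

    def __init__(self, windows: int = 4, ins: int = 8, local: int = 8):
        self.ins = ins                 # size of the in (and out) group
        self.local = local             # size of the local group
        self.stride = ins + local      # physical registers consumed per window
        self.windows = windows
        # One extra 'ins' group holds the out registers of the last window.
        self.regs = [0] * (windows * self.stride + ins)
        self.cwp = 0                   # current window pointer

    def _index(self, kind: str, n: int) -> int:
        sizes = {"in": self.ins, "local": self.local, "out": self.ins}
        offsets = {"in": 0, "local": self.ins, "out": self.stride}
        if kind not in sizes or not 0 <= n < sizes[kind]:
            raise IndexError(f"no register {kind}{n}")
        return self.cwp * self.stride + offsets[kind] + n

    def read(self, kind: str, n: int) -> int:
        return self.regs[self._index(kind, n)]

    def write(self, kind: str, n: int, value: int) -> None:
        self.regs[self._index(kind, n)] = value

    def call(self) -> None:
        """Procedure call: advance CWP so the caller's outs become the callee's ins."""
        if self.cwp + 1 >= self.windows:
            raise OverflowError("window overflow (a real system would spill to memory)")
        self.cwp += 1

    def ret(self) -> None:
        """Procedure return: step CWP back to the caller's window."""
        if self.cwp == 0:
            raise OverflowError("window underflow")
        self.cwp -= 1


rw = RegisterWindows()
rw.write("out", 0, 42)   # caller places a parameter in out0
rw.call()
print(rw.read("in", 0))  # callee sees the same physical register as in0
```

Because the parameter never leaves the register file, no memory traffic is needed to pass it, which is the overhead reduction described above.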
It is becoming increasingly difficult to distinguish RISC architectures from CISC architectures.
Some RISC systems provide more extravagant instruction sets than some CISC systems.
Some systems combine both approaches.
The following two slides summarize the characteristics that traditionally typify the differences between these two architectures.
RISC
- Multiple register sets.
- Three operands per instruction.
- Parameter passing through register windows.
- Single-cycle instructions.
- Hardwired control.
- Highly pipelined.
CISC
- Single register set.
- One or two register operands per instruction.
- Parameter passing through memory.
- Multiple cycle instructions.
- Microprogrammed control.
- Less pipelined.
Continued...
RISC
- Simple instructions, few in number.
- Fixed length instructions.
- Complexity in compiler.
- Only LOAD/STORE instructions access memory.
- Few addressing modes.
CISC
- Many complex instructions.
- Variable length instructions.
- Complexity in microcode.
- Many instructions can access memory.
- Many addressing modes.
Summary
Instruction Set Design Issues
Instruction set design issues include:
Where are operands stored?
- registers, memory, stack, accumulator
How many explicit operands are there?
- 0, 1, 2, or 3
How is the operand location specified?
- register, immediate, indirect, ...
What type and size of operands are supported?
- byte, int, float, double, string, vector, ...
What operations are supported?
- add, sub, mul, move, compare, ...
More About General Purpose Registers
Why do almost all new architectures use GPRs?
Registers are much faster than memory (even cache)
- Register values are available immediately
- When memory isn't ready, processor must wait ("stall")
Registers are convenient for variable storage
- Compiler assigns some variables just to registers
- More compact code since small fields specify registers (compared to memory addresses)
[Figure: memory hierarchy: Processor (Registers), Cache, Memory, Disk]
What Operations are Needed
Arithmetic and Logical
- Integer arithmetic: ADD, SUB, MULT, DIV, SHIFT
- Logical operation: AND, OR, XOR, NOT
Data Transfer - copy, load, store
Control - branch, jump, call, return
Floating Point - ADDF, MULF, DIVF
- Same as arithmetic but usually take bigger operands
Decimal - ADDD, CONVERT
String - move, compare, search
Graphics - pixel and vertex, compression/decompression operations
Stack Architecture: Pros and Cons
Pros
- Good code density (implicit top of stack)
- Low hardware requirements
- Easy to write a simpler compiler for stack architectures
Cons
- Stack becomes the bottleneck
- Little ability for parallelism or pipelining
- Data is not always at the top of stack when needed, so additional instructions like TOP and SWAP are needed
- Difficult to write an optimizing compiler for stack architectures
Accumulator Architecture: Pros and Cons
Pros
- Very low hardware requirements
- Easy to design and understand
Cons
- Accumulator becomes the bottleneck
- Little ability for parallelism or pipelining
- High memory traffic
Memory-Memory Architecture: Pros and Cons
Pros
- Requires fewer instructions (especially if 3 operands)
- Easy to write compilers for (especially if 3 operands)
Cons
- Very high memory traffic (especially if 3 operands)
- Variable number of clocks per instruction
- With two operands, more data movements are required
Memory-Register Architecture: Pros and Cons
Pros
- Some data can be accessed without loading first
- Instruction format easy to encode
- Good code density
Cons
- Operands are not equivalent (poor orthogonality)
- Variable number of clocks per instruction
- May limit number of registers
Control Flow Instructions
- Jump for unconditional change in the control flow
- Branch for conditional change in the control flow
- Procedure calls and returns
Data is based on SPEC on Alpha
Destination Address Definition
- Relative addressing w.r.t. the program counter proved to be the best choice for forward and backward branching or jumps (load address independent)
- To allow for dynamic loading of library routines, register indirect addressing allows addresses to be loaded in special registers (e.g. virtual functions in C++ and system calls in a case statement)
Data is based on SPEC on Alpha
Condition Evaluation
- Compare and branch can be efficient if the majority of conditions are comparisons with zero
- Remember to focus on the common case
Based on SPEC on MIPS
Frequency of Types of Comparison
[Figure: frequency of types of comparison] Data is based on SPEC on Alpha
- Different benchmark and machine set new design priority
- DSPs support a repeat instruction for for-loops (vectors) using registers
Supporting Procedures
Execution of a procedure follows the following steps:
- Store parameters in a place accessible to the procedure
- Transfer control to the procedure
- Acquire the storage resources needed for the procedure
- Perform the desired task
- Store the results value in a place accessible to the calling program
- Return control to the point of origin
The hardware provides a program counter to trace instruction flow and manage transfer of control
Parameter Passing
- Registers can be used for passing a small number of parameters
- A stack is used to spill registers of the current context and make room for the called procedure to run, and to allow for large parameters to be passed
- Storage of machine state can be performed by caller or callee
- Handling of shared variables is important to ensure correct semantics and thus requires clear specifications in the library interface
Global variables stored in registers need careful handling
Type and Size of Operands
- The type of an operand is designated by encoding it in the instruction's operation code
- The type of an operand, e.g. single precision float, effectively gives its size
- Common operand types include character, half word and word size integer, and single- and double-precision floating point
- Characters are almost always in ASCII, integers in 2's complement, and floating point in IEEE 754
- The 16-bit Unicode used in Java is gaining popularity due to its support for international character sets
- For business applications, some architectures support a decimal format in binary coded decimal (BCD)
- Depending on the size of the word, the complexity of handling different operand types differs
- DSPs offer fixed point data types to support high precision floating point arithmetic and to allow sharing a single exponent for multiple numbers
- For graphics applications, vertex and pixel operands are added features
Size of Operands
- Double-word data type is used for double-precision floating point operations and address storage in machines with a 64-bit wide address bus
- Words are used for integer operations and for 32-bit address bus machines
- Because of the mix in SPEC, word and double-word data types dominate
[Figure: frequency of reference by size, based on SPEC2000 on Alpha]
Instruction Representation
- Humans are taught to think in base 10 (decimal) but numbers may be represented in any base (123 in base 10 = 1111011 in binary or base 2)
- Numbers are stored in computers as a series of high and low electronic signals (binary numbers)
- Binary digits are called bits and considered the atom of computing
- Each piece of an instruction is a number, and placing these numbers together forms the instruction
- The assembler translates the assembly symbolic instructions into machine language instructions (machine code)
Example:
Assembly: add $t0, $s1, $s2
M/C language (decimal):  0       17      18      8       0       32
M/C language (binary):   000000  10001   10010   01000   00000   100000
                         6 bits  5 bits  5 bits  5 bits  5 bits  6 bits
Note: MIPS compiler by default maps $s0,...,$s7 to reg. 16-23 and $t0,...,$t7 to reg. 8-15
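Placing the field numbers together to form the instruction word can be sketched in a few lines. The function name `encode_r_type` and the two lookup tables are hypothetical helpers; the register numbers follow the default mapping in the note above ($s0-$s7 to 16-23, $t0-$t7 to 8-15), and only `add` and `sub` are covered.

```python
# Register numbers per the default MIPS mapping noted above.
REGISTERS = {f"$s{i}": 16 + i for i in range(8)}
REGISTERS.update({f"$t{i}": 8 + i for i in range(8)})

# funct codes for two R-format instructions (both use op = 0).
FUNCT = {"add": 32, "sub": 34}


def encode_r_type(mnemonic: str, rd: str, rs: str, rt: str, shamt: int = 0) -> int:
    """Pack op(6) rs(5) rt(5) rd(5) shamt(5) funct(6) into one 32-bit word."""
    op = 0
    return (
        (op << 26)
        | (REGISTERS[rs] << 21)
        | (REGISTERS[rt] << 16)
        | (REGISTERS[rd] << 11)
        | (shamt << 6)
        | FUNCT[mnemonic]
    )


word = encode_r_type("add", "$t0", "$s1", "$s2")
print(f"{word:032b}")   # the six fields, concatenated
print(f"0x{word:08X}")
```

Note that the assembly operand order (destination first) differs from the field order in the encoded word (rs, rt, then rd).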
Encoding an Instruction Set
- Instruction encoding affects the size of the compiled program and the complexity of the CPU implementation
- The operation is typically specified in one field called opcode
- The addressing mode for the operand can be encoded with the operation or specified through a separate identifier in case of a large number of supported modes
- The architecture must balance between several competing factors:
  - Desire to support as many registers and addressing modes as possible
  - Effect of operand specification on the size of the instruction (program)
  - Desire to simplify instruction fetching and decoding during execution
- Fixed size instruction encoding simplifies the CPU design while limiting the addressing modes supported
- An architect caring about the code size can use variable size encoding
- A hybrid approach is to allow variability by supporting multiple-sized instructions
Encoding Examples
[Figure: encoding examples]
MIPS Instruction Format
Register-format instructions:
  op      rs      rt      rd      shamt   funct
  6 bits  5 bits  5 bits  5 bits  5 bits  6 bits
- op: Basic operation of the instruction, traditionally called opcode
- rs: The first register source operand
- rt: The second register source operand
- rd: The register destination operand; it gets the result of the operation
- shamt: Shift amount
- funct: This field selects the specific variant of the operation of the op field
Immediate-type instructions:
  op      rs      rt      address
  6 bits  5 bits  5 bits  16 bits
- Some instructions need longer fields than provided, for large-value constants
- The 16-bit address means a load word instruction can load a word within a region of ±2^15 bytes of the address in the base register
- Example: lw $t0, 32($s3)   # Temporary register $t0 gets A[8]
Instruction  Format  op  rs   rt   rd   shamt  funct  address
add          R       0   reg  reg  reg  0      32     N/A
sub          R       0   reg  reg  reg  0      34     N/A
lw           I       35  reg  reg  N/A  N/A    N/A    address
sw           I       43  reg  reg  N/A  N/A    N/A    address
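The immediate format and its 16-bit address limit can be illustrated with a small encoder. This is a sketch under stated assumptions: `encode_i_type` is a hypothetical helper that takes raw register numbers, and the offset is treated as a signed 16-bit value, so the reachable range is -32768 to 32767 bytes from the base register.

```python
def encode_i_type(op: int, rs: int, rt: int, imm: int) -> int:
    """Pack op(6) rs(5) rt(5) immediate(16) into one 32-bit word."""
    if not -(1 << 15) <= imm < (1 << 15):
        raise ValueError("offset does not fit in a signed 16-bit field")
    # Mask the immediate so negative offsets are stored in two's complement.
    return (op << 26) | (rs << 21) | (rt << 16) | (imm & 0xFFFF)


LW = 35          # opcode for load word
S3, T0 = 19, 8   # register numbers for $s3 and $t0

# lw $t0, 32($s3): base register in rs, destination in rt, offset in the address field.
word = encode_i_type(LW, rs=S3, rt=T0, imm=32)
print(f"0x{word:08X}")
```

An offset outside the 16-bit range cannot be encoded in a single instruction, which is why larger constants need extra instructions or a different format.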
The Stored Program Concept
- Learning how instructions are represented leads to discovering the secret of computing: the stored-program concept
- Today's computers are built on two key principles:
  - Instructions are represented as numbers
  - Programs can be stored in memory to be read or written just like numbers
- The power of the concept: memory can contain:
  - the source code for an editor
  - the compiled m/c code for the editor
  - the text that the compiled program is using
  - the compiler that generated the code
[Figure: Processor connected to Memory, which holds: Accounting program (machine code), Editor program (machine code), C compiler (machine code), Payroll data, Book text, Source code in C for editor program]
Compiling if-then-else in MIPS
Assuming the five variables f, g, h, i, and j correspond to the five registers $s0 through $s4, what is the compiled MIPS code for the following C if statement?
if (i == j) f = g + h; else f = g - h;
[Figure: flowchart. Test i == j; if i = j then f = g + h; if i ≠ j (Else:) then f = g - h; both paths join at Exit:]
MIPS:
      bne $s3, $s4, Else   # go to Else if i ≠ j
      add $s0, $s1, $s2    # f = g + h (skipped if i ≠ j)
      j Exit
Else: sub $s0, $s1, $s2    # f = g - h (skipped if i = j)
Exit:
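The branch logic of the compiled if-then-else can be traced with a toy interpreter. This is only a sketch: `run` is a hypothetical function that understands four operations (add, sub, bne, j) plus a `nop` used as a placeholder for the Exit label, and it ignores real MIPS details such as branch delay slots.

```python
def run(program, regs):
    """Execute (label, op, operands) tuples and return the register dict."""
    labels = {label: i for i, (label, _op, _args) in enumerate(program) if label}
    pc = 0
    while pc < len(program):
        _label, op, args = program[pc]
        pc += 1  # default: fall through to the next instruction
        if op == "add":
            regs[args[0]] = regs[args[1]] + regs[args[2]]
        elif op == "sub":
            regs[args[0]] = regs[args[1]] - regs[args[2]]
        elif op == "bne":
            if regs[args[0]] != regs[args[1]]:
                pc = labels[args[2]]
        elif op == "j":
            pc = labels[args[0]]
        elif op == "nop":
            pass
        else:
            raise ValueError(f"unsupported operation: {op}")
    return regs


# f, g, h, i, j live in $s0..$s4, as in the slide.
IF_ELSE = [
    (None, "bne", ("$s3", "$s4", "Else")),
    (None, "add", ("$s0", "$s1", "$s2")),
    (None, "j", ("Exit",)),
    ("Else", "sub", ("$s0", "$s1", "$s2")),
    ("Exit", "nop", ()),
]

regs = run(IF_ELSE, {"$s0": 0, "$s1": 5, "$s2": 3, "$s3": 1, "$s4": 1})
print(regs["$s0"])  # i == j, so f = g + h
```

Changing `$s4` so that i and j differ sends control to `Else`, and the `sub` executes instead of the `add`.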
Typical Compilation
[Figure: typical compilation]
Major Types of Optimization
Optimization name: explanation (frequency, where given; N.M. = not measured)
High level (at or near source level; machine independent)
- Procedure integration: Replace procedure call by procedure body (N.M.)
Local (within straight line code)
- Common subexpression elimination: Replace two instances of the same computation by a single copy
- Constant propagation: Replace all instances of a variable that is assigned a constant with the constant
- Stack height reduction: Rearrange expression tree to minimize resources needed for expression evaluation (N.M.)
Global (across a branch)
- Global common subexpression elimination: Same as local, but this version crosses branches
- Copy propagation: Replace all instances of a variable A that has been assigned X (i.e., A = X) with X
- Code motion: Remove code from a loop that computes the same value each iteration of the loop
- Induction variable elimination: Simplify/eliminate array addressing calculations within loops
Machine-dependent (depends on machine knowledge)
- Strength reduction: Many examples, such as replace multiply by a constant with adds and shifts (N.M.)
- Pipeline scheduling: Reorder instructions to improve pipeline performance (N.M.)
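As an illustration of one local optimization from the list, constant propagation over straight-line code might look like the sketch below. The tuple format `(dest, op, a, b)` and the function name `propagate_constants` are assumptions made for this example, and the pass also folds operations whose operands are both constants, which goes slightly beyond pure propagation.

```python
import operator

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def propagate_constants(code):
    """Rewrite straight-line three-address code, substituting known constants.

    Each instruction is (dest, op, a, b); op "const" assigns the integer a.
    """
    known = {}   # variable name -> constant value currently held
    out = []
    for dest, op, a, b in code:
        a = known.get(a, a)   # replace a variable by its constant, if known
        b = known.get(b, b)
        if op == "const":
            known[dest] = a
            out.append((dest, "const", a, None))
        elif isinstance(a, int) and isinstance(b, int):
            value = OPS[op](a, b)   # both operands constant: fold at compile time
            known[dest] = value
            out.append((dest, "const", value, None))
        else:
            known.pop(dest, None)   # dest no longer holds a known constant
            out.append((dest, op, a, b))
    return out


code = [
    ("x", "const", 4, None),
    ("y", "*", "x", 2),
    ("z", "+", "y", "n"),   # n is unknown at compile time
]
for instruction in propagate_constants(code):
    print(instruction)
```

The pass is valid only within straight-line code; crossing branches requires the global analyses listed above.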
Effect of Compiler Optimization
Measurements taken on MIPS
[Figure: program and compiler optimization level]
- Level 0: non-optimized code
- Level 1: local optimization
- Level 2: global optimization, s/w pipelining
- Level 3: adds procedure integration
Compiler Support for Multimedia Instr.
- Intel's MMX and PowerPC AltiVec have small vector processing capabilities targeting multimedia applications (to speed up graphics)
- Intel added a new set of instructions, called Streaming SIMD Extension
- A major advantage of vector computers is hiding latency of memory access by loading multiple elements and then overlapping execution with data transfer
- Vector computers typically have strided and/or gather/scatter addressing to perform operations on distant memory locations
  - Strided addressing allows memory access in increments larger than one
  - Gather/scatter addressing is similar to register indirect mode, where the addresses are stored instead of the data
- Supporting vector operations without strided addressing, such as Intel's MMX, limits the potential speedup
- Such limited support for vector processing makes the use of vectorizing compiler optimization unpopular and restricts its scope to hand-coded routines
SIMD instructions on MMX and AltiVec tend to be solutions, not primitives
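The difference between strided and gather/scatter addressing can be shown with plain lists standing in for memory. The three helper functions are hypothetical and model only which addresses are touched, not any vector hardware or instruction set.

```python
def strided_load(memory, base, stride, count):
    """Load `count` elements starting at `base`, stepping by `stride`."""
    return [memory[base + i * stride] for i in range(count)]


def gather(memory, addresses):
    """Load the elements found at an explicit list of addresses."""
    return [memory[a] for a in addresses]


def scatter(memory, addresses, values):
    """Store each value at the matching address from the list."""
    for address, value in zip(addresses, values):
        memory[address] = value


memory = list(range(0, 100, 10))           # [0, 10, 20, ..., 90]
print(strided_load(memory, base=1, stride=3, count=3))
print(gather(memory, [9, 0, 5]))
scatter(memory, [2, 3], [-1, -2])
print(memory[:4])
```

Strided access needs only a base and a step, while gather/scatter needs a stored vector of addresses, which is the similarity to register indirect mode noted above.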
Starting a Program
[Figure: C program -> Compiler -> Assembly language program -> Assembler -> Object: machine language module (together with Object: library routine, machine language) -> Linker -> Executable: machine language program -> Loader -> Memory]
Linker:
- Place code and data modules symbolically in memory
- Determine the address of data and instruction labels
- Patch both internal and external references
Object files for Unix typically contain:
- Header: size and position of components
- Text segment: machine code
- Data segment: static and dynamic variables
- Relocation info: identify absolute memory references
- Symbol table: name and location of labels, procedures and variables
- Debugging info: mapping source to object code, break points, etc.
Loading Executable Program
[Figure: memory layout, from high to low addresses: Stack ($sp at 7fff ffff hex), Dynamic data, Static data ($gp at 1000 8000 hex, segment starting at 1000 0000 hex), Text (pc at 0040 0000 hex), Reserved]
To load an executable, the operating system follows these steps:
- Reads the executable file header to determine the size of the text and data segments
- Creates an address space large enough for the text and data
- Copies the instructions and data from the executable file into memory
- Copies the parameters (if any) to the main program onto the stack
- Initializes the machine registers and sets the stack pointer to the first free location
- Jumps to a start-up routine that copies the parameters into the argument registers and calls the main routine of the program
Instruction Set Design Issues
- Number of addresses
- Flow of control
- Operand types
- Addressing modes
- Instruction types
- Instruction formats
Number of Addresses
Four categories:
- 3-address machines
  - 2 for the source operands and one for the result
- 2-address machines
  - One address doubles as source and result
- 1-address machines
  - Accumulator machines
  - Accumulator is used for one source and result
- 0-address machines
  - Stack machines
  - Operands are taken from the stack
  - Result goes onto the stack
Three-address machines
- Two for the source operands, one for the result
- RISC processors use three addresses
- Sample instructions:
    add  dest,src1,src2    ; M(dest) = [src1] + [src2]
    sub  dest,src1,src2    ; M(dest) = [src1] - [src2]
    mult dest,src1,src2    ; M(dest) = [src1] * [src2]
Three addresses: Operand 1, Operand 2, Result
Example: a = b + c
Three-address instruction formats are not common because they require a relatively long instruction format to hold the three address references.
Example
C statement:
    A = B + C * D - E + F + A
Equivalent code:
    mult T,C,D    ; T = C*D
    add  T,T,B    ; T = B+C*D
    sub  T,T,E    ; T = B+C*D-E
    add  T,T,F    ; T = B+C*D-E+F
    add  A,T,A    ; A = B+C*D-E+F+A
Two-address machines
- One address doubles (for source operand and result)
- Last example makes a case for it
  - Address T is used twice
- Sample instructions:
    load dest,src    ; M(dest) = [src]
    add  dest,src    ; M(dest) = [dest] + [src]
    sub  dest,src    ; M(dest) = [dest] - [src]
    mult dest,src    ; M(dest) = [dest] * [src]
Two addresses: one address doubles as operand and result
Example: a = a + b
The two-address format reduces the space requirement but also introduces some awkwardness. To avoid altering the value of an operand, a MOVE instruction is used to move one of the values to a result or temporary location before performing the operation.
Example
C statement:
    A = B + C * D - E + F + A
Equivalent code:
    load T,C    ; T = C
    mult T,D    ; T = C*D
    add  T,B    ; T = B+C*D
    sub  T,E    ; T = B+C*D-E
    add  T,F    ; T = B+C*D-E+F
    add  A,T    ; A = B+C*D-E+F+A
One-address machines
- Use a special set of registers called accumulators
  - Specify one source operand and receive the result
- Called accumulator machines
- Sample instructions:
    load  addr    ; accum = [addr]
    store addr    ; M[addr] = accum
    add   addr    ; accum = accum + [addr]
    sub   addr    ; accum = accum - [addr]
    mult  addr    ; accum = accum * [addr]
Example
C statement:
    A = B + C * D - E + F + A
Equivalent code:
    load  C    ; load C to accum
    mult  D    ; accum = C*D
    add   B    ; accum = C*D+B
    sub   E    ; accum = B+C*D-E
    add   F    ; accum = B+C*D-E+F
    add   A    ; accum = B+C*D-E+F+A
    store A    ; store accum contents in A
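The one-address sequence can be checked by running it on a toy accumulator machine. The function `run_accumulator` is a hypothetical sketch that supports only the five sample instructions, with a dictionary standing in for memory and arbitrary test values for A through F.

```python
def run_accumulator(program, memory):
    """Run (op, addr) pairs against a single accumulator; returns memory."""
    accum = 0
    for op, addr in program:
        if op == "load":
            accum = memory[addr]
        elif op == "store":
            memory[addr] = accum
        elif op == "add":
            accum += memory[addr]
        elif op == "sub":
            accum -= memory[addr]
        elif op == "mult":
            accum *= memory[addr]
        else:
            raise ValueError(f"unsupported operation: {op}")
    return memory


# A = B + C * D - E + F + A, in one-address form.
PROGRAM = [
    ("load", "C"), ("mult", "D"), ("add", "B"), ("sub", "E"),
    ("add", "F"), ("add", "A"), ("store", "A"),
]

memory = run_accumulator(PROGRAM, {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6})
print(memory["A"])
```

Every instruction except the final store reads memory once, which illustrates the high memory traffic listed among the accumulator architecture's drawbacks.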
Zero-address machines
- Stack supplies operands and receives the result
  - Special instructions to load and store use an address
- Called stack machines (Ex: HP3000, Burroughs B5500)
- Sample instructions:
    push addr    ; push([addr])
    pop  addr    ; pop([addr])
    add          ; push(pop + pop)
    sub          ; push(pop - pop)
    mult         ; push(pop * pop)
Example
C statement:
    A = B + C * D - E + F + A
Equivalent code:
    push E
    push C
    push D
    mult
    push B
    add
    sub
    push F
    add
    push A
    add
    pop  A
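The zero-address sequence can be traced with a toy stack machine. In this sketch the hypothetical function `run_stack` follows the convention of the sample instructions, where `sub` computes the first value popped (the top of the stack) minus the second; that is why E is pushed first. The memory values are arbitrary.

```python
def run_stack(program, memory):
    """Run zero-address instructions against an operand stack; returns memory."""
    stack = []
    for op, *operand in program:
        if op == "push":
            stack.append(memory[operand[0]])
        elif op == "pop":
            memory[operand[0]] = stack.pop()
        elif op in ("add", "sub", "mult"):
            first = stack.pop()    # top of stack
            second = stack.pop()   # element below it
            if op == "add":
                stack.append(first + second)
            elif op == "sub":
                stack.append(first - second)
            else:
                stack.append(first * second)
        else:
            raise ValueError(f"unsupported operation: {op}")
    return memory


# A = B + C * D - E + F + A, in zero-address form.
PROGRAM = [
    ("push", "E"), ("push", "C"), ("push", "D"), ("mult",),
    ("push", "B"), ("add",), ("sub",),
    ("push", "F"), ("add",), ("push", "A"), ("add",), ("pop", "A"),
]

memory = run_stack(PROGRAM, {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6})
print(memory["A"])
```

Only `push` and `pop` carry an address; the arithmetic instructions name no operands at all.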
Load/Store Architecture
- Instructions expect operands in internal processor registers
  - Special LOAD and STORE instructions move data between registers and memory
- RISC uses this architecture
- Reduces instruction length
- Sample instructions:
    load  Rd,addr       ; Rd = [addr]
    store addr,Rs       ; (addr) = Rs
    add   Rd,Rs1,Rs2    ; Rd = Rs1 + Rs2
    sub   Rd,Rs1,Rs2    ; Rd = Rs1 - Rs2
    mult  Rd,Rs1,Rs2    ; Rd = Rs1 * Rs2
Example
C statement:
    A = B + C * D - E + F + A
Equivalent code:
    load  R1,B
    load  R2,C
    load  R3,D
    load  R4,E
    load  R5,F
    load  R6,A
    mult  R2,R2,R3
    add   R2,R2,R1
    sub   R2,R2,R4
    add   R2,R2,R5
    add   R2,R2,R6
    store A,R2
Flow of Control
- Default is sequential flow
- Several instructions alter this default execution
  - Branches
    - Unconditional
    - Conditional
    - Delayed branches
  - Procedure calls
    - Delayed procedure calls
Branches
- Unconditional
  - Absolute address
  - PC-relative
    - Target address is specified relative to PC contents
    - Relocatable code
- Example: MIPS
  - Absolute address
      j target
  - PC-relative
      b target
Branches
- Conditional
  - Jump is taken only if the condition is met
- Two types
  - Set-Then-Jump
    - Condition testing is separated from branching
    - Condition code registers are used to convey the condition test result
    - Condition code registers keep a record of the status of the last ALU operation, such as overflow condition
  - Example: Pentium code
      cmp AX,BX     ; compare AX and BX
      je  target    ; jump if equal
  - Test-and-Jump
    - Single instruction performs condition testing and branching
  - Example: MIPS instruction
      beq Rsrc1,Rsrc2,target
    Jumps to target if Rsrc1 = Rsrc2
Delayed branching
- Control is transferred after executing the instruction that follows the branch instruction
  - This instruction slot is called the delay slot
- Improves efficiency
- Highly pipelined RISC processors support it
Procedure calls
- Facilitate modular programming
- Require two pieces of information to return
  - End of procedure
    - Pentium uses the ret instruction
    - MIPS uses the jr instruction
  - Return address
    - In a (special) register: MIPS allows any general-purpose register
    - On the stack: Pentium
[Figure: delay slot]
Parameter Passing
Two basic techniques
- Register-based (e.g., PowerPC, MIPS)
  - Internal registers are used
    - Faster
    - Limit the number of parameters
    - Recursive procedure
- Stack-based (e.g., Pentium)
  - Stack is used
    - More general
2 Operand Types
- Instructions support basic data types
  - Characters
  - Integers
  - Floating-point
- Instruction overload
  - Same instruction for different data types
  - Example: Pentium
      mov AL,address     ; loads an 8-bit value
      mov AX,address     ; loads a 16-bit value
      mov EAX,address    ; loads a 32-bit value
- Separate instructions
  - Instructions specify the operand size
  - Example: MIPS
      lb Rdest,address    ; loads a byte
      lh Rdest,address    ; loads a halfword (16 bits)
      lw Rdest,address    ; loads a word (32 bits)
      ld Rdest,address    ; loads a doubleword (64 bits)
    Similar instructions exist for store
3 Addressing Modes
How the operands are specified
- Operands can be in three places
  - Registers
    - Register addressing mode
  - Part of instruction
    - Constant
    - Immediate addressing mode
    - All processors support these two addressing modes
  - Memory
    - Difference between RISC and CISC
    - CISC supports a large variety of addressing modes
    - RISC follows load/store architecture
4 Instruction Types
Several types of instructions
- Data movement
  - Pentium: mov dest,src
  - Some do not provide direct data movement instructions
  - Indirect data movement
      add Rdest,Rsrc,0    ; Rdest = Rsrc+0
- Arithmetic and Logical
  - Arithmetic
    - Integer and floating-point, signed and unsigned
    - add, subtract, multiply, divide
  - Logical
    - and, or, not, xor
- Condition code bits
  - S: Sign bit (0 = +, 1 = -)
  - Z: Zero bit (0 = nonzero, 1 = zero)
  - O: Overflow bit (0 = no overflow, 1 = overflow)
  - C: Carry bit (0 = no carry, 1 = carry)
- Example: Pentium
    cmp count,25    ; compare count to 25
                    ; subtract 25 from count
    je  target      ; jump if equal
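How a compare instruction sets the four condition code bits can be sketched by performing the subtraction and inspecting the result. The function `compare` is a hypothetical model of a 32-bit two's-complement subtraction; it treats the carry bit as a borrow indicator, which is the x86 convention for subtraction, and other architectures may define C differently.

```python
def compare(a: int, b: int, bits: int = 32) -> dict:
    """Return the S, Z, O, C bits produced by computing a - b."""
    mask = (1 << bits) - 1
    sign = 1 << (bits - 1)
    ua, ub = a & mask, b & mask          # operands as unsigned bit patterns
    result = (ua - ub) & mask
    return {
        "S": int(bool(result & sign)),   # sign bit of the result
        "Z": int(result == 0),           # result is zero
        # Signed overflow: operand signs differ and the result's sign differs from a's.
        "O": int(bool((ua ^ ub) & (ua ^ result) & sign)),
        "C": int(ua < ub),               # a borrow was needed
    }


flags = compare(25, 25)
print(flags)                 # a "jump if equal" would be taken when Z is 1
print(compare(24, 25))
```

A conditional jump such as `je` then only has to test the Z bit, which is what separates the condition test from the branch in the set-then-jump style.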
- Flow control and I/O instructions
  - Branch
  - Procedure call
  - Interrupts
- I/O instructions
  - Memory-mapped I/O
    - Most processors support memory-mapped I/O
    - No separate instructions for I/O
  - Isolated I/O
    - Pentium supports isolated I/O
    - Separate I/O instructions
        in  AX,port    ; read from an I/O port
        out port,AX    ; write to an I/O port
5 Instruction Formats
Two types
Fixed-length
- Used by RISC processors
- 32-bit RISC processors use 32-bit-wide instructions
    - Examples: SPARC, MIPS, PowerPC
Variable-length
- Used by CISC processors
- Memory operands need more bits to specify
Opcode
- Major and exact operation
Examples of Instruction Formats
RISC (Reduced Instruction Set Computer)
versus
CISC (Complex Instruction Set Computer)
RISC vs CISC
- The underlying philosophy of RISC machines is that a system is better able to manage program execution when the program consists of only a few different instructions that are the same length and require the same number of clock cycles to decode and execute.
- RISC systems access memory only with explicit load and store instructions.
- In CISC systems, many different kinds of instructions access memory, making instruction length variable and fetch-decode-execute time unpredictable.
RISC vs CISC
- The difference between CISC and RISC becomes evident through the basic computer performance equation:
    time/program = time/cycle × cycles/instruction × instructions/program
- RISC systems shorten execution time by reducing the clock cycles per instruction.
- CISC systems improve performance by reducing the number of instructions per program.
RISC vs CISC
- The simple instruction set of RISC machines enables control units to be hardwired for maximum speed.
- The more complex (and variable) instruction set of CISC machines requires microcode-based control units that interpret instructions as they are fetched from memory. This translation takes time.
- With fixed-length instructions, RISC lends itself to pipelining and speculative execution.
RISC vs CISC
Consider the following program fragments:
    CISC                RISC
    mov ax, 10                 mov ax, 0
    mov bx, 5                  mov bx, 10
    mul bx, ax                 mov cx, 5
                        Begin: add ax, bx
                               loop Begin
The total clock cycles for the CISC version might be:
    (2 movs × 1 cycle) + (1 mul × 30 cycles) = 32 cycles
While the clock cycles for the RISC version are:
    (3 movs × 1 cycle) + (5 adds × 1 cycle) + (5 loops × 1 cycle) = 13 cycles
With the RISC clock cycle being shorter, RISC gives us much faster execution speeds.
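The cycle arithmetic for the two fragments can be checked with a short Python sketch. The instruction counts and cycles per instruction below are assumptions taken from the usual textbook form of this example (1 cycle for each mov, add and loop, 30 cycles for mul); `total_cycles` and `risc_multiply` are hypothetical helper names.

```python
# (instruction count, cycles per instruction) for each instruction kind.
# All values are assumptions from the textbook form of this example.
CISC_MIX = {"mov": (2, 1), "mul": (1, 30)}
RISC_MIX = {"mov": (3, 1), "add": (5, 1), "loop": (5, 1)}


def total_cycles(mix):
    """Sum count * cycles-per-instruction over every instruction kind."""
    return sum(count * cpi for count, cpi in mix.values())


def risc_multiply(multiplicand, multiplier):
    """Multiply by repeated addition, as the RISC fragment does.

    Assumes multiplier > 0, which is the case in the fragment.
    """
    ax, bx, cx = 0, multiplicand, multiplier   # the three mov instructions
    while cx > 0:
        ax += bx    # add ax, bx
        cx -= 1     # loop Begin: decrement cx, branch back while cx != 0
    return ax


print(total_cycles(CISC_MIX))
print(total_cycles(RISC_MIX))
print(risc_multiply(10, 5))
```

The cycle totals alone do not give execution time; per the performance equation, each total still has to be multiplied by that machine's clock cycle time.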
RISC vs CISC
- Because of their load-store ISAs, RISC architectures require a large number of CPU registers.
- These registers provide fast access to data during sequential program execution.
- They can also be employed to reduce the overhead typically caused by passing parameters to subprograms.
- Instead of pulling parameters off of a stack, the subprogram is directed to use a subset of registers.
RISC vs CISC
- This is how registers can be overlapped in a RISC system.
- The current window pointer (CWP) points to the active register window.
[Figure: overlapping register windows]
RISC vs CISC
- It is becoming increasingly difficult to distinguish RISC architectures from CISC architectures.
- Some RISC systems provide more extravagant instruction sets than some CISC systems.
- Some systems combine both approaches.
- The following two slides summarize the characteristics that traditionally typify the differences between these two architectures.
RISC vs CISC
RISC
- Multiple register sets
- Three operands per instruction
- Parameter passing through register windows
- Single-cycle instructions
- Hardwired control
- Highly pipelined
CISC
- Single register set
- One or two register operands per instruction
- Parameter passing through memory
- Multiple-cycle instructions
- Microprogrammed control
- Less pipelined
Continued...
RISC vs CISC
RISC
- Simple instructions, few in number
- Fixed-length instructions
- Complexity in compiler
- Only LOAD/STORE instructions access memory
- Few addressing modes
CISC
- Many complex instructions
- Variable-length instructions
- Complexity in microcode
- Many instructions can access memory
- Many addressing modes
Summary
Instruction Set Design Issues
Instruction set design issues include:
Where are operands stored?
- registers, memory, stack, accumulator
How many explicit operands are there?
- 0, 1, 2, or 3
How is the operand location specified?
- register, immediate, indirect, ...
What type and size of operands are supported?
- byte, int, float, double, string, vector, ...
What operations are supported?
- add, sub, mul, move, compare, ...
More About General Purpose Registers
Why do almost all new architectures use GPRs?
Registers are much faster than memory (even cache)
- Register values are available immediately
- When memory isn't ready, processor must wait ("stall")
Registers are convenient for variable storage
- Compiler assigns some variables just to registers
- More compact code, since small fields specify registers (compared to memory addresses)
[Figure: memory hierarchy: Processor (Registers), Cache, Memory, Disk]
What Operations are Needed
Arithmetic & Logical
- Integer arithmetic: ADD, SUB, MULT, DIV, SHIFT
- Logical operations: AND, OR, XOR, NOT
Data Transfer - copy, load, store
Control - branch, jump, call, return
Floating Point - ADDF, MULF, DIVF (same as arithmetic but usually take bigger operands)
Decimal - ADDD, CONVERT
String - move, compare, search
Graphics - pixel and vertex, compression/decompression operations
Stacks Architecture: Pros and Cons
Pros
- Good code density (implicit top of stack)
- Low hardware requirements
- Easy to write a simpler compiler for stack architectures
Cons
- Stack becomes the bottleneck
- Little ability for parallelism or pipelining
- Data is not always at the top of the stack when needed, so additional instructions like TOP and SWAP are needed
- Difficult to write an optimizing compiler for stack architectures
Accumulators Architecture: Pros and Cons
Pros
- Very low hardware requirements
- Easy to design and understand
Cons
- Accumulator becomes the bottleneck
- Little ability for parallelism or pipelining
- High memory traffic
Memory-Memory Architecture: Pros and Cons
Pros
- Requires fewer instructions (especially if 3 operands)
- Easy to write compilers for (especially if 3 operands)
Cons
- Very high memory traffic (especially if 3 operands)
- Variable number of clocks per instruction
- With two operands, more data movements are required
Memory-Register Architecture: Pros and Cons
Pros
- Some data can be accessed without loading first
- Instruction format easy to encode
- Good code density
Cons
- Operands are not equivalent (poor orthogonality)
- Variable number of clocks per instruction
- May limit number of registers
Destination Address Definition
- Relative addressing with respect to the program counter proved to be the best choice for forward and backward branching or jumps (load-address independent)
- To allow for dynamic loading of library routines, register indirect addressing allows addresses to be loaded in special registers (e.g., virtual functions in C++ and system calls in a case statement)
Data is based on SPEC on Alpha
Condition Evaluation
- Compare-and-branch can be efficient if the majority of conditions are comparisons with zero
- Remember to focus on the common case
Based on SPEC on MIPS
Frequency of Types of Comparison
Data is based on SPEC on Alpha
- Different benchmarks and machines set new design priorities
- DSPs support a repeat instruction for for-loops (vectors) using registers
Supporting Procedures
Execution of a procedure follows these steps:
- Store parameters in a place accessible to the procedure
- Transfer control to the procedure
- Acquire the storage resources needed for the procedure
- Perform the desired task
- Store the result value in a place accessible to the calling program
- Return control to the point of origin
The hardware provides a program counter to trace instruction flow and manage transfer of control
Parameter Passing
- Registers can be used for passing a small number of parameters
- A stack is used to spill registers of the current context, to make room for the called procedure to run, and to allow large parameters to be passed
- Storage of machine state can be performed by the caller or the callee
- Handling of shared variables is important to ensure correct semantics and thus requires clear specifications in the library interface
Global variables stored in registers need careful handling
Type and Size of Operands
- The type of an operand is designated by encoding it in the instruction's operation code
- The type of an operand, e.g. single-precision float, effectively gives its size
- Common operand types include character, half-word and word-size integer, and single- and double-precision floating point
- Characters are almost always in ASCII, integers are in 2's complement, and floating point is in IEEE 754
- The 16-bit Unicode used in Java is gaining popularity due to its support for international character sets
- For business applications, some architectures support a decimal format in binary-coded decimal (BCD)
- Depending on the size of the word, the complexity of handling different operand types differs
- DSPs offer fixed-point data types to support high-precision floating-point arithmetic and to allow sharing a single exponent among multiple numbers
- For graphics applications, vertex and pixel operands are added features
Size of Operands
- Double-word data type is used for double-precision floating-point operations and address storage in machines with a 64-bit-wide address bus
- Words are used for integer operations and for 32-bit address bus machines
- Because of the mix in SPEC, word and double-word data types dominate
Frequency of reference by size, based on SPEC2000 on Alpha
Instruction Representation
- Humans are taught to think in base 10 (decimal), but numbers may be represented in any base (123 in base 10 = 1111011 in binary, or base 2)
- Numbers are stored in computers as a series of high and low electronic signals (binary numbers)
- Binary digits are called bits and are considered the atom of computing
- Each piece of an instruction is a number, and placing these numbers together forms the instruction
- The assembler translates symbolic assembly instructions into machine language instructions (machine code)
Example:
    Assembly:                    add $t0, $s1, $s2
    Machine language (decimal):  0       17      18      8       0       32
    Machine language (binary):   000000  10001   10010   01000   00000   100000
    Field widths:                6 bits  5 bits  5 bits  5 bits  5 bits  6 bits
Note: the MIPS compiler by default maps $s0,...,$s7 to registers 16-23 and $t0,...,$t7 to registers 8-15
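Packing the six fields of a register-format instruction into one 32-bit word can be sketched as follows. The field widths (6, 5, 5, 5, 5, 6 bits) and the register numbers follow the standard MIPS conventions; the names `REGISTERS` and `encode_r_type` are hypothetical and exist only for this illustration.

```python
# Default MIPS register numbering: $t0..$t7 -> 8..15, $s0..$s7 -> 16..23.
REGISTERS = {f"$t{i}": 8 + i for i in range(8)}
REGISTERS.update({f"$s{i}": 16 + i for i in range(8)})


def encode_r_type(op, rs, rt, rd, shamt, funct):
    """Pack the R-format fields (6, 5, 5, 5, 5, 6 bits) into a 32-bit word."""
    return (op << 26) | (rs << 21) | (rt << 16) | (rd << 11) | (shamt << 6) | funct


# add $t0, $s1, $s2: op = 0, rs = $s1, rt = $s2, rd = $t0, shamt = 0, funct = 32
word = encode_r_type(0, REGISTERS["$s1"], REGISTERS["$s2"], REGISTERS["$t0"], 0, 32)
print(format(word, "032b"))
print(hex(word))
```

Note that the destination register comes first in the assembly syntax but sits in the fourth field (rd) of the encoded word.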
Encoding an Instruction Set
- Instruction encoding affects the size of the compiled program and the complexity of the CPU implementation
- The operation is typically specified in one field, called the opcode
- The addressing mode for the operand can be encoded with the operation, or specified through a separate identifier when a large number of modes is supported
- The architecture must balance several competing factors:
    - Desire to support as many registers and addressing modes as possible
    - Effect of operand specification on the size of the instruction (program)
    - Desire to simplify instruction fetching and decoding during execution
- Fixed-size instruction encoding simplifies the CPU design while limiting the addressing modes supported
- An architect caring about code size can use variable-size encoding
- A hybrid approach is to allow variability by supporting multiple-sized instructions
Encoding Examples
MIPS Instruction Format
Register-format instructions:
    op      rs      rt      rd      shamt   funct
    6 bits  5 bits  5 bits  5 bits  5 bits  6 bits
- op: basic operation of the instruction, traditionally called the opcode
- rs: the first register source operand
- rt: the second register source operand
- rd: the register destination operand; it gets the result of the operation
- shamt: shift amount
- funct: selects the specific variant of the operation in the op field
Immediate-type instructions:
    op      rs      rt      address
    6 bits  5 bits  5 bits  16 bits
- Some instructions need longer fields than provided, for large-value constants
- The 16-bit address means a load word instruction can load a word within a region of ±2^15 bytes of the address in the base register
- Example: lw $t0, 32($s3)    # Temporary register $t0 gets A[8]
    Instruction  Format  op  rs   rt   rd   shamt  funct  address
    add          R       0   reg  reg  reg  0      32     N/A
    sub          R       0   reg  reg  reg  0      34     N/A
    lw           I       35  reg  reg  N/A  N/A    N/A    address
    sw           I       43  reg  reg  N/A  N/A    N/A    address
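The immediate format can be encoded the same way as the register format, with the 16-bit address field limiting the reachable offset to ±2^15 bytes. The sketch below assumes the standard MIPS opcode for lw (35) and the default register numbers ($s3 = 19, $t0 = 8); `encode_i_type` is a hypothetical helper name, and the example instruction lw $t0, 32($s3) is the usual textbook one.

```python
def encode_i_type(op, rs, rt, offset):
    """Pack the I-format fields (6, 5, 5, 16 bits) into a 32-bit word.

    The offset is a signed 16-bit value, so it must lie in [-2**15, 2**15 - 1].
    """
    if not -(1 << 15) <= offset < (1 << 15):
        raise ValueError("offset does not fit in the 16-bit address field")
    return (op << 26) | (rs << 21) | (rt << 16) | (offset & 0xFFFF)


# lw $t0, 32($s3): op = 35, rs = base register $s3 (19), rt = target $t0 (8)
word = encode_i_type(35, 19, 8, 32)
print(hex(word))
```

A negative offset is stored in two's complement in the low 16 bits, which is why the value is masked with 0xFFFF rather than stored directly.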
The Stored Program Concept
- Learning how instructions are represented leads to discovering the secret of computing: the stored-program concept
- Today's computers are built on two key principles:
    - Instructions are represented as numbers
    - Programs can be stored in memory to be read or written just like numbers
- The power of the concept: memory can contain
    - the source code for an editor
    - the compiled machine code for the editor
    - the text that the compiled program is using
    - the compiler that generated the code
[Figure: a processor connected to a memory holding an accounting program (machine code), an editor program (machine code), a C compiler (machine code), payroll data, book text, and source code in C for the editor program]
Compiling if-then-else in MIPS
Assuming the five variables f, g, h, i, and j correspond to the five registers $s0 through $s4, what is the compiled MIPS code for the following C if statement?
    if (i == j) f = g + h; else f = g - h;
[Flowchart: test i == j; if i = j, execute f = g + h; if i != j, go to Else: f = g - h; both paths join at Exit]
MIPS:
          bne $s3, $s4, Else    # go to Else if i != j
          add $s0, $s1, $s2     # f = g + h (skipped if i != j)
          j   Exit              # go to Exit
    Else: sub $s0, $s1, $s2     # f = g - h (skipped if i = j)
    Exit:
Typical Compilation
Major Types of Optimization
Optimization name: explanation (frequency, where given)
High-level (at or near source level; machine independent)
- Procedure integration: replace procedure call by procedure body (N.M.)
Local (within straight-line code)
- Common subexpression elimination: replace two instances of the same computation by a single copy
- Constant propagation: replace all instances of a variable that is assigned a constant with the constant
- Stack height reduction: rearrange expression tree to minimize resources needed for expression evaluation (N.M.)
Global (across a branch)
- Global common subexpression elimination: same as local, but this version crosses branches
- Copy propagation: replace all instances of a variable A that has been assigned X (i.e., A = X) with X
- Code motion: remove code from a loop that computes the same value on each iteration of the loop
- Induction variable elimination: simplify/eliminate array-addressing calculations within loops
Machine-dependent (depends on machine knowledge)
- Strength reduction: many examples, such as replacing a multiply by a constant with adds and shifts (N.M.)
- Pipeline scheduling: reorder instructions to improve pipeline performance (N.M.)
Effect of Compiler Optimization
Measurements taken on MIPS
[Figure: results by program and compiler optimization level]
- Level 0: non-optimized code
- Level 1: local optimization
- Level 2: global optimization, s/w pipelining
- Level 3: adds procedure integration
Compiler Support for Multimedia Instr.
- Intel's MMX and PowerPC AltiVec have small vector processing capabilities targeting multimedia applications (to speed up graphics)
- Intel added a new set of instructions called Streaming SIMD Extensions
- A major advantage of vector computers is hiding the latency of memory access by loading multiple elements and then overlapping execution with data transfer
- Vector computers typically have strided and/or gather/scatter addressing to perform operations on distant memory locations
    - Strided addressing allows memory access in increments larger than one
    - Gather/scatter addressing is similar to register indirect mode, where the addresses are stored instead of the data
- Supporting vector operations without strided addressing, as in Intel's MMX, limits the potential speedup
- Such limited support for vector processing makes vectorizing compiler optimizations unpopular and restricts their scope to hand-coded routines
SIMD instructions on MMX and AltiVec tend to be solutions, not primitives
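The two vector addressing styles mentioned above can be illustrated with plain Python lists standing in for memory. This is a conceptual sketch only: the function names `strided_load`, `gather` and `scatter` are hypothetical, and real vector hardware performs these accesses in parallel rather than in a loop.

```python
def strided_load(memory, base, stride, count):
    """Load `count` elements starting at `base`, stepping by `stride`."""
    return [memory[base + i * stride] for i in range(count)]


def gather(memory, indices):
    """Load the elements whose addresses are held in an index vector."""
    return [memory[i] for i in indices]


def scatter(memory, indices, values):
    """Store each value at the address given by the matching index."""
    for i, value in zip(indices, values):
        memory[i] = value


memory = list(range(100, 116))           # 16 words of toy memory
print(strided_load(memory, 1, 4, 3))     # every fourth word, e.g. one matrix column
print(gather(memory, [0, 7, 3]))         # arbitrary addresses from an index vector
```

Strided access covers regular patterns such as walking a matrix column, while gather/scatter handles irregular patterns such as sparse data, at the cost of storing the addresses themselves.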
Starting a Program
[Figure: C program -> Compiler -> Assembly language program -> Assembler -> Object: machine language module (together with Object: library routine (machine language)) -> Linker -> Executable: machine language program -> Loader -> Memory]
Linker
- Places code and data modules symbolically in memory
- Determines the addresses of data and instruction labels
- Patches both internal and external references
Object files for Unix typically contain:
- Header: size and position of components
- Text segment: machine code
- Data segment: static and dynamic variables
- Relocation info: identifies absolute memory references
- Symbol table: names and locations of labels, procedures, and variables
- Debugging info: mapping of source to object code, breakpoints, etc.
Loading Executable Program
[Figure: memory layout, from top to bottom: stack ($sp at 7fff ffff hex), dynamic data, static data ($gp at 1000 8000 hex; segment begins at 1000 0000 hex), text (pc at 0040 0000 hex), reserved (from address 0)]
To load an executable, the operating system follows these steps:
- Reads the executable file header to determine the size of the text and data segments
- Creates an address space large enough for the text and data
- Copies the instructions and data from the executable file into memory
- Copies the parameters (if any) to the main program onto the stack
- Initializes the machine registers and sets the stack pointer to the first free location
- Jumps to a start-up routine that copies the parameters into the argument registers and calls the main routine of the program
Instruction Set Design Issues
- Number of Addresses
- Flow of Control
- Operand Types & Addressing Modes
- Instruction Types
- Instruction Formats
Number of Addresses
Four categories
- 3-address machines
    - 2 for the source operands and one for the result
- 2-address machines
    - One address doubles as source and result
- 1-address machines
    - Accumulator machines
    - Accumulator is used for one source and result
- 0-address machines
    - Stack machines
    - Operands are taken from the stack
    - Result goes onto the stack
Number of Addresses (cont.)
Three-address machines
- Two for the source operands, one for the result
- RISC processors use three addresses
- Sample instructions
    add  dest,src1,src2    ; M(dest) = [src1] + [src2]
    sub  dest,src1,src2    ; M(dest) = [src1] - [src2]
    mult dest,src1,src2    ; M(dest) = [src1] * [src2]
Three addresses: Operand 1, Operand 2, Result
Example: a = b + c
Three-address instruction formats are not common because they require a relatively long instruction format to hold the three address references.
Number of Addresses (cont.)
Example
C statement
    A = B + C * D - E + F + A
Equivalent code:
    mult T,C,D    ; T = C*D
    add  T,T,B    ; T = B+C*D
    sub  T,T,E    ; T = B+C*D-E
    add  T,T,F    ; T = B+C*D-E+F
    add  A,T,A    ; A = B+C*D-E+F+A
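The three-address sequence for A = B + C * D - E + F + A can be traced with a toy interpreter. This is a sketch under stated assumptions: memory is modeled as a Python dictionary, and the names `OPS`, `run_three_address` and `PROGRAM`, as well as the sample values, are hypothetical.

```python
import operator

# Each instruction is (opcode, dest, src1, src2): M(dest) = [src1] op [src2].
OPS = {"add": operator.add, "sub": operator.sub, "mult": operator.mul}

PROGRAM = [
    ("mult", "T", "C", "D"),   # T = C*D
    ("add", "T", "T", "B"),    # T = B+C*D
    ("sub", "T", "T", "E"),    # T = B+C*D-E
    ("add", "T", "T", "F"),    # T = B+C*D-E+F
    ("add", "A", "T", "A"),    # A = B+C*D-E+F+A
]


def run_three_address(program, memory):
    """Execute three-address instructions against a dict used as memory."""
    for op, dest, src1, src2 in program:
        memory[dest] = OPS[op](memory[src1], memory[src2])
    return memory


memory = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6, "T": 0}
print(run_three_address(PROGRAM, memory)["A"])
```

Five instructions suffice here, but each one carries three address fields, which is the instruction-length cost the slide points out.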
Number of Addresses (cont.)
Two-address machines
- One address doubles (for source operand and result)
- Last example makes a case for it
    - Address T is used twice
- Sample instructions
    load dest,src    ; M(dest) = [src]
    add  dest,src    ; M(dest) = [dest] + [src]
    sub  dest,src    ; M(dest) = [dest] - [src]
    mult dest,src    ; M(dest) = [dest] * [src]
Two addresses: one address doubles as operand and result
Example: a = a + b
The two-address format reduces the space requirement but also introduces some awkwardness. To avoid altering the value of an operand, a MOVE instruction is used to move one of the values to a result or temporary location before performing the operation.
Number of Addresses (cont.)
Example
C statement
    A = B + C * D - E + F + A
Equivalent code:
    load T,C    ; T = C
    mult T,D    ; T = C*D
    add  T,B    ; T = B+C*D
    sub  T,E    ; T = B+C*D-E
    add  T,F    ; T = B+C*D-E+F
    add  A,T    ; A = B+C*D-E+F+A
Number of Addresses (cont.)
One-address machines
- Use a special set of registers called accumulators
    - Specify one source operand and receive the result
- Called accumulator machines
- Sample instructions
    load  addr    ; accum = [addr]
    store addr    ; M[addr] = accum
    add   addr    ; accum = accum + [addr]
    sub   addr    ; accum = accum - [addr]
    mult  addr    ; accum = accum * [addr]
Number of Addresses (cont.)
Example
C statement
    A = B + C * D - E + F + A
Equivalent code:
    load  C    ; load C into accum
    mult  D    ; accum = C*D
    add   B    ; accum = C*D+B
    sub   E    ; accum = B+C*D-E
    add   F    ; accum = B+C*D-E+F
    add   A    ; accum = B+C*D-E+F+A
    store A    ; store accum contents in A
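The accumulator version of A = B + C * D - E + F + A can be traced with a small Python sketch. The dictionary-based memory, the function name `run_accumulator`, and the sample values are assumptions made for illustration.

```python
# One-address program: each instruction names a single memory operand;
# the accumulator is the implicit second operand and the destination.
PROGRAM = [
    ("load", "C"), ("mult", "D"), ("add", "B"), ("sub", "E"),
    ("add", "F"), ("add", "A"), ("store", "A"),
]


def run_accumulator(program, memory):
    """Execute one-address instructions using a single accumulator."""
    accum = 0
    for op, addr in program:
        if op == "load":
            accum = memory[addr]
        elif op == "store":
            memory[addr] = accum
        elif op == "add":
            accum += memory[addr]
        elif op == "sub":
            accum -= memory[addr]
        elif op == "mult":
            accum *= memory[addr]
        else:
            raise ValueError(f"unknown opcode: {op}")
    return memory


memory = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
print(run_accumulator(PROGRAM, memory)["A"])
```

Every instruction except the final store reads memory, which illustrates why accumulator machines generate high memory traffic even though each instruction is short.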
Number of Addresses (cont.)
Zero-address machines
- Stack supplies operands and receives the result
    - Special instructions to load and store use an address
- Called stack machines (Ex: HP3000, Burroughs B5500)
- Sample instructions
    push addr    ; push([addr])
    pop  addr    ; pop([addr])
    add          ; push(pop + pop)
    sub          ; push(pop - pop)
    mult         ; push(pop * pop)
Number of Addresses (cont.)
Example
C statement
    A = B + C * D - E + F + A
Equivalent code:
    push E
    push C
    push D
    mult
    push B
    add
    sub
    push F
    add
    push A
    add
    pop  A
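The stack sequence for A = B + C * D - E + F + A can be traced with a toy stack machine. The sketch assumes that in push(pop - pop) the first value popped is the left operand, which is what makes pushing E first produce B+C*D-E; the names `PROGRAM` and `run_stack_machine` and the sample values are hypothetical.

```python
PROGRAM = [
    ("push", "E"), ("push", "C"), ("push", "D"), ("mult",),
    ("push", "B"), ("add",), ("sub",), ("push", "F"),
    ("add",), ("push", "A"), ("add",), ("pop", "A"),
]


def run_stack_machine(program, memory):
    """Execute zero-address instructions; only push and pop name an address."""
    stack = []
    for instr in program:
        op = instr[0]
        if op == "push":
            stack.append(memory[instr[1]])
        elif op == "pop":
            memory[instr[1]] = stack.pop()
        else:
            first = stack.pop()     # value on top of the stack
            second = stack.pop()    # value just below it
            if op == "add":
                stack.append(first + second)
            elif op == "sub":
                stack.append(first - second)   # assumed order: top minus next
            elif op == "mult":
                stack.append(first * second)
            else:
                raise ValueError(f"unknown opcode: {op}")
    return memory


memory = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
print(run_stack_machine(PROGRAM, memory)["A"])
```

The program needs twelve instructions, more than any of the other formats, but the arithmetic instructions carry no address fields at all.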
Load/Store Architecture
- Instructions expect operands in internal processor registers
- Special LOAD and STORE instructions move data between registers and memory
- RISC uses this architecture
- Reduces instruction length
Load/Store Architecture (cont.)
Sample instructions
    load  $d,addr       ; $d = [addr]
    store addr,$s       ; (addr) = $s
    add   $d,$s1,$s2    ; $d = $s1 + $s2
    sub   $d,$s1,$s2    ; $d = $s1 - $s2
    mult  $d,$s1,$s2    ; $d = $s1 * $s2
Number of Addresses (cont.)
Example
C statement
    A = B + C * D - E + F + A
Equivalent code:
    load  $1,B
    load  $2,C
    load  $3,D
    load  $4,E
    load  $5,F
    load  $6,A
    mult  $2,$2,$3
    add   $2,$2,$1
    sub   $2,$2,$4
    add   $2,$2,$5
    add   $2,$2,$6
    store A,$2
Flow of Control
Default is sequential flow
Several instructions alter this default execution
- Branches
    - Unconditional
    - Conditional
    - Delayed branches
- Procedure calls
    - Delayed procedure calls
Flow of Control (cont.)
Branches
Unconditional
- Absolute address
- PC-relative
    - Target address is specified relative to PC contents
    - Relocatable code
Example: MIPS
- Absolute address
    j target
- PC-relative
    b target
Flow of Control (cont.)
Branches
Conditional
- Jump is taken only if the condition is met
Two types
- Set-Then-Jump
    - Condition testing is separated from branching
    - Condition code registers are used to convey the condition test result
    - Condition code registers keep a record of the status of the last ALU operation, such as an overflow condition
- Example: Pentium code
    cmp AX,BX     ; compare AX and BX
    je  target    ; jump if equal
Flow of Control (cont.)
- Test-and-Jump
    - Single instruction performs condition testing and branching
- Example: MIPS instruction
    beq $src1,$src2,target
  Jumps to target if Rsrc1 = Rsrc2
Delayed branching
- Control is transferred after executing the instruction that follows the branch instruction
    - This instruction slot is called the delay slot
- Improves efficiency
- Highly pipelined RISC processors support it
Flow of Control (cont.)
Procedure calls
- Facilitate modular programming
- Require two pieces of information to return
    - End of procedure
        - Pentium uses the ret instruction
        - MIPS uses the jr instruction
    - Return address
        - In a (special) register: MIPS allows any general-purpose register
        - On the stack: Pentium
Flow of Control (cont.)
[Figure: delay slot]
Parameter Passing
Two basic techniques
- Register-based (e.g., PowerPC, MIPS)
    - Internal registers are used
        - Faster
        - Limit the number of parameters
        - Recursive procedure
- Stack-based (e.g., Pentium)
    - Stack is used
        - More general
2. Operand Types
Instructions support basic data types
- Characters
- Integers
- Floating-point
Instruction overload
- Same instruction for different data types
- Example: Pentium
    mov AL,address     ; loads an 8-bit value
    mov AX,address     ; loads a 16-bit value
    mov EAX,address    ; loads a 32-bit value
Operand Types (cont.)
Separate instructions
- Instructions specify the operand size
- Example: MIPS
    lb Rdest,address   ; loads a byte
    lh Rdest,address   ; loads a halfword (16 bits)
    lw Rdest,address   ; loads a word (32 bits)
    ld Rdest,address   ; loads a doubleword (64 bits)
- Similar instructions exist for store
3. Addressing Modes
How the operands are specified
- Operands can be in three places
  - Registers
    - Register addressing mode
  - Part of instruction
    - Constant
    - Immediate addressing mode
    - All processors support these two addressing modes
  - Memory
    - Difference between RISC and CISC
    - CISC supports a large variety of addressing modes
    - RISC follows load/store architecture
4. Instruction Types
Several types of instructions
- Data movement
  - Pentium: mov dest,src
  - Some do not provide direct data movement instructions
  - Indirect data movement
      add Rdest,Rsrc,0   ; Rdest = Rsrc+0
- Arithmetic and Logical
  - Arithmetic
    - Integer and floating-point, signed and unsigned
    - add, subtract, multiply, divide
  - Logical
    - and, or, not, xor
Instruction Types (cont.)
Condition code bits
- S: Sign bit (0 = +, 1 = -)
- Z: Zero bit (0 = nonzero, 1 = zero)
- O: Overflow bit (0 = no overflow, 1 = overflow)
- C: Carry bit (0 = no carry, 1 = carry)
- Example: Pentium
    cmp count,25   ; compare count to 25
                   ; (subtract 25 from count)
    je target      ; jump if equal
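The way a compare instruction sets the four condition code bits can be sketched in a few lines. This is only an illustration, assuming a 32-bit register width and x86-style borrow semantics for the carry bit; the function name `compare_flags` is hypothetical and not part of any instruction set.

```python
def compare_flags(a: int, b: int, bits: int = 32) -> dict:
    """Return the S, Z, O and C bits produced by computing a - b, as a compare does."""
    mask = (1 << bits) - 1
    sign = 1 << (bits - 1)
    a &= mask
    b &= mask
    result = (a - b) & mask
    return {
        "S": int(bool(result & sign)),   # sign of the result
        "Z": int(result == 0),           # result is zero, i.e. a == b
        # Signed overflow: operands differ in sign and the result's sign differs from a's.
        "O": int(bool((a ^ b) & (a ^ result) & sign)),
        # Carry acts as a borrow for subtraction: set when a < b as unsigned values.
        "C": int(a < b),
    }


def jump_if_equal(flags: dict) -> bool:
    """A set-then-jump branch such as je only inspects the Z bit."""
    return flags["Z"] == 1


print(compare_flags(25, 25), jump_if_equal(compare_flags(25, 25)))
```

The point of the split is that the branch never sees the operands, only the flags left behind by the earlier instruction.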
Instruction Types (cont.)
Flow control and I/O instructions
- Branch
- Procedure call
- Interrupts
I/O instructions
- Memory-mapped I/O
  - Most processors support memory-mapped I/O
  - No separate instructions for I/O
- Isolated I/O
  - Pentium supports isolated I/O
  - Separate I/O instructions
      in AX,io_port    ; read from an I/O port
      out io_port,AX   ; write to an I/O port
5. Instruction Formats
Two types
- Fixed-length
  - Used by RISC processors
  - 32-bit RISC processors use 32-bit-wide instructions
    - Examples: SPARC, MIPS, PowerPC
- Variable-length
  - Used by CISC processors
  - Memory operands need more bits to specify
Opcode
- Major and exact operation
Examples of Instruction Formats
[Figure: examples of instruction formats]
RISC (Reduced Instruction Set Computer)
versus
CISC (Complex Instruction Set Computer)
RISC vs. CISC
- The underlying philosophy of RISC machines is that a system is better able to manage program execution when the program consists of only a few different instructions that are the same length and require the same number of clock cycles to decode and execute.
- RISC systems access memory only with explicit load and store instructions.
- In CISC systems, many different kinds of instructions access memory, making instruction length variable and fetch-decode-execute time unpredictable.
RISC vs. CISC
- The difference between CISC and RISC becomes evident through the basic computer performance equation:
    time/program = (time/cycle) x (cycles/instruction) x (instructions/program)
- RISC systems shorten execution time by reducing the clock cycles per instruction.
- CISC systems improve performance by reducing the number of instructions per program.
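The performance equation referred to here is usually written as execution time = instructions per program x cycles per instruction x time per cycle. The sketch below assumes that form; the function name and the sample figures are illustrative only and do not describe any real processor.

```python
def execution_time(instructions: int, cycles_per_instruction: float,
                   seconds_per_cycle: float) -> float:
    """Execution time as the product of the three factors of the performance equation."""
    return instructions * cycles_per_instruction * seconds_per_cycle


# Hypothetical numbers: a RISC-style design runs more instructions at a lower CPI,
# a CISC-style design runs fewer instructions at a higher CPI.
risc_style = execution_time(1_500_000, 1.2, 1e-9)
cisc_style = execution_time(1_000_000, 4.0, 1e-9)
print(risc_style, cisc_style)
```

Which design wins depends entirely on the three factors taken together, which is why neither approach can be judged on instruction count or CPI alone.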
RISC vs. CISC
- The simple instruction set of RISC machines enables control units to be hardwired for maximum speed.
- The more complex, and variable, instruction set of CISC machines requires microcode-based control units that interpret instructions as they are fetched from memory. This translation takes time.
- With fixed-length instructions, RISC lends itself to pipelining and speculative execution.
RISC vs. CISC
Consider the two program fragments:

    CISC                 RISC
    mov ax, 10                  mov ax, 0
    mov bx, 5                   mov bx, 10
    mul bx, ax                  mov cx, 5
                         Begin: add ax, bx
                                loop Begin

The total clock cycles for the CISC version might be:
    (2 movs x 1 cycle) + (1 mul x 30 cycles) = 32 cycles
While the clock cycles for the RISC version are:
    (3 movs x 1 cycle) + (5 adds x 1 cycle) + (5 loops x 1 cycle) = 13 cycles
With the RISC clock cycle also being shorter, RISC gives us much faster execution speeds.
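The cycle comparison can be reproduced with a small tally. This is a sketch under stated assumptions: a multiply is taken to cost 30 cycles, every other instruction 1 cycle, and the RISC loop is taken to run five times. The helper `total_cycles` is a hypothetical name.

```python
def total_cycles(instruction_counts: dict, cycles_per_op: dict) -> int:
    """Sum count x cost over every kind of instruction executed."""
    return sum(count * cycles_per_op[op] for op, count in instruction_counts.items())


# Assumed costs: 30 cycles for mul, 1 cycle for everything else.
COSTS = {"mov": 1, "mul": 30, "add": 1, "loop": 1}

cisc_cycles = total_cycles({"mov": 2, "mul": 1}, COSTS)
risc_cycles = total_cycles({"mov": 3, "add": 5, "loop": 5}, COSTS)
print(cisc_cycles, risc_cycles)
```

The RISC fragment executes more instructions, but each one is cheap, so the total still comes out lower under these assumptions.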
RISC vs. CISC
- Because of their load-store ISAs, RISC architectures require a large number of CPU registers.
- These registers provide fast access to data during sequential program execution.
- They can also be employed to reduce the overhead typically caused by passing parameters to subprograms.
- Instead of pulling parameters off of a stack, the subprogram is directed to use a subset of registers.
RISC vs. CISC
- This is how registers can be overlapped in a RISC system.
- The current window pointer (CWP) points to the active register window.
[Figure: overlapping register windows]
RISC vs. CISC
- It is becoming increasingly difficult to distinguish RISC architectures from CISC architectures.
- Some RISC systems provide more extravagant instruction sets than some CISC systems.
- Some systems combine both approaches.
- The following two slides summarize the characteristics that traditionally typify the differences between these two architectures.
RISC vs. CISC

    RISC                                          CISC
    Multiple register sets.                       Single register set.
    Three operands per instruction.               One or two register operands per instruction.
    Parameter passing through register windows.   Parameter passing through memory.
    Single-cycle instructions.                    Multiple-cycle instructions.
    Hardwired control.                            Microprogrammed control.
    Highly pipelined.                             Less pipelined.
    (Continued)
RISC vs. CISC

    RISC                                          CISC
    Simple instructions, few in number.           Many complex instructions.
    Fixed-length instructions.                    Variable-length instructions.
    Complexity in compiler.                       Complexity in microcode.
    Only LOAD/STORE instructions access memory.   Many instructions can access memory.
    Few addressing modes.                         Many addressing modes.
Summary
Instruction Set Design Issues
Instruction set design issues include:
- Where are operands stored?
  - registers, memory, stack, accumulator
- How many explicit operands are there?
  - 0, 1, 2, or 3
- How is the operand location specified?
  - register, immediate, indirect, ...
- What type and size of operands are supported?
  - byte, int, float, double, string, vector, ...
- What operations are supported?
  - add, sub, mul, move, compare, ...
More About General Purpose Registers
Why do almost all new architectures use GPRs?
- Registers are much faster than memory (even cache)
  - Register values are available immediately
  - When memory isn't ready, the processor must wait ("stall")
- Registers are convenient for variable storage
  - Compiler assigns some variables just to registers
  - More compact code, since small fields specify registers (compared to memory addresses)
[Figure: memory hierarchy: Processor (Registers), Cache, Memory, Disk]
What Operations are Needed
- Arithmetic and Logical
  - Integer arithmetic: ADD, SUB, MULT, DIV, SHIFT
  - Logical operations: AND, OR, XOR, NOT
- Data Transfer: copy, load, store
- Control: branch, jump, call, return
- Floating Point: ADDF, MULF, DIVF (same as arithmetic, but usually take bigger operands)
- Decimal: ADDD, CONVERT
- String: move, compare, search
- Graphics: pixel and vertex, compression/decompression operations
Stack Architecture: Pros and Cons
Pros
- Good code density (implicit top of stack)
- Low hardware requirements
- Easy to write a simpler compiler for stack architectures
Cons
- Stack becomes the bottleneck
- Little ability for parallelism or pipelining
- Data is not always at the top of the stack when needed, so additional instructions like TOP and SWAP are needed
- Difficult to write an optimizing compiler for stack architectures
Accumulator Architecture: Pros and Cons
Pros
- Very low hardware requirements
- Easy to design and understand
Cons
- Accumulator becomes the bottleneck
- Little ability for parallelism or pipelining
- High memory traffic
Memory-Memory Architecture: Pros and Cons
Pros
- Requires fewer instructions (especially if 3 operands)
- Easy to write compilers for (especially if 3 operands)
Cons
- Very high memory traffic (especially if 3 operands)
- Variable number of clocks per instruction
- With two operands, more data movements are required
Memory-Register Architecture: Pros and Cons
Pros
- Some data can be accessed without loading first
- Instruction format easy to encode
- Good code density
Cons
- Operands are not equivalent (poor orthogonality)
- Variable number of clocks per instruction
- May limit number of registers
Condition Evaluation
- Compare-and-branch can be efficient if the majority of conditions are comparisons with zero
- Remember to focus on the common case
[Figure: condition evaluation frequencies, based on SPEC on MIPS]
Frequency of Types of Comparison
[Figure: frequency of types of comparison; data is based on SPEC on Alpha]
- A different benchmark and machine set new design priorities
- DSPs support a repeat instruction for for-loops (vectors) using registers
Supporting Procedures
Execution of a procedure follows these steps:
- Store parameters in a place accessible to the procedure
- Transfer control to the procedure
- Acquire the storage resources needed for the procedure
- Perform the desired task
- Store the result value in a place accessible to the calling program
- Return control to the point of origin
The hardware provides a program counter to trace instruction flow and manage transfer of control
Parameter Passing
- Registers can be used for passing a small number of parameters
- A stack is used to spill registers of the current context and make room for the called procedure to run, and to allow large parameters to be passed
- Storage of machine state can be performed by caller or callee
- Handling of shared variables is important to ensure correct semantics, and thus requires clear specifications in the library interface
- Global variables stored in registers need careful handling
Type and Size of Operands
- The type of an operand is designated by encoding it in the instruction's operation code
- The type of an operand, e.g. single-precision float, effectively gives its size
- Common operand types include character, half-word and word-size integers, and single- and double-precision floating point
- Characters are almost always in ASCII, integers are in 2's complement, and floating point is in IEEE 754
- The 16-bit Unicode used in Java is gaining popularity due to its support for international character sets
- For business applications, some architectures support a decimal format in binary coded decimal (BCD)
- Depending on the size of the word, the complexity of handling different operand types differs
- DSPs offer fixed-point data types to support high-precision floating-point arithmetic and to allow sharing a single exponent for multiple numbers
- For graphics applications, vertex and pixel operands are added features
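The standard encodings named above can be inspected directly. The following is a minimal sketch using Python's `struct` module; the helper names `twos_complement` and `float_bits` are made up for this illustration.

```python
import struct


def twos_complement(value: int, bits: int) -> int:
    """Bit pattern of a signed integer in two's complement with the given width."""
    return value & ((1 << bits) - 1)


def float_bits(x: float) -> int:
    """Bit pattern of x as an IEEE 754 single-precision (32-bit) value."""
    return struct.unpack(">I", struct.pack(">f", x))[0]


print(hex(twos_complement(-1, 8)))   # an 8-bit integer
print(hex(float_bits(1.0)))          # a single-precision float
print(len(struct.pack(">d", 1.0)))   # a double-precision float occupies 8 bytes
print(ord("A"))                      # the ASCII code of a character
```

Once the type is known, the size follows: the operand type in the opcode tells the hardware both how many bytes to move and how to interpret them.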
Size of Operands
- The double-word data type is used for double-precision floating-point operations and for address storage in machines with a 64-bit-wide address bus
- Words are used for integer operations and for 32-bit address bus machines
- Because of the mix in SPEC, word and double-word data types dominate
[Figure: frequency of reference by size, based on SPEC2000 on Alpha]
Instruction Representation
- Humans are taught to think in base 10 (decimal), but numbers may be represented in any base (123 in base 10 = 1111011 in binary, or base 2)
- Numbers are stored in computers as a series of high and low electronic signals (binary numbers)
- Binary digits are called bits and are considered the atom of computing
- Each piece of an instruction is a number, and placing these numbers together forms the instruction
- Assemblers translate the symbolic assembly instructions into machine language instructions (machine code)
- Example:
    Assembly:                 add $t0, $s1, $s2
    M/C language (decimal):   0       17      18      8       0       32
    M/C language (binary):    000000  10001   10010   01000   00000   100000
                              6 bits  5 bits  5 bits  5 bits  5 bits  6 bits
- Note: the MIPS compiler by default maps $s0..$s7 to registers 16-23 and $t0..$t7 to registers 8-15
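Placing the field values side by side to form one instruction word can be sketched as shifts and ORs. This assumes the standard MIPS register-format layout of 6, 5, 5, 5, 5 and 6 bits; the function name `encode_r_type` is hypothetical.

```python
def encode_r_type(op: int, rs: int, rt: int, rd: int, shamt: int, funct: int) -> int:
    """Pack six fields (6/5/5/5/5/6 bits) into one 32-bit instruction word."""
    fields = [(op, 6), (rs, 5), (rt, 5), (rd, 5), (shamt, 5), (funct, 6)]
    word = 0
    for value, width in fields:
        if not 0 <= value < (1 << width):
            raise ValueError(f"field value {value} does not fit in {width} bits")
        word = (word << width) | value   # append the field on the right
    return word


# add $t0, $s1, $s2 with $s1 = register 17, $s2 = register 18, $t0 = register 8
word = encode_r_type(0, 17, 18, 8, 0, 32)
print(format(word, "032b"), hex(word))
```

Reading the printed bit string in groups of 6, 5, 5, 5, 5 and 6 bits gives back the decimal fields of the example.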
Encoding an Instruction Set
- Instruction encoding affects the size of the compiled program and the complexity of the CPU implementation
- The operation is typically specified in one field, called the opcode
- The addressing mode for the operand can be encoded with the operation, or specified through a separate identifier when a large number of modes is supported
- The architecture must balance several competing factors:
  - Desire to support as many registers and addressing modes as possible
  - Effect of operand specification on the size of the instruction (program)
  - Desire to simplify instruction fetching and decoding during execution
- Fixed-size instruction encoding simplifies the CPU design while limiting the addressing modes supported
- An architect who cares about code size can use variable-size encoding
- A hybrid approach is to allow variability by supporting multiple-sized instructions
Encoding Examples
[Figure: instruction encoding examples]
MIPS Instruction format
Register-format instructions:

    op      rs      rt      rd      shamt   funct
    6 bits  5 bits  5 bits  5 bits  5 bits  6 bits

- op: Basic operation of the instruction, traditionally called the opcode
- rs: The first register source operand
- rt: The second register source operand
- rd: The register destination operand; it gets the result of the operation
- shamt: Shift amount
- funct: This field selects the specific variant of the operation in the op field
Immediate-type instructions:

    op      rs      rt      address
    6 bits  5 bits  5 bits  16 bits

- Some instructions need longer fields than provided, for large-value constants
- The 16-bit address means a load word instruction can load a word within a region of ±2^15 bytes of the address in the base register
- Example: lw $t0, 32($s3)   # Temporary register $t0 gets A[8]

    Instruction  Format  op  rs   rt   rd   shamt  funct  address
    add          R       0   reg  reg  reg  0      32     N/A
    sub          R       0   reg  reg  reg  0      34     N/A
    lw           I       35  reg  reg  N/A  N/A    N/A    address
    sw           I       43  reg  reg  N/A  N/A    N/A    address
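The limited reach of the 16-bit address field can be illustrated with a small encoder for the immediate format. This is a sketch assuming the standard MIPS layout (6-bit op, two 5-bit registers, 16-bit signed offset); `encode_i_type` is a hypothetical helper name.

```python
def encode_i_type(op: int, rs: int, rt: int, offset: int) -> int:
    """Pack op/rs/rt and a signed 16-bit offset into one 32-bit instruction word."""
    if not -(1 << 15) <= offset < (1 << 15):
        raise ValueError("offset does not fit in a signed 16-bit field")
    # Negative offsets are stored in two's complement within the low 16 bits.
    return (op << 26) | (rs << 21) | (rt << 16) | (offset & 0xFFFF)


# lw $t0, 32($s3): opcode 35, base register $s3 = 19, target register $t0 = 8
lw_word = encode_i_type(35, 19, 8, 32)
print(hex(lw_word))
```

Any offset outside the signed 16-bit range is rejected, which is the encoding-level reason a single load can only reach addresses near the base register.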
The Stored Program Concept
- Learning how instructions are represented leads to discovering the secret of computing: the stored-program concept
- Today's computers are built on two key principles:
  - Instructions are represented as numbers
  - Programs can be stored in memory to be read or written just like numbers
- The power of the concept: memory can contain
  - the source code for an editor
  - the compiled m/c code for the editor
  - the text that the compiled program is using
  - the compiler that generated the code
[Figure: a processor connected to a memory that holds an accounting program (machine code), an editor program (machine code), a C compiler (machine code), payroll data, book text, and the C source code for the editor program]
Compiling if-then-else in MIPS
Assuming the five variables f, g, h, i, and j correspond to the five registers $s0 through $s4, what is the compiled MIPS code for the following C if statement?

    if (i == j) f = g + h; else f = g - h;

[Figure: flowchart testing i == j; the i = j path computes f = g + h, the i != j path (Else) computes f = g - h, and both paths join at Exit]
MIPS:

          bne $s3, $s4, Else    # go to Else if i != j
          add $s0, $s1, $s2     # f = g + h (skipped if i != j)
          j Exit                # go to Exit
    Else: sub $s0, $s1, $s2     # f = g - h (skipped if i = j)
    Exit:
Typical Compilation
[Figure: typical compilation]
Major Types of Optimization
(Optimization name: explanation; frequency where given)
High-level (at or near source level; machine independent)
- Procedure integration: replace procedure call by procedure body (N.M.)
Local (within straight-line code)
- Common subexpression elimination: replace two instances of the same computation by a single copy
- Constant propagation: replace all instances of a variable that is assigned a constant with the constant
- Stack height reduction: rearrange the expression tree to minimize resources needed for expression evaluation (N.M.)
Global (across a branch)
- Global common subexpression elimination: same as local, but this version crosses branches
- Copy propagation: replace all instances of a variable A that has been assigned X (i.e., A = X) with X
- Code motion: remove code from a loop that computes the same value each iteration of the loop
- Induction variable elimination: simplify/eliminate array addressing calculations within loops
Machine-dependent (depends on machine knowledge)
- Strength reduction: many examples, such as replacing a multiply by a constant with adds and shifts (N.M.)
- Pipeline scheduling: reorder instructions to improve pipeline performance (N.M.)
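Strength reduction, the replacement of a multiply by a constant with adds and shifts, can be shown with one concrete constant. The choice of 10 and the function name `times_ten` are illustrative assumptions, not taken from the slides.

```python
def times_ten(x: int) -> int:
    """Compute 10 * x without a multiply: 10x = 8x + 2x = (x << 3) + (x << 1)."""
    return (x << 3) + (x << 1)


print(times_ten(7), 10 * 7)
```

A compiler applies this only when it knows that shifts and adds are cheaper than a multiply on the target, which is why the technique is listed as machine-dependent.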
Effect of Compiler Optimization
[Figure: results by program and compiler optimization level]
- Level 0: non-optimized code
- Level 1: local optimization
- Level 2: global optimization, s/w pipelining
- Level 3: adds procedure integration
Compiler Support for Multimedia Instr.
- Intel's MMX and PowerPC AltiVec have small vector processing capabilities targeting multimedia applications (to speed up graphics)
- Intel added a new set of instructions, called Streaming SIMD Extension
- A major advantage of vector computers is hiding the latency of memory access by loading multiple elements and then overlapping execution with data transfer
- Vector computers typically have strided and/or gather/scatter addressing to perform operations on distant memory locations
  - Strided addressing allows memory access in increments larger than one
  - Gather/scatter addressing is similar to register indirect mode, where the addresses are stored instead of the data
- Supporting vector operations without strided addressing, as in Intel's MMX, limits the potential speedup
- Such limited support for vector processing makes the use of vectorizing compiler optimization unpopular and restricts its scope to hand-coded routines
- SIMD instructions on MMX and AltiVec tend to be solutions, not primitives
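The two vector addressing styles mentioned above can be modelled on a plain list standing in for memory. This is a conceptual sketch only; the function names are hypothetical and no real SIMD instruction set is being described.

```python
def strided_load(memory: list, base: int, stride: int, count: int) -> list:
    """Load count elements starting at base, stepping by stride each time."""
    return [memory[base + i * stride] for i in range(count)]


def gather(memory: list, addresses: list) -> list:
    """Load the elements found at an explicit list of addresses."""
    return [memory[a] for a in addresses]


def scatter(memory: list, addresses: list, values: list) -> None:
    """Store each value at the matching address from an explicit list."""
    for a, v in zip(addresses, values):
        memory[a] = v


memory = list(range(100, 120))
print(strided_load(memory, 1, 4, 3))
print(gather(memory, [7, 0, 3]))
```

Strided access covers regular patterns such as one column of a matrix, while gather/scatter handles arbitrary index lists at the cost of storing the addresses themselves.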
Starting a Program
[Figure: C program -> Compiler -> Assembly language program -> Assembler -> Object: machine language module, together with Object: library routine (machine language) -> Linker -> Executable: machine language program -> Loader -> Memory]
Linker
- Place code and data modules symbolically in memory
- Determine the addresses of data and instruction labels
- Patch both internal and external references
Object files for Unix typically contain:
- Header: size and position of components
- Text segment: machine code
- Data segment: static and dynamic variables
- Relocation info: identifies absolute memory references
- Symbol table: name and location of labels, procedures and variables
- Debugging info: mapping of source to object code, break points, etc.
Loading Executable Program
[Figure: memory layout, from top to bottom: $sp -> 7fff ffff hex, Stack; Dynamic data; $gp -> 1000 8000 hex; Static data, starting at 1000 0000 hex; pc -> 0040 0000 hex, Text; Reserved, starting at 0]
To load an executable, the operating system follows these steps:
- Reads the executable file header to determine the size of the text and data segments
- Creates an address space large enough for the text and data
- Copies the instructions and data from the executable file into memory
- Copies the parameters (if any) to the main program onto the stack
- Initializes the machine registers and sets the stack pointer to the first free location
- Jumps to a start-up routine that copies the parameters into the argument registers and calls the main routine of the program
Instruction Set Design Issues
- Number of Addresses
- Flow of Control
- Operand Types
- Addressing Modes
- Instruction Types
- Instruction Formats
Number of Addresses
Four categories
- 3-address machines
  - 2 for the source operands and one for the result
- 2-address machines
  - One address doubles as source and result
- 1-address machines
  - Accumulator machines
  - Accumulator is used for one source and result
- 0-address machines
  - Stack machines
  - Operands are taken from the stack
  - Result goes onto the stack
Number of Addresses (cont.)
Three-address machines
- Two for the source operands, one for the result
- RISC processors use three addresses
- Sample instructions
    add dest,src1,src2     ; M(dest)=[src1]+[src2]
    sub dest,src1,src2     ; M(dest)=[src1]-[src2]
    mult dest,src1,src2    ; M(dest)=[src1]*[src2]
Three addresses: Operand 1, Operand 2, Result
- Example: a = b + c
- Three-address instruction formats are not common, because they require a relatively long instruction format to hold the three address references
Number of Addresses (cont.)
Example
- C statement
    A = B + C * D - E + F + A
- Equivalent code:
    mult T,C,D    ; T = C*D
    add T,T,B     ; T = B+C*D
    sub T,T,E     ; T = B+C*D-E
    add T,T,F     ; T = B+C*D-E+F
    add A,T,A     ; A = B+C*D-E+F+A
Number of Addresses (cont.)
Two-address machines
- One address doubles (for source operand and result)
- Last example makes a case for it
  - Address T is used twice
- Sample instructions
    load dest,src    ; M(dest)=[src]
    add dest,src     ; M(dest)=[dest]+[src]
    sub dest,src     ; M(dest)=[dest]-[src]
    mult dest,src    ; M(dest)=[dest]*[src]
Two addresses: one address doubles as operand and result
- Example: a = a + b
- The two-address format reduces the space requirement but also introduces some awkwardness. To avoid altering the value of an operand, a MOVE instruction is used to move one of the values to a result or temporary location before performing the operation.
Number of Addresses (cont.)
Example
- C statement
    A = B + C * D - E + F + A
- Equivalent code:
    load T,C    ; T = C
    mult T,D    ; T = C*D
    add T,B     ; T = B+C*D
    sub T,E     ; T = B+C*D-E
    add T,F     ; T = B+C*D-E+F
    add A,T     ; A = B+C*D-E+F+A
Number of Addresses (cont.)
One-address machines
- Use a special set of registers called accumulators
  - Specify one source operand and receive the result
- Called accumulator machines
- Sample instructions
    load addr     ; accum = [addr]
    store addr    ; M[addr] = accum
    add addr      ; accum = accum + [addr]
    sub addr      ; accum = accum - [addr]
    mult addr     ; accum = accum * [addr]
Number of Addresses (cont.)
Example
- C statement
    A = B + C * D - E + F + A
- Equivalent code:
    load C     ; load C into accum
    mult D     ; accum = C*D
    add B      ; accum = C*D+B
    sub E      ; accum = B+C*D-E
    add F      ; accum = B+C*D-E+F
    add A      ; accum = B+C*D-E+F+A
    store A    ; store accum contents in A
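The one-address code can be checked by running it on a tiny interpreter. This is a minimal sketch: the instruction names follow the sample instructions, while the interpreter, its name `run_accumulator_program`, and the memory values are made up for the illustration.

```python
def run_accumulator_program(program: list, memory: dict) -> dict:
    """Execute one-address instructions; the accumulator is the implicit operand."""
    accum = 0
    for line in program:
        op, addr = line.split()
        if op == "load":
            accum = memory[addr]
        elif op == "store":
            memory[addr] = accum
        elif op == "add":
            accum += memory[addr]
        elif op == "sub":
            accum -= memory[addr]
        elif op == "mult":
            accum *= memory[addr]
        else:
            raise ValueError(f"unknown instruction: {line}")
    return memory


ACCUM_PROGRAM = ["load C", "mult D", "add B", "sub E", "add F", "add A", "store A"]
values = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
expected = values["B"] + values["C"] * values["D"] - values["E"] + values["F"] + values["A"]
result = run_accumulator_program(ACCUM_PROGRAM, dict(values))
print(result["A"], expected)
```

Every instruction touches memory once, which illustrates the high memory traffic that accumulator designs are known for.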
Number of Addresses (cont.)
Zero-address machines
- Stack supplies operands and receives the result
  - Special instructions to load and store use an address
- Called stack machines (Ex: HP3000, Burroughs B5500)
- Sample instructions
    push addr    ; push([addr])
    pop addr     ; pop([addr])
    add          ; push(pop + pop)
    sub          ; push(pop - pop)
    mult         ; push(pop * pop)
Number of Addresses (cont.)
Example
- C statement
    A = B + C * D - E + F + A
- Equivalent code (read down the left column, then the right):
    push E       sub
    push C       push F
    push D       add
    mult         push A
    push B       add
    add          pop A
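The zero-address sequence can be traced with a small stack interpreter. This is a sketch under one stated assumption: for a binary operation the first value popped is treated as the left operand, matching the push(pop - pop) notation of the sample instructions. The interpreter and the memory values are illustrative only.

```python
def run_stack_program(program: list, memory: dict) -> dict:
    """Execute zero-address instructions; only push and pop name a memory address."""
    binary_ops = {
        "add": lambda x, y: x + y,
        "sub": lambda x, y: x - y,
        "mult": lambda x, y: x * y,
    }
    stack = []
    for line in program:
        op, *operand = line.split()
        if op == "push":
            stack.append(memory[operand[0]])
        elif op == "pop":
            memory[operand[0]] = stack.pop()
        elif op in binary_ops:
            first = stack.pop()    # top of stack
            second = stack.pop()   # next entry down
            stack.append(binary_ops[op](first, second))
        else:
            raise ValueError(f"unknown instruction: {line}")
    return memory


STACK_PROGRAM = ["push E", "push C", "push D", "mult", "push B", "add",
                 "sub", "push F", "add", "push A", "add", "pop A"]
final = run_stack_program(STACK_PROGRAM, {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6})
print(final["A"])
```

E is pushed first so that it sits below B + C*D when sub executes, which is the kind of operand-ordering work a stack machine pushes onto the compiler.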
Load/Store Architecture
- Instructions expect operands in internal processor registers
- Special LOAD and STORE instructions move data between registers and memory
- RISC uses this architecture
- Reduces instruction length
Load/Store Architecture (cont.)
- Sample instructions
    load Rd,addr      ; Rd = [addr]
    store addr,Rs     ; (addr) = Rs
    add Rd,Rs1,Rs2    ; Rd = Rs1 + Rs2
    sub Rd,Rs1,Rs2    ; Rd = Rs1 - Rs2
    mult Rd,Rs1,Rs2   ; Rd = Rs1 * Rs2
Number of Addresses (cont.)
Example
- C statement
    A = B + C * D - E + F + A
- Equivalent code (read down the left column, then the right):
    load R1,B       mult R2,R2,R3
    load R2,C       add R2,R2,R1
    load R3,D       sub R2,R2,R4
    load R4,E       add R2,R2,R5
    load R5,F       add R2,R2,R6
    load R6,A       store A,R2
Flow of Control
- Default is sequential flow
- Several instructions alter this default execution
  - Branches
    - Unconditional
    - Conditional
    - Delayed branches
  - Procedure calls
    - Delayed procedure calls
Flow of Control (cont.)
Branches
- Unconditional
  - Absolute address
  - PC-relative
    - Target address is specified relative to PC contents
    - Relocatable code
- Example: MIPS
  - Absolute address
      j target
  - PC-relative
      b target
Flow of Control (cont.)
Branches
    Conditional
        - Jump is taken only if the condition is met
    Two types
        - Set-Then-Jump
            * Condition testing is separated from branching
            * Condition code registers are used to convey the condition test result
            * Condition code registers keep a record of the status of the last ALU operation, such as an overflow condition
        - Example: Pentium code
            cmp AX,BX       ; compare AX and BX
            je  target      ; jump if equal
Flow of Control (cont.)
        - Test-and-Jump
            * Single instruction performs condition testing and branching
        - Example: MIPS instruction
            beq Rsrc1,Rsrc2,target
          Jumps to target if Rsrc1 = Rsrc2
Delayed branching
    Control is transferred after executing the instruction that follows the branch instruction
        - This instruction slot is called the delay slot
    Improves efficiency
    Highly pipelined RISC processors support delayed branching
Flow of Control (cont.)
Procedure calls
    Facilitate modular programming
    Require two pieces of information to return
        - End of procedure
            * Pentium uses the ret instruction
            * MIPS uses the jr instruction
        - Return address
            * In a (special) register: MIPS allows any general-purpose register
            * On the stack: Pentium
Flow of Control (cont.)
[Figure: delay slot]
Parameter Passing
Two basic techniques
    Register-based (e.g., PowerPC, MIPS)
        - Internal registers are used
            * Faster
            * Limits the number of parameters
            * Recursive procedure
    Stack-based (e.g., Pentium)
        - Stack is used
            * More general
Operand Types
Instructions support basic data types
    Characters
    Integers
    Floating-point
Instruction overload
    Same instruction for different data types
    Example: Pentium
        mov AL,address      ; loads an 8-bit value
        mov AX,address      ; loads a 16-bit value
        mov EAX,address     ; loads a 32-bit value
Operand Types (cont.)
Separate instructions
    Instructions specify the operand size
    Example: MIPS
        lb Rdest,address    ; loads a byte
        lh Rdest,address    ; loads a halfword (16 bits)
        lw Rdest,address    ; loads a word (32 bits)
        ld Rdest,address    ; loads a doubleword (64 bits)
    Similar instruction: store
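The effect of size-specific loads can be sketched as follows. This is an illustration under stated assumptions, not a model of any particular processor: memory is a Python `bytes` object, byte order is taken to be big-endian, and loaded values are sign-extended by default; the helper name `load` is hypothetical.

```python
def load(memory: bytes, address: int, size: int, signed: bool = True) -> int:
    """Read `size` bytes starting at `address` and interpret them as one integer."""
    chunk = memory[address:address + size]
    if len(chunk) != size:
        raise IndexError("load runs past the end of memory")
    # Assumption: big-endian byte order; signed=True models sign extension.
    return int.from_bytes(chunk, byteorder="big", signed=signed)


memory = bytes([0xFF, 0x80, 0x00, 0x01, 0x12, 0x34, 0x56, 0x78])

byte_value = load(memory, 0, 1)        # like lb: one byte
half_value = load(memory, 0, 2)        # like lh: halfword, 16 bits
word_value = load(memory, 4, 4)        # like lw: word, 32 bits
double_value = load(memory, 0, 8, signed=False)  # like ld: doubleword, 64 bits

print(byte_value, half_value, hex(word_value), hex(double_value))
```

The same address yields different values depending on the operand size, which is why the size must be carried either by the opcode (MIPS) or by the register name (Pentium).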
Addressing Modes
How the operands are specified
    Operands can be in three places
        - Registers
            * Register addressing mode
        - Part of instruction
            * Constant
            * Immediate addressing mode
            * All processors support these two addressing modes
        - Memory
            * Difference between RISC and CISC
            * CISC supports a large variety of addressing modes
            * RISC follows load/store architecture
Instruction Types
Several types of instructions
    Data movement
        - Pentium: mov dest,src
        - Some do not provide direct data movement instructions
        - Indirect data movement
            add Rdest,Rsrc,0    ; Rdest = Rsrc+0
    Arithmetic and Logical
        - Arithmetic
            * Integer and floating-point, signed and unsigned
            * add, subtract, multiply, divide
        - Logical
            * and, or, not, xor
Instruction Types (cont.)
Condition code bits
    S: Sign bit (0 = +, 1 = -)
    Z: Zero bit (0 = nonzero, 1 = zero)
    O: Overflow bit (0 = no overflow, 1 = overflow)
    C: Carry bit (0 = no carry, 1 = carry)
Example: Pentium
    cmp count,25    ; compare count to 25
                    ; (subtract 25 from count)
    je  target      ; jump if equal
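How a compare sets the four condition code bits can be sketched in a few lines. This is a minimal model, assuming 32-bit operands and assuming that for a subtraction the carry bit records a borrow (as on x86); the function name `compare` is hypothetical.

```python
MASK = 0xFFFFFFFF  # assumed 32-bit operand width


def compare(a: int, b: int) -> dict:
    """Subtract b from a, discard the result, and return the condition code bits."""
    a &= MASK
    b &= MASK
    result = (a - b) & MASK
    return {
        "S": result >> 31,                        # sign of the result
        "Z": int(result == 0),                    # result is zero
        "O": ((a ^ b) & (a ^ result)) >> 31,      # signed overflow
        "C": int(a < b),                          # borrow out (assumed carry convention)
    }


def jump_if_equal(flags: dict) -> bool:
    """A je-style branch is taken when the zero bit is set."""
    return flags["Z"] == 1


print(compare(25, 25), jump_if_equal(compare(25, 25)))
print(compare(5, 25), jump_if_equal(compare(5, 25)))
```

The branch instruction never sees the operands; it only inspects the bits left behind by the compare, which is the separation that set-then-jump relies on.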
Instruction Types (cont.)
Flow control and I/O instructions
    - Branch
    - Procedure call
    - Interrupts
I/O instructions
    - Memory-mapped I/O
        * Most processors support memory-mapped I/O
        * No separate instructions for I/O
    - Isolated I/O
        * Pentium supports isolated I/O
        * Separate I/O instructions
            in  AX,io_port      ; read from an I/O port
            out io_port,AX      ; write to an I/O port
Instruction Formats
Two types
    Fixed-length
        - Used by RISC processors
        - 32-bit RISC processors use 32-bit-wide instructions
            * Examples: SPARC, MIPS, PowerPC
    Variable-length
        - Used by CISC processors
        - Memory operands need more bits to specify
Opcode
    Major and exact operation
Examples of Instruction Formats
[Figure: examples of instruction formats]
RISC (Reduced Instruction Set Computer)
versus
CISC (Complex Instruction Set Computer)
RISC vs. CISC
The underlying philosophy of RISC machines is that a system is better able to manage program execution when the program consists of only a few different instructions that are the same length and require the same number of clock cycles to decode and execute.
RISC systems access memory only with explicit load and store instructions.
In CISC systems, many different kinds of instructions access memory, making instruction length variable and fetch-decode-execute time unpredictable.
RISC vs. CISC
The difference between CISC and RISC becomes evident through the basic computer performance equation:
    time/program = (time/cycle) × (cycles/instruction) × (instructions/program)
RISC systems shorten execution time by reducing the clock cycles per instruction.
CISC systems improve performance by reducing the number of instructions per program.
RISC vs. CISC
The simple instruction set of RISC machines enables control units to be hardwired for maximum speed.
The more complex, and variable, instruction set of CISC machines requires microcode-based control units that interpret instructions as they are fetched from memory. This translation takes time.
With fixed-length instructions, RISC lends itself to pipelining and speculative execution.
RISC vs. CISC
Consider the following two program fragments:
    CISC                RISC
    mov ax, 10          mov ax, 0
    mov bx, 5           mov bx, 10
    mul bx, ax          mov cx, 5
                        Begin: add ax, bx
                               loop Begin
The total clock cycles for the CISC version might be:
    (2 movs × 1 cycle) + (1 mul × 30 cycles) = 32 cycles
While the clock cycles for the RISC version are:
    (3 movs × 1 cycle) + (5 adds × 1 cycle) + (5 loops × 1 cycle) = 13 cycles
With the RISC clock cycle also being shorter, RISC gives us much faster execution speeds.
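The cycle comparison above is simple arithmetic, and it can be reproduced with a short sketch. The per-instruction costs below are illustrative assumptions (1 cycle for mov, add and loop; 30 cycles for mul; a loop body executed 5 times), not measurements of any real processor, and the names `CYCLES` and `total_cycles` are hypothetical.

```python
# Assumed cycle cost of each instruction type (illustrative values only).
CYCLES = {"mov": 1, "add": 1, "loop": 1, "mul": 30}


def total_cycles(instruction_counts: dict) -> int:
    """Sum cycles over the number of times each instruction type executes."""
    return sum(CYCLES[name] * count for name, count in instruction_counts.items())


# CISC fragment: two movs and one multi-cycle mul.
cisc = total_cycles({"mov": 2, "mul": 1})

# RISC fragment: three movs, then add and loop each executed five times.
risc = total_cycles({"mov": 3, "add": 5, "loop": 5})

print(cisc, risc)
```

The RISC fragment executes more instructions but fewer cycles in total; whether that holds in practice depends on the real cost of the multiply and on the loop count.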
RISC vs. CISC
Because of their load-store ISAs, RISC architectures require a large number of CPU registers.
These registers provide fast access to data during sequential program execution.
They can also be employed to reduce the overhead typically caused by passing parameters to subprograms.
Instead of pulling parameters off of a stack, the subprogram is directed to use a subset of registers.
RISC vs. CISC
This is how registers can be overlapped in a RISC system.
The current window pointer (CWP) points to the active register window.
RISC vs. CISC
It is becoming increasingly difficult to distinguish RISC architectures from CISC architectures.
Some RISC systems provide more extravagant instruction sets than some CISC systems.
Some systems combine both approaches.
The following two slides summarize the characteristics that traditionally typify the differences between these two architectures.
RISC vs. CISC
    RISC                                            CISC
    Multiple register sets.                         Single register set.
    Three operands per instruction.                 One or two register operands per instruction.
    Parameter passing through register windows.     Parameter passing through memory.
    Single-cycle instructions.                      Multiple-cycle instructions.
    Hardwired control.                              Microprogrammed control.
    Highly pipelined.                               Less pipelined.
Continued...
RISC vs. CISC
    RISC                                            CISC
    Simple instructions, few in number.             Many complex instructions.
    Fixed-length instructions.                      Variable-length instructions.
    Complexity in compiler.                         Complexity in microcode.
    Only LOAD/STORE instructions access memory.     Many instructions can access memory.
    Few addressing modes.                           Many addressing modes.
Summary
Instruction Set Design Issues
Instruction set design issues include:
    Where are operands stored?
        - registers, memory, stack, accumulator
    How many explicit operands are there?
        - 0, 1, 2, or 3
    How is the operand location specified?
        - register, immediate, indirect, ...
    What type and size of operands are supported?
        - byte, int, float, double, string, vector, ...
    What operations are supported?
        - add, sub, mul, move, compare, ...
More About General Purpose Registers
Why do almost all new architectures use GPRs?
    Registers are much faster than memory (even cache)
        - Register values are available immediately
        - When memory isn't ready, the processor must wait ("stall")
    Registers are convenient for variable storage
        - Compiler assigns some variables just to registers
        - More compact code, since small fields specify registers (compared to memory addresses)
[Figure: memory hierarchy: processor, registers, cache, memory, disk]
What Operations are Needed
Arithmetic and Logical
    Integer arithmetic: ADD, SUB, MULT, DIV, SHIFT
    Logical operation: AND, OR, XOR, NOT
Data Transfer - copy, load, store
Control - branch, jump, call, return
Floating Point: ADDF, MULF, DIVF (same as arithmetic, but usually take bigger operands)
Decimal - ADDD, CONVERT
String - move, compare, search
Graphics - pixel and vertex, compression/decompression operations
Stack Architecture: Pros and Cons
Pros
    Good code density (implicit top of stack)
    Low hardware requirements
    Easy to write a simpler compiler for stack architectures
Cons
    Stack becomes the bottleneck
    Little ability for parallelism or pipelining
    Data is not always at the top of the stack when needed, so additional instructions like TOP and SWAP are needed
    Difficult to write an optimizing compiler for stack architectures
Accumulator Architecture: Pros and Cons
Pros
    * Very low hardware requirements
    * Easy to design and understand
Cons
    * Accumulator becomes the bottleneck
    * Little ability for parallelism or pipelining
    * High memory traffic
Memory-Memory Architecture: Pros and Cons
Pros
    * Requires fewer instructions (especially if 3 operands)
    * Easy to write compilers for (especially if 3 operands)
Cons
    * Very high memory traffic (especially if 3 operands)
    * Variable number of clocks per instruction
    * With two operands, more data movements are required
Memory-Register Architecture: Pros and Cons
Pros
    * Some data can be accessed without loading first
    * Instruction format easy to encode
    * Good code density
Cons
    * Operands are not equivalent (poor orthogonality)
    * Variable number of clocks per instruction
    * May limit number of registers
Frequency of Types of Comparison
[Figure: frequency of types of comparison; data is based on SPEC2000 on Alpha]
Different benchmarks and machines set new design priorities
DSPs support a repeat instruction for for-loops (vectors) using registers
Supporting Procedures
Execution of a procedure follows the following steps:
    Store parameters in a place accessible to the procedure
    Transfer control to the procedure
    Acquire the storage resources needed for the procedure
    Perform the desired task
    Store the result value in a place accessible to the calling program
    Return control to the point of origin
The hardware provides a program counter to trace instruction flow and manage transfer of control
Parameter Passing
    Registers can be used for passing a small number of parameters
    A stack is used to spill registers of the current context and make room for the called procedure to run, and to allow for large parameters to be passed
    Storage of machine state can be performed by caller or callee
    Handling of shared variables is important to ensure correct semantics and thus requires clear specifications in the library interface
    Global variables stored in registers need careful handling
Type and Size of Operands
The type of an operand is designated by encoding it in the instruction's operation code
The type of an operand, e.g. single precision float, effectively gives its size
Common operand types include character, half word and word size integer, single- and double-precision floating point
Characters are almost always in ASCII, integers are in 2's complement, and floating point is in IEEE 754
The 16-bit Unicode, used in Java, is gaining popularity due to its support for the international character sets
For business applications, some architectures support a decimal format in binary coded decimal (BCD)
Depending on the size of the word, the complexity of handling different operand types differs
DSP offers fixed point data types to support high precision floating point arithmetic and to allow sharing a single exponent for multiple numbers
For graphics applications, vertex and pixel operands are added features
Size of Operands
Double-word data type is used for double-precision floating point operations and address storage in machines with a 64-bit wide address bus
Words are used for integer operations and for 32-bit address bus machines
Because of the mix in SPEC, word and double-word data types dominate
[Figure: frequency of reference by size, based on SPEC2000 on Alpha]
Instruction Representation
Humans are taught to think in base 10 (decimal), but numbers may be represented in any base (123 in base 10 = 1111011 in binary, or base 2)
Numbers are stored in computers as a series of high and low electronic signals (binary numbers)
Binary digits are called bits and are considered the atom of computing
Each piece of an instruction is a number, and placing these numbers together forms the instruction
The assembler translates the symbolic assembly instructions into machine language instructions (machine code)
Example:
    Assembly:                add $t0, $s1, $s2
    M/C language (decimal):  0       17      18      8       0       32
    M/C language (binary):   000000  10001   10010   01000   00000   100000
                             6 bits  5 bits  5 bits  5 bits  5 bits  6 bits
Note: the MIPS compiler by default maps $s0,...,$s7 to registers 16-23 and $t0,...,$t7 to registers 8-15
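The translation from assembly fields to a machine word can be sketched as a small encoder. This is a minimal illustration, assuming the register mapping in the note above ($s0-$s7 to 16-23, $t0-$t7 to 8-15) and the R-format field widths 6/5/5/5/5/6; the function name `encode_r_format` and the tables are hypothetical helpers, not part of any real assembler.

```python
# Register numbers under the default mapping assumed in the note above.
REGISTERS = {f"$s{i}": 16 + i for i in range(8)}
REGISTERS.update({f"$t{i}": 8 + i for i in range(8)})

# Function codes for the two R-format instructions used in this chapter.
FUNCT = {"add": 32, "sub": 34}

FIELD_WIDTHS = (6, 5, 5, 5, 5, 6)  # op, rs, rt, rd, shamt, funct


def encode_r_format(mnemonic, rd, rs, rt, shamt=0):
    """Return the six decimal fields and the packed 32-bit word."""
    fields = (0, REGISTERS[rs], REGISTERS[rt], REGISTERS[rd], shamt, FUNCT[mnemonic])
    word = 0
    for value, width in zip(fields, FIELD_WIDTHS):
        if not 0 <= value < (1 << width):
            raise ValueError("field does not fit in its width")
        word = (word << width) | value   # append the field to the right
    return fields, word


fields, word = encode_r_format("add", "$t0", "$s1", "$s2")
print(fields)
print(f"{word:032b}")
```

Note that the destination register comes first in assembly (`add $t0, $s1, $s2`) but sits in the fourth field (rd) of the machine word.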
Encoding an Instruction Set
Instruction encoding affects the size of the compiled program and the complexity of the CPU implementation
The operation is typically specified in one field, called the opcode
The addressing mode for the operand can be encoded with the operation, or specified through a separate identifier in case of a large number of supported modes
The architecture must balance between several competing factors:
    Desire to support as many registers and addressing modes as possible
    Effect of operand specification on the size of the instruction (program)
    Desire to simplify instruction fetching and decoding during execution
Fixed size instruction encoding simplifies the CPU design while limiting the addressing modes supported
An architect caring about the code size can use variable size encoding
A hybrid approach is to allow variability by supporting multiple-sized instructions
Encoding Examples
[Figure: encoding examples]
MIPS Instruction format
Register-format instructions:
    op      rs      rt      rd      shamt   funct
    6 bits  5 bits  5 bits  5 bits  5 bits  6 bits
    op: Basic operation of the instruction, traditionally called the opcode
    rs: The first register source operand
    rt: The second register source operand
    rd: The register destination operand; it gets the result of the operation
    shamt: Shift amount
    funct: This field selects the specific variant of the operation of the op field
Immediate-type instructions:
    op      rs      rt      address
    6 bits  5 bits  5 bits  16 bits
    Some instructions need longer fields than provided, for large value constants
    The 16-bit address means a load word instruction can load a word within a region of ±2^15 bytes of the address in the base register
    Example: lw $t0, 32($s3)    # Temporary register $t0 gets A[8]
    Instruction  Format  op   rs   rt   rd   shamt  funct  address
    add          R       0    reg  reg  reg  0      32     N/A
    sub          R       0    reg  reg  reg  0      34     N/A
    lw           I       35   reg  reg  N/A  N/A    N/A    address
    sw           I       43   reg  reg  N/A  N/A    N/A    address
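The immediate format and its 16-bit offset limit can be sketched the same way. This is a minimal illustration assuming the standard MIPS opcode 35 for lw and the default register numbers ($s3 = 19, $t0 = 8); the function name `encode_i_format` and the constants are hypothetical.

```python
LW_OPCODE = 35   # opcode assumed for load word
S3 = 19          # $s3 under the default $s0-$s7 -> 16-23 mapping
T0 = 8           # $t0 under the default $t0-$t7 -> 8-15 mapping


def encode_i_format(opcode: int, rs: int, rt: int, offset: int) -> int:
    """Pack op (6 bits), rs (5), rt (5) and a signed 16-bit offset into one word."""
    if not -(1 << 15) <= offset < (1 << 15):
        raise ValueError("offset does not fit in a signed 16-bit field")
    # The offset is stored in two's complement in the low 16 bits.
    return (opcode << 26) | (rs << 21) | (rt << 16) | (offset & 0xFFFF)


# lw $t0, 32($s3): element 8 of an array of 4-byte words starts 32 bytes past the base.
word = encode_i_format(LW_OPCODE, S3, T0, 8 * 4)
print(hex(word))
```

An offset outside the signed 16-bit range cannot be encoded in a single I-format instruction, which is the limit the text describes.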
The Stored Program Concept
Learning how instructions are represented leads to discovering the secret of computing: the stored-program concept
Today's computers are built on two key principles:
    Instructions are represented as numbers
    Programs can be stored in memory to be read or written just like numbers
The power of the concept: memory can contain
    the source code for an editor
    the compiled m/c code for the editor
    the text that the compiled program is using
    the compiler that generated the code
[Figure: a processor connected to a memory holding an accounting program (machine code), an editor program (machine code), a C compiler (machine code), payroll data, book text, and source code in C for the editor program]
Compiling if-then-else in MIPS
Assuming the five variables f, g, h, i, and j correspond to the five registers $s0 through $s4, what is the compiler's code for the following C if statement?
    if (i == j) f = g + h; else f = g - h;
[Figure: flowchart testing i == j; the i = j path computes f = g + h, the i ≠ j path (Else:) computes f = g - h, and both paths meet at Exit:]
MIPS code:
          bne $s3, $s4, Else    # go to Else if i ≠ j
          add $s0, $s1, $s2     # f = g + h (skipped if i ≠ j)
          j   Exit
    Else: sub $s0, $s1, $s2     # f = g - h (skipped if i = j)
    Exit:
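The control flow of the compiled if-then-else can be traced with a toy interpreter. This is a sketch under stated assumptions: it understands only bne, j, add and sub, ignores branch delay slots, and treats registers as a plain dictionary; the function name `run` is hypothetical.

```python
def run(program, registers):
    """Interpret a tiny subset of MIPS (bne, j, add, sub) with labels."""
    labels, instructions = {}, []
    for line in program:
        line = line.split("#")[0].strip()          # drop comments
        while ":" in line:                         # record label positions
            label, line = line.split(":", 1)
            labels[label.strip()] = len(instructions)
            line = line.strip()
        if line:
            instructions.append(line.replace(",", " ").split())

    pc = 0
    while pc < len(instructions):
        op, *args = instructions[pc]
        pc += 1                                    # default: sequential flow
        if op == "bne":
            if registers[args[0]] != registers[args[1]]:
                pc = labels[args[2]]
        elif op == "j":
            pc = labels[args[0]]
        elif op == "add":
            registers[args[0]] = registers[args[1]] + registers[args[2]]
        elif op == "sub":
            registers[args[0]] = registers[args[1]] - registers[args[2]]
        else:
            raise ValueError(f"unsupported instruction: {op}")
    return registers


PROGRAM = [
    "bne $s3, $s4, Else    # go to Else if i != j",
    "add $s0, $s1, $s2     # f = g + h",
    "j Exit",
    "Else: sub $s0, $s1, $s2   # f = g - h",
    "Exit:",
]

equal = run(PROGRAM, {"$s0": 0, "$s1": 7, "$s2": 3, "$s3": 1, "$s4": 1})
different = run(PROGRAM, {"$s0": 0, "$s1": 7, "$s2": 3, "$s3": 1, "$s4": 2})
print(equal["$s0"], different["$s0"])
```

The compiler tests the opposite condition (bne rather than beq) so that the then-part can follow the branch directly and only the else-part needs a label.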
Typical Compilation
[Figure: typical compilation]
Major Types of Optimization
Optimization name: explanation (frequency)
High-level (at or near source level, machine independent)
    Procedure integration: Replace procedure call by procedure body (N.M.)
Local (within straight-line code)
    Common subexpression elimination: Replace two instances of the same computation by a single copy
    Constant propagation: Replace all instances of a variable that is assigned a constant with the constant
    Stack height reduction: Rearrange expression tree to minimize resources needed for expression evaluation (N.M.)
Global (across a branch)
    Global common subexpression elimination: Same as local, but this version crosses branches
    Copy propagation: Replace all instances of a variable A that has been assigned X (i.e., A = X) with X
    Code motion: Remove code from a loop that computes the same value each iteration of the loop
    Induction variable elimination: Simplify/eliminate array-addressing calculations within loops
Machine-dependent (depends on machine knowledge)
    Strength reduction: Many examples, such as replacing multiply by a constant with adds and shifts (N.M.)
    Pipeline scheduling: Reorder instructions to improve pipeline performance (N.M.)
Effect of Compiler Optimization
Measurements taken on S
[Figure: effect of optimization by program and compiler optimization level]
Level 0: non-optimized code
Level 1: local optimization
Level 2: global optimization, s/w pipelining
Level 3: adds procedure integration
Compiler Support for Multimedia Instr.
Intel's MMX and PowerPC AltiVec have small vector processing capabilities targeting multimedia applications (to speed up graphics)
Intel added a new set of instructions, called Streaming SIMD Extension
A major advantage of vector computers is hiding the latency of memory access by loading multiple elements and then overlapping execution with data transfer
Vector computers typically have strided and/or gather/scatter addressing to perform operations on distant memory locations
    Strided addressing allows memory access in increments larger than one
    Gather/scatter addressing is similar to register indirect mode, where the addresses are stored instead of the data
Supporting vector operations without strided addressing, as in Intel's MMX, limits the potential speedup
Such limited support for vector processing makes the use of vectorizing compiler optimization unpopular and restricts its scope to hand-coded routines
SIMD instructions on MMX and AltiVec tend to be solutions, not primitives
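The two vector addressing styles described above can be illustrated with plain Python lists. This is a conceptual sketch only: memory is modeled as a list, and the function names `strided_load`, `gather_load` and `scatter_store` are hypothetical, not instructions of MMX, AltiVec or any other real extension.

```python
def strided_load(memory, base, stride, count):
    """Load `count` elements starting at `base`, stepping by `stride` each time."""
    return [memory[base + i * stride] for i in range(count)]


def gather_load(memory, index_vector):
    """Load the elements whose addresses are listed in `index_vector`."""
    return [memory[address] for address in index_vector]


def scatter_store(memory, index_vector, values):
    """Store each value at the address given by the matching index entry."""
    for address, value in zip(index_vector, values):
        memory[address] = value


# A 4x4 matrix stored row by row: element (row, col) lives at row * 4 + col.
memory = list(range(100, 116))

column_1 = strided_load(memory, base=1, stride=4, count=4)   # one matrix column
diagonal = gather_load(memory, [0, 5, 10, 15])               # arbitrary addresses
scatter_store(memory, [0, 5, 10, 15], [0, 0, 0, 0])          # clear the diagonal

print(column_1, diagonal, memory)
```

A unit-stride-only vector unit could fetch a row in one operation but would need four separate accesses for the column, which is the speedup limit the text refers to.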
Starting a Program
[Figure: C program → Compiler → Assembly language program → Assembler → Object: machine language module, together with Object: library routine (machine language) → Linker → Executable: machine language program → Loader → Memory]
Linker
    - Place code and data modules symbolically in memory
    - Determine the address of data and instruction labels
    - Patch both internal and external references
Object files for Unix typically contain:
    Header: size and position of components
    Text segment: machine code
    Data segment: static and dynamic variables
    Relocation info: identify absolute memory references
    Symbol table: name and location of labels, procedures and variables
    Debugging info: mapping source to object code, break points, etc.
Loading Executable Program
[Figure: memory layout, from high to low addresses]
    $sp → 7fff ffff hex    Stack
                           Dynamic data
    $gp → 1000 8000 hex    Static data
          1000 0000 hex
                           Text
    pc  → 0040 0000 hex
                           Reserved
          0
To load an executable, the operating system follows these steps:
    Reads the executable file header to determine the size of the text and data segments
    Creates an address space large enough for the text and data
    Copies the instructions and data from the executable file into memory
    Copies the parameters (if any) to the main program onto the stack
    Initializes the machine registers and sets the stack pointer to the first free location
    Jumps to a start-up routine that copies the parameters into the argument registers and calls the main routine of the program
Instruction Set Design Issues
    Number of Addresses
    Flow of Control
    Operand Types
    Addressing Modes
    Instruction Types
    Instruction Formats
Number of Addresses
Four categories
    3-address machines
        - 2 for the source operands and one for the result
    2-address machines
        - One address doubles as source and result
    1-address machines
        - Accumulator machines
        - Accumulator is used for one source and result
    0-address machines
        - Stack machines
        - Operands are taken from the stack
        - Result goes onto the stack
Number of Addresses (cont.)
Three-address machines
    Two for the source operands, one for the result
    RISC processors use three addresses
    Sample instructions
        add  dest,src1,src2     M(dest) = [src1] + [src2]
        sub  dest,src1,src2     M(dest) = [src1] - [src2]
        mult dest,src1,src2     M(dest) = [src1] * [src2]
Three addresses: Operand 1, Operand 2, Result
    Example: a = b + c
    Three-address instruction formats are not common, because they require a relatively long instruction format to hold the three address references.
Number of Addresses (cont.)
Example
C statement
    A = B + C * D - E + F + A
Equivalent code:
    mult T,C,D      ; T = C*D
    add  T,T,B      ; T = B+C*D
    sub  T,T,E      ; T = B+C*D-E
    add  T,T,F      ; T = B+C*D-E+F
    add  A,T,A      ; A = B+C*D-E+F+A
Number of Addresses (cont.)
Two-address machines
    One address doubles (for source operand and result)
    Last example makes a case for it
        - Address T is used twice
    Sample instructions
        load dest,src       M(dest) = [src]
        add  dest,src       M(dest) = [dest] + [src]
        sub  dest,src       M(dest) = [dest] - [src]
        mult dest,src       M(dest) = [dest] * [src]
Two addresses: one address doubles as operand and result
    Example: a = a + b
    The two-address format reduces the space requirement but also introduces some awkwardness. To avoid altering the value of an operand, a MOVE instruction is used to move one of the values to a result or temporary location before performing the operation.
Number of Addresses (cont.)
Example
C statement
    A = B + C * D - E + F + A
Equivalent code:
    load T,C        ; T = C
    mult T,D        ; T = C*D
    add  T,B        ; T = B+C*D
    sub  T,E        ; T = B+C*D-E
    add  T,F        ; T = B+C*D-E+F
    add  A,T        ; A = B+C*D-E+F+A
Number of Addresses (cont.)
One-address machines
    Use a special set of registers called accumulators
        - Specify one source operand and receive the result
    Called accumulator machines
    Sample instructions
        load  addr      accum = [addr]
        store addr      M[addr] = accum
        add   addr      accum = accum + [addr]
        sub   addr      accum = accum - [addr]
        mult  addr      accum = accum * [addr]
Number of Addresses (cont.)
Example
C statement
    A = B + C * D - E + F + A
Equivalent code:
    load  C         ; load C into accum
    mult  D         ; accum = C*D
    add   B         ; accum = C*D+B
    sub   E         ; accum = B+C*D-E
    add   F         ; accum = B+C*D-E+F
    add   A         ; accum = B+C*D-E+F+A
    store A         ; store accum contents in A
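The one-address sequence above can be traced with a small accumulator-machine sketch. It assumes a single accumulator and a name-to-value memory; the function name `run_accumulator_program` and the sample values are hypothetical. The sketch also counts memory references, since every instruction here names a memory operand.

```python
def run_accumulator_program(program, memory):
    """Execute a one-address program; return final memory and memory reference count."""
    accum = 0
    memory_references = 0
    for line in program:
        op, addr = line.split()
        memory_references += 1          # each instruction touches memory once
        if op == "load":
            accum = memory[addr]
        elif op == "store":
            memory[addr] = accum
        elif op == "add":
            accum = accum + memory[addr]
        elif op == "sub":
            accum = accum - memory[addr]
        elif op == "mult":
            accum = accum * memory[addr]
        else:
            raise ValueError(f"unknown instruction: {op}")
    return memory, memory_references


PROGRAM = ["load C", "mult D", "add B", "sub E", "add F", "add A", "store A"]

memory, references = run_accumulator_program(
    PROGRAM, {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6})
print(memory["A"], references)
```

Seven instructions produce seven data memory references, which illustrates why accumulator machines tend to generate high memory traffic.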
Number of Addresses (cont.)
Zero-address machines
    Stack supplies operands and receives the result
        - Special instructions to load and store use an address
    Called stack machines (Ex: HP3000, Burroughs B5500)
    Sample instructions
        push addr       push([addr])
        pop  addr       pop([addr])
        add             push(pop + pop)
        sub             push(pop - pop)
        mult            push(pop * pop)
Number of Addresses (cont.)
Example
C statement
A = B + C * D - E + F + A
Equivalent code:
push E
push C
push D
mult
push B
add
sub
push F
add
push A
add
pop A
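To check the operand order in the zero-address program, it helps to execute it on a toy stack machine. The sketch below assumes the slide's definitions (`sub` computes the first popped value minus the second popped value); the name `run_stack` and the sample values are hypothetical.

```python
def run_stack(program, memory):
    """Execute zero-address instructions; only push/pop carry an address."""
    stack = []
    for line in program:
        op, *operand = line.split()
        if op == "push":
            stack.append(memory[operand[0]])      # push([addr])
        elif op == "pop":
            memory[operand[0]] = stack.pop()      # [addr] = pop
        elif op in ("add", "sub", "mult"):
            first = stack.pop()                   # top of stack
            second = stack.pop()                  # entry below it
            if op == "add":
                stack.append(first + second)
            elif op == "sub":
                stack.append(first - second)      # push(pop - pop)
            else:
                stack.append(first * second)
        else:
            raise ValueError(f"unknown opcode: {op}")
    return memory


# Zero-address code for A = B + C * D - E + F + A.
PROGRAM = [
    "push E", "push C", "push D", "mult", "push B", "add",
    "sub", "push F", "add", "push A", "add", "pop A",
]

if __name__ == "__main__":
    memory = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
    run_stack(PROGRAM, memory)
    print(memory["A"])
```

E is pushed first so that it sits below B + C*D when `sub` executes; pushing it later would compute E - (B + C*D) instead.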
Load/Store Architecture
Instructions expect operands in internal processor registers
Special LOAD and STORE instructions move data between registers and memory
RISC uses this architecture
Reduces instruction length
Load/Store Architecture (cont.)
Sample instructions
load Rd,addr       ; Rd = [addr]
store addr,Rs      ; (addr) = Rs
add Rd,Rs1,Rs2     ; Rd = Rs1 + Rs2
sub Rd,Rs1,Rs2     ; Rd = Rs1 - Rs2
mult Rd,Rs1,Rs2    ; Rd = Rs1 * Rs2
Number of Addresses (cont.)
Example
C statement
A = B + C * D - E + F + A
Equivalent code:
load R1,B
load R2,C
load R3,D
load R4,E
load R5,F
load R6,A
mult R2,R2,R3
add R2,R2,R1
sub R2,R2,R4
add R2,R2,R5
add R2,R2,R6
store A,R2
Flow of Control
Default is sequential flow
Several instructions alter this default execution
Branches
- Unconditional
- Conditional
- Delayed branches
Procedure calls
- Delayed procedure calls
Flow of Control (cont.)
Branches
Unconditional
- Absolute address
- PC-relative
  * Target address is specified relative to PC contents
  * Relocatable code
Example: MIPS
- Absolute address
    j target
- PC-relative
    b target
Flow of Control (cont.)
Branches
Conditional
- Jump is taken only if the condition is met
Two types
- Set-Then-Jump
  * Condition testing is separated from branching
  * Condition code registers are used to convey the condition test result
  * Condition code registers keep a record of the status of the last ALU operation, such as the overflow condition
- Example: Pentium code
    cmp AX,BX     ; compare AX and BX
    je target     ; jump if equal
Flow of Control (cont.)
- Test-and-Jump
  * Single instruction performs condition testing and branching
- Example: MIPS instruction
    beq Rsrc1,Rsrc2,target
    Jumps to target if Rsrc1 = Rsrc2
Delayed branching
Control is transferred after executing the instruction that follows the branch instruction
- This instruction slot is called the delay slot
Improves efficiency
Highly pipelined RISC processors support delayed branching
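The effect of a delay slot is easiest to see in an execution trace. The following is a minimal sketch of a toy machine in which every branch has exactly one delay slot; the tuple-based program format and the name `run_delayed` are assumptions for illustration and do not model any real instruction set.

```python
def run_delayed(program, max_steps=100):
    """Return the list of executed instruction indices.

    A ("branch", target) instruction does not transfer control at once:
    the instruction in the next slot (the delay slot) executes first.
    """
    trace = []
    pc = 0
    pending_target = None          # set by a branch, used after the delay slot
    for _ in range(max_steps):
        if pc >= len(program):
            break
        op, arg = program[pc]
        trace.append(pc)
        next_pc = pc + 1
        if pending_target is not None:
            # The instruction just executed occupied the delay slot.
            next_pc = pending_target
            pending_target = None
        if op == "branch":
            pending_target = arg
        pc = next_pc
    return trace


PROGRAM = [
    ("op", "before branch"),   # 0
    ("branch", 4),             # 1: delayed branch to index 4
    ("op", "delay slot"),      # 2: still executes
    ("op", "skipped"),         # 3: never executes
    ("op", "branch target"),   # 4
]

if __name__ == "__main__":
    print(run_delayed(PROGRAM))
```

Index 2 appears in the trace even though it follows the branch, while index 3 does not. A compiler tries to fill the slot with useful work; otherwise it inserts a no-op.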
Flow of Control (cont.)
Procedure calls
Facilitate modular programming
Require two pieces of information to return
- End of procedure
  * Pentium uses the ret instruction
  * MIPS uses the jr instruction
- Return address
  * In a (special) register
    MIPS allows any general-purpose register
  * On the stack
    Pentium
Flow of Control (cont.)
[Figure: delay slot]
Parameter Passing
Two basic techniques
Register-based (e.g., PowerPC, MIPS)
- Internal registers are used
  * Faster
  * Limit the number of parameters
  * Recursive procedure
Stack-based (e.g., Pentium)
- Stack is used
  * More general
2 Operand Types
Instructions support basic data types
- Characters
- Integers
- Floating-point
Instruction overload
Same instruction for different data types
Example: Pentium
mov AL,address     ; loads an 8-bit value
mov AX,address     ; loads a 16-bit value
mov EAX,address    ; loads a 32-bit value
Operand Types (cont.)
Separate instructions
Instructions specify the operand size
Example: MIPS
lb Rdest,address    ; loads a byte
lh Rdest,address    ; loads a halfword (16 bits)
lw Rdest,address    ; loads a word (32 bits)
ld Rdest,address    ; loads a doubleword (64 bits)
Similar instruction: store
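The size-specific loads above differ only in how many bytes they read from the same address. The sketch below imitates them with Python's standard `struct` module, assuming big-endian byte order and sign extension; the helper name `load` and the sample memory contents are hypothetical.

```python
import struct

# struct format codes for signed big-endian values of each operand size.
FORMATS = {1: ">b", 2: ">h", 4: ">i", 8: ">q"}

MEMORY = bytes([0x80, 0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07])


def load(memory, address, size):
    """Read a signed integer of `size` bytes starting at `address`."""
    return struct.unpack_from(FORMATS[size], memory, address)[0]


if __name__ == "__main__":
    print(load(MEMORY, 0, 1))   # like lb: one byte
    print(load(MEMORY, 0, 2))   # like lh: halfword
    print(load(MEMORY, 4, 4))   # like lw: word
    print(load(MEMORY, 0, 8))   # like ld: doubleword
```

The same starting address yields different values depending on the operand size, which is why the size has to be encoded either in the opcode (MIPS) or in the register name (Pentium).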
3 Addressing Modes
How the operands are specified
Operands can be in three places
- Registers
  * Register addressing mode
- Part of instruction
  * Constant
  * Immediate addressing mode
  * All processors support these two addressing modes
- Memory
  * Difference between RISC and CISC
  * CISC supports a large variety of addressing modes
  * RISC follows load/store architecture
4 Instruction Types
Several types of instructions
Data movement
- Pentium: mov dest,src
- Some do not provide direct data movement instructions
- Indirect data movement
    add Rdest,Rsrc,0    ; Rdest = Rsrc+0
Arithmetic and Logical
- Arithmetic
  * Integer and floating-point, signed and unsigned
  * add, subtract, multiply, divide
- Logical
  * and, or, not, xor
Instruction Types (cont.)
Condition code bits
S: Sign bit (0 = +, 1 = -)
Z: Zero bit (0 = nonzero, 1 = zero)
O: Overflow bit (0 = no overflow, 1 = overflow)
C: Carry bit (0 = no carry, 1 = carry)
Example: Pentium
cmp count,25    ; compare count to 25
                ; subtract 25 from count
je target       ; jump if equal
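A compare instruction sets the condition code bits by performing a subtraction and discarding the result. The sketch below derives the four bits for a fixed word width; the function name `compare`, the default of 32 bits, and the convention that C records an unsigned borrow (as on x86) are assumptions stated for illustration.

```python
def compare(a, b, bits=32):
    """Return the S, Z, O, C bits that result from computing a - b."""
    mask = (1 << bits) - 1
    sign_bit = 1 << (bits - 1)
    a &= mask                      # view both operands as raw bit patterns
    b &= mask
    result = (a - b) & mask
    return {
        "S": int(bool(result & sign_bit)),
        "Z": int(result == 0),
        # Signed overflow: operands differ in sign and the result's sign
        # differs from the sign of the first operand.
        "O": int(bool((a ^ b) & (a ^ result) & sign_bit)),
        # Carry acts as a borrow flag for subtraction.
        "C": int(a < b),
    }


if __name__ == "__main__":
    flags = compare(25, 25)
    print(flags)
    print("jump taken" if flags["Z"] else "fall through")  # je tests Z
```

A following `je` only has to inspect the Z bit, which is what makes the set-then-jump style work with two separate instructions.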
Instruction Types (cont.)
Flow control and I/O instructions
- Branch
- Procedure call
- Interrupts
I/O instructions
- Memory-mapped I/O
  * Most processors support memory-mapped I/O
  * No separate instructions for I/O
- Isolated I/O
  * Pentium supports isolated I/O
  * Separate I/O instructions
    in AX,io_port     ; read from an I/O port
    out io_port,AX    ; write to an I/O port
5 Instruction Formats
Two types
Fixed-length
- Used by RISC processors
- 32-bit RISC processors use 32-bit wide instructions
  * Examples: SPARC, MIPS, PowerPC
Variable-length
- Used by CISC processors
- Memory operands need more bits to specify
Opcode
- Major and exact operation
Examples of Instruction Formats
RISC (Reduced Instruction Set Computer) versus CISC (Complex Instruction Set Computer)
RISC vs CISC
The underlying philosophy of RISC machines is that a system is better able to manage program execution when the program consists of only a few different instructions that are the same length and require the same number of clock cycles to decode and execute.
RISC systems access memory only with explicit load and store instructions.
In CISC systems, many different kinds of instructions access memory, making instruction length variable and fetch-decode-execute time unpredictable.
The difference between CISC and RISC becomes evident through the basic computer performance equation:
time/program = time/cycle × cycles/instruction × instructions/program
RISC systems shorten execution time by reducing the clock cycles per instruction.
CISC systems improve performance by reducing the number of instructions per program.
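The performance equation is a product of three factors, so a design can win by shrinking any one of them. The sketch below evaluates it for two made-up machines; all numbers and the name `execution_time` are hypothetical and chosen only to show the trade-off, not measured values.

```python
def execution_time(instructions, cycles_per_instruction, cycle_time_ns):
    """time/program = instructions/program * cycles/instruction * time/cycle."""
    return instructions * cycles_per_instruction * cycle_time_ns


if __name__ == "__main__":
    # Hypothetical CISC-style machine: fewer instructions, more cycles each.
    cisc_ns = execution_time(instructions=1_000, cycles_per_instruction=6.0,
                             cycle_time_ns=2.0)
    # Hypothetical RISC-style machine: more instructions, about one cycle each.
    risc_ns = execution_time(instructions=3_000, cycles_per_instruction=1.2,
                             cycle_time_ns=1.0)
    print(cisc_ns, risc_ns)
```

With these assumed figures the RISC-style machine executes three times as many instructions yet finishes sooner, because both its cycles per instruction and its cycle time are smaller.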
The simple instruction set of RISC machines enables control units to be hardwired for maximum speed.
The more complex, and variable, instruction set of CISC machines requires microcode-based control units that interpret instructions as they are fetched from memory. This translation takes time.
With fixed-length instructions, RISC lends itself to pipelining and speculative execution.
Consider the program fragments:
CISC:
mov ax, 10
mov bx, 5
mul bx, ax
RISC:
mov ax, 0
mov bx, 10
mov cx, 5
Begin: add ax, bx
loop Begin
The total clock cycles for the CISC version might be:
(2 movs × 1 cycle) + (1 mul × 30 cycles) = 32 cycles
While the clock cycles for the RISC version are:
(3 movs × 1 cycle) + (5 adds × 1 cycle) + (5 loops × 1 cycle) = 13 cycles
With the RISC clock cycle being shorter, RISC gives us much faster execution speeds.
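The cycle totals for the two fragments can be reproduced by counting cycles while executing each one. This sketch assumes one cycle for every simple instruction and 30 cycles for the multiply; both costs, and the function names, are illustrative assumptions rather than figures for a specific processor.

```python
SIMPLE_CYCLES = 1    # assumed cost of mov, add, loop
MUL_CYCLES = 30      # assumed cost of the multi-cycle multiply


def run_cisc():
    """mov ax,10 / mov bx,5 / mul bx,ax"""
    cycles = 0
    ax = 10; cycles += SIMPLE_CYCLES
    bx = 5; cycles += SIMPLE_CYCLES
    bx = bx * ax; cycles += MUL_CYCLES
    return bx, cycles


def run_risc():
    """mov ax,0 / mov bx,10 / mov cx,5 / Begin: add ax,bx / loop Begin"""
    cycles = 0
    ax = 0; cycles += SIMPLE_CYCLES
    bx = 10; cycles += SIMPLE_CYCLES
    cx = 5; cycles += SIMPLE_CYCLES
    while True:
        ax += bx; cycles += SIMPLE_CYCLES      # add ax, bx
        cx -= 1; cycles += SIMPLE_CYCLES       # loop: decrement cx, branch if nonzero
        if cx == 0:
            break
    return ax, cycles


if __name__ == "__main__":
    print(run_cisc())
    print(run_risc())
```

Both fragments compute the same product; the RISC version executes more instructions but needs fewer cycles under these assumptions.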
Because of their load-store ISAs, RISC architectures require a large number of CPU registers.
These registers provide fast access to data during sequential program execution.
They can also be employed to reduce the overhead typically caused by passing parameters to subprograms.
Instead of pulling parameters off of a stack, the subprogram is directed to use a subset of registers.
This is how registers can be overlapped in a RISC system.
The current window pointer (CWP) points to the active register window.
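Overlapped register windows can be modelled as views onto one physical register file. The sketch below assumes a layout in which each window has 8 "in", 8 "local", and 8 "out" registers and a caller's "out" registers are the callee's "in" registers; the sizes, the class name `RegisterWindows`, and the choice to increment the CWP on a call are simplifying assumptions, and window overflow is not handled (the file simply wraps around).

```python
class RegisterWindows:
    """Toy register file with overlapping windows selected by a CWP."""

    def __init__(self, windows=4, shared=8, local=8):
        self.windows = windows
        self.shared = shared                 # size of the in/out overlap
        self.local = local
        self.stride = shared + local         # fresh registers per window
        self.physical = [0] * (windows * self.stride)
        self.cwp = 0                         # current window pointer

    def _index(self, kind, n):
        offsets = {"in": 0, "local": self.shared, "out": self.stride}
        limit = self.local if kind == "local" else self.shared
        if not 0 <= n < limit:
            raise IndexError(f"{kind}{n} is outside the window")
        return (self.cwp * self.stride + offsets[kind] + n) % len(self.physical)

    def read(self, kind, n):
        return self.physical[self._index(kind, n)]

    def write(self, kind, n, value):
        self.physical[self._index(kind, n)] = value

    def call(self):
        """Enter a procedure: slide to the next window."""
        self.cwp = (self.cwp + 1) % self.windows

    def ret(self):
        """Leave a procedure: slide back to the caller's window."""
        self.cwp = (self.cwp - 1) % self.windows


if __name__ == "__main__":
    regs = RegisterWindows()
    regs.write("out", 0, 42)      # caller places a parameter
    regs.call()
    print(regs.read("in", 0))     # callee sees it without any copy
```

Because the parameter never moves, a call passes arguments without touching memory, which is the benefit the slide attributes to register windows.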
It is becoming increasingly difficult to distinguish RISC architectures from CISC architectures.
Some RISC systems provide more extravagant instruction sets than some CISC systems.
Some systems combine both approaches.
The following two slides summarize the characteristics that traditionally typify the differences between these two architectures.
RISC
- Multiple register sets.
- Three operands per instruction.
- Parameter passing through register windows.
- Single-cycle instructions.
- Hardwired control.
- Highly pipelined.
CISC
- Single register set.
- One or two register operands per instruction.
- Parameter passing through memory.
- Multiple-cycle instructions.
- Microprogrammed control.
- Less pipelined.
Continued...
RISC
- Simple instructions, few in number.
- Fixed-length instructions.
- Complexity in compiler.
- Only LOAD/STORE instructions access memory.
- Few addressing modes.
CISC
- Many complex instructions.
- Variable-length instructions.
- Complexity in microcode.
- Many instructions can access memory.
- Many addressing modes.
Summary
Instruction Set Design Issues
Instruction set design issues include:
Where are operands stored?
- registers, memory, stack, accumulator
How many explicit operands are there?
- 0, 1, 2, or 3
How is the operand location specified?
- register, immediate, indirect, ...
What type and size of operands are supported?
- byte, int, float, double, string, vector, ...
What operations are supported?
- add, sub, mul, move, compare, ...
More About General Purpose Registers
Why do almost all new architectures use GPRs?
Registers are much faster than memory (even cache)
- Register values are available immediately
- When memory isn't ready, processor must wait ("stall")
Registers are convenient for variable storage
- Compiler assigns some variables just to registers
- More compact code since small fields specify registers (compared to memory addresses)
[Figure: storage hierarchy showing Processor, Registers, Cache, Memory, Disk]
What Operations are Needed
Arithmetic + Logical
- Integer arithmetic: ADD, SUB, MULT, DIV, SHIFT
- Logical operation: AND, OR, XOR, NOT
Data Transfer - copy, load, store
Control - branch, jump, call, return
Floating Point: ADDF, MULF, DIVF
- Same as arithmetic but usually take bigger operands
Decimal - ADDD, CONVERT
String - move, compare, search
Graphics - pixel and vertex, compression/decompression operations
Stack Architecture Pros and Cons
Pros
- Good code density (implicit top of stack)
- Low hardware requirements
- Easy to write a simpler compiler for stack architectures
Cons
- Stack becomes the bottleneck
- Little ability for parallelism or pipelining
- Data is not always at the top of stack when needed, so additional instructions like TOP and SWAP are needed
- Difficult to write an optimizing compiler for stack architectures
Accumulator Architecture Pros and Cons
Pros
- Very low hardware requirements
- Easy to design and understand
Cons
- Accumulator becomes the bottleneck
- Little ability for parallelism or pipelining
- High memory traffic
Memory-Memory Architecture Pros and Cons
Pros
- Requires fewer instructions (especially if 3 operands)
- Easy to write compilers for (especially if 3 operands)
Cons
- Very high memory traffic (especially if 3 operands)
- Variable number of clocks per instruction
- With two operands, more data movements are required
Memory-Register Architecture Pros and Cons
Pros
- Some data can be accessed without loading first
- Instruction format easy to encode
- Good code density
Cons
- Operands are not equivalent (poor orthogonality)
- Variable number of clocks per instruction
- May limit number of registers
Supporting Procedures
Execution of a procedure follows these steps:
- Store parameters in a place accessible to the procedure
- Transfer control to the procedure
- Acquire the storage resources needed for the procedure
- Perform the desired task
- Store the result value in a place accessible to the calling program
- Return control to the point of origin
The hardware provides a program counter to trace instruction flow and manage transfer of control
Parameter Passing
- Registers can be used for passing a small number of parameters
- A stack is used to spill registers of the current context and make room for the called procedure to run, and to allow large parameters to be passed
- Storage of machine state can be performed by the caller or the callee
- Handling of shared variables is important to ensure correct semantics and thus requires clear specifications in the library interface
Global variables stored in registers need careful handling
Type and Size of Operands
The type of an operand is designated by encoding it in the instruction's operation code
The type of an operand, e.g., single-precision float, effectively gives its size
Common operand types include character, half-word and word-size integer, and single- and double-precision floating point
Characters are almost always in ASCII, integers in 2's complement, and floating point in IEEE 754
The 16-bit Unicode used in Java is gaining popularity due to its support for international character sets
For business applications, some architectures support a decimal format in binary coded decimal (BCD)
Depending on the size of the word, the complexity of handling different operand types differs
DSP offers fixed-point data types to support high-precision floating point arithmetic and to allow sharing a single exponent for multiple numbers
For graphics applications, vertex and pixel operands are added features
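Two of the representations named above, 2's complement integers and binary coded decimal, are easy to show as bit patterns. The helpers below are a sketch with hypothetical names (`twos_complement`, `to_bcd`); packed BCD with one decimal digit per 4 bits is assumed.

```python
def twos_complement(value, bits):
    """Return the `bits`-wide 2's complement bit pattern of `value`."""
    if not -(1 << (bits - 1)) <= value < (1 << (bits - 1)):
        raise ValueError(f"{value} does not fit in {bits} bits")
    return value & ((1 << bits) - 1)


def to_bcd(number):
    """Pack a non-negative integer into BCD, one decimal digit per nibble."""
    if number < 0:
        raise ValueError("this sketch handles non-negative numbers only")
    bcd = 0
    shift = 0
    while True:
        number, digit = divmod(number, 10)
        bcd |= digit << shift
        shift += 4
        if number == 0:
            break
    return bcd


if __name__ == "__main__":
    print(f"{twos_complement(-1, 8):08b}")
    print(hex(to_bcd(1995)))
```

In BCD the hexadecimal rendering of the pattern reads like the decimal number itself, which is why the format suits business applications that must avoid binary rounding of decimal fractions.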
Size of Operands
Double-word data type is used for double-precision floating point operations and address storage in machines with a 64-bit wide address bus
Words are used for integer operations and for 32-bit address bus machines
Because of the mix in SPEC, word and double-word data types dominate
[Figure: frequency of reference by size, based on SPEC2000 on Alpha]
Instruction Representation
Humans are taught to think in base 10 (decimal), but numbers may be represented in any base (123 in base 10 = 1111011 in binary, or base 2)
Numbers are stored in computers as a series of high and low electronic signals (binary numbers)
Binary digits are called bits and are considered the atom of computing
Each piece of an instruction is a number, and placing these numbers together forms the instruction
The assembler translates the symbolic assembly instructions into machine language instructions (machine code)
Example:
Assembly: add $t0, $s1, $s2
M/C language (decimal):
0        17       18       8        0        32
M/C language (binary):
000000   10001    10010    01000    00000    100000
6 bits   5 bits   5 bits   5 bits   5 bits   6 bits
Note: the MIPS compiler by default maps $s0,...,$s7 to registers 16-23 and $t0,...,$t7 to registers 8-15
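Placing the field numbers together is a matter of shifting each field to its position in the 32-bit word. The sketch below packs the six register-format fields (6, 5, 5, 5, 5, 6 bits); the function name `encode_r_type` is hypothetical, while the field values for `add $t0, $s1, $s2` follow the standard MIPS register numbering.

```python
# (field name, width in bits), most significant field first.
R_FIELDS = [("op", 6), ("rs", 5), ("rt", 5), ("rd", 5), ("shamt", 5), ("funct", 6)]


def encode_r_type(op, rs, rt, rd, shamt, funct):
    """Pack the six R-format fields into one 32-bit instruction word."""
    values = {"op": op, "rs": rs, "rt": rt, "rd": rd, "shamt": shamt, "funct": funct}
    word = 0
    for name, width in R_FIELDS:
        value = values[name]
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name}={value} does not fit in {width} bits")
        word = (word << width) | value
    return word


if __name__ == "__main__":
    # add $t0, $s1, $s2  ->  op=0, rs=17 ($s1), rt=18 ($s2), rd=8 ($t0)
    word = encode_r_type(op=0, rs=17, rt=18, rd=8, shamt=0, funct=32)
    print(f"{word:032b}")
    print(f"0x{word:08x}")
```

Note that the destination register comes first in the assembly syntax but is the fourth field in the encoded word.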
Encoding an Instruction Set
Instruction encoding affects the size of the compiled program and the complexity of the CPU implementation
The operation is typically specified in one field called the opcode
The addressing mode for the operand can be encoded with the operation, or specified through a separate identifier when a large number of modes is supported
The architecture must balance several competing factors:
- Desire to support as many registers and addressing modes as possible
- Effect of operand specification on the size of the instruction (program)
- Desire to simplify instruction fetching and decoding during execution
Fixed-size instruction encoding simplifies the CPU design while limiting the addressing modes supported
An architect caring about code size can use variable-size encoding
A hybrid approach is to allow variability by supporting multiple-sized instructions
Encoding Examples
MIPS Instruction format
Register-format instructions
op       rs       rt       rd       shamt    funct
6 bits   5 bits   5 bits   5 bits   5 bits   6 bits
- op: Basic operation of the instruction, traditionally called the opcode
- rs: The first register source operand
- rt: The second register source operand
- rd: The register destination operand; it gets the result of the operation
- shamt: Shift amount
- funct: This field selects the specific variant of the operation in the op field
Immediate-type instructions
op       rs       rt       address
6 bits   5 bits   5 bits   16 bits
Some instructions need longer fields than those provided, for large-value constants
The 16-bit address means a load word instruction can load a word within a region of ±2^15 bytes of the address in the base register
Example: lw $t0, 32($s3)    # Temporary register $t0 gets A[8]
Instruction  Format  op   rs    rt    rd    shamt  funct  address
add          R       0    reg   reg   reg   0      32     N/A
sub          R       0    reg   reg   reg   0      34     N/A
lw           I       35   reg   reg   N/A   N/A    N/A    address
sw           I       43   reg   reg   N/A   N/A    N/A    address
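The immediate format trades the rd, shamt, and funct fields for a single 16-bit signed address field. The sketch below encodes and decodes that format; the function names are hypothetical, and the example values (`op=35` for lw, base register 19 for `$s3`, target register 8 for `$t0`) follow the standard MIPS numbering.

```python
def encode_i_type(op, rs, rt, address):
    """Pack op (6 bits), rs (5), rt (5) and a signed 16-bit address."""
    if not -(1 << 15) <= address < (1 << 15):
        raise ValueError("address offset must fit in 16 signed bits")
    if not (0 <= op < 64 and 0 <= rs < 32 and 0 <= rt < 32):
        raise ValueError("op, rs or rt out of range")
    return (op << 26) | (rs << 21) | (rt << 16) | (address & 0xFFFF)


def decode_i_type(word):
    """Return (op, rs, rt, address) with the address sign-extended."""
    address = word & 0xFFFF
    if address & 0x8000:
        address -= 1 << 16
    return (word >> 26) & 0x3F, (word >> 21) & 0x1F, (word >> 16) & 0x1F, address


if __name__ == "__main__":
    # lw $t0, 32($s3)
    word = encode_i_type(op=35, rs=19, rt=8, address=32)
    print(f"0x{word:08x}")
    print(decode_i_type(word))
```

The range check in `encode_i_type` is the ±2^15-byte limit from the slide: any offset outside it cannot be expressed in a single load instruction.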
The Stored Program Concept
Learning how instructions are represented leads to discovering the secret of computing: the stored-program concept
Today's computers are built on two key principles:
- Instructions are represented as numbers
- Programs can be stored in memory to be read or written just like numbers
The power of the concept: memory can contain
- the source code for an editor
- the compiled m/c code for the editor
- the text that the compiled program is using
- the compiler that generated the code
[Figure: a processor connected to memory holding an accounting program (machine code), an editor program (machine code), a C compiler (machine code), payroll data, book text, and the source code in C for the editor program]
Compiling if-then-else in MIPS
Assuming the five variables f, g, h, i, and j correspond to the five registers $s0 through $s4, what is the compiled MIPS code for the following C if statement:
if (i == j) f = g + h; else f = g - h;
[Figure: flowchart testing i == j; when i = j, f = g + h; when i ≠ j (Else:), f = g - h; both paths join at Exit:]
MIPS:
      bne $s3, $s4, Else     # go to Else if i ≠ j
      add $s0, $s1, $s2      # f = g + h (skipped if i ≠ j)
      j Exit
Else: sub $s0, $s1, $s2      # f = g - h (skipped if i = j)
Exit:
Typical Compilation
Major Types of Optimization
Optimization name: Explanation (Frequency)
High-level (at or near source level; machine-independent)
- Procedure integration: Replace procedure call by procedure body (N.M.)
Local (within straight-line code)
- Common subexpression elimination: Replace two instances of the same computation by a single copy
- Constant propagation: Replace all instances of a variable that is assigned a constant with the constant
- Stack height reduction: Rearrange expression tree to minimize resources needed for expression evaluation (N.M.)
Global (across a branch)
- Global common subexpression elimination: Same as local, but this version crosses branches
- Copy propagation: Replace all instances of a variable A that has been assigned X (i.e., A = X) with X
- Code motion: Remove code from a loop that computes the same value each iteration of the loop
- Induction variable elimination: Simplify/eliminate array-addressing calculations within loops
Machine-dependent (depends on machine knowledge)
- Strength reduction: Many examples, such as replacing multiply by a constant with adds and shifts (N.M.)
- Pipeline scheduling: Reorder instructions to improve pipeline performance (N.M.)
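Strength reduction, listed among the machine-dependent optimizations, replaces an expensive operation with cheaper ones. The sketch below shows the classic case of multiplying by a constant using only shifts and adds; the function name is hypothetical, and a real compiler would emit the fixed shift/add sequence at compile time rather than loop at run time.

```python
def multiply_by_constant(x, constant):
    """Compute x * constant (constant >= 0) with shifts and adds only."""
    if constant < 0:
        raise ValueError("this sketch handles non-negative constants only")
    result = 0
    shift = 0
    while constant:
        if constant & 1:               # this bit of the constant is set
            result += x << shift       # add x * 2**shift
        constant >>= 1
        shift += 1
    return result


if __name__ == "__main__":
    # 10 = 0b1010, so x * 10 becomes (x << 3) + (x << 1).
    print(multiply_by_constant(7, 10))
```

Whether this is a win depends on the machine: it pays off when a multiply costs many cycles and the constant has few set bits, which is why the technique is classified as machine-dependent.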
Effect of Compiler Optimization
[Figure: results by program and compiler optimization level; measurements taken on S]
Level 0: non-optimized code
Level 1: local optimization
Level 2: global optimization, s/w pipelining
Level 3: adds procedure integration
Compiler Support for Multimedia Instr.
Intel's MMX and PowerPC AltiVec have small vector processing capabilities targeting multimedia applications (to speed up graphics)
Intel added a new set of instructions called Streaming SIMD Extension
A major advantage of vector computers is hiding the latency of memory access by loading multiple elements and then overlapping execution with data transfer
Vector computers typically have strided and/or gather/scatter addressing to perform operations on distant memory locations
- Strided addressing allows memory access in increments larger than one
- Gather/scatter addressing is similar to register indirect mode, where the addresses are stored instead of the data
Supporting vector operations without strided addressing, as in Intel's MMX, limits the potential speedup
Such limited support for vector processing makes the use of vectorizing compiler optimization unpopular and restricts its scope to hand-coded routines
SIMD instructions on MMX and AltiVec tend to be solutions, not primitives
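The two vector addressing styles mentioned above differ in where the element addresses come from. The sketch below models memory as a Python list and uses hypothetical helper names; it illustrates the access patterns only and says nothing about how any particular vector unit implements them.

```python
def strided_load(memory, base, stride, count):
    """Load `count` elements starting at `base`, stepping by `stride`."""
    return [memory[base + i * stride] for i in range(count)]


def gather(memory, addresses):
    """Load the elements whose addresses are listed in an index vector."""
    return [memory[address] for address in addresses]


def scatter(memory, addresses, values):
    """Store each value at the matching address from the index vector."""
    for address, value in zip(addresses, values):
        memory[address] = value


if __name__ == "__main__":
    memory = list(range(0, 100, 10))          # [0, 10, 20, ..., 90]
    print(strided_load(memory, base=1, stride=3, count=3))
    print(gather(memory, [9, 0, 5]))
    scatter(memory, [2, 4], [-1, -2])
    print(memory)
```

A strided load covers regular patterns such as one column of a row-major matrix, while gather/scatter handles irregular patterns such as sparse data, at the cost of storing the index vector.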
Starting a Program
[Figure: C program -> Compiler -> Assembly language program -> Assembler -> Object: machine language module, together with Object: library routine (machine language) -> Linker -> Executable: machine language program -> Loader -> Memory]
Linker
- Place code and data modules symbolically in memory
- Determine the addresses of data and instruction labels
- Patch both internal and external references
Object files for Unix typically contain:
- Header: size and position of components
- Text segment: machine code
- Data segment: static and dynamic variables
- Relocation info: identifies absolute memory references
- Symbol table: name and location of labels, procedures, and variables
- Debugging info: mapping of source to object code, break points, etc.
Loading Executable Program
[Figure: MIPS memory layout from high to low addresses: $sp -> 7fff ffff hex, Stack; Dynamic data; $gp -> 1000 8000 hex; Static data from 1000 0000 hex; Text from pc -> 0040 0000 hex; Reserved down to 0]
To load an executable, the operating system follows these steps:
- Reads the executable file header to determine the size of the text and data segments
- Creates an address space large enough for the text and data
- Copies the instructions and data from the executable file into memory
- Copies the parameters (if any) to the main program onto the stack
- Initializes the machine registers and sets the stack pointer to the first free location
- Jumps to a start-up routine that copies the parameters into the argument registers and calls the main routine of the program
Instruction Set Design Issues
- Number of Addresses
- Flow of Control
- Operand Types
- Addressing Modes
- Instruction Types
- Instruction Formats
Number of Addresses
Four categories
3-address machines
- 2 for the source operands and one for the result
2-address machines
- One address doubles as source and result
1-address machines
- Accumulator machines
- Accumulator is used for one source and result
0-address machines
- Stack machines
- Operands are taken from the stack
- Result goes onto the stack
Number of Addresses (cont.)
Three-address machines
Two for the source operands, one for the result
RISC processors use three addresses
Sample instructions
add dest,src1,src2     ; M(dest) = [src1] + [src2]
sub dest,src1,src2     ; M(dest) = [src1] - [src2]
mult dest,src1,src2    ; M(dest) = [src1] * [src2]
Three addresses: Operand 1, Operand 2, Result
Example: a = b + c
Three-address instruction formats are not common because they require a
relatively long instruction format to hold the three address references.
Number of Addresses (cont.)
Example
C statement
A = B + C * D - E + F + A
Equivalent code:
mult T,C,D    ; T = C*D
add T,T,B     ; T = B + C*D
sub T,T,E     ; T = B + C*D - E
add T,T,F     ; T = B + C*D - E + F
add A,T,A     ; A = B + C*D - E + F + A
Number of Addresses (cont.)
Two-address machines
One address doubles (for source operand and result)
Last example makes a case for it
- Address T is used twice
Sample instructions
load dest,src    ; M(dest) = [src]
add dest,src     ; M(dest) = [dest] + [src]
sub dest,src     ; M(dest) = [dest] - [src]
mult dest,src    ; M(dest) = [dest] * [src]
Two addresses: one address doubles as operand and result
Example: a = a + b
The two-address format reduces the space requirement but also
introduces some awkwardness. To avoid altering the value of an
operand, a MOVE instruction is used to move one of the values to a
result or temporary location before performing the operation.
Number of Addresses (cont.)
Example
C statement:
    A = B + C * D - E + F + A
Equivalent code:
    load T,C    ; T = C
    mult T,D    ; T = C*D
    add  T,B    ; T = B + C*D
    sub  T,E    ; T = B + C*D - E
    add  T,F    ; T = B + C*D - E + F
    add  A,T    ; A = B + C*D - E + F + A
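A similar sketch for the two-address form, where the destination also serves as the first source. The encoding and the sample values are again assumptions for illustration only.

```python
import operator

# Hypothetical two-address machine: dest is both a source and the result.
ALU = {"add": operator.add, "sub": operator.sub, "mult": operator.mul}


def run_two_address(program, memory):
    """Execute (op, dest, src) tuples; 'load' copies src into dest."""
    for op, dest, src in program:
        if op == "load":
            memory[dest] = memory[src]
        else:
            memory[dest] = ALU[op](memory[dest], memory[src])
    return memory


program = [
    ("load", "T", "C"),  # T = C
    ("mult", "T", "D"),  # T = C*D
    ("add", "T", "B"),   # T = B + C*D
    ("sub", "T", "E"),   # T = B + C*D - E
    ("add", "T", "F"),   # T = B + C*D - E + F
    ("add", "A", "T"),   # A = A + T
]

memory = run_two_address(program, {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6})
print(memory["A"], len(program))
```

The result matches the three-address version at the cost of one extra instruction (the initial load), while each instruction is shorter.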
Number of Addresses (cont.)
One-address machines
- Use a special set of registers called accumulators
  - Specify one source operand and receive the result
- Called accumulator machines
- Sample instructions
    load  addr    ; accum = [addr]
    store addr    ; M[addr] = accum
    add   addr    ; accum = accum + [addr]
    sub   addr    ; accum = accum - [addr]
    mult  addr    ; accum = accum * [addr]
Number of Addresses (cont.)
Example
C statement:
    A = B + C * D - E + F + A
Equivalent code:
    load  C    ; load C into accum
    mult  D    ; accum = C*D
    add   B    ; accum = C*D + B
    sub   E    ; accum = B + C*D - E
    add   F    ; accum = B + C*D - E + F
    add   A    ; accum = B + C*D - E + F + A
    store A    ; store accum contents in A
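The accumulator version of A = B + C*D - E + F + A can be sketched the same way. The single `accum` variable stands in for the accumulator register; the instruction encoding and sample values are illustrative assumptions.

```python
import operator

# Hypothetical one-address (accumulator) machine.
ALU = {"add": operator.add, "sub": operator.sub, "mult": operator.mul}


def run_accumulator(program, memory):
    """Execute (op, addr) tuples; the accumulator is the implicit operand."""
    accum = 0
    for op, addr in program:
        if op == "load":
            accum = memory[addr]
        elif op == "store":
            memory[addr] = accum
        else:
            accum = ALU[op](accum, memory[addr])
    return memory


program = [
    ("load", "C"), ("mult", "D"), ("add", "B"), ("sub", "E"),
    ("add", "F"), ("add", "A"), ("store", "A"),
]

memory = run_accumulator(program, {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6})
print(memory["A"], len(program))
```

Every instruction here touches memory once, which is why accumulator machines are usually associated with high memory traffic.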
Number of Addresses (cont.)
Zero-address machines
- Stack supplies operands and receives the result
  - Special instructions to load and store use an address
- Called stack machines (Ex: HP3000, Burroughs B5500)
- Sample instructions
    push addr    ; push([addr])
    pop  addr    ; pop([addr])
    add          ; push(pop + pop)
    sub          ; push(pop - pop)
    mult         ; push(pop * pop)
Number of Addresses (cont.)
Example
C statement:
    A = B + C * D - E + F + A
Equivalent code:
    push E
    push C
    push D
    mult
    push B
    add
    sub
    push F
    add
    push A
    add
    pop  A
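The stack sequence for A = B + C*D - E + F + A is easier to follow when it is executed. The sketch below assumes, as in the sample instructions, that `sub` computes (first value popped) - (second value popped); the encoding and sample values are illustrative.

```python
def run_stack_machine(program, memory):
    """Execute a zero-address program; only push and pop carry an address."""
    stack = []
    for instr in program:
        op = instr[0]
        if op == "push":
            stack.append(memory[instr[1]])
        elif op == "pop":
            memory[instr[1]] = stack.pop()
        else:
            first = stack.pop()   # top of stack
            second = stack.pop()  # element below it
            if op == "add":
                stack.append(first + second)
            elif op == "sub":
                stack.append(first - second)
            elif op == "mult":
                stack.append(first * second)
            else:
                raise ValueError(f"unknown operation: {op}")
    return memory


program = [
    ("push", "E"), ("push", "C"), ("push", "D"), ("mult",),
    ("push", "B"), ("add",), ("sub",),
    ("push", "F"), ("add",), ("push", "A"), ("add",), ("pop", "A"),
]

memory = run_stack_machine(program, {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6})
print(memory["A"], len(program))
```

E is pushed first so that it sits below B + C*D when `sub` executes, which yields (B + C*D) - E.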
Load/Store Architecture
- Instructions expect operands in internal processor registers
  - Special LOAD and STORE instructions move data between registers and memory
- RISC uses this architecture
- Reduces instruction length
Load/Store Architecture (cont.)
Sample instructions
    load  Rd,addr      ; Rd = [addr]
    store addr,Rs      ; (addr) = Rs
    add   Rd,Rs1,Rs2   ; Rd = Rs1 + Rs2
    sub   Rd,Rs1,Rs2   ; Rd = Rs1 - Rs2
    mult  Rd,Rs1,Rs2   ; Rd = Rs1 * Rs2
Number of Addresses (cont.)
Example
C statement:
    A = B + C * D - E + F + A
Equivalent code:
    load  R1,B
    load  R2,C
    load  R3,D
    load  R4,E
    load  R5,F
    load  R6,A
    mult  R2,R2,R3
    add   R2,R2,R1
    sub   R2,R2,R4
    add   R2,R2,R5
    add   R2,R2,R6
    store A,R2
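In a load/store machine only the load and store instructions reach memory; arithmetic works on registers. The sketch below runs A = B + C*D - E + F + A and counts memory accesses. The register numbering R1 to R6 and the sample values are assumptions for illustration.

```python
import operator

ALU = {"add": operator.add, "sub": operator.sub, "mult": operator.mul}


def run_load_store(program, memory):
    """Execute a load/store program; return (registers, memory access count)."""
    regs = {}
    mem_accesses = 0
    for instr in program:
        op = instr[0]
        if op == "load":          # load Rd,addr
            _, rd, addr = instr
            regs[rd] = memory[addr]
            mem_accesses += 1
        elif op == "store":       # store addr,Rs
            _, addr, rs = instr
            memory[addr] = regs[rs]
            mem_accesses += 1
        else:                     # op Rd,Rs1,Rs2 (registers only)
            _, rd, rs1, rs2 = instr
            regs[rd] = ALU[op](regs[rs1], regs[rs2])
    return regs, mem_accesses


program = [
    ("load", "R1", "B"), ("load", "R2", "C"), ("load", "R3", "D"),
    ("load", "R4", "E"), ("load", "R5", "F"), ("load", "R6", "A"),
    ("mult", "R2", "R2", "R3"), ("add", "R2", "R2", "R1"),
    ("sub", "R2", "R2", "R4"), ("add", "R2", "R2", "R5"),
    ("add", "R2", "R2", "R6"), ("store", "A", "R2"),
]

memory = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
regs, mem_accesses = run_load_store(program, memory)
print(memory["A"], len(program), mem_accesses)
```

Twelve instructions are needed, but only seven of them (six loads and one store) access memory.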
Flow of Control
- Default is sequential flow
- Several instructions alter this default execution
  - Branches
    - Unconditional
    - Conditional
    - Delayed branches
  - Procedure calls
    - Delayed procedure calls
Flow of Control (cont.)
Branches
- Unconditional
  - Absolute address
  - PC-relative
    - Target address is specified relative to PC contents
    - Relocatable code
- Example: MIPS
  - Absolute address
      j target
  - PC-relative
      b target
Flow of Control (cont.)
Branches
- Conditional
  - Jump is taken only if the condition is met
- Two types
  - Set-Then-Jump
    - Condition testing is separated from branching
    - Condition code registers are used to convey the condition test result
    - Condition code registers keep a record of the status of the last ALU operation, such as the overflow condition
  - Example: Pentium code
      cmp AX,BX     ; compare AX and BX
      je  target    ; jump if equal
Flow of Control (cont.)
  - Test-and-Jump
    - Single instruction performs condition testing and branching
  - Example: MIPS instruction
      beq Rsrc1,Rsrc2,target    ; jumps to target if Rsrc1 = Rsrc2
Delayed branching
- Control is transferred after executing the instruction that follows the branch instruction
  - This instruction slot is called the delay slot
- Improves efficiency
- Highly pipelined RISC processors support delayed branching
Flow of Control (cont.)
Procedure calls
- Facilitate modular programming
- Require two pieces of information to return
  - End of procedure
    - Pentium uses the ret instruction
    - MIPS uses the jr instruction
  - Return address
    - In a (special) register: MIPS allows any general-purpose register
    - On the stack: Pentium
Flow of Control (cont.)
[Figure: delay slot]
Parameter Passing
Two basic techniques
- Register-based (e.g., PowerPC, MIPS)
  - Internal registers are used
    - Faster
    - Limit the number of parameters
    - Recursive procedures
- Stack-based (e.g., Pentium)
  - Stack is used
    - More general
Operand Types
- Instructions support basic data types
  - Characters
  - Integers
  - Floating-point
- Instruction overload
  - Same instruction for different data types
  - Example: Pentium
      mov AL,address     ; loads an 8-bit value
      mov AX,address     ; loads a 16-bit value
      mov EAX,address    ; loads a 32-bit value
Operand Types (cont.)
- Separate instructions
  - Instructions specify the operand size
  - Example: MIPS
      lb Rdest,address    ; loads a byte
      lh Rdest,address    ; loads a halfword (16 bits)
      lw Rdest,address    ; loads a word (32 bits)
      ld Rdest,address    ; loads a doubleword (64 bits)
  - Similar instructions exist for store
Addressing Modes
- How the operands are specified
- Operands can be in three places
  - Registers
    - Register addressing mode
  - Part of instruction
    - Constant
    - Immediate addressing mode
    - All processors support these two addressing modes
  - Memory
    - Difference between RISC and CISC
    - CISC supports a large variety of addressing modes
    - RISC follows load/store architecture
Instruction Types
Several types of instructions
- Data movement
  - Pentium: mov dest,src
  - Some do not provide direct data movement instructions
  - Indirect data movement
      add Rdest,Rsrc,0    ; Rdest = Rsrc + 0
- Arithmetic and Logical
  - Arithmetic
    - Integer and floating-point, signed and unsigned
    - add, subtract, multiply, divide
  - Logical
    - and, or, not, xor
Instruction Types (cont.)
Condition code bits
- S: Sign bit (0 = +, 1 = -)
- Z: Zero bit (0 = nonzero, 1 = zero)
- O: Overflow bit (0 = no overflow, 1 = overflow)
- C: Carry bit (0 = no carry, 1 = carry)
Example: Pentium
    cmp count,25    ; compare count to 25
                    ; (subtract 25 from count)
    je  target      ; jump if equal
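How a compare sets the condition code bits can be sketched by performing the subtraction on fixed-width two's-complement values. The function name and the 32-bit default are assumptions; the carry bit follows the x86 convention for subtraction (set when an unsigned borrow occurs).

```python
def compare_flags(a, b, bits=32):
    """Return the S, Z, O, C bits produced by computing a - b on `bits`-wide values."""
    mask = (1 << bits) - 1
    sign_bit = 1 << (bits - 1)
    a &= mask
    b &= mask
    result = (a - b) & mask
    return {
        "S": int(bool(result & sign_bit)),  # result is negative
        "Z": int(result == 0),              # result is zero
        # Signed overflow: operands differ in sign and the result's sign differs from a's.
        "O": int(bool((a ^ b) & (a ^ result) & sign_bit)),
        "C": int(a < b),                    # unsigned borrow
    }


equal = compare_flags(25, 25)
smaller = compare_flags(5, 25)
overflow = compare_flags(-128, 1, bits=8)  # -128 - 1 does not fit in 8 bits

# A conditional jump such as "je" is taken exactly when the Z bit is set.
je_taken = equal["Z"] == 1
print(equal, smaller, overflow, je_taken)
```

This mirrors the Set-Then-Jump style: the compare only records status bits, and a separate jump instruction inspects them.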
Instruction Types (cont.)
- Flow control and I/O instructions
  - Branch
  - Procedure call
  - Interrupts
- I/O instructions
  - Memory-mapped I/O
    - Most processors support memory-mapped I/O
    - No separate instructions for I/O
  - Isolated I/O
    - Pentium supports isolated I/O
    - Separate I/O instructions
        in  AX,port    ; read from an I/O port
        out port,AX    ; write to an I/O port
Instruction Formats
Two types
- Fixed-length
  - Used by RISC processors
  - 32-bit RISC processors use 32-bit wide instructions
    - Examples: SPARC, MIPS, PowerPC
- Variable-length
  - Used by CISC processors
  - Memory operands need more bits to specify
Opcode
- Major and exact operation
Examples of Instruction Formats
RISC (Reduced Instruction Set Computer) versus CISC (Complex Instruction Set Computer)
RISC vs. CISC
- The underlying philosophy of RISC machines is that a system is better able to manage program execution when the program consists of only a few different instructions that are the same length and require the same number of clock cycles to decode and execute.
- RISC systems access memory only with explicit load and store instructions.
- In CISC systems, many different kinds of instructions access memory, making instruction length variable and fetch-decode-execute time unpredictable.
RISC vs. CISC
- The difference between CISC and RISC becomes evident through the basic computer performance equation:
    time/program = (time/cycle) x (cycles/instruction) x (instructions/program)
- RISC systems shorten execution time by reducing the clock cycles per instruction.
- CISC systems improve performance by reducing the number of instructions per program.
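The trade-off can be made concrete with the standard performance equation, execution time = instructions per program x cycles per instruction x time per cycle. The figures below are invented purely to show the arithmetic; they are not measurements of any real processor.

```python
def execution_time_ns(instructions, cycles_per_instruction, cycle_time_ns):
    """time/program = instructions/program * cycles/instruction * time/cycle."""
    return instructions * cycles_per_instruction * cycle_time_ns


# Hypothetical workload: the RISC version needs more instructions,
# but each one takes fewer cycles on average.
risc_ns = execution_time_ns(instructions=1500, cycles_per_instruction=1.0, cycle_time_ns=1.0)
cisc_ns = execution_time_ns(instructions=1000, cycles_per_instruction=4.0, cycle_time_ns=1.0)

print(risc_ns, cisc_ns)
```

Which side wins depends entirely on the three factors, so neither approach is faster by definition.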
RISC vs. CISC
- The simple instruction set of RISC machines enables control units to be hardwired for maximum speed.
- The more complex (and variable) instruction set of CISC machines requires microcode-based control units that interpret instructions as they are fetched from memory. This translation takes time.
- With fixed-length instructions, RISC lends itself to pipelining and speculative execution.
RISC vs. CISC
Consider the program fragments:
    CISC                   RISC
    mov ax, 10             mov ax, 0
    mov bx, 5              mov bx, 10
    mul bx, ax             mov cx, 5
                    Begin: add ax, bx
                           loop Begin
The total clock cycles for the CISC version might be:
    (2 movs x 1 cycle) + (1 mul x 30 cycles) = 32 cycles
While the clock cycles for the RISC version are:
    (3 movs x 1 cycle) + (5 adds x 1 cycle) + (5 loops x 1 cycle) = 13 cycles
With the RISC clock cycle also being shorter, RISC gives us much faster execution speeds.
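The cycle totals for the two fragments can be checked with a few lines of code. The sketch assumes the usual form of this example: the fragments multiply 10 by 5, a `mul` costs 30 cycles, and every other instruction costs 1 cycle. These figures are illustrative assumptions, not measurements.

```python
def total_cycles(fragment):
    """Sum count * cycles over (mnemonic, count, cycles_each) entries."""
    return sum(count * cycles for _, count, cycles in fragment)


cisc_fragment = [("mov", 2, 1), ("mul", 1, 30)]
risc_fragment = [("mov", 3, 1), ("add", 5, 1), ("loop", 5, 1)]


def risc_multiply(multiplicand, multiplier):
    """Repeated addition as in the RISC fragment; multiplier must be >= 1."""
    ax, bx, cx = 0, multiplicand, multiplier
    while True:
        ax += bx     # add ax, bx
        cx -= 1      # loop Begin: decrement cx ...
        if cx == 0:  # ... and fall through once it reaches zero
            break
    return ax


print(total_cycles(cisc_fragment), total_cycles(risc_fragment), risc_multiply(10, 5))
```

The RISC loop executes more instructions yet uses fewer cycles under these assumptions; with a larger multiplier the repeated additions would eventually cost more than the single `mul`.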
RISC vs. CISC
- Because of their load-store ISAs, RISC architectures require a large number of CPU registers.
- These registers provide fast access to data during sequential program execution.
- They can also be employed to reduce the overhead typically caused by passing parameters to subprograms.
- Instead of pulling parameters off of a stack, the subprogram is directed to use a subset of registers.
RISC vs. CISC
- This is how registers can be overlapped in a RISC system.
- The current window pointer (CWP) points to the active register window.
RISC vs. CISC
- It is becoming increasingly difficult to distinguish RISC architectures from CISC architectures.
- Some RISC systems provide more extravagant instruction sets than some CISC systems.
- Some systems combine both approaches.
- The following two slides summarize the characteristics that traditionally typify the differences between these two architectures.
RISC vs. CISC
    RISC                                           CISC
    Multiple register sets.                        Single register set.
    Three operands per instruction.                One or two register operands per instruction.
    Parameter passing through register windows.    Parameter passing through memory.
    Single-cycle instructions.                     Multiple-cycle instructions.
    Hardwired control.                             Microprogrammed control.
    Highly pipelined.                              Less pipelined.
Continued...
RISC vs. CISC
    RISC                                           CISC
    Simple instructions, few in number.            Many complex instructions.
    Fixed-length instructions.                     Variable-length instructions.
    Complexity in compiler.                        Complexity in microcode.
    Only LOAD/STORE instructions access memory.    Many instructions can access memory.
    Few addressing modes.                          Many addressing modes.
Summary
Instruction Set Design Issues
Instruction set design issues include:
- Where are operands stored?
  - registers, memory, stack, accumulator
- How many explicit operands are there?
  - 0, 1, 2, or 3
- How is the operand location specified?
  - register, immediate, indirect, ...
- What type and size of operands are supported?
  - byte, int, float, double, string, vector, ...
- What operations are supported?
  - add, sub, mul, move, compare, ...
More About General Purpose Registers
Why do almost all new architectures use GPRs?
- Registers are much faster than memory (even cache)
  - Register values are available immediately
  - When memory isn't ready, the processor must wait ("stall")
- Registers are convenient for variable storage
  - Compiler assigns some variables just to registers
  - More compact code, since small fields specify registers (compared to memory addresses)
[Figure: memory hierarchy with Processor, Registers, Cache, Memory, Disk]
What Operations are Needed
- Arithmetic + Logical
  - Integer arithmetic: ADD, SUB, MULT, DIV, SHIFT
  - Logical operation: AND, OR, XOR, NOT
- Data Transfer: copy, load, store
- Control: branch, jump, call, return
- Floating Point: ADDF, MULF, DIVF
  - Same as arithmetic but usually take bigger operands
- Decimal: ADDD, CONVERT
- String: move, compare, search
- Graphics: pixel and vertex, compression/decompression operations
Stack Architecture: Pros and Cons
Pros
- Good code density (implicit top of stack)
- Low hardware requirements
- Easy to write a simpler compiler for stack architectures
Cons
- Stack becomes the bottleneck
- Little ability for parallelism or pipelining
- Data is not always at the top of the stack when needed, so additional instructions like TOP and SWAP are needed
- Difficult to write an optimizing compiler for stack architectures
Accumulator Architecture: Pros and Cons
Pros
- Very low hardware requirements
- Easy to design and understand
Cons
- Accumulator becomes the bottleneck
- Little ability for parallelism or pipelining
- High memory traffic
Memory-Memory Architecture: Pros and Cons
Pros
- Requires fewer instructions (especially if 3 operands)
- Easy to write compilers for (especially if 3 operands)
Cons
- Very high memory traffic (especially if 3 operands)
- Variable number of clocks per instruction
- With two operands, more data movements are required
Memory-Register Architecture: Pros and Cons
Pros
- Some data can be accessed without loading first
- Instruction format easy to encode
- Good code density
Cons
- Operands are not equivalent (poor orthogonality)
- Variable number of clocks per instruction
- May limit number of registers
Type and Size of Operands
- The type of an operand is designated by encoding it in the instruction's operation code.
- The type of an operand, e.g. single-precision float, effectively gives its size.
- Common operand types include character, half-word and word-size integer, and single- and double-precision floating point.
- Characters are almost always in ASCII, integers in 2's complement, and floating point in IEEE 754.
- The 16-bit Unicode used in Java is gaining popularity due to its support for international character sets.
- For business applications, some architectures support a decimal format in binary coded decimal (BCD).
- Depending on the size of the word, the complexity of handling different operand types differs.
- DSPs offer fixed-point data types to support high-precision floating-point arithmetic and to allow sharing a single exponent among multiple numbers.
- For graphics applications, vertex and pixel operands are added features.
Size of Operands
- The double-word data type is used for double-precision floating-point operations and for address storage in machines with a 64-bit wide address bus.
- Words are used for integer operations and for 32-bit address bus machines.
- Because of the mix in SPEC, word and double-word data types dominate.
[Figure: frequency of reference by size, based on SPEC2000 on Alpha]
Instruction Representation
- Humans are taught to think in base 10 (decimal), but numbers may be represented in any base (123 in base 10 = 1111011 in binary, or base 2).
- Numbers are stored in computers as a series of high and low electronic signals (binary numbers).
- Binary digits are called bits and are considered the atom of computing.
- Each piece of an instruction is a number, and placing these numbers together forms the instruction.
- Assemblers translate the symbolic assembly instructions into machine language instructions (machine code).
Example:
    Assembly:                  add $t0, $s1, $s2
    M/C language (decimal):    0       17      18      8       0       32
    M/C language (binary):     000000  10001   10010   01000   00000   100000
                               6 bits  5 bits  5 bits  5 bits  5 bits  6 bits
Note: the MIPS compiler by default maps $s0, ..., $s7 to registers 16-23 and $t0, ..., $t7 to registers 8-15.
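Packing the fields of a MIPS register-format instruction into a 32-bit word can be sketched as follows. The field widths (6, 5, 5, 5, 5, 6 bits) and the register numbering ($s0-$s7 as 16-23, $t0-$t7 as 8-15) are the standard MIPS conventions; the helper names are made up for this example.

```python
# Standard MIPS register numbers for the saved and temporary registers.
REGS = {f"$s{i}": 16 + i for i in range(8)}
REGS.update({f"$t{i}": 8 + i for i in range(8)})


def encode_r(op, rs, rt, rd, shamt, funct):
    """Pack R-format fields: op(6) rs(5) rt(5) rd(5) shamt(5) funct(6)."""
    return (op << 26) | (rs << 21) | (rt << 16) | (rd << 11) | (shamt << 6) | funct


def encode_add(rd, rs, rt):
    """Encode 'add rd, rs, rt' (op = 0, funct = 32)."""
    return encode_r(0, REGS[rs], REGS[rt], REGS[rd], 0, 32)


word = encode_add("$t0", "$s1", "$s2")
bits = f"{word:032b}"
fields = [bits[0:6], bits[6:11], bits[11:16], bits[16:21], bits[21:26], bits[26:32]]
print(hex(word), fields)
```

Reading the six fields back as decimal numbers gives 0, 17, 18, 8, 0, 32.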
Encoding an Instruction Set
- Instruction encoding affects the size of the compiled program and the complexity of the CPU implementation.
- The operation is typically specified in one field, called the opcode.
- The addressing mode for the operand can be encoded with the operation, or specified through a separate identifier when a large number of modes is supported.
- The architecture must balance several competing factors:
  - Desire to support as many registers and addressing modes as possible
  - Effect of operand specification on the size of the instruction (program)
  - Desire to simplify instruction fetching and decoding during execution
- Fixed-size instruction encoding simplifies the CPU design while limiting the addressing modes supported.
- An architect caring about code size can use variable-size encoding.
- A hybrid approach is to allow variability by supporting multiple-sized instructions.
Encoding Examples
MIPS Instruction Format
Register-format instructions:
    op      rs      rt      rd      shamt   funct
    6 bits  5 bits  5 bits  5 bits  5 bits  6 bits
- op: Basic operation of the instruction, traditionally called the opcode
- rs: The first register source operand
- rt: The second register source operand
- rd: The register destination operand; it gets the result of the operation
- shamt: Shift amount
- funct: This field selects the specific variant of the operation in the op field
Immediate-type instructions:
    op      rs      rt      address
    6 bits  5 bits  5 bits  16 bits
- Some instructions need longer fields than those provided, to hold large constants
- The 16-bit address means a load word instruction can load a word within a region of ±2^15 bytes of the address in the base register
- Example: lw $t0, 32($s3)    # Temporary register $t0 gets A[8]

    Instruction  Format  op  rs   rt   rd   shamt  funct  address
    add          R       0   reg  reg  reg  0      32     N/A
    sub          R       0   reg  reg  reg  0      34     N/A
    lw           I       35  reg  reg  N/A  N/A    N/A    address
    sw           I       43  reg  reg  N/A  N/A    N/A    address
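The immediate format and its 16-bit offset limit can be sketched in the same style. The opcode value 35 for `lw` and the field layout are standard MIPS; the helper name and the choice of offset 32 with base register $s3 (register 19) and target $t0 (register 8) are illustrative assumptions.

```python
LW_OPCODE = 35
REGS = {"$t0": 8, "$s3": 19}  # standard MIPS register numbers


def encode_i(op, rs, rt, offset):
    """Pack I-format fields: op(6) rs(5) rt(5) address(16, two's complement)."""
    if not -(1 << 15) <= offset < (1 << 15):
        raise ValueError("offset does not fit in a signed 16-bit field")
    return (op << 26) | (rs << 21) | (rt << 16) | (offset & 0xFFFF)


# lw $t0, 32($s3): rs is the base register, rt receives the loaded word.
word = encode_i(LW_OPCODE, REGS["$s3"], REGS["$t0"], 32)
negative = encode_i(LW_OPCODE, REGS["$s3"], REGS["$t0"], -4)
print(hex(word), hex(negative & 0xFFFF))
```

The range check reflects the ±2^15-byte reach of a load relative to the base register; anything farther away needs a different base address.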
The Stored Program Concept
- Learning how instructions are represented leads to discovering the secret of computing: the stored-program concept.
- Today's computers are built on two key principles:
  - Instructions are represented as numbers
  - Programs can be stored in memory to be read or written just like numbers
- The power of the concept: memory can contain
  - the source code for an editor
  - the compiled m/c code for the editor
  - the text that the compiled program is using
  - the compiler that generated the code
[Figure: a processor connected to a memory holding an accounting program (machine code), an editor program (machine code), a C compiler (machine code), payroll data, book text, and source code in C for the editor program]
Compiling if-then-else in MIPS
Assuming the five variables f, g, h, i, and j correspond to the five registers $s0 through $s4, what is the compiled MIPS code for the following C if statement?
    if (i == j) f = g + h; else f = g - h;
[Figure: flowchart testing i == j; when i = j, f = g + h; when i ≠ j (Else), f = g - h; both paths meet at Exit]
MIPS:
          bne $s3, $s4, Else    # go to Else if i ≠ j
          add $s0, $s1, $s2     # f = g + h (skipped if i ≠ j)
          j   Exit
    Else: sub $s0, $s1, $s2     # f = g - h (skipped if i = j)
    Exit:
Typical Compilation
Major Types of Optimization
Columns: Optimization name, Explanation, Frequency (N.M. where the source gives it)
High-level (at or near source level; machine independent)
- Procedure integration: Replace procedure call by procedure body. N.M.
Local (within straight-line code)
- Common subexpression elimination: Replace two instances of the same computation by a single copy.
- Constant propagation: Replace all instances of a variable that is assigned a constant with the constant.
- Stack height reduction: Rearrange expression tree to minimize resources needed for expression evaluation. N.M.
Global (across a branch)
- Global common subexpression elimination: Same as local, but this version crosses branches.
- Copy propagation: Replace all instances of a variable A that has been assigned X (i.e., A = X) with X.
- Code motion: Remove code from a loop that computes the same value in each iteration of the loop.
- Induction variable elimination: Simplify/eliminate array-addressing calculations within loops.
Machine-dependent (depends on machine knowledge)
- Strength reduction: Many examples, such as replacing a multiply by a constant with adds and shifts. N.M.
- Pipeline scheduling: Reorder instructions to improve pipeline performance. N.M.
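Two of these transformations, common subexpression elimination and strength reduction, can be illustrated by writing the "before" and "after" forms by hand. A compiler would apply them to its intermediate code; the Python functions below are only a source-level sketch with made-up names.

```python
def before(a, b, c):
    """Unoptimized form: (a + b) is computed twice and 8 is a multiplier."""
    x = (a + b) * c
    y = (a + b) * 8
    return x + y


def after(a, b, c):
    """Hand-optimized form of the same computation."""
    t = a + b    # common subexpression computed once
    x = t * c
    y = t << 3   # strength reduction: multiply by 8 becomes a shift by 3
    return x + y


samples = [(a, b, c) for a in range(-3, 4) for b in range(-3, 4) for c in range(-3, 4)]
same = all(before(*s) == after(*s) for s in samples)
print(same)
```

Both forms must produce identical results for every input; the optimized one simply does less work per call.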
Effect of Compiler Optimization
Measurements taken on MIPS
[Figure: results by program and compiler optimization level]
- Level 0: non-optimized code
- Level 1: local optimization
- Level 2: global optimization, s/w pipelining
- Level 3: adds procedure integration
Compiler Support for Multimedia Instr.
- Intel's MMX and PowerPC AltiVec have small vector processing capabilities targeting multimedia applications (to speed up graphics).
- Intel added a new set of instructions, called Streaming SIMD Extension.
- A major advantage of vector computers is hiding the latency of memory access by loading multiple elements and then overlapping execution with data transfer.
- Vector computers typically have strided and/or gather/scatter addressing to perform operations on distant memory locations.
  - Strided addressing allows memory access in increments larger than one.
  - Gather/scatter addressing is similar to register indirect mode, where the addresses are stored instead of the data.
- Supporting vector operations without strided addressing, as in Intel's MMX, limits the potential speedup.
- Such limited support for vector processing makes the use of vectorizing compiler optimizations unpopular and restricts their scope to hand-coded routines.
SIMD instructions on MMX and AltiVec tend to be solutions, not primitives.
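Strided and gather/scatter addressing can be pictured with plain list indexing. The function names and the flat list used as "memory" are assumptions for illustration; real vector hardware performs these accesses with dedicated load and store instructions.

```python
def strided_load(memory, base, stride, count):
    """Load `count` elements starting at `base`, stepping by `stride`."""
    return [memory[base + i * stride] for i in range(count)]


def gather(memory, addresses):
    """Load one element per stored address (an index vector)."""
    return [memory[addr] for addr in addresses]


def scatter(memory, addresses, values):
    """Store each value at its corresponding stored address."""
    for addr, value in zip(addresses, values):
        memory[addr] = value


memory = list(range(100, 116))              # 16 words holding 100..115
column = strided_load(memory, 1, 4, 3)      # every 4th word, starting at 1
picked = gather(memory, [7, 0, 12])
scatter(memory, [7, 0, 12], [-1, -2, -3])
print(column, picked, memory[0], memory[7], memory[12])
```

A strided load is what a column access of a row-major matrix needs; without it each element has to be fetched separately.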
Starting a Program
[Figure: C program -> Compiler -> Assembly language program -> Assembler -> Object: machine language module (together with Object: library routine (machine language)) -> Linker -> Executable: machine language program -> Loader -> Memory]
Linker
- Places code and data modules symbolically in memory
- Determines the addresses of data and instruction labels
- Patches both internal and external references
Object files for Unix typically contain:
- Header: size and position of components
- Text segment: machine code
- Data segment: static and dynamic variables
- Relocation info: identifies absolute memory references
- Symbol table: names and locations of labels, procedures, and variables
- Debugging info: mapping of source to object code, break points, etc.
Loading Executable Program
[Figure: memory layout from low to high addresses: reserved (from 0); text (pc, starting at 0040 0000 hex); static data (starting at 1000 0000 hex, $gp at 1000 8000 hex); dynamic data; stack ($sp, starting at 7fff ffff hex)]
To load an executable, the operating system follows these steps:
- Reads the executable file header to determine the size of the text and data segments
- Creates an address space large enough for the text and data
- Copies the instructions and data from the executable file into memory
- Copies the parameters (if any) to the main program onto the stack
- Initializes the machine registers and sets the stack pointer to the first free location
- Jumps to a start-up routine that copies the parameters into the argument registers and calls the main routine of the program
Instruction Set Design Issues
- Number of Addresses
- Flow of Control
- Operand Types
- Addressing Modes
- Instruction Types
- Instruction Formats
Number of Addresses
Four categories
- 3-address machines
  - 2 for the source operands and one for the result
- 2-address machines
  - One address doubles as source and result
- 1-address machines
  - Accumulator machines
  - Accumulator is used for one source and the result
- 0-address machines
  - Stack machines
  - Operands are taken from the stack
  - Result goes onto the stack
um+er of Addresses cont -um+er of Addresses
cont -
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 55101
um+er of Addresses cont-um+er of Addresses cont-
Three$address machines
To for the source operands one for the result
1ISC processors use three addresses
Sample instructions
add destsrc1src2
M(dest)=[src1]+[src2]
sub destsrc1src2
M(dest)=[src1]-[src2]
mult destsrc1src2
M(dest)=[src1][src2]
Three addresses
Operand 1 Operand 2 Result
Example a = b + c
Three-address instruction formats are not common because they reuire a
relatiely lon instruction format to hold the three address references
um+er of Addresses cont -um+er of Addresses
cont -
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 56101
um+er of Addresses cont-um+er of Addresses cont-
ltample
C statement
A C H D F 6 A
ltJuivalent code6
mult TCD T = CD
add TTB T = B+CD
sub TTE T = B+CD-E
add TTF T = B+CD-E+Fadd ATA A = B+CD-E+F+A
Number of Addresses (cont.)
Two-address machines
- One address doubles (for source operand & result)
- Last example makes a case for it
  - Address T is used twice
Sample instructions
load dest,src    M(dest)=[src]
add dest,src     M(dest)=[dest]+[src]
sub dest,src     M(dest)=[dest]-[src]
mult dest,src    M(dest)=[dest]*[src]
Two addresses: one address doubles as operand and result
Example: a = a + b
The two-address format reduces the space requirement but also introduces some awkwardness. To avoid altering the value of an operand, a MOVE instruction is used to move one of the values to a result or temporary location before performing the operation.
Number of Addresses (cont.)
Example
C statement:
A = B + C * D - E + F + A
Equivalent code:
load T,C    ; T = C
mult T,D    ; T = C*D
add T,B     ; T = B + C*D
sub T,E     ; T = B + C*D - E
add T,F     ; T = B + C*D - E + F
add A,T     ; A = B + C*D - E + F + A
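A sketch of the same computation on a two-address machine, where the destination is also the first source. The function name `run_two_address` and the operand values are illustrative assumptions.

```python
def run_two_address(program, memory):
    """Execute (op, dest, src) tuples; dest is both an operand and the result."""
    for op, dest, src in program:
        if op == "load":
            memory[dest] = memory[src]
        elif op == "add":
            memory[dest] = memory[dest] + memory[src]
        elif op == "sub":
            memory[dest] = memory[dest] - memory[src]
        elif op == "mult":
            memory[dest] = memory[dest] * memory[src]
        else:
            raise ValueError(f"unknown opcode: {op}")
    return memory

# Same six instructions as the slide's example.
program = [
    ("load", "T", "C"),
    ("mult", "T", "D"),
    ("add", "T", "B"),
    ("sub", "T", "E"),
    ("add", "T", "F"),
    ("add", "A", "T"),
]
# Hypothetical operand values.
memory = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6, "T": 0}
run_two_address(program, memory)
print(memory["A"])
```

The extra `load` is the price of the shorter format: an operand must first be copied into T so that no source value is overwritten.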
Number of Addresses (cont.)
One-address machines
- Use special set of registers called accumulators
  - Specify one source operand & receive the result
- Called accumulator machines
Sample instructions
load addr     accum = [addr]
store addr    M[addr] = accum
add addr      accum = accum + [addr]
sub addr      accum = accum - [addr]
mult addr     accum = accum * [addr]
Number of Addresses (cont.)
Example
C statement:
A = B + C * D - E + F + A
Equivalent code:
load C     ; load C into accum
mult D     ; accum = C*D
add B      ; accum = C*D + B
sub E      ; accum = B + C*D - E
add F      ; accum = B + C*D - E + F
add A      ; accum = B + C*D - E + F + A
store A    ; store accum contents in A
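The accumulator version can be sketched the same way. Here the only register is a single implicit accumulator; `run_accumulator` and the operand values are assumptions for illustration.

```python
def run_accumulator(program, memory):
    """Execute (op, addr) pairs; the accumulator is the implicit second operand."""
    accum = 0
    for op, addr in program:
        if op == "load":
            accum = memory[addr]
        elif op == "store":
            memory[addr] = accum
        elif op == "add":
            accum = accum + memory[addr]
        elif op == "sub":
            accum = accum - memory[addr]
        elif op == "mult":
            accum = accum * memory[addr]
        else:
            raise ValueError(f"unknown opcode: {op}")
    return memory

# Same seven instructions as the slide's example.
program = [
    ("load", "C"), ("mult", "D"), ("add", "B"), ("sub", "E"),
    ("add", "F"), ("add", "A"), ("store", "A"),
]
# Hypothetical operand values.
memory = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
run_accumulator(program, memory)
print(memory["A"])
```

Every instruction except the final `store` reads one memory operand, which is why accumulator machines tend to show high memory traffic.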
Number of Addresses (cont.)
Zero-address machines
- Stack supplies operands and receives the result
  - Special instructions to load and store use an address
- Called stack machines (Ex: HP 3000, Burroughs B5500)
Sample instructions
push addr    push([addr])
pop addr     pop([addr])
add          push(pop + pop)
sub          push(pop - pop)
mult         push(pop * pop)
Number of Addresses (cont.)
Example
C statement:
A = B + C * D - E + F + A
Equivalent code:
push E
push C
push D
mult
push B
add
sub
push F
add
push A
add
pop A
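A sketch of the stack sequence, following the slide's convention that `sub` computes push(pop - pop), so the first value popped is the left operand. The name `run_stack` and the operand values are illustrative assumptions.

```python
def run_stack(program, memory):
    """Execute zero-address code; only push and pop carry an address."""
    stack = []
    for instr in program:
        op = instr[0]
        if op == "push":
            stack.append(memory[instr[1]])
        elif op == "pop":
            memory[instr[1]] = stack.pop()
        else:
            first = stack.pop()    # popped first: left operand
            second = stack.pop()   # popped second: right operand
            if op == "add":
                stack.append(first + second)
            elif op == "sub":
                stack.append(first - second)
            elif op == "mult":
                stack.append(first * second)
            else:
                raise ValueError(f"unknown opcode: {op}")
    return memory

# Same twelve instructions as the slide's example.
program = [
    ("push", "E"), ("push", "C"), ("push", "D"), ("mult",),
    ("push", "B"), ("add",), ("sub",),
    ("push", "F"), ("add",), ("push", "A"), ("add",), ("pop", "A"),
]
# Hypothetical operand values.
memory = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
run_stack(program, memory)
print(memory["A"])
```

E is pushed first so that it sits below B + C*D when `sub` executes, which yields (B + C*D) - E rather than the reverse.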
Load/Store Architecture
- Instructions expect operands in internal processor registers
- Special LOAD and STORE instructions move data between registers and memory
- RISC uses this architecture
- Reduces instruction length
Load/Store Architecture (cont.)
Sample instructions
load Rd,addr       Rd = [addr]
store addr,Rs      (addr) = Rs
add Rd,Rs1,Rs2     Rd = Rs1 + Rs2
sub Rd,Rs1,Rs2     Rd = Rs1 - Rs2
mult Rd,Rs1,Rs2    Rd = Rs1 * Rs2
Number of Addresses (cont.)
Example
C statement:
A = B + C * D - E + F + A
Equivalent code:
load R1,B
load R2,C
load R3,D
load R4,E
load R5,F
load R6,A
mult R2,R2,R3
add R2,R2,R1
sub R2,R2,R4
add R2,R2,R5
add R2,R2,R6
store A,R2
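A sketch of the load/store version: arithmetic touches only registers, and memory is reached only through `load` and `store`. The name `run_load_store`, the register names R1 to R6, and the operand values are assumptions for illustration.

```python
import operator

ARITH = {"add": operator.add, "sub": operator.sub, "mult": operator.mul}

def run_load_store(program, memory):
    """Execute load/store code; returns the number of memory accesses made."""
    regs = {}
    memory_accesses = 0
    for instr in program:
        op = instr[0]
        if op == "load":        # load Rd,addr
            _, rd, addr = instr
            regs[rd] = memory[addr]
            memory_accesses += 1
        elif op == "store":     # store addr,Rs
            _, addr, rs = instr
            memory[addr] = regs[rs]
            memory_accesses += 1
        else:                   # op Rd,Rs1,Rs2 (registers only)
            _, rd, rs1, rs2 = instr
            regs[rd] = ARITH[op](regs[rs1], regs[rs2])
    return memory_accesses

program = [
    ("load", "R1", "B"), ("load", "R2", "C"), ("load", "R3", "D"),
    ("load", "R4", "E"), ("load", "R5", "F"), ("load", "R6", "A"),
    ("mult", "R2", "R2", "R3"), ("add", "R2", "R2", "R1"),
    ("sub", "R2", "R2", "R4"), ("add", "R2", "R2", "R5"),
    ("add", "R2", "R2", "R6"), ("store", "A", "R2"),
]
# Hypothetical operand values.
memory = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
accesses = run_load_store(program, memory)
print(memory["A"], accesses)
```

The sequence is longer than the memory-operand versions, but each variable is read from memory exactly once and the arithmetic instructions need only short register fields.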
Flow of Control
Default is sequential flow
Several instructions alter this default execution
Branches
- Unconditional
- Conditional
- Delayed branches
Procedure calls
- Delayed procedure calls
Flow of Control (cont.)
Branches
Unconditional
- Absolute address
- PC-relative
  - Target address is specified relative to PC contents
  - Relocatable code
Example: MIPS
- Absolute address
  j target
- PC-relative
  b target
Flow of Control (cont.)
Branches
Conditional
- Jump is taken only if the condition is met
Two types
- Set-Then-Jump
  - Condition testing is separated from branching
  - Condition code registers are used to convey the condition test result
  - Condition code registers keep a record of the status of the last ALU operation, such as overflow condition
- Example: Pentium code
  cmp AX,BX    ; compare AX and BX
  je target    ; jump if equal
Flow of Control (cont.)
- Test-and-Jump
  - Single instruction performs condition testing and branching
- Example: MIPS instruction
  beq Rsrc1,Rsrc2,target
  Jumps to target if Rsrc1 = Rsrc2
Delayed branching
- Control is transferred after executing the instruction that follows the branch instruction
  - This instruction slot is called the delay slot
- Improves efficiency
- Highly pipelined RISC processors support it
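Delayed branching can be illustrated with a toy interpreter in which a taken branch redirects control only after the next instruction has run. This is a simplified sketch: the instruction tuples, the name `run_with_delay_slot`, and the single-slot model are assumptions, and a branch placed inside a delay slot is not handled.

```python
def run_with_delay_slot(program):
    """Return the order in which instruction indices execute.

    program is a list of (op, arg) pairs; ("branch", target) is an
    unconditional branch whose effect is delayed by one instruction.
    """
    trace = []
    pc = 0
    pending_target = None          # target waiting for the delay slot to finish
    while pc < len(program):
        op, arg = program[pc]
        trace.append(pc)
        next_pc = pc + 1
        if pending_target is not None:
            # The instruction just executed occupied the delay slot.
            next_pc = pending_target
            pending_target = None
        if op == "branch":
            pending_target = arg
        pc = next_pc
    return trace

program = [
    ("op", None),       # 0
    ("branch", 4),      # 1: branch to index 4
    ("op", None),       # 2: delay slot, still executed
    ("op", None),       # 3: skipped
    ("op", None),       # 4: branch target
]
trace = run_with_delay_slot(program)
print(trace)
```

Instruction 2 runs even though the branch at index 1 is taken; a compiler tries to place useful work there so the slot is not wasted.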
Flow of Control (cont.)
Procedure calls
- Facilitate modular programming
- Require two pieces of information to return
  - End of procedure
    - Pentium uses ret instruction
    - MIPS uses jr instruction
  - Return address
    - In a (special) register: MIPS allows any general-purpose register
    - On the stack: Pentium
Flow of Control (cont.)
[Figure: delay slot]
Parameter Passing
Two basic techniques
Register-based (e.g., PowerPC, MIPS)
- Internal registers are used
  - Faster
  - Limit the number of parameters
  - Recursive procedure
Stack-based (e.g., Pentium)
- Stack is used
  - More general
2 Operand Types
Instructions support basic data types
- Characters
- Integers
- Floating-point
Instruction overload
- Same instruction for different data types
- Example: Pentium
  mov AL,address     ; loads an 8-bit value
  mov AX,address     ; loads a 16-bit value
  mov EAX,address    ; loads a 32-bit value
Operand Types (cont.)
Separate instructions
- Instructions specify the operand size
- Example: MIPS
  lb Rdest,address    ; loads a byte
  lh Rdest,address    ; loads a halfword (16 bits)
  lw Rdest,address    ; loads a word (32 bits)
  ld Rdest,address    ; loads a doubleword (64 bits)
Similar instructions exist for store
3 Addressing Modes
How the operands are specified
Operands can be in three places
- Registers
  - Register addressing mode
- Part of instruction
  - Constant
  - Immediate addressing mode
  - All processors support these two addressing modes
- Memory
  - Difference between RISC and CISC
  - CISC supports a large variety of addressing modes
  - RISC follows load/store architecture
4 Instruction Types
Several types of instructions
Data movement
- Pentium: mov dest,src
- Some do not provide direct data movement instructions
- Indirect data movement
  add Rdest,Rsrc,0    ; Rdest = Rsrc+0
Arithmetic and Logical
- Arithmetic
  - Integer and floating-point, signed and unsigned
  - add, subtract, multiply, divide
- Logical
  - and, or, not, xor
Instruction Types (cont.)
Condition code bits
- S: Sign bit (0 = +, 1 = -)
- Z: Zero bit (0 = nonzero, 1 = zero)
- O: Overflow bit (0 = no overflow, 1 = overflow)
- C: Carry bit (0 = no carry, 1 = carry)
Example: Pentium
  cmp count,25    ; compare count to 25
                  ; subtract 25 from count
  je target       ; jump if equal
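How a compare sets the four condition code bits can be sketched as a subtraction whose result is discarded. The function name `compare_flags` and the 32-bit default width are assumptions; the carry bit follows the x86 convention of acting as a borrow on subtraction.

```python
def compare_flags(a, b, bits=32):
    """Flags produced by computing a - b on `bits`-bit operands."""
    mask = (1 << bits) - 1
    sign_bit = 1 << (bits - 1)
    a &= mask
    b &= mask
    result = (a - b) & mask
    return {
        "S": int(bool(result & sign_bit)),   # sign of the truncated result
        "Z": int(result == 0),               # set when a == b
        # Signed overflow: operands have different signs and the result's
        # sign differs from the sign of a.
        "O": int(bool((a ^ b) & (a ^ result) & sign_bit)),
        # Borrow out of the top bit: unsigned a is smaller than unsigned b.
        "C": int(a < b),
    }

print(compare_flags(25, 25))
print(compare_flags(24, 25))
print(compare_flags(0x80000000, 1))
```

A `je` after the compare only needs to look at Z; signed and unsigned "less than" branches consult different combinations of S, O, and C.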
Instruction Types (cont.)
Flow control and I/O instructions
- Branch
- Procedure call
- Interrupts
I/O instructions
- Memory-mapped I/O
  - Most processors support memory-mapped I/O
  - No separate instructions for I/O
- Isolated I/O
  - Pentium supports isolated I/O
  - Separate I/O instructions
    in AX,io_port     ; read from an I/O port
    out io_port,AX    ; write to an I/O port
5 Instruction Formats
Two types
Fixed-length
- Used by RISC processors
- 32-bit RISC processors use 32-bit wide instructions
  - Examples: SPARC, MIPS, PowerPC
Variable-length
- Used by CISC processors
- Memory operands need more bits to specify
Opcode
- Major and exact operation
Examples of Instruction Formats
[Figure: examples of instruction formats]
RISC (Reduced Instruction Set Computer)
versus
CISC (Complex Instruction Set Computer)
RISC vs CISC
The underlying philosophy of RISC machines is that a system is better able to manage program execution when the program consists of only a few different instructions that are the same length and require the same number of clock cycles to decode and execute.
RISC systems access memory only with explicit load and store instructions.
In CISC systems, many different kinds of instructions access memory, making instruction length variable and fetch-decode-execute time unpredictable.
RISC vs CISC
The difference between CISC and RISC becomes evident through the basic computer performance equation:
time/program = (time/cycle) * (cycles/instruction) * (instructions/program)
RISC systems shorten execution time by reducing the clock cycles per instruction.
CISC systems improve performance by reducing the number of instructions per program.
RISC vs CISC
The simple instruction set of RISC machines enables control units to be hardwired for maximum speed.
The more complex, and variable, instruction set of CISC machines requires microcode-based control units that interpret instructions as they are fetched from memory. This translation takes time.
With fixed-length instructions, RISC lends itself to pipelining and speculative execution.
RISC vs CISC
Consider the program fragments:
CISC
  mov ax, 10
  mov bx, 5
  mul bx, ax
RISC
  mov ax, 0
  mov bx, 10
  mov cx, 5
  Begin: add ax, bx
  loop Begin
The total clock cycles for the CISC version might be:
(2 movs * 1 cycle) + (1 mul * 30 cycles) = 32 cycles
While the clock cycles for the RISC version are:
(3 movs * 1 cycle) + (5 adds * 1 cycle) + (5 loops * 1 cycle) = 13 cycles
With the RISC clock cycle also being shorter, RISC gives us much faster execution speeds.
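The cycle totals can be checked with a few lines of Python. The per-instruction costs (1 cycle for mov, add, and loop; 30 cycles for mul) are the illustrative figures used in this comparison, not measurements of any real processor, and the names below are assumptions.

```python
# Illustrative cycle cost per instruction type.
CYCLES = {"mov": 1, "add": 1, "loop": 1, "mul": 30}

def total_cycles(executed):
    """Sum cycles over a mapping of instruction type -> times executed."""
    return sum(CYCLES[op] * count for op, count in executed.items())

# CISC fragment: two movs and one multi-cycle mul.
cisc_executed = {"mov": 2, "mul": 1}
# RISC fragment: three movs, then add and loop each execute five times.
risc_executed = {"mov": 3, "add": 5, "loop": 5}

cisc_total = total_cycles(cisc_executed)
risc_total = total_cycles(risc_executed)
print(cisc_total, risc_total)
```

The RISC fragment executes 13 instructions against 3 for CISC, yet needs fewer cycles, which is the trade the performance equation describes: more instructions per program in exchange for fewer cycles per instruction.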
RISC vs CISC
Because of their load-store ISAs, RISC architectures require a large number of CPU registers.
These registers provide fast access to data during sequential program execution.
They can also be employed to reduce the overhead typically caused by passing parameters to subprograms.
Instead of pulling parameters off of a stack, the subprogram is directed to use a subset of registers.
RISC vs CISC
This is how registers can be overlapped in a RISC system.
The current window pointer (CWP) points to the active register window.
RISC vs CISC
It is becoming increasingly difficult to distinguish RISC architectures from CISC architectures.
Some RISC systems provide more extravagant instruction sets than some CISC systems.
Some systems combine both approaches.
The following two slides summarize the characteristics that traditionally typify the differences between these two architectures.
RISC vs CISC
RISC
- Multiple register sets.
- Three operands per instruction.
- Parameter passing through register windows.
- Single-cycle instructions.
- Hardwired control.
- Highly pipelined.
CISC
- Single register set.
- One or two register operands per instruction.
- Parameter passing through memory.
- Multiple cycle instructions.
- Microprogrammed control.
- Less pipelined.
RISC vs CISC
RISC
- Simple instructions, few in number.
- Fixed length instructions.
- Complexity in compiler.
- Only LOAD/STORE instructions access memory.
- Few addressing modes.
CISC
- Many complex instructions.
- Variable length instructions.
- Complexity in microcode.
- Many instructions can access memory.
- Many addressing modes.
Summary
Instruction Set Design Issues
Instruction set design issues include:
Where are operands stored?
- registers, memory, stack, accumulator
How many explicit operands are there?
- 0, 1, 2, or 3
How is the operand location specified?
- register, immediate, indirect, . . .
What type & size of operands are supported?
- byte, int, float, double, string, vector . . .
What operations are supported?
- add, sub, mul, move, compare . . .
More About General Purpose Registers
Why do almost all new architectures use GPRs?
Registers are much faster than memory (even cache)
- Register values are available immediately
- When memory isn't ready, processor must wait ("stall")
Registers are convenient for variable storage
- Compiler assigns some variables just to registers
- More compact code since small fields specify registers (compared to memory addresses)
[Figure: memory hierarchy: Processor (Registers), Cache, Memory, Disk]
What Operations are Needed
Arithmetic & Logical
- Integer arithmetic: ADD, SUB, MULT, DIV, SHIFT
- Logical operation: AND, OR, XOR, NOT
Data Transfer - copy, load, store
Control - branch, jump, call, return
Floating Point - ADDF, MULF, DIVF (same as arithmetic but usually take bigger operands)
Decimal - ADDD, CONVERT
String - move, compare, search
Graphics - pixel and vertex, compression/decompression operations
Stack Architecture: Pros and Cons
Pros
- Good code density (implicit top of stack)
- Low hardware requirements
- Easy to write a simpler compiler for stack architectures
Cons
- Stack becomes the bottleneck
- Little ability for parallelism or pipelining
- Data is not always at the top of stack when needed, so additional instructions like TOP and SWAP are needed
- Difficult to write an optimizing compiler for stack architectures
Accumulator Architecture: Pros and Cons
Pros
- Very low hardware requirements
- Easy to design and understand
Cons
- Accumulator becomes the bottleneck
- Little ability for parallelism or pipelining
- High memory traffic
Memory-Memory Architecture: Pros and Cons
Pros
- Requires fewer instructions (especially if 3 operands)
- Easy to write compilers for (especially if 3 operands)
Cons
- Very high memory traffic (especially if 3 operands)
- Variable number of clocks per instruction
- With two operands, more data movements are required
Memory-Register Architecture: Pros and Cons
Pros
- Some data can be accessed without loading first
- Instruction format easy to encode
- Good code density
Cons
- Operands are not equivalent (poor orthogonality)
- Variable number of clocks per instruction
- May limit number of registers
Size of Operands
[Figure: frequency of reference by size, based on SPEC2000 on Alpha]
- Double-word data type is used for double-precision floating point operations and for address storage in machines with a 64-bit wide address bus
- Words are used for integer operations and for 32-bit address bus machines
- Because of the mix in SPEC, word and double-word data types dominate
Instruction Representation
Humans are taught to think in base 10 (decimal), but numbers may be represented in any base (123 in base 10 = 1111011 in binary, or base 2)
Numbers are stored in computers as a series of high and low electronic signals (binary numbers)
Binary digits are called bits and are considered the "atom" of computing
Each piece of an instruction is a number, and placing these numbers together forms the instruction
The assembler translates the symbolic assembly instructions into machine language instructions (machine code)
Example:
Assembly:                  add $t0, $s1, $s2
M/C language (decimal):    0       17     18     8      0      32
M/C language (binary):     000000  10001  10010  01000  00000  100000
                           6 bits  5 bits 5 bits 5 bits 5 bits 6 bits
Note: MIPS compiler by default maps $s0,...,$s7 to reg. 16-23 and $t0,...,$t7 to reg. 8-15
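Placing the six field values side by side to form one 32-bit word can be sketched as below. The function name `encode_r_format` is an assumption; the field widths (6, 5, 5, 5, 5, 6) and the values 0, 17, 18, 8, 0, 32 for add $t0, $s1, $s2 follow the MIPS register format.

```python
def encode_r_format(op, rs, rt, rd, shamt, funct):
    """Pack the six R-format fields into one 32-bit instruction word."""
    fields = [(op, 6), (rs, 5), (rt, 5), (rd, 5), (shamt, 5), (funct, 6)]
    word = 0
    for value, width in fields:
        if not 0 <= value < (1 << width):
            raise ValueError(f"value {value} does not fit in {width} bits")
        # Shift the word left and append the next field on the right.
        word = (word << width) | value
    return word

# add $t0, $s1, $s2: op=0, rs=$s1=17, rt=$s2=18, rd=$t0=8, shamt=0, funct=32
word = encode_r_format(0, 17, 18, 8, 0, 32)
print(f"{word:032b}")
print(f"0x{word:08X}")
```

Note that the destination register comes first in the assembly text but sits in the fourth field (rd) of the encoded word.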
Encoding an Instruction Set
Instruction encoding affects the size of the compiled program and the complexity of the CPU implementation
The operation is typically specified in one field, called the opcode
The addressing mode for the operand can be encoded with the operation, or specified through a separate identifier when a large number of modes is supported
The architecture must balance between several competing factors:
- Desire to support as many registers and addressing modes as possible
- Effect of operand specification on the size of the instruction (program)
- Desire to simplify instruction fetching and decoding during execution
Fixed-size instruction encoding simplifies the CPU design while limiting the addressing modes supported
An architect caring about the code size can use variable-size encoding
A hybrid approach is to allow variability by supporting multiple-sized instructions
Encoding Examples
MIPS Instruction Format
Register-format instructions
  op      rs      rt      rd      shamt   funct
  6 bits  5 bits  5 bits  5 bits  5 bits  6 bits
- op: Basic operation of the instruction, traditionally called the opcode
- rs: The first register source operand
- rt: The second register source operand
- rd: The register destination operand; it gets the result of the operation
- shamt: Shift amount
- funct: This field selects the specific variant of the operation of the op field
Immediate-type instructions
  op      rs      rt      address
  6 bits  5 bits  5 bits  16 bits
- Some instructions need longer fields than provided, for large value constants
- The 16-bit address means a load word instruction can load a word within a region of ±2^15 bytes of the address in the base register
- Example: lw $t0, 32($s3)    # Temporary register $t0 gets A[8]
Instruction  Format  op   rs   rt   rd   shamt  funct  address
add          R       0    reg  reg  reg  0      32     N/A
sub          R       0    reg  reg  reg  0      34     N/A
lw           I       35   reg  reg  N/A  N/A    N/A    address
sw           I       43   reg  reg  N/A  N/A    N/A    address
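The immediate format and its ±2^15 byte reach can be sketched in the same style. The function name `encode_i_format` is an assumption; the example uses lw (op 35) with base register $s3 (register 19), destination $t0 (register 8), and offset 32.

```python
def encode_i_format(op, rs, rt, offset):
    """Pack an I-format word: 6-bit op, 5-bit rs, 5-bit rt, 16-bit signed offset."""
    if not -(1 << 15) <= offset < (1 << 15):
        raise ValueError("offset does not fit in a signed 16-bit field")
    for value, width in ((op, 6), (rs, 5), (rt, 5)):
        if not 0 <= value < (1 << width):
            raise ValueError(f"value {value} does not fit in {width} bits")
    # Negative offsets are stored in two's complement within the low 16 bits.
    return (op << 26) | (rs << 21) | (rt << 16) | (offset & 0xFFFF)

# lw $t0, 32($s3)
word = encode_i_format(35, 19, 8, 32)
print(f"0x{word:08X}")
```

An offset of 32 bytes reaches the ninth word of an array whose base address is in $s3, since each word occupies 4 bytes.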
The Stored Program Concept
Learning how instructions are represented leads to discovering the secret of computing: the stored-program concept
Today's computers are built on two key principles:
- Instructions are represented as numbers
- Programs can be stored in memory to be read or written just like numbers
The power of the concept: memory can contain
- the source code for an editor
- the compiled m/c code for the editor
- the text that the compiled program is using
- the compiler that generated the code
[Figure: Processor connected to Memory; memory holds Accounting program (machine code), Editor program (machine code), C compiler (machine code), Payroll data, Book text, Source code in C for editor program]
Compiling if-then-else in MIPS
Assuming the five variables f, g, h, i, and j correspond to the five registers $s0 through $s4, what is the compiled MIPS code for the following C if statement?
if (i == j) f = g + h; else f = g - h;
[Figure: flowchart: test i == j; if i = j, f = g + h; if i ≠ j, go to Else: f = g - h; both paths meet at Exit:]
MIPS
      bne $s3, $s4, Else     # go to Else if i ≠ j
      add $s0, $s1, $s2      # f = g + h (skipped if i ≠ j)
      j Exit                 # go to Exit
Else: sub $s0, $s1, $s2      # f = g - h (skipped if i = j)
Exit:
Typical Compilation
Major Types of Optimization
Optimization name: Explanation [Frequency]
High-level (at or near source level; machine-independent)
- Procedure integration: Replace procedure call by procedure body [N.M.]
Local (within straight-line code)
- Common subexpression elimination: Replace two instances of the same computation by a single copy
- Constant propagation: Replace all instances of a variable that is assigned a constant with the constant
- Stack height reduction: Rearrange expression tree to minimize resources needed for expression evaluation [N.M.]
Global (across a branch)
- Global common subexpression elimination: Same as local, but this version crosses branches
- Copy propagation: Replace all instances of a variable A that has been assigned X (i.e., A = X) with X
- Code motion: Remove code from a loop that computes the same value each iteration of the loop
- Induction variable elimination: Simplify/eliminate array-addressing calculations within loops
Machine-dependent (depends on machine knowledge)
- Strength reduction: Many examples, such as replacing multiply by a constant with adds and shifts [N.M.]
- Pipeline scheduling: Reorder instructions to improve pipeline performance [N.M.]
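Strength reduction, replacing a multiply by a constant with adds and shifts, can be sketched as follows. This is only an illustration of the idea in Python; the function name `multiply_by_constant` is an assumption, and a real compiler would emit the shift and add instructions directly for a known constant.

```python
def multiply_by_constant(x, constant):
    """Compute x * constant using only shifts and adds (constant >= 0)."""
    if constant < 0:
        raise ValueError("this sketch handles non-negative constants only")
    result = 0
    shift = 0
    while constant:
        if constant & 1:
            # This bit of the constant contributes x * 2**shift.
            result += x << shift
        constant >>= 1
        shift += 1
    return result

# x * 10 becomes (x << 1) + (x << 3), since 10 = 0b1010.
print(multiply_by_constant(7, 10))
```

The transformation pays off on machines where a multiply costs many more cycles than a shift or an add.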
Effect of Compiler Optimization
Measurements taken on S
[Figure: effect by program and compiler optimization level]
Level 0: non-optimized code
Level 1: local optimization
Level 2: global optimization, s/w pipelining
Level 3: adds procedure integration
Compiler Support for Multimedia Instr.
Intel's MMX and PowerPC AltiVec have small vector processing capabilities targeting multimedia applications (to speed up graphics)
Intel added a new set of instructions, called Streaming SIMD Extension
A major advantage of vector computers is hiding the latency of memory access by loading multiple elements and then overlapping execution with data transfer
Vector computers typically have strided and/or gather/scatter addressing to perform operations on distant memory locations
- Strided addressing allows memory access in increments larger than one
- Gather/scatter addressing is similar to register indirect mode, where the addresses are stored instead of the data
Supporting vector operations without strided addressing, as in Intel's MMX, limits the potential speedup
Such limited support for vector processing makes the use of vectorizing compiler optimization unpopular and restricts its scope to hand-coded routines
SIMD instructions on MMX and AltiVec tend to be solutions, not primitives
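The two addressing styles can be contrasted with a short sketch that models memory as a Python list. The function names `strided_load`, `gather_load`, and `scatter_store` are assumptions for illustration and do not correspond to any particular instruction set.

```python
def strided_load(memory, base, stride, count):
    """Load `count` elements starting at `base`, stepping by `stride`."""
    return [memory[base + i * stride] for i in range(count)]

def gather_load(memory, addresses):
    """Load the elements found at an explicit list of addresses."""
    return [memory[address] for address in addresses]

def scatter_store(memory, addresses, values):
    """Store each value at its matching address."""
    for address, value in zip(addresses, values):
        memory[address] = value

memory = list(range(100, 116))             # 16 words holding 100..115
column = strided_load(memory, 0, 4, 4)     # every fourth word
picked = gather_load(memory, [3, 9, 1])    # arbitrary locations
scatter_store(memory, [3, 9, 1], [0, 0, 0])
print(column, picked)
```

A strided load of this kind is what walking down one column of a row-major 4x4 matrix requires; without it, the elements must be collected one at a time before a vector operation can start.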
Starting a Program
[Figure: C program -> Compiler -> Assembly language program -> Assembler -> Object: Machine language module, plus Object: Library routine (machine language) -> Linker -> Executable: Machine language program -> Loader -> Memory]
Linker
- Place code & data modules symbolically in memory
- Determine the address of data & instruction labels
- Patch both internal & external references
Object files for Unix typically contain:
- Header: size & position of components
- Text segment: machine code
- Data segment: static and dynamic variables
- Relocation info: identify absolute memory references
- Symbol table: name & location of labels, procedures and variables
- Debugging info: mapping source to object code, break points, etc.
Loading Executable Program
[Figure: memory layout, bottom to top: Reserved (from 0), Text (pc at 0040 0000 hex), Static data (from 1000 0000 hex, $gp at 1000 8000 hex), Dynamic data, Stack ($sp at 7fff ffff hex)]
To load an executable, the operating system follows these steps:
- Reads the executable file header to determine the size of the text and data segments
- Creates an address space large enough for the text and data
- Copies the instructions and data from the executable file into memory
- Copies the parameters (if any) to the main program onto the stack
- Initializes the machine registers and sets the stack pointer to the first free location
- Jumps to a start-up routine that copies the parameters into the argument registers and calls the main routine of the program
Instruction Set Design Issues
- Number of Addresses
- Flow of Control
- Operand Types & Addressing Modes
- Instruction Types
- Instruction Formats
um+er of Addressesum+er of Addresses
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 54101
um+er of Addressesum+er of Addresses
Lour categories
$address machines$ for the source operands and one for the result
$address machines
$ 5ne address doubles as source and result
$address machine$ Accumulator machines
$ Accumulator is used for one source and result
gt$address machines
$ Stac0 machines
$ 5perands are ta0en from the stac0
$ 1esult goes onto the stac0
um+er of Addresses cont -um+er of Addresses
cont -
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 55101
um+er of Addresses cont-um+er of Addresses cont-
Three$address machines
To for the source operands one for the result
1ISC processors use three addresses
Sample instructions
add destsrc1src2
M(dest)=[src1]+[src2]
sub destsrc1src2
M(dest)=[src1]-[src2]
mult destsrc1src2
M(dest)=[src1][src2]
Three addresses
Operand 1 Operand 2 Result
Example a = b + c
Three-address instruction formats are not common because they reuire a
relatiely lon instruction format to hold the three address references
um+er of Addresses cont -um+er of Addresses
cont -
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 56101
um+er of Addresses cont-um+er of Addresses cont-
ltample
C statement
A C H D F 6 A
ltJuivalent code6
mult TCD T = CD
add TTB T = B+CD
sub TTE T = B+CD-E
add TTF T = B+CD-E+Fadd ATA A = B+CD-E+F+A
um+er of Addresses cont -um+er of Addresses
cont -
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 57101
um+er of Addresses cont-um+er of Addresses cont-
To$address machines
5ne address doubles (for source operand result)
3ast eample ma0es a case for it
$ Address T is used tice
Sample instructions
load destsrc M(dest)=[src]
add destsrc M(dest)=[dest]+[src]
sub destsrc M(dest)=[dest]-[src]
mult destsrc M(dest)=[dest][src]
Two Addresses
One address doubles as operand and resultExample a = a + b
The t$o-address formal reduces the space reuirement but also
introduces some a$$ardness To aoid alterin the alue of an
operand a ampOE instruction is used to moe one of the alues to a
result or temporary location before performin the operation
Number of Addresses (cont.)
Example
C statement
  A = B + C * D - E + F + A
Equivalent code:
  load T,C    ; T = C
  mult T,D    ; T = C*D
  add  T,B    ; T = B + C*D
  sub  T,E    ; T = B + C*D - E
  add  T,F    ; T = B + C*D - E + F
  add  A,T    ; A = B + C*D - E + F + A
Number of Addresses (cont.)
One-address machines
- Use a special set of registers called accumulators
  - Specify one source operand and receive the result
- Called accumulator machines
Sample instructions
  load  addr    accum = [addr]
  store addr    M[addr] = accum
  add   addr    accum = accum + [addr]
  sub   addr    accum = accum - [addr]
  mult  addr    accum = accum * [addr]
Number of Addresses (cont.)
Example
C statement
  A = B + C * D - E + F + A
Equivalent code:
  load  C    ; load C into accum
  mult  D    ; accum = C*D
  add   B    ; accum = C*D + B
  sub   E    ; accum = B + C*D - E
  add   F    ; accum = B + C*D - E + F
  add   A    ; accum = B + C*D - E + F + A
  store A    ; store accum contents in A
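The accumulator version can be traced the same way. The sketch below is illustrative only: `run_accumulator`, the instruction tuples, and the sample values are assumptions, and the access counter is added to show that every one-address instruction touches memory.

```python
# Hypothetical accumulator machine: one implicit register (accum),
# one explicit memory address per instruction.
def run_accumulator(program, memory):
    accum = 0
    memory_accesses = 0
    for op, addr in program:
        memory_accesses += 1  # each instruction reads or writes one memory cell
        if op == "load":
            accum = memory[addr]
        elif op == "store":
            memory[addr] = accum
        elif op == "add":
            accum += memory[addr]
        elif op == "sub":
            accum -= memory[addr]
        elif op == "mult":
            accum *= memory[addr]
        else:
            raise ValueError(f"unknown instruction: {op}")
    return memory, memory_accesses


program = [
    ("load", "C"), ("mult", "D"), ("add", "B"), ("sub", "E"),
    ("add", "F"), ("add", "A"), ("store", "A"),
]
# Sample values are arbitrary assumptions
memory = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
final_memory, accesses = run_accumulator(program, dict(memory))
print(final_memory["A"], accesses)
```

Seven short instructions replace five long ones, and all seven go to memory.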
Number of Addresses (cont.)
Zero-address machines
- Stack supplies operands and receives the result
  - Special instructions to load and store use an address
- Called stack machines (Ex: HP3000, Burroughs B5500)
Sample instructions
  push addr    push([addr])
  pop  addr    pop([addr])
  add          push(pop + pop)
  sub          push(pop - pop)
  mult         push(pop * pop)
Number of Addresses (cont.)
Example
C statement
  A = B + C * D - E + F + A
Equivalent code:
  push E
  push C
  push D
  mult
  push B
  add
  sub
  push F
  add
  push A
  add
  pop  A
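A short stack-machine trace shows why E is pushed first: `sub` is defined as push(pop - pop), so the first value popped (the top of the stack) is the left operand. The sketch below assumes that reading of the sample instructions; `run_stack` and the sample values are hypothetical.

```python
# Hypothetical zero-address (stack) machine. Only push and pop name an address.
def run_stack(program, memory):
    stack = []
    for instruction in program:
        op = instruction[0]
        if op == "push":
            stack.append(memory[instruction[1]])
        elif op == "pop":
            memory[instruction[1]] = stack.pop()
        else:
            first = stack.pop()   # top of stack, left operand of pop OP pop
            second = stack.pop()  # next entry, right operand
            if op == "add":
                stack.append(first + second)
            elif op == "sub":
                stack.append(first - second)
            elif op == "mult":
                stack.append(first * second)
            else:
                raise ValueError(f"unknown instruction: {op}")
    return memory


program = [
    ("push", "E"), ("push", "C"), ("push", "D"), ("mult",),
    ("push", "B"), ("add",), ("sub",),
    ("push", "F"), ("add",), ("push", "A"), ("add",), ("pop", "A"),
]
# Sample values are arbitrary assumptions
memory = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
result = run_stack(program, dict(memory))
print(result["A"])
```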
Load/Store Architecture
- Instructions expect operands in internal processor registers
- Special LOAD and STORE instructions move data between registers and memory
- RISC uses this architecture
- Reduces instruction length
Load/Store Architecture (cont.)
Sample instructions
  load  Rd,addr       Rd = [addr]
  store addr,Rs       (addr) = Rs
  add   Rd,Rs1,Rs2    Rd = Rs1 + Rs2
  sub   Rd,Rs1,Rs2    Rd = Rs1 - Rs2
  mult  Rd,Rs1,Rs2    Rd = Rs1 * Rs2
Number of Addresses (cont.)
Example
C statement
  A = B + C * D - E + F + A
Equivalent code:
  load  R1,B
  load  R2,C
  load  R3,D
  load  R4,E
  load  R5,F
  load  R6,A
  mult  R2,R2,R3
  add   R2,R2,R1
  sub   R2,R2,R4
  add   R2,R2,R5
  add   R2,R2,R6
  store A,R2
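The load/store version uses more instructions, but only the loads and the store touch memory. The following sketch counts both; `run_load_store`, the register names, and the sample values are assumptions made for illustration.

```python
# Hypothetical load/store machine: arithmetic works on registers only.
def run_load_store(program, memory):
    registers = {}
    memory_accesses = 0
    for op, *args in program:
        if op == "load":
            reg, addr = args
            registers[reg] = memory[addr]
            memory_accesses += 1
        elif op == "store":
            addr, reg = args
            memory[addr] = registers[reg]
            memory_accesses += 1
        else:
            dest, src1, src2 = args
            a, b = registers[src1], registers[src2]
            if op == "add":
                registers[dest] = a + b
            elif op == "sub":
                registers[dest] = a - b
            elif op == "mult":
                registers[dest] = a * b
            else:
                raise ValueError(f"unknown instruction: {op}")
    return memory, memory_accesses


program = [
    ("load", "R1", "B"), ("load", "R2", "C"), ("load", "R3", "D"),
    ("load", "R4", "E"), ("load", "R5", "F"), ("load", "R6", "A"),
    ("mult", "R2", "R2", "R3"), ("add", "R2", "R2", "R1"),
    ("sub", "R2", "R2", "R4"), ("add", "R2", "R2", "R5"),
    ("add", "R2", "R2", "R6"), ("store", "A", "R2"),
]
# Sample values are arbitrary assumptions
memory = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
final_memory, accesses = run_load_store(program, dict(memory))
print(final_memory["A"], len(program), accesses)
```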
Flow of Control
- Default is sequential flow
- Several instructions alter this default execution
  - Branches
    - Unconditional
    - Conditional
    - Delayed branches
  - Procedure calls
    - Delayed procedure calls
Flow of Control (cont.)
Branches
- Unconditional
  - Absolute address
  - PC-relative
    - Target address is specified relative to PC contents
    - Relocatable code
- Example: MIPS
  - Absolute address
      j target
  - PC-relative
      b target
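The difference between the two target calculations can be sketched numerically. The code below follows the usual MIPS32 conventions (a 16-bit word offset added to the address of the following instruction for branches, and a 26-bit word index combined with the upper PC bits for jumps); the function names are hypothetical and delay slots are ignored.

```python
def sign_extend_16(value):
    """Interpret the low 16 bits of value as a two's-complement number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def pc_relative_target(pc, offset16):
    """Branch target: address of the next instruction plus a word offset."""
    return (pc + 4 + (sign_extend_16(offset16) << 2)) & 0xFFFFFFFF


def jump_target(pc, index26):
    """Jump target: upper 4 bits of PC+4 joined with a 26-bit word index."""
    return ((pc + 4) & 0xF0000000) | ((index26 & 0x3FFFFFF) << 2)


# The same branch offset gives a different absolute target when the code
# is placed elsewhere, which is what makes PC-relative code relocatable.
print(hex(pc_relative_target(0x00400000, 3)))
print(hex(pc_relative_target(0x00500000, 3)))
print(hex(jump_target(0x00400000, 0x00100004)))
```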
Flow of Control (cont.)
Branches
- Conditional
  - Jump is taken only if the condition is met
- Two types
  - Set-Then-Jump
    - Condition testing is separated from branching
    - Condition code registers are used to convey the condition test result
    - Condition code registers keep a record of the status of the last ALU operation, such as overflow condition
  - Example: Pentium code
      cmp AX,BX     ; compare AX and BX
      je  target    ; jump if equal
Flow of Control (cont.)
  - Test-and-Jump
    - Single instruction performs condition testing and branching
  - Example: MIPS instruction
      beq Rsrc1,Rsrc2,target
    Jumps to target if Rsrc1 = Rsrc2
- Delayed branching
  - Control is transferred after executing the instruction that follows the branch instruction
    - This instruction slot is called the delay slot
  - Improves efficiency
  - Highly pipelined RISC processors support it
Flow of Control (cont.)
Procedure calls
- Facilitate modular programming
- Require two pieces of information to return
  - End of procedure
    - Pentium uses the ret instruction
    - MIPS uses the jr instruction
  - Return address
    - In a (special) register
      MIPS allows any general-purpose register
    - On the stack
      Pentium
Flow of Control (cont.)
[Figure: delay slot]
Parameter Passing
Two basic techniques
- Register-based (e.g., PowerPC, MIPS)
  - Internal registers are used
    - Faster
    - Limit the number of parameters
    - Recursive procedure
- Stack-based (e.g., Pentium)
  - Stack is used
    - More general
2 Operand Types
- Instructions support basic data types
  - Characters
  - Integers
  - Floating-point
- Instruction overload
  - Same instruction for different data types
  - Example: Pentium
      mov AL,address     ; loads an 8-bit value
      mov AX,address     ; loads a 16-bit value
      mov EAX,address    ; loads a 32-bit value
Operand Types (cont.)
- Separate instructions
  - Instructions specify the operand size
  - Example: MIPS
      lb Rdest,address    ; loads a byte
      lh Rdest,address    ; loads a halfword (16 bits)
      lw Rdest,address    ; loads a word (32 bits)
      ld Rdest,address    ; loads a doubleword (64 bits)
  - Similar instructions exist for store
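The effect of the operand size can be sketched by reading the same address with different widths. This is an illustration only: the helper `load`, the big-endian byte order, the sign extension, and the sample memory contents are assumptions, not a model of any particular processor configuration.

```python
def load(memory, address, size_bytes, signed=True):
    """Read size_bytes bytes starting at address and return them as an integer."""
    chunk = memory[address:address + size_bytes]
    if len(chunk) != size_bytes:
        raise IndexError("access runs past the end of memory")
    # Big-endian byte order is assumed for this sketch
    return int.from_bytes(chunk, byteorder="big", signed=signed)


# Sample memory contents are arbitrary assumptions
memory = bytes([0xFF, 0xFE, 0x00, 0x01, 0x00, 0x00, 0x00, 0x2A])

byte_value = load(memory, 0, 1)  # byte, as with lb
half_value = load(memory, 0, 2)  # halfword, as with lh
word_value = load(memory, 0, 4)  # word, as with lw
next_word = load(memory, 4, 4)   # the following word
print(byte_value, half_value, word_value, next_word)
```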
3 Addressing Modes
- How the operands are specified
  - Operands can be in three places
    - Registers
      - Register addressing mode
    - Part of instruction
      - Constant
      - Immediate addressing mode
      - All processors support these two addressing modes
    - Memory
      - Difference between RISC and CISC
      - CISC supports a large variety of addressing modes
      - RISC follows load/store architecture
4 Instruction Types
- Several types of instructions
  - Data movement
    - Pentium: mov dest,src
    - Some do not provide direct data movement instructions
    - Indirect data movement
        add Rdest,Rsrc,0    ; Rdest = Rsrc + 0
  - Arithmetic and Logical
    - Arithmetic
      - Integer and floating-point, signed and unsigned
      - add, subtract, multiply, divide
    - Logical
      - and, or, not, xor
Instruction Types (cont.)
- Condition code bits
  - S: Sign bit (0 = +, 1 = -)
  - Z: Zero bit (0 = nonzero, 1 = zero)
  - O: Overflow bit (0 = no overflow, 1 = overflow)
  - C: Carry bit (0 = no carry, 1 = carry)
- Example: Pentium
    cmp count,25    ; compare count to 25
                    ; (subtract 25 from count)
    je  target      ; jump if equal
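How a compare sets the four condition code bits can be sketched as a subtraction whose result is discarded. The function `compare_flags` is hypothetical; it assumes 32-bit operands and treats the carry bit as a borrow on subtraction, which is the x86 convention.

```python
def compare_flags(a, b, bits=32):
    """Return the S, Z, O and C bits produced by computing a - b."""
    mask = (1 << bits) - 1
    sign_bit = 1 << (bits - 1)
    a &= mask
    b &= mask
    diff = (a - b) & mask
    return {
        "S": int(bool(diff & sign_bit)),  # result is negative
        "Z": int(diff == 0),              # operands are equal
        # signed overflow: operands differ in sign and the result's sign
        # differs from the sign of a
        "O": int(bool((a ^ b) & (a ^ diff) & sign_bit)),
        "C": int(a < b),                  # unsigned borrow
    }


def jump_if_equal(flags):
    """je is taken when the zero bit is set."""
    return flags["Z"] == 1


print(compare_flags(25, 25))
print(compare_flags(10, 25))
print(compare_flags(0x80000000, 1))
```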
Instruction Types (cont.)
- Flow control and I/O instructions
  - Branch
  - Procedure call
  - Interrupts
- I/O instructions
  - Memory-mapped I/O
    - Most processors support memory-mapped I/O
    - No separate instructions for I/O
  - Isolated I/O
    - Pentium supports isolated I/O
    - Separate I/O instructions
        in  AX,io_port    ; read from an I/O port
        out io_port,AX    ; write to an I/O port
5 Instruction Formats
- Two types
  - Fixed-length
    - Used by RISC processors
    - 32-bit RISC processors use 32-bit wide instructions
      - Examples: SPARC, MIPS, PowerPC
  - Variable-length
    - Used by CISC processors
    - Memory operands need more bits to specify
- Opcode
  - Major and exact operation
Examples of Instruction Formats
RISC (Reduced Instruction Set Computer)
versus
CISC (Complex Instruction Set Computer)
RISC vs CISC
- The underlying philosophy of RISC machines is that a system is better able to manage program execution when the program consists of only a few different instructions that are the same length and require the same number of clock cycles to decode and execute.
- RISC systems access memory only with explicit load and store instructions.
- In CISC systems, many different kinds of instructions access memory, making instruction length variable and fetch-decode-execute time unpredictable.
RISC vs CISC
- The difference between CISC and RISC becomes evident through the basic computer performance equation:
    time/program = (time/cycle) x (cycles/instruction) x (instructions/program)
- RISC systems shorten execution time by reducing the clock cycles per instruction.
- CISC systems improve performance by reducing the number of instructions per program.
RISC vs CISC
- The simple instruction set of RISC machines enables control units to be hardwired for maximum speed.
- The more complex (and variable) instruction set of CISC machines requires microcode-based control units that interpret instructions as they are fetched from memory. This translation takes time.
- With fixed-length instructions, RISC lends itself to pipelining and speculative execution.
RISC vs CISC
- Consider the program fragments:
    CISC:
        mov ax, 10
        mov bx, 5
        mul bx, ax
    RISC:
        mov ax, 0
        mov bx, 10
        mov cx, 5
        Begin: add ax, bx
               loop Begin
- The total clock cycles for the CISC version might be:
    (2 movs x 1 cycle) + (1 mul x 30 cycles) = 32 cycles
- While the clock cycles for the RISC version are:
    (3 movs x 1 cycle) + (5 adds x 1 cycle) + (5 loops x 1 cycle) = 13 cycles
- With the RISC clock cycle being shorter, RISC gives us much faster execution speeds.
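The cycle comparison can be reproduced with a small model. The sketch below is illustrative: the function names are hypothetical, and the cycle costs (1 cycle each for mov, add and loop, 30 cycles for mul) are assumed values for the sake of the example rather than measurements of any real processor.

```python
# Assumed, illustrative cycle costs
MOV_COST = 1
ADD_COST = 1
LOOP_COST = 1
MUL_COST = 30


def multiply_cisc(a, b):
    """Two movs to set up the operands, then one multi-cycle mul."""
    cycles = 2 * MOV_COST + MUL_COST
    return a * b, cycles


def multiply_risc(a, b):
    """Three movs, then b iterations of a single-cycle add and loop."""
    cycles = 3 * MOV_COST
    ax, bx, cx = 0, a, b
    while cx > 0:
        ax += bx            # add ax, bx
        cycles += ADD_COST
        cx -= 1             # loop Begin decrements cx and branches
        cycles += LOOP_COST
    return ax, cycles


print(multiply_cisc(10, 5))
print(multiply_risc(10, 5))
```

Under these assumptions the RISC loop wins for a small multiplier, but its cycle count grows with the multiplier while the CISC count stays fixed.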
RISC vs CISC
- Because of their load-store ISAs, RISC architectures require a large number of CPU registers.
- These registers provide fast access to data during sequential program execution.
- They can also be employed to reduce the overhead typically caused by passing parameters to subprograms.
- Instead of pulling parameters off of a stack, the subprogram is directed to use a subset of registers.
RISC vs CISC
- This is how registers can be overlapped in a RISC system.
- The current window pointer (CWP) points to the active register window.
RISC vs CISC
- It is becoming increasingly difficult to distinguish RISC architectures from CISC architectures.
- Some RISC systems provide more extravagant instruction sets than some CISC systems.
- Some systems combine both approaches.
- The following two slides summarize the characteristics that traditionally typify the differences between these two architectures.
RISC vs CISC
RISC
- Multiple register sets.
- Three operands per instruction.
- Parameter passing through register windows.
- Single-cycle instructions.
- Hardwired control.
- Highly pipelined.
CISC
- Single register set.
- One or two register operands per instruction.
- Parameter passing through memory.
- Multiple cycle instructions.
- Microprogrammed control.
- Less pipelined.
Continued....
RISC vs CISC
RISC
- Simple instructions, few in number.
- Fixed length instructions.
- Complexity in compiler.
- Only LOAD/STORE instructions access memory.
- Few addressing modes.
CISC
- Many complex instructions.
- Variable length instructions.
- Complexity in microcode.
- Many instructions can access memory.
- Many addressing modes.
RISC vs CISC
Summary
Instruction Set Design Issues
Instruction set design issues include:
- Where are operands stored?
  - registers, memory, stack, accumulator
- How many explicit operands are there?
  - 0, 1, 2, or 3
- How is the operand location specified?
  - register, immediate, indirect, . . .
- What type & size of operands are supported?
  - byte, int, float, double, string, vector. . .
- What operations are supported?
  - add, sub, mul, move, compare . . .
More About General Purpose Registers
Why do almost all new architectures use GPRs?
- Registers are much faster than memory (even cache)
  - Register values are available immediately
  - When memory isn't ready, processor must wait ("stall")
- Registers are convenient for variable storage
  - Compiler assigns some variables just to registers
  - More compact code since small fields specify registers (compared to memory addresses)
[Figure: Processor with Registers, then Cache, Memory, Disk]
What Operations are Needed
- Arithmetic & Logical
  - Integer arithmetic: ADD, SUB, MULT, DIV, SHIFT
  - Logical operation: AND, OR, XOR, NOT
- Data Transfer - copy, load, store
- Control - branch, jump, call, return
- Floating Point
  - ADDF, MULF, DIVF
  - Same as arithmetic but usually take bigger operands
- Decimal - ADDD, CONVERT
- String - move, compare, search
- Graphics - pixel and vertex, compression/decompression operations
Stacks Architecture: Pros and Cons
Pros
- Good code density (implicit top of stack)
- Low hardware requirements
- Easy to write a simpler compiler for stack architectures
Cons
- Stack becomes the bottleneck
- Little ability for parallelism or pipelining
- Data is not always at the top of stack when needed, so additional instructions like TOP and SWAP are needed
- Difficult to write an optimizing compiler for stack architectures
Accumulators Architecture: Pros and Cons
Pros
- Very low hardware requirements
- Easy to design and understand
Cons
- Accumulator becomes the bottleneck
- Little ability for parallelism or pipelining
- High memory traffic
Memory-Memory Architecture: Pros and Cons
Pros
- Requires fewer instructions (especially if 3 operands)
- Easy to write compilers for (especially if 3 operands)
Cons
- Very high memory traffic (especially if 3 operands)
- Variable number of clocks per instruction
- With two operands, more data movements are required
Memory-Register Architecture: Pros and Cons
Pros
- Some data can be accessed without loading first
- Instruction format easy to encode
- Good code density
Cons
- Operands are not equivalent (poor orthogonality)
- Variable number of clocks per instruction
- May limit number of registers
Instruction Representation
- Humans are taught to think in base 10 (decimal), but numbers may be represented in any base (123 in base 10 = 1111011 in binary, or base 2)
- Numbers are stored in computers as a series of high and low electronic signals (binary numbers)
- Binary digits are called bits and are considered the "atom" of computing
- Each piece of an instruction is a number, and placing these numbers together forms the instruction
- The assembler translates the symbolic assembly instructions into machine language instructions (machine code)
- Example:
    Assembly:                add $t0, $s1, $s2
    M/C language (decimal):  0       17      18      8       0       32
    M/C language (binary):   000000  10001   10010   01000   00000   100000
                             6 bits  5 bits  5 bits  5 bits  5 bits  6 bits
Note: by default, the MIPS compiler maps $s0, ..., $s7 to registers 16-23 and $t0, ..., $t7 to registers 8-15
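Packing the six fields into one 32-bit word can be sketched with shifts and ORs. The helper name `encode_r_type` is hypothetical; the field widths (6, 5, 5, 5, 5, 6 bits) and the field values for add $t0, $s1, $s2 follow the standard MIPS R-format.

```python
def encode_r_type(op, rs, rt, rd, shamt, funct):
    """Pack R-format fields (6, 5, 5, 5, 5, 6 bits) into a 32-bit word."""
    widths = ((op, 6), (rs, 5), (rt, 5), (rd, 5), (shamt, 5), (funct, 6))
    for value, width in widths:
        if not 0 <= value < (1 << width):
            raise ValueError(f"field value {value} does not fit in {width} bits")
    return (op << 26) | (rs << 21) | (rt << 16) | (rd << 11) | (shamt << 6) | funct


# add $t0, $s1, $s2: rs = $s1 (17), rt = $s2 (18), rd = $t0 (8), funct = 32
word = encode_r_type(op=0, rs=17, rt=18, rd=8, shamt=0, funct=32)
bits = format(word, "032b")
fields = " ".join([bits[0:6], bits[6:11], bits[11:16],
                   bits[16:21], bits[21:26], bits[26:32]])
print(hex(word))
print(fields)
```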
Encoding an Instruction Set
- Instruction encoding affects the size of the compiled program and the complexity of the CPU implementation
- The operation is typically specified in one field called the opcode
- The addressing mode for the operand can be encoded with the operation or specified through a separate identifier in case of a large number of supported modes
- The architecture must balance between several competing factors:
  - Desire to support as many registers and addressing modes as possible
  - Effect of operand specification on the size of the instruction (program)
  - Desire to simplify instruction fetching and decoding during execution
- Fixed-size instruction encoding simplifies the CPU design while limiting the addressing modes supported
- An architect caring about the code size can use variable-size encoding
- A hybrid approach is to allow variability by supporting multiple-sized instructions
Encoding Examples
MIPS Instruction format
Register-format instructions
    op      rs      rt      rd      shamt   funct
    6 bits  5 bits  5 bits  5 bits  5 bits  6 bits
- op: Basic operation of the instruction, traditionally called the opcode
- rs: The first register source operand
- rt: The second register source operand
- rd: The register destination operand; it gets the result of the operation
- shamt: Shift amount
- funct: This field selects the specific variant of the operation of the op field
Immediate-type instructions
    op      rs      rt      address
    6 bits  5 bits  5 bits  16 bits
- Some instructions need longer fields than provided, for large-value constants
- The 16-bit address means a load word instruction can load a word within a region of ±2^15 bytes of the address in the base register
- Example: lw $t0, 32($s3)    # Temporary register $t0 gets A[8]

    Instruction  Format  op  rs   rt   rd   shamt  funct  address
    add          R       0   reg  reg  reg  0      32     N/A
    sub          R       0   reg  reg  reg  0      34     N/A
    lw           I       35  reg  reg  N/A  N/A    N/A    address
    sw           I       43  reg  reg  N/A  N/A    N/A    address
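The immediate format can be sketched the same way, including the sign extension that gives the 16-bit address field its ±2^15 range. The helper names are hypothetical; the opcode 35 for lw and the register numbers for $s3 (19) and $t0 (8) follow the standard MIPS conventions.

```python
def encode_i_type(op, rs, rt, immediate):
    """Pack I-format fields (6, 5, 5, 16 bits); immediate is a signed 16-bit value."""
    if not -(1 << 15) <= immediate < (1 << 15):
        raise ValueError("immediate does not fit in 16 signed bits")
    return (op << 26) | (rs << 21) | (rt << 16) | (immediate & 0xFFFF)


def decode_i_type(word):
    """Split a 32-bit word into (op, rs, rt, sign-extended immediate)."""
    immediate = word & 0xFFFF
    if immediate & 0x8000:
        immediate -= 1 << 16
    return (word >> 26) & 0x3F, (word >> 21) & 0x1F, (word >> 16) & 0x1F, immediate


# lw $t0, 32($s3): op = 35, rs = $s3 (19), rt = $t0 (8), address = 32
word = encode_i_type(op=35, rs=19, rt=8, immediate=32)
print(hex(word))
print(decode_i_type(word))
print(decode_i_type(encode_i_type(35, 19, 8, -4)))
```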
The Stored Program Concept
- Learning how instructions are represented leads to discovering the secret of computing: the stored-program concept
- Today's computers are built on two key principles:
  - Instructions are represented as numbers
  - Programs can be stored in memory to be read or written just like numbers
- The power of the concept: memory can contain
  - the source code for an editor
  - the compiled m/c code for the editor
  - the text that the compiled program is using
  - the compiler that generated the code
[Figure: a processor connected to a memory holding an accounting program (machine code), an editor program (machine code), a C compiler (machine code), payroll data, book text, and source code in C for the editor program]
Compiling if-then-else in MIPS
Assuming the five variables f, g, h, i, and j correspond to the five registers $s0 through $s4, what is the compiler's code for the following C if statement?
    if (i == j) f = g + h; else f = g - h;
[Figure: flowchart testing i == j; the i = j path computes f = g + h, the i ≠ j path (Else:) computes f = g - h, and both paths meet at Exit:]
MIPS
          bne $s3, $s4, Else    # go to Else if i ≠ j
          add $s0, $s1, $s2     # f = g + h (skipped if i ≠ j)
          j   Exit
    Else: sub $s0, $s1, $s2     # f = g - h (skipped if i = j)
    Exit:
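The control flow of the compiled if-then-else can be traced with a toy interpreter. This is only a sketch: `run_mips_fragment`, the tuple encoding of instructions and labels, and the register values are assumptions, and branch delay slots are not modeled.

```python
def run_mips_fragment(program, registers):
    """Execute a tiny subset of MIPS (bne, j, add, sub) with symbolic labels."""
    labels = {}
    instructions = []
    for entry in program:
        if entry[0] == "label":
            labels[entry[1]] = len(instructions)  # label marks the next instruction
        else:
            instructions.append(entry)
    pc = 0
    while pc < len(instructions):
        op, *args = instructions[pc]
        pc += 1  # default is sequential flow
        if op == "bne":
            if registers[args[0]] != registers[args[1]]:
                pc = labels[args[2]]
        elif op == "j":
            pc = labels[args[0]]
        elif op == "add":
            registers[args[0]] = registers[args[1]] + registers[args[2]]
        elif op == "sub":
            registers[args[0]] = registers[args[1]] - registers[args[2]]
        else:
            raise ValueError(f"unknown instruction: {op}")
    return registers


program = [
    ("bne", "$s3", "$s4", "Else"),
    ("add", "$s0", "$s1", "$s2"),
    ("j", "Exit"),
    ("label", "Else"),
    ("sub", "$s0", "$s1", "$s2"),
    ("label", "Exit"),
]
# f = $s0, g = $s1, h = $s2, i = $s3, j = $s4; the values are arbitrary
equal_case = run_mips_fragment(program, {"$s0": 0, "$s1": 7, "$s2": 3, "$s3": 1, "$s4": 1})
unequal_case = run_mips_fragment(program, {"$s0": 0, "$s1": 7, "$s2": 3, "$s3": 1, "$s4": 2})
print(equal_case["$s0"], unequal_case["$s0"])
```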
Typical Compilation
Major Types of Optimization
Optimization name: explanation (frequency, where given; N.M. = not measured)
High-level (at or near source level; machine independent)
- Procedure integration: replace procedure call by procedure body (N.M.)
Local (within straight-line code)
- Common subexpression elimination: replace two instances of the same computation by a single copy
- Constant propagation: replace all instances of a variable that is assigned a constant with the constant
- Stack height reduction: rearrange expression tree to minimize resources needed for expression evaluation (N.M.)
Global (across a branch)
- Global common subexpression elimination: same as local, but this version crosses branches
- Copy propagation: replace all instances of a variable A that has been assigned X (i.e., A = X) with X
- Code motion: remove code from a loop that computes the same value each iteration of the loop
- Induction variable elimination: simplify/eliminate array addressing calculations within loops
Machine-dependent (depends on machine knowledge)
- Strength reduction: many examples, such as replacing multiply by a constant with adds and shifts (N.M.)
- Pipeline scheduling: reorder instructions to improve pipeline performance (N.M.)
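Strength reduction of a multiply by a constant can be illustrated by decomposing the constant into powers of two. The sketch below is a generic shift-and-add routine, not any specific compiler's algorithm; `multiply_by_constant` is a hypothetical name and the constant is assumed to be non-negative.

```python
def multiply_by_constant(x, constant):
    """Compute x * constant using only shifts and adds (constant >= 0)."""
    if constant < 0:
        raise ValueError("this sketch handles non-negative constants only")
    result = 0
    shift = 0
    while constant:
        if constant & 1:
            result += x << shift  # add x * 2**shift for every set bit
        constant >>= 1
        shift += 1
    return result


# x * 10 becomes (x << 3) + (x << 1)
print(multiply_by_constant(7, 10))
print((7 << 3) + (7 << 1))
```

A compiler would emit the shifts and adds directly for a known constant instead of looping at run time.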
Effect of Compiler Optimization
Measurements taken on S
[Figure: program and compiler optimization level]
- Level 0: non-optimized code
- Level 1: local optimization
- Level 2: global optimization, s/w pipelining
- Level 3: adds procedure integration
Compiler Support for Multimedia Instr.
- Intel's MMX and PowerPC AltiVec have small vector processing capabilities targeting multimedia applications (to speed up graphics)
- Intel added a new set of instructions called Streaming SIMD Extension
- A major advantage of vector computers is hiding the latency of memory access by loading multiple elements and then overlapping execution with data transfer
- Vector computers typically have strided and/or gather/scatter addressing to perform operations on distant memory locations
  - Strided addressing allows memory access in increments larger than one
  - Gather/scatter addressing is similar to register indirect mode, where the addresses are stored instead of the data
- Supporting vector operations without strided addressing, as in Intel's MMX, limits the potential speedup
- Such limited support for vector processing makes the use of vectorizing compiler optimization unpopular and restricts its scope to hand-coded routines
- SIMD instructions on MMX and AltiVec tend to be solutions, not primitives
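The two addressing styles can be sketched on a flat list standing in for memory. The helper names `strided_load` and `gather_load` and the sample memory contents are assumptions for illustration; real vector hardware performs these accesses in a single instruction.

```python
def strided_load(memory, base, stride, count):
    """Load count elements starting at base, stepping by stride each time."""
    return [memory[base + i * stride] for i in range(count)]


def gather_load(memory, index_vector):
    """Load the elements whose addresses are listed in index_vector."""
    return [memory[index] for index in index_vector]


def scatter_store(memory, index_vector, values):
    """Store each value at the address given by the matching index."""
    for index, value in zip(index_vector, values):
        memory[index] = value


memory = list(range(100, 120))  # arbitrary sample contents
column = strided_load(memory, base=2, stride=4, count=3)
picked = gather_load(memory, [7, 0, 13])
scatter_store(memory, [7, 0, 13], [-1, -2, -3])
print(column, picked, memory[0], memory[7], memory[13])
```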
Starting a Program
[Figure: C program -> Compiler -> Assembly language program -> Assembler -> Object: machine language module (together with Object: library routine (machine language)) -> Linker -> Executable: machine language program -> Loader -> Memory]
Linker
- Place code and data modules symbolically in memory
- Determine the address of data and instruction labels
- Patch both internal and external references
Object files for Unix typically contain:
- Header: size and position of components
- Text segment: machine code
- Data segment: static and dynamic variables
- Relocation info: identify absolute memory references
- Symbol table: name and location of labels, procedures, and variables
- Debugging info: mapping source to object code, break points, etc.
Loading Executable Program
[Figure: memory layout from high to low addresses: Stack ($sp = 7fff ffff hex), Dynamic data, Static data (from 1000 0000 hex, with $gp = 1000 8000 hex), Text (pc = 0040 0000 hex), Reserved (down to 0)]
To load an executable, the operating system follows these steps:
- Reads the executable file header to determine the size of the text and data segments
- Creates an address space large enough for the text and data
- Copies the instructions and data from the executable file into memory
- Copies the parameters (if any) to the main program onto the stack
- Initializes the machine registers and sets the stack pointer to the first free location
- Jumps to a start-up routine that copies the parameters into the argument registers and calls the main routine of the program
Instruction Set Design Issues
- Number of Addresses
- Flow of Control
- Operand Types
- Addressing Modes
- Instruction Types
- Instruction Formats
Number of Addresses
Four categories
- 3-address machines
  - 2 for the source operands and one for the result
- 2-address machines
  - One address doubles as source and result
- 1-address machines
  - Accumulator machines
  - Accumulator is used for one source and result
- 0-address machines
  - Stack machines
  - Operands are taken from the stack
  - Result goes onto the stack
um+er of Addresses cont -um+er of Addresses
cont -
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 55101
um+er of Addresses cont-um+er of Addresses cont-
Three$address machines
To for the source operands one for the result
1ISC processors use three addresses
Sample instructions
add destsrc1src2
M(dest)=[src1]+[src2]
sub destsrc1src2
M(dest)=[src1]-[src2]
mult destsrc1src2
M(dest)=[src1][src2]
Three addresses
Operand 1 Operand 2 Result
Example a = b + c
Three-address instruction formats are not common because they reuire a
relatiely lon instruction format to hold the three address references
um+er of Addresses cont -um+er of Addresses
cont -
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 56101
um+er of Addresses cont-um+er of Addresses cont-
ltample
C statement
A C H D F 6 A
ltJuivalent code6
mult TCD T = CD
add TTB T = B+CD
sub TTE T = B+CD-E
add TTF T = B+CD-E+Fadd ATA A = B+CD-E+F+A
um+er of Addresses cont -um+er of Addresses
cont -
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 57101
um+er of Addresses cont-um+er of Addresses cont-
To$address machines
5ne address doubles (for source operand result)
3ast eample ma0es a case for it
$ Address T is used tice
Sample instructions
load destsrc M(dest)=[src]
add destsrc M(dest)=[dest]+[src]
sub destsrc M(dest)=[dest]-[src]
mult destsrc M(dest)=[dest][src]
Two Addresses
One address doubles as operand and resultExample a = a + b
The t$o-address formal reduces the space reuirement but also
introduces some a$$ardness To aoid alterin the alue of an
operand a ampOE instruction is used to moe one of the alues to a
result or temporary location before performin the operation
um+er of Addresses cont -um+er of Addresses
cont -
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 58101
um+er of Addresses cont-um+er of Addresses cont-
ltample
C statement
A C H D F 6 A
ltJuivalent code6
load TC T = C
mult TD T = CD
add TB T = B+CD
sub TE T = B+CD-Eadd TF T = B+CD-E+F
add AT A = B+CD-E+F+A
um+er of Addresses cont -um+er of Addresses
cont -
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 59101
um+er of Addresses cont-um+er of Addresses cont-
5ne$address machines 4se special set of registers called accumulators
$ Specify one source operand receive the result
Called accumulator machines
Sample instructions
load addr accum = [addr]
store addr M[addr] = accumadd addr accum = accum + [addr]
sub addr accum = accum - [addr]
mult addr accum = accum [addr]
um+er of Addresses cont -um+er of Addresses
cont -
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 60101
um+er of Addresses cont-um+er of Addresses cont-
ltample
C statementA C H D F 6 A
ltJuivalent code6
load C load C to accum
mult D accum = CD
add B accum = CD+B
sub E accum = B+CD-Eadd F accum = B+CD-E+F
add A accum = B+CD-E+F+A
store A store accum cotets A
Number of Addresses (cont.)
Zero-address machines
  Stack supplies operands and receives the result
    - Special instructions to load and store use an address
  Called stack machines (Ex: HP3000, Burroughs B5500)
  Sample instructions
    push addr    push([addr])
    pop  addr    pop([addr])
    add          push(pop + pop)
    sub          push(pop - pop)
    mult         push(pop * pop)
Number of Addresses (cont.)
Example
  C statement
    A = B + C * D - E + F + A
  Equivalent code:
    push E
    push C
    push D
    mult
    push B
    add
    sub
    push F
    add
    push A
    add
    pop  A
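A short sketch of a zero-address machine makes the operand order in the stack code explicit. It assumes the slide's definition `sub = push(pop - pop)`, meaning the first value popped (the top of the stack) is the left operand; `run_stack` and the initial values are hypothetical names and data chosen for illustration.

```python
def run_stack(program, memory):
    """Run a zero-address program; only push and pop name a memory address."""
    stack = []
    for instruction in program:
        opcode = instruction[0]
        if opcode == "push":
            stack.append(memory[instruction[1]])
        elif opcode == "pop":
            memory[instruction[1]] = stack.pop()
        else:
            first = stack.pop()   # top of stack: left operand
            second = stack.pop()  # next entry: right operand
            if opcode == "add":
                stack.append(first + second)
            elif opcode == "sub":
                stack.append(first - second)
            elif opcode == "mult":
                stack.append(first * second)
            else:
                raise ValueError(f"unknown opcode: {opcode}")
    return memory


# Hypothetical initial values for the variables in the C statement.
stack_values = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
stack_program = [
    ("push", "E"), ("push", "C"), ("push", "D"), ("mult",),
    ("push", "B"), ("add",), ("sub",),
    ("push", "F"), ("add",), ("push", "A"), ("add",), ("pop", "A"),
]
stack_result = run_stack(stack_program, dict(stack_values))
print(stack_result["A"])
```

E is pushed first so that it sits below B + C*D when `sub` executes, which yields (B + C*D) - E rather than the reverse.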
Load/Store Architecture
  Instructions expect operands in internal processor registers
  Special LOAD and STORE instructions move data between registers and memory
  RISC uses this architecture
  Reduces instruction length
Load/Store Architecture (cont.)
  Sample instructions
    load  Rd,addr       Rd = [addr]
    store addr,Rs       (addr) = Rs
    add   Rd,Rs1,Rs2    Rd = Rs1 + Rs2
    sub   Rd,Rs1,Rs2    Rd = Rs1 - Rs2
    mult  Rd,Rs1,Rs2    Rd = Rs1 * Rs2
Number of Addresses (cont.)
Example
  C statement
    A = B + C * D - E + F + A
  Equivalent code:
    load  R1,B
    load  R2,C
    load  R3,D
    load  R4,E
    load  R5,F
    load  R6,A
    mult  R2,R2,R3
    add   R2,R2,R1
    sub   R2,R2,R4
    add   R2,R2,R5
    add   R2,R2,R6
    store A,R2
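The load/store version can be sketched the same way, this time counting memory accesses to show that only `load` and `store` touch memory. The register names R1 to R6 follow the usual numbering for this example and are an assumption, as are `run_load_store` and the initial values.

```python
def run_load_store(program, memory):
    """Run a load/store program; returns (memory, number of memory accesses)."""
    registers = {}
    memory_accesses = 0
    alu = {
        "add": lambda x, y: x + y,
        "sub": lambda x, y: x - y,
        "mult": lambda x, y: x * y,
    }
    for opcode, *operands in program:
        if opcode == "load":
            rd, addr = operands
            registers[rd] = memory[addr]
            memory_accesses += 1
        elif opcode == "store":
            addr, rs = operands
            memory[addr] = registers[rs]
            memory_accesses += 1
        else:
            # Arithmetic works on registers only: rd = rs1 op rs2.
            rd, rs1, rs2 = operands
            registers[rd] = alu[opcode](registers[rs1], registers[rs2])
    return memory, memory_accesses


# Hypothetical initial values for the variables in the C statement.
ls_values = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
ls_program = [
    ("load", "R1", "B"), ("load", "R2", "C"), ("load", "R3", "D"),
    ("load", "R4", "E"), ("load", "R5", "F"), ("load", "R6", "A"),
    ("mult", "R2", "R2", "R3"), ("add", "R2", "R2", "R1"),
    ("sub", "R2", "R2", "R4"), ("add", "R2", "R2", "R5"),
    ("add", "R2", "R2", "R6"), ("store", "A", "R2"),
]
ls_memory, ls_accesses = run_load_store(ls_program, dict(ls_values))
print(ls_memory["A"], ls_accesses)
```

The program is longer than the accumulator version, but the five arithmetic instructions need no memory access at all.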
Flow of Control
  Default is sequential flow
  Several instructions alter this default execution
    Branches
      - Unconditional
      - Conditional
      - Delayed branches
    Procedure calls
      - Delayed procedure calls
Flow of Control (cont.)
Branches
  Unconditional
    - Absolute address
    - PC-relative
        * Target address is specified relative to PC contents
        * Relocatable code
  Example: MIPS
    - Absolute address
        j target
    - PC-relative
        b target
Flow of Control (cont.)
Branches
  Conditional
    - Jump is taken only if the condition is met
  Two types
    - Set-Then-Jump
        * Condition testing is separated from branching
        * Condition code registers are used to convey the condition test result
        * Condition code registers keep a record of the status of the last ALU operation, such as an overflow condition
    - Example: Pentium code
        cmp AX,BX     compare AX and BX
        je  target    jump if equal
Flow of Control (cont.)
    - Test-and-Jump
        * Single instruction performs condition testing and branching
    - Example: MIPS instruction
        beq Rsrc1,Rsrc2,target
        Jumps to target if Rsrc1 = Rsrc2
  Delayed branching
    Control is transferred after executing the instruction that follows the branch instruction
      - This instruction slot is called the delay slot
    Improves efficiency
    Highly pipelined RISC processors support delayed branching
Flow of Control (cont.)
Procedure calls
  Facilitate modular programming
  Require two pieces of information to return
    - End of procedure
        * Pentium uses the ret instruction
        * MIPS uses the jr instruction
    - Return address
        * In a (special) register
            MIPS allows any general-purpose register
        * On the stack
            Pentium
Flow of Control (cont.)
[Figure: delay slot]
Parameter Passing
Two basic techniques
  Register-based (e.g., PowerPC, MIPS)
    - Internal registers are used
        * Faster
        * Limit the number of parameters
        * Recursive procedure
  Stack-based (e.g., Pentium)
    - Stack is used
        * More general
2 Operand Types
  Instructions support basic data types
    Characters
    Integers
    Floating-point
  Instruction overload
    Same instruction for different data types
    Example: Pentium
      mov AL,address     loads an 8-bit value
      mov AX,address     loads a 16-bit value
      mov EAX,address    loads a 32-bit value
Operand Types
  Separate instructions
    Instructions specify the operand size
    Example: MIPS
      lb Rdest,address    loads a byte
      lh Rdest,address    loads a halfword (16 bits)
      lw Rdest,address    loads a word (32 bits)
      ld Rdest,address    loads a doubleword (64 bits)
    Similar instruction: store
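To make the operand sizes concrete, the following sketch reads 1, 2, 4, and 8 bytes from the same address of a small byte array, mimicking lb, lh, lw, and ld. It assumes little-endian byte order and signed loads (MIPS hardware can be configured for either byte order); the `load` helper and the memory contents are hypothetical.

```python
import struct

# Eight bytes of hypothetical memory: 0x01, 0x02, ..., 0x08.
memory_bytes = bytes(range(1, 9))

# struct format codes for signed values of each operand size.
SIZE_TO_FORMAT = {1: "b", 2: "h", 4: "i", 8: "q"}


def load(address, size):
    """Read a signed little-endian operand of `size` bytes at `address`."""
    return struct.unpack_from("<" + SIZE_TO_FORMAT[size], memory_bytes, address)[0]


byte_value = load(0, 1)        # like lb
halfword_value = load(0, 2)    # like lh
word_value = load(0, 4)        # like lw
doubleword_value = load(0, 8)  # like ld
print(hex(byte_value), hex(halfword_value), hex(word_value), hex(doubleword_value))
```

The same address yields four different values because the instruction, not the data, decides how many bytes form the operand.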
3 Addressing Modes
How the operands are specified
  Operands can be in three places
    - Registers
        * Register addressing mode
    - Part of instruction
        * Constant
        * Immediate addressing mode
        * All processors support these two addressing modes
    - Memory
        * Difference between RISC and CISC
        * CISC supports a large variety of addressing modes
        * RISC follows load/store architecture
4 Instruction Types
Several types of instructions
  Data movement
    - Pentium: mov dest,src
    - Some do not provide direct data movement instructions
    - Indirect data movement
        add Rdest,Rsrc,0    Rdest = Rsrc + 0
  Arithmetic and Logical
    - Arithmetic
        * Integer and floating-point, signed and unsigned
        * add, subtract, multiply, divide
    - Logical
        * and, or, not, xor
Instruction Types (cont.)
Condition code bits
  S: Sign bit (0 = +, 1 = -)
  Z: Zero bit (0 = nonzero, 1 = zero)
  O: Overflow bit (0 = no overflow, 1 = overflow)
  C: Carry bit (0 = no carry, 1 = carry)
  Example: Pentium
    cmp count,25    compare count to 25
                    (subtract 25 from count)
    je  target      jump if equal
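How a compare sets the four condition code bits can be sketched by performing the subtraction at a fixed width. This is an illustration under assumptions: a 32-bit operand width, the x86 convention that the carry bit records a borrow on subtraction, and the hypothetical helper name `compare_flags`.

```python
def compare_flags(a, b, bits=32):
    """Return the S, Z, O, C bits that `cmp a, b` (a - b) would produce."""
    mask = (1 << bits) - 1
    sign_bit = 1 << (bits - 1)
    a &= mask
    b &= mask
    difference = (a - b) & mask
    return {
        "S": int(bool(difference & sign_bit)),  # result is negative
        "Z": int(difference == 0),              # operands are equal
        # Signed overflow: operands differ in sign and the result's sign
        # differs from the sign of the first operand.
        "O": int(bool((a ^ b) & (a ^ difference) & sign_bit)),
        "C": int(a < b),                        # unsigned borrow
    }


equal_flags = compare_flags(25, 25)
less_flags = compare_flags(24, 25)
overflow_flags = compare_flags(-2**31, 1)
print(equal_flags, less_flags, overflow_flags)
```

A conditional jump such as je then only inspects the stored bits (here Z); it does not repeat the comparison.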
Instruction Types (cont.)
Flow control and I/O instructions
  - Branch
  - Procedure call
  - Interrupts
I/O instructions
  - Memory-mapped I/O
      * Most processors support memory-mapped I/O
      * No separate instructions for I/O
  - Isolated I/O
      * Pentium supports isolated I/O
      * Separate I/O instructions
          in  AX,port    read from an I/O port
          out port,AX    write to an I/O port
5 Instruction Formats
Two types
  Fixed-length
    - Used by RISC processors
    - 32-bit RISC processors use 32-bit wide instructions
        * Examples: SPARC, MIPS, PowerPC
  Variable-length
    - Used by CISC processors
    - Memory operands need more bits to specify
Opcode
  Major and exact operation
Examples of Instruction Formats
RISC (Reduced Instruction Set Computer) versus CISC (Complex Instruction Set Computer)
RISC vs CISC
  The underlying philosophy of RISC machines is that a system is better able to manage program execution when the program consists of only a few different instructions that are the same length and require the same number of clock cycles to decode and execute.
  RISC systems access memory only with explicit load and store instructions.
  In CISC systems, many different kinds of instructions access memory, making instruction length variable and fetch-decode-execute time unpredictable.
RISC vs CISC
  The difference between CISC and RISC becomes evident through the basic computer performance equation:
    time/program = (time/cycle) x (cycles/instruction) x (instructions/program)
  RISC systems shorten execution time by reducing the clock cycles per instruction.
  CISC systems improve performance by reducing the number of instructions per program.
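The trade-off can be explored numerically with the performance equation, time per program = instructions per program x cycles per instruction x time per cycle. The sketch below is illustrative only: `cpu_time_ns` is a hypothetical helper, and the instruction counts, CPI values, and cycle times are made-up numbers, not measurements.

```python
def cpu_time_ns(instruction_count, cycles_per_instruction, cycle_time_ns):
    """time/program = instructions/program * cycles/instruction * time/cycle."""
    return instruction_count * cycles_per_instruction * cycle_time_ns


# Made-up figures: the CISC-style program has fewer instructions but a
# higher CPI; the RISC-style program has more instructions but a lower CPI.
cisc_time = cpu_time_ns(instruction_count=1_000, cycles_per_instruction=6.0, cycle_time_ns=1.0)
risc_time = cpu_time_ns(instruction_count=2_500, cycles_per_instruction=1.2, cycle_time_ns=1.0)
print(cisc_time, risc_time)
```

Which design wins depends entirely on how far each factor moves, which is why neither approach is faster by definition.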
RISC vs CISC
  The simple instruction set of RISC machines enables control units to be hardwired for maximum speed.
  The more complex, and variable, instruction set of CISC machines requires microcode-based control units that interpret instructions as they are fetched from memory. This translation takes time.
  With fixed-length instructions, RISC lends itself to pipelining and speculative execution.
RISC vs CISC
  Consider the two program fragments:
    CISC                RISC
    mov ax, 10          mov ax, 0
    mov bx, 5           mov bx, 10
    mul bx, ax          mov cx, 5
                        Begin: add ax, bx
                        loop Begin
  The total clock cycles for the CISC version might be:
    (2 movs x 1 cycle) + (1 mul x 30 cycles) = 32 cycles
  While the clock cycles for the RISC version are:
    (3 movs x 1 cycle) + (5 adds x 1 cycle) + (5 loops x 1 cycle) = 13 cycles
  With the RISC clock cycle also being shorter, RISC gives us much faster execution speeds.
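The cycle totals for the two fragments can be checked with a small tally. The per-instruction cycle costs are the illustrative figures used for this comparison (30 cycles for the multiply, 1 cycle for everything else), not measurements of a real processor; `total_cycles` and the dictionary layout are assumptions.

```python
def total_cycles(instruction_mix):
    """Sum count * cycles over a {name: (count, cycles_each)} mapping."""
    return sum(count * cycles for count, cycles in instruction_mix.values())


# CISC fragment: two moves and one multi-cycle multiply.
cisc_mix = {"mov": (2, 1), "mul": (1, 30)}
# RISC fragment: three moves, then a loop of five adds and five loop tests.
risc_mix = {"mov": (3, 1), "add": (5, 1), "loop": (5, 1)}

cisc_cycles = total_cycles(cisc_mix)
risc_cycles = total_cycles(risc_mix)
print(cisc_cycles, risc_cycles)
```

The RISC fragment executes more instructions (13 against 3) yet needs fewer cycles, which is the point of the comparison.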
RISC vs CISC
  Because of their load-store ISAs, RISC architectures require a large number of CPU registers.
  These registers provide fast access to data during sequential program execution.
  They can also be employed to reduce the overhead typically caused by passing parameters to subprograms.
  Instead of pulling parameters off a stack, the subprogram is directed to use a subset of registers.
RISC vs CISC
  This is how registers can be overlapped in a RISC system.
  The current window pointer (CWP) points to the active register window.
RISC vs CISC
  It is becoming increasingly difficult to distinguish RISC architectures from CISC architectures.
  Some RISC systems provide more extravagant instruction sets than some CISC systems.
  Some systems combine both approaches.
  The following two slides summarize the characteristics that traditionally typify the differences between these two architectures.
RISC vs CISC
  RISC
    - Multiple register sets.
    - Three operands per instruction.
    - Parameter passing through register windows.
    - Single-cycle instructions.
    - Hardwired control.
    - Highly pipelined.
  CISC
    - Single register set.
    - One or two register operands per instruction.
    - Parameter passing through memory.
    - Multiple-cycle instructions.
    - Microprogrammed control.
    - Less pipelined.
  Continued...
RISC vs CISC
  RISC
    - Simple instructions, few in number.
    - Fixed-length instructions.
    - Complexity in compiler.
    - Only LOAD/STORE instructions access memory.
    - Few addressing modes.
  CISC
    - Many complex instructions.
    - Variable-length instructions.
    - Complexity in microcode.
    - Many instructions can access memory.
    - Many addressing modes.
Summary
Instruction Set Design Issues
Instruction set design issues include:
  Where are operands stored?
    - registers, memory, stack, accumulator
  How many explicit operands are there?
    - 0, 1, 2, or 3
  How is the operand location specified?
    - register, immediate, indirect, ...
  What type and size of operands are supported?
    - byte, int, float, double, string, vector, ...
  What operations are supported?
    - add, sub, mul, move, compare, ...
More About General Purpose Registers
Why do almost all new architectures use GPRs?
  Registers are much faster than memory (even cache)
    - Register values are available immediately
    - When memory isn't ready, the processor must wait ("stall")
  Registers are convenient for variable storage
    - Compiler assigns some variables just to registers
    - More compact code, since small fields specify registers (compared to memory addresses)
[Figure: Processor, Registers, Cache, Memory, Disk]
What Operations are Needed
  Arithmetic and Logical
    Integer arithmetic: ADD, SUB, MULT, DIV, SHIFT
    Logical operation: AND, OR, XOR, NOT
  Data Transfer - copy, load, store
  Control - branch, jump, call, return
  Floating Point: ADDF, MULF, DIVF
    Same as arithmetic but usually take bigger operands
  Decimal - ADDD, CONVERT
  String - move, compare, search
  Graphics - pixel and vertex, compression/decompression operations
Stack Architecture: Pros and Cons
  Pros
    - Good code density (implicit top of stack)
    - Low hardware requirements
    - Easy to write a simpler compiler for stack architectures
  Cons
    - Stack becomes the bottleneck
    - Little ability for parallelism or pipelining
    - Data is not always at the top of the stack when needed, so additional instructions like TOP and SWAP are needed
    - Difficult to write an optimizing compiler for stack architectures
Accumulator Architecture: Pros and Cons
  Pros
    - Very low hardware requirements
    - Easy to design and understand
  Cons
    - Accumulator becomes the bottleneck
    - Little ability for parallelism or pipelining
    - High memory traffic
Memory-Memory Architecture: Pros and Cons
  Pros
    - Requires fewer instructions (especially if 3 operands)
    - Easy to write compilers for (especially if 3 operands)
  Cons
    - Very high memory traffic (especially if 3 operands)
    - Variable number of clocks per instruction
    - With two operands, more data movements are required
Memory-Register Architecture: Pros and Cons
  Pros
    - Some data can be accessed without loading first
    - Instruction format easy to encode
    - Good code density
  Cons
    - Operands are not equivalent (poor orthogonality)
    - Variable number of clocks per instruction
    - May limit number of registers
Encoding an Instruction Set
  Instruction encoding affects the size of the compiled program and the complexity of the CPU implementation.
  The operation is typically specified in one field, called the opcode.
  The addressing mode for the operand can be encoded with the operation, or specified through a separate identifier when a large number of modes is supported.
  The architecture must balance several competing factors:
    - Desire to support as many registers and addressing modes as possible
    - Effect of operand specification on the size of the instruction (program)
    - Desire to simplify instruction fetching and decoding during execution
  Fixed-size instruction encoding simplifies the CPU design while limiting the addressing modes supported.
  An architect caring about code size can use variable-size encoding.
  A hybrid approach is to allow variability by supporting multiple-sized instructions.
Encoding Examples
MIPS Instruction format
Register-format instructions:
  op:    Basic operation of the instruction, traditionally called the opcode
  rs:    The first register source operand
  rt:    The second register source operand
  rd:    The register destination operand; it gets the result of the operation
  shamt: Shift amount
  funct: This field selects the specific variant of the operation of the op field
  Field widths: op (6 bits), rs (5 bits), rt (5 bits), rd (5 bits), shamt (5 bits), funct (6 bits)
Immediate-type instructions:
  Some instructions need longer fields than provided, for large-value constants
  The 16-bit address means a load word instruction can load a word within a region of +/- 2^15 bytes of the address in the base register
  Example: lw $t0, 32($s3)    # Temporary register $t0 gets A[8]
  Field widths: op (6 bits), rs (5 bits), rt (5 bits), address (16 bits)

  Instruction  Format  op   rs   rt   rd   shamt  funct  address
  add          R       0    reg  reg  reg  0      32     N/A
  sub          R       0    reg  reg  reg  0      34     N/A
  lw           I       35   reg  reg  N/A  N/A    N/A    address
  sw           I       43   reg  reg  N/A  N/A    N/A    address
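Packing the fields into a 32-bit word can be sketched with shifts and masks. The helper names `encode_r` and `encode_i` are hypothetical; the field widths and the opcode/funct values are the standard MIPS ones (op 0 with funct 32 for add, op 35 for lw), and the register numbers used are the conventional $t0 = 8, $s1 = 17, $s2 = 18, $s3 = 19.

```python
def encode_r(op, rs, rt, rd, shamt, funct):
    """Pack an R-format instruction: 6 + 5 + 5 + 5 + 5 + 6 bits."""
    return (op << 26) | (rs << 21) | (rt << 16) | (rd << 11) | (shamt << 6) | funct


def encode_i(op, rs, rt, address):
    """Pack an I-format instruction: 6 + 5 + 5 + 16 bits."""
    # Mask the address so negative offsets fit in 16 bits (two's complement).
    return (op << 26) | (rs << 21) | (rt << 16) | (address & 0xFFFF)


T0, S1, S2, S3 = 8, 17, 18, 19  # conventional MIPS register numbers

# add $t0, $s1, $s2  ->  op=0, rs=$s1, rt=$s2, rd=$t0, shamt=0, funct=32
add_word = encode_r(0, S1, S2, T0, 0, 32)
# lw $t0, 32($s3)    ->  op=35, rs=$s3 (base), rt=$t0 (destination), address=32
lw_word = encode_i(35, S3, T0, 32)
print(f"{add_word:#010x} {lw_word:#010x}")
```

Both formats keep op, rs, and rt in the same bit positions, so the decoder can read them before it knows which format it is handling.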
The Stored Program Concept
  Learning how instructions are represented leads to discovering the secret of computing: the stored-program concept.
  Today's computers are built on two key principles:
    - Instructions are represented as numbers
    - Programs can be stored in memory to be read or written just like numbers
  The power of the concept: memory can contain
    - the source code for an editor
    - the compiled machine code for the editor
    - the text that the compiled program is using
    - the compiler that generated the code
[Figure: Processor connected to Memory, which holds: Accounting program (machine code), Editor program (machine code), C compiler (machine code), Payroll data, Book text, Source code in C for editor program]
Compiling if-then-else in MIPS
  Assuming the five variables f, g, h, i, and j correspond to the five registers $s0 through $s4, what is the compiled MIPS code for the following C if statement?
    if (i == j) f = g + h; else f = g - h;
[Figure: flowchart. Test i == j; if i = j, f = g + h; if i != j (Else:), f = g - h; both paths meet at Exit:]
  MIPS:
          bne $s3, $s4, Else     # go to Else if i != j
          add $s0, $s1, $s2      # f = g + h (skipped if i != j)
          j   Exit
    Else: sub $s0, $s1, $s2      # f = g - h (skipped if i = j)
    Exit:
Typical Compilation
Major Types of Optimization
  Optimization name: explanation (frequency, where given; N.M. = not measured)
  High-level (at or near source level; machine independent)
    - Procedure integration: Replace procedure call by procedure body (N.M.)
  Local (within straight-line code)
    - Common subexpression elimination: Replace two instances of the same computation by a single copy
    - Constant propagation: Replace all instances of a variable that is assigned a constant with the constant
    - Stack height reduction: Rearrange expression tree to minimize resources needed for expression evaluation (N.M.)
  Global (across a branch)
    - Global common subexpression elimination: Same as local, but this version crosses branches
    - Copy propagation: Replace all instances of a variable A that has been assigned X (i.e., A = X) with X
    - Code motion: Remove code from a loop that computes the same value each iteration of the loop
    - Induction variable elimination: Simplify/eliminate array addressing calculations within loops
  Machine-dependent (depends on machine knowledge)
    - Strength reduction: Many examples, such as replacing multiply by a constant with adds and shifts (N.M.)
    - Pipeline scheduling: Reorder instructions to improve pipeline performance (N.M.)
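Strength reduction, the machine-dependent case named above, can be illustrated by replacing a multiply by a constant with shifts and adds. The constant 10 and both function names are chosen purely for illustration; a real compiler makes this choice based on the instruction costs of the target machine.

```python
def times_ten_multiply(x):
    """Straightforward form: one multiply instruction."""
    return x * 10


def times_ten_shift_add(x):
    """Strength-reduced form: 10*x = 8*x + 2*x = (x << 3) + (x << 1)."""
    return (x << 3) + (x << 1)


# The two forms agree on every integer tried, including negative ones.
samples = list(range(-50, 51))
agree = all(times_ten_multiply(x) == times_ten_shift_add(x) for x in samples)
print(agree)
```

The transformation pays off only when a multiply costs more than two shifts and an add on the target processor, which is why it is classed as machine-dependent.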
Effect of Compiler Optimization
[Figure: effect of compiler optimization, by program and compiler optimization level]
  Level 0: non-optimized code
  Level 1: local optimization
  Level 2: global optimization, s/w pipelining
  Level 3: adds procedure integration
Compiler Support for Multimedia Instr.
  Intel's MMX and PowerPC AltiVec have small vector processing capabilities targeting multimedia applications (to speed up graphics).
  Intel added a new set of instructions, called Streaming SIMD Extension.
  A major advantage of vector computers is hiding the latency of memory access by loading multiple elements and then overlapping execution with data transfer.
  Vector computers typically have strided and/or gather/scatter addressing to perform operations on distant memory locations.
    - Strided addressing allows memory access in increments larger than one.
    - Gather/scatter addressing is similar to register indirect mode, where the addresses are stored instead of the data.
  Supporting vector operations without strided addressing, as in Intel's MMX, limits the potential speedup.
  Such limited support for vector processing makes the use of vectorizing compiler optimization unpopular and restricts its scope to hand-coded routines.
  SIMD instructions on MMX and AltiVec tend to be solutions, not primitives.
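The two addressing styles mentioned on this slide can be sketched on a plain list standing in for memory. This only models which elements are touched, not the hardware; the function names and the sample data are hypothetical.

```python
def strided_load(memory, base, stride, count):
    """Load `count` elements starting at `base`, stepping by `stride`."""
    return [memory[base + i * stride] for i in range(count)]


def gather_load(memory, index_vector):
    """Load the elements whose addresses are listed in `index_vector`."""
    return [memory[index] for index in index_vector]


def scatter_store(memory, index_vector, values):
    """Store `values` at the addresses listed in `index_vector`."""
    for index, value in zip(index_vector, values):
        memory[index] = value


vector_memory = list(range(100, 116))  # hypothetical memory: 100, 101, ..., 115

column = strided_load(vector_memory, base=1, stride=4, count=4)
picked = gather_load(vector_memory, [15, 0, 7])
scatter_store(vector_memory, [15, 0, 7], [-1, -2, -3])
print(column, picked, vector_memory[0], vector_memory[7], vector_memory[15])
```

A stride of 4 here corresponds to walking down one column of a 4-wide row-major matrix, the typical case that unit-stride-only hardware handles poorly.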
Starting a Program
[Figure: C program -> Compiler -> Assembly language program -> Assembler -> Object: Machine language module (plus Object: Library routine (machine language)) -> Linker -> Executable: Machine language program -> Loader -> Memory]
  Linker
    - Place code and data modules symbolically in memory
    - Determine the address of data and instruction labels
    - Patch both internal and external references
  Object files for Unix typically contain:
    - Header: size and position of components
    - Text segment: machine code
    - Data segment: static and dynamic variables
    - Relocation info: identify absolute memory references
    - Symbol table: name and location of labels, procedures and variables
    - Debugging info: mapping source to object code, break points, etc.
Loading Executable Program
[Figure: MIPS memory layout, from high to low addresses]
    $sp  7fff ffff hex   Stack
                         Dynamic data
    $gp  1000 8000 hex   Static data
         1000 0000 hex
                         Text
    pc   0040 0000 hex
                         Reserved
         0
  To load an executable, the operating system follows these steps:
    - Reads the executable file header to determine the size of the text and data segments
    - Creates an address space large enough for the text and data
    - Copies the instructions and data from the executable file into memory
    - Copies the parameters (if any) to the main program onto the stack
    - Initializes the machine registers and sets the stack pointer to the first free location
    - Jumps to a start-up routine that copies the parameters into the argument registers and calls the main routine of the program
Instruction Set Design Issues
  Instruction Set Design Issues
    - Number of Addresses
    - Flow of Control
    - Operand Types
    - Addressing Modes
    - Instruction Types
    - Instruction Formats
Number of Addresses
Four categories
  3-address machines
    - 2 for the source operands and one for the result
  2-address machines
    - One address doubles as source and result
  1-address machines
    - Accumulator machines
    - Accumulator is used for one source and result
  0-address machines
    - Stack machines
    - Operands are taken from the stack
    - Result goes onto the stack
Number of Addresses (cont.)
Three-address machines
  Two for the source operands, one for the result
  RISC processors use three addresses
  Sample instructions
    add  dest,src1,src2    M(dest) = [src1] + [src2]
    sub  dest,src1,src2    M(dest) = [src1] - [src2]
    mult dest,src1,src2    M(dest) = [src1] * [src2]
Three addresses
  Operand 1, Operand 2, Result
  Example: a = b + c
  Three-address instruction formats are not common, because they require a relatively long instruction format to hold the three address references.
Number of Addresses (cont.)
Example
  C statement
    A = B + C * D - E + F + A
  Equivalent code:
    mult T,C,D    T = C*D
    add  T,T,B    T = B + C*D
    sub  T,T,E    T = B + C*D - E
    add  T,T,F    T = B + C*D - E + F
    add  A,T,A    A = B + C*D - E + F + A
um+er of Addresses cont -um+er of Addresses
cont -
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 57101
um+er of Addresses cont-um+er of Addresses cont-
To$address machines
5ne address doubles (for source operand result)
3ast eample ma0es a case for it
$ Address T is used tice
Sample instructions
load destsrc M(dest)=[src]
add destsrc M(dest)=[dest]+[src]
sub destsrc M(dest)=[dest]-[src]
mult destsrc M(dest)=[dest][src]
Two Addresses
One address doubles as operand and resultExample a = a + b
The t$o-address formal reduces the space reuirement but also
introduces some a$$ardness To aoid alterin the alue of an
operand a ampOE instruction is used to moe one of the alues to a
result or temporary location before performin the operation
um+er of Addresses cont -um+er of Addresses
cont -
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 58101
um+er of Addresses cont-um+er of Addresses cont-
ltample
C statement
A C H D F 6 A
ltJuivalent code6
load TC T = C
mult TD T = CD
add TB T = B+CD
sub TE T = B+CD-Eadd TF T = B+CD-E+F
add AT A = B+CD-E+F+A
um+er of Addresses cont -um+er of Addresses
cont -
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 59101
um+er of Addresses cont-um+er of Addresses cont-
5ne$address machines 4se special set of registers called accumulators
$ Specify one source operand receive the result
Called accumulator machines
Sample instructions
load addr accum = [addr]
store addr M[addr] = accumadd addr accum = accum + [addr]
sub addr accum = accum - [addr]
mult addr accum = accum [addr]
um+er of Addresses cont -um+er of Addresses
cont -
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 60101
um+er of Addresses cont-um+er of Addresses cont-
ltample
C statementA C H D F 6 A
ltJuivalent code6
load C load C to accum
mult D accum = CD
add B accum = CD+B
sub E accum = B+CD-Eadd F accum = B+CD-E+F
add A accum = B+CD-E+F+A
store A store accum cotets A
um+er of Addresses cont -um+er of Addresses
cont -
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 61101
um+er of Addresses cont-um+er of Addresses cont-
Vero$address machines
Stac0 supplies operands and receives the result$ Special instructions to load and store use an address
Called stac0 machines (lt6 Pgtgtgt 8urroughs 8gtgt)
Sample instructions
us addr us([addr])
o addr o([addr])
add us(o + o)
sub us(o - o) mult us(o o)
um+er of Addresses cont -um+er of Addresses
cont-
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 62101
um+er of Addresses cont-um+er of Addresses cont-
ltample
C statement
A C H D F 6 A
ltJuivalent code6
us E sub
us C us F
us D add
Mult us A
us B add
add o A
)oadStore Architecture)oadStore Architecture
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 63101
)oadStore Architecture)oadStore Architecture
Instructions epect operands in internal processor registers Special 35A and ST51lt instructions move data beteen registers
and memory
1ISC uses this architecture
1educes instruction length
()
)oadStore Architecture cont-)oadStore Architecture
cont-
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 64101
)oadStore Architecture cont-)oadStore Architecture cont-
Sample instructionsload $daddr $d = [addr]
store addr$s (addr) = $s
add $d$s$samp $d = $s + $sampsub $d$s$samp $d = $s - $samp
mult $d$s$samp $d = $s $samp
um+er of Addresses cont-um+er of Addresses
cont-
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 65101
um+er of Addresses cont-um+er of Addresses cont-
ampleC statement
A = B + C D E + F + A
1uialent co)eload $B mult $amp$amp$
load $ampC add $amp$amp$
load $D sub $amp$amp$
load $E add $amp$amp$
load $F add $amp$amp$
load $A store A$amp
0lo1 of Control 0lo1 of Control
Flow of Control
- Default is sequential flow
- Several instructions alter this default execution
  - Branches
    - Unconditional
    - Conditional
    - Delayed branches
  - Procedure calls
    - Delayed procedure calls
Flow of Control (cont.)
Branches
- Unconditional
  - Absolute address
  - PC-relative
    - Target address is specified relative to PC contents
    - Relocatable code
- Example: MIPS
  - Absolute address
      j target
  - PC-relative
      b target
Flow of Control (cont.)
Branches
- Conditional
  - Jump is taken only if the condition is met
- Two types
  - Set-Then-Jump
    - Condition testing is separated from branching
    - Condition code registers are used to convey the condition test result
    - Condition code registers keep a record of the status of the last ALU operation, such as an overflow condition
  - Example: Pentium code
      cmp AX,BX        ; compare AX and BX
      je  target       ; jump if equal
  - Test-and-Jump
    - A single instruction performs condition testing and branching
  - Example: MIPS instruction
      beq Rsrc1,Rsrc2,target
    Jumps to target if Rsrc1 = Rsrc2
Delayed branching
- Control is transferred after executing the instruction that follows the branch instruction
  - This instruction slot is called the delay slot
- Improves efficiency
- Highly pipelined RISC processors support delayed branching
Flow of Control (cont.)
Procedure calls
- Facilitate modular programming
- Require two pieces of information to return
  - End of procedure
    - Pentium uses the ret instruction
    - MIPS uses the jr instruction
  - Return address
    - In a (special) register
      - MIPS allows any general-purpose register
    - On the stack
      - Pentium
[Figure: delay slot]
Parameter Passing
Two basic techniques
- Register-based (e.g., PowerPC, MIPS)
  - Internal registers are used
    - Faster
    - Limits the number of parameters
    - Recursive procedures are problematic (registers must be saved across calls)
- Stack-based (e.g., Pentium)
  - Stack is used
    - More general
2. Operand Types
- Instructions support basic data types
  - Characters
  - Integers
  - Floating-point
- Instruction overload
  - Same instruction for different data types
  - Example: Pentium
      mov AL,address       ; loads an 8-bit value
      mov AX,address       ; loads a 16-bit value
      mov EAX,address      ; loads a 32-bit value
- Separate instructions
  - Instructions specify the operand size
  - Example: MIPS
      lb Rdest,address     ; loads a byte
      lh Rdest,address     ; loads a halfword (16 bits)
      lw Rdest,address     ; loads a word (32 bits)
      ld Rdest,address     ; loads a doubleword (64 bits)
    Similar instructions exist for store
3. Addressing Modes
- How the operands are specified
- Operands can be in three places
  - Registers
    - Register addressing mode
  - Part of instruction
    - Constant
    - Immediate addressing mode
    - All processors support these two addressing modes
  - Memory
    - Difference between RISC and CISC
    - CISC supports a large variety of addressing modes
    - RISC follows load/store architecture
4. Instruction Types
- Several types of instructions
  - Data movement
    - Pentium: mov dest,src
    - Some do not provide direct data movement instructions
    - Indirect data movement
        add Rdest,Rsrc,0     ; Rdest = Rsrc+0
  - Arithmetic and Logical
    - Arithmetic
      - Integer and floating-point, signed and unsigned
      - add, subtract, multiply, divide
    - Logical
      - and, or, not, xor
Instruction Types (cont.)
- Condition code bits
  - S: Sign bit (0 = +, 1 = -)
  - Z: Zero bit (0 = nonzero, 1 = zero)
  - O: Overflow bit (0 = no overflow, 1 = overflow)
  - C: Carry bit (0 = no carry, 1 = carry)
- Example: Pentium
    cmp count,25       ; compare count to 25
                       ; (subtract 25 from count)
    je  target         ; jump if equal
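How a compare sets the four condition code bits can be sketched in a few lines. This is a sketch under stated assumptions: the function name `compare_flags` and the 8-bit default width are illustrative, and the carry bit follows the x86 convention of acting as a borrow on subtraction.

```python
def compare_flags(a, b, bits=8):
    """Return the S, Z, O, C bits after computing a - b on `bits`-bit operands."""
    mask = (1 << bits) - 1
    sign_bit = 1 << (bits - 1)
    a &= mask
    b &= mask
    result = (a - b) & mask
    return {
        "S": int(bool(result & sign_bit)),   # sign of the result
        "Z": int(result == 0),               # set when the operands are equal
        # Signed overflow: the operands differ in sign and the result's
        # sign differs from the sign of the first operand.
        "O": int(bool((a ^ b) & (a ^ result) & sign_bit)),
        # x86 convention: carry acts as a borrow, set when a < b (unsigned).
        "C": int(a < b),
    }


flags = compare_flags(25, 25)
if flags["Z"]:            # this is the condition a "jump if equal" tests
    print("jump taken")
```

A jump-if-equal instruction only inspects Z, which is why the compare and the branch can be separate instructions in the Set-Then-Jump style.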
Instruction Types (cont.)
- Flow control and I/O instructions
  - Branch
  - Procedure call
  - Interrupts
- I/O instructions
  - Memory-mapped I/O
    - Most processors support memory-mapped I/O
    - No separate instructions for I/O
  - Isolated I/O
    - Pentium supports isolated I/O
    - Separate I/O instructions
        in  AX,io_port       ; read from an I/O port
        out io_port,AX       ; write to an I/O port
5. Instruction Formats
- Two types
  - Fixed-length
    - Used by RISC processors
    - 32-bit RISC processors use 32-bit-wide instructions
      - Examples: SPARC, MIPS, PowerPC
  - Variable-length
    - Used by CISC processors
    - Memory operands need more bits to specify
- Opcode
  - Major and exact operation
[Figure: Examples of Instruction Formats]
RISC (Reduced Instruction Set Computer) versus CISC (Complex Instruction Set Computer)
RISC vs. CISC
- The underlying philosophy of RISC machines is that a system is better able to manage program execution when the program consists of only a few different instructions that are the same length and require the same number of clock cycles to decode and execute.
- RISC systems access memory only with explicit load and store instructions.
- In CISC systems, many different kinds of instructions access memory, making instruction length variable and fetch-decode-execute time unpredictable.
- The difference between CISC and RISC becomes evident through the basic computer performance equation:
    time/program = (time/cycle) x (cycles/instruction) x (instructions/program)
- RISC systems shorten execution time by reducing the clock cycles per instruction.
- CISC systems improve performance by reducing the number of instructions per program.
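The performance equation (execution time = instructions per program x cycles per instruction x time per cycle) can be tried out numerically. The sketch below is illustrative only: the function name `cpu_time_ns` and every number in it are hypothetical, chosen to show the trade-off rather than to describe any real machine.

```python
def cpu_time_ns(instructions, cycles_per_instruction, cycle_time_ns):
    """Execution time = instructions/program x cycles/instruction x time/cycle."""
    return instructions * cycles_per_instruction * cycle_time_ns


# Hypothetical workloads: the CISC version needs fewer instructions,
# while the RISC version needs fewer cycles per instruction.
cisc = cpu_time_ns(instructions=1_000_000, cycles_per_instruction=4.0, cycle_time_ns=2.0)
risc = cpu_time_ns(instructions=1_500_000, cycles_per_instruction=1.5, cycle_time_ns=2.0)
print(cisc, risc)
```

Which side wins depends entirely on the three factors; the equation only makes explicit which factor each design style tries to shrink.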
RISC vs. CISC (cont.)
- The simple instruction set of RISC machines enables control units to be hardwired for maximum speed.
- The more complex, and variable, instruction set of CISC machines requires microcode-based control units that interpret instructions as they are fetched from memory. This translation takes time.
- With fixed-length instructions, RISC lends itself to pipelining and speculative execution.
RISC vs. CISC (cont.)
- Consider the program fragments:
    CISC                  RISC
    mov ax, 10            mov ax, 0
    mov bx, 5             mov bx, 10
    mul bx, ax            mov cx, 5
                          Begin: add ax, bx
                                 loop Begin
- The total clock cycles for the CISC version might be:
    (2 movs x 1 cycle) + (1 mul x 30 cycles) = 32 cycles
- While the clock cycles for the RISC version are:
    (3 movs x 1 cycle) + (5 adds x 1 cycle) + (5 loops x 1 cycle) = 13 cycles
- With the RISC clock cycle also being shorter, RISC gives us much faster execution speeds.
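The cycle counts for the two fragments can be reproduced mechanically. This is a sketch under stated assumptions: the per-instruction costs (1 cycle for mov, add and loop, 30 cycles for mul) are the estimates normally quoted with this example, and the names `CYCLES`, `total_cycles` and `multiply_by_repeated_add` are illustrative.

```python
# Assumed cost of each instruction, in clock cycles.
CYCLES = {"mov": 1, "mul": 30, "add": 1, "loop": 1}


def total_cycles(trace):
    """Sum the cycles of an execution trace given as (opcode, times executed) pairs."""
    return sum(CYCLES[op] * count for op, count in trace)


def multiply_by_repeated_add(multiplicand, count):
    """What the RISC fragment computes: add `multiplicand` to ax `count` times."""
    ax = 0
    for _ in range(count):      # each pass costs one add plus one loop instruction
        ax += multiplicand
    return ax


cisc_trace = [("mov", 2), ("mul", 1)]
risc_trace = [("mov", 3), ("add", 5), ("loop", 5)]
print(total_cycles(cisc_trace), total_cycles(risc_trace))
print(multiply_by_repeated_add(10, 5))
```

The RISC fragment executes more instructions, but none of them is a multi-cycle instruction, so its total is lower under these assumed costs.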
RISC vs. CISC (cont.)
- Because of their load-store ISAs, RISC architectures require a large number of CPU registers.
- These registers provide fast access to data during sequential program execution.
- They can also be employed to reduce the overhead typically caused by passing parameters to subprograms.
- Instead of pulling parameters off of a stack, the subprogram is directed to use a subset of registers.
- This is how registers can be overlapped in a RISC system.
- The current window pointer (CWP) points to the active register window.
[Figure: overlapping register windows]
RISC vs. CISC (cont.)
- It is becoming increasingly difficult to distinguish RISC architectures from CISC architectures.
- Some RISC systems provide more extravagant instruction sets than some CISC systems.
- Some systems combine both approaches.
- The following two slides summarize the characteristics that traditionally typify the differences between these two architectures.
RISC vs. CISC (cont.)
RISC
- Multiple register sets.
- Three operands per instruction.
- Parameter passing through register windows.
- Single-cycle instructions.
- Hardwired control.
- Highly pipelined.
- Simple instructions, few in number.
- Fixed-length instructions.
- Complexity in compiler.
- Only LOAD/STORE instructions access memory.
- Few addressing modes.
CISC
- Single register set.
- One or two register operands per instruction.
- Parameter passing through memory.
- Multiple-cycle instructions.
- Microprogrammed control.
- Less pipelined.
- Many complex instructions.
- Variable-length instructions.
- Complexity in microcode.
- Many instructions can access memory.
- Many addressing modes.
Summary
Instruction Set Design Issues
- Instruction set design issues include:
  - Where are operands stored?
    - registers, memory, stack, accumulator
  - How many explicit operands are there?
    - 0, 1, 2, or 3
  - How is the operand location specified?
    - register, immediate, indirect, . . .
  - What type and size of operands are supported?
    - byte, int, float, double, string, vector, . . .
  - What operations are supported?
    - add, sub, mul, move, compare, . . .
More About General Purpose Registers
- Why do almost all new architectures use GPRs?
  - Registers are much faster than memory (even cache)
    - Register values are available immediately
    - When memory isn't ready, the processor must wait ("stall")
  - Registers are convenient for variable storage
    - Compiler assigns some variables just to registers
    - More compact code, since small fields specify registers (compared to memory addresses)
[Figure: memory hierarchy: Processor (Registers), Cache, Memory, Disk]
What Operations Are Needed
- Arithmetic and Logical
  - Integer arithmetic: ADD, SUB, MULT, DIV, SHIFT
  - Logical operations: AND, OR, XOR, NOT
- Data Transfer: copy, load, store
- Control: branch, jump, call, return
- Floating Point: ADDF, MULF, DIVF
  - Same as arithmetic but usually take bigger operands
- Decimal: ADDD, CONVERT
- String: move, compare, search
- Graphics: pixel and vertex, compression/decompression operations
Stack Architecture: Pros and Cons
- Pros
  - Good code density (implicit top of stack)
  - Low hardware requirements
  - Easy to write a simpler compiler for stack architectures
- Cons
  - Stack becomes the bottleneck
  - Little ability for parallelism or pipelining
  - Data is not always at the top of the stack when needed, so additional instructions like TOP and SWAP are needed
  - Difficult to write an optimizing compiler for stack architectures
Accumulator Architecture: Pros and Cons
- Pros
  - Very low hardware requirements
  - Easy to design and understand
- Cons
  - Accumulator becomes the bottleneck
  - Little ability for parallelism or pipelining
  - High memory traffic
Memory-Memory Architecture: Pros and Cons
- Pros
  - Requires fewer instructions (especially if 3 operands)
  - Easy to write compilers for (especially if 3 operands)
- Cons
  - Very high memory traffic (especially if 3 operands)
  - Variable number of clocks per instruction
  - With two operands, more data movements are required
Memory-Register Architecture: Pros and Cons
- Pros
  - Some data can be accessed without loading first
  - Instruction format easy to encode
  - Good code density
- Cons
  - Operands are not equivalent (poor orthogonality)
  - Variable number of clocks per instruction
  - May limit number of registers
Encoding Examples
[Figure: Encoding Examples]
MIPS Instruction Format
Register-format instructions
- op: Basic operation of the instruction, traditionally called the opcode
- rs: The first register source operand
- rt: The second register source operand
- rd: The register destination operand; it gets the result of the operation
- shamt: Shift amount
- funct: This field selects the specific variant of the operation in the op field
Immediate-type instructions
- Some instructions need longer fields than those provided above, e.g., for large-value constants
- The 16-bit address means a load word instruction can load a word within a region of ±2^15 bytes of the address in the base register
- Example: lw $t0, 32($s3)    # Temporary register $t0 gets A[8]
Instruction  Format  op  rs   rt   rd   shamt  funct  address
add          R       0   reg  reg  reg  0      32     N/A
sub          R       0   reg  reg  reg  0      34     N/A
lw           I       35  reg  reg  N/A  N/A    N/A    address
sw           I       43  reg  reg  N/A  N/A    N/A    address
R-format fields: op (6 bits) | rs (5 bits) | rt (5 bits) | rd (5 bits) | shamt (5 bits) | funct (6 bits)
I-format fields: op (6 bits) | rs (5 bits) | rt (5 bits) | address (16 bits)
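The two field layouts can be packed into 32-bit words with shifts and ORs. The sketch below assumes the standard MIPS field widths (6/5/5/5/5/6 bits for R-format, 6/5/5/16 for I-format) and the conventional register numbers $t0 = 8, $s1 = 17, $s2 = 18, $s3 = 19; the helper names `encode_r` and `encode_i` are illustrative.

```python
def encode_r(op, rs, rt, rd, shamt, funct):
    """Pack an R-format instruction: op | rs | rt | rd | shamt | funct."""
    return (op << 26) | (rs << 21) | (rt << 16) | (rd << 11) | (shamt << 6) | funct


def encode_i(op, rs, rt, address):
    """Pack an I-format instruction: op | rs | rt | 16-bit address."""
    # Masking keeps a negative offset in 16-bit two's complement form.
    return (op << 26) | (rs << 21) | (rt << 16) | (address & 0xFFFF)


add_word = encode_r(0, 17, 18, 8, 0, 32)   # add $t0, $s1, $s2
lw_word = encode_i(35, 19, 8, 32)          # lw  $t0, 32($s3)
print(f"{add_word:#010x}")
print(f"{lw_word:#010x}")
```

Because every instruction is exactly 32 bits and the op, rs and rt fields sit in the same positions in both formats, the decoder can read the register fields before it knows which format it is looking at.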
The Stored Program Concept
- Learning how instructions are represented leads to discovering the secret of computing: the stored-program concept.
- Today's computers are built on two key principles:
  - Instructions are represented as numbers
  - Programs can be stored in memory to be read or written just like numbers
- The power of the concept: memory can contain
  - the source code for an editor
  - the compiled machine code for the editor
  - the text that the compiled program is using
  - the compiler that generated the code
[Figure: a processor connected to a memory holding an accounting program (machine code), an editor program (machine code), a C compiler (machine code), payroll data, book text, and source code in C for the editor program]
Compiling if-then-else in MIPS
Assuming the five variables f, g, h, i, and j correspond to the five registers $s0 through $s4, what is the compiler's code for the following C if statement?
    if (i == j) f = g + h; else f = g - h;
[Figure: flowchart testing i == j; the i == j path executes f = g + h, the i != j path (Else:) executes f = g - h, and both paths meet at Exit:]
MIPS:
          bne $s3, $s4, Else     # go to Else if i != j
          add $s0, $s1, $s2      # f = g + h (skipped if i != j)
          j   Exit               # go to Exit
    Else: sub $s0, $s1, $s2      # f = g - h (skipped if i = j)
    Exit:
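The branch structure of the compiled if-then-else can be traced with a toy interpreter. This is a minimal sketch, not a MIPS simulator: it handles only bne, j, add and sub, ignores branch delay slots, and the function name `run` and the register values are hypothetical.

```python
def run(program, regs):
    """Interpret a tiny subset of MIPS assembly (bne, j, add, sub) with labels."""
    labels, instrs = {}, []
    for line in program:
        if line.endswith(":"):
            labels[line[:-1]] = len(instrs)     # label names the next instruction
        else:
            instrs.append(line.replace(",", " ").split())
    pc = 0
    while pc < len(instrs):
        op, *args = instrs[pc]
        pc += 1                                 # default is sequential flow
        if op == "bne":
            if regs[args[0]] != regs[args[1]]:
                pc = labels[args[2]]
        elif op == "j":
            pc = labels[args[0]]
        elif op == "add":
            regs[args[0]] = regs[args[1]] + regs[args[2]]
        elif op == "sub":
            regs[args[0]] = regs[args[1]] - regs[args[2]]
        else:
            raise ValueError(f"unsupported instruction: {op}")
    return regs


PROGRAM = [
    "bne $s3, $s4, Else",
    "add $s0, $s1, $s2",
    "j Exit",
    "Else:",
    "sub $s0, $s1, $s2",
    "Exit:",
]

# f=$s0, g=$s1, h=$s2, i=$s3, j=$s4
print(run(PROGRAM, {"$s0": 0, "$s1": 7, "$s2": 3, "$s3": 1, "$s4": 1})["$s0"])  # i == j, so f = g + h = 10
```

Note that the compiler tests the opposite condition (bne for an == test) so that the then-part can fall through without an extra jump.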
Typical Compilation
[Figure: Typical Compilation]
Major Types of Optimization
Optimization name: explanation (frequency, where given)
High-level (at or near source level; machine independent)
- Procedure integration: replace a procedure call by the procedure body (N.M.)
Local (within straight-line code)
- Common subexpression elimination: replace two instances of the same computation by a single copy
- Constant propagation: replace all instances of a variable that is assigned a constant with the constant
- Stack height reduction: rearrange the expression tree to minimize the resources needed for expression evaluation (N.M.)
Global (across a branch)
- Global common subexpression elimination: same as local, but this version crosses branches
- Copy propagation: replace all instances of a variable A that has been assigned X (i.e., A = X) with X
- Code motion: remove code from a loop that computes the same value in each iteration of the loop
- Induction variable elimination: simplify/eliminate array addressing calculations within loops
Machine-dependent (depends on machine knowledge)
- Strength reduction: many examples, such as replacing a multiply by a constant with adds and shifts (N.M.)
- Pipeline scheduling: reorder instructions to improve pipeline performance (N.M.)
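Strength reduction, replacing a multiply by a constant with adds and shifts, is easy to show concretely. This is only an illustrative sketch: the constant 10 and the function name `times_ten` are arbitrary choices, and a real compiler performs the rewrite on machine instructions rather than on source code.

```python
def times_ten(x):
    """Compute 10 * x without a multiply: 10*x = 8*x + 2*x."""
    return (x << 3) + (x << 1)      # two shifts and one add


print(times_ten(7))
```

Whether the rewrite pays off depends on the machine: it helps when a multiply costs many more cycles than a shift or an add, which is why the slide lists it as machine-dependent.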
Effect of Compiler Optimization
[Figure: measurements of the effect of compiler optimization, plotted by program and compiler optimization level]
- Level 0: non-optimized code
- Level 1: local optimization
- Level 2: global optimization, s/w pipelining
- Level 3: adds procedure integration
Compiler Support for Multimedia Instr.
- Intel's MMX and PowerPC AltiVec have small vector processing capabilities targeting multimedia applications (to speed up graphics).
- Intel added a new set of instructions, called Streaming SIMD Extension.
- A major advantage of vector computers is hiding the latency of memory access by loading multiple elements and then overlapping execution with data transfer.
- Vector computers typically have strided and/or gather/scatter addressing to perform operations on distant memory locations.
  - Strided addressing allows memory access in increments larger than one.
  - Gather/scatter addressing is similar to register indirect mode, where the addresses are stored instead of the data.
- Supporting vector operations without strided addressing, as in Intel's MMX, limits the potential speedup.
- Such limited support for vector processing makes the use of vectorizing compiler optimization unpopular and restricts its scope to hand-coded routines.
- SIMD instructions on MMX and AltiVec tend to be solutions, not primitives.
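The difference between strided and gather addressing can be shown with plain lists standing in for memory. This is a sketch only: the function names `strided_load` and `gather_load` and the sample data are hypothetical, and real vector hardware does this in a single instruction rather than a loop.

```python
def strided_load(memory, base, stride, count):
    """Load `count` elements starting at `base`, stepping `stride` locations each time."""
    return [memory[base + i * stride] for i in range(count)]


def gather_load(memory, addresses):
    """Load the elements whose addresses are listed in an index vector."""
    return [memory[address] for address in addresses]


memory = list(range(100, 120))           # hypothetical memory contents
print(strided_load(memory, 1, 4, 3))     # e.g. one column of a row-major matrix
print(gather_load(memory, [7, 0, 3]))    # arbitrary, non-regular locations
```

Without these modes, the elements would first have to be copied into consecutive locations one at a time, which is the lost speedup the slide refers to.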
Starting a Program
[Figure: C program -> Compiler -> Assembly language program -> Assembler -> Object: machine language module (plus Object: library routine, machine language) -> Linker -> Executable: machine language program -> Loader -> Memory]
Linker
- Places code and data modules symbolically in memory
- Determines the addresses of data and instruction labels
- Patches both internal and external references
Object files for Unix typically contain:
- Header: size and position of components
- Text segment: machine code
- Data segment: static and dynamic variables
- Relocation info: identifies absolute memory references
- Symbol table: names and locations of labels, procedures and variables
- Debugging info: mapping of source to object code, break points, etc.
Loading Executable Program
[Figure: MIPS memory layout, from high to low addresses: $sp -> 7fff ffff hex, Stack; Dynamic data; $gp -> 1000 8000 hex; Static data, from 1000 0000 hex; Text, with pc -> 0040 0000 hex; Reserved, from 0]
To load an executable, the operating system follows these steps:
- Reads the executable file header to determine the size of the text and data segments
- Creates an address space large enough for the text and data
- Copies the instructions and data from the executable file into memory
- Copies the parameters (if any) to the main program onto the stack
- Initializes the machine registers and sets the stack pointer to the first free location
- Jumps to a start-up routine that copies the parameters into the argument registers and calls the main routine of the program
Instruction Set Design Issues
- Number of Addresses
- Flow of Control
- Operand Types
- Addressing Modes
- Instruction Types
- Instruction Formats
Number of Addresses
- Four categories
  - 3-address machines
    - 2 for the source operands and one for the result
  - 2-address machines
    - One address doubles as source and result
  - 1-address machines
    - Accumulator machines
    - Accumulator is used for one source and result
  - 0-address machines
    - Stack machines
    - Operands are taken from the stack
    - Result goes onto the stack
Number of Addresses (cont.)
- Three-address machines
  - Two for the source operands, one for the result
  - RISC processors use three addresses
  - Sample instructions
      add  dest,src1,src2    ; M(dest) = [src1] + [src2]
      sub  dest,src1,src2    ; M(dest) = [src1] - [src2]
      mult dest,src1,src2    ; M(dest) = [src1] * [src2]
- Three addresses: Operand 1, Operand 2, Result
  - Example: a = b + c
  - Three-address instruction formats are not common, because they require a relatively long instruction format to hold the three address references.
Number of Addresses (cont.)
- Example
  - C statement
      A = B + C * D - E + F + A
  - Equivalent code:
      mult T,C,D     ; T = C*D
      add  T,T,B     ; T = B + C*D
      sub  T,T,E     ; T = B + C*D - E
      add  T,T,F     ; T = B + C*D - E + F
      add  A,T,A     ; A = B + C*D - E + F + A
Number of Addresses (cont.)
- Two-address machines
  - One address doubles (for source operand and result)
  - Last example makes a case for it
    - Address T is used twice
  - Sample instructions
      load dest,src    ; M(dest) = [src]
      add  dest,src    ; M(dest) = [dest] + [src]
      sub  dest,src    ; M(dest) = [dest] - [src]
      mult dest,src    ; M(dest) = [dest] * [src]
- Two addresses: one address doubles as operand and result
  - Example: a = a + b
  - The two-address format reduces the space requirement but also introduces some awkwardness. To avoid altering the value of an operand, a MOVE instruction is used to move one of the values to a result or temporary location before performing the operation.
Number of Addresses (cont.)
- Example
  - C statement
      A = B + C * D - E + F + A
  - Equivalent code:
      load T,C     ; T = C
      mult T,D     ; T = C*D
      add  T,B     ; T = B + C*D
      sub  T,E     ; T = B + C*D - E
      add  T,F     ; T = B + C*D - E + F
      add  A,T     ; A = B + C*D - E + F + A
Number of Addresses (cont.)
- One-address machines
  - Use a special set of registers called accumulators
    - Specify one source operand and receive the result
  - Called accumulator machines
  - Sample instructions
      load  addr    ; accum = [addr]
      store addr    ; M[addr] = accum
      add   addr    ; accum = accum + [addr]
      sub   addr    ; accum = accum - [addr]
      mult  addr    ; accum = accum * [addr]
Number of Addresses (cont.)
- Example
  - C statement
      A = B + C * D - E + F + A
  - Equivalent code:
      load  C     ; load C into accum
      mult  D     ; accum = C*D
      add   B     ; accum = C*D + B
      sub   E     ; accum = B + C*D - E
      add   F     ; accum = B + C*D - E + F
      add   A     ; accum = B + C*D - E + F + A
      store A     ; store accum contents in A
um+er of Addresses cont -um+er of Addresses
cont -
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 61101
um+er of Addresses cont-um+er of Addresses cont-
Vero$address machines
Stac0 supplies operands and receives the result$ Special instructions to load and store use an address
Called stac0 machines (lt6 Pgtgtgt 8urroughs 8gtgt)
Sample instructions
us addr us([addr])
o addr o([addr])
add us(o + o)
sub us(o - o) mult us(o o)
um+er of Addresses cont -um+er of Addresses
cont-
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 62101
um+er of Addresses cont-um+er of Addresses cont-
ltample
C statement
A C H D F 6 A
ltJuivalent code6
us E sub
us C us F
us D add
Mult us A
us B add
add o A
)oadStore Architecture)oadStore Architecture
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 63101
)oadStore Architecture)oadStore Architecture
Instructions epect operands in internal processor registers Special 35A and ST51lt instructions move data beteen registers
and memory
1ISC uses this architecture
1educes instruction length
()
)oadStore Architecture cont-)oadStore Architecture
cont-
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 64101
)oadStore Architecture cont-)oadStore Architecture cont-
Sample instructionsload $daddr $d = [addr]
store addr$s (addr) = $s
add $d$s$samp $d = $s + $sampsub $d$s$samp $d = $s - $samp
mult $d$s$samp $d = $s $samp
um+er of Addresses cont-um+er of Addresses
cont-
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 65101
um+er of Addresses cont-um+er of Addresses cont-
ampleC statement
A = B + C D E + F + A
1uialent co)eload $B mult $amp$amp$
load $ampC add $amp$amp$
load $D sub $amp$amp$
load $E add $amp$amp$
load $F add $amp$amp$
load $A store A$amp
0lo1 of Control 0lo1 of Control
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 66101
0lo1 of Control 0lo1 of Control
efault is seJuential flo
Several instructions alter this defaulteecution
8ranches$ 4nconditional
$ Conditional
$ elayed branches Procedure calls
$ elayed procedure calls
0lo1 of Control cont-0lo1 of Control
cont-
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 67101
0lo1 of Control cont-0lo1 of Control cont-
8ranches
4nconditional
$ Absolute address
$ PC$relative
U Target address is specified relative to PC contents U 1elocatable code
ltample6 MIPS
$ Absolute address
9 target
$ PC$relative
8 target
0lo1 of Control cont-0lo1 of Control
cont-
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 68101
0lo1 of Control cont- -
e entium e R
0lo1 of Control cont-0lo1 of Control
cont-
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 69101
lo1 o Co t ol co t- -
8ranches
Conditional
$ ump is ta0en only if the condition is met
To types
$ Set$Then$ump
U Condition testing is separated from branching U Condition code registers are used to convey the condition test
result
U Condition code registers 0eep a record of the status of the last A34 operation such as overflo condition
$ ltample6 Pentium codecm AB comare A ad B
e taret um e0ual
0lo1 of Control cont-0lo1 of Control
cont-
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 70101
- -
$ Test$and$ump
U Single instruction performs condition testing and branching
$ ltample6 MIPS instruction
be0 $src$srcamptaret
umps to target if 1src E 1src
elayed branching
Control is transferred after eecuting the instruction thatfollos the branch instruction
$ This instruction slot is called delay slot Improves efficiency
ighly pipelined 1ISC processors support
0lo1 of Control cont-0lo1 of Control
cont-
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 71101
- -
Procedure calls Lacilitate modular programming
1eJuire to pieces of information to return
$ ltnd of procedure U Pentium
uses ret instruction
U MIPS
uses 9r instruction
$ 1eturn address U In a (special) register
MIPS allos any general$purpose register
U 5n the stac0
Pentium
0lo1 of Control cont-0lo1 of Control
cont-
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 72101
- -
0lo1 of Control cont-0lo1 of Control
cont-
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 73101
- -
elay slot
Parameter Passing
Two basic techniques
- Register-based (e.g., PowerPC, MIPS)
  * Internal registers are used
    - Faster
    - Limit the number of parameters
    - Recursive procedures
- Stack-based (e.g., Pentium)
  * Stack is used
    - More general

Operand Types
Instructions support basic data types
- Characters
- Integers
- Floating-point
Instruction overload
- Same instruction for different data types
- Example: Pentium
      mov AL,address    ; loads an 8-bit value
      mov AX,address    ; loads a 16-bit value
      mov EAX,address   ; loads a 32-bit value

Operand Types (cont.)
Separate instructions
- Instructions specify the operand size
- Example: MIPS
      lb Rdest,address   # loads a byte
      lh Rdest,address   # loads a halfword (16 bits)
      lw Rdest,address   # loads a word (32 bits)
      ld Rdest,address   # loads a doubleword (64 bits)
- Similar instructions exist for store

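The size-specific loads can be mimicked with Python's standard `struct` module. This is only an illustrative sketch: the memory contents, the little-endian byte order, and the helper name `load` are assumptions, not something the slides specify.

```python
import struct

# Hypothetical 8-byte memory image; little-endian byte order is assumed.
memory = bytes([0x88, 0x77, 0x66, 0x55, 0x44, 0x33, 0x22, 0x11])

# struct format codes for signed loads of 1, 2, 4 and 8 bytes.
FORMATS = {"lb": "<b", "lh": "<h", "lw": "<i", "ld": "<q"}

def load(op, address):
    """Return the signed value a load of the given size would fetch."""
    return struct.unpack_from(FORMATS[op], memory, address)[0]

for op in ("lb", "lh", "lw", "ld"):
    print(op, load(op, 0))
```

The same address yields a different value for each mnemonic because the opcode, not the operand, fixes how many bytes are read.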
Addressing Modes
How the operands are specified
- Operands can be in three places
  * Registers
    - Register addressing mode
  * Part of instruction
    - Constant
    - Immediate addressing mode
    - All processors support these two addressing modes
  * Memory
    - Difference between RISC and CISC
    - CISC supports a large variety of addressing modes
    - RISC follows load/store architecture

Instruction Types
Several types of instructions
Data movement
- Pentium: mov dest,src
- Some do not provide direct data movement instructions
- Indirect data movement
      add Rdest,Rsrc,0   ; Rdest = Rsrc+0
Arithmetic and Logical
- Arithmetic
  * Integer and floating-point, signed and unsigned
  * add, subtract, multiply, divide
- Logical
  * and, or, not, xor

Instruction Types (cont.)
Condition code bits
- S: Sign bit (0 = +, 1 = -)
- Z: Zero bit (0 = nonzero, 1 = zero)
- O: Overflow bit (0 = no overflow, 1 = overflow)
- C: Carry bit (0 = no carry, 1 = carry)
Example: Pentium
      cmp count,n   ; compare count to n
                    ; (subtract n from count)
      je  target    ; jump if equal

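How a compare sets the four condition code bits can be sketched as below. This is a simplified model, not the Pentium's actual implementation: the 16-bit width, the function names, and the flag dictionary are assumptions made for illustration.

```python
def compare(a, b, bits=16):
    """Subtract b from a, as a compare does, and return the flags."""
    mask = (1 << bits) - 1
    sign = 1 << (bits - 1)
    a &= mask
    b &= mask
    result = (a - b) & mask
    return {
        "S": int(bool(result & sign)),                  # result is negative
        "Z": int(result == 0),                          # result is zero
        "C": int(a < b),                                # unsigned borrow
        "O": int(bool((a ^ b) & (a ^ result) & sign)),  # signed overflow
    }

def jump_if_equal(flags):
    """A jump-if-equal is taken when the zero bit is set."""
    return flags["Z"] == 1

print(compare(25, 25), jump_if_equal(compare(25, 25)))
```

The branch instruction never sees the operands; it only reads the bits left behind by the compare, which is what separates Set-Then-Jump from Test-and-Jump.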
Instruction Types (cont.)
Flow control and I/O instructions
- Branch
- Procedure call
- Interrupts
I/O instructions
- Memory-mapped I/O
  * Most processors support memory-mapped I/O
  * No separate instructions for I/O
- Isolated I/O
  * Pentium supports isolated I/O
  * Separate I/O instructions
        in  AX,io_port    ; read from an I/O port
        out io_port,AX    ; write to an I/O port

Instruction Formats
Two types
Fixed-length
- Used by RISC processors
- 32-bit RISC processors use 32-bit-wide instructions
  * Examples: SPARC, MIPS, PowerPC
Variable-length
- Used by CISC processors
- Memory operands need more bits to specify
Opcode
- Major and exact operation

Examples of Instruction Formats
[Figure: examples of instruction formats]

RISC (Reduced Instruction Set Computer)
versus
CISC (Complex Instruction Set Computer)

RISC vs. CISC
- The underlying philosophy of RISC machines is that a system is better able to manage program execution when the program consists of only a few different instructions that are the same length and require the same number of clock cycles to decode and execute.
- RISC systems access memory only with explicit load and store instructions.
- In CISC systems, many different kinds of instructions access memory, making instruction length variable and fetch-decode-execute time unpredictable.

RISC vs. CISC (cont.)
- The difference between CISC and RISC becomes evident through the basic computer performance equation:
      time/program = (time/cycle) × (cycles/instruction) × (instructions/program)
- RISC systems shorten execution time by reducing the clock cycles per instruction.
- CISC systems improve performance by reducing the number of instructions per program.

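The performance equation is commonly written as time/program = time/cycle × cycles/instruction × instructions/program. A minimal sketch of it follows; the function name and the two sets of numbers are invented purely to show the trade-off and are not measurements.

```python
def cpu_time(instructions, cycles_per_instruction, seconds_per_cycle):
    """Execution time as the product of the three factors."""
    return instructions * cycles_per_instruction * seconds_per_cycle

# Hypothetical figures: the CISC-style program needs fewer instructions,
# the RISC-style program needs fewer cycles per instruction.
cisc_like = cpu_time(1_000_000, 4.0, 1e-9)
risc_like = cpu_time(1_500_000, 1.2, 1e-9)
print(cisc_like, risc_like)
```

Either factor can be traded against the other; which design wins depends on the actual values of all three terms.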
RISC vs. CISC (cont.)
- The simple instruction set of RISC machines enables control units to be hardwired for maximum speed.
- The more complex, and variable, instruction set of CISC machines requires microcode-based control units that interpret instructions as they are fetched from memory. This translation takes time.
- With fixed-length instructions, RISC lends itself to pipelining and speculative execution.

RISC vs. CISC (cont.)
Consider the two program fragments:
CISC:
           mov ax, 10
           mov bx, 5
           mul bx, ax
RISC:
           mov ax, 0
           mov bx, 10
           mov cx, 5
    Begin: add ax, bx
           loop Begin
- The total clock cycles for the CISC version might be:
      (2 movs × 1 cycle) + (1 mul × 30 cycles) = 32 cycles
- While the clock cycles for the RISC version are:
      (3 movs × 1 cycle) + (5 adds × 1 cycle) + (5 loops × 1 cycle) = 13 cycles
- With the RISC clock cycle being shorter, RISC gives us much faster execution speeds.

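The cycle totals can be checked with a few lines of arithmetic. The sketch assumes one cycle for each mov, add and loop, 30 cycles for mul, and a loop that runs five times; the helper name `total_cycles` is made up for the example.

```python
def total_cycles(counts, cycles_each):
    """Sum (instruction count x cycles per instruction) over all opcodes."""
    return sum(counts[op] * cycles_each[op] for op in counts)

# Assumed per-instruction costs.
COST = {"mov": 1, "add": 1, "loop": 1, "mul": 30}

cisc = total_cycles({"mov": 2, "mul": 1}, COST)
risc = total_cycles({"mov": 3, "add": 5, "loop": 5}, COST)
print(cisc, risc)
```

The RISC fragment executes more instructions (13 against 3) yet finishes in fewer cycles under these assumed costs.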
RISC vs. CISC (cont.)
- Because of their load-store ISAs, RISC architectures require a large number of CPU registers.
- These registers provide fast access to data during sequential program execution.
- They can also be employed to reduce the overhead typically caused by passing parameters to subprograms.
- Instead of pulling parameters off of a stack, the subprogram is directed to use a subset of registers.

RISC vs. CISC (cont.)
- This is how registers can be overlapped in a RISC system.
- The current window pointer (CWP) points to the active register window.
[Figure: overlapping register windows]

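Overlapping register windows can be modelled roughly as follows. The group size of 8, the count of 4 windows, the class name, and the direction in which the window pointer moves on a call are all assumptions chosen for a simple sketch; real designs differ in these details.

```python
GROUP = 8        # assumed registers per in/local/out group
WINDOWS = 4      # assumed number of windows

class WindowedRegisterFile:
    def __init__(self):
        # Each window adds 16 new registers (locals + outs);
        # its "in" group is the previous window's "out" group.
        self.regs = [0] * (WINDOWS * 2 * GROUP)
        self.cwp = 0  # current window pointer

    def _index(self, group, n):
        base = self.cwp * 2 * GROUP
        offset = {"in": 0, "local": GROUP, "out": 2 * GROUP}[group] + n
        return (base + offset) % len(self.regs)

    def read(self, group, n):
        return self.regs[self._index(group, n)]

    def write(self, group, n, value):
        self.regs[self._index(group, n)] = value

    def call(self):
        self.cwp = (self.cwp + 1) % WINDOWS

    def ret(self):
        self.cwp = (self.cwp - 1) % WINDOWS

rf = WindowedRegisterFile()
rf.write("out", 0, 42)    # caller places a parameter in an out register
rf.call()
print(rf.read("in", 0))   # callee sees it as an in register
```

No data is copied on the call: moving the window pointer is enough to hand the parameter to the callee.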
RISC vs. CISC (cont.)
- It is becoming increasingly difficult to distinguish RISC architectures from CISC architectures.
- Some RISC systems provide more extravagant instruction sets than some CISC systems.
- Some systems combine both approaches.
- The following two slides summarize the characteristics that traditionally typify the differences between these two architectures.

RISC vs. CISC (cont.)
RISC
- Multiple register sets
- Three operands per instruction
- Parameter passing through register windows
- Single-cycle instructions
- Hardwired control
- Highly pipelined
CISC
- Single register set
- One or two register operands per instruction
- Parameter passing through memory
- Multiple-cycle instructions
- Microprogrammed control
- Less pipelined
Continued...

RISC vs. CISC (cont.)
RISC
- Simple instructions, few in number
- Fixed-length instructions
- Complexity in compiler
- Only LOAD/STORE instructions access memory
- Few addressing modes
CISC
- Many complex instructions
- Variable-length instructions
- Complexity in microcode
- Many instructions can access memory
- Many addressing modes

Summary

Instruction Set Design Issues
Instruction set design issues include:
- Where are operands stored?
  * registers, memory, stack, accumulator
- How many explicit operands are there?
  * 0, 1, 2, or 3
- How is the operand location specified?
  * register, immediate, indirect, ...
- What type and size of operands are supported?
  * byte, int, float, double, string, vector, ...
- What operations are supported?
  * add, sub, mul, move, compare, ...

More About General-Purpose Registers
Why do almost all new architectures use GPRs?
- Registers are much faster than memory (even cache)
  * Register values are available immediately
  * When memory isn't ready, the processor must wait ("stall")
- Registers are convenient for variable storage
  * Compiler assigns some variables just to registers
  * More compact code, since small fields specify registers (compared to memory addresses)
[Figure: storage hierarchy: Processor (Registers), Cache, Memory, Disk]

What Operations Are Needed?
- Arithmetic and logical
  * Integer arithmetic: ADD, SUB, MULT, DIV, SHIFT
  * Logical operations: AND, OR, XOR, NOT
- Data transfer: copy, load, store
- Control: branch, jump, call, return
- Floating point: ADDF, MULF, DIVF (same as arithmetic, but usually take bigger operands)
- Decimal: ADDD, CONVERT
- String: move, compare, search
- Graphics: pixel and vertex, compression/decompression operations

Stack Architecture: Pros and Cons
Pros
- Good code density (implicit top of stack)
- Low hardware requirements
- Easy to write a simpler compiler for stack architectures
Cons
- Stack becomes the bottleneck
- Little ability for parallelism or pipelining
- Data is not always at the top of the stack when needed, so additional instructions like TOP and SWAP are needed
- Difficult to write an optimizing compiler for stack architectures

Accumulator Architecture: Pros and Cons
Pros
- Very low hardware requirements
- Easy to design and understand
Cons
- Accumulator becomes the bottleneck
- Little ability for parallelism or pipelining
- High memory traffic

Memory-Memory Architecture: Pros and Cons
Pros
- Requires fewer instructions (especially if 3 operands)
- Easy to write compilers for (especially if 3 operands)
Cons
- Very high memory traffic (especially if 3 operands)
- Variable number of clocks per instruction
- With two operands, more data movements are required

Memory-Register Architecture: Pros and Cons
Pros
- Some data can be accessed without loading first
- Instruction format easy to encode
- Good code density
Cons
- Operands are not equivalent (poor orthogonality)
- Variable number of clocks per instruction
- May limit number of registers

MIPS Instruction Format
Register-format instructions
- op: Basic operation of the instruction, traditionally called the opcode
- rs: The first register source operand
- rt: The second register source operand
- rd: The register destination operand; it gets the result of the operation
- shamt: Shift amount
- funct: This field selects the specific variant of the operation of the op field
Immediate-type instructions
- Some instructions need longer fields than provided, for large-value constants
- The 16-bit address means a load word instruction can load a word within a region of ±2^15 bytes of the address in the base register
- Example: lw $t0, 32($s3)   # Temporary register $t0 gets A[8]

Instruction  Format  op   rs   rt   rd    shamt  funct  address
add          R       0    reg  reg  reg   0      32     n.a.
sub          R       0    reg  reg  reg   0      34     n.a.
lw           I       35   reg  reg  n.a.  n.a.   n.a.   address
sw           I       43   reg  reg  n.a.  n.a.   n.a.   address

R-format: op (6 bits) | rs (5 bits) | rt (5 bits) | rd (5 bits) | shamt (5 bits) | funct (6 bits)
I-format: op (6 bits) | rs (5 bits) | rt (5 bits) | address (16 bits)

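Packing the fields into a 32-bit word can be sketched as below, using the standard MIPS field widths (6/5/5/5/5/6 bits for R-format, 6/5/5/16 for I-format). The helper names are hypothetical, and the register numbers are the conventional MIPS numbering ($t0 = 8, $s1 to $s3 = 17 to 19).

```python
def encode_r(op, rs, rt, rd, shamt, funct):
    """Pack an R-format instruction: op|rs|rt|rd|shamt|funct."""
    return (op << 26) | (rs << 21) | (rt << 16) | (rd << 11) | (shamt << 6) | funct

def encode_i(op, rs, rt, address):
    """Pack an I-format instruction: op|rs|rt|16-bit address."""
    return (op << 26) | (rs << 21) | (rt << 16) | (address & 0xFFFF)

T0, S1, S2, S3 = 8, 17, 18, 19  # conventional MIPS register numbers

add_word = encode_r(0, S1, S2, T0, 0, 32)  # add $t0, $s1, $s2
lw_word = encode_i(35, S3, T0, 32)         # lw  $t0, 32($s3)
print(f"{add_word:08x} {lw_word:08x}")
```

Both formats keep op, rs and rt in the same bit positions, which lets the hardware decode those fields before it knows which format it is looking at.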
The Stored Program Concept
- Learning how instructions are represented leads to discovering the secret of computing: the stored-program concept.
- Today's computers are built on two key principles:
  * Instructions are represented as numbers
  * Programs can be stored in memory to be read or written just like numbers
The power of the concept: memory can contain
- the source code for an editor
- the compiled machine code for the editor
- the text that the compiled program is using
- the compiler that generated the code
[Figure: a processor connected to memory holding an accounting program (machine code), an editor program (machine code), a C compiler (machine code), payroll data, book text, and source code in C for the editor program]

Compiling if-then-else in MIPS
Assuming the five variables f, g, h, i, and j correspond to the five registers $s0 through $s4, what is the compiler's code for the following C if statement?
    if (i == j) f = g + h; else f = g - h;
[Figure: flowchart testing i == j; the i == j path computes f = g + h, the i != j path (Else) computes f = g - h, and both paths meet at Exit]
MIPS:
          bne $s3, $s4, Else    # go to Else if i != j
          add $s0, $s1, $s2     # f = g + h (skipped if i != j)
          j   Exit
    Else: sub $s0, $s1, $s2     # f = g - h (skipped if i == j)
    Exit:

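The control flow of the compiled if-then-else can be traced with a toy interpreter. This is a sketch under simplifying assumptions: only bne, add, sub and j are modelled, labels are stored as pseudo-instructions, and branch delay slots are ignored. The tuple encoding and the name `run` are invented for the example.

```python
PROGRAM = [
    ("bne", "s3", "s4", "Else"),   # go to Else if i != j
    ("add", "s0", "s1", "s2"),     # f = g + h
    ("j", "Exit"),
    ("label", "Else"),
    ("sub", "s0", "s1", "s2"),     # f = g - h
    ("label", "Exit"),
]

def run(program, regs):
    """Execute the toy program and return the final register values."""
    labels = {ins[1]: i for i, ins in enumerate(program) if ins[0] == "label"}
    pc = 0
    while pc < len(program):
        ins = program[pc]
        pc += 1
        if ins[0] == "bne" and regs[ins[1]] != regs[ins[2]]:
            pc = labels[ins[3]]
        elif ins[0] == "j":
            pc = labels[ins[1]]
        elif ins[0] == "add":
            regs[ins[1]] = regs[ins[2]] + regs[ins[3]]
        elif ins[0] == "sub":
            regs[ins[1]] = regs[ins[2]] - regs[ins[3]]
    return regs

print(run(PROGRAM, {"s0": 0, "s1": 5, "s2": 3, "s3": 1, "s4": 1})["s0"])
```

The compiler tests the opposite condition (bne rather than beq) so that the then-part can follow the branch directly.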
Typical Compilation
[Figure: typical compilation flow]

Major Types of Optimization
(Optimization name: explanation; frequency where legible, N.M. = not measured)
High-level (at or near source level; machine-independent)
- Procedure integration: replace procedure call by procedure body (N.M.)
Local (within straight-line code)
- Common subexpression elimination: replace two instances of the same computation by a single copy
- Constant propagation: replace all instances of a variable that is assigned a constant with the constant
- Stack height reduction: rearrange expression tree to minimize resources needed for expression evaluation (N.M.)
Global (across a branch)
- Global common subexpression elimination: same as local, but this version crosses branches
- Copy propagation: replace all instances of a variable A that has been assigned X (i.e., A = X) with X
- Code motion: remove code from a loop that computes the same value each iteration of the loop
- Induction variable elimination: simplify/eliminate array addressing calculations within loops
Machine-dependent (depends on machine knowledge)
- Strength reduction: many examples, such as replacing a multiply by a constant with adds and shifts (N.M.)
- Pipeline scheduling: reorder instructions to improve pipeline performance (N.M.)

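Two of these transformations, strength reduction and common subexpression elimination, can be illustrated by hand-written before/after pairs. These functions are invented examples of what a compiler might produce, not output from any particular compiler.

```python
def times_ten(x):
    """Original form: multiply by a constant."""
    return x * 10

def times_ten_reduced(x):
    """Strength-reduced form: 10*x = 8*x + 2*x, using shifts and an add."""
    return (x << 3) + (x << 1)

def before_cse(a, b, c):
    """a + b is computed twice."""
    return (a + b) * c + (a + b)

def after_cse(a, b, c):
    """The common subexpression a + b is computed once and reused."""
    t = a + b
    return t * c + t

print(times_ten_reduced(7), after_cse(1, 2, 3))
```

In both pairs the optimized version computes the same value with cheaper or fewer operations, which is the property every optimization in the list has to preserve.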
Effect of Compiler Optimization
Measurements taken on MIPS
[Figure: program and compiler optimization level]
- Level 0: non-optimized code
- Level 1: local optimization
- Level 2: global optimization, s/w pipelining
- Level 3: adds procedure integration

Compiler Support for Multimedia Instr.
- Intel's MMX and PowerPC AltiVec have small vector processing capabilities targeting multimedia applications (to speed up graphics).
- Intel added a new set of instructions called Streaming SIMD Extension.
- A major advantage of vector computers is hiding the latency of memory access by loading multiple elements and then overlapping execution with data transfer.
- Vector computers typically have strided and/or gather/scatter addressing to perform operations on distant memory locations.
  * Strided addressing allows memory access in increments larger than one.
  * Gather/scatter addressing is similar to register indirect mode, where the addresses are stored instead of the data.
- Supporting vector operation without strided addressing, as in Intel's MMX, limits the potential speedup.
- Such limited support for vector processing makes the use of vectorizing compiler optimization unpopular and restricts its scope to hand-coded routines.
- SIMD instructions on MMX and AltiVec tend to be solutions, not primitives.

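Strided and gather/scatter addressing can be pictured with plain Python lists standing in for memory. The memory contents and the three helper names are assumptions made for the illustration; real vector hardware does this in a single instruction.

```python
memory = list(range(100, 116))  # hypothetical 16-word memory

def strided_load(base, stride, count):
    """Load `count` elements starting at `base`, `stride` words apart."""
    return [memory[base + i * stride] for i in range(count)]

def gather(indices):
    """Load from an arbitrary list of addresses (an index vector)."""
    return [memory[i] for i in indices]

def scatter(indices, values):
    """Store values to an arbitrary list of addresses."""
    for i, v in zip(indices, values):
        memory[i] = v

print(strided_load(0, 4, 4))
print(gather([3, 1, 15]))
```

A strided load with stride 4 would, for example, pick one column out of a row-major matrix with four columns.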
Starting a Program
[Figure: C program -> Compiler -> Assembly language program -> Assembler -> Object: machine language module (together with Object: library routine (machine language)) -> Linker -> Executable: machine language program -> Loader -> Memory]
Linker
- Place code and data modules symbolically in memory
- Determine the addresses of data and instruction labels
- Patch both internal and external references
Object files for Unix typically contain:
- Header: size and position of components
- Text segment: machine code
- Data segment: static and dynamic variables
- Relocation info: identify absolute memory references
- Symbol table: name and location of labels, procedures, and variables
- Debugging info: mapping source to object code, break points, etc.

Loading Executable Program
[Figure: memory layout from high to low addresses: Stack ($sp = 7fff ffff hex), Dynamic data, Static data (from 1000 0000 hex; $gp = 1000 8000 hex), Text (pc = 0040 0000 hex), Reserved (from 0)]
To load an executable, the operating system follows these steps:
- Reads the executable file header to determine the size of the text and data segments
- Creates an address space large enough for the text and data
- Copies the instructions and data from the executable file into memory
- Copies the parameters (if any) to the main program onto the stack
- Initializes the machine registers and sets the stack pointer to the first free location
- Jumps to a start-up routine that copies the parameters into the argument registers and calls the main routine of the program

Instruction Set Design Issues
- Number of Addresses
- Flow of Control
- Operand Types
- Addressing Modes
- Instruction Types
- Instruction Formats

Number of Addresses
Four categories
- 3-address machines
  * 2 for the source operands and one for the result
- 2-address machines
  * One address doubles as source and result
- 1-address machines
  * Accumulator machines
  * Accumulator is used for one source and result
- 0-address machines
  * Stack machines
  * Operands are taken from the stack
  * Result goes onto the stack

Number of Addresses (cont.)
Three-address machines
- Two for the source operands, one for the result
- RISC processors use three addresses
- Sample instructions
      add  dest,src1,src2    ; M(dest)=[src1]+[src2]
      sub  dest,src1,src2    ; M(dest)=[src1]-[src2]
      mult dest,src1,src2    ; M(dest)=[src1]*[src2]
Three addresses: Operand 1, Operand 2, Result
Example: a = b + c
Three-address instruction formats are not common, because they require a relatively long instruction format to hold the three address references.

Number of Addresses (cont.)
Example
- C statement
      A = B + C * D - E + F + A
- Equivalent code:
      mult T,C,D    ; T = C*D
      add  T,T,B    ; T = B+C*D
      sub  T,T,E    ; T = B+C*D-E
      add  T,T,F    ; T = B+C*D-E+F
      add  A,T,A    ; A = B+C*D-E+F+A

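The three-address sequence for A = B + C*D - E + F + A can be checked with a very small interpreter. The tuple encoding, the name `run_three_address`, and the initial memory values are assumptions for the sketch.

```python
OPS = {
    "add": lambda x, y: x + y,
    "sub": lambda x, y: x - y,
    "mult": lambda x, y: x * y,
}

def run_three_address(code, mem):
    """Each instruction names a destination and two sources."""
    for op, dest, src1, src2 in code:
        mem[dest] = OPS[op](mem[src1], mem[src2])
    return mem

code = [
    ("mult", "T", "C", "D"),
    ("add", "T", "T", "B"),
    ("sub", "T", "T", "E"),
    ("add", "T", "T", "F"),
    ("add", "A", "T", "A"),
]
mem = run_three_address(code, {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6, "T": 0})
print(mem["A"])
```

Four of the five instructions name T as both destination and first source, which is the observation that motivates the two-address format.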
Number of Addresses (cont.)
Two-address machines
- One address doubles (for source operand and result)
- Last example makes a case for it
  * Address T is used twice
- Sample instructions
      load dest,src    ; M(dest)=[src]
      add  dest,src    ; M(dest)=[dest]+[src]
      sub  dest,src    ; M(dest)=[dest]-[src]
      mult dest,src    ; M(dest)=[dest]*[src]
Two addresses: one address doubles as operand and result
Example: a = a + b
The two-address format reduces the space requirement but also introduces some awkwardness. To avoid altering the value of an operand, a MOVE instruction is used to move one of the values to a result or temporary location before performing the operation.

Number of Addresses (cont.)
Example
- C statement
      A = B + C * D - E + F + A
- Equivalent code:
      load T,C    ; T = C
      mult T,D    ; T = C*D
      add  T,B    ; T = B+C*D
      sub  T,E    ; T = B+C*D-E
      add  T,F    ; T = B+C*D-E+F
      add  A,T    ; A = B+C*D-E+F+A

Number of Addresses (cont.)
One-address machines
- Use a special set of registers called accumulators
  * Specify one source operand and receive the result
- Called accumulator machines
- Sample instructions
      load  addr    ; accum = [addr]
      store addr    ; M[addr] = accum
      add   addr    ; accum = accum + [addr]
      sub   addr    ; accum = accum - [addr]
      mult  addr    ; accum = accum * [addr]

Number of Addresses (cont.)
Example
- C statement
      A = B + C * D - E + F + A
- Equivalent code:
      load  C    ; load C into accum
      mult  D    ; accum = C*D
      add   B    ; accum = C*D+B
      sub   E    ; accum = B+C*D-E
      add   F    ; accum = B+C*D-E+F
      add   A    ; accum = B+C*D-E+F+A
      store A    ; store accum contents in A

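The one-address version can be traced the same way, with a single accumulator as the implicit operand. The encoding and the name `run_accumulator` are again hypothetical.

```python
def run_accumulator(code, mem):
    """Every instruction names one address; the accumulator is implicit."""
    accum = 0
    for op, addr in code:
        if op == "load":
            accum = mem[addr]
        elif op == "store":
            mem[addr] = accum
        elif op == "add":
            accum += mem[addr]
        elif op == "sub":
            accum -= mem[addr]
        elif op == "mult":
            accum *= mem[addr]
    return mem

acc_code = [("load", "C"), ("mult", "D"), ("add", "B"), ("sub", "E"),
            ("add", "F"), ("add", "A"), ("store", "A")]
acc_mem = run_accumulator(acc_code, {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6})
print(acc_mem["A"])
```

Seven short instructions replace five long ones, and no temporary location T is needed.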
Number of Addresses (cont.)
Zero-address machines
- Stack supplies operands and receives the result
  * Special instructions to load and store use an address
- Called stack machines (Ex: HP3000, Burroughs B5500)
- Sample instructions
      push addr    ; push([addr])
      pop  addr    ; pop([addr])
      add          ; push(pop + pop)
      sub          ; push(pop - pop)
      mult         ; push(pop * pop)

Number of Addresses (cont.)
Example
- C statement
      A = B + C * D - E + F + A
- Equivalent code:
      push E
      push C
      push D
      mult
      push B
      add
      sub
      push F
      add
      push A
      add
      pop  A

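The zero-address sequence can be traced with a tiny stack-machine model. It assumes, following the sample-instruction semantics push(pop - pop), that sub computes the first value popped minus the second value popped; the encoding and the name `run_stack` are invented for the sketch.

```python
def run_stack(code, mem):
    """Only push and pop carry an address; arithmetic uses the stack."""
    stack = []
    for ins in code:
        op = ins[0]
        if op == "push":
            stack.append(mem[ins[1]])
        elif op == "pop":
            mem[ins[1]] = stack.pop()
        else:
            first = stack.pop()    # value on top of the stack
            second = stack.pop()   # value below it
            if op == "add":
                stack.append(first + second)
            elif op == "sub":
                stack.append(first - second)
            elif op == "mult":
                stack.append(first * second)
    return mem

stack_code = [("push", "E"), ("push", "C"), ("push", "D"), ("mult",),
              ("push", "B"), ("add",), ("sub",), ("push", "F"), ("add",),
              ("push", "A"), ("add",), ("pop", "A")]
stack_mem = run_stack(stack_code, {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6})
print(stack_mem["A"])
```

E is pushed first so that it sits below B + C*D when sub executes; operand order on the stack has to be planned around the non-commutative operations.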
Load/Store Architecture
- Instructions expect operands in internal processor registers
  * Special LOAD and STORE instructions move data between registers and memory
- RISC uses this architecture
- Reduces instruction length

Load/Store Architecture (cont.)
Sample instructions
      load  Rd,addr       ; Rd = [addr]
      store addr,Rs       ; (addr) = Rs
      add   Rd,Rs1,Rs2    ; Rd = Rs1 + Rs2
      sub   Rd,Rs1,Rs2    ; Rd = Rs1 - Rs2
      mult  Rd,Rs1,Rs2    ; Rd = Rs1 * Rs2

Load/Store Architecture (cont.)
Example
- C statement
      A = B + C * D - E + F + A
- Equivalent code:
      load  R1,B
      load  R2,C
      load  R3,D
      load  R4,E
      load  R5,F
      load  R6,A
      mult  R2,R2,R3
      add   R2,R2,R1
      sub   R2,R2,R4
      add   R2,R2,R5
      add   R2,R2,R6
      store A,R2

Flow of Control
- Default is sequential flow
- Several instructions alter this default execution
  * Branches
    - Unconditional
    - Conditional
    - Delayed branches
  * Procedure calls
    - Delayed procedure calls

Flow of Control (cont.)
Branches
- Unconditional
  * Absolute address
  * PC-relative
    - Target address is specified relative to PC contents
    - Relocatable code
- Example: MIPS
  * Absolute address
        j target
  * PC-relative
        b target

Flow of Control (cont.)
[Figure: Pentium example]

7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 69101
lo1 o Co t ol co t- -
8ranches
Conditional
$ ump is ta0en only if the condition is met
To types
$ Set$Then$ump
U Condition testing is separated from branching U Condition code registers are used to convey the condition test
result
U Condition code registers 0eep a record of the status of the last A34 operation such as overflo condition
$ ltample6 Pentium codecm AB comare A ad B
e taret um e0ual
0lo1 of Control cont-0lo1 of Control
cont-
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 70101
- -
$ Test$and$ump
U Single instruction performs condition testing and branching
$ ltample6 MIPS instruction
be0 $src$srcamptaret
umps to target if 1src E 1src
elayed branching
Control is transferred after eecuting the instruction thatfollos the branch instruction
$ This instruction slot is called delay slot Improves efficiency
ighly pipelined 1ISC processors support
0lo1 of Control cont-0lo1 of Control
cont-
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 71101
- -
Procedure calls Lacilitate modular programming
1eJuire to pieces of information to return
$ ltnd of procedure U Pentium
uses ret instruction
U MIPS
uses 9r instruction
$ 1eturn address U In a (special) register
MIPS allos any general$purpose register
U 5n the stac0
Pentium
0lo1 of Control cont-0lo1 of Control
cont-
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 72101
- -
0lo1 of Control cont-0lo1 of Control
cont-
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 73101
- -
elay slot
Parameter PassingParameter Passin
g
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 74101
gg
To basic techniJues 1egister$based (eg PoerPC MIPS)
$ Internal registers are used U Laster
U 3imit the number of parameters U 1ecursive procedure
Stac0$based (eg Pentium)
$ Stac0 is used U More general
2 perand Types2
perand Types
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 75101
p yp
Instructions support basic data types
Characters Integers
Lloating$point
Instruction overload
Same instruction for different data types
ltample6 Pentium mo1 A2address loads a 3-bt 1alue
mo1 Aaddress loads a -bt 1alue
mo1 EAaddress loads a amp-bt 1alue
perand Types
perand Types
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 76101
Separate instructions
Instructions specify the operand si-e
ltample6 MIPS
lb $destaddress loads a b4te
l $destaddress loads a al5ord( bts)
l5 $destaddress loads a 5ord
(amp bts)
ld $destaddress loads a double5ord
( bts)imilar instruction store
3 Addressing Modes3 Addressin
g Modes
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 77101
o the operands are specified
5perands can be in three places
$ 1egisters U 1egister addressing mode
$ Part of instruction U Constant
U Immediate addressing mode
U All processors support these to addressing modes
$ Memory U ifference beteen 1ISC and CISC
U CISC supports a large variety of addressing modes
U 1ISC follos load2store architecture
4 Instruction Types4 Instruction T
ypes
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 78101
Several types of instructions
ata movement$ Pentium6 mo1 destsrc
$ Some do not provide direct data movement instructions
$ Indirect data movement
add $dest$src6 $dest = $src+6
Arithmetic and 3ogical
$ Arithmetic U Integer and floating$point signed and unsigned U add subtract multiply divide
$ 3ogical U andB orB notB 7or
Instruction Types cont-Instruction T
ypes cont-
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 79101
Condition code bits
S6 Sign bit (gt E F E $)
6 Vero bit (gt E non-ero E -ero)
$6 5verflo bit (gt E no overflo E overflo)
C6 Carry bit (gt E no carry E carry)
ltample6 Pentium
cm coutamp comare cout to amp
subtract amp rom cout
e taret um e0ual
Instruction Types cont-Instruction T
ypes cont-
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 80101
Llo control and I25 instructions
$ 8ranch
$ Procedure call
$ Interrupts
I25 instructions$ Memory$mapped I25
U Most processors support memory$mapped I25
U 7o separate instructions for I25
$ Isolated I25 U Pentium supports isolated I25
U Separate I25 instructions
Ao7ort read from an IO ort
out o7ortA rte to an IO ort
5. Instruction Formats
Two types
Fixed-length
- Used by RISC processors
- 32-bit RISC processors use 32-bit wide instructions
  - Examples: SPARC, MIPS, PowerPC
Variable-length
- Used by CISC processors
- Memory operands need more bits to specify
Opcode
- Major and exact operation
Examples of Instruction Formats
RISC (Reduced Instruction Set Computer) versus CISC (Complex Instruction Set Computer)
RISC vs. CISC
The underlying philosophy of RISC machines is that a system is better able to manage program execution when the program consists of only a few different instructions that are the same length and require the same number of clock cycles to decode and execute.
RISC systems access memory only with explicit load and store instructions.
In CISC systems, many different kinds of instructions access memory, making instruction length variable and fetch-decode-execute time unpredictable.
The difference between CISC and RISC becomes evident through the basic computer performance equation:
    time/program = (time/cycle) x (cycles/instruction) x (instructions/program)
RISC systems shorten execution time by reducing the clock cycles per instruction.
CISC systems improve performance by reducing the number of instructions per program.
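The trade-off in the performance equation can be tried out numerically. The sketch below simply multiplies the three factors; the function name `cpu_time` and all of the instruction counts, CPI values, and cycle times are made-up numbers chosen for illustration, not measurements of any real machine.

```python
def cpu_time(instructions: float, cycles_per_instruction: float, cycle_time_ns: float) -> float:
    """time/program = instructions/program * cycles/instruction * time/cycle."""
    return instructions * cycles_per_instruction * cycle_time_ns


# Hypothetical CISC-style program: fewer instructions, more cycles each.
cisc_ns = cpu_time(instructions=1000, cycles_per_instruction=4, cycle_time_ns=2)

# Hypothetical RISC-style program: more instructions, fewer cycles each,
# and a shorter clock cycle.
risc_ns = cpu_time(instructions=1500, cycles_per_instruction=1.5, cycle_time_ns=1)

print(cisc_ns, risc_ns)
```

Under these assumed numbers the RISC-style program wins despite executing more instructions, because the other two factors shrink by more than the instruction count grows.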
The simple instruction set of RISC machines enables control units to be hardwired for maximum speed.
The more complex, and variable, instruction set of CISC machines requires microcode-based control units that interpret instructions as they are fetched from memory. This translation takes time.
With fixed-length instructions, RISC lends itself to pipelining and speculative execution.
Consider the following program fragments:
CISC:
    mov ax, 10
    mov bx, 5
    mul bx, ax
RISC:
    mov ax, 0
    mov bx, 10
    mov cx, 5
    Begin: add ax, bx
           loop Begin
The total clock cycles for the CISC version might be:
    (2 movs x 1 cycle) + (1 mul x 30 cycles) = 32 cycles
While the clock cycles for the RISC version are:
    (3 movs x 1 cycle) + (5 adds x 1 cycle) + (5 loops x 1 cycle) = 13 cycles
With the RISC clock cycle also being shorter, RISC gives us much faster execution speeds.
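The cycle arithmetic for the two fragments can be checked with a short simulation. This is a sketch under stated assumptions: every `mov`, `add`, and `loop` is charged 1 cycle and the `mul` is charged 30 cycles, and the function names are invented for this example.

```python
MOV_CYCLES = 1    # assumed cost of mov, add, and loop
MUL_CYCLES = 30   # assumed cost of a multi-cycle mul


def cisc_multiply(a: int, b: int) -> tuple[int, int]:
    """Two movs followed by one mul; returns (product, cycles)."""
    cycles = 2 * MOV_CYCLES + MUL_CYCLES
    return a * b, cycles


def risc_multiply(a: int, b: int) -> tuple[int, int]:
    """Three movs, then repeated addition; returns (product, cycles)."""
    ax, bx, cx = 0, a, b          # mov ax,0 / mov bx,a / mov cx,b
    cycles = 3 * MOV_CYCLES
    while cx > 0:
        ax += bx                  # add ax, bx
        cycles += 1
        cx -= 1                   # loop Begin (decrement cx, branch if nonzero)
        cycles += 1
    return ax, cycles


print(cisc_multiply(10, 5))
print(risc_multiply(10, 5))
```

Note that the repeated-addition version only wins while the loop count stays small; its cycle count grows with the multiplier, whereas the assumed `mul` cost is fixed.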
Because of their load-store ISAs, RISC architectures require a large number of CPU registers.
These registers provide fast access to data during sequential program execution.
They can also be employed to reduce the overhead typically caused by passing parameters to subprograms.
Instead of pulling parameters off of a stack, the subprogram is directed to use a subset of registers.
This is how registers can be overlapped in a RISC system.
The current window pointer (CWP) points to the active register window.
It is becoming increasingly difficult to distinguish RISC architectures from CISC architectures.
Some RISC systems provide more extravagant instruction sets than some CISC systems.
Some systems combine both approaches.
The following two slides summarize the characteristics that traditionally typify the differences between these two architectures.
RISC
- Multiple register sets.
- Three operands per instruction.
- Parameter passing through register windows.
- Single-cycle instructions.
- Hardwired control.
- Highly pipelined.
CISC
- Single register set.
- One or two register operands per instruction.
- Parameter passing through memory.
- Multiple-cycle instructions.
- Microprogrammed control.
- Less pipelined.
Continued...
RISC
- Simple instructions, few in number.
- Fixed-length instructions.
- Complexity in compiler.
- Only LOAD/STORE instructions access memory.
- Few addressing modes.
CISC
- Many complex instructions.
- Variable-length instructions.
- Complexity in microcode.
- Many instructions can access memory.
- Many addressing modes.
Summary
Instruction Set Design Issues
Instruction set design issues include:
Where are operands stored?
- registers, memory, stack, accumulator
How many explicit operands are there?
- 0, 1, 2, or 3
How is the operand location specified?
- register, immediate, indirect, . . .
What type and size of operands are supported?
- byte, int, float, double, string, vector, . . .
What operations are supported?
- add, sub, mul, move, compare, . . .
More About General Purpose Registers
Why do almost all new architectures use GPRs?
Registers are much faster than memory (even cache)
- Register values are available immediately
- When memory isn't ready, processor must wait ("stall")
Registers are convenient for variable storage
- Compiler assigns some variables just to registers
- More compact code since small fields specify registers (compared to memory addresses)
[Figure: Processor, Registers, Cache, Memory, Disk]
What Operations are Needed
Arithmetic + Logical
- Integer arithmetic: ADD, SUB, MULT, DIV, SHIFT
- Logical operation: AND, OR, XOR, NOT
Data Transfer - copy, load, store
Control - branch, jump, call, return
Floating Point: ADDF, MULF, DIVF
- Same as arithmetic but usually take bigger operands
Decimal - ADDD, CONVERT
String - move, compare, search
Graphics - pixel and vertex, compression/decompression operations
Stack Architecture: Pros and Cons
Pros
- Good code density (implicit top of stack)
- Low hardware requirements
- Easy to write a simpler compiler for stack architectures
Cons
- Stack becomes the bottleneck
- Little ability for parallelism or pipelining
- Data is not always at the top of stack when needed, so additional instructions like TOP and SWAP are needed
- Difficult to write an optimizing compiler for stack architectures
Accumulator Architecture: Pros and Cons
Pros
- Very low hardware requirements
- Easy to design and understand
Cons
- Accumulator becomes the bottleneck
- Little ability for parallelism or pipelining
- High memory traffic
Memory-Memory Architecture: Pros and Cons
Pros
- Requires fewer instructions (especially if 3 operands)
- Easy to write compilers for (especially if 3 operands)
Cons
- Very high memory traffic (especially if 3 operands)
- Variable number of clocks per instruction
- With two operands, more data movements are required
Memory-Register Architecture: Pros and Cons
Pros
- Some data can be accessed without loading first
- Instruction format easy to encode
- Good code density
Cons
- Operands are not equivalent (poor orthogonality)
- Variable number of clocks per instruction
- May limit number of registers
The Stored Program Concept
Learning how instructions are represented leads to discovering the secret of computing: the stored-program concept.
Today's computers are built on two key principles:
- Instructions are represented as numbers
- Programs can be stored in memory to be read or written just like numbers
The power of the concept: memory can contain
- the source code for an editor
- the compiled machine code for the editor
- the text that the compiled program is using
- the compiler that generated the code
[Figure: Processor connected to Memory, which holds: Accounting program (machine code), Editor program (machine code), C compiler (machine code), Payroll data, Book text, Source code in C for editor program]
Compiling if-then-else in MIPS
Assuming the five variables f, g, h, i, and j correspond to the five registers $s0 through $s4, what is the compiled MIPS code for the following C if statement?
    if (i == j) f = g + h; else f = g - h;
[Flowchart: test i == j; when i = j, f = g + h; when i != j, control goes to Else: f = g - h; both paths meet at Exit:]
MIPS:
          bne $s3, $s4, Else    # go to Else if i != j
          add $s0, $s1, $s2     # f = g + h (skipped if i != j)
          j   Exit
    Else: sub $s0, $s1, $s2     # f = g - h (skipped if i = j)
    Exit:
Typical Compilation
Major Types of Optimization
Optimization name: explanation (frequency, where the source gives one)
High-level: at or near source level; machine-independent
- Procedure integration: Replace procedure call by procedure body (N.M.)
Local: within straight-line code
- Common subexpression elimination: Replace two instances of the same computation by a single copy
- Constant propagation: Replace all instances of a variable that is assigned a constant with the constant
- Stack height reduction: Rearrange expression tree to minimize resources needed for expression evaluation (N.M.)
Global: across a branch
- Global common subexpression elimination: Same as local, but this version crosses branches
- Copy propagation: Replace all instances of a variable A that has been assigned X (i.e., A = X) with X
- Code motion: Remove code from a loop that computes the same value each iteration of the loop
- Induction variable elimination: Simplify/eliminate array-addressing calculations within loops
Machine-dependent: depends on machine knowledge
- Strength reduction: Many examples, such as replacing multiply by a constant with adds and shifts (N.M.)
- Pipeline scheduling: Reorder instructions to improve pipeline performance (N.M.)
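Strength reduction, the machine-dependent entry about replacing a multiply by a constant with adds and shifts, can be shown in a few lines. This is only an illustrative sketch: the constant 10 and the function names are chosen for the example, and a real compiler applies the rewrite to machine instructions rather than to source code.

```python
def times_ten_multiply(x: int) -> int:
    """Straightforward version: one multiply instruction."""
    return x * 10


def times_ten_reduced(x: int) -> int:
    """Strength-reduced version: 10*x = 8*x + 2*x, using two shifts and one add."""
    return (x << 3) + (x << 1)


# Both versions agree on a range of inputs, including negative values.
print(all(times_ten_multiply(x) == times_ten_reduced(x) for x in range(-100, 101)))
```

Whether the rewrite pays off depends on the relative cost of multiply, shift, and add on the target machine, which is why the table files it under machine-dependent optimizations.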
Effect of Compiler Optimization
[Figure: program and compiler optimization level. Measurements taken on S…]
Level 0: non-optimized code
Level 1: local optimization
Level 2: global optimization, s/w pipelining
Level 3: adds procedure integration
Compiler Support for Multimedia Instructions
- Intel's MMX and PowerPC AltiVec have small vector processing capabilities targeting multimedia applications (to speed up graphics).
- Intel added a new set of instructions called Streaming SIMD Extension.
- A major advantage of vector computers is hiding the latency of memory access by loading multiple elements and then overlapping execution with data transfer.
- Vector computers typically have strided and/or gather/scatter addressing to perform operations on distant memory locations.
  - Strided addressing allows memory access in increments larger than one.
  - Gather/scatter addressing is similar to register indirect mode, where the addresses are stored instead of the data.
- Supporting vector operations without strided addressing, as in Intel's MMX, limits the potential speedup.
- Such limited support for vector processing makes the use of vectorizing compiler optimization unpopular and restricts its scope to hand-coded routines.
SIMD instructions on MMX and AltiVec tend to be solutions, not primitives.
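The two vector addressing styles mentioned above can be sketched with ordinary list indexing. This is a simplified model, not real vector hardware: memory is a Python list, and the function names `strided_load` and `gather_load` are invented for the example.

```python
def strided_load(memory: list, base: int, stride: int, count: int) -> list:
    """Load `count` elements starting at `base`, stepping `stride` locations each time."""
    return [memory[base + i * stride] for i in range(count)]


def gather_load(memory: list, addresses: list) -> list:
    """Load the elements whose addresses are listed in an index vector."""
    return [memory[address] for address in addresses]


memory = list(range(100, 120))    # memory[k] holds the value 100 + k

# Every fourth element starting at location 2, e.g. one column of a 4-wide matrix.
print(strided_load(memory, base=2, stride=4, count=3))

# Arbitrary locations named by an index vector, as in sparse data.
print(gather_load(memory, [7, 0, 3]))
```

Without these modes, the program has to load each distant element separately and pack them by hand, which is part of why the slide says limited support restricts speedup.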
Starting a Program
[Figure: C program -> Compiler -> Assembly language program -> Assembler -> Object: machine language module, together with Object: library routine (machine language) -> Linker -> Executable: machine language program -> Loader -> Memory]
Linker
- Place code and data modules symbolically in memory
- Determine the address of data and instruction labels
- Patch both internal and external references
Object files for Unix typically contain:
- Header: size and position of components
- Text segment: machine code
- Data segment: static and dynamic variables
- Relocation info: identify absolute memory references
- Symbol table: name and location of labels, procedures, and variables
- Debugging info: mapping source to object code, break points, etc.
Loading Executable Program
[Figure: memory layout, top to bottom: Stack ($sp -> 7fff ffff hex), Dynamic data, Static data ($gp -> 1000 8000 hex; segment starts at 1000 0000 hex), Text (pc -> 0040 0000 hex), Reserved (from 0)]
To load an executable, the operating system follows these steps:
- Reads the executable file header to determine the size of the text and data segments
- Creates an address space large enough for the text and data
- Copies the instructions and data from the executable file into memory
- Copies the parameters (if any) to the main program onto the stack
- Initializes the machine registers and sets the stack pointer to the first free location
- Jumps to a start-up routine that copies the parameters into the argument registers and calls the main routine of the program
Instruction Set Design Issues
- Number of Addresses
- Flow of Control
- Operand Types
- Addressing Modes
- Instruction Types
- Instruction Formats
Number of Addresses
Four categories
3-address machines
- 2 for the source operands and one for the result
2-address machines
- One address doubles as source and result
1-address machines
- Accumulator machines
- Accumulator is used for one source and result
0-address machines
- Stack machines
- Operands are taken from the stack
- Result goes onto the stack
Number of Addresses (cont.)
Three-address machines
- Two for the source operands, one for the result
- RISC processors use three addresses
- Sample instructions
    add  dest,src1,src2    ; M(dest)=[src1]+[src2]
    sub  dest,src1,src2    ; M(dest)=[src1]-[src2]
    mult dest,src1,src2    ; M(dest)=[src1]*[src2]
Three addresses: Operand 1, Operand 2, Result
Example: a = b + c
Three-address instruction formats are not common, because they require a relatively long instruction format to hold the three address references.
Example
C statement
    A = B + C * D - E + F + A
Equivalent code:
    mult T,C,D    ;T = C*D
    add  T,T,B    ;T = B+C*D
    sub  T,T,E    ;T = B+C*D-E
    add  T,T,F    ;T = B+C*D-E+F
    add  A,T,A    ;A = B+C*D-E+F+A
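To check the three-address sequence, here is a small interpreter sketch. It assumes memory is a dictionary keyed by variable name and that each instruction is written as `op dest,src1,src2`; the function name `run_three_address` and the sample values are invented for the example.

```python
import operator

OPS = {"add": operator.add, "sub": operator.sub, "mult": operator.mul}


def run_three_address(program: list, memory: dict) -> dict:
    """Execute instructions of the form 'op dest,src1,src2' on named memory cells."""
    for line in program:
        op, operands = line.split()
        dest, src1, src2 = operands.split(",")
        memory[dest] = OPS[op](memory[src1], memory[src2])
    return memory


program = ["mult T,C,D", "add T,T,B", "sub T,T,E", "add T,T,F", "add A,T,A"]
memory = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6, "T": 0}
run_three_address(program, memory)
print(memory["A"])   # should equal B + C*D - E + F + (old A)
```

Five instructions suffice here, but each one carries three addresses, which is the instruction-length cost the slide points out.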
Number of Addresses (cont.)
Two-address machines
- One address doubles (for source operand and result)
- Last example makes a case for it
  - Address T is used twice
- Sample instructions
    load dest,src    ; M(dest)=[src]
    add  dest,src    ; M(dest)=[dest]+[src]
    sub  dest,src    ; M(dest)=[dest]-[src]
    mult dest,src    ; M(dest)=[dest]*[src]
Two addresses: one address doubles as operand and result
Example: a = a + b
The two-address format reduces the space requirement but also introduces some awkwardness. To avoid altering the value of an operand, a MOVE instruction is used to move one of the values to a result or temporary location before performing the operation.
Example
C statement
    A = B + C * D - E + F + A
Equivalent code:
    load T,C    ;T = C
    mult T,D    ;T = C*D
    add  T,B    ;T = B+C*D
    sub  T,E    ;T = B+C*D-E
    add  T,F    ;T = B+C*D-E+F
    add  A,T    ;A = B+C*D-E+F+A
Number of Addresses (cont.)
One-address machines
- Use a special set of registers called accumulators
  - Specify one source operand and receive the result
- Called accumulator machines
- Sample instructions
    load  addr    ; accum = [addr]
    store addr    ; M[addr] = accum
    add   addr    ; accum = accum + [addr]
    sub   addr    ; accum = accum - [addr]
    mult  addr    ; accum = accum * [addr]
Example
C statement
    A = B + C * D - E + F + A
Equivalent code:
    load  C    ;load C into accum
    mult  D    ;accum = C*D
    add   B    ;accum = C*D+B
    sub   E    ;accum = B+C*D-E
    add   F    ;accum = B+C*D-E+F
    add   A    ;accum = B+C*D-E+F+A
    store A    ;store accum contents in A
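The accumulator sequence can be traced with a minimal interpreter. This sketch assumes a single accumulator and a dictionary standing in for memory; the function name `run_accumulator` and the sample values are illustrative only.

```python
def run_accumulator(program: list, memory: dict) -> dict:
    """Execute one-address instructions of the form 'op addr' using one accumulator."""
    accum = 0
    for line in program:
        op, addr = line.split()
        if op == "load":
            accum = memory[addr]
        elif op == "store":
            memory[addr] = accum
        elif op == "add":
            accum += memory[addr]
        elif op == "sub":
            accum -= memory[addr]
        elif op == "mult":
            accum *= memory[addr]
        else:
            raise ValueError(f"unknown instruction: {op}")
    return memory


program = ["load C", "mult D", "add B", "sub E", "add F", "add A", "store A"]
memory = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
run_accumulator(program, memory)
print(memory["A"])
```

Every instruction except the implicit accumulator update touches memory, which matches the high memory traffic listed among the accumulator architecture's drawbacks.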
Number of Addresses (cont.)
Zero-address machines
- Stack supplies operands and receives the result
  - Special instructions to load and store use an address
- Called stack machines (Ex: HP 3000, Burroughs B5500)
- Sample instructions
    push addr    ; push([addr])
    pop  addr    ; pop([addr])
    add          ; push(pop + pop)
    sub          ; push(pop - pop)
    mult         ; push(pop * pop)
Example
C statement
    A = B + C * D - E + F + A
Equivalent code:
    push E
    push C
    push D
    mult
    push B
    add
    sub
    push F
    add
    push A
    add
    pop  A
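The stack program above can be run on a tiny stack-machine sketch. It follows the sample instruction definitions, where `sub` computes push(pop - pop), so the first value popped is the left operand; that operand order, the function name `run_stack_machine`, and the sample values are assumptions of this example.

```python
def run_stack_machine(program: list, memory: dict) -> dict:
    """Execute zero-address code; only push and pop name a memory address."""
    stack = []
    for line in program:
        op, *args = line.split()
        if op == "push":
            stack.append(memory[args[0]])
        elif op == "pop":
            memory[args[0]] = stack.pop()
        else:
            first = stack.pop()     # top of stack
            second = stack.pop()    # element below it
            if op == "add":
                stack.append(first + second)
            elif op == "sub":
                stack.append(first - second)   # push(pop - pop)
            elif op == "mult":
                stack.append(first * second)
            else:
                raise ValueError(f"unknown instruction: {op}")
    return memory


program = ["push E", "push C", "push D", "mult", "push B", "add",
           "sub", "push F", "add", "push A", "add", "pop A"]
memory = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
run_stack_machine(program, memory)
print(memory["A"])
```

Pushing E first is what makes the later `sub` work: by then the stack holds E underneath B + C*D, so the subtraction yields B + C*D - E.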
Load/Store Architecture
- Instructions expect operands in internal processor registers
- Special LOAD and STORE instructions move data between registers and memory
- RISC uses this architecture
- Reduces instruction length
Load/Store Architecture (cont.)
Sample instructions
    load  Rd,addr      ; Rd = [addr]
    store addr,Rs      ; (addr) = Rs
    add   Rd,Rs1,Rs2   ; Rd = Rs1 + Rs2
    sub   Rd,Rs1,Rs2   ; Rd = Rs1 - Rs2
    mult  Rd,Rs1,Rs2   ; Rd = Rs1 * Rs2
Example
C statement
    A = B + C * D - E + F + A
Equivalent code:
    load  R1,B
    load  R2,C
    load  R3,D
    load  R4,E
    load  R5,F
    load  R6,A
    mult  R2,R2,R3
    add   R2,R2,R1
    sub   R2,R2,R4
    add   R2,R2,R5
    add   R2,R2,R6
    store A,R2
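The load/store version can be traced the same way. In this sketch, registers and memory are separate dictionaries, only `load` and `store` touch memory, and the interpreter also counts memory accesses; the register names R1 to R6, the function name `run_load_store`, and the sample values are assumptions for illustration.

```python
import operator

ALU_OPS = {"add": operator.add, "sub": operator.sub, "mult": operator.mul}


def run_load_store(program: list, memory: dict) -> int:
    """Execute load/store code and return the number of memory accesses."""
    registers = {}
    memory_accesses = 0
    for line in program:
        op, operands = line.split()
        args = operands.split(",")
        if op == "load":                      # load Rd,addr
            registers[args[0]] = memory[args[1]]
            memory_accesses += 1
        elif op == "store":                   # store addr,Rs
            memory[args[0]] = registers[args[1]]
            memory_accesses += 1
        else:                                 # op Rd,Rs1,Rs2 uses registers only
            registers[args[0]] = ALU_OPS[op](registers[args[1]], registers[args[2]])
    return memory_accesses


program = ["load R1,B", "load R2,C", "load R3,D", "load R4,E", "load R5,F",
           "load R6,A", "mult R2,R2,R3", "add R2,R2,R1", "sub R2,R2,R4",
           "add R2,R2,R5", "add R2,R2,R6", "store A,R2"]
memory = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
accesses = run_load_store(program, memory)
print(memory["A"], accesses)
```

The instruction count is higher than in the three-address memory version, but the arithmetic instructions never wait on memory and need only short register fields.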
Flow of Control
Default is sequential flow
Several instructions alter this default execution
- Branches
  - Unconditional
  - Conditional
  - Delayed branches
- Procedure calls
  - Delayed procedure calls
Flow of Control (cont.)
Branches
Unconditional
- Absolute address
- PC-relative
  - Target address is specified relative to PC contents
  - Relocatable code
Example: MIPS
- Absolute address
    j target
- PC-relative
    b target
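Why PC-relative branches give relocatable code can be seen with a little address arithmetic. The sketch below is deliberately simplified: it ignores machine-specific details such as word-scaled offsets or the PC already pointing past the branch, and all addresses and function names are made up for the example.

```python
def absolute_target(encoded_address: int) -> int:
    """An absolute branch jumps to the address stored in the instruction."""
    return encoded_address


def pc_relative_target(pc: int, encoded_offset: int) -> int:
    """A PC-relative branch jumps to the branch's own address plus a stored offset."""
    return pc + encoded_offset


# Hypothetical code loaded at 0x1000: a branch at 0x1010 targets a label at 0x1040.
branch_pc, label = 0x1010, 0x1040
encoded_address = label               # what an absolute branch would store
encoded_offset = label - branch_pc    # what a PC-relative branch would store

# Move the whole code block by 0x4000 without re-encoding any instruction.
shift = 0x4000
moved_pc, moved_label = branch_pc + shift, label + shift

print(hex(pc_relative_target(moved_pc, encoded_offset)))  # still reaches the label
print(hex(absolute_target(encoded_address)))              # still points at the old location
```

The absolute form needs a relocation entry so the linker or loader can patch the stored address, while the PC-relative form keeps working unchanged.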
Flow of Control (cont.)
Branches
Conditional
- Jump is taken only if the condition is met
Two types
- Set-Then-Jump
  - Condition testing is separated from branching
  - Condition code registers are used to convey the condition test result
  - Condition code registers keep a record of the status of the last ALU operation, such as an overflow condition
- Example: Pentium code
    cmp AX,BX     ;compare AX and BX
    je  target    ;jump if equal
Flow of Control (cont.)
- Test-and-Jump
  - Single instruction performs condition testing and branching
- Example: MIPS instruction
    beq Rsrc1,Rsrc2,target
  Jumps to target if Rsrc1 = Rsrc2
Delayed branching
- Control is transferred after executing the instruction that follows the branch instruction
  - This instruction slot is called the delay slot
- Improves efficiency
- Highly pipelined RISC processors support it
Flow of Control (cont.)
Procedure calls
- Facilitate modular programming
- Require two pieces of information to return
  - End of procedure
    - Pentium uses the ret instruction
    - MIPS uses the jr instruction
  - Return address
    - In a (special) register: MIPS allows any general-purpose register
    - On the stack: Pentium
Flow of Control (cont.)
[Figure: delay slot]
Parameter Passing
Two basic techniques
Register-based (e.g., PowerPC, MIPS)
- Internal registers are used
  - Faster
  - Limit the number of parameters
  - Recursive procedure
Stack-based (e.g., Pentium)
- Stack is used
  - More general
2. Operand Types
Instructions support basic data types
- Characters
- Integers
- Floating-point
Instruction overload
- Same instruction for different data types
- Example: Pentium
    mov AL,address     ;loads an 8-bit value
    mov AX,address     ;loads a 16-bit value
    mov EAX,address    ;loads a 32-bit value
Operand Types (cont.)
Separate instructions
- Instructions specify the operand size
- Example: MIPS
    lb Rdest,address    ;loads a byte
    lh Rdest,address    ;loads a halfword (16 bits)
    lw Rdest,address    ;loads a word (32 bits)
    ld Rdest,address    ;loads a doubleword (64 bits)
  Similar instruction: store
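The idea that each load instruction fixes the operand size can be modeled with Python's `struct` module. This is a sketch under assumptions: memory is a `bytes` object, values are read big-endian and sign-extended, and the `load` helper and its mnemonic table are invented for the example rather than taken from any MIPS tool.

```python
import struct

# Mnemonic -> struct format code: signed 1-, 2-, 4-, and 8-byte integers.
LOAD_FORMATS = {"lb": "b", "lh": "h", "lw": "i", "ld": "q"}


def load(mnemonic: str, memory: bytes, address: int) -> int:
    """Read a signed value whose width is chosen by the load mnemonic."""
    fmt = ">" + LOAD_FORMATS[mnemonic]      # ">" selects big-endian, standard sizes
    return struct.unpack_from(fmt, memory, address)[0]


memory = bytes([0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07, 0x08, 0xFF])

for mnemonic in ("lb", "lh", "lw", "ld"):
    print(mnemonic, hex(load(mnemonic, memory, 0)))

# A byte with the top bit set comes back negative because the load sign-extends.
print(load("lb", memory, 8))
```

The same address yields four different values depending only on the opcode, which is the contrast with the overloaded Pentium `mov`, where the register name selects the size.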
3 Addressing Modes3 Addressin
g Modes
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 77101
o the operands are specified
5perands can be in three places
$ 1egisters U 1egister addressing mode
$ Part of instruction U Constant
U Immediate addressing mode
U All processors support these to addressing modes
$ Memory U ifference beteen 1ISC and CISC
U CISC supports a large variety of addressing modes
U 1ISC follos load2store architecture
4 Instruction Types4 Instruction T
ypes
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 78101
Several types of instructions
ata movement$ Pentium6 mo1 destsrc
$ Some do not provide direct data movement instructions
$ Indirect data movement
add $dest$src6 $dest = $src+6
Arithmetic and 3ogical
$ Arithmetic U Integer and floating$point signed and unsigned U add subtract multiply divide
$ 3ogical U andB orB notB 7or
Instruction Types cont-Instruction T
ypes cont-
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 79101
Condition code bits
S6 Sign bit (gt E F E $)
6 Vero bit (gt E non-ero E -ero)
$6 5verflo bit (gt E no overflo E overflo)
C6 Carry bit (gt E no carry E carry)
ltample6 Pentium
cm coutamp comare cout to amp
subtract amp rom cout
e taret um e0ual
Instruction Types cont-Instruction T
ypes cont-
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 80101
Llo control and I25 instructions
$ 8ranch
$ Procedure call
$ Interrupts
I25 instructions$ Memory$mapped I25
U Most processors support memory$mapped I25
U 7o separate instructions for I25
$ Isolated I25 U Pentium supports isolated I25
U Separate I25 instructions
Ao7ort read from an IO ort
out o7ortA rte to an IO ort
5 Instruction 0ormats5 Instruction 0ormats
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 81101
To types
Lied$length$ 4sed by 1ISC processors
$ $bit 1ISC processors use $bits ide instructions U ltamples6 SPA1C MIPS PoerPC
ariable$length
$ 4sed by CISC processors
$ Memory operands need more bits to specify
5pcode
MaOor and eact operation
Examples of Instruction 0ormatsExam
ples of Instruction 0ormats
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 82101
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 83101
ISC e)uce) Instruction Set Computer 3
ersus
CISC Comple Instruction Set Computer3
RISC s CISCRISC s CISC
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 84101
0
RISC s CISCRISC s CISC
The underlying philosophy of 1ISC machines is that asystem is better able to manage program eecutionhen the program consists of only a fe differentinstructions that are the same length and reJuire thesame number of cloc0 cycles to decode and eecute
1ISC systems access memory only ith eplicit loadand store instructions
In CISC systems many different 0inds of instructionsaccess memory ma0ing instruction length variableand fetch$decode$eecute time unpredictable
RISC s CISCRISC s CISC
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 85101
The difference beteen CISC and 1ISC becomesevident through the basic computer performanceeJuation6
1ISC systems shorten eecution time by reducingthe cloc0 cycles per instruction
CISC systems improve performance by reducing thenumber of instructions per program
RISC s CISCRISC s CISC
RISC s CISCRISC s CISC
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 86101
(
The simple instruction set of 1ISC machinesenables control units to be hardired for maimumspeed
The more comple$$ and variable$$ instruction set of
CISC machines reJuires microcode$based controlunits that interpret instructions as they are fetchedfrom memory This translation ta0es time
Dith fied$length instructions 1ISC lends itself topipelining and speculative eecution
RISC s CISCRISC s CISC
RISC s CISCRISC s CISC
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 87101
mo1 a8 6 mo1 b8 6 mo1 c8
Be add a8 b8 loo Be
Consider the the program fragments6
The total cloc0 cycles for the CISC version might be6(amp mo1s c4cle) + ( mul 6 c4cles) = amp c4cles
Dhile the cloc0 cycles for the 1ISC version is6
( mo1s c4cle) + ( adds c4cle) + ( loos c4cle) = c4cles
Dith 1ISC cloc0 cycle being shorter 1ISC gives usmuch faster eecution speeds
mo1 a8 6 mo1 b8 mul b8 a8
CISC RISC
RISC s CISCRISC s CISC
RISC s CISCRISC s CISC
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 88101
8ecause of their load$store ISAs 1ISC architecturesreJuire a large number of CP4 registers
These register provide fast access to data duringseJuential program eecution
They can also be employed to reduce the overheadtypically caused by passing parameters tosubprograms
Instead of pulling parameters off of a stac0 the
subprogram is directed to use a subset of registers
RISC s CISCRISC s CISC
RISC s CISCRISC s CISC
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 89101
3
This is horegisters canbe overlappedin a 1ISCsystem
The currentindo pointer (CDP) pointsto the activeregister
indo
RISC s CISCRISC s CISC
RISC s CISCRISC s CISC
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 90101
34
It is becoming increasingly difficult to distinguish1ISC architectures from CISC architectures
Some 1ISC systems provide more etravagantinstruction sets than some CISC systems
Some systems combine both approaches The folloing to slides summari-e the
characteristics that traditionally typify the differencesbeteen these to architectures
RISC vs. CISC
RISC
- Multiple register sets.
- Three operands per instruction.
- Parameter passing through register windows.
- Single-cycle instructions.
- Hardwired control.
- Highly pipelined.
CISC
- Single register set.
- One or two register operands per instruction.
- Parameter passing through memory.
- Multiple-cycle instructions.
- Microprogrammed control.
- Less pipelined.
Continued...
RISC vs. CISC
RISC
- Simple instructions, few in number.
- Fixed-length instructions.
- Complexity in compiler.
- Only LOAD/STORE instructions access memory.
- Few addressing modes.
CISC
- Many complex instructions.
- Variable-length instructions.
- Complexity in microcode.
- Many instructions can access memory.
- Many addressing modes.
Summary
Instruction Set Design Issues
Instruction set design issues include:
- Where are operands stored?
  - registers, memory, stack, accumulator
- How many explicit operands are there?
  - 0, 1, 2, or 3
- How is the operand location specified?
  - register, immediate, indirect, ...
- What type and size of operands are supported?
  - byte, int, float, double, string, vector, ...
- What operations are supported?
  - add, sub, mul, move, compare, ...
More About General Purpose Registers
Why do almost all new architectures use GPRs?
- Registers are much faster than memory (even cache)
  - Register values are available immediately
  - When memory isn't ready, the processor must wait ("stall")
- Registers are convenient for variable storage
  - Compiler assigns some variables just to registers
  - More compact code, since small fields specify registers (compared to memory addresses)
[Figure: memory hierarchy: processor registers, cache, memory, disk]
What Operations are Needed
- Arithmetic and Logical
  - Integer arithmetic: ADD, SUB, MULT, DIV, SHIFT
  - Logical operation: AND, OR, XOR, NOT
- Data Transfer: copy, load, store
- Control: branch, jump, call, return
- Floating Point: ADDF, MULF, DIVF (same as arithmetic, but usually take bigger operands)
- Decimal: ADDD, CONVERT
- String: move, compare, search
- Graphics: pixel and vertex, compression/decompression operations
Stack Architecture Pros and Cons
Pros
- Good code density (implicit top of stack)
- Low hardware requirements
- Easy to write a simpler compiler for stack architectures
Cons
- Stack becomes the bottleneck
- Little ability for parallelism or pipelining
- Data is not always at the top of the stack when needed, so additional instructions like TOP and SWAP are needed
- Difficult to write an optimizing compiler for stack architectures
Accumulator Architecture Pros and Cons
Pros
- Very low hardware requirements
- Easy to design and understand
Cons
- Accumulator becomes the bottleneck
- Little ability for parallelism or pipelining
- High memory traffic
Memory-Memory Architecture Pros and Cons
Pros
- Requires fewer instructions (especially if 3 operands)
- Easy to write compilers for (especially if 3 operands)
Cons
- Very high memory traffic (especially if 3 operands)
- Variable number of clocks per instruction
- With two operands, more data movements are required
Memory-Register Architecture Pros and Cons
Pros
- Some data can be accessed without loading first
- Instruction format easy to encode
- Good code density
Cons
- Operands are not equivalent (poor orthogonality)
- Variable number of clocks per instruction
- May limit number of registers
Compiling if-then-else in MIPS
Assuming the five variables f, g, h, i, and j correspond to the five registers $s0 through $s4, what is the compiled code for the following C if statement?
if (i == j) f = g + h; else f = g - h;
[Flowchart: test i == j; if i = j, f = g + h; if i ≠ j, go to Else: f = g - h; both paths meet at Exit:]
MIPS:
      bne $s3, $s4, Else    # go to Else if i ≠ j
      add $s0, $s1, $s2     # f = g + h (skipped if i ≠ j)
      j   Exit
Else: sub $s0, $s1, $s2     # f = g - h (skipped if i = j)
Exit:
Typical Compilation
Major Types of Optimization
Columns: optimization name, explanation, frequency
High-level (at or near source level; machine independent)
- Procedure integration: replace procedure call by procedure body. Frequency: N.M.
Local (within straight-line code)
- Common subexpression elimination: replace two instances of the same computation by a single copy.
- Constant propagation: replace all instances of a variable that is assigned a constant with the constant.
- Stack height reduction: rearrange expression tree to minimize resources needed for expression evaluation. Frequency: N.M.
Global (across a branch)
- Global common subexpression elimination: same as local, but this version crosses branches.
- Copy propagation: replace all instances of a variable A that has been assigned X (i.e., A = X) with X.
- Code motion: remove code from a loop that computes the same value each iteration of the loop.
- Induction variable elimination: simplify/eliminate array addressing calculations within loops.
Machine-dependent (depends on machine knowledge)
- Strength reduction: many examples, such as replacing a multiply by a constant with adds and shifts. Frequency: N.M.
- Pipeline scheduling: reorder instructions to improve pipeline performance. Frequency: N.M.
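As a small illustration of two rows of this table, the sketch below shows strength reduction (a multiply by a constant rewritten as shifts and an add) and common subexpression elimination, both done by hand. The function names are hypothetical, and a real compiler applies these transformations to its intermediate code rather than to source text.

```python
def times_10(x):
    # Original form: one multiply by a constant.
    return x * 10


def times_10_reduced(x):
    # Strength-reduced form: 10*x = 8*x + 2*x, i.e. two shifts and one add.
    return (x << 3) + (x << 1)


def before_cse(a, b, c):
    # The subexpression (a + b) is computed twice.
    return (a + b) * c + (a + b)


def after_cse(a, b, c):
    # The common subexpression is computed once and reused.
    t = a + b
    return t * c + t
```

Both pairs compute the same values; only the number and kind of operations differ, which is what the optimizer is trading on.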
Effect of Compiler Optimization
[Figure: program and compiler optimization level; caption: "Measurements taken on ..."]
- Level 0: non-optimized code
- Level 1: local optimization
- Level 2: global optimization, s/w pipelining
- Level 3: adds procedure integration
Compiler Support for Multimedia Instructions
- Intel's MMX and PowerPC AltiVec have small vector processing capabilities targeting multimedia applications (to speed up graphics).
- Intel added a new set of instructions called Streaming SIMD Extensions.
- A major advantage of vector computers is hiding the latency of memory access by loading multiple elements and then overlapping execution with data transfer.
- Vector computers typically have strided and/or gather/scatter addressing to perform operations on distant memory locations.
  - Strided addressing allows memory access in increments larger than one.
  - Gather/scatter addressing is similar to register indirect mode, where the addresses are stored instead of the data.
- Supporting vector operations without strided addressing, as in Intel's MMX, limits the potential speedup.
- Such limited support for vector processing makes the use of vectorizing compiler optimizations unpopular and restricts their scope to hand-coded routines.
SIMD instructions on MMX and AltiVec tend to be solutions, not primitives.
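The two addressing styles named on this slide can be sketched in Python, with a list standing in for word-addressed memory. The function names and the sample data are assumptions for illustration only; real vector hardware performs these accesses in a single instruction.

```python
def strided_load(memory, base, stride, count):
    """Load `count` elements starting at `base`, `stride` elements apart."""
    return [memory[base + i * stride] for i in range(count)]


def gather(memory, index_vector):
    """Load the elements whose addresses are listed in `index_vector`."""
    return [memory[addr] for addr in index_vector]


def scatter(memory, index_vector, values):
    """Store `values` to the addresses listed in `index_vector`."""
    for addr, value in zip(index_vector, values):
        memory[addr] = value


memory = list(range(100, 116))          # 16 words holding 100..115
column = strided_load(memory, 1, 4, 4)  # column 1 of a 4x4 row-major matrix
picked = gather(memory, [9, 2, 14])     # arbitrary, non-contiguous addresses
scatter(memory, [0, 5], [-1, -2])
```

A stride of 4 walks down one column of a row-major 4x4 matrix, which is the kind of access a unit-stride-only instruction set cannot express directly.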
Starting a Program
[Figure: C program -> Compiler -> Assembly language program -> Assembler -> Object: machine language module (together with Object: library routine, machine language) -> Linker -> Executable: machine language program -> Loader -> Memory]
Linker
- Place code and data modules symbolically in memory
- Determine the address of data and instruction labels
- Patch both internal and external references
Object files for Unix typically contain:
- Header: size and position of components
- Text segment: machine code
- Data segment: static and dynamic variables
- Relocation info: identify absolute memory references
- Symbol table: name and location of labels, procedures and variables
- Debugging info: mapping source to object code, break points, etc.
Loading Executable Program
[Figure: memory layout, from high to low addresses]
- Stack ($sp -> 7fff ffff hex)
- Dynamic data
- Static data, starting at 1000 0000 hex ($gp -> 1000 8000 hex)
- Text, starting at 0040 0000 hex (pc)
- Reserved, down to 0
To load an executable, the operating system follows these steps:
- Reads the executable file header to determine the size of the text and data segments
- Creates an address space large enough for the text and data
- Copies the instructions and data from the executable file into memory
- Copies the parameters (if any) to the main program onto the stack
- Initializes the machine registers and sets the stack pointer to the first free location
- Jumps to a start-up routine that copies the parameters into the argument registers and calls the main routine of the program
Instruction Set Design Issues
- Number of Addresses
- Flow of Control
- Operand Types
- Addressing Modes
- Instruction Types
- Instruction Formats
Number of Addresses
Four categories
- 3-address machines
  - 2 for the source operands and one for the result
- 2-address machines
  - One address doubles as source and result
- 1-address machines
  - Accumulator machines
  - Accumulator is used for one source and result
- 0-address machines
  - Stack machines
  - Operands are taken from the stack
  - Result goes onto the stack
Number of Addresses (cont.)
Three-address machines
- Two for the source operands, one for the result
- RISC processors use three addresses
Sample instructions
  add  dest,src1,src2    ; M(dest) = [src1] + [src2]
  sub  dest,src1,src2    ; M(dest) = [src1] - [src2]
  mult dest,src1,src2    ; M(dest) = [src1] * [src2]
Three addresses: Operand 1, Operand 2, Result
Example: a = b + c
Three-address instruction formats are not common, because they require a relatively long instruction format to hold the three address references.
Number of Addresses (cont.)
Example
C statement
  A = B + C * D - E + F + A
Equivalent code:
  mult T,C,D    ; T = C*D
  add  T,T,B    ; T = B+C*D
  sub  T,T,E    ; T = B+C*D-E
  add  T,T,F    ; T = B+C*D-E+F
  add  A,T,A    ; A = B+C*D-E+F+A
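The three-address sequence can be checked with a very small interpreter. This is a sketch only: the function name, the tuple encoding of instructions, and the sample values of A to F are assumptions, and memory is modelled as a Python dictionary keyed by variable name.

```python
import operator

OPS = {"add": operator.add, "sub": operator.sub, "mult": operator.mul}


def run_three_address(program, memory):
    """Execute (op, dest, src1, src2) tuples: M[dest] = M[src1] op M[src2]."""
    for op, dest, src1, src2 in program:
        memory[dest] = OPS[op](memory[src1], memory[src2])
    return memory


program = [
    ("mult", "T", "C", "D"),  # T = C*D
    ("add",  "T", "T", "B"),  # T = B+C*D
    ("sub",  "T", "T", "E"),  # T = B+C*D-E
    ("add",  "T", "T", "F"),  # T = B+C*D-E+F
    ("add",  "A", "T", "A"),  # A = B+C*D-E+F+A
]
memory = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
run_three_address(program, memory)
expected = 2 + 3 * 4 - 5 + 6 + 1   # the C statement evaluated directly
```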
Number of Addresses (cont.)
Two-address machines
- One address doubles (for source operand and result)
- Last example makes a case for it
  - Address T is used twice
Sample instructions
  load dest,src    ; M(dest) = [src]
  add  dest,src    ; M(dest) = [dest] + [src]
  sub  dest,src    ; M(dest) = [dest] - [src]
  mult dest,src    ; M(dest) = [dest] * [src]
Two addresses: one address doubles as operand and result
Example: a = a + b
The two-address format reduces the space requirement but also introduces some awkwardness. To avoid altering the value of an operand, a MOVE instruction is used to move one of the values to a result or temporary location before performing the operation.
Number of Addresses (cont.)
Example
C statement
  A = B + C * D - E + F + A
Equivalent code:
  load T,C    ; T = C
  mult T,D    ; T = C*D
  add  T,B    ; T = B+C*D
  sub  T,E    ; T = B+C*D-E
  add  T,F    ; T = B+C*D-E+F
  add  A,T    ; A = B+C*D-E+F+A
Number of Addresses (cont.)
One-address machines
- Use a special set of registers called accumulators
  - Specify one source operand and receive the result
- Called accumulator machines
Sample instructions
  load  addr    ; accum = [addr]
  store addr    ; M[addr] = accum
  add   addr    ; accum = accum + [addr]
  sub   addr    ; accum = accum - [addr]
  mult  addr    ; accum = accum * [addr]
Number of Addresses (cont.)
Example
C statement
  A = B + C * D - E + F + A
Equivalent code:
  load  C    ; load C into accum
  mult  D    ; accum = C*D
  add   B    ; accum = C*D+B
  sub   E    ; accum = B+C*D-E
  add   F    ; accum = B+C*D-E+F
  add   A    ; accum = B+C*D-E+F+A
  store A    ; store accum contents in A
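A minimal accumulator-machine interpreter makes the single implicit operand visible. The function name, the pair encoding of instructions, and the sample values are assumptions for illustration.

```python
def run_accumulator(program, memory):
    """Execute (op, addr) pairs on a one-address (accumulator) machine."""
    acc = 0
    for op, addr in program:
        if op == "load":
            acc = memory[addr]
        elif op == "store":
            memory[addr] = acc
        elif op == "add":
            acc += memory[addr]
        elif op == "sub":
            acc -= memory[addr]
        elif op == "mult":
            acc *= memory[addr]
        else:
            raise ValueError(f"unknown opcode: {op}")
    return memory


program = [
    ("load", "C"), ("mult", "D"), ("add", "B"), ("sub", "E"),
    ("add", "F"), ("add", "A"), ("store", "A"),
]
memory = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
run_accumulator(program, memory)
```

Every instruction except the implicit accumulator update touches memory, which is the source of the high memory traffic usually attributed to this style.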
Number of Addresses (cont.)
Zero-address machines
- Stack supplies operands and receives the result
  - Special instructions to load and store use an address
- Called stack machines (Ex: HP3000, Burroughs B5500)
Sample instructions
  push addr    ; push([addr])
  pop  addr    ; pop([addr])
  add          ; push(pop + pop)
  sub          ; push(pop - pop)
  mult         ; push(pop * pop)
Number of Addresses (cont.)
Example
C statement
  A = B + C * D - E + F + A
Equivalent code:
  push E
  push C
  push D
  mult
  push B
  add
  sub
  push F
  add
  push A
  add
  pop  A
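The stack sequence is easier to follow when executed. The sketch below assumes the operand order implied by push(pop - pop): the first value popped (the old top of stack) is the left operand. The function name, the tuple encoding, and the sample values are assumptions.

```python
def run_stack(program, memory):
    """Execute a zero-address program; only push/pop carry an address."""
    stack = []
    for instr in program:
        op = instr[0]
        if op == "push":
            stack.append(memory[instr[1]])
        elif op == "pop":
            memory[instr[1]] = stack.pop()
        else:
            first = stack.pop()    # old top of stack
            second = stack.pop()   # element below it
            if op == "add":
                stack.append(first + second)
            elif op == "sub":
                stack.append(first - second)
            elif op == "mult":
                stack.append(first * second)
            else:
                raise ValueError(f"unknown opcode: {op}")
    return memory


program = [
    ("push", "E"), ("push", "C"), ("push", "D"), ("mult",),
    ("push", "B"), ("add",), ("sub",),
    ("push", "F"), ("add",), ("push", "A"), ("add",), ("pop", "A"),
]
memory = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
run_stack(program, memory)
```

Pushing E first is what lets the later sub compute (B + C*D) - E without any swap instruction.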
Load/Store Architecture
- Instructions expect operands in internal processor registers
- Special LOAD and STORE instructions move data between registers and memory
- RISC uses this architecture
- Reduces instruction length
Load/Store Architecture (cont.)
Sample instructions
  load  Rd,addr       ; Rd = [addr]
  store addr,Rs       ; (addr) = Rs
  add   Rd,Rs1,Rs2    ; Rd = Rs1 + Rs2
  sub   Rd,Rs1,Rs2    ; Rd = Rs1 - Rs2
  mult  Rd,Rs1,Rs2    ; Rd = Rs1 * Rs2
Number of Addresses (cont.)
Example
C statement
  A = B + C * D - E + F + A
Equivalent code:
  load  R1,B
  load  R2,C
  load  R3,D
  load  R4,E
  load  R5,F
  load  R6,A
  mult  R2,R2,R3
  add   R2,R2,R1
  sub   R2,R2,R4
  add   R2,R2,R5
  add   R2,R2,R6
  store A,R2
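A load/store machine can be sketched the same way, this time counting memory accesses to show that only load and store touch memory. The register assignment (R1 to R6 holding B, C, D, E, F, A), the function name, and the sample values are assumptions for illustration.

```python
import operator

ALU = {"add": operator.add, "sub": operator.sub, "mult": operator.mul}


def run_load_store(program, memory):
    """Run a load/store program; return the number of memory accesses."""
    regs = {}
    memory_accesses = 0
    for op, *args in program:
        if op == "load":
            rd, addr = args
            regs[rd] = memory[addr]
            memory_accesses += 1
        elif op == "store":
            addr, rs = args
            memory[addr] = regs[rs]
            memory_accesses += 1
        else:
            rd, rs1, rs2 = args          # register-to-register ALU operation
            regs[rd] = ALU[op](regs[rs1], regs[rs2])
    return memory_accesses


program = [
    ("load", "R1", "B"), ("load", "R2", "C"), ("load", "R3", "D"),
    ("load", "R4", "E"), ("load", "R5", "F"), ("load", "R6", "A"),
    ("mult", "R2", "R2", "R3"), ("add", "R2", "R2", "R1"),
    ("sub", "R2", "R2", "R4"), ("add", "R2", "R2", "R5"),
    ("add", "R2", "R2", "R6"), ("store", "A", "R2"),
]
memory = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
accesses = run_load_store(program, memory)
```

The program is longer than the accumulator version, but each operand is read from memory exactly once and the five arithmetic instructions never leave the register file.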
Flow of Control
- Default is sequential flow
- Several instructions alter this default execution
  - Branches
    - Unconditional
    - Conditional
    - Delayed branches
  - Procedure calls
    - Delayed procedure calls
Flow of Control (cont.)
Branches
- Unconditional
  - Absolute address
  - PC-relative
    - Target address is specified relative to PC contents
    - Relocatable code
- Example: MIPS
  - Absolute address
      j target
  - PC-relative
      b target
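How the two MIPS forms turn an instruction field into a target address can be sketched as follows. The function names are hypothetical. The arithmetic follows the standard MIPS32 encoding: a branch adds a sign-extended 16-bit word offset to the address of the following instruction, and `j` (strictly a pseudo-direct jump rather than a fully absolute one) joins a 26-bit word index with the upper four bits of PC + 4.

```python
def sign_extend(value, bits):
    """Interpret the low `bits` bits of `value` as a two's-complement number."""
    mask = (1 << bits) - 1
    value &= mask
    sign_bit = 1 << (bits - 1)
    return value - (1 << bits) if value & sign_bit else value


def mips_branch_target(pc, offset_field):
    """Target of a PC-relative branch located at address `pc`."""
    return (pc + 4 + (sign_extend(offset_field, 16) << 2)) & 0xFFFFFFFF


def mips_jump_target(pc, index_field):
    """Target of `j`: upper 4 bits of PC + 4 joined with index << 2."""
    return ((pc + 4) & 0xF0000000) | ((index_field & 0x03FFFFFF) << 2)


forward = mips_branch_target(0x00400000, 3)        # three words past PC + 4
backward = mips_branch_target(0x00400000, 0xFFFF)  # offset -1: the branch itself
jump = mips_jump_target(0x00400000, 0x0100004)
```

Because the branch target depends only on the distance from the branch, the code can be moved in memory without patching it, which is the relocatability mentioned on the slide.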
Flow of Control (cont.)
Branches
- Conditional
  - Jump is taken only if the condition is met
- Two types
  - Set-Then-Jump
    - Condition testing is separated from branching
    - Condition code registers are used to convey the condition test result
    - Condition code registers keep a record of the status of the last ALU operation, such as an overflow condition
  - Example: Pentium code
      cmp AX,BX     ; compare AX and BX
      je  target    ; jump if equal
Flow of Control (cont.)
  - Test-and-Jump
    - Single instruction performs condition testing and branching
  - Example: MIPS instruction
      beq Rsrc1,Rsrc2,target
    Jumps to target if Rsrc1 = Rsrc2
- Delayed branching
  - Control is transferred after executing the instruction that follows the branch instruction
    - This instruction slot is called the delay slot
  - Improves efficiency
  - Highly pipelined RISC processors support it
Flow of Control (cont.)
Procedure calls
- Facilitate modular programming
- Require two pieces of information to return
  - End of procedure
    - Pentium uses the ret instruction
    - MIPS uses the jr instruction
  - Return address
    - In a (special) register: MIPS allows any general-purpose register
    - On the stack: Pentium
Flow of Control (cont.)
[Figure: delay slot]
Parameter Passing
Two basic techniques
- Register-based (e.g., PowerPC, MIPS)
  - Internal registers are used
    - Faster
    - Limit the number of parameters
    - Recursive procedure
- Stack-based (e.g., Pentium)
  - Stack is used
    - More general
2. Operand Types
- Instructions support basic data types
  - Characters
  - Integers
  - Floating-point
- Instruction overload
  - Same instruction for different data types
  - Example: Pentium
      mov AL,address     ; loads an 8-bit value
      mov AX,address     ; loads a 16-bit value
      mov EAX,address    ; loads a 32-bit value
Operand Types (cont.)
- Separate instructions
  - Instructions specify the operand size
  - Example: MIPS
      lb Rdest,address    ; loads a byte
      lh Rdest,address    ; loads a halfword (16 bits)
      lw Rdest,address    ; loads a word (32 bits)
      ld Rdest,address    ; loads a doubleword (64 bits)
  - Similar instructions for store
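The effect of the operand-size suffix can be imitated in Python with the standard `struct` module, reading the same bytes as a byte, halfword, word, and doubleword. Little-endian byte order, the sample bytes, and the helper name `load` are assumptions made for this sketch (MIPS itself can run in either byte order).

```python
import struct

# Eight bytes of "memory", lowest address first.
memory = bytes([0x80, 0xFF, 0x34, 0x12, 0x78, 0x56, 0x34, 0x12])


def load(fmt, address):
    """Read one signed value at `address`; `fmt` is a struct size code."""
    return struct.unpack_from("<" + fmt, memory, address)[0]


lb = load("b", 0)   # byte        (8 bits)
lh = load("h", 0)   # halfword    (16 bits)
lw = load("i", 0)   # word        (32 bits)
ld = load("q", 0)   # doubleword  (64 bits)
```

The same starting address yields four different values; the instruction, not the memory, decides how many bytes are fetched and how the result is sign-extended.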
3. Addressing Modes
- How the operands are specified
- Operands can be in three places
  - Registers
    - Register addressing mode
  - Part of instruction
    - Constant
    - Immediate addressing mode
    - All processors support these two addressing modes
  - Memory
    - Difference between RISC and CISC
    - CISC supports a large variety of addressing modes
    - RISC follows load/store architecture
4. Instruction Types
Several types of instructions
- Data movement
  - Pentium: mov dest,src
  - Some do not provide direct data movement instructions
  - Indirect data movement
      add Rdest,Rsrc,0    ; Rdest = Rsrc+0
- Arithmetic and Logical
  - Arithmetic
    - Integer and floating-point, signed and unsigned
    - add, subtract, multiply, divide
  - Logical
    - and, or, not, xor
Instruction Types (cont.)
Condition code bits
- S: Sign bit (0 = +, 1 = -)
- Z: Zero bit (0 = nonzero, 1 = zero)
- O: Overflow bit (0 = no overflow, 1 = overflow)
- C: Carry bit (0 = no carry, 1 = carry)
Example: Pentium
  cmp count,N    ; compare count to the constant N
                 ; (subtract N from count)
  je  target     ; jump if equal
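The four bits can be derived from a compare, which is a subtraction whose result is discarded. The sketch below assumes a 32-bit width and the x86 convention that the carry bit records a borrow on subtraction; the function name `compare` is hypothetical.

```python
def compare(a, b, bits=32):
    """Return the S, Z, O, C flags that result from computing a - b."""
    mask = (1 << bits) - 1
    sign = 1 << (bits - 1)
    a &= mask
    b &= mask
    result = (a - b) & mask
    return {
        "S": int(bool(result & sign)),
        "Z": int(result == 0),
        # Signed overflow: the operands have different signs and the
        # sign of the result differs from the sign of a.
        "O": int(bool((a ^ b) & (a ^ result) & sign)),
        # Carry acts as a borrow: set when a is below b as unsigned values.
        "C": int(a < b),
    }


equal = compare(25, 25)          # Z set, so a jump-if-equal would be taken
smaller = compare(3, 5)          # negative result with a borrow
overflow = compare(-2**31, 1)    # most negative value minus one overflows
```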
Instruction Types (cont.)
Flow control and I/O instructions
- Branch
- Procedure call
- Interrupts
I/O instructions
- Memory-mapped I/O
  - Most processors support memory-mapped I/O
  - No separate instructions for I/O
- Isolated I/O
  - Pentium supports isolated I/O
  - Separate I/O instructions
      in  AX,io_port    ; read from an I/O port
      out io_port,AX    ; write to an I/O port
5. Instruction Formats
Two types
- Fixed-length
  - Used by RISC processors
  - 32-bit RISC processors use 32-bit-wide instructions
    - Examples: SPARC, MIPS, PowerPC
- Variable-length
  - Used by CISC processors
  - Memory operands need more bits to specify
Opcode
- Major and exact operation
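With a fixed-length format, decoding is a handful of shifts and masks. The sketch below uses the standard MIPS32 R-type layout (6-bit opcode, three 5-bit register fields, 5-bit shift amount, 6-bit function code); the function name is hypothetical, and the sample word is the usual encoding of `add $t0, $t1, $t2`.

```python
def decode_r_type(word):
    """Split a 32-bit MIPS R-type instruction word into its fields."""
    return {
        "opcode": (word >> 26) & 0x3F,  # major operation (0 for R-type)
        "rs":     (word >> 21) & 0x1F,  # first source register
        "rt":     (word >> 16) & 0x1F,  # second source register
        "rd":     (word >> 11) & 0x1F,  # destination register
        "shamt":  (word >> 6) & 0x1F,   # shift amount
        "funct":  word & 0x3F,          # exact operation within the opcode
    }


fields = decode_r_type(0x012A4020)  # add $t0, $t1, $t2
```

The opcode and funct fields correspond to the "major and exact operation" split named on the slide: opcode 0 selects the register-format group and funct 0x20 selects add within it.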
Examples of Instruction Formats
RISC (Reduced Instruction Set Computer)
versus
CISC (Complex Instruction Set Computer)
RISC vs. CISC
- The underlying philosophy of RISC machines is that a system is better able to manage program execution when the program consists of only a few different instructions that are the same length and require the same number of clock cycles to decode and execute.
- RISC systems access memory only with explicit load and store instructions.
- In CISC systems, many different kinds of instructions access memory, making instruction length variable and fetch-decode-execute time unpredictable.
RISC vs. CISC
- The difference between CISC and RISC becomes evident through the basic computer performance equation:
  time/program = time/cycle x cycles/instruction x instructions/program
- RISC systems shorten execution time by reducing the clock cycles per instruction.
- CISC systems improve performance by reducing the number of instructions per program.
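The trade-off between the two strategies can be put into numbers with the performance equation. Every figure below (instruction counts, cycles per instruction, cycle times) is a made-up assumption chosen only to show the arithmetic, not a measurement of any real machine, and the function name is hypothetical.

```python
def cpu_time(instructions, cycles_per_instruction, cycle_time_ns):
    """time/program = instructions/program x cycles/instruction x time/cycle."""
    return instructions * cycles_per_instruction * cycle_time_ns


# Hypothetical CISC-style program: few instructions, several cycles each.
cisc_ns = cpu_time(instructions=3, cycles_per_instruction=4.0, cycle_time_ns=2.0)

# Hypothetical RISC-style program: more instructions, one cycle each,
# and a shorter clock cycle.
risc_ns = cpu_time(instructions=13, cycles_per_instruction=1.0, cycle_time_ns=1.0)
```

Under these assumed numbers the longer RISC-style program still finishes first, because the reduction in the other two factors outweighs the higher instruction count; with different numbers the comparison can go the other way.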
RISC vs. CISC
- The simple instruction set of RISC machines enables control units to be hardwired for maximum speed.
- The more complex, and variable, instruction set of CISC machines requires microcode-based control units that interpret instructions as they are fetched from memory. This translation takes time.
- With fixed-length instructions, RISC lends itself to pipelining and speculative execution.
RISC vs. CISC
Consider the program fragments:
CISC:
         mov ax, 10
         mov bx, 5
         mul bx, ax
RISC:
         mov ax, 0
         mov bx, 10
         mov cx, 5
  Begin: add ax, bx
         loop Begin
The total clock cycles for the CISC version might be:
(2 movs x 1 cycle) + (1 mul x 30 cycles) = 32 cycles
While the clock cycles for the RISC version are:
(3 movs x 1 cycle) + (5 adds x 1 cycle) + (5 loops x 1 cycle) = 13 cycles
With the RISC clock cycle also being shorter, RISC gives us much faster execution speeds.
RISC s CISCRISC s CISC
RISC s CISCRISC s CISC
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 88101
8ecause of their load$store ISAs 1ISC architecturesreJuire a large number of CP4 registers
These register provide fast access to data duringseJuential program eecution
They can also be employed to reduce the overheadtypically caused by passing parameters tosubprograms
Instead of pulling parameters off of a stac0 the
subprogram is directed to use a subset of registers
RISC s CISCRISC s CISC
RISC s CISCRISC s CISC
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 89101
3
This is horegisters canbe overlappedin a 1ISCsystem
The currentindo pointer (CDP) pointsto the activeregister
indo
RISC s CISCRISC s CISC
RISC s CISCRISC s CISC
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 90101
34
It is becoming increasingly difficult to distinguish1ISC architectures from CISC architectures
Some 1ISC systems provide more etravagantinstruction sets than some CISC systems
Some systems combine both approaches The folloing to slides summari-e the
characteristics that traditionally typify the differencesbeteen these to architectures
RISC s CISCRISC s CISC
RISC s CISCRISC s CISC
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 91101
31
RISC Multiple reister sets4
Three operan)s perinstruction4
Parameter passinthrouh reister5in)o5s4
Sinle-ccle
instructions4 7ar)5ire)
control4
7ihl pipeline)4
CISC Sinle reister set4
ne or t5o reisteroperan)s per
instruction4 Parameter passin
throuh memor4
Multiple ccle
instructions4 Microproramme)
control4
(ess pipeline)4ontinued
RISC s CISCRISC s CISC
RISC s CISCRISC s CISC
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 92101
32
RISC Simple instructions
fe5 in num9er4
ie) lenth
instructions4 Compleit in
compiler4
nl 29ADT9$E
instructions accessmemor4
e5 a))ressin mo)es4
CISC Man comple
instructions4
aria9le lenth
instructions4 Compleit in
microco)e4
Man instructions can
access memor4
Man a))ressinmo)es4
RISC s CISCRISC s CISC
RISC s CISCRISC s CISC
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 93101
RISC s CISCRISC s CISC
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 94101
Summar
Instruction Set Design IssuesInstruction Set Desi
gn Issues
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 95101
g
Instruction set )esin issues inclu)e here are operan)s store)lt
- reisters memor stac= accumulator
7o5 man eplicit operan)s are therelt
- 0 + 2 or amp
7o5 is the operan) location specifie)lt
- reister imme)iate in)irect 4 4 4
hat tpe gt sie of operan)s are supporte)lt
- 9te int float )ou9le strin ector4 4 4
hat operations are supporte)lt
- a)) su9 mul moe compare 4 4 4
More A+out 6eneral Purpose egistersMore A+out 6eneral Pu
rpose egisters
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 96101
h )o almost all ne5 architectures usePslt
eisters are much faster than memor eencache3
- eister alues are aaila9le imme)iatel
- hen memor isnt rea) processor must 5aitBstall3
eisters are conenient for aria9le storae
- Compiler assins some aria9les Dust to reisters
- More compact co)e since small fiel)s specifreisters
compare) to memor a))resses3Registers Cache
MemoryProcessor Disk
7hat perations are eeded7hat
perations are eeded
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 97101
3
Arithmetic E (oical
Inteer arithmetic A$$ SU MU(T $I S7IT
(oical operation AN$ NT
$ata Transfer - cop loa) store
Control - 9ranch Dump call return
loatin Point A$$ MU( $I 3 Same as arithmetic 9ut usuall ta=e 9ier operan)s
$ecimal - A$$$ CNT
Strin - moe compare search
raphics F piel an) erte compressionG)ecompression operations
Stacamps Architecture Pros and ConsStacamps Architecture Pros and Cons
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 98101
Stacamps Architecture Pros and ConsStacamps Architecture Pros and Cons
Pros oo) co)e )ensit implicit top of stac=3
(o5 har)5are re1uirements
as to 5rite a simpler compiler for stac= architectures
Cons Stac= 9ecomes the 9ottlenec=
(ittle a9ilit for parallelism or pipelinin
$ata is not al5as at the top of stac= 5hen nee) so a))itionalinstructions li=e TP an) SAP are nee)e)
$ifficult to 5rite an optimiin compiler for stac= architectures
Accumulators Architecture Pros and Cons
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 99101
Accumulators Architecture Pros and Cons
Pros U ery lo hardare reJuirements
U ltasy to design and understand
Cons U Accumulator becomes the bottlenec0
U 3ittle ability for parallelism or pipelining U igh memory traffic
Memory Memory Architecture Pros and Cons
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 100101
Memory3Memory Architecture Pros and Cons
Pros U 1eJuires feer instructions (especially if operands)
U ltasy to rite compilers for (especially if operands)
Cons U ery high memory traffic (especially if operands)
U ariable number of cloc0s per instruction
U Dith to operands more data movements are reJuired
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 101101
Memory3Register Architecture Pros and Cons
Pros U Some data can be accessed ithout loading first
U Instruction format easy to encode
U ood code density
Cons U 5perands are not eJuivalent (poor orthogonal)
U ariable number of cloc0s per instruction U May limit number of registers
Typical Compilation

Major Types of Optimization
Each entry gives the optimization name and its explanation; where a frequency survives in the source it is N.M. (not measured).

High-level (at or near source level; machine independent)
- Procedure integration: replace procedure call by procedure body. Frequency: N.M.

Local (within straight-line code)
- Common subexpression elimination: replace two instances of the same computation by a single copy.
- Constant propagation: replace all instances of a variable that is assigned a constant with the constant.
- Stack height reduction: rearrange expression tree to minimize resources needed for expression evaluation. Frequency: N.M.

Global (across a branch)
- Global common subexpression elimination: same as local, but this version crosses branches.
- Copy propagation: replace all instances of a variable A that has been assigned X (i.e., A = X) with X.
- Code motion: remove code from a loop that computes the same value on each iteration of the loop.
- Induction variable elimination: simplify/eliminate array addressing calculations within loops.

Machine-dependent (depends on machine knowledge)
- Strength reduction: many examples, such as replacing a multiply by a constant with adds and shifts. Frequency: N.M.
- Pipeline scheduling: reorder instructions to improve pipeline performance. Frequency: N.M.
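The strength-reduction entry can be made concrete with a small sketch. The function name times_10 is hypothetical and used only for illustration; a real compiler applies this rewrite to its intermediate code, not to source text.

```python
def times_10(x: int) -> int:
    """Multiply by the constant 10 using shifts and adds: 10*x = 8*x + 2*x."""
    return (x << 3) + (x << 1)


# The rewritten form must agree with an ordinary multiply.
for value in (0, 1, 7, 123):
    assert times_10(value) == 10 * value
```

Whether the shift-and-add form is actually faster depends on the machine, which is why the slide lists strength reduction under machine-dependent optimizations.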
Effect of Compiler Optimization
Figure: measurements taken on S, plotted by program and compiler optimization level.
- Level 0: non-optimized code
- Level 1: local optimization
- Level 2: global optimization, s/w pipelining
- Level 3: adds procedure integration
Compiler Support for Multimedia Instructions
- Intel's MMX and PowerPC AltiVec have small vector processing capabilities targeting multimedia applications (to speed up graphics).
- Intel added a new set of instructions called Streaming SIMD Extensions (SSE).
- A major advantage of vector computers is hiding the latency of memory access by loading multiple elements and then overlapping execution with data transfer.
- Vector computers typically have strided and/or gather/scatter addressing to perform operations on distant memory locations.
- Strided addressing allows memory access in increments larger than one.
- Gather/scatter addressing is similar to register indirect mode, where the addresses are stored instead of the data.
- Supporting vector operations without strided addressing, as Intel's MMX does, limits the potential speedup.
- Such limited support for vector processing makes vectorizing compiler optimization unpopular and restricts its scope to hand-coded routines.
- SIMD instructions on MMX and AltiVec tend to be solutions, not primitives.
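The two addressing styles named on this slide can be sketched in a few lines. This is a plain software model, assuming memory is a Python list of words; the function names strided_load, gather and scatter are illustrative and do not correspond to any real instruction mnemonics.

```python
def strided_load(memory, base, stride, count):
    """Read `count` elements starting at `base`, stepping by `stride` words."""
    return [memory[base + i * stride] for i in range(count)]


def gather(memory, indices):
    """Read the elements whose addresses are listed in `indices`."""
    return [memory[i] for i in indices]


def scatter(memory, indices, values):
    """Write each value to the address held at the same position in `indices`."""
    for i, v in zip(indices, values):
        memory[i] = v


memory = list(range(100, 116))                 # 16 words holding 100..115
column = strided_load(memory, base=1, stride=4, count=4)
picked = gather(memory, [3, 0, 9])
scatter(memory, [3, 0, 9], [-1, -2, -3])
```

With a stride of 4 the load walks one column of a 4x4 row-major matrix, which is the kind of access a unit-stride-only instruction set cannot express directly.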
Starting a Program
Figure (translation flow): C program -> Compiler -> Assembly language program -> Assembler -> Object: machine language module (together with Object: library routine, machine language) -> Linker -> Executable: machine language program -> Loader -> Memory

Linker
- Places code and data modules symbolically in memory
- Determines the addresses of data and instruction labels
- Patches both internal and external references

Object files for Unix typically contain:
- Header: size and position of components
- Text segment: machine code
- Data segment: static and dynamic variables
- Relocation info: identifies absolute memory references
- Symbol table: name and location of labels, procedures and variables
- Debugging info: mapping of source to object code, break points, etc.
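The three linker tasks listed above can be sketched as a two-pass routine. This is a toy model under stated assumptions: each module is a Python dict, memory is word addressed, an unresolved reference is represented by a label string inside the code list, and the default text_base is only an example value. None of this reflects a real object file format.

```python
def link(modules, text_base=0x00400000):
    """Place modules one after another, then patch symbolic references.

    Each module is a dict with:
      "code":   list of words; a str entry is an unresolved reference to a label
      "labels": mapping of label name -> offset of that label inside the module
    """
    # Pass 1: place the modules and record the absolute address of every label.
    symbols = {}
    address = text_base
    for mod in modules:
        for label, offset in mod["labels"].items():
            symbols[label] = address + offset
        address += len(mod["code"])

    # Pass 2: patch internal and external references with absolute addresses.
    image = []
    for mod in modules:
        for word in mod["code"]:
            image.append(symbols[word] if isinstance(word, str) else word)
    return image, symbols


main_module = {"code": [10, "helper", 11], "labels": {"main": 0}}
library = {"code": [20, "main"], "labels": {"helper": 0}}
image, symbols = link([main_module, library])
```

Here "helper" is an external reference from the main module and "main" is an external reference from the library; both are resolved only after pass 1 has fixed every module's position.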
Loading Executable Program
Figure (memory layout, highest address first):
  $sp -> 7fff ffff hex   Stack
                         Dynamic data
  $gp -> 1000 8000 hex   Static data (starts at 1000 0000 hex)
  pc  -> 0040 0000 hex   Text
         0               Reserved

To load an executable, the operating system follows these steps:
1. Reads the executable file header to determine the size of the text and data segments
2. Creates an address space large enough for the text and data
3. Copies the instructions and data from the executable file into memory
4. Copies the parameters (if any) to the main program onto the stack
5. Initializes the machine registers and sets the stack pointer to the first free location
6. Jumps to a start-up routine that copies the parameters into the argument registers and calls the main routine of the program
Instruction Set Design Issues
- Number of Addresses
- Flow of Control
- Operand Types
- Addressing Modes
- Instruction Types
- Instruction Formats
Number of Addresses
Four categories
- 3-address machines
  - Two for the source operands and one for the result
- 2-address machines
  - One address doubles as source and result
- 1-address machines
  - Accumulator machines
  - Accumulator is used for one source and the result
- 0-address machines
  - Stack machines
  - Operands are taken from the stack
  - Result goes onto the stack
Number of Addresses (cont.)
Three-address machines
- Two for the source operands, one for the result
- RISC processors use three addresses
Sample instructions
  add  dest,src1,src2   ; M(dest) = [src1] + [src2]
  sub  dest,src1,src2   ; M(dest) = [src1] - [src2]
  mult dest,src1,src2   ; M(dest) = [src1] * [src2]
Three addresses: Operand 1, Operand 2, Result
Example: a = b + c
Three-address instruction formats are not common because they require a relatively long instruction format to hold the three address references.
Number of Addresses (cont.)
Example
C statement
  A = B + C * D - E + F + A
Equivalent code:
  mult T,C,D   ; T = C*D
  add  T,T,B   ; T = B + C*D
  sub  T,T,E   ; T = B + C*D - E
  add  T,T,F   ; T = B + C*D - E + F
  add  A,T,A   ; A = B + C*D - E + F + A
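To check the three-address sequence, it can be run on a tiny interpreter. This is a minimal sketch, assuming memory is a Python dict keyed by variable name and that instructions are (op, dest, src1, src2) tuples; the function name run_three_address and the initial values are made up for the example.

```python
def run_three_address(program, memory):
    """Execute (op, dest, src1, src2) tuples; every operand names a memory cell."""
    ops = {
        "add": lambda a, b: a + b,
        "sub": lambda a, b: a - b,
        "mult": lambda a, b: a * b,
    }
    for op, dest, src1, src2 in program:
        memory[dest] = ops[op](memory[src1], memory[src2])
    return memory


memory = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6, "T": 0}
program = [
    ("mult", "T", "C", "D"),
    ("add", "T", "T", "B"),
    ("sub", "T", "T", "E"),
    ("add", "T", "T", "F"),
    ("add", "A", "T", "A"),
]
run_three_address(program, memory)
expected = 2 + 3 * 4 - 5 + 6 + 1   # B + C*D - E + F + A with the values above
```

Five instructions suffice because each one names both sources and the destination.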
Number of Addresses (cont.)
Two-address machines
- One address doubles as source operand and result
- The last example makes a case for it
  - Address T is used twice
Sample instructions
  load dest,src   ; M(dest) = [src]
  add  dest,src   ; M(dest) = [dest] + [src]
  sub  dest,src   ; M(dest) = [dest] - [src]
  mult dest,src   ; M(dest) = [dest] * [src]
Two addresses: one address doubles as operand and result
Example: a = a + b
The two-address format reduces the space requirement but also introduces some awkwardness. To avoid altering the value of an operand, a MOVE instruction is used to move one of the values to a result or temporary location before performing the operation.
Number of Addresses (cont.)
Example
C statement
  A = B + C * D - E + F + A
Equivalent code:
  load T,C   ; T = C
  mult T,D   ; T = C*D
  add  T,B   ; T = B + C*D
  sub  T,E   ; T = B + C*D - E
  add  T,F   ; T = B + C*D - E + F
  add  A,T   ; A = B + C*D - E + F + A
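The two-address sequence can be checked the same way. A minimal sketch, assuming (op, dest, src) tuples and a dict-based memory; run_two_address and the initial values are illustrative only.

```python
def run_two_address(program, memory):
    """Execute (op, dest, src) tuples; dest is both an operand and the result."""
    for op, dest, src in program:
        if op == "load":
            memory[dest] = memory[src]
        elif op == "add":
            memory[dest] = memory[dest] + memory[src]
        elif op == "sub":
            memory[dest] = memory[dest] - memory[src]
        elif op == "mult":
            memory[dest] = memory[dest] * memory[src]
        else:
            raise ValueError(f"unknown opcode: {op}")
    return memory


memory = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6, "T": 0}
program = [
    ("load", "T", "C"),
    ("mult", "T", "D"),
    ("add", "T", "B"),
    ("sub", "T", "E"),
    ("add", "T", "F"),
    ("add", "A", "T"),
]
run_two_address(program, memory)
```

The extra load is the cost of the shorter format: the first operand must be copied into T before it can be overwritten.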
Number of Addresses (cont.)
One-address machines
- Use a special set of registers called accumulators
  - Specify one source operand and receive the result
- Called accumulator machines
Sample instructions
  load  addr   ; accum = [addr]
  store addr   ; M[addr] = accum
  add   addr   ; accum = accum + [addr]
  sub   addr   ; accum = accum - [addr]
  mult  addr   ; accum = accum * [addr]
Number of Addresses (cont.)
Example
C statement
  A = B + C * D - E + F + A
Equivalent code:
  load  C   ; load C into accum
  mult  D   ; accum = C*D
  add   B   ; accum = C*D + B
  sub   E   ; accum = B + C*D - E
  add   F   ; accum = B + C*D - E + F
  add   A   ; accum = B + C*D - E + F + A
  store A   ; store accum contents in A
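A one-address machine needs only a single implicit register in the model. The sketch below assumes (op, addr) tuples and a dict-based memory; run_accumulator and the initial values are hypothetical.

```python
def run_accumulator(program, memory):
    """Execute (op, addr) tuples against a single implicit accumulator."""
    accum = 0
    for op, addr in program:
        if op == "load":
            accum = memory[addr]
        elif op == "store":
            memory[addr] = accum
        elif op == "add":
            accum += memory[addr]
        elif op == "sub":
            accum -= memory[addr]
        elif op == "mult":
            accum *= memory[addr]
        else:
            raise ValueError(f"unknown opcode: {op}")
    return memory


memory = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
program = [
    ("load", "C"), ("mult", "D"), ("add", "B"), ("sub", "E"),
    ("add", "F"), ("add", "A"), ("store", "A"),
]
run_accumulator(program, memory)
```

Every instruction touches memory once, which is the high memory traffic listed among the accumulator architecture's drawbacks.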
Number of Addresses (cont.)
Zero-address machines
- Stack supplies operands and receives the result
  - Special instructions to load and store use an address
- Called stack machines (Ex: HP3000, Burroughs B5500)
Sample instructions
  push addr   ; push([addr])
  pop  addr   ; pop([addr])
  add         ; push(pop + pop)
  sub         ; push(pop - pop)
  mult        ; push(pop * pop)
Number of Addresses (cont.)
Example
C statement
  A = B + C * D - E + F + A
Equivalent code:
  push E
  push C
  push D
  mult
  push B
  add
  sub
  push F
  add
  push A
  add
  pop  A
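The stack sequence depends on operand order for sub, so it is worth tracing. This sketch assumes that in push(pop - pop) the first pop (the top of stack) is the left operand, which is what makes pushing E first produce B + C*D - E; run_stack and the initial values are illustrative.

```python
def run_stack(program, memory):
    """Execute zero-address code; only push and pop carry an address."""
    stack = []
    for instr in program:
        op = instr[0]
        if op == "push":
            stack.append(memory[instr[1]])
        elif op == "pop":
            memory[instr[1]] = stack.pop()
        else:
            top = stack.pop()      # first pop: most recently pushed value
            below = stack.pop()    # second pop: the value underneath it
            if op == "add":
                stack.append(top + below)
            elif op == "sub":
                stack.append(top - below)
            elif op == "mult":
                stack.append(top * below)
            else:
                raise ValueError(f"unknown opcode: {op}")
    return memory


memory = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
program = [
    ("push", "E"), ("push", "C"), ("push", "D"), ("mult",),
    ("push", "B"), ("add",), ("sub",), ("push", "F"),
    ("add",), ("push", "A"), ("add",), ("pop", "A"),
]
run_stack(program, memory)
```

Twelve instructions are needed, but the arithmetic ones carry no address at all, which is where the good code density of stack machines comes from.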
Load/Store Architecture
- Instructions expect operands in internal processor registers
- Special LOAD and STORE instructions move data between registers and memory
- RISC uses this architecture
- Reduces instruction length
Load/Store Architecture (cont.)
Sample instructions
  load  Rd,addr     ; Rd = [addr]
  store addr,Rs     ; (addr) = Rs
  add   Rd,Rs1,Rs2  ; Rd = Rs1 + Rs2
  sub   Rd,Rs1,Rs2  ; Rd = Rs1 - Rs2
  mult  Rd,Rs1,Rs2  ; Rd = Rs1 * Rs2
Number of Addresses (cont.)
Example
C statement
  A = B + C * D - E + F + A
Equivalent code:
  load  R1,B
  load  R2,C
  load  R3,D
  load  R4,E
  load  R5,F
  load  R6,A
  mult  R2,R2,R3
  add   R2,R2,R1
  sub   R2,R2,R4
  add   R2,R2,R5
  add   R2,R2,R6
  store A,R2
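The load/store version can be simulated while counting memory accesses, which is the property this architecture is designed around. A minimal sketch, assuming tuple-encoded instructions, a dict for the register file and a dict for memory; run_load_store and the register numbering R1 to R6 are assumptions made for the example.

```python
def run_load_store(program, memory):
    """Execute load/store code; only load and store are allowed to touch memory."""
    regs = {}
    memory_accesses = 0
    for op, *args in program:
        if op == "load":
            rd, addr = args
            regs[rd] = memory[addr]
            memory_accesses += 1
        elif op == "store":
            addr, rs = args
            memory[addr] = regs[rs]
            memory_accesses += 1
        else:
            rd, rs1, rs2 = args
            a, b = regs[rs1], regs[rs2]
            regs[rd] = {"add": a + b, "sub": a - b, "mult": a * b}[op]
    return memory, memory_accesses


memory = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
program = [
    ("load", "R1", "B"), ("load", "R2", "C"), ("load", "R3", "D"),
    ("load", "R4", "E"), ("load", "R5", "F"), ("load", "R6", "A"),
    ("mult", "R2", "R2", "R3"), ("add", "R2", "R2", "R1"),
    ("sub", "R2", "R2", "R4"), ("add", "R2", "R2", "R5"),
    ("add", "R2", "R2", "R6"), ("store", "A", "R2"),
]
memory, memory_accesses = run_load_store(program, memory)
```

The program is as long as the stack version, but each variable is read from memory once and the result is written once; all arithmetic stays in registers.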
Flow of Control
- Default is sequential flow
- Several instructions alter this default execution
  - Branches
    - Unconditional
    - Conditional
    - Delayed branches
  - Procedure calls
    - Delayed procedure calls
Flow of Control (cont.)
Branches
- Unconditional
  - Absolute address
  - PC-relative
    - Target address is specified relative to PC contents
    - Relocatable code
- Example: MIPS
  - Absolute address
      j target
  - PC-relative
      b target
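Why PC-relative branches give relocatable code can be shown with a little arithmetic. This is a simplified sketch: the offset is added directly to the address of the branch, whereas real MIPS hardware adds a scaled offset to the address of the following instruction. The addresses used are arbitrary example values.

```python
def absolute_target(encoded_address: int) -> int:
    """An absolute branch jumps to the address stored in the instruction."""
    return encoded_address


def pc_relative_target(pc: int, offset: int) -> int:
    """A PC-relative branch jumps to the branch's own address plus an offset."""
    return pc + offset


# A branch at 0x1000 whose destination is 0x20 bytes further on.
before = pc_relative_target(0x1000, 0x20)

# Move the whole routine up by 0x4000 without re-encoding any instruction.
shift = 0x4000
after = pc_relative_target(0x1000 + shift, 0x20)

# The absolute form still points at the old location after the move.
stale = absolute_target(0x1020)
```

The relative branch follows the code to its new position, while the absolute branch would need to be patched by the linker or loader.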
Flow of Control (cont.)
Branches
- Conditional
  - Jump is taken only if the condition is met
- Two types
  - Set-Then-Jump
    - Condition testing is separated from branching
    - Condition code registers are used to convey the condition test result
    - Condition code registers keep a record of the status of the last ALU operation, such as an overflow condition
  - Example: Pentium code
      cmp AX,BX    ; compare AX and BX
      je  target   ; jump if equal
Flow of Control (cont.)
  - Test-and-Jump
    - Single instruction performs condition testing and branching
  - Example: MIPS instruction
      beq Rsrc1,Rsrc2,target
    Jumps to target if Rsrc1 = Rsrc2
Delayed branching
- Control is transferred after executing the instruction that follows the branch instruction
  - This instruction slot is called the delay slot
- Improves efficiency
- Highly pipelined RISC processors support it
Flow of Control (cont.)
Procedure calls
- Facilitate modular programming
- Require two pieces of information to return
  - End of procedure
    - Pentium uses the ret instruction
    - MIPS uses the jr instruction
  - Return address
    - In a (special) register: MIPS allows any general-purpose register
    - On the stack: Pentium
Flow of Control (cont.)
Figure: delay slot.
Parameter Passing
Two basic techniques
- Register-based (e.g., PowerPC, MIPS)
  - Internal registers are used
    - Faster
    - Limits the number of parameters
    - Recursive procedures need extra handling
- Stack-based (e.g., Pentium)
  - Stack is used
    - More general
2 Operand Types
Instructions support basic data types
- Characters
- Integers
- Floating-point
Instruction overload
- Same instruction for different data types
- Example: Pentium
    mov AL,address    ; loads an 8-bit value
    mov AX,address    ; loads a 16-bit value
    mov EAX,address   ; loads a 32-bit value
Operand Types (cont.)
Separate instructions
- Instructions specify the operand size
- Example: MIPS
    lb Rdest,address   ; loads a byte
    lh Rdest,address   ; loads a halfword (16 bits)
    lw Rdest,address   ; loads a word (32 bits)
    ld Rdest,address   ; loads a doubleword (64 bits)
  Similar instructions exist for store
3 Addressing Modes
How the operands are specified
- Operands can be in three places
  - Registers
    - Register addressing mode
  - Part of instruction
    - Constant
    - Immediate addressing mode
    - All processors support these two addressing modes
  - Memory
    - Difference between RISC and CISC
    - CISC supports a large variety of addressing modes
    - RISC follows load/store architecture
4 Instruction Types
Several types of instructions
- Data movement
  - Pentium: mov dest,src
  - Some do not provide direct data movement instructions
  - Indirect data movement
      add Rdest,Rsrc,0   ; Rdest = Rsrc + 0
- Arithmetic and Logical
  - Arithmetic
    - Integer and floating-point, signed and unsigned
    - add, subtract, multiply, divide
  - Logical
    - and, or, not, xor
Instruction Types (cont.)
Condition code bits
- S: Sign bit (0 = +, 1 = -)
- Z: Zero bit (0 = nonzero, 1 = zero)
- O: Overflow bit (0 = no overflow, 1 = overflow)
- C: Carry bit (0 = no carry, 1 = carry)
Example: Pentium
    cmp count,25   ; compare count to 25
                   ; subtract 25 from count
    je  target     ; jump if equal
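How a compare sets the four condition code bits can be sketched directly. This is an illustrative model, assuming 32-bit operands and the x86 convention that the carry bit after a subtraction records a borrow; the function names compare and je and the dict of flags are inventions for the example, not a processor interface.

```python
def compare(a: int, b: int, bits: int = 32) -> dict:
    """Set S, Z, O, C as the subtraction a - b would, discarding the result."""
    mask = (1 << bits) - 1
    sign = 1 << (bits - 1)
    a &= mask
    b &= mask
    result = (a - b) & mask
    return {
        "S": int(bool(result & sign)),
        "Z": int(result == 0),
        # Signed overflow: the operands differ in sign and the result's sign
        # differs from the sign of a.
        "O": int(bool((a ^ b) & (a ^ result) & sign)),
        # Carry is modelled as the borrow out of the unsigned subtraction.
        "C": int(a < b),
    }


def je(flags: dict) -> bool:
    """Jump-if-equal looks only at the zero bit left by the compare."""
    return flags["Z"] == 1


equal = compare(25, 25)
smaller = compare(24, 25)
overflowed = compare(0x80000000, 1)   # most negative 32-bit value minus 1
```

This separation of compare and jump is the set-then-jump style: the branch instruction never sees the operands, only the recorded flags.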
Instruction Types (cont.)
Flow control and I/O instructions
- Flow control
  - Branch
  - Procedure call
  - Interrupts
- I/O instructions
  - Memory-mapped I/O
    - Most processors support memory-mapped I/O
    - No separate instructions for I/O
  - Isolated I/O
    - Pentium supports isolated I/O
    - Separate I/O instructions
        in  AX,io_port   ; read from an I/O port
        out io_port,AX   ; write to an I/O port
5 Instruction Formats
Two types
- Fixed-length
  - Used by RISC processors
  - 32-bit RISC processors use 32-bit-wide instructions
    - Examples: SPARC, MIPS, PowerPC
- Variable-length
  - Used by CISC processors
  - Memory operands need more bits to specify
Opcode
- Major and exact operation
Examples of Instruction Formats

RISC (Reduced Instruction Set Computer) versus CISC (Complex Instruction Set Computer)
RISC vs CISC
- The underlying philosophy of RISC machines is that a system is better able to manage program execution when the program consists of only a few different instructions that are the same length and require the same number of clock cycles to decode and execute.
- RISC systems access memory only with explicit load and store instructions.
- In CISC systems, many different kinds of instructions access memory, making instruction length variable and fetch-decode-execute time unpredictable.
RISC vs CISC (cont.)
- The difference between CISC and RISC becomes evident through the basic computer performance equation:
    time/program = (time/cycle) x (cycles/instruction) x (instructions/program)
- RISC systems shorten execution time by reducing the clock cycles per instruction.
- CISC systems improve performance by reducing the number of instructions per program.
RISC vs CISC (cont.)
- The simple instruction set of RISC machines enables control units to be hardwired for maximum speed.
- The more complex, and variable, instruction set of CISC machines requires microcode-based control units that interpret instructions as they are fetched from memory. This translation takes time.
- With fixed-length instructions, RISC lends itself to pipelining and speculative execution.
RISC vs CISC (cont.)
Consider these two program fragments:
CISC
    mov ax, 10
    mov bx, 5
    mul bx, ax
RISC
    mov ax, 0
    mov bx, 10
    mov cx, 5
    Begin: add ax, bx
    loop Begin
The total clock cycles for the CISC version might be:
    (2 movs x 1 cycle) + (1 mul x 30 cycles) = 32 cycles
While the clock cycles for the RISC version are:
    (3 movs x 1 cycle) + (5 adds x 1 cycle) + (5 loops x 1 cycle) = 13 cycles
With the RISC clock cycle also being shorter, RISC gives us much faster execution speeds.
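The cycle arithmetic for the two fragments can be re-derived in a few lines. The sketch assumes the per-instruction costs as read from the slide (1 cycle for each mov, add and loop, 30 cycles for mul) and that the RISC loop runs five times; the helper names total_cycles and risc_multiply are hypothetical.

```python
def total_cycles(counts_and_costs):
    """Sum count * cycles over (count, cycles_per_instruction) pairs."""
    return sum(count * cycles for count, cycles in counts_and_costs)


def risc_multiply(multiplicand: int, times: int) -> int:
    """Mirror the RISC fragment: add `multiplicand` into ax `times` times."""
    ax, bx, cx = 0, multiplicand, times
    while cx > 0:          # the loop instruction decrements cx and branches
        ax += bx
        cx -= 1
    return ax


cisc_cycles = total_cycles([(2, 1), (1, 30)])          # 2 movs, 1 mul
risc_cycles = total_cycles([(3, 1), (5, 1), (5, 1)])   # 3 movs, 5 adds, 5 loops
product = risc_multiply(10, 5)
```

The comparison counts cycles only; the slide's further claim about speed also relies on the RISC cycle time itself being shorter.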
RISC vs CISC (cont.)
- Because of their load-store ISAs, RISC architectures require a large number of CPU registers.
- These registers provide fast access to data during sequential program execution.
- They can also be employed to reduce the overhead typically caused by passing parameters to subprograms.
- Instead of pulling parameters off of a stack, the subprogram is directed to use a subset of registers.
RISC vs CISC (cont.)
- This is how registers can be overlapped in a RISC system.
- The current window pointer (CWP) points to the active register window.
RISC vs CISC (cont.)
- It is becoming increasingly difficult to distinguish RISC architectures from CISC architectures.
- Some RISC systems provide more extravagant instruction sets than some CISC systems.
- Some systems combine both approaches.
- The following two slides summarize the characteristics that traditionally typify the differences between these two architectures.
RISC vs CISC (cont.)
RISC
- Multiple register sets.
- Three operands per instruction.
- Parameter passing through register windows.
- Single-cycle instructions.
- Hardwired control.
- Highly pipelined.
CISC
- Single register set.
- One or two register operands per instruction.
- Parameter passing through memory.
- Multiple-cycle instructions.
- Microprogrammed control.
- Less pipelined.
Continued...
RISC vs CISC (cont.)
RISC
- Simple instructions, few in number.
- Fixed-length instructions.
- Complexity in compiler.
- Only LOAD/STORE instructions access memory.
- Few addressing modes.
CISC
- Many complex instructions.
- Variable-length instructions.
- Complexity in microcode.
- Many instructions can access memory.
- Many addressing modes.
Summary
Instruction Set Design Issues
Instruction set design issues include:
- Where are operands stored?
  - registers, memory, stack, accumulator
- How many explicit operands are there?
  - 0, 1, 2, or 3
- How is the operand location specified?
  - register, immediate, indirect, ...
- What type and size of operands are supported?
  - byte, int, float, double, string, vector, ...
- What operations are supported?
  - add, sub, mul, move, compare, ...
More About General Purpose Registers
Why do almost all new architectures use GPRs?
- Registers are much faster than memory (even cache)
  - Register values are available immediately
  - When memory isn't ready, the processor must wait ("stall")
- Registers are convenient for variable storage
  - Compiler assigns some variables just to registers
  - More compact code, since small fields specify registers (compared to memory addresses)
Figure (storage hierarchy): Processor, Registers, Cache, Memory, Disk
What Operations Are Needed
- Arithmetic and Logical
  - Integer arithmetic: ADD, SUB, MULT, DIV, SHIFT
  - Logical operation: AND, OR, XOR, NOT
- Data Transfer: copy, load, store
- Control: branch, jump, call, return
- Floating Point: ADDF, MULF, DIVF
  - Same as arithmetic but usually take bigger operands
- Decimal: ADDD, CONVERT
- String: move, compare, search
- Graphics: pixel and vertex, compression/decompression operations
Stack Architecture: Pros and Cons
Pros
- Good code density (implicit top of stack)
- Low hardware requirements
- Easy to write a simpler compiler for stack architectures
Cons
- Stack becomes the bottleneck
- Little ability for parallelism or pipelining
- Data is not always at the top of the stack when needed, so additional instructions like TOP and SWAP are needed
- Difficult to write an optimizing compiler for stack architectures
Accumulators Architecture Pros and Cons
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 99101
Accumulators Architecture Pros and Cons
Pros U ery lo hardare reJuirements
U ltasy to design and understand
Cons U Accumulator becomes the bottlenec0
U 3ittle ability for parallelism or pipelining U igh memory traffic
Memory Memory Architecture Pros and Cons
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 100101
Memory3Memory Architecture Pros and Cons
Pros U 1eJuires feer instructions (especially if operands)
U ltasy to rite compilers for (especially if operands)
Cons U ery high memory traffic (especially if operands)
U ariable number of cloc0s per instruction
U Dith to operands more data movements are reJuired
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 101101
Memory3Register Architecture Pros and Cons
Pros U Some data can be accessed ithout loading first
U Instruction format easy to encode
U ood code density
Cons U 5perands are not eJuivalent (poor orthogonal)
U ariable number of cloc0s per instruction U May limit number of registers
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 47101
$ptimiation ame 7planation 6re-uency
+igh Fleel
Procedure integration
$t or near source leelamp machine indep
1eplace procedure call by procedure body 7M
5ocal
Common sub$ epressionelimination
Constant propagation
Stac0 height reduction
(ithin straight line code
1eplace to instances of the same computation bysingle copy
1eplace all instances of a variable that is assigned aconstant ith the constant
1earrange epression tree to minimi-e resourcesneeded for epression evaluation
=
7M
Glo8al
lobal common subepression elimination
Copy propagation
Code motion
Induction variable
elimination
$cross a ranch
Same as local but this version crosses branches
1eplace all instances of a variable A that has beenassigned (ie A E ) ith
1emove code from a loop that computes same value
each iteration of the loopSimplify2eliminate array Uaddressing calculationsithin loops
Machine3dependant
Strength reduction
Pipeline Scheduling
Depends on machine )nowledge
Many eamples such as replace multiply by aconstant ith adds and shifts
1eorder instructions to improve pipeline performance
7M
7M
Ma9or ypes of $ptimiation
ffect of Complier $ptimiation
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 48101
easurements taken on S
P r o g r a m a
n d C o m p i l e r $ p t i m i a t i
o n 5 e e l
e=el 6 non$optimi-ed code
e=el 16 local optimi-ation
e=el 6 global optimi-ation s2 pipelining
e=el 6 adds procedure integration
ffect of Complier $ptimiation
Compiler Support for Multimedia Instr
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 49101
IntelQs MM and PoerPC Altiec have small vector processing capabilitiestargeting Multimedia applications (to speed up graphics)
Intel added ne set of instructions called Streaming SIM lttension
A maOor advantage of vector computers is hiding latency of memory accessby loading multiple elements and then overlapping eecution ith data
transfer
ector computers typically have strided and2or gather2scatter addressing to
perform operations on distant memory locations Strided addressing allos memory access in increment larger than one
ather2scatter addressing is similar to register indirect mode here theaddress are stored instead of the data
Supporting vector operation ithout strided addressing such as IntelQs MMlimits the potential speedup
Such limited support for vector processing ma0es the use of vectori-ing compiler optimi-ation unpopular and restrict its scope to hand coded routines
Compiler Support for Multimedia Instramp
SIM instructions on MM and Altiec tend to be solutions not primitivesSIM instructions on MM and Altiec tend to be solutions not primitives
Starting a Program
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 50101
Starting a Program
A s s e m b l e r
A s s e m b l y l a n g u a g e p r o g r a m
C o m p i l e r
C p r o g r a m
3 i n 0 e r
lt e c u t a b l e 6 M a c h i n e l a n g u a g e p r o g r a m
3 o a d e r
M e m o r y
5 b O e c t 6 M a c h i n e l a n g u a g e m o d u l e 5 b O e c t 6 3 i b r a r y r o u t i n e ( m a c h i n e l a n g u a g e )
$ Place code data modules
symbolically in memory
$etermine the address of data instruction labels
$Patch both internal eternal ref
$ Place code data modules
symbolically in memory
$etermine the address of data instruction labels
$Patch both internal eternal ref
5bOect files for 4ni typically contains6
eader6 si-e position of components
Tet segment6 machine code
ata segment6 static and dynamic variables1elocation info6 identify absolute memory ref
Symbol table6 name location of labelsprocedures and variables
ebugging info6 mapping source to obOectcode brea0 points etc
5inker
5oading 7ecuta8le Program
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 51101
R s p
R g p
gt gt amp gt gt gt gt gth e
gt
gt gt gt gt gt gt gt h e
T e t
S t a t i c d a t a
y n a m i c d a t a
S t a c 0B f f f f f f f
h e
gt gt gt = gt gt gth e
p c
1 e s e r v e d
5oading 7ecuta8le Program
To load an eecutable the operating systemfollos these steps6
1eads the eecutable file header todetermine the si-e of tet and data segments
Creates an address space large enough forthe tet and data
Copies the instructions and data from the
eecutable file into memory
Copies the parameters (if any) to the mainprogram onto the stac0
Initiali-es the machine registers and sets thestac0 pointer to the first free location
umps to a start$up routines that copies theparameters into the argument registers andcalls the main routine of the program
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 52101
Instruction Set Design IssuesInstruction Set Desi
gn Issues
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 53101
Instruction Set Design IssuesInstruction Set Design Issues
Instruction Set esign Issues 7umber of Addresses
Llo of Control
5perand Typesamp Addressing Modes
Instruction Types
Instruction Lormats
um+er of Addressesum+er of Addresses
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 54101
um+er of Addressesum+er of Addresses
Lour categories
$address machines$ for the source operands and one for the result
$address machines
$ 5ne address doubles as source and result
$address machine$ Accumulator machines
$ Accumulator is used for one source and result
gt$address machines
$ Stac0 machines
$ 5perands are ta0en from the stac0
$ 1esult goes onto the stac0
um+er of Addresses cont -um+er of Addresses
cont -
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 55101
um+er of Addresses cont-um+er of Addresses cont-
Three$address machines
To for the source operands one for the result
1ISC processors use three addresses
Sample instructions
add destsrc1src2
M(dest)=[src1]+[src2]
sub destsrc1src2
M(dest)=[src1]-[src2]
mult destsrc1src2
M(dest)=[src1][src2]
Three addresses
Operand 1 Operand 2 Result
Example a = b + c
Three-address instruction formats are not common because they reuire a
relatiely lon instruction format to hold the three address references
Number of Addresses (cont.)
Example
C statement:
  A = B + C * D - E + F + A
Equivalent code:
  mult T,C,D   ;T = C*D
  add  T,T,B   ;T = B+C*D
  sub  T,T,E   ;T = B+C*D-E
  add  T,T,F   ;T = B+C*D-E+F
  add  A,T,A   ;A = B+C*D-E+F+A
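The three-address sequence for A = B + C * D - E + F + A can be traced with a few lines of Python. This is only a sketch: the tuple encoding, the name `run_three_address`, and the sample operand values are assumptions made for illustration and do not come from the slides.

```python
import operator

# Hypothetical encoding: (opcode, dest, src1, src2); every operand names a memory cell.
OPS = {"add": operator.add, "sub": operator.sub, "mult": operator.mul}

def run_three_address(program, memory):
    """Execute three-address instructions: M(dest) = [src1] op [src2]."""
    for opcode, dest, src1, src2 in program:
        memory[dest] = OPS[opcode](memory[src1], memory[src2])
    return memory

# A = B + C*D - E + F + A, with T as the temporary location.
THREE_ADDRESS_PROGRAM = [
    ("mult", "T", "C", "D"),   # T = C*D
    ("add",  "T", "T", "B"),   # T = B+C*D
    ("sub",  "T", "T", "E"),   # T = B+C*D-E
    ("add",  "T", "T", "F"),   # T = B+C*D-E+F
    ("add",  "A", "T", "A"),   # A = B+C*D-E+F+A
]

# Sample values are arbitrary.
mem3 = run_three_address(
    THREE_ADDRESS_PROGRAM,
    {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6, "T": 0},
)
print(mem3["A"])
```

Five instructions are enough, but each one carries three address fields, which is the instruction-length cost noted for this format.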
Number of Addresses (cont.)
Two-address machines
  - One address doubles (for source operand & result)
  - Last example makes a case for it
      • Address T is used twice
Sample instructions
  load dest,src   ;M(dest)=[src]
  add  dest,src   ;M(dest)=[dest]+[src]
  sub  dest,src   ;M(dest)=[dest]-[src]
  mult dest,src   ;M(dest)=[dest]*[src]
Two addresses: one address doubles as operand and result
Example: a = a + b
The two-address format reduces the space requirement but also introduces some awkwardness. To avoid altering the value of an operand, a MOVE instruction is used to move one of the values to a result or temporary location before performing the operation.
Number of Addresses (cont.)
Example
C statement:
  A = B + C * D - E + F + A
Equivalent code:
  load T,C   ;T = C
  mult T,D   ;T = C*D
  add  T,B   ;T = B+C*D
  sub  T,E   ;T = B+C*D-E
  add  T,F   ;T = B+C*D-E+F
  add  A,T   ;A = B+C*D-E+F+A
Number of Addresses (cont.)
One-address machines
  - Use special set of registers called accumulators
      • Specify one source operand & receive the result
  - Called accumulator machines
Sample instructions
  load  addr   ;accum = [addr]
  store addr   ;M[addr] = accum
  add   addr   ;accum = accum + [addr]
  sub   addr   ;accum = accum - [addr]
  mult  addr   ;accum = accum * [addr]
Number of Addresses (cont.)
Example
C statement:
  A = B + C * D - E + F + A
Equivalent code:
  load  C   ;load C to accum
  mult  D   ;accum = C*D
  add   B   ;accum = C*D+B
  sub   E   ;accum = B+C*D-E
  add   F   ;accum = B+C*D-E+F
  add   A   ;accum = B+C*D-E+F+A
  store A   ;store accum contents in A
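The accumulator sequence can be checked the same way. In this sketch the `(opcode, addr)` tuple encoding, the function name `run_accumulator`, and the operand values are illustrative assumptions; only the instruction order is taken from the slide.

```python
def run_accumulator(program, memory):
    """Execute one-address code; the accumulator is the implicit second operand and result."""
    accum = 0
    for opcode, addr in program:
        if opcode == "load":
            accum = memory[addr]
        elif opcode == "store":
            memory[addr] = accum
        elif opcode == "add":
            accum += memory[addr]
        elif opcode == "sub":
            accum -= memory[addr]
        elif opcode == "mult":
            accum *= memory[addr]
        else:
            raise ValueError(f"unknown opcode: {opcode}")
    return memory

ACCUMULATOR_PROGRAM = [
    ("load", "C"), ("mult", "D"), ("add", "B"), ("sub", "E"),
    ("add", "F"), ("add", "A"), ("store", "A"),
]

# Sample values are arbitrary.
acc_mem = run_accumulator(
    ACCUMULATOR_PROGRAM, {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
)
print(acc_mem["A"])
```

Every instruction here names exactly one memory address, so each of the seven instructions also costs one memory access.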
Number of Addresses (cont.)
Zero-address machines
  - Stack supplies operands and receives the result
      • Special instructions to load and store use an address
  - Called stack machines (Ex: HP3000, Burroughs B5500)
Sample instructions
  push addr   ;push([addr])
  pop  addr   ;pop([addr])
  add         ;push(pop + pop)
  sub         ;push(pop - pop)
  mult        ;push(pop * pop)
Number of Addresses (cont.)
Example
C statement:
  A = B + C * D - E + F + A
Equivalent code:
  push E
  push C
  push D
  mult
  push B
  add
  sub
  push F
  add
  push A
  add
  pop  A
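A small stack interpreter makes the operand order of the zero-address code explicit. This is a sketch under stated assumptions: the tuple encoding and the name `run_stack` are invented for illustration, and `sub` is taken to compute (first value popped) minus (second value popped), following the slide's push(pop - pop) notation.

```python
def run_stack(program, memory):
    """Execute zero-address code; only push and pop carry a memory address."""
    stack = []
    for instr in program:
        opcode = instr[0]
        if opcode == "push":
            stack.append(memory[instr[1]])
        elif opcode == "pop":
            memory[instr[1]] = stack.pop()
        else:
            first = stack.pop()    # top of stack
            second = stack.pop()   # entry below the top
            if opcode == "add":
                stack.append(first + second)
            elif opcode == "sub":
                stack.append(first - second)
            elif opcode == "mult":
                stack.append(first * second)
            else:
                raise ValueError(f"unknown opcode: {opcode}")
    return memory

STACK_PROGRAM = [
    ("push", "E"), ("push", "C"), ("push", "D"), ("mult",),
    ("push", "B"), ("add",), ("sub",),
    ("push", "F"), ("add",), ("push", "A"), ("add",), ("pop", "A"),
]

# Sample values are arbitrary.
stack_mem = run_stack(STACK_PROGRAM, {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6})
print(stack_mem["A"])
```

E is pushed first so that it sits below B + C*D when `sub` executes; pushing it later would reverse the subtraction.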
Load/Store Architecture
Instructions expect operands in internal processor registers
  - Special LOAD and STORE instructions move data between registers and memory
  - RISC uses this architecture
  - Reduces instruction length
Load/Store Architecture (cont.)
Sample instructions
  load  Rd,addr     ;Rd = [addr]
  store addr,Rs     ;(addr) = Rs
  add   Rd,Rs1,Rs2  ;Rd = Rs1 + Rs2
  sub   Rd,Rs1,Rs2  ;Rd = Rs1 - Rs2
  mult  Rd,Rs1,Rs2  ;Rd = Rs1 * Rs2
Number of Addresses (cont.)
Example
C statement:
  A = B + C * D - E + F + A
Equivalent code:
  load  R1,B
  load  R2,C
  load  R3,D
  load  R4,E
  load  R5,F
  load  R6,A
  mult  R2,R2,R3
  add   R2,R2,R1
  sub   R2,R2,R4
  add   R2,R2,R5
  add   R2,R2,R6
  store A,R2
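The load/store version needs more instructions, but only `load` and `store` touch memory. The sketch below runs the sequence and counts memory accesses; the integer register numbering, the tuple encoding, and the name `run_load_store` are assumptions for illustration.

```python
def run_load_store(program, memory, num_regs=8):
    """Execute load/store code and count how many instructions access memory."""
    regs = [0] * num_regs
    memory_accesses = 0
    for opcode, *args in program:
        if opcode == "load":            # load Rd,addr
            rd, addr = args
            regs[rd] = memory[addr]
            memory_accesses += 1
        elif opcode == "store":         # store addr,Rs
            addr, rs = args
            memory[addr] = regs[rs]
            memory_accesses += 1
        elif opcode in ("add", "sub", "mult"):
            rd, rs1, rs2 = args         # register-only arithmetic
            a, b = regs[rs1], regs[rs2]
            regs[rd] = a + b if opcode == "add" else a - b if opcode == "sub" else a * b
        else:
            raise ValueError(f"unknown opcode: {opcode}")
    return memory, memory_accesses

LOAD_STORE_PROGRAM = [
    ("load", 1, "B"), ("load", 2, "C"), ("load", 3, "D"),
    ("load", 4, "E"), ("load", 5, "F"), ("load", 6, "A"),
    ("mult", 2, 2, 3), ("add", 2, 2, 1), ("sub", 2, 2, 4),
    ("add", 2, 2, 5), ("add", 2, 2, 6), ("store", "A", 2),
]

# Sample values are arbitrary.
ls_mem, ls_accesses = run_load_store(
    LOAD_STORE_PROGRAM, {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
)
print(ls_mem["A"], ls_accesses)
```

Twelve instructions execute, yet memory is touched only seven times (six loads and one store); the arithmetic instructions need only short register fields.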
Flow of Control
Default is sequential flow
Several instructions alter this default execution
  - Branches
      • Unconditional
      • Conditional
      • Delayed branches
  - Procedure calls
      • Delayed procedure calls
Flow of Control (cont.)
Branches
Unconditional
  - Absolute address
  - PC-relative
      • Target address is specified relative to PC contents
      • Relocatable code
Example: MIPS
  - Absolute address
      j target
  - PC-relative
      b target
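The difference between the two forms shows up when the target address is computed. The sketch below follows the MIPS32 conventions (a branch offset counted in words from the instruction after the branch, and a 26-bit jump index combined with the upper four bits of PC + 4); the function names and the sample addresses are assumptions for illustration.

```python
def pc_relative_target(pc, offset_words):
    """Branch target: address of the next instruction plus a signed word offset."""
    return (pc + 4 + (offset_words << 2)) & 0xFFFFFFFF

def absolute_jump_target(pc, index26):
    """Jump target: upper 4 bits of PC+4 joined with the 26-bit index shifted left by 2."""
    return ((pc + 4) & 0xF0000000) | ((index26 & 0x03FFFFFF) << 2)

# The same PC-relative offset still reaches the right instruction after the code is moved.
old_pc, new_pc = 0x00400000, 0x00500000
forward = pc_relative_target(old_pc, 3)
moved = pc_relative_target(new_pc, 3)
print(hex(forward), hex(moved))

# An absolute jump names the same address no matter where the jump instruction sits.
jump_old = absolute_jump_target(old_pc, 0x00100008)
jump_new = absolute_jump_target(new_pc, 0x00100008)
print(hex(jump_old), hex(jump_new))
```

Because the PC-relative target moves with the code, the encoded offset never needs patching, which is what makes such code relocatable.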
Flow of Control (cont.)
Branches
Conditional
  - Jump is taken only if the condition is met
Two types
  - Set-Then-Jump
      • Condition testing is separated from branching
      • Condition code registers are used to convey the condition test result
      • Condition code registers keep a record of the status of the last ALU operation, such as overflow condition
  - Example: Pentium code
      cmp AX,BX     ;compare AX and BX
      je  target    ;jump if equal
Flow of Control (cont.)
  - Test-and-Jump
      • Single instruction performs condition testing and branching
  - Example: MIPS instruction
      beq Rsrc1,Rsrc2,target
      Jumps to target if Rsrc1 = Rsrc2
Delayed branching
  - Control is transferred after executing the instruction that follows the branch instruction
      • This instruction slot is called delay slot
  - Improves efficiency
  - Highly pipelined RISC processors support
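Delayed branching can be illustrated with a toy interpreter in which a jump takes effect only after the following instruction has run. The pseudo-instruction format, the name `run_with_delay_slot`, and the labels in the sample program are all assumptions for illustration; no real instruction set is modeled.

```python
def run_with_delay_slot(program):
    """Run (kind, arg) pseudo-instructions; a 'jump' transfers control one instruction late."""
    trace = []
    pc = 0
    pending_target = None              # set by a jump, consumed after the delay slot
    while pc < len(program):
        kind, arg = program[pc]
        next_pc = pc + 1
        if pending_target is not None:
            # This instruction occupies the delay slot: it executes, then control transfers.
            next_pc = pending_target
            pending_target = None
        if kind == "jump":
            pending_target = arg       # arg is the index of the target instruction
        else:
            trace.append(arg)          # record that an ordinary instruction executed
        pc = next_pc
    return trace

DELAY_PROGRAM = [
    ("op", "before"),
    ("jump", 4),
    ("op", "delay-slot"),              # executed even though the jump precedes it
    ("op", "skipped"),
    ("op", "target"),
]

delay_trace = run_with_delay_slot(DELAY_PROGRAM)
print(delay_trace)
```

The instruction in the delay slot does useful work while the pipeline would otherwise be waiting for the branch to resolve, which is the efficiency gain the slide refers to.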
Flow of Control (cont.)
Procedure calls
  - Facilitate modular programming
  - Require two pieces of information to return
      • End of procedure
          Pentium uses ret instruction
          MIPS uses jr instruction
      • Return address
          In a (special) register: MIPS allows any general-purpose register
          On the stack: Pentium
Flow of Control (cont.)
[Figure: delay slot]
Parameter Passing
Two basic techniques
  - Register-based (e.g., PowerPC, MIPS)
      • Internal registers are used
      • Faster
      • Limit the number of parameters
      • Recursive procedure
  - Stack-based (e.g., Pentium)
      • Stack is used
      • More general
Operand Types
Instructions support basic data types
  - Characters
  - Integers
  - Floating-point
Instruction overload
  - Same instruction for different data types
  - Example: Pentium
      mov AL,address    ;loads an 8-bit value
      mov AX,address    ;loads a 16-bit value
      mov EAX,address   ;loads a 32-bit value
Operand Types (cont.)
Separate instructions
  - Instructions specify the operand size
  - Example: MIPS
      lb Rdest,address   ;loads a byte
      lh Rdest,address   ;loads a halfword (16 bits)
      lw Rdest,address   ;loads a word (32 bits)
      ld Rdest,address   ;loads a doubleword (64 bits)
  - Similar instruction: store
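The effect of the size encoded in each load instruction can be sketched by reading 1, 2, 4, or 8 bytes from the same address. The sample memory contents, the big-endian byte order, and the helper name `load_signed` are assumptions for illustration; sign extension is modeled by interpreting the bytes as a signed integer.

```python
# Eight bytes of sample memory; the values are arbitrary.
SAMPLE_MEMORY = bytes([0xFF, 0x80, 0x00, 0x01, 0x12, 0x34, 0x56, 0x78])

def load_signed(memory, address, size):
    """Read `size` bytes at `address` as a signed big-endian integer."""
    return int.from_bytes(memory[address:address + size], byteorder="big", signed=True)

byte_value = load_signed(SAMPLE_MEMORY, 0, 1)        # like lb: 8 bits
halfword_value = load_signed(SAMPLE_MEMORY, 0, 2)    # like lh: 16 bits
word_value = load_signed(SAMPLE_MEMORY, 0, 4)        # like lw: 32 bits
doubleword_value = load_signed(SAMPLE_MEMORY, 0, 8)  # like ld: 64 bits
print(byte_value, halfword_value, word_value, doubleword_value)
```

The same address yields four different values depending only on the operand size, which is why the size has to be stated either in the opcode (MIPS) or through the register name (Pentium).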
Addressing Modes
How the operands are specified
  - Operands can be in three places
      • Registers
          Register addressing mode
      • Part of instruction
          Constant
          Immediate addressing mode
          All processors support these two addressing modes
      • Memory
          Difference between RISC and CISC
          CISC supports a large variety of addressing modes
          RISC follows load/store architecture
Instruction Types
Several types of instructions
  - Data movement
      • Pentium: mov dest,src
      • Some do not provide direct data movement instructions
      • Indirect data movement
          add Rdest,Rsrc,0   ;Rdest = Rsrc+0
  - Arithmetic and Logical
      • Arithmetic
          Integer and floating-point, signed and unsigned
          add, subtract, multiply, divide
      • Logical
          and, or, not, xor
Instruction Types (cont.)
Condition code bits
  - S: Sign bit (0 = +, 1 = -)
  - Z: Zero bit (0 = nonzero, 1 = zero)
  - O: Overflow bit (0 = no overflow, 1 = overflow)
  - C: Carry bit (0 = no carry, 1 = carry)
Example: Pentium
  cmp count,25   ;compare count to 25
                 ;subtract 25 from count
  je  target     ;jump if equal
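How a compare instruction sets these four bits can be sketched by performing the subtraction on fixed-width operands and inspecting the result. The 16-bit default width, the function name `compare_flags`, and the sample operands are assumptions; the carry bit follows the x86 convention of signaling a borrow.

```python
def compare_flags(a, b, bits=16):
    """Return the S, Z, O, C bits produced by computing a - b on `bits`-wide operands."""
    mask = (1 << bits) - 1
    sign_bit = 1 << (bits - 1)
    a &= mask
    b &= mask
    result = (a - b) & mask
    return {
        "S": int(bool(result & sign_bit)),                  # sign of the result
        "Z": int(result == 0),                              # result is zero
        # Signed overflow: operands differ in sign and the result's sign differs from a's.
        "O": int(bool((a ^ b) & (a ^ result) & sign_bit)),
        "C": int(a < b),                                    # borrow needed (unsigned a < b)
    }

equal_flags = compare_flags(25, 25)         # count == 25: a jump-if-equal would be taken
smaller_flags = compare_flags(3, 25)        # count < 25
overflow_flags = compare_flags(0x8000, 1)   # most negative 16-bit value minus 1
print(equal_flags, smaller_flags, overflow_flags)
```

A conditional jump such as `je` only inspects these bits (here Z), which is what separates condition testing from branching in the set-then-jump scheme.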
Instruction Types (cont.)
Flow control and I/O instructions
  - Branch
  - Procedure call
  - Interrupts
I/O instructions
  - Memory-mapped I/O
      • Most processors support memory-mapped I/O
      • No separate instructions for I/O
  - Isolated I/O
      • Pentium supports isolated I/O
      • Separate I/O instructions
          in  AX,io_port   ;read from an I/O port
          out io_port,AX   ;write to an I/O port
Instruction Formats
Two types
  - Fixed-length
      • Used by RISC processors
      • 32-bit RISC processors use 32-bit wide instructions
          Examples: SPARC, MIPS, PowerPC
  - Variable-length
      • Used by CISC processors
      • Memory operands need more bits to specify
Opcode
  - Major and exact operation
Examples of Instruction Formats
RISC (Reduced Instruction Set Computer) versus CISC (Complex Instruction Set Computer)
RISC vs. CISC
The underlying philosophy of RISC machines is that a system is better able to manage program execution when the program consists of only a few different instructions that are the same length and require the same number of clock cycles to decode and execute.
RISC systems access memory only with explicit load and store instructions.
In CISC systems, many different kinds of instructions access memory, making instruction length variable and fetch-decode-execute time unpredictable.
RISC vs. CISC
The difference between CISC and RISC becomes evident through the basic computer performance equation:
  time/program = time/cycle × cycles/instruction × instructions/program
RISC systems shorten execution time by reducing the clock cycles per instruction.
CISC systems improve performance by reducing the number of instructions per program.
RISC vs. CISC
The simple instruction set of RISC machines enables control units to be hardwired for maximum speed.
The more complex (and variable) instruction set of CISC machines requires microcode-based control units that interpret instructions as they are fetched from memory. This translation takes time.
With fixed-length instructions, RISC lends itself to pipelining and speculative execution.
RISC vs. CISC
Consider the two program fragments:
CISC:
  mov ax, 10
  mov bx, 5
  mul bx, ax
RISC:
  mov ax, 0
  mov bx, 10
  mov cx, 5
  Begin: add ax, bx
  loop Begin
The total clock cycles for the CISC version might be:
  (2 movs × 1 cycle) + (1 mul × 30 cycles) = 32 cycles
While the clock cycles for the RISC version are:
  (3 movs × 1 cycle) + (5 adds × 1 cycle) + (5 loops × 1 cycle) = 13 cycles
With the RISC clock cycle also being shorter, RISC gives us much faster execution speeds.
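The cycle counts for the two fragments can be reproduced with a short calculation. The instruction mixes below are the ones this slide is understood to use (2 movs and one 30-cycle mul for CISC; 3 movs, 5 adds, and 5 loops at one cycle each for RISC); the function names and the clock periods are assumptions chosen only to show how cycle count and cycle time combine.

```python
def total_cycles(instruction_mix):
    """Sum count * cycles over (count, cycles_per_instruction) pairs."""
    return sum(count * cycles for count, cycles in instruction_mix)

def execution_time_ns(cycles, clock_period_ns):
    """time/program = cycles/program * time/cycle."""
    return cycles * clock_period_ns

cisc_cycles = total_cycles([(2, 1), (1, 30)])         # 2 movs, 1 mul
risc_cycles = total_cycles([(3, 1), (5, 1), (5, 1)])  # 3 movs, 5 adds, 5 loops

# Hypothetical clock periods: the RISC cycle is assumed to be the shorter one.
cisc_time = execution_time_ns(cisc_cycles, clock_period_ns=10)
risc_time = execution_time_ns(risc_cycles, clock_period_ns=5)
print(cisc_cycles, risc_cycles, cisc_time, risc_time)
```

The RISC fragment executes more instructions (13 against 3) but fewer cycles, and a shorter cycle widens the gap further.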
RISC vs. CISC
Because of their load-store ISAs, RISC architectures require a large number of CPU registers.
These registers provide fast access to data during sequential program execution.
They can also be employed to reduce the overhead typically caused by passing parameters to subprograms.
Instead of pulling parameters off of a stack, the subprogram is directed to use a subset of registers.
RISC vs. CISC
This is how registers can be overlapped in a RISC system.
The current window pointer (CWP) points to the active register window.
RISC vs. CISC
It is becoming increasingly difficult to distinguish RISC architectures from CISC architectures.
Some RISC systems provide more extravagant instruction sets than some CISC systems.
Some systems combine both approaches.
The following two slides summarize the characteristics that traditionally typify the differences between these two architectures.
RISC vs. CISC
RISC
  - Multiple register sets.
  - Three operands per instruction.
  - Parameter passing through register windows.
  - Single-cycle instructions.
  - Hardwired control.
  - Highly pipelined.
CISC
  - Single register set.
  - One or two register operands per instruction.
  - Parameter passing through memory.
  - Multiple cycle instructions.
  - Microprogrammed control.
  - Less pipelined.
Continued...
RISC vs. CISC
RISC
  - Simple instructions, few in number.
  - Fixed length instructions.
  - Complexity in compiler.
  - Only LOAD/STORE instructions access memory.
  - Few addressing modes.
CISC
  - Many complex instructions.
  - Variable length instructions.
  - Complexity in microcode.
  - Many instructions can access memory.
  - Many addressing modes.
Summary
Instruction Set Design Issues
Instruction set design issues include:
Where are operands stored?
  - registers, memory, stack, accumulator
How many explicit operands are there?
  - 0, 1, 2, or 3
How is the operand location specified?
  - register, immediate, indirect, ...
What type & size of operands are supported?
  - byte, int, float, double, string, vector, ...
What operations are supported?
  - add, sub, mul, move, compare, ...
More About General Purpose Registers
Why do almost all new architectures use GPRs?
Registers are much faster than memory (even cache)
  - Register values are available immediately
  - When memory isn't ready, processor must wait ("stall")
Registers are convenient for variable storage
  - Compiler assigns some variables just to registers
  - More compact code, since small fields specify registers (compared to memory addresses)
[Figure: Processor, Registers, Cache, Memory, Disk]
What Operations are Needed
Arithmetic + Logical
  - Integer arithmetic: ADD, SUB, MULT, DIV, SHIFT
  - Logical operation: AND, OR, XOR, NOT
Data Transfer - copy, load, store
Control - branch, jump, call, return
Floating Point - ADDF, MULF, DIVF
  - Same as arithmetic but usually take bigger operands
Decimal - ADDD, CONVERT
String - move, compare, search
Graphics - pixel and vertex, compression/decompression operations
Stack Architecture: Pros and Cons
Pros
  • Good code density (implicit top of stack)
  • Low hardware requirements
  • Easy to write a simpler compiler for stack architectures
Cons
  • Stack becomes the bottleneck
  • Little ability for parallelism or pipelining
  • Data is not always at the top of stack when needed, so additional instructions like TOP and SWAP are needed
  • Difficult to write an optimizing compiler for stack architectures
Accumulator Architecture: Pros and Cons
Pros
  • Very low hardware requirements
  • Easy to design and understand
Cons
  • Accumulator becomes the bottleneck
  • Little ability for parallelism or pipelining
  • High memory traffic
Memory-Memory Architecture: Pros and Cons
Pros
  • Requires fewer instructions (especially if 3 operands)
  • Easy to write compilers for (especially if 3 operands)
Cons
  • Very high memory traffic (especially if 3 operands)
  • Variable number of clocks per instruction
  • With two operands, more data movements are required
Memory-Register Architecture: Pros and Cons
Pros
  • Some data can be accessed without loading first
  • Instruction format easy to encode
  • Good code density
Cons
  • Operands are not equivalent (poor orthogonality)
  • Variable number of clocks per instruction
  • May limit number of registers
Measurements taken on MIPS
Effect of Compiler Optimization
[Figure axis: program and compiler optimization level]
Level 0: non-optimized code
Level 1: local optimization
Level 2: global optimization, s/w pipelining
Level 3: adds procedure integration
Compiler Support for Multimedia Instr.
Intel's MMX and PowerPC AltiVec have small vector processing capabilities targeting multimedia applications (to speed up graphics).
Intel added a new set of instructions called Streaming SIMD Extension.
A major advantage of vector computers is hiding the latency of memory access by loading multiple elements and then overlapping execution with data transfer.
Vector computers typically have strided and/or gather/scatter addressing to perform operations on distant memory locations.
  - Strided addressing allows memory access in increments larger than one.
  - Gather/scatter addressing is similar to register indirect mode, where the addresses are stored instead of the data.
Supporting vector operation without strided addressing, such as Intel's MMX, limits the potential speedup.
Such limited support for vector processing makes the use of vectorizing compiler optimization unpopular and restricts its scope to hand-coded routines.
SIMD instructions on MMX and AltiVec tend to be solutions, not primitives.
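Strided and gather/scatter addressing can be illustrated on a flat Python list standing in for memory. The function names, the 4 × 4 row-major matrix, and its contents are assumptions for illustration; real vector hardware performs these accesses as single vector instructions rather than loops.

```python
def strided_load(memory, base, stride, count):
    """Load `count` elements starting at `base`, stepping `stride` locations each time."""
    return [memory[base + i * stride] for i in range(count)]

def gather_load(memory, index_vector):
    """Load the elements whose addresses are listed in `index_vector`."""
    return [memory[i] for i in index_vector]

def scatter_store(memory, index_vector, values):
    """Store each value at the address given by the matching entry of `index_vector`."""
    for address, value in zip(index_vector, values):
        memory[address] = value

# A 4 x 4 matrix stored row by row in locations 0..15 (contents 100..115).
flat_matrix = list(range(100, 116))

column_1 = strided_load(flat_matrix, base=1, stride=4, count=4)   # one column: stride = row length
diagonal = gather_load(flat_matrix, [0, 5, 10, 15])               # arbitrary address list
scatter_store(flat_matrix, [0, 5, 10, 15], [0, 0, 0, 0])          # write back through the same list
print(column_1, diagonal, flat_matrix[:6])
```

Without a stride larger than one, a column like `column_1` would have to be assembled element by element, which is the speedup limitation attributed to MMX above.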
Starting a Program
[Figure: C program -> Compiler -> Assembly language program -> Assembler -> Object: machine language module (together with Object: library routine, machine language) -> Linker -> Executable: machine language program -> Loader -> Memory]
Linker
  - Place code & data modules symbolically in memory
  - Determine the address of data & instruction labels
  - Patch both internal & external references
Object files for Unix typically contain:
  - Header: size & position of components
  - Text segment: machine code
  - Data segment: static and dynamic variables
  - Relocation info: identify absolute memory references
  - Symbol table: name & location of labels, procedures and variables
  - Debugging info: mapping source to object code, break points, etc.
Loading Executable Program
[Figure: memory layout, highest address first]
  $sp -> 7fff ffff hex   Stack
                         Dynamic data
  $gp -> 1000 8000 hex   Static data
         1000 0000 hex
                         Text
  pc  -> 0040 0000 hex
         0               Reserved
To load an executable, the operating system follows these steps:
  - Reads the executable file header to determine the size of text and data segments
  - Creates an address space large enough for the text and data
  - Copies the instructions and data from the executable file into memory
  - Copies the parameters (if any) to the main program onto the stack
  - Initializes the machine registers and sets the stack pointer to the first free location
  - Jumps to a start-up routine that copies the parameters into the argument registers and calls the main routine of the program
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 52101
Instruction Set Design IssuesInstruction Set Desi
gn Issues
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 53101
Instruction Set Design IssuesInstruction Set Design Issues
Instruction Set esign Issues 7umber of Addresses
Llo of Control
5perand Typesamp Addressing Modes
Instruction Types
Instruction Lormats
um+er of Addressesum+er of Addresses
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 54101
um+er of Addressesum+er of Addresses
Lour categories
$address machines$ for the source operands and one for the result
$address machines
$ 5ne address doubles as source and result
$address machine$ Accumulator machines
$ Accumulator is used for one source and result
gt$address machines
$ Stac0 machines
$ 5perands are ta0en from the stac0
$ 1esult goes onto the stac0
um+er of Addresses cont -um+er of Addresses
cont -
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 55101
um+er of Addresses cont-um+er of Addresses cont-
Three$address machines
To for the source operands one for the result
1ISC processors use three addresses
Sample instructions
add destsrc1src2
M(dest)=[src1]+[src2]
sub destsrc1src2
M(dest)=[src1]-[src2]
mult destsrc1src2
M(dest)=[src1][src2]
Three addresses
Operand 1 Operand 2 Result
Example a = b + c
Three-address instruction formats are not common because they reuire a
relatiely lon instruction format to hold the three address references
um+er of Addresses cont -um+er of Addresses
cont -
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 56101
um+er of Addresses cont-um+er of Addresses cont-
ltample
C statement
A C H D F 6 A
ltJuivalent code6
mult TCD T = CD
add TTB T = B+CD
sub TTE T = B+CD-E
add TTF T = B+CD-E+Fadd ATA A = B+CD-E+F+A
um+er of Addresses cont -um+er of Addresses
cont -
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 57101
um+er of Addresses cont-um+er of Addresses cont-
To$address machines
5ne address doubles (for source operand result)
3ast eample ma0es a case for it
$ Address T is used tice
Sample instructions
load destsrc M(dest)=[src]
add destsrc M(dest)=[dest]+[src]
sub destsrc M(dest)=[dest]-[src]
mult destsrc M(dest)=[dest][src]
Two Addresses
One address doubles as operand and resultExample a = a + b
The t$o-address formal reduces the space reuirement but also
introduces some a$$ardness To aoid alterin the alue of an
operand a ampOE instruction is used to moe one of the alues to a
result or temporary location before performin the operation
um+er of Addresses cont -um+er of Addresses
cont -
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 58101
um+er of Addresses cont-um+er of Addresses cont-
ltample
C statement
A C H D F 6 A
ltJuivalent code6
load TC T = C
mult TD T = CD
add TB T = B+CD
sub TE T = B+CD-Eadd TF T = B+CD-E+F
add AT A = B+CD-E+F+A
um+er of Addresses cont -um+er of Addresses
cont -
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 59101
um+er of Addresses cont-um+er of Addresses cont-
5ne$address machines 4se special set of registers called accumulators
$ Specify one source operand receive the result
Called accumulator machines
Sample instructions
load addr accum = [addr]
store addr M[addr] = accumadd addr accum = accum + [addr]
sub addr accum = accum - [addr]
mult addr accum = accum [addr]
um+er of Addresses cont -um+er of Addresses
cont -
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 60101
um+er of Addresses cont-um+er of Addresses cont-
ltample
C statementA C H D F 6 A
ltJuivalent code6
load C load C to accum
mult D accum = CD
add B accum = CD+B
sub E accum = B+CD-Eadd F accum = B+CD-E+F
add A accum = B+CD-E+F+A
store A store accum cotets A
um+er of Addresses cont -um+er of Addresses
cont -
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 61101
um+er of Addresses cont-um+er of Addresses cont-
Vero$address machines
Stac0 supplies operands and receives the result$ Special instructions to load and store use an address
Called stac0 machines (lt6 Pgtgtgt 8urroughs 8gtgt)
Sample instructions
us addr us([addr])
o addr o([addr])
add us(o + o)
sub us(o - o) mult us(o o)
um+er of Addresses cont -um+er of Addresses
cont-
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 62101
um+er of Addresses cont-um+er of Addresses cont-
ltample
C statement
A C H D F 6 A
ltJuivalent code6
us E sub
us C us F
us D add
Mult us A
us B add
add o A
)oadStore Architecture)oadStore Architecture
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 63101
)oadStore Architecture)oadStore Architecture
Instructions epect operands in internal processor registers Special 35A and ST51lt instructions move data beteen registers
and memory
1ISC uses this architecture
1educes instruction length
()
)oadStore Architecture cont-)oadStore Architecture
cont-
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 64101
)oadStore Architecture cont-)oadStore Architecture cont-
Sample instructionsload $daddr $d = [addr]
store addr$s (addr) = $s
add $d$s$samp $d = $s + $sampsub $d$s$samp $d = $s - $samp
mult $d$s$samp $d = $s $samp
um+er of Addresses cont-um+er of Addresses
cont-
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 65101
um+er of Addresses cont-um+er of Addresses cont-
ampleC statement
A = B + C D E + F + A
1uialent co)eload $B mult $amp$amp$
load $ampC add $amp$amp$
load $D sub $amp$amp$
load $E add $amp$amp$
load $F add $amp$amp$
load $A store A$amp
0lo1 of Control 0lo1 of Control
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 66101
0lo1 of Control 0lo1 of Control
efault is seJuential flo
Several instructions alter this defaulteecution
8ranches$ 4nconditional
$ Conditional
$ elayed branches Procedure calls
$ elayed procedure calls
0lo1 of Control cont-0lo1 of Control
cont-
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 67101
0lo1 of Control cont-0lo1 of Control cont-
8ranches
4nconditional
$ Absolute address
$ PC$relative
U Target address is specified relative to PC contents U 1elocatable code
ltample6 MIPS
$ Absolute address
9 target
$ PC$relative
8 target
0lo1 of Control cont-0lo1 of Control
cont-
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 68101
0lo1 of Control cont- -
e entium e R
0lo1 of Control cont-0lo1 of Control
cont-
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 69101
lo1 o Co t ol co t- -
8ranches
Conditional
$ ump is ta0en only if the condition is met
To types
$ Set$Then$ump
U Condition testing is separated from branching U Condition code registers are used to convey the condition test
result
U Condition code registers 0eep a record of the status of the last A34 operation such as overflo condition
$ ltample6 Pentium codecm AB comare A ad B
e taret um e0ual
0lo1 of Control cont-0lo1 of Control
cont-
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 70101
- -
$ Test$and$ump
U Single instruction performs condition testing and branching
$ ltample6 MIPS instruction
be0 $src$srcamptaret
umps to target if 1src E 1src
elayed branching
Control is transferred after eecuting the instruction thatfollos the branch instruction
$ This instruction slot is called delay slot Improves efficiency
ighly pipelined 1ISC processors support
0lo1 of Control cont-0lo1 of Control
cont-
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 71101
- -
Procedure calls Lacilitate modular programming
1eJuire to pieces of information to return
$ ltnd of procedure U Pentium
uses ret instruction
U MIPS
uses 9r instruction
$ 1eturn address U In a (special) register
MIPS allos any general$purpose register
U 5n the stac0
Pentium
0lo1 of Control cont-0lo1 of Control
cont-
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 72101
- -
0lo1 of Control cont-0lo1 of Control
cont-
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 73101
- -
elay slot
Parameter Passing
Two basic techniques
  Register-based (e.g., PowerPC, MIPS)
    - Internal registers are used
      - Faster
      - Limits the number of parameters
      - Recursive procedures are harder to support (registers must be saved across calls)
  Stack-based (e.g., Pentium)
    - Stack is used
      - More general
2 Operand Types
Instructions support basic data types
  - Characters
  - Integers
  - Floating-point
Instruction overload
  - Same instruction for different data types
  - Example: Pentium
      mov  AL,address     ; loads an 8-bit value
      mov  AX,address     ; loads a 16-bit value
      mov  EAX,address    ; loads a 32-bit value
Operand Types
Separate instructions
  - Instructions specify the operand size
  - Example: MIPS
      lb  Rdest,address    ; loads a byte
      lh  Rdest,address    ; loads a halfword (16 bits)
      lw  Rdest,address    ; loads a word (32 bits)
      ld  Rdest,address    ; loads a doubleword (64 bits)
  - Similar instructions exist for store
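The size-specific loads above can be sketched as follows. This is an illustration only: the byte values in `memory` are made up, big-endian byte order is assumed (MIPS can run in either order), and all four loads are treated as sign-extending.

```python
# Hypothetical 8-byte memory image.
memory = bytes([0x80, 0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07])

# Operand size in bytes selected by each load mnemonic.
SIZES = {"lb": 1, "lh": 2, "lw": 4, "ld": 8}


def load(op, address, mem=memory):
    """Return the signed value that the given load would place in Rdest."""
    size = SIZES[op]
    if address % size != 0:
        # These loads require the address to be a multiple of the operand size.
        raise ValueError("unaligned address for " + op)
    return int.from_bytes(mem[address:address + size], "big", signed=True)


byte_value = load("lb", 0)      # 0x80 sign-extended
half_value = load("lh", 0)      # 0x8001 sign-extended
word_value = load("lw", 4)      # 0x04050607
print(byte_value, half_value, hex(word_value))
```

The same mnemonic-to-size table would drive the matching store instructions.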
3 Addressing Modes
How the operands are specified
Operands can be in three places
  - Registers
    - Register addressing mode
  - Part of instruction
    - Constant
    - Immediate addressing mode
    - All processors support these two addressing modes
  - Memory
    - Difference between RISC and CISC
    - CISC supports a large variety of addressing modes
    - RISC follows load/store architecture
4 Instruction Types
Several types of instructions
  Data movement
    - Pentium: mov dest,src
    - Some do not provide direct data movement instructions
    - Indirect data movement
        add  Rdest,Rsrc,0    ; Rdest = Rsrc+0
  Arithmetic and Logical
    - Arithmetic
      - Integer and floating-point, signed and unsigned
      - add, subtract, multiply, divide
    - Logical
      - and, or, not, xor
Instruction Types (cont.)
Condition code bits
  - S: Sign bit (0 = +, 1 = -)
  - Z: Zero bit (0 = nonzero, 1 = zero)
  - O: Overflow bit (0 = no overflow, 1 = overflow)
  - C: Carry bit (0 = no carry, 1 = carry)
Example: Pentium
    cmp  count,25    ; compare count to 25
                     ; (subtract 25 from count)
    je   target      ; jump if equal
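As a rough sketch of how a compare sets the four condition code bits, the function below performs the subtraction at an assumed 16-bit width and derives S, Z, O and C from it. The function name and the width are illustrative choices; the carry bit follows the x86 convention of signalling a borrow.

```python
def compare_flags(a, b, bits=16):
    """Condition codes produced by computing a - b, as a compare would."""
    mask = (1 << bits) - 1
    sign = 1 << (bits - 1)
    a &= mask
    b &= mask
    result = (a - b) & mask
    return {
        "S": int(bool(result & sign)),                   # result is negative
        "Z": int(result == 0),                           # result is zero
        # Signed overflow: operands differ in sign and the result's sign
        # differs from the first operand's sign.
        "O": int(bool((a ^ b) & (a ^ result) & sign)),
        "C": int(a < b),                                 # unsigned borrow
    }


equal = compare_flags(25, 25)
smaller = compare_flags(3, 25)
overflow = compare_flags(0x8000, 1)

# A "jump if equal" is taken exactly when the Z bit is set.
je_taken = equal["Z"] == 1
print(equal, smaller, overflow, je_taken)
```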
Instruction Types (cont.)
Flow control and I/O instructions
  - Branch
  - Procedure call
  - Interrupts
I/O instructions
  - Memory-mapped I/O
    - Most processors support memory-mapped I/O
    - No separate instructions for I/O
  - Isolated I/O
    - Pentium supports isolated I/O
    - Separate I/O instructions
        in   AX,io_port    ; read from an I/O port
        out  io_port,AX    ; write to an I/O port
5 Instruction Formats
Two types
  Fixed-length
    - Used by RISC processors
    - 32-bit RISC processors use 32-bit-wide instructions
      - Examples: SPARC, MIPS, PowerPC
  Variable-length
    - Used by CISC processors
    - Memory operands need more bits to specify
Opcode
  - Major and exact operation
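To make the fixed-length idea concrete, here is a small sketch that splits a 32-bit MIPS R-type instruction word into its fields (6-bit opcode, three 5-bit register numbers, 5-bit shift amount, 6-bit function code). The helper name `decode_r_type` is made up for this example.

```python
def decode_r_type(word):
    """Split a 32-bit MIPS R-type instruction word into its six fields."""
    return {
        "opcode": (word >> 26) & 0x3F,   # major operation
        "rs": (word >> 21) & 0x1F,       # first source register
        "rt": (word >> 16) & 0x1F,       # second source register
        "rd": (word >> 11) & 0x1F,       # destination register
        "shamt": (word >> 6) & 0x1F,     # shift amount
        "funct": word & 0x3F,            # exact operation within the opcode
    }


# 0x012A4020 encodes add $t0, $t1, $t2 (registers 8, 9 and 10).
fields = decode_r_type(0x012A4020)
print(fields)
```

Because every field sits at a fixed bit position, decoding needs only shifts and masks; a variable-length format would first have to determine where each instruction ends.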
Examples of Instruction Formats
[Figure: examples of instruction formats]
RISC (Reduced Instruction Set Computer)
versus
CISC (Complex Instruction Set Computer)
RISC vs CISC
  - The underlying philosophy of RISC machines is that a system is better able to manage program execution when the program consists of only a few different instructions that are the same length and require the same number of clock cycles to decode and execute.
  - RISC systems access memory only with explicit load and store instructions.
  - In CISC systems, many different kinds of instructions access memory, making instruction length variable and fetch-decode-execute time unpredictable.
RISC vs CISC
  - The difference between CISC and RISC becomes evident through the basic computer performance equation:
      time/program = (time/cycle) × (cycles/instruction) × (instructions/program)
  - RISC systems shorten execution time by reducing the clock cycles per instruction.
  - CISC systems improve performance by reducing the number of instructions per program.
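The performance equation can be tried out with a few lines of code. The instruction counts, cycles per instruction and cycle times below are invented purely to show how the three factors trade off; they do not describe any real processor.

```python
def execution_time_ps(instructions, cycles_per_instruction, ps_per_cycle):
    """time/program = instructions/program x cycles/instruction x time/cycle."""
    return instructions * cycles_per_instruction * ps_per_cycle


# Made-up figures: the CISC-style program has fewer instructions but needs
# more cycles for each one; the RISC-style program has more instructions,
# one cycle each, and a shorter cycle.
cisc_time = execution_time_ps(1_000_000, 4, 1_000)
risc_time = execution_time_ps(1_500_000, 1, 800)
print(cisc_time, risc_time)
```

Under these assumed numbers the longer RISC-style program still finishes first, because the drop in cycles per instruction outweighs the rise in instruction count.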
RISC vs CISC
  - The simple instruction set of RISC machines enables control units to be hardwired for maximum speed.
  - The more complex, and variable, instruction set of CISC machines requires microcode-based control units that interpret instructions as they are fetched from memory. This translation takes time.
  - With fixed-length instructions, RISC lends itself to pipelining and speculative execution.
RISC vs CISC
Consider the following program fragments:
    CISC                 RISC
    mov ax, 10           mov ax, 0
    mov bx, 5            mov bx, 10
    mul bx, ax           mov cx, 5
                         Begin: add ax, bx
                                loop Begin
The total clock cycles for the CISC version might be:
    (2 movs × 1 cycle) + (1 mul × 30 cycles) = 32 cycles
While the clock cycles for the RISC version are:
    (3 movs × 1 cycle) + (5 adds × 1 cycle) + (5 loops × 1 cycle) = 13 cycles
With the RISC clock cycle also being shorter, RISC gives us much faster execution speeds.
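The cycle counts for the two fragments can be checked with a short simulation. It assumes the per-instruction costs used on the slide (1 cycle for mov, add and loop, 30 cycles for mul) and models `loop` as "decrement cx and branch while cx is nonzero"; the function names are illustrative.

```python
def cisc_multiply(ax=10, bx=5):
    """mov, mov, mul: two 1-cycle moves and one 30-cycle multiply."""
    cycles = 2 * 1 + 1 * 30
    return ax * bx, cycles


def risc_multiply(bx=10, cx=5):
    """Multiply by repeated addition, counting one cycle per instruction."""
    ax = 0
    cycles = 3              # three mov instructions
    while cx > 0:
        ax += bx            # add ax, bx
        cycles += 1
        cx -= 1             # loop Begin: decrement cx, branch if nonzero
        cycles += 1
    return ax, cycles


cisc_result, cisc_cycles = cisc_multiply()
risc_result, risc_cycles = risc_multiply()
print(cisc_result, cisc_cycles, risc_result, risc_cycles)
```

Both fragments compute the same product; only the cycle totals differ.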
RISC vs CISC
  - Because of their load-store ISAs, RISC architectures require a large number of CPU registers.
  - These registers provide fast access to data during sequential program execution.
  - They can also be employed to reduce the overhead typically caused by passing parameters to subprograms.
  - Instead of pulling parameters off of a stack, the subprogram is directed to use a subset of registers.
RISC vs CISC
  - This is how registers can be overlapped in a RISC system.
  - The current window pointer (CWP) points to the active register window.
[Figure: overlapping register windows]
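A simplified sketch of overlapping register windows follows. The class name, the window count and the register counts are assumptions chosen for illustration, and window overflow (spilling to memory when calls nest too deeply) is deliberately left out. The point is only that the caller's "out" registers are physically the callee's "in" registers.

```python
class RegisterWindows:
    """Circular register file in which adjacent windows share registers."""

    def __init__(self, windows=4, ins=8, local=8):
        self.windows = windows
        self.ins = ins                   # also the number of out registers
        self.local = local
        self.stride = ins + local        # the outs belong to the next window
        self.regs = [0] * (windows * self.stride)
        self.cwp = 0                     # current window pointer

    def _index(self, kind, n):
        offsets = {"in": 0, "local": self.ins, "out": self.stride}
        limit = self.local if kind == "local" else self.ins
        if not 0 <= n < limit:
            raise IndexError(n)
        return (self.cwp * self.stride + offsets[kind] + n) % len(self.regs)

    def read(self, kind, n):
        return self.regs[self._index(kind, n)]

    def write(self, kind, n, value):
        self.regs[self._index(kind, n)] = value

    def call(self):
        # Procedure call: advance the CWP; no registers are copied.
        self.cwp = (self.cwp + 1) % self.windows

    def ret(self):
        # Return: move the CWP back to the caller's window.
        self.cwp = (self.cwp - 1) % self.windows


rw = RegisterWindows()
rw.write("out", 0, 42)           # caller places a parameter in out0
rw.call()
received = rw.read("in", 0)      # callee sees it as in0
rw.write("in", 0, 7)             # callee leaves a result in in0
rw.ret()
returned = rw.read("out", 0)     # caller reads the result from out0
print(received, returned)
```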
RISC vs CISC
  - It is becoming increasingly difficult to distinguish RISC architectures from CISC architectures.
  - Some RISC systems provide more extravagant instruction sets than some CISC systems.
  - Some systems combine both approaches.
  - The following two slides summarize the characteristics that traditionally typify the differences between these two architectures.
RISC vs CISC
    RISC                                            CISC
    Multiple register sets.                         Single register set.
    Three operands per instruction.                 One or two register operands per instruction.
    Parameter passing through register windows.     Parameter passing through memory.
    Single-cycle instructions.                      Multiple-cycle instructions.
    Hardwired control.                              Microprogrammed control.
    Highly pipelined.                               Less pipelined.
Continued...
RISC vs CISC
    RISC                                            CISC
    Simple instructions, few in number.             Many complex instructions.
    Fixed-length instructions.                      Variable-length instructions.
    Complexity in compiler.                         Complexity in microcode.
    Only LOAD/STORE instructions access memory.     Many instructions can access memory.
    Few addressing modes.                           Many addressing modes.
Summary
Instruction Set Design Issues
Instruction set design issues include:
  Where are operands stored?
    - registers, memory, stack, accumulator
  How many explicit operands are there?
    - 0, 1, 2, or 3
  How is the operand location specified?
    - register, immediate, indirect, ...
  What type & size of operands are supported?
    - byte, int, float, double, string, vector, ...
  What operations are supported?
    - add, sub, mul, move, compare, ...
More About General Purpose Registers
Why do almost all new architectures use GPRs?
  Registers are much faster than memory (even cache)
    - Register values are available immediately
    - When memory isn't ready, the processor must wait ("stall")
  Registers are convenient for variable storage
    - Compiler assigns some variables just to registers
    - More compact code, since small fields specify registers (compared to memory addresses)
[Figure: memory hierarchy: Processor (Registers), Cache, Memory, Disk]
What Operations are Needed
Arithmetic & Logical
  - Integer arithmetic: ADD, SUB, MULT, DIV, SHIFT
  - Logical operations: AND, OR, XOR, NOT
Data Transfer - copy, load, store
Control - branch, jump, call, return
Floating Point - ADDF, MULF, DIVF
  - Same as arithmetic, but usually take bigger operands
Decimal - ADDD, CONVERT
String - move, compare, search
Graphics - pixel and vertex, compression/decompression operations
Stacks Architecture: Pros and Cons
Pros
  - Good code density (implicit top of stack)
  - Low hardware requirements
  - Easy to write a simpler compiler for stack architectures
Cons
  - Stack becomes the bottleneck
  - Little ability for parallelism or pipelining
  - Data is not always at the top of the stack when needed, so additional instructions like TOP and SWAP are needed
  - Difficult to write an optimizing compiler for stack architectures
Accumulators Architecture: Pros and Cons
Pros
  - Very low hardware requirements
  - Easy to design and understand
Cons
  - Accumulator becomes the bottleneck
  - Little ability for parallelism or pipelining
  - High memory traffic
Memory-Memory Architecture: Pros and Cons
Pros
  - Requires fewer instructions (especially if 3 operands)
  - Easy to write compilers for (especially if 3 operands)
Cons
  - Very high memory traffic (especially if 3 operands)
  - Variable number of clocks per instruction
  - With two operands, more data movements are required
Memory-Register Architecture: Pros and Cons
Pros
  - Some data can be accessed without loading first
  - Instruction format easy to encode
  - Good code density
Cons
  - Operands are not equivalent (poor orthogonality)
  - Variable number of clocks per instruction
  - May limit number of registers
Compiler Support for Multimedia Instr.
  - Intel's MMX and PowerPC AltiVec have small vector processing capabilities targeting multimedia applications (to speed up graphics)
  - Intel added a new set of instructions called Streaming SIMD Extensions
  - A major advantage of vector computers is hiding the latency of memory access by loading multiple elements and then overlapping execution with data transfer
  - Vector computers typically have strided and/or gather/scatter addressing to perform operations on distant memory locations
    - Strided addressing allows memory access in increments larger than one
    - Gather/scatter addressing is similar to register indirect mode, where the addresses are stored instead of the data
  - Supporting vector operations without strided addressing, as in Intel's MMX, limits the potential speedup
  - Such limited support for vector processing makes vectorizing compiler optimization unpopular and restricts its scope to hand-coded routines
SIMD instructions on MMX and AltiVec tend to be solutions, not primitives
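The two addressing styles mentioned above can be sketched on an ordinary list standing in for memory. The function names and the 3 x 4 row-major matrix are illustrative assumptions, not features of MMX or AltiVec.

```python
def strided_load(memory, base, stride, count):
    """Read count elements starting at base, stepping by stride each time."""
    return [memory[base + i * stride] for i in range(count)]


def gather(memory, indices):
    """Read the elements whose addresses are listed in an index vector."""
    return [memory[i] for i in indices]


def scatter(memory, indices, values):
    """Write each value to the address given by the matching index."""
    for i, value in zip(indices, values):
        memory[i] = value


# A 3 x 4 matrix stored row by row: element (r, c) lives at r * 4 + c.
memory = list(range(12))

column_1 = strided_load(memory, base=1, stride=4, count=3)
picked = gather(memory, [7, 0, 10])
scatter(memory, [7, 0, 10], [-1, -2, -3])
print(column_1, picked, memory)
```

Without strided access, reading a matrix column would need one scalar load per element, which is the limitation the slide attributes to MMX.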
Starting a Program
[Figure: C program -> Compiler -> Assembly language program -> Assembler -> Object: machine language module (together with Object: library routine, machine language) -> Linker -> Executable: machine language program -> Loader -> Memory]
Linker
  - Places code & data modules symbolically in memory
  - Determines the addresses of data & instruction labels
  - Patches both internal & external references
Object files for Unix typically contain:
  - Header: size & position of components
  - Text segment: machine code
  - Data segment: static and dynamic variables
  - Relocation info: identifies absolute memory references
  - Symbol table: names & locations of labels, procedures and variables
  - Debugging info: mapping of source to object code, break points, etc.
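The three linker steps listed above can be sketched in a few lines. Everything here is a toy: the dictionary layout for object modules (`size`, `defs`, `refs`), the module contents and the base address are assumptions made for illustration and do not correspond to any real object file format.

```python
def link(modules, base=0x00400000):
    """Place modules one after another, resolve symbols, list the patches."""
    symbols = {}
    placed = []
    address = base
    # Steps 1 and 2: place each module and compute the address of every label.
    for module in modules:
        placed.append((address, module))
        for name, offset in module["defs"].items():
            symbols[name] = address + offset
        address += module["size"]
    # Step 3: every reference, internal or external, gets its final address.
    patches = {}
    for start, module in placed:
        for offset, name in module["refs"]:
            patches[start + offset] = symbols[name]
    return symbols, patches


# Hypothetical object modules.
main_obj = {"size": 0x20, "defs": {"main": 0x0},
            "refs": [(0x8, "helper"), (0x10, "main")]}
lib_obj = {"size": 0x10, "defs": {"helper": 0x4}, "refs": []}

symbols, patches = link([main_obj, lib_obj])
print({name: hex(addr) for name, addr in symbols.items()})
print({hex(loc): hex(addr) for loc, addr in patches.items()})
```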
Loading Executable Program
[Figure: memory layout, from highest to lowest address]
    $sp -> 7fff ffff hex     Stack
                             Dynamic data
    $gp -> 1000 8000 hex     Static data
           1000 0000 hex
                             Text
    pc  -> 0040 0000 hex
                             Reserved
           0
To load an executable, the operating system follows these steps:
  - Reads the executable file header to determine the size of the text and data segments
  - Creates an address space large enough for the text and data
  - Copies the instructions and data from the executable file into memory
  - Copies the parameters (if any) to the main program onto the stack
  - Initializes the machine registers and sets the stack pointer to the first free location
  - Jumps to a start-up routine that copies the parameters into the argument registers and calls the main routine of the program
Instruction Set Design Issues
  - Number of Addresses
  - Flow of Control
  - Operand Types
  - Addressing Modes
  - Instruction Types
  - Instruction Formats
Number of Addresses
Four categories
  3-address machines
    - 2 for the source operands and one for the result
  2-address machines
    - One address doubles as source and result
  1-address machines
    - Accumulator machines
    - Accumulator is used for one source and result
  0-address machines
    - Stack machines
    - Operands are taken from the stack
    - Result goes onto the stack
Number of Addresses (cont.)
Three-address machines
  - Two for the source operands, one for the result
  - RISC processors use three addresses
  - Sample instructions
      add   dest,src1,src2    ; M(dest) = [src1] + [src2]
      sub   dest,src1,src2    ; M(dest) = [src1] - [src2]
      mult  dest,src1,src2    ; M(dest) = [src1] * [src2]
Three addresses: Operand 1, Operand 2, Result
  - Example: a = b + c
  - Three-address instruction formats are not common, because they require a relatively long instruction format to hold the three address references
Number of Addresses (cont.)
Example
  C statement
      A = B + C * D - E + F + A
  Equivalent code:
      mult  T,C,D    ; T = C*D
      add   T,T,B    ; T = B+C*D
      sub   T,T,E    ; T = B+C*D-E
      add   T,T,F    ; T = B+C*D-E+F
      add   A,T,A    ; A = B+C*D-E+F+A
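A small interpreter can confirm that the three-address sequence computes the C statement. It is only a sketch: memory is modelled as a dictionary keyed by variable name, and the initial values are arbitrary.

```python
import operator

OPS = {"add": operator.add, "sub": operator.sub, "mult": operator.mul}


def run_three_address(program, memory):
    """Execute (op, dest, src1, src2) tuples: M(dest) = [src1] op [src2]."""
    for op, dest, src1, src2 in program:
        memory[dest] = OPS[op](memory[src1], memory[src2])
    return memory


program = [
    ("mult", "T", "C", "D"),
    ("add", "T", "T", "B"),
    ("sub", "T", "T", "E"),
    ("add", "T", "T", "F"),
    ("add", "A", "T", "A"),
]

# Arbitrary starting values for the variables.
memory = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
expected = memory["B"] + memory["C"] * memory["D"] - memory["E"] + memory["F"] + memory["A"]

result = run_three_address(program, memory)
print(result["A"], expected)
```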
Number of Addresses (cont.)
Two-address machines
  - One address doubles (for source operand & result)
  - Last example makes a case for it
    - Address T is used twice
  - Sample instructions
      load  dest,src    ; M(dest) = [src]
      add   dest,src    ; M(dest) = [dest] + [src]
      sub   dest,src    ; M(dest) = [dest] - [src]
      mult  dest,src    ; M(dest) = [dest] * [src]
Two addresses: one address doubles as operand and result
  - Example: a = a + b
  - The two-address format reduces the space requirement but also introduces some awkwardness. To avoid altering the value of an operand, a MOVE instruction is used to move one of the values to a result or temporary location before performing the operation.
Number of Addresses (cont.)
Example
  C statement
      A = B + C * D - E + F + A
  Equivalent code:
      load  T,C    ; T = C
      mult  T,D    ; T = C*D
      add   T,B    ; T = B+C*D
      sub   T,E    ; T = B+C*D-E
      add   T,F    ; T = B+C*D-E+F
      add   A,T    ; A = B+C*D-E+F+A
Number of Addresses (cont.)
One-address machines
  - Use a special set of registers called accumulators
    - Specify one source operand & receive the result
  - Called accumulator machines
  - Sample instructions
      load   addr    ; accum = [addr]
      store  addr    ; M[addr] = accum
      add    addr    ; accum = accum + [addr]
      sub    addr    ; accum = accum - [addr]
      mult   addr    ; accum = accum * [addr]
Number of Addresses (cont.)
Example
  C statement
      A = B + C * D - E + F + A
  Equivalent code:
      load   C    ; load C into accum
      mult   D    ; accum = C*D
      add    B    ; accum = C*D+B
      sub    E    ; accum = B+C*D-E
      add    F    ; accum = B+C*D-E+F
      add    A    ; accum = B+C*D-E+F+A
      store  A    ; store accum contents in A
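The accumulator sequence can be traced with a minimal interpreter. This is a sketch under simple assumptions: a single accumulator, memory as a dictionary of named variables, and arbitrary initial values.

```python
def run_accumulator(program, memory):
    """Execute (op, addr) pairs on a one-address (accumulator) machine."""
    accum = 0
    for op, addr in program:
        if op == "load":
            accum = memory[addr]
        elif op == "store":
            memory[addr] = accum
        elif op == "add":
            accum = accum + memory[addr]
        elif op == "sub":
            accum = accum - memory[addr]
        elif op == "mult":
            accum = accum * memory[addr]
        else:
            raise ValueError("unknown instruction: " + op)
    return memory


program = [
    ("load", "C"), ("mult", "D"), ("add", "B"), ("sub", "E"),
    ("add", "F"), ("add", "A"), ("store", "A"),
]

# Arbitrary starting values for the variables.
memory = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
result = run_accumulator(program, memory)
print(result["A"])
```

Every instruction names a memory operand, which illustrates why accumulator machines generate high memory traffic.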
Number of Addresses (cont.)
Zero-address machines
  - Stack supplies operands and receives the result
    - Special instructions to load and store use an address
  - Called stack machines (Ex: HP3000, Burroughs B5500)
  - Sample instructions
      push  addr    ; push([addr])
      pop   addr    ; pop([addr])
      add           ; push(pop + pop)
      sub           ; push(pop - pop)
      mult          ; push(pop * pop)
Number of Addresses (cont.)
Example
  C statement
      A = B + C * D - E + F + A
  Equivalent code:
      push  E
      push  C
      push  D
      mult
      push  B
      add
      sub
      push  F
      add
      push  A
      add
      pop   A
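The stack sequence can be checked the same way. The interpreter below is a sketch that follows the slide's definition of `sub` as push(pop - pop), meaning the first value popped (the top of the stack) is the left operand; memory contents are arbitrary.

```python
def run_stack(program, memory):
    """Execute a zero-address program; only push and pop carry an address."""
    stack = []
    for instr in program:
        op = instr[0]
        if op == "push":
            stack.append(memory[instr[1]])
        elif op == "pop":
            memory[instr[1]] = stack.pop()
        else:
            first = stack.pop()      # top of stack
            second = stack.pop()     # element below it
            if op == "add":
                stack.append(first + second)
            elif op == "sub":
                stack.append(first - second)
            elif op == "mult":
                stack.append(first * second)
            else:
                raise ValueError("unknown instruction: " + op)
    return memory


program = [
    ("push", "E"), ("push", "C"), ("push", "D"), ("mult",),
    ("push", "B"), ("add",), ("sub",),
    ("push", "F"), ("add",), ("push", "A"), ("add",), ("pop", "A"),
]

# Arbitrary starting values for the variables.
memory = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
result = run_stack(program, memory)
print(result["A"])
```

Pushing E first is what lets `sub` find B+C*D on top of the stack with E directly beneath it.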
Load/Store Architecture
  - Instructions expect operands in internal processor registers
    - Special LOAD and STORE instructions move data between registers and memory
  - RISC uses this architecture
  - Reduces instruction length
Load/Store Architecture (cont.)
Sample instructions
      load   Rd,addr      ; Rd = [addr]
      store  addr,Rs      ; (addr) = Rs
      add    Rd,Rs1,Rs2   ; Rd = Rs1 + Rs2
      sub    Rd,Rs1,Rs2   ; Rd = Rs1 - Rs2
      mult   Rd,Rs1,Rs2   ; Rd = Rs1 * Rs2
Number of Addresses (cont.)
Example
  C statement
      A = B + C * D - E + F + A
  Equivalent code:
      load   R1,B
      load   R2,C
      load   R3,D
      load   R4,E
      load   R5,F
      load   R6,A
      mult   R2,R2,R3
      add    R2,R2,R1
      sub    R2,R2,R4
      add    R2,R2,R5
      add    R2,R2,R6
      store  A,R2
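A sketch of the load/store version follows. It also counts data memory accesses to show where the memory traffic goes; the register names match the example, while the interpreter itself and the initial values are illustrative.

```python
import operator

OPS = {"add": operator.add, "sub": operator.sub, "mult": operator.mul}


def run_load_store(program, memory):
    """Run a load/store program; return (memory, data memory accesses)."""
    regs = {}
    accesses = 0
    for instr in program:
        op = instr[0]
        if op == "load":
            _, rd, addr = instr
            regs[rd] = memory[addr]
            accesses += 1
        elif op == "store":
            _, addr, rs = instr
            memory[addr] = regs[rs]
            accesses += 1
        else:
            # Arithmetic touches registers only.
            _, rd, rs1, rs2 = instr
            regs[rd] = OPS[op](regs[rs1], regs[rs2])
    return memory, accesses


program = [
    ("load", "R1", "B"), ("load", "R2", "C"), ("load", "R3", "D"),
    ("load", "R4", "E"), ("load", "R5", "F"), ("load", "R6", "A"),
    ("mult", "R2", "R2", "R3"), ("add", "R2", "R2", "R1"),
    ("sub", "R2", "R2", "R4"), ("add", "R2", "R2", "R5"),
    ("add", "R2", "R2", "R6"), ("store", "A", "R2"),
]

# Arbitrary starting values for the variables.
memory = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
result, accesses = run_load_store(program, memory)
print(result["A"], accesses)
```

The program is longer (12 instructions), but only the six loads and the one store reach memory; by the same counting, five memory-to-memory three-address instructions would each touch memory three times.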
Flow of Control
Default is sequential flow
Several instructions alter this default execution
  Branches
    - Unconditional
    - Conditional
    - Delayed branches
  Procedure calls
    - Delayed procedure calls
Flow of Control (cont.)
Branches
  Unconditional
    - Absolute address
    - PC-relative
      - Target address is specified relative to PC contents
      - Relocatable code
  Example: MIPS
    - Absolute address
        j  target
    - PC-relative
        b  target
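The difference between the two target calculations can be sketched numerically. The formulas assume the classic 32-bit MIPS encodings (a 26-bit word index for `j`, a signed 16-bit word offset for `b`, both applied relative to the address of the following instruction); the function names and sample addresses are made up.

```python
def absolute_target(pc, index26):
    """j: upper 4 bits of PC + 4 combined with the 26-bit word index."""
    return ((pc + 4) & 0xF0000000) | ((index26 & 0x03FFFFFF) << 2)


def pc_relative_target(pc, offset16):
    """b: PC + 4 plus the sign-extended 16-bit word offset."""
    if offset16 & 0x8000:
        offset16 -= 0x10000          # sign-extend the 16-bit field
    return (pc + 4 + (offset16 << 2)) & 0xFFFFFFFF


jump_target = absolute_target(0x00400000, 0x00100008)
forward = pc_relative_target(0x00400000, 3)
backward = pc_relative_target(0x00400000, 0xFFFF)

# Relocatable code: moving the branch keeps the same distance to its target.
moved = pc_relative_target(0x00500000, 3)
print(hex(jump_target), hex(forward), hex(backward), hex(moved))
```

The PC-relative branch lands the same distance ahead wherever the code is loaded, while the absolute jump always names the same location.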
Flow of Control (cont.)
[Figure: Pentium example]
RISC s CISCRISC s CISC
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 84101
0
RISC s CISCRISC s CISC
The underlying philosophy of 1ISC machines is that asystem is better able to manage program eecutionhen the program consists of only a fe differentinstructions that are the same length and reJuire thesame number of cloc0 cycles to decode and eecute
1ISC systems access memory only ith eplicit loadand store instructions
In CISC systems many different 0inds of instructionsaccess memory ma0ing instruction length variableand fetch$decode$eecute time unpredictable
RISC s CISCRISC s CISC
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 85101
The difference beteen CISC and 1ISC becomesevident through the basic computer performanceeJuation6
1ISC systems shorten eecution time by reducingthe cloc0 cycles per instruction
CISC systems improve performance by reducing thenumber of instructions per program
RISC s CISCRISC s CISC
RISC s CISCRISC s CISC
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 86101
(
The simple instruction set of 1ISC machinesenables control units to be hardired for maimumspeed
The more comple$$ and variable$$ instruction set of
CISC machines reJuires microcode$based controlunits that interpret instructions as they are fetchedfrom memory This translation ta0es time
Dith fied$length instructions 1ISC lends itself topipelining and speculative eecution
RISC s CISCRISC s CISC
RISC s CISCRISC s CISC
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 87101
mo1 a8 6 mo1 b8 6 mo1 c8
Be add a8 b8 loo Be
Consider the the program fragments6
The total cloc0 cycles for the CISC version might be6(amp mo1s c4cle) + ( mul 6 c4cles) = amp c4cles
Dhile the cloc0 cycles for the 1ISC version is6
( mo1s c4cle) + ( adds c4cle) + ( loos c4cle) = c4cles
Dith 1ISC cloc0 cycle being shorter 1ISC gives usmuch faster eecution speeds
mo1 a8 6 mo1 b8 mul b8 a8
CISC RISC
RISC s CISCRISC s CISC
RISC s CISCRISC s CISC
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 88101
8ecause of their load$store ISAs 1ISC architecturesreJuire a large number of CP4 registers
These register provide fast access to data duringseJuential program eecution
They can also be employed to reduce the overheadtypically caused by passing parameters tosubprograms
Instead of pulling parameters off of a stac0 the
subprogram is directed to use a subset of registers
RISC s CISCRISC s CISC
RISC s CISCRISC s CISC
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 89101
3
This is horegisters canbe overlappedin a 1ISCsystem
The currentindo pointer (CDP) pointsto the activeregister
indo
RISC s CISCRISC s CISC
RISC s CISCRISC s CISC
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 90101
34
It is becoming increasingly difficult to distinguish1ISC architectures from CISC architectures
Some 1ISC systems provide more etravagantinstruction sets than some CISC systems
Some systems combine both approaches The folloing to slides summari-e the
characteristics that traditionally typify the differencesbeteen these to architectures
RISC vs. CISC

RISC                                          CISC
Multiple register sets                        Single register set
Three operands per instruction                One or two register operands per instruction
Parameter passing through register windows    Parameter passing through memory
Single-cycle instructions                     Multiple-cycle instructions
Hardwired control                             Microprogrammed control
Highly pipelined                              Less pipelined
Simple instructions, few in number            Many complex instructions
Fixed-length instructions                     Variable-length instructions
Complexity in compiler                        Complexity in microcode
Only LOAD/STORE instructions access memory    Many instructions can access memory
Few addressing modes                          Many addressing modes

Summary

Instruction Set Design Issues
Instruction set design issues include:
- Where are operands stored?
    - registers, memory, stack, accumulator
- How many explicit operands are there?
    - 0, 1, 2, or 3
- How is the operand location specified?
    - register, immediate, indirect, ...
- What type and size of operands are supported?
    - byte, int, float, double, string, vector, ...
- What operations are supported?
    - add, sub, mul, move, compare, ...

More About General Purpose Registers
Why do almost all new architectures use GPRs?
- Registers are much faster than memory (even cache)
    - Register values are available immediately
    - When memory isn't ready, the processor must wait ("stall")
- Registers are convenient for variable storage
    - Compiler assigns some variables just to registers
    - More compact code, since small fields specify registers (compared to memory addresses)
[Figure: storage hierarchy: Processor (Registers), Cache, Memory, Disk]

What Operations are Needed
- Arithmetic + Logical
    - Integer arithmetic: ADD, SUB, MULT, DIV, SHIFT
    - Logical operation: AND, OR, XOR, NOT
- Data Transfer - copy, load, store
- Control - branch, jump, call, return
- Floating Point - ADDF, MULF, DIVF (same as arithmetic but usually take bigger operands)
- Decimal - ADDD, CONVERT
- String - move, compare, search
- Graphics - pixel and vertex, compression/decompression operations

Stack Architecture: Pros and Cons
Pros
- Good code density (implicit top of stack)
- Low hardware requirements
- Easy to write a simpler compiler for stack architectures
Cons
- Stack becomes the bottleneck
- Little ability for parallelism or pipelining
- Data is not always at the top of the stack when needed, so additional instructions like TOP and SWAP are needed
- Difficult to write an optimizing compiler for stack architectures

Accumulator Architecture: Pros and Cons
Pros
- Very low hardware requirements
- Easy to design and understand
Cons
- Accumulator becomes the bottleneck
- Little ability for parallelism or pipelining
- High memory traffic

Memory-Memory Architecture: Pros and Cons
Pros
- Requires fewer instructions (especially if 3 operands)
- Easy to write compilers for (especially if 3 operands)
Cons
- Very high memory traffic (especially if 3 operands)
- Variable number of clocks per instruction
- With two operands, more data movements are required

Memory-Register Architecture: Pros and Cons
Pros
- Some data can be accessed without loading first
- Instruction format easy to encode
- Good code density
Cons
- Operands are not equivalent (poor orthogonality)
- Variable number of clocks per instruction
- May limit number of registers

Starting a Program
[Figure: translation flow]
    C program -> Compiler -> Assembly language program -> Assembler
    -> Object: machine language module + Object: library routine (machine language)
    -> Linker -> Executable: machine language program -> Loader -> Memory
Linker
- Place code and data modules symbolically in memory
- Determine the addresses of data and instruction labels
- Patch both internal and external references
Object files for Unix typically contain:
- Header: size and position of components
- Text segment: machine code
- Data segment: static and dynamic variables
- Relocation info: identifies absolute memory references
- Symbol table: name and location of labels, procedures, and variables
- Debugging info: mapping of source to object code, break points, etc.

Loading Executable Program
[Figure: memory layout]
    $sp -> 7fff ffff hex    Stack
                            Dynamic data
    $gp -> 1000 8000 hex    Static data (from 1000 0000 hex)
    pc  -> 0040 0000 hex    Text
           0                Reserved
To load an executable, the operating system follows these steps:
- Reads the executable file header to determine the size of the text and data segments
- Creates an address space large enough for the text and data
- Copies the instructions and data from the executable file into memory
- Copies the parameters (if any) to the main program onto the stack
- Initializes the machine registers and sets the stack pointer to the first free location
- Jumps to a start-up routine that copies the parameters into the argument registers and calls the main routine of the program

Instruction Set Design Issues
- Number of Addresses
- Flow of Control
- Operand Types
- Addressing Modes
- Instruction Types
- Instruction Formats

Number of Addresses
Four categories
- 3-address machines
    - 2 for the source operands and one for the result
- 2-address machines
    - One address doubles as source and result
- 1-address machines
    - Accumulator machines
    - Accumulator is used for one source and result
- 0-address machines
    - Stack machines
    - Operands are taken from the stack
    - Result goes onto the stack

Number of Addresses (cont.)
Three-address machines
- Two for the source operands, one for the result
- RISC processors use three addresses
- Sample instructions
    add  dest,src1,src2    ; M(dest) = [src1] + [src2]
    sub  dest,src1,src2    ; M(dest) = [src1] - [src2]
    mult dest,src1,src2    ; M(dest) = [src1] * [src2]
Three addresses: Operand 1, Operand 2, Result
    Example: a = b + c
Three-address instruction formats are not common, because they require a relatively long instruction format to hold the three address references.

Number of Addresses (cont.)
Example
- C statement
    A = B + C * D - E + F + A
- Equivalent code:
    mult T,C,D    ; T = C*D
    add  T,T,B    ; T = B+C*D
    sub  T,T,E    ; T = B+C*D-E
    add  T,T,F    ; T = B+C*D-E+F
    add  A,T,A    ; A = B+C*D-E+F+A

Number of Addresses (cont.)
Two-address machines
- One address doubles (for source operand and result)
- Last example makes a case for it
    - Address T is used twice
- Sample instructions
    load dest,src    ; M(dest) = [src]
    add  dest,src    ; M(dest) = [dest] + [src]
    sub  dest,src    ; M(dest) = [dest] - [src]
    mult dest,src    ; M(dest) = [dest] * [src]
Two addresses: one address doubles as operand and result
    Example: a = a + b
The two-address format reduces the space requirement but also introduces some awkwardness. To avoid altering the value of an operand, a MOVE instruction is used to move one of the values to a result or temporary location before performing the operation.

Number of Addresses (cont.)
Example
- C statement
    A = B + C * D - E + F + A
- Equivalent code:
    load T,C    ; T = C
    mult T,D    ; T = C*D
    add  T,B    ; T = B+C*D
    sub  T,E    ; T = B+C*D-E
    add  T,F    ; T = B+C*D-E+F
    add  A,T    ; A = B+C*D-E+F+A

Number of Addresses (cont.)
One-address machines
- Use a special set of registers called accumulators
    - Specify one source operand and receive the result
- Called accumulator machines
- Sample instructions
    load  addr    ; accum = [addr]
    store addr    ; M[addr] = accum
    add   addr    ; accum = accum + [addr]
    sub   addr    ; accum = accum - [addr]
    mult  addr    ; accum = accum * [addr]

Number of Addresses (cont.)
Example
- C statement
    A = B + C * D - E + F + A
- Equivalent code:
    load  C    ; load C into accum
    mult  D    ; accum = C*D
    add   B    ; accum = C*D+B
    sub   E    ; accum = B+C*D-E
    add   F    ; accum = B+C*D-E+F
    add   A    ; accum = B+C*D-E+F+A
    store A    ; store accum contents in A

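The one-address sequence can be checked with a tiny interpreter. This is a hedged sketch: the function name `run_accumulator`, the dictionary used as memory, and the sample values for A through F are all assumptions made for illustration.

```python
def run_accumulator(program, memory):
    """Execute one-address instructions against a single accumulator."""
    accum = 0
    for op, addr in program:
        if op == "load":
            accum = memory[addr]
        elif op == "store":
            memory[addr] = accum
        elif op == "add":
            accum += memory[addr]
        elif op == "sub":
            accum -= memory[addr]
        elif op == "mult":
            accum *= memory[addr]
        else:
            raise ValueError(f"unknown opcode: {op}")
    return memory


# The seven-instruction sequence for A = B + C * D - E + F + A.
program = [("load", "C"), ("mult", "D"), ("add", "B"), ("sub", "E"),
           ("add", "F"), ("add", "A"), ("store", "A")]
memory = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}  # hypothetical values
run_accumulator(program, memory)
print(memory["A"])
```

Every instruction names a memory operand, which is why accumulator machines tend to generate high memory traffic.
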
Number of Addresses (cont.)
Zero-address machines
- Stack supplies operands and receives the result
    - Special instructions to load and store use an address
- Called stack machines (Ex: HP3000, Burroughs B5500)
- Sample instructions
    push addr    ; push([addr])
    pop  addr    ; pop([addr])
    add          ; push(pop + pop)
    sub          ; push(pop - pop)
    mult         ; push(pop * pop)

Number of Addresses (cont.)
Example
- C statement
    A = B + C * D - E + F + A
- Equivalent code (read down the left column, then the right):
    push E        sub
    push C        push F
    push D        add
    mult          push A
    push B        add
    add           pop A

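A short stack-machine interpreter makes the operand order explicit. It is a sketch under assumptions: the name `run_stack`, the string instruction format, and the sample values are illustrative, and `sub` is taken to compute (first popped) minus (second popped), following the push(pop - pop) notation.

```python
def run_stack(program, memory):
    """Execute zero-address instructions; only push/pop carry an address."""
    stack = []
    for instruction in program:
        op, *args = instruction.split()
        if op == "push":
            stack.append(memory[args[0]])
        elif op == "pop":
            memory[args[0]] = stack.pop()
        else:
            first = stack.pop()   # value on top of the stack
            second = stack.pop()  # value beneath it
            if op == "add":
                stack.append(first + second)
            elif op == "sub":
                stack.append(first - second)
            elif op == "mult":
                stack.append(first * second)
            else:
                raise ValueError(f"unknown opcode: {op}")
    return memory


program = ["push E", "push C", "push D", "mult", "push B", "add",
           "sub", "push F", "add", "push A", "add", "pop A"]
memory = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}  # hypothetical values
run_stack(program, memory)
print(memory["A"])
```

E is pushed first so that it sits beneath B + C*D when `sub` executes, giving (B + C*D) - E rather than the reverse.
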
Load/Store Architecture
- Instructions expect operands in internal processor registers
    - Special LOAD and STORE instructions move data between registers and memory
- RISC uses this architecture
- Reduces instruction length

Load/Store Architecture (cont.)
- Sample instructions
    load  Rd,addr       ; Rd = [addr]
    store addr,Rs       ; (addr) = Rs
    add   Rd,Rs1,Rs2    ; Rd = Rs1 + Rs2
    sub   Rd,Rs1,Rs2    ; Rd = Rs1 - Rs2
    mult  Rd,Rs1,Rs2    ; Rd = Rs1 * Rs2

Number of Addresses (cont.)
Example
- C statement
    A = B + C * D - E + F + A
- Equivalent code (read down the left column, then the right):
    load R1,B        mult  R2,R2,R3
    load R2,C        add   R2,R2,R1
    load R3,D        sub   R2,R2,R4
    load R4,E        add   R2,R2,R5
    load R5,F        add   R2,R2,R6
    load R6,A        store A,R2

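The load/store version can be modelled the same way, with arithmetic restricted to registers. The names `run_load_store` and `ALU`, the tuple instruction format, the register count, and the sample values are assumptions for illustration only.

```python
import operator

# Register-to-register ALU operations; memory is touched only by load/store.
ALU = {"add": operator.add, "sub": operator.sub, "mult": operator.mul}


def run_load_store(program, memory, n_regs=8):
    regs = [0] * n_regs
    memory_accesses = 0
    for op, *args in program:
        if op == "load":
            rd, addr = args
            regs[rd] = memory[addr]
            memory_accesses += 1
        elif op == "store":
            addr, rs = args
            memory[addr] = regs[rs]
            memory_accesses += 1
        else:
            rd, rs1, rs2 = args
            regs[rd] = ALU[op](regs[rs1], regs[rs2])
    return memory_accesses


program = [("load", 1, "B"), ("load", 2, "C"), ("load", 3, "D"),
           ("load", 4, "E"), ("load", 5, "F"), ("load", 6, "A"),
           ("mult", 2, 2, 3), ("add", 2, 2, 1), ("sub", 2, 2, 4),
           ("add", 2, 2, 5), ("add", 2, 2, 6), ("store", "A", 2)]
memory = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}  # hypothetical values
accesses = run_load_store(program, memory)
print(memory["A"], accesses)
```

The sequence is longer than the accumulator version, but each data value crosses the memory interface only once.
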
Flow of Control
- Default is sequential flow
- Several instructions alter this default execution
    - Branches
        - Unconditional
        - Conditional
        - Delayed branches
    - Procedure calls
        - Delayed procedure calls

Flow of Control (cont.)
Branches
- Unconditional
    - Absolute address
    - PC-relative
        - Target address is specified relative to PC contents
        - Relocatable code
- Example: MIPS
    - Absolute address
        j target
    - PC-relative
        b target

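The difference between the two target calculations can be sketched as follows. The arithmetic follows the usual MIPS32 encoding (16-bit word offset relative to PC + 4 for branches, 26-bit word index for jumps); the function names are hypothetical.

```python
def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement number."""
    sign_bit = 1 << (bits - 1)
    return (value & (sign_bit - 1)) - (value & sign_bit)


def branch_target(pc, offset16):
    # PC-relative: the offset counts instructions (words) from PC + 4.
    return (pc + 4 + (sign_extend(offset16, 16) << 2)) & 0xFFFFFFFF


def jump_target(pc, index26):
    # Absolute within the current 256 MB region: the upper 4 bits come from PC + 4.
    return ((pc + 4) & 0xF0000000) | ((index26 & 0x03FFFFFF) << 2)


forward = branch_target(0x00400000, 3)
backward = branch_target(0x00400010, 0xFFFF)  # offset -1: branch to itself
absolute = jump_target(0x00400000, 0x00100004)
print(hex(forward), hex(backward), hex(absolute))
```

Because `branch_target` depends only on the distance from the branch, the code keeps working if the whole program is loaded at a different address; this is what makes PC-relative code relocatable.
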
Flow of Control (cont.)
Branches
- Conditional
    - Jump is taken only if the condition is met
- Two types
    - Set-Then-Jump
        - Condition testing is separated from branching
        - Condition code registers are used to convey the condition test result
        - Condition code registers keep a record of the status of the last ALU operation, such as the overflow condition
    - Example: Pentium code
        cmp AX,BX    ; compare AX and BX
        je  target   ; jump if equal

Flow of Control (cont.)
    - Test-and-Jump
        - Single instruction performs condition testing and branching
    - Example: MIPS instruction
        beq Rsrc1,Rsrc2,target    ; jumps to target if Rsrc1 = Rsrc2
- Delayed branching
    - Control is transferred after executing the instruction that follows the branch instruction
        - This instruction slot is called the delay slot
    - Improves efficiency
    - Highly pipelined RISC processors support it

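The effect of a delay slot on execution order can be illustrated with a toy trace. This is a simplified sketch, not a model of any real pipeline: the function `run`, the tuple instruction format, and the five-instruction program are assumptions, and a jump placed inside a delay slot is not handled.

```python
def run(program, delayed):
    """Return the indices of the instructions executed, in order.

    program: list of ("op",) or ("jump", target_index) tuples.
    delayed: when True, the instruction after a jump (the delay slot)
             executes before control is transferred.
    """
    trace, pc, pending = [], 0, None
    while pc < len(program):
        instruction = program[pc]
        trace.append(pc)
        next_pc = pc + 1
        if pending is not None:
            # The instruction just executed was the delay slot.
            next_pc, pending = pending, None
        elif instruction[0] == "jump":
            if delayed:
                pending = instruction[1]
            else:
                next_pc = instruction[1]
        pc = next_pc
    return trace


program = [("op",), ("jump", 4), ("op",), ("op",), ("op",)]
immediate_trace = run(program, delayed=False)
delayed_trace = run(program, delayed=True)
print(immediate_trace, delayed_trace)
```

With delayed branching, instruction 2 still executes after the jump at index 1, so a compiler can place useful work there instead of leaving the slot idle.
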
Flow of Control (cont.)
Procedure calls
- Facilitate modular programming
- Require two pieces of information to return
    - End of procedure
        - Pentium uses the ret instruction
        - MIPS uses the jr instruction
    - Return address
        - In a (special) register: MIPS allows any general-purpose register
        - On the stack: Pentium

Flow of Control (cont.)
[Figure: delay slot]

Parameter Passing
Two basic techniques
- Register-based (e.g., PowerPC, MIPS)
    - Internal registers are used
        - Faster
        - Limits the number of parameters
        - Recursive procedure
- Stack-based (e.g., Pentium)
    - Stack is used
        - More general

2. Operand Types
- Instructions support basic data types
    - Characters
    - Integers
    - Floating-point
- Instruction overload
    - Same instruction for different data types
    - Example: Pentium
        mov AL,address     ; loads an 8-bit value
        mov AX,address     ; loads a 16-bit value
        mov EAX,address    ; loads a 32-bit value

Operand Types (cont.)
- Separate instructions
    - Instructions specify the operand size
    - Example: MIPS
        lb Rdest,address    ; loads a byte
        lh Rdest,address    ; loads a halfword (16 bits)
        lw Rdest,address    ; loads a word (32 bits)
        ld Rdest,address    ; loads a doubleword (64 bits)
    - Similar instructions exist for store

3. Addressing Modes
- How the operands are specified
- Operands can be in three places
    - Registers
        - Register addressing mode
    - Part of instruction
        - Constant
        - Immediate addressing mode
        - All processors support these two addressing modes
    - Memory
        - Difference between RISC and CISC
        - CISC supports a large variety of addressing modes
        - RISC follows load/store architecture

4. Instruction Types
Several types of instructions
- Data movement
    - Pentium: mov dest,src
    - Some do not provide direct data movement instructions
    - Indirect data movement
        add Rdest,Rsrc,0    ; Rdest = Rsrc+0
- Arithmetic and Logical
    - Arithmetic
        - Integer and floating-point, signed and unsigned
        - add, subtract, multiply, divide
    - Logical
        - and, or, not, xor

Instruction Types (cont.)
Condition code bits
- S: Sign bit (0 = +, 1 = -)
- Z: Zero bit (0 = nonzero, 1 = zero)
- O: Overflow bit (0 = no overflow, 1 = overflow)
- C: Carry bit (0 = no carry, 1 = carry)
Example: Pentium
    cmp count,25    ; compare count to 25
                    ; (subtract 25 from count)
    je  target      ; jump if equal

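How a compare sets these four bits can be sketched as below. The function name `compare_flags` and the 32-bit default width are assumptions; the carry bit follows the x86 convention of signalling a borrow on subtraction.

```python
def compare_flags(a, b, bits=32):
    """Condition codes produced by computing a - b and discarding the result."""
    mask = (1 << bits) - 1
    sign = 1 << (bits - 1)
    a &= mask
    b &= mask
    result = (a - b) & mask
    return {
        "S": int(bool(result & sign)),
        "Z": int(result == 0),
        # Signed overflow: operands differ in sign and the result's sign
        # differs from the sign of a.
        "O": int(bool((a ^ b) & (a ^ result) & sign)),
        # Carry acts as a borrow flag: set when a < b as unsigned values.
        "C": int(a < b),
    }


equal = compare_flags(25, 25)
smaller = compare_flags(3, 25)
overflow = compare_flags(0x80000000, 1)
print(equal, smaller, overflow)
```

A following `je` would be taken only in the first case, because it tests Z = 1.
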
Instruction Types (cont.)
- Flow control and I/O instructions
    - Branch
    - Procedure call
    - Interrupts
- I/O instructions
    - Memory-mapped I/O
        - Most processors support memory-mapped I/O
        - No separate instructions for I/O
    - Isolated I/O
        - Pentium supports isolated I/O
        - Separate I/O instructions
            in  AX,io_port     ; read from an I/O port
            out io_port,AX     ; write to an I/O port

5. Instruction Formats
Two types
- Fixed-length
    - Used by RISC processors
    - 32-bit RISC processors use 32-bit-wide instructions
        - Examples: SPARC, MIPS, PowerPC
- Variable-length
    - Used by CISC processors
    - Memory operands need more bits to specify
Opcode
- Major and exact operation

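As a concrete instance of a fixed-length format, the sketch below packs and unpacks the standard MIPS32 R-type layout (6-bit opcode, three 5-bit register fields, 5-bit shift amount, 6-bit function code). The helper names are hypothetical; the field widths and the encoding of `add $t0,$s1,$s2` are standard MIPS.

```python
def encode_r_type(rs, rt, rd, shamt, funct, opcode=0):
    """Pack the six R-type fields into one 32-bit instruction word."""
    return ((opcode << 26) | (rs << 21) | (rt << 16) |
            (rd << 11) | (shamt << 6) | funct)


def decode_r_type(word):
    """Split a 32-bit word back into its R-type fields."""
    return {
        "opcode": (word >> 26) & 0x3F,
        "rs": (word >> 21) & 0x1F,
        "rt": (word >> 16) & 0x1F,
        "rd": (word >> 11) & 0x1F,
        "shamt": (word >> 6) & 0x1F,
        "funct": word & 0x3F,
    }


# add $t0,$s1,$s2: rs = 17 ($s1), rt = 18 ($s2), rd = 8 ($t0), funct = 0x20.
word = encode_r_type(rs=17, rt=18, rd=8, shamt=0, funct=0x20)
fields = decode_r_type(word)
print(hex(word), fields)
```

Here the opcode field gives the major operation class and the function field selects the exact operation, and every field sits at a fixed bit position, which keeps decoding simple.
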
Examples of Instruction Formats
[Figure: example instruction formats]

RISC (Reduced Instruction Set Computer)
versus
CISC (Complex Instruction Set Computer)

RISC vs. CISC
- The underlying philosophy of RISC machines is that a system is better able to manage program execution when the program consists of only a few different instructions that are the same length and require the same number of clock cycles to decode and execute.
- RISC systems access memory only with explicit load and store instructions.
- In CISC systems, many different kinds of instructions access memory, making instruction length variable and fetch-decode-execute time unpredictable.

RISC vs. CISC
- The difference between CISC and RISC becomes evident through the basic computer performance equation.
- RISC systems shorten execution time by reducing the clock cycles per instruction.
- CISC systems improve performance by reducing the number of instructions per program.

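The basic performance equation is usually written in the following standard form (an assumption here, since the slide's own rendering of it is not reproduced in the text):

```latex
\[
\frac{\text{time}}{\text{program}}
  = \frac{\text{time}}{\text{cycle}}
  \times \frac{\text{cycles}}{\text{instruction}}
  \times \frac{\text{instructions}}{\text{program}}
\]
```

RISC designs target the middle factor (cycles per instruction), while CISC designs target the last one (instructions per program).
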
RISC vs. CISC
- The simple instruction set of RISC machines enables control units to be hardwired for maximum speed.
- The more complex, and variable, instruction set of CISC machines requires microcode-based control units that interpret instructions as they are fetched from memory. This translation takes time.
- With fixed-length instructions, RISC lends itself to pipelining and speculative execution.

RISC vs. CISC
- Consider the program fragments:
    CISC:
            mov  ax, 10
            mov  bx, 5
            mul  bx, ax
    RISC:
            mov  ax, 0
            mov  bx, 10
            mov  cx, 5
    Begin:  add  ax, bx
            loop Begin
- The total clock cycles for the CISC version might be:
    (2 movs × 1 cycle) + (1 mul × 30 cycles) = 32 cycles
- While the clock cycles for the RISC version are:
    (3 movs × 1 cycle) + (5 adds × 1 cycle) + (5 loops × 1 cycle) = 13 cycles
- With the RISC clock cycle being shorter, RISC gives us much faster execution speeds.

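The cycle counts can be reproduced by stepping through both fragments. This sketch assumes the textbook costs for this example (1 cycle each for mov, add, and loop; 30 cycles for mul) and operand values of 10 and 5; the function names are hypothetical.

```python
def run_cisc():
    """mov ax,10 / mov bx,5 / mul bx,ax with a 30-cycle multiply."""
    ax, bx = 10, 5
    cycles = 2 * 1            # two movs, one cycle each
    bx = bx * ax
    cycles += 30              # one mul
    return bx, cycles


def run_risc():
    """Repeated addition: add ax,bx executed cx times under loop control."""
    ax, bx, cx = 0, 10, 5
    cycles = 3 * 1            # three movs
    while True:
        ax += bx              # add ax, bx
        cycles += 1
        cx -= 1               # loop Begin: decrement cx, branch while nonzero
        cycles += 1
        if cx == 0:
            break
    return ax, cycles


cisc_result = run_cisc()
risc_result = run_risc()
print(cisc_result, risc_result)
```

Both fragments compute the same product; the RISC version executes more instructions but fewer total cycles under these assumed costs.
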
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 51101
R s p
R g p
gt gt amp gt gt gt gt gth e
gt
gt gt gt gt gt gt gt h e
T e t
S t a t i c d a t a
y n a m i c d a t a
S t a c 0B f f f f f f f
h e
gt gt gt = gt gt gth e
p c
1 e s e r v e d
5oading 7ecuta8le Program
To load an eecutable the operating systemfollos these steps6
1eads the eecutable file header todetermine the si-e of tet and data segments
Creates an address space large enough forthe tet and data
Copies the instructions and data from the
eecutable file into memory
Copies the parameters (if any) to the mainprogram onto the stac0
Initiali-es the machine registers and sets thestac0 pointer to the first free location
umps to a start$up routines that copies theparameters into the argument registers andcalls the main routine of the program
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 52101
Instruction Set Design IssuesInstruction Set Desi
gn Issues
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 53101
Instruction Set Design IssuesInstruction Set Design Issues
Instruction Set esign Issues 7umber of Addresses
Llo of Control
5perand Typesamp Addressing Modes
Instruction Types
Instruction Lormats
um+er of Addressesum+er of Addresses
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 54101
um+er of Addressesum+er of Addresses
Lour categories
$address machines$ for the source operands and one for the result
$address machines
$ 5ne address doubles as source and result
$address machine$ Accumulator machines
$ Accumulator is used for one source and result
gt$address machines
$ Stac0 machines
$ 5perands are ta0en from the stac0
$ 1esult goes onto the stac0
um+er of Addresses cont -um+er of Addresses
cont -
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 55101
um+er of Addresses cont-um+er of Addresses cont-
Three$address machines
To for the source operands one for the result
1ISC processors use three addresses
Sample instructions
add destsrc1src2
M(dest)=[src1]+[src2]
sub destsrc1src2
M(dest)=[src1]-[src2]
mult destsrc1src2
M(dest)=[src1][src2]
Three addresses
Operand 1 Operand 2 Result
Example a = b + c
Three-address instruction formats are not common because they reuire a
relatiely lon instruction format to hold the three address references
um+er of Addresses cont -um+er of Addresses
cont -
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 56101
um+er of Addresses cont-um+er of Addresses cont-
ltample
C statement
A C H D F 6 A
ltJuivalent code6
mult TCD T = CD
add TTB T = B+CD
sub TTE T = B+CD-E
add TTF T = B+CD-E+Fadd ATA A = B+CD-E+F+A
um+er of Addresses cont -um+er of Addresses
cont -
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 57101
um+er of Addresses cont-um+er of Addresses cont-
To$address machines
5ne address doubles (for source operand result)
3ast eample ma0es a case for it
$ Address T is used tice
Sample instructions
load destsrc M(dest)=[src]
add destsrc M(dest)=[dest]+[src]
sub destsrc M(dest)=[dest]-[src]
mult destsrc M(dest)=[dest][src]
Two Addresses
One address doubles as operand and resultExample a = a + b
The t$o-address formal reduces the space reuirement but also
introduces some a$$ardness To aoid alterin the alue of an
operand a ampOE instruction is used to moe one of the alues to a
result or temporary location before performin the operation
um+er of Addresses cont -um+er of Addresses
cont -
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 58101
um+er of Addresses cont-um+er of Addresses cont-
ltample
C statement
A C H D F 6 A
ltJuivalent code6
load TC T = C
mult TD T = CD
add TB T = B+CD
sub TE T = B+CD-Eadd TF T = B+CD-E+F
add AT A = B+CD-E+F+A
um+er of Addresses cont -um+er of Addresses
cont -
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 59101
um+er of Addresses cont-um+er of Addresses cont-
5ne$address machines 4se special set of registers called accumulators
$ Specify one source operand receive the result
Called accumulator machines
Sample instructions
load addr accum = [addr]
store addr M[addr] = accumadd addr accum = accum + [addr]
sub addr accum = accum - [addr]
mult addr accum = accum [addr]
Number of Addresses (cont.)
Example
C statement:
    A = B + C * D - E + F + A
Equivalent code:
    load  C    ; load C into accum
    mult  D    ; accum = C*D
    add   B    ; accum = C*D + B
    sub   E    ; accum = B + C*D - E
    add   F    ; accum = B + C*D - E + F
    add   A    ; accum = B + C*D - E + F + A
    store A    ; store accum contents in A
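A one-address (accumulator) machine can be sketched in the same style. The memory values and the name `run_accumulator` are hypothetical; the single implicit register `accum` is both the second source and the destination of every arithmetic instruction.

```python
# Hypothetical memory contents; the slides give no concrete values.
mem = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}

def run_accumulator(program, mem):
    """Execute (op, addr) pairs against a single implicit accumulator."""
    accum = 0
    for op, addr in program:
        if op == "load":
            accum = mem[addr]
        elif op == "store":
            mem[addr] = accum
        elif op == "add":
            accum = accum + mem[addr]
        elif op == "sub":
            accum = accum - mem[addr]
        elif op == "mult":
            accum = accum * mem[addr]
        else:
            raise ValueError(f"unknown opcode: {op}")
    return accum

program = [
    ("load", "C"), ("mult", "D"), ("add", "B"), ("sub", "E"),
    ("add", "F"), ("add", "A"), ("store", "A"),
]

final_accum = run_accumulator(program, mem)
print(mem["A"])  # 16 for the values above
```

Every instruction except the implicit accumulator update touches memory, which is why accumulator machines generate high memory traffic.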
Number of Addresses (cont.)
Zero-address machines
- Stack supplies operands and receives the result
  - Special instructions to load and store use an address
- Called stack machines (Ex: HP3000, Burroughs B5500)
Sample instructions
    push addr    ; push([addr])
    pop  addr    ; pop([addr])
    add          ; push(pop + pop)
    sub          ; push(pop - pop)
    mult         ; push(pop * pop)
Number of Addresses (cont.)
Example
C statement:
    A = B + C * D - E + F + A
Equivalent code:
    push E
    push C
    push D
    mult
    push B
    add
    sub
    push F
    add
    push A
    add
    pop  A
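The stack sequence is easiest to follow by tracing it. The sketch below assumes the semantics given in the sample instructions, where the first value popped is the left operand (`push(pop - pop)`); the memory values and the name `run_stack` are hypothetical.

```python
# Hypothetical memory contents; the slides give no concrete values.
mem = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}

def run_stack(program, mem):
    """Execute zero-address code; only push and pop carry an address."""
    stack = []
    for instr in program:
        op = instr[0]
        if op == "push":
            stack.append(mem[instr[1]])
        elif op == "pop":
            mem[instr[1]] = stack.pop()
        else:
            first = stack.pop()   # top of stack is the left operand
            second = stack.pop()
            if op == "add":
                stack.append(first + second)
            elif op == "sub":
                stack.append(first - second)
            elif op == "mult":
                stack.append(first * second)
            else:
                raise ValueError(f"unknown opcode: {op}")
    return stack

program = [
    ("push", "E"), ("push", "C"), ("push", "D"), ("mult",),
    ("push", "B"), ("add",), ("sub",),
    ("push", "F"), ("add",), ("push", "A"), ("add",), ("pop", "A"),
]

leftover = run_stack(program, mem)
print(mem["A"], leftover)  # 16 and an empty stack for the values above
```

E is pushed first so that it sits below B + C*D when `sub` executes, giving (B + C*D) - E rather than the reverse.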
Load/Store Architecture
- Instructions expect operands in internal processor registers
- Special LOAD and STORE instructions move data between registers and memory
- RISC uses this architecture
- Reduces instruction length
Load/Store Architecture (cont.)
Sample instructions
    load  $d,addr      ; $d = [addr]
    store addr,$s      ; (addr) = $s
    add   $d,$s1,$s2   ; $d = $s1 + $s2
    sub   $d,$s1,$s2   ; $d = $s1 - $s2
    mult  $d,$s1,$s2   ; $d = $s1 * $s2
Number of Addresses (cont.)
Example
C statement:
    A = B + C * D - E + F + A
Equivalent code:
    load  $1,B
    load  $2,C
    load  $3,D
    load  $4,E
    load  $5,F
    load  $6,A
    mult  $2,$2,$3
    add   $2,$2,$1
    sub   $2,$2,$4
    add   $2,$2,$5
    add   $2,$2,$6
    store A,$2
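A load/store version of the same statement can be sketched as below. The register numbers `$1` to `$6`, the memory values, and the names `ALU` and `run_load_store` are assumptions for illustration; the point is that only `load` and `store` touch memory, while arithmetic works on registers alone.

```python
# Hypothetical memory contents; the slides give no concrete values.
mem = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}

ALU = {
    "add": lambda x, y: x + y,
    "sub": lambda x, y: x - y,
    "mult": lambda x, y: x * y,
}

def run_load_store(program, mem):
    """Execute load/store code and count the memory accesses it makes."""
    regs = {}
    memory_accesses = 0
    for instr in program:
        op = instr[0]
        if op == "load":            # load $d,addr
            _, dest, addr = instr
            regs[dest] = mem[addr]
            memory_accesses += 1
        elif op == "store":         # store addr,$s
            _, addr, src = instr
            mem[addr] = regs[src]
            memory_accesses += 1
        else:                       # op $d,$s1,$s2 (registers only)
            _, dest, src1, src2 = instr
            regs[dest] = ALU[op](regs[src1], regs[src2])
    return regs, memory_accesses

program = [
    ("load", "$1", "B"), ("load", "$2", "C"), ("load", "$3", "D"),
    ("load", "$4", "E"), ("load", "$5", "F"), ("load", "$6", "A"),
    ("mult", "$2", "$2", "$3"), ("add", "$2", "$2", "$1"),
    ("sub", "$2", "$2", "$4"), ("add", "$2", "$2", "$5"),
    ("add", "$2", "$2", "$6"), ("store", "A", "$2"),
]

regs, memory_accesses = run_load_store(program, mem)
print(mem["A"], memory_accesses)  # 16 and 7 for the values above
```

The program is longer than the three-address version, but each operand is fetched from memory exactly once.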
Flow of Control
Default is sequential flow
Several instructions alter this default execution
- Branches
  - Unconditional
  - Conditional
  - Delayed branches
- Procedure calls
  - Delayed procedure calls
Flow of Control (cont.)
Branches
- Unconditional
  - Absolute address
  - PC-relative
    - Target address is specified relative to PC contents
    - Relocatable code
- Example: MIPS
  - Absolute address
        j target
  - PC-relative
        b target
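The difference between the two target calculations can be sketched numerically. The field widths follow the MIPS32 encoding (a 26-bit word index for `j`, a signed 16-bit word offset for branches); the function names and the sample addresses are hypothetical.

```python
def jump_target(pc, index26):
    """Absolute (pseudo-direct) jump: the upper 4 bits come from PC + 4."""
    return ((pc + 4) & 0xF0000000) | ((index26 & 0x03FFFFFF) << 2)

def branch_target(pc, offset16):
    """PC-relative branch: sign-extended word offset added to PC + 4."""
    offset = offset16 - 0x10000 if offset16 & 0x8000 else offset16
    return (pc + 4 + (offset << 2)) & 0xFFFFFFFF

pc = 0x00400000
absolute = jump_target(pc, 0x0100004)  # index names the target word directly
relative = branch_target(pc, 3)        # three words past the delay slot

# Moving the code by `delta` moves a PC-relative target with it, while the
# absolute target stays put; this is why PC-relative code is relocatable.
delta = 0x1000
moved_relative = branch_target(pc + delta, 3)
moved_absolute = jump_target(pc + delta, 0x0100004)
print(hex(absolute), hex(relative), hex(moved_relative), hex(moved_absolute))
```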
Flow of Control (cont.)
Branches
- Conditional
  - Jump is taken only if the condition is met
- Two types
  - Set-Then-Jump
    - Condition testing is separated from branching
    - Condition code registers are used to convey the condition test result
    - Condition code registers keep a record of the status of the last ALU operation, such as an overflow condition
  - Example: Pentium code
        cmp AX,BX     ; compare AX and BX
        je  target    ; jump if equal
Flow of Control (cont.)
  - Test-and-Jump
    - Single instruction performs condition testing and branching
  - Example: MIPS instruction
        beq $src1,$src2,target
    Jumps to target if $src1 = $src2
Delayed branching
- Control is transferred after executing the instruction that follows the branch instruction
  - This instruction slot is called the delay slot
- Improves efficiency
- Highly pipelined RISC processors support delayed branching
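The effect of a delay slot can be seen in a toy interpreter. This is a sketch under stated assumptions: the three-instruction "ISA" (`set`, `jump`, `halt`) and the function `run` are invented for the example and do not correspond to any real processor.

```python
def run(program, delayed):
    """Run a toy program; with delayed=True a jump takes effect one
    instruction late, so the instruction after the jump still executes."""
    regs, pc, pending = {}, 0, None
    while True:
        instr = program[pc]
        next_pc = pc + 1
        if pending is not None:       # this instruction is in the delay slot
            next_pc, pending = pending, None
        if instr[0] == "halt":
            return regs
        if instr[0] == "set":
            regs[instr[1]] = instr[2]
        elif instr[0] == "jump":
            if delayed:
                pending = instr[1]    # transfer control after the next instruction
            else:
                next_pc = instr[1]
        pc = next_pc

program = [
    ("set", "a", 1),
    ("jump", 4),
    ("set", "b", 2),   # delay slot: runs only on the delayed-branch machine
    ("set", "c", 3),   # skipped on both machines
    ("halt",),
]

immediate_regs = run(program, delayed=False)
delayed_regs = run(program, delayed=True)
print(immediate_regs, delayed_regs)
```

A compiler for a delayed-branch machine tries to move a useful instruction into the slot; if it cannot, it fills the slot with a no-op.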
Flow of Control (cont.)
Procedure calls
- Facilitate modular programming
- Require two pieces of information to return
  - End of procedure
    - Pentium uses the ret instruction
    - MIPS uses the jr instruction
  - Return address
    - In a (special) register: MIPS allows any general-purpose register
    - On the stack: Pentium
Flow of Control (cont.)
[Figure not recovered: delay slot]
Parameter Passing
Two basic techniques
- Register-based (e.g., PowerPC, MIPS)
  - Internal registers are used
    - Faster
    - Limit the number of parameters
    - Recursive procedure
- Stack-based (e.g., Pentium)
  - Stack is used
    - More general
2 Operand Types
Instructions support basic data types
- Characters
- Integers
- Floating-point
Instruction overload
- Same instruction for different data types
- Example: Pentium
        mov AL,address     ; loads an 8-bit value
        mov AX,address     ; loads a 16-bit value
        mov EAX,address    ; loads a 32-bit value
Operand Types (cont.)
Separate instructions
- Instructions specify the operand size
- Example: MIPS
        lb $dest,address    ; loads a byte
        lh $dest,address    ; loads a halfword (16 bits)
        lw $dest,address    ; loads a word (32 bits)
        ld $dest,address    ; loads a doubleword (64 bits)
  Similar instructions exist for store
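How the opcode selects the operand size can be sketched with a byte array standing in for memory. The memory contents, the `SIZES` table, and the `load` helper are hypothetical; big-endian byte order and sign extension are assumptions made for this sketch (MIPS hardware can be configured for either byte order).

```python
# Hypothetical 8-byte memory image.
memory = bytes([0x80, 0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07])

# Operand size in bytes implied by each load opcode.
SIZES = {"lb": 1, "lh": 2, "lw": 4, "ld": 8}

def load(op, address):
    """Read SIZES[op] bytes at address as a signed big-endian integer."""
    size = SIZES[op]
    return int.from_bytes(memory[address:address + size], "big", signed=True)

byte_value = load("lb", 0)        # 0x80 sign-extends to a negative value
halfword_value = load("lh", 2)    # bytes 0x02 0x03
word_value = load("lw", 4)        # bytes 0x04 0x05 0x06 0x07
print(byte_value, hex(halfword_value), hex(word_value))
```

With instruction overloading (the Pentium `mov` case) the same size information is carried by the register name instead of the opcode.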
3 Addressing Modes
How the operands are specified
- Operands can be in three places
  - Registers
    - Register addressing mode
  - Part of instruction
    - Constant
    - Immediate addressing mode
    - All processors support these two addressing modes
  - Memory
    - Difference between RISC and CISC
    - CISC supports a large variety of addressing modes
    - RISC follows load/store architecture
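The three operand locations can be sketched as a single operand-fetch routine. The register file, memory contents, and the name `fetch_operand` are hypothetical, and only one memory mode (direct addressing) is shown; real CISC processors offer many more.

```python
# Hypothetical machine state.
registers = {"r1": 7}
memory = {0x100: 99}

def fetch_operand(mode, value):
    """Return the operand selected by an addressing mode."""
    if mode == "register":    # operand is held in the named register
        return registers[value]
    if mode == "immediate":   # operand is the constant in the instruction
        return value
    if mode == "memory":      # operand is at the given memory address
        return memory[value]
    raise ValueError(f"unknown addressing mode: {mode}")

print(fetch_operand("register", "r1"),
      fetch_operand("immediate", 5),
      fetch_operand("memory", 0x100))
```

On a load/store machine the `memory` case would be reachable only from load and store instructions.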
4 Instruction Types
Several types of instructions
- Data movement
  - Pentium: mov dest,src
  - Some processors do not provide direct data movement instructions
  - Indirect data movement
        add $dest,$src,0    ; $dest = $src + 0
- Arithmetic and logical
  - Arithmetic
    - Integer and floating-point, signed and unsigned
    - add, subtract, multiply, divide
  - Logical
    - and, or, not, xor
Instruction Types (cont.)
Condition code bits
- S: Sign bit (0 = +, 1 = -)
- Z: Zero bit (0 = nonzero, 1 = zero)
- O: Overflow bit (0 = no overflow, 1 = overflow)
- C: Carry bit (0 = no carry, 1 = carry)
Example: Pentium
        cmp count,25    ; compare count to 25
                        ; (subtract 25 from count)
        je  target      ; jump if equal
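How a compare sets the four condition code bits can be sketched as follows. The function `compare` and its 32-bit default width are assumptions for illustration; the carry bit is modeled as the borrow out of the unsigned subtraction, which is how x86 defines it for `cmp`.

```python
def compare(a, b, bits=32):
    """Compute a - b and return the S, Z, O, C flags; the result is discarded."""
    mask = (1 << bits) - 1
    sign = 1 << (bits - 1)
    a &= mask
    b &= mask
    result = (a - b) & mask
    return {
        "S": int(bool(result & sign)),                   # result is negative
        "Z": int(result == 0),                           # operands are equal
        "O": int(bool((a ^ b) & (a ^ result) & sign)),   # signed overflow
        "C": int(a < b),                                 # unsigned borrow
    }

equal = compare(25, 25)          # je would be taken: Z = 1
smaller = compare(24, 25)        # negative result and a borrow
overflow = compare(0x80000000, 1)  # most negative value minus one overflows
print(equal, smaller, overflow)
```

A conditional jump such as `je` then only has to inspect Z; the comparison itself has already happened.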
Instruction Types (cont.)
Flow control and I/O instructions
- Branch
- Procedure call
- Interrupts
I/O instructions
- Memory-mapped I/O
  - Most processors support memory-mapped I/O
  - No separate instructions for I/O
- Isolated I/O
  - Pentium supports isolated I/O
  - Separate I/O instructions
        in  AX,port    ; read from an I/O port
        out port,AX    ; write to an I/O port
5 Instruction Formats
Two types
- Fixed-length
  - Used by RISC processors
  - 32-bit RISC processors use 32-bit-wide instructions
    - Examples: SPARC, MIPS, PowerPC
- Variable-length
  - Used by CISC processors
  - Memory operands need more bits to specify
Opcode
- Major and exact operation
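A fixed-length format can be illustrated by packing a MIPS R-type instruction into one 32-bit word. The field layout (opcode 6 bits, rs 5, rt 5, rd 5, shamt 5, funct 6) and the `add` function code 0x20 are from the MIPS32 encoding; the helper name `encode_r_type` is hypothetical.

```python
def encode_r_type(rs, rt, rd, shamt, funct, opcode=0):
    """Pack the six R-type fields into a single 32-bit instruction word."""
    fields = ((opcode, 6), (rs, 5), (rt, 5), (rd, 5), (shamt, 5), (funct, 6))
    for value, width in fields:
        if not 0 <= value < (1 << width):
            raise ValueError(f"field value {value} does not fit in {width} bits")
    return ((opcode << 26) | (rs << 21) | (rt << 16)
            | (rd << 11) | (shamt << 6) | funct)

# add $8,$9,$10: the major opcode 0 selects the R-type group and the
# funct field 0x20 selects the exact operation (add).
word = encode_r_type(rs=9, rt=10, rd=8, shamt=0, funct=0x20)
print(hex(word))  # 0x12a4020
```

The split between `opcode` and `funct` mirrors the "major and exact operation" distinction: the decoder can classify the instruction from the first six bits alone.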
Examples of Instruction Formats
[Figure not recovered: examples of instruction formats]
RISC (Reduced Instruction Set Computer)
versus
CISC (Complex Instruction Set Computer)
RISC vs. CISC
- The underlying philosophy of RISC machines is that a system is better able to manage program execution when the program consists of only a few different instructions that are the same length and require the same number of clock cycles to decode and execute.
- RISC systems access memory only with explicit load and store instructions.
- In CISC systems, many different kinds of instructions access memory, making instruction length variable and fetch-decode-execute time unpredictable.
RISC vs. CISC
- The difference between CISC and RISC becomes evident through the basic computer performance equation:
      time/program = (time/cycle) × (cycles/instruction) × (instructions/program)
- RISC systems shorten execution time by reducing the clock cycles per instruction.
- CISC systems improve performance by reducing the number of instructions per program.
RISC vs. CISC
- The simple instruction set of RISC machines enables control units to be hardwired for maximum speed.
- The more complex, and variable, instruction set of CISC machines requires microcode-based control units that interpret instructions as they are fetched from memory. This translation takes time.
- With fixed-length instructions, RISC lends itself to pipelining and speculative execution.
RISC vs. CISC
Consider the program fragments:

    CISC                  RISC
    mov ax, 10                   mov ax, 0
    mov bx, 5                    mov bx, 10
    mul bx, ax                   mov cx, 5
                          Begin: add ax, bx
                                 loop Begin

The total clock cycles for the CISC version might be:
    (2 movs × 1 cycle) + (1 mul × 30 cycles) = 32 cycles
While the clock cycles for the RISC version are:
    (3 movs × 1 cycle) + (5 adds × 1 cycle) + (5 loops × 1 cycle) = 13 cycles
With the RISC clock cycle also being shorter, RISC gives us much faster execution speeds.
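The cycle comparison can be reproduced with a few lines of arithmetic. The per-instruction costs in `CYCLES` are assumptions made for this sketch (1 cycle for `mov`, `add`, and `loop`, 30 cycles for `mul`), as are the instruction counts: a multiply by 5 replaced with five add/loop iterations.

```python
# Assumed cost of each instruction in clock cycles.
CYCLES = {"mov": 1, "add": 1, "loop": 1, "mul": 30}

def total_cycles(instruction_counts):
    """Sum count * cycles over every instruction kind executed."""
    return sum(CYCLES[op] * count for op, count in instruction_counts.items())

# CISC fragment: two moves and one multi-cycle multiply.
cisc_cycles = total_cycles({"mov": 2, "mul": 1})

# RISC fragment: three moves, then the add/loop pair executed five times.
risc_cycles = total_cycles({"mov": 3, "add": 5, "loop": 5})

print(cisc_cycles, risc_cycles)  # 32 and 13 under the assumed costs
```

The RISC fragment executes more instructions (13 versus 3) yet needs fewer cycles, which is the trade-off expressed by the performance equation.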
RISC vs. CISC
- Because of their load-store ISAs, RISC architectures require a large number of CPU registers.
- These registers provide fast access to data during sequential program execution.
- They can also be employed to reduce the overhead typically caused by passing parameters to subprograms.
- Instead of pulling parameters off of a stack, the subprogram is directed to use a subset of registers.
RISC vs. CISC
- This is how registers can be overlapped in a RISC system.
- The current window pointer (CWP) points to the active register window.
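Overlapping register windows can be sketched as a circular register file in which the "out" registers of one window are the "in" registers of the next. Everything concrete here is an assumption: the class name `RegisterWindows`, four windows, eight registers per group, and a CWP that increments on a call. Window overflow and underflow handling is omitted.

```python
class RegisterWindows:
    """Circular register file with overlapping in/local/out groups."""

    GROUP = 8  # assumed number of registers in each in/local/out group

    def __init__(self, nwindows=4):
        self.nwindows = nwindows
        self.regs = [0] * (nwindows * 2 * self.GROUP)
        self.cwp = 0  # current window pointer

    def _index(self, kind, n):
        base = self.cwp * 2 * self.GROUP
        # "out" registers start where the next window's "in" registers start.
        offset = {"in": 0, "local": self.GROUP, "out": 2 * self.GROUP}[kind] + n
        return (base + offset) % len(self.regs)

    def read(self, kind, n):
        return self.regs[self._index(kind, n)]

    def write(self, kind, n, value):
        self.regs[self._index(kind, n)] = value

    def call(self):
        self.cwp = (self.cwp + 1) % self.nwindows

    def ret(self):
        self.cwp = (self.cwp - 1) % self.nwindows

windows = RegisterWindows()
windows.write("local", 0, 1)   # caller's private value
windows.write("out", 0, 42)    # caller places an argument
windows.call()
argument = windows.read("in", 0)         # callee sees it without any copy
callee_local = windows.read("local", 0)  # callee has fresh local registers
windows.write("in", 0, 7)      # callee leaves a return value
windows.ret()
returned = windows.read("out", 0)
print(argument, callee_local, returned)
```

No memory traffic occurs during the call: passing the parameter amounts to changing the CWP.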
RISC vs. CISC
- It is becoming increasingly difficult to distinguish RISC architectures from CISC architectures.
- Some RISC systems provide more extravagant instruction sets than some CISC systems.
- Some systems combine both approaches.
- The following two slides summarize the characteristics that traditionally typify the differences between these two architectures.
RISC vs. CISC

RISC                                            CISC
Multiple register sets.                         Single register set.
Three operands per instruction.                 One or two register operands per instruction.
Parameter passing through register windows.     Parameter passing through memory.
Single-cycle instructions.                      Multiple cycle instructions.
Hardwired control.                              Microprogrammed control.
Highly pipelined.                               Less pipelined.
Simple instructions, few in number.             Many complex instructions.
Fixed length instructions.                      Variable length instructions.
Complexity in compiler.                         Complexity in microcode.
Only LOAD/STORE instructions access memory.     Many instructions can access memory.
Few addressing modes.                           Many addressing modes.
Summary
Instruction Set Design Issues
Instruction set design issues include:
- Where are operands stored?
  - registers, memory, stack, accumulator
- How many explicit operands are there?
  - 0, 1, 2, or 3
- How is the operand location specified?
  - register, immediate, indirect, ...
- What type and size of operands are supported?
  - byte, int, float, double, string, vector, ...
- What operations are supported?
  - add, sub, mul, move, compare, ...
More About General Purpose Registers
Why do almost all new architectures use GPRs?
- Registers are much faster than memory (even cache)
  - Register values are available immediately
  - When memory isn't ready, the processor must wait ("stall")
- Registers are convenient for variable storage
  - Compiler assigns some variables just to registers
  - More compact code, since small fields specify registers (compared to memory addresses)
[Figure: Processor (Registers), Cache, Memory, Disk]
What Operations are Needed
- Arithmetic and Logical
  - Integer arithmetic: ADD, SUB, MULT, DIV, SHIFT
  - Logical operation: AND, OR, XOR, NOT
- Data Transfer: copy, load, store
- Control: branch, jump, call, return
- Floating Point: ADDF, MULF, DIVF
  - Same as arithmetic but usually take bigger operands
- Decimal: ADDD, CONVERT
- String: move, compare, search
- Graphics: pixel and vertex, compression/decompression operations
Stacks Architecture: Pros and Cons
Pros
- Good code density (implicit top of stack)
- Low hardware requirements
- Easy to write a simpler compiler for stack architectures
Cons
- Stack becomes the bottleneck
- Little ability for parallelism or pipelining
- Data is not always at the top of the stack when needed, so additional instructions like TOP and SWAP are needed
- Difficult to write an optimizing compiler for stack architectures
Accumulators Architecture: Pros and Cons
Pros
- Very low hardware requirements
- Easy to design and understand
Cons
- Accumulator becomes the bottleneck
- Little ability for parallelism or pipelining
- High memory traffic
Memory-Memory Architecture: Pros and Cons
Pros
- Requires fewer instructions (especially if 3 operands)
- Easy to write compilers for (especially if 3 operands)
Cons
- Very high memory traffic (especially if 3 operands)
- Variable number of clocks per instruction
- With two operands, more data movements are required
Memory-Register Architecture: Pros and Cons
Pros
- Some data can be accessed without loading first
- Instruction format easy to encode
- Good code density
Cons
- Operands are not equivalent (poor orthogonality)
- Variable number of clocks per instruction
- May limit number of registers
Instruction Set Design Issues
- Number of Addresses
- Flow of Control
- Operand Types
- Addressing Modes
- Instruction Types
- Instruction Formats
Number of Addresses
Four categories
- 3-address machines
  - Two for the source operands and one for the result
- 2-address machines
  - One address doubles as source and result
- 1-address machines
  - Accumulator machines
  - Accumulator is used for one source and the result
- 0-address machines
  - Stack machines
  - Operands are taken from the stack
  - Result goes onto the stack
Number of Addresses (cont.)
Three-address machines
- Two for the source operands, one for the result
- RISC processors use three addresses
Sample instructions
    add  dest,src1,src2    ; M(dest) = [src1] + [src2]
    sub  dest,src1,src2    ; M(dest) = [src1] - [src2]
    mult dest,src1,src2    ; M(dest) = [src1] * [src2]
Three addresses
- Operand 1, Operand 2, Result
- Example: a = b + c
Three-address instruction formats are not common because they require a relatively long instruction format to hold the three address references.
um+er of Addresses cont -um+er of Addresses
cont -
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 56101
um+er of Addresses cont-um+er of Addresses cont-
ltample
C statement
A C H D F 6 A
ltJuivalent code6
mult TCD T = CD
add TTB T = B+CD
sub TTE T = B+CD-E
add TTF T = B+CD-E+Fadd ATA A = B+CD-E+F+A
um+er of Addresses cont -um+er of Addresses
cont -
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 57101
um+er of Addresses cont-um+er of Addresses cont-
To$address machines
5ne address doubles (for source operand result)
3ast eample ma0es a case for it
$ Address T is used tice
Sample instructions
load destsrc M(dest)=[src]
add destsrc M(dest)=[dest]+[src]
sub destsrc M(dest)=[dest]-[src]
mult destsrc M(dest)=[dest][src]
Two Addresses
One address doubles as operand and resultExample a = a + b
The t$o-address formal reduces the space reuirement but also
introduces some a$$ardness To aoid alterin the alue of an
operand a ampOE instruction is used to moe one of the alues to a
result or temporary location before performin the operation
um+er of Addresses cont -um+er of Addresses
cont -
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 58101
um+er of Addresses cont-um+er of Addresses cont-
ltample
C statement
A C H D F 6 A
ltJuivalent code6
load TC T = C
mult TD T = CD
add TB T = B+CD
sub TE T = B+CD-Eadd TF T = B+CD-E+F
add AT A = B+CD-E+F+A
um+er of Addresses cont -um+er of Addresses
cont -
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 59101
um+er of Addresses cont-um+er of Addresses cont-
5ne$address machines 4se special set of registers called accumulators
$ Specify one source operand receive the result
Called accumulator machines
Sample instructions
load addr accum = [addr]
store addr M[addr] = accumadd addr accum = accum + [addr]
sub addr accum = accum - [addr]
mult addr accum = accum [addr]
um+er of Addresses cont -um+er of Addresses
cont -
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 60101
um+er of Addresses cont-um+er of Addresses cont-
ltample
C statementA C H D F 6 A
ltJuivalent code6
load C load C to accum
mult D accum = CD
add B accum = CD+B
sub E accum = B+CD-Eadd F accum = B+CD-E+F
add A accum = B+CD-E+F+A
store A store accum cotets A
um+er of Addresses cont -um+er of Addresses
cont -
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 61101
um+er of Addresses cont-um+er of Addresses cont-
Vero$address machines
Stac0 supplies operands and receives the result$ Special instructions to load and store use an address
Called stac0 machines (lt6 Pgtgtgt 8urroughs 8gtgt)
Sample instructions
us addr us([addr])
o addr o([addr])
add us(o + o)
sub us(o - o) mult us(o o)
um+er of Addresses cont -um+er of Addresses
cont-
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 62101
um+er of Addresses cont-um+er of Addresses cont-
ltample
C statement
A C H D F 6 A
ltJuivalent code6
us E sub
us C us F
us D add
Mult us A
us B add
add o A
)oadStore Architecture)oadStore Architecture
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 63101
)oadStore Architecture)oadStore Architecture
Instructions epect operands in internal processor registers Special 35A and ST51lt instructions move data beteen registers
and memory
1ISC uses this architecture
1educes instruction length
()
)oadStore Architecture cont-)oadStore Architecture
cont-
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 64101
)oadStore Architecture cont-)oadStore Architecture cont-
Sample instructionsload $daddr $d = [addr]
store addr$s (addr) = $s
add $d$s$samp $d = $s + $sampsub $d$s$samp $d = $s - $samp
mult $d$s$samp $d = $s $samp
um+er of Addresses cont-um+er of Addresses
cont-
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 65101
um+er of Addresses cont-um+er of Addresses cont-
ampleC statement
A = B + C D E + F + A
1uialent co)eload $B mult $amp$amp$
load $ampC add $amp$amp$
load $D sub $amp$amp$
load $E add $amp$amp$
load $F add $amp$amp$
load $A store A$amp
0lo1 of Control 0lo1 of Control
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 66101
0lo1 of Control 0lo1 of Control
efault is seJuential flo
Several instructions alter this defaulteecution
8ranches$ 4nconditional
$ Conditional
$ elayed branches Procedure calls
$ elayed procedure calls
0lo1 of Control cont-0lo1 of Control
cont-
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 67101
0lo1 of Control cont-0lo1 of Control cont-
8ranches
4nconditional
$ Absolute address
$ PC$relative
U Target address is specified relative to PC contents U 1elocatable code
ltample6 MIPS
$ Absolute address
9 target
$ PC$relative
8 target
0lo1 of Control cont-0lo1 of Control
cont-
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 68101
0lo1 of Control cont- -
e entium e R
0lo1 of Control cont-0lo1 of Control
cont-
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 69101
lo1 o Co t ol co t- -
8ranches
Conditional
$ ump is ta0en only if the condition is met
To types
$ Set$Then$ump
U Condition testing is separated from branching U Condition code registers are used to convey the condition test
result
U Condition code registers 0eep a record of the status of the last A34 operation such as overflo condition
$ ltample6 Pentium codecm AB comare A ad B
e taret um e0ual
0lo1 of Control cont-0lo1 of Control
cont-
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 70101
- -
$ Test$and$ump
U Single instruction performs condition testing and branching
$ ltample6 MIPS instruction
be0 $src$srcamptaret
umps to target if 1src E 1src
elayed branching
Control is transferred after eecuting the instruction thatfollos the branch instruction
$ This instruction slot is called delay slot Improves efficiency
ighly pipelined 1ISC processors support
0lo1 of Control cont-0lo1 of Control
cont-
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 71101
- -
Procedure calls Lacilitate modular programming
1eJuire to pieces of information to return
$ ltnd of procedure U Pentium
uses ret instruction
U MIPS
uses 9r instruction
$ 1eturn address U In a (special) register
MIPS allos any general$purpose register
U 5n the stac0
Pentium
0lo1 of Control cont-0lo1 of Control
cont-
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 72101
- -
0lo1 of Control cont-0lo1 of Control
cont-
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 73101
- -
elay slot
Parameter PassingParameter Passin
g
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 74101
gg
To basic techniJues 1egister$based (eg PoerPC MIPS)
$ Internal registers are used U Laster
U 3imit the number of parameters U 1ecursive procedure
Stac0$based (eg Pentium)
$ Stac0 is used U More general
2 perand Types2
perand Types
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 75101
p yp
Instructions support basic data types
Characters Integers
Lloating$point
Instruction overload
Same instruction for different data types
ltample6 Pentium mo1 A2address loads a 3-bt 1alue
mo1 Aaddress loads a -bt 1alue
mo1 EAaddress loads a amp-bt 1alue
perand Types
perand Types
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 76101
Separate instructions
Instructions specify the operand si-e
ltample6 MIPS
lb $destaddress loads a b4te
l $destaddress loads a al5ord( bts)
l5 $destaddress loads a 5ord
(amp bts)
ld $destaddress loads a double5ord
( bts)imilar instruction store
3 Addressing Modes3 Addressin
g Modes
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 77101
o the operands are specified
5perands can be in three places
$ 1egisters U 1egister addressing mode
$ Part of instruction U Constant
U Immediate addressing mode
U All processors support these to addressing modes
$ Memory U ifference beteen 1ISC and CISC
U CISC supports a large variety of addressing modes
U 1ISC follos load2store architecture
4 Instruction Types4 Instruction T
ypes
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 78101
Several types of instructions
ata movement$ Pentium6 mo1 destsrc
$ Some do not provide direct data movement instructions
$ Indirect data movement
add $dest$src6 $dest = $src+6
Arithmetic and 3ogical
$ Arithmetic U Integer and floating$point signed and unsigned U add subtract multiply divide
$ 3ogical U andB orB notB 7or
Instruction Types cont-Instruction T
ypes cont-
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 79101
Condition code bits
S6 Sign bit (gt E F E $)
6 Vero bit (gt E non-ero E -ero)
$6 5verflo bit (gt E no overflo E overflo)
C6 Carry bit (gt E no carry E carry)
ltample6 Pentium
cm coutamp comare cout to amp
subtract amp rom cout
e taret um e0ual
Instruction Types cont-Instruction T
ypes cont-
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 80101
Llo control and I25 instructions
$ 8ranch
$ Procedure call
$ Interrupts
I25 instructions$ Memory$mapped I25
U Most processors support memory$mapped I25
U 7o separate instructions for I25
$ Isolated I25 U Pentium supports isolated I25
U Separate I25 instructions
Ao7ort read from an IO ort
out o7ortA rte to an IO ort
5 Instruction 0ormats5 Instruction 0ormats
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 81101
To types
Lied$length$ 4sed by 1ISC processors
$ $bit 1ISC processors use $bits ide instructions U ltamples6 SPA1C MIPS PoerPC
ariable$length
$ 4sed by CISC processors
$ Memory operands need more bits to specify
5pcode
MaOor and eact operation
Examples of Instruction 0ormatsExam
ples of Instruction 0ormats
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 82101
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 83101
ISC e)uce) Instruction Set Computer 3
ersus
CISC Comple Instruction Set Computer3
RISC s CISCRISC s CISC
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 84101
0
RISC s CISCRISC s CISC
The underlying philosophy of 1ISC machines is that asystem is better able to manage program eecutionhen the program consists of only a fe differentinstructions that are the same length and reJuire thesame number of cloc0 cycles to decode and eecute
1ISC systems access memory only ith eplicit loadand store instructions
In CISC systems many different 0inds of instructionsaccess memory ma0ing instruction length variableand fetch$decode$eecute time unpredictable
RISC s CISCRISC s CISC
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 85101
The difference beteen CISC and 1ISC becomesevident through the basic computer performanceeJuation6
1ISC systems shorten eecution time by reducingthe cloc0 cycles per instruction
CISC systems improve performance by reducing thenumber of instructions per program
RISC s CISCRISC s CISC
RISC s CISCRISC s CISC
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 86101
(
The simple instruction set of 1ISC machinesenables control units to be hardired for maimumspeed
The more comple$$ and variable$$ instruction set of
CISC machines reJuires microcode$based controlunits that interpret instructions as they are fetchedfrom memory This translation ta0es time
Dith fied$length instructions 1ISC lends itself topipelining and speculative eecution
Consider the program fragments:
CISC:
mov ax, 10
mov bx, 5
mul bx, ax
RISC:
mov ax, 0
mov bx, 10
mov cx, 5
Begin: add ax, bx
loop Begin
The total clock cycles for the CISC version might be: (2 movs × 1 cycle) + (1 mul × 30 cycles) = 32 cycles
While the clock cycles for the RISC version are: (3 movs × 1 cycle) + (5 adds × 1 cycle) + (5 loops × 1 cycle) = 13 cycles
With the RISC clock cycle also being shorter, RISC gives us much faster execution speeds.
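The cycle counts for the two fragments can be reproduced with a small trace counter. This is a sketch under stated assumptions: operand values of 10 and 5, one cycle for each mov, add and loop, and 30 cycles for the CISC multiply. The function names are hypothetical.

```python
# Cycle costs are assumptions read off the slide: 1 cycle for mov/add/loop,
# 30 cycles for the CISC multiply.
CYCLES = {"mov": 1, "add": 1, "loop": 1, "mul": 30}


def cisc_multiply(a=10, b=5):
    """mov ax,a / mov bx,b / mul bx,ax -> (product, cycles)."""
    trace = ["mov", "mov", "mul"]
    return a * b, sum(CYCLES[op] for op in trace)


def risc_multiply(a=10, b=5):
    """Multiply by repeated addition (b must be >= 1) -> (product, cycles)."""
    ax, bx, cx = 0, a, b
    trace = ["mov", "mov", "mov"]
    while True:
        ax += bx
        trace.append("add")
        cx -= 1                 # loop decrements cx ...
        trace.append("loop")
        if cx == 0:             # ... and falls through once it reaches zero
            break
    return ax, sum(CYCLES[op] for op in trace)


print(cisc_multiply())  # (50, 32)
print(risc_multiply())  # (50, 13)
```

Note that the RISC count grows with the multiplier, so the comparison depends on the operand values as well as on the assumed cycle costs.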
Because of their load-store ISAs, RISC architectures require a large number of CPU registers.
These registers provide fast access to data during sequential program execution.
They can also be employed to reduce the overhead typically caused by passing parameters to subprograms.
Instead of pulling parameters off of a stack, the subprogram is directed to use a subset of registers.
[Figure: how registers can be overlapped in a RISC system]
The current window pointer (CWP) points to the active register window.
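The overlap can be illustrated with a toy model. This is a minimal sketch, not a description of any particular processor: the window sizes (8 in, 8 local, 8 out registers), the class name and the method names are assumptions, and window overflow (which real hardware handles by spilling to memory) is not modelled.

```python
class WindowedRegisterFile:
    """Toy model of overlapping register windows.

    Assumed layout: each window sees 8 'in', 8 'local' and 8 'out'
    registers, and the 'out' registers of one window are physically
    the 'in' registers of the next window.
    """

    IN, LOCAL, OUT = 8, 8, 8
    STRIDE = IN + LOCAL  # physical registers consumed per window

    def __init__(self, windows=4):
        self.windows = windows
        self.regs = [0] * (self.STRIDE * windows)  # circular physical file
        self.cwp = 0                               # current window pointer

    def _index(self, kind, n):
        offset = {"in": 0, "local": self.IN, "out": self.STRIDE}[kind]
        return (self.cwp * self.STRIDE + offset + n) % len(self.regs)

    def read(self, kind, n):
        return self.regs[self._index(kind, n)]

    def write(self, kind, n, value):
        self.regs[self._index(kind, n)] = value

    def call(self):
        """Enter a subprogram: advance the CWP to the next window."""
        self.cwp = (self.cwp + 1) % self.windows

    def ret(self):
        """Leave a subprogram: move the CWP back to the caller's window."""
        self.cwp = (self.cwp - 1) % self.windows


rf = WindowedRegisterFile()
rf.write("out", 0, 42)      # caller places a parameter in an 'out' register
rf.call()
print(rf.read("in", 0))     # callee sees it as an 'in' register: 42
rf.write("in", 0, 7)        # callee leaves a result in the same register
rf.ret()
print(rf.read("out", 0))    # caller reads the result: 7
```

No data is copied on call or return; only the CWP changes, which is why parameter passing through register windows avoids stack traffic.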
It is becoming increasingly difficult to distinguish RISC architectures from CISC architectures.
Some RISC systems provide more extravagant instruction sets than some CISC systems.
Some systems combine both approaches.
The following two slides summarize the characteristics that traditionally typify the differences between these two architectures.
RISC
- Multiple register sets
- Three operands per instruction
- Parameter passing through register windows
- Single-cycle instructions
- Hardwired control
- Highly pipelined
- Simple instructions, few in number
- Fixed-length instructions
- Complexity in compiler
- Only LOAD/STORE instructions access memory
- Few addressing modes
CISC
- Single register set
- One or two register operands per instruction
- Parameter passing through memory
- Multiple-cycle instructions
- Microprogrammed control
- Less pipelined
- Many complex instructions
- Variable-length instructions
- Complexity in microcode
- Many instructions can access memory
- Many addressing modes
Summary
Instruction Set Design Issues
Instruction set design issues include:
Where are operands stored?
- registers, memory, stack, accumulator
How many explicit operands are there?
- 0, 1, 2, or 3
How is the operand location specified?
- register, immediate, indirect, ...
What type and size of operands are supported?
- byte, int, float, double, string, vector, ...
What operations are supported?
- add, sub, mul, move, compare, ...
More About General Purpose Registers
Why do almost all new architectures use GPRs?
Registers are much faster than memory (even cache)
- Register values are available immediately
- When memory isn't ready, the processor must wait ("stall")
Registers are convenient for variable storage
- Compiler assigns some variables just to registers
- More compact code, since small fields specify registers (compared to memory addresses)
[Figure: memory hierarchy: Processor (Registers), Cache, Memory, Disk]
What Operations are Needed?
Arithmetic + Logical
- Integer arithmetic: ADD, SUB, MULT, DIV, SHIFT
- Logical operation: AND, OR, XOR, NOT
Data Transfer - copy, load, store
Control - branch, jump, call, return
Floating Point - ADDF, MULF, DIVF
- Same as arithmetic but usually take bigger operands
Decimal - ADDD, CONVERT
String - move, compare, search
Graphics - pixel and vertex, compression/decompression operations
Stack Architecture: Pros and Cons
Pros
- Good code density (implicit top of stack)
- Low hardware requirements
- Easy to write a simpler compiler for stack architectures
Cons
- Stack becomes the bottleneck
- Little ability for parallelism or pipelining
- Data is not always at the top of the stack when needed, so additional instructions like TOP and SWAP are needed
- Difficult to write an optimizing compiler for stack architectures
Accumulator Architecture: Pros and Cons
Pros
- Very low hardware requirements
- Easy to design and understand
Cons
- Accumulator becomes the bottleneck
- Little ability for parallelism or pipelining
- High memory traffic
Memory-Memory Architecture: Pros and Cons
Pros
- Requires fewer instructions (especially if 3 operands)
- Easy to write compilers for (especially if 3 operands)
Cons
- Very high memory traffic (especially if 3 operands)
- Variable number of clocks per instruction
- With two operands, more data movements are required
Memory-Register Architecture: Pros and Cons
Pros
- Some data can be accessed without loading first
- Instruction format easy to encode
- Good code density
Cons
- Operands are not equivalent (poor orthogonality)
- Variable number of clocks per instruction
- May limit number of registers
Instruction Set Design Issues
- Number of Addresses
- Flow of Control
- Operand Types
- Addressing Modes
- Instruction Types
- Instruction Formats
Number of Addresses
Four categories
3-address machines
- 2 for the source operands and one for the result
2-address machines
- One address doubles as source and result
1-address machines
- Accumulator machines
- Accumulator is used for one source and result
0-address machines
- Stack machines
- Operands are taken from the stack
- Result goes onto the stack
Three-address machines
- Two for the source operands, one for the result
- RISC processors use three addresses
Sample instructions
add dest,src1,src2 ; M(dest)=[src1]+[src2]
sub dest,src1,src2 ; M(dest)=[src1]-[src2]
mult dest,src1,src2 ; M(dest)=[src1]*[src2]
Three addresses: Operand 1, Operand 2, Result
Example: a = b + c
Three-address instruction formats are not common because they require a relatively long instruction format to hold the three address references.
Example
C statement
A = B + C * D - E + F + A
Equivalent code:
mult T,C,D ; T = C*D
add T,T,B ; T = B+C*D
sub T,T,E ; T = B+C*D-E
add T,T,F ; T = B+C*D-E+F
add A,T,A ; A = B+C*D-E+F+A
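The three-address sequence above can be checked with a tiny interpreter. This is a sketch under stated assumptions: memory is modelled as a Python dict keyed by variable name, the function name is hypothetical, and the initial values of A through F are arbitrary test data.

```python
def run_three_address(program, memory):
    """Execute (op, dest, src1, src2) tuples; every operand is a memory name."""
    ops = {
        "add": lambda x, y: x + y,
        "sub": lambda x, y: x - y,
        "mult": lambda x, y: x * y,
    }
    for op, dest, src1, src2 in program:
        memory[dest] = ops[op](memory[src1], memory[src2])
    return memory


program = [
    ("mult", "T", "C", "D"),  # T = C*D
    ("add", "T", "T", "B"),   # T = B+C*D
    ("sub", "T", "T", "E"),   # T = B+C*D-E
    ("add", "T", "T", "F"),   # T = B+C*D-E+F
    ("add", "A", "T", "A"),   # A = B+C*D-E+F+A
]
memory = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6, "T": 0}
run_three_address(program, memory)
print(memory["A"])  # 16, the same as 2 + 3*4 - 5 + 6 + 1
```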
Two-address machines
- One address doubles (for source operand and result)
- Last example makes a case for it: address T is used twice
Sample instructions
load dest,src ; M(dest)=[src]
add dest,src ; M(dest)=[dest]+[src]
sub dest,src ; M(dest)=[dest]-[src]
mult dest,src ; M(dest)=[dest]*[src]
Two addresses: one address doubles as operand and result
Example: a = a + b
The two-address format reduces the space requirement but also introduces some awkwardness. To avoid altering the value of an operand, a MOVE instruction is used to move one of the values to a result or temporary location before performing the operation.
Example
C statement
A = B + C * D - E + F + A
Equivalent code:
load T,C ; T = C
mult T,D ; T = C*D
add T,B ; T = B+C*D
sub T,E ; T = B+C*D-E
add T,F ; T = B+C*D-E+F
add A,T ; A = B+C*D-E+F+A
One-address machines
- Use a special set of registers called accumulators
- Specify one source operand and receive the result
- Called accumulator machines
Sample instructions
load addr ; accum = [addr]
store addr ; M[addr] = accum
add addr ; accum = accum + [addr]
sub addr ; accum = accum - [addr]
mult addr ; accum = accum * [addr]
Example
C statement
A = B + C * D - E + F + A
Equivalent code:
load C ; load C into accum
mult D ; accum = C*D
add B ; accum = C*D+B
sub E ; accum = B+C*D-E
add F ; accum = B+C*D-E+F
add A ; accum = B+C*D-E+F+A
store A ; store accum contents in A
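The accumulator sequence above can be traced with a short interpreter. This is a sketch, assuming a single accumulator, a dict standing in for memory, and arbitrary initial values for A through F; the function name is hypothetical.

```python
def run_accumulator(program, memory):
    """Execute (op, addr) pairs on a one-address (accumulator) machine."""
    accum = 0
    for op, addr in program:
        if op == "load":
            accum = memory[addr]
        elif op == "store":
            memory[addr] = accum
        elif op == "add":
            accum = accum + memory[addr]
        elif op == "sub":
            accum = accum - memory[addr]
        elif op == "mult":
            accum = accum * memory[addr]
        else:
            raise ValueError(f"unknown instruction: {op}")
    return accum


program = [
    ("load", "C"), ("mult", "D"), ("add", "B"), ("sub", "E"),
    ("add", "F"), ("add", "A"), ("store", "A"),
]
memory = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
run_accumulator(program, memory)
print(memory["A"])  # 16
```

Every instruction except the implicit accumulator use touches memory, which is the "high memory traffic" drawback of accumulator designs.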
Zero-address machines
- Stack supplies operands and receives the result
- Special instructions to load and store use an address
- Called stack machines (Ex: HP3000, Burroughs B5500)
Sample instructions
push addr ; push([addr])
pop addr ; pop([addr])
add ; push(pop + pop)
sub ; push(pop - pop)
mult ; push(pop * pop)
Example
C statement
A = B + C * D - E + F + A
Equivalent code:
push E
push C
push D
mult
push B
add
sub
push F
add
push A
add
pop A
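The stack sequence above can be traced with a minimal stack machine. This sketch assumes that `sub` computes (top of stack) minus (next entry), following the sample-instruction definition push(pop - pop), which is why E is pushed first. The function name and initial memory values are hypothetical.

```python
def run_stack(program, memory):
    """Execute a zero-address program; only push/pop carry an address."""
    stack = []
    for instr in program:
        op = instr[0]
        if op == "push":
            stack.append(memory[instr[1]])
        elif op == "pop":
            memory[instr[1]] = stack.pop()
        else:
            first = stack.pop()    # top of stack
            second = stack.pop()   # entry below it
            if op == "add":
                stack.append(first + second)
            elif op == "sub":
                stack.append(first - second)
            elif op == "mult":
                stack.append(first * second)
            else:
                raise ValueError(f"unknown instruction: {op}")
    return stack


program = [
    ("push", "E"), ("push", "C"), ("push", "D"), ("mult",),
    ("push", "B"), ("add",), ("sub",), ("push", "F"), ("add",),
    ("push", "A"), ("add",), ("pop", "A"),
]
memory = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
leftover = run_stack(program, memory)
print(memory["A"], leftover)  # 16 []
```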
Load/Store Architecture
- Instructions expect operands in internal processor registers
- Special LOAD and STORE instructions move data between registers and memory
- RISC uses this architecture
- Reduces instruction length
Sample instructions
load Rd,addr ; Rd = [addr]
store addr,Rs ; (addr) = Rs
add Rd,Rs1,Rs2 ; Rd = Rs1 + Rs2
sub Rd,Rs1,Rs2 ; Rd = Rs1 - Rs2
mult Rd,Rs1,Rs2 ; Rd = Rs1 * Rs2
Example
C statement
A = B + C * D - E + F + A
Equivalent code:
load R1,B
load R2,C
load R3,D
load R4,E
load R5,F
load R6,A
mult R2,R2,R3
add R2,R2,R1
sub R2,R2,R4
add R2,R2,R5
add R2,R2,R6
store A,R2
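The load/store sequence above can be sketched as a small interpreter that also counts memory accesses. The register numbering R1 to R6, the function name and the initial memory values are assumptions made for illustration.

```python
def run_load_store(program, memory):
    """Run a load/store program; return (registers, memory access count)."""
    registers = {}
    memory_accesses = 0
    alu = {
        "add": lambda x, y: x + y,
        "sub": lambda x, y: x - y,
        "mult": lambda x, y: x * y,
    }
    for instr in program:
        op = instr[0]
        if op == "load":
            _, rd, addr = instr
            registers[rd] = memory[addr]
            memory_accesses += 1
        elif op == "store":
            _, addr, rs = instr
            memory[addr] = registers[rs]
            memory_accesses += 1
        else:
            # Arithmetic works on registers only and never touches memory.
            _, rd, rs1, rs2 = instr
            registers[rd] = alu[op](registers[rs1], registers[rs2])
    return registers, memory_accesses


program = [
    ("load", "R1", "B"), ("load", "R2", "C"), ("load", "R3", "D"),
    ("load", "R4", "E"), ("load", "R5", "F"), ("load", "R6", "A"),
    ("mult", "R2", "R2", "R3"), ("add", "R2", "R2", "R1"),
    ("sub", "R2", "R2", "R4"), ("add", "R2", "R2", "R5"),
    ("add", "R2", "R2", "R6"), ("store", "A", "R2"),
]
memory = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
registers, accesses = run_load_store(program, memory)
print(memory["A"], accesses)  # 16 7
```

The program is longer than the three-address memory version, but memory is touched only seven times (six loads and one store), and each arithmetic instruction needs only short register fields.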
Flow of Control
Default is sequential flow
Several instructions alter this default execution
Branches
- Unconditional
- Conditional
- Delayed branches
Procedure calls
- Delayed procedure calls
Branches
Unconditional
- Absolute address
- PC-relative
  - Target address is specified relative to PC contents
  - Relocatable code
Example: MIPS
- Absolute address
j target
- PC-relative
b target
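How the two kinds of target are formed can be sketched as follows. This is a simplified model of 32-bit MIPS addressing, assuming a 16-bit signed word offset for branches and a 26-bit word index for `j`, both applied relative to the address of the instruction after the branch. The function names are hypothetical.

```python
def sign_extend16(value):
    """Interpret the low 16 bits of value as a signed number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def pc_relative_target(pc, offset16):
    """Branch target: next instruction address plus the word offset times 4."""
    return (pc + 4 + (sign_extend16(offset16) << 2)) & 0xFFFFFFFF


def absolute_target(pc, index26):
    """Jump target: top 4 bits of PC+4 joined with the 26-bit word index."""
    return ((pc + 4) & 0xF0000000) | ((index26 & 0x03FFFFFF) << 2)


print(hex(pc_relative_target(0x00400000, 3)))       # 0x400010
print(hex(pc_relative_target(0x00400000, 0xFFFF)))  # 0x400000
print(hex(absolute_target(0x00400000, 0x0100000)))  # 0x400000
```

Moving the code changes `pc` but not the encoded offset, so a PC-relative branch still reaches the same relative location; this is what makes such code relocatable.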
Branches
Conditional
- Jump is taken only if the condition is met
Two types
- Set-Then-Jump
  - Condition testing is separated from branching
  - Condition code registers are used to convey the condition test result
  - Condition code registers keep a record of the status of the last ALU operation, such as an overflow condition
- Example: Pentium code
cmp AX,BX ; compare AX and BX
je target ; jump if equal
- Test-and-Jump
  - Single instruction performs condition testing and branching
- Example: MIPS instruction
beq Rsrc1,Rsrc2,target ; jumps to target if Rsrc1 = Rsrc2
Delayed branching
- Control is transferred after executing the instruction that follows the branch instruction
- This instruction slot is called the delay slot
- Improves efficiency
- Highly pipelined RISC processors support delayed branching
Procedure calls
- Facilitate modular programming
- Require two pieces of information to return
End of procedure
- Pentium uses the ret instruction
- MIPS uses the jr instruction
Return address
- In a (special) register: MIPS allows any general-purpose register
- On the stack: Pentium
[Figure: delay slot]
Parameter Passing
Two basic techniques
Register-based (e.g., PowerPC, MIPS)
- Internal registers are used
  - Faster
  - Limit the number of parameters
  - Recursive procedure
Stack-based (e.g., Pentium)
- Stack is used
  - More general
2 Operand Types
Instructions support basic data types
- Characters
- Integers
- Floating-point
Instruction overload
- Same instruction for different data types
- Example: Pentium
mov AL,address ; loads an 8-bit value
mov AX,address ; loads a 16-bit value
mov EAX,address ; loads a 32-bit value
Separate instructions
- Instructions specify the operand size
- Example: MIPS
lb Rdest,address ; loads a byte
lh Rdest,address ; loads a halfword (16 bits)
lw Rdest,address ; loads a word (32 bits)
ld Rdest,address ; loads a doubleword (64 bits)
Similar instructions exist for store
3 Addressing Modes
How the operands are specified
Operands can be in three places
- Registers
  - Register addressing mode
- Part of instruction
  - Constant
  - Immediate addressing mode
  - All processors support these two addressing modes
- Memory
  - Difference between RISC and CISC
  - CISC supports a large variety of addressing modes
  - RISC follows load/store architecture
4 Instruction Types
Several types of instructions
Data movement
- Pentium: mov dest,src
- Some do not provide direct data movement instructions
- Indirect data movement
add Rdest,Rsrc,0 ; Rdest = Rsrc+0
Arithmetic and Logical
- Arithmetic
  - Integer and floating-point, signed and unsigned
  - add, subtract, multiply, divide
- Logical
  - and, or, not, xor
Condition code bits
- S: Sign bit (0 = +, 1 = -)
- Z: Zero bit (0 = nonzero, 1 = zero)
- O: Overflow bit (0 = no overflow, 1 = overflow)
- C: Carry bit (0 = no carry, 1 = carry)
Example: Pentium
cmp count,25 ; compare count to 25 (subtract 25 from count)
je target ; jump if equal
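How a compare sets the four condition code bits can be sketched as below. This is an illustrative model, assuming 32-bit operands and that the carry bit records a borrow out of the subtraction (the x86 convention for `cmp`). The function name and the flag dictionary are hypothetical.

```python
def compare(a, b, bits=32):
    """Subtract b from a, discard the result, and return the S, Z, O, C bits."""
    mask = (1 << bits) - 1
    sign = 1 << (bits - 1)
    a &= mask
    b &= mask
    result = (a - b) & mask
    return {
        "S": int(bool(result & sign)),                  # sign of the result
        "Z": int(result == 0),                          # result is zero
        "O": int(bool((a ^ b) & (a ^ result) & sign)),  # signed overflow
        "C": int(a < b),                                # unsigned borrow
    }


def jump_if_equal(flags):
    """je: the branch is taken when the zero bit is set."""
    return flags["Z"] == 1


print(compare(25, 25))         # {'S': 0, 'Z': 1, 'O': 0, 'C': 0}
print(compare(24, 25))         # {'S': 1, 'Z': 0, 'O': 0, 'C': 1}
print(compare(0x80000000, 1))  # {'S': 0, 'Z': 0, 'O': 1, 'C': 0}
```

The last call subtracts 1 from the most negative 32-bit value, which cannot be represented as a signed result, so only the overflow bit is set.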
Flow control and I/O instructions
- Branch
- Procedure call
- Interrupts
I/O instructions
- Memory-mapped I/O
  - Most processors support memory-mapped I/O
  - No separate instructions for I/O
- Isolated I/O
  - Pentium supports isolated I/O
  - Separate I/O instructions
in AX,io_port ; read from an I/O port
out io_port,AX ; write to an I/O port
5 Instruction Formats
Two types
Fixed-length
- Used by RISC processors
- 32-bit RISC processors use 32-bit wide instructions
  - Examples: SPARC, MIPS, PowerPC
Variable-length
- Used by CISC processors
- Memory operands need more bits to specify
Opcode
- Major and exact operation
Examples of Instruction Formats
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 83101
ISC e)uce) Instruction Set Computer 3
ersus
CISC Comple Instruction Set Computer3
RISC s CISCRISC s CISC
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 84101
0
RISC s CISCRISC s CISC
The underlying philosophy of 1ISC machines is that asystem is better able to manage program eecutionhen the program consists of only a fe differentinstructions that are the same length and reJuire thesame number of cloc0 cycles to decode and eecute
1ISC systems access memory only ith eplicit loadand store instructions
In CISC systems many different 0inds of instructionsaccess memory ma0ing instruction length variableand fetch$decode$eecute time unpredictable
RISC s CISCRISC s CISC
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 85101
The difference beteen CISC and 1ISC becomesevident through the basic computer performanceeJuation6
1ISC systems shorten eecution time by reducingthe cloc0 cycles per instruction
CISC systems improve performance by reducing thenumber of instructions per program
RISC s CISCRISC s CISC
RISC s CISCRISC s CISC
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 86101
(
The simple instruction set of 1ISC machinesenables control units to be hardired for maimumspeed
The more comple$$ and variable$$ instruction set of
CISC machines reJuires microcode$based controlunits that interpret instructions as they are fetchedfrom memory This translation ta0es time
Dith fied$length instructions 1ISC lends itself topipelining and speculative eecution
RISC s CISCRISC s CISC
RISC s CISCRISC s CISC
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 87101
mo1 a8 6 mo1 b8 6 mo1 c8
Be add a8 b8 loo Be
Consider the the program fragments6
The total cloc0 cycles for the CISC version might be6(amp mo1s c4cle) + ( mul 6 c4cles) = amp c4cles
Dhile the cloc0 cycles for the 1ISC version is6
( mo1s c4cle) + ( adds c4cle) + ( loos c4cle) = c4cles
Dith 1ISC cloc0 cycle being shorter 1ISC gives usmuch faster eecution speeds
mo1 a8 6 mo1 b8 mul b8 a8
CISC RISC
RISC s CISCRISC s CISC
RISC s CISCRISC s CISC
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 88101
8ecause of their load$store ISAs 1ISC architecturesreJuire a large number of CP4 registers
These register provide fast access to data duringseJuential program eecution
They can also be employed to reduce the overheadtypically caused by passing parameters tosubprograms
Instead of pulling parameters off of a stac0 the
subprogram is directed to use a subset of registers
RISC s CISCRISC s CISC
RISC s CISCRISC s CISC
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 89101
3
This is horegisters canbe overlappedin a 1ISCsystem
The currentindo pointer (CDP) pointsto the activeregister
indo
RISC s CISCRISC s CISC
RISC s CISCRISC s CISC
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 90101
34
It is becoming increasingly difficult to distinguish1ISC architectures from CISC architectures
Some 1ISC systems provide more etravagantinstruction sets than some CISC systems
Some systems combine both approaches The folloing to slides summari-e the
characteristics that traditionally typify the differencesbeteen these to architectures
RISC s CISCRISC s CISC
RISC s CISCRISC s CISC
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 91101
31
RISC Multiple reister sets4
Three operan)s perinstruction4
Parameter passinthrouh reister5in)o5s4
Sinle-ccle
instructions4 7ar)5ire)
control4
7ihl pipeline)4
CISC Sinle reister set4
ne or t5o reisteroperan)s per
instruction4 Parameter passin
throuh memor4
Multiple ccle
instructions4 Microproramme)
control4
(ess pipeline)4ontinued
RISC s CISCRISC s CISC
RISC s CISCRISC s CISC
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 92101
32
RISC Simple instructions
fe5 in num9er4
ie) lenth
instructions4 Compleit in
compiler4
nl 29ADT9$E
instructions accessmemor4
e5 a))ressin mo)es4
CISC Man comple
instructions4
aria9le lenth
instructions4 Compleit in
microco)e4
Man instructions can
access memor4
Man a))ressinmo)es4
RISC s CISCRISC s CISC
RISC s CISCRISC s CISC
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 93101
RISC s CISCRISC s CISC
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 94101
Summar
Instruction Set Design IssuesInstruction Set Desi
gn Issues
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 95101
g
Instruction set )esin issues inclu)e here are operan)s store)lt
- reisters memor stac= accumulator
7o5 man eplicit operan)s are therelt
- 0 + 2 or amp
7o5 is the operan) location specifie)lt
- reister imme)iate in)irect 4 4 4
hat tpe gt sie of operan)s are supporte)lt
- 9te int float )ou9le strin ector4 4 4
hat operations are supporte)lt
- a)) su9 mul moe compare 4 4 4
More A+out 6eneral Purpose egistersMore A+out 6eneral Pu
rpose egisters
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 96101
h )o almost all ne5 architectures usePslt
eisters are much faster than memor eencache3
- eister alues are aaila9le imme)iatel
- hen memor isnt rea) processor must 5aitBstall3
eisters are conenient for aria9le storae
- Compiler assins some aria9les Dust to reisters
- More compact co)e since small fiel)s specifreisters
compare) to memor a))resses3Registers Cache
MemoryProcessor Disk
7hat perations are eeded7hat
perations are eeded
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 97101
3
Arithmetic E (oical
Inteer arithmetic A$$ SU MU(T $I S7IT
(oical operation AN$ NT
$ata Transfer - cop loa) store
Control - 9ranch Dump call return
loatin Point A$$ MU( $I 3 Same as arithmetic 9ut usuall ta=e 9ier operan)s
$ecimal - A$$$ CNT
Strin - moe compare search
raphics F piel an) erte compressionG)ecompression operations
Stacamps Architecture Pros and ConsStacamps Architecture Pros and Cons
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 98101
Stacamps Architecture Pros and ConsStacamps Architecture Pros and Cons
Pros oo) co)e )ensit implicit top of stac=3
(o5 har)5are re1uirements
as to 5rite a simpler compiler for stac= architectures
Cons Stac= 9ecomes the 9ottlenec=
(ittle a9ilit for parallelism or pipelinin
$ata is not al5as at the top of stac= 5hen nee) so a))itionalinstructions li=e TP an) SAP are nee)e)
$ifficult to 5rite an optimiin compiler for stac= architectures
Accumulators Architecture Pros and Cons
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 99101
Accumulators Architecture Pros and Cons
Pros U ery lo hardare reJuirements
U ltasy to design and understand
Cons U Accumulator becomes the bottlenec0
U 3ittle ability for parallelism or pipelining U igh memory traffic
Memory Memory Architecture Pros and Cons
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 100101
Memory3Memory Architecture Pros and Cons
Pros U 1eJuires feer instructions (especially if operands)
U ltasy to rite compilers for (especially if operands)
Cons U ery high memory traffic (especially if operands)
U ariable number of cloc0s per instruction
U Dith to operands more data movements are reJuired
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 101101
Memory3Register Architecture Pros and Cons
Pros U Some data can be accessed ithout loading first
U Instruction format easy to encode
U ood code density
Cons U 5perands are not eJuivalent (poor orthogonal)
U ariable number of cloc0s per instruction U May limit number of registers
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 54101
um+er of Addressesum+er of Addresses
Lour categories
$address machines$ for the source operands and one for the result
$address machines
$ 5ne address doubles as source and result
$address machine$ Accumulator machines
$ Accumulator is used for one source and result
gt$address machines
$ Stac0 machines
$ 5perands are ta0en from the stac0
$ 1esult goes onto the stac0
um+er of Addresses cont -um+er of Addresses
cont -
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 55101
um+er of Addresses cont-um+er of Addresses cont-
Three$address machines
To for the source operands one for the result
1ISC processors use three addresses
Sample instructions
add destsrc1src2
M(dest)=[src1]+[src2]
sub destsrc1src2
M(dest)=[src1]-[src2]
mult destsrc1src2
M(dest)=[src1][src2]
Three addresses
Operand 1 Operand 2 Result
Example a = b + c
Three-address instruction formats are not common because they reuire a
relatiely lon instruction format to hold the three address references
um+er of Addresses cont -um+er of Addresses
cont -
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 56101
um+er of Addresses cont-um+er of Addresses cont-
ltample
C statement
A C H D F 6 A
ltJuivalent code6
mult TCD T = CD
add TTB T = B+CD
sub TTE T = B+CD-E
add TTF T = B+CD-E+Fadd ATA A = B+CD-E+F+A
um+er of Addresses cont -um+er of Addresses
cont -
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 57101
um+er of Addresses cont-um+er of Addresses cont-
To$address machines
5ne address doubles (for source operand result)
3ast eample ma0es a case for it
$ Address T is used tice
Sample instructions
load destsrc M(dest)=[src]
add destsrc M(dest)=[dest]+[src]
sub destsrc M(dest)=[dest]-[src]
mult destsrc M(dest)=[dest][src]
Two Addresses
One address doubles as operand and resultExample a = a + b
The t$o-address formal reduces the space reuirement but also
introduces some a$$ardness To aoid alterin the alue of an
operand a ampOE instruction is used to moe one of the alues to a
result or temporary location before performin the operation
um+er of Addresses cont -um+er of Addresses
cont -
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 58101
um+er of Addresses cont-um+er of Addresses cont-
ltample
C statement
A C H D F 6 A
ltJuivalent code6
load TC T = C
mult TD T = CD
add TB T = B+CD
sub TE T = B+CD-Eadd TF T = B+CD-E+F
add AT A = B+CD-E+F+A
um+er of Addresses cont -um+er of Addresses
cont -
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 59101
um+er of Addresses cont-um+er of Addresses cont-
5ne$address machines 4se special set of registers called accumulators
$ Specify one source operand receive the result
Called accumulator machines
Sample instructions
load addr accum = [addr]
store addr M[addr] = accumadd addr accum = accum + [addr]
sub addr accum = accum - [addr]
mult addr accum = accum [addr]
um+er of Addresses cont -um+er of Addresses
cont -
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 60101
um+er of Addresses cont-um+er of Addresses cont-
ltample
C statementA C H D F 6 A
ltJuivalent code6
load C load C to accum
mult D accum = CD
add B accum = CD+B
sub E accum = B+CD-Eadd F accum = B+CD-E+F
add A accum = B+CD-E+F+A
store A store accum cotets A
um+er of Addresses cont -um+er of Addresses
cont -
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 61101
um+er of Addresses cont-um+er of Addresses cont-
Vero$address machines
Stac0 supplies operands and receives the result$ Special instructions to load and store use an address
Called stac0 machines (lt6 Pgtgtgt 8urroughs 8gtgt)
Sample instructions
us addr us([addr])
o addr o([addr])
add us(o + o)
sub us(o - o) mult us(o o)
um+er of Addresses cont -um+er of Addresses
cont-
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 62101
um+er of Addresses cont-um+er of Addresses cont-
ltample
C statement
A C H D F 6 A
ltJuivalent code6
us E sub
us C us F
us D add
Mult us A
us B add
add o A
Load/Store Architecture
Instructions expect operands in internal processor registers
Special LOAD and STORE instructions move data between registers and memory
RISC uses this architecture
Reduces instruction length
Load/Store Architecture (cont.)
Sample instructions
    load $d,addr      ; $d = [addr]
    store addr,$s     ; (addr) = $s
    add $d,$s1,$s2    ; $d = $s1 + $s2
    sub $d,$s1,$s2    ; $d = $s1 - $s2
    mult $d,$s1,$s2   ; $d = $s1 * $s2
Number of Addresses (cont.)
Example
C statement
    A = B + C * D - E + F + A
Equivalent code:
    load $1,B
    load $2,C
    load $3,D
    load $4,E
    load $5,F
    load $6,A
    mult $2,$2,$3
    add $2,$2,$1
    sub $2,$2,$4
    add $2,$2,$5
    add $2,$2,$6
    store A,$2
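The load/store sequence can be traced in the same way, this time counting how often memory is touched. This is a sketch under assumptions: the register numbering ($1 to $6 holding B, C, D, E, F, A), the `run_load_store` helper, and the sample values are illustrative, not taken from the slides.

```python
import operator

OPS = {"add": operator.add, "sub": operator.sub, "mult": operator.mul}

def run_load_store(program, memory):
    """Run a load/store program; return (memory, number of memory accesses)."""
    regs = {}
    accesses = 0
    for instr in program:
        op = instr[0]
        if op == "load":
            _, rd, addr = instr
            regs[rd] = memory[addr]
            accesses += 1
        elif op == "store":
            _, addr, rs = instr
            memory[addr] = regs[rs]
            accesses += 1
        else:
            # Arithmetic works on registers only
            _, rd, rs1, rs2 = instr
            regs[rd] = OPS[op](regs[rs1], regs[rs2])
    return memory, accesses

# A = B + C * D - E + F + A
program = [("load", 1, "B"), ("load", 2, "C"), ("load", 3, "D"),
           ("load", 4, "E"), ("load", 5, "F"), ("load", 6, "A"),
           ("mult", 2, 2, 3), ("add", 2, 2, 1), ("sub", 2, 2, 4),
           ("add", 2, 2, 5), ("add", 2, 2, 6), ("store", "A", 2)]
sample = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}  # assumed values
result, accesses = run_load_store(program, dict(sample))
```

The program is longer than the accumulator version, but each operand is read from memory once and only the final result is written back.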
Flow of Control
Default is sequential flow
Several instructions alter this default execution
Branches
- Unconditional
- Conditional
- Delayed branches
Procedure calls
- Delayed procedure calls
Flow of Control (cont.)
Branches
Unconditional
- Absolute address
- PC-relative
  * Target address is specified relative to PC contents
  * Relocatable code
Example: MIPS
- Absolute address
    j target
- PC-relative
    b target
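Why PC-relative branches give relocatable code can be shown with a few lines of arithmetic. This is a simplified sketch: it assumes the offset is added directly to the address of the branch itself, whereas real ISAs typically add a scaled offset to an already-incremented PC. The function names and addresses are hypothetical.

```python
def pc_relative_target(pc, offset):
    """Target of a PC-relative branch: the encoded offset plus the PC contents."""
    return pc + offset

def absolute_target(encoded_address):
    """Target of an absolute branch: exactly the address stored in the instruction."""
    return encoded_address

# A branch at 0x1000 that jumps to a label at 0x1020 (assumed addresses)
branch_pc, label = 0x1000, 0x1020
offset = label - branch_pc        # what a PC-relative branch would encode

# Load the same code 0x4000 bytes higher without changing the encoding
shift = 0x4000
moved_relative = pc_relative_target(branch_pc + shift, offset)
moved_absolute = absolute_target(label)
```

After the move, the PC-relative branch still lands on the relocated label, while the absolute branch still points at the old address and would have to be patched.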
Flow of Control (cont.)
Branches
Conditional
- Jump is taken only if the condition is met
Two types
- Set-Then-Jump
  * Condition testing is separated from branching
  * Condition code registers are used to convey the condition test result
  * Condition code registers keep a record of the status of the last ALU operation, such as an overflow condition
- Example: Pentium code
    cmp AX,BX     ; compare AX and BX
    je target     ; jump if equal
Flow of Control (cont.)
- Test-and-Jump
  * Single instruction performs condition testing and branching
- Example: MIPS instruction
    beq $src1,$src2,target
  Jumps to target if Rsrc1 = Rsrc2
Delayed branching
Control is transferred after executing the instruction that follows the branch instruction
- This instruction slot is called the delay slot
Improves efficiency
Highly pipelined RISC processors support delayed branching
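The effect of a delay slot on execution order can be seen in a toy interpreter. This is a sketch, not a model of any real pipeline: the `(op, argument)` instruction tuples and the `run_with_delay_slot` helper are hypothetical, branches are always taken, and a branch placed inside a delay slot is not handled.

```python
def run_with_delay_slot(program):
    """Run a toy program in which a branch takes effect one instruction late."""
    trace = []
    pc = 0
    pending_target = None   # set by a branch, applied after the delay slot
    while pc < len(program):
        op, arg = program[pc]
        trace.append(f"{op} {arg}")
        next_pc = pc + 1
        if pending_target is not None:
            # The instruction just executed was the delay slot
            next_pc = pending_target
            pending_target = None
        if op == "branch":
            pending_target = arg
        pc = next_pc
    return trace

program = [
    ("op", "A"),
    ("branch", 4),   # transfer control to index 4 ...
    ("op", "B"),     # ... but only after this delay-slot instruction runs
    ("op", "C"),     # skipped
    ("op", "D"),
]
trace = run_with_delay_slot(program)
```

The instruction after the branch always executes, so a compiler tries to move useful work into that slot instead of leaving it empty.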
Flow of Control (cont.)
Procedure calls
Facilitate modular programming
Require two pieces of information to return
- End of procedure
  * Pentium uses the ret instruction
  * MIPS uses the jr instruction
- Return address
  * In a (special) register: MIPS allows any general-purpose register
  * On the stack: Pentium
[Figure: delay slot]
Parameter Passing
Two basic techniques
Register-based (e.g., PowerPC, MIPS)
- Internal registers are used
  * Faster
  * Limit the number of parameters
  * Recursive procedure
Stack-based (e.g., Pentium)
- Stack is used
  * More general
2 Operand Types
Instructions support basic data types
- Characters
- Integers
- Floating-point
Instruction overload
Same instruction for different data types
Example: Pentium
    mov AL,address    ; loads an 8-bit value
    mov AX,address    ; loads a 16-bit value
    mov EAX,address   ; loads a 32-bit value
Operand Types (cont.)
Separate instructions
Instructions specify the operand size
Example: MIPS
    lb $dest,address  ; loads a byte
    lh $dest,address  ; loads a halfword (16 bits)
    lw $dest,address  ; loads a word (32 bits)
    ld $dest,address  ; loads a doubleword (64 bits)
Similar instructions exist for store
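The idea that the instruction itself fixes the operand size can be imitated with Python's `struct` module. This is an illustrative sketch only: the byte values, the `load` helper, and the little-endian byte order are assumptions, and the format codes stand in for the size-specific loads rather than reproduce MIPS behaviour exactly.

```python
import struct

# Eight bytes of pretend memory (assumed contents)
MEMORY = bytes([0x11, 0x22, 0x33, 0x44, 0x55, 0x66, 0x77, 0x08])

def load(fmt, address):
    """Read one signed little-endian value whose size is fixed by fmt."""
    return struct.unpack_from(fmt, MEMORY, address)[0]

byte_value = load("<b", 0)     # 1 byte, in the spirit of lb
half_value = load("<h", 0)     # 2 bytes, in the spirit of lh
word_value = load("<i", 0)     # 4 bytes, in the spirit of lw
double_value = load("<q", 0)   # 8 bytes, in the spirit of ld
```

The same starting address yields four different values because each "instruction" reads a different number of bytes.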
3 Addressing Modes
How the operands are specified
Operands can be in three places
- Registers
  * Register addressing mode
- Part of instruction
  * Constant
  * Immediate addressing mode
  * All processors support these two addressing modes
- Memory
  * Difference between RISC and CISC
  * CISC supports a large variety of addressing modes
  * RISC follows load/store architecture
4 Instruction Types
Several types of instructions
Data movement
- Pentium: mov dest,src
- Some processors do not provide direct data movement instructions
- Indirect data movement
    add $dest,$src,0  ; $dest = $src + 0
Arithmetic and Logical
- Arithmetic
  * Integer and floating-point, signed and unsigned
  * add, subtract, multiply, divide
- Logical
  * and, or, not, xor
Instruction Types (cont.)
Condition code bits
    S: Sign bit (0 = +, 1 = -)
    Z: Zero bit (0 = nonzero, 1 = zero)
    O: Overflow bit (0 = no overflow, 1 = overflow)
    C: Carry bit (0 = no carry, 1 = carry)
Example: Pentium
    cmp count,25      ; compare count to 25
                      ; (subtract 25 from count)
    je target         ; jump if equal
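How a compare sets the four condition code bits can be sketched by performing the subtraction explicitly. This is a minimal illustration, assuming two's-complement operands of a chosen width and treating the carry bit as the borrow out of the subtraction, as x86-style processors do for compare. The `compare_flags` helper is hypothetical.

```python
def compare_flags(a, b, bits=32):
    """Return the S, Z, O, C bits produced by computing a - b."""
    mask = (1 << bits) - 1
    sign = 1 << (bits - 1)
    a &= mask
    b &= mask
    result = (a - b) & mask
    return {
        "S": int(bool(result & sign)),                  # result is negative
        "Z": int(result == 0),                          # operands were equal
        # Signed overflow: operand signs differ and the result sign differs from a
        "O": int(bool((a ^ b) & (a ^ result) & sign)),
        "C": int(a < b),                                # unsigned borrow
    }

equal = compare_flags(25, 25)
smaller = compare_flags(10, 25)
overflow = compare_flags(0x80, 0x01, bits=8)   # -128 - 1 does not fit in 8 bits
```

A "jump if equal" then only needs to look at the Z bit, which is what separates condition testing from branching in the set-then-jump style.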
Instruction Types (cont.)
Flow control and I/O instructions
- Branch
- Procedure call
- Interrupts
I/O instructions
- Memory-mapped I/O
  * Most processors support memory-mapped I/O
  * No separate instructions for I/O
- Isolated I/O
  * Pentium supports isolated I/O
  * Separate I/O instructions
    in AX,io_port     ; read from an I/O port
    out io_port,AX    ; write to an I/O port
5 Instruction Formats
Two types
Fixed-length
- Used by RISC processors
- 32-bit RISC processors use 32-bit-wide instructions
  * Examples: SPARC, MIPS, PowerPC
Variable-length
- Used by CISC processors
- Memory operands need more bits to specify
Opcode
- Major and exact operation
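A fixed-length format means every field has a known position in the instruction word. The sketch below packs the standard MIPS R-type layout (opcode 6 bits, rs 5, rt 5, rd 5, shamt 5, funct 6) into 32 bits. The `encode_r_type` helper is hypothetical, and the choice of registers $8, $9, $10 is just an example.

```python
def encode_r_type(rs, rt, rd, shamt, funct, opcode=0):
    """Pack the six MIPS R-type fields into one 32-bit instruction word."""
    fields = ((opcode, 6), (rs, 5), (rt, 5), (rd, 5), (shamt, 5), (funct, 6))
    word = 0
    for value, width in fields:
        if not 0 <= value < (1 << width):
            raise ValueError(f"field value {value} does not fit in {width} bits")
        # Shift earlier fields left and append this one
        word = (word << width) | value
    return word

ADD_FUNCT = 0x20   # funct code of the MIPS add instruction

# add $8,$9,$10  (rd = 8, rs = 9, rt = 10)
word = encode_r_type(rs=9, rt=10, rd=8, shamt=0, funct=ADD_FUNCT)
```

Because the widths always add up to 32, a decoder can extract every field with fixed shifts and masks, which is part of why fixed-length formats suit simple, fast control logic.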
Examples of Instruction Formats
RISC (Reduced Instruction Set Computer)
versus
CISC (Complex Instruction Set Computer)
RISC vs CISC
The underlying philosophy of RISC machines is that a system is better able to manage program execution when the program consists of only a few different instructions that are the same length and require the same number of clock cycles to decode and execute.
RISC systems access memory only with explicit load and store instructions.
In CISC systems, many different kinds of instructions access memory, making instruction length variable and fetch-decode-execute time unpredictable.
RISC vs CISC
The difference between CISC and RISC becomes evident through the basic computer performance equation:
    time/program = (time/cycle) × (cycles/instruction) × (instructions/program)
RISC systems shorten execution time by reducing the clock cycles per instruction.
CISC systems improve performance by reducing the number of instructions per program.
RISC vs CISC
The simple instruction set of RISC machines enables control units to be hardwired for maximum speed.
The more complex (and variable) instruction set of CISC machines requires microcode-based control units that interpret instructions as they are fetched from memory. This translation takes time.
With fixed-length instructions, RISC lends itself to pipelining and speculative execution.
RISC vs CISC
Consider the program fragments:
    CISC                RISC
    mov ax, 10          mov ax, 0
    mov bx, 5           mov bx, 10
    mul bx, ax          mov cx, 5
                        Begin: add ax, bx
                        loop Begin
The total clock cycles for the CISC version might be:
    (2 movs × 1 cycle) + (1 mul × 30 cycles) = 32 cycles
While the clock cycles for the RISC version are:
    (3 movs × 1 cycle) + (5 adds × 1 cycle) + (5 loops × 1 cycle) = 13 cycles
With the RISC clock cycle also being shorter, RISC gives us much faster execution speeds.
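The cycle comparison is a direct application of the performance equation and can be reproduced in a few lines. This is a sketch under assumptions: the per-instruction costs (1 cycle for simple instructions, 30 for the multiply) and the two clock periods are illustrative figures, and the helper names are hypothetical.

```python
def total_cycles(instruction_counts, cycles_per_instruction):
    """Sum of (count x cycles) over every instruction type in a fragment."""
    return sum(count * cycles_per_instruction[name]
               for name, count in instruction_counts.items())

def execution_time(cycles, seconds_per_cycle):
    """time/program = cycles/program x time/cycle."""
    return cycles * seconds_per_cycle

# CISC fragment: two movs and one multi-cycle mul (assumed 30 cycles)
cisc_cycles = total_cycles({"mov": 2, "mul": 1}, {"mov": 1, "mul": 30})

# RISC fragment: three movs, then five add/loop iterations, all single-cycle
risc_cycles = total_cycles({"mov": 3, "add": 5, "loop": 5},
                           {"mov": 1, "add": 1, "loop": 1})

# Hypothetical clock periods: the RISC clock is assumed to be twice as fast
cisc_time = execution_time(cisc_cycles, 2e-9)
risc_time = execution_time(risc_cycles, 1e-9)
```

The RISC fragment executes more instructions but fewer cycles, and any reduction in the cycle time widens the gap further.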
RISC vs CISC
Because of their load-store ISAs, RISC architectures require a large number of CPU registers.
These registers provide fast access to data during sequential program execution.
They can also be employed to reduce the overhead typically caused by passing parameters to subprograms.
Instead of pulling parameters off of a stack, the subprogram is directed to use a subset of registers.
RISC vs CISC
This is how registers can be overlapped in a RISC system.
The current window pointer (CWP) points to the active register window.
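Overlapping register windows can be modelled as views onto one circular register file. This is a simplified sketch: the sizes (4 windows, 8 shared and 8 local registers per window), the direction in which the CWP moves on a call, and the `RegisterWindows` class are all assumptions, and window overflow (spilling to memory) is not handled.

```python
class RegisterWindows:
    """Toy register file whose windows overlap by `shared` registers."""

    def __init__(self, n_windows=4, shared=8, local=8):
        self.n_windows = n_windows
        self.stride = shared + local            # new registers per window
        self.window_size = 2 * shared + local   # ins + locals + outs
        self.regs = [0] * (n_windows * self.stride)
        self.cwp = 0                            # current window pointer

    def _physical(self, r):
        if not 0 <= r < self.window_size:
            raise IndexError("register outside the active window")
        return (self.cwp * self.stride + r) % len(self.regs)

    def read(self, r):
        return self.regs[self._physical(r)]

    def write(self, r, value):
        self.regs[self._physical(r)] = value

    def call(self):
        # The caller's outs become the callee's ins
        self.cwp = (self.cwp + 1) % self.n_windows

    def ret(self):
        self.cwp = (self.cwp - 1) % self.n_windows

windows = RegisterWindows()
windows.write(16, 42)          # caller puts an argument in its first out register
windows.call()
argument = windows.read(0)     # callee sees it as its first in register
windows.write(0, 7)            # callee leaves a return value in the same place
windows.ret()
returned = windows.read(16)
```

No data is copied on the call or the return; only the CWP changes, which is how the overlap reduces parameter-passing overhead.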
RISC vs CISC
It is becoming increasingly difficult to distinguish RISC architectures from CISC architectures.
Some RISC systems provide more extravagant instruction sets than some CISC systems.
Some systems combine both approaches.
The following two slides summarize the characteristics that traditionally typify the differences between these two architectures.
RISC vs CISC
RISC
- Multiple register sets.
- Three operands per instruction.
- Parameter passing through register windows.
- Single-cycle instructions.
- Hardwired control.
- Highly pipelined.
CISC
- Single register set.
- One or two register operands per instruction.
- Parameter passing through memory.
- Multiple-cycle instructions.
- Microprogrammed control.
- Less pipelined.
Continued....
RISC vs CISC
RISC
- Simple instructions, few in number.
- Fixed-length instructions.
- Complexity in compiler.
- Only LOAD/STORE instructions access memory.
- Few addressing modes.
CISC
- Many complex instructions.
- Variable-length instructions.
- Complexity in microcode.
- Many instructions can access memory.
- Many addressing modes.
Summary
Instruction Set Design Issues
Instruction set design issues include:
Where are operands stored?
- registers, memory, stack, accumulator
How many explicit operands are there?
- 0, 1, 2, or 3
How is the operand location specified?
- register, immediate, indirect, . . .
What type and size of operands are supported?
- byte, int, float, double, string, vector, . . .
What operations are supported?
- add, sub, mul, move, compare, . . .
More About General Purpose Registers
Why do almost all new architectures use GPRs?
Registers are much faster than memory (even cache)
- Register values are available immediately
- When memory isn't ready, the processor must wait ("stall")
Registers are convenient for variable storage
- Compiler assigns some variables just to registers
- More compact code, since small fields specify registers (compared to memory addresses)
[Figure: storage hierarchy: Processor (Registers), Cache, Memory, Disk]
What Operations are Needed
Arithmetic and Logical
- Integer arithmetic: ADD, SUB, MULT, DIV, SHIFT
- Logical operation: AND, OR, XOR, NOT
Data Transfer - copy, load, store
Control - branch, jump, call, return
Floating Point - ADDF, MULF, DIVF (same as arithmetic but usually take bigger operands)
Decimal - ADDD, CONVERT
String - move, compare, search
Graphics - pixel and vertex, compression/decompression operations
Stack Architecture Pros and Cons
Pros
- Good code density (implicit top of stack)
- Low hardware requirements
- Easy to write a simpler compiler for stack architectures
Cons
- Stack becomes the bottleneck
- Little ability for parallelism or pipelining
- Data is not always at the top of the stack when needed, so additional instructions like TOP and SWAP are needed
- Difficult to write an optimizing compiler for stack architectures
Accumulator Architecture Pros and Cons
Pros
- Very low hardware requirements
- Easy to design and understand
Cons
- Accumulator becomes the bottleneck
- Little ability for parallelism or pipelining
- High memory traffic
Memory-Memory Architecture Pros and Cons
Pros
- Requires fewer instructions (especially if 3 operands)
- Easy to write compilers for (especially if 3 operands)
Cons
- Very high memory traffic (especially if 3 operands)
- Variable number of clocks per instruction
- With two operands, more data movements are required
Memory-Register Architecture Pros and Cons
Pros
- Some data can be accessed without loading first
- Instruction format easy to encode
- Good code density
Cons
- Operands are not equivalent (poor orthogonality)
- Variable number of clocks per instruction
- May limit number of registers
Number of Addresses (cont.)
Three-address machines
Two for the source operands, one for the result
RISC processors use three addresses
Sample instructions
    add dest,src1,src2    ; M(dest) = [src1] + [src2]
    sub dest,src1,src2    ; M(dest) = [src1] - [src2]
    mult dest,src1,src2   ; M(dest) = [src1] * [src2]
Three addresses
Operand 1, Operand 2, Result
Example: a = b + c
Three-address instruction formats are not common, because they require a relatively long instruction format to hold the three address references.
Number of Addresses (cont.)
Example
C statement
    A = B + C * D - E + F + A
Equivalent code:
    mult T,C,D    ; T = C*D
    add T,T,B     ; T = B + C*D
    sub T,T,E     ; T = B + C*D - E
    add T,T,F     ; T = B + C*D - E + F
    add A,T,A     ; A = B + C*D - E + F + A
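The three-address sequence can be traced with a few lines of code. This is a minimal sketch, not part of the slides: the `run_three_address` helper, the dictionary standing in for memory, and the sample values are assumptions made for illustration.

```python
import operator

OPS = {"add": operator.add, "sub": operator.sub, "mult": operator.mul}

def run_three_address(program, memory):
    """Execute (op, dest, src1, src2) tuples: M(dest) = [src1] op [src2]."""
    for op, dest, src1, src2 in program:
        memory[dest] = OPS[op](memory[src1], memory[src2])
    return memory

# A = B + C * D - E + F + A, with T as a temporary location
program = [
    ("mult", "T", "C", "D"),
    ("add", "T", "T", "B"),
    ("sub", "T", "T", "E"),
    ("add", "T", "T", "F"),
    ("add", "A", "T", "A"),
]
sample = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6, "T": 0}  # assumed values
result = run_three_address(program, dict(sample))
```

Five instructions are enough here, but each one carries three addresses, and in four of them T appears as both a source and the destination.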
Number of Addresses (cont.)
Two-address machines
One address doubles (for source operand and result)
Last example makes a case for it
- Address T is used twice
Sample instructions
    load dest,src     ; M(dest) = [src]
    add dest,src      ; M(dest) = [dest] + [src]
    sub dest,src      ; M(dest) = [dest] - [src]
    mult dest,src     ; M(dest) = [dest] * [src]
Two addresses
One address doubles as operand and result
Example: a = a + b
The two-address format reduces the space requirement but also introduces some awkwardness. To avoid altering the value of an operand, a MOVE instruction is used to move one of the values to a result or temporary location before performing the operation.
um+er of Addresses cont -um+er of Addresses
cont -
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 58101
um+er of Addresses cont-um+er of Addresses cont-
ltample
C statement
A C H D F 6 A
ltJuivalent code6
load TC T = C
mult TD T = CD
add TB T = B+CD
sub TE T = B+CD-Eadd TF T = B+CD-E+F
add AT A = B+CD-E+F+A
um+er of Addresses cont -um+er of Addresses
cont -
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 59101
um+er of Addresses cont-um+er of Addresses cont-
5ne$address machines 4se special set of registers called accumulators
$ Specify one source operand receive the result
Called accumulator machines
Sample instructions
load addr accum = [addr]
store addr M[addr] = accumadd addr accum = accum + [addr]
sub addr accum = accum - [addr]
mult addr accum = accum [addr]
um+er of Addresses cont -um+er of Addresses
cont -
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 60101
um+er of Addresses cont-um+er of Addresses cont-
ltample
C statementA C H D F 6 A
ltJuivalent code6
load C load C to accum
mult D accum = CD
add B accum = CD+B
sub E accum = B+CD-Eadd F accum = B+CD-E+F
add A accum = B+CD-E+F+A
store A store accum cotets A
um+er of Addresses cont -um+er of Addresses
cont -
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 61101
um+er of Addresses cont-um+er of Addresses cont-
Vero$address machines
Stac0 supplies operands and receives the result$ Special instructions to load and store use an address
Called stac0 machines (lt6 Pgtgtgt 8urroughs 8gtgt)
Sample instructions
us addr us([addr])
o addr o([addr])
add us(o + o)
sub us(o - o) mult us(o o)
um+er of Addresses cont -um+er of Addresses
cont-
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 62101
um+er of Addresses cont-um+er of Addresses cont-
ltample
C statement
A C H D F 6 A
ltJuivalent code6
us E sub
us C us F
us D add
Mult us A
us B add
add o A
)oadStore Architecture)oadStore Architecture
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 63101
)oadStore Architecture)oadStore Architecture
Instructions epect operands in internal processor registers Special 35A and ST51lt instructions move data beteen registers
and memory
1ISC uses this architecture
1educes instruction length
()
)oadStore Architecture cont-)oadStore Architecture
cont-
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 64101
)oadStore Architecture cont-)oadStore Architecture cont-
Sample instructionsload $daddr $d = [addr]
store addr$s (addr) = $s
add $d$s$samp $d = $s + $sampsub $d$s$samp $d = $s - $samp
mult $d$s$samp $d = $s $samp
um+er of Addresses cont-um+er of Addresses
cont-
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 65101
um+er of Addresses cont-um+er of Addresses cont-
ampleC statement
A = B + C D E + F + A
1uialent co)eload $B mult $amp$amp$
load $ampC add $amp$amp$
load $D sub $amp$amp$
load $E add $amp$amp$
load $F add $amp$amp$
load $A store A$amp
0lo1 of Control 0lo1 of Control
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 66101
0lo1 of Control 0lo1 of Control
efault is seJuential flo
Several instructions alter this defaulteecution
8ranches$ 4nconditional
$ Conditional
$ elayed branches Procedure calls
$ elayed procedure calls
0lo1 of Control cont-0lo1 of Control
cont-
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 67101
0lo1 of Control cont-0lo1 of Control cont-
8ranches
4nconditional
$ Absolute address
$ PC$relative
U Target address is specified relative to PC contents U 1elocatable code
ltample6 MIPS
$ Absolute address
9 target
$ PC$relative
8 target
0lo1 of Control cont-0lo1 of Control
cont-
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 68101
0lo1 of Control cont- -
e entium e R
0lo1 of Control cont-0lo1 of Control
cont-
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 69101
lo1 o Co t ol co t- -
8ranches
Conditional
$ ump is ta0en only if the condition is met
To types
$ Set$Then$ump
U Condition testing is separated from branching U Condition code registers are used to convey the condition test
result
U Condition code registers 0eep a record of the status of the last A34 operation such as overflo condition
$ ltample6 Pentium codecm AB comare A ad B
e taret um e0ual
0lo1 of Control cont-0lo1 of Control
cont-
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 70101
- -
$ Test$and$ump
U Single instruction performs condition testing and branching
$ ltample6 MIPS instruction
be0 $src$srcamptaret
umps to target if 1src E 1src
elayed branching
Control is transferred after eecuting the instruction thatfollos the branch instruction
$ This instruction slot is called delay slot Improves efficiency
ighly pipelined 1ISC processors support
0lo1 of Control cont-0lo1 of Control
cont-
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 71101
- -
Procedure calls Lacilitate modular programming
1eJuire to pieces of information to return
$ ltnd of procedure U Pentium
uses ret instruction
U MIPS
uses 9r instruction
$ 1eturn address U In a (special) register
MIPS allos any general$purpose register
U 5n the stac0
Pentium
0lo1 of Control cont-0lo1 of Control
cont-
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 72101
- -
0lo1 of Control cont-0lo1 of Control
cont-
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 73101
- -
elay slot
Parameter PassingParameter Passin
g
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 74101
gg
To basic techniJues 1egister$based (eg PoerPC MIPS)
$ Internal registers are used U Laster
U 3imit the number of parameters U 1ecursive procedure
Stac0$based (eg Pentium)
$ Stac0 is used U More general
2 perand Types2
perand Types
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 75101
p yp
Instructions support basic data types
Characters Integers
Lloating$point
Instruction overload
Same instruction for different data types
ltample6 Pentium mo1 A2address loads a 3-bt 1alue
mo1 Aaddress loads a -bt 1alue
mo1 EAaddress loads a amp-bt 1alue
perand Types
perand Types
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 76101
Separate instructions
Instructions specify the operand si-e
ltample6 MIPS
lb $destaddress loads a b4te
l $destaddress loads a al5ord( bts)
l5 $destaddress loads a 5ord
(amp bts)
ld $destaddress loads a double5ord
( bts)imilar instruction store
3 Addressing Modes3 Addressin
g Modes
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 77101
o the operands are specified
5perands can be in three places
$ 1egisters U 1egister addressing mode
$ Part of instruction U Constant
U Immediate addressing mode
U All processors support these to addressing modes
$ Memory U ifference beteen 1ISC and CISC
U CISC supports a large variety of addressing modes
U 1ISC follos load2store architecture
4 Instruction Types4 Instruction T
ypes
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 78101
Several types of instructions
ata movement$ Pentium6 mo1 destsrc
$ Some do not provide direct data movement instructions
$ Indirect data movement
add $dest$src6 $dest = $src+6
Arithmetic and 3ogical
$ Arithmetic U Integer and floating$point signed and unsigned U add subtract multiply divide
$ 3ogical U andB orB notB 7or
Instruction Types cont-Instruction T
ypes cont-
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 79101
Condition code bits
S6 Sign bit (gt E F E $)
6 Vero bit (gt E non-ero E -ero)
$6 5verflo bit (gt E no overflo E overflo)
C6 Carry bit (gt E no carry E carry)
ltample6 Pentium
cm coutamp comare cout to amp
subtract amp rom cout
e taret um e0ual
Instruction Types cont-Instruction T
ypes cont-
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 80101
Llo control and I25 instructions
$ 8ranch
$ Procedure call
$ Interrupts
I25 instructions$ Memory$mapped I25
U Most processors support memory$mapped I25
U 7o separate instructions for I25
$ Isolated I25 U Pentium supports isolated I25
U Separate I25 instructions
Ao7ort read from an IO ort
out o7ortA rte to an IO ort
5 Instruction 0ormats5 Instruction 0ormats
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 81101
To types
Lied$length$ 4sed by 1ISC processors
$ $bit 1ISC processors use $bits ide instructions U ltamples6 SPA1C MIPS PoerPC
ariable$length
$ 4sed by CISC processors
$ Memory operands need more bits to specify
5pcode
MaOor and eact operation
Examples of Instruction 0ormatsExam
ples of Instruction 0ormats
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 82101
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 83101
ISC e)uce) Instruction Set Computer 3
ersus
CISC Comple Instruction Set Computer3
RISC s CISCRISC s CISC
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 84101
0
RISC s CISCRISC s CISC
The underlying philosophy of 1ISC machines is that asystem is better able to manage program eecutionhen the program consists of only a fe differentinstructions that are the same length and reJuire thesame number of cloc0 cycles to decode and eecute
1ISC systems access memory only ith eplicit loadand store instructions
In CISC systems many different 0inds of instructionsaccess memory ma0ing instruction length variableand fetch$decode$eecute time unpredictable
RISC s CISCRISC s CISC
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 85101
The difference beteen CISC and 1ISC becomesevident through the basic computer performanceeJuation6
1ISC systems shorten eecution time by reducingthe cloc0 cycles per instruction
CISC systems improve performance by reducing thenumber of instructions per program
RISC s CISCRISC s CISC
RISC s CISCRISC s CISC
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 86101
(
The simple instruction set of 1ISC machinesenables control units to be hardired for maimumspeed
The more comple$$ and variable$$ instruction set of
CISC machines reJuires microcode$based controlunits that interpret instructions as they are fetchedfrom memory This translation ta0es time
Dith fied$length instructions 1ISC lends itself topipelining and speculative eecution
RISC s CISCRISC s CISC
RISC s CISCRISC s CISC
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 87101
mo1 a8 6 mo1 b8 6 mo1 c8
Be add a8 b8 loo Be
Consider the the program fragments6
The total cloc0 cycles for the CISC version might be6(amp mo1s c4cle) + ( mul 6 c4cles) = amp c4cles
Dhile the cloc0 cycles for the 1ISC version is6
( mo1s c4cle) + ( adds c4cle) + ( loos c4cle) = c4cles
Dith 1ISC cloc0 cycle being shorter 1ISC gives usmuch faster eecution speeds
mo1 a8 6 mo1 b8 mul b8 a8
CISC RISC
RISC s CISCRISC s CISC
RISC s CISCRISC s CISC
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 88101
8ecause of their load$store ISAs 1ISC architecturesreJuire a large number of CP4 registers
These register provide fast access to data duringseJuential program eecution
They can also be employed to reduce the overheadtypically caused by passing parameters tosubprograms
Instead of pulling parameters off of a stac0 the
subprogram is directed to use a subset of registers
RISC s CISCRISC s CISC
RISC s CISCRISC s CISC
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 89101
3
This is horegisters canbe overlappedin a 1ISCsystem
The currentindo pointer (CDP) pointsto the activeregister
indo
RISC s CISCRISC s CISC
RISC s CISCRISC s CISC
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 90101
34
It is becoming increasingly difficult to distinguish1ISC architectures from CISC architectures
Some 1ISC systems provide more etravagantinstruction sets than some CISC systems
Some systems combine both approaches The folloing to slides summari-e the
characteristics that traditionally typify the differencesbeteen these to architectures
RISC s CISCRISC s CISC
RISC s CISCRISC s CISC
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 91101
31
RISC Multiple reister sets4
Three operan)s perinstruction4
Parameter passinthrouh reister5in)o5s4
Sinle-ccle
instructions4 7ar)5ire)
control4
7ihl pipeline)4
CISC Sinle reister set4
ne or t5o reisteroperan)s per
instruction4 Parameter passin
throuh memor4
Multiple ccle
instructions4 Microproramme)
control4
(ess pipeline)4ontinued
RISC vs. CISC
RISC
- Simple instructions, few in number.
- Fixed length instructions.
- Complexity in compiler.
- Only LOAD/STORE instructions access memory.
- Few addressing modes.
CISC
- Many complex instructions.
- Variable length instructions.
- Complexity in microcode.
- Many instructions can access memory.
- Many addressing modes.
Summary
Instruction Set Design Issues
Instruction set design issues include:
Where are operands stored?
- registers, memory, stack, accumulator
How many explicit operands are there?
- 0, 1, 2, or 3
How is the operand location specified?
- register, immediate, indirect, ...
What type and size of operands are supported?
- byte, int, float, double, string, vector, ...
What operations are supported?
- add, sub, mul, move, compare, ...
More About General Purpose Registers
Why do almost all new architectures use GPRs?
Registers are much faster than memory (even cache)
- Register values are available immediately
- When memory isn't ready, processor must wait ("stall")
Registers are convenient for variable storage
- Compiler assigns some variables just to registers
- More compact code since small fields specify registers (compared to memory addresses)
[Figure labels: Processor, Registers, Cache, Memory, Disk]
What Operations are Needed
Arithmetic + Logical
- Integer arithmetic: ADD, SUB, MULT, DIV, SHIFT
- Logical operation: AND, OR, XOR, NOT
Data Transfer - copy, load, store
Control - branch, jump, call, return
Floating Point - ADDF, MULF, DIVF (same as arithmetic but usually take bigger operands)
Decimal - ADDD, CONVERT
String - move, compare, search
Graphics - pixel and vertex, compression/decompression operations
Stack Architecture: Pros and Cons
Pros
- Good code density (implicit top of stack)
- Low hardware requirements
- Easy to write a simpler compiler for stack architectures
Cons
- Stack becomes the bottleneck
- Little ability for parallelism or pipelining
- Data is not always at the top of stack when needed, so additional instructions like TOP and SWAP are needed
- Difficult to write an optimizing compiler for stack architectures
Accumulator Architecture: Pros and Cons
Pros
- Very low hardware requirements
- Easy to design and understand
Cons
- Accumulator becomes the bottleneck
- Little ability for parallelism or pipelining
- High memory traffic
Memory-Memory Architecture: Pros and Cons
Pros
- Requires fewer instructions (especially if 3 operands)
- Easy to write compilers for (especially if 3 operands)
Cons
- Very high memory traffic (especially if 3 operands)
- Variable number of clocks per instruction
- With two operands, more data movements are required
Memory-Register Architecture: Pros and Cons
Pros
- Some data can be accessed without loading first
- Instruction format easy to encode
- Good code density
Cons
- Operands are not equivalent (poor orthogonality)
- Variable number of clocks per instruction
- May limit number of registers
Number of Addresses (cont.)
Example
C statement
A = B + C * D - E + F + A
Equivalent code:
mult T,C,D ; T = C*D
add T,T,B ; T = B+C*D
sub T,T,E ; T = B+C*D-E
add T,T,F ; T = B+C*D-E+F
add A,T,A ; A = B+C*D-E+F+A
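The three-address sequence above can be checked with a small interpreter. The sketch below is an assumption-laden illustration: the function name `run_three_address`, the tuple encoding of instructions, and the sample values of A through F are all made up for the example. Every instruction names a destination and two sources, all of them memory locations.

```python
import operator

# Mnemonics taken from the slide; the mapping to Python operators is ours.
ALU = {"add": operator.add, "sub": operator.sub, "mult": operator.mul}


def run_three_address(program, memory):
    """Execute (op, dest, src1, src2) tuples; every operand is a memory name."""
    for op, dest, src1, src2 in program:
        memory[dest] = ALU[op](memory[src1], memory[src2])
    return memory


# Hypothetical initial values, chosen only to make the result easy to check.
memory = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6, "T": 0}
program = [
    ("mult", "T", "C", "D"),  # T = C*D
    ("add", "T", "T", "B"),   # T = B+C*D
    ("sub", "T", "T", "E"),   # T = B+C*D-E
    ("add", "T", "T", "F"),   # T = B+C*D-E+F
    ("add", "A", "T", "A"),   # A = B+C*D-E+F+A
]
run_three_address(program, memory)
result = memory["A"]  # 2 + 3*4 - 5 + 6 + 1
```

Five instructions suffice, but each one carries three addresses, which is the encoding cost the following slides try to reduce.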
Number of Addresses (cont.)
Two-address machines
One address doubles (for source operand and result)
Last example makes a case for it
- Address T is used twice
Sample instructions
load dest,src ; M(dest) = [src]
add dest,src ; M(dest) = [dest] + [src]
sub dest,src ; M(dest) = [dest] - [src]
mult dest,src ; M(dest) = [dest] * [src]
Two Addresses
One address doubles as operand and result. Example: a = a + b
The two-address format reduces the space requirement but also introduces some awkwardness. To avoid altering the value of an operand, a MOVE instruction is used to move one of the values to a result or temporary location before performing the operation.
Number of Addresses (cont.)
Example
C statement
A = B + C * D - E + F + A
Equivalent code:
load T,C ; T = C
mult T,D ; T = C*D
add T,B ; T = B+C*D
sub T,E ; T = B+C*D-E
add T,F ; T = B+C*D-E+F
add A,T ; A = B+C*D-E+F+A
Number of Addresses (cont.)
One-address machines
Use special set of registers called accumulators
- Specify one source operand and receive the result
Called accumulator machines
Sample instructions
load addr ; accum = [addr]
store addr ; M[addr] = accum
add addr ; accum = accum + [addr]
sub addr ; accum = accum - [addr]
mult addr ; accum = accum * [addr]
Number of Addresses (cont.)
Example
C statement
A = B + C * D - E + F + A
Equivalent code:
load C ; load C into accum
mult D ; accum = C*D
add B ; accum = C*D+B
sub E ; accum = B+C*D-E
add F ; accum = B+C*D-E+F
add A ; accum = B+C*D-E+F+A
store A ; store accum contents in A
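A minimal accumulator-machine sketch makes the one-address sequence concrete. The function name `run_accumulator`, the tuple encoding, and the sample values are assumptions for illustration only. Each instruction names a single memory operand; the accumulator is the implicit second operand and the implicit destination.

```python
def run_accumulator(program, memory):
    """Execute (op, addr) pairs against a single implicit accumulator."""
    accum = 0
    for op, addr in program:
        if op == "load":
            accum = memory[addr]
        elif op == "store":
            memory[addr] = accum
        elif op == "add":
            accum = accum + memory[addr]
        elif op == "sub":
            accum = accum - memory[addr]
        elif op == "mult":
            accum = accum * memory[addr]
        else:
            raise ValueError(f"unknown instruction: {op}")
    return memory


# Hypothetical initial values, chosen only to make the result easy to check.
memory = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
program = [
    ("load", "C"), ("mult", "D"), ("add", "B"), ("sub", "E"),
    ("add", "F"), ("add", "A"), ("store", "A"),
]
run_accumulator(program, memory)
result = memory["A"]  # 3*4 + 2 - 5 + 6 + 1
```

Note that all seven instructions touch memory, which is one way to see why accumulator designs generate a lot of memory traffic.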
Number of Addresses (cont.)
Zero-address machines
Stack supplies operands and receives the result
- Special instructions to load and store use an address
Called stack machines (Ex: HP3000, Burroughs B5500)
Sample instructions
push addr ; push([addr])
pop addr ; pop([addr])
add ; push(pop + pop)
sub ; push(pop - pop)
mult ; push(pop * pop)
Number of Addresses (cont.)
Example
C statement
A = B + C * D - E + F + A
Equivalent code:
push E
push C
push D
mult
push B
add
sub
push F
add
push A
add
pop A
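The stack sequence can be traced with a short interpreter. This is a sketch under stated assumptions: the name `run_stack`, the tuple encoding, and the sample values are invented for the example, and `sub` is taken to mean "first value popped minus second value popped", which is the reading under which the slide's sequence computes the C statement.

```python
def run_stack(program, memory):
    """Execute zero-address code; only push and pop carry an address."""
    stack = []
    for instr in program:
        op = instr[0]
        if op == "push":
            stack.append(memory[instr[1]])
        elif op == "pop":
            memory[instr[1]] = stack.pop()
        elif op in ("add", "sub", "mult"):
            first = stack.pop()   # value on top of the stack
            second = stack.pop()  # value just below it
            if op == "add":
                stack.append(first + second)
            elif op == "sub":
                stack.append(first - second)
            else:
                stack.append(first * second)
        else:
            raise ValueError(f"unknown instruction: {op}")
    return memory


# Hypothetical initial values, chosen only to make the result easy to check.
memory = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
program = [
    ("push", "E"), ("push", "C"), ("push", "D"), ("mult",),
    ("push", "B"), ("add",), ("sub",),
    ("push", "F"), ("add",), ("push", "A"), ("add",), ("pop", "A"),
]
run_stack(program, memory)
result = memory["A"]  # (2 + 3*4) - 5 + 6 + 1
```

Pushing E first is what puts it below B+C*D when `sub` executes; with the opposite operand order the program would have to be rearranged.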
Load/Store Architecture
Instructions expect operands in internal processor registers
Special LOAD and STORE instructions move data between registers and memory
RISC uses this architecture
Reduces instruction length
Load/Store Architecture (cont.)
Sample instructions
load Rd,addr ; Rd = [addr]
store addr,Rs ; (addr) = Rs
add Rd,Rs1,Rs2 ; Rd = Rs1 + Rs2
sub Rd,Rs1,Rs2 ; Rd = Rs1 - Rs2
mult Rd,Rs1,Rs2 ; Rd = Rs1 * Rs2
Number of Addresses (cont.)
Example
C statement
A = B + C * D - E + F + A
Equivalent code:
load R1,B
load R2,C
load R3,D
load R4,E
load R5,F
load R6,A
mult R2,R2,R3
add R2,R2,R1
sub R2,R2,R4
add R2,R2,R5
add R2,R2,R6
store A,R2
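A load/store version of the same computation can be sketched as follows. The register names R1 to R6, the function name `run_load_store`, the tuple encoding, and the sample values are assumptions for illustration. Only `load` and `store` touch memory; the arithmetic instructions work purely on registers, and the sketch counts memory accesses to make that visible.

```python
import operator

ALU = {"add": operator.add, "sub": operator.sub, "mult": operator.mul}


def run_load_store(program, memory):
    """Run load/store code and return the number of data memory accesses."""
    regs = {}
    mem_accesses = 0
    for instr in program:
        op = instr[0]
        if op == "load":            # load Rd,addr
            _, rd, addr = instr
            regs[rd] = memory[addr]
            mem_accesses += 1
        elif op == "store":         # store addr,Rs
            _, addr, rs = instr
            memory[addr] = regs[rs]
            mem_accesses += 1
        else:                       # op Rd,Rs1,Rs2 uses registers only
            _, rd, rs1, rs2 = instr
            regs[rd] = ALU[op](regs[rs1], regs[rs2])
    return mem_accesses


# Hypothetical initial values, chosen only to make the result easy to check.
memory = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
program = [
    ("load", "R1", "B"), ("load", "R2", "C"), ("load", "R3", "D"),
    ("load", "R4", "E"), ("load", "R5", "F"), ("load", "R6", "A"),
    ("mult", "R2", "R2", "R3"), ("add", "R2", "R2", "R1"),
    ("sub", "R2", "R2", "R4"), ("add", "R2", "R2", "R5"),
    ("add", "R2", "R2", "R6"), ("store", "A", "R2"),
]
accesses = run_load_store(program, memory)
result = memory["A"]
```

The program is longer than the three-address version (twelve instructions), but each instruction is short and only seven of them access memory.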
Flow of Control
Default is sequential flow
Several instructions alter this default execution
Branches
- Unconditional
- Conditional
- Delayed branches
Procedure calls
- Delayed procedure calls
Flow of Control (cont.)
Branches
Unconditional
- Absolute address
- PC-relative
  - Target address is specified relative to PC contents
  - Relocatable code
Example: MIPS
- Absolute address
j target
- PC-relative
b target
Flow of Control (cont.)
[Figure: branch examples, including Pentium]
Flow of Control (cont.)
Branches
Conditional
- Jump is taken only if the condition is met
Two types
- Set-Then-Jump
  - Condition testing is separated from branching
  - Condition code registers are used to convey the condition test result
  - Condition code registers keep a record of the status of the last ALU operation, such as an overflow condition
- Example: Pentium code
cmp AX,BX ; compare AX and BX
je target ; jump if equal
Flow of Control (cont.)
- Test-and-Jump
  - Single instruction performs condition testing and branching
- Example: MIPS instruction
beq Rsrc1,Rsrc2,target
Jumps to target if Rsrc1 = Rsrc2
Delayed branching
Control is transferred after executing the instruction that follows the branch instruction
- This instruction slot is called the delay slot
Improves efficiency
Supported by highly pipelined RISC processors
Flow of Control (cont.)
Procedure calls
Facilitate modular programming
Require two pieces of information to return
- End of procedure
  - Pentium uses ret instruction
  - MIPS uses jr instruction
- Return address
  - In a (special) register: MIPS allows any general-purpose register
  - On the stack: Pentium
Flow of Control (cont.)
[Figure: delay slot]
Parameter Passing
Two basic techniques
Register-based (e.g., PowerPC, MIPS)
- Internal registers are used
  - Faster
  - Limit the number of parameters
  - Recursive procedure
Stack-based (e.g., Pentium)
- Stack is used
  - More general
2. Operand Types
Instructions support basic data types
- Characters
- Integers
- Floating-point
Instruction overload
- Same instruction for different data types
- Example: Pentium
mov AL,address ; loads an 8-bit value
mov AX,address ; loads a 16-bit value
mov EAX,address ; loads a 32-bit value
Operand Types (cont.)
Separate instructions
- Instructions specify the operand size
- Example: MIPS
lb Rdest,address ; loads a byte
lh Rdest,address ; loads a halfword (16 bits)
lw Rdest,address ; loads a word (32 bits)
ld Rdest,address ; loads a doubleword (64 bits)
Similar instructions for store
3. Addressing Modes
How the operands are specified
Operands can be in three places
- Registers
  - Register addressing mode
- Part of instruction
  - Constant
  - Immediate addressing mode
  - All processors support these two addressing modes
- Memory
  - Difference between RISC and CISC
  - CISC supports a large variety of addressing modes
  - RISC follows load/store architecture
4. Instruction Types
Several types of instructions
Data movement
- Pentium: mov dest,src
- Some do not provide direct data movement instructions
- Indirect data movement
add Rdest,Rsrc,0 ; Rdest = Rsrc+0
Arithmetic and Logical
- Arithmetic
  - Integer and floating-point, signed and unsigned
  - add, subtract, multiply, divide
- Logical
  - and, or, not, xor
Instruction Types (cont.)
Condition code bits
S: Sign bit (0 = +, 1 = -)
Z: Zero bit (0 = nonzero, 1 = zero)
O: Overflow bit (0 = no overflow, 1 = overflow)
C: Carry bit (0 = no carry, 1 = carry)
Example: Pentium
cmp count,25 ; compare count to 25
             ; subtract 25 from count
je target ; jump if equal
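How a compare instruction sets these four bits can be sketched in Python. The function name `compare_flags`, the default 32-bit width, and the dictionary layout are assumptions for illustration; the flag rules follow the usual two's-complement conventions, with the carry bit acting as a borrow indicator for subtraction, as on x86.

```python
def compare_flags(a, b, bits=32):
    """Return the S, Z, O, C bits produced by computing a - b in `bits` bits."""
    mask = (1 << bits) - 1
    sign_bit = 1 << (bits - 1)
    a &= mask
    b &= mask
    result = (a - b) & mask  # the difference is discarded; only flags are kept
    return {
        "S": int(bool(result & sign_bit)),  # 1 = negative result
        "Z": int(result == 0),              # 1 = operands were equal
        # Signed overflow: operands had different signs and the result's
        # sign differs from the sign of the first operand.
        "O": int(bool((a ^ b) & (a ^ result) & sign_bit)),
        # Carry: a borrow was needed, i.e. a < b as unsigned numbers.
        "C": int(a < b),
    }


def jump_if_equal(flags):
    """A je-style branch is taken exactly when the zero bit is set."""
    return flags["Z"] == 1


equal = compare_flags(25, 25)
smaller = compare_flags(10, 25)
overflow = compare_flags(0x80000000, 1)  # most negative 32-bit value minus 1
```

This is the set-then-jump pattern: the compare only records status bits, and a separate branch instruction inspects them afterwards.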
Instruction Types (cont.)
Flow control and I/O instructions
- Branch
- Procedure call
- Interrupts
I/O instructions
- Memory-mapped I/O
  - Most processors support memory-mapped I/O
  - No separate instructions for I/O
- Isolated I/O
  - Pentium supports isolated I/O
  - Separate I/O instructions
in AX,io_port ; read from an I/O port
out io_port,AX ; write to an I/O port
5. Instruction Formats
Two types
Fixed-length
- Used by RISC processors
- 32-bit RISC processors use 32-bit-wide instructions
  - Examples: SPARC, MIPS, PowerPC
Variable-length
- Used by CISC processors
- Memory operands need more bits to specify
Opcode
- Major and exact operation
Examples of Instruction Formats
RISC (Reduced Instruction Set Computer)
versus
CISC (Complex Instruction Set Computer)
RISC vs. CISC
The underlying philosophy of RISC machines is that a system is better able to manage program execution when the program consists of only a few different instructions that are the same length and require the same number of clock cycles to decode and execute.
RISC systems access memory only with explicit load and store instructions.
In CISC systems, many different kinds of instructions access memory, making instruction length variable and fetch-decode-execute time unpredictable.
RISC vs. CISC
The difference between CISC and RISC becomes evident through the basic computer performance equation:
time/program = (time/cycle) x (cycles/instruction) x (instructions/program)
RISC systems shorten execution time by reducing the clock cycles per instruction.
CISC systems improve performance by reducing the number of instructions per program.
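The trade-off can be made concrete with the standard form of the performance equation, execution time = instruction count x cycles per instruction x cycle time. The sketch below is illustrative only: the function name `cpu_time` and every number in it are invented for the example and do not describe any real processor.

```python
def cpu_time(instruction_count, cycles_per_instruction, cycle_time_ns):
    """Execution time in nanoseconds from the three performance factors."""
    return instruction_count * cycles_per_instruction * cycle_time_ns


# Hypothetical CISC-style design: fewer instructions, more cycles for each.
cisc_ns = cpu_time(instruction_count=1000, cycles_per_instruction=4, cycle_time_ns=2)

# Hypothetical RISC-style design: more instructions, one short cycle for each.
risc_ns = cpu_time(instruction_count=1500, cycles_per_instruction=1, cycle_time_ns=1)
```

With these made-up figures the RISC-style design finishes sooner even though it executes 50 percent more instructions, because the other two factors shrink by more than the instruction count grows. Different numbers could just as easily favor the other design.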
RISC vs. CISC
The simple instruction set of RISC machines enables control units to be hardwired for maximum speed.
The more complex, and variable, instruction set of CISC machines requires microcode-based control units that interpret instructions as they are fetched from memory. This translation takes time.
With fixed-length instructions, RISC lends itself to pipelining and speculative execution.
RISC vs. CISC
Consider these two program fragments:
CISC
mov ax, 10
mov bx, 5
mul bx, ax
RISC
mov ax, 0
mov bx, 10
mov cx, 5
Begin: add ax, bx
loop Begin
The total clock cycles for the CISC version might be:
(2 movs x 1 cycle) + (1 mul x 30 cycles) = 32 cycles
While the clock cycles for the RISC version is:
(3 movs x 1 cycle) + (5 adds x 1 cycle) + (5 loops x 1 cycle) = 13 cycles
With the RISC clock cycle being shorter, RISC gives us much faster execution speeds.
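The cycle arithmetic for the two fragments can be reproduced by a small model. Everything here is a sketch under assumptions: the function names are invented, the operand values 10 and 5 and the 30-cycle multiply are taken as illustrative costs, and every other instruction is assumed to take one cycle.

```python
def cisc_multiply(a, b, mul_cycles=30):
    """Two mov instructions followed by one multi-cycle mul."""
    cycles = 2 * 1        # mov ax, a and mov bx, b at 1 cycle each
    product = a * b
    cycles += mul_cycles  # assumed cost of the single mul instruction
    return product, cycles


def risc_multiply(a, b):
    """Three mov instructions, then repeated addition in a counted loop."""
    cycles = 3 * 1        # clear the sum, load the addend, load the counter
    total = 0
    count = b
    while count > 0:
        total += a        # add: 1 cycle
        cycles += 1
        count -= 1        # loop: decrement the counter and branch, 1 cycle
        cycles += 1
    return total, cycles


cisc_product, cisc_cycles = cisc_multiply(10, 5)
risc_product, risc_cycles = risc_multiply(10, 5)
```

Both versions compute the same product. The RISC count grows with the loop counter, so the comparison depends on the operand values as well as on the assumed cost of the multiply.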
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 57101
um+er of Addresses cont-um+er of Addresses cont-
To$address machines
5ne address doubles (for source operand result)
3ast eample ma0es a case for it
$ Address T is used tice
Sample instructions
load destsrc M(dest)=[src]
add destsrc M(dest)=[dest]+[src]
sub destsrc M(dest)=[dest]-[src]
mult destsrc M(dest)=[dest][src]
Two Addresses
One address doubles as operand and resultExample a = a + b
The t$o-address formal reduces the space reuirement but also
introduces some a$$ardness To aoid alterin the alue of an
operand a ampOE instruction is used to moe one of the alues to a
result or temporary location before performin the operation
um+er of Addresses cont -um+er of Addresses
cont -
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 58101
um+er of Addresses cont-um+er of Addresses cont-
ltample
C statement
A C H D F 6 A
ltJuivalent code6
load TC T = C
mult TD T = CD
add TB T = B+CD
sub TE T = B+CD-Eadd TF T = B+CD-E+F
add AT A = B+CD-E+F+A
um+er of Addresses cont -um+er of Addresses
cont -
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 59101
um+er of Addresses cont-um+er of Addresses cont-
5ne$address machines 4se special set of registers called accumulators
$ Specify one source operand receive the result
Called accumulator machines
Sample instructions
load addr accum = [addr]
store addr M[addr] = accumadd addr accum = accum + [addr]
sub addr accum = accum - [addr]
mult addr accum = accum [addr]
um+er of Addresses cont -um+er of Addresses
cont -
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 60101
um+er of Addresses cont-um+er of Addresses cont-
ltample
C statementA C H D F 6 A
ltJuivalent code6
load C load C to accum
mult D accum = CD
add B accum = CD+B
sub E accum = B+CD-Eadd F accum = B+CD-E+F
add A accum = B+CD-E+F+A
store A store accum cotets A
um+er of Addresses cont -um+er of Addresses
cont -
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 61101
um+er of Addresses cont-um+er of Addresses cont-
Vero$address machines
Stac0 supplies operands and receives the result$ Special instructions to load and store use an address
Called stac0 machines (lt6 Pgtgtgt 8urroughs 8gtgt)
Sample instructions
us addr us([addr])
o addr o([addr])
add us(o + o)
sub us(o - o) mult us(o o)
um+er of Addresses cont -um+er of Addresses
cont-
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 62101
um+er of Addresses cont-um+er of Addresses cont-
ltample
C statement
A C H D F 6 A
ltJuivalent code6
us E sub
us C us F
us D add
Mult us A
us B add
add o A
)oadStore Architecture)oadStore Architecture
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 63101
)oadStore Architecture)oadStore Architecture
Instructions epect operands in internal processor registers Special 35A and ST51lt instructions move data beteen registers
and memory
1ISC uses this architecture
1educes instruction length
()
)oadStore Architecture cont-)oadStore Architecture
cont-
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 64101
)oadStore Architecture cont-)oadStore Architecture cont-
Sample instructionsload $daddr $d = [addr]
store addr$s (addr) = $s
add $d$s$samp $d = $s + $sampsub $d$s$samp $d = $s - $samp
mult $d$s$samp $d = $s $samp
um+er of Addresses cont-um+er of Addresses
cont-
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 65101
um+er of Addresses cont-um+er of Addresses cont-
ampleC statement
A = B + C D E + F + A
1uialent co)eload $B mult $amp$amp$
load $ampC add $amp$amp$
load $D sub $amp$amp$
load $E add $amp$amp$
load $F add $amp$amp$
load $A store A$amp
0lo1 of Control 0lo1 of Control
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 66101
0lo1 of Control 0lo1 of Control
efault is seJuential flo
Several instructions alter this defaulteecution
8ranches$ 4nconditional
$ Conditional
$ elayed branches Procedure calls
$ elayed procedure calls
0lo1 of Control cont-0lo1 of Control
cont-
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 67101
0lo1 of Control cont-0lo1 of Control cont-
8ranches
4nconditional
$ Absolute address
$ PC$relative
U Target address is specified relative to PC contents U 1elocatable code
ltample6 MIPS
$ Absolute address
9 target
$ PC$relative
8 target
0lo1 of Control cont-0lo1 of Control
cont-
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 68101
0lo1 of Control cont- -
e entium e R
0lo1 of Control cont-0lo1 of Control
cont-
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 69101
lo1 o Co t ol co t- -
8ranches
Conditional
$ ump is ta0en only if the condition is met
To types
$ Set$Then$ump
U Condition testing is separated from branching U Condition code registers are used to convey the condition test
result
U Condition code registers 0eep a record of the status of the last A34 operation such as overflo condition
$ ltample6 Pentium codecm AB comare A ad B
e taret um e0ual
0lo1 of Control cont-0lo1 of Control
cont-
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 70101
- -
$ Test$and$ump
U Single instruction performs condition testing and branching
$ ltample6 MIPS instruction
be0 $src$srcamptaret
umps to target if 1src E 1src
elayed branching
Control is transferred after eecuting the instruction thatfollos the branch instruction
$ This instruction slot is called delay slot Improves efficiency
ighly pipelined 1ISC processors support
0lo1 of Control cont-0lo1 of Control
cont-
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 71101
- -
Procedure calls Lacilitate modular programming
1eJuire to pieces of information to return
$ ltnd of procedure U Pentium
uses ret instruction
U MIPS
uses 9r instruction
$ 1eturn address U In a (special) register
MIPS allos any general$purpose register
U 5n the stac0
Pentium
0lo1 of Control cont-0lo1 of Control
cont-
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 72101
- -
0lo1 of Control cont-0lo1 of Control
cont-
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 73101
- -
elay slot
Parameter PassingParameter Passin
g
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 74101
gg
To basic techniJues 1egister$based (eg PoerPC MIPS)
$ Internal registers are used U Laster
U 3imit the number of parameters U 1ecursive procedure
Stac0$based (eg Pentium)
$ Stac0 is used U More general
2 perand Types2
perand Types
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 75101
p yp
Instructions support basic data types
Characters Integers
Lloating$point
Instruction overload
Same instruction for different data types
ltample6 Pentium mo1 A2address loads a 3-bt 1alue
mo1 Aaddress loads a -bt 1alue
mo1 EAaddress loads a amp-bt 1alue
perand Types
perand Types
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 76101
Separate instructions
Instructions specify the operand si-e
ltample6 MIPS
lb $destaddress loads a b4te
l $destaddress loads a al5ord( bts)
l5 $destaddress loads a 5ord
(amp bts)
ld $destaddress loads a double5ord
( bts)imilar instruction store
3 Addressing Modes
How the operands are specified
Operands can be in three places
- Registers
  - Register addressing mode
- Part of instruction
  - Constant
  - Immediate addressing mode
  - All processors support these two addressing modes
- Memory
  - Difference between RISC and CISC
  - CISC supports a large variety of addressing modes
  - RISC follows load/store architecture
4 Instruction Types
Several types of instructions
Data movement
- Pentium: mov dest,src
- Some do not provide direct data movement instructions
- Indirect data movement
    add $dest,$src,0    # $dest = $src+0
Arithmetic and Logical
- Arithmetic
  - Integer and floating-point, signed and unsigned
  - add, subtract, multiply, divide
- Logical
  - and, or, not, xor
Instruction Types (cont.)
Condition code bits
- S: Sign bit (0 = +, 1 = -)
- Z: Zero bit (0 = nonzero, 1 = zero)
- O: Overflow bit (0 = no overflow, 1 = overflow)
- C: Carry bit (0 = no carry, 1 = carry)
Example: Pentium
    cmp count,25    ; compare count to 25
                    ; (subtract 25 from count)
    je target       ; jump if equal
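The four condition code bits can be sketched as the side effects of a subtraction. The `compare` function below is a hypothetical helper, not a processor API; it assumes 32-bit operands and the x86-style convention that the carry bit records an unsigned borrow.

```python
def compare(a: int, b: int, bits: int = 32) -> dict:
    """Return the S, Z, O, C bits produced by computing a - b on `bits`-bit operands."""
    mask = (1 << bits) - 1
    sign = 1 << (bits - 1)
    a &= mask
    b &= mask
    result = (a - b) & mask
    return {
        "S": int(bool(result & sign)),          # sign of the result
        "Z": int(result == 0),                  # result is zero
        # Signed overflow: operand signs differ and the result sign differs from a.
        "O": int(bool((a ^ b) & (a ^ result) & sign)),
        "C": int(a < b),                        # unsigned borrow (assumed convention)
    }

equal_flags = compare(25, 25)
less_flags = compare(5, 25)
overflow_flags = compare(0x80000000, 1)
print(equal_flags, less_flags, overflow_flags)
```

A jump-if-equal instruction then only needs to inspect the Z bit left behind by the compare.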
Instruction Types (cont.)
Flow control and I/O instructions
- Branch
- Procedure call
- Interrupts
I/O instructions
- Memory-mapped I/O
  - Most processors support memory-mapped I/O
  - No separate instructions for I/O
- Isolated I/O
  - Pentium supports isolated I/O
  - Separate I/O instructions
      in AX,io_port     ; read from an I/O port
      out io_port,AX    ; write to an I/O port
5 Instruction Formats
Two types
Fixed-length
- Used by RISC processors
- 32-bit RISC processors use 32-bit-wide instructions
  - Examples: SPARC, MIPS, PowerPC
Variable-length
- Used by CISC processors
- Memory operands need more bits to specify
Opcode
- Major and exact operation
Examples of Instruction Formats
RISC (Reduced Instruction Set Computer)
versus
CISC (Complex Instruction Set Computer)
RISC vs CISC
The underlying philosophy of RISC machines is that a system is better able to manage program execution when the program consists of only a few different instructions that are the same length and require the same number of clock cycles to decode and execute.
RISC systems access memory only with explicit load and store instructions.
In CISC systems, many different kinds of instructions access memory, making instruction length variable and fetch-decode-execute time unpredictable.
The difference between CISC and RISC becomes evident through the basic computer performance equation:
    time/program = (time/cycle) × (cycles/instruction) × (instructions/program)
RISC systems shorten execution time by reducing the clock cycles per instruction.
CISC systems improve performance by reducing the number of instructions per program.
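The trade-off can be made concrete with the usual performance equation, execution time = instructions per program × cycles per instruction × time per cycle. The sketch below plugs in made-up numbers; the instruction counts, CPI values, and 1 ns cycle time are purely hypothetical and do not come from any measured machine.

```python
def execution_time_ns(instructions: int, cycles_per_instruction: float,
                      cycle_time_ns: float) -> float:
    """Execution time as the product of the three factors of the performance equation."""
    return instructions * cycles_per_instruction * cycle_time_ns

# Hypothetical workloads: the RISC version runs more instructions at a lower CPI.
risc_time = execution_time_ns(instructions=1000, cycles_per_instruction=1.0, cycle_time_ns=1.0)
cisc_time = execution_time_ns(instructions=400, cycles_per_instruction=4.0, cycle_time_ns=1.0)

print(risc_time, cisc_time)
```

Which design wins depends on how the three factors move together, which is why neither lowering the instruction count nor lowering the CPI is sufficient on its own.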
The simple instruction set of RISC machines enables control units to be hardwired for maximum speed.
The more complex (and variable) instruction set of CISC machines requires microcode-based control units that interpret instructions as they are fetched from memory. This translation takes time.
With fixed-length instructions, RISC lends itself to pipelining and speculative execution.
Consider the program fragments:
CISC
    mov ax, 10
    mov bx, 5
    mul bx, ax
RISC
           mov ax, 0
           mov bx, 10
           mov cx, 5
    Begin: add ax, bx
           loop Begin
The total clock cycles for the CISC version might be:
    (2 movs × 1 cycle) + (1 mul × 30 cycles) = 32 cycles
While the clock cycles for the RISC version are:
    (3 movs × 1 cycle) + (5 adds × 1 cycle) + (5 loops × 1 cycle) = 13 cycles
With the RISC clock cycle being shorter, RISC gives us much faster execution speeds.
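The cycle totals can be checked with a small tally. The per-instruction costs below (1 cycle for mov, add, and loop; 30 cycles for mul) and the five loop iterations are assumptions consistent with the 32-cycle and 13-cycle totals quoted for this example. They are illustrative, not timings of a real processor.

```python
# Assumed cycle cost of each instruction type in this example.
CYCLES = {"mov": 1, "mul": 30, "add": 1, "loop": 1}

ITERATIONS = 5  # multiply 10 by 5 through repeated addition

cisc_trace = ["mov", "mov", "mul"]
risc_trace = ["mov", "mov", "mov"] + ["add", "loop"] * ITERATIONS

def total_cycles(trace):
    """Sum the assumed cycle cost of every executed instruction."""
    return sum(CYCLES[op] for op in trace)

cisc_cycles = total_cycles(cisc_trace)
risc_cycles = total_cycles(risc_trace)
print(cisc_cycles, risc_cycles)
```

The RISC version executes more instructions (13 against 3) but still needs fewer cycles under these assumptions, because no single instruction is expensive.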
Because of their load-store ISAs, RISC architectures require a large number of CPU registers.
These registers provide fast access to data during sequential program execution.
They can also be employed to reduce the overhead typically caused by passing parameters to subprograms.
Instead of pulling parameters off of a stack, the subprogram is directed to use a subset of registers.
This is how registers can be overlapped in a RISC system.
The current window pointer (CWP) points to the active register window.
[Figure: overlapped register windows]
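A toy model can show how overlapped windows pass parameters without copying. Everything in this sketch is an assumption made for illustration: the class name, the window count, the 8 shared and 8 local registers per window, and the choice to advance the CWP on a call. Real designs differ in these details and also handle window overflow, which is omitted here.

```python
class RegisterWindows:
    """Toy overlapped register file: a caller's 'out' registers are the callee's 'in' registers."""

    def __init__(self, windows: int = 4, shared: int = 8, local: int = 8):
        self.windows = windows
        self.shared = shared
        self.stride = shared + local            # registers added by each new window
        self.regs = [0] * (windows * self.stride)
        self.cwp = 0                            # current window pointer

    def _index(self, kind: str, n: int) -> int:
        base = self.cwp * self.stride
        offset = {"in": 0, "local": self.shared, "out": self.stride}[kind]
        return (base + offset + n) % len(self.regs)

    def read(self, kind: str, n: int) -> int:
        return self.regs[self._index(kind, n)]

    def write(self, kind: str, n: int, value: int) -> None:
        self.regs[self._index(kind, n)] = value

    def call(self) -> None:
        self.cwp = (self.cwp + 1) % self.windows

    def ret(self) -> None:
        self.cwp = (self.cwp - 1) % self.windows


rw = RegisterWindows()
rw.write("out", 0, 42)          # caller places a parameter in its out register
rw.call()                       # moving the CWP exposes it as the callee's in register
received = rw.read("in", 0)
rw.write("in", 0, 7)            # callee leaves a result in the same physical register
rw.ret()
returned = rw.read("out", 0)
print(received, returned)
```

No data moves on the call or the return; only the window pointer changes.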
It is becoming increasingly difficult to distinguish RISC architectures from CISC architectures.
Some RISC systems provide more extravagant instruction sets than some CISC systems.
Some systems combine both approaches.
The following two slides summarize the characteristics that traditionally typify the differences between these two architectures.
RISC                                              CISC
Multiple register sets                            Single register set
Three operands per instruction                    One or two register operands per instruction
Parameter passing through register windows        Parameter passing through memory
Single-cycle instructions                         Multiple-cycle instructions
Hardwired control                                 Microprogrammed control
Highly pipelined                                  Less pipelined
Continued
RISC                                              CISC
Simple instructions, few in number                Many complex instructions
Fixed-length instructions                         Variable-length instructions
Complexity in compiler                            Complexity in microcode
Only LOAD/STORE instructions access memory        Many instructions can access memory
Few addressing modes                              Many addressing modes
Summary
Instruction Set Design Issues
Instruction set design issues include:
Where are operands stored?
- registers, memory, stack, accumulator
How many explicit operands are there?
- 0, 1, 2, or 3
How is the operand location specified?
- register, immediate, indirect, . . .
What type & size of operands are supported?
- byte, int, float, double, string, vector, . . .
What operations are supported?
- add, sub, mul, move, compare, . . .
More About General Purpose Registers
Why do almost all new architectures use GPRs?
Registers are much faster than memory (even cache)
- Register values are available immediately
- When memory isn't ready, processor must wait ("stall")
Registers are convenient for variable storage
- Compiler assigns some variables just to registers
- More compact code since small fields specify registers (compared to memory addresses)
[Figure: Processor (Registers), Cache, Memory, Disk]
What Operations are Needed
Arithmetic & Logical
- Integer arithmetic: ADD, SUB, MULT, DIV, SHIFT
- Logical operation: AND, OR, XOR, NOT
Data Transfer - copy, load, store
Control - branch, jump, call, return
Floating Point - ADDF, MULF, DIVF (same as arithmetic but usually take bigger operands)
Decimal - ADDD, CONVERT
String - move, compare, search
Graphics - pixel and vertex, compression/decompression operations
Stack Architecture: Pros and Cons
Pros
- Good code density (implicit top of stack)
- Low hardware requirements
- Easy to write a simpler compiler for stack architectures
Cons
- Stack becomes the bottleneck
- Little ability for parallelism or pipelining
- Data is not always at the top of stack when needed, so additional instructions like TOP and SWAP are needed
- Difficult to write an optimizing compiler for stack architectures
Accumulator Architecture: Pros and Cons
Pros
- Very low hardware requirements
- Easy to design and understand
Cons
- Accumulator becomes the bottleneck
- Little ability for parallelism or pipelining
- High memory traffic
Memory-Memory Architecture: Pros and Cons
Pros
- Requires fewer instructions (especially if 3 operands)
- Easy to write compilers for (especially if 3 operands)
Cons
- Very high memory traffic (especially if 3 operands)
- Variable number of clocks per instruction
- With two operands, more data movements are required
Memory-Register Architecture: Pros and Cons
Pros
- Some data can be accessed without loading first
- Instruction format easy to encode
- Good code density
Cons
- Operands are not equivalent (poor orthogonality)
- Variable number of clocks per instruction
- May limit number of registers
Number of Addresses (cont.)
Example
C statement
    A = B + C * D - E + F + A
Equivalent code:
    load T,C    ; T = C
    mult T,D    ; T = C*D
    add  T,B    ; T = B+C*D
    sub  T,E    ; T = B+C*D-E
    add  T,F    ; T = B+C*D-E+F
    add  A,T    ; A = B+C*D-E+F+A
Number of Addresses (cont.)
One-address machines
- Use special set of registers called accumulators
  - Specify one source operand & receive the result
- Called accumulator machines
Sample instructions
    load  addr    ; accum = [addr]
    store addr    ; [addr] = accum
    add   addr    ; accum = accum + [addr]
    sub   addr    ; accum = accum - [addr]
    mult  addr    ; accum = accum * [addr]
Number of Addresses (cont.)
Example
C statement
    A = B + C * D - E + F + A
Equivalent code:
    load  C    ; load C into accum
    mult  D    ; accum = C*D
    add   B    ; accum = C*D+B
    sub   E    ; accum = B+C*D-E
    add   F    ; accum = B+C*D-E+F
    add   A    ; accum = B+C*D-E+F+A
    store A    ; store accum contents in A
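The one-address sequence can be traced with a few lines of interpreter. This is a minimal sketch: `run_accumulator`, the tuple encoding of instructions, and the initial variable values are all invented for illustration.

```python
def run_accumulator(program, memory):
    """Execute one-address instructions against a single accumulator."""
    accum = 0
    for op, addr in program:
        if op == "load":
            accum = memory[addr]
        elif op == "store":
            memory[addr] = accum
        elif op == "add":
            accum += memory[addr]
        elif op == "sub":
            accum -= memory[addr]
        elif op == "mult":
            accum *= memory[addr]
        else:
            raise ValueError(f"unknown instruction: {op}")
    return memory

# Hypothetical initial values for the variables in A = B + C * D - E + F + A.
accum_memory = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
accum_program = [("load", "C"), ("mult", "D"), ("add", "B"), ("sub", "E"),
                 ("add", "F"), ("add", "A"), ("store", "A")]

run_accumulator(accum_program, accum_memory)
print(accum_memory["A"])
```

Every instruction names exactly one memory operand; the accumulator is always the implicit second operand and the destination.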
Number of Addresses (cont.)
Zero-address machines
- Stack supplies operands and receives the result
  - Special instructions to load and store use an address
- Called stack machines (Ex: HP3000, Burroughs B5500)
Sample instructions
    push addr    ; push([addr])
    pop  addr    ; pop([addr])
    add          ; push(pop + pop)
    sub          ; push(pop - pop)
    mult         ; push(pop * pop)
Number of Addresses (cont.)
Example
C statement
    A = B + C * D - E + F + A
Equivalent code:
    push E
    push C
    push D
    mult
    push B
    add
    sub
    push F
    add
    push A
    add
    pop  A
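The zero-address sequence can be checked the same way. The sketch below assumes that `sub` computes (first value popped) minus (second value popped), which is how push(pop - pop) is read here; the function name, the string encoding of instructions, and the initial values are hypothetical.

```python
def run_stack(program, memory):
    """Execute zero-address instructions; only push and pop name a memory address."""
    stack = []
    for instruction in program:
        op, *args = instruction.split()
        if op == "push":
            stack.append(memory[args[0]])
        elif op == "pop":
            memory[args[0]] = stack.pop()
        elif op in ("add", "sub", "mult"):
            first = stack.pop()      # top of stack
            second = stack.pop()     # next entry down
            if op == "add":
                stack.append(first + second)
            elif op == "sub":
                stack.append(first - second)
            else:
                stack.append(first * second)
        else:
            raise ValueError(f"unknown instruction: {op}")
    return memory

stack_memory = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
stack_program = ["push E", "push C", "push D", "mult", "push B", "add",
                 "sub", "push F", "add", "push A", "add", "pop A"]

run_stack(stack_program, stack_memory)
print(stack_memory["A"])
```

Pushing E first is what makes the later `sub` produce B + C*D - E rather than the reverse.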
Load/Store Architecture
Instructions expect operands in internal processor registers
- Special LOAD and STORE instructions move data between registers and memory
- RISC uses this architecture
- Reduces instruction length
Load/Store Architecture (cont.)
Sample instructions
    load  $d,addr       ; $d = [addr]
    store addr,$s       ; [addr] = $s
    add   $d,$s1,$s2    ; $d = $s1 + $s2
    sub   $d,$s1,$s2    ; $d = $s1 - $s2
    mult  $d,$s1,$s2    ; $d = $s1 * $s2
Number of Addresses (cont.)
Example
C statement
    A = B + C * D - E + F + A
Equivalent code:
    load  $1,B
    load  $2,C
    load  $3,D
    load  $4,E
    load  $5,F
    load  $6,A
    mult  $2,$2,$3
    add   $2,$2,$1
    sub   $2,$2,$4
    add   $2,$2,$5
    add   $2,$2,$6
    store A,$2
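One property of the load/store version is that memory is touched only by the load and store instructions. The sketch below counts those accesses for the twelve-instruction sequence; the interpreter, the register names, and the initial values are assumptions for illustration.

```python
def run_load_store(program, memory):
    """Execute the program and return how many instructions accessed memory."""
    regs = {}
    memory_accesses = 0
    for op, *args in program:
        if op == "load":
            reg, addr = args
            regs[reg] = memory[addr]
            memory_accesses += 1
        elif op == "store":
            addr, reg = args
            memory[addr] = regs[reg]
            memory_accesses += 1
        else:
            # Arithmetic works on registers only: dest = src1 <op> src2.
            dest, src1, src2 = args
            a, b = regs[src1], regs[src2]
            regs[dest] = {"add": a + b, "sub": a - b, "mult": a * b}[op]
    return memory_accesses

ls_memory = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
ls_program = [
    ("load", "$1", "B"), ("load", "$2", "C"), ("load", "$3", "D"),
    ("load", "$4", "E"), ("load", "$5", "F"), ("load", "$6", "A"),
    ("mult", "$2", "$2", "$3"), ("add", "$2", "$2", "$1"),
    ("sub", "$2", "$2", "$4"), ("add", "$2", "$2", "$5"),
    ("add", "$2", "$2", "$6"), ("store", "A", "$2"),
]

ls_accesses = run_load_store(ls_program, ls_memory)
print(ls_memory["A"], ls_accesses)
```

Twelve instructions are executed, but only seven of them go to memory; the five arithmetic instructions stay inside the register file.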
Flow of Control
Default is sequential flow
Several instructions alter this default execution
- Branches
  - Unconditional
  - Conditional
  - Delayed branches
- Procedure calls
  - Delayed procedure calls
Flow of Control (cont.)
Branches
Unconditional
- Absolute address
- PC-relative
  - Target address is specified relative to PC contents
  - Relocatable code
Example: MIPS
- Absolute address
    j target
- PC-relative
    b target
Flow of Control (cont.)
[Figure: e.g., Pentium; second label unrecoverable]
Flow of Control (cont.)
Branches
Conditional
- Jump is taken only if the condition is met
Two types
- Set-Then-Jump
  - Condition testing is separated from branching
  - Condition code registers are used to convey the condition test result
  - Condition code registers keep a record of the status of the last ALU operation, such as overflow condition
- Example: Pentium code
    cmp AX,BX    ; compare AX and BX
    je target    ; jump if equal
Flow of Control (cont.)
- Test-and-Jump
  - Single instruction performs condition testing and branching
- Example: MIPS instruction
    beq $src1,$src2,target
  Jumps to target if $src1 = $src2
Delayed branching
- Control is transferred after executing the instruction that follows the branch instruction
  - This instruction slot is called the delay slot
- Improves efficiency
- Highly pipelined RISC processors support delayed branching
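The effect of a delay slot on execution order can be sketched with a toy program counter loop. The instruction encoding and the `trace` function are hypothetical, and a branch placed inside a delay slot is simply treated as an ordinary instruction here to keep the example short.

```python
def trace(program, delayed):
    """Return the indices of executed instructions for a toy program."""
    executed = []
    pc = 0
    pending_target = None          # branch target waiting for the delay slot to finish
    while pc < len(program):
        op, target = program[pc]
        executed.append(pc)
        if pending_target is not None:
            # The delay-slot instruction has just run; now transfer control.
            pc, pending_target = pending_target, None
        elif op == "branch" and delayed:
            pending_target = target
            pc += 1
        elif op == "branch":
            pc = target
        else:
            pc += 1
    return executed

toy_program = [("work", None), ("branch", 4), ("slot", None),
               ("skipped", None), ("target", None)]

immediate_order = trace(toy_program, delayed=False)
delayed_order = trace(toy_program, delayed=True)
print(immediate_order, delayed_order)
```

With delayed branching the instruction at index 2 runs even though the branch is taken, so a compiler tries to place useful work there instead of a no-op.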
Flow of Control (cont.)
Procedure calls
- Facilitate modular programming
- Require two pieces of information to return
  - End of procedure
    - Pentium uses ret instruction
    - MIPS uses jr instruction
  - Return address
    - In a (special) register
      MIPS allows any general-purpose register
    - On the stack
      Pentium
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 59101
um+er of Addresses cont-um+er of Addresses cont-
5ne$address machines 4se special set of registers called accumulators
$ Specify one source operand receive the result
Called accumulator machines
Sample instructions
load addr accum = [addr]
store addr M[addr] = accumadd addr accum = accum + [addr]
sub addr accum = accum - [addr]
mult addr accum = accum [addr]
um+er of Addresses cont -um+er of Addresses
cont -
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 60101
um+er of Addresses cont-um+er of Addresses cont-
ltample
C statementA C H D F 6 A
ltJuivalent code6
load C load C to accum
mult D accum = CD
add B accum = CD+B
sub E accum = B+CD-Eadd F accum = B+CD-E+F
add A accum = B+CD-E+F+A
store A store accum cotets A
um+er of Addresses cont -um+er of Addresses
cont -
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 61101
um+er of Addresses cont-um+er of Addresses cont-
Vero$address machines
Stac0 supplies operands and receives the result$ Special instructions to load and store use an address
Called stac0 machines (lt6 Pgtgtgt 8urroughs 8gtgt)
Sample instructions
us addr us([addr])
o addr o([addr])
add us(o + o)
sub us(o - o) mult us(o o)
um+er of Addresses cont -um+er of Addresses
cont-
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 62101
um+er of Addresses cont-um+er of Addresses cont-
ltample
C statement
A C H D F 6 A
ltJuivalent code6
us E sub
us C us F
us D add
Mult us A
us B add
add o A
)oadStore Architecture)oadStore Architecture
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 63101
)oadStore Architecture)oadStore Architecture
Instructions epect operands in internal processor registers Special 35A and ST51lt instructions move data beteen registers
and memory
1ISC uses this architecture
1educes instruction length
()
)oadStore Architecture cont-)oadStore Architecture
cont-
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 64101
)oadStore Architecture cont-)oadStore Architecture cont-
Sample instructionsload $daddr $d = [addr]
store addr$s (addr) = $s
add $d$s$samp $d = $s + $sampsub $d$s$samp $d = $s - $samp
mult $d$s$samp $d = $s $samp
um+er of Addresses cont-um+er of Addresses
cont-
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 65101
um+er of Addresses cont-um+er of Addresses cont-
ampleC statement
A = B + C D E + F + A
1uialent co)eload $B mult $amp$amp$
load $ampC add $amp$amp$
load $D sub $amp$amp$
load $E add $amp$amp$
load $F add $amp$amp$
load $A store A$amp
Flow of Control
Default is sequential flow
Several instructions alter this default execution
Branches
- Unconditional
- Conditional
- Delayed branches
Procedure calls
- Delayed procedure calls
Flow of Control (cont.)
Branches
Unconditional
- Absolute address
- PC-relative
  - Target address is specified relative to PC contents
  - Relocatable code
Example: MIPS
- Absolute address
  j target
- PC-relative
  b target
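Why PC-relative branches give relocatable code can be shown with a little arithmetic. The sketch below is an assumption-laden illustration in Python: the function names, the branch and target positions, and the load addresses are hypothetical, and real ISAs usually measure the offset from the instruction after the branch and in units of words rather than bytes.

```python
def absolute_target(encoded_address):
    """Absolute branch: the instruction carries the full target address."""
    return encoded_address


def pc_relative_target(pc, offset):
    """PC-relative branch: the instruction carries only a signed offset from the PC."""
    return pc + offset


def relocate_demo(old_base, new_base):
    """Load the same code at two base addresses and resolve the same encoded offset."""
    branch_position = 8     # hypothetical position of the branch inside the code
    target_position = 32    # hypothetical position of the target inside the code
    offset = target_position - branch_position   # fixed when the code is assembled
    old_target = pc_relative_target(old_base + branch_position, offset)
    new_target = pc_relative_target(new_base + branch_position, offset)
    # Report each target relative to where the code was loaded.
    return old_target - old_base, new_target - new_base


print(relocate_demo(0x1000, 0x8000))  # (32, 32): same instruction bits work at both addresses
```

An absolute branch encoded for the first load address would still point at 0x1020 after the move, so it would need to be patched.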
Flow of Control (cont.)
Pentium example (figure not recovered)
Flow of Control (cont.)
Branches
Conditional
- Jump is taken only if the condition is met
Two types
- Set-Then-Jump
  - Condition testing is separated from branching
  - Condition code registers are used to convey the condition test result
  - Condition code registers keep a record of the status of the last ALU operation, such as overflow condition
- Example: Pentium code
  cmp AX,BX     ; compare AX and BX
  je  target    ; jump if equal
Flow of Control (cont.)
- Test-and-Jump
  - Single instruction performs condition testing and branching
- Example: MIPS instruction
  beq Rsrc1,Rsrc2,target
  Jumps to target if Rsrc1 = Rsrc2
Delayed branching
Control is transferred after executing the instruction that follows the branch instruction
- This instruction slot is called delay slot
Improves efficiency
Highly pipelined RISC processors support
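The effect of a delay slot can be seen by tracing which instructions execute with and without delayed branching. This is a minimal sketch in Python under simplifying assumptions: the program is a list of hypothetical `(opcode, target)` pairs, only unconditional branches are modeled, and a branch placed in a delay slot is treated as an ordinary instruction.

```python
def execution_trace(program, delayed):
    """Return the indices of the instructions that execute, in order."""
    trace = []
    pc = 0
    pending_target = None   # target of a branch whose delay slot has not run yet
    while pc < len(program):
        opcode, target = program[pc]
        trace.append(pc)
        if pending_target is not None:
            # The delay-slot instruction has just run; now the branch takes effect.
            pc, pending_target = pending_target, None
        elif opcode == "branch" and delayed:
            pending_target = target   # remember the target, fall through to the slot
            pc += 1
        elif opcode == "branch":
            pc = target               # ordinary branch: transfer control immediately
        else:
            pc += 1
    return trace


# Instruction 1 branches to instruction 4; instruction 2 sits in the delay slot.
PROGRAM = [("op", None), ("branch", 4), ("op", None), ("op", None), ("op", None)]
print(execution_trace(PROGRAM, delayed=False))  # [0, 1, 4]
print(execution_trace(PROGRAM, delayed=True))   # [0, 1, 2, 4]
```

With delayed branching, instruction 2 does useful work in the cycle a pipelined processor would otherwise lose while the branch target is being fetched.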
Flow of Control (cont.)
Procedure calls
Facilitate modular programming
Require two pieces of information to return
- End of procedure
  - Pentium uses ret instruction
  - MIPS uses jr instruction
- Return address
  - In a (special) register
    MIPS allows any general-purpose register
  - On the stack
    Pentium
Flow of Control (cont.)
Delay slot
Parameter Passing
Two basic techniques
Register-based (e.g., PowerPC, MIPS)
- Internal registers are used
  - Faster
  - Limit the number of parameters
  - Recursive procedure
Stack-based (e.g., Pentium)
- Stack is used
  - More general
2 Operand Types
Instructions support basic data types
- Characters
- Integers
- Floating-point
Instruction overload
- Same instruction for different data types
- Example: Pentium
  mov AL,address     ; loads an 8-bit value
  mov AX,address     ; loads a 16-bit value
  mov EAX,address    ; loads a 32-bit value
Operand Types
Separate instructions
- Instructions specify the operand size
- Example: MIPS
  lb Rdest,address    ; loads a byte
  lh Rdest,address    ; loads a halfword (16 bits)
  lw Rdest,address    ; loads a word (32 bits)
  ld Rdest,address    ; loads a doubleword (64 bits)
Similar instructions for store
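The difference between the byte, halfword, word, and doubleword loads is only how many bytes are read from the same address. The Python sketch below mimics that with the standard `struct` module; the function name `load`, the sample memory contents, and the little-endian byte order are assumptions (MIPS implementations can be configured for either byte order).

```python
import struct

# Signed little-endian formats for 1-, 2-, 4-, and 8-byte operands.
FORMATS = {1: "<b", 2: "<h", 4: "<i", 8: "<q"}


def load(memory, address, size):
    """Read a signed operand of `size` bytes starting at `address`."""
    return struct.unpack_from(FORMATS[size], memory, address)[0]


memory = bytes([0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07, 0x08])
print(hex(load(memory, 0, 1)))  # byte:       0x1
print(hex(load(memory, 0, 2)))  # halfword:   0x201
print(hex(load(memory, 0, 4)))  # word:       0x4030201
print(hex(load(memory, 0, 8)))  # doubleword: 0x807060504030201
```

With separate instructions the operand size is part of the opcode, whereas the overloaded Pentium `mov` infers it from the register operand.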
3 Addressing Modes
How the operands are specified
Operands can be in three places
- Registers
  - Register addressing mode
- Part of instruction
  - Constant
  - Immediate addressing mode
  - All processors support these two addressing modes
- Memory
  - Difference between RISC and CISC
  - CISC supports a large variety of addressing modes
  - RISC follows load/store architecture
4 Instruction Types
Several types of instructions
Data movement
- Pentium: mov dest,src
- Some do not provide direct data movement instructions
- Indirect data movement
  add Rdest,Rsrc,0    ; Rdest = Rsrc+0
Arithmetic and Logical
- Arithmetic
  - Integer and floating-point, signed and unsigned
  - add, subtract, multiply, divide
- Logical
  - and, or, not, xor
Instruction Types (cont.)
Condition code bits
S: Sign bit (0 = +, 1 = -)
Z: Zero bit (0 = nonzero, 1 = zero)
O: Overflow bit (0 = no overflow, 1 = overflow)
C: Carry bit (0 = no carry, 1 = carry)
Example: Pentium
cmp count,25    ; compare count to 25
                ; subtract 25 from count
je  target      ; jump if equal
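How a compare instruction sets the four condition code bits can be sketched by doing the subtraction explicitly. This Python model is an illustration only: `compare_flags` and `je_taken` are hypothetical names, operands are assumed to be 32 bits wide, and the carry bit follows the x86 convention of signalling a borrow on subtraction (other ISAs define it differently).

```python
MASK32 = 0xFFFFFFFF


def compare_flags(a, b):
    """Set S, Z, O, C as a 32-bit compare would, by computing a - b and discarding it."""
    a &= MASK32
    b &= MASK32
    result = (a - b) & MASK32
    return {
        "S": result >> 31,                       # sign bit of the result
        "Z": int(result == 0),                   # 1 when the operands are equal
        "O": ((a ^ b) & (a ^ result)) >> 31,     # signed overflow on subtraction
        "C": int(a < b),                         # unsigned borrow (x86 convention)
    }


def je_taken(flags):
    """A jump-if-equal is taken when the zero bit is set."""
    return flags["Z"] == 1


print(compare_flags(25, 25), je_taken(compare_flags(25, 25)))
print(compare_flags(5, 25), je_taken(compare_flags(5, 25)))
```

The branch instruction never sees the operands themselves, only the bits left behind by the compare, which is the separation that set-then-jump relies on.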
Instruction Types (cont.)
Flow control and I/O instructions
- Branch
- Procedure call
- Interrupts
I/O instructions
- Memory-mapped I/O
  - Most processors support memory-mapped I/O
  - No separate instructions for I/O
- Isolated I/O
  - Pentium supports isolated I/O
  - Separate I/O instructions
    in  AX,io_port    ; read from an I/O port
    out io_port,AX    ; write to an I/O port
5 Instruction Formats
Two types
Fixed-length
- Used by RISC processors
- 32-bit RISC processors use 32-bit-wide instructions
  - Examples: SPARC, MIPS, PowerPC
Variable-length
- Used by CISC processors
- Memory operands need more bits to specify
Opcode
Major and exact operation
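A fixed-length format makes decoding a matter of shifting and masking at known bit positions. The sketch below splits a 32-bit word using the standard MIPS R-format field layout (opcode, rs, rt, rd, shamt, funct); the layout is background knowledge rather than something shown on the slide, and the function name `decode_r_format` is an assumption.

```python
def decode_r_format(word):
    """Split a 32-bit MIPS R-format instruction word into its six fields."""
    return {
        "opcode": (word >> 26) & 0x3F,   # bits 31..26: major operation
        "rs":     (word >> 21) & 0x1F,   # bits 25..21: first source register
        "rt":     (word >> 16) & 0x1F,   # bits 20..16: second source register
        "rd":     (word >> 11) & 0x1F,   # bits 15..11: destination register
        "shamt":  (word >> 6) & 0x1F,    # bits 10..6:  shift amount
        "funct":  word & 0x3F,           # bits 5..0:   exact operation
    }


# 0x02324020 encodes add $t0, $s1, $s2 (rd = 8, rs = 17, rt = 18, funct = 0x20).
print(decode_r_format(0x02324020))
```

Here the opcode field gives the major operation and the funct field the exact one. A variable-length format cannot be decoded this way until the first bytes have been examined to find out how long the instruction is.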
Examples of Instruction Formats
RISC (Reduced Instruction Set Computer)
versus
CISC (Complex Instruction Set Computer)
RISC vs CISC
The underlying philosophy of RISC machines is that a system is better able to manage program execution when the program consists of only a few different instructions that are the same length and require the same number of clock cycles to decode and execute.
RISC systems access memory only with explicit load and store instructions.
In CISC systems, many different kinds of instructions access memory, making instruction length variable and fetch-decode-execute time unpredictable.
RISC vs CISC
The difference between CISC and RISC becomes evident through the basic computer performance equation:
RISC systems shorten execution time by reducing the clock cycles per instruction.
CISC systems improve performance by reducing the number of instructions per program.
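The performance equation itself did not survive extraction from the slide. In its usual textbook form, given here as background rather than recovered from the slide image, it reads:

```latex
\frac{\text{time}}{\text{program}}
  = \frac{\text{time}}{\text{cycle}}
  \times \frac{\text{cycles}}{\text{instruction}}
  \times \frac{\text{instructions}}{\text{program}}
```

RISC designs attack the middle factor and CISC designs the last one; the first factor is the clock period.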
RISC vs CISC
The simple instruction set of RISC machines enables control units to be hardwired for maximum speed.
The more complex, and variable, instruction set of CISC machines requires microcode-based control units that interpret instructions as they are fetched from memory. This translation takes time.
With fixed-length instructions, RISC lends itself to pipelining and speculative execution.
RISC vs CISC
Consider the program fragments:
CISC
mov ax, 10
mov bx, 5
mul bx, ax
RISC
mov ax, 0
mov bx, 10
mov cx, 5
Begin: add ax, bx
loop Begin
The total clock cycles for the CISC version might be:
(2 movs × 1 cycle) + (1 mul × 30 cycles) = 32 cycles
While the clock cycles for the RISC version is:
(3 movs × 1 cycle) + (5 adds × 1 cycle) + (5 loops × 1 cycle) = 13 cycles
With the RISC clock cycle being shorter, RISC gives us much faster execution speeds.
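The cycle comparison is a weighted sum of instruction counts, which the following Python sketch reproduces. The per-instruction costs (1 cycle for mov, add, and loop; 30 cycles for mul) and the operand values 10 and 5 are assumptions taken from the usual textbook version of this example, and the helper names are hypothetical.

```python
# Assumed cycle costs per instruction type.
CYCLES = {"mov": 1, "add": 1, "loop": 1, "mul": 30}


def total_cycles(instruction_counts):
    """Sum count x cycles over every instruction type that executes."""
    return sum(count * CYCLES[name] for name, count in instruction_counts.items())


def risc_multiply(multiplicand, multiplier):
    """Repeated addition, mirroring the RISC fragment (multiplier must be >= 1)."""
    ax, bx, cx = 0, multiplicand, multiplier
    while True:
        ax += bx          # Begin: add ax, bx
        cx -= 1           # loop Begin: decrement cx ...
        if cx == 0:       # ... and fall through once it reaches zero
            break
    return ax


cisc_cycles = total_cycles({"mov": 2, "mul": 1})
risc_cycles = total_cycles({"mov": 3, "add": 5, "loop": 5})
print(cisc_cycles, risc_cycles, risc_multiply(10, 5))  # 32 13 50
```

The RISC version executes more instructions, yet the sum comes out smaller because no single instruction is expensive. The advantage depends on the multiplier: the loop cost grows with it, while the cost of `mul` does not.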
RISC vs CISC
Because of their load-store ISAs, RISC architectures require a large number of CPU registers.
These registers provide fast access to data during sequential program execution.
They can also be employed to reduce the overhead typically caused by passing parameters to subprograms.
Instead of pulling parameters off of a stack, the subprogram is directed to use a subset of registers.
RISC vs CISC
This is how registers can be overlapped in a RISC system.
The current window pointer (CWP) points to the active register window.
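Since the figure for this slide was lost, the overlap can be sketched in code instead. The Python model below is a simplified illustration with assumed sizes (4 windows, 8 shared and 8 local registers each) and hypothetical names; global registers and window overflow handling are left out.

```python
class RegisterWindows:
    """Circular register file in which each window's outputs are the next window's inputs."""

    def __init__(self, windows=4, shared=8, local=8):
        self.windows = windows
        self.shared = shared            # registers shared with a neighbouring window
        self.local = local              # registers private to one window
        self.stride = shared + local    # physical registers each window adds
        self.cwp = 0                    # current window pointer

    def physical(self, kind, index):
        """Map a logical register of the active window to a physical register number."""
        base = self.cwp * self.stride
        if kind == "in":
            offset = index
        elif kind == "local":
            offset = self.shared + index
        elif kind == "out":
            offset = self.stride + index    # lands in the next window's "in" block
        else:
            raise ValueError(kind)
        return (base + offset) % (self.windows * self.stride)

    def call(self):
        """Procedure call: advance the CWP so the callee gets a fresh window."""
        self.cwp = (self.cwp + 1) % self.windows

    def ret(self):
        """Procedure return: move the CWP back to the caller's window."""
        self.cwp = (self.cwp - 1) % self.windows


rw = RegisterWindows()
caller_out = rw.physical("out", 3)
rw.call()
print(caller_out == rw.physical("in", 3))  # True: no copying is needed to pass the parameter
```

A parameter written to an output register by the caller is already sitting in the callee's input register after the call, which is how register windows avoid passing parameters through memory.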
RISC vs CISC
It is becoming increasingly difficult to distinguish RISC architectures from CISC architectures.
Some RISC systems provide more extravagant instruction sets than some CISC systems.
Some systems combine both approaches.
The following two slides summarize the characteristics that traditionally typify the differences between these two architectures.
RISC vs CISC
RISC
- Multiple register sets.
- Three operands per instruction.
- Parameter passing through register windows.
- Single-cycle instructions.
- Hardwired control.
- Highly pipelined.
CISC
- Single register set.
- One or two register operands per instruction.
- Parameter passing through memory.
- Multiple cycle instructions.
- Microprogrammed control.
- Less pipelined.
(continued)
RISC vs CISC
RISC
- Simple instructions, few in number.
- Fixed length instructions.
- Complexity in compiler.
- Only LOAD/STORE instructions access memory.
- Few addressing modes.
CISC
- Many complex instructions.
- Variable length instructions.
- Complexity in microcode.
- Many instructions can access memory.
- Many addressing modes.
RISC vs CISC
Summary
Instruction Set Design Issues
Instruction set design issues include:
Where are operands stored?
- registers, memory, stack, accumulator
How many explicit operands are there?
- 0, 1, 2, or 3
How is the operand location specified?
- register, immediate, indirect, . . .
What type & size of operands are supported?
- byte, int, float, double, string, vector. . .
What operations are supported?
- add, sub, mul, move, compare . . .
More About General Purpose Registers
Why do almost all new architectures use GPRs?
Registers are much faster than memory (even cache)
- Register values are available immediately
- When memory isn't ready, processor must wait ("stall")
Registers are convenient for variable storage
- Compiler assigns some variables just to registers
- More compact code since small fields specify registers (compared to memory addresses)
Figure: Processor, Registers, Cache, Memory, Disk
What Operations are Needed
Arithmetic + Logical
- Integer arithmetic: ADD, SUB, MULT, DIV, SHIFT
- Logical operation: AND, OR, XOR, NOT
Data Transfer - copy, load, store
Control - branch, jump, call, return
Floating Point
- ADDF, MULF, DIVF
- Same as arithmetic but usually take bigger operands
Decimal - ADDD, CONVERT
String - move, compare, search
Graphics - pixel and vertex, compression/decompression operations
Stack Architecture Pros and Cons
Pros
- Good code density (implicit top of stack)
- Low hardware requirements
- Easy to write a simpler compiler for stack architectures
Cons
- Stack becomes the bottleneck
- Little ability for parallelism or pipelining
- Data is not always at the top of stack when needed, so additional instructions like TOP and SWAP are needed
- Difficult to write an optimizing compiler for stack architectures
Accumulator Architecture Pros and Cons
Pros
- Very low hardware requirements
- Easy to design and understand
Cons
- Accumulator becomes the bottleneck
- Little ability for parallelism or pipelining
- High memory traffic
Memory-Memory Architecture Pros and Cons
Pros
- Requires fewer instructions (especially if 3 operands)
- Easy to write compilers for (especially if 3 operands)
Cons
- Very high memory traffic (especially if 3 operands)
- Variable number of clocks per instruction
- With two operands, more data movements are required
Memory-Register Architecture Pros and Cons
Pros
- Some data can be accessed without loading first
- Instruction format easy to encode
- Good code density
Cons
- Operands are not equivalent (poor orthogonality)
- Variable number of clocks per instruction
- May limit number of registers
Number of Addresses (cont.)
Example
C statement
A = B + C * D - E + F + A
Equivalent code:
load  C    ; load C into accum
mult  D    ; accum = C*D
add   B    ; accum = C*D+B
sub   E    ; accum = B+C*D-E
add   F    ; accum = B+C*D-E+F
add   A    ; accum = B+C*D-E+F+A
store A    ; store accum contents in A
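The one-address sequence can be checked with a small accumulator-machine model. This Python sketch is illustrative only: `run_accumulator`, the dictionary used as memory, and the sample values are assumptions. Every instruction names one memory operand, so each one counts as a memory access.

```python
def run_accumulator(program, memory):
    """Run a one-address program; return the updated memory and the memory-access count."""
    accum = 0
    memory_accesses = 0
    for line in program:
        opcode, address = line.split()
        memory_accesses += 1              # each instruction reads or writes one operand
        if opcode == "load":
            accum = memory[address]
        elif opcode == "store":
            memory[address] = accum
        elif opcode == "add":
            accum += memory[address]
        elif opcode == "sub":
            accum -= memory[address]
        elif opcode == "mult":
            accum *= memory[address]
        else:
            raise ValueError(opcode)
    return memory, memory_accesses


# Hypothetical sample values; the sequence computes A = B + C * D - E + F + A.
ACCUMULATOR_PROGRAM = [
    "load C", "mult D", "add B", "sub E", "add F", "add A", "store A",
]
final_memory, accesses = run_accumulator(
    ACCUMULATOR_PROGRAM, {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
)
print(final_memory["A"], accesses)
```

The program is short because the accumulator is an implicit operand of every instruction, but all intermediate results funnel through that single register.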
Number of Addresses (cont.)
Zero-address machines
- Stack supplies operands and receives the result
- Special instructions to load and store use an address
- Called stack machines (Ex: HP3000, Burroughs B5500)
Sample instructions
  push addr    push([addr])
  pop  addr    pop([addr])
  add          push(pop + pop)
  sub          push(pop - pop)
  mult         push(pop * pop)
Number of Addresses (cont.)
Example
C statement
  A = B + C * D - E + F + A
Equivalent code:
  push E
  push C
  push D
  mult
  push B
  add
  sub
  push F
  add
  push A
  add
  pop  A
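The zero-address example above can be checked with a small interpreter. This is only a sketch: the function name `run_stack_program`, the dict-based memory, and the sample values are assumptions made for illustration, and the operand order for `sub` follows the slide's definition `push(pop - pop)`, where the first pop is the top of the stack.

```python
def run_stack_program(program, memory):
    """Execute zero-address instructions against a dict-based memory (toy model)."""
    stack = []
    for instr in program:
        op, *args = instr.split()
        if op == "push":
            stack.append(memory[args[0]])      # push([addr])
        elif op == "pop":
            memory[args[0]] = stack.pop()      # pop([addr])
        else:
            first = stack.pop()                # first pop: value on top of the stack
            second = stack.pop()               # second pop: value beneath it
            if op == "add":
                stack.append(first + second)
            elif op == "sub":
                stack.append(first - second)
            elif op == "mult":
                stack.append(first * second)
            else:
                raise ValueError(f"unknown instruction: {op}")
    return memory


# A = B + C * D - E + F + A, in the instruction order used by the example.
PROGRAM = ["push E", "push C", "push D", "mult", "push B", "add", "sub",
           "push F", "add", "push A", "add", "pop A"]

# Hypothetical sample values chosen only to exercise the program.
result = run_stack_program(PROGRAM, {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6})
```

Note that `E` has to be pushed first so that it sits beneath `B + C * D` when `sub` executes; this is the kind of operand juggling that stack machines require.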
Load/Store Architecture
Instructions expect operands in internal processor registers
- Special LOAD and STORE instructions move data between registers and memory
- RISC uses this architecture
- Reduces instruction length
Load/Store Architecture (cont.)
Sample instructions
  load  $d,addr       $d = [addr]
  store addr,$s       (addr) = $s
  add   $d,$s1,$s2    $d = $s1 + $s2
  sub   $d,$s1,$s2    $d = $s1 - $s2
  mult  $d,$s1,$s2    $d = $s1 * $s2
Number of Addresses (cont.)
Example
C statement
  A = B + C * D - E + F + A
Equivalent code:
  load  $1,B
  load  $2,C
  load  $3,D
  load  $4,E
  load  $5,F
  load  $6,A
  mult  $2,$2,$3
  add   $2,$2,$1
  sub   $2,$2,$4
  add   $2,$2,$5
  add   $2,$2,$6
  store A,$2
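The load/store version of the same statement can be traced with a toy register machine. This is a sketch under stated assumptions: the function `run_load_store`, the register numbering `$1` to `$6`, and the sample memory values are illustrative and are not taken from a real assembler.

```python
def run_load_store(program, memory):
    """Execute load/store style three-address instructions (toy model)."""
    regs = {}
    for instr in program:
        op, operands = instr.split(maxsplit=1)
        args = [a.strip() for a in operands.split(",")]
        if op == "load":                       # load $d,addr : $d = [addr]
            regs[args[0]] = memory[args[1]]
        elif op == "store":                    # store addr,$s : [addr] = $s
            memory[args[0]] = regs[args[1]]
        elif op == "add":
            regs[args[0]] = regs[args[1]] + regs[args[2]]
        elif op == "sub":
            regs[args[0]] = regs[args[1]] - regs[args[2]]
        elif op == "mult":
            regs[args[0]] = regs[args[1]] * regs[args[2]]
        else:
            raise ValueError(f"unknown instruction: {op}")
    return memory


# A = B + C * D - E + F + A; only load and store touch memory.
PROGRAM = ["load $1,B", "load $2,C", "load $3,D", "load $4,E", "load $5,F",
           "load $6,A", "mult $2,$2,$3", "add $2,$2,$1", "sub $2,$2,$4",
           "add $2,$2,$5", "add $2,$2,$6", "store A,$2"]

# Hypothetical sample values chosen only to exercise the program.
result = run_load_store(PROGRAM, {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6})
```

Both this program and the stack version use twelve instructions here, but in this one every arithmetic instruction works purely on registers, which is the property that keeps instruction encodings short and uniform.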
Flow of Control
Default is sequential flow
Several instructions alter this default execution
- Branches
  - Unconditional
  - Conditional
  - Delayed branches
- Procedure calls
  - Delayed procedure calls
Flow of Control (cont.)
Branches
- Unconditional
  - Absolute address
  - PC-relative
    - Target address is specified relative to PC contents
    - Relocatable code
- Example: MIPS
  - Absolute address
      j target
  - PC-relative
      b target
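The two ways of forming a branch target can be written out as arithmetic. The sketch below assumes the classic 32-bit MIPS encodings, in which a branch offset is a signed word count added to the address of the following instruction and `j` supplies a 26-bit word index within the current 256 MB region; the function names are illustrative.

```python
def mips_branch_target(pc, offset_words):
    """PC-relative target: signed word offset applied to the address after the branch."""
    return (pc + 4 + (offset_words << 2)) & 0xFFFFFFFF


def mips_jump_target(pc, index26):
    """Target of j: upper 4 bits of PC + 4 combined with the 26-bit index shifted left by 2."""
    return ((pc + 4) & 0xF0000000) | ((index26 & 0x03FFFFFF) << 2)


forward = mips_branch_target(0x00400000, 3)    # three words past the next instruction
backward = mips_branch_target(0x00400010, -4)  # four words before the next instruction
jump = mips_jump_target(0x00400000, 0x0100008)
```

Because a PC-relative target depends only on the distance between branch and destination, the same encoded instruction stays correct when the whole code block is loaded at a different address, which is why such code is relocatable.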
Flow of Control (cont.)
Branches
- Conditional
  - Jump is taken only if the condition is met
Two types
- Set-Then-Jump
  - Condition testing is separated from branching
  - Condition code registers are used to convey the condition test result
  - Condition code registers keep a record of the status of the last ALU operation, such as an overflow condition
- Example: Pentium code
    cmp AX,BX     ; compare AX and BX
    je  target    ; jump if equal
Flow of Control (cont.)
- Test-and-Jump
  - Single instruction performs condition testing and branching
- Example: MIPS instruction
    beq $src1,$src2,target
  Jumps to target if Rsrc1 = Rsrc2
Delayed branching
- Control is transferred after executing the instruction that follows the branch instruction
  - This instruction slot is called the delay slot
- Improves efficiency
- Highly pipelined RISC processors support delayed branching
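Delayed branching is easiest to see by tracing execution order. The toy interpreter below is a sketch, not a model of any real pipeline: the tuple-based program format and the name `run_with_delay_slot` are assumptions, and a branch placed inside a delay slot is not handled.

```python
def run_with_delay_slot(program):
    """Return the order in which instructions execute when branches take effect one instruction late."""
    executed = []
    pc = 0
    pending_target = None                  # branch target waiting for the delay slot to finish
    while pc < len(program):
        op, arg = program[pc]
        executed.append(f"branch->{arg}" if op == "branch" else arg)
        next_pc = pc + 1
        if pending_target is not None:
            next_pc = pending_target       # the delay-slot instruction just ran; transfer now
            pending_target = None
        if op == "branch":
            pending_target = arg           # do not transfer yet: the next instruction still executes
        pc = next_pc
    return executed


program = [
    ("op", "i0"),
    ("branch", 4),        # branch to index 4
    ("op", "slot"),       # delay slot: executes even though the branch is taken
    ("op", "skipped"),
    ("op", "target"),
]
trace = run_with_delay_slot(program)
```

The instruction in the delay slot is already in the pipeline when the branch resolves, so executing it instead of discarding it avoids a wasted cycle, provided the compiler can find a useful instruction to place there.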
Flow of Control (cont.)
Procedure calls
- Facilitate modular programming
- Require two pieces of information to return
  - End of procedure
    - Pentium uses the ret instruction
    - MIPS uses the jr instruction
  - Return address
    - In a (special) register: MIPS allows any general-purpose register
    - On the stack: Pentium
Flow of Control (cont.)
[Figure: delay slot]
Parameter Passing
Two basic techniques
- Register-based (e.g., PowerPC, MIPS)
  - Internal registers are used
    - Faster
    - Limits the number of parameters
    - Recursive procedures are harder to support
- Stack-based (e.g., Pentium)
  - Stack is used
    - More general
2 Operand Types
Instructions support basic data types
- Characters
- Integers
- Floating-point
Instruction overload
- Same instruction for different data types
- Example: Pentium
    mov AL,address     ; loads an 8-bit value
    mov AX,address     ; loads a 16-bit value
    mov EAX,address    ; loads a 32-bit value
Operand Types (cont.)
Separate instructions
- Instructions specify the operand size
- Example: MIPS
    lb $dest,address    ; loads a byte
    lh $dest,address    ; loads a halfword (16 bits)
    lw $dest,address    ; loads a word (32 bits)
    ld $dest,address    ; loads a doubleword (64 bits)
- Similar instructions exist for store
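The effect of the operand size on a load can be illustrated by reading the same address at four widths. This is a sketch with simplifying assumptions: values are treated as unsigned and little-endian (real MIPS `lb` sign-extends, and MIPS byte order is configurable), and the `load` helper and `MEMORY` contents are hypothetical.

```python
import struct

# Eight bytes of example memory, lowest address first.
MEMORY = bytes([0x11, 0x22, 0x33, 0x44, 0x55, 0x66, 0x77, 0x88])


def load(memory, address, size):
    """Read `size` bytes (1, 2, 4, or 8) as an unsigned little-endian integer."""
    formats = {1: "<B", 2: "<H", 4: "<I", 8: "<Q"}
    return struct.unpack_from(formats[size], memory, address)[0]


byte_value = load(MEMORY, 0, 1)        # like lb: one byte
halfword_value = load(MEMORY, 0, 2)    # like lh: 16 bits
word_value = load(MEMORY, 0, 4)        # like lw: 32 bits
doubleword_value = load(MEMORY, 0, 8)  # like ld: 64 bits
```

Whether the size is implied by the register name (the Pentium style) or by the opcode (the MIPS style), the hardware performs the same operation: it fetches a different number of bytes starting at the same address.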
3 Addressing Modes
How the operands are specified
- Operands can be in three places
  - Registers
    - Register addressing mode
  - Part of instruction
    - Constant
    - Immediate addressing mode
    - All processors support these two addressing modes
  - Memory
    - Difference between RISC and CISC
    - CISC supports a large variety of addressing modes
    - RISC follows load/store architecture
4 Instruction Types
Several types of instructions
- Data movement
  - Pentium: mov dest,src
  - Some do not provide direct data movement instructions
  - Indirect data movement
      add $dest,$src,0    ; $dest = $src + 0
- Arithmetic and Logical
  - Arithmetic
    - Integer and floating-point, signed and unsigned
    - add, subtract, multiply, divide
  - Logical
    - and, or, not, xor
Instruction Types (cont.)
Condition code bits
- S: Sign bit (0 = +, 1 = -)
- Z: Zero bit (0 = nonzero, 1 = zero)
- O: Overflow bit (0 = no overflow, 1 = overflow)
- C: Carry bit (0 = no carry, 1 = carry)
- Example: Pentium
    cmp count,25    ; compare count to 25
                    ; (subtract 25 from count)
    je  target      ; jump if equal
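How a compare sets the condition code bits can be sketched by performing the subtraction and discarding the result. The function below is an illustration only: the name `compare_flags` and the 32-bit default width are assumptions, and the carry bit follows the x86 convention of signalling a borrow on subtraction.

```python
def compare_flags(a, b, bits=32):
    """Flags produced by computing a - b on `bits`-wide operands; the difference itself is discarded."""
    mask = (1 << bits) - 1
    sign = 1 << (bits - 1)
    a &= mask
    b &= mask
    result = (a - b) & mask
    return {
        "S": int(bool(result & sign)),                   # sign of the result
        "Z": int(result == 0),                           # operands were equal
        "C": int(a < b),                                 # borrow out of the unsigned subtraction
        "O": int(bool((a ^ b) & (a ^ result) & sign)),   # signed overflow
    }


def jump_if_equal(flags):
    """A je-style test: the branch only inspects the Z bit left behind by the compare."""
    return flags["Z"] == 1


equal = compare_flags(25, 25)
smaller = compare_flags(10, 25)
overflow = compare_flags(0x80000000, 1)
```

This separation is what the set-then-jump style relies on: the compare records the outcome in the flags, and a later branch instruction reads only those bits.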
Instruction Types (cont.)
Flow control and I/O instructions
- Branch
- Procedure call
- Interrupts
I/O instructions
- Memory-mapped I/O
  - Most processors support memory-mapped I/O
  - No separate instructions for I/O
- Isolated I/O
  - Pentium supports isolated I/O
  - Separate I/O instructions
      in  AX,port    ; read from an I/O port
      out port,AX    ; write to an I/O port
5 Instruction Formats
Two types
- Fixed-length
  - Used by RISC processors
  - 32-bit RISC processors use 32-bit wide instructions
    - Examples: SPARC, MIPS, PowerPC
- Variable-length
  - Used by CISC processors
  - Memory operands need more bits to specify
Opcode
- Major and exact operation
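A fixed-length format can be made concrete with the MIPS R-type layout, which packs six fields into one 32-bit word: opcode (6 bits), rs (5), rt (5), rd (5), shamt (5), and funct (6). The helper names below are illustrative; the field widths and the `add` function code 0x20 are those of the standard MIPS32 encoding.

```python
def encode_r_type(rs, rt, rd, shamt, funct, opcode=0):
    """Pack the six R-type fields into a single 32-bit instruction word."""
    return (opcode << 26) | (rs << 21) | (rt << 16) | (rd << 11) | (shamt << 6) | funct


def decode_r_type(word):
    """Split a 32-bit word back into its R-type fields."""
    return {
        "opcode": (word >> 26) & 0x3F,
        "rs": (word >> 21) & 0x1F,
        "rt": (word >> 16) & 0x1F,
        "rd": (word >> 11) & 0x1F,
        "shamt": (word >> 6) & 0x1F,
        "funct": word & 0x3F,
    }


# add $t0,$t1,$t2 : $t0 is register 8, $t1 is 9, $t2 is 10; funct 0x20 selects add.
add_word = encode_r_type(rs=9, rt=10, rd=8, shamt=0, funct=0x20)
fields = decode_r_type(add_word)
```

Here the opcode field gives the major operation class and the funct field selects the exact operation, and because every field sits at a fixed bit position the decoder can extract all of them in parallel.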
Examples of Instruction Formats
[Figure: examples of instruction formats]
RISC (Reduced Instruction Set Computer)
versus
CISC (Complex Instruction Set Computer)
RISC vs. CISC
- The underlying philosophy of RISC machines is that a system is better able to manage program execution when the program consists of only a few different instructions that are the same length and require the same number of clock cycles to decode and execute.
- RISC systems access memory only with explicit load and store instructions.
- In CISC systems, many different kinds of instructions access memory, making instruction length variable and fetch-decode-execute time unpredictable.
RISC vs. CISC
- The difference between CISC and RISC becomes evident through the basic computer performance equation:
    time/program = time/cycle x cycles/instruction x instructions/program
- RISC systems shorten execution time by reducing the clock cycles per instruction.
- CISC systems improve performance by reducing the number of instructions per program.
RISC vs. CISC
- The simple instruction set of RISC machines enables control units to be hardwired for maximum speed.
- The more complex, and variable, instruction set of CISC machines requires microcode-based control units that interpret instructions as they are fetched from memory. This translation takes time.
- With fixed-length instructions, RISC lends itself to pipelining and speculative execution.
RISC vs. CISC
Consider the following program fragments:
  CISC:
    mov ax, 10
    mov bx, 5
    mul bx, ax
  RISC:
    mov ax, 0
    mov bx, 10
    mov cx, 5
    Begin: add ax, bx
    loop Begin
- The total clock cycles for the CISC version might be:
    (2 movs x 1 cycle) + (1 mul x 30 cycles) = 32 cycles
- While the clock cycles for the RISC version are:
    (3 movs x 1 cycle) + (5 adds x 1 cycle) + (5 loops x 1 cycle) = 13 cycles
- With the RISC clock cycle also being shorter, RISC gives us much faster execution speeds.
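The cycle counts for the two fragments follow directly from the performance equation (time per program = time per cycle x cycles per instruction x instructions per program). The sketch below reproduces the arithmetic; the per-instruction cycle costs are the illustrative figures of the example (1 cycle for mov, add, and loop, 30 cycles for mul), the instruction counts (2 movs and 1 mul; 3 movs, 5 adds, and 5 loops) are read off the fragments, and the cycle times passed to `execution_time` are hypothetical.

```python
def total_cycles(instruction_mix):
    """Sum count * cycles_per_instruction over (count, cycles) pairs."""
    return sum(count * cycles for count, cycles in instruction_mix)


def execution_time(instruction_mix, cycle_time_ns):
    """time/program = (cycles/program) * (time/cycle)."""
    return total_cycles(instruction_mix) * cycle_time_ns


CISC_MIX = [(2, 1), (1, 30)]           # 2 movs at 1 cycle, 1 mul at 30 cycles
RISC_MIX = [(3, 1), (5, 1), (5, 1)]    # 3 movs, 5 adds, 5 loops at 1 cycle each

cisc_cycles = total_cycles(CISC_MIX)
risc_cycles = total_cycles(RISC_MIX)

# Hypothetical cycle times: even at the same clock the RISC fragment finishes sooner.
cisc_time = execution_time(CISC_MIX, cycle_time_ns=2)
risc_time = execution_time(RISC_MIX, cycle_time_ns=2)
```

The RISC fragment executes more instructions (13 against 3), yet it needs fewer total cycles because each instruction is cheap. That is the trade the performance equation expresses.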
RISC vs. CISC
- Because of their load-store ISAs, RISC architectures require a large number of CPU registers.
- These registers provide fast access to data during sequential program execution.
- They can also be employed to reduce the overhead typically caused by passing parameters to subprograms.
- Instead of pulling parameters off of a stack, the subprogram is directed to use a subset of registers.
RISC vs. CISC
- This is how registers can be overlapped in a RISC system.
- The current window pointer (CWP) points to the active register window.
[Figure: overlapping register windows]
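Overlapping register windows can be sketched as a circular register file in which one window's output registers are physically the same storage as the next window's input registers. Everything below is a toy model under stated assumptions: the class name, the window sizes, and the choice to increment CWP on a call are illustrative (SPARC, for example, decrements it), and window overflow and underflow handling is omitted.

```python
class RegisterWindows:
    """Toy overlapping windows: a caller's 'out' registers are the callee's 'in' registers."""

    def __init__(self, windows=4, shared=8, local=8):
        self.shared = shared
        self.stride = shared + local           # physical registers added per window
        self.windows = windows
        self.file = [0] * (windows * self.stride)
        self.cwp = 0                           # current window pointer

    def _index(self, kind, n):
        base = self.cwp * self.stride
        offset = {"in": 0, "local": self.shared, "out": self.stride}[kind]
        return (base + offset + n) % len(self.file)

    def read(self, kind, n):
        return self.file[self._index(kind, n)]

    def write(self, kind, n, value):
        self.file[self._index(kind, n)] = value

    def call(self):
        self.cwp = (self.cwp + 1) % self.windows   # slide to the next window

    def ret(self):
        self.cwp = (self.cwp - 1) % self.windows   # slide back to the caller's window


rw = RegisterWindows()
rw.write("out", 0, 42)        # caller places a parameter in its out register
rw.write("local", 0, 5)       # caller's private value
rw.call()
received = rw.read("in", 0)   # callee sees the parameter without any copying
rw.write("local", 0, 99)      # callee's locals do not disturb the caller's
rw.write("in", 1, 7)          # callee leaves a return value
rw.ret()
returned = rw.read("out", 1)
caller_local = rw.read("local", 0)
```

Passing a parameter costs nothing beyond changing CWP, which is why register windows reduce procedure-call overhead compared with pushing arguments onto a stack in memory.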
RISC vs. CISC
- It is becoming increasingly difficult to distinguish RISC architectures from CISC architectures.
- Some RISC systems provide more extravagant instruction sets than some CISC systems.
- Some systems combine both approaches.
- The following two slides summarize the characteristics that traditionally typify the differences between these two architectures.
RISC vs. CISC
RISC
- Multiple register sets.
- Three operands per instruction.
- Parameter passing through register windows.
- Single-cycle instructions.
- Hardwired control.
- Highly pipelined.
CISC
- Single register set.
- One or two register operands per instruction.
- Parameter passing through memory.
- Multiple-cycle instructions.
- Microprogrammed control.
- Less pipelined.
RISC vs. CISC (cont.)
RISC
- Simple instructions, few in number.
- Fixed-length instructions.
- Complexity in compiler.
- Only LOAD/STORE instructions access memory.
- Few addressing modes.
CISC
- Many complex instructions.
- Variable-length instructions.
- Complexity in microcode.
- Many instructions can access memory.
- Many addressing modes.
Summary
Instruction Set Design Issues
Instruction set design issues include:
- Where are operands stored?
  - registers, memory, stack, accumulator
- How many explicit operands are there?
  - 0, 1, 2, or 3
- How is the operand location specified?
  - register, immediate, indirect, ...
- What type and size of operands are supported?
  - byte, int, float, double, string, vector, ...
- What operations are supported?
  - add, sub, mul, move, compare, ...
More About General Purpose Registers
Why do almost all new architectures use GPRs?
- Registers are much faster than memory (even cache)
  - Register values are available immediately
  - When memory isn't ready, the processor must wait ("stall")
- Registers are convenient for variable storage
  - Compiler assigns some variables just to registers
  - More compact code, since small fields specify registers (compared to memory addresses)
[Figure: memory hierarchy: Processor (Registers), Cache, Memory, Disk]
What Operations Are Needed
- Arithmetic and Logical
  - Integer arithmetic: ADD, SUB, MULT, DIV, SHIFT
  - Logical operation: AND, OR, XOR, NOT
- Data Transfer: copy, load, store
- Control: branch, jump, call, return
- Floating Point: ADDF, MULF, DIVF (same as arithmetic, but usually take bigger operands)
- Decimal: ADDD, CONVERT
- String: move, compare, search
- Graphics: pixel and vertex, compression/decompression operations
Stack Architecture Pros and Cons
Pros
- Good code density (implicit top of stack)
- Low hardware requirements
- Easy to write a simple compiler for stack architectures
Cons
- Stack becomes the bottleneck
- Little ability for parallelism or pipelining
- Data is not always at the top of the stack when needed, so additional instructions like TOP and SWAP are needed
- Difficult to write an optimizing compiler for stack architectures
Accumulator Architecture Pros and Cons
Pros
- Very low hardware requirements
- Easy to design and understand
Cons
- Accumulator becomes the bottleneck
- Little ability for parallelism or pipelining
- High memory traffic
Memory-Memory Architecture Pros and Cons
Pros
- Requires fewer instructions (especially if 3 operands)
- Easy to write compilers for (especially if 3 operands)
Cons
- Very high memory traffic (especially if 3 operands)
- Variable number of clocks per instruction
- With two operands, more data movements are required
Memory-Register Architecture Pros and Cons
Pros
- Some data can be accessed without loading first
- Instruction format easy to encode
- Good code density
Cons
- Operands are not equivalent (poor orthogonality)
- Variable number of clocks per instruction
- May limit number of registers
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 62101
um+er of Addresses cont-um+er of Addresses cont-
ltample
C statement
A C H D F 6 A
ltJuivalent code6
us E sub
us C us F
us D add
Mult us A
us B add
add o A
)oadStore Architecture)oadStore Architecture
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 63101
)oadStore Architecture)oadStore Architecture
Instructions epect operands in internal processor registers Special 35A and ST51lt instructions move data beteen registers
and memory
1ISC uses this architecture
1educes instruction length
()
)oadStore Architecture cont-)oadStore Architecture
cont-
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 64101
)oadStore Architecture cont-)oadStore Architecture cont-
Sample instructionsload $daddr $d = [addr]
store addr$s (addr) = $s
add $d$s$samp $d = $s + $sampsub $d$s$samp $d = $s - $samp
mult $d$s$samp $d = $s $samp
um+er of Addresses cont-um+er of Addresses
cont-
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 65101
um+er of Addresses cont-um+er of Addresses cont-
ampleC statement
A = B + C D E + F + A
1uialent co)eload $B mult $amp$amp$
load $ampC add $amp$amp$
load $D sub $amp$amp$
load $E add $amp$amp$
load $F add $amp$amp$
load $A store A$amp
0lo1 of Control 0lo1 of Control
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 66101
0lo1 of Control 0lo1 of Control
efault is seJuential flo
Several instructions alter this defaulteecution
8ranches$ 4nconditional
$ Conditional
$ elayed branches Procedure calls
$ elayed procedure calls
0lo1 of Control cont-0lo1 of Control
cont-
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 67101
0lo1 of Control cont-0lo1 of Control cont-
8ranches
4nconditional
$ Absolute address
$ PC$relative
U Target address is specified relative to PC contents U 1elocatable code
ltample6 MIPS
$ Absolute address
9 target
$ PC$relative
8 target
0lo1 of Control cont-0lo1 of Control
cont-
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 68101
0lo1 of Control cont- -
e entium e R
0lo1 of Control cont-0lo1 of Control
cont-
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 69101
lo1 o Co t ol co t- -
8ranches
Conditional
$ ump is ta0en only if the condition is met
To types
$ Set$Then$ump
U Condition testing is separated from branching U Condition code registers are used to convey the condition test
result
U Condition code registers 0eep a record of the status of the last A34 operation such as overflo condition
$ ltample6 Pentium codecm AB comare A ad B
e taret um e0ual
0lo1 of Control cont-0lo1 of Control
cont-
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 70101
- -
$ Test$and$ump
U Single instruction performs condition testing and branching
$ ltample6 MIPS instruction
be0 $src$srcamptaret
umps to target if 1src E 1src
elayed branching
Control is transferred after eecuting the instruction thatfollos the branch instruction
$ This instruction slot is called delay slot Improves efficiency
ighly pipelined 1ISC processors support
0lo1 of Control cont-0lo1 of Control
cont-
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 71101
- -
Procedure calls Lacilitate modular programming
1eJuire to pieces of information to return
$ ltnd of procedure U Pentium
uses ret instruction
U MIPS
uses 9r instruction
$ 1eturn address U In a (special) register
MIPS allos any general$purpose register
U 5n the stac0
Pentium
0lo1 of Control cont-0lo1 of Control
cont-
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 72101
- -
0lo1 of Control cont-0lo1 of Control
cont-
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 73101
- -
elay slot
Parameter PassingParameter Passin
g
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 74101
gg
To basic techniJues 1egister$based (eg PoerPC MIPS)
$ Internal registers are used U Laster
U 3imit the number of parameters U 1ecursive procedure
Stac0$based (eg Pentium)
$ Stac0 is used U More general
2 perand Types2
perand Types
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 75101
p yp
Instructions support basic data types
Characters Integers
Lloating$point
Instruction overload
Same instruction for different data types
ltample6 Pentium mo1 A2address loads a 3-bt 1alue
mo1 Aaddress loads a -bt 1alue
mo1 EAaddress loads a amp-bt 1alue
perand Types
perand Types
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 76101
Separate instructions
Instructions specify the operand si-e
ltample6 MIPS
lb $destaddress loads a b4te
l $destaddress loads a al5ord( bts)
l5 $destaddress loads a 5ord
(amp bts)
ld $destaddress loads a double5ord
( bts)imilar instruction store
3 Addressing Modes3 Addressin
g Modes
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 77101
o the operands are specified
5perands can be in three places
$ 1egisters U 1egister addressing mode
$ Part of instruction U Constant
U Immediate addressing mode
U All processors support these to addressing modes
$ Memory U ifference beteen 1ISC and CISC
U CISC supports a large variety of addressing modes
U 1ISC follos load2store architecture
4 Instruction Types4 Instruction T
ypes
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 78101
Several types of instructions
ata movement$ Pentium6 mo1 destsrc
$ Some do not provide direct data movement instructions
$ Indirect data movement
add $dest$src6 $dest = $src+6
Arithmetic and 3ogical
$ Arithmetic U Integer and floating$point signed and unsigned U add subtract multiply divide
$ 3ogical U andB orB notB 7or
Instruction Types cont-Instruction T
ypes cont-
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 79101
Condition code bits
S6 Sign bit (gt E F E $)
6 Vero bit (gt E non-ero E -ero)
$6 5verflo bit (gt E no overflo E overflo)
C6 Carry bit (gt E no carry E carry)
ltample6 Pentium
cm coutamp comare cout to amp
subtract amp rom cout
e taret um e0ual
Instruction Types cont-Instruction T
ypes cont-
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 80101
Llo control and I25 instructions
$ 8ranch
$ Procedure call
$ Interrupts
I25 instructions$ Memory$mapped I25
U Most processors support memory$mapped I25
U 7o separate instructions for I25
$ Isolated I25 U Pentium supports isolated I25
U Separate I25 instructions
Ao7ort read from an IO ort
out o7ortA rte to an IO ort
5 Instruction 0ormats5 Instruction 0ormats
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 81101
To types
Lied$length$ 4sed by 1ISC processors
$ $bit 1ISC processors use $bits ide instructions U ltamples6 SPA1C MIPS PoerPC
ariable$length
$ 4sed by CISC processors
$ Memory operands need more bits to specify
5pcode
MaOor and eact operation
Examples of Instruction 0ormatsExam
ples of Instruction 0ormats
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 82101
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 83101
ISC e)uce) Instruction Set Computer 3
ersus
CISC Comple Instruction Set Computer3
RISC s CISCRISC s CISC
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 84101
0
RISC s CISCRISC s CISC
The underlying philosophy of 1ISC machines is that asystem is better able to manage program eecutionhen the program consists of only a fe differentinstructions that are the same length and reJuire thesame number of cloc0 cycles to decode and eecute
1ISC systems access memory only ith eplicit loadand store instructions
In CISC systems many different 0inds of instructionsaccess memory ma0ing instruction length variableand fetch$decode$eecute time unpredictable
RISC s CISCRISC s CISC
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 85101
The difference beteen CISC and 1ISC becomesevident through the basic computer performanceeJuation6
1ISC systems shorten eecution time by reducingthe cloc0 cycles per instruction
CISC systems improve performance by reducing thenumber of instructions per program
RISC s CISCRISC s CISC
RISC s CISCRISC s CISC
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 86101
(
The simple instruction set of 1ISC machinesenables control units to be hardired for maimumspeed
The more comple$$ and variable$$ instruction set of
CISC machines reJuires microcode$based controlunits that interpret instructions as they are fetchedfrom memory This translation ta0es time
Dith fied$length instructions 1ISC lends itself topipelining and speculative eecution
RISC s CISCRISC s CISC
RISC s CISCRISC s CISC
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 87101
mo1 a8 6 mo1 b8 6 mo1 c8
Be add a8 b8 loo Be
Consider the the program fragments6
The total cloc0 cycles for the CISC version might be6(amp mo1s c4cle) + ( mul 6 c4cles) = amp c4cles
Dhile the cloc0 cycles for the 1ISC version is6
( mo1s c4cle) + ( adds c4cle) + ( loos c4cle) = c4cles
Dith 1ISC cloc0 cycle being shorter 1ISC gives usmuch faster eecution speeds
mo1 a8 6 mo1 b8 mul b8 a8
CISC RISC
RISC s CISCRISC s CISC
RISC s CISCRISC s CISC
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 88101
8ecause of their load$store ISAs 1ISC architecturesreJuire a large number of CP4 registers
These register provide fast access to data duringseJuential program eecution
They can also be employed to reduce the overheadtypically caused by passing parameters tosubprograms
Instead of pulling parameters off of a stac0 the
subprogram is directed to use a subset of registers
RISC s CISCRISC s CISC
RISC s CISCRISC s CISC
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 89101
3
This is horegisters canbe overlappedin a 1ISCsystem
The currentindo pointer (CDP) pointsto the activeregister
indo
RISC s CISCRISC s CISC
RISC s CISCRISC s CISC
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 90101
34
It is becoming increasingly difficult to distinguish1ISC architectures from CISC architectures
Some 1ISC systems provide more etravagantinstruction sets than some CISC systems
Some systems combine both approaches The folloing to slides summari-e the
characteristics that traditionally typify the differencesbeteen these to architectures
RISC s CISCRISC s CISC
RISC s CISCRISC s CISC
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 91101
31
RISC Multiple reister sets4
Three operan)s perinstruction4
Parameter passinthrouh reister5in)o5s4
Sinle-ccle
instructions4 7ar)5ire)
control4
7ihl pipeline)4
CISC Sinle reister set4
ne or t5o reisteroperan)s per
instruction4 Parameter passin
throuh memor4
Multiple ccle
instructions4 Microproramme)
control4
(ess pipeline)4ontinued
RISC s CISCRISC s CISC
RISC s CISCRISC s CISC
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 92101
32
RISC Simple instructions
fe5 in num9er4
ie) lenth
instructions4 Compleit in
compiler4
nl 29ADT9$E
instructions accessmemor4
e5 a))ressin mo)es4
CISC Man comple
instructions4
aria9le lenth
instructions4 Compleit in
microco)e4
Man instructions can
access memor4
Man a))ressinmo)es4
RISC s CISCRISC s CISC
RISC s CISCRISC s CISC
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 93101
RISC s CISCRISC s CISC
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 94101
Summar
Instruction Set Design IssuesInstruction Set Desi
gn Issues
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 95101
g
Instruction set )esin issues inclu)e here are operan)s store)lt
- reisters memor stac= accumulator
7o5 man eplicit operan)s are therelt
- 0 + 2 or amp
7o5 is the operan) location specifie)lt
- reister imme)iate in)irect 4 4 4
hat tpe gt sie of operan)s are supporte)lt
- 9te int float )ou9le strin ector4 4 4
hat operations are supporte)lt
- a)) su9 mul moe compare 4 4 4
More A+out 6eneral Purpose egistersMore A+out 6eneral Pu
rpose egisters
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 96101
h )o almost all ne5 architectures usePslt
eisters are much faster than memor eencache3
- eister alues are aaila9le imme)iatel
- hen memor isnt rea) processor must 5aitBstall3
eisters are conenient for aria9le storae
- Compiler assins some aria9les Dust to reisters
- More compact co)e since small fiel)s specifreisters
compare) to memor a))resses3Registers Cache
MemoryProcessor Disk
7hat perations are eeded7hat
perations are eeded
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 97101
3
Arithmetic E (oical
Inteer arithmetic A$$ SU MU(T $I S7IT
(oical operation AN$ NT
$ata Transfer - cop loa) store
Control - 9ranch Dump call return
loatin Point A$$ MU( $I 3 Same as arithmetic 9ut usuall ta=e 9ier operan)s
$ecimal - A$$$ CNT
Strin - moe compare search
raphics F piel an) erte compressionG)ecompression operations
Stacamps Architecture Pros and ConsStacamps Architecture Pros and Cons
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 98101
Stacamps Architecture Pros and ConsStacamps Architecture Pros and Cons
Pros oo) co)e )ensit implicit top of stac=3
(o5 har)5are re1uirements
as to 5rite a simpler compiler for stac= architectures
Cons Stac= 9ecomes the 9ottlenec=
(ittle a9ilit for parallelism or pipelinin
$ata is not al5as at the top of stac= 5hen nee) so a))itionalinstructions li=e TP an) SAP are nee)e)
$ifficult to 5rite an optimiin compiler for stac= architectures
Accumulator Architecture: Pros and Cons
Pros
- Very low hardware requirements
- Easy to design and understand
Cons
- Accumulator becomes the bottleneck
- Little ability for parallelism or pipelining
- High memory traffic
Memory-Memory Architecture: Pros and Cons
Pros
- Requires fewer instructions (especially if 3 operands)
- Easy to write compilers for (especially if 3 operands)
Cons
- Very high memory traffic (especially if 3 operands)
- Variable number of clocks per instruction
- With two operands, more data movements are required
Memory-Register Architecture: Pros and Cons
Pros
- Some data can be accessed without loading first
- Instruction format easy to encode
- Good code density
Cons
- Operands are not equivalent (poor orthogonality)
- Variable number of clocks per instruction
- May limit number of registers
Load/Store Architecture
Instructions expect operands in internal processor registers
Special LOAD and STORE instructions move data between registers and memory
RISC uses this architecture
Reduces instruction length
Load/Store Architecture (cont.)
Sample instructions
    load  Rd,addr       Rd = [addr]
    store addr,Rs       (addr) = Rs
    add   Rd,Rs1,Rs2    Rd = Rs1 + Rs2
    sub   Rd,Rs1,Rs2    Rd = Rs1 - Rs2
    mult  Rd,Rs1,Rs2    Rd = Rs1 * Rs2
Number of Addresses (cont.)
Example: C statement
    A = B + C * D - E + F + A
Equivalent code
    load  R1,B
    load  R2,C
    load  R3,D
    load  R4,E
    load  R5,F
    load  R6,A
    mult  R2,R2,R3
    add   R2,R2,R1
    sub   R2,R2,R4
    add   R2,R2,R5
    add   R2,R2,R6
    store A,R2
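To check that a load/store sequence of this shape really computes A = B + C * D - E + F + A, it can be run on a small simulator. The sketch below is illustrative only: the `execute` helper, the register names R1 to R6, and the sample memory values are assumptions, and the instruction syntax simply follows the load/store/add/sub/mult forms listed on the slides.

```python
def execute(program, memory):
    """Run a toy load/store program and return the updated memory."""
    regs = {}
    alu = {
        "add": lambda x, y: x + y,
        "sub": lambda x, y: x - y,
        "mult": lambda x, y: x * y,
    }
    for line in program:
        op, operands = line.split(None, 1)
        args = [a.strip() for a in operands.split(",")]
        if op == "load":                 # load Rd,addr  : Rd = [addr]
            regs[args[0]] = memory[args[1]]
        elif op == "store":              # store addr,Rs : (addr) = Rs
            memory[args[0]] = regs[args[1]]
        elif op in alu:                  # op Rd,Rs1,Rs2 : registers only
            regs[args[0]] = alu[op](regs[args[1]], regs[args[2]])
        else:
            raise ValueError(f"unknown instruction: {line}")
    return memory


program = [
    "load R1,B", "load R2,C", "load R3,D",
    "load R4,E", "load R5,F", "load R6,A",
    "mult R2,R2,R3",   # C * D
    "add R2,R2,R1",    # + B
    "sub R2,R2,R4",    # - E
    "add R2,R2,R5",    # + F
    "add R2,R2,R6",    # + A
    "store A,R2",
]
memory = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
result = execute(program, dict(memory))
```

Only `load` and `store` touch memory; every arithmetic instruction names three registers, which is what keeps the instruction format short and regular.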
Flow of Control
Default is sequential flow
Several instructions alter this default execution
Branches
- Unconditional
- Conditional
- Delayed branches
Procedure calls
- Delayed procedure calls
Flow of Control (cont.)
Branches
Unconditional
- Absolute address
- PC-relative
  * Target address is specified relative to PC contents
  * Relocatable code
Example: MIPS
- Absolute address
    j target
- PC-relative
    b target
Flow of Control (cont.)
Branches
Conditional
- Jump is taken only if the condition is met
Two types
- Set-Then-Jump
  * Condition testing is separated from branching
  * Condition code registers are used to convey the condition test result
  * Condition code registers keep a record of the status of the last ALU operation, such as an overflow condition
- Example: Pentium code
    cmp AX,BX      ; compare AX and BX
    je  target     ; jump if equal
Flow of Control (cont.)
- Test-and-Jump
  * Single instruction performs condition testing and branching
- Example: MIPS instruction
    beq Rsrc1,Rsrc2,target
  Jumps to target if Rsrc1 = Rsrc2
Delayed branching
Control is transferred after executing the instruction that follows the branch instruction
- This instruction slot is called the delay slot
Improves efficiency
Highly pipelined RISC processors support delayed branching
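The effect of a delay slot on execution order can be sketched with a toy interpreter. Everything here is hypothetical: the `run` function, the three-field instruction tuples, and the labels I0 to I3 are made up for illustration, and a branch placed inside a delay slot is simply ignored to keep the model small.

```python
def run(program, delayed):
    """Return the labels of the instructions in the order they execute."""
    executed = []
    pc = 0
    pending_target = None        # branch target waiting for the delay slot
    while pc < len(program):
        label, op, target = program[pc]
        executed.append(label)
        next_pc = pc + 1
        if pending_target is not None:
            # The instruction just executed occupied the delay slot,
            # so the earlier branch takes effect now.
            next_pc = pending_target
            pending_target = None
        elif op == "branch":
            if delayed:
                pending_target = target   # transfer control one instruction late
            else:
                next_pc = target          # transfer control immediately
        pc = next_pc
    return executed


program = [
    ("I0", "branch", 3),   # branch to I3
    ("I1", "other", None), # sits in the delay slot
    ("I2", "other", None), # skipped in both cases
    ("I3", "other", None),
]
immediate_order = run(program, delayed=False)
delayed_order = run(program, delayed=True)
```

With delayed branching the instruction after the branch (I1) still executes, so a compiler can place useful work there instead of letting the pipeline idle while the branch resolves.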
Flow of Control (cont.)
Procedure calls
Facilitate modular programming
Require two pieces of information to return
- End of procedure
  * Pentium uses the ret instruction
  * MIPS uses the jr instruction
- Return address
  * In a (special) register: MIPS allows any general-purpose register
  * On the stack: Pentium
Flow of Control (cont.)
[Figure: delay slot]
Parameter Passing
Two basic techniques
Register-based (e.g., PowerPC, MIPS)
- Internal registers are used
  * Faster
  * Limit the number of parameters
  * Recursive procedure
Stack-based (e.g., Pentium)
- Stack is used
  * More general
2. Operand Types
Instructions support basic data types
- Characters
- Integers
- Floating-point
Instruction overload
- Same instruction for different data types
- Example: Pentium
    mov AL,address     ; loads an 8-bit value
    mov AX,address     ; loads a 16-bit value
    mov EAX,address    ; loads a 32-bit value
Operand Types (cont.)
Separate instructions
- Instructions specify the operand size
- Example: MIPS
    lb Rdest,address    ; loads a byte
    lh Rdest,address    ; loads a halfword (16 bits)
    lw Rdest,address    ; loads a word (32 bits)
    ld Rdest,address    ; loads a doubleword (64 bits)
Similar instruction: store
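What "the instruction specifies the operand size" means can be shown by reading the same memory with each width. This is a minimal sketch: the `load` helper and the sample bytes are assumptions, the mnemonics are borrowed from the MIPS example only as labels, and big-endian byte order with sign extension is an assumption chosen for the illustration.

```python
# Eight sample bytes standing in for memory (values are arbitrary).
MEMORY = bytes([0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07, 0x08])

# Operand size in bytes selected by each load mnemonic.
SIZES = {"lb": 1, "lh": 2, "lw": 4, "ld": 8}


def load(mnemonic, address, memory=MEMORY):
    """Return the signed value that a load of the given width would fetch."""
    size = SIZES[mnemonic]
    chunk = memory[address:address + size]
    if len(chunk) != size:
        raise IndexError("access runs past the end of memory")
    # Assumption: big-endian byte order, sign-extended result.
    return int.from_bytes(chunk, byteorder="big", signed=True)


byte_value = load("lb", 0)
halfword_value = load("lh", 0)
word_value = load("lw", 0)
doubleword_value = load("ld", 0)
```

The address is the same in all four calls; only the mnemonic changes how many bytes are fetched, which is the contrast with the overloaded Pentium `mov`, where the destination register implies the size.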
3. Addressing Modes
How the operands are specified
Operands can be in three places
- Registers
  * Register addressing mode
- Part of instruction
  * Constant
  * Immediate addressing mode
  * All processors support these two addressing modes
- Memory
  * Difference between RISC and CISC
  * CISC supports a large variety of addressing modes
  * RISC follows load/store architecture
4. Instruction Types
Several types of instructions
Data movement
- Pentium: mov dest,src
- Some do not provide direct data movement instructions
- Indirect data movement
    add Rdest,Rsrc,0    ; Rdest = Rsrc+0
Arithmetic and Logical
- Arithmetic
  * Integer and floating-point, signed and unsigned
  * add, subtract, multiply, divide
- Logical
  * and, or, not, xor
Instruction Types (cont.)
Condition code bits
- S: Sign bit (0 = +, 1 = -)
- Z: Zero bit (0 = nonzero, 1 = zero)
- O: Overflow bit (0 = no overflow, 1 = overflow)
- C: Carry bit (0 = no carry, 1 = carry)
Example: Pentium
    cmp count,25    ; compare count to 25 (subtract 25 from count)
    je  target      ; jump if equal
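How a compare sets the four condition code bits can be sketched by performing the subtraction explicitly. This is an illustration under stated assumptions: the `compare` helper is hypothetical, operands are treated as 32-bit values, and the carry bit follows the x86 convention of signalling a borrow (set when the first operand is smaller as an unsigned number).

```python
def compare(a, b, bits=32):
    """Subtract b from a, discard the result, and return the S, Z, O, C flags."""
    mask = (1 << bits) - 1
    sign_bit = 1 << (bits - 1)
    a &= mask
    b &= mask
    diff = (a - b) & mask
    return {
        "S": int(bool(diff & sign_bit)),   # sign of the result
        "Z": int(diff == 0),               # result is zero, i.e. a == b
        # Signed overflow: operands differ in sign and the result's sign
        # differs from the first operand's sign.
        "O": int(bool((a ^ b) & (a ^ diff) & sign_bit)),
        "C": int(a < b),                   # borrow out of the top bit
    }


def jump_if_equal(flags):
    """A je-style branch looks only at the zero bit."""
    return flags["Z"] == 1


equal_flags = compare(25, 25)
smaller_flags = compare(5, 25)
overflow_flags = compare(0x80000000, 1)
```

The branch never sees the operands themselves; it only inspects the bits left behind by the compare, which is the separation of testing from branching that defines the set-then-jump style.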
Instruction Types (cont.)
Flow control and I/O instructions
- Branch
- Procedure call
- Interrupts
I/O instructions
- Memory-mapped I/O
  * Most processors support memory-mapped I/O
  * No separate instructions for I/O
- Isolated I/O
  * Pentium supports isolated I/O
  * Separate I/O instructions
    in  AX,port    ; read from an I/O port
    out port,AX    ; write to an I/O port
5. Instruction Formats
Two types
Fixed-length
- Used by RISC processors
- 32-bit RISC processors use 32-bit-wide instructions
  * Examples: SPARC, MIPS, PowerPC
Variable-length
- Used by CISC processors
- Memory operands need more bits to specify
Opcode
- Major and exact operation
Examples of Instruction Formats
[Figure: examples of instruction formats]
RISC (Reduced Instruction Set Computer) versus CISC (Complex Instruction Set Computer)
RISC vs. CISC
The underlying philosophy of RISC machines is that a system is better able to manage program execution when the program consists of only a few different instructions that are the same length and require the same number of clock cycles to decode and execute.
RISC systems access memory only with explicit load and store instructions.
In CISC systems, many different kinds of instructions access memory, making instruction length variable and fetch-decode-execute time unpredictable.
The difference between CISC and RISC becomes evident through the basic computer performance equation:
    time/program = (time/cycle) × (cycles/instruction) × (instructions/program)
RISC systems shorten execution time by reducing the clock cycles per instruction.
CISC systems improve performance by reducing the number of instructions per program.
The simple instruction set of RISC machines enables control units to be hardwired for maximum speed.
The more complex, and variable, instruction set of CISC machines requires microcode-based control units that interpret instructions as they are fetched from memory. This translation takes time.
With fixed-length instructions, RISC lends itself to pipelining and speculative execution.
Consider the following program fragments:
CISC
    mov ax, 10
    mov bx, 5
    mul bx, ax
RISC
    mov ax, 0
    mov bx, 10
    mov cx, 5
    Begin: add ax, bx
           loop Begin
The total clock cycles for the CISC version might be:
    (2 movs × 1 cycle) + (1 mul × 30 cycles) = 32 cycles
While the clock cycles for the RISC version are:
    (3 movs × 1 cycle) + (5 adds × 1 cycle) + (5 loops × 1 cycle) = 13 cycles
With the RISC clock cycle also being shorter, RISC gives us much faster execution speeds.
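The cycle arithmetic for the two fragments can be reproduced in a few lines. This is a sketch under stated assumptions: one cycle each for mov, add, and loop, thirty cycles for mul, and a loop that runs five times; the `total_cycles` and `risc_multiply` helpers are hypothetical names introduced only for this illustration.

```python
def total_cycles(instruction_mix):
    """Sum count * cycles over a mapping of mnemonic -> (count, cycles each)."""
    return sum(count * cycles for count, cycles in instruction_mix.values())


# Assumed costs: 1 cycle for mov/add/loop, 30 cycles for mul.
cisc_mix = {"mov": (2, 1), "mul": (1, 30)}
risc_mix = {"mov": (3, 1), "add": (5, 1), "loop": (5, 1)}

cisc_cycles = total_cycles(cisc_mix)
risc_cycles = total_cycles(risc_mix)


def risc_multiply(multiplicand, multiplier):
    """Repeated addition, mirroring the add/loop pair in the RISC fragment."""
    ax = 0
    bx = multiplicand
    cx = multiplier
    while cx > 0:
        ax += bx     # add ax, bx
        cx -= 1      # loop decrements the counter and branches while nonzero
    return ax


product = risc_multiply(10, 5)
```

The RISC version executes more instructions (13 against 3) yet needs fewer cycles, which is the trade between instructions per program and cycles per instruction that the performance equation describes.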
Because of their load-store ISAs, RISC architectures require a large number of CPU registers.
These registers provide fast access to data during sequential program execution.
They can also be employed to reduce the overhead typically caused by passing parameters to subprograms.
Instead of pulling parameters off of a stack, the subprogram is directed to use a subset of registers.
[Figure: overlapping register windows]
This is how registers can be overlapped in a RISC system.
The current window pointer (CWP) points to the active register window.
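Overlapping register windows can be modelled as views onto one circular register file, where the output registers of one window are physically the input registers of the next. The sketch below is a simplified model, not the layout of any particular processor: the `RegisterWindows` class, the window count, the group sizes, and the direction in which the CWP moves on a call are all assumptions, and window overflow is not handled.

```python
class RegisterWindows:
    """Toy circular register file with overlapping in/local/out groups."""

    def __init__(self, n_windows=4, n_shared=8, n_local=8):
        self.n_windows = n_windows
        self.n_shared = n_shared              # size of each in/out group
        self.stride = n_shared + n_local      # new registers per window
        self.regs = [0] * (n_windows * self.stride)
        self.cwp = 0                          # current window pointer

    def _index(self, kind, i):
        # The "out" group starts exactly one stride later, so it coincides
        # with the "in" group of the next window.
        offset = {"in": 0, "local": self.n_shared, "out": self.stride}[kind]
        return (self.cwp * self.stride + offset + i) % len(self.regs)

    def read(self, kind, i):
        return self.regs[self._index(kind, i)]

    def write(self, kind, i, value):
        self.regs[self._index(kind, i)] = value

    def call(self):
        self.cwp = (self.cwp + 1) % self.n_windows

    def ret(self):
        self.cwp = (self.cwp - 1) % self.n_windows


windows = RegisterWindows()
windows.write("out", 0, 42)        # caller places an argument
windows.write("local", 0, 5)       # caller's private data
windows.call()
argument = windows.read("in", 0)   # callee sees the argument without a copy
callee_local = windows.read("local", 0)
windows.write("in", 1, 7)          # callee leaves a return value
windows.ret()
returned = windows.read("out", 1)
caller_local = windows.read("local", 0)
```

A call only moves the pointer; no parameter is copied to memory, which is the overhead saving the slides attribute to register-based parameter passing.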
It is becoming increasingly difficult to distinguish RISC architectures from CISC architectures.
Some RISC systems provide more extravagant instruction sets than some CISC systems.
Some systems combine both approaches.
The following two slides summarize the characteristics that traditionally typify the differences between these two architectures.
RISC
- Multiple register sets.
- Three operands per instruction.
- Parameter passing through register windows.
- Single-cycle instructions.
- Hardwired control.
- Highly pipelined.
CISC
- Single register set.
- One or two register operands per instruction.
- Parameter passing through memory.
- Multiple cycle instructions.
- Microprogrammed control.
- Less pipelined.
Continued...
RISC
- Simple instructions, few in number.
- Fixed length instructions.
- Complexity in compiler.
- Only LOAD/STORE instructions access memory.
- Few addressing modes.
CISC
- Many complex instructions.
- Variable length instructions.
- Complexity in microcode.
- Many instructions can access memory.
- Many addressing modes.
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 64101
)oadStore Architecture cont-)oadStore Architecture cont-
Sample instructionsload $daddr $d = [addr]
store addr$s (addr) = $s
add $d$s$samp $d = $s + $sampsub $d$s$samp $d = $s - $samp
mult $d$s$samp $d = $s $samp
um+er of Addresses cont-um+er of Addresses
cont-
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 65101
um+er of Addresses cont-um+er of Addresses cont-
ampleC statement
A = B + C D E + F + A
1uialent co)eload $B mult $amp$amp$
load $ampC add $amp$amp$
load $D sub $amp$amp$
load $E add $amp$amp$
load $F add $amp$amp$
load $A store A$amp
0lo1 of Control 0lo1 of Control
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 66101
0lo1 of Control 0lo1 of Control
efault is seJuential flo
Several instructions alter this defaulteecution
8ranches$ 4nconditional
$ Conditional
$ elayed branches Procedure calls
$ elayed procedure calls
0lo1 of Control cont-0lo1 of Control
cont-
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 67101
0lo1 of Control cont-0lo1 of Control cont-
8ranches
4nconditional
$ Absolute address
$ PC$relative
U Target address is specified relative to PC contents U 1elocatable code
ltample6 MIPS
$ Absolute address
9 target
$ PC$relative
8 target
0lo1 of Control cont-0lo1 of Control
cont-
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 68101
0lo1 of Control cont- -
e entium e R
0lo1 of Control cont-0lo1 of Control
cont-
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 69101
lo1 o Co t ol co t- -
8ranches
Conditional
$ ump is ta0en only if the condition is met
To types
$ Set$Then$ump
U Condition testing is separated from branching U Condition code registers are used to convey the condition test
result
U Condition code registers 0eep a record of the status of the last A34 operation such as overflo condition
$ ltample6 Pentium codecm AB comare A ad B
e taret um e0ual
0lo1 of Control cont-0lo1 of Control
cont-
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 70101
- -
$ Test$and$ump
U Single instruction performs condition testing and branching
$ ltample6 MIPS instruction
be0 $src$srcamptaret
umps to target if 1src E 1src
elayed branching
Control is transferred after eecuting the instruction thatfollos the branch instruction
$ This instruction slot is called delay slot Improves efficiency
ighly pipelined 1ISC processors support
0lo1 of Control cont-0lo1 of Control
cont-
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 71101
- -
Procedure calls Lacilitate modular programming
1eJuire to pieces of information to return
$ ltnd of procedure U Pentium
uses ret instruction
U MIPS
uses 9r instruction
$ 1eturn address U In a (special) register
MIPS allos any general$purpose register
U 5n the stac0
Pentium
0lo1 of Control cont-0lo1 of Control
cont-
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 72101
- -
0lo1 of Control cont-0lo1 of Control
cont-
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 73101
- -
elay slot
Parameter PassingParameter Passin
g
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 74101
gg
To basic techniJues 1egister$based (eg PoerPC MIPS)
$ Internal registers are used U Laster
U 3imit the number of parameters U 1ecursive procedure
Stac0$based (eg Pentium)
$ Stac0 is used U More general
2 perand Types2
perand Types
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 75101
p yp
Instructions support basic data types
Characters Integers
Lloating$point
Instruction overload
Same instruction for different data types
ltample6 Pentium mo1 A2address loads a 3-bt 1alue
mo1 Aaddress loads a -bt 1alue
mo1 EAaddress loads a amp-bt 1alue
perand Types
perand Types
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 76101
Separate instructions
Instructions specify the operand si-e
ltample6 MIPS
lb $destaddress loads a b4te
l $destaddress loads a al5ord( bts)
l5 $destaddress loads a 5ord
(amp bts)
ld $destaddress loads a double5ord
( bts)imilar instruction store
3 Addressing Modes3 Addressin
g Modes
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 77101
o the operands are specified
5perands can be in three places
$ 1egisters U 1egister addressing mode
$ Part of instruction U Constant
U Immediate addressing mode
U All processors support these to addressing modes
$ Memory U ifference beteen 1ISC and CISC
U CISC supports a large variety of addressing modes
U 1ISC follos load2store architecture
4 Instruction Types4 Instruction T
ypes
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 78101
Several types of instructions
ata movement$ Pentium6 mo1 destsrc
$ Some do not provide direct data movement instructions
$ Indirect data movement
add $dest$src6 $dest = $src+6
Arithmetic and 3ogical
$ Arithmetic U Integer and floating$point signed and unsigned U add subtract multiply divide
$ 3ogical U andB orB notB 7or
Instruction Types cont-Instruction T
ypes cont-
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 79101
Condition code bits
S6 Sign bit (gt E F E $)
6 Vero bit (gt E non-ero E -ero)
$6 5verflo bit (gt E no overflo E overflo)
C6 Carry bit (gt E no carry E carry)
ltample6 Pentium
cm coutamp comare cout to amp
subtract amp rom cout
e taret um e0ual
Instruction Types cont-Instruction T
ypes cont-
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 80101
Llo control and I25 instructions
$ 8ranch
$ Procedure call
$ Interrupts
I25 instructions$ Memory$mapped I25
U Most processors support memory$mapped I25
U 7o separate instructions for I25
$ Isolated I25 U Pentium supports isolated I25
U Separate I25 instructions
Ao7ort read from an IO ort
out o7ortA rte to an IO ort
5 Instruction 0ormats5 Instruction 0ormats
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 81101
To types
Lied$length$ 4sed by 1ISC processors
$ $bit 1ISC processors use $bits ide instructions U ltamples6 SPA1C MIPS PoerPC
ariable$length
$ 4sed by CISC processors
$ Memory operands need more bits to specify
5pcode
MaOor and eact operation
Examples of Instruction 0ormatsExam
ples of Instruction 0ormats
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 82101
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 83101
ISC e)uce) Instruction Set Computer 3
ersus
CISC Comple Instruction Set Computer3
RISC s CISCRISC s CISC
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 84101
0
RISC s CISCRISC s CISC
The underlying philosophy of 1ISC machines is that asystem is better able to manage program eecutionhen the program consists of only a fe differentinstructions that are the same length and reJuire thesame number of cloc0 cycles to decode and eecute
1ISC systems access memory only ith eplicit loadand store instructions
In CISC systems many different 0inds of instructionsaccess memory ma0ing instruction length variableand fetch$decode$eecute time unpredictable
RISC s CISCRISC s CISC
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 85101
The difference beteen CISC and 1ISC becomesevident through the basic computer performanceeJuation6
1ISC systems shorten eecution time by reducingthe cloc0 cycles per instruction
CISC systems improve performance by reducing thenumber of instructions per program
RISC s CISCRISC s CISC
RISC s CISCRISC s CISC
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 86101
(
The simple instruction set of 1ISC machinesenables control units to be hardired for maimumspeed
The more comple$$ and variable$$ instruction set of
CISC machines reJuires microcode$based controlunits that interpret instructions as they are fetchedfrom memory This translation ta0es time
Dith fied$length instructions 1ISC lends itself topipelining and speculative eecution
RISC s CISCRISC s CISC
RISC s CISCRISC s CISC
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 87101
mo1 a8 6 mo1 b8 6 mo1 c8
Be add a8 b8 loo Be
Consider the the program fragments6
The total cloc0 cycles for the CISC version might be6(amp mo1s c4cle) + ( mul 6 c4cles) = amp c4cles
Dhile the cloc0 cycles for the 1ISC version is6
( mo1s c4cle) + ( adds c4cle) + ( loos c4cle) = c4cles
Dith 1ISC cloc0 cycle being shorter 1ISC gives usmuch faster eecution speeds
mo1 a8 6 mo1 b8 mul b8 a8
CISC RISC
RISC s CISCRISC s CISC
RISC s CISCRISC s CISC
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 88101
8ecause of their load$store ISAs 1ISC architecturesreJuire a large number of CP4 registers
These register provide fast access to data duringseJuential program eecution
They can also be employed to reduce the overheadtypically caused by passing parameters tosubprograms
Instead of pulling parameters off of a stac0 the
subprogram is directed to use a subset of registers
RISC s CISCRISC s CISC
RISC s CISCRISC s CISC
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 89101
3
This is horegisters canbe overlappedin a 1ISCsystem
The currentindo pointer (CDP) pointsto the activeregister
indo
RISC s CISCRISC s CISC
RISC s CISCRISC s CISC
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 90101
34
It is becoming increasingly difficult to distinguish1ISC architectures from CISC architectures
Some 1ISC systems provide more etravagantinstruction sets than some CISC systems
Some systems combine both approaches The folloing to slides summari-e the
characteristics that traditionally typify the differencesbeteen these to architectures
RISC s CISCRISC s CISC
RISC s CISCRISC s CISC
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 91101
31
RISC Multiple reister sets4
Three operan)s perinstruction4
Parameter passinthrouh reister5in)o5s4
Sinle-ccle
instructions4 7ar)5ire)
control4
7ihl pipeline)4
CISC Sinle reister set4
ne or t5o reisteroperan)s per
instruction4 Parameter passin
throuh memor4
Multiple ccle
instructions4 Microproramme)
control4
(ess pipeline)4ontinued
RISC s CISCRISC s CISC
RISC s CISCRISC s CISC
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 92101
32
RISC Simple instructions
fe5 in num9er4
ie) lenth
instructions4 Compleit in
compiler4
nl 29ADT9$E
instructions accessmemor4
e5 a))ressin mo)es4
CISC Man comple
instructions4
aria9le lenth
instructions4 Compleit in
microco)e4
Man instructions can
access memor4
Man a))ressinmo)es4
RISC s CISCRISC s CISC
RISC s CISCRISC s CISC
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 93101
RISC s CISCRISC s CISC
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 94101
Summar
Instruction Set Design IssuesInstruction Set Desi
gn Issues
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 95101
g
Instruction set )esin issues inclu)e here are operan)s store)lt
- reisters memor stac= accumulator
7o5 man eplicit operan)s are therelt
- 0 + 2 or amp
7o5 is the operan) location specifie)lt
- reister imme)iate in)irect 4 4 4
hat tpe gt sie of operan)s are supporte)lt
- 9te int float )ou9le strin ector4 4 4
hat operations are supporte)lt
- a)) su9 mul moe compare 4 4 4
More A+out 6eneral Purpose egistersMore A+out 6eneral Pu
rpose egisters
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 96101
h )o almost all ne5 architectures usePslt
eisters are much faster than memor eencache3
- eister alues are aaila9le imme)iatel
- hen memor isnt rea) processor must 5aitBstall3
eisters are conenient for aria9le storae
- Compiler assins some aria9les Dust to reisters
- More compact co)e since small fiel)s specifreisters
compare) to memor a))resses3Registers Cache
MemoryProcessor Disk
7hat perations are eeded7hat
perations are eeded
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 97101
3
Arithmetic E (oical
Inteer arithmetic A$$ SU MU(T $I S7IT
(oical operation AN$ NT
$ata Transfer - cop loa) store
Control - 9ranch Dump call return
loatin Point A$$ MU( $I 3 Same as arithmetic 9ut usuall ta=e 9ier operan)s
$ecimal - A$$$ CNT
Strin - moe compare search
raphics F piel an) erte compressionG)ecompression operations
Stacamps Architecture Pros and ConsStacamps Architecture Pros and Cons
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 98101
Stacamps Architecture Pros and ConsStacamps Architecture Pros and Cons
Pros oo) co)e )ensit implicit top of stac=3
(o5 har)5are re1uirements
as to 5rite a simpler compiler for stac= architectures
Cons Stac= 9ecomes the 9ottlenec=
(ittle a9ilit for parallelism or pipelinin
$ata is not al5as at the top of stac= 5hen nee) so a))itionalinstructions li=e TP an) SAP are nee)e)
$ifficult to 5rite an optimiin compiler for stac= architectures
Accumulator Architecture: Pros and Cons
Pros
- Very low hardware requirements
- Easy to design and understand
Cons
- The accumulator becomes the bottleneck
- Little ability for parallelism or pipelining
- High memory traffic
Memory-Memory Architecture: Pros and Cons
Pros
- Requires fewer instructions (especially with 3 operands)
- Easy to write compilers for (especially with 3 operands)
Cons
- Very high memory traffic (especially with 3 operands)
- Variable number of clocks per instruction
- With two operands, more data movements are required
Memory-Register Architecture: Pros and Cons
Pros
- Some data can be accessed without loading first
- Instruction format easy to encode
- Good code density
Cons
- Operands are not equivalent (poor orthogonality)
- Variable number of clocks per instruction
- May limit number of registers
Number of Addresses (cont.)
Example C statement:
    A = B + C * D - E + F + A
Equivalent code:
    load  R1,B
    load  R2,C
    load  R3,D
    load  R4,E
    load  R5,F
    load  R6,A
    mult  R2,R2,R3
    add   R2,R2,R1
    sub   R2,R2,R4
    add   R2,R2,R5
    add   R2,R2,R6
    store A,R2
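To check that a load/store sequence of this shape really computes A = B + C * D - E + F + A, it can be run on a toy interpreter. This is a sketch under assumptions: the register names R1 to R6, the tuple encoding, and run_load_store are illustrative choices, and the initial memory values are made up.

```python
import operator

# ALU operations work on registers only, as in a load/store architecture.
ALU_OPS = {"add": operator.add, "sub": operator.sub, "mult": operator.mul}


def run_load_store(program, memory):
    """Run a toy three-address program; only load and store touch memory."""
    regs = {}
    for op, *args in program:
        if op == "load":            # load Rd,var
            rd, var = args
            regs[rd] = memory[var]
        elif op == "store":         # store var,Rs
            var, rs = args
            memory[var] = regs[rs]
        else:                       # op Rd,Rs1,Rs2
            rd, rs1, rs2 = args
            regs[rd] = ALU_OPS[op](regs[rs1], regs[rs2])
    return memory


PROGRAM = [
    ("load", "R1", "B"), ("load", "R2", "C"), ("load", "R3", "D"),
    ("load", "R4", "E"), ("load", "R5", "F"), ("load", "R6", "A"),
    ("mult", "R2", "R2", "R3"),     # C * D
    ("add", "R2", "R2", "R1"),      # + B
    ("sub", "R2", "R2", "R4"),      # - E
    ("add", "R2", "R2", "R5"),      # + F
    ("add", "R2", "R2", "R6"),      # + A
    ("store", "A", "R2"),
]

memory = run_load_store(PROGRAM, {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6})
print(memory["A"])  # 16
```

Six loads, five register-to-register operations, and one store: twelve short instructions, with memory touched only seven times.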
Flow of Control
Default is sequential flow.
Several instructions alter this default execution:
Branches
- Unconditional
- Conditional
- Delayed branches
Procedure calls
- Delayed procedure calls
Branches
Unconditional
- Absolute address
- PC-relative
  - Target address is specified relative to PC contents
  - Relocatable code
Example: MIPS
- Absolute address
    j target
- PC-relative
    b target
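Why PC-relative branches give relocatable code can be shown with a little arithmetic. The sketch below is simplified on purpose: it adds a byte offset directly to the branch's own address, whereas real MIPS hardware measures the offset in words from the instruction after the branch. The function names and addresses are hypothetical.

```python
def branch_offset(branch_address, target_address):
    """Offset stored in the instruction: distance from branch to target."""
    return target_address - branch_address


def pc_relative_target(pc, offset):
    """Target of a PC-relative branch: the stored offset is added to the PC."""
    return pc + offset


# Code assembled to run at 0x1000: branch at 0x1020, target at 0x1040.
offset = branch_offset(0x1020, 0x1040)
absolute_target = 0x1040

# The same code loaded at 0x8000 instead: the branch now sits at 0x8020.
relocated_target = pc_relative_target(0x8020, offset)

print(hex(relocated_target))   # 0x8040, still the right instruction
print(hex(absolute_target))    # 0x1040, stale unless the loader patches it
```

The stored offset is unchanged by the move, so the PC-relative branch needs no fix-up; the absolute address does.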
Branches
Conditional
- Jump is taken only if the condition is met
Two types
- Set-Then-Jump
  - Condition testing is separated from branching
  - Condition code registers are used to convey the condition test result
  - Condition code registers keep a record of the status of the last ALU operation, such as an overflow condition
- Example: Pentium code
    cmp AX,BX     ; compare AX and BX
    je  target    ; jump if equal
- Test-and-Jump
  - Single instruction performs condition testing and branching
- Example: MIPS instruction
    beq Rsrc1,Rsrc2,target
  Jumps to target if Rsrc1 = Rsrc2
Delayed branching
Control is transferred after executing the instruction that follows the branch instruction
- This instruction slot is called the delay slot
Improves efficiency
Highly pipelined RISC processors support delayed branching
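The effect of a delay slot on execution order can be traced with a toy model. This is a sketch under assumptions: the program encoding and run_with_delay_slot are hypothetical, every branch is taken, and a branch sitting in another branch's delay slot is not handled specially.

```python
def run_with_delay_slot(program):
    """Return the order in which instruction indices execute.

    A taken branch does not redirect control at once: the instruction
    right after it (the delay slot) runs first, then the target.
    """
    trace = []
    pc = 0
    pending_target = None
    while pc < len(program):
        op, arg = program[pc]
        trace.append(pc)
        next_pc = pc + 1
        if pending_target is not None:
            # This instruction was the delay slot; the branch takes effect now.
            next_pc = pending_target
            pending_target = None
        if op == "branch":
            pending_target = arg
        pc = next_pc
    return trace


program = [
    ("nop", None),      # 0
    ("branch", 4),      # 1: branch to index 4
    ("nop", None),      # 2: delay slot, still executed
    ("nop", None),      # 3: skipped
    ("nop", None),      # 4: branch target
]
trace = run_with_delay_slot(program)
print(trace)  # [0, 1, 2, 4]
```

Index 2 runs even though it follows the branch, so a compiler can place useful work there instead of leaving the pipeline idle.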
Procedure calls
Facilitate modular programming
Require two pieces of information to return
- End of procedure
  - Pentium uses the ret instruction
  - MIPS uses the jr instruction
- Return address
  - In a (special) register: MIPS allows any general-purpose register
  - On the stack: Pentium
[Figure: delay slot]
Parameter Passing
Two basic techniques
Register-based (e.g., PowerPC, MIPS)
- Internal registers are used
  - Faster
  - Limits the number of parameters
  - Recursive procedures (registers must be saved across calls)
Stack-based (e.g., Pentium)
- Stack is used
  - More general
2. Operand Types
Instructions support basic data types
- Characters
- Integers
- Floating-point
Instruction overload
- Same instruction for different data types
- Example: Pentium
    mov AL,address     ; loads an 8-bit value
    mov AX,address     ; loads a 16-bit value
    mov EAX,address    ; loads a 32-bit value
Separate instructions
- Instructions specify the operand size
- Example: MIPS
    lb Rdest,address    ; loads a byte
    lh Rdest,address    ; loads a halfword (16 bits)
    lw Rdest,address    ; loads a word (32 bits)
    ld Rdest,address    ; loads a doubleword (64 bits)
Similar instructions exist for store
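What "the instruction specifies the operand size" means can be sketched with Python's struct module: the same address yields a different value depending on which load is used. The byte order (little-endian), the signed interpretation, and the helper load are assumptions made for the example; only the mnemonics and sizes come from the slide.

```python
import struct

# Mnemonic -> struct format. "<" means little-endian (an assumption here);
# b, h, i, q are signed 1-, 2-, 4-, and 8-byte integers.
LOAD_FORMATS = {"lb": "<b", "lh": "<h", "lw": "<i", "ld": "<q"}


def load(memory, address, mnemonic):
    """Read a value of the size implied by the mnemonic from a byte buffer."""
    (value,) = struct.unpack_from(LOAD_FORMATS[mnemonic], memory, address)
    return value


memory = bytes([0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07, 0x08])

for mnemonic in ("lb", "lh", "lw", "ld"):
    size = struct.calcsize(LOAD_FORMATS[mnemonic])
    print(mnemonic, size, hex(load(memory, 0, mnemonic)))
```

The Pentium style overloads one mnemonic and lets the register name (AL, AX, EAX) pick the size; the MIPS style moves that choice into the opcode itself.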
3. Addressing Modes
How the operands are specified
Operands can be in three places
- Registers
  - Register addressing mode
- Part of instruction
  - Constant
  - Immediate addressing mode
  - All processors support these two addressing modes
- Memory
  - Difference between RISC and CISC
  - CISC supports a large variety of addressing modes
  - RISC follows load/store architecture
4. Instruction Types
Several types of instructions
Data movement
- Pentium: mov dest,src
- Some processors do not provide direct data movement instructions
- Indirect data movement
    add Rdest,Rsrc,0    ; Rdest = Rsrc+0
Arithmetic and Logical
- Arithmetic
  - Integer and floating-point, signed and unsigned
  - add, subtract, multiply, divide
- Logical
  - and, or, not, xor
Condition code bits
- S: Sign bit (0 = +, 1 = -)
- Z: Zero bit (0 = nonzero, 1 = zero)
- O: Overflow bit (0 = no overflow, 1 = overflow)
- C: Carry bit (0 = no carry, 1 = carry)
Example: Pentium
    cmp count,25    ; compare count to 25
                    ; subtract 25 from count
    je  target      ; jump if equal
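How a compare sets the four condition code bits can be sketched as a subtraction whose result is discarded. The function compare_flags and the 32-bit width are assumptions for illustration; the carry bit is modeled as a borrow, which is how x86 treats subtraction.

```python
def compare_flags(a, b, bits=32):
    """Return the S, Z, O, C bits produced by computing a - b."""
    mask = (1 << bits) - 1
    sign_bit = 1 << (bits - 1)
    ua, ub = a & mask, b & mask
    result = (ua - ub) & mask               # wrap to the register width

    return {
        "S": int(bool(result & sign_bit)),  # result is negative
        "Z": int(result == 0),              # operands were equal
        # Signed overflow: operands differ in sign and the result's sign
        # differs from the first operand's.
        "O": int(bool((ua ^ ub) & (ua ^ result) & sign_bit)),
        "C": int(ua < ub),                  # unsigned borrow
    }


equal = compare_flags(25, 25)
smaller = compare_flags(10, 25)
print(equal)    # {'S': 0, 'Z': 1, 'O': 0, 'C': 0}
print(smaller)  # {'S': 1, 'Z': 0, 'O': 0, 'C': 1}

# "je target" then only has to look at the Z bit.
jump_taken = equal["Z"] == 1
```

The test and the branch communicate only through these bits, which is what the set-then-jump style relies on.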
Flow control and I/O instructions
- Branch
- Procedure call
- Interrupts
I/O instructions
- Memory-mapped I/O
  - Most processors support memory-mapped I/O
  - No separate instructions for I/O
- Isolated I/O
  - Pentium supports isolated I/O
  - Separate I/O instructions
      in  AX,io_port    ; read from an I/O port
      out io_port,AX    ; write to an I/O port
5. Instruction Formats
Two types
Fixed-length
- Used by RISC processors
- 32-bit RISC processors use 32-bit-wide instructions
  - Examples: SPARC, MIPS, PowerPC
Variable-length
- Used by CISC processors
- Memory operands need more bits to specify
Opcode
- Major and exact operation
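A fixed-length format can be illustrated by packing fields into one 32-bit word. The layout below is the MIPS R-type format (6-bit opcode, three 5-bit register fields, 5-bit shift amount, 6-bit function code); the slides do not spell it out, so treat it as an assumption brought in for the example. The opcode field gives the major operation and the function field the exact one.

```python
def encode_r_type(opcode, rs, rt, rd, shamt, funct):
    """Pack the six R-type fields into a single 32-bit instruction word."""
    return (opcode << 26) | (rs << 21) | (rt << 16) | (rd << 11) | (shamt << 6) | funct


def decode_r_type(word):
    """Unpack a 32-bit word back into its fields by shifting and masking."""
    return {
        "opcode": (word >> 26) & 0x3F,
        "rs": (word >> 21) & 0x1F,
        "rt": (word >> 16) & 0x1F,
        "rd": (word >> 11) & 0x1F,
        "shamt": (word >> 6) & 0x1F,
        "funct": word & 0x3F,
    }


# add $t0,$t1,$t2: opcode 0, funct 0x20, registers 8, 9, and 10.
word = encode_r_type(opcode=0, rs=9, rt=10, rd=8, shamt=0, funct=0x20)
print(f"{word:#010x}")  # 0x012a4020
fields = decode_r_type(word)
```

Because every field sits at a fixed bit position, decoding is a handful of shifts and masks that hardware can do in parallel; a variable-length format has to work out the instruction's length first.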
Examples of Instruction Formats
RISC (Reduced Instruction Set Computer)
versus
CISC (Complex Instruction Set Computer)
RISC vs. CISC
The underlying philosophy of RISC machines is that a system is better able to manage program execution when the program consists of only a few different instructions that are the same length and require the same number of clock cycles to decode and execute.
RISC systems access memory only with explicit load and store instructions.
In CISC systems, many different kinds of instructions access memory, making instruction length variable and fetch-decode-execute time unpredictable.
The difference between CISC and RISC becomes evident through the basic computer performance equation:
    time/program = (time/cycle) x (cycles/instruction) x (instructions/program)
RISC systems shorten execution time by reducing the clock cycles per instruction.
CISC systems improve performance by reducing the number of instructions per program.
The simple instruction set of RISC machines enables control units to be hardwired for maximum speed.
The more complex (and variable) instruction set of CISC machines requires microcode-based control units that interpret instructions as they are fetched from memory. This translation takes time.
With fixed-length instructions, RISC lends itself to pipelining and speculative execution.
Consider the following program fragments:
CISC
    mov ax, 10
    mov bx, 5
    mul bx, ax
RISC
    mov ax, 0
    mov bx, 10
    mov cx, 5
    Begin: add ax, bx
    loop Begin
The total clock cycles for the CISC version might be:
    (2 movs x 1 cycle) + (1 mul x 30 cycles) = 32 cycles
While the clock cycles for the RISC version are:
    (3 movs x 1 cycle) + (5 adds x 1 cycle) + (5 loops x 1 cycle) = 13 cycles
With the RISC clock cycle also being shorter, RISC gives us much faster execution speeds.
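The cycle arithmetic for the two fragments can be reproduced in a few lines. The per-instruction costs used here (1 cycle for mov, add, and loop; 30 cycles for mul) and the five loop iterations are assumptions matching this example, not measurements of a real machine.

```python
# Each entry is (mnemonic, assumed cycle cost).
CISC_PROGRAM = [("mov", 1), ("mov", 1), ("mul", 30)]

RISC_SETUP = [("mov", 1), ("mov", 1), ("mov", 1)]
RISC_LOOP_BODY = [("add", 1), ("loop", 1)]
LOOP_ITERATIONS = 5     # the loop counter in cx starts at 5


def total_cycles(instructions):
    """Sum the cycle cost of a straight-line instruction sequence."""
    return sum(cycles for _, cycles in instructions)


cisc_cycles = total_cycles(CISC_PROGRAM)
risc_cycles = total_cycles(RISC_SETUP) + LOOP_ITERATIONS * total_cycles(RISC_LOOP_BODY)

print(cisc_cycles, risc_cycles)  # 32 13
```

The RISC version executes thirteen instructions against three, yet finishes in fewer cycles because no single instruction is expensive.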
Because of their load-store ISAs, RISC architectures require a large number of CPU registers.
These registers provide fast access to data during sequential program execution.
They can also be employed to reduce the overhead typically caused by passing parameters to subprograms.
Instead of pulling parameters off of a stack, the subprogram is directed to use a subset of registers.
[Figure: overlapping register windows]
This is how registers can be overlapped in a RISC system.
The current window pointer (CWP) points to the active register window.
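Overlapping windows can be modeled as views onto one circular register file, where the caller's output registers are the same physical registers as the callee's input registers. The sketch below is hypothetical: the class name, the window count, and the 8/8/8 split are assumptions, window overflow is not handled, and the direction in which CWP moves on a call is arbitrary here.

```python
class RegisterWindows:
    """Toy register file with overlapping in/local/out windows."""

    def __init__(self, n_windows=4, n_shared=8, n_local=8):
        self.n_windows = n_windows
        self.n_shared = n_shared
        self.stride = n_shared + n_local    # new registers added per window
        self.regs = [0] * (n_windows * self.stride)
        self.cwp = 0                        # current window pointer

    def _index(self, kind, i):
        # "out" registers start where the next window's "in" registers start.
        offset = {"in": 0, "local": self.n_shared, "out": self.stride}[kind]
        return (self.cwp * self.stride + offset + i) % len(self.regs)

    def read(self, kind, i):
        return self.regs[self._index(kind, i)]

    def write(self, kind, i, value):
        self.regs[self._index(kind, i)] = value

    def call(self):
        self.cwp = (self.cwp + 1) % self.n_windows

    def ret(self):
        self.cwp = (self.cwp - 1) % self.n_windows


rw = RegisterWindows()
rw.write("out", 0, 42)          # caller places an argument
rw.call()
argument = rw.read("in", 0)     # callee sees it without any copying
rw.write("in", 0, 7)            # callee leaves a return value
rw.ret()
returned = rw.read("out", 0)
print(argument, returned)  # 42 7
```

Passing a parameter costs nothing beyond moving CWP, which is the overhead reduction the register-based approach is after.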
It is becoming increasingly difficult to distinguish RISC architectures from CISC architectures.
Some RISC systems provide more extravagant instruction sets than some CISC systems.
Some systems combine both approaches.
The following two slides summarize the characteristics that traditionally typify the differences between these two architectures.
RISC                                           CISC
Multiple register sets.                        Single register set.
Three operands per instruction.                One or two register operands per instruction.
Parameter passing through register windows.    Parameter passing through memory.
Single-cycle instructions.                     Multiple-cycle instructions.
Hardwired control.                             Microprogrammed control.
Highly pipelined.                              Less pipelined.
Simple instructions, few in number.            Many complex instructions.
Fixed-length instructions.                     Variable-length instructions.
Complexity in compiler.                        Complexity in microcode.
Only LOAD/STORE instructions access memory.    Many instructions can access memory.
Few addressing modes.                          Many addressing modes.
Summary
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 67101
0lo1 of Control cont-0lo1 of Control cont-
8ranches
4nconditional
$ Absolute address
$ PC$relative
U Target address is specified relative to PC contents U 1elocatable code
ltample6 MIPS
$ Absolute address
9 target
$ PC$relative
8 target
Branches
- Conditional
  - Jump is taken only if the condition is met
- Two types
  - Set-Then-Jump
    - Condition testing is separated from branching
    - Condition code registers are used to convey the condition test result
    - Condition code registers keep a record of the status of the last ALU operation, such as an overflow condition
  - Example: Pentium code
      cmp AX,BX    ; compare AX and BX
      je  target   ; jump if equal
  - Test-and-Jump
    - Single instruction performs condition testing and branching
  - Example: MIPS instruction
      beq Rsrc1,Rsrc2,target
    Jumps to target if Rsrc1 = Rsrc2
- Delayed branching
  - Control is transferred after executing the instruction that follows the branch instruction
    - This instruction slot is called the delay slot
  - Improves efficiency
  - Highly pipelined RISC processors support delayed branching
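A toy interpreter makes the effect of a delay slot visible. The sketch below is an assumption-laden illustration, not a model of any real pipeline: the instruction tuples, the `run` function, and the labels are invented for this example, and it assumes the delay-slot instruction is never itself a branch.

```python
def run(program, delayed):
    """Execute a toy program and return the labels in execution order.

    Each instruction is (label, target), where target is either None
    (fall through) or the index of the instruction to branch to.
    With delayed=True, the instruction after a branch (the delay slot)
    still executes before control reaches the target.
    """
    trace = []
    pc = 0
    while pc < len(program):
        label, target = program[pc]
        trace.append(label)
        if target is None:
            pc += 1
        else:
            if delayed and pc + 1 < len(program):
                trace.append(program[pc + 1][0])  # delay-slot instruction
            pc = target
    return trace


program = [
    ("branch", 3),       # unconditional branch to index 3
    ("add", None),       # sits in the delay slot
    ("skipped", None),   # never executed in either mode
    ("target", None),
]
```

With `delayed=False` the `add` is skipped; with `delayed=True` it runs before the target, so the slot does useful work instead of being wasted.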
Procedure calls
- Facilitate modular programming
- Require two pieces of information to return
  - End of procedure
    - Pentium uses the ret instruction
    - MIPS uses the jr instruction
  - Return address
    - In a (special) register: MIPS allows any general-purpose register
    - On the stack: Pentium
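The two places a return address can live can be sketched as follows. All names here (`call_with_link_register`, the `"ra"` key, `call_size`) are hypothetical, and the sketch assumes 4-byte instructions in the register case and ignores the branch delay slot.

```python
def call_with_link_register(pc, target, registers):
    """Register style: save the return address in a register."""
    registers["ra"] = pc + 4   # assumed 4-byte instructions, no delay slot
    return target              # new PC


def return_with_link_register(registers):
    """jr-style return: jump to the address held in the register."""
    return registers["ra"]


def call_with_stack(pc, target, stack, call_size):
    """Stack style: push the address of the next instruction."""
    stack.append(pc + call_size)
    return target


def return_with_stack(stack):
    """ret-style return: pop the return address off the stack."""
    return stack.pop()
```

Nested calls work naturally with the stack version; with a single link register, a procedure that calls another must first save the register's old value somewhere.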
[Figure: delay slot]
Parameter Passing
Two basic techniques
- Register-based (e.g., PowerPC, MIPS)
  - Internal registers are used
    - Faster
    - Limits the number of parameters
    - Complicates recursive procedures
- Stack-based (e.g., Pentium)
  - Stack is used
    - More general
2 Operand Types
Instructions support basic data types
- Characters
- Integers
- Floating-point
Instruction overload
- Same instruction for different data types
- Example: Pentium
    mov AL,address    ; loads an 8-bit value
    mov AX,address    ; loads a 16-bit value
    mov EAX,address   ; loads a 32-bit value
Separate instructions
- Instructions specify the operand size
- Example: MIPS
    lb Rdest,address   # loads a byte
    lh Rdest,address   # loads a halfword (16 bits)
    lw Rdest,address   # loads a word (32 bits)
    ld Rdest,address   # loads a doubleword (64 bits)
- Similar instructions exist for store
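Whether the size comes from the register name (instruction overload) or from the mnemonic (separate instructions), the underlying operation is a load of 1, 2, 4, or 8 bytes. A minimal Python sketch, assuming little-endian byte order and unsigned values (real loads may sign-extend, and byte order depends on the machine); the `memory` contents and the `load` helper are made up for illustration:

```python
import struct

# Eight bytes of example memory (arbitrary values chosen for this sketch).
memory = bytes([0x78, 0x56, 0x34, 0x12, 0xEF, 0xCD, 0xAB, 0x90])


def load(address: int, size: int) -> int:
    """Load an unsigned little-endian value of `size` bytes."""
    fmt = {1: "<B", 2: "<H", 4: "<I", 8: "<Q"}[size]
    return struct.unpack_from(fmt, memory, address)[0]


byte_value = load(0, 1)        # like an 8-bit load
halfword_value = load(0, 2)    # like a 16-bit load
word_value = load(0, 4)        # like a 32-bit load
doubleword_value = load(0, 8)  # like a 64-bit load
```

The same address yields a different value depending only on the operand size, which is exactly the information the instruction must encode one way or the other.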
3 Addressing Modes
How the operands are specified
- Operands can be in three places
  - Registers
    - Register addressing mode
  - Part of instruction
    - Constant
    - Immediate addressing mode
    - All processors support these two addressing modes
  - Memory
    - Difference between RISC and CISC
    - CISC supports a large variety of addressing modes
    - RISC follows load/store architecture
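The three operand locations can be sketched as a single lookup routine. The mode names, the `fetch_operand` function, and the dictionary-based register file and memory are assumptions made for this illustration; real machines encode the mode in instruction bits.

```python
def fetch_operand(mode, value, registers, memory):
    """Return the operand selected by a (mode, value) pair."""
    if mode == "register":    # operand is held in a register
        return registers[value]
    if mode == "immediate":   # operand is a constant inside the instruction
        return value
    if mode == "memory":      # operand is at a memory address
        return memory[value]
    raise ValueError(f"unknown addressing mode: {mode}")


registers = {"R1": 7}
memory = {0x2000: 99}
```

In a load/store design only the load and store instructions would ever use the `"memory"` case; arithmetic instructions would be restricted to the first two.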
4 Instruction Types
Several types of instructions
- Data movement
  - Pentium: mov dest,src
  - Some do not provide direct data movement instructions
  - Indirect data movement
      add Rdest,Rsrc,0   ; Rdest = Rsrc+0
- Arithmetic and Logical
  - Arithmetic
    - Integer and floating-point, signed and unsigned
    - add, subtract, multiply, divide
  - Logical
    - and, or, not, xor
Condition code bits
- S: Sign bit (0 = +, 1 = -)
- Z: Zero bit (0 = nonzero, 1 = zero)
- O: Overflow bit (0 = no overflow, 1 = overflow)
- C: Carry bit (0 = no carry, 1 = carry)
- Example: Pentium
    cmp count,25   ; compare count to 25
                   ; (subtract 25 from count)
    je  target     ; jump if equal
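How a compare sets the four condition code bits can be sketched in Python. The function name, the 16-bit default width, and the dictionary result are assumptions for illustration; the carry bit is modeled as the borrow out of the subtraction, which is how x86-style compares behave.

```python
def compare_flags(a: int, b: int, bits: int = 16) -> dict:
    """Flags produced by subtracting b from a, as a compare does."""
    mask = (1 << bits) - 1
    sign = 1 << (bits - 1)
    a &= mask
    b &= mask
    result = (a - b) & mask  # the result itself is discarded by a compare
    return {
        "S": int(bool(result & sign)),                  # result is negative
        "Z": int(result == 0),                          # result is zero
        "O": int(bool((a ^ b) & (a ^ result) & sign)),  # signed overflow
        "C": int(a < b),                                # unsigned borrow
    }
```

A "jump if equal" then only needs to test the Z bit, which is what separates the condition test from the branch in the set-then-jump style.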
Flow control and I/O instructions
  - Branch
  - Procedure call
  - Interrupts
I/O instructions
  - Memory-mapped I/O
    - Most processors support memory-mapped I/O
    - No separate instructions for I/O
  - Isolated I/O
    - Pentium supports isolated I/O
    - Separate I/O instructions
        in  AX,port   ; read from an I/O port
        out port,AX   ; write to an I/O port
5 Instruction Formats
Two types
- Fixed-length
  - Used by RISC processors
  - 32-bit RISC processors use 32-bit-wide instructions
    - Examples: SPARC, MIPS, PowerPC
- Variable-length
  - Used by CISC processors
  - Memory operands need more bits to specify
Opcode
- Major and exact operation
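As a concrete instance of a fixed-length format, the sketch below packs and unpacks the standard MIPS R-type layout (6-bit opcode, three 5-bit register fields, 5-bit shift amount, 6-bit function code). The helper names are invented for this example; the field widths and the `add` function code 0x20 are the standard MIPS ones.

```python
def encode_r_type(rs, rt, rd, shamt, funct, opcode=0):
    """Pack the six R-type fields into one 32-bit word."""
    return (
        (opcode << 26) | (rs << 21) | (rt << 16)
        | (rd << 11) | (shamt << 6) | funct
    )


def decode_r_type(word):
    """Unpack a 32-bit word into its R-type fields."""
    return {
        "opcode": (word >> 26) & 0x3F,
        "rs": (word >> 21) & 0x1F,
        "rt": (word >> 16) & 0x1F,
        "rd": (word >> 11) & 0x1F,
        "shamt": (word >> 6) & 0x1F,
        "funct": word & 0x3F,
    }


# add $t0, $t1, $t2  (registers 8, 9, 10; funct 0x20 selects add)
word = encode_r_type(rs=9, rt=10, rd=8, shamt=0, funct=0x20)
```

Because every field sits at a fixed bit position, decoding is a handful of shifts and masks, which is part of why fixed-length formats are easy to decode in hardware.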
Examples of Instruction Formats
[Figure: examples of instruction formats]
RISC (Reduced Instruction Set Computer) versus CISC (Complex Instruction Set Computer)
- The underlying philosophy of RISC machines is that a system is better able to manage program execution when the program consists of only a few different instructions that are the same length and require the same number of clock cycles to decode and execute.
- RISC systems access memory only with explicit load and store instructions.
- In CISC systems, many different kinds of instructions access memory, making instruction length variable and fetch-decode-execute time unpredictable.
- The difference between CISC and RISC becomes evident through the basic computer performance equation:
    time/program = (time/cycle) × (cycles/instruction) × (instructions/program)
- RISC systems shorten execution time by reducing the clock cycles per instruction.
- CISC systems improve performance by reducing the number of instructions per program.
- The simple instruction set of RISC machines enables control units to be hardwired for maximum speed.
- The more complex, and variable, instruction set of CISC machines requires microcode-based control units that interpret instructions as they are fetched from memory. This translation takes time.
- With fixed-length instructions, RISC lends itself to pipelining and speculative execution.
- Consider the following program fragments:
  CISC:
      mov ax, 10
      mov bx, 5
      mul bx, ax
  RISC:
      mov ax, 0
      mov bx, 10
      mov cx, 5
      Begin: add ax, bx
      loop Begin
- The total clock cycles for the CISC version might be:
    (2 movs × 1 cycle) + (1 mul × 30 cycles) = 32 cycles
- While the clock cycles for the RISC version are:
    (3 movs × 1 cycle) + (5 adds × 1 cycle) + (5 loops × 1 cycle) = 13 cycles
- With the RISC clock cycle also being shorter, RISC gives us much faster execution speeds.
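The cycle counts for the two fragments can be checked with a short Python sketch. The per-instruction costs are assumed for illustration (1 cycle for each mov, add, and loop; 30 cycles for the mul; five loop iterations), and the cycle times passed to `execution_time_ns` are hypothetical.

```python
def total_cycles(instruction_mix):
    """Sum count * cycles over (count, cycles_per_instruction) pairs."""
    return sum(count * cycles for count, cycles in instruction_mix)


def execution_time_ns(instruction_mix, cycle_time_ns):
    """Execution time = total clock cycles * time per cycle."""
    return total_cycles(instruction_mix) * cycle_time_ns


CISC_MIX = [(2, 1), (1, 30)]          # 2 movs, 1 mul
RISC_MIX = [(3, 1), (5, 1), (5, 1)]   # 3 movs, 5 adds, 5 loops

cisc_cycles = total_cycles(CISC_MIX)
risc_cycles = total_cycles(RISC_MIX)
```

The RISC fragment executes more instructions but fewer cycles, and any reduction in cycle time multiplies that advantage.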
- Because of their load-store ISAs, RISC architectures require a large number of CPU registers.
- These registers provide fast access to data during sequential program execution.
- They can also be employed to reduce the overhead typically caused by passing parameters to subprograms.
- Instead of pulling parameters off of a stack, the subprogram is directed to use a subset of registers.
[Figure: overlapping register windows]
- This is how registers can be overlapped in a RISC system.
- The current window pointer (CWP) points to the active register window.
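Overlapping register windows can be modeled with a circular register file in which one window's output registers are the next window's input registers. The class below is a toy sketch: the window count, the group sizes, the direction in which the CWP moves on a call, and the absence of overflow handling are all simplifying assumptions, not a description of any specific processor.

```python
class RegisterWindows:
    """Toy overlapping register windows (all sizes are assumptions)."""

    def __init__(self, windows=4, shared=8, local=8):
        self.windows = windows
        self.shared = shared            # registers shared between neighbours
        self.stride = shared + local    # new registers added per window
        self.regs = [0] * (windows * self.stride)
        self.cwp = 0                    # current window pointer

    def _index(self, kind, n):
        # "out" registers start where the next window's "in" registers start.
        offset = {"in": 0, "local": self.shared, "out": self.stride}[kind]
        return (self.cwp * self.stride + offset + n) % len(self.regs)

    def write(self, kind, n, value):
        self.regs[self._index(kind, n)] = value

    def read(self, kind, n):
        return self.regs[self._index(kind, n)]

    def call(self):
        self.cwp = (self.cwp + 1) % self.windows

    def ret(self):
        self.cwp = (self.cwp - 1) % self.windows


rw = RegisterWindows()
rw.write("out", 0, 42)      # caller places a parameter
rw.call()
received = rw.read("in", 0)  # callee sees it without any copying
```

Passing a parameter costs no memory traffic at all: the call only changes the CWP, and the shared registers are reinterpreted.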
- It is becoming increasingly difficult to distinguish RISC architectures from CISC architectures.
- Some RISC systems provide more extravagant instruction sets than some CISC systems.
- Some systems combine both approaches.
- The following two slides summarize the characteristics that traditionally typify the differences between these two architectures.
RISC
- Multiple register sets.
- Three operands per instruction.
- Parameter passing through register windows.
- Single-cycle instructions.
- Hardwired control.
- Highly pipelined.
CISC
- Single register set.
- One or two register operands per instruction.
- Parameter passing through memory.
- Multiple cycle instructions.
- Microprogrammed control.
- Less pipelined.
RISC (continued)
- Simple instructions, few in number.
- Fixed length instructions.
- Complexity in compiler.
- Only LOAD/STORE instructions access memory.
- Few addressing modes.
CISC (continued)
- Many complex instructions.
- Variable length instructions.
- Complexity in microcode.
- Many instructions can access memory.
- Many addressing modes.
Summary
Instruction Set Design Issues
Instruction set design issues include:
- Where are operands stored?
  - registers, memory, stack, accumulator
- How many explicit operands are there?
  - 0, 1, 2, or 3
- How is the operand location specified?
  - register, immediate, indirect, . . .
- What type and size of operands are supported?
  - byte, int, float, double, string, vector, . . .
- What operations are supported?
  - add, sub, mul, move, compare, . . .
More About General Purpose Registers
Why do almost all new architectures use GPRs?
- Registers are much faster than memory (even cache)
  - Register values are available immediately
  - When memory isn't ready, the processor must wait ("stall")
- Registers are convenient for variable storage
  - Compiler assigns some variables just to registers
  - More compact code, since small fields specify registers (compared to memory addresses)
[Figure: memory hierarchy with Processor (Registers), Cache, Memory, Disk]
What Operations are Needed
- Arithmetic and Logical
  - Integer arithmetic: ADD, SUB, MULT, DIV, SHIFT
  - Logical operation: AND, OR, XOR, NOT
- Data Transfer: copy, load, store
- Control: branch, jump, call, return
- Floating Point: ADDF, MULF, DIVF (same as arithmetic, but usually take bigger operands)
- Decimal: ADDD, CONVERT
- String: move, compare, search
- Graphics: pixel and vertex, compression/decompression operations
Stack Architecture: Pros and Cons
Pros
- Good code density (implicit top of stack)
- Low hardware requirements
- Easy to write a simpler compiler for stack architectures
Cons
- Stack becomes the bottleneck
- Little ability for parallelism or pipelining
- Data is not always at the top of stack when needed, so additional instructions like TOP and SWAP are needed
- Difficult to write an optimizing compiler for stack architectures
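A few lines of Python are enough to sketch a stack machine and show why its code is dense: arithmetic instructions name no operands at all. The instruction names, the tuple encoding, and the dictionary used as memory are assumptions made for this illustration.

```python
def run_stack_machine(program, memory):
    """Run (op, *args) tuples against an operand stack; return memory."""
    stack = []
    for op, *args in program:
        if op == "push":            # memory -> top of stack
            stack.append(memory[args[0]])
        elif op == "pop":           # top of stack -> memory
            memory[args[0]] = stack.pop()
        elif op == "add":           # operands are implicit: top two entries
            right, left = stack.pop(), stack.pop()
            stack.append(left + right)
        elif op == "mul":
            right, left = stack.pop(), stack.pop()
            stack.append(left * right)
        elif op == "swap":          # reorder when data is not on top
            stack[-1], stack[-2] = stack[-2], stack[-1]
        else:
            raise ValueError(f"unknown instruction: {op}")
    return memory


# D = (A + B) * C
program = [
    ("push", "A"), ("push", "B"), ("add",),
    ("push", "C"), ("mul",), ("pop", "D"),
]
result = run_stack_machine(program, {"A": 2, "B": 3, "C": 4})
```

Every instruction funnels through the top of the stack, which illustrates both the compactness of the code and why the stack becomes the bottleneck.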
Accumulator Architecture: Pros and Cons
Pros
- Very low hardware requirements
- Easy to design and understand
Cons
- Accumulator becomes the bottleneck
- Little ability for parallelism or pipelining
- High memory traffic
Memory-Memory Architecture: Pros and Cons
Pros
- Requires fewer instructions (especially if 3 operands)
- Easy to write compilers for (especially if 3 operands)
Cons
- Very high memory traffic (especially if 3 operands)
- Variable number of clocks per instruction
- With two operands, more data movements are required
Memory-Register Architecture: Pros and Cons
Pros
- Some data can be accessed without loading first
- Instruction format easy to encode
- Good code density
Cons
- Operands are not equivalent (poor orthogonality)
- Variable number of clocks per instruction
- May limit number of registers
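The trade-offs listed for the four architecture classes can be made concrete by counting instructions and data-memory references for one small computation, D = (A + B) * C. The instruction sequences below are hand-written for illustration with made-up mnemonics, and T is an assumed temporary memory location; they are a sketch, not output from any real compiler.

```python
# Each entry is (instruction text, number of data-memory operands).
SEQUENCES = {
    "stack": [
        ("push A", 1), ("push B", 1), ("add", 0),
        ("push C", 1), ("mul", 0), ("pop D", 1),
    ],
    "accumulator": [
        ("load A", 1), ("add B", 1), ("mul C", 1), ("store D", 1),
    ],
    "memory-memory": [
        ("add T, A, B", 3), ("mul D, T, C", 3),
    ],
    "memory-register": [
        ("load R1, A", 1), ("add R1, B", 1),
        ("mul R1, C", 1), ("store D, R1", 1),
    ],
}


def summarize(sequence):
    """Count instructions and data-memory references in a sequence."""
    return {
        "instructions": len(sequence),
        "memory_refs": sum(refs for _, refs in sequence),
    }


summary = {name: summarize(seq) for name, seq in SEQUENCES.items()}
```

For this expression the three-operand memory-memory form needs the fewest instructions but the most memory references, matching its listed pros and cons.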
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 69101
lo1 o Co t ol co t- -
8ranches
Conditional
$ ump is ta0en only if the condition is met
To types
$ Set$Then$ump
U Condition testing is separated from branching U Condition code registers are used to convey the condition test
result
U Condition code registers 0eep a record of the status of the last A34 operation such as overflo condition
$ ltample6 Pentium codecm AB comare A ad B
e taret um e0ual
0lo1 of Control cont-0lo1 of Control
cont-
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 70101
- -
$ Test$and$ump
U Single instruction performs condition testing and branching
$ ltample6 MIPS instruction
be0 $src$srcamptaret
umps to target if 1src E 1src
elayed branching
Control is transferred after eecuting the instruction thatfollos the branch instruction
$ This instruction slot is called delay slot Improves efficiency
ighly pipelined 1ISC processors support
0lo1 of Control cont-0lo1 of Control
cont-
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 71101
- -
Procedure calls Lacilitate modular programming
1eJuire to pieces of information to return
$ ltnd of procedure U Pentium
uses ret instruction
U MIPS
uses 9r instruction
$ 1eturn address U In a (special) register
MIPS allos any general$purpose register
U 5n the stac0
Pentium
0lo1 of Control cont-0lo1 of Control
cont-
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 72101
- -
0lo1 of Control cont-0lo1 of Control
cont-
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 73101
- -
elay slot
Parameter PassingParameter Passin
g
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 74101
gg
To basic techniJues 1egister$based (eg PoerPC MIPS)
$ Internal registers are used U Laster
U 3imit the number of parameters U 1ecursive procedure
Stac0$based (eg Pentium)
$ Stac0 is used U More general
2 perand Types2
perand Types
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 75101
p yp
Instructions support basic data types
Characters Integers
Lloating$point
Instruction overload
Same instruction for different data types
ltample6 Pentium mo1 A2address loads a 3-bt 1alue
mo1 Aaddress loads a -bt 1alue
mo1 EAaddress loads a amp-bt 1alue
perand Types
perand Types
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 76101
Separate instructions
Instructions specify the operand si-e
ltample6 MIPS
lb $destaddress loads a b4te
l $destaddress loads a al5ord( bts)
l5 $destaddress loads a 5ord
(amp bts)
ld $destaddress loads a double5ord
( bts)imilar instruction store
3 Addressing Modes3 Addressin
g Modes
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 77101
o the operands are specified
5perands can be in three places
$ 1egisters U 1egister addressing mode
$ Part of instruction U Constant
U Immediate addressing mode
U All processors support these to addressing modes
$ Memory U ifference beteen 1ISC and CISC
U CISC supports a large variety of addressing modes
U 1ISC follos load2store architecture
4 Instruction Types4 Instruction T
ypes
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 78101
Several types of instructions
ata movement$ Pentium6 mo1 destsrc
$ Some do not provide direct data movement instructions
$ Indirect data movement
add $dest$src6 $dest = $src+6
Arithmetic and 3ogical
$ Arithmetic U Integer and floating$point signed and unsigned U add subtract multiply divide
$ 3ogical U andB orB notB 7or
Instruction Types (cont'd)
Condition code bits
- S: Sign bit (0 = +, 1 = -)
- Z: Zero bit (0 = nonzero, 1 = zero)
- O: Overflow bit (0 = no overflow, 1 = overflow)
- C: Carry bit (0 = no carry, 1 = carry)
Example: Pentium
    cmp count,25   ; compare count to 25
                   ; (subtract 25 from count)
    je target      ; jump if equal
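The sketch below shows one way the four condition code bits could be derived from a compare, which subtracts its second operand from the first and keeps only the flags. The function names and the 32-bit default width are assumptions for illustration; the flag rules follow the usual two's-complement definitions.

```python
def compare_flags(a, b, bits=32):
    """Flags left by 'cmp a, b', which computes a - b and discards the result."""
    mask = (1 << bits) - 1
    sign = 1 << (bits - 1)
    a &= mask
    b &= mask
    result = (a - b) & mask
    return {
        "S": int(bool(result & sign)),  # 1 = result is negative
        "Z": int(result == 0),          # 1 = result is zero
        "C": int(a < b),                # 1 = unsigned borrow
        # Signed overflow: operands differ in sign and the result's
        # sign differs from the sign of the first operand.
        "O": int(bool((a ^ b) & (a ^ result) & sign)),
    }


def jump_if_equal(flags):
    """A je-style branch is taken when the zero bit is set."""
    return flags["Z"] == 1
```

Comparing a value with itself sets only Z, so the conditional jump that follows is taken.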
Instruction Types (cont'd)
Flow control and I/O instructions
- Branch
- Procedure call
- Interrupts
I/O instructions
- Memory-mapped I/O
  - Most processors support memory-mapped I/O
  - No separate instructions for I/O
- Isolated I/O
  - Pentium supports isolated I/O
  - Separate I/O instructions
      in AX,port    ; read from an I/O port
      out port,AX   ; write to an I/O port
5. Instruction Formats
Two types
Fixed-length
- Used by RISC processors
- 32-bit RISC processors use 32-bit-wide instructions
  - Examples: SPARC, MIPS, PowerPC
Variable-length
- Used by CISC processors
- Memory operands need more bits to specify
Opcode
- Major and exact operation
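To show what a fixed-length format looks like in practice, the sketch below packs one MIPS32 R-type instruction into a single 32-bit word. The field widths (6, 5, 5, 5, 5, 6 bits) are the standard R-type layout; the helper name `encode_r_type` is made up for this example.

```python
def encode_r_type(rs, rt, rd, shamt, funct, opcode=0):
    """Pack a MIPS32 R-type instruction into one 32-bit word."""
    fields = (
        ("opcode", opcode, 6), ("rs", rs, 5), ("rt", rt, 5),
        ("rd", rd, 5), ("shamt", shamt, 5), ("funct", funct, 6),
    )
    for name, value, width in fields:
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name} does not fit in {width} bits")
    return (opcode << 26) | (rs << 21) | (rt << 16) | (rd << 11) | (shamt << 6) | funct


# add $t0, $t1, $t2 uses registers 8, 9, 10 and funct code 0x20.
ADD_T0_T1_T2 = encode_r_type(rs=9, rt=10, rd=8, shamt=0, funct=0x20)
```

Every R-type instruction occupies the same 32 bits, so the decoder always finds each field at a fixed position.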
Examples of Instruction Formats
[Figure: example instruction formats]
RISC (Reduced Instruction Set Computer) versus CISC (Complex Instruction Set Computer)
RISC vs. CISC
- The underlying philosophy of RISC machines is that a system is better able to manage program execution when the program consists of only a few different instructions that are the same length and require the same number of clock cycles to decode and execute.
- RISC systems access memory only with explicit load and store instructions.
- In CISC systems, many different kinds of instructions access memory, making instruction length variable and fetch-decode-execute time unpredictable.
RISC vs. CISC
- The difference between CISC and RISC becomes evident through the basic computer performance equation:
    time/program = (time/cycle) x (cycles/instruction) x (instructions/program)
- RISC systems shorten execution time by reducing the clock cycles per instruction.
- CISC systems improve performance by reducing the number of instructions per program.
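The performance equation can be tried out with a few lines of Python. The instruction counts, cycles per instruction, and 1 ns cycle time below are hypothetical figures chosen only to show the two levers; they are not measurements of any real machine.

```python
def execution_time(instructions, cycles_per_instruction, seconds_per_cycle):
    """time/program = instructions/program x cycles/instruction x time/cycle."""
    return instructions * cycles_per_instruction * seconds_per_cycle


# Hypothetical RISC-style program: more instructions, fewer cycles each.
risc_time = execution_time(1200, 1.2, 1e-9)

# Hypothetical CISC-style program: fewer instructions, more cycles each.
cisc_time = execution_time(800, 3.0, 1e-9)
```

With these made-up figures the lower cycles-per-instruction outweighs the larger instruction count; different figures could reverse the outcome.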
RISC vs. CISC
- The simple instruction set of RISC machines enables control units to be hardwired for maximum speed.
- The more complex, and variable, instruction set of CISC machines requires microcode-based control units that interpret instructions as they are fetched from memory. This translation takes time.
- With fixed-length instructions, RISC lends itself to pipelining and speculative execution.
RISC vs. CISC
Consider the program fragments:
    CISC                    RISC
    mov ax, 10              mov ax, 0
    mov bx, 5               mov bx, 10
    mul bx, ax              mov cx, 5
                     Begin: add ax, bx
                            loop Begin
- The total clock cycles for the CISC version might be:
    (2 movs x 1 cycle) + (1 mul x 30 cycles) = 32 cycles
- While the clock cycles for the RISC version are:
    (3 movs x 1 cycle) + (5 adds x 1 cycle) + (5 loops x 1 cycle) = 13 cycles
- With the RISC clock cycle also being shorter, RISC gives us much faster execution speeds.
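The cycle counts for the two fragments can be checked with a short tally. This sketch assumes a 30-cycle multiply, one cycle for every other instruction, and a loop that runs five times; the names `CISC_TRACE`, `risc_trace`, and `total_cycles` are illustrative only.

```python
# Each entry is (mnemonic, cycles) for one executed instruction.
CISC_TRACE = [("mov", 1), ("mov", 1), ("mul", 30)]


def risc_trace(iterations=5):
    """Executed instructions for multiplication by repeated addition."""
    trace = [("mov", 1)] * 3  # initialise ax, bx and the loop counter cx
    for _ in range(iterations):
        trace.append(("add", 1))   # ax = ax + bx
        trace.append(("loop", 1))  # decrement cx, branch while nonzero
    return trace


def total_cycles(trace):
    """Sum the cycle cost of every executed instruction."""
    return sum(cycles for _, cycles in trace)
```

The RISC version executes 13 instructions against 3 for CISC, yet needs fewer cycles in total because no single instruction is expensive.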
RISC vs. CISC
- Because of their load-store ISAs, RISC architectures require a large number of CPU registers.
- These registers provide fast access to data during sequential program execution.
- They can also be employed to reduce the overhead typically caused by passing parameters to subprograms.
- Instead of pulling parameters off of a stack, the subprogram is directed to use a subset of registers.
RISC vs. CISC
- This is how registers can be overlapped in a RISC system.
- The current window pointer (CWP) points to the active register window.
[Figure: overlapping register windows]
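A minimal model of overlapping register windows is sketched below. The sizes (4 windows, 8 shared and 8 local registers per window), the class name, and the choice to increment the CWP on a call are all assumptions for illustration; real designs differ in these details and must also handle running out of windows.

```python
class RegisterWindows:
    """A caller's 'out' registers are the same cells as the callee's 'in' registers."""

    def __init__(self, windows=4, shared=8, local=8):
        self.windows = windows
        self.shared = shared           # size of each in/out group
        self.local = local             # size of each local group
        self.stride = shared + local   # new registers added per window
        self.regs = [0] * (windows * self.stride)
        self.cwp = 0                   # current window pointer

    def _index(self, kind, n):
        limit = self.local if kind == "local" else self.shared
        if not 0 <= n < limit:
            raise IndexError("register number out of range")
        offset = {"in": 0, "local": self.shared, "out": self.stride}[kind]
        # The file is circular: the last window's outs wrap to the first ins.
        return (self.cwp * self.stride + offset + n) % len(self.regs)

    def read(self, kind, n):
        return self.regs[self._index(kind, n)]

    def write(self, kind, n, value):
        self.regs[self._index(kind, n)] = value

    def call(self):
        self.cwp = (self.cwp + 1) % self.windows

    def ret(self):
        self.cwp = (self.cwp - 1) % self.windows
```

A value written to an "out" register before `call()` is visible as the matching "in" register afterwards, so parameters are passed without touching memory.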
RISC vs. CISC
- It is becoming increasingly difficult to distinguish RISC architectures from CISC architectures.
- Some RISC systems provide more extravagant instruction sets than some CISC systems.
- Some systems combine both approaches.
- The following two slides summarize the characteristics that traditionally typify the differences between these two architectures.
RISC vs. CISC
RISC                                       | CISC
Multiple register sets                     | Single register set
Three operands per instruction             | One or two register operands per instruction
Parameter passing through register windows | Parameter passing through memory
Single-cycle instructions                  | Multiple-cycle instructions
Hardwired control                          | Microprogrammed control
Highly pipelined                           | Less pipelined
Continued...
RISC vs. CISC
RISC                                       | CISC
Simple instructions, few in number         | Many complex instructions
Fixed-length instructions                  | Variable-length instructions
Complexity in compiler                     | Complexity in microcode
Only LOAD/STORE instructions access memory | Many instructions can access memory
Few addressing modes                       | Many addressing modes
Summary
Instruction Set Design Issues
Instruction set design issues include:
- Where are operands stored?
  - registers, memory, stack, accumulator
- How many explicit operands are there?
  - 0, 1, 2, or 3
- How is the operand location specified?
  - register, immediate, indirect, ...
- What type and size of operands are supported?
  - byte, int, float, double, string, vector, ...
- What operations are supported?
  - add, sub, mul, move, compare, ...
More About General Purpose Registers
Why do almost all new architectures use GPRs?
- Registers are much faster than memory (even cache)
  - Register values are available immediately
  - When memory isn't ready, the processor must wait ("stall")
- Registers are convenient for variable storage
  - Compiler assigns some variables just to registers
  - More compact code, since small fields specify registers (compared to memory addresses)
[Figure: Processor (Registers), Cache, Memory, Disk]
What Operations are Needed
- Arithmetic and Logical
  - Integer arithmetic: ADD, SUB, MULT, DIV, SHIFT
  - Logical operation: AND, OR, XOR, NOT
- Data Transfer - copy, load, store
- Control - branch, jump, call, return
- Floating Point: ADDF, MULF, DIVF (same as arithmetic, but usually take bigger operands)
- Decimal - ADDD, CONVERT
- String - move, compare, search
- Graphics - pixel and vertex, compression/decompression operations
Stack Architecture: Pros and Cons
Pros
- Good code density (implicit top of stack)
- Low hardware requirements
- Easy to write a simpler compiler for stack architectures
Cons
- Stack becomes the bottleneck
- Little ability for parallelism or pipelining
- Data is not always at the top of the stack when needed, so additional instructions like TOP and SWAP are needed
- Difficult to write an optimizing compiler for stack architectures
Accumulator Architecture: Pros and Cons
Pros
- Very low hardware requirements
- Easy to design and understand
Cons
- Accumulator becomes the bottleneck
- Little ability for parallelism or pipelining
- High memory traffic
Memory-Memory Architecture: Pros and Cons
Pros
- Requires fewer instructions (especially if 3 operands)
- Easy to write compilers for (especially if 3 operands)
Cons
- Very high memory traffic (especially if 3 operands)
- Variable number of clocks per instruction
- With two operands, more data movements are required
Memory-Register Architecture: Pros and Cons
Pros
- Some data can be accessed without loading first
- Instruction format easy to encode
- Good code density
Cons
- Operands are not equivalent (poor orthogonality)
- Variable number of clocks per instruction
- May limit number of registers
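To make the stack and accumulator trade-offs listed above more concrete, the sketch below runs C = A + B on two toy machines. The interpreters, mnemonics, and variable names are hypothetical and far simpler than any real ISA; they only show where the implicit operand lives in each style.

```python
def run_stack(program, memory):
    """Toy stack machine: operands are implicit at the top of the stack."""
    stack = []
    for op, *args in program:
        if op == "push":
            stack.append(memory[args[0]])
        elif op == "add":
            right, left = stack.pop(), stack.pop()
            stack.append(left + right)
        elif op == "pop":
            memory[args[0]] = stack.pop()
        else:
            raise ValueError(f"unknown stack op: {op}")
    return memory


def run_accumulator(program, memory):
    """Toy accumulator machine: one operand is always the accumulator."""
    acc = 0
    for op, name in program:
        if op == "load":
            acc = memory[name]
        elif op == "add":
            acc += memory[name]
        elif op == "store":
            memory[name] = acc
        else:
            raise ValueError(f"unknown accumulator op: {op}")
    return memory


STACK_CODE = [("push", "A"), ("push", "B"), ("add",), ("pop", "C")]
ACC_CODE = [("load", "A"), ("add", "B"), ("store", "C")]
```

In both programs every value passes through a single structure (the stack top or the accumulator), which is the bottleneck the slides mention.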
Flow of Control (cont'd)
- Test-and-Jump
  - Single instruction performs condition testing and branching
- Example: MIPS instruction
    beq Rsrc1,Rsrc2,target   ; jumps to target if Rsrc1 = Rsrc2
Delayed branching
- Control is transferred after executing the instruction that follows the branch instruction
  - This instruction slot is called the delay slot
- Improves efficiency
- Highly pipelined RISC processors support delayed branching
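Delayed branching can be seen in a small simulation. The sketch below assumes a single delay slot and a made-up program of `("jump", target)` and `("nop", None)` entries; it records which instruction indices actually execute and does not model a jump placed inside a delay slot.

```python
def run_delayed(program, max_steps=100):
    """Run (op, arg) pairs with one branch delay slot; return executed indices."""
    executed = []
    pc = 0
    pending = None  # target of a jump issued by the previous instruction
    while pc < len(program) and len(executed) < max_steps:
        op, arg = program[pc]
        executed.append(pc)
        target = pending
        pending = arg if op == "jump" else None
        # The jump takes effect only after the following instruction has run.
        pc = target if target is not None else pc + 1
    return executed


PROGRAM = [
    ("nop", None),   # 0
    ("jump", 4),     # 1: branch to index 4
    ("nop", None),   # 2: delay slot, still executed
    ("nop", None),   # 3: skipped
    ("nop", None),   # 4: branch target
]
```

Index 2 runs even though the jump at index 1 has already been issued, while index 3 is skipped; a compiler tries to place useful work in that slot.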
Flow of Control (cont'd)
Procedure calls
- Facilitate modular programming
- Require two pieces of information to return
  - End of procedure
    - Pentium uses the ret instruction
    - MIPS uses the jr instruction
  - Return address
    - In a (special) register: MIPS allows any general-purpose register
    - On the stack: Pentium
Flow of Control (cont'd)
[Figure: delay slot]
Parameter Passing
Two basic techniques
Register-based (e.g., PowerPC, MIPS)
- Internal registers are used
  - Faster
  - Limit the number of parameters
  - Recursive procedure
Stack-based (e.g., Pentium)
- Stack is used
  - More general
RISC s CISCRISC s CISC
RISC s CISCRISC s CISC
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 86101
(
The simple instruction set of 1ISC machinesenables control units to be hardired for maimumspeed
The more comple$$ and variable$$ instruction set of
CISC machines reJuires microcode$based controlunits that interpret instructions as they are fetchedfrom memory This translation ta0es time
Dith fied$length instructions 1ISC lends itself topipelining and speculative eecution
RISC s CISCRISC s CISC
RISC s CISCRISC s CISC
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 87101
mo1 a8 6 mo1 b8 6 mo1 c8
Be add a8 b8 loo Be
Consider the the program fragments6
The total cloc0 cycles for the CISC version might be6(amp mo1s c4cle) + ( mul 6 c4cles) = amp c4cles
Dhile the cloc0 cycles for the 1ISC version is6
( mo1s c4cle) + ( adds c4cle) + ( loos c4cle) = c4cles
Dith 1ISC cloc0 cycle being shorter 1ISC gives usmuch faster eecution speeds
mo1 a8 6 mo1 b8 mul b8 a8
CISC RISC
RISC s CISCRISC s CISC
RISC s CISCRISC s CISC
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 88101
8ecause of their load$store ISAs 1ISC architecturesreJuire a large number of CP4 registers
These register provide fast access to data duringseJuential program eecution
They can also be employed to reduce the overheadtypically caused by passing parameters tosubprograms
Instead of pulling parameters off of a stac0 the
subprogram is directed to use a subset of registers
RISC s CISCRISC s CISC
RISC s CISCRISC s CISC
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 89101
3
This is horegisters canbe overlappedin a 1ISCsystem
The currentindo pointer (CDP) pointsto the activeregister
indo
RISC s CISCRISC s CISC
RISC s CISCRISC s CISC
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 90101
34
It is becoming increasingly difficult to distinguish1ISC architectures from CISC architectures
Some 1ISC systems provide more etravagantinstruction sets than some CISC systems
Some systems combine both approaches The folloing to slides summari-e the
characteristics that traditionally typify the differencesbeteen these to architectures
RISC s CISCRISC s CISC
RISC s CISCRISC s CISC
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 91101
31
RISC Multiple reister sets4
Three operan)s perinstruction4
Parameter passinthrouh reister5in)o5s4
Sinle-ccle
instructions4 7ar)5ire)
control4
7ihl pipeline)4
CISC Sinle reister set4
ne or t5o reisteroperan)s per
instruction4 Parameter passin
throuh memor4
Multiple ccle
instructions4 Microproramme)
control4
(ess pipeline)4ontinued
RISC s CISCRISC s CISC
RISC s CISCRISC s CISC
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 92101
32
RISC
- Simple instructions, few in number.
- Fixed-length instructions.
- Complexity in compiler.
- Only LOAD/STORE instructions access memory.
- Few addressing modes.
CISC
- Many complex instructions.
- Variable-length instructions.
- Complexity in microcode.
- Many instructions can access memory.
- Many addressing modes.
Summary
Instruction Set Design Issues
Instruction set design issues include:
Where are operands stored?
- registers, memory, stack, accumulator
How many explicit operands are there?
- 0, 1, 2, or 3
How is the operand location specified?
- register, immediate, indirect, ...
What type & size of operands are supported?
- byte, int, float, double, string, vector, ...
What operations are supported?
- add, sub, mul, move, compare, ...
More About General Purpose Registers
Why do almost all new architectures use GPRs?
Registers are much faster than memory (even cache)
- Register values are available immediately
- When memory isn't ready, processor must wait ("stall")
Registers are convenient for variable storage
- Compiler assigns some variables just to registers
- More compact code, since small fields specify registers (compared to memory addresses)
[Figure: storage hierarchy showing Processor, Registers, Cache, Memory, Disk]
What Operations are Needed
Arithmetic & Logical
- Integer arithmetic: ADD, SUB, MULT, DIV, SHIFT
- Logical operation: AND, OR, XOR, NOT
Data Transfer - copy, load, store
Control - branch, jump, call, return
Floating Point - ADDF, MULF, DIVF (same as arithmetic but usually take bigger operands)
Decimal - ADDD, CONVERT
String - move, compare, search
Graphics - pixel and vertex, compression/decompression operations
Stack Architecture: Pros and Cons
Pros
- Good code density (implicit top of stack)
- Low hardware requirements
- Easy to write a simpler compiler for stack architectures
Cons
- Stack becomes the bottleneck
- Little ability for parallelism or pipelining
- Data is not always at the top of stack when needed, so additional instructions like TOP and SWAP are needed
- Difficult to write an optimizing compiler for stack architectures
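A minimal sketch of why stack code is dense: arithmetic instructions name no operands because they always work on the top of the stack. The opcode names (`push`, `pop`, `add`, `mul`, `swap`) and the helper `run_stack_program` are assumptions made for illustration, not the instruction set of any particular machine.

```python
def run_stack_program(program, memory):
    """Execute a list of (opcode, operand) pairs on a toy stack machine."""
    stack = []
    for op, arg in program:
        if op == "push":            # push memory[arg] onto the stack
            stack.append(memory[arg])
        elif op == "pop":           # pop the top of stack into memory[arg]
            memory[arg] = stack.pop()
        elif op == "add":           # operands are implicit: the top two entries
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == "mul":
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
        elif op == "swap":          # reorder data that is not where it is needed
            stack[-1], stack[-2] = stack[-2], stack[-1]
        else:
            raise ValueError(f"unknown opcode: {op}")
    return memory


# d = (a + b) * c
memory = {"a": 2, "b": 3, "c": 4, "d": 0}
program = [
    ("push", "a"),
    ("push", "b"),
    ("add", None),
    ("push", "c"),
    ("mul", None),
    ("pop", "d"),
]
run_stack_program(program, memory)
print(memory["d"])
```

Every step reads or writes the top of the stack, so each instruction depends on the one before it. That serial dependence is the bottleneck listed under Cons.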
Accumulator Architecture: Pros and Cons
Pros
- Very low hardware requirements
- Easy to design and understand
Cons
- Accumulator becomes the bottleneck
- Little ability for parallelism or pipelining
- High memory traffic
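The memory-traffic point can be illustrated with a toy accumulator machine in which every instruction names exactly one memory operand. The opcodes and the function name `run_accumulator_program` are hypothetical, chosen only for this sketch.

```python
def run_accumulator_program(program, memory):
    """Run (opcode, address) pairs; return the accumulator and the memory access count."""
    acc = 0
    memory_accesses = 0
    for op, addr in program:
        if op == "load":
            acc = memory[addr]
        elif op == "add":
            acc += memory[addr]
        elif op == "mul":
            acc *= memory[addr]
        elif op == "store":
            memory[addr] = acc
        else:
            raise ValueError(f"unknown opcode: {op}")
        memory_accesses += 1    # each instruction here touches memory exactly once
    return acc, memory_accesses


# d = (a + b) * c
memory = {"a": 2, "b": 3, "c": 4, "d": 0}
program = [("load", "a"), ("add", "b"), ("mul", "c"), ("store", "d")]
acc, accesses = run_accumulator_program(program, memory)
print(memory["d"], accesses)
```

With only one register for intermediate results, no operand can stay on chip between uses, so the access count grows with every instruction executed.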
Memory-Memory Architecture: Pros and Cons
Pros
- Requires fewer instructions (especially if 3 operands)
- Easy to write compilers for (especially if 3 operands)
Cons
- Very high memory traffic (especially if 3 operands)
- Variable number of clocks per instruction
- With two operands, more data movements are required
Memory-Register Architecture: Pros and Cons
Pros
- Some data can be accessed without loading first
- Instruction format easy to encode
- Good code density
Cons
- Operands are not equivalent (poor orthogonality)
- Variable number of clocks per instruction
- May limit number of registers
Flow of Control (cont'd)
Delay slot
Parameter Passing
Two basic techniques
Register-based (e.g., PowerPC, MIPS)
- Internal registers are used
  - Faster
  - Limit the number of parameters
  - Recursive procedure
Stack-based (e.g., Pentium)
- Stack is used
  - More general
2 Operand Types
Instructions support basic data types
- Characters
- Integers
- Floating-point
Instruction overload
- Same instruction for different data types
- Example: Pentium
    mov AL,address     loads an 8-bit value
    mov AX,address     loads a 16-bit value
    mov EAX,address    loads a 32-bit value
Separate instructions
- Instructions specify the operand size
- Example: MIPS
    lb $dest,address    loads a byte
    lh $dest,address    loads a halfword (16 bits)
    lw $dest,address    loads a word (32 bits)
    ld $dest,address    loads a doubleword (64 bits)
Similar instructions for store
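As a rough illustration of what the operand size means, the sketch below reads 1, 2, 4, and 8 bytes from the same address of a byte array. The helper `load`, the big-endian byte order, and the sample memory contents are assumptions for this example; sign extension is shown because MIPS `lb`, `lh`, and `lw` sign-extend, while variants such as `lbu` do not.

```python
def load(memory, address, size, signed=True, byteorder="big"):
    """Read `size` bytes starting at `address` and interpret them as an integer."""
    return int.from_bytes(memory[address:address + size], byteorder, signed=signed)


memory = bytes([0xFF, 0xFE, 0x00, 0x01, 0x00, 0x00, 0x00, 0x02])

lb = load(memory, 0, 1)                 # byte, sign-extended
lbu = load(memory, 0, 1, signed=False)  # byte, zero-extended
lh = load(memory, 0, 2)                 # halfword (16 bits)
lw = load(memory, 0, 4)                 # word (32 bits)
ld = load(memory, 0, 8)                 # doubleword (64 bits)
print(lb, lbu, lh, lw, ld)
```

The same address yields a different value for each width, which is why the size has to be encoded somewhere: in the register name on the Pentium, in the opcode on MIPS.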
3 Addressing Modes
How the operands are specified
Operands can be in three places
- Registers
  - Register addressing mode
- Part of instruction
  - Constant
  - Immediate addressing mode
  - All processors support these two addressing modes
- Memory
  - Difference between RISC and CISC
  - CISC supports a large variety of addressing modes
  - RISC follows load/store architecture
4 Instruction Types
Several types of instructions
Data movement
- Pentium: mov dest,src
- Some do not provide direct data movement instructions
- Indirect data movement
    add $dest,$src,0    $dest = $src+0
Arithmetic and Logical
- Arithmetic
  - Integer and floating-point, signed and unsigned
  - add, subtract, multiply, divide
- Logical
  - and, or, not, xor
Instruction Types (cont'd)
Condition code bits
- S: Sign bit (0 = +, 1 = -)
- Z: Zero bit (0 = nonzero, 1 = zero)
- O: Overflow bit (0 = no overflow, 1 = overflow)
- C: Carry bit (0 = no carry, 1 = carry)
Example: Pentium
    cmp count,25    compare count to 25
                    (subtract 25 from count)
    je  target      jump if equal
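A compare followed by a conditional jump can be sketched as follows. The function names `compare` and `jump_if_equal` and the 32-bit default width are assumptions for illustration. The carry bit is modelled as a borrow on subtraction, which is the x86 convention; other architectures define it differently.

```python
def compare(a, b, bits=32):
    """Compute a - b, discard the result, and return the S, Z, O, C condition bits."""
    mask = (1 << bits) - 1
    sign = 1 << (bits - 1)
    a &= mask
    b &= mask
    result = (a - b) & mask
    return {
        "S": int(bool(result & sign)),   # sign of the result
        "Z": int(result == 0),           # result is zero
        # signed overflow: the operands have different signs and the
        # result's sign differs from the sign of the first operand
        "O": int(bool((a ^ b) & (a ^ result) & sign)),
        "C": int(a < b),                 # unsigned borrow
    }


def jump_if_equal(flags):
    """A je-style test: the branch is taken when the zero bit is set."""
    return flags["Z"] == 1


count = 25
flags = compare(count, 25)
taken = jump_if_equal(flags)
print(flags, taken)
```

The branch instruction never sees the operands themselves, only the bits left behind by the compare.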
Flow control and I/O instructions
- Branch
- Procedure call
- Interrupts
I/O instructions
- Memory-mapped I/O
  - Most processors support memory-mapped I/O
  - No separate instructions for I/O
- Isolated I/O
  - Pentium supports isolated I/O
  - Separate I/O instructions
      in  AX,io_port    read from an I/O port
      out io_port,AX    write to an I/O port
5 Instruction Formats
Two types
Fixed-length
- Used by RISC processors
- 32-bit RISC processors use 32-bit wide instructions
  - Examples: SPARC, MIPS, PowerPC
Variable-length
- Used by CISC processors
- Memory operands need more bits to specify
Opcode
- Major and exact operation
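To make the fixed-length idea concrete, the sketch below packs and unpacks one 32-bit word. It assumes the classic MIPS R-type layout (6-bit opcode, three 5-bit register fields, 5-bit shift amount, 6-bit function code), which the slide text itself does not spell out. The function names are hypothetical.

```python
def encode_r_type(opcode, rs, rt, rd, shamt, funct):
    """Pack the six R-type fields into a single 32-bit instruction word."""
    return (opcode << 26) | (rs << 21) | (rt << 16) | (rd << 11) | (shamt << 6) | funct


def decode_r_type(word):
    """Split a 32-bit word back into its fields using fixed bit positions."""
    return {
        "opcode": (word >> 26) & 0x3F,
        "rs": (word >> 21) & 0x1F,
        "rt": (word >> 16) & 0x1F,
        "rd": (word >> 11) & 0x1F,
        "shamt": (word >> 6) & 0x1F,
        "funct": word & 0x3F,
    }


# add $t0, $t1, $t2  ($t0 = register 8, $t1 = 9, $t2 = 10; funct 0x20 selects add)
word = encode_r_type(opcode=0, rs=9, rt=10, rd=8, shamt=0, funct=0x20)
fields = decode_r_type(word)
print(hex(word), fields)
```

Because every field sits at a fixed bit position, decoding is a handful of shifts and masks. A variable-length format must first work out how long the instruction is before it can locate its operands.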
Examples of Instruction Formats
RISC (Reduced Instruction Set Computer) versus CISC (Complex Instruction Set Computer)
The underlying philosophy of RISC machines is that a system is better able to manage program execution when the program consists of only a few different instructions that are the same length and require the same number of clock cycles to decode and execute.
RISC systems access memory only with explicit load and store instructions.
In CISC systems, many different kinds of instructions access memory, making instruction length variable and fetch-decode-execute time unpredictable.
The difference between CISC and RISC becomes evident through the basic computer performance equation:
    time/program = time/cycle × cycles/instruction × instructions/program
RISC systems shorten execution time by reducing the clock cycles per instruction.
CISC systems improve performance by reducing the number of instructions per program.
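The trade-off can be tried out numerically with the standard form of the performance equation, execution time = instructions per program × cycles per instruction × time per cycle. The figures below are made up purely for illustration and describe no real processor. The function name `execution_time_ns` is likewise an assumption.

```python
def execution_time_ns(instructions, cycles_per_instruction, cycle_time_ns):
    """time/program = instructions/program * cycles/instruction * time/cycle."""
    return instructions * cycles_per_instruction * cycle_time_ns


# Hypothetical CISC-style machine: fewer instructions, more cycles for each one.
cisc_time = execution_time_ns(instructions=1000, cycles_per_instruction=4, cycle_time_ns=10)

# Hypothetical RISC-style machine: more instructions, one cycle each, shorter cycle.
risc_time = execution_time_ns(instructions=3000, cycles_per_instruction=1, cycle_time_ns=8)

print(cisc_time, risc_time)
```

With these invented numbers the RISC-style machine finishes first even though it executes three times as many instructions. Different numbers can reverse the outcome, which is why all three factors have to be considered together.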
The simple instruction set of RISC machines enables control units to be hardwired for maximum speed.
The more complex (and variable) instruction set of CISC machines requires microcode-based control units that interpret instructions as they are fetched from memory. This translation takes time.
With fixed-length instructions, RISC lends itself to pipelining and speculative execution.
Consider the program fragments:

CISC
           mov ax, 10
           mov bx, 5
           mul bx, ax

RISC
           mov ax, 0
           mov bx, 10
           mov cx, 5
    Begin: add ax, bx
           loop Begin

The total clock cycles for the CISC version might be:
    (2 movs × 1 cycle) + (1 mul × 30 cycles) = 32 cycles
While the clock cycles for the RISC version are:
    (3 movs × 1 cycle) + (5 adds × 1 cycle) + (5 loops × 1 cycle) = 13 cycles
With the RISC clock cycle also being shorter, RISC gives us much faster execution speeds.
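The cycle totals can be checked with a small table-driven count. This sketch assumes 1 cycle each for mov, add, and loop, 30 cycles for mul, and a loop body that executes five times. These figures are illustrative rather than measurements of a real processor, and the names `CYCLES` and `total_cycles` are hypothetical.

```python
# Assumed cost of each instruction in clock cycles (illustrative only).
CYCLES = {"mov": 1, "add": 1, "loop": 1, "mul": 30}


def total_cycles(trace):
    """Sum the cycle cost of every instruction actually executed."""
    return sum(CYCLES[op] for op in trace)


# CISC fragment: two moves followed by a single multi-cycle multiply.
cisc_trace = ["mov", "mov", "mul"]

# RISC fragment: three moves, then an add/loop pair executed five times.
risc_trace = ["mov", "mov", "mov"] + ["add", "loop"] * 5

print(len(cisc_trace), total_cycles(cisc_trace))
print(len(risc_trace), total_cycles(risc_trace))
```

The RISC trace executes more instructions (13 against 3) yet needs fewer cycles under these assumptions, because no single instruction is expensive.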
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 74101
gg
To basic techniJues 1egister$based (eg PoerPC MIPS)
$ Internal registers are used U Laster
U 3imit the number of parameters U 1ecursive procedure
Stac0$based (eg Pentium)
$ Stac0 is used U More general
2 perand Types2
perand Types
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 75101
p yp
Instructions support basic data types
Characters Integers
Lloating$point
Instruction overload
Same instruction for different data types
ltample6 Pentium mo1 A2address loads a 3-bt 1alue
mo1 Aaddress loads a -bt 1alue
mo1 EAaddress loads a amp-bt 1alue
perand Types
perand Types
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 76101
Separate instructions
Instructions specify the operand si-e
ltample6 MIPS
lb $destaddress loads a b4te
l $destaddress loads a al5ord( bts)
l5 $destaddress loads a 5ord
(amp bts)
ld $destaddress loads a double5ord
( bts)imilar instruction store
3 Addressing Modes3 Addressin
g Modes
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 77101
o the operands are specified
5perands can be in three places
$ 1egisters U 1egister addressing mode
$ Part of instruction U Constant
U Immediate addressing mode
U All processors support these to addressing modes
$ Memory U ifference beteen 1ISC and CISC
U CISC supports a large variety of addressing modes
U 1ISC follos load2store architecture
4 Instruction Types4 Instruction T
ypes
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 78101
Several types of instructions
ata movement$ Pentium6 mo1 destsrc
$ Some do not provide direct data movement instructions
$ Indirect data movement
add $dest$src6 $dest = $src+6
Arithmetic and 3ogical
$ Arithmetic U Integer and floating$point signed and unsigned U add subtract multiply divide
$ 3ogical U andB orB notB 7or
Instruction Types cont-Instruction T
ypes cont-
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 79101
Condition code bits
S6 Sign bit (gt E F E $)
6 Vero bit (gt E non-ero E -ero)
$6 5verflo bit (gt E no overflo E overflo)
C6 Carry bit (gt E no carry E carry)
ltample6 Pentium
cm coutamp comare cout to amp
subtract amp rom cout
e taret um e0ual
Instruction Types cont-Instruction T
ypes cont-
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 80101
Llo control and I25 instructions
$ 8ranch
$ Procedure call
$ Interrupts
I25 instructions$ Memory$mapped I25
U Most processors support memory$mapped I25
U 7o separate instructions for I25
$ Isolated I25 U Pentium supports isolated I25
U Separate I25 instructions
Ao7ort read from an IO ort
out o7ortA rte to an IO ort
5 Instruction Formats
Two types
- Fixed-length
  - Used by RISC processors
  - 32-bit RISC processors use 32-bit-wide instructions
    - Examples: SPARC, MIPS, PowerPC
- Variable-length
  - Used by CISC processors
  - Memory operands need more bits to specify
Opcode
- Major and exact operation
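A fixed-length format makes decoding a matter of shifting and masking at known bit positions. The sketch below splits a 32-bit word using the MIPS R-type field layout (6-bit opcode, three 5-bit register fields, 5-bit shift amount, 6-bit function code). The function name `decode_r_type` is made up for this example, and the sample word is the usual encoding of `add $t0, $t1, $t2`.

```python
def decode_r_type(word: int) -> dict:
    """Split a 32-bit MIPS R-type instruction word into its fields."""
    return {
        "opcode": (word >> 26) & 0x3F,  # major operation
        "rs": (word >> 21) & 0x1F,      # first source register
        "rt": (word >> 16) & 0x1F,      # second source register
        "rd": (word >> 11) & 0x1F,      # destination register
        "shamt": (word >> 6) & 0x1F,    # shift amount
        "funct": word & 0x3F,           # exact operation within the opcode
    }


fields = decode_r_type(0x012A4020)
print(fields)
```

The opcode gives the major operation and the function field the exact one, which matches the "major and exact operation" split noted above. A variable-length format cannot be decoded this way, because the position of later fields depends on earlier ones.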
Examples of Instruction Formats
RISC (Reduced Instruction Set Computer)
versus
CISC (Complex Instruction Set Computer)
RISC vs CISC
The underlying philosophy of RISC machines is that a system is better able to manage program execution when the program consists of only a few different instructions that are the same length and require the same number of clock cycles to decode and execute.
RISC systems access memory only with explicit load and store instructions.
In CISC systems, many different kinds of instructions access memory, making instruction length variable and fetch-decode-execute time unpredictable.
The difference between CISC and RISC becomes evident through the basic computer performance equation:
  time/program = (time/cycle) × (cycles/instruction) × (instructions/program)
RISC systems shorten execution time by reducing the clock cycles per instruction.
CISC systems improve performance by reducing the number of instructions per program.
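The trade-off between the two terms can be tried out numerically. The sketch below multiplies instruction count, cycles per instruction and cycle time; the workload figures are invented for illustration and do not describe any real processor.

```python
def cpu_time_ns(instructions: int, cycles_per_instruction: float, cycle_time_ns: float) -> float:
    """time/program = instructions/program x cycles/instruction x time/cycle."""
    return instructions * cycles_per_instruction * cycle_time_ns


# Hypothetical workloads: fewer, slower instructions versus more, faster ones.
few_complex = cpu_time_ns(instructions=100, cycles_per_instruction=4, cycle_time_ns=2)
many_simple = cpu_time_ns(instructions=250, cycles_per_instruction=1, cycle_time_ns=2)
print(few_complex, many_simple)
```

With these made-up numbers the longer program still finishes sooner, because its lower cycles-per-instruction figure outweighs the higher instruction count. Different numbers can reverse the outcome.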
The simple instruction set of RISC machines enables control units to be hardwired for maximum speed.
The more complex, and variable, instruction set of CISC machines requires microcode-based control units that interpret instructions as they are fetched from memory. This translation takes time.
With fixed-length instructions, RISC lends itself to pipelining and speculative execution.
Consider the following program fragments:
  CISC                 RISC
  mov ax, 10           mov ax, 0
  mov bx, 5            mov bx, 10
  mul bx, ax           mov cx, 5
                       Begin: add ax, bx
                              loop Begin
The total clock cycles for the CISC version might be:
  (2 movs × 1 cycle) + (1 mul × 30 cycles) = 32 cycles
While the clock cycles for the RISC version are:
  (3 movs × 1 cycle) + (5 adds × 1 cycle) + (5 loops × 1 cycle) = 13 cycles
With the RISC clock cycle also being shorter, RISC gives us much faster execution speeds.
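The cycle counting for the two fragments can be checked with a small simulation. This is only a sketch: the per-instruction costs in `CYCLES` (1 cycle for mov, add and loop, 30 cycles for mul) and the operand values 10 and 5 are illustrative assumptions, and the function names are invented for this example.

```python
# Assumed per-instruction costs in clock cycles (illustrative only).
CYCLES = {"mov": 1, "mul": 30, "add": 1, "loop": 1}


def run_cisc(a: int, b: int) -> tuple:
    """Two movs to set up the operands, then one multi-cycle mul."""
    cycles = 2 * CYCLES["mov"] + CYCLES["mul"]
    return a * b, cycles


def run_risc(a: int, b: int) -> tuple:
    """Multiply by repeated addition; assumes b >= 1."""
    cycles = 3 * CYCLES["mov"]  # clear the sum, load the addend, load the counter
    total, counter = 0, b
    while counter > 0:
        total += a              # add ax, bx
        counter -= 1            # loop Begin: decrement the counter and branch
        cycles += CYCLES["add"] + CYCLES["loop"]
    return total, cycles


print(run_cisc(10, 5))
print(run_risc(10, 5))
```

Both versions compute the same product. The RISC cycle count grows with the loop counter, so the comparison depends on the operand values as well as on the assumed cost of the multiply.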
Because of their load-store ISAs, RISC architectures require a large number of CPU registers.
These registers provide fast access to data during sequential program execution.
They can also be employed to reduce the overhead typically caused by passing parameters to subprograms.
Instead of pulling parameters off of a stack, the subprogram is directed to use a subset of registers.
[Figure: overlapped register windows]
This is how registers can be overlapped in a RISC system.
The current window pointer (CWP) points to the active register window.
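A minimal sketch of overlapped register windows follows. It assumes a SPARC-like arrangement of 8 in, 8 local and 8 out registers per window, with a caller's out registers sharing physical storage with the callee's in registers. The class name, the window count of 4, and the direction in which the CWP moves are simplifications chosen for the example, and window overflow is not handled.

```python
class RegisterWindows:
    """Toy model of overlapping register windows (hypothetical, simplified)."""

    def __init__(self, n_windows: int = 4):
        self.n_windows = n_windows
        self.regs = [0] * (n_windows * 16)  # 16 new physical registers per window
        self.cwp = 0                        # current window pointer

    def _index(self, kind: str, i: int) -> int:
        if not 0 <= i < 8:
            raise IndexError("register number must be 0..7")
        offset = {"in": 0, "local": 8, "out": 16}[kind]
        # The outs of window w land on the same slots as the ins of window w + 1.
        return (self.cwp * 16 + offset + i) % len(self.regs)

    def read(self, kind: str, i: int) -> int:
        return self.regs[self._index(kind, i)]

    def write(self, kind: str, i: int, value: int) -> None:
        self.regs[self._index(kind, i)] = value

    def call(self) -> None:
        self.cwp = (self.cwp + 1) % self.n_windows

    def ret(self) -> None:
        self.cwp = (self.cwp - 1) % self.n_windows


rw = RegisterWindows()
rw.write("out", 0, 42)       # caller puts an argument in its first out register
rw.call()                    # switching windows copies nothing
arg = rw.read("in", 0)       # callee finds the argument in its first in register
rw.write("in", 0, arg + 1)   # callee leaves a return value in the same place
rw.ret()
result = rw.read("out", 0)
print(arg, result)
```

Parameter passing here costs only a change of the CWP, which is the overhead reduction described above.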
It is becoming increasingly difficult to distinguish RISC architectures from CISC architectures.
Some RISC systems provide more extravagant instruction sets than some CISC systems.
Some systems combine both approaches.
The following two slides summarize the characteristics that traditionally typify the differences between these two architectures.
RISC
- Multiple register sets.
- Three operands per instruction.
- Parameter passing through register windows.
- Single-cycle instructions.
- Hardwired control.
- Highly pipelined.
CISC
- Single register set.
- One or two register operands per instruction.
- Parameter passing through memory.
- Multiple-cycle instructions.
- Microprogrammed control.
- Less pipelined.
Continued...
RISC
- Simple instructions, few in number.
- Fixed-length instructions.
- Complexity in compiler.
- Only LOAD/STORE instructions access memory.
- Few addressing modes.
CISC
- Many complex instructions.
- Variable-length instructions.
- Complexity in microcode.
- Many instructions can access memory.
- Many addressing modes.
Summary
Instruction Set Design Issues
Instruction set design issues include:
Where are operands stored?
  - registers, memory, stack, accumulator
How many explicit operands are there?
  - 0, 1, 2, or 3
How is the operand location specified?
  - register, immediate, indirect, ...
What type and size of operands are supported?
  - byte, int, float, double, string, vector, ...
What operations are supported?
  - add, sub, mul, move, compare, ...
More About General Purpose Registers
Why do almost all new architectures use GPRs?
Registers are much faster than memory (even cache)
  - Register values are available immediately
  - When memory isn't ready, processor must wait ("stall")
Registers are convenient for variable storage
  - Compiler assigns some variables just to registers
  - More compact code since small fields specify registers (compared to memory addresses)
[Figure: Processor (Registers), Cache, Memory, Disk]
What Operations are Needed
Arithmetic and Logical
  - Integer arithmetic: ADD, SUB, MULT, DIV, SHIFT
  - Logical operation: AND, OR, XOR, NOT
Data Transfer - copy, load, store
Control - branch, jump, call, return
Floating Point: ADDF, MULF, DIVF (same as arithmetic but usually take bigger operands)
Decimal - ADDD, CONVERT
String - move, compare, search
Graphics - pixel and vertex, compression/decompression operations
Stack Architecture: Pros and Cons
Pros
  - Good code density (implicit top of stack)
  - Low hardware requirements
  - Easy to write a simpler compiler for stack architectures
Cons
  - Stack becomes the bottleneck
  - Little ability for parallelism or pipelining
  - Data is not always at the top of stack when needed, so additional instructions like TOP and SWAP are needed
  - Difficult to write an optimizing compiler for stack architectures
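The points above can be seen in a toy stack machine. The sketch below is a minimal interpreter, not any real instruction set: the opcode names, the tuple encoding of instructions and the use of a dictionary as memory are all assumptions made for the example. It evaluates D = (A + B) * C.

```python
def run_stack_program(program, memory):
    """Execute a list of (opcode, *operands) tuples on a toy stack machine."""
    stack = []
    for op, *args in program:
        if op == "push":            # the only instructions that name an address
            stack.append(memory[args[0]])
        elif op == "pop":
            memory[args[0]] = stack.pop()
        elif op == "add":           # operands are implicit: the top two stack entries
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == "mul":
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
        elif op == "swap":          # needed when data is not in the right order
            stack[-1], stack[-2] = stack[-2], stack[-1]
        else:
            raise ValueError(f"unknown opcode: {op}")
    return memory


stack_memory = {"A": 2, "B": 3, "C": 4, "D": 0}
stack_program = [
    ("push", "A"), ("push", "B"), ("add",),
    ("push", "C"), ("mul",),
    ("pop", "D"),
]
run_stack_program(stack_program, stack_memory)
print(stack_memory["D"])
```

The arithmetic instructions carry no operand fields, which is where the code density comes from. Every instruction also reads or writes the top of the stack, so each one depends on the one before it.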
Accumulator Architecture: Pros and Cons
Pros
  - Very low hardware requirements
  - Easy to design and understand
Cons
  - Accumulator becomes the bottleneck
  - Little ability for parallelism or pipelining
  - High memory traffic
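For comparison, here is the same computation, D = (A + B) * C, on a toy accumulator machine. As before, this is a sketch under stated assumptions: the opcode names and the dictionary used as memory are invented for illustration. The access counter shows the memory traffic, since every instruction has exactly one memory operand.

```python
def run_accumulator_program(program, memory):
    """Execute (opcode, address) pairs; the accumulator is the implicit operand."""
    acc = 0
    memory_accesses = 0
    for op, addr in program:
        memory_accesses += 1        # each instruction touches memory once
        if op == "load":
            acc = memory[addr]
        elif op == "add":
            acc += memory[addr]
        elif op == "mul":
            acc *= memory[addr]
        elif op == "store":
            memory[addr] = acc
        else:
            raise ValueError(f"unknown opcode: {op}")
    return memory, memory_accesses


acc_memory = {"A": 2, "B": 3, "C": 4, "D": 0}
acc_program = [("load", "A"), ("add", "B"), ("mul", "C"), ("store", "D")]
_, accesses = run_accumulator_program(acc_program, acc_memory)
print(acc_memory["D"], accesses)
```

With a single accumulator, any intermediate value that must be kept while another is computed has to be stored to memory and reloaded later, which adds further traffic.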
Memory-Memory Architecture: Pros and Cons
Pros
  - Requires fewer instructions (especially if 3 operands)
  - Easy to write compilers for (especially if 3 operands)
Cons
  - Very high memory traffic (especially if 3 operands)
  - Variable number of clocks per instruction
  - With two operands, more data movements are required
Memory-Register Architecture: Pros and Cons
Pros
  - Some data can be accessed without loading first
  - Instruction format easy to encode
  - Good code density
Cons
  - Operands are not equivalent (poor orthogonality)
  - Variable number of clocks per instruction
  - May limit number of registers
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 75101
p yp
Instructions support basic data types
Characters Integers
Lloating$point
Instruction overload
Same instruction for different data types
ltample6 Pentium mo1 A2address loads a 3-bt 1alue
mo1 Aaddress loads a -bt 1alue
mo1 EAaddress loads a amp-bt 1alue
perand Types
perand Types
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 76101
Separate instructions
Instructions specify the operand si-e
ltample6 MIPS
lb $destaddress loads a b4te
l $destaddress loads a al5ord( bts)
l5 $destaddress loads a 5ord
(amp bts)
ld $destaddress loads a double5ord
( bts)imilar instruction store
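The effect of encoding the operand size in the instruction can be imitated with Python's `struct` module. In the sketch below the table `LOAD_FORMATS`, the function `load`, the sample memory contents and the big-endian default are all assumptions for illustration; MIPS itself can run in either byte order. The loads are treated as signed, in the way `lb` sign-extends a byte.

```python
import struct

# Eight bytes of hypothetical memory contents.
memory = bytes([0x11, 0x22, 0x33, 0x44, 0x55, 0x66, 0x77, 0x88])

# Mnemonic -> struct format code: signed 1-, 2-, 4- and 8-byte integers.
LOAD_FORMATS = {"lb": "b", "lh": "h", "lw": "i", "ld": "q"}


def load(op: str, address: int, big_endian: bool = True) -> int:
    """Read a value whose width is chosen by the load mnemonic."""
    fmt = (">" if big_endian else "<") + LOAD_FORMATS[op]
    return struct.unpack_from(fmt, memory, address)[0]


for op in ("lb", "lh", "lw", "ld"):
    print(op, hex(load(op, 0)))
```

The same address yields a different value for each mnemonic, because the opcode decides how many bytes are read. In the Pentium style shown earlier, the width comes from the destination register instead.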
How the operands are specified
Operands can be in three places:
- Registers
  - Register addressing mode
- Part of instruction
  - Constant
  - Immediate addressing mode
  - All processors support these two addressing modes
- Memory
  - Difference between RISC and CISC
  - CISC supports a large variety of addressing modes
  - RISC follows load/store architecture

4. Instruction Types
Several types of instructions
- Data movement
  - Pentium: mov dest,src
  - Some do not provide direct data movement instructions
  - Indirect data movement
      add $dest,$src,0    ; $dest = $src+0
- Arithmetic and Logical
  - Arithmetic
    - Integer and floating-point, signed and unsigned
    - add, subtract, multiply, divide
  - Logical
    - and, or, not, xor

Instruction Types (cont.)
Condition code bits
- S: Sign bit (0 = +, 1 = -)
- Z: Zero bit (0 = nonzero, 1 = zero)
- O: Overflow bit (0 = no overflow, 1 = overflow)
- C: Carry bit (0 = no carry, 1 = carry)
Example: Pentium
    cmp count,25    ; compare count to 25
                    ; (subtract 25 from count)
    je target       ; jump if equal

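How a compare sets the condition code bits can be sketched as follows. This is a minimal model that assumes x86-style conventions (compare is a subtraction whose result is discarded, and carry means an unsigned borrow); the function names and the 32-bit default width are assumptions for illustration.

```python
def compare_flags(a, b, bits=32):
    """Return the S, Z, O, C bits produced by computing a - b on bits-wide integers."""
    mask = (1 << bits) - 1
    sign_bit = 1 << (bits - 1)
    a &= mask
    b &= mask
    result = (a - b) & mask          # the difference itself is not kept by a compare
    return {
        "S": int(bool(result & sign_bit)),
        "Z": int(result == 0),
        # Signed overflow: the operands have different signs and the
        # sign of the result differs from the sign of the first operand.
        "O": int(bool((a ^ b) & (a ^ result) & sign_bit)),
        # x86-style carry on subtraction: set when an unsigned borrow occurs.
        "C": int(a < b),
    }


def jump_if_equal_taken(flags):
    """A 'jump if equal' branch is taken when the zero bit is set."""
    return flags["Z"] == 1


equal = compare_flags(25, 25)
smaller = compare_flags(3, 25)
```

The branch instruction never sees the operands; it only inspects the bits left behind by the compare, which is why the two instructions must appear as a pair.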
Instruction Types (cont.)
Flow control and I/O instructions
- Branch
- Procedure call
- Interrupts
I/O instructions
- Memory-mapped I/O
  - Most processors support memory-mapped I/O
  - No separate instructions for I/O
- Isolated I/O
  - Pentium supports isolated I/O
  - Separate I/O instructions
      in AX,io_port     ; read from an I/O port
      out io_port,AX    ; write to an I/O port

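The difference between the two I/O styles can be sketched with a toy machine model. The class, its method names, and the addresses used are hypothetical; the point is only that memory-mapped devices are reached through ordinary loads and stores, while isolated I/O uses a separate port address space and separate instructions.

```python
class ToyMachine:
    """Toy model with RAM, memory-mapped device registers, and an isolated port space."""

    def __init__(self):
        self.ram = {}
        self.mmio = {}    # memory address -> device register (memory-mapped I/O)
        self.ports = {}   # port number -> device register (isolated I/O)

    # Ordinary load/store instructions: also reach memory-mapped devices.
    def load(self, address):
        if address in self.mmio:
            return self.mmio[address]
        return self.ram.get(address, 0)

    def store(self, address, value):
        if address in self.mmio:
            self.mmio[address] = value
        else:
            self.ram[address] = value

    # Separate I/O instructions: only these can reach the port space.
    def port_in(self, port):
        return self.ports.get(port, 0)

    def port_out(self, port, value):
        self.ports[port] = value


machine = ToyMachine()
machine.mmio[0xFF00] = 0          # a device register mapped at a memory address
machine.store(0xFF00, 65)         # an ordinary store drives the device
machine.port_out(0x60, 1)         # a dedicated instruction drives the port
```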
5. Instruction Formats
Two types
- Fixed-length
  - Used by RISC processors
  - 32-bit RISC processors use 32-bit wide instructions
  - Examples: SPARC, MIPS, PowerPC
- Variable-length
  - Used by CISC processors
  - Memory operands need more bits to specify
Opcode
- Major and exact operation

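As an illustration of a fixed-length format, the sketch below packs and unpacks the fields of a 32-bit MIPS R-type instruction (opcode 6 bits, rs 5, rt 5, rd 5, shamt 5, funct 6). The helper names are assumptions; the example word encodes add $t0, $t1, $t2, using the conventional register numbers 8, 9 and 10 and function code 0x20.

```python
# Field names and widths of a MIPS R-type instruction, most significant first.
FIELDS = (("opcode", 6), ("rs", 5), ("rt", 5), ("rd", 5), ("shamt", 5), ("funct", 6))


def encode_r_type(rs, rt, rd, shamt=0, funct=0, opcode=0):
    """Pack the six R-type fields into one 32-bit instruction word."""
    values = {"opcode": opcode, "rs": rs, "rt": rt, "rd": rd, "shamt": shamt, "funct": funct}
    word = 0
    for name, width in FIELDS:
        value = values[name]
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name}={value} does not fit in {width} bits")
        word = (word << width) | value
    return word


def decode_r_type(word):
    """Split a 32-bit instruction word back into its R-type fields."""
    fields = {}
    shift = 32
    for name, width in FIELDS:
        shift -= width
        fields[name] = (word >> shift) & ((1 << width) - 1)
    return fields


# add $t0, $t1, $t2  ->  rd = 8, rs = 9, rt = 10, funct = 0x20
word = encode_r_type(rs=9, rt=10, rd=8, funct=0x20)
print(hex(word))
```

Because every field sits at a fixed bit position, the decoder can extract all of them in parallel without first working out how long the instruction is, which is the main attraction of fixed-length formats.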
Examples of Instruction Formats (figure)

RISC (Reduced Instruction Set Computer)
versus
CISC (Complex Instruction Set Computer)

RISC vs CISC
- The underlying philosophy of RISC machines is that a system is better able to manage program execution when the program consists of only a few different instructions that are the same length and require the same number of clock cycles to decode and execute.
- RISC systems access memory only with explicit load and store instructions.
- In CISC systems, many different kinds of instructions access memory, making instruction length variable and fetch-decode-execute time unpredictable.

The difference between CISC and RISC becomes evident through the basic computer performance equation:
    time/program = (time/cycle) × (cycles/instruction) × (instructions/program)
- RISC systems shorten execution time by reducing the clock cycles per instruction.
- CISC systems improve performance by reducing the number of instructions per program.

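The trade-off can be made concrete with a small calculation, assuming the usual form of the performance equation: time per program = instructions per program × cycles per instruction × time per cycle. The two sets of numbers below are invented for illustration and do not describe any real processor.

```python
def execution_time_ns(instructions, cycles_per_instruction, cycle_time_ns):
    """time/program = instructions/program * cycles/instruction * time/cycle."""
    return instructions * cycles_per_instruction * cycle_time_ns


# Hypothetical CISC-style profile: fewer instructions, more cycles for each.
cisc_time = execution_time_ns(instructions=1_000_000, cycles_per_instruction=4, cycle_time_ns=2)

# Hypothetical RISC-style profile: more instructions, fewer cycles for each.
risc_time = execution_time_ns(instructions=1_500_000, cycles_per_instruction=1.5, cycle_time_ns=1)

print(f"CISC-style: {cisc_time} ns, RISC-style: {risc_time} ns")
```

Each design philosophy attacks a different factor of the product, so which one wins depends on how much the other factors grow in exchange.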
- The simple instruction set of RISC machines enables control units to be hardwired for maximum speed.
- The more complex (and variable) instruction set of CISC machines requires microcode-based control units that interpret instructions as they are fetched from memory. This translation takes time.
- With fixed-length instructions, RISC lends itself to pipelining and speculative execution.

Consider the program fragments:
CISC:
    mov ax, 10
    mov bx, 5
    mul bx, ax
RISC:
    mov ax, 0
    mov bx, 10
    mov cx, 5
Begin:
    add ax, bx
    loop Begin
The total clock cycles for the CISC version might be:
    (2 movs × 1 cycle) + (1 mul × 30 cycles) = 32 cycles
While the clock cycles for the RISC version are:
    (3 movs × 1 cycle) + (5 adds × 1 cycle) + (5 loops × 1 cycle) = 13 cycles
With the RISC clock cycle being shorter, RISC gives us much faster execution speeds.

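The cycle counts for the two fragments can be checked with a short simulation. It assumes the costs used in the comparison (1 cycle for each mov, add and loop, and 30 cycles for mul), and that loop decrements cx and branches back while cx is nonzero; the function names are illustrative only.

```python
MUL_CYCLES = 30      # assumed cost of the multi-cycle multiply
SIMPLE_CYCLES = 1    # assumed cost of mov, add and loop


def run_cisc_fragment():
    """mov ax,10 / mov bx,5 / mul bx,ax: returns (product, cycles)."""
    ax, bx = 10, 5
    cycles = 2 * SIMPLE_CYCLES       # two movs
    bx = bx * ax
    cycles += MUL_CYCLES             # one mul
    return bx, cycles


def run_risc_fragment():
    """Multiply by repeated addition: returns (product, cycles)."""
    ax, bx, cx = 0, 10, 5
    cycles = 3 * SIMPLE_CYCLES       # three movs
    while cx > 0:
        ax += bx                     # add ax, bx
        cycles += SIMPLE_CYCLES
        cx -= 1                      # loop Begin: decrement cx, branch if nonzero
        cycles += SIMPLE_CYCLES
    return ax, cycles


cisc_result, cisc_cycles = run_cisc_fragment()
risc_result, risc_cycles = run_risc_fragment()
print(cisc_result, cisc_cycles, risc_result, risc_cycles)
```

Both fragments compute the same product; the RISC version executes more instructions but fewer cycles under these assumed costs. The advantage shrinks as the loop count grows, so the comparison is specific to this small multiplier.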
- Because of their load-store ISAs, RISC architectures require a large number of CPU registers.
- These registers provide fast access to data during sequential program execution.
- They can also be employed to reduce the overhead typically caused by passing parameters to subprograms.
- Instead of pulling parameters off of a stack, the subprogram is directed to use a subset of registers.

- This is how registers can be overlapped in a RISC system.
- The current window pointer (CWP) points to the active register window.
(figure: overlapping register windows)

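Overlapping register windows can be sketched as one physical register file viewed through a sliding window. The layout below (8 shared in/out registers and 8 locals per window, four windows, CWP incremented on a call) is an assumption chosen for illustration; real designs differ in sizes and direction, and they wrap around and spill to memory instead of raising an error.

```python
class RegisterWindows:
    """A caller's 'out' registers are physically the callee's 'in' registers."""

    def __init__(self, num_windows=4, shared=8, local=8):
        self.num_windows = num_windows
        self.shared = shared                 # size of each in/out group
        self.local = local                   # size of each local group
        self.stride = shared + local         # distance between adjacent windows
        self.regs = [0] * (num_windows * self.stride + shared)
        self.cwp = 0                         # current window pointer

    def _index(self, kind, n):
        base = self.cwp * self.stride
        if kind == "in" and 0 <= n < self.shared:
            return base + n
        if kind == "local" and 0 <= n < self.local:
            return base + self.shared + n
        if kind == "out" and 0 <= n < self.shared:
            return base + self.stride + n    # same cells as the next window's 'in'
        raise ValueError(f"bad register {kind}{n}")

    def read(self, kind, n):
        return self.regs[self._index(kind, n)]

    def write(self, kind, n, value):
        self.regs[self._index(kind, n)] = value

    def call(self):
        if self.cwp + 1 >= self.num_windows:
            raise OverflowError("no free window (real hardware would spill one)")
        self.cwp += 1

    def ret(self):
        if self.cwp == 0:
            raise IndexError("return without a matching call")
        self.cwp -= 1


windows = RegisterWindows()
windows.write("out", 0, 42)          # caller places an argument
windows.call()                       # only the CWP moves; no data is copied
received = windows.read("in", 0)     # callee sees the argument
windows.write("local", 0, 7)         # callee's locals are private
windows.write("in", 0, received + 1) # callee leaves a return value
windows.ret()
result = windows.read("out", 0)      # caller picks up the return value
```

Parameters are passed by moving the window pointer rather than by copying values through memory, which is the overhead reduction that register windows aim for.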
- It is becoming increasingly difficult to distinguish RISC architectures from CISC architectures.
- Some RISC systems provide more extravagant instruction sets than some CISC systems.
- Some systems combine both approaches.
- The following two slides summarize the characteristics that traditionally typify the differences between these two architectures.

RISC
- Multiple register sets.
- Three operands per instruction.
- Parameter passing through register windows.
- Single-cycle instructions.
- Hardwired control.
- Highly pipelined.
CISC
- Single register set.
- One or two register operands per instruction.
- Parameter passing through memory.
- Multiple cycle instructions.
- Microprogrammed control.
- Less pipelined.
Continued...

RISC
- Simple instructions, few in number.
- Fixed length instructions.
- Complexity in compiler.
- Only LOAD/STORE instructions access memory.
- Few addressing modes.
CISC
- Many complex instructions.
- Variable length instructions.
- Complexity in microcode.
- Many instructions can access memory.
- Many addressing modes.

Summary

Instruction Set Design Issues
Instruction set design issues include:
- Where are operands stored?
  - registers, memory, stack, accumulator
- How many explicit operands are there?
  - 0, 1, 2, or 3
- How is the operand location specified?
  - register, immediate, indirect, ...
- What type and size of operands are supported?
  - byte, int, float, double, string, vector, ...
- What operations are supported?
  - add, sub, mul, move, compare, ...

More About General Purpose Registers
Why do almost all new architectures use GPRs?
- Registers are much faster than memory (even cache)
  - Register values are available immediately
  - When memory isn't ready, the processor must wait ("stall")
- Registers are convenient for variable storage
  - Compiler assigns some variables just to registers
  - More compact code, since small fields specify registers (compared to memory addresses)
(figure: Processor with Registers, Cache, Memory, Disk)

What Operations are Needed
- Arithmetic and Logical
  - Integer arithmetic: ADD, SUB, MULT, DIV, SHIFT
  - Logical operation: AND, OR, XOR, NOT
- Data Transfer - copy, load, store
- Control - branch, jump, call, return
- Floating Point: ADDF, MULF, DIVF (same as arithmetic, but usually take bigger operands)
- Decimal - ADDD, CONVERT
- String - move, compare, search
- Graphics - pixel and vertex, compression/decompression operations

7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 80101
Llo control and I25 instructions
$ 8ranch
$ Procedure call
$ Interrupts
I25 instructions$ Memory$mapped I25
U Most processors support memory$mapped I25
U 7o separate instructions for I25
$ Isolated I25 U Pentium supports isolated I25
U Separate I25 instructions
Ao7ort read from an IO ort
out o7ortA rte to an IO ort
5 Instruction 0ormats5 Instruction 0ormats
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 81101
To types
Lied$length$ 4sed by 1ISC processors
$ $bit 1ISC processors use $bits ide instructions U ltamples6 SPA1C MIPS PoerPC
ariable$length
$ 4sed by CISC processors
$ Memory operands need more bits to specify
5pcode
MaOor and eact operation
Examples of Instruction 0ormatsExam
ples of Instruction 0ormats
RISC (Reduced Instruction Set Computer) versus CISC (Complex Instruction Set Computer)
The underlying philosophy of RISC machines is that a system is better able to manage program execution when the program consists of only a few different instructions that are the same length and require the same number of clock cycles to decode and execute.
RISC systems access memory only with explicit load and store instructions.
In CISC systems, many different kinds of instructions access memory, making instruction length variable and fetch-decode-execute time unpredictable.
The difference between CISC and RISC becomes evident through the basic computer performance equation:
    time/program = (time/cycle) × (cycles/instruction) × (instructions/program)
RISC systems shorten execution time by reducing the clock cycles per instruction.
CISC systems improve performance by reducing the number of instructions per program.
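The trade-off in the performance equation can be tried out numerically. The sketch below multiplies the three factors. The instruction counts, CPI values, and cycle time are invented purely for illustration and are not measurements of any real machine.

```python
def cpu_time_ns(instruction_count, cycles_per_instruction, cycle_time_ns):
    """time/program = instructions/program * cycles/instruction * time/cycle."""
    return instruction_count * cycles_per_instruction * cycle_time_ns


# Hypothetical workload: the RISC build needs more instructions but fewer cycles each.
risc_ns = cpu_time_ns(instruction_count=1500, cycles_per_instruction=1.0, cycle_time_ns=2.0)
cisc_ns = cpu_time_ns(instruction_count=1000, cycles_per_instruction=4.0, cycle_time_ns=2.0)

print(f"RISC: {risc_ns} ns, CISC: {cisc_ns} ns")
```

With these made-up numbers the lower CPI outweighs the larger instruction count. A different mix of values could favor the other side, which is why all three factors have to be considered together.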
The simple instruction set of RISC machines enables control units to be hardwired for maximum speed.
The more complex (and variable) instruction set of CISC machines requires microcode-based control units that interpret instructions as they are fetched from memory. This translation takes time.
With fixed-length instructions, RISC lends itself to pipelining and speculative execution.
Consider the two program fragments:
CISC
    mov ax, 10
    mov bx, 5
    mul bx, ax
RISC
    mov ax, 0
    mov bx, 10
    mov cx, 5
    Begin: add ax, bx
           loop Begin
The total clock cycles for the CISC version might be:
    (2 movs × 1 cycle) + (1 mul × 30 cycles) = 32 cycles
While the clock cycles for the RISC version are:
    (3 movs × 1 cycle) + (5 adds × 1 cycle) + (5 loops × 1 cycle) = 13 cycles
With the RISC clock cycle also being shorter, RISC gives us much faster execution speeds.
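The cycle totals for the two fragments can be checked with a short tally. This sketch assumes the per-instruction costs from the usual textbook version of the example (1 cycle each for mov, add, and loop, 30 cycles for mul) and operand values of 10 and 5. The function names are illustrative only.

```python
# Assumed per-instruction costs in clock cycles (illustrative, not measured).
CYCLES = {"mov": 1, "add": 1, "loop": 1, "mul": 30}


def total_cycles(executed):
    """Sum the cycles for a mapping of opcode -> number of times executed."""
    return sum(CYCLES[opcode] * count for opcode, count in executed.items())


def run_risc_fragment(multiplicand=10, multiplier=5):
    """Multiply by repeated addition, counting each executed instruction."""
    ax, bx, cx = 0, multiplicand, multiplier
    executed = {"mov": 3, "add": 0, "loop": 0}
    while True:
        ax += bx                # add ax, bx
        executed["add"] += 1
        cx -= 1                 # loop: decrement cx, branch back while cx != 0
        executed["loop"] += 1
        if cx == 0:
            break
    return ax, executed


product, risc_executed = run_risc_fragment()
risc_cycles = total_cycles(risc_executed)
cisc_cycles = total_cycles({"mov": 2, "mul": 1})
print(product, risc_cycles, cisc_cycles)
```

The RISC fragment executes more instructions (13 versus 3) yet finishes in fewer cycles under these assumed costs, because none of its instructions is a long multi-cycle operation.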
Because of their load-store ISAs, RISC architectures require a large number of CPU registers.
These registers provide fast access to data during sequential program execution.
They can also be employed to reduce the overhead typically caused by passing parameters to subprograms.
Instead of pulling parameters off of a stack, the subprogram is directed to use a subset of registers.
[Figure: overlapping register windows]
This is how registers can be overlapped in a RISC system.
The current window pointer (CWP) points to the active register window.
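A minimal sketch of overlapping register windows follows. It assumes a SPARC-like layout in which each window has 8 "in", 8 "local", and 8 "out" registers, and the caller's "out" registers are physically the callee's "in" registers. The class name, the window count, and the choice to increment the CWP on a call are simplifications for illustration. Window overflow (spilling to memory) is not modeled.

```python
class RegisterWindows:
    """Circular register file with overlapping windows selected by a CWP."""

    def __init__(self, n_windows=4, per_group=8):
        self.n_windows = n_windows
        self.per_group = per_group
        self.stride = 2 * per_group           # each new window adds ins + locals
        self.regs = [0] * (n_windows * self.stride)
        self.cwp = 0                          # current window pointer

    def _index(self, group, i):
        if not 0 <= i < self.per_group:
            raise IndexError("register number out of range")
        offset = {"in": 0, "local": self.per_group, "out": 2 * self.per_group}[group]
        # The "out" group of window w lands on the "in" group of window w + 1.
        return (self.cwp * self.stride + offset + i) % len(self.regs)

    def read(self, group, i):
        return self.regs[self._index(group, i)]

    def write(self, group, i, value):
        self.regs[self._index(group, i)] = value

    def call(self):
        self.cwp = (self.cwp + 1) % self.n_windows

    def ret(self):
        self.cwp = (self.cwp - 1) % self.n_windows


windows = RegisterWindows()
windows.write("out", 0, 42)        # caller places an argument in an out register
windows.call()                     # moving the CWP copies no data
argument = windows.read("in", 0)   # callee sees the same physical register
windows.write("in", 0, argument + 1)
windows.ret()
result = windows.read("out", 0)    # caller reads the return value
print(argument, result)
```

The parameter and the return value travel through shared registers, so no stack push or pop is needed for the call.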
It is becoming increasingly difficult to distinguish RISC architectures from CISC architectures.
Some RISC systems provide more extravagant instruction sets than some CISC systems.
Some systems combine both approaches.
The following two slides summarize the characteristics that traditionally typify the differences between these two architectures.
RISC                                          CISC
Multiple register sets                        Single register set
Three operands per instruction                One or two register operands per instruction
Parameter passing through register windows    Parameter passing through memory
Single-cycle instructions                     Multiple-cycle instructions
Hardwired control                             Microprogrammed control
Highly pipelined                              Less pipelined
RISC                                          CISC
Simple instructions, few in number            Many complex instructions
Fixed-length instructions                     Variable-length instructions
Complexity in compiler                        Complexity in microcode
Only LOAD/STORE instructions access memory    Many instructions can access memory
Few addressing modes                          Many addressing modes
Summary
Instruction Set Design Issues
Instruction set design issues include:
- Where are operands stored?
  - registers, memory, stack, accumulator
- How many explicit operands are there?
  - 0, 1, 2, or 3
- How is the operand location specified?
  - register, immediate, indirect, ...
- What type and size of operands are supported?
  - byte, int, float, double, string, vector, ...
- What operations are supported?
  - add, sub, mul, move, compare, ...
More About General Purpose Registers
Why do almost all new architectures use GPRs?
- Registers are much faster than memory (even cache)
  - Register values are available immediately
  - When memory isn't ready, the processor must wait ("stall")
- Registers are convenient for variable storage
  - Compiler assigns some variables just to registers
  - More compact code, since small fields specify registers (compared to memory addresses)
[Figure: Processor (Registers, Cache), Memory, Disk]
What Operations are Needed
- Arithmetic and logical
  - Integer arithmetic: ADD, SUB, MULT, DIV, SHIFT
  - Logical operations: AND, OR, XOR, NOT
- Data transfer: copy, load, store
- Control: branch, jump, call, return
- Floating point: ADD, MUL, DIV
  - Same as arithmetic, but usually take bigger operands
- Decimal: ADDD, CONVERT
- String: move, compare, search
- Graphics: pixel and vertex, compression/decompression operations
Stack Architecture Pros and Cons
Pros
- Good code density (implicit top of stack)
- Low hardware requirements
- Easy to write a simpler compiler for stack architectures
Cons
- Stack becomes the bottleneck
- Little ability for parallelism or pipelining
- Data is not always at the top of stack when needed, so additional instructions like TOP and SWAP are needed
- Difficult to write an optimizing compiler for stack architectures
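The points about implicit operands and the need for SWAP can be seen in a toy interpreter. This is only a sketch of a hypothetical stack machine: the opcode names and the tuple-based program format are invented for illustration.

```python
def run_stack_program(program, memory):
    """Execute (opcode, operand...) tuples on a tiny stack machine."""
    stack = []
    for opcode, *operands in program:
        if opcode == "push":
            stack.append(memory[operands[0]])
        elif opcode == "pop":
            memory[operands[0]] = stack.pop()
        elif opcode == "add":
            right, left = stack.pop(), stack.pop()
            stack.append(left + right)
        elif opcode == "sub":
            right, left = stack.pop(), stack.pop()
            stack.append(left - right)
        elif opcode == "swap":
            stack[-1], stack[-2] = stack[-2], stack[-1]
        else:
            raise ValueError(f"unknown opcode: {opcode}")
    return memory


memory = {"A": 7, "B": 5}

# C = A + B: the arithmetic instruction names no operands at all.
run_stack_program([("push", "A"), ("push", "B"), ("add",), ("pop", "C")], memory)

# D = B - A when the operands were pushed as A, B: an extra SWAP is required.
run_stack_program(
    [("push", "A"), ("push", "B"), ("swap",), ("sub",), ("pop", "D")], memory
)
print(memory["C"], memory["D"])
```

The add and sub instructions carry no operand fields, which is where the good code density comes from. The second program shows the cost: an extra instruction just to rearrange the stack.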
Accumulator Architecture Pros and Cons
Pros
- Very low hardware requirements
- Easy to design and understand
Cons
- Accumulator becomes the bottleneck
- Little ability for parallelism or pipelining
- High memory traffic
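The high memory traffic of an accumulator design can be made visible by counting data accesses. In this hypothetical sketch every instruction names one memory operand and uses the accumulator implicitly. The opcode names are assumptions made for illustration.

```python
def run_accumulator_program(program, memory):
    """Run (opcode, address) pairs; return the number of data memory accesses."""
    accumulator = 0
    accesses = 0
    for opcode, address in program:
        if opcode == "load":
            accumulator = memory[address]
        elif opcode == "add":
            accumulator += memory[address]
        elif opcode == "store":
            memory[address] = accumulator
        else:
            raise ValueError(f"unknown opcode: {opcode}")
        accesses += 1  # every instruction touches memory exactly once
    return accesses


memory = {"A": 1, "B": 2, "C": 3}

# D = A + B + C
accesses = run_accumulator_program(
    [("load", "A"), ("add", "B"), ("add", "C"), ("store", "D")], memory
)
print(memory["D"], accesses)
```

Every instruction here performs a data memory access, and every intermediate value passes through the single accumulator. That illustrates both the traffic and the bottleneck listed above.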
Memory-Memory Architecture Pros and Cons
Pros
- Requires fewer instructions (especially with 3 operands)
- Easy to write compilers for (especially with 3 operands)
Cons
- Very high memory traffic (especially with 3 operands)
- Variable number of clocks per instruction
- With two operands, more data movements are required
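The difference between two-operand and three-operand memory-memory instructions can be sketched by counting instructions and data accesses for C = A + B. The opcode names add3, add2, and mov, and the access counts charged to each, are assumptions of this toy model.

```python
def run_memory_memory(program, memory):
    """Run a memory-memory program; return (instruction count, data accesses)."""
    accesses = 0
    for instruction in program:
        opcode = instruction[0]
        if opcode == "add3":      # dst = src1 + src2: two reads, one write
            _, dst, src1, src2 = instruction
            memory[dst] = memory[src1] + memory[src2]
            accesses += 3
        elif opcode == "add2":    # dst = dst + src: two reads, one write
            _, dst, src = instruction
            memory[dst] = memory[dst] + memory[src]
            accesses += 3
        elif opcode == "mov":     # dst = src: one read, one write
            _, dst, src = instruction
            memory[dst] = memory[src]
            accesses += 2
        else:
            raise ValueError(f"unknown opcode: {opcode}")
    return len(program), accesses


three_operand = run_memory_memory([("add3", "C", "A", "B")], {"A": 7, "B": 5})
two_operand = run_memory_memory(
    [("mov", "C", "A"), ("add2", "C", "B")], {"A": 7, "B": 5}
)
print(three_operand, two_operand)
```

In this model the two-operand form needs an extra mov to avoid overwriting a source, which is the additional data movement noted in the cons.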
Memory-Register Architecture Pros and Cons
Pros
- Some data can be accessed without loading first
- Instruction format easy to encode
- Good code density
Cons
- Operands are not equivalent (poor orthogonality)
- Variable number of clocks per instruction
- May limit number of registers
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 83101
ISC e)uce) Instruction Set Computer 3
ersus
CISC Comple Instruction Set Computer3
RISC s CISCRISC s CISC
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 84101
0
RISC s CISCRISC s CISC
The underlying philosophy of 1ISC machines is that asystem is better able to manage program eecutionhen the program consists of only a fe differentinstructions that are the same length and reJuire thesame number of cloc0 cycles to decode and eecute
1ISC systems access memory only ith eplicit loadand store instructions
In CISC systems many different 0inds of instructionsaccess memory ma0ing instruction length variableand fetch$decode$eecute time unpredictable
RISC s CISCRISC s CISC
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 85101
The difference beteen CISC and 1ISC becomesevident through the basic computer performanceeJuation6
1ISC systems shorten eecution time by reducingthe cloc0 cycles per instruction
CISC systems improve performance by reducing thenumber of instructions per program
RISC s CISCRISC s CISC
RISC s CISCRISC s CISC
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 86101
(
The simple instruction set of 1ISC machinesenables control units to be hardired for maimumspeed
The more comple$$ and variable$$ instruction set of
CISC machines reJuires microcode$based controlunits that interpret instructions as they are fetchedfrom memory This translation ta0es time
Dith fied$length instructions 1ISC lends itself topipelining and speculative eecution
RISC s CISCRISC s CISC
RISC s CISCRISC s CISC
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 87101
mo1 a8 6 mo1 b8 6 mo1 c8
Be add a8 b8 loo Be
Consider the the program fragments6
The total cloc0 cycles for the CISC version might be6(amp mo1s c4cle) + ( mul 6 c4cles) = amp c4cles
Dhile the cloc0 cycles for the 1ISC version is6
( mo1s c4cle) + ( adds c4cle) + ( loos c4cle) = c4cles
Dith 1ISC cloc0 cycle being shorter 1ISC gives usmuch faster eecution speeds
mo1 a8 6 mo1 b8 mul b8 a8
CISC RISC
RISC s CISCRISC s CISC
RISC s CISCRISC s CISC
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 88101
8ecause of their load$store ISAs 1ISC architecturesreJuire a large number of CP4 registers
These register provide fast access to data duringseJuential program eecution
They can also be employed to reduce the overheadtypically caused by passing parameters tosubprograms
Instead of pulling parameters off of a stac0 the
subprogram is directed to use a subset of registers
RISC s CISCRISC s CISC
RISC s CISCRISC s CISC
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 89101
3
This is horegisters canbe overlappedin a 1ISCsystem
The currentindo pointer (CDP) pointsto the activeregister
indo
RISC s CISCRISC s CISC
RISC s CISCRISC s CISC
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 90101
34
It is becoming increasingly difficult to distinguish1ISC architectures from CISC architectures
Some 1ISC systems provide more etravagantinstruction sets than some CISC systems
Some systems combine both approaches The folloing to slides summari-e the
characteristics that traditionally typify the differencesbeteen these to architectures
RISC s CISCRISC s CISC
RISC s CISCRISC s CISC
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 91101
31
RISC Multiple reister sets4
Three operan)s perinstruction4
Parameter passinthrouh reister5in)o5s4
Sinle-ccle
instructions4 7ar)5ire)
control4
7ihl pipeline)4
CISC Sinle reister set4
ne or t5o reisteroperan)s per
instruction4 Parameter passin
throuh memor4
Multiple ccle
instructions4 Microproramme)
control4
(ess pipeline)4ontinued
RISC s CISCRISC s CISC
RISC s CISCRISC s CISC
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 92101
32
RISC Simple instructions
fe5 in num9er4
ie) lenth
instructions4 Compleit in
compiler4
nl 29ADT9$E
instructions accessmemor4
e5 a))ressin mo)es4
CISC Man comple
instructions4
aria9le lenth
instructions4 Compleit in
microco)e4
Man instructions can
access memor4
Man a))ressinmo)es4
RISC s CISCRISC s CISC
RISC s CISCRISC s CISC
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 93101
RISC s CISCRISC s CISC
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 94101
Summar
Instruction Set Design IssuesInstruction Set Desi
gn Issues
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 95101
g
Instruction set )esin issues inclu)e here are operan)s store)lt
- reisters memor stac= accumulator
7o5 man eplicit operan)s are therelt
- 0 + 2 or amp
7o5 is the operan) location specifie)lt
- reister imme)iate in)irect 4 4 4
hat tpe gt sie of operan)s are supporte)lt
- 9te int float )ou9le strin ector4 4 4
hat operations are supporte)lt
- a)) su9 mul moe compare 4 4 4
More A+out 6eneral Purpose egistersMore A+out 6eneral Pu
rpose egisters
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 96101
h )o almost all ne5 architectures usePslt
eisters are much faster than memor eencache3
- eister alues are aaila9le imme)iatel
- hen memor isnt rea) processor must 5aitBstall3
eisters are conenient for aria9le storae
- Compiler assins some aria9les Dust to reisters
- More compact co)e since small fiel)s specifreisters
compare) to memor a))resses3Registers Cache
MemoryProcessor Disk
What Operations are Needed
Arithmetic & Logical
- Integer arithmetic: ADD, SUB, MULT, DIV, SHIFT
- Logical operation: AND, OR, XOR, NOT
Data Transfer - copy, load, store
Control - branch, jump, call, return
Floating Point - ADDF, MULF, DIVF: same as arithmetic but usually take bigger operands
Decimal - ADDD, CONVERT
String - move, compare, search
Graphics - pixel and vertex, compression/decompression operations
Stack Architecture: Pros and Cons
Pros:
- Good code density (implicit top of stack)
- Low hardware requirements
- Easy to write a simpler compiler for stack architectures
Cons:
- Stack becomes the bottleneck
- Little ability for parallelism or pipelining
- Data is not always at the top of stack when needed, so additional instructions like TOP and SWAP are needed
- Difficult to write an optimizing compiler for stack architectures
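The implicit-operand style, and the need for extra stack-shuffling instructions, can be sketched with a toy interpreter. This is a minimal illustration rather than any real ISA: the opcode names PUSH, POP, ADD, SUB and SWAP and the tuple encoding are assumptions made for the example.

```python
def run_stack_program(program, memory):
    """Execute a tiny stack-machine program; ALU operands are implicit."""
    stack = []
    for op, *args in program:
        if op == "PUSH":       # push memory[name] onto the stack
            stack.append(memory[args[0]])
        elif op == "POP":      # pop the top of stack into memory[name]
            memory[args[0]] = stack.pop()
        elif op == "ADD":      # replace the top two entries with their sum
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == "SUB":      # second-from-top minus top
            b, a = stack.pop(), stack.pop()
            stack.append(a - b)
        elif op == "SWAP":     # exchange the top two entries
            stack[-1], stack[-2] = stack[-2], stack[-1]
        else:
            raise ValueError(f"unknown opcode {op!r}")
    return memory

memory = {"A": 7, "B": 3}
# C = A + B: no operand addresses appear in ADD itself.
program_add = [("PUSH", "A"), ("PUSH", "B"), ("ADD",), ("POP", "C")]
# D = A - B with the operands pushed in the wrong order: SWAP repairs it.
program_sub = [("PUSH", "B"), ("PUSH", "A"), ("SWAP",), ("SUB",), ("POP", "D")]

run_stack_program(program_add, memory)
run_stack_program(program_sub, memory)
print(memory["C"], memory["D"])
```

ADD and SUB carry no operand fields, which is where the code density comes from, while the SWAP in the second program is pure overhead caused by data not being where the operation expects it.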
Accumulator Architecture: Pros and Cons
Pros:
- Very low hardware requirements
- Easy to design and understand
Cons:
- Accumulator becomes the bottleneck
- Little ability for parallelism or pipelining
- High memory traffic
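A toy one-address machine shows why memory traffic is high: every instruction names a memory operand, and every intermediate value passes through the single accumulator. The opcode names LOAD, ADD and STORE and the access-counting scheme are assumptions for illustration only.

```python
def run_accumulator_program(program, memory):
    """Run a one-address program; return the number of data memory accesses."""
    acc = 0
    accesses = 0
    for op, addr in program:
        if op == "LOAD":       # acc = memory[addr]
            acc = memory[addr]
        elif op == "ADD":      # acc = acc + memory[addr]
            acc += memory[addr]
        elif op == "STORE":    # memory[addr] = acc
            memory[addr] = acc
        else:
            raise ValueError(f"unknown opcode {op!r}")
        accesses += 1          # each instruction here touches memory once
    return accesses

memory = {"A": 7, "B": 3, "C": 2}
# D = A + B + C: four instructions, four data memory accesses.
program = [("LOAD", "A"), ("ADD", "B"), ("ADD", "C"), ("STORE", "D")]
accesses = run_accumulator_program(program, memory)
print(memory["D"], accesses)
```

Because there is only one place to hold a value, any second intermediate result would have to be stored to memory and reloaded later, which adds further traffic.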
Memory-Memory Architecture: Pros and Cons
Pros:
- Requires fewer instructions (especially if 3 operands)
- Easy to write compilers for (especially if 3 operands)
Cons:
- Very high memory traffic (especially if 3 operands)
- Variable number of clocks per instruction
- With two operands, more data movements are required
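The trade-off between operand counts can be sketched by counting instructions and data memory accesses for D = A + B. The opcode names ADD3, ADD2 and MOV and the per-instruction access counts are assumptions for this toy model, not properties of a specific machine.

```python
def run_memory_memory(program, memory):
    """Run a memory-memory program; return (instructions, memory accesses)."""
    instructions = 0
    accesses = 0
    for op, *operands in program:
        if op == "ADD3":       # dst = src1 + src2 (two reads, one write)
            dst, src1, src2 = operands
            memory[dst] = memory[src1] + memory[src2]
            accesses += 3
        elif op == "ADD2":     # dst = dst + src (two reads, one write)
            dst, src = operands
            memory[dst] = memory[dst] + memory[src]
            accesses += 3
        elif op == "MOV":      # dst = src (one read, one write)
            dst, src = operands
            memory[dst] = memory[src]
            accesses += 2
        else:
            raise ValueError(f"unknown opcode {op!r}")
        instructions += 1
    return instructions, accesses

# D = A + B in a three-operand form: one instruction.
mem3 = {"A": 7, "B": 3}
three_operand = run_memory_memory([("ADD3", "D", "A", "B")], mem3)

# Same computation in a two-operand form: an extra data movement is needed.
mem2 = {"A": 7, "B": 3}
two_operand = run_memory_memory([("MOV", "D", "A"), ("ADD2", "D", "B")], mem2)

print(three_operand, two_operand)
```

In this model the three-operand form needs fewer instructions, while the two-operand form pays for an additional move because the destination doubles as a source.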
Memory-Register Architecture: Pros and Cons
Pros:
- Some data can be accessed without loading first
- Instruction format easy to encode
- Good code density
Cons:
- Operands are not equivalent (poor orthogonality)
- Variable number of clocks per instruction
- May limit number of registers
RISC vs CISC
The underlying philosophy of RISC machines is that a system is better able to manage program execution when the program consists of only a few different instructions that are the same length and require the same number of clock cycles to decode and execute.
RISC systems access memory only with explicit load and store instructions.
In CISC systems, many different kinds of instructions access memory, making instruction length variable and fetch-decode-execute time unpredictable.
The difference between CISC and RISC becomes evident through the basic computer performance equation:
time/program = (time/cycle) x (cycles/instruction) x (instructions/program)
RISC systems shorten execution time by reducing the clock cycles per instruction.
CISC systems improve performance by reducing the number of instructions per program.
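The performance equation multiplies three factors: instructions per program, cycles per instruction, and time per cycle. The sketch below plugs in made-up numbers purely to show how the two design styles pull on different factors; the instruction counts, CPI values and cycle time are all hypothetical, and the outcome depends entirely on them.

```python
def execution_time_ns(instruction_count, cycles_per_instruction, cycle_time_ns):
    """time/program = instructions/program * cycles/instruction * time/cycle."""
    return instruction_count * cycles_per_instruction * cycle_time_ns

# Hypothetical workload: the CISC version uses fewer, slower instructions;
# the RISC version uses more instructions at one cycle each.
cisc_time_ns = execution_time_ns(instruction_count=100,
                                 cycles_per_instruction=6,
                                 cycle_time_ns=2)
risc_time_ns = execution_time_ns(instruction_count=250,
                                 cycles_per_instruction=1,
                                 cycle_time_ns=2)

print(cisc_time_ns, risc_time_ns)
```

With other assumed values the comparison could go the other way, which is the point of the equation: neither factor decides performance on its own.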
The simple instruction set of RISC machines enables control units to be hardwired for maximum speed.
The more complex, and variable, instruction set of CISC machines requires microcode-based control units that interpret instructions as they are fetched from memory. This translation takes time.
With fixed-length instructions, RISC lends itself to pipelining and speculative execution.
Consider the following program fragments:
CISC:
    mov ax, 10
    mov bx, 5
    mul bx, ax
RISC:
    mov ax, 0
    mov bx, 10
    mov cx, 5
Begin: add ax, bx
    loop Begin
The total clock cycles for the CISC version might be:
(2 movs x 1 cycle) + (1 mul x 30 cycles) = 32 cycles
While the clock cycles for the RISC version are:
(3 movs x 1 cycle) + (5 adds x 1 cycle) + (5 loops x 1 cycle) = 13 cycles
With the RISC clock cycle being shorter, RISC gives us much faster execution speeds.
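The cycle arithmetic for the two fragments can be replayed in a few lines. This is a sketch under stated assumptions: one cycle each for mov, add and loop, thirty cycles for mul, and operands 10 and 5. The function names and the `CYCLES` table are hypothetical.

```python
# Assumed per-instruction cycle costs for this illustration.
CYCLES = {"mov": 1, "add": 1, "loop": 1, "mul": 30}

def run_cisc(a, b):
    """Multiply with a single multi-cycle mul; return (product, cycles)."""
    cycles = 0
    ax = a
    cycles += CYCLES["mov"]
    bx = b
    cycles += CYCLES["mov"]
    bx = bx * ax
    cycles += CYCLES["mul"]
    return bx, cycles

def run_risc(a, b):
    """Multiply by repeated addition (requires b >= 1); return (product, cycles)."""
    cycles = 0
    ax = 0
    cycles += CYCLES["mov"]
    bx = a
    cycles += CYCLES["mov"]
    cx = b
    cycles += CYCLES["mov"]
    while True:
        ax += bx                 # add ax, bx
        cycles += CYCLES["add"]
        cx -= 1                  # loop decrements cx and branches if nonzero
        cycles += CYCLES["loop"]
        if cx == 0:
            break
    return ax, cycles

cisc_result, cisc_cycles = run_cisc(10, 5)
risc_result, risc_cycles = run_risc(10, 5)
print(cisc_result, cisc_cycles)
print(risc_result, risc_cycles)
```

The RISC count grows with the loop operand, so the advantage shown here holds only for small multipliers under these assumed costs.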
Because of their load-store ISAs, RISC architectures require a large number of CPU registers.
These registers provide fast access to data during sequential program execution.
They can also be employed to reduce the overhead typically caused by passing parameters to subprograms.
Instead of pulling parameters off of a stack, the subprogram is directed to use a subset of registers.
This is how registers can be overlapped in a RISC system.
The current window pointer (CWP) points to the active register window.
[Figure: overlapping register windows]
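Overlapping windows can be modelled as a circular register file in which the "out" registers of one window are the same physical registers as the "in" registers of the next. The sketch below is a toy model: the window count, the group size of eight, and the in/local/out naming are assumptions made for illustration, and window overflow is not handled.

```python
class RegisterWindows:
    """Toy model of overlapping register windows (sizes are assumptions)."""

    def __init__(self, num_windows=4, group_size=8):
        self.num_windows = num_windows
        self.group_size = group_size
        # Each window adds its own locals and outs; its ins are the previous
        # window's outs, so consecutive windows are two groups apart.
        self.stride = 2 * group_size
        self.regs = [0] * (num_windows * self.stride)
        self.cwp = 0  # current window pointer

    def _index(self, kind, n):
        if not 0 <= n < self.group_size:
            raise IndexError("register number out of range")
        offset = {"in": 0,
                  "local": self.group_size,
                  "out": 2 * self.group_size}[kind]
        return (self.cwp * self.stride + offset + n) % len(self.regs)

    def read(self, kind, n):
        return self.regs[self._index(kind, n)]

    def write(self, kind, n, value):
        self.regs[self._index(kind, n)] = value

    def call(self):
        # No overflow handling: a real design would spill a window to memory.
        self.cwp = (self.cwp + 1) % self.num_windows

    def ret(self):
        self.cwp = (self.cwp - 1) % self.num_windows


windows = RegisterWindows()
windows.write("out", 0, 42)              # caller places an argument in out0
windows.call()                           # CWP advances; no data is copied
callee_arg = windows.read("in", 0)       # callee sees the same register as in0
windows.write("in", 0, callee_arg + 1)   # callee leaves a result in in0
windows.ret()
result = windows.read("out", 0)          # caller reads the result from out0
print(callee_arg, result)
```

Passing the argument costs no memory access in this model; only the pointer moves, which is the overhead reduction the slides describe.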
It is becoming increasingly difficult to distinguish RISC architectures from CISC architectures.
Some RISC systems provide more extravagant instruction sets than some CISC systems.
Some systems combine both approaches.
The following two slides summarize the characteristics that traditionally typify the differences between these two architectures.
RISC:
- Multiple register sets.
- Three operands per instruction.
- Parameter passing through register windows.
- Single-cycle instructions.
- Hardwired control.
- Highly pipelined.
CISC:
- Single register set.
- One or two register operands per instruction.
- Parameter passing through memory.
- Multiple cycle instructions.
- Microprogrammed control.
- Less pipelined.
Continued...
RISC vs CISC
- Because of their load/store ISAs, RISC architectures require a large number of CPU registers.
- These registers provide fast access to data during sequential program execution.
- They can also be employed to reduce the overhead typically caused by passing parameters to subprograms.
- Instead of pulling parameters off of a stack, the subprogram is directed to use a subset of registers.
RISC vs CISC
[Figure: how registers can be overlapped in a RISC system.]
The current window pointer (CWP) points to the active register window.
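Overlapping register windows can be sketched as a circular register file in which the "out" registers of the caller are physically the "in" registers of the callee. The class below is a minimal illustration: the window count, the group sizes, and the choice to increment the CWP on a call are all assumptions, and window overflow (spilling to memory when the windows run out) is deliberately left out.

```python
class RegisterWindows:
    """Toy circular register file with overlapping windows (hypothetical sizes)."""

    def __init__(self, n_windows=4, n_shared=2, n_local=2):
        self.n_windows = n_windows
        self.n_shared = n_shared            # size of each "in" and each "out" group
        self.n_local = n_local
        self.stride = n_shared + n_local    # physical registers added per window
        self.regs = [0] * (n_windows * self.stride)
        self.cwp = 0                        # current window pointer

    def _index(self, group, i):
        offsets = {"in": 0, "local": self.n_shared, "out": self.stride}
        if group not in offsets:
            raise ValueError(f"unknown register group: {group}")
        limit = self.n_local if group == "local" else self.n_shared
        if not 0 <= i < limit:
            raise IndexError(f"{group} register {i} out of range")
        base = self.cwp * self.stride
        # "out" registers fall in the next window's "in" range: the overlap.
        return (base + offsets[group] + i) % len(self.regs)

    def read(self, group, i):
        return self.regs[self._index(group, i)]

    def write(self, group, i, value):
        self.regs[self._index(group, i)] = value

    def call(self):                         # enter a subprogram: advance the window
        self.cwp = (self.cwp + 1) % self.n_windows

    def ret(self):                          # return: restore the caller's window
        self.cwp = (self.cwp - 1) % self.n_windows


rw = RegisterWindows()
rw.write("local", 0, 1)       # caller's private value
rw.write("out", 0, 42)        # caller places a parameter in an "out" register
rw.call()
param_seen = rw.read("in", 0)         # callee sees it as an "in" register
callee_local = rw.read("local", 0)    # callee's locals are a separate group
rw.write("in", 0, 99)         # callee leaves a result in the shared register
rw.ret()
result_seen = rw.read("out", 0)
caller_local = rw.read("local", 0)
```

No parameter is copied through memory here; the call only changes the CWP, which is the overhead reduction the slides describe.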
RISC vs CISC
- It is becoming increasingly difficult to distinguish RISC architectures from CISC architectures.
- Some RISC systems provide more extravagant instruction sets than some CISC systems.
- Some systems combine both approaches.
The following two slides summarize the characteristics that traditionally typify the differences between these two architectures.
RISC vs CISC
RISC
- Multiple register sets.
- Three operands per instruction.
- Parameter passing through register windows.
- Single-cycle instructions.
- Hardwired control.
- Highly pipelined.
CISC
- Single register set.
- One or two register operands per instruction.
- Parameter passing through memory.
- Multiple cycle instructions.
- Microprogrammed control.
- Less pipelined.
Continued
RISC vs CISC
RISC
- Simple instructions, few in number.
- Fixed length instructions.
- Complexity in compiler.
- Only LOAD/STORE instructions access memory.
- Few addressing modes.
CISC
- Many complex instructions.
- Variable length instructions.
- Complexity in microcode.
- Many instructions can access memory.
- Many addressing modes.
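The contrast between "many instructions can access memory" and "only LOAD/STORE instructions access memory" can be sketched as a rewrite of one memory-operand instruction into a load/store sequence. Everything here is hypothetical: the bracket notation for memory operands, the mnemonics, and the assumption that temporaries R1, R2, ... are free to use.

```python
from itertools import count


def lower_to_load_store(op, dst, src1, src2):
    """Rewrite `op dst, src1, src2` so that only LOAD/STORE touch memory.

    Operands written like "[A]" are memory; anything else is taken to be a
    register. Temporaries R1, R2, ... are assumed to be unused elsewhere.
    """
    temps = (f"R{n}" for n in count(1))
    code = []
    sources = []
    for src in (src1, src2):
        if src.startswith("["):             # memory source: load it first
            reg = next(temps)
            code.append(f"LOAD {reg}, {src}")
            sources.append(reg)
        else:
            sources.append(src)
    if dst.startswith("["):                 # memory destination: compute, then store
        reg = next(temps)
        code.append(f"{op} {reg}, {sources[0]}, {sources[1]}")
        code.append(f"STORE {dst}, {reg}")
    else:
        code.append(f"{op} {dst}, {sources[0]}, {sources[1]}")
    return code


risc_sequence = lower_to_load_store("ADD", "[C]", "[A]", "[B]")
```

One CISC-style instruction becomes four simple ones, which shifts the complexity to the compiler (or to whoever writes the lowering) while each instruction stays uniform.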
Summary
Instruction Set Design Issues
Instruction set design issues include:
Where are operands stored?
- registers, memory, stack, accumulator
How many explicit operands are there?
- 0, 1, 2, or 3
How is the operand location specified?
- register, immediate, indirect, ...
What type and size of operands are supported?
- byte, int, float, double, string, vector, ...
What operations are supported?
- add, sub, mul, move, compare, ...
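The question of how an operand location is specified can be illustrated with a short sketch of operand fetch for the three modes named above. The mode names follow the slide; the function, the dictionary-based register file and memory, and the reading of "indirect" as register indirect are assumptions for illustration.

```python
def fetch_operand(mode, value, registers, memory):
    """Return the operand selected by an addressing mode (toy model)."""
    if mode == "immediate":         # the operand is the value in the instruction
        return value
    if mode == "register":          # the operand is the content of a register
        return registers[value]
    if mode == "indirect":          # the register holds the operand's address
        return memory[registers[value]]
    raise ValueError(f"unknown addressing mode: {mode}")


registers = {"R1": 100}
data_memory = {100: 42}

imm = fetch_operand("immediate", 5, registers, data_memory)
reg = fetch_operand("register", "R1", registers, data_memory)
ind = fetch_operand("indirect", "R1", registers, data_memory)
```

The same instruction field (here `value`) is interpreted differently in each mode, which is why the mode itself must be encoded somewhere in the instruction.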
More About General Purpose Registers
Why do almost all new architectures use GPRs?
Registers are much faster than memory (even cache)
- Register values are available immediately
- When memory isn't ready, processor must wait ("stall")
Registers are convenient for variable storage
- Compiler assigns some variables just to registers
- More compact code since small fields specify registers (compared to memory addresses)
[Figure: memory hierarchy showing Processor (Registers), Cache, Memory, Disk]
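The code-compactness point can be checked with simple arithmetic. The sketch below assumes, purely for illustration, a 6-bit opcode, 32 general purpose registers, and a 32-bit memory address space; none of these numbers come from the slides.

```python
def specifier_bits(n_locations):
    """Bits needed to name one of n_locations distinct locations."""
    return max(1, (n_locations - 1).bit_length())


def instruction_bits(opcode_bits, n_operands, n_locations):
    """Total bits for an opcode plus n_operands location specifiers."""
    return opcode_bits + n_operands * specifier_bits(n_locations)


# Three-operand instruction naming registers (assumed: 32 registers).
register_form = instruction_bits(6, 3, 32)

# The same instruction naming full memory addresses (assumed: 2**32 locations).
memory_form = instruction_bits(6, 3, 2**32)
```

Under these assumptions a register specifier takes 5 bits against 32 for a full address, so the register form fits in 21 bits while the memory form needs 102.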
What Operations are Needed
Arithmetic and Logical
- Integer arithmetic: ADD, SUB, MULT, DIV, SHIFT
- Logical operation: AND, NOT
Data Transfer - copy, load, store
Control - branch, jump, call, return
Floating Point - ADD, MUL, DIV (same as arithmetic, but usually take bigger operands)
Decimal - ADDD, CONVERT
String - move, compare, search
Graphics - pixel and vertex, compression/decompression operations
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 94101
Summar
Instruction Set Design IssuesInstruction Set Desi
gn Issues
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 95101
g
Instruction set )esin issues inclu)e here are operan)s store)lt
- reisters memor stac= accumulator
7o5 man eplicit operan)s are therelt
- 0 + 2 or amp
7o5 is the operan) location specifie)lt
- reister imme)iate in)irect 4 4 4
hat tpe gt sie of operan)s are supporte)lt
- 9te int float )ou9le strin ector4 4 4
hat operations are supporte)lt
- a)) su9 mul moe compare 4 4 4
More A+out 6eneral Purpose egistersMore A+out 6eneral Pu
rpose egisters
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 96101
h )o almost all ne5 architectures usePslt
eisters are much faster than memor eencache3
- eister alues are aaila9le imme)iatel
- hen memor isnt rea) processor must 5aitBstall3
eisters are conenient for aria9le storae
- Compiler assins some aria9les Dust to reisters
- More compact co)e since small fiel)s specifreisters
compare) to memor a))resses3Registers Cache
MemoryProcessor Disk
7hat perations are eeded7hat
perations are eeded
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 97101
3
Arithmetic E (oical
Inteer arithmetic A$$ SU MU(T $I S7IT
(oical operation AN$ NT
$ata Transfer - cop loa) store
Control - 9ranch Dump call return
loatin Point A$$ MU( $I 3 Same as arithmetic 9ut usuall ta=e 9ier operan)s
$ecimal - A$$$ CNT
Strin - moe compare search
raphics F piel an) erte compressionG)ecompression operations
Stacamps Architecture Pros and ConsStacamps Architecture Pros and Cons
7232019 Chapter 2 Advanced computer Architecture
httpslidepdfcomreaderfullchapter-2-advanced-computer-architecture 98101
Stacamps Architecture Pros and ConsStacamps Architecture Pros and Cons
Pros oo) co)e )ensit implicit top of stac=3
(o5 har)5are re1uirements
as to 5rite a simpler compiler for stac= architectures
Cons Stac= 9ecomes the 9ottlenec=
(ittle a9ilit for parallelism or pipelinin
$ata is not al5as at the top of stac= 5hen nee) so a))itionalinstructions li=e TP an) SAP are nee)e)
$ifficult to 5rite an optimiin compiler for stac= architectures

Accumulator Architecture Pros and Cons
Pros
- Very low hardware requirements
- Easy to design and understand
Cons
- Accumulator becomes the bottleneck
- Little ability for parallelism or pipelining
- High memory traffic
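
The memory-traffic point can be sketched with a toy accumulator machine. The opcodes, the `run_accumulator_program` function, and the temporary location `T` are assumptions made for illustration only.

```python
def run_accumulator_program(program, memory):
    """Run (opcode, address) pairs; return the number of data memory accesses."""
    accumulator = 0
    memory_accesses = 0
    for opcode, address in program:
        memory_accesses += 1  # every instruction in this toy ISA touches memory once
        if opcode == "LOAD":
            accumulator = memory[address]
        elif opcode == "ADD":
            accumulator += memory[address]
        elif opcode == "MUL":
            accumulator *= memory[address]
        elif opcode == "STORE":
            memory[address] = accumulator
        else:
            raise ValueError(f"unknown opcode: {opcode}")
    return memory_accesses

# E = (A + B) * (C + D); the first sum must be spilled to T because
# there is only one place to hold an intermediate value.
memory = {"A": 2, "B": 3, "C": 4, "D": 5, "T": 0, "E": 0}
program = [
    ("LOAD", "A"), ("ADD", "B"), ("STORE", "T"),
    ("LOAD", "C"), ("ADD", "D"), ("MUL", "T"), ("STORE", "E"),
]
accesses = run_accumulator_program(program, memory)
print(memory["E"], accesses)
```

Every intermediate result passes through the single accumulator, so the partial sum has to be written to memory and read back, which is one way the accumulator becomes the bottleneck.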

Memory-Memory Architecture Pros and Cons
Pros
- Requires fewer instructions (especially if three operands)
- Easy to write compilers for (especially if three operands)
Cons
- Very high memory traffic (especially if three operands)
- Variable number of clocks per instruction
- With two operands, more data movements are required
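
As a rough illustration of the two-operand versus three-operand trade-off, the sketch below counts instructions and data memory accesses for C = A + B. The opcodes `ADD3`, `ADD2`, and `MOV` and the function name are hypothetical, and the access counts assume each named operand costs exactly one memory access.

```python
def run_memory_memory(program, memory):
    """Run memory-memory instructions; return (instruction count, memory accesses)."""
    accesses = 0
    for opcode, dest, *sources in program:
        if opcode == "ADD3":      # dest = src1 + src2: two reads, one write
            first, second = sources
            memory[dest] = memory[first] + memory[second]
            accesses += 3
        elif opcode == "ADD2":    # dest = dest + src: two reads, one write
            (source,) = sources
            memory[dest] = memory[dest] + memory[source]
            accesses += 3
        elif opcode == "MOV":     # dest = src: one read, one write
            (source,) = sources
            memory[dest] = memory[source]
            accesses += 2
        else:
            raise ValueError(f"unknown opcode: {opcode}")
    return len(program), accesses

three_operand_memory = {"A": 5, "B": 7, "C": 0}
three_operand_cost = run_memory_memory(
    [("ADD3", "C", "A", "B")], three_operand_memory)

two_operand_memory = {"A": 5, "B": 7, "C": 0}
two_operand_cost = run_memory_memory(
    [("MOV", "C", "A"), ("ADD2", "C", "B")], two_operand_memory)

print(three_operand_cost, two_operand_cost)
```

In this sketch the two-operand form needs an extra `MOV` to preserve A, so it costs one more instruction and two more memory accesses for the same result.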

Memory-Register Architecture Pros and Cons
Pros
- Some data can be accessed without loading first
- Instruction format easy to encode
- Good code density
Cons
- Operands are not equivalent (poor orthogonality)
- Variable number of clocks per instruction
- May limit number of registers
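
The following sketch shows, under assumed opcodes and a hypothetical four-entry register file, how a register-memory ADD uses one operand straight from memory while the other must be a register that is also overwritten.

```python
def run_register_memory(program, memory, num_registers=4):
    """Run (opcode, register, address) triples on a toy register-memory machine."""
    registers = [0] * num_registers
    for opcode, reg, address in program:
        if opcode == "LOAD":
            registers[reg] = memory[address]
        elif opcode == "ADD":
            # The register is both first source and destination;
            # only the second source may come directly from memory.
            registers[reg] += memory[address]
        elif opcode == "STORE":
            memory[address] = registers[reg]
        else:
            raise ValueError(f"unknown opcode: {opcode}")
    return registers

memory = {"A": 5, "B": 7, "C": 0}
program = [
    ("LOAD", 1, "A"),   # A must be loaded into a register first
    ("ADD", 1, "B"),    # B is used without a separate load
    ("STORE", 1, "C"),
]
registers = run_register_memory(program, memory)
print(memory["C"], registers[1])
```

B is consumed without a prior load, but the copy of A held in the register is destroyed by the ADD, which is what the slide means by the operands not being equivalent.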