Ted Pedersen – CS 3011 – Chapter 10 1

A brief history of computer architectures

• CISC – complex instruction set computing

– Intel x86, VAX

– Evolved from humans … sort of …

• RISC – reduced instruction set computing

– MIPS, SPARC

– Reaction to CISC

• VLIW – very long instruction word

– No big commercial successes

– Idea can be dated back to Alan Turing, 1946 (really)

• Superscalar

– MIPS, Pentium, Pentium Pro, Apple G4

– Evolved from RISC and VLIW

• EPIC – explicitly parallel instruction computing

– IA-64 -> Itanium

– Evolution of VLIW and superscalar


Instruction set architecture

• Defines the interface between the software and the hardware.

• Manifested in assembly language.

• When we talk about instructions, we are talking about assembly language instructions.

• A high-level language is compiled into assembly language, which is then assembled into machine language (10010101010101).

• Can only understand how a processor works by understanding its instruction set architecture. Can only understand that by knowing the assembly language.
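As a small illustration of the last translation step, the sketch below encodes one MIPS assembly instruction, `add $t0, $t1, $t2`, into its 32-bit machine word using the standard MIPS R-type field layout. The helper name `encode_add` and the three-entry register table are made up for this example; a real assembler handles far more than this.

```python
# Illustrative only: encode a MIPS R-type `add` into a 32-bit machine word.
# R-type layout: opcode(6) | rs(5) | rt(5) | rd(5) | shamt(5) | funct(6)

REGISTERS = {"$t0": 8, "$t1": 9, "$t2": 10}  # small subset of MIPS register numbers
ADD_FUNCT = 0x20                              # funct field for `add`; opcode is 0


def encode_add(rd: str, rs: str, rt: str) -> int:
    """Return the machine word for `add rd, rs, rt` (hypothetical helper)."""
    opcode, shamt = 0, 0
    return (
        (opcode << 26)
        | (REGISTERS[rs] << 21)
        | (REGISTERS[rt] << 16)
        | (REGISTERS[rd] << 11)
        | (shamt << 6)
        | ADD_FUNCT
    )


word = encode_add("$t0", "$t1", "$t2")
print(f"0x{word:08X}")   # 0x012A4020
print(f"{word:032b}")    # the same word as a string of 32 bits
```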


CISC Design Philosophy

• PREMISE: Provide a rich instruction set that will allow for simpler compilers and smaller, faster machine language programs.

• RATIONALE:

– A compiler generates assembly language instructions that correspond to instructions written in a high level language. The closer the assembly language corresponds to the high level language, the less work the compiler will have to do.

– There will not be a huge explosion in size when you translate from high level language to assembly. This uses less memory and makes the resulting program small. Also, the assembly language instructions can be implemented directly in hardware making them fast.


CISC design philosophy

• REALITY:

– Complex assembly language instructions are hard for a compiler to exploit, since the compiler must identify the cases where one rich instruction can be used versus those cases where another rich instruction could be used.

– Most programs, regardless of the language they are written in, consist of fairly simple operations (e.g., assignment).

– Complex assembly language instructions are hard to implement in hardware; often they are simulated in microcode.

– A rich instruction set leads to many different opcodes and instruction formats, making instructions longer.

– A rich instruction set demands a complex control unit in the processor.

• VERDICT:

– Most popular and widely used ISA in the history of recorded time is based on CISC (Intel x86).


CISC characteristics

• Rich instruction set

– Many different instructions

– Many different formats

– Many different sized instructions

• Memory to memory operations supported (e.g., operate on two values in memory and store result in memory)

• Microcode! Not all rich instructions can be implemented in hardware due to high cost of chip real estate. Provide a layer of microcode between assembly language and hardware.


RISC design philosophy

• PREMISE: Provide a simple, consistent instruction set that can easily take advantage of compiler and hardware optimizations.

• RATIONALE:

– Most instructions that a program executes are fairly simple (e.g., assignment). Neither the hardware nor the compiler should worry about directly implementing complex operations that just don’t occur that often.

– Simple instructions can all be implemented directly in hardware, which eliminates the need for microcode.

– Simple instructions are easier to decode, thus reducing the size and complexity of the control unit.

– Simple instructions are easier to run in parallel than are complex ones.


RISC design philosophy

• REALITY:

– CISC processors have adopted RISC methodology where they can. CISC instructions converted by microcode to run on RISC-style hardware.

• VERDICT:

– Very difficult to compare RISC and CISC directly. Each borrows from the other, and characteristics of the workload greatly affect performance.


RISC characteristics

• All instructions are executed in hardware, no layer of microcode between hardware and assembly language.

• Instructions are easy to decode – this means a small number of operations with a fixed instruction size and very limited number of formats.

• Limited memory access – only allow load and store operations to directly access memory. Do not perform computations in memory.

• Provide many registers, and a very good compiler that can optimize their utilization.

• Take advantage of instruction level parallelism. Start and finish one instruction per cycle. Attain a CPI (cycles per instruction) of 1!

– Pipelining

– Instruction Level Parallelism
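The load/store restriction in the list above can be sketched with a toy machine. A CISC-style memory-to-memory add (operate on two values in memory and store the result in memory) becomes four simple steps on a load/store machine. The register names, addresses, and helper functions here are invented for illustration and do not correspond to any real instruction set.

```python
# Toy load/store machine: only load() and store() touch memory,
# and add() works on registers only. All names are illustrative.
memory = {0x100: 7, 0x104: 5, 0x108: 0}
registers = {"r1": 0, "r2": 0, "r3": 0}


def load(reg: str, addr: int) -> None:
    """Copy a value from memory into a register."""
    registers[reg] = memory[addr]


def store(reg: str, addr: int) -> None:
    """Copy a value from a register into memory."""
    memory[addr] = registers[reg]


def add(dest: str, src1: str, src2: str) -> None:
    """Register-to-register add; never accesses memory."""
    registers[dest] = registers[src1] + registers[src2]


# mem[0x108] = mem[0x100] + mem[0x104], written as four simple instructions
load("r1", 0x100)
load("r2", 0x104)
add("r3", "r1", "r2")
store("r3", 0x108)

print(memory[0x108])  # 12
```

The instruction count goes up, but each step is simple and uniform, which is the trade the RISC philosophy accepts.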


CISC versus RISC

• CISC – Intel x86

– Instruction length : 1 – 15 bytes

– Addressing modes : > 15

– Instruction formats : > 15

• RISC – MIPS

– Instruction length : 4 bytes

– Addressing modes : 5

– Instruction formats : 3

• CISC: each instruction is complex and takes a fair number of cycles to finish. There are fewer instructions in an assembly language program.

• RISC: each instruction is simple and takes fewer cycles to finish. There are more instructions in an assembly language program.
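The trade-off in the last two bullets can be put into numbers: total cycles are the instruction count multiplied by the average cycles per instruction. The instruction counts and per-instruction cycle figures below are made up purely to show the arithmetic; they are not measurements of any real processor, and different numbers would tip the result the other way.

```python
def total_cycles(instruction_count: int, avg_cycles_per_instruction: float) -> float:
    """Total cycles = number of instructions x average cycles per instruction."""
    return instruction_count * avg_cycles_per_instruction


# Hypothetical program compiled for two hypothetical machines.
cisc_cycles = total_cycles(1_000, 4.0)   # fewer, slower instructions
risc_cycles = total_cycles(2_500, 1.5)   # more, faster instructions

print(cisc_cycles)  # 4000.0
print(risc_cycles)  # 3750.0
```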


Cycle time and clock rate

• Cycle time – interval between clock pulses

– Instructions are synchronized to the clock

– Usually expressed in nanoseconds

• Clock Rate – the more famous measure

– Simply the inverse of cycle time

– Usually expressed in MHz or (now) GHz
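Since clock rate is the inverse of cycle time, converting between them is a one-line calculation. The function name below is chosen only for this sketch.

```python
def cycle_time_ns(clock_rate_hz: float) -> float:
    """Cycle time in nanoseconds for a clock rate given in hertz."""
    return 1e9 / clock_rate_hz


print(cycle_time_ns(500e6))  # 500 MHz -> 2.0 ns
print(cycle_time_ns(1e9))    # 1 GHz   -> 1.0 ns
```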


Don’t fall in love with a clock rate

• Suppose I have a G4 processor that runs at 500 MHz and a Pentium III processor that runs at 700 MHz. Which will run my program faster?

• You don’t know. Need more info. How many cycles will your program take on each of these architectures? They are different, so the underlying assembly language code will be different.

• You can directly compare clock rates for the same processor. A 1 GHz Pentium III will run your program faster than a 500 MHz Pentium III.
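A short calculation shows why the clock rate alone does not answer the question: execution time is the number of cycles divided by the clock rate. The cycle counts below are invented for the sake of the example and say nothing about how a real G4 or Pentium III performs.

```python
def execution_time_s(cycles: int, clock_rate_hz: float) -> float:
    """Execution time in seconds = cycles / clock rate."""
    return cycles / clock_rate_hz


# Different architectures: the same program needs a different number of cycles.
# Both cycle counts are hypothetical.
g4_time = execution_time_s(1_000_000_000, 500e6)
p3_time = execution_time_s(1_750_000_000, 700e6)
print(g4_time, p3_time)  # 2.0 2.5 -> the slower clock wins in this made-up case

# Same processor: the cycle count is unchanged, so clock rate decides.
p3_fast = execution_time_s(1_750_000_000, 1e9)
p3_slow = execution_time_s(1_750_000_000, 500e6)
print(p3_fast, p3_slow)  # 1.75 3.5
```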


Pipelining is what modern processors do…

• Start and finish one instruction per clock cycle.

• Have multiple instructions running in processor at the same time

– Instruction level parallelism
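To see how overlapping instructions approaches one instruction per cycle, the sketch below counts cycles for an idealized pipeline. It assumes every stage takes one cycle and that there are no stalls or hazards, which real pipelines do not achieve; the five-stage depth is just an example value.

```python
def unpipelined_cycles(instructions: int, stages: int) -> int:
    """Each instruction runs all stages before the next one starts."""
    return instructions * stages


def pipelined_cycles(instructions: int, stages: int) -> int:
    """Ideal pipeline: fill it once, then finish one instruction per cycle."""
    return stages + (instructions - 1)


n, k = 1000, 5  # example values: 1000 instructions, 5 stages
print(unpipelined_cycles(n, k))    # 5000
print(pipelined_cycles(n, k))      # 1004
print(pipelined_cycles(n, k) / n)  # 1.004 cycles per instruction
```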


SuperScalar Design Philosophy

• A natural extension of RISC philosophy, except now the goal is to start and finish more than one instruction per clock cycle. Attain CPI < 1!

• Execute several simple operations at a time.

• Must provide hardware duplication to support this degree of parallelism, and the hardware must also be able to schedule instructions to exploit this. The hardware deals with hazards.

ADD A, B, C

ADD C, A, D

Can’t run in parallel, 2nd add depends on 1st.
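The check the hardware has to make for the two ADDs above can be sketched as follows. This assumes the first operand is the destination and the remaining operands are sources; the slide does not state the operand order, so that is an assumption. The hazard names (RAW, WAR, WAW) are the usual textbook labels, and the helper functions are invented for this example.

```python
from typing import List, NamedTuple, Tuple


class Instr(NamedTuple):
    op: str
    dest: str
    srcs: Tuple[str, ...]


def parse(text: str) -> Instr:
    """Parse 'OP dest, src1, src2' (assumed operand order: destination first)."""
    op, operands = text.split(maxsplit=1)
    dest, *srcs = [name.strip() for name in operands.split(",")]
    return Instr(op, dest, tuple(srcs))


def hazards(first: Instr, second: Instr) -> List[str]:
    """List the data hazards that stop `second` from issuing alongside `first`."""
    found = []
    if first.dest in second.srcs:
        found.append("RAW")  # second reads what first writes
    if second.dest in first.srcs:
        found.append("WAR")  # second overwrites what first still has to read
    if second.dest == first.dest:
        found.append("WAW")  # both write the same location
    return found


print(hazards(parse("ADD A, B, C"), parse("ADD C, A, D")))  # ['RAW', 'WAR']
print(hazards(parse("ADD A, B, C"), parse("ADD E, F, G")))  # []
```

An empty list means the pair could be issued in the same cycle, given enough duplicated hardware.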


VLIW design philosophy

• The objective is still to exploit instruction level parallelism and achieve CPI < 1.

• Execute several simple instructions in parallel.

• HOWEVER, now the optimizations required are moved out of the hardware and into the compiler as much as possible.

• A number of simple RISC-like instructions are assembled into a single “instruction word” that contains enough instruction level parallelism to keep the hardware busy.

• The resulting hardware should be simpler than RISC hardware, since forwarding, load hazards, and some branch prediction are handled in the compiler rather than in the hardware.
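The compiler's job of assembling simple instructions into instruction words can be sketched as a greedy packer. This is a deliberately simplified illustration, not how any particular VLIW compiler works: it assumes each instruction is a (destination, sources) pair, treats any shared register between two instructions as a conflict, and ignores latencies and functional-unit types. The word width of 2 is an arbitrary example value.

```python
def conflicts(earlier, later):
    """True if `later` must not share a word with, or precede, `earlier`."""
    e_dest, e_srcs = earlier
    l_dest, l_srcs = later
    return e_dest in l_srcs or l_dest in e_srcs or l_dest == e_dest


def pack(instrs, width):
    """Greedily place each instruction in the earliest legal instruction word."""
    words = []    # each word is a list of at most `width` instructions
    placed = []   # (word index, instruction) for everything placed so far
    for instr in instrs:
        # Must come strictly after every earlier instruction it conflicts with.
        earliest = 0
        for index, prev in placed:
            if conflicts(prev, instr):
                earliest = max(earliest, index + 1)
        # Skip words that are already full.
        slot = earliest
        while slot < len(words) and len(words[slot]) >= width:
            slot += 1
        if slot == len(words):
            words.append([])
        words[slot].append(instr)
        placed.append((slot, instr))
    return words


program = [
    ("A", ("B", "C")),  # A = B op C
    ("C", ("A", "D")),  # depends on the first instruction
    ("E", ("F", "G")),  # independent of both
    ("H", ("E", "F")),  # depends on the third instruction
]
words = pack(program, width=2)
print(len(words))  # 2 instruction words instead of 4 separate instructions
for word in words:
    print(word)
```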


EPIC

• Combines aspects of both VLIW and SuperScalar.

• Intel/HP are developing a new 64-bit architecture for desktop processors (IA-64)

– The first processor using this architecture is called Itanium, shipping soon (?). Used to be known as Merced.

• Relies upon very good compilers packaging several instructions that can run in parallel into a single instruction.