Manish Kulkarni
Department of Electrical and Computer Engineering
Auburn University, Auburn, AL 36849
mmk0002@auburn.edu

4/28/2008
Computer Architecture & Design (6200) Class Presentation


Overview

o What is CISC and why learn it?
o History
o Architecture
o Typical x86 design
o Characteristics & addressing modes
o CISC vs RISC
o Example programs
o The performance equation
o FAQs
o Recent developments & future scope
o Resources
o Questions


What is CISC?

Definition: CISC (pronounced "sisk") stands for Complex Instruction Set Computer. It is a microprocessor architecture that aims to accomplish complex operations with single instructions, favoring the richness of the instruction set (typically as many as 200 unique instructions) over the speed with which individual instructions are executed.


Why should I know about CISC?

• Today's computers still use processors that are based on CISC designs.
• CISC has been a prominent architecture since 1978, when the first x86 microprocessors were introduced.
• Most emerging processor designs combine features of CISC and RISC to create better designs.


History

| Generation | First introduced | Prominent consumer CPU brands | Linear / physical address space | Notable (new) features |
|---|---|---|---|---|
| 1 (IA-16) | 1978 | Intel 8086, Intel 8088 | 16-bit / 20-bit (segmented) | first x86 microprocessors |
| 2 | 1982 | Intel 80186, Intel 80188, NEC V20 | see above | hardware for fast address calculations, fast mul/div, etc. |
| 2 | 1982 | Intel 80286 | 16-bit (30-bit virtual) / 24-bit (segmented) | MMU, for protected mode and a larger address space |
| 3 (IA-32) | 1985 | Intel386, AMD Am386 | 32-bit (46-bit virtual) / 32-bit | 32-bit instruction set, MMU with paging |
| 4 | 1989 | Intel486 | see above | RISC-like pipelining, integrated FPU, on-chip cache |
| 5 | 1993 | Pentium, Pentium MMX | see above | superscalar, 64-bit databus, faster FPU, MMX |
| 5/6 | 1996 | Cyrix 6x86, Cyrix MII | see above | register renaming, speculative execution |
| 6 | 1995 | Pentium Pro, AMD K5 | see above / 36-bit physical (PAE) | μ-op translation, PAE (not K5), integrated L2 cache (not K5) |
| 6 | 1997 | AMD K6/-2/3, Pentium II/III | see above | L3-cache support, 3DNow!, SSE |
| 7 | 1999 | Athlon, Athlon XP | see above | superscalar FPU, wide design (up to three x86 instr./clock) |
| 7 | 2000 | Pentium 4 | see above | deeply pipelined, high frequency, SSE2, hyper-threading |
| 6/7-M | 2003 | Pentium M | see above | optimized for low power |
| 8 (x86-64) | 2003 | Athlon 64 | 64-bit / 40-bit physical in first impl. | x86-64 instruction set, on-die memory controller |
| 8 | 2004 | Prescott | see above | very deeply pipelined, very high frequency, SSE3 |
| 9 | 2006 | Intel Core, Intel Core 2 | see above (some are 32-bit only) | low power, multi-core, lower clock frequency |
| 10 | 2007-2008 | AMD Phenom | see above | monolithic quad-core, 128-bit FPUs, SSE4a, HyperTransport 3, native memory controller, on-die L3 cache |


A typical x86 architecture

[Figure: Intel 8086 architecture, the first member of the x86 family]


Characteristics

o CISC processors mostly follow the Von Neumann architecture (there are a few exceptions).
o The same bus serves program memory, data memory, I/O, registers, etc.
o Instructions are generally microcoded and variable-length.
o Segmentation is possible using segment registers such as DS and ES together with an offset, which can be common to all segments.
o Many powerful instructions are supported, making the assembly language programmer's job much easier.
o Physical memory extension is possible.

Addressing modes

o Register addressing mode
o Memory addressing modes
o Displacement-only addressing mode
o Register indirect addressing modes
o Indexed addressing modes
o Based indexed addressing modes
o Based indexed plus displacement addressing
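The memory addressing modes listed above differ only in which components contribute to the effective address. The following is a minimal Python sketch, assuming 8086-style 16-bit offsets and the real-mode rule that the physical address is the segment shifted left by four bits plus the offset. The function names and register values are illustrative and do not come from the slides.

```python
def effective_address(base=0, index=0, displacement=0):
    """Combine the optional components of an x86 memory operand.

    Displacement only:             effective_address(displacement=d)
    Register indirect:             effective_address(base=bx)
    Indexed:                       effective_address(index=si, displacement=d)
    Based indexed:                 effective_address(base=bx, index=si)
    Based indexed + displacement:  effective_address(base=bx, index=si, displacement=d)
    """
    return (base + index + displacement) & 0xFFFF  # offsets wrap at 16 bits


def physical_address(segment, offset):
    """8086 real mode: shift the segment left by 4 bits and add the offset."""
    return ((segment << 4) + offset) & 0xFFFFF  # 20-bit address bus


# Hypothetical register contents, chosen only for illustration.
DS, BX, SI = 0x1000, 0x0200, 0x0010

offset = effective_address(base=BX, index=SI, displacement=0x0004)
print(hex(offset))                        # 0x214
print(hex(physical_address(DS, offset)))  # 0x10214
```

Register addressing is not covered by the sketch because it involves no memory address at all: the operand is the register itself.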


[Figure: block diagram with main memory, general purpose registers, and the ALU]


Consider the following multiplication task.

Operands:

M[2:3] = operand 1 (15)
M[5:2] = operand 2 (20)

Task: Multiplication

Result:

M[2:3] <= result


The CISC Approach

Instruction:

MULT 2:3, 5:2

Operations:

1. Loads the two operands into separate registers
2. Multiplies the operands in the execution unit
3. Stores the product in a temporary register
4. Stores the value back to memory location 2:3


• MULT is what is known as a "complex instruction."
• It operates directly on the computer's memory banks.
• It does not require the programmer to explicitly call any loading or storing functions.
• It closely resembles a command in a higher-level language, e.g. the C statement "a = a * b."
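To make the hidden work inside a single complex instruction concrete, here is a small Python sketch of a memory-to-memory MULT. It assumes memory can be modelled as a dictionary keyed by (row, column); the function `mult` and that layout are hypothetical and serve only to mirror the four internal operations listed for the CISC approach.

```python
# Memory is modelled as a dict keyed by (row, column); this layout is an assumption.
memory = {(2, 3): 15, (5, 2): 20}


def mult(memory, dest, src):
    """Hypothetical memory-to-memory MULT: one instruction, four internal steps."""
    reg_a = memory[dest]    # 1. load the first operand into a register
    reg_b = memory[src]     #    load the second operand into another register
    temp = reg_a * reg_b    # 2-3. multiply and hold the product in a temporary
    memory[dest] = temp     # 4. store the product back to the destination


mult(memory, (2, 3), (5, 2))  # MULT 2:3, 5:2
print(memory[(2, 3)])         # 300
```

The caller issues one operation and never names a register, which is the property the slide attributes to complex instructions.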


The RISC Approach

Instructions:

LW A, 2:3
LW B, 5:2
MULT A, B
SW 2:3, A

Operations:

1. Load operand 1 into register A
2. Load operand 2 into register B
3. Multiply the operands in the execution unit and store the result in A
4. Store the value of A back to memory location 2:3

• This set of instructions is known as "reduced instructions."
• These instructions cannot operate directly on the computer's memory banks.
• The programmer must explicitly call the loading and storing functions.
• RISC processors use only simple instructions that can be executed within one clock cycle.
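The four-instruction sequence can be traced with a toy load/store interpreter. This is a sketch in Python, assuming a dictionary-based memory and register file; the `run` function and the tuple encoding of instructions are illustrative, not part of any real ISA.

```python
def run(program, memory):
    """Execute (opcode, operand1, operand2) tuples on a toy load/store machine."""
    registers = {}
    for op, x, y in program:
        if op == "LW":       # LW reg, addr: load a memory word into a register
            registers[x] = memory[y]
        elif op == "MULT":   # MULT ra, rb: ra <- ra * rb, registers only
            registers[x] = registers[x] * registers[y]
        elif op == "SW":     # SW addr, reg: store a register to memory
            memory[x] = registers[y]
        else:
            raise ValueError(f"unknown opcode: {op}")
    return len(program)      # number of instructions executed


memory = {(2, 3): 15, (5, 2): 20}
program = [
    ("LW", "A", (2, 3)),
    ("LW", "B", (5, 2)),
    ("MULT", "A", "B"),
    ("SW", (2, 3), "A"),
]
count = run(program, memory)
print(memory[(2, 3)], count)  # 300 4
```

Only LW and SW touch memory here, while MULT works purely on registers. The result matches the single-instruction version, at the cost of four instructions instead of one.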


| CISC | RISC |
|---|---|
| Primary goal is to complete a task in as few lines of assembly as possible | Primary goal is to speed up individual instructions |
| Emphasis on hardware | Emphasis on software |
| Includes multi-clock complex instructions | Single-clock, reduced instructions only |
| Memory-to-memory: "LOAD" and "STORE" incorporated in instructions | Register-to-register: "LOAD" and "STORE" are independent instructions |
| Small code sizes | Large code sizes |
| High cycles per second | Low cycles per second |
| Variable-length instructions | Equal-length instructions, which simplify pipelining |


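The following equation is commonly used for expressing a computer's performance ability:

$$\frac{\text{time}}{\text{program}} = \frac{\text{time}}{\text{cycle}} \times \underbrace{\frac{\text{cycles}}{\text{instruction}}}_{(1)} \times \underbrace{\frac{\text{instructions}}{\text{program}}}_{(2)}$$

The CISC approach
• minimizes the number of instructions per program (2),
• sacrificing the number of cycles per instruction (1).

RISC does the opposite
• reduces the cycles per instruction (1),
• sacrificing the number of instructions per program (2).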
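The trade-off can be tried with numbers. The sketch below assumes the usual form of the performance equation: time per program = instructions per program × cycles per instruction × time per cycle. All figures are hypothetical and deliberately contrived so that the two approaches tie; they are not measurements of any processor.

```python
def execution_time(instructions, cycles_per_instruction, cycle_time_ns):
    """time/program = instructions/program * cycles/instruction * time/cycle."""
    return instructions * cycles_per_instruction * cycle_time_ns


# Hypothetical figures: same clock, opposite trade-offs.
cisc_ns = execution_time(instructions=1, cycles_per_instruction=4, cycle_time_ns=1.0)
risc_ns = execution_time(instructions=4, cycles_per_instruction=1, cycle_time_ns=1.0)
print(cisc_ns, risc_ns)  # 4.0 4.0
```

With these made-up values neither approach wins. The equation only shows which factor each design attacks, so the real comparison depends on how far each factor can be pushed in practice.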


Which one is faster?

It is commonly accepted that RISC ISAs should make computers faster, mainly because RISC processors complete more instructions in a shorter amount of time thanks to pipelining. So why isn't my computer a RISC?

• CISC ISAs were implemented in the first personal computers.
• With more people buying computers, CISC ISAs became more prominent.
• Software (especially operating systems) was developed for x86, so that personal computers "speaking" x86 could interact with their users.
• Because so much software was written for computers "speaking" x86, people continued to buy those computers.
• If we tried to switch to another ISA, we would not have all of the software choices we have now.


So why would someone want to develop another ISA?

• x86 (and CISC in general) makes poor use of the faster hardware we have now.
• People have been trying to make x86 faster for a long time, at least 20 years, and by now most of the ways to speed it up significantly have been found.

Why don't we just switch to RISC?

• Although they are not used in your desktop PC, RISC ISAs are implemented in many servers, workstations, and embedded systems.
• Designers have been trying to make RISC faster for a long time as well, and they have already found many of the areas in which it can be sped up significantly.


Where are we running into problems speeding up RISC and CISC?

We are running into problems speeding up the computer in two areas:

1. Branching: decisions and predictions consume a good amount of processing time.
2. Memory access: fetching instructions and data from memory is slow.

So what are we going to do?


o The terms RISC and CISC have become less meaningful with the continued evolution of both CISC and RISC designs and implementations.
o Modern x86 processors decode and split more complex instructions into a series of smaller internal "micro-operations," which can then be executed in a pipelined (parallel) fashion, achieving high performance on a much larger subset of instructions.
o Attempts have been made to combine features of both RISC and CISC to develop a new approach.
o Intel has teamed up with Hewlett-Packard to design a new type of ISA, called IA-64 (Intel Architecture 64).
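The micro-operation idea can be sketched as a decoder that turns one memory-to-memory instruction into register-level steps. This Python example uses the toy MULT syntax from the earlier slides. The micro-op names and the temporaries T0 and T1 are hypothetical, since real x86 micro-op formats are internal to each processor design.

```python
def decode(instruction):
    """Split a toy memory-to-memory instruction into register-level micro-ops.

    The micro-op names and the temporaries T0/T1 are hypothetical.
    """
    opcode, operands = instruction.split(maxsplit=1)
    dest, src = [part.strip() for part in operands.split(",")]
    return [
        f"LOAD T0, {dest}",    # fetch the first operand from memory
        f"LOAD T1, {src}",     # fetch the second operand from memory
        f"{opcode} T0, T1",    # register-only arithmetic
        f"STORE {dest}, T0",   # write the result back to memory
    ]


for micro_op in decode("MULT 2:3, 5:2"):
    print(micro_op)
```

The programmer still writes one instruction, while the execution core sees simple steps that resemble a RISC sequence and can be pipelined.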


What is IA-64?

• IA-64 is a new instruction set architecture.
• IA-64 seeks to address branch delays and memory latency.

What main principles is IA-64 designed around?

• IA-64 seeks to exploit instruction-level parallelism to the highest degree.
• Intel and HP call their method of exploiting this parallelism in IA-64 EPIC (Explicitly Parallel Instruction Computing).
• EPIC exposes parallelism by having the compiler find which instructions can be executed in parallel and "explicitly" package them for the CPU.

How does IA-64 help with branch delays?

• IA-64 takes a unique approach, predication, to reduce the consequences of branch delays.
• The compiler can append a predicate to any instruction it chooses. It appends predicates to instructions that depend on the outcome of a branch in order to reduce branch penalties.
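Predication can be illustrated by rewriting an if/else so that both arms are issued and a predicate decides which one takes effect. The Python sketch below is a conceptual model only: the functions are hypothetical, and the loop stands in for hardware that nullifies instructions whose predicate is false.

```python
def branching(a, b):
    """Conventional form: the processor must predict which arm runs."""
    if a > b:
        result = a - b
    else:
        result = b - a
    return result


def predicated(a, b):
    """If-converted form: both arms are issued, each guarded by a predicate."""
    p1 = a > b   # one compare produces a pair of complementary predicates
    p2 = not p1
    result = None
    # Each "instruction" is (predicate, value); it takes effect only if its predicate is true.
    for predicate, value in ((p1, a - b), (p2, b - a)):
        if predicate:  # models the hardware nullifying a false-predicate instruction
            result = value
    return result


print(branching(7, 3), predicated(7, 3))  # 4 4
print(branching(2, 9), predicated(2, 9))  # 7 7
```

In the predicated form the straight-line code has no branch left to mispredict. The cost is that instructions from both arms occupy issue slots.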


How does IA-64 deal with memory latency issues?

• Memory latency occurs because CPU processing speed is significantly faster than the speed of fetching data from memory.
• IA-64 proposes a new way to eliminate some memory latency problems: speculative loading.
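The slides do not describe how speculative loading works. The sketch below assumes the commonly described scheme, in which a load is hoisted above a branch, any fault is deferred as a marker instead of being raised, and a later check at the original load position decides whether recovery is needed. The class and function names are hypothetical and only model the idea in Python.

```python
class DeferredFault:
    """Marker standing in for a deferred-exception token on a speculative load."""


def speculative_load(memory, address):
    """Load early; on a bad address, return a marker instead of faulting."""
    return memory.get(address, DeferredFault)


def check(value, memory, address):
    """At the original load position, fault only if the speculation actually failed."""
    if value is DeferredFault:
        return memory[address]  # re-execute non-speculatively; raises KeyError if invalid
    return value


memory = {0x10: 42}

early = speculative_load(memory, 0x10)   # hoisted above the branch to hide latency
take_branch = True                       # hypothetical branch outcome
if take_branch:
    print(check(early, memory, 0x10))    # 42

unused = speculative_load(memory, 0x99)  # invalid address, but no fault is raised
print(unused is DeferredFault)           # True
```

Because the second load is never checked, its deferred fault is harmless. This is what allows a compiler to start loads before it knows whether their results will be needed.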

IA-64 Realities:

• "A study in ISCA '95 by S. Mahlke et al. demonstrated that predication removed over 50% of the branches and 40% of the mis-predicted branches from several popular benchmark programs." ( http://www.hp.com/esy/technology/ia_64/products/isapress.html )
• IA-64 is not natively compatible with the Intel x86 and HP PA-RISC architectures, so the additional compatibility logic takes a lot of die space.
• Presently, the compilers are in an experimental phase and IA-64 has no OS support.


Resources

o http://www.pctechguide.com/glossary/WordFind.php?wordInput=CISC
o http://www.cs.umd.edu/class/fall2001/cmsc411/projects/IA64/
o http://cse.stanford.edu/class/sophomore-college/projects-00/risc/risccisc/index.html
o http://en.wikipedia.org/wiki/Complex_instruction_set_computer
o http://en.wikipedia.org/wiki/X86
o http://arstechnica.com/cpu/4q99/risc-cisc/rvc-6.html
