Computer Architecture CPSC 350 E. J. Kim. Course Contents


Computer Architecture CPSC 350

E. J. Kim

Course Contents


• Organization of a computer
• Assembly language
• Design of a computer
• Verilog
• Future architectures

How does the course fit into the curriculum?

ELEN 220 Intro to Digital Design

ELEN 248 Intro to DGTL Sym Design

CPSC 350 Computer Architecture

CPSC 483 Computer Sys Design

CPSC 4xx compiler OS

Syllabus

• Two exams 50%
• Assignments and quizzes 20%
• Projects 30%

Course Information

• Labs
  • MIPS (Assembly Programming), Verilog (HDL)
• Projects
  • Project 1: MIPS
  • Projects 2 & 3: Verilog (Datapath Design)
• TA: Lei Wang
• Peer Teacher: Molly Hicks

Course Information [contd…]

• Course Webpage
  • http://faculty.cs.tamu.edu/ejkim/Courses/cpsc350
• CS Accounts
  • Use your CS account to turn in work and to check any email regarding the course

Computer Architecture

Some people define computer architecture as the combination of
• the instruction set architecture (what the executable can see of the hardware, the functional interface)
• and the machine organization (how the hardware implements the instruction set architecture)

What is Computer Architecture?

[Figure: many levels of abstraction, top to bottom: Applications; Operating System; Compiler and Firmware; Instr. Set Proc. and I/O system; Datapath & Control; Digital Design; Circuit Design; Layout. Two labels mark the levels that make up computer architecture: Instruction set architecture and Machine organization.]

Factors affecting ISA ???

[Figure: Computer Architecture at the center, influenced by Technology, Programming Languages, Operating Systems, History, Applications, and Cleverness.]

ISA: Critical Interface

[Figure: the instruction set is the interface between software and hardware.]

Examples: 80x86 50,000,000 vs. MIPS 5,500,000 ???

The Big Picture

[Figure: the five components of a computer: Control and Datapath (together the Processor), Memory, Input, and Output.]

Since 1946 all computers have had 5 components!!!

Technology Trends

• Processor
  • logic capacity: about 30% per year
  • clock rate: about 20% per year
• Memory
  • DRAM capacity: about 60% per year (4x every 3 years)
  • Memory speed: about 10% per year
  • Cost per bit: improves about 25% per year
• Disk
  • capacity: about 60% per year
  • Total use of data: 100% per 9 months!
• Network Bandwidth
  • Bandwidth increasing more than 100% per year!
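The parenthetical "4x every 3 years" follows from compounding the yearly rate. A minimal sketch, assuming simple annual compounding; the helper name `growth_factor` is hypothetical and not part of the course material:

```python
def growth_factor(annual_rate, years):
    """Total growth factor after compounding annual_rate for the given number of years."""
    return (1.0 + annual_rate) ** years

# DRAM capacity: about 60% per year, compounded over 3 years
dram_3yr = growth_factor(0.60, 3)
print(f"DRAM capacity after 3 years: {dram_3yr:.2f}x")   # 4.10x

# Memory speed: about 10% per year over the same 3 years
speed_3yr = growth_factor(0.10, 3)
print(f"Memory speed after 3 years: {speed_3yr:.2f}x")    # 1.33x
```

Under the same assumption, capacity grows about fourfold in three years while speed grows by only about a third.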

[Figure: Microprocessor Logic Density. Transistors per chip (log scale, 1,000 to 100,000,000) versus year (1965 to 2005). Labeled processors: i4004, i8086, i80286, i80386, i80486, Pentium, SU MIPS, R3010, R4400, R10000. Families: i80x86, M68K, MIPS, Alpha.]

° In ~1985 the single-chip processor (32-bit) and the single-board computer emerged

° In the 2002+ timeframe, these may well look like mainframes compared to a single-chip computer (maybe 2 chips)

DRAM chip capacity

Year  Size
1980  64 Kb
1983  256 Kb
1986  1 Mb
1989  4 Mb
1992  16 Mb
1996  64 Mb
1999  256 Mb
2002  1 Gb


Technology Trends

Smaller feature sizes – higher speed, density

ECE/CS 752; copyright J. E. Smith, 2002 (Univ. of Wisconsin)

Technology Trends

Number of transistors doubles every 18 months (amended to 24 months)
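A doubling period can be converted into an equivalent yearly growth rate, which makes the 18-month and 24-month figures easier to compare. A small sketch, assuming smooth exponential growth; the function name `annual_growth_from_doubling` is a hypothetical helper:

```python
def annual_growth_from_doubling(months_to_double):
    """Yearly growth rate implied by doubling every `months_to_double` months."""
    return 2.0 ** (12.0 / months_to_double) - 1.0

for months in (18, 24):
    rate = annual_growth_from_doubling(months)
    print(f"doubling every {months} months is about {rate:.0%} per year")
# doubling every 18 months is about 59% per year
# doubling every 24 months is about 41% per year
```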


Levels of Representation

High Level Language Program

temp = v[k];
v[k] = v[k+1];
v[k+1] = temp;

Compiler

Assembly Language Program

lw $15, 0($2)
lw $16, 4($2)
sw $16, 0($2)
sw $15, 4($2)

Assembler

Machine Language Program

0000 1001 1100 0110 1010 1111 0101 1000
1010 1111 0101 1000 0000 1001 1100 0110
1100 0110 1010 1111 0101 1000 0000 1001
0101 1000 0000 1001 1100 0110 1010 1111

Machine Interpretation

Control Signal Specification

ALUOP[0:3] <= InstReg[9:11] & MASK
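To see why the four loads and stores implement the three-line swap, the assembly can be traced against a toy memory. This is only an illustrative sketch: it assumes register $2 holds the byte address of v[k], that words are 4 bytes, and it models memory as a Python dict; the addresses and values are made up.

```python
def run_swap(memory, base):
    """Mimic the four MIPS instructions, with `base` playing the role of register $2."""
    regs = {}
    regs[15] = memory[base + 0]   # lw $15, 0($2)   load v[k]
    regs[16] = memory[base + 4]   # lw $16, 4($2)   load v[k+1]
    memory[base + 0] = regs[16]   # sw $16, 0($2)   v[k]   = old v[k+1]
    memory[base + 4] = regs[15]   # sw $15, 4($2)   v[k+1] = old v[k]

# Hypothetical layout: v[k] at byte address 100, v[k+1] at 104
memory = {100: 7, 104: 9}
run_swap(memory, base=100)
print(memory)   # {100: 9, 104: 7}
```

The temporary variable of the high-level version disappears because both values are held in registers before either store happens.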

Execution Cycle

• Instruction Fetch: Obtain instruction from program storage
• Instruction Decode: Determine required actions and instruction size
• Operand Fetch: Locate and obtain operand data
• Execute: Compute result value or status
• Result Store: Deposit results in storage for later use
• Next Instruction: Determine successor instruction
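The six steps of the execution cycle can be sketched as a tiny interpreter loop. This is a toy model, not MIPS: the two-operation instruction format `(op, dest, src1, src2)` and the register names are assumptions made up for illustration.

```python
def run(program, registers):
    """Run a list of (op, dest, src1, src2) tuples against a register dict."""
    pc = 0
    while pc < len(program):
        instruction = program[pc]                 # instruction fetch
        op, dest, src1, src2 = instruction        # instruction decode
        a, b = registers[src1], registers[src2]   # operand fetch
        if op == "add":                           # execute
            result = a + b
        elif op == "sub":
            result = a - b
        else:
            raise ValueError(f"unknown op: {op}")
        registers[dest] = result                  # result store
        pc += 1                                   # next instruction
    return registers

regs = run(
    [("add", "r3", "r1", "r2"), ("sub", "r4", "r3", "r1")],
    {"r1": 5, "r2": 7, "r3": 0, "r4": 0},
)
print(regs)   # {'r1': 5, 'r2': 7, 'r3': 12, 'r4': 7}
```

A branch instruction would change the last step: instead of `pc += 1`, the successor would be chosen from the computed result.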

What next?

• How does better technology help to improve performance?

• How can we quantify the gain?

• The MIPS architecture
  • MIPS instructions
  • First contact with assembly language programming

The Role of Performance

Performance

• Response time: time between start and finish of the task (aka execution time)

• Throughput: total amount of work done in a given time

Question

• Suppose that we replace the processor in a computer by a faster model
  • Does this improve the response time?
  • How about the throughput?

Question

• Suppose we add an additional processor to a system that uses separate processors for separate tasks.
  • Does this improve the response time?
  • Does this improve the throughput?

CPU Performance

• CPU execution time for a program
  = CPU clock cycles for a program × clock cycle time
  = CPU clock cycles for a program / clock rate

CPU Performance Example

• A program runs in 10 seconds on computer A, which has a 4 GHz clock. We want to build computer B so that it runs this program in 6 seconds, but computer B requires 1.2 times as many clock cycles as computer A for this program. What clock rate should we target?
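One way to work this example is to apply execution time = clock cycles / clock rate twice: first to find the cycle count on A, then to solve for the clock rate of B. A sketch, with the hypothetical helper name `required_clock_rate`:

```python
def required_clock_rate(time_a, clock_a, cycle_ratio, time_b):
    """Clock rate (Hz) machine B needs to finish in time_b seconds."""
    cycles_a = time_a * clock_a          # cycles = time x clock rate
    cycles_b = cycle_ratio * cycles_a    # B needs 1.2x as many cycles
    return cycles_b / time_b             # clock rate = cycles / time

rate_b = required_clock_rate(time_a=10.0, clock_a=4e9, cycle_ratio=1.2, time_b=6.0)
print(f"Target clock rate for B: {rate_b / 1e9:.1f} GHz")   # 8.0 GHz
```

Computer A executes 40 × 10^9 cycles, so B executes 48 × 10^9 cycles, and finishing those in 6 seconds takes a clock of 8 GHz, twice that of A.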

CPU Performance

• CPU clock cycles = instruction count × CPI

• CPU time = instruction count × CPI × clock cycle time

[Figure: the three factors of CPU time: inst count, CPI, cycle time.]

Cycles Per Instruction (Throughput)

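“Average Cycles per Instruction”

CPI = (CPU Time * Clock Rate) / Instruction Count = Cycles / Instruction Count

$\text{CPU time} = \text{Cycle Time} \times \sum_{j=1}^{n} \text{CPI}_j \times I_j$

“Instruction Frequency”

$\text{CPI} = \sum_{j=1}^{n} \text{CPI}_j \times F_j \quad \text{where } F_j = \frac{I_j}{\text{Instruction Count}}$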

Example: Calculating CPI

Typical mix of instruction types in program

Base Machine (Reg / Reg)

Op      Freq  Cycles  CPI(i)  (% Time)
ALU     50%   1       .5      (33%)
Load    20%   2       .4      (27%)
Store   10%   2       .2      (13%)
Branch  20%   2       .4      (27%)
Total                 1.5
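The table can be reproduced by weighting each instruction class's cycle count by its frequency and summing. A minimal sketch; the dictionary layout and the function names `average_cpi` and `time_shares` are illustrative choices, while the numbers come from the table:

```python
# Instruction mix from the table: op -> (frequency, cycles)
mix = {"ALU": (0.50, 1), "Load": (0.20, 2), "Store": (0.10, 2), "Branch": (0.20, 2)}

def average_cpi(mix):
    """Average CPI: sum over classes of frequency x cycles."""
    return sum(freq * cycles for freq, cycles in mix.values())

def time_shares(mix):
    """Fraction of execution time spent in each instruction class."""
    total = average_cpi(mix)
    return {op: freq * cycles / total for op, (freq, cycles) in mix.items()}

cpi = average_cpi(mix)
shares = time_shares(mix)
print(f"CPI = {cpi:.1f}")                       # CPI = 1.5
for op, share in shares.items():
    print(f"{op}: {share:.0%} of time")         # ALU 33%, Load 27%, Store 13%, Branch 27%
```

Note that ALU operations are half of all instructions but only a third of the time, because each one takes a single cycle.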

Performance

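(Absolute) Performance

$\text{Performance}_X = \frac{1}{\text{Execution time}_X}$

Relative Performance

"X is n times faster than Y" means

$n = \frac{\text{Performance}_X}{\text{Performance}_Y} = \frac{\text{Execution time}_Y}{\text{Execution time}_X}$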
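Taking performance as the reciprocal of execution time (the usual textbook definition, assumed here), "n times faster" reduces to a ratio of execution times. A short sketch with made-up timings; the function names are hypothetical:

```python
def performance(execution_time):
    """Performance defined as 1 / execution time (assumed definition)."""
    return 1.0 / execution_time

def times_faster(time_x, time_y):
    """n such that X is n times faster than Y."""
    return performance(time_x) / performance(time_y)

# Hypothetical timings: X runs the program in 10 s, Y in 15 s
n = times_faster(10.0, 15.0)
print(f"X is {n:.1f} times faster than Y")   # X is 1.5 times faster than Y
```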

MIPS as a performance measure

• MIPS (million instructions per second) = Instruction count / (Execution time × 10^6)

Ex P. 269
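As a quick illustration of the MIPS formula, the sketch below plugs in made-up numbers (they are not taken from the example on p. 269); the function name `mips_rating` is hypothetical:

```python
def mips_rating(instruction_count, execution_time_s):
    """MIPS = instruction count / (execution time x 10^6)."""
    return instruction_count / (execution_time_s * 1e6)

# Hypothetical program: 8 billion instructions executed in 4 seconds
rating = mips_rating(8e9, 4.0)
print(f"{rating:.0f} MIPS")   # 2000 MIPS
```

The rating counts instructions only, so two machines with different instruction sets can have the same MIPS value and still take different times for the same program.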

Amdahl’s Law

The execution time after making an improvement to the system is given by

Exec time after improvement = I/A + E

I = execution time affected by improvement
A = amount of improvement
E = execution time unaffected

Amdahl’s Law

Suppose that a program runs for 100 seconds on a machine and multiplication instructions take 80% of the total time. How much do I have to improve the speed of multiplication if I want my program to run 5 times faster?

Running 5 times faster means a target of 100/5 = 20 seconds:

20 seconds = 80 seconds/n + 20 seconds => 80 seconds/n = 0, which no finite n satisfies: it is impossible!
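The same reasoning can be checked numerically by solving Exec time after improvement = I/A + E for A. A sketch under the slide's numbers (I = 80 s, E = 20 s); the function names are hypothetical, and `None` is used here to signal an unreachable target:

```python
def time_after_improvement(affected, unaffected, amount):
    """Amdahl's law: I / A + E."""
    return affected / amount + unaffected

def required_improvement(affected, unaffected, target_time):
    """Improvement factor A needed to reach target_time, or None if unreachable."""
    remaining = target_time - unaffected   # time budget left for the affected part
    if remaining <= 0:
        return None                        # the unaffected part alone uses up the target
    return affected / remaining

print(required_improvement(80.0, 20.0, 20.0))   # 5x faster overall: None (impossible)
print(required_improvement(80.0, 20.0, 25.0))   # 4x faster overall: 16.0
```

A 4x overall speedup is still reachable, but only by making multiplication 16 times faster; the 20 seconds of unaffected time caps the overall speedup just below 5x.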
