Chapter_1 DTT 210

8/8/2019 Chapter_1 DTT 210

http://slidepdf.com/reader/full/chapter1-dtt-210 1/55

DTT 201: Computer Architecture

Lecture 01

Subject Outline

Name: HASLIZA HASHIM

Email: [email protected]

HP No: 019-3426312

Content & Learning Objectives

Content

- Historical development of computer architecture

- Computer internal organisation, data representation methods and operation of a microprocessor-based system, including specification of hardware

Learning Objectives

- Able to relate computer organisation & architecture to contemporary computer design issues

Assessment

This subject has the following assessment components.

Assessment Item    Percentage of Final Mark
Test               10%
Assignment         30%
Quiz               20%
Examination        40%

Textbook and other Resources

William Stallings, Computer Organisation and Architecture: Designing for Performance. Prentice Hall, 2000

Englander, Irv, The Architecture of Computer Hardware and System Software. Wiley, 2000

K.F. Ibrahim, PC Operation and Repair, Prentice Hall, 2002

www.intel.com

www.ibm.com

www.pcguide.com

DTT 201: Computer Architecture

Chapter 1

Introduction

Architecture & Organization 1

Architecture is those attributes visible to the programmer

- Instruction set, number of bits used for data representation, I/O mechanisms, addressing techniques.

- e.g. will the computer have a multiply instruction?

Organization is how features are implemented

- Control signals, interfaces, memory technology.

- e.g. will the multiply instruction be implemented by a multiply unit, or by repeated addition?
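
The multiply example can be sketched in software. This is only an illustration, not how any particular processor is built: both functions below present the same "architecture" (a multiply operation with the same result), while their internals stand in for two different "organizations". The function names are made up for this sketch, and it handles non-negative integers only.

```python
def multiply_repeated_addition(a: int, b: int) -> int:
    """Multiply by adding a to a running total b times."""
    if a < 0 or b < 0:
        raise ValueError("this sketch handles non-negative operands only")
    result = 0
    for _ in range(b):
        result += a
    return result


def multiply_shift_add(a: int, b: int) -> int:
    """Multiply with shifts and adds, one step per bit of b."""
    if a < 0 or b < 0:
        raise ValueError("this sketch handles non-negative operands only")
    result = 0
    while b:
        if b & 1:        # lowest bit of b set: add the shifted a
            result += a
        a <<= 1          # a doubles at each bit position
        b >>= 1          # move on to the next bit of b
    return result


# Same visible behaviour, different implementation underneath.
print(multiply_repeated_addition(6, 7), multiply_shift_add(6, 7))
```

A program calling either function cannot tell them apart except by speed, which is the point of separating architecture from organization.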

Microprocessor 

Function of Microprocessor?

A microprocessor incorporates most or all of the functions of a computer's central processing unit (CPU) on a single integrated circuit (IC, or microchip).

Architecture & Organization 2

All members of the Intel x86 family share the same basic architecture

All members of the IBM System/370 family share the same basic architecture

This gives code compatibility

- At least backwards

Organization differs between different versions

Structure & Function

Structure is the way in which components relate to each other

Function is the operation of individual components as part of the structure

Function

All computer functions are:

- Data processing

- Data storage

  - Every computer must be able to store data, which is either written to or read from storage

- Data movement

  - Moves data between itself & the outside world

  - Computer acts as either source or destination

- Control

Functional View

Operations (a) Data movement

Operations (b) Storage

Operation (c) Processing from/to storage

Operation (d) Processing from storage to I/O

Structure - Top Level

[Figure: top-level structure of the computer, showing the Central Processing Unit, Main Memory, Input/Output and Systems Interconnection inside the Computer, with Peripherals and Communication lines outside]

Structure - The CPU

[Figure: structure of the CPU, showing the Arithmetic and Logic Unit, Control Unit, Registers and Internal CPU Interconnection inside the CPU, which is linked to I/O and Memory by the System Bus]

Structure - The Control Unit

[Figure: structure of the Control Unit, showing Sequencing Logic, Control Unit Registers and Decoders, and Control Memory inside the Control Unit, which is linked to the ALU and Registers by the Internal Bus]

ENIAC - background

Electronic Numerical Integrator And Computer

Eckert and Mauchly

University of Pennsylvania

Trajectory tables for weapons

Started 1943

Finished 1946

- Too late for war effort

Used until 1955

ENIAC - details

Decimal (not binary)

20 accumulators of 10 digits

Programmed manually by switches

18,000 vacuum tubes

30 tons

About 1,500 square feet of floor space

140 kW power consumption

5,000 additions per second

ENIAC - cont

Vacuum tubes

von Neumann/Turing

Stored Program concept

Main memory storing programs and data

ALU operating on binary data

Control unit interpreting instructions from memory and executing them

Input and output equipment operated by control unit

Princeton Institute for Advanced Studies

- IAS

Completed 1952
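
The stored program concept can be illustrated with a toy machine. This is a minimal sketch, not the IAS instruction set: the four opcodes, the `(opcode, address)` tuple encoding and the `run` function are all invented for illustration. The point is that one memory holds both the program and its data, and a control loop fetches, decodes and executes instructions from it.

```python
# Hypothetical opcodes for a toy accumulator machine (not the real IAS set).
HALT, LOAD, ADD, STORE = 0, 1, 2, 3


def run(memory: list) -> list:
    """Fetch, decode and execute instructions held in the same memory as the data."""
    pc = 0    # program counter: address of the next instruction
    acc = 0   # accumulator: holds the working value
    while True:
        opcode, address = memory[pc]   # fetch the instruction at pc
        pc += 1
        if opcode == LOAD:
            acc = memory[address]
        elif opcode == ADD:
            acc += memory[address]
        elif opcode == STORE:
            memory[address] = acc
        elif opcode == HALT:
            return memory
        else:
            raise ValueError(f"unknown opcode {opcode}")


# Addresses 0-3 hold the program, addresses 4-6 hold data.
memory = [(LOAD, 4), (ADD, 5), (STORE, 6), (HALT, 0), 2, 3, 0]
run(memory)
print(memory[6])  # the sum of the values at addresses 4 and 5
```

Changing the program means changing the contents of memory, not rewiring switches as on ENIAC.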

Structure of von Neumann machine

IAS - details

Memory: 1000 storage locations x 40-bit words

- Binary number

- Number word: a sign bit & 39-bit value

- Instruction word: 2 x 20-bit instructions

Set of registers (storage in CPU)

- Memory Buffer Register

- Memory Address Register

- Instruction Register

- Instruction Buffer Register

- Program Counter

- Accumulator

- Multiplier Quotient
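
The two word formats can be unpacked with a few bit operations. This is a sketch under stated assumptions: the slide gives only "2 x 20-bit instructions" and "a sign bit & 39-bit value", so the split of each 20-bit instruction into an 8-bit opcode and a 12-bit address, and the placement of the sign in the most significant bit, are assumptions here. The function names are made up for this sketch.

```python
WORD_BITS = 40


def decode_instruction_word(word: int):
    """Split a 40-bit word into (opcode, address) pairs for its left and right instructions."""
    if not 0 <= word < (1 << WORD_BITS):
        raise ValueError("word must fit in 40 bits")
    left = (word >> 20) & 0xFFFFF    # upper 20 bits
    right = word & 0xFFFFF           # lower 20 bits

    def split(instruction: int):
        # Assumed layout: 8-bit opcode followed by 12-bit address.
        return (instruction >> 12) & 0xFF, instruction & 0xFFF

    return split(left), split(right)


def decode_number_word(word: int):
    """Split a 40-bit word into (sign_bit, 39-bit value)."""
    if not 0 <= word < (1 << WORD_BITS):
        raise ValueError("word must fit in 40 bits")
    return (word >> 39) & 1, word & ((1 << 39) - 1)


# Left instruction: opcode 1, address 100. Right instruction: opcode 5, address 101.
word = (1 << 32) | (100 << 20) | (5 << 12) | 101
print(decode_instruction_word(word))
```

A 12-bit address field is more than enough to reach all 1000 storage locations.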

Structure of IAS - detail

Commercial Computers

1947 - Eckert-Mauchly Computer Corporation

UNIVAC I (Universal Automatic Computer)

US Bureau of Census 1950 calculations

Became part of Sperry-Rand Corporation

Late 1950s - UNIVAC II

- Faster

- More memory

IBM

Punched-card processing equipment

1953 - the 701

- IBM's first stored program computer

- Scientific calculations

1955 - the 702

- Business applications

Led to 700/7000 series

Transistors

Replaced vacuum tubes

Smaller

Cheaper

Less heat dissipation

Solid State device

Made from Silicon (Sand)

Invented 1947 at Bell Labs

William Shockley et al.

Transistor Based Computers

Second generation machines

NCR & RCA produced small transistor machines

IBM 7000

DEC - 1957

- Produced PDP-1

Microelectronics

Literally - "small electronics"

A computer is made up of gates, memory cells and interconnections

These can be manufactured on a semiconductor, e.g. a silicon wafer

Generations of Computer 

Vacuum tube - 1946-1957

Transistor - 1958-1964

Small scale integration - 1965 on

- Up to 100 devices on a chip

Medium scale integration - to 1971

- 100-3,000 devices on a chip

Large scale integration - 1971-1977

- 3,000-100,000 devices on a chip

Very large scale integration - 1978-1991

- 100,000-100,000,000 devices on a chip

Ultra large scale integration - 1991 on

- Over 100,000,000 devices on a chip

Moore's Law

Increased density of components on chip

Gordon Moore - co-founder of Intel

Number of transistors on a chip will double every year

Since the 1970s development has slowed a little

- Number of transistors doubles every 18 months

Cost of a chip has remained almost unchanged

Higher packing density means shorter electrical paths, giving higher performance

Smaller size gives increased flexibility

Reduced power and cooling requirements

Fewer interconnections increase reliability
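
The difference between the two doubling rates is easy to see with a small calculation. This is a rough sketch of idealised exponential growth, not a model of any real product line; the function name and the starting count of 1,000 transistors are arbitrary choices for illustration.

```python
def projected_transistors(initial_count: float, years: float,
                          doubling_period_years: float = 1.5) -> float:
    """Project a transistor count assuming one doubling per doubling period."""
    return initial_count * 2 ** (years / doubling_period_years)


start = 1_000  # arbitrary starting count, for illustration only
for years in (3, 6, 9):
    yearly = projected_transistors(start, years, doubling_period_years=1.0)
    slower = projected_transistors(start, years, doubling_period_years=1.5)
    print(f"after {years} years: doubling yearly {yearly:,.0f}, every 18 months {slower:,.0f}")
```

After only a few years the two curves differ by a large factor, which is why the 18-month figure is described as a slowdown.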

Growth in CPU Transistor Count

IBM 360 series

1964

Replaced (& not compatible with) 7000 series

First planned "family" of computers

- Similar or identical instruction sets

- Similar or identical O/S

- Increasing speed

- Increasing number of I/O ports (i.e. more terminals)

- Increased memory size

- Increased cost

DEC PDP-8

1964

First minicomputer (after miniskirt!)

Did not need air conditioned room

Small enough to sit on a lab bench

$16,000

- $100k+ for IBM 360

Embedded applications & OEM

BUS STRUCTURE - Omnibus (96 separate signal paths to carry control, address and data signals)

DEC - PDP-8 Bus Structure

Intel

1971 - 4004

- First microprocessor

- All CPU components on a single chip

- 4 bit

Followed in 1972 by 8008

- 8 bit

- Both designed for specific applications

1974 - 8080

- Intel's first general purpose microprocessor

Techniques built into processor 

Branch prediction

- Predicts which branches of instructions are likely to be processed

- Buffers pre-fetched instructions

Data flow analysis

- Creates an optimised schedule of instructions that depend on each other's results

- Prevents delay

Speculative execution

- Executes instructions in advance and holds the results in temporary locations

- Keeps execution engines busy by executing instructions that are likely to be needed
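
Branch prediction can be illustrated with a small simulation. The slides do not say which prediction scheme is meant, so this sketch uses a two-bit saturating counter, one common textbook scheme, purely as an example. The function name and the choice of starting state are assumptions.

```python
def simulate_two_bit_predictor(outcomes, initial_state: int = 2) -> int:
    """Count correct predictions made by a two-bit saturating counter.

    States 0 and 1 predict "not taken"; states 2 and 3 predict "taken".
    """
    state = initial_state
    correct = 0
    for taken in outcomes:
        predicted_taken = state >= 2
        if predicted_taken == taken:
            correct += 1
        # Move towards 3 on a taken branch, towards 0 otherwise.
        state = min(state + 1, 3) if taken else max(state - 1, 0)
    return correct


# A loop-closing branch: taken nine times, then falls through once.
loop_branch = [True] * 9 + [False]
hits = simulate_two_bit_predictor(loop_branch)
print(f"{hits} of {len(loop_branch)} predictions correct")
```

For a loop like this, only the final exit is mispredicted, so the processor can keep fetching along the likely path almost all of the time.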

Performance Balance

Processor speed increased

Memory capacity increased

Memory speed lags behind processor speed

Logic and Memory Performance Gap

Solutions

Increase number of bits retrieved at one time

- Make DRAM "wider" rather than "deeper"

Change DRAM interface

- Include cache

Reduce frequency of memory access

- More complex cache and cache on chip

Increase interconnection bandwidth

- High speed buses

- Hierarchy of buses
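
Two of these ideas, caching and fetching more at one time, can be seen in a small simulation. This is a minimal sketch of a direct-mapped cache, chosen only because it is the simplest design to model; the function name, line count and block sizes are illustrative assumptions and do not describe any particular machine.

```python
def count_hits(addresses, num_lines: int = 4, block_size: int = 4) -> int:
    """Count cache hits for a direct-mapped cache over a sequence of addresses."""
    lines = [None] * num_lines          # each line remembers which block it holds
    hits = 0
    for address in addresses:
        block = address // block_size   # block that contains this address
        index = block % num_lines       # the one line this block may occupy
        if lines[index] == block:
            hits += 1                   # served from cache, no memory access
        else:
            lines[index] = block        # miss: fetch the whole block from memory
    return hits


sequential = range(16)
print(count_hits(sequential, block_size=4))  # 4 misses, one per block
print(count_hits(sequential, block_size=8))  # wider fetches, fewer misses
```

Each miss brings in a whole block, so retrieving more per access reduces how often the slow main memory has to be visited.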

Approaches to increase processor speed

Increase hardware speed of processor

- Fundamentally due to shrinking logic gate size

  - More gates, packed more tightly, increasing clock rate

  - Propagation time for signals reduced

Increase size and speed of caches

- Dedicating part of processor chip

  - Cache access times drop significantly

Change processor organization and architecture

- Increase effective speed of execution

- Parallelism

Problems from Clock Speed and Logic Density

Power

- Power density increases with density of logic and clock speed

- Dissipating heat

RC delay

- Speed at which electrons flow is limited by resistance and capacitance of the metal wires connecting the transistors

- Delay increases as RC product increases

- Wire interconnects thinner, increasing resistance

- Wires closer together, increasing capacitance

Memory latency

- Memory speeds lag processor speeds

Solution:

- More emphasis on organizational and architectural approaches to improve performance

Strategies to increase performance

Strategy 1: Increased Cache Capacity

- Typically two or three levels of cache between processor and main memory

- Chip density increased

  - More cache memory on chip

  - Faster cache access

- Pentium chip devoted about 10% of chip area to cache

- Pentium 4 devotes about 50%

Strategies to increase performance - cont

Strategy 2: More Complex Execution Logic

- Enable parallel execution of instructions

- Pipeline works like assembly line

  - Different stages of execution of different instructions at same time along pipeline

- Superscalar allows multiple pipelines within single processor

  - Instructions that do not depend on one another can be executed in parallel
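
The assembly-line effect of a pipeline can be put into numbers. This is an idealised sketch: it assumes every stage takes exactly one clock cycle and that there are no stalls, branches or dependencies between instructions, which real pipelines do not achieve. The function names are made up for this illustration.

```python
def unpipelined_cycles(num_instructions: int, num_stages: int) -> int:
    """Each instruction passes through every stage before the next one starts."""
    return num_instructions * num_stages


def pipelined_cycles(num_instructions: int, num_stages: int) -> int:
    """The first instruction fills the pipeline, then one instruction finishes per cycle."""
    if num_instructions == 0:
        return 0
    return num_stages + (num_instructions - 1)


n, k = 100, 5
slow = unpipelined_cycles(n, k)
fast = pipelined_cycles(n, k)
print(f"{slow} cycles without pipelining, {fast} with, speedup {slow / fast:.2f}x")
```

As the number of instructions grows, the speedup approaches the number of stages, which is why deeper pipelines were attractive.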

Diminishing Returns

Internal organization of processors is exceedingly complex

- Further gains in this direction are small

Benefits from cache are reaching limit

Increasing clock rate runs into power dissipation problem

- Some fundamental physical limits are being reached

New Approach - Multiple Cores

Multiple processors on single chip

- With large shared cache

Within a processor, increase in performance is proportional to square root of increase in complexity

If software can use multiple processors, doubling number of processors almost doubles performance

So, use two simpler processors on the chip rather than one more complex processor

With two processors, larger caches are justified

- Power consumption of memory logic is less than that of processing logic

Example: IBM POWER4

- Two cores based on PowerPC
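
The trade-off between one complex core and two simple cores can be worked through numerically. This is a back-of-the-envelope sketch built only on the square-root rule stated above; the function names and the `parallel_efficiency` parameter (how close software gets to ideal scaling across cores) are hypothetical and introduced just for this illustration.

```python
import math


def single_core_performance(complexity: float) -> float:
    """Relative performance of one core, using the square-root rule."""
    return math.sqrt(complexity)


def multi_core_performance(cores: int, complexity_per_core: float,
                           parallel_efficiency: float = 1.0) -> float:
    """Relative performance of several identical cores running parallel software."""
    return cores * parallel_efficiency * math.sqrt(complexity_per_core)


# Spend a doubled complexity budget in two different ways.
one_complex = single_core_performance(2.0)
two_simple = multi_core_performance(2, 1.0, parallel_efficiency=0.9)
print(f"one core, twice as complex: {one_complex:.2f}x")
print(f"two simple cores at 90% efficiency: {two_simple:.2f}x")
```

With the same budget, the two simpler cores come out ahead whenever the software can keep both busy.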

Pentium Evolution (1)

8080

- first general purpose microprocessor

- 8 bit data path

- Used in first personal computer - Altair

8086

- much more powerful

- 16 bit

- instruction cache that prefetches a few instructions

- 8088 (8 bit external bus) used in first IBM PC

80286

- 16 Mbyte memory addressable

- up from 1 Mbyte

80386

- 32 bit

- Support for multitasking
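
The jump from 1 Mbyte to 16 Mbyte follows from the number of address bits. The slide does not give the address widths, so the values used here (20 bits for the 8086 and 24 bits for the 80286) are added for illustration, and the function name is made up for this sketch.

```python
def addressable_bytes(address_bits: int) -> int:
    """Number of distinct byte addresses reachable with the given address width."""
    return 2 ** address_bits


MBYTE = 2 ** 20
for name, bits in (("8086", 20), ("80286", 24)):
    print(f"{name}: {bits} address bits, {addressable_bytes(bits) // MBYTE} Mbyte")
```

Each extra address bit doubles the reachable memory, so four more bits give sixteen times as much.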

Pentium Evolution (2)

80486

- sophisticated powerful cache and instruction pipelining

- built in maths co-processor

Pentium

- Superscalar

- Multiple instructions executed in parallel

Pentium Pro

- Increased superscalar organization

- Aggressive register renaming

- branch prediction

- data flow analysis

- speculative execution

Pentium Evolution (3)

Pentium II

- MMX technology

- graphics, video & audio processing

Pentium III

- Additional floating point instructions for 3D graphics

Pentium 4

- Note Arabic rather than Roman numerals

- Further floating point and multimedia enhancements

Itanium

- 64 bit

- see chapter 15

Itanium 2

- Hardware enhancements to increase speed

See Intel web pages for detailed information on processors

PowerPC

1975, 801 minicomputer project (IBM) RISC

Berkeley RISC I processor

1986, IBM commercial RISC workstation product, RT PC.

- Not commercial success

- Many rivals with comparable or better performance

1990, IBM RISC System/6000

- RISC-like superscalar machine

- POWER architecture

IBM alliance with Motorola and Apple

- Resulted in implementing PowerPC architecture

PowerPC architecture derived from POWER architecture

Result from PowerPC architecture

- Superscalar RISC

- Apple Macintosh

- Embedded chip applications

PowerPC Family (1)

601:

- Quickly to market. 32-bit machine

603:

- Low-end desktop and portable

- 32-bit

- Comparable performance with 601

- Lower cost and more efficient implementation

604:

- Desktop and low-end servers

- 32-bit machine

- Much more advanced superscalar design

- Greater performance

620:

- High-end servers

- 64-bit architecture

PowerPC Family (2)

740/750:

- Also known as G3

- Two levels of cache on chip

G4:

- Increases parallelism and internal speed

G5:

- Improvements in parallelism and internal speed

- 64-bit organization