Mark Franklin, S06

CS, CoE, EE 362
Digital Computers II: Architecture

• Prof. Mark Franklin: jbf@cse.wustl.edu

• Course Assistants: – Drew Frank: ajf1@cec.wustl.edu

• Required Book: “Heuring & Jordan” 2nd Edition

• Optional Book: “Intro. VHDL” Yalamanchili

• Read: Academic Integrity Statement.

• Course Web Site: http://www.cse.wustl.edu/~jbf/cse362.d/cse362.html


Four Key Questions

• What components must every computer have?

• How can computers be described, specified and evaluated?

• What constitutes computer architecture (hardware, software, firmware, algorithms, etc.)?

• How does technology affect computer architecture (chip size, feature size, power, pin density, etc.)?


Essential Computer Components

• Processor: interpret/execute instructions.

• Memory: store instructions & data.

• Communication Device(s): communicate with outside world, I/O.

[Figure: Classic Computer Architecture (SISD: Single Instruction Stream-Single Data Stream). Blocks: Processor (Control Unit, ALU), Memory, Input/Output.]


Architecture Components

• INSTRUCTION SET DESIGN: Programmer-visible instruction set; algorithm, compiler, OS design; algorithmic complexity.

• HIGH LEVEL COMPONENT ORGANIZATION: Memory system, bus structure, processor design, branch handling, pipelining, execution algorithms, instructions/second, clocks/instruction.

• HARDWARE: Detailed logic design, packaging; VLSI & logic design; CAD algorithms; speed, area, power, …


[Figure: (SIMD) Single Instruction Stream – Multiple Data Stream Architecture. Blocks: four ALUs, Interconnection Network, Data Memory Unit, Program Control Unit, Program Memory, Input/Output.]


Performance Expression: Amdahl’s Law

$f_n$ = fraction of operations that must be performed sequentially; $0 \le f_n \le 1$

$p$ = number of processors present

$S$ = maximum achievable speedup:

$$S = \frac{1}{f_n + (1 - f_n)/p}$$

$E$ = efficiency:

$$E = S/p$$
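A minimal Python sketch of this slide's two expressions, speedup S = 1/(f_n + (1 - f_n)/p) and efficiency E = S/p. The function names and the 10% sequential fraction in the demo loop are illustrative assumptions, not values from the course.

```python
def amdahl_speedup(f_n: float, p: int) -> float:
    """Maximum speedup S on p processors when a fraction f_n of the work is sequential."""
    if not 0.0 <= f_n <= 1.0:
        raise ValueError("f_n must lie in [0, 1]")
    if p < 1:
        raise ValueError("p must be at least 1")
    return 1.0 / (f_n + (1.0 - f_n) / p)


def efficiency(f_n: float, p: int) -> float:
    """Efficiency E = S / p: the share of the ideal p-fold speedup actually obtained."""
    return amdahl_speedup(f_n, p) / p


if __name__ == "__main__":
    # Illustrative assumption: 10% of the operations are sequential.
    for p in (1, 4, 16, 64, 1024):
        print(f"p={p:5d}  S={amdahl_speedup(0.1, p):6.2f}  E={efficiency(0.1, p):.3f}")
```

With a fixed sequential fraction the speedup saturates near 1/f_n while the efficiency falls toward zero as p grows.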


Amdahl’s Law

It does no good to have many processors if there is not enough parallelism. What portion of a computation can be sequential if we want the processors to be used at 50 percent efficiency? ($S = p/2$)

$$\frac{p}{2} = \frac{1}{f_n + (1 - f_n)/p}$$

$$p f_n + 1 - f_n = 2$$

$$f_n = \frac{1}{p - 1}$$

To maintain a constant efficiency, the fraction of the computation devoted to sequential processing must be proportional to the inverse of the number of processors.
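The 50 percent case generalizes to any target efficiency E. From E = S/p and S = 1/(f_n + (1 - f_n)/p), we get 1/E = p f_n + 1 - f_n, so f_n = (1/E - 1)/(p - 1). The sketch below evaluates that expression; the function name is a hypothetical helper introduced here.

```python
def sequential_fraction_for_efficiency(p: int, e: float) -> float:
    """Largest sequential fraction f_n that still yields efficiency e on p processors.

    Derived from 1/E = p*f_n + 1 - f_n, i.e. f_n = (1/E - 1) / (p - 1).
    """
    if p < 2:
        raise ValueError("need at least 2 processors")
    if not 0.0 < e <= 1.0:
        raise ValueError("efficiency must lie in (0, 1]")
    return (1.0 / e - 1.0) / (p - 1)


if __name__ == "__main__":
    # At E = 0.5 the result reduces to 1 / (p - 1).
    for p in (2, 8, 64, 1024):
        f_n = sequential_fraction_for_efficiency(p, 0.5)
        print(f"p={p:5d}  f_n={f_n:.5f}  1/(p-1)={1.0 / (p - 1):.5f}")
```

For large p, 1/(p - 1) is close to 1/p, which is the sense in which the tolerable sequential fraction scales with the inverse of the processor count.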


Generalize Amdahl’s Law

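$$\text{Speedup}_{\text{overall}} = \frac{\text{ExTime}_{\text{old}}}{\text{ExTime}_{\text{new}}} = \frac{1}{(1 - \text{Fraction}_{\text{enhanced}}) + \dfrac{\text{Fraction}_{\text{enhanced}}}{\text{Speedup}_{\text{enhanced}}}}$$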

Example: “Suppose a program runs in 100 seconds on a machine. Multiply operations are responsible for 80 seconds of this time. How much do we have to improve the speed of multiplication if we want the program to run 4 times faster?”

What about 5 times faster?

PRINCIPLE: Make the common case fast!
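One way to work the example is to solve ExTime_new = (unaffected time) + (multiply time)/s for the multiply speedup s. The sketch below does this directly in seconds; the function names are hypothetical, and `math.inf` is used here to signal that no finite improvement reaches the target.

```python
import math


def overall_speedup(fraction_enhanced: float, speedup_enhanced: float) -> float:
    """Generalized Amdahl's Law: overall speedup from enhancing part of the run time."""
    return 1.0 / ((1.0 - fraction_enhanced) + fraction_enhanced / speedup_enhanced)


def required_enhancement(total_s: float, enhanced_s: float, target_speedup: float) -> float:
    """Speedup needed on the enhanced part so the whole program runs target_speedup times faster."""
    # Time left for the enhanced part once the unaffected part is paid for.
    budget_s = total_s / target_speedup - (total_s - enhanced_s)
    if budget_s <= 0:
        return math.inf  # the unaffected part alone already uses the whole budget
    return enhanced_s / budget_s


print(required_enhancement(100, 80, 4))  # 16.0: multiply must become 16 times faster
print(required_enhancement(100, 80, 5))  # inf: 20 s of non-multiply work already equals 100/5
```

The 5x case cannot be met by speeding up multiplication alone, because the 20 seconds of other work already fill the entire 20-second budget.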


Computer Market Partitioning (costs are for processor, not system)

• Desktop Computing ($100 - $1,000):
  – Price-performance

• Servers ($200 - $2,000):
  – Availability (reliability + effectiveness)
  – Scalability
  – Throughput

• Embedded Computers ($0.20 - $1,000):
  – Real-time performance
  – Power and memory minimization
  – Cost minimization
  – Interface with special purpose logic; use of processor cores


HLL (e.g., C, C++, Perl) vs Machine/Assembly Language (AL)

• HLL Pros:
  – Easier to express algorithms due to higher level constructs (e.g., For, Case, Arithmetic expressions, objects, etc.)
  – Type checking (Hardware for type checking?).
  – Some memory allocation checking.

• Assembly Language Pros:
  – More control over ISA: more speed, less memory
  – More control over I/O

• Combination is often best for embedded systems: HLL calling AL.


Example: HLL to AL Mapping

HLL:

• b = c + d*e

AL:

• LOAD R1, d
• LOAD R2, e
• LOAD R3, c
• MPY R4, R2, R1
• ADD R5, R4, R3
• STORE R5, b
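To check that the six-instruction sequence computes b = c + d*e, it can be traced on a toy load/store machine. The interpreter below is an illustrative sketch, not a model of any real ISA: the operand order (destination register first for MPY and ADD, register first for LOAD and STORE) and the sample values for c, d and e are assumptions.

```python
def run(program, memory):
    """Execute (opcode, operands...) tuples on a tiny register machine; returns memory."""
    regs = {}
    for op, *args in program:
        if op == "LOAD":        # LOAD Rd, var : Rd <- memory[var]
            rd, var = args
            regs[rd] = memory[var]
        elif op == "STORE":     # STORE Rs, var : memory[var] <- Rs
            rs, var = args
            memory[var] = regs[rs]
        elif op == "MPY":       # MPY Rd, Ra, Rb : Rd <- Ra * Rb
            rd, ra, rb = args
            regs[rd] = regs[ra] * regs[rb]
        elif op == "ADD":       # ADD Rd, Ra, Rb : Rd <- Ra + Rb
            rd, ra, rb = args
            regs[rd] = regs[ra] + regs[rb]
        else:
            raise ValueError(f"unknown opcode: {op}")
    return memory


PROGRAM = [
    ("LOAD", "R1", "d"),
    ("LOAD", "R2", "e"),
    ("LOAD", "R3", "c"),
    ("MPY", "R4", "R2", "R1"),
    ("ADD", "R5", "R4", "R3"),
    ("STORE", "R5", "b"),
]

result = run(PROGRAM, {"c": 2, "d": 3, "e": 4})  # sample values (assumed)
print(result["b"])  # 14, i.e. 2 + 3*4
```

The one-line HLL statement expands into three memory reads, two ALU operations and one memory write, which is the kind of detail assembly language exposes and an HLL hides.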


Buses: I

• A set of path(s) (wires) connecting on-chip or off-chip modules.
  – Serial bus: transmits one bit at a time
  – Parallel bus: transmits many bits simultaneously

• Generally time-shared.

• Generally has separate data & control paths.

• Typically has a separate bus controller or arbiter that decides which modules can use the bus at any given time.


Buses: II

• Some common buses:
  – On-chip: AMBA, Wishbone (generally not standard)
  – Off-chip: PCI Bus Family

                      32-bit transfer    64-bit transfer
    33-MHz PCI        133 MB/sec         266 MB/sec
    66-MHz PCI        266 MB/sec         532 MB/sec
    100-MHz PCI-X     ------------       800 MB/sec
    133-MHz PCI-X     ------------       1 GB/sec
    PCI-e(xpress)     serial, 1 lane     500 MB/sec
    PCI-e(xpress)     serial, 4 lanes    2 GB/sec

  – Off-chip: Other buses - SCSI, IDE, Infiniband

• Common issues: Arbitration, congestion.

• Logical equivalence between buses, multiplexers and switches.
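The parallel PCI and PCI-X figures follow from peak bandwidth = clock rate x bus width in bytes. The sketch below assumes one transfer per clock cycle and nominal clocks of 33.33, 66.66 and 133.33 MHz; the function name and the table of entries are illustrative. The serial PCI-e rows are not covered by this formula.

```python
def peak_bandwidth_mb_per_s(clock_mhz: float, width_bits: int) -> float:
    """Peak rate in MB/s, assuming one transfer of width_bits on every clock cycle."""
    return clock_mhz * width_bits / 8


# (name, nominal clock in MHz, bus width in bits); nominal clocks are assumptions.
BUSES = [
    ("33-MHz PCI, 32-bit", 33.33, 32),
    ("33-MHz PCI, 64-bit", 33.33, 64),
    ("66-MHz PCI, 32-bit", 66.66, 32),
    ("66-MHz PCI, 64-bit", 66.66, 64),
    ("100-MHz PCI-X, 64-bit", 100.0, 64),
    ("133-MHz PCI-X, 64-bit", 133.33, 64),
]

for name, clock_mhz, width_bits in BUSES:
    print(f"{name:24s} {peak_bandwidth_mb_per_s(clock_mhz, width_bits):8.1f} MB/sec")
```

These are peak numbers; arbitration and congestion on a time-shared bus reduce the sustained rate.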


Bandwidth Requirements


Bandwidth Trend


Simple Queuing Theory View of Buses

• Bus is a shared resource and can be viewed as a server in a queuing system.

• Modules attached to the bus present inputs (i.e., requests) to the server (or Bus) and are queued up if the server is busy.

[Figure: queuing model of a bus. Labels: CPU, I/O, Memory, Queue, Server (BUS).]


Basic Queueing Theory

• Utilization: % time a server is busy.

• Average Queue Length: Avg # of jobs in queue.

• Average System Delay (latency): Avg time from job entry into, to job departure from, the system.

• Arrival Time Distribution: Poisson distribution of arrival times (exponential interarrival times).

• Service Time Distribution: Exponentially distributed service times.

• Queue Characteristics: Infinite length; FIFO service discipline.


Basic Queueing Results

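Utilization: $\rho = \dfrac{\lambda}{\mu} = \dfrac{\text{arrival rate}}{\text{service rate}}$

Avg Queue Length: $L_q = \dfrac{\lambda^2}{\mu(\mu - \lambda)}$

Avg Num in System: $L = \dfrac{\lambda}{\mu - \lambda}$

Avg System Waiting Time: $W = \dfrac{1}{\mu - \lambda}$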
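For a single server with Poisson arrivals, exponential service times and an infinite FIFO queue (the M/M/1 model), the standard closed-form results can be evaluated as in the sketch below. The symbols (arrival rate lambda, service rate mu) and the function name are assumptions introduced here.

```python
def mm1_metrics(arrival_rate: float, service_rate: float) -> dict:
    """Standard steady-state M/M/1 results; requires arrival_rate < service_rate."""
    if arrival_rate < 0 or service_rate <= 0:
        raise ValueError("rates must be non-negative (arrival) and positive (service)")
    if arrival_rate >= service_rate:
        raise ValueError("queue is unstable: arrival rate must be below service rate")
    rho = arrival_rate / service_rate             # utilization
    return {
        "utilization": rho,
        "avg_queue_length": rho * rho / (1.0 - rho),            # jobs waiting (L_q)
        "avg_num_in_system": rho / (1.0 - rho),                 # waiting + in service (L)
        "avg_system_time": 1.0 / (service_rate - arrival_rate),  # queueing + service (W)
    }


if __name__ == "__main__":
    # Illustrative load: 8 bus requests per unit time against a capacity of 10.
    for name, value in mm1_metrics(8.0, 10.0).items():
        print(f"{name:18s} {value:.3f}")
```

Both the queue length and the system time grow without bound as utilization approaches 1, which is why a shared bus is usually run well below saturation.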


Basic Queueing Results

[Figure: two M/M/1 plots, Queue Length and Waiting Time, each over the range 0 to 1.]


Computer Generations

• 1: 1950 - 1959 Vacuum Tubes

• 2: 1960 - 1968 Transistors

• 3: 1969 - 1977 Integrated Circuit

• 4: 1978 - 2005 LSI-Large Scale Integration; VLSI-Very LSI

• 5: 2005 - 20?? ULSI-Ultra LSI; parallel processing


Technology: How we make a chip (roughly)


Integrated Circuit Cost

$$\text{Cost per die} = \frac{\text{Cost per wafer}}{\text{Dies per wafer} \times \text{Yield}}$$

$$\text{Dies per wafer} \approx \frac{\text{Wafer area}}{\text{Die area}} \quad \text{(approximate)}$$

$$\text{Yield} = \frac{1}{\left(1 + \text{Defects per area} \times \text{Die area}/2\right)^{2}} \quad \text{(empirical observation)}$$

Typical: Die area = 1.5 cm x 1.5 cm; Wafer diameter = 10 inches; Defects per cm^2 = 1.7; Yield = 50%
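The three cost relations on this slide can be chained as in the sketch below. The wafer cost of 1000 is a made-up placeholder (the slide gives none), the function names are hypothetical, and the 50% yield is passed in as the quoted typical value rather than derived.

```python
import math


def dies_per_wafer(wafer_diameter_cm: float, die_area_cm2: float) -> float:
    """Approximate die count: wafer area / die area (ignores edge losses)."""
    wafer_area_cm2 = math.pi * (wafer_diameter_cm / 2.0) ** 2
    return wafer_area_cm2 / die_area_cm2


def die_yield(defects_per_cm2: float, die_area_cm2: float) -> float:
    """Empirical yield model: 1 / (1 + defects_per_area * die_area / 2)^2."""
    return 1.0 / (1.0 + defects_per_cm2 * die_area_cm2 / 2.0) ** 2


def cost_per_die(cost_per_wafer: float, dies: float, yield_fraction: float) -> float:
    """Cost per good die = wafer cost / (dies per wafer * yield)."""
    return cost_per_wafer / (dies * yield_fraction)


if __name__ == "__main__":
    die_area = 1.5 * 1.5                        # cm^2
    dies = dies_per_wafer(10 * 2.54, die_area)  # 10-inch wafer converted to cm
    print(f"dies per wafer  ~ {dies:.0f}")
    # Hypothetical wafer cost of 1000 with the quoted 50% yield.
    print(f"cost per die    ~ {cost_per_die(1000.0, dies, 0.50):.2f}")
    print(f"model yield     ~ {die_yield(1.7, die_area):.3f}")
```

Evaluating the yield model at 2.25 cm^2 and 1.7 defects per cm^2 gives roughly 0.12, so the 50% figure is best read as an independently quoted typical value rather than the output of this formula for those inputs.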


TECHNOLOGY TRENDS

• Semiconductors:
  – Transistor Density: +50%/year, quadruple in 4 years.
  – Die Size: +10 - 25%/year

• IC Logic Technology:
  – Transistors per Chip: +50 - 60%/year
  – Device Speed: +30%/year
  – Wire/Communications Speed: ~constant (Cu vs Al)

• Magnetic Disk Technology:
  – Density: +25 - 60% / year
  – Access Time: +35% / 10 years (8 ms).
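Annual growth rates like these compound, so a rate r sustained for n years multiplies the starting value by (1 + r)^n. The sketch below, with hypothetical function names, converts between a yearly rate and a multi-year factor.

```python
import math


def growth_factor(rate_per_year: float, years: float) -> float:
    """Total multiplier after compounding rate_per_year for the given number of years."""
    return (1.0 + rate_per_year) ** years


def years_to_multiply(rate_per_year: float, factor: float) -> float:
    """Years of compounding at rate_per_year needed to grow by the given factor."""
    return math.log(factor) / math.log(1.0 + rate_per_year)


if __name__ == "__main__":
    print(f"+50%/year over 4 years: x{growth_factor(0.50, 4):.2f}")
    print(f"years to quadruple at +50%/year: {years_to_multiply(0.50, 4):.2f}")
    print(f"yearly rate that quadruples in 4 years: {4 ** 0.25 - 1:.1%}")
```

At exactly +50% per year the four-year factor is about 5, and quadrupling in four years corresponds to roughly 41% per year, so the "quadruple in 4 years" figure reads as a round approximation.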


Feature and Die Size


Wafer Size

[Figure: 12-inch wafer]


SILICON & MAGNETIC DENSITIES


Processor Performance Gains

[Figure: vertical axis is Performance (x VAX-11/780).]


Processor Cost Trends with Time


SILICON & MAGNETIC DENSITIES