CIS 570 Advanced Computer Systems
University of Massachusetts Dartmouth
Instructor: Dr. Michael Geiger
Fall 2008
Lecture 1: Fundamentals of Computer Design


9/3/08 M. Geiger CIS 570 Lec. 1 2

Outline
• Syllabus & course policies
• Changes in computer architecture
• What is computer architecture?
• Design principles

Syllabus notes
• Course web site (still under construction): http://www.cis.umassd.edu/~mgeiger/cis570/f08.htm
• TA: To be determined
• My info:
  • Office: Science & Engineering, 221C
  • Office hours: M 1:30-2:30, T 2-3:30, Th 2:30-4
  • E-mail: [email protected]
• Course text: Hennessy & Patterson’s Computer Architecture: A Quantitative Approach, 4th ed.

Course objectives
• To understand the operation of modern microprocessors at an architectural level.
• To understand the operation of memory and I/O subsystems and their relation to overall system performance.
• To understand the benefits of multiprocessor systems and the difficulties in designing and utilizing them.
• To gain familiarity with simulation techniques used in research in computer architecture.

Course policies
• Prereqs: CIS 273 & 370 or equivalent
• Academic honesty
  • All work individual unless explicitly stated otherwise (e.g., final projects)
  • May discuss concepts (e.g., how does Tomasulo’s algorithm work) but not solutions
  • Plagiarism is also considered cheating
  • Any assignment or portion of an assignment violating this policy will receive a grade of 0
  • More severe or repeat infractions may incur additional penalties, up to and including a failing grade in the class

Grading policies
• Assignment breakdown:
  • Problem sets: 20%
  • Simulation exercises: 10%
  • Research project (including report & presentation): 20%
  • Midterm exam: 15%
  • Final exam: 25%
  • Quizzes & participation: 10%
• Late assignments: 10% per day

Topic schedule
• Computer design fundamentals
• Basic ISA review
• Architectural simulation
• Uniprocessor systems
  • Advanced pipelining: exploiting ILP & TLP
  • Memory hierarchy design
  • Storage & I/O
• Multiprocessor systems
  • Memory in multiprocessors
  • Synchronization
  • Interconnection networks

Changes in computer architecture
• Old Conventional Wisdom (CW): Power is free, transistors are expensive
• New CW: “Power wall”: power is expensive, transistors are free (can put more on a chip than we can afford to turn on)
• Old CW: ILP can be increased sufficiently via compilers and innovation (out-of-order, speculation, VLIW, …)
• New CW: “ILP wall”: law of diminishing returns on more HW for ILP
• Old CW: Multiplies are slow, memory access is fast
• New CW: “Memory wall”: memory is slow, multiplies are fast (200 clock cycles to DRAM memory, 4 clocks for multiply)
• Old CW: Uniprocessor performance 2X / 1.5 yrs
• New CW: Power Wall + ILP Wall + Memory Wall = Brick Wall
  • Uniprocessor performance now 2X / 5(?) yrs
  • Sea change in chip design: multiple “cores” (2X processors per chip / ~2 years)
  • More, simpler processors are more power efficient

Uniprocessor performance
[Figure: uniprocessor performance relative to the VAX-11/780 (log scale, 1 to 10000) by year, 1978 to 2006, with growth segments labeled 25%/year, 52%/year, and ??%/year. From Hennessy and Patterson, Computer Architecture: A Quantitative Approach, 4th edition, October, 2006]
• VAX: 25%/year, 1978 to 1986
• RISC + x86: 52%/year, 1986 to 2002
• RISC + x86: ??%/year, 2002 to present
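As a rough check on the growth rates quoted above, the compound effect of each era can be worked out directly. This is only an illustrative sketch: the function names are made up for this example, and the year spans are the ones listed in the bullets.

```python
import math

def cumulative_growth(rate_per_year: float, years: int) -> float:
    """Total performance multiplier after compounding a yearly rate."""
    return (1.0 + rate_per_year) ** years

def doubling_time(rate_per_year: float) -> float:
    """Years needed for performance to double at a given yearly rate."""
    return math.log(2.0) / math.log(1.0 + rate_per_year)

# Year spans taken from the slide: 1978-1986 at 25%/year, 1986-2002 at 52%/year.
vax_era = cumulative_growth(0.25, 1986 - 1978)
risc_era = cumulative_growth(0.52, 2002 - 1986)

print(f"1978-1986 multiplier: {vax_era:.1f}x")
print(f"1986-2002 multiplier: {risc_era:.0f}x")
print(f"Doubling time at 25%/year: {doubling_time(0.25):.2f} years")
print(f"Doubling time at 52%/year: {doubling_time(0.52):.2f} years")
```

At 52%/year performance doubles roughly every 1.7 years, which is in the same range as the "2X / 1.5 yrs" rule of thumb from the previous slide.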

Chip design changes
• Intel 4004 (1971): 4-bit processor, 2312 transistors, 0.4 MHz, 10 micron PMOS, 11 mm² chip
• RISC II (1983): 32-bit, 5 stage pipeline, 40,760 transistors, 3 MHz, 3 micron NMOS, 60 mm² chip
• A 125 mm² chip in 0.065 micron CMOS = 2312 copies of RISC II + FPU + Icache + Dcache

From ILP to TLP & DLP
• (Almost) all microprocessor companies are moving to multiprocessor systems
  • Embedded domain is the lone holdout
• Single processors gain performance by exploiting instruction level parallelism (ILP)
• Multiprocessors exploit either:
  • Thread level parallelism (TLP), or
  • Data level parallelism (DLP)
• What’s the problem?

From ILP to TLP & DLP (cont.)
• We’ve got tons of infrastructure for single-processor systems
  • Algorithms, languages, compilers, operating systems, architectures, etc.
  • These don’t exactly scale well
• Multiprocessor design: not as simple as creating a chip with 1000 CPUs
  • Task scheduling/division
  • Communication
  • Memory issues
  • Even programming: moving from 1 to 2 CPUs is extremely difficult
• Not strictly computer architecture, but it can’t happen without architects

CIS 570 Approach
• How are we going to address this change?
• Start by going through single-processor systems
  • Study ILP and ways to exploit it
  • Delve into memory hierarchies for single processors
  • Talk about storage and I/O systems
  • We may touch on embedded systems at this point
• Then, we’ll look at multiprocessor systems
  • Discuss TLP and DLP
  • Talk about how multiprocessors affect memory design
  • Cover interconnection networks

What is computer architecture?
• Classical view: instruction set architecture (ISA)
  • Boundary between hardware and software
  • Provides abstraction at both high level and low level
[Figure: the instruction set shown as the interface between software and hardware]

ISA vs. Computer Architecture
• Modern issues aren’t in instruction set design
  • “Architecture is dead” … or is it?
• Computer architecture now encompasses a larger range of technical issues
• Modern view: ISA + design of computer organization & hardware to meet goals and functional requirements
  • Organization: high-level view of system
  • Hardware: specifics of a given system
• Function of the complete system is now the issue

The roles of computer architecture
• … as David Patterson sees it, anyway
• Other fields borrow ideas from architecture
• Anticipate and exploit advances in technology
• Develop well-defined, thoroughly tested interfaces
• Quantitative comparisons to determine when goals are reached
• Quantitative principles of design

Goals and requirements
• What goals might we want to meet?
  • Performance
  • Power
  • Price
  • Dependability
• We’ll talk about how to quantify these as needed throughout the semester
  • Primarily focus on performance (both uniprocessor & multiprocessor systems) and dependability (mostly storage systems)

Design principles

1. Take advantage of parallelism

2. Principle of locality

3. Focus on the common case

4. Amdahl’s Law

5. Generalized processor performance

1. Take advantage of parallelism
• Increasing throughput of a server computer via multiple processors or multiple disks
• Detailed HW design
  • Carry-lookahead adders use parallelism to speed up computing sums from linear to logarithmic in the number of bits per operand
  • Multiple memory banks searched in parallel in set-associative caches
• Pipelining: overlap instruction execution to reduce the total time to complete an instruction sequence
  • Not every instruction depends on its immediate predecessor, so executing instructions completely or partially in parallel is possible
  • Classic 5-stage pipeline: 1) Instruction Fetch (Ifetch), 2) Register Read (Reg), 3) Execute (ALU), 4) Data Memory Access (Dmem), 5) Register Write (Reg)
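To see why overlapping instructions reduces total time, the cycle counts with and without pipelining can be compared. This is a simplified sketch under assumptions not stated on the slide: every stage takes exactly one cycle and there are no stalls or hazards. The function names are illustrative.

```python
def unpipelined_cycles(n_instructions: int, n_stages: int = 5) -> int:
    """Each instruction runs through all stages before the next one starts."""
    return n_instructions * n_stages

def pipelined_cycles(n_instructions: int, n_stages: int = 5) -> int:
    """Ideal pipeline: the first instruction takes n_stages cycles to finish,
    then one more instruction completes every cycle."""
    if n_instructions == 0:
        return 0
    return n_stages + (n_instructions - 1)

n = 100
before = unpipelined_cycles(n)
after = pipelined_cycles(n)
print(f"{n} instructions: {before} cycles unpipelined, {after} cycles pipelined")
print(f"Speedup: {before / after:.2f}x")
```

Under these idealized assumptions the speedup approaches the number of stages as the instruction count grows; real pipelines fall short of that because of dependences between instructions.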

2. Principle of locality
• The Principle of Locality: programs access a relatively small portion of the address space at any instant of time
• Two different types of locality:
  • Temporal locality (locality in time): if an item is referenced, it will tend to be referenced again soon (e.g., loops, reuse)
  • Spatial locality (locality in space): if an item is referenced, items whose addresses are close by tend to be referenced soon (e.g., straight-line code, array access)
• For the last 30 years, HW has relied on locality for memory performance
  • Guiding principle behind caches
  • To some degree, guides instruction execution, too (90/10 rule)
[Figure: processor (P), cache ($), and memory (MEM)]
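The effect of locality on a cache can be illustrated with a toy simulation. The sketch below models a small direct-mapped cache; the cache size, block size, and access patterns are arbitrary assumptions chosen for illustration, not parameters from the lecture.

```python
def hit_rate(addresses, num_lines=64, block_size=16):
    """Fraction of accesses that hit in a toy direct-mapped cache."""
    tags = [None] * num_lines          # one stored tag per cache line
    hits = 0
    total = 0
    for addr in addresses:
        block = addr // block_size     # which memory block holds this byte
        index = block % num_lines      # cache line the block maps to
        tag = block // num_lines       # distinguishes blocks sharing a line
        if tags[index] == tag:
            hits += 1
        else:
            tags[index] = tag          # miss: load the block into the line
        total += 1
    return hits / total if total else 0.0

# Spatial locality: consecutive 4-byte words, four per 16-byte block.
sequential = list(range(0, 4096, 4))
# Temporal locality: the same small array traversed ten times.
repeated = list(range(0, 256, 4)) * 10
# Poor locality: a 1024-byte stride, so every access maps to the same line.
strided = list(range(0, 1024 * 1024, 1024))

print(f"sequential: {hit_rate(sequential):.3f}")
print(f"repeated:   {hit_rate(repeated):.3f}")
print(f"strided:    {hit_rate(strided):.3f}")
```

The sequential and repeated patterns hit most of the time, while the strided pattern never reuses a block before it is evicted.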

3. Focus on the common case
• In making a design trade-off, favor the frequent case over the infrequent case
  • E.g., the instruction fetch and decode unit is used more frequently than the multiplier, so optimize it first
  • E.g., if a database server has 50 disks per processor, storage dependability dominates system dependability, so optimize it first
• The frequent case is often simpler and can be done faster than the infrequent case
  • E.g., overflow is rare when adding 2 numbers, so improve performance by optimizing the more common case of no overflow
  • May slow down overflow, but overall performance is improved by optimizing for the normal case
• What is the frequent case, and how much is performance improved by making that case faster? => Amdahl’s Law

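4. Amdahl’s Law

$$\mathrm{ExTime}_{new} = \mathrm{ExTime}_{old} \times \left[ (1 - \mathrm{Fraction}_{enhanced}) + \frac{\mathrm{Fraction}_{enhanced}}{\mathrm{Speedup}_{enhanced}} \right]$$

$$\mathrm{Speedup}_{overall} = \frac{\mathrm{ExTime}_{old}}{\mathrm{ExTime}_{new}} = \frac{1}{(1 - \mathrm{Fraction}_{enhanced}) + \dfrac{\mathrm{Fraction}_{enhanced}}{\mathrm{Speedup}_{enhanced}}}$$

Best you could ever hope to do:

$$\mathrm{Speedup}_{maximum} = \frac{1}{1 - \mathrm{Fraction}_{enhanced}}$$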
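A short numeric example may help make Amdahl’s Law concrete. The sketch below evaluates the overall and maximum speedup; the function names and the sample values (40% of execution time enhanced, sped up by 10x) are hypothetical and not taken from the lecture.

```python
def amdahl_speedup(fraction_enhanced: float, speedup_enhanced: float) -> float:
    """Overall speedup when a fraction of execution time is sped up."""
    if not 0.0 <= fraction_enhanced <= 1.0:
        raise ValueError("fraction_enhanced must be between 0 and 1")
    if speedup_enhanced <= 0.0:
        raise ValueError("speedup_enhanced must be positive")
    return 1.0 / ((1.0 - fraction_enhanced)
                  + fraction_enhanced / speedup_enhanced)

def max_speedup(fraction_enhanced: float) -> float:
    """Upper bound on speedup as the enhancement becomes infinitely fast."""
    if fraction_enhanced >= 1.0:
        return float("inf")
    return 1.0 / (1.0 - fraction_enhanced)

# Hypothetical case: 40% of execution time is made 10x faster.
overall = amdahl_speedup(0.4, 10.0)
bound = max_speedup(0.4)
print(f"Overall speedup: {overall:.4f}")
print(f"Maximum possible speedup: {bound:.4f}")
```

Even a 10x improvement to 40% of the execution time yields well under 2x overall, because the unenhanced 60% still dominates.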

5. Processor performance

$$\mathrm{CPU\ time} = \frac{\mathrm{Seconds}}{\mathrm{Program}} = \frac{\mathrm{Instructions}}{\mathrm{Program}} \times \frac{\mathrm{Cycles}}{\mathrm{Instruction}} \times \frac{\mathrm{Seconds}}{\mathrm{Cycle}}$$

                Inst Count    CPI    Clock Rate
Program             X
Compiler            X         (X)
Inst. Set           X          X
Organization                   X         X
Technology                               X
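The performance equation can be evaluated directly once the three factors are known. In the sketch below, the function name and the sample numbers (instruction count, CPI, clock rate) are hypothetical; note that seconds per cycle is the reciprocal of the clock rate.

```python
def cpu_time(instruction_count: float, cpi: float, clock_rate_hz: float) -> float:
    """CPU time in seconds = instructions x cycles/instruction x seconds/cycle."""
    seconds_per_cycle = 1.0 / clock_rate_hz
    return instruction_count * cpi * seconds_per_cycle

# Hypothetical program: 1 billion instructions on a 2 GHz processor.
baseline = cpu_time(1e9, cpi=2.0, clock_rate_hz=2e9)
# Same program and clock, with an organization change that lowers CPI.
improved = cpu_time(1e9, cpi=1.5, clock_rate_hz=2e9)

print(f"Baseline CPU time: {baseline:.3f} s")
print(f"Improved CPU time: {improved:.3f} s")
print(f"Speedup: {baseline / improved:.3f}x")
```

Because the three factors multiply, a change that improves one factor only helps if it does not worsen another by a larger amount.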

Next week
• Review of ISAs (Appendix B)
• Review of pipelining basics (Appendix A)
• Discussion of architectural simulation

Acknowledgements
• This lecture borrows heavily from David Patterson’s lecture slides for EECS 252: Graduate Computer Architecture, at the University of California, Berkeley
• Many figures and other information are taken from Hennessy and Patterson, Computer Architecture: A Quantitative Approach, 4th ed., unless otherwise noted