Computer Architecture CSE 3322 Lecture 8
TEST 1 – Tuesday March 3, Lectures 1 - 8, Ch 1,2
Web Site: crystal.uta.edu/~cse3322

TEST 1 – Tuesday March 3
Lectures 1 - 8, Ch 1,2

• HW Due Feb 24
– 1.4.1 p.60
– 1.4.4 p.60
– 1.4.6 p.60
– 1.5.2 p.60-61
– 1.5.4 p.61
– 1.5.5 p.61

CPI

Invest Resources where time is Spent!

CPI = Clock Cycles / Instruction Count = (CPU Time * Clock Rate) / Instruction Count

“Average clock cycles per instruction”

CPU Time = Instruction Count x CPI / Clock Rate = Instruction Count x CPI x Clock Cycle Time

Average CPI = [ SUM of CPI(i) * I(i) for i = 1, n ] / Instruction Count

Average CPI = SUM of CPI(i) * F(i) for i = 1, n
F(i) is the Instruction Frequency
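The two forms of Average CPI agree once the instruction frequency is written out. The short derivation below assumes, as the symbols suggest but the slide does not state, that I(i) is the number of instructions of class i and that F(i) = I(i) / Instruction Count.

```latex
\[
\mathrm{CPI}_{\mathrm{ave}}
  = \frac{\sum_{i=1}^{n} \mathrm{CPI}(i)\, I(i)}{\text{Instruction Count}}
  = \sum_{i=1}^{n} \mathrm{CPI}(i)\, \frac{I(i)}{\text{Instruction Count}}
  = \sum_{i=1}^{n} \mathrm{CPI}(i)\, F(i)
\]
```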

Example (RISC processor)

Typical Mix
Base Machine (Reg / Reg)

Op       Freq   Cycles   F(i)*CPI(i)   % Time
ALU      50%    1        .5            23%
Load     20%    5        1.0           45%
Store    10%    3        .3            14%
Branch   20%    2        .4            18%
                         2.2 = CPI ave

% Time = CPU Time(i) / CPU Time
       = (Instr Cnt(i) * CPI(i) * Clk Cycle Time) / (Inst Cnt * CPI ave * Clk Cycle Time)
       = F(i) * CPI(i) / CPI ave
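The table arithmetic can be checked with a few lines of Python. This is only a sketch: the dictionary `mix` and the helper names `average_cpi` and `time_share` are illustrative and do not come from the slides.

```python
# Instruction mix from the table: op -> (frequency F(i), cycles CPI(i)).
mix = {
    "ALU": (0.50, 1),
    "Load": (0.20, 5),
    "Store": (0.10, 3),
    "Branch": (0.20, 2),
}


def average_cpi(mix):
    """Average CPI = SUM of F(i) * CPI(i)."""
    return sum(freq * cycles for freq, cycles in mix.values())


def time_share(mix):
    """% Time for each op = F(i) * CPI(i) / CPI ave."""
    cpi_ave = average_cpi(mix)
    return {op: freq * cycles / cpi_ave for op, (freq, cycles) in mix.items()}


cpi_ave = average_cpi(mix)
shares = time_share(mix)

for op, share in shares.items():
    print(f"{op:<7} {share:6.1%}")
print(f"CPI ave = {cpi_ave:.2f}")
```

Rounded to whole percentages, the shares match the % Time column, and loads stand out as the place where most of the time is spent.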

Example (RISC processor)

Typical Mix
Base Machine (Reg / Reg)

Op       Freq   Cycles   F(i)*CPI(i)   % Time
ALU      50%    1        .5            23%
Load     20%    5 (2)    1.0 (.4)      45%
Store    10%    3        .3            14%
Branch   20%    2        .4            18%
                         2.2 (1.6)

How much faster would the machine be if a better data cache reduced the average load time to 2 cycles?

CPU Time = Inst Cnt * CPI ave * Clk Cycle Time

2.2/1.6 = 1.375

Example (RISC processor)

Typical Mix
Base Machine (Reg / Reg)

Op       Freq   Cycles   F(i)*CPI(i)   % Time
ALU      50%    1        .5            23%
Load     20%    5        1.0           45%
Store    10%    3        .3            14%
Branch   20%    2 (1)    .4 (.2)       18%
                         2.2 (2.0)

How much faster would the machine be if a better data cache reduced the average load time to 2 cycles? CPI = 1.6

How does this compare with using branch prediction to shave a cycle off the branch time? CPI = 2.0

Example (RISC processor)

Typical Mix
Base Machine (Reg / Reg)

Op       Freq   Cycles   F(i)*CPI(i)   % Time
ALU      50%    1 (.5)   .5 (.25)      23%
Load     20%    5        1.0           45%
Store    10%    3        .3            14%
Branch   20%    2        .4            18%
                         2.2 (1.95)

How much faster would the machine be if a better data cache reduced the average load time to 2 cycles? CPI = 1.6

How does this compare with using branch prediction to shave a cycle off the branch time? CPI = 2.0

What if two ALU instructions could be executed at once? CPI = 1.95
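The three what-if questions can be compared side by side. The sketch below assumes that instruction count and clock cycle time stay the same in every scenario, so speedup reduces to the ratio of average CPIs. The scenario labels and the helper `with_cycles` are illustrative names, not part of the slides.

```python
# Base mix: op -> (frequency, cycles).
base = {
    "ALU": (0.50, 1.0),
    "Load": (0.20, 5.0),
    "Store": (0.10, 3.0),
    "Branch": (0.20, 2.0),
}


def average_cpi(mix):
    """Average CPI = SUM of F(i) * CPI(i)."""
    return sum(freq * cycles for freq, cycles in mix.values())


def with_cycles(mix, op, cycles):
    """Return a copy of the mix with the cycle count of one op replaced."""
    changed = dict(mix)
    changed[op] = (mix[op][0], cycles)
    return changed


scenarios = {
    "better data cache (Load = 2 cycles)": with_cycles(base, "Load", 2.0),
    "branch prediction (Branch = 1 cycle)": with_cycles(base, "Branch", 1.0),
    "two ALU ops at once (ALU = 0.5 cycle)": with_cycles(base, "ALU", 0.5),
}

base_cpi = average_cpi(base)
results = {}
for name, mix in scenarios.items():
    cpi = average_cpi(mix)
    # Same instruction count and clock: speedup = old CPI / new CPI.
    results[name] = (cpi, base_cpi / cpi)
    print(f"{name}: CPI = {cpi:.2f}, speedup = {base_cpi / cpi:.3f}")
```

Under these assumptions the data cache improvement gives the largest gain, which is consistent with loads taking the largest share of time in the base machine.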

A compiler designer is trying to decide between two code sequences for a particular machine. Based on the hardware implementation, there are three different classes of instructions:

Class A takes 1 cycle
Class B takes 2 cycles
Class C takes 3 cycles

The first code sequence has 5 instructions: 2 of A, 1 of B, and 2 of C.
2*1 + 1*2 + 2*3 = 10 cycles

The second sequence has 6 instructions: 4 of A, 1 of B, and 1 of C.
4*1 + 1*2 + 1*3 = 9 cycles

Which sequence will be faster? How much?
The second sequence, even though it has more instructions: 10 / 9 = 1.11

What is the CPI for each sequence?
First sequence: 10/5 = 2
Second sequence: 9/6 = 1.5
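The same comparison as a small Python sketch. The names `CYCLES`, `total_cycles`, and `cpi` are illustrative, and the two sequences are assumed to run on the same machine at the same clock rate.

```python
# Cycles per instruction for each class, as given in the problem.
CYCLES = {"A": 1, "B": 2, "C": 3}


def total_cycles(counts):
    """Clock cycles for a sequence given its per-class instruction counts."""
    return sum(CYCLES[cls] * n for cls, n in counts.items())


def cpi(counts):
    """CPI = Clock Cycles / Instruction Count."""
    return total_cycles(counts) / sum(counts.values())


seq1 = {"A": 2, "B": 1, "C": 2}  # 5 instructions
seq2 = {"A": 4, "B": 1, "C": 1}  # 6 instructions

for name, seq in (("sequence 1", seq1), ("sequence 2", seq2)):
    print(f"{name}: {total_cycles(seq)} cycles, CPI = {cpi(seq):.2f}")

# Same clock rate, so the time ratio equals the cycle ratio.
print(f"sequence 2 is {total_cycles(seq1) / total_cycles(seq2):.2f}x faster")
```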

A popular performance metric is MIPS, the number of millions of instructions per second.

For a given program,

MIPS = Instruction Count / (Execution time x 10^6)

1. Cannot compare if instruction set is different
2. Highly dependent on the program
3. Can be inversely proportional to performance

MIPS example

Two different compilers are being tested for a 100 MHz machine with three different classes of instructions: Class A has 1 cycle, Class B has 2 cycles, Class C has 3 cycles.

Instruction counts (billions)
Code from      A    B    C
Compiler 1     5    1    1
Compiler 2    10    1    1

• Which sequence will be faster according to MIPS?
• Which sequence will be faster according to execution time?

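MIPS example

Two different compilers are being tested for a 100 MHz machine with three different classes of instructions:

        Class A   Class B   Class C
CPI     1         2         3

Instruction counts (billions)
Code from      A    B    C    Total
Compiler 1     5    1    1    7
Compiler 2    10    1    1    12

              CPU cycles                     Exec Time                      MIPS
Compiler 1    5 + 1x2 + 1x3 = 10 billion     10^10 x 10^-8 = 100 sec        7x10^3 / 100 = 70
Compiler 2    10 + 1x2 + 1x3 = 15 billion    1.5x10^10 x 10^-8 = 150 sec    12x10^3 / 150 = 80

Compiler 2 has the higher MIPS rating, but Compiler 1 has the shorter execution time.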
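The MIPS example can be reproduced in a few lines. This is a sketch with illustrative names (`CLOCK_RATE_HZ`, `evaluate`); the only inputs are the 100 MHz clock, the class CPIs, and the instruction counts from the table.

```python
CLOCK_RATE_HZ = 100e6            # 100 MHz machine
CPI = {"A": 1, "B": 2, "C": 3}   # cycles per instruction for each class


def evaluate(counts):
    """Return (cycles, execution time in seconds, MIPS) for one compiler."""
    instructions = sum(counts.values())
    cycles = sum(CPI[cls] * n for cls, n in counts.items())
    exec_time = cycles / CLOCK_RATE_HZ
    mips = instructions / (exec_time * 1e6)
    return cycles, exec_time, mips


compiler1 = {"A": 5e9, "B": 1e9, "C": 1e9}
compiler2 = {"A": 10e9, "B": 1e9, "C": 1e9}

for name, counts in (("Compiler 1", compiler1), ("Compiler 2", compiler2)):
    cycles, exec_time, mips = evaluate(counts)
    print(f"{name}: {cycles:.0f} cycles, {exec_time:.0f} sec, {mips:.0f} MIPS")
```

The run shows the point made on the MIPS slide: the code with the higher MIPS rating takes longer to execute.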

Benchmarks

• Performance best determined by running a real application
– Use programs typical of expected workload
– Or, typical of expected class of applications, e.g., compilers/editors, scientific applications, graphics, etc.

• Small benchmarks
– nice for architects and designers
– easy to standardize
– can be abused

SPEC CPU 2006

• SPEC - System Performance Evaluation Cooperative
• 12 Integer Benchmarks
• 17 Floating Point Benchmarks
• Fig 1.20 P.49

Amdahl's Law

Execution Time After Improvement =
Execution Time Unaffected + ( Execution Time Affected / Amount of Improvement )

• Example: Suppose a program runs in 100 seconds on a machine, with multiply responsible for 80 seconds of this time. How much do we have to improve the speed of multiplication if we want the program to run 4 times faster?

Improved Time = (100 – 80) + 80/n = 100/4
20 + 80/n = 25
80/n = 5
80 = 5n ; n = 16

Amdahl's Law

Execution Time After Improvement =
Execution Time Unaffected + ( Execution Time Affected / Amount of Improvement )

• Example: Suppose a program runs in 100 seconds on a machine, with multiply responsible for 80 seconds of this time. How much do we have to improve the speed of multiplication if we want the program to run 4 times faster?

How about making it 5 times faster?

Improved Time = (100 – 80) + 80/n = 100/5
20 + 80/n = 20
80/n = 0 Impossible!

The 20 seconds not spent in multiply already equal the 20 second target, so no finite n is enough.
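Both Amdahl's Law questions can be answered by solving the formula for n. The sketch below uses illustrative function names (`improved_time`, `required_factor`) and returns `None` when the target cannot be reached by speeding up the affected part alone.

```python
def improved_time(total, affected, factor):
    """Execution time after improving the affected part by `factor`."""
    return (total - affected) + affected / factor


def required_factor(total, affected, target_speedup):
    """Improvement factor n needed for the whole program to reach
    `target_speedup`, or None if no finite n is enough."""
    target_time = total / target_speedup
    # Time budget left for the affected part after the unaffected part.
    remaining = target_time - (total - affected)
    if remaining <= 0:
        return None
    return affected / remaining


n4 = required_factor(100, 80, 4)
n5 = required_factor(100, 80, 5)
print("4 times faster needs n =", n4)
print("5 times faster needs n =", n5)
```

With multiply at 80 of 100 seconds, the overall speedup approaches but never reaches 5, which is why the second call has no answer.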

Remember

• Performance is specific to a particular program/s
– Total execution time is a consistent summary of performance

• For a given architecture performance increases come from:
– increases in clock rate (without adverse CPI effects)
– improvements in processor organization that lower CPI
– compiler enhancements that lower CPI and/or instruction count