Computer Architecture CS 154
Where software and hardware finally meet
Dr. Franklin
What is Computer Architecture?
• Program software
• Write compilers
• Design assembly language
• Design processor
• Optimize layout, circuits, etc.
• Design transistor technology

Architecture – this class! – sits in the middle of this stack: the assembly language and the processor that implements it.
Coming together – the basics
• What do high-level instructions get compiled down to?
• How do you build a basic machine?
Hardware optimization
• What do high-level instructions get compiled down to?
• How do you build a basic machine?
• How do architects specialize the hardware to run programs quickly?
Software optimization
• What do high-level instructions get compiled down to?
• How do you build a basic machine?
• How do architects specialize the hardware to run programs quickly?
• How do programmers optimize programs to run quickly?
CS 154 Topics
• How do you build a basic machine?
• How do architects specialize the hardware to run programs quickly?
• How do programmers optimize programs to run quickly?
Architecture
• Must understand software
  – Programs have certain characteristics
  – Optimize the design to take advantage of those characteristics
• Must understand hardware
  – Hardware design complexity
  – Ease of programming
  – Performance
  – Power
Looking smart for your friends and family
Where is computing going?
Technology Trends: Memory Capacity (Single-Chip DRAM)
• Now 1.4X/yr, or 2X every 2 years
• 8000X since 1980!
Technology Trends: Microprocessor Complexity
Moore’s Law
• 2X transistors/chip every 1.5 years
• Alpha 21264: 15 million
• Pentium Pro: 5.5 million
• PowerPC 620: 6.9 million
• Alpha 21164: 9.3 million
• Sparc Ultra: 5.2 million
• Athlon (K7): 22 million
• Itanium 2: 41 million
Technology Trends: Processor Performance
• 1.5X/yr
• Example data point: Intel P4, 2000 MHz (Fall 2001)
[Figure: performance measure vs. year]
• This curve has now flattened out – that is why we are seeing multicore
Technology Trends Summary
• 2X every 2.0 years in memory size
• 2X every 1.0 year in disk capacity
• 2X every 1.5 years in processor complexity (Moore’s Law)
• More processors per chip each generation
The Architecture Walls
• Memory Wall – Processor speed kept increasing, but memory speed did not keep up, so the processor is often idle waiting for memory.
• ILP Wall – There are not enough independent instructions for the processor to get real work done while one instruction waits for another (or for memory).
• Power Wall – Solving the above two walls requires too much power, and we don’t have cooling technology to dissipate that much heat.
Beginning of the multi-core era
• Multi-core chips
  – Place multiple processors on a single die
• Because
  – They can communicate very quickly
  – Much higher potential throughput
  – Less power per area than accelerating a single thread
• But
  – You need parallel programs (or multiple programs) to exploit them
The next frontier
• GPU – Graphics Processing Unit
  – Specialized hardware for graphics
  – Optimized to run the same operation on many pieces of data (e.g. pixels)
• Why?
  – Mature technology, driven by gaming
  – Low-power parallel processing
• Barrier
  – Limited programming model
  – Not appropriate for many programs (e.g. servers)
Performance
• Not an absolute
• Depends on application characteristics
  – Graphics
  – General-purpose desktop
  – Scientific apps
  – Servers
• Rapidly changing technology
  – DRAM speed, chip density, etc.
• This is the focus of our class
What is Computer Architecture?
• Program software
• Write compilers
• Design assembly language
• Design processor
• Optimize layout, circuits, etc.
• Design transistor technology

Architecture – this class!
Why do I care?!? I’m 3 levels above.
But I’m CS
• Why do I have to learn about hardware? (I hear you ask)
• Hardware is optimized to take advantage of particular program characteristics.
• If your software is different, it can get atrocious performance.
• You must understand general architecture to program for it.
• In an ideal world, compilers would do this for you. (We live in the real world.)
Which is faster?

Sequence 1:
R1 = A[5]
B[6] = R1
R3 = R0 + R2
R5 = R4 - R3
R7 = R0 + R6
C[7] = R7

Sequence 2:
R1 = A[5]
R3 = R0 + R2
R7 = R0 + R6
B[6] = R1
R5 = R4 - R3
C[7] = R7
Which is faster in C/Java?
for (i = 0; i < n; i++)
  for (j = 0; j < n; j++)
    A[j][i] = i*j + 7;

for (i = 0; i < n; i++)
  for (j = 0; j < n; j++)
    A[i][j] = i*j + 7;
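A sketch of why the order matters (function names are mine): C stores 2-D arrays in row-major order, so `A[i][j]` and `A[i][j+1]` are adjacent in memory. The second loop walks memory sequentially and uses every word of each cache line it fetches; the first jumps a whole row between consecutive writes.

```c
#define N 64

/* Column-major traversal: consecutive iterations write A[0][i], A[1][i],
   ... which sit N ints apart in memory - poor cache locality. */
void fill_col_major(int A[N][N]) {
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++)
            A[j][i] = i * j + 7;
}

/* Row-major traversal: consecutive iterations write A[i][0], A[i][1],
   ... which are adjacent - each cache line fetched is fully used. */
void fill_row_major(int A[N][N]) {
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++)
            A[i][j] = i * j + 7;
}
```

Both fill the array with identical values (since i*j is symmetric in i and j); for large N the row-major version is typically several times faster.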
What data structure should I use?
• Array or linked structure?
• Does it change often? Yes – then linked nodes
• Does it get searched often? Yes – then array
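A minimal C sketch of the trade-off (helper names are mine): a linked node supports O(1) insertion at the head with no data movement, while a sorted array supports O(log n) binary search but would need O(n) shifting on insert.

```c
#include <stdlib.h>

/* Linked nodes: insertion is one allocation and one pointer write. */
struct node { int val; struct node *next; };

struct node *push_front(struct node *head, int val) {
    struct node *n = malloc(sizeof *n);
    n->val = val;
    n->next = head;   /* no existing elements are moved */
    return n;
}

/* Sorted array: searching halves the range each step (O(log n)),
   but inserting would require shifting elements (O(n)). */
int find(const int *a, int n, int key) {
    int lo = 0, hi = n - 1;
    while (lo <= hi) {
        int mid = lo + (hi - lo) / 2;
        if (a[mid] == key) return mid;
        if (a[mid] < key) lo = mid + 1;
        else              hi = mid - 1;
    }
    return -1;        /* not found */
}
```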
General Class Info
• When, where and who
  – Website: http://www.cs.ucsb.edu/~franklin/154/154.html
  – Professor: Diana Franklin, franklin@cs
  – TAs: Michael, Nadav, Shivapriya
• Office Hours:
  – Franklin: MTWR, 3:30-4:30, HFH 1115
  – TA:
Grading Policy
• Grading
  – Labs: 0-5% (0.5% for each attended)
  – Projects: 25-30%
  – Quizzes: 10%
  – Midterms: 25%
  – Final: 35%
• Plagiarism
  – You may discuss the design of programming assignments
  – You may not show or look at any other group’s code
    • Come to office hours!!!
    • Look at example code from class!!!
  – Plagiarism will result in an F in the class and reporting to Judicial Affairs for further action.
Curve
• Individual tests and assignments are not curved
• Curving only occurs at the end to offset grading that is too harsh
Projects
• 2 or 3 students per group
• Discussions focus on skills for the projects
• Projects build on each other
  – Don’t get behind – you have fair warning
  – The expectation is that everyone completes all projects properly (unlike in the past, when one bad grade did not affect the projects that followed)
Discussion group
• Piazza – join this week
  – Announcements will be made here
  – Do not post code or partial solutions EVER, even to ask for help about what is wrong
    • Post those privately!
Exams
• 2 MiniExams – 1 side of 1 page notes
• 2 Midterms – 2 sides of 1 page notes
• 1 Final – 2 sides of 2 pages of notes
• If your weighted average on exams is < 60% (straight scale) and well below the class average, you may receive an F.
Learning a new ISA
Learn the syntax and semantics of:
• Arithmetic operations
• Control operations
• Memory operations
High-Level MIPS
• Arithmetic: All computation occurs in registers
• Branches: Two-step process – calculate then branch
• Memory: Move data between registers (for computation) and memory (huge)
MIPS Registers – 32 registers

Name     Number  Usage             Preserved across call?
$zero    0       The constant 0    Yes
$v0-$v1  2-3     Function results  No
$a0-$a3  4-7     Arguments         No
$t0-$t7  8-15    Temporaries       No
$s0-$s7  16-23   Saved             Yes
$t8-$t9  24-25   More temporaries  No
$gp      28      Global pointer    Yes
$sp      29      Stack pointer     Yes
$fp      30      Frame pointer     Yes
$ra      31      Return address    Yes
Page 140, Figure 3.13
Operation # meaning
add $2,$3,$5 # $2 <- $3 + $5
sub $2,$3,$5 # $2 <- $3 - $5
addu $2,$3,$5 # $2 <- $3 + $5 (no overflow exception)
slt $2,$3,$5 # if ($3 < $5) $2 <- 1; else $2 <- 0
Arithmetic “R-Format”
• Two input registers – rs & rt
• One output register - rd
Arithmetic “I-format”
• One input register – rs
• One hard-coded constant – a 16-bit immediate
• One output register – rt
Operation # comment
addi $2, $3, 8 # $2 <- $3 + 8
andi $2, $3, 10 # $2 <- $3 & 10
slti $2, $3, 7 # if ($3 < 7) $2 <- 1; else $2 <- 0
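Both `slt` and `slti` write a 0-or-1 result into the destination register; in C terms the semantics are (helper name is mine):

```c
#include <stdint.h>

/* slt/slti semantics: destination <- 1 if rs < rt (signed compare), else 0. */
int32_t slt(int32_t rs, int32_t rt) {
    return rs < rt ? 1 : 0;
}
```

This is why a test like `if (i < n)` compiles to an slt followed by a branch against $0, as in the loop translation later in the deck.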
Branches
• goto loop
• if (i < 100) goto loop
Operation # comment
beq $3,$2,loop # if ($3 == $2) goto loop
bne $3,$2, loop # if ($3 != $2) goto loop
jr $3 # goto $3
j loop # goto loop
jal function # goto function, store return address in $ra
Operation # comment
lw $2, 32($3) # $2 <- M[32 +$3]
sw $2, 16($3) # M[16 +$3] <- $2
Load/Store Instructions
• Displacement addressing mode
• Register indirect is displacement with a 0 offset
• lw = load word (4 bytes)
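In C terms, displacement addressing is just pointer-plus-byte-offset. A sketch (helper names are mine; real lw/sw also require the address to be word-aligned):

```c
#include <stdint.h>
#include <string.h>

/* lw rd, off(rs): load the 32-bit word at byte address rs + off.
   memcpy sidesteps alignment/aliasing issues in portable C. */
int32_t lw(const uint8_t *rs, int32_t off) {
    int32_t word;
    memcpy(&word, rs + off, sizeof word);
    return word;
}

/* sw rt, off(rs): store the 32-bit word rt at byte address rs + off. */
void sw(uint8_t *rs, int32_t off, int32_t rt) {
    memcpy(rs + off, &rt, sizeof rt);
}
```

With offset 0 this is register indirect; an offset of 20 reaches A[5] of an int array, since each word is 4 bytes.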
Let’s do a code example
int sum = 0;
for (i = 0; i < n; i++)
  sum += A[i];
1. Split apart the parts of the for loop
2. Translate the regular code
3. Insert branches
4. Translate memory operations
int sum = 0;
for (i = 0; i < n; i++)
  sum += A[i];
• int sum = 0;
• i = 0;
• if !(i < n) -> skip loop
• sum += A[i]
• i++
• if (i < n) -> loop again
• $t0 -> sum, $t1 -> i
• assume &A[0] is in $a0, n is in $a1

        addi $t0, $0, 0        # int sum = 0;
        add  $t1, $0, $0       # i = 0;
        slt  $t2, $t1, $a1     # if !(i < n)
        beq  $t2, $0, skiploop #   -> skip loop
loop:   sll  $t2, $t1, 2       # $t2 = i * 4
        add  $t3, $t2, $a0     # $t3 = &A[i]
        lw   $t2, 0($t3)       # $t2 = A[i]
        add  $t0, $t0, $t2     # sum += A[i]
        addi $t1, $t1, 1       # i++
        slt  $t2, $t1, $a1     # if (i < n)
        bne  $t2, $0, loop     #   -> loop again
skiploop:
sum += A[i]
• load A[i], then add it to sum

sll $t2, $t1, 2   # $t2 = i * 4 (byte offset of A[i])
add $t3, $t2, $a0 # $t3 = &A[i]
lw  $t2, 0($t3)   # $t2 = A[i]
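Putting it together, the translated loop computes the same value as this C function (a sketch; `sum_array` is my own name, and the comments map each line back to the registers used above):

```c
/* C equivalent of the translated MIPS loop.
   Register mapping from the slide: $t0 = sum, $t1 = i,
   $a0 = &A[0], $a1 = n. The `sll $t2, $t1, 2` step is the
   i*4 byte-offset computation hidden inside A[i]. */
int sum_array(const int *A, int n) {
    int sum = 0;                 /* addi $t0, $0, 0 */
    for (int i = 0; i < n; i++)  /* slt/beq guard, slt/bne loop-back */
        sum += A[i];             /* sll + add + lw + add */
    return sum;
}
```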