CSC 3210 Computer Organization and Programming Chapter 2 SPARC Architecture Dr. Anu Bourgeois 1


Introduction

• SPARC is a load/store architecture

• Registers used for all arithmetic and logical operations

• 32 registers available at a time

• Uses only load and store instructions to access memory


Registers

• Registers are accessed directly for rapid computation

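• 32 registers – divided into 4 sets
    – Global: %g0 - %g7
    – Out: %o0 - %o7
    – In: %i0 - %i7
    – Local: %l0 - %l7

• %g0 – always returns 0; writes to it are discarded

• %o6, %o7, %i6, %i7 – do not use (they hold the stack pointer, the return addresses, and the frame pointer)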

• Register size = 32 bits each


Table of Registers

Global registers     Out registers         Local registers      In registers
Register  Synonym    Register  Synonym     Register  Synonym    Register  Synonym
%g0*      %r0        %o0       %r8         %l0       %r16       %i0       %r24
%g1       %r1        %o1       %r9         %l1       %r17       %i1       %r25
%g2       %r2        %o2       %r10        %l2       %r18       %i2       %r26
%g3       %r3        %o3       %r11        %l3       %r19       %i3       %r27
%g4       %r4        %o4       %r12        %l4       %r20       %i4       %r28
%g5       %r5        %o5       %r13        %l5       %r21       %i5       %r29
%g6       %r6        %o6       %r14, %sp   %l6       %r22       %i6       %r30, %fp
%g7       %r7        %o7#      %r15        %l7       %r23       %i7^      %r31

* -- Always discards writes and returns zero
# -- Called subroutine return address
^ -- Subroutine return address
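The synonym columns follow a simple pattern: each set of eight registers occupies a consecutive block of %r numbers. A minimal Python sketch of that mapping is given here; the helper name `r_number` and the two dictionaries are illustrative only and are not part of any SPARC tool.

```python
# Illustrative helper (hypothetical, not part of the SPARC toolchain):
# map a register name such as %o6 to its %r number, following the table.
BASE = {"g": 0, "o": 8, "l": 16, "i": 24}   # first %r number of each set
ALIASES = {"%sp": "%o6", "%fp": "%i6"}      # named synonyms from the table


def r_number(name):
    """Return the %r number (0..31) for a name like '%l3' or '%sp'."""
    name = ALIASES.get(name, name)
    if len(name) != 3 or name[0] != "%" or name[1] not in BASE:
        raise ValueError(f"not a SPARC integer register: {name}")
    index = int(name[2])                    # raises ValueError if not a digit
    if index > 7:
        raise ValueError(f"not a SPARC integer register: {name}")
    return BASE[name[1]] + index


print(r_number("%g0"), r_number("%sp"), r_number("%l3"), r_number("%i7"))
```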


SPARC Assembler

• The SPARC assembler, as, is a 2-pass assembler

• First pass:
    – Updates location counter without paying attention to undefined labels for operands
    – Defines label symbol to location counter

• Second pass:
    – Values substituted in for labels
    – Ignores labels followed by colons
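The two passes can be sketched in Python as a toy model. Several things here are simplifying assumptions rather than how as actually works: the function name `assemble`, the one-statement-per-string input, and substituting a label's absolute address for the operand (real branch instructions encode a displacement). The 4-byte instruction size is the SPARC instruction width.

```python
def assemble(lines):
    """Toy two-pass label resolution.

    Returns a list of (address, mnemonic, operands) tuples.
    """
    symbols = {}
    location = 0
    # Pass 1: define each label as the current location counter. Operands
    # are not examined, so labels that are still undefined cause no trouble.
    for line in lines:
        label, _, rest = line.rpartition(":")
        if label:
            symbols[label.strip()] = location
        if rest.strip():
            location += 4          # every SPARC instruction is 4 bytes
    # Pass 2: ignore the label definitions and substitute values for
    # labels that appear as operands.
    output = []
    location = 0
    for line in lines:
        rest = line.rpartition(":")[2].strip()
        if not rest:
            continue
        mnemonic, *operands = rest.replace(",", " ").split()
        operands = [symbols.get(op, op) for op in operands]
        output.append((location, mnemonic, operands))
        location += 4
    return output


program = ["ba done", "add %o0, %o1, %l0", "done: sub %o0, %o1, %l1"]
for entry in assemble(program):
    print(entry)
```

The forward reference to `done` in the first statement is harmless because pass 1 never looks at operands; by pass 2 every label already has a value.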


Assembly Language Programs

• Programs are line based
• Use mnemonics which generate machine code upon assembling
• Statements may be labeled
• Comments: ! or /* … */

    /* instructions to add and to subtract the contents of %o0 and %o1 */
    start:  add %o0, %o1, %l0   !l0=o0+o1
            sub %o0, %o1, %l1   !l1=o0-o1


Pseudo-ops

• Statements that do not generate machine code
    – e.g. data definitions, statements to provide the assembler information

• Generally start with a period

            .global main
    main:

• Can be labeled

    a:      .word 3


Compiling Code – 2 step process

• C compiler will call as and produce the object files

• Object files are the machine code

• Next calls the linker to combine .o files with library routines to produce the executable program – a.out


Compiling a C program

% gcc -S program.c : produces the .s assembly language file

% gcc expr.s -o expr : assembles the program and produces the executable file

NOTE: You will only do this for the 1st assignment


Start of Execution

• C compiler expects to start execution at an address main

• The label must be at the first statement to execute and declared to be global

            .global main
    main:   save %sp, -96, %sp

• save instruction provides space to save registers for the debugger


Macros

• If we have macros defined, then the program should be a .m file

• We can expand the macros to produce a .s file by running m4 first

% m4 expr.m > expr.s

% gcc expr.s -o expr


SPARC Instructions

• 3 operands: 2 source operands and 1 destination operand

• Source registers are unchanged

• Result stored in destination register

• Constants : -4096 ≤ c < 4096

    op reg_rs1, reg_rs2, reg_rd
    op reg_rs1, imm, reg_rd
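The range -4096 ≤ c < 4096 is exactly what a 13-bit two's-complement field can hold, which is the width of the immediate field in SPARC arithmetic instructions. A small Python sketch of the range check follows; the function names are illustrative only.

```python
def fits_immediate(c):
    """True if c fits the 13-bit signed immediate: -4096 <= c < 4096."""
    return -(1 << 12) <= c < (1 << 12)


def encode_simm13(c):
    """Return the 13-bit two's-complement field for c, or raise if too big."""
    if not fits_immediate(c):
        raise ValueError(f"{c} does not fit in 13 signed bits")
    return c & 0x1FFF


for value in (-4097, -4096, -1, 0, 4095, 4096):
    print(value, fits_immediate(value))
```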


Sample Instructions

clr reg_rd
    Clears a register to zero

mov reg_or_imm, reg_rd
    Copies content of source to destination

add reg_rs1, reg_or_imm, reg_rd
    Adds oper1 + oper2 → destination

sub reg_rs1, reg_or_imm, reg_rd
    Subtracts oper1 - oper2 → destination


Multiply and Divide

• No instruction available in SPARC

• Use function call instead

• Must use %o0 and %o1 for sources and %o0 holds result

    a = b * c               a = b ÷ c

    mov b, %o0              mov b, %o0
    mov c, %o1              mov c, %o1
    call .mul               call .div


Instruction Cycle

• Instruction cycle broken into 4 stages:

    Instruction fetch   Fetch & decode instruction, obtain any operands, update PC
    Execute             Execute arithmetic instruction, compute branch target address, compute memory address
    Memory access       Access memory for load or store instruction; fetch instruction at target of branch instruction
    Store results       Write instruction results back to register file


Pipelining

• SPARC is a RISC machine – want to complete one instruction per cycle

• Overlap stages of different instructions to achieve parallel execution

• Can obtain a speedup by a factor of 4

• Hardware does not have to run 4 times faster – break h/w into 4 parts to run concurrently


Pipelining

• Sequential: each h/w stage idle 75% of the time. For i instructions:

    time_ex = 4 * i

• Parallel: each h/w stage working after filling the pipeline. For i instructions:

    time_ex = 3 + i
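The two timing formulas can be compared directly. The Python sketch below assumes i counts instructions and time is measured in stage-length cycles; the function names are illustrative. It shows that the speedup approaches, but never quite reaches, the factor of 4.

```python
STAGES = 4   # fetch, execute, memory access, store results


def sequential_time(i):
    """Cycles for i instructions when each one finishes before the next starts."""
    return STAGES * i


def pipelined_time(i):
    """Cycles for i >= 1 instructions: 3 cycles to fill the pipe, then one each."""
    return (STAGES - 1) + i


def speedup(i):
    return sequential_time(i) / pipelined_time(i)


for count in (1, 4, 100, 10000):
    print(count, sequential_time(count), pipelined_time(count),
          round(speedup(count), 3))
```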


Data Dependencies – Load Delay Problem

    ld  [%o0], %o1
    add %o1, %o2, %o2

The add needs %o1 as a source operand before the load has brought the value in from memory.


Branch Delay Problem

• Branch target address not available until after execution of branch instruction

• Insert branch delay slot instruction


Branch delays

• Try to place an instruction after the branch that is useful – can also use nop

• The instruction following a branch instruction will always be fetched

• Updating the PC determines which instruction to fetch next


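    cmp %l0, %l1
    bg  next
    mov %l2, %l3
    sub %l3, 20, %l4

Condition true: branch to next

Condition false: continue to sub

[Figure: pipeline timing diagram. The instructions cmp, bg, mov, and ??? (the instruction after mov, not yet known) each pass through stages F, E, M, W, offset by one cycle. Annotations: fetch instruction from memory[PC]; update PC (PC++); obtain operands; determine if branch taken; if true, update PC to the target. While bg is in execute, mov is being fetched.]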
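The effect of the branch delay on the cmp / bg / mov / sub sequence can be traced with a toy model that keeps two counters, the current instruction and the next one to fetch. This is a Python sketch under stated assumptions: the function name `trace`, the tuple encoding of the program, and the placeholder instruction `instr_at_next` standing for whatever sits at the label next are all hypothetical.

```python
def trace(program, labels, taken):
    """Return the mnemonics executed, in order, on a toy delayed-branch machine.

    program: list of (mnemonic, branch_target_label_or_None)
    labels:  dict mapping a label to an index into program
    taken:   whether the branch condition is true
    """
    executed = []
    pc, npc = 0, 1                 # instruction executing now, next to fetch
    while pc < len(program):
        mnemonic, target = program[pc]
        executed.append(mnemonic)
        next_npc = npc + 1         # default: continue sequentially
        if target is not None and taken:
            # The instruction at npc (the delay slot) is already fetched,
            # so the redirect only affects the fetch after it.
            next_npc = labels[target]
        pc, npc = npc, next_npc
    return executed


PROGRAM = [("cmp", None), ("bg", "next"), ("mov", None), ("sub", None),
           ("instr_at_next", None)]   # last entry is a hypothetical placeholder
LABELS = {"next": 4}

print(trace(PROGRAM, LABELS, taken=True))
print(trace(PROGRAM, LABELS, taken=False))
```

In both cases mov appears in the trace: the instruction following the branch is always fetched and executed, and only the instruction after it depends on the condition.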


Actual SPARC Code: expr.m


Expanding Macros

• After running through m4: % m4 expr.m > expr.s

• Produce executable: % gcc expr.s -o expr

• Execute file: % ./expr


The Debugger – gdb

• Used to verify correctness, and find bugs

• Can also execute a program, stop execution at any point and single-step execution

• After assembling the program and placing the output into expr, launch gdb: %gdb expr

• To run code in gdb, type “r”:

(gdb) r


gdb Commands

• Breakpoints can be set at any address to stop execution in order to check status of program and registers

• To set a breakpoint at a label:

    (gdb) b main
    Breakpoint 1 at 0x106a8
    (gdb)

• Typing “c” continues execution until it reaches the next breakpoint or end of code

• Can print contents of a register

    (gdb) p $l1
    $2 = -8
    (gdb)

• Best way to learn is by practice


Filling Delay Slots

• The call instruction is called a delayed control transfer instruction : changes address from where future instructions will be fetched

• The following instruction is called a delayed instruction, and is located in the delay slot

• The delayed instruction is executed before the branch/call happens

• Using a nop for the delay slot still wastes a cycle

• Instead, we may be able to move the instruction prior to the branch instruction into the delay slot.


Filling Delay Slots

• Move sub instructions to the delay slots to eliminate nop instructions

            .global main
    main:   save  %sp, -96, %sp
            mov   9, %l0            !initialize x
            sub   %l0, 1, %o0       !(x - 1) into %o0
            call  .mul
            sub   %l0, 7, %o1       !(x - 7) into %o1
            call  .div
            sub   %l0, 11, %o1      !(x - 11) into %o1, the divisor
            mov   %o0, %l1          !store it in y
            ret                     !end the program
            restore


Filling Delay Slots

The same program, traced one pipeline step at a time (EXECUTE marks the instruction being executed, FETCH the one being fetched):

• Executing the mov instruction, while fetching the first sub instruction (x - 1)

• Now executing that sub instruction, while fetching the call .mul instruction

• Now executing the call instruction, while fetching the sub instruction in its delay slot (x - 7)
    – Execution of call will update the PC to fetch from the mul routine, but since sub was already fetched, it will be executed before any instruction from the mul routine

• Now executing the sub instruction, while fetching from the mul routine

    .mul:   save …..
            ……

• Now executing the save instruction, while fetching the next instruction from the mul routine

• While executing the last instruction of the mul routine, execution comes back to main and fetches the call .div instruction
    – At this point %o0 has the result from the multiply routine – this is the first operand for the divide routine
    – The subtract instruction will compute the 2nd operand before starting execution of the divide routine


2.9 Branching

Instructions for testing and branching:

2.9.1 Testing

The information about the state of execution of an instruction is saved in the following flags:

    Z   zero       whether the result was zero
    N   negative   whether the result was negative
    V   overflow   whether the result was too large for the register
    C   carry      whether the result generated a carry out

Special add and sub instructions: ‘cc’ is appended to the mnemonic, and the instruction sets condition codes Z, N, V, and C to save the state of execution.

E.g. addcc reg_rs1, reg_or_imm, reg_rd
     subcc reg_rs1, reg_or_imm, reg_rd
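How the four flags come out of a 32-bit add or subtract can be sketched in Python. The function names mirror the mnemonics but are only a model: registers are treated as unsigned 32-bit patterns, and for the subtract the C flag is set when a borrow occurs, which is the SPARC convention.

```python
MASK = 0xFFFFFFFF   # registers are 32 bits wide


def addcc(a, b):
    """Return (result, flags) for a 32-bit add that sets N, Z, V, C."""
    a &= MASK
    b &= MASK
    total = a + b
    result = total & MASK
    flags = {
        "N": result >> 31,
        "Z": int(result == 0),
        # signed overflow: both operands share a sign the result lacks
        "V": ((~(a ^ b) & (a ^ result)) >> 31) & 1,
        "C": total >> 32,
    }
    return result, flags


def subcc(a, b):
    """Return (result, flags) for a 32-bit subtract a - b that sets N, Z, V, C."""
    a &= MASK
    b &= MASK
    result = (a - b) & MASK
    flags = {
        "N": result >> 31,
        "Z": int(result == 0),
        # signed overflow: operands differ in sign and the result's sign
        # differs from that of a
        "V": (((a ^ b) & (a ^ result)) >> 31) & 1,
        "C": int(a < b),    # a borrow was needed
    }
    return result, flags


print(addcc(0x7FFFFFFF, 1))
print(subcc(3, 5))
```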


2.9.2 Branches

• Branch instructions are similar to call instructions.
• They will specify the label of the destination instruction.
• These too are delayed control transfer instructions.

Branch instructions test the condition codes in order to determine if the branching condition exists:

    b_icc label

where b_icc stands for one of the branches testing the integer condition codes.


Table of signed number branches

    Assembler Mnemonic   Unconditional Branches
    ba                   Branch always, goto
    bn                   Branch never

    Assembler Mnemonic   Signed Arithmetic Branches
    bl                   Branch on less than zero
    ble                  Branch on less or equal to zero
    be                   Branch on equal to zero
    bne                  Branch on not equal to zero
    bge                  Branch on greater or equal to zero
    bg                   Branch on greater than zero
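Each signed branch in the table tests a fixed combination of the N, Z, and V flags. The Python sketch below spells those combinations out; the dictionary and the helper names `flags_after_cmp` and `taken` are illustrative, and the flags are computed for a comparison of two signed 32-bit values (a subtraction a - b).

```python
# Flag test performed by each branch (N, Z, V are 0 or 1).
BRANCH_TAKEN = {
    "ba":  lambda f: True,
    "bn":  lambda f: False,
    "bl":  lambda f: bool(f["N"] ^ f["V"]),
    "ble": lambda f: bool(f["Z"] | (f["N"] ^ f["V"])),
    "be":  lambda f: bool(f["Z"]),
    "bne": lambda f: not f["Z"],
    "bge": lambda f: not (f["N"] ^ f["V"]),
    "bg":  lambda f: not (f["Z"] | (f["N"] ^ f["V"])),
}


def flags_after_cmp(a, b):
    """Flags left by comparing signed 32-bit values a and b (computes a - b)."""
    diff = a - b
    wrapped = (diff + 2**31) % 2**32 - 2**31   # result as a signed 32-bit value
    return {"N": int(wrapped < 0), "Z": int(wrapped == 0),
            "V": int(wrapped != diff)}


def taken(mnemonic, a, b):
    """Would the branch be taken after comparing a with b?"""
    return BRANCH_TAKEN[mnemonic](flags_after_cmp(a, b))


print(taken("bl", 3, 5), taken("bg", 3, 5), taken("ble", 5, 5))
```

Testing N xor V rather than N alone keeps the signed comparison correct even when the subtraction overflows, as in comparing the most negative value with 1.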


2.10 Control statements

2.10.1 While

The condition of a while loop is to be evaluated before the loop is executed, and if the condition is not met, the loop, including the first instruction of the loop, is not to be executed. Consider the following while loop in C:

    while (a <= 17) {
        a += b;
        c++;
    }


Annulled Conditional Branches:

- If the condition is true, the branch is taken and the delay slot instruction is executed; if the condition is false, the delay slot instruction is annulled
- The delay slot instruction is still fetched in either case; only its execution is annulled, which wastes a cycle when the condition is false
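The rule for the delay slot of a conditional branch can be written as a one-line decision. This Python sketch uses a hypothetical function name and covers conditional branches only.

```python
def delay_slot_executes(annulled_branch, condition_true):
    """Does the delay-slot instruction of a conditional branch execute?

    annulled_branch: True if the branch is the annulled form
    condition_true:  True if the branch condition holds (branch is taken)
    """
    if not annulled_branch:
        return True            # ordinary branch: delay slot always executes
    return condition_true      # annulled branch: executes only when taken


for annulled in (False, True):
    for condition in (False, True):
        print(annulled, condition, delay_slot_executes(annulled, condition))
```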


2.10.2 Do

Consider a Do loop:


2.10.3 For

For structure in C:

    for (ex1; ex2; ex3) st

Express the above definition as:

    ex1;
    while (ex2) {
        st
        ex3;
    }


Thus the translation of

    for (a = 1; a <= b; a++)
        c *= a;

would be:


2.10.4 If Then

The statement following the relational expression is to be branched over if the condition is not true. To accomplish this, we need to logically complement the sense of the branch, following the relational expression evaluation, before the code for the statement.

Table of complements of the branches

Condition Complement

bl bge

ble bg

be bne

bne be

bge bl

bg ble
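The complement table can be used mechanically: pick the branch that matches the C relational operator, then take its complement to branch over the statement. A Python sketch of that lookup follows; the operator-to-branch dictionary and the function name `branch_over` are illustrative additions, while the complement pairs come from the table.

```python
# Complement pairs, as in the table.
COMPLEMENT = {"bl": "bge", "ble": "bg", "be": "bne",
              "bne": "be", "bge": "bl", "bg": "ble"}

# Signed branch that corresponds to each C relational operator.
C_OPERATOR_BRANCH = {"<": "bl", "<=": "ble", "==": "be",
                     "!=": "bne", ">=": "bge", ">": "bg"}


def branch_over(c_operator):
    """Branch that skips the statement when the relation is false."""
    return COMPLEMENT[C_OPERATOR_BRANCH[c_operator]]


for op in C_OPERATOR_BRANCH:
    print(f"if (a {op} b): skip the statement with {branch_over(op)}")
```

Complementing twice gives back the original branch, which is a quick sanity check on the table.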


For example, to translate


2.10.5 If Else

An if-else statement allows us to do better with regard to filling the delay slot.

Consider:

    if ((a + b) >= c) {
        a += b;
        c++;
    } else {
        a -= b;
        c--;
    }
    c += 10;


We will complement the initial test so that it branches over the then code to the else code if the condition is false.
