Lect3-1

Lecture 3: control transfer instructions

ICT Software & Data processing
Postacademic course in ICT

Part I: The hardware-software interface
Module 1: Computer architecture

Prof. Koen De Bosschere
Electronics Dept
Ghent University

Lect3-2

Overview
• Jumps
• Loops
• Procedure call and return
• Interrupts
• System operations
• Measuring performance
• Instruction decoding
• Compilers, linkers, and loaders

Lect3-3

Jumps

• Unconditional jumps
• Conditional jumps
• Computed jumps

Lect3-4

Unconditional Jumps

jmp address

    10: i1
    14: jmp 24
    18: i3
    1c: i4
    20: i5
    24: i6
    28: i7
    2c: i8
    30: jmp 20
    34: i10

Lect3-5

Conditional jumps

    10: i1
    14: jle 24
    18: i3
    1c: i4
    20: jmp 2c
    24: i6
    28: i7
    2c: i8
    30: i9
    34: i10

Basic blocks:
    [i1, jle 24]
    [i3, i4, jmp 2c]
    [i6, i7]
    [i8, i9, i10]
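The division into basic blocks can be computed mechanically. The sketch below is one common way to do it, not something taken from the slides: an instruction starts a new block if it is the first instruction, if it follows a jump, or if it is the target of a jump. The `Instr` type and the function name `find_leaders` are hypothetical names chosen for this illustration.

```c
#include <stddef.h>

/* Hypothetical instruction record used only for this illustration. */
typedef struct {
    unsigned address;   /* address of the instruction */
    int is_jump;        /* nonzero for jmp, jle, ... */
    unsigned target;    /* jump target, used only when is_jump is nonzero */
} Instr;

/* Sets leaders[i] to 1 when instruction i starts a basic block.
   Returns the number of basic blocks. */
size_t find_leaders(const Instr *code, size_t n, int *leaders)
{
    size_t i, j, count = 0;

    for (i = 0; i < n; i++)
        leaders[i] = 0;
    if (n > 0)
        leaders[0] = 1;                 /* first instruction starts a block */

    for (i = 0; i < n; i++) {
        if (!code[i].is_jump)
            continue;
        if (i + 1 < n)
            leaders[i + 1] = 1;         /* instruction after a jump */
        for (j = 0; j < n; j++)
            if (code[j].address == code[i].target)
                leaders[j] = 1;         /* target of a jump */
    }

    for (i = 0; i < n; i++)
        count += (size_t)leaders[i];
    return count;
}
```

Applied to the ten instructions at addresses 10 to 34 with `jle 24` and `jmp 2c`, this marks the instructions at 10, 18, 24 and 2c as block starts, which gives four basic blocks.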

Lect3-6

Jump conditions (1)

instruction    jump
jz             jump if zero
jc             jump if carry
jo             jump if overflow
js             jump if sign
jnz            jump if not zero
jnc            jump if not carry
jno            jump if not overflow
jns            jump if not sign

Lect3-7

Jump conditions (2)

instruction    jump

2's complement (signed) comparisons:
jg  / jnle     jump if greater
jge / jnl      jump if greater or equal
jl  / jnge     jump if less
jle / jng      jump if less or equal

je             jump if equal

binary (unsigned) comparisons:
ja  / jnbe     jump if above
jae / jnb      jump if above or equal
jb  / jnae     jump if below
jbe / jna      jump if below or equal
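Why two families of comparison jumps are needed can be illustrated in C, where the same 32-bit pattern compares differently depending on whether it is read as 2's complement or as plain binary. This is a minimal sketch; the function names are made up for the illustration, and the cast to `int32_t` assumes the usual wrap-around behaviour of 2's complement machines.

```c
#include <stdint.h>

/* 1 if a < b when both patterns are read as 2's complement
   (the condition tested by jl / jnge). */
int less_signed(uint32_t a, uint32_t b)
{
    return (int32_t)a < (int32_t)b;
}

/* 1 if a < b when both patterns are read as plain binary
   (the condition tested by jb / jnae). */
int below_unsigned(uint32_t a, uint32_t b)
{
    return a < b;
}
```

For the pattern ffffffff compared with 1, `less_signed` returns 1 (the pattern means -1) while `below_unsigned` returns 0 (the pattern means 4294967295).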

Lect3-8

Static vs. Computed address

Static address:

    jmp 100

Computed address:

    mov ebx,100
    jmp ebx
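A rough high-level analogue of a computed jump is a call through a table of function pointers: the destination is a value chosen at run time rather than a constant in the instruction. The sketch below only illustrates that idea; the handler functions and `dispatch` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

static int handler_add(int x)    { return x + 1; }
static int handler_double(int x) { return x * 2; }
static int handler_negate(int x) { return -x; }

/* Table of code addresses; the selector plays the role of the
   value that is loaded into ebx before "jmp ebx". */
static int (*const handlers[])(int) = {
    handler_add, handler_double, handler_negate
};

int dispatch(size_t selector, int x)
{
    assert(selector < sizeof handlers / sizeof handlers[0]);
    return handlers[selector](x);   /* transfer through a computed address */
}
```

Compilers may translate a dense `switch` statement in a similar way, with a table of addresses indexed by the switch value.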

Lect3-9

Absolute vs. relative address

• Absolute: jump to address n
• Relative: jump n bytes further/back

    10: i1
    14: jmp 24
    18: i3
    1c: i4
    20: i5
    24: i6
    28: i7
    2c: i8
    30: jmp pc-10
    34: i10

Lect3-10

Position independent code

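[Figure: the same code (i1, jmp 24, i3 ... i8, jmp pc-10, i10) shown twice, once at addresses 10 to 34 and once relocated within addresses 10 to 3c. After relocation the absolute jump jmp 24 must be changed to jmp 2c, whereas the relative jump jmp pc-10 remains correct.]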
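The difference can be checked with a little arithmetic. The sketch below assumes, as the slides appear to, that `pc` is the address of the jump instruction itself and that the code is moved 8 bytes further; the function name `relative_target` is hypothetical. (On a real x86 the offset is counted from the next instruction, which does not change the argument.)

```c
/* Target of "jmp pc+offset" for a jump located at address pc. */
unsigned relative_target(unsigned pc, int offset)
{
    return pc + (unsigned)offset;
}
```

At its original position 30 the jump `jmp pc-10` reaches 20. After moving the code by 8 bytes the jump sits at 38 and reaches 28, which is exactly where the old instruction at 20 now lives, so the relative jump needs no patching.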

Lect3-12

Loop instruction

loop address
• Decrement ecx
• Jump to address if ecx <> 0

    10: i1
    14: i2
    18: mov ecx,5
    1c: i4          <- start of loop
    20: i5
    24: i6
    28: i7
    2c: i8
    30: loop 1c     <- end of loop
    34: i10
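In C terms the loop instruction behaves like a do-while loop that decrements a counter and repeats while the counter is not zero. A minimal sketch, with `run_loop` as a hypothetical name and the loop body reduced to counting its own executions:

```c
/* Mimics: mov ecx,<count> ... body ... loop <start of body>.
   Returns how many times the body was executed. */
int run_loop(unsigned ecx)
{
    int body_runs = 0;

    do {
        body_runs++;          /* stands for the loop body i4 ... i8 */
    } while (--ecx != 0);     /* decrement ecx, jump back if ecx <> 0 */

    return body_runs;
}
```

With ecx set to 5 the body runs five times. Because the test comes after the body, the body always runs at least once.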

Lect3-13

Programmed loop

    10: i1
    14: i2
    18: mov ecx,5
    1c: i4          <- start of loop
    20: i5
    24: i6
    28: i7
    2c: i8
    30: sub ecx,1
    34: jnz 1c      <- end of loop

Lect3-15

Function call

    10: i1
    14: i2
    18: call 34
    22: i4
    26: i5
    30: i6
    34: i7
    38: i8
    42: i9
    46: i10
    50: ret

Execution order: i1, i2, call 34, i7, i8, i9, i10, ret, i4, i5, i6

Lect3-16

Function call and return

    int fivefold(int n)
    {
        if (n > 0) return n * 5;
        else return 0;
    }

    int g;

    int main(void)
    {
        g = fivefold(6);
    }

Lect3-17

Code

    fivefold:
    300:    cmp eax,0
    302:    jg positive
    304:    xor eax,eax
    306:    ret
    positive:
    307:    mov ebx,5
    30c:    imul ebx
    30e:    ret

    main:
    30f:    mov eax,6
    314:    call fivefold
    319:    mov g,eax

    Registers: eax = ????????  ebx = ????????  edx = ????????
               esp = 00000108  eip = 0000030f  (next: mov eax,6)
    Flags:     s=? z=?
    Stack:     100: ????????
               104: ????????
               108: ????????

Lect3-18

Code: after mov eax,6

    Registers: eax = 00000006  ebx = ????????  edx = ????????
               esp = 00000108  eip = 00000314  (next: call fivefold)
    Flags:     s=? z=?
    Stack:     100: ????????
               104: ????????
               108: ????????

    call fivefold is equivalent to:
        push 319
        jmp 300
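The equivalence between call and push-plus-jump, and the matching behaviour of ret, can be sketched on a toy machine model. Everything in this sketch (the `Machine` type, the small memory array, the function names) is an assumption made for illustration; only the addresses 300, 314, 319 and the stack pointer 108 come from the example.

```c
#include <assert.h>
#include <stdint.h>

/* Toy machine state: instruction pointer, stack pointer and a small
   memory of 32-bit words covering byte addresses 0 to 1ff. */
typedef struct {
    uint32_t eip;
    uint32_t esp;
    uint32_t mem[0x200 / 4];
} Machine;

/* call target: push the return address, then jump to the target. */
void do_call(Machine *m, uint32_t target, uint32_t return_address)
{
    assert(m->esp >= 4 && m->esp <= sizeof m->mem);
    m->esp -= 4;                            /* the stack grows downwards */
    m->mem[m->esp / 4] = return_address;    /* push return address */
    m->eip = target;                        /* jmp target */
}

/* ret: pop the return address into the instruction pointer. */
void do_ret(Machine *m)
{
    assert(m->esp + 4 <= sizeof m->mem);
    m->eip = m->mem[m->esp / 4];
    m->esp += 4;
}
```

Starting from eip = 314 and esp = 108, `do_call(&m, 0x300, 0x319)` leaves esp at 104 with 319 stored there and eip at 300; a following `do_ret(&m)` restores esp to 108 and continues at 319.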

Lect3-19

Code: after call fivefold

    Registers: eax = 00000006  ebx = ????????  edx = ????????
               esp = 00000104  eip = 00000300  (next: cmp eax,0)
    Flags:     s=? z=?
    Stack:     100: ????????
               104: 00000319
               108: ????????

Lect3-20

Code: after cmp eax,0

    Registers: eax = 00000006  ebx = ????????  edx = ????????
               esp = 00000104  eip = 00000302  (next: jg positive)
    Flags:     s=0 z=0
    Stack:     100: ????????
               104: 00000319
               108: ????????

Lect3-21

Code: after jg positive (jump taken)

    Registers: eax = 00000006  ebx = ????????  edx = ????????
               esp = 00000104  eip = 00000307  (next: mov ebx,5)
    Flags:     s=0 z=0
    Stack:     100: ????????
               104: 00000319
               108: ????????

Lect3-22

Code: after mov ebx,5

    Registers: eax = 00000006  ebx = 00000005  edx = ????????
               esp = 00000104  eip = 0000030c  (next: imul ebx)
    Flags:     s=0 z=0
    Stack:     100: ????????
               104: 00000319
               108: ????????

Lect3-23

Code: after imul ebx

    Registers: eax = 0000001e  ebx = 00000005  edx = 00000000
               esp = 00000104  eip = 0000030e  (next: ret)
    Flags:     s=0 z=0
    Stack:     100: ????????
               104: 00000319
               108: ????????

Lect3-24

Code: after ret

    Registers: eax = 0000001e  ebx = 00000005  edx = 00000000
               esp = 00000108  eip = 00000319  (next: mov g,eax)
    Flags:     s=0 z=0
    Stack:     100: ????????
               104: 00000319
               108: ????????

Lect3-25

Saving registers

    fivefold:
    300:    push edx
    301:    cmp eax,0
    303:    jg positive
    305:    xor eax,eax
    307:    pop edx
    308:    ret
    positive:
    309:    mov ebx,5
    30e:    imul ebx
    310:    pop edx
    311:    ret

    main:
    312:    mov eax,6
    317:    call fivefold
    31c:    mov g,eax

    Registers: eax = 0000001e  ebx = 00000005  edx = ????????
               esp = 00000108
    Flags:     s=0 z=0
    Stack:     0fc: ????????
               100: edx
               104: 0000031c
               108: ????????

Lect3-26

Saving registers

    fivefold:
        push edx
        push ebx
        cmp eax,0
        jg positive
        xor eax,eax
        pop ebx
        pop edx
        ret
    positive:
        mov ebx,5
        imul ebx
        pop ebx
        pop edx
        ret

    main:
        mov eax,6
        call fivefold
        mov g,eax

    Registers: eax = 0000001e  ebx = ????????  edx = ????????
    Flags:     s=0 z=0
    Stack:     0f8: ????????
               0fc: ebx
               100: edx
               104: ret address
               108: ????????

Lect3-27

Control Flow Graph

    [entry]
    fivefold:
        push edx
        push ebx
        cmp eax,0
        jg positive

    [jump not taken]
        xor eax,eax
        pop ebx
        pop edx
        ret

    [jump taken]
    positive:
        mov ebx,5
        imul ebx
        pop ebx
        pop edx
        ret

Lect3-28

Control Flow Graph

    [entry]
    fivefold:
        cmp eax,0
        jg positive

    [jump not taken]
        xor eax,eax
        ret

    [jump taken]
    positive:
        push edx
        push ebx
        mov ebx,5
        imul ebx
        pop ebx
        pop edx
        ret

Lect3-29

Control Flow Graph

    [entry]
    fivefold:
        cmp eax,0
        jg positive

    [jump not taken]
        xor eax,eax
        ret

    [jump taken]
    positive:
        push edx
        mov edx,5
        imul edx
        pop edx
        ret

Lect3-30

Parameter Passing via stack

    fivefold:
        mov eax,[esp+4]
        cmp eax,0
        jg positive
        xor eax,eax
        ret 4
    positive:
        mov ebx,5
        imul ebx
        ret 4

    main:
        push 6
        call fivefold
        mov g,eax

    Stack inside fivefold:
        0fc: ????????
        100: ret address    <- esp
        104: 00000006
        108: ????????

Lect3-31

Local Variables

    int fivefold(int n)
    {
        int result;

        if (n > 0) result = n * 5;
        else result = 0;
        return result;
    }

    int g;

    int main(void)
    {
        g = fivefold(6);
    }

Lect3-32

Local Variables

    fivefold:
        sub esp,4
        cmp eax,0
        jg positive
        xor eax,eax
        mov [esp],eax
        jmp end
    positive:
        mov ebx,5
        imul ebx
        mov [esp],eax
    end:
        mov eax,[esp]
        add esp,4
        ret

    main:
        mov eax,6
        call fivefold
        mov g,eax

    Stack inside fivefold:
        0fc: ????????
        100: ????????       <- esp
        104: ret address
        108: ????????

Lect3-33

Control Flow Graph

    [entry]
    fivefold:
        sub esp,4
        cmp eax,0
        jg positive

    [jump not taken]
        xor eax,eax
        mov [esp],eax
        jmp end

    [jump taken]
    positive:
        mov ebx,5
        imul ebx
        mov [esp],eax

    [both paths continue here]
    end:
        mov eax,[esp]
        add esp,4
        ret

Lect3-34

Complete Picture

    fivefold:
        push ebx
        push edx
        sub esp,4
        mov eax,[esp+10]
        cmp eax,0
        jg positive
        xor eax,eax
        mov [esp],eax
        jmp end
    positive:
        mov ebx,5
        imul ebx
        mov [esp],eax
    end:
        mov eax,[esp]
        add esp,4
        pop edx
        pop ebx
        ret 4

    main:
        push 6
        call fivefold
        mov g,eax

    Stack frame (offsets in hexadecimal):
        esp:    result
        esp+4:  edx
        esp+8:  ebx
        esp+c:  ret address
        esp+10: 00000006

Lect3-35

Stack frames

[Figure: stack frames of procedures p1, p2, p3 and p4, next to the corresponding activation tree.]

Lect3-37

Interrupts

• Jump to a routine via a number instead of an address
• The addresses of the routines are stored in an address table (the so-called vector table)
• Used to catch errors, or as an interface to the operating system
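The idea of a vector table can be sketched in C as an array of function pointers indexed by the interrupt number. This is only an analogy under stated assumptions: the table size, the routine names and the values they return are invented for the illustration, and a real processor also saves state before entering the routine.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*InterruptRoutine)(void);

#define VECTOR_COUNT 5

/* The vector table: entry n holds the address of routine n. */
static InterruptRoutine vector_table[VECTOR_COUNT];

/* Two hypothetical routines; the return values just identify them. */
static int routine_error(void)       { return 100; }
static int routine_system_call(void) { return 200; }

void install_routine(size_t number, InterruptRoutine routine)
{
    assert(number < VECTOR_COUNT);
    vector_table[number] = routine;
}

/* "int number": look up the routine address and transfer control to it. */
int raise_interrupt(size_t number)
{
    assert(number < VECTOR_COUNT && vector_table[number] != NULL);
    return vector_table[number]();
}
```

The caller only needs to know the number. The routine behind that number can be replaced by writing a different address into the table, which is what makes the mechanism usable as an operating system interface.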

Lect3-38

Interrupts

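[Figure: a program (i1 ... i8, an int instruction, i10, i11; the slide shows int 2 and int 3), the vector table with entries 0 to 4, and interrupt routine 3 (i1 ... i6), which is reached through the vector table.]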

Lect3-40

System operations

• Controlling the machine: manipulation of the processor state
  – interrupts on/off
  – changing the privilege level
  – halt instruction
  – switching from big-endian to little-endian
  – memory management (caches, virtual memory, etc.)

Lect3-42

MIPS & MFLOPS

• MIPS: Million instructions per second
• MFLOPS: Million floating point operations per second
• Problems:
  – Depends on the architecture (multiply accumulate = 1 or 2 instructions?)
  – Depends on the program
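The architecture dependence is easy to see once the metric is written as a formula. The sketch below uses the usual definition, instruction count divided by execution time in microseconds; the function name `mips` and the instruction counts in the example are illustrative, not measurements.

```c
/* MIPS = executed instructions / (execution time in seconds * 10^6) */
double mips(double instruction_count, double seconds)
{
    return instruction_count / (seconds * 1e6);
}
```

Suppose the same computation takes 1 second on two machines, but one of them needs 2e9 instructions because it has no multiply-accumulate instruction while the other needs 1e9. The first machine then scores 2000 MIPS and the second 1000 MIPS, although neither is faster.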

Lect3-43

Benchmark programs

• The only reliable performance metric = execution time

• Always mention the program + input

• Ideally: your own application

• Difficult to realize (porting)

• That’s why ‘typical programs’ are used: benchmarks
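Measuring the execution time of a program fragment can be sketched with the standard C `clock()` function, which reports processor time used by the program. The names `time_workload` and `sample_workload` are hypothetical, and the workload is a placeholder for your own application or benchmark.

```c
#include <time.h>

/* Placeholder workload; replace with the program under test. */
void sample_workload(void)
{
    volatile unsigned long sum = 0;   /* volatile keeps the loop from being optimized away */
    for (unsigned long i = 0; i < 1000000UL; i++)
        sum += i;
}

/* Returns the processor time, in seconds, consumed by one run of the workload. */
double time_workload(void (*workload)(void))
{
    clock_t start = clock();
    workload();
    clock_t end = clock();
    return (double)(end - start) / CLOCKS_PER_SEC;
}
```

A single short run is dominated by timer resolution and system noise, so in practice the workload is repeated and the program, its input and the measurement method are reported together with the result.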

Lect3-44

Benchmark programs

• Whetstone, Dhrystone
• EEMBC, Mediabench
• TPC-benchmarks
• SPEC (Standard Performance Evaluation Corporation): Spec92, Spec95, Spec2000, Spec2006 (SpecInt, SpecFp, SpecRate)
• Ziff-Davis

Lect3-45

 #   MHz  Processor          int peak     MHz  Processor          fp peak
 1  3400  Pentium 4 EE           1705    1500  Itanium 2             2161
 2  3200  Pentium 4 Xeon         1563    1700  POWER4+               1776
 3  2200  Opteron                1477    3400  Pentium 4 EE          1561
 4  2200  Athlon 64 FX           1447    2200  Opteron               1514
 5  3200  Pentium 4 E            1421    1150  Alpha 21364           1482
 6  3000  Pentium 4 Xeon MP      1408    3200  Pentium 4 E           1441
 7  1500  Itanium 2              1404    2200  Athlon 64 FX          1423
 8  3400  Pentium 4              1393    1250  Alpha 21264C          1365
 9  2000  Athlon 64              1335    1320  SPARC64 V             1350
10  1700  POWER4+                1158    3200  Pentium 4 Xeon        1347
11  2200  Athlon XP              1080    2800  Pentium 4             1327
12  1250  Alpha 21264C            928    3000  Pentium 4 Xeon MP     1283
13  1350  SPARC64 V               905    1300  POWER4                1281
14  1150  Alpha 21364             877    2000  Athlon 64             1250
15  1300  POWER4                  848    1200  UltraSPARC III Cu     1118
16  2000  Athlon MP               766    1280  UltraSPARC IIIi       1063
17  1200  UltraSPARC III Cu       722    2200  Athlon XP              982
18  1280  UltraSPARC IIIi         704     833  Alpha 21264B           784
19  1000  Pentium M               687     800  Itanium                701
20   875  PA-RISC 8700+           678     875  PA-RISC 8700+          674

Lect3-47

Instruction decoding

    Bytes:  05 02 00
    Bits:   00000101 00000010 00000000

    0000010              opcode: add immediate to accumulator
    1                    w = 1: word operand
    00000010 00000000    data = 2 (low byte first)

    Instruction: add ax,2
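The decoding steps for this one instruction format can be sketched in C: the top seven bits of the first byte are the opcode, the lowest bit is the w flag, and the immediate follows with its low byte first. The `Decoded` type and the function name are hypothetical, and the sketch handles only the "add immediate to accumulator" format shown here.

```c
#include <stdint.h>

typedef struct {
    int is_add_imm_acc;   /* 1 if the opcode bits are 0000010 */
    int w;                /* 0 = byte operand, 1 = word operand */
    uint16_t data;        /* the immediate value */
} Decoded;

Decoded decode_add_imm_acc(const uint8_t bytes[3])
{
    Decoded d;

    d.is_add_imm_acc = (bytes[0] >> 1) == 0x02;     /* top 7 bits: 0000010 */
    d.w = bytes[0] & 1;                             /* lowest bit: w */
    if (d.w)
        d.data = (uint16_t)(bytes[1] | (bytes[2] << 8));   /* low byte first */
    else
        d.data = bytes[1];                          /* single data byte */
    return d;
}
```

For the bytes 05 02 00 this yields a recognized opcode, w = 1 and data = 2, that is add ax,2.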

Lect3-48

Instruction description

SUB subtract

    Flags:     O D I T S Z A P C
               *       * * * * *

    Encoding:  001010dw oorrrmmm disp

    Clock cycles (ea = 5-14):

               sub r,r   sub r,m   sub m,r
    8088       3         13+ea     24+ea
    8086       3         9+ea      16+ea
    80286      2         7         7
    80386      2         7         6
    80486      1         2         3
    Pentium    1         2         3

Lect3-50

Program development

    source -> compiler -> object file
    object files + libraries -> linker -> executable file

Lect3-51

Compiler

• Different phases
  – Lexical analysis: a, 12, then, (, while
  – Syntactic analysis: if (a>b) then …
  – Semantic analysis: type-checking
  – Optimization
  – Code generation
  – Scheduling

Lect3-52

Object file

Contents of an object file:
• instructions
• extra information
• global variables

Lect3-53

Linker

[Figure: linker]

Lect3-54

Loader

[Figure: the loader places the program in memory, together with a stack and a heap (dynamically allocated memory).]
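The three kinds of storage can be seen side by side in a small C function. This is a minimal sketch with hypothetical names: the global variable is part of the program image, the local variable lives on the stack, and the array obtained from `malloc` lives on the heap.

```c
#include <stdlib.h>

int global_counter = 7;     /* global variable: part of the loaded program */

/* Returns 1 + 2 + ... + n, using a heap-allocated array as scratch space.
   Returns 0 for n <= 0 and -1 if the allocation fails. */
int sum_on_heap(int n)
{
    int total = 0;          /* local variable: on the stack */
    int *values;

    if (n <= 0)
        return 0;

    values = malloc((size_t)n * sizeof *values);   /* dynamically allocated: on the heap */
    if (values == NULL)
        return -1;

    for (int i = 0; i < n; i++)
        values[i] = i + 1;
    for (int i = 0; i < n; i++)
        total += values[i];

    free(values);           /* heap memory must be released explicitly */
    return total;
}
```

The stack space for `total` and `values` disappears automatically when the function returns, whereas the heap block stays allocated until `free` is called.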