
Assembly Language

CS2253

Owen Kaser, UNBSJ

Assembly Language

● Some insane machine-code programming

● Assembly language as an alternative

● Assembler directives

● Mnemonics for instructions

Machine-Code Programming (or, Why Assemblers Keep Us Sane)

● Compute 10+9+8+7+6+5+4+3+2+1

– Put the constant 0 into R1

– Put the constant 10 into R2

– Add R1 and R2, put the result into R1

– Subtract the constant 1 from R2 and set the status flags

– If the Z flag is not set, reset the PC to contain the address of the 3rd instruction above.

● Let's try to make some machine code.
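Before turning the recipe into machine code, it can help to check the logic. The following is a minimal Python sketch of the five steps, not ARM code; the function name `sum_countdown` and the variable `z_flag` are made up for illustration.

```python
def sum_countdown(start=10):
    """Mimic the five-step recipe: R1 accumulates, R2 counts down."""
    r1 = 0                    # Put the constant 0 into R1
    r2 = start                # Put the constant 10 into R2
    while True:
        r1 = r1 + r2          # Add R1 and R2, put the result into R1
        r2 = r2 - 1           # Subtract 1 from R2 and set the status flags
        z_flag = (r2 == 0)    # Z is set when the result is zero
        if z_flag:            # Z set: do not branch back to the add
            break
    return r1


print(sum_countdown())  # 55
```

The loop body always runs at least once, just like the assembly version, because the test comes after the subtraction.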

Put 0 into R1

● There's a Move instruction, or you could subtract a register from itself, or EOR a register with itself, or... let's use Move.

● Book Fig 1.12

● cond = 1110 means unconditional

● S=0 means don't affect status flags

● I=1 means constant; opcode = 1101 for Move

● Rn = ???? say 0000; Rd = 0001 for R1

● bits 8-11: 0000 (rotate right by 2*0)

● bits 0-7: 0x00

● So machine code is

1110 00 1 1101 0 0000 0001 0000 00000000 = 0xE3A01000

Put 10 into R2

● cond = 1110 means unconditional

● S=0 means don't affect status flags

● I=1 means constant; opcode = 1101 for Move

● Rn = ???? say 0000; Rd = 0010 for R2

● bits 8-11: 0000 (rotate right by 2*0)

● bits 0-7: 0x0A

● So machine code is

1110 00 1 1101 0 0000 0010 0000 00001010 = 0xE3A0200A

Add R1 and R2, put result into R1

● Same basic machine code format as Move

● cond = 1110 for “always”; I=0 (not constant)

● opcode = 0100 for ADD; S=0 (no flag update)

● Rn = R1, Rd = R1

● shifter_operand = 0x002 for R2 unmolested

● Having fun yet??

● 1110 00 0 0100 0 0001 0001 0000 0000 0010 = 0xE0811002

Subtract 1 from R2, result into R2

● Same basic machine code format as Move

● cond = 1110 for “always”; I=1 (constant)

● opcode = 0010 for Subtract; S=1 (yes flag update)

● Rn = R2, Rd = R2

● shifter_operand = 0x001 for 1 rotated right 0 positions

● 1110 00 1 0010 1 0010 0010 0000 0000 0001 = 0xE2522001
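The field-packing done by hand on the last four slides can be sketched in Python. This is only an illustration of the bit layout described above; the helper `encode_dp` and its parameter names are hypothetical, and `operand` is simply the raw 12-bit shifter_operand field.

```python
def encode_dp(opcode, rd, rn=0, operand=0, imm=False, s=False, cond=0b1110):
    """Pack the fields of an ARM data-processing instruction into 32 bits.

    operand is the raw 12-bit shifter_operand field (bits 11-0).
    Bits 27-26 are 00 for this instruction class.
    """
    return ((cond << 28) | (int(imm) << 25) | (opcode << 21) | (int(s) << 20)
            | (rn << 16) | (rd << 12) | (operand & 0xFFF))


MOV, ADD, SUB = 0b1101, 0b0100, 0b0010

words = [
    encode_dp(MOV, rd=1, operand=0, imm=True),                # mov r1,#0
    encode_dp(MOV, rd=2, operand=10, imm=True),               # mov r2,#10
    encode_dp(ADD, rd=1, rn=1, operand=2),                    # add r1,r1,r2
    encode_dp(SUB, rd=2, rn=2, operand=1, imm=True, s=True),  # subs r2,r2,#1
]
for w in words:
    print(f"0x{w:08X}")
```

The four printed words match the hand-assembled values 0xE3A01000, 0xE3A0200A, 0xE0811002 and 0xE2522001.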

Maybe Rinse and Repeat

● If the Z flag is not set, we want to go back to the instruction 2 before this one (the add).

● book Fig 3.2

● cond = 0001 means “when Z flag is not set”

● L=0 means “don't Link” (Link changes R14)

● signed offset should be -4. The PC is already 2 instructions ahead of this one, and we want to go back 2 more than that.

● 0001 101 0 111111111111111111111100 = 0x1AFFFFFC

● Are you REALLY having fun yet ??
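The offset arithmetic for the branch can be sketched the same way. In this Python illustration, instructions are numbered from 0 in program order (so the add is 2 and the branch is 4); the helper name `encode_branch` is an assumption, not anything from the book.

```python
def encode_branch(branch_index, target_index, cond=0b1110, link=False):
    """Encode B/BL given instruction positions (counted in words)."""
    # The PC reads as the branch's own position plus 2 instructions,
    # so the stored offset is measured from there.
    offset = target_index - (branch_index + 2)
    # Keep the low 24 bits: a negative offset becomes its 2's complement.
    return (cond << 28) | (0b101 << 25) | (int(link) << 24) | (offset & 0xFFFFFF)


NE = 0b0001  # branch when the Z flag is not set
print(f"0x{encode_branch(4, 2, cond=NE):08X}")  # 0x1AFFFFFC
```

A branch to itself needs an offset of -2 for the same reason: `encode_branch(0, 0)` gives 0xEAFFFFFE.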

How'd you know the cond codes?

How'd You Know the Shifter Magic?

An Assembler

● Rather than making you assemble together all the various bit fields that make up a machine instruction, let's make a program do that.

● You are responsible for breaking the problem down into individual instructions, which will be given human friendly names (mnemonics).

● You give these instruction names to the assembler, along with various other directives (aka pseudo-ops) that control how the assembler does its job.

● It is responsible for producing the binary machine code.

● It also produces symbol table information needed by a subsequent linker program, if you write a multi-module program.

Assembly Language

● You communicate with the assembler via assembly language (mix of mnemonics, directives, etc.)

● Assembly language is line-oriented.

● A line consists of

– an optional label in column 1

– an optional instruction or directive (and any arguments)

– an optional comment (after a ; )

● Example:

here b here ; create infinite loop.

● “here” is a label that marks a place

● b is a branch instruction, forces the PC to a new location (here).

The Bad News

● Anyone who creates an assembler gets to define their own assembly language (ignoring manufacturer's suggestions). Dialects?

● Textbook shows code for Keil and Code Composer Studio. But we use Crossware's assembler, which is yet another dialect and it's hard to find documentation on it.

● Textbook talks about “Old ARM format” and “UAL format”. Crossware is a mixture (more old).

Our Program in Assembly

mymain mov r1,#0 ← mymain is the label

mov is the instruction

# precedes the constant

; nice comment, eh?

mov r2,#10 ; put 10 into r2 (bad comment)

myloop add r1, R1, r2 ← case insensitive for reg names

subs r2, r2, #1 ← final s means to affect flags

bne myloop ← condition is “ne” (z flag false)

sticky b sticky ← so we don't fall out of pgm

end ← directive to assembler: you're done

;don't use “end”; it seems to be buggy in Crossware

Register Names

● r0 to r15 (alias R0 to R15)

● SP or sp, aliases for R13

● LR or lr, aliases for R14

● PC or pc, aliases for R15

● cpsr or CPSR (the status registers etc)

● spsr or SPSR, apsr or APSR (later)

● not s0-s3 or a1-a4 (unlike book page 63)

Popular Assembler Directives

● Textbook Section 4.4 describes the set of directives supported by the Keil assembler and the TI assembler.

● Our Crossware assembler is different from both (but closer to Keil).

● Let's look at directives to

– set aside memory space for variables/arrays

– define a block of code or data

– give a symbolic name to a value

Directive to Set Aside Memory

● The SPACE directive tells the assembler to set aside a specified number of bytes of memory. These locations will be initialized to 0.

● Usually have a label, since you need a name to refer to the allocated memory.

● Example

– myarray SPACE 100

– myarr2 SPACE 100*4 ←constant expression's ok

● Later, instructions can load and store things into the chunks of memory by referring to the names used.

● If myarray starts at address 1234, myarr2 starts at 1234+100

Use of SPACE

● An assembly language programmer uses SPACE for the same reasons that a Java programmer uses an array.

Directives for Memory Variables

● Use DCB to declare an initialized byte variable.

● DCW for initialized halfword, DCD for word.

● Example

myvar1 DCB 50 ← decimal constant

myvar2 DCB 'x' ← ASCII code of 'x'

myvar3 DCB 0x55 + 3 ← constant expression

● If myvar1 ends up being at address 1234, then myvar2 will be at 1235 and myvar3 at 1236

Alignment

● DCW assumes you want the memory variable to start at a multiple of 2 (“halfword aligned”)

● DCD assumes you want alignment to a multiple of 4.

● To achieve this, assembler will insert padding.

● If you really want to set aside a word without padding, use DCDU. The “U” is for unaligned.

● There's also DCWU.

Alignment Example

v1 DCB 10

v2 DCW 20

v3 DCB 30

v4 DCD 40

If v1 is at address 3000, then

v2 starts at 3002 (1 byte of padding)

v3 is at 3004

v4 starts at 3008 (3 bytes padding)

v1 DCB 10

v2 DCWU 20

v3 DCB 30

v4 DCDU 40

If v1 is at 3000, then

v2 starts at 3001

v3 is at 3003

v4 starts at 3004 (aligned by luck)
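The address bookkeeping in both examples can be reproduced with a few lines of Python. This is a sketch of the padding rule only, assuming one value per directive; `SIZES` and `layout` are names invented for the illustration.

```python
SIZES = {"DCB": 1, "DCW": 2, "DCD": 4, "DCWU": 2, "DCDU": 4}


def layout(start, decls):
    """Return {label: address} for a list of (label, directive) pairs."""
    addr, table = start, {}
    for label, directive in decls:
        size = SIZES[directive]
        # The unaligned variants (trailing U) never get padding.
        align = 1 if directive.endswith("U") else size
        addr += -addr % align          # pad up to the next multiple of align
        table[label] = addr
        addr += size
    return table


aligned = layout(3000, [("v1", "DCB"), ("v2", "DCW"), ("v3", "DCB"), ("v4", "DCD")])
unaligned = layout(3000, [("v1", "DCB"), ("v2", "DCWU"), ("v3", "DCB"), ("v4", "DCDU")])
print(aligned)    # {'v1': 3000, 'v2': 3002, 'v3': 3004, 'v4': 3008}
print(unaligned)  # {'v1': 3000, 'v2': 3001, 'v3': 3003, 'v4': 3004}
```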

More Alignment Control

● Keil assembler has an ALIGN directive that can force alignment to the next word boundary (inserting 0-3 bytes of padding).

● In Crossware, the directive takes a numeric argument. So ALIGN 4 (or ALIGN 8)

DCB with Several Values

● You can use DCB with several comma-separated values

● Several consecutive memory locations are set aside. A label names the first of them.

● Example: foo DCB 1,2,3,4

● We can access the location initialized to 3 as “foo+2”

● A quoted string is equivalent to a comma separated list of ASCII values.

DCB “XY” is same as DCB 'X','Y' or DCB 88,89

● DCW and DCD can also take a comma-separated list.

● Common use: make a small initialized table.

DCB: Signed or Unsigned?

● DCB's argument must be in the range -128 to +255.

● -ve values are 2's complement

● +ve values are treated as unsigned

● So DCB -1, 255 is same as DCB 255, 255

● Similarly DCW's arguments in range -32768 to +65535.

● DCD from -2^31 to +2^32-1
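The range rule and the 2's complement mapping can be sketched in Python. The helper `dc_value` is hypothetical; it models what byte, halfword or word pattern ends up in memory for one argument, not how any particular assembler is implemented.

```python
def dc_value(value, bits=8):
    """Return the unsigned bit pattern stored for one DCB/DCW/DCD argument."""
    lowest = -(1 << (bits - 1))        # e.g. -128 for DCB
    highest = (1 << bits) - 1          # e.g. +255 for DCB
    if not lowest <= value <= highest:
        raise ValueError(f"{value} does not fit in {bits} bits")
    return value & highest             # negative values become 2's complement


print(dc_value(-1), dc_value(255))     # 255 255
print(hex(dc_value(-32768, bits=16)))  # 0x8000
```

Both -1 and 255 produce the same stored byte, which is why DCB -1, 255 is the same as DCB 255, 255.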

AREA directive

● In general, an assembly language program can have several blocks of data and several blocks of code. And it can be written in several different source-code files.

● The AREA directive marks the beginning of a new block. You give it a new name and specify its type.

– eg AREA fred,code

– You can go back to a previous area by using an old name

● A tool called a linker runs after the assembler to put your various sections (and any library routines you need) into a single program.

● Much more on linkers later in the course

AREA Example

AREA mycode,code

foo add R1, R2, R3

add R4, R5, #10

AREA mydata, data

var1 dcb “cs2253”

AREA mycode ← continues mycode where it left off

add R6, R7, R8

This feature allows us to put our data declarations near the code that uses them (maybe good software engineering), even if the different sections end up being far apart in memory.

Memory picture on board...

Code in Data, Data in Code

● Q: Is this allowed; if so, what does it do?

AREA mycode, CODE

starthere add R1, R2, R3

DCD 0x1234567 ; this line is fishy

add R2, R3, R4

AREA mydata, DATA

var1 DCD 1234

var2 add R2, R3, R4 ; this line is also fishy

var3 DCB “hello world”,0

Operators in expressions

add R4, R5, #10 ↔ add R4, R5, #3+3+3*1+1

● Both of the above generate the same single machine-code instruction.

● The + and * operators are just requests to the assembler to do a little bit of math when it processes the line. No runtime effect.

● Other operators supported by Crossware are & and | (bitwise AND and OR). Also >> and <<.

● I can't find XOR, mod (unlike Keil and CCS on page 75)

EQU: Give a Symbolic Name

● The EQU directive is used to give a symbolic name to an expression. Use it to make code easier for humans.

● Example

fred DCB 20, 200, “Frederick Wu”

fred_age EQU fred+0

fred_height EQU fred+1

fred_name EQU fred+2

Subsequent instructions can load data from fred_height rather than the more cryptic fred+1.

But to the assembler, both loads will be equivalent.

Directives Crossware May Lack

● Compared to Keil and CCS, our Crossware assembler does not appear to support some directives. I can't find good documentation, so maybe they exist under a different name :(

– ENTRY

– RN

– LTORG, though we do have the “LDR rx,=” construct (eg textbook page 72)

– SETS

● Also, the SECTION directive only takes attributes CODE and DATA. Not the others in textbook Table 4.3.

● Crossware does support macros and conditional assembly, advanced topics for later in the course.

A Few Instructions

● Assembler directives are great, but the main thing in assembly language is to specify instructions (and then get the assembler to generate the associated machine codes)

● So far (from the loop example) we know

– add

– sub

– b

– mov

A Few More Instructions (Table 4.1)

● These are math-ish instructions:

– RSB – reverse subtract

– ADC, SBC – add/subtract with carry

– RSC – reverse subtract with carry

– MVN – move “negative” (a bitwise NOT)

– AND, ORR, EOR, BIC – bitwise logical operations

– MUL, SMULL, UMULL – various * ops

– MLA, SMLAL, UMLAL – multiply/accumulate.

Mnemonics

● A mnemonic is “a memory aid”.

● It’s hard to remember the bit pattern associated with a machine operation.

● As a memory aid, we have human-friendly names like ADD, SUB etc.

● They are our mnemonics.

From Reference

Example: Swapping

● Java swap of v1 and v2:

temp = v1; v1 = v2; v2 = temp;

● Naive ARM swap of r1 and r2

mov r3, r1

mov r1, r2

mov r2, r3

● Clever swap avoids trashing r3 (book p 53):

eor r1, r1, r2

eor r2, r1, r2

eor r1, r1, r2

● Book “Hacker's Delight” is full of this kind of trick.
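To see why the three EORs swap the registers, here is the same sequence traced in Python (`^` is XOR). The function name `eor_swap` is made up for this sketch.

```python
def eor_swap(r1, r2):
    """Swap two values using three XORs and no temporary."""
    r1 = r1 ^ r2   # eor r1, r1, r2   r1 = a ^ b
    r2 = r1 ^ r2   # eor r2, r1, r2   r2 = (a ^ b) ^ b = a
    r1 = r1 ^ r2   # eor r1, r1, r2   r1 = (a ^ b) ^ a = b
    return r1, r2


print(eor_swap(5, 9))  # (9, 5)
```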

Example: 64-Bit Addition

● Assume r1 contains the high 32 bits of value X and r2 contains the low 32 bits

● Assume r3 contains the high 32 bits of Y and r4 contains the low 32 bits.

● Want result in r5 (high bits) and r6 (low bits)

ADDS r6, r2, r4 ; add low words [affect flags]

ADC r5, r1, r3 ; add high words
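The role of the carry flag in the ADDS/ADC pair can be modelled in Python. This is a sketch under the register assignment on this slide; `add64` and `MASK32` are illustrative names.

```python
MASK32 = 0xFFFFFFFF


def add64(r1, r2, r3, r4):
    """X = r1:r2 and Y = r3:r4 (high:low). Returns (r5, r6) = high, low of X+Y."""
    low = r2 + r4
    r6 = low & MASK32                  # ADDS r6, r2, r4
    carry = low >> 32                  # carry out of the low words (0 or 1)
    r5 = (r1 + r3 + carry) & MASK32    # ADC r5, r1, r3
    return r5, r6


# 0x00000000_FFFFFFFF + 0x00000000_00000001 = 0x00000001_00000000
print(add64(0, 0xFFFFFFFF, 0, 1))  # (1, 0)
```

Without the carry (a plain ADD for the high words), the example above would wrongly give a high word of 0.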

Computing Your Grade

● Test was out of 80. Prof told you how many points you lost (put the number into R1). Figure out what your grade out of 80 was:

RSB R2, R1, #80

● Now your grade is in R2.

Constant Operands

● Most instructions have register values or constants as the operands

● (Exception: Load and store instructions – later)

● All 8-bit constants are okay

● As are all constants of the form

RotateRight( v, 2*amt)

where v is an 8-bit value and amt from 0 to 15.

● So 0xAB is ok

– so is 0xAB0 ( 0xAB with a 28 bit rotate right)

– so is 0xB000000A (0xAB with a 4-bit rotate right)
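One way to test whether a constant fits this form is to try all 16 rotations. The following Python sketch does that by brute force; `ror32` and `encode_constant` are hypothetical helper names, and a real assembler may pick a different (v, amt) pair when several work.

```python
def ror32(v, n):
    """Rotate a 32-bit value right by n positions."""
    n %= 32
    return ((v >> n) | (v << (32 - n))) & 0xFFFFFFFF


def encode_constant(c):
    """Return (v, amt) with c == ror32(v, 2*amt), or None if no pair exists."""
    for amt in range(16):
        v = ror32(c, 32 - 2 * amt)     # undo the rotate right
        if v <= 0xFF:
            return v, amt
    return None


for c in (0xAB, 0xAB0, 0xB000000A, 0x101):
    print(hex(c), encode_constant(c))
```

Here 0xAB0 comes out as (0xAB, 14), a 28-bit rotate right, and 0xB000000A as (0xAB, 2), a 4-bit rotate right. 0x101 gives None because its two set bits are 9 positions apart and cannot fit in an 8-bit v.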

Why This Weirdness

● Studies show that most constants are small.

● Among larger constants, bit-masks containing a small chunk of mixed bits are common (surrounded by zeros)

● Similar bitmasks that are mostly 1s can be handled by using the MVN instruction

● In a RISC architecture with 32-bit instructions, an instruction isn't long enough to hold an arbitrary 32-bit constant alongside the opcode and register fields. So just allow the most common ones.

● Assembler complains if you use a constant that cannot fit this weirdness.

Machine Instruction With Constant

The Barrel Shifter's Place

Shifted Register Operands

● If the second operand is a register value, the barrel shifter can modify it as it travels down the B bus.

● Barrel shifter is capable of

– LSL (logical shift left)

– LSR (logical shift right)

– ASR (arithmetic shift right)

– ROR (rotate right)

– RRX (33 bit ROR using carry between MSB and LSB)

● No modification desired? Shift by 0 positions!

● Carry flag is involved (but the new carry value is not necessarily written into the status register)
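The five operations can be modelled on 32-bit values in Python. This is a sketch of the arithmetic only, for shift counts 0 to 31; the function names simply mirror the mnemonics, and `rrx` returns the new carry as a second value.

```python
MASK32 = 0xFFFFFFFF


def lsl(x, n):
    return (x << n) & MASK32           # zeros enter on the right


def lsr(x, n):
    return (x & MASK32) >> n           # zeros enter on the left


def asr(x, n):
    fill = (MASK32 << (32 - n)) & MASK32 if x & 0x80000000 else 0
    return ((x & MASK32) >> n) | fill  # copies of the sign bit enter on the left


def ror(x, n):
    n %= 32
    return ((x >> n) | (x << (32 - n))) & MASK32


def rrx(x, carry):
    """33-bit rotate right by one: carry enters the MSB, the LSB becomes carry."""
    return ((carry << 31) | (x >> 1)) & MASK32, x & 1


print(hex(lsr(0x80000000, 4)))  # 0x8000000
print(hex(asr(0x80000000, 4)))  # 0xf8000000
print(rrx(0x3, 1))              # (2147483649, 1)
```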

ARM Shifts and Rotates

How Much Shifting

● With RRX, it appears the register can only be shifted by one position.

● With others, you can shift 0 to 31 positions

– Either as a constant (“immediate”)

– Or by a count held in a register (ARM reads the register's least significant byte, not just 5 bits)

● There are separate machine code formats for these cases.

– Bit 4 distinguishes the cases

– Bits 5 & 6 say what kind of shift/rotate

– Bits 11 to 7 involve which register, or the constant

Machine Encoding (from Ref Man)

● Below, shift field is 00 for LSL, 01 for LSR, 10 for ASR, 11 for ROR. RRX also 11 with count of 0 (and rotates only one position).

Example

● Machine code to take R1, logical left shift it by 3 positions, result in R2

● Assembly language: MOV R2, R1, LSL #3

● It’s the “immediate shift” format:

– Bits 27, 26, 25 and 4 are all 0

● Bits 11 to 7 are 00011 (for the #3)

● Bits 3 to 0 are 0001 (since R1 is being shifted)

● Bits 5 & 6 are 00 to select the LSL kind of shift

● Unconditional, bits 31 to 28 are 1110; MOV opcode 1101

● So: 1110 00 0 1101 0 ???? 0010 00011 00 0 0001 = 0xE1A02181 (taking ???? as 0000)
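The same bit fields can be assembled in Python as a check on the hand encoding. This sketch covers only MOV in the immediate-shift format, with the unused Rn field taken as 0000; `SHIFT` and `encode_mov_shifted` are invented names.

```python
SHIFT = {"LSL": 0b00, "LSR": 0b01, "ASR": 0b10, "ROR": 0b11}


def encode_mov_shifted(rd, rm, kind, amount, cond=0b1110, s=False):
    """Encode MOV rd, rm, <kind> #amount (immediate-shift format, Rn = 0000)."""
    shifter = ((amount & 0x1F) << 7) | (SHIFT[kind] << 5) | (0 << 4) | rm
    return (cond << 28) | (0b1101 << 21) | (int(s) << 20) | (rd << 12) | shifter


print(f"0x{encode_mov_shifted(2, 1, 'LSL', 3):08X}")  # 0xE1A02181
```

With a shift amount of 0 and LSL, the shifter field is just the register number, which is the "shift by 0 positions" case used for plain register operands.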

Setting Conditions

● Any of the data-processing instructions so far can optionally affect the flags.

● At the machine-code level, bit 20 (called S) controls this: S=1 means to set the flags

● In assembly language, you append an S on the mnemonic. ADDS instead of ADD

● Also, there are some instructions whose sole purpose is to set flags: they don’t change any of R0 to R15.

● Compare (CMP, CMN) and Test (TST, TEQ) instructions.

Sum to a Limit

● Let’s add 1+2+3+… until sum exceeds R4 (unsigned)

MOV R1,#0 ; The sum

MOV R2,#1

LP ADD R1, R1, R2

ADD R2, R2, #1

CMP R1, R4 ; computes R1 - R4, sets flags

BLS LP ; LS = unsigned Lower or Same (CF=0 or Z=1)

; use LE for signed Lesser or Equal
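The CMP/BLS pair in the sum-to-a-limit loop can be traced in Python. This is a rough model, assuming R4 is well below 2^32 so the sum never wraps; `sum_to_limit`, `c_flag` and `z_flag` are illustrative names. After CMP, the carry flag is set when the subtraction R1 - R4 needs no borrow, that is, when R1 >= R4 as unsigned numbers.

```python
MASK32 = 0xFFFFFFFF


def sum_to_limit(r4):
    """Add 1+2+3+... until the sum exceeds r4 (unsigned); return the sum."""
    r1, r2 = 0, 1                        # MOV R1,#0 / MOV R2,#1
    while True:
        r1 = (r1 + r2) & MASK32          # LP ADD R1, R1, R2
        r2 = (r2 + 1) & MASK32           #    ADD R2, R2, #1
        c_flag = r1 >= r4                # CMP R1, R4: C set means no borrow
        z_flag = r1 == r4                #             Z set means equal
        taken = (not c_flag) or z_flag   # BLS LP: branch if C=0 or Z=1
        if not taken:
            return r1


print(sum_to_limit(10))  # 15
```

With R4 = 10 the sums are 1, 3, 6, 10, 15; the branch is still taken at 10 (Z=1) and falls through at 15.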

Multiplication

● The ARM v4 ISA has 6 multiplication instructions.

● Does not include “multiply by a constant”

● Why several?

– Should product be 32 bits or 64 bits?

– Are the input values considered signed?

32-Bit Products

● Fact: Since the product stored is the low-order 32 bits of the true product, signed and unsigned variations would give same result. So not separate instructions.

● MUL instruction: Two registers' values multiplied, low-order 32 bits stored in destination register.

● MLA (multiply and accumulate). The low order 32-bits of the product are added to a 3rd register and stored in a 4th register.

● Eg: MLA R4, R1, R2, R3 ; R4 = R1*R2 + R3
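The claim that signed and unsigned multiplication agree in the low-order 32 bits can be checked with a small Python model. The helpers `mul32`, `mla32` and `to_signed` are hypothetical names for this sketch.

```python
MASK32 = 0xFFFFFFFF


def to_signed(x):
    """Reinterpret a 32-bit pattern as a 2's complement signed value."""
    return x - (1 << 32) if x & 0x80000000 else x


def mul32(rm, rs):
    return (rm * rs) & MASK32                  # MUL: keep the low 32 bits


def mla32(rm, rs, rn):
    return (rm * rs + rn) & MASK32             # MLA: low 32 bits of product + rn


a, b = 0xFFFFFFFE, 3                           # a is -2 if read as signed
unsigned_low = mul32(a, b)
signed_low = (to_signed(a) * to_signed(b)) & MASK32
print(hex(unsigned_low), hex(signed_low))      # 0xfffffffa 0xfffffffa
```

Read as signed, 0xFFFFFFFA is -6, which is -2 * 3; read as unsigned, it is the low word of 4294967294 * 3. One instruction serves both interpretations.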

64-Bit Product (Long Multiply)

● Results are stored in a pair of registers.

● The “accumulate” version has the product added onto the 64-bit value in a pair of registers.

● SMULL – signed long multiply

● UMULL – unsigned long multiply

● UMLAL – unsigned long multiply accumulate

● SMLAL – signed long multiply accumulate

● Ex: UMLAL R1, R2, R3, R4 means

(R1, R2) ← (R1, R2) + R3*R4 with unsigned math

– Above, R1 is the least significant 32 bits
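For the 64-bit products, signedness does matter, because the high word differs. The following Python sketch models UMULL, SMULL and UMLAL, returning the register pair as (low, high) to match the operand order in the example; all function names are assumptions made for illustration.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def to_signed(x):
    return x - (1 << 32) if x & 0x80000000 else x


def umull(rm, rs):
    p = rm * rs
    return p & MASK32, p >> 32                     # (low, high)


def smull(rm, rs):
    p = (to_signed(rm) * to_signed(rs)) & MASK64   # 64-bit 2's complement
    return p & MASK32, p >> 32


def umlal(lo, hi, rm, rs):
    total = (((hi << 32) | lo) + rm * rs) & MASK64
    return total & MASK32, total >> 32


x = 0xFFFFFFFF                                     # -1 signed, 2^32-1 unsigned
print([hex(v) for v in umull(x, x)])               # ['0x1', '0xfffffffe']
print([hex(v) for v in smull(x, x)])               # ['0x1', '0x0']
```

The low words agree (0x1 in both cases), as promised on the 32-bit product slide, but the high words do not.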
