IA32 programming for Linux Concepts and requirements for writing Linux assembly language programs for Pentium CPUs

A source program’s format

• Source-file: a pure ASCII-character textfile

• Is created using a text-editor (such as ‘vi’)

• You cannot use a ‘word processor’ (why?)

• A program consists of a series of ‘statements’

• Each program-statement fits on one line

• Program-statements all have same layout

• This layout was designed in the 1950s for IBM punch-cards

Statement Layout (1950s)

• Each ‘statement’ was comprised of four ‘fields’

• Fields appear in a prescribed left-to-right order

• These four fields were named (in order):

-- the ‘label’ field

-- the ‘opcode’ field

-- the ‘operand’ field

-- the ‘comment’ field

• In many cases some fields could be left blank

• Extreme case (very useful): whole line is blank!

The ‘as’ program

• The ‘assembler’ is a computer program

• It accepts a specified text-file as its input

• It must be able to ‘parse’ each statement

• It can produce onscreen ‘error messages’

• It can generate an ELF-format output file

• (That file is known as an ‘object module’)

• It can also generate a ‘listing file’ (optional)

The ‘label’ field

• A label is a ‘symbol’ followed by a colon (‘:’)

• The programmer invents his own ‘symbols’

• Symbols can use letters and digits, plus a very small number of ‘special’ characters ( ‘.’, ‘_’, ‘$’ )

• A ‘symbol’ is allowed to be of arbitrary length

• The Linux assembler (‘as’) was designed for translating source-text produced by a high-level language compiler (such as ‘cc’)

• But humans can also write such files directly

The ‘opcode’ field

• Opcodes are predefined symbols that are recognized by the GNU assembler

• There are two categories of ‘opcodes’ (called ‘instructions’ and ‘directives’)

• ‘Instructions’ represent operations that the CPU is able to perform (e.g., ‘add’, ‘inc’)

• ‘Directives’ are commands that guide the work of the assembler (e.g., ‘.globl’, ‘.int’)

Instructions vs Directives

• Each ‘instruction’ gets translated by ‘as’ into a machine-language statement that will be fetched and executed by the CPU when the program runs (i.e., at ‘runtime’)

• Each ‘directive’ modifies the behavior of the assembler (i.e., at ‘assembly time’)

• With GNU assembly language, they are easy to distinguish: directives begin with ‘.’

A list of the Pentium opcodes

• An ‘official’ list of the instruction codes can be found in Intel’s programmer manuals:

http://developer.intel.com

• But it’s three volumes, nearly 1000 pages (it describes ‘everything’ about Pentiums)

• An ‘unofficial’ list of (most) Intel instruction codes can fit on one sheet, front and back:

http://www.jegerlehner/intel/

The AT&T syntax

• The GNU assembler uses AT&T syntax (instead of official Intel/Microsoft syntax) so the opcode names differ slightly from names that you will see on those lists:

    Intel-syntax    AT&T-syntax
    ------------    --------------
    ADD             addb/addw/addl
    INC             incb/incw/incl
    CMP             cmpb/cmpw/cmpl

The UNIX culture

• Linux is intended to be a version of UNIX (so that UNIX-trained users already know Linux)

• UNIX was developed at AT&T (in the early 1970s) and AT&T’s computers were built by DEC, thus UNIX users learned DEC’s assembly language

• Intel was an early ally of DEC’s competitor, IBM, which deliberately used ‘incompatible’ designs

• Also: an ‘East Coast’ versus ‘West Coast’ thing (California, versus New York and New Jersey)

Bytes, Words, Longwords

• CPU Instructions usually operate on data-items

• Only certain sizes of data are supported:

BYTE: one byte consists of 8 bits

WORD: consists of two bytes (16 bits)

LONGWORD: uses four bytes (32 bits)

• With AT&T’s syntax, an instruction’s name also incorporates its effective data-size (as a suffix)

• With Intel syntax, data-size usually isn’t explicit, but is inferred by context (i.e., from operands)

The ‘operand’ field

• Operands can be of several types:

-- a CPU register may hold the datum

-- a memory location may hold the datum

-- an instruction can have ‘built-in’ data

-- frequently there are multiple data-items

-- and sometimes there are no data-items

• An instruction’s operands usually are ‘explicit’, but in a few cases they also could be ‘implicit’

Examples of operands

• Some instructions that have two operands:

    movl %ebx, %ecx
    addl $4, %esp

• Some instructions that have one operand:

    incl %eax
    pushl $fmt

• An instruction that lacks explicit operands:

    ret

The ‘comment’ field

• An assembly language program often can be hard for a human being to understand

• Even a program’s author may not be able to recall his programming idea after a while

• So programmer ‘comments’ can be vital

• A comment begins with the ‘#’ character

• The assembler disregards all comments (but they will appear in program listings)

‘Directives’

• Sometimes called ‘pseudo-instructions’

• They tell the assembler what to do

• The assembler will recognize them

• Their names begin with a dot (‘.’)

• Examples: ‘.section’, ‘.global’, ‘.int’, …

• The names of valid directives appear in the table-of-contents of the GNU manual

New program example

• Let’s look at a demo program (‘squares.s’)

• It prints out a mathematical table showing some numbers and their squares

• But it doesn’t use any multiplications!

• It uses an algorithm based on algebra:

$(n+1)^2 - n^2 = n + n + 1$

If you already know the square of a given number n, you can get the square of the next number n+1 by just doing additions

Visualizing the algorithm idea

[Figure: diagram with sides labeled n]

$(n+1)^2 = n^2 + 2n + 1$

A program with a ‘loop’

• Here’s our program idea (expressed in C):

    int num = 1, val = 1;
    do {
        printf( " %d %d \n", num, val );
        val += num + num + 1;
        num += 1;
    }
    while ( num <= 20 );

Some new ‘directives’

• ‘.equ’ – equates a symbol to a value:

    .equ MAX, 20

• ‘.globl’ – just an alternative for ‘.global’:

    .globl main

Some new ‘instructions’

• ‘inc’ – adds one to the specified operand:

    incl arg

• ‘cmp’ – compares two specified operands:

    cmpl $MAX, arg

• ‘jle’ – jump (to a specified instruction) if condition ‘less than or equal to’ is true:

    jle again

Comparisons can be ‘tricky’

• It’s easy to get confused by AT&T syntax:

            mov $5, %eax
    while:  inc %eax
            cmp $5, %eax
            jle while

(e.g., will this loop ever finish executing?)

• REMEMBER: ‘compare’ means ‘subtract’

The FLAGS register

    Bit:   11  10   9   8   7   6   5   4   3   2   1   0
    Flag:  OF  DF  IF  TF  SF  ZF   0  AF   0  PF   1  CF

Legend:

ZF = Zero Flag

SF = Sign Flag

CF = Carry Flag

PF = Parity Flag

OF = Overflow Flag

AF = Auxiliary Flag

In-class exercise #1

• How would you modify the source-code for the ‘squares’ program so that it prints out a larger table (i.e., more than 20 lines)?

• How many squares can you display on the screen before your program starts to show ‘wrong’ entries?

In-class exercise #2

• Can you write a program that prints out a table showing powers of 2? (It’s useful for computer science students to keep handy)

• Can you see how to do it without using any ‘multiply’ operations – just additions?

• Hint: study the ‘squares.s’ source-code

• Then write your own ‘powers.s’ solution

• Turn in printouts (source and its output)